This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
A Review of Automatic Pre-editing Approaches in Machine Translation
WANG Jun-song, MENG Ya-qi, WANG Ai-qing
DOI:10.17265/2159-5836/2025.06.006
Northwestern Polytechnical University, Xi’an, China; Northwestern Polytechnical University, Xi’an, China; University of Liverpool, Liverpool, United Kingdom
As machine translation technology has developed, automatic pre-editing has attracted increasing research attention for its role in improving translation quality and efficiency. This study uses UAM Corpus Tool 3.0 to annotate and categorize 99 key publications from 1992 to 2024, tracing the research paths and technological evolution of automatic pre-editing. It finds that current approaches fall into four categories: controlled language-based approaches, text simplification approaches, interlingua-based approaches, and large language model-driven approaches. By critically examining the technical features of each category and its applicability in different contexts, this review aims to guide the future optimization and expansion of pre-editing systems.
automatic pre-editing, machine translation, controlled language, text simplification, large language models
Journal of Literature and Art Studies, June 2025, Vol. 15, No. 6, 483-489