Post-editing for the Kazakh language using OpenNMT

Authors

  • D. R. Rakhimova
  • Aliya Zjigerovna Zhunussova Kaznu

DOI:

https://doi.org/10.26577/JMMCS.2022.v113.i1.12

Keywords:

OpenNMT, neural machine translation, Turkic languages

Abstract

The modern world and our immediate future depend on applied intelligent systems, as new technologies develop every day. One of the tasks of intelligent systems is machine (automated) translation from one natural language to another. Machine translation (MT) allows people to communicate regardless of language differences, as it removes the language barrier and opens up new languages for communication. Machine translation is a new technology and a notable step in human development. This type of translation can help when you need to quickly understand what your interlocutor has said or written, for example in a letter. This article examines the work of online translators used to translate into Kazakh and from Kazakh. Translation errors are identified, and the general advantages and disadvantages of online machine translation systems for Kazakh are given. A model for the development of a post-editing machine translation system for the Kazakh language is presented. OpenNMT (Open Neural Machine Translation) is an open-source system for neural machine translation and neural sequence learning. To train translation models in OpenNMT, parallel corpora for the language pairs are needed. The advantage of OpenNMT is that it can be applied to any language pair and can handle large corpora. Experimental data were obtained for the English-Kazakh language pair.
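The requirement for parallel data mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes a line-aligned English-Kazakh corpus, where line i of the source corresponds to line i of the target, and drops pairs that are empty or badly mismatched in length. The function name `pair_parallel_lines`, the `max_ratio` threshold, and the sample sentences are hypothetical and chosen only for illustration; they are not taken from the article and do not use the OpenNMT API.

```python
from typing import Iterable, List, Tuple


def pair_parallel_lines(
    src_lines: Iterable[str],
    tgt_lines: Iterable[str],
    max_ratio: float = 3.0,  # assumed threshold, not from the article
) -> List[Tuple[str, str]]:
    """Pair line-aligned source/target sentences and drop unusable pairs."""
    pairs = []
    for src, tgt in zip(src_lines, tgt_lines):
        src, tgt = src.strip(), tgt.strip()
        # Skip pairs where either side is empty.
        if not src or not tgt:
            continue
        # Skip pairs whose token counts differ too much (likely misaligned).
        n_src, n_tgt = len(src.split()), len(tgt.split())
        if max(n_src, n_tgt) / min(n_src, n_tgt) > max_ratio:
            continue
        pairs.append((src, tgt))
    return pairs


# Illustrative, pre-tokenized sample sentences (hypothetical data).
english = ["Hello .", "", "How are you ?"]
kazakh = ["Сәлем .", "Бос жол", "Қалайсың ?"]
pairs = pair_parallel_lines(english, kazakh)
```

A filter of this kind is typically run before training, so that the source and target files handed to a toolkit such as OpenNMT stay aligned line by line.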

Published

2022-03-31