Single state transducer model for Kazakh and Russian morphology

  • У. А. Тукеев al-Farabi Kazakh National University, Almaty, Republic of Kazakhstan
  • Д. Р. Рахимова al-Farabi Kazakh National University, Almaty, Republic of Kazakhstan
  • Ж. М. Жуманов al-Farabi Kazakh National University, Almaty, Republic of Kazakhstan
  • А. Ж. Картбаев al-Farabi Kazakh National University, Almaty, Republic of Kazakhstan

Abstract

This paper gives an overview of issues in constructing finite-state transducers with a single state for the two-level morphology of inflectional languages, in particular the direct mapping of word endings to grammatical characteristics. The problem is studied using Kazakh and Russian, which are usually classed as inflectional languages. The proposed solution is a trivial Mealy automaton with one state, i.e. a single-state transducer, combined with a multi-valued mapping method. We also study the completeness of the transducer input for the analyzed languages: establishing that the input is complete guarantees that morphological analysis will accept every word of the analyzed language. Determining the completeness of the set of possible endings is a complex problem for agglutinative languages. In this article, we define the completeness of the set of endings for the Kazakh language. The proposed technology is implemented for Russian-Kazakh machine translation, and translation quality is assessed with the BLEU metric.
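The idea of a one-state transducer with a multi-valued mapping can be illustrated with a minimal sketch. It is not the authors' implementation: the table `ENDINGS`, the tag strings, and the function `analyze` are hypothetical names chosen for illustration, and the three Russian noun endings are a toy sample rather than a complete set. Because the automaton has only one state, its transition and output functions reduce to a single lookup from an ending to a list of grammatical characteristics.

```python
from typing import Dict, List, Tuple

# Hypothetical toy table (an assumption, not taken from the paper):
# each ending maps to one or more tag strings, so the mapping is multi-valued.
ENDINGS: Dict[str, List[str]] = {
    "а": ["N.sg.nom"],
    "и": ["N.sg.gen", "N.pl.nom"],  # ambiguous ending: two analyses
    "ам": ["N.pl.dat"],
}


def analyze(word: str) -> List[Tuple[str, str, List[str]]]:
    """Return (stem, ending, tags) for every known ending that closes the word.

    The single state is implicit: every lookup starts and finishes in it,
    reading an ending as input and emitting its tags as output.
    """
    results = []
    # Try the longest candidate ending first; the stem must stay non-empty.
    for cut in range(1, len(word)):
        ending = word[cut:]
        if ending in ENDINGS:
            results.append((word[:cut], ending, ENDINGS[ending]))
    return results


if __name__ == "__main__":
    for w in ["книга", "книги", "книгам"]:
        print(w, analyze(w))
```

In this sketch, a word whose ending is missing from the table yields an empty result, which is the situation that completeness of the set of endings is meant to rule out.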

Published
2018-04-01
How to Cite
ТУКЕЕВ, У. А. et al. Single state transducer model for Kazakh and Russian morphology. KazNU Bulletin. Mathematics, Mechanics, Computer Science Series, [S.l.], v. 89, n. 2, p. 110-117, apr. 2018. ISSN 1563-0277. Available at: <http://bm.kaznu.kz/index.php/kaznu/article/view/388>. Date accessed: 17 aug. 2018.
Keywords: machine translation, finite transducer, two-level morphology, inflectional languages, multi-valued mapping