Using artificial intelligence to develop a machine translation system and teaching resources in the Tuvan language

Authors

Marina L. Novikova, Philipp N. Novikov

DOI:

https://doi.org/10.25178/nit.2024.1.1

Keywords:

Tuvan language; artificial intelligence; machine translation; neural networks; large language models; digital presence; machine learning

Abstract

Computer technologies applied in the humanities, and large language models built on machine learning and neural networks, have reached a high level of sophistication. The linguistic potential of large language models naturally attracts researchers' interest, which reflects the relevance of using artificial intelligence to create machine translation systems and educational resources.

The article explores the experience of creating a large language model for the Tuvan language using machine learning and artificial intelligence. The authors attempted to develop a large language model capable of recognizing the Tuvan language and translating phrases from Tuvan into Russian and back. In addition, they examined and tested the possibilities of generating text in Tuvan, which can be used both in language teaching and in various kinds of linguistic research.

This experience is unique because the Tuvan language is currently not represented in any well-established machine translation system. A secondary aim of the research is to analyze the level of the language's digital presence on the Internet and to provide recommendations for devising an optimal algorithm for building similar systems and web services based on machine learning. The research outcomes are of practical value not only for the Tuvan language but can also be extrapolated to other official languages of the Russian Federation.

References

Arefyev, A. L., Bakhtikireeva, U. M. and Sinyachkin, V. P. (2021) Issues of bilingualism in the school language education system of the Republic of Tuva. New Research of Tuva, no. 1, pp. 255–272. (In Russ.) DOI: https://doi.org/10.25178/nit.2021.1.14

Borgoiakova, T. G. and Bitkeeva, A. N. (2023) The Tuvan component of the bilingual space or reflections on the strategy of state support of the Tuvan language. New Research of Tuva, no. 4, pp. 290–300. (In Russ.) DOI: https://doi.org/10.25178/nit.2023.4.20

Dyrkheeva, G. A. and Tsybenova, Ch. S. (2020) Language attitudes and language loyalty of minor language speakers under the conditions of national-Russian bilingualism: the case of Buryats and Tuvans. New Research of Tuva, no. 1, pp. 62–74. (In Russ.) DOI: https://doi.org/10.25178/nit.2020.1.5

Kuzhugget, Sh. Yu., Suvandii, N. D. and Lamazhaa, Ch. K. (2021) The problem of translating cultural concepts into another language (on the example of Tuvan cultural concepts). Polylinguality & transcultural practices, no. 18 (4), pp. 405–420. (In Russ.) DOI: https://doi.org/10.22363/2618-897X-2021-18-4-405-420

Ondar, Ch. G., Dongak, V. S. and Mongush, D. Sh. (2023) The Tuvan language on the Internet: representation, challenges, and prospects. New Research of Tuva, no. 1, pp. 186–207. (In Russ.) DOI: https://doi.org/10.25178/nit.2023.1.11

Papyn, A. S. (2010) Tuvan keyboard layout. New Research of Tuva, no. 1, pp. 19–25. (In Russ.)

Tuvans: Native People (2022). Ed. by Ch. K. Lamazhaa and N. D. Suvandii. St. Petersburg, Nestor-Istoriya. 344 pp. (In Russ.)

Athaluri, S. A., Manthena, S. V., Kesapragada, V. K. M., Yarlagadda, V., Dave, T. and Duddumpudi, R. T. S. (2023) Exploring the boundaries of reality: investigating the phenomenon of artificial intelligence hallucination in scientific writing through ChatGPT references. Cureus, no. 15 (4). DOI: https://doi.org/10.7759/cureus.37432

Armstrong, L. E., Bergeron, M. F., Lee, E. C., Mershon, J. E. and Armstrong, E. M. (2022) Overtraining syndrome as a complex systems phenomenon. Frontiers in Network Physiology, no. 1 (20). DOI: https://doi.org/10.3389/fnetp.2021.794392

Garcia, X., Bansal, Y., Cherry, C., Foster, G., Krikun, M., Feng, F., Johnson, M. and Firat, O. (2023) The unreasonable effectiveness of few-shot learning for machine translation. International Conference on Machine Learning, PMLR, pp. 10867–10878. DOI: https://doi.org/10.48550/arXiv.2302.01398

Le, T. N. and Sadat, F. (2020) Revitalization of indigenous languages through pre-processing and neural machine translation: The case of Inuktitut. Proceedings of the 28th International Conference on Computational Linguistics, pp. 4661–4666. DOI: https://doi.org/10.18653/v1/2020.coling-main.410

Sreelekha, S., Bhattacharyya, P., Jha, S. K. and Malathi, D. (2016) A survey report on evolution of machine translation. IJCTA, 9 (33), pp. 233–240 [online]: https://www.serialsjournals.com/abstract/65435_article-24.pdf (access date: 12.11.2023).

Srinivasan, K., Raman, K., Chen, J., Bendersky, M. and Najork, M. (2021) WIT: Wikipedia-based image text dataset for multimodal multilingual machine learning. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2443–2449. DOI: https://doi.org/10.48550/arXiv.2103.01913

Spennemann, D. H. R. (2023) ChatGPT and the generation of digitally born “knowledge”: How does a generative AI language model interpret cultural heritage values? Knowledge, no. 3, pp. 480–512. DOI: https://doi.org/10.3390/knowledge3030032

Zwischenberger, C. (2022) Online collaborative translation: its ethical, social, and conceptual conditions and consequences. Perspectives, no. 30 (1), pp. 1–18. DOI: https://doi.org/10.1080/0907676X.2021.1872662

Published

13.03.2024

How to Cite

Новикова М. Л., Новиков Ф. Н. Использование искусственного интеллекта для создания системы машинного перевода и образовательных ресурсов на тувинском языке // Новые исследования Тувы. 2024, № 1. С. 6-17. DOI: https://doi.org/10.25178/nit.2024.1.1

For citation:
Novikova M. L. and Novikov Ph. N. Using artificial intelligence to develop a machine translation system and teaching resources in the Tuvan language. New Research of Tuva, 2024, no. 1, pp. 6–17. (In Russ.) DOI: https://doi.org/10.25178/nit.2024.1.1

Issue

Section

Special theme

Author Biographies

Marina L. Novikova, RUDN University

Doctor of Philology, Professor, Russian Language and Cultural Studies Department, Russian Language Institute, RUDN University.

Postal address: 10, bldg. 3 Miklukho-Maklaya St., 117198, Moscow, Russia.

Email: novikova-ml@rudn.ru

Philipp N. Novikov, RUDN University

Candidate of Philology, Associate Professor, Department of Foreign Languages, Law Institute, RUDN University.

Postal address: 6 Miklukho-Maklaya St., 117198, Moscow, Russia.

Email: novikov_fn@pfur.ru