The structure of an entry in the National corpus of Tuvan language

Authors

  • Mengi V. Ondar Tuvan State University

Keywords:

dictionary structure; textual corpus; dictionary; Tuvan language; electronic dictionary; Microsoft Office Access

Abstract

Contemporary information technologies and mathematical modelling have made it significantly easier to create corpora of natural languages. A corpus is an information and reference system based on a collection of digitally processed texts. It includes various written and oral texts in the given language, a set of dictionaries, and markup, that is, information on the properties of the text. It is the presence of markup that distinguishes a corpus from an electronic library.

At present, national corpora are being set up for many languages of the Russian Federation, including those of the Turkic peoples. Faculty members and postgraduate and undergraduate students at Tuvan State University and Siberian Federal University are working on the National corpus of Tuvan language.

This article describes the structure of a dictionary entry in the National corpus of Tuvan language. The corpus database comprises the following tables: MAIN (the headword table); RUS, ENG and GER (translations of the headword into Russian, English and German); and MORPHOLOGY (morphological data on the headword). The database is built in Microsoft Office Access.
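The table layout described above can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for Microsoft Office Access, and only the five table names come from the article. Every column name, the key relationship through `main_id`, and the sample entry (including its transcription and morphological feature) are assumptions made for illustration and may differ from the actual corpus database.

```python
import sqlite3

# Only the table names (MAIN, RUS, ENG, GER, MORPHOLOGY) come from the article.
# All column names and the main_id link are hypothetical.
SCHEMA = """
CREATE TABLE MAIN (
    id            INTEGER PRIMARY KEY,
    headword      TEXT NOT NULL,
    transcription TEXT
);
CREATE TABLE RUS (main_id INTEGER REFERENCES MAIN(id), translation TEXT NOT NULL);
CREATE TABLE ENG (main_id INTEGER REFERENCES MAIN(id), translation TEXT NOT NULL);
CREATE TABLE GER (main_id INTEGER REFERENCES MAIN(id), translation TEXT NOT NULL);
CREATE TABLE MORPHOLOGY (
    main_id INTEGER REFERENCES MAIN(id),
    feature TEXT NOT NULL,
    value   TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database holding one illustrative entry."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    cur = conn.execute(
        "INSERT INTO MAIN (headword, transcription) VALUES (?, ?)",
        ("ном", "nom"),  # illustrative sample data, not taken from the corpus
    )
    main_id = cur.lastrowid
    conn.execute("INSERT INTO RUS VALUES (?, ?)", (main_id, "книга"))
    conn.execute("INSERT INTO ENG VALUES (?, ?)", (main_id, "book"))
    conn.execute("INSERT INTO GER VALUES (?, ?)", (main_id, "Buch"))
    conn.execute(
        "INSERT INTO MORPHOLOGY VALUES (?, ?, ?)",
        (main_id, "part_of_speech", "noun"),
    )
    return conn


def load_entry(conn, headword):
    """Assemble one dictionary entry from the five tables, or return None."""
    row = conn.execute(
        """
        SELECT m.id, m.headword, m.transcription,
               r.translation, e.translation, g.translation
        FROM MAIN m
        LEFT JOIN RUS r ON r.main_id = m.id
        LEFT JOIN ENG e ON e.main_id = m.id
        LEFT JOIN GER g ON g.main_id = m.id
        WHERE m.headword = ?
        """,
        (headword,),
    ).fetchone()
    if row is None:
        return None
    features = dict(
        conn.execute(
            "SELECT feature, value FROM MORPHOLOGY WHERE main_id = ?", (row[0],)
        ).fetchall()
    )
    return {
        "headword": row[1],
        "transcription": row[2],
        "rus": row[3],
        "eng": row[4],
        "ger": row[5],
        "morphology": features,
    }
```

In this sketch, `load_entry(build_demo_db(), "ном")` gathers the headword, its three translations and its morphological features into a single dictionary, which mirrors the idea of an entry as the unit that ties the tables together.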

The corpus dictionary supports the following functions: adding, editing and removing an entry; searching for an entry (with transcription); and setting and visualizing the morphological features of a headword.
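The functions listed above can be sketched as a small wrapper class. This is an illustration only: the class `CorpusDictionary`, its method names and the column names are hypothetical, `sqlite3` stands in for Microsoft Office Access, and "search (with transcription)" is interpreted here as a lookup that matches either the headword or its transcription, which is an assumption about the intended behaviour.

```python
import sqlite3


class CorpusDictionary:
    """Hypothetical wrapper around two of the tables named in the article."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        # Column names are assumptions; only the table names come from the article.
        self.conn.executescript(
            """
            CREATE TABLE MAIN (
                id INTEGER PRIMARY KEY,
                headword TEXT NOT NULL,
                transcription TEXT
            );
            CREATE TABLE MORPHOLOGY (
                main_id INTEGER,
                feature TEXT,
                value TEXT,
                PRIMARY KEY (main_id, feature)
            );
            """
        )

    def add_entry(self, headword, transcription):
        """Add an entry and return its id."""
        cur = self.conn.execute(
            "INSERT INTO MAIN (headword, transcription) VALUES (?, ?)",
            (headword, transcription),
        )
        return cur.lastrowid

    def edit_entry(self, entry_id, headword, transcription):
        """Overwrite the headword and transcription of an existing entry."""
        self.conn.execute(
            "UPDATE MAIN SET headword = ?, transcription = ? WHERE id = ?",
            (headword, transcription, entry_id),
        )

    def remove_entry(self, entry_id):
        """Remove an entry together with its morphological features."""
        self.conn.execute("DELETE FROM MORPHOLOGY WHERE main_id = ?", (entry_id,))
        self.conn.execute("DELETE FROM MAIN WHERE id = ?", (entry_id,))

    def search(self, query):
        """Return (id, headword, transcription) rows matching either field."""
        return self.conn.execute(
            "SELECT id, headword, transcription FROM MAIN "
            "WHERE headword = ? OR transcription = ? ORDER BY id",
            (query, query),
        ).fetchall()

    def set_feature(self, entry_id, feature, value):
        """Set (or replace) one morphological feature of a headword."""
        self.conn.execute(
            "INSERT OR REPLACE INTO MORPHOLOGY (main_id, feature, value) "
            "VALUES (?, ?, ?)",
            (entry_id, feature, value),
        )

    def features(self, entry_id):
        """Return the morphological features of a headword as a dict."""
        return dict(
            self.conn.execute(
                "SELECT feature, value FROM MORPHOLOGY WHERE main_id = ?",
                (entry_id,),
            ).fetchall()
        )
```

A typical session under these assumptions would call `add_entry`, then `set_feature` to record morphological data, and `search` to retrieve the entry by headword or transcription. Removing an entry also clears its features so that no orphaned morphological rows remain.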

The project treats the corpus dictionary as a multi-component entity with a complex hierarchical structure, in which the dictionary entry is the key component. The corpus dictionary we developed can be used to study Tuvan pronunciation, orthography and word analysis, as well as to search for words and collocations in the texts included in the corpus.

Published

02.12.2016

How to Cite

Ondar, M. V. (2016) “The structure of an entry in the National corpus of Tuvan language”, The New Research of Tuva, 4. Available at: https://nit.tuva.asia/nit/article/view/616 (Accessed: 22.11.2024).

Section

Philology

Author Biography

Mengi V. Ondar, Tuvan State University

Postgraduate student, Department of Tuvan Philology and General Linguistics, Tuvan State University.

Postal address: 36 Lenin St., 667000 Kyzyl, Republic of Tyva, Russian Federation.

Tel.: +7 (394-22) 2-19-69.

E-mail: mengi89@yandex.ru

Research advisor: Candidate of Philosophy, Associate Professor M. V. Bavuu-Surun.