Semantic markup of nouns and adjectives for the Electronic corpus of texts in Tuvan language

Authors

  • Bailak Ch. Oorzhak, Tuvan State University
  • Arzhaana B. Hertek, Tuvan State University
  • Maria A. Kuzhuget, Tuvan State University
  • Valentina S. Ondar, Tuvan State University

Keywords:

Tuvan language; electronic database; automated search system; lexis; lexico-semantic classes and subclasses; descriptor; tag; noun; adjective; lexical compatibility

Abstract

The article examines progress in the semantic markup of the Electronic corpus of texts in Tuvan language (ECTTL), a further stage in the work of adding Tuvan texts to the database and marking up the corpus. ECTTL is a collaborative project by researchers from Tuvan State University (Research and Education Center of Turkic Studies and Department of Information Technologies).

Semantic markup of Tuvan lexis will function as a search and reference system that helps users find text snippets in ECTTL containing words with the desired meanings.

The first stage of this process is to set up databases of the basic lexemes of the Tuvan language. All meaningful lexemes were classified into the following semantic groups: humans, animals, objects, natural objects and phenomena, and abstract concepts. All Tuvan object nouns, as well as both descriptive and relative adjectives, were assigned to one of these lexico-semantic classes. Each class, sub-class and descriptor is tagged in Tuvan, Russian and English; these tags, in turn, will help automate searching.
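The trilingual tagging described above can be pictured with a minimal Python sketch. It is not the ECTTL implementation: the identifiers (`SemanticTag`, `Lexeme`, `TAGS`, `LEXICON`, `find_tag`, `lemmas_for`), the tag IDs, the labels and the three sample lexemes are all assumptions chosen for illustration, and the project's actual tag names are not given here.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class SemanticTag:
    """A class, sub-class or descriptor labelled in three languages."""
    tag_id: str              # hypothetical identifier
    labels: dict[str, str]   # language code ("tyv", "ru", "en") -> label


@dataclass
class Lexeme:
    lemma: str
    pos: str                 # "noun" or "adjective"
    tag_ids: list[str]       # semantic tags assigned to this lexeme


# Illustrative tags only; the labels are the editor's glosses, not ECTTL's.
TAGS = {
    "human": SemanticTag("human", {"tyv": "кижи", "ru": "люди", "en": "humans"}),
    "animal": SemanticTag("animal", {"tyv": "амытан", "ru": "животные", "en": "animals"}),
}

# Illustrative lexicon entries.
LEXICON = [
    Lexeme("ыт", "noun", ["animal"]),     # "dog"
    Lexeme("аът", "noun", ["animal"]),    # "horse"
    Lexeme("башкы", "noun", ["human"]),   # "teacher"
]


def find_tag(label: str) -> Optional[SemanticTag]:
    """Resolve a label typed in any of the three languages to its tag."""
    wanted = label.casefold()
    for tag in TAGS.values():
        if any(wanted == text.casefold() for text in tag.labels.values()):
            return tag
    return None


def lemmas_for(label: str) -> list[str]:
    """Return the lemmas carrying the tag that the label refers to."""
    tag = find_tag(label)
    if tag is None:
        return []
    return [lex.lemma for lex in LEXICON if tag.tag_id in lex.tag_ids]
```

Under these assumptions, `lemmas_for("animals")` and `lemmas_for("животные")` resolve to the same tag and return the same lemmas, which is the sense in which multilingual tags could support automated search.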

The databases of meaningful lexemes of the Tuvan language will also describe their lexical combinability. The automated system will contain information on semantically compatible combinations of adjectives with nouns, adverbs with verbs, and nouns with verbs, as well as on combinations that are semantically incompatible.
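One way such compatibility information could be stored is as explicit judgements on pairs of semantic classes, with a separate "unknown" value for pairs that have not been described. The sketch below is hypothetical throughout: the class names, the judgements in `JUDGEMENTS`, the helper names and the two sample lexemes are invented for illustration and cover only adjective and noun pairs.

```python
from enum import Enum


class Status(Enum):
    COMPATIBLE = "compatible"
    INCOMPATIBLE = "incompatible"
    UNKNOWN = "unknown"       # no judgement recorded for this pair


# Toy judgements keyed by (adjective class, noun class); invented entries.
JUDGEMENTS = {
    ("colour", "animal"): Status.COMPATIBLE,
    ("colour", "object"): Status.COMPATIBLE,
    ("colour", "abstract"): Status.INCOMPATIBLE,  # literal reading only
}

# Illustrative lexeme-to-class assignments.
LEXEME_CLASS = {
    "кара": "colour",   # adjective, "black"
    "ыт": "animal",     # noun, "dog"
}


def class_status(adj_class: str, noun_class: str) -> Status:
    """Look up the recorded judgement for a pair of semantic classes."""
    return JUDGEMENTS.get((adj_class, noun_class), Status.UNKNOWN)


def phrase_status(adjective: str, noun: str) -> Status:
    """Judge an adjective + noun pair via the classes of its two lexemes."""
    adj_class = LEXEME_CLASS.get(adjective)
    noun_class = LEXEME_CLASS.get(noun)
    if adj_class is None or noun_class is None:
        return Status.UNKNOWN
    return class_status(adj_class, noun_class)
```

Recording incompatible pairs explicitly, rather than treating every unlisted pair as incompatible, keeps "known to clash" apart from "not yet described". Adverb and verb or noun and verb pairs could be stored in the same shape with their own tables.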

Published

02.12.2016

How to Cite

Oorzhak, B. C., Hertek, A. B., Kuzhuget, M. A. and Ondar, V. S. (2016) “Semantic markup of nouns and adjectives for the Electronic corpus of texts in Tuvan language”, The New Research of Tuva, 4. Available at: https://nit.tuva.asia/nit/article/view/615 (Accessed: 21.11.2024).

Section

Philology

Author Biographies

Bailak Ch. Oorzhak, Tuvan State University

Candidate of Philology, Senior Research Fellow, Research and Education Center of Turkic Studies, Tuvan State University. Postal address: 32 Lenin St., 667000 Kyzyl, Republic of Tuva, Russian Federation. Tel: +7 (394-22) 3-03-78. E-mail: oorzhak.baylak@mail.ru

Arzhaana B. Hertek, Tuvan State University

Candidate of Philology, Senior Research Fellow, Research and Education Center of Turkic Studies, Tuvan State University. Postal address: 32 Lenin St., 667000 Kyzyl, Republic of Tuva, Russian Federation. Tel: +7 (394-22) 3-03-78. E-mail: khertek.ab@yandex.ru

Maria A. Kuzhuget, Tuvan State University

Head of the Literature Museum, Tuvan State University. Postal address: 32 Lenin St., 667000 Kyzyl, Republic of Tuva, Russian Federation. Tel: +7 (394-22) 3-10-62. E-mail: kuzhuget.m55@mail.ru

Valentina S. Ondar, Tuvan State University

Candidate of Philology, Associate Professor, Department of Russian Language and Literature, Tuvan State University. Postal address: 32 Lenin St., 667000 Kyzyl, Republic of Tuva, Russian Federation. Tel: +7 (394-22) 5-22-50. E-mail: barys-hoov@mail.ru