Structural Models of English Terms of Automated Processing of Scientific and Technical Texts Corpora

The article examines structural models of English multi-component terms from the subject area “Welding types” as a basis for the markup of corpora of scientific and technical texts. It outlines the place of such corpora in corpus linguistics and the prospects for further research based on them. The research is relevant because of the need to create a corpus of scientific and technical texts in general and tools for the automatic markup of terms in particular. The authors substantiate that the main problem in creating such a corpus is the automatic markup of terminological word combinations. They analyze the current state of the “Welding types” terminology system and consider the formal structure of its elements. The results of the analysis of two-, three-, and four-component English terminological word combinations in this subject area are presented together with their structural models, each illustrated with examples, and the most productive models are highlighted. The most productive model, a nucleus element combined with a noun or an adjective functioning as a prepositive attribute, is found in two-component word combinations; the analysis of more complex formations shows that the same model of a “left attribute attached to the term nucleus” is also present in them, demonstrating generic features. The necessity of enumerating all possible structural models of terminological combinations in the subject area “Welding types” is substantiated.
The novelty of the study lies in the creation of a database of structural models of terminological combinations as the basis of a superstructure database on term structure, intended to improve the quality of automatic markup of corpora of scientific and technical texts and the processing of term candidates in corpus studies. © 2022, RUDN University. All rights reserved.
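
As a rough illustration of how a database of structural models might drive automatic markup, the sketch below matches part-of-speech patterns against a pre-tagged sentence. It is a minimal sketch under stated assumptions, not the authors' implementation: the model inventory, the tag names, the function `mark_terms`, and the sample sentence are all hypothetical, and tagging is assumed to have been done beforehand by some external tagger.

```python
# Hypothetical structural models: each name maps to a sequence of coarse
# part-of-speech tags (ADJ = adjective, NOUN = noun). This inventory is
# illustrative only and is not taken from the article.
STRUCTURAL_MODELS = {
    "N+N": ("NOUN", "NOUN"),
    "Adj+N": ("ADJ", "NOUN"),
    "Adj+N+N": ("ADJ", "NOUN", "NOUN"),
    "N+N+N": ("NOUN", "NOUN", "NOUN"),
}


def mark_terms(tagged_tokens):
    """Return (start, end, model, text) for each term candidate found.

    tagged_tokens is a list of (word, tag) pairs. At every position the
    longest matching model wins, and matched spans do not overlap.
    """
    # Try longer models first so a three-component match beats a shorter one.
    models = sorted(STRUCTURAL_MODELS.items(), key=lambda kv: -len(kv[1]))
    tags = [tag for _, tag in tagged_tokens]
    found = []
    i = 0
    while i < len(tags):
        for name, pattern in models:
            n = len(pattern)
            if tuple(tags[i:i + n]) == pattern:
                text = " ".join(word for word, _ in tagged_tokens[i:i + n])
                found.append((i, i + n, name, text))
                i += n  # skip past the marked span
                break
        else:
            i += 1  # no model starts here; move to the next token
    return found


# Hypothetical pre-tagged sentence used only to exercise the matcher.
sentence = [
    ("the", "DET"), ("electric", "ADJ"), ("arc", "NOUN"), ("welding", "NOUN"),
    ("process", "NOUN"), ("uses", "VERB"), ("a", "DET"),
    ("filler", "NOUN"), ("metal", "NOUN"),
]
candidates = mark_terms(sentence)
for start, end, model, text in candidates:
    print(f"{start}-{end}\t{model}\t{text}")
```

Because this toy inventory has no four-component model, “electric arc welding process” is marked only as far as “electric arc welding”. This is the kind of gap that enumerating all structural models is meant to close.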

Authors
Butenko I.I., Nikolaeva N.S., Kartseva E.Yu.
Publisher
RUDN University
Issue
1
Language
Russian
Pages
80-95
Status
Published
Volume
13
Year
2022
Affiliations
  • 1 Peoples’ Friendship University of Russia (RUDN University), 6, Mikluho-Maklaya str, Moscow, 117198, Russian Federation
  • 2 Bauman Moscow State Technical University, 5/1, 2-nd Baumanskaya str, Moscow, Russian Federation
Keywords
Markup; Scientific and technical discourse; Scientific and technical texts corpora; Structural model; Term; Terminological word combination