Using TXM platform for research on language changes over time: The dynamics of vocabulary and punctuation in Russian Literary Texts

This article tests the methodological tools that the open-source TXM software provides for studying the dynamics of vocabulary and punctuation marks in diachronic corpora. TXM offers both quantitative and qualitative analysis features. The analysis shows that the Russian Revolution of 1917 brought significant changes to the core vocabulary of the corpus of Russian Short Stories (1901–1930). The same methodology can be applied both to diachronic studies of literature and to various NLP tasks. © 2021 Tomsk State University. All rights reserved.

Authors
Lavrentiev A.M.1, Sherstinova T.Yu.2,3, Chepovskiy A.M.4,5, Pincemin B.1
Publisher
Tomsk State University
Issue number
70
Language
English
Pages
69-89
Status
Published
Year
2021
Organizations
  • 1 French National Centre for Scientific Research, Lyon, France
  • 2 Higher School of Economics, Saint Petersburg, Russian Federation
  • 3 Saint-Petersburg State University, Saint Petersburg, Russian Federation
  • 4 Higher School of Economics, Moscow, Russian Federation
  • 5 RUDN University, Moscow, Russian Federation
Keywords
Corpus linguistics; Diachronic linguistics; Punctuation; Russian literature of 20th century; Stylometry; Textometry; TXM platform; Vocabulary