Using TXM platform for research on language changes over time: The dynamics of vocabulary and punctuation in Russian Literary Texts

The aim of this article is to test the methodological tools provided by the open-source TXM software for research on the dynamics of vocabulary and punctuation marks in diachronic corpora. TXM provides both quantitative and qualitative analysis features. It is shown that the Russian Revolution of 1917 brought about significant changes in the core vocabulary of the corpus of Russian Short Stories (1901–1930). The same methodology may be used both for diachronic studies of literature and for various NLP tasks. © 2021 Tomsk State University. All rights reserved.

Authors
Lavrentiev A.M. [1], Sherstinova T.Yu. [2, 3], Chepovskiy A.M. [4, 5], Pincemin B. [1]
Publisher
Tomsk State University
Number of issue
70
Language
English
Pages
69-89
Status
Published
Year
2021
Organizations
  • 1 French National Centre for Scientific Research, Lyon, France
  • 2 Higher School of Economics, Saint Petersburg, Russian Federation
  • 3 Saint-Petersburg State University, Saint Petersburg, Russian Federation
  • 4 Higher School of Economics, Moscow, Russian Federation
  • 5 RUDN University, Moscow, Russian Federation
Keywords
Corpus linguistics; Diachronic linguistics; Punctuation; Russian literature of 20th century; Stylometry; Textometry; TXM platform; Vocabulary
Date of creation
20.07.2021
Date of change
20.07.2021
Short link
https://repository.rudn.ru/en/records/article/record/74383/