Distributional models and auxiliary methods for determining the hypernyms of words in Russian [ДИСТРИБУТИВНЫЕ МОДЕЛИ И ВСПОМОГАТЕЛЬНЫЕ МЕТОДЫ ДЛЯ ОПРЕДЕЛЕНИЯ ГИПЕРОНИМОВ СЛОВ РУССКОГО ЯЗЫКА]

This paper describes our participation in RUSSE'2020, the first shared task on Automatic Taxonomy Construction for the Russian language. The goal of the task is to associate input words (neologisms not yet included in the taxonomy) with appropriate hypernyms from an existing taxonomy. For example, for the input word “duck”, participants are expected to provide a list of ten hypernym synsets to which the word can most likely be attributed, such as “animal”, “bird”, and so on. An input word can have one, two, or more “parents” at the same time. In this article, we try to answer the following question: what results can be achieved using only “raw” vectors from distributional models, without additional training? We present results for several pre-trained models based on fastText, ELMo, and BERT, along with an out-of-vocabulary analysis of the models under consideration. Measured against all public scores on the leaderboards, our results correspond to the following ranks: 3rd on public nouns, 2nd on private nouns, 4th on public verbs, and 4th on private verbs. © 2020 ABBYY PRODUCTION LLC. All rights reserved.

Authors
Yadrintsev V.V. 1, 2, Ryzhova A.A. 3, Sochenkov I.V. 1
Publisher
Rossiiskii Gosudarstvennyi Gumanitarnyi Universitet
Issue number
19
Language
English
Pages
762-772
Status
Published
Volume
2020-June
Year
2020
Affiliations
  • 1 Federal Research Center “Computer Science and Control”, Russian Academy of Sciences, Moscow, Russian Federation
  • 2 Peoples' Friendship University of Russia (RUDN University), Moscow, Russian Federation
  • 3 Skolkovo Institute of Science and Technology, Moscow, Russian Federation
Keywords
BERT; Elmo; FastText; Hypernym discovery; Rusvectores; RuWordNet; Vector models
Date created
02.11.2020
Date modified
02.11.2020
Permanent link
https://repository.rudn.ru/ru/records/article/record/65453/