Semantic role labeling with pretrained language models for known and unknown predicates

We build the first full pipeline for semantic role labeling of Russian texts. The pipeline implements predicate identification, argument extraction, argument classification (labeling), and global scoring via integer linear programming. We train supervised neural network models for argument classification on FrameBank, a semantically annotated Russian corpus. However, this resource provides annotations for only a very limited set of predicates. We address this annotation scarcity by introducing two models that rely on different sets of features: one for “known” predicates, which are present in the training set, and one for “unknown” predicates, which are not. We show that the model for “unknown” predicates can alleviate the lack of annotation by using pretrained embeddings. We experiment with several types of embeddings, both classical shallow ones (word2vec, FastText) and ones generated by deep pretrained language models (ELMo, BERT), and show that the latter are superior to the former for argument classification of both “known” and “unknown” predicates. © 2019 Association for Computational Linguistics (ACL). All rights reserved.
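
The abstract mentions a global scoring step but does not state which constraints it enforces. As a rough illustration only, the sketch below assumes one constraint that is common in semantic role labeling: no role may be assigned to more than one argument of the same predicate. It is not the authors' implementation. For brevity it uses exhaustive search over role permutations instead of an integer linear programming solver, and the function name, role names, and scores are all hypothetical.

```python
from itertools import permutations


def best_unique_assignment(scores, roles):
    """Pick one distinct role per argument so the total score is maximal.

    scores: one dict per argument, mapping role name -> classifier score.
    roles:  all candidate role names (hypothetical labels, not FrameBank's).
    Exhaustive search stands in for an ILP solver; fine for a handful of
    arguments, far too slow for large inputs.
    """
    if len(scores) > len(roles):
        raise ValueError("more arguments than distinct roles")
    best_total, best_roles = float("-inf"), None
    # Each permutation assigns a different role to every argument.
    for combo in permutations(roles, len(scores)):
        total = sum(arg.get(role, 0.0) for arg, role in zip(scores, combo))
        if total > best_total:
            best_total, best_roles = total, list(combo)
    return best_roles


# Made-up scores for two arguments of one predicate. Taking the top role
# for each argument independently would label both of them "agent".
roles = ["agent", "patient", "instrument"]
scores = [
    {"agent": 0.6, "patient": 0.3, "instrument": 0.1},
    {"agent": 0.5, "patient": 0.4, "instrument": 0.1},
]
assignment = best_unique_assignment(scores, roles)
print(assignment)
```

With these made-up scores, the uniqueness constraint moves the second argument from "agent" to "patient", because that combination has the highest total among assignments with distinct roles.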

Authors
Larionov D. (1, 2), Chistova E. (1, 2), Shelmanov A. (1, 3), Smirnov I. (1, 2)
Conference proceedings
Publisher
Incoma Ltd
Language
English
Pages
619-628
Status
Published
Volume
2019-September
Year
2019
Affiliations
  • 1 FRC CSC RAS, Moscow, Russian Federation
  • 2 RUDN University, Moscow, Russian Federation
  • 3 Skoltech, Moscow, Russian Federation
Keywords
Computational linguistics; Deep learning; Embeddings; Integer programming; Pipelines; Semantics; Integer Linear Programming; Language model; Semantic role labeling; Semantic roles; Sets of features; Supervised neural networks; Training sets; Natural language processing systems
Date created
24.12.2019
Date modified
24.12.2019
Permanent link
https://repository.rudn.ru/ru/records/article/record/55379/