Anomaly detection for short texts: Identifying whether your chatbot should switch from goal-oriented conversation to chit-chatting

Goal-oriented conversational agents are systems able to converse with humans in natural language to help them reach a certain goal. The number of goals (or domains) an agent can converse about is limited, and one of the issues is identifying whether a user is talking about an unknown domain (so that the agent can report a misunderstanding or switch to chit-chat mode). We argue that this issue can be resolved by treating it as an anomaly detection task, a task from the field of machine learning. The scientific community has developed a broad range of methods for this task, but their applicability to short text data has not been investigated before. The aim of this work is to compare the performance of 6 different anomaly detection methods on Russian and English short texts modeling conversational utterances, proposing the first evaluation framework for this task. We find that a simple threshold on cosine similarity works better than the other methods for both of the considered languages. © Springer Nature Switzerland AG 2018.
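
The abstract does not spell out how the cosine-similarity threshold is applied, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes each utterance has already been mapped to a vector (for example by averaging word embeddings); the function names, the toy 3-dimensional vectors, and the threshold value of 0.5 are all hypothetical and chosen for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def is_out_of_domain(utterance_vec, domain_vecs, threshold=0.5):
    """Flag an utterance as anomalous when its best match among the
    known-domain vectors falls below the threshold (assumed value)."""
    best = max(cosine_similarity(utterance_vec, d) for d in domain_vecs)
    return best < threshold


# Toy vectors standing in for embedded utterances (hypothetical data).
known = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]
in_domain = [0.8, 0.2, 0.0]
off_topic = [0.0, 0.0, 1.0]

print(is_out_of_domain(in_domain, known))
print(is_out_of_domain(off_topic, known))
```

In practice the threshold would presumably be tuned on held-out data, and the known-domain vectors would come from real in-domain utterances rather than hand-written numbers.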

Authors
Bakarov A. (1, 2); Yadrintsev V. (2, 4); Sochenkov I. (2, 3)
Publisher
Springer Verlag
Language
English
Pages
289-298
Status
Published
Volume
859
Year
2018
Organizations
  • 1 The National Research University Higher School of Economics, Moscow, Russian Federation
  • 2 Federal Research Center ‘Computer Science and Control’ of Russian Academy of Sciences, Moscow, Russian Federation
  • 3 Skolkovo Institute of Science and Technology, Moscow, Russian Federation
  • 4 Peoples’ Friendship University of Russia (RUDN University), Moscow, Russian Federation
Keywords
Anomaly detection; Chatbot; Conversational agent; Distributional semantics; Novelty detection; Word embeddings