A Russian keyword spotting system based on large vocabulary continuous speech recognition and linguistic knowledge

The paper describes the key concepts of a word spotting system for Russian based on large vocabulary continuous speech recognition. Key algorithms and system settings are described, including the pronunciation variation algorithm, and experimental results on real-life telecom data are provided. The system architecture and user interface are also described. The system is based on the CMU Sphinx open-source speech recognition platform and on linguistic models and algorithms developed by Speech Drive LLC. The effective combination of baseline statistical methods, real-world training data, and intensive use of linguistic knowledge led to a quality result applicable to industrial use. © 2016 Valentin Smirnov et al.

Authors
Smirnov V.1, Ignatov D.1, Gusev M.1, Farkhadov M.2, Rumyantseva N.3, Farkhadova M.3
Publisher
Hindawi Limited
Language
English
Status
Published
Number
4062786
Volume
2016
Year
2016
Organizations
  • 1 Speech Drive LLC, Saint-Petersburg, Russian Federation
  • 2 V.A. Trapeznikov Institute of Control Sciences of RAS, Moscow, Russian Federation
  • 3 RUDN University, Moscow, Russian Federation
Keywords
Continuous speech recognition; Digital storage; Linguistics; Open systems; User interfaces; Vocabulary control; Industrial use; Keyword spotting systems; Large vocabulary continuous speech recognition; Linguistic knowledge; Linguistic models; Pronunciation variation; Statistic method; System architectures; Speech recognition
Date of creation
19.10.2018
Date of change
19.10.2018
Short link
https://repository.rudn.ru/en/records/article/record/4363/