Emotion recognition in human speech with deep learning models

The paper investigates deep neural network architectures for recognizing human emotions from speech. Convolutional neural networks and recurrent neural networks with LSTM memory cells were used as the deep learning models, and an ensemble of these networks was also built. Computer experiments were conducted with the proposed deep learning models and baseline machine learning algorithms on the emotional speech recordings of the RAVDESS audio database. The results showed the high efficiency of the neural network models, with accuracy estimates of 80% for some emotion classes.
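
The abstract does not give the network configuration, so the following is only a minimal sketch of the kind of BLSTM classifier named in the keywords: it assumes MFCC features as input, the eight RAVDESS emotion classes, and illustrative layer sizes and hyperparameters that are not taken from the paper.

```python
# Minimal sketch (not the authors' architecture) of a bidirectional LSTM
# classifier for speech emotion recognition on frame-level MFCC features.
# All sizes below are assumptions for illustration only.
import numpy as np
from tensorflow.keras import layers, models

NUM_CLASSES = 8    # RAVDESS emotion labels: neutral, calm, happy, sad, angry, fearful, disgust, surprised
NUM_FRAMES = 200   # assumed fixed-length (padded/truncated) frame sequence
NUM_MFCC = 40      # assumed number of MFCC coefficients per frame

model = models.Sequential([
    layers.Input(shape=(NUM_FRAMES, NUM_MFCC)),
    # Bidirectional LSTM reads the frame sequence in both directions
    layers.Bidirectional(layers.LSTM(128)),
    layers.Dropout(0.3),
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Placeholder random data; real use would substitute MFCCs extracted
# from the RAVDESS recordings with integer emotion labels.
X = np.random.randn(16, NUM_FRAMES, NUM_MFCC).astype("float32")
y = np.random.randint(0, NUM_CLASSES, size=16)
model.fit(X, y, epochs=1, batch_size=8)
```

A CNN branch or an ensemble of several such models, as described in the abstract, could be combined with this classifier, but those details are not specified on this page.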

Authors
Shchetinin E.Y. (1), Sevastianov L.A. (2, 3), Kulyabov D.S. (2, 3), Demidova A.V. (2)
Publisher
Peoples' Friendship University of Russia (RUDN University)
Language
English
Pages
368-372
Status
Published
Year
2020
Organizations
  • 1 Financial University under the Government of the Russian Federation
  • 2 Peoples' Friendship University of Russia (RUDN University)
  • 3 Joint Institute for Nuclear Research
Keywords
emotion recognition; deep learning; recurrent networks; BLSTM model