Algorithms for prosodic discourse feature interpretation in case of its processing using low-speed codecs

Bessonov, M.A.; Bessonova, N.A.; Farkhadov, M.P.

This article proposes two algorithms for interpreting the prosodic features of discourse. The first is based on wide phonetic categories; the second is based on melodic cross-correlation functions of the audio signal and short-time energy series. Both algorithms, together with methodological recommendations for their use, are proposed as part of the problem of identifying the language of an audio signal using a prosodic approach. An experimental evaluation of both algorithms is presented. Neural networks are used as the decision rule. The initial wide phonetic categories were pause, pitch, and noise. We expanded them to pause, pitch, noise, five levels of pitch, sites of decreasing energy, main maximum, and adverse maximum. The total number of categories was 14. The algorithms can be applied to language identification or speaker identification. There is no requirement to restore the speech signal after it has been processed by a low-speed codec; however, the frames of the speech codec must contain parameters such as pitch, a tone-noise parameter, and energy. The speech database covers 10 languages with 10 speakers per language. The total speech time per speaker is 100 minutes, a duration that takes into account the statistical regularities of the languages. The algorithms were tested with a multilayer perceptron. © 2018 ASSA.
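As an illustration of the idea of labelling codec frames without restoring the speech signal, the following is a minimal sketch that assigns each frame one of the three basic wide phonetic categories (pause, pitch, noise) and quantizes pitch into five levels. It is not the authors' implementation: the `Frame` type, the function names, the energy floor, the pitch range, and the equal-width quantization are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    """Hypothetical decoded codec frame parameters (not a real codec API)."""
    pitch_hz: float  # pitch estimate; 0.0 when no pitch is present
    voiced: bool     # tone-noise parameter: True for tone, False for noise
    energy: float    # frame energy in arbitrary linear units


def label_frame(frame: Frame, energy_floor: float = 1e-3) -> str:
    """Map one frame to 'pause', 'pitch', or 'noise'.

    energy_floor is an assumed threshold, not a value from the article.
    """
    if frame.energy < energy_floor:
        return "pause"
    if frame.voiced and frame.pitch_hz > 0:
        return "pitch"
    return "noise"


def pitch_level(pitch_hz: float, lo: float = 60.0, hi: float = 400.0,
                levels: int = 5) -> int:
    """Quantize pitch into levels 1..levels using equal-width bands.

    The 60-400 Hz range and equal-width bands are assumptions.
    """
    clamped = min(max(pitch_hz, lo), hi)
    index = int((clamped - lo) / (hi - lo) * levels)
    return min(index, levels - 1) + 1


def label_sequence(frames: List[Frame]) -> List[str]:
    """Label a whole utterance frame by frame."""
    return [label_frame(f) for f in frames]


frames = [Frame(0.0, False, 0.0), Frame(120.0, True, 0.5), Frame(0.0, False, 0.5)]
labels = label_sequence(frames)
```

A sequence of such labels, or statistics computed from it, could then serve as input to a classifier such as a multilayer perceptron; how the article actually builds its feature vectors is not stated in the abstract.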

Authors

Bessonov M.A. ¹ , Bessonova N.A. ¹ , Farkhadov M.P. ²

Journal

Advances in Systems Science and Applications

Publisher

International Institute for General Systems Studies

Issue

Language

English

Pages

1-11

Status

Published

DOI

10.25728/ASSA.2018.18.1.524

Volume

Year

2018

Affiliations

¹ Peoples' Friendship University of Russia, Moscow, Russian Federation
² V.A. Trapeznikov Institute for Control Sciences of Russian Academy of Sciences, Moscow, Russian Federation

Keywords

Discourse prosodic feature; Language identification; Neural networks; Wide phonetic categories
