Developing a computer system for student learning based on vision-language models

Shchetinin, E.Yu.; Glushkova A.G.; Demidova A.V.

In recent years, artificial intelligence methods have been developed in various fields, particularly in education. The development of computer systems for student learning is an important task that can significantly improve student learning. The development and implementation of deep learning methods in the educational process have gained immense popularity. The most successful among them are models that account for the multimodal nature of information, in particular the combination of text, sound, images, and video. The difficulty in processing such data is that combining multimodal inputs with channel concatenation methods that ignore the heterogeneity of the modalities is inefficient. To solve this problem, the paper proposes an inter-channel attention module and presents a vision-language computer system for student learning based on the concatenation of multimodal input data using this module. It is shown that effective and flexible learning systems and technologies built on such models make it possible to adapt the educational process to the individual needs of students and to increase its efficiency. © Shchetinin E. Y., Glushkova A. G., Demidova A. V., 2024.
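The abstract does not describe the internals of the inter-channel attention module, so the following is only a minimal sketch of the general idea, not the authors' architecture. It assumes a generic squeeze-and-excitation-style gate: image and text feature channels are concatenated, each channel is summarised by its mean, and a small two-layer network produces a per-channel weight in (0, 1) that rescales the channel. All names (`channel_attention_fuse`, `W1`, `W2`), the feature sizes, and the random weights are hypothetical and chosen purely for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def channel_attention_fuse(image_channels, text_channels, W1, W2):
    """Concatenate two modalities along the channel axis and reweight each
    channel with a learned gate (assumed squeeze-and-excitation-style design)."""
    channels = image_channels + text_channels            # plain channel concatenation
    squeezed = [sum(c) / len(c) for c in channels]       # "squeeze": mean per channel
    hidden = [max(0.0, h) for h in matvec(W1, squeezed)]  # bottleneck layer with ReLU
    gates = [sigmoid(g) for g in matvec(W2, hidden)]     # one gate in (0, 1) per channel
    fused = [[g * x for x in c] for g, c in zip(gates, channels)]
    return fused, gates


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Hypothetical toy inputs: 3 image channels and 2 text channels, 4 values each.
rng = random.Random(0)
image_channels = [[rng.random() for _ in range(4)] for _ in range(3)]
text_channels = [[rng.random() for _ in range(4)] for _ in range(2)]

n_channels = len(image_channels) + len(text_channels)
W1 = random_matrix(2, n_channels, rng)   # untrained weights, for illustration only
W2 = random_matrix(n_channels, 2, rng)

fused, gates = channel_attention_fuse(image_channels, text_channels, W1, W2)
print([round(g, 3) for g in gates])
```

In a trained system the gate weights would be learned jointly with the rest of the model, so that channels from a less informative modality can be down-weighted instead of being treated the same as all others, which is the weakness of plain concatenation that the abstract points to.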

Authors

Shchetinin E.Yu., Glushkova A.G., Demidova A.V.

Journal

Discrete and Continuous Models and Applied Computational Science

Publisher

Federal State Autonomous Educational Institution of Higher Education Peoples' Friendship University of Russia (RUDN University)

Issue number

Language

English

Pages

234-241

Status

Published

DOI

10.22363/2658-4670-2024-32-2-234-241

Volume

Year

2024

Affiliations

¹ Department of Mathematics, Financial University under the Government of the Russian Federation, 49 Leningradsky Ave, Moscow, 125993, Russian Federation
² Endeavor, Chiswick Park, 566 Chiswick High Road, London, W4 5HR, United Kingdom
³ Department of Probability Theory and Cyber Security, RUDN University, 6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation

Keywords

deep learning; neural networks-transformers; through-channel attention module; vision-language learning model
