Analysis of 19.9 million publications from the PubMed/MEDLINE database using artificial intelligence methods: Approaches to the generalizations of accumulated data and the phenomenon of “fake news” [Анализ 19,9 млн публикаций базы данных PubMed/MEDLINE методами искусственного интеллекта: подходы к обобщению накопленных данных и феномен “fake news”]

Introduction. The English-language databases PubMed/MEDLINE and Embase are valuable information resources for finding original publications in basic and clinical medicine. Currently, there are no artificial intelligence systems for evaluating the quality of these publications.
Aim. To develop and test a system for sentiment analysis (i.e., analysis of emotional modality) of biomedical publications.
Materials and methods. A technique for analyzing the “big data” of biomedical publications was formulated on the basis of the topological theory of sentiment analysis. Algorithms were developed that classify texts into 16 sentiment classes with 90% accuracy (manipulative speech, research without positive results, propaganda, falsification of results, negative personal attitude, aggressive text, negative emotional background, etc.). Based on these algorithms, a scale for assessing the sentiment quality of research (the β-score) is proposed.
Results. Abstracts of 19.9 million publications registered in PubMed/MEDLINE over 50 years (1970–2019) were analyzed. Publications with low sentiment quality (a β-score below zero, which corresponds to the prevalence of manipulative and negative sentiments in the text) account for only 18.5% of the total (3.68 million out of 19.9 million). The highest β-scores were found for publications on sports medicine, systems biology, nutrition, and the use of applied mathematics and data mining in medicine. Classifying the entire corpus under the 27,840 headings of the MeSH system of PubMed/MEDLINE showed an increase in the β-score over the years (i.e., positive dynamics in the sentiment quality of the publication texts) for 27,090 of the headings studied. The most pronounced positive dynamics were found for research in genetics, physiology, pharmacology, and gerontology. A further 249 headings were identified that show sharply negative dynamics of sentiment quality and a pronounced increase in the manipulative sentiments characteristic of the tabloid press. Individual assessments by international experts that confirm the identified patterns are presented.
Conclusion. The proposed artificial intelligence system allows a researcher to assess the sentiment quality of biomedical research papers effectively, filtering out potentially inappropriate publications disguised as “evidence-based”.
Copyright © 2020, Farmakoekonomika. All rights reserved.

Authors

Yu T.I.¹,², Gromova O.A.¹,², Stakhovskaya L.V.³, Vanchakova N.P.⁴, Galustyan A.N.⁵, Kobalava Zh.D.⁶, Grishina T.R.⁷, Gromov A.N.¹, Ilovaiskaya I.A.⁸, Kodentsova V.M.⁹, Kalacheva A.G.⁷, Limanova O.A.⁷, Maksimov V.A.¹⁰, Malyavskaya S.I.¹¹, Mozgovaya E.V.¹², Tapilskaya N.I.⁵,¹², Rudakov K.V.¹, Semenov V.A.¹³

Journal

Фармакоэкономика. Современная фармакоэкономика и фармакоэпидемиология

Issue number

Language

Russian

Pages

146-163

Status

Published

DOI

10.17749/2070-4909/FARMAKOEKONOMIKA.2020.021

Volume

Year

2020

Organizations

¹ Federal Research Center “Informatics and Management” of the Russian Academy of Sciences, 44-2 Vavilova Str., Moscow, 119333, Russian Federation
² Moscow State University, 1 Leninskie gory, Moscow, 119991, Russian Federation
³ Federal Center for Cerebrovascular Pathology and Stroke, 1-10 Ostrovityanova Str., Moscow, 117997, Russian Federation
⁴ Center for Psychosomatic Medicine at the Clinical Hospital No. 122 named after L.G. Sokolov, 4 pr. Kultury, St. Petersburg, 194291, Russian Federation
⁵ Saint-Petersburg State Pediatric Medical University, 2 Litovskaya Str., St. Petersburg, 194100, Russian Federation
⁶ Peoples’ Friendship University of Russia, 10/3 Miklukho-Maklaya Str., Moscow, 117198, Russian Federation
⁷ Ivanovo State Medical Academy, 8 Sheremetevsky prospekt, Ivanovo, 153012, Russian Federation
⁸ State Budgetary Healthcare Institution of Moscow Area, Moscow Regional Research Clinical Institute n.a. M.F. Vladimirskiy, 61/2 Shchepkina Str., Moscow, 129110, Russian Federation
⁹ Federal Research Center for Nutrition and Biotechnology, 2/14 Ustinsky proezd, Moscow, 109240, Russian Federation
¹⁰ Federal State Budgetary Educational Institution of Continuing Professional Education “Russian Medical Academy of Continuing Professional Education”, Ministry of Health of the Russian Federation, 2/1 Building 1 Barrikadnaya Str., Moscow, 125993, Russian Federation
¹¹ Northern State Medical University, 51 Troitskiy Ave., Arkhangelsk, 163000, Russian Federation
¹² Research Institute of Obstetrics, Gynecology and Reproductology named after D.O. Ott, 3 Mendeleev Line, St. Petersburg, 199034, Russian Federation
¹³ Kemerovo State Medical University, 22a Voroshilova Str., Kemerovo, 650056, Russian Federation

Keywords

Artificial intelligence; Big data analysis; Evidence-based medicine; Machine learning; Pharmacoinformatics; Publication quality assessment algorithms; Thematic modeling
