The Hybrid Method for Accurate Patent Classification

Yadrintsev, V.V.; Sochenkov, I.V.

This article studies the stacking of two approaches to patent classification. The first is a linguistically supported k-nearest neighbors algorithm that searches for topically similar documents by comparing vectors of lexical descriptors. The second is fastText, a word-embedding-based classifier in which a sentence (or document) vector is obtained by averaging n-gram embeddings and then used as features by a multinomial logistic regression. On Russian and English datasets, the stacking classifier outperforms each of the single classifiers. © 2019, Pleiades Publishing, Ltd.
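The two ideas named in the abstract, averaging n-gram embeddings into a document vector and stacking the outputs of two base classifiers, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the toy embedding table, and the toy base-classifier scores are all hypothetical, and using a softmax (multinomial logistic) regression as the meta-classifier is an assumption, since the abstract does not say which meta-learner is used.

```python
import math

def char_ngrams(word, n=3):
    """Character n-grams of a word padded with '<' and '>' boundary markers."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def average_vectors(vectors):
    """Component-wise mean of equal-length vectors (a document vector)."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train_softmax(X, y, n_classes, lr=0.5, epochs=500):
    """Multinomial logistic regression fitted by batch gradient descent."""
    dim = len(X[0])
    W = [[0.0] * (dim + 1) for _ in range(n_classes)]  # last column is the bias
    for _ in range(epochs):
        grad = [[0.0] * (dim + 1) for _ in range(n_classes)]
        for x, label in zip(X, y):
            xb = x + [1.0]
            p = softmax([sum(w * v for w, v in zip(row, xb)) for row in W])
            for c in range(n_classes):
                err = p[c] - (1.0 if c == label else 0.0)
                for j, v in enumerate(xb):
                    grad[c][j] += err * v
        for c in range(n_classes):
            for j in range(dim + 1):
                W[c][j] -= lr * grad[c][j] / len(X)
    return W

def predict(W, x):
    """Index of the most probable class for feature vector x."""
    xb = x + [1.0]
    probs = softmax([sum(w * v for w, v in zip(row, xb)) for row in W])
    return max(range(len(probs)), key=probs.__getitem__)

def stack(p_knn, p_ft):
    """Meta-features: concatenated class scores of the two base classifiers."""
    return list(p_knn) + list(p_ft)

# Hypothetical 2-dimensional n-gram embeddings, for illustration only.
toy_table = {"<ca": [1.0, 0.0], "cat": [0.0, 1.0], "at>": [1.0, 1.0]}
doc_vector = average_vectors([toy_table[g] for g in char_ngrams("cat")])

# Made-up held-out scores (kNN scores, fastText scores, true class); not data from the paper.
held_out = [
    ([0.9, 0.1], [0.7, 0.3], 0),
    ([0.45, 0.55], [0.9, 0.1], 0),
    ([0.1, 0.9], [0.4, 0.6], 1),
    ([0.55, 0.45], [0.1, 0.9], 1),
]
X = [stack(a, b) for a, b, _ in held_out]
y = [label for _, _, label in held_out]
W = train_softmax(X, y, n_classes=2)
predictions = [predict(W, x) for x in X]
```

In the toy data each base classifier alone misclassifies one document (the second by the kNN scores, the fourth reversed), while the meta-classifier trained on the concatenated scores can separate all four. In practice the meta-features would come from base-classifier predictions on documents held out from base training, so that the meta-classifier does not learn from overfitted scores.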

Authors

Yadrintsev V.V.¹,², Sochenkov I.V.¹,³

Journal

Lobachevskii Journal of Mathematics

Publisher

Pleiades Publishing

Issue number

Language

English

Pages

1873-1880

State

Published

DOI

10.1134/S1995080219110325

Volume

Year

2019

Organizations

¹ Federal Research Center Computer Science and Control of the Russian Academy of Sciences, Moscow, 119333, Russian Federation
² Peoples’ Friendship University of Russia (RUDN University), Moscow, 117198, Russian Federation
³ Lomonosov Moscow State University, Moscow, 119991, Russian Federation

Keywords

fastText; KNN; patent classification; similarity search; stacking; word embeddings
