Fast and Accurate Patent Classification in Search Engines

This article presents a new approach to large-scale patent classification. The need to classify documents often arises in professional information retrieval systems. We describe our approach, which is based on linguistically supported k-nearest neighbors (KNN). We evaluate it experimentally on Russian and English datasets and compare it with fastText, a modern classification technique. We show that KNN is a viable alternative to traditional text classifiers, achieving comparable accuracy while using fewer additional hardware resources. © Published under licence by IOP Publishing Ltd.
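
To illustrate the general idea of k-nearest-neighbor text classification mentioned in the abstract, here is a minimal sketch. It is not the authors' linguistically supported method: it assumes plain whitespace tokens, TF-IDF weights, cosine similarity, and similarity-weighted voting. The function names (`tfidf_vectors`, `cosine`, `knn_classify`), the toy documents, and the labels are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized documents into sparse TF-IDF dicts; also return the IDF table."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption of this sketch, not taken from the paper).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(query_tokens, train_vectors, train_labels, idf, k=3):
    """Label a query by similarity-weighted vote among its k nearest training documents."""
    # Terms unseen during training carry no weight and are dropped.
    q = {t: tf * idf[t] for t, tf in Counter(query_tokens).items() if t in idf}
    scored = sorted(
        ((cosine(q, v), label) for v, label in zip(train_vectors, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter()
    for sim, label in scored[:k]:
        votes[label] += sim
    return votes.most_common(1)[0][0] if votes else None


# Toy training set (hypothetical; real patent collections are far larger).
train_docs = [
    "wireless signal antenna transmitter".split(),
    "antenna receiver signal modulation".split(),
    "pharmaceutical compound tablet dosage".split(),
    "tablet dosage treatment compound".split(),
]
train_labels = ["communication", "communication", "medicine", "medicine"]

vectors, idf = tfidf_vectors(train_docs)
print(knn_classify("signal antenna amplifier".split(), vectors, train_labels, idf, k=3))
```

In a search engine setting, the exhaustive similarity scan in `knn_classify` would typically be replaced by a lookup in an existing inverted index, so that only documents sharing terms with the query are scored.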

Authors
Yadrintsev V. 1, 2, Bakarov A. 1, 3, Suvorov R. 1, Sochenkov I. 1, 4
Conference proceedings
Publisher
Institute of Physics Publishing
Number of issue
1
Language
English
Status
Published
Number
012004
Volume
1117
Year
2018
Organizations
  • 1 Federal Research Center Computer Science and Control, Russian Academy of Sciences, Moscow, Russian Federation
  • 2 Peoples Friendship University of Russia, RUDN University, Moscow, Russian Federation
  • 3 National Research University Higher School of Economics, Moscow, Russian Federation
  • 4 Skolkovo Institute of Science and Technology, Moscow, Russian Federation
Keywords
Big data; Classification (of information); Information retrieval; Information retrieval systems; Nearest neighbor search; Patents and inventions; Tellurium compounds; Classification technique; Hardware resources; K-nearest neighbors; New approaches; Patent classifications; Text classifiers; Search engines
Date of creation
04.02.2019
Date of change
04.02.2019
Short link
https://repository.rudn.ru/en/records/article/record/36232/