Exponential Discretization of Weights of Neural Network Connections in Pre-Trained Neural Network. Part II: Correlation Maximization

Abstract: In this article, we develop a method of linear and exponential quantization of neural network weights. We improve it by maximizing the correlation between the initial and quantized weights, taking into account the weight density distribution in each layer. We perform the quantization after neural network training, without subsequent post-training, and compare our algorithm with plain linear and exponential quantization. The quality of the VGG-16 neural network is already satisfactory (top-5 accuracy of 76%) with 3-bit exponential quantization. The ResNet50 and Xception neural networks show top-5 accuracies at 4 bits of 79% and 61%, respectively. © 2020, Allerton Press, Inc.
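The abstract's core idea can be illustrated with a minimal sketch of exponential (power-of-two) weight quantization, plus a correlation check between the original and quantized weights. The level construction below (signed powers of two scaled by the layer's maximum absolute weight) and all function names are illustrative assumptions, not the authors' exact scheme, which additionally adapts to the per-layer weight density.

```python
import math


def exp_quantize(weights, bits=3):
    """Illustrative exponential quantization (not the paper's exact scheme).

    One bit holds the sign; the remaining bits index magnitude levels
    of the form w_max * 2^(-k), i.e. signed powers of two scaled by
    the layer's maximum absolute weight.
    """
    n_levels = 2 ** (bits - 1)  # magnitude levels available per sign
    w_max = max(abs(w) for w in weights) or 1.0
    levels = [w_max * 2.0 ** (-k) for k in range(n_levels)]

    def nearest(w):
        # Snap |w| to the closest available magnitude, keep the sign.
        mag = min(levels, key=lambda lvl: abs(abs(w) - lvl))
        return math.copysign(mag, w)

    return [nearest(w) for w in weights]


def pearson(x, y):
    """Pearson correlation, used to score how well q preserves w."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)
```

For example, `exp_quantize([0.9, -0.3, 0.05], bits=3)` snaps each weight to the nearest of the magnitudes {0.9, 0.45, 0.225, 0.1125}; the paper's contribution is to choose the levels so that the correlation between the initial and quantized weights is maximized, rather than fixing them a priori as done here.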

Authors
Pushkareva M.M.1, Karandashev I.M.1,2
Publisher
Allerton Press, Inc.
Number of issue
3
Language
English
Pages
179-186
Status
Published
Volume
29
Year
2020
Organizations
  • 1 Scientific Research Institute for System Analysis, Russian Academy of Sciences, Moscow, 117218, Russian Federation
  • 2 Peoples' Friendship University of Russia (RUDN University), Moscow, 117198, Russian Federation
Keywords
correlation maximization; exponential quantization; neural network; neural network compression; reduction of bit depth of weights; weight quantization
Date of creation
20.04.2021
Date of change
20.04.2021
Short link
https://repository.rudn.ru/en/records/article/record/72643/