This paper studies aspects of building a global information monitoring system for big data using automated language identification. It describes a language identification technology that significantly reduces resource consumption when processing incoming data in systems that analyze large volumes of multilingual information, and thereby increases their efficiency. The technology is based on identified thematically independent syntactic markers, which make it possible not only to highlight the grammatical basis of a text but also to identify the language in which the text is written. A functional mathematical model is presented that captures the essence and main components of the developed technology, and a system for assessing its effectiveness is proposed. © 2018 IEEE.