Finding control policy for one discrete-time Markov chain on [0, 1] with a given invariant measure

A discrete-time Markov chain on the interval [0, 1] with two possible transitions (left or right) at each step is considered. The probability of a transition towards 0 (and, complementarily, towards 1) is a function of the current value of the chain. Having chosen the direction, the chain moves to a randomly chosen point in the appropriate interval. The authors assume that the transition probabilities depend on the current value of the chain only through a finite number of real-valued parameters. Under this assumption, they seek the transition probabilities that minimize the L2 distance between the stationary density of the Markov chain and a given invariant measure on [0, 1]. Since there is no reward function in this problem, it does not fit into the Markov decision process (MDP) framework. The authors follow the sensitivity-based approach and propose a gradient- and simulation-based method for estimating the parameters of the transition probabilities. Numerical results are presented that show the performance of the method for various transition probabilities and invariant measures on [0, 1]. © 2018 Federal Research Center Computer Science and Control of Russian Academy of Sciences.
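
The model and the objective described in the abstract can be illustrated with a minimal simulation sketch. It is not the authors' sensitivity-based estimator. It only simulates the chain and measures a histogram-based L2 distance to a target density. Several details are assumptions, because the abstract does not fix them: the new point is drawn uniformly, the "appropriate interval" is taken to be [0, x] for a left move and [x, 1] for a right move, and the names `simulate_chain`, `histogram_density`, `l2_distance` and `p_left` are hypothetical.

```python
import random


def simulate_chain(p_left, n_steps, x0=0.5, seed=0):
    """Simulate the chain on [0, 1]; p_left(x) is the probability of moving towards 0."""
    rng = random.Random(seed)
    x = x0
    path = []
    for _ in range(n_steps):
        if rng.random() < p_left(x):
            # Assumption: a left move lands uniformly on [0, x].
            x = rng.uniform(0.0, x)
        else:
            # Assumption: a right move lands uniformly on [x, 1].
            x = rng.uniform(x, 1.0)
        path.append(x)
    return path


def histogram_density(samples, n_bins):
    """Piecewise-constant density estimate on [0, 1] with equal-width bins."""
    counts = [0] * n_bins
    for s in samples:
        idx = min(int(s * n_bins), n_bins - 1)  # put s == 1.0 into the last bin
        counts[idx] += 1
    n = len(samples)
    return [c * n_bins / n for c in counts]


def l2_distance(density, target):
    """Midpoint-rule approximation of the L2 distance between the estimate and target(y)."""
    n_bins = len(density)
    width = 1.0 / n_bins
    total = 0.0
    for i, d in enumerate(density):
        mid = (i + 0.5) * width
        total += (d - target(mid)) ** 2 * width
    return total ** 0.5


if __name__ == "__main__":
    # Hypothetical one-line policy: move left with probability equal to the current value.
    path = simulate_chain(lambda x: x, n_steps=20000, seed=1)
    density = histogram_density(path, n_bins=10)
    print(l2_distance(density, lambda y: 1.0))
```

Under these assumptions the choice p_left(x) = x is a convenient sanity check: the transition density from x to y equals x · (1/x) for y < x and (1 − x) · (1/(1 − x)) for y > x, that is, 1 everywhere, so the stationary density is uniform and the estimated distance to the uniform target should be close to zero. A parametric family for `p_left` and a gradient step on its parameters would be layered on top of such a simulator.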

Authors

Konovalov M.G. ¹, Razumchik R.V. ¹, ²

Journal

Информатика и ее применения (Informatika i ee Primeneniya)

Publisher

Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences

Issue number

Language

Russian

Pages

2-13

Status

Published

DOI

10.14357/19922264180301

Volume

Year

2018

Affiliations

¹ Institute of Informatics Problems, Federal Research Center “Computer Science and Control”, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow, 119333, Russian Federation
² Peoples Friendship University of Russia, RUDN University, 6 Miklukho-Maklaya Str., Moscow, 117198, Russian Federation

Keywords

Continuous state space; Control; Derivative estimation; Markov chain; Sensitivity-based approach
