"Solving Markov Decision Processes via Simulation" by Abhijit Gosavi

Selected Works of Abhijit Gosavi

Contribution to Book

Solving Markov Decision Processes via Simulation

International Series in Operations Research and Management Science: Handbook of Simulation Optimization

Abhijit Gosavi, Missouri University of Science and Technology

Abstract

This chapter presents an overview of simulation-based techniques useful for solving Markov decision processes (MDPs). MDPs model problems of sequential decision-making under uncertainty, in which the decisions made in each state collectively affect the trajectory of states visited by the system over a time horizon of interest. Traditionally, MDPs have been solved via dynamic programming (DP), which requires a transition probability model that is difficult to derive in many realistic settings. The use of simulation for solving MDPs allows us to bypass the transition probability model and solve large-scale MDPs considered intractable by traditional DP. The simulation-based methodology for solving MDPs, which like DP is rooted in the Bellman equations, goes by names such as reinforcement learning, neuro-DP, and approximate or adaptive DP.

We begin with a description of algorithms for infinite-horizon discounted reward MDPs, followed by the same for infinite-horizon average reward MDPs. Then we present a discussion of finite-horizon MDPs. For each problem considered, we present a step-by-step description of a selected group of algorithms. In making this selection, we have attempted to blend the old and the classical with more recent developments. Finally, after touching upon extensions and convergence theory, we conclude with a brief summary of some applications and directions for future research.
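As a rough illustration of the idea in the abstract (not code from the chapter), the sketch below applies tabular Q-learning, one well-known reinforcement learning algorithm, to a discounted reward MDP. The two-state problem, the `simulate` function, and all parameter values are hypothetical and chosen only for illustration. The learner never reads the transition probabilities; it only draws samples from the simulator.

```python
import random


def simulate(state, action, rng):
    """Hypothetical simulator for a toy 2-state, 2-action MDP.

    The learner only sees sampled (next_state, reward) pairs; the 0.9
    success probability below is hidden inside the simulator.
    """
    target = action  # action 0 aims for state 0, action 1 aims for state 1
    next_state = target if rng.random() < 0.9 else 1 - target
    reward = 1.0 if next_state == 1 else 0.0  # being in state 1 pays off
    return next_state, reward


def q_learning(num_steps=50_000, gamma=0.9, alpha=0.05, seed=0):
    """Tabular Q-learning with uniformly random exploration (illustrative values)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[state][action]
    state = 0
    for _ in range(num_steps):
        action = rng.randrange(2)  # explore: pick an action uniformly at random
        next_state, reward = simulate(state, action, rng)
        # Sample-based version of the Bellman optimality update.
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


q = q_learning()
# Greedy policy: in each state, choose the action with the larger Q-value.
policy = [max(range(2), key=lambda a: q[s][a]) for s in range(2)]
print(policy)
```

In this toy problem the greedy policy should favor action 1 in both states, since that action steers the system toward the rewarding state. A constant step size is used here for simplicity; other step-size rules are possible.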

Department(s)

Engineering Management and Systems Engineering

Document Type

Book - Chapter

Document Version

Citation

File Type

text

Language(s)

English

Publication Date

1-1-2015

Disciplines

Operations Research, Systems Engineering and Industrial Engineering

Citation Information

Abhijit Gosavi. "Solving Markov Decision Processes via Simulation." International Series in Operations Research and Management Science: Handbook of Simulation Optimization, Vol. 216 (2015), pp. 341-379. ISSN: 0884-8289; 2214-7934
Available at: http://works.bepress.com/abhijit-gosavi/32/