On Maximizing the Average Time at a GoalStochastic Processes and Their Applications
AbstractIn a decision process (gambling or dynamic programming problem) with finite state space and arbitrary decision sets (gambles or actions), there is always available a Markov strategy which uniformly (nearly) maximizes the average time spent at a goal. If the decision sets are closed, there is even a stationary strategy with the same property.Examples are given to show that approximations by discounted or finite horizon payoffs are not useful for the general average reward problem.
Citation InformationS. Demko and Theodore P. Hill. "On Maximizing the Average Time at a Goal" Stochastic Processes and Their Applications Vol. 17 Iss. 2 (1984) p. 349 - 357
Available at: http://works.bepress.com/tphill/12/