On the Global Convergence of Stochastic Fictitious Play in Stochastic Games with Turn-based Controllers
This paper presents a learning dynamic with an almost sure convergence guarantee for any stochastic game with turn-based controllers (on state transitions), as long as the stage payoffs have the stochastic fictitious-play property. For example, two-player zero-sum and n-player potential strategic-form games have this property. Note also that stage payoffs for different states can have different structures; for instance, they can sum to zero in some states and be identical in others. The presented dynamic combines classical stochastic fictitious play with value iteration for stochastic games. There are two key properties: (i) players play finite-horizon stochastic games of increasing length within the underlying infinite-horizon stochastic game, and (ii) the turn-based controllers ensure that the auxiliary stage games (induced by the estimated continuation payoffs) have the stochastic fictitious-play property.
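The abstract does not spell out the update rules, so the following is only a minimal sketch of the two ingredients it names, under several assumptions that are not taken from the paper: two players, a logit (softmax) smoothed best response, step sizes 1/(t+1), a fixed continuation-value estimate, and a reading of "turn-based controllers" in which the transition out of a state depends only on one player's action. All function and variable names are hypothetical. The sketch builds an auxiliary stage game by adding a discounted continuation payoff to the stage payoff, then runs stochastic fictitious play on it.

```python
import numpy as np


def smoothed_best_response(payoffs, temperature=0.1):
    """Logit (softmax) response to a vector of expected payoffs (assumed smoothing)."""
    z = (payoffs - payoffs.max()) / temperature  # shift by the max for numerical stability
    w = np.exp(z)
    return w / w.sum()


def auxiliary_stage_payoff(reward, trans_by_controller, value, gamma, controller_axis):
    """Stage payoff plus discounted expected continuation value.

    reward[a1, a2] is one player's stage payoff in the current state.
    trans_by_controller[a, s'] is the next-state distribution, which is assumed
    to depend only on the controlling player's action a.
    controller_axis is 0 if player 1 controls the transition, 1 if player 2 does.
    """
    continuation = gamma * (trans_by_controller @ value)  # one entry per controller action
    shape = (-1, 1) if controller_axis == 0 else (1, -1)
    return reward + continuation.reshape(shape)


def stochastic_fictitious_play(Q1, Q2, steps=2000, temperature=0.1, seed=0):
    """Each player responds smoothly to the empirical play of the other.

    Q1[a1, a2] and Q2[a1, a2] are the players' auxiliary stage payoffs.
    Returns the empirical action frequencies (beliefs) of both players.
    """
    rng = np.random.default_rng(seed)
    n1, n2 = Q1.shape
    belief_about_1 = np.full(n1, 1.0 / n1)
    belief_about_2 = np.full(n2, 1.0 / n2)
    for t in range(1, steps + 1):
        p1 = smoothed_best_response(Q1 @ belief_about_2, temperature)
        p2 = smoothed_best_response(belief_about_1 @ Q2, temperature)
        a1 = rng.choice(n1, p=p1)
        a2 = rng.choice(n2, p=p2)
        step = 1.0 / (t + 1)
        belief_about_1 += step * (np.eye(n1)[a1] - belief_about_1)
        belief_about_2 += step * (np.eye(n2)[a2] - belief_about_2)
    return belief_about_1, belief_about_2


# Illustrative zero-sum state: matching-pennies stage payoffs, player 1 controls transitions.
reward = np.array([[1.0, -1.0], [-1.0, 1.0]])
trans = np.array([[1.0, 0.0], [0.0, 1.0]])   # row: controller action, column: next state
value = np.array([0.0, 2.0])                 # hypothetical continuation-value estimate
Q1 = auxiliary_stage_payoff(reward, trans, value, gamma=0.5, controller_axis=0)
Q2 = auxiliary_stage_payoff(-reward, trans, -value, gamma=0.5, controller_axis=0)
beliefs_1, beliefs_2 = stochastic_fictitious_play(Q1, Q2)
```

In this illustrative state the auxiliary game stays zero-sum (`Q2 == -Q1`) because the continuation term is added to both players' payoffs with opposite signs. In the dynamic described above, the continuation-value estimate would itself be updated in a value-iteration-like manner rather than held fixed as it is here.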