Learning to Correlate in Multi-Player General-Sum Sequential Games
In multi-player, general-sum games, there is increasing interest in solution concepts that model some form of communication among players, since they can lead to socially better outcomes than Nash equilibria and may be reached through decentralized learning dynamics. In this paper, we focus on coarse correlated equilibria (CCEs) in sequential games. First, we complete the picture on the complexity of finding social-welfare-maximizing CCEs by showing that the problem is not in Poly-APX unless P = NP. Second, simple arguments show that counterfactual regret minimization (CFR), which works with behavioral strategies, may not converge to a CCE. We therefore devise a simple variant (CFR-S) that provably converges to the set of CCEs but may be empirically inefficient. Finally, we design a further variant of CFR (called CFR-Jr) that approaches the set of CCEs with a regret bound sub-linear in the size of the game, and that is shown to be dramatically faster than both CFR-S and the state-of-the-art algorithms for computing CCEs.
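To make the CCE condition concrete, here is a minimal sketch in Python. It is not taken from the paper: it assumes a two-player normal-form game (a simplification of the sequential setting studied here), and the function name `cce_max_gain`, the payoff matrices, and the distributions are hypothetical illustrations. A joint distribution over action profiles is a CCE when no player can gain in expectation by committing to a single fixed action before any recommendation is drawn.

```python
from itertools import product


def cce_max_gain(mu, u1, u2):
    """Return the largest expected gain any player obtains by committing to
    one fixed action instead of following the joint distribution mu.

    mu[a][b] is the probability of the profile (row a, column b);
    u1 and u2 are the payoff matrices of players 1 and 2.
    mu is a CCE exactly when the returned value is <= 0.
    """
    n, m = len(u1), len(u1[0])
    profiles = list(product(range(n), range(m)))

    # Expected payoffs when both players follow mu.
    v1 = sum(mu[a][b] * u1[a][b] for a, b in profiles)
    v2 = sum(mu[a][b] * u2[a][b] for a, b in profiles)

    # Player 1 commits to row d; the column is still drawn from mu.
    dev1 = max(sum(mu[a][b] * u1[d][b] for a, b in profiles) for d in range(n))
    # Player 2 commits to column d; the row is still drawn from mu.
    dev2 = max(sum(mu[a][b] * u2[a][d] for a, b in profiles) for d in range(m))

    return max(dev1 - v1, dev2 - v2)


# Hypothetical game of Chicken: action 0 = Dare, action 1 = Chicken.
u1 = [[0, 7], [2, 6]]
u2 = [[0, 2], [7, 6]]

# Uniform over (Dare, Chicken), (Chicken, Dare), (Chicken, Chicken).
mu_good = [[0.0, 1 / 3], [1 / 3, 1 / 3]]
# All probability on (Dare, Dare).
mu_bad = [[1.0, 0.0], [0.0, 0.0]]

print(cce_max_gain(mu_good, u1, u2) <= 1e-9)  # True: no fixed deviation helps
print(cce_max_gain(mu_bad, u1, u2) <= 1e-9)   # False: switching to Chicken gains
```

In the sequential games considered in the paper, the same condition is stated over plans for the whole game tree rather than over single actions, so this check illustrates only the definition, not the CFR-S or CFR-Jr algorithms.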