Concurrent Decentralized Channel Allocation and Access Point Selection using Multi-Armed Bandits in multi BSS WLANs
Enterprise Wireless Local Area Networks (WLANs) consist of multiple Access Points (APs) covering a given area. Finding a suitable network configuration that maximizes the performance of enterprise WLANs is a challenging task given the complex dependencies between APs and stations. Recently, reinforcement learning techniques have emerged in wireless networking as an effective way to efficiently explore the impact of different network configurations on system performance, identifying those that perform better. In this paper, we study whether Multi-Armed Bandits (MABs) offer a feasible solution to the decentralized channel allocation and AP selection problems in enterprise WLAN scenarios. To do so, we empower APs and stations with agents that, by implementing the Thompson sampling algorithm, explore and learn which channel to use and which AP to associate with, respectively. Our evaluation is performed over randomly generated scenarios that cover different network topologies and traffic loads. The presented results show that the proposed adaptive framework using MABs outperforms the static approach (i.e., always using the initial default configuration, which is usually random) regardless of the network density and the traffic requirements. Moreover, we show that the proposed framework reduces the performance variability across scenarios. Results also show that we achieve the same performance (or better) than static strategies with fewer APs for the same number of stations. Finally, special attention is placed on how the agents interact: even though they operate in a completely independent manner, their decisions have interrelated effects, since they take actions over the same set of channel resources.
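To make the agent-based approach concrete, the following is a minimal sketch of a Thompson sampling agent such as the ones the abstract describes. It is not the paper's implementation: the arm set (channels for an AP, or candidate APs for a station), the normalization of throughput rewards to [0, 1], and the Bernoulli trick used to keep the Beta posterior conjugate are all assumptions made for illustration.

```python
import numpy as np


class ThompsonSamplingAgent:
    """Illustrative Thompson sampling agent over a discrete set of arms
    (e.g., channels available to an AP, or candidate APs for a station).
    Rewards are assumed to be throughput values normalized to [0, 1]."""

    def __init__(self, n_arms, rng=None):
        self.rng = rng or np.random.default_rng()
        # Beta(1, 1) priors: uniform belief over each arm's quality.
        self.alpha = np.ones(n_arms)
        self.beta = np.ones(n_arms)

    def select_arm(self):
        # Draw one sample per arm from its posterior and play the best draw.
        samples = self.rng.beta(self.alpha, self.beta)
        return int(np.argmax(samples))

    def update(self, arm, reward):
        # Convert the normalized reward into a Bernoulli outcome so the
        # Beta posterior update stays exact (an assumption of this sketch).
        success = self.rng.random() < reward
        self.alpha[arm] += success
        self.beta[arm] += 1 - success


# Toy usage: one agent choosing among 3 channels, with a made-up reward
# function standing in for the throughput an AP would actually measure.
if __name__ == "__main__":
    true_quality = [0.3, 0.8, 0.5]  # hypothetical per-channel quality
    agent = ThompsonSamplingAgent(n_arms=3)
    for t in range(1000):
        arm = agent.select_arm()
        reward = float(np.clip(np.random.normal(true_quality[arm], 0.1), 0.0, 1.0))
        agent.update(arm, reward)
    print("posterior means:", agent.alpha / (agent.alpha + agent.beta))
```

In the decentralized setting the abstract describes, each AP and each station would run its own independent instance of such an agent, so the rewards observed by one agent implicitly depend on the concurrent choices of the others.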