Reinforcement Learning for Improved Random Access in Delay-Constrained Heterogeneous Wireless Networks
In this paper, we investigate, for the first time, the random access problem for a delay-constrained heterogeneous wireless network. We begin with a simple two-device problem in which two devices deliver delay-constrained traffic to an access point (AP) via a common unreliable collision channel. Assuming that one device (called Device 1) adopts ALOHA, we aim to optimize the random access scheme of the other device (called Device 2). The most intriguing part of this problem is that Device 2 has no knowledge of Device 1's access scheme but must still maximize the system timely throughput. We first propose a Markov Decision Process (MDP) formulation to derive a model-based upper bound, which quantifies the performance gap of any given random access scheme. We then utilize reinforcement learning (RL) to design an R-learning-based random access scheme, called tiny state-space R-learning random access (TSRA), which we subsequently extend to tackle the general multi-device problem. Extensive simulations show that the proposed TSRA simultaneously achieves higher timely throughput, lower computational complexity, and lower power consumption than the existing baseline, deep-reinforcement learning multiple access (DLMA). This indicates that our proposed TSRA scheme is a promising means of efficient random access for massive numbers of mobile devices with limited computation and battery capabilities.
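To make the R-learning idea behind TSRA concrete, the sketch below trains an average-reward (R-learning) agent for Device 2 against an ALOHA Device 1 on a toy collision channel. All specifics here are illustrative assumptions rather than the paper's exact model: Device 1's transmit probability, the channel reliability, the per-slot reward (1 for any successful delivery, as a proxy for timely throughput), and the tiny three-value state (previous slot observed as idle, success, or collision).

```python
import random

# Illustrative sketch of an R-learning random-access agent in the spirit of
# TSRA. Environment parameters below are assumptions, not the paper's model.
ALOHA_P = 0.3        # assumed: Device 1 transmits with this probability
CHANNEL_OK = 0.8     # assumed: a lone transmission succeeds with this prob.
STATES = ("idle", "success", "collision")  # previous-slot channel observation
ACTIONS = (0, 1)     # 0 = wait, 1 = transmit

def step(action, rng):
    """Simulate one slot; return (next observation, system reward)."""
    d1_sends = rng.random() < ALOHA_P
    if action == 1 and d1_sends:
        return "collision", 0.0               # both transmit: collision
    if action == 1 or d1_sends:
        if rng.random() < CHANNEL_OK:         # lone transmission on an
            return "success", 1.0             # unreliable channel
        return "idle", 0.0
    return "idle", 0.0                        # nobody transmits

def train(slots=50_000, alpha=0.1, beta=0.01, eps=0.1, seed=0):
    rng = random.Random(seed)
    R = {(s, a): 0.0 for s in STATES for a in ACTIONS}  # relative values
    rho = 0.0                                 # average-reward estimate
    s = "idle"
    for _ in range(slots):
        greedy = max(ACTIONS, key=lambda a: R[(s, a)])
        a = rng.choice(ACTIONS) if rng.random() < eps else greedy
        s2, r = step(a, rng)
        best_next = max(R[(s2, b)] for b in ACTIONS)
        best_cur = max(R[(s, b)] for b in ACTIONS)
        # R-learning update: relative value plus average-reward baseline
        R[(s, a)] += alpha * (r - rho + best_next - R[(s, a)])
        if a == greedy:                       # rho updated on greedy steps
            rho += beta * (r - rho + best_next - best_cur)
        s = s2
    return R, rho
```

Because the state space has only three values and the table only six entries, per-slot computation and memory stay constant, which is the property that distinguishes this style of scheme from deep-RL approaches such as DLMA.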