Learning-based Optimal Admission Control in a Single Server Queuing System

12/21/2022

∙

We consider a long-term average profit maximizing admission control problem in an M/M/1 queuing system with a known arrival rate but an unknown service rate. With a fixed reward collected upon service completion and a cost per unit of time enforced on customers waiting in the queue, a dispatcher decides upon arrivals whether to admit the arriving customer or not based on the full history of observations of the queue-length of the system. <cit.> showed that if all the parameters of the model are known, then it is optimal to use a static threshold policy - admit if the queue-length is less than a predetermined threshold and otherwise not. We propose a learning-based dispatching algorithm and characterize its regret with respect to optimal dispatch policies for the full information model of <cit.>. We show that the algorithm achieves an O(1) regret when all optimal thresholds with full information are non-zero, and achieves an O(ln^3+ϵ(N)) regret in the case that an optimal threshold with full information is 0 (i.e., an optimal policy is to reject all arrivals), where N is the number of arrivals and ϵ>0.

READ FULL TEXT

Learning-based Optimal Admission Control in a Single Server Queuing System

Sign in with Google

Consider DeepAI Pro