Reward Function Optimization of a Deep Reinforcement Learning Collision Avoidance System
The proliferation of unmanned aircraft systems (UAS) has caused airspace regulation authorities to examine the interoperability of these aircraft with collision avoidance systems initially designed for large transport-category aircraft. Limitations in the currently mandated Traffic Alert and Collision Avoidance System (TCAS) led the Federal Aviation Administration to commission the development of a new solution, the Airborne Collision Avoidance System X (ACAS X), designed to provide a collision avoidance capability for multiple aircraft platforms, including UAS. Prior research explored deep reinforcement learning (DRL) algorithms for collision avoidance, but DRL did not perform as well as existing solutions. This work explores the benefits of a DRL collision avoidance system whose parameters are tuned using a surrogate optimizer. We show that using a surrogate optimizer leads to a DRL approach that can increase safety and operational viability and support future capability development for UAS collision avoidance.