Neural Certificates for Safe Control Policies
This paper develops an approach to learn a policy of a dynamical system that is guaranteed to be both provably safe and goal-reaching. Here, the safety means that a policy must not drive the state of the system to any unsafe region, while the goal-reaching requires the trajectory of the controlled system asymptotically converges to a goal region (a generalization of stability). We obtain the safe and goal-reaching policy by jointly learning two additional certificate functions: a barrier function that guarantees the safety and a developed Lyapunov-like function to fulfill the goal-reaching requirement, both of which are represented by neural networks. We show the effectiveness of the method to learn both safe and goal-reaching policies on various systems, including pendulums, cart-poles, and UAVs.
READ FULL TEXT