Learning from failures in robot-assisted feeding: Using online learning to develop manipulation strategies for bite acquisition
Successful robot-assisted feeding requires bite acquisition of a wide variety of food items, and different food items may require different manipulation actions for successful bite acquisition. A key challenge is therefore to handle previously unseen food items with very different action distributions. By leveraging contexts from previous bite-acquisition attempts, a robot should be able to learn online how to acquire such previously unseen food items. In this ongoing work, we construct a contextual bandit framework for this problem setting. We then propose variants of the ϵ-greedy and LinUCB contextual bandit algorithms to minimize cumulative regret within that setting. In future work, we expect to report empirical estimates of cumulative regret for each algorithm from robot bite-acquisition trials, as well as updated theoretical regret bounds that leverage the more structured context of this problem setting.
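To make the contextual bandit framing concrete, the sketch below shows a standard disjoint LinUCB learner (one linear reward model per manipulation action) applied to this kind of bite-acquisition loop. It is illustrative only, assuming a per-action linear reward model over food-item context features; the action count, feature dimension, exploration constant, and feature extraction are hypothetical placeholders, not details taken from the paper or its proposed variants.

```python
import numpy as np

class LinUCB:
    """Standard disjoint LinUCB: one linear reward model per action.

    Illustrative sketch only -- the action set, feature dimension, and
    exploration constant alpha are hypothetical, not from the paper.
    """

    def __init__(self, n_actions, dim, alpha=1.0):
        self.alpha = alpha
        # Per-action ridge-regression statistics: A = X^T X + I, b = X^T r.
        self.A = [np.eye(dim) for _ in range(n_actions)]
        self.b = [np.zeros(dim) for _ in range(n_actions)]

    def select(self, x):
        """Pick the action with the highest upper confidence bound for context x."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                            # estimated reward weights
            bonus = self.alpha * np.sqrt(x @ A_inv @ x)  # exploration bonus
            scores.append(theta @ x + bonus)
        return int(np.argmax(scores))

    def update(self, action, x, reward):
        """Incorporate the observed acquisition outcome (reward) for the tried action."""
        self.A[action] += np.outer(x, x)
        self.b[action] += reward * x

# Hypothetical usage: 6 candidate manipulation actions and a 32-dimensional
# visual context feature vector describing the food item on the plate.
bandit = LinUCB(n_actions=6, dim=32, alpha=0.5)
context = np.random.rand(32)           # stand-in for extracted food-item features
a = bandit.select(context)
bandit.update(a, context, reward=1.0)  # 1.0 = successful acquisition, 0.0 = failure
```

An ϵ-greedy variant would replace the confidence-bound selection with a coin flip between a random action (probability ϵ) and the action with the highest estimated reward; both strategies trade off exploring unfamiliar actions against exploiting what past attempts suggest will work.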