Symbolic Plans as High-Level Instructions for Reinforcement Learning

León Illanes, Xi Yan, Rodrigo Toro Icarte, Sheila A. McIlraith

PosterID: 69

Reinforcement Learning (RL) agents seek to maximize the cumulative reward obtained when interacting with their environment. When this reward signal is sparsely distributed, as is the case for final-state goals, it may take a very large number of interactions before the agent learns an adequate policy. Some modern RL approaches address this issue by directly providing the agent with high-level instructions or by specifying reward functions that implicitly encode such instructions. In this work, we explore the use of high-level symbolic action models and Automated Planning techniques to synthesize high-level instructions automatically. We show how high-level plans can be exploited in a Hierarchical RL (HRL) setting, and we empirically evaluate the approach over multiple sets of final-state goal tasks. Our results show that our approach converges to near-optimal solutions much faster than standard RL and HRL techniques, and that it provides an effective framework for transferring learned skills across multiple tasks in a given environment.
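To make the idea of plan steps acting as high-level instructions concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a hypothetical 5x5 grid world, a hand-written two-step plan (`get_key`, then `reach_door`) standing in for the output of an automated planner, and one tabular Q-learning option per plan step. All names, grid positions, and hyperparameters are assumptions made for illustration only.

```python
import random

SIZE = 5                                       # hypothetical 5x5 grid world
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # four unit moves
GAMMA = 0.9                                    # discount factor (assumed)


def step(state, action):
    """Deterministic move; the agent is clamped to stay inside the grid."""
    x, y = state
    dx, dy = action
    return (min(max(x + dx, 0), SIZE - 1), min(max(y + dy, 0), SIZE - 1))


def train_option(subgoal, rng, episodes=300, max_steps=200):
    """Tabular Q-learning for one option, rewarded only at its own subgoal."""
    q = {((x, y), a): 0.0
         for x in range(SIZE) for y in range(SIZE)
         for a in range(len(ACTIONS))}
    for _ in range(episodes):
        state = (rng.randrange(SIZE), rng.randrange(SIZE))
        for _ in range(max_steps):
            if state == subgoal:
                break
            # Uniform random exploration; Q-learning is off-policy.
            a = rng.randrange(len(ACTIONS))
            nxt = step(state, ACTIONS[a])
            if nxt == subgoal:
                target = 1.0
            else:
                target = GAMMA * max(q[(nxt, b)] for b in range(len(ACTIONS)))
            # A learning rate of 1 is used because moves are deterministic.
            q[(state, a)] = target
            state = nxt
    return q


def run_plan(plan, subgoals, options, start, max_steps=100):
    """Execute the plan step by step, acting greedily inside each option."""
    state, trace = start, [start]
    for name in plan:
        q, subgoal = options[name], subgoals[name]
        steps = 0
        while state != subgoal and steps < max_steps:
            a = max(range(len(ACTIONS)), key=lambda b: q[(state, b)])
            state = step(state, ACTIONS[a])
            trace.append(state)
            steps += 1
    return trace


rng = random.Random(0)
# Hypothetical symbolic plan; a planner would normally synthesize this.
SUBGOALS = {"get_key": (0, 4), "reach_door": (4, 4)}
PLAN = ["get_key", "reach_door"]
OPTIONS = {name: train_option(goal, rng) for name, goal in SUBGOALS.items()}

trace = run_plan(PLAN, SUBGOALS, OPTIONS, start=(0, 0))
print("final state:", trace[-1], "steps taken:", len(trace) - 1)
```

Because each option is trained against its own subgoal rather than the final goal, the same learned options could in principle be reused under a different plan in the same grid, which loosely mirrors the skill-transfer claim in the abstract.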

Session Am4: Planning & Learning
Canb:  10/29/2020, 10:00 – 11:00; 10/31/2020, 03:00 – 04:00
Paris: 10/29/2020, 00:00 – 01:00; 10/30/2020, 17:00 – 18:00
NYC:   10/28/2020, 19:00 – 20:00; 10/30/2020, 12:00 – 13:00
LA:    10/28/2020, 16:00 – 17:00; 10/30/2020, 09:00 – 10:00