Guidelines for Action Space Definition in Reinforcement Learning-Based Traffic Signal Control Systems

Maxime Treca, Julian Garbiso, Dominique Barth

PosterID: 61
Traffic signal control is an urban planning tool with important economic, social, and environmental implications. Reinforcement learning applied to traffic signal control (RL-TSC) has shown promising results compared to existing methods. While previous works in the RL-TSC literature have focused on optimizing state and reward definitions, the impact of the agent's action space definition remains largely unexplored. Indeed, typical RL-TSC models feature either phase-based controllers, which determine a signal duration in one go, or step-based controllers, which can interactively decide to extend a phase duration, without comparing their respective merits. In this paper, we provide guidelines for optimally defining RL-TSC actions by comparing different action types on a simulated network featuring different traffic demand patterns. Our results show that an agent's performance and convergence speed both increase with its interaction frequency with the environment. However, certain methods with lower observation frequencies, which can be achieved with realistic sensing technologies, perform reasonably close to higher-frequency ones in all scenarios, and even outperform them under specific traffic conditions.
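The phase-based versus step-based distinction drawn in the abstract can be sketched in a few lines of Python. This is a hypothetical illustration only: the class names, action sets, and durations below are assumptions for exposition and are not taken from the paper.

```python
# Illustrative sketch of the two RL-TSC action-space styles contrasted
# in the abstract. All names and numeric values are hypothetical.

class PhaseBasedController:
    """Commits to a full green-phase duration in one decision,
    so the agent interacts with the environment once per phase."""
    ACTIONS = [10, 20, 30, 40]  # candidate phase durations (seconds)

    def act(self, action_index):
        # The chosen duration is fixed for the entire phase.
        return self.ACTIONS[action_index]


class StepBasedController:
    """Re-decides every few seconds whether to extend the current
    phase, yielding a much higher interaction frequency."""
    STEP = 5  # decision interval in seconds (illustrative)

    def __init__(self):
        self.elapsed = 0  # time spent in the current phase

    def act(self, extend):
        # extend=True keeps the current phase for one more step;
        # extend=False switches phase and resets the timer.
        if extend:
            self.elapsed += self.STEP
        else:
            self.elapsed = 0
        return self.elapsed
```

The trade-off studied in the paper follows directly from this structure: the step-based agent observes and acts far more often, while the phase-based agent needs fewer (and thus more realistically obtainable) observations per cycle.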

Session E3: Planning with Uncertainty
Canberra: 10/29/2020, 01:00 – 02:00 and 10/30/2020, 21:00 – 22:00
Paris: 10/28/2020, 15:00 – 16:00 and 10/30/2020, 11:00 – 12:00
NYC: 10/28/2020, 10:00 – 11:00 and 10/30/2020, 06:00 – 07:00
LA: 10/28/2020, 07:00 – 08:00 and 10/30/2020, 03:00 – 04:00