Deep Reinforcement Learning Approach to Solve Dynamic Vehicle Routing Problem with Stochastic Customers

Waldy Joe, Hoong Chuin Lau

PosterID: 41

In real-world urban logistics operations, changes to the routes and tasks occur in response to dynamic events. To ensure customers’ demands are met, planners need to make these changes quickly (sometimes instantaneously). This paper formulates a dynamic vehicle routing problem with time windows and both known and stochastic customers as a route-based Markov Decision Process. We propose a solution approach, called DRLSA, that combines Deep Reinforcement Learning (specifically neural network-based Temporal-Difference learning with experience replay) to approximate the value function with a routing heuristic based on Simulated Annealing. Our approach enables optimized re-routing decisions to be generated almost instantaneously. Furthermore, to exploit the structure of this problem, we propose a state representation based on the total cost of the remaining routes of the vehicles. We show that the cost of the remaining routes of the vehicles can serve as a proxy for the sequence of the routes and the time window requirements. DRLSA is evaluated against the commonly used Approximate Value Iteration (AVI) and Multiple Scenario Approach (MSA). Our experimental results show that DRLSA achieves, on average, a 10% improvement over the myopic approach and outperforms AVI and MSA, even with a small number of training episodes, on problems with a degree of dynamism above 0.5.
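The value-function idea in the abstract can be illustrated with a minimal sketch of Temporal-Difference learning with experience replay. This is not the authors' implementation: a two-weight linear model stands in for the neural network, the state is a single number standing for the (normalized) total cost of the remaining routes, and the toy transitions, the reward definition (negative cost incurred per step), and all names (`LinearValue`, `td_replay_train`, `toy_transitions`) are assumptions made for illustration only.

```python
import random
from collections import deque


class LinearValue:
    """V(s) = w0 + w1 * s. A linear stand-in (assumption) for a neural network."""

    def __init__(self):
        self.w = [0.0, 0.0]

    def predict(self, s):
        return self.w[0] + self.w[1] * s

    def update(self, s, target, lr):
        # Semi-gradient step: move V(s) towards the bootstrapped TD target.
        err = target - self.predict(s)
        self.w[0] += lr * err
        self.w[1] += lr * err * s


def td_replay_train(transitions, gamma=1.0, lr=0.05, batch_size=4,
                    steps=3000, capacity=1000, seed=0):
    """TD(0) with experience replay over (state, reward, next_state, done) tuples."""
    rng = random.Random(seed)
    buffer = deque(maxlen=capacity)  # replay buffer, oldest entries evicted first
    buffer.extend(transitions)
    model = LinearValue()
    for _ in range(steps):
        # Sample a mini-batch of stored experiences without replacement.
        batch = rng.sample(list(buffer), min(batch_size, len(buffer)))
        for s, r, s_next, done in batch:
            target = r if done else r + gamma * model.predict(s_next)
            model.update(s, target, lr)
    return model


# Hypothetical toy episode: the remaining route cost shrinks from 1.0 to 0.0
# in four steps, and each step incurs a cost of 0.25 (reward = -0.25).
toy_transitions = [
    (1.00, -0.25, 0.75, False),
    (0.75, -0.25, 0.50, False),
    (0.50, -0.25, 0.25, False),
    (0.25, -0.25, 0.00, True),
]

model = td_replay_train(toy_transitions)
print(model.predict(1.0), model.predict(0.5))
```

In this toy setting the true value of a state equals the negative of its remaining cost, so the learned estimates for states 1.0 and 0.5 should approach -1.0 and -0.5. In a fuller pipeline, such a value estimate would be used to score the candidate route plans produced by a routing heuristic; that coupling is not shown here.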

Session Aus3+Aus5: Probabilistic Planning & Learning

Canb: 10/28/2020, 11:00 – 12:15; 10/29/2020, 20:00 – 21:15

Paris: 10/28/2020, 01:00 – 02:15; 10/29/2020, 10:00 – 11:15

NYC: 10/27/2020, 20:00 – 21:15; 10/29/2020, 05:00 – 06:15

LA: 10/27/2020, 17:00 – 18:15; 10/29/2020, 02:00 – 03:15

Solving K-MDPs

Jonathan Ferrer-Mestres, Thomas G. Dietterich, Olivier Buffet, Iadine Chadès

Optimal and Heuristic Approaches for Constrained Flight Planning under Weather Uncertainty

Florian Geißer, Guillaume Povéda, Felipe Trevizan, Manon Bondouy, Florent Teichteil-Königsbuch, Sylvie Thiébaux

We Mind Your Well-Being: Preventing Depression in Uncertain Social Networks by Sequential Interventions

Aye Phyu Phyu Aung, Xinrun Wang, Bo An, Xiaoli Li

Learning Domain-Independent Planning Heuristics with Hypergraph Networks

William Shen, Felipe Trevizan, Sylvie Thiébaux

Deep Reinforcement Learning Approach to Solve Dynamic Vehicle Routing Problem with Stochastic Customers

Waldy Joe, Hoong Chuin Lau