Bandit-Guided Dynamic Programming for Last-Mile Delivery in Stochastic Networks
Published on July 29, 2026
We propose a bandit-enhanced dynamic programming algorithm for routing autonomous delivery vehicles in stochastic urban networks. The algorithm achieves near-optimal delivery costs while being significantly more computationally efficient than standard value iteration.
Recommended citation: Li, Y., & Xiong, X. (2026). "Bandit-Guided Dynamic Programming for Last-Mile Delivery in Stochastic Networks." 2026 INFORMS Transportation Science and Logistics Conference. Cambridge, MA.
