Deep learning for real-time traffic signal control on urban networks
Author(s)
Song, Junwoo
Type
Thesis or dissertation
Abstract
Real-time traffic signal control is frequently challenged by (1) uncertain knowledge of traffic states; (2) the need for efficient computation to allow timely decisions; (3) multiple objectives, such as traffic delay and vehicle emissions, that are difficult to optimize jointly; and (4) idealized assumptions about data completeness and quality that underpin many theoretical signal control models. This thesis addresses these challenges by proposing two real-time signal control frameworks based on deep learning techniques, followed by extensive simulation tests that verify their effectiveness against the aforementioned challenges.
The first method, called the Nonlinear Decision Rule (NDR), defines a nonlinear mapping from prevailing network states to signal control parameters in order to improve network performance, and this mapping is optimized via off-line simulation. The NDR is instantiated with two neural networks, a feedforward neural network (FFNN) and a recurrent neural network (RNN), which process recent traffic information in different ways. The NDR is implemented and tested within a microscopic traffic simulation (S-Paramics) of a real-world network in West Glasgow, where the off-line training of the NDR amounts to a simulation-based optimization procedure aiming to reduce delay, CO2, and black carbon emissions. Extensive tests assess the NDR framework not only in terms of its effectiveness in optimizing different traffic and environmental objectives, but also in relation to local versus global benefits, the trade-off between delay and emissions, the impact of sensor locations, and different levels of network saturation.
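To make the NDR idea concrete, the sketch below is an illustrative assumption rather than the thesis code: a small feedforward policy maps a window of detector occupancies to bounded green-time adjustments, and a derivative-free random search stands in for the simulation-based optimization; `simulate`, the network dimensions, and the dummy objective are all hypothetical stand-ins for a full S-Paramics run.

```python
# Illustrative sketch only (not the thesis code). A feedforward policy
# maps recent detector occupancies to bounded green-time adjustments;
# weights are tuned by derivative-free random search, mirroring the idea
# that off-line training is simulation-based optimization.
import numpy as np

rng = np.random.default_rng(0)
N_DET, WINDOW, N_STAGES = 8, 5, 4            # assumed network dimensions


def ndr_policy(theta, occupancy_window):
    """FFNN with one tanh hidden layer; output scaled to +/-10 s."""
    W1, b1, W2, b2 = theta
    h = np.tanh(occupancy_window.ravel() @ W1 + b1)
    return 10.0 * np.tanh(h @ W2 + b2)       # green-time adjustments (s)


def init_theta(hidden=16):
    return [rng.normal(0, 0.1, (N_DET * WINDOW, hidden)),
            np.zeros(hidden),
            rng.normal(0, 0.1, (hidden, N_STAGES)),
            np.zeros(N_STAGES)]


def simulate(theta):
    """Hypothetical stand-in for an S-Paramics run; returns the objective
    (e.g. a weighted sum of delay and emissions)."""
    x = rng.random((WINDOW, N_DET))           # fake detector data
    return float(np.sum(ndr_policy(theta, x) ** 2))   # dummy objective


# Random-search outer loop: keep perturbations that reduce the objective.
best, best_cost = init_theta(), np.inf
for _ in range(100):
    cand = [p + rng.normal(0, 0.05, p.shape) for p in best]
    cost = simulate(cand)
    if cost < best_cost:
        best, best_cost = cand, cost
```

A gradient-free outer loop of this kind fits the setting described above, since the objective is only available through black-box microsimulation runs.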
The second method, called Advanced Reinforcement Learning (ARL), augments Q-learning with a potential-based reward shaping function driven by a third-party advisor to improve on conventional reinforcement learning. The shaping function incorporates the advisor's opinion when calculating the reward, which mitigates the problems of sparse rewards and slow learning. The ARL is tested against a range of existing reinforcement learning methods, and the results show that it outperforms the other models in almost all scenarios.
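As a hedged illustration of the shaping mechanism (names and the advisor heuristic are assumptions, not the thesis implementation): potential-based reward shaping replaces the raw reward r with r + gamma * Phi(s') - Phi(s), where the potential Phi encodes the advisor's opinion. The sketch below uses a hypothetical queue-length advisor inside a tabular Q-learning update.

```python
# Minimal sketch (assumptions, not the thesis implementation) of
# Q-learning with potential-based reward shaping: the shaped reward is
# r + GAMMA * Phi(s') - Phi(s), with Phi supplied by a third-party
# advisor (here a hypothetical queue-based heuristic).
import random
from collections import defaultdict

GAMMA, ALPHA, EPSILON = 0.95, 0.1, 0.1
Q = defaultdict(float)                    # Q[(state, action)]


def advisor_potential(state):
    """Hypothetical advisor: prefers states with shorter total queues."""
    return -float(sum(state))             # state = per-approach queues


def q_update(state, action, reward, next_state, actions):
    shaped = (reward
              + GAMMA * advisor_potential(next_state)
              - advisor_potential(state))
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += ALPHA * (shaped + GAMMA * best_next
                                   - Q[(state, action)])


def epsilon_greedy(state, actions):
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])


# Example step: states are tuples of per-approach queue lengths.
actions = [0, 1]                          # e.g. keep stage vs. switch
s, s2 = (3, 0, 5, 2), (2, 1, 4, 2)
a = epsilon_greedy(s, actions)
q_update(s, a, reward=-sum(s2), next_state=s2, actions=actions)
```

Because the shaping term is a potential difference, it accelerates learning without changing the set of optimal policies, which is why an advisor can be consulted safely at every update.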
Lastly, this thesis evaluates the impact of information availability and quality on different real-time signal control methods, including the two proposed here. This is motivated by the observation that most responsive signal control models in the literature make idealized assumptions about the quality and availability of data. The research shows varying degrees of performance deterioration across signal controllers in the presence of missing data, data noise, and different data types. Such insights are crucial for the real-world implementation of these signal control methods.
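A minimal sketch (with assumed parameters, not the thesis's experimental code) of how such degradations can be injected into a detector feed: random dropout models missing records and additive Gaussian noise models measurement error, and the degraded series is then fed to each controller under test.

```python
# Minimal sketch (assumed parameters, not the thesis's experiment code):
# corrupt a detector flow series with Gaussian noise and random dropout.
import numpy as np

rng = np.random.default_rng(42)


def degrade(flows, p_missing=0.1, noise_std=2.0):
    """Return a corrupted copy of a flow series (veh per interval)."""
    out = flows.astype(float) + rng.normal(0.0, noise_std, flows.shape)
    out[rng.random(flows.shape) < p_missing] = np.nan  # missing records
    return np.clip(out, 0.0, None)                     # NaNs pass through


clean = rng.poisson(12, size=60).astype(float)  # one hour of 1-min counts
noisy = degrade(clean, p_missing=0.2, noise_std=3.0)
```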
Version
Open Access
Date Issued
2019-08
Date Awarded
2020-02
Copyright Statement
Creative Commons Attribution NonCommercial Licence
Advisor
Han, Ke
Ochieng, Washington Yotto
Publisher Department
Civil and Environmental Engineering
Publisher Institution
Imperial College London
Qualification Level
Doctoral
Qualification Name
Doctor of Philosophy (PhD)