Disaggregated short-term travel location prediction
Author(s)
Dong, Yanjie
Type
Thesis
Abstract
Disaggregated short-term location prediction combines historical spatial and temporal location data with currently observed data to predict the next and future locations of individuals. This type of prediction has many important applications, for example demand-side traffic and travel management, epidemic control and prediction, mobile and targeted advertising, and location-based services such as ridesharing and micro-mobility. Existing studies have formulated short-term location prediction as a sequence prediction problem and implemented methods such as Markov-based models, trajectory matching methods and, more recently, deep learning-based methods. However, gaps remain in several areas, including problem formulation, methodology and data platforms.
In this thesis we first create an agent-based simulation platform to address the lack of ground-truth data in the literature. The agent-based simulation combines London Travel Demand Survey (LTDS) and National Travel Survey (NTS) data to generate realistic activities, locations, and movement trajectories on a real road network in London, using a data-driven probabilistic approach. This modular platform can be expanded to examine more scenarios or used as a generic tool to test location-based methodologies.
We then propose a spatial and temporal encoder-decoder deep learning model with attention and embedding layers (ST-EDAE) for short-term location prediction. Rather than following the common approach of transforming the location history into a sequence of visits defined by minimum dwelling time requirements, we formulate the location history as time-annotated locations at fixed time intervals. This allows the model to capture the notion of time and to predict locations at specific time points in the future.
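The fixed-interval formulation described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `to_fixed_intervals`, the `(location_id, arrival, departure)` stay format, the 60-minute step and the `"UNK"` placeholder for unobserved intervals are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

def to_fixed_intervals(stays, start, end, step_minutes=30, unknown="UNK"):
    """Resample visit records onto a fixed time grid (hypothetical helper).

    stays: list of (location_id, arrival, departure) tuples.
    Returns a list of (timestamp, location_id) pairs, one per interval
    in [start, end); intervals with no recorded stay get `unknown`.
    """
    step = timedelta(minutes=step_minutes)
    sequence = []
    t = start
    while t < end:
        label = unknown
        for location_id, arrival, departure in stays:
            # A stay covers the grid point if arrival <= t < departure.
            if arrival <= t < departure:
                label = location_id
                break
        sequence.append((t, label))
        t += step
    return sequence

# Illustrative data only: one person, two stays with a travel gap between them.
day = datetime(2022, 5, 2)
stays = [
    ("home", day, day + timedelta(hours=8)),
    ("work", day + timedelta(hours=9), day + timedelta(hours=17)),
]
grid = to_fixed_intervals(
    stays,
    start=day + timedelta(hours=7),
    end=day + timedelta(hours=10),
    step_minutes=60,
)
labels = [label for _, label in grid]
```

Because every element of the resulting sequence corresponds to a known clock time, a model trained on such sequences can be asked for the location at a specific future time point, which a dwell-time-based visit sequence does not directly support.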
The model is tested both on simulated data from the agent-based simulation platform and on a real-world GPS dataset collected in Beijing. Results show that our model outperforms other popular methods in the literature for both next-timestep location prediction and future location prediction. Prediction accuracy decreases as the prediction horizon increases, but our model still performs best across heterogeneous agent groups. Significant performance variations can be observed, with the model performing better on frequently visited locations than on less frequently visited ones. In addition, our model is shown to learn a contextual representation of locations in the embedding layer, which potentially opens up new research opportunities in data-driven location and land use classification.
Version
Open Access
Date Issued
2022-05
Date Awarded
2024-01
Copyright Statement
Creative Commons Attribution NonCommercial Licence
Advisor
Sivakumar, Aruna
Polak, John
Guo, Fangce
Publisher Department
Civil and Environmental Engineering
Publisher Institution
Imperial College London
Qualification Level
Doctoral
Qualification Name
Doctor of Philosophy (PhD)