Fast Bayesian estimation of spatial count data models
File(s)
2007.03681.pdf (8.35 MB)
Accepted version
Author(s)
Bansal, Prateek
Krueger, Rico
Graham, Daniel J
Type
Journal Article
Abstract
Spatial count data models are used to explain and predict the frequency of phenomena such as traffic accidents in geographically distinct entities such as census tracts or road segments. These models are typically estimated using Bayesian Markov chain Monte Carlo (MCMC) simulation methods, which are computationally expensive and do not scale well to large datasets. Variational Bayes (VB), a method from machine learning, addresses the shortcomings of MCMC by casting Bayesian estimation as an optimisation problem instead of a simulation problem. Motivated by this advantage, a VB method is derived for posterior inference in negative binomial models with unobserved parameter heterogeneity and spatial dependence. Pólya-Gamma augmentation is used to deal with the non-conjugacy of the negative binomial likelihood, and an integrated non-factorised specification of the variational distribution is adopted to capture posterior dependencies. The benefits of the proposed approach are demonstrated in a Monte Carlo study and an empirical application on estimating youth pedestrian injury counts in census tracts of New York City. The VB approach is around 45 to 50 times faster than MCMC on a regular eight-core processor in both a simulation and an empirical study, while offering similar estimation and predictive accuracy. Conditional on the availability of computational resources, the embarrassingly parallel architecture of the proposed VB method can be exploited to further accelerate its estimation by up to 20 times.
Date Issued
2021-05
Date Acceptance
2020-12-01
Citation
Computational Statistics & Data Analysis, 2021, 157, pp.1-19
ISSN
0167-9473
Publisher
Elsevier BV
Start Page
1
End Page
19
Journal / Book Title
Computational Statistics & Data Analysis
Volume
157
Copyright Statement
© 2020 Elsevier Ltd. All rights reserved. This manuscript is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Licence http://creativecommons.org/licenses/by-nc-nd/4.0/
Identifier
https://www.sciencedirect.com/science/article/pii/S0167947320302437?via%3Dihub
Subjects
0104 Statistics
0802 Computation Theory and Mathematics
1403 Econometrics
Statistics & Probability
Publication Status
Published online
Article Number
107152
Date Publish Online
2020-12-13