Variational Bayes for high-dimensional linear regression with sparse priors
Author(s)
Ray, Kolyan
Szabó, Botond
Type
Journal Article
Abstract
We study a mean-field spike and slab variational Bayes (VB) approximation to Bayesian model selection priors in sparse high-dimensional linear regression. Under compatibility conditions on the design matrix, oracle inequalities are derived for the mean-field VB approximation, implying that it converges to the sparse truth at the optimal rate and gives optimal prediction of the response vector. The empirical performance of our algorithm is studied, showing that it performs comparably to other state-of-the-art Bayesian variable selection methods. We also numerically demonstrate that the widely used coordinate-ascent variational inference (CAVI) algorithm can be highly sensitive to the parameter updating order, leading to potentially poor performance. To mitigate this, we propose a novel prioritized updating scheme that uses a data-driven updating order and performs better in simulations. The variational algorithm is implemented in the R package sparsevb.
Date Issued
2022-07-01
Date Acceptance
2020-11-02
Citation
Journal of the American Statistical Association, 2022, 117 (539), pp.1270-1281
ISSN
0162-1459
Publisher
Informa UK Limited
Start Page
1270
End Page
1281
Journal / Book Title
Journal of the American Statistical Association
Volume
117
Issue
539
Copyright Statement
© 2020 The Author(s). Published with license by Taylor & Francis Group, LLC.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (http://creativecommons.org/licenses/by-nc-nd/4.0/), which
permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way.
Identifier
https://www.tandfonline.com/doi/full/10.1080/01621459.2020.1847121
Subjects
Science & Technology
Physical Sciences
Statistics & Probability
Mathematics
Model selection
Oracle inequalities
Sparsity
Spike-and-slab prior
Variational Bayes
EMPIRICAL BAYES
VARIABLE SELECTION
POSTERIOR CONCENTRATION
CONVERGENCE-RATES
INFERENCE
NEEDLES
STRAW
SPIKE
0104 Statistics
1403 Econometrics
1603 Demography
Publication Status
Published
Date Publish Online
2021-01-14