Distributed testing and estimation in sparse high dimensional models
File(s)
aosAccepted.pdf (1.59 MB)
Accepted version
Author(s)
Battey, HS
Fan, J
Liu, H
Lu, J
Zhu, Z
Type
Journal Article
Abstract
This paper studies hypothesis testing and parameter estimation in the context of the divide-and-conquer algorithm. In a unified likelihood-based framework, we propose new test statistics and point estimators obtained by aggregating various statistics from k subsamples of size n/k, where n is the sample size. In both low dimensional and sparse high dimensional settings, we address the important question of how large k can be, as n grows large, such that the loss of efficiency due to the divide-and-conquer algorithm is negligible. In other words, the resulting estimators have the same inferential efficiencies and estimation rates as an oracle with access to the full sample. Thorough numerical results are provided to back up the theory.
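The aggregation idea in the abstract can be illustrated with a minimal sketch, assuming the simplest low dimensional case: an ordinary least squares fit on each of k subsamples of size n/k, combined by plain averaging. This is not the paper's debiased high dimensional estimator or its test statistics. The function names `ols` and `divide_and_conquer_ols`, the simulated data, and the values of n, d and k are all hypothetical and chosen only for illustration.

```python
import numpy as np


def ols(X, y):
    """Least squares coefficients for the linear model y = X @ beta + noise."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def divide_and_conquer_ols(X, y, k):
    """Average the OLS estimates computed on k disjoint subsamples.

    Illustrative assumption: simple averaging in a low dimensional model,
    not the debiased estimators studied in the paper.
    """
    # Split the row indices into k nearly equal blocks of about n/k rows.
    blocks = np.array_split(np.arange(X.shape[0]), k)
    estimates = [ols(X[idx], y[idx]) for idx in blocks]
    return np.mean(estimates, axis=0)


# Simulated data (hypothetical): n observations, d covariates, k subsamples.
rng = np.random.default_rng(0)
n, d, k = 10_000, 5, 20
beta = np.arange(1, d + 1, dtype=float)
X = rng.standard_normal((n, d))
y = X @ beta + rng.standard_normal(n)

beta_full = ols(X, y)                     # "oracle" fit on the full sample
beta_dc = divide_and_conquer_ols(X, y, k)  # aggregated subsample fits

print("max |dc - full| =", np.max(np.abs(beta_dc - beta_full)))
```

In this toy setting the averaged estimate stays close to the full-sample fit when k is small relative to n; how large k may grow before that stops holding is the question the paper addresses.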
Date Issued
2018-06-01
Online Publication Date
2018-05-03
Date Acceptance
2017-05-17
ISSN
0090-5364
Publisher
Institute of Mathematical Statistics
Start Page
1352
End Page
1382
Journal / Book Title
Annals of Statistics
Volume
46
Issue
3
Copyright Statement
© Institute of Mathematical Statistics, 2018
Source Database
manual-entry
Subjects
Science & Technology
Physical Sciences
Statistics & Probability
Mathematics
Divide and conquer
debiasing
massive data
thresholding
NONCONCAVE PENALIZED LIKELIHOOD
VARIABLE SELECTION
CONFIDENCE-INTERVALS
NP-DIMENSIONALITY
GENERAL-THEORY
LINEAR-MODELS
REGRESSION
REGIONS
LASSO
RATES
Primary 62F05, 62F10
Secondary 62F12
0102 Applied Mathematics
0104 Statistics
1403 Econometrics
Publication Status
Published