A Bayesian hierarchical small area population model accounting for data source specific methodologies from American Community Survey, Population Estimates Program, and Decennial census data

File: 23-AOAS1849.pdf
Description: Published version
Size: 6.02 MB
Format: Adobe PDF
Title: A Bayesian hierarchical small area population model accounting for data source specific methodologies from American Community Survey, Population Estimates Program, and Decennial census data
Authors: Peterson, EN
Nethery, RC
Padellini, T
Chen, JT
Coull, BA
Piel, FB
Wakefield, J
Blangiardo, M
Waller, LA
Item Type: Journal Article
Abstract: Small area population counts are necessary for many epidemiological studies, yet their quality and accuracy are often not assessed. In the United States, small area population counts are published by the United States Census Bureau (USCB) in the form of the decennial census counts, intercensal population projections (PEP), and American Community Survey (ACS) estimates. Although there are significant relationships between these three data sources, there are important contrasts in data collection, data availability, and processing methodologies such that each set of reported population counts may be subject to different sources and magnitudes of error. Additionally, these data sources do not report identical small area population counts due to post-survey adjustments specific to each data source. Consequently, in public health studies, small area disease/mortality rates may differ depending on which data source is used for denominator data. To accurately estimate annual small area population counts and their associated uncertainties, we present a Bayesian population (BPop) model, which fuses information from all three USCB sources, accounting for data source specific methodologies and associated errors. We produce comprehensive small area race-stratified estimates of the true population, and associated uncertainties, given the observed trends in all three USCB population estimates. The main features of our framework are: (1) a single model integrating multiple data sources, (2) accounting for data source specific data generating mechanisms and specifically accounting for data source specific errors, and (3) prediction of population counts for years without USCB reported data. We focus our study on the Black and White only populations for 159 counties of Georgia and produce estimates for years 2006–2023. We compare BPop population estimates to decennial census counts, PEP annual counts, and ACS multi-year estimates. Additionally, we illustrate and explain the different types of data source specific errors. Lastly, we compare model performance using simulations and validation exercises. Our Bayesian population model can be extended to other applications at smaller spatial granularity and for demographic subpopulations defined further by race, age, and sex, and/or for other geographical regions.
Issue Date: Jun-2024
Date of Acceptance: 1-Aug-2023
URI: http://hdl.handle.net/10044/1/111833
DOI: 10.1214/23-aoas1849
ISSN: 1932-6157
Publisher: Institute of Mathematical Statistics
Start Page: 1565
End Page: 1595
Journal / Book Title: The Annals of Applied Statistics
Volume: 18
Issue: 2
Copyright Statement: Rights: Copyright © 2024 Institute of Mathematical Statistics
Publication Status: Published
Online Publication Date: 2024-04-05
Appears in Collections: School of Public Health