A systematic comparison of linear regression-based statistical methods to assess exposome-health associations

File: EHP172.acco.pdf
Description: Published version
Size: 769.85 kB
Format: Adobe PDF
Title: A systematic comparison of linear regression-based statistical methods to assess exposome-health associations
Authors: Agier, L
Portengen, L
Chadeau-Hyam, M
Basagana, X
Giorgis-Allemand, L
Siroux, V
Robinson, O
Vlaanderen, J
Gonzalez, JR
Nieuwenhuijsen, MJ
Vineis, P
Vrijheid, M
Slama, R
Vermeulen, R
Item Type: Journal Article
Abstract: BACKGROUND: The exposome constitutes a promising framework to better understand the effect of environmental exposures on health by explicitly considering multiple testing and avoiding selective reporting. However, exposome studies are challenged by the simultaneous consideration of many correlated exposures. OBJECTIVES: We compared the performances of linear regression-based statistical methods in assessing exposome-health associations. METHODS: In a simulation study, we generated 237 exposure covariates with a realistic correlation structure, and a health outcome linearly related to 0 to 25 of these covariates. Statistical methods were compared primarily in terms of false discovery proportion (FDP) and sensitivity. RESULTS: On average over all simulation settings, the elastic net and sparse partial least-squares regression showed a sensitivity of 76% and a FDP of 44%; Graphical Unit Evolutionary Stochastic Search (GUESS) and the deletion/substitution/addition (DSA) algorithm a sensitivity of 80% and a FDP of 33%. The environment-wide association study (EWAS) underperformed these methods in terms of FDP (average FDP, 86%), despite a higher sensitivity. Performances decreased considerably when assuming an exposome exposure matrix with high levels of correlation between covariates. CONCLUSIONS: Correlation between exposures is a challenge for exposome research, and the statistical methods investigated in this study are limited in their ability to efficiently differentiate true predictors from correlated covariates in a realistic exposome context. While GUESS and DSA provided a marginally better balance between sensitivity and FDP, they did not outperform the other multivariate methods across all scenarios and properties examined, and computational complexity and flexibility should also be considered when choosing between these methods.
Issue Date: 24-May-2016
Date of Acceptance: 28-Apr-2016
URI: http://hdl.handle.net/10044/1/33344
DOI: https://dx.doi.org/10.1289/ehp172
ISSN: 0091-6765
Publisher: Environmental Health Perspectives
Start Page: 1848
End Page: 1856
Journal / Book Title: Environ Health Perspect
Volume: 124
Issue: 12
Sponsor/Funder: Commission of the European Communities
Funder's Grant Number: 308610
Keywords: Toxicology
11 Medical And Health Sciences
05 Environmental Sciences
Notes: 1552-9924. Agier, Lydiane; Portengen, Lutzen; Chadeau-Hyam, Marc; Basagana, Xavier; Giorgis-Allemand, Lise; Siroux, Valerie; Robinson, Oliver; Vlaanderen, Jelle; Gonzalez, Juan R; Nieuwenhuijsen, Mark J; Vineis, Paolo; Vrijheid, Martine; Slama, Remy; Vermeulen, Roel. Journal article. United States. Environ Health Perspect. 2016 May 24.
Publication Status: Published
Appears in Collections: Faculty of Medicine
Epidemiology, Public Health and Primary Care
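
The abstract compares methods by false discovery proportion (FDP) and sensitivity. As a minimal sketch of how these two quantities are usually computed for one simulated data set, the Python function below takes the set of covariates a method selected and the set of true predictors. The function name, the index sets, and the edge-case conventions (FDP set to 0 when nothing is selected, sensitivity left undefined when there are no true predictors) are assumptions made for illustration and are not taken from the paper.

```python
from typing import Optional, Set, Tuple


def fdp_and_sensitivity(
    selected: Set[int], true_predictors: Set[int]
) -> Tuple[float, Optional[float]]:
    """Return (FDP, sensitivity) for one simulated data set."""
    true_pos = len(selected & true_predictors)
    false_pos = len(selected - true_predictors)
    # Assumed convention: FDP is 0 when no covariate is selected.
    fdp = false_pos / len(selected) if selected else 0.0
    # Sensitivity is undefined (None) when no covariate truly affects the outcome.
    sensitivity = true_pos / len(true_predictors) if true_predictors else None
    return fdp, sensitivity


# Hypothetical example: covariates indexed 0..236, five of them true predictors.
true_predictors = {3, 17, 42, 100, 200}
selected = {3, 17, 42, 58, 150, 201}
fdp, sens = fdp_and_sensitivity(selected, true_predictors)
print(f"FDP = {fdp:.2f}, sensitivity = {sens:.2f}")
```

Here three of the six selected covariates are false discoveries and three of the five true predictors are recovered, so the call prints an FDP of 0.50 and a sensitivity of 0.60. Averaging these values over many simulated data sets gives summary figures of the kind reported in the abstract.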



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
