Variable Prioritization in Nonlinear Black Box Methods: A Genetic Association Case Study

File: 1801.07318v3.pdf
Description: File embargoed until 01 January 10000
Size: 1.11 MB
Format: Adobe PDF
Title: Variable Prioritization in Nonlinear Black Box Methods: A Genetic Association Case Study
Authors: Crawford, L; Flaxman, SR; Runcie, DE; West, M
Item Type: Journal Article
Abstract: The central aim in this paper is to address variable selection questions in nonlinear and nonparametric regression. Motivated by statistical genetics, where nonlinear interactions are of particular interest, we introduce a novel and interpretable way to summarize the relative importance of predictor variables. Methodologically, we develop the "RelATive cEntrality" (RATE) measure to prioritize candidate genetic variants that are not just marginally important, but whose associations also stem from significant covarying relationships with other variants in the data. We illustrate RATE through Bayesian Gaussian process regression, but the methodological innovations apply to other "black box" methods. It is known that nonlinear models often exhibit greater predictive accuracy than linear models, particularly for phenotypes generated by complex genetic architectures. With detailed simulations and two real data association mapping studies, we show that applying RATE enables an explanation for this improved performance.
Issue Date: 1-Jun-2019
Date of Acceptance: 19-Oct-2018
ISSN: 1932-6157
Publisher: Institute of Mathematical Statistics
Journal / Book Title: Annals of Applied Statistics
Copyright Statement: This paper is embargoed until publication.
Keywords: stat.ME; 0104 Statistics; Statistics & Probability
Notes: 28 pages, 5 figures, 1 table; Supplementary Material
Embargo Date: publication subject to indefinite embargo
Appears in Collections: Statistics; Faculty of Natural Sciences
