Principled Bayesian modeling and statistical learning with non-representative data in astrophysics
File(s)
Autenrieth-M-2023-PhD-Thesis.pdf (19.28 MB)
Thesis
Author(s)
Autenrieth, Maximilian
Type
Thesis or dissertation
Abstract
This thesis tackles the fundamental issue of non-representative data in astrophysics by developing and applying methodology from statistical machine learning, Bayesian statistics and causal inference. The aims are to handle big data efficiently, to allow for probabilistic and principled parameter estimation with proper uncertainty quantification, and to deal with systematic uncertainties and biases in the data collection process. To enable (a) statistically principled, (b) scientifically justified, and (c) computationally efficient analysis of non-representative, complex astrophysical data, this thesis provides both novel general-purpose statistical methodology and statistical methodology tailored to topical scientific problems in cosmology and high-energy astrophysics, grouped into three related projects:
(i) We propose a simple, statistically principled, and theoretically justified general-purpose method, StratLearn, to improve supervised learning when the training set is not representative, a situation known as covariate shift. Building upon a well-established methodology in causal inference, we show that the effects of covariate shift can be reduced or eliminated by conditioning on propensity scores. We demonstrate that fitting learners within strata constructed on the estimated propensity scores improves upon state-of-the-art importance weighting methods on two topical scientific tasks – conditional density estimation of galaxy redshift (photo-z), and photometric supernovae type Ia (SNIa) classification.
(ii) We improve weak lensing photo-z calibration via Bayesian hierarchical modeling of full galaxy photo-z conditional density estimates obtained within StratLearn. We substantially improve the galaxy tomographic bin assignment, and obtain almost unbiased estimates of target population means within tomographic bins.
(iii) We propose a science-driven hierarchical Bayesian framework to estimate the galaxy luminosity distribution in X-rays, combining non-representative X-ray and optical surveys. Our proposed framework accounts for incompleteness bias by incorporating an X-ray incompleteness function (estimated from simulations) and an optical incompleteness function (with parameters learned from the observed data) into the model. This allows for improved recovery of the luminosity function even with high proportions of systematic incompleteness, evaluated on simulations, and applied to data from the Chandra Deep Field South (CDFS).
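As an illustration of the idea in project (i), the following is a minimal sketch of fitting learners within propensity-score strata. It is not the thesis implementation. The function names, the gradient-descent logistic regression used to estimate propensity scores, the pooled-quantile strata, the per-stratum least-squares learner, and the fallback to all source data for sparse strata are all assumptions made for this example. Here the propensity score is taken to be the estimated probability that an object with covariates x belongs to the source (training) set rather than the target set.

```python
import numpy as np


def fit_logistic(X, y, steps=2000, lr=0.1):
    """Gradient-descent logistic regression (illustrative; any classifier could be used)."""
    Xb = np.column_stack([np.ones(len(X)), X])  # add intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        z = np.clip(Xb @ w, -30.0, 30.0)        # clip to avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-z))
        w -= lr * Xb.T @ (p - y) / len(y)       # gradient of the mean log-loss
    return w


def predict_proba(X, w):
    """Estimated P(source | x) under the fitted logistic model."""
    Xb = np.column_stack([np.ones(len(X)), X])
    z = np.clip(Xb @ w, -30.0, 30.0)
    return 1.0 / (1.0 + np.exp(-z))


def stratified_fit_predict(X_src, y_src, X_tgt, n_strata=5):
    """Fit one learner per propensity-score stratum and predict for the target set."""
    # Step 1: estimate propensity scores by classifying source (1) vs target (0).
    X_all = np.vstack([X_src, X_tgt])
    s = np.concatenate([np.ones(len(X_src)), np.zeros(len(X_tgt))])
    w = fit_logistic(X_all, s)
    e_src, e_tgt = predict_proba(X_src, w), predict_proba(X_tgt, w)

    # Step 2: build strata from quantiles of the pooled scores (an assumption here).
    inner_q = np.linspace(0.0, 1.0, n_strata + 1)[1:-1]
    edges = np.quantile(np.concatenate([e_src, e_tgt]), inner_q)
    k_src, k_tgt = np.digitize(e_src, edges), np.digitize(e_tgt, edges)

    # Step 3: within each stratum, fit on source data and predict target data.
    preds = np.full(len(X_tgt), np.nan)
    min_train = X_src.shape[1] + 1
    for k in range(n_strata):
        tr, te = k_src == k, k_tgt == k
        if not te.any():
            continue
        if tr.sum() < min_train:
            # Hypothetical fallback: too few source points, so use all of them.
            tr = np.ones(len(X_src), dtype=bool)
        A = np.column_stack([np.ones(tr.sum()), X_src[tr]])
        coef = np.linalg.lstsq(A, y_src[tr], rcond=None)[0]
        preds[te] = np.column_stack([np.ones(te.sum()), X_tgt[te]]) @ coef
    return preds, k_src, k_tgt
```

In this sketch the linear least-squares learner is only a stand-in. Any supervised learner, such as a conditional density estimator for photo-z or a classifier for SNIa, could be fitted within each stratum in the same way.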
Version
Open Access
Date Issued
2023-08
Date Awarded
2023-11
URI
http://hdl.handle.net/10044/1/108121
DOI
https://doi.org/10.25560/108121
Copyright Statement
Creative Commons Attribution NonCommercial Licence
License URL
https://creativecommons.org/licenses/by-nc/4.0/
Advisor
van Dyk, David A.
Trotta, Roberto
Stenning, David C.
Sponsor
Imperial College London
Publisher Department
Mathematics
Publisher Institution
Imperial College London
Qualification Level
Doctoral
Qualification Name
Doctor of Philosophy (PhD)