Bayesian modelling and quantification of Raman spectroscopy
File(s)
1604.07299v2.pdf (2.49 MB)
Author(s)
Type
Working Paper
Abstract
Raman spectroscopy can be used to identify molecules such as DNA by the
characteristic scattering of light from a laser. It is sensitive at very low
concentrations and can accurately quantify the amount of a given molecule in a
sample. The presence of a large, nonuniform background presents a major
challenge to analysis of these spectra. To overcome this challenge, we
introduce a sequential Monte Carlo (SMC) algorithm to separate each observed
spectrum into a series of peaks plus a smoothly varying baseline, corrupted by
additive white noise. The peaks are modelled as Lorentzian, Gaussian, or
pseudo-Voigt functions, while the baseline is estimated using a penalised cubic
spline. This latent continuous representation accounts for differences in
resolution between measurements. The posterior distribution can be
incrementally updated as more data becomes available, resulting in a scalable
algorithm that is robust to local maxima. By incorporating this representation
in a Bayesian hierarchical regression model, we can quantify the relationship
between molecular concentration and peak intensity, thereby providing an
improved estimate of the limit of detection, which is of major importance to
analytical chemistry.
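
The decomposition described in the abstract (peaks plus a smooth baseline plus additive white noise) can be illustrated with a minimal forward-model sketch. This is not the authors' SMC algorithm or code: the function names, the height-normalised peak parameterisation by half-width at half-maximum, and the mixing weight eta are assumptions made for illustration, and the baseline is passed in as a plain array rather than estimated with a penalised cubic spline.

```python
import numpy as np

def lorentzian(x, loc, hwhm):
    # Height-normalised Lorentzian: equals 1 at loc and 0.5 at loc +/- hwhm.
    return 1.0 / (1.0 + ((x - loc) / hwhm) ** 2)

def gaussian(x, loc, hwhm):
    # Height-normalised Gaussian with the same half-width at half-maximum.
    return np.exp(-np.log(2.0) * ((x - loc) / hwhm) ** 2)

def pseudo_voigt(x, loc, hwhm, eta):
    # Linear mixture of the two shapes; eta=1 is pure Lorentzian,
    # eta=0 is pure Gaussian (assumed parameterisation).
    return eta * lorentzian(x, loc, hwhm) + (1.0 - eta) * gaussian(x, loc, hwhm)

def simulate_spectrum(x, peaks, baseline, noise_sd, rng):
    # peaks: iterable of (amplitude, location, hwhm, eta) tuples.
    # baseline: array with the same shape as x (a stand-in for a spline fit).
    signal = sum(amp * pseudo_voigt(x, loc, hwhm, eta)
                 for amp, loc, hwhm, eta in peaks)
    noise = rng.normal(0.0, noise_sd, size=x.shape)  # additive white noise
    return signal + baseline + noise

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = np.linspace(400.0, 1800.0, 701)        # hypothetical wavenumber grid
    baseline = 50.0 + 1e-5 * (x - 1100.0) ** 2  # hypothetical smooth background
    peaks = [(30.0, 1000.0, 8.0, 0.5), (15.0, 1350.0, 12.0, 0.8)]
    y = simulate_spectrum(x, peaks, baseline, noise_sd=1.0, rng=rng)
    print(y.shape)
```

Fitting runs in the opposite direction: given an observed spectrum, the peak parameters and baseline are inferred, which is the step the paper addresses with SMC.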
Date Issued
2018-01-24
Citation
2018
Publisher
arXiv
Copyright Statement
© 2018 The Authors.
Identifier
http://arxiv.org/abs/1604.07299v2
Subjects
stat.AP
stat.CO
92E99, 65D10, 62F15, 62H12
Publication Status
Published