Type: | Package |
Title: | Bayesian Modelling of Raman Spectroscopy |
Version: | 0.5-0 |
Date: | 2021-06-28 |
Description: | Sequential Monte Carlo (SMC) algorithms for fitting a generalised additive mixed model (GAMM) to surface-enhanced resonance Raman spectroscopy (SERRS), using the method of Moores et al. (2016) <doi:10.48550/arXiv.1604.07299>. Multivariate observations of SERRS are highly collinear and lend themselves to a reduced-rank representation. The GAMM separates the SERRS signal into three components: a sequence of Lorentzian, Gaussian, or pseudo-Voigt peaks; a smoothly-varying baseline; and additive white noise. The parameters of each component of the model are estimated iteratively using SMC. The posterior distributions of the parameters given the observed spectra are represented as a population of weighted particles. |
License: | GPL-2 | GPL-3 | file LICENSE [expanded from: GPL (≥ 2) | file LICENSE] |
URL: | https://github.com/mooresm/serrsBayes, https://mooresm.github.io/serrsBayes/ |
BugReports: | https://github.com/mooresm/serrsBayes/issues |
Depends: | R (≥ 3.5.0), Matrix, truncnorm, splines |
Imports: | Rcpp (≥ 0.11.3), methods |
LinkingTo: | Rcpp, RcppEigen |
Suggests: | testthat, knitr, rmarkdown, Hmisc |
LazyData: | true |
RoxygenNote: | 7.1.1 |
VignetteBuilder: | knitr |
NeedsCompilation: | yes |
Packaged: | 2021-06-28 10:24:02 UTC; matt |
Author: | Matt Moores |
Maintainer: | Matt Moores <mmoores@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2021-06-28 12:20:02 UTC |
Compute the log-likelihood.
Description
This is an internal function that is only exposed on the public API for unit testing purposes. It computes the log-likelihood of the spline and the noise, once the spectral signature has been subtracted from the observed data. Thus, it can be used with either Lorentzian, Gaussian, or pseudo-Voigt broadening functions.
Usage
computeLogLikelihood(
obsi,
lambda,
prErrNu,
prErrSS,
basisMx,
eigVal,
precMx,
xTx,
aMx,
ruMx
)
Arguments
obsi |
Vector of residuals after the spectral signature has been subtracted. |
lambda |
smoothing parameter of the penalised B-spline. |
prErrNu |
hyperparameter of the additive noise |
prErrSS |
hyperparameter of the additive noise |
basisMx |
Matrix of B-spline basis functions |
eigVal |
eigenvalues of the Demmler-Reinsch factorisation |
precMx |
precision matrix for the spline |
xTx |
sparse matrix cross-product |
aMx |
orthoganal matrix A from the Demmler-Reinsch factorisation |
ruMx |
product of Ru from the Demmler-Reinsch factorisation |
Value
The logarithm of the likelihood.
Initialise the vector of Metropolis-Hastings proposals.
Description
This is an internal function that is only exposed on the public API for unit testing purposes.
Usage
copyLogProposals(nPK, T_Prop_Theta)
Arguments
nPK |
number of Raman peaks in the spectral signature |
T_Prop_Theta |
Vector of logarithms of the MH proposals |
Value
Vector of proposals
Compute the effective sample size (ESS) of the particles.
Description
The ESS is a "rule of thumb" for assessing the degeneracy of the importance distribution:
ESS = \frac{(\sum_{q=1}^Q w_q)^2}{\sum_{q=1}^Q w_q^2}
Usage
effectiveSampleSize(log_weights)
Arguments
log_weights |
logarithms of the importance weights of each particle. |
Value
the effective sample size, a scalar between 0 and Q
References
Liu, JS (2001) "Monte Carlo Strategies in Scientific Computing." Springer, NY, pp. 34–36.
Examples
x <- runif(100)
effectiveSampleSize(log(x))
Fit the model using Markov chain Monte Carlo.
Description
Fit the model using Markov chain Monte Carlo.
Usage
fitSpectraMCMC(wl, spc, peakWL, lPriors, sd_mh, niter = 10000, nchains = 4)
Arguments
wl |
Vector of |
spc |
|
peakWL |
Vector of locations for each peak (cm^-1) |
lPriors |
List of hyperparameters for the prior distributions. |
sd_mh |
Vector of |
niter |
number of MCMC iterations per chain. |
nchains |
number of concurrent MCMC chains. |
Value
a List containing MCMC samples for the model parameters:
amplitude
niter * nchains * npeaks
Array of amplitudes.scale
niter * nchains * npeaks
Array of scale parameters.sigma
niter * nchains
Matrix of standard deviations.n_acc
The number of RWMH proposals that were accepted.
See Also
Examples
wavenumbers <- seq(200,600,by=10)
spectra <- matrix(nrow=1, ncol=length(wavenumbers))
peakLocations <- c(300,500)
peakAmplitude <- c(10000,4000)
peakScale <- c(10, 15)
signature <- weightedLorentzian(peakLocations, peakScale, peakAmplitude, wavenumbers)
baseline <- 1000*cos(wavenumbers/200) + 2*wavenumbers
spectra[1,] <- signature + baseline + rnorm(length(wavenumbers),0,200)
lPriors <- list(scale.mu=log(11.6) - (0.4^2)/2, scale.sd=0.4, bl.smooth=10^11, bl.knots=20,
amp.mu=5000, amp.sd=5000, noise.sd=200, noise.nu=4)
rw_bw <- c(100, 100, 2, 2)
result <- fitSpectraMCMC(wavenumbers, spectra, peakLocations, lPriors, rw_bw, 500)
result$n_acc
Fit the model using Sequential Monte Carlo (SMC).
Description
Fit the model using Sequential Monte Carlo (SMC).
Usage
fitSpectraSMC(
wl,
spc,
peakWL,
lPriors,
conc = rep(1, nrow(spc)),
npart = 10000,
rate = 0.9,
minESS = npart/2,
destDir = NA
)
Arguments
wl |
Vector of |
spc |
|
peakWL |
Vector of locations for each peak (cm^-1) |
lPriors |
List of hyperparameters for the prior distributions. |
conc |
Vector of |
npart |
number of SMC particles to use for the importance sampling distribution. |
rate |
the target rate of reduction in the effective sample size (ESS). |
minESS |
minimum effective sample size, below which the particles are resampled. |
destDir |
destination directory to save intermediate results (for long-running computations) |
Value
a List containing weighted parameter values, known as particles:
weights
Vector of importance weights for each particle.
beta
npart * npeaks
Matrix of regression coefficients for the amplitudes.scale
npart * npeaks
Matrix of scale parameters.sigma
Vector of
npart
standard deviations.alpha
bl.knots * n_y * npart
Array of spline coefficients for the baseline.basis
A dense
nwl * bl.knots
Matrix containing the values of the basis functions.expFn
npart * nwl
Matrix containing the spectral signature.ess
Vector containing the effective sample size (ESS) at each SMC iteration.
logEvidence
Vector containing the logarithm of the model evidence (marginal likelihood).
accept
Vector containing the Metropolis-Hastings acceptance rate at each SMC iteration.
sd.mh
niter * 2npeaks
Matrix of random walk MH bandwidths at each SMC iteration..
References
Chopin (2002) "A Sequential Particle Filter Method for Static Models," Biometrika 89(3): 539–551, doi: 10.1093/biomet/89.3.539
Examples
wavenumbers <- seq(200,600,by=10)
spectra <- matrix(nrow=1, ncol=length(wavenumbers))
peakLocations <- c(300,500)
peakAmplitude <- c(10000,4000)
peakScale <- c(10, 15)
signature <- weightedLorentzian(peakLocations, peakScale, peakAmplitude, wavenumbers)
baseline <- 1000*cos(wavenumbers/200) + 2*wavenumbers
spectra[1,] <- signature + baseline + rnorm(length(wavenumbers),0,200)
lPriors <- list(scale.mu=log(11.6) - (0.4^2)/2, scale.sd=0.4, bl.smooth=10^11, bl.knots=20,
beta.mu=5000, beta.sd=5000, noise.sd=200, noise.nu=4)
## Not run:
result <- fitSpectraSMC(wavenumbers, spectra, peakLocations, lPriors, npart=500)
## End(Not run)
Fit the model with Voigt peaks using iterated batch importance sampling (IBIS).
Description
Fit the model with Voigt peaks using iterated batch importance sampling (IBIS).
Usage
fitVoigtIBIS(
wl,
spc,
n,
lResult,
conc = rep(1, nrow(spc)),
batch = rep(1, nrow(spc)),
npart = 10000,
rate = 0.9,
mcAR = 0.234,
mcSteps = 20,
minESS = npart/2,
minPart = npart,
destDir = NA
)
Arguments
wl |
Vector of |
spc |
|
n |
index of the new observation |
lResult |
List of results from the previous call to “fitVoigtPeaksSMC“ or “fitVoigtIBIS“ |
conc |
Vector of |
batch |
identifies to which batch each observation belongs |
npart |
number of SMC particles to use for the importance sampling distribution. |
rate |
the target rate of reduction in the effective sample size (ESS). |
mcAR |
target acceptance rate for the MCMC kernel |
mcSteps |
number of iterations of the MCMC kernel |
minESS |
minimum effective sample size, below which the particles are resampled. |
minPart |
target number of unique particles for the MCMC iterations |
destDir |
destination directory to save intermediate results (for long-running computations) |
References
Chopin (2002) "A Sequential Particle Filter Method for Static Models," Biometrika 89(3): 539–551, doi: 10.1093/biomet/89.3.539
Fit the model with Voigt peaks using Sequential Monte Carlo (SMC).
Description
Fit the model with Voigt peaks using Sequential Monte Carlo (SMC).
Usage
fitVoigtPeaksSMC(
wl,
spc,
lPriors,
conc = rep(1, nrow(spc)),
npart = 10000,
rate = 0.9,
mcAR = 0.234,
mcSteps = 20,
minESS = npart/2,
destDir = NA,
minPart = npart
)
Arguments
wl |
Vector of |
spc |
|
lPriors |
List of hyperparameters for the prior distributions. |
conc |
Vector of |
npart |
number of SMC particles to use for the importance sampling distribution. |
rate |
the target rate of reduction in the effective sample size (ESS). |
mcAR |
target acceptance rate for the MCMC kernel |
mcSteps |
number of iterations of the MCMC kernel |
minESS |
minimum effective sample size, below which the particles are resampled. |
destDir |
destination directory to save intermediate results (for long-running computations) |
minPart |
target number of unique particles for the MCMC iterations |
Examples
wavenumbers <- seq(200,600,by=10)
spectra <- matrix(nrow=1, ncol=length(wavenumbers))
peakLocations <- c(300,500)
peakAmplitude <- c(10000,4000)
peakScale <- c(10, 15)
signature <- weightedLorentzian(peakLocations, peakScale, peakAmplitude, wavenumbers)
baseline <- 1000*cos(wavenumbers/200) + 2*wavenumbers
spectra[1,] <- signature + baseline + rnorm(length(wavenumbers),0,200)
lPriors <- list(scaG.mu=log(11.6) - (0.4^2)/2, scaG.sd=0.4, scaL.mu=log(11.6) - (0.4^2)/2,
scaL.sd=0.4, bl.smooth=5, bl.knots=20, loc.mu=peakLocations, loc.sd=c(5,5),
beta.mu=c(5000,5000), beta.sd=c(5000,5000), noise.sd=200, noise.nu=4)
## Not run:
result <- fitVoigtPeaksSMC(wavenumbers, spectra, lPriors, npart=50, mcSteps=1)
## End(Not run)
Compute cubic B-spline basis functions for the given wavenumbers.
Description
This function computes penalised cubic B-splines using the method proposed by Eilers & Marx (1996). The spline coefficients can be computed efficiently using sparse matrix algebra, as described in Sect. 2.3.3 of Green & Silverman (1994) and Appendix B of Ruppert, Wand & Carroll (2003).
Usage
getBsplineBasis(V, n.b, pen, prec = 1e-08)
Arguments
V |
a |
n.b |
the number of basis functions to use. |
pen |
the smoothing penalty hyperparameter. |
prec |
a constant scale factor. |
Value
a list
containing:
basis
A dense nwl by n.b matrix containing the values of the basis functions.
precision
A sparse n.b by n.b
dsCMatrix
, the inverse of the prior covariance.distance
The distance between each knot
(cm^{-1})
.knots
The knot locations.
References
Eilers, PHC & Marx, BD (1996) "Flexible smoothing with B-splines and penalties," Statist. Sci. 11(2): 89–121, doi: 10.1214/ss/1038425655
Green, PJ & Silverman, BW (1994) "Nonparametric Regression and Generalized Linear Models: a roughness penalty approach" Chapman & Hall, Boca Raton, FL, pp. 11–21.
Ruppert, D; Wand, MP & Carroll, RJ (2003) "Semiparametric Regression" CUP, Cambridge, UK, pp. 336–340.
See Also
Compute the pseudo-Voigt mixing ratio for each peak.
Description
Calculates the mixing parameter \eta_j
from the scales of the Gaussian/Lorentzian
components.
Usage
getVoigtParam(scale_G, scale_L)
Arguments
scale_G |
Vector of standard deviations |
scale_L |
Vector of scale parameters |
Details
First, calculate a polynomial average of the scale parameters according to the approximation of Thompson et al. (1987):
f_{G,L} = (\sigma_j^5 + 2.69\sigma_j^4\phi_j + 2.42\sigma_j^3\phi_j^2 + 4.47\sigma_j^2\phi_j^3 + 0.07\sigma_j\phi_j^4 + \phi_j^5)^{1/5}
Then the Voigt mixing parameter \eta_j
is defined as:
\eta_j = 1.36\frac{\phi_j}{f_{G,L}} - 0.47(\frac{\phi_j}{f_{G,L}})^2 + 0.11(\frac{\phi_j}{f_{G,L}})^3
Value
The Voigt mixing weights for each peak, between 0 (Gaussian) and 1 (Lorentzian).
References
Thompson, Cox & Hastings (1987) "Rietveld refinement of Debye–Scherrer synchrotron X-ray data from Al_2 O_3
,"
J. Appl. Crystallogr. 20(2): 79–83, doi: 10.1107/S0021889887087090
Surface-enhanced Raman spectram of tetramethylrhodamine+DNA (T20)
Description
Surface-enhanced Raman spectram of tetramethylrhodamine+DNA (T20)
Usage
lsTamra
Format
A list containing 2 variables:
- wavenumbers
a numeric Vector of 2401 wavenumbers (cm^-1)
- wavenumbers
a
1 * 2401
Matrix of intensity values (a.u.)
Update all of the parameters using a single Metropolis-Hastings step.
Description
Updates all of the parameters using a single Metropolis-Hastings step, such that the
baseline cancels out in the MH ratio, using the marginalisation identity of Chib (1995).
If npart > 1
, then multiple MCMC chains will be executed independently in parallel using OpenMP.
This means that all functions used for the proposal distributions and to evaluate the MH ratio
need to be thread-safe. Specifically, no calls to R::rnorm
, R::dnorm
, nor their
Rcpp equivalents, can be made from within the parallel portion of the code.
Usage
marginalMetropolisUpdate(
spectra,
n,
conc,
wavelengths,
peakWL,
betaMx,
scaleMx,
sigma,
expMx,
baselines,
sd_mh,
priors
)
Arguments
spectra |
|
n |
number of observations to use in calculating the likelihood |
conc |
Vector of |
wavelengths |
Vector of |
peakWL |
Vector of locations for each peak (cm^-1) |
betaMx |
|
scaleMx |
|
sigma |
Vector of |
expMx |
|
baselines |
|
sd_mh |
Vector of |
priors |
List of hyperparameters for the prior distributions. |
Value
The number of RWMH proposals that were accepted.
References
Chib (1995) "Marginal Likelihood from the Gibbs Output," JASA 90(432): 1313–1321, doi: 10.1080/01621459.1995.10476635
Rosenthal (2000) "Parallel computing and Monte Carlo algorithms" Far East J. Theor. Stat. 4(2): 207–236, URL: https://www.pphmj.com/abstract/1961.htm
Raman spectrum of methanol (CH3OH)
Description
Raman spectrum of methanol (CH3OH)
Usage
methanol
Format
A list containing 2 variables:
- wavenumbers
a numeric Vector of 331 wavenumbers (cm^-1)
- wavenumbers
a
1 * 331
Matrix of intensity values (a.u.)
Update the parameters of the Voigt peaks using marginal Metropolis-Hastings.
Description
Updates all of the parameters (location, amplitude, std. dev., and scale) using a single Metropolis-
Hastings step, such that the baseline cancels out in the MH ratio, using the marginalisation identity
of Chib (1995).
Note: if npart > 1
, then multiple MCMC chains will be executed independently in parallel using
OpenMP. This means that all functions used for the proposal distributions and to evaluate the MH ratio
need to be thread-safe. Specifically, no calls to R::rnorm
, R::dnorm
, nor their
Rcpp equivalents, can be made from within the parallel portion of the code.
Usage
mhUpdateVoigt(
spectra,
n,
kappa,
conc,
wavenum,
thetaMx,
logThetaMx,
mhChol,
priors
)
Arguments
spectra |
|
n |
number of observations to use in calculating the likelihood. |
kappa |
likelihood tempering parameter. |
conc |
Vector of |
wavenum |
Vector of |
thetaMx |
|
logThetaMx |
|
mhChol |
lower-triangular Cholesky factorisation of the covariance matrix for the random walk proposals. |
priors |
List of hyperparameters for the prior distributions. |
Value
The number of RWMH proposals that were accepted.
References
Chib (1995) "Marginal Likelihood from the Gibbs Output," JASA 90(432): 1313–1321, doi: 10.1080/01621459.1995.10476635
Rosenthal (2000) "Parallel computing and Monte Carlo algorithms" Far East J. Theor. Stat. 4(2): 207–236, URL: https://www.pphmj.com/abstract/1961.htm
Compute the spectral signature using Voigt peaks.
Description
Calculates the value of the pseudo-Voigt broadening function at the given wavenumbers, given the parameters of the peaks. This function is thread-safe.
Usage
mixedVoigt(location, scale_G, scale_L, amplitude, wavenum)
Arguments
location |
Vector of location parameters of the peaks ( |
scale_G |
Vector of standard deviations |
scale_L |
Vector of scale parameters |
amplitude |
Vector of amplitudes of the peaks (a.u.) |
wavenum |
Vector of wavenumbers at which to compute the function. |
Value
The value of the pseudo-Voigt function at the given wavenumbers.
References
Thompson, Cox & Hastings (1987) "Rietveld refinement of Debye–Scherrer synchrotron X-ray data from Al_2 O_3
,"
J. Appl. Crystallogr. 20(2): 79–83, DOI: doi: 10.1107/S0021889887087090
Examples
Cal_V <- seq(300,400,by=5)
loc <- c(320,350,375)
scG <- c(10,5,1)
scL <- c(3,20,7)
amp <- c(100,500,200)
mixedVoigt(loc,scG,scL,amp,Cal_V)
Update the importance weights of each particle.
Description
Update the importance weights of each particle.
Usage
reWeightParticles(
spectra,
peaks,
baselines,
i,
start,
sigma,
old_weights,
alpha,
idx
)
Arguments
spectra |
|
peaks |
|
baselines |
|
i |
index of the current observation to use in calculating the likelihood |
start |
index of the next wavelength to use in calculating the likelihood, permuted by |
sigma |
Vector of |
old_weights |
logarithms of the importance weights of each particle. |
alpha |
the target learning rate for the reduction in effective sample size (ESS). |
idx |
permutation of the indices of the wavelengths. |
Value
a List containing:
ess
The effective sample size, after reweighting.
weights
Vector of updated importance weights.
index
index of the last wavelength used.
evidence
SMC estimate of the logarithm of the model evidence.
References
Pitt, dos Santos Silva, Giordani & Kohn (2012) "On some properties of Markov chain Monte Carlo simulation methods based on the particle filter" J. Econometrics 171(2): 134–151, DOI: doi: 10.1016/j.jeconom.2012.06.004
Zhou, Johansen & Aston (2015) "Towards Automatic Model Comparison: An Adaptive Sequential Monte Carlo Approach" arXiv:1303.3123 [stat.ME]
Resample in place to avoid expensive copying of data structures, using a permutation of the ancestry vector.
Description
Resample in place to avoid expensive copying of data structures, using a permutation of the ancestry vector.
Usage
resampleParticles(log_weights, ampMx, scaleMx, peaks, baselines, n_y, nwl)
Arguments
log_weights |
logarithms of the importance weights of each particle |
ampMx |
|
scaleMx |
|
peaks |
|
baselines |
|
n_y |
number of observations |
nwl |
number of wavenumbers |
Value
Vector of indices to the parents of the resampled particles.
References
Murray, L.M., Lee, A. & Jacob, P.E. (2015) "Parallel resampling in the particle filter" arXiv:1301.4019v3
See Also
Compute an ancestry vector for residual resampling of the SMC particles.
Description
Compute an ancestry vector for residual resampling of the SMC particles.
Usage
residualResampling(log_wt)
Arguments
log_wt |
logarithms of the importance weights of each particle. |
Value
Vector of indices to the particles that will be propagated forward to the next generation (i.e. the parents)
References
Liu & Chen (1998) "Sequential Monte Carlo methods for dynamic systems," JASA 93(443): 1032-1044, doi: 10.1080/01621459.1998.10473765
Douc, Cappe & Moulines (2005) "Comparison of resampling schemes for particle filtering" In Proc. 4th IEEE Int. Symp. ISPA, pp. 64-69, doi: 10.1109/ISPA.2005.195385
SMC particles for TAMRA+DNA (T20)
Description
Posterior distribution for pseudo-Voigt parameters, obtained by running 'fitVoigtPeaksSMC' on a spectrum from Gracie et al. (Anal. Chem., 2016). 1000 SMC particles with 32 peaks. For details, see the vignette.
Usage
result
Format
A list containing 15 variables:
- weights
normalised importance weights for each particle
- location
location parameters of 32 peaks
- beta
amplitudes of 32 peaks
- scale_G
scale of the Gaussian (RBF) broadening
- scale_L
scale of the Lorentzian (Cauchy) broadening
- sigma
standard deviation of the additive white noise
- lambda
smoothing parameter of the cubic B-splines
- priors
List of informative priors
- ess
history of the effective sample size
- kappa
history of the likelihood tempering
- accept
history of Metropolis-Hastings acceptance rates
- mhSteps
history of Metropolis-Hastings steps
- times
history of times for each SMC iteration
- time
computation time taken by the SMC algorithm
SMC particles for methanol (CH3OH)
Description
Posterior distribution for pseudo-Voigt parameters, obtained by running 'fitVoigtPeaksSMC' on a Raman spectrum of methanol with 4 peaks. For details, refer to the vignette.
Usage
result2
Format
A list containing 15 variables.
Bayesian modelling and quantification of Raman spectroscopy
Description
This R package implements sequential Monte Carlo (SMC) algorithms for fitting a generalised additive mixed model (GAMM) to Raman spectra. These multivariate observations are highly collinear and lend themselves to a reduced-rank representation. The GAMM separates the hyperspectral signal into three components: a sequence of Lorentzian or Gaussian peaks; a smoothly-varying baseline; and zero-mean, additive white noise. The parameters of each component of the model are estimated iteratively using SMC. The posterior distributions of the parameters given the observed spectra are represented as a population of weighted particles.
Details
Raman spectroscopy can be used to identify molecules by the characteristic scattering of light from a laser. The pattern of peaks in a Raman spectrum corresponds to the vibrational modes of the molecule. The shift in wavenumber of the photons is proportional to the change in energy state, which is reflected in the locations of the peaks. Surface-enhanced Raman scattering (SERS) is a technique that amplifies the Raman signal using metallic substrates, such as nanoparticles. The laser can also be tuned to the resonant frequency of the molecule, which is known as surface-enhanced resonance Raman scattering (SERRS). Under controlled experimental conditions, the amplitudes of the peaks are linearly related to the concentration of the molecule, from the limit of detection (LOD) up to monolayer coverage of the nanoparticle surface.
The GAMM represents the peaks and baseline as continuous functions. The background fluorescence is modelled using a penalised cubic spline, while the peaks are an additive mixture of squared exponential (Gaussian) or Lorentzian (Cauchy) kernels:
Y = \sum_{m=1}^M \alpha_{i,m}B_m(\nu_j) + \sum_{p=1}^P s(\nu_j | l_p, A_p, \phi_p) + \epsilon_{i,j}
where Y
is a matrix of hyperspectral observations y_{i,j}
that have been discretised at wavenumbers \nu_j
;
B_m
are the M
spline basis functions with coefficients \alpha_{i,m}
; s(\nu_j | l_p, A_p, \phi_p)
are the radial basis functions for each peak, with location l_p
, amplitude A_p
, and scale \phi_p
parameters.
\epsilon_{i,j}
is assumed to be zero mean, additive white noise with constant variance \sigma^2
.
This model-based approach accounts for differences in resolution and experimental conditions, enabling comparison and alignment of heterogeneous spectra. The relationship between concentration and peak intensity can be quantified by fitting a Bayesian functional regression:
A_p = c_i \beta_p
where c_i
is the nanomolar (nM) concentration of the molecule in the i
th spectrum,
c_{LOD} < c_i <= c_{MLC}
. The regression model produces highest posterior density (HPD) intervals for the
limit of detection of each peak. A consistent, unbiased estimate of the model evidence (also known as the marginal likelihood)
is also computed. This can be used to evaluate whether Gaussian or Lorentzian peaks are a better fit to the data.
Author(s)
M. T. Moores, J. Carson & M. Girolami
Maintainer: Matt Moores <mmoores@gmail.com>
References
Moores, Gracie, Carson, Faulds, Graham & Girolami "Bayesian modelling and quantification of Raman spectroscopy," arXiv preprint
Examples
# simulate some data with known parameter values
wavenumbers <- seq(700,1400,by=2)
spectra <- matrix(nrow=1, ncol=length(wavenumbers))
peakLocations <- c(840, 960, 1140, 1220, 1290)
peakAmplitude <- c(11500, 2500, 4000, 3000, 2500)
peakScale <- c(10, 15, 20, 10, 12)
signature <- weightedLorentzian(peakLocations, peakScale, peakAmplitude, wavenumbers)
baseline <- 1000*cos(wavenumbers/200) + 2*wavenumbers
spectra[1,] <- signature + baseline + rnorm(length(wavenumbers),0,200)
plot(wavenumbers, spectra[1,], type='l', xlab="Raman offset", ylab="intensity")
lines(wavenumbers, baseline, col=2, lty=4)
lines(wavenumbers, baseline + signature, col=4, lty=2)
# fit the model using SMC
lPriors <- list(scale.mu=log(11.6) - (0.4^2)/2, scale.sd=0.4, bl.smooth=10^11, bl.knots=50,
beta.mu=5000, beta.sd=5000, noise.sd=200, noise.nu=4)
## Not run:
## takes approx. 1 minute for 100 SMC iterations with 10,000 particles
result <- fitSpectraSMC(wavenumbers, spectra, peakLocations, lPriors)
plot.ts(result$ess, xlab="SMC iterations", ylab="ESS")
# sample 200 particles from the posterior distribution
samp.idx <- sample.int(length(result$weights), 200, prob=result$weights)
plot(wavenumbers, spectra[1,], type='l', xlab="Raman offset", ylab="intensity")
for (pt in samp.idx) {
bl.est <- result$basis %*% result$alpha[,1,pt]
lines(wavenumbers, bl.est, col="#C3000009")
lines(wavenumbers, bl.est + result$expFn[pt,], col="#0000C309")
}
## End(Not run)
Sum log-likelihoods of i.i.d. exponential.
Description
This is an internal function that is only exposed on the public API for unit testing purposes.
Usage
sumDexp(x, rate)
Arguments
x |
Vector of i.i.d. exponential random varibles |
rate |
parameter of the exponential distribution |
Details
The sum of the log-likelihoods (log of the product of the likelihoods) for independent, identically-distributed, exponential random variables. Note: this Rcpp function is thread-safe, unlike the equivalent alternatives.
Value
log-likelihood of x
See Also
sum(dexp(x, rate, log=TRUE))
Sum log-likelihoods of i.i.d. lognormal.
Description
This is an internal function that is only exposed on the public API for unit testing purposes.
Usage
sumDlogNorm(x, meanlog, sdlog)
Arguments
x |
Vector of i.i.d. lognormal random varibles |
meanlog |
mean of the distribution on the log scale |
sdlog |
standard deviation on the log scale |
Details
The sum of the log-likelihoods (log of the product of the likelihoods) for independent, identically-distributed, lognormal random variables. Note: this Rcpp function is thread-safe, unlike the equivalent alternatives.
Value
log-likelihood of x
See Also
sum(dlnorm(x, meanlog, sdlog, log=TRUE))
Examples
x <- rlnorm(100)
sumDlogNorm(x,0,1)
Sum log-likelihoods of Gaussian.
Description
This is an internal function that is only exposed on the public API for unit testing purposes.
Usage
sumDnorm(x, mean, sd)
Arguments
x |
Vector of i.i.d. Gaussian random varibles |
mean |
Vector of means |
sd |
Vector of standard deviations |
Details
The sum of the log-likelihoods (log of the product of the likelihoods) for independent, identically-distributed, Gaussian random variables. Note: this Rcpp function is thread-safe, unlike the equivalent alternatives.
Value
log-likelihood of x
See Also
sum(dnorm(x, mean, sd, log=TRUE))
Examples
x <- rnorm(100)
mu <- rep(0,length(x))
sd <- rep(1,length(x))
sumDnorm(x,mu,sd)
Compute the spectral signature using Gaussian peaks.
Description
Calculates the value of the squared exponential radial basis function at the given wavelengths, given the parameters of the peaks. This function is thread-safe.
Usage
weightedGaussian(location, scale, amplitude, wavelengths)
Arguments
location |
Vector of location parameters of the peaks (mean). |
scale |
Vector of scale parameters of the peaks (standard deviation). |
amplitude |
Vector of amplitudes of the peaks. |
wavelengths |
Vector of wavenumbers at which to compute the function. |
Value
The value of the Gaussian function at the given wavelengths.
Examples
Cal_V <- seq(300,400,by=5)
loc <- c(320,350,375)
sca <- c(10,5,18)
amp <- c(1000,5000,2000)
weightedGaussian(loc,sca,amp,Cal_V)
Compute the spectral signature using Lorentzian peaks.
Description
Calculates the value of the Lorentzian function at the given wavelengths, given the parameters of the peaks. This function is thread-safe.
Usage
weightedLorentzian(location, scale, amplitude, wavelengths)
Arguments
location |
Vector of location parameters of the peaks. |
scale |
Vector of scale parameters of the peaks. |
amplitude |
Vector of amplitudes of the peaks. |
wavelengths |
Vector of wavenumbers at which to compute the function. |
Value
The value of the Lorentian function at the given wavelengths.
Examples
Cal_V <- seq(300,400,by=5)
loc <- c(320,350,375)
sca <- c(10,5,18)
amp <- c(1000,5000,2000)
weightedLorentzian(loc,sca,amp,Cal_V)
Compute the weighted arithmetic means of the particles.
Description
This SMC estimate of the means can be used to centre independent Metropolis-Hastings proposals.
Usage
weightedMean(particles, log_weights)
Arguments
particles |
|
log_weights |
logarithms of the importance weights of each particle. |
Value
A vector of means, one for each row.
See Also
Compute the weighted variance of the particles.
Description
This SMC estimate of the variance can be used to scale the bandwidth of adaptive, Gaussian random walk Metropolis-Hastings proposals.
Usage
weightedVariance(particles, log_weights, mean)
Arguments
particles |
|
log_weights |
logarithms of the importance weights of each particle. |
mean |
Vector of weighted means of each particle. |
Value
A vector of variances, one for each row.