Help for package ern

Title:

Effective Reproduction Number Estimation

Version:

2.1.2

Maintainer:

David Champredon <david.champredon@canada.ca>

Description:

Estimate the effective reproduction number from wastewater and clinical data sources.

License:

MIT + file LICENSE

Imports:

assertthat, coda, dplyr, EpiEstim, ggplot2, lubridate, patchwork, rjags, runjags, stats, stringr, tibble, tidyr, zoo

Suggests:

knitr, rmarkdown, bookdown, purrr, testthat (≥ 3.0.0)

VignetteBuilder:

knitr

Config/testthat/edition:

Encoding:

UTF-8

RoxygenNote:

7.3.1

Depends:

R (≥ 4.1.0)

LazyData:

true

NeedsCompilation:

Packaged:

2025-01-16 17:31:38 UTC; user5023114

Author:

David Champredon

[aut, cre], Warsame Yusuf

[aut], Irena Papst

[aut]

Repository:

CRAN

Date/Publication:

2025-01-16 18:20:02 UTC

Infer daily counts from aggregates

Description

Infer daily counts from aggregates

Usage

agg_to_daily(cl.data, dist.gi, prm.daily, silent = FALSE)

Arguments

cl.data

Data frame. Must have variables:

date: calendar date of report
value: count of reported cases

dist.gi

List. Parameters for the generation interval distribution in the same format as returned by def_dist().

prm.daily

List. Parameters for daily report inference via MCMC. Elements include:

method: String. Method name to infer the daily incidence reports from aggregated ones. Either linear or renewal is currently implemented. The linear method simply performs a linear interpolation that matches the aggregated values. The renewal method fits a SIR-like model using a renewal equation to infer the daily incidence. In this case, the fitting algorithm is a Markov Chain Monte Carlo (MCMC) implemented in JAGS and needs the parameters below (e.g., burn,iter,chains,...). The renewal method is more adapted for short single wave epidemics as this models i) naturally fits a single wave and ii) has longer computing time. For longer time series, user may perfer the linear method.
popsize: Integer. Population size to use in MCMC simulation to infer daily observations from aggregated input data.
burn: Numeric. Length of burn-in period (number of days).
iter: Numeric. Number of iterations after burn-in period (number of days).
chains: Numeric. Number of chains to simulate.
prior_R0_shape: Shape of the (hyper-)parameter for the prior Gamma distribution for R0.
prior_R0_rate: Rate of the (hyper-)parameter for the prior Gamma distribution for R0.
prior_alpha_shape: Shape of the (hyper-)parameter for the prior Gamma distribution for alpha.
prior_alpha_rate: Rate of the (hyper-)parameter for the prior Gamma distribution for alpha.
first.agg.period: length of aggregation period for first aggregated observation (number of days); if NULL, assume same aggregation period as observed for second observation (gap between first and second observations)

silent

Logical. Flag to suppress all output messages, warnings, and progress bars.

Value

A list containing a data frame with individual realizations of daily reported cases and the JAGS object.

Examples


# Importing data attached to the `ern` package
# and selecting the Omicron wave in Ontario, Canada.
# This is *weekly* incidence.
data(cl.data)
data = cl.data[cl.data$pt == 'on' & 
                  cl.data$date > as.Date('2021-11-30') & 
                  cl.data$date < as.Date('2021-12-31'),] 
head(data)
dist.gi = ern::def_dist(
 dist     = "gamma",
 mean     = 6.84,
 mean_sd  = 0.7486,
 shape    = 2.39,
 shape_sd = 0.3573,
 max      = 15
)

a = agg_to_daily(
cl.data = data, 
dist.gi = dist.gi, 
  prm.daily = list(
  method = "renewal",
  popsize = 14e6,
  # MCMC parameters.
  # small values for computation speed for this example.
  # Increase for better accuracy
  burn = 100,
  iter = 100,
  chains = 2,
  # - - - - - 
  prior_R0_shape = 2,
  prior_R0_rate = 0.6,
  prior_alpha_shape = 1,
  prior_alpha_rate = 1
))
# This is a Bayesian inference, so we 
# have a posterior distribution of  
# daily incidences. Here we just plot
# one single draw:
      
 df = a$df
 df1 = df[df$id==1,]
 plot(x = df1$t, y = df1$value, typ = 'o',
      xlab = 'days', ylab = 'daily incidence',
      main = 'Posterior daily incidence infered from weekly incidence')
 
 # Extract of the parameters values from the first chain
 a$jags.object[[1]][1:9,1:9]

Attach start date from first observation for aggregated data from time (day number)

Description

Attach start date from first observation for aggregated data from time (day number)

Usage

attach_startdate_agg(x)

Arguments

x

dataframe. only has columns date, value, and t

Attach internal time index column

Description

Attach internal time index column

Usage

attach_t(df)

Arguments

df

Data frame. Must have column date.

Value

Tibble.

Attach time index (number of days) column. Exclude first day since we don't necessarily know over which period of time that data was aggregated

Description

Attach time index (number of days) column. Exclude first day since we don't necessarily know over which period of time that data was aggregated

Usage

attach_t_agg(cl.data, prm.daily = NULL, silent = FALSE)

Arguments

cl.data

Data frame. Must have variables:

date: calendar date of report
value: count of reported cases

prm.daily

List. Parameters for daily report inference via MCMC. Elements include:

method: String. Method name to infer the daily incidence reports from aggregated ones. Either linear or renewal is currently implemented. The linear method simply performs a linear interpolation that matches the aggregated values. The renewal method fits a SIR-like model using a renewal equation to infer the daily incidence. In this case, the fitting algorithm is a Markov Chain Monte Carlo (MCMC) implemented in JAGS and needs the parameters below (e.g., burn,iter,chains,...). The renewal method is more adapted for short single wave epidemics as this models i) naturally fits a single wave and ii) has longer computing time. For longer time series, user may perfer the linear method.
popsize: Integer. Population size to use in MCMC simulation to infer daily observations from aggregated input data.
burn: Numeric. Length of burn-in period (number of days).
iter: Numeric. Number of iterations after burn-in period (number of days).
chains: Numeric. Number of chains to simulate.
prior_R0_shape: Shape of the (hyper-)parameter for the prior Gamma distribution for R0.
prior_R0_rate: Rate of the (hyper-)parameter for the prior Gamma distribution for R0.
prior_alpha_shape: Shape of the (hyper-)parameter for the prior Gamma distribution for alpha.
prior_alpha_rate: Rate of the (hyper-)parameter for the prior Gamma distribution for alpha.
first.agg.period: length of aggregation period for first aggregated observation (number of days); if NULL, assume same aggregation period as observed for second observation (gap between first and second observations)

silent

Logical. Flag to suppress all output messages, warnings, and progress bars.

Value

Data frame

Check the format of input clinical data

Description

Check the format of input clinical data

Usage

check_cl.input_format(cl.data, silent = FALSE)

Arguments

cl.data

Data frame. Must have variables:

date: calendar date of report
value: count of reported cases

silent

Logical. Flag to suppress all output messages, warnings, and progress bars.

Check if input data is already daily

Description

Check if input data is already daily

Usage

check_df.input_daily(df.input)

Arguments

df.input

Data frame. Must have variables:

date: calendar date of observation (report for clinical data, wastewater sampling for wastewater data)
value: observed values (count of reported cases for clinical data, wastewater concentration for wastewater data)

Value

Logical. Indicates whether input data is already daily.

Check distributions

Description

Check distributions

Usage

check_dist(x)

Arguments

x

family of distributions to be checked, as generated by def_dist_()

Check that deconvolution inputs are compatible

Description

Check that deconvolution inputs are compatible

Usage

check_for_deconv(obs, dist)

Arguments

obs

Numeric. Vector with observed signal (e.g., case reports)

dist

Numeric. Vector of discretized distribution used as the deconvolution kernel (e.g., reporting delay distribution)

Check parameters for Rt calculation

Description

Check parameters for Rt calculation

Usage

check_prm.R(x, silent = FALSE)

Arguments

x

List. Parameters for Rt calculation.

silent

Logical. Flag to suppress all output messages, warnings, and progress bars.

Check parameters for daily data inference

Description

Check parameters for daily data inference

Usage

check_prm.daily(x)

Arguments

x

List. Parameters for daily data inference.

Check parameters for daily data inference check

Description

Check parameters for daily data inference check

Usage

check_prm.daily.check(x)

Arguments

x

List. Parameters for daily data inference check.

Check parameters for smoothing

Description

Check parameters for smoothing

Usage

check_prm.smooth(x)

Arguments

x

List that specifies the type of smoothing and the parameters associated with the smoothing method.

Sample of aggregated clinical reports

Description

A subset of COVID-19 weekly reports in the Government of Canada Health Infobase. See https://health-infobase.canada.ca/covid-19/

Usage

cl.data

Format

`cl.data`

A data frame with 96 rows and 3 columns:

pt: standard two-character abbreviation (lowercase) of the province name (based on Statistics Canada 2021 census abbreviations)
date: report date
value: count of reported cases for the previous week

Filter indicating a specific province to extract a sample dataset for use with estimate_R_cl(), e.g.

estimate_R_cl(cl.data = dplyr::filter(cl.data, pt == 'bc'), ...)

Correct underreporting by scaling up

Description

Correct underreporting by scaling up

Usage

correct_underreporting(reports.daily, reporting.fraction)

Arguments

reports.daily

dataframe of daily reported cases. must at least include value column with counts.

reporting.fraction

numeric. proportion of incidence that is reported.

Wrapper for deconvolution with a given distribution

Description

Wrapper for deconvolution with a given distribution

Usage

deconv(counts, dist, max.iter = 10, silent = FALSE)

Arguments

counts

Numeric. Vector of daily counts.

dist

Numeric. Vector of discrete distributions (daily, truncated) with which we're deconvoluting counts, e.g., produced by get_discrete_dist()

max.iter

Numeric. Maximum number of Richardson-Lucy iterations.

silent

Logical. Flag to suppress all output messages, warnings, and progress bars.

Deconvoluting Wastewater Data to Incidence

Description

Function estimates incidence from smoothed data

Usage

deconv_ww_inc(d, fec, scaling.factor, silent, RL.max.iter)

Arguments

d

Data frame. Wastewater dataframe. Must include at least date, time t and obs columns.

fec

List. Parameters for a single fecal shedding distribution, as generated by sample_a_dist().

scaling.factor

Numeric. Scaling from wastewater concentration to prevalence. This value may be assumed or independently calibrated to data.

silent

Logical. Flag to suppress all output messages, warnings, and progress bars.

RL.max.iter

Integer. Maximum of iterations for the Richardson-Lucy deconvolution algorithm.

Value

Data frame with deconvoluted incidence

Richardson-Lucy Deconvolution.

Description

Richardson-Lucy Deconvolution.

Usage

deconvolution_RL(
  observed,
  times,
  p_delay,
  max_iter = 50,
  out_col_name = "RL_result",
  right_censor = TRUE,
  verbose = FALSE
)

Arguments

observed

A vector of the number of observed cases, deaths, hospitalizations, etc. per day

times

A numeric vector of times corresponding to the entries in observed. Must be the same length as observed.

p_delay

A numeric vector whose entries give the probability that the delay from infection to observation is exactly 0, 1, ... N days. p_delay must sum to 1 and to facilitate vectorization should be the same length as observed, and times (the last several entries will most likely be 0s).

max_iter

Maximum number of times to iterate. Following Goldstein et al., the algorithm continues to run until the normalized chi2 statistic comparing the observed and expected number of deaths per day falls below 1, or until the maximum number of iterations is reached.

out_col_name

String giving the name of the column in which to output the imputed times of infection

right_censor

If TRUE (default), upscale the observations at the end of the time series based on the cumulative probability that an infection occurring on that date would have been observed.

verbose

Logical. Verbose mode.

Value

A data frame with columns time and out_col_name (which gives the imputed number of infections per timestep)

Define a family of distributions.

Description

Define a family of distributions.

Usage

def_dist(dist, ...)

Arguments

dist

distribution type. Distributions currently supported are:

norm = normal,
lnorm = log-normal,
gamma = Gamma,
unif = uniform

...

a series of distribution parameters. Included should be the following:

mean distribution mean (only for dist = lnorm or gamma).
mean_sd standard deviation of the mean (only for dist = lnorm or gamma).
sd standard deviation (only for dist = lnorm or gamma).
sd_sd standard deviation of the standard deviation (only for dist = lnorm or gamma).
min minimum value of the random variable modelled by this distribution (only for dist = unif).
max maximum value of the random variable modelled by this distribution.

Value

List with components specified in the parameters.

Examples

d = def_dist(
  dist     = "gamma",
  mean     = 3.49,
  mean_sd  = 0.1477,
  shape    = 8.5,
  shape_sd = 1.8945,
  max      = 8
)
print(d)

Draw from gamma for a parameter specified in a distribution family list.

Description

Draw from gamma for a parameter specified in a distribution family list.

Usage

draw_from_gamma(par, dist)

Arguments

par

Character. Name of the parameter to sample.

dist

List. Distribution definition, as output by def_dist().

Value

Numeric. The sampled parameter value.

Estimate the effective reproduction from clinical report data

Description

Estimate the effective reproduction from clinical report data

Usage

estimate_R_cl(
  cl.data,
  dist.repdelay,
  dist.repfrac,
  dist.incub,
  dist.gi,
  prm.daily = list(method = "linear", popsize = NULL, burn = 500, iter = 2000, chains =
    3, prior_R0_shape = 2, prior_R0_rate = 0.6, prior_alpha_shape = 1, prior_alpha_rate =
    1, first.agg.period = NULL),
  prm.daily.check = list(agg.reldiff.tol = 10),
  prm.smooth = list(method = "rollmean", align = "right", window = 7),
  prm.R = list(iter = 10, CI = 0.95, window = 7, config.EpiEstim = NULL),
  RL.max.iter = 10,
  silent = FALSE
)

Arguments

cl.data

Data frame. Must have variables:

date: calendar date of report
value: count of reported cases

dist.repdelay

List. Parameters for the reporting delay distribution in the same format as returned by def_dist().

dist.repfrac

List. Parameters for the reporting fraction distribution in the same format as returned by def_dist().

dist.incub

List. Parameters for the incubation period distribution in the same format as returned by def_dist().

dist.gi

List. Parameters for the generation interval distribution in the same format as returned by def_dist().

prm.daily

List. Parameters for daily report inference via MCMC. Elements include:

method: String. Method name to infer the daily incidence reports from aggregated ones. Either linear or renewal is currently implemented. The linear method simply performs a linear interpolation that matches the aggregated values. The renewal method fits a SIR-like model using a renewal equation to infer the daily incidence. In this case, the fitting algorithm is a Markov Chain Monte Carlo (MCMC) implemented in JAGS and needs the parameters below (e.g., burn,iter,chains,...). The renewal method is more adapted for short single wave epidemics as this models i) naturally fits a single wave and ii) has longer computing time. For longer time series, user may perfer the linear method.
popsize: Integer. Population size to use in MCMC simulation to infer daily observations from aggregated input data.
burn: Numeric. Length of burn-in period (number of days).
iter: Numeric. Number of iterations after burn-in period (number of days).
chains: Numeric. Number of chains to simulate.
prior_R0_shape: Shape of the (hyper-)parameter for the prior Gamma distribution for R0.
prior_R0_rate: Rate of the (hyper-)parameter for the prior Gamma distribution for R0.
prior_alpha_shape: Shape of the (hyper-)parameter for the prior Gamma distribution for alpha.
prior_alpha_rate: Rate of the (hyper-)parameter for the prior Gamma distribution for alpha.
first.agg.period: length of aggregation period for first aggregated observation (number of days); if NULL, assume same aggregation period as observed for second observation (gap between first and second observations)

prm.daily.check

List. Parameters for checking aggregated to daily report inference. Elements include:

agg.reldiff.tol: numerical tolerance (%) for relative error between aggregated inferred daily reports and original aggregated reports; chronological observations are dropped until this tolerance is first acheived (convergence at the start of the timeseries is often the worst, need to maintain uninterrupted daily timeseries for input into Rt calculation).

Set this entire argument to NULL to use inferred daily reports as is.

prm.smooth

List. list of smoothing parameters. Parameters should be specified as followed:

method: smoothing method, either 'rollmean' (rolling mean) or 'loess' (LOESS smoothing via stats::loess())
window: for ⁠method = 'rollmean⁠ only; width of smoothing window in days
align: for ⁠method = 'rollmean⁠ only; smoothing alignment, either 'center', 'left', 'right'
span: for method = 'loess' only; smoothing span (see the documentation for stats::loess() for details)
floor: optional call for wastewater concentration smoothing with method = 'loess' only; user defined minimum smoothing concentration

Set this entire list to NULL to turn off smoothing

prm.R

List. Settings for the ensemble when calculating Rt. Elements include:

iter: Integer. Number of iterations for the Rt ensemble
CI: Numeric between 0 and 1. Confidence interval width for Rt estimates after sampling uncertain distributions.
window: Integer. Number of days defining the window of data used by EpiEstim to estimate Rt. If NULL, will default to 7.
config.EpiEstim: (optional) configuration for EpiEstim defined via EpiEstim::make_config(). If NULL, will use default config from EpiEstim.

RL.max.iter

Integer. Maximum of iterations for the Richardson-Lucy deconvolution algorithm.

silent

Logical. Flag to suppress all output messages, warnings, and progress bars.

Value

List. Elements include:

cl.data: original aggregated reports signal
cl.daily: reports as input for Rt calculation (inferred daily counts, smoothed)
inferred.agg: inferred daily reports aggregated on the reporting schedule as input in cl.data
R: the effective R estimate (summary from ensemble)

Examples


# -- THIS EXAMPLE TAKES ABOUT 30 SECONDS TO RUN --
# Estimate Rt

## Not run: 
# Load SARS-CoV-2 reported cases in Quebec
# during the Summer 2021
dat <- (ern::cl.data
    |> dplyr::filter(
      pt == "qc", 
      dplyr::between(date, as.Date("2021-06-01"), as.Date("2021-09-01"))
    )
)
# distributions
dist.repdelay = ern::def_dist(
    dist = 'gamma',
    mean = 5, 
    mean_sd = 1,
    sd = 1,
    sd_sd = 0.1,
    max = 10
)
dist.repfrac = ern::def_dist(
    dist = "unif",
    min = 0.1,
    max = 0.3
)
dist.incub = ern::def_dist(
    dist = "gamma",
    mean = 3.49,
    mean_sd = 0.1477,
    shape = 8.5,
    shape_sd = 1.8945,
    max = 8
)
dist.gi = ern::def_dist(
    dist = "gamma",
    mean = 6,
    mean_sd = 0.75,
    shape = 2.4,
    shape_sd = 0.3,
    max = 10
)

# settings
prm.daily <- list(
    method = "renewal",
    popsize = 8.5e6, # Q3 (July 1) 2022 estimate for Quebec
    burn = 500,
    iter = 500,
    chains = 2,
    prior_R0_shape = 1.1, prior_R0_rate = 0.6, 
    prior_alpha_shape = 1, prior_alpha_rate = 1
)
prm.daily.check <- list(
    agg.reldiff.tol = 10
)
prm.smooth <- list(
    method = "rollmean",
    align = "center",
    window = 7
)
prm.R <- list(
    iter = 20, 
    CI = 0.95, 
    window = 7, 
    config.EpiEstim = NULL
)

x <- estimate_R_cl(
  dat,
  dist.repdelay,
  dist.repfrac,
  dist.incub,
  dist.gi,
  prm.daily,
  prm.daily.check,
  prm.smooth,
  prm.R
)

# Rt estimates
print(x$R)

## End(Not run)

Ensemble estimate of Rt

Description

Ensemble estimate of Rt

Usage

estimate_R_cl_rep(
  cl.daily,
  dist.repfrac,
  dist.repdelay,
  dist.incub,
  dist.gi,
  prm.R,
  silent = FALSE,
  RL.max.iter = 10
)

Arguments

cl.daily

Dataframe of inferred daily incidence.

dist.repfrac

List. Parameters for the reporting fraction distribution in the same format as returned by def_dist().

dist.repdelay

List. Parameters for the reporting delay distribution in the same format as returned by def_dist().

dist.incub

List. Parameters for the incubation period distribution in the same format as returned by def_dist().

dist.gi

List. Parameters for the generation interval distribution in the same format as returned by def_dist().

prm.R

List. Settings for the ensemble when calculating Rt. Elements include:

iter: Integer. Number of iterations for the Rt ensemble
CI: Numeric between 0 and 1. Confidence interval width for Rt estimates after sampling uncertain distributions.
window: Integer. Number of days defining the window of data used by EpiEstim to estimate Rt. If NULL, will default to 7.
config.EpiEstim: (optional) configuration for EpiEstim defined via EpiEstim::make_config(). If NULL, will use default config from EpiEstim.

silent

Logical. Flag to suppress all output messages, warnings, and progress bars.

RL.max.iter

Integer. Maximum of iterations for the Richardson-Lucy deconvolution algorithm.

Value

A summary of individual Rt realizations with 95% confidence intervals

A single realization of the Rt estimate

Description

A single realization of the Rt estimate

Usage

estimate_R_cl_single(
  cl.daily,
  dist.repfrac,
  dist.repdelay,
  dist.incub,
  dist.gi,
  prm.R,
  silent = FALSE,
  RL.max.iter = 10
)

Arguments

cl.daily

Dataframe of inferred daily incidence.

dist.repfrac

List. Parameters for the reporting fraction distribution in the same format as returned by def_dist().

dist.repdelay

List. Parameters for the reporting delay distribution in the same format as returned by def_dist().

dist.incub

List. Parameters for the incubation period distribution in the same format as returned by def_dist().

dist.gi

List. Parameters for the generation interval distribution in the same format as returned by def_dist().

prm.R

List. Settings for the ensemble when calculating Rt. Elements include:

iter: Integer. Number of iterations for the Rt ensemble
CI: Numeric between 0 and 1. Confidence interval width for Rt estimates after sampling uncertain distributions.
window: Integer. Number of days defining the window of data used by EpiEstim to estimate Rt. If NULL, will default to 7.
config.EpiEstim: (optional) configuration for EpiEstim defined via EpiEstim::make_config(). If NULL, will use default config from EpiEstim.

silent

Logical. Flag to suppress all output messages, warnings, and progress bars.

RL.max.iter

Integer. Maximum of iterations for the Richardson-Lucy deconvolution algorithm.

Estimate the effective reproduction from wastewater concentration data.

Description

Estimate the effective reproduction from wastewater concentration data.

Usage

estimate_R_ww(
  ww.conc,
  dist.fec,
  dist.gi,
  scaling.factor = 1,
  prm.smooth = list(window = 14, align = "center", method = "loess", span = 0.2),
  prm.R = list(iter = 10, CI = 0.95, window = 7, config.EpiEstim = NULL),
  silent = FALSE,
  RL.max.iter = 9
)

Arguments

ww.conc

Data frame. Must have variables:

date: calendar date of wastewater collection
value: pathogen concentration

dist.fec

List. Parameters for the fecal shedding distribution in the same format as returned by def_dist().

dist.gi

List. Parameters for the generation interval distribution in the same format as returned by def_dist().

scaling.factor

Numeric. Scaling from wastewater concentration to prevalence. This value may be assumed or independently calibrated to data.

prm.smooth

List. list of smoothing parameters. Parameters should be specified as followed:

method: smoothing method, either 'rollmean' (rolling mean) or 'loess' (LOESS smoothing via stats::loess())
window: for ⁠method = 'rollmean⁠ only; width of smoothing window in days
align: for ⁠method = 'rollmean⁠ only; smoothing alignment, either 'center', 'left', 'right'
span: for method = 'loess' only; smoothing span (see the documentation for stats::loess() for details)
floor: optional call for wastewater concentration smoothing with method = 'loess' only; user defined minimum smoothing concentration

Set this entire list to NULL to turn off smoothing

prm.R

List. Settings for the ensemble when calculating Rt. Elements include:

iter: Integer. Number of iterations for the Rt ensemble
CI: Numeric between 0 and 1. Confidence interval width for Rt estimates after sampling uncertain distributions.
window: Integer. Number of days defining the window of data used by EpiEstim to estimate Rt. If NULL, will default to 7.
config.EpiEstim: (optional) configuration for EpiEstim defined via EpiEstim::make_config(). If NULL, will use default config from EpiEstim.

silent

Logical. Flag to suppress all output messages, warnings, and progress bars.

RL.max.iter

Integer. Maximum of iterations for the Richardson-Lucy deconvolution algorithm.

Value

List. Elements include:

ww.conc: original wastewater signal
ww.smooth: smoothed wastewater signal
inc: inferred incidence
R: the effective reproduction number estimate

Examples


 # Load data of viral concentration in wastewater
data("ww.data")

# Run the estimation of Rt based on the wastewater data
x = estimate_R_ww(
  ww.conc  = ww.data,
  dist.fec = ern::def_dist(
    dist = "gamma",
    mean = 12.90215,
    mean_sd = 1.136829,
    shape = 1.759937,
    shape_sd = 0.2665988,
    max = 33
    ),
  dist.gi  = ern::def_dist(
    dist     = "gamma",
    mean     = 6.84,
    mean_sd  = 0.7486,
    shape    = 2.39,
    shape_sd = 0.3573,
    max      = 15
    ), 
  silent   = TRUE
)

# Rt estimates
head(x$R)

# inferred daily incidence
head(x$inc)

Extract MCMC chains from a JAGS object

Description

Extract MCMC chains from a JAGS object

Usage

extract_mcmc_values(chain, jags.obj)

Arguments

chain

Integer. Chain number.

jags.obj

JAGS object as returned by code.sample()

Value

A dataframe of the chain values for selected parameters.

Fit JAGS model to aggregated data

Description

Fit JAGS model to aggregated data

Usage

fit_jags_aggreg(obs.times, Y, g, N, n.days = NULL, prm.daily, silent = FALSE)

Arguments

obs.times

Numeric. Vector of observation times.

Y

Numeric. Vector of aggregated counts.

g

Numeric. Vector of discretized generation interval density.

N

Numeric. Scalar population size.

n.days

Numeric. Total number of days. if NULL, use max(obs.times)

prm.daily

List. Parameters for daily report inference via MCMC. Elements include:

method: String. Method name to infer the daily incidence reports from aggregated ones. Either linear or renewal is currently implemented. The linear method simply performs a linear interpolation that matches the aggregated values. The renewal method fits a SIR-like model using a renewal equation to infer the daily incidence. In this case, the fitting algorithm is a Markov Chain Monte Carlo (MCMC) implemented in JAGS and needs the parameters below (e.g., burn,iter,chains,...). The renewal method is more adapted for short single wave epidemics as this models i) naturally fits a single wave and ii) has longer computing time. For longer time series, user may perfer the linear method.
popsize: Integer. Population size to use in MCMC simulation to infer daily observations from aggregated input data.
burn: Numeric. Length of burn-in period (number of days).
iter: Numeric. Number of iterations after burn-in period (number of days).
chains: Numeric. Number of chains to simulate.
prior_R0_shape: Shape of the (hyper-)parameter for the prior Gamma distribution for R0.
prior_R0_rate: Rate of the (hyper-)parameter for the prior Gamma distribution for R0.
prior_alpha_shape: Shape of the (hyper-)parameter for the prior Gamma distribution for alpha.
prior_alpha_rate: Rate of the (hyper-)parameter for the prior Gamma distribution for alpha.
first.agg.period: length of aggregation period for first aggregated observation (number of days); if NULL, assume same aggregation period as observed for second observation (gap between first and second observations)

silent

Logical. Flag to suppress all output messages, warnings, and progress bars.

Get a discretized, truncated version of a distribution

Description

Get a discretized, truncated version of a distribution

Usage

get_discrete_dist(params)

Arguments

params

distribution params (output of ⁠def_dist_*()⁠ function)

Value

Numeric. Vector with discretized density.

Examples


# Define distributions
fec = ern::def_dist(
  dist = "gamma",
  mean = 12.90215,
  mean_sd = 1.136829,
  shape = 1.759937,
  shape_sd = 0.2665988,
  max = 33
  )
gi  = ern::def_dist(
  dist     = "gamma",
  mean     = 6.84,
  mean_sd  = 0.7486,
  shape    = 2.39,
  shape_sd = 0.3573,
  max      = 15
  )

# Get their (discretized) densities
d.fec = get_discrete_dist(fec)
d.gi  = get_discrete_dist(gi)

print(d.fec)
print(d.gi)

Retrieve realizations for aggregated to daily inference

Description

Retrieve realizations for aggregated to daily inference

Usage

get_realizations(fit.reports.daily, reports)

Arguments

fit.reports.daily

Data frame. Realizations from daily report inference. Must at least have t (time index), var (variable name), iteration (realization number), and value (inferred count) columns.

reports

Data frame. Original aggregated reports. Must at least have date column.

Value

Data frame

Get dates for which to use (trust) inferred daily reports

Description

Get dates for which to use (trust) inferred daily reports

Usage

get_use_dates(cl.daily, cl.data, prm.daily.check, dates.only = TRUE)

Arguments

cl.daily

Data frame. Output of agg_to_daily().

cl.data

Data frame. Must have variables:

date: calendar date of report
value: count of reported cases

prm.daily.check

List. Parameters for checking aggregated to daily report inference. Elements include:

agg.reldiff.tol: numerical tolerance (%) for relative error between aggregated inferred daily reports and original aggregated reports; chronological observations are dropped until this tolerance is first acheived (convergence at the start of the timeseries is often the worst, need to maintain uninterrupted daily timeseries for input into Rt calculation).

Set this entire argument to NULL to use inferred daily reports as is.

dates.only

Logical. Return use dates only or all columns of cl.daily.

Value

Data frame or vector, depending on dates.only

Helper function. Converts wastewater to Rt after sampling one fecal shedding and one generation interval distribution.

Description

Helper function. Converts wastewater to Rt after sampling one fecal shedding and one generation interval distribution.

Usage

inc2R_one_iter(
  i,
  dist.fec,
  dist.gi,
  ww.conc,
  scaling.factor,
  prm.R,
  silent,
  RL.max.iter
)

Arguments

i

Numeric. Iteration index. (not used but required when using lapply())

dist.fec

List. Parameters for the fecal shedding distribution in the same format as returned by def_dist().

dist.gi

List. Parameters for the generation interval distribution in the same format as returned by def_dist().

ww.conc

Data frame. Must have variables:

date: calendar date of wastewater collection
value: pathogen concentration

scaling.factor

Numeric. Scaling from wastewater concentration to prevalence. This value may be assumed or independently calibrated to data.

prm.R

List. Settings for the ensemble when calculating Rt. Elements include:

iter: Integer. Number of iterations for the Rt ensemble
CI: Numeric between 0 and 1. Confidence interval width for Rt estimates after sampling uncertain distributions.
window: Integer. Number of days defining the window of data used by EpiEstim to estimate Rt. If NULL, will default to 7.
config.EpiEstim: (optional) configuration for EpiEstim defined via EpiEstim::make_config(). If NULL, will use default config from EpiEstim.

silent

Logical. Flag to suppress all output messages, warnings, and progress bars.

RL.max.iter

Integer. Maximum of iterations for the Richardson-Lucy deconvolution algorithm.

Value

List. Elements include inc (incidence) and rt (reproduction number)

Estimate Rt using EpiEstim

Description

Estimate Rt using EpiEstim

Usage

incidence_to_R(incidence, generation.interval, prm.R)

Arguments

incidence

Data frame. Estimated incidence. Must include at least date, I, and t columns.

generation.interval

List. Parameters for a single generation interval distribution, as generated by sample_a_dist().

prm.R

List. Settings for the ensemble when calculating Rt. Elements include:

iter: Integer. Number of iterations for the Rt ensemble
CI: Numeric between 0 and 1. Confidence interval width for Rt estimates after sampling uncertain distributions.
window: Integer. Number of days defining the window of data used by EpiEstim to estimate Rt. If NULL, will default to 7.
config.EpiEstim: (optional) configuration for EpiEstim defined via EpiEstim::make_config(). If NULL, will use default config from EpiEstim.

Daily incidence from linear interpolation

Description

Daily incidence from linear interpolation

Usage

linear_int_daily(cl.data)

Arguments

cl.data

Aggregated incidence.

Value

A dataframe of daily incidence

Replace NAs with 0s in vector

Description

Replace NAs with 0s in vector

Usage

na_to_0(vec)

Arguments

vec

Vector.

Diagnostic plot for R estimation from clinical report data

Description

Diagnostic plot for R estimation from clinical report data

Usage

plot_diagnostic_cl(r.estim, caption = NULL, wrap.plots = TRUE)

Arguments

r.estim

List. Output of estimate_R_cl().

caption

String. Caption to be inserted in the plot. Default is caption = NULL which disables the caption.

wrap.plots

Logical. Wrap the plots together into a single ggplot object? If wrap.plots = TRUE (the default) will return wrapped plots in a single object, else will return a list of separate ggplot objects.

Value

Plots of the clinical data used, the inferred daily incidence and Rt estimates. If wrap.plots = TRUE (the default) will return wrapped plots (with x-axis aligned to facilitate the comaprison) in a single object, else will return a list of separate ggplot objects.

A ggplot object (or a list of ggplot objects if wrap.plots = FALSE).

Examples


# -- THIS EXAMPLE TAKES ABOUT 30 SECONDS TO RUN --
# Estimate Rt

## Not run: 
# Load SARS-CoV-2 reported cases in Quebec
# during the Summer 2021
dat <- (ern::cl.data
    |> dplyr::filter(
      pt == "qc", 
      dplyr::between(date, as.Date("2021-06-01"), as.Date("2021-09-01"))
    )
)
# distributions
dist.repdelay = ern::def_dist(
    dist = 'gamma',
    mean = 5, 
    mean_sd = 1,
    sd = 1,
    sd_sd = 0.1,
    max = 10
)
dist.repfrac = ern::def_dist(
    dist = "unif",
    min = 0.1,
    max = 0.3
)
dist.incub = ern::def_dist(
    dist = "gamma",
    mean = 3.49,
    mean_sd = 0.1477,
    shape = 8.5,
    shape_sd = 1.8945,
    max = 8
)
dist.gi = ern::def_dist(
    dist = "gamma",
    mean = 6,
    mean_sd = 0.75,
    shape = 2.4,
    shape_sd = 0.3,
    max = 10
)

# settings
prm.daily <- list(
    method = "renewal",
    popsize = 8.5e6, # Q3 (July 1) 2022 estimate for Quebec
    burn = 500,
    iter = 500,
    chains = 2,
    prior_R0_shape = 1.1, prior_R0_rate = 0.6, 
    prior_alpha_shape = 1, prior_alpha_rate = 1
)
prm.daily.check <- list(
    agg.reldiff.tol = 10
)
prm.smooth <- list(
    method = "rollmean",
    align = "center",
    window = 7
)
prm.R <- list(
    iter = 20, 
    CI = 0.95, 
    window = 7, 
    config.EpiEstim = NULL
)

x <- estimate_R_cl(
  dat,
  dist.repdelay,
  dist.repfrac,
  dist.incub,
  dist.gi,
  prm.daily,
  prm.daily.check,
  prm.smooth,
  prm.R
)

# Diagnostic plot for Rt estimates 
# from clinical data
g = plot_diagnostic_cl(x)
plot(g)

g2 = plot_diagnostic_cl(x, caption = 'This is your caption', wrap.plots = FALSE)
plot(g2$clinical_data)
plot(g2$inferred_incidence)
plot(g2$Rt)

## End(Not run)

Diagnostic plot for R estimation from wastewater data

Description

Diagnostic plot for R estimation from wastewater data

Usage

plot_diagnostic_ww(r.estim, caption = NULL, wrap.plots = TRUE)

Arguments

r.estim

List. Output of estimate_R_ww().

caption

Character. Optional plot caption.

wrap.plots

Logical. Wrap all diagnostic plots into one single ggplot object (default = TRUE).

Value

A ggplot object.

Examples


# Load data of viral concentration in wastewater
data("ww.data")

# Estimate Rt based on wastewater data
x = estimate_R_ww(
  ww.conc  = ww.data,
  dist.fec = ern::def_dist(
    dist = "gamma",
    mean = 12.9,
    mean_sd = 1.13,
    shape = 1.75,
    shape_sd = 0.26,
    max = 33
    ),
  dist.gi  = ern::def_dist(
    dist     = "gamma",
    mean     = 6.84,
    mean_sd  = 0.74,
    shape    = 2.39,
    shape_sd = 0.35,
    max      = 15
    ), 
  silent   = TRUE
)

# Diagnostic plot
g = plot_diagnostic_ww(x)
plot(g)

g2 = plot_diagnostic_ww(x, wrap.plots = FALSE, caption = "This is your caption")
plot(g2$wastewater_data)
plot(g2$inferred_incidence)
plot(g2$Rt)

Plot a distribution

Description

Plot a distribution

Usage

plot_dist(d)

Arguments

d

List that defines the distribution (as returned by def_dist_incubation_period() for example)

Value

A ggplot object.

Examples

# Define a `ern` distribution:
gi  = ern::def_dist(
  dist     = "gamma",
  mean     = 6.84,
  mean_sd  = 0.7486,
  shape    = 2.39,
  shape_sd = 0.3573,
  max      = 15
  )

# Plot can be customized like any `ggplot` object:
g = plot_dist(gi) + ggplot2::labs(subtitle = 'your subtitle')
plot(g)

Plot the Gelman Rubin statistic for all parameters.

Description

Plot the Gelman Rubin statistic for all parameters.

Usage

plot_gelman_rubin(jags.obj)

Arguments

jags.obj

JAGS object as returned by code.sample()

Value

A ggplot plot.

Plot MCMC traces

Description

Plot MCMC traces

Usage

plot_traces(jags.obj)

Arguments

jags.obj

JAGS object as returned by code.sample()

Value

A ggplot plot.

Infer incidence from reports via a series of deconvolutions

Description

Infer incidence from reports via a series of deconvolutions

Usage

reports_to_incidence(
  reports.daily,
  reporting.delay,
  incubation.period,
  max.iter = 9,
  silent = FALSE
)

Arguments

reports.daily

Data frame. Daily report counts. Includes at least date and value columns.

reporting.delay

List. Parameters for a single reporting delay distribution, as generated by sample_a_dist().

incubation.period

List. Parameters for a single incubation period distribution, as generated by sample_a_dist().

max.iter

Numeric. Maximum number of Richardson-Lucy iterations.

silent

Logical. Flag to suppress all output messages, warnings, and progress bars.

Reshape JAGS fit object

Description

Reshape JAGS fit object

Usage

reshape_fit_jags(x)

Arguments

x

Data frame. JAGS output from fit_jags_aggreg().

Sample parameters for a single distribution from a family of distributions, assuming parameters come from a Gamma distribution.

Description

Sample parameters for a single distribution from a family of distributions, assuming parameters come from a Gamma distribution.

Usage

sample_a_dist(dist)

Arguments

dist

List. A list of distribution parameters, as defined by def_dist().

Sample from a distribution (currently only implemented for a uniform distribution)

Description

Sample from a distribution (currently only implemented for a uniform distribution)

Usage

sample_from_dist(n, params)

Arguments

n

number of samples to draw

params

distribution parameters

Helper function - sample from EpiEstim's R posterior distribution

Description

Helper function - sample from EpiEstim's R posterior distribution

Usage

sample_post_ee(window, objee, n)

Arguments

window

integer. Time slice used to estimate Rt.

objee

List. EpiEstim object as returned by EpiEstim::estimate_R

n

integer. Number of posterior samples drawn.

Smooth realizations from estimating daily reports

Description

Smooth realizations from estimating daily reports

Usage

smooth_cl(cl.daily, prm.smooth)

Arguments

cl.daily

Data frame. Output of agg_to_daily().

prm.smooth

List. list of smoothing parameters. Parameters should be specified as followed:

method: smoothing method, either 'rollmean' (rolling mean) or 'loess' (LOESS smoothing via stats::loess())
window: for ⁠method = 'rollmean⁠ only; width of smoothing window in days
align: for ⁠method = 'rollmean⁠ only; smoothing alignment, either 'center', 'left', 'right'
span: for method = 'loess' only; smoothing span (see the documentation for stats::loess() for details)
floor: optional call for wastewater concentration smoothing with method = 'loess' only; user defined minimum smoothing concentration

Set this entire list to NULL to turn off smoothing

Value

Data frame

Smooth wastewater data (using pre-processed wastewater)

Description

Smooth wastewater data (using pre-processed wastewater)

Usage

smooth_ww(ww.conc, prm.smooth, silent = FALSE)

Arguments

ww.conc

Data frame. Must have variables:

date: calendar date of wastewater collection
value: pathogen concentration

prm.smooth

List. list of smoothing parameters. Parameters should be specified as followed:

method: smoothing method, either 'rollmean' (rolling mean) or 'loess' (LOESS smoothing via stats::loess())
window: for ⁠method = 'rollmean⁠ only; width of smoothing window in days
align: for ⁠method = 'rollmean⁠ only; smoothing alignment, either 'center', 'left', 'right'
span: for method = 'loess' only; smoothing span (see the documentation for stats::loess() for details)
floor: optional call for wastewater concentration smoothing with method = 'loess' only; user defined minimum smoothing concentration

Set this entire list to NULL to turn off smoothing

silent

Logical. Flag to suppress all output messages, warnings, and progress bars.

Value

Data frame

Summarise observations by date for raw iterations from an ensemble

Description

Summarise observations by date for raw iterations from an ensemble

Usage

summarise_by_date_iters(df)

Arguments

df

Data frame. Must have date and value columns.

Summarise daily inferred reports based on original reporting schedule and calculate error

Description

Summarise daily inferred reports based on original reporting schedule and calculate error

Usage

summarise_report_counts(df, prm.daily.check)

Arguments

df

Data frame. As output by get_use_dates().

prm.daily.check

List. Parameters for checking aggregated to daily report inference. Elements include:

agg.reldiff.tol: numerical tolerance (%) for relative error between aggregated inferred daily reports and original aggregated reports; chronological observations are dropped until this tolerance is first acheived (convergence at the start of the timeseries is often the worst, need to maintain uninterrupted daily timeseries for input into Rt calculation).

Set this entire argument to NULL to use inferred daily reports as is.

Value

Data frame

Helper function that summarises posterior samples of Rt from EpiEstim

Description

Helper function that summarises posterior samples of Rt from EpiEstim

Usage

summary_postsamples(x, prm.R)

Arguments

x

dataframe

prm.R

list

Value

summarised dataframe

Sample of wastewater concentration

Description

A subset of SARS-CoV-2 (N2 gene) concentration data in wastewater sampled from the Iona Island wastewater treatment plant in Vancouver between 7 July 2023 and 5 November 2023. Units are in N2 gene copies per milliliter of wastewater. Concentration was measured using RT-qPCR assays; RNA was extracted from suspended solids. See https://health-infobase.canada.ca/covid-19/wastewater/

Usage

ww.data

Format

`ww.data`

A data frame with 47 rows and 3 columns:

date: sampling date
value: mean sample concentration between multiple replicates