Type: | Package |
Title: | Estimate the Effective Reproductive Number with Trend Filtering |
Version: | 1.0.0 |
Description: | Use trend filtering, a type of regularized nonparametric regression, to estimate the instantaneous reproduction number, also called Rt. This value roughly says how many new infections will result from each new infection today. Values larger than 1 indicate that an epidemic is growing while those less than 1 indicate decline. For more details about this methodology, see Liu, Cai, Gustafson, and McDonald (2024) <doi:10.1371/journal.pcbi.1012324>. |
License: | MIT + file LICENSE |
URL: | https://github.com/dajmcdon/rtestim, https://dajmcdon.github.io/rtestim/ |
BugReports: | https://github.com/dajmcdon/rtestim/issues |
Depends: | R (≥ 3.6.2) |
Imports: | checkmate, cli, dspline, ggplot2, Matrix, methods, Rcpp, rlang, tibble, tvdenoising, vctrs |
Suggests: | dplyr, forcats, knitr, nnet, rmarkdown, testthat (≥ 3.0.0), tidyr, xml2 |
LinkingTo: | BH, dspline, Rcpp, RcppEigen, testthat, tvdenoising |
VignetteBuilder: | knitr |
Config/testthat/edition: | 3 |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | yes |
Packaged: | 2025-07-01 17:18:31 UTC; dajmcdon |
Author: | Daniel J. McDonald [aut, cre, cph], Jiaping Liu [aut], Zhenglun Cai [ctb] |
Maintainer: | Daniel J. McDonald <daniel@stat.ubc.ca> |
Repository: | CRAN |
Date/Publication: | 2025-07-05 18:50:02 UTC |
Canadian Incident COVID-19 Cases
Description
This dataset contains 3+ years of incident COVID-19 case counts as reported by opencovid.ca as of July 4, 2023.
Usage
cancovid
Format
A data frame with 1,253 rows and 2 columns:
- date
The observed date. A
Date
object.- incident_cases
The number of new recorded cases for this date.
Source
This data is available under the CC-BY-4.0 License.
See:
Berry, I., O’Neill, M., Sturrock, S. L., Wright, J. E., Acharya, K., Brankston, G., Harish, V., Kornas, K., Maani, N., Naganathan, T., Obress, L., Rossi, T., Simmons, A. E., Van Camp, M., Xie, X., Tuite, A. R., Greer, A. L., Fisman, D. N., & Soucy, J.-P. R. (2021). A sub-national real-time epidemiological and vaccination database for the COVID-19 pandemic in Canada. Scientific Data, 8(1). doi:10.1038/s41597-021-00955-2
Add confidence bands to estimated Rt or incidence curves
Description
Create an approximate confidence band for the Rt or incidence estimate. Note that the variance computation is approximate.
Usage
confband(object, lambda, level = 0.95, type = c("Rt", "Yt"), ...)
Arguments
object |
a |
lambda |
the selected lambda. May be a scalar value, or in the case of
|
level |
the desired confidence level(s). These will be sorted if necessary. |
type |
the type |
... |
additional arguments for methods. Unused. |
Value
A data.frame
containing the estimates Rt
or Yt
at the chosen
lambda
, and confidence limits corresponding to level
Examples
y <- c(1, rpois(100, dnorm(1:100, 50, 15) * 500 + 1))
out <- estimate_rt(y, nsol = 10)
head(confband(out, out$lambda[2]))
head(confband(out, out$lambda[2], level = c(0.95, 0.8, 0.5)))
cv <- cv_estimate_rt(y, nfold = 3, nsol = 30)
head(confband(cv, "lambda.min", c(0.5, 0.9)))
Rt estimation algorithm configuration
Description
Rt estimation algorithm configuration
Usage
configure_rt_admm(
rho = -1,
alpha = 0.5,
gamma = 0.9,
tolerance = 1e-04,
maxiter_newton = 50L,
maxiter_line = 20L,
verbose = 0,
...
)
Arguments
rho |
Double. An ADMM parameter; coefficient of augmented term in the Lagrangian function. |
alpha |
Double. A parameter adjusting upper bound in line search algorithm
in |
gamma |
Double. A parameter adjusting step size in line search algorithm
in |
tolerance |
Double. Tolerance of ADMM convergence. |
maxiter_newton |
Integer. Maximum number of iterations for the outer Newton iteration. |
maxiter_line |
Integer. Maximum number of iterations for the linesearch algorithm in the proximal Newton method. |
verbose |
Integer. |
... |
space for future extensions |
Value
a list of model parameters with class rt_admm_configuration
Examples
configure_rt_admm()
configure_rt_admm(tolerance = 1e-6, verbose = 1L)
Leave-kth-out cross validation for choosing a optimal parameter lambda
Description
Leave-kth-out cross validation for choosing a optimal parameter lambda
Usage
cv_estimate_rt(
observed_counts,
korder = 3L,
dist_gamma = c(2.5, 2.5),
nfold = 3L,
error_measure = c("deviance", "mse", "mae"),
x = 1:n,
lambda = NULL,
maxiter = 1000000L,
delay_distn = NULL,
delay_distn_periodicity = NULL,
regular_splits = FALSE,
invert_splits = FALSE,
...
)
Arguments
observed_counts |
vector of the observed daily infection counts |
korder |
Integer. Degree of the piecewise polynomial curve to be
estimated. For example, |
dist_gamma |
Vector of length 2. These are the shape and scale for the assumed serial interval distribution. Roughly, this distribution describes the probability of an infectious individual infecting someone else after some period of time after having become infectious. As in most literature, we assume that this interval follows a gamma distribution with some shape and scale. |
nfold |
Integer. This number of folds to conduct the leave-kth-out
cross validation. For leave-kth-out cross validation, every kth
observed_counts and their corresponding position (evenly or unevenly
spaced) are placed into the same fold. The first and last observed_counts are
not assigned to any folds. Smallest allowable value is |
error_measure |
Metric used to calculate cross validation scores.
Must be choose from |
x |
a vector of positions at which the counts have been observed. In an ideal case, we would observe data at regular intervals (e.g. daily or weekly) but this may not always be the case. May be numeric or Date. |
lambda |
Vector. A user supplied sequence of tuning parameters which
determines the balance between data fidelity and
smoothness of the estimated Rt; larger |
maxiter |
Integer. Maximum number of iterations for the estimation algorithm. |
delay_distn |
in the case of a non-gamma delay distribution,
a vector or matrix (or |
delay_distn_periodicity |
Controls the relationship between the spacing
of the computed delay distribution and the spacing of |
regular_splits |
Logical.
If
where 0 indicates no assignment.
Therefore, the folds are not random and running |
invert_splits |
Logical.
Typical K-fold CV would use K-1 folds for the training
set while reserving 1 fold for evaluation (repeating the split K times).
Setting this to true inverts this process, using a much smaller training
set with a larger evaluation set. This tends to result in larger values
of |
... |
additional parameters passed to |
Value
An object with S3 class "cv_poisson_rt"
. Among the list components:
-
full_fit
An object with S3 class"poisson_rt"
, fitted with allobserved_counts
andlambda
-
cv_scores
leave-kth-out cross validation scores -
cv_se
leave-kth-out cross validation standard error -
lambda.min
lambda which achieved the optimal cross validation score -
lambda.1se
lambda that gives the optimal cross validation score within one standard error. -
lambda
the value oflambda
used in the algorithm.
Examples
y <- c(1, rpois(100, dnorm(1:100, 50, 15) * 500 + 1))
cv <- cv_estimate_rt(y, korder = 3, nfold = 3, nsol = 30)
cv
Calculate the total infectiousness at each observed time point.
Description
The total infectiousness at each observed time point is calculated
by \sum_{s=1}^t I_{t-s}w_s
, where I
denotes the vector containing
observed incidence, and w
denotes the generation interval
distribution. Typically, the generation interval is challenging to estimate
from data, so the serial interval is used instead. The serial interval
distribution expresses the probability
of a secondary infection caused by a primary infection which occurred s
days earlier.
Usage
delay_calculator(
observed_counts,
x = NULL,
dist_gamma = c(2.5, 2.5),
delay_distn = NULL,
delay_distn_periodicity = NULL,
xout = x
)
Arguments
observed_counts |
vector of the observed daily infection counts |
x |
a vector of positions at which the counts have been observed. In an ideal case, we would observe data at regular intervals (e.g. daily or weekly) but this may not always be the case. May be numeric or Date. |
dist_gamma |
Vector of length 2. These are the shape and scale for the assumed serial interval distribution. Roughly, this distribution describes the probability of an infectious individual infecting someone else after some period of time after having become infectious. As in most literature, we assume that this interval follows a gamma distribution with some shape and scale. |
delay_distn |
in the case of a non-gamma delay distribution,
a vector or matrix (or |
delay_distn_periodicity |
Controls the relationship between the spacing
of the computed delay distribution and the spacing of |
xout |
a vector of positions at which the results should be returned.
By default, this will be the same as |
Value
A vector containing the total infectiousness at each
point xout
.
Examples
delay_calculator(c(3, 2, 5, 3, 1), dist_gamma = c(2.5, 2.5))
Compute the discretized density function for gamma distribution
Description
The serial interval distribution expresses the probability of the symptom onset of a secondary infection occurred a given number of days after the primary infection. The serial interval distribution is commonly represented by a discretized Gamma distribution in literature, parametrized by the shape and scale parameters.
Usage
discretize_gamma(x, shape = 2.5, scale = 2.5, rate = 1/scale)
Arguments
x |
locations (times) where cases are observed. Must be nonnegative. |
shape , scale |
shape and scale parameters. Must be positive,
|
rate |
an alternative way to specify the scale. |
Value
probability mass of the discretized gamma distribution
Examples
discretize_gamma(1:30, shape = 1, scale = 1)
Estimate Rt using smoothness-penalized Poisson likelihood
Description
The Effective Reproduction Number R_t
of an infectious
disease can be estimated by solving the smoothness penalized Poisson
regression (trend filtering) of the form:
\hat{\theta} = \arg\min_{\theta} \frac{1}{n} \sum_{i=1}^n (w_i e^{\theta_i} -
y_i\theta_i) + \lambda\Vert D^{(k+1)}\theta\Vert_1,
where R_t = e^{\theta}
, y_i
is the observed case count at day
i
, w_i
is the weighted past counts at day i
, \lambda
is the smoothness penalty, and D^{(k+1)}
is the (k+1)
-th order
difference matrix.
Usage
estimate_rt(
observed_counts,
korder = 3L,
dist_gamma = c(2.5, 2.5),
x = 1:n,
lambda = NULL,
nsol = 50L,
delay_distn = NULL,
delay_distn_periodicity = NULL,
lambdamin = NULL,
lambdamax = NULL,
lambda_min_ratio = 1e-04,
maxiter = 1e+05,
init = configure_rt_admm()
)
Arguments
observed_counts |
vector of the observed daily infection counts |
korder |
Integer. Degree of the piecewise polynomial curve to be
estimated. For example, |
dist_gamma |
Vector of length 2. These are the shape and scale for the assumed serial interval distribution. Roughly, this distribution describes the probability of an infectious individual infecting someone else after some period of time after having become infectious. As in most literature, we assume that this interval follows a gamma distribution with some shape and scale. |
x |
a vector of positions at which the counts have been observed. In an ideal case, we would observe data at regular intervals (e.g. daily or weekly) but this may not always be the case. May be numeric or Date. |
lambda |
Vector. A user supplied sequence of tuning parameters which
determines the balance between data fidelity and
smoothness of the estimated Rt; larger |
nsol |
Integer. The number of tuning parameters |
delay_distn |
in the case of a non-gamma delay distribution,
a vector or matrix (or |
delay_distn_periodicity |
Controls the relationship between the spacing
of the computed delay distribution and the spacing of |
lambdamin |
Optional value for the smallest |
lambdamax |
Optional value for the largest |
lambda_min_ratio |
If neither |
maxiter |
Integer. Maximum number of iterations for the estimation algorithm. |
init |
a list of internal configuration parameters of class
|
Value
An object with S3 class poisson_rt
. Among the list components:
-
observed_counts
the observed daily infection counts. -
x
a vector of positions at which the counts have been observed. -
weighted_past_counts
the weighted sum of past infection counts. -
Rt
the estimated effective reproduction rate. This is a matrix with each column corresponding to one value oflambda
. -
lambda
the values oflambda
actually used in the algorithm. -
korder
degree of the estimated piecewise polynomial curve. -
dof
degrees of freedom of the estimated trend filtering problem. -
niter
the required number of iterations for each value oflambda
. -
convergence
if number of iterations for each value oflambda
is less than the maximum number of iterations for the estimation algorithm.
Examples
y <- c(1, rpois(100, dnorm(1:100, 50, 15) * 500 + 1))
out <- estimate_rt(y)
out
plot(out)
out0 <- estimate_rt(y, korder = 0L, nsol = 40)
out0
plot(out0)
Fitted cv_poisson_rt
Description
Fitted cv_poisson_rt
Usage
## S3 method for class 'cv_poisson_rt'
fitted(object, which_lambda = c("lambda.min", "lambda.1se"), ...)
Arguments
object |
result of cross validation of type |
which_lambda |
select which Rt's to output. If not provided, all Rt's are returned. If provided a list of lambda,the corresponding Rt estimation will be returned. If provided a string, it must be either one of
|
... |
not used. |
Value
Rt's estimated from provided lambda
Examples
y <- c(1, rpois(100, dnorm(1:100, 50, 15) * 500 + 1))
cv <- cv_estimate_rt(y, korder = 3, nfold = 3, nsol = 30)
f <- fitted(cv)
f <- fitted(cv, which_lambda = cv$lambda[1])
f <- fitted(cv, which_lambda = "lambda.1se")
f <- fitted(cv, which_lambda = NULL)
Interpolate (or extrapolate) Rt estimates to intermediate design points
Description
Interpolate (or extrapolate) Rt estimates to intermediate design points
Usage
interpolate_rt(object, xout, ...)
## S3 method for class 'cv_poisson_rt'
interpolate_rt(object, xout, which_lambda = c("lambda.min", "lambda.1se"), ...)
## S3 method for class 'poisson_rt'
interpolate_rt(object, xout, lambda = NULL, ...)
Arguments
object |
A fitted object produced by |
xout |
a vector of new positions at which Rt should be produced, but where counts may not have been observed. |
... |
additional arguments passed to methods. |
which_lambda |
Select which lambdas from the object to use. If not provided, all Rt's are returned. Note that new lambdas not originally used in the estimation procedure may be provided, but the results will be calculated by linearly interpolating the estimated Rt's. The strings |
lambda |
Vector. A user supplied sequence of tuning parameters which
determines the balance between data fidelity and
smoothness of the estimated Rt; larger |
Value
A vector or matrix of interpolated Rt estimates.
Examples
y <- c(1, rpois(100, dnorm(1:100, 50, 15) * 500 + 1))
out <- estimate_rt(y)
# originally estimated at
out$x
# get the Rt at 3 new points (for all estimated lambdas)
int <- interpolate_rt(out, c(10.5, 11.5, 12.5))
# get the Rt at a single value of lambda
interpolate_rt(out, c(10.5, 11.5, 12.5), lambda = out$lambda[20])
y <- c(1, rpois(100, dnorm(1:100, 50, 15) * 500 + 1))
out <- estimate_rt(y, nsol = 10)
interpolate_rt(out, xout = c(1.5, 2.5))
Plot cv_poisson_rt
Description
Plot cv_poisson_rt
Usage
## S3 method for class 'cv_poisson_rt'
plot(x, which_lambda = c("cv_scores", "lambda.min", "lambda.1se"), ...)
Arguments
x |
result of cv_estimate_rt of class |
which_lambda |
select which Rt's to plot. If not provided, the cross validation score will be plotted. If provided a list of lambda, the corresponding Rt estimation will be plotted. If provided a string, it
must be either one of
|
... |
Not used. |
Value
Examples
y <- c(1, rpois(100, dnorm(1:100, 50, 15) * 500 + 1))
cv <- cv_estimate_rt(y, korder = 1, nfold = 3, nsol = 30)
plot(cv)
plot(cv, which_lambda = cv$lambda[1])
plot(cv, which_lambda = "lambda.min")
plot(cv, which_lambda = "lambda.1se")
plot(cv, NULL)
Plot estimated Rt values from a poisson_rt
object
Description
Produces a figure showing some or all estimated Rt values for different
values of the penalty. The result is a ggplot2::ggplot()
. Additional user
modifications can be added as desired.
Usage
## S3 method for class 'poisson_rt'
plot(x, lambda = NULL, ...)
Arguments
x |
output of the function |
lambda |
select which Rt's to plot. If not provided, all Rt's are plotted. |
... |
Not used. |
Value
Examples
y <- c(1, rpois(100, dnorm(1:100, 50, 15) * 500 + 1))
out <- estimate_rt(y, lambda = log(c(1.1, 1.3, 1.5)))
plot(out)
Plot estimated confidence bands for an estimate of Rt
Description
Produces a figure showing a single estimated Rt value along with approximate
confidence bands. The result is a ggplot2::ggplot()
. Additional user
modifications can be added as desired.
Usage
## S3 method for class 'rt_confidence_band'
plot(x, colour = "#3A448F", ...)
Arguments
x |
An object of class |
colour |
The colour of the desired plot |
... |
Not used. |
Value
Examples
y <- c(1, rpois(100, dnorm(1:100, 50, 15) * 500 + 1))
out <- estimate_rt(y, nsol = 10)
cb <- confband(out, out$lambda[2], level = c(0.95, 0.8, 0.5))
plot(cb)
cb_y <- confband(out, out$lambda[2], level = c(0.95, 0.8, 0.5), type = "Yt")
plot(cb_y)
Predict observed data using estimated Rt
Description
Given an object of class poisson_rt
produced with estimate_rt()
,
calculate predicted observed cases for the estimated Rt values.
Note: This function is not intended for "new x" or to produce forecasts, but
rather to examine how Rt relates to observables.
Usage
## S3 method for class 'cv_poisson_rt'
predict(object, which_lambda = c("lambda.min", "lambda.1se"), ...)
Arguments
object |
result of cross validation of type |
which_lambda |
Select which lambdas from the object to use. If not provided, all Rt's are returned. Note that new lambdas not originally used in the estimation procedure may be provided, but the results will be calculated by linearly interpolating the estimated Rt's. The strings |
... |
not used. |
Value
A vector or matrix of predicted case counts.
Examples
y <- c(1, rpois(100, dnorm(1:100, 50, 15) * 500 + 1))
cv <- cv_estimate_rt(y, korder = 3, nfold = 3, nsol = 30)
p <- predict(cv)
p <- predict(cv, which_lambda = cv$lambda[1])
p <- predict(cv, which_lambda = "lambda.1se")
p <- predict(cv, which_lambda = NULL)
plot(y)
matlines(p, lty = 2)
Predict observed data using estimated Rt
Description
Given an object of class poisson_rt
produced with estimate_rt()
,
calculate predicted observed cases for the estimated Rt values.
Note: This function is not intended for "new x" or to produce forecasts, but
rather to examine how Rt relates to observables.
Usage
## S3 method for class 'poisson_rt'
predict(object, lambda = NULL, ...)
Arguments
object |
An object of class |
lambda |
Select which lambdas from the object to use. If not provided (the default), all are returned. Note that new lambdas not originally used in the estimation procedure may be provided, but the results will be calculated by linearly interpolating the estimated Rt's. |
... |
Not used. |
Value
A vector or matrix of predicted case counts.
Examples
y <- c(1, rpois(100, dnorm(1:100, 50, 15) * 500 + 1))
out <- estimate_rt(y, nsol = 10)
preds <- predict(out)
plot(y)
matlines(preds, lty = 1)