Title: | Simulating and Analysing Transmission Chain Statistics Using Branching Process Models |
Version: | 0.1.1 |
Description: | Provides methods to simulate and analyse the size and length of branching processes with an arbitrary offspring distribution. These can be used, for example, to analyse the distribution of chain sizes or length of infectious disease outbreaks, as discussed in Farrington et al. (2003) <doi:10.1093/biostatistics/4.2.279>. |
License: | MIT + file LICENSE |
URL: | https://github.com/epiverse-trace/epichains, https://epiverse-trace.github.io/epichains/ |
BugReports: | https://github.com/epiverse-trace/epichains/issues |
Depends: | R (≥ 3.6.0) |
Imports: | checkmate, stats, utils |
Suggests: | bookdown, dplyr, epicontacts, ggplot2, knitr, lubridate, rmarkdown, spelling, testthat, truncdist |
VignetteBuilder: | knitr |
Config/Needs/website: | epiverse-trace/epiversetheme |
Config/testthat/edition: | 3 |
Encoding: | UTF-8 |
Language: | en-GB |
LazyData: | true |
RoxygenNote: | 7.3.2.9000 |
NeedsCompilation: | no |
Packaged: | 2024-10-09 11:53:58 UTC; lshja16 |
Author: | James M. Azam |
Maintainer: | James M. Azam <james.azam@lshtm.ac.uk> |
Repository: | CRAN |
Date/Publication: | 2024-10-14 09:20:11 UTC |
epichains: Simulating and Analysing Transmission Chain Statistics Using Branching Process Models
Description
Provides methods to simulate and analyse the size and length of branching processes with an arbitrary offspring distribution. These can be used, for example, to analyse the distribution of chain sizes or length of infectious disease outbreaks, as discussed in Farrington et al. (2003) doi:10.1093/biostatistics/4.2.279.
Author(s)
Maintainer: James M. Azam james.azam@lshtm.ac.uk (ORCID) [copyright holder]
Authors:
Sebastian Funk sebastian.funk@lshtm.ac.uk (ORCID) [copyright holder]
Flavio Finger flavio.finger@epicentre.msf.org (ORCID)
Other contributors:
Zhian N. Kamvar zkamvar@gmail.com (ORCID) [contributor]
Hugo Gruson hugo@data.org (ORCID) [contributor, reviewer]
Karim Mané karim.mane@lshtm.ac.uk (ORCID) [reviewer]
Pratik Gupte pratik.gupte@lshtm.ac.uk (ORCID) [reviewer]
Joshua W. Lambert joshua.lambert@lshtm.ac.uk (ORCID) [reviewer]
See Also
Useful links:
Report bugs at https://github.com/epiverse-trace/epichains/issues
Adjust new offspring if it exceeds the susceptible population size
Description
This function is used internally, and input checking is not performed here, only in the context where it is used. Using it directly is not recommended.
Usage
.adjust_next_gen(next_gen, susc_pop)
Arguments
next_gen |
A numeric vector of next generation offspring. |
susc_pop |
The susceptible population size; A number coercible to integer. |
Value
A numeric vector of the adjusted next generation offspring.
Check if the generation_time argument is specified as a function
Description
Check if the generation_time argument is specified as a function
Usage
.check_generation_time_valid(generation_time)
Arguments
generation_time |
The generation time function; the name
of a user-defined named or anonymous function with only one argument |
Check if constructed random number generator for offspring exists
and has an n
argument.
Description
Check if constructed random number generator for offspring exists
and has an n
argument.
Usage
.check_offspring_func_valid(offspring_dist)
Arguments
offspring_dist |
Offspring distribution: a |
Check inputs to simulate_chains()
and simulate_summary()
Description
Check inputs to simulate_chains()
and simulate_summary()
Usage
.check_sim_args(
n_chains,
statistic,
offspring_dist,
stat_threshold,
pop,
percent_immune
)
Arguments
n_chains |
Number of chains to simulate. |
statistic |
The chain statistic to track as the
stopping criteria for each chain being simulated when
|
offspring_dist |
Offspring distribution: a |
stat_threshold |
A stopping criterion for individual chain simulations;
a positive number coercible to integer. When any chain's cumulative statistic
reaches or surpasses |
pop |
Population size; An |
percent_immune |
Percent of the population immune to
infection at the start of the simulation; A |
Value
NULL; called for side effects
Check that the statistic
and stat_threshold
arguments are valid
Description
The function treats these two arguments as related and checks them in one place to remove repeated checks in several places in the package.
Usage
.check_statistic_args(statistic, stat_threshold)
Arguments
statistic |
The chain statistic to track as the
stopping criteria for each chain being simulated when
|
stat_threshold |
A stopping criterion for individual chain simulations;
a positive number coercible to integer. When any chain's cumulative statistic
reaches or surpasses |
Value
NULL; called for side effects
Check inputs that control time events
Description
This function checks the time-related inputs, i.e., start time of each chain,
t0
, the end time of the simulation, tf
, and the generation time,
generation_time. It also checks that the generation_time argument is
specified if tf
is specified as these go hand-in-hand.
Usage
.check_time_args(tf_specified, tf, generation_time, t0)
Arguments
tf_specified |
|
tf |
A number for the cut-off for the infection times (if generation
time is given); Defaults to |
generation_time |
The generation time function; the name
of a user-defined named or anonymous function with only one argument |
t0 |
Start time (if generation time is given); either a single value
or a vector of same length as |
Value
NULL
; called for side effects
Calculates the complementary log-probability
Description
Given x and norm, this calculates log(1-sum(exp(x)))
Usage
.complementary_logprob(x)
Arguments
x |
A numeric vector of log-probabilities. Must be negative. |
Value
A numeric value of the complementary log-probability.
Author(s)
Sebastian Funk
Create an <epichains>
object
Description
epichains()
constructs an <epichains>
object, which is
inherently an <data.frame>
object that stores some of the inputs
passed to the simulate_chains()
and the
simulated output. The stored attributes are useful for downstream
analyses and reproducibility. This function checks the validity of the
object created to ensure it has the right columns and column types.
An <epichains>
object contains a <data.frame>
of the simulated
outbreak with ids for each infector and infectee, generation, and
optionally, time, the number of chains simulated,
the chain statistic that was tracked, and whether the susceptible
population was tracked.
Usage
.epichains(
sim_df,
n_chains,
offspring_dist,
track_pop,
statistic = c("size", "length"),
stat_threshold = Inf
)
Arguments
sim_df |
a |
n_chains |
Number of chains to simulate. |
offspring_dist |
Offspring distribution: a |
track_pop |
Was the susceptible population tracked? Logical. |
statistic |
The chain statistic to track as the
stopping criteria for each chain being simulated when
|
stat_threshold |
A stopping criterion for individual chain simulations;
a positive number coercible to integer. When any chain's cumulative statistic
reaches or surpasses |
Value
An <epichains>
object.
Author(s)
James M. Azam
Create an <epichains_summary>
object
Description
epichains_summary()
constructs an <epichains_summary>
object.
An <epichains_summary>
object is a <vector>
of the simulated
chain sizes or lengths. It also stores information on the number of chains
simulated, and the statistic that was tracked.
Usage
.epichains_summary(
chains_summary,
n_chains,
offspring_dist,
statistic = c("size", "length"),
stat_threshold = Inf
)
Arguments
chains_summary |
A numeric |
n_chains |
Number of chains to simulate. |
offspring_dist |
Offspring distribution: a |
statistic |
The chain statistic to track as the
stopping criteria for each chain being simulated when
|
stat_threshold |
A stopping criterion for individual chain simulations;
a positive number coercible to integer. When any chain's cumulative statistic
reaches or surpasses |
Value
An <epichains_summary>
object.
Author(s)
James M. Azam
Log-likelihood of the size of chains with gamma-Borel offspring distribution
Description
Log-likelihood of the size of chains with gamma-Borel offspring distribution
Usage
.gborel_size_ll(x, size, prob, mu)
Arguments
x |
A numeric vector of chain sizes. |
size |
The dispersion parameter (often called |
prob |
Probability of success (in the parameterisation with
|
mu |
Mean; A positive number. |
Value
A numeric vector of log-likelihood values.
Author(s)
Sebastian Funk James M. Azam
Log-likelihood of the length of chains with geometric offspring distribution
Description
Log-likelihood of the length of chains with geometric offspring distribution
Usage
.geom_length_ll(x, prob)
Arguments
x |
A numeric vector of chain lengths. |
prob |
The probability of the geometric distribution with mean
|
Value
A numeric vector of log-likelihood values.
Author(s)
Sebastian Funk James M. Azam
Return a function for calculating chain statistics
Description
Return a function for calculating chain statistics
Usage
.get_statistic_func(chain_statistic)
Value
A function for calculating chain statistics.
Sample the number of susceptible offspring from all possible offspring
Description
Sample susceptible offspring to be infected from all possible offspring. This function is used internally, and input checking is not performed here, only in the context where it is used. Using it directly is not recommended.
Usage
.get_susceptible_offspring(new_offspring, susc_pop, pop)
Arguments
new_offspring |
A numeric vector of the possible new offspring per
chain produced by |
pop |
Population size; An |
Value
A numeric vector of the number of offspring that can be infected given the current susceptible population size.
Adjust next generation vector to match susceptible population size
Description
Calculates the initial susceptible population size given the total population size, the percent immune, and the number of index cases. This function is used internally, and input checking is not performed here, only in the context where it is used. Using it directly is not recommended.
Usage
.init_susc_pop(pop, percent_immune, index_cases)
Arguments
pop |
Population size; An |
percent_immune |
Percent of the population immune to
infection at the start of the simulation; A |
Value
Initial susceptible population size; A numeric coercible to integer.
Test if x is an epichains
object
Description
Test if x is an epichains
object
Usage
.is_epichains(x)
Arguments
x |
An R object. |
Value
Logical; TRUE
if the object is an <epichains>
and FALSE
otherwise.
Author(s)
James M. Azam
Test if x is an epichains_summary
object
Description
Test if x is an epichains_summary
object
Usage
.is_epichains_summary(x)
Arguments
x |
An R object. |
Value
Logical; TRUE
if the object is an <epichains_summary>
and
FALSE
otherwise.
Author(s)
James M. Azam
Log-likelihood of the size of chains with Negative-Binomial offspring distribution
Description
Log-likelihood of the size of chains with Negative-Binomial offspring distribution
Usage
.nbinom_size_ll(x, size, prob, mu)
Arguments
x |
A numeric vector of chain sizes. |
size |
The dispersion parameter (often called |
prob |
Probability of success (in the parameterisation with
|
mu |
Mean; A positive number. |
Value
A numeric vector of log-likelihood values.
Author(s)
Sebastian Funk James M. Azam
Construct an <epichains>
object
Description
new_epichains()
constructs an <epichains>
object from a
supplied <data.frame>
and extra attributes passed as individual arguments.
It is meant to be lazy and performant, by creating the object without
checking the arguments for correctness. It is not safe to call
new_epichains()
on its own as is called within epichains()
after the arguments have been checked. To create an <epichains>
object, use epichains()
.
Usage
.new_epichains(
sim_df,
n_chains,
statistic,
offspring_dist,
stat_threshold,
track_pop
)
Arguments
sim_df |
a |
n_chains |
Number of chains to simulate. |
statistic |
The chain statistic to track as the
stopping criteria for each chain being simulated when
|
offspring_dist |
Offspring distribution: a |
stat_threshold |
A stopping criterion for individual chain simulations;
a positive number coercible to integer. When any chain's cumulative statistic
reaches or surpasses |
track_pop |
Was the susceptible population tracked? Logical. |
Author(s)
James M. Azam
Construct a <epichains_summary>
object
Description
new_epichains_summary()
constructs an <epichains_summary>
object from a
supplied <vector>
of chain sizes or lengths. It also stores extra
attributes passed as individual arguments.
new_epichains_summary()
is meant to be lazy and performant, by creating
the object without checking the arguments for correctness. It is not safe
to call new_epichains_summary()
on its own as is called within
epichains_summary()
after the arguments have been checked. To create a
new <epichains_summary>
object safely, use epichains_summary()
.
Usage
.new_epichains_summary(
chains_summary,
n_chains,
statistic,
offspring_dist,
stat_threshold
)
Arguments
chains_summary |
A numeric |
n_chains |
Number of chains to simulate. |
statistic |
The chain statistic to track as the
stopping criteria for each chain being simulated when
|
offspring_dist |
Offspring distribution: a |
stat_threshold |
A stopping criterion for individual chain simulations;
a positive number coercible to integer. When any chain's cumulative statistic
reaches or surpasses |
Author(s)
James M. Azam
Log-likelihood of the summary (size/length) of chains with generic offspring distribution
Description
The log-likelihoods are calculated with a crude approximation using simulated chain summaries by linearly approximating any missing values in the empirical cumulative distribution function (ecdf).
Usage
.offspring_ll(x, offspring_dist, statistic, nsim_offspring = 100, ...)
Arguments
x |
A numeric vector of chain summaries (sizes/lengths). |
offspring_dist |
Offspring distribution: a |
statistic |
The chain statistic to track as the
stopping criteria for each chain being simulated when
|
nsim_offspring |
Number of simulations of the offspring distribution for approximating the distribution of the chain statistic summary (size/length) |
... |
any parameters to pass to |
Value
A numeric vector of log-likelihood values.
Author(s)
Sebastian Funk James M. Azam
Log-likelihood of the length of chains with Poisson offspring distribution
Description
Log-likelihood of the length of chains with Poisson offspring distribution
Usage
.pois_length_ll(x, lambda)
Arguments
x |
A numeric vector of chain lengths. |
lambda |
The rate of the Poisson distribution. A single numeric value. Must be positive. |
Value
A numeric vector of log-likelihood values.
Author(s)
Sebastian Funk James M. Azam
Log-likelihood of the size of chains with Poisson offspring distribution
Description
Log-likelihood of the size of chains with Poisson offspring distribution
Usage
.pois_size_ll(x, lambda)
Arguments
x |
A numeric vector of sizes |
lambda |
The rate of the Poisson distribution; A single numeric value. |
Value
A numeric vector of log-likelihood values.
Author(s)
Sebastian Funk James M. Azam
Samples size (the number of trials) of a binomial distribution
Description
Samples the size parameter from the binomial distribution with fixed x (number of successes) and p (success probability)
Usage
.rbinom_size(n, x, prob)
Arguments
n |
The number of samples to generate. Must be a positive integer. |
x |
The number of successes. Must be a positive integer. |
prob |
The probability of success. A numeric between 0 and 1. |
Value
A numeric vector of the sampled sizes.
Author(s)
Sebastian Funk
Samples chain lengths with given observation probabilities
Description
Samples the length of a transmission chain where each individual element is observed with binomial probability with parameters n (number of successes) and p (success probability)
Usage
.rgen_length(n, x, prob)
Arguments
n |
The number of samples to generate. Must be a positive integer. |
x |
A numeric vector of the observed chain lengths. |
prob |
The probability of observation. A numeric between 0 and 1. |
Value
A numeric vector of the sampled chain lengths.
Author(s)
Sebastian Funk
Sample all possible offspring for the next generation
Description
Sample next generation of offspring using offspring distribution and associated parameters. This function is used internally, and input checking is not performed here, only in the context where it is used. Using it directly is not recommended.
Usage
.sample_possible_offspring(
offspring_func,
offspring_func_pars,
n_offspring,
chains
)
Arguments
offspring_func |
A function to sample offspring. |
offspring_func_pars |
A list of parameters for the offspring function. |
n_offspring |
A numeric vector of the number of offspring per chain. |
chains |
Numeric indices of chains/infectors being simulated |
Value
A numeric vector of the number of offspring per chain.
Determine and update the chain statistic being tracked
Description
Determine and update the chain statistic being tracked
Usage
.update_chain_stat(stat_type, stat_latest, n_offspring)
Arguments
stat_type |
Chain statistic (size/length) to update; A character string. Must be one of 'size' or 'length'. |
stat_latest |
The latest chain statistic numeric vector to be updated. |
n_offspring |
A vector of offspring per chain. A numeric vector (coercible to integer). |
Value
A vector of chain statistics (size/length). A numeric vector coercible to integer.
Validate an <epichains>
object
Description
Validate an <epichains>
object
Usage
.validate_epichains(x)
Arguments
x |
An |
Value
Invisibly returns the object if it is valid.
Author(s)
James M. Azam
Validate an <epichains_summary>
object
Description
Validate an <epichains_summary>
object
Usage
.validate_epichains_summary(x)
Arguments
x |
An |
Value
Invisibly returns the object if it is valid.
Author(s)
James M. Azam
Aggregate cases in <epichains>
objects by "generation" or "time", if
present
Description
This function provides a quick way to create a time series of cases over
generation or time (if generation_time
was specified) from simulated
<epichains>
objects.
Usage
## S3 method for class 'epichains'
aggregate(x, by = c("time", "generation"), ...)
Arguments
x |
An |
by |
The variable to aggregate by; A character string with options "time" and "generation". |
... |
Not used. |
Value
A <data.frame>
object of cases by by
.
Author(s)
James M. Azam
Examples
set.seed(32)
chains <- simulate_chains(
n_chains = 10,
statistic = "size",
offspring_dist = rpois,
stat_threshold = 10,
generation_time = function(n) rep(3, n),
lambda = 2
)
chains
# Aggregate cases per time
cases_per_time <- aggregate(chains, by = "time")
head(cases_per_time)
# Aggregate cases per generation
cases_per_gen <- aggregate(chains, by = "generation")
head(cases_per_gen)
COVID-19 Data Repository for South Africa
Description
An aggregated subset of the COVID-19 Data Repository for South Africa created, maintained and hosted by Data Science for Social Impact research group, led by Dr. Vukosi Marivate.
Usage
covid19_sa
Format
covid19_sa
A data frame with 19 rows and 2 columns:
- date
Date case was reported
- cases
Number of cases
Details
The data is originally provided as a linelist but has been subsetted and
cleaned in data-raw/covid19_sa.R
.
Source
https://github.com/dsfsi/covid19za
Further details in data-raw/covid19_sa.R
.
Density of the Borel distribution
Description
Density of the Borel distribution
Usage
dborel(x, mu, log = FALSE)
Arguments
x |
A numeric vector of quantiles. |
mu |
A non-negative number for the poisson mean. |
log |
Logical; if TRUE, probabilities p are given as log(p). |
Value
A numeric vector of the borel probability density.
Author(s)
Sebastian Funk James M. Azam
Examples
set.seed(32)
dborel(1:5, 1)
head
and tail
method for <epichains>
class
Description
head
and tail
method for <epichains>
class
Usage
## S3 method for class 'epichains'
head(x, ...)
## S3 method for class 'epichains'
tail(x, ...)
Arguments
x |
An |
... |
Further arguments passed to or from other methods. |
Details
This returns the top rows of an <epichains>
object, starting from the
first known infectors.
To view the full output, use as.data.frame(<object_name>)
.
Value
An object of class <data.frame>
.
Author(s)
James M. Azam
Examples
set.seed(32)
chains_pois_offspring <- simulate_chains(
n_chains = 10,
statistic = "size",
offspring_dist = rpois,
stat_threshold = 10,
generation_time = function(n) rep(3, n),
lambda = 2
)
head(chains_pois_offspring)
set.seed(32)
chains_pois_offspring <- simulate_chains(
n_chains = 10,
statistic = "size",
offspring_dist = rpois,
stat_threshold = 10,
generation_time = function(n) rep(3, n),
lambda = 2
)
tail(chains_pois_offspring)
Estimate the log-likelihood/likelihood for observed branching processes
Description
Estimate the log-likelihood/likelihood for observed branching processes
Usage
likelihood(
chains,
statistic = c("size", "length"),
offspring_dist,
nsim_obs,
obs_prob = 1,
stat_threshold = Inf,
log = TRUE,
exclude = NULL,
individual = FALSE,
...
)
Arguments
chains |
Vector of chain summaries (sizes/lengths). Can be a
|
statistic |
The chain statistic to track as the
stopping criteria for each chain being simulated when
|
offspring_dist |
Offspring distribution: a |
nsim_obs |
Number of simulations to be used to approximate the
log-likelihood/likelihood if |
obs_prob |
Observation probability. A number (probability) between 0
and 1. A value greater than 0 but less 1 implies imperfect observation and
simulations will be used to approximate the (log)likelihood. This will
also require specifying |
stat_threshold |
A stopping criterion for individual chain simulations;
a positive number coercible to integer. When any chain's cumulative statistic
reaches or surpasses |
log |
Should the log-likelihoods be transformed to
likelihoods? Logical. Defaults to |
exclude |
A vector of indices of the sizes/lengths to exclude from the log-likelihood calculation. |
individual |
Logical; If TRUE, a vector of individual log-likelihood/likelihood contributions will be returned rather than the sum/product. |
... |
any parameters to pass to |
Value
If log = TRUE
A joint log-likelihood (sum of individual log-likelihoods), if
individual == FALSE
(default) andobs_prob == 1
(default), orA list of individual log-likelihoods, if
individual == TRUE
andobs_prob == 1
(default), orA list of individual log-likelihoods (same length as
nsim_obs
), ifindividual == TRUE
and0 <= obs_prob < 1
, orA vector of joint log-likelihoods (same length as
nsim_obs
), if individual == FALSE and0 <= obs_prob < 1
(imperfect observation).
If log = FALSE
, the same structure of outputs as above are returned,
except that likelihoods, instead of log-likelihoods, are calculated in all
cases. Moreover, the joint likelihoods are the product, instead of the sum,
of the individual likelihoods.
Author(s)
Sebastian Funk, James M. Azam
Examples
# example of observed chain sizes
set.seed(121)
# randomly generate 20 chains of size 1 to 10
chain_sizes <- sample(1:10, 20, replace = TRUE)
likelihood(
chains = chain_sizes,
statistic = "size",
offspring_dist = rpois,
nsim_obs = 100,
lambda = 0.5
)
# Example using an <epichains> object
set.seed(32)
epichains_obj_eg <- simulate_chains(
n_chains = 10,
pop = 100,
percent_immune = 0,
statistic = "size",
offspring_dist = rnbinom,
stat_threshold = 10,
generation_time = function(n) rep(3, n),
mu = 2,
size = 0.2
)
epichains_obj_eg_lik <- likelihood(
chains = epichains_obj_eg,
statistic = "size",
offspring_dist = rnbinom,
mu = 2,
size = 0.2,
stat_threshold = 10
)
epichains_obj_eg_lik
# Example using a <epichains_summary> object
set.seed(32)
epichains_summary_eg <- simulate_chain_stats(
n_chains = 10,
pop = 100,
percent_immune = 0,
statistic = "size",
offspring_dist = rnbinom,
stat_threshold = 10,
mu = 2,
size = 0.2
)
epichains_summary_eg_lik <- likelihood(
chains = epichains_summary_eg,
statistic = "size",
offspring_dist = rnbinom,
mu = 2,
size = 0.2
)
epichains_summary_eg_lik
Print an <epichains>
object
Description
Print an <epichains>
object
Usage
## S3 method for class 'epichains'
print(x, ...)
Arguments
x |
An |
... |
Other parameters passed to |
Value
Invisibly returns an <epichains>
. Called for
side-effects.
Author(s)
James M. Azam
Examples
# Using a Poisson offspring distribution and simulating from an infinite
# population up to chain size 10.
set.seed(32)
chains_pois_offspring <- simulate_chains(
n_chains = 10,
statistic = "size",
offspring_dist = rpois,
stat_threshold = 10,
generation_time = function(n) rep(3, n),
lambda = 2
)
chains_pois_offspring # Print the object
Print an <epichains_summary>
object
Description
Prints a summary of the <epichains_summary>
object. In particular, it
prints the number of chains simulated, and the range of
the statistic, represented as the maximum (max_stat
) and minimum
(min_stat
). If the minimum or maximum is infinite, it is represented as
>= stat_threshold
where stat_threshold
is the value of the censoring
limit. See ?epichains_summary()
for the definition of stat_threshold
.
Usage
## S3 method for class 'epichains_summary'
print(x, ...)
Arguments
x |
An |
... |
Not used. |
Value
Invisibly returns an <epichains_summary>
. Called for
side-effects.
Author(s)
James M. Azam
Examples
# Using a Poisson offspring distribution and simulating from an infinite
# population up to chain size 10.
set.seed(32)
chain_summary_print_eg <- simulate_chain_stats(
n_chains = 10,
statistic = "size",
offspring_dist = rpois,
stat_threshold = 10,
lambda = 2
)
chain_summary_print_eg # Print the object
Generate random numbers from the Borel distribution
Description
Random numbers are generated by simulating from a Poisson branching process
Usage
rborel(n, mu, censor_at = Inf)
Arguments
n |
Number of random variates to generate. |
mu |
A non-negative number for the poisson mean. |
censor_at |
A stopping criterion; |
Value
A numeric vector of random numbers.
Author(s)
Sebastian Funk James M. Azam
Examples
set.seed(32)
rborel(5, 1)
Generate random numbers from a Gamma-Borel mixture distribution
Description
Generate random numbers from a Gamma-Borel mixture distribution
Usage
rgborel(n, size, prob, mu, censor_at = Inf)
Arguments
n |
Number of random variates to generate. |
size |
The dispersion parameter (often called |
prob |
Probability of success (in the parameterisation with
|
mu |
Mean; A positive number. |
censor_at |
A stopping criterion; |
Value
Numeric vector of random numbers
Author(s)
Sebastian Funk James M. Azam
Examples
set.seed(32)
rgborel(n = 5, size = 0.3, mu = 1, censor_at = 5)
Simulate a vector of transmission chains statistics (sizes/lengths)
Description
It generates a vector of transmission chain sizes or lengths using the
same model as simulate_chains()
but without tracking details of the
individual chains. This function is useful when only the chain sizes or
lengths are of interest.
It uses a simple branching process model that simulates independent chains, using an offspring distribution for each chain. Each chain uses a threshold chain size or length as the stopping criterion especially where R0 > 1. The function also optionally accepts population related inputs such as the population size (defaults to Inf) and percentage of the population initially immune (defaults to 0).
Usage
simulate_chain_stats(
n_chains,
statistic = c("size", "length"),
offspring_dist,
...,
stat_threshold = Inf,
pop = Inf,
percent_immune = 0
)
Arguments
n_chains |
Number of chains to simulate. |
statistic |
The chain statistic to track as the
stopping criteria for each chain being simulated when
|
offspring_dist |
Offspring distribution: a |
... |
Parameters of the offspring distribution as required by R. |
stat_threshold |
A stopping criterion for individual chain simulations;
a positive number coercible to integer. When any chain's cumulative statistic
reaches or surpasses |
pop |
Population size; An |
percent_immune |
Percent of the population immune to
infection at the start of the simulation; A |
Value
An object of class <epichains_summary>
, which is a numeric
vector of chain sizes or lengths with extra attributes for storing the
simulation parameters.
simulate_chain_stats()
vs simulate_chains()
simulate_chain_stats()
is a time-invariant version of simulate_chains()
.
In particular, it does not track the details of individual transmission
events but deals with eventual chain statistics, that is, the statistic
realised by a chain after dying out.
It is useful for generating a vector of chain sizes or lengths for a given number of chains, if details of who infected whom and the timing of infection are not of interest.
This function is used in {epichains}
for calculating likelihoods in
the likelihood()
function and for sampling from the borel
distribution (See ?epichains::rborel). It is used extensively in the
vignette on
modelling disease control,
where only data on observed chain sizes and lengths are available.
Definition of a transmission chain
A transmission chain as used here is an independent case and all the secondary cases linked to it through transmission. The chain starts with a single case, and each case in the chain generates secondary cases according to the offspring distribution. The chain ends when no more secondary cases are generated.
Calculating chain sizes and lengths
The function simulates the chain size for chain i
at time
t
, I_{t, i}
, as:
I_{t, i} = \sum_{i}^{I_{t-1}}X_{t, i},
and the chain length/duration for chain i
at time t
,
L_{t, i}
, as:
L_{t, i} = {\sf min}(1, X_{t, i}),
where X_{t, i}
is the secondary cases generated by chain i
at time t
, and I_{0, i} = L_{0, i} = 1
.
The distribution of secondary cases, X_{t, i}
is modelled by the
offspring distribution (offspring_dist
).
Author(s)
James M. Azam, Sebastian Funk
References
Jacob C. (2010). Branching processes: their role in epidemiology. International journal of environmental research and public health, 7(3), 1186–1204. doi:10.3390/ijerph7031204
Blumberg, S., and J. O. Lloyd-Smith. 2013. "Comparing Methods for Estimating R0 from the Size Distribution of Subcritical Transmission Chains." Epidemics 5 (3): 131–45. doi:10.1016/j.epidem.2013.05.002.
Farrington, C. P., M. N. Kanaan, and N. J. Gay. 2003. "Branching Process Models for Surveillance of Infectious Diseases Controlled by Mass Vaccination.” Biostatistics (Oxford, England) 4 (2): 279–95. doi:10.1093/biostatistics/4.2.279.
Examples
# Simulate chain sizes with a poisson offspring distribution, assuming an
# infinite population and no immunity.
set.seed(32)
simulate_chain_stats(
n_chains = 20,
statistic = "size",
offspring_dist = rpois,
stat_threshold = 10,
lambda = 0.9
)
# Simulate chain sizes with a negative binomial distribution and assuming
# a finite population and 10% immunity.
set.seed(32)
simulate_chain_stats(
pop = 1000,
percent_immune = 0.1,
n_chains = 20,
statistic = "size",
offspring_dist = rnbinom,
stat_threshold = 10,
mu = 0.9,
size = 0.36
)
Simulate transmission chains
Description
It generates independent transmission chains starting with a single case per chain, using a simple branching process model (See details for definition of "chains" and assumptions). Offspring for each chain are generated with an offspring distribution, and an optional generation time distribution function.
The individual chain simulations are controlled by customisable stopping criteria, including a threshold chain size or length, and a generation time cut off. The function also optionally accepts population related inputs such as the population size (defaults to Inf) and percentage of the population initially immune (defaults to 0).
Usage
simulate_chains(
n_chains,
statistic = c("size", "length"),
offspring_dist,
...,
stat_threshold = Inf,
pop = Inf,
percent_immune = 0,
generation_time = NULL,
t0 = 0,
tf = Inf
)
Arguments
n_chains |
Number of chains to simulate. |
statistic |
The chain statistic to track as the
stopping criteria for each chain being simulated when
|
offspring_dist |
Offspring distribution: a |
... |
Parameters of the offspring distribution as required by R. |
stat_threshold |
A stopping criterion for individual chain simulations;
a positive number coercible to integer. When any chain's cumulative statistic
reaches or surpasses |
pop |
Population size; An |
percent_immune |
Percent of the population immune to
infection at the start of the simulation; A |
generation_time |
The generation time function; the name
of a user-defined named or anonymous function with only one argument |
t0 |
Start time (if generation time is given); either a single value
or a vector of same length as |
tf |
A number for the cut-off for the infection times (if generation
time is given); Defaults to |
Value
An <epichains>
object, which is basically a <data.frame>
with columns:
-
chain
- an ID for active/ongoing chains, -
infectee
- a unique ID for each infectee. -
infector
- an ID for the infector of each infectee. -
generation
- a discrete time point during which infection occurs, and optionally, -
time
- the time of infection.
Definition of a transmission chain
A transmission chain as used here is an independent case and all the secondary cases linked to it through transmission. The chain starts with a single case, and each case in the chain generates secondary cases according to the offspring distribution. The chain ends when no more secondary cases are generated.
Calculating chain sizes and lengths
The function simulates the chain size for chain i
at time
t
, I_{t, i}
, as:
I_{t, i} = \sum_{i}^{I_{t-1}}X_{t, i},
and the chain length/duration for chain i
at time t
,
L_{t, i}
, as:
L_{t, i} = {\sf min}(1, X_{t, i}),
where X_{t, i}
is the secondary cases generated by chain i
at time t
, and I_{0, i} = L_{0, i} = 1
.
The distribution of secondary cases, X_{t, i}
is modelled by the
offspring distribution (offspring_dist
).
Specifying generation_time
The argument generation_time
must be specified as a function with
one argument, n
.
For example, assuming we want to specify the generation time as a random
log-normally distributed variable with meanlog = 0.58
and sdlog = 1.58
,
we could define a named function, let's call it "generation_time_fn",
with only one argument representing the number of generation times to
sample: generation_time_fn <- function(n){rlnorm(n, 0.58, 1.38)}
,
and assign the name of the function to generation_time
in
the simulation function, i.e.
`simulate_*`(..., generation_time = generation_time_fn)
,
where ...
are the other arguments to simulate_*()
and * is a placeholder
for the rest of simulation function's name.
Alternatively, we could assign an anonymous function to generation_time
in the simulate_*()
call, i.e.
simulate_*(..., generation_time = function(n){rlnorm(n, 0.58, 1.38)})
OR simulate_*(..., generation_time = \(n){rlnorm(n, 0.58, 1.38)})
,
where ...
are the other arguments to simulate_*()
.
Author(s)
James M. Azam, Sebastian Funk
References
Jacob C. (2010). Branching processes: their role in epidemiology. International journal of environmental research and public health, 7(3), 1186–1204. doi:10.3390/ijerph7031204
Blumberg, S., and J. O. Lloyd-Smith. 2013. "Comparing Methods for Estimating R0 from the Size Distribution of Subcritical Transmission Chains." Epidemics 5 (3): 131–45. doi:10.1016/j.epidem.2013.05.002.
Farrington, C. P., M. N. Kanaan, and N. J. Gay. 2003. "Branching Process Models for Surveillance of Infectious Diseases Controlled by Mass Vaccination.” Biostatistics (Oxford, England) 4 (2): 279–95. doi:10.1093/biostatistics/4.2.279.
Examples
# Using a Poisson offspring distribution and simulating from an infinite
# population up to chain size 10.
set.seed(32)
chains_pois_offspring <- simulate_chains(
n_chains = 10,
statistic = "size",
offspring_dist = rpois,
stat_threshold = 10,
generation_time = function(n) rep(3, n),
lambda = 2
)
chains_pois_offspring
# Using a Negative binomial offspring distribution and simulating from a
# finite population up to chain size 10.
set.seed(32)
chains_nbinom_offspring <- simulate_chains(
n_chains = 10,
pop = 100,
percent_immune = 0,
statistic = "size",
offspring_dist = rnbinom,
stat_threshold = 10,
generation_time = function(n) rep(3, n),
mu = 2,
size = 0.2
)
chains_nbinom_offspring
Summary method for <epichains>
class
Description
This calculates the chain statistic (size/length) for the simulated
chains and returns an object with the same information as that returned
by an equivalent simulate_chain_stats()
call.
Usage
## S3 method for class 'epichains'
summary(object, ...)
Arguments
object |
An |
... |
Not used. |
Value
An <epichains_summary>
object containing the chain summary
statistics as follows:
"size": the total number of offspring produced by a chain before it goes extinct.
"length": the number of generations achieved by a chain before it goes extinct.
Author(s)
James M. Azam
Examples
# Using a negative binomial offspring distribution and simulating from a
# finite population up to chain size 10.
set.seed(32)
sim_chains_nbinom <- simulate_chains(
n_chains = 10,
pop = 100,
percent_immune = 0,
statistic = "size",
offspring_dist = rnbinom,
stat_threshold = 10,
mu = 2,
size = 0.2
)
# Summarise the simulated chains
sim_chains_nbinom_summary <- summary(sim_chains_nbinom)
sim_chains_nbinom_summary
# Same results can be obtained using `simulate_chain_stats()`
set.seed(32)
sim_summary_nbinom <- simulate_chain_stats(
n_chains = 10,
pop = 100,
percent_immune = 0,
statistic = "size",
offspring_dist = rnbinom,
stat_threshold = 10,
mu = 2,
size = 0.2
)
sim_summary_nbinom
# Check that the results are the same
setequal(sim_chains_nbinom_summary, sim_summary_nbinom)
Summary method for <epichains_summary>
class
Description
Summary method for <epichains_summary>
class
Usage
## S3 method for class 'epichains_summary'
summary(object, ...)
Arguments
object |
An |
... |
Not used. |
Value
A list of chain summaries. The list contains the following elements:
-
n_chains
: the number of chains simulated. -
max_stat
: the maximum chain statistic (size/length) achieved by the chains. -
min_stat
: the minimum chain statistic (size/length) achieved by the chains.
Author(s)
James M. Azam
Examples
# Using a Poisson offspring distribution and simulating from an infinite
# population up to chain size 10.
set.seed(32)
chain_stats <- simulate_chain_stats(
n_chains = 10,
statistic = "size",
offspring_dist = rpois,
stat_threshold = 10,
lambda = 2
)
summary(chain_stats)