Type: | Package |
Title: | Statistical Methods for the Analysis of Excess Lifetimes |
Version: | 1.2.1 |
BugReports: | https://github.com/lbelzile/longevity/issues |
URL: | https://lbelzile.github.io/longevity/ |
Depends: | R (≥ 4.0.0) |
Imports: | numDeriv, Rcpp (≥ 1.0.6), rlang, Rsolnp |
Suggests: | knitr, ggplot2 (≥ 3.0.0), tinytest, rmarkdown |
LinkingTo: | Rcpp, RcppArmadillo |
Description: | A collection of parametric and nonparametric methods for the analysis of survival data. Parametric families implemented include Gompertz-Makeham, exponential and generalized Pareto models and extended models. The package includes an implementation of the nonparametric maximum likelihood estimator for arbitrary truncation and censoring pattern based on Turnbull (1976) <doi:10.1111/j.2517-6161.1976.tb01597.x>, along with graphical goodness-of-fit diagnostics. Parametric models for positive random variables and peaks over threshold models based on extreme value theory are described in Rootzén and Zholud (2017) <doi:10.1007/s10687-017-0305-5>; Belzile et al. (2021) <doi:10.1098/rsos.202097> and Belzile et al. (2022) <doi:10.1146/annurev-statistics-040120-025426>. |
License: | GPL-3 |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.2 |
VignetteBuilder: | knitr |
NeedsCompilation: | yes |
Packaged: | 2025-07-03 20:52:41 UTC; lbelzile |
Author: | Leo Belzile |
Maintainer: | Leo Belzile <belzilel@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-07-03 21:20:02 UTC |
Identification sets
Description
Identification sets
Usage
.censTruncLimits(tsets, lcens, rcens, ltrunc, rtrunc, trunc, cens)
Arguments
tsets |
Turnbull's sets |
lcens |
numeric vector of left censoring |
rcens |
numeric vector of right censoring |
ltrunc |
numeric vector of left truncation |
rtrunc |
numeric vector of right truncation |
trunc |
logical are observation truncated? |
Value
a matrix with the bounds of the intervals for Turnbull sets
Identification sets for double truncated data
Description
Identification sets for double truncated data
Usage
.censTruncLimitsDtrunc(tsets, lcens, rcens, ltrunc, rtrunc, trunc, cens)
Arguments
tsets |
Turnbull's sets |
lcens |
numeric vector of left censoring |
rcens |
numeric vector of right censoring |
ltrunc |
numeric matrix of left truncation |
rtrunc |
numeric matrix of right truncation |
trunc |
logical are observation truncated? |
Value
a matrix with the bounds of the intervals for Turnbull sets
Check survival output
Description
Check survival output
Usage
.check_surv(
time,
time2 = NULL,
event = NULL,
type = c("right", "left", "interval", "interval2")
)
Arguments
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
event |
status indicator, normally 0=alive, 1=dead. Other choices are |
type |
character string specifying the type of censoring. Possible values are " |
Value
a list with transformed inputs or an error
Turnbull EM algorithm (low storage implementation)
Description
Turnbull EM algorithm (low storage implementation)
Usage
.turnbull_em(
tsets,
lcens,
rcens,
ltrunc,
rtrunc,
weights,
cens = TRUE,
trunc = TRUE,
tol = 1e-12,
zerotol = 1e-10,
maxiter = 100000L
)
Arguments
tsets |
Turnbull's sets |
lcens |
numeric vector of left censoring |
rcens |
numeric vector of right censoring |
ltrunc |
numeric vector of left truncation |
rtrunc |
numeric vector of right truncation |
weights |
vector of weights for observations |
cens |
logical; if |
tol |
tolerance level for terminating the EM algorithm |
maxiter |
maximum number of iteration for the EM algorithm |
Value
a list with the probabilities and the standard errors
Turnbull EM algorithm (low storage implementation)
Description
Turnbull EM algorithm (low storage implementation)
Usage
.turnbull_em_dtrunc(
tsets,
lcens,
rcens,
ltrunc,
rtrunc,
weights,
cens = TRUE,
trunc = TRUE,
tol = 1e-12,
zerotol = 1e-10,
maxiter = 100000L
)
Arguments
tsets |
Turnbull's sets |
lcens |
numeric vector of left censoring |
rcens |
numeric vector of right censoring |
ltrunc |
numeric vector of left truncation |
rtrunc |
numeric vector of right truncation |
weights |
vector of weights for observations |
cens |
logical; if |
tol |
tolerance level for terminating the EM algorithm |
maxiter |
maximum number of iteration for the EM algorithm |
Value
a list with the probabilities and the standard errors
Turnbull's sets
Description
Given truncation and censoring sets, construct disjoint increasing intervals whose left and right endpoints lie in L and R and which contain no other members of L and R
Usage
.turnbull_intervals(Lcens, Rcens, Ltrunc, Rtrunc, status)
Arguments
Lcens |
set of left censoring limits |
Rcens |
vector of right censoring limits |
Ltrunc |
vector of left truncation limits |
Rtrunc |
vector of right truncation limits |
status |
integer vector giving status of censoring set |
Value
a matrix containing limits of intervals for EM
Weighted empirical distribution function
Description
Weighted empirical distribution function
Usage
.wecdf(x, w, type = c("dist", "surv"))
Arguments
x |
vector of length |
w |
vector of weights of length |
type |
string, one of distribution function ( |
Author(s)
Adapted from spatstat (c) Adrian Baddeley and Rolf Turner
P-value plot for Northrop and Coleman diagnostic
Description
The Northrop-Coleman tests for penultimate models are comparing the piece-wise generalized Pareto distribution to the generalized Pareto above the lower threshold.
Usage
autoplot.elife_northropcoleman(object, ...)
## S3 method for class 'elife_northropcoleman'
plot(x, plot.type = c("base", "ggplot"), plot = TRUE, ...)
Arguments
object |
object of class |
... |
additional arguments for base |
x |
an object of class |
plot.type |
string indicating the type of plot |
plot |
logical; should the routine print the graph if |
Value
a base R or ggplot object with p-values for the Northrop-Coleman test against thresholds.
Goodness-of-fit plots for parametric models
Description
Because of censoring and truncation, the plotting positions must be adjusted accordingly. For right-censored data, the methodology is described in Waller & Turnbull (1992). Only non-censored observations are displayed, which can create distortion.
Usage
autoplot.elife_par(object, ...)
## S3 method for class 'elife_par'
plot(
x,
plot.type = c("base", "ggplot"),
which.plot = c("pp", "qq"),
confint = c("none", "pointwise", "simultaneous"),
plot = TRUE,
...
)
Arguments
object |
an object of class |
... |
additional arguments, currently ignored by the function. |
x |
a parametric model of class |
plot.type |
string, one of |
which.plot |
vector of string indicating the plots, among |
confint |
logical; if |
plot |
logical; if |
Details
For truncated data, we first estimate the distribution function
nonparametrically, F_n
. The uniform plotting positions of the data
v_i = [F_n(y_i) - F_n(a_i)]/[F_n(b_i) - F_n(a_i)].
For probability-probability plots, the empirical quantiles are transformed
using the same transformation, with F_n
replaced by the postulated or estimated
distribution function F_0
.
For quantile-quantile plots, the plotting positions v_i
are mapped back
to the data scale viz.
F_0^{-1}\{F_0(a_i) + v_i[F_0(b_i) - F_0(a_i)]\}
When data are truncated and observations are mapped back to the untruncated scale (with, e.g., exp
), the plotting positions need not be in the same order as the order statistics of the data.
Value
The function produces graphical goodness-of-fit plots using base R or ggplot objects (returned as an invisible list).
Examples
set.seed(1234)
samp <- samp_elife(
n = 200,
scale = 2,
shape = 0.3,
family = "gomp",
lower = 0, upper = runif(200, 0, 10),
type2 = "ltrc")
fitted <- fit_elife(
time = samp$dat,
thresh = 0,
event = ifelse(samp$rcens, 0L, 1L),
type = "right",
family = "exp",
export = TRUE)
plot(fitted, plot.type = "ggplot")
# Left- and right-truncated data
n <- 40L
samp <- samp_elife(
n = n,
scale = 2,
shape = 0.3,
family = "gp",
lower = ltrunc <- runif(n),
upper = rtrunc <- ltrunc + runif(n, 0, 15),
type2 = "ltrt")
fitted <- fit_elife(
time = samp,
thresh = 0,
ltrunc = ltrunc,
rtrunc = rtrunc,
family = "gp",
export = TRUE)
plot(fitted, which.plot = c("tmd", "dens"))
Threshold stability plots
Description
Threshold stability plots
Usage
autoplot.elife_tstab(object, ...)
## S3 method for class 'elife_tstab'
plot(
x,
plot.type = c("base", "ggplot"),
which.plot = c("scale", "shape"),
plot = TRUE,
...
)
Arguments
object |
object of class |
... |
additional arguments, currently ignored by the function. |
x |
an object of class |
plot.type |
string, one of |
which.plot |
vector of string indicating the plots, among |
plot |
logical; if |
Box-Cox transformation function
Description
Given a vector of parameters, apply the Box-Cox transformation.
Usage
boxcox_transfo(x, lambda = rep(1, length(x)))
Arguments
x |
vector of arguments |
lambda |
vector of Box-Cox parameters |
Value
a vector of the same length as x
Check default arguments
Description
Check arguments and override default values.
If a named list, arguments
, is provided by the user,
it will override any default value.
If one of the argument is provided directly,
it will take precedence over the values in arguments
, provided it is not a default value.
Usage
check_arguments(func, call, arguments = NULL)
Arguments
func |
function whose parameters are to be superseded |
call |
user call, obtained from |
arguments |
named list with arguments |
Value
a named list with all arguments
Check parameters of extended lifetime distributions
Description
Check parameters of extended lifetime distributions
Usage
check_elife_dist(
rate,
scale,
shape,
family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "extweibull", "perks",
"beard", "perksmake", "beardmake")
)
Arguments
rate |
rate parameter(s); for models with Makeham component, the last entry should be part of the rate vector |
scale |
scale parameter |
shape |
vector of shape parameter(s). |
family |
string indicating the parametric model, one of |
Value
The function has no return value and is only used to throw error for invalid arguments.
Confidence intervals for profile likelihoods
Description
This code is adapted from the mev
package.
Usage
conf_interv(
object,
level = 0.95,
prob = c((1 - level)/2, 1 - (1 - level)/2),
print = FALSE,
...
)
Arguments
object |
a list containing information about the profile likelihood in the same format as the |
level |
probability level of the confidence interval |
prob |
vector of length 2 containing the bounds, by default double-sided |
print |
logical indicating whether the intervals are printed to the console |
... |
additional arguments passed to the function |
Value
a table with confidence intervals.
Density function of the extended generalized Pareto distribution
Description
Density function of the extended generalized Pareto distribution
Hazard function of the extended generalized Pareto distribution
Quantile function of the extended generalized Pareto distribution
Usage
dextgp(x, scale = 1, shape1 = 0, shape2 = 0, log = FALSE)
hextgp(x, scale = 1, shape1 = 0, shape2 = 0, log = FALSE)
qextgp(p, scale = 1, shape1 = 0, shape2 = 0, lower.tail = TRUE)
Arguments
x |
vector of quantiles. |
scale |
scale parameter, strictly positive. |
shape1 |
positive shape parameter |
shape2 |
shape parameter |
log |
logical; if |
p |
vector of probabilities. |
lower.tail |
logical; if |
Value
a vector of (log)-density of the same length as x
a vector of (log)-hazard of the same length as x
a vector of quantiles
Dutch survival data
Description
This data frame contains information about all Dutch who died
above age 92 years between 1986 and 2015. Observations are
doubly truncated and such bounds are calculated based on the
range of plausible values for these variables.
There are 226 records that are interval-censored and interval-truncated
for which bdate
, ddate
and ndays
is missing (NA
).
Usage
dutch
Format
A data frame with 305143 rows and 11 variables:
- ndays
survival time (in days)
- bdate
the smallest plausible birth date given information about month of birth and death and survival (
Date
)- bmonth
month of birth
- byear
year of birth
- ddate
the largest plausible death date given information about month of birth and death and survival (
Date
)- dmonth
month of death
- dyear
year of death
- ltrunc
minimum age (in days); the maximum of either 92 years or the number of days reached in 1986
- rtrunc
maximum age (in days) an individual could have reached by the end of 2015
- gender
factor indicating gender of individual, either
female
ormale
- valid
quality flag;
A
for individuals born in the Netherlands,B
for individuals born abroad who died in the Netherlands
Source
Statistics Netherlands (CBS). Accessed via the Supplemental material of Einmahl, Einmahl and de Haan (2019)
References
Einmahl, J.J., J.H.J. Einmahl and L. de Haan (2019). Limits to Human Life Span Through Extreme Value Theory, Journal of the American Statistical Association, 114(527), 1075-1080. doi:10.1080/01621459.2018.1537912
Excess lifetime distributions
Description
Quantile, distribution, density and hazard functions of excess lifetime distribution for threshold exceedances.
Usage
qelife(
p,
rate,
scale,
shape,
family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "extweibull", "perks",
"perksmake", "beard", "beardmake"),
lower.tail = TRUE
)
pelife(
q,
rate,
scale,
shape,
family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "extweibull", "perks",
"perksmake", "beard", "beardmake"),
lower.tail = TRUE,
log.p = FALSE
)
selife(
q,
rate,
scale,
shape,
family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "extweibull", "perks",
"perksmake", "beard", "beardmake"),
log.p = FALSE
)
relife(
n,
scale = 1,
rate,
shape,
family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "extweibull", "perks",
"perksmake", "beard", "beardmake")
)
delife(
x,
scale = 1,
rate,
shape,
family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "extweibull", "perks",
"perksmake", "beard", "beardmake"),
log = FALSE
)
helife(
x,
scale = 1,
rate,
shape,
family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "extweibull", "perks",
"perksmake", "beard", "beardmake"),
log = FALSE
)
Arguments
p |
vector of probabilities |
rate |
rate parameter(s); for models with Makeham component, the last entry should be part of the rate vector |
scale |
scale parameter |
shape |
vector of shape parameter(s). |
family |
string indicating the parametric model, one of |
lower.tail |
logical; if |
q |
vector of quantiles. |
n |
sample size |
log , log.p |
logical; if |
Value
depending on the function type, a vector of probabilities (pelife
), survival probabilities (selife
), quantiles (qelife
), density (delife
), or hazard (helife
). The function relife
returns a random sample of size n
from the distribution.
Profile likelihood for the endpoint of the generalized Pareto distribution
Description
This function returns the profile log likelihood over a grid of values of psi
, the endpoints.
Usage
endpoint.profile(
time,
time2 = NULL,
event = NULL,
thresh = 0,
type = c("right", "left", "interval", "interval2"),
ltrunc = NULL,
rtrunc = NULL,
weights = rep(1, length(time)),
psi = NULL,
confint = FALSE,
level = 0.95,
arguments = NULL,
...
)
Arguments
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
event |
status indicator, normally 0=alive, 1=dead. Other choices are |
thresh |
vector of thresholds |
type |
character string specifying the type of censoring. Possible values are " |
ltrunc |
lower truncation limit, default to |
rtrunc |
upper truncation limit, default to |
weights |
weights for observations |
psi |
mandatory vector of endpoints at which to compute the profile |
confint |
logical; if |
level |
numeric; the level for the confidence intervals |
arguments |
a named list specifying default arguments of the function that are common to all |
... |
additional parameters, currently ignored |
Value
a list with the maximum likelihood estimate of the endpoint and the profile log-likelihood
Examples
set.seed(2023)
time <- relife(n = 100, scale = 3, shape = -0.3, family = "gp")
endpt <- endpoint.profile(
time = time,
psi = seq(max(time) + 1e-4, max(time) + 40, length.out = 51L))
print(endpt)
plot(endpt)
confint(endpt)
Threshold stability plots for endpoint
Description
This function calls endpoint.profile
to compute
the endpoint with profile likelihood confidence intervals
at each threshold. It then returns a data frame and
a plot
Usage
endpoint.tstab(
time,
time2 = NULL,
event = NULL,
thresh = 0,
type = c("right", "left", "interval", "interval2"),
ltrunc = NULL,
rtrunc = NULL,
weights = rep(1, length(time)),
psi = NULL,
confint = FALSE,
level = 0.95,
arguments = NULL,
plot = TRUE,
...
)
prof_gp_endpt(
time,
time2 = NULL,
event = NULL,
thresh = 0,
type = c("right", "left", "interval", "interval2"),
ltrunc = NULL,
rtrunc = NULL,
weights = rep(1, length(time)),
psi = NULL,
confint = FALSE,
level = 0.95,
arguments = NULL,
...
)
Arguments
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
event |
status indicator, normally 0=alive, 1=dead. Other choices are |
thresh |
vector of thresholds |
type |
character string specifying the type of censoring. Possible values are " |
ltrunc |
lower truncation limit, default to |
rtrunc |
upper truncation limit, default to |
weights |
weights for observations |
psi |
mandatory vector of endpoints at which to compute the profile |
confint |
logical; if |
level |
numeric; the level for the confidence intervals |
arguments |
a named list specifying default arguments of the function that are common to all |
plot |
logical; if |
... |
additional parameters, currently ignored |
Value
a data frame with the threshold, endpoint estimates and profile confidence intervals
England and Wales simulated supercentenarian data
Description
This data frame contains information about 179 fake records mimicking Welsh and English who died age 110 and above
Usage
ewsim
Format
A data frame with 179 rows and 3 variables:
- time
survival time above 110 (in years)
- ltrunc
minimum age above 110 (in years), or zero
;
- rtrunc
maximum age (in years) an individual could have reached by the end of the time frame
Fit excess lifetime models for doubly interval truncated data
Description
This function is a wrapper around constrained optimization routines for different models with non-informative censoring and truncation patterns.
Usage
fit_ditrunc_elife(
time,
ltrunc1 = NULL,
rtrunc1 = NULL,
ltrunc2 = NULL,
rtrunc2 = NULL,
thresh = 0,
family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
"extweibull", "perks", "beard", "perksmake", "beardmake"),
weights = NULL,
export = FALSE,
start = NULL,
restart = FALSE,
arguments = NULL,
...
)
Arguments
time |
excess time of the event of follow-up time, depending on the value of event |
ltrunc1 |
lower truncation limit, default to |
rtrunc1 |
upper truncation limit, default to |
ltrunc2 |
lower truncation limit, default to |
rtrunc2 |
upper truncation limit, default to |
thresh |
vector of thresholds |
family |
string; choice of parametric family, either exponential ( |
weights |
weights for observations |
export |
logical; should data be included in the returned object to produce diagnostic plots? Default to |
start |
vector of starting values for the optimization routine. If |
restart |
logical; should multiple starting values be attempted? Default to |
arguments |
a named list specifying default arguments of the function that are common to all |
... |
additional arguments for optimization, currently ignored. |
Value
an object of class elife_par
Note
The extended generalized Pareto model is constrained
to avoid regions where the likelihood is flat so \xi \in [-1, 10]
in
the optimization algorithm.
The standard errors are obtained via the observed information matrix, calculated using the hessian. In many instances, such as when the shape parameter is zero or negative, the hessian is singular and no estimates are returned.
Fit excess lifetime models by maximum likelihood
Description
This function is a wrapper around constrained optimization routines for different models with non-informative censoring and truncation patterns.
Usage
fit_elife(
time,
time2 = NULL,
event = NULL,
type = c("right", "left", "interval", "interval2"),
ltrunc = NULL,
rtrunc = NULL,
thresh = 0,
status = NULL,
family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "gppiece",
"extweibull", "perks", "perksmake", "beard", "beardmake"),
weights = NULL,
export = FALSE,
start = NULL,
restart = FALSE,
arguments = NULL,
check = FALSE,
...
)
Arguments
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
event |
status indicator, normally 0=alive, 1=dead. Other choices are |
type |
character string specifying the type of censoring. Possible values are " |
ltrunc |
lower truncation limit, default to |
rtrunc |
upper truncation limit, default to |
thresh |
vector of thresholds |
status |
integer vector giving status of an observation. If |
family |
string; choice of parametric family |
weights |
weights for observations |
export |
logical; should data be included in the returned object to produce diagnostic plots? Default to |
start |
vector of starting values for the optimization routine. If |
restart |
logical; should multiple starting values be attempted? Default to |
arguments |
a named list specifying default arguments of the function that are common to all |
check |
logical; if |
... |
additional parameters, currently ignored |
Value
an object of class elife_par
Note
The extended generalized Pareto model is constrained
to avoid regions where the likelihood is flat so \xi \in [-1, 10]
in
the optimization algorithm.
The standard errors are obtained via the observed information matrix, calculated using the hessian. In many instances, such as when the shape parameter is zero or negative, the hessian is singular and no estimates are returned.
Examples
data(ewsim, package = "longevity")
fit1 <- fit_elife(
arguments = ewsim,
export = TRUE,
family = "exp")
fit2 <- fit_elife(
arguments = ewsim,
export = TRUE,
family = "gp")
plot(fit1)
summary(fit1)
anova(fit2, fit1)
Piece-wise generalized Pareto distribution
Description
Density, distribution, quantile functions and random number generation from the
mixture model of Northrop and Coleman (2014), which consists of m
different generalized Pareto distributions over non-overlapping intervals
with m
shape parameters and one scale parameter; the other scale parameters are
constrained so that the resulting distribution is continuous over the domain
and reduces to a generalized Pareto distribution if all of the shape parameters are equal.
Usage
dgppiece(x, scale, shape, thresh, log = FALSE)
pgppiece(q, scale, shape, thresh, lower.tail = TRUE, log.p = FALSE)
qgppiece(p, scale, shape, thresh, lower.tail = TRUE, log.p = FALSE)
rgppiece(n, scale, shape, thresh)
Arguments
x , q |
vector of quantiles |
scale |
positive value for the first scale parameter |
shape |
vector of |
thresh |
vector of |
log , log.p |
logical; if |
lower.tail |
logical; if TRUE (default), probabilities are |
p |
vector of probabilities |
n |
sample size |
Value
a vector of quantiles (qgppiece
), probabilities (pgppiece
), density (dgppiece
) or random number generated from the model (rgppiece
)
References
Northrop & Coleman (2014). Improved threshold diagnostic plots for extreme value analyses, Extremes, 17(2), 289–303.
Profile likelihood for hazard
Description
This function computes the hazard for different elife
parametric
models with profile-likelihood based confidence intervals.
It is also used to provide local hazard plots at varying thresholds.
Usage
hazard_elife(
x,
time,
time2 = NULL,
event = NULL,
status = NULL,
thresh = 0,
ltrunc = NULL,
rtrunc = NULL,
type = c("right", "left", "interval", "interval2"),
family = c("exp", "gp", "gomp", "weibull", "extgp"),
weights = rep(1, length(time)),
level = 0.95,
psi = NULL,
plot = FALSE,
arguments = NULL,
...
)
Arguments
x |
value of the threshold exceedance at which to estimate the hazard |
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
event |
status indicator, normally 0=alive, 1=dead. Other choices are |
status |
integer vector giving status of an observation. If |
thresh |
vector of thresholds |
ltrunc |
lower truncation limit, default to |
rtrunc |
upper truncation limit, default to |
type |
character string specifying the type of censoring. Possible values are " |
family |
string; choice of parametric family |
weights |
weights for observations |
level |
numeric; the level for the confidence intervals. Default to 0.95 |
psi |
optional vector of hazard at which to compute the profile log likelihood |
plot |
logical; if true, display the profile log-likelihood. Default to |
arguments |
a named list specifying default arguments of the function that are common to all |
... |
additional arguments for optimization, currently ignored. |
Value
an invisible object of class elife_hazard
containing information about the profile likelihood
Examples
n <- 2500
time <- samp_elife(n = n, scale = 2,
family = "gp", shape = 0.1,
lower = ltrunc <- runif(n),
upper = rtrunc <- (5 + runif(n)), type2 = "ltrt")
hazard_elife(x = 2, time = time,
ltrunc = ltrunc, rtrunc = rtrunc, family = "exp")
Hazard function for various parametric models
Description
Hazard function for various parametric models
Usage
hazard_fn_elife(
x,
par,
family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp")
)
Arguments
x |
vector of points at which to evaluate the hazard function |
par |
vector of scale and shape parameters |
family |
string; choice of parametric family |
Value
a vector with the value of the hazard function at x
Hazard function of the exponential distribution
Description
Hazard function of the exponential distribution
Usage
hexp(x, rate = 1, log = FALSE)
Arguments
x |
vector of quantiles |
rate |
rate vector of rates |
log |
logical; if |
Value
a vector of (log)-hazard.
Hazard function of the Perks distribution
Description
Hazard function of the Perks distribution
Distribution function of the Perks distribution
Density function of the Perks distribution
Quantile function of the Perks distribution
Usage
hperks(x, rate = 1, shape = 1, log = FALSE)
pperks(q, rate = 1, shape = 1, lower.tail = TRUE, log.p = FALSE)
dperks(x, rate = 1, shape = 1, log = FALSE)
qperks(p, rate = 1, shape = 1, lower.tail = TRUE)
Arguments
x |
vector of quantiles. |
rate |
rate parameter ( |
shape |
shape parameter ( |
log |
logical; if |
q |
vector of quantiles. |
lower.tail |
logical; if |
log.p |
logical; if |
p |
vector of probabilities. |
Value
a vector of (log)-hazard.
a vector of (log)-probabilities of the same length as q
a vector of (log)-density.
a vector of quantiles
Hazard function of the Weibull distribution
Description
Hazard function of the Weibull distribution
Usage
hweibull(x, shape, scale = 1, log = FALSE)
Arguments
x |
vector of quantiles |
shape |
shape parameter |
scale |
scale parameter, default to 1 |
log |
logical; if |
Value
a vector of (log)-hazard
IDL metadata
Description
This data frame contains country codes and the associated data collection period corresponding to the range for age at death.
Usage
idlmetadata
Format
A data frame with 21 rows and 4 variables:
- country
factor, one of
AUT
(Austria),BEL
(Belgium),CAN
(Quebec),DEU
(Germany),DNK
(Denmark),ESP
(Spain),FIN
(Finland),FRA
(France),JPN
(Japan),NOR
(Norway),SWE
(Sweden),EAW
(England and Wales) andUSA
(United States of America)- group
factor, either
105-109
for semi-supercentenarians or110+
for supercentenarians"- ldate
Date, smallest death date
- rdate
Date, latest death date
Details
Due to confidentiality restrictions, some data that were available in previous versions of the IDL for Switzerland, Italy and some entries for Japan and Belgium have been removed. As the IDL metadata are updated somewhat regularly and former versions of the database are not preserved, results from published analyses are replicable but not reproducible.
References
International Database on Longevity, extracted on February 13th, 2023
Japanese survival data
Description
This data frame contains information about the counts of dead Japanese by gender and year of birth (cohort), categorized by the whole part of age attained at death.
Usage
japanese
Format
A data frame with 1038 rows and 4 variables:
- age
integer, age (to the smallest year) at death (in years)
- byear
integer, birth year
- count
integer, number of death for cohort at given age
- gender
factor, the gender of the individuals; either
male
orfemale
Details
These data were obtained from the Annual Vital Statistics Report of Japan, released by the Japanese government every year since 1947. The authors note that "All the members of that cohort have died by the end of the observation period, a procedure referred to as the extinct cohort method". The data were obtained from the Human Mortality Database by the authors. Only positive counts are reported and two records (Misao Okawa and Jiroemon Kimura) are excluded because they do not correspond to the same selection mechanism.
Source
Table extracted from Hanayama & Sibuya (2016).
References
Hanayama, N. and M. Sibuya (2016). Estimating the Upper Limit of Lifetime Probability Distribution, Based on Data of Japanese Centenarians, The Journals of Gerontology: Series A, 71(8), 1014–1021. doi:10.1093/gerona/glv113
Japanese survival data (2)
Description
This data frame is extracted from Table 10.3 from Chapter 10, "Centenarians and Supercentenarians in Japan", in the Monograph Exceptional lifespans. The data were constructed by the extinct cohort method and are stratified by age cohort (five year group, except 1899-1900) and by sex. Note that the family registry system (KOSEKI), introduced in 1872, was standardized in 1886.
Usage
japanese2
Format
A data frame with 216 rows and 4 variables:
- age
integer, age (to the smallest year) at death (in years)
- bcohort
factor, birth cohort
- count
integer, number of death for cohort at given age
- gender
factor, the gender of the individuals; either
male
orfemale
Source
Table 10.3
References
Saito, Yasuhiko and Futoshi Ishii, and Jean-Marie Robine (2021). Centenarians and Supercentenarians in Japan. In Exceptional lifespans, Maier, H., Jeune, B., Vaupel, J. W. (Eds.), Demographic research monographs 17 VII, pp. 125-145. Cham, Springer.
Goodness-of-fit diagnostics
Description
Warning: EXPERIMENTAL Compute the Kolmogorov-Smirnov test statistic and compare it with a simulated null distribution obtained via a parametric bootstrap.
Usage
ks_test(
time,
time2 = NULL,
event = NULL,
thresh = 0,
ltrunc = NULL,
rtrunc = NULL,
type = c("right", "left", "interval", "interval2"),
family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
"extweibull", "perks", "beard", "perksmake", "beardmake"),
B = 999L,
arguments = NULL,
...
)
Arguments
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
event |
status indicator, normally 0=alive, 1=dead. Other choices are |
thresh |
vector of thresholds |
ltrunc |
lower truncation limit, default to |
rtrunc |
upper truncation limit, default to |
type |
character string specifying the type of censoring. Possible values are " |
family |
string; choice of parametric family |
B |
number of bootstrap simulations |
arguments |
a named list specifying default arguments of the function that are common to all |
... |
additional parameters, currently ignored |
Value
a list with elements
-
stat
the value of the test statistic -
pval
p-value obtained via simulation
Note
The bootstrap scheme requires simulating new data, fitting a parametric model and estimating the nonparametric maximum likelihood estimate for each new sample. This is computationally intensive in large samples.
Log posterior distribution with MDI priors
Description
Log of the posterior distribution for excess lifetime distribution with maximal data information priors.
Usage
lpost_elife(
par,
time,
time2 = NULL,
event = NULL,
type = c("right", "left", "interval", "interval2"),
ltrunc = NULL,
rtrunc = NULL,
family = c("exp", "gp", "gomp"),
thresh = 0,
weights = rep(1, length(time)),
status = NULL,
arguments = NULL,
...
)
Arguments
par |
vector of parameters, in the following order: scale, rate and shape |
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
event |
status indicator, normally 0=alive, 1=dead. Other choices are |
type |
character string specifying the type of censoring. Possible values are " |
ltrunc |
lower truncation limit, default to |
rtrunc |
upper truncation limit, default to |
family |
string; choice of parametric family |
thresh |
vector of thresholds |
weights |
weights for observations |
status |
integer vector giving status of an observation. If |
arguments |
a named list specifying default arguments of the function that are common to all |
... |
additional arguments for optimization, currently ignored. |
Value
a vector proportional to the log posterior (the sum of the log likelihood and log prior)
Score test of Northrop and Coleman
Description
This function computes the score test with the piecewise generalized Pareto distribution under the null hypothesis that the generalized Pareto with a single shape parameter is an adequate simplification. The score test statistic is calculated using the observed information matrix; both hessian and score vector are obtained through numerical differentiation.
Usage
nc_score_test(
time,
time2 = NULL,
event = NULL,
thresh = 0,
ltrunc = NULL,
rtrunc = NULL,
type = c("right", "left", "interval", "interval2"),
weights = rep(1, length(time))
)
Arguments
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
event |
status indicator, normally 0=alive, 1=dead. Other choices are |
thresh |
a vector of thresholds |
ltrunc |
lower truncation limit, default to |
rtrunc |
upper truncation limit, default to |
type |
character string specifying the type of censoring. Possible values are " |
weights |
weights for observations |
Details
The score test is much faster and perhaps less fragile than the likelihood ratio test: fitting the piece-wise generalized Pareto model is difficult due to the large number of parameters and multimodal likelihood surface.
The reference distribution is chi-square
Value
the value of a call to nc_test
Examples
set.seed(1234)
n <- 100L
x <- samp_elife(n = n,
scale = 2,
shape = -0.2,
lower = low <- runif(n),
upper = upp <- runif(n, min = 3, max = 20),
type2 = "ltrt",
family = "gp")
test <- nc_test(
time = x,
ltrunc = low,
rtrunc = upp,
thresh = quantile(x, seq(0, 0.5, by = 0.1)))
print(test)
plot(test)
Score test of Northrop and Coleman
Description
This function computes the score test with the piecewise generalized Pareto distribution under the null hypothesis that the generalized Pareto with a single shape parameter is an adequate simplification. The score test statistic is calculated using the observed information matrix; both hessian and score vector are obtained through numerical differentiation.
Usage
nc_test(
time,
time2 = NULL,
event = NULL,
thresh = 0,
ltrunc = NULL,
rtrunc = NULL,
type = c("right", "left", "interval", "interval2"),
weights = rep(1, length(time)),
test = c("score", "lrt"),
arguments = NULL,
...
)
Arguments
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
event |
status indicator, normally 0=alive, 1=dead. Other choices are |
thresh |
a vector of thresholds |
ltrunc |
lower truncation limit, default to |
rtrunc |
upper truncation limit, default to |
type |
character string specifying the type of censoring. Possible values are " |
weights |
weights for observations |
test |
string, either |
arguments |
a named list specifying default arguments of the function that are common to all |
... |
additional parameters, currently ignored |
Details
The score test is much faster and perhaps less fragile than the likelihood ratio test: fitting the piece-wise generalized Pareto model is difficult due to the large number of parameters and multimodal likelihood surface.
The reference distribution is chi-square
Value
a data frame with the following variables:
-
thresh
: threshold for the generalized Pareto distribution -
nexc
: number of exceedances -
score
: score statistic -
df
: degrees of freedom -
pval
: the p-value obtained from the asymptotic chi-square approximation.
Examples
set.seed(1234)
n <- 100L
x <- samp_elife(n = n,
scale = 2,
shape = -0.2,
lower = low <- runif(n),
upper = upp <- runif(n, min = 3, max = 20),
type2 = "ltrt",
family = "gp")
test <- nc_test(
time = x,
ltrunc = low,
rtrunc = upp,
thresh = quantile(x, seq(0, 0.5, by = 0.1)))
print(test)
plot(test)
Likelihood for doubly interval truncated data
Description
Computes the log-likelihood for various parametric models suitable for threshold exceedances. If threshold is non-zero, then only right-censored, observed event time and interval censored data whose timing exceeds the thresholds are kept.
Usage
nll_ditrunc_elife(
par,
time,
ltrunc1 = NULL,
rtrunc1 = NULL,
ltrunc2 = NULL,
rtrunc2 = NULL,
family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
"extweibull", "perks", "beard", "perksmake", "beardmake"),
thresh = 0,
weights = rep(1, length(time)),
arguments = NULL,
...
)
Arguments
par |
vector of parameters |
time |
excess time of the event of follow-up time, depending on the value of event |
ltrunc1 |
lower truncation limit, default to |
rtrunc1 |
upper truncation limit, default to |
ltrunc2 |
lower truncation limit, default to |
rtrunc2 |
upper truncation limit, default to |
family |
string; choice of parametric family, either exponential ( |
thresh |
vector of thresholds |
weights |
weights for observations |
arguments |
a named list specifying default arguments of the function that are common to all |
... |
additional arguments for optimization, currently ignored. |
Value
log-likelihood value
Likelihood for arbitrary censored and truncated data
Description
Computes the log-likelihood for various parametric models suitable for threshold exceedances. If threshold is non-zero, then only right-censored, observed event time and interval censored data whose timing exceeds the thresholds are kept.
Usage
nll_elife(
par,
time,
time2 = NULL,
event = NULL,
type = c("right", "left", "interval", "interval2"),
ltrunc = NULL,
rtrunc = NULL,
family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
"extweibull", "perks", "beard", "perksmake", "beardmake"),
thresh = 0,
weights = NULL,
status = NULL,
arguments = NULL,
...
)
Arguments
par |
vector of parameters, in the following order: scale, rate and shape |
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
event |
status indicator, normally 0=alive, 1=dead. Other choices are |
type |
character string specifying the type of censoring. Possible values are " |
ltrunc |
lower truncation limit, default to |
rtrunc |
upper truncation limit, default to |
family |
string; choice of parametric family |
thresh |
vector of thresholds |
weights |
weights for observations |
status |
integer vector giving status of an observation. If |
arguments |
a named list specifying default arguments of the function that are common to all |
... |
additional arguments for optimization, currently ignored. |
Value
log-likelihood values
Examples
data(ewsim, package = "longevity")
nll_elife(par = c(5, 0.3),
family = "gp",
arguments = ewsim)
Nonparametric estimation of the survival function
Description
The survival function is obtained through the EM algorithm
described in Turnbull (1976); censoring and truncation are
assumed to be non-informative.
The survival function changes only at the J
distinct
exceedances y_i-u
and truncation points.
Usage
np_elife(
time,
time2 = NULL,
event = NULL,
type = c("right", "left", "interval", "interval2"),
thresh = 0,
ltrunc = NULL,
rtrunc = NULL,
tol = 1e-12,
weights = NULL,
method = c("em", "sqp"),
arguments = NULL,
maxiter = 100000L,
...
)
Arguments
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
event |
status indicator, normally 0=alive, 1=dead. Other choices are |
type |
character string specifying the type of censoring. Possible values are " |
thresh |
double thresh |
ltrunc |
lower truncation limit, default to |
rtrunc |
upper truncation limit, default to |
tol |
double, relative tolerance for convergence of the EM algorithm |
weights |
double, vector of weights for the observations |
method |
string, one of |
arguments |
a named list specifying default arguments of the function that are common to all |
maxiter |
integer, maximum number of iterations for the EM algorithm |
... |
additional arguments, currently ignored |
Details
The unknown parameters of the model are p_j (j=1, \ldots, J)
subject to the constraint that \sum_{j=1}^J p_j=1
.
Value
a list with elements
-
cdf
: right-continuousstepfun
object defined by probabilities -
time
: matrix of unique values for the Turnbull intervals defining equivalence classes; only those with non-zero probabilities are returned -
prob
:J
vector of non-zero probabilities -
niter
: number of iterations
References
Turnbull, B. W. (1976). The Empirical Distribution Function with Arbitrarily Grouped, Censored and Truncated Data. Journal of the Royal Statistical Society. Series B (Methodological) 38(3), 290–295.
Gentleman, R. and C. J. Geyer (1994). Maximum likelihood for interval censored data: Consistency and computation, Biometrika, 81(3), 618–623.
Frydman, H. (1994). A Note on Nonparametric Estimation of the Distribution Function from Interval-Censored and Truncated Observations, Journal of the Royal Statistical Society. Series B (Methodological) 56(1), 71-74.
Examples
set.seed(2021)
n <- 20L
# Create fake data
ltrunc <- pmax(0, runif(n, -0.5, 1))
rtrunc <- runif(n, 6, 10)
dat <- samp_elife(n = n,
scale = 1,
shape = -0.1,
lower = ltrunc,
upper = rtrunc,
family = "gp",
type2 = "ltrt")
npi <- np_elife(time = dat,
rtrunc = rtrunc,
ltrunc = ltrunc)
print(npi)
summary(npi)
plot(npi)
Marginal log likelihood function of the nonparametric multinomial with censoring and truncation
Description
Marginal log likelihood function of the nonparametric multinomial with censoring and truncation
Usage
np_nll(par, cens_lb, cens_ub, trunc_lb, trunc_ub, cens, trunc, weights)
Arguments
par |
vector of |
cens_lb |
index of interval in which death occurs |
cens_ub |
index of interval in which death occurs (if death is observed), or else the largest interval. |
trunc_lb |
vector of largest index for the lower truncation |
trunc_ub |
vector of smallest index for the upper truncation |
Value
a scalar, the negative log likelihood value
Nonparametric maximum likelihood estimation for arbitrary truncation
Description
The syntax is reminiscent of the Surv function, with additional vectors for left-truncation and right-truncation.
Usage
npsurv(
time,
time2 = NULL,
event = NULL,
type = c("right", "left", "interval", "interval2"),
ltrunc = NULL,
rtrunc = NULL,
weights = NULL,
arguments = NULL,
...
)
Arguments
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
event |
status indicator, normally 0=alive, 1=dead. Other choices are |
type |
character string specifying the type of censoring. Possible values are " |
ltrunc |
lower truncation limit, default to |
rtrunc |
upper truncation limit, default to |
weights |
vector of weights, default to |
arguments |
a named list specifying default arguments of the function that are common to all |
... |
additional arguments passed to the functions |
Value
a list with components
-
xval
: unique ordered values of sets on which the distribution function is defined -
prob
: estimated probability of failure on intervals -
convergence
: logical;TRUE
if the EM algorithm iterated until convergence -
niter
: logical; number of iterations for the EM algorithm -
cdf
: nonparametric maximum likelihood estimator of the distribution function
Note
Contrary to the Kaplan-Meier estimator, the mass is placed in the interval
[max(time), Inf
) so the resulting distribution function is not deficient.
See Also
Examples
# Toy example with interval censoring and right censoring
# Two observations: A1: [1,3], A2: 4
# Probability of 0.5
test_simple2 <- npsurv(
time = c(1,4),
time2 = c(3,4),
event = c(3,1),
type = "interval"
)
Distribution function of the Beard-Makeham distribution
Description
Distribution function of the Beard-Makeham distribution
Density function of the Beard-Makeham distribution
Hazard function of the Beard-Makeham distribution
Quantile function of the Beard-Makeham distribution
Usage
pbeardmake(
q,
rate = 1,
shape1 = 1,
shape2 = 1,
lambda = 0,
lower.tail = TRUE,
log.p = FALSE
)
dbeardmake(x, rate = rate, shape1 = 1, shape2 = 1, lambda = 0, log = FALSE)
hbeardmake(x, rate = rate, shape1 = 1, shape2 = 1, lambda = 0, log = FALSE)
qbeardmake(p, rate = 1, shape1 = 1, shape2 = 1, lambda = 0, lower.tail = TRUE)
Arguments
q |
vector of quantiles. |
rate |
shape parameter ( |
shape1 |
shape parameter ( |
shape2 |
shape parameter ( |
lambda |
exponential rate of the Makeham component |
lower.tail |
logical; if |
log.p |
logical; if |
x |
vector of quantiles. |
log |
logical; if |
p |
vector of probabilities. |
Value
a vector of (log)-probabilities of the same length as q
a vector of (log)-density.
a vector of (log)-hazard.
a vector of quantiles
Distribution function of the extended generalized Pareto distribution
Description
Distribution function of the extended generalized Pareto distribution
Usage
pextgp(q, scale = 1, shape1 = 0, shape2 = 0, lower.tail = TRUE, log.p = FALSE)
Arguments
q |
vector of quantiles. |
scale |
scale parameter, strictly positive. |
shape1 |
positive shape parameter |
shape2 |
shape parameter |
lower.tail |
logical; if |
log.p |
logical; if |
Value
a vector of (log)-probabilities of the same length as q
Distribution function of the extended Weibull distribution
Description
Distribution function of the extended Weibull distribution
Density function of the extended Weibull distribution
Hazard function of the extended Weibull distribution
Quantile function of the extended Weibull distribution
Usage
pextweibull(
q,
scale = 1,
shape1 = 0,
shape2 = 1,
lower.tail = TRUE,
log.p = FALSE
)
dextweibull(x, scale = 1, shape1 = 0, shape2 = 1, log = FALSE)
hextweibull(x, scale = 1, shape1 = 0, shape2 = 1, log = FALSE)
qextweibull(p, scale = 1, shape1 = 0, shape2 = 1, lower.tail = TRUE)
Arguments
q |
vector of quantiles. |
scale |
scale parameter, strictly positive. |
shape1 |
shape parameter of the generalized Pareto component. |
shape2 |
shape parameter of the Weibull component. |
lower.tail |
logical; if |
log.p |
logical; if |
x |
vector of quantiles. |
log |
logical; if |
p |
vector of probabilities. |
Value
a vector of (log)-probabilities of the same length as q
a vector of (log)-density.
a vector of (log)-hazard
vector of quantiles
a vector of quantiles
Distribution function of the Gompertz distribution
Description
Distribution function of the Gompertz distribution
Quantile function of the Gompertz distribution
Density function of the Gompertz distribution
Hazard function of the Gompertz distribution
Usage
pgomp(q, scale = 1, shape = 0, lower.tail = TRUE, log.p = FALSE)
qgomp(p, scale = 1, shape = 0, lower.tail = TRUE)
dgomp(x, scale = 1, shape = 0, log = FALSE)
hgomp(x, scale = 1, shape = 0, log = FALSE)
Arguments
q |
vector of quantiles. |
scale |
positive scale parameter |
shape |
non-negative shape parameter |
lower.tail |
logical; if |
log.p |
logical; if |
p |
vector of probabilities. |
x |
vector of quantiles |
log |
logical; if |
Value
a vector of (log)-probabilities of the same length as q
a vector of quantiles
a vector of (log)-density.
a vector of (log)-hazard.
Distribution function of the Gompertz-Makeham distribution
Description
Distribution function of the Gompertz-Makeham distribution
Quantile function of the Gompertz-Makeham distribution
Density function of the Gompertz-Makeham distribution
Hazard function of the Gompertz-Makeham distribution
Usage
pgompmake(
q,
scale = 1,
shape = 0,
lambda = 0,
lower.tail = TRUE,
log.p = FALSE
)
qgompmake(p, scale = 1, shape = 0, lambda = 0, lower.tail = TRUE)
dgompmake(x, scale = 1, shape = 0, lambda = 0, log = FALSE)
hgompmake(x, scale = 1, shape = 0, lambda = 0, log = FALSE)
Arguments
q |
vector of quantiles. |
scale |
scale parameter, strictly positive. |
shape |
shape parameter. |
lambda |
exponential rate |
lower.tail |
logical; if |
log.p |
logical; if |
p |
vector of probabilities. |
x |
vector of quantiles. |
log |
logical; if |
Value
a vector of (log)-probabilities of the same length as q
a vector of quantiles
a vector of density
a vector of (log)-hazard.
Note
The quantile function is defined in terms of Lambert's W function. Particular parameter combinations (small values of lambda
lead to numerical overflow; the function throws a warning when this happens.
Distribution function of the generalized Pareto distribution
Description
Distribution function of the generalized Pareto distribution
Density function of the generalized Pareto distribution
Hazard function of the generalized Pareto distribution
Quantile function of the generalized Pareto distribution
Usage
pgpd(q, loc = 0, scale = 1, shape = 0, lower.tail = TRUE, log.p = FALSE)
dgpd(x, loc = 0, scale = 1, shape = 0, log = FALSE)
hgpd(x, loc = 0, scale = 1, shape = 0, log = FALSE)
qgpd(p, loc = 0, scale = 1, shape = 0, lower.tail = TRUE)
Arguments
q |
vector of quantiles. |
loc |
location parameter. |
scale |
scale parameter, strictly positive. |
shape |
shape parameter. |
lower.tail |
logical; if |
log.p |
logical; if |
x |
vector of quantiles. |
log |
logical; if |
p |
vector of probabilities. |
Value
a vector of (log)-probabilities of the same length as q
a vector of (log)-density.
a vector of (log)-hazard.
a vector of quantiles
Plot empirical distribution function
Description
Plot empirical distribution function
Usage
## S3 method for class 'elife_ecdf'
plot(x, ...)
Arguments
x |
argument of class |
... |
additional arguments for the plot |
Value
base R plot of the empirical distribution function
Plot profile of endpoint
Description
Plot profile of endpoint
Usage
## S3 method for class 'elife_profile'
plot(x, plot.type = c("base", "ggplot"), plot = TRUE, ...)
autoplot.elife_profile(object, ...)
autoplot.elife_tstab_endpoint(object, ...)
Arguments
x |
an object of class |
plot.type |
string indicating whether to use base R for plots or |
plot |
logical; if |
... |
additional arguments to pass to |
object |
object of class |
Value
base R or ggplot object for a plot of the profile log likelihood of the endpoint of the generalized Pareto distribution
Plot threshold stability plot for endpoint
Description
Plot threshold stability plot for endpoint
Usage
## S3 method for class 'elife_tstab_endpoint'
plot(x, plot.type = c("base", "ggplot"), plot = TRUE, ...)
Arguments
x |
an object of class |
plot.type |
string indicating whether to use base R for plots or |
plot |
logical; if |
... |
additional arguments to pass to |
Value
base R or ggplot object for a plot of the profile log likelihood of the endpoint of the generalized Pareto distribution
Distribution function of the Perks-Makeham distribution
Description
Distribution function of the Perks-Makeham distribution
Density function of the Perks-Makeham distribution
Hazard function of the Perks-Makeham distribution
Quantile function of the Perks-Makeham distribution
Usage
pperksmake(
q,
rate = 1,
shape = 1,
lambda = 0,
lower.tail = TRUE,
log.p = FALSE
)
dperksmake(x, rate = 1, shape = 1, lambda = 0, log = FALSE)
hperksmake(x, rate = 1, shape = 1, lambda = 0, log = FALSE)
qperksmake(p, rate = 1, shape = 1, lambda = 0, lower.tail = TRUE)
Arguments
q |
vector of quantiles. |
rate |
rate parameter ( |
shape |
shape parameter ( |
lambda |
exponential rate of the Makeham component |
lower.tail |
logical; if |
log.p |
logical; if |
x |
vector of quantiles. |
log |
logical; if |
p |
vector of probabilities. |
Value
a vector of (log)-probabilities of the same length as q
a vector of (log)-density.
a vector of (log)-hazard.
vector of quantiles
a vector of quantiles
Profile log likelihood for the scale parameter of the exponential distribution
Description
This internal function is used to produce threshold stability plots.
Usage
prof_exp_scale(
mle = NULL,
time,
time2 = NULL,
event = NULL,
thresh = 0,
ltrunc = NULL,
rtrunc = NULL,
type = c("right", "left", "interval", "interval2"),
level = 0.95,
psi = NULL,
weights = NULL,
confint = TRUE,
arguments = NULL,
...
)
Arguments
mle |
an object of class |
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
event |
status indicator, normally 0=alive, 1=dead. Other choices are |
thresh |
vector of thresholds |
ltrunc |
lower truncation limit, default to |
rtrunc |
upper truncation limit, default to |
type |
character string specifying the type of censoring. Possible values are " |
level |
level of the confidence interval |
weights |
weights for observations |
confint |
logical; if |
arguments |
a named list specifying default arguments of the function that are common to all |
... |
additional arguments for optimization, currently ignored. |
Value
a vector of length three with the maximum likelihood of the scale and profile-based confidence interval
Profile log likelihood for the transformed scale parameter of the generalized Pareto distribution
Description
This internal function is used to produce threshold stability plots.
Usage
prof_gp_scalet(
mle = NULL,
time,
time2 = NULL,
event = NULL,
thresh = 0,
ltrunc = NULL,
rtrunc = NULL,
type = c("right", "left", "interval", "interval2"),
level = 0.95,
psi = NULL,
weights = NULL,
confint = TRUE,
arguments = NULL,
...
)
Arguments
mle |
an object of class |
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
event |
status indicator, normally 0=alive, 1=dead. Other choices are |
thresh |
vector of thresholds |
ltrunc |
lower truncation limit, default to |
rtrunc |
upper truncation limit, default to |
type |
character string specifying the type of censoring. Possible values are " |
level |
level of the confidence interval |
weights |
weights for observations |
confint |
logical; if |
arguments |
a named list specifying default arguments of the function that are common to all |
... |
additional arguments for optimization, currently ignored. |
Value
a confidence interval or a list with profile values
Profile log likelihood for the shape parameter of the generalized Pareto distribution
Description
This internal function is used to produce threshold stability plots.
Usage
prof_gp_shape(
mle = NULL,
time,
time2 = NULL,
event = NULL,
thresh,
ltrunc = NULL,
rtrunc = NULL,
type = c("right", "left", "interval", "interval2"),
level = 0.95,
psi = NULL,
weights = NULL,
confint = TRUE,
arguments = NULL,
...
)
Arguments
mle |
an object of class |
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
event |
status indicator, normally 0=alive, 1=dead. Other choices are |
thresh |
vector of thresholds |
ltrunc |
lower truncation limit, default to |
rtrunc |
upper truncation limit, default to |
type |
character string specifying the type of censoring. Possible values are " |
level |
level of the confidence interval |
weights |
weights for observations |
confint |
logical; if |
arguments |
a named list specifying default arguments of the function that are common to all |
... |
additional arguments for optimization, currently ignored. |
Value
if confint=TRUE
, a vector of length three with the maximum likelihood of the shape and profile-based confidence interval
Sample observations from an interval truncated excess lifetime distribution
Description
Sample observations from an interval truncated excess lifetime distribution
Usage
r_ditrunc_elife(
n,
rate,
scale,
shape,
lower,
upper,
family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
"extweibull", "perks", "beard", "perksmake", "beardmake")
)
Arguments
n |
sample size |
rate |
rate parameter(s) |
scale |
scale parameter(s) |
shape |
shape parameter(s) |
lower |
vector of lower bounds |
upper |
vector of upper bounds |
family |
string; choice of parametric family |
Value
a vector of n
observations
Examples
n <- 100L
# the lower and upper bound are either scalar,
# or else vectors of length n
r_dtrunc_elife(n = n, scale = 1, shape = -0.1,
lower = pmax(0, runif(n, -0.5, 1)),
upper = runif(n, 6, 10),
family = "gp")
Sample observations from an interval truncated excess lifetime distribution
Description
Sample observations from an interval truncated excess lifetime distribution
Usage
r_dtrunc_elife(
n,
scale,
rate,
shape,
lower,
upper,
family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
"extweibull", "perks", "beard", "perksmake", "beardmake")
)
Arguments
n |
sample size |
scale |
scale parameter(s) |
rate |
rate parameter(s) |
shape |
shape parameter(s) |
lower |
vector of lower bounds |
upper |
vector of upper bounds |
family |
string; choice of parametric family |
Value
a vector of n
observations
Examples
n <- 100L
# the lower and upper bound are either scalar,
# or else vectors of length n
r_dtrunc_elife(n = n, scale = 1, shape = -0.1,
lower = pmax(0, runif(n, -0.5, 1)),
upper = runif(n, 6, 10),
family = "gp")
Sample observations from a left-truncated right-censored excess lifetime distribution
Description
Sample observations from a left-truncated right-censored excess lifetime distribution
Usage
r_ltrc_elife(
n,
scale,
rate,
shape,
lower,
upper,
family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
"extweibull", "perks", "beard", "perksmake", "beardmake")
)
Arguments
n |
sample size |
scale |
scale parameter |
shape |
shape parameter(s) |
lower |
vector of lower bounds |
upper |
vector of upper bounds above which data are right-truncated |
family |
string; choice of parametric family, either exponential ( |
Value
a list with n
observations dat
and a logical vector of the same length with TRUE
for right-censored observations and FALSE
otherwise.
Examples
n <- 100L
# the lower and upper bound are either scalar,
# or else vectors of length n
r_ltrc_elife(n = n, scale = 5, shape = -0.1,
lower = pmax(0, runif(n, -0.5, 1)),
upper = 5,
family = "gp")
Beard distribution
Description
Distribution function, density, hazard, quantile function and random number generation for the Beard distribution
Usage
rbeard(n, rate, shape1, shape2)
pbeard(q, rate = 1, shape1 = 1, shape2 = 1, lower.tail = TRUE, log.p = FALSE)
dbeard(x, rate = 1, shape1 = 1, shape2 = 1, log = FALSE)
hbeard(x, rate = 1, shape1 = 1, shape2 = 1, log = FALSE)
qbeard(p, rate = 1, shape1 = 1, shape2 = 1, lower.tail = TRUE)
Arguments
n |
sample size |
rate |
rate parameter ( |
shape1 |
shape parameter ( |
shape2 |
shape parameter ( |
q |
vector of quantiles. |
lower.tail |
logical; if |
log.p |
logical; if |
x |
vector of quantiles. |
log |
logical; if |
p |
vector of probabilities. |
Value
a vector of random variates
a vector of (log)-probabilities of the same length as q
a vector of (log)-density.
a vector of (log)-hazard.
a vector of quantiles
Sample lifetime from excess lifetime model
Description
Given parameters of a elife
distribution, sampling window and
birth dates with excess lifetimes, sample new observations; excess lifetime
at c1
are sampled from an exponential distribution, whereas
the birth dates are sampled from a jittered histogram-based distribution
The new excess lifetime above the threshold are right-censored if they exceed
c2
.
Usage
samp2_elife(
n,
scale,
shape,
family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
"extweibull", "perks", "beard", "perksmake", "beardmake"),
xcal,
c1,
c2
)
Arguments
n |
sample size |
scale |
scale parameter(s) |
shape |
shape parameter(s) |
family |
string; choice of parametric family |
xcal |
date at which individual reaches |
c1 |
date, first day of the sampling frame |
c2 |
date, last day of the sampling frame |
Value
list with new birthdates (xcal
), excess lifetime at c1
(ltrunc
),
excess lifetime above u
(dat
) and right-censoring indicator (rightcens
).
Simulate excess lifetime with truncation or right-censoring
Description
This function dispatches simulations accounting for potential left-truncation (remove by setting lower to zero).
If type2=ltrt
, simulated observations will be lower than the upper bounds upper
.
If type2=ltrc
, simulated observations are capped at upper
and the observation is right-censored (rcens=TRUE
).
Usage
samp_elife(
n,
scale,
rate,
shape = NULL,
lower = 0,
upper = Inf,
family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
"extweibull", "perks", "beard", "perksmake", "beardmake"),
type2 = c("none", "ltrt", "ltrc", "ditrunc")
)
Arguments
n |
sample size |
scale |
scale parameter(s) |
rate |
rate parameter(s) |
shape |
shape parameter(s) |
lower |
vector of lower bounds |
upper |
vector of upper bounds |
family |
string; choice of parametric family |
type2 |
string, either |
Value
either a vector of observations or, if type2=ltrc
, a list with n
observations dat
and a logical vector of the same length with TRUE
for right-censored observations and FALSE
otherwise.
Note
As the tails of the Gompertz and Gompertz–Makeham models decrease exponentially fast, the method fails in the rare event case if the lower bound is too large (say larger than the 99.99
Examples
set.seed(1234)
n <- 500L
# Simulate interval truncated data
x <- samp_elife(n = n,
scale = 2,
shape = 1.5,
lower = low <- runif(n),
upper = upp <- runif(n, min = 3, max = 15),
type2 = "ltrt",
family = "weibull")
coef(fit_elife(
time = x,
ltrunc = low,
rtrunc = upp,
family = "weibull"))
# Simulate left-truncated right-censored data
x <- samp_elife(n = n,
scale = 2,
shape = 1.5,
lower = low <- runif(n),
upper = upp <- runif(n, min = 3, max = 15),
type2 = "ltrc",
family = "gomp")
#note that the return value is a list...
coef(fit_elife(
time = x$dat,
ltrunc = low,
event = !x$rcens,
family = "gomp"))
Likelihood ratio test for doubly interval truncated data
Description
This function fits separate models for each distinct value of covariates and computes a likelihood ratio test to test whether there are significant differences between groups.
Usage
test_ditrunc_elife(
time,
covariate,
thresh = 0,
ltrunc1 = NULL,
rtrunc1 = NULL,
ltrunc2 = NULL,
rtrunc2 = NULL,
family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "extweibull", "perks",
"beard", "perksmake", "beardmake"),
weights = rep(1, length(time)),
arguments = NULL,
...
)
Arguments
time |
excess time of the event of follow-up time, depending on the value of event |
covariate |
vector of factors, logical or integer whose distinct values are |
thresh |
vector of thresholds |
family |
string; choice of parametric family |
weights |
weights for observations |
arguments |
a named list specifying default arguments of the function that are common to all |
... |
additional arguments for optimization, currently ignored. |
Value
a list with elements
-
stat
: likelihood ratio statistic -
df
: degrees of freedom -
pval
: the p-value obtained from the asymptotic chi-square approximation.
Likelihood ratio test for covariates
Description
This function fits separate models for each distinct
value of the factor covariate
and computes a likelihood ratio test
to test whether there are significant differences between
groups.
Usage
test_elife(
time,
time2 = NULL,
event = NULL,
covariate,
thresh = 0,
ltrunc = NULL,
rtrunc = NULL,
type = c("right", "left", "interval", "interval2"),
family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "extweibull", "perks",
"perksmake", "beard", "beardmake"),
weights = rep(1, length(time)),
arguments = NULL,
...
)
Arguments
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
event |
status indicator, normally 0=alive, 1=dead. Other choices are |
covariate |
vector of factors, logical or integer whose distinct values define groups |
thresh |
vector of thresholds |
ltrunc |
lower truncation limit, default to |
rtrunc |
upper truncation limit, default to |
type |
character string specifying the type of censoring. Possible values are " |
family |
string; choice of parametric family |
weights |
weights for observations |
arguments |
a named list specifying default arguments of the function that are common to all |
... |
additional arguments for optimization, currently ignored. |
Value
a list with elements
-
stat
likelihood ratio statistic -
df
degrees of freedom -
pval
the p-value obtained from the asymptotic chi-square approximation.
Examples
test <- with(subset(dutch, ndays > 39082),
test_elife(
time = ndays,
thresh = 39082L,
covariate = gender,
ltrunc = ltrunc,
rtrunc = rtrunc,
family = "exp"))
test
Threshold stability plots
Description
The generalized Pareto and exponential distribution
are threshold stable. This property, which is used
for extrapolation purposes, can also be used to diagnose
goodness-of-fit: we expect the parameters \xi
and \tilde{\sigma} = \sigma + \xi u
to be constant over a range of thresholds. The threshold stability
plot consists in plotting maximum likelihood estimates with pointwise confidence interval.
This function handles interval truncation and right-censoring.
Usage
tstab(
time,
time2 = NULL,
event = NULL,
thresh = 0,
ltrunc = NULL,
rtrunc = NULL,
type = c("right", "left", "interval", "interval2"),
family = c("gp", "exp"),
method = c("wald", "profile"),
level = 0.95,
plot = TRUE,
plot.type = c("base", "ggplot"),
which.plot = c("scale", "shape"),
weights = NULL,
arguments = NULL,
...
)
Arguments
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
event |
status indicator, normally 0=alive, 1=dead. Other choices are |
thresh |
vector of thresholds |
ltrunc |
lower truncation limit, default to |
rtrunc |
upper truncation limit, default to |
type |
character string specifying the type of censoring. Possible values are " |
family |
string; distribution, either generalized Pareto ( |
method |
string; the type of pointwise confidence interval, either Wald ( |
level |
probability level for the pointwise confidence intervals |
plot |
logical; should a plot be returned alongside with the estimates? Default to |
plot.type |
string; either |
which.plot |
string; which parameters to plot; |
weights |
weights for observations |
arguments |
a named list specifying default arguments of the function that are common to all |
... |
additional arguments for optimization, currently ignored. |
Details
The shape estimates are constrained
Value
an invisible list with pointwise estimates and confidence intervals for the scale and shape parameters
See Also
tstab.gpd
from package mev
, gpd.fitrange
from package ismev
or tcplot
from package evd
, among others.
Examples
set.seed(1234)
n <- 100L
x <- samp_elife(n = n,
scale = 2,
shape = -0.2,
lower = low <- runif(n),
upper = upp <- runif(n, min = 3, max = 20),
type2 = "ltrt",
family = "gp")
tstab_plot <- tstab(time = x,
ltrunc = low,
rtrunc = upp,
thresh = quantile(x, seq(0, 0.5, length.out = 4)))
plot(tstab_plot, plot.type = "ggplot")
Turnbull's sets
Description
Given censoring and truncation set, compute the regions of the real line that get positive mass and over which the distribution function is well-defined.
Usage
turnbull_intervals(time, time2 = NULL, status, ltrunc = NULL, rtrunc = NULL)
Arguments
time |
excess time of the event of follow-up time, depending on the value of event |
time2 |
ending excess time of the interval for interval censored data only. |
ltrunc |
lower truncation limit, default to |
rtrunc |
upper truncation limit, default to |
Value
a matrix with two columns containing the left
and right
bounds Of Turnbull sets
Note
The function adds the square root of the machine tolerance to left bounds of interval censored data so they are open.
References
Frydman, H. (1994). A Note on Nonparametric Estimation of the Distribution Function from Interval-Censored and Truncated Observations, Journal of the Royal Statistical Society. Series B (Methodological) 56(1), 71-74.
Turnbull, B. W. (1976). The Empirical Distribution Function with Arbitrarily Grouped, Censored and Truncated Data. Journal of the Royal Statistical Society, Series B 38, 290-295.