Help for package longevity

Type:

Package

Title:

Statistical Methods for the Analysis of Excess Lifetimes

Version:

1.2.1

BugReports:

https://github.com/lbelzile/longevity/issues

URL:

https://lbelzile.github.io/longevity/

Depends:

R (≥ 4.0.0)

Imports:

numDeriv, Rcpp (≥ 1.0.6), rlang, Rsolnp

Suggests:

knitr, ggplot2 (≥ 3.0.0), tinytest, rmarkdown

LinkingTo:

Rcpp, RcppArmadillo

Description:

A collection of parametric and nonparametric methods for the analysis of survival data. Parametric families implemented include Gompertz-Makeham, exponential and generalized Pareto models and extended models. The package includes an implementation of the nonparametric maximum likelihood estimator for arbitrary truncation and censoring pattern based on Turnbull (1976) <doi:10.1111/j.2517-6161.1976.tb01597.x>, along with graphical goodness-of-fit diagnostics. Parametric models for positive random variables and peaks over threshold models based on extreme value theory are described in Rootzén and Zholud (2017) <doi:10.1007/s10687-017-0305-5>; Belzile et al. (2021) <doi:10.1098/rsos.202097> and Belzile et al. (2022) <doi:10.1146/annurev-statistics-040120-025426>.

License:

GPL-3

Encoding:

UTF-8

LazyData:

true

RoxygenNote:

7.3.2

VignetteBuilder:

knitr

NeedsCompilation:

yes

Packaged:

2025-07-03 20:52:41 UTC; lbelzile

Author:

Leo Belzile

[aut, cre]

Maintainer:

Leo Belzile <belzilel@gmail.com>

Repository:

CRAN

Date/Publication:

2025-07-03 21:20:02 UTC

Identification sets

Description

Identification sets

Usage

.censTruncLimits(tsets, lcens, rcens, ltrunc, rtrunc, trunc, cens)

Arguments

tsets

Turnbull's sets

lcens

numeric vector of left censoring

rcens

numeric vector of right censoring

ltrunc

numeric vector of left truncation

rtrunc

numeric vector of right truncation

trunc

logical are observation truncated?

Value

a matrix with the bounds of the intervals for Turnbull sets

Identification sets for double truncated data

Description

Identification sets for double truncated data

Usage

.censTruncLimitsDtrunc(tsets, lcens, rcens, ltrunc, rtrunc, trunc, cens)

Arguments

tsets

Turnbull's sets

lcens

numeric vector of left censoring

rcens

numeric vector of right censoring

ltrunc

numeric matrix of left truncation

rtrunc

numeric matrix of right truncation

trunc

logical are observation truncated?

Value

a matrix with the bounds of the intervals for Turnbull sets

Check survival output

Description

Check survival output

Usage

.check_surv(
  time,
  time2 = NULL,
  event = NULL,
  type = c("right", "left", "interval", "interval2")
)

Arguments

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

event

status indicator, normally 0=alive, 1=dead. Other choices are TRUE/FALSE (TRUE for death). For interval censored data, the status indicator is 0=right censored, 1=event at time, 2=left censored, 3=interval censored. Although unusual, the event indicator can be omitted, in which case all subjects are assumed to have experienced an event.

type

character string specifying the type of censoring. Possible values are "right", "left", "interval", "interval2".

Value

a list with transformed inputs or an error

Turnbull EM algorithm (low storage implementation)

Description

Turnbull EM algorithm (low storage implementation)

Usage

.turnbull_em(
  tsets,
  lcens,
  rcens,
  ltrunc,
  rtrunc,
  weights,
  cens = TRUE,
  trunc = TRUE,
  tol = 1e-12,
  zerotol = 1e-10,
  maxiter = 100000L
)

Arguments

tsets

Turnbull's sets

lcens

numeric vector of left censoring

rcens

numeric vector of right censoring

ltrunc

numeric vector of left truncation

rtrunc

numeric vector of right truncation

weights

vector of weights for observations

cens

logical; if FALSE, then censUpp = censLow and a particular update can be avoided in the EM algorithm

tol

tolerance level for terminating the EM algorithm

maxiter

maximum number of iteration for the EM algorithm

Value

a list with the probabilities and the standard errors

Turnbull EM algorithm (low storage implementation)

Description

Turnbull EM algorithm (low storage implementation)

Usage

.turnbull_em_dtrunc(
  tsets,
  lcens,
  rcens,
  ltrunc,
  rtrunc,
  weights,
  cens = TRUE,
  trunc = TRUE,
  tol = 1e-12,
  zerotol = 1e-10,
  maxiter = 100000L
)

Arguments

tsets

Turnbull's sets

lcens

numeric vector of left censoring

rcens

numeric vector of right censoring

ltrunc

numeric vector of left truncation

rtrunc

numeric vector of right truncation

weights

vector of weights for observations

cens

logical; if FALSE, then censUpp = censLow and a particular update can be avoided in the EM algorithm

tol

tolerance level for terminating the EM algorithm

maxiter

maximum number of iteration for the EM algorithm

Value

a list with the probabilities and the standard errors

Turnbull's sets

Description

Given truncation and censoring sets, construct disjoint increasing intervals whose left and right endpoints lie in L and R and which contain no other members of L and R

Usage

.turnbull_intervals(Lcens, Rcens, Ltrunc, Rtrunc, status)

Arguments

Lcens

set of left censoring limits

Rcens

vector of right censoring limits

Ltrunc

vector of left truncation limits

Rtrunc

vector of right truncation limits

status

integer vector giving status of censoring set

Value

a matrix containing limits of intervals for EM

Weighted empirical distribution function

Description

Weighted empirical distribution function

Usage

.wecdf(x, w, type = c("dist", "surv"))

Arguments

x

vector of length n of values

w

vector of weights of length n

type

string, one of distribution function (dist) or survival function (surv)

Author(s)

Adapted from spatstat (c) Adrian Baddeley and Rolf Turner

P-value plot for Northrop and Coleman diagnostic

Description

The Northrop-Coleman tests for penultimate models are comparing the piece-wise generalized Pareto distribution to the generalized Pareto above the lower threshold.

Usage

autoplot.elife_northropcoleman(object, ...)

## S3 method for class 'elife_northropcoleman'
plot(x, plot.type = c("base", "ggplot"), plot = TRUE, ...)

Arguments

object

object of class elife_northropcoleman, with the fitted piecewise-constant generalized Pareto model

...

additional arguments for base plot

x

an object of class elife_northropcoleman

plot.type

string indicating the type of plot

plot

logical; should the routine print the graph if plot.type equals "ggplot"? Default to TRUE.

Value

a base R or ggplot object with p-values for the Northrop-Coleman test against thresholds.

Goodness-of-fit plots for parametric models

Description

Because of censoring and truncation, the plotting positions must be adjusted accordingly. For right-censored data, the methodology is described in Waller & Turnbull (1992). Only non-censored observations are displayed, which can create distortion.

Usage

autoplot.elife_par(object, ...)

## S3 method for class 'elife_par'
plot(
  x,
  plot.type = c("base", "ggplot"),
  which.plot = c("pp", "qq"),
  confint = c("none", "pointwise", "simultaneous"),
  plot = TRUE,
  ...
)

Arguments

object

an object of class elife_par containing the fitted parametric model

...

additional arguments, currently ignored by the function.

x

a parametric model of class elife_par

plot.type

string, one of base for base R or ggplot

which.plot

vector of string indicating the plots, among pp for probability-probability plot, qq for quantile-quantile plot, erp for empirically rescaled plot (only for censored data), exp for standard exponential quantile-quantile plot or tmd for Tukey's mean difference plot, which is a variant of the Q-Q plot in which we map the pair (x,y) is mapped to ((x+y)/2,y-x) are detrended, dens and cdf return the empirical distribution function with the fitted parametric density or distribution function curve superimposed.

confint

logical; if TRUE, creates uncertainty diagnostic via a parametric bootstrap

plot

logical; if TRUE, creates a plot when plot.type="ggplot". Useful for returning ggplot objects without printing the graphs

Details

For truncated data, we first estimate the distribution function nonparametrically, F_n. The uniform plotting positions of the data

v_i = [F_n(y_i) - F_n(a_i)]/[F_n(b_i) - F_n(a_i)].

For probability-probability plots, the empirical quantiles are transformed using the same transformation, with F_n replaced by the postulated or estimated distribution function F_0. For quantile-quantile plots, the plotting positions v_i are mapped back to the data scale viz.

F_0^{-1}\{F_0(a_i) + v_i[F_0(b_i) - F_0(a_i)]\}

When data are truncated and observations are mapped back to the untruncated scale (with, e.g., exp), the plotting positions need not be in the same order as the order statistics of the data.

Value

The function produces graphical goodness-of-fit plots using base R or ggplot objects (returned as an invisible list).

Examples

set.seed(1234)
samp <- samp_elife(
 n = 200,
 scale = 2,
 shape = 0.3,
 family = "gomp",
 lower = 0, upper = runif(200, 0, 10),
 type2 = "ltrc")
fitted <- fit_elife(
 time = samp$dat,
 thresh = 0,
 event = ifelse(samp$rcens, 0L, 1L),
 type = "right",
 family = "exp",
 export = TRUE)
plot(fitted, plot.type = "ggplot")
# Left- and right-truncated data
n <- 40L
samp <- samp_elife(
 n = n,
 scale = 2,
 shape = 0.3,
 family = "gp",
 lower = ltrunc <- runif(n),
 upper = rtrunc <- ltrunc + runif(n, 0, 15),
 type2 = "ltrt")
fitted <- fit_elife(
 time = samp,
 thresh = 0,
 ltrunc = ltrunc,
 rtrunc = rtrunc,
 family = "gp",
 export = TRUE)
plot(fitted,  which.plot = c("tmd", "dens"))

Threshold stability plots

Description

Threshold stability plots

Usage

autoplot.elife_tstab(object, ...)

## S3 method for class 'elife_tstab'
plot(
  x,
  plot.type = c("base", "ggplot"),
  which.plot = c("scale", "shape"),
  plot = TRUE,
  ...
)

Arguments

object

object of class elife_tstab, representing parameter estimates to draw threshold stability plots

...

additional arguments, currently ignored by the function.

x

an object of class elife_tstab containing the fitted parameters as a function of threshold

plot.type

string, one of base for base R or ggplot

which.plot

plot

logical; if TRUE, creates a plot when plot.type="ggplot". Useful for returning ggplot objects without printing the graphs

Box-Cox transformation function

Description

Given a vector of parameters, apply the Box-Cox transformation.

Usage

boxcox_transfo(x, lambda = rep(1, length(x)))

Arguments

x

vector of arguments

lambda

vector of Box-Cox parameters

Value

a vector of the same length as x

Check default arguments

Description

Check arguments and override default values. If a named list, arguments, is provided by the user, it will override any default value. If one of the argument is provided directly, it will take precedence over the values in arguments, provided it is not a default value.

Usage

check_arguments(func, call, arguments = NULL)

Arguments

func

function whose parameters are to be superseded

call

user call, obtained from match.call(expand.dots = FALSE)

arguments

named list with arguments

Value

a named list with all arguments

Check parameters of extended lifetime distributions

Description

Check parameters of extended lifetime distributions

Usage

check_elife_dist(
  rate,
  scale,
  shape,
  family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "extweibull", "perks",
    "beard", "perksmake", "beardmake")
)

Arguments

rate

rate parameter(s); for models with Makeham component, the last entry should be part of the rate vector

scale

scale parameter

shape

vector of shape parameter(s).

family

string indicating the parametric model, one of exp, gp, gomp, gompmake, weibull, extgp, extweibull, perks, perksmake, beard and beardmake

Value

The function has no return value and is only used to throw error for invalid arguments.

Confidence intervals for profile likelihoods

Description

This code is adapted from the mev package.

Usage

conf_interv(
  object,
  level = 0.95,
  prob = c((1 - level)/2, 1 - (1 - level)/2),
  print = FALSE,
  ...
)

Arguments

object

a list containing information about the profile likelihood in the same format as the hoa package

level

probability level of the confidence interval

prob

vector of length 2 containing the bounds, by default double-sided

print

logical indicating whether the intervals are printed to the console

...

additional arguments passed to the function

Value

a table with confidence intervals.

Density function of the extended generalized Pareto distribution

Description

Density function of the extended generalized Pareto distribution

Hazard function of the extended generalized Pareto distribution

Quantile function of the extended generalized Pareto distribution

Usage

dextgp(x, scale = 1, shape1 = 0, shape2 = 0, log = FALSE)

hextgp(x, scale = 1, shape1 = 0, shape2 = 0, log = FALSE)

qextgp(p, scale = 1, shape1 = 0, shape2 = 0, lower.tail = TRUE)

Arguments

x

vector of quantiles.

scale

scale parameter, strictly positive.

shape1

positive shape parameter \beta; model defaults to generalized Pareto when it equals zero.

shape2

shape parameter \gamma; model reduces to Gompertz when shape2=0.

log

logical; if TRUE, return the log hazard

p

vector of probabilities.

lower.tail

logical; if TRUE (default), the lower tail probability \Pr(X \leq x) is returned.

Value

a vector of (log)-density of the same length as x

a vector of (log)-hazard of the same length as x

a vector of quantiles

Dutch survival data

Description

This data frame contains information about all Dutch who died above age 92 years between 1986 and 2015. Observations are doubly truncated and such bounds are calculated based on the range of plausible values for these variables. There are 226 records that are interval-censored and interval-truncated for which bdate, ddate and ndays is missing (NA).

Usage

dutch

Format

A data frame with 305143 rows and 11 variables:

ndays: survival time (in days)
bdate: the smallest plausible birth date given information about month of birth and death and survival (Date)
bmonth: month of birth
byear: year of birth
ddate: the largest plausible death date given information about month of birth and death and survival (Date)
dmonth: month of death
dyear: year of death
ltrunc: minimum age (in days); the maximum of either 92 years or the number of days reached in 1986
rtrunc: maximum age (in days) an individual could have reached by the end of 2015
gender: factor indicating gender of individual, either female or male
valid: quality flag; A for individuals born in the Netherlands, B for individuals born abroad who died in the Netherlands

Source

Statistics Netherlands (CBS). Accessed via the Supplemental material of Einmahl, Einmahl and de Haan (2019)

References

Einmahl, J.J., J.H.J. Einmahl and L. de Haan (2019). Limits to Human Life Span Through Extreme Value Theory, Journal of the American Statistical Association, 114(527), 1075-1080. doi:10.1080/01621459.2018.1537912

Excess lifetime distributions

Description

Quantile, distribution, density and hazard functions of excess lifetime distribution for threshold exceedances.

Usage

qelife(
  p,
  rate,
  scale,
  shape,
  family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "extweibull", "perks",
    "perksmake", "beard", "beardmake"),
  lower.tail = TRUE
)

pelife(
  q,
  rate,
  scale,
  shape,
  family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "extweibull", "perks",
    "perksmake", "beard", "beardmake"),
  lower.tail = TRUE,
  log.p = FALSE
)

selife(
  q,
  rate,
  scale,
  shape,
  family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "extweibull", "perks",
    "perksmake", "beard", "beardmake"),
  log.p = FALSE
)

relife(
  n,
  scale = 1,
  rate,
  shape,
  family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "extweibull", "perks",
    "perksmake", "beard", "beardmake")
)

delife(
  x,
  scale = 1,
  rate,
  shape,
  family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "extweibull", "perks",
    "perksmake", "beard", "beardmake"),
  log = FALSE
)

helife(
  x,
  scale = 1,
  rate,
  shape,
  family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "extweibull", "perks",
    "perksmake", "beard", "beardmake"),
  log = FALSE
)

Arguments

p

vector of probabilities

rate

rate parameter(s); for models with Makeham component, the last entry should be part of the rate vector

scale

scale parameter

shape

vector of shape parameter(s).

family

string indicating the parametric model, one of exp, gp, gomp, gompmake, weibull, extgp, extweibull, perks, perksmake, beard and beardmake

lower.tail

logical; if TRUE (default), the lower tail probability \Pr(X \leq x) is returned.

q

vector of quantiles.

n

sample size

log, log.p

logical; if TRUE, values are returned on the logarithmic scale (default to FALSE).

Value

depending on the function type, a vector of probabilities (pelife), survival probabilities (selife), quantiles (qelife), density (delife), or hazard (helife). The function relife returns a random sample of size n from the distribution.

Profile likelihood for the endpoint of the generalized Pareto distribution

Description

This function returns the profile log likelihood over a grid of values of psi, the endpoints.

Usage

endpoint.profile(
  time,
  time2 = NULL,
  event = NULL,
  thresh = 0,
  type = c("right", "left", "interval", "interval2"),
  ltrunc = NULL,
  rtrunc = NULL,
  weights = rep(1, length(time)),
  psi = NULL,
  confint = FALSE,
  level = 0.95,
  arguments = NULL,
  ...
)

Arguments

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

event

thresh

vector of thresholds

type

character string specifying the type of censoring. Possible values are "right", "left", "interval", "interval2".

ltrunc

lower truncation limit, default to NULL

rtrunc

upper truncation limit, default to NULL

weights

weights for observations

psi

mandatory vector of endpoints at which to compute the profile

confint

logical; if TRUE, return a level confidence interval instead of a list with the profile log-likelihood components

level

numeric; the level for the confidence intervals

arguments

a named list specifying default arguments of the function that are common to all elife calls

...

additional parameters, currently ignored

Value

a list with the maximum likelihood estimate of the endpoint and the profile log-likelihood

Examples

set.seed(2023)
time <- relife(n = 100, scale = 3, shape = -0.3, family = "gp")
endpt <- endpoint.profile(
  time = time,
  psi = seq(max(time) + 1e-4, max(time) + 40, length.out = 51L))
print(endpt)
plot(endpt)
confint(endpt)

Threshold stability plots for endpoint

Description

This function calls endpoint.profile to compute the endpoint with profile likelihood confidence intervals at each threshold. It then returns a data frame and a plot

Usage

endpoint.tstab(
  time,
  time2 = NULL,
  event = NULL,
  thresh = 0,
  type = c("right", "left", "interval", "interval2"),
  ltrunc = NULL,
  rtrunc = NULL,
  weights = rep(1, length(time)),
  psi = NULL,
  confint = FALSE,
  level = 0.95,
  arguments = NULL,
  plot = TRUE,
  ...
)

prof_gp_endpt(
  time,
  time2 = NULL,
  event = NULL,
  thresh = 0,
  type = c("right", "left", "interval", "interval2"),
  ltrunc = NULL,
  rtrunc = NULL,
  weights = rep(1, length(time)),
  psi = NULL,
  confint = FALSE,
  level = 0.95,
  arguments = NULL,
  ...
)

Arguments

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

event

thresh

vector of thresholds

type

character string specifying the type of censoring. Possible values are "right", "left", "interval", "interval2".

ltrunc

lower truncation limit, default to NULL

rtrunc

upper truncation limit, default to NULL

weights

weights for observations

psi

mandatory vector of endpoints at which to compute the profile

confint

logical; if TRUE, return a level confidence interval instead of a list with the profile log-likelihood components

level

numeric; the level for the confidence intervals

arguments

a named list specifying default arguments of the function that are common to all elife calls

plot

logical; if TRUE, return a plot

...

additional parameters, currently ignored

Value

a data frame with the threshold, endpoint estimates and profile confidence intervals

England and Wales simulated supercentenarian data

Description

This data frame contains information about 179 fake records mimicking Welsh and English who died age 110 and above

Usage

ewsim

Format

A data frame with 179 rows and 3 variables:

time: survival time above 110 (in years)
ltrunc: minimum age above 110 (in years), or zero

;

rtrunc: maximum age (in years) an individual could have reached by the end of the time frame

Fit excess lifetime models for doubly interval truncated data

Description

This function is a wrapper around constrained optimization routines for different models with non-informative censoring and truncation patterns.

Usage

fit_ditrunc_elife(
  time,
  ltrunc1 = NULL,
  rtrunc1 = NULL,
  ltrunc2 = NULL,
  rtrunc2 = NULL,
  thresh = 0,
  family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
    "extweibull", "perks", "beard", "perksmake", "beardmake"),
  weights = NULL,
  export = FALSE,
  start = NULL,
  restart = FALSE,
  arguments = NULL,
  ...
)

Arguments

time

excess time of the event of follow-up time, depending on the value of event

ltrunc1

lower truncation limit, default to NULL

rtrunc1

upper truncation limit, default to NULL

ltrunc2

lower truncation limit, default to NULL

rtrunc2

upper truncation limit, default to NULL

thresh

vector of thresholds

family

string; choice of parametric family, either exponential (exp), Weibull (weibull), generalized Pareto (gp), Gompertz (gomp), Gompertz-Makeham (gompmake) or extended generalized Pareto (extgp).

weights

weights for observations

export

logical; should data be included in the returned object to produce diagnostic plots? Default to FALSE.

start

vector of starting values for the optimization routine. If NULL, the algorithm attempts to find default values and returns a warning with false convergence diagnostic if it cannot.

restart

logical; should multiple starting values be attempted? Default to FALSE.

arguments

a named list specifying default arguments of the function that are common to all elife calls

...

additional arguments for optimization, currently ignored.

Value

an object of class elife_par

Note

The extended generalized Pareto model is constrained to avoid regions where the likelihood is flat so \xi \in [-1, 10] in the optimization algorithm.

The standard errors are obtained via the observed information matrix, calculated using the hessian. In many instances, such as when the shape parameter is zero or negative, the hessian is singular and no estimates are returned.

Fit excess lifetime models by maximum likelihood

Description

This function is a wrapper around constrained optimization routines for different models with non-informative censoring and truncation patterns.

Usage

fit_elife(
  time,
  time2 = NULL,
  event = NULL,
  type = c("right", "left", "interval", "interval2"),
  ltrunc = NULL,
  rtrunc = NULL,
  thresh = 0,
  status = NULL,
  family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "gppiece",
    "extweibull", "perks", "perksmake", "beard", "beardmake"),
  weights = NULL,
  export = FALSE,
  start = NULL,
  restart = FALSE,
  arguments = NULL,
  check = FALSE,
  ...
)

Arguments

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

event

type

character string specifying the type of censoring. Possible values are "right", "left", "interval", "interval2".

ltrunc

lower truncation limit, default to NULL

rtrunc

upper truncation limit, default to NULL

thresh

vector of thresholds

status

integer vector giving status of an observation. If NULL (default), this argument is computed internally based on type.

family

string; choice of parametric family

weights

weights for observations

export

logical; should data be included in the returned object to produce diagnostic plots? Default to FALSE.

start

vector of starting values for the optimization routine. If NULL, the algorithm attempts to find default values and returns a warning with false convergence diagnostic if it cannot.

restart

logical; should multiple starting values be attempted? Default to FALSE.

arguments

a named list specifying default arguments of the function that are common to all elife calls

check

logical; if TRUE, fit all submodels to ensure that simpler models fit worst or as well

...

additional parameters, currently ignored

Value

an object of class elife_par

Note

The extended generalized Pareto model is constrained to avoid regions where the likelihood is flat so \xi \in [-1, 10] in the optimization algorithm.

Examples

data(ewsim, package = "longevity")
fit1 <- fit_elife(
   arguments = ewsim,
   export = TRUE,
   family = "exp")
fit2 <- fit_elife(
   arguments = ewsim,
   export = TRUE,
   family = "gp")
plot(fit1)
summary(fit1)
anova(fit2, fit1)

Piece-wise generalized Pareto distribution

Description

Density, distribution, quantile functions and random number generation from the mixture model of Northrop and Coleman (2014), which consists of m different generalized Pareto distributions over non-overlapping intervals with m shape parameters and one scale parameter; the other scale parameters are constrained so that the resulting distribution is continuous over the domain and reduces to a generalized Pareto distribution if all of the shape parameters are equal.

Usage

dgppiece(x, scale, shape, thresh, log = FALSE)

pgppiece(q, scale, shape, thresh, lower.tail = TRUE, log.p = FALSE)

qgppiece(p, scale, shape, thresh, lower.tail = TRUE, log.p = FALSE)

rgppiece(n, scale, shape, thresh)

Arguments

x, q

vector of quantiles

scale

positive value for the first scale parameter

shape

vector of m shape parameters

thresh

vector of m thresholds

log, log.p

logical; if TRUE, the values are returned on the log scale

lower.tail

logical; if TRUE (default), probabilities are \Pr[X \leq x] otherwise, \Pr[X > x].

p

vector of probabilities

n

sample size

Value

a vector of quantiles (qgppiece), probabilities (pgppiece), density (dgppiece) or random number generated from the model (rgppiece)

References

Northrop & Coleman (2014). Improved threshold diagnostic plots for extreme value analyses, Extremes, 17(2), 289–303.

Profile likelihood for hazard

Description

This function computes the hazard for different elife parametric models with profile-likelihood based confidence intervals. It is also used to provide local hazard plots at varying thresholds.

Usage

hazard_elife(
  x,
  time,
  time2 = NULL,
  event = NULL,
  status = NULL,
  thresh = 0,
  ltrunc = NULL,
  rtrunc = NULL,
  type = c("right", "left", "interval", "interval2"),
  family = c("exp", "gp", "gomp", "weibull", "extgp"),
  weights = rep(1, length(time)),
  level = 0.95,
  psi = NULL,
  plot = FALSE,
  arguments = NULL,
  ...
)

Arguments

x

value of the threshold exceedance at which to estimate the hazard

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

event

status

integer vector giving status of an observation. If NULL (default), this argument is computed internally based on type.

thresh

vector of thresholds

ltrunc

lower truncation limit, default to NULL

rtrunc

upper truncation limit, default to NULL

type

character string specifying the type of censoring. Possible values are "right", "left", "interval", "interval2".

family

string; choice of parametric family

weights

weights for observations

level

numeric; the level for the confidence intervals. Default to 0.95

psi

optional vector of hazard at which to compute the profile log likelihood

plot

logical; if true, display the profile log-likelihood. Default to FALSE.

arguments

a named list specifying default arguments of the function that are common to all elife calls

...

additional arguments for optimization, currently ignored.

Value

an invisible object of class elife_hazard containing information about the profile likelihood

Examples

n <- 2500
time <- samp_elife(n = n, scale = 2,
family = "gp", shape = 0.1,
lower = ltrunc <- runif(n),
upper = rtrunc <- (5 + runif(n)), type2 = "ltrt")
hazard_elife(x = 2, time = time,
 ltrunc = ltrunc, rtrunc = rtrunc, family = "exp")

Hazard function for various parametric models

Description

Hazard function for various parametric models

Usage

hazard_fn_elife(
  x,
  par,
  family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp")
)

Arguments

x

vector of points at which to evaluate the hazard function

par

vector of scale and shape parameters

family

string; choice of parametric family

Value

a vector with the value of the hazard function at x

Hazard function of the exponential distribution

Description

Hazard function of the exponential distribution

Usage

hexp(x, rate = 1, log = FALSE)

Arguments

x

vector of quantiles

rate

rate vector of rates

log

logical; if FALSE (default), return the log hazard

Value

a vector of (log)-hazard.

Hazard function of the Perks distribution

Description

Hazard function of the Perks distribution

Distribution function of the Perks distribution

Density function of the Perks distribution

Quantile function of the Perks distribution

Usage

hperks(x, rate = 1, shape = 1, log = FALSE)

pperks(q, rate = 1, shape = 1, lower.tail = TRUE, log.p = FALSE)

dperks(x, rate = 1, shape = 1, log = FALSE)

qperks(p, rate = 1, shape = 1, lower.tail = TRUE)

Arguments

x

vector of quantiles.

rate

rate parameter (\nu)

shape

shape parameter (\alpha)

log

logical; if FALSE (default), return the density, else the log likelihood of the individual observations.

q

vector of quantiles.

lower.tail

logical; if TRUE (default), the lower tail probability \Pr(X \leq x) is returned.

log.p

logical; if FALSE (default), values are returned on the probability scale.

p

vector of probabilities.

Value

a vector of (log)-hazard.

a vector of (log)-probabilities of the same length as q

a vector of (log)-density.

a vector of quantiles

Hazard function of the Weibull distribution

Description

Hazard function of the Weibull distribution

Usage

hweibull(x, shape, scale = 1, log = FALSE)

Arguments

x

vector of quantiles

shape

shape parameter

scale

scale parameter, default to 1

log

logical; if TRUE, returns the log hazard

Value

a vector of (log)-hazard

IDL metadata

Description

This data frame contains country codes and the associated data collection period corresponding to the range for age at death.

Usage

idlmetadata

Format

A data frame with 21 rows and 4 variables:

country: factor, one of AUT (Austria), BEL (Belgium), CAN (Quebec), DEU (Germany), DNK (Denmark), ESP (Spain), FIN (Finland), FRA (France), JPN (Japan), NOR (Norway), SWE (Sweden), EAW (England and Wales) and USA (United States of America)
group: factor, either 105-109 for semi-supercentenarians or 110+ for supercentenarians"
ldate: Date, smallest death date
rdate: Date, latest death date

Details

Due to confidentiality restrictions, some data that were available in previous versions of the IDL for Switzerland, Italy and some entries for Japan and Belgium have been removed. As the IDL metadata are updated somewhat regularly and former versions of the database are not preserved, results from published analyses are replicable but not reproducible.

References

International Database on Longevity, extracted on February 13th, 2023

Japanese survival data

Description

This data frame contains information about the counts of dead Japanese by gender and year of birth (cohort), categorized by the whole part of age attained at death.

Usage

japanese

Format

A data frame with 1038 rows and 4 variables:

age: integer, age (to the smallest year) at death (in years)
byear: integer, birth year
count: integer, number of death for cohort at given age
gender: factor, the gender of the individuals; either male or female

Details

These data were obtained from the Annual Vital Statistics Report of Japan, released by the Japanese government every year since 1947. The authors note that "All the members of that cohort have died by the end of the observation period, a procedure referred to as the extinct cohort method". The data were obtained from the Human Mortality Database by the authors. Only positive counts are reported and two records (Misao Okawa and Jiroemon Kimura) are excluded because they do not correspond to the same selection mechanism.

Source

Table extracted from Hanayama & Sibuya (2016).

References

Hanayama, N. and M. Sibuya (2016). Estimating the Upper Limit of Lifetime Probability Distribution, Based on Data of Japanese Centenarians, The Journals of Gerontology: Series A, 71(8), 1014–1021. doi:10.1093/gerona/glv113

Japanese survival data (2)

Description

This data frame is extracted from Table 10.3 from Chapter 10, "Centenarians and Supercentenarians in Japan", in the Monograph Exceptional lifespans. The data were constructed by the extinct cohort method and are stratified by age cohort (five year group, except 1899-1900) and by sex. Note that the family registry system (KOSEKI), introduced in 1872, was standardized in 1886.

Usage

japanese2

Format

A data frame with 216 rows and 4 variables:

age: integer, age (to the smallest year) at death (in years)
bcohort: factor, birth cohort
count: integer, number of death for cohort at given age
gender: factor, the gender of the individuals; either male or female

Source

Table 10.3

References

Saito, Yasuhiko and Futoshi Ishii, and Jean-Marie Robine (2021). Centenarians and Supercentenarians in Japan. In Exceptional lifespans, Maier, H., Jeune, B., Vaupel, J. W. (Eds.), Demographic research monographs 17 VII, pp. 125-145. Cham, Springer.

Goodness-of-fit diagnostics

Description

Warning: EXPERIMENTAL Compute the Kolmogorov-Smirnov test statistic and compare it with a simulated null distribution obtained via a parametric bootstrap.

Usage

ks_test(
  time,
  time2 = NULL,
  event = NULL,
  thresh = 0,
  ltrunc = NULL,
  rtrunc = NULL,
  type = c("right", "left", "interval", "interval2"),
  family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
    "extweibull", "perks", "beard", "perksmake", "beardmake"),
  B = 999L,
  arguments = NULL,
  ...
)

Arguments

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

event

thresh

vector of thresholds

ltrunc

lower truncation limit, default to NULL

rtrunc

upper truncation limit, default to NULL

type

character string specifying the type of censoring. Possible values are "right", "left", "interval", "interval2".

family

string; choice of parametric family

B

number of bootstrap simulations

arguments

a named list specifying default arguments of the function that are common to all elife calls

...

additional parameters, currently ignored

Value

a list with elements

stat the value of the test statistic
pval p-value obtained via simulation

Note

The bootstrap scheme requires simulating new data, fitting a parametric model and estimating the nonparametric maximum likelihood estimate for each new sample. This is computationally intensive in large samples.

Log posterior distribution with MDI priors

Description

Log of the posterior distribution for excess lifetime distribution with maximal data information priors.

Usage

lpost_elife(
  par,
  time,
  time2 = NULL,
  event = NULL,
  type = c("right", "left", "interval", "interval2"),
  ltrunc = NULL,
  rtrunc = NULL,
  family = c("exp", "gp", "gomp"),
  thresh = 0,
  weights = rep(1, length(time)),
  status = NULL,
  arguments = NULL,
  ...
)

Arguments

par

vector of parameters, in the following order: scale, rate and shape

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

event

type

character string specifying the type of censoring. Possible values are "right", "left", "interval", "interval2".

ltrunc

lower truncation limit, default to NULL

rtrunc

upper truncation limit, default to NULL

family

string; choice of parametric family

thresh

vector of thresholds

weights

weights for observations

status

integer vector giving status of an observation. If NULL (default), this argument is computed internally based on type.

arguments

a named list specifying default arguments of the function that are common to all elife calls

...

additional arguments for optimization, currently ignored.

Value

a vector proportional to the log posterior (the sum of the log likelihood and log prior)

Score test of Northrop and Coleman

Description

This function computes the score test with the piecewise generalized Pareto distribution under the null hypothesis that the generalized Pareto with a single shape parameter is an adequate simplification. The score test statistic is calculated using the observed information matrix; both hessian and score vector are obtained through numerical differentiation.

Usage

nc_score_test(
  time,
  time2 = NULL,
  event = NULL,
  thresh = 0,
  ltrunc = NULL,
  rtrunc = NULL,
  type = c("right", "left", "interval", "interval2"),
  weights = rep(1, length(time))
)

Arguments

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

event

thresh

a vector of thresholds

ltrunc

lower truncation limit, default to NULL

rtrunc

upper truncation limit, default to NULL

type

character string specifying the type of censoring. Possible values are "right", "left", "interval", "interval2".

weights

weights for observations

Details

The score test is much faster and perhaps less fragile than the likelihood ratio test: fitting the piece-wise generalized Pareto model is difficult due to the large number of parameters and multimodal likelihood surface.

The reference distribution is chi-square

Value

the value of a call to nc_test

Examples


set.seed(1234)
n <- 100L
x <- samp_elife(n = n,
                scale = 2,
                shape = -0.2,
                lower = low <- runif(n),
                upper = upp <- runif(n, min = 3, max = 20),
                type2 = "ltrt",
                family = "gp")
test <- nc_test(
  time = x,
  ltrunc = low,
  rtrunc = upp,
  thresh = quantile(x, seq(0, 0.5, by = 0.1)))
print(test)
plot(test)

Score test of Northrop and Coleman

Description

Usage

nc_test(
  time,
  time2 = NULL,
  event = NULL,
  thresh = 0,
  ltrunc = NULL,
  rtrunc = NULL,
  type = c("right", "left", "interval", "interval2"),
  weights = rep(1, length(time)),
  test = c("score", "lrt"),
  arguments = NULL,
  ...
)

Arguments

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

event

thresh

a vector of thresholds

ltrunc

lower truncation limit, default to NULL

rtrunc

upper truncation limit, default to NULL

type

character string specifying the type of censoring. Possible values are "right", "left", "interval", "interval2".

weights

weights for observations

test

string, either "score" for the score test or "lrt" for the likelihood ratio test.

arguments

a named list specifying default arguments of the function that are common to all elife calls

...

additional parameters, currently ignored

Details

The reference distribution is chi-square

Value

a data frame with the following variables:

thresh: threshold for the generalized Pareto distribution
nexc: number of exceedances
score: score statistic
df: degrees of freedom
pval: the p-value obtained from the asymptotic chi-square approximation.

Examples


set.seed(1234)
n <- 100L
x <- samp_elife(n = n,
                scale = 2,
                shape = -0.2,
                lower = low <- runif(n),
                upper = upp <- runif(n, min = 3, max = 20),
                type2 = "ltrt",
                family = "gp")
test <- nc_test(
  time = x,
  ltrunc = low,
  rtrunc = upp,
  thresh = quantile(x, seq(0, 0.5, by = 0.1)))
print(test)
plot(test)

Likelihood for doubly interval truncated data

Description

Computes the log-likelihood for various parametric models suitable for threshold exceedances. If threshold is non-zero, then only right-censored, observed event time and interval censored data whose timing exceeds the thresholds are kept.

Usage

nll_ditrunc_elife(
  par,
  time,
  ltrunc1 = NULL,
  rtrunc1 = NULL,
  ltrunc2 = NULL,
  rtrunc2 = NULL,
  family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
    "extweibull", "perks", "beard", "perksmake", "beardmake"),
  thresh = 0,
  weights = rep(1, length(time)),
  arguments = NULL,
  ...
)

Arguments

par

vector of parameters

time

excess time of the event of follow-up time, depending on the value of event

ltrunc1

lower truncation limit, default to NULL

rtrunc1

upper truncation limit, default to NULL

ltrunc2

lower truncation limit, default to NULL

rtrunc2

upper truncation limit, default to NULL

family

thresh

vector of thresholds

weights

weights for observations

arguments

a named list specifying default arguments of the function that are common to all elife calls

...

additional arguments for optimization, currently ignored.

Value

log-likelihood value

Likelihood for arbitrary censored and truncated data

Description

Usage

nll_elife(
  par,
  time,
  time2 = NULL,
  event = NULL,
  type = c("right", "left", "interval", "interval2"),
  ltrunc = NULL,
  rtrunc = NULL,
  family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
    "extweibull", "perks", "beard", "perksmake", "beardmake"),
  thresh = 0,
  weights = NULL,
  status = NULL,
  arguments = NULL,
  ...
)

Arguments

par

vector of parameters, in the following order: scale, rate and shape

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

event

type

character string specifying the type of censoring. Possible values are "right", "left", "interval", "interval2".

ltrunc

lower truncation limit, default to NULL

rtrunc

upper truncation limit, default to NULL

family

string; choice of parametric family

thresh

vector of thresholds

weights

weights for observations

status

integer vector giving status of an observation. If NULL (default), this argument is computed internally based on type.

arguments

a named list specifying default arguments of the function that are common to all elife calls

...

additional arguments for optimization, currently ignored.

Value

log-likelihood values

Examples

data(ewsim, package = "longevity")
nll_elife(par = c(5, 0.3),
          family = "gp",
          arguments = ewsim)

Nonparametric estimation of the survival function

Description

The survival function is obtained through the EM algorithm described in Turnbull (1976); censoring and truncation are assumed to be non-informative. The survival function changes only at the J distinct exceedances y_i-u and truncation points.

Usage

np_elife(
  time,
  time2 = NULL,
  event = NULL,
  type = c("right", "left", "interval", "interval2"),
  thresh = 0,
  ltrunc = NULL,
  rtrunc = NULL,
  tol = 1e-12,
  weights = NULL,
  method = c("em", "sqp"),
  arguments = NULL,
  maxiter = 100000L,
  ...
)

Arguments

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

event

type

character string specifying the type of censoring. Possible values are "right", "left", "interval", "interval2".

thresh

double thresh

ltrunc

lower truncation limit, default to NULL

rtrunc

upper truncation limit, default to NULL

tol

double, relative tolerance for convergence of the EM algorithm

weights

double, vector of weights for the observations

method

string, one of "em" for expectation-maximization (EM) algorithm or "sqp" for sequential quadratic programming with augmented Lagrange multiplie method.

arguments

a named list specifying default arguments of the function that are common to all elife calls

maxiter

integer, maximum number of iterations for the EM algorithm

...

additional arguments, currently ignored

Details

The unknown parameters of the model are p_j (j=1, \ldots, J) subject to the constraint that \sum_{j=1}^J p_j=1.

Value

a list with elements

cdf: right-continuous stepfun object defined by probabilities
time: matrix of unique values for the Turnbull intervals defining equivalence classes; only those with non-zero probabilities are returned
prob: J vector of non-zero probabilities
niter: number of iterations

References

Turnbull, B. W. (1976). The Empirical Distribution Function with Arbitrarily Grouped, Censored and Truncated Data. Journal of the Royal Statistical Society. Series B (Methodological) 38(3), 290–295.

Gentleman, R. and C. J. Geyer (1994). Maximum likelihood for interval censored data: Consistency and computation, Biometrika, 81(3), 618–623.

Frydman, H. (1994). A Note on Nonparametric Estimation of the Distribution Function from Interval-Censored and Truncated Observations, Journal of the Royal Statistical Society. Series B (Methodological) 56(1), 71-74.

Examples

set.seed(2021)
n <- 20L
# Create fake data
ltrunc <- pmax(0, runif(n, -0.5, 1))
rtrunc <- runif(n, 6, 10)
dat <- samp_elife(n = n,
                  scale = 1,
                  shape = -0.1,
                  lower = ltrunc,
                  upper = rtrunc,
                  family = "gp",
                  type2 = "ltrt")
npi <- np_elife(time = dat,
                rtrunc = rtrunc,
                ltrunc = ltrunc)
print(npi)
summary(npi)
plot(npi)

Marginal log likelihood function of the nonparametric multinomial with censoring and truncation

Description

Marginal log likelihood function of the nonparametric multinomial with censoring and truncation

Usage

np_nll(par, cens_lb, cens_ub, trunc_lb, trunc_ub, cens, trunc, weights)

Arguments

par

vector of D-1 parameters

cens_lb

index of interval in which death occurs

cens_ub

index of interval in which death occurs (if death is observed), or else the largest interval.

trunc_lb

vector of largest index for the lower truncation

trunc_ub

vector of smallest index for the upper truncation

Value

a scalar, the negative log likelihood value

Nonparametric maximum likelihood estimation for arbitrary truncation

Description

The syntax is reminiscent of the Surv function, with additional vectors for left-truncation and right-truncation.

Usage

npsurv(
  time,
  time2 = NULL,
  event = NULL,
  type = c("right", "left", "interval", "interval2"),
  ltrunc = NULL,
  rtrunc = NULL,
  weights = NULL,
  arguments = NULL,
  ...
)

Arguments

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

event

type

character string specifying the type of censoring. Possible values are "right", "left", "interval", "interval2".

ltrunc

lower truncation limit, default to NULL

rtrunc

upper truncation limit, default to NULL

weights

vector of weights, default to NULL for equiweighted

arguments

a named list specifying default arguments of the function that are common to all elife calls

...

additional arguments passed to the functions

Value

a list with components

xval: unique ordered values of sets on which the distribution function is defined
prob: estimated probability of failure on intervals
convergence: logical; TRUE if the EM algorithm iterated until convergence
niter: logical; number of iterations for the EM algorithm
cdf: nonparametric maximum likelihood estimator of the distribution function

Note

Contrary to the Kaplan-Meier estimator, the mass is placed in the interval [max(time), Inf) so the resulting distribution function is not deficient.

Examples

# Toy example with interval censoring and right censoring
# Two observations: A1: [1,3], A2: 4
# Probability of 0.5

test_simple2 <- npsurv(
  time = c(1,4),
  time2 = c(3,4),
  event = c(3,1),
  type = "interval"
)

Distribution function of the Beard-Makeham distribution

Description

Distribution function of the Beard-Makeham distribution

Density function of the Beard-Makeham distribution

Hazard function of the Beard-Makeham distribution

Quantile function of the Beard-Makeham distribution

Usage

pbeardmake(
  q,
  rate = 1,
  shape1 = 1,
  shape2 = 1,
  lambda = 0,
  lower.tail = TRUE,
  log.p = FALSE
)

dbeardmake(x, rate = rate, shape1 = 1, shape2 = 1, lambda = 0, log = FALSE)

hbeardmake(x, rate = rate, shape1 = 1, shape2 = 1, lambda = 0, log = FALSE)

qbeardmake(p, rate = 1, shape1 = 1, shape2 = 1, lambda = 0, lower.tail = TRUE)

Arguments

q

vector of quantiles.

rate

shape parameter (\nu)

shape1

shape parameter (\alpha)

shape2

shape parameter (\beta)

lambda

exponential rate of the Makeham component

lower.tail

logical; if TRUE (default), the lower tail probability \Pr(X \leq x) is returned.

log.p

logical; if FALSE (default), values are returned on the probability scale.

x

vector of quantiles.

log

logical; if FALSE (default), return the hazard

p

vector of probabilities.

Value

a vector of (log)-probabilities of the same length as q

a vector of (log)-density.

a vector of (log)-hazard.

a vector of quantiles

Distribution function of the extended generalized Pareto distribution

Description

Distribution function of the extended generalized Pareto distribution

Usage

pextgp(q, scale = 1, shape1 = 0, shape2 = 0, lower.tail = TRUE, log.p = FALSE)

Arguments

q

vector of quantiles.

scale

scale parameter, strictly positive.

shape1

positive shape parameter \beta; model defaults to generalized Pareto when it equals zero.

shape2

shape parameter \gamma; model reduces to Gompertz when shape2=0.

lower.tail

logical; if TRUE (default), the lower tail probability \Pr(X \leq x) is returned.

log.p

logical; if FALSE (default), values are returned on the probability scale.

Value

a vector of (log)-probabilities of the same length as q

Distribution function of the extended Weibull distribution

Description

Distribution function of the extended Weibull distribution

Density function of the extended Weibull distribution

Hazard function of the extended Weibull distribution

Quantile function of the extended Weibull distribution

Usage

pextweibull(
  q,
  scale = 1,
  shape1 = 0,
  shape2 = 1,
  lower.tail = TRUE,
  log.p = FALSE
)

dextweibull(x, scale = 1, shape1 = 0, shape2 = 1, log = FALSE)

hextweibull(x, scale = 1, shape1 = 0, shape2 = 1, log = FALSE)

qextweibull(p, scale = 1, shape1 = 0, shape2 = 1, lower.tail = TRUE)

Arguments

q

vector of quantiles.

scale

scale parameter, strictly positive.

shape1

shape parameter of the generalized Pareto component.

shape2

shape parameter of the Weibull component.

lower.tail

logical; if TRUE (default), the lower tail probability \Pr(X \leq x) is returned.

log.p

logical; if FALSE (default), values are returned on the probability scale.

x

vector of quantiles.

log

logical; if FALSE (default), return the hazard, else the log hazard.

p

vector of probabilities.

Value

a vector of (log)-probabilities of the same length as q

a vector of (log)-density.

a vector of (log)-hazard

vector of quantiles

a vector of quantiles

Distribution function of the Gompertz distribution

Description

Distribution function of the Gompertz distribution

Quantile function of the Gompertz distribution

Density function of the Gompertz distribution

Hazard function of the Gompertz distribution

Usage

pgomp(q, scale = 1, shape = 0, lower.tail = TRUE, log.p = FALSE)

qgomp(p, scale = 1, shape = 0, lower.tail = TRUE)

dgomp(x, scale = 1, shape = 0, log = FALSE)

hgomp(x, scale = 1, shape = 0, log = FALSE)

Arguments

q

vector of quantiles.

scale

positive scale parameter

shape

non-negative shape parameter

lower.tail

logical; if TRUE (default), the lower tail probability \Pr(X \leq x) is returned.

log.p

logical; if FALSE (default), values are returned on the probability scale.

p

vector of probabilities.

x

vector of quantiles

log

logical; if TRUE, return the log hazard

Value

a vector of (log)-probabilities of the same length as q

a vector of quantiles

a vector of (log)-density.

a vector of (log)-hazard.

Distribution function of the Gompertz-Makeham distribution

Description

Distribution function of the Gompertz-Makeham distribution

Quantile function of the Gompertz-Makeham distribution

Density function of the Gompertz-Makeham distribution

Hazard function of the Gompertz-Makeham distribution

Usage

pgompmake(
  q,
  scale = 1,
  shape = 0,
  lambda = 0,
  lower.tail = TRUE,
  log.p = FALSE
)

qgompmake(p, scale = 1, shape = 0, lambda = 0, lower.tail = TRUE)

dgompmake(x, scale = 1, shape = 0, lambda = 0, log = FALSE)

hgompmake(x, scale = 1, shape = 0, lambda = 0, log = FALSE)

Arguments

q

vector of quantiles.

scale

scale parameter, strictly positive.

shape

shape parameter.

lambda

exponential rate

lower.tail

logical; if TRUE (default), the lower tail probability \Pr(X \leq x) is returned.

log.p

logical; if FALSE (default), values are returned on the probability scale.

p

vector of probabilities.

x

vector of quantiles.

log

logical; if TRUE, return the log hazard

Value

a vector of (log)-probabilities of the same length as q

a vector of quantiles

a vector of density

a vector of (log)-hazard.

Note

The quantile function is defined in terms of Lambert's W function. Particular parameter combinations (small values of lambda lead to numerical overflow; the function throws a warning when this happens.

Distribution function of the generalized Pareto distribution

Description

Distribution function of the generalized Pareto distribution

Density function of the generalized Pareto distribution

Hazard function of the generalized Pareto distribution

Quantile function of the generalized Pareto distribution

Usage

pgpd(q, loc = 0, scale = 1, shape = 0, lower.tail = TRUE, log.p = FALSE)

dgpd(x, loc = 0, scale = 1, shape = 0, log = FALSE)

hgpd(x, loc = 0, scale = 1, shape = 0, log = FALSE)

qgpd(p, loc = 0, scale = 1, shape = 0, lower.tail = TRUE)

Arguments

q

vector of quantiles.

loc

location parameter.

scale

scale parameter, strictly positive.

shape

shape parameter.

lower.tail

logical; if TRUE (default), the lower tail probability \Pr(X \leq x) is returned.

log.p

logical; if FALSE (default), values are returned on the probability scale.

x

vector of quantiles.

log

logical; if FALSE (default), return the log hazard

p

vector of probabilities.

Value

a vector of (log)-probabilities of the same length as q

a vector of (log)-density.

a vector of (log)-hazard.

a vector of quantiles

Plot empirical distribution function

Description

Plot empirical distribution function

Usage

## S3 method for class 'elife_ecdf'
plot(x, ...)

Arguments

x

argument of class elife_ecdf

...

additional arguments for the plot

Value

base R plot of the empirical distribution function

Plot profile of endpoint

Description

Plot profile of endpoint

Usage

## S3 method for class 'elife_profile'
plot(x, plot.type = c("base", "ggplot"), plot = TRUE, ...)

autoplot.elife_profile(object, ...)

autoplot.elife_tstab_endpoint(object, ...)

Arguments

x

an object of class elife_profile containing information about the profile likelihood, maximum likelihood and grid of values for the endpoint

plot.type

string indicating whether to use base R for plots or ggplot2

plot

logical; if TRUE, creates a plot when plot.type="ggplot". Useful for returning ggplot objects without printing the graphs

...

additional arguments to pass to plot, currently ignored

object

object of class elife_tstab_endpoint

Value

base R or ggplot object for a plot of the profile log likelihood of the endpoint of the generalized Pareto distribution

Plot threshold stability plot for endpoint

Description

Plot threshold stability plot for endpoint

Usage

## S3 method for class 'elife_tstab_endpoint'
plot(x, plot.type = c("base", "ggplot"), plot = TRUE, ...)

Arguments

x

an object of class elife_tstab_endpoint containing information about the profile likelihood, maximum likelihood and grid of values for the endpoint

plot.type

string indicating whether to use base R for plots or ggplot2

plot

logical; if TRUE, creates a plot when plot.type="ggplot". Useful for returning ggplot objects without printing the graphs

...

additional arguments to pass to plot, currently ignored

Value

base R or ggplot object for a plot of the profile log likelihood of the endpoint of the generalized Pareto distribution

Distribution function of the Perks-Makeham distribution

Description

Distribution function of the Perks-Makeham distribution

Density function of the Perks-Makeham distribution

Hazard function of the Perks-Makeham distribution

Quantile function of the Perks-Makeham distribution

Usage

pperksmake(
  q,
  rate = 1,
  shape = 1,
  lambda = 0,
  lower.tail = TRUE,
  log.p = FALSE
)

dperksmake(x, rate = 1, shape = 1, lambda = 0, log = FALSE)

hperksmake(x, rate = 1, shape = 1, lambda = 0, log = FALSE)

qperksmake(p, rate = 1, shape = 1, lambda = 0, lower.tail = TRUE)

Arguments

q

vector of quantiles.

rate

rate parameter (\nu)

shape

shape parameter (\alpha)

lambda

exponential rate of the Makeham component

lower.tail

logical; if TRUE (default), the lower tail probability \Pr(X \leq x) is returned.

log.p

logical; if FALSE (default), values are returned on the probability scale.

x

vector of quantiles.

log

logical; if FALSE (default), return the hazard

p

vector of probabilities.

Value

a vector of (log)-probabilities of the same length as q

a vector of (log)-density.

a vector of (log)-hazard.

vector of quantiles

a vector of quantiles

Profile log likelihood for the scale parameter of the exponential distribution

Description

This internal function is used to produce threshold stability plots.

Usage

prof_exp_scale(
  mle = NULL,
  time,
  time2 = NULL,
  event = NULL,
  thresh = 0,
  ltrunc = NULL,
  rtrunc = NULL,
  type = c("right", "left", "interval", "interval2"),
  level = 0.95,
  psi = NULL,
  weights = NULL,
  confint = TRUE,
  arguments = NULL,
  ...
)

Arguments

mle

an object of class elife_par

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

event

thresh

vector of thresholds

ltrunc

lower truncation limit, default to NULL

rtrunc

upper truncation limit, default to NULL

type

character string specifying the type of censoring. Possible values are "right", "left", "interval", "interval2".

level

level of the confidence interval

weights

weights for observations

confint

logical; if TRUE (default), return confidence intervals rather than list

arguments

a named list specifying default arguments of the function that are common to all elife calls

...

additional arguments for optimization, currently ignored.

Value

a vector of length three with the maximum likelihood of the scale and profile-based confidence interval

Profile log likelihood for the transformed scale parameter of the generalized Pareto distribution

Description

This internal function is used to produce threshold stability plots.

Usage

prof_gp_scalet(
  mle = NULL,
  time,
  time2 = NULL,
  event = NULL,
  thresh = 0,
  ltrunc = NULL,
  rtrunc = NULL,
  type = c("right", "left", "interval", "interval2"),
  level = 0.95,
  psi = NULL,
  weights = NULL,
  confint = TRUE,
  arguments = NULL,
  ...
)

Arguments

mle

an object of class elife_par

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

event

thresh

vector of thresholds

ltrunc

lower truncation limit, default to NULL

rtrunc

upper truncation limit, default to NULL

type

character string specifying the type of censoring. Possible values are "right", "left", "interval", "interval2".

level

level of the confidence interval

weights

weights for observations

confint

logical; if TRUE (default), return confidence intervals rather than list

arguments

a named list specifying default arguments of the function that are common to all elife calls

...

additional arguments for optimization, currently ignored.

Value

a confidence interval or a list with profile values

Profile log likelihood for the shape parameter of the generalized Pareto distribution

Description

This internal function is used to produce threshold stability plots.

Usage

prof_gp_shape(
  mle = NULL,
  time,
  time2 = NULL,
  event = NULL,
  thresh,
  ltrunc = NULL,
  rtrunc = NULL,
  type = c("right", "left", "interval", "interval2"),
  level = 0.95,
  psi = NULL,
  weights = NULL,
  confint = TRUE,
  arguments = NULL,
  ...
)

Arguments

mle

an object of class elife_par

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

event

thresh

vector of thresholds

ltrunc

lower truncation limit, default to NULL

rtrunc

upper truncation limit, default to NULL

type

character string specifying the type of censoring. Possible values are "right", "left", "interval", "interval2".

level

level of the confidence interval

weights

weights for observations

confint

logical; if TRUE (default), return confidence intervals rather than list

arguments

a named list specifying default arguments of the function that are common to all elife calls

...

additional arguments for optimization, currently ignored.

Value

if confint=TRUE, a vector of length three with the maximum likelihood of the shape and profile-based confidence interval

Sample observations from an interval truncated excess lifetime distribution

Description

Sample observations from an interval truncated excess lifetime distribution

Usage

r_ditrunc_elife(
  n,
  rate,
  scale,
  shape,
  lower,
  upper,
  family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
    "extweibull", "perks", "beard", "perksmake", "beardmake")
)

Arguments

n

sample size

rate

rate parameter(s)

scale

scale parameter(s)

shape

shape parameter(s)

lower

vector of lower bounds

upper

vector of upper bounds

family

string; choice of parametric family

Value

a vector of n observations

Examples

n <- 100L
# the lower and upper bound are either scalar,
# or else vectors of length n
r_dtrunc_elife(n = n, scale = 1, shape = -0.1,
               lower = pmax(0, runif(n, -0.5, 1)),
               upper = runif(n, 6, 10),
               family = "gp")

Sample observations from an interval truncated excess lifetime distribution

Description

Sample observations from an interval truncated excess lifetime distribution

Usage

r_dtrunc_elife(
  n,
  scale,
  rate,
  shape,
  lower,
  upper,
  family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
    "extweibull", "perks", "beard", "perksmake", "beardmake")
)

Arguments

n

sample size

scale

scale parameter(s)

rate

rate parameter(s)

shape

shape parameter(s)

lower

vector of lower bounds

upper

vector of upper bounds

family

string; choice of parametric family

Value

a vector of n observations

Examples

n <- 100L
# the lower and upper bound are either scalar,
# or else vectors of length n
r_dtrunc_elife(n = n, scale = 1, shape = -0.1,
               lower = pmax(0, runif(n, -0.5, 1)),
               upper = runif(n, 6, 10),
               family = "gp")

Sample observations from a left-truncated right-censored excess lifetime distribution

Description

Sample observations from a left-truncated right-censored excess lifetime distribution

Usage

r_ltrc_elife(
  n,
  scale,
  rate,
  shape,
  lower,
  upper,
  family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
    "extweibull", "perks", "beard", "perksmake", "beardmake")
)

Arguments

n

sample size

scale

scale parameter

shape

shape parameter(s)

lower

vector of lower bounds

upper

vector of upper bounds above which data are right-truncated

family

string; choice of parametric family, either exponential (exp), Weibull (weibull), generalized Pareto (gp), Gompertz (gomp) or extended generalized Pareto (extgp).

Value

a list with n observations dat and a logical vector of the same length with TRUE for right-censored observations and FALSE otherwise.

Examples

n <- 100L
# the lower and upper bound are either scalar,
# or else vectors of length n
r_ltrc_elife(n = n, scale = 5, shape = -0.1,
               lower = pmax(0, runif(n, -0.5, 1)),
               upper = 5,
               family = "gp")

Beard distribution

Description

Distribution function, density, hazard, quantile function and random number generation for the Beard distribution

Usage

rbeard(n, rate, shape1, shape2)

pbeard(q, rate = 1, shape1 = 1, shape2 = 1, lower.tail = TRUE, log.p = FALSE)

dbeard(x, rate = 1, shape1 = 1, shape2 = 1, log = FALSE)

hbeard(x, rate = 1, shape1 = 1, shape2 = 1, log = FALSE)

qbeard(p, rate = 1, shape1 = 1, shape2 = 1, lower.tail = TRUE)

Arguments

n

sample size

rate

rate parameter (\nu)

shape1

shape parameter (\alpha)

shape2

shape parameter (\beta)

q

vector of quantiles.

lower.tail

logical; if TRUE (default), the lower tail probability \Pr(X \leq x) is returned.

log.p

logical; if FALSE (default), values are returned on the probability scale.

x

vector of quantiles.

log

logical; if FALSE (default), return the hazard

p

vector of probabilities.

Value

a vector of random variates

a vector of (log)-probabilities of the same length as q

a vector of (log)-density.

a vector of (log)-hazard.

a vector of quantiles

Sample lifetime from excess lifetime model

Description

Given parameters of a elife distribution, sampling window and birth dates with excess lifetimes, sample new observations; excess lifetime at c1 are sampled from an exponential distribution, whereas the birth dates are sampled from a jittered histogram-based distribution The new excess lifetime above the threshold are right-censored if they exceed c2.

Usage

samp2_elife(
  n,
  scale,
  shape,
  family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
    "extweibull", "perks", "beard", "perksmake", "beardmake"),
  xcal,
  c1,
  c2
)

Arguments

n

sample size

scale

scale parameter(s)

shape

shape parameter(s)

family

string; choice of parametric family

xcal

date at which individual reaches u years

c1

date, first day of the sampling frame

c2

date, last day of the sampling frame

Value

list with new birthdates (xcal), excess lifetime at c1 (ltrunc), excess lifetime above u (dat) and right-censoring indicator (rightcens).

Simulate excess lifetime with truncation or right-censoring

Description

This function dispatches simulations accounting for potential left-truncation (remove by setting lower to zero). If type2=ltrt, simulated observations will be lower than the upper bounds upper. If type2=ltrc, simulated observations are capped at upper and the observation is right-censored (rcens=TRUE).

Usage

samp_elife(
  n,
  scale,
  rate,
  shape = NULL,
  lower = 0,
  upper = Inf,
  family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "gppiece",
    "extweibull", "perks", "beard", "perksmake", "beardmake"),
  type2 = c("none", "ltrt", "ltrc", "ditrunc")
)

Arguments

n

sample size

scale

scale parameter(s)

rate

rate parameter(s)

shape

shape parameter(s)

lower

vector of lower bounds

upper

vector of upper bounds

family

string; choice of parametric family

type2

string, either none, ltrt for left- and right-truncated data or ltrc for left-truncated right-censored data

Value

either a vector of observations or, if type2=ltrc, a list with n observations dat and a logical vector of the same length with TRUE for right-censored observations and FALSE otherwise.

Note

As the tails of the Gompertz and Gompertz–Makeham models decrease exponentially fast, the method fails in the rare event case if the lower bound is too large (say larger than the 99.99

Examples

set.seed(1234)
n <- 500L
# Simulate interval truncated data
x <- samp_elife(n = n,
                scale = 2,
                shape = 1.5,
                lower = low <- runif(n),
                upper = upp <- runif(n, min = 3, max = 15),
                type2 = "ltrt",
                family = "weibull")
coef(fit_elife(
   time = x,
   ltrunc = low,
   rtrunc = upp,
   family = "weibull"))
# Simulate left-truncated right-censored data
x <- samp_elife(n = n,
                scale = 2,
                shape = 1.5,
                lower = low <- runif(n),
                upper = upp <- runif(n, min = 3, max = 15),
                type2 = "ltrc",
                family = "gomp")
#note that the return value is a list...
coef(fit_elife(
   time = x$dat,
   ltrunc = low,
   event = !x$rcens,
   family = "gomp"))

Likelihood ratio test for doubly interval truncated data

Description

This function fits separate models for each distinct value of covariates and computes a likelihood ratio test to test whether there are significant differences between groups.

Usage

test_ditrunc_elife(
  time,
  covariate,
  thresh = 0,
  ltrunc1 = NULL,
  rtrunc1 = NULL,
  ltrunc2 = NULL,
  rtrunc2 = NULL,
  family = c("exp", "gp", "gomp", "gompmake", "weibull", "extgp", "extweibull", "perks",
    "beard", "perksmake", "beardmake"),
  weights = rep(1, length(time)),
  arguments = NULL,
  ...
)

Arguments

time

excess time of the event of follow-up time, depending on the value of event

covariate

vector of factors, logical or integer whose distinct values are

thresh

vector of thresholds

family

string; choice of parametric family

weights

weights for observations

arguments

a named list specifying default arguments of the function that are common to all elife calls

...

additional arguments for optimization, currently ignored.

Value

a list with elements

stat: likelihood ratio statistic
df: degrees of freedom
pval: the p-value obtained from the asymptotic chi-square approximation.

Likelihood ratio test for covariates

Description

This function fits separate models for each distinct value of the factor covariate and computes a likelihood ratio test to test whether there are significant differences between groups.

Usage

test_elife(
  time,
  time2 = NULL,
  event = NULL,
  covariate,
  thresh = 0,
  ltrunc = NULL,
  rtrunc = NULL,
  type = c("right", "left", "interval", "interval2"),
  family = c("exp", "gp", "weibull", "gomp", "gompmake", "extgp", "extweibull", "perks",
    "perksmake", "beard", "beardmake"),
  weights = rep(1, length(time)),
  arguments = NULL,
  ...
)

Arguments

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

event

covariate

vector of factors, logical or integer whose distinct values define groups

thresh

vector of thresholds

ltrunc

lower truncation limit, default to NULL

rtrunc

upper truncation limit, default to NULL

type

character string specifying the type of censoring. Possible values are "right", "left", "interval", "interval2".

family

string; choice of parametric family

weights

weights for observations

arguments

a named list specifying default arguments of the function that are common to all elife calls

...

additional arguments for optimization, currently ignored.

Value

a list with elements

stat likelihood ratio statistic
df degrees of freedom
pval the p-value obtained from the asymptotic chi-square approximation.

Examples

test <- with(subset(dutch, ndays > 39082),
 test_elife(
 time = ndays,
 thresh = 39082L,
 covariate = gender,
 ltrunc = ltrunc,
 rtrunc = rtrunc,
 family = "exp"))
 test

Threshold stability plots

Description

The generalized Pareto and exponential distribution are threshold stable. This property, which is used for extrapolation purposes, can also be used to diagnose goodness-of-fit: we expect the parameters \xi and \tilde{\sigma} = \sigma + \xi u to be constant over a range of thresholds. The threshold stability plot consists in plotting maximum likelihood estimates with pointwise confidence interval. This function handles interval truncation and right-censoring.

Usage

tstab(
  time,
  time2 = NULL,
  event = NULL,
  thresh = 0,
  ltrunc = NULL,
  rtrunc = NULL,
  type = c("right", "left", "interval", "interval2"),
  family = c("gp", "exp"),
  method = c("wald", "profile"),
  level = 0.95,
  plot = TRUE,
  plot.type = c("base", "ggplot"),
  which.plot = c("scale", "shape"),
  weights = NULL,
  arguments = NULL,
  ...
)

Arguments

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

event

thresh

vector of thresholds

ltrunc

lower truncation limit, default to NULL

rtrunc

upper truncation limit, default to NULL

type

character string specifying the type of censoring. Possible values are "right", "left", "interval", "interval2".

family

string; distribution, either generalized Pareto (gp) or exponential (exp)

method

string; the type of pointwise confidence interval, either Wald (wald) or profile likelihood (profile)

level

probability level for the pointwise confidence intervals

plot

logical; should a plot be returned alongside with the estimates? Default to TRUE

plot.type

string; either base for base R plots or ggplot for ggplot2 plots

which.plot

string; which parameters to plot;

weights

weights for observations

arguments

a named list specifying default arguments of the function that are common to all elife calls

...

additional arguments for optimization, currently ignored.

Details

The shape estimates are constrained

Value

an invisible list with pointwise estimates and confidence intervals for the scale and shape parameters

Examples

set.seed(1234)
n <- 100L
x <- samp_elife(n = n,
                scale = 2,
                shape = -0.2,
                lower = low <- runif(n),
                upper = upp <- runif(n, min = 3, max = 20),
                type2 = "ltrt",
                family = "gp")
tstab_plot <- tstab(time = x,
                    ltrunc = low,
                   rtrunc = upp,
                   thresh = quantile(x, seq(0, 0.5, length.out = 4)))
plot(tstab_plot, plot.type = "ggplot")

Turnbull's sets

Description

Given censoring and truncation set, compute the regions of the real line that get positive mass and over which the distribution function is well-defined.

Usage

turnbull_intervals(time, time2 = NULL, status, ltrunc = NULL, rtrunc = NULL)

Arguments

time

excess time of the event of follow-up time, depending on the value of event

time2

ending excess time of the interval for interval censored data only.

ltrunc

lower truncation limit, default to NULL

rtrunc

upper truncation limit, default to NULL

Value

a matrix with two columns containing the left and right bounds Of Turnbull sets

Note

The function adds the square root of the machine tolerance to left bounds of interval censored data so they are open.

References

Turnbull, B. W. (1976). The Empirical Distribution Function with Arbitrarily Grouped, Censored and Truncated Data. Journal of the Royal Statistical Society, Series B 38, 290-295.