Title: | 'parsnip' Engines for Survival Models |
Version: | 0.3.3 |
Description: | Engines for survival models from the 'parsnip' package. These include parametric models (e.g., Jackson (2016) <doi:10.18637/jss.v070.i08>), semi-parametric (e.g., Simon et al (2011) <doi:10.18637/jss.v039.i05>), and tree-based models (e.g., Buehlmann and Hothorn (2007) <doi:10.1214/07-STS242>). |
License: | MIT + file LICENSE |
URL: | https://github.com/tidymodels/censored, https://censored.tidymodels.org |
BugReports: | https://github.com/tidymodels/censored/issues |
Depends: | parsnip (≥ 1.3.0), R (≥ 3.5.0), survival (≥ 3.7-0) |
Imports: | cli, dials, dplyr (≥ 0.8.0.1), generics, glue, hardhat (≥ 1.4.1), lifecycle, mboost, prodlim (≥ 2023.03.31), purrr, rlang (≥ 1.0.0), stats, tibble (≥ 3.1.3), tidyr (≥ 1.0.0), vctrs |
Suggests: | aorsf (≥ 0.1.2), coin, covr, flexsurv (≥ 2.2.1), glmnet (≥ 4.1), ipred, partykit, pec, rmarkdown, rpart, testthat (≥ 3.0.0) |
Config/Needs/website: | tidymodels, tidyverse/tidytemplate |
Config/testthat/edition: | 3 |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | no |
Packaged: | 2025-02-14 19:21:15 UTC; hannah |
Author: | Emil Hvitfeldt |
Maintainer: | Hannah Frick <hannah@posit.co> |
Repository: | CRAN |
Date/Publication: | 2025-02-14 20:20:02 UTC |
censored: parsnip Engines for Survival Models
Description
censored provides engines for survival models from the parsnip package. The models include parametric survival models, proportional hazards models, decision trees, boosted trees, bagged trees, and random forests. See the "Fitting and Predicting with censored" article for various examples. See below for examples of classic survival models and how to fit them with censored.
Author(s)
Maintainer: Hannah Frick hannah@posit.co (ORCID)
Authors:
Emil Hvitfeldt emil.hvitfeldt@posit.co (ORCID)
Other contributors:
Posit Software, PBC [copyright holder, funder]
See Also
Useful links:
Report bugs at https://github.com/tidymodels/censored/issues
Examples
# Accelerated Failure Time (AFT) model
fit_aft <- survival_reg(dist = "weibull") %>%
  set_engine("survival") %>%
  fit(Surv(time, status) ~ age + sex + ph.karno, data = lung)
predict(fit_aft, lung[1:3, ], type = "time")
# Cox's Proportional Hazards model
fit_cox <- proportional_hazards() %>%
  set_engine("survival") %>%
  fit(Surv(time, status) ~ age + sex + ph.karno, data = lung)
predict(fit_cox, lung[1:3, ], type = "time")
# Andersen-Gill model for recurring events
fit_ag <- proportional_hazards() %>%
  set_engine("survival") %>%
  fit(Surv(tstart, tstop, status) ~ treat + inherit + age + strata(hos.cat),
    data = cgd
  )
predict(fit_ag, cgd[1:3, ], type = "time")
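The examples above predict survival times only. As a minimal sketch, assuming the parsnip `predict()` interface with `type = "survival"` or `type = "hazard"` and an `eval_time` argument, the same AFT model can also return survival probabilities and hazards at chosen time points. The evaluation times 100 and 500 are arbitrary illustrative values.

```r
library(censored) # also attaches parsnip and survival

# Refit the Weibull AFT model from the examples above
fit_aft <- survival_reg(dist = "weibull") %>%
  set_engine("survival") %>%
  fit(Surv(time, status) ~ age + sex + ph.karno, data = lung)

# Survival probabilities at two (arbitrarily chosen) evaluation times
pred_surv <- predict(fit_aft, lung[1:3, ], type = "survival", eval_time = c(100, 500))

# Hazards at the same evaluation times
pred_haz <- predict(fit_aft, lung[1:3, ], type = "hazard", eval_time = c(100, 500))

# Each row of the result holds a nested tibble in the .pred list column
pred_surv$.pred[[1]]
```

Each result has one row per row of the new data, with the per-time predictions nested inside.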
Internal helper function for aorsf objects
Description
Internal helper function for aorsf objects
Usage
survival_prob_orsf(object, new_data, eval_time, time = deprecated())
Arguments
object |
A parsnip |
new_data |
A data frame to be predicted. |
eval_time |
A vector of times to predict the survival probability. |
time |
Deprecated in favor of |
Value
A tibble with a list column of nested tibbles.
Examples
mod <- rand_forest() %>%
  set_engine("aorsf") %>%
  set_mode("censored regression") %>%
  fit(Surv(time, status) ~ age + ph.ecog, data = na.omit(lung))
preds <- survival_prob_orsf(mod, lung[1:3, ], eval_time = c(250, 100))
Boosted trees via mboost
Description
blackboost_train() is a wrapper for the blackboost() function in the mboost package that fits tree-based models where all of the model arguments are in the main function.
Usage
blackboost_train(
formula,
data,
family,
weights = NULL,
teststat = "quadratic",
testtype = "Teststatistic",
mincriterion = 0,
minsplit = 10,
minbucket = 4,
maxdepth = 2,
saveinfo = FALSE,
...
)
Arguments
formula |
A symbolic description of the model to be fitted. |
data |
A data frame containing the variables in the model. |
family |
A |
weights |
An optional vector of weights to be used in the fitting process. |
teststat |
A character specifying the type of the test statistic to be applied for variable selection. |
testtype |
A character specifying how to compute the distribution of
the test statistic. The first three options refer to p-values as criterion,
|
mincriterion |
The value of the test statistic or 1 - p-value that must be exceeded in order to implement a split. |
minsplit |
The minimum sum of weights in a node in order to be considered for splitting. |
minbucket |
The minimum sum of weights in a terminal node. |
maxdepth |
The maximum depth of the tree. The default |
saveinfo |
Logical. Store information about variable selection procedure in info slot of each partynode. |
... |
Other arguments to pass. |
Value
A fitted blackboost model.
Examples
blackboost_train(Surv(time, status) ~ age + ph.ecog,
  data = lung[-14, ], family = mboost::CoxPH()
)
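In everyday use this wrapper is usually reached through parsnip rather than called directly. The following is a minimal sketch, assuming that parsnip translates the `boost_tree()` arguments `trees`, `tree_depth`, and `min_n` to the corresponding mboost and tree-control settings; the specific values are illustrative only.

```r
library(censored) # also attaches parsnip and survival

# Specify a boosted tree model for censored regression via parsnip;
# the argument values below are arbitrary illustrative choices
boosted <- boost_tree(trees = 50, tree_depth = 2, min_n = 10) %>%
  set_engine("mboost") %>%
  set_mode("censored regression") %>%
  fit(Surv(time, status) ~ age + ph.ecog, data = lung[-14, ])

# Survival probabilities at two evaluation times for the first three rows
boosted_preds <- predict(
  boosted,
  new_data = lung[1:3, ],
  type = "survival",
  eval_time = c(100, 300)
)
```

Going through `boost_tree()` keeps the model specification consistent with the other parsnip engines, so the engine can be swapped without rewriting the rest of the code.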
Wrapper for glmnet for censored
Description
Not to be used directly by users.
Usage
coxnet_train(
formula,
data,
alpha = 1,
lambda = NULL,
weights = NULL,
...,
call = caller_env()
)
Arguments
formula |
The model formula. |
data |
The data. |
alpha |
The elasticnet mixing parameter, with
|
lambda |
A user supplied |
weights |
Observation weights. Can be total counts if responses are proportion matrices. Default is 1 for each observation. |
... |
Additional parameters passed to glmnet::glmnet. |
call |
The call used in errors and warnings. |
Details
This wrapper translates from the formula interface to glmnet's matrix interface because of how stratification can be specified. glmnet requires that the response is stratified via glmnet::stratifySurv(). censored allows specification via a survival::strata() term on the right-hand side of the formula. The formula is used to generate the stratification information needed for stratifying the response. The formula without the strata term is used for generating the model matrix for glmnet.
The wrapper retains the original formula and the pre-processing elements, including the training data, to allow for predictions from the fitted model.
Value
A fitted glmnet model.
Examples
coxnet_mod <- coxnet_train(Surv(time, status) ~ age + sex, data = lung)
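Because this wrapper is not meant to be called directly, the stratification described in the Details is normally specified through parsnip. The sketch below assumes a `proportional_hazards()` specification with the "glmnet" engine and a `strata()` term in the formula; the subset of columns, the penalty value, and the name `lung_complete` are illustrative choices, and rows with missing values are dropped first to keep the example simple.

```r
library(censored) # also attaches parsnip and survival

# Keep only the columns used here and drop rows with missing values
lung_complete <- na.omit(lung[, c("time", "status", "age", "ph.ecog", "sex")])

# Stratify by sex via a strata() term on the right-hand side of the formula
strat_mod <- proportional_hazards(penalty = 0.01) %>%
  set_engine("glmnet") %>%
  fit(Surv(time, status) ~ age + ph.ecog + strata(sex), data = lung_complete)

# Predict survival probabilities at day 300 for the first three rows
strat_preds <- predict(
  strat_mod,
  new_data = lung_complete[1:3, ],
  type = "survival",
  eval_time = 300
)
```

The strata term is used only to stratify the response; it does not appear as a predictor in the model matrix.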
A wrapper for survival probabilities with coxnet models
Description
A wrapper for survival probabilities with coxnet models
Usage
survival_prob_coxnet(
object,
new_data,
eval_time,
time = deprecated(),
output = "surv",
penalty = NULL,
multi = FALSE,
...
)
Arguments
object |
A parsnip |
new_data |
Data for prediction. |
eval_time |
A vector of integers for prediction times. |
time |
Deprecated in favor of |
output |
One of "surv" or "haz". |
penalty |
Penalty value(s). |
multi |
Allow multiple penalty values? Defaults to FALSE. |
... |
Options to pass to |
Value
A tibble with a list column of nested tibbles.
Examples
cox_mod <- proportional_hazards(penalty = 0.1) %>%
  set_engine("glmnet") %>%
  fit(Surv(time, status) ~ ., data = lung)
survival_prob_coxnet(cox_mod, new_data = lung[1:3, ], eval_time = 300)
A wrapper for survival probabilities with coxph models
Description
A wrapper for survival probabilities with coxph models
Usage
survival_prob_coxph(
object,
x = deprecated(),
new_data,
eval_time,
time = deprecated(),
output = "surv",
interval = "none",
conf.int = 0.95,
...
)
Arguments
object |
A parsnip |
x |
Deprecated. A model from |
new_data |
Data for prediction |
eval_time |
A vector of integers for prediction times. |
time |
Deprecated in favor of |
output |
One of |
interval |
Add confidence interval for survival probability? Options
are |
conf.int |
The confidence level. |
... |
Options to pass to |
Value
A tibble with a list column of nested tibbles.
Examples
cox_mod <- proportional_hazards() %>%
  set_engine("survival") %>%
  fit(Surv(time, status) ~ ., data = lung)
survival_prob_coxph(cox_mod, new_data = lung[1:3, ], eval_time = 300)
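The returned value is a tibble with a list column of nested tibbles, which can be awkward to inspect. As a hedged sketch, the nested predictions can be flattened with `tidyr::unnest()`. This assumes predictions made through `predict()` with `type = "survival"`, where the list column is named `.pred` and the nested tibbles contain `.eval_time` and `.pred_survival`; the smaller formula is used only to avoid missing values.

```r
library(censored) # also attaches parsnip and survival
library(tidyr)

# A smaller Cox model using predictors without missing values
cox_small <- proportional_hazards() %>%
  set_engine("survival") %>%
  fit(Surv(time, status) ~ age + sex, data = lung)

# One row per observation, with nested tibbles in the .pred list column
preds <- predict(
  cox_small,
  new_data = lung[1:3, ],
  type = "survival",
  eval_time = c(100, 300)
)

# Flatten to one row per observation and evaluation time
preds_long <- unnest(preds, cols = .pred)
```

The long format has one row per combination of observation and evaluation time, which is convenient for plotting or summarising.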
A wrapper for survival probabilities with mboost models
Description
A wrapper for survival probabilities with mboost models
Usage
survival_prob_mboost(object, new_data, eval_time, time = deprecated())
Arguments
object |
A parsnip |
new_data |
Data for prediction. |
eval_time |
A vector of integers for prediction times. |
time |
Deprecated in favor of |
Value
A tibble with a list column of nested tibbles.
Examples
mod <- boost_tree() %>%
  set_engine("mboost") %>%
  set_mode("censored regression") %>%
  fit(Surv(time, status) ~ ., data = lung)
survival_prob_mboost(mod, new_data = lung[1:3, ], eval_time = 300)
A wrapper for survival probabilities with partykit models
Description
A wrapper for survival probabilities with partykit models
Usage
survival_prob_partykit(
object,
new_data,
eval_time,
time = deprecated(),
output = "surv"
)
Arguments
object |
A parsnip |
new_data |
A data frame to be predicted. |
eval_time |
A vector of times to predict the survival probability. |
time |
Deprecated in favor of |
output |
Type of output. Can be either |
Value
A tibble with a list column of nested tibbles.
Examples
tree <- decision_tree() %>%
  set_mode("censored regression") %>%
  set_engine("partykit") %>%
  fit(Surv(time, status) ~ age + ph.ecog, data = lung)
survival_prob_partykit(tree, lung[1:3, ], eval_time = 100)
forest <- rand_forest() %>%
  set_mode("censored regression") %>%
  set_engine("partykit") %>%
  fit(Surv(time, status) ~ age + ph.ecog, data = lung[1:100, ])
survival_prob_partykit(forest, lung[1:3, ], eval_time = 100)
A wrapper for survival probabilities with pecRpart models
Description
A wrapper for survival probabilities with pecRpart models
Usage
survival_prob_pecRpart(object, new_data, eval_time)
Arguments
object |
A parsnip |
new_data |
Data for prediction. |
eval_time |
A vector of integers for prediction times. |
Value
A tibble with a list column of nested tibbles.
Examples
mod <- decision_tree() %>%
  set_mode("censored regression") %>%
  set_engine("rpart") %>%
  fit(Surv(time, status) ~ ., data = lung)
survival_prob_pecRpart(mod, new_data = lung[1:3, ], eval_time = 300)
A wrapper for survival probabilities with survbagg models
Description
A wrapper for survival probabilities with survbagg models
Usage
survival_prob_survbagg(object, new_data, eval_time, time = deprecated())
Arguments
object |
A parsnip |
new_data |
Data for prediction. |
eval_time |
A vector of prediction times. |
time |
Deprecated in favor of |
Value
A vctrs list of tibbles.
Examples
bagged_tree <- bag_tree() %>%
  set_engine("rpart") %>%
  set_mode("censored regression") %>%
  fit(Surv(time, status) ~ age + ph.ecog, data = lung)
survival_prob_survbagg(bagged_tree, lung[1:3, ], eval_time = 100)
Internal helper functions for parametric survival models
Description
Internal helper functions for parametric survival models
Usage
survival_prob_survreg(object, new_data, eval_time, time = deprecated())
hazard_survreg(object, new_data, eval_time)
Arguments
object |
A parsnip |
new_data |
A data frame. |
eval_time |
A vector of time points. |
time |
Deprecated in favor of |
Value
A tibble with a list column of nested tibbles.
Examples
mod <- survival_reg() %>%
  set_engine("survival") %>%
  fit(Surv(time, status) ~ ., data = lung)
survival_prob_survreg(mod, lung[1:3, ], eval_time = 100)
hazard_survreg(mod, lung[1:3, ], eval_time = 100)
A wrapper for survival times with coxnet models
Description
A wrapper for survival times with coxnet models
Usage
survival_time_coxnet(object, new_data, penalty = NULL, multi = FALSE, ...)
Arguments
object |
A parsnip |
new_data |
Data for prediction. |
penalty |
Penalty value(s). |
multi |
Allow multiple penalty values? |
... |
Options to pass to |
Value
A vector.
Examples
cox_mod <- proportional_hazards(penalty = 0.1) %>%
  set_engine("glmnet") %>%
  fit(Surv(time, status) ~ ., data = lung)
survival_time_coxnet(cox_mod, new_data = lung[1:3, ], penalty = 0.1)
A wrapper for survival times with coxph models
Description
A wrapper for survival times with coxph models
Usage
survival_time_coxph(object, new_data)
Arguments
object |
A parsnip |
new_data |
Data for prediction |
Value
A vector.
Examples
cox_mod <- proportional_hazards() %>%
  set_engine("survival") %>%
  fit(Surv(time, status) ~ ., data = lung)
survival_time_coxph(cox_mod, new_data = lung[1:3, ])
A wrapper for mean survival times with mboost models
Description
A wrapper for mean survival times with mboost models
Usage
survival_time_mboost(object, new_data)
Arguments
object |
A parsnip |
new_data |
Data for prediction |
Value
A tibble.
Examples
boosted_tree <- boost_tree() %>%
  set_engine("mboost") %>%
  set_mode("censored regression") %>%
  fit(Surv(time, status) ~ age + ph.ecog, data = lung[-14, ])
survival_time_mboost(boosted_tree, new_data = lung[1:3, ])
A wrapper for survival times with survbagg models
Description
A wrapper for survival times with survbagg models
Usage
survival_time_survbagg(object, new_data)
Arguments
object |
A parsnip |
new_data |
Data for prediction |
Value
A vector.
Examples
bagged_tree <- bag_tree() %>%
  set_engine("rpart") %>%
  set_mode("censored regression") %>%
  fit(Surv(time, status) ~ age + ph.ecog, data = lung)
survival_time_survbagg(bagged_tree, lung[1:3, ])
Number of days before a movie grosses $1M USD
Description
These data are a somewhat biased random sample of 551 movies released between 2015 and 2018. Columns include
Details
- title: a character string for the movie title.
- time: number of days until the movie earns a million US dollars.
- event: a binary value for whether the movie reached this goal. About 94% of the movies had observed events.
- released: a date field for the release date.
- distributor: a factor with the name of the distributor.
- released_theaters: the maximum number of theaters where the movie played in the first two weeks of release.
- year: the release year.
- rated: a factor for the Motion Picture Association film rating.
- runtime: the length of the movie (in minutes).
- A set of indicator columns for the movie genre (e.g., action, crime, etc.).
- A set of indicators for the language (e.g., english, hindi, etc.).
- A set of indicators for countries where the movie was released (e.g., uk, japan, etc.).
Value
time_to_million |
a tibble |
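As an illustration of how these columns map onto a censored regression model, the sketch below fits a Cox model to the data. It assumes that `time` and `event` can be passed directly to `Surv()` and that `released_theaters` and `runtime` are numeric; the choice of predictors, the evaluation times, and the name `movie_mod` are arbitrary and not part of the package documentation.

```r
library(censored) # also attaches parsnip and survival

data(time_to_million, package = "censored")

# Cox model for the time until a movie grosses one million USD;
# the predictors are an arbitrary illustrative choice
movie_mod <- proportional_hazards() %>%
  set_engine("survival") %>%
  fit(Surv(time, event) ~ released_theaters + runtime, data = time_to_million)

# Probability of not yet having reached one million USD after 7 and 30 days
movie_preds <- predict(
  movie_mod,
  new_data = time_to_million[1:3, ],
  type = "survival",
  eval_time = c(7, 30)
)
```

Here "survival" means the movie has not yet reached the one-million mark at the given evaluation time.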