Type: | Package |
Title: | Estimate Cutpoints of Metric Variables in the Context of Cox Regression |
Version: | 1.0.0 |
Description: | Estimate one or two cutpoints of a metric or ordinal-scaled variable in the multivariable context of survival data or time-to-event data. Visualise the cutpoint estimation process using contour plots, index plots, and spline plots. It is also possible to estimate cutpoints based on the assumption of a U-shaped or inverted U-shaped relationship between the predictor and the hazard ratio. Govindarajulu, U., and Tarpey, T. (2022) <doi:10.1080/02664763.2020.1846690>. |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
Language: | en-GB |
LazyData: | true |
Imports: | graphics, magrittr, plotly, RcppAlgos, stats, survival, utils |
RoxygenNote: | 7.3.2 |
URL: | https://github.com/jan-por/cutpoint |
BugReports: | https://github.com/jan-por/cutpoint/issues |
Depends: | R (≥ 3.5) |
Suggests: | testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
NeedsCompilation: | no |
Packaged: | 2025-05-08 16:09:54 UTC; janpo |
Author: | Jan Porthun |
Maintainer: | Jan Porthun <jan.porthun@ntnu.no> |
Repository: | CRAN |
Date/Publication: | 2025-05-09 15:20:09 UTC |
cutpoint: Estimate Cutpoints of Metric Variables in the Context of Cox Regression
Description
Estimate one or two cutpoints of a metric or ordinal-scaled variable in the multivariable context of survival data or time-to-event data. Visualise the cutpoint estimation process using contour plots, index plots, and spline plots. It is also possible to estimate cutpoints based on the assumption of a U-shaped or inverted U-shaped relationship between the predictor and the hazard ratio. Govindarajulu, U., and Tarpey, T. (2022) doi:10.1080/02664763.2020.1846690.
Author(s)
Maintainer: Jan Porthun jan.porthun@ntnu.no (ORCID) [copyright holder]
See Also
Useful links:
- https://github.com/jan-por/cutpoint
- Report bugs at https://github.com/jan-por/cutpoint/issues
Pipe operator
Description
See magrittr::%>% for details.
Usage
lhs %>% rhs
Arguments
lhs |
A value or the magrittr placeholder. |
rhs |
A function call using the magrittr semantics. |
Value
The result of calling rhs(lhs).
Estimate cutpoints in a multivariable setting for survival data
Description
One or two cutpoints of a metric variable are estimated using either the AIC (Akaike Information Criterion) or the LRT (Likelihood-Ratio Test statistic) within a multivariable Cox proportional hazards model. These cutpoints are used to create two or three groups with different survival probabilities.
The cutpoints are estimated by dichotomising the variable of interest, which is then incorporated into the Cox regression model. The cutpoint of this variable is the value at which the AIC reaches its lowest value or the LRT statistic achieves its maximum for the corresponding Cox-regression model.
This process occurs within a multivariable framework, as other covariates and/or factors are considered during the search for the cutpoints. Cutpoints can also be estimated when the variable of interest shows a U-shaped or inverted U-shaped relationship to the hazard ratio of time-to-event data. The argument symtails facilitates the estimation of two cutpoints, ensuring that the two outer tails represent groups of equal size.
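The search described above can be sketched as a simple scan over candidate cutpoints. This is a minimal illustration under stated assumptions, not the package's implementation: the simulated data, the indicator `high`, and the minimum group share `min_frac` are hypothetical, and only `survival::coxph()` and `stats::AIC()` are used.

```r
library(survival)

set.seed(1)
n <- 200
biomarker   <- runif(n, 0, 10)
covariate_1 <- rnorm(n)

# Simulated event times: the hazard is higher above a threshold of 6
# (an arbitrary value chosen for this illustration).
rate       <- 0.1 * exp(0.9 * (biomarker > 6) + 0.3 * covariate_1)
event_time <- rexp(n, rate)
cens_time  <- rexp(n, 0.05)
dat <- data.frame(
  time        = pmin(event_time, cens_time),
  event       = as.integer(event_time <= cens_time),
  biomarker   = biomarker,
  covariate_1 = covariate_1
)

# Candidate cutpoints: observed values that leave at least min_frac of the
# sample on each side (min_frac is a hypothetical stand-in for a minimum
# group size).
min_frac   <- 0.1
candidates <- sort(unique(dat$biomarker))
frac_above <- vapply(candidates, function(cp) mean(dat$biomarker > cp), numeric(1))
candidates <- candidates[frac_above >= min_frac & frac_above <= 1 - min_frac]

# Dichotomise at each candidate, fit the Cox model with the covariate,
# and record the AIC.
aic_values <- vapply(candidates, function(cp) {
  dat$high <- as.integer(dat$biomarker > cp)
  AIC(coxph(Surv(time, event) ~ high + covariate_1, data = dat))
}, numeric(1))

# The estimated cutpoint is the candidate with the lowest AIC.
best_cp <- candidates[which.min(aic_values)]
best_cp
```

An exhaustive scan like this refits the model once per candidate, so its cost grows with the number of distinct values of the variable.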
Usage
cp_est(
cpvarname,
time = "time",
event = "event",
covariates = NULL,
data = data,
nb_of_cp = 1,
bandwith = 0.1,
est_type = "AIC",
cpvar_strata = FALSE,
ushape = FALSE,
symtails = FALSE,
dp = 2,
plot_splines = TRUE,
all_splines = TRUE,
print_res = TRUE,
verbose = TRUE
)
Arguments
cpvarname |
character, the name of the variable for which the cutpoints are estimated. |
time |
character, the name of the variable containing the follow-up time. |
event |
character, the name of the status indicator variable, normally coded 0 = no event, 1 = event |
covariates |
character vector with the names of the covariates and/or
factors. If no covariates are used, set |
data |
a data.frame, contains the following variables:
|
nb_of_cp |
numeric, number of cutpoints to be estimated (1 or 2). The
default is: |
bandwith |
numeric, minimum group size per group in percent of the total
sample size, |
est_type |
character, the method used to estimate the cutpoints. The default is 'AIC' (Akaike information criterion). The other option is 'LRT' (likelihood ratio test statistic). |
cpvar_strata |
logical value: if |
ushape |
logical value: if |
symtails |
logical value: if |
dp |
numeric, number of decimal places the cutpoints are rounded to.
Default is |
plot_splines |
logical value: if |
all_splines |
logical value: if |
print_res |
logical value: if |
verbose |
logical value: if |
Value
Returns the cpobj object with cutpoints and the characteristics of the formed groups.
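As an illustration of how two cutpoints divide a variable into three groups, the following base R sketch uses `cut()`. The values, cutpoints, and group labels are hypothetical, and assigning a value equal to a cutpoint to the lower group is an assumption rather than documented behaviour of cp_est().

```r
# Hypothetical biomarker values and cutpoints, for illustration only
biomarker <- c(12, 48, 95, 130, 201, 257)
cutpoints <- c(50, 150)

# Two cutpoints give three groups; with the default right = TRUE,
# a value equal to a cutpoint falls into the lower group.
group <- cut(
  biomarker,
  breaks = c(-Inf, cutpoints, Inf),
  labels = c("low", "middle", "high")
)

# Group sizes
table(group)
```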
References
Govindarajulu, U., & Tarpey, T. (2020). Optimal partitioning for the proportional hazards model. Journal of Applied Statistics, 49(4), 968–987. https://doi.org/10.1080/02664763.2020.1846690
See Also
cp_splines_plot() for penalized spline plots, cp_value_plot() for Value plots and Index plots
Examples
# Example 1:
# Estimate two cutpoints of the variable biomarker.
# The dataset data1 is included in this package and contains
# the variables time, event, biomarker, covariate_1, and covariate_2.
cpobj <- cp_est(
cpvarname = "biomarker",
covariates = c("covariate_1", "covariate_2"),
data = data1,
nb_of_cp = 2,
plot_splines = FALSE
)
# Example 2:
# Searching for cutpoints, if the variable shows a U-shaped or
# inverted U-shaped relationship to the hazard ratio.
# The dataset data2_ushape is included in this package and contains
# the variables time, event, biomarker, and covariate_1.
cpobj <- cp_est(
cpvarname = "biomarker",
covariates = c("covariate_1"),
data = data2_ushape,
nb_of_cp = 2,
bandwith = 0.2,
ushape = TRUE,
plot_splines = FALSE
)
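One plausible reading of the U-shaped setting is that the two outer tails are compared jointly against the middle range. The base R sketch below shows that coding and an equal-tail check in the spirit of symtails. The values, cutpoints, and group names are hypothetical, and this is an assumption about the grouping, not the package's internal code.

```r
# Hypothetical biomarker values and two cutpoints
x   <- c(0.2, 0.9, 1.6, 2.4, 3.1, 4.5)
cps <- c(1.0, 3.0)

# Assumed U-shape coding: both tails form one group, the middle range another
u_group <- factor(
  ifelse(x > cps[1] & x <= cps[2], "middle", "outer"),
  levels = c("middle", "outer")
)

# Equal tail sizes: the number of observations below the lower cutpoint
# matches the number above the upper cutpoint
tails_equal <- sum(x <= cps[1]) == sum(x > cps[2])

table(u_group)
tails_equal
```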
Summarise cutpoint estimation
Description
Writes the summary of the cutpoint estimation to the console.
Usage
cp_estsum(cpobj, verbose = TRUE)
Arguments
cpobj |
list, contains variables for |
verbose |
logical value: if |
Value
Summary of the cutpoint estimation.
See Also
cp_est() for main function of the package.
Examples
# Example
# Writes the summary to the console
# The data set data1 is included in this package
cpobj <- cp_est(
cpvarname = "biomarker",
covariates = c("covariate_1", "covariate_2"),
data = data1,
nb_of_cp = 2,
plot_splines = FALSE,
print_res = FALSE
)
cp_estsum(cpobj, verbose = TRUE)
Plot penalized smoothing splines from cpobj object
Description
Creates a plot of penalized smoothing splines with different degrees of freedom and shows the cutpoints of the dichotomised variable.
Usage
cp_splines_plot(cpobj, show_splines = TRUE, adj_splines = TRUE)
Arguments
cpobj |
list, contains variables for pspline plot:
|
show_splines |
logical, if |
adj_splines |
logical, if |
Value
Plots penalized smoothing splines and shows the cutpoints.
See Also
cp_est() for main function of the package, cp_value_plot() for Value plots and Index plots
Examples
cpvar <- rnorm(100, mean = 100, sd = 10)
time <- seq(1, 100, 1)
event <- rbinom(100, 1, 0.5)
datf <- data.frame(time, event, cpvar)
plot_splines_list <- list(cpdata = datf, nb_of_cp = 1, cp = 95, dp = 2,
cpvarname = "Biomarker")
cp_splines_plot(plot_splines_list)
Plot AIC and LRT-statistics values from cpobj object
Description
Creates a plot of the AIC or likelihood ratio test statistic values from the estimation procedure. If there are two cutpoints, a contour plot and an index plot can be generated.
Usage
cp_value_plot(
cpobj,
plotvalues = "AIC",
dp.plot = 2,
show_limit = TRUE,
plottype2cp = "contour"
)
Arguments
cpobj |
list, contains a vector of AIC values (AIC_values) and Likelihood ratio test statistic values (LRT_values) of the estimating procedure |
plotvalues |
character, either |
dp.plot |
numeric, digits for the AIC values and LRT values.
Default is |
show_limit |
logical, if |
plottype2cp |
character, either |
Value
Plots the AIC- or LRT-values, derived from the estimation procedure.
See Also
cp_est() for main function of the package, cp_splines_plot() for penalized spline plots
Examples
# Example 1
# Plot AIC-values and potential cutpoints of the estimation process
# Create AIC values:
AIC_values <- c(1950:1910, 1910:1920, 1920:1880, 1880:1920)
AIC_values <- round(AIC_values + rnorm(length(AIC_values),
mean = 0, sd = 5), digits = 2)
# Create a cutpoint variable:
cpvariable_values <- matrix(NA, nrow = length(AIC_values), ncol = 2)
cpvariable_values[ ,1] <- c(1:length(AIC_values))
# Create a cutpoint object (cpobj):
cpobj <- list(AIC_values = AIC_values,
nb_of_cp = 1,
cpvariable_values = cpvariable_values,
cpvarname = "Cutpoint variable"
)
cp_value_plot(cpobj, plotvalues = "AIC", dp.plot = 2, show_limit = TRUE)
# Example 2
# Splines plot based on data1
# The data set data1 is included in this package
cpobj <- cp_est(
cpvarname = "biomarker",
covariates = c("covariate_1", "covariate_2"),
data = data1,
nb_of_cp = 2,
plot_splines = TRUE
)
# Example 3
# Contour plot based on data1
# The data set data1 is included in this package
cpobj <- cp_est(
cpvarname = "biomarker",
covariates = c("covariate_1", "covariate_2"),
data = data1,
nb_of_cp = 2,
plot_splines = FALSE
)
cp_value_plot(cpobj, plotvalues = "AIC", plottype2cp = "contour")
Dataset for testing the cutpoint estimation function cp_est
Description
A dataset containing data for testing the estimation of one or two cutpoints.
Usage
data(data1)
Format
"data1"
A data frame with 100 rows and 5 variables:
- biomarker
numeric from 1 to 257
- covariate_1
numeric, from 4.25 to 12.33, with effect of the cutpoint of the biomarker
- covariate_2
numeric, from 465 to 1205, with no or small effect of the cutpoint of the biomarker
- time
numeric, from 3 to 328
- event
numeric, 0 or 1
Author(s)
Jan Porthun
Source
Self-generated example data
Examples
data(data1)
Dataset for testing the ushape argument of the cp_est function
Description
A dataset containing data for testing the ushape argument of the cp_est function.
Usage
data(data2_ushape)
Format
"data2_ushape"
A data frame with 200 rows and 4 variables:
- biomarker
numeric from 1e-04 to 4.7
- covariate_1
numeric, from 8.07e-05 to 1.90
- time
numeric, from 0.002 to 5.09
- event
numeric, 0 or 1
Author(s)
Jan Porthun
Source
Self-generated example data
Examples
data(data2_ushape)
Combine Factors
Description
Internal function used to create a matrix with all factor combinations of the cutpoint variable.
Usage
factors_combine(bandwith = 0.1, nb_of_cp = 1, nrm, symtails = FALSE)
Arguments
bandwith |
numeric, determines the minimum size per group of the dichotomised variable |
nb_of_cp |
numeric, number of cutpoints to search for |
nrm |
numeric, number of rows in cpdata after removing observations with missing values in biomarker |
symtails |
logical, if TRUE the tails of the dichotomised variable are symmetrical |
Value
All factor combinations of the dichotomized variable.
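The roles of bandwith and symtails can be illustrated with a small enumeration. The helper `split_positions()` below is hypothetical and is not the package's factors_combine(). It assumes that a cutpoint corresponds to a split position in the sorted data, that bandwith is a fraction of nrm, and that symtails requires the two outer groups to contain the same number of observations.

```r
# Hypothetical helper: enumerate admissible split positions.
# A position p means "split after the p-th sorted observation".
split_positions <- function(nrm, bandwith = 0.1, nb_of_cp = 1, symtails = FALSE) {
  stopifnot(nrm >= 3, nb_of_cp %in% c(1, 2))
  min_size <- ceiling(bandwith * nrm)   # assumed minimum observations per group
  pos <- seq_len(nrm - 1)

  if (nb_of_cp == 1) {
    keep <- pos >= min_size & (nrm - pos) >= min_size
    return(matrix(pos[keep], ncol = 1, dimnames = list(NULL, "cp1")))
  }

  # Two cutpoints: all ordered pairs i < j of split positions
  pairs <- t(utils::combn(pos, 2))
  i <- pairs[, 1]
  j <- pairs[, 2]
  keep <- i >= min_size & (j - i) >= min_size & (nrm - j) >= min_size
  if (symtails) {
    keep <- keep & (i == nrm - j)       # equal-sized outer groups
  }
  pairs <- pairs[keep, , drop = FALSE]
  colnames(pairs) <- c("cp1", "cp2")
  pairs
}

one_cp  <- split_positions(nrm = 10, bandwith = 0.2, nb_of_cp = 1)
two_cp  <- split_positions(nrm = 10, bandwith = 0.2, nb_of_cp = 2)
sym_two <- split_positions(nrm = 10, bandwith = 0.2, nb_of_cp = 2, symtails = TRUE)
```

Under these assumptions, a larger bandwith or the symtails restriction shrinks the set of combinations that has to be evaluated.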