Title: Event Prediction
Version: 0.2.9
Date: 2025-06-10
Description: Predicts enrollment and events at the design or analysis stage using specified enrollment and time-to-event models through simulations.
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
Encoding: UTF-8
RoxygenNote: 7.3.2
Imports: data.table (≥ 1.14.10), magrittr (≥ 2.0.0), ggplot2 (≥ 3.5.1), plotly (≥ 4.10.1), survival (≥ 2.41-3), splines (≥ 3.5.0), Matrix (≥ 1.2-14), mvtnorm (≥ 1.1-3), rstpm2 (≥ 1.6.1), numDeriv (≥ 2016.8-1.1), purrr (≥ 1.0.2), flexsurv (≥ 2.2.2), erify (≥ 0.4.0), stats (≥ 3.5.0), shiny (≥ 1.7.1), rlang (≥ 1.0.6), lrstat (≥ 0.2.12)
Suggests: knitr, rmarkdown, testthat (≥ 3.0.0)
Depends: R (≥ 3.5.0)
VignetteBuilder: knitr
LazyData: true
URL: https://github.com/kaifenglu/eventPred
BugReports: https://github.com/kaifenglu/eventPred/issues
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2025-06-10 14:41:28 UTC; kaife
Author: Kaifeng Lu [aut, cre]
Maintainer: Kaifeng Lu <kaifenglu@gmail.com>
Repository: CRAN
Date/Publication: 2025-06-10 16:00:02 UTC

Event Prediction

Description

Predicts enrollment and events at the design stage using assumed enrollment and treatment-specific time-to-event models, or at the analysis stage using blinded or unblinded data and specified enrollment and time-to-event models through simulations.

Details

Accurately predicting the date at which a target number of subjects or events will be achieved is critical for the planning, monitoring, and execution of clinical trials. The eventPred package provides enrollment and event prediction capabilities at two stages: at the design stage, using assumed enrollment and treatment-specific time-to-event models, and at the analysis stage, using blinded or unblinded data together with specified enrollment and time-to-event models.

At the design stage, enrollment is often specified using a piecewise Poisson process with a constant enrollment rate during each specified time interval. At the analysis stage, before enrollment completion, the eventPred package considers several models, including the homogeneous Poisson model, the time-decay model with an enrollment rate function \lambda(t) = (\mu/\delta) (1 - \exp(-\delta t)), the B-spline model with the daily enrollment rate \lambda(t) = \exp(B(t)'\theta), and the piecewise Poisson model. If prior information exists on the model parameters, it can be combined with the likelihood to yield the posterior distribution.

The eventPred package also offers several time-to-event models, including exponential, Weibull, log-logistic, log-normal, piecewise exponential, model averaging of Weibull and log-normal, spline, and Cox models. For time to dropout, the same set of model options is considered. If enrollment is complete, ongoing subjects who have not had the event of interest or dropped out of the study before the data cut contribute additional events in the future. Their event times are generated from the conditional distribution given that they remain event-free at the data cut. For new subjects who still need to be enrolled, enrollment times and event times are generated from the specified enrollment and time-to-event models with parameters drawn from the posterior distribution. Time to dropout can be generated in a similar fashion.
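
As an illustration of the conditional generation described above, the following base R sketch draws event times for ongoing subjects from a Weibull distribution, conditional on being event-free at the data cut. The function name rweibull_conditional, the shape and scale values, and the follow-up times are hypothetical and are not part of eventPred; the package's own implementation may differ.

```r
# Draw event times T from Weibull(shape, scale) conditional on T > t0 by
# inverting the conditional survival function
# S(t | T > t0) = S(t) / S(t0), where S(t) = exp(-(t / scale)^shape).
rweibull_conditional <- function(t0, shape, scale) {
  u <- runif(length(t0))   # u plays the role of S(t | T > t0)
  scale * ((t0 / scale)^shape - log(u))^(1 / shape)
}

set.seed(123)
t0 <- c(30, 120, 365)      # hypothetical follow-up times (days) at the data cut
t_event <- rweibull_conditional(t0, shape = 1.2, scale = 400)
remaining <- t_event - t0  # additional days until the event after the data cut
```

Because -log(u) is positive, every generated time exceeds the corresponding follow-up time, as a conditional draw must.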

The eventPred package displays the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC) and a fitted curve overlaid with observed data to help users select the most appropriate model for enrollment and event prediction. Prediction intervals in the prediction plot can be used to measure prediction uncertainty, and the simulated enrollment and event data can be used for further data exploration.
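
The prediction intervals mentioned above are summaries of simulated trajectories. The following base R sketch shows the idea for the simplest case, a homogeneous Poisson enrollment process with a known rate. The rate, the number of remaining subjects, and the variable names are hypothetical, and the sketch ignores parameter uncertainty, which eventPred accounts for by drawing parameters from the posterior distribution.

```r
set.seed(2025)
rate <- 0.8          # hypothetical enrollment rate (subjects per day)
n_remaining <- 75    # hypothetical number of subjects still to enroll
nreps <- 500         # number of simulation replications
pilevel <- 0.90      # prediction interval level

# In a homogeneous Poisson process the inter-arrival times are exponential,
# so the time to enroll n_remaining subjects is the sum of n_remaining
# independent exponential waiting times.
days_to_target <- replicate(nreps, sum(rexp(n_remaining, rate = rate)))

# Median and equal-tailed prediction interval across replications
probs <- c(0.5, (1 - pilevel) / 2, 1 - (1 - pilevel) / 2)
summary_days <- quantile(days_to_target, probs = probs)
```

The three elements of summary_days are the median and the lower and upper limits of the interval, in days after the data cut.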

The most useful function in the eventPred package is getPrediction, which combines model fitting, data simulation, and a summary of simulation results. Other functions perform individual tasks and can be used to select an appropriate prediction model.

The eventPred package implements a model parameterization that enhances the asymptotic normality of parameter estimates. Specifically, the package utilizes the following parameterization to achieve this goal:

The eventPred package uses days as its primary time unit. If you need to convert enrollment or event rates per month to rates per day, simply divide by 30.4375.
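
The conversion can be sketched in base R as follows. The monthly enrollment rate and the 12-month median are hypothetical values chosen for illustration.

```r
days_per_month <- 365.25 / 12      # equals 30.4375
rate_per_month <- 20               # hypothetical: 20 subjects enrolled per month
rate_per_day <- rate_per_month / days_per_month

# A monthly hazard rate converts the same way. For example, a median
# survival of 12 months under an exponential model corresponds to
hazard_per_month <- log(2) / 12
hazard_per_day <- hazard_per_month / days_per_month
median_days <- log(2) / hazard_per_day   # 12 * 30.4375 = 365.25 days
```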

Author(s)

Kaifeng Lu, kaifenglu@gmail.com

References

Emilia Bagiella and Daniel F. Heitjan. Predicting analysis times in randomized clinical trials. Stat in Med. 2001; 20:2055-2063.

Gui-shuang Ying and Daniel F. Heitjan. Weibull prediction of event times in clinical trials. Pharm Stat. 2008; 7:107-120.

Xiaoxi Zhang and Qi Long. Stochastic modeling and prediction for accrual in clinical trials. Stat in Med. 2010; 29:649-658.

Patrick Royston and Mahesh K. B. Parmar. Flexible parametric proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Stat in Med. 2002; 21:2175-2197.


Final enrollment and event data after achieving the target number of events

Description

A data frame with 300 rows and 9 columns:

trialsdt

The trial start date

usubjid

The unique subject ID

randdt

The randomization date

treatment

The treatment group number

treatment_description

Description of the treatment group

time

The day of event or censoring since randomization

event

The event indicator: 1 for event, 0 for non-event

dropout

The dropout indicator: 1 for dropout, 0 for non-dropout

cutoffdt

The cutoff date

For ongoing subjects, both event and dropout are equal to 0.

Usage

finalData

Format

An object of class tbl_df (inherits from tbl, data.frame) with 300 rows and 9 columns.
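
To make the column layout concrete, the following sketch builds a small hypothetical data frame with the same nine columns and tabulates subject status. The values are invented for illustration and are not taken from finalData.

```r
toy <- data.frame(
  trialsdt = as.Date("2018-03-01"),
  usubjid = c("A-001", "A-002", "A-003", "A-004"),
  randdt = as.Date(c("2018-03-05", "2018-04-10", "2018-06-21", "2018-09-02")),
  treatment = c(1, 2, 1, 2),
  treatment_description = c("Drug", "Placebo", "Drug", "Placebo"),
  time = c(310, 274, 202, 129),
  event = c(1, 0, 0, 0),
  dropout = c(0, 1, 0, 0),
  cutoffdt = as.Date("2019-01-08")
)

# A subject is ongoing when neither an event nor a dropout has been recorded
toy$status <- ifelse(toy$event == 1, "event",
                     ifelse(toy$dropout == 1, "dropout", "ongoing"))
status_counts <- table(toy$status)
```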


Fit time-to-dropout model

Description

Fits a specified time-to-dropout model to the dropout data.

Usage

fitDropout(
  df,
  dropout_model = "exponential",
  piecewiseDropoutTime = 0,
  k_dropout = 0,
  scale_dropout = "hazard",
  m_dropout = 5,
  showplot = TRUE,
  by_treatment = FALSE,
  covariates = NULL,
  generate_plot = TRUE,
  interactive_plot = TRUE
)

Arguments

df

The subject-level dropout data, including time and dropout. The data should also include treatment coded as 1, 2, and so on, and treatment_description for fitting the dropout model by treatment.

dropout_model

The dropout model used to analyze the dropout data, which can be set to one of the following options: "exponential", "Weibull", "log-logistic", "log-normal", "piecewise exponential", "model averaging", "spline", or "cox". The model averaging uses the exp(-bic/2) weighting and combines Weibull and log-normal models. The spline model of Royston and Parmar (2002) assumes that a transformation of the survival function is modeled as a natural cubic spline function of log time. By default, it is set to "exponential".

piecewiseDropoutTime

A vector that specifies the time intervals for the piecewise exponential dropout distribution. Must start with 0, e.g., c(0, 60) breaks the time axis into 2 dropout time intervals: [0, 60) and [60, Inf). By default, it is set to 0.

k_dropout

The number of inner knots of the spline. The default k_dropout=0 gives a Weibull, log-logistic or log-normal model, if scale_dropout is "hazard", "odds", or "normal", respectively. The knots are chosen as equally-spaced quantiles of the log uncensored survival times. The boundary knots are chosen as the minimum and maximum log uncensored survival times.

scale_dropout

The scale of the spline. The default is "hazard", in which case the log cumulative hazard is modeled as a spline function. If scale_dropout = "odds", the log cumulative odds is modeled as a spline function. If scale_dropout = "normal", -qnorm(S(t)) is modeled as a spline function.

m_dropout

The number of dropout time intervals to extrapolate the hazard function beyond the last observed dropout time when dropout_model = "cox".

showplot

A Boolean variable to control whether or not to show the fitted time-to-dropout survival curve. By default, it is set to TRUE.

by_treatment

A Boolean variable to control whether or not to fit the time-to-dropout data by treatment group. By default, it is set to FALSE.

covariates

The names of baseline covariates from the input data frame to include in the dropout model, e.g., c("age", "sex"). Factor variables need to be declared in the input data frame.

generate_plot

Whether to generate plots.

interactive_plot

Whether to produce interactive plots using plotly or static plots using ggplot2.

Value

A list of results from the model fit, including the dropout model (model), the estimated model parameters (theta), the covariance matrix (vtheta), the Akaike Information Criterion (aic), and the Bayesian Information Criterion (bic).

If the piecewise exponential model is used, the location of knots used in the model, piecewiseDropoutTime, will be included in the list of results.

If the model averaging option is chosen, the weight assigned to the Weibull component is indicated by the w1 variable.

If the spline option is chosen, the knots and scale will be included in the list of results.

If the cox option is chosen, the list of results will include model, theta, vtheta, aic, bic, and piecewiseDropoutTime. Here

\theta = (\log(\lambda_1), \ldots, \log(\lambda_M), \beta^T)^T,

M denotes the number of distinct observed dropout times, t_1 < \cdots < t_M, \lambda_j denotes the estimated baseline hazard rate in the jth dropout time interval, (t_{j-1}, t_j], and \beta represents the regression coefficients (log hazard ratios) from the Cox model. For a fair comparison, the estimation of baseline hazards is incorporated into the aic and bic values. In addition, \mbox{piecewiseDropoutTime} = (0, t_1, \ldots, t_M). To extend the survival curve beyond the last observed dropout time, a weighted average of the hazard rates from the final m_dropout dropout time intervals is used. The weights are proportional to the lengths of those intervals, i.e.,

\lambda_{M+1} = \sum_{j=M-m_{\rm{dropout}}+1}^{M} w_j \lambda_j,

where w_j = (t_j - t_{j-1})/(t_M - t_{M-m_{\rm{dropout}}}) for j=M-m_{\rm{dropout}}+1,\ldots,M.

When fitting the dropout model by treatment, the outcome is presented as a list of lists, where each list element corresponds to a specific treatment group.

The fitted time-to-dropout survival curve is also returned.
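
The extrapolation rule for the cox option can be sketched in base R as follows. The dropout times and hazard rates are hypothetical, and the code is a direct transcription of the formula for \lambda_{M+1} rather than the package's own implementation. It assumes m_dropout does not exceed M.

```r
# Hypothetical distinct dropout times t_1 < ... < t_M and the estimated
# baseline hazard rate in each interval (t_{j-1}, t_j], with t_0 = 0
t <- c(20, 45, 80, 130, 190, 260, 350)
lambda <- c(0.0010, 0.0008, 0.0012, 0.0009, 0.0011, 0.0007, 0.0010)
M <- length(t)
m_dropout <- 5

tt <- c(0, t)                       # tt[j + 1] holds t_j
j <- (M - m_dropout + 1):M          # indices of the final m_dropout intervals
len <- tt[j + 1] - tt[j]            # interval lengths t_j - t_{j-1}
w <- len / (tt[M + 1] - tt[M - m_dropout + 1])  # denominator: t_M - t_{M - m_dropout}
lambda_next <- sum(w * lambda[j])   # hazard rate used beyond t_M
```

The weights sum to one, so the extrapolated hazard always lies between the smallest and largest of the hazard rates being averaged.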

Author(s)

Kaifeng Lu, kaifenglu@gmail.com

References

Patrick Royston and Mahesh K. B. Parmar. Flexible parametric proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Stat in Med. 2002; 21:2175-2197.

Examples


dropout_fit <- fitDropout(
  df = interimData2,
  dropout_model = "exponential")


Fit enrollment model

Description

Fits a specified enrollment model to the enrollment data.

Usage

fitEnrollment(
  df,
  enroll_model = "b-spline",
  nknots = 0,
  accrualTime = 0,
  showplot = TRUE,
  generate_plot = TRUE,
  interactive_plot = TRUE
)

Arguments

df

The subject-level enrollment data, including trialsdt, randdt and cutoffdt.

enroll_model

The enrollment model, which can be specified as "Poisson", "Time-decay", "B-spline", or "Piecewise Poisson". By default, it is set to "B-spline".

nknots

The number of inner knots for the B-spline enrollment model. By default, it is set to 0.

accrualTime

The accrual time intervals for the piecewise Poisson model. Must start with 0, e.g., c(0, 30) breaks the time axis into 2 accrual intervals: [0, 30) and [30, Inf). By default, it is set to 0.

showplot

A Boolean variable to control whether or not to show the fitted enrollment curve. By default, it is set to TRUE.

generate_plot

Whether to generate plots.

interactive_plot

Whether to produce interactive plots using plotly or static plots using ggplot2.

Details

For the time-decay model, the mean function is

\mu(t) = (\mu/\delta)(t - (1/\delta)(1 - \exp(-\delta t)))

and the rate function is

\lambda(t) = (\mu/\delta)(1 - \exp(-\delta t)).

For the B-spline model, the daily enrollment rate is \lambda(t) = \exp(B(t)' \theta), where B(t) represents the B-spline basis functions.
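
The relationship between the time-decay rate and mean functions can be checked numerically. The sketch below uses hypothetical values of \mu and \delta and base R only; it illustrates the formulas above and is not the package's fitting code.

```r
mu <- 1.5       # hypothetical parameter values
delta <- 0.02

rate_fun <- function(t) (mu / delta) * (1 - exp(-delta * t))
mean_fun <- function(t) (mu / delta) * (t - (1 / delta) * (1 - exp(-delta * t)))

# The mean function is the integral of the rate function from 0 to t
t_end <- 180
numeric_mean <- integrate(rate_fun, lower = 0, upper = t_end)$value
closed_form_mean <- mean_fun(t_end)

# The rate starts at 0 and levels off at mu / delta as t grows
plateau <- mu / delta
```

Here numeric_mean and closed_form_mean agree up to integration error, and plateau is the long-run daily enrollment rate implied by the model.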

Value

A list of results from the model fit, including the enrollment model (model), the estimated model parameters (theta), the covariance matrix (vtheta), the Akaike Information Criterion (aic), and the Bayesian Information Criterion (bic), as well as the design matrix x for the B-spline enrollment model and accrualTime for the piecewise Poisson enrollment model.

The fitted enrollment curve is also returned.

Author(s)

Kaifeng Lu, kaifenglu@gmail.com

References

Xiaoxi Zhang and Qi Long. Stochastic modeling and prediction for accrual in clinical trials. Stat in Med. 2010; 29:649-658.

Examples


enroll_fit <- fitEnrollment(
  df = interimData1, enroll_model = "b-spline",
  nknots = 1)


Fit time-to-event model

Description

Fits a specified time-to-event model to the event data.

Usage

fitEvent(
  df,
  event_model = "model averaging",
  piecewiseSurvivalTime = 0,
  k = 0,
  scale = "hazard",
  m = 5,
  showplot = TRUE,
  by_treatment = FALSE,
  covariates = NULL,
  generate_plot = TRUE,
  interactive_plot = TRUE
)

Arguments

df

The subject-level event data, including time and event. The data should also include treatment coded as 1, 2, and so on, and treatment_description for fitting the event model by treatment.

event_model

The event model used to analyze the event data, which can be set to one of the following options: "exponential", "Weibull", "log-logistic", "log-normal", "piecewise exponential", "model averaging", "spline", or "cox". The model averaging uses the exp(-bic/2) weighting and combines Weibull and log-normal models. The spline model of Royston and Parmar (2002) assumes that a transformation of the survival function is modeled as a natural cubic spline function of log time. By default, it is set to "model averaging".

piecewiseSurvivalTime

A vector that specifies the time intervals for the piecewise exponential survival distribution. Must start with 0, e.g., c(0, 60) breaks the time axis into 2 event intervals: [0, 60) and [60, Inf). By default, it is set to 0.

k

The number of inner knots of the spline. The default k=0 gives a Weibull, log-logistic or log-normal model, if scale is "hazard", "odds", or "normal", respectively. The knots are chosen as equally-spaced quantiles of the log uncensored survival times. The boundary knots are chosen as the minimum and maximum log uncensored survival times.

scale

The scale of the spline. The default is "hazard", in which case the log cumulative hazard is modeled as a spline function. If scale = "odds", the log cumulative odds is modeled as a spline function. If scale = "normal", -qnorm(S(t)) is modeled as a spline function.

m

The number of event time intervals to extrapolate the hazard function beyond the last observed event time when event_model = "cox".

showplot

A Boolean variable to control whether or not to show the fitted time-to-event survival curve. By default, it is set to TRUE.

by_treatment

A Boolean variable to control whether or not to fit the time-to-event data by treatment group. By default, it is set to FALSE.

covariates

The names of baseline covariates from the input data frame to include in the event model, e.g., c("age", "sex"). Factor variables need to be declared in the input data frame.

generate_plot

Whether to generate plots.

interactive_plot

Whether to produce interactive plots using plotly or static plots using ggplot2.

Value

A list of results from the model fit, including the event model (model), the estimated model parameters (theta), the covariance matrix (vtheta), the Akaike Information Criterion (aic), and the Bayesian Information Criterion (bic).

If the piecewise exponential model is used, the location of knots used in the model, piecewiseSurvivalTime, will be included in the list of results.

If the model averaging option is chosen, the weight assigned to the Weibull component is indicated by the w1 variable.

If the spline option is chosen, the knots and scale will be included in the list of results.

If the cox option is chosen, the list of results will include model, theta, vtheta, aic, bic, and piecewiseSurvivalTime. Here

\theta = (\log(\lambda_1), \ldots, \log(\lambda_M), \beta^T)^T,

M denotes the number of distinct observed event times, t_1 < \cdots < t_M, \lambda_j denotes the estimated baseline hazard rate in the jth event time interval, (t_{j-1}, t_j], and \beta represents the regression coefficients (log hazard ratios) from the Cox model. For a fair comparison, the estimation of baseline hazards is incorporated into the aic and bic values. In addition, \mbox{piecewiseSurvivalTime} = (0, t_1, \ldots, t_M). To extend the survival curve beyond the last observed event time, a weighted average of the hazard rates from the final m event time intervals is used. The weights are proportional to the lengths of those intervals, i.e.,

\lambda_{M+1} = \sum_{j=M-m+1}^{M} w_j \lambda_j,

where w_j = (t_j - t_{j-1})/(t_M - t_{M-m}) for j=M-m+1,\ldots,M.

When fitting the event model by treatment, the outcome is presented as a list of lists, where each list element corresponds to a specific treatment group.

The fitted time-to-event survival curve is also returned.
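
The exp(-bic/2) weighting used by the model averaging option can be sketched as follows. The BIC values are hypothetical, and the sketch assumes that the two weights are normalized to sum to one, which the text above does not state explicitly.

```r
bic_weibull <- 1523.4      # hypothetical BIC values from two fitted models
bic_lnorm <- 1526.1

# Subtract the smaller BIC before exponentiating to avoid underflow;
# the common factor cancels in the ratio
bic <- c(weibull = bic_weibull, lnorm = bic_lnorm)
rel <- exp(-(bic - min(bic)) / 2)
w <- rel / sum(rel)
w1 <- w[["weibull"]]       # weight of the Weibull component
```

The model with the smaller BIC receives the larger weight; in this example the Weibull model does.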

Author(s)

Kaifeng Lu, kaifenglu@gmail.com

References

Patrick Royston and Mahesh K. B. Parmar. Flexible parametric proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Stat in Med. 2002; 21:2175-2197.

Examples


event_fit <- fitEvent(
  df = interimData2,
  event_model = "piecewise exponential",
  piecewiseSurvivalTime = c(0, 180))


Enrollment and event prediction

Description

Performs enrollment and event prediction by utilizing observed data and specified enrollment and event models.

Usage

getPrediction(
  df = NULL,
  to_predict = "enrollment and event",
  target_n = NA,
  target_d = NA,
  enroll_model = "b-spline",
  nknots = 0,
  lags = 30,
  accrualTime = 0,
  enroll_prior = NULL,
  event_model = "model averaging",
  piecewiseSurvivalTime = 0,
  k = 0,
  scale = "hazard",
  m = 5,
  event_prior = NULL,
  dropout_model = "exponential",
  piecewiseDropoutTime = 0,
  k_dropout = 0,
  scale_dropout = "hazard",
  m_dropout = 5,
  dropout_prior = NULL,
  fixedFollowup = FALSE,
  followupTime = 365,
  pilevel = 0.9,
  nyears = 4,
  target_t = NA,
  nreps = 500,
  showEnrollment = TRUE,
  showEvent = TRUE,
  showDropout = FALSE,
  showOngoing = FALSE,
  showsummary = TRUE,
  showplot = TRUE,
  by_treatment = FALSE,
  ngroups = 1,
  alloc = NULL,
  treatment_label = NULL,
  covariates_event = NULL,
  event_prior_with_covariates = NULL,
  covariates_dropout = NULL,
  dropout_prior_with_covariates = NULL,
  fix_parameter = FALSE,
  generate_plot = TRUE,
  interactive_plot = TRUE
)

Arguments

df

The subject-level enrollment and event data, including trialsdt, usubjid, randdt, and cutoffdt for enrollment prediction, and, additionally, time, event, and dropout for event prediction. The data should also include treatment coded as 1, 2, and so on, and treatment_description for enrollment and event prediction by treatment. By default, it is set to NULL for enrollment and event prediction at the design stage.

to_predict

Specifies what to predict: "enrollment only", "event only", or "enrollment and event". By default, it is set to "enrollment and event".

target_n

The target number of subjects to enroll in the study.

target_d

The target number of events to reach in the study.

enroll_model

The enrollment model, which can be specified as "Poisson", "Time-decay", "B-spline", or "Piecewise Poisson". By default, it is set to "B-spline".

nknots

The number of inner knots for the B-spline enrollment model. By default, it is set to 0.

lags

The day lags to compute the average enrollment rate to carry forward for the B-spline enrollment model. By default, it is set to 30.

accrualTime

The accrual time intervals for the piecewise Poisson model. Must start with 0, e.g., c(0, 30) breaks the time axis into 2 accrual intervals: [0, 30) and [30, Inf). By default, it is set to 0.

enroll_prior

The prior of enrollment model parameters.

event_model

The event model used to analyze the event data, which can be set to one of the following options: "exponential", "Weibull", "log-logistic", "log-normal", "piecewise exponential", "model averaging", "spline", or "cox". The model averaging uses the exp(-bic/2) weighting and combines Weibull and log-normal models. By default, it is set to "model averaging".

piecewiseSurvivalTime

A vector that specifies the time intervals for the piecewise exponential survival distribution. Must start with 0, e.g., c(0, 60) breaks the time axis into 2 event intervals: [0, 60) and [60, Inf). By default, it is set to 0.

k

The number of inner knots of the spline event model of Royston and Parmar (2002). The default k=0 gives a Weibull, log-logistic or log-normal model, if scale is "hazard", "odds", or "normal", respectively. The knots are chosen as equally-spaced quantiles of the log uncensored survival times. The boundary knots are chosen as the minimum and maximum log uncensored survival times.

scale

If "hazard", the log cumulative hazard is modeled as a spline function. If "odds", the log cumulative odds is modeled as a spline function. If "normal", -qnorm(S(t)) is modeled as a spline function.

m

The number of event time intervals to extrapolate the hazard function beyond the last observed event time.

event_prior

The prior of event model parameters.

dropout_model

The dropout model used to analyze the dropout data, which can be set to one of the following options: "none", "exponential", "Weibull", "log-logistic", "log-normal", "piecewise exponential", "model averaging", "spline", or "cox". The model averaging uses the exp(-bic/2) weighting and combines Weibull and log-normal models. By default, it is set to "exponential".

piecewiseDropoutTime

A vector that specifies the time intervals for the piecewise exponential dropout distribution. Must start with 0, e.g., c(0, 60) breaks the time axis into 2 dropout time intervals: [0, 60) and [60, Inf). By default, it is set to 0.

k_dropout

The number of inner knots of the spline dropout model of Royston and Parmar (2002). The default k_dropout=0 gives a Weibull, log-logistic or log-normal model, if scale_dropout is "hazard", "odds", or "normal", respectively. The knots are chosen as equally-spaced quantiles of the log uncensored survival times. The boundary knots are chosen as the minimum and maximum log uncensored survival times.

scale_dropout

If "hazard", the log cumulative hazard for dropout is modeled as a spline function. If "odds", the log cumulative odds is modeled as a spline function. If "normal", -qnorm(S(t)) is modeled as a spline function.

m_dropout

The number of dropout time intervals to extrapolate the hazard function beyond the last observed dropout time.

dropout_prior

The prior of dropout model parameters.

fixedFollowup

A Boolean variable indicating whether a fixed follow-up design is used. By default, it is set to FALSE for a variable follow-up design.

followupTime

The follow-up time for a fixed follow-up design, in days. By default, it is set to 365.

pilevel

The prediction interval level. By default, it is set to 0.90.

nyears

The number of years after the data cut for prediction. By default, it is set to 4.

target_t

The target number of days after the data cutoff used to predict both the number of events and the probability of achieving the target event count.

nreps

The number of replications for simulation. By default, it is set to 500.

showEnrollment

A Boolean variable to control whether or not to show the number of enrolled subjects. By default, it is set to TRUE.

showEvent

A Boolean variable to control whether or not to show the number of events. By default, it is set to TRUE.

showDropout

A Boolean variable to control whether or not to show the number of dropouts. By default, it is set to FALSE.

showOngoing

A Boolean variable to control whether or not to show the number of ongoing subjects. By default, it is set to FALSE.

showsummary

A Boolean variable to control whether or not to show the prediction summary. By default, it is set to TRUE.

showplot

A Boolean variable to control whether or not to show the plots. By default, it is set to TRUE.

by_treatment

A Boolean variable to control whether or not to predict by treatment group. By default, it is set to FALSE.

ngroups

The number of treatment groups for enrollment prediction at the design stage. By default, it is set to 1. It is replaced with the actual number of treatment groups in the observed data if df is not NULL.

alloc

The treatment allocation in a randomization block. By default, it is set to NULL, which yields equal allocation among the treatment groups.

treatment_label

The treatment labels for treatments in a randomization block for design stage prediction. It is replaced with the treatment_description in the observed data if df is not NULL.

covariates_event

The names of baseline covariates from the input data frame to include in the event model, e.g., c("age", "sex"). Factor variables need to be declared in the input data frame.

event_prior_with_covariates

The prior of event model parameters in the presence of covariates.

covariates_dropout

The names of baseline covariates from the input data frame to include in the dropout model, e.g., c("age", "sex"). Factor variables need to be declared in the input data frame.

dropout_prior_with_covariates

The prior of dropout model parameters in the presence of covariates.

fix_parameter

Whether to fix parameters at the maximum likelihood estimates when generating new data for prediction. Defaults to FALSE, in which case, parameters will be drawn from their approximate posterior distribution.

generate_plot

Whether to generate plots.

interactive_plot

Whether to produce interactive plots using plotly or static plots using ggplot2.

Details

For the time-decay model, the mean function is \mu(t) = (\mu/\delta)(t - (1/\delta)(1 - \exp(-\delta t))) and the rate function is \lambda(t) = (\mu/\delta)(1 - \exp(-\delta t)). For the B-spline model, the daily enrollment rate is approximated as \lambda(t) = \exp(B(t)' \theta), where B(t) represents the B-spline basis functions.

The enroll_prior variable should be a list that includes model to specify the enrollment model (poisson, time-decay, or piecewise poisson), and theta and vtheta to indicate the parameter values and the covariance matrix. One can use a very small value of vtheta to fix the parameter values. For the piecewise Poisson enrollment model, the list should also include accrualTime. Note that the B-spline model is not appropriate for use as a prior.

For event prediction by treatment with prior information, the event_prior (dropout_prior) variable should be a list with one element per treatment. For each treatment, the element should include model to specify the event (dropout) model (exponential, weibull, log-logistic, log-normal, or piecewise exponential), and theta and vtheta to indicate the parameter values and the covariance matrix. For the piecewise exponential event (dropout) model, the list should also include piecewiseSurvivalTime (piecewiseDropoutTime) to indicate the location of knots. Note that the model averaging, spline, and cox options are not appropriate for use as a prior.

If the event prediction is not by treatment while the prior information is given by treatment, then each element of event_prior (dropout_prior) should also include w to specify the weight of the treatment in a randomization block. If the prediction is not by treatment and the prior is given for the overall study, then event_prior (dropout_prior) is a flat list with model, theta, and vtheta. For the piecewise exponential event (dropout) model, it should also include piecewiseSurvivalTime (piecewiseDropoutTime) to indicate the location of knots.

For analysis-stage enrollment and event prediction, enroll_prior, event_prior, and dropout_prior can each be set to NULL, in which case only the observed data are used, or set to a prior distribution of the model parameters, which is then combined with the observed-data likelihood for enhanced modeling flexibility.
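
The list structures described above can be sketched as follows. The numeric values are hypothetical. The sketch also assumes that theta holds log rates per day and that a scalar vtheta is acceptable for a one-parameter model; neither point is stated in this section, so both should be checked against the package's parameterization before use.

```r
# Overall (not by treatment) exponential event prior.
# Assumption: theta is the log hazard rate per day.
event_prior_overall <- list(
  model = "exponential",
  theta = log(log(2) / 365),   # hypothetical median event time of 365 days
  vtheta = 0.01                # hypothetical prior variance
)

# Prior given by treatment while predicting for the overall study:
# one element per treatment, each with a weight w (hypothetical 1:1 allocation)
event_prior_by_trt <- list(
  list(model = "exponential", theta = log(log(2) / 450), vtheta = 0.01, w = 0.5),
  list(model = "exponential", theta = log(log(2) / 300), vtheta = 0.01, w = 0.5)
)

# Piecewise Poisson enrollment prior, including the interval start times.
# Assumption: theta holds the log enrollment rate per day in each interval.
enroll_prior <- list(
  model = "piecewise poisson",
  theta = log(c(0.5, 1.0)),
  vtheta = diag(c(0.01, 0.01)),
  accrualTime = c(0, 30)
)
```

Lists of this shape would be passed to getPrediction through the enroll_prior, event_prior, and dropout_prior arguments.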

Value

A list that includes the fits of observed data models, as well as simulated enrollment data for new subjects and simulated event data for ongoing and new subjects.

Author(s)

Kaifeng Lu, kaifenglu@gmail.com

Examples

# Event prediction after enrollment completion
set.seed(3000)

pred <- getPrediction(
  df = interimData2, to_predict = "event only",
  target_d = 200,
  event_model = "weibull",
  dropout_model = "exponential",
  pilevel = 0.90, nreps = 100)


Interim enrollment and event data before enrollment completion

Description

A data frame with 224 rows and 9 columns:

trialsdt

The trial start date

usubjid

The unique subject ID

randdt

The randomization date

treatment

The treatment group number

treatment_description

Description of the treatment group

time

The day of event or censoring since randomization

event

The event indicator: 1 for event, 0 for non-event

dropout

The dropout indicator: 1 for dropout, 0 for non-dropout

cutoffdt

The cutoff date

For ongoing subjects, both event and dropout are equal to 0.

Usage

interimData1

Format

An object of class tbl_df (inherits from tbl, data.frame) with 224 rows and 9 columns.


Interim enrollment and event data after enrollment completion

Description

A data frame with 300 rows and 9 columns:

trialsdt

The trial start date

usubjid

The unique subject ID

randdt

The randomization date

treatment

The treatment group number

treatment_description

Description of the treatment group

time

The day of event or censoring since randomization

event

The event indicator: 1 for event, 0 for non-event

dropout

The dropout indicator: 1 for dropout, 0 for non-dropout

cutoffdt

The cutoff date

For ongoing subjects, both event and dropout are equal to 0.

Usage

interimData2

Format

An object of class tbl_df (inherits from tbl, data.frame) with 300 rows and 9 columns.


Profile log likelihood for piecewise exponential regression

Description

Obtains the profile log likelihood value for piecewise exponential regression.

Usage

pllik_pwexp(beta, time, event, J, tcut, x)

Arguments

beta

The regression coefficients with respect to the covariates.

time

The survival time.

event

The event indicator.

J

The number of time intervals.

tcut

A vector that specifies the endpoints of time intervals for the baseline piecewise exponential survival distribution. Must start with 0, e.g., c(0, 60) breaks the time axis into 2 event intervals: [0, 60) and [60, Inf).

x

The covariates matrix (including the intercept).

Value

The profile log likelihood value for piecewise exponential regression.


Distribution function for model averaging of Weibull and log-normal

Description

Obtains the distribution function value for model-averaging of Weibull and log-normal regression.

Usage

pmodavg(t, theta, w1, q = 0, x = 1, lower.tail = TRUE, log.p = FALSE)

Arguments

t

The vector of time points.

theta

The parameter vector consisting of the accelerated failure time (AFT) regression coefficients and the logarithm of the AFT regression scale parameter for the Weibull and log-normal distributions.

w1

The weight for the Weibull component distribution.

q

The number of elements in the vector of covariates (excluding the intercept).

x

The vector of covariates (including the intercept).

lower.tail

logical; if TRUE (default), probabilities are the distribution function, otherwise, the survival function.

log.p

logical; if TRUE, probabilities p are given as log(p).

Value

The probabilities p = P(T <= t | X = x).
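
A minimal sketch of the model-averaged distribution function without covariates, using only base R. The function modavg_cdf_sketch and its argument names are hypothetical and do not reproduce the layout of theta; the mapping from AFT parameters (intercept b and log scale) to the Weibull shape and scale follows the usual AFT convention and is assumed here.

```r
# Hypothetical helper: weighted average of Weibull and log-normal CDFs.
# ASSUMPTION: AFT convention with Weibull shape = 1 / sigma and
# scale = exp(intercept); log-normal meanlog = intercept, sdlog = sigma.
modavg_cdf_sketch <- function(t, w1, b_weib, log_sigma_weib,
                              b_lnorm, log_sigma_lnorm) {
  p_weib  <- pweibull(t, shape = exp(-log_sigma_weib), scale = exp(b_weib))
  p_lnorm <- plnorm(t, meanlog = b_lnorm, sdlog = exp(log_sigma_lnorm))
  w1 * p_weib + (1 - w1) * p_lnorm
}

t_grid <- c(10, 100, 500)
p_avg <- modavg_cdf_sketch(t_grid, w1 = 0.6,
                           b_weib = log(300), log_sigma_weib = log(0.8),
                           b_lnorm = log(250), log_sigma_lnorm = log(1.1))
p_weib_only <- modavg_cdf_sketch(t_grid, w1 = 1,
                                 b_weib = log(300), log_sigma_weib = log(0.8),
                                 b_lnorm = log(250), log_sigma_lnorm = log(1.1))
```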


Distribution function for piecewise exponential regression

Description

Obtains the distribution function value for piecewise exponential regression.

Usage

ppwexp(t, theta, J, tcut, q = 0, x = 1, lower.tail = TRUE, log.p = FALSE)

Arguments

t

The vector of time points.

theta

The parameter vector consisting of gamma for log piecewise hazards and beta for regression coefficients.

J

The number of time intervals.

tcut

A vector that specifies the endpoints of time intervals for the piecewise exponential survival distribution. Must start with 0, e.g., c(0, 60) breaks the time axis into 2 event intervals: [0, 60) and [60, Inf). By default, it is set to 0.

q

The number of elements in the vector of covariates (excluding the intercept).

x

The vector of covariates (including the intercept).

lower.tail

logical; if TRUE (default), probabilities are the distribution function, otherwise, the survival function.

log.p

logical; if TRUE, probabilities p are given as log(p).

Value

The probabilities p = P(T <= t | X = x).
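
For the case without covariates, the distribution function can be sketched in base R by accumulating the hazard over the intervals. The function pwexp_cdf_sketch is hypothetical and is not the package implementation; gamma denotes the log piecewise hazards, as in theta above.

```r
# Hypothetical helper: CDF of a piecewise exponential distribution
# with log hazards gamma on intervals starting at tcut (no covariates).
pwexp_cdf_sketch <- function(t, gamma, tcut) {
  lambda <- exp(gamma)
  upper <- c(tcut[-1], Inf)
  # Cumulative hazard: hazard times the time spent in each interval
  cumhaz <- sapply(t, function(ti) {
    sum(lambda * pmax(0, pmin(ti, upper) - tcut))
  })
  1 - exp(-cumhaz)
}

# Two intervals, [0, 60) and [60, Inf), with hazards 0.01 and 0.02 per day
p_two <- pwexp_cdf_sketch(c(30, 100), gamma = log(c(0.01, 0.02)),
                          tcut = c(0, 60))

# A single interval reduces to the exponential distribution
p_one <- pwexp_cdf_sketch(c(30, 100), gamma = log(0.01), tcut = 0)
```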


Predict enrollment

Description

Utilizes a pre-fitted enrollment model to generate enrollment times for new subjects and provide a prediction interval for the expected time to reach the enrollment target.

Usage

predictEnrollment(
  df = NULL,
  target_n = NA,
  enroll_fit = NULL,
  lags = 30,
  pilevel = 0.9,
  nyears = 4,
  nreps = 500,
  showsummary = TRUE,
  showplot = TRUE,
  by_treatment = FALSE,
  ngroups = 1,
  alloc = NULL,
  treatment_label = NULL,
  fix_parameter = FALSE,
  generate_plot = TRUE,
  interactive_plot = TRUE
)

Arguments

df

The subject-level enrollment data, including trialsdt, randdt and cutoffdt. The data should also include treatment coded as 1, 2, and so on, and treatment_description for prediction by treatment group. By default, it is set to NULL for enrollment prediction at the design stage.

target_n

The target number of subjects to enroll in the study.

enroll_fit

The pre-fitted enrollment model used to generate predictions.

lags

The day lags to compute the average enrollment rate to carry forward for the B-spline enrollment model. By default, it is set to 30.

pilevel

The prediction interval level. By default, it is set to 0.90.

nyears

The number of years after the data cut for prediction. By default, it is set to 4.

nreps

The number of replications for simulation. By default, it is set to 500.

showsummary

A Boolean variable to control whether or not to show the prediction summary. By default, it is set to TRUE.

showplot

A Boolean variable to control whether or not to show the prediction plot. By default, it is set to TRUE.

by_treatment

A Boolean variable to control whether or not to predict enrollment by treatment group. By default, it is set to FALSE.

ngroups

The number of treatment groups for enrollment prediction at the design stage. By default, it is set to 1. It is replaced with the actual number of treatment groups in the observed data if df is not NULL.

alloc

The treatment allocation in a randomization block. By default, it is set to NULL, which yields equal allocation among the treatment groups.

treatment_label

The treatment labels for treatments in a randomization block for design stage prediction. It is replaced with the treatment_description in the observed data if df is not NULL.

fix_parameter

Whether to fix parameters at the maximum likelihood estimates when generating new data for prediction. Defaults to FALSE, in which case, parameters will be drawn from their approximate posterior distributions.

generate_plot

Whether to generate plots.

interactive_plot

Whether to produce interactive plots using plotly or static plots using ggplot2.

Details

The enroll_fit variable can be used for enrollment prediction at the design stage. A piecewise Poisson model can be parameterized through the time intervals, accrualTime, which is treated as fixed, and the enrollment rates in the intervals, accrualIntensity, the log of which is used as the model parameter. For the homogeneous Poisson, time-decay, and piecewise Poisson models, enroll_fit is used to specify the prior distribution of model parameters, with a very small variance being used to fix the parameter values. It should be noted that the B-spline model is not appropriate for use during the design stage.

During the enrollment stage, enroll_fit is the enrollment model fit based on the observed data. The fitted enrollment model is used to generate enrollment times for new subjects.
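
To make the design-stage piecewise Poisson parameterization concrete, the sketch below computes the expected day on which a target enrollment is reached from accrualTime and theta = log(accrualIntensity). The helper expected_time_to_target is hypothetical; it uses mean rates only and is not the simulation performed by predictEnrollment. The rates are read as subjects per day, which is an interpretation of the example values.

```r
# Monthly ramp-up expressed in subjects per day (assumed interpretation)
accrualTime <- seq(0, 8) * 30.4375
theta <- log(26 / 9 * seq(1, 9) / 30.4375)

# Hypothetical helper: day at which the expected cumulative enrollment
# under piecewise constant rates first reaches target_n
expected_time_to_target <- function(target_n, accrualTime, accrualIntensity) {
  J <- length(accrualTime)
  # Expected number enrolled by the start of each interval
  n_start <- c(0, cumsum(accrualIntensity[-J] * diff(accrualTime)))
  j <- findInterval(target_n, n_start)
  accrualTime[j] + (target_n - n_start[j]) / accrualIntensity[j]
}

t300 <- expected_time_to_target(300, accrualTime, exp(theta))
```

The simulated median from predictEnrollment would be expected to lie near this value, although the two need not agree exactly.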

Value

A list of prediction results, including the median and the lower and upper percentiles of the estimated time to reach the target number of subjects, the simulated enrollment data for new subjects, and the data for the prediction plot.
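
The way a median and a prediction interval can be read off simulation replicates is sketched below. The replicate values are generated from a homogeneous Poisson process with a made-up rate, and the equal-tailed percentile choice is an assumption about how pilevel maps to percentiles rather than a statement of the package internals.

```r
set.seed(123)
pilevel <- 0.90
nreps <- 500
target_n <- 300
rate_per_day <- 300 / 473  # hypothetical constant enrollment rate

# Under a homogeneous Poisson process, the arrival time of the
# target_n-th subject follows a gamma distribution
sim_days <- rgamma(nreps, shape = target_n, rate = rate_per_day)

# ASSUMPTION: equal-tailed interval, e.g. 5th and 95th percentiles for 0.90
probs <- c((1 - pilevel) / 2, 0.5, (1 + pilevel) / 2)
pred_summary <- quantile(sim_days, probs = probs, names = FALSE)
names(pred_summary) <- c("lower", "median", "upper")
print(round(pred_summary))
```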

Author(s)

Kaifeng Lu, kaifenglu@gmail.com

References

Xiaoxi Zhang and Qi Long. Stochastic modeling and prediction for accrual in clinical trials. Stat in Med. 2010; 29:649-658.

Examples

# Enrollment prediction at the design stage
set.seed(1000)

enroll_pred <- predictEnrollment(
  target_n = 300,
  enroll_fit = list(
    model = "piecewise poisson",
    theta = log(26/9*seq(1, 9)/30.4375),
    vtheta = diag(9)*1e-8,
    accrualTime = seq(0, 8)*30.4375),
  pilevel = 0.90, nreps = 100)


Predict event

Description

Utilizes pre-fitted time-to-event and time-to-dropout models to generate event and dropout times for ongoing subjects and new subjects. It also provides a prediction interval for the expected time to reach the target number of events.

Usage

predictEvent(
  df = NULL,
  target_d = NA,
  newSubjects = NULL,
  event_fit = NULL,
  m = 5,
  dropout_fit = NULL,
  m_dropout = 5,
  fixedFollowup = FALSE,
  followupTime = 365,
  pilevel = 0.9,
  nyears = 4,
  target_t = NA,
  nreps = 500,
  showEnrollment = TRUE,
  showEvent = TRUE,
  showDropout = FALSE,
  showOngoing = FALSE,
  showsummary = TRUE,
  showplot = TRUE,
  by_treatment = FALSE,
  covariates_event = NULL,
  event_fit_with_covariates = NULL,
  covariates_dropout = NULL,
  dropout_fit_with_covariates = NULL,
  fix_parameter = FALSE,
  generate_plot = TRUE,
  interactive_plot = TRUE
)

Arguments

df

The subject-level enrollment and event data, including trialsdt, usubjid, randdt, cutoffdt, time, event, and dropout. The data should also include treatment coded as 1, 2, and so on, and treatment_description for by-treatment prediction. By default, it is set to NULL for event prediction at the design stage.

target_d

The target number of events to reach in the study.

newSubjects

The enrollment data for new subjects including draw and arrivalTime. The data should also include treatment for prediction by treatment. By default, it is set to NULL, indicating the completion of subject enrollment.

event_fit

The pre-fitted event model used to generate predictions.

m

The number of event time intervals to extrapolate the hazard function beyond the last observed event time.

dropout_fit

The pre-fitted dropout model used to generate predictions. By default, it is set to NULL, indicating no dropout.

m_dropout

The number of dropout time intervals to extrapolate the hazard function beyond the last observed dropout time.

fixedFollowup

A Boolean variable indicating whether a fixed follow-up design is used. By default, it is set to FALSE for a variable follow-up design.

followupTime

The follow-up time for a fixed follow-up design, in days. By default, it is set to 365.

pilevel

The prediction interval level. By default, it is set to 0.90.

nyears

The number of years after the data cut for prediction. By default, it is set to 4.

target_t

The target number of days after the data cutoff used to predict both the number of events and the probability of achieving the target event count.

nreps

The number of replications for simulation. By default, it is set to 500. If newSubjects is not NULL, the number of draws in newSubjects should be nreps.

showEnrollment

A Boolean variable to control whether or not to show the number of enrolled subjects. By default, it is set to TRUE.

showEvent

A Boolean variable to control whether or not to show the number of events. By default, it is set to TRUE.

showDropout

A Boolean variable to control whether or not to show the number of dropouts. By default, it is set to FALSE.

showOngoing

A Boolean variable to control whether or not to show the number of ongoing subjects. By default, it is set to FALSE.

showsummary

A Boolean variable to control whether or not to show the prediction summary. By default, it is set to TRUE.

showplot

A Boolean variable to control whether or not to show the prediction plot. By default, it is set to TRUE.

by_treatment

A Boolean variable to control whether or not to predict events by treatment group. By default, it is set to FALSE.

covariates_event

The names of baseline covariates from the input data frame to include in the event model, e.g., c("age", "sex"). Factor variables need to be declared in the input data frame.

event_fit_with_covariates

The pre-fitted event model with covariates used to generate event predictions for ongoing subjects.

covariates_dropout

The names of baseline covariates from the input data frame to include in the dropout model, e.g., c("age", "sex"). Factor variables need to be declared in the input data frame.

dropout_fit_with_covariates

The pre-fitted dropout model with covariates used to generate dropout predictions for ongoing subjects.

fix_parameter

Whether to fix parameters at the maximum likelihood estimates when generating new data for prediction. Defaults to FALSE, in which case, parameters will be drawn from their approximate posterior distribution.

generate_plot

Whether to generate plots.

interactive_plot

Whether to produce interactive plots using plotly or static plots using ggplot2.

Details

Event prediction at the design stage requires the newSubjects data set, because no observed data are available to supply the enrollment times.

To specify the event (dropout) model used during the design-stage event prediction, the event_fit (dropout_fit) should be a list with one element per treatment. For each treatment, the element should include model to specify the event model (exponential, weibull, log-logistic, log-normal, or piecewise exponential), and theta and vtheta to indicate the parameter values and the covariance matrix. For the piecewise exponential event (dropout) model, the list should also include piecewiseSurvivalTime (piecewiseDropoutTime) to indicate the location of knots. It should be noted that the model averaging and spline options are not appropriate for use during the design stage.
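
A design-stage event_fit of the shape described above might be assembled as in the following sketch. The helper make_pwexp_fit, the median survival values, and the cut point at 180 days are hypothetical. The sketch assumes that theta holds log hazards per day (in line with the description of gamma for ppwexp) and uses a very small variance to hold the parameters essentially fixed; both points should be checked against the package before use.

```r
days_per_month <- 30.4375

# Hypothetical helper: one list element per treatment.
# ASSUMPTION: theta is the vector of log hazards (per day) per interval.
make_pwexp_fit <- function(median_months, cuts_days) {
  # Hazard of an exponential distribution with the given median
  hazards <- log(2) / (median_months * days_per_month)
  list(
    model = "piecewise exponential",
    theta = log(hazards),
    vtheta = diag(length(hazards)) * 1e-8,  # near-zero variance
    piecewiseSurvivalTime = cuts_days
  )
}

# One element per treatment (values are made up)
event_fit_sketch <- list(
  make_pwexp_fit(median_months = c(12, 15), cuts_days = c(0, 180)),
  make_pwexp_fit(median_months = c(18, 24), cuts_days = c(0, 180))
)
str(event_fit_sketch, max.level = 2)
```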

After the trial has started, event_fit and dropout_fit are the event and dropout models fitted to the observed data. These fitted models are then used to generate event and dropout times for both ongoing and new subjects.
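
One simulation replicate for ongoing subjects can be sketched as follows under an exponential event model. The parameter values are made up, the normal draw stands in for the approximate posterior mentioned under fix_parameter, and the code illustrates the general idea rather than the package's implementation.

```r
set.seed(42)

# Hypothetical fitted exponential model: theta is the log hazard per day
theta_hat <- log(0.002)
vtheta <- 0.01
fix_parameter <- FALSE

# Days already on study, event-free, at the data cut (made-up values)
time_so_far <- c(30, 120, 250)

# ASSUMPTION: approximate posterior taken as normal(theta_hat, vtheta)
theta_draw <- if (fix_parameter) {
  theta_hat
} else {
  rnorm(1, mean = theta_hat, sd = sqrt(vtheta))
}

# The exponential distribution is memoryless, so the residual time
# beyond time_so_far is again exponential with the same rate
event_time <- time_so_far + rexp(length(time_so_far), rate = exp(theta_draw))
```

Repeating the draw nreps times gives the replicate event times from which the prediction summaries are formed.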

Value

A list of prediction results which includes important information such as the median, lower and upper percentiles for the estimated day and date to reach the target number of events, as well as simulated event data for both ongoing and new subjects. The data for the prediction plot is also included within this list. If target_t is specified, it additionally provides the median, lower, and upper percentiles of the event count at target_t, as well as the predictive probability of achieving the target number of events by target_t.
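
The predictive probability of reaching the target by target_t can be viewed as the proportion of replicates whose event count at target_t meets or exceeds target_d. The sketch below uses made-up replicate counts and is not the package code.

```r
set.seed(7)
target_d <- 200
nreps <- 500

# Hypothetical event counts at target_t, one per simulation replicate
events_at_t <- rbinom(nreps, size = 300, prob = 0.68)

# Proportion of replicates that reach the target number of events
prob_target <- mean(events_at_t >= target_d)
print(prob_target)
```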

Author(s)

Kaifeng Lu, kaifenglu@gmail.com

References

Emilia Bagiella and Daniel F. Heitjan. Predicting analysis times in randomized clinical trials. Stat in Med. 2001; 20:2055-2063.

Gui-shuang Ying and Daniel F. Heitjan. Weibull prediction of event times in clinical trials. Pharm Stat. 2008; 7:107-120.

Examples


# Event prediction after enrollment completion
set.seed(2000)

event_fits <- fitEvent(
  df = interimData2,
  event_model = "piecewise exponential",
  piecewiseSurvivalTime = c(0, 140, 352))

dropout_fits <- fitDropout(
  df = interimData2,
  dropout_model = "exponential")

event_pred <- predictEvent(
  df = interimData2, target_d = 200,
  event_fit = event_fits$fit,
  dropout_fit = dropout_fits$fit,
  pilevel = 0.90, nreps = 100)


Piecewise exponential regression

Description

Obtains the maximum likelihood estimates for piecewise exponential regression.

Usage

pwexpreg(time, event, J, tcut, q = 0, x = 1)

Arguments

time

The survival time.

event

The event indicator.

J

The number of time intervals.

tcut

A vector that specifies the endpoints of time intervals for the baseline piecewise exponential survival distribution. Must start with 0, e.g., c(0, 60) breaks the time axis into 2 event intervals: [0, 60) and [60, Inf). By default, it is set to 0.

q

The number of columns of the covariates matrix (excluding the intercept).

x

The covariates matrix (including the intercept).

Value

The maximum likelihood estimates and the associated covariance matrix, AIC and BIC.


Quantile function for piecewise exponential regression

Description

Obtains the quantile function value for piecewise exponential regression.

Usage

qpwexp(p, theta, J, tcut, q = 0, x = 1, lower.tail = TRUE, log.p = FALSE)

Arguments

p

The vector of probabilities.

theta

The parameter vector consisting of gamma for log piecewise hazards and beta for regression coefficients.

J

The number of time intervals.

tcut

A vector that specifies the endpoints of time intervals for the baseline piecewise exponential survival distribution. Must start with 0, e.g., c(0, 60) breaks the time axis into 2 event intervals: [0, 60) and [60, Inf). By default, it is set to 0.

q

The number of elements in the vector of covariates (excluding the intercept).

x

The vector of covariates (including the intercept).

lower.tail

logical; if TRUE (default), probabilities are the distribution function, otherwise, the survival function.

log.p

logical; if TRUE, probabilities p are given as log(p).

Value

The quantiles t such that P(T <= t | X = x) = p.
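
Without covariates, the quantile can be obtained by locating the interval in which the cumulative hazard reaches -log(1 - p) and solving linearly within it. The function pwexp_quantile_sketch below is hypothetical and only illustrates this inversion in base R.

```r
# Hypothetical helper: quantile function of a piecewise exponential
# distribution with log hazards gamma on intervals starting at tcut.
pwexp_quantile_sketch <- function(p, gamma, tcut) {
  lambda <- exp(gamma)
  J <- length(tcut)
  # Cumulative hazard accumulated by the start of each interval
  H_start <- c(0, cumsum(lambda[-J] * diff(tcut)))
  target <- -log(1 - p)
  # Interval in which the cumulative hazard reaches the target
  j <- findInterval(target, H_start)
  tcut[j] + (target - H_start[j]) / lambda[j]
}

# Two intervals, [0, 60) and [60, Inf), with hazards 0.01 and 0.02 per day
q_two <- pwexp_quantile_sketch(1 - exp(-c(0.3, 1.4)),
                               gamma = log(c(0.01, 0.02)), tcut = c(0, 60))

# A single interval reduces to the exponential quantile function
q_one <- pwexp_quantile_sketch(c(0.25, 0.5), gamma = log(0.01), tcut = 0)
```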


Run Shiny app

Description

Runs the event prediction Shiny app.

Usage

runShinyApp_eventPred()

Author(s)

Kaifeng Lu, kaifenglu@gmail.com


Summarize observed data

Description

Provides an overview of the observed data, including the trial start date, data cutoff date, enrollment duration, number of subjects enrolled, number of events and dropouts, number of subjects at risk, cumulative enrollment and event data, daily enrollment rates, and Kaplan-Meier plots for time to event and time to dropout.

Usage

summarizeObserved(
  df,
  to_predict = "event only",
  showplot = TRUE,
  by_treatment = FALSE,
  generate_plot = TRUE,
  interactive_plot = TRUE
)

Arguments

df

The subject-level data, including trialsdt, usubjid, randdt, and cutoffdt for enrollment prediction, as well as time, event and dropout for event prediction, and treatment coded as 1, 2, and so on, and treatment_description for prediction by treatment group.

to_predict

Specifies what to predict: "enrollment only", "event only", or "enrollment and event". By default, it is set to "event only".

showplot

A Boolean variable to control whether or not to show the observed data plots. By default, it is set to TRUE.

by_treatment

A Boolean variable to control whether or not to summarize observed data by treatment group. By default, it is set to FALSE.

generate_plot

Whether to generate plots.

interactive_plot

Whether to produce interactive plots using plotly or static plots using ggplot2.

Value

A list that includes a range of summary statistics, data sets, and plots depending on the value of to_predict.

Author(s)

Kaifeng Lu, kaifenglu@gmail.com

Examples


observed1 <- summarizeObserved(
  df = interimData1,
  to_predict = "enrollment and event")

observed2 <- summarizeObserved(
  df = interimData2,
  to_predict = "event only")