Type: Package
Title: ROBust INference for Covariate Adjustment in Randomized Clinical Trials
Version: 0.2.0
Date: 2025-09-09
Description: Performs robust estimation and inference when using covariate adjustment and/or covariate-adaptive randomization in randomized controlled trials. This package is trimmed to reduce dependencies and is validated for use across industry. See "FDA's final guidance on covariate adjustment" <https://www.regulations.gov/docket/FDA-2019-D-0934>, Tsiatis (2008) <doi:10.1002/sim.3113>, Bugni et al. (2018) <doi:10.1080/01621459.2017.1375934>, Ye, Shao, Yi, and Zhao (2023) <doi:10.1080/01621459.2022.2049278>, Ye, Shao, and Yi (2022) <doi:10.1093/biomet/asab015>, Rosenblum and van der Laan (2010) <doi:10.2202/1557-4679.1138>, Wang et al. (2021) <doi:10.1080/01621459.2021.1981338>, Ye, Bannick, Yi, and Shao (2023) <doi:10.1080/24754269.2023.2205802>, and Bannick, Shao, Liu, Du, Yi, and Ye (2024) <doi:10.48550/arXiv.2306.10213>.
License: Apache License 2.0
URL: https://github.com/openpharma/RobinCar2/
BugReports: https://github.com/openpharma/RobinCar2/issues
Depends: R (≥ 3.6)
Imports: checkmate, numDeriv, MASS, sandwich, stats, survival, utils
Suggests: knitr, rmarkdown, testthat (≥ 3.0)
VignetteBuilder: knitr
Config/testthat/edition: 3
Encoding: UTF-8
Language: en-US
LazyData: true
RoxygenNote: 7.3.3
NeedsCompilation: no
Packaged: 2025-09-09 02:08:05 UTC; root
Author: Liming Li [aut, cre], Marlena Bannick [aut], Daniel Sabanes Bove [aut], Dong Xi [aut], Ting Ye [aut], Yanyao Yi [aut], Gregory Chen [ctb], Gilead Sciences, Inc. [cph, fnd], F. Hoffmann-La Roche AG [cph, fnd], Merck Sharp & Dohme, Inc. [cph, fnd], AstraZeneca plc [cph, fnd], Eli Lilly and Company [cph, fnd], The University of Washington [cph, fnd]
Maintainer: Liming Li <liming.li1@astrazeneca.com>
Repository: CRAN
Date/Publication: 2025-09-09 07:20:25 UTC

RobinCar2 Package

Description

RobinCar2 implements unbiased prediction and robust variance inference for a fitted model in R.

Author(s)

Maintainer: Liming Li liming.li1@astrazeneca.com (ORCID)

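Authors:

Marlena Bannick
Daniel Sabanes Bove
Dong Xi
Ting Ye
Yanyao Yi

Other contributors:

Gregory Chen [contributor]
Gilead Sciences, Inc. [copyright holder, funder]
F. Hoffmann-La Roche AG [copyright holder, funder]
Merck Sharp & Dohme, Inc. [copyright holder, funder]
AstraZeneca plc [copyright holder, funder]
Eli Lilly and Company [copyright holder, funder]
The University of Washington [copyright holder, funder]

See Also

Useful links:

https://github.com/openpharma/RobinCar2/
Report bugs at https://github.com/openpharma/RobinCar2/issues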


Prediction Bias

Description

Obtain prediction bias within each stratum.

Usage

bias(residual, treatment, group_idx)

Arguments

residual

(numeric) residuals.

treatment

(factor) treatment.

group_idx

(character) stratum index.

Value

Numeric matrix of bias in each stratum.


Block Sum of a matrix

Description

Block Sum of a matrix

Usage

block_sum(x, n)

Confidence Interval

Description

Obtain the confidence interval for the marginal mean or the contrast.

Usage

## S3 method for class 'prediction_cf'
confint(object, parm, level = 0.95, include_se = FALSE, ...)

## S3 method for class 'surv_effect'
confint(object, parm, level = 0.95, transform, ...)

## S3 method for class 'treatment_effect'
confint(object, parm, level = 0.95, transform, ...)

Arguments

object

Object for which to construct the confidence interval.

parm

(character or integer) Names or indices of the parameters for which to construct confidence intervals.

level

(numeric) Confidence level.

include_se

(flag) Whether to include the standard error as a column in the result matrix.

...

Not used.

transform

(function) Transform function.

Value

A matrix of the confidence interval.

Examples

robin_res <- robin_glm(
  y_b ~ treatment * s1,
  data = glm_data, treatment = treatment ~ s1, contrast = "log_risk_ratio"
)
confint(robin_res$marginal_mean, level = 0.7)
confint(robin_res$contrast, parm = 1:3, level = 0.9)
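
As a rough illustration of what such an interval involves, the sketch below builds a Wald-type interval from an estimate and its standard error, and optionally back-transforms the estimate and limits (for example with exp for a log risk ratio). It uses base R only. The helper wald_ci and the numbers are hypothetical, and the package's own computation and column layout may differ.

```r
# Hypothetical helper: Wald-type confidence interval with an optional transform.
wald_ci <- function(estimate, se, level = 0.95, transform = identity) {
  z <- qnorm(1 - (1 - level) / 2)
  res <- cbind(
    Estimate = transform(estimate),
    lower = transform(estimate - z * se),
    upper = transform(estimate + z * se)
  )
  rownames(res) <- names(estimate)
  res
}

# Made-up log risk ratios and standard errors, for illustration only.
log_rr <- c("trt1 vs. pbo" = 0.20, "trt2 vs. pbo" = 0.35)
se <- c(0.10, 0.12)

wald_ci(log_rr, se, level = 0.9)                   # on the log scale
wald_ci(log_rr, se, level = 0.9, transform = exp)  # back-transformed to risk ratios
```

Because exp is monotone, transforming the limits keeps the coverage of the interval on the log scale.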

Derive Outcome Values Based on Log Hazard Ratio

Description

Compute the derived outcome values based on a given log hazard ratio.

Usage

h_derived_outcome_vals(
  theta,
  df,
  treatment,
  time,
  status,
  covariates,
  n = nrow(df)
)

h_strat_derived_outcome_vals(
  theta,
  df,
  treatment,
  time,
  status,
  strata,
  covariates
)

Arguments

theta

(number) The assumed log hazard ratio of the second vs. the first level of the treatment arm variable.

df

(data.frame) The data frame containing the survival data.

treatment

(string) The name of the treatment arm variable in df. It should be a factor with two levels, where the first level is the reference group.

time

(string) The name of the time variable in df, representing the survival time.

status

(string) The name of the status variable in df, with 0 for censored and 1 for event.

covariates

(character) The column names in df to be used for covariate adjustment.

n

(count) The number of observations. Note that this can be higher than the number of rows when used in stratified analysis computations.

strata

(string) The name of the strata variable in df, which must be a factor.

Details

Please note that the covariates must not include index, treatment, time, status to avoid naming conflicts.

Value

A data frame containing the same data as the input df, but restructured with standardized column names index, treatment, time, status, the covariates, and an additional column O_hat containing the derived outcome values. For the stratified version, a list of data frames is returned, one for each stratum.

Functions


Find Data in a Fit

Description

Find Data in a Fit

Usage

find_data(fit, ...)

Arguments

fit

A fit object.

...

Additional arguments.

Value

A data frame used in the fit.


Calculate Coefficient Estimates from Linear Model Input

Description

Calculate the coefficient estimates for each treatment arm from the linear model input data.

Usage

h_get_beta_estimates(lm_input)

h_get_strat_beta_estimates(strat_lm_input)

Arguments

lm_input

(list) A list containing the linear model input data for each treatment arm, as returned by h_get_lm_input().

strat_lm_input

(list) A list of lists, one for each stratum, containing the linear model input data for each treatment arm, as returned by h_get_strat_lm_input().

Value

A list containing the coefficient estimates for each treatment arm.

Functions


Get Linear Model Input Data

Description

Prepare the input data for a linear model based on the provided data frame and model formula.

Usage

h_get_lm_input(df, model)

h_get_strat_lm_input(df_split, model)

Arguments

df

(data.frame) Including the covariates needed for the model, as well as the derived outcome O_hat and the treatment factor.

model

(formula) The right-hand side only model formula.

df_split

(list) A list of data frames, one for each stratum, as returned by h_strat_derived_outcome_vals().

Value

A list containing for each element of the treatment factor a list with the corresponding model matrix X and the response vector y. For the stratified version, a list of such lists is returned, one for each stratum.
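
The structure described above can be sketched in base R as follows: one list element per treatment arm, each holding a model matrix X and a response vector y, from which per-arm least-squares coefficients follow. The data, the object names (toy, toy_lm_input, toy_beta) and the intercept handling are assumptions made for illustration. Whether the package centers covariates or treats the intercept differently is not stated here.

```r
# Hypothetical data with a derived outcome O_hat, a treatment factor and one covariate.
set.seed(1)
toy <- data.frame(
  treatment = factor(rep(c("pbo", "trt"), each = 20)),
  age = rnorm(40, mean = 60, sd = 8)
)
toy$O_hat <- 0.02 * (toy$age - 60) + rnorm(40, sd = 0.5)

model <- ~ age  # right-hand side only model formula

# One list element per treatment arm, each with a model matrix X and a response y.
toy_lm_input <- lapply(split(toy, toy$treatment), function(d) {
  list(X = model.matrix(model, data = d), y = d$O_hat)
})

# Least-squares coefficients per arm, solving the normal equations X'X b = X'y.
toy_beta <- lapply(toy_lm_input, function(arm) {
  drop(solve(crossprod(arm$X), crossprod(arm$X, arm$y)))
})
toy_beta
```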

Functions


Example Trial Data for GLMs with Permuted-Block Randomization

Description

This dataset contains example trial data for GLMs with permuted-block randomization.

Usage

glm_data

Format

A data frame with 600 rows and 7 columns:

id

The ID of the patients.

treatment

The treatment assignment, "pbo", "trt1" and "trt2".

s1

The first stratification variable, "a" and "b".

s2

The second stratification variable, "c" and "d".

covar

The covariate, following a normal distribution.

y

The continuous response.

y_b

The binary response.

Source

The data is generated by the create_glm_data.R script.


Obtain Adjustment for Proportion of Treatment Assignment

Description

Obtain Adjustment for Proportion of Treatment Assignment

Usage

h_adjust_pi(pi)

Arguments

pi

(numeric) vector of proportions.

Value

Numeric matrix.


Confidence interval calculations which are common across effect results.

Description

Confidence interval calculations which are common across effect results.

Usage

h_confint(x, parm, level = 0.95, transform, include_se = FALSE, ...)

Contrast Functions and Jacobians

Description

Contrast Functions and Jacobians

Create Contrast of Pairs

Usage

h_diff(x, y)

h_jac_diff(x, y)

h_risk_ratio(x, y)

h_jac_risk_ratio(x, y)

h_odds_ratio(x, y)

h_jac_odds_ratio(x, y)

h_log_risk_ratio(x, y)

h_jac_log_risk_ratio(x, y)

h_log_odds_ratio(x, y)

h_jac_log_odds_ratio(x, y)

eff_jacob(f)

pairwise(levels, x = levels)

against_ref(levels, ref = levels[1], x = tail(levels, -1))

custom_contrast(levels, x, y)

Arguments

x

(vector) A vector of treatment levels.

y

(vector) A vector of treatment levels.

f

(function) Function with arguments x and y to compute the treatment effect.

levels

(character) Levels of the treatment.

ref

(string or int) Reference level.

Value

Vector of contrasts, or matrix of jacobians.

A list of class contrast with the following elements:

Examples

h_diff(1:3, 4:6)
h_jac_risk_ratio(1:3, 4:6)
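
To make the pairing of a contrast function with its Jacobian concrete, the sketch below defines a log risk ratio and its partial derivatives with respect to x and y, then checks the analytic Jacobian against central finite differences. The toy_* functions are hypothetical stand-ins written in base R. The two-column layout of the Jacobian is an assumption and is not taken from the package.

```r
# Hypothetical stand-ins for a contrast function and its Jacobian.
toy_log_risk_ratio <- function(x, y) log(x / y)
toy_jac_log_risk_ratio <- function(x, y) cbind(1 / x, -1 / y)

# Central finite differences as an independent check of the analytic Jacobian.
toy_numeric_jac <- function(f, x, y, eps = 1e-6) {
  cbind(
    (f(x + eps, y) - f(x - eps, y)) / (2 * eps),
    (f(x, y + eps) - f(x, y - eps)) / (2 * eps)
  )
}

x <- c(0.5, 0.6, 0.7)  # e.g. marginal means in the first arm of each pair
y <- c(0.4, 0.4, 0.5)  # e.g. marginal means in the second arm of each pair

toy_log_risk_ratio(x, y)
all.equal(
  toy_jac_log_risk_ratio(x, y),
  toy_numeric_jac(toy_log_risk_ratio, x, y),
  tolerance = 1e-6
)
```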

Prepare Events Table

Description

This function creates a data frame summarizing the number of patients and events for each treatment arm and stratification factor.

Usage

h_events_table(data, vars)

Arguments

data

(data.frame) The data frame containing the survival data.

vars

(list) A list containing the treatment, time, status, and strata variables.

Value

A data frame with columns for the treatment, strata, number of patients, and number of events.
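
A summary of this kind can be approximated in base R with aggregate(). The sketch below counts patients and events per treatment arm and stratum. The data and the column names (arm, stratum, Patients, Events) are made up, and the package's actual output format may differ.

```r
# Hypothetical survival data: treatment arm, stratum and event indicator.
toy <- data.frame(
  arm = factor(c("A", "A", "A", "B", "B", "B", "B", "A")),
  stratum = factor(c("s1", "s1", "s2", "s1", "s2", "s2", "s1", "s2")),
  status = c(1, 0, 1, 1, 1, 0, 0, 1)
)

# Number of patients (rows) and number of events (status == 1) per arm and stratum.
n_patients <- aggregate(status ~ arm + stratum, data = toy, FUN = length)
n_events <- aggregate(status ~ arm + stratum, data = toy, FUN = sum)

toy_events <- data.frame(
  n_patients[c("arm", "stratum")],
  Patients = n_patients$status,
  Events = n_events$status
)
toy_events
```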


Obtain Adjustment for Covariance Matrix

Description

Obtain Adjustment for Covariance Matrix

Usage

h_get_erb(resi, group_idx, trt, pi, randomization)

Arguments

resi

(numeric) vector of residuals.

group_idx

(list of integer) index for each group.

trt

(factor) treatment assignment.

pi

(numeric) proportion of treatment assignment.

randomization

(string) name of the randomization schema.


Extract Variable Names

Description

Extract Variable Names

Usage

h_get_vars(treatment)

Arguments

treatment

(string or formula) string name of the treatment, or a formula.

Details

Extract the formula elements, including treatment, schema and strata.

Value

A list of three elements, treatment, schema and strata.


Evaluate if Interaction Exists

Description

Evaluate if Interaction Exists

Usage

h_interaction(formula, treatment)

Arguments

formula

(formula) the formula for model fitting.

treatment

(formula) the formula for treatment assignment.


Log Hazard Ratio Coefficient Matrix

Description

This function creates a coefficient matrix for the log hazard ratio estimates.

Usage

h_log_hr_coef_mat(x)

Arguments

x

(list) A list containing the log hazard ratio estimates and their standard errors.

Value

A matrix with columns for the log hazard ratio estimate, standard error, z-value, and p-value.
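
A coefficient matrix of this shape can be sketched in base R from estimates and standard errors alone. The helper toy_coef_mat, the list element names estimate and se, the column labels, and the numbers are assumptions for illustration. The p-value here is the two-sided normal p-value of the z statistic.

```r
# Made-up log hazard ratio estimates and standard errors.
toy_res <- list(
  estimate = c("trt1 vs. pbo" = -0.30, "trt2 vs. pbo" = -0.45),
  se = c(0.15, 0.18)
)

# Hypothetical helper: estimate, standard error, z-value and two-sided p-value.
toy_coef_mat <- function(x) {
  z <- x$estimate / x$se
  cbind(
    Estimate = x$estimate,
    `Std.Err` = x$se,
    `Z Value` = z,
    `Pr(>|z|)` = 2 * pnorm(abs(z), lower.tail = FALSE)
  )
}

toy_coef_mat(toy_res)
```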


Estimate Log Hazard Ratio via Score Function

Description

This function estimates the log hazard ratio by finding the root of the log-rank score function.

Usage

h_log_hr_est_via_score(score_fun, interval = c(-5, 5), ...)

Arguments

score_fun

(function) The log-rank score function to be used for estimation.

interval

(numeric) A numeric vector of length 2 specifying the interval in which to search for the root.

...

Additional arguments passed to score_fun.

Details

This deactivates the ties factor correction in the score function by passing use_ties_factor = FALSE to the score_fun.
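
The root-finding idea can be sketched with stats::uniroot(). The score function below is a plain, unadjusted two-arm log-rank (Cox partial likelihood, Breslow-type) score written for this illustration. It is not the package's score function, and the data and the name toy_score are hypothetical.

```r
# Hypothetical two-arm data: arm 1 is experimental, status 1 marks an event.
toy <- data.frame(
  time = c(2, 3, 4, 5, 6, 7, 8, 9, 10, 12),
  status = c(1, 1, 0, 1, 1, 1, 0, 1, 1, 1),
  arm = c(0, 1, 0, 0, 1, 0, 1, 1, 0, 1)
)

# Score at log hazard ratio theta: observed minus expected events in arm 1,
# summed over the unique event times.
toy_score <- function(theta, df) {
  event_times <- sort(unique(df$time[df$status == 1]))
  contrib <- vapply(event_times, function(t) {
    at_risk <- df$time >= t
    n1 <- sum(at_risk & df$arm == 1)
    n0 <- sum(at_risk & df$arm == 0)
    d <- sum(df$time == t & df$status == 1)
    d1 <- sum(df$time == t & df$status == 1 & df$arm == 1)
    d1 - d * n1 * exp(theta) / (n0 + n1 * exp(theta))
  }, numeric(1))
  sum(contrib)
}

# The score decreases in theta, so its root within the interval is the estimate.
root <- uniroot(toy_score, interval = c(-5, 5), df = toy, tol = 1e-8)
theta_hat <- root$root
theta_hat
```

uniroot() requires the score to change sign over the search interval, which is why the interval has to be wide enough to contain the estimate.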

Value

A list containing:


Log-Rank Test via Score Function

Description

This function performs a log-rank test using the score function.

Usage

h_lr_test_via_score(score_fun, ...)

Arguments

score_fun

(function) The log-rank score function to be used for testing.

...

Additional arguments passed to score_fun.

Details

This activates the ties factor correction in the score function by passing use_ties_factor = TRUE to the score_fun.
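
For orientation, the sketch below computes a two-arm log-rank z statistic at a log hazard ratio of zero, with and without the usual hypergeometric tie correction (n - d) / (n - 1) in the variance. It assumes this is the kind of ties factor meant here, which this page does not confirm. The function toy_logrank and the data are hypothetical and written in base R.

```r
# Hypothetical two-arm data with tied event times; arm 1 is the experimental arm.
toy <- data.frame(
  time = c(2, 2, 3, 4, 4, 4, 5, 6, 7, 7, 8, 9),
  status = c(1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0),
  arm = c(0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1)
)

toy_logrank <- function(df, use_ties_factor = TRUE) {
  event_times <- sort(unique(df$time[df$status == 1]))
  parts <- vapply(event_times, function(t) {
    at_risk <- df$time >= t
    n <- sum(at_risk)
    n1 <- sum(at_risk & df$arm == 1)
    d <- sum(df$time == t & df$status == 1)
    d1 <- sum(df$time == t & df$status == 1 & df$arm == 1)
    # Tie correction; equals 1 whenever there is a single event at time t.
    ties <- if (use_ties_factor && n > 1) (n - d) / (n - 1) else 1
    c(score = d1 - d * n1 / n, var = d * (n1 / n) * (1 - n1 / n) * ties)
  }, c(score = 0, var = 0))
  z <- sum(parts["score", ]) / sqrt(sum(parts["var", ]))
  c(statistic = z, p_value = 2 * pnorm(abs(z), lower.tail = FALSE))
}

toy_logrank(toy, use_ties_factor = TRUE)
toy_logrank(toy, use_ties_factor = FALSE)
```

With tied event times the correction shrinks the variance, so the corrected statistic is at least as large in absolute value as the uncorrected one.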

Value

A list containing:


Count Number of Events per Unique Event Time

Description

This function counts the number of events at each unique event time point in a survival dataset.

Usage

h_n_events_per_time(df, time, status)

Arguments

df

(data.frame) containing the survival data.

time

(string) name of the time variable.

status

(string) name of the status variable, where 1 indicates an event and 0 indicates censoring.

Details

If there are no events in the dataset, it returns an empty data.frame.

Value

A data.frame with two columns: time and n_events, where n_events is the number of events at each time point.
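
The counting step can be sketched in base R as follows. The helper toy_n_events_per_time is a hypothetical re-implementation for illustration only. It follows the argument names and return columns described above, including the empty result when there are no events.

```r
# Hypothetical helper: number of events at each unique event time.
toy_n_events_per_time <- function(df, time, status) {
  event_times <- df[[time]][df[[status]] == 1]
  if (length(event_times) == 0) {
    return(data.frame(time = numeric(0), n_events = integer(0)))
  }
  times <- sort(unique(event_times))
  data.frame(
    time = times,
    n_events = tabulate(match(event_times, times), nbins = length(times))
  )
}

toy <- data.frame(
  t = c(1, 1, 2, 3, 3, 3),
  s = c(1, 1, 0, 1, 1, 0)
)
toy_n_events_per_time(toy, time = "t", status = "s")
```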


Prepare Survival Input

Description

Prepare Survival Input

Usage

h_prep_survival_input(formula, data, treatment)

Arguments

formula

(formula) with a left hand side of the form Surv(time, status) and a right hand side defining optional covariates or just 1 if there are no covariates.

data

(data.frame) containing the variables in the formula.

treatment

(string or formula) string name of the treatment, or a formula.

Details

Note that formula can also contain an externally defined survival::Surv object. In this case, the time and status variables are extracted and added to the data input. Note that it is up to the user to ensure that in this case the column binding is correct, i.e., that the rows of the data match with the rows of the Surv object. In addition, the same named variables must not appear in both the data and the Surv object, to avoid ambiguity (this is a difference vs. the behavior of survival::coxph() for better transparency).

Value

A list containing the following elements:


Log-Rank Test Results Matrix

Description

This function creates a matrix summarizing the results of the log-rank test.

Usage

h_test_mat(x)

Arguments

x

(list) A list containing the log-rank test results.

Value

A matrix with columns for the test statistic and p-value.


Obtain the Jacobian matrix

Description

Obtain the Jacobian matrix

Usage

jac_mat(jac, pair)

Counterfactual Prediction

Description

Obtain counterfactual prediction of a fit.

Usage

predict_counterfactual(fit, treatment, data, vcov, vcov_args, ...)

Arguments

fit

fitted object.

treatment

(formula) formula of form treatment ~ strata(s).

data

(data.frame) raw dataset.

vcov

(function or character) variance function or name.

vcov_args

(list) additional arguments for variance function.

...

Additional arguments for methods.

Value

List of class prediction_cf containing the following elements:


S3 Methods for prediction_cf

Description

S3 Methods for prediction_cf

Usage

## S3 method for class 'prediction_cf'
print(x, level = 0.95, ...)

Arguments

x

(prediction_cf)
the obtained counter-factual prediction object.

level

(number)
the confidence level.

Value

No return value.

Functions


Randomization schema

Description

Randomization schema

Usage

randomization_schema

Format

An object of class data.frame with 3 rows and 2 columns.


Covariate adjusted glm model

Description

Covariate adjusted glm model

Usage

robin_glm(
  formula,
  data,
  treatment,
  contrast = c("difference", "risk_ratio", "odds_ratio", "log_risk_ratio",
    "log_odds_ratio"),
  contrast_jac = NULL,
  vcov = "vcovG",
  family = gaussian(),
  vcov_args = list(),
  pair,
  ...
)

Arguments

formula

(formula) A formula of analysis.

data

(data.frame) Input data frame.

treatment

(formula or character(1)) A formula of treatment assignment or assignment by stratification, or a string name of treatment assignment.

contrast

(function or character(1)) A function to calculate the treatment effect, or one of "difference", "risk_ratio", "odds_ratio", "log_risk_ratio", "log_odds_ratio" for the default contrasts.

contrast_jac

(function) A function to calculate the Jacobian of the contrast function. Ignored if using default contrasts.

vcov

(function or character) A function, or the name of a function, to calculate the variance-covariance matrix of the treatment effect, such as vcovHC or vcovG. The default is "vcovG".

family

(family) A family object of the glm model.

vcov_args

(list) Additional arguments passed to vcov.

pair

Pairwise treatment comparison.

...

Additional arguments passed to glm or glm.nb.

Details

If family is MASS::negative.binomial(NA), the function will use MASS::glm.nb instead of glm.

Value

A robin_output object, with marginal_mean and contrast components.

Examples

robin_glm(
  y ~ treatment * s1,
  data = glm_data,
  treatment = treatment ~ s1, contrast = "difference"
)

Covariate adjusted lm model

Description

Covariate adjusted lm model

Usage

robin_lm(
  formula,
  data,
  treatment,
  vcov = "vcovG",
  vcov_args = list(),
  pair,
  ...
)

Arguments

formula

(formula) A formula of analysis.

data

(data.frame) Input data frame.

treatment

(formula or character(1)) A formula of treatment assignment or assignment by stratification, or a string name of treatment assignment.

vcov

(function or character) A function, or the name of a function, to calculate the variance-covariance matrix of the treatment effect, such as vcovHC or vcovG. The default is "vcovG".

vcov_args

(list) Additional arguments passed to vcov.

pair

Pairwise treatment comparison.

...

Additional arguments passed to lm.

Value

A robin_output object, with marginal_mean and contrast components.

Examples

robin_lm(
  y ~ treatment * s1,
  data = glm_data,
  treatment = treatment ~ s1
)

Covariate Adjusted and Stratified Survival Analysis

Description

Calculate log-rank test as well as hazard ratio estimates for survival data, optionally adjusted for covariates and a stratification factor.

Usage

robin_surv(
  formula,
  data,
  treatment,
  comparisons,
  contrast = "hazardratio",
  test = "logrank",
  ...
)

Arguments

formula

(formula) A formula of analysis, of the form Surv(time, status) ~ covariates. (If no covariates should be adjusted for, use 1 instead on the right hand side.)

data

(data.frame) Input data frame.

treatment

(formula) A formula of treatment assignment or assignment by stratification.

comparisons

(list) An optional list of comparisons between treatment levels to be performed, see details. By default, all pairwise comparisons are performed automatically.

contrast

(character(1)) The contrast statistic to be used, currently only "hazardratio" is supported.

test

(character(1)) The test to be used, currently only "logrank" is supported.

...

Additional arguments passed to the survival analysis functions, in particular hr_se_plugin_adjusted (see h_lr_score_cov() and h_lr_score_strat_cov() for details).

Details

The user can optionally specify a list of comparisons between treatment levels to be performed. The list must have two elements:

So, for example, to compare level 3 with level 1, and also level 3 with level 2 (but not level 2 with level 1), specify comparisons = list(c(3, 3), c(1, 2)). The two elements are paired position by position, giving the comparisons 3 vs. 1 and 3 vs. 2.
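
The position-by-position pairing in that example can be made explicit with a few lines of base R. The treatment level names and the object toy_pairs are hypothetical. The sketch only shows how the two list elements line up and does not call the package.

```r
treatment_levels <- c("pbo", "trt1", "trt2")  # hypothetical factor levels
comparisons <- list(c(3, 3), c(1, 2))

# Pair the two list elements position by position to enumerate the comparisons.
toy_pairs <- data.frame(
  first = treatment_levels[comparisons[[1]]],
  second = treatment_levels[comparisons[[2]]],
  stringsAsFactors = FALSE
)
toy_pairs
```

Each row of toy_pairs is one comparison: trt2 against pbo, and trt2 against trt1.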

Value

A surv_effect object containing the results of the survival analysis.

See Also

surv_effect_methods for S3 methods.

Examples

# Adjusted for covariates meal.cal and age and adjusted for stratification by strata:
robin_surv(
  formula = Surv(time, status) ~ meal.cal + age,
  data = surv_data,
  treatment = sex ~ strata
)

# Adjusted for stratification by strata but not for covariates:
robin_surv(
  formula = Surv(time, status) ~ 1,
  data = surv_data,
  treatment = sex ~ strata
)

# Unadjusted for covariates and stratification:
robin_surv(
  formula = Surv(time, status) ~ 1,
  data = surv_data,
  treatment = sex ~ 1
)

Log Hazard Ratio Estimation and Log-Rank Test via Score Function

Description

This function combines the estimation of the log hazard ratio and the log-rank test using a score function. Only two treatment arms are being compared and the data is subset accordingly.

Usage

robin_surv_comparison(
  score_fun,
  vars,
  data,
  exp_level,
  control_level,
  unadj_score_fun = NULL,
  ...
)

Arguments

score_fun

(function) The log-rank score function to be used for both estimation and testing.

vars

(list) A list containing levels, treatment, and covariates.

data

(data.frame) The data frame containing the survival data.

exp_level

(count) Level of the experimental treatment arm.

control_level

(count) Level of the control treatment arm.

unadj_score_fun

(function or NULL) Optional unadjusted score function, see details.

...

Additional arguments passed to score_fun.

Details

If an unadjusted score function is provided in unadj_score_fun, then it is used to estimate the log hazard ratio first. This unadjusted log hazard ratio estimate is then passed on to the adjusted score function score_fun as theta_hat. This is required when the score function is adjusted for covariates.

Value

A list containing:


Sum vectors in a list

Description

Sum vectors in a list

Usage

sum_vectors_in_list(lst)

Survival Example Data

Description

This dataset contains survival data from the survival package's survival::lung dataset, modified to include factors for sex and strata, as well as a binary status variable which is 1 for death and 0 for censored.

Usage

surv_data

Format

An object of class data.frame with 228 rows and 12 columns.

Source

The data is generated by the create_surv_data.R script.


S3 Methods for surv_effect

Description

S3 Methods for surv_effect

Usage

## S3 method for class 'surv_effect'
print(x, ...)

table(x, ...)

## Default S3 method:
table(x, ...)

## S3 method for class 'surv_effect'
table(x, ...)

Arguments

x

(surv_effect) the obtained result from robin_surv().

...

ignored additional arguments (for compatibility).

Functions

Examples

x <- robin_surv(
  formula = Surv(time, status) ~ meal.cal + age,
  data = surv_data,
  treatment = sex ~ strata
)
print(x)
table(x)

Survival Comparison Functions

Description

These are simple wrappers around robin_surv_comparison() called with the corresponding log-rank score functions.

Usage

robin_surv_no_strata_no_cov(vars, data, exp_level, control_level)

robin_surv_strata(vars, data, exp_level, control_level)

robin_surv_cov(vars, data, exp_level, control_level, ...)

robin_surv_strata_cov(vars, data, exp_level, control_level, ...)

Arguments

vars

(list) A list containing levels, treatment, and covariates.

data

(data.frame) The data frame containing the survival data.

exp_level

(count) Level of the experimental treatment arm.

control_level

(count) Level of the control treatment arm.

...

Additional arguments passed to score_fun.

Value

See robin_surv_comparison().

Functions


Log-Rank Score Functions for Survival Analysis

Description

These functions compute the log-rank score statistics for a survival analysis. Depending on the function, these are stratified and/or adjusted for covariates.

Usage

h_lr_score_no_strata_no_cov(
  theta,
  df,
  treatment,
  time,
  status,
  n = nrow(df),
  use_ties_factor = TRUE
)

h_lr_score_strat(
  theta,
  df,
  treatment,
  time,
  status,
  strata,
  use_ties_factor = TRUE
)

h_lr_score_cov(
  theta,
  df,
  treatment,
  time,
  status,
  model,
  theta_hat = theta,
  use_ties_factor = TRUE,
  hr_se_plugin_adjusted = TRUE
)

h_lr_score_strat_cov(
  theta,
  df,
  treatment,
  time,
  status,
  strata,
  model,
  theta_hat = theta,
  use_ties_factor = TRUE,
  hr_se_plugin_adjusted = TRUE
)

Arguments

theta

(number) The assumed log hazard ratio of the second vs. the first level of the treatment arm variable.

df

(data.frame) The data frame containing the survival data.

treatment

(string) The name of the treatment arm variable in df. It should be a factor with two levels, where the first level is the reference group.

time

(string) The name of the time variable in df, representing the survival time.

status

(string) The name of the status variable in df, with 0 for censored and 1 for event.

n

(count) The number of observations. Note that this can be higher than the number of rows when used in stratified analysis computations.

use_ties_factor

(flag) Whether to use the ties factor in the variance calculation. This is used when calculating the score test statistic, but not when estimating the log hazard ratio.

strata

(string) The name of the strata variable in df, which must be a factor.

model

(formula) The model formula for covariate adjustment, e.g., ~ cov1 + cov2.

theta_hat

(number) The estimated log hazard ratio when not adjusting for covariates.

hr_se_plugin_adjusted

(flag) Defines the method for calculating the standard error of the log hazard ratio estimate when adjusting for covariates, see details.

Details

Value

The score function value(s), with the following attributes:

Functions


Treatment Effect

Description

Obtain the treatment effect and its variance from a counter-factual prediction.

Usage

treatment_effect(
  object,
  pair = pairwise(names(object$estimate)),
  eff_measure,
  eff_jacobian = eff_jacob(eff_measure),
  contrast_name,
  ...
)

difference(object, ...)

risk_ratio(object, ...)

odds_ratio(object, ...)

log_risk_ratio(object, ...)

log_odds_ratio(object, ...)

Arguments

object

Object from which to obtain treatment effect.

pair

(contrast) Contrast choices.

eff_measure

(function) Treatment effect measurement function.

eff_jacobian

(function) Treatment effect jacobian function.

contrast_name

(string) Name of the contrast.

...

Additional arguments for variance.
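
The roles of eff_measure and eff_jacobian can be illustrated with the delta method: the covariance of the contrasts is J V J', where V is the covariance matrix of the marginal means and J is the Jacobian of the contrasts with respect to those means. The sketch below uses base R with made-up numbers. The functions eff and eff_jac and the pair encoding are hypothetical and do not reflect the package's internal objects.

```r
# Made-up marginal means and their covariance matrix, for illustration only.
est <- c(pbo = 0.35, trt1 = 0.50, trt2 = 0.55)
V <- matrix(
  c(0.0020, 0.0002, 0.0001,
    0.0002, 0.0025, 0.0003,
    0.0001, 0.0003, 0.0024),
  nrow = 3, dimnames = list(names(est), names(est))
)

# Hypothetical effect measure and Jacobian: difference of two means.
eff <- function(x, y) x - y
eff_jac <- function(x, y) cbind(rep(1, length(x)), rep(-1, length(y)))

# Compare each active arm (levels 2 and 3) against the first level.
first <- c(2, 3)
second <- c(1, 1)
contrast <- eff(est[first], est[second])
partial <- eff_jac(est[first], est[second])

# Full Jacobian: one row per contrast, one column per marginal mean.
J <- matrix(0, nrow = length(first), ncol = length(est))
J[cbind(seq_along(first), first)] <- partial[, 1]
J[cbind(seq_along(first), second)] <- partial[, 2]

# Delta-method covariance of the contrasts.
contrast_var <- J %*% V %*% t(J)
contrast
contrast_var
```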

Value

A list of class treatment_effect with the following elements:


Update levels in a contrast pair

Description

Update levels in a contrast pair

Usage

update_levels(pair, levels)

Generalized Covariance (ANHECOVA)

Description

Generalized Covariance (ANHECOVA)

Usage

vcovG(x, decompose = TRUE, ...)

Arguments

x

(prediction_cf) Counter-factual prediction.

decompose

(flag) whether to use the decomposition method to calculate the variance.

...

Not used.

Value

Named covariance matrix.


Heteroskedasticity-consistent covariance matrix for predictions

Description

The heteroskedasticity-consistent covariance matrix for predictions is obtained with sandwich::vcovHC using the sandwich method.

Usage

vcovHC(x, type = "HC3", ...)

Arguments

x

(prediction_cf) Counter-factual prediction.

type

(character) Type of HC covariance matrix.

...

Additional arguments for sandwich::vcovHC.

Value

Matrix of the heteroskedasticity-consistent covariance for the predictions.