Help for package random.cdisc.data

Type:

Package

Title:

Create Random ADaM Datasets

Version:

0.3.16

Date:

2024-09-28

Description:

A set of functions to create random Analysis Data Model (ADaM) datasets and cached datasets. ADaM dataset specifications are described by the Clinical Data Interchange Standards Consortium (CDISC) Analysis Data Model Team.

License:

Apache License 2.0

URL:

https://insightsengineering.github.io/random.cdisc.data/, https://github.com/insightsengineering/random.cdisc.data/

BugReports:

https://github.com/insightsengineering/random.cdisc.data/issues

Depends:

R (≥ 3.6)

Imports:

checkmate (≥ 2.1.0), dplyr (≥ 1.1.2), lifecycle (≥ 1.0.3), lubridate (≥ 1.7.10), magrittr (≥ 1.5), rlang (≥ 1.1.0), stringr (≥ 1.4.1), tibble (≥ 3.2.1), tidyr (≥ 1.1.4), yaml (≥ 2.1.19)

Suggests:

diffdf, knitr (≥ 1.42), rmarkdown (≥ 2.23), testthat (≥ 3.0.4), withr (≥ 2.0.0)

VignetteBuilder:

knitr, rmarkdown

RdMacros:

lifecycle

Config/Needs/verdepcheck:

mllg/checkmate, tidyverse/dplyr, r-lib/lifecycle, tidyverse/lubridate, tidyverse/magrittr, r-lib/rlang, tidyverse/stringr, tidyverse/tibble, tidyverse/tidyr, yaml=vubiostat/r-yaml, gowerc/diffdf, yihui/knitr, rstudio/rmarkdown, r-lib/testthat, r-lib/withr

Config/Needs/website:

insightsengineering/nesttemplate

Config/testthat/edition:

Encoding:

UTF-8

Language:

en-US

LazyData:

true

RoxygenNote:

7.3.2

NeedsCompilation:

Packaged:

2024-10-10 09:14:50 UTC; rstudio

Author:

Pawel Rucki [aut], Nick Paszty [aut], Jana Stoilova [aut], Joe Zhu [aut, cre], Davide Garolini [aut], Emily de la Rua [aut], Christopher DiPietrantonio [aut], Adrian Waddell [aut], F. Hoffmann-La Roche AG [cph, fnd]

Maintainer:

Joe Zhu <joe.zhu@roche.com>

Repository:

CRAN

Date/Publication:

2024-10-10 09:40:02 UTC

`random.cdisc.data` Package

Description

Package to create random SDTM and ADaM datasets.

Author(s)

Maintainer: Joe Zhu joe.zhu@roche.com

Authors:

Pawel Rucki pawel.rucki@roche.com
Nick Paszty npaszty@gene.com
Jana Stoilova jana.stoilova@roche.com
Davide Garolini davide.garolini@roche.com
Emily de la Rua emily.de_la_rua@contractors.roche.com
Christopher DiPietrantonio
Adrian Waddell adrian.waddell@gene.com

Other contributors:

F. Hoffmann-La Roche AG [copyright holder, funder]

Apply Metadata

Description

Apply label and variable ordering attributes to domains.

Usage

apply_metadata(
  df,
  filename,
  add_adsl = TRUE,
  adsl_filename = "metadata/ADSL.yml"
)

Arguments

df

(data.frame)
Data frame to which metadata is applied.

filename

(yaml)
File containing domain metadata.

add_adsl

(logical)
Should ADSL data be merged to domain.

adsl_filename

(yaml)
File containing ADSL metadata.

Value

Data frame with metadata applied.

Examples

seed <- 1
adsl <- radsl(seed = seed)
adsub <- radsub(adsl, seed = seed)
yaml_path <- file.path(path.package("random.cdisc.data"), "inst", "metadata")
adsl <- apply_metadata(adsl, file.path(yaml_path, "ADSL.yml"), FALSE)
adsub <- apply_metadata(
  adsub, file.path(yaml_path, "ADSUB.yml"), TRUE,
  file.path(yaml_path, "ADSL.yml")
)
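
The idea of attaching labels and a variable order to a data frame can be illustrated in base R. The following is only a sketch under stated assumptions: the meta list, its contents, and the helper apply_labels_sketch() are hypothetical and do not reflect the structure of the package's YAML files or the internals of apply_metadata().

```r
# Hypothetical metadata: variable names in the desired order, each with a label.
meta <- list(
  USUBJID = "Unique Subject Identifier",
  AGE = "Age",
  SEX = "Sex"
)

df <- data.frame(
  SEX = c("F", "M"),
  AGE = c(34, 51),
  USUBJID = c("S-001", "S-002"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of random.cdisc.data).
apply_labels_sketch <- function(df, meta) {
  # Put the described variables first, keep any remaining columns after them.
  ordered <- c(intersect(names(meta), names(df)), setdiff(names(df), names(meta)))
  df <- df[, ordered, drop = FALSE]
  # Attach each label as a "label" attribute on its column.
  for (v in intersect(names(meta), names(df))) {
    attr(df[[v]], "label") <- meta[[v]]
  }
  df
}

labelled <- apply_labels_sketch(df, meta)
names(labelled)
attr(labelled$AGE, "label")
```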

Standard Arguments

Description

This documentation entry lists the arguments in random.cdisc.data that are used repeatedly in dataset creation.

Arguments

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of ds. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

lookup

(data.frame)
Additional parameters.

lookup_aag

(data.frame)
Additional metadata parameters.

param

(character vector)
Parameter values.

paramcd

(character vector)
Parameter code values.

paramu

(character vector)
Parameter unit values.

visit_format

(character)
Type of visit. Options are "WEEK" and "CYCLE".

n_assessments

(integer)
Number of weeks or cycles.

n_days

(integer)
Number of days in each cycle (only used if visit_format is "CYCLE").

aval_mean

(numeric vector)
Mean values corresponding to each parameter.
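
As a hedged illustration of how these shared arguments fit together, the call below passes seed, na_percentage and na_vars to radae(), whose signature is documented later in this manual. The column names come from that function's default na_vars; the seeds and proportions are arbitrary values chosen for illustration, and setting a positive na_percentage is an assumption about when missing values are injected.

```r
library(random.cdisc.data)

adsl <- radsl(N = 10, study_duration = 2, seed = 1)

# Each na_vars element is c(seed, percentage); its name is a column of the output.
na_spec <- list(
  AEDECOD = c(1234, 0.2), # dedicated seed, 20% of values requested as NA
  AETOXGR = c(NA, 0.1)    # no dedicated seed, 10% of values requested as NA
)

adae <- radae(adsl, seed = 2, na_percentage = 0.05, na_vars = na_spec)

# Observed share of missing values in the two targeted columns.
colMeans(is.na(adae[, names(na_spec)]))
```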

Cached ADAB

Description

Cached ADAB data generated with seed = 1

Usage

data(cadab)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 6916 rows and 21 columns.

Cached ADAE

Description

Cached ADAE data generated with seed = 1

Usage

data(cadae)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 1934 rows and 92 columns.

Cached ADAETTE

Description

Cached ADAETTE data generated with seed = 1

Usage

data(cadaette)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 3600 rows and 66 columns.

Cached ADCM

Description

Cached ADCM data generated with seed = 1

Usage

data(cadcm)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 3685 rows and 83 columns.

Cached ADDV

Description

Cached ADDV data generated with seed = 1

Usage

data(caddv)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 119 rows and 66 columns.

Cached ADEG

Description

Cached ADEG data generated with seed = 1

Usage

data(cadeg)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 13600 rows and 88 columns.

Cached ADEX

Description

Cached ADEX data generated with seed = 1

Usage

data(cadex)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 6400 rows and 79 columns.

Cached ADHY

Description

Cached ADHY data generated with seed = 1

Usage

data(cadhy)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 20000 rows and 71 columns.

Cached ADLB

Description

Cached ADLB data generated with seed = 1

Usage

data(cadlb)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 8400 rows and 102 columns.

Cached ADMH

Description

Cached ADMH data generated with seed = 1

Usage

data(cadmh)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 1934 rows and 67 columns.

Cached ADPC

Description

Cached ADPC data generated with seed = 1

Usage

data(cadpc)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 6640 rows and 72 columns.

Cached ADPP

Description

Cached ADPP data generated with seed = 1

Usage

data(cadpp)

Format

An object of class data.frame with 26268 rows and 68 columns.

Cached ADQLQC

Description

Cached ADQLQC data generated with seed = 1

Usage

data(cadqlqc)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 116803 rows and 50 columns.

Cached ADQS

Description

Cached ADQS data generated with seed = 1

Usage

data(cadqs)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 14000 rows and 73 columns.

Cached ADRS

Description

Cached ADRS data generated with seed = 1

Usage

data(cadrs)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 3200 rows and 65 columns.

Cached ADSL

Description

Cached ADSL data generated with seed = 1

Usage

data(cadsl)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 400 rows and 55 columns.

Cached ADSUB

Description

Cached ADSUB data generated with seed = 1

Usage

data(cadsub)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 2000 rows and 65 columns.

Cached ADTR

Description

Cached ADTR data generated with seed = 1

Usage

data(cadtr)

Format

An object of class data.frame with 2800 rows and 76 columns.

Cached ADTTE

Description

Cached ADTTE data generated with seed = 1

Usage

data(cadtte)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 2000 rows and 67 columns.

Cached ADVS

Description

Cached ADVS data generated with seed = 1

Usage

data(cadvs)

Format

An object of class tbl_df (inherits from tbl, data.frame) with 16800 rows and 87 columns.
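
The cached datasets listed above can be loaded with data(). The short example below is a sketch that assumes the package is installed; it loads the cached ADSL and inspects it.

```r
library(random.cdisc.data)

# Load the cached subject-level dataset by name.
data("cadsl", package = "random.cdisc.data")

dim(cadsl)   # documented above as 400 rows and 55 columns
class(cadsl)
```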

Helper Functions for Constructing ADQLQC

Description

Internal functions used by radqlqc.

Usage

get_qs_data(
  adsl,
  visit_format = "CYCLE",
  n_assessments = 5L,
  n_days = 1L,
  lookup = NULL,
  seed = NULL,
  na_percentage = 0,
  na_vars = list(QSORRES = c(1234, 0.2), QSSTRESC = c(1234, 0.2))
)

get_random_dates_between(from, to, visit_id)

prep_adqlqc(df)

calc_scales(adqlqc1)

derv_chgcat1(dataset)

comp_derv(dataset, percent, number)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

visit_format

(character)
Type of visit. Options are "WEEK" and "CYCLE".

n_assessments

(integer)
Number of weeks or cycles.

n_days

(integer)
Number of days in each cycle (only used if visit_format is "CYCLE").

lookup

(data.frame)
Additional parameters.

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of ds. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

from

(datetime vector)
Start date/times.

to

(datetime vector)
End date/times.

visit_id

(vector)
Visit identifiers.

df

(data.frame)
SDTM QS dataset.

adqlqc1

(data.frame)
Prepared data generated from the prep_adqlqc() function.

dataset

(data.frame)
Dataset.

percent

(numeric)
Completion - Completed at least y percent of questions, 1 record per visit

number

(numeric)
Completion - Completed at least x question(s), 1 record per visit

Value

Data frame with SDTM questionnaire data.

Data frame with new randomly generated dates variable.

data.frame

Functions

get_qs_data(): Questionnaires EORTC QLQ-C30 V3.0 SDTM (QS). Function for generating a random Questionnaires SDTM domain
get_random_dates_between(): Function for generating random dates between 2 dates
prep_adqlqc(): Prepare ADaM ADQLQC data, adding PARAMCD to SDTM QS data
calc_scales(): Scale calculation for ADQLQC data
derv_chgcat1(): Calculate Change from Baseline Category 1
comp_derv(): Completion/Compliance Data Calculation

Examples

adsl <- radsl(N = 10, study_duration = 2, seed = 1)
adqlqc <- radqlqc(adsl, seed = 1, percent = 80, number = 2)

Generate Anthropometric Measurements for Males and Females.

Description

Anthropometric measurements are randomly generated using a normal approximation. The default mean and standard deviation values are based on US National Health Statistics for adults aged 20 years or over. The measurements are generated in the same units as those provided to the function.

Usage

h_anthropometrics_by_sex(
  df,
  seed = 1,
  id_var = "USUBJID",
  sex_var = "SEX",
  sex_var_level_male = "M",
  male_weight_in_kg = list(mean = 90.6, sd = 44.9),
  female_weight_in_kg = list(mean = 77.5, sd = 46.2),
  male_height_in_m = list(mean = 1.75, sd = 0.14),
  female_height_in_m = list(mean = 1.61, sd = 0.24)
)

Arguments

df

(data.frame)
Analysis dataset.

seed

(numeric)
Seed to use for reproducible random number generation.

id_var

(character)
Patient identifier variable name.

sex_var

(character)
Name of variable representing sex of patient.

sex_var_level_male

(character)
Level of sex_var representing males.

male_weight_in_kg

(named list)
List of means and SDs of male weights in kilograms.

female_weight_in_kg

(named list)
List of means and SDs of female weights in kilograms.

male_height_in_m

(named list)
List of means and SDs of male heights in metres.

female_height_in_m

(named list)
List of means and SDs of female heights in metres.

Details

One record per subject.

Value

Data frame with anthropometric measurements for each subject in the analysis dataset.
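
The sex-specific normal draws described above can be sketched in base R. This is an illustration only, not the implementation of h_anthropometrics_by_sex(): the input data frame and the output column names WEIGHT and HEIGHT are hypothetical, while the means and SDs are the defaults shown under Usage.

```r
set.seed(1)

# Hypothetical input: one row per subject with a sex variable.
df <- data.frame(
  USUBJID = sprintf("S-%03d", 1:6),
  SEX = c("M", "F", "F", "M", "F", "M"),
  stringsAsFactors = FALSE
)

# Default means and SDs as listed in the Usage section.
male_weight <- list(mean = 90.6, sd = 44.9)
female_weight <- list(mean = 77.5, sd = 46.2)
male_height <- list(mean = 1.75, sd = 0.14)
female_height <- list(mean = 1.61, sd = 0.24)

is_male <- df$SEX == "M"
n <- nrow(df)

# Draw each measurement from the normal distribution matching the subject's sex.
# No truncation is applied, so implausible values are possible in this sketch.
df$WEIGHT <- ifelse(
  is_male,
  rnorm(n, male_weight$mean, male_weight$sd),
  rnorm(n, female_weight$mean, female_weight$sd)
)
df$HEIGHT <- ifelse(
  is_male,
  rnorm(n, male_height$mean, male_height$sd),
  rnorm(n, female_height$mean, female_height$sd)
)

df
```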

Replace Values with NA

Description

Replace column values with NAs.

Usage

mutate_na(ds, na_vars = NULL, na_percentage = 0.05)

Arguments

ds

(data.frame)
Any data set.

na_vars

(list)
A named list where the name of each element is a column name of ds. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

Value

Data frame with values in the selected columns replaced by NA.
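
To make the roles of the per-column seed and percentage concrete, here is a minimal base R sketch of the general technique. The helper replace_with_na_sketch() is hypothetical and is not the package's mutate_na(); in particular, rounding the number of replaced values is an assumption of this sketch.

```r
# Illustrative helper (hypothetical; not the package's mutate_na()).
replace_with_na_sketch <- function(ds, na_vars, na_percentage = 0.05) {
  for (col in names(na_vars)) {
    spec <- na_vars[[col]]
    seed <- spec[1]
    # Fall back to the default proportion when none is given for this column.
    pct <- if (is.na(spec[2])) na_percentage else spec[2]
    # Only fix the RNG state when a seed is supplied for this column.
    if (!is.na(seed)) set.seed(seed)
    n_na <- round(nrow(ds) * pct)
    idx <- sample(seq_len(nrow(ds)), size = n_na)
    ds[[col]][idx] <- NA
  }
  ds
}

ds <- data.frame(
  AVAL = as.numeric(1:20),
  AVALU = rep("mg", 20),
  stringsAsFactors = FALSE
)

out <- replace_with_na_sketch(
  ds,
  na_vars = list(AVAL = c(1234, 0.1), AVALU = c(NA, NA)),
  na_percentage = 0.25
)

# Number of NA values per column: 2 for AVAL (10% of 20), 5 for AVALU (25% of 20).
colSums(is.na(out))
```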

Anti-Drug Antibody Analysis Dataset (ADAB)

Description

Function for generating a random Anti-Drug Antibody Analysis Dataset for a given Subject-Level Analysis Dataset and Pharmacokinetics Analysis Dataset.

Usage

radab(
  adsl,
  adpc,
  constants = c(D = 100, ka = 0.8, ke = 1),
  paramcd = c("R1800000", "RESULT1", "R1800001", "RESULT2", "ADASTAT1", "INDUCD1",
    "ENHANC1", "TRUNAFF1", "EMERNEG1", "EMERPOS1", "PERSADA1", "TRANADA1", "BFLAG1",
    "TIMADA1", "ADADUR1", "ADASTAT2", "INDUCD2", "ENHANC2", "EMERNEG2", "EMERPOS2",
    "BFLAG2", "TRUNAFF2"),
  param = c("Antibody titer units", "ADA interpreted per sample result",
    "Neutralizing Antibody titer units", "NAB interpreted per sample result",
    "ADA Status of a patient", "Treatment induced ADA", "Treatment enhanced ADA",
    "Treatment unaffected", "Treatment Emergent - Negative",
    "Treatment Emergent - Positive", "Persistent ADA", "Transient ADA", "Baseline",
    "Time to onset of ADA", "ADA Duration", "NAB Status of a patient",
    "Treatment induced ADA, Neutralizing Antibody",
    "Treatment enhanced ADA, Neutralizing Antibody",
    "Treatment Emergent - Negative, Neutralizing Antibody",
    "Treatment Emergent - Positive, Neutralizing Antibody",
    "Baseline, Neutralizing Antibody", "Treatment unaffected, Neutralizing Antibody"),
  avalu = c("titer", "", "titer", "", "", "", "", "", "", "", "", "", "", "weeks",
    "weeks", "", "", "", "", "", "", ""),
  seed = NULL,
  na_percentage = 0,
  na_vars = list(AVAL = c(NA, 0.1)),
  cached = FALSE
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

adpc

(data.frame)
Pharmacokinetics Analysis Dataset.

constants

(named numeric vector)
Constant parameters to be used in formulas for creating analysis values.

paramcd

(character vector)
Parameter code values.

param

(character vector)
Parameter values.

avalu

(character)
Analysis value units.

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of ds. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

cached

boolean whether the cached ADAB data cadab should be returned or new data should be generated. If set to TRUE then the other arguments to radab will be ignored.

Details

One record per study per subject per parameter per time point: "R1800000", "RESULT1", "R1800001", "RESULT2".

Value

data.frame

Examples

adsl <- radsl(N = 10, seed = 1, study_duration = 2)
adpc <- radpc(adsl, seed = 2, duration = 9 * 7)

adab <- radab(adsl, adpc, seed = 2)
adab

Adverse Event Analysis Dataset (ADAE)

Description

Function for generating a random Adverse Event Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

radae(
  adsl,
  max_n_aes = 10L,
  lookup = NULL,
  lookup_aag = NULL,
  seed = NULL,
  na_percentage = 0,
  na_vars = list(AEBODSYS = c(NA, 0.1), AEDECOD = c(1234, 0.1), AETOXGR = c(1234, 0.1)),
  cached = FALSE
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

max_n_aes

(integer)
Maximum number of AEs per patient. Defaults to 10.

lookup

(data.frame)
Additional parameters.

lookup_aag

(data.frame)
Additional metadata parameters.

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of ds. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

cached

boolean whether the cached ADAE data cadae should be returned or new data should be generated. If set to TRUE then the other arguments to radae will be ignored.

Details

One record per each record in the corresponding SDTM domain.

Keys: STUDYID, USUBJID, ASTDTM, AETERM, AESEQ

Value

data.frame

Examples

adsl <- radsl(N = 10, study_duration = 2, seed = 1)

adae <- radae(adsl, seed = 2)
adae

# Add metadata.
aag <- utils::read.table(
  sep = ",", header = TRUE,
  text = paste(
    "NAMVAR,SRCVAR,GRPTYPE,REFNAME,REFTERM,SCOPE",
    "CQ01NAM,AEDECOD,CUSTOM,D.2.1.5.3/A.1.1.1.1 AESI,dcd D.2.1.5.3,",
    "CQ01NAM,AEDECOD,CUSTOM,D.2.1.5.3/A.1.1.1.1 AESI,dcd A.1.1.1.1,",
    "SMQ01NAM,AEDECOD,SMQ,C.1.1.1.3/B.2.2.3.1 AESI,dcd C.1.1.1.3,BROAD",
    "SMQ01NAM,AEDECOD,SMQ,C.1.1.1.3/B.2.2.3.1 AESI,dcd B.2.2.3.1,BROAD",
    "SMQ02NAM,AEDECOD,SMQ,Y.9.9.9.9/Z.9.9.9.9 AESI,dcd Y.9.9.9.9,NARROW",
    "SMQ02NAM,AEDECOD,SMQ,Y.9.9.9.9/Z.9.9.9.9 AESI,dcd Z.9.9.9.9,NARROW",
    sep = "\n"
  ), stringsAsFactors = FALSE
)

adae <- radae(adsl, lookup_aag = aag)

with(
  adae,
  cbind(
    table(AEDECOD, SMQ01NAM),
    table(AEDECOD, CQ01NAM)
  )
)
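
The key variables listed under Details can be inspected on a generated dataset. The check below is a sketch that assumes the package is installed; it confirms the key columns exist and counts rows whose key combination repeats an earlier row, without asserting what that count should be.

```r
library(random.cdisc.data)

adsl <- radsl(N = 10, study_duration = 2, seed = 1)
adae <- radae(adsl, seed = 2)

# Key variables as listed under Details.
keys <- c("STUDYID", "USUBJID", "ASTDTM", "AETERM", "AESEQ")

# Confirm the key columns are present in the generated data.
stopifnot(all(keys %in% names(adae)))

# Count rows that duplicate an earlier key combination.
sum(duplicated(adae[, keys]))
```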

Time to Adverse Event Analysis Dataset (ADAETTE)

Description

Function for generating a random Time-to-AE Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

radaette(
  adsl,
  event_descr = NULL,
  censor_descr = NULL,
  lookup = NULL,
  seed = NULL,
  na_percentage = 0,
  na_vars = list(CNSR = c(NA, 0.1), AVAL = c(1234, 0.1)),
  cached = FALSE
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

event_descr

(character vector)
Descriptions of events. Defaults to NULL.

censor_descr

(character vector)
Descriptions of censors. Defaults to NULL.

lookup

(data.frame)
Additional parameters.

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of ds. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

cached

boolean whether the cached ADAETTE data cadaette should be returned or new data should be generated. If set to TRUE then the other arguments to radaette will be ignored.

Details

Keys: STUDYID, USUBJID, PARAMCD

Value

data.frame

Author(s)

Xiuting Mi

Examples

adsl <- radsl(N = 10, seed = 1, study_duration = 2)

adaette <- radaette(adsl, seed = 2)
adaette

Previous and Concomitant Medications Analysis Dataset (ADCM)

Description

Function for generating a random Concomitant Medication Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

radcm(
  adsl,
  max_n_cms = 10L,
  lookup = NULL,
  seed = NULL,
  na_percentage = 0,
  na_vars = list(CMCLAS = c(NA, 0.1), CMDECOD = c(1234, 0.1), ATIREL = c(1234, 0.1)),
  who_coding = FALSE,
  cached = FALSE
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

max_n_cms

(integer)
Maximum number of concomitant medications per patient. Defaults to 10.

lookup

(data.frame)
Additional parameters.

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of ds. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

who_coding

(flag)
Whether WHO coding (with multiple paths per medication) should be used.

cached

boolean whether the cached ADCM data cadcm should be returned or new data should be generated. If set to TRUE then the other arguments to radcm will be ignored.

Details

One record per each record in the corresponding SDTM domain.

Keys: STUDYID, USUBJID, ASTDTM, CMSEQ

Value

data.frame

Examples

adsl <- radsl(N = 10, seed = 1, study_duration = 2)

adcm <- radcm(adsl, seed = 2)
adcm

adcm_who <- radcm(adsl, seed = 2, who_coding = TRUE)
adcm_who

Protocol Deviations Analysis Dataset (ADDV)

Description

Function for generating a random Protocol Deviations Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

raddv(
  adsl,
  max_n_dv = 3L,
  p_dv = 0.15,
  lookup = NULL,
  seed = NULL,
  na_percentage = 0,
  na_vars = list(ASTDT = c(seed = 1234, percentage = 0.1), DVCAT = c(seed = 1234,
    percentage = 0.1)),
  cached = FALSE
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

max_n_dv

(integer)
Maximum number of deviations per patient. Defaults to 3.

p_dv

(proportion)
Probability of a patient having protocol deviations.

lookup

(data.frame)
Additional parameters.

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of ds. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

cached

boolean whether the cached ADDV data caddv should be returned or new data should be generated. If set to TRUE then the other arguments to raddv will be ignored.

Details

One record per each record in the corresponding SDTM domain.

Keys: STUDYID, USUBJID, ASTDT, DVTERM, DVSEQ

Value

data.frame

Examples

adsl <- radsl(N = 10, seed = 1, study_duration = 2)

addv <- raddv(adsl, seed = 2)
addv

ECG Analysis Dataset (ADEG)

Description

Function for generating a random ECG Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

radeg(
  adsl,
  egcat = c("INTERVAL", "INTERVAL", "MEASUREMENT", "FINDING"),
  param = c("QT Duration", "RR Duration", "Heart Rate", "ECG Interpretation"),
  paramcd = c("QT", "RR", "HR", "ECGINTP"),
  paramu = c("msec", "msec", "beats/min", ""),
  visit_format = "WEEK",
  n_assessments = 5L,
  n_days = 5L,
  max_n_eg = 10L,
  lookup = NULL,
  seed = NULL,
  na_percentage = 0,
  na_vars = list(ABLFL = c(1235, 0.1), BASE = c(NA, 0.1), BASEC = c(NA, 0.1), CHG =
    c(1234, 0.1), PCHG = c(1234, 0.1)),
  cached = FALSE
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

egcat

(character vector)
EG category values.

param

(character vector)
Parameter values.

paramcd

(character vector)
Parameter code values.

paramu

(character vector)
Parameter unit values.

visit_format

(character)
Type of visit. Options are "WEEK" and "CYCLE".

n_assessments

(integer)
Number of weeks or cycles.

n_days

(integer)
Number of days in each cycle (only used if visit_format is "CYCLE").

max_n_eg

(integer)
Maximum number of EG results per patient. Defaults to 10.

lookup

(data.frame)
Additional parameters.

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of ds. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

cached

boolean whether the cached ADEG data cadeg should be returned or new data should be generated. If set to TRUE then the other arguments to radeg will be ignored.

Details

One record per subject per parameter per analysis visit per analysis date.

Keys: STUDYID, USUBJID, PARAMCD, BASETYPE, AVISITN, ATPTN, DTYPE, ADTM, EGSEQ, ASPID

Value

data.frame

Author(s)

tomlinsj, npaszty, Xuefeng Hou, dipietrc

Examples

adsl <- radsl(N = 10, seed = 1, study_duration = 2)

adeg <- radeg(adsl, visit_format = "WEEK", n_assessments = 7L, seed = 2)
adeg

adeg <- radeg(adsl, visit_format = "CYCLE", n_assessments = 2L, seed = 2)
adeg

Exposure Analysis Dataset (ADEX)

Description

Function for generating a random Exposure Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

radex(
  adsl,
  param = c("Dose administered during constant dosing interval",
    "Number of doses administered during constant dosing interval",
    "Total dose administered", "Total number of doses administered"),
  paramcd = c("DOSE", "NDOSE", "TDOSE", "TNDOSE"),
  paramu = c("mg", " ", "mg", " "),
  parcat1 = c("INDIVIDUAL", "OVERALL"),
  parcat2 = c("Drug A", "Drug B"),
  visit_format = "WEEK",
  n_assessments = 5L,
  n_days = 5L,
  max_n_exs = 6L,
  lookup = NULL,
  seed = NULL,
  na_percentage = 0,
  na_vars = list(AVAL = c(NA, 0.1), AVALU = c(NA), 0.1),
  cached = FALSE
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

param

(character vector)
Parameter values.

paramcd

(character vector)
Parameter code values.

paramu

(character vector)
Parameter unit values.

parcat1

(character vector)
Dose amount categories. Defaults to "INDIVIDUAL" and "OVERALL".

parcat2

(character vector)
Types of drug received. Defaults to "Drug A" and "Drug B".

visit_format

(character)
Type of visit. Options are "WEEK" and "CYCLE".

n_assessments

(integer)
Number of weeks or cycles.

n_days

(integer)
Number of days in each cycle (only used if visit_format is "CYCLE").

max_n_exs

(integer)
Maximum number of exposures per patient. Defaults to 6.

lookup

(data.frame)
Additional parameters.

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of ds. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

cached

boolean whether the cached ADEX data cadex should be returned or new data should be generated. If set to TRUE then the other arguments to radex will be ignored.

Details

One record per each record in the corresponding SDTM domain.

Keys: STUDYID, USUBJID, EXSEQ, PARAMCD, PARCAT1, ASTDTM, AENDTM, ASTDY, AENDY, AVISITN, EXDOSFRQ, EXROUTE, VISIT, VISITDY, EXSTDTC, EXENDTC, EXSTDY, EXENDY

Value

data.frame

Examples

adsl <- radsl(N = 10, study_duration = 2, seed = 1)

adex <- radex(adsl, seed = 2)
adex

Hy's Law Analysis Dataset (ADHY)

Description

Function for generating a random Hy's Law Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

radhy(
  adsl,
  param = c("TBILI <= 2 times ULN and ALT value category",
    "TBILI > 2 times ULN and AST value category",
    "TBILI > 2 times ULN and ALT value category",
    "TBILI <= 2 times ULN and AST value category",
    "TBILI > 2 times ULN and ALKPH <= 2 times ULN and ALT value category",
    "TBILI > 2 times ULN and ALKPH <= 2 times ULN and AST value category",
    "TBILI > 2 times ULN and ALKPH <= 5 times ULN and ALT value category",
    "TBILI > 2 times ULN and ALKPH <= 5 times ULN and AST value category",
    "TBILI <= 2 times ULN and two consecutive elevations of ALT in relation to ULN",
    "TBILI > 2 times ULN and two consecutive elevations of AST in relation to ULN",
    "TBILI <= 2 times ULN and two consecutive elevations of AST in relation to ULN",
    "TBILI > 2 times ULN and two consecutive elevations of ALT in relation to ULN",
    "TBILI > 2 times ULN and two consecutive elevations of ALT in relation to Baseline",
    "TBILI <= 2 times ULN and two consecutive elevations of ALT in relation to Baseline",
    "TBILI > 2 times ULN and two consecutive elevations of AST in relation to Baseline",
    "TBILI <= 2 times ULN and two consecutive elevations of AST in relation to Baseline",
    "ALT > 3 times ULN by Period", "AST > 3 times ULN by Period",
    "ALT or AST > 3 times ULN by Period", "ALT > 3 times Baseline by Period",
    "AST > 3 times Baseline by Period", "ALT or AST > 3 times Baseline by Period"),
  paramcd = c("BLAL", "BGAS", "BGAL", "BLAS", "BA2AL", "BA2AS", "BA5AL", "BA5AS",
    "BL2AL2CU", "BG2AS2CU", "BL2AS2CU", "BG2AL2CU", "BG2AL2CB", "BL2AL2CB", "BG2AS2CB",
    "BL2AS2CB", "ALTPULN", "ASTPULN", "ALTASTPU", "ALTPBASE", "ASTPBASE", "ALTASTPB"),
  seed = NULL,
  cached = FALSE
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

param

(character vector)
Parameter values.

paramcd

(character vector)
Parameter code values.

seed

(numeric)
Seed to use for reproducible random number generation.

cached

boolean whether the cached ADHY data cadhy should be returned or new data should be generated. If set to TRUE then the other arguments to radhy will be ignored.

Details

One record per subject per parameter per analysis visit per analysis date.

Keys: STUDYID, USUBJID, PARAMCD, AVISITN, ADTM, SRCSEQ

Value

data.frame

Author(s)

wojciakw

Examples

adsl <- radsl(N = 10, seed = 1, study_duration = 2)

adhy <- radhy(adsl, seed = 2)
adhy

Laboratory Data Analysis Dataset (ADLB)

Description

Function for generating a random Laboratory Data Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

radlb(
  adsl,
  lbcat = c("CHEMISTRY", "CHEMISTRY", "IMMUNOLOGY"),
  param = c("Alanine Aminotransferase Measurement", "C-Reactive Protein Measurement",
    "Immunoglobulin A Measurement"),
  paramcd = c("ALT", "CRP", "IGA"),
  paramu = c("U/L", "mg/L", "g/L"),
  aval_mean = c(18, 9, 2.9),
  visit_format = "WEEK",
  n_assessments = 5L,
  n_days = 5L,
  max_n_lbs = 10L,
  lookup = NULL,
  seed = NULL,
  na_percentage = 0,
  na_vars = list(LOQFL = c(NA, 0.1), ABLFL2 = c(1234, 0.1), ABLFL = c(1235, 0.1), BASE2 =
    c(NA, 0.1), BASE = c(NA, 0.1), CHG2 = c(1235, 0.1), PCHG2 = c(1235, 0.1), CHG =
    c(1234, 0.1), PCHG = c(1234, 0.1)),
  cached = FALSE
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

lbcat

(character vector)
LB category values.

param

(character vector)
Parameter values.

paramcd

(character vector)
Parameter code values.

paramu

(character vector)
Parameter unit values.

aval_mean

(numeric vector)
Mean values corresponding to each parameter.

visit_format

(character)
Type of visit. Options are "WEEK" and "CYCLE".

n_assessments

(integer)
Number of weeks or cycles.

n_days

(integer)
Number of days in each cycle (only used if visit_format is "CYCLE").

max_n_lbs

(integer)
Maximum number of labs per patient. Defaults to 10.

lookup

(data.frame)
Additional parameters.

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of ds. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

cached

boolean whether the cached ADLB data cadlb should be returned or new data should be generated. If set to TRUE then the other arguments to radlb will be ignored.

Details

One record per subject per parameter per analysis visit per analysis date.

Keys: STUDYID, USUBJID, PARAMCD, BASETYPE, AVISITN, ATPTN, DTYPE, ADTM, LBSEQ, ASPID

Value

data.frame

Author(s)

tomlinsj, npaszty, Xuefeng Hou

Examples

adsl <- radsl(N = 10, seed = 1, study_duration = 2)

adlb <- radlb(adsl, visit_format = "WEEK", n_assessments = 7L, seed = 2)
adlb

adlb <- radlb(adsl, visit_format = "CYCLE", n_assessments = 2L, seed = 2)
adlb
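
The na_vars argument pairs each column name with a seed and a percentage, falling back to na_percentage when the percentage is NA. The base R sketch below illustrates that convention on a toy data frame. The helper apply_na_vars_sketch() is hypothetical and is not part of random.cdisc.data; the package's own implementation may differ in detail.

```r
# Hypothetical helper (not part of random.cdisc.data): applies an
# na_vars-style specification to a data frame using base R only.
apply_na_vars_sketch <- function(df, na_vars, na_percentage = 0) {
  for (col in names(na_vars)) {
    spec <- na_vars[[col]]
    seed <- spec[1]
    # Fall back to na_percentage when the percentage is absent or NA.
    pct <- if (length(spec) >= 2 && !is.na(spec[2])) spec[2] else na_percentage
    # A seed of NA means the random number generator is not reseeded.
    if (!is.na(seed)) set.seed(seed)
    idx <- sample(nrow(df), round(nrow(df) * pct))
    df[[col]][idx] <- NA
  }
  df
}

toy <- data.frame(CHG = rnorm(100), PCHG = rnorm(100), BASE = rnorm(100))
out <- apply_na_vars_sketch(
  toy,
  na_vars = list(CHG = c(1234, 0.1), PCHG = c(1234, 0.1), BASE = NA),
  na_percentage = 0.2
)
colSums(is.na(out))
```

In this sketch, columns that share a seed (CHG and PCHG) lose the same rows, which is one plausible reason for reusing seeds across related variables.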

Medical History Analysis Dataset (ADMH)

Description

Function for generating a random Medical History Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

radmh(
  adsl,
  max_n_mhs = 10L,
  lookup = NULL,
  seed = NULL,
  na_percentage = 0,
  na_vars = list(MHBODSYS = c(NA, 0.1), MHDECOD = c(1234, 0.1)),
  cached = FALSE
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

max_n_mhs

(integer)
Maximum number of MHs per patient. Defaults to 10.

lookup

(data.frame)
Additional parameters.

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of the generated dataset. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

cached

(logical)
Whether the cached ADMH data cadmh should be returned instead of generating new data. If TRUE, the other arguments to radmh are ignored.

Details

One record for each record in the corresponding SDTM domain.

Keys: STUDYID, USUBJID, ASTDTM, MHSEQ

Value

data.frame

Examples

adsl <- radsl(N = 10, study_duration = 2, seed = 1)

admh <- radmh(adsl, seed = 2)
admh

Pharmacokinetics Analysis Dataset (ADPC)

Description

Function for generating a random Pharmacokinetics Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

radpc(
  adsl,
  avalu = "ug/mL",
  constants = c(D = 100, ka = 0.8, ke = 1),
  duration = 2,
  seed = NULL,
  na_percentage = 0,
  na_vars = list(AVAL = c(NA, 0.1)),
  cached = FALSE
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

avalu

(character)
Analysis value units.

constants

(named numeric vector)
Constant parameters to be used in formulas for creating analysis values.

duration

(numeric)
Duration in number of days.

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of the generated dataset. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

cached

(logical)
Whether the cached ADPC data cadpc should be returned instead of generating new data. If TRUE, the other arguments to radpc are ignored.

Details

One record per study, subject, parameter, and time point.

Value

data.frame

Examples

adsl <- radsl(N = 10, seed = 1, study_duration = 2)

adpc <- radpc(adsl, seed = 2)
adpc

adpc <- radpc(adsl, seed = 2, duration = 3)
adpc

Pharmacokinetics Parameters Dataset (ADPP)

Description

Function for generating a random Pharmacokinetics Parameters Dataset for a given Subject-Level Analysis Dataset.

Usage

radpp(
  adsl,
  ppcat = c("Plasma Drug X", "Plasma Drug Y", "Metabolite Drug X", "Metabolite Drug Y"),
  ppspec = c("Plasma", "Plasma", "Plasma", "Matrix of PD", "Matrix of PD", "Urine",
    "Urine", "Urine", "Urine"),
  paramcd = c("AUCIFO", "CMAX", "CLO", "RMAX", "TON", "RENALCL", "RENALCLD", "RCAMINT",
    "RCPCINT"),
  param = c("AUC Infinity Obs", "Max Conc", "Total CL Obs", "Time of Maximum Response",
    "Time to Onset", "Renal CL", "Renal CL Norm by Dose", "Amt Rec from T1 to T2",
    "Pct Rec from T1 to T2"),
  paramu = c("day*ug/mL", "ug/mL", "ml/day/kg", "hr", "hr", "L/hr", "L/hr/mg", "mg",
    "%"),
  aval_mean = c(200, 30, 5, 10, 3, 0.05, 0.005, 1.5613, 15.65),
  visit_format = "CYCLE",
  n_days = 2L,
  seed = NULL,
  na_percentage = 0,
  na_vars = list(AVAL = c(NA, 0.1)),
  cached = FALSE
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

ppcat

(character vector)
Categories of parameters.

ppspec

(character vector)
Specimen material types.

paramcd

(character vector)
Parameter code values.

param

(character vector)
Parameter values.

paramu

(character vector)
Parameter unit values.

aval_mean

(numeric vector)
Mean values corresponding to each parameter.

visit_format

(character)
Type of visit. Options are "WEEK" and "CYCLE".

n_days

(integer)
Number of days in each cycle (only used if visit_format is "CYCLE").

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of the generated dataset. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

cached

(logical)
Whether the cached ADPP data cadpp should be returned instead of generating new data. If TRUE, the other arguments to radpp are ignored.

Details

One record per study, subject, parameter category, parameter and visit.

Value

data.frame

Examples

adsl <- radsl(N = 10, seed = 1, study_duration = 2)

adpp <- radpp(adsl, seed = 2)
adpp

EORTC QLQ-C30 V3 Analysis Dataset (ADQLQC)

Description

Function for generating a random EORTC QLQ-C30 V3 Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

radqlqc(adsl, percent, number, seed = NULL, cached = FALSE)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

percent

(numeric)
Completion: completed at least y percent of questions; one record per visit.

number

(numeric)
Completion: completed at least x question(s); one record per visit.

seed

(numeric)
Seed to use for reproducible random number generation.

cached

(logical)
Whether the cached ADQLQC data cadqlqc should be returned instead of generating new data. If TRUE, the other arguments to radqlqc are ignored.

Details

Keys: STUDYID, USUBJID, PARCAT1N, PARAMCD, BASETYPE, AVISITN, ATPTN, ADTM, QSSEQ

Value

data.frame

Examples

adsl <- radsl(N = 10, study_duration = 2, seed = 1)

adqlqc <- radqlqc(adsl, seed = 1, percent = 80, number = 2)
adqlqc

Questionnaires Analysis Dataset (ADQS)

Description

Function for generating a random Questionnaires Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

radqs(
  adsl,
  param = c("BFI All Questions", "Fatigue Interference",
    "Function/Well-Being (GF1,GF3,GF7)", "Treatment Side Effects (GP2,C5,GP5)",
    "FKSI-19 All Questions"),
  paramcd = c("BFIALL", "FATIGI", "FKSI-FWB", "FKSI-TSE", "FKSIALL"),
  visit_format = "WEEK",
  n_assessments = 5L,
  n_days = 5L,
  seed = NULL,
  na_percentage = 0,
  na_vars = list(LOQFL = c(NA, 0.1), ABLFL2 = c(1234, 0.1), ABLFL = c(1235, 0.1),
    CHG2 = c(1235, 0.1), PCHG2 = c(1235, 0.1), CHG = c(1234, 0.1), PCHG = c(1234, 0.1)),
  cached = FALSE
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

param

(character vector)
Parameter values.

paramcd

(character vector)
Parameter code values.

visit_format

(character)
Type of visit. Options are "WEEK" and "CYCLE".

n_assessments

(integer)
Number of weeks or cycles.

n_days

(integer)
Number of days in each cycle (only used if visit_format is "CYCLE").

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of the generated dataset. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

cached

(logical)
Whether the cached ADQS data cadqs should be returned instead of generating new data. If TRUE, the other arguments to radqs are ignored.

Details

One record per subject per parameter per analysis visit per analysis date.

Keys: STUDYID, USUBJID, PARAMCD, AVISITN

Value

data.frame

Author(s)

npaszty

Examples

adsl <- radsl(N = 10, seed = 1, study_duration = 2)

adqs <- radqs(adsl, visit_format = "WEEK", n_assessments = 7L, seed = 2)
adqs

adqs <- radqs(adsl, visit_format = "CYCLE", n_assessments = 3L, seed = 2)
adqs

Tumor Response Analysis Dataset (ADRS)

Description

Function for generating a random Tumor Response Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

radrs(
  adsl,
  avalc = NULL,
  lookup = NULL,
  seed = NULL,
  na_percentage = 0,
  na_vars = list(AVISIT = c(NA, 0.1), AVAL = c(1234, 0.1), AVALC = c(1234, 0.1)),
  cached = FALSE
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

avalc

(character vector)
Analysis value categories.

lookup

(data.frame)
Additional parameters.

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of the generated dataset. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

cached

(logical)
Whether the cached ADRS data cadrs should be returned instead of generating new data. If TRUE, the other arguments to radrs are ignored.

Details

One record per subject per parameter per analysis visit per analysis date. SDTM variables are populated on new records coming from other single records. Otherwise, SDTM variables are left blank.

Keys: STUDYID, USUBJID, PARAMCD, AVISITN, ADT, RSSEQ

Value

data.frame

Examples

adsl <- radsl(N = 10, seed = 1, study_duration = 2)

adrs <- radrs(adsl, seed = 2)
adrs

Time to Safety Event Analysis Dataset (ADSAFTTE)

Description

Function for generating a random Time to Safety Event Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

radsaftte(adsl, ...)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

...

Additional arguments to be passed to radaette.

Details

Keys: STUDYID, USUBJID, PARAMCD

Value

data.frame

Examples

adsl <- radsl(N = 10, seed = 1, study_duration = 2)

adsaftte <- radsaftte(adsl, seed = 2)
adsaftte

Subject-Level Analysis Dataset (ADSL)

Description

The Subject-Level Analysis Dataset (ADSL) is used to provide the variables that describe attributes of a subject. ADSL is a source for subject-level variables used in other analysis data sets, such as population flags and treatment variables. There is only one ADSL per study. ADSL and its related metadata are required in a CDISC-based submission of data from a clinical trial even if no other analysis data sets are submitted.

Usage

radsl(
  N = 400,
  study_duration = 2,
  seed = NULL,
  with_trt02 = TRUE,
  na_percentage = 0,
  na_vars = list(AGE = NA, SEX = NA, RACE = NA, STRATA1 = NA, STRATA2 = NA,
    BMRKR1 = c(seed = 1234, percentage = 0.1), BMRKR2 = c(1234, 0.1), BEP01FL = NA),
  ae_withdrawal_prob = 0.05,
  cached = FALSE
)

Arguments

N

(numeric)
Number of patients.

study_duration

(numeric)
Duration of study in years.

seed

(numeric)
Seed to use for reproducible random number generation.

with_trt02

(logical)
Should period 2 be added.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of the generated dataset. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

ae_withdrawal_prob

(proportion)
Probability that there is at least one Adverse Event leading to the withdrawal of a study drug.

cached

(logical)
Whether the cached ADSL data cadsl should be returned instead of generating new data. If TRUE, the other arguments to radsl are ignored.

Details

One record per subject.

Keys: STUDYID, USUBJID

Value

data.frame

Examples

adsl <- radsl(N = 10, study_duration = 2, seed = 1)
adsl

adsl <- radsl(
  N = 10, seed = 1,
  na_percentage = 0.1,
  na_vars = list(
    DTHDT = c(seed = 1234, percentage = 0.1),
    LSTALVDT = c(seed = 1234, percentage = 0.1)
  )
)
adsl

adsl <- radsl(N = 10, seed = 1, na_percentage = .1)
adsl
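
The Details state one record per subject with keys STUDYID and USUBJID, and the seed argument is meant to make generation reproducible. A short check of both properties might look like the following. It uses only the radsl() arguments documented above and assumes random.cdisc.data is installed.

```r
library(random.cdisc.data)

adsl_a <- radsl(N = 10, study_duration = 2, seed = 1)
adsl_b <- radsl(N = 10, study_duration = 2, seed = 1)

# One record per subject: the key columns should identify each row uniquely,
# so anyDuplicated() is expected to return 0.
nrow(adsl_a)
anyDuplicated(adsl_a[, c("STUDYID", "USUBJID")])

# Same seed and same arguments: the two datasets are expected to match.
identical(adsl_a, adsl_b)
```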

Subcategory Analysis Dataset (ADSUB)

Description

Function for generating a random Subcategory Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

radsub(
  adsl,
  param = c("Baseline Weight", "Baseline Height", "Baseline BMI", "Baseline ECOG",
    "Baseline Biomarker Mutation"),
  paramcd = c("BWGHTSI", "BHGHTSI", "BBMISI", "BECOG", "BBMRKR1"),
  seed = NULL,
  na_percentage = 0,
  na_vars = list(),
  cached = FALSE
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

param

(character vector)
Parameter values.

paramcd

(character vector)
Parameter code values.

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of the generated dataset. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

cached

(logical)
Whether the cached ADSUB data cadsub should be returned instead of generating new data. If TRUE, the other arguments to radsub are ignored.

Details

One record per subject.

Keys: STUDYID, USUBJID, PARAMCD, AVISITN, ADTM, SRCSEQ

Value

data.frame

Author(s)

tomlinsj, npaszty, Xuefeng Hou, dipietrc

Examples

adsl <- radsl(N = 10, seed = 1, study_duration = 2)

adsub <- radsub(adsl, seed = 2)
adsub

Tumor Response Analysis Dataset (ADTR)

Description

Function for generating a random Tumor Response Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

radtr(
  adsl,
  param = c("Sum of Longest Diameter by Investigator"),
  paramcd = c("SLDINV"),
  seed = NULL,
  cached = FALSE,
  ...
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

param

(character vector)
Parameter values.

paramcd

(character vector)
Parameter code values.

seed

(numeric)
Seed to use for reproducible random number generation.

cached

(logical)
Whether the cached ADTR data cadtr should be returned instead of generating new data. If TRUE, the other arguments to radtr are ignored.

...

Additional arguments to be passed to radrs.

Details

One record per subject per parameter per analysis visit per analysis date.

Keys: STUDYID, USUBJID, PARAMCD, BASETYPE, AVISITN, DTYPE

Value

data.frame

Author(s)

tomlinsj, npaszty, Xuefeng Hou, dipietrc

Examples

adsl <- radsl(N = 10, seed = 1, study_duration = 2)

adtr <- radtr(adsl, seed = 2)
adtr

Time-to-Event Analysis Dataset (ADTTE)

Description

Function for generating a random Time-to-Event Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

radtte(
  adsl,
  event_descr = NULL,
  censor_descr = NULL,
  lookup = NULL,
  seed = NULL,
  na_percentage = 0,
  na_vars = list(CNSR = c(NA, 0.1), AVAL = c(1234, 0.1), AVALU = c(1234, 0.1)),
  cached = FALSE
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

event_descr

(character vector)
Descriptions of events. Defaults to NULL.

censor_descr

(character vector)
Descriptions of censors. Defaults to NULL.

lookup

(data.frame)
Additional parameters.

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of the generated dataset. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

cached

(logical)
Whether the cached ADTTE data cadtte should be returned instead of generating new data. If TRUE, the other arguments to radtte are ignored.

Details

Keys: STUDYID, USUBJID, PARAMCD

Value

data.frame

Examples

adsl <- radsl(N = 10, seed = 1, study_duration = 2)

adtte <- radtte(adsl, seed = 2)
adtte

Vital Signs Analysis Dataset (ADVS)

Description

Function for generating a random Vital Signs Analysis Dataset for a given Subject-Level Analysis Dataset.

Usage

radvs(
  adsl,
  param = c("Diastolic Blood Pressure", "Pulse Rate", "Respiratory Rate",
    "Systolic Blood Pressure", "Temperature", "Weight"),
  paramcd = c("DIABP", "PULSE", "RESP", "SYSBP", "TEMP", "WEIGHT"),
  paramu = c("Pa", "beats/min", "breaths/min", "Pa", "C", "Kg"),
  visit_format = "WEEK",
  n_assessments = 5L,
  n_days = 5L,
  seed = NULL,
  na_percentage = 0,
  na_vars = list(CHG2 = c(1235, 0.1), PCHG2 = c(1235, 0.1), CHG = c(1234, 0.1),
    PCHG = c(1234, 0.1), AVAL = c(123, 0.1), AVALU = c(123, 0.1)),
  cached = FALSE
)

Arguments

adsl

(data.frame)
Subject-Level Analysis Dataset (ADSL).

param

(character vector)
Parameter values.

paramcd

(character vector)
Parameter code values.

paramu

(character vector)
Parameter unit values.

visit_format

(character)
Type of visit. Options are "WEEK" and "CYCLE".

n_assessments

(integer)
Number of weeks or cycles.

n_days

(integer)
Number of days in each cycle (only used if visit_format is "CYCLE").

seed

(numeric)
Seed to use for reproducible random number generation.

na_percentage

(proportion)
Default percentage of values to be replaced by NA.

na_vars

(list)
A named list where the name of each element is a column name of the generated dataset. Each element of this list should be a numeric vector with two elements:

seed (numeric)
The seed to be used for this element - can be NA.
percentage (proportion)
Percentage of elements to be replaced with NA. If NA, na_percentage is used as a default.

cached

(logical)
Whether the cached ADVS data cadvs should be returned instead of generating new data. If TRUE, the other arguments to radvs are ignored.

Details

One record per subject per parameter per analysis visit per analysis date.

Keys: STUDYID, USUBJID, PARAMCD, BASETYPE, AVISITN, ATPTN, DTYPE, ADTM, VSSEQ, ASPID

Value

data.frame

Author(s)

npaszty

Examples

adsl <- radsl(N = 10, seed = 1, study_duration = 2)

advs <- radvs(adsl, visit_format = "WEEK", n_assessments = 7L, seed = 2)
advs

advs <- radvs(adsl, visit_format = "CYCLE", n_assessments = 3L, seed = 2)
advs

Primary Keys: Labels

Description

Shallow copy of formatters::var_relabel(). Used mainly internally to relabel a subset of variables in a data set.

Usage

rcd_var_relabel(x, ...)

Arguments

x

(data.frame)
Data frame containing variables to which labels are applied.

...

(named character)
Name-Value pairs, where name corresponds to a variable name in x and the value to the new variable label.

Value

x (data.frame)
Data frame with labels applied.
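
As an illustration of the relabeling described above, the following base R sketch applies name-value pairs as variable labels. It assumes labels live in each column's "label" attribute, which is a common convention but is not confirmed by this page. The function relabel_sketch() is a hypothetical stand-in, not the package function.

```r
# Hypothetical stand-in for rcd_var_relabel(); assumes labels are stored in
# each column's "label" attribute.
relabel_sketch <- function(x, ...) {
  stopifnot(is.data.frame(x))
  labels <- list(...)
  unknown <- setdiff(names(labels), names(x))
  if (length(unknown) > 0) {
    stop("variables not found in x: ", paste(unknown, collapse = ", "))
  }
  for (var in names(labels)) {
    attr(x[[var]], "label") <- labels[[var]]
  }
  x
}

df <- data.frame(AGE = c(34, 51), SEX = c("F", "M"))
df <- relabel_sketch(df, AGE = "Age (years)", SEX = "Sex")

# Collect the labels of all columns into a named character vector.
vapply(df, attr, character(1), which = "label")
```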

Related Variables: Assign

Description

Assign values to a related variable within a domain.

Usage

rel_var(df, var_name, related_var, var_values = NULL)

Arguments

df

(data.frame)
Data frame containing the related variables.

var_name

(character)
Name of the variable, related to related_var, to add to df.

related_var

(character)
Name of variable within df with values to which values of var_name must relate.

var_values

(any)
Vector of values related to values of related_var.

Value

df with added factor variable var_name containing var_values corresponding to related_var.

Examples

# Example with data.frame.
params <- c("Level A", "Level B", "Level C")
adlb_df <- data.frame(
  ID = 1:9,
  PARAM = factor(
    rep(c("Level A", "Level B", "Level C"), 3),
    levels = params
  )
)
rel_var(
  df = adlb_df,
  var_name = "PARAMCD",
  var_values = c("A", "B", "C"),
  related_var = "PARAM"
)

# Example with tibble.
adlb_tbl <- tibble::tibble(
  ID = 1:9,
  PARAM = factor(
    rep(c("Level A", "Level B", "Level C"), 3),
    levels = params
  )
)
rel_var(
  df = adlb_tbl,
  var_name = "PARAMCD",
  var_values = c("A", "B", "C"),
  related_var = "PARAM"
)

Related Variables: Initialize

Description

Verify and initialize related variable values. For example, relvar_init("Alanine Aminotransferase Measurement", "ALT").

Usage

relvar_init(relvar1, relvar2)

Arguments

relvar1

(list of character)
List of n elements.

relvar2

(list of character)
List of n elements.

Value

A vector of n elements.

Replace Values in a Vector by NA

Description

Randomized replacement of values by NA.

Usage

replace_na(v, percentage = 0.05, seed = NULL)

Arguments

v

(any)
Vector of any type.

percentage

(proportion)
Value between 0 and 1 defining how much of the vector shall be replaced by NA. This number is randomized by +/- 5% to have full randomization.

seed

(numeric)
Seed to use for reproducible random number generation.

Value

The input vector v where a certain number of values are replaced by NA.
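
A minimal base R sketch of the idea follows. The function replace_na_sketch() is hypothetical and deliberately simpler than the documented behaviour: it replaces exactly the rounded share of values and does not vary the proportion by +/- 5%.

```r
# Hypothetical re-implementation for illustration only; the package function
# may differ (this sketch does not randomize the proportion).
replace_na_sketch <- function(v, percentage = 0.05, seed = NULL) {
  stopifnot(percentage >= 0, percentage <= 1)
  if (!is.null(seed) && !is.na(seed)) set.seed(seed)
  # Pick the positions to blank out, then overwrite them with NA.
  idx <- sample(seq_along(v), round(length(v) * percentage))
  v[idx] <- NA
  v
}

x <- replace_na_sketch(1:100, percentage = 0.2, seed = 42)
sum(is.na(x))
```

Calling the sketch twice with the same seed blanks out the same positions, which is the property the seed argument is there to provide.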

Primary Keys: Retain Values

Description

Retain values within primary keys.

Usage

retain(df, value_var, event, outside = NA)

Arguments

df

(data.frame)
Data frame in which to apply the retain.

value_var

(any)
Variable in df containing the value to be retained.

event

(expression)
Expression returning a logical value to trigger the retain.

outside

(any)
Additional value to retain. Defaults to NA.

Value

A vector of values where expression is true.
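
The argument descriptions suggest a carry-forward: whenever event is TRUE the current value is retained for the following rows, and rows before the first event receive outside. The base R sketch below implements that reading. It is an assumption based on this page, not the package source, and retain_sketch() is a hypothetical name.

```r
# Hypothetical carry-forward helper; the semantics are an assumption based on
# the argument descriptions above.
retain_sketch <- function(value, event, outside = NA) {
  stopifnot(length(value) == length(event), !anyNA(event))
  # Start positions of each run: row 1, every event row, and one past the end.
  starts <- c(1, which(event), length(value) + 1)
  # Value held during each run: `outside` first, then the value at each event.
  kept <- c(outside, value[event])
  rep(kept, times = diff(starts))
}

aval <- c(10, 11, 12, 13, 14)
is_baseline <- c(FALSE, TRUE, FALSE, FALSE, TRUE)
retain_sketch(aval, is_baseline)
# NA 11 11 11 14
```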

Truncated Exponential Distribution

Description

This generates random numbers from a truncated Exponential distribution, i.e. from X | X > l or X | X < r when X ~ Exp(rate). The advantage here is that we guarantee to return exactly n numbers and without using a loop internally. This can be derived from the quantile functions of the left- and right-truncated Exponential distributions.

Usage

rtexp(n, rate, l = NULL, r = NULL)

Arguments

n

(numeric)
Number of random numbers.

rate

(numeric)
Non-negative rate.

l

(numeric)
Positive left-hand truncation parameter.

r

(numeric)
Positive right-hand truncation parameter.

Value

The random numbers. If neither l nor r is provided then the usual Exponential distribution is used.

Examples

x <- stats::rexp(1e6, rate = 5)
x <- x[x > 0.5]
hist(x)

y <- rtexp(1e6, rate = 5, l = 0.5)
hist(y)

z <- rtexp(1e6, rate = 5, r = 0.5)
hist(z)
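
The Description mentions the quantile functions of the truncated distributions. One way to sketch that derivation is inverse-CDF sampling: for X | X > l the quantile function is l - log(1 - u) / rate, and for X | X < r it is -log(1 - u * (1 - exp(-rate * r))) / rate. The function rtexp_sketch() below is a hypothetical illustration; rtexp() itself may be implemented differently.

```r
# Hypothetical inverse-CDF sampler for a one-sided truncated Exponential.
rtexp_sketch <- function(n, rate, l = NULL, r = NULL) {
  if (!is.null(l) && !is.null(r)) {
    stop("this sketch handles only one truncation bound at a time")
  }
  if (!is.null(l)) {
    # Left truncation: by memorylessness, X | X > l is l plus an Exp(rate) draw.
    u <- stats::runif(n)
    l - log(1 - u) / rate
  } else if (!is.null(r)) {
    # Right truncation: invert F(x) = (1 - exp(-rate * x)) / (1 - exp(-rate * r)).
    u <- stats::runif(n)
    -log(1 - u * (1 - exp(-rate * r))) / rate
  } else {
    stats::rexp(n, rate = rate)
  }
}

set.seed(1)
y <- rtexp_sketch(1e4, rate = 5, l = 0.5)
z <- rtexp_sketch(1e4, rate = 5, r = 0.5)
range(y)
range(z)
```

Every uniform draw maps to exactly one value inside the truncated range, so n numbers come back without a rejection loop.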

Zero-Truncated Poisson Distribution

Description

This generates random numbers from a zero-truncated Poisson distribution, i.e. from X | X > 0 when X ~ Poisson(lambda). The advantage here is that we guarantee to return exactly n numbers and without using a loop internally. This solution was provided in a post by Peter Dalgaard.

Usage

rtpois(n, lambda)

Arguments

n

(numeric)
Number of random numbers.

lambda

(numeric)
Non-negative mean(s).

Value

The random numbers.

Examples

x <- rpois(1e6, lambda = 5)
x <- x[x > 0]
hist(x)

y <- rtpois(1e6, lambda = 5)
hist(y)
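
A loop-free way to obtain exactly n zero-truncated draws is to sample uniforms only from the part of the unit interval above P(X = 0) and pass them through the Poisson quantile function. The sketch below shows this common inverse-CDF construction. Whether rtpois() uses precisely this code is an assumption, and rtpois_sketch() is a hypothetical name.

```r
# Hypothetical inverse-CDF sampler for the zero-truncated Poisson.
rtpois_sketch <- function(n, lambda) {
  # Probability mass at zero, which the uniform draws must skip.
  p0 <- stats::dpois(0, lambda)
  stats::qpois(stats::runif(n, min = p0, max = 1), lambda)
}

set.seed(1)
y <- rtpois_sketch(1e4, lambda = 0.5)
min(y)
table(y)
```

With a small lambda such as 0.5, rejection sampling would discard most draws, which is where this approach pays off.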

Create a Factor with Random Elements of x

Description

Sample elements from x with replacement to build a factor.

Usage

sample_fct(x, N, ...)

Arguments

x

(character vector or factor)
If character vector then it is also used as levels of the returned factor. If factor then the levels are used as the new levels.

N

(numeric)
Number of items to choose.

...

Additional arguments to be passed to sample.

Value

A factor of length N.

Examples

sample_fct(letters[1:3], 10)
sample_fct(iris$Species, 10)
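
The level handling described under Arguments can be sketched in base R as follows. The function sample_fct_sketch() is a hypothetical illustration rather than the package code. It also shows an extra argument (prob) being forwarded to sample() through the dots.

```r
# Hypothetical illustration of sampling with replacement into a factor whose
# levels come from x (its values if character, its levels if factor).
sample_fct_sketch <- function(x, N, ...) {
  lev <- if (is.factor(x)) levels(x) else x
  factor(sample(as.character(x), N, replace = TRUE, ...), levels = lev)
}

set.seed(1)
arm <- sample_fct_sketch(c("A: Drug X", "B: Placebo"), 10, prob = c(0.8, 0.2))
levels(arm)
length(arm)
```

Because the levels are fixed up front, a level that happens not to be drawn is still present in the result.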

Create Visit Schedule

Description

Create a visit schedule as a factor.

Usage

visit_schedule(visit_format = "WEEK", n_assessments = 10L, n_days = 5L)

Arguments

visit_format

(character)
Type of visit. Options are "WEEK" and "CYCLE".

n_assessments

(integer)
Number of weeks or cycles.

n_days

(integer)
Number of days in each cycle (only used if visit_format is "CYCLE").

Details

X number of visits, or X number of cycles and Y number of days.

Value

A factor of length n_assessments.

Examples

visit_schedule(visit_format = "WEeK", n_assessments = 10L)
visit_schedule(visit_format = "CyCLE", n_assessments = 5L, n_days = 2L)

`random.cdisc.data` Package