Type: | Package |
Title: | Systematic Screening of Study Data for Subgroup Effects |
Version: | 4.0.1 |
Description: | Identifying outcome-relevant subgroups has now become as simple as possible! The formerly lengthy and tedious search for the needle in a haystack will be replaced by a single, comprehensive and coherent presentation. The central result of a subgroup screening is a diagram in which each single dot stands for a subgroup. The diagram may show thousands of them. The position of the dot in the diagram is determined by the sample size of the subgroup and the statistical measure of the treatment effect in that subgroup. The sample size is shown on the horizontal axis while the treatment effect is displayed on the vertical axis. Furthermore, the diagram shows the line of no effect and the overall study results. For small subgroups, which are found on the left side of the plot, larger random deviations from the mean study effect are expected, while for larger subgroups only small deviations from the study mean can be expected to be chance findings. So for a study with no conspicuous subgroup effects, the dots in the figure are expected to form a kind of funnel. Any deviations from this funnel shape hint at conspicuous subgroups. |
License: | GPL-3 |
Depends: | R (≥ 3.5.0) |
Encoding: | UTF-8 |
LazyData: | TRUE |
Imports: | utils, plyr, data.table, ggplot2, ggrepel, rlang, stringr, grDevices, graphics, shiny, DT, stats, shinyjs, methods, bsplus, colourpicker, dplyr, ranger, shinyWidgets |
Suggests: | parallel, survival, knitr, rmarkdown, testthat |
NeedsCompilation: | no |
RoxygenNote: | 7.2.3 |
VignetteBuilder: | knitr |
Config/testthat/edition: | 3 |
Packaged: | 2025-03-18 15:35:29 UTC; sgfpj |
Author: | Bodo Kirsch [aut, cre], Steffen Jeske [aut], Julia Eichhorn [aut], Susanne Lippert [aut], Thomas Schmelter [aut], Christoph Muysers [aut], Hermann Kulmann [aut] |
Maintainer: | Bodo Kirsch <kirschbodo@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-03-18 21:20:02 UTC |
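The funnel shape mentioned in the package description can be illustrated with a small simulation. The sketch below is not part of the package. It assumes a hypothetical, normally distributed effect estimate whose standard error shrinks with 1/sqrt(n), and all names (simulate_subgroup_effects, true_effect, sigma) are made up for illustration.

```r
# Illustrative simulation (not package code): effect estimates of small
# subgroups scatter more widely around the overall effect than those of
# large subgroups, which produces the funnel.
set.seed(1)

simulate_subgroup_effects <- function(sizes, true_effect = 0.5, sigma = 2) {
  # Assumption: the standard error of the estimate is sigma / sqrt(n)
  rnorm(length(sizes), mean = true_effect, sd = sigma / sqrt(sizes))
}

sizes   <- sample(10:500, 2000, replace = TRUE)  # hypothetical subgroup sizes
effects <- simulate_subgroup_effects(sizes)

# Spread of the estimates for small versus large subgroups
sd_small <- sd(effects[sizes <= 50])
sd_large <- sd(effects[sizes >= 400])

# Sample size on the horizontal axis, effect on the vertical axis;
# the dashed line marks the overall effect
if (interactive()) {
  plot(sizes, effects, pch = 16, cex = 0.4,
       xlab = "Subgroup size", ylab = "Treatment effect")
  abline(h = 0.5, lty = 2)
}
```

In such a simulation without any true subgroup effect, dots far outside the funnel are rare, which is why they deserve a closer look in real data.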
Function to create the subgroup filter table
Description
Function to create the subgroup filter table
Usage
createFilteredTable(
filter1,
filter2,
variableChosen1,
variableChosen2,
results,
y,
x,
bg.color,
key
)
Arguments
filter1 |
variable name of first filter. |
filter2 |
variable name of second filter. |
variableChosen1 |
level of first filter variable. |
variableChosen2 |
level of second filter variable. |
results |
results data set object of class "SubScreenResult". |
y |
target variable name. |
x |
variable name. |
bg.color |
background color. |
key |
number of factors. |
Function to create the subgroup parent table
Description
Function to create the subgroup parent table
Usage
createParentTable(results, parents, y, x, x2, bg.color, navpanel)
Arguments
results |
results data set object of class "SubScreenResult". |
parents |
ids of the parent subgroups. |
y |
target variable name. |
x |
variable name. |
x2 |
second variable name. |
bg.color |
background color. |
navpanel |
navpanel id ("SubscreenExplorer"/"SubscreenComparer"). |
Function to create a data set with complement information based on selected subgroup
Description
Function to create a data set with complement information based on selected subgroup
Usage
createPlot_points_data_complement(results_tmp, y, sel_ids)
Arguments
results_tmp |
subscreen data set |
y |
target variable |
sel_ids |
selected subgroup id |
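As a rough illustration of what complement information refers to, the sketch below splits a hypothetical data set into a selected subgroup and its complement. It assumes that the complement consists of all subjects who are not in the subgroup; the data, the column names and the subgroup_def format are invented for this example, and the internal computation of the package may differ.

```r
# Illustrative only: split hypothetical study data into a subgroup
# and its complement (all subjects not in the subgroup).
study <- data.frame(
  subjid = 1:8,
  sex    = c("M", "F", "M", "F", "M", "F", "M", "F"),
  ageg   = c("Low", "Low", "High", "High", "Low", "High", "Low", "High")
)

# Hypothetical subgroup definition: named list of factor = level
subgroup_def <- list(sex = "M", ageg = "Low")

# TRUE for subjects that match every factor level of the definition
in_subgroup <- Reduce(`&`, Map(function(f, lev) study[[f]] == lev,
                               names(subgroup_def), subgroup_def))

subgroup   <- study[in_subgroup, ]
complement <- study[!in_subgroup, ]
```

Every subject ends up in exactly one of the two data frames, so an outcome computed on the complement can be compared directly with the outcome in the subgroup.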
shiny widgets of display option panel
Description
shiny widgets of display option panel
Usage
displayOptionsPanel()
Example importance data set
Description
Example importance data set
Creates an interaction plot used in Explorer and ASMUS-tab in Subgroup Explorer
Description
Creates an interaction plot used in Explorer and ASMUS-tab in Subgroup Explorer
Usage
interaction_plot2(
df_data,
fac1,
fac2 = NULL,
fac3 = NULL,
response,
bg.col = "#6B6B6B",
bg.col2 = NULL,
font.col = "white",
y.min = "NA",
y.max = "NA",
box.col = "white",
sg_green = "#5cb85c",
sg_blue = "#3a6791",
plot_type = ""
)
Arguments
df_data |
data frame with factorial context |
fac1 |
name of factor level 1 |
fac2 |
name of factor level 2 (default: NULL) |
fac3 |
name of factor level 3 (default: NULL) |
response |
target variable |
bg.col |
background color |
bg.col2 |
second background color |
font.col |
font color |
y.min |
y-axis minimum. |
y.max |
y-axis maximum. |
box.col |
box color. |
sg_green |
hex code for color palette creation. |
sg_blue |
hex code for color palette creation. |
plot_type |
linear ("") or logarithmic ("log") y-axis (default: ""). |
Returns all 'parent'-subgroups of a specific subgroup
Description
Returns all 'parent'-subgroups of a specific subgroup
Usage
parents(data, SGID)
Arguments
data |
The "SubScreenResult" object generated via function 'subscreencalc'. |
SGID |
Subgroup id(s) of the subgroup for which the 'parent'-subgroups are requested. |
Value
List of 'parent'-subgroups.
Function for adding the status of a factorial context ("complete"/"incomplete" or "pseudo complete") to the SubScreenResult object (used in subscreencalc if parameter 'factorial = TRUE').
Description
Function for adding the status of a factorial context ("complete"/"incomplete" or "pseudo complete") to the SubScreenResult object (used in subscreencalc if parameter 'factorial = TRUE').
Usage
pseudo_contexts(data, endpoint, factors)
Arguments
data |
The list entry 'sge' from the "SubScreenResult" object generated via function 'subscreencalc'. |
endpoint |
The vector of target variable(s). |
factors |
The list entry 'factors' from the "SubScreenResult" object generated via function 'subscreencalc'. |
Generate variables for complete/incomplete/pseudo complete factorial context(s)
Description
Generate variables for complete/incomplete/pseudo complete factorial context(s)
Usage
pseudo_func(results, endpoint, factors)
Arguments
results |
The list entry 'sge' from the "SubScreenResult" object generated via function 'subscreencalc'. |
endpoint |
The vector of target variable(s). |
factors |
The list entry 'factors' from the "SubScreenResult" object generated via function 'subscreencalc'. |
Example results data set without factorial context and complement calculations
Description
Example results data set without factorial context and complement calculations
Example results data set without factorial context and with complement calculations
Description
Example results data set without factorial context and with complement calculations
Example results data set with factorial context and complement calculations
Description
Example results data set with factorial context and complement calculations
Example results data set with factorial context and without complement calculations
Description
Example results data set with factorial context and without complement calculations
Creates a mosaic plot used in Mosaic-tab in Subgroup Explorer
Description
Creates a mosaic plot used in Mosaic-tab in Subgroup Explorer
Usage
subscreen_mosaicPlot(
res,
mos.x,
mos.y = NULL,
mos.y2 = NULL,
mos.z,
col.bg = c("#424242"),
col.txt = c("#ffffff"),
colrange.z = c("#00BCFF", "gray89", "#89D329"),
scale = "lin"
)
Arguments
res |
results data set from subscreencalc |
mos.x |
first endpoint variable |
mos.y |
second endpoint variable (default: NULL) |
mos.y2 |
third endpoint variable (default: NULL) |
mos.z |
reference variable (mosaic size) |
col.bg |
background color (default: '#424242') |
col.txt |
text color font (default: '#ffffff') |
colrange.z |
three color scale for mosaic colors (default: c('#00BCFF','gray89','#89D329')) |
scale |
scale of endpoint values linear or logarithmic (default: 'lin') |
(i) Calculation of the results for the subgroups
Description
This function systematically calculates the defined outcome for every combination of subgroups up to the given level (max_comb), i.e. the number of maximum combinations of subgroup defining factors. If, e.g., in a study sex, age group (<=60, >60), BMI group (<=25, >25) are of interest, subgroups of level 2 would be, e.g, male subjects with BMI>25 or young females, while subgroups of level 3 would be any combination of all three variables.
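To get a feeling for how quickly the number of subgroups grows with max_comb, the following sketch counts the subgroups per combination level for the three factors of the example above. It is not package code; count_subgroups and factor_levels are hypothetical names, and the count assumes that every combination of levels actually occurs in the data.

```r
# Illustrative only: count the subgroups per combination level.
factor_levels <- list(
  sex  = c("male", "female"),
  ageg = c("<=60", ">60"),
  bmig = c("<=25", ">25")
)

count_subgroups <- function(factor_levels, max_comb) {
  n_levels <- lengths(factor_levels)
  levels_to_count <- seq_len(min(max_comb, length(n_levels)))
  vapply(levels_to_count, function(k) {
    # For each set of k factors, multiply their numbers of levels,
    # then sum over all such sets
    sum(combn(seq_along(n_levels), k,
              FUN = function(idx) prod(n_levels[idx])))
  }, numeric(1))
}

counts <- count_subgroups(factor_levels, max_comb = 3)
counts       # 6 subgroups of level 1, 12 of level 2, 8 of level 3
sum(counts)  # 26 subgroups in total
```

With more factors or more levels per factor these numbers grow rapidly, which is one reason to keep max_comb small.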
Usage
subscreencalc(
data,
eval_function,
subjectid = "subjid",
factors = NULL,
max_comb = 3,
nkernel = 1,
par_functions = "",
verbose = TRUE,
factorial = FALSE,
use_complement = FALSE,
...
)
Arguments
data |
dataframe with study data |
eval_function |
name of the function for data analysis |
subjectid |
name of variable in data that contains the subject identifier, defaults to subjid |
factors |
character vector containing the names of variables that define the subgroups (required) |
max_comb |
maximum number of factor combination levels to define subgroups, defaults to 3 |
nkernel |
number of kernels for parallelization (defaults to 1) |
par_functions |
vector of names of functions used in eval_function to be exported to cluster (needed only if nkernel > 1) |
verbose |
logical value to switch on/off output of computational information (defaults to TRUE) |
factorial |
logical value to switch on/off calculation of factorial contexts (defaults to FALSE) |
use_complement |
logical value to switch on/off calculation of complement subgroups (defaults to FALSE) |
... |
further parameters; outdated parameters passed here are only used to issue notes and errors. |
Details
The evaluation function (eval_function) has to be defined by the user. The result needs to be a vector of numerical values, e.g., outcome variable(s) and number of observations/subjects. The input of eval_function is a data frame with the same structure as the input data frame (data) used in the subscreencalc call. See the example below. Potential errors occurring due to small subgroups should be caught and handled within eval_function. As eval_function is called for every subgroup, it may receive only one observation, only one treatment arm, or only observations with missing data. It should always return a valid result vector (NAs allowed) and never raise an error that aborts the program. For a better display, the results may be cut off to a reasonable range. For example, if the endpoint is a hazard ratio that is expected to be between 0.5 and 2, all values smaller than 0.01 could be set to 0.01 and all values above 100 to 100.
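The requirements above can be sketched with a small evaluation function that never aborts. This is a minimal illustration, not the recommended implementation: the column names trt and outcome, the treatment codes 1 and 2, the function name mean_difference and the cut-off limits of -10 and 10 are all assumptions made for this example. It returns a data frame with one row, as the example in the Examples section does.

```r
# Sketch of a defensive eval_function (hypothetical columns: trt, outcome).
mean_difference <- function(D) {
  # which() drops rows where the treatment or the outcome is missing
  x1 <- D$outcome[which(D$trt == 1 & !is.na(D$outcome))]
  x2 <- D$outcome[which(D$trt == 2 & !is.na(D$outcome))]

  # Compute the estimate only if both treatment arms contain data,
  # otherwise return NA instead of raising an error
  est <- if (length(x1) > 0 && length(x2) > 0) mean(x1) - mean(x2) else NA_real_

  # Cut off extreme values to a displayable range (limits are arbitrary here)
  est <- min(max(est, -10), 10)

  data.frame(mean.diff = est, N1 = length(x1), N2 = length(x2))
}

# Both arms present: a regular estimate
both_arms <- mean_difference(data.frame(trt = c(1, 1, 2, 2),
                                        outcome = c(3, 5, 1, 2)))

# Only one arm present: NA instead of an error
one_arm <- mean_difference(data.frame(trt = c(1, 1), outcome = c(3, 5)))
```

Returning the arm sizes alongside the estimate makes it easy to see later why a subgroup has no result.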
Value
an object of type SubScreenResult of the form list(sge=H, max_comb=max_comb, min_comb=min_comb, subjectid=subjectid, treat=treat, factors=factors, results_total=eval_function(cbind(F,T)))
Examples
# get the pbc data from the survival package
require(survival)
data(pbc, package="survival")
# generate categorical versions of some of the baseline covariates
pbc$ageg[!is.na(pbc$age)] <-
ifelse(pbc$age[!is.na(pbc$age)] <= median(pbc$age, na.rm=TRUE), "Low", "High")
pbc$albuming[!is.na(pbc$albumin)]<-
ifelse(pbc$albumin[!is.na(pbc$albumin)] <= median(pbc$albumin, na.rm=TRUE), "Low", "High")
pbc$phosg[!is.na(pbc$alk.phos)] <-
ifelse(pbc$alk.phos[!is.na(pbc$alk.phos)]<= median(pbc$alk.phos,na.rm=TRUE), "Low", "High")
pbc$astg[!is.na(pbc$ast)] <-
ifelse(pbc$ast[!is.na(pbc$ast)] <= median(pbc$ast, na.rm=TRUE), "Low", "High")
pbc$bilig[!is.na(pbc$bili)] <-
ifelse(pbc$bili[!is.na(pbc$bili)] <= median(pbc$bili, na.rm=TRUE), "Low", "High")
pbc$cholg[!is.na(pbc$chol)] <-
ifelse(pbc$chol[!is.na(pbc$chol)] <= median(pbc$chol, na.rm=TRUE), "Low", "High")
pbc$copperg[!is.na(pbc$copper)] <-
ifelse(pbc$copper[!is.na(pbc$copper)] <= median(pbc$copper, na.rm=TRUE), "Low", "High")
#eliminate treatment NAs
pbcdat <- pbc[!is.na(pbc$trt), ]
# PFS and OS endpoints
set.seed(2006)
pbcdat$'event.pfs' <- sample(c(0,1),dim(pbcdat)[1],replace=TRUE)
pbcdat$'timepfs' <- sample(1:5000,dim(pbcdat)[1],replace=TRUE)
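# pbc has no column 'event' (pbcdat$event would partially match event.pfs);
# derive the OS event indicator from status instead
pbcdat$'event.os' <- ifelse(pbcdat$status == 0, 0, 1)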
pbcdat$'timeos' <- pbcdat$time
#variable importance for OS for the created categorical variables
#(higher is more important, also works for numeric variables)
varnames <- c('ageg', 'sex', 'bilig', 'cholg', 'astg', 'albuming', 'phosg')
# define the eval_function()
# Attention: The eval_function ALWAYS needs to return a dataframe with one row.
# Include exception handling, like if(N1>0 && N2>0) hr <- exp(coxph(...) )
# to avoid program abort due to errors
hazardratio <- function(D) {
  # Return NA instead of stopping when coxph() warns or fails
  # (e.g. only one treatment arm in the subgroup)
  HRpfs <- tryCatch(exp(coxph(Surv(D$timepfs, D$event.pfs) ~ D$trt)$coefficients[[1]]),
                    warning = function(w) NA, error = function(e) NA)
  HRpfs <- 1/HRpfs
  HR.pfs <- round(HRpfs, 2)
  # cut off extreme values for a better display
  HR.pfs[HR.pfs > 10] <- 10
  HR.pfs[HR.pfs < 0.00001] <- 0.00001
  HRos <- tryCatch(exp(coxph(Surv(D$timeos, D$event.os) ~ D$trt)$coefficients[[1]]),
                   warning = function(w) NA, error = function(e) NA)
  HRos <- 1/HRos
  HR.os <- round(HRos, 2)
  HR.os[HR.os > 10] <- 10
  HR.os[HR.os < 0.00001] <- 0.00001
  # one-row data frame with one column per endpoint
  data.frame(HR.pfs, HR.os)
}
# run subscreen
## Not run:
results <- subscreencalc(
data=pbcdat,
eval_function=hazardratio,
subjectid = "id",
factors=c("ageg", "sex", "bilig", "cholg", "copperg"),
use_complement = FALSE,
factorial = FALSE
)
# visualize the results of the subgroup screening with a Shiny app
subscreenshow(results)
## End(Not run)
(ii) Visualization
Description
Start the Shiny-based interactive visualization tool to show the subgroup results generated by subscreencalc. See and explore all subgroup results at a glance. Pick and choose a specific subgroup, the level of combinations or a certain factor with its combinations. Switch easily between different endpoint/target variables.
Usage
subscreenshow(
scresults = NULL,
variable_importance = NULL,
host = NULL,
port = NULL,
NiceNumbers = c(1, 1.5, 2, 4, 5, 6, 8, 10),
windowTitle = "Subgroup Explorer",
graphSubtitle = NULL,
favour_label_verum_name = NULL,
favour_label_comparator_name = NULL
)
Arguments
scresults |
SubScreenResult object with results from a subscreencalc call |
variable_importance |
variable importance object calculated via subscreenvi to unlock 'variable importance'-tab in the app |
host |
host name or IP address for Shiny display |
port |
port number for Shiny display |
NiceNumbers |
list of numbers used for a 'nice' scale |
windowTitle |
title which is shown for the browser tab |
graphSubtitle |
subtitle for explorer plot |
favour_label_verum_name |
verum name for label use in explorer graph |
favour_label_comparator_name |
comparator name for label use in explorer graph |
(iii) Determine variable importance
Description
Determine variable importance for continuous, categorical or right-censored survival endpoints (overall and per treatment group) using random forests
Usage
subscreenvi(data, y, cens = NULL, x = NULL, trt = NULL)
Arguments
data |
The data frame containing the dependent and independent variables. |
y |
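The name of the column in data that contains the dependent variable. |
cens |
The name of the column in data that contains the censoring indicator (for right-censored survival endpoints). |
x |
Vector that contains the names of the columns in data with the independent variables. |
trt |
The name of the column in data that contains the treatment variable. |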
Value
A list containing ordered data frames with the variable importances (one for each treatment level, one with the ranking variability between the treatment levels and one with the total results)
Examples
## Not run:
require(survival)
data(pbc, package="survival")
# generate categorical versions of some of the baseline covariates
pbc$ageg[!is.na(pbc$age)] <-
ifelse(pbc$age[!is.na(pbc$age)] <= median(pbc$age, na.rm=TRUE), "Low", "High")
pbc$albuming[!is.na(pbc$albumin)]<-
ifelse(pbc$albumin[!is.na(pbc$albumin)] <= median(pbc$albumin, na.rm=TRUE), "Low", "High")
pbc$phosg[!is.na(pbc$alk.phos)] <-
ifelse(pbc$alk.phos[!is.na(pbc$alk.phos)]<= median(pbc$alk.phos,na.rm=TRUE), "Low", "High")
pbc$astg[!is.na(pbc$ast)] <-
ifelse(pbc$ast[!is.na(pbc$ast)] <= median(pbc$ast, na.rm=TRUE), "Low", "High")
pbc$bilig[!is.na(pbc$bili)] <-
ifelse(pbc$bili[!is.na(pbc$bili)] <= median(pbc$bili, na.rm=TRUE), "Low", "High")
pbc$cholg[!is.na(pbc$chol)] <-
ifelse(pbc$chol[!is.na(pbc$chol)] <= median(pbc$chol, na.rm=TRUE), "Low", "High")
pbc$copperg[!is.na(pbc$copper)] <-
ifelse(pbc$copper[!is.na(pbc$copper)] <= median(pbc$copper, na.rm=TRUE), "Low", "High")
pbc$ageg[is.na(pbc$age)] <- "No Data"
pbc$albuming[is.na(pbc$albumin)] <- "No Data"
pbc$phosg[is.na(pbc$alk.phos)] <- "No Data"
pbc$astg[is.na(pbc$ast)] <- "No Data"
pbc$bilig[is.na(pbc$bili)] <- "No Data"
pbc$cholg[is.na(pbc$chol)] <- "No Data"
pbc$copperg[is.na(pbc$copper)] <- "No Data"
#eliminate treatment NAs
pbcdat <- pbc[!is.na(pbc$trt), ]
pbcdat$status <- ifelse(pbcdat$status==0,0,1)
importance <- subscreenvi(data=pbcdat, y='time', cens='status',
trt='trt', x=c("ageg", "sex", "bilig", "cholg", "copperg"))
## End(Not run)
shiny widgets of variable option panel
Description
shiny widgets of variable option panel
Usage
variableOptionsPanel()