Title: Run Permutation Tests and Construct Associated Confidence Intervals
Version: 1.0.0
Description: Implements permutation tests for any test statistic and randomization scheme and constructs associated confidence intervals as described in Glazer and Stark (2024) <doi:10.48550/arXiv.2405.05238>.
License: GPL (≥ 3)
Encoding: UTF-8
RoxygenNote: 7.3.2
Suggests: testthat (≥ 3.0.0)
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2024-09-25 17:18:13 UTC; ag88343
Author: Amanda Glazer
Maintainer: Amanda Glazer <amanda.glazer@austin.utexas.edu>
Repository: CRAN
Date/Publication: 2024-09-26 11:20:02 UTC
Adjust p-values for multiple testing
Description
This function takes an array of p-values and returns adjusted p-values using the user-specified FWER or FDR correction method.
Usage
adjust_p_value(pvalues, method = "holm-bonferroni")
Arguments
pvalues |
Array of p-values |
method |
The FWER or FDR correction to use, either 'holm-bonferroni', 'bonferroni', or 'benjamini-hochberg' |
Value
Adjusted p-values
Examples
adjust_p_value(pvalues = c(.05, .1, .5), method='holm-bonferroni')
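For comparison, base R's p.adjust (stats package) provides the same three corrections. The sketch below is not taken from this package's source; the mapping from the method names documented above to p.adjust method names is an assumption made for illustration.

```r
# Hypothetical mapping from the method names documented above to the
# method names understood by stats::p.adjust.
method_map <- c("holm-bonferroni" = "holm",
                "bonferroni" = "bonferroni",
                "benjamini-hochberg" = "BH")

pvalues <- c(.05, .1, .5)

# Apply each correction with base R; the list keeps the names of method_map.
adjusted <- lapply(method_map, function(m) p.adjust(pvalues, method = m))
adjusted
```

Holm and Bonferroni control the family-wise error rate (FWER), while Benjamini-Hochberg controls the false discovery rate (FDR), so the latter generally gives adjusted p-values that are no larger.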
Calculate difference in means
Description
This function takes a data frame and the names of its group and outcome columns and returns the difference in mean outcome between the two groups.
Usage
diff_in_means(df, group_col, outcome_col, treatment_value = NULL)
Arguments
df |
A data frame |
group_col |
The name of the column in df that corresponds to the group label |
outcome_col |
The name of the column in df that corresponds to the outcome variable |
treatment_value |
The value of group_col to be considered 'treatment' |
Value
The difference in mean outcome between the two groups
Examples
data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
                   outcome = c(rep(3, 4), rep(5, 4)))
diff_in_means(df = data,
              group_col = "group",
              outcome_col = "outcome",
              treatment_value = 1)
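As a point of reference, the same kind of statistic can be computed by hand in base R. This is a sketch only: the sign convention (treatment mean minus the mean of the other group) is an assumption, since the documentation above does not state the order of subtraction.

```r
data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
                   outcome = c(rep(3, 4), rep(5, 4)))

# Assumption: rows with group == 1 are 'treatment' and the statistic is
# mean(treatment) - mean(all other rows).
is_treated <- data$group == 1
manual_diff <- mean(data$outcome[is_treated]) - mean(data$outcome[!is_treated])
manual_diff
```

Replacing mean with median in the sketch gives the corresponding difference in medians.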
Calculate difference in medians
Description
This function takes a data frame and the names of its group and outcome columns and returns the difference in median outcome between the two groups.
Usage
diff_in_medians(df, group_col, outcome_col, treatment_value = NULL)
Arguments
df |
A data frame |
group_col |
The name of the column in df that corresponds to the group label |
outcome_col |
The name of the column in df that corresponds to the outcome variable |
treatment_value |
The value of group_col to be considered 'treatment' |
Value
The difference in median outcome between the two groups
Examples
data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
                   outcome = c(rep(3, 4), rep(5, 4)))
diff_in_medians(df = data,
                group_col = "group",
                outcome_col = "outcome",
                treatment_value = 1)
Fisher combining function
Description
This function takes an array of p-values and returns a combined p-value using Fisher's combining function:
-2 \sum_i \log(p_i)
Usage
fisher(pvalues)
Arguments
pvalues |
Array of p-values |
Value
Combined p-value using Fisher's method
Examples
fisher(pvalues = c(.05, .1, .5))
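The formula in the Description can be evaluated directly in base R. The sketch below only evaluates the expression -2 \sum_i \log(p_i) and makes no claim about any further processing inside fisher().

```r
pvalues <- c(.05, .1, .5)

# Fisher's combining function: small p-values give large positive values.
fisher_stat <- -2 * sum(log(pvalues))
fisher_stat
```

Because the log of a product is the sum of the logs, the same value is obtained from -2 * log(prod(pvalues)).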
Liptak combining function
Description
This function takes an array of p-values and returns a combined p-value using Liptak's combining function:
\sum_i \Phi^{-1}(1-p_i)
where \Phi is the CDF of the standard normal distribution
Usage
liptak(pvalues)
Arguments
pvalues |
Array of p-values |
Value
Combined p-value using Liptak's method
Examples
liptak(pvalues = c(.05, .1, .5))
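In base R, \Phi^{-1} is available as qnorm, so the expression in the Description can be written in one line. This is a sketch of the formula only, not of the internals of liptak().

```r
pvalues <- c(.05, .1, .5)

# Liptak's combining function: qnorm() is the standard normal quantile
# function, i.e. the inverse of the CDF \Phi.
liptak_stat <- sum(qnorm(1 - pvalues))
liptak_stat
```

A p-value of 0.5 contributes exactly zero to the sum, since qnorm(0.5) is 0.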
Run NPC
Description
This function takes a data frame and group and outcome column names as input and returns the nonparametric combination of tests (NPC) omnibus p-value
Usage
npc(
  df,
  group_col,
  outcome_cols,
  strata_col = NULL,
  test_stat = "diff_in_means",
  perm_func = permute_group,
  combn = "fisher",
  shift = 0,
  reps = 10000,
  perm_set = NULL,
  complete_enum = FALSE,
  seed = NULL
)
Arguments
df |
A data frame |
group_col |
The name of the column in df that corresponds to the group label |
outcome_cols |
The names of the columns in df that correspond to the outcome variables |
strata_col |
The name of the column in df that corresponds to the strata |
test_stat |
Test statistic function |
perm_func |
Function to permute group, default is permute_group which randomly permutes group assignment |
combn |
Combining function to use: 'fisher', 'tippett', or 'liptak', or a user-defined function |
shift |
Value of shift to apply in one- or two-sample problem |
reps |
Number of iterations to use when calculating permutation p-value |
perm_set |
Matrix of permutations to use instead of reps iterations of perm_func |
complete_enum |
Boolean, whether to calculate the p-value under complete enumeration of permutations |
seed |
An integer seed value |
Value
The omnibus p-value
Examples
data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
                   out1 = c(0, 1, 0, 0, 1, 1, 1, 0),
                   out2 = rep(1, 8))
output <- npc(df = data, group_col = "group",
              outcome_cols = c("out1", "out2"), perm_func = permute_group,
              combn = "fisher", reps = 10^4, seed = 42)
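To make the omnibus p-value more concrete, here is a minimal base R sketch of the general NPC recipe: reuse the same permutations for every outcome, convert each statistic to a partial p-value, combine the partial p-values, and compare the observed combined value with its permutation distribution. It is written under assumptions and is not the package's implementation: the helper stat, the one-sided 'greater' direction for the partial p-values, and the choice to count the observed assignment among the permutations are all illustrative.

```r
set.seed(42)
data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
                   out1 = c(0, 1, 0, 0, 1, 1, 1, 0),
                   out2 = rep(1, 8))
outcomes <- c("out1", "out2")
reps <- 1000

# Hypothetical test statistic: difference in means, group 1 minus group 2.
stat <- function(g, y) mean(y[g == 1]) - mean(y[g == 2])

# One row per permutation; the same rows are reused for every outcome so
# that dependence between outcomes is preserved.
perms <- t(replicate(reps, sample(data$group)))

# Row 1 holds the observed statistics, rows 2..(reps + 1) the permuted ones.
stats <- sapply(outcomes, function(col) {
  c(stat(data$group, data[[col]]),
    apply(perms, 1, stat, y = data[[col]]))
})

# Partial p-value of each row: share of rows whose statistic is at least
# as large (every row counts itself, so no partial p-value is zero).
partial_p <- apply(stats, 2, function(s) {
  (length(s) - rank(s, ties.method = "min") + 1) / length(s)
})

# Fisher's combining function applied row by row.
combined <- apply(partial_p, 1, function(p) -2 * sum(log(p)))

# Omnibus p-value: share of rows (observed included) at least as extreme.
omnibus_p <- mean(combined >= combined[1])
omnibus_p
```

In this data set out2 is constant, so its statistic is zero under every permutation and it contributes nothing to the combined value.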
One-sample permutation test
Description
This function runs a permutation test for the one-sample problem by calling the permutation_test function using the one-sample mean test statistic.
Usage
one_sample(x, shift = 0, alternative = "greater", reps = 10^4, seed = NULL)
Arguments
x |
Array of data |
shift |
Value of shift to apply in one-sample problem |
alternative |
String, two-sided or one-sided (greater or less) p-value; options are 'greater', 'less', or 'two-sided' |
reps |
Number of iterations to use when calculating permutation p-value |
seed |
An integer seed value |
Value
The permutation test p-value
Examples
one_sample(x = c(-1, 1, 2), seed = 42)
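The idea behind a one-sample permutation test can be sketched in base R by randomly flipping the signs of the observations and recomputing the mean. The sketch is illustrative only: the add-one p-value convention used here and the way the random number generator is seeded are assumptions, so the result need not match one_sample().

```r
set.seed(42)
x <- c(-1, 1, 2)
reps <- 1000

observed <- mean(x)

# Under the null of symmetry about zero, each sign is equally likely.
perm_stats <- replicate(reps, {
  signs <- sample(c(-1, 1), length(x), replace = TRUE)
  mean(x * signs)
})

# One-sided ('greater') p-value; the observed data count as one draw.
p_greater <- (sum(perm_stats >= observed) + 1) / (reps + 1)
p_greater
```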
Calculate the one-sample problem test statistic
Description
This function takes a data frame and the names of its group and outcome columns and returns the mean of the product of the outcome and group. This test statistic is used for the one-sample problem.
Usage
one_sample_mean(df, group_col, outcome_col)
Arguments
df |
A data frame |
group_col |
The name of the column in df that corresponds to the group label |
outcome_col |
The name of the column in df that corresponds to the outcome variable |
Value
The one-sample problem test statistic: the mean of the product of the outcome and group
Examples
data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
                   outcome = c(rep(3, 4), rep(5, 4)))
one_sample_mean(df = data,
                group_col = "group",
                outcome_col = "outcome")
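The statistic described under Value, the mean of the product of the outcome and group columns, can be checked by hand. A minimal base R sketch on the same data:

```r
data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
                   outcome = c(rep(3, 4), rep(5, 4)))

# Mean of the element-wise product of the outcome and group columns.
manual_stat <- mean(data$outcome * data$group)
manual_stat
```

In a one-sample setting the group column would typically hold signs (+1 or -1), so that the statistic is the mean of the sign-flipped outcomes.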
Calculate one-way ANOVA test statistic
Description
This function takes a data frame and the names of its group and outcome columns and returns the one-way ANOVA test statistic.
Usage
one_way_anova_stat(df, group_col, outcome_col)
Arguments
df |
A data frame |
group_col |
The name of the column in df that corresponds to the group label |
outcome_col |
The name of the column in df that corresponds to the outcome variable |
Value
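The one-way ANOVA test statistic:
\sum_{g=1}^G n_g(\overline{X}_g - \overline{X})^2
where g indexes the G groups, n_g is the number of observations in group g, \overline{X}_g is the mean outcome in group g, and \overline{X} is the overall mean outcome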
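The statistic above can be computed term by term in base R. The data set in this sketch is made up for illustration and the code follows the formula rather than the package source.

```r
# Illustrative data: three groups of two observations each.
data <- data.frame(group = c(1, 1, 2, 2, 3, 3), outcome = 1:6)

group_means <- tapply(data$outcome, data$group, mean)    # mean outcome per group
group_sizes <- tapply(data$outcome, data$group, length)  # n_g
grand_mean <- mean(data$outcome)                         # overall mean

# Sum over groups of n_g * (group mean - overall mean)^2.
anova_stat <- sum(group_sizes * (group_means - grand_mean)^2)
anova_stat
```

Here the group means are 1.5, 3.5, and 5.5 and the overall mean is 3.5, so the statistic is 2 * (4 + 0 + 4) = 16.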
Run permutation test
Description
Run a permutation test with user-supplied data, test statistic, and permutation function.
Usage
permutation_test(
  df,
  group_col,
  outcome_col,
  strata_col = NULL,
  test_stat = "diff_in_means",
  perm_func = permute_group,
  alternative = "two-sided",
  shift = 0,
  reps = 10000,
  perm_set = NULL,
  complete_enum = FALSE,
  return_test_dist = FALSE,
  return_perm_dist = FALSE,
  seed = NULL
)
Arguments
df |
A data frame |
group_col |
The name of the column in df that corresponds to the group label |
outcome_col |
The name of the column in df that corresponds to the outcome variable |
strata_col |
The name of the column in df that corresponds to the strata |
test_stat |
Test statistic function |
perm_func |
Function to permute group |
alternative |
String, two-sided or one-sided (greater or less) p-value; options are 'greater', 'less', or 'two-sided' |
shift |
Value of shift to apply in one- or two-sample problem |
reps |
Number of iterations to use when calculating permutation p-value |
perm_set |
Matrix of group assignments to use instead of reps iterations of perm_func |
complete_enum |
Boolean, whether to calculate the p-value under complete enumeration of permutations |
return_test_dist |
Boolean, whether to return test statistic distribution under permutations |
return_perm_dist |
Boolean, whether to return a matrix where each row is the group assignment under that permutation |
seed |
An integer seed value |
Value
p_value: the permutation test p-value
test_stat_dist: array, the distribution of the test statistic under the set of permutations, if return_test_dist is set to TRUE
perm_indices_mat: matrix, each row corresponds to a permutation used in the permutation test calculation, if return_perm_dist is set to TRUE
Examples
data <- data.frame(group = c(rep(1, 10), rep(2, 10)), outcome = c(rep(1, 10), rep(1, 10)))
permutation_test(df = data, group_col = "group", outcome_col = "outcome",
                 test_stat = "diff_in_means", perm_func = permute_group, alternative = "greater",
                 shift = 0, reps = 10, return_perm_dist = TRUE, return_test_dist = TRUE, seed = 42)
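For very small samples, the random draws controlled by reps can be replaced by a complete enumeration of group assignments, which is the idea behind the complete_enum argument. The base R sketch below illustrates that idea on a made-up data set using utils::combn; how the package itself enumerates permutations is not shown in this manual, so treat the code as an assumption-laden illustration.

```r
# Illustrative data: the first three units are 'treatment'.
outcome <- c(10, 9, 11, 12, 11, 13)
n_treated <- 3
observed <- mean(outcome[1:3]) - mean(outcome[4:6])

# Each column of combn() is one way to choose the treated units.
assignments <- combn(length(outcome), n_treated)

# Difference in means (treatment minus control) for every assignment.
enum_stats <- apply(assignments, 2, function(idx) {
  mean(outcome[idx]) - mean(outcome[-idx])
})

# Exact one-sided ('less') p-value: share of assignments whose statistic
# is no larger than the observed one.
p_exact <- mean(enum_stats <= observed)
p_exact
```

There are choose(6, 3) = 20 assignments here, and the number grows quickly with sample size, which is why Monte Carlo sampling with reps iterations is the usual choice.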
Construct confidence interval by inverting permutation tests
Description
This function constructs a confidence interval by inverting permutation tests and applying the method in Glazer and Stark, 2024.
Usage
permutation_test_ci(
  df,
  group_col,
  outcome_col,
  strata_col = NULL,
  test_stat = "diff_in_means",
  perm_func = permute_group,
  upper_bracket = NULL,
  lower_bracket = NULL,
  cl = 0.95,
  e = 0.1,
  reps = 10000,
  perm_set = NULL,
  seed = 42
)
Arguments
df |
A data frame |
group_col |
The name of the column in df that corresponds to the group label |
outcome_col |
The name of the column in df that corresponds to the outcome variable |
strata_col |
The name of the column in df that corresponds to the strata |
test_stat |
Test statistic function |
perm_func |
Function to permute group |
upper_bracket |
Array with 2 values that bracket upper confidence bound |
lower_bracket |
Array with 2 values that bracket lower confidence bound |
cl |
Confidence level, default 0.95 |
e |
Maximum distance from true confidence bound value |
reps |
Number of iterations to use when calculating permutation p-value |
perm_set |
Matrix of group assignments to use instead of reps iterations of perm_func |
seed |
An integer seed value |
Value
The confidence interval obtained by inverting the permutation tests
Examples
x <- c(35.3, 35.9, 37.2, 33.0, 31.9, 33.7, 36.0, 35.0, 33.3, 33.6, 37.9, 35.6, 29.0, 33.7, 35.7)
y <- c(32.5, 34.0, 34.4, 31.8, 35.0, 34.6, 33.5, 33.6, 31.5, 33.8, 34.6)
df <- data.frame(outcome = c(x, y), group = c(rep(1, length(x)), rep(0, length(y))))
permutation_test_ci(df = df, group_col = "group", outcome_col = "outcome", strata_col = NULL,
                    test_stat = "diff_in_means", perm_func = permute_group,
                    upper_bracket = NULL, lower_bracket = NULL,
                    cl = 0.95, e = 0.01, reps = 10^3, seed = 42)
Unstratified group permutation
Description
This function takes a data frame and group column name as input and returns the data frame with the group column randomly permuted.
Usage
permute_group(df, group_col, strata_col = NULL, seed = NULL)
Arguments
df |
A data frame |
group_col |
String, the name of the column in df that corresponds to the group label |
strata_col |
The name of the column in df that corresponds to the strata, should be NULL for unstratified permutation |
seed |
An integer seed value |
Value
The inputted data frame with the group column randomly shuffled
Examples
data <- data.frame(group_label = c(1, 2, 2, 1, 2, 1), outcome = 1:6)
permute_group(df = data, group_col = "group_label", strata_col = NULL, seed = 42)
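An unstratified permutation amounts to shuffling the group labels while leaving every other column in place. A minimal base R sketch follows; the use of set.seed here is an assumption and need not reproduce the output of permute_group with the same seed.

```r
set.seed(42)
data <- data.frame(group_label = c(1, 2, 2, 1, 2, 1), outcome = 1:6)

# Shuffle only the group labels; outcomes keep their original order.
permuted <- data
permuted$group_label <- sample(data$group_label)
permuted
```

The number of units carrying each label is unchanged; only which units carry them differs.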
Sign permutation
Description
This function takes a data frame and group and outcome column names as input and returns the data frame with the group column replaced with randomly assigned signs.
Usage
permute_sign(df, group_col, strata_col = NULL, seed = NULL)
Arguments
df |
A data frame |
group_col |
The name of the column in df that corresponds to the group label |
strata_col |
The name of the column in df that corresponds to the strata, should be NULL for this function |
seed |
An integer seed value |
Value
The inputted data frame with the group column replaced with randomly assigned signs
Examples
data <- data.frame(group_label = rep(1, 6), outcome = 1:6)
permute_sign(df = data, group_col = "group_label", strata_col = NULL, seed = 42)
Stratified group permutation
Description
This function takes a data frame and group and strata column names as input and returns the data frame with the group column randomly permuted within strata.
Usage
strat_permute_group(df, group_col, strata_col, seed = NULL)
Arguments
df |
A data frame |
group_col |
The name of the column in df that corresponds to the group label |
strata_col |
The name of the column in df that corresponds to the strata |
seed |
An integer seed value |
Value
The inputted data frame with the group column randomly shuffled by strata
Examples
data <- data.frame(group_label = c(1, 2, 2, 1, 2, 1), stratum = c(1, 1, 1, 2, 2, 2), outcome = 1:6)
strat_permute_group(df = data, group_col = "group_label", strata_col = "stratum", seed = 42)
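Permuting within strata means that labels are shuffled separately inside each stratum, so every stratum keeps its own mix of labels. The base R sketch below illustrates this with ave; the helper shuffle is hypothetical and the code is not taken from the package.

```r
set.seed(42)
data <- data.frame(group_label = c(1, 2, 2, 1, 2, 1),
                   stratum = c(1, 1, 1, 2, 2, 2),
                   outcome = 1:6)

# Hypothetical helper: shuffle by index so that a stratum with a single
# numeric label is not misread by sample() as "draw from 1:n".
shuffle <- function(v) v[sample.int(length(v))]

# ave() applies shuffle() separately within each stratum.
permuted <- data
permuted$group_label <- ave(data$group_label, data$stratum, FUN = shuffle)
permuted
```

In this data set stratum 1 always retains one unit labeled 1 and two labeled 2, and stratum 2 always retains two labeled 1 and one labeled 2.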
Tippett combining function
Description
This function takes an array of p-values and returns a combined p-value using Tippett's combining function:
\max_i \{1-p_i\}
Usage
tippett(pvalues)
Arguments
pvalues |
Array of p-values |
Value
Combined p-value using Tippett's method
Examples
tippett(pvalues = c(.05, .1, .5))
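Tippett's expression depends only on the smallest p-value. The base R sketch below evaluates the formula from the Description and does not describe any further steps inside tippett().

```r
pvalues <- c(.05, .1, .5)

# Tippett's combining function: the largest value of 1 - p_i,
# which equals 1 minus the smallest p-value.
tippett_stat <- max(1 - pvalues)
tippett_stat
```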
Calculate t-test statistic
Description
This function takes a data frame and the names of its group and outcome columns and returns the t-test statistic.
Usage
ttest_stat(df, group_col, outcome_col)
Arguments
df |
A data frame |
group_col |
The name of the column in df that corresponds to the group label |
outcome_col |
The name of the column in df that corresponds to the outcome variable |
Value
The t-test statistic
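The manual gives no example for ttest_stat, so here is a base R sketch of a two-sample t statistic on made-up data. Whether the package uses the pooled-variance (Student) or unequal-variance (Welch) form is not stated above; the pooled form and the order of subtraction (group 1 minus group 2) are assumptions.

```r
# Illustrative data with two groups of four observations.
data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
                   outcome = c(2, 4, 3, 5, 6, 5, 7, 8))
groups <- split(data$outcome, data$group)
g1 <- groups[[1]]
g2 <- groups[[2]]

# Assumption: pooled-variance t statistic, computed by hand.
n1 <- length(g1)
n2 <- length(g2)
pooled_var <- ((n1 - 1) * var(g1) + (n2 - 1) * var(g2)) / (n1 + n2 - 2)
t_manual <- (mean(g1) - mean(g2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# The same quantity from stats::t.test with var.equal = TRUE.
t_base <- unname(t.test(g1, g2, var.equal = TRUE)$statistic)
c(manual = t_manual, base = t_base)
```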
Two-sample permutation test
Description
This function runs a permutation test with the difference-in-means test statistic for the two-sample problem by calling the permutation_test function.
Usage
two_sample(x, y, shift = 0, alternative = "greater", reps = 10^4, seed = NULL)
Arguments
x |
Array of data for treatment group |
y |
Array of data for control group |
shift |
Value of shift to apply in two-sample problem |
alternative |
String, two-sided or one-sided (greater or less) p-value; options are 'greater', 'less', or 'two-sided' |
reps |
Number of iterations to use when calculating permutation p-value |
seed |
An integer seed value |
Value
The permutation test p-value
Examples
two_sample(x = c(10, 9, 11), y = c(12, 11, 13), alternative = "less", seed = 42)
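A Monte Carlo version of the two-sample test can be sketched in base R by repeatedly shuffling the treatment indicator and recomputing the difference in means. The sketch uses the data from the example above, but it is illustrative only: the add-one p-value convention and the seeding are assumptions, and the shift argument is not modeled, so the result need not match two_sample().

```r
set.seed(42)
x <- c(10, 9, 11)   # treatment group
y <- c(12, 11, 13)  # control group
reps <- 1000

outcome <- c(x, y)
is_treated <- rep(c(TRUE, FALSE), c(length(x), length(y)))

# Difference in means: treatment minus control.
diff_means <- function(treated) {
  mean(outcome[treated]) - mean(outcome[!treated])
}
observed <- diff_means(is_treated)

# Shuffle the treatment indicator and recompute the statistic each time.
perm_stats <- replicate(reps, diff_means(sample(is_treated)))

# One-sided ('less') p-value; the observed assignment counts as one draw.
p_less <- (sum(perm_stats <= observed) + 1) / (reps + 1)
p_less
```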