Title: | Estimate Split-Half Reliabilities |
Version: | 3.0.0 |
Date: | 2025-04-29 |
Author: | Thomas Pronk [aut, cre] |
Description: | Estimates split-half reliabilities for scoring algorithms of cognitive tasks and questionnaires. The 'splithalfr' supports researcher-provided scoring algorithms, with six vignettes illustrating how on included datasets. The package provides four splitting methods (first-second, odd-even, permutated, Monte Carlo), the option to stratify splits by task design, a number of reliability coefficients, the option to sub-sample data, and bootstrapped confidence intervals. |
Depends: | R (≥ 3.6.0) |
License: | GPL-3 |
Encoding: | UTF-8 |
LazyData: | true |
Suggests: | knitr (≥ 1.20), rmarkdown (≥ 1.10), testthat (≥ 2.1.0), MASS (≥ 7.3.51) |
Imports: | dplyr (≥ 1.0.7), tibble (≥ 2.1.1), psych (≥ 1.8.12), boot (≥ 1.3.31), rlang (≥ 0.4.0) |
RoxygenNote: | 7.3.2 |
VignetteBuilder: | knitr |
URL: | https://github.com/tpronk/splithalfr |
BugReports: | https://github.com/tpronk/splithalfr/issues |
NeedsCompilation: | no |
Packaged: | 2025-04-28 23:46:42 UTC; Thomas |
Maintainer: | Thomas Pronk <pronkthomas@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-04-29 00:00:02 UTC |
splithalfr: Split-Half Reliabilities
Description
Estimates split-half reliabilities for scoring algorithms of cognitive tasks and questionnaires.
Getting started
We've got six short vignettes to help you get started. You can open a vignette by running the corresponding code snippet (vignette(...)
) in the R console.
-
vignette("rapi_sum")
Sum-score for data of the 23-item version of the Rutgers Alcohol Problem Index (White & Labouvie, 1989 <doi:10.15288/jsa.1989.50.30>) -
vignette("vpt_diff_of_means")
Difference of mean RTs for correct responses, after removing RTs below 200 ms and above 520 ms, on Visual Probe Task data (Mogg & Bradley, 1999 <doi:10.1080/026999399379050>) -
vignette("aat_double_diff_of_medians")
Double difference of medians for correct responses on Approach Avoidance Task data (Heuer, Rinck, & Becker, 2007 <doi:10.1016/j.brat.2007.08.010>) -
vignette("iat_dscore_ri")
Improved d-score algorithm for data of an Implicit Association Task that requires a correct response in order to continue to the next trial (Greenwald, Nosek, & Banaji, 2003) -
vignette("sst_ssrti")
Stop-Signal Reaction Time integration method for data of a Stop Signal Task (Logan, 1981) -
vignette("gng_dprime")
D-prime for data of a Go/No Go task (Miller, 1996 <doi:10.3758/BF03205476>)
Splitting methods
The splithalfr supports a variety of methods for splitting your data. We review and assess each method in the compendium paper (Pronk et al., 2021 <doi:10.3758/s13423-021-01948-3>), but based on more recent concerning findings with Monte Carlo splitting (Kahveci et al., 2025 <doi:10.3758/s13423-024-02597-y>; Pronk et al., 2023 <doi:10.3758/s13428-022-01885-6>), I now only recommend permutated splitting and not Monte Carlo splitting. This vignette illustrates how to apply each splitting method via the splithalfr: 'vignette("splitting_methods")'
first-second and odd-even (Green et al., 2016 <doi:10.3758/s13423-015-0968-3>; Webb, Shavelson, & Haertel, 1996 <doi:10.1016/S0169-7161(06)26004-8>; Williams & Kaufmann, 2012 <doi:10.1016/j.jesp.2012.03.001>)
stratified (Green et al., 2016 <doi:10.3758/s13423-015-0968-3>)
permutated/bootstrapped/random sample of split halves (Kopp, Lange, & Steinke, 2021 <doi:10.1177/1073191119866257>, Parsons, Kruijt, & Fox, 2019 <doi:10.1177/2515245919879695>; Williams & Kaufmann, 2012 <doi:10.1016/j.jesp.2012.03.001>)
Monte Carlo (Williams & Kaufmann, 2012 <doi:10.1016/j.jesp.2012.03.001>)
Citing the splithalfr
Please cite the compendium paper (Pronk et al., 2022 <doi:10.3758/s13423-021-01948-3>]) and the software. To cite the software, type 'citation("splithalfr")' in R, or use the reference below.
Pronk, T. (2025). splithalfr: Estimates split-half reliabilities for scoring algorithms of cognitive tasks and questionnaires (Version 3.0.0) [Computer software]. <doi:10.5281/zenodo.7777894>
Validation of split-half estimations
Part of the splithalfr algorithm has been validated via a set of simulations that are not included in this package. The R script for these simulations can be found here.
Related packages
These R packages offer bootstrapped split-half reliabilities for specific scoring algorithms and are available via CRAN at the time of this writing: multicon, psych, splithalf, and rapidsplithalf.
Acknowledgments
I would like to thank Craig Hedge, Eva Schmitz, Fadie Hanna, Helle Larsen, Marilisa Boffo, and Marjolein Zee for making datasets available for inclusion in the splithalfr. Additionally, I would like to thank Craig Hedge and Benedict Williams for sharing R-scripts with scoring algorithms that were adapted for splithalfr vignettes. Finally, I would like to thank Mae Nuijs and Sera-Maren Wiechert for spotting bugs in earlier versions of this package.
Author(s)
Maintainer: Thomas Pronk pronkthomas@gmail.com
See Also
Useful links:
Calculate Angoff-Feldt coefficient
Description
Angoff-Feldt reliability coefficient. Formula obtained from Warrens (2015) <doi:10.1007/s11634-015-0198-6>
Usage
angoff_feldt(x, y)
Arguments
x |
(vector) a numeric vector |
y |
(vector) a numeric vector with compatible dimensions to x |
Value
(numeric) Angoff-Feldt cefficient
See Also
Other splithalfr coefficients:
assmd()
,
flanagan_rulon()
,
sdregi()
,
short_icc()
,
spearman_brown()
Examples
# Generate two variables with different means, variances and a correlation of about 0.5
library(MASS)
vars = mvrnorm(30, mu = c(0, 2), Sigma = matrix(c(5, 2, 2, 3), ncol = 2), empirical = TRUE)
# Calculate coefficient
angoff_feldt(vars[,1], vars[,2])
Split each element of a list of strata based on a list of indexes
Description
Splits each element of strata
into two parts based on a list of
indexes. For more information about splitting options, and an extensive list
of examples, see get_split_indexes_from_stratum
.
Usage
apply_split_indexes_to_strata(strata, indexes)
Arguments
strata |
(list) list of strata to split |
indexes |
(list) list of indexes, which can be generated via
|
Value
(list) A list with two elements, containing the first and second
split of strata
.
See Also
Other splitting functions:
apply_split_indexes_to_stratum()
,
check_strata()
,
get_split_indexes_from_strata()
,
get_split_indexes_from_stratum()
,
split_df()
,
split_strata()
,
split_stratum()
,
stratify()
Examples
# Stratify a data frame, then split it odd-even
ds <- data.frame(condition = rep(c("a", "b"), each = 4), score = 1 : 8)
strata <- stratify(ds, ds$condition)
split_indexes <- get_split_indexes_from_strata(strata, method = "odd_even")
apply_split_indexes_to_strata(strata, split_indexes)
Split a stratum based on a list of indexes
Description
Splits stratum
into two parts based on a list of indexes. For more information
about splitting options, and an extensive list of examples, see
get_split_indexes_from_stratum
.
Usage
apply_split_indexes_to_stratum(stratum, indexes_1, indexes_2)
Arguments
stratum |
(data frame, tibble, list, or vector) stratum to split |
indexes_1 |
(vector) indexes for first split, which can be generated via |
indexes_2 |
(vector) indexes for second split, which can be generated via |
Value
(list) List with two elements that contain stratum split in two parts.
See Also
Other splitting functions:
apply_split_indexes_to_strata()
,
check_strata()
,
get_split_indexes_from_strata()
,
get_split_indexes_from_stratum()
,
split_df()
,
split_strata()
,
split_stratum()
,
stratify()
Examples
# Random split-half. One of the splits gets 4 elements and the other 5
stratum = letters[1:9]
indexes = get_split_indexes_from_stratum(stratum)
apply_split_indexes_to_stratum(stratum, indexes[[1]], indexes[[2]])
Calculate Absolute Strictly Standardized Mean Difference (ASSMD)
Description
Returns the absolute difference of the mean of x
and y
divided by their shared standard deviation. Since the resulting difference
is absolute, the larger of the two means is always used as minuend and the
smallest as subtrahend. Based on
Zhang (2012) <doi:10.1016/j.ygeno.2006.12.014>
Usage
assmd(x, y)
Arguments
x |
(vector) a numeric vector |
y |
(vector) a numeric vector with compatible dimensions to x |
Value
(numeric) Absolute SSMD
See Also
Other splithalfr coefficients:
angoff_feldt()
,
flanagan_rulon()
,
sdregi()
,
short_icc()
,
spearman_brown()
Examples
# Generate two variables with different means, variances and a correlation of about 0.5
library(MASS)
vars = mvrnorm(30, mu = c(0, 2), Sigma = matrix(c(5, 2, 2, 3), ncol = 2), empirical = TRUE)
# Calculate Absolute SSMD
assmd(vars[,1], vars[,2])
Calculate split scores per participant
Description
Calculates split scores, by applying fn_score
to subsets of
data
as specified via participants
. It provides a range of
additional arguments for different splitting methods and to support parallel
processing. To learn more about writing scoring algorithms for use with the
splithalfr
, see the included vignettes. by_split
is modeled after the by
function, accepting similar values for
the first three arguments (data
, INDICES
, FUN
). For more
information about different methods for splitting data, see
get_split_indexes_from_stratum
. For more information about
stratification, see split_df
Usage
by_split(
data,
participants,
fn_score,
stratification = NULL,
replications = 1,
method = c("random", "odd_even", "first_second"),
replace = FALSE,
split_p = 0.5,
subsample_p = 1,
subsample_n = NULL,
careful = TRUE,
match_participants = FALSE,
ncores = detectCores(),
seed = NULL,
verbose = TRUE
)
Arguments
data |
(data frame) data frame containing data to score. Data should be in long format, with one row per combination of participant and trial or item. |
participants |
(vector) Vector that identifies participants in
|
fn_score |
(function) This function receives split data and should return a single number. |
stratification |
(vector). Vector that identifies which subsets of
|
replications |
(numeric) Number of replications that split scores are calculated. |
method |
(character) Splitting method. Note that |
replace |
(logical) If TRUE, stratum is sampled with replacement. |
split_p |
(numeric) Desired length of both parts, expressed as a
proportion of the length of the data per participant. If |
subsample_p |
(numeric) Subsample a proportion of |
subsample_n |
(numeric) Subsample a number of participants before splitting. |
careful |
(boolean) If TRUE, stop with an error when called with arguments that may yield unexpected splits |
match_participants |
(logical) Default FALSE. If FALSE, the split-halves
are newly randomized for each iteration and participant. If TRUE, the
split-halves are newly randomized for each replication, but within a
replication the same randomization is applied across participants. If the
order of rows of datasets per participant denotes similar observations
(such as items in a questionnaire), |
ncores |
(integer). By default, all available CPU cores are used. If 1,
split replications are executed serially (via |
seed |
(integer). When split replications are exectured in parallel,
|
verbose |
(logical) If TRUE, reports progress. Note that progress across split replications is not displayed when these are executed in parallel. |
Value
(data frame) Returns a data frame with a column for
participant
, a column replication
that counts split
replications, and score_1
and score_2
for the score
calculated of each part via fn_score
.
Examples
# N.B. This example uses R script from the vignette: "rapi_sum"
data("ds_rapi", package = "splithalfr")
# Convert to long format
ds_long <- reshape(
ds_rapi,
varying = paste("V", 1 : 23, sep = ""),
v.names = "answer",
direction = "long",
idvar = "twnr",
timevar = "item"
)
# Function for RAPI sum score
rapi_fn_score <- function (data) {
return (sum(data$answer))
}
# Calculate scores on full data
by(
ds_long,
ds_long$twnr,
rapi_fn_score
)
# Permutation split, one iteration, items matched across participants
split_scores <- by_split(
ds_long,
ds_long$twnr,
rapi_fn_score,
ncores = 1,
match_participants = TRUE
)
# Mean flanagan-rulon coefficient across splits
fr <- mean(split_coefs(split_scores, flanagan_rulon))
Check whether two strata have the same structure
Description
Checks strata
against strata_left
. Each element of
strata_left
should also be present in strata
, be of a similar
type (data frame/tibble or list/vector), and be of similar size
(nrow
for data frames/tibbles or length
for
lists/vectors). Stops with an arror if any checks fail.
Usage
check_strata(strata_left, strata_right)
Arguments
strata_left |
(list) strata to check against |
strata_right |
(list) strata to check |
Value
None
See Also
Other splitting functions:
apply_split_indexes_to_strata()
,
apply_split_indexes_to_stratum()
,
get_split_indexes_from_strata()
,
get_split_indexes_from_stratum()
,
split_df()
,
split_strata()
,
split_stratum()
,
stratify()
Examples
check_strata(list(1 : 4), list(1 : 4))
Example Approach Avoidance Task (AAT) Measurement Data in JASMIN2 Format
Description
The JASMIN1 AAT was an irrelevant feature task, in which participants were instructed to approach/avoid left/right rotated stimuli. This particular AAT was administered (and described in detail) in Boffo et al. (2018) <doi:10.1111/add.14071>. Participants were presented stimuli from a "test" category, which were gambling-related pictures, and from a "control" category, which were pictures unrelated to gambling. It registered approach responses by participants pressing (and holding) the arrow down key, while avoid responses were given via the arrow up key. Upon a response, the stimulus zoomed in or out, until it disappeared from the screen. The first response to a stimulus was logged. The dataset contains one row per trial. This dataset was graciously provided by Eva Schmitz.
Usage
ds_aat
Format
An object of class data.frame
with 6528 rows and 12 columns.
Details
Overview of columns:
UserID. Identifies participants
approach_tilt. If "left", participants were instructed to approach left rotated stimuli. If "right", participants were instructed to approach right rotated stimuli.
block_type. Type of block: "practice" for practice trials with neutral stimuli, "assess" for assessment trials with salient stimuli
block. Counts blocks, starting at zero
trial. Counts trials in blocks, starting at zero
appr. If "yes", this trial was an approach trial. If "no", this trial was an avoid trial.
tilt. Whether the stimulus was rotated to the "left" or to the "right"
cat. Stimulus category: "practice", "test", or "control"
stim. Stimulus ID
response. Response; 1 = correct, 2 = incorrect, 3 = timeout (no response in 4000 ms), 4 = invalid key
rt. Response time in milliseconds
sust. Was approach or avoid response sustained until the stimulus was completely zoomed in or out?
Example Go/No Go data
Description
The Go/No Go is a task in which participants respond to one set of stimuli, but withhold a response to another set of stimuli. This particular dataset is from the first session of a study that is described in detail in Hedge et al. (2018) <doi:10.1016/j.tate.2019.102887>. It was graciously provided by Craig Hedge and can be obtained from https://osf.io/cwzds.
Usage
ds_gng
Format
An object of class data.frame
with 28200 rows and 7 columns.
Details
Overview of columns:
block. Block number
trial. Trial number
stim. Stimuli set used in that block
condition. 0 = go, 2 = no go
response. Correct (1) or incorrect (0)
rt. Reaction time (seconds)
participant. Participant ID
Example Implicit Association Task (IAT) Data in JASMIN2 Format
Description
The JASMIN2 IAT closely followed the original IAT procedure (Greenwald, McGhee, & Schwartz, 1998), except that target and attribute trials did not alternate. Upon a correct response, the next trial started. Upon an incorrect response, the current trial was repeated. The response to each trial was logged. This particular dataset is from a Ethnicity-Valence IAT, which was administered (and described in detail) in Abacioglu et al. (2019) <doi:10.1016/j.tate.2019.102887>. This dataset was graciously provided by Fadie Hanna and Marjolein Zee.
Usage
ds_iat
Format
An object of class data.frame
with 9696 rows and 11 columns.
Details
Overview of columns:
participation_id Identifies participants
t1_left. If TRUE, the first combination block had target 1 on the left (and target 2 on the right)
a1_left. If TRUE, the first combination block had attribute 1 on the left (and attribute 2 on the right)
block_type. Type of block
block. Counts blocks, starting at zero
trial. Counts trials in blocks, starting at zero
attempt. Counts attempts (responses) in trials, starting at zero
cat. Category that stimulus belonged to
stim. Stimulus
response. Response; 1 = correct, 2 = incorrect, 3 = timeout (no response in 4000 ms), 4 = invalid key
rt. Response time in milliseconds. Note that some response times may exceed the timeout window due to clock errors on the computer that the IAT was administered
The variable block_type can have these values:
tar_discr: target discrimination
att_discr: attribute discrimination
tar1att1_1: target 1 with attitude 1, practice block
tar1att1_2: target 1 with attitude 1, test block
tar_rev: reverse target discrimination
tar1att2_1: target 1 with attitude 2, practice block
tar1att2_2: target 1 with attitude 2, test block
Example 23-item Rutgers Alcohol Problem Inventory (RAPI) data
Description
The RAPI is a questionnaire which asks how often a participant experienced each of 23 alcohol-related problems within the last year (White & Labouvie, 1989 <doi:10.15288/jsa.1989.50.30>). The dataset contains one row per participant.
Usage
ds_rapi
Format
An object of class data.frame
with 426 rows and 24 columns.
Details
The dataset contains the following columns:
twnr. Identifies participants
V1 to V23. Answers on each of the 23 RAPI items
Each item is answered on a four-point scale with the following answer options:
0 = None
1 = 1-2 times
2 = 3-5
3 = More than 5 times
Example Stop Signal Task data
Description
The Stop Signal Task is a task in which participants responded whether a stimulus was a square or a circle. On 25 This particular dataset is from the first session of a study that is described in detail in Hedge et al. (2018) <doi:10.1016/j.tate.2019.102887>. It was graciously provided by Craig Hedge and can be obtained from https://osf.io/cwzds.
Usage
ds_sst
Format
An object of class data.frame
with 27000 rows and 7 columns.
Details
Overview of columns:
block. Block number
trial. Trial number
ssd. Stop signal delay
condition. 0 = go, 1 = stop
response. Correct (1) or incorrect (0)
rt. Reaction time (milliseconds)
participant. Participant ID
Example Visual Probe Task (VPT) Measurement Data in JASMIN1 Format
Description
The JASMIN1 VPT distinguished between "test" stimuli, which are in some way assumed to be salient to the participant and "control" stimuli, which are not. Test and control stimuli were presented in pairs, with one left and one right, followed by a probe that was an arrow pointing up or down. Participants needed to indicate whether the arrow pointed up or down. Upon a correct response the next trial started and upon an incorrect response the current trial was repeated. Only the first response to a new trial was logged. This particular VPT was part of the pre-measurement of a cognitive bias modification study. The "test" stimuli were alcoholic beverages and the "control" stimuli were non-alcoholic beverages, selected from the Amsterdam Beverage Picture Set (Pronk et al., 2015) <doi:10.1111/acer.12853>. The dataset contains one row per trial. This dataset was graciously provided by Marilisa Boffo.
Usage
ds_vpt
Format
An object of class data.frame
with 19520 rows and 12 columns.
Details
Overview of columns:
UserID. Identifies participants
patt. Probe-at-test. If "yes", the probe was positioned at the test stimulus. If "no", the probe was positioned at the control stimulus.
phor. Probe horizontal position. Values: "left" or "right"
thor. Test horizontal position. Values: "left" or "right"
keep. If "yes" the probe was superimposed on the stimuli. If "no" the probe replaced the stimuli.
pdir. Probe direction. Values: "up" or "down"
stim. Stimulus
response. Response; 1 = correct, 2 = incorrect, 3 and NA = timeout (no response in 5000 ms), 4 = invalid key
rt. Response time in milliseconds
block. Counts blocks, starting at zero
trial. Counts trials in blocks, starting at zero
block_type. Type of block: "assess" for assessment trials with salient stimuli
Calculate Flanagon-Rulon coefficient
Description
Flanagon-Rulon reliability coefficient. Formula obtained from Warrens (2015) <doi:10.1007/s11634-015-0198-6>
Usage
flanagan_rulon(x, y)
Arguments
x |
(vector) a numeric vector |
y |
(vector) a numeric vector with compatible dimensions to x |
Value
(numeric) Flanagon-Rulon coefficient
See Also
Other splithalfr coefficients:
angoff_feldt()
,
assmd()
,
sdregi()
,
short_icc()
,
spearman_brown()
Examples
# Generate two variables with different means, variances and a correlation of about 0.5
library(MASS)
vars = mvrnorm(30, mu = c(0, 2), Sigma = matrix(c(5, 2, 2, 3), ncol = 2), empirical = TRUE)
# Calculate coefficient
flanagan_rulon(vars[,1], vars[,2])
Generate indexes for splitting strata
Description
Generates indexes for splitting each element of strata
into two parts.
For more information about splitting options, and an extensive list of
examples, see get_split_indexes_from_stratum
.
Usage
get_split_indexes_from_strata(strata, ...)
Arguments
strata |
(list) Strata to split |
... |
Arguments passed on to
|
Value
(list) A list with two elements, containing the first and second part
of strata
.
See Also
Other splitting functions:
apply_split_indexes_to_strata()
,
apply_split_indexes_to_stratum()
,
check_strata()
,
get_split_indexes_from_stratum()
,
split_df()
,
split_strata()
,
split_stratum()
,
stratify()
Examples
# Stratify a data frame, then split it odd-even
ds <- data.frame(condition = rep(c("a", "b"), each = 4), score = 1 : 8)
strata <- stratify(ds, ds$condition)
split_indexes <- get_split_indexes_from_strata(strata, method = "odd_even")
apply_split_indexes_to_strata(strata, split_indexes)
Generate indexes that can be used to split a stratum into two parts
Description
get_split_indexes_from_stratum
returns a list with indexes for
splitting its stratum
argument in two parts. The splits differ at most
by one in size. With default arguments, a random split-half is returned,
which samples elements for each part from stratum
without replacement.
Via additional arguments to get_split_indexes_from_stratum
a range of
other splitting methods can be applied.
Usage
get_split_indexes_from_stratum(
stratum,
method = c("random", "odd_even", "first_second"),
replace = FALSE,
split_p = 0.5,
subsample_p = 1,
careful = TRUE
)
Arguments
stratum |
(data frame, tibble, list, or vector) Object to split; dataframes and tibbles are counted and split by row. All other data types are counted and split by element |
method |
(character) Splitting method. Note that |
replace |
(logical) If FALSE, splits are constructed by sampling from stratum without replacement. If TRUE, stratum is sampled with replacement. |
split_p |
(numeric) Desired joint size of both parts, expressed as a
proportion of the size of the subsampled |
subsample_p |
(numeric) Subsample a proportion of |
careful |
(boolean) If TRUE, stop with an error when called with arguments that may yield unexpected splits |
Details
The following rounding rules apply to subsample size and split size:
If the size of the subsample, calculated as
subsample_p
times size ofstratum
, is a fraction, then subsample size is rounded up.If the joint size of the two parts, calculated as 2 *
split_p
times size of the subsampledstratum
, is a fraction, the part size is rounded up.If the joint size of the two parts is odd and
replace
is FALSE, then one of the parts randomly gets one more element than the other part.If the joint size of the two parts is odd and
replace
is TRUE, part size is rounded up to the next whole number, so each of the splits has the same size.
Value
(list) List with two elements that contain indexes that can be used to split the stratum in two parts two splits of stratum.
See Also
Other splitting functions:
apply_split_indexes_to_strata()
,
apply_split_indexes_to_stratum()
,
check_strata()
,
get_split_indexes_from_strata()
,
split_df()
,
split_strata()
,
split_stratum()
,
stratify()
Examples
# Split-half. One of the splits gets 4 elements and the other 5
stratum = letters[1:9]
indexes = get_split_indexes_from_stratum(stratum)
apply_split_indexes_to_stratum(stratum, indexes[[1]], indexes[[2]])
SD ratio of equalities or greater inequalities
Description
Returns the ratio of the SDs of x
and y
, using the largest
SD of the two as denominator. Hence, the result is always 1 (ratio of
equalities) or greater than 1 (ratio of greater inequalities).
If x or y have less than two elements, NA is returned.
Usage
sdregi(x, y)
Arguments
x |
(vector) a numeric vector |
y |
(vector) a numeric vector with compatible dimensions to x |
Value
(numeric) SD ratio
See Also
Other splithalfr coefficients:
angoff_feldt()
,
assmd()
,
flanagan_rulon()
,
short_icc()
,
spearman_brown()
Examples
# Generate two variables with different means, variances and a correlation of about 0.5
library(MASS)
vars = mvrnorm(30, mu = c(0, 2), Sigma = matrix(c(5, 2, 2, 3), ncol = 2), empirical = TRUE)
# Calculate SD ratio of left and right variables
sdregi(vars[,1], vars[,2])
# Calculate SD ratio of right and left variables; should give same result
sdregi(vars[,1], vars[,2])
Calculate Intraclass Correlation Coefficient (ICC)
Description
Wrapper for ICCs calculated via ICC
. If x or y have
less than two elements, NA is returned.
Usage
short_icc(
x,
y,
type = c("ICC1", "ICC2", "ICC3", "ICC1k", "ICC2k", "ICC3k"),
...
)
Arguments
x |
(vector) a numeric vector |
y |
(vector) a numeric vector with compatible dimensions to x |
type |
(character) type of ICC to calculate, see |
... |
Arguments passed to |
Value
(numeric) Value of ICC coefficient
See Also
Other splithalfr coefficients:
angoff_feldt()
,
assmd()
,
flanagan_rulon()
,
sdregi()
,
spearman_brown()
Examples
# Generate two variables with different means, variances and a correlation of about 0.5
library(MASS)
vars = mvrnorm(30, mu = c(0, 2), Sigma = matrix(c(5, 2, 2, 3), ncol = 2), empirical = TRUE)
# Calculate ICC1
short_icc(vars[,1], vars[,2], type = "ICC1", lmer = FALSE)
Calculate Spearman-brown coefficient
Description
Spearman-Brown reliability coefficient for doubling test length. Formula obtained from Warrens (2015) <doi:10.1007/s11634-015-0198-6>
Usage
spearman_brown(x, y, fn_cor = cor, ...)
Arguments
x |
(vector) a numeric vector |
y |
(vector) a numeric vector with compatible dimensions to x |
fn_cor |
(function) a function returning a correlation coefficient |
... |
Arguments passed to |
Value
(numeric) Spearman-Brown coefficient
See Also
Other splithalfr coefficients:
angoff_feldt()
,
assmd()
,
flanagan_rulon()
,
sdregi()
,
short_icc()
Examples
# Generate two variables with different means, variances and a correlation of about 0.5
library(MASS)
vars = mvrnorm(30, mu = c(0, 2), Sigma = matrix(c(5, 2, 2, 3), ncol = 2), empirical = TRUE)
# Calculate coefficient based on Pearson correlation
spearman_brown(vars[,1], vars[,2])
# Calculate coefficient based on ICC, two-way, random effects, absolute agreement, single rater
spearman_brown(vars[,1], vars[,2], short_icc, type = "ICC1", lmer = FALSE)
Generate bootstrap replicates of a coefficient split-half reliability estimate by sampling participants
Description
Generates bootstrap replicates via boot
. The parameter
ds
should be a data frame as returned by by_split
: Each
unique value of the column participant
is considered a independent
sample of the target population. For each unique value of the column
split
in ds
, it selects the corresponding rows in ds
,
and passes the values in the columns score_1
and score_2
as the
first and second argument to fn_coef
. Any row in ds
for which
score_1
or score_2
is NA is pairwise removed before passing the
data to fn_coef
. Any coefficient that is NA is removed before passing
the data to fn_summary
.
Usage
split_ci(
ds,
fn_coef,
fn_average,
bootstrap_replications = 1000,
parallel = "snow",
ncpus = detectCores(),
...
)
Arguments
ds |
(data frame) a data frame with columns |
fn_coef |
(function) a function that calculates a bivariate (reliability) coefficient |
fn_average |
(function) a function that calculates an average across coefficients |
bootstrap_replications |
(integer) number of bootstrap replications |
parallel |
(character) Type of parallel processing (if any) used for bootstrapping. See |
ncpus |
(character) Number of cores for parallel processing. See |
... |
Additional arguments passed to |
Details
For a practical example, see one of the vignettes for getting started with the splithalfr. Also, note that the boot package supports parallel processing via the parameters 'parallel' and 'ncpus'.
For averaging internal consistency coefficients, see Feldt and Charter (2006). For more information about bias-corrected and accelerated bootstrap confidence intervals, see Efron (1987).
Value
Confidence interval
References
Efron, B. (1987). Better bootstrap confidence intervals. Journal of the American statistical Association, 82(397), 171-185. doi:10.1080/01621459.1987.10478410
Feldt, L. S., & Charter, R. A. (2006). Averaging internal consistency reliability coefficients. Educational and Psychological Measurement, 66(2), 215-227. doi:10.1177/0013164404273947
See Also
Other split aggregation functions:
split_coefs()
Examples
# Import boot library
library(boot)
# Generate five splits with scores that are correlated 0.00, 0.25, 0.5, 0.75, and 1.00
library(MASS)
ds_splits = data.frame(V1 = numeric(), V2 = numeric(), split = numeric())
for (r in 0:4) {
vars = mvrnorm(10, mu = c(0, 0), Sigma = matrix(c(10, 3, 3, 2), ncol = 2), empirical = FALSE)
ds_splits = rbind(ds_splits, cbind(vars, r, 1 : 10))
}
names(ds_splits) = c("score_1", "score_2", "replication", "participant")
# Conduct bootstrap
bootstrap_result <- split_ci(ds_splits, cor, mean, parallel = "no")
# Get boosted and accelerated confidence intervals
print(boot.ci(bootstrap_result, type="bca"))
Calculate a bivariate coefficient for each split-half replication
Description
Calculates a bivariate coefficient across participants for each split-half
replication and returns their values calculated across
replications. ds
should be a data frame as returned by
by_split
: For each unique value of the column split
in
ds
, it selects the corresponding rows in ds
, and passes the
values in the columns score_1
and score_2
as the first and
second argument to fn_coef
. Any row in ds
for which
score_1
or score_2
is NA is pairwise removed before passing the
data to fn_coef
. For averaging internal consistency coefficients,
see Feldt and Charter (2006).
Usage
split_coefs(ds, fn_coef, ...)
Arguments
ds |
(data frame) a data frame with columns |
fn_coef |
(function) a function that calculates a bivariate coefficient. |
... |
Additional arguments passed to |
Value
Coefficients per split calculated via fn_coef
.
References
Feldt, L. S., & Charter, R. A. (2006). Averaging internal consistency reliability coefficients. Educational and Psychological Measurement, 66(2), 215-227. doi:10.1177/0013164404273947
See Also
Other split aggregation functions:
split_ci()
Examples
# Generate five splits with scores that are correlated 0.00, 0.25, 0.5, 0.75, and 1.00
library(MASS)
ds_splits = data.frame(score_1 = numeric(), score_2 = numeric(), replication = numeric())
for (r in 0:4) {
vars = mvrnorm(10, mu = c(0, 0), Sigma = matrix(c(10, 3, 3, 2), ncol = 2), empirical = FALSE)
ds_splits = rbind(ds_splits, cbind(vars, r))
}
names(ds_splits) = c("score_1", "score_2", "replication")
# Pearson correlations
split_coefs(ds_splits, cor)
# Spearman-brown corrected Pearson correlations
split_coefs(ds_splits, spearman_brown)
# Flanagan-Rulon coefficient
split_coefs(ds_splits, flanagan_rulon)
# Angoff-Feldt coefficient
split_coefs(ds_splits, angoff_feldt)
# Spearman-Brown corrected ICCs
split_coefs(
ds_splits,
spearman_brown,
short_icc,
type = "ICC1",
lmer = FALSE
)
Split a data frame into two parts
Description
Splits data
, Applies a stratified split to a data frame and returns
each part. For more information about splitting options, and an extensive
list of examples, see get_split_indexes_from_stratum
.
Usage
split_df(data, stratification = NULL, ...)
Arguments
data |
(data frame) Data to split, in long format, with one row per observation. |
stratification |
(vector). Vector that identifies which subsets of
|
... |
Arguments passed on to
|
Value
(list) List with two elements that each contain one of two parts.
See Also
Other splitting functions:
apply_split_indexes_to_strata()
,
apply_split_indexes_to_stratum()
,
check_strata()
,
get_split_indexes_from_strata()
,
get_split_indexes_from_stratum()
,
split_strata()
,
split_stratum()
,
stratify()
Examples
ds <- data.frame(condition = rep(c("a", "b"), each = 4), score = 1 : 8)
split_df(ds, method = "random")
split_df(ds, method = "odd_even")
split_df(ds, method = "first_second")
split_df(ds, stratification = ds$condition, method = "random")
split_df(ds, stratification = ds$condition, method = "odd_even")
split_df(ds, stratification = ds$condition, method = "first_second")
ds <- data.frame(condition = rep(c("a", "b"), 4), score = 1 : 8)
split_df(ds, method = "random")
split_df(ds, method = "odd_even")
split_df(ds, method = "first_second")
split_df(ds, stratification = ds$condition, method = "random")
split_df(ds, stratification = ds$condition, method = "odd_even")
split_df(ds, stratification = ds$condition, method = "first_second")
Split each stratum into two parts
Description
Splits each element of strata
into two parts. For more information
about splitting options, and an extensive list of examples, see
get_split_indexes_from_stratum
.
Usage
split_strata(strata, ...)
Arguments
strata |
(list) list of strata to split |
... |
Arguments passed on to
|
Value
(list) A list with two elements, containing the first and second
split of strata
.
See Also
Other splitting functions:
apply_split_indexes_to_strata()
,
apply_split_indexes_to_stratum()
,
check_strata()
,
get_split_indexes_from_strata()
,
get_split_indexes_from_stratum()
,
split_df()
,
split_stratum()
,
stratify()
Examples
# Stratify a data frame, then split it odd-even
ds <- data.frame(condition = rep(c("a", "b"), each = 4), score = 1 : 8)
strata <- stratify(ds, ds$condition)
split_strata(strata, method = "odd_even")
Split a stratum into two parts
Description
Splits stratum
into two parts. For more information
about splitting options, and an extensive list of examples, see
get_split_indexes_from_stratum
.
Usage
split_stratum(stratum, ...)
Arguments
stratum |
(data frame, tibble, list, or vector) Stratum to split; dataframes and tibbles are counted and split by row. All other data types are counted and split by element |
... |
Arguments passed on to
|
Value
(list) List with two elements that contain each of the two parts of stratum split in two.
See Also
Other splitting functions:
apply_split_indexes_to_strata()
,
apply_split_indexes_to_stratum()
,
check_strata()
,
get_split_indexes_from_strata()
,
get_split_indexes_from_stratum()
,
split_df()
,
split_strata()
,
stratify()
Examples
# Split stratum odd-even
ds <- data.frame(condition = rep(c("a", "b"), each = 4), score = 1 : 8)
split_stratum(ds, method = "odd_even")
Stratify a data frame
Description
Split a dataframe into strata formed by each a unique value of stratification
.
Usage
stratify(ds, stratification = NULL)
Arguments
ds |
(data frame) data to split into strata |
stratification |
(vector). Vector that identifies which subsets of
|
Value
(list) List of strata
See Also
Other splitting functions:
apply_split_indexes_to_strata()
,
apply_split_indexes_to_stratum()
,
check_strata()
,
get_split_indexes_from_strata()
,
get_split_indexes_from_stratum()
,
split_df()
,
split_strata()
,
split_stratum()
Examples
# Stratify a data frame, then split it odd-even
ds <- data.frame(condition = rep(c("a", "b"), each = 4), score = 1 : 8)
strata <- stratify(ds, ds$condition)
split_strata(strata, method = "odd_even")