Help for package detectR

Type:

Package

Title:

Change Point Detection

Version:

0.3.0

Maintainer:

Changryong Baek <crbaek@skku.edu>

Description:

Time series analysis of network connectivity. Detects and visualizes change points between networks. Methods included in the package are discussed in depth in Baek, C., Gates, K. M., Leinwand, B., Pipiras, V. (2021) "Two sample tests for high-dimensional auto-covariances" <doi:10.1016/j.csda.2020.107067> and Baek, C., Gampe, M., Leinwand B., Lindquist K., Hopfinger J. and Gates K. (2023) “Detecting functional connectivity changes in fMRI data” <doi:10.1007/s11336-023-09908-7>.

License:

Unlimited

Encoding:

UTF-8

LazyData:

true

Imports:

signal, lavaan, doParallel, glasso, LogConcDEAD, foreach, parallel

Depends:

R (≥ 2.10)

URL:

https://github.com/crbaek/detectR

RoxygenNote:

7.3.1

NeedsCompilation:

Packaged:

2024-03-07 06:19:59 UTC; crbaek

Author:

Changryong Baek [aut, cre], Mattew Gampe [aut], Kathleen M. Gates [aut], Seok-Oh Jeong [aut], Vladas Pipiras [aut]

Repository:

CRAN

Date/Publication:

2024-03-08 20:30:02 UTC

(multivariate) Bartlett long-run variance calculation

Description

(multivariate) Bartlett long-run variance calculation

Usage

LRV.bartlett(cL, q)
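
The arguments of LRV.bartlett are not documented here, so the following is only a minimal base-R sketch of a Bartlett-kernel long-run variance estimate, not the package's implementation. The function name bartlett_lrv and its arguments x (a length*dim matrix) and q (the bandwidth) are assumptions made for illustration.

```r
# Hypothetical helper (not part of detectR): Bartlett-kernel long-run variance.
# x: numeric matrix with observations in rows; q: non-negative integer bandwidth.
bartlett_lrv <- function(x, q) {
  x  <- as.matrix(x)
  n  <- nrow(x)
  xc <- sweep(x, 2, colMeans(x))          # center each column
  lrv <- crossprod(xc) / n                # lag-0 autocovariance
  for (h in seq_len(q)) {
    w   <- 1 - h / (q + 1)                # Bartlett (triangular) weight
    gam <- crossprod(xc[(h + 1):n, , drop = FALSE],
                     xc[1:(n - h), , drop = FALSE]) / n  # lag-h autocovariance
    lrv <- lrv + w * (gam + t(gam))       # add lag h and lag -h
  }
  lrv
}

set.seed(1)
x   <- matrix(rnorm(200), nrow = 100, ncol = 2)
lrv <- bartlett_lrv(x, q = floor(nrow(x)^(1/3)))
```

With q = 0 the estimate reduces to the lag-0 sample covariance (with divisor n), which is a quick sanity check for the weights.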

Optional bandpass filtering using the Butterworth filter.

Description

Usage

bandpass(data, butterfreq)

Block length formula by Andrews

Description

Block length formula by Andrews

Usage

blocklength.andrew(data)

Changepoint Example Data

Description

This dataset contains a simulated multivariate time series with two changepoints, at time points 150 and 300. The data have T=450 observations and dimension p=20.

Usage

changesim

Format

An object of class matrix (inherits from array) with 450 rows and 20 columns.

Change point detection using PCA and binary segmentation

Description

This function uses a PCA-based method to find breaks. Multiple breaks are located by binary segmentation.
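
The binary segmentation idea can be sketched in base R as follows. This is an illustration only: it uses a simple univariate CUSUM statistic for a mean shift as a stand-in for the package's PCA-based statistic, and the names cusum_split and binary_segment, the fixed threshold crit, and the simulated series are all assumptions. The argument del plays the role of the boundary restriction.

```r
# Hypothetical helper: best single split of x, searched at least `del`
# observations away from both ends, using a CUSUM statistic for the mean.
cusum_split <- function(x, del) {
  n <- length(x)
  if (n < 2 * del) return(NULL)           # segment too short to split
  k <- del:(n - del)
  s <- cumsum(x)
  stat <- abs(s[k] - k / n * s[n]) / sqrt(n)
  list(k = k[which.max(stat)], stat = max(stat))
}

# Hypothetical helper: split recursively while the statistic exceeds `crit`.
binary_segment <- function(x, del, crit, offset = 0) {
  best <- cusum_split(x, del)
  if (is.null(best) || best$stat < crit) return(integer(0))
  left  <- binary_segment(x[1:best$k], del, crit, offset)
  right <- binary_segment(x[(best$k + 1):length(x)], del, crit, offset + best$k)
  sort(c(left, offset + best$k, right))
}

set.seed(1)
x <- c(rnorm(150, 0), rnorm(150, 3), rnorm(150, 0))  # mean shifts at 150 and 300
breaks <- binary_segment(x, del = 20, crit = 3)
```

Each detected break splits the series in two, and the same search is repeated on each half until no statistic exceeds the threshold.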

Usage

detectBinary(
  Y,
  Del,
  L,
  q = "fixed",
  alpha = 0.05,
  nboot = 199,
  n.cl,
  bsize = "log",
  bootTF = TRUE,
  scaleTF = TRUE,
  diagTF = TRUE,
  plotTF = TRUE
)

Arguments

Y

data: Y = length*dim

Del

Delta away from the boundary restriction

L

the number of factors

q

method for choosing the bandwidth used in calculating the long-run variance of the test statistic. The default "fixed" uses length^(1/3); "andrews" selects the bandwidth adaptively from the data; alternatively, the user can specify the value directly.

alpha

significance level of the test

nboot

the number of bootstrap samples used for the p-value. Default is 199.

n.cl

number of cores in parallel computing. The default is (machine cores - 1)

bsize

block size for the block wild bootstrap. The default "log" uses log(length), "sqrt" uses sqrt(length), and "adaptive" determines the block size using the data-dependent selection of Andrews.

bootTF

If TRUE, the threshold is calculated from the bootstrap; otherwise the asymptotic threshold is used.

scaleTF

If TRUE, the signal is standardized to have zero mean and unit variance.

diagTF

If TRUE, the diagonal entries of the covariance matrix are included.

plotTF

If TRUE, draw a plot of the test statistic and threshold.

Value

tstathist The complete history of test statistics.

Brhist The sequence of breakpoints found from binary splitting

L The number of factors used in the procedure

q The estimated vectorized autocovariance on each regime.

crit The critical value to identify change point

bsize The block size of the bootstrap

diagTF If TRUE, the diagonal entry of covariance matrix is used in detecting connectivity changes.

bootTF If TRUE, bootstrap is used to find critical value

scaleTF If TRUE, the multivariate signal is studentized to have zero mean and unit variance.

Examples

out3= detectBinary(changesim, L=2, n.cl=1)
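
As a rough guide to the tuning defaults described in the arguments, the rules length^(1/3), log(length) and sqrt(length) can be evaluated for a series of length 450, the length of changesim. This is a sketch only: whether the package floors, rounds or keeps the fractional value is not stated above, so the use of floor() is an assumption.

```r
n <- 450                      # series length of the changesim example (T = 450)
q_fixed    <- floor(n^(1/3))  # "fixed" bandwidth rule; flooring is an assumption
bsize_log  <- floor(log(n))   # "log" block size rule
bsize_sqrt <- floor(sqrt(n))  # "sqrt" block size rule
tuning <- c(q = q_fixed, log = bsize_log, sqrt = bsize_sqrt)
tuning                        # q = 7, log = 6, sqrt = 21
```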

Change point detection using Graphical lasso as in Cribben et al. (2012)

Description

This function implements the Dynamic Connectivity Regression (DCR) algorithm proposed by Cribben et al. (2012) to locate changepoints.

Usage

detectGlasso(
  Y,
  Del,
  p,
  lambda = "bic",
  nboot = 100,
  n.cl,
  bound = c(0.001, 1),
  gridTF = FALSE,
  plotTF = TRUE
)

Arguments

Y

Input data of dimension length*dim (T times d)

Del

Delta away from the boundary restriction

p

The Geometric(p) distribution controls the block size of the stationary bootstrap. The mean block length is 1/p.

lambda

penalty parameter of the graphical lasso. Two options are available: "bic" selects lambda by the BIC criterion, or the user can directly input the penalty value.

nboot

the number of bootstrap samples used for the p-value. Default is 100.

n.cl

number of cores in parallel computing. The default is (machine cores - 1)

bound

search bounds used by the "bic" rule. Default is (.001, 1).

gridTF

If TRUE, the minimum BIC is found by grid search. Default is FALSE.

plotTF

Draw plot to see test statistic

Value

A list with components

br The estimated breakpoints including boundary (0, T)

brhist The sequence of breakpoints found from binary splitting

diffhist The history of BIC reduction on each step

W The estimated vectorized autocovariance on each regime.

WI The estimated vectorized precision matrix on each regime.

lambda The penalty parameter estimated on each regime.

pvalhist The empirical p-values on each binary splitting.

fitzero Detailed output at first stage. Useful in producing plot.

Examples

out1= detectGlasso(changesim, p=.2, n.cl=1)
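
The role of p in the stationary bootstrap can be illustrated with a short base-R sketch. It is not the package's code: the function name stationary_boot_index is hypothetical, and the circular wrap-around at the end of the series is an assumption. Block lengths are drawn as rgeom(1, p) + 1, which has mean 1/p, so p = 0.2 gives blocks of average length 5.

```r
# Hypothetical helper: resampled time indices for a stationary bootstrap
# with geometric block lengths of mean 1/p.
stationary_boot_index <- function(n, p) {
  idx <- integer(0)
  while (length(idx) < n) {
    start <- sample.int(n, 1)                 # random block start
    len   <- rgeom(1, p) + 1                  # block length on {1, 2, ...}
    block <- (start - 1 + seq_len(len) - 1) %% n + 1  # wrap around the end
    idx   <- c(idx, block)
  }
  idx[seq_len(n)]                             # trim to the series length
}

set.seed(3)
idx <- stationary_boot_index(450, p = 0.2)
mean_block <- mean(rgeom(1e5, 0.2) + 1)       # close to 1 / 0.2 = 5
```

A bootstrap sample of a length*dim matrix Y would then be Y[idx, ], which keeps short runs of consecutive observations together.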

Change point detection using max-type statistic as in Jeong et al. (2016)

Description

Change point detection using max-type statistic as in Jeong et al. (2016)

Usage

detectMaxChange(
  Y,
  m = c(30, 40, 50),
  margin = 30,
  thre.localfdr = 0.2,
  design.mat = NULL,
  plotTF = TRUE,
  n.cl
)

Arguments

Y

Input data matrix

m

window sizes

margin

margin

thre.localfdr

threshold for local fdr

design.mat

design matrix for analyzing task data

plotTF

Draw plot to see test statistic and threshold

n.cl

number of clusters for parallel computing

Value

CLX Test statistic corresponding to window size arranged in column

CLXLocalFDR The Local FDR calculated for each time point

br The final estimated break points

Examples

out2= detectMaxChange(changesim, m=c(30, 35, 40, 45, 50), n.cl=1)

Change point detection using PCA and sliding method

Description

Change point detection using PCA and sliding method

Usage

detectSliding(
  Y,
  wd = 40,
  L,
  Del,
  q = "fixed",
  alpha = 0.05,
  nboot = 199,
  n.cl,
  bsize = "log",
  bootTF = TRUE,
  scaleTF = TRUE,
  diagTF = TRUE,
  plotTF = TRUE
)

Arguments

Y

data: Y = length*dim

wd

window size for sliding averages

L

the number of factors

Del

Delta away from the boundary restriction

q

method for choosing the bandwidth used in calculating the long-run variance of the test statistic. The default "fixed" uses length^(1/3); "andrews" selects the bandwidth adaptively from the data; alternatively, the user can specify the value directly.

alpha

significance level of the test

nboot

the number of bootstrap samples used for the p-value. Default is 199.

n.cl

number of cores in parallel computing. The default is (machine cores - 1)

bsize

block size for the block wild bootstrap. The default "log" uses log(length), "sqrt" uses sqrt(length), and "adaptive" determines the block size using the data-dependent selection of Andrews.

bootTF

If TRUE, the threshold is calculated from the bootstrap; otherwise the asymptotic threshold is used.

scaleTF

If TRUE, the signal is standardized to have zero mean and unit variance.

diagTF

If TRUE, the diagonal entries of the covariance matrix are included.

plotTF

If TRUE, draw a plot of the test statistic and threshold.

Value

sW The test statistic

L The number of factors used in the procedure

q The estimated vectorized autocovariance on each regime.

crit The critical value to identify change point

bsize The block size of the bootstrap

diagTF If TRUE, the diagonal entry of covariance matrix is used in detecting connectivity changes.

bootTF If TRUE, bootstrap is used to find critical value

scaleTF If TRUE, the multivariate signal is studentized to have zero mean and unit variance.

Examples

out4 = detectSliding(changesim, wd=40, L=2, n.cl=1)
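
The sliding idea behind the wd argument can be pictured with a simplified base-R sketch. It is not the statistic used by detectSliding: the function name sliding_cov_diff is hypothetical, and the largest absolute difference between the sample covariances of the two adjacent windows is used purely as an illustrative contrast.

```r
# Hypothetical helper: at each time t, compare the covariance of the wd
# observations up to t with that of the wd observations after t.
sliding_cov_diff <- function(Y, wd) {
  n <- nrow(Y)
  centers <- wd:(n - wd)                       # times with two full windows
  stat <- vapply(centers, function(t) {
    left  <- cov(Y[(t - wd + 1):t, , drop = FALSE])
    right <- cov(Y[(t + 1):(t + wd), , drop = FALSE])
    max(abs(left - right))                     # largest entrywise difference
  }, numeric(1))
  data.frame(time = centers, stat = stat)
}

set.seed(2)
Y <- rbind(matrix(rnorm(200 * 3), 200, 3),       # unit variance regime
           3 * matrix(rnorm(200 * 3), 200, 3))   # variance 9 regime
res <- sliding_cov_diff(Y, wd = 40)
```

Plotting res$stat against res$time gives a curve whose peaks are candidate change points; a larger wd smooths the curve but shortens the range of times that can be examined.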

Global Variables and functions

Description

Defining variables and functions used in the internal functions

If model is indicated, reduce data to components.

Description

If model is indicated, reduce data to components.

Usage

networkpca(model, Y)

Optional preprocessing step for regressing out noise variables.

Description

This function saves the residuals after regressing the user-defined "signal" variables on the user-defined noise variables. Noise variables can be nuisance regressors (such as cerebrospinal fluid in fMRI), time vectors, or any other vector that the user wishes to regress out.

Usage

preproc(Y, noise, signal)

Arguments

Y

data: Y = length*dim

noise

string vector indicating which variables (columns) are regressors

signal

string vector indicating which variables to regress out the noise variables from

Value

signalmatrix
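
The regression step described above can be sketched in base R. This is an illustration under stated assumptions, not the body of preproc: the function name regress_out is hypothetical, an intercept is assumed to be included, and the column names in the toy data are made up.

```r
# Hypothetical helper: residuals of the signal columns after least-squares
# regression on the noise columns (plus an intercept).
regress_out <- function(Y, noise, signal) {
  X <- cbind(1, as.matrix(Y[, noise, drop = FALSE]))   # design matrix
  S <- as.matrix(Y[, signal, drop = FALSE])            # responses
  res <- qr.resid(qr(X), S)                            # least-squares residuals
  colnames(res) <- signal
  res
}

set.seed(5)
Y <- data.frame(roi1 = rnorm(100), roi2 = rnorm(100), csf = rnorm(100))
Y$roi1 <- Y$roi1 + 0.8 * Y$csf                         # contaminate one signal
signalmatrix <- regress_out(Y, noise = "csf", signal = c("roi1", "roi2"))
```

By construction the residuals are orthogonal to the noise columns, so the cleaned signals carry no linear trace of them.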

Data preparation for changepoint detection using functions in this package.

Description

Usage

preprocess(
  file = NULL,
  header = NULL,
  sep = NULL,
  signal = NULL,
  noise = NULL,
  butterfreq = NULL,
  model = NULL
)

Arguments

file

a data matrix or file name with columns as variables and rows as observations across time.

header

logical for whether or not there is a header in the data file.

sep

The delimiter of the data file. "" indicates space-delimited, "\t" indicates tab-delimited, "," indicates comma-delimited. Only necessary when reading data from a file.

signal

(optional) a character vector containing the names of variables that contain signal i.e., which variables to use to detect change point. The default (NULL) indicates all variables except those in 'noise' argument are considered signal. Example: signal = c("dDMN4", "vDMN5", "vDMN1",

noise

(optional) a character vector containing the names of variables that contain noise. The signal variables will be regressed on these variables and residuals used in change point detection. The default (NULL) indicates there are no noise variables. Example: noise = c("White.Matter1", "CSF1")

butterfreq

(optional) bandpass filter frequency ranges. Example: c(.04,.4)

model

(optional) user-specified syntax indicating which variables belong to which networks, used for a first pass of data reduction. If the data have no header, the variable naming convention follows "V#". Notation should follow lavaan syntax style.

Test for the equality of connectivity based on the Graphical lasso estimation.

Description

This function uses the Dynamic Connectivity Regression (DCR) algorithm proposed by Cribben et al. (2012) to test the equality of connectivity in two fMRI signals.

Usage

testGlasso(
  subY1,
  subY2,
  p,
  lambda = "bic",
  nboot = 100,
  n.cl,
  bound = c(0.001, 1),
  gridTF = FALSE
)

Arguments

subY1

a sample of size length*dim

subY2

a sample of size length*dim

p

The Geometric(p) distribution controls the block size of the stationary bootstrap. The mean block length is 1/p.

lambda

penalty parameter of the graphical lasso. Two options are available: "bic" selects lambda by the BIC criterion, or the user can directly input the penalty value.

nboot

the number of bootstrap samples used for the p-value. Default is 100.

n.cl

number of cores in parallel computing. The default is (machine cores - 1)

bound

search bounds used by the "bic" rule. Default is (.001, 1).

gridTF

Utilize a grid search to optimize hyperparameters

Value

pval The empirical p-value for testing the equality of connectivity structure

rho The sequence of penalty parameter based on the combined sample, subY1 and subY2.

fit0 Output of glasso for combined sample

fit1 Output of glasso for subY1

fit2 Output of glasso for subY2

Examples

test1= testGlasso(testsim$X, testsim$Y, n.cl=1)
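
An empirical bootstrap p-value of the kind reported in pval is commonly computed by comparing the observed statistic with its bootstrap replicates. The sketch below shows one standard convention, (1 + number of replicates at least as large as the observed statistic) / (nboot + 1); whether the package uses exactly this convention is an assumption, and the function name boot_pvalue is hypothetical.

```r
# Hypothetical helper: empirical p-value from bootstrap replicates,
# treating large values of the statistic as evidence against equality.
boot_pvalue <- function(stat, boot_stats) {
  (1 + sum(boot_stats >= stat)) / (length(boot_stats) + 1)
}

boot_stats <- 1:199                      # toy replicates, nboot = 199
p_small <- boot_pvalue(200, boot_stats)  # no replicate reaches 200: 1 / 200
p_large <- boot_pvalue(10, boot_stats)   # 190 replicates >= 10: 191 / 200
```

Under this convention the smallest attainable p-value is 1 / (nboot + 1), so nboot limits how small a reported p-value can be.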

Max-type test for the equality of connectivity

Description

This function produces three test results based on max-type block bootstrap (BMB), long-run variance block bootstrapping with lagged-window estimator (LVBWR) and sum-type block bootstrap (BSUM). See Baek et al. (2019) for details.

Usage

testMax(subY1, subY2, diagTF = TRUE, nboot, q = "fixed", n.cl)

Arguments

subY1

a sample of size length*dim

subY2

a sample of size length*dim

diagTF

If TRUE, the diagonal entries of the covariance matrix are included.

nboot

the number of bootstrap samples. Default is 2000.

q

method for choosing the bandwidth used in calculating the long-run variance of the test statistic. The default "fixed" uses length^(1/3); "andrews" selects the bandwidth adaptively from the data; alternatively, the user can specify the value directly.

n.cl

number of cores in parallel computing. The default is (machine cores - 1)

Value

tstat Test statistic for testing the equality of connectivity structure

pval The p-value for testing the equality of connectivity structure

q The tuning parameter used in calculating long-run variance

Examples

test2 = testMax(testsim$X, testsim$Y, n.cl=1)
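
The effect of diagTF can be illustrated by vectorizing a sample covariance matrix with or without its diagonal. This is a base-R sketch rather than the package's code; the function name vec_cov and the simulated samples are assumptions, and the final line is only a raw, unstandardized max-type contrast between two samples, not the test statistic returned by testMax.

```r
# Hypothetical helper: upper-triangular entries of the sample covariance,
# optionally including the diagonal (variances).
vec_cov <- function(Y, diagTF = TRUE) {
  S <- cov(Y)
  S[upper.tri(S, diag = diagTF)]
}

set.seed(4)
Y1 <- matrix(rnorm(150 * 20), 150, 20)    # same shape as each series in testsim
Y2 <- matrix(rnorm(150 * 20), 150, 20)

n_with    <- length(vec_cov(Y1, TRUE))    # 20 * 21 / 2 = 210 entries
n_without <- length(vec_cov(Y1, FALSE))   # 20 * 19 / 2 = 190 entries
raw_max   <- max(abs(vec_cov(Y1) - vec_cov(Y2)))  # largest entrywise gap
```

Setting diagTF = FALSE restricts the comparison to cross-covariances, so changes in the variance of individual series no longer contribute.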

PCA-based test for the equality of connectivity

Description

This function performs a PCA-based test of the equality of connectivity in two fMRI signals.

Usage

testPCA(subY1, subY2, L = 2, nlag, diagTF = TRUE)

Arguments

subY1

a sample of size length*dim

subY2

a sample of size length*dim

L

the number of factors

nlag

the number of ACF lags to be used in the test. Default is nlag = floor(N^(1/3)).

diagTF

If TRUE, the diagonal entries of the covariance matrix are included.

Value

tstat Test statistic

pval Returns the p-value

df The degrees of freedom of the PCA-based test

L The number of factors used in the test

diagTF If true, the diagonal entry of covariance matrix is used in testing

Examples

test3 = testPCA(testsim$X, testsim$Y, L=2)

Test Example Data

Description

This dataset contains simulated multivariate time series with two different autocovariances. It is a list with two elements, X and Y. Each multivariate time series has T=150 observations and dimension p=20.

Usage

testsim

Format

An object of class list of length 2.