Title: Principal Component of Explained Variance
Version: 2.2.2
Description: Principal component of explained variance (PCEV) is a statistical tool for the analysis of a multivariate response vector. It is a dimension-reduction technique, similar to principal component analysis (PCA), that seeks to maximize the proportion of variance (in the response vector) being explained by a set of covariates.
Depends: R (≥ 3.0.0)
Imports: RMTstat, stats, corpcor
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
LazyData: true
URL: http://github.com/GreenwoodLab/pcev
BugReports: http://github.com/GreenwoodLab/pcev/issues
Suggests: knitr
VignetteBuilder: knitr
RoxygenNote: 6.0.1
NeedsCompilation: no
Packaged: 2018-02-03 23:05:56 UTC; mturgeon
Author: Maxime Turgeon [aut, cre], Aurelie Labbe [aut], Karim Oualkacha [aut], Stepan Grinek [aut]
Maintainer: Maxime Turgeon <maxime.turgeon@mail.mcgill.ca>
Repository: CRAN
Date/Publication: 2018-02-03 23:14:47 UTC

pcev: A package for computing principal components of explained variance.

Description

PCEV is a statistical tool for the analysis of a multivariate response vector. It is a dimension-reduction technique, similar to Principal Component Analysis (PCA), which seeks to maximize the proportion of variance (in the response vector) being explained by a set of covariates.
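To make the idea concrete, here is a minimal base-R sketch for a single covariate. It is an illustration under stated assumptions, not the package's implementation: the simulated data and all variable names are made up for the example, and the weights are taken as the leading eigenvector of V_R^{-1}V_M, the matrix whose eigenvalues estimatePcev is documented to return.

```r
# Illustrative sketch only: first PCEV for one covariate, in base R
set.seed(1)
n <- 100; p <- 5
x <- rnorm(n)
Y <- matrix(rnorm(n * p), nrow = n)
Y[, 1] <- Y[, 1] + 0.8 * x              # only the first response depends on x

# Split the total variance of Y into residual and model parts
fit <- lm.fit(cbind(1, x), Y)           # regression with an intercept
V_R <- crossprod(fit$residuals)         # residual sums of squares and cross-products
V_T <- crossprod(scale(Y, center = TRUE, scale = FALSE))
V_M <- V_T - V_R                        # part explained by x

# The leading eigenvector of V_R^{-1} V_M gives the component weights
eig    <- eigen(solve(V_R, V_M))
w      <- Re(eig$vectors[, 1])
lambda <- Re(eig$values[1])

pcev_scores    <- drop(Y %*% w)         # the component itself
prop_explained <- lambda / (1 + lambda) # equals w'V_M w / w'(V_M + V_R) w
```

The last quantity is the proportion of variance of the component explained by x, so it should agree with the R-squared from regressing pcev_scores on x.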

pcev functions

estimatePcev computePCEV PcevObj permutePval wilksPval roysPval


Constructor functions for the different pcev objects

Description

PcevClassical, PcevBlock and PcevSingular create the pcev objects from the provided data that are necessary to compute the PCEV according to the user's parameters.

Usage

PcevClassical(response, covariate, confounder)

PcevBlock(response, covariate, confounder)

PcevSingular(response, covariate, confounder)

Arguments

response

A matrix of response variables.

covariate

A matrix or a data frame of covariates.

confounder

A matrix or a data frame of confounders.

Value

A pcev object, of the class that corresponds to the estimation method. These objects are lists that contain the data necessary for computation.

See Also

estimatePcev, computePCEV


Principal Component of Explained Variance

Description

computePCEV computes the first PCEV and tests its significance.

Usage

computePCEV(response, covariate, confounder, estimation = c("all", "block",
  "singular"), inference = c("exact", "permutation"), index = "adaptive",
  shrink = FALSE, nperm = 1000, Wilks = FALSE)

Arguments

response

A matrix of response variables.

covariate

An array or a data frame of covariates.

confounder

An array or data frame of confounders.

estimation

Character string specifying which estimation method to use: "all", "block" or "singular". Default value is "all".

inference

Character string specifying which inference method to use: "exact" or "permutation". Default value is "exact".

index

Only used if estimation = "block". Default value is "adaptive". See details.

shrink

Should we use a shrinkage estimate of the residual variance? Default value is FALSE.

nperm

The number of permutations to perform if inference = "permutation" or for the Tracy-Widom empirical estimate (if estimation = "singular").

Wilks

Should we use a Wilks test instead of Roy's largest root test? This is only implemented for a single covariate and with estimation = "all".

Details

This is the main function. It computes the PCEV using either the classical method, the block approach, or the singular approach. A p-value is also computed, testing the significance of the PCEV.

The p-value is computed using either a permutation approach or an exact test. The implemented exact tests use Wilks' Lambda (only for a single covariate) or Roy's Largest Root. The latter uses Johnstone's approximation to the null distribution. Note that for the block approach, only p-values obtained from a permutation procedure are available.

When estimation = "singular", the p-value is computed using a heuristic: using the method of moments and a small number of permutations (i.e. 25), a location-scale family of the Tracy-Widom distribution of order 1 is fitted to the null distribution. This fitted distribution is then used to compute p-values.
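The moment-matching step of this heuristic can be sketched as follows. This is an illustration under assumptions: fit_tw_location_scale is a hypothetical helper, the null statistics are placeholder numbers rather than real permutation output, and the two constants are the known mean and variance of the standard Tracy-Widom distribution of order 1. The tail probability uses ptw from the RMTstat package, which the package imports.

```r
# Mean and variance of the standard Tracy-Widom distribution of order 1
TW1_MEAN <- -1.2065335745820
TW1_VAR  <-  1.607781034581

# Hypothetical helper: choose mu and sigma so that mu + sigma * TW1
# has the same mean and variance as the permuted statistics
fit_tw_location_scale <- function(null_stats) {
  sigma <- sqrt(var(null_stats) / TW1_VAR)
  mu    <- mean(null_stats) - sigma * TW1_MEAN
  list(mu = mu, sigma = sigma)
}

# Placeholder for the statistics from 25 permuted datasets (not real PCEV output)
set.seed(42)
null_stats <- rnorm(25, mean = 3, sd = 0.5)
observed   <- 4.5

fit <- fit_tw_location_scale(null_stats)

# Upper-tail p-value from the fitted distribution (requires RMTstat)
if (requireNamespace("RMTstat", quietly = TRUE)) {
  p_value <- RMTstat::ptw((observed - fit$mu) / fit$sigma, beta = 1,
                          lower.tail = FALSE)
  print(p_value)
}
```

Only two moments are estimated, which is why a small number of permutations can be enough for this step.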

When estimation = "block", there are three different ways of specifying the blocks: 1) If index is a vector of the same length as the number of columns in response, then it is used to match each response to a block. 2) If index is a single positive integer, it is understood as the number of blocks, and each response is matched to a block randomly. 3) If index = "adaptive" (the default), the number of blocks is chosen so that there are about n/2 responses per block, and each response is matched to a block randomly. All other values of index result in an error.
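The three cases can be sketched in base R as below. assign_blocks is a hypothetical helper written for this illustration, not a function exported by the package; in particular, the rounding rule used for the "adaptive" case is an assumption.

```r
# Hypothetical helper mirroring the three documented cases for `index`;
# p is the number of responses, n the number of observations
assign_blocks <- function(index, p, n) {
  if (is.character(index) && length(index) == 1 && index == "adaptive") {
    # case 3: about n/2 responses per block (rounding rule is an assumption)
    n_blocks <- max(1, ceiling(p / (n / 2)))
    return(sample(rep_len(seq_len(n_blocks), p)))
  }
  if (is.numeric(index) && length(index) == p) {
    return(index)                                  # case 1: explicit labels
  }
  if (is.numeric(index) && length(index) == 1 &&
      index >= 1 && index == round(index)) {
    return(sample(rep_len(seq_len(index), p)))     # case 2: number of blocks
  }
  stop("invalid value for 'index'")                # everything else
}

set.seed(1)
blocks <- assign_blocks("adaptive", p = 200, n = 40)
table(blocks)   # 10 blocks of 20 responses each
```

With 200 responses and 40 observations, the target of n/2 = 20 responses per block gives 10 blocks.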

Value

An object of class Pcev containing the first PCEV, the p-value, the estimate of the shrinkage factor, etc.

See Also

estimatePcev

Examples

set.seed(12345)
Y <- matrix(rnorm(100*20), nrow=100)
X <- rnorm(100)
pcev_out <- computePCEV(Y, X)
pcev_out2 <- computePCEV(Y, X, shrink = TRUE)

Estimation of PCEV

Description

estimatePcev estimates the PCEV.

Usage

estimatePcev(pcevObj, ...)

## Default S3 method:
estimatePcev(pcevObj, ...)

## S3 method for class 'PcevClassical'
estimatePcev(pcevObj, shrink, index, ...)

## S3 method for class 'PcevBlock'
estimatePcev(pcevObj, shrink, index, ...)

## S3 method for class 'PcevSingular'
estimatePcev(pcevObj, shrink, index, ...)

Arguments

pcevObj

A pcev object of class PcevClassical, PcevBlock or PcevSingular.

...

Extra parameters.

shrink

Should we use a shrinkage estimate of the residual variance?

index

If pcevObj is of class PcevBlock, index is a vector describing the block to which individual response variables correspond.

Value

A list containing the variance components, the first PCEV, the eigenvalues of V_R^{-1}V_M, and the estimate of the shrinkage parameter \rho.

See Also

computePCEV


Methylation values around BLK gene

Description

A dataset containing methylation values for cell-separated samples. The methylation was measured using bisulfite sequencing. The data also contains the genomic position of these CpG sites, as well as a binary phenotype (i.e. whether the sample comes from a B cell).

Usage

methylation

pheno

position

index

pheno2

position2

methylation2

Format

The data comes in seven objects:

methylation

Matrix of methylation values at 5,986 sites measured on 40 samples

pheno

Vector of phenotype, indicating whether the sample comes from a B cell

position

Data frame recording the position of each CpG site along the chromosome

index

Index vector used in the computation of PCEV-block

methylation2

Matrix of methylation values at 1000 sites measured on 40 samples

pheno2

Vector of phenotype, indicating the cell type of the sample (B cell, T cell, or Monocyte)

position2

Data frame recording the position of each CpG site along the chromosome

Details

Methylation was first measured at 24,068 sites, on 40 samples. Filtering was performed to keep the 25% most variable sites. See the vignette for more detail.

A second sample of the methylation dataset was extracted. This second dataset contains methylation values at 1000 CpG dinucleotides.

Source

Tomi Pastinen, McGill University, Genome Quebec.


Permutation p-value

Description

Computes a p-value using a permutation procedure.

Usage

permutePval(pcevObj, ...)

## Default S3 method:
permutePval(pcevObj, ...)

## S3 method for class 'PcevClassical'
permutePval(pcevObj, shrink, index, nperm, ...)

## S3 method for class 'PcevBlock'
permutePval(pcevObj, shrink, index, nperm, ...)

## S3 method for class 'PcevSingular'
permutePval(pcevObj, shrink, index, nperm, ...)

Arguments

pcevObj

A pcev object of class PcevClassical, PcevBlock or PcevSingular.

...

Extra parameters.

shrink

Should we use a shrinkage estimate of the residual variance?

index

If pcevObj is of class PcevBlock, index is a vector describing the block to which individual response variables correspond.

nperm

The number of permutations to perform.
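As a rough illustration of the general idea, a permutation p-value can be sketched in base R as follows. perm_pvalue and largest_root are hypothetical helpers, not package functions, and both the choice to permute the covariate and the add-one correction are assumptions made for this sketch; the package's own procedure may differ.

```r
# Hypothetical helper: permutation p-value for any statistic of (Y, x)
perm_pvalue <- function(Y, x, statistic, nperm = 1000) {
  observed <- statistic(Y, x)
  # shuffling x breaks any association with Y, which simulates the null
  null <- replicate(nperm, statistic(Y, sample(x)))
  # add-one correction so the estimate is never exactly zero (assumption)
  (1 + sum(null >= observed)) / (nperm + 1)
}

# Example statistic: largest eigenvalue of V_R^{-1} V_M for one covariate
largest_root <- function(Y, x) {
  V_R <- crossprod(lm.fit(cbind(1, x), Y)$residuals)
  V_T <- crossprod(scale(Y, center = TRUE, scale = FALSE))
  max(Re(eigen(solve(V_R, V_T - V_R), only.values = TRUE)$values))
}

set.seed(2018)
Y <- matrix(rnorm(60 * 4), nrow = 60)
x <- rnorm(60)
Y[, 1] <- Y[, 1] + x                 # strong dependence on x
p_perm <- perm_pvalue(Y, x, largest_root, nperm = 200)
p_perm
```

The smallest value this estimate can take is 1 / (nperm + 1), so nperm bounds the resolution of the p-value.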


Roy's largest root exact test

Description

In the classical domain of PCEV applicability, this function uses Johnstone's approximation to the null distribution of Roy's Largest Root statistic. It uses a location-scale variant of the Tracy-Widom distribution of order 1.

Usage

roysPval(pcevObj, ...)

## Default S3 method:
roysPval(pcevObj, ...)

## S3 method for class 'PcevClassical'
roysPval(pcevObj, shrink, index, ...)

## S3 method for class 'PcevSingular'
roysPval(pcevObj, shrink, index, nperm, ...)

## S3 method for class 'PcevBlock'
roysPval(pcevObj, shrink, index, ...)

Arguments

pcevObj

A pcev object of class PcevClassical or PcevBlock

...

Extra parameters.

shrink

Should we use a shrinkage estimate of the residual variance?

index

If pcevObj is of class PcevBlock, index is a vector describing the block to which individual response variables correspond.

nperm

Number of permutations for Tracy-Widom empirical estimate.

Details

Note that if shrink is set to TRUE, the location-scale parameters are estimated using a small number of permutations.


Wilks' lambda exact test

Description

Computes a p-value using Wilks' Lambda.

Usage

wilksPval(pcevObj, ...)

## Default S3 method:
wilksPval(pcevObj, ...)

## S3 method for class 'PcevClassical'
wilksPval(pcevObj, shrink, index, ...)

## S3 method for class 'PcevSingular'
wilksPval(pcevObj, shrink, index, ...)

## S3 method for class 'PcevBlock'
wilksPval(pcevObj, shrink, index, ...)

Arguments

pcevObj

A pcev object of class PcevClassical or PcevBlock

...

Extra parameters.

shrink

Should we use a shrinkage estimate of the residual variance?

index

If pcevObj is of class PcevBlock, index is a vector describing the block to which individual response variables correspond.

Details

The null distribution of this test statistic is only known in the case of a single covariate, and therefore this is the only case implemented.
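For the single-covariate case, the statistic and its exact F transformation can be sketched in base R. This is a textbook calculation shown for illustration, with simulated data and made-up variable names; it is not taken from the package source.

```r
set.seed(7)
n <- 50; p <- 3
x <- rnorm(n)
Y <- matrix(rnorm(n * p), nrow = n)

# Wilks' Lambda = det(residual SSCP) / det(total SSCP)
E     <- crossprod(lm.fit(cbind(1, x), Y)$residuals)
T_mat <- crossprod(scale(Y, center = TRUE, scale = FALSE))
wilks_lambda <- det(E) / det(T_mat)

# With one covariate, this transformation follows an exact F(p, n - p - 1)
f_stat  <- (1 - wilks_lambda) / wilks_lambda * (n - p - 1) / p
p_value <- pf(f_stat, df1 = p, df2 = n - p - 1, lower.tail = FALSE)
```

The result can be cross-checked against summary(manova(Y ~ x), test = "Wilks") from the stats package, which reports the same statistic.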