Version: | 0.13.0 |
Depends: | R (≥ 3.2.1), R.utils (≥ 2.11.0), aroma.core (≥ 3.2.2) |
Imports: | utils, MASS, R.methodsS3 (≥ 1.8.1), R.oo (≥ 1.24.0), matrixStats (≥ 0.61.0), R.filesets (≥ 2.14.0) |
Suggests: | DNAcopy |
Title: | Improved Allele-Specific Copy Number of SNP Microarrays for Downstream Segmentation |
Description: | The CalMaTe method calibrates preprocessed allele-specific copy number estimates (ASCNs) from DNA microarrays by controlling for single-nucleotide polymorphism-specific allelic crosstalk. The resulting ASCNs are on average more accurate, which increases the power of segmentation methods for detecting changes between copy number states in tumor studies including copy neutral loss of heterozygosity. CalMaTe applies to any ASCNs regardless of preprocessing method and microarray technology, e.g. Affymetrix and Illumina. |
License: | LGPL-2.1 | LGPL-3 [expanded from: LGPL (≥ 2.1)] |
URL: | https://github.com/HenrikBengtsson/calmate/ |
BugReports: | https://github.com/HenrikBengtsson/calmate/issues |
LazyLoad: | TRUE |
biocViews: | aCGH, CopyNumberVariants, SNP, Microarray, OneChannel, TwoChannel, Genetics |
NeedsCompilation: | no |
Packaged: | 2022-03-08 19:16:14 UTC; hb |
Author: | Maria Ortiz [aut, ctb], Ander Aramburu [ctb], Henrik Bengtsson [aut, cre, cph], Pierre Neuvial [aut, ctb], Angel Rubio [aut, ctb] |
Maintainer: | Henrik Bengtsson <henrikb@braju.com> |
Repository: | CRAN |
Date/Publication: | 2022-03-08 23:00:03 UTC |
Package calmate
Description
The CalMaTe method calibrates preprocessed allele-specific copy number estimates (ASCNs) from DNA microarrays by controlling for single-nucleotide polymorphism-specific allelic crosstalk. The resulting ASCNs are on average more accurate, which increases the power of segmentation methods for detecting changes between copy number states in tumor studies including copy neutral loss of heterozygosity. CalMaTe applies to any ASCNs regardless of preprocessing method and microarray technology, e.g. Affymetrix and Illumina.
Requirements
This package depends on a set of packages that are all available via CRAN. It has been tested and verified to run on all common operating systems on which R runs, including Linux, Windows and OSX.
Installation and updates
To install this package, do install.packages("calmate")
.
To get started
To process SNP and non-polymorphic signals, see
calmateByTotalAndFracB
(). If you are working solely with SNP signals,calmateByThetaAB
() is also available, but we recommend the former.For processing data in the aroma framework, see
CalMaTeCalibration
.
How to cite
Please cite [1] when using CalMaTe.
License
LGPL (>= 2.1).
Author(s)
Maria Ortiz [aut, ctb], Ander Aramburu [ctb], Henrik Bengtsson [aut, cre, cph], Pierre Neuvial [aut, ctb], Angel Rubio [aut, ctb].
References
[1] M. Ortiz-Estevez, A. Aramburu, H. Bengtsson, P. Neuvial and A. Rubio, CalMaTe: A method and software to improve allele-specific copy number of SNP arrays for downstream segmentation, Bioinformatics, 2012 [PMC3381965].
The CalMaTeCalibration class
Description
Package: calmate
Class CalMaTeCalibration
Object
~~|
~~+--
ParametersInterface
~~~~~~~|
~~~~~~~+--
CalMaTeCalibration
Directly known subclasses:
public static class CalMaTeCalibration
extends ParametersInterface
This class represents the CalMaTe method [1], which corrects for SNP effects in allele-specific copy-number estimates (ASCNs).
Usage
CalMaTeCalibration(data=NULL, tags="*", references=NULL, flavor=c("v2", "v1"), ...)
Arguments
data |
A named |
tags |
Tags added to the output data sets. |
references |
An optional |
flavor |
A |
... |
Additional arguments passed to |
Fields and Methods
Methods:
findUnitsTodo | - | |
getDataSets | - | |
getFullName | - | |
getName | - | |
getOutputDataSets | - | |
getPath | - | |
getReferences | - | |
getRootPath | - | |
getTags | - | |
nbrOfFiles | - | |
process | - | |
setTags | - | |
Methods inherited from ParametersInterface:
getParameterSets, getParameters, getParametersAsString
Methods inherited from Object:
$, $<-, [[, [[<-, as.character, attach, attachLocally, clearCache, clearLookupCache, clone, detach, equals, extend, finalize, getEnvironment, getFieldModifier, getFieldModifiers, getFields, getInstantiationTime, getStaticInstance, hasField, hashCode, ll, load, names, objectSize, print, save, asThis
Reference samples
In order to estimate the calibration parameters, the model assumes that, for any given SNP, there are a majority of samples that are diploid at that SNP. Note that it does not have to be the same set of samples for all SNPs.
By using argument references
, it is possible so specify which
samples should be used when estimating the calibration parameters.
This is useful when for instance there are several tumor samples with
unknown properties as well as a set of normal samples that can be
assumed to be diploid.
Theoretical, a minimum of three reference samples are needed in order for the model to be identifiable. If less, an error is thrown. However, in practice more reference samples should be used, that is, in the order of at least 6-10 reference samples with a diverse set of genotypes.
Flavors
For backward compatibility, we try to keep all major versions of
the CalMaTe algorithm available. Older versions can be used by
specifying argument flavor
.
For more information about the different flavors,
see fitCalMaTeInternal
.
References
[1] M. Ortiz-Estevez, A. Aramburu, H. Bengtsson, P. Neuvial and A. Rubio, CalMaTe: A method and software to improve allele-specific copy number of SNP arrays for downstream segmentation, Bioinformatics, 2012 [PMC3381965].
See Also
Low-level versions of the CalMaTe method is available
via the calmateByThetaAB
() and
calmateByTotalAndFracB
() methods.
For further information on the internal fit functions, see
fitCalMaTeInternal
.
Examples
## Not run:
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# CRMAv2 - Preprocess raw Affymetrix data
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
library("aroma.affymetrix"); # Needed for CRMAv2
dataSet <- "Affymetrix_2006-TumorNormal";
chipType <- "Mapping250K_Nsp";
dsList <- doCRMAv2(dataSet, chipType=chipType, combineAlleles=FALSE,
plm="RmaCnPlm", verbose=-10);
print(dsList);
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# CalMaTe - Post-calibration of ASCNs estimates
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
asn <- CalMaTeCalibration(dsList);
print(asn);
# For speed issues, we will here only process loci on Chromosome 17.
chr <- 17;
ugp <- getAromaUgpFile(dsList$total);
units <- getUnitsOnChromosome(ugp, chr);
dsNList <- process(asn, units=units, verbose=verbose);
print(dsNList);
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Plot allele B fractions (before and after)
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Sample #1 and Chromosome 17
ii <- 1;
# Extract raw (TCN,BAF)
df <- getFile(dsList$total, ii);
dfR <- getAverageFile(dsList$total, verbose=verbose);
gamma <- extractRawCopyNumbers(df, logBase=NULL, chromosome=chr);
gammaR <- extractRawCopyNumbers(dfR, logBase=NULL, chromosome=chr);
gamma <- 2*divideBy(gamma, gammaR);
df <- getFile(dsList$fracB, ii);
beta <- extractRawAlleleBFractions(df, chromosome=chr);
# Extract calibrated (TCN,BAF)
dfN <- getFile(dsNList$fracB, ii);
betaN <- extractRawAlleleBFractions(dfN, chromosome=chr);
dfN <- getFile(dsNList$total, ii);
gammaN <- extractRawCopyNumbers(dfN, logBase=NULL, chromosome=chr);
# Plot
subplots(4, ncol=2, byrow=FALSE);
plot(beta);
title(sprintf("%s", getName(beta)));
plot(gamma);
plot(betaN);
title(sprintf("%s (CalMaTe)", getName(betaN)));
plot(gammaN);
## End(Not run)
Non-documented objects
Description
This page contains aliases for all "non-documented" objects that
R CMD check
detects in this package.
Almost all of them are generic functions that have specific
document for the corresponding method coupled to a specific class.
Other functions are re-defined by setMethodS3()
to
default methods. Neither of these two classes are non-documented
in reality.
The rest are deprecated methods.
Normalize allele-specific copy numbers (CA,CB)
Description
Normalize allele-specific copy numbers (CA,CB).
Usage
## S3 method for class 'array'
calmateByThetaAB(data, references=NULL, ..., truncate=FALSE, refAvgFcn=NULL,
flavor=c("v2", "v1"), verbose=FALSE)
Arguments
data |
An Jx2xI |
references |
An index |
... |
Additional arguments passed to the internal fit function
|
truncate |
If |
refAvgFcn |
(optional) A |
flavor |
A |
verbose |
See |
Value
Returns an Jx2xI numeric
array
with the same dimension names as argument data
.
Flavors
For backward compatibility, we try to keep all major versions of
the CalMaTe algorithm available. Older versions can be used by
specifying argument flavor
.
The default flavor is v2
.
For more information about the different flavors,
see fitCalMaTeInternal
.
References
[1] M. Ortiz-Estevez, A. Aramburu, H. Bengtsson, P. Neuvial and A. Rubio, CalMaTe: A method and software to improve allele-specific copy number of SNP arrays for downstream segmentation, Bioinformatics, 2012 [PMC3381965].
See Also
To calibrate (total,fracB) data,
see *calmateByTotalAndFracB()
.
We strongly recommend to always work with (total,fracB) data
instead of (CA,CB) data, because it is much more general.
For further information on the internal fit functions, see
fitCalMaTeInternal
.
Examples
# Load example (thetaA,thetaB) signals
path <- system.file("exData", package="calmate");
theta <- loadObject("thetaAB,100x2x40.Rbin", path=path);
# Calculate (CA,CB)
thetaR <- matrixStats::rowMedians(theta[,"A",] + theta[,"B",], na.rm=TRUE);
C <- 2*theta/thetaR;
# Calibrate (CA,CB) by CalMaTe
CC <- calmateByThetaAB(theta);
# Plot two "random" arrays
Clim <- c(0,4);
subplots(4, ncol=2, byrow=FALSE);
for (ii in c(1,5)) {
sampleName <- dimnames(C)[[3]][ii];
sampleLabel <- sprintf("Sample #%d ('%s')", ii, sampleName);
plot(C[,,ii], xlim=Clim, ylim=Clim);
title(main=sampleLabel);
plot(CC[,,ii], xlim=Clim, ylim=Clim);
title(main=sprintf("%s\ncalibrated", sampleLabel));
}
Normalize allele-specific copy numbers (total,fracB)
Description
Normalize allele-specific copy numbers (total,fracB), where total is the total (non-polymorphic) signal and
fracB is the allele B fraction.
It is only loci with a non-missing (NA
) fracB value that are
considered to be SNPs and normalized by CalMaTe. The other loci
are left untouched.
Usage
## S3 method for class 'array'
calmateByTotalAndFracB(data, references=NULL, ..., refAvgFcn=NULL, verbose=FALSE)
Arguments
data |
An Jx2xI |
references |
A |
... |
Additional arguments passed to |
refAvgFcn |
(optional) A |
verbose |
See |
Value
Returns an Jx2xI numeric
array
with the same dimension names as argument data
.
References
[1] M. Ortiz-Estevez, A. Aramburu, H. Bengtsson, P. Neuvial and A. Rubio, CalMaTe: A method and software to improve allele-specific copy number of SNP arrays for downstream segmentation, Bioinformatics, 2012 [PMC3381965].
See Also
To calibrate (thetaA,thetaB) or (CA,CB) signals,
see *calmateByThetaAB()
.
Examples
# Load example (thetaA,thetaB) signals
path <- system.file("exData", package="calmate");
theta <- loadObject("thetaAB,100x2x40.Rbin", path=path);
# Transform to (total,fracB) signals
data <- thetaAB2TotalAndFracB(theta);
# Calibrate (total,fracB) by CalMaTe
dataC <- calmateByTotalAndFracB(data);
# Calculate copy-number ratios
theta <- data[,"total",];
thetaR <- matrixStats::rowMedians(theta, na.rm=TRUE);
data[,"total",] <- 2*theta/thetaR;
# Plot two "random" arrays
Clim <- c(0,4);
Blim <- c(0,1);
subplots(4, ncol=2, byrow=FALSE);
for (ii in c(1,5)) {
sampleName <- dimnames(data)[[3]][ii];
sampleLabel <- sprintf("Sample #%d ('%s')", ii, sampleName);
plot(data[,,ii], xlim=Clim, ylim=Blim);
title(main=sampleLabel);
plot(dataC[,,ii], xlim=Clim, ylim=Blim);
title(main=sprintf("%s\ncalibrated", sampleLabel));
}
# Assert that it also works with a single unit
dummy <- calmateByTotalAndFracB(data[1,,,drop=FALSE]);
stopifnot(length(dim(dummy)) == 3);
Calibrates SNP loci according to the CalMaTe method
Description
Calibrates SNP loci according to the CalMaTe method. Note: This is an internal function of the package, which is kept only kept to provide easy access to the internal fit functions. It it actually not elsewhere in the package, and should nor by others.
Usage
## S3 method for class 'matrix'
fitCalMaTe(dataT, references, flavor=c("v2", "v1"), ...)
Arguments
dataT |
A 2xI |
references |
A |
... |
Additional arguments passed to the internal fit functions. |
flavor |
A |
Value
Returns a 2xI numeric
matrix
of calibrated ASCNs.
Flavors
For backward compatibility, we try to keep all major versions of
the CalMaTe algorithm available. Older versions can be used by
specifying argument flavor
.
For more information about the different flavors,
see fitCalMaTeInternal
.
References
[1] M. Ortiz-Estevez, A. Aramburu, H. Bengtsson, P. Neuvial and A. Rubio, CalMaTe: A method and software to improve allele-specific copy number of SNP arrays for downstream segmentation, Bioinformatics, 2012 [PMC3381965].
See Also
For further information on the internal fit functions,
see fitCalMaTeInternal
.
Normalizes non-polymorphic copy number loci according to the CalMaTe method
Description
Normalizes non-polymorphic copy number loci according to the CalMaTe method.
Usage
## S3 method for class 'matrix'
fitCalMaTeCNprobes(dataT, references, ...)
Arguments
dataT |
A JxI |
references |
A |
... |
Not used. |
Value
Returns a numeric
vector
of length J.
Algorithms to fit the CalMaTe model for a single SNP
Description
Algorithms to fit the CalMaTe model for a single SNP. Note: These are internal functions of the package. They should not be used elsewhere.
Usage
fitCalMaTeV1(dataT, references, fB1=1/3, fB2=2/3, maxIter=50, ...)
fitCalMaTeV2(dataT, references, fB1=1/3, fB2=2/3, maxIter=50, ...)
fitCalMaTeMedians(dataT, references, fB1=1/3, fB2=2/3,...)
Arguments
dataT |
A 2xI |
references |
A |
fB1 , fB2 |
Thresholds for calling genotypes AA, AB, BB from the allele B fractions. |
maxIter |
The maximum number of iterations without converging before the algorithm quits. |
... |
Not used. |
Value
Returns a 2xI numeric
matrix
of calibrated ASCNs.
Flavor v1
This is an early version (June 2010-January 2012) of the algorithm described in [1].
Flavor v2
This is the model and algorithm described in [1].
This version was introduced to decrease the number of "artificial outliers" introduced by CalMaTe for some SNPs due to non-converging or wreak-havoc estimates of the SNP effects. Flavor v2 differ from Flavor v1 as follows:
The estimation of the model parameters are now done solely based on reference samples. In previous versions, some of the initial estimation steps were using also non-reference samples.
For a small number of SNPs, the main CalMaTe scheme for estimating parameters would not converge or converge poorly. For such SNPs CalMaTe now falls back to using a plain median estimator, i.e.
fitCalMaTeMedians()
.The above fallback estimator is also used in cases where all samples are identified to be homozygous.
The fitCalMaTeMedians()
method is used as a fallback method
by fitCalMaTeV2()
. It fits CalMaTe without using the
rlm
function.
References
[1] M. Ortiz-Estevez, A. Aramburu, H. Bengtsson, P. Neuvial and A. Rubio, CalMaTe: A method and software to improve allele-specific copy number of SNP arrays for downstream segmentation, Bioinformatics, 2012 [PMC3381965].
See Also
These functions are called by calmateByThetaAB
().
Converts an Jx2xI array between (thetaA,thetaB) and (total,fracB) formats
Description
Converts an Jx2xI array between (thetaA,thetaB) and (total,fracB) formats.
Usage
## S3 method for class 'array'
thetaAB2TotalAndFracB(data, ..., verbose=FALSE)
Arguments
data |
An Jx2xI |
... |
Not used. |
verbose |
See |
Value
Returns an Jx2xI numeric
array.
Examples
# Load example (thetaA,thetaB) signals
path <- system.file("exData", package="calmate");
theta <- loadObject("thetaAB,100x2x40.Rbin", path=path);
data <- thetaAB2TotalAndFracB(theta);
str(data);
theta2 <- totalAndFracB2ThetaAB(data);
str(theta2);
stopifnot(all.equal(theta2, theta));