Type: | Package |
Title: | Estimates Allele Frequency on qPCR DeltaDeltaCq from Bulk Samples |
Version: | 0.4.0 |
Maintainer: | Masaaki Sudo <masaaki@sudori.info> |
Description: | Interval estimation of the population allele frequency from qPCR analysis based on the restriction enzyme digestion (RED)-DeltaDeltaCq method (Osakabe et al. 2017, <doi:10.1016/j.pestbp.2017.04.003>), as well as general DeltaDeltaCq analysis. Compatible with the Cq measurement of DNA extracted from multiple individuals at once, so called "group-testing", this model assumes that the quantity of DNA extracted from an individual organism follows a gamma distribution. Therefore, the point estimate is robust regarding the uncertainty of the DNA yield. |
URL: | https://github.com/sudoms/freqpcr |
Depends: | R (≥ 3.6), cubature |
Imports: | methods |
License: | GPL (≥ 3) |
Encoding: | UTF-8 |
RoxygenNote: | 7.1.1 |
Suggests: | knitr, rmarkdown, remotes |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2022-01-26 10:57:50 UTC; masaa |
Author: | Masaaki Sudo |
Repository: | CRAN |
Date/Publication: | 2022-01-27 10:30:04 UTC |
The freqpcr package
Description
Allele Frequency Estimation on qPCR \Delta\Delta
Cq Values from Bulk Samples
Author(s)
Maintainer: Masaaki Sudo masaaki@sudori.info (ORCID)
See Also
Useful links:
Log-likelihood of obtaining Cq values under given parameter set.
Description
The internal function is called from the optimizer, nlm()
, running in freqpcr()
. It defines the log-likelihood by obtaining the two \Delta
Cq values (differences in the four Cq measurements) provided that the allele mixing ratio for each bulk sample is given together with other parameters. This function is vectorized over multiple bulk samples.
Usage
.freqpcr_loglike(
X,
N,
DCW,
DCD,
zeroAmount,
para.fixed = NULL,
beta = TRUE,
diploid = FALSE,
dummyDCW = FALSE
)
Arguments
X |
Numeric vector that stores the parameter values to be optimized via |
N |
Sample sizes as a numeric vector. |
DCW , DCD |
Numeric vectors having the same length as |
zeroAmount |
(In RED- |
para.fixed |
Named numeric vector that stores the fixed parameters inherited from |
beta |
Whether to use the beta distribution to approximate the sample allele ratio instead of specifying individual gamma distribution for each of the allelic DNA amounts? Default is |
diploid |
Is the target organism diploidy? Default is |
dummyDCW |
Whether the |
Value
Scalar of the log likelihood.
Log-likelihood when sample allele ratio is continuous.
Description
Called from freqpcr()
instead of .freqpcr_loglike()
when the model is ‘continuous’. This function assumes that each sample does not consist of n
individual organisms with certain genotypes, but the result of a direct DNA extraction from the sub-population having the allele ratio around p:(1-p)
. Each sample allele ratio is considered to follow Beta(apk, a(1-p)k)
, where a
and k
are the relative DNA content of the sample and the gamma shape parameter, respectively.
Usage
.freqpcr_loglike_cont(
X,
A,
DCW,
DCD,
zeroAmount,
para.fixed = NULL,
beta = TRUE,
dummyDCW = FALSE
)
Arguments
X |
Numeric vector that stores the parameter values to be optimized via |
A |
Relative DNA content between the samples. A continuous version of |
DCW , DCD |
Numeric vectors. They store the measured values of the two |
zeroAmount |
(In RED- |
para.fixed |
Named numeric vector that stores the fixed parameters inherited from |
beta |
Whether to use the beta distribution to approximate the sample allele ratio instead of specifying individual gamma distribution for each of the allelic DNA amounts? Default is |
dummyDCW |
Whether the |
Value
Scalar of the log likelihood.
Integration of likelihood function based on the beta assumption.
Description
Internal function to integrate the likelihood getting the \Delta
Cq value (the argument del
) over the entire range of the allele ratio (0 to 1). Vectorized for multiple bulk samples. Having the same arguments with .integrate_gamma.
Usage
.integrate_beta(
del,
SHR,
SHS,
zeroAmount,
targetScale,
sdMeasure,
xsm = 2,
EPCR,
cubmethod = "hcubature",
relTol = 0.1,
absTol = 1e-08,
maxEval = 10^6
)
Arguments
del |
Numeric vector of the observed |
SHR |
The gamma shape parameters for the mutant (R) portion of the bulk samples. Should be the same vector length as del. Each element of SHR is defined as K*(the assumed number of R allele in the bulk sample: 1, 2, 3, ..., n-1). |
SHS |
The gamma shape parameters for the wild (S) portion of the bulk samples. Should be the same length as del. Each element of SHS is defined as K*(the assumed number of S allele in the bulk sample). |
zeroAmount |
(In RED- |
targetScale |
( |
sdMeasure |
( |
xsm |
Specify the accumulation of the standard deviation of the Cq measuring errors when the *-Cq values are fed as difference. For |
EPCR |
( |
cubmethod |
Cubature method passed to the integrator function. See the section "Methods for cubintegrate". |
relTol |
The maximum tolerance passed to the cubature method. Though the default of cubature::cubintegrate function is 1e-5, the accuracy is reduced here to acceralate the integration. |
absTol |
The absolute tolerance passed to the cubature method. The default is 1e-8, which is less accurate than the default of cubintegrate function (1e-12) but considered enough for the estimation. |
maxEval |
Maximum number of function evaluations needed. The default is 10^6, which is same as the cubintegrate default. |
Value
A numeric vector of marginal likelihoods having the same length as del
.
Methods for cubintegrate
The following methods are available for cubintegrate()
: cubmethod = c("hcubature", "pcubature", "cuhre", "divonne", "suave", "vegas").
hcubature
is considerably fast, but less accurate with larger reltol
. cuhre
is moderately fast and the most accurate in most relTol
range. If you can wait longer, hcubature
with relTol = 1e-4
or cuhre
with relTol = 1e-1
is recommended.
At reltol = 1e-1
, hcubature
is three times faster than cuhre
, but the log-likelihood fluctuate by 0.1 (by 0.001 in cuhre
).
The speed and accuracy of hcubature
at reltol = 1e-5
is comparable with cuhre
at reltol = 1e-1
.
pcubature
and divonne
frequently returns NaN
and are not recommended.
suave
and vegas
are Monte Carlo integration and slow. They are certainly accurate even at reltol = 1
, but interior to cuhre
with same reltol
.
See Also
Other integrators:
.integrate_gamma()
Double integration of likelihood over the two DNA quantities obeying the gamma distributions.
Description
Internal function to integrate the likelihood getting \Delta
Cq value (the argument del
) over the entire range of the DNA quantities of the two alleles, 0 <= x_S < Inf and 0 <= x_R < Inf. Vectorized for multiple bulk samples. It shares the arguments with .integrate_beta.
Usage
.integrate_gamma(
del,
SHR,
SHS,
zeroAmount,
targetScale,
sdMeasure,
xsm = 2,
EPCR,
cubmethod = "hcubature",
relTol = 0.1,
absTol = 1e-08,
maxEval = 10^6
)
Arguments
del |
Numeric vector of the observed |
SHR |
The gamma shape parameters for the mutant (R) portion of the bulk samples. Should be the same vector length as del. Each element of SHR is defined as K*(the assumed number of R allele in the bulk sample: 1, 2, 3, ..., n-1). |
SHS |
The gamma shape parameters for the wild (S) portion of the bulk samples. Should be the same length as del. Each element of SHS is defined as K*(the assumed number of S allele in the bulk sample). |
zeroAmount |
(In RED- |
targetScale |
( |
sdMeasure |
( |
xsm |
Specify the accumulation of the standard deviation of the Cq measuring errors when the *-Cq values are fed as difference. For |
EPCR |
( |
cubmethod |
Cubature method passed to the integrator function. See the section "Methods for cubintegrate". |
relTol |
The maximum tolerance passed to the cubature method. Though the default of cubature::cubintegrate function is 1e-5, the accuracy is reduced here to acceralate the integration. |
absTol |
The absolute tolerance passed to the cubature method. The default is 1e-8, which is less accurate than the default of cubintegrate function (1e-12) but considered enough for the estimation. |
maxEval |
Maximum number of function evaluations needed. The default is 10^6, which is same as the cubintegrate default. |
Value
A numeric vector of marginal likelihoods having the same length as del
.
See Also
Other integrators:
.integrate_beta()
Log-likelihood of getting Cq values when exact allele mixing ratios are known.
Description
Internal function to return the log-likelihood getting the four Cq measurements under true allele frequency for each bulk sample is known.
Usage
.knownqpcr_loglike(X, A, trueY, Digest, Gene, Cq)
Arguments
X |
A numeric vector that stores the current parameter sizes of |
A |
Optionally, you can specify relative DNA content between the samples, as a numeric vector having the same length as the Cq data. If present, |
trueY |
A numeric vector having the same length as the Cq data. |
Value
A scalar of the log likelihood.
Log-likelihood of getting Cq values when exact allele mixing ratios are known, lacking the quartet structure.
Description
Internal function to return the log-likelihood getting the four Cq measurements under true allele frequency for each bulk sample is known. This is a variant for the 'duo' structure dataset and baseChange
is absent. obs.housek0
and obs.target0
are defined for the parity of source code, but they actually accept no data.
Usage
.knownqpcr_loglike_duo(X, A, trueY, Digest, Gene, Cq)
Arguments
X |
A numeric vector that stores the current parameter sizes of |
A |
Optionally, you can specify relative DNA content between the samples, as a numeric vector having the same length as the Cq data. If present, |
trueY |
A numeric vector having the same length as the Cq data. |
Value
A scalar of the log likelihood.
Output object of freqpcr()
.
Description
Output object of freqpcr()
.
Slots
report
A matrix of the simultaneous parameter estimation result. The rows represent the parameters
P
,K
,targetScale
(\delta_{T}
),sdMeasure
(\sigma_{c}
), andEPCR
(\eta
).obj
Returned value of the optimizer function
nlm()
as a list.cal.time
Calculation time of
nlm()
, stored as aproc_time
class object.
S4 class storing the dummy Cq data for performance test.
Description
A dummy Cq dataset suitable for a package test, typically obtained as the output of make_dummy()
.
Slots
N
Sample sizes as a numeric vector. The
ntrap
andnpertrap
arguments ofmake_dummy()
are inherited to the length of N and eachN[i]
element, the number of individuals (both for haploidy and diploidy) contained in the ith bulk sample, respectively.m
Segregation ratio. As for haploidy,
m
is a matrix with 2 rows andntrap
columns.m[1, i]
andm[2, i]
stores the number of R (mutant) or S (wild type) individuals whileN[i] = sum(m[, i])
specifies the total in the bulk sample. It has 3 rows andntrap
columns as for diploidy. Whilem[1, i]
stands for the number of RR hogozygote individuals,m[2, i]
andm[3, i]
stand for the numbers of RS heterozygotes and SS homozygotes, respectively.xR,xS
Numeric vector of the same length with N.
xR[i]
stores the amount of the template DNA for R allele contained in the ith bulk sample.housek0,target0,housek1,target1
Numeric vectors of the same lengths with N. Store the generated Cq values.
DCW
\Delta
Cq values measured on the control samples (DNA extract without endonuclease digestion in the RED-\Delta\Delta
Cq method, or pure R solution in a general\Delta\Delta
Cq method),DCW
, is defined as (target0 - housek0
).DCD
\Delta
Cq values measured on the test samples (samples after endonuclease digestion in the RED-\Delta\Delta
Cq method, or samples with unknown allele mixing ratios in a general\Delta\Delta
Cq method),DCD
, is defined as (target1 - housek1
).deldel
\Delta\Delta
Cq value, defined as (DCD - DCW
).RFreqMeasure
A classical index of the allele frequency calculated for each bulk sample, which is defined as
(1.0+EPCR)^(-deldel)
. Note that the values ofEPCR
and other parameters, such asP
orK
, are not recorded in the object to avoid leakage of information.ObsP
As
RFreqMeasure
can exceed 1 by definition,ObsP
is defined asmin(RFreqMeasure, 1)
.rand.seed
The seed of the random-number generator (RNG) which was fed to the current R session to generate dummy
m
,xR
andxS
data.
Estimate population allele frequency from the set of Cq measurements.
Description
The function estimates the population allele frequency using the dataset of Cq values measured over length(N)
bulk samples, each of which has a sample size of N[i]
as the number of individuals included. N[i]
can be 1, meaning that every individual is analyzed separately. For the ith sample, the four Cq values were measured as housek0[i]
, target0[i]
, housek1[i]
, and target1[i]
. The function can estimate up to five parameters simultaneously when the Cq sets are available for more than two (bulk) samples: P
, K
, targetScale
, sdMeasure
, and EPCR
.
Since v0.3.2, user can also use an experimental ‘continuous model’ by specifying A
instead of N
. That is, each sample DNA is directly extracted from the environment and the sample allele ratio y
follows y ~ Beta(apk, a(1-p)k)
instead of y ~ Beta(mk, (n-m)k), m ~ Binomial(n, p)
, where p
and k
are the population allele frequency and the gamma shape parameter of the individual DNA yield, respectively. Each element of A
, a
is a scaling factor of relative DNA contents between the samples. The continuous model is likely when each sample directly comes from the environment e.g., water sampling in an eDNA analysis or cell culture in a petri dish.
Since v0.4.0, freqpcr()
also works without specifying housek0
and target0
i.e., it can estimate population allele frequency from \Delta
Cq values instead of \Delta\Delta
Cq. In this setting, the sizes of targetScale
and sdMeasure
should be fixed in addition to EPCR
and zeroAmount
.
Usage
freqpcr(
N,
A,
housek0,
target0,
housek1,
target1,
P = NULL,
K = NULL,
targetScale = NULL,
sdMeasure = NULL,
EPCR = 0.99,
XInit0 = c(P = NULL, K = NULL, targetScale = NULL, sdMeasure = NULL, EPCR = NULL),
zeroAmount = NULL,
beta = TRUE,
diploid = FALSE,
pvalue = 0.05,
gradtol = 1e-04,
steptol = 1e-09,
iterlim = 100,
maxtime = 600,
print.level = 1,
...
)
Arguments
N |
Sample sizes as a numeric vector. |
A |
Use instead of |
housek0 |
A numeric vector. In RED- |
target0 |
In RED- |
housek1 |
The Cq values of the test sample measured on the housekeeping gene after the restriction enzyme digestion (in RED- |
target1 |
For each test sample with unknown allele-ratio, |
P |
Scalar. Population allele frequency from which the test samples are derived. Default is |
K |
Scalar. The gamma shape parameter of the individual DNA yield. Default is |
targetScale |
( |
sdMeasure |
( |
EPCR |
( |
XInit0 |
Optionally the initial value for the parameter optimization can be specified, but it is strongly recommended to keep the argument as is. Unlike |
zeroAmount |
(In RED- |
beta |
Whether to use the beta distribution to approximate the sample allele ratio instead of specifying individual gamma distribution for each of the allelic DNA amounts? Default is |
diploid |
Is the target organism diploidy? Default is |
pvalue |
The two-sided confidence interval is calculated at the last iteration at given significance level. Default is 0.05, which returns the 95% Wald's CI (2.5 to 97.5 percentile) based on the Hessian matrix. |
gradtol , steptol , iterlim |
Control parameters passed to |
maxtime |
A positive scalar to set the maximum calculation time in seconds to abort the optimizer (and return error). The total calculation time largely depends on |
print.level |
|
... |
Additional arguments passed to the function. |
Value
Object of the S4 class CqFreq. The slot report
is a matrix and each row contains the estimated parameter value with 100*(1-pvalue)% confidence interval. The following parameters are returned:
P
, the population allele frequency from which the test samples are derived.K
, the gamma shape parameter of the individual DNA yield.targetScale
(\delta_{T}
), the relative template DNA amount of the target to the houskeeping loci.EPCR
(\eta
), the amplification efficiency per PCR cycle.sdMeasure
or "Cq measurement error" (\sigma_{c}
).
Choise of the parameters to be estimated
Estimation is conducted only for parameters where the values are not specified or specified explicitly as NULL
. If one feeds a value e.g. K=1
or sdMeasure=0.24
, it is then treated as fixed parameter. Notwithstanding, EPCR
is estimated only when EPCR = NULL
is specified explicitly.
You must verify the size of EPCR
and zeroAmount
beforehand because they are not estimable from the experiments with unknown allele ratios. Although targetScale
and sdMeasure
are estimable within freqpcr()
, it is better to feed the known values, especially when you have only a few bulk samples (length(N) <= 3). Fixing targetScale
and sdMeasure
is also strongly recommended when housek0
and target0
are absent (\Delta
Cq method). The functions knownqpcr()
or knownqpcr_unpaired()
, depending on your data format, provide the procedure to estimate the sizes of the experimental parameters using the DNA solutions of known allele mixing ratios.
For the unknown parameters, XInit0
optionally specifies initial values for the optimization using nlm()
though the use of internal default is strongly recommended. The specification as a fixed parameter has higher priority than the specification in XInit0
. Every user-specified parameter values, fixed or unknown, must be given in linear scale (e.g. between 0 and 1 for the allele frequency); they are transformed internally to log or logit.
See Also
Other estimation procedures:
knownqpcr_unpaired()
,
knownqpcr()
,
sim_dummy()
Examples
# Dummy Cq dataset with six bulk samples, each of which comprises of eight haploids.
EPCR <- 0.95; zeroAmount <- 0.0016; # True values for the two must be known.
P <- 0.75
dmy_cq <- make_dummy( rand.seed=1, P=P, K=2, ntrap=6, npertrap=8,
scaleDNA=1e-07, targetScale=1.5, baseChange=0.3,
EPCR=EPCR, zeroAmount=zeroAmount,
sdMeasure=0.3, diploid=FALSE )
print(dmy_cq)
# Estimation with freqpcr, where P, K, targetScale, and baseChange are marked unknown.
result <- freqpcr( N=dmy_cq@N, housek0=dmy_cq@housek0, target0=dmy_cq@target0,
housek1=dmy_cq@housek1, target1=dmy_cq@target1,
EPCR=EPCR, zeroAmount=zeroAmount, beta=TRUE, print.level=0 )
print(result)
# Estimation with freqpcr, assuming diploidy.
result <- freqpcr( N=dmy_cq@N, housek0=dmy_cq@housek0, target0=dmy_cq@target0,
housek1=dmy_cq@housek1, target1=dmy_cq@target1,
EPCR=EPCR, zeroAmount=zeroAmount, beta=TRUE, diploid=TRUE )
# Estimation where you have knowledge on the size of K.
result <- freqpcr( N=dmy_cq@N, housek0=dmy_cq@housek0, target0=dmy_cq@target0,
housek1=dmy_cq@housek1, target1=dmy_cq@target1,
K=2,
EPCR=EPCR, zeroAmount=zeroAmount, beta=TRUE, print.level=2 )
# (>= v0.3.2)
# Provided the model is continuous (specify A instead of N).
result <- freqpcr( A=dmy_cq@N, housek0=dmy_cq@housek0, target0=dmy_cq@target0,
housek1=dmy_cq@housek1, target1=dmy_cq@target1,
K=2, EPCR=EPCR, zeroAmount=zeroAmount, beta=TRUE, print.level=1 )
# (>= v0.4.0)
# If the dataset lacks control samples (housek0 and target0 are absent).
# Fixing the sizes of targetScale and sdMeasure is strongly recommended.
result <- freqpcr( N=dmy_cq@N, housek1=dmy_cq@housek1, target1=dmy_cq@target1,
K=2, EPCR=EPCR, zeroAmount=zeroAmount,
targetScale=1.5, sdMeasure=0.3, beta=TRUE, print.level=1 )
Estimate auxiliary parameters using samples with known allele ratios.
Description
The function to estimate the auxiliary experimental parameters using DNA solutions, provided the dataset contains samples with multiple allele mixing ratios and the exact mixing ratio are known for each sample. This function is used when all replicates in the dataset comprise the complete observations on the 2 \times 2
combinations of the qPCR conditions in a RED-\Delta\Delta
Cq analysis: (loci for target or housekeeping genes) and (the target gene is undigested or digested with endonuclease). The quartet of the four Cq data, housek0
, target0
(these two are undigested samples amplified with housekeeping and target genes, respectively), housek1
, and target1
(digested samples) should be prepared as four numeric vectors having the same length.
One more variable, trueY
is needed to run the estimation process; it a numeric vector having the same length with the four Cq data. It holds the exact allele-mixing ratio for each quartet (also see the code example). Optionally, you can adjust the relative DNA concentration between the replicates with a parameter vector A
.
Since version 0.3.2, the knownqpcr()
function can also deal with general \Delta\Delta
Cq analyses. In such cases, samples with any mixing ratios are generally marked as ‘digested samples’ i.e., either of housek1
or target1
, depending on the loci to be amplified. The arguments of the corresponding undigested samples, housek0
and target0
, must not be specified by the user. Then, the parameter baseChange
(\delta_{B}
: the change rate of DNA contents before/after the endonuclease digestion) is not included in the estimation result.
Usage
knownqpcr(
housek0,
target0,
housek1,
target1,
trueY,
A = rep(1, length(trueY)),
XInit = c(meanDNA = -10, targetScale = 0, baseChange = 0, sdMeasure = 1, zeroAmount =
-5, EPCR = 0),
method = "BFGS",
pvalue = 0.05,
trace = 0,
report = 10,
verbose = FALSE
)
Arguments
housek0 , target0 , housek1 , target1 |
Measured Cq values. Numeric vectors having the same length as |
trueY |
A numeric vector having the same length as the Cq data. |
A |
Optionally, you can specify relative DNA content between the samples, as a numeric vector having the same length as the Cq data. If present, |
XInit |
Optionally, the named vector specifies the initial parameter values to be optimized. Defined in the natural log scale; e.g. |
method |
A string specifying the optimization algorithm used in |
pvalue |
The two-sided confidence interval is calculated at the last iteration at given significance level. Default is 0.05, which returns the 95% Wald's CI (2.5 to 97.5 percentile) based on the Hessian matrix. |
trace |
Non-negative integer. If positive, |
report |
The frequency of reports if |
verbose |
Send messages to stdout? Default is FALSE. |
Value
A table containing the estimated values for the following parameters:
meanDNA
is the template DNA concentration (of housekeeping gene, per unit volume of sample solution) compared to the threshold line of PCR.targetScale
(\delta_{T}
) is the relative template DNA amount of the target to the houskeeping loci.baseChange
(\delta_{B}
) is the change rate in the DNA amount after endonuclease digestion in RED-\Delta\Delta
Cq method. In general\Delta\Delta
Cq analyses (neitherhousek0
nortarget0
is input), this parameter is not returned. In both cases,baseChange
is not required to runfreqpcr()
.sdMeasure
(\sigma_{c}
) is the measurement error (standard deviation) at each Cq value.zeroAmount
(z
) is the ratio of non-target allele amplified in qPCR (see the argument list offreqpcr()
).EPCR
(\eta
) is the amplification efficiency per PCR cycle.
See Also
Other estimation procedures:
freqpcr()
,
knownqpcr_unpaired()
,
sim_dummy()
Examples
# A dummy Cq dataset: four mixing ratios with four replicates.
# K:2, scaleDNA:1e-11, targetScale:1.5, baseChange:0.3, zeroAmount:1e-3,
# sdMeasure:0.3, and EPCR:0.95. Assuming a RED-DeltaDeltaCq analyses.
trueY <- c(rep(0.1, 4), rep(0.25, 4), rep(0.5, 4), rep(1, 4))
housek0 <- c( 19.39, 19.78, 19.28, 19.58, 18.95, 19.91, 19.66, 19.96,
20.05, 19.86, 19.55, 19.61, 19.86, 19.27, 19.59, 20.21 )
target0 <- c( 19.16, 19.08, 19.28, 19.03, 19.17, 19.67, 18.68, 19.52,
18.92, 18.79, 18.8, 19.28, 19.57, 19.21, 19.05, 19.15 )
housek1 <- c( 21.61, 21.78, 21.25, 21.07, 22.04, 21.45, 20.72, 21.6,
21.51, 21.27, 21.08, 21.7, 21.44, 21.46, 21.5, 21.8 )
target1 <- c( 24.3, 24.22, 24.13, 24.13, 22.74, 23.14, 23.02, 23.14,
21.65, 22.62, 22.28, 21.65, 20.83, 20.82, 20.76, 21.3 )
d.cmp <- data.frame(A=rep(1, 16), trueY, housek0, target0, housek1, target1)
print(d.cmp)
# In RED-DeltaDeltaCq analyses, four observations stem from one sample solution.
# Each argument must be specified with its name (name=source).
knownqpcr( housek0=d.cmp$housek0, target0=d.cmp$target0,
housek1=d.cmp$housek1, target1=d.cmp$target1,
trueY=d.cmp$trueY, A=d.cmp$A, verbose=FALSE )
# In general DeltaDeltaCq analyses, the experimental design will not include
# dedicated control samples. The function then runs without 'housek0' and 'target0'.
knownqpcr( housek1=d.cmp$housek1, target1=d.cmp$target1,
trueY=d.cmp$trueY, A=d.cmp$A, verbose=TRUE )
Estimate auxiliary parameters when the sample pairs are incomplete.
Description
A variant of knownqpcr()
that accepts the Cq values concatenated into a vector (the argument Cq
) accompanied with the experimental conditions (the arguments Digest
and Gene
). Their exact allele mixing ratios are known as trueY
.
Usage
knownqpcr_unpaired(
Digest,
Gene,
trueY,
Cq,
A = rep(1, length(Cq)),
XInit = c(meanDNA = -10, targetScale = 0, baseChange = 0, sdMeasure = 1, zeroAmount =
-5, EPCR = 0),
method = "BFGS",
pvalue = 0.05,
trace = 0,
report = 10,
verbose = FALSE
)
Arguments
Digest |
Numeric vector having the same length as |
Gene |
Numeric vector that specify each Cq measure (element of |
trueY |
A numeric vector. |
Cq |
Measured Cq values. This argument is a numeric vector and can contain |
A |
Optionally, you can specify relative DNA content between the samples, as a numeric vector having the same length as |
XInit |
Optionally, the named vector specifies the initial parameter values to be optimized. Defined in the natural log scale; e.g. |
method |
A string specifying the optimization algorithm used in |
pvalue |
The two-sided confidence interval is calculated at the last iteration at given significance level. Default is 0.05, which returns the 95% Wald's CI (2.5 to 97.5 percentile) based on the Hessian matrix. |
trace |
Non-negative integer. If positive, |
report |
The frequency of reports if |
verbose |
Send messages to stdout? Default is FALSE. |
Value
A table containing the estimated parameter values. The format is same as knownqpcr()
.
See Also
Other estimation procedures:
freqpcr()
,
knownqpcr()
,
sim_dummy()
Examples
# A dummy Cq dataset: four mixing ratios with four replicates.
# K:2, scaleDNA:1e-11, targetScale:1.5, baseChange:0.3, zeroAmount:1e-3,
# sdMeasure:0.3, and EPCR:0.95. Assuming a RED-DeltaDeltaCq analyses.
trueY <- c(rep(0.1, 4), rep(0.25, 4), rep(0.5, 4), rep(1, 4))
housek0 <- c( 19.39, 19.78, 19.28, 19.58, 18.95, 19.91, 19.66, 19.96,
20.05, 19.86, 19.55, 19.61, 19.86, 19.27, 19.59, 20.21 )
target0 <- c( 19.16, 19.08, 19.28, 19.03, 19.17, 19.67, 18.68, 19.52,
18.92, 18.79, 18.8, 19.28, 19.57, 19.21, 19.05, 19.15 )
housek1 <- c( 21.61, 21.78, 21.25, 21.07, 22.04, 21.45, 20.72, 21.6,
21.51, 21.27, 21.08, 21.7, 21.44, 21.46, 21.5, 21.8 )
target1 <- c( 24.3, 24.22, 24.13, 24.13, 22.74, 23.14, 23.02, 23.14,
21.65, 22.62, 22.28, 21.65, 20.83, 20.82, 20.76, 21.3 )
# Incomplete observation dataset, prepared as the "long" format.
# If the undegested (Digest == 0) samples were only analyzed when trueY == 1.
d.long.all <- data.frame(
trueY=rep(trueY, 4), Digest=c(rep(0, 16 + 16), rep(1, 16 + 16)),
Gene=c(rep(0, 16), rep(1, 16), rep(0, 16), rep(1, 16)),
A=rep(1, 16*4), Cq=c(housek0, target0, housek1, target1) )
d.long <- d.long.all[d.long.all$Digest == 1 | d.long.all$trueY == 1, ]
print(d.long)
knownqpcr_unpaired( Digest=d.long$Digest, Gene=d.long$Gene,
trueY=d.long$trueY, Cq=d.long$Cq, A=d.long$A )
# In general DeltaDeltaCq analyses, the experimental design will not include
# dedicated control samples (Digest == 0).
d.long <- d.long.all[d.long.all$Digest == 1, ]
knownqpcr_unpaired( Digest=d.long$Digest, Gene=d.long$Gene,
trueY=d.long$trueY, Cq=d.long$Cq, A=d.long$A )
Generate dummy DNA dataset ready for allele-frequency estimation.
Description
The function generates a dummy dataset of typical RED-\Delta\Delta
Cq analysis. You can directly feed the output of this function to the first argument of sim_dummy()
.
Usage
make_dummy(
rand.seed,
P,
K,
ntrap,
npertrap,
scaleDNA = (1/K) * 1e-06,
targetScale,
baseChange,
EPCR,
zeroAmount,
sdMeasure,
diploid = FALSE
)
Arguments
rand.seed |
Seed for the R built-in random-number-generator. |
P |
A numeric between 0 and 1 giving the population allele frequency from which the test samples are generated. |
K |
A positive numeric of the gamma shape parameter of the individual DNA yield. |
ntrap , npertrap |
Scalar specifying the number of bulk samples ( |
scaleDNA |
Small positive scalar that specifies the scale parameter of the gamma distribution appriximating the DNA yield from (per-haploid) individual. The yield of |
targetScale |
( |
baseChange |
( |
EPCR |
( |
zeroAmount |
A numeric between 0 and 1, usually near 0, giving the residue rate of restriction enzyme digestion in RED- |
sdMeasure |
( |
diploid |
Is the target organism diploidy? Default is |
Value
Object of the S4 class CqList, storing the dummy experiment data of Cq-based qPCR analysis. Note that a CqList object in no way contains original information on P
, K
, targetScale
, sdMeasure
, and EPCR
.
Examples
P <- 0.25
# Just a test: segregation ratios for six bulk samples, 1000 individuals for each.
rmultinom(n=6, size=1000, prob=c(P, 1-P)) # haploidy
rmultinom(6, size=1000, prob=c(P^2, 2*P*(1-P), (1-P)^2)) # diploidy
# Dummy Cq dataset with six bulk samples, each of which comprises eight haploids.
dmy_cq <- make_dummy( rand.seed=1, P=0.15, K=2, ntrap=6, npertrap=8,
scaleDNA=1e-07, targetScale=1.5, baseChange=0.3, EPCR=0.95,
zeroAmount=1e-3, sdMeasure=0.3, diploid=FALSE )
print(dmy_cq)
Simulate freqpcr estimation based on user-generated dummy data.
Description
Wrapper of freqpcr()
suitable for the performance test using a randomly-generated data object.
Usage
sim_dummy(
CqList,
EPCR,
zeroAmount,
P = NULL,
K = NULL,
targetScale = NULL,
sdMeasure = NULL,
beta,
diploid,
maxtime,
print.level,
aux = NULL,
verbose = TRUE,
...
)
Arguments
CqList |
Object belonging to the CqList class, typically the output from |
EPCR |
( |
zeroAmount |
A numeric between 0 and 1, usually near 0, giving the residue rate of restriction enzyme digestion in RED- |
P , K , targetScale , sdMeasure |
If NULL (default), the parameter is considered unknown and estimated via |
beta , diploid , maxtime , print.level |
Configuration parameters which are passed directly to |
aux |
Additional information to be displayed on the console. The default is |
verbose |
Prints more information e.g. system time. Default is |
... |
Additional arguments passed to |
Value
Object of the S4 class CqFreq, which is same as freqpcr()
.
See Also
Other estimation procedures:
freqpcr()
,
knownqpcr_unpaired()
,
knownqpcr()
Examples
# Prepare the parameter values.
K <- 2 # You already know the size of K in this case.
EPCR <- 0.97 # The sizes of EPCR and zeroAmount must always be supplied.
zeroAmount <- 1.6e-03
is.diploid <- FALSE
# First, make a dummy Cq dataset with six bulk DNA samples,
# each of which comprises of eight haploid individuals.
dmy_cq <- make_dummy( rand.seed=1, P=0.75, K=K, ntrap=6, npertrap=8, scaleDNA=1e-07,
targetScale=1.5, baseChange=0.3, EPCR=EPCR,
zeroAmount=zeroAmount, sdMeasure=0.3, diploid=is.diploid )
# Estimate the population allele frequency on the dummy dataset, presupposing K = 2.
sim_dummy( CqList=dmy_cq, EPCR=EPCR, zeroAmount=zeroAmount,
K=K,
beta=TRUE, diploid=is.diploid, maxtime=60, print.level=2, aux="test" )
# If the maximum calculation time was too short to converge, nlm() returns error.
# Then sim_dummy() returns a matrix filled with zeros.
sim_dummy( CqList=dmy_cq, EPCR=EPCR, zeroAmount=zeroAmount,
beta=TRUE, diploid=is.diploid, maxtime=0.01, print.level=2 )