Title: | Distribution Fitting and Evaluation |
Version: | 0.0.6 |
Description: | A library of density, distribution function, quantile function, (bounded) raw moments and random generation for a collection of distributions relevant for the firm size literature. Additionally, the package contains tools to fit these distributions using maximum likelihood and evaluate these distributions based on (i) log-likelihood ratio and (ii) deviations between the empirical and parametrically implied moments of the distributions. We add flexibility by allowing the considered distributions to be combined into piecewise composite or finite mixture distributions, as well as to be used when truncated. See Dewitte (2020) https://hdl.handle.net/1854/LU-8644700 for a description and application of methods available in this package. |
Depends: | R (≥ 3.6.0) |
Imports: | Rdpack, stats, flexmix, modeltools, methods |
RdMacros: | Rdpack |
License: | GPL-3 |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.1.0 |
Suggests: | knitr, rmarkdown, tidyverse, testthat |
NeedsCompilation: | no |
Packaged: | 2020-05-15 16:43:21 UTC; ruben |
Author: | Ruben Dewitte [aut, cre] |
Maintainer: | Ruben Dewitte <ruben0dewitte@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2020-05-25 18:50:03 UTC |
The Burr distribution
Description
Density, distribution function, quantile function, raw moments and random generation for the Burr distribution, also known as the Burr Type XII distribution or the Singh-Maddala distribution.
Usage
dburr(x, shape1 = 2, shape2 = 1, scale = 0.5, log = FALSE)
pburr(q, shape1 = 2, shape2 = 1, scale = 0.5, log.p = FALSE, lower.tail = TRUE)
qburr(p, shape1 = 2, shape2 = 1, scale = 0.5, log.p = FALSE, lower.tail = TRUE)
mburr(
r = 0,
truncation = 0,
shape1 = 2,
shape2 = 1,
scale = 0.5,
lower.tail = TRUE
)
rburr(n, shape1 = 2, shape2 = 1, scale = 0.5)
Arguments
x , q |
vector of quantiles |
shape1 , shape2 , scale |
Shape1, shape2 and scale of the Burr distribution, defaults to 2, 1 and 0.5. |
log , log.p |
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the distribution |
truncation |
lower truncation parameter |
n |
number of observations |
Details
Probability and Cumulative Distribution Function:
f(x) =\frac{\frac{kc}{scale}(\frac{\omega}{scale})^{shape2-1}}{(1+(\frac{\omega}{scale})^shape2)^{shape1+1}}, \qquad F_X(x) = 1-\frac{1}{(1+(\frac{\omega}{scale})^shape2)^shape1}
The y-bounded r-th raw moment of the Fréchet distribution equals:
scale^{r} shape1 [\boldsymbol{B}(\frac{r}{shape2} +1,shape1-\frac{r}{shape2}) - \boldsymbol{B}(\frac{(\frac{y}{scale})^{shape2}}{1+(\frac{y}{scale})^{shape2}};\frac{r}{shape2} +1,shape1-\frac{r}{shape2})], \qquad shape2>r, kc>r
Value
dburr returns the density, pburr the distribution function, qburr the quantile function, mburr the rth moment of the distribution and rburr generates random deviates.
The length of the result is determined by n for rburr, and is the maximum of the lengths of the numerical arguments for the other functions.
Examples
## Burr density
plot(x = seq(0, 5, length.out = 100), y = dburr(x = seq(0, 5, length.out = 100)))
plot(x = seq(0, 5, length.out = 100), y = dburr(x = seq(0, 5, length.out = 100), shape2 = 3))
## Demonstration of log functionality for probability and quantile function
qburr(pburr(2, log.p = TRUE), log.p = TRUE)
## The zeroth truncated moment is equivalent to the probability function
pburr(2)
mburr(truncation = 2)
## The (truncated) first moment is equivalent to the mean of a
#(truncated) random sample, for large enough samples.
x <- rburr(1e5, shape2 = 3)
mean(x)
mburr(r = 1, shape2 = 3, lower.tail = FALSE)
sum(x[x > quantile(x, 0.1)]) / length(x)
mburr(r = 1, shape2 = 3, truncation = quantile(x, 0.1), lower.tail = FALSE)
Burr coefficients after power-law transformation
Description
Coefficients of a power-law transformed Burr distribution
Usage
burr_plt(shape1 = 2, shape2 = 1, scale = 0.5, a = 1, b = 1, inv = FALSE)
Arguments
shape1 , shape2 , scale |
Shape1, shape2 and scale of the Burr distribution, defaults to 2, 1 and 1 respectively. |
a , b |
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
Details
If the random variable x is Burr distributed with scale shape and shape scale, then the power-law transformed variable
y = ax^b
is Burr distributed with shape1 shape1
, shape2 b*shape2
and scale ( \frac{scale}{a})^{\frac{1}{b}}
.
Value
Returns a named list containing
- coefficients
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables pburr(3,shape1=2,shape2=3,scale=1) coeff = burr_plt(shape1=2,shape2=3,scale=1,a=5,b=7)$coefficients pburr(5*3^7,shape1=coeff[["shape1"]],shape2=coeff[["shape2"]],scale=coeff[["scale"]])
pburr(5*0.9^7,shape1=2,shape2=3,scale=1) coeff = burr_plt(shape1=2,shape2=3,scale=1,a=5,b=7, inv=TRUE)$coefficients pburr(0.9,shape1=coeff[["shape1"]],shape2=coeff[["shape2"]],scale=coeff[["scale"]])
## Comparing the first moments and sample means of power-law transformed variables for large enough samples x = rburr(1e5,shape1=2,shape2=3,scale=1) coeff = burr_plt(shape1=2,shape2=3,scale=1,a=2,b=0.5)$coefficients y = rburr(1e5,shape1=coeff[["shape1"]],shape2=coeff[["shape2"]],scale=coeff[["scale"]]) mean(2*x^0.5) mean(y) mburr(r=1,shape1=coeff[["shape1"]],shape2=coeff[["shape2"]],scale=coeff[["scale"]],lower.tail=FALSE)
Pareto scale determination à la Clauset
Description
This method determines the optimal scale parameter of the Inverse Pareto distribution using the iterative method (Clauset et al. 2009) that minimizes the Kolmogorov-Smirnov distance.
Usage
clauset.xmax(x, q = 1)
Arguments
x |
data vector |
q |
Percentage of data to search over (starting from the smallest values) |
Value
Returns a named list containing a
- coefficients
Named vector of coefficients
- KS
Minimum Kolmogorov-Smirnov distance
- n
Number of observations in the Inverse Pareto tail
- coeff.evo
Evolution of the Inverse Pareto shape parameter over the iterations
References
Clauset A, Shalizi CR, Newman ME (2009). “Power-law distributions in empirical data.” SIAM review, 51(4), 661–703.
Examples
## Determine cuttof from compostie InvPareto-Lognormal distribution using Clauset's method
dist <- c("invpareto", "lnorm")
coeff <- c(coeff1.k = 1.5, coeff2.meanlog = 1, coeff2.sdlog = 0.5)
x <- rcomposite(1e3, dist = dist, coeff = coeff)
out <- clauset.xmax(x = x)
out$coefficients
coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1))$coeff1
## Speed up method by considering values above certain quantile only
dist <- c("invpareto", "lnorm")
coeff <- c(coeff1.k = 1.5, coeff2.meanlog = 1, coeff2.sdlog = 0.5)
x <- rcomposite(1e3, dist = dist, coeff = coeff)
out <- clauset.xmax(x = x, q = 0.5)
out$coefficients
coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1))$coeff1
Pareto scale determination à la Clauset
Description
This method determines the optimal scale parameter of the Pareto distribution using the iterative method (Clauset et al. 2009)that minimizes the Kolmogorov-Smirnov distance.
Usage
clauset.xmin(x, q = 0)
Arguments
x |
data vector |
q |
Percentage of data to search over (starting from the largest values) |
Value
Returns a named list containing a
- coefficients
Named vector of coefficients
- KS
Minimum Kolmogorov-Smirnov distance
- n
Number of observations in the Pareto tail
- coeff.evo
Evolution of the Pareto shape parameter over the iterations
References
Clauset A, Shalizi CR, Newman ME (2009). “Power-law distributions in empirical data.” SIAM review, 51(4), 661–703.
Examples
## Determine cuttof from compostie lognormal-Pareto distribution using Clauset's method
dist <- c("lnorm", "pareto")
coeff <- c(coeff1.meanlog = -0.5, coeff1.sdlog = 0.5, coeff2.k = 1.5)
x <- rcomposite(1e3, dist = dist, coeff = coeff)
out <- clauset.xmin(x = x)
out$coefficients
coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1))$coeff2
## Speed up method by considering values above certain quantile only
dist <- c("lnorm", "pareto")
coeff <- c(coeff1.meanlog = -0.5, coeff1.sdlog = 0.5, coeff2.k = 1.5)
x <- rcomposite(1e3, dist = dist, coeff = coeff)
out <- clauset.xmin(x = x, q = 0.5)
out$coefficients
coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1))$coeff2
Parametrise two-/three- composite distribution
Description
Determines the weights and cutoffs of the three-composite distribution numerically applying te continuity- and differentiability condition.
Usage
coeffcomposite(dist, coeff, startc = c(1, 1))
Arguments
dist |
character vector denoting the distribution of the first-, second- (and third) component respectively. If only two components are provided, the distribution reduces to the two-component distribution. |
coeff |
named numeric vector holding the coefficients of the first-, second- (and third) component, predeced by coeff1., coeff2. (and coeff3.), respectively. Coefficients for the last component do not have to be provided for the two-component distribution and will be disregarded. |
startc |
starting values for the lower and upper cutoff, defaults to c(1,1). |
Details
The continuity condition implies
\alpha_1 = \frac{m_2(c_1) M_1(c_1)}{m_1(c_1)[M_2(c_2) - M_2(c_1)]}, \qquad \alpha_2 = \frac{m_2(c_2) [1 - M_3(c_2)]}{m_3(c_2) [M_2(c_2) - M_2(c_1)]}
The differentiability condition implies
\frac{d}{dc_1} ln[\frac{m_1(c_1)}{m_2(c_1)}] = 0, \qquad \frac{d}{dc_2} ln[\frac{m_2(c_2)}{m_3(c_2)}] = 0
Value
Returns a named list containing a the separate distributions and their respective coefficients, as well as the cutoffs and weights of the composite distribution
Examples
# Three-composite distribution
dist <- c("invpareto", "lnorm", "pareto")
coeff <- c(coeff1.k = 1, coeff2.meanlog = -0.5, coeff2.sdlog = 0.5, coeff3.k = 1)
coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1))
# Two-composite distribution
dist <- c("lnorm", "pareto")
coeff <- c(coeff1.meanlog = -0.5, coeff1.sdlog = 0.5, coeff2.k = 1.5)
coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1))
dist <- c("invpareto", "lnorm")
coeff <- c(coeff1.k = 1.5, coeff2.meanlog = 2, coeff2.sdlog = 0.5)
coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1))
#'
Combined distributions
Description
Density, distribution function, quantile function, raw moments and random generation for combined (empirical, single, composite and finite mixture) truncated or complete distributions.
Usage
dcombdist(
x,
dist,
prior = c(1),
coeff,
log = FALSE,
compress = TRUE,
lowertrunc = 0,
uppertrunc = Inf
)
pcombdist(
q,
dist,
prior = 1,
coeff,
log.p = FALSE,
lower.tail = TRUE,
compress = TRUE,
lowertrunc = NULL,
uppertrunc = NULL
)
qcombdist(p, dist, prior, coeff, log.p = FALSE, lower.tail = TRUE)
mcombdist(
r,
truncation = NULL,
dist,
prior = 1,
coeff,
lower.tail = TRUE,
compress = TRUE,
uppertrunc = 0,
lowertrunc = Inf
)
rcombdist(n, dist, prior, coeff, uppertrunc = NULL, lowertrunc = NULL)
Arguments
x , q |
vector of quantiles |
dist |
character vector denoting the distribution(s). |
prior |
Numeric vector of prior coefficients, defaults to single vector with value one. |
coeff |
list of parameters for the distribution(s). |
log , log.p |
logical; if TRUE, probabilities p are given as log(p). |
compress |
Logical indicating whether return values from individual densities of finite mixtures should be gathered or not, defaults to TRUE. |
lowertrunc , uppertrunc |
lowertrunc- and uppertrunc truncation points, defaults to 0 and Inf respectively |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the Pareto distribution |
truncation |
lower truncation parameter |
n |
number of observations |
Value
dcombdist gives the density, pcombdist gives the distribution function, qcombdist gives the quantile function, mcombdist gives the rth moment of the distribution and rcombdist generates random deviates.
The length of the result is determined by n for rcombdist, and is the maximum of the lengths of the numerical arguments for the other functions.
Examples
# Load necessary tools
data("fit_US_cities")
library(tidyverse)
x <- rcombdist(
n = 25359, dist = "lnorm",
prior = subset(fit_US_cities, (dist == "lnorm" & components == 5))$prior[[1]],
coeff = subset(fit_US_cities, (dist == "lnorm" & components == 5))$coefficients[[1]]
) # Generate data from one of the fitted functions
# Evaluate functioning of dcomdist by calculating log likelihood for all distributions
loglike <- fit_US_cities %>%
group_by(dist, components, np, n) %>%
do(loglike = sum(dcombdist(dist = .[["dist"]], x = sort(x), prior = .[["prior"]][[1]],
coeff = .[["coefficients"]][[1]], log = TRUE))) %>%
unnest(cols = loglike) %>%
mutate(NLL = -loglike, AIC = 2 * np - 2 * (loglike), BIC = log(n) * np - 2 * (loglike)) %>%
arrange(NLL)
# Evaluate functioning of mcombdist and pcombdist by calculating NMAD
#(equivalent to the Kolmogorov-Smirnov test statistic for the zeroth moment
#of the distribution) for all distributions
nmad <- fit_US_cities %>%
group_by(dist, components, np, n) %>%
do(
KS = max(abs(pempirical(q = sort(x), data = x) - pcombdist(dist = .[["dist"]],
q = sort(x), prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]]))),
nmad_0 = nmad_test(r = 0, dist = .[["dist"]], x = sort(x), prior = .[["prior"]][[1]],
coeff = .[["coefficients"]][[1]], stat = "max"),
nmad_1 = nmad_test(r = 1, dist = .[["dist"]], x = sort(x), prior = .[["prior"]][[1]],
coeff = .[["coefficients"]][[1]], stat = "max")
) %>%
unnest(cols = c(KS, nmad_0, nmad_1)) %>%
arrange(nmad_0)
# Evaluate functioning of qcombdist pcombdist by calculating NMAD (equivalent to the Kolmogorov-
#Smirnov test statistic for the zeroth moment of the distribution) for all distributions
test <- fit_US_cities %>%
group_by(dist, components, np, n) %>%
do(out = qcombdist(pcombdist(2, dist = .[["dist"]], prior = .[["prior"]][[1]],
coeff = .[["coefficients"]][[1]], log.p = TRUE),
dist = .[["dist"]], prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]],
log.p = TRUE
)) %>%
unnest(cols = c(out))
Combined distributions MLE
Description
Maximum Likelihood estimation for combined ( single, composite and finite mixture) truncated or complete distributions.
Usage
combdist.mle(
x,
dist,
start = NULL,
lower = NULL,
upper = NULL,
components = 1,
nested = FALSE,
steps = 1,
lowertrunc = 0,
uppertrunc = Inf,
...
)
Arguments
x |
data vector |
dist |
character vector denoting the distribution(s). |
start |
named numeric vector holding the starting values for the coefficients. |
lower , upper |
Lower and upper bounds to the estimated coefficients, defaults to -Inf and Inf respectively. |
components |
number of components for a mixture distribution. |
nested |
logical indicating whether results should be returned in a nested list or a flat list form, defaults to FALSE. |
steps |
number of steps taken in stepflexmix, defaults to 1. |
lowertrunc , uppertrunc |
lowertrunc- and uppertrunc truncation points, defaults to 0 and Inf respectively |
... |
Additional arguments. |
Value
Returns a named list containing a
- dist
Character vector denoting the distributions, separated by an underscore
- components
Nr. of combined distributions
- prior
Weights assigned to the respective component distributions
- coefficients
Named vector of coefficients
- convergence
logical indicator of convergence
- n
Length of the fitted data vector
- np
Nr. of coefficients
Examples
x <- rdoubleparetolognormal(1e3)
combdist.mle(x = x, dist = "doubleparetolognormal") # Double-Pareto Lognormal
combdist.mle(x = x, components = 2, dist = "lnorm", steps = 20) # FMM with 2 components
combdist.mle( x = x, dist = c("invpareto", "lnorm", "pareto"),
start = c(coeff1.k = 1, coeff2.meanlog = mean(log(x)), coeff2.sdlog = sd(log(x)), coeff3.k = 1),
lower = c(1e-10, -Inf, 1e-10, 1e-10), upper = c(Inf, Inf, Inf, Inf), nested = TRUE)
# composite distribution
Combined coefficients of power-law transformed combined distribution
Description
Coefficients of a power-law transformed combined distribution
Usage
combdist_plt(
dist,
prior = NULL,
coeff,
a = 1,
b = 1,
inv = FALSE,
nested = FALSE
)
Arguments
dist |
character vector denoting the distribution(s). |
prior |
Numeric vector of prior coefficients, defaults to single vector with value one. |
coeff |
list of parameters for the distribution(s). |
a , b |
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
nested |
logical indicating whether results should be returned in a nested list or flat list, defaults to FALSE. |
Value
Returns a nested or flat list containing
- coefficients
Named vector of coefficients
Examples
# Load necessary tools
data("fit_US_cities")
library(tidyverse)
## Comparing probabilites of power-law transformed transformed variables
prob <- fit_US_cities %>%
filter(!(dist %in% c(
"exp", "invpareto_exp_pareto", "exp_pareto", "invpareto_exp",
"gamma", "invpareto_gamma_pareto", "gamma_pareto", "invpareto_gamma"
))) %>%
group_by(dist, components, np, n) %>%
do(prob = pcombdist(q = 1.1, dist = .[["dist"]], prior = .[["prior"]][[1]],
coeff = .[["coefficients"]][[1]])) %>%
unnest(cols = c(prob))
fit_US_cities_plt <- fit_US_cities %>%
filter(!(dist %in% c(
"exp", "invpareto_exp_pareto", "exp_pareto", "invpareto_exp",
"gamma", "invpareto_gamma_pareto", "gamma_pareto", "invpareto_gamma"
))) %>%
group_by(dist, components, np, n, convergence) %>%
do(results = as_tibble(combdist_plt(dist = .[["dist"]], prior = .[["prior"]][[1]],
coeff = .[["coefficients"]][[1]], a = 2, b = 0.5, nested = TRUE))) %>%
unnest(cols = c(results))
prob$prob_plt <- fit_US_cities_plt %>%
group_by(dist, components, np, n) %>%
do(prob_plt = pcombdist(q = 2 * 1.1^0.5, dist = .[["dist"]], prior = .[["prior"]][[1]],
coeff = .[["coefficients"]][[1]])) %>%
unnest(cols = c(prob_plt)) %>%
.$prob_plt
prob <- prob %>%
mutate(check = abs(prob - prob_plt))
prob <- fit_US_cities %>%
filter(!(dist %in% c(
"exp", "invpareto_exp_pareto", "exp_pareto", "invpareto_exp",
"gamma", "invpareto_gamma_pareto", "gamma_pareto", "invpareto_gamma"
))) %>%
group_by(dist, components, np, n) %>%
do(prob = pcombdist(q = 2 * 1.1^0.5, dist = .[["dist"]], prior = .[["prior"]][[1]],
coeff = .[["coefficients"]][[1]])) %>%
unnest(cols = c(prob))
fit_US_cities_plt <- fit_US_cities %>%
filter(!(dist %in% c(
"exp", "invpareto_exp_pareto", "exp_pareto", "invpareto_exp",
"gamma", "invpareto_gamma_pareto", "gamma_pareto", "invpareto_gamma"
))) %>%
group_by(dist, components, np, n, convergence) %>%
do(results = as_tibble(combdist_plt(dist = .[["dist"]], prior = .[["prior"]][[1]],
coeff = .[["coefficients"]][[1]], a = 2, b = 0.5, nested = TRUE, inv = TRUE))) %>%
unnest(cols = c(results))
prob$prob_plt <- fit_US_cities_plt %>%
group_by(dist, components, np, n) %>%
do(prob_plt = pcombdist(q = 1.1, dist = .[["dist"]], prior = .[["prior"]][[1]],
coeff = .[["coefficients"]][[1]])) %>%
unnest(cols = c(prob_plt)) %>%
.$prob_plt
prob <- prob %>%
mutate(check = abs(prob - prob_plt))
The two- or three-composite distribution
Description
Density, distribution function, quantile function, raw moments and random generation for the two- or three-composite distribution.
Usage
dcomposite(x, dist, coeff, startc = c(1, 1), log = FALSE)
pcomposite(q, dist, coeff, startc = c(1, 1), log.p = FALSE, lower.tail = TRUE)
qcomposite(p, dist, coeff, startc = c(1, 1), log.p = FALSE, lower.tail = TRUE)
mcomposite(
r = 0,
truncation = 0,
dist,
coeff,
startc = c(1, 1),
lower.tail = TRUE
)
rcomposite(n, dist, coeff, startc = c(1, 1))
Arguments
x , q |
vector of quantiles |
dist |
character vector denoting the distribution of the first-, second- (and third) component respectively. If only two components are provided, the distribution reduces to the two-component distribution. |
coeff |
named numeric vector holding the coefficients of the first-, second- (and third) component, predeced by coeff1., coeff2. (and coeff3.), respectively. Coefficients for the last component do not have to be provided for the two-component distribution and will be disregarded. |
startc |
starting values for the lower and upper cutoff, defaults to c(1,1). |
log , log.p |
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the Pareto distribution |
truncation |
lower truncation parameter |
n |
number of observations |
Details
These derivations are based on the two-composite distribution proposed by (Bakar et al. 2015). Probability Distribution Function:
f(x) = \{ \begin{array}{lrl}
\frac{\alpha_1}{1 + \alpha_1 + \alpha_2} \frac{m_1(x)}{M_1(c_1)} & { if} & 0<x \leq c_1 \\
\frac{1}{1 + \alpha_1 + \alpha_2} \frac{m_2(x)}{M_2(c_2) - M_2(c_1)} & { if} & c_1<x \leq c_2 \\
\frac{\alpha_2}{1 + \alpha_1 + \alpha_2} \frac{m_3(x)}{1-M_3(c_2)} & { if} & c_{2}<x < \infty \\
\end{array} .
Cumulative Distribution Function:
\{
\begin{array}{lrl}
\frac{\alpha_1}{1 + \alpha_1 + \alpha_2} \frac{M_1(x)}{M_1(c_1)} & { if} & 0<x \leq c_1 \\
\frac{\alpha_1}{1 + \alpha_1 + \alpha_2} + \frac{1}{1 + \alpha_1 + \alpha_2}\frac{M_2(x) - M_2(c_1)}{M_2(c_2) - M_2(c_1)} & { if} & c_1<x \leq c_2 \\
\frac{1+\alpha_1}{1 + \alpha_1 + \alpha_2} + \frac{\alpha_2}{1 + \alpha_1 + \alpha_2} \frac{M_3(x) - M_3(c_2)}{1-M_3(c_2)} & { if} & c_{2}<x < \infty \\
\end{array}
.
Quantile function
Q(p) = \{
\begin{array}{lrl}
Q_1( \frac{1 + \alpha_1 + \alpha_2}{\alpha_1} p M_1(c_1) ) & { if} & 0<x \leq \frac{\alpha_1}{1 + \alpha_1 + \alpha_2} \\
Q_2[((p - \frac{\alpha_1}{1 + \alpha_1 + \alpha_2})(1 + \alpha_1 + \alpha_2)(M_2(c_2) - M_2(c_1))) + M_2(c_1)] & { if} & \frac{\alpha_1}{1 + \alpha_1 + \alpha_2}<x \leq \frac{1+\alpha_1}{1 + \alpha_1 + \alpha_2} \\
Q_3[((p - \frac{1+\alpha_1}{1 + \alpha_1 + \alpha_2})(\frac{1 + \alpha_1 + \alpha_2}{\alpha_2})(1 - M_3(c_2))) + M_3(c_2)] & { if} & \frac{1+\alpha_1}{1 + \alpha_1 + \alpha_2}<x < \infty \\
\end{array}
.
The lower y-bounded r-th raw moment of the distribution equals
\mu_y^r =
\{
\begin{array}{lrl}
\frac{\alpha_1}{1 + \alpha_1 + \alpha_2}\frac{ (\mu_1)_y^r - (\mu_1)_{c_1}^r}{M_1(c_1)} + \frac{1}{1 + \alpha_1 + \alpha_2}\frac{ (\mu_2)_{c_1}^r - (\mu_2)_{c_2}^r }{M_2(c_2) - M_2(c_1)} + \frac{\alpha_2}{1 + \alpha_1 + \alpha_2} \frac{(\mu_3)_y^r}{1-M_3(c_2)} & { if} & 0< y \leq c_2 \\
\frac{1}{1 + \alpha_1 + \alpha_2} \frac{(\mu_2)_y^r - (\mu_2)_{c_2}^r }{M_2(c_2) - M_2(c_1)} + \frac{\alpha_2}{1 + \alpha_1 + \alpha_2} \frac{(\mu_3)_{c_2}^r}{1-M_3(c_2)} & { if} & c_1< y \leq c_2\\
\frac{\alpha_2}{1 + \alpha_1 + \alpha_2} \frac{(\mu_3)_y^r}{1-M_3(c_2)} & { if} & c_2< y < \infty \\
\end{array}
.
Value
dcomposite returns the density, pcomposite the distribution function, qcomposite the quantile function, mcomposite the rth moment of the distribution and rcomposite generates random deviates.
The length of the result is determined by n for rcomposite, and is the maximum of the lengths of the numerical arguments for the other functions.
References
Bakar SA, Hamzah N, Maghsoudi M, Nadarajah S (2015). “Modeling loss data using composite models.” Insurance: Mathematics and Economics, 61, 146–154.
Examples
#' ## Three-component distribution
dist <- c("invpareto", "lnorm", "pareto")
coeff <- c(coeff2.meanlog = -0.5, coeff2.sdlog = 0.5, coeff3.k = 1.5, coeff1.k = 1.5)
# Compare density with the Double-Pareto Lognormal distribution
plot(x = seq(0, 5, length.out = 1e3), y = dcomposite(x = seq(0, 5, length.out = 1e3),
dist = dist, coeff = coeff))
lines(x = seq(0, 5, length.out = 1e3), y = ddoubleparetolognormal(x = seq(0, 5, length.out = 1e3)))
# Demonstration of log functionality for probability and quantile function
qcomposite(pcomposite(2, dist = dist, coeff = coeff, log.p = TRUE), dist = dist,
coeff = coeff, log.p = TRUE)
# The zeroth truncated moment is equivalent to the probability function
pcomposite(2, dist = dist, coeff = coeff)
mcomposite(truncation = 2, dist = dist, coeff = coeff)
# The (truncated) first moment is equivalent to the mean of a (truncated) random sample,
#for large enough samples.
coeff <- c(coeff2.meanlog = -0.5, coeff2.sdlog = 0.5, coeff3.k = 3, coeff1.k = 1.5)
x <- rcomposite(1e5, dist = dist, coeff = coeff)
mean(x)
mcomposite(r = 1, lower.tail = FALSE, dist = dist, coeff = coeff)
sum(x[x > quantile(x, 0.1)]) / length(x)
mcomposite(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE, dist = dist, coeff = coeff)
## Two-component distribution
dist <- c("lnorm", "pareto")
coeff <- coeff <- c(coeff2.k = 1.5, coeff1.meanlog = -0.5, coeff1.sdlog = 0.5)
# Compare density with the Right-Pareto Lognormal distribution
plot(x = seq(0, 5, length.out = 1e3), y = dcomposite(x = seq(0, 5, length.out = 1e3),
dist = dist, coeff = coeff))
lines(x = seq(0, 5, length.out = 1e3), y = drightparetolognormal(x = seq(0, 5, length.out = 1e3)))
# Demonstration of log functionality for probability and quantile function
qcomposite(pcomposite(2, dist = dist, coeff = coeff, log.p = TRUE), dist = dist,
coeff = coeff, log.p = TRUE)
# The zeroth truncated moment is equivalent to the probability function
pcomposite(2, dist = dist, coeff = coeff)
mcomposite(truncation = 2, dist = dist, coeff = coeff)
# The (truncated) first moment is equivalent to the mean of a (truncated) random sample,
#for large enough samples.
coeff <- c(coeff1.meanlog = -0.5, coeff1.sdlog = 0.5, coeff2.k = 3)
x <- rcomposite(1e5, dist = dist, coeff = coeff)
mean(x)
mcomposite(r = 1, lower.tail = FALSE, dist = dist, coeff = coeff)
sum(x[x > quantile(x, 0.1)]) / length(x)
mcomposite(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE, dist = dist, coeff = coeff)
Composite MLE
Description
Maximum likelihood estimation of the parameters of the two-/three- composite distribution
Usage
composite.mle(x, dist, start, lower = NULL, upper = NULL)
Arguments
x |
data vector |
dist |
character vector denoting the distribution of the first-, second- (and third) component respectively. If only two components are provided, the distribution reduces to the two-component distribution. |
start |
named numeric vector holding the coefficients of the first-, second- (and third) component, predeced by coeff1., coeff2. (and coeff3.), respectively. Coefficients for the last component do not have to be provided for the two-component distribution and will be disregarded. |
lower , upper |
Lower and upper bounds to the estimated coefficients, defaults to -Inf and Inf respectively. |
Value
Returns a named list containing a
- coefficients
Named vector of coefficients
- convergence
logical indicator of convergence
- cutoffs
Cutoffs of the composite distribution
- n
Length of the fitted data vector
- np
Nr. of coefficients
- components
Nr. of components
Examples
dist <- c("invpareto", "lnorm", "pareto")
coeff <- c(
coeff1.k = 1.5, coeff2.meanlog = -0.5,
coeff2.sdlog = 0.5, coeff3.k = 1.5
)
lower <- c(1e-10, -Inf, 1e-10, 1e-10)
upper <- c(Inf, Inf, Inf, Inf)
x <- rcomposite(1e3, dist = dist, coeff = coeff)
composite.mle(x = x, dist = dist, start = coeff + 0.2, lower = lower, upper = upper)
#'
Composite coefficients after power-law transformation
Description
Coefficients of a power-law transformed composite distribution
Usage
composite_plt(dist, coeff, a = 1, b = 1, inv = FALSE)
Arguments
dist |
character vector denoting the distribution of the first-, second- (and third) component respectively. If only two components are provided, the distribution reduces to the two-component distribution. |
coeff |
named numeric vector holding the coefficients of the first-, second- (and third) component, predeced by coeff1., coeff2. (and coeff3.), respectively. Coefficients for the last component do not have to be provided for the two-component distribution and will be disregarded. |
a , b |
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
Value
Returns a named list containing
- coefficients
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables dist <- c("invpareto", "lnorm", "pareto") coeff <- c(coeff2.meanlog = -0.5, coeff2.sdlog = 0.5, coeff3.k = 1.5, coeff1.k = 1.5)
pcomposite(3,dist=dist,coeff=coeff) newcoeff = composite_plt(dist=dist,coeff=coeff,a=5,b=7)$coefficients pcomposite(5*3^7,dist=dist,coeff=newcoeff)
pcomposite(5*0.9^3,dist=dist,coeff=coeff) newcoeff = composite_plt(dist=dist,coeff=coeff,a=5,b=3,inv=TRUE)$coefficients pcomposite(0.9,dist=dist,coeff=newcoeff)
The Double-Pareto Lognormal distribution
Description
Density, distribution function, quantile function and random generation for the Double-Pareto Lognormal distribution.
Usage
ddoubleparetolognormal(
x,
shape1 = 1.5,
shape2 = 1.5,
meanlog = -0.5,
sdlog = 0.5,
log = FALSE
)
pdoubleparetolognormal(
q,
shape1 = 1.5,
shape2 = 1.5,
meanlog = -0.5,
sdlog = 0.5,
lower.tail = TRUE,
log.p = FALSE
)
qdoubleparetolognormal(
p,
shape1 = 1.5,
shape2 = 1.5,
meanlog = -0.5,
sdlog = 0.5,
lower.tail = TRUE,
log.p = FALSE
)
mdoubleparetolognormal(
r = 0,
truncation = 0,
shape1 = 1.5,
shape2 = 1.5,
meanlog = -0.5,
sdlog = 0.5,
lower.tail = TRUE
)
rdoubleparetolognormal(
n,
shape1 = 1.5,
shape2 = 1.5,
meanlog = -0.5,
sdlog = 0.5
)
Arguments
x , q |
vector of quantiles |
shape1 , shape2 , meanlog , sdlog |
Shapes, mean and variance of the Double-Pareto Lognormal distribution respectively, defaults to shape1=1.5, shape2=1.5, meanlog=-0.5, sdlog=0.5. |
log , log.p |
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the Pareto distribution |
truncation |
lower truncation parameter, defaults to xmin |
n |
number of observations |
Details
Probability and Cumulative Distribution Function as provided by (Reed and Jorgensen 2004):
f(x) = \frac{shape2shape1}{shape2 + shape1}[ x^{-shape2-1}e^{shape2 meanlog + \frac{shape2^2sdlog^2}{2}}\Phi(\frac{lnx - meanlog - shape2 sdlog^2}{sdlog}) + x^{shape1-1}e^{-shape1 meanlog + \frac{shape1^2sdlog^2}{2}}\Phi^c(\frac{lnx - meanlog +shape1 sdlog^2}{sdlog}) ], \newline F_X(x) = \Phi(\frac{lnx - meanlog }{sdlog}) - \frac{1}{shape2 + shape1}[ shape1 x^{-shape2}e^{shape2 meanlog + \frac{shape2^2sdlog^2}{2}}\Phi(\frac{lnx - meanlog - shape2 sdlog^2}{sdlog}) - shape2 x^{shape1}e^{-shape1 meanlog + \frac{shape1^2sdlog^2}{2}}\Phi^c(\frac{lnx - meanlog +shape1 sdlog^2}{sdlog}) ]
The y-bounded r-th raw moment of the Double-Pareto Lognormal distribution equals:
meanlog^{r}_{y} = -\frac{shape2shape1}{shape2 + shape1}e^{shape2 meanlog + \frac{shape2^2sdlog^2}{2}}\frac{y^{r- shape2}}{r - shape2}\Phi(\frac{lny - meanlog - shape2 sdlog^2}{sdlog}) \newline \qquad- \frac{shape2shape1}{shape2 + shape1} \frac{1}{r-shape2} e^{\frac{ r^2sdlog^2 + 2meanlog r }{2}}\Phi^c(\frac{lny - rsdlog^2 - meanlog}{sdlog}) \newline \qquad -\frac{shape2shape1}{shape2 + shape1}e^{-shape1 meanlog + \frac{shape1^2sdlog^2}{2}}\frac{y^{r + shape1}}{r + shape1}\Phi^c(\frac{lny - meanlog + shape1 sdlog^2}{sdlog}) \newline \qquad + \frac{shape2shape1}{shape2 + shape1} \frac{1}{r+shape1} e^{\frac{ r^2sdlog^2 + 2meanlog r }{2}}\Phi^c(\frac{lny - rsdlog^2 - meanlog}{sdlog}), \qquad shape2>r
Value
ddoubleparetolognormal returns the density, pdoubleparetolognormal the distribution function, qdoubleparetolognormal the quantile function, mdoubleparetolognormal the rth moment of the distribution and rdoubleparetolognormal generates random deviates.
The length of the result is determined by n for rdoubleparetolognormal, and is the maximum of the lengths of the numerical arguments for the other functions.
References
Reed WJ, Jorgensen M (2004). “The Double Pareto-Lognormal Distribution–A New Parametric Model for Size Distributions.” Communications in Statistics - Theory and Methods, 33(8), 1733–1753.
Examples
## Double-Pareto Lognormal density
plot(x = seq(0, 5, length.out = 100), y = ddoubleparetolognormal(x = seq(0, 5, length.out = 100)))
plot(x = seq(0, 5, length.out = 100), y = ddoubleparetolognormal(x = seq(0, 5, length.out = 100),
shape2 = 1))
## Double-Pareto Lognormal relates to the right-pareto Lognormal distribution if
#shape1 goes to infinity
pdoubleparetolognormal(q = 6, shape1 = 1e20, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5)
prightparetolognormal(q = 6, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5)
## Double-Pareto Lognormal relates to the left-pareto Lognormal distribution if
# shape2 goes to infinity
pdoubleparetolognormal(q = 6, shape1 = 1.5, shape2 = 1e20, meanlog = -0.5, sdlog = 0.5)
pleftparetolognormal(q = 6, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5)
## Double-Pareto Lognormal relates to the Lognormal if both shape parameters go to infinity
pdoubleparetolognormal(q = 6, shape1 = 1e20, shape2 = 1e20, meanlog = -0.5, sdlog = 0.5)
plnorm(q = 6, meanlog = -0.5, sdlog = 0.5)
## Demonstration of log functionality for probability and quantile function
qdoubleparetolognormal(pdoubleparetolognormal(2, log.p = TRUE), log.p = TRUE)
## The zeroth truncated moment is equivalent to the probability function
pdoubleparetolognormal(2)
mdoubleparetolognormal(truncation = 2)
## The (truncated) first moment is equivalent to the mean of a (truncated) random sample,
#for large enough samples.
x <- rdoubleparetolognormal(1e5, shape2 = 3)
mean(x)
mdoubleparetolognormal(r = 1, shape2 = 3, lower.tail = FALSE)
sum(x[x > quantile(x, 0.1)]) / length(x)
mdoubleparetolognormal(r = 1, shape2 = 3, truncation = quantile(x, 0.1), lower.tail = FALSE)
Double-Pareto Lognormal MLE
Description
Maximum likelihood estimation of the parameters of the Double-Pareto Lognormal distribution.
Usage
doubleparetolognormal.mle(
x,
lower = c(1e-10, 1e-10, 1e-10),
upper = c(Inf, Inf, Inf),
start = NULL
)
Arguments
x |
data vector |
lower , upper |
Upper and lower bounds for the estimation procedure on the parameters c(shape2,shape1,sdlog), defaults to c(1e-10,1e-10,1e-10) and c(Inf,Inf,Inf) respectively. |
start |
named vector with starting values, default to c(shape2=2,shape1=2,sdlog=sd(log(x))) |
Value
Returns a named list containing a
- coefficients
Named vector of coefficients
- convergence
logical indicator of convergence
- n
Length of the fitted data vector
- np
Nr. of coefficients
Examples
x <- rdoubleparetolognormal(1e3)
## Pareto fit with xmin set to the minium of the sample
doubleparetolognormal.mle(x = x)
Double-Pareto Lognormal coefficients of power-law transformed Double-Pareto Lognormal
Description
Coefficients of a power-law transformed Double-Pareto Lognormal distribution
Usage
doubleparetolognormal_plt(
shape1 = 1.5,
shape2 = 1.5,
meanlog = -0.5,
sdlog = 0.5,
a = 1,
b = 1,
inv = FALSE
)
Arguments
shape1 , shape2 , meanlog , sdlog |
Shapes, mean and variance of the Double-Pareto Lognormal distribution respectively. |
a , b |
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
Details
If the random variable y is Double-Pareto Lognormal distributed with mean meanlog and standard deviation sdlog, then the power-law transformed variable
y = ax^b
is Double-Pareto Lognormal distributed with shape1*b, \frac{meanlog-log(a)}{b}, \frac{sdlog}{b}, shape2*b
.
Value
Returns a named list containing
- coefficients
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables pdoubleparetolognormal(3,shape1 = 1.5, shape2 = 3, meanlog = -0.5, sdlog = 0.5) coeff = doubleparetolognormal_plt(shape1 = 1.5, shape2 = 3, meanlog = -0.5, sdlog = 0.5,a=5,b=7)$coefficients pdoubleparetolognormal(5*3^7,shape1=coeff[["shape1"]],shape2=coeff[["shape2"]],meanlog=coeff[["meanlog"]],sdlog=coeff[["sdlog"]])
pdoubleparetolognormal(5*0.9^7,shape1 = 1.5, shape2 = 3, meanlog = -0.5, sdlog = 0.5) coeff = doubleparetolognormal_plt(shape1 = 1.5, shape2 = 3, meanlog = -0.5, sdlog = 0.5,a=5,b=7, inv=TRUE)$coefficients pdoubleparetolognormal(0.9,shape1=coeff[["shape1"]],shape2=coeff[["shape2"]],meanlog=coeff[["meanlog"]],sdlog=coeff[["sdlog"]])
The empirical distribution
Description
Density, distribution function, quantile function, and raw moments for the empirical distribution.
Usage
dempirical(x, data, log = FALSE)
pempirical(q, data, log.p = FALSE, lower.tail = TRUE)
qempirical(p, data, lower.tail = TRUE, log.p = FALSE)
mempirical(r = 0, data, truncation = NULL, lower.tail = TRUE)
Arguments
x , q |
vector of quantiles |
data |
data vector |
log , log.p |
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), moments are |
p |
vector of probabilities |
r |
rth raw moment of the Pareto distribution |
truncation |
lower truncation parameter, defaults to NULL. |
Details
The density function is a standard Kernel density estimation for 1e6 equally spaced points. The cumulative Distribution Function:
F_n(x) = \frac{1}{n}\sum_{i=1}^{n}I_{x_i \leq x}
The y-bounded r-th raw moment of the empirical distribution equals:
\mu^{r}_{y} = \frac{1}{n}\sum_{i=1}^{n}I_{x_i \leq x}x^r
Value
dempirical returns the density, pempirical the distribution function, qempirical the quantile function, mempirical gives the rth moment of the distribution or a function that allows to evaluate the rth moment of the distribution if truncation is NULL..
Examples
#'
## Generate random sample to work with
x <- rlnorm(1e5, meanlog = -0.5, sdlog = 0.5)
## Empirical density
plot(x = seq(0, 5, length.out = 100), y = dempirical(x = seq(0, 5, length.out = 100), data = x))
# Compare empirical and parametric quantities
dlnorm(0.5, meanlog = -0.5, sdlog = 0.5)
dempirical(0.5, data = x)
plnorm(0.5, meanlog = -0.5, sdlog = 0.5)
pempirical(0.5, data = x)
qlnorm(0.5, meanlog = -0.5, sdlog = 0.5)
qempirical(0.5, data = x)
mlnorm(r = 0, truncation = 0.5, meanlog = -0.5, sdlog = 0.5)
mempirical(r = 0, truncation = 0.5, data = x)
mlnorm(r = 1, truncation = 0.5, meanlog = -0.5, sdlog = 0.5)
mempirical(r = 1, truncation = 0.5, data = x)
## Demonstration of log functionailty for probability and quantile function
quantile(x, 0.5, type = 1)
qempirical(p = pempirical(q = quantile(x, 0.5, type = 1), data = x, log.p = TRUE),
data = x, log.p = TRUE)
## The zeroth truncated moment is equivalent to the probability function
pempirical(q = quantile(x, 0.5, type = 1), data = x)
mempirical(truncation = quantile(x, 0.5, type = 1), data = x)
## The (truncated) first moment is equivalent to the mean of a (truncated) random sample,
#for large enough samples.
mean(x)
mempirical(r = 1, data = x, truncation = 0, lower.tail = FALSE)
sum(x[x > quantile(x, 0.1)]) / length(x)
mempirical(r = 1, data = x, truncation = quantile(x, 0.1), lower.tail = FALSE)
#'
The Exponential distribution
Description
Raw moments for the exponential distribution.
Usage
mexp(r = 0, truncation = 0, rate = 1, lower.tail = TRUE)
Arguments
r |
rth raw moment of the distribution, defaults to 1. |
truncation |
lower truncation parameter, defaults to 0. |
rate |
rate of the distribution with default values of 1. |
lower.tail |
logical; if TRUE (default), moments are |
Details
Probability and Cumulative Distribution Function:
f(x) = \frac{1}{s}e^{-\frac{\omega}{s}} , \qquad F_X(x) = 1-e^{-\frac{\omega}{s}}
The y-bounded r-th raw moment of the distribution equals:
s^{\sigma_s - 1} \Gamma\left(\sigma_s +1, \frac{y}{s} \right)
where \Gamma(,)
denotes the upper incomplete gamma function.
Value
Returns the truncated rth raw moment of the distribution.
Examples
## The zeroth truncated moment is equivalent to the probability function
pexp(2, rate = 1)
mexp(truncation = 2)
## The (truncated) first moment is equivalent to the mean of a (truncated) random sample,
#for large enough samples.
x <- rexp(1e5, rate = 1)
mean(x)
mexp(r = 1, lower.tail = FALSE)
sum(x[x > quantile(x, 0.1)]) / length(x)
mexp(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE)
Fitted distributions to the US Census 2000 city size distribution.
Description
A dataset containing 52 distribution fits to the US Census 2000 city size distributions
Usage
fit_US_cities
Format
A data frame with 52 rows and 7 variables:
- dist
distribution
- components
number of components
- prior
list of prior weights for the individual distribution components of FMM
- coefficients
list of coefficients for the distributions
- np
Number of paramters
- n
Number of observations
- convergence
Logical indicating whether the fitting procedure converged
Source
http://doi.org/10.3886/E113328V1
The Fréchet distribution
Description
Density, distribution function, quantile function, raw moments and random generation for the Fréchet distribution.
Usage
dfrechet(x, shape = 1.5, scale = 0.5, log = FALSE)
pfrechet(q, shape = 1.5, scale = 0.5, log.p = FALSE, lower.tail = TRUE)
qfrechet(p, shape = 1.5, scale = 0.5, log.p = FALSE, lower.tail = TRUE)
mfrechet(r = 0, truncation = 0, shape = 1.5, scale = 0.5, lower.tail = TRUE)
rfrechet(n, shape = 1.5, scale = 0.5)
Arguments
x , q |
vector of quantiles |
shape , scale |
Shape and scale of the Fréchet distribution, defaults to 1.5 and 0.5 respectively. |
log , log.p |
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the distribution |
truncation |
lower truncation parameter |
n |
number of observations |
Details
Probability and Cumulative Distribution Function:
f(x) =\frac{shape}{scale}\left(\frac{\omega}{scale}\right)^{-1-shape} e^{-\left(\frac{\omega}{scale}\right)^{-shape}}, \qquad F_X(x) = e^{-\left(\frac{\omega}{scale}\right)^{-shape}}
The y-bounded r-th raw moment of the Fréchet distribution equals:
\mu^{r}_{y} = scale^{\sigma_s - 1} \left[1-\Gamma\left(1-\frac{\sigma_s - 1}{shape}, \left(\frac{y}{scale}\right)^{-shape} \right)\right], \qquad shape>r
Value
dfrechet returns the density, pfrechet the distribution function, qfrechet the quantile function, mfrechet the rth moment of the distribution and rfrechet generates random deviates.
The length of the result is determined by n for rfrechet, and is the maximum of the lengths of the numerical arguments for the other functions.
Examples
## Frechet density
plot(x = seq(0, 5, length.out = 100), y = dfrechet(x = seq(0, 5, length.out = 100),
shape = 1, scale = 1))
plot(x = seq(0, 5, length.out = 100), y = dfrechet(x = seq(0, 5, length.out = 100),
shape = 2, scale = 1))
plot(x = seq(0, 5, length.out = 100), y = dfrechet(x = seq(0, 5, length.out = 100),
shape = 3, scale = 1))
plot(x = seq(0, 5, length.out = 100), y = dfrechet(x = seq(0, 5, length.out = 100),
shape = 3, scale = 2))
## frechet is also called the inverse weibull distribution, which is available in the stats package
pfrechet(q = 5, shape = 2, scale = 1.5)
1 - pweibull(q = 1 / 5, shape = 2, scale = 1 / 1.5)
## Demonstration of log functionality for probability and quantile function
qfrechet(pfrechet(2, log.p = TRUE), log.p = TRUE)
## The zeroth truncated moment is equivalent to the probability function
pfrechet(2)
mfrechet(truncation = 2)
## The (truncated) first moment is equivalent to the mean of a (truncated) random sample,
#for large enough samples.
x <- rfrechet(1e5, scale = 1)
mean(x)
mfrechet(r = 1, lower.tail = FALSE, scale = 1)
sum(x[x > quantile(x, 0.1)]) / length(x)
mfrechet(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE, scale = 1)
Fréchet MLE
Description
Maximum likelihood estimation of the coefficients of the Fréchet distribution
Usage
frechet.mle(
x,
weights = NULL,
start = c(shape = 1.5, scale = 0.5),
lower = c(1e-10, 1e-10),
upper = c(Inf, Inf)
)
Arguments
x |
data vector |
weights |
numeric vector for weighted MLE, should have the same length as data vector x |
start |
named vector with starting values, default to c(shape=1.5,scale=0.5) |
lower , upper |
Lower and upper bounds to the estimated shape parameter, defaults to 1e-10 and Inf respectively |
Value
Returns a named list containing a
- coefficients
Named vector of coefficients
- convergence
logical indicator of convergence
- n
Length of the fitted data vector
- np
Nr. of coefficients
x = rfrechet(1e3)
## Pareto fit with xmin set to the minium of the sample frechet.mle(x=x)
Fréchet coefficients after power-law transformation
Description
Coefficients of a power-law transformed Fréchet distribution
Usage
frechet_plt(shape = 1.5, scale = 0.5, a = 1, b = 1, inv = FALSE)
Arguments
shape , scale |
Scale and shape of the Fréchet distribution, defaults to 1.5 and 0.5 respectively. |
a , b |
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
Details
If the random variable x is Fréchet distributed with scale shape and shape scale, then the power-law transformed variable
y = ax^b
is Fréchet distributed with scale \left( \frac{scale}{a}\right)^{\frac{1}{b}}
and shape b*k
.
Value
Returns a named list containing
- coefficients
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables pfrechet(3,shape=2,scale=1) coeff = frechet_plt(shape=2,scale=1,a=5,b=7)$coefficients pfrechet(5*3^7,shape=coeff[["shape"]],scale=coeff[["scale"]])
pfrechet(5*0.8^7,shape=2,scale=1) coeff = frechet_plt(shape=2,scale=1,a=5,b=7,inv=TRUE)$coefficients pfrechet(0.8,shape=coeff[["shape"]],scale=coeff[["scale"]])
The Gamma distribution
Description
Raw moments for the Gamma distribution.
Usage
mgamma(
r = 0,
truncation = 0,
shape = 2,
rate = 1,
scale = 1/rate,
lower.tail = TRUE
)
Arguments
r |
rth raw moment of the distribution, defaults to 1. |
truncation |
lower truncation parameter, defaults to 0. |
shape , rate , scale |
shape, rate and scale of the distribution with default values of 2 and 1 respectively. |
lower.tail |
logical; if TRUE (default), moments are |
Details
Probability and Cumulative Distribution Function:
f(x) = \frac{1}{s^k\Gamma(k)}\omega^{k-1}e^{-\frac{\omega}{s}},\qquad F_X(x) = \frac{1}{\Gamma(k)}\gamma(k,\frac{\omega}{s})
,
where \Gamma(x)
stands for the upper incomplete gamma function function, while \gamma(s,x)
stands for the lower incomplete Gamma function with upper bound x
.
The y-bounded r-th raw moment of the distribution equals:
\mu^r_y = \frac{s^{r}}{\Gamma(k)} \Gamma\left(r + k , \frac{y}{s} \right)
Value
Provides the truncated rth raw moment of the distribution.
## The zeroth truncated moment is equivalent to the probability function pgamma(2,shape=2,rate=1) mgamma(truncation=2)
## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x = rgamma(1e5,shape=2,rate=1) mean(x) mgamma(r=1,lower.tail=FALSE)
sum(x[x>quantile(x,0.1)])/length(x) mgamma(r=1,truncation=quantile(x,0.1),lower.tail=FALSE)
The Inverse Pareto distribution
Description
Density, distribution function, quantile function, raw moments and random generation for the Pareto distribution.
Usage
dinvpareto(x, k = 1.5, xmax = 5, log = FALSE, na.rm = FALSE)
pinvpareto(
q,
k = 1.5,
xmax = 5,
lower.tail = TRUE,
log.p = FALSE,
log = FALSE,
na.rm = FALSE
)
qinvpareto(p, k = 1.5, xmax = 5, lower.tail = TRUE, log.p = FALSE)
minvpareto(r = 0, truncation = 0, k = 1.5, xmax = 5, lower.tail = TRUE)
rinvpareto(n, k = 1.5, xmax = 5)
Arguments
x , q |
vector of quantiles |
xmax , k |
Scale and shape of the Inverse Pareto distribution, defaults to 5 and 1.5 respectively. |
log , log.p |
logical; if TRUE, probabilities p are given as log(p). |
na.rm |
Removes values that fall outside the support of the distribution |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the Inverse Pareto distribution |
truncation |
lower truncation parameter, defaults to xmin |
n |
number of observations |
Details
Probability and Cumulative Distribution Function:
f(x) =\frac{k x_{max}^{-k}}{x^{-k+1}}, \qquad F_X(x) = (\frac{x_{max} }{x})^{-k}
The y-bounded r-th raw moment of the Inverse Pareto distribution equals:
\mu^r_y =k\omega_{max}^{-k}\frac{\omega_{max}^{r+k}- y^{r+k}}{r+k}
Value
dinvpareto returns the density, pinvpareto the distribution function, qinvpareto the quantile function, minvpareto the rth moment of the distribution and rinvpareto generates random deviates.
The length of the result is determined by n for rinvpareto, and is the maximum of the lengths of the numerical arguments for the other functions.
Examples
## Inverse invpareto density
plot(x = seq(0, 5, length.out = 100), y = dinvpareto(x = seq(0, 5, length.out = 100)))
## Demonstration of log functionality for probability and quantile function
qinvpareto(pinvpareto(2, log.p = TRUE), log.p = TRUE)
## The zeroth truncated moment is equivalent to the probability function
pinvpareto(2)
minvpareto(truncation = 2)
## The (truncated) first moment is equivalent to the mean of a (truncated) random sample,
#for large enough samples.
x <- rinvpareto(1e5)
mean(x)
minvpareto(r = 1, lower.tail = FALSE)
sum(x[x > quantile(x, 0.1)]) / length(x)
minvpareto(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE)
Inverse Pareto MLE
Description
Maximum likelihood estimation of the Inverse Pareto shape parameter using the Hill estimator.
Usage
invpareto.mle(x, xmax = NULL, clauset = FALSE, q = 1)
Arguments
x |
data vector |
xmax |
scale parameter of the Inverse Pareto distribution, set to max(x) if not provided |
clauset |
Indicator variable for calculating the scale parameter using the clauset method, overrides provided xmax |
q |
Percentage of data to search over (starting from the smallest values), dafults to 1. |
Details
The Hill estimator equals
\hat{k} =- \frac{1}{\frac{1}{n}\sum_{i=1}^{n}log\frac{x_{max}}{x_i}}
Value
Returns a named list containing a
- coefficients
Named vector of coefficients
- convergence
logical indicator of convergence
- n
Length of the fitted data vector
- np
Nr. of coefficients
Examples
x <- rinvpareto(1e3, k = 1.5, xmax = 5)
## Pareto fit with xmin set to the minium of the sample
invpareto.mle(x = x)
## Pareto fit with xmin set to its real value
invpareto.mle(x = x, xmax = 5)
## Pareto fit with xmin determined by the Clauset method
invpareto.mle(x = x, clauset = TRUE)
Inverse Pareto coefficients after power-law transformation
Description
Coefficients of a power-law transformed Inverse Pareto distribution
Usage
invpareto_plt(xmax = 5, k = 1.5, a = 1, b = 1, inv = FALSE)
Arguments
xmax , k |
Scale and shape of the Inverse Pareto distribution, defaults to 5 and 1.5 respectively. |
a , b |
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
Details
If the random variable x is Inverse Pareto-distributed with scale xmin and shape k, then the power-law transformed variable
y = ax^b
is Inverse Pareto distributed with scale ( \frac{xmin}{a})^{\frac{1}{b}}
and shape b*k
.
Value
Returns a named list containing
- coefficients
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables pinvpareto(3,k=2,xmax=5) coeff = invpareto_plt(xmax=5,k=2,a=5,b=7)$coefficients pinvpareto(5*3^7,k=coeff[["k"]],xmax=coeff[["xmax"]])
pinvpareto(5*0.9^7,k=2,xmax=5) coeff = invpareto_plt(xmax=5,k=2,a=5,b=7, inv=TRUE)$coefficients pinvpareto(0.9,k=coeff[["k"]],xmax=coeff[["xmax"]])
The Left-Pareto Lognormal distribution
Description
Density, distribution function, quantile function and random generation for the Left-Pareto Lognormal distribution.
Usage
dleftparetolognormal(x, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5, log = FALSE)
pleftparetolognormal(
q,
shape1 = 1.5,
meanlog = -0.5,
sdlog = 0.5,
lower.tail = TRUE,
log.p = FALSE
)
qleftparetolognormal(
p,
shape1 = 1.5,
meanlog = -0.5,
sdlog = 0.5,
lower.tail = TRUE,
log.p = FALSE
)
mleftparetolognormal(
r = 0,
truncation = 0,
shape1 = 1.5,
meanlog = -0.5,
sdlog = 0.5,
lower.tail = TRUE
)
rleftparetolognormal(n, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5)
Arguments
x , q |
vector of quantiles |
shape1 , meanlog , sdlog |
Shape, mean and variance of the Left-Pareto Lognormal distribution respectively. |
log , log.p |
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the Pareto distribution |
truncation |
lower truncation parameter, defaults to xmin |
n |
number of observations |
Details
Probability and Cumulative Distribution Function as provided by (Reed and Jorgensen 2004):
f(x) = shape1\omega^{shape1-1}e^{-shape1 meanlog + \frac{shape1^2sdlog^2}{2}}\Phi^c(\frac{ln\omega - meanlog +shape1 sdlog^2}{sdlog}), \newline F_X(x) = \Phi(\frac{ln\omega - meanlog }{sdlog}) - \omega^{shape1}e^{-shape1 meanlog + \frac{shape1^2sdlog^2}{2}}\Phi^c(\frac{ln\omega - meanlog +shape1 sdlog^2}{sdlog})
The y-bounded r-th raw moment of the Let-Pareto Lognormal distribution equals:
meanlog^{r}_{y} = -shape1e^{-shape1 meanlog + \frac{shape1^2sdlog^2}{2}}\frac{y^{\sigma_s + shape1-1}}{\sigma_s + shape1 - 1}\Phi^c(\frac{lny - meanlog + shape1 sdlog^2}{sdlog}) \newline \qquad + \frac{shape1}{r+shape1} e^{\frac{ r^2sdlog^2 + 2meanlog r }{2}}\Phi^c(\frac{lny - rsdlog^2 + meanlog}{sdlog})
Value
dleftparetolognormal gives the density, pleftparetolognormal gives the distribution function, qleftparetolognormal gives the quantile function, mleftparetolognormal gives the rth moment of the distribution and rleftparetolognormal generates random deviates.
The length of the result is determined by n for rleftparetolognormal, and is the maximum of the lengths of the numerical arguments for the other functions.
References
Reed WJ, Jorgensen M (2004). “The Double Pareto-Lognormal Distribution–A New Parametric Model for Size Distributions.” Communications in Statistics - Theory and Methods, 33(8), 1733–1753.
## Left-Pareto Lognormal density plot(x=seq(0,5,length.out=100),y=dleftparetolognormal(x=seq(0,5,length.out=100))) plot(x=seq(0,5,length.out=100),y=dleftparetolognormal(x=seq(0,5,length.out=100),shape1=1))
## Left-Pareto Lognormal relates to the Lognormal if the shape parameter goes to infinity pleftparetolognormal(q=6,shape1=1e20,meanlog=-0.5,sdlog=0.5) plnorm(q=6,meanlog=-0.5,sdlog=0.5)
## Demonstration of log functionality for probability and quantile function qleftparetolognormal(pleftparetolognormal(2,log.p=TRUE),log.p=TRUE)
## The zeroth truncated moment is equivalent to the probability function pleftparetolognormal(2) mleftparetolognormal(truncation=2)
## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x = rleftparetolognormal(1e5)
mean(x) mleftparetolognormal(r=1,lower.tail=FALSE)
sum(x[x>quantile(x,0.1)])/length(x) mleftparetolognormal(r=1,truncation=quantile(x,0.1),lower.tail=FALSE)
Left-Pareto Lognormal MLE
Description
Maximum likelihood estimation of the parameters of the Left-Pareto Lognormal distribution.
Usage
leftparetolognormal.mle(
x,
lower = c(1e-10, 1e-10),
upper = c(Inf, Inf),
start = NULL
)
Arguments
x |
data vector |
lower , upper |
Upper and lower bounds for the estimation procedure on the parameters c(shape1,sdlog), defaults to c(1e-10,1e-10) and c(Inf,Inf) respectively. |
start |
named vector with starting values, default to c(shape1=2,sdlog=sd(log(x))) |
Value
Returns a named list containing a
- coefficients
Named vector of coefficients
- convergence
logical indicator of convergence
- n
Length of the fitted data vector
- np
Nr. of coefficients
x = rleftparetolognormal(1e3)
## Pareto fit with xmin set to the minium of the sample leftparetolognormal.mle(x=x)
Left-Pareto Lognormal coefficients of power-law transformed Left-Pareto Lognormal
Description
Coefficients of a power-law transformed Left-Pareto Lognormal distribution
Usage
leftparetolognormal_plt(
shape1 = 1.5,
meanlog = -0.5,
sdlog = 0.5,
a = 1,
b = 1,
inv = FALSE
)
Arguments
shape1 , meanlog , sdlog |
Shapes, mean and variance of the Left-Pareto Lognormal distribution respectively. |
a , b |
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
Details
If the random variable y is Left-Pareto Lognormal distributed with mean meanlog and standard deviation sdlog, then the power-law transformed variable
y = ax^b
is Left-Pareto Lognormal distributed with shape1*b, \frac{meanlog-log(a)}{b}, \frac{sdlog}{b}
.
Value
Returns a named list containing
- coefficients
Named vector of coefficients
Examples
## Comparing probabilites of power-law transformed transformed variables
pleftparetolognormal(3, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5)
coeff <- leftparetolognormal_plt(shape1 = 1.5, meanlog = -0.5, sdlog = 0.5,
a = 5, b = 7)$coefficients
pleftparetolognormal(5 * 3^7, shape1 = coeff[["shape1"]], meanlog = coeff[["meanlog"]],
sdlog = coeff[["sdlog"]])
pleftparetolognormal(5 * 0.9^7, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5)
coeff <- leftparetolognormal_plt(shape1 = 1.5, meanlog = -0.5, sdlog = 0.5, a = 5, b = 7,
inv = TRUE)$coefficients
pleftparetolognormal(0.9, shape1 = coeff[["shape1"]], meanlog = coeff[["meanlog"]],
sdlog = coeff[["sdlog"]])
Vuong's closeness test
Description
Likelihood ratio test for model selection using the Kullback-Leibler information criterion (Vuong 1989)
Usage
llr_vuong(x, y, np.x, np.y, corr = c("none", "BIC", "AIC"))
Arguments
x , y |
vector of log-likelihoods |
np.x , np.y |
Number of paremeters respectively |
corr |
type of correction for parameters, defaults to none. |
Value
returns data frame with test statistic, p-value and character vector indicating the test outcome.
References
Vuong QH (1989). “Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses.” Econometrica, 57(2), 307–333.
Examples
x <- rlnorm(1e4, meanlog = -0.5, sdlog = 0.5)
pareto_fit <- combdist.mle(x = x, dist = "pareto")
pareto_loglike <- dcombdist(x = x, dist = "pareto", coeff = pareto_fit$coefficients, log = TRUE)
lnorm_fit <- combdist.mle(x = x, dist = "lnorm")
lnorm_loglike <- dcombdist(x = x, dist = "lnorm", coeff = lnorm_fit$coefficients, log = TRUE)
llr_vuong(x = pareto_loglike, y = lnorm_loglike, np.x = pareto_fit$np, np.y = lnorm_fit$np)
# BIC type parameter correction
llr_vuong(x = pareto_loglike, y = lnorm_loglike, np.x = pareto_fit$np, np.y = lnorm_fit$np,
corr = "BIC")
# AIC type parameter correction
llr_vuong(x = pareto_loglike, y = lnorm_loglike, np.x = pareto_fit$np, np.y = lnorm_fit$np,
corr = "AIC")
The Lognormal distribution
Description
Raw moments for the Lognormal distribution.
Usage
mlnorm(r = 0, truncation = 0, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE)
Arguments
r |
rth raw moment of the distribution, defaults to 1. |
truncation |
lower truncation parameter, defaults to 0. |
meanlog , sdlog |
mean and standard deviation of the distribution on the log scale with default values of 0 and 1 respectively. |
lower.tail |
logical; if TRUE (default), moments are |
Details
Probability and Cumulative Distribution Function:
f(x) = \frac{1}{{x Var \sqrt {2\pi } }}e^{- (lnx - \mu )^2/ 2Var^2} , \qquad F_X(x) = \Phi(\frac{lnx- \mu}{Var})
The y-bounded r-th raw moment of the Lognormal distribution equals:
\mu^r_y = e^{\frac{r (rVar^2 + 2\mu)}{2}}[1-\Phi(\frac{lny - (rVar^2 + \mu)}{Var})]
Value
Provides the y-bounded, rth raw moment of the distribution.
Examples
## The zeroth truncated moment is equivalent to the probability function
plnorm(2, meanlog = -0.5, sdlog = 0.5)
mlnorm(truncation = 2)
## The (truncated) first moment is equivalent to the mean of a (truncated) random sample,
#for large enough samples.
x <- rlnorm(1e5, meanlog = -0.5, sdlog = 0.5)
mean(x)
mlnorm(r = 1, lower.tail = FALSE)
sum(x[x > quantile(x, 0.1)]) / length(x)
mlnorm(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE)
Log Normal coefficients of power-law transformed log normal
Description
Coefficients of a power-law transformed log normal distribution
Usage
lnorm_plt(meanlog = 0, sdlog = 1, a = 1, b = 1, inv = FALSE)
Arguments
meanlog , sdlog |
mean and standard deviation of the log normal distributed variable, defaults to 0 and 1 respectively. |
a , b |
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
Details
If the random variable y is log normally distributed with mean meanlog and standard deviation sdlog, then the power-law transformed variable
y = ax^b
is log normally distributed with mean \frac{meanlog - ln(a)}{b}
and standard deviation \frac{sdlog}{b}
.
Value
Returns a named list containing
- coefficients
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables plnorm(3,meanlog=-0.5,sdlog=0.5) coeff = lnorm_plt(meanlog=-0.5,sdlog=0.5,a=5,b=7)$coefficients plnorm(5*3^7,meanlog=coeff[["meanlog"]],sdlog=coeff[["sdlog"]])
plnorm(5*0.8^7,meanlog=-0.5,sdlog=0.5) coeff = lnorm_plt(meanlog=-0.5,sdlog=0.5,a=5,b=7,inv=TRUE)$coefficients plnorm(0.8,meanlog=coeff[["meanlog"]],sdlog=coeff[["sdlog"]])
## Comparing the first moments and sample means of power-law transformed variables for large enough samples x = rlnorm(1e5,meanlog=-0.5,sdlog=0.5) coeff = lnorm_plt(meanlog=-0.5,sdlog=0.5,a=2,b=0.5)$coefficients y = rlnorm(1e5,meanlog=coeff[["meanlog"]],sdlog=coeff[["sdlog"]]) mean(2*x^0.5) mean(y) mlnorm(r=1,meanlog=coeff[["meanlog"]],sdlog=coeff[["sdlog"]],lower.tail=FALSE)
Normalized Absolute Deviation
Description
Calculates the Normalized Absolute Deviation between the empirical moments and the moments of the provided distribution. Corresponds to the Kolmogorov-Smirnov test statistic for the zeroth moment.
Usage
nmad_test(
x,
r = 0,
dist,
prior = 1,
coeff,
stat = c("NULL", "max", "sum"),
...
)
Arguments
x |
data vector |
r |
moment parameter |
dist |
character vector containing distribution |
prior |
named list of priors, defaults to 1 |
coeff |
named list of coefficients |
stat |
character vector indicating which statistic should be calculated: none (NULL), the maximum deviation "max" or the sum of deviations "sum". Defaults to NULL. |
... |
Additional arguments can be passed to the parametric moment call. |
Examples
x <- rlnorm(1e2, meanlog = -0.5, sdlog = 0.5)
nmad_test(x = x, r = 0, dist = "lnorm", coeff = c(meanlog = -0.5, sdlog = 0.5))
nmad_test(x = x, r = 0, dist = "lnorm", coeff = c(meanlog = -0.5, sdlog = 0.5), stat = "max")
nmad_test(x = x, r = 0, dist = "lnorm", coeff = c(meanlog = -0.5, sdlog = 0.5), stat = "sum")
The Pareto distribution
Description
Density, distribution function, quantile function, raw moments and random generation for the Pareto distribution.
Usage
dpareto(x, k = 2, xmin = 1, log = FALSE, na.rm = FALSE)
ppareto(q, k = 2, xmin = 1, lower.tail = TRUE, log.p = FALSE, na.rm = FALSE)
qpareto(p, k = 2, xmin = 1, lower.tail = TRUE, log.p = FALSE)
mpareto(r = 0, truncation = xmin, k = 2, xmin = 1, lower.tail = TRUE)
rpareto(n, k = 2, xmin = 1)
Arguments
x , q |
vector of quantiles |
xmin , k |
Scale and shape of the Pareto distribution, defaults to 1 and 2 respectively. |
log , log.p |
logical; if TRUE, probabilities p are given as log(p). |
na.rm |
Removes values that fall outside the support of the distribution |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the Pareto distribution |
truncation |
lower truncation parameter, defaults to xmin |
n |
number of observations |
Details
Probability and Cumulative Distribution Function:
f(x) = \frac{kx_{min}^{k}}{x^{k+1}}, \qquad F_X(x) = 1-(\frac{x_{min} }{x})^{k}
The y-bounded r-th raw moment of the Pareto distribution equals:
\mu^{r}_{y} = k x_{min}^k \frac{- y^{r-k} }{r-k}, \qquad k>r
Value
dpareto returns the density, ppareto the distribution function, qpareto the quantile function, mpareto the rth moment of the distribution and rpareto generates random deviates.
The length of the result is determined by n for rpareto, and is the maximum of the lengths of the numerical arguments for the other functions.
Examples
## Pareto density
plot(x = seq(1, 5, length.out = 100), y = dpareto(x = seq(1, 5, length.out = 100), k = 2, xmin = 1))
## Pareto relates to the exponential distribution available in the stats package
ppareto(q = 5, k = 2, xmin = 3)
pexp(q = log(5 / 3), rate = 2)
## Demonstration of log functionality for probability and quantile function
qpareto(ppareto(2, log.p = TRUE), log.p = TRUE)
## The zeroth truncated moment is equivalent to the probability function
ppareto(2)
mpareto(truncation = 2)
## The (truncated) first moment is equivalent to the mean of a (truncated) random sample,
#for large enough samples.
x <- rpareto(1e5)
mean(x)
mpareto(r = 1, lower.tail = FALSE)
sum(x[x > quantile(x, 0.1)]) / length(x)
mpareto(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE)
Pareto MLE
Description
Maximum likelihood estimation of the Pareto shape parameter using the Hill estimator.
Usage
pareto.mle(x, xmin = NULL, clauset = FALSE, q = 0, lower = 1e-10, upper = Inf)
Arguments
x |
data vector |
xmin |
scale parameter of the Pareto distribution, set to min(x) if not provided |
clauset |
Indicator variable for calculating the scale parameter using the clauset method, overrides provided xmin |
q |
Percentage of data to search over (starting from the largest values), defaults to 0. |
lower , upper |
Lower and upper bounds to the estimated shape parameter, defaults to 1e-10 and Inf respectively |
Details
The Hill estimator equals
\hat{k} = \frac{1}{ \frac{1}{n} \sum_{i=1}^{n} log \frac{x_i}{x_{min}}}
Value
Returns a named list containing a
- coefficients
Named vector of coefficients
- convergence
logical indicator of convergence
- n
Length of the fitted data vector
- np
Nr. of coefficients
Examples
x <- rpareto(1e3, k = 2, xmin = 2)
## Pareto fit with xmin set to the minium of the sample
pareto.mle(x = x)
## Pareto fit with xmin set to its real value
pareto.mle(x = x, xmin = 2)
## Pareto fit with xmin determined by the Clauset method
pareto.mle(x = x, clauset = TRUE)
Pareto coefficients after power-law transformation
Description
Coefficients of a power-law transformed Pareto distribution
Usage
pareto_plt(xmin = 1, k = 2, a = 1, b = 1, inv = FALSE)
Arguments
xmin , k |
Scale and shape of the Pareto distribution, defaults to 1 and 2 respectively. |
a , b |
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
Details
If the random variable x is Pareto-distributed with scale xmin and shape k, then the power-law transformed variable
y = ax^b
is Pareto distributed with scale ( \frac{xmin}{a})^{\frac{1}{b}}
and shape b*k
.
Value
Returns a named list containing
- coefficients
Named vector of coefficients
Examples
## Comparing probabilites of power-law transformed transformed variables
ppareto(3, k = 2, xmin = 2)
coeff <- pareto_plt(xmin = 2, k = 2, a = 5, b = 7)$coefficients
ppareto(5 * 3^7, k = coeff[["k"]], xmin = coeff[["xmin"]])
ppareto(5 * 0.9^7, k = 2, xmin = 2)
coeff <- pareto_plt(xmin = 2, k = 2, a = 5, b = 7, inv = TRUE)$coefficients
ppareto(0.9, k = coeff[["k"]], xmin = coeff[["xmin"]])
## Comparing the first moments and sample means of power-law transformed variables for
#large enough samples
x <- rpareto(1e5, k = 2, xmin = 2)
coeff <- pareto_plt(xmin = 2, k = 2, a = 2, b = 0.5)$coefficients
y <- rpareto(1e5, k = coeff[["k"]], xmin = coeff[["xmin"]])
mean(2 * x^0.5)
mean(y)
mpareto(r = 1, k = coeff[["k"]], xmin = coeff[["xmin"]], lower.tail = FALSE)
The Right-Pareto Lognormal distribution
Description
Density, distribution function, quantile function and random generation for the Right-Pareto Lognormal distribution.
Usage
drightparetolognormal(
x,
shape2 = 1.5,
meanlog = -0.5,
sdlog = 0.5,
log = FALSE
)
prightparetolognormal(
q,
shape2 = 1.5,
meanlog = -0.5,
sdlog = 0.5,
lower.tail = TRUE,
log.p = FALSE
)
qrightparetolognormal(
p,
shape2 = 1.5,
meanlog = -0.5,
sdlog = 0.5,
lower.tail = TRUE,
log.p = FALSE
)
mrightparetolognormal(
r = 0,
truncation = 0,
shape2 = 1.5,
meanlog = -0.5,
sdlog = 0.5,
lower.tail = TRUE
)
rrightparetolognormal(
n,
shape2 = 1.5,
meanlog = -0.5,
sdlog = 0.5,
lower.tail = TRUE
)
Arguments
x , q |
vector of quantiles |
shape2 , meanlog , sdlog |
Shape, mean and variance of the Right-Pareto Lognormal distribution respectively. |
log , log.p |
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the Pareto distribution |
truncation |
lower truncation parameter, defaults to xmin |
n |
number of observations |
Details
Probability and Cumulative Distribution Function as provided by (Reed and Jorgensen 2004):
f(x) = shape2 \omega^{-shape2-1}e^{shape2 meanlog + \frac{shape2^2sdlog^2}{2}}\Phi(\frac{lnx - meanlog - shape2 sdlog^2}{sdlog}), \newline F_X(x) = \Phi(\frac{lnx - meanlog }{sdlog}) - \omega^{-shape2}e^{shape2 meanlog + \frac{shape2^2sdlog^2}{2}}\Phi(\frac{lnx - meanlog - shape2 sdlog^2}{sdlog})
The y-bounded r-th raw moment of the Right-Pareto Lognormal distribution equals:
meanlog^{r}_{y} = -shape2e^{shape2 meanlog + \frac{shape2^2sdlog^2}{2}}\frac{y^{\sigma_s - shape2-1}}{\sigma_s - shape2 - 1}\Phi(\frac{lny - meanlog - shape2 sdlog^2}{sdlog}) \newline \qquad - \frac{shape2}{r-shape2} e^{\frac{ r^2sdlog^2 + 2meanlog r }{2}}\Phi^c(\frac{lny - rsdlog^2 + meanlog}{sdlog}), \qquad shape2>r
Value
drightparetolognormal gives the density, prightparetolognormal gives the distribution function, qrightparetolognormal gives the quantile function, mrightparetolognormal gives the rth moment of the distribution and rrightparetolognormal generates random deviates.
The length of the result is determined by n for rrightparetolognormal, and is the maximum of the lengths of the numerical arguments for the other functions.
References
Reed WJ, Jorgensen M (2004). “The Double Pareto-Lognormal Distribution–A New Parametric Model for Size Distributions.” Communications in Statistics - Theory and Methods, 33(8), 1733–1753.
Examples
## Right-Pareto Lognormal density
plot(x = seq(0, 5, length.out = 100), y = drightparetolognormal(x = seq(0, 5, length.out = 100)))
plot(x = seq(0, 5, length.out = 100), y = drightparetolognormal(x = seq(0, 5, length.out = 100),
shape2 = 1))
## Right-Pareto Lognormal relates to the Lognormal if the shape parameter goes to infinity
prightparetolognormal(q = 6, shape2 = 1e20, meanlog = -0.5, sdlog = 0.5)
plnorm(q = 6, meanlog = -0.5, sdlog = 0.5)
## Demonstration of log functionality for probability and quantile function
qrightparetolognormal(prightparetolognormal(2, log.p = TRUE), log.p = TRUE)
## The zeroth truncated moment is equivalent to the probability function
prightparetolognormal(2)
mrightparetolognormal(truncation = 2)
## The (truncated) first moment is equivalent to the mean of a (truncated) random sample,
#for large enough samples.
x <- rrightparetolognormal(1e5, shape2 = 3)
mean(x)
mrightparetolognormal(r = 1, shape2 = 3, lower.tail = FALSE)
sum(x[x > quantile(x, 0.1)]) / length(x)
mrightparetolognormal(r = 1, shape2 = 3, truncation = quantile(x, 0.1), lower.tail = FALSE)
Right-Pareto Lognormal MLE
Description
Maximum likelihood estimation of the parameters of the Right-Pareto Lognormal distribution.
Usage
rightparetolognormal.mle(
x,
lower = c(1e-10, 1e-10),
upper = c(Inf, Inf),
start = NULL
)
Arguments
x |
data vector |
lower , upper |
Upper and lower bounds for the estimation procedure on the parameters c(shape2,sdlog), defaults to c(1e-10,1e-10) and c(Inf,Inf) respectively. |
start |
named vector with starting values, default to c(shape2=2,sdlog=sd(log(x))) |
Value
Returns a named list containing a
- coefficients
Named vector of coefficients
- convergence
logical indicator of convergence
- n
Length of the fitted data vector
- np
Nr. of coefficients
Examples
x <- rrightparetolognormal(1e3)
## Pareto fit with xmin set to the minium of the sample
rightparetolognormal.mle(x = x)
Right-Pareto Lognormal coefficients of power-law transformed Right-Pareto Lognormal
Description
Coefficients of a power-law transformed Right-Pareto Lognormal distribution
Usage
rightparetolognormal_plt(
shape2 = 1.5,
meanlog = -0.5,
sdlog = 0.5,
a = 1,
b = 1,
inv = FALSE
)
Arguments
shape2 , meanlog , sdlog |
Shapes, mean and variance of the Right-Pareto Lognormal distribution respectively. |
a , b |
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
Details
If the random variable y is Right-Pareto Lognormal distributed with mean meanlog and standard deviation sdlog, then the power-law transformed variable
y = ax^b
is Right-Pareto Lognormal distributed with \frac{meanlog-log(a)}{b}, \frac{sdlog}{b}, shape2*b
.
Value
Returns a named list containing
- coefficients
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables prightparetolognormal(3, shape2 = 3, meanlog = -0.5, sdlog = 0.5) coeff = rightparetolognormal_plt( shape2 = 3, meanlog = -0.5, sdlog = 0.5,a=5,b=7)$coefficients prightparetolognormal(5*3^7,shape2=coeff[["shape2"]],meanlog=coeff[["meanlog"]],sdlog=coeff[["sdlog"]])
prightparetolognormal(5*0.9^7,shape2 = 3, meanlog = -0.5, sdlog = 0.5) coeff = rightparetolognormal_plt(shape2 = 3, meanlog = -0.5, sdlog = 0.5,a=5,b=7, inv=TRUE)$coefficients prightparetolognormal(0.9,shape2=coeff[["shape2"]],meanlog=coeff[["meanlog"]],sdlog=coeff[["sdlog"]])
Truncated distribution
Description
Density, distribution function, quantile function, raw moments and random generation for a truncated distribution.
Usage
dtruncdist(
x,
dist = c("lnormtrunc"),
coeff = list(meanlog = 0, sdlog = 1),
lowertrunc = 0,
uppertrunc = Inf,
log = FALSE
)
ptruncdist(
q,
dist = c("lnormtrunc"),
coeff = list(meanlog = 0, sdlog = 1),
lowertrunc = 0,
uppertrunc = Inf,
log.p = FALSE,
lower.tail = TRUE
)
qtruncdist(
p,
dist = c("lnormtrunc"),
coeff = list(meanlog = 0, sdlog = 1),
lowertrunc = 0,
uppertrunc = Inf,
lower.tail = TRUE,
log.p = FALSE
)
mtruncdist(
r,
truncation = 0,
dist = c("lnormtrunc"),
coeff = list(meanlog = 0, sdlog = 1),
lowertrunc = 0,
uppertrunc = Inf,
lower.tail = TRUE
)
rtruncdist(
n,
dist = c("lnormtrunc"),
coeff = list(meanlog = 0, sdlog = 1),
lowertrunc = 0,
uppertrunc = Inf
)
Arguments
x , q |
vector of quantiles |
dist |
distribution to be truncated, defaults to lnorm |
coeff |
list of parameters for the truncated distribution, defaults to list(meanlog=0,sdlog=1) |
lowertrunc , uppertrunc |
lowertrunc- and uppertrunc truncation points, defaults to 0 and Inf respectively |
log , log.p |
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the distribution |
truncation |
lowertrunc truncation parameter, defaults to 0. |
n |
number of observations |
Details
Probability and Cumulative Distribution Function:
f(x) = \frac{g(x)}{F(uppertrunc)-F(lowertrunc)}, \qquad F_X(x) = \frac{F(x)-F(lowertrunc)}{F(uppertrunc)-F(lowertrunc)}
Value
dtruncdist gives the density, ptruncdist gives the distribution function, qtruncdist gives the quantile function, mtruncdist gives the rth moment of the distribution and rtruncdist generates random deviates.
The length of the result is determined by n for rpareto, and is the maximum of the lengths of the numerical arguments for the other functions.
Examples
## Truncated lognormal density
plot(x = seq(0.5, 3, length.out = 100), y = dtruncdist(x = seq(0.5, 5, length.out = 100),
dist = "lnorm", coeff = list(meanlog = 0.5, sdlog = 0.5), lowertrunc = 0.5, uppertrunc = 5))
lines(x = seq(0, 6, length.out = 100), y = dlnorm(x = seq(0, 6, length.out = 100),
meanlog = 0.5, sdlog = 0.5))
# Compare quantities
dtruncdist(0.5)
dlnorm(0.5)
dtruncdist(0.5, lowertrunc = 0.5, uppertrunc = 3)
ptruncdist(2)
plnorm(2)
ptruncdist(2, lowertrunc = 0.5, uppertrunc = 3)
qtruncdist(0.25)
qlnorm(0.25)
qtruncdist(0.25, lowertrunc = 0.5, uppertrunc = 3)
mtruncdist(r = 0, truncation = 2)
mlnorm(r = 0, truncation = 2, meanlog = 0, sdlog = 1)
mtruncdist(r = 0, truncation = 2, lowertrunc = 0.5, uppertrunc = 3)
mtruncdist(r = 1, truncation = 2)
mlnorm(r = 1, truncation = 2, meanlog = 0, sdlog = 1)
mtruncdist(r = 1, truncation = 2, lowertrunc = 0.5, uppertrunc = 3)
The Weibull distribution
Description
Raw moments for the Weibull distribution.
Usage
mweibull(r = 0, truncation = 0, shape = 2, scale = 1, lower.tail = TRUE)
Arguments
r |
rth raw moment of the distribution, defaults to 1. |
truncation |
lower truncation parameter, defaults to 0. |
shape , scale |
shape and scale of the distribution with default values of 2 and 1 respectively. |
lower.tail |
logical; if TRUE (default), moments are |
Details
Probability and Cumulative Distribution Function:
f(x) = \frac{shape}{scale}(\frac{\omega}{scale})^{shape-1}e^{-(\frac{\omega}{scale})^shape} , \qquad F_X(x) = 1-e^{-(\frac{\omega}{scale})^shape}
The y-bounded r-th raw moment of the distribution equals:
\mu^r_y = scale^{r} \Gamma(\frac{r}{shape} +1, (\frac{y}{scale})^shape )
where \Gamma(,)
denotes the upper incomplete gamma function.
Value
returns the truncated rth raw moment of the distribution.
Examples
## The zeroth truncated moment is equivalent to the probability function
pweibull(2, shape = 2, scale = 1)
mweibull(truncation = 2)
## The (truncated) first moment is equivalent to the mean of a (truncated) random sample,
#for large enough samples.
x <- rweibull(1e5, shape = 2, scale = 1)
mean(x)
mweibull(r = 1, lower.tail = FALSE)
sum(x[x > quantile(x, 0.1)]) / length(x)
mweibull(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE)
Weibull coefficients of power-law transformed Weibull
Description
Coefficients of a power-law transformed Weibull distribution
Usage
weibull_plt(scale = 1, shape = 2, a = 1, b = 1, inv = FALSE)
Arguments
shape , scale |
shape and scale of the distribution with default values of 2 and 1 respectively. |
a , b |
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
Details
If the random variable y is Weibull distributed with mean meanlog and standard deviation sdlog, then the power-law transformed variable
y = ax^b
is Weibull distributed with scale ( \frac{scale}{a})^{\frac{1}{b}}
and shape b*shape
.
Value
Returns a named list containing
- coefficients
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables pweibull(3,shape=2,scale=1) coeff = weibull_plt(shape=2,scale=1,a=5,b=7)$coefficients pweibull(5*3^7,shape=coeff[["shape"]],scale=coeff[["scale"]])
pweibull(5*0.8^7,shape=2,scale=1) coeff = weibull_plt(shape=2,scale=1,a=5,b=7,inv=TRUE)$coefficients pweibull(0.8,shape=coeff[["shape"]],scale=coeff[["scale"]])
## Comparing the first moments and sample means of power-law transformed variables for large enough samples x = rweibull(1e5,shape=2,scale=1) coeff = weibull_plt(shape=2,scale=1,a=2,b=0.5)$coefficients y = rweibull(1e5,shape=coeff[["shape"]],scale=coeff[["scale"]]) mean(2*x^0.5) mean(y) mweibull(r=1,shape=coeff[["shape"]],scale=coeff[["scale"]],lower.tail=FALSE)