Title: | Computation of Tree (Im)Balance Indices |
Version: | 1.2.0 |
Description: | The aim of the 'R' package 'treebalance' is to provide functions for the computation of a large variety of (im)balance indices for rooted trees. The package accompanies the book ”Tree balance indices: a comprehensive survey” by M. Fischer, L. Herbst, S. Kersting, L. Kuehn and K. Wicke (2023) <ISBN: 978-3-031-39799-8>, <doi:10.1007/978-3-031-39800-1>, which gives a precise definition for the terms 'balance index' and 'imbalance index' (Chapter 4) and provides an overview of the terminology in this manual (Chapter 2). For further information on (im)balance indices, see also Fischer et al. (2021) https://treebalance.wordpress.com. Considering both established and new (im)balance indices, 'treebalance' provides (among others) functions for calculating the following 18 established indices and index families: the average leaf depth, the B1 and B2 index, the Colijn-Plazzotta rank, the normal, corrected, quadratic and equal weights Colless index, the family of Colless-like indices, the family of I-based indices, the Rogers J index, the Furnas rank, the rooted quartet index, the s-shape statistic, the Sackin index, the symmetry nodes index, the total cophenetic index and the variance of leaf depths. Additionally, we include 9 tree shape statistics that satisfy the definition of an (im)balance index but have not been thoroughly analyzed in terms of tree balance in the literature yet. These are: the total internal path length, the total path length, the average vertex depth, the maximum width, the modified maximum difference in widths, the maximum depth, the maximum width over maximum depth, the stairs1 and the stairs2 index. As input, most functions of 'treebalance' require a rooted (phylogenetic) tree in 'phylo' format (as introduced in 'ape' 1.9 in November 2006). 'phylo' is used to store (phylogenetic) trees with no vertices of out-degree one. For further information on the format we kindly refer the reader to E. Paradis (2012) http://ape-package.ird.fr/misc/FormatTreeR_24Oct2012.pdf. |
Author: | Mareike Fischer [aut], Lina Herbst [aut], Sophie Kersting [aut], Luise Kuehn [aut, cre], Kristina Wicke [aut] |
Maintainer: | Luise Kuehn <treebalanceindices@gmail.com> |
Depends: | R (≥ 3.5.0) |
Imports: | ape, memoise, gmp |
License: | GPL-3 |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.2.3 |
NeedsCompilation: | no |
Packaged: | 2023-12-14 08:29:24 UTC; kuehnl |
Repository: | CRAN |
Date/Publication: | 2023-12-14 10:00:03 UTC |
Calculation of the B1 index for rooted trees
Description
This function calculates the B1
index B1(T)
for a given rooted
tree T
. The tree must not necessarily be binary. B1(T)
is defined as
B1(T)=\sum_{u\in V_{in}(T)\setminus\{\rho\}} h(T_u)^{-1}
in which
V_{in}(T)\setminus\{\rho\}
denotes the set of inner vertices of T
without the root, and
h(T_u)
denotes the height of the pending subtree rooted at u
.
When restricted to binary trees, the B1
index is a balance index. For
arbitrary trees it does not fulfill the definition of an (im)balance index.
For n=1
the function returns B1(T)=0
and a warning.
For details on the B1 index, see
also Chapter 10 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_10).
Usage
B1I(tree)
Arguments
tree |
A rooted tree in phylo format. |
Value
B1I
returns the B1 index of the given tree.
Author(s)
Sophie Kersting
References
K.-T. Shao and R. R. Sokal. Tree Balance. Systematic Zoology, 39(3):266, 1990.
doi: 10.2307/2992186.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
B1I(tree)
Calculation of the B2 index for rooted trees
Description
This function calculates the B2 index B2(T)
for a given rooted
tree T
. The tree must not necessarily be binary. B2(T)
is defined as
B2(T)=-\sum_{x\in V_L(T)} p_x\cdot log(p_x)
in which V_L(T)
denotes the leaf
set of T
, and in which
p_x=\prod_{v\in anc(x)} \frac{1}{|child(v)|}
denotes
the probability of reaching leaf x
when starting at the root and assuming
equiprobable branching at each vertex v\in anc(x)
with anc(x)
denoting the set of ancestors of x
excluding
x
. child(v)
denotes the set of children of the inner vertex v
.
The B2
index is a balance index.
For n=1
the function returns B2(T)=0
and a warning.
For details on the B2 index, see
also Chapter 11 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_11).
Usage
B2I(tree, logbase = 2)
Arguments
tree |
A rooted tree in phylo format. |
logbase |
The base that shall be used for the logarithm. For binary trees it is common to use base 2. |
Value
B2I
returns the B2 index of the given tree.
Author(s)
Sophie Kersting, Luise Kuehn
References
K.-T. Shao and R. R. Sokal. Tree Balance. Systematic Zoology, 39(3):266, 1990.
doi: 10.2307/2992186.
P.-M. Agapow and A. Purvis. Power of Eight Tree Shape Statistics to Detect Nonrandom Diversification: A Comparison by Simulation of Two Models of Cladogenesis. Systematic Biology, 51(6):866-872, 2002.doi: 10.1080/10635150290102564.
URL https://doi.org/10.1080/10635150290102564.
M. Hayati, B. Shadgar, and L. Chindelevitch. A new resolution function to evaluate tree shape statistics. PLOS ONE, 14(11):e0224197, 2019. doi: 10.1371/journal.pone.0224197.
URL https://doi.org/10.1371/journal.pone.0224197.
M. Kirkpatrick and M. Slatkin. Searching for evolutionary patterns in the shape of a phylogenetic tree. Evolution, 47(4):1171-1181, 1993. doi: 10.1111/j.1558-5646.1993.tb02144.x.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
B2I(tree)
Calculation of the I-based indices for rooted trees
Description
This function calculates I
-based indices I(T)
for a given rooted
tree T
. Note that the leaves of the tree may represent single species or
groups of more than one species. Thus, a vector is required that contains for
each leaf the number of species that it represents.
The tree may contain few polytomies, which are not allowed to concentrate in
a particular region of the tree (see p. 238 in Fusco and Cronk (1995)).
Let v
be a vertex of T
that fulfills the following criteria: a)
The number of descendant (terminal) species of v
is k_v>3
(note that if each leaf represents only one species k_v
is simply the
number of leaves in the pending subtree rooted at v
), and
b) v
has exactly two children.
Then, we can calculate the I_v
value as follows:
I_v=\frac{k_{v_a}-\left\lceil\frac{k_v}{2}\right\rceil}{k_v-1-\left\lceil\frac{k_v}{2}\right\rceil}
in which k_{v_a}
denotes the number of descendant (terminal) species
in the bigger one of the two pending subtrees rooted at v
.
As the expected value of I_v
under the Yule model depends on k_v
,
Purvis et al. (2002) suggested to take the corrected values I'_v
or I_v^w
instead.
The I'_v
value of v
is defined as follows: I'_v=I_v
if k_v
is odd and I'_v=\frac{k_v-1}{k_v}\cdot I_v
if k_v
is even.
The I_v^w
value of v
is defined as follows:
I_v^w=\frac{w(I_v)\cdot I_v}{mean_{V'(T)} w(I_v)}
where V'(T)
is the set of inner vertices of T
that have precisely
two children and k_v\geq 4
, and w(I_v)
is a weight function with
w(I_v)=1
if k_v
is odd and w(I_v)=\frac{k_v-1}{k_v}
if k_v
is even and I_v>0
, and w(I_v)=\frac{2\cdot(k_v-1)}{k_v}
if k_v
is even and I_v=0
.
The I
-based index of T
can now be calculated using different methods.
Here, we only state the version for the I'
correction method, but the non-corrected
version or the I_v^w
corrected version works analoguously.
1) root: The I'
index of T
equals the I'_v
value of the root of
T
, i.e. I'(T)=I'_{\rho}
, provided that the root fulfills the two
criteria. Note that this method does not fulfil the definition of an (im)balance index.
2) median: The I'
index of T
equals the median I'_v
value of all
vertices v
that fulfill the two criteria.
3) total: The I'
index of T
equals the summarised I'_v
values of all
vertices v
that fulfill the two criteria.
4) mean: The I'
index of T
equals the mean I'_v
value of all
vertices v
that fulfill the two criteria.
5) quartile deviation: The I'
index of T
equals the quartile
deviation (half the difference between third and first quartile) of the I'_v
values of all
vertices v
that fulfill the two criteria.
For details on the family of I-based indices, see
also Chapter 17 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_17).
Usage
IbasedI(
tree,
specnum = rep(1, length(tree$tip.label)),
method = "mean",
correction = "none",
logs = TRUE
)
Arguments
tree |
A rooted tree in phylo format (with possibly few polytomies). |
specnum |
A vector whose |
method |
A character string specifying the method that shall be used to
calculate |
correction |
A character string specifying the correction method that shall be applied to the I values. It can be one of the following: "none", "prime", "w". |
logs |
Boolean value, (default true), determines if the number of suitable nodes (i.e. nodes that fulfill the criteria) and polytomies in the tree should be printed. |
Value
IbasedI
returns an I
-based balance index of the given tree according to the chosen (correction and) method.
Author(s)
Luise Kuehn and Sophie Kersting
References
G. Fusco and Q. C. Cronk. A new method for evaluating the shape of large phylogenies. Journal of Theoretical Biology, 1995. doi: 10.1006/jtbi.1995.0136. URL https://doi.org/10.1006/jtbi.1995.0136.
A. Purvis, A. Katzourakis, and P.-M. Agapow. Evaluating Phylogenetic Tree Shape: Two Modifications to Fusco & Cronks Method. Journal of Theoretical Biology, 2002. doi: 10.1006/jtbi.2001.2443. URL https://doi.org/10.1006/jtbi.2001.2443.
Examples
tree <- ape::read.tree(text="(((((,),),),),);")
IbasedI(tree, method="mean")
IbasedI(tree, method="mean", correction="prime", specnum=c(1,1,2,1,1,1))
Calculation of the area per pair index for rooted trees
Description
This function calculates the area per pair index APP(T)
for a given rooted
tree T
. The tree must not necessarily be binary. APP(T)
is defined as
APP(T)=\frac{2}{n\cdot(n-1)}\cdot\sum_{1\leq i<j\leq n} d_T(i,j)
in which n
denotes the
number of leaves in T
, and
d_T(i,j)
denotes the number of edges on the path between the two
leaves i
and j
. Note that APP(T)
can also be
computed from the Sackin index S(T)
and the total cophenetic
index TCI(T)
of T
as
APP(T)=\frac{2}{n}\cdot S(T)-\frac{4}{n(n-1)}\cdot TCI(T)
enabling efficient computation.
The area per pair index does not fulfill the definition of an (im)balance
index given in "Tree balance indices: a comprehensive survey" (Fischer et al., 2023).
For details on the area per pair index, see
also Chapter 24 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_24).
Usage
areaPerPairI(tree)
Arguments
tree |
A rooted tree in phylo format. |
Value
areaPerPairI
returns the area per pair index of the given tree.
Author(s)
Luise Kuehn
References
T. Araujo Lima, F. M. D. Marquitti, and M. A. M. de Aguiar. Measuring Tree Balance with Normalized Tree Area. arXiv e-prints, art. arXiv:2008.12867, 2020.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
areaPerPairI(tree)
Calculation of the average leaf depth index for rooted trees
Description
This function calculates the average leaf depth N(T)
for a given
rooted tree T
. The tree must not necessarily be binary. N(T)
is
defined as
N(T)=\frac{1}{n}\cdot\sum_{u\in V_{in}(T)} n_u
in which n
denotes the number of leaves in T
,
V_{in}(T)
denotes the set of inner nodes of T
and
n_u
denotes the number of leaves in the pending subtree that is rooted
at the inner node u
. Note that N(T)
can also be
computed from the Sackin index S(T)
as N(T)=\frac{1}{n}\cdot S(T)
.
The average leaf depth is an imbalance index.
For n=1
the function returns N(T)=0
and a warning.
For details on the average leaf depth, see also Chapter 6 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_6).
Usage
avgLeafDepI(tree)
Arguments
tree |
A rooted tree in phylo format. |
Value
avgLeafDepI
returns the average leaf depth of the given tree.
Author(s)
Luise Kuehn
References
M. J. Sackin. "Good" and "Bad" Phenograms. Systematic Biology, 21(2):225-226, 1972. doi: 10.1093/sysbio/21.2.225.
K.-T. Shao and R. R. Sokal. Tree Balance. Systematic Zoology, 39(3):266, 1990.
doi: 10.2307/2992186.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
avgLeafDepI(tree)
Calculation of the average vertex depth for rooted trees
Description
This function calculates the average vertex depth AVD(T)
for a given rooted
tree T
. The tree must not necessarily be binary. AVD(T)
is defined as
AVD(T)=\frac{1}{|V(T)|}\cdot\sum_{x\in V(T)} \delta(x)
in
which V(T)
denotes the set of vertices of T
, and \delta(x)
denotes the depth of the vertex x
. The average vertex depth is a normalised version of the total path length and an
imbalance index.
For n=1
the function returns AVD(T)=0
and a warning.
For details on the average vertex depth, see
also Chapter 23 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_23).
Usage
avgVertDep(tree)
Arguments
tree |
A rooted tree in phylo format. |
Value
avgVertDep
returns the average vertex depth of the given tree.
Author(s)
Luise Kuehn
References
A. Herrada et al. Scaling properties of protein family phylogenies. BMC Evolutionary Biology, 11(1), June 2011. doi: 10.1186/1471-2148-11-155.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,,),),(,)));")
avgVertDep(tree)
Calculation of the cherry index for rooted trees
Description
This function calculates the cherry index ChI(T)
for a
given rooted tree T
. The tree must not necessarily be binary.
ChI(T)
is defined as the number of cherries in the tree. A cherry
is a pair of leaves that have the same direct ancestor. Note, if a vertex u
has x
leaves as direct descendants, the number of cherries induced by u
is
binom(x,2)
.
The cherry index does not fulfill the definition
of an (im)balance index given in "Tree balance indices: a comprehensive survey"
(Fischer et al., 2023).
For details on the cherry index, see
also Chapter 24 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_24).
Usage
cherryI(tree)
Arguments
tree |
A rooted tree in phylo format. |
Value
cherryI
returns the cherry index of the given tree.
Author(s)
Sophie Kersting
References
A. McKenzie and M. Steel. Distributions of cherries for two models of trees. Mathematical Biosciences, 164(1):81-92, 2000. doi: 10.1016/s0025-5564(99)00060-7.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
cherryI(tree)
tree <- ape::read.tree(text="((,),((((,),),),(,)));")
cherryI(tree)
tree <- ape::read.tree(text="((,,,),(,,));")
cherryI(tree)
Calculation of the Colijn-Plazzotta rank for rooted binary trees
Description
This function calculates the Colijn-Plazzotta rank CP(T)
for a
given rooted binary tree T
.
For a binary tree T
, the Colijn-Plazzotta rank CP(T)
is
recursively defined as CP(T)=1
if T
consists of only
one leaf and otherwise
CP(T)=\frac{1}{2}\cdot CP(T_1)\cdot(CP(T_1)-1)+CP(T_2)+1
with CP(T_1) \geq CP(T_2)
being the ranks of the two pending
subtrees rooted at the children of the root of T
. This rank
of T
corresponds to its position in the
lexicographically sorted list of (i,j
): (1),(1,1),(2,1),(2,2),(3,1),...
The Colijn-Plazzotta rank of binary trees has been shown to be an imbalance index.
For n=1
the function returns CP(T)=1
and a warning.
Note that problems can sometimes arise even for trees with small leaf numbers due
to the limited range of computable values (ranks can reach INF quickly).
For details on the Colijn-Plazzotta rank, see
also Chapter 21 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_21).
Usage
colPlaLab(tree)
Arguments
tree |
A rooted binary tree in phylo format. |
Value
colPlaLab
returns the Colijn-Plazotta rank of the given tree. Since the values can get quite large, the
function returns them in big.z format (package gmp).
Author(s)
Sophie Kersting, Luise Kuehn
References
C. Colijn and G. Plazzotta. A Metric on Phylogenetic Tree Shapes. Systematic Biology, doi: 10.1093/sysbio/syx046.
N. A. Rosenberg. On the Colijn-Plazzotta numbering scheme for unlabeled binary rooted trees. Discrete Applied Mathematics, 2021. doi: 10.1016/j.dam.2020.11.021.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
colPlaLab(tree)
Generation of the rooted binary tree corresponding to a given Colijn-Plazzotta rank
Description
This function generates the unique rooted binary tree T
(in phylo
format) that corresponds to the given Colijn-Plazzotta rank CP(T)
. It
is the inverse function of colPlaLab().
colPlaLab()
:
For a given rooted binary tree T
, CP(T)
is recursively defined as
CP(T)=1
if T
consists of only one vertex and otherwise
CP(T)=\frac{1}{2}\cdot CP(T_1)\cdot(CP(T_1)-1)+CP(T_2)+1
with
CP(T_1) \geq CP(T_2)
being the
ranks of the two pending subtrees rooted at the children of the root of T
.
The rank CP(T)
of T
corresponds to its position in the
lexicographically sorted list of (i,j
): (1),(1,1),(2,1),(2,2),(3,1),...
colPlaLab_inv()
:
For a given rank CP
the corresponding tree T
can be reconstructed
by starting from one vertex \rho
(labelled CP
) and recursively
splitting vertices whose labels h
are greater than 1 into two children with the labels:
i=\left\lceil\frac{1+\sqrt{8\cdot h-7}}{2}\right\rceil-1
and
j=h-\frac{i\cdot(i-1)}{2}-1
until there are no more vertices to split.
For CP=1
the function returns the smallest possible tree in the
phylo format: the tree consisting of a single edge.
Note that problems can arise for extremely high input values (>10e+18).
For details on the Colijn-Plazzotta rank, see
also Chapter 21 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_21).
Usage
colPlaLab_inv(rank)
Arguments
rank |
An integer denoting the Colijn-Plazzotta rank of the sought tree. |
Value
colPlaLab_inv
returns the unique rooted binary tree for the given rank.
Author(s)
Sophie Kersting
References
C. Colijn and G. Plazzotta. A Metric on Phylogenetic Tree Shapes. Systematic Biology, 67(1):113-126,2018. doi: 10.1093/sysbio/syx046.
N. A. Rosenberg. On the Colijn-Plazzotta numbering scheme for unlabeled binary rooted trees. Discrete Applied Mathematics, 291:88-98, 2021. doi: 10.1016/j.dam.2020.11.021.
Examples
colPlaLab_inv(22)
Calculation of the Colless index for rooted binary trees
Description
This function calculates variants of the Colless index for a given rooted
binary tree T
. All of them are imbalance indices.
The original Colless index C(T)
is defined as
C(T)=\sum_{u \in V_{in}(T)} |n_{u_a}-n_{u_b}|
in which V_{in}(T)
denotes the set of all inner vertices
of T
, and in which n_{u_a}
and n_{u_b}
denote the number of leaves in the two pending subtrees that are
rooted at the direct descendants of u
.
The corrected Colless index I_C(T)
of T
is defined as I_C(T)=0
for
n=1
and n=2
and for n>2
as
I_C(T)=\frac{2\cdot C(T)}{(n-1)\cdot(n-2)}
in which n
denotes the
total number of leaves in T
.
The quadratic Colless index QC(T)
of T
is defined as
QC(T)=\sum_{u\in V_{in}(T)} |n_{u_a}-n_{u_b}|^2
Special cases: For n=1
the function returns C(T)=I_C(T)=QC(T)=0
and a warning.
For details on the original, corrected and quadratic Colless indices, see
also Chapters 12, 13 and 15 in "Tree balance indices: a comprehensive survey"
(https://doi.org/10.1007/978-3-031-39800-1_12, https://doi.org/10.1007/978-3-031-39800-1_13,
https://doi.org/10.1007/978-3-031-39800-1_15).
Usage
collessI(tree, method = "original")
Arguments
tree |
A rooted binary tree in phylo format. |
method |
A character string specifying the version that shall be computed. It can be one of the following: "original", "corrected", "quadratic". |
Value
collessI
returns the Colless index of the given tree according to the chosen method.
Author(s)
Luise Kuehn and Sophie Kersting
References
D. Colless. Review of Phylogenetics: the theory and practice of phylogenetic systematics. Systematic Zoology, 1982. ISSN 00397989.
T. M. Coronado, M. Fischer, L. Herbst, F. Rossello, and K. Wicke. On the minimum value of the Colless index and the bifurcating trees that achieve it. Journal of Mathematical Biology, 2020.doi: 10.1007/s00285-020-01488-9.
S. B. Heard. Patterns in tree balance among cladistic, phenetic, and randomly generated phylogenetic trees. Evolution, 1992. doi: 10.1111/j.1558-5646.1992.tb01171.x.
K. Bartoszek, T. M. Coronado, A. Mir, and F. Rossello. Squaring within the Colless index yields a better balance index. Mathematical Biosciences, 331:108503, 2021. doi: 10.1016/j.mbs.2020.108503.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
collessI(tree, method="original")
collessI(tree, method="corrected")
collessI(tree, method="quadratic")
Calculation of the Colless-like indices for rooted trees
Description
This function calculates the Colless-like index for a given rooted
tree T
according to the chosen weight function f
and dissimilarity D
.
The Colless-like index CL(T)
relative to D
and f
is the sum of the (D,f)
-balance values
over all inner vertices of the tree. More precisely,
CL(T)=\sum_{v\in V_{in}(T)} bal_{D,f}(v)
where V_{in}(T)
is the
set of inner vertices of T
. The (D,f)
-balance value
of v
with children v_1,...,v_k
is computed as
bal_{D,f}(v)=D(fs(T_{v_1}),...,fs(T_{v_k}))
with D
denoting the dissimilarity and fs
denoting the f.size.
The f.size fs(T)
of a tree T
uses the function f
, which maps any
integer to a non-negative real number, to build a weighted sum of
the out-degrees of all vertices in T
. More precisely,
fs(T)=\sum_{v\in V(T)} f(deg+(v))
where V(T)
is the set of all
vertices of T
and deg+(v)
denotes the out-degree (i.e. the number of
children) of the vertex v
. The f
-functions that are already
implemented are f(x)=e^x
and f(x)=ln(x+e)
.
The dissimilarity D(x_1,...,x_k)
of a vector x_1,...,x_k
assigns
a non-negative value to the vector, is independent of the order of the vector
entries and equals zero if and only if x_1=...=x_k
. In this
implementation the following dissimilarity functions are already built-in:
mean deviation from the median (mdm
),
the sample variance (var
) and the sample standard deviation (sd
).
collesslikeI
also allows the use of other functions for the weight function f
and the dissimilarity D
.
Special cases: For n=1
the function returns CL(T)=0
and a warning.
For details on the family of Colless-like indices, see
also Chapter 16 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_16).
Usage
collesslikeI(tree, f.size, dissim)
Arguments
tree |
A rooted binary tree in phylo format. |
f.size |
A character string specifying the function |
dissim |
A character string specifying the dissimilarity that shall be used. It can be one of the following: "mdm", "var", "sd" or the name of a function as a string. |
Value
collesslikeI
returns the Colless-like index of the given tree according to the chosen
function and dissimilarity.
Author(s)
Luise Kuehn, Sophie Kersting
References
A. Mir, L. Rotger, and F. Rossello. Sound Colless-like balance indices for multifurcating trees. PLOSONE, 13(9):e0203401, 2018. doi: 10.1371/journal.pone.0203401
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
collesslikeI(tree, f.size="exp", dissim="mdm")
collesslikeI(tree, f.size="exp", dissim="var")
collesslikeI(tree, f.size="ln", dissim="sd")
myfsize <- function(x) return(x+1)
mydissim <- function(x) return (var(x))
collesslikeI(tree, f.size="myfsize",dissim = "mydissim")
Calculation of the equal weights Colless index for rooted binary trees
Description
This function calculates the equal weights Colless index I_2(T)
for a
given rooted binary tree T
. I_2(T)
is defined as
I_2(T)=\frac{1}{n-2}\cdot\sum_{u\in V_{in}(T), n_u>2} \frac{|n_{u_a}-n_{u_b}|}{n_u-2}
in which V_{in}(T)
denotes the set of all inner vertices of T
,
and in which n_u
, n_{u_a}
and n_{u_b}
denote the number of
leaves in the pending subtrees that are rooted at u
and the two direct
descendants of u
. The equal weights Colless index is an imbalance index.
For n=1
and n=2
the function returns I_2(T)=0
and a warning.
For details on the equal weigths Colless index, see
also Chapter 14 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_14).
Usage
ewCollessI(tree)
Arguments
tree |
A rooted binary tree in phylo format. |
Value
ewCollessI
returns the equal weights Colless index of the given tree.
Author(s)
Luise Kuehn
References
A. O. Mooers and S. B. Heard. Inferring Evolutionary Process from Phylogenetic Tree Shape. The Quarterly Review of Biology, 72(1), 1997. doi: 10.1086/419657.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
ewCollessI(tree)
Calculation of the Furnas rank for rooted binary trees
Description
This function calculates the Furnas rank F(T)
for a given rooted
binary tree T
. F(T)
is the unique rank of the tree T
among all rooted binary trees with n
leaves in the left-light rooted
ordering. For details on the left-light rooted ordering as well as details
on how the Furnas rank is computed, see "The generation
of random, binary unordered trees" by G.W. Furnas (1984) or "Tree balance
indices: a comprehensive survey" by Fischer et al. (2023). The Furnas rank
is a balance index.
The concept of assigning each rooted binary tree a unique tuple (rank, n)
allows to store many trees with minimal storage use.
For details on the Furnas rank, see
also Chapter 22 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_22).
Usage
furnasI(tree)
Arguments
tree |
A rooted binary tree in phylo format. |
Value
furnasI
returns the unique Furnas rank of the given tree, i.e.
the rank of the tree among all rooted binary trees with n
leaves in the
left-light rooted ordering. Since the values can get quite large, the function
returns them in big.z
format (package gmp
).
Author(s)
Luise Kuehn, Lina Herbst
References
G. W. Furnas. The generation of random, binary unordered trees. Journal of Classification, 1984. doi: 10.1007/bf01890123. URL https://doi.org/10.1007/bf01890123.
M. Kirkpatrick and M. Slatkin. Searching for evolutionary patterns in the shape of a phylogenetic tree. Evolution, 1993. doi: 10.1111/j.1558-5646.1993.tb02144.x.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
furnasI(tree)
Calculation of rooted binary tree for tuple (rank, leaf number)
Description
This function calculates the unique tree T
(in phylo format) for two
given integer values r
and n
, with n
denoting the number
of leaves of T
and r
denoting the rank of T
in the
left-light rooted ordering of all rooted binary trees with n
leaves.
It is the inverse function of furnasI()
. For details on how to calculate
T
(including algorithm) see "The generation of random, binary
unordered trees" by G.W. Furnas (1984) or "Tree balance indices: a comprehensive
survey" by Fischer et al. (2023).
furnasI_inv
can be used e.g. to generate random rooted binary trees with a
certain number of leaves. Also, the concept of assigning each rooted binary
tree a unique tuple (rank, n)
allows to store many trees with minimal
storage use.
For details on the Furnas rank, see
also Chapter 22 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_22).
Usage
furnasI_inv(rank, n)
Arguments
rank |
An integer denoting the rank of the sought tree among all rooted
binary trees with |
n |
An integer denoting the number of leaves of the sought tree. |
Value
furnasI_inv
returns the unique tree (in phylo format) for
the given leaf number and rank.
Author(s)
Sophie Kersting
References
G. W. Furnas. The generation of random, binary unordered trees. Journal of Classification, 1984. doi: 10.1007/bf01890123. URL https://doi.org/10.1007/bf01890123.
Examples
furnasI_inv(rank=6,n=8)
Auxiliary functions
Description
getDescMatrix
- Creates a matrix that contains the descendants of
node i
in row i
.
getAncVec
- Creates a vector that contains the parent (direct ancestor) of
node i
at position i
.
getNodesOfDepth
- Creates a matrix that contains the nodes of
depth i
in row i
.
symBucketLexicoSort
- Sorts the pairs of numbers lexicographically and
returns ranking. Uses bucket sort.
getAllAncestors
- Returns all ancestors of v
including v
itself.
cPL_inv
- Returns the binary tree that belongs to the input label in an incomplete
Newick format.
maxDepthLeaf
- Returns the maximumy< depth of a leaf in the subtree that
is rooted at v
.
get.subtreesize
- Creates a vector that contains at the i
-th position
the number of leaves in the pending subtree rooted at i
.
getlca
- Returns the name of the lowest common ancestor of the two
input vertices v
and w
.
we_eth
- Returns the Wedderburn-Etherington number we(n)
for a given non-negative integer n
.
getfurranks
- Returns for each vertex i
the Furnas rank of the
subtree rooted at i
.
getsubtree
- Returns the pending subtree (in phylo format) that is
rooted at the input vertex. If the input vertex is a leaf, the function returns
the standard tree for n=1
(with 1 edge).
is_binary
- Returns TRUE if the input tree is binary and FALSE otherwise.
is_phylo
- Tests all requirements of the phylo format, and returns TRUE
if the tree is correctly formatted, else FALSE with detailed feedback on the
features that are not met.
tree_decomposition
- Returns a list of length two, which
contains the two pending subtrees that are rooted at the children of the root
of the input tree. The
smaller one (according to the number of leaves) is stated first.
tree_merge
- Returns a rooted tree T
in phylo
format, which contains the input trees tree1
and tree2
as
"left" and "right" maximal pending subtrees.
treenumber
- Returns the unique tree number tn(T)
of the given tree.
tn(T)
is the rank of the tree T
among all
rooted binary trees in the left-light rooted ordering. It can
be calculated as follows:
tn(T)=F(T) + \sum_{i=1}^{n-1} we(i)
in which n
is the number of leaves in T
, F(T)
is the Furnas
rank of T
, i.e. the rank of T
in the left-light rooted ordering
of all rooted binary trees with n
leaves, and we(i)
is the
Wedderburn-Etherington number of i
.
The concept of assigning each rooted binary tree a unique tree number allows
to store many trees with minimal storage use.
For n=1
the function returns tn(T)=1
and a warning.
treenumber_inv
- Returns the unique tree (in phylo format) for
the given tree number.
auxE_l_X
- Returns the sum of all products of l different values in X.
Usage
getDescMatrix(tree)
getAncVec(tree)
getNodesOfDepth(mat, root, n)
symBucketLexicoSort(workLabs)
getAllAncestors(tree, v)
cPL_inv(label)
maxDepthLeaf(tree, v = length(tree$tip.label) + 1)
get.subtreesize(tree)
getlca(tree, v, w)
we_eth(n)
getfurranks(tree)
getsubtree(tree, subroot)
is_binary(tree)
is_phylo(tree)
tree_decomposition(tree)
tree_merge(tree1, tree2)
treenumber(tree)
treenumber_inv(treenum)
auxE_l_X(subX, Xset)
Arguments
tree |
A rooted tree in phylo format, >= 2 leaves |
mat |
Descendants matrix from |
root |
Number (label) of the root of the tree |
n |
Number of leaves of the tree |
workLabs |
numeric matrix (2 columns) |
v |
A vertex of the tree. |
label |
A Colijn-Plazotta label of desired tree, a positive integer. |
w |
A vertex of the tree. |
subroot |
A vertex of the tree. It is not recommended to use leaves as subroots. |
tree1 |
A rooted tree in phylo format. |
tree2 |
A rooted tree in phylo format. |
treenum |
An integer denoting the tree number of the sought tree. |
subX |
integer >=1, size of the subsets of X. |
Xset |
Vector (multiset) of numeric values. |
Value
desc_mat
numeric matrix
anc_vec
numeric vector
nodes_of_depth
numeric matrix
ranking
numeric vector
vectorWithAncs
numeric vector
Author(s)
Sophie Kersting, Luise Kuehn and Lina Herbst
Examples
mat <- cbind(c(7,7,6,5,5,6),c(1,2,3,4,6,7))
tree <- list(edge=mat, tip.label=c("","","",""), Nnode=3)
getDescMatrix(tree)
mat <- cbind(c(5,5,5,5),c(1,2,3,4))
tree <- list(edge=mat, tip.label=c("","","",""), Nnode=1)
getDescMatrix(tree)
getAncVec(tree)
getNodesOfDepth(mat=getDescMatrix(tree),root=length(tree$tip.label)+1,
n=length(tree$tip.label))
myWorkLabs <- cbind(c(0,1,2,3,1,0),c(0,2,2,4,1,0))
symBucketLexicoSort(myWorkLabs)
getAllAncestors(tree,v=6)
cPL_inv(label=6)
maxDepthLeaf(tree,v=6)
get.subtreesize(tree)
getlca(tree,1,2)
we_eth(5)
getfurranks(tree)
getsubtree(tree,4)
is_binary(ape::read.tree(text="((((,),),(,)),(((,),),(,)));"))
is_phylo(ape::read.tree(text="((((,),),(,)),(((,),),(,)));"))
tree_decomposition(ape::read.tree(text="((((,),),(,)),(((,),),(,)));"))
treeA <- ape::read.tree(text="(((,),),(,));")
treeB <- ape::read.tree(text="((,),);")
tree_merge(treeA, treeB)
treenumber(ape::read.tree(text="((((,),),(,)),(((,),),(,)));"))
treenumber_inv(192)
auxE_l_X(subX=3,Xset=c(1,1,2,2))
Calculation of the modified cherry index for rooted binary trees
Description
This function calculates the modified cherry index mChI(T)
for a
given rooted binary tree T
. Note that compared to the original cherry index ChI(T)
,
the modified cherry index is defined for binary trees only.
mChI(T)
is defined as n-2\cdot ChI(T)
, i.e. it counts the
number of leaves of the tree which are not in a cherry. A cherry
is a pair of leaves that have the same direct ancestor.
The modified cherry index does not fulfill the definition
of an (im)balance index given in "Tree balance indices: a comprehensive survey"
(Fischer et al., 2023).
For details on the modified cherry index, see
also Chapter 24 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_24).
Usage
mCherryI(tree)
Arguments
tree |
A rooted binary tree in phylo format. |
Value
mCherryI
returns the modified cherry index of the given tree.
Author(s)
Luise Kuehn
References
S. J. Kersting, M. Fischer. Measuring tree balance using symmetry nodes — A new balance index and its extremal properties. Mathematical Biosciences, 341:108690, 2021. doi: 10.1016/j.mbs.2021.108690.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
mCherryI(tree)
tree <- ape::read.tree(text="((,),((((,),),),(,)));")
mCherryI(tree)
Calculation of the maximum width over maximum depth of the tree
Description
This function calculates the maximum width over maximum depth mWovermD(T)
for a
given rooted tree T
. The tree must not necessarily be binary. For n>1
,
mWovermD(T)
is defined as
mWovermD(T)=maxWidth(T) / h(T)
in which h(T)
denotes the height of the tree T
, which is the same as the
maximum depth of any leaf in the tree, and maxWidth(T)
denotes
the maximum width of the tree T
. The maximum width over maximum depth
is a balance index.
For details on the maximum width over maximum depth, see
also Chapter 23 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_23).
Usage
mWovermD(tree)
Arguments
tree |
A rooted tree in phylo format. |
Value
mWovermD
returns the maximum width over maximum depth of a tree.
Author(s)
Luise Kuehn
References
C. Colijn and J. Gardy. Phylogenetic tree shapes resolve disease transmission patterns. Evolution, Medicine, and Public Health, 2014(1):96–108, 2014. doi: 10.1093/emph/eou018.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
mWovermD(tree)
tree <- ape::read.tree(text="((,),((((,),),),(,)));")
mWovermD(tree)
Calculation of the (modified) maximum difference in widths for a rooted tree
Description
This function calculates the maximum difference in widths delW(T)
and
the modified maximum difference in width mdelW(T)
for a
given rooted tree T
. The tree must not necessarily be binary.
delW(T)
is defined as
delW(T)=\max_{i=0,...,h(T)-1} |w(i+1)-w(i)|
and mdelW(T)
is defined as
mdelW(T)=\max_{i=0,...,h(T)-1} w(i+1)-w(i)
in which h(T)
denotes the height of the tree T
and w(i)
denotes
the number of vertices in T
that have depth i
. The modified maximum difference
in widths is a balance index, while the maximum difference in widths is neither a balance nor imbalance index.
Note that there was a spelling error in the previous manual of this function - we wrote "maximum difference in widths"
while the given definition and the R code corresponded to the "modified maximum difference in width".
For details on the maximum difference in widths and the modified maximum difference in widths, see
also Chapters 24 and 23 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_24, https://doi.org/10.1007/978-3-031-39800-1_23).
Usage
maxDelW(tree, method = "modified")
Arguments
tree |
A rooted tree in phylo format. |
method |
A character string specifying whether the original maximum difference in widths or the modified maximum difference in widths shall be computed. Can be any of "original" or "modified" (default is modified). |
Value
maxDelW
returns the maximum difference in widths of a tree (if method
is set to original
)
or the modified maximum difference in widths (if method
is set to modified
).
Author(s)
Sophie Kersting, Luise Kuehn
References
C. Colijn, J. Gardy. Phylogenetic tree shapes resolve disease transmission patterns. Evolution, Medicine, and Public Health, 2014(1):96-108, 2014. ISSN 2050-6201. doi: 10.1093/emph/eou018.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
maxDelW(tree, method="original")
tree <- ape::read.tree(text="((,),((((,),),),(,)));")
maxDelW(tree, method="modified")
Calculation of the maximum depth of the tree
Description
This function calculates the maximum depth of any vertex in a rooted tree T
, which
is at the same time its height h(T)
. The tree must not necessarily be binary. Formally,
h(T)
is defined as
h(T)=\max_{v\in V(T)} \delta(v)
with \delta(v)
being the depth of the vertex v
.
The maximum depth is an imbalance index.
For n=1
the function returns h(T)=0
and a warning.
For details on the maximum depth, see
also Chapter 23 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_23).
Usage
maxDepth(tree)
Arguments
tree |
A rooted tree in phylo format. |
Value
maxDepth
returns the maximum depth, i.e. height, of a tree.
Author(s)
Luise Kuehn, Sophie Kersting
References
C. Colijn and J. Gardy. Phylogenetic tree shapes resolve disease transmission patterns. Evolution, Medicine, and Public Health, 2014(1):96-108, 2014. ISSN 2050-6201. doi: 10.1093/emph/eou018.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
maxDepth(tree)
tree <- ape::read.tree(text="((,),((((,),),),(,)));")
maxDepth(tree)
Calculation of the maximum width of the tree
Description
This function calculates the maximum width maxWidth(T)
for a
given rooted tree T
. The tree must not necessarily be binary.
maxWidth(T)
is defined as
maxWidth(T)=\max_{i=0,...,h(T)} w(i)
in which h(T)
denotes the height of the tree T
and w(i)
denotes
the number of vertices in T
that have depth i
. The maximum width
is a balance index.
For details on the maximum width, see
also Chapter 23 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_23).
Usage
maxWidth(tree)
Arguments
tree |
A rooted tree in phylo format. |
Value
maxWidth
returns the maximum width of a tree.
Author(s)
Sophie Kersting
References
C. Colijn and J. Gardy. Phylogenetic tree shapes resolve disease transmission patterns. Evolution, Medicine, and Public Health, 2014(1):96-108, 2014. ISSN 2050-6201. doi: 10.1093/emph/eou018.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
maxWidth(tree)
tree <- ape::read.tree(text="((,),((((,),),),(,)));")
maxWidth(tree)
Calculation of the rooted quartet index for rooted trees
Description
This function calculates the rooted quartet index rQI(T)
for a given
rooted tree T
. The tree must not necessarily be binary.
Let T
be a rooted tree, whose leaves are 1,...,n
. Let
P_4
denote the set of all subsets of \{1,...,n\}
that have
cardinality 4. Let T(Q)
denote the rooted quartet on Q\in P_4
that is obtained by taking the subgraph of T
that is induced by
Q
and supressing its outdegree-1 vertices. T(Q)
can have one of
the five following shapes:
- Q_0^*
: This is the caterpillar tree shape on 4 leaves, i.e.
"(,(,(,)));"
in Newick format. It has 2 automorphisms.
- Q_1^*
: This is the tree shape on 4 leaves that has three pending
subtrees rooted at the children of the root of T
, one of them being
a cherry and the other two
being single vertices, i.e. "((,),,);"
in Newick format. It has 4
automorphisms.
- Q_2^*
: This is the tree shape on 4 leaves that has two pending
subtrees rooted at the children of the root of T
, one of them being
a star tree shape on 3 leaves
and the other one being a single vertex, i.e. "((,,),);"
in Newick
format. It has 6 automorphisms.
- Q_3^*
: This is the fully balanced binary tree shape on 4 leaves,
i.e. "((,),(,));"
in Newick format. It has 8 automorphisms.
- Q_4^*
: This is the star tree shape on 4 leaves, i.e.
"(,,,);"
in Newick format. It has 24 automorphisms.
T(Q)
is assigned an rQI-value based on its shape, i.e. rQI(T(Q))=q_i
if T(Q)
has the shape Q_i^*
. The values q_0,...,q_4
are
chosen in such a way that they increase with the symmetry of the shape as
measured by means of its number of automorphisms. Coronado et al. (2019)
suggested the values q_0=0
and q_i=i
or q_i=2^i
for i=1,...,4
.
The rooted quartet index rQI(T)
of the tree T
is then defined as
the sum of the rQI-values of its rooted quartets:
rQI(T)=\sum_{Q\in P_4} rQI(T(Q))
The rooted quartet index is a balance index.
For details on the rooted quartet index, see
also Chapter 20 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_20).
Usage
rQuartetI(tree, shapeVal = c(0, 1, 2, 3, 4))
Arguments
tree |
A rooted tree in phylo format. |
shapeVal |
A vector of length 5 containing the shape values |
Value
rQuartetI
returns the rooted quartet index of the given tree based on the chosen shape values (see description for details).
Author(s)
Sophie Kersting
References
T. M. Coronado, A. Mir, F. Rossello, and G. Valiente. A balance index for phylogenetic trees based on rooted quartets. Journal of Mathematical Biology, 79(3):1105-1148, 2019. doi: 10.1007/s00285-019-01377-w. URL https://doi.org/10.1007/s00285-019-01377-w.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
rQuartetI(tree)
Calculation of the Rogers J index for rooted binary trees
Description
This function calculates the Rogers J index J(T)
for a given rooted
binary tree T
. It is defined as the number of inner vertices whose balance
value is unequal to zero, more precisely
J(T)=\sum_{u \in V_{in}(T)} (1-I(n_{u_a}=n_{u_b}))
in which V_{in}(T)
denotes the set of all inner vertices
of T
, and in which n_{u_a}
and n_{u_b}
denote the number of leaves in the two pending subtrees that are
rooted at the direct descendants of u
.
Special cases: For n=1
, the function returns J(T)=0
and a warning.
For details on the Rogers J index, see
also Chapter 19 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_19).
Usage
rogersI(tree)
Arguments
tree |
A rooted binary tree in phylo format. |
Value
rogersI
returns the Rogers J index of the given tree.
Author(s)
Sophie Kersting
References
J. S. Rogers. Central Moments and Probability Distributions of Three Measures of Phylogenetic Tree Imbalance. Systematic Biology, 45(1):99-110, 1996. doi: 10.1093/sysbio/45.1.99.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
rogersI(tree)
Calculation of the s-shape statistic for rooted trees
Description
This function calculates the s-shape statistic sShape(T)
for a given rooted
tree T
. The tree must not necessarily be binary, however sShape
only fulfils
the definition of an imbalance index on the space of binary trees. sShape(T)
is defined as
sShape(T)=\sum_{u\in V_{in}(T)} log(n_u-1)
in
which V_{in}(T)
denotes the set of inner vertices of T
and n_u
denotes the number of leaves
in the pending subtree that is rooted at u
. An arbitrary logarithm base can be used
(for binary trees it is common to use base 2).
For n=1
the function returns sShape(T)=0
and a warning.
For details on the s-shape statistic, see
also Chapter 9 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_9).
Usage
sShapeI(tree, logbase = 2)
Arguments
tree |
A rooted tree in phylo format. |
logbase |
The logarithm base that shall be used. |
Value
sShapeI
returns the s-shape statistic of the given tree.
Author(s)
Luise Kuehn
References
M.G. Blum and O. Francois. Which random processes describe the tree of life? a large-scale study of phylogenetic tree imbalance. Systematic Biology, 2006.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
sShapeI(tree)
Calculation of the Sackin index for rooted trees
Description
This function calculates the Sackin index S(T)
for a given rooted
tree T
. The tree must not necessarily be binary. S(T)
is defined as
S(T)=\sum_{x\in V_L(T)} \delta(x)=\sum_{u\in V_{in}(T)} n_u
in
which V_L(T)
denotes the leaf set of T
, \delta(x)
denotes the depth of the leaf x
, V_{in}(T)
denotes the set of
inner vertices in T
, and n_u
denotes the number of leaves
in the pending subtree that is rooted at u
. The Sackin index is an
imbalance index.
For n=1
the function returns S(T)=0
and a warning.
For details on the Sackin index, see
also Chapter 5 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_5).
Usage
sackinI(tree)
Arguments
tree |
A rooted tree in phylo format. |
Value
sackinI
returns the Sackin index of the given tree.
Author(s)
Luise Kuehn
References
M.J. Sackin. "Good" and "Bad" Phenograms. Systematic Biology, 21(2):225-226, 1972. doi: 10.1093/sysbio/21.2.225.
K.-T. Shao and R.R. Sokal. Tree Balance. Systematic Zoology, 39(3):266, 1990. doi: 10.2307/2992186.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
sackinI(tree)
Calculation of the stairs1 value for rooted binary trees
Description
This function calculates the stairs1 value st1(T)
for a given rooted
binary tree T
. It is a modified version of the Rogers J index and is
defined as the fraction of inner vertices whose balance
value is unequal to zero, more precisely
st1(T)=\frac{1}{n-1}\cdot\sum_{u \in V_{in}(T)} (1-I(n_{u_a}=n_{u_b}))
in which V_{in}(T)
denotes the set of all inner vertices
of T
, and in which n_{u_a}
and n_{u_b}
denote the number of leaves in the two pending subtrees that are
rooted at the direct descendants of u
. The stairs1 value is an imbalance index.
Special cases: For n=1
, the function returns st1(T)=0
and a warning.
For details on the stairs1 value, see
also Chapter 23 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_23).
Usage
stairs1(tree)
Arguments
tree |
A rooted binary tree in phylo format. |
Value
stairs1
returns the stairs1 value of the given tree.
Author(s)
Sophie Kersting
References
M. M. Norstrom, M. C. Prosperi, R. R. Gray, A. C. Karlsson, and M. Salemi. PhyloTempo: A Set of R Scripts for Assessing and Visualizing Temporal Clustering in Genealogies Inferred from Serially Sampled Viral Sequences. Evolutionary Bioinformatics, 8:EBO.S9738, 2012. ISSN 1176-9343, 1176-9343. doi:10.4137/EBO.S9738.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
stairs1(tree)
Calculation of the stairs2 value for rooted binary trees
Description
This function calculates the stairs2 value st2(T)
for a given rooted
binary tree T
. It is defined as the mean ratio between the leaf
numbers of the smaller and larger pending subtree over all inner vertices, more precisely
st2(T)=\frac{1}{n-1}\cdot\sum_{u \in V_{in}(T)} \frac{n_{u_a}}{n_{u_b}}
in which V_{in}(T)
denotes the set of all inner vertices
of T
, and in which n_{u_a}\geq n_{u_b}
denote the number of leaves
in the two pending subtrees that are
rooted at the direct descendants of u
. The stairs2 value is an imbalance index.
Special cases: For n=1
, the function returns st2(T)=0
and a warning.
For details on the stairs2 value, see
also Chapter 23 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_23).
Usage
stairs2(tree)
Arguments
tree |
A rooted binary tree in phylo format. |
Value
stairs2
returns the stairs2 value of the given tree.
Author(s)
Sophie Kersting
References
C. Colijn, J. Gardy. Phylogenetic tree shapes resolve disease transmission patterns. Evolution, Medicine, and Public Health, 2014(1):96-108, 2014. ISSN 2050-6201. doi: 10.1093/emph/eou018.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
stairs2(tree)
Calculation of the symmetry nodes index for rooted binary trees
Description
This function calculates the symmetry nodes index SNI(T)
for a given rooted
binary tree T
. SNI(T)
is defined as the number of inner vertices v
that are not
symmetry nodes, i.e. the two pending subtrees rooted at the children of v
do not
have the same tree shape.
For n=1
the function returns SNI(T)=0
and a warning.
For details on the symmetry nodes index, see
also Chapter 18 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_18).
Usage
symNodesI(tree)
Arguments
tree |
A rooted binary tree in phylo format. |
Value
symNodesI
returns the symmetry nodes index of the given tree.
Author(s)
Sophie Kersting
References
S. J. Kersting, M. Fischer. Measuring tree balance using symmetry nodes — A new balance index and its extremal properties. Mathematical Biosciences, page 108690, 2021. ISSN 0025-5564. doi:https://doi.org/10.1016/j.mbs.2021.108690
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
symNodesI(tree)
Calculation of the total cophenetic index for rooted trees
Description
This function calculates the total cophenetic index TCI(T)
of a given
rooted tree T
. The tree must not necessarily be binary. TCI(T)
is defined as
TCI(T)=\sum_{1\leq i<j\leq n} \delta(lca(i,j))=\sum_{u\in V_{in}(T)\setminus\{\rho\}} binom(n_u,2)
in which \delta(lca(i,j))
denotes the depth of the lowest
common ancestor of the two leaves i
and j
and V_{in}(T)\setminus\{\rho\}
denotes the set of all inner vertices exept the root and n_u
denotes the
number of descendant leaves of u
. The second formula is useful for efficient
computation of TCI(T)
. The total cophenetic index is an imbalance index.
For n=1
the function returns TCI(T)=0
.
For details on the total cophenetic index, see
also Chapter 8 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_8).
Usage
totCophI(tree)
Arguments
tree |
A rooted tree in phylo format. |
Value
totCophI
returns the total cophenetic index of the given tree.
Author(s)
Sophie Kersting
References
A. Mir, F. Rossello, and L. Rotger. A new balance index for phylogenetic trees. Mathematical Bio-sciences, 241(1):125-136, 2013. doi: 10.1016/j.mbs.2012.10.005.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
totCophI(tree)
tree <- ape::read.tree(text="((,),((((,),),),(,)));")
totCophI(tree)
tree <- ape::read.tree(text="((,,,),(,,));")
totCophI(tree)
Calculation of the total internal path length for rooted trees
Description
This function calculates the total internal path length TIP(T)
for a given rooted
tree T
. The tree must not necessarily be binary. TIP(T)
is defined as
TIP(T)=\sum_{x\in V_{in}(T)} \delta(x)
in
which V_{in}(T)
denotes the set of inner vertices of T
, and \delta(x)
denotes the depth of the vertex x
. The total internal path length is an
imbalance index.
For details on the total internal path length, see
also Chapter 23 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_23).
Usage
totIntPathLen(tree)
Arguments
tree |
A rooted tree in phylo format. |
Value
totIntPathLen
returns the total internal path length of the given tree.
Author(s)
Luise Kuehn
References
D. E. Knuth. The art of computer programming: fundamental algorithms, volume 1. Addison-Wesley, Reading, Mass, 3rd edition, 1997. ISBN 9780201896831.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,,),),(,)));")
totIntPathLen(tree)
Calculation of the total path length for rooted trees
Description
This function calculates the total path length TPL(T)
for a given rooted
tree T
. The tree must not necessarily be binary. TPL(T)
is defined as
TPL(T)=\sum_{x\in V(T)} \delta(x)
in
which V(T)
denotes the set of vertices of T
, and \delta(x)
denotes the depth of the vertex x
. The total path length is an
imbalance index.
For n=1
the function returns TPL(T)=0
and a warning.
For details on the total path length, see
also Chapter 23 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_23).
Usage
totPathLen(tree)
Arguments
tree |
A rooted tree in phylo format. |
Value
totPathLen
returns the total path length of the given tree.
Author(s)
Luise Kuehn
References
see e.g. R. P. Dobrow, J. A. Fill. Total path length for random recursive trees. Combinatorics, Probability and Computing, 8(4):317–333, 1999. doi: 10.1017/S0963548399003855.
see e.g. L. Takacs. On the total heights of random rooted trees. Journal of Applied Probability, 29(3):543–556, 1992. doi: 10.2307/3214892.
see e.g. L. Takacs. On the total heights of random rooted binary trees. Journal of Combinatorial Theory, Series B, 61(2):155–166, 1994. ISSN 0095-8956. doi: 10.1006/jctb.1994.1041.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,,),),(,)));")
totPathLen(tree)
Calculation of the variance of leaf depths index for rooted trees
Description
This function calculates the variance of leaf depths index VLD(T)
for a given rooted tree T
. The tree must not necessarily be binary.
VLD(T)
is defined as
VLD(T)=\frac{1}{n}\cdot\sum_{x\in V_L(T)} (\delta(x)-N(T))^2
in which n
denotes the number of leaves of T
, V_L(T)
denotes the set of leaves of T
, \delta(x)
denotes the depth of
the leaf x
and N(T)
denotes the average leaf depth of T
.
For n=1
the function returns VLD(T)=0
and a warning.
For details on the variance of leaf depths, see
also Chapter 7 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_7).
Usage
varLeafDepI(tree)
Arguments
tree |
A rooted tree in phylo format. |
Value
varLeafDepI
returns the variance of leaf depths index of the given tree.
Author(s)
Sophie Kersting
References
T. M. Coronado, A. Mir, F. Rossello, and L. Rotger. On Sackin's original proposal: the variance of the leaves' depths as a phylogenetic balance index. BMC Bioinformatics, 21(1), 2020. doi: 10.1186/s12859-020-3405-1. URL https://doi.org/10.1186/s12859-020-3405-1.
M. J. Sackin. "Good" and "Bad" Phenograms. Systematic Biology, 21(2):225-226, 1972. doi: 10.1093/sysbio/21.2.225.
K.-T. Shao and R. R. Sokal. Tree Balance. Systematic Zoology, 39(3):266, 1990. doi: 10.2307/2992186.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
varLeafDepI(tree)
Wedderburn Etherington numbers (from OEIS)
Description
Contains a vector of Wedderburn Etherington numbers for n=1
to n=2545
.
Usage
data(wedEth)
Format
numerical vector
Source
OEIS Sequence A001190 available at https://oeis.org/A001190
Examples
data(wedEth)
wedEth[5]
Calculation of weighted l1 distance index for rooted binary trees
Description
This function calculates the weighted l1 distance index D_{l1}(T)
for a
given rooted binary tree T
. D_{l1}(T)
is defined as
D_{l1}(T)=\sum_{z=2}^n z \cdot |f_n(z)-p_n(z)|
in which n
denotes the
number of leaves of T
, f_n(z)
denotes the frequency of pending subtrees
of size z
in T
and p_n(z)
is the expected number of
pending subtrees of size z
under the Yule model, i.e. p_n(z)=\frac{1}{n-1}
if z=n
and otherwise \frac{n}{n-1}\cdot\frac{2}{z\cdot(z+1)}
.
For n=1
the function returns D_{l1}(T)=0
.
For details on the weighted l1 distance index, see
also Chapter 24 in "Tree balance indices: a comprehensive survey" (https://doi.org/10.1007/978-3-031-39800-1_24).
Usage
weighL1dist(tree)
Arguments
tree |
A rooted binary tree in phylo format. |
Value
weighL1distI
returns the weighted l1 distance index of the given tree.
Author(s)
Sophie Kersting
References
M. G. Blum and O. Francois. On statistical tests of phylogenetic tree imbalance: The Sackin and other indices revisited. Mathematical Biosciences, 195(2):141-153, 2005. doi: 10.1016/j.mbs.2005.03.003.
Examples
tree <- ape::read.tree(text="((((,),),(,)),(((,),),(,)));")
weighL1dist(tree)