Title: | Drawing Gapped Cluster Heatmaps with 'ggplot2' |
Version: | 1.0.0 |
Description: | The gap encodes the distance between clusters and improves interpretation of cluster heatmaps. The gaps can be of the same distance based on a height threshold to cut the dendrogram. Another option is to vary the size of gaps based on the distance between clusters. |
License: | GPL-2 | GPL-3 |
Encoding: | UTF-8 |
Depends: | ggplot2, reshape2 |
Imports: | grid |
Suggests: | knitr, dendsort, RColorBrewer, rmarkdown |
VignetteBuilder: | knitr |
URL: | https://github.com/evanbiederstedt/gapmap |
BugReports: | https://github.com/evanbiederstedt/gapmap/issues |
RoxygenNote: | 7.2.3 |
NeedsCompilation: | no |
Maintainer: | Evan Biederstedt <evan.biederstedt@gmail.com> |
Packaged: | 2024-01-22 17:42:53 UTC; evanbiederstedt |
Author: | Ryo Sakai [aut], Evan Biederstedt [cre, aut] |
Repository: | CRAN |
Date/Publication: | 2024-01-22 20:50:02 UTC |
Draws gapped heatmap (gapmap) and gapped dendrograms using ggplot2 in [R].
Description
Functions for drawing gapped cluster heatmaps with ggplot2.
Details
This is a set of tools for drawing gapmaps using ggplot2.
gap_data() extracts data from a dendrogram object. Make sure to convert an hclust object to a dendrogram object by calling as.dendrogram().
This method generates an object of class gapdata, consisting of a list of data.frames.
The general workflow is as follows:
1. Perform hierarchical clustering with hclust().
2. Convert the hclust output to a dendrogram object by calling as.dendrogram().
3. Generate a gapped cluster heatmap by passing a matrix and dendrogram objects for rows and columns to the gapmap() function.
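The three workflow steps can be sketched as follows. This is a minimal illustration, assuming the built-in USArrests dataset as input and default Euclidean distance with complete linkage; the gapmap() call is guarded so the sketch still runs when the package is not installed.

```r
# Step 1: hierarchical clustering on the rows and columns of a numeric matrix.
# USArrests is a built-in dataset, used here purely for illustration.
m <- as.matrix(scale(USArrests))
hc_row <- hclust(dist(m))      # Euclidean distance, complete linkage
hc_col <- hclust(dist(t(m)))

# Step 2: convert the hclust objects to dendrogram objects.
d_row <- as.dendrogram(hc_row)
d_col <- as.dendrogram(hc_col)

# Step 3: draw the gapped cluster heatmap (only if gapmap is available).
if (requireNamespace("gapmap", quietly = TRUE)) {
  library(gapmap)
  gapmap(m = m, d_row = rev(d_row), d_col = d_col)
}
```

Separate dendrograms are built for rows and columns because the input here is a data matrix rather than a symmetric distance matrix.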
Author(s)
Ryo Sakai ryo.sakai@esat.kuleuven.be
Make a gapdata class object
Description
This function generates a gapdata class object. This object is used for drawing dendrograms and heatmaps.
Usage
as.gapdata(d, segments, labels, ...)
Arguments
d |
dendrogram class object |
segments |
a data.frame containing segments information |
labels |
a data.frame containing labels information |
... |
ignored |
Value
the gapdata class object
Re-evaluate the position of branches
Description
This function re-evaluates the position of branches based on the calculated gaps.
Usage
assign_branch_positions(d, verbose = FALSE, ...)
Arguments
d |
dendrogram class object |
verbose |
logical for whether in verbose mode or not |
... |
ignored |
Value
the reevaluated dendrogram object
Re-evaluate the position of leaves
Description
This function re-evaluates the position of leaves based on the calculated gaps.
Usage
assign_positions(d, runningX = 1, verbose = FALSE, ...)
Arguments
d |
dendrogram class object |
runningX |
numerical position of leaf node on display |
verbose |
logical for whether in verbose mode or not |
... |
ignored |
Value
the reevaluated dendrogram object
Calculate the gaps based on distance
Description
This function takes a dendrogram class object and other attributes to calculate the size of gaps between leaves. Each gap is stored in the leaf to its left. The function is called recursively.
Usage
calculate_gap(
d,
sum,
gap_total,
mode = c("quantitative", "threshold"),
mapping = c("exponential", "linear"),
scale = 0.2,
max_height = 0,
threshold = 2,
gap_size = 0,
verbose = FALSE,
...
)
Arguments
d |
dendrogram class object |
sum |
the sum of distance |
gap_total |
the total width allocated for gaps |
mode |
gap mode, either "threshold" or "quantitative" |
mapping |
in case of quantitative mode, either "linear" or "exponential" mapping |
scale |
the scale log base for the exponential mapping |
max_height |
the highest distance value, which is the value of the first dendrogram branch |
threshold |
the threshold value for threshold mode |
gap_size |
the size of gap for threshold mode |
verbose |
logical for whether in verbose mode or not |
... |
ignored |
Value
the annotated dendrogram class object
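To make the roles of sum and gap_total concrete, here is a small base R sketch of a linear allocation. It assumes that each branch receives a share of gap_total proportional to its merge height; that proportional rule is an illustrative assumption rather than the package's confirmed implementation, and all variable names are hypothetical.

```r
# Hypothetical illustration of a linear gap allocation (not gapmap's own code).
hc <- hclust(dist(scale(USArrests[1:10, ])))

heights   <- hc$height    # merge height of each internal node
total     <- sum(heights) # plays the role of the `sum` argument
gap_total <- 2            # total width reserved for gaps (arbitrary value)

# Assumed rule: each merge gets a gap proportional to its height.
gaps <- gap_total * heights / total
round(gaps, 3)
```

Under this assumption the individual gaps always add up to gap_total, so taller merges (more distant clusters) are separated by wider gaps.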
Count the number of gaps based on a threshold
Description
This function counts the number of gaps using the tree-cutting method. It counts the number of branches that are above the threshold distance.
Usage
count_gap(d = d, count = 0, threshold = threshold)
Arguments
d |
dendrogram class object |
count |
count |
threshold |
a numeric value for threshold |
Value
the count of gaps
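The counting idea can be sketched in base R. The helper count_above() below is hypothetical and is not the package's count_gap(); it assumes that "branches above the threshold" means internal nodes whose height attribute exceeds the threshold.

```r
# Hypothetical helper: count internal nodes whose height exceeds a threshold.
count_above <- function(d, threshold) {
  if (is.leaf(d)) return(0L)
  n <- as.integer(attr(d, "height") > threshold)
  for (i in seq_along(d)) {
    n <- n + count_above(d[[i]], threshold)  # recurse into each child
  }
  n
}

hc   <- hclust(dist(scale(USArrests)))
dend <- as.dendrogram(hc)
h    <- 2

n_gaps     <- count_above(dend, h)
n_clusters <- length(unique(cutree(hc, h = h)))
c(gaps = n_gaps, clusters = n_clusters)
```

Cutting a tree at height h yields one more cluster than the number of merges above h, which is why the count of gaps and the count of clusters differ by one.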
Extract a list from the dendrogram object
Description
This function extracts a list of data.frames for drawing dendrograms.
Usage
extract_list(d, type, segments_df = NULL, labels_df = NULL, ...)
Arguments
d |
dendrogram class object |
type |
either "triangular" or "rectangular". It determines the shape of branches. |
segments_df |
data.frame storing the segment information |
labels_df |
data.frame storing the label positions |
... |
ignored |
Value
the extracted list
Function to format a number
Description
This function takes a floating-point number and rounds it to 2 decimal places.
Usage
format_number(x)
Arguments
x |
a floating number |
Value
formatted number
Generate a gapdata class object from a dendrogram object
Description
This function takes a dendrogram class object as input and generates a gapdata class object as output. By parsing the dendrogram object based on the gap parameters, gaps between leaves in a dendrogram are introduced, and the coordinates of the leaves are adjusted. The gaps can be based on a height (or distance) threshold, which introduces gaps of the same width, or on a quantitative mapping of distance values, mapped linearly or exponentially.
Usage
gap_data(
d,
mode = c("quantitative", "threshold"),
mapping = c("exponential", "linear"),
ratio = 0.2,
scale = 0.5,
threshold = 0,
verbose = FALSE,
...
)
Arguments
d |
dendrogram class object |
mode |
gap mode, either "threshold" or "quantitative" |
mapping |
in case of quantitative mode, either "linear" or "exponential" mapping |
ratio |
the percentage of width allocated for the sum of gaps. |
scale |
the scale log base for the exponential mapping |
threshold |
the height at which the dendrogram is cut to infer clusters |
verbose |
logical for whether in verbose mode or not |
... |
ignored |
Value
a list of data frames that contain coordinates for drawing a gapped dendrogram
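A short usage sketch contrasting the two gap modes follows. The random input data and the choice of half the root height as the threshold are illustrative assumptions; the calls use only the arguments listed in the Usage section and are guarded so the sketch runs without the package installed.

```r
set.seed(1)
# Illustrative data: 10 labelled observations with 4 variables each.
m <- matrix(rnorm(40), nrow = 10,
            dimnames = list(paste0("s", 1:10), NULL))
dend <- as.dendrogram(hclust(dist(m)))

if (requireNamespace("gapmap", quietly = TRUE)) {
  library(gapmap)

  # Quantitative mode: gap sizes vary with the distance between clusters.
  gd_quant <- gap_data(d = dend, mode = "quantitative",
                       mapping = "linear", ratio = 0.2)

  # Threshold mode: equal-sized gaps where the tree is cut at `threshold`.
  gd_thresh <- gap_data(d = dend, mode = "threshold",
                        threshold = attr(dend, "height") / 2)

  # Inspect the top-level structure of the returned list.
  str(gd_quant, max.level = 1)
}
```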
Function to draw a gapped dendrogram
Description
This function draws a gapped dendrogram using the ggplot2 package. The input for the function is the gapdata class object, generated from gap_data() function.
Usage
gap_dendrogram(
data,
leaf_labels = TRUE,
rotate_label = FALSE,
orientation = c("top", "right", "bottom", "left"),
...
)
Arguments
data |
gapdata class object |
leaf_labels |
a logical to show labels or not |
rotate_label |
a logical to rotate labels or not |
orientation |
a character to set the orientation of dendrogram. Choices are "top", "right", "bottom", "left". |
... |
ignored |
Value
a ggplot object
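The following sketch chains gap_data() and gap_dendrogram() to draw a single gapped dendrogram. The subset of USArrests is an arbitrary choice for illustration, and the block is guarded so it runs even when gapmap is not installed.

```r
# Build a dendrogram from a small illustrative subset of a built-in dataset.
dend <- as.dendrogram(hclust(dist(scale(USArrests[1:15, ]))))

if (requireNamespace("gapmap", quietly = TRUE)) {
  library(gapmap)

  # Compute gapped coordinates, then draw the dendrogram from them.
  gd <- gap_data(d = dend, mode = "quantitative",
                 mapping = "exponential", ratio = 0.3, scale = 0.5)
  p <- gap_dendrogram(data = gd, leaf_labels = TRUE,
                      rotate_label = TRUE, orientation = "top")
  print(p)
}
```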
Function to draw a gapped heatmap
Description
This function draws a gapped heatmap using the ggplot2 package. The inputs for the function are the gapdata class objects, generated by the gap_data() function, and the data matrix.
Usage
gap_heatmap(
m,
row_gap = NULL,
col_gap = NULL,
row_labels = TRUE,
col_labels = TRUE,
rotate = FALSE,
col = c("#053061", "#2166AC", "#4393C3", "#92C5DE", "#D1E5F0", "#F7F7F7", "#FDDBC7",
"#F4A582", "#D6604D", "#B2182B", "#67001F")
)
Arguments
m |
data matrix |
row_gap |
a gapdata class object for rows |
col_gap |
a gapdata class object for columns |
row_labels |
a logical to show labels for rows |
col_labels |
a logical to show labels for columns |
rotate |
a logical to rotate row labels |
col |
colors used for heatmap |
Value
a ggplot object
Function to draw gapped labels
Description
This function draws gapped labels using the ggplot2 package. The input for the function is the gapdata class object, generated by the gap_data() function.
Usage
gap_label(data, orientation, label_size = 5)
Arguments
data |
gapdata class object |
orientation |
orientation of the labels, "left", "top", "right", or "bottom" |
label_size |
a numeric to set the label text size |
Value
a ggplot object
Function to draw a gapped cluster heatmap
Description
This function draws a gapped cluster heatmap using the ggplot2 package. The inputs for the function are a matrix, two dendrograms, and parameters for gaps.
Usage
gapmap(
m,
d_row,
d_col,
mode = c("quantitative", "threshold"),
mapping = c("exponential", "linear"),
ratio = 0.2,
scale = 0.5,
threshold = 0,
row_threshold = NULL,
col_threshold = NULL,
rotate_label = TRUE,
verbose = FALSE,
left = "dendrogram",
top = "dendrogram",
right = "label",
bottom = "label",
col = c("#053061", "#2166AC", "#4393C3", "#92C5DE", "#D1E5F0", "#F7F7F7", "#FDDBC7",
"#F4A582", "#D6604D", "#B2182B", "#67001F"),
h_ratio = c(0.2, 0.7, 0.1),
v_ratio = c(0.2, 0.7, 0.1),
label_size = 5,
show_legend = FALSE,
...
)
Arguments
m |
matrix |
d_row |
a dendrogram class object for rows |
d_col |
a dendrogram class object for columns |
mode |
gap mode, either "threshold" or "quantitative" |
mapping |
in case of quantitative mode, either "linear" or "exponential" mapping |
ratio |
the percentage of width allocated for the sum of gaps. |
scale |
the scale log base for the exponential mapping |
threshold |
the height at which the dendrogram is cut to infer clusters |
row_threshold |
the height at which the row dendrogram is cut |
col_threshold |
the height at which the column dendrogram is cut |
rotate_label |
a logical to rotate column labels or not |
verbose |
logical for whether in verbose mode or not |
left |
a character indicating "label" or "dendrogram" for composition |
top |
a character indicating "label" or "dendrogram" for composition |
right |
a character indicating "label" or "dendrogram" for composition |
bottom |
a character indicating "label" or "dendrogram" for composition |
col |
colors used for heatmap |
h_ratio |
a vector to set the horizontal ratio of the grid. It should add up to 1. top, center, bottom. |
v_ratio |
a vector to set the vertical ratio of the grid. It should add up to 1. left, center, right. |
label_size |
a numeric to set the label text size |
show_legend |
a logical to set whether to show a legend or not |
... |
ignored |
Value
a ggplot object
Examples
set.seed(1234)
#generate sample data
x <- rnorm(10, mean=rep(1:5, each=2), sd=0.4)
y <- rnorm(10, mean=rep(c(1,2), each=5), sd=0.4)
dataFrame <- data.frame(x=x, y=y, row.names=c(1:10))
#calculate distance matrix. default is Euclidean distance
distxy <- dist(dataFrame)
#perform hierarchical clustering. default is complete linkage.
hc <- hclust(distxy)
dend <- as.dendrogram(hc)
#make a cluster heatmap plot
gapmap(m = as.matrix(distxy), d_row= rev(dend), d_col=dend)
Get the leftmost leaf object from a dendrogram
Description
This function returns the leftmost leaf object.
Usage
get_most_left_leaf(d)
Arguments
d |
dendrogram class object |
Value
the leftmost leaf object
Get the rightmost leaf object from a dendrogram
Description
This function returns the rightmost leaf object.
Usage
get_most_right_leaf(d)
Arguments
d |
dendrogram class object |
Value
the rightmost leaf object
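Finding the outermost leaves of a dendrogram amounts to repeatedly descending into the first (or last) child until a leaf is reached. The helpers below are hypothetical base R sketches of that idea, not the package's own functions.

```r
# Hypothetical helpers: walk down the first or last child until a leaf is hit.
leftmost_leaf <- function(d) {
  while (!is.leaf(d)) d <- d[[1]]
  d
}

rightmost_leaf <- function(d) {
  while (!is.leaf(d)) d <- d[[length(d)]]
  d
}

dend <- as.dendrogram(hclust(dist(scale(USArrests[1:10, ]))))

left_label  <- attr(leftmost_leaf(dend), "label")
right_label <- attr(rightmost_leaf(dend), "label")
c(left = left_label, right = right_label)
```

The two labels match the first and last entries of labels(dend), which lists the leaves in left-to-right display order.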
Make a data.frame object
Description
This function makes a data.frame from the 4 input parameters.
Usage
get_segment_df(x0, y0, x1, y1)
Arguments
x0 |
x coordinate of point 1 |
y0 |
y coordinate of point 1 |
x1 |
x coordinate of point 2 |
y1 |
y coordinate of point 2 |
Value
A data.frame
Function to check if an object is a gapdata class object
Description
This function checks if an object is a gapdata class object.
Usage
is.gapdata(x)
Arguments
x |
an object |
Value
a logical TRUE or FALSE
A function to map values in a range
Description
This function maps a value in one range to another range.
Usage
map(value, start1, stop1, start2, stop2)
Arguments
value |
input value |
start1 |
lower bound of the value's current range |
stop1 |
upper bound of the value's current range |
start2 |
lower bound of the value's target range |
stop2 |
upper bound of the value's target range |
Value
a numeric value
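Mapping a value from one range to another is usually done by linear interpolation. The function map_linear() below is a hypothetical sketch of that standard formula; it is assumed, not confirmed, that the package's map() behaves the same way.

```r
# Hypothetical sketch of linear range mapping (standard interpolation formula).
map_linear <- function(value, start1, stop1, start2, stop2) {
  # Fraction of the way through the source range, rescaled to the target range.
  start2 + (stop2 - start2) * (value - start1) / (stop1 - start1)
}

mid_point <- map_linear(5, 0, 10, 0, 100)            # midpoint maps to midpoint
rescaled  <- map_linear(c(0, 2.5, 10), 0, 10, 1, 3)  # vectorised over `value`
mid_point
rescaled
```

Here 5 on the range 0 to 10 maps to 50 on the range 0 to 100, and the endpoints of the source range map to the endpoints of the target range.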
A function to map values in a range exponentially
Description
This function maps a value in one range to another range exponentially.
Usage
map.exp(value, start1, stop1, start2, stop2, scale = 0.5)
Arguments
value |
input value |
start1 |
lower bound of the value's current range |
stop1 |
upper bound of the value's current range |
start2 |
lower bound of the value's target range |
stop2 |
upper bound of the value's target range |
scale |
scale log base |
Value
a numeric value
Sample data matrix from the integrated pathway analysis of gastric cancer from the Cancer Genome Atlas (TCGA) study.
Description
A multivariate table obtained from the integrated pathway analysis of gastric cancer from The Cancer Genome Atlas (TCGA) study. In this data set, each column represents a pathway consisting of a set of genes, and each row represents a cohort of samples based on specific clinical or genetic features. For each pair of a pathway and a feature, a continuous value between 1 and -1 is assigned to score positive or negative association, respectively.
Usage
data(sample_tcga)
Format
A data frame with 215 rows and 117 variables
Details
We would like to thank Sheila Reynolds and Vesteinn Thorsson from the Institute for Systems Biology for sharing this sample data set.
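A possible way to load the sample data and feed it to gapmap() is sketched below. Euclidean distance and complete linkage are assumptions chosen for brevity, not a recommendation from the package authors, and the block does nothing when gapmap is not installed.

```r
have_gapmap <- requireNamespace("gapmap", quietly = TRUE)

if (have_gapmap) {
  library(gapmap)
  data(sample_tcga)

  m <- as.matrix(sample_tcga)
  print(dim(m))   # rows are cohorts, columns are pathways

  # Cluster rows and columns separately (default distance and linkage).
  d_row <- as.dendrogram(hclust(dist(m)))
  d_col <- as.dendrogram(hclust(dist(t(m))))

  # Smaller labels help with a matrix of this size.
  gapmap(m = m, d_row = rev(d_row), d_col = d_col, label_size = 2)
}
```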
Set the rightmost leaf object of a dendrogram
Description
This function replaces the rightmost leaf with the provided dendrogram.
Usage
set_most_right_leaf(d, d2, ...)
Arguments
d |
dendrogram class object, subtree object |
d2 |
dendrogram class object, a leaf to replace the rightmost leaf |
Value
the dendrogram class object in which the rightmost leaf is replaced
Sum the distance of all branches in a dendrogram
Description
This function takes a dendrogram class object as input and adds up the distances of all branches. It is called recursively to accumulate the sum. In the case of exponential mapping in quantitative mode, the sum is on the exponential scale.
Usage
sum_distance(
d,
sum = 0,
mapping = c("exponential", "linear"),
scale = 0,
max_height = 0,
...
)
Arguments
d |
dendrogram class object |
sum |
the sum of distance |
mapping |
in case of quantitative mode, either "linear" or "exponential" mapping |
... |
ignored |
Value
the sum of distances
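The recursive accumulation can be sketched in base R for the linear case. The helper sum_heights() is hypothetical and assumes that the "distance" of a branch is the height attribute of each internal node; the exponential case is not covered here.

```r
# Hypothetical helper: recursively add up the heights of all internal nodes.
sum_heights <- function(d, total = 0) {
  if (is.leaf(d)) return(total)          # leaves contribute nothing
  total <- total + attr(d, "height")
  for (i in seq_along(d)) {
    total <- sum_heights(d[[i]], total)  # pass the running sum down
  }
  total
}

hc   <- hclust(dist(scale(USArrests[1:10, ])))
dend <- as.dendrogram(hc)

total_height <- sum_heights(dend)
total_height
```

Passing the running total as an argument mirrors the sum = 0 default in the Usage section, and under the stated assumption the result equals sum(hc$height).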