Type: Package
Title: Support Functions for Wrangling and Visualization
Version: 1.5.0
Date: 2025-06-03
Maintainer: Nicholas J Lyon <njlyon@alumni.iastate.edu>
Description: Suite of helper functions for data wrangling and visualization. The only theme for these functions is that they tend towards simple, short, and narrowly-scoped. These functions are built for tasks that often recur but are not large enough in scope to warrant an ecosystem of interdependent functions.
License: MIT + file LICENSE
Language: en-US
Encoding: UTF-8
RoxygenNote: 7.3.1
URL: https://github.com/njlyon0/supportR, https://njlyon0.github.io/supportR/
BugReports: https://github.com/njlyon0/supportR/issues
Depends: R (≥ 3.5)
Imports: data.tree, dplyr, ggplot2, gh, googledrive, graphics, lifecycle, magrittr, methods, purrr, rlang, rmarkdown, scales, stringi, stringr, tidyr, vegan
Suggests: ape, devtools, knitr, palmerpenguins, testthat (≥ 3.0.0)
VignetteBuilder: knitr
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2025-06-03 14:44:12 UTC; nick.lyon
Author: Nicholas J Lyon
Repository: CRAN
Date/Publication: 2025-06-03 15:20:02 UTC
supportR: Support Functions for Wrangling and Visualization
Description
Suite of helper functions for data wrangling and visualization. The only theme for these functions is that they tend towards simple, short, and narrowly-scoped. These functions are built for tasks that often recur but are not large enough in scope to warrant an ecosystem of interdependent functions.
Author(s)
Maintainer: Nicholas J Lyon <njlyon@alumni.iastate.edu> [copyright holder]
See Also
Useful links:
https://github.com/njlyon0/supportR
https://njlyon0.github.io/supportR/
Report bugs at https://github.com/njlyon0/supportR/issues
Melt an Array into a Dataframe
Description
Melts an array of dimensions x, y, and z into a dataframe containing columns x, y, z, and value, where value is whatever was stored in the array at those coordinates.
Usage
array_melt(array = NULL)
Arguments
array: (array) array object to melt into a dataframe
Value
(dataframe) object containing the "flattened" array in dataframe format
Examples
# First we need to create an array to melt
## Make data to fill the array
vec1 <- c(5, 9, 3)
vec2 <- c(10:15)
## Create dimension names (x = col, y = row, z = which matrix)
x_vals <- c("Col_1","Col_2","Col_3")
y_vals <- c("Row_1","Row_2","Row_3")
z_vals <- c("Mat_1","Mat_2")
## Make an array from these components
g <- array(data = c(vec1, vec2), dim = c(3, 3, 2),
dimnames = list(x_vals, y_vals, z_vals))
## "Melt" the array into a dataframe
supportR::array_melt(array = g)
Count Occurrences of Unique Vector Elements
Description
Counts the number of occurrences of each element in the provided vector. Counting of NAs in addition to non-NA values is supported.
Usage
count(vec = NULL)
Arguments
vec: (vector) vector containing elements to count
Value
(dataframe) two-column dataframe with as many rows as there are unique elements in the provided vector. The first column is named "value" and includes the unique elements of the vector; the second column is named "count" and includes the number of occurrences of each vector element.
Examples
# Count instances of vector elements
supportR::count(vec = c(1, 1, NA, "a", 1, "a", NA, "x"))
Count Difference in Occurrences of Vector Elements
Description
Counts the number of occurrences of each element in both provided vectors and then calculates the difference in that count between the first and second input vector. Counting of NAs in addition to non-NA values is supported.
Usage
count_diff(vec1, vec2, what = NULL)
Arguments
vec1: (vector) first vector containing elements to count
vec2: (vector) second vector containing elements to count
what: (vector) optional argument for what element(s) to count. If left NULL (the default), all unique elements found in either vector are counted
Value
(dataframe) four-column dataframe with as many rows as there are unique elements across both specified vectors or as the number of elements passed to 'what'. The first column is named "value" and includes the unique elements of the vectors; the second and third columns are named "vec1_count" and "vec2_count" respectively and include the number of occurrences of each element in each vector. The final column is named "diff" and includes the difference in the count of each element between the first and second input vectors.
Examples
# Define two vectors
x1 <- c(1, 1, NA, "a", 1, "a", NA, "x")
x2 <- c(1, "a", "x")
# Count difference in number of NAs between the two vectors
supportR::count_diff(vec1 = x1, vec2 = x2, what = NA)
# Count difference in all values between the two
supportR::count_diff(vec1 = x1, vec2 = x2)
Crop a Triangle from Data Object
Description
Accepts a symmetric data object and replaces the chosen triangle with NAs. Also allows the user to choose whether to keep or drop the diagonal of the data object.
Usage
crop_tri(data = NULL, drop_tri = "upper", drop_diag = FALSE)
Arguments
data: (dataframe, dataframe-like, or matrix) symmetric data object to remove one of the triangles from
drop_tri: (character) which triangle to replace with NAs, either "upper" or "lower"
drop_diag: (logical) whether to drop the diagonal of the data object (defaults to FALSE)
Value
(dataframe or dataframe-like) data object with desired triangle removed and either with or without the diagonal
Examples
# Define a simple matrix with symmetric dimensions
mat <- matrix(data = c(1:2, 2:1), nrow = 2, ncol = 2)
# Crop off its lower triangle
supportR::crop_tri(data = mat, drop_tri = "lower", drop_diag = FALSE)
Check Columns for Non-Dates
Description
Identifies any elements in the column(s) that would be changed to NA if as.Date() were used on the column(s). This is useful for quickly identifying only the "problem" entries of ostensibly date column(s) that are read in as characters.
Usage
date_check(data = NULL, col = NULL)
Arguments
data: (dataframe) object containing at least one column of supposed dates
col: (character or numeric) name(s) or column number(s) of the column(s) containing putative dates in the data object
Value
(list) malformed dates from each supplied column in separate list elements
Examples
# Make a dataframe to test the function
loc <- c("LTR", "GIL", "PYN", "RIN")
time <- c("2021-01-01", "2021-01-0w", "1990", "2020-10-xx")
time2 <- c("1880-08-08", "2021-01-02", "1992", "2049-11-01")
time3 <- c("2022-10-31", "tomorrow", "1993", NA)
# Assemble our vectors into a dataframe
sites <- data.frame("site" = loc, "first_visit" = time, "second" = time2, "third" = time3)
# Use `date_check()` to return only the entries that would be lost
supportR::date_check(data = sites, col = c("first_visit", "second", "third"))
Identify Probable Format for Ambiguous Date Formats
Description
In a column containing multiple date formats (e.g., MM/DD/YYYY, YYYY/MM/DD, etc.), identifies the probable format of each date. Provision of a grouping column improves inference. Any formats that cannot be determined are flagged as "FORMAT UNCERTAIN" for human double-checking. This is useful for quickly sorting the bulk of ambiguous dates into clear categories for later conditional wrangling.
Usage
date_format_guess(
data = NULL,
date_col = NULL,
groups = TRUE,
group_col = NULL,
return = "dataframe",
quiet = FALSE
)
Arguments
data: (dataframe) object containing at least one column of ambiguous dates
date_col: (character) name of column containing ambiguous dates
groups: (logical) whether groups exist in the dataframe / should be used (defaults to TRUE)
group_col: (character) name of column containing grouping variable
return: (character) either "dataframe" or "vector" depending on whether the user wants the date format "guesses" returned as a new column on the dataframe or as a vector
quiet: (logical) whether certain optional messages should be displayed (defaults to FALSE)
Value
(dataframe or character) object containing date format guesses
Examples
# Create dataframe of example ambiguous dates & grouping variable
my_df <- data.frame('data_enterer' = c('person A', 'person B',
'person B', 'person B',
'person C', 'person D',
'person E', 'person F',
'person G'),
'bad_dates' = c('2022.13.08', '2021/2/02',
'2021/2/03', '2021/2/04',
'1899/1/15', '10-31-1901',
'26/11/1901', '08.11.2004',
'6/10/02'))
# Now we can invoke the function!
supportR::date_format_guess(data = my_df, date_col = "bad_dates",
group_col = "data_enterer", return = "dataframe")
# If preferred, do it without groups and return a vector
supportR::date_format_guess(data = my_df, date_col = "bad_dates",
groups = FALSE, return = "vector")
Compare Difference Between Two Vectors
Description
Reflexively compares two vectors and identifies (1) elements that are found in the first but not the second (i.e., "lost" components) and (2) elements that are found in the second but not the first (i.e., "gained" components). This is particularly helpful when manipulating a dataframe and comparing which columns are lost or gained between wrangling steps. Alternatively, it can compare the contents of two columns to see how two dataframes differ.
Usage
diff_check(old = NULL, new = NULL, sort = TRUE, return = FALSE)
Arguments
old: (vector) starting / original object
new: (vector) ending / modified object
sort: (logical) whether to sort the difference between the two vectors
return: (logical) whether to return the two vectors as a 2-element list
Value
No return value (unless return = TRUE), called for side effects. If return = TRUE, returns a two-element list.
Examples
# Make two vectors
vec1 <- c("x", "a", "b")
vec2 <- c("y", "z", "a")
# Compare them!
supportR::diff_check(old = vec1, new = vec2, return = FALSE)
# Return the difference for later use
diff_out <- supportR::diff_check(old = vec1, new = vec2, return = TRUE)
diff_out
Force Coerce to Numeric
Description
Coerces a vector into a numeric vector and automatically silences the "NAs introduced by coercion" warning. Useful for cases where non-numbers are known to exist in the vector and their coercion to NA is expected / unremarkable. Essentially just a way of forcing this coercion more succinctly than wrapping as.numeric() in suppressWarnings().
Usage
force_num(x = NULL)
Arguments
x: (non-numeric) vector containing elements to be coerced into class numeric
Value
(numeric) vector of numeric values
Examples
# Coerce a character vector to numeric without throwing a warning
supportR::force_num(x = c(2, "A", 4))
List Objects in a GitHub Repository
Description
Accepts a GitHub repository URL and identifies all files in the specified folder. If no folder is specified, lists top-level repository contents. Recursive listing of sub-folders is supported by an additional argument. This function only works on repositories (public or private) to which you have access.
Usage
github_ls(repo = NULL, folder = NULL, recursive = TRUE, quiet = FALSE)
Arguments
repo: (character) full URL for a GitHub repository (including "github.com")
folder: (NULL/character) either NULL (to list the top-level repository contents) or the name of the folder whose contents should be listed
recursive: (logical) whether to recursively list contents (i.e., list contents of sub-folders identified within previously identified sub-folders)
quiet: (logical) whether to print an informative message as the contents of each folder is being listed
Value
(dataframe) three-column dataframe including (1) the names of the contents, (2) the type of each content item (e.g., file/directory/etc.), and (3) the full path from the starting folder to each item
Examples
## Not run:
# List complete contents of the `supportR` package repository
supportR::github_ls(repo = "https://github.com/njlyon0/supportR", recursive = TRUE, quiet = FALSE)
## End(Not run)
List Objects in a Single Folder of a GitHub Repository
Description
Accepts a GitHub repository URL and identifies all files in the specified folder. If no folder is specified, lists top-level repository contents. This function only works on repositories (public or private) to which you have access.
Usage
github_ls_single(repo = NULL, folder = NULL)
Arguments
repo: (character) full URL for a GitHub repository (including "github.com")
folder: (NULL/character) either NULL (to list the top-level repository contents) or the name of the folder whose contents should be listed
Value
(dataframe) two-column dataframe including (1) the names of the contents and (2) the type of each content item (e.g., file/directory/etc.)
Examples
## Not run:
# List contents of the top-level of the `supportR` package repository
supportR::github_ls_single(repo = "https://github.com/njlyon0/supportR")
## End(Not run)
Create File Tree of a GitHub Repository
Description
Recursively identifies all files in a GitHub repository and uses the data.tree package to create a simple, human-readable visualization of the folder hierarchy. Folders can be specified for exclusion, in which case the number of elements within them is listed but not the names of those objects. This function only works on repositories (public or private) to which you have access.
Usage
github_tree(repo = NULL, exclude = NULL, quiet = FALSE)
Arguments
repo: (character) full URL for a GitHub repository (including "github.com")
exclude: (character) vector of folder names to exclude from the file tree. If NULL (the default), no folders are excluded
quiet: (logical) whether to print an informative message as the contents of each folder is being listed and as the tree is prepared from that information
Value
(node / R6) object of the class used by the data.tree package
Examples
## Not run:
# Create a file tree for the `supportR` package GitHub repository
supportR::github_tree(repo = "github.com/njlyon0/supportR", exclude = c("man", "docs", ".github"))
## End(Not run)
Check Multiple Columns for Non-Dates
Description
This function was deprecated because I realized that it is just a special case of the date_check() function.
Usage
multi_date_check(data = NULL, col_vec = NULL)
Arguments
data: (dataframe) object containing at least one column of supposed dates
col_vec: (character or numeric) vector of names or column numbers of the columns containing putative dates in the data object
Value
(list) of same length as col_vec with malformed dates from each column in their respective element
Examples
# Make a dataframe to test the function
loc <- c("LTR", "GIL", "PYN", "RIN")
time <- c('2021-01-01', '2021-01-0w', '1990', '2020-10-xx')
time2 <- c('1880-08-08', '2021-01-02', '1992', '2049-11-01')
time3 <- c('2022-10-31', 'tomorrow', '1993', NA)
# Assemble our vectors into a dataframe
sites <- data.frame('site' = loc, 'first_visit' = time, "second" = time2, "third" = time3)
# Use `multi_date_check()` to return only the entries that would be lost
multi_date_check(data = sites, col_vec = c("first_visit", "second", "third"))
Check Multiple Columns for Non-Numbers
Description
This function was deprecated because I realized that it is just a special case of the num_check() function.
Usage
multi_num_check(data = NULL, col_vec = NULL)
Arguments
data: (dataframe) object containing at least one column of supposed numbers
col_vec: (character or numeric) vector of names or column numbers of the columns containing putative numbers in the data object
Value
(list) of same length as col_vec with malformed numbers from each column in their respective element
Examples
# Create dataframe with a numeric column where some entries would be coerced into NA
spp <- c('salmon', 'bass', 'halibut', 'eel')
ct <- c(1, '14x', '_23', 12)
ct2 <- c('a', '2', '4', '0')
ct3 <- c(NA, 'Y', 'typo', '2')
fish <- data.frame('species' = spp, 'count' = ct, 'num_col2' = ct2, 'third_count' = ct3)
# Use `multi_num_check()` to return only the entries that would be lost
multi_num_check(data = fish, col_vec = c("count", "num_col2", "third_count"))
Create Named Vector
Description
Create a named vector in a single line without either manually defining names at the outset (e.g., c("name_1" = 1, "name_2" = 2, ...)) or spending a second line to assign names to an existing vector (e.g., names(vec) <- c("name_1", "name_2", ...)). Useful in cases where you need a named vector within a pipe and don't want to break into two pipes just to define a named vector (see tidyr::separate_wider_position).
Usage
name_vec(content = NULL, name = NULL)
Arguments
content: (vector) content of vector
name: (vector) names to assign to vector (must be in same order)
Value
(named vector) vector with contents from the content argument and names from the name argument
Examples
# Create a named vector
supportR::name_vec(content = 1:10, name = paste0("text_", 1:10))
Publication-Quality Non-metric Multi-dimensional Scaling (NMS) Ordinations
Description
This function has been superseded by ordination() because this is just a special case of that function. Additionally, ordination() provides users much more control over the internal graphics functions used to create the fundamental elements of the graph.
Produces Non-Metric Multi-dimensional Scaling (NMS) ordinations for up to 10 groups. Assigns a unique color for each group and draws an ellipse around the standard deviation of the points. Automatically adds stress (see vegan::metaMDS for an explanation of "stress") as the legend title. Because there are only five hollow shapes (see ?graphics::pch), all shapes are re-used a maximum of two times when more than 5 groups are supplied.
Usage
nms_ord(
mod = NULL,
groupcol = NULL,
title = NA,
colors = c("#41b6c4", "#c51b7d", "#7fbc41", "#d73027", "#4575b4", "#e08214", "#8073ac",
"#f1b6da", "#b8e186", "#8c96c6"),
shapes = rep(x = 21:25, times = 2),
lines = rep(x = 1, times = 10),
pt_size = 1.5,
pt_alpha = 1,
lab_text_size = 1.25,
axis_text_size = 1,
leg_pos = "bottomleft",
leg_cont = unique(groupcol)
)
Arguments
mod: (metaMDS/monoMDS) object returned by vegan::metaMDS
groupcol: (dataframe) column specification in the data that includes the groups (accepts either bracket or $ notation)
title: (character) string to use as title for plot
colors: (character) vector of colors (as hexadecimal codes) of length >= group levels (default not colorblind safe because of need for 10 built-in unique colors)
shapes: (numeric) vector of shapes (as values accepted by pch) of length >= group levels
lines: (numeric) vector of line types (as integers) of length >= group levels
pt_size: (numeric) value for point size (controlled by character expansion, i.e., cex)
pt_alpha: (numeric) value for transparency of points (ranges from 0 to 1)
lab_text_size: (numeric) value for axis label text size
axis_text_size: (numeric) value for axis tick text size
leg_pos: (character or numeric) legend position, either a numeric vector of x/y coordinates or a shorthand keyword accepted by graphics::legend
leg_cont: (character) vector of desired legend entries. Defaults to unique(groupcol) (i.e., the unique group levels)
Value
(plot) base R ordination with an ellipse for each group
Examples
# Use data from the vegan package
utils::data("varespec", package = 'vegan')
resp <- varespec
# Make some columns of known number of groups
factor_4lvl <- c(rep.int("Trt1", (nrow(resp)/4)),
rep.int("Trt2", (nrow(resp)/4)),
rep.int("Trt3", (nrow(resp)/4)),
rep.int("Trt4", (nrow(resp)/4)))
# And combine them into a single data object
data <- cbind(factor_4lvl, resp)
# Actually perform multidimensional scaling
mds <- vegan::metaMDS(data[-1], autotransform = FALSE, expand = FALSE, k = 2, try = 50)
# With the scaled object and original dataframe we can use this function
nms_ord(mod = mds, groupcol = data$factor_4lvl,
title = '4-Level NMS', leg_pos = 'topright',
leg_cont = as.character(1:4))
Check Columns for Non-Numbers
Description
Identifies any elements in the column(s) that would be changed to NA if as.numeric() were used on the column(s). This is useful for quickly identifying only the "problem" entries of ostensibly numeric column(s) that are read in as characters.
Usage
num_check(data = NULL, col = NULL)
Arguments
data: (dataframe) object containing at least one column of supposed numbers
col: (character or numeric) name(s) or column number(s) of the column(s) containing putative numbers in the data object
Value
(list) malformed numbers from each supplied column in separate list elements
Examples
# Create dataframe with a numeric column where some entries would be coerced into NA
spp <- c("salmon", "bass", "halibut", "eel")
ct <- c(1, "14x", "_23", 12)
ct2 <- c("a", "2", "4", "0")
ct3 <- c(NA, "Y", "typo", "2")
fish <- data.frame("species" = spp, "count" = ct, "num_col2" = ct2, "third_count" = ct3)
# Use `num_check()` to return only the entries that would be lost
supportR::num_check(data = fish, col = c("count", "num_col2", "third_count"))
Create an Ordination with Ellipses for Groups
Description
Produces a Nonmetric Multidimensional Scaling (NMS) or Principal Coordinate Analysis (PCoA) ordination for up to 10 groups. Draws an ellipse around the standard deviation of the points in each group. By default, assigns a unique, colorblind-safe color and point shape to each group. If the user supplies colors/shapes then the function can support more than 10 groups. For NMS ordinations, includes the stress as the legend title (see ?vegan::metaMDS for an explanation of "stress"). For PCoA ordinations, includes the percent variation explained parenthetically in the axis labels.
Usage
ordination(mod = NULL, grps = NULL, ...)
Arguments
mod: (pcoa | monoMDS/metaMDS) object returned by ape::pcoa or vegan::metaMDS
grps: (vector) vector of categorical groups for data. Must be same length as number of rows in original data object
...: additional arguments passed on to the internal graphics functions (e.g., bg, lty, col, alpha, x, and legend, as shown in the examples below)
Value
(plot) base R ordination with an ellipse for each group
Examples
# Use data from the vegan package
utils::data("varespec", package = 'vegan')
# Make some columns of known number of groups
treatment <- c(rep.int("Trt1", (nrow(varespec)/4)),
rep.int("Trt2", (nrow(varespec)/4)),
rep.int("Trt3", (nrow(varespec)/4)),
rep.int("Trt4", (nrow(varespec)/4)))
# And combine them into a single data object
data <- cbind(treatment, varespec)
# Get a distance matrix from the data
dist <- vegan::vegdist(varespec, method = 'kulczynski')
# Perform PCoA / NMS
pcoa_mod <- ape::pcoa(dist)
nms_mod <- vegan::metaMDS(data[-1], autotransform = FALSE, expand = FALSE, k = 2, try = 50)
# Create PCoA ordination (with optional arguments)
supportR::ordination(mod = pcoa_mod, grps = data$treatment,
bg = c("red", "blue", "purple", "orange"),
lty = 2, col = "black")
# Create NMS ordination
supportR::ordination(mod = nms_mod, grps = data$treatment, alpha = 0.3,
x = "topright", legend = LETTERS[1:4])
Publication-Quality Principal Coordinates Analysis (PCoA) Ordinations
Description
This function has been superseded by ordination() because this is just a special case of that function. Additionally, ordination() provides users much more control over the internal graphics functions used to create the fundamental elements of the graph.
Produces Principal Coordinates Analysis (PCoA) ordinations for up to 10 groups. Assigns a unique color for each group and draws an ellipse around the standard deviation of the points. Automatically adds the percent of variation explained by the first two principal component axes parenthetically to the axis labels. Because there are only five hollow shapes (see ?graphics::pch), all shapes are re-used a maximum of two times when more than 5 groups are supplied.
Usage
pcoa_ord(
mod = NULL,
groupcol = NULL,
title = NA,
colors = c("#41b6c4", "#c51b7d", "#7fbc41", "#d73027", "#4575b4", "#e08214", "#8073ac",
"#f1b6da", "#b8e186", "#8c96c6"),
shapes = rep(x = 21:25, times = 2),
lines = rep(x = 1, times = 10),
pt_size = 1.5,
pt_alpha = 1,
lab_text_size = 1.25,
axis_text_size = 1,
leg_pos = "bottomleft",
leg_cont = unique(groupcol)
)
Arguments
mod: (pcoa) object returned by ape::pcoa
groupcol: (dataframe) column specification in the data that includes the groups (accepts either bracket or $ notation)
title: (character) string to use as title for plot
colors: (character) vector of colors (as hexadecimal codes) of length >= group levels (default not colorblind safe because of need for 10 built-in unique colors)
shapes: (numeric) vector of shapes (as values accepted by pch) of length >= group levels
lines: (numeric) vector of line types (as integers) of length >= group levels
pt_size: (numeric) value for point size (controlled by character expansion, i.e., cex)
pt_alpha: (numeric) value for transparency of points (ranges from 0 to 1)
lab_text_size: (numeric) value for axis label text size
axis_text_size: (numeric) value for axis tick text size
leg_pos: (character or numeric) legend position, either a numeric vector of x/y coordinates or a shorthand keyword accepted by graphics::legend
leg_cont: (character) vector of desired legend entries. Defaults to unique(groupcol) (i.e., the unique group levels)
Value
(plot) base R ordination with an ellipse for each group
Examples
# Use data from the vegan package
data("varespec", package = 'vegan')
resp <- varespec
# Make some columns of known number of groups
factor_4lvl <- c(rep.int("Trt1", (nrow(resp)/4)),
rep.int("Trt2", (nrow(resp)/4)),
rep.int("Trt3", (nrow(resp)/4)),
rep.int("Trt4", (nrow(resp)/4)))
# And combine them into a single data object
data <- cbind(factor_4lvl, resp)
# Get a distance matrix from the data
dist <- vegan::vegdist(resp, method = 'kulczynski')
# Perform a PCoA on the distance matrix to get points for an ordination
pnts <- ape::pcoa(dist)
# Test the function for 4 groups
pcoa_ord(mod = pnts, groupcol = data$factor_4lvl)
Replace Non-ASCII Characters with Comparable ASCII Characters
Description
Finds all non-ASCII (American Standard Code for Information Interchange) characters in a character vector and replaces them with ASCII characters that are as visually similar as possible. For example, various special dash types (e.g., em dash, en dash, etc.) are replaced with a hyphen. The function will return a warning if it finds any non-ASCII characters for which it does not have a hard-coded replacement. Please open a GitHub Issue if you encounter this warning and have a suggestion for what the replacement character should be for that particular character.
Usage
replace_non_ascii(x = NULL, include_letters = FALSE)
Arguments
x: (character) vector in which to replace non-ASCII characters
include_letters: (logical) whether to include letters with accents (e.g., u with an umlaut, etc.) in the replacement. Defaults to FALSE
Value
(character) vector where all non-ASCII characters have been replaced by ASCII equivalents
Examples
# Make a vector of the hexadecimal codes for several non-ASCII characters
## This function accepts the characters themselves but CRAN checks do not
non_ascii <- c("\u201C", "\u00AC", "\u00D7")
# Invoke function
(ascii <- supportR::replace_non_ascii(x = non_ascii))
Knit an R Markdown File and Export to Google Drive
Description
This function allows you to knit a specified R Markdown file locally and export it to the Google Drive folder for which you provided a link. NOTE that if you have not used googledrive::drive_auth(), this will prompt you to authorize a Google account in a new browser tab. If you do not check the box in that screen before continuing, you will not be able to use this function until you clear your browser cache and re-authenticate. I recommend invoking drive_auth() beforehand to reduce the chances of this error.
Usage
rmd_export(
rmd = NULL,
out_path = getwd(),
out_name = NULL,
out_type = "html",
drive_link
)
Arguments
rmd: (character) name and path to R Markdown file to knit
out_path: (character) path to the knit file's destination (defaults to the path returned by getwd())
out_name: (character) desired name for knit file (with or without file suffix)
out_type: (character) either "html" or "pdf" depending on the YAML output entry in the R Markdown file
drive_link: (character) full URL of drive folder to upload the knit document
Value
No return value, called for side effect (to knit R Markdown file)
Examples
## Not run:
# Authorize R to interact with GoogleDrive
googledrive::drive_auth()
## NOTE: See warning about possible misstep at this stage
# Use `rmd_export()` to knit and export an .Rmd file
supportR::rmd_export(rmd = "my_markdown.Rmd", out_path = getwd(),
out_name = "my_markdown", out_type = "html",
drive_link = "<Google Drive folder URL>")
## End(Not run)
Safely Rename Columns in a Dataframe
Description
Replaces specified column names with user-defined vector of new column name(s). This operation is done "safely" because it specifically matches each 'bad' name with its corresponding 'good' name and thus minimizes the risk of accidentally replacing the wrong column name.
Usage
safe_rename(data = NULL, bad_names = NULL, good_names = NULL)
Arguments
data: (dataframe or dataframe-like) object with column names that match the values passed to the bad_names argument
bad_names: (character) vector of column names to replace in original data object. Order does not need to match data column order but must match the order of the good_names argument
good_names: (character) vector of column names to use as replacements for data object. Order does not need to match data column order but must match the order of the bad_names argument
Value
(dataframe or dataframe-like) object with renamed columns
Examples
# Make a dataframe to demonstrate
df <- data.frame("first" = 1:3, "middle" = 4:6, "second" = 7:9)
# Invoke the function
supportR::safe_rename(data = df, bad_names = c("second", "middle"),
good_names = c("third", "second"))
Generate Summary Table for Supplied Response and Grouping Variables
Description
Calculates mean, standard deviation, sample size, and standard error of a given response variable within user-defined grouping variables. This is meant as a convenience so users do not have to do dplyr::group_by() followed by dplyr::summarize() themselves.
Usage
summary_table(
data = NULL,
groups = NULL,
response = NULL,
drop_na = FALSE,
round_digits = 2
)
Arguments
data: (dataframe or dataframe-like) object with column names that match the values passed to the groups and response arguments
groups: (character) vector of column names to group by
response: (character) name of the column to calculate summary statistics for (the column must be numeric)
drop_na: (logical) whether to drop NAs in grouping variables. Defaults to FALSE
round_digits: (numeric) number of digits to which mean, standard deviation, and standard error should be rounded
Value
(dataframe) summary table containing the mean, standard deviation, sample size, and standard error of the supplied response variable
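Examples
A minimal illustrative sketch; the dataframe below is invented for demonstration and is not part of the package documentation:
# Make a simple dataframe with one grouping column and one numeric response
sim_df <- data.frame("treatment" = rep(c("Trt1", "Trt2"), each = 5),
                     "response" = c(rnorm(n = 5, mean = 10), rnorm(n = 5, mean = 20)))
# Summarize the response within treatment groups
supportR::summary_table(data = sim_df, groups = "treatment", response = "response",
                        drop_na = FALSE, round_digits = 2)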
Make a Markdown File into a Table
Description
Accepts one markdown file (i.e., "md" file extension) and returns its content as a table. Nested heading structure in the markdown file, as defined by hashtags / pound signs (#), is identified and preserved as columns in the resulting tabular format. Each line of non-heading content in the file is preserved in the right-most column of one row of the table.
Usage
tabularize_md(file = NULL)
Arguments
file: (character/url connection) name and file path of markdown file to transform into a table, or a connection object to a URL of a markdown file (see the url() example below)
Value
(dataframe) table with one more column than there are heading levels in the document (e.g., if first and second level headings are in the document, the resulting table will have three columns) and one row per line of non-heading content in the markdown file.
Examples
## Not run:
# Identify URL to the NEWS.md file in `supportR` GitHub repo
md_cxn <- url("https://raw.githubusercontent.com/njlyon0/supportR/main/NEWS.md")
# Transform it into a table
md_df <- supportR::tabularize_md(file = md_cxn)
# Close connection (just good housekeeping to do so)
close(md_cxn)
# Check out the table format
str(md_df)
## End(Not run)
Complete ggplot2 Theme for Non-Data Aesthetics
Description
Custom alternative to the ggtheme options built into ggplot2. Removes gray boxes and grid lines from plot background. Increases font size of tick marks and axis labels. Removes gray box from legend background and legend key. Removes legend title.
Usage
theme_lyon(title_size = 16, text_size = 13)
Arguments
title_size: (numeric) size of font in axis titles
text_size: (numeric) size of font in tick labels
Value
(ggplot theme) list of ggplot2 theme elements
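Examples
A minimal illustrative sketch; the plot below uses the built-in mtcars data and is not part of the package documentation:
# Build a simple ggplot2 scatterplot
library(ggplot2)
p <- ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point()
# Apply the theme with its default title and text sizes
p + supportR::theme_lyon(title_size = 16, text_size = 13)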