Help for package fgeo.tool

Title:

Import and Manipulate 'ForestGEO' Data

Version:

1.2.10

Description:

To help you access, transform, analyze, and visualize 'ForestGEO' data, we developed a collection of R packages (https://forestgeo.github.io/fgeo/). This package, in particular, helps you to easily import, filter, and modify 'ForestGEO' data. To learn more about 'ForestGEO' visit https://forestgeo.si.edu/.

License:

GPL-3

URL:

https://forestgeo.github.io/fgeo.tool/, https://github.com/forestgeo/fgeo.tool

BugReports:

https://github.com/forestgeo/fgeo.tool/issues

Depends:

R (≥ 3.2)

Imports:

dplyr (≥ 0.8.0.1), glue (≥ 1.3.1), magrittr (≥ 1.5), purrr (≥ 0.3.2), readr (≥ 1.3.1), rlang (≥ 0.4.11), tibble (≥ 2.1.1), tidyselect (≥ 0.2.5)

Suggests:

covr (≥ 3.2.1), fgeo.x (≥ 1.1.3), knitr (≥ 1.22), roxygen2 (≥ 6.1.1), spelling (≥ 2.1), stringr (≥ 1.4.0), testthat (≥ 2.1.1), tidyr (≥ 0.8.3)

Encoding:

UTF-8

Language:

en-US

RoxygenNote:

7.3.2

NeedsCompilation:

Packaged:

2025-04-03 17:11:30 UTC; rstudio

Author:

Mauro Lepore [aut, ctr, cre], Richard Condit [aut], Suzanne Lao [aut], Anudeep Singh [aut], CTFS-ForestGEO [cph, fnd]

Maintainer:

Mauro Lepore <maurolepore@gmail.com>

Repository:

CRAN

Date/Publication:

2025-04-03 17:30:02 UTC

fgeo.tool: Import and Manipulate 'ForestGEO' Data

Description

Author(s)

Maintainer: Mauro Lepore maurolepore@gmail.com (ORCID) [contractor]

Authors:

Richard Condit richardcondit@gmail.com
Suzanne Lao laoz@si.edu
Anudeep Singh anudeep7@gmail.com

Other contributors:

CTFS-ForestGEO ForestGEO@si.edu [copyright holder, funder]

Pipe operator

Description

See magrittr::%>% for details.

Usage

lhs %>% rhs

Add column `status_tree` based on the status of all stems of each tree.

Description

Add column status_tree based on the status of all stems of each tree.

Usage

add_status_tree(data, status_a = "A", status_d = "D")

Arguments

data

A ForestGEO-like dataframe: A ViewFullTable, tree or stem table.

status_a, status_d

String to match alive and dead stems; it corresponds to the values of the variable status (in census tables) or Status (with capital "S" in ViewFull tables).

Value

The input data set with the additional variable status_tree.

Examples

# styler: off
stem <- tribble(
  ~CensusID, ~treeID, ~stemID, ~status,
          1,       1,       1,     "A",
          1,       1,       2,     "D",

          1,       2,       3,     "D",
          1,       2,       4,     "D",



          2,       1,       1,     "A",
          2,       1,       2,     "G",

          2,       2,       3,     "D",
          2,       2,       4,     "G"
)
# styler: on

add_status_tree(stem)
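
The rule behind status_tree can be sketched in base R. This is a minimal sketch, not the package's implementation: it assumes that a tree counts as dead only when every one of its stems in a census is dead, and as alive otherwise. The name all_dead is hypothetical.

```r
# Assumption: a tree is dead ("D") only if all of its stems are dead.
stem <- data.frame(
  CensusID = c(1, 1, 1, 1),
  treeID   = c(1, 1, 2, 2),
  stemID   = c(1, 2, 3, 4),
  status   = c("A", "D", "D", "D"),
  stringsAsFactors = FALSE
)

# For each census-by-tree group, check whether every stem is dead
all_dead <- ave(stem$status == "D", stem$CensusID, stem$treeID, FUN = all)

# Label the whole tree from the status of its stems
stem$status_tree <- ifelse(all_dead, "D", "A")
stem
```

Under this assumption, tree 1 stays alive because one of its stems is alive, while tree 2 is dead.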

Add column `subquadrat` based on `QX` and `QY` coordinates.

Description

Add column subquadrat based on QX and QY coordinates.

Usage

add_subquad(data, x_q, y_q = x_q, x_sq, y_sq = x_sq, subquad_offset = NULL)

Arguments

data

A dataframe with quadrat coordinates QX and QY (e.g. a ViewFullTable).

x_q, y_q

Size in meters of a quadrat's side. For ForestGEO sites, a common value is 20.

x_sq, y_sq

Size in meters of a subquadrat's side. For ForestGEO sites, a common value is 5.

subquad_offset

Either -1 or 1, to subtract or add one unit, respectively, to the digit of each subquadrat that represents the column number.

First column is 0    First column is 1
-----------------    -----------------
   04 14 24 34          14 24 34 44
   03 13 23 33          13 23 33 43
   02 12 22 32          12 22 32 42
   01 11 21 31          11 21 31 41

Value

Returns data with the additional variable subquadrat.

Author(s)

Anudeep Singh and Mauro Lepore.

Examples

# styler: off
vft <- tribble(
   ~QX,  ~QY,
  17.9,    0,
   4.1,   15,
   6.1, 17.3,
   3.8,  5.9,
   4.5, 12.4,
   4.9,  9.3,
   9.8,  3.2,
  18.6,  1.1,
  17.3,  4.1,
   1.5, 16.3
)
# styler: on

add_subquad(vft, 20, 20, 5, 5)

add_subquad(vft, 20, 20, 5, 5, subquad_offset = -1)
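
The labelling scheme in the diagram above can be sketched in base R. This is an illustration only, under two assumptions: coordinates fall in the half-open range from 0 up to (but excluding) the quadrat size, and the first digit of a label is the column while the second is the row. The function subquad_sketch() and its argument first_col are hypothetical.

```r
# Assumptions: 0 <= qx < quadrat size; label = column digit, then row digit.
subquad_sketch <- function(qx, qy, x_sq = 5, y_sq = 5, first_col = 1) {
  col <- floor(qx / x_sq) + first_col  # column digit, starting at `first_col`
  row <- floor(qy / y_sq) + 1          # row digit, always starting at 1
  paste0(col, row)
}

qx <- c(17.9, 4.1, 6.1)
qy <- c(0, 15, 17.3)

first_col_is_1 <- subquad_sketch(qx, qy)
first_col_is_0 <- subquad_sketch(qx, qy, first_col = 0)
first_col_is_1
first_col_is_0
```

Setting first_col = 0 plays the role of subquad_offset = -1: only the column digit shifts.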

Add columns `lx/ly`, `QX/QY`, `index`, `col/row`, `hectindex`, `quad`, `gx/gy`.

Description

These functions add columns to position trees in a forest plot. They work with ViewFullTable, tree and stem tables. From the input table, most functions use only the gx and gy columns (or equivalent columns). The exception is the function add_gxgy() which inputs quadrat information. If your data lacks some important column, an error message will inform you which column is missing.

Usage

add_lxly(data, gridsize = 20, plotdim = NULL)

add_qxqy(data, gridsize = 20, plotdim = NULL)

add_index(data, gridsize = 20, plotdim = NULL)

add_col_row(data, gridsize = 20, plotdim = NULL)

add_hectindex(data, gridsize = 20, plotdim = NULL)

add_quad(data, gridsize = 20, plotdim = NULL, start = NULL, width = 2)

add_gxgy(data, gridsize = 20, start = 0)

Arguments

data

A ForestGEO-like dataframe: A ViewFullTable, tree or stem table.

gridsize

The gridsize of the census plot (commonly 20 m).

plotdim

The global dimensions of the census plot (i.e. the maximum possible values of gx and gy).

start

Defaults to label the first quadrat as "0101". Use 0 to label it as "0000" instead.

width

Number; width to pad the labels of plot-columns and -rows.

Details

These functions are adapted from the CTFS R Package.

Value

For any given var, a function add_var() returns a modified version of the input dataframe, with the additional variable(s) var.

Examples

# styler: off
x <- tribble(
    ~gx,    ~gy,
      0,      0,
     50,     25,
  999.9, 499.95,
   1000,    500
)
# styler: on

# `gridsize` has a common default; `plotdim` is guessed from the data
add_lxly(x)

gridsize <- 20
plotdim <- c(1000, 500)

add_qxqy(x, gridsize, plotdim)

add_index(x, gridsize, plotdim)

add_hectindex(x, gridsize, plotdim)

add_quad(x, gridsize, plotdim)

add_quad(x, gridsize, plotdim, start = 0)

# `width` gives the number of digits to pad the label of plot-rows and
# plot-columns, e.g. 3 pads plot-rows with three zeros and plot-columns with
# an extra three zeros, resulting in a total of 6 zeros.
add_quad(x, gridsize, plotdim, start = 0, width = 3)

add_col_row(x, gridsize, plotdim)


# From `quadrat` or `QuadratName` --------------------------------------
# styler: off
x <- tribble(
  ~QuadratName,
        "0001",
        "0011",
        "0101",
        "1001"
)
# styler: on

# Output `gx` and `gy` ---------------

add_gxgy(x)

assert_is_installed("fgeo.x")
# Warning: The data may already have `gx` and `gy` columns
gxgy <- add_gxgy(fgeo.x::tree5)
select(gxgy, matches("gx|gy"))

# Output `col` and `row` -------------

# Create columns `col` and `row` from `QuadratName` with `tidyr::separate()`
# The argument `sep` lets you separate `QuadratName` at any position
## Not run: 
tidyr_is_installed <- requireNamespace("tidyr", quietly = TRUE)
stringr_is_installed <- requireNamespace("stringr", quietly = TRUE)

if (tidyr_is_installed && stringr_is_installed) {
  library(tidyr)
  library(stringr)

  vft <- tibble(QuadratName = c("0001", "0011"))
  vft

  separate(
    vft,
    QuadratName,
    into = c("col", "row"),
    sep = 2
  )

  census <- select(fgeo.x::tree5, quadrat)
  census

  census$quadrat <- str_pad(census$quadrat, width = 4, pad = 0)

  separate(
    census,
    quadrat,
    into = c("col", "row"),
    sep = 2,
    remove = FALSE
  )
}

## End(Not run)
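
The arithmetic behind the position columns can be sketched in base R. This is a simplified illustration, not the package's code: it assumes plot-columns and plot-rows are counted from 1, and that a quadrat label is the zero-padded column followed by the zero-padded row. The helper quad_label_sketch() is hypothetical.

```r
gx <- c(0, 50, 999.9)
gy <- c(0, 25, 499.95)
gridsize <- 20

# Position within the quadrat (what `lx`/`ly` represent)
lx <- gx %% gridsize
ly <- gy %% gridsize

# Plot-column and plot-row, counting from 1
col <- floor(gx / gridsize) + 1
row <- floor(gy / gridsize) + 1

# Zero-padded quadrat label; `start = 0` shifts the first quadrat to "0000"
quad_label_sketch <- function(col, row, start = 1, width = 2) {
  pad <- function(x) formatC(as.integer(x - 1 + start), width = width, flag = "0")
  paste0(pad(col), pad(row))
}

labels_from_1 <- quad_label_sketch(col, row)
labels_from_0 <- quad_label_sketch(col, row, start = 0)
labels_from_1
labels_from_0
```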

Assert a package is installed.

Description

Assert a package is installed.

Usage

assert_is_installed(pkg)

Arguments

pkg

Character vector giving the name of a package.

Value

An error if pkg is not installed or invisible pkg if it is.

Examples

assert_is_installed("base")
## Not run: 
try(assert_is_installed("bad"))

## End(Not run)

Check if an object contains specific names.

Description

Check if an object contains specific names.

Usage

check_crucial_names(x, nms)

Arguments

x

A named object.

nms

String; names expected to be found in x.

Value

Invisible x, or an error with informative message.

Examples

v <- c(x = 1)
check_crucial_names(v, "x")

dfm <- data.frame(x = 1)
check_crucial_names(dfm, "x")
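
A check of this kind can be sketched with setdiff(). The function check_names_sketch() and the wording of its error message are hypothetical; the sketch only illustrates the idea of returning the input invisibly or failing with the missing names.

```r
check_names_sketch <- function(x, nms) {
  missing_nms <- setdiff(nms, names(x))
  if (length(missing_nms) > 0) {
    stop(
      "The data must have these names: ",
      paste(missing_nms, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(x)
}

v <- c(x = 1)
ok <- check_names_sketch(v, "x")

# A missing name produces an error; `try()` lets the script continue
bad <- try(check_names_sketch(v, c("x", "y")), silent = TRUE)
inherits(bad, "try-error")
```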

Drop if missing values.

Description

Valuable mostly for its warning.

Usage

drop_if_na(dfm, x)

Arguments

dfm

A dataframe.

x

String giving a column name of dfm.

Value

A dataframe.

Examples

dfm <- data.frame(a = 1, b = NA)
drop_if_na(dfm, "b")
drop_if_na(dfm, "a")
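
The behaviour described above (drop rows, but warn about it) can be sketched in base R. The function drop_if_na_sketch() and its warning text are hypothetical and only illustrate the pattern.

```r
drop_if_na_sketch <- function(dfm, x) {
  is_missing <- is.na(dfm[[x]])
  if (any(is_missing)) {
    warning(
      "Dropping ", sum(is_missing), " row(s) with missing `", x, "` values.",
      call. = FALSE
    )
  }
  dfm[!is_missing, , drop = FALSE]
}

dfm <- data.frame(a = 1, b = NA)

# No missing values in `a`: the data is returned unchanged and silently
kept <- drop_if_na_sketch(dfm, "a")

# Missing values in `b`: capture the warning text instead of printing it
msg <- tryCatch(
  drop_if_na_sketch(dfm, "b"),
  warning = function(w) conditionMessage(w)
)
msg
```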

Extract plot dimensions from habitat data.

Description

Extract plot dimensions from habitat data.

Usage

extract_gridsize(habitats)

extract_plotdim(habitats)

Arguments

habitats

Data frame giving the habitat designation for each 20x20 quadrat.

Value

extract_plotdim(): plotdim (vector of length 2);
extract_gridsize(): gridsize (scalar).

Examples

assert_is_installed("fgeo.x")
habitat <- fgeo.x::habitat
extract_plotdim(habitat)
extract_gridsize(habitat)

Detect and extract matching strings – ignoring case.

Description

Detect and extract matching strings – ignoring case.

Return TRUE in position where name of x is in y; FALSE otherwise.

Usage

extract_insensitive(x, y)

detect_insensitive(x, y)

Arguments

x

A string to be changed to the form it has in y, if a case-insensitive match is found.

y

A string to use as a reference to match x.

Value

detect_*() returns a logical vector and extract_*() returns a character vector.

Examples

x <- c("stemid", "n")
y <- c("StemID", "treeID")
detect_insensitive(x, y)
extract_insensitive(x, y)

vft <- data.frame(TreeID = 1, Status = 1)
extract_insensitive(tolower(names(vft)), names(vft))
extract_insensitive(names(vft), tolower(names(vft)))
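
Case-insensitive matching of this kind can be sketched with tolower() and match(). The functions detect_sketch() and extract_sketch() are hypothetical stand-ins that illustrate the idea; they are not the package's code.

```r
# TRUE where an element of `x` matches any element of `y`, ignoring case
detect_sketch <- function(x, y) {
  tolower(x) %in% tolower(y)
}

# Replace each element of `x` with its case-insensitive match in `y`, if any
extract_sketch <- function(x, y) {
  idx <- match(tolower(x), tolower(y))
  ifelse(is.na(idx), x, y[idx])
}

x <- c("stemid", "n")
y <- c("StemID", "treeID")

detected <- detect_sketch(x, y)
extracted <- extract_sketch(x, y)
detected
extracted
```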

Create elevation data.

Description

This function constructs an object of class "fgeo_elevation". It standardizes the structure of elevation data to always output a dataframe with names gx, gy and elev.

Usage

fgeo_elevation(elev)

Arguments

elev

One of these:

A dataframe containing elevation data, with columns gx, gy, and elev, or x, y, and elev (e.g. fgeo.x::elevation$col).
A ForestGEO-like elevation list with elements xdim and ydim giving plot dimensions, and element col containing a dataframe as described in the previous item (e.g. fgeo.x::elevation).

Value

A dataframe with names x/gx, y/gy and elev.

Acknowledgments

This function was inspired by David Kenfack.

Examples

assert_is_installed("fgeo.x")

# Input: Elevation dataframe
elevation_df <- fgeo.x::elevation$col
fgeo_elevation(elevation_df)

class(elevation_df)
class(fgeo_elevation(elevation_df))

names(elevation_df)
names(fgeo_elevation(elevation_df))

# Input: Elevation list
elevation_ls <- fgeo.x::elevation
fgeo_elevation(elevation_ls)

class(elevation_ls)
class(fgeo_elevation(elevation_ls))

names(elevation_ls)
names(fgeo_elevation(elevation_ls))

Flag if a vector or dataframe-column meets a condition.

Description

This function returns a condition (error, warning, or message) and its first argument, invisibly. It is a generic. If the first input is a vector, it evaluates it directly; if it is a dataframe, it evaluates a given column.

Usage

flag_if(.data, ...)

## Default S3 method:
flag_if(.data, predicate, condition = warning, msg = NULL, ...)

## S3 method for class 'data.frame'
flag_if(.data, name, predicate, condition = warning, msg = NULL, ...)

Arguments

.data

A vector or a dataframe.

...

Other arguments passed to methods.

predicate

A predicate function.

condition

A condition function (e.g. stop(), warning(), rlang::inform()).

msg

String. An optional custom message.

name

String. The name of a column of a dataframe.

Value

A condition (and .data invisibly).

Examples

# WITH VECTORS
dupl <- c(1, 1)
flag_if(dupl, is_duplicated)
# Silent
flag_if(dupl, is_multiple)

mult <- c(1, 2)
flag_if(mult, is_multiple, message, "Custom")
# Silent
flag_if(mult, is_duplicated)

# Both silent
flag_if(c(1, NA), is_multiple)
flag_if(c(1, NA), is_duplicated)

# WITH DATAFRAMES
.df <- data.frame(a = 1:3, b = 1, stringsAsFactors = FALSE)
flag_if(.df, "b", is_multiple)
flag_if(.df, "a", is_multiple)
flag_if(.df, "a", is_multiple, message, "Custom")
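
The pattern of combining a predicate with a condition function can be sketched as follows. The names flag_if_sketch() and has_duplicates() and the default message are hypothetical; this shows only the vector case.

```r
flag_if_sketch <- function(x, predicate, condition = warning,
                           msg = "Flagged values were detected.") {
  # Signal the condition only when the predicate holds
  if (predicate(x)) condition(msg)
  invisible(x)
}

has_duplicates <- function(v) any(duplicated(v))

# The predicate holds: capture the warning text
caught <- tryCatch(
  flag_if_sketch(c(1, 1), has_duplicates),
  warning = function(w) conditionMessage(w)
)

# The predicate does not hold: silent, and the input comes back invisibly
quiet <- flag_if_sketch(c(1, 2), has_duplicates)
caught
```

Passing stop or message as condition changes the severity without touching the predicate.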

Detect and flag based on a predicate applied to a variable by groups.

Description

These functions extend flag_if() and detect_if() to work by groups defined with dplyr::group_by().

Usage

flag_if_group(.data, name, predicate, condition = warn, msg = NULL)

detect_if_group(.data, name, predicate)

Arguments

.data

A dataframe.

name

String. The name of a column of the dataframe.

predicate

A predicate function, e.g. is_multiple().

condition

A condition function, e.g. rlang::inform() or base::stop().

msg

String to customize the returned message.

Value

flag_if_group(): A condition and its first input, invisibly.
detect_if_group(): Logical of length 1.

Examples

tree <- tibble(CensusID = c(1, 2), treeID = c(1, 2))
detect_if_group(tree, "treeID", is_multiple)
flag_if_group(tree, "treeID", is_multiple)

by_censusid <- group_by(tree, CensusID)
detect_if_group(by_censusid, "treeID", is_multiple)
flag_if_group(by_censusid, "treeID", is_multiple)
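
Applying a predicate by groups can be sketched in base R with split(). This is an illustration under the assumption that detection succeeds when the predicate holds in at least one group; the function detect_if_group_sketch(), its by argument, and has_multiple() are hypothetical.

```r
detect_if_group_sketch <- function(df, name, predicate, by = NULL) {
  # Without a grouping column, treat the whole column as one group
  groups <- if (is.null(by)) list(df[[name]]) else split(df[[name]], df[[by]])
  any(vapply(groups, predicate, logical(1)))
}

has_multiple <- function(v) length(unique(v)) > 1

tree <- data.frame(CensusID = c(1, 2), treeID = c(1, 2))

# Ungrouped: `treeID` has two different values overall
ungrouped <- detect_if_group_sketch(tree, "treeID", has_multiple)

# Grouped by census: each census has a single `treeID`
grouped <- detect_if_group_sketch(tree, "treeID", has_multiple, by = "CensusID")
c(ungrouped = ungrouped, grouped = grouped)
```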

Functions to get variables from other variables.

Description

These functions wrap their corresponding functions from the CTFS R Package, but these versions are stricter. The main differences are these:

Names use "_" not ".".
The argument gridsize has no default, to force the user to provide it.
If the argument plotdim is missing from the gxgy_*() functions, its value is guessed and reported in a message.

Usage

rowcol_to_index(rowno, colno, gridsize, plotdim)

index_to_rowcol(index, gridsize, plotdim)

gxgy_to_index(gx, gy, gridsize, plotdim)

gxgy_to_lxly(gx, gy, gridsize, plotdim)

gxgy_to_qxqy(gx, gy, gridsize, plotdim)

gxgy_to_rowcol(gx, gy, gridsize, plotdim)

gxgy_to_hectindex(gx, gy, plotdim)

index_to_gxgy(index, gridsize, plotdim)

Arguments

rowno, colno

Row and column number – as defined in a census plot.

gridsize

The gridsize of the census plot (commonly 20 m).

plotdim

The global dimensions of the census plot (i.e. the maximum possible values of gx and gy).

index

Index number as defined for a census plot.

gx, gy

A number; global x and y position in a census plot.

Details

gxgy_to_qxqy() didn't exist in the original CTFS R Package. Added for consistency.

Value

A vector or dataframe (see examples).

Author(s)

Rick Condit, Suzanne Lao.

Examples

gxgy_to_index(c(0, 400, 990), c(0, 200, 490), gridsize = 20)

gridsize <- 20
plotdim <- c(1000, 500)

x <- gxgy_to_hectindex(1:3, 1:3, plotdim)
x
typeof(x)
is.data.frame(x)
is.vector(x)

x <- gxgy_to_index(1:3, 1:3, gridsize, plotdim)
x
typeof(x)
is.data.frame(x)
is.vector(x)

x <- gxgy_to_lxly(1:3, 1:3, gridsize, plotdim)
x
typeof(x)
is.data.frame(x)
is.vector(x)

x <- gxgy_to_rowcol(1:3, 1:3, gridsize, plotdim)
x
typeof(x)
is.data.frame(x)
is.vector(x)

x <- index_to_rowcol(1:3, gridsize, plotdim)
x
typeof(x)
is.data.frame(x)
is.vector(x)

x <- rowcol_to_index(1:3, 1:3, gridsize, plotdim)
x
typeof(x)
is.data.frame(x)
is.vector(x)

index_to_gxgy(1:3, gridsize, plotdim)
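
The conversion between coordinates, rows and columns, and quadrat index can be sketched in base R. This is a simplified sketch under assumed conventions: rows and columns count from 1, the index runs up each plot-column before moving to the next, and positions outside the plot give NA. The function names ending in _sketch are hypothetical.

```r
gxgy_to_index_sketch <- function(gx, gy, gridsize, plotdim) {
  out_of_plot <- gx < 0 | gy < 0 | gx >= plotdim[1] | gy >= plotdim[2]
  col <- floor(gx / gridsize) + 1
  row <- floor(gy / gridsize) + 1
  rows_per_col <- floor(plotdim[2] / gridsize)
  index <- (col - 1) * rows_per_col + row
  index[out_of_plot] <- NA
  index
}

index_to_rowcol_sketch <- function(index, gridsize, plotdim) {
  rows_per_col <- floor(plotdim[2] / gridsize)
  data.frame(
    row = (index - 1) %% rows_per_col + 1,
    col = (index - 1) %/% rows_per_col + 1
  )
}

gridsize <- 20
plotdim <- c(1000, 500)

index <- gxgy_to_index_sketch(c(0, 400, 990), c(0, 200, 490), gridsize, plotdim)
rowcol <- index_to_rowcol_sketch(index, gridsize, plotdim)
index
rowcol
```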

Guess plot dimensions.

Description

Guess plot dimensions.

Usage

guess_plotdim(x, accuracy = 20)

Arguments

x

A ForestGEO-like dataframe: A ViewFullTable, tree or stem table.

accuracy

A number giving the accuracy with which to round gx and gy.

Value

A numeric vector of length 2.

Examples

x <- data.frame(
  gx = c(0, 300, 979),
  gy = c(0, 300, 481)
)
guess_plotdim(x)
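
One plausible way to guess plot dimensions is to round the largest gx and gy up to the nearest multiple of accuracy. The sketch below assumes that rule; the helper round_up_sketch() is hypothetical.

```r
# Round `x` up to the nearest multiple of `accuracy`
round_up_sketch <- function(x, accuracy) {
  ceiling(x / accuracy) * accuracy
}

x <- data.frame(
  gx = c(0, 300, 979),
  gy = c(0, 300, 481)
)

plotdim_guess <- c(
  round_up_sketch(max(x$gx, na.rm = TRUE), accuracy = 20),
  round_up_sketch(max(x$gy, na.rm = TRUE), accuracy = 20)
)
plotdim_guess
```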

Predicates to detect and flag duplicated and multiple values of a variable.

Description

is_multiple() and is_duplicated() return TRUE if they detect, respectively, multiple different values of a variable (e.g. c(1, 2)), or duplicated values of a variable (e.g. c(1, 1)).

Usage

is_multiple(.data)

is_duplicated(.data)

Arguments

.data

A vector.

Value

Logical.

Examples

is_multiple(c(1, 2))
is_multiple(c(1, 1))
is_multiple(c(1, NA))

is_duplicated(c(1, 2))
is_duplicated(c(1, 1))
is_duplicated(c(1, NA))
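
Predicates with this behaviour can be sketched in base R. The sketch assumes that missing values are ignored when counting distinct values, which is consistent with flag_if(c(1, NA), is_multiple) being silent. The names is_multiple_sketch() and is_duplicated_sketch() are hypothetical.

```r
# TRUE if there is more than one distinct non-missing value
is_multiple_sketch <- function(x) {
  length(unique(stats::na.omit(x))) > 1
}

# TRUE if any value appears more than once
is_duplicated_sketch <- function(x) {
  any(duplicated(x))
}

multiple <- c(
  is_multiple_sketch(c(1, 2)),
  is_multiple_sketch(c(1, 1)),
  is_multiple_sketch(c(1, NA))
)
duplicated_values <- c(
  is_duplicated_sketch(c(1, 2)),
  is_duplicated_sketch(c(1, 1)),
  is_duplicated_sketch(c(1, NA))
)
multiple
duplicated_values
```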

Try to rename an object.

Description

Given a name you want and a possible alternative, this function renames an object as you want or errs with an informative message.

Usage

nms_try_rename(x, want, try)

Arguments

x

A named object.

want

String of length 1 giving the name you want the object to have.

try

String of length 1 giving the name the object might have.

Examples

nms_try_rename(c(a = 1), "A", "a")
nms_try_rename(data.frame(a = 1), "A", "a")

# Passes
nms_try_rename(c(a = 1, 1), "A", "a")
## Not run: 
# Errs
# nms_try_rename(1, "A", "A")

## End(Not run)

Pick and drop rows from ViewFullTable, tree, and stem tables.

Description

These functions provide an expressive and convenient way to pick specific rows from ForestGEO datasets. They allow you to remove missing values (with na.rm = TRUE) but conservatively default to preserving them. This behavior differs from both base::subset() and dplyr::filter(), which drop rows where the condition evaluates to NA. This conservative default is important because you may want to include missing trees in your analysis.

Usage

pick_dbh_min(data, value, na.rm = FALSE)

pick_dbh_max(data, value, na.rm = FALSE)

pick_dbh_under(data, value, na.rm = FALSE)

pick_dbh_over(data, value, na.rm = FALSE)

pick_status(data, value, na.rm = FALSE)

drop_status(data, value, na.rm = FALSE)

Arguments

data

A ForestGEO-like dataframe: A ViewFullTable, tree or stem table.

value

An atomic vector; a single value against which to compare each value of the variable encoded in the function's name.

na.rm

Set to TRUE if you want to remove missing values from the variable encoded in the function's name.

Value

A dataframe similar to data but including only the rows with matching conditions.

Examples

# styler: off
census <- tribble(
  ~dbh, ~status,
     0,     "A",
    50,     "A",
   100,     "A",
   150,     "A",
    NA,     "M",
    NA,     "D",
    NA,      NA
  )
# styler: on

# <=
pick_dbh_max(census, 100)
pick_dbh_max(census, 100, na.rm = TRUE)

# >=
pick_dbh_min(census, 100)
pick_dbh_min(census, 100, na.rm = TRUE)

# <
pick_dbh_under(census, 100)
pick_dbh_under(census, 100, na.rm = TRUE)

# >
pick_dbh_over(census, 100)
pick_dbh_over(census, 100, na.rm = TRUE)
# Same, but `subset()` does not let you keep NAs.
subset(census, dbh > 100)

# ==
pick_status(census, "A")
pick_status(census, "A", na.rm = TRUE)

# !=
drop_status(census, "D")
drop_status(census, "D", na.rm = TRUE)

# Compose
pick_dbh_over(
  drop_status(census, "D", na.rm = TRUE),
  100
)

# More readable as a pipeline
census %>%
  drop_status("D", na.rm = TRUE) %>%
  pick_dbh_over(100)

# Also works with ViewFullTables
# styler: off
vft <- tribble(
  ~DBH,   ~Status,
     0,   "alive",
    50,   "alive",
   100,   "alive",
   150,   "alive",
    NA, "missing",
    NA,    "dead",
    NA,        NA
)
# styler: on

pick_dbh_max(vft, 100)

pick_status(vft, "alive", na.rm = TRUE)
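
The effect of na.rm on row picking can be sketched in base R. This is an illustration of the idea only: a comparison with a missing dbh yields NA, and the sketch either keeps or drops those rows. The function pick_over_sketch() is hypothetical.

```r
pick_over_sketch <- function(data, value, na.rm = FALSE) {
  keep <- data$dbh > value  # NA where `dbh` is missing
  if (na.rm) {
    keep <- keep & !is.na(keep)  # treat NA as "do not keep"
  } else {
    keep <- keep | is.na(keep)   # treat NA as "keep"
  }
  data[keep, , drop = FALSE]
}

census <- data.frame(
  dbh = c(0, 50, 100, 150, NA, NA, NA),
  status = c("A", "A", "A", "A", "M", "D", NA),
  stringsAsFactors = FALSE
)

with_na <- pick_over_sketch(census, 100)
without_na <- pick_over_sketch(census, 100, na.rm = TRUE)
nrow(with_na)
nrow(without_na)
```

The default keeps the three rows with missing dbh alongside the one row above the threshold, whereas subset(census, dbh > 100) returns only that one row.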

Pick the main stem or main stemid(s) of each tree in each census.

Description

pick_main_stem() picks a unique row for each treeID per census.
pick_main_stemid() picks a unique row for each stemID per census. It is only useful when a single stem was measured twice in the same census, which sometimes happens to correct for the effect of large buttresses.

Usage

pick_main_stem(data)

pick_main_stemid(data)

Arguments

data

A ForestGEO-like dataframe: A ViewFullTable, tree or stem table.

Details

pick_main_stem() picks the main stem of each tree in each census. It collapses data of multi-stem trees by picking a single stem per treeid per censusid. From this group, it picks the stem at the top of a list sorted first by descending order of hom and then by descending order of dbh. This corrects the effect of buttresses and picks the main stem. It ignores groups of grouped data and rejects data with multiple plots.
pick_main_stemid() does one step less than pick_main_stem(). It only picks the main stemid(s) of each tree in each census and keeps all stems per treeid. This is useful when calculating the total basal area of a tree, because you need to sum the basal area of each individual stem as well as sum only one of the potentially multiple measurements of each buttressed stem per census.

Value

A dataframe with a single plotname, and one row per treeid per censusid.

Warning

These functions may be quite slow. They are fastest if the data already has a single stem per treeid. They are slower with data containing multiple stems per treeid (per censusid), which is the main reason for using these functions. The slowest scenario is when data also contains duplicated values of stemid per treeid (per censusid). This may happen if trees have buttresses, in which case these functions check every stem for potential duplicates and pick the one with the largest hom value.

For example, on a Windows computer with 32 GB of RAM, a dataset with 2 million rows with multiple stems and buttresses took about 3 minutes to run, and a dataset with 2 million rows made up entirely of main stems took about ten seconds to run.

Examples

# One `treeID` with multiple stems.
# `stemID == 1.1` has two measurements (due to buttresses).
# `stemID == 1.2` has a single measurement.
# styler: off
census <- tribble(
    ~sp, ~treeID, ~stemID,  ~hom, ~dbh, ~CensusID,
  "sp1",     "1",   "1.1",   140,   40,         1,  # main stemID (max `hom`)
  "sp1",     "1",   "1.1",   130,   60,         1,
  "sp1",     "1",   "1.2",   130,   55,         1   # main stemID (only one)
)
# styler: on

# Picks a unique row per unique `treeID`
pick_main_stem(census)

# Picks a unique row per unique `stemID`
pick_main_stemid(census)
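
The sorting rule described in Details can be sketched in base R: sort by descending hom, then by descending dbh, and keep the first row of each group. This is a simplified sketch, not the package's implementation; it skips the checks on plot names and grouping.

```r
census <- data.frame(
  treeID = c("1", "1", "1"),
  stemID = c("1.1", "1.1", "1.2"),
  hom = c(140, 130, 130),
  dbh = c(40, 60, 55),
  CensusID = c(1, 1, 1),
  stringsAsFactors = FALSE
)

# Largest `hom` first; ties broken by largest `dbh`
sorted <- census[order(-census$hom, -census$dbh), ]

# One row per stem per census (drops repeated measurements of a stem)
main_stemid <- sorted[!duplicated(sorted[c("CensusID", "treeID", "stemID")]), ]

# One row per tree per census (the main stem)
main_stem <- sorted[!duplicated(sorted[c("CensusID", "treeID")]), ]

main_stemid
main_stem
```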

Import ViewFullTable or ViewTaxonomy data from a .tsv or .csv file.

Description

read_vft() and read_taxa() help you to read ViewFullTable and ViewTaxonomy data from text files delivered by the ForestGEO database. These functions avoid common problems about column separators, missing values, column names, and column types.

Usage

read_vft(file, delim = NULL, na = c("", "NA", "NULL"), ...)

read_taxa(file, delim = NULL, na = c("", "NA", "NULL"), ...)

Arguments

file

A path to a file.

delim

Single character used to separate fields within a record. The default (delim = NULL) is to guess between comma or tab ("," or "\t").

na

Character vector of strings to interpret as missing values. Set this option to character() to indicate no missing values.

...

Other arguments passed to readr::read_delim().

Value

A tibble.

Acknowledgments

Thanks to Shameema Jafferjee Esufali for inspiring the feature that automatically detects delim (issue 65).

Examples

assert_is_installed("fgeo.x")
library(fgeo.x)

example_path()

file_vft <- example_path("view/vft_4quad.csv")
read_vft(file_vft)

file_taxa <- example_path("view/taxa.csv")
read_taxa(file_taxa)
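
Guessing between a comma and a tab can be sketched by counting each character in the header line. This is only one plausible approach and is not claimed to be how read_vft() detects delim; the function guess_delim_sketch() is hypothetical, and utils::read.table() stands in for the readr reader.

```r
guess_delim_sketch <- function(file) {
  header <- readLines(file, n = 1)
  count <- function(char) {
    lengths(regmatches(header, gregexpr(char, header, fixed = TRUE)))
  }
  if (count("\t") > count(",")) "\t" else ","
}

# A small tab-separated file with "NULL" standing for a missing value
tsv <- tempfile(fileext = ".tsv")
writeLines(c("TreeID\tDBH", "1\tNULL"), tsv)

delim <- guess_delim_sketch(tsv)
dat <- utils::read.table(
  tsv,
  sep = delim,
  header = TRUE,
  na.strings = c("", "NA", "NULL")
)
dat
```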

Recode subquadrat.

Description

Recode subquadrat.

Usage

recode_subquad(data, offset = -1)

Arguments

data

A dataframe with the variable subquadrat.

offset

A number; either -1 or 1, to subtract or add one unit, respectively, to the column number of each subquadrat.

First column is 0    First column is 1
-----------------    -----------------
   04 14 24 34          14 24 34 44
   03 13 23 33          13 23 33 43
   02 12 22 32          12 22 32 42
   01 11 21 31          11 21 31 41

Value

A modified version of the input.

Examples

first_subquad_11 <- tibble(subquadrat = c("11", "12", "22"))
first_subquad_11

first_subquad_01 <- recode_subquad(first_subquad_11, offset = -1)
first_subquad_01

first_subquad_11 <- recode_subquad(first_subquad_01, offset = 1)
first_subquad_11
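
The recoding can be sketched in base R by shifting only the first digit of each label. The sketch assumes two-character labels whose first digit is the column; the function recode_sketch() is hypothetical.

```r
recode_sketch <- function(subquadrat, offset = -1) {
  col <- as.integer(substr(subquadrat, 1, 1)) + offset  # shift the column digit
  row <- substr(subquadrat, 2, 2)                       # keep the row digit
  paste0(col, row)
}

first_is_11 <- c("11", "12", "22")
first_is_01 <- recode_sketch(first_is_11, offset = -1)
round_trip <- recode_sketch(first_is_01, offset = 1)
first_is_01
round_trip
```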

Objects exported from other packages

Description

These objects are imported from other packages. Follow the links below to see their documentation.

dplyr: add_count, arrange, count, filter, group_by, mutate, select, summarise, summarize, ungroup
rlang: %||%
tibble: as_tibble, tibble, tribble
tidyselect: contains, ends_with, everything, last_col, matches, num_range, one_of, starts_with

Rename an object based on case-insensitive match of the names of a reference.

Description

Rename an object based on case-insensitive match of the names of a reference.

Usage

rename_matches(x, y)

Arguments

x

Object whose names are to be restored if they match the reference.

y

Named object to use as reference.

Value

The output is x with as many names changed as case-insensitive matches there are with the reference.

Examples

ref <- data.frame(COL1 = 1, COL2 = 1)
x <- data.frame(col1 = 5, col2 = 1, n = 5)
rename_matches(x, ref)

Fix common problems in ViewFullTable and ViewTaxonomy data.

Description

These functions fix common problems of ViewFullTable and ViewTaxonomy data:

Ensure that each column has the correct type.
Ensure that missing values are represented with NAs – not with the literal string "NULL".

Usage

sanitize_vft(.data, na = c("", "NA", "NULL"), ...)

sanitize_taxa(.data, na = c("", "NA", "NULL"), ...)

Arguments

.data

A dataframe; either a ForestGEO ViewFullTable (sanitize_vft()) or ViewTaxonomy (sanitize_taxa()).

na

Character vector of strings to interpret as missing values. Set this option to character() to indicate no missing values.

...

Arguments passed to readr::type_convert().

Value

A dataframe.

Acknowledgments

Thanks to Shameema Jafferjee Esufali for motivating these functions.

Examples

assert_is_installed("fgeo.x")

vft <- fgeo.x::vft_4quad

# Introduce problems to show how to fix them
# Bad column types
vft[] <- lapply(vft, as.character)
# Bad representation of missing values
vft$PlotName <- "NULL"

# "NULL" should be replaced by `NA` and `DBH` should be numeric
str(vft[c("PlotName", "DBH")])

# Fix
vft_sane <- sanitize_vft(vft)
str(vft_sane[c("PlotName", "DBH")])

taxa <- read.csv(fgeo.x::example_path("taxa.csv"))
# E.g. inserting bad column types
taxa[] <- lapply(taxa, as.character)
# E.g. inserting bad representation of missing values
taxa$SubspeciesID <- "NULL"

# "NULL" should be replaced by `NA` and `ViewID` should be integer
str(taxa[c("SubspeciesID", "ViewID")])

# Fix
taxa_sane <- sanitize_taxa(taxa)
str(taxa_sane[c("SubspeciesID", "ViewID")])
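
The two fixes (missing-value strings and column types) can be sketched in base R. The package documents readr::type_convert() for this job; the sketch below swaps in utils::type.convert() so that it needs no extra packages, and lets R guess the types instead of using a known specification. The function sanitize_sketch() is hypothetical.

```r
sanitize_sketch <- function(df, na = c("", "NA", "NULL")) {
  df[] <- lapply(df, function(col) {
    col <- as.character(col)
    col[col %in% na] <- NA                   # "NULL" and friends become NA
    utils::type.convert(col, as.is = TRUE)   # then guess the column type
  })
  df
}

vft <- data.frame(
  PlotName = c("NULL", "NULL"),
  DBH = c("10.5", "NULL"),
  stringsAsFactors = FALSE
)

vft_sane <- sanitize_sketch(vft)
str(vft_sane)
```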

Tidy eval helpers

Description

This page lists the tidy eval tools reexported in this package from rlang. To learn about using tidy eval in scripts and packages at a high level, see the dplyr programming vignette and the ggplot2 in packages vignette. The Metaprogramming section of Advanced R may also be useful for a deeper dive.

The tidy eval operators {{, !!, and !!! are syntactic constructs which are specially interpreted by tidy eval functions. You will mostly need {{, as !! and !!! are more advanced operators which you should not have to use in simple cases.

The curly-curly operator {{ allows you to tunnel data-variables passed from function arguments inside other tidy eval functions. {{ is designed for individual arguments. To pass multiple arguments contained in dots, use ... in the normal way.
```
my_function <- function(data, var, ...) {
  data %>%
    group_by(...) %>%
    summarise(mean = mean({{ var }}))
}
```
rlang::enquo() and rlang::enquos() delay the execution of one or several function arguments. The former returns a single expression, the latter returns a list of expressions. Once defused, expressions will no longer evaluate on their own. They must be injected back into an evaluation context with !! (for a single expression) and !!! (for a list of expressions).
```
my_function <- function(data, var, ...) {
  # Defuse
  var <- enquo(var)
  dots <- enquos(...)

  # Inject
  data %>%
    group_by(!!!dots) %>%
    summarise(mean = mean(!!var))
}
```
In this simple case, the code is equivalent to the usage of {{ and ... above. Defusing with enquo() or enquos() is only needed in more complex cases, for instance if you need to inspect or modify the expressions in some way.
The .data pronoun is an object that represents the current slice of data. If you have a variable name in a string, use the .data pronoun to subset that variable with [[.
```
my_var <- "disp"
mtcars %>% summarise(mean = mean(.data[[my_var]]))
```

Another tidy eval operator is :=. It makes it possible to use glue and curly-curly syntax on the LHS of =. For technical reasons, the R language doesn't support complex expressions on the left of =, so we use := as a workaround.

my_function <- function(data, var, suffix = "foo") {
  # Use `{{` to tunnel function arguments and the usual glue
  # operator `{` to interpolate plain strings.
  data %>%
    summarise("{{ var }}_mean_{suffix}" := mean({{ var }}))
}

Many tidy eval functions like dplyr::mutate() or dplyr::summarise() give an automatic name to unnamed inputs. If you need to create the same sort of automatic names by yourself, use as_label(). For instance, the glue-tunnelling syntax above can be reproduced manually with:
```
my_function <- function(data, var, suffix = "foo") {
  var <- enquo(var)
  prefix <- as_label(var)
  data %>%
    summarise("{prefix}_mean_{suffix}" := mean(!!var))
}
```
Expressions defused with enquo() (or tunnelled with {{) need not be simple column names, they can be arbitrarily complex. as_label() handles those cases gracefully. If your code assumes a simple column name, use as_name() instead. This is safer because it throws an error if the input is not a name as expected.

Ensure the specific columns of a dataframe have a particular type.

Description

Ensure the specific columns of a dataframe have a particular type.

Usage

type_ensure(df, ensure_nms, type = "numeric")

Arguments

df

A dataframe.

ensure_nms

Character vector giving names of df to ensure type

type

A string giving the type to ensure in columns ensure_nms

Value

A modified version of df, with columns (specified in ensure_nms) of type type.

Examples

dfm <- tibble(
  w = c(NA, 1, 2),
  x = 1:3,
  y = as.character(1:3),
  z = letters[1:3]
)
dfm
type_ensure(dfm, c("w", "x", "y"), "numeric")
type_ensure(dfm, c("w", "x", "y", "z"), "character")

Help to read ForestGEO data safely, with consistent columns type.

Description

A common cause of problems is feeding functions with data whose columns are not all of the expected type. The problem often begins when reading data from a text file with functions such as utils::read.csv(), utils::read.delim(), and friends, which often guess a column type different from the one you expect. These common offenders are strongly discouraged; instead consider using readr::read_csv(), readr::read_tsv(), and friends, which guess column types correctly much more often than their analogs from the utils package.

type_vft() and type_taxa() help you to read data more safely by explicitly specifying what type to expect from each column of known datasets. These functions output the specification of column types used internally by read_vft() and read_taxa():

type_vft(): Type specification for ViewFullTable.
type_taxa(): Type specification for ViewTaxonomy.

Usage

type_vft()

type_taxa()

Details

Types reference (for more details see readr::read_delim()):

c = character,
i = integer,
n = number,
d = double,
l = logical,
D = date,
T = date time,
t = time,
? = guess,
or _/- to skip the column.

Value

A list.

Examples

assert_is_installed("fgeo.x")
library(fgeo.x)
library(readr)

str(type_vft())

read_csv(example_path("view/vft_4quad.csv"), col_types = type_vft())

str(type_taxa())

read_csv(example_path("view/taxa.csv"), col_types = type_taxa())
