Title: Import and Manipulate 'ForestGEO' Data
Version: 1.2.10
Description: To help you access, transform, analyze, and visualize 'ForestGEO' data, we developed a collection of R packages (https://forestgeo.github.io/fgeo/). This package, in particular, helps you to easily import, filter, and modify 'ForestGEO' data. To learn more about 'ForestGEO' visit https://forestgeo.si.edu/.
License: GPL-3
URL: https://forestgeo.github.io/fgeo.tool/, https://github.com/forestgeo/fgeo.tool
BugReports: https://github.com/forestgeo/fgeo.tool/issues
Depends: R (≥ 3.2)
Imports: dplyr (≥ 0.8.0.1), glue (≥ 1.3.1), magrittr (≥ 1.5), purrr (≥ 0.3.2), readr (≥ 1.3.1), rlang (≥ 0.4.11), tibble (≥ 2.1.1), tidyselect (≥ 0.2.5)
Suggests: covr (≥ 3.2.1), fgeo.x (≥ 1.1.3), knitr (≥ 1.22), roxygen2 (≥ 6.1.1), spelling (≥ 2.1), stringr (≥ 1.4.0), testthat (≥ 2.1.1), tidyr (≥ 0.8.3)
Encoding: UTF-8
Language: en-US
RoxygenNote: 7.3.2
NeedsCompilation: no
Packaged: 2025-04-03 17:11:30 UTC; rstudio
Author: Mauro Lepore ORCID iD [aut, ctr, cre], Richard Condit [aut], Suzanne Lao [aut], Anudeep Singh [aut], CTFS-ForestGEO [cph, fnd]
Maintainer: Mauro Lepore <maurolepore@gmail.com>
Repository: CRAN
Date/Publication: 2025-04-03 17:30:02 UTC

fgeo.tool: Import and Manipulate 'ForestGEO' Data

Description

To help you access, transform, analyze, and visualize 'ForestGEO' data, we developed a collection of R packages (https://forestgeo.github.io/fgeo/). This package, in particular, helps you to easily import, filter, and modify 'ForestGEO' data. To learn more about 'ForestGEO' visit https://forestgeo.si.edu/.

Author(s)

Maintainer: Mauro Lepore maurolepore@gmail.com (ORCID) [contractor]

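Authors:

  • Richard Condit

  • Suzanne Lao

  • Anudeep Singh

Other contributors:

  • CTFS-ForestGEO [copyright holder, funder]

See Also

Useful links:

  • https://forestgeo.github.io/fgeo.tool/

  • https://github.com/forestgeo/fgeo.tool

  • Report bugs at https://github.com/forestgeo/fgeo.tool/issues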


Pipe operator

Description

See magrittr::%>% for details.

Usage

lhs %>% rhs

Add column status_tree based on the status of all stems of each tree.

Description

Add column status_tree based on the status of all stems of each tree.

Usage

add_status_tree(data, status_a = "A", status_d = "D")

Arguments

data

A ForestGEO-like dataframe: A ViewFullTable, tree or stem table.

status_a, status_d

String to match alive and dead stems; it corresponds to the values of the variable status (in census tables) or Status (with capital "S" in ViewFull tables).

Value

The input data set with the additional variable status_tree.
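
The rule behind status_tree can be sketched in base R. This is a minimal illustration rather than the package's implementation, and it assumes that a tree counts as dead only when every one of its stems in a census is dead. The data and the intermediate name all_dead are hypothetical.

```r
# Hypothetical stem table: tree 1 has one living stem, tree 2 has none.
stem <- data.frame(
  CensusID = c(1, 1, 1, 1),
  treeID   = c(1, 1, 2, 2),
  stemID   = c(1, 2, 3, 4),
  status   = c("A", "D", "D", "D"),
  stringsAsFactors = FALSE
)

# Assumption: a tree is dead only if all of its stems are dead.
# `ave()` applies `all()` within each census-by-tree group.
all_dead <- ave(stem$status == "D", stem$CensusID, stem$treeID, FUN = all)
stem$status_tree <- ifelse(all_dead, "D", "A")
stem
```

Under this assumption, both stems of tree 1 get status_tree "A" because one of them is alive.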

See Also

Other functions to add columns to dataframes: add_subquad(), add_var()

Other functions for ForestGEO data: add_subquad(), add_var()

Other functions for fgeo census: add_var(), guess_plotdim(), pick_drop

Other functions for fgeo vft: add_subquad(), add_var(), guess_plotdim(), pick_drop

Examples

# styler: off
stem <- tribble(
  ~CensusID, ~treeID, ~stemID, ~status,
          1,       1,       1,     "A",
          1,       1,       2,     "D",

          1,       2,       3,     "D",
          1,       2,       4,     "D",



          2,       1,       1,     "A",
          2,       1,       2,     "G",

          2,       2,       3,     "D",
          2,       2,       4,     "G"
)
# styler: on

add_status_tree(stem)


Add column subquadrat based on QX and QY coordinates.

Description

Add column subquadrat based on QX and QY coordinates.

Usage

add_subquad(data, x_q, y_q = x_q, x_sq, y_sq = x_sq, subquad_offset = NULL)

Arguments

data

A dataframe with quadrat coordinates QX and QY (e.g. a ViewFullTable).

x_q, y_q

Size in meters of a quadrat's side. For ForestGEO sites, a common value is 20.

x_sq, y_sq

Size in meters of a subquadrat's side. For ForestGEO sites, a common value is 5.

subquad_offset

Either -1 or 1, to subtract or add one unit to the digit of each subquadrat that represents the column number.

First column is 0    First column is 1
-----------------    -----------------
   04 14 24 34          14 24 34 44
   03 13 23 33          13 23 33 43
   02 12 22 32          12 22 32 42
   01 11 21 31          11 21 31 41

Value

Returns data with the additional variable subquadrat.
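
How a subquadrat label may be derived from QX and QY can be sketched in base R. This is an illustration under stated assumptions, not the package's code: the first digit is taken to be the column and the second the row, both counted from 1, and points lying exactly on the upper border of a quadrat are not handled.

```r
# Quadrat coordinates in meters, within a 20 m x 20 m quadrat.
vft <- data.frame(QX = c(17.9, 4.1, 6.1), QY = c(0, 15, 17.3))

x_sq <- 5  # subquadrat side along x, in meters
y_sq <- 5  # subquadrat side along y, in meters

# Assumption: column digit first, row digit second, both starting at 1.
col_digit <- floor(vft$QX / x_sq) + 1
row_digit <- floor(vft$QY / y_sq) + 1
vft$subquadrat <- paste0(col_digit, row_digit)

# An offset of -1 relabels the columns so that the first column is 0.
vft$subquadrat_offset <- paste0(col_digit - 1, row_digit)
vft
```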

Author(s)

Anudeep Singh and Mauro Lepore.

See Also

Other functions to add columns to dataframes: add_status_tree(), add_var()

Other functions for ForestGEO data: add_status_tree(), add_var()

Other functions for fgeo vft: add_status_tree(), add_var(), guess_plotdim(), pick_drop

Examples

# styler: off
vft <- tribble(
   ~QX,  ~QY,
  17.9,    0,
   4.1,   15,
   6.1, 17.3,
   3.8,  5.9,
   4.5, 12.4,
   4.9,  9.3,
   9.8,  3.2,
  18.6,  1.1,
  17.3,  4.1,
   1.5, 16.3
)
# styler: on

add_subquad(vft, 20, 20, 5, 5)

add_subquad(vft, 20, 20, 5, 5, subquad_offset = -1)


Add columns lx/ly, QX/QY, index, col/row, hectindex, quad, gx/gy.

Description

These functions add columns to position trees in a forest plot. They work with ViewFullTable, tree and stem tables. From the input table, most functions use only the gx and gy columns (or equivalent columns). The exception is the function add_gxgy() which inputs quadrat information. If your data lacks some important column, an error message will inform you which column is missing.

Usage

add_lxly(data, gridsize = 20, plotdim = NULL)

add_qxqy(data, gridsize = 20, plotdim = NULL)

add_index(data, gridsize = 20, plotdim = NULL)

add_col_row(data, gridsize = 20, plotdim = NULL)

add_hectindex(data, gridsize = 20, plotdim = NULL)

add_quad(data, gridsize = 20, plotdim = NULL, start = NULL, width = 2)

add_gxgy(data, gridsize = 20, start = 0)

Arguments

data

A ForestGEO-like dataframe: A ViewFullTable, tree or stem table.

gridsize

The gridsize of the census plot (commonly 20 m).

plotdim

The global dimensions of the census plot (i.e. the maximum possible values of gx and gy).

start

Defaults to label the first quadrat as "0101". Use 0 to label it as "0000" instead.

width

Number; width to pad the labels of plot-columns and -rows.

Details

These functions are adapted from the CTFS R Package.

Value

For any given var, a function add_var() returns a modified version of the input dataframe, with the additional variable(s) var.
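
The relationship between global coordinates and the added columns can be sketched in base R. This is only an illustration, assuming that plot-columns and plot-rows are counted from 1 and that the quadrat label pastes the zero-padded column and row. It does not reproduce the package's input checks.

```r
x <- data.frame(gx = c(0, 50, 999.9), gy = c(0, 25, 499.95))
gridsize <- 20

# Local coordinates: the position within the quadrat.
x$lx <- x$gx %% gridsize
x$ly <- x$gy %% gridsize

# Plot-column and plot-row of the quadrat, counted from 1 (assumption).
x$col <- as.integer(floor(x$gx / gridsize)) + 1L
x$row <- as.integer(floor(x$gy / gridsize)) + 1L

# Quadrat label: column and row, each zero-padded to `width` digits.
width <- 2
x$quad <- paste0(
  formatC(x$col, width = width, flag = "0"),
  formatC(x$row, width = width, flag = "0")
)
x
```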

See Also

Other functions to add columns to dataframes: add_status_tree(), add_subquad()

Other functions for ForestGEO data: add_status_tree(), add_subquad()

Other functions for fgeo census: add_status_tree(), guess_plotdim(), pick_drop

Other functions for fgeo vft: add_status_tree(), add_subquad(), guess_plotdim(), pick_drop

Examples

# styler: off
x <- tribble(
    ~gx,    ~gy,
      0,      0,
     50,     25,
  999.9, 499.95,
   1000,    500
)
# styler: on

# `gridsize` has a common default; `plotdim` is guessed from the data
add_lxly(x)

gridsize <- 20
plotdim <- c(1000, 500)

add_qxqy(x, gridsize, plotdim)

add_index(x, gridsize, plotdim)

add_hectindex(x, gridsize, plotdim)

add_quad(x, gridsize, plotdim)

add_quad(x, gridsize, plotdim, start = 0)

# `width` gives the number of digits to pad the label of plot-rows and
# plot-columns, e.g. 3 pads plot-rows with three zeros and plot-columns with
# an extra three zeros, resulting in a total of 6 zeros.
add_quad(x, gridsize, plotdim, start = 0, width = 3)

add_col_row(x, gridsize, plotdim)


# From `quadrat` or `QuadratName` --------------------------------------
# styler: off
x <- tribble(
  ~QuadratName,
        "0001",
        "0011",
        "0101",
        "1001"
)
# styler: on

# Output `gx` and `gy` ---------------

add_gxgy(x)

assert_is_installed("fgeo.x")
# Warning: The data may already have `gx` and `gy` columns
gxgy <- add_gxgy(fgeo.x::tree5)
select(gxgy, matches("gx|gy"))

# Output `col` and `row` -------------

# Create columns `col` and `row` from `QuadratName` with `tidyr::separate()`
# The argument `sep` lets you separate `QuadratName` at any position
## Not run: 
tidyr_is_installed <- requireNamespace("tidyr", quietly = TRUE)
stringr_is_installed <- requireNamespace("stringr", quietly = TRUE)

if (tidyr_is_installed && stringr_is_installed) {
  library(tidyr)
  library(stringr)

  vft <- tibble(QuadratName = c("0001", "0011"))
  vft

  separate(
    vft,
    QuadratName,
    into = c("col", "row"),
    sep = 2
  )

  census <- select(fgeo.x::tree5, quadrat)
  census

  census$quadrat <- str_pad(census$quadrat, width = 4, pad = "0")

  separate(
    census,
    quadrat,
    into = c("col", "row"),
    sep = 2,
    remove = FALSE
  )
}

## End(Not run)


Assert a package is installed.

Description

Assert a package is installed.

Usage

assert_is_installed(pkg)

Arguments

pkg

Character vector giving the name of a package.

Value

Invisibly returns pkg if it is installed; otherwise, throws an error.
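
A check of this kind can be sketched with base::requireNamespace(). The helper name assert_installed_sketch and the error message are hypothetical and are not the package's own.

```r
# Hypothetical helper; a sketch, not the package's implementation.
assert_installed_sketch <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Please install the '", pkg, "' package.", call. = FALSE)
  }
  invisible(pkg)
}

# Passes silently and returns its input invisibly.
assert_installed_sketch("stats")

# Fails for a package that is not installed.
result <- try(assert_installed_sketch("aPackageThatDoesNotExist"), silent = TRUE)
inherits(result, "try-error")
```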

Examples

assert_is_installed("base")
## Not run: 
try(assert_is_installed("bad"))

## End(Not run)

Check if an object contains specific names.

Description

Check if an object contains specific names.

Usage

check_crucial_names(x, nms)

Arguments

x

A named object.

nms

String; names expected to be found in x.

Value

Invisible x, or an error with informative message.
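
The idea can be sketched in base R by comparing the expected names with names(x). The helper name check_names_sketch and its message are hypothetical.

```r
# Hypothetical helper; a sketch, not the package's implementation.
check_names_sketch <- function(x, nms) {
  missing_nms <- setdiff(nms, names(x))
  if (length(missing_nms) > 0) {
    stop(
      "Ensure your data has names: ", paste(missing_nms, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(x)
}

dfm <- data.frame(x = 1)
check_names_sketch(dfm, "x")

# A missing name triggers an error that says which name is missing.
failed <- try(check_names_sketch(dfm, c("x", "y")), silent = TRUE)
inherits(failed, "try-error")
```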

See Also

Other functions to check inputs: flag_if_group(), is_multiple()

Other functions for developers: extract_insensitive(), flag_if_group(), is_multiple(), nms_try_rename(), rename_matches(), type_ensure()

Examples

v <- c(x = 1)
check_crucial_names(v, "x")

dfm <- data.frame(x = 1)
check_crucial_names(dfm, "x")

Drop if missing values.

Description

Drop rows with missing values in a given column. This function is valuable mostly for its warning.

Usage

drop_if_na(dfm, x)

Arguments

dfm

A dataframe.

x

String giving a column name of dfm.

Value

A dataframe.
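
A base R sketch of the idea follows. It assumes that rows with a missing value in column x are dropped and that a warning is emitted only when at least one row is dropped. The helper name and the wording of the warning are hypothetical.

```r
# Hypothetical helper; a sketch, not the package's implementation.
drop_na_sketch <- function(dfm, x) {
  is_missing <- is.na(dfm[[x]])
  if (any(is_missing)) {
    warning(
      "Dropping ", sum(is_missing), " rows with missing `", x, "` values.",
      call. = FALSE
    )
    dfm <- dfm[!is_missing, , drop = FALSE]
  }
  dfm
}

dfm <- data.frame(a = c(1, 2), b = c(NA, 3))

# Column `b` has one missing value: that row is dropped, with a warning.
kept <- suppressWarnings(drop_na_sketch(dfm, "b"))
kept

# Column `a` has no missing values: the data is returned unchanged.
unchanged <- drop_na_sketch(dfm, "a")
```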

See Also

tidyr::drop_na().

Examples

dfm <- data.frame(a = 1, b = NA)
drop_if_na(dfm, "b")
drop_if_na(dfm, "a")

Extract plot dimensions from habitat data.

Description

Extract plot dimensions from habitat data.

Usage

extract_gridsize(habitats)

extract_plotdim(habitats)

Arguments

habitats

Data frame giving the habitat designation for each 20x20 quadrat.

Value

Examples

assert_is_installed("fgeo.x")
habitat <- fgeo.x::habitat
extract_plotdim(habitat)
extract_gridsize(habitat)

Detect and extract matching strings – ignoring case.

Description

Detect and extract matching strings – ignoring case.

Return TRUE in position where name of x is in y; FALSE otherwise.
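
Case-insensitive detection and extraction can be sketched with base::match() on lowercased strings. This is an illustration of the idea, not the package's implementation.

```r
x <- c("stemid", "n")
y <- c("StemID", "treeID")

# Position in `y` of each element of `x`, ignoring case (NA if no match).
pos <- match(tolower(x), tolower(y))

# Detect: TRUE where an element of `x` has a case-insensitive match in `y`.
detected <- !is.na(pos)
detected

# Extract: use the spelling found in `y` where there is a match.
extracted <- ifelse(detected, y[pos], x)
extracted
```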

Usage

extract_insensitive(x, y)

detect_insensitive(x, y)

Arguments

x

A string to be changed to match y, if a case-insensitive match is found.

y

A string to use as a reference to match x.

Value

detect_*() and extract_*() return a logical vector and a string, respectively.

See Also

Other functions for developers: check_crucial_names(), flag_if_group(), is_multiple(), nms_try_rename(), rename_matches(), type_ensure()

Other general functions to deal with names: rename_matches()

Examples

x <- c("stemid", "n")
y <- c("StemID", "treeID")
detect_insensitive(x, y)
extract_insensitive(x, y)

vft <- data.frame(TreeID = 1, Status = 1)
extract_insensitive(tolower(names(vft)), names(vft))
extract_insensitive(names(vft), tolower(names(vft)))

Create elevation data.

Description

This function constructs an object of class "fgeo_elevation". It standardizes the structure of elevation data to always output a dataframe with names gx, gy and elev.

Usage

fgeo_elevation(elev)

Arguments

elev

One of these:

  • A dataframe containing elevation data, with columns gx, gy, and elev, or x, y, and elev (e.g. fgeo.x::elevation$col).

  • A ForestGEO-like elevation list with elements xdim and ydim giving plot dimensions, and element col containing a dataframe as described in the previous item (e.g. fgeo.x::elevation).

Value

A dataframe with names x/gx, y/gy and elev.
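
The standardization of the two accepted inputs can be sketched in base R. The helper name standardize_elevation and the elevation values are hypothetical, and the sketch does not set the "fgeo_elevation" class.

```r
# Hypothetical helper; a sketch, not the package's implementation.
standardize_elevation <- function(elev) {
  # A ForestGEO-like list keeps the dataframe in the element `col`.
  if (!is.data.frame(elev)) {
    elev <- elev$col
  }
  names(elev)[names(elev) == "x"] <- "gx"
  names(elev)[names(elev) == "y"] <- "gy"
  elev[c("gx", "gy", "elev")]
}

# Made-up elevation data in both accepted shapes.
elev_df <- data.frame(x = c(0, 5), y = c(0, 0), elev = c(120.1, 121.3))
elev_ls <- list(col = elev_df, xdim = 1000, ydim = 500)

names(standardize_elevation(elev_df))
names(standardize_elevation(elev_ls))
```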

Acknowledgments

This function was inspired by David Kenfack.

Examples

assert_is_installed("fgeo.x")

# Input: Elevation dataframe
elevation_df <- fgeo.x::elevation$col
fgeo_elevation(elevation_df)

class(elevation_df)
class(fgeo_elevation(elevation_df))

names(elevation_df)
names(fgeo_elevation(elevation_df))

# Input: Elevation list
elevation_ls <- fgeo.x::elevation
fgeo_elevation(elevation_ls)

class(elevation_ls)
class(fgeo_elevation(elevation_ls))

names(elevation_ls)
names(fgeo_elevation(elevation_ls))

Flag if a vector or dataframe-column meets a condition.

Description

This function returns a condition (error, warning, or message) and its first argument, invisibly. It is a generic. If the first input is a vector, it evaluates it directly; if it is a dataframe, it evaluates a given column.

Usage

flag_if(.data, ...)

## Default S3 method:
flag_if(.data, predicate, condition = warning, msg = NULL, ...)

## S3 method for class 'data.frame'
flag_if(.data, name, predicate, condition = warning, msg = NULL, ...)

Arguments

.data

A vector or dataframe.

...

Other arguments passed to methods.

predicate

A predicate function.

condition

A condition function (e.g. stop(), warning(), rlang::inform()).

msg

String. An optional custom message.

name

String. The name of a column of a dataframe.

Value

A condition (and .data invisibly).
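
The pattern for a vector input can be sketched in base R: apply a predicate, signal a condition when it returns TRUE, and return the input invisibly. The helper names and the default message are hypothetical.

```r
# Hypothetical helper; a sketch, not the package's implementation.
flag_if_sketch <- function(x, predicate, condition = warning, msg = NULL) {
  if (predicate(x)) {
    if (is.null(msg)) {
      msg <- "Flagged values were detected."
    }
    condition(msg)
  }
  invisible(x)
}

# Hypothetical predicate used only for this illustration.
has_duplicates <- function(x) any(duplicated(x))

# The predicate is TRUE, so a warning is signaled; here we capture its message.
caught <- tryCatch(
  flag_if_sketch(c(1, 1), has_duplicates),
  warning = function(w) conditionMessage(w)
)
caught

# The predicate is FALSE, so the call is silent.
out <- flag_if_sketch(c(1, 2), has_duplicates)
```

Passing the condition as a function lets the caller choose between stop(), warning() and message() without changing the check itself.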

See Also

Other functions for internal use in other fgeo packages: guess_plotdim(), is_multiple()

Examples

# WITH VECTORS
dupl <- c(1, 1)
flag_if(dupl, is_duplicated)
# Silent
flag_if(dupl, is_multiple)

mult <- c(1, 2)
flag_if(mult, is_multiple, message, "Custom")
# Silent
flag_if(mult, is_duplicated)

# Both silent
flag_if(c(1, NA), is_multiple)
flag_if(c(1, NA), is_duplicated)

# WITH DATAFRAMES
.df <- data.frame(a = 1:3, b = 1, stringsAsFactors = FALSE)
flag_if(.df, "b", is_multiple)
flag_if(.df, "a", is_multiple)
flag_if(.df, "a", is_multiple, message, "Custom")

Detect and flag based on a predicate applied to a variable by groups.

Description

These functions extend flag_if() and detect_if() to work by groups defined with dplyr::group_by().

Usage

flag_if_group(.data, name, predicate, condition = warn, msg = NULL)

detect_if_group(.data, name, predicate)

Arguments

.data

A dataframe.

name

String. The name of a column of the dataframe.

predicate

A predicate function, e.g. is_multiple().

condition

A condition function, e.g. rlang::inform() or base::stop().

msg

String to customize the returned message.

Value

See Also

Other functions to check inputs: check_crucial_names(), is_multiple()

Other functions for developers: check_crucial_names(), extract_insensitive(), is_multiple(), nms_try_rename(), rename_matches(), type_ensure()

Examples

tree <- tibble(CensusID = c(1, 2), treeID = c(1, 2))
detect_if_group(tree, "treeID", is_multiple)
flag_if_group(tree, "treeID", is_multiple)

by_censusid <- group_by(tree, CensusID)
detect_if_group(by_censusid, "treeID", is_multiple)
flag_if_group(by_censusid, "treeID", is_multiple)

Functions to get variables from other variables.

Description

These functions wrap their corresponding functions from the CTFS R Package, but these versions are stricter. The main differences are these:

Usage

rowcol_to_index(rowno, colno, gridsize, plotdim)

index_to_rowcol(index, gridsize, plotdim)

gxgy_to_index(gx, gy, gridsize, plotdim)

gxgy_to_lxly(gx, gy, gridsize, plotdim)

gxgy_to_qxqy(gx, gy, gridsize, plotdim)

gxgy_to_rowcol(gx, gy, gridsize, plotdim)

gxgy_to_hectindex(gx, gy, plotdim)

index_to_gxgy(index, gridsize, plotdim)

Arguments

rowno, colno

Row and column number – as defined in a census plot.

gridsize

The gridsize of the census plot (commonly 20 m).

plotdim

The global dimensions of the census plot (i.e. the maximum possible values of gx and gy).

index

Index number as defined for a census plot.

gx, gy

A number; global x and y position in a census plot.

Details

gxgy_to_qxqy() didn't exist in the original CTFS R Package. Added for consistency.

Value

A vector or dataframe (see examples).
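
The arithmetic linking gx/gy, row/column and index can be sketched in base R. The sketch assumes that rows and columns are counted from 1 and that quadrats are numbered up each column starting at the plot origin. It omits the input checks that make the package's versions stricter.

```r
gridsize <- 20
plotdim <- c(1000, 500)

gx <- c(0, 400, 990)
gy <- c(0, 200, 490)

# Row and column of the quadrat that contains each point.
rowno <- floor(gy / gridsize) + 1
colno <- floor(gx / gridsize) + 1

# Assumption: quadrats are numbered up each column, starting at the origin.
rows_per_col <- plotdim[2] / gridsize
index <- (colno - 1) * rows_per_col + rowno
index

# The inverse mapping recovers the row and column from the index.
rowno_back <- (index - 1) %% rows_per_col + 1
colno_back <- (index - 1) %/% rows_per_col + 1
```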

Author(s)

Rick Condit, Suzanne Lao.

Examples

gxgy_to_index(c(0, 400, 990), c(0, 200, 490), gridsize = 20)

gridsize <- 20
plotdim <- c(1000, 500)

x <- gxgy_to_hectindex(1:3, 1:3, plotdim)
x
typeof(x)
is.data.frame(x)
is.vector(x)

x <- gxgy_to_index(1:3, 1:3, gridsize, plotdim)
x
typeof(x)
is.data.frame(x)
is.vector(x)

x <- gxgy_to_lxly(1:3, 1:3, gridsize, plotdim)
x
typeof(x)
is.data.frame(x)
is.vector(x)

x <- gxgy_to_rowcol(1:3, 1:3, gridsize, plotdim)
x
typeof(x)
is.data.frame(x)
is.vector(x)

x <- index_to_rowcol(1:3, gridsize, plotdim)
x
typeof(x)
is.data.frame(x)
is.vector(x)

x <- rowcol_to_index(1:3, 1:3, gridsize, plotdim)
x
typeof(x)
is.data.frame(x)
is.vector(x)

index_to_gxgy(1:3, gridsize, plotdim)

Guess plot dimensions.

Description

Guess plot dimensions.

Usage

guess_plotdim(x, accuracy = 20)

Arguments

x

A ForestGEO-like dataframe: A ViewFullTable, tree or stem table.

accuracy

A number giving the accuracy with which to round gx and gy.

Value

A numeric vector of length 2.
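
One way to guess plot dimensions is to round the largest gx and gy up to the nearest multiple of accuracy. The sketch below assumes that rule and is not necessarily what the package does. The helper name round_up is hypothetical.

```r
x <- data.frame(
  gx = c(0, 300, 979),
  gy = c(0, 300, 481)
)
accuracy <- 20

# Assumption: round the maximum up to the nearest multiple of `accuracy`.
round_up <- function(v, accuracy) {
  ceiling(max(v, na.rm = TRUE) / accuracy) * accuracy
}

plotdim_guess <- c(round_up(x$gx, accuracy), round_up(x$gy, accuracy))
plotdim_guess
```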

See Also

Other functions for fgeo census and vft: pick_drop

Other functions for fgeo census: add_status_tree(), add_var(), pick_drop

Other functions for fgeo vft: add_status_tree(), add_subquad(), add_var(), pick_drop

Other functions for internal use in other fgeo packages: flag_if(), is_multiple()

Examples

x <- data.frame(
  gx = c(0, 300, 979),
  gy = c(0, 300, 481)
)
guess_plotdim(x)

Predicates to detect and flag duplicated and multiple values of a variable.

Description

is_multiple() and is_duplicated() return TRUE if they detect, respectively, multiple different values of a variable (e.g. c(1, 2)), or duplicated values of a variable (e.g. c(1, 1)).

Usage

is_multiple(.data)

is_duplicated(.data)

Arguments

.data

A vector.

Value

Logical.
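
Both predicates can be sketched in base R. The sketch assumes that missing values are ignored, and the helper names are hypothetical.

```r
# Hypothetical helpers; a sketch, not the package's implementation.
# Assumption: missing values are ignored by both predicates.
is_multiple_sketch <- function(x) {
  length(unique(x[!is.na(x)])) > 1
}
is_duplicated_sketch <- function(x) {
  any(duplicated(x[!is.na(x)]))
}

is_multiple_sketch(c(1, 2))
is_multiple_sketch(c(1, 1))
is_multiple_sketch(c(1, NA))

is_duplicated_sketch(c(1, 2))
is_duplicated_sketch(c(1, 1))
is_duplicated_sketch(c(1, NA))
```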

See Also

Other functions for internal use in other fgeo packages: flag_if(), guess_plotdim()

Other functions to check inputs: check_crucial_names(), flag_if_group()

Other functions for developers: check_crucial_names(), extract_insensitive(), flag_if_group(), nms_try_rename(), rename_matches(), type_ensure()

Examples

is_multiple(c(1, 2))
is_multiple(c(1, 1))
is_multiple(c(1, NA))

is_duplicated(c(1, 2))
is_duplicated(c(1, 1))
is_duplicated(c(1, NA))

Try to rename an object.

Description

Given a name you want and a possible alternative, this function renames an object as you want or errs with an informative message.

Usage

nms_try_rename(x, want, try)

Arguments

x

A named object.

want

String of length 1 giving the name you want the object to have.

try

String of length 1 giving the name the object might have.

See Also

nms

Other functions for developers: check_crucial_names(), extract_insensitive(), flag_if_group(), is_multiple(), rename_matches(), type_ensure()

Examples

nms_try_rename(c(a = 1), "A", "a")
nms_try_rename(data.frame(a = 1), "A", "a")

# Passes
nms_try_rename(c(a = 1, 1), "A", "a")
## Not run: 
# Errs
# nms_try_rename(1, "A", "A")

## End(Not run)


Pick and drop rows from ViewFullTable, tree, and stem tables.

Description

These functions provide an expressive and convenient way to pick specific rows from ForestGEO datasets. They allow you to remove missing values (with na.rm = TRUE) but conservatively default to preserving them. This behavior is similar to base::subset() and unlike dplyr::filter(). This conservative default is important because you may want to include missing trees in your analysis.

Usage

pick_dbh_min(data, value, na.rm = FALSE)

pick_dbh_max(data, value, na.rm = FALSE)

pick_dbh_under(data, value, na.rm = FALSE)

pick_dbh_over(data, value, na.rm = FALSE)

pick_status(data, value, na.rm = FALSE)

drop_status(data, value, na.rm = FALSE)

Arguments

data

A ForestGEO-like dataframe: A ViewFullTable, tree or stem table.

value

An atomic vector; a single value against which to compare each value of the variable encoded in the function's name.

na.rm

Set to TRUE if you want to remove missing values from the variable encoded in the function's name.

Value

A dataframe similar to data but including only the rows with matching conditions.
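
The effect of keeping or removing missing values can be sketched with base R subsetting. This illustrates the logic only, using made-up data. It is not the package's implementation.

```r
census <- data.frame(
  dbh = c(0, 50, 100, 150, NA),
  status = c("A", "A", "A", "A", "M"),
  stringsAsFactors = FALSE
)

# The comparison is NA where `dbh` is missing.
over <- census$dbh > 100

# Keep rows with missing `dbh` (analogous to `na.rm = FALSE`).
keep_na <- census[over | is.na(census$dbh), ]
keep_na

# Remove rows with missing `dbh` (analogous to `na.rm = TRUE`).
drop_na <- census[over & !is.na(census$dbh), ]
drop_na
```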

See Also

dplyr::filter(), Extract ([).

Other functions for fgeo census and vft: guess_plotdim()

Other functions for fgeo census: add_status_tree(), add_var(), guess_plotdim()

Other functions for fgeo vft: add_status_tree(), add_subquad(), add_var(), guess_plotdim()

Other functions to pick or drop rows of a ForestGEO dataframe: pick_main_stem()

Examples

# styler: off
census <- tribble(
  ~dbh, ~status,
     0,     "A",
    50,     "A",
   100,     "A",
   150,     "A",
    NA,     "M",
    NA,     "D",
    NA,      NA
  )
# styler: on

# <=
pick_dbh_max(census, 100)
pick_dbh_max(census, 100, na.rm = TRUE)

# >=
pick_dbh_min(census, 100)
pick_dbh_min(census, 100, na.rm = TRUE)

# <
pick_dbh_under(census, 100)
pick_dbh_under(census, 100, na.rm = TRUE)

# >
pick_dbh_over(census, 100)
pick_dbh_over(census, 100, na.rm = TRUE)
# Same, but `subset()` does not let you keep NAs.
subset(census, dbh > 100)

# ==
pick_status(census, "A")
pick_status(census, "A", na.rm = TRUE)

# !=
drop_status(census, "D")
drop_status(census, "D", na.rm = TRUE)

# Compose
pick_dbh_over(
  drop_status(census, "D", na.rm = TRUE),
  100
)

# More readable as a pipeline
census %>%
  drop_status("D", na.rm = TRUE) %>%
  pick_dbh_over(100)

# Also works with ViewFullTables
# styler: off
vft <- tribble(
  ~DBH,   ~Status,
     0,   "alive",
    50,   "alive",
   100,   "alive",
   150,   "alive",
    NA, "missing",
    NA,    "dead",
    NA,        NA
)
# styler: on

pick_dbh_max(vft, 100)

pick_status(vft, "alive", na.rm = TRUE)


Pick the main stem or main stemid(s) of each tree in each census.

Description

Usage

pick_main_stem(data)

pick_main_stemid(data)

Arguments

data

A ForestGEO-like dataframe: A ViewFullTable, tree or stem table.

Details

Value

A dataframe with a single plotname, and one row per treeid per censusid.

Warning

These functions may be considerably slow. They are fastest if the data already has a single stem per treeid. They are slower with data containing multiple stems per treeid (per censusid), which is the main reason for using this function. The slowest scenario is when data also contains duplicated values of stemid per treeid (per censusid). This may happen if trees have buttresses, in which case these functions check every stem for potential duplicates and pick the one with the largest hom value.

For example, on a Windows computer with 32 GB of RAM, a dataset with 2 million rows with multiple stems and buttresses took about 3 minutes to run, whereas a dataset with 2 million rows made up entirely of main stems took about ten seconds.
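
Resolving duplicated stemid values by the largest hom can be sketched in base R: sort by descending hom, then keep the first row of each stemID within each census. This is a sketch of that single step on made-up data. It does not reproduce the package's implementation.

```r
# One tree; `stemID` "1.1" was measured twice (e.g. because of buttresses).
census <- data.frame(
  treeID = c("1", "1", "1"),
  stemID = c("1.1", "1.1", "1.2"),
  hom = c(140, 130, 130),
  dbh = c(40, 60, 55),
  CensusID = c(1, 1, 1),
  stringsAsFactors = FALSE
)

# Sort so that, within any group, the largest `hom` comes first.
sorted <- census[order(-census$hom), ]

# Keep the first row of each `stemID` within each census.
is_repeat <- duplicated(sorted[c("CensusID", "stemID")])
main_stemid <- sorted[!is_repeat, ]
main_stemid
```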

See Also

Other functions to pick or drop rows of a ForestGEO dataframe: pick_drop

Examples

# One `treeID` with multiple stems.
# `stemID == 1.1` has two measurements (due to buttresses).
# `stemID == 1.2` has a single measurement.
# styler: off
census <- tribble(
    ~sp, ~treeID, ~stemID,  ~hom, ~dbh, ~CensusID,
  "sp1",     "1",   "1.1",   140,   40,         1,  # main stemID (max `hom`)
  "sp1",     "1",   "1.1",   130,   60,         1,
  "sp1",     "1",   "1.2",   130,   55,         1   # main stemID (only one)
)
# styler: on

# Picks a unique row per unique `treeID`
pick_main_stem(census)

# Picks a unique row per unique `stemID`
pick_main_stemid(census)


Import ViewFullTable or ViewTaxonomy data from a .tsv or .csv file.

Description

read_vft() and read_taxa() help you to read ViewFullTable and ViewTaxonomy data from text files delivered by the ForestGEO database. These functions avoid common problems about column separators, missing values, column names, and column types.

Usage

read_vft(file, delim = NULL, na = c("", "NA", "NULL"), ...)

read_taxa(file, delim = NULL, na = c("", "NA", "NULL"), ...)

Arguments

file

A path to a file.

delim

Single character used to separate fields within a record. The default (delim = NULL) is to guess between comma or tab ("," or "\t").

na

Character vector of strings to interpret as missing values. Set this option to character() to indicate no missing values.

...

Other arguments passed to readr::read_delim().

Value

A tibble.
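
Guessing between comma and tab can be sketched in base R by counting each candidate in the header line. The helper name guess_delim_sketch, the file contents and the counting rule are assumptions made for illustration. The package may detect the delimiter differently.

```r
# Hypothetical helper; a sketch, not the package's implementation.
guess_delim_sketch <- function(file) {
  header <- readLines(file, n = 1)
  count <- function(char) {
    lengths(regmatches(header, gregexpr(char, header, fixed = TRUE)))
  }
  if (count("\t") > count(",")) "\t" else ","
}

# A small tab-separated file with "NULL" standing for a missing value.
tsv <- tempfile(fileext = ".tsv")
writeLines(c("TreeID\tDBH", "1\t10.5", "2\tNULL"), tsv)

delim <- guess_delim_sketch(tsv)
vft <- read.table(
  tsv,
  header = TRUE,
  sep = delim,
  na.strings = c("", "NA", "NULL")
)
vft
```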

Acknowledgments

Thanks to Shameema Jafferjee Esufali for inspiring the feature that automatically detects delim (issue 65).

See Also

readr::read_delim(), type_vft(), type_taxa().

Other functions to read text files delivered by ForestGEO's database: type_vft()

Examples

assert_is_installed("fgeo.x")
library(fgeo.x)

example_path()

file_vft <- example_path("view/vft_4quad.csv")
read_vft(file_vft)

file_taxa <- example_path("view/taxa.csv")
read_taxa(file_taxa)

Recode subquadrat.

Description

Recode subquadrat.

Usage

recode_subquad(data, offset = -1)

Arguments

data

A dataframe with the variable subquadrat.

offset

A number; either -1 or 1, to subtract or add one unit to the column number of each subquadrat.

First column is 0    First column is 1
-----------------    -----------------
   04 14 24 34          14 24 34 44
   03 13 23 33          13 23 33 43
   02 12 22 32          12 22 32 42
   01 11 21 31          11 21 31 41

Value

A modified version of the input.
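
The recoding can be sketched in base R by shifting the first digit of each label, assuming that the first digit encodes the column and that labels have exactly two digits. The helper name recode_sketch is hypothetical.

```r
# Hypothetical helper; a sketch, not the package's implementation.
recode_sketch <- function(subquadrat, offset = -1) {
  # Assumption: the first digit is the column, the second is the row.
  col_digit <- as.integer(substr(subquadrat, 1, 1)) + offset
  row_digit <- substr(subquadrat, 2, 2)
  paste0(col_digit, row_digit)
}

first_11 <- c("11", "12", "22")

# Relabel so that the first column is 0.
first_01 <- recode_sketch(first_11, offset = -1)
first_01

# Relabel back so that the first column is 1.
back <- recode_sketch(first_01, offset = 1)
back
```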

Examples

first_subquad_11 <- tibble(subquadrat = c("11", "12", "22"))
first_subquad_11

first_subquad_01 <- recode_subquad(first_subquad_11, offset = -1)
first_subquad_01

first_subquad_11 <- recode_subquad(first_subquad_01, offset = 1)
first_subquad_11

Objects exported from other packages

Description

These objects are imported from other packages. Follow the links below to see their documentation.

dplyr

add_count, arrange, count, filter, group_by, mutate, select, summarise, summarize, ungroup

rlang

%||%

tibble

as_tibble, tibble, tribble

tidyselect

contains, ends_with, everything, last_col, matches, num_range, one_of, starts_with


Rename an object based on case-insensitive match of the names of a reference.

Description

Rename an object based on case-insensitive match of the names of a reference.

Usage

rename_matches(x, y)

Arguments

x

Object whose names are to be restored if they match the reference.

y

Named object to use as reference.

Value

The output is x with as many names changed as case-insensitive matches there are with the reference.

See Also

Other functions for developers: check_crucial_names(), extract_insensitive(), flag_if_group(), is_multiple(), nms_try_rename(), type_ensure()

Other general functions to deal with names: extract_insensitive()

Examples

ref <- data.frame(COL1 = 1, COL2 = 1)
x <- data.frame(col1 = 5, col2 = 1, n = 5)
rename_matches(x, ref)

Fix common problems in ViewFullTable and ViewTaxonomy data.

Description

These functions fix common problems of ViewFullTable and ViewTaxonomy data:

Usage

sanitize_vft(.data, na = c("", "NA", "NULL"), ...)

sanitize_taxa(.data, na = c("", "NA", "NULL"), ...)

Arguments

.data

A dataframe; either a ForestGEO ViewFullTable (sanitize_vft()) or ViewTaxonomy (sanitize_taxa()).

na

Character vector of strings to interpret as missing values. Set this option to character() to indicate no missing values.

...

Arguments passed to readr::type_convert().

Value

A dataframe.
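
The two kinds of repair, missing-value strings and column types, can be sketched with utils::type.convert(). This base R sketch guesses types from the values, which is an assumption made for illustration. It does not apply the package's column specification, and the data is made up.

```r
# Made-up data where every column is character and "NULL" means missing.
vft <- data.frame(
  PlotName = c("NULL", "NULL"),
  DBH = c("10", "25"),
  stringsAsFactors = FALSE
)

na <- c("", "NA", "NULL")

# Convert each column, treating the strings in `na` as missing values.
vft_sane <- vft
vft_sane[] <- lapply(vft, function(col) {
  utils::type.convert(col, na.strings = na, as.is = TRUE)
})

str(vft_sane)
```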

Acknowledgments

Thanks to Shameema Jafferjee Esufali for motivating these functions.

See Also

read_vft().

Examples

assert_is_installed("fgeo.x")

vft <- fgeo.x::vft_4quad

# Introduce problems to show how to fix them
# Bad column types
vft[] <- lapply(vft, as.character)
# Bad representation of missing values
vft$PlotName <- "NULL"

# "NULL" should be replaced by `NA` and `DBH` should be numeric
str(vft[c("PlotName", "DBH")])

# Fix
vft_sane <- sanitize_vft(vft)
str(vft_sane[c("PlotName", "DBH")])

taxa <- read.csv(fgeo.x::example_path("taxa.csv"))
# E.g. inserting bad column types
taxa[] <- lapply(taxa, as.character)
# E.g. inserting bad representation of missing values
taxa$SubspeciesID <- "NULL"

# "NULL" should be replaced by `NA` and `ViewID` should be integer
str(taxa[c("SubspeciesID", "ViewID")])

# Fix
taxa_sane <- sanitize_taxa(taxa)
str(taxa_sane[c("SubspeciesID", "ViewID")])

Tidy eval helpers

Description

This page lists the tidy eval tools reexported in this package from rlang. To learn about using tidy eval in scripts and packages at a high level, see the dplyr programming vignette and the ggplot2 in packages vignette. The Metaprogramming section of Advanced R may also be useful for a deeper dive.


Ensure the specific columns of a dataframe have a particular type.

Description

Ensure the specific columns of a dataframe have a particular type.

Usage

type_ensure(df, ensure_nms, type = "numeric")

Arguments

df

A dataframe.

ensure_nms

Character vector giving the names of the columns of df whose type to ensure.

type

A string giving the type to ensure in the columns ensure_nms.

Value

A modified version of df, with columns (specified in ensure_nms) of type type.
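
Coercing selected columns can be sketched in base R with lapply(). This sketch handles only the "numeric" case and is not the package's implementation.

```r
dfm <- data.frame(
  x = 1:3,
  y = as.character(1:3),
  z = letters[1:3],
  stringsAsFactors = FALSE
)

# Coerce only the named columns; other columns are left untouched.
ensure_nms <- c("x", "y")
dfm[ensure_nms] <- lapply(dfm[ensure_nms], as.numeric)

col_classes <- vapply(dfm, class, character(1))
col_classes
```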

See Also

purrr::modify_at().

Other functions to operate on column types: type_vft()

Other functions for developers: check_crucial_names(), extract_insensitive(), flag_if_group(), is_multiple(), nms_try_rename(), rename_matches()

Examples

dfm <- tibble(
  w = c(NA, 1, 2),
  x = 1:3,
  y = as.character(1:3),
  z = letters[1:3]
)
dfm
type_ensure(dfm, c("w", "x", "y"), "numeric")
type_ensure(dfm, c("w", "x", "y", "z"), "character")

Help to read ForestGEO data safely, with consistent columns type.

Description

A common cause of problems is feeding functions with data whose columns are not all of the expected type. The problem often begins when reading data from a text file with functions such as utils::read.csv(), utils::read.delim(), and friends, which often guess a column type different from the one you expect. These common offenders are strongly discouraged; instead consider using readr::read_csv(), readr::read_tsv(), and friends, which guess column types correctly much more often than their analogs from the utils package.

type_vft() and type_taxa() help you to read data more safely by explicitly specifying what type to expect from each column of known datasets. These functions output the specification of column types used internally by read_vft() and read_taxa():

Usage

type_vft()

type_taxa()

Details

Types reference (for more details see readr::read_delim()):

Value

A list.
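
Why an explicit column specification helps can be sketched in base R with the colClasses argument of utils::read.csv(). The column names and types below are hypothetical and are not the specification returned by type_vft() or type_taxa().

```r
# Made-up data: `Tag` has leading zeros that must be preserved.
csv <- "TreeID,Tag,DBH\n1,0012,10.5\n2,0013,20"

# Guessing types turns `Tag` into an integer and loses the leading zeros.
guessed <- read.csv(text = csv)
guessed$Tag

# Declaring the type of each column keeps `Tag` as text.
typed <- read.csv(
  text = csv,
  colClasses = c(TreeID = "integer", Tag = "character", DBH = "numeric")
)
typed$Tag
```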

See Also

readr::read_delim().

Other functions to operate on column types: type_ensure()

Other functions to read text files delivered by ForestGEO's database: read_vft()

Examples

assert_is_installed("fgeo.x")
library(fgeo.x)
library(readr)

str(type_vft())

read_csv(example_path("view/vft_4quad.csv"), col_types = type_vft())

str(type_taxa())

read_csv(example_path("view/taxa.csv"), col_types = type_taxa())