Title: Access and Analyze Data from the Red Book of Endemic Plants of Peru
Version: 0.0.3
Description: Provides access to and analysis of data from "The Red Book of Endemic Plants of Peru" (León, B., Roque, J., Ulloa, C., Jorgensen, P.M., Pitman, N., Cano, A. 2006) <doi:10.15381/rpb.v13i2.1782>. This package offers comprehensive taxonomic, geographic, and conservation information about Peru's endemic plant species. It includes functions to verify species inclusion, obtain updated taxonomic details, and explore the dataset.
License: MIT + file LICENSE
Suggests: knitr, rmarkdown, testthat (≥ 3.0.0)
Config/testthat/edition: 3
LazyData: true
LazyDataCompression: xz
Encoding: UTF-8
RoxygenNote: 7.3.2
URL: https://github.com/PaulESantos/redbookperu, https://paulesantos.github.io/redbookperu/
BugReports: https://github.com/PaulESantos/redbookperu/issues
Depends: R (≥ 2.10)
Maintainer: Paul E. Santos Andrade <paulefrens@gmail.com>
NeedsCompilation: no
Packaged: 2024-07-02 01:20:14 UTC; PC
Author: Paul E. Santos Andrade [aut, cre], Lucely L. Vilca Bustamante [aut]
Repository: CRAN
Date/Publication: 2024-07-02 07:30:02 UTC

The matching algorithm

Description

The matching algorithm

Usage

.match_algorithm(
  splist_class,
  max_distance,
  progress_bar = FALSE,
  keep_closest = TRUE,
  genus_fuzzy = TRUE,
  grammar_check = FALSE
)

Check Species Names in the Red Book of Endemic Plants of Peru

Description

This function checks a list of species names against the Red Book of Endemic Plants of Peru database. It reports whether each species is recorded as endemic and uses fuzzy matching to catch misspelled names.

Usage

check_redbooklist(splist, dist = 0.02)

Arguments

splist

A character vector containing the species names to be checked.

dist

Maximum allowed distance for fuzzy matching of species names.

Details

This function checks each species name in the provided list against the Red Book of Endemic Plants of Peru database using fuzzy matching based on the specified maximum distance (dist). It provides information about the endemic status of each species and flags if the recorded name needs updating. It also counts the number of exact and fuzzy matches found.

Value

A character vector indicating if each input species name is listed as "endemic" in the Red Book of Endemic Plants of Peru database. Returns "endemic" if the species name is listed and "not endemic" if no matching entry is found.

References

Red Book of Endemic Plants of Peru.

The World Checklist of Vascular Plants, a continuously updated resource for exploring global plant diversity.

Taxonomic Name Resolution Service (TNRS).

Plants of the World Online, facilitated by the Royal Botanic Gardens, Kew.

Examples

# Example usage of the function
splist <- c("Aphelandra cuscoenses",
            "Piper stevensi",
            "Sanchezia ovata",
            "Verbesina andina",
            "Festuca dentiflora",
            "Eucrosia bicolor var. plowmanii",
            "Hydrocotyle bonplandii var. hirtipes",
            "Persea americana")

# Basic usage
check_redbooklist(splist = splist, dist = 0.2)

# Using base R with a data frame
plant_list <- data.frame(splist = splist)
plant_list$label <- check_redbooklist(plant_list$splist, dist = 0.2)
plant_list


Get Red Book Data for Given Species List

Description

This function retrieves comprehensive information from the Red Book of Endemic Plants of Peru database for a provided list of species. It associates the provided species names with their corresponding updated taxonomic information and descriptions recorded in the original publication.

Usage

get_redbook_data(splist, dist = 0.1)

Arguments

splist

A character vector containing the species names to be queried.

dist

Maximum allowed distance for fuzzy matching of species names.

Details

This function checks each species name in the provided list against the Red Book of Endemic Plants of Peru database using fuzzy matching based on the specified maximum distance (dist). For each species, it retrieves and combines taxonomic information (accepted name, accepted family, accepted name author) with additional descriptive data recorded in the original publication, such as IUCN conservation category, bibliographic reference, collector, herbariums, common name, departmental registrations, ecological regions, protected natural areas (SINANPE), Peruvian herbaria, and additional remarks.
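
The combination step described above can be pictured as a join on a shared identifier. The following is a minimal sketch in base R, not the package's implementation: the two toy data frames, their identifier values, and the placeholder `iucn` entries are invented for illustration, and only the column names `redbook_id`, `accepted_name`, and `iucn` are taken from this manual.

```r
# Toy taxonomic table (stand-in for the matched names; values are illustrative)
taxonomy <- data.frame(
  redbook_id    = c("id_1", "id_2"),
  accepted_name = c("Aphelandra cuscoensis", "Sanchezia ovata")
)

# Toy descriptive table (stand-in for the book's descriptive fields;
# the 'iucn' values are placeholders, not real conservation categories)
descriptions <- data.frame(
  redbook_id = c("id_1", "id_2"),
  iucn       = c("category_placeholder_1", "category_placeholder_2")
)

# Left join: keep every matched taxon and attach its descriptive fields
combined <- merge(taxonomy, descriptions, by = "redbook_id", all.x = TRUE)
combined
```

Using `all.x = TRUE` keeps a taxon in the result even when no descriptive record is found for it, which mirrors the idea of returning one row per queried species.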

Value

A data frame containing comprehensive information about the provided species, including updated taxonomic details and descriptions.

References

Red Book of Endemic Plants of Peru.

The World Checklist of Vascular Plants, a continuously updated resource for exploring global plant diversity.

Taxonomic Name Resolution Service (TNRS).

Plants of the World Online, facilitated by the Royal Botanic Gardens, Kew.

See Also

check_redbooklist function for a more focused check of species endemic status.

Examples

# Example illustrating how to use the get_redbook_data function
species_list <- c("Aphelandra cuscoensis", "Sanchezia ovata", "Piper stevensii")
redbook_data <- get_redbook_data(species_list)
head(redbook_data)


Row Positions of the First Three Letters of Species Names in redbook_tab

Description

The 'redbook_position' dataset reports the position (as a row number) of the first three letters (triphthong) of the plant names stored in the variable 'accepted_name' of the table 'redbook_tab'. This indexing system speeds up searches on the largest list in the package.

Usage

redbook_position

Format

A data frame with 978 observations on the following 3 variables:

position

A character vector. The position of the first three letters of the species name in the redbook_tab.

triphthong

A character vector. The first three letters of the species name in the redbook_tab.

genus

A character vector. The corresponding genus name.

Details

Positions of Species Names in The Red Book of Endemic Plants of Peru

The redbook_position dataset provides the positions of the first three letters of each species name listed in the redbook_tab.
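
To illustrate how a three-letter prefix index can narrow a search, here is a minimal sketch in base R. It builds a toy index over a small sorted name vector. The name vector, the `lookup()` helper, and the integer `position` column are assumptions made for illustration; the package's own index may be built and stored differently.

```r
# Toy, alphabetically sorted name vector (stand-in for redbook_tab$accepted_name)
accepted_name <- sort(c("Aphelandra cuscoensis", "Festuca dentiflora",
                        "Piper stevensii", "Sanchezia ovata",
                        "Verbesina andina"))

# First three letters of every name, and the first row where each prefix occurs
prefix    <- substr(accepted_name, 1, 3)
first_row <- match(unique(prefix), prefix)

index <- data.frame(
  position   = first_row,
  triphthong = unique(prefix),
  genus      = sub(" .*", "", accepted_name[first_row])
)

# Hypothetical helper: return only the rows sharing the query's prefix,
# instead of scanning the whole table
lookup <- function(query) {
  i <- match(substr(query, 1, 3), index$triphthong)
  if (is.na(i)) return(character(0))
  start <- index$position[i]
  end <- if (i < nrow(index)) index$position[i + 1] - 1 else length(accepted_name)
  accepted_name[start:end]
}

lookup("Piper stevensi")    # candidates that start with "Pip"
lookup("Persea americana")  # no name starts with "Per" in the toy vector
```

The sketch relies on the names being sorted, so that all names sharing a prefix occupy a contiguous block of rows.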

Examples


data("redbook_position")
head(redbook_position)


Database of the Red Book of Endemic Plants of Peru

Description

This database contains comprehensive information regarding the endemic plant species listed in the Red Book of Endemic Plants of Peru. Each endemic taxon is accompanied by corresponding variables that detail its taxonomic status, IUCN conservation category, bibliographic references, type collection details, common names, departmental registrations, ecological regions, protected natural areas (SINANPE), and Peruvian herbaria where the specimens are deposited, as recorded in the original book.

Usage

redbook_sp_data

Format

A data frame with the following variables:

redbook_id

Unique identifier for each species in the Red Book of Endemic Plants of Peru.

redbook_name

Scientific name of the endemic species.

iucn

Conservation category assigned according to IUCN.

publication

Bibliographic reference where the taxon was originally described.

collector

Name(s) of the collector(s) of the type specimen.

herbariums

Acronyms of the institutions where the type specimens of the taxon are deposited.

common_name

Common names of the species as mentioned in the literature.

dep_registry

Abbreviations of the departments where the taxon has been recorded.

ecological_regions

Abbreviations of the ecological regions proposed by Zamora (1996).

sinampe

Abbreviation of the Protected Natural Area where the taxon was recorded.

peruvian_herbariums

Acronyms of the Peruvian institutions where both type and non-type specimens are deposited.

remarks

Observations and additional information about the endemic taxon.

Details

This database provides essential information for research and conservation efforts related to Peru's endemic flora, offering access to the data presented in the corresponding book.

References

León, Blanca, et al. 2006. “The Red Book of Endemic Plants of Peru”. Revista Peruana De Biología 13 (2): 9s-22s. https://revistasinvestigacion.unmsm.edu.pe/index.php/rpb/issue/view/153

Examples


# Example illustrating how to load and explore the database
data("redbook_sp_data")
head(redbook_sp_data)


List of Species Names in redbook_sps_class Separated by Category

Description

The redbook_sps_class dataset includes all species names separated by genus, epithet, author, subspecies, variety, and their position (ID) in the redbook_tab.

Usage

redbook_sps_class

Format

A data.frame with the following columns:

species

A character vector. The full species name.

genus

A character vector. The genus of the species.

epithet

A character vector. The specific epithet of the species.

input_subspecies_epitheton

A character vector. The infraspecific epithet of the species, if applicable.

rank

A character vector. The taxonomic rank (e.g., "species", "subspecies", "variety").

subspecies

A character vector. The subspecies name, if applicable.

variety

A character vector. The variety name, if applicable.

hybrid

A character vector. Indicates if the species is a hybrid.

id

A character vector. The ID of the species in the redbook_tab.
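
As a rough illustration of how a name can be separated into the categories listed above, the following base R sketch splits a name on whitespace and recognizes two rank markers. The `parse_name()` helper, the marker table, and the `infraspecific` output column are hypothetical and simplified; author names and hybrids are not handled, and this is not the parser used to build the dataset.

```r
# Hypothetical helper: split a taxon name into genus, epithet, rank,
# and infraspecific epithet (authors and hybrids are not handled)
parse_name <- function(x) {
  parts <- strsplit(x, "\\s+")[[1]]
  rank_marker <- c("var." = "variety", "subsp." = "subspecies")
  has_infra <- length(parts) >= 4 && parts[3] %in% names(rank_marker)
  data.frame(
    species       = x,
    genus         = parts[1],
    epithet       = parts[2],
    rank          = if (has_infra) unname(rank_marker[parts[3]]) else "species",
    infraspecific = if (has_infra) parts[4] else NA_character_
  )
}

parsed <- do.call(rbind, lapply(
  c("Sanchezia ovata", "Eucrosia bicolor var. plowmanii"),
  parse_name
))
parsed
```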

Examples


data("redbook_sps_class")
head(redbook_sps_class)


Species Names Listed in The Red Book of Endemic Plants of Peru

Description

The redbook_tab contains records for all species listed in The Red Book of Endemic Plants of Peru.

Usage

redbook_tab

Format

A tibble with the following columns:

redbook_id

The fixed species ID of the input taxon in The Red Book of Endemic Plants of Peru.

redbook_name

A character vector. The species name as listed in The Red Book of Endemic Plants of Peru.

input_genus

A character vector. The input genus of the corresponding species name listed.

input_epitheton

A character vector. The specific epithet of the corresponding species name listed.

rank

A character vector. The taxonomic rank (e.g., "species", "subspecies", "variety") of the corresponding species name listed.

input_subspecies_epitheton

A character vector. The infraspecific epithet of the corresponding species name listed, if applicable.

accepted_name

A character vector. The accepted plant taxa names according to the World Checklist of Vascular Plants (WCVP).

accepted_family

A character vector. The corresponding family name of the accepted name.

accepted_name_author

A character vector. The author of the accepted name.

accepted_name_rank

A character vector. The rank of the accepted name (e.g., species, subspecies).

tag_subsp_wcvp

A character vector. A tag indicating if the subspecies is recognized in the WCVP.

genus_ephitethon_wcvp

A character vector. The genus part of the name according to the WCVP.

species_ephitethon_wcvp

A character vector. The specific epithet part of the name according to the WCVP.

subspecies_ephitethon_wcvp

A character vector. The infraspecific epithet part of the name according to the WCVP, if applicable.

References

León, Blanca, et al. 2006. “The Red Book of Endemic Plants of Peru”. Revista Peruana De Biología 13 (2): 9s-22s. https://doi.org/10.15381/rpb.v13i2.1782.

Examples


data("redbook_tab")
head(redbook_tab)


Search Species Names in “The Red Book of Endemic Plants of Peru”

Description

This function searches for plant taxa names listed in "The Red Book of Endemic Plants of Peru". It looks up the data listed in the catalog, validates whether each species is present, and corrects orthographic errors in plant names.

Usage

search_redbook(
  splist,
  max_distance = 0.2,
  show_correct = FALSE,
  genus_fuzzy = FALSE,
  grammar_check = FALSE
)

Arguments

splist

A character vector specifying the input taxon, each element including genus and specific epithet and, potentially, infraspecific rank, infraspecific name, and author name. Only valid characters are allowed (see base::validEnc).

max_distance

An integer or fraction specifying the maximum distance allowed when comparing the submitted name with the closest name matches in the species listed in "The Red Book of Endemic Plants of Peru". The distance used is a generalized Levenshtein distance indicating the total number of insertions, deletions, and substitutions allowed to match the two names. For example, a name with a length of 10 and a max_distance = 0.1 allows only one change (insertion, deletion, or substitution). A max_distance = 2 allows two changes.

show_correct

If TRUE, a column is added to the final result indicating whether the binomial name was exactly matched (TRUE) or if it is misspelled (FALSE).

genus_fuzzy

If TRUE, allows fuzzy matching at the genus level.

grammar_check

If TRUE, performs a grammar check on the species names.
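
The two ways of specifying max_distance can be made concrete with a small base R sketch. It assumes the same convention as base::agrep(), where a fraction is multiplied by the name length and rounded up to a whole number of edits; whether the package rounds in exactly this way is an assumption. The `allowed_edits()` helper is hypothetical, and `utils::adist()` is used to compute the generalized Levenshtein distance.

```r
# Hypothetical helper: turn max_distance into an absolute number of edits.
# Values below 1 are read as a fraction of the name length (rounded up);
# values of 1 or more are read as an absolute count.
allowed_edits <- function(name, max_distance) {
  if (max_distance < 1) ceiling(max_distance * nchar(name)) else max_distance
}

submitted <- "Piper stevensi"   # misspelled input
reference <- "Piper stevensii"  # name as listed

# Generalized Levenshtein distance between the two names
edits <- drop(utils::adist(submitted, reference))

allowed_edits("abcdefghij", 0.1)  # a 10-character name at 0.1
allowed_edits(submitted, 2)       # integer input is used as given
edits <= allowed_edits(submitted, 0.1)  # is the misspelling within budget?
```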

Details

The function tries to match the input names against those listed in "The Red Book of Endemic Plants of Peru", each of which has a corresponding accepted valid name (accepted_name). If the input name is already a valid name, it is duplicated in the accepted_name column.

The algorithm will first try to exactly match the binomial names provided in splist. If no match is found, it will try to find the closest name given the maximum distance defined in max_distance. Note that only binomial names with valid characters are allowed in this function.
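
The exact-then-fuzzy order described above can be sketched in base R as follows. This is a simplified illustration, not the package's matching code: the `reference` vector is a toy stand-in for the catalog, `match_name()` is a hypothetical helper, and the rule for turning a fractional max_distance into an edit count (fraction of the name length, rounded up) is an assumption.

```r
# Toy reference list standing in for the catalog (not the package data)
reference <- c("Aphelandra cuscoensis", "Piper stevensii", "Sanchezia ovata")

# Hypothetical helper illustrating exact-then-fuzzy matching of one name
match_name <- function(name, reference, max_distance = 0.2) {
  # Stage 1: exact match on the full name
  if (name %in% reference) {
    return(data.frame(input = name, matched = name, exact = TRUE))
  }
  # Stage 2: closest candidate, accepted only if within the edit budget
  d <- drop(utils::adist(name, reference))
  budget <- if (max_distance < 1) ceiling(max_distance * nchar(name)) else max_distance
  best <- which.min(d)
  matched <- if (d[best] <= budget) reference[best] else NA_character_
  data.frame(input = name, matched = matched, exact = FALSE)
}

queries <- c("Sanchezia ovata", "Piper stevensi", "Persea americana")
result <- do.call(rbind, lapply(queries, match_name, reference = reference))
result
```

In this sketch an unmatched name yields NA in the `matched` column, which is the situation where a larger max_distance might be worth trying.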

Value

A data frame with the matched species names and additional information from the redbook catalog. If no match is found, a warning is issued suggesting that the max_distance argument be increased.

References

León, Blanca, et al. 2006. “The Red Book of Endemic Plants of Peru”. Revista Peruana De Biología 13 (2): 9s-22s. https://doi.org/10.15381/rpb.v13i2.1782.