Title: | Access and Analyze Data from the Red Book of Endemic Plants of Peru |
Version: | 0.0.3 |
Description: | Provides access to and analysis of data from "The Red Book of Endemic Plants of Peru" (León, B., Roque, J., Ulloa, C., Jorgensen, P.M., Pitman, N., Cano, A. 2006) <doi:10.15381/rpb.v13i2.1782>. This package offers comprehensive taxonomic, geographic, and conservation information about Peru's endemic plant species. It includes functions to verify species inclusion, obtain updated taxonomic details, and explore the dataset. |
License: | MIT + file LICENSE |
Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
LazyData: | true |
LazyDataCompression: | xz |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
URL: | https://github.com/PaulESantos/redbookperu, https://paulesantos.github.io/redbookperu/ |
BugReports: | https://github.com/PaulESantos/redbookperu/issues |
Depends: | R (≥ 2.10) |
Maintainer: | Paul E. Santos Andrade <paulefrens@gmail.com> |
NeedsCompilation: | no |
Packaged: | 2024-07-02 01:20:14 UTC; PC |
Author: | Paul E. Santos Andrade |
Repository: | CRAN |
Date/Publication: | 2024-07-02 07:30:02 UTC |
The matching algorithm
Description
The matching algorithm
Usage
.match_algorithm(
  splist_class,
  max_distance,
  progress_bar = FALSE,
  keep_closest = TRUE,
  genus_fuzzy = TRUE,
  grammar_check = FALSE
)
Check Species Names in the Red Book of Endemic Plants of Peru
Description
This function checks a list of species names against the Red Book of Endemic Plants of Peru database and reports whether each species was recorded as endemic. It uses fuzzy matching to catch misspelled names.
Usage
check_redbooklist(splist, dist = 0.02)
Arguments
splist | A character vector containing the species names to be checked. |
dist | Maximum allowed distance for fuzzy matching of species names. |
Details
This function checks each species name in the provided list against the Red Book of Endemic Plants of Peru database using fuzzy matching based on the specified maximum distance (dist). It provides information about the endemic status of each species and flags if the recorded name needs updating. It also counts the number of exact and fuzzy matches found.
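The distinction between exact and fuzzy matches can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `reference` is a toy stand-in for the bundled database, `label_names()` is a hypothetical helper, and `dist` is treated here as a fraction of the reference name length.

```r
# Toy reference list; the real package matches against its bundled database.
reference <- c("Aphelandra cuscoensis", "Piper stevensii", "Sanchezia ovata")

# Hypothetical helper (not part of redbookperu): label each input name and
# record whether it matched exactly, fuzzily, or not at all.
label_names <- function(splist, reference, dist = 0.2) {
  # Edit-distance matrix: rows are input names, columns are reference names.
  d <- utils::adist(splist, reference)
  # Assumption: scale by reference-name length so `dist` acts as a fraction.
  rel <- sweep(d, 2, nchar(reference), "/")
  best <- apply(rel, 1, min)
  match_type <- ifelse(best == 0, "exact",
                       ifelse(best <= dist, "fuzzy", "none"))
  data.frame(
    name = splist,
    match_type = match_type,
    label = ifelse(match_type == "none", "not endemic", "endemic"),
    stringsAsFactors = FALSE
  )
}

res <- label_names(
  c("Aphelandra cuscoenses", "Sanchezia ovata", "Persea americana"),
  reference
)
# Count how many names fell into each match type.
table(res$match_type)
```

Tabulating `match_type` gives the kind of exact/fuzzy count mentioned above; how the package reports these counts may differ.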
Value
A character vector indicating if each input species name is listed as "endemic" in the Red Book of Endemic Plants of Peru database. Returns "endemic" if the species name is listed and "not endemic" if no matching entry is found.
References
- Red Book of Endemic Plants of Peru.
- The World Checklist of Vascular Plants, a continuously updated resource for exploring global plant diversity.
- Taxonomic Name Resolution Service (TNRS).
- Plants of the World Online, facilitated by the Royal Botanic Gardens, Kew.
Examples
# Example usage of the function
# The first two names contain spelling errors, so they rely on fuzzy matching
splist <- c("Aphelandra cuscoenses",
            "Piper stevensi",
            "Sanchezia ovata",
            "Verbesina andina",
            "Festuca dentiflora",
            "Eucrosia bicolor var. plowmanii",
            "Hydrocotyle bonplandii var. hirtipes",
            "Persea americana")
# Basic usage
check_redbooklist(splist = splist, dist = 0.2)
# Using base R with a data frame
plant_list <- data.frame(splist = splist)
plant_list$label <- check_redbooklist(plant_list$splist, dist = 0.2)
plant_list
Get Red Book Data for Given Species List
Description
This function retrieves comprehensive information from the Red Book of Endemic Plants of Peru database for a provided list of species. It associates the provided species names with their corresponding updated taxonomic information and descriptions recorded in the original publication.
Usage
get_redbook_data(splist, dist = 0.1)
Arguments
splist | A character vector containing the species names to be queried. |
dist | Maximum allowed distance for fuzzy matching of species names. |
Details
This function checks each species name in the provided list against the Red Book of Endemic Plants of Peru database using fuzzy matching based on the specified maximum distance (dist). For each species, it retrieves and combines taxonomic information (accepted name, accepted family, accepted name author) with additional descriptive data recorded in the original publication, such as IUCN conservation category, bibliographic reference, collector, herbariums, common name, departmental registrations, ecological regions, protected natural areas (SINANPE), Peruvian herbaria, and additional remarks.
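Combining taxonomic and descriptive records amounts to a join on a shared identifier. The sketch below assumes that key is `redbook_id`, which appears in both `redbook_tab` and `redbook_sp_data`. The two small tables and every value in them are invented placeholders, and the package's actual join logic may differ.

```r
# Hypothetical miniature tables. Column names follow the documented datasets,
# but the rows and values are placeholders invented for illustration.
taxonomy <- data.frame(
  redbook_id = c("id_1", "id_2"),
  accepted_name = c("Genus alpha", "Genus beta"),
  accepted_family = c("Familyaceae", "Familyaceae"),
  stringsAsFactors = FALSE
)
descriptions <- data.frame(
  redbook_id = c("id_1", "id_2"),
  iucn = c("EN", "VU"),
  remarks = c("placeholder remark 1", "placeholder remark 2"),
  stringsAsFactors = FALSE
)

# Suppose name matching resolved the input list to these identifiers.
matched_ids <- c("id_2")

# Keep the matched taxonomic rows and attach the descriptive columns.
combined <- merge(
  taxonomy[taxonomy$redbook_id %in% matched_ids, ],
  descriptions,
  by = "redbook_id"
)
combined
```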
Value
A data frame containing comprehensive information about the provided species, including updated taxonomic details and descriptions.
References
- Red Book of Endemic Plants of Peru.
- The World Checklist of Vascular Plants, a continuously updated resource for exploring global plant diversity.
- Taxonomic Name Resolution Service (TNRS).
- Plants of the World Online, facilitated by the Royal Botanic Gardens, Kew.
See Also
check_redbooklist function for a more focused check of species endemic status.
Examples
# Example illustrating how to use the get_redbook_data function
species_list <- c("Aphelandra cuscoensis", "Sanchezia ovata", "Piper stevensii")
redbook_data <- get_redbook_data(species_list)
head(redbook_data)
Row Positions of the First Three Letters of Species Names in redbook_tab
Description
The 'redbook_position' dataset reports the position (as a row number) of the first three letters (stored in the variable 'triphthong') of the plant names in the variable 'accepted_name' of the table 'redbook_tab'. This index speeds up name searches in the package's largest list.
Usage
redbook_position
Format
A data frame with 978 observations on the following 3 variables:
- position
A character vector. The position of the first three letters of the species name in the redbook_tab.
- triphthong
A character vector. The first three letters of the species name in the redbook_tab.
- genus
A character vector. The corresponding genus name.
Details
Positions of Species Names in The Red Book of Endemic Plants of Peru
The redbook_position dataset provides the positions of the first three letters of each species name listed in the redbook_tab.
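A three-letter prefix index narrows the set of candidate rows before any expensive string comparison. The following is a minimal sketch of that idea in base R, assuming a toy `accepted` vector. The `lookup()` helper is hypothetical, and the package's own index layout may differ.

```r
# Toy name list standing in for the accepted_name column of redbook_tab.
accepted <- sort(c("Aphelandra cuscoensis", "Piper stevensii",
                   "Sanchezia ovata", "Verbesina andina",
                   "Festuca dentiflora"))

# Build the index: first three letters -> row positions sharing that prefix.
prefix <- toupper(substr(accepted, 1, 3))
index <- split(seq_along(accepted), prefix)

# Hypothetical helper: return only the names that share the query's prefix,
# so that fuzzy matching can be restricted to this smaller candidate set.
lookup <- function(name) {
  rows <- index[[toupper(substr(name, 1, 3))]]
  accepted[rows]
}

lookup("Piper stevensi")    # candidates starting with "Pip"
lookup("Persea americana")  # no candidates share the prefix "Per"
```

One trade-off of this design is that a typo within the first three letters sends the query to the wrong bucket, which is a plausible reason for the separate genus_fuzzy option elsewhere in the package.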
Examples
data("redbook_position")
head(redbook_position)
Database of the Red Book of Endemic Plants of Peru
Description
This database contains comprehensive information regarding the endemic plant species listed in the Red Book of Endemic Plants of Peru. Each endemic taxon is accompanied by corresponding variables that detail its taxonomic status, IUCN conservation category, bibliographic references, type collection details, common names, departmental registrations, ecological regions, protected natural areas (SINANPE), and Peruvian herbaria where the specimens are deposited, as recorded in the original book.
Usage
redbook_sp_data
Format
A data frame with the following variables:
- redbook_id
Unique identifier for each species in the Red Book of Endemic Plants of Peru.
- redbook_name
Scientific name of the endemic species.
- iucn
Conservation category assigned according to IUCN.
- publication
Bibliographic reference where the taxon was originally described.
- collector
Name(s) of the collector(s) of the type specimen.
- herbariums
Acronyms of the institutions where the type specimens of the taxon are deposited.
- common_name
Common names of the species as mentioned in the literature.
- dep_registry
Abbreviations of the departments where the taxon has been recorded.
- ecological_regions
Abbreviations of the ecological regions proposed by Zamora (1996).
- sinampe
Abbreviation of the Protected Natural Area where the taxon was recorded.
- peruvian_herbariums
Acronyms of the Peruvian institutions where both type and non-type specimens are deposited.
- remarks
Observations and additional information about the endemic taxon.
Details
This database provides essential information for research and conservation efforts related to Peru's endemic flora, offering access to the data presented in the corresponding book.
References
León, Blanca, et al. 2006. “The Red Book of Endemic Plants of Peru”. Revista Peruana De Biología 13 (2): 9s-22s. https://revistasinvestigacion.unmsm.edu.pe/index.php/rpb/issue/view/153
Examples
# Example illustrating how to load and explore the database
data("redbook_sp_data")
head(redbook_sp_data)
List of Species Names in redbook_sps_class Separated by Category
Description
The redbook_sps_class dataset includes all species names separated by genus, epithet, author, subspecies, variety, and their position (ID) in the redbook_tab.
Usage
redbook_sps_class
Format
A data.frame with the following columns:
- species
A character vector. The full species name.
- genus
A character vector. The genus of the species.
- epithet
A character vector. The specific epithet of the species.
- input_subspecies_epitheton
A character vector. The infraspecific epithet of the species, if applicable.
- rank
A character vector. The taxonomic rank (e.g., "species", "subspecies", "variety").
- subspecies
A character vector. The subspecies name, if applicable.
- variety
A character vector. The variety name, if applicable.
- hybrid
A character vector. Indicates if the species is a hybrid.
- id
A character vector. The ID of the species in the redbook_tab.
Examples
data("redbook_sps_class")
head(redbook_sps_class)
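To illustrate the kind of decomposition stored in redbook_sps_class, the sketch below splits a scientific name into genus, epithet, rank, and infraspecific epithet. `split_name()` is a hypothetical helper written for this example. It assumes whitespace-separated names with "subsp." or "var." as rank markers, and it ignores authors and hybrids.

```r
# Hypothetical helper (not part of redbookperu): split a scientific name into
# the components documented for redbook_sps_class.
split_name <- function(name) {
  parts <- strsplit(trimws(name), "\\s+")[[1]]
  # Assumption: only these two rank markers are recognised.
  rank_markers <- c("subsp." = "subspecies", "var." = "variety")
  marker_at <- which(parts %in% names(rank_markers))
  # An infraspecific epithet exists only if a word follows the marker.
  has_infra <- length(marker_at) > 0 && length(parts) > marker_at[1]
  data.frame(
    genus = parts[1],
    epithet = parts[2],
    rank = if (has_infra) unname(rank_markers[parts[marker_at[1]]]) else "species",
    infraspecific_epithet = if (has_infra) parts[marker_at[1] + 1] else NA_character_,
    stringsAsFactors = FALSE
  )
}

variety_row <- split_name("Eucrosia bicolor var. plowmanii")
species_row <- split_name("Sanchezia ovata")
rbind(variety_row, species_row)
```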
Species Names Listed in The Red Book of Endemic Plants of Peru
Description
The redbook_tab dataset contains records for all species listed in The Red Book of Endemic Plants of Peru.
Usage
redbook_tab
Format
A tibble with the following columns:
- redbook_id
The fixed species ID of the input taxon in The Red Book of Endemic Plants of Peru.
- redbook_name
A character vector. The species name as listed in The Red Book of Endemic Plants of Peru.
- input_genus
A character vector. The input genus of the corresponding species name listed.
- input_epitheton
A character vector. The specific epithet of the corresponding species name listed.
- rank
A character vector. The taxonomic rank (e.g., "species", "subspecies", "variety") of the corresponding species name listed.
- input_subspecies_epitheton
A character vector. The infraspecific epithet of the corresponding species name listed, if applicable.
- accepted_name
A character vector. The accepted plant taxa names according to the World Checklist of Vascular Plants (WCVP).
- accepted_family
A character vector. The corresponding family name of the accepted name.
- accepted_name_author
A character vector. The author of the accepted name.
- accepted_name_rank
A character vector. The rank of the accepted name (e.g., species, subspecies).
- tag_subsp_wcvp
A character vector. A tag indicating if the subspecies is recognized in the WCVP.
- genus_ephitethon_wcvp
A character vector. The genus part of the name according to the WCVP.
- species_ephitethon_wcvp
A character vector. The specific epithet part of the name according to the WCVP.
- subspecies_ephitethon_wcvp
A character vector. The infraspecific epithet part of the name according to the WCVP, if applicable.
References
León, Blanca, et al. 2006. “The Red Book of Endemic Plants of Peru”. Revista Peruana De Biología 13 (2): 9s-22s. https://doi.org/10.15381/rpb.v13i2.1782.
Examples
data("redbook_tab")
head(redbook_tab)
Search Species Names in “The Red Book of Endemic Plants of Peru”
Description
This function searches for plant taxa names listed in "The Red Book of Endemic Plants of Peru". It checks each name against the data in the catalog, verifies whether the species is present, and corrects orthographic errors in plant names.
Usage
search_redbook(
  splist,
  max_distance = 0.2,
  show_correct = FALSE,
  genus_fuzzy = FALSE,
  grammar_check = FALSE
)
Arguments
splist | A character vector specifying the input taxon, each element including genus and specific epithet and, potentially, infraspecific rank, infraspecific name, and author name. Only valid characters are allowed (see |
max_distance | An integer or fraction specifying the maximum distance allowed when comparing the submitted name with the closest name matches in the species listed in "The Red Book of Endemic Plants of Peru". The distance used is a generalized Levenshtein distance indicating the total number of insertions, deletions, and substitutions allowed to match the two names. For example, a name with a length of 10 and a max_distance = 0.1 allows only one change (insertion, deletion, or substitution). A max_distance = 2 allows two changes. |
show_correct | If TRUE, a column is added to the final result indicating whether the binomial name was exactly matched (TRUE) or if it is misspelled (FALSE). |
genus_fuzzy | If TRUE, allows fuzzy matching at the genus level. |
grammar_check | If TRUE, performs a grammar check on the species names. |
Details
The function tries to match input names against the names listed in "The Red Book of Endemic Plants of Peru", each of which has a corresponding accepted name (accepted_name). If the input name is already the accepted name, it is repeated in the accepted_name column.
The algorithm will first try to exactly match the binomial names provided in splist. If no match is found, it will try to find the closest name given the maximum distance defined in max_distance.
Note that only binomial names with valid characters are allowed in this
function.
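The two ways of expressing max_distance (a fraction of the name length, or an absolute number of edits) can be made concrete with a short base R sketch. `allowed_edits()` and `edit_distance()` are hypothetical helpers, not package functions. Rounding fractions up is an assumption borrowed from the convention documented for base::agrep(), which may not be exactly what search_redbook() does internally.

```r
# Hypothetical helper (not part of redbookperu): convert max_distance into a
# number of allowed edits. Assumption: values below 1 are fractions of the
# name length and are rounded up, as documented for base::agrep().
allowed_edits <- function(name, max_distance) {
  if (max_distance < 1) ceiling(max_distance * nchar(name)) else max_distance
}

# Generalized Levenshtein distance between a submitted and a listed name.
edit_distance <- function(submitted, listed) {
  as.vector(utils::adist(submitted, listed))
}

submitted <- "Sanchezia ovatta"  # one extra letter
listed <- "Sanchezia ovata"

budget_fraction <- allowed_edits(listed, 0.1)  # fraction of the name length
budget_integer <- allowed_edits(listed, 2)     # absolute number of edits

# Does the submitted name fall within each edit budget?
within_fraction <- edit_distance(submitted, listed) <= budget_fraction
within_integer <- edit_distance(submitted, listed) <= budget_integer
c(fraction = within_fraction, integer = within_integer)
```

Under the same assumption, a 10-character name with max_distance = 0.1 yields a budget of one edit, which agrees with the example given for the max_distance argument.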
Value
A data frame with the matched species names and additional information from the redbook catalog. If no match is found, a warning is issued suggesting to increase the max_distance argument.
References
León, Blanca, et al. 2006. “The Red Book of Endemic Plants of Peru”. Revista Peruana De Biología 13 (2): 9s-22s. https://doi.org/10.15381/rpb.v13i2.1782.