Title: | Reptile Database Data |
Version: | 0.0.0.1 |
Description: | Provides easy access to 'The Reptile Database', a comprehensive catalogue of all living reptile species and their classification. This package includes taxonomic data for over 10,000 reptile species and approximately 2,800 subspecies, covering all extant reptiles. The dataset features taxonomic names, synonyms, distribution data, type specimens, and literature references, making it ready for research and analysis. Data is sourced from 'The Reptile Database' http://www.reptile-database.org/. |
License: | MIT + file LICENSE |
Depends: | R (≥ 4.1.0) |
Suggests: | testthat (≥ 3.0.0), knitr, rmarkdown |
Config/testthat/edition: | 3 |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
LazyData: | true |
URL: | https://github.com/PaulESantos/reptiledb.data |
BugReports: | https://github.com/PaulESantos/reptiledb.data/issues |
Imports: | httr, rvest, stringr, tibble |
Maintainer: | Paul Efren Santos Andrade <paulefrens@gmail.com> |
NeedsCompilation: | no |
Packaged: | 2025-07-01 01:55:08 UTC; PC |
Author: | Paul Efren Santos Andrade |
Repository: | CRAN |
Date/Publication: | 2025-07-05 15:40:02 UTC |
Check if reptile database data needs updating based on date comparison
Description
This function checks if the local reptile database data is up-to-date by comparing the date extracted from the local dataset name with the date from the latest available file on The Reptile Database website.
Usage
check_data_update(silent = FALSE, check_connection = TRUE)
Arguments
silent |
Logical. If TRUE, suppresses messages and only returns results. Default is FALSE. |
check_connection |
Logical. If TRUE, checks internet connection before attempting to access online data. Default is TRUE. |
Value
A list containing the following elements:
- update_needed
Logical. TRUE if an update is needed, FALSE otherwise
- local_info
List. Information about the local dataset
- remote_info
List. Information about the remote dataset
- message
Character. Status message describing the comparison result
- recommendation
Character. Recommendation for user action
- local_date
Character. Date of local data in YYYY-MM-DD format
- remote_date
Character. Date of remote data in YYYY-MM-DD format (if available)
- remote_filename
Character. Filename of the remote file (if available)
- days_difference
Numeric. Number of days difference between local and remote data (if both dates available)
If an error occurs or no internet connection is available, only the message element will contain relevant error information.
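The date-based fields of the returned list can be illustrated with a minimal base R sketch. The helper name compare_release_dates and the rule "an update is needed when the remote date is later than the local date" are assumptions made for illustration; this is not the package's implementation.

```r
# Hypothetical helper (not part of reptiledb.data): derives the date-related
# fields described in the Value section from two dates.
compare_release_dates <- function(local_date, remote_date) {
  local_date  <- as.Date(local_date)
  remote_date <- as.Date(remote_date)
  # Subtracting two Date objects yields a difference in days
  days_difference <- as.numeric(remote_date - local_date)
  list(
    # Assumption: an update is needed only when the remote file is newer
    update_needed   = days_difference > 0,
    local_date      = format(local_date, "%Y-%m-%d"),
    remote_date     = format(remote_date, "%Y-%m-%d"),
    days_difference = days_difference
  )
}

status <- compare_release_dates("2025-01-01", "2025-03-01")
status$update_needed
status$days_difference
```

A caller would typically branch on update_needed and show message or recommendation to the user.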
Examples
# Silent check (no messages) - requires internet connection
update_status <- check_data_update(silent = TRUE)
# Verbose check with connection verification
update_status <- check_data_update(silent = FALSE, check_connection = TRUE)
# Check without internet connection verification
update_status <- check_data_update(check_connection = FALSE)
Check Internet Connection
Description
Helper function that checks whether an internet connection is available.
Usage
check_internet_connection()
Value
Logical. TRUE if internet is available, FALSE otherwise.
Extract date from dataset name or filename
Description
Extract date from dataset name or filename
Usage
extract_date_from_name(name, type = "local")
Arguments
name |
Dataset name or filename |
type |
Type of name ("local" or "remote") |
Value
A Date object representing the extracted date, or NULL if extraction fails. For local datasets, expects pattern "reptiledb_MMYYYY" (e.g., reptiledb_012025). For remote files, expects pattern "reptile_checklist_YYYY_MM.xlsx".
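The two name patterns described above can be parsed with base R regular expressions. The sketch below is a hypothetical re-implementation under stated assumptions, not the package's code: the helper name parse_release_date is invented, and dating each release to the first day of its month is an assumption.

```r
# Hypothetical helper (not part of reptiledb.data): parses the two documented
# name patterns and returns a Date, or NULL when the name does not match.
parse_release_date <- function(name, type = c("local", "remote")) {
  type <- match.arg(type)
  if (type == "local") {
    # "reptiledb_MMYYYY": two-digit month followed by four-digit year
    m <- regmatches(name, regexec("reptiledb_([0-9]{2})([0-9]{4})", name))[[1]]
    if (length(m) != 3) return(NULL)
    month <- m[2]
    year  <- m[3]
  } else {
    # "reptile_checklist_YYYY_MM.xlsx": four-digit year, then two-digit month
    m <- regmatches(
      name,
      regexec("reptile_checklist_([0-9]{4})_([0-9]{2})\\.xlsx", name)
    )[[1]]
    if (length(m) != 3) return(NULL)
    year  <- m[2]
    month <- m[3]
  }
  # Assumption: the release is dated to the first day of the month.
  # An explicit format makes invalid months yield NA instead of an error.
  d <- as.Date(sprintf("%s-%s-01", year, month), format = "%Y-%m-%d")
  if (is.na(d)) NULL else d
}

parse_release_date("reptiledb_012025", type = "local")
parse_release_date("reptile_checklist_2025_03.xlsx", type = "remote")
```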
Get Latest Reptile Database Download Link
Description
This function retrieves the most recent download link for reptile database files from the Reptile Database website. It searches for files from the current year first, and if none are found, searches for files from the previous year.
Usage
get_latest_reptile_download(
base_url = "http://www.reptile-database.org/data/",
current_year = as.numeric(format(Sys.Date(), "%Y")),
file_types = c("xls", "xlsx", "zip"),
return_info = FALSE
)
Arguments
base_url |
Character string. The base URL of the reptile database data page. Default is "http://www.reptile-database.org/data/". |
current_year |
Numeric. The current year to search for files. Default is the current system year. |
file_types |
Character vector. File extensions to search for. Default is c("xls", "xlsx", "zip"). |
return_info |
Logical. If TRUE, returns a list with detailed information about the found file. If FALSE, returns only the URL. Default is FALSE. |
Details
The function performs web scraping on the specified URL to find download links. It prioritizes files from the current year, but will fall back to the previous year if no current year files are available.
The function requires the following packages: rvest, dplyr, and stringr. These packages must be installed before using this function.
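The "current year first, then previous year" fallback can be sketched in base R on a plain character vector of links, without any scraping. The helper name pick_latest_link, the example.org links, and the assumption that file names sort chronologically are all illustrative; the package's own selection logic may differ.

```r
# Hypothetical helper (not part of reptiledb.data): picks one link for `year`,
# falling back to `year - 1` when the current year has no matching file.
pick_latest_link <- function(links, year,
                             file_types = c("xls", "xlsx", "zip")) {
  # Keep only links that end in one of the requested extensions
  ext_pattern <- paste0("\\.(", paste(file_types, collapse = "|"), ")$")
  links <- links[grepl(ext_pattern, links, ignore.case = TRUE)]
  for (y in c(year, year - 1)) {
    hits <- links[grepl(as.character(y), basename(links), fixed = TRUE)]
    if (length(hits) > 0) {
      # Assumption: names such as ..._YYYY_MM.xlsx sort chronologically
      return(sort(hits, decreasing = TRUE)[1])
    }
  }
  NULL  # nothing found for either year
}

# Made-up links for illustration only
links <- c(
  "https://example.org/data/reptile_checklist_2024_09.xlsx",
  "https://example.org/data/reptile_checklist_2024_03.xlsx",
  "https://example.org/data/readme.txt"
)
latest <- pick_latest_link(links, year = 2025)
basename(latest)
```

Here no file name contains 2025, so the search falls back to 2024 and returns the later of the two 2024 files.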
Value
If return_info = FALSE, returns a character string with the URL of the most recent file, or NULL if no suitable file is found.
If return_info = TRUE, returns a list containing:
- url
Character. The complete URL of the file
- filename
Character. The name of the file
- file_type
Character. The file extension
- extraction_date
Date. The date when the link was extracted
- source_page
Character. The source webpage URL
Returns NULL if no suitable file is found or if an error occurs during web scraping.
See Also
http://www.reptile-database.org/ for more information about the Reptile Database.
Examples
# Get just the URL - requires internet connection
url <- get_latest_reptile_download()
# Get detailed information
info <- get_latest_reptile_download(return_info = TRUE)
# Search for specific file types
zip_url <- get_latest_reptile_download(file_types = "zip")
# Search for files from a specific year
url_2024 <- get_latest_reptile_download(current_year = 2024)
Reptile Checklist with Subspecies Information
Description
A comprehensive dataset extracted from The Reptile Database containing taxonomic and nomenclatural information for reptile species and their subspecies. This tibble includes detailed columns related to authorship, type species, and taxonomic changes.
Usage
reptiledb_012025
Format
A tibble with 14,474 rows and 16 columns:
- order
Taxonomic order of the reptile (e.g., "Sauria").
- family
Taxonomic family (e.g., "Scincidae").
- genus
Genus name.
- epithet
Species epithet (second part of the species name).
- species
Full species name (genus + epithet).
- species_author
Primary author(s) of the species name.
- species_name_year
Year the species was described.
- subspecies_name
Epithet of the subspecies (if any).
- subspecie_author_info
Full author citation of the subspecies.
- subspecies_name_author
Author(s) of the subspecies name.
- subspecies_year
Year the subspecies was described.
- type_species
Name of the type species, if available.
- change
Text description of any taxonomic or nomenclatural change.
- rdb_sp_id
Unique identifier assigned by The Reptile Database.
- nomenclature_change
Logical flag indicating if a nomenclatural change has occurred (TRUE/FALSE).
- nomenclature_change_species
Logical flag indicating if the nomenclatural change affects the species level (TRUE/FALSE).
Details
This dataset is part of the reptiledb.data package and provides structured access to reptile taxonomy data, enabling users to filter, analyze, or visualize species and subspecies information across multiple reptile families and genera.
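Filtering and summarising by family can be sketched in base R as follows. To keep the example self-contained, it uses a small hand-written data frame with a subset of the documented column names; these rows are a toy stand-in, not data taken from the package. With the package installed, the same operations should apply to reptiledb_012025 itself, assuming its columns match the Format section.

```r
# Toy stand-in for reptiledb_012025 (hand-written rows, not package data),
# using a subset of the documented column names.
toy <- data.frame(
  order           = c("Sauria", "Sauria", "Serpentes"),
  family          = c("Scincidae", "Lacertidae", "Viperidae"),
  genus           = c("Tiliqua", "Lacerta", "Vipera"),
  epithet         = c("scincoides", "agilis", "berus"),
  subspecies_name = c("intermedia", NA, NA),
  stringsAsFactors = FALSE
)
# Full species name = genus + epithet, as described for the species column
toy$species <- paste(toy$genus, toy$epithet)

# Filter: all rows belonging to one family
skinks <- toy[toy$family == "Scincidae", ]

# Summarise: number of distinct species per family
species_per_family <- tapply(
  toy$species, toy$family, function(x) length(unique(x))
)

# Rows that carry a subspecies epithet
with_subspecies <- toy[!is.na(toy$subspecies_name), ]
```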