Help for package naturaList

Type:

Package

Title:

Classify Occurrences by Confidence Levels in the Species ID

Version:

0.5.2

Description:

Classify occurrence records based on confidence levels of species identification. In addition, implement tools to filter occurrences inside grid cells and to manually check for possible errors with an interactive shiny application.

License:

MIT + file LICENSE

Encoding:

UTF-8

LazyData:

true

RoxygenNote:

7.2.3

Imports:

shiny, shinyWidgets, dplyr, stringr, sp, raster, shinydashboard, leaflet, leaflet.extras, tidytext, magrittr, vegan, fasterize, sf, htmltools, methods, rlang, tm, stringi

Suggests:

knitr, rmarkdown, testthat (≥ 3.0.0), rnaturalearth, lwgeom, shinyLP

VignetteBuilder:

knitr

Depends:

R (≥ 2.10)

URL:

https://github.com/avrodrigues/naturaList

BugReports:

https://github.com/avrodrigues/naturaList/issues

Config/testthat/edition:

NeedsCompilation:

Packaged:

2024-02-06 07:59:04 UTC; rodriart

Author:

Arthur Vinicius Rodrigues [aut, cre], Gabriel Nakamura [aut], Leandro Duarte [aut]

Maintainer:

Arthur Vinicius Rodrigues <rodrigues.arthur.v@gmail.com>

Repository:

CRAN

Date/Publication:

2024-02-06 08:10:02 UTC

Pipe operator

Description

See magrittr::%>% for details.

Usage

lhs %>% rhs

Occurrence records of Alsophila setosa downloaded from Global Biodiversity Information Facility (GBIF).

Description

A GBIF raw dataset containing 508 occurrence records for the tree fern Alsophila setosa.

Usage

A.setosa

Format

A data frame with 508 rows and 45 variables

Source

GBIF.org (08 July 2019) GBIF Occurrence Download doi:10.15468/dl.6jesg0

Brazil boundary

Description

A spatial polygon with the Brazil boundaries

Usage

BR

Format

A 'SpatialPolygonsDataFrame' with 1 feature

Internal function of naturaList - Return abbreviation collapsed

Description

Returns the collapsed abbreviation for a specific line of the specialist data frame. It is used as the pattern in the grep function inside classify_occ.

Usage

abrev.pttn(df, line)

Arguments

df

spec data frame provided in classify_occ

line

specifies the line of the data frame to be collapsed

Value

a list with two elements in regex format:

[[1]]

the abbreviation of the first name;

[[2]]

regex pattern with all names and abbreviations.

Internal function of naturaList - Manual check of ambiguity in specialist's name

Description

Creates an interaction in which the user checks whether a string with the identifier of a specimen contains a specialist name. It resolves ambiguity when classifying an occurrence as identified by a specialist. It is used inside classify_occ.

Usage

check.spec(class.occ, crit.levels, identified.by)

Arguments

class.occ

internal data frame with observations classified according to the classify_occ criteria

crit.levels

crit.levels chosen by the user in classify_occ

identified.by

same as identified.by argument in classify_occ

Value

a character vector with 'naturaList_levels' ID.

Classify occurrence records in levels of confidence in species identification

Description

Classifies occurrence records in levels of confidence in species identification

Usage

classify_occ(
  occ,
  spec = NULL,
  na.rm.coords = TRUE,
  crit.levels = c("det_by_spec", "not_spec_name", "image", "sci_collection", "field_obs",
    "no_criteria_met"),
  ignore.det.names = NULL,
  spec.ambiguity = "not.spec",
  institution.code = "institutionCode",
  collection.code = "collectionCode",
  catalog.number = "catalogNumber",
  year = "year",
  date.identified = "dateIdentified",
  species = "species",
  identified.by = "identifiedBy",
  decimal.latitude = "decimalLatitude",
  decimal.longitude = "decimalLongitude",
  basis.of.record = "basisOfRecord",
  media.type = "mediaType",
  occurrence.id = "occurrenceID",
  institution.source,
  year.event,
  scientific.name,
  determined.by,
  latitude,
  longitude,
  basis.of.rec,
  occ.id
)

Arguments

occ

data frame with occurrence records information.

spec

data frame with specialists' names. See details.

na.rm.coords

logical. If TRUE, remove occurrences with NA in decimal.latitude or decimal.longitude

crit.levels

character. Vector with levels of confidence in decreasing order. The criteria allowed are det_by_spec, not_spec_name, image, sci_collection, field_obs, no_criteria_met. See details.

ignore.det.names

character vector indicating strings in identified.by that should be ignored as a taxonomist. See details.

spec.ambiguity

character. Indicates how to deal with ambiguity in specialists' names. not.spec resolves ambiguity by classifying the identification as done by a non-specialist; is.spec assumes the identification was done by a specialist; manual.check enables the user to manually check all ambiguous names. Default is not.spec.

institution.code

column name of occ with the name (or acronym) in use by the institution having custody of the object(s) or information referred to in the record.

collection.code

column name of occ with the name, acronym, code, or initials identifying the collection or data set from which the record was derived.

catalog.number

column name of occ with an identifier (preferably unique) for the record within the data set or collection.

year

column name of occ with the four-digit year in which the Event occurred, according to the Common Era Calendar.

date.identified

Column name of occ with the date on which the subject was determined as representing the Taxon.

species

column name of occ with the species names.

identified.by

column name of occ with the name of who determined the species.

decimal.latitude

column name of occ with the latitude in decimal degrees.

decimal.longitude

column name of occ with the longitude in decimal degrees.

basis.of.record

column name with the specific nature of the data record. See details.

media.type

column name of occ with the media type of recording. See details.

occurrence.id

column name of occ with the link or code for the occurrence record. See the Darwin Core format.

institution.source

deprecated, use institution.code instead.

year.event

deprecated, use year instead.

scientific.name

deprecated, use species instead.

determined.by

deprecated, use identified.by instead

latitude

deprecated, use decimal.latitude instead

longitude

deprecated, use decimal.longitude instead

basis.of.rec

deprecated, use basis.of.record instead.

occ.id

deprecated, use occurrence.id instead

Details

The spec data frame must have columns separating LastName, Name and Abbrev. See the create_spec_df function for an easy way to produce this data frame.

When ignore.det.names = NULL (default), the function ignores the strings "RRC ID Flag", "NA", "", "-" and "_". When a character vector is provided, the function adds the default strings to it and ignores all of these strings as taxonomist names.
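A minimal sketch of how the default ignored strings and a user-supplied vector might be combined, as described in the paragraph above. The helper name combine_ignore_strings is hypothetical and is not part of naturaList; the package's internal handling may differ.

```r
# Default strings that the documentation says are never treated as a taxonomist name
default_ignore <- c("RRC ID Flag", "NA", "", "-", "_")

# Hypothetical helper (an assumption, not a naturaList function):
# merge the user's strings with the defaults and drop duplicates
combine_ignore_strings <- function(ignore.det.names = NULL) {
  unique(c(ignore.det.names, default_ignore))
}

combine_ignore_strings()                     # defaults only
combine_ignore_strings(c("Anonymous", "-"))  # user strings plus defaults
```

Strings already covered by the defaults, such as "-" here, appear only once in the combined vector.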

The function classifies the occurrence records in six levels of confidence in species identification. The six levels are:

det_by_spec - when the identification was made by a specialist who is present in the list of specialists provided in the spec argument;
not_spec_name - when the identification was made by someone whose name is not among the specialist names provided in spec;
image - the occurrence has no identifier name, but has an associated image;
sci_collection - the occurrence has no identifier name, but is preserved in a scientific collection;
field_obs - the occurrence has no identifier name, but was identified in a field observation;
no_criteria_met - no other criterion was met.

The (decreasing) order of the levels in the character vector determines the classification level order.

basis.of.record is a character vector with one of the following types of record: PRESERVED_SPECIMEN, PreservedSpecimen, HUMAN_OBSERVATION or HumanObservation, as in GBIF data 'basisOfRecord'.

media.type uses the same pattern as GBIF mediaType column, indicating the existence of an associated image with stillImage.
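The relation between the order of crit.levels and the resulting level codes can be sketched in base R. The "rank, underscore, criterion" pattern is an assumption inferred from the code "1_det_by_spec" used elsewhere in this manual; it is not taken from the package source.

```r
# Default order of criteria, from highest to lowest confidence
crit.levels <- c("det_by_spec", "not_spec_name", "image",
                 "sci_collection", "field_obs", "no_criteria_met")

# Assumption: a level code is the rank in crit.levels, an underscore,
# and the criterion name
level_codes <- paste0(seq_along(crit.levels), "_", crit.levels)
level_codes

# Swapping two criteria changes their rank, and therefore their code
custom <- c("det_by_spec", "not_spec_name", "sci_collection",
            "image", "field_obs", "no_criteria_met")
custom_codes <- paste0(seq_along(custom), "_", custom)
custom_codes
```

Under this assumption, moving sci_collection ahead of image gives scientific collections rank 3 and images rank 4.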

Value

The occ data frame plus the classification of each record in a new column, named naturaList_levels.

Author(s)

Arthur V. Rodrigues

Examples

data("A.setosa")
data("speciaLists")
occ.class <- classify_occ(A.setosa, speciaLists)

Evaluate the cleaning of occurrences records

Description

This function compares the area occupied by a species before and after the cleaning procedure, according to the chosen filter level. The comparison can be made by measuring area in the geographical space and in the environmental space.

Usage

clean_eval(
  occ.cl,
  geo.space,
  env.space = NULL,
  level.filter = c("1_det_by_spec"),
  r,
  species = "species",
  decimal.longitude = "decimalLongitude",
  decimal.latitude = "decimalLatitude",
  scientific.name,
  longitude,
  latitude
)

Arguments

occ.cl

data frame with occurrence records information already classified by classify_occ function.

geo.space

a SpatialPolygons* or sf object defining the geographical space

env.space

a SpatialPolygons* or sf object defining the environmental space. Use define_env_space to create this object. By default env.space = NULL, so the cleaning is not evaluated in the environmental space.

level.filter

a character vector including the levels in 'naturaList_levels' column which filter the occurrence data set.

r

a raster with 2 layers representing the environmental variables. If env.space = NULL, it could be a single layer raster, from which the cell size and extent are extracted to produce the composition matrix.

species

column name of occ.cl with the species names.

decimal.longitude

column name of occ.cl longitude in decimal degrees.

decimal.latitude

column name of occ.cl latitude in decimal degrees.

scientific.name

deprecated, use species instead.

longitude

deprecated, use decimal.longitude instead

latitude

deprecated, use decimal.latitude instead

Value

a list in which:

area data frame with the area remaining after cleaning as a proportion of the area before cleaning. The values vary from 0 to 1. The column r.geo.area holds the remaining area for all species in the geographical space, and r.env.area the remaining area in the environmental space.

comp data frame with the composition of species in sites (cells from raster layers) before cleaning (comp$comp.BC) and after cleaning (comp$comp.AC). The number of rows is equal to the number of cells in r, and the number of columns is equal to the number of species in occ.cl.

rich data frame with a single column with the richness of each site

site.coords data frame with the sites' coordinates. It makes it easier to build raster layers from the results using rasterFromXYZ
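As a toy illustration of how site richness relates to the composition matrices, the sketch below uses made-up presence/absence data. The object names mirror those in the Value list, but the matrices are invented, and the cell-count ratio is only a stand-in: how clean_eval measures area internally is not stated here.

```r
# Toy composition matrices: rows are sites (raster cells), columns are species
comp.BC <- matrix(c(1, 1, 0,
                    1, 0, 1,
                    0, 0, 1),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(NULL, c("sp_a", "sp_b", "sp_c")))
comp.AC <- comp.BC
comp.AC[1, "sp_b"] <- 0  # one record removed by the cleaning step

# Richness per site is the number of species present in that site
rich.BC <- rowSums(comp.BC > 0)
rich.AC <- rowSums(comp.AC > 0)

# Proportion of occupied cells kept per species
# (a cell-count proxy, not the area measure used by clean_eval)
remaining <- colSums(comp.AC > 0) / colSums(comp.BC > 0)
remaining
```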

Examples

## Not run: 

library(sp)
library(raster)


data("speciaLists") # list of specialists
data("cyathea.br") # occurrence dataset


# classify
occ.cl <- classify_occ(cyathea.br, speciaLists)

# delimit the geographic space
# land area
data("BR")


# Transform occurrence data into SpatialPoints
spdf.occ.cl <- sp::SpatialPoints(occ.cl[, c("decimalLongitude", "decimalLatitude")])


# load climate data
data("r.temp.prec") # mean temperature and annual precipitation
df.temp.prec <- raster::as.data.frame(r.temp.prec)

### Define the environmental space for analysis
# this function will create a boundary of available environmental space,
# analogous to the continent boundary in the geographical space
env.space <- define_env_space(df.temp.prec, buffer.size = 0.05)

# filter by year to be consistent with the environmental data
occ.class.1970 <-  occ.cl %>%
  dplyr::filter(year >= 1970)

### run the evaluation
cl.eval <- clean_eval(occ.class.1970,
                      env.space = env.space,
                      geo.space = BR,
                      r = r.temp.prec)

#area results
head(cl.eval$area)


### richness maps
## it makes sense if there are more than one species
rich.before.clean <- raster::rasterFromXYZ(cbind(cl.eval$site.coords,
                                                 cl.eval$rich$rich.BC))
rich.after.clean <- raster::rasterFromXYZ(cbind(cl.eval$site.coords,
                                                cl.eval$rich$rich.AC))

raster::plot(rich.before.clean)
raster::plot(rich.after.clean)

### species area map
comp.bc <- as.data.frame(cl.eval$comp$comp.BC)
comp.ac <- as.data.frame(cl.eval$comp$comp.AC)

c.villosa.bc <- raster::rasterFromXYZ(cbind(cl.eval$site.coords,
                                            comp.bc$`Cyathea villosa`))
c.villosa.ac <- raster::rasterFromXYZ(cbind(cl.eval$site.coords,
                                            comp.ac$`Cyathea villosa`))

raster::plot(c.villosa.bc)
raster::plot(c.villosa.ac)

## End(Not run)

Create specialist data frame from character vector

Description

Creates a specialist data frame ready for use in classify_occ from a character vector containing the specialists names

Usage

create_spec_df(spec.char)

Arguments

spec.char

a character vector with specialist names

Value

a data frame. Columns split the names, surname and abbreviations of the names. If the full name contains any special character, such as accent marks, two lines will be provided for that name, with and without the special characters. See examples.

Examples

# Example using Latin accent marks
data(spec_names_ex)

spec_names_ex
create_spec_df(spec_names_ex)
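The "two lines per accented name" behaviour can be illustrated with a base R sketch. The helper sketch_spec_rows, the example names and the small accent mapping are all hypothetical; this is not the implementation of create_spec_df and its output columns are simplified.

```r
# Hypothetical helper, not the package's create_spec_df
sketch_spec_rows <- function(full.name) {
  # Replace a few accented Latin letters with unaccented ones
  # (illustrative mapping, far from exhaustive)
  plain <- chartr("\u00e1\u00e9\u00ed\u00f3\u00fa\u00e3\u00f5\u00e7",
                  "aeiouaoc", full.name)
  variants <- unique(c(full.name, plain))

  rows <- lapply(variants, function(nm) {
    parts <- strsplit(nm, " +")[[1]]
    data.frame(LastName = parts[length(parts)],
               Name1    = parts[1],
               Abbrev1  = substr(parts[1], 1, 1),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

accented <- sketch_spec_rows("Jos\u00e9 Silva")  # with and without the accent
plain    <- sketch_spec_rows("Ana Souza")        # nothing to strip
accented
plain
```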

Occurrence records of Cyathea species in Brazil downloaded from Global Biodiversity Information Facility (GBIF).

Description

A filtered GBIF dataset containing 3851 occurrence records for fern species of the genus Cyathea in Brazil. We filtered the data after downloading from GBIF to ensure all occurrence records are from Brazil.

Usage

cyathea.br

Format

A data frame with 3851 rows and 50 variables

Source

GBIF.org (07 March 2021) GBIF Occurrence Download doi:10.15468/dl.qrhynv

Define environmental space for species occurrence

Description

Based on two continuous environmental variables, it defines a bi-dimensional environmental space.

Usage

define_env_space(env, buffer.size, plot = TRUE)

Arguments

env

matrix or data frame with two columns containing two environmental variables. The variables must be numeric, even for data frames.

buffer.size

numeric value indicating a buffer size around each point which will delimit the environmental geographical border for the occurrence point. See details.

plot

logical. Whether to plot the polygon. Default is TRUE.

Details

The environmental variables are standardized by range, which rescales each variable to run from 0 to 1. Then a buffer of size buffer.size is delimited around each point in this space, and a polygon is drawn to link these buffers. The function returns the polygon needed to link all points, and the area of the polygon indicates the size of the environmental space based on the variables used.
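The range standardization step can be sketched in base R as follows. The data values and the helper range01 are made up for illustration, and the buffer and polygon steps are not reproduced; the package may compute the standardization differently.

```r
# Toy environmental table: temperature and precipitation (made-up values)
env <- data.frame(temp = c(12, 18, 24, 30),
                  prec = c(800, 1200, 2000, 1600))

# Range standardization: (x - min) / (max - min), applied column by column
range01 <- function(x) {
  (x - min(x, na.rm = TRUE)) / (max(x, na.rm = TRUE) - min(x, na.rm = TRUE))
}
env.std <- as.data.frame(lapply(env, range01))

sapply(env.std, range)  # each column now spans 0 to 1
```

Because both axes run from 0 to 1 after this step, a single buffer.size such as 0.05 has the same meaning on each variable.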

Value

An object of sfc_POLYGON class

Examples

## Not run: 
library("raster")

# load climate data
data("r.temp.prec")
env.data <- raster::as.data.frame(r.temp.prec)

define_env_space(env.data, 0.05)

## End(Not run)

Filter occurrences in environmental space

Description

Filter the occurrence with the most reliable species identification in the environmental space. This function is based on the function envSample provided by Varela et al. (2014) and was adapted to the naturaList package to select the occurrence with the most reliable species identification in each environmental grid cell.

Usage

env_grid_filter(
  occ.cl,
  env.data,
  grid.res,
  institution.code = "institutionCode",
  collection.code = "collectionCode",
  catalog.number = "catalogNumber",
  year = "year",
  date.identified = "dateIdentified",
  species = "species",
  identified.by = "identifiedBy",
  decimal.latitude = "decimalLatitude",
  decimal.longitude = "decimalLongitude",
  basis.of.record = "basisOfRecord",
  media.type = "mediaType",
  occurrence.id = "occurrenceID"
)

Arguments

occ.cl

data frame with occurrence records information already classified by classify_occ function.

env.data

data frame with rows for occurrence observation and columns for each environmental variable

grid.res

numeric vector. Each value represents the bin width on the scale of the corresponding environmental variable. The order of this vector is assumed to match the order of the variables in the env.data data frame.

institution.code

column name of occ.cl with the name (or acronym) in use by the institution having custody of the object(s) or information referred to in the record.

collection.code

column name of occ.cl with the name, acronym, code, or initials identifying the collection or data set from which the record was derived.

catalog.number

column name of occ.cl with an identifier (preferably unique) for the record within the data set or collection.

year

column name of occ.cl with the four-digit year in which the Event occurred, according to the Common Era Calendar.

date.identified

Column name of occ.cl with the date on which the subject was determined as representing the Taxon.

species

column name of occ.cl with the species names.

identified.by

column name of occ.cl with the name of who determined the species.

decimal.latitude

column name of occ.cl with the latitude in decimal degrees.

decimal.longitude

column name of occ.cl with the longitude in decimal degrees.

basis.of.record

column name with the specific nature of the data record. See details.

media.type

column name of occ.cl with the media type of recording. See details.

occurrence.id

column name of occ.cl with the link or code for the occurrence record. See the Darwin Core format.

Value

Data frame with the same columns of occ.cl.
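The idea of keeping one record per environmental grid cell can be sketched in base R. Everything below is illustrative: the records are made up, the bins are built with floor(value / width), level codes are assumed to follow the "rank_criterion" pattern of "1_det_by_spec", and ties are broken by taking the first record. The package's actual binning and tie-breaking rules may differ.

```r
# Toy classified records (made-up data)
occ.cl <- data.frame(
  id = 1:5,
  naturaList_levels = c("2_not_spec_name", "1_det_by_spec", "4_sci_collection",
                        "3_image", "1_det_by_spec"),
  stringsAsFactors = FALSE
)
env.data <- data.frame(temp = c(21, 23, 27, 28, 36),
                       prec = c(1010, 1090, 1450, 1420, 900))
grid.res <- c(5, 100)  # bin widths, in the same order as the columns of env.data

# Bin index of every record along each variable, then one label per grid cell
bins <- mapply(function(x, w) floor(x / w), env.data, grid.res)
cell <- paste(bins[, 1], bins[, 2], sep = "_")

# Within each cell keep the record with the lowest rank (highest confidence)
rank <- as.integer(sub("_.*$", "", occ.cl$naturaList_levels))
keep <- unlist(lapply(split(seq_len(nrow(occ.cl)), cell),
                      function(i) i[which.min(rank[i])]))
occ.filtered <- occ.cl[sort(keep), ]
occ.filtered
```

Records 1 and 2 share a cell, as do records 3 and 4, so only the better-ranked record of each pair is kept.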

References

Varela et al. (2014). Environmental filters reduce the effects of sampling bias and improve predictions of ecological niche models. *Ecography*. 37(11) 1084-1091.

Examples


## Not run: 
library(naturaList)
library(tidyverse)

data("cyathea.br")
data("speciaLists")
data("r.temp.prec")

occ <- cyathea.br %>%
  filter(species == "Cyathea atrovirens")

occ.cl <- classify_occ(occ, speciaLists, spec.ambiguity = "is.spec")

# temperature and precipitation data
env.data <- raster::extract(
  r.temp.prec,
  occ.cl[,c("decimalLongitude", "decimalLatitude")]
) %>% as.data.frame()

# bins are 5 degrees wide for temperature and 100 mm wide for precipitation
grid.res <- c(5, 100)

occ.filtered <- env_grid_filter(
  occ.cl,
  env.data,
  grid.res
)


## End(Not run)

Internal function of naturaList - Detect if a string has a specialist name

Description

Detects whether a string with identifiers' names contains a specialist name. It is used inside classify_occ.

Usage

func.det.by.esp(sp.df, i, specialist)

Arguments

sp.df

reduced version of occurrence data frame provided in classify_occ

i

row number of specialist data frame

specialist

specialist data

Value

integer with the row numbers of the sp.df data frame that were identified by the specialist named in row i.

Get the names in the 'identified.by' column

Description

This function facilitates the search for non-taxonomist strings in the 'identified.by' column of occurrence records data set

Usage

get_det_names(
  occ,
  identified.by = "identifiedBy",
  freq = FALSE,
  decreasing = TRUE,
  determined.by
)

Arguments

occ

data frame with occurrence records information.

identified.by

column name of occ with the name of who determined the species.

freq

logical. If TRUE, the output contains the number of times each string is repeated in the identified.by column. Default = FALSE.

decreasing

logical. sort strings in decreasing order of frequency. Default = TRUE.

determined.by

deprecated, use identified.by instead.

Value

character vector containing the strings in the identified.by column of occ. If freq = TRUE, it returns a data frame with two columns: 'strings' and 'frequency'.
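A base R sketch of the two output shapes described above, using made-up identifier strings. The column names 'strings' and 'frequency' come from the Value description; the rest is an assumption about how such a table could be built, not the package's code.

```r
# Made-up identifier strings standing in for an identifiedBy column
identified.by <- c("A. Silva", "Unknown", "A. Silva", "", "B. Costa",
                   "A. Silva", "Unknown")

# Distinct strings, analogous to the freq = FALSE output
det.names <- unique(identified.by)
det.names

# Strings with their counts, most frequent first
# (analogous to freq = TRUE with decreasing = TRUE)
counts <- sort(table(identified.by), decreasing = TRUE)
det.freq <- data.frame(strings = names(counts),
                       frequency = as.integer(counts),
                       stringsAsFactors = FALSE)
det.freq
```

Scanning such a table is a quick way to spot strings like "Unknown" or "" that are candidates for ignore.det.names in classify_occ.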

Examples

data("A.setosa")
get_det_names(A.setosa, freq = TRUE)

Filter the occurrence with most confidence in species identification inside grid cells

Description

In each grid cell it selects the occurrence with the highest confidence level in species identification made by classify_occ function.

Usage

grid_filter(
  occ.cl,
  grid.resolution = c(0.5, 0.5),
  r = NULL,
  institution.code = "institutionCode",
  collection.code = "collectionCode",
  catalog.number = "catalogNumber",
  year = "year",
  date.identified = "dateIdentified",
  species = "species",
  identified.by = "identifiedBy",
  decimal.latitude = "decimalLatitude",
  decimal.longitude = "decimalLongitude",
  basis.of.record = "basisOfRecord",
  media.type = "mediaType",
  occurrence.id = "occurrenceID",
  institution.source,
  year.event,
  scientific.name,
  determined.by,
  latitude,
  longitude,
  basis.of.rec,
  occ.id
)

Arguments

occ.cl

data frame with occurrence records information already classified by classify_occ function.

grid.resolution

numeric vector with width and height of grid cell in decimal degrees.

r

raster from which the grid cell resolution is derived.

institution.code

column name of occ.cl with the name (or acronym) in use by the institution having custody of the object(s) or information referred to in the record.

collection.code

column name of occ.cl with the name, acronym, code, or initials identifying the collection or data set from which the record was derived.

catalog.number

column name of occ.cl with an identifier (preferably unique) for the record within the data set or collection.

year

column name of occ.cl with the four-digit year in which the Event occurred, according to the Common Era Calendar.

date.identified

Column name of occ.cl with the date on which the subject was determined as representing the Taxon.

species

column name of occ.cl with the species names.

identified.by

column name of occ.cl with the name of who determined the species.

decimal.latitude

column name of occ.cl with the latitude in decimal degrees.

decimal.longitude

column name of occ.cl with the longitude in decimal degrees.

basis.of.record

column name with the specific nature of the data record. See details.

media.type

column name of occ.cl with the media type of recording. See details.

occurrence.id

column name of occ.cl with the link or code for the occurrence record. See the Darwin Core format.

institution.source

deprecated, use institution.code instead.

year.event

deprecated, use year instead.

scientific.name

deprecated, use species instead.

determined.by

deprecated, use identified.by instead

latitude

deprecated, use decimal.latitude instead

longitude

deprecated, use decimal.longitude instead

basis.of.rec

deprecated, use basis.of.record instead.

occ.id

deprecated, use occurrence.id instead

Value

Data frame with the same columns of occ.cl.

Author(s)

Arthur V. Rodrigues

Examples


## Not run: 

data("A.setosa")
data("speciaLists")

occ.class <- classify_occ(A.setosa, speciaLists)
occ.grid <- grid_filter(occ.class)


## End(Not run)

Internal function of naturaList - Identifies if an occurrence has a name for the identifier of the specimen

Description

Identifies if an occurrence has a name for the identifier of the specimen. It is used inside classify_occ.

Usage

has.det.ID(sp.df, ignore.det.names = NULL)

Arguments

sp.df

reduced version of occurrence data frame provided in classify_occ

ignore.det.names

character vector indicating strings in the identified.by column that should be ignored as a taxonomist. See classify_occ.

Value

an integer vector indicating the rows which have 'identified by' ID

Internal function of naturaList - Create SpatialPolygons from a list of coordinates

Description

Create SpatialPolygons from a list of coordinates. It is used in map_module

Usage

make.polygon(df)

Arguments

df

a data frame provided by pol.coords

Value

a SpatialPolygon object

Check the occurrence records in an interactive map module

Description

Allows to delete occurrence records and to select occurrence points by classification levels or by drawing spatial polygons.

Usage

map_module(
  occ.cl,
  action = "clean",
  institution.code = "institutionCode",
  collection.code = "collectionCode",
  catalog.number = "catalogNumber",
  year = "year",
  date.identified = "dateIdentified",
  species = "species",
  identified.by = "identifiedBy",
  decimal.latitude = "decimalLatitude",
  decimal.longitude = "decimalLongitude",
  basis.of.record = "basisOfRecord",
  media.type = "mediaType",
  occurrence.id = "occurrenceID",
  institution.source,
  year.event,
  scientific.name,
  determined.by,
  latitude,
  longitude,
  basis.of.rec,
  occ.id
)

Arguments

occ.cl

Data frame with occurrence records information already classified by classify_occ function.

action

a string, either '"clean"' or '"flag"', that defines what the 'map_module' function does with the occurrence dataset. Default is '"clean"'. If the string is '"clean"', the returned dataset contains only the occurrence records selected by the user. If the string is '"flag"', a column named 'map_module_flag' is added to the output dataset, with the tags 'selected' and 'deleted' following the choices of the user in the application.

institution.code

column name of occ.cl with the name (or acronym) in use by the institution having custody of the object(s) or information referred to in the record.

collection.code

column name of occ.cl with the name, acronym, code, or initials identifying the collection or data set from which the record was derived.

catalog.number

column name of occ.cl with an identifier (preferably unique) for the record within the data set or collection.

year

column name of occ.cl with the four-digit year in which the Event occurred, according to the Common Era Calendar.

date.identified

column name of occ.cl with the date on which the subject was determined as representing the Taxon.

species

column name of occ.cl with the species names.

identified.by

column name of occ.cl with the name of who determined the species.

decimal.latitude

column name of occ.cl with the latitude in decimal degrees.

decimal.longitude

column name of occ.cl with the longitude in decimal degrees.

basis.of.record

column name with the specific nature of the data record. See details.

media.type

column name of occ.cl with the media type of recording. See details.

occurrence.id

column name of occ.cl with the link or code for the occurrence record. See the Darwin Core format.

institution.source

deprecated, use institution.code instead.

year.event

deprecated, use year instead.

scientific.name

deprecated, use species instead.

determined.by

deprecated, use identified.by instead

latitude

deprecated, use decimal.latitude instead

longitude

deprecated, use decimal.longitude instead

basis.of.rec

deprecated, use basis.of.record instead.

occ.id

deprecated, use occurrence.id instead

Value

Data frame with the same columns of occ.cl.
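The difference between the two values of action can be sketched without launching the application. The helper apply_action and the logical vector selected are hypothetical stand-ins for the user's choices in the shiny app; only the column name map_module_flag and the tags 'selected' and 'deleted' come from the argument description.

```r
# Made-up classified records and a stand-in for the user's choices in the app
occ.cl <- data.frame(id = 1:4, species = "Alsophila setosa",
                     stringsAsFactors = FALSE)
selected <- c(TRUE, FALSE, TRUE, TRUE)

# Hypothetical helper mirroring the two documented actions
apply_action <- function(occ, selected, action = "clean") {
  if (action == "clean") {
    # keep only the records the user selected
    return(occ[selected, , drop = FALSE])
  }
  if (action == "flag") {
    # keep every record and tag it instead
    occ$map_module_flag <- ifelse(selected, "selected", "deleted")
    return(occ)
  }
  stop("action must be \"clean\" or \"flag\"")
}

occ.clean <- apply_action(occ.cl, selected, "clean")
occ.flag  <- apply_action(occ.cl, selected, "flag")
occ.clean
occ.flag
```

With "flag" no rows are dropped, so the decision to remove the 'deleted' records can be postponed or reviewed later.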

Author(s)

Arthur V. Rodrigues

Examples

## Not run: 
data("A.setosa")
data("speciaLists")

occ.class <- classify_occ(A.setosa, speciaLists)
occ.selected <- map_module(occ.class)
occ.selected


## End(Not run)

Internal function of naturaList - Get coordinates from polygons created in leaflet map

Description

Get coordinates from polygons created in leaflet map. It is used in map_module

Usage

pol.coords(input.polig)

Arguments

input.polig

an interactive polygon from leaflet map. input$map_draw_all_features$features[[i]]

Value

a data frame with the coordinates

Internal function of naturaList - Return specialists names in a collapsed string

Description

Return specialists names in a collapsed string to be used in the internal function specialist.conference

Usage

pttn.all.specialist(specialist)

Arguments

specialist

specialist data frame

Value

character. A regex pattern for the specialist full name

Raster of temperature and precipitation

Description

Raster of Annual Mean Temperature (bio1) and Total Annual Precipitation (bio12). Layers were downloaded from the WorldClim database and cropped to the extent of cyathea.br with a buffer of 100 km.

Usage

r.temp.prec

Format

A raster with two layers

Internal function of naturaList - Reduce the occurrence data.frame to the minimum set of columns

Description

Reduces the occurrence data.frame to the columns required by classify_occ, to facilitate internal operations.

Usage

reduce.df(
  df,
  institution.code = "institutionCode",
  collection.code = "collectionCode",
  catalog.number = "catalogNumber",
  year = "year",
  date.identified = "dateIdentified",
  species = "species",
  identified.by = "identifiedBy",
  decimal.latitude = "decimalLatitude",
  decimal.longitude = "decimalLongitude",
  basis.of.record = "basisOfRecord",
  media.type = "mediaType",
  occurrence.id = "occurrenceID",
  na.rm.coords = TRUE
)

Arguments

df

occurrence data frame provided in classify_occ

institution.code

institution.code = "institutionCode"

collection.code

collection.code = "collectionCode"

catalog.number

catalog.number = "catalogNumber"

year

year = "year"

date.identified

date.identified = "dateIdentified"

species

species = "species"

identified.by

identified.by = "identifiedBy"

decimal.latitude

decimal.latitude = "decimalLatitude"

decimal.longitude

decimal.longitude = "decimalLongitude"

basis.of.record

basis.of.record = "basisOfRecord"

media.type

media.type = "mediaType"

occurrence.id

occurrence.id = "occurrenceID"

na.rm.coords

na.rm.coords = TRUE

Value

a data frame with only the columns required for the naturaList package

Internal function of naturaList - Remove duplicate occurrence

Description

Removes duplicated occurrences based on coordinates. It is used in grid_filter.

Usage

rm.coord.dup(x, decimal.latitude, decimal.longitude)

Arguments

x

data frame with filtered occurrences

decimal.latitude

name of column with decimal.latitude

decimal.longitude

name of column with decimal.longitude

Value

data frame with occurrence records
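Removing records that share the same coordinates can be sketched in base R as below. The helper drop_coord_dups and the data are hypothetical. Which of the duplicated records the package keeps is not stated here; the sketch simply keeps the first one.

```r
# Made-up records; "a" and "b" share the same coordinates
x <- data.frame(
  occurrenceID = c("a", "b", "c", "d"),
  decimalLatitude = c(-27.5, -27.5, -30.1, -27.5),
  decimalLongitude = c(-48.5, -48.5, -51.2, -49.0),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for rm.coord.dup:
# keep the first record for each latitude/longitude pair
drop_coord_dups <- function(x, decimal.latitude, decimal.longitude) {
  dup <- duplicated(x[, c(decimal.latitude, decimal.longitude)])
  x[!dup, , drop = FALSE]
}

x.unique <- drop_coord_dups(x, "decimalLatitude", "decimalLongitude")
x.unique
```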

Example of specialist names with accent marks

Description

Example of specialist names with accent marks

Usage

spec_names_ex

Format

character

Specialists of ferns and lycophytes of Brazil

Description

A dataset containing the specialists of ferns and lycophytes of Brazil formatted to be used by naturaList package. This data serves as a format example for spec argument in classify_occ.

Usage

speciaLists

Format

A data frame with 27 rows and 8 columns:

LastName: Last name of the specialist.
Name1: Column with a name of the specialist. Name* columns can be repeated as many times as needed; in this dataset there are four (Name1 to Name4).
Name2: Column with a name of the specialist.
Name3: Column with a name of the specialist.
Name4: Column with a name of the specialist.
Abbrev1: Column with the abbreviation (one character) of a name of the specialist. Abbrev* columns can be repeated as many times as needed; in this dataset there are three (Abbrev1 to Abbrev3).
Abbrev2: Column with the abbreviation (one character) of a name of the specialist.
Abbrev3: Column with the abbreviation (one character) of a name of the specialist.

Source

The specialists' names were derived from the authors of the paper: doi:10.1590/2175-7860201566410

Internal function of naturaList - Confirm if an occurrence record was identified by a specialist without ambiguity

Description

Confirm if an occurrence record was identified by a specialist without ambiguity. It is used inside classify_occ

Usage

specialist.conference(pt.df, specialist)

Arguments

pt.df

a line of the reduced version of the occurrence data frame

specialist

specialist data frame

Value

character with naturaList level code "1_det_by_spec" or "1_det_by_spec_verify"

Internal function of naturaList - Verify if a string has unambiguous specialist name

Description

Based on the pattern generated by pttn.all.specialist, it verifies whether a string has an unambiguous specialist name. It is used in the internal function specialist.conference.

Usage

verify.specialist(pattern, string)

Arguments

pattern

a pattern from pttn.all.specialist function

string

string with the name of who identified the specimen

Value

character. "" or "_verify".
