Title: Interface for the 'Finnish Biodiversity Information Facility' API
Version: 0.9.10
Description: A programmatic interface to the 'Finnish Biodiversity Information Facility' ('FinBIF') API (https://api.laji.fi). 'FinBIF' aggregates Finnish biodiversity data from multiple sources in a single open access portal for researchers, citizen scientists, industry and government. 'FinBIF' allows users of biodiversity information to find, access, combine and visualise data on Finnish plants, animals and microorganisms. The 'finbif' package makes the publicly available data in 'FinBIF' easily accessible to programmers. Biodiversity information is available on taxonomy and taxon occurrence. Occurrence data can be filtered by taxon, time, location and other variables. The data accessed are conveniently preformatted for subsequent analyses.
License: MIT + file LICENSE
URL: https://github.com/luomus/finbif, https://luomus.github.io/finbif/
BugReports: https://github.com/luomus/finbif/issues
VignetteBuilder: knitr
Encoding: UTF-8
Language: en-US
Depends: R (≥ 3.5.0)
Imports: digest, httr, jsonlite, lutz, utils
RoxygenNote: 7.3.2
Suggests: callr, data.table, DBI, future, knitr, rmarkdown, RSQLite, testthat (≥ 3.0.0), vcr (≥ 0.6.0), webfakes
Config/testthat/edition: 3
X-schema.org-applicationCategory: Biodiversity
X-schema.org-keywords: api, biodiversity, biodiversity-informatics, biodiversity-information, finbif, finbif-access, occurrences, r-package, r-programming, rstats, species, specimens, taxon, taxonomy, web-services
X-schema.org-isPartOf: https://species.fi
NeedsCompilation: no
Packaged: 2025-04-15 09:53:32 UTC; atlanci
Author: Finnish Museum of Natural History - Luomus [cph], William K. Morris [aut, cre]
Maintainer: William K. Morris <willi@mmorris.email>
Repository: CRAN
Date/Publication: 2025-04-15 10:10:02 UTC

finbif: Interface for the 'Finnish Biodiversity Information Facility' API

Description


A programmatic interface to the 'Finnish Biodiversity Information Facility' ('FinBIF') API (https://api.laji.fi). 'FinBIF' aggregates Finnish biodiversity data from multiple sources in a single open access portal for researchers, citizen scientists, industry and government. 'FinBIF' allows users of biodiversity information to find, access, combine and visualise data on Finnish plants, animals and microorganisms. The 'finbif' package makes the publicly available data in 'FinBIF' easily accessible to programmers. Biodiversity information is available on taxonomy and taxon occurrence. Occurrence data can be filtered by taxon, time, location and other variables. The data accessed are conveniently preformatted for subsequent analyses.

Package options

finbif_api_url

Character. The base url of the API to query. Default: "https://api.laji.fi"

finbif_api_version

Character. The API version to use. Default: "v0"

finbif_allow_query

Logical. Should remote API queries be allowed? Default: TRUE

finbif_use_cache

Logical or Integer. If TRUE or a number greater than zero, then data-caching will be used. If not logical then cache will be invalidated after the number of hours indicated by the value. Default: TRUE

finbif_use_cache_metadata

Logical or Integer. If TRUE or a number greater than zero, then metadata-caching will be used. If not logical then cache will be invalidated after the number of hours indicated by the value. Default: TRUE

finbif_cache_path

Character. The path to the directory in which to store cached API queries. If unset (the default), in-memory caching is used.

finbif_tz

Character. The timezone used by finbif functions that compute dates and times. Default: Sys.timezone()

finbif_locale

Character. One of the supported two-letter ISO 639-1 language codes. Current supported languages are English, Finnish and Swedish. By default, the system settings are used to set this option if they are set to one of the supported languages, otherwise English is used.

finbif_hide_progress

Logical. Global option to suppress progress indicators for downloading, importing and processing FinBIF records. Default: FALSE
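
The options above are ordinary R session options, so they can be set with options() and read with getOption(). The following is a minimal sketch using only base R; the particular values chosen ("fi", "Europe/Helsinki") are illustrative assumptions, not recommendations.

```r
# Set a few finbif options for the current session and keep the old values.
old <- options(
  finbif_locale        = "fi",              # prefer Finnish for multilingual data
  finbif_tz            = "Europe/Helsinki", # timezone for computed dates and times
  finbif_hide_progress = TRUE               # suppress progress indicators
)

# Read an option back.
getOption("finbif_locale")

# Restore the previous values when done.
options(old)
```

Because options() returns the previous values, the same pattern can be used inside a function to change settings temporarily.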

Author(s)

Maintainer: William K. Morris <willi@mmorris.email>

Other contributors: Finnish Museum of Natural History - Luomus [copyright holder]

See Also

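Useful links: https://github.com/luomus/finbif, https://luomus.github.io/finbif/. Report bugs at https://github.com/luomus/finbif/issues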


Caching FinBIF downloads

Description

Working with cached data from FinBIF.

Turning caching off

By default, local caching of most FinBIF API requests is turned on. Any request made using the same arguments will only request data from FinBIF in the first instance and subsequent requests will use the local cache while it exists. This will increase the speed of repeated requests and save bandwidth and computation for the FinBIF server. Caching can be turned off temporarily by setting cache = c(FALSE, FALSE) in the requesting function.

Setting options(finbif_use_cache = FALSE, finbif_use_cache_metadata = FALSE) will turn off caching for the current session.

Using filesystem caching

By default cached requests are stored in memory. This can be changed by setting the file path for the current session with options(finbif_cache_path = "path/to/cache").
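
As a minimal sketch of filesystem caching, the snippet below creates a directory and points the option at it. The location under tempdir() is a hypothetical choice for illustration; a persistent directory would normally be more useful.

```r
# Hypothetical cache location; tempdir() is removed when the R session ends.
cache_dir <- file.path(tempdir(), "finbif-cache")
dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)

# Store cached requests in that directory for the rest of the session.
options(finbif_cache_path = cache_dir)
```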

Using database caching

Caching can also be done using a database. Using a database for caching requires the DBI package and a database backend package such as RSQLite to be installed. To use the database for caching, simply pass the connection object created with DBI::dbConnect to the finbif_cache_path option (e.g., db <- DBI::dbConnect(RSQLite::SQLite(), "my-db.sqlite"); options(finbif_cache_path = db) ).
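
The database setup can be sketched as a full open, use, close cycle. This assumes DBI and RSQLite are installed; the file name under tempdir() is hypothetical, and unsetting the option before disconnecting is a precaution of this sketch rather than a documented requirement.

```r
# Only run when both database packages are available.
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  db_file <- file.path(tempdir(), "finbif-cache.sqlite")  # hypothetical path
  db <- DBI::dbConnect(RSQLite::SQLite(), db_file)

  # Use the open connection as the cache for this session.
  options(finbif_cache_path = db)

  # ... FinBIF requests made here would be cached in the database ...

  # Unset the option (back to in-memory caching), then close the connection.
  options(finbif_cache_path = NULL)
  DBI::dbDisconnect(db)
}
```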

Timeouts

A cache timeout can be set by using an integer (number of hours until cache is considered invalid and is cleared) instead of a logical value for the finbif_use_cache and finbif_use_cache_metadata options or the cache function arguments.
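
For instance, a session-wide timeout might be set as follows. The specific durations (12 hours for data, one week for metadata) are arbitrary values chosen for illustration.

```r
# Treat cached occurrence data as stale after 12 hours and
# cached metadata as stale after 168 hours (one week).
options(
  finbif_use_cache          = 12L,
  finbif_use_cache_metadata = 168L
)
```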

Clearing the cache

The cache can be reset using finbif_clear_cache().

Updating the cache

The cache can be updated using finbif_update_cache().


Filtering FinBIF records

Description

Filters available for FinBIF records and occurrence data.

Taxa

Filters related to taxa include:

Location

Filters related to location of record include:

Time

Filters related to time of record include:

Quality

Filters related to quality of record:

Misc

Other filters:


Check FinBIF taxa

Description

Check that taxa are in the FinBIF database.

Usage

finbif_check_taxa(taxa, cache = getOption("finbif_use_cache"))

Arguments

taxa

Character (or list of named character) vector(s). If a list, each vector can have the name of a taxonomic rank (genus, species, etc.). The elements of the vectors should be the taxa to check.

cache

Logical or Integer. If TRUE or a number greater than zero, then data-caching will be used. If not logical then cache will be invalidated after the number of hours indicated by the argument.

Value

An object of class finbif_taxa. A list with the same form as taxa.

Examples

## Not run: 

# Check a scientific name
finbif_check_taxa("Cygnus cygnus")

# Check a common name
finbif_check_taxa("Whooper swan")

# Check a genus
finbif_check_taxa("Cygnus")

# Check a list of taxa
finbif_check_taxa(
  list(
    species = c("Cygnus cygnus", "Ursus arctos"),
    genus   = "Betula"
  )
)

## End(Not run)

Clear cache

Description

Remove cached FinBIF API requests.

Usage

finbif_clear_cache()

Examples

## Not run: 

finbif_clear_cache()


## End(Not run)

FinBIF collections

Description

Get information on collections in the FinBIF database.

Usage

finbif_collections(
  filter,
  select,
  subcollections = TRUE,
  supercollections = FALSE,
  locale = getOption("finbif_locale"),
  nmin = 0,
  cache = getOption("finbif_use_cache_metadata")
)

Arguments

filter

Logical. Expression indicating elements or rows to keep: missing values are taken as false.

select

Expression. Indicates columns to select from the data frame.

subcollections

Logical. Return subcollection metadata of higher level collections.

supercollections

Logical. Return lowest level collection metadata.

locale

Character. Language of data returned. One of "en", "fi", or "sv".

nmin

Integer. Filter collections by number of records. Only return information on collections with more records than the value specified. If NA then return information on all collections.

cache

Logical or Integer. If TRUE or a number greater than zero, then data-caching will be used. If not logical then cache will be invalidated after the number of hours indicated by the argument.

Value

A data.frame.

Examples

## Not run: 

# Get collection metadata
collections <- finbif_collections()


## End(Not run)

FinBIF informal groups

Description

Display the informal taxonomic groups used in the FinBIF database.

Usage

finbif_informal_groups(
  group,
  limit = 5,
  quiet = FALSE,
  locale = getOption("finbif_locale"),
  cache = getOption("finbif_use_cache_metadata")
)

Arguments

group

Character. Optional. If supplied, only display this top-level group and its subgroups.

limit

Integer. The maximum number of top-level informal groups (and their subgroups) to display.

quiet

Logical. Return informal group names without displaying them.

locale

Character. One of the supported two-letter ISO 639-1 language codes. Current supported languages are English, Finnish and Swedish. For data where more than one language is available the language denoted by locale will be preferred while falling back to the other languages in the order indicated above.

cache

Logical or Integer. If TRUE or a number greater than zero, then data-caching will be used. If not logical then cache will be invalidated after the number of hours indicated by the argument.

Value

A character vector (invisibly).

Examples

## Not run: 

# Display the informal taxonomic groups used by FinBIF
finbif_informal_groups()


## End(Not run)

Get last modified date for FinBIF occurrence records

Description

Get last modified date for filtered occurrence data from FinBIF.

Usage

finbif_last_mod(..., filter)

Arguments

...

Character vectors or list of character vectors. Taxa of records to download.

filter

List of named character vectors. Filters to apply to records.

Value

A Date object.

Examples

## Not run: 

# Get last modified date for Whooper Swan occurrence records from Finland
finbif_last_mod("Cygnus cygnus", filter = c(country = "Finland"))


## End(Not run)

FinBIF metadata

Description

Display metadata from the FinBIF database.

Usage

finbif_metadata(
  which,
  locale = getOption("finbif_locale"),
  cache = getOption("finbif_use_cache_metadata")
)

Arguments

which

Character. Which category of metadata to display. If unspecified, function returns the categories of metadata available.

locale

Character. One of the supported two-letter ISO 639-1 language codes. Current supported languages are English, Finnish and Swedish. For data where more than one language is available the language denoted by locale will be preferred while falling back to the other languages in the order indicated above.

cache

Logical or Integer. If TRUE or a number greater than zero, then data-caching will be used. If not logical then cache will be invalidated after the number of hours indicated by the argument.

Value

A data.frame.

Examples

## Not run: 

finbif_metadata("red_list")


## End(Not run)
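
As described for the which argument, calling the function without it returns the available categories, which can then be used to pick one. The sketch below is guarded so that it only runs in an interactive session with finbif installed; it also assumes network access and a FinBIF access token are available.

```r
category <- "red_list"  # category name taken from the example above

if (interactive() && requireNamespace("finbif", quietly = TRUE)) {
  # List the categories of metadata that are available.
  print(finbif::finbif_metadata())

  # Fetch one category, asking for English where several languages exist.
  red_list <- finbif::finbif_metadata(category, locale = "en")
  str(red_list)
}
```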

Download FinBIF occurrence records

Description

Download filtered occurrence data from FinBIF as a data.frame.

Usage

finbif_occurrence(
  ...,
  filter = NULL,
  select = NULL,
  order_by = NULL,
  aggregate = "none",
  sample = FALSE,
  n = 10,
  page = 1,
  count_only = FALSE,
  quiet = getOption("finbif_hide_progress"),
  cache = getOption("finbif_use_cache"),
  dwc = FALSE,
  date_time_method = NULL,
  check_taxa = TRUE,
  on_check_fail = c("warn", "error"),
  tzone = getOption("finbif_tz"),
  locale = getOption("finbif_locale"),
  seed = NULL,
  drop_na = FALSE,
  aggregate_counts = TRUE,
  exclude_na = FALSE,
  unlist = FALSE,
  facts = NULL,
  duplicates = FALSE,
  filter_col = NULL,
  restricted_api = NULL
)

Arguments

...

Character vectors or list of character vectors. Taxa of records to download.

filter

List of named character vectors. Filters to apply to records.

select

Character vector. Variables to return. If not specified, a default set of commonly used variables will be used. Use "default_vars" as a shortcut for this set. Variables can be deselected by prepending a - to the variable name. If only deselects are specified the default set of variables without the deselection will be returned.

order_by

Character vector. Variables to order records by before they are returned. Most, though not all, variables can be used for ordering. Ordering is ascending by default. To return in descending order, prepend a - to the variable name (e.g., "-date_start"). Default order is "-date_start" > "-load_date" > "reported_name" > "record_id".

aggregate

Character. If "none" (default), returns full records. If one or more of "records", "species", "taxa", "individuals", "pairs", "events" or "documents", aggregates combinations of the selected variables by counting records, species, taxa, individuals, pairs, events or documents. Aggregation by events or documents cannot be done in combination with any of the other aggregation types.

sample

Logical. If TRUE randomly sample the records from the FinBIF database.

n

Integer. How many records to download/import.

page

Integer. Which page of records to start downloading from.

count_only

Logical. Only return the number of records available.

quiet

Logical. Suppress the progress indicator for multipage downloads. Defaults to value of option finbif_hide_progress.

cache

Logical or Integer. If TRUE or a number greater than zero, then data-caching will be used. If not logical then the cache will be invalidated after the number of hours indicated by the argument. If a length one vector is used, its value will only apply to caching occurrence records. If the value is length two, then the second element will determine how metadata is cached.

dwc

Logical. Use Darwin Core (or Darwin Core style) variable names.

date_time_method

Character. Passed to lutz::tz_lookup_coords() when date_time and/or duration variables have been selected. Default is "fast" when fewer than 100,000 records are requested and "none" when more are requested. Method "none" assumes all records are in timezone "Europe/Helsinki". Use date_time_method = "accurate" (requires package sf) for greater accuracy at the cost of slower computation.

check_taxa

Logical. Check first that taxa are in the FinBIF database. If TRUE, only records that match known taxa (have a valid taxon ID) are returned.

on_check_fail

Character. What to do if a taxon is found to be invalid. One of "warn" (default) or "error".

tzone

Character. The timezone of the returned date-time, used when date_time has been selected. Defaults to system timezone.

locale

Character. One of the supported two-letter ISO 639-1 language codes. Current supported languages are English, Finnish and Swedish. For data where more than one language is available the language denoted by locale will be preferred while falling back to the other languages in the order indicated above.

seed

Integer. Set a seed for randomly sampling records.

drop_na

Logical. A vector indicating which columns to check for missing data. Values recycled to the number of columns. Defaults to all columns.

aggregate_counts

Logical. Should count variables be returned when using aggregation?

exclude_na

Logical. Should only records where all selected variables have non-NA values be returned?

unlist

Logical. Should variables that contain non-atomic data be concatenated into a string separated by ";"?

facts

Character vector. Extra variables to be extracted from record, event and document "facts".

duplicates

Logical. If TRUE, allow duplicate records/aggregated records when making multi-filter set requests. If FALSE (default) duplicate records are removed.

filter_col

Character. The name of a column, with values derived from the names of the filter sets used when using multiple filters, to include when using multiple filter sets. If NULL (default), no column is included.

restricted_api

Character. If using a restricted data API token in addition to a personal access token, a string indicating the name of an environment variable storing the restricted data API token.

Value

A data.frame. If count_only = TRUE an integer.

Examples

## Not run: 

# Get recent occurrence data for taxon
finbif_occurrence("Cygnus cygnus")

# Specify the number of records
finbif_occurrence("Cygnus cygnus", n = 100)

# Get multiple taxa
finbif_occurrence("Cygnus cygnus", "Ursus arctos")

# Filter the records
finbif_occurrence(
  species = "Cygnus cygnus",
  filter = list(coordinate_accuracy_max = 100)
)


## End(Not run)
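
Several of the arguments documented above can be combined in one request. The following is a sketch rather than a definitive recipe: the variable and filter names are taken from elsewhere in this manual, and it is an assumption that "date_start" is accepted by select as well as by order_by. The queries are guarded so they only run in an interactive session with finbif installed, and they also need network access and a FinBIF access token.

```r
# Arguments are built up front so they can be inspected without querying FinBIF.
vars    <- c("record_id", "scientific_name", "date_start")
filters <- list(country = "Finland", coordinate_accuracy_max = 100)

if (interactive() && requireNamespace("finbif", quietly = TRUE)) {
  # How many records match the filters? Nothing is downloaded.
  n_swans <- finbif::finbif_occurrence(
    "Cygnus cygnus", filter = filters, count_only = TRUE
  )
  print(n_swans)

  # Oldest records first, restricted to the chosen variables.
  swans <- finbif::finbif_occurrence(
    "Cygnus cygnus",
    filter   = filters,
    select   = vars,
    order_by = "date_start",
    n        = 50
  )

  # A random sample of records, with a seed set for the sampling.
  swan_sample <- finbif::finbif_occurrence(
    "Cygnus cygnus",
    sample = TRUE,
    seed   = 42,
    n      = 50
  )
}
```

Checking count_only = TRUE first is a cheap way to decide on a sensible n before downloading records.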

Load FinBIF occurrence records from a file

Description

Load occurrence data from a file as a data.frame.

Usage

finbif_occurrence_load(
  file,
  select = NULL,
  n = -1,
  count_only = FALSE,
  quiet = getOption("finbif_hide_progress"),
  cache = getOption("finbif_use_cache"),
  dwc = FALSE,
  date_time_method = NULL,
  tzone = getOption("finbif_tz"),
  write_file = tempfile(),
  dt = NA,
  keep_tsv = FALSE,
  facts = list(),
  type_convert_facts = TRUE,
  drop_na = FALSE,
  drop_facts_na = drop_na,
  locale = getOption("finbif_locale"),
  skip = 0
)

Arguments

file

Character or Integer. Either the path to a Zip archive or tabular data file that has been downloaded from "laji.fi", a URI linking to such a data file (e.g., https://tun.fi/HBF.49381) or an integer representing the URI (i.e., 49381).

select

Character vector. Variables to return. If not specified, a default set of commonly used variables will be used. Use "default_vars" as a shortcut for this set. Variables can be deselected by prepending a - to the variable name. If only deselects are specified the default set of variables without the deselection will be returned. Use "all" to select all available variables in the file.

n

Integer. How many records to import. Negative and other invalid values are ignored, causing all records to be imported.

count_only

Logical. Only return the number of records available.

quiet

Logical. Suppress the progress indicator for multipage downloads. Defaults to value of option finbif_hide_progress.

cache

Logical or Integer. If TRUE or a number greater than zero, then data-caching will be used. If not logical then the cache will be invalidated after the number of hours indicated by the argument. If a length one vector is used, its value will only apply to caching occurrence records. If the value is length two, then the second element will determine how metadata is cached.

dwc

Logical. Use Darwin Core (or Darwin Core style) variable names.

date_time_method

Character. Passed to lutz::tz_lookup_coords() when date_time and/or duration variables have been selected. Default is "fast" when fewer than 100,000 records are requested and "none" when more are requested. Method "none" assumes all records are in timezone "Europe/Helsinki". Use date_time_method = "accurate" (requires package sf) for greater accuracy at the cost of slower computation.

tzone

Character. The timezone of the returned date-time, used when date_time has been selected. Defaults to system timezone.

write_file

Character. Path to write downloaded zip file to if file refers to a URI. Will be ignored if getOption("finbif_cache_path") is not NULL and will use the cache path instead.

dt

Logical. If the data.table package is available, return a data.table object rather than a data.frame.

keep_tsv

Logical. Whether to keep the TSV file if file is a ZIP archive or represents a URI. Ignored if file is already a TSV. If TRUE, the TSV file will be kept in the same directory as the ZIP archive.

facts

List. A named list of "facts" to extract from supplementary "fact" files in a local or online FinBIF data archive. Names can include one or more of "record", "event" or "document". Elements of the list are character vectors of the "facts" to be extracted and then joined to the return value.

type_convert_facts

Logical. Should facts be converted from character to numeric or integer data where applicable?

drop_na

Logical. A vector indicating which columns to check for missing data. Values recycled to the number of columns. Defaults to all columns.

drop_facts_na

Logical. Should missing or "all NA" facts be dropped? Any value other than a length one logical vector with the value of TRUE will be interpreted as FALSE. Argument is ignored if drop_na is TRUE for all variables explicitly or via recycling. To only drop some missing/NA-data facts use drop_na argument.

locale

Character. One of the supported two-letter ISO 639-1 language codes. Current supported languages are English, Finnish and Swedish. For data where more than one language is available the language denoted by locale will be preferred while falling back to the other languages in the order indicated above.

skip

Integer. The number of lines of the data file to skip before beginning to read data (not including the header).

Value

A data.frame, or if count_only = TRUE an integer.

Examples

## Not run: 

# Get occurrence data
finbif_occurrence_load(49381)


## End(Not run)

Get a FinBIF personal access token

Description

Have a personal access token for use with the FinBIF API sent to a specified email address.

Usage

finbif_request_token(email, quiet = FALSE)

Arguments

email

Character. The email address to which to send the API access token.

quiet

Logical. Suppress messages.

Value

NULL (invisibly) if an access token has already been set. Otherwise, invisibly, a finbif_api object containing the response from the FinBIF server.

Examples

## Not run: 

# Request a token for example@email.com
finbif_request_token("example@email.com")


## End(Not run)

Search the FinBIF taxa

Description

Search the FinBIF database for taxa.

Usage

finbif_taxa(
  name,
  n = 1,
  type = c("exact", "partial", "likely"),
  cache = getOption("finbif_use_cache")
)

common_name(name, locale = getOption("finbif_locale"))

scientific_name(name)

taxon_id(name)

Arguments

name

Character. The name or ID of a taxon. For functions other than finbif_taxa, this can also be a finbif_taxa object.

n

Integer. Maximum number of matches to return. For types "exact" and "likely" only one taxon will be returned.

type

Character. Type of match to make. Must be one of exact, partial or likely.

cache

Logical or Integer. If TRUE or a number greater than zero, then data-caching will be used. If not logical then cache will be invalidated after the number of hours indicated by the argument.

locale

Character. One of the supported two-letter ISO 639-1 language codes. Current supported languages are English, Finnish and Swedish. For data where more than one language is available the language denoted by locale will be preferred while falling back to the other languages in the order indicated above.

Value

For finbif_taxa a finbif_taxa object. Otherwise, a character vector.

Examples

## Not run: 

# Search for a taxon
finbif_taxa("Ursus arctos")

# Use partial matching
finbif_taxa("Ursus", n = 10, "partial")

# Get Swedish name of Eurasian Eagle-owl
common_name("Bubo bubo", "sv")

# Get scientific name of "Otter"
scientific_name("Otter")

# Get taxon identifier of "Otter"
taxon_id("Otter")


## End(Not run)
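
Since the helper functions also accept a finbif_taxa object, one search result can be reused for several lookups. This is a sketch under the same assumptions as the examples above (interactive session, finbif installed, network access and an access token available); whether reusing the object avoids repeated requests is not stated in this manual.

```r
query <- "Otter"  # common name used in the examples above

if (interactive() && requireNamespace("finbif", quietly = TRUE)) {
  # Search once and keep the finbif_taxa object.
  otter <- finbif::finbif_taxa(query)

  # Pass the object, rather than the name, to the helper functions.
  print(finbif::scientific_name(otter))
  print(finbif::taxon_id(otter))
  print(finbif::common_name(otter, locale = "fi"))
}
```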

Update cache

Description

Update all cached FinBIF API requests.

Usage

finbif_update_cache()

Examples

## Not run: 

finbif_update_cache()


## End(Not run)

Convert variable names

Description

Convert variable names to Darwin Core or FinBIF R package native style.

Usage

to_dwc(...)

to_native(...)

from_schema(..., to = c("native", "dwc"), file = c("none", "citable", "lite"))

Arguments

...

Character. Variable names to convert. For to_dwc and to_native the names must be in the opposite format. For from_schema the names must be from the FinBIF schema (e.g., names returned by https://api.laji.fi) or a FinBIF download file (citable or lite).

to

Character. Type of variable names to convert to.

file

Character. The type of file, for variable names that are derived from a FinBIF download file.

Value

Character vector.

Examples


to_dwc("record_id", "date_time", "scientific_name")
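
The two converters are described as working in opposite directions, so converting to Darwin Core and back is one way to check a set of names. The sketch below only runs if finbif is installed; the names are passed as separate arguments via do.call() to mirror the example above, and no particular output is assumed.

```r
native <- c("record_id", "date_time", "scientific_name")

if (requireNamespace("finbif", quietly = TRUE)) {
  # Native style to Darwin Core style.
  dwc <- do.call(finbif::to_dwc, as.list(native))
  print(dwc)

  # Darwin Core style back to native style.
  back <- do.call(finbif::to_native, as.list(dwc))
  print(back)
}
```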

FinBIF record variables

Description

FinBIF record variables that can be selected in a finbif occurrence search.

Identifiers

All identifiers are returned in the form of a URI. Identifiers include:

Taxa

Variables related to taxonomy of records include:

Abundance, sex & life history

Variables related to abundance, sex and life history include:

Location

Variables related to the location of records include:

Time

Variables related to time of record include:

Data restrictions

Variables related to restricted records include:

Data quality

Variables related to the quality of records include:

Misc

Other variables: