Package: specleanr
Type: Package
Title: Detecting Environmental Outliers in Data Analysis Pipelines
Version: 1.0.0
Authors@R: c(
    person(given = "Anthony", family = "Basooma", email = "anthony.basooma@boku.ac.at",
           role = c("aut", "cre"), comment = c(ORCID = "0000-0002-8994-9989")),
    person("Thomas", "Hein", role = c("ctb", "fnd", "ths"), comment = c(ORCID = "0000-0002-7767-4607")),
    person("Astrid", "Schmidt-Kloiber", role = c("ctb", "fnd", "dtc"), comment = c(ORCID = "0000-0001-8839-5913")),
    person("Merret", "Buurman", role = c("ctb"), email = "merret.buurman@igb-berlin.de"),
    person("Sami", "Domisch", role = c("ctb"), email = "sami.domisch@igb-berlin.de"),
    person("Martin", "Tschikof", role = c("ctb"), email = "martin.tschikof@boku.ac.at"),
    person("Florian", "Borgwardt", role = c("ctb", "fnd"), comment = c(ORCID = "0000-0002-8974-7834"))
    )
Description: A framework for detecting and handling outliers in data analysis workflows. Outlier detection is a statistical technique that highlights records that are suspiciously high or low. Outlier detection in distribution models was initiated by Chapman (1991) (available at <https://www.researchgate.net/publication/332537800_Quality_control_and_validation_of_point-sourced_environmental_resource_data>), who developed the reverse jackknifing method. The concept was further developed and incorporated into different R packages, including 'flexsdm' (Velazco et al., 2022, <doi:10.1111/2041-210X.13874>) and 'biogeo' (Robertson et al., 2016, <doi:10.1111/ecog.02118>). We compiled various outlier detection methods from the literature, including those elaborated in Dastjerdy et al. (2023) <doi:10.3390/geotechnics3020022> and Liu et al. (2008) <doi:10.1109/ICDM.2008.17>. The package introduces an ensemble approach, in which multiple outlier detection methods are combined to flag a record as an absolute outlier. The approach applies to general data analysis as well as to the development of species distribution models.
License: GPL (>= 3)
Encoding: UTF-8
LazyData: true
URL: https://anthonybasooma.github.io/specleanr/
BugReports: https://github.com/AnthonyBasooma/specleanr/issues
RoxygenNote: 7.3.2
Suggests: dplyr, knitr, rmarkdown, testthat (>= 3.0.0), ggplot2,
        ggpmisc, tibble, rinat, rvertnet, rgbif, curl, rfishbase (>=
        5.0.1), sf, terra, tidytext, scatterplot3d
Config/testthat/edition: 3
VignetteBuilder: knitr
Imports: cluster, dbscan, e1071, isotree, methods, utils, robust,
        robustbase, usdm, mgcv
Depends: R (>= 4.1.0)
NeedsCompilation: no
Packaged: 2025-11-20 19:10:56 UTC; anthbasooma
Author: Anthony Basooma [aut, cre] (ORCID:
    <https://orcid.org/0000-0002-8994-9989>),
  Thomas Hein [ctb, fnd, ths] (ORCID:
    <https://orcid.org/0000-0002-7767-4607>),
  Astrid Schmidt-Kloiber [ctb, fnd, dtc] (ORCID:
    <https://orcid.org/0000-0001-8839-5913>),
  Merret Buurman [ctb],
  Sami Domisch [ctb],
  Martin Tschikof [ctb],
  Florian Borgwardt [ctb, fnd] (ORCID:
    <https://orcid.org/0000-0002-8974-7834>)
Maintainer: Anthony Basooma <anthony.basooma@boku.ac.at>
Repository: CRAN
Date/Publication: 2025-11-25 20:20:02 UTC
Built: R 4.5.1; ; 2025-11-25 23:13:59 UTC; unix
