Type: Package
Title: Mass-Preserving Spline Functions for Soil Data
Version: 0.1.6
Date: 2022-04-03
Description: A low-dependency implementation of GSIF::mpspline() <https://r-forge.r-project.org/scm/viewvc.php/pkg/R/mpspline.R?view=markup&revision=240&root=gsif>, which applies a mass-preserving spline to soil attributes. Splining soil data is a safe way to make continuous down-profile estimates of attributes measured over discrete, often discontinuous depth intervals.
License: GPL-2 | GPL-3 [expanded from: GPL]
Encoding: UTF-8
Imports: stats
Suggests: testthat, covr
RoxygenNote: 7.1.2
NeedsCompilation: no
Packaged: 2022-04-03 04:40:31 UTC; leobr
Author: Lauren O'Brien [aut, cre], Brendan Malone [ctb], Tomislav Hengl [ctb], Tom Bishop [ctb], David Rossiter [ctb], Dylan Beaudette [ctb], Andrew Brown [ctb]
Maintainer: Lauren O'Brien <obrlsoilau@gmail.com>
Repository: CRAN
Date/Publication: 2022-04-03 19:20:04 UTC

Spline discrete soils data - multiple sites

Description

This function implements the mass-preserving spline method of Bishop et al. (1999) (doi: 10.1016/S0016-7061(99)00003-8) for interpolating between measured soil attributes down a soil profile, across multiple sites' worth of data.

Usage

mpspline(
  obj = NULL,
  var_name = NULL,
  lam = 0.1,
  d = c(0, 5, 15, 30, 60, 100, 200),
  vlow = 0,
  vhigh = 1000
)

Arguments

obj

data.frame or matrix. Column 1 must contain site identifiers. Columns 2 and 3 must contain upper and lower sample depths, respectively, measured in centimeters. Subsequent columns contain measured values for those depths.

var_name

length-1 character or length-1 integer denoting the column in obj in which target data is stored. If not supplied, the fourth column of the input object is assumed to contain the target data.

lam

number; smoothing parameter for spline. Defaults to 0.1.

d

sequential integer vector; denotes the output depth ranges in cm. Defaults to c(0, 5, 15, 30, 60, 100, 200) after the GlobalSoilMap specification, giving output predictions over intervals 0-5cm, 5-15cm, etc.

vlow

numeric; constrains the minimum predicted value to a realistic number. Defaults to 0.

vhigh

numeric; constrains the maximum predicted value to a realistic number. Defaults to 1000.
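
The way d maps to output intervals can be sketched in base R. This is an illustration only, not the package's internal code; the names upper, lower and interval_labels are hypothetical.

```r
# Default output depths from the Usage section (cm).
d <- c(0, 5, 15, 30, 60, 100, 200)

# Pair each depth with the next one: the first six values are upper
# bounds and the last six are lower bounds of the output intervals.
upper <- head(d, -1)
lower <- tail(d, -1)

# Hypothetical labels, built here purely for display.
interval_labels <- paste0(upper, "-", lower, "cm")
print(interval_labels)
```

A vector of n depths therefore describes n - 1 output intervals.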

Value

A nested list of data for each input site. List elements are: Site ID, vector of predicted values over input intervals, vector of predicted values for each cm down the profile to max(d), vector of predicted values over d (output) intervals, and root mean squared error.

Examples

dat <- data.frame("SID" = c( 1,  1,  1,  1,   2,   2,   2,   2),
                   "UD" = c( 0, 20, 40, 60,   0,  15,  45,  80),
                   "LD" = c(10, 30, 50, 70,   5,  30,  60, 100),
                  "VAL" = c( 6,  4,  3, 10, 0.1, 0.9, 2.5,   6),
                   stringsAsFactors = FALSE)
m1 <- mpspline(obj = dat, var_name = 'VAL')
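
The input layout holds several sites in one table, keyed by the first column. As a rough sketch of how such a table can be broken into one data frame per profile (base R only, and not necessarily how the package does it internally; sites and n_layers are hypothetical names):

```r
# Same layout as the example data: site ID, upper depth, lower depth, value.
dat <- data.frame(SID = c( 1,  1,  1,  1,   2,   2,   2,   2),
                  UD  = c( 0, 20, 40, 60,   0,  15,  45,  80),
                  LD  = c(10, 30, 50, 70,   5,  30,  60, 100),
                  VAL = c( 6,  4,  3, 10, 0.1, 0.9, 2.5,   6))

# One data frame per site, named by site ID.
sites <- split(dat, dat$SID)

# Count the sampled layers in each profile.
n_layers <- vapply(sites, nrow, integer(1))
print(n_layers)
```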

Spline discrete soils data - multiple sites, compact output

Description

This function implements the mass-preserving spline method of Bishop et al. (1999) (doi: 10.1016/S0016-7061(99)00003-8) for interpolating between measured soil attributes down a soil profile, across multiple sites' worth of data. It returns a more compact output object than mpspline().

Usage

mpspline_compact(
  obj = NULL,
  var_name = NULL,
  lam = 0.1,
  d = c(0, 5, 15, 30, 60, 100, 200),
  vlow = 0,
  vhigh = 1000
)

Arguments

obj

data.frame or matrix. Column 1 must contain site identifiers. Columns 2 and 3 must contain upper and lower sample depths, respectively, measured in centimeters. Subsequent columns contain measured values for those depths.

var_name

length-1 character or length-1 integer denoting the column in obj in which target data is stored. If not supplied, the fourth column of the input object is assumed to contain the target data.

lam

number; smoothing parameter for spline. Defaults to 0.1.

d

sequential integer vector; denotes the output depth ranges in cm. Defaults to c(0, 5, 15, 30, 60, 100, 200) after the GlobalSoilMap specification, giving output predictions over intervals 0-5cm, 5-15cm, etc.

vlow

numeric; constrains the minimum predicted value to a realistic number. Defaults to 0.

vhigh

numeric; constrains the maximum predicted value to a realistic number. Defaults to 1000.
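
The effect of vlow and vhigh can be pictured as clamping predictions to a plausible range. The following base R sketch assumes simple clamping and uses made-up prediction values; it is not taken from the package source.

```r
# Hypothetical raw predictions; the first and last are implausible.
raw_pred <- c(-0.4, 2.1, 5.7, 1200)
vlow  <- 0
vhigh <- 1000

# Raise anything below vlow to vlow, then cap anything above vhigh.
clamped <- pmin(pmax(raw_pred, vlow), vhigh)
print(clamped)
```

Tighter limits suit attributes with a known range, for example 0 and 100 for a percentage.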

Value

A four-item list containing a matrix of predicted values over the input depth ranges, a matrix of predicted values over the output depth ranges, a matrix of 1cm predictions, and a matrix of RMSE and IQR-scaled RMSE values. Site identifiers are in rownames attributes.

Examples

dat <- data.frame("SID" = c( 1,  1,  1,  1,   2,   2,   2,   2),
                   "UD" = c( 0, 20, 40, 60,   0,  15,  45,  80),
                   "LD" = c(10, 30, 50, 70,   5,  30,  60, 100),
                  "VAL" = c( 6,  4,  3, 10, 0.1, 0.9, 2.5,   6),
                   stringsAsFactors = FALSE)
mpspline_compact(obj = dat, var_name = 'VAL')

Convert data for splining

Description

Generates a consistent input object for splining.

Usage

mpspline_conv(obj = NULL)

## S3 method for class 'matrix'
mpspline_conv(obj = NULL)

## S3 method for class 'data.frame'
mpspline_conv(obj = NULL)

Arguments

obj

data.frame or matrix. Column 1 must contain site identifiers. Columns 2 and 3 must contain upper and lower sample depths, respectively, measured in centimeters. Subsequent columns contain measured values for those depths.

Value

A data frame, sorted by site ID, then upper depth, then lower depth.
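
The sort order described here can be reproduced in base R with order(). This is a sketch of the ordering only, on made-up rows; it does not cover any other conversion the function may perform.

```r
# Hypothetical input with rows out of order.
unsorted <- data.frame(SID = c(2, 1, 1, 2),
                       UD  = c(15, 20, 0, 0),
                       LD  = c(30, 30, 10, 5),
                       VAL = c(0.9, 4, 6, 0.1))

# Sort by site ID first, then upper depth, then lower depth.
sorted <- unsorted[order(unsorted$SID, unsorted$UD, unsorted$LD), ]

# Drop the old row names so rows are numbered in their new order.
rownames(sorted) <- NULL
print(sorted)
```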


Pre-spline data checks

Description

Runs a few data quality checks and makes some repairs where possible.

Usage

mpspline_datchk(s = NULL, var_name = NULL)

Arguments

s

data frame, input data for a single soil profile.

var_name

length-1 character or length-1 integer denoting the column in s in which target data is stored. If not supplied, the fourth column of the input object is assumed to contain the target data.

Value

If data passes checks it is returned unchanged. Sites with no data to spline and sites with overlapping input depth ranges return NA.
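
One of the conditions named above, overlapping input depth ranges, can be tested with a few lines of base R. The helper has_overlap() is hypothetical and only illustrates the idea; the package's own checks may differ.

```r
# Returns TRUE if any layer starts above the bottom of the layer before it.
has_overlap <- function(upper, lower) {
  ord <- order(upper, lower)
  upper <- upper[ord]
  lower <- lower[ord]
  n <- length(upper)
  if (n < 2) {
    return(FALSE)
  }
  # Compare each layer's upper depth with the previous layer's lower depth.
  any(upper[-1] < lower[-n])
}

# Gaps between layers are allowed; only overlaps are flagged.
clean_profile   <- has_overlap(c(0, 20, 40, 60), c(10, 30, 50, 70))
overlap_profile <- has_overlap(c(0, 8, 40), c(10, 30, 50))
print(c(clean = clean_profile, overlapping = overlap_profile))
```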


Estimate spline parameters

Description

Estimate key parameters for building a mass-preserving spline across a single soil profile

Usage

mpspline_est1(s = NULL, var_name = NULL, lam = NULL)

Arguments

s

data.frame containing a single profile's worth of soil info

var_name

length-1 character or length-1 integer denoting the column in s in which target data is stored. If not supplied, the fourth column of the input object is assumed to contain the target data.

lam

number; smoothing parameter for spline. No default is set in this function (lam = NULL); the user-facing functions such as mpspline_one() default to 0.1.

Value

A list of parameters used for spline fitting.


Fit spline parameters

Description

Fit spline parameters to data for a single site.

Usage

mpspline_fit1(
  s = NULL,
  p = NULL,
  var_name = NULL,
  d = NULL,
  vhigh = NULL,
  vlow = NULL
)

Arguments

s

data.frame; data for one site

p

list; estimated spline parameters for one site from mpspline_est1

var_name

length-1 character or length-1 integer denoting the column in s in which target data is stored. If not supplied, the fourth column of the input object is assumed to contain the target data.

d

sequential integer vector; denotes the output depth ranges in cm. No default is set in this function (d = NULL); the user-facing functions default to c(0, 5, 15, 30, 60, 100, 200) after the GlobalSoilMap specification, giving output predictions over intervals 0-5cm, 5-15cm, etc.

vhigh

numeric; constrains the maximum predicted value to a realistic number. No default is set in this function; the user-facing functions default to 1000.

vlow

numeric; constrains the minimum predicted value to a realistic number. No default is set in this function; the user-facing functions default to 0.

Value

A list of two vectors: fitted values at 1cm intervals, and the averages of those values over the requested depth ranges.
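
The relationship between the two vectors can be sketched as follows, assuming that element k of the 1cm vector covers the depth slice from k - 1 to k cm. The values and the names pred_1cm and interval_means are made up for illustration.

```r
# Hypothetical 1cm predictions for depth slices 0-1cm, 1-2cm, ..., 29-30cm.
pred_1cm <- c(rep(6, 5), rep(4, 10), rep(2, 15))

# Requested output depth ranges: 0-5cm, 5-15cm, 15-30cm.
d <- c(0, 5, 15, 30)

# An interval from upper to lower cm uses elements (upper + 1) to lower.
interval_means <- mapply(
  function(upper, lower) mean(pred_1cm[(upper + 1):lower]),
  head(d, -1),
  tail(d, -1)
)
print(interval_means)
```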


Spline discrete soils data - single site

Description

This function implements the mass-preserving spline method of Bishop et al. (1999) (doi: 10.1016/S0016-7061(99)00003-8) for interpolating between measured soil attributes down a single soil profile.

Usage

mpspline_one(
  site = NULL,
  var_name = NULL,
  lam = 0.1,
  d = c(0, 5, 15, 30, 60, 100, 200),
  vlow = 0,
  vhigh = 1000
)

Arguments

site

data frame containing data for a single soil profile. Column 1 must contain site identifiers. Columns 2 and 3 must contain upper and lower sample depths, respectively, measured in centimeters. Subsequent columns will contain measured values for those depths.

var_name

length-1 character or length-1 integer denoting the column in site in which target data is stored. If not supplied, the fourth column of the input object is assumed to contain the target data.

lam

number; smoothing parameter for spline. Defaults to 0.1.

d

sequential integer vector; denotes the output depth ranges in cm. Defaults to c(0, 5, 15, 30, 60, 100, 200) after the GlobalSoilMap specification, giving output predictions over intervals 0-5cm, 5-15cm, etc.

vlow

numeric; constrains the minimum predicted value to a realistic number. Defaults to 0.

vhigh

numeric; constrains the maximum predicted value to a realistic number. Defaults to 1000.
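
The var_name rule described above (a column name or a column index, falling back to column 4) could be handled along these lines. resolve_target() is a hypothetical helper written for this sketch, not a function exported by the package.

```r
# Pick the target column by name or position; default to the fourth column.
resolve_target <- function(site, var_name = NULL) {
  if (is.null(var_name)) {
    return(site[[4]])
  }
  stopifnot(length(var_name) == 1)
  site[[var_name]]
}

dat <- data.frame(SID = c( 1,  1,  1,  1),
                  UD  = c( 0, 20, 40, 60),
                  LD  = c(10, 30, 50, 70),
                  VAL = c( 6,  4,  3, 10))

by_default <- resolve_target(dat)
by_name    <- resolve_target(dat, "VAL")
by_index   <- resolve_target(dat, 4L)
```

All three calls select the same column for this data set.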

Value

A list with the following elements: Site ID, vector of predicted values over input intervals, vector of predicted values for each cm down the profile to max(d), vector of predicted values over d (output) intervals, and root mean squared error.

Examples

dat <- data.frame("SID" = c( 1,  1,  1,  1),
                   "UD" = c( 0, 20, 40, 60),
                   "LD" = c(10, 30, 50, 70),
                  "VAL" = c( 6,  4,  3, 10),
                   stringsAsFactors = FALSE)
mpspline_one(site = dat, var_name = 'VAL')

Calculate RMSE

Description

Calculates Root Mean Squared Error (RMSE) for estimates on a single site.

Usage

mpspline_rmse1(s = NULL, p = NULL, var_name = NULL)

Arguments

s

data.frame; data for one site

p

list; estimated spline parameters for one site from mpspline_est1

var_name

length-1 character or length-1 integer denoting the column in s in which target data is stored. If not supplied, the fourth column of the input object is assumed to contain the target data.

Value

A length-2 named numeric vector: RMSE, and RMSE scaled by the interquartile range of the input data.

Note

Useful for comparing the results of varying parameter lam.
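
Both quantities are easy to compute by hand, which can help when reading the output. The sketch below assumes the comparison is between measured values and predictions over the same input intervals; the predicted values and the element names are made up.

```r
# Measured values and hypothetical spline predictions over the same intervals.
observed  <- c(6, 4, 3, 10)
predicted <- c(5.5, 4.5, 3.5, 9.5)

# Root mean squared error of the predictions.
rmse <- sqrt(mean((observed - predicted)^2))

# Scale by the interquartile range of the measured values so that
# profiles with different value ranges can be compared.
rmse_iqr <- rmse / stats::IQR(observed)

err <- c(RMSE = rmse, RMSE_IQR = rmse_iqr)
print(err)
```

Repeating the calculation for several lam values shows how much accuracy is traded for smoothness.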


Spline discrete soils data - multiple sites, tidy output

Description

This function implements the mass-preserving spline method of Bishop et al. (1999) (doi: 10.1016/S0016-7061(99)00003-8) for interpolating between measured soil attributes down a soil profile, across multiple sites' worth of data. It returns an output object with tidy data formatting.

Usage

mpspline_tidy(
  obj = NULL,
  var_name = NULL,
  lam = 0.1,
  d = c(0, 5, 15, 30, 60, 100, 200),
  vlow = 0,
  vhigh = 1000
)

Arguments

obj

data.frame or matrix. Column 1 must contain site identifiers. Columns 2 and 3 must contain upper and lower sample depths, respectively, and be measured in centimeters. Subsequent columns will contain measured values for those depths.

var_name

length-1 character or length-1 integer denoting the column in obj in which target data is stored. If not supplied, the fourth column of the input object is assumed to contain the target data.

lam

number; smoothing parameter for spline. Defaults to 0.1.

d

sequential integer vector; denotes the output depth ranges in cm. Defaults to c(0, 5, 15, 30, 60, 100, 200) after the GlobalSoilMap specification, giving output predictions over intervals 0-5cm, 5-15cm, etc.

vlow

numeric; constrains the minimum predicted value to a realistic number. Defaults to 0.

vhigh

numeric; constrains the maximum predicted value to a realistic number. Defaults to 1000.

Value

A four-item list containing data frames of predicted values over the input depth ranges, the output depth ranges, 1cm-increment predictions, and RMSE and IQR-scaled RMSE values.
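
To illustrate what tidy formatting means here, the sketch below reshapes a small site-by-interval matrix into a long data frame with one row per site and interval. The values, the interval labels and the column names are hypothetical and are not claimed to match the function's actual output.

```r
# Hypothetical wide layout: one row per site, one column per depth interval.
wide <- matrix(c(5.8, 0.2, 4.9, 0.6), nrow = 2,
               dimnames = list(c("1", "2"), c("000_005_cm", "005_015_cm")))

# Long (tidy) layout: one row per site and interval combination.
# as.vector() reads the matrix column by column, so site IDs cycle
# fastest and each interval label repeats once per site.
long <- data.frame(SID      = rep(rownames(wide), times = ncol(wide)),
                   interval = rep(colnames(wide), each = nrow(wide)),
                   value    = as.vector(wide),
                   stringsAsFactors = FALSE)
print(long)
```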

Examples

dat <- data.frame("SID" = c( 1,  1,  1,  1,   2,   2,   2,   2),
                   "UD" = c( 0, 20, 40, 60,   0,  15,  45,  80),
                   "LD" = c(10, 30, 50, 70,   5,  30,  60, 100),
                  "VAL" = c( 6,  4,  3, 10, 0.1, 0.9, 2.5,   6),
                   stringsAsFactors = FALSE)
mpspline_tidy(obj = dat, var_name = 'VAL')