Help for package spbal

Type:

Package

Title:

Spatially Balanced Sampling Algorithms

Version:

1.0.1

Description:

Encapsulates a number of spatially balanced sampling algorithms, namely, Balanced Acceptance Sampling (equal, unequal, seed point, panels), Halton frames (for discretizing a continuous resource), Halton Iterative Partitioning (equal probability) and Simple Random Sampling. Robertson, B. L., Brown, J. A., McDonald, T. and Jaksons, P. (2013) <doi:10.1111/biom.12059>. Robertson, B. L., McDonald, T., Price, C. J. and Brown, J. A. (2017) <doi:10.1016/j.spl.2017.05.004>. Robertson, B. L., McDonald, T., Price, C. J. and Brown, J. A. (2018) <doi:10.1007/s10651-018-0406-6>. Robertson, B. L., van Dam-Bates, P. and Gansell, O. (2021a) <doi:10.1007/s10651-020-00481-1>. Robertson, B. L., Davies, P., Gansell, O., van Dam-Bates, P., McDonald, T. (2025) <doi:10.1111/anzs.12435>.

Depends:

R (≥ 3.6.0)

License:

MIT + file LICENSE

Encoding:

UTF-8

Imports:

units, sf, Rcpp

SystemRequirements:

C++17

Suggests:

knitr, rmarkdown, testthat (≥ 3.0.0), bookdown, ggplot2, gridExtra

VignetteBuilder:

knitr

LinkingTo:

Rcpp, RcppThread

RoxygenNote:

7.3.2

Config/testthat/edition:

NeedsCompilation:

yes

Packaged:

2025-03-28 09:09:34 UTC; phil

Author:

Phil Davies [aut, cre], Blair Robertson [aut], Paul van Dam-Bates [aut], Oliver Gansell [aut]

Maintainer:

Phil Davies <philip.davies@canterbury.ac.nz>

Repository:

CRAN

Date/Publication:

2025-03-28 09:20:02 UTC

Balanced Acceptance Sampling (BAS).

Description

BAS draws spatially balanced samples from areal resources. To draw BAS samples, spbal requires a study region shapefile and the region’s bounding box. An initial sample size is also needed, which can be easily increased or decreased within spbal for master sampling applications.

Usage

BAS(
  shapefile = NULL,
  n = 100,
  boundingbox = NULL,
  minRadius = NULL,
  panels = NULL,
  panel_overlap = NULL,
  stratum = NULL,
  seeds = NULL,
  verbose = FALSE
)

Arguments

shapefile

Shape file as a polygon (sp or sf) to select sites for.

n

Number of sites to select. If using stratification it is a named vector containing sample sizes of each group.

boundingbox

Bounding box around the study area. If a bounding box is not supplied then spbal will generate a bounding box for the shapefile.

minRadius

If specified, the minimum distance, in meters, allowed between sample points. This is applied to the $sample points. Points that meet the minRadius criterion are returned in the minRadius output variable.

panels

A list of integers that define the size of each panel in a non-overlapping panels design. The length of the list determines the number of panels required. The sum of the integers in the panels parameter determines the total number of samples selected, n. The default value for panels is NULL, which indicates that a non-overlapping panel design is not wanted.

panel_overlap

A list of integers that define the overlap into the previous panel. It is only used when the panels parameter is not NULL. The default value for panel_overlap is NULL. The length of panel_overlap must be equal to the length of panels. The first value is always forced to zero as the first panel never overlaps any region.

stratum

The name of a column in the data.frame attached to shapefile that defines the strata of interest.

seeds

A vector of 2 seeds, u1 and u2. If not specified, the default is NULL and will be defined randomly using function generateUVector.

verbose

Boolean if you want to see any output printed to screen. Helpful if taking a long time. Default is FALSE i.e. no informational messages are displayed.

Value

A list containing three variables, $sample containing locations in the BAS sample, in BAS order, $seeds, the u1 and u2 seeds used to generate the sample and $minRadius containing points from $sample that meet the minRadius criteria. If the minRadius parameter is NULL then the $minRadius returned will also be NULL.

The sample points are returned in the form of a simple feature collection of POINT objects. They have the following attributes:

SiteID A unique identifier for every sample point. This encodes the BAS order.
spbalSeqID A unique identifier for every sample point. This encodes the BAS sample order.
geometry The XY co-ordinates of the sample point in the CRS of the original shapefile.

Author(s)

This function was first written by Paul van Dam-Bates for the package BASMasterSample and later simplified by Phil Davies.

Examples

# Equal probability BAS sample ----------------------------------------------

# Use the North Carolina shapefile supplied in the sf R package.
shp_file <- sf::st_read(system.file("shape/nc.shp", package="sf"))
shp_gates <- shp_file[shp_file$NAME == "Gates",]

# Vertically aligned master sample bounding box.
bb <- spbal::BoundingBox(shapefile = shp_gates)

set.seed(511)
n_samples <- 20
# Equal probability BAS sample.
result <- spbal::BAS(shapefile = shp_gates,
                     n = n_samples,
                     boundingbox = bb)
BAS20 <- result$sample
# display first three sample points.
BAS20[1:3,]

# Increase the BAS sample size ----------------------------------------------
n_samples <- 50
result2 <- spbal::BAS(shapefile = shp_gates,
                      n = n_samples,
                      boundingbox = bb,
                      seeds = result$seeds)
BAS50 <- result2$sample
BAS50[1:3,]

# Check, the first 20 points in both samples must be the same.
all.equal(BAS20$geometry, BAS50$geometry[1:20])

Create a bounding box for a study region.

Description

Randomly generate a seed from 10,000 possible values in, at present, 2 dimensions. Note that in van Dam-Bates et al. (2018) we required that the random seed fall within the main object shape, such as one of the islands in New Zealand, or within the marine environment for the BC west coast. However, with a random rotation, we are able to ignore that detail. If this function is used without a random rotation, we recommend running it until the first master sample point does indeed fall within the largest scale of the master sample use.

Usage

BoundingBox(shapefile, d = 2, rotate = FALSE, verbose = FALSE)

Arguments

shapefile

Spatial feature that defines the boundary of the area to define a bounding box over.

d

Dimension of the new Master Sample. At this stage we only work with d = 2.

rotate

Boolean of whether or not to randomly rotate the bounding box. This parameter is not supported at this time.

verbose

Print the rotation and random seed when it is generated.

Value

Bounding box for a study area.

Author(s)

This function was first written by Paul van Dam-Bates for the package BASMasterSample and later ported to this package, spbal.

Examples

# Create a bounding box for the Gates, North Carolina study area -------------
# Use the North Carolina shapefile supplied in the sf R package.
shp_file <- sf::st_read(system.file("shape/nc.shp", package="sf"))
shp_gates <- shp_file[shp_file$NAME == "Gates",]
# Vertically aligned master sample bounding box.
bb <- spbal::BoundingBox(shapefile = shp_gates)
bb

Halton Iterative Partitioning (HIP).

Description

HIP draws spatially balanced samples and over-samples from point resources by partitioning the resource into boxes with the same nested structure as Halton boxes. The spbal parameter iterations defines the number of boxes used in the HIP partition and should be larger than the sample size but less than the population size. The iterations parameter also defines the number of units available in the HIP over-sample, where the over-sample contains one unit from each box in the HIP partition.

Usage

HIP(
  population = NULL,
  n = 20,
  iterations = 7,
  minRadius = NULL,
  panels = NULL,
  panel_overlap = NULL,
  verbose = FALSE
)

Arguments

population

A population of point pairs.

n

The number of points to draw from the population. Default 20.

iterations

The levels of partitioning required. Default 7.

minRadius

If specified, the minimum distance, in meters, allowed between sample points. This is applied to the $overSample.

panels

panel_overlap

verbose

Boolean if you want to see any output printed to screen. Helpful if taking a long time. Default is FALSE i.e. no informational messages are displayed.

Details

Halton iterative partitioning (HIP) extends Balanced Acceptance Sampling (BAS) to point resources. It partitions the resource into $B \geq n$ boxes that have the same nested structure as in BAS, but different sizes. These boxes are then uniquely numbered using a random-start Halton sequence of length $B$. The HIP sample is obtained by randomly drawing one point from each of the boxes numbered $1, 2, \ldots, n$.
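
To get a feel for the size of $B$, the sketch below counts the boxes produced by a given number of partitioning iterations. It assumes that the iterations alternate between the two dimensions, starting with the first, with splits in 2 along the first dimension and in 3 along the second. The helper name hip_box_count is hypothetical and is not part of spbal.

```r
# Hypothetical helper, not part of spbal: number of HIP boxes after `its`
# partitioning iterations, assuming the iterations alternate between
# dimension 1 (split in 2) and dimension 2 (split in 3), starting with 1.
hip_box_count <- function(its) {
  j1 <- ceiling(its / 2)  # splits applied to the first dimension
  j2 <- floor(its / 2)    # splits applied to the second dimension
  (2^j1) * (3^j2)
}

n <- 20
B <- hip_box_count(7)
B       # 432
B >= n  # TRUE
```

Under this assumption, 7 iterations give 4 splits in the first dimension and 3 in the second, which is comfortably more boxes than a sample of 20 needs.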

Value

Returns a list containing the following five variables:

Population Original population point pairs as an sf object.
HaltonIndex The Halton index for the point. Points will be spread equally across all Halton indices.
sample The sample from the population of size n.
overSample The overSample contains one point from each Halton box. All contiguous sub-samples from overSample are spatially balanced, and the first n points are identical to sample.
minRadius This result variable contains the sample created using the minRadius parameter. If the minRadius parameter is not specified then the minRadius variable will contain NULL.

Author(s)

Phil Davies, Blair Robertson.

Examples

# generating 20 points from a population of 5,000 (random) points with 7
# levels of partitioning (4 in the first dimension and 3 in the second) to
# give (2^4) * (3^3) = 16 * 27, resulting in 432 boxes ----------------------

# set random seed
set.seed(511)

# define HIP parameters.
pop <- matrix(runif(5000*2), nrow = 5000, ncol = 2)
n <- 20
its <- 7

# Convert the population matrix to an sf point object.
sf_points <- sf::st_as_sf(data.frame(pop), coords = c("X1", "X2"))
dim(sf::st_coordinates(sf_points))

# generate HIP sample.
result <- spbal::HIP(population = sf_points,
                     n = n,
                     iterations =  its)

# HaltonIndex
HaltonIndex <- result$HaltonIndex
table(HaltonIndex)

# Population Sample
HIPsample <- result$sample
HIPsample

Create a Halton Frame.

Description

Halton frames discretize an areal resource into a spatially ordered grid, where samples of consecutive frame points are spatially balanced. To generate Halton Frames, spbal requires a study region shapefile and the region’s bounding box.

Usage

HaltonFrame(
  N = 1,
  J = base::c(3, 2),
  bases = base::c(2, 3),
  boundingbox = NULL,
  shapefile = NULL,
  panels = NULL,
  panel_overlap = NULL,
  seeds = NULL,
  stratum = NULL,
  verbose = FALSE
)

Arguments

N

The number of points in the frame to generate.

J

The number of grid cells. A list of 2 values. The default value is c(3, 2).

bases

Co-prime base for the Halton Sequence. The default value is c(2, 3).

boundingbox

Bounding box around the study area. If a bounding box is not supplied then spbal will generate a bounding box for the shapefile.

shapefile

A sf object. If the shapefile parameter is NULL then function HaltonFrameBase is called directly.

panels

panel_overlap

seeds

A vector of 2 seeds, u1 and u2. If not specified, the default is NULL.

stratum

Name of column in shapefile that makes up the strata.

verbose

Boolean if you want to see any output printed to screen. Helpful if taking a long time. Default is FALSE i.e. no informational messages are displayed.

Value

Returns a list containing five variables:

J The number of grid cells. A list of 2 values that were used to generate this Halton grid and frame.
hg.pts.shp Halton grid over the bounding box and study area.
hf.pts.shp Halton frame, the sample points within the study area.
bb The bounding box.
seeds The u1 and u2 seeds used to generate the sample.

The sample points in hf.pts.shp are returned in the form of a simple feature collection of POINT objects. As well as having the features from the original shapefile, the following new attributes have been added:

spbalSeqID: A unique identifier for every sample point.
ID: A unique identifier, the Halton frame point order.

Author(s)

Phil Davies.

Examples

# we discretize the Gates study region into a coarse grid using
# B = 2^{J_1} * 3^{J_2}= (2^3) * (3^2) (9 by 8 grid) ------------------------

# Use the North Carolina shapefile supplied in the sf R package.
shp_file <- sf::st_read(system.file("shape/nc.shp", package="sf"))
shp_gates <- shp_file[shp_file$NAME == "Gates",]

# Vertically aligned master sample bounding box.
bb <- spbal::BoundingBox(shapefile = shp_gates)

set.seed(511)
result6 <- spbal::HaltonFrame(shapefile = shp_gates,
                              J = c(3, 2),
                              boundingbox = bb)
# get the frame points.
Frame <- result6$hf.pts.shp
Frame
# get the grid points.
Grid <- result6$hg.pts.shp
Grid

Generate a Halton Frame.

Description

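Generates the points of a Halton frame from the J, bases and seeds parameters. HaltonFrame calls this function directly when no shapefile is supplied.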

Usage

HaltonFrameBase(
  n = (bases[1]^J[1]) * (bases[2]^J[2]),
  J = base::c(3, 2),
  bases = base::c(2, 3),
  seeds = NULL
)

Arguments

n

The number of points in the frame to generate.

J

The number of grid cells. A list of 2 values. The default value is c(3, 2); c(5, 3) could also be used.

bases

Co-prime base for the Halton Sequence. The default value is c(2, 3).

seeds

The u1 and u2 seeds to use.

Details

This function was written by Phil Davies.

Value

A list containing the following four variables: halton_seq, halton_seq_div, Z and halton_frame.

Assign panel ids to the samples.

Description

This function assigns panel ids to each sample based on values in the panels and panel_overlap parameters. This is an internal only function.

Usage

PanelDesignAssignPanelids(
  smp,
  panels,
  panel_overlap,
  panel_design,
  number_panels,
  verbose = FALSE
)

Arguments

smp

The shapefile for the region under study.

panels

A list of integers that defines the size of each panel in a non-overlapping panels design. The length of the list determines the number of panels required. The sum of the integers in the panels parameter determines the total number of samples selected, n. The default value for panels is NULL, which indicates that a non-overlapping panel design is not wanted.

panel_overlap

panel_design

A flag, when TRUE, indicates that we are performing a panel design and the parameters used are specified in the panels and panel_overlap parameters.

number_panels

The number of sample panels required.

verbose

Boolean, set TRUE if you want to see output messaged to screen. Default is FALSE i.e. no informational messages are displayed.

Value

Returns a list of the following variables:

sample This is a sample from the original shapefile that has had the appropriate panel ids added as a feature. The panel id values are determined by the panels and panel_overlap parameters.

Author(s)

Phil Davies.

Simple random sampling.

Description

This function invokes base::sample() to draw a random sample using a user specified random seed.

Usage

SRS(seed = 511, total_rows = 0, sample_size = 0)

Arguments

seed

The random seed to be used to draw the current sample.

total_rows

The total number of rows in our input dataset.

sample_size

The number of rows wanted in our random sample.

Details

This function was written by Phil Davies.

Value

A random sample.

Examples

# Create a random sample with a seed of 99 ----------------------------------
spbal::SRS(seed = 99, total_rows = 100, sample_size = 20)

# Create a random sample with a seed of 42 ----------------------------------
spbal::SRS(seed = 42, total_rows = 100, sample_size = 20)

# Create a random sample with a seed of 99 ----------------------------------
spbal::SRS(seed = 99, total_rows = 100, sample_size = 25)
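
The behaviour described for SRS can be sketched in base R as follows. This is only an illustration: it assumes that the function sets the seed and then samples row indices 1 to total_rows without replacement. The name srs_sketch is hypothetical and is not part of spbal.

```r
# Hypothetical sketch, not the spbal implementation: fix the random seed,
# then draw `sample_size` row indices from 1..total_rows without replacement.
srs_sketch <- function(seed = 511, total_rows = 0, sample_size = 0) {
  set.seed(seed)
  sample(seq_len(total_rows), size = sample_size)
}

a <- srs_sketch(seed = 99, total_rows = 100, sample_size = 20)
b <- srs_sketch(seed = 99, total_rows = 100, sample_size = 20)
identical(a, b)  # TRUE: the same seed reproduces the same sample
```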

Validate the panels and panel_overlap parameters.

Description

This function is used to validate the panels and panel_overlap parameters. The panel_design flag is set TRUE when the panels and/or panel_overlap parameters are not NULL. This is an internal only function.

Usage

ValidatePanelDesign(panels, panel_overlap, n)

Arguments

panels

panel_overlap

A list of integers that define the overlap into the previous panel. It is only used when the panels parameter is not NULL. The default value for panel_overlap is NULL. The length of panel_overlap must be equal to the length of panels. The first value is always forced to zero as the first panel never overlaps any region.

n

The number of samples required. Only used when panels and panel_overlap are NULL.

Value

A list containing the four variables detailed below.

n When the panels parameter is not null, the n parameter is set using the sum of all the panel sizes in panels.
panel_design A boolean, TRUE, indicates that the user wants a panels design.
number_panels The number of panels specified in the panel design.
panel_overlap Updated panel_overlap vector, the first element is always forced to zero irrespective of what the user specified.

Author(s)

Phil Davies.
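
The checks described for ValidatePanelDesign can be sketched in base R as follows. This is a hedged illustration built only from the parameter and return value descriptions above; the name validate_panels_sketch and the exact order of the checks are assumptions, not the package's code.

```r
# Hypothetical re-implementation of the checks described above.
validate_panels_sketch <- function(panels = NULL, panel_overlap = NULL, n = 0) {
  # A panel design is requested when either parameter is supplied.
  panel_design <- !is.null(panels) || !is.null(panel_overlap)
  number_panels <- 0
  if (!is.null(panels)) {
    if (!is.null(panel_overlap)) {
      # Both vectors must describe the same number of panels.
      stopifnot(length(panel_overlap) == length(panels))
      panel_overlap[1] <- 0  # the first panel never overlaps anything
    }
    n <- sum(panels)         # total sample size comes from the panel sizes
    number_panels <- length(panels)
  }
  list(n = n, panel_design = panel_design,
       number_panels = number_panels, panel_overlap = panel_overlap)
}

res <- validate_panels_sketch(panels = c(20, 20, 20), panel_overlap = c(3, 5, 5))
res$n              # 60
res$panel_overlap  # 0 5 5
```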

Check if the sf object contains a specified feature.

Description

Used to check if a simple feature (sf) object contains a feature. This is an internal only function.

Usage

contains_feature(sf_object, feature_name)

Arguments

sf_object

Simple feature (sf) object to check for a feature called feature_name.

feature_name

The feature name we want to find in the simple feature object sf_object.

Value

Returns TRUE if the simple feature object sf_object contains the feature feature_name. Otherwise FALSE is returned.

Author(s)

Phil Davies.
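
As a rough illustration of this check: an sf object is a data.frame, so its features are its column names. The sketch below uses a plain data.frame as a stand-in for an sf object, and the name contains_feature_sketch is hypothetical.

```r
# Hypothetical stand-in for contains_feature(): look the feature name up
# among the column names of the object.
contains_feature_sketch <- function(sf_object, feature_name) {
  feature_name %in% names(sf_object)
}

# A plain data.frame standing in for a sample returned by spbal.
smp <- data.frame(SiteID = 1:3, panel_id = c(1, 1, 2))
contains_feature_sketch(smp, "panel_id")  # TRUE
contains_feature_sketch(smp, "stratum")   # FALSE
```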

Generate numbers from a Halton Sequence.

Description

For efficiency, this function can generate points along a random start Halton Sequence for a predefined Halton.

Usage

cppBASpts(
  n = 10L,
  seeds = as.integer(c()),
  bases = as.numeric(c()),
  verbose = FALSE
)

Arguments

n

Number of points required.

seeds

Random starting point in each dimension.

bases

Co-prime base for the Halton Sequence.

verbose

A boolean indicating whether informational messages are to be issued.

Value

Matrix with the columns, order of points, x in [0,1) and y in [0,1).

Author(s)

This function was first written in R by Blair Robertson, subsequently it was re-written in C/C++ by Phil Davies.

Examples

# First 10 points in the Halton Sequence for base 2,3
spbal::cppBASpts(n = 10)
# First 10 points in the Halton Sequence for base 2,3 with
# starting point at the 15th and 22nd index.
spbal::cppBASpts(n = 10, seeds = c(14, 21))
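
For readers who want to see what a random-start Halton sequence looks like without the C++ code, here is a minimal base R sketch. The helper names radical_inverse and halton_sketch are hypothetical, and the way the seeds are used (the first point sits at sequence index seeds[d] in dimension d, counting from zero) is an assumption rather than a statement about cppBASpts.

```r
# Radical inverse of the non-negative integer k in the given base:
# write k in that base and mirror its digits about the radix point.
radical_inverse <- function(k, base) {
  x <- 0
  f <- 1 / base
  while (k > 0) {
    x <- x + (k %% base) * f
    k <- k %/% base
    f <- f / base
  }
  x
}

# Hypothetical helper: n points of a 2-D Halton sequence whose first point
# sits at index seeds[d] in dimension d (an assumed reading of `seeds`).
halton_sketch <- function(n, seeds = c(0, 0), bases = c(2, 3)) {
  idx <- seq_len(n) - 1
  x <- vapply(seeds[1] + idx, radical_inverse, numeric(1), base = bases[1])
  y <- vapply(seeds[2] + idx, radical_inverse, numeric(1), base = bases[2])
  cbind(order = seq_len(n), x = x, y = y)
}

halton_sketch(n = 5)
halton_sketch(n = 5, seeds = c(14, 21))
```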

Generate numbers from a Halton Sequence along a specified set of indices.

Description

For efficiency, this function can generate points along a random start Halton Sequence for a predefined set of indices away from the seed. When boxes are provided it will calculate the Halton Sequence only at those boxes and not along the entire sequence.

Usage

cppBASptsIndexed(
  n = 10L,
  seeds = as.integer(c()),
  bases = as.numeric(c()),
  boxes = as.integer(c()),
  verbose = FALSE
)

Arguments

n

Number of points required.

seeds

Random starting point in each dimension.

bases

Co-prime base for the Halton Sequence.

boxes

Integer vector of indices to sample along the Halton sequence (default 1:n).

verbose

A boolean indicating whether informational messages are to be issued.

Details

When not all points along the Halton sequence are required, this function efficiently generates only the points that are needed. Taking all points from the random seed equates to boxes = 1:n. However, because the Halton Sequence repeats itself every B = prod(base^J) points, where $J$ is an integer, we can also select every Bth box to efficiently generate values at specific locations along the sequence. This reduces future computation when bounding boxes are large in comparison to the polygon being sampled.
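
The repetition every B points can be checked with a few lines of base R. The sketch below is an illustration only: halton_box is a hypothetical helper that returns the box a sequence index falls into along one dimension, and the choice of bases = c(2, 3) and J = c(3, 2) is an assumption made for the example.

```r
# Index of the Halton box (0-based) that sequence index k falls into along
# one dimension, when [0, 1) is cut into base^J equal intervals. It is made
# of the first J base-`base` digits of k, read in reverse order.
halton_box <- function(k, base, J) {
  box <- 0
  for (i in seq_len(J)) {
    box <- box * base + (k %% base)
    k <- k %/% base
  }
  box
}

bases <- c(2, 3)
J <- c(3, 2)
B <- prod(bases^J)      # 72 boxes in total

k <- 5
boxes <- k + B * (0:3)  # every Bth index: 5, 77, 149, 221

# All of these indices land in the same box in both dimensions.
box_x <- sapply(boxes, halton_box, base = bases[1], J = J[1])
box_y <- sapply(boxes, halton_box, base = bases[2], J = J[2])
box_x  # 5 5 5 5
box_y  # 7 7 7 7
```

This is why sampling every Bth index revisits a single box of the grid, which is the property the boxes parameter exploits.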

Value

Matrix with the columns, order of points, x in [0,1) and y in [0,1).

Author(s)

Phil Davies, Paul van Dam-Bates, Blair Robertson.

Examples

# First 10 points in the Halton Sequence for base 2,3
spbal::cppBASptsIndexed(n = 10)
# First 10 points in the Halton Sequence for base 2,3 with
# starting point at the 15th and 22nd index.
spbal::cppBASptsIndexed(n = 10, seeds = c(14, 21))

Generate numbers from a Halton Sequence with a random start

Description

For efficiency, this function can generate points along a random start Halton Sequence for a predefined Halton.

Usage

cppRSHalton_br(
  n = 10L,
  bases = as.numeric(c()),
  seeds = as.numeric(c()),
  verbose = FALSE
)

Arguments

n

Number of points required

bases

Co-prime base for the Halton Sequence

seeds

Random starting point in each dimension

verbose

A boolean indicating whether informational messages are to be issued.

Value

Matrix with the columns, order of point, x in [0,1) and y in [0,1).

Author(s)

This function was first written in R by Blair Robertson, subsequently it was written in C/C++ by Phil Davies.

Examples

# First 10 points in the Halton Sequence for base 2,3
 spbal::cppRSHalton_br(n = 10)
# First 10 points in the Halton Sequence for base 2,3 with
# starting point at the 15th and 22nd index.
 spbal::cppRSHalton_br(n = 10, seeds = c(14, 21))

Filter sample using a minimum distance.

Description

The input parameter minRadius >= 0 is the minimum distance between any two points in the sample. The condition is applied to the points in the over-sample, result$overSample. Call these points x1, x2, ..., xB. Create a new set S = {x1}. Starting from x1, check whether dist(S,x2) > minRadius, where dist is the smallest distance from a point in S to the candidate point (single linkage distance). If it is, add x2 to S. Next, check whether dist(S,x3) > minRadius and, if so, add x3 to S. Continue in this way until xB is reached.

The distances are calculated as great circles over an oblate spheroid and the units are meters.
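
The greedy filter described above can be sketched in base R. For simplicity this illustration uses Euclidean distances on a coordinate matrix rather than the great circle distances used by the package, and the name filter_on_distance_sketch is hypothetical.

```r
# Hypothetical base R version of the greedy filter. `pts` is a matrix with
# one point per row, already in over-sample order. Returns the row indices
# of the points that are kept.
filter_on_distance_sketch <- function(pts, minRadius) {
  keep <- 1  # x1 is always kept
  for (i in seq_len(nrow(pts))[-1]) {
    # Smallest Euclidean distance from candidate i to the points kept so far.
    diffs <- sweep(pts[keep, , drop = FALSE], 2, pts[i, ])
    d <- min(sqrt(rowSums(diffs^2)))
    if (d > minRadius) keep <- c(keep, i)
  }
  keep
}

pts <- cbind(x = c(0, 1, 2.5, 3, 5), y = 0)
filter_on_distance_sketch(pts, minRadius = 1.5)  # 1 3 5
filter_on_distance_sketch(pts, minRadius = 100)  # 1
```

The second call shows the behaviour for an oversized minRadius: only the first point survives.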

Usage

filterOnDistance(overSample, minRadius)

Arguments

overSample

A HIP sample.

minRadius

The minimum distance between any two points in the sample.

Details

Key points:

result$minRadius is never empty (it always contains x1). Hence, if the user chooses a very large minRadius, they get one point.
result$minRadius is a subset of result$overSample.
The number of points in result$minRadius is random.
If the user wants n points and result$minRadius has fewer than n, they can reduce minRadius and/or increase the iterations parameter.
To obtain a sample with the minimum radius property, use:
- smp <- result$minRadius
- sample <- smp[1:n,]

Value

S The set of points that are more than minRadius from each other.

Author(s)

Phil Davies.

Randomly generates a point in the study region and maps it to the Halton Sequence.

Description

This function uses sf::st_sample() internally to generate a random point in the study region. It then maps that point to the Halton Sequence to ensure that the random starting point is within the region. That point is approximately mapped, and thus a check to make sure the new point is still within the study region is completed. This function is used internally, but may be useful for a user to generate multiple seeds in advance in a simulation study using BAS.

Usage

findBASSeed(shapefile, bb, n = 1, verbose = FALSE)

Arguments

shapefile

Shape file as a polygon (sp or sf) of the study area(s).

bb

Bounding box which defines the sample. A bounding box must be supplied and may not necessarily be the bounding box of the provided shape.

n

Number of seeds to produce.

verbose

Boolean if you want to see any output printed to screen. Helpful if taking a long time. Default is FALSE i.e. no informational messages are displayed.

Value

A vector when n = 1 (Default), or a matrix when n > 1.

Author(s)

Paul van Dam-Bates

Get a randomly chosen Halton point from within the study area and the associated seeds.

Description

This function repeatedly calls function spbal::getBASSample to generate the Halton frame sample. This function selects the first point at random from those points in the study area. This point and the seeds used to generate the sample are returned to the caller.

Usage

findFirstStudyRegionPoint(shapefile, bb, seeds, verbose = FALSE)

Arguments

shapefile

Shape file as a polygon (sp or sf) of the study area(s).

bb

Bounding box which defines the Master Sample. A bounding box must be supplied.

seeds

A vector of 2 seeds, u1 and u2. If not specified, the default is NULL and will be defined randomly using function generateUVector.

verbose

Boolean if you want to see any output printed to screen. Helpful if taking a long time. Default is FALSE i.e. no informational messages are displayed.

Value

A list containing the following variables:

seeds The u1 and u2 seeds used to generate the first point.
k The index of the first point in the initial sample.

Author(s)

This function was written by Phil Davies.

Randomly generates a point in the study region and maps it to the Halton Sequence.

Description

This function uses sf::st_sample() to generate a random point in the study region. It then maps that point to the Halton Sequence to ensure that the random starting point is within the region. This function is used internally, and is called by a wrapper findBASSeed().

Usage

findRandomHaltonIndex(shapefile, bb, n = 1, uplim = 10^6, verbose = FALSE)

Arguments

shapefile

Shape file as a polygon (sp or sf) of the study area(s).

bb

Bounding box which defines the sample. A bounding box must be supplied and may not necessarily be the bounding box of the provided shape.

n

Number of seeds to produce.

uplim

Upper limit that controls how accurately a point is mapped to the Halton sequence. Values larger than 10^15 are not advised.

verbose

Boolean if you want to see any output printed to screen. Helpful if taking a long time. Default is FALSE i.e. no informational messages are displayed.

Value

A matrix with n rows and 2 columns.

Author(s)

Paul van Dam-Bates and Blair Robertson

Generate a vector of two random seeds.

Description

This function generates two seeds, u1 and u2, in the range 0 to 2^11 and 0 to 3^7 respectively. These are returned to the caller in the form of a vector. This is for internal use only.

Usage

generateUVector()

Value

A vector containing two seeds, u1 and u2.

Author(s)

Phil Davies.
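
A minimal base R sketch of such a seed generator is shown below. It assumes that both endpoints of each range are allowed and that the seeds are drawn uniformly; the name generate_u_sketch is hypothetical and this is not the package's code.

```r
# Hypothetical sketch: draw u1 uniformly from 0..2^11 and u2 uniformly
# from 0..3^7 (endpoints included, by assumption).
generate_u_sketch <- function() {
  u1 <- sample(0:(2^11), size = 1)
  u2 <- sample(0:(3^7), size = 1)
  c(u1, u2)
}

set.seed(511)
u <- generate_u_sketch()
u
```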

Generate the BAS sample.

Description

This function is repeatedly called from function spbal::getBASSampleDriver to generate a BAS sample.

Usage

getBASSample(shapefile, bb, n, seeds, boxes = NULL)

Arguments

shapefile

Shape file as a polygon (sp or sf) to select sites for.

bb

Bounding box which defines the area around the study area. A bounding box must be supplied.

n

Number of sites to select. If using stratification it is a named vector containing sample sizes of each group.

seeds

A vector of 2 seeds, u1 and u2. seeds must have a value when this function is called.

boxes

A vector of integers giving the indices, counted from the random starting point of the Halton sequence, at which to sample.

Value

A list containing two variables, $sample containing locations in the BAS sample, in BAS order and $seeds, the u1 and u2 seeds used to generate the sample.

Author(s)

This function was written by Phil Davies.

Manage BAS sampling.

Description

This function repeatedly calls function spbal::getBASSample to generate the BAS sample. Once the requested number of points within the intersection of the shapefile and the study area have been obtained, the sample and seeds are returned to the caller.

Usage

getBASSampleDriver(shapefile, bb, n, seeds, verbose = FALSE)

Arguments

shapefile

sf shape file as a polygon to select sites from.

bb

Bounding box which defines the area around the study area. A bounding box must be supplied.

n

Number of sites to select. If using stratification it is a named vector containing sample sizes of each group.

seeds

A vector of 2 seeds, u1 and u2. If not specified, the default is NULL and will be defined randomly.

verbose

Boolean if you want to see any output printed to screen. Helpful if taking a long time. Default is FALSE i.e. no informational messages are displayed.

Value

A list containing two variables, $sample containing locations in the BAS sample, in BAS order and $seeds, the u1 and u2 seeds used to generate the sample.

Author(s)

This function was written by Phil Davies based on original code by Paul van Dam-Bates from the BASMasterSample package.
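
The acceptance loop described for getBASSampleDriver can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the study region is a disc inside the unit square (standing in for a shapefile and its bounding box), and the names radical_inverse, inside_region and bas_sketch are hypothetical.

```r
# Radical inverse of the non-negative integer k in the given base.
radical_inverse <- function(k, base) {
  x <- 0
  f <- 1 / base
  while (k > 0) {
    x <- x + (k %% base) * f
    k <- k %/% base
    f <- f / base
  }
  x
}

# Assumed study region: a disc of radius 0.25 centred in the unit square.
inside_region <- function(x, y) (x - 0.5)^2 + (y - 0.5)^2 <= 0.25^2

# Walk along the Halton sequence from the seeds and keep the first n
# points that fall inside the study region.
bas_sketch <- function(n, seeds, bases = c(2, 3)) {
  accepted <- matrix(NA_real_, nrow = n, ncol = 2,
                     dimnames = list(NULL, c("x", "y")))
  count <- 0
  k <- 0
  while (count < n) {
    x <- radical_inverse(seeds[1] + k, bases[1])
    y <- radical_inverse(seeds[2] + k, bases[2])
    if (inside_region(x, y)) {
      count <- count + 1
      accepted[count, ] <- c(x, y)
    }
    k <- k + 1
  }
  accepted
}

smp20 <- bas_sketch(n = 20, seeds = c(14, 21))
smp50 <- bas_sketch(n = 50, seeds = c(14, 21))
all.equal(smp20, smp50[1:20, ])  # TRUE: the larger sample extends the smaller
```

Reusing the seeds is what makes the larger sample a continuation of the smaller one, which mirrors the master sampling behaviour shown in the BAS examples.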

Obtain a Halton Frame over a shapefile.

Description

An internal only function.

Usage

getHaltonFrame(shapefile, J, i, bases, seeds, crs)

Arguments

shapefile

A MULTIPOINT or POINT object that we want to generate a halton frame for.

J

The number of grid cells. A list of 2 values.

i

An integer to add to the J parameter elements to expand the Halton Frame in both directions if the required number of sample points cannot be found in the region of interest in the current Halton frame.

bases

Co-prime base for the Halton Sequence.

seeds

A list of 2 seeds, u1 and u2.

crs

Coordinate reference system for the shapefile.

Details

This function was written by Phil Davies.

Value

A list containing the following variables: hf_, sample, pts.shp, bb.new, seeds

Generate a Halton frame.

Description

Find the requested number of Halton points from within a study area using the supplied J and seeds parameters. If the requested number of points is not found on the first attempt, the frame is expanded, and spbal::getHaltonFrame is called again. This process is repeated until the requested number of points is found. The points and the seeds used to generate the sample are returned to the caller.

Usage

getHaltonPointsFromExpandableGrid(
  shapefile,
  N,
  J = base::c(4, 3),
  bases,
  seeds,
  crs,
  verbose = FALSE,
  stratify_found_first = FALSE
)

Arguments

shapefile

Shape file as a polygon (sp or sf) of the study area(s).

N

Number of sites to select. If using stratification it is a named vector containing sample sizes of each group.

J

The number of grid cells. A list of 2 values. The default value is c(4, 3).

bases

Co-prime base for the Halton Sequence. The default value is c(2, 3).

seeds

A vector of 2 seeds, u1 and u2.

crs

Coordinate reference system for the shapefile.

verbose

Boolean if you want to see any output printed to screen. Helpful if taking a long time. Default is FALSE i.e. no informational messages are displayed.

stratify_found_first

A flag to indicate whether we have found the first point in the study region or not. Default FALSE.

Value

A list containing five variables:

i The index of the first point chosen at random in the study area.
diff_ Halton points, the intersection of the bounding box and the study area.
pts.shp Halton frame, the sample points within the study area.
bb.new The bounding box.
seeds The u1 and u2 seeds used to generate the sample.

Author(s)

Phil Davies.
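
The expand-and-retry idea can be sketched in base R. This is a simplified illustration, not the package's code: the study region is an assumed rectangle in the unit square, the frame is taken to be the first B = prod(bases^J) Halton points, and the names radical_inverse, inside_region and expandable_grid_sketch are hypothetical.

```r
# Radical inverse of the non-negative integer k in the given base.
radical_inverse <- function(k, base) {
  x <- 0
  f <- 1 / base
  while (k > 0) {
    x <- x + (k %% base) * f
    k <- k %/% base
    f <- f / base
  }
  x
}

# Assumed study region: a small rectangle in the corner of the unit square.
inside_region <- function(x, y) x < 0.25 & y < 1 / 3

# Generate a frame of B = prod(bases^(J + i)) Halton points and add 1 to i
# (refining the frame in both directions) until at least N points fall
# inside the study region.
expandable_grid_sketch <- function(N, J = c(2, 1), bases = c(2, 3),
                                   seeds = c(0, 0)) {
  i <- 0
  repeat {
    B <- prod(bases^(J + i))
    idx <- seq_len(B) - 1
    x <- vapply(seeds[1] + idx, radical_inverse, numeric(1), base = bases[1])
    y <- vapply(seeds[2] + idx, radical_inverse, numeric(1), base = bases[2])
    hit <- inside_region(x, y)
    if (sum(hit) >= N) break
    i <- i + 1
  }
  pts <- cbind(x = x[hit], y = y[hit])[seq_len(N), , drop = FALSE]
  list(i = i, J = J + i, pts = pts)
}

res <- expandable_grid_sketch(N = 5)
res$i  # 1: the 4 x 3 frame holds one point in the region, the 8 x 9 frame six
res$J  # 3 2
```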

Extract all points with a specified panel id from a sample.

Description

This is the main function for selecting sites using the BAS master sample. It assumes that you have already defined the master sample using the BoundingBox() function or will be selecting a marine master sample site in BC.

Usage

getPanel(shapefile, panelid)

Arguments

shapefile

Shape file as a polygon (sp or sf) containing a sample that contains a feature column named panel_id.

panelid

The id of the panel in shapefile that the user wants sample points from.

Value

The sample for the specified panel.

Author(s)

Phil Davies.

Examples

# Halton frame overlapping panel design showing use of getPanel.

# Use the North Carolina shapefile supplied in the sf R package.
shp_file <- sf::st_read(system.file("shape/nc.shp", package="sf"))
shp_gates <- shp_file[shp_file$NAME == "Gates",]

# Vertically aligned master sample bounding box.
bb <- spbal::BoundingBox(shapefile = shp_gates)

# Three panels, of 20 samples each.
panels <- c(20, 20, 20)

# second panel overlaps first panel by 5, and third panel
# overlaps second panel by 5.
panel_overlap <- c(0, 5, 5)

# generate the sample.
samp <- spbal::HaltonFrame(J = c(4, 3),
                           boundingbox = bb,
                           panels = panels,
                           panel_overlap = panel_overlap,
                           shapefile = shp_gates)

# get halton frame data from our sample.
samp3 <- samp$hf.pts.shp
samp3

panelid <- 1
olPanel_1 <- spbal::getPanel(samp3, panelid)

Extract a sample of a specified size from a master sample.

Description

Extracts a sample of n points from a master sample, for example a Halton frame.

Usage

getSample(shapefile, n, randomStart = FALSE, strata = NULL, stratum = NULL)

Arguments

shapefile

A MULTIPOINT or POINT object from where to take the sample.

n

The number of sample points to return.

randomStart

Whether a spatially balanced sample will be randomly drawn from the frame or not. Default value is FALSE.

strata

to be added

stratum

The name of a column in the dataframe attached to shapefile that defines the strata of interest.

Value

A list containing the following variable:

sample The sample from the shapefile POINTS.

Author(s)

Phil Davies.

Examples

# Draw a spatially balanced sample of n = 25 from a Halton Frame over Gates.

# Use the North Carolina shapefile supplied in the sf R package.
shp_file <- sf::st_read(system.file("shape/nc.shp", package="sf"))
shp_gates <- shp_file[shp_file$NAME == "Gates",]

# Vertically aligned master sample bounding box.
bb <- spbal::BoundingBox(shapefile = shp_gates)

set.seed(511)
result7 <- spbal::HaltonFrame(shapefile = shp_gates,
                              J = c(6, 4),
                              boundingbox = bb)
Frame <- result7$hf.pts.shp

# Get the first 25 sites from a B = (2^6) * (3^4) Halton Frame (5,184 grid
# points covering Gates).
n_samples <- 25
FrameSample <- spbal::getSample(shapefile = Frame,
                                n = n_samples)
FrameSample <- FrameSample$sample
FrameSample
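
The size of the Halton frame used above follows directly from J and the bases. A small arithmetic sketch, assuming the default bases c(2, 3):

```r
bases <- c(2, 3)   # assumed default Halton bases
J <- c(6, 4)       # the J value used in the example above

cells_per_axis <- bases^J    # 2^6 = 64 and 3^4 = 81 cells per dimension
B <- prod(cells_per_axis)    # total number of Halton boxes in the frame
B
```

With these values B is 5184, so a sample of n = 25 uses only a small fraction of the frame.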

Permute Halton indices.

Description

Permutes the Halton indices and returns them together with the number of Halton boxes.

Usage

hipIndexRandomPermutation(its)

Arguments

its

The number of partitioning iterations.

Value

A list containing the following variables:

permHaltonIndex The permuted halton indices for all points.
B The number of Halton boxes.

Author(s)

Phil Davies, Blair Robertson.

Partition the population.

Description

Partition the resource into boxes with the same nested structure as Halton boxes. The spbal parameter iterations determines the number of boxes used in the HIP partition; the number of boxes should be larger than the sample size but less than the population size.

Usage

hipPartition(pts, its)

Arguments

pts

The population of points.

its

The number of partitioning iterations.

Value

A list containing the following variables:

ptsIndex The population index.
HaltonIndex Updated Halton indices for all points in pts.

Author(s)

Phil Davies, Blair Robertson.

First dimension split.

Description

Split point pairs using the first dimension.

Usage

hipX1split(x1pts, HaltonIndex, BoxIndex, xlevel, x1Hpts)

Arguments

x1pts

First dimension component of point pair.

HaltonIndex

Halton indices for all points in x1Hpts.

BoxIndex

Index of current box to process.

xlevel

The current iteration level.

x1Hpts

First dimension component of Halton point pair.

Value

A variable called HaltonIndex, the updated Halton indices for all points in x1Hpts.

Author(s)

Phil Davies, Blair Robertson.

Second dimension split.

Description

Split point pairs using the second dimension.

Usage

hipX2split(x2pts, HaltonIndex, BoxIndex, xlevel, x2Hpts)

Arguments

x2pts

Second dimension component of point pair.

HaltonIndex

Halton indices for all points in x2Hpts.

BoxIndex

Index of current box to process.

xlevel

The current iteration level.

x2Hpts

Second dimension component of Halton point pair.

Value

A variable called HaltonIndex, the updated Halton indices for all points in x2Hpts.

Author(s)

Phil Davies, Blair Robertson.

Check if an object is an sf points object.

Description

Tests whether the object passed to the function is an sf points object. For internal use only.

Usage

is_sf_points(x)

Arguments

x

A probable sf points object.

Details

Detects whether an object is an sf points object.

Value

Either TRUE or FALSE.

Author(s)

Phil Davies, Blair Robertson.

Compute the log of a to base b.

Description

Compute the log of a to base b.

Arguments

a

Integer whose logarithm to base b is computed.

b

The base of the logarithm.

Value

The log of a to base b.

Author(s)

Phil Davies.
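
For readers who want to reproduce this calculation in R, a minimal sketch follows. The helper names log_base and int_log are hypothetical, and how the package computes the value internally is not stated here. The floating-point version uses the change-of-base identity; the integer version counts repeated divisions and returns the floor of the logarithm, which avoids rounding error when a is an exact power of b.

```r
# Change-of-base identity: log_b(a) = log(a) / log(b).
log_base <- function(a, b) log(a) / log(b)

# Integer variant: count how many times a can be divided by b before it
# drops below b. Returns floor(log_b(a)) for integers a >= 1 and b >= 2.
int_log <- function(a, b) {
  count <- 0
  while (a >= b) {
    a <- a %/% b
    count <- count + 1
  }
  count
}

log_base(243, 3)   # close to 5, but may not be exactly 5 in floating point
int_log(243, 3)    # exact integer result
```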

Vector modulus.

Description

Computes the remainder of dividing a by n. This function uses a trick to avoid using the modulo operator directly, which can be slow for large values of a and n.

Arguments

a

The input value of type T. This is a NumericVector.

n

The divisor of type int.

Value

The remainder of dividing a by n, of type T in the form of a NumericVector.

Author(s)

Phil Davies.
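
The exact trick used by the compiled code is not described here. One common way to obtain a remainder without the modulo operator is to subtract the largest multiple of n that does not exceed a, sketched below in R with the hypothetical helper mod_without_operator.

```r
# Remainder via floor division: a - floor(a / n) * n.
# Works element-wise on a numeric vector a with a scalar divisor n.
mod_without_operator <- function(a, n) a - floor(a / n) * n

a <- c(7, 10, 12.5)
remainders <- mod_without_operator(a, 5)
remainders
```

For these inputs the result agrees with R's built-in a %% 5.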

Remove duplicate values from a NumericVector.

Description

Sorts the input numeric vector and removes any duplicate values.

Arguments

vec

A NumericVector that may contain duplicate values.

Value

A NumericVector that is sorted with duplicates removed.

Author(s)

Phil Davies.
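
The same behaviour can be reproduced in plain R for checking results. This is a sketch of the equivalent operation only, with sorted_unique as a hypothetical name; it is not the compiled routine itself.

```r
# Drop duplicates, then sort ascending.
sorted_unique <- function(vec) sort(unique(vec))

result <- sorted_unique(c(3, 1, 2, 3, 1))
result
```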

Generate a rotation matrix for rotating objects later.

Description

Generate a rotation matrix for rotating objects later.

Usage

rot(a)

Arguments

a

Angle of rotation, in radians.

Value

A rotation matrix.

Author(s)

This function was first written by Paul van Dam-Bates for the package BASMasterSample.
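
For reference, the standard two-dimensional counter-clockwise rotation matrix can be built as follows. The name rot_sketch is hypothetical, and the sign convention shown is an assumption; the package function may use a different orientation.

```r
# Standard 2 x 2 counter-clockwise rotation matrix for angle a (radians).
# matrix() fills column by column, so the columns are
# (cos a, sin a) and (-sin a, cos a).
rot_sketch <- function(a) {
  matrix(c(cos(a), sin(a), -sin(a), cos(a)), nrow = 2, ncol = 2)
}

# Rotating the unit vector along x by a quarter turn gives the unit
# vector along y (up to floating-point rounding).
quarter_turn <- rot_sketch(pi / 2) %*% c(1, 0)
quarter_turn
```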

Scale and rotate points from the unit square to a defined projection.

Description

Given some coordinates on [0,1)x[0,1), shift and scale them to the bounding box, and then rotate them given the bounding box rotation defined by the Master Sample.

Usage

rotate.scale.coords(coords, bb, back = TRUE)

Arguments

coords

Output from RSHalton() to be converted to the spatial surface of interest.

bb

Special shape file defining the bounding box with attributes for centroid and rotation.

back

Boolean for whether or not the rotation is back to the original rotated bounding box.

Value

sf spatial points with projection defined in bb.

Author(s)

This function was first written by Paul van Dam-Bates for the package BASMasterSample.
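
The shift, scale and rotate steps described above can be sketched with plain matrices. The helpers scale_to_bbox and rotate_about, the explicit xlim and ylim arguments, and rotation about a given centre are assumptions made for this illustration; the package function works with sf objects and reads the centroid and rotation from bb.

```r
# Map points from the unit square onto a rectangle given by xlim and ylim.
scale_to_bbox <- function(coords, xlim, ylim) {
  cbind(
    x = xlim[1] + coords[, 1] * diff(xlim),
    y = ylim[1] + coords[, 2] * diff(ylim)
  )
}

# Rotate points counter-clockwise by `angle` radians about `centre`.
rotate_about <- function(xy, centre, angle) {
  R <- matrix(c(cos(angle), sin(angle), -sin(angle), cos(angle)), 2, 2)
  shifted <- sweep(xy, 2, centre)            # move the centre to the origin
  sweep(shifted %*% t(R), 2, centre, "+")    # rotate, then move back
}

unit_pts <- rbind(c(0, 0), c(0.5, 0.5))
scaled <- scale_to_bbox(unit_pts, xlim = c(10, 20), ylim = c(100, 140))
rotated <- rotate_about(scaled, centre = c(15, 120), angle = pi / 2)
rotated
```

In this toy case the second point sits at the centre of the rectangle, so the rotation leaves it where it is.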

Finds a set of Halton indices that will create BAS points within a shape bounding box.

Description

This function is designed to be called internally for efficiency in site selection.

Usage

setBASIndex(shapefile, bb, seeds = base::c(0, 0))

Arguments

shapefile

Shape file as a polygon (sp or sf) to select sites for.

bb

Bounding box which defines the area around the study area. A bounding box must be supplied.

seeds

A vector of 2 seeds, u1 and u2. seeds must have a value when this function is called.

Details

To be used when doing a Master Sample and the bounding box of the greater frame is potentially much larger than the polygon being sampled. In this case, we don't want to generate points across the entire larger bounding box region and then clip them. Instead, we can make use of the Halton sequence and only generate BAS points near to the shape being sampled. This function finds and returns those indices.

Value

A list containing five variables:

boxes The indices of the BAS sample that fall into the bounding box.
J The number of subdivision powers taken to find those boxes.
B The number of boxes that the indices relate to (1 to B).
xlim The x limit of the bounding box of the shapefile, shifted to the base[1]^J[1] coordinates on the unit box [0,1).
ylim The y limit of the bounding box of the shapefile, shifted to the base[2]^J[2] coordinates on the unit box [0,1).

Validate spbal function parameters.

Description

This function is used to validate parameters passed to all spbal functions that may be called by a user.

Usage

validate_parameters(parm, parm_value)

Arguments

parm

The parameter to be validated.

parm_value

The value of the parameter to be validated. Must be defined as a list.

Value

Always returns TRUE, indicating that the parameter was validated successfully. If a parameter fails validation, further execution is terminated using the stop() function.

Author(s)

Phil Davies.
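
The return-TRUE-or-stop pattern described above can be sketched as follows. The function validate_sketch and the rule it enforces (positive numeric values) are hypothetical and chosen only for illustration; the checks the package actually applies to each parameter are not listed here.

```r
# Hypothetical validator: every element of parm_value must be numeric
# and strictly positive; otherwise execution is halted with stop().
validate_sketch <- function(parm, parm_value) {
  for (v in parm_value) {
    if (!is.numeric(v) || any(v <= 0)) {
      stop(sprintf("Parameter %s must contain positive numeric values.", parm),
           call. = FALSE)
    }
  }
  TRUE
}

ok <- validate_sketch("n", list(100))

# A failing value raises an error; tryCatch captures the message here.
err <- tryCatch(validate_sketch("n", list(-1)),
                error = function(e) conditionMessage(e))
```

Passing the value as a list, as the parm_value argument requires, lets one call check several values for the same parameter.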