Type: | Package |
Title: | Disaggregation Modelling |
Version: | 0.4.0 |
Description: | Fits disaggregation regression models using 'TMB' ('Template Model Builder'). When the response data are aggregated to polygon level but the predictor variables are at a higher resolution, these models can be useful. Regression models with spatial random fields. The package is described in detail in Nandi et al. (2023) <doi:10.18637/jss.v106.i11>. |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
Imports: | splancs, Matrix, stats, TMB, dplyr, ggplot2, cowplot, rSPDE, sparseMVN, fmesher, tidyterra, terra, sf, utils |
Suggests: | testthat, knitr, rmarkdown, SpatialEpi |
LinkingTo: | TMB, RcppEigen |
SystemRequirements: | GNU make |
VignetteBuilder: | knitr |
NeedsCompilation: | yes |
Packaged: | 2024-10-02 16:23:34 UTC; simon |
Author: | Anita Nandi |
Maintainer: | Tim Lucas <timcdlucas@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2024-10-02 20:40:03 UTC |
Function to fit the disaggregation model
Description
Function to fit the disaggregation model
Usage
as.disag_data(
polygon_shapefile,
shapefile_names,
covariate_rasters,
polygon_data,
covariate_data,
aggregation_pixels,
coords_for_fit,
coords_for_prediction,
start_end_index,
mesh = NULL,
coordsForFit = NULL,
coordsForPrediction = NULL,
startendindex = NULL
)
Arguments
polygon_shapefile |
sf object containing the response data |
shapefile_names |
List of 2: polygon id variable name and response variable name from x |
covariate_rasters |
SpatRaster of covariates |
polygon_data |
data.frame with two columns: polygon id and response |
covariate_data |
data.frame with cell id, polygon id and covariate columns |
aggregation_pixels |
vector with value of aggregation raster at each pixel |
coords_for_fit |
coordinates of the covariate data points within the polygons in x |
coords_for_prediction |
coordinates of the covariate data points in the whole raster extent |
start_end_index |
matrix containing the start and end index for each polygon |
mesh |
inla.mesh object to use in the fit |
coordsForFit |
Deprecated. |
coordsForPrediction |
Deprecated. |
startendindex |
Deprecated. |
Value
A list is returned of class disag_data
.
The functions summary, print and plot can be used on disag_data
.
The list of class disag_data
contains:
x |
The sf object used as an input. |
covariate_rasters |
The SpatRaster used as an input. |
polygon_data |
A data frame with columns of area_id, response and N (sample size: all NAs unless using binomial data). Each row represents a polygon. |
covariate_data |
A data frame with columns of area_id, cell_id and one for each covariate in covariate_rasters. Each row represents a pixel in a polygon. |
aggregation_pixels |
An array with the value of the aggregation raster for each pixel in the same order as the rows of covariate_data. |
coords_for_fit |
A matrix with two columns of x, y coordinates of pixels within the polygons. Used to make the spatial field. |
coords_for_prediction |
A matrix with two columns of x, y coordinates of pixels in the whole Raster. Used to make predictions. |
start_end_index |
A matrix with two columns containing the start and end index of the pixels within each polygon. |
mesh |
A INLA mesh to be used for the spatial field of the disaggregation model. |
Build mesh for disaggregaton model
Description
build_mesh function takes a sf object and mesh arguments to build an appropriate mesh for the spatial field.
Usage
build_mesh(shapes, mesh_args = NULL, mesh.args = NULL)
Arguments
shapes |
sf covering the region under investigation. |
mesh_args |
list of parameters that control the mesh structure. convex, concave and resolution, to control the boundary of the inner mesh, and max.edge, cutoff and offset, to control the mesh itself, with the parameters having the same meaning as in the INLA functions inla.convex.hull and inla.mesh.2d. cut has been deprecated - use cutoff instead. |
mesh.args |
Deprecated. |
Details
The mesh is created by finding a tight boundary around the polygon data, and creating a fine mesh within the boundary and a coarser mesh outside. This speeds up computation time by only having a very fine mesh within the area of interest and having a small region outside with a coarser mesh to avoid edge effects.
Six mesh parameters can be specified as arguments: convex, concave and resolution, to control the boundary of the inner mesh, and max.edge, cutoff and offset, to control the mesh itself, with the names meaning the same as used by the fmesher functions fm_nonconvex_hull_inla and fm_mesh_2d.
Defaults are: pars <- list(convex = -0.01, concave = -0.5, resolution = 300, max.edge = c(3.0, 8), cutoff = 0.4, offset = c(1, 15)).
Value
An inla.mesh object
Examples
## Not run:
polygons <- list()
for(i in 1:14) {
row <- ceiling(i/10)
col <- ifelse(i %% 10 != 0, i %% 10, 10)
xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row
polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin),
c(ymax, ymax, ymin, ymin, ymax)))
}
polys <- lapply(polygons, sf::st_polygon)
response_df <- data.frame(area_id = 1:100,
response = runif(100, min = 0, max = 10))
spdf <- sf::st_sf(polys, response_df)
my_mesh <- build_mesh(spdf)
## End(Not run)
Fit the disaggregation model
Description
disag_model function takes a disag_data object created by
prepare_data
and performs a Bayesian disaggregation fit.
Usage
disag_model(
data,
priors = NULL,
family = "gaussian",
link = "identity",
iterations = 100,
field = TRUE,
iid = TRUE,
hess_control_parscale = NULL,
hess_control_ndeps = 1e-04,
silent = TRUE
)
Arguments
data |
disag_data object returned by |
priors |
list of prior values:
|
family |
likelihood function: gaussian, binomial or poisson |
link |
link function: logit, log or identity |
iterations |
number of iterations to run the optimisation for |
field |
logical. Flag the spatial field on or off |
iid |
logical. Flag the iid effect on or off |
hess_control_parscale |
Argument to scale parameters during the calculation of the Hessian.
Must be the same length as the number of parameters. See |
hess_control_ndeps |
Argument to control step sizes during the calculation of the Hessian.
Either length 1 (same step size applied to all parameters) or the same length as the number of parameters.
Default is 1e-3, try setting a smaller value if you get NaNs in the standard error of the parameters.
See |
silent |
logical. Suppress verbose output. |
Details
The model definition
The disaggregation model makes predictions at the pixel level:
link(pred_i) = \beta_0 + \beta X + GP(s_i) + u_i
And then aggregates these predictions to the polygon level using the weighted sum (via the aggregation raster, agg_i
):
cases_j = \sum_{i \epsilon j} pred_i \times agg_i
rate_j = \frac{\sum_{i \epsilon j} pred_i \times agg_i}{\sum_{i \epsilon j} agg_i}
The different likelihood correspond to slightly different models (y_j
is the response count data):
Gaussian: If
\sigma
is the dispersion of the pixel data,\sigma_j
is the dispersion of the polygon data, where\sigma_j = \sigma \sqrt{\sum agg_i^2} / \sum agg_i
dnorm(y_j/\sum agg_i, rate_j, \sigma_j)
- predicts incidence rate.
Binomial: For a survey in polygon j,
y_j
is the number positive andN_j
is the number tested.dbinom(y_j, N_j, rate_j)
- predicts prevalence rate.
Poisson:
dpois(y_j, cases_j)
- predicts incidence count.
Specify priors for the regression parameters, field and iid effect as a single list. Hyperpriors for the field
are given as penalised complexity priors you specify \rho_{min}
and \rho_{prob}
for the range of the field
where P(\rho < \rho_{min}) = \rho_{prob}
, and \sigma_{min}
and \sigma_{prob}
for the variation of the field
where P(\sigma > \sigma_{min}) = \sigma_{prob}
. Also, specify pc priors for the iid effect
The family and link arguments are used to specify the likelihood and link function respectively. The likelihood function can be one of gaussian, poisson or binomial. The link function can be one of logit, log or identity. These are specified as strings.
The field and iid effect can be turned on or off via the field and iid logical flags. Both are default TRUE.
The iterations argument specifies the maximum number of iterations the model can run for to find an optimal point.
The silent argument can be used to publish/suppress verbose output. Default TRUE.
Value
A list is returned of class disag_model
.
The functions summary, print and plot can be used on disag_model
.
The list of class disag_model
contains:
obj |
The TMB model object returned by |
opt |
The optimized model object returned by |
sd_out |
The TMB object returned by |
data |
The disag_data object used as an input to the model. |
model_setup |
A list of information on the model setup. Likelihood function (family), link function(link), logical: whether a field was used (field) and logical: whether an iid effect was used (iid). |
References
Nanda et al. (2023) disaggregation: An R Package for Bayesian Spatial Disaggregation Modeling. <doi:10.18637/jss.v106.i11>
Examples
## Not run:
polygons <- list()
n_polygon_per_side <- 10
n_polygons <- n_polygon_per_side * n_polygon_per_side
n_pixels_per_side <- n_polygon_per_side * 2
for(i in 1:n_polygons) {
row <- ceiling(i/n_polygon_per_side)
col <- ifelse(i %% n_polygon_per_side != 0, i %% n_polygon_per_side, n_polygon_per_side)
xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row
polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin),
c(ymax, ymax, ymin, ymin, ymax)))
}
polys <- lapply(polygons,sf::st_polygon)
N <- floor(runif(n_polygons, min = 1, max = 100))
response_df <- data.frame(area_id = 1:n_polygons, response = runif(n_polygons, min = 0, max = 1000))
spdf <- sf::st_sf(response_df, geometry = polys)
# Create raster stack
r <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side)
terra::ext(r) <- terra::ext(spdf)
r[] <- sapply(1:terra::ncell(r), function(x){
rnorm(1, ifelse(x %% n_pixels_per_side != 0, x %% n_pixels_per_side, n_pixels_per_side), 3))}
r2 <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side)
terra::ext(r2) <- terra::ext(spdf)
r2[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ceiling(x/n_pixels_per_side), 3))
cov_stack <- c(r, r2)
names(cov_stack) <- c('layer1', 'layer2')
test_data <- prepare_data(polygon_shapefile = spdf,
covariate_rasters = cov_stack)
result <- disag_model(test_data, iterations = 2)
## End(Not run)
Roxygen commands
Description
Roxygen commands
Usage
dummy()
Get a SpatRaster of covariates from a folder containing .tif files
Description
Looks in a specified folder for raster files. Returns a multi-layered SpatRaster of the rasters cropped to the extent specified by the shape parameter.
Usage
getCovariateRasters(directory, file_pattern = ".tif$", shape)
Arguments
directory |
Filepath to the directory containing the rasters. |
file_pattern |
Pattern the filenames must match. Default is all files ending in .tif . |
shape |
An object with an extent that the rasters will be cropped to. |
Value
A multi-layered SpatRaster of the raster files in the directory
Examples
## Not run:
getCovariateRasters('/home/rasters', '.tif$', shape)
## End(Not run)
Extract polygon id and response data into a data.frame from a sf object
Description
Returns a data.frame with a row for each polygon in the sf object and columns: area_id, response and N, containing the id of the polygon, the values of the response for that polygon, and the sample size respectively. If the data is not survey data (the sample size does not exist), this column will contain NAs.
Usage
getPolygonData(
shape,
id_var = "area_id",
response_var = "response",
sample_size_var = NULL
)
Arguments
shape |
A sf object containing response data. |
id_var |
Name of column in shape object with the polygon id. Default 'area_id'. |
response_var |
Name of column in shape object with the response data. Default 'response'. |
sample_size_var |
For survey data, name of column in sf object (if it exists) with the sample size data. Default NULL. |
Value
A data.frame with a row for each polygon in the sf object and columns: area_id, response and N, containing the id of the polygon, the values of the response for that polygon, and the sample size respectively. If the data is not survey data (the sample size does not exist), this column will contain NAs.
Examples
{
polygons <- list()
for(i in 1:100) {
row <- ceiling(i/10)
col <- ifelse(i %% 10 != 0, i %% 10, 10)
xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row
polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin),
c(ymax, ymax, ymin, ymin, ymax)))
}
polys <- lapply(polygons,sf::st_polygon)
response_df <- data.frame(area_id = 1:100, response = runif(100, min = 0, max = 10))
spdf <- sf::st_sf(response_df, geometry = polys)
getPolygonData(spdf, id_var = 'area_id', response_var = 'response')
}
Function to match pixels to their corresponding polygon
Description
From the covariate data and polygon data, the function matches the polygon id between the two to find which pixels from the covariate data are contained in each of the polygons.
Usage
getStartendindex(covariates, polygon_data, id_var = "area_id")
Arguments
covariates |
data.frame with each covariate as a column an and id column. |
polygon_data |
data.frame with polygon id and response data. |
id_var |
string with the name of the column in the covariate data.frame containing the polygon id. |
Details
Takes a data.frame containing the covariate data with a polygon id column and one column for each covariate,
and another data.frame containing polygon data with a polygon id, response and sample size column (as returned
by getPolygonData
function).
Returns a matrix with two columns and one row for each polygon. The first column is the index of the first row in covariate data that corresponds to that polygon, the second column is the index of the last row in covariate data that corresponds to that polygon.
Value
A matrix with two columns and one row for each polygon. The first column is the index of the first row in covariate data that corresponds to that polygon, the second column is the index of the last row in covariate data that corresponds to that polygon.
Examples
{
covs <- data.frame(area_id = c(1, 1, 1, 2, 2, 3, 3, 3, 3), response = c(3, 9, 5, 2, 3, 6, 7, 3, 5))
response <- data.frame(area_id = c(1, 2, 3), response = c(4, 7, 2), N = c(NA, NA, NA))
getStartendindex(covs, response, 'area_id')
}
Create the TMB model object for the disaggregation model
Description
make_model_object function takes a disag_data object created by prepare_data
and creates a TMB model object to be used in fitting.
Usage
make_model_object(
data,
priors = NULL,
family = "gaussian",
link = "identity",
field = TRUE,
iid = TRUE,
silent = TRUE
)
Arguments
data |
disag_data object returned by |
priors |
list of prior values |
family |
likelihood function: gaussian, binomial or poisson |
link |
link function: logit, log or identity |
field |
logical. Flag the spatial field on or off |
iid |
logical. Flag the iid effect on or off |
silent |
logical. Suppress verbose output. |
Details
The model definition
The disaggregation model make predictions at the pixel level:
link(pred_i) = \beta_0 + \beta X + GP(s_i) + u_i
And then aggregates these predictions to the polygon level using the weighted sum (via the aggregation raster, agg_i
):
cases_j = \sum_{i \epsilon j} pred_i \times agg_i
rate_j = \frac{\sum_{i \epsilon j} pred_i \times agg_i}{\sum_{i \epsilon j} agg_i}
The different likelihood correspond to slightly different models (y_j
is the response count data):
Gaussian: If
\sigma
is the dispersion of the pixel data,\sigma_j
is the dispersion of the polygon data, where\sigma_j = \sigma \sqrt{\sum agg_i^2} / \sum agg_i
dnorm(y_j/\sum agg_i, rate_j, \sigma_j)
- predicts incidence rate.
Binomial: For a survey in polygon j,
y_j
is the number positive andN_j
is the number tested.dbinom(y_j, N_j, rate_j)
- predicts prevalence rate.
Poisson:
dpois(y_j, cases_j)
- predicts incidence count.
Specify priors for the regression parameters, field and iid effect as a single named list. Hyperpriors for the field
are given as penalised complexity priors you specify \rho_{min}
and \rho_{prob}
for the range of the field
where P(\rho < \rho_{min}) = \rho_{prob}
, and \sigma_{min}
and \sigma_{prob}
for the variation of the field
where P(\sigma > \sigma_{min}) = \sigma_{prob}
. Also, specify pc priors for the iid effect.
The precise names and default values for these priors are:
priormean_intercept: 0
priorsd_intercept: 10.0
priormean_slope: 0.0
priorsd_slope: 0.5
prior_rho_min: A third the length of the diagonal of the bounding box.
prior_rho_prob: 0.1
prior_sigma_max: sd(response/mean(response))
prior_sigma_prob: 0.1
prior_iideffect_sd_max: 0.1
prior_iideffect_sd_prob: 0.01
The family and link arguments are used to specify the likelihood and link function respectively. The likelihood function can be one of gaussian, poisson or binomial. The link function can be one of logit, log or identity. These are specified as strings.
The field and iid effect can be turned on or off via the field and iid logical flags. Both are default TRUE.
The iterations argument specifies the maximum number of iterations the model can run for to find an optimal point.
The silent argument can be used to publish/supress verbose output. Default TRUE.
Value
The TMB model object returned by MakeADFun
.
Examples
## Not run:
polygons <- list()
n_polygon_per_side <- 10
n_polygons <- n_polygon_per_side * n_polygon_per_side
n_pixels_per_side <- n_polygon_per_side * 2
for(i in 1:n_polygons) {
row <- ceiling(i/n_polygon_per_side)
col <- ifelse(i %% n_polygon_per_side != 0, i %% n_polygon_per_side, n_polygon_per_side)
xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row
polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin),
c(ymax, ymax, ymin, ymin, ymax)))
}
polys <- lapply(polygons,sf::st_polygon)
N <- floor(runif(n_polygons, min = 1, max = 100))
response_df <- data.frame(area_id = 1:n_polygons, response = runif(n_polygons, min = 0, max = 1000))
spdf <- sf::st_sf(response_df, geometry = polys)
# Create raster stack
r <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side)
terra::ext(r) <- terra::ext(spdf)
r[] <- sapply(1:terra::ncell(r), function(x){
rnorm(1, ifelse(x %% n_pixels_per_side != 0, x %% n_pixels_per_side, n_pixels_per_side), 3))}
r2 <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side)
terra::ext(r2) <- terra::ext(spdf)
r2[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ceiling(x/n_pixels_per_side), 3))
cov_stack <- c(r, r2)
names(cov_stack) <- c('layer1', 'layer2')
test_data <- prepare_data(polygon_shapefile = spdf,
covariate_rasters = cov_stack)
result <- make_model_object(test_data)
## End(Not run)
Plot input data for disaggregation
Description
Plotting function for class disag_data (the input data for disaggregation).
Usage
## S3 method for class 'disag_data'
plot(x, which = c(1, 2, 3), ...)
Arguments
x |
Object of class disag_data to be plotted. |
which |
If a subset of plots is required, specify a subset of the numbers 1:3 |
... |
Further arguments to plot function. |
Details
Produces three plots: polygon response data, covariate rasters and INLA mesh.
Value
A list of three plots: the polygon plot (ggplot), covariate plot (spplot) and INLA mesh plot (ggplot)
Plot results of fitted model
Description
Plotting function for class disag_model (the result of the disaggregation fitting).
Usage
## S3 method for class 'disag_model'
plot(x, include_iid = FALSE, ...)
Arguments
x |
Object of class disag_model to be plotted. |
include_iid |
logical. Whether or not to include predictions that include the IID effect. |
... |
Further arguments to plot function. |
Details
Produces two plots: results of the fixed effects and in-sample observed vs predicted plot.
Value
A list of two ggplot plots: results of the fixed effects and an in-sample observed vs predicted plot
Plot mean and uncertainty predictions from the disaggregation model results
Description
Plotting function for class disag_prediction (the mean and uncertainty predictions of the disaggregation fitting).
Usage
## S3 method for class 'disag_prediction'
plot(x, ...)
Arguments
x |
Object of class disag_prediction to be plotted. |
... |
Further arguments to plot function. |
Details
Produces raster plots of the mean prediction, and the lower and upper confidence intervals.
Value
A list of plots of rasters from the prediction: mean prediction, lower CI and upper CI.
Convert results of the model ready for plotting
Description
Convert results of the model ready for plotting
Usage
plot_disag_model_data(x)
Arguments
x |
Object of class disag_model to be plotted. |
Value
A list that contains:
posteriors |
A data.frame of posteriors |
data |
A data.frame of observed and predicted data |
title |
The title of the observed vs. predicted plot |
Predict mean and uncertainty from the disaggregation model result
Description
predict.disag_model function takes a disag_model object created by disaggregation::disag_model and predicts mean and uncertainty maps.
Usage
## S3 method for class 'disag_model'
predict(
object,
new_data = NULL,
predict_iid = FALSE,
N = 100,
CI = 0.95,
newdata = NULL,
...
)
Arguments
object |
disag_model object returned by disag_model function. |
new_data |
If NULL, predictions are made using the data in model_output. If this is a raster stack or brick, predictions will be made over this data. |
predict_iid |
logical. If TRUE, any polygon iid effect from the model will be used in the prediction. Default FALSE. |
N |
Number of realisations. Default: 100. |
CI |
Credible interval to be calculated from the realisations. Default: 0.95. |
newdata |
Deprecated. |
... |
Further arguments passed to or from other methods. |
Details
To predict over a different spatial extent to that used in the model, a SpatRaster covering the region to make predictions over is passed to the argument new_data. If this is not given predictions are made over the data used in the fit.
The predict_iid logical flag should be set to TRUE if the results of the iid effect from the model are to be used in the prediction.
For the uncertainty calculations, the number of the realisations and the size of the credible interval to be calculated are given by the arguments N and CI respectively.
Value
An object of class disag_prediction which consists of a list of two objects:
mean_prediction |
List of:
|
uncertainty_prediction: |
List of:
|
Examples
## Not run:
predict(fit_result)
## End(Not run)
Function to predict mean from the model result
Description
predict_model function takes a disag_model object created by disaggregation::disag_model and predicts mean maps.
Usage
predict_model(
model_output,
new_data = NULL,
predict_iid = FALSE,
newdata = NULL
)
Arguments
model_output |
disag_model object returned by disag_model function |
new_data |
If NULL, predictions are made using the data in model_output. If this is a SpatRaster, predictions will be made over this data. Default NULL. |
predict_iid |
If TRUE, any polygon iid effect from the model will be used in the prediction. Default FALSE. |
newdata |
Deprecated. |
Details
Function returns rasters of the mean predictions as well as the covariate and field contributions to the linear predictor.
To predict over a different spatial extent to that used in the model, a SpatRaster covering the region to make predictions over is passed to the argument new_data. If this is not given predictions are made over the data used in the fit.
The predict_iid logical flag should be set to TRUE if the results of the iid effect from the model are to be used in the prediction.
Value
The mean prediction, which is a list of:
-
prediction Raster of mean predictions based.
-
field Raster of the field component of the linear predictor.
-
iid Raster of the iid component of the linear predictor.
-
covariates Raster of the covariate component of the linear predictor.
Examples
## Not run:
predict_model(result)
## End(Not run)
Function to predict uncertainty from the model result
Description
predict_uncertainty function takes a disag_model object created by disaggregation::disag_model and predicts upper and lower credible interval maps.
Usage
predict_uncertainty(
model_output,
new_data = NULL,
predict_iid = FALSE,
N = 100,
CI = 0.95,
newdata = NULL
)
Arguments
model_output |
disag_model object returned by disag_model function. |
new_data |
If NULL, predictions are made using the data in model_output. If this is a raster stack or brick, predictions will be made over this data. Default NULL. |
predict_iid |
If TRUE, any polygon iid effect from the model will be used in the prediction. Default FALSE. |
N |
number of realisations. Default: 100. |
CI |
credible interval. Default: 0.95. |
newdata |
Deprecated. |
Details
Function returns a SpatRaster of the realisations as well as the upper and lower credible interval rasters.
To predict over a different spatial extent to that used in the model, a SpatRaster covering the region to make predictions over is passed to the argument new_data. If this is not given predictions are made over the data used in the fit.
The predict_iid logical flag should be set to TRUE if the results of the iid effect from the model are to be used in the prediction.
The number of the realisations and the size of the credible interval to be calculated. are given by the arguments N and CI respectively.
Value
The uncertainty prediction, which is a list of:
-
realisations SpatRaster of realisations of predictions. Number of realisations defined by argument N.
-
predictions_ci SpatRaster of the upper and lower credible intervals. Defined by argument CI.
Examples
## Not run:
predict_uncertainty(result)
## End(Not run)
Prepare data for disaggregation modelling
Description
prepare_data function is used to extract all the data required for fitting a disaggregation model. Designed to be used in the disaggregation::disag_model function.
Usage
prepare_data(
polygon_shapefile,
covariate_rasters,
aggregation_raster = NULL,
id_var = "area_id",
response_var = "response",
sample_size_var = NULL,
mesh_args = NULL,
na_action = FALSE,
make_mesh = TRUE,
mesh.args = NULL,
na.action = NULL,
makeMesh = NULL,
ncores = NULL
)
Arguments
polygon_shapefile |
sf object containing at least three columns: one with the geometried, one with the id for the polygons (id_var) and one with the response count data (response_var); for binomial data, i.e survey data, it can also contain a sample size column (sample_size_var). |
covariate_rasters |
SpatRaster of covariate rasters to be used in the model. |
aggregation_raster |
SpatRaster to aggregate pixel level predictions to polygon level e.g. population to aggregate prevalence. If this is not supplied a uniform raster will be used. |
id_var |
Name of column in sf object with the polygon id. |
response_var |
Name of column in sf object with the response data. |
sample_size_var |
For survey data, name of column in sf object (if it exists) with the sample size data. |
mesh_args |
list of parameters that control the mesh structure with the same names as used by INLA. |
na_action |
logical. If TRUE, NAs in response will be removed, covariate NAs will be given the median value, aggregation NAs will be set to zero. Default FALSE (NAs in response or covariate data within the polygons will give errors). |
make_mesh |
logical. If TRUE, build INLA mesh, takes some time. Default TRUE. |
mesh.args |
Deprecated. |
na.action |
Deprecated. |
makeMesh |
Deprecated. |
ncores |
Deprecated. |
Details
Takes a sf object with the response data and a SpatRaster of covariates.
Extract the values of the covariates (as well as the aggregation raster, if given) at each pixel within the polygons (parallelExtract function). This is done in parallel and n.cores argument is used to set the number of cores to use for covariate extraction. This can be the number of covariates used in the model.
The aggregation raster defines how the pixels within each polygon are aggregated. The disaggregation model performs a weighted sum of the pixel prediction, weighted by the pixel values in the aggregation raster. For disease incidence rate you use the population raster to aggregate pixel incidence rate by summing the number of cases (rate weighted by population). If no aggregation raster is provided a uniform distribution is assumed, i.e. the pixel predictions are aggregated to polygon level by summing the pixel values.
Makes a matrix that contains the start and end pixel index for each polygon. Builds an INLA mesh to use for the spatial field (getStartendindex function).
The mesh.args argument allows you to supply a list of INLA mesh parameters to control the mesh used for the spatial field (build_mesh function).
The na.action flag is automatically off. If there are any NAs in the response or covariate data within the polygons the prepare_data method will error. Ideally the NAs in the data would be dealt with beforehand, however, setting na.action = TRUE will automatically deal with NAs. It removes any polygons that have NAs as a response, sets any aggregation pixels with NA to zero and sets covariate NAs pixels to the median value for the that covariate.
Value
A list is returned of class disag_data
.
The functions summary, print and plot can be used on disag_data
.
The list of class disag_data
contains:
x |
The sf object used as an input. |
covariate_rasters |
The SpatRaster used as an input. |
polygon_data |
A data frame with columns of area_id, response and N (sample size: all NAs unless using binomial data). Each row represents a polygon. |
covariate_data |
A data frame with columns of area_id, cell_id and one for each covariate in covariate_rasters. Each row represents a pixel in a polygon. |
aggregation_pixels |
An array with the value of the aggregation raster for each pixel in the same order as the rows of covariate_data. |
coords_for_fit |
A matrix with two columns of x, y coordinates of pixels within the polygons. Used to make the spatial field. |
coords_for_prediction |
A matrix with two columns of x, y coordinates of pixels in the whole Raster. Used to make predictions. |
start_end_index |
A matrix with two columns containing the start and end index of the pixels within each polygon. |
mesh |
A INLA mesh to be used for the spatial field of the disaggregation model. |
Examples
polygons <- list()
for(i in 1:100) {
row <- ceiling(i/10)
col <- ifelse(i %% 10 != 0, i %% 10, 10)
xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row
polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin),
c(ymax, ymax, ymin, ymin, ymax)))
}
polys <- lapply(polygons,sf::st_polygon)
response_df <- data.frame(area_id = 1:100, response = runif(100, min = 0, max = 10))
spdf <- sf::st_sf(response_df,geometry=polys)
r <- terra::rast(nrow=20,ncol=20)
terra::ext(r) <- terra::ext(spdf)
r[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ifelse(x %% 20 != 0, x %% 20, 20), 3))
r2 <- terra::rast(nrow=20,ncol=20)
terra::ext(r2) <- terra::ext(spdf)
r2[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ceiling(x/10), 3))
cov_rasters <- c(r, r2)
test_data <- prepare_data(polygon_shapefile = spdf,
covariate_rasters = cov_rasters)
Print function for disaggregation input data
Description
Function that prints the input data from the disaggregation model.
Usage
## S3 method for class 'disag_data'
print(x, ...)
Arguments
x |
Object returned from prepare_data. |
... |
Further arguments to print function. |
Details
Prints the number of polyons and pixels, the number of pixels in the largest and smallest polygons and summaries of the covariates.
Print function for disaggregation fit result.
Description
Function that prints the result of the fit from the disaggregation model.
Usage
## S3 method for class 'disag_model'
print(x, ...)
Arguments
x |
Object returned from disag_model. |
... |
Further arguments to print function. |
Details
Prints the negative log likelihood, model parameters and calculates metrics from in-sample performance.
Print function for disaggregation prediction
Description
Function that prints the prediction from the disaggregation model.
Usage
## S3 method for class 'disag_prediction'
print(x, ...)
Arguments
x |
Object returned from predict.disag_model. |
... |
Further arguments to print function. |
Details
Prints the number of polyons and pixels, the number of pixels in the largest and smallest polygons and summaries of the covariates.
Summary function for disaggregation input data
Description
Function that summarizes the input data from the disaggregation model.
Usage
## S3 method for class 'disag_data'
summary(object, ...)
Arguments
object |
Object returned from prepare_data. |
... |
Further arguments to summary function. |
Details
Prints the number of polyons and pixels, the number of pixels in the largest and smallest polygons and summaries of the covariates.
Value
A list of the number of polyons, the number of covariates and summaries of the covariates.
Summary function for disaggregation fit result
Description
Function that summarises the result of the fit from the disaggregation model.
Usage
## S3 method for class 'disag_model'
summary(object, ...)
Arguments
object |
Object returned from disag_model. |
... |
Further arguments to summary function. |
Details
Prints the negative log likelihood, model parameters and calculates metrics from in-sample performance.
Value
A list of the model parameters, negative log likelihood and metrics from in-sample performance.
Summary function for disaggregation prediction
Description
Function that summarizes the prediction from the disaggregation model.
Usage
## S3 method for class 'disag_prediction'
summary(object, ...)
Arguments
object |
Object returned from predict.disag_model |
... |
Further arguments to summary function. |
Details
Prints the number of polyons and pixels, the number of pixels in the largest and smallest polygons and summaries of the covariates.
Value
A list of the number of polyons, the number of covariates and summaries of the covariates.