Title: | Genetic and Isotopic Assignment Accounting for Habitat Suitability |
Description: | Tools for using genetic markers, stable isotope data, and habitat suitability data to calculate posterior probabilities of breeding origin of migrating birds. |
Version: | 0.0.5 |
Maintainer: | Eric C. Anderson <eric.anderson@noaa.gov> |
Depends: | R (≥ 3.5.0) |
Imports: | dplyr, geosphere, magrittr, raster, rlang, sp |
Suggests: | knitr, rmarkdown, ggplot2 |
License: | CC0 |
LazyData: | TRUE |
RoxygenNote: | 7.2.1 |
Encoding: | UTF-8 |
NeedsCompilation: | no |
Packaged: | 2023-04-27 16:20:55 UTC; eriq |
Author: | Eric C. Anderson [cre, aut] |
Repository: | CRAN |
Date/Publication: | 2023-04-27 19:50:02 UTC |
Pipe operator
Description
Pipe operator
Usage
lhs %>% rhs
Posterior probs of genetic region origin from Leave-one-out cross validation for breeding WIWAs
Description
A data frame of the same birds (roughly) that appear in breeding_wiwa_isotopes
. A long
format data frame with 2,358 rows and 5 columns
Usage
breeding_wiwa_genetic_posteriors
Format
A tibble with 2,358 rows and 5 variables. The variables are:
- ID
unique identifier for each bird
- Short_Name
another id for the bird
- NumberOfLoci
Number of loci successfully typed
- region
one of the genetic regions
- posterior
the posterior prob of originating from that region
Source
Kristen Ruegg, Eric Anderson, Thomas Smith
Isotope values, latitude, longitude and more data from 357 breeding Wilson's warblers
Description
A data frame containing hydrogen isotope values, lat, long, and IDs and some other columns of data for birds sampled on the breeding grounds. Notice that the latitude column is named "lat" and the longitude column is named "long". Those names are both all lowercase. That is the way we roll here. Make sure that you use "lat" and "long" instead of "Lat" and "Long".
Usage
breeding_wiwa_isotopes
Format
A tibble with 357 rows and 15 variables. The relevant variables for analyses here are:
- ID
unique identifier for each bird
- Isotope.Value
hydrogen isotope ratios measured in the bird's feather
- lat
latitude of the bird's breeding/sampling location
- long
latitude of the bird's breeding/sampling location
Source
Kristen Ruegg, Jeff Kelly, Thomas Smith
combine genetics, isotopes, and habitat raster with exponents as given
Description
This just multiplies the rasters together, each raised to the appropriate exponent, normalizes and returns the result
Usage
comboize(Mgen, Miso, Mhab, beta_gen, beta_iso, beta_hab)
Arguments
Mgen |
the genetic posteriors rasterStack. Must be a rasterStack |
Miso |
the isotope posteriors rasterStack. |
Mhab |
a single layer raster with the habitat suitabiilty measure as a normalized probability surface. |
beta_gen |
the exponent to raise the habitat raster to |
beta_iso |
the exponent to raise the isotope raster to |
beta_hab |
the exponent to raise the genetic raster to |
Examples
# first, run through the example for isotope_posterior_probs() to get
# the rasters for two migrant birds. This gives us the list "birds2"
example(isotope_posterior_probs)
# extract the posterior probs rasters from that output into a raster stack
miso <- lapply(birds2$regular, function(x) x$posterior_probs) %>%
raster::stack()
# see the names of the birds we are dealing with:
names(miso)
# get the genetic posteriors for those two birds
mig_gen2 <- migrant_wiwa_genetic_posteriors %>%
dplyr::filter(ID %in% c(names(miso)))
# make genetic posterior rasters for those two birds, make sure they are
# sorted in the same order as miso, and make a raster stack of it
mgen <- genetic_posteriors2rasters(G = mig_gen2, R = genetic_regions)[names(miso)] %>%
raster::stack()
# make a normalized prior from habitat quality that is zeros everywhere
# outside of the "known" range.
tmp <- wiwa_habitat_unclipped * wiwa_breed
mhab <- tmp / raster::cellStats(tmp, sum)
# combine genetics, isotopes and habitat with exponents of 1 on each
mcombo <- comboize(mgen, miso, mhab, 1, 1, 1)
prepare fortified output for multipanel plot
Description
This takes Mgen, Miso, and Mhab for a single bird and, if available, the true breeding location. Then it computes the combo-ized raster at all the requested levels of the exponents, and creates a fortified data frame of the results suitable for plotting in ggplot
Usage
comboize_and_fortify(
mgen,
miso,
mhab,
gen_beta_levels = 1,
iso_beta_levels = c(1),
hab_beta_levels = c(1)
)
Arguments
mgen |
genetics posterior raster |
miso |
isotope posterior raster |
mhab |
habitat suitability raster |
gen_beta_levels |
vector of the desired values of gen_beta |
iso_beta_levels |
vector of the desired values of iso_beta |
hab_beta_levels |
vector of the desired values of hab_beta |
Examples
# run through the example for comboize to get the variables
# mgen, miso, and mhab that we will use.
example(comboize)
# then run that on the first bird to get a data frame
# that you can use with ggplot
ff <- comboize_and_fortify(mgen[[1]], miso[[1]], mhab)
# this can be plotted with ggplot2
## Not run:
library(ggplot2)
wmap <- get_wrld_simpl()
ggplot(mapping = aes(x=long, y = lat)) +
coord_fixed(1.3, xlim = c(-170, -50), ylim = c(33, 70)) +
geom_polygon(data = wmap, aes(group = group), fill = NA, color = "black", size = .05) +
geom_raster(data = ff, mapping = aes(fill = prob), interpolate = TRUE) +
scale_fill_gradientn(colours = c("#EBEBEB", rainbow(7)), na.value = NA) +
theme_bw() +
facet_wrap( ~ beta_vals, ncol = 2) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())
## End(Not run)
Output of isotope_posterior_probs
for two migrant birds.
Description
Because it takes too long to generate this output for future examples, we
just store it as a data object to use in examples. See the example in
isotope_posterior_probs
to see what this is.
Usage
example_isotope_posteriors
Format
An object of class list
of length 2.
Source
Ruegg et al 2014
Add the isomap prediction and sd in columns attached to the feather isotope data frame
Description
Rasterizes the isomap predictions and standard deviation (using isomap2raster) and then extracts the values associated with each location from the raster and returns the full data frame with those values joined on in columns named iso_pred and iso_sd. It overwrites those columns with a warning if either of those columns already exists in the data.
Usage
extract_isopredictions(isoscape, birds, pred = "predkrig", sd = "stdkrig")
Arguments
isoscape |
the data frame of prediction.txt from ISOMAP. The latitude column must be named "lat" and the longitude column must be named "long". |
birds |
data frame of the individual isotope values for the birds/feathers. Should be
something like |
pred |
name of the column holding the prediction (like "predkrig") in the isoscape data frame |
sd |
name of the column holding the standard deviation (like "stdkrig") in the isoscape data frame |
Examples
# Using the provided data from breeding Wilson's warblers and the provided
# predictions from isomap_job54152:
x <- extract_isopredictions(isoscape = isomap_job54152_prediction,
birds = breeding_wiwa_isotopes,
pred = "predkrig",
sd = "stdkrig")
gaiah: Genetic and Isotopic Assignment Accounting for Habitat Suitability
Description
Tools for using genetic markers, stable isotope data, and habitat suitability data to calculate posterior probabilities of breeding origin of migrating birds.
Details
There is not a tutorial within the package, currently. The best place to find an example of how to use the functions in this package is at the GitHub repository: https://github.com/eriqande/gaiah-wiwa. Go ahead and read the README there. It should provide you with everything you need to get up and running with the gaiah package.
Finally, note that the development version of gaiah is available at https://github.com/eriqande/gaiah.
Convert posteriors to particular genetic reporting groups into raster
Description
When birds have been assigned to breeding groups or "general areas" as in Ruegg et al. 2014 then the posterior probabilty with which the birds were assigned to the groups needs to be "smeared out" in a raster over the spatial extent of the groups.
Usage
genetic_posteriors2rasters(G, R)
Arguments
G |
long format data frame like breeding_wiwa_genetic_posteriors. Has to have columns of ID, region, and posterior |
R |
a RasterStack like "genetic_regions". The sum of these should be the total known range. The names of the regions in R must be the same as the entries in the "region" column in G. |
Value
This returns a list of rasters for each bird in G. The entries in the raster are the posterior probability of being from that cell. This assumes that birds are equally likely to come from any cell within the group's region. It doesn't return a rasterStack because you can't subset rasterStacks to change orders, etc., and it mangles names.
Examples
library(raster) # needed to deal with "genetic_regions" variable
# get a small subset of individuals so it doesn't take too long
data(breeding_wiwa_genetic_posteriors)
data(genetic_regions)
BW <- breeding_wiwa_genetic_posteriors %>%
dplyr::filter(Short_Name %in% c("eNBFR01", "wABCA05", "wORHA21"))
# run the function on those
GPRs <- genetic_posteriors2rasters(BW, genetic_regions)
RasterStack showing the 6 genetic regions that Wilson's warblers may be assigned to
Description
The sum over layers gives the same as wiwa_breed
Usage
genetic_regions
Format
RasterStack with 6 layers. Each contains 1's in the genetic region and 0's elsewhere.
The sum of these layers is the raster wiwa_breed
.
- class
RasterStack
- dimensions
80, 228, 18240, 6 (nrow, ncol, ncell, nlayers)
- resolution
0.5, 0.5 (x, y)
- extent
-168.1, -54.1, 31.2, 71.2 (xmin, xmax, ymin, ymax)
- coord. ref.
+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs +towgs84=0,0,0
- data source
in memory
- names
CalSierra, Basin.Rockies, Eastern, AK.EastBC.AB, Wa.To.NorCalCoast, CentCalCoast
Source
Ruegg et al 2014
return the wrld_simpl data set from maptools
Description
I define this as a function so that we don't have to attach maptools, but we can just have it in the imports. Couldn't figure out how to do it otherwise.
Usage
get_wrld_simpl()
Examples
ws <- get_wrld_simpl()
head(ws)
## Not run: plot(ws)
return a raster of great circle distances (in km)
Description
Given an input raster R, this returns a raster of the same dimension where every cell is the great circle distance between lat, and long, and the center of every cell in R.
Usage
great_circle_raster(R, lat, long)
Arguments
R |
a raster |
lat |
a latitude value (must be of length 1) |
long |
a longitude value (must be of length 1) |
Examples
# We compute the great circle distance between the lat/long of my office in
# California, to every cell in the raster denoting the breeding habitat
# of Wilson's warbler:
gcr <- great_circle_raster(wiwa_breed, lat = 36.951564, long = -122.065116)
# plot that if you want
## Not run:
plot(gcr)
lines(get_wrld_simpl())
## End(Not run)
Group bird isotope data by locations
Description
This takes as input a data frame of feather isotope data that also has the
isoscape predictions attached to it, just like the data frame returned by
extract_isopredictions
. The data frame must have a column
that gives the general location by which you will group birds for the
rescaling function. The isoscape predictions by default should be in columns named
iso_pred
for the actual prediction, and iso_sd
for the standard deviation,
as produced by extract_isopredictions
, but those are user configurable,
as well.
Usage
group_birds_by_location(
D,
feather_isotope_col,
location_col,
iso_pred_col = "iso_pred",
iso_sd_col = "iso_sd"
)
Arguments
D |
the data frame of feather isotope data with the isoscape predictions extracted for each location, as well, and a column giving general grouping locations for the birds. |
feather_isotope_col |
the string name of the column holding the feather isotope data. |
location_col |
the string name of the column holding the locations to be used for grouping. |
iso_pred_col |
name of the column holding the predicted values from the isoscape. Default
is |
iso_sd_col |
name of the column holding the standard deviations of the predicted values
from the isoscape. Default is |
Details
This function returns a data frame with columns for the mean and SD of feather/bird values,
(meanH
and sdH
) and the mean predicted isotope value and the mean sd of the predicted
isotope values (meaniso
and sdiso
) for all the samples within each location. It
also returns the Location column itself and a column cnt
that gives the number of bird/tissue
samples from each location.
This function throws an error if any of the locations has only 1 sample. If that is the case, you may consider merging that sample with another location (or dropping it?).
Examples
# first run the example for extract_isopredictions to get the variable "x"
example("extract_isopredictions")
# If this were run it gives an error because there is only 1 bird at the
# location "Charlevoix"
## Not run:
group_birds_by_location(x, feather_isotope_col = "Isotope.Value", location_col = "Location")
## End(Not run)
# remove that one bird at Charlevoix and re-run
y <- x %>%
dplyr::filter(Location != "Charlevoix")
# then group birds by location
gbl <- group_birds_by_location(D = y,
feather_isotope_col = "Isotope.Value",
location_col = "Location")
finds the cumulative posterior probability of each cell for HPD calculation
Description
This function is not exported currently, but is used in min_hpd_inclusion_area
Usage
hpd_cumul(B)
Arguments
B |
a raster of posterior probs |
convert columns of an ISOMAP isoscape to a raster object
Description
Just simple conversion, but nice to have this in a brief function
Usage
isomap2raster(isoscape, column, Proj = raster::projection(get_wrld_simpl()))
Arguments
isoscape |
the data frame of prediction.txt from ISOMAP. The latitude column must be named "lat" and the longitude column must be named "long". |
column |
the name of the column to turn into a raster object. This should be a quoted string, like "predkrig". |
Proj |
the desired projection. By default it is raster::projection(get_wrld_simpl()), i.e. the same projection as the wrld_simpl map. |
Examples
isorast <- isomap2raster(isomap_job54152_prediction, "predreg")
isorast
Predicted isotope values from ISOMAP
Description
A data frame containing predicted hydrogen isotope values, lat, long, and IDs and some other columns of data prections made by ISOMAP
Usage
isomap_job54152_prediction
Format
A tibble with 10,786 rows and 12 variables. The relevant variables for analyses here are:
- lat
latitude of the predicted location
- long
longitude of the predicted location
- predreg
Fill in
- stdreg
Fill in
- predkrig
Fill in
- stdkrig
Fill in
Source
Kristina Paxton and ISOMAP (http://isomap.rcac.purdue.edu:8080/gridsphere/gridsphere)
compute posterior probabilities of origin given isotope values
Description
This function automates the whole process described in the appendix of Vander Zanden et al. (2014) for computing the posterior probability of origin of an individual (or group of individuals) given its stable isotope values, and those of a set of reference individuals, and an ISOMAP prediction of isotope values across a landscape.
Usage
isotope_posterior_probs(
isoscape,
ref_birds,
assign_birds = NULL,
isoscape_pred_column = "predkrig",
isoscape_sd_column = "stdkrig",
self_assign = FALSE
)
Arguments
isoscape |
the data frame read in from "prediction.txt" from ISOMAP. The
latitude column must be named "lat" and the longitude column must be named "long". You have to choose
which columns to use with the parameters |
ref_birds |
a data frame of reference birds. This should have (at least) columns of "ID" (for unique identifiers for each bird), "lat", "long", "Isotope.Value" and "Location". The "Location" column will be used to group samples for the Vander Zanden Rescaling. |
assign_birds |
A data frame of birds whose breeding origins are to be inferred. These must have at a minimum the column "ID" (for uniqe identifiers for the birds) and the column "Isotope.Value". This can be left NULL if there are no birds of unknown origin to assign (for example if you are performing cross-validation on the ref_birds). |
isoscape_pred_column |
the name of the column in |
isoscape_sd_column |
the name of the column in |
self_assign |
if TRUE, then the birds in |
Details
For details see:
Vander Zanden HB, Wunder MB, Hobson KA, Van Wilgenburg SL, Wassenaar LI, Welker JM, Bowen GJ (2014) Contrasting assignment of migratory organisms to geographic origins using long-term versus year-specific precipitation isotope maps. Methods in Ecology and Evolution, 5, 891-900.
And the re-explanation of that method in Ruegg et al. (2017).
Examples
# obtain posterior probability rasters for the first 2 birds in the migrant_wiwa_isotopes
# data set. This takes about 15 seconds on my laptop (most of that time is preparatory---once
# that is done, each bird goes much faster). So we don't run it here.
## Not run:
birds2 <- isotope_posterior_probs(isoscape = isomap_job54152_prediction,
ref_birds = breeding_wiwa_isotopes,
assign_birds = migrant_wiwa_isotopes[1:2,]
)
## End(Not run)
# However, you can load the results as a saved data object to see what they look like:
birds2 <- example_isotope_posteriors
# Since the ref_birds above were separate from the migrant birds, no leave-one-out was
# performed. Hence birds2$loo_results is NULL, and all the results are in
# birds2$regular.
# Look at the names of the resulting output for the first bird:
names(birds2$regular[[1]])
names(birds2$regular[[1]]$assignment_parameters)
# If you want to do self-assignment for a whole bunch of reference birds, it takes much longer.
# It looks like this:
## Not run:
self_ass <- isotope_posterior_probs(isoscape = isomap_job54152_prediction,
ref_birds = breeding_wiwa_isotopes,
self_assign = TRUE)
## End(Not run)
Posterior probs of genetic region of origin for 926 WIWAs sampled during migration
Description
A long format data frame with 5,556 rows and 6 columns
Usage
migrant_wiwa_genetic_posteriors
Format
A tibble with 5,556 rows and 6 columns. The relevant variables for analyses here are:
- ID
unique identifier for each bird
- Short_Name
same id for the bird
- Collection_Date
The date the bird was sampled.
- NumberOfLoci
Number of loci successfully typed
- region
one of the genetic regions
- posterior
the posterior prob of originating from that region
Source
Kristina Paxton, Kristen Ruegg, Eric Anderson, Thomas Smith
Isotope values and meta data for 688 migrating Wilson's Warblers
Description
A data frame containing hydrogen isotope values, lat, long, and IDs and some other
columns of data for birds sampled during migration from Arizona. 604 of the individuals
in this data set also have values in migrant_wiwa_genetic_posteriors
.
Usage
migrant_wiwa_isotopes
Format
A tibble with 688 rows and 14 variables. The relevant variables for analyses here are:
- ID
unique identifier for each bird
- Isotope.Value
hydrogen isotope ratios measured in the bird's feather
Source
Kristina Paxton
compute the minimum hpd inclusion area for all the birds in a rasterStack
Description
This is a convenient wrapper that lets you pass in a data frame of bird names with lats and longs and a rasterStack or rasterBrick and it returns the min hpd inclusion area for each bird in a tbl_df-ed data frame.
Usage
min_hpd_inc_area_df(birds, R)
Arguments
birds |
a data frame that minimally has columns |
R |
a rasterStack or rasterBrick of posteriors that each bird originated from each of the cells. The names of the layers must correspond to the ID column in kbirds. |
Value
This returns a list with three components:
area the minimum hpd inclusion area as a data frame with columns "Short_Name" and "min_hpd_area"
missingFromRastercharacter vector of the names of birds found in birds$Short_Name but not found in the names of R
missingFromDF character vector of names in R that are not found in birds$Short_Name.
compute the smallest area of an HPD credible set that includes the true bird location
Description
This function computes the area of the HPD credible set that is as small as possible but just barely includes the cell with the true location (or, if the true location is not actually inluded in the range map, we use the closest cell on the range map to the true location). This operates on a single rasterLayer. To operate on a rasterBrick of posteriors and a data frame of lat-longs, use 'min_hpd_inc_area_df()'.
Usage
min_hpd_inclusion_area(M, lat, long)
Arguments
M |
the posterior surface matrix |
lat |
the true latitude |
long |
the true longitude |
internal function for isotope machinations
Description
does these things: 1. removes birds in Locations that have only 1 bird in them (printing a warning message while it does so) 2. does the vza rescaling process and returns the output that is needed for the vza_assign function. (basically the ouput of the mean and var rasters). Should return a-bar and b-bar too — all of that in a big list. Then with that output I can pump it all into vza_assign for a particular (left-out) ref_bird or for an assign bird. I should return the fitted model for each one.
Usage
private_rescale(birds, pred, std)
Arguments
birds |
a data frame like "ref_birds" in |
pred |
The raster of isoscape predictions |
std |
The raster of isoscape standard deviations |
assign posterior probability of origin for a bird in each cell in the raster
Description
This is a rewrite of the function assignment
from the Vander
Zanden appendix code.
Usage
vza_assign(rescale_mean, rescale_var, precip_sd, sd_indiv, bird_iso)
Arguments
rescale_mean |
the tissue specific mean raster, such as the mean.raster
component of the output of |
rescale_var |
tissue specific raster of standard deviations, such as the var.raster
component of the output of |
precip_sd |
SD raster associated with the IsoMAP output. This is the precip component of the variance term. |
sd_indiv |
the individual component of the variance term. This is a value to be determined by the user. The standard approach is to use the mean of the SDs observed among individuals at all of the calibration sites. |
bird_iso |
a single value giving the isotope ratio found in the individual's feather. |
Details
This is a fairly low-level function. It returns a raster of posterior probs (they are scaled to sum to one over all cells).
calculate a raster of mean and variance of expected tissue isotopes from precip data and resampled regressions
Description
This is a rewrite of the function raster.conversion
from the Vander Zanden
appendix. They expressed things in terms of the standard deviations, but they need to
be turned into variances, anyway, so that is what we've done here. Following the notation
of the paper on Wilson's warbler, this function computes $tildeT^(mu)$ (returned as
list component mean.raster
) and $R^(sigma^2)$ (returned as list component
var.raster
)
Usage
vza_mean_and_var_rasters(iso_raster, si)
Arguments
iso_raster |
the raster of isotope precipitation values, for example, like that
produced by |
si |
slopes and intercepts from the resampled regressions. This is a data frame
with columns named "slopes" and "intercepts" like that returned by |
Parametric bootstrap for rescaling a la the vander zanden appendix
Description
This is a vectorized and much more efficient implementation
of the original rescale
function from the Vander Zanden appendix.
It takes the output of group_birds_by_location
directly
and does the parametric bootstrapping for vza_rescale_reps samples.
Usage
vza_rescale(SBL, vza_rescale_reps = 1000)
Arguments
SBL |
the data frame that summarizes the isotope feather data and
isoscape predictions at each location. This must have columns of |
vza_rescale_reps |
Number of simulated regressions to do. Default is 1000. |
Value
Returns a matrix with vza_rescale_reps rows. Column 1 is "intercepts" and column two is "slopes"
a raster of the breeding range of Wilson's warbler
Description
a raster of the breeding range of Wilson's warbler
Usage
wiwa_breed
Format
This a rasterized version of the breeding range of Wilson's warbler It contains 1's in the breeding range and 0's elsewhere.
- class
RasterLayer
- dimensions
80, 228, 18240 (nrow, ncol, ncell)
- resolution
0.5, 0.5 (x, y)
- extent
-168.1, -54.1, 31.2, 71.2 (xmin, xmax, ymin, ymax)
- coord. ref.
+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs +towgs84=0,0,0
- data source
in memory
- names
layer
- values
0, 1 (min, max)
Source
The rasters were generated from shapefiles provided to us by BirdLife International. (BirdLife International and NatureServe (2012) Bird species distribution maps of the world. BirdLife International, Cambridge, UK and NatureServe, Arlington, USA). Persons interested in the range map should contact BirdLife International http://www.birdlife.org/ or NatureServe http://www.natureserve.org/ directly.
RasterLayer showing the MaxEnt habitat suitability model unclipped by the known breeding range
Description
- class
RasterLayer
- dimensions
80, 228, 18240, 6 (nrow, ncol, ncell, nlayers)
- resolution
0.5, 0.5 (x, y)
- extent
-168.1, -54.1, 31.2, 71.2 (xmin, xmax, ymin, ymax)
- coord. ref.
+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs +towgs84=0,0,0
- data source
in memory
- values
0, 0.001093349 (min, max)
Usage
wiwa_habitat_unclipped
Format
An object of class RasterLayer
of dimension 80 x 228 x 1.
Source
Ryan Harrigan
Simple map of the world
Description
This is wrld_simpl from the maptools package. It was all that I used from the maptools package which is going to be archived at the end of 2023. So, I just saved wrld_simpl as a data object in this package.
Usage
wrld_simpl
Format
a SpatialPolygonsDataFrame
Source
Got this from the old maptools package. See ?maptools::wrld_simpl