Type: | Package |
Title: | Time-Based Rolling Functions |
Version: | 0.1.6 |
Description: | Provides rolling statistical functions based on date and time windows instead of n-lagged observations. |
URL: | https://mps9506.github.io/tbrf/ |
BugReports: | https://github.com/mps9506/tbrf/issues |
License: | GPL-3 | file LICENSE |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.1 |
Depends: | R (≥ 2.10) |
Imports: | boot, dplyr, lubridate, purrr, rlang, tibble, tidyr |
Suggests: | spelling, covr, ggalt, ggplot2, testthat, knitr, rmarkdown |
VignetteBuilder: | knitr |
Language: | en-US |
Config/Needs/website: | mps9506/mpsTemplates |
NeedsCompilation: | no |
Packaged: | 2025-04-02 15:12:30 UTC; michael.schramm |
Author: | Michael Schramm |
Maintainer: | Michael Schramm <mpschramm@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-04-02 16:00:05 UTC |
Dissolved oxygen measurements from the Tres Palacios River
Description
Data from the Texas Commission on Environmental Quality Surface Water Quality Monitoring Information System. The Average_DO field is the mean of dissolved oxygen concentrations (mg/L) measured at a field site on that day. The Min_DO field is the minimum dissolved oxygen concentration measured at that site on that day.
Usage
data(Dissolved_Oxygen)
Format
A data frame with 236 rows and 6 variables:
- Station_ID
unique water quality monitoring station identifier
- Date
sampling date in yyyy-mm-dd format
- Param_Code
unique parameter code
- Param_Desc
parameter description with units
- Average_DO
mean of dissolved oxygen measurement, in mg/L
- Min_DO
minimum of dissolved oxygen measurement, in mg/L
Source
https://www80.tceq.texas.gov/SwqmisPublic/public/default.htm
Confidence Intervals for Binomial Probabilities
Description
An implementation of the binconf function in Frank Harrell's Hmisc package. Produces 1-alpha confidence intervals for binomial probabilities.
Usage
binom_ci(
  x,
  n,
  alpha = 0.05,
  method = c("wilson", "exact", "asymptotic"),
  return.df = FALSE
)
Arguments
x |
vector containing the number of "successes" for binomial variates. |
n |
vector containing the numbers of corresponding observations. |
alpha |
probability of a type I error, so confidence coefficient = 1-alpha. |
method |
character string specifying which method to use. The "exact" method uses the F distribution to compute exact (based on the binomial cdf) intervals; the "wilson" interval is score-test-based; and the "asymptotic" is the text-book, asymptotic normal interval. Following Agresti and Coull, the Wilson interval is to be preferred and so is the default. |
return.df |
logical flag to indicate that a data frame rather than a matrix be returned. |
Author(s)
Frank Harrell, modified by Michael Schramm
References
A. Agresti and B.A. Coull, Approximate is better than "exact" for interval estimation of binomial proportions, American Statistician, 52:119–126, 1998.
R.G. Newcombe, Logit confidence intervals and the inverse sinh transformation, American Statistician, 55:200–202, 2001.
L.D. Brown, T.T. Cai and A. DasGupta, Interval estimation for a binomial proportion (with discussion), Statistical Science, 16:101–133, 2001.
Examples
binom_ci(46,50,method="wilson")
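The "wilson" method described above is the score interval. As a rough illustration only, and not the package's own code, the interval can be sketched in base R. The helper name wilson_sketch is hypothetical; the result is cross-checked against stats::prop.test() without continuity correction, which reports the same score interval.

```r
# Hypothetical helper: Wilson score interval for x successes in n trials.
wilson_sketch <- function(x, n, alpha = 0.05) {
  p_hat <- x / n
  z <- qnorm(1 - alpha / 2)            # normal quantile for the chosen alpha
  denom <- 1 + z^2 / n
  centre <- (p_hat + z^2 / (2 * n)) / denom
  half <- z * sqrt(p_hat * (1 - p_hat) / n + z^2 / (4 * n^2)) / denom
  c(estimate = p_hat, lower = centre - half, upper = centre + half)
}

ci <- wilson_sketch(46, 50)
# Base R reference: prop.test() without continuity correction.
ref <- prop.test(46, 50, correct = FALSE)$conf.int
print(round(ci, 4))
print(round(as.numeric(ref), 4))
```

Unlike the textbook asymptotic interval, the score interval is not centred on the raw proportion: its centre is pulled toward 0.5, which is why it behaves better for proportions near 0 or 1.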
Calculates the Geometric Mean
Description
Originally from Paul McMurdie, Ben Bolker, and Gregor on Stack Overflow: https://stackoverflow.com/questions/2602583/geometric-mean-is-there-a-built-in
Usage
gm_mean(x, na.rm = TRUE, zero.propagate = FALSE)
Arguments
x |
vector of numeric values |
na.rm |
logical TRUE/FALSE remove NA values |
zero.propagate |
logical TRUE/FALSE. Allows the optional propagation of zeros. |
Value
the geometric mean of the vector
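For orientation, the geometric mean is usually computed on the log scale. The sketch below is a minimal base R version under the assumption of strictly positive input; the name geo_mean_sketch is hypothetical, and the sketch does not reproduce the zero.propagate handling of gm_mean.

```r
# Hypothetical helper: geometric mean of strictly positive values via logs.
geo_mean_sketch <- function(x, na.rm = TRUE) {
  exp(mean(log(x), na.rm = na.rm))
}

x <- c(1, 10, 100)
geo_mean_sketch(x)        # log-based form
prod(x)^(1 / length(x))   # direct definition, for comparison
```

Working with logs avoids the overflow that the direct product can hit on long vectors.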
Returns the Geomean and CI
Description
Generates Geometric mean and confidence intervals using bootstrap.
Usage
gm_mean_ci(
  window,
  conf = 0.95,
  na.rm = TRUE,
  type = "basic",
  R = 1000,
  parallel = "no",
  ncpus = getOption("boot.ncpus", 1L),
  cl = NULL,
  zero.propagate = FALSE
)
Arguments
window |
vector of data values |
conf |
confidence level of the required interval. |
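na.rm |
logical. Should NA values be removed? |
type |
character string giving the type of bootstrap confidence interval; the default is "basic". |
R |
the number of bootstrap replicates. |
parallel |
the type of parallel operation to be used (if any); the default is "no". |
ncpus |
integer: number of processes to be used in parallel operation. |
cl |
optional parallel or snow cluster for use in parallel operation. |
zero.propagate |
logical. Allows the optional propagation of zeros. |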
Value
named list with geometric mean and (optionally) specified confidence interval
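The arguments R, parallel, ncpus, and cl mirror those of boot::boot(), and conf and type mirror boot::boot.ci(). A bootstrap interval of this kind can be sketched directly with the boot package as follows. This is an illustration under stated assumptions, not the package's implementation: the data are simulated, and the names geo_mean_stat, geomean, lwr_ci, and upr_ci are hypothetical.

```r
library(boot)

set.seed(42)                                # make the resampling reproducible
x <- rlnorm(50, meanlog = 1, sdlog = 0.5)   # made-up positive data

# Statistic in the form boot::boot() expects: the data plus resampled indices.
geo_mean_stat <- function(data, idx) exp(mean(log(data[idx])))

b <- boot(x, statistic = geo_mean_stat, R = 1000)
ci <- boot.ci(b, conf = 0.95, type = "basic")

# For type = "basic", the last two columns hold the interval endpoints.
result <- list(geomean = b$t0, lwr_ci = ci$basic[4], upr_ci = ci$basic[5])
str(result)
```

Swapping the statistic for mean or median gives the analogous intervals for the other summary functions.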
List NA
Description
function to return tibble with NAs as specified
Usage
list_NA(x)
Arguments
x |
named vector |
Value
empty tibble
Returns the mean and CI
Description
Generates mean and confidence intervals using bootstrap.
Usage
mean_ci(
  window,
  conf = 0.95,
  na.rm = TRUE,
  type = "basic",
  R = 1000,
  parallel = "no",
  ncpus = getOption("boot.ncpus", 1L),
  cl = NULL
)
Arguments
window |
vector of data values |
conf |
confidence level of the required interval. |
na.rm |
logical. Should NA values be removed? |
type |
character string giving the type of bootstrap confidence interval; the default is "basic". |
R |
the number of bootstrap replicates. |
parallel |
the type of parallel operation to be used (if any); the default is "no". |
ncpus |
integer: number of processes to be used in parallel operation. |
cl |
optional parallel or snow cluster for use in parallel operation. |
Value
named list with mean and (optionally) specified confidence interval
Returns the median and CI
Description
Generates median and confidence intervals using bootstrap.
Usage
median_ci(
  window,
  conf = 0.95,
  na.rm = TRUE,
  type = "basic",
  R = 1000,
  parallel = "no",
  ncpus = getOption("boot.ncpus", 1L),
  cl = NULL
)
Arguments
window |
vector of data values |
conf |
confidence level of the required interval. |
na.rm |
logical. Should NA values be removed? |
type |
character string giving the type of bootstrap confidence interval; the default is "basic". |
R |
the number of bootstrap replicates. |
parallel |
the type of parallel operation to be used (if any); the default is "no". |
ncpus |
integer: number of processes to be used in parallel operation. |
cl |
optional parallel or snow cluster for use in parallel operation. |
Value
named list with median and (optionally) specified confidence interval
Open Window
Description
calculates the period at each row from the row of interest
Usage
open_window(x, tcolumn, unit = "years", n, i)
Arguments
x |
dataframe |
tcolumn |
time column |
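unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window in the selected units. |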
i |
row number |
Value
vector
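The idea of a time-based window can be sketched in base R: measure the elapsed time from every row to the row of interest, then keep the rows whose elapsed time falls inside the window. The helper in_window_days below is hypothetical and works in days only; the inclusive boundaries are an assumption and may differ from what open_window does.

```r
# Hypothetical helper: which rows fall within n_days before (and including) row i?
in_window_days <- function(dates, i, n_days) {
  # Elapsed days from each row to row i; later rows get negative values.
  elapsed <- as.numeric(difftime(dates[i], dates, units = "days"))
  elapsed >= 0 & elapsed <= n_days
}

dates <- as.Date(c("2020-01-01", "2020-01-03", "2020-01-10", "2020-01-12"))
mask <- in_window_days(dates, i = 4, n_days = 5)
dates[mask]
```

Because the window is defined by elapsed time rather than by a fixed number of lagged rows, irregularly spaced observations yield windows with varying numbers of rows.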
Time-Based Rolling Binomial Probability
Description
Produces a rolling time-window based vector of binomial probability and confidence intervals.
Usage
tbr_binom(.tbl, x, tcolumn, unit = "years", n, alpha = 0.05)
Arguments
.tbl |
dataframe with two variables. |
x |
indicates the variable column containing "success" and "failure" observations coded as 1 or 0. |
tcolumn |
indicates the variable column containing Date or Date-Time values. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window in the selected units. |
alpha |
numeric, probability of a type 1 error, so confidence coefficient = 1-alpha |
Value
tibble with binomial point estimate and confidence intervals.
Examples
## Generate Sample Data
df <- tibble::tibble(
  date = sample(seq(as.Date('2000-01-01'), as.Date('2015-12-30'), by = "day"), 100),
  value = rbinom(100, 1, 0.25)
)
## Run Function
tbr_binom(df, x = value,
          tcolumn = date, unit = "years", n = 5,
          alpha = 0.1)
Binomial test based on time window
Description
Binomial test based on time window
Usage
tbr_binom_window(x, tcolumn, unit = "years", n, i, alpha)
Arguments
x |
column containing "success" and "failure" observations as 0 or 1 |
tcolumn |
formatted time column |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
i |
rows |
alpha |
numeric, probability of a type 1 error, so confidence coefficient = 1-alpha |
Value
list
Time-Based Rolling Geometric Mean
Description
Produces a rolling time-window based vector of geometric means and confidence intervals.
Usage
tbr_gmean(.tbl, x, tcolumn, unit = "years", n, ...)
Arguments
.tbl |
a data frame with at least two variables; time column formatted as date, date/time and value column. |
x |
column containing the values to calculate the geometric mean. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
... |
additional arguments passed to gm_mean_ci |
Value
tibble with columns for the rolling geometric mean and upper and lower confidence levels.
Examples
## Return a tibble with new rolling geometric mean column
tbr_gmean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5)
## Not run:
## Return a tibble with rolling geometric mean and 95% CI
tbr_gmean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, conf = .95)
## End(Not run)
Geometric mean based on a time-window
Description
Geometric mean based on a time-window
Usage
tbr_gmean_window(x, tcolumn, unit = "years", n, i, ...)
Arguments
x |
column containing the values to calculate the geometric mean. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
i |
row |
... |
additional arguments passed to gm_mean_ci |
Value
list
Time-Based Rolling Mean
Description
Produces a rolling time-window based vector of means and confidence intervals.
Usage
tbr_mean(.tbl, x, tcolumn, unit = "years", n, ...)
Arguments
.tbl |
a data frame with at least two variables; time column formatted as date, date/time and value column. |
x |
column containing the numeric values to calculate the mean. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
... |
additional arguments passed to mean_ci |
Value
tibble with columns for the rolling mean and upper and lower confidence intervals.
Examples
## Return a tibble with new rolling mean column
tbr_mean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5)
## Not run:
## Return a tibble with rolling mean and 95% CI
tbr_mean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, conf = .95)
## End(Not run)
Mean Based on a Time-Window
Description
Mean Based on a Time-Window
Usage
tbr_mean_window(x, tcolumn, unit = "years", n, i, ...)
Arguments
x |
column containing the values to calculate the mean. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
i |
row |
... |
additional arguments passed to mean_ci |
Value
list
Time-Based Rolling Median
Description
Produces a rolling time-window based vector of medians and confidence intervals.
Usage
tbr_median(.tbl, x, tcolumn, unit = "years", n, ...)
Arguments
.tbl |
a data frame with at least two variables; time column formatted as date, date/time and value column. |
x |
column containing the numeric values to calculate the median. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
... |
additional arguments passed to median_ci |
Value
tibble with columns for the rolling median and upper and lower confidence intervals.
Examples
## Return a tibble with new rolling median column
tbr_median(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5)
## Not run:
## Return a tibble with rolling median and 95% CI
tbr_median(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, conf = .95)
## End(Not run)
Median Based on a Time-Window
Description
Median Based on a Time-Window
Usage
tbr_median_window(x, tcolumn, unit = "years", n, i, ...)
Arguments
x |
column containing the values to calculate the median. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
i |
row |
... |
additional arguments passed to median_ci |
Value
list
Use Generic Functions with Time Windows
Description
Use Generic Functions with Time Windows
Usage
tbr_misc(.tbl, x, tcolumn, unit = "years", n, func, ...)
Arguments
.tbl |
a data frame with at least two variables; time column formatted as date, date/time and value column. |
x |
column containing the values the function is applied to. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
func |
specified function |
... |
optional additional arguments passed to function |
Value
tibble
Examples
tbr_misc(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, func = mean)
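To make the idea of applying a generic function over a time window concrete, here is a minimal base R sketch. It is not the package's implementation: the helper roll_by_days is hypothetical, it handles day units only, and it assumes the window covers the n_days up to and including each row.

```r
# Hypothetical helper: apply func to the values observed within the
# n_days up to and including each row's date.
roll_by_days <- function(dates, values, n_days, func, ...) {
  vapply(seq_along(dates), function(i) {
    elapsed <- as.numeric(difftime(dates[i], dates, units = "days"))
    func(values[elapsed >= 0 & elapsed <= n_days], ...)
  }, numeric(1))
}

dates <- as.Date(c("2020-01-01", "2020-01-03", "2020-01-10", "2020-01-12"))
values <- c(2, 4, 6, 8)
rolled <- roll_by_days(dates, values, n_days = 5, func = mean)
rolled
```

Any function that reduces a numeric vector to a single number (for example max or median) can be passed as func in this sketch.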
Time-Based Rolling Standard Deviation
Description
Time-Based Rolling Standard Deviation
Usage
tbr_sd(.tbl, x, tcolumn, unit = "years", n, na.rm = FALSE)
Arguments
.tbl |
a data frame with at least two variables; time column formatted as date, date/time and value column. |
x |
column containing the values to calculate the standard deviation. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
na.rm |
logical. Should missing values be removed? |
Value
tibble with column for the rolling sd.
Examples
tbr_sd(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5)
Standard Deviation Based on a Time-Window
Description
Standard Deviation Based on a Time-Window
Usage
tbr_sd_window(x, tcolumn, unit = "years", n, i, ...)
Arguments
x |
column containing the values to calculate the standard deviation. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
i |
row |
... |
additional arguments passed to stats::sd() |
Value
numeric value
Time-Based Rolling Sum
Description
Time-Based Rolling Sum
Usage
tbr_sum(.tbl, x, tcolumn, unit = "years", n, na.rm = FALSE)
Arguments
.tbl |
a data frame with at least two variables; time column formatted as date, date/time and value column. |
x |
column containing the values to calculate the sum. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
na.rm |
logical. Should missing values be removed? |
Value
dataframe with column for the rolling sum.
Examples
tbr_sum(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5)
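Both tbr_sum and tbr_sd default to na.rm = FALSE. Assuming the argument is handed on to the underlying base R summary (an assumption, not something this page confirms), a single missing value inside a window would make that window's result NA. The base R behavior looks like this, using made-up values:

```r
window_values <- c(4.2, NA, 6.1)   # made-up values from one window

sum(window_values)                 # NA: the missing value propagates
sum(window_values, na.rm = TRUE)   # missing value dropped before summing
sd(window_values, na.rm = TRUE)    # same idea for the standard deviation
```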
Sum Based on a Time-Window
Description
Sum Based on a Time-Window
Usage
tbr_sum_window(x, tcolumn, unit = "years", n, i, na.rm)
Arguments
x |
column containing the values to calculate the sum. |
tcolumn |
formatted time column. |
unit |
character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
n |
numeric, describing the length of the time window. |
i |
row |
na.rm |
logical. Should missing values be removed? |
Value
numeric value