Title: | Transform Univariate Time Series |
Version: | 0.0.1 |
Description: | Univariate time series operations that follow an opinionated design. The main principle of 'transx' is to keep the number of observations the same. Operations that reduce this number have to fill the observations gap. |
License: | GPL-3 |
Imports: | rlang |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.1.1 |
URL: | https://github.com/kvasilopoulos/transx |
BugReports: | https://github.com/kvasilopoulos/transx/issues |
Suggests: | dplyr, ggplot2, cli, testthat, knitr, DescTools, outliers, rmarkdown, mFilter, covr |
VignetteBuilder: | knitr |
Language: | en-US |
Depends: | R (≥ 2.10) |
NeedsCompilation: | no |
Packaged: | 2020-11-26 15:04:28 UTC; T460p |
Author: | Kostas Vasilopoulos |
Maintainer: | Kostas Vasilopoulos <k.vasilopoulo@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2020-11-27 11:40:02 UTC |
Removes measure of centrality from the series
Description
Removes the mean, the median or the mode from the series.
Usage
demean(x, na.rm = getOption("transx.na.rm"))
demedian(x, na.rm = getOption("transx.na.rm"))
demode(x, na.rm = getOption("transx.na.rm"))
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
na.rm |
A value indicating whether NA values should be stripped before the computation proceeds. |
Value
Returns a vector with the same class and attributes as the input vector.
Examples
x <- c(2,5,10,20,30)
summary(x)
demean(x)
demedian(x)
demode(x)
Compute lagged differences
Description
Returns suitably lagged and iterated differences.
- diffx computes simple differences.
- rdiffx computes percentage differences.
- ldiffx computes logged differences.
Usage
diffx(x, n = 1L, order = 1L, rho = 1, fill = NA)
rdiffx(x, n = 1L, order = 1L, rho = NULL, fill = NA)
ldiffx(x, n = 1L, order = 1L, rho = 1, fill = NA)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
n |
Value indicating which lag to use. |
order |
Value indicating the order of the difference. |
rho |
Value indicating the autocorrelation parameter. It provides quasi-differencing, assuming the value lies between 0 and 1. |
fill |
Numeric value(s) or function used to fill observations. |
Examples
x <- c(2, 4, 8, 20)
diffx(x)
rdiffx(x)
ldiffx(x)
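The three variants can be sketched in base R. This is an illustration only, under assumptions the page does not confirm: that quasi-differencing means x[t] - rho * x[t-n], that the percentage difference is (x[t] - x[t-n]) / x[t-n], and that the first n positions are padded with `fill`. The helper names `lag_pad`, `diff_sketch`, `rdiff_sketch` and `ldiff_sketch` are hypothetical.

```r
# Hypothetical helpers; not the package implementation.
lag_pad <- function(x, n = 1L, fill = NA) {
  # Shift x back by n positions and pad the front so the length is unchanged.
  c(rep(fill, n), x[seq_len(length(x) - n)])
}
diff_sketch  <- function(x, n = 1L, rho = 1) x - rho * lag_pad(x, n)
rdiff_sketch <- function(x, n = 1L) (x - lag_pad(x, n)) / lag_pad(x, n)
ldiff_sketch <- function(x, n = 1L) log(x) - log(lag_pad(x, n))

x <- c(2, 4, 8, 20)
diff_sketch(x)             # NA  2  4 12
diff_sketch(x, rho = 0.5)  # NA  3  6 16
rdiff_sketch(x)            # NA 1.0 1.0 1.5
ldiff_sketch(x)            # differences of log(x)
```

Every result has the same length as the input, which is the design principle stated in the package description.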
Deterministic Trend
Description
Remove global deterministic trend information from the series.
- dtrend_lin removes the linear trend.
- dtrend_quad removes the quadratic trend.
- dtrend_poly removes the nth-degree polynomial trend.
Usage
dtrend_lin(x, bp = NULL, na.rm = getOption("transx.na.rm"))
dtrend_quad(x, bp = NULL, na.rm = getOption("transx.na.rm"))
dtrend_poly(x, degree, bp = NULL, na.rm = getOption("transx.na.rm"))
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
bp |
Break points to define piecewise segments of the data. |
na.rm |
A value indicating whether NA values should be stripped before the computation proceeds. |
degree |
Value indicating the degree of the polynomial. |
Value
Returns a vector with the same class and attributes as the input vector.
Examples
set.seed(123)
t <- 1:20
# Linear trend
x <- 3*sin(t) + t
plotx(cbind(x, dtrend_lin(x)))
# Quadratic trend
x2 <- 3*sin(t) + t + t^2
plotx(cbind(raw = x2, quad = dtrend_quad(x2)))
# Introduce a breaking point at point = 10
xbp <- 3*sin(t) + t
xbp[10:20] <- x[10:20] + 15
plotx(cbind(raw = xbp, lin = dtrend_lin(xbp), lin_bp = dtrend_lin(xbp, bp = 10)))
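A common way to remove a global polynomial trend is to keep the residuals of a least-squares fit on a polynomial in time. The sketch below assumes that approach; it is not confirmed to match the package internals, it ignores break points, and `detrend_sketch` is a hypothetical name.

```r
# Hypothetical helper: residuals from an OLS fit on a polynomial in time.
detrend_sketch <- function(x, degree = 1) {
  tt <- seq_along(x)
  fit <- lm(x ~ poly(tt, degree, raw = TRUE))
  as.numeric(residuals(fit))
}

tt <- 1:20
x <- 3 * sin(tt) + tt
res <- detrend_sketch(x)              # linear trend removed
length(res) == length(x)              # the number of observations is preserved
quad <- detrend_sketch(tt^2, degree = 2)  # a pure quadratic leaves residuals near zero
```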
Fill with "linear approximation"
Description
Fill with "linear approximation"
Usage
fill_linear(body, idx, ...)
Arguments
body |
The body of the vector. |
idx |
the index to replace with. |
... |
Further arguments passed to |
Value
Returns a vector with the same class and attributes as the input vector.
Examples
x <- c(5,3,2,2,5)
xlen <- length(x)
n <- 2
n <- pmin(n, xlen)
idx <- 1:n
body <- x[seq_len(xlen - n)]
fill_linear(body, idx)
Fill with "Last Observation Carried Forward"
Description
Fill with "Last Observation Carried Forward"
Usage
fill_locf(body, idx, fail = NA)
Arguments
body |
The body of the vector. |
idx |
the index to replace with. |
fail |
In case it fails to fill some values. |
Value
Returns a vector with the same class and attributes as the input vector.
Examples
x <- c(5,3,2,2,5)
lagx(x, n = 2, fill = fill_locf)
leadx(x, n = 2, fill = fill_locf)
lagx(x, n = 2, fill = fill_nocb)
leadx(x, n = 2, fill = fill_nocb)
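The two carrying rules can be illustrated on a plain vector with NA gaps. This is a generic base-R sketch of the techniques, not the `body`/`idx` interface documented above; `locf_sketch` and `nocb_sketch` are hypothetical names.

```r
# Hypothetical helpers operating on a vector with NA gaps.
locf_sketch <- function(x) {
  # Index of the most recent non-missing value at each position (0 if none yet).
  last_seen <- cummax(seq_along(x) * (!is.na(x)))
  c(NA, x)[last_seen + 1L]
}
# Next observation carried backwards is LOCF applied to the reversed vector.
nocb_sketch <- function(x) rev(locf_sketch(rev(x)))

x <- c(NA, NA, 5, 3, 2)
locf_sketch(x)  # NA NA  5  3  2 (nothing to carry forward at the start)
nocb_sketch(x)  #  5  5  5  3  2
```

Leading gaps cannot be filled by carrying forward, which is presumably the situation the `fail` argument is meant to cover.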
Fill with "Next observation carried backwards"
Description
Fill with "Next observation carried backwards"
Usage
fill_nocb(body, idx, fail = NA)
Arguments
body |
The body of the vector. |
idx |
the index to replace with. |
fail |
In case it fails to fill some values. |
Value
Returns a vector with the same class and attributes as the input vector.
Examples
x <- c(5,3,2,2,5)
leadx(x, n = 2, fill = fill_locf)
xlen <- length(x)
n <- 2
n <- pmin(n, xlen)
idx <- (xlen - n + 1):xlen
body <- x[-seq_len(n)]
fill_locf(body, idx, NA)
Fill with "cubic spline interpolation"
Description
Fill with "cubic spline interpolation"
Usage
fill_spline(body, idx, ...)
Arguments
body |
The body of the vector. |
idx |
the index to replace with. |
... |
Further arguments passed to |
Value
Returns a vector with the same class and attributes as the input vector.
Examples
x <- c(5,3,NA,2,5)
fill_spline(x, 3)
Baxter-King Filter
Description
This function computes the cyclical component of the Baxter-King filter.
Usage
filter_bk(x, fill = NA, ...)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
fill |
Numeric value(s) or function used to fill observations. |
... |
Further arguments passed to |
Examples
unemp <- ggplot2::economics$unemploy
unemp_cycle <- filter_bk(unemp)
plotx(cbind(unemp, unemp_cycle))
Butterworth Filter
Description
This function computes the cyclical component of the Butterworth filter.
Usage
filter_bw(x, ...)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
... |
Further arguments passed to |
Examples
unemp <- ggplot2::economics$unemploy
unemp_cycle <- filter_bw(unemp, freq = 10)
plotx(cbind(unemp, unemp_cycle))
Christiano-Fitzgerald Filter
Description
This function computes the cyclical component of the Christiano-Fitzgerald filter.
Usage
filter_cf(x, ...)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
... |
Further arguments passed to |
Examples
unemp <- ggplot2::economics$unemploy
unemp_cycle <- filter_cf(unemp)
plotx(cbind(unemp, unemp_cycle))
Hamilton Filter
Description
This function computes the cyclical component of the Hamilton filter.
Usage
filter_hamilton(x, p = 4, horizon = 8, fill = NA)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
p |
A value indicating the number of lags |
horizon |
A value indicating the number of periods to look ahead. |
fill |
Numeric value(s) or function used to fill observations. |
Value
Returns a vector with the same class and attributes as the input vector.
Examples
unemp <- ggplot2::economics$unemploy
unemp_cycle <- filter_hamilton(unemp)
plotx(cbind(unemp, unemp_cycle))
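Hamilton's approach regresses the value `horizon` periods ahead on the `p` most recent values and takes the residual as the cyclical component. The base-R sketch below follows that idea. The name `hamilton_sketch` and the way the first `p + horizon - 1` positions are padded with `fill` are assumptions made for illustration.

```r
# Hypothetical helper; a sketch of the regression-based filter.
hamilton_sketch <- function(x, p = 4, horizon = 8, fill = NA) {
  n <- length(x)
  # Row i of `lags` holds x[t], x[t-1], ..., x[t-p+1] for t = p + i - 1.
  lags <- embed(x, p)
  keep <- seq_len(n - p + 1 - horizon)   # rows whose target x[t + horizon] exists
  y <- x[(p + horizon):n]                # targets, aligned with the kept rows
  fit <- lm(y ~ lags[keep, , drop = FALSE])
  # Pad the front so the output has as many observations as the input.
  c(rep(fill, p + horizon - 1), as.numeric(residuals(fit)))
}

set.seed(1)
x <- cumsum(rnorm(100))       # simulated random walk
cyc <- hamilton_sketch(x)
length(cyc)                   # 100
sum(is.na(cyc))               # 11, i.e. p + horizon - 1
```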
Hodrick-Prescott Filter
Description
This function computes the cyclical component of the Hodrick-Prescott filter.
Usage
filter_hp(x, ...)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
... |
Further arguments passed to |
See Also
select_lambda
Examples
unemp <- ggplot2::economics$unemploy
unemp_cycle <- filter_hp(unemp, freq = select_lambda("monthly"))
plotx(cbind(unemp, unemp_cycle))
Trigonometric regression Filter
Description
This function computes the cyclical component of the trigonometric regression filter.
Usage
filter_tr(x, ...)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
... |
Further arguments passed to |
Examples
unemp <- ggplot2::economics$unemploy
unemp_cycle <- filter_tr(unemp, pl=8, pu=40)
plotx(cbind(unemp, unemp_cycle))
Geometric Mean value
Description
Compute the sample geometric mean.
Usage
gmean(x, na.rm = getOption("transx.na.rm"))
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
na.rm |
A value indicating whether NA values should be stripped before the computation proceeds. |
Value
Returns a vector with the same class and attributes as the input vector.
Compute lagged or leading values
Description
Find the "previous" (lagx()) or "next" (leadx()) values in a vector.
Useful for comparing values behind or ahead of the current values.
Usage
lagx(x, n = 1L, fill = NA)
leadx(x, n = 1L, fill = NA)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
n |
Value indicating the number of positions to lead or lag by. |
fill |
Numeric value(s) or function used to fill observations. |
Details
These functions have been taken and modified from the dplyr package; to reduce dependencies, they are not imported.
Value
Returns a vector with the same class and attributes as the input vector.
Examples
x <- c(5,3,2,2,5)
lagx(x)
lagx(x, fill = mean)
lagx(x, fill = fill_nocb)
leadx(x)
leadx(x, fill = fill_locf)
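The shifting itself can be sketched in a few lines of base R. The names `lag_sketch` and `lead_sketch` are hypothetical, and the sketch only handles a numeric `fill`; how the package evaluates a function passed as `fill` is not shown here.

```r
# Hypothetical helpers: shift and pad so the length never changes.
lag_sketch <- function(x, n = 1L, fill = NA) {
  c(rep(fill, n), x[seq_len(length(x) - n)])
}
lead_sketch <- function(x, n = 1L, fill = NA) {
  c(x[-seq_len(n)], rep(fill, n))
}

x <- c(5, 3, 2, 2, 5)
lag_sketch(x)              # NA  5  3  2  2
lead_sketch(x, n = 2)      #  2  2  5 NA NA
lag_sketch(x, fill = 0)    #  0  5  3  2  2
```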
Mode value
Description
Compute the sample mode.
Usage
modex(x, na.rm = getOption("transx.na.rm"))
modex_int(x, na.rm = getOption("transx.na.rm"))
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
na.rm |
A value indicating whether NA values should be stripped before the computation proceeds. |
Detect outliers with Tukey's method
Description
Usage
out_iqr(x, cutoff = 1.5, fill = NA, ...)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
cutoff |
|
fill |
Numeric value(s) or function used to fill observations. |
... |
further arguments passed to |
Examples
out_iqr(c(0,1,3,4,20))
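Tukey's rule flags values lying more than `cutoff` interquartile ranges outside the quartiles. A minimal base-R sketch follows; `iqr_flag` is a hypothetical name, and replacing flagged values with NA mirrors the default `fill = NA` only by assumption.

```r
# Hypothetical helper: TRUE where a value falls outside Tukey's fences.
iqr_flag <- function(x, cutoff = 1.5) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  fence <- cutoff * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}

x <- c(0, 1, 3, 4, 20)
flag <- iqr_flag(x)        # only the last value is flagged
replace(x, flag, NA)       # 0 1 3 4 NA
```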
Detect outliers with Percentiles
Description
Usage
out_pt(x, pt_low = 0.1, pt_high = 0.9, fill = NA)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
pt_low |
the lowest quantile |
pt_high |
the highest quantile |
fill |
Numeric value(s) or function used to fill observations. |
Examples
x <- c(1, 3, -1, 5, 10, 100)
out_pt(x)
Detect outliers with zscore
Description
Usage
out_score_z(x, cutoff = 3, fill = NA, ...)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
cutoff |
|
fill |
Numeric value(s) or function used to fill observations. |
... |
Further arguments passed to |
Examples
out_score_z(c(0,0.1,2,1,3,2.5,2,.5,6,4,100))
Detect outliers Iglewicz and Hoaglin (1993) robust z-score method
Description
Usage
out_score_zrob(x, cutoff = 3.5, fill = NA, ...)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
cutoff |
|
fill |
Numeric value(s) or function used to fill observations. |
... |
further arguments passed to |
Examples
out_score_zrob(c(0,0.1,2,1,3,2.5,2,.5,6,4,100))
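The Iglewicz and Hoaglin score replaces the mean and standard deviation with the median and the median absolute deviation, scaled by 0.6745. The sketch below computes it directly in base R next to the classical z-score; whether the package computes the score in exactly this way is an assumption.

```r
x <- c(0, 0.1, 2, 1, 3, 2.5, 2, 0.5, 6, 4, 100)

# Classical z-score: sensitive to the outlier it is trying to detect.
z <- (x - mean(x)) / sd(x)

# Robust z-score: median and raw MAD (constant = 1 disables R's default scaling).
z_rob <- 0.6745 * (x - median(x)) / mad(x, constant = 1)

which(abs(z_rob) > 3.5)   # 11, the value 100
```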
Detect outliers with upper and lower threshold
Description
Usage
out_threshold(x, tlow = NULL, thigh = NULL, fill = NA)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
tlow |
The lower threshold. |
thigh |
The upper threshold. |
fill |
Numeric value(s) or function used to fill observations. |
Value
Returns a vector with the same class and attributes as the input vector.
Examples
x <- c(1, 3, -1, 5, 10, 100)
out_threshold(x, tlow = 0, fill = 0)
out_threshold(x, thigh = 9, fill = function(x) quantile(x, 0.9))
Winsorize
Description
Replace extreme values, as defined by min and max.
Usage
out_winsorise(x, min = quantile(x, 0.05), max = quantile(x, 0.95))
out_winsorize(x, min = quantile(x, 0.05), max = quantile(x, 0.95))
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
min |
The lower bound, all values lower than this will be replaced by this value. |
max |
The upper bound, all values higher than this will be replaced by this value. |
Value
Returns a vector with the same class and attributes as the input vector.
See Also
Examples
x <- c(1, 3, -1, 5, 10, 100)
out_winsorise(x)
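Winsorizing amounts to clamping the series between the two bounds. A minimal base-R sketch, with the hypothetical name `winsorise_sketch` and the same default bounds as in the usage above:

```r
# Hypothetical helper: clamp values to [min, max].
winsorise_sketch <- function(x, min = quantile(x, 0.05), max = quantile(x, 0.95)) {
  pmin(pmax(x, min), max)
}

x <- c(1, 3, -1, 5, 10, 100)
winsorise_sketch(x, min = 0, max = 10)  # 1 3 0 5 10 10
```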
Plotting wrapper around plot.default
Description
Helper function to only plot x as a line plot.
Usage
plotx(x, ...)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
... |
Further arguments used in |
nth Power Transformation
Description
Usage
pow(x, pow = NULL, modulus = FALSE)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
pow |
The nth power. |
modulus |
positive |
Value
Returns a vector with the same class and attributes as the input vector.
Examples
pow(2, 2)
pow(-2, 2)
pow(-2,2, TRUE)
Box-Cox Transformations
Description
Usage
pow_boxcox(x, lambda = NULL, lambda2 = NULL, ...)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
lambda |
Transformation exponent, |
lambda2 |
Transformation exponent, |
... |
Further arguments passed to |
Value
Returns a vector with the same class and attributes as the input vector.
References
Box, G. E., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 211-252. https://www.jstor.org/stable/2984418
Examples
set.seed(123)
x <- runif(10)
pow_boxcox(x, 3)
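For reference, the one-parameter Box-Cox transformation is (x^lambda - 1) / lambda, with log(x) as the limiting case at lambda = 0. The sketch below implements only that textbook form for strictly positive data; `boxcox_sketch` is a hypothetical name and the role of `lambda2` is not covered.

```r
# Hypothetical helper: one-parameter Box-Cox for strictly positive x.
boxcox_sketch <- function(x, lambda) {
  stopifnot(all(x > 0))
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}

boxcox_sketch(c(1, 2, 4), 0.5)  # 0.0000000 0.8284271 2.0000000
boxcox_sketch(c(1, 2, 4), 0)    # same as log(c(1, 2, 4))
```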
Manly (1971) Transformations
Description
The transformation was reported to be successful in transforming unimodal skewed distributions into normal distributions, but it is not very useful for bimodal or U-shaped distributions.
Usage
pow_manly(x, lambda = NULL)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
lambda |
Transformation exponent, |
Value
Returns a vector with the same class and attributes as the input vector.
Examples
set.seed(123)
x <- runif(10)
pow_manly(x, 3)
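Manly's exponential transformation is usually written as (exp(lambda * x) - 1) / lambda, with the identity as the limiting case at lambda = 0. A minimal sketch of that form follows; `manly_sketch` is a hypothetical name and the package may handle edge cases differently.

```r
# Hypothetical helper: Manly's exponential transformation.
manly_sketch <- function(x, lambda) {
  if (lambda == 0) x else expm1(lambda * x) / lambda
}

manly_sketch(c(-1, 0, 1), 0)  # unchanged: -1 0 1
manly_sketch(c(-1, 0, 1), 3)  # negative inputs are allowed, unlike Box-Cox
```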
Tukey Transformations
Description
Usage
pow_tukey(x, lambda = NULL, ...)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
lambda |
Transformation exponent, |
... |
Further arguments passed to |
Value
Returns a vector with the same class and attributes as the input vector.
Examples
set.seed(123)
x <- runif(10)
pow_tukey(x, 2)
Yeo and Johnson (2000) Transformations
Description
Usage
pow_yj(x, lambda = NULL, ...)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
lambda |
Transformation exponent, |
... |
Further arguments passed to |
Value
Returns a vector with the same class and attributes as the input vector.
References
Yeo, I., & Johnson, R. (2000). A New Family of Power Transformations to Improve Normality or Symmetry. Biometrika, 87(4), 954-959. http://www.jstor.org/stable/2673623
Examples
set.seed(123)
x <- runif(10)
pow_yj(x, 3)
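The Yeo-Johnson family extends Box-Cox to negative values by treating the two signs separately. The sketch below writes out the four cases of the published definition; `yj_sketch` is a hypothetical name and missing values are not handled.

```r
# Hypothetical helper: Yeo-Johnson transformation, case by case.
yj_sketch <- function(x, lambda) {
  out <- numeric(length(x))
  pos <- x >= 0
  # Non-negative values: Box-Cox applied to x + 1.
  out[pos] <- if (lambda == 0) log1p(x[pos]) else ((x[pos] + 1)^lambda - 1) / lambda
  # Negative values: mirrored Box-Cox with exponent 2 - lambda.
  out[!pos] <- if (lambda == 2) -log1p(-x[!pos]) else
    -((1 - x[!pos])^(2 - lambda) - 1) / (2 - lambda)
  out
}

yj_sketch(c(-1, 0, 1), 1)  # -1 0 1 (lambda = 1 is the identity)
yj_sketch(c(-1, 0, 1), 0)  # log1p() on the non-negative side
```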
Change the base year
Description
Change the base year.
Usage
rebase(x, n = NULL)
rebase_origin(x)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
n |
The index of the new base year. |
Value
Returns a vector with the same class and attributes as the input vector.
Examples
x <- 3:10
# New base would be 5
rebase(x, 5)
# Or the origin
rebase_origin(x)
# For the base to be 100 or 0:
rebase(x, 5)*100
rebase(x, 5) - 1
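The last two examples suggest that the rebased series equals 1 at the base observation. Under that assumption, and reading `n` as an index as the argument description says, rebasing reduces to a division. `rebase_sketch` is a hypothetical name.

```r
# Hypothetical helper: divide by the value at the base index.
rebase_sketch <- function(x, n = 1L) x / x[n]

x <- 3:10
r <- rebase_sketch(x, 5)
r[5]        # 1, since x[5] = 7 is the new base
r * 100     # base of 100
r - 1       # base of 0
```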
nth Root Transformation
Description
- root: nth root
- root_sq: square root
- root_cubic: cube root
Usage
root(x, root = NULL, modulus = FALSE)
root_sq(x, ...)
root_cubic(x, ...)
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
root |
The nth root. |
modulus |
Transformation will work for data with both positive and negative |
... |
Further arguments passed to |
Examples
root(4, 2)
root(-4, 2)
root(-4, 2, TRUE)
Rescale
Description
Usage
scale_range(x, to, na.rm = getOption("transx.na.rm"))
scale_minmax(x, na.rm = getOption("transx.na.rm"))
scale_unit_len(x, na.rm = getOption("transx.na.rm"))
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
to |
Values that will determine the output range. |
na.rm |
A value indicating whether NA values should be stripped before the computation proceeds. |
Details
To rescale a range between an arbitrary set of values [a, b], the formula becomes:
x' = a + (x - min(x)) * (b - a) / (max(x) - min(x))
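A minimal base-R sketch of rescaling to [a, b], assuming the usual min-max form a + (x - min(x)) * (b - a) / (max(x) - min(x)). The name `rescale_sketch` is hypothetical, and the default of [0, 1] is chosen here only to illustrate plain min-max scaling.

```r
# Hypothetical helper: min-max rescaling to the interval given by `to`.
rescale_sketch <- function(x, to = c(0, 1)) {
  a <- to[1]
  b <- to[2]
  a + (x - min(x)) * (b - a) / (max(x) - min(x))
}

x <- c(10, 5, 1, -2)
rescale_sketch(x, c(-1, 2))  # 2.00  0.75 -0.25 -1.00
rescale_sketch(x)            # 1.0000000 0.5833333 0.2500000 0.0000000
```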
Value
Returns a vector with the same class and attributes as the input vector.
Examples
x <- c(10,5,1,-2)
scale_range(x, c(-1, 2))
scale_minmax(x)
Score transformation
Description
These functions calculate the scores according to:
- score_z: Normal (z) distribution
- score_mad: Mean absolute deviation
- score_t: t-distribution
- score_chisq: chi-distribution
Usage
score_z(x, na.rm = getOption("transx.na.rm"))
score_mad(x, na.rm = getOption("transx.na.rm"))
score_t(x, na.rm = getOption("transx.na.rm"))
score_chisq(x, na.rm = getOption("transx.na.rm"))
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
na.rm |
A value indicating whether NA values should be stripped before the computation proceeds. |
Details
Because these functions are known by different names:
- score_z is identical to std_mean
- score_mad is identical to std_median
Value
Returns a vector with the same class and attributes as the input vector.
See Also
Examples
x <- seq(-3,3,0.5)
score_z(x)
score_mad(x)
score_t(x)
Selecting lambda
Description
Approaches to selecting lambda.
Usage
select_lambda(
freq = c("quarterly", "annual", "monthly", "weekly"),
type = c("rot", "ru2002")
)
Arguments
freq |
The frequency of the dataset. |
type |
The methodology to select lambda. |
Details
The rule of thumb is from Hodrick and Prescott (1997):
Lambda = 100*(number of periods in a year)^2
Annual data = 100 x 1^2 = 100
Quarterly data = 100 x 4^2 = 1,600
Monthly data = 100 x 12^2 = 14,400
Weekly data = 100 x 52^2 = 270,400
Daily data = 100 x 365^2 = 13,322,500
Ravn and Uhlig (2002) state that lambda should vary by the fourth power of the frequency observation ratio:
Lambda = 6.25 x (number of periods in a year)^4
Equivalently, relative to the quarterly benchmark of 1,600:
Lambda = 1600 x (number of periods in a quarter)^4
Thus, the rescaled default values for lambda are:
Annual data = 1600 x 0.25^4 = 6.25
Quarterly data = 1600 x 1^4 = 1,600
Monthly data = 1600 x 3^4 = 129,600
Weekly data = 1600 x 12^4 = 33,177,600 (taking 12 weeks per quarter)
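The listed values can be reproduced with a few lines of arithmetic. The vectors below are written out for illustration and are not part of the package. The weekly Ravn-Uhlig value assumes 12 weeks per quarter, which is what the figure of 33,177,600 implies.

```r
per_year    <- c(annual = 1, quarterly = 4, monthly = 12, weekly = 52)
per_quarter <- c(annual = 0.25, quarterly = 1, monthly = 3, weekly = 12)

# Rule of thumb: 100 x (periods per year)^2
lambda_rot <- 100 * per_year^2       # 100 1600 14400 270400

# Ravn and Uhlig: 1600 x (periods per quarter)^4
lambda_ru <- 1600 * per_quarter^4    # 6.25 1600 129600 33177600
```

Both rules agree at the quarterly frequency, where lambda is 1,600.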
References
Hodrick, R. J., & Prescott, E. C. (1997). Postwar US business cycles: an empirical investigation. Journal of Money, Credit, and Banking, 1-16.
Ravn, M. O., & Uhlig, H. (2002). On adjusting the Hodrick-Prescott filter for the frequency of observations. Review of Economics and Statistics, 84(2), 371-376.
Skewness/Kurtosis Value
Description
Compute the sample skewness/kurtosis
Usage
skewness(x, na.rm = getOption("transx.na.rm"))
kurtosis(x, na.rm = getOption("transx.na.rm"))
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
na.rm |
A value indicating whether NA values should be stripped before the computation proceeds. |
Standardization
Description
Converts a raw score into the number of standard deviations by which it lies above or below the mean value of what is being observed or measured.
Usage
std_mean(x, na.rm = getOption("transx.na.rm"))
std_median(x, na.rm = getOption("transx.na.rm"))
Arguments
x |
Univariate vector, numeric or ts object with only one dimension. |
na.rm |
A value indicating whether NA values should be stripped before the computation proceeds. |
Value
Returns a vector with the same class and attributes as the input vector.
Examples
x <- c(10,2,5,3)
std_mean(x)
scale(x)
std_median(x)
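The comparison with `scale()` in the examples can be checked by hand in base R. The mean-based version below is the textbook z-score. The median-based version, which divides by `mad()` with R's default scaling constant, is only an assumption about what `std_median` computes.

```r
x <- c(10, 2, 5, 3)

# Mean-based standardization: centre on the mean, scale by the standard deviation.
z_mean <- (x - mean(x)) / sd(x)
all.equal(z_mean, as.numeric(scale(x)))  # TRUE

# Median-based analogue (assumed form): centre on the median, scale by the MAD.
z_median <- (x - median(x)) / mad(x)
```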