Type: | Package |
Title: | Analyse and Interpret Time Series Features |
Version: | 0.2.0 |
Date: | 2025-07-10 |
Maintainer: | Trent Henderson <then6675@uni.sydney.edu.au> |
Description: | Provides a suite of functions for analysing, interpreting, and visualising time-series features calculated from different feature sets from the 'theft' package. Implements statistical learning methodologies described in Henderson, T., Bryant, A., and Fulcher, B. (2023) <doi:10.48550/arXiv.2303.17809>. |
BugReports: | https://github.com/hendersontrent/theftdlc/issues |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
Depends: | R (≥ 3.5.0), theft (≥ 0.8.1) |
Imports: | rlang, stats, tibble, dplyr, ggplot2, tidyr, purrr, furrr, future, reshape2, scales, broom, Rtsne, e1071, janitor, umap, MASS, mclust, normaliseR, correctR, glmnet |
Suggests: | lifecycle, cachem, bslib, knitr, rmarkdown, pkgdown, testthat |
RoxygenNote: | 7.3.2 |
VignetteBuilder: | knitr |
URL: | https://hendersontrent.github.io/theftdlc/ |
NeedsCompilation: | no |
Packaged: | 2025-07-10 08:02:45 UTC; trenthenderson |
Author: | Trent Henderson [cre, aut] |
Repository: | CRAN |
Date/Publication: | 2025-07-10 08:20:02 UTC |
Fit classifiers on time-series features using a resample-based approach and get a fast understanding of performance
Description
Fit classifiers on time-series features using a resample-based approach and get a fast understanding of performance
Usage
classify(
  data,
  classifier = NULL,
  train_size = 0.75,
  n_resamples = 30,
  by_set = TRUE,
  use_null = FALSE,
  seed = 123
)

tsfeature_classifier(
  data,
  classifier = NULL,
  train_size = 0.75,
  n_resamples = 30,
  by_set = TRUE,
  use_null = FALSE,
  seed = 123
)
Arguments
data |
|
classifier |
|
train_size |
|
n_resamples |
|
by_set |
|
use_null |
|
seed |
|
Value
list containing a named vector of train-test set sizes, and a data.frame of classification performance results
Author(s)
Trent Henderson
Examples
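library(theft)
library(theftdlc)

# Calculate catch22 features for theft's simData dataset
features <- theft::calculate_features(theft::simData,
                                      feature_set = "catch22")

# Fit classifiers with only 3 resamples so the example runs quickly
classifiers <- classify(features,
                        by_set = FALSE,
                        n_resamples = 3)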
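The resample-based approach named in the description can be illustrated with a minimal base R sketch. It is not the package's implementation: the toy data, the `nearest_centroid()` helper, and the plain random split are all assumptions made for illustration. Only the `train_size` and `n_resamples` values mirror the defaults listed under Usage.

```r
# Hypothetical sketch of resample-based evaluation; no theftdlc code is used
set.seed(123)

# Toy feature matrix: 40 time series, 2 features, 2 well-separated groups
n_per_group <- 20
x <- rbind(
  matrix(rnorm(n_per_group * 2, mean = 0), ncol = 2),
  matrix(rnorm(n_per_group * 2, mean = 4), ncol = 2)
)
group <- rep(c("A", "B"), each = n_per_group)

train_size <- 0.75
n_resamples <- 30

# Illustrative classifier: assign each test row to the closest class centroid
nearest_centroid <- function(train_x, train_y, test_x) {
  classes <- sort(unique(train_y))
  # One row of feature means per class
  centroids <- t(sapply(classes, function(cl) {
    colMeans(train_x[train_y == cl, , drop = FALSE])
  }))
  # Squared Euclidean distance from every test row to every centroid
  dists <- sapply(seq_along(classes), function(j) {
    rowSums(sweep(test_x, 2, centroids[j, ])^2)
  })
  dists <- matrix(dists, nrow = nrow(test_x))
  classes[apply(dists, 1, which.min)]
}

# Draw a fresh train-test split per resample and record test accuracy
accuracy <- vapply(seq_len(n_resamples), function(i) {
  train_idx <- sample(nrow(x), size = floor(train_size * nrow(x)))
  preds <- nearest_centroid(x[train_idx, , drop = FALSE], group[train_idx],
                            x[-train_idx, , drop = FALSE])
  mean(preds == group[-train_idx])
}, numeric(1))

summary(accuracy)
```

Each resample yields one accuracy value, so the spread across resamples gives a quick sense of how stable performance is.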
Perform cluster analysis of time series using their feature vectors
Description
Perform cluster analysis of time series using their feature vectors
Usage
cluster(
  data,
  norm_method = c("zScore", "Sigmoid", "RobustSigmoid", "MinMax", "MaxAbs"),
  unit_int = FALSE,
  clust_method = c("kmeans", "hclust", "mclust"),
  k = 2,
  features = NULL,
  na_removal = c("feature", "sample"),
  seed = 123,
  ...
)
Arguments
data |
|
norm_method |
|
unit_int |
|
clust_method |
|
k |
|
features |
|
na_removal |
|
seed |
|
... |
arguments to be passed to |
Value
object of class feature_cluster containing the clustering algorithm and a tidy version of clusters joined to the input dataset ready for further analysis
Author(s)
Trent Henderson
Examples
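library(theft)
library(theftdlc)

# Calculate catch22 features for theft's simData dataset
features <- theft::calculate_features(theft::simData,
                                      feature_set = "catch22")

# Cluster the time series into k = 6 clusters using the default settings
clusts <- cluster(features,
                  k = 6)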
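To show what clustering time series by their feature vectors looks like in practice, here is a minimal base R sketch of the "zScore" plus "kmeans" combination using `stats::kmeans()`. The toy matrix and all object names are assumptions for illustration, and the package's internal steps may differ.

```r
# Hypothetical sketch: z-score a feature matrix, then run k-means on it
set.seed(123)

# Toy data: 60 time series (rows) by 3 features (columns), two clear groups
feature_matrix <- rbind(
  matrix(rnorm(30 * 3, mean = 0), ncol = 3),
  matrix(rnorm(30 * 3, mean = 6), ncol = 3)
)
colnames(feature_matrix) <- c("feature_1", "feature_2", "feature_3")
rownames(feature_matrix) <- paste0("series_", seq_len(nrow(feature_matrix)))

# Standardise each feature so no single feature dominates the distances
scaled <- scale(feature_matrix)

# k-means with several random starts to avoid a poor local optimum
fit <- stats::kmeans(scaled, centers = 2, nstart = 10)

# Tidy table of cluster assignments, ready to join back to other data
clusters <- data.frame(id = rownames(feature_matrix), cluster = fit$cluster)
table(clusters$cluster)
```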
Conduct statistical testing on time-series feature classification performance to identify top features or compare entire sets
Description
Conduct statistical testing on time-series feature classification performance to identify top features or compare entire sets
Usage
compare_features(
  data,
  metric = c("accuracy", "precision", "recall", "f1"),
  by_set = TRUE,
  hypothesis = c("null", "pairwise"),
  p_adj = c("none", "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr"),
  n_workers = 1
)
Arguments
data |
|
metric |
|
by_set |
|
hypothesis |
|
p_adj |
|
n_workers |
|
Value
data.frame containing the results
Author(s)
Trent Henderson
References
Henderson, T., Bryant, A. G., and Fulcher, B. D. (2023). Never a Dull Moment: Distributional Properties as a Baseline for Time-Series Classification. 27th Pacific-Asia Conference on Knowledge Discovery and Data Mining.
Examples
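library(theft)
library(theftdlc)

# Calculate two user-supplied features (mean and sd) for theft's simData
features <- theft::calculate_features(theft::simData,
                                      feature_set = NULL,
                                      features = list("mean" = mean, "sd" = sd))

# Fit classifiers with only 3 resamples so the example runs quickly
classifiers <- classify(features,
                        by_set = FALSE,
                        n_resamples = 3)

# Run pairwise statistical comparisons on the classification results
compare_features(classifiers,
                 by_set = FALSE,
                 hypothesis = "pairwise")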
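The `p_adj` choices listed under Usage have the same names as the methods accepted by `stats::p.adjust()`. Whether `compare_features()` delegates to that function is an assumption here, but the sketch below shows what two of those corrections do to a small vector of made-up p-values.

```r
# Hypothetical p-values for four comparisons (illustrative numbers only)
p_values <- c(set_a = 0.001, set_b = 0.012, set_c = 0.030, set_d = 0.200)

# Compare raw p-values against Holm and Benjamini-Hochberg adjustments
adjusted <- data.frame(
  raw = p_values,
  holm = stats::p.adjust(p_values, method = "holm"),
  BH = stats::p.adjust(p_values, method = "BH")
)

adjusted
```

Holm controls the family-wise error rate and is therefore at least as conservative as the false discovery rate control provided by "BH".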
Remove duplicate features that exist in multiple feature sets and retain a reproducible random selection of one of them
Description
Remove duplicate features that exist in multiple feature sets and retain a reproducible random selection of one of them
Usage
filter_duplicates(data, preference = NULL, seed = 123)
Arguments
data |
|
preference |
Deprecated. Do not use. |
seed |
|
Value
feature_calculations object containing filtered feature data
Author(s)
Trent Henderson
Filter resampled data sets according to the list of good features
Description
Filter resampled data sets according to the list of good features
Usage
filter_good_features(data, x, good_features)
Arguments
data |
|
x |
|
good_features |
|
Value
list of filtered train and test data
Author(s)
Trent Henderson
Helper function to find features in both train and test set that are "good"
Description
Helper function to find features in both train and test set that are "good"
Usage
find_good_features(data, x)
Arguments
data |
|
x |
|
Value
character vector of "good" feature names
Author(s)
Trent Henderson
Fit classification model and compute key metrics
Description
Fit classification model and compute key metrics
Usage
fit_models(data, iter_data, row_id, is_null_run = FALSE, classifier)
Arguments
data |
|
iter_data |
|
row_id |
|
is_null_run |
|
classifier |
|
Value
data.frame of classification results
Author(s)
Trent Henderson
Calculate central tendency and spread values for all numeric columns in a dataset
Description
Calculate central tendency and spread values for all numeric columns in a dataset
Usage
get_rescale_vals(data)
Arguments
data |
|
Value
list of central tendency and spread values
Author(s)
Trent Henderson
Calculate interval summaries with a measure of central tendency of classification results
Description
Calculate interval summaries with a measure of central tendency of classification results
Usage
interval(
  data,
  metric = c("accuracy", "precision", "recall", "f1"),
  by_set = TRUE,
  type = c("sd", "qt", "quantile"),
  interval = NULL,
  model_type = c("main", "null")
)

calculate_interval(
  data,
  metric = c("accuracy", "precision", "recall", "f1"),
  by_set = TRUE,
  type = c("sd", "qt", "quantile"),
  interval = NULL,
  model_type = c("main", "null")
)
Arguments
data |
|
metric |
|
by_set |
|
type |
|
interval |
|
model_type |
|
Value
interval_calculations object, which is a data frame containing the results
Author(s)
Trent Henderson
Examples
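library(theft)
library(theftdlc)

# Calculate two user-supplied features (mean and sd) for theft's simData
features <- theft::calculate_features(theft::simData,
                                      feature_set = NULL,
                                      features = list("mean" = mean, "sd" = sd))

# Fit classifiers with only 3 resamples so the example runs quickly
classifiers <- classify(features,
                        by_set = FALSE,
                        n_resamples = 3)

# Summarise the classification results with an "sd"-type interval
interval(classifiers,
         by_set = FALSE,
         type = "sd",
         interval = 1)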
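The three `type` options can be read as three common ways of summarising a metric across resamples. The base R sketch below is one plausible interpretation, not the package's code: the formulas, the made-up accuracy values, the multiplier of 1, and the 0.95 level are all assumptions.

```r
# Hypothetical accuracy values from 30 resamples (simulated for illustration)
set.seed(123)
accuracy <- rnorm(30, mean = 0.8, sd = 0.05)
n <- length(accuracy)
centre <- mean(accuracy)

# "sd"-style: mean plus or minus a multiple of the standard deviation
sd_multiplier <- 1
sd_interval <- centre + c(-1, 1) * sd_multiplier * stats::sd(accuracy)

# "qt"-style: t-distribution confidence interval for the mean
level <- 0.95
t_crit <- stats::qt(1 - (1 - level) / 2, df = n - 1)
qt_interval <- centre + c(-1, 1) * t_crit * stats::sd(accuracy) / sqrt(n)

# "quantile"-style: empirical quantiles of the resampled values
quantile_interval <- stats::quantile(
  accuracy,
  probs = c((1 - level) / 2, 1 - (1 - level) / 2),
  names = FALSE
)

rbind(sd = sd_interval, qt = qt_interval, quantile = quantile_interval)
```

The "qt"-style interval describes uncertainty about the mean and so narrows as the number of resamples grows, while the other two describe the spread of the individual resample values.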
Helper function for converting to title case
Description
Helper function for converting to title case
Usage
make_title(x)
Arguments
x |
|
Value
character vector
Author(s)
Trent Henderson
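As a rough illustration of what a title-case helper of this kind might do, here is a minimal base R sketch. The name `to_title()` and its behaviour (capitalise the first character, lower-case the rest) are assumptions and may not match `make_title()` exactly.

```r
# Hypothetical stand-in for a title-case helper; not the package's code
to_title <- function(x) {
  # Upper-case the first character and lower-case the remainder
  paste0(toupper(substring(x, 1, 1)), tolower(substring(x, 2)))
}

to_title(c("accuracy", "PRECISION", "f1"))
```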
Produce a plot for a feature_calculations object
Description
Produce a plot for a feature_calculations object
Usage
## S3 method for class 'feature_calculations'
plot(
  x,
  type = c("matrix", "cor", "violin", "box", "quality"),
  norm_method = c("zScore", "Sigmoid", "RobustSigmoid", "MinMax", "MaxAbs"),
  unit_int = FALSE,
  clust_method = c("average", "ward.D", "ward.D2", "single", "complete", "mcquitty",
    "median", "centroid"),
  cor_method = c("pearson", "spearman"),
  feature_names = NULL,
  ...
)
Arguments
x |
|
type |
|
norm_method |
|
unit_int |
|
clust_method |
|
cor_method |
|
feature_names |
|
... |
Arguments to be passed to |
Value
object of class ggplot that contains the graphic
Author(s)
Trent Henderson
Produce a plot for a feature_projection object
Description
Produce a plot for a feature_projection object
Usage
## S3 method for class 'feature_projection'
plot(x, show_covariance = TRUE, ...)
Arguments
x |
|
show_covariance |
|
... |
Arguments to be passed to methods |
Value
object of class ggplot that contains the graphic
Author(s)
Trent Henderson
Produce a plot for an interval_calculations object
Description
Produce a plot for an interval_calculations object
Usage
## S3 method for class 'interval_calculations'
plot(x, ...)
Arguments
x |
|
... |
Arguments to be passed to methods |
Value
object of class ggplot that contains the graphic
Author(s)
Trent Henderson
Project a feature matrix into a two-dimensional representation using PCA, MDS, t-SNE, or UMAP ready for plotting
Description
Project a feature matrix into a two-dimensional representation using PCA, MDS, t-SNE, or UMAP ready for plotting
Usage
project(
  data,
  norm_method = c("zScore", "Sigmoid", "RobustSigmoid", "MinMax", "MaxAbs"),
  unit_int = FALSE,
  low_dim_method = c("PCA", "tSNE", "ClassicalMDS", "KruskalMDS", "SammonMDS", "UMAP"),
  na_removal = c("feature", "sample"),
  seed = 123,
  ...
)

reduce_dims(
  data,
  norm_method = c("zScore", "Sigmoid", "RobustSigmoid", "MinMax", "MaxAbs"),
  unit_int = FALSE,
  low_dim_method = c("PCA", "tSNE", "ClassicalMDS", "KruskalMDS", "SammonMDS", "UMAP"),
  na_removal = c("feature", "sample"),
  seed = 123,
  ...
)
Arguments
data |
|
norm_method |
|
unit_int |
|
low_dim_method |
|
na_removal |
|
seed |
|
... |
arguments to be passed to |
Value
object of class feature_projection, which is a named list containing the feature_calculations data supplied to the function, the wide matrix of filtered data, a tidy data.frame of the projected 2-D data, and the model fit object
Author(s)
Trent Henderson
Examples
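library(theft)
library(theftdlc)

# Calculate catch22 features for theft's simData dataset
features <- theft::calculate_features(theft::simData,
                                      feature_set = "catch22")

# Project the z-scored feature matrix into two dimensions with PCA
pca <- project(features,
               norm_method = "zScore",
               low_dim_method = "PCA")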
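For readers who want to see the "zScore" plus "PCA" combination outside the package, the following base R sketch uses `stats::prcomp()` on a simulated matrix. The data and object names are assumptions for illustration, and `project()` may do more, such as removing features or samples with missing values.

```r
# Hypothetical sketch: two-dimensional PCA projection of a feature matrix
set.seed(123)

# Toy data: 50 time series (rows) by 5 features (columns)
feature_matrix <- matrix(
  rnorm(50 * 5),
  ncol = 5,
  dimnames = list(paste0("series_", 1:50), paste0("feature_", 1:5))
)

# center and scale. together z-score each feature before the decomposition
fit <- stats::prcomp(feature_matrix, center = TRUE, scale. = TRUE)

# Tidy table holding the first two principal component scores per series
projected <- data.frame(
  id = rownames(feature_matrix),
  V1 = fit$x[, 1],
  V2 = fit$x[, 2]
)

# Proportion of total variance captured by each component
var_explained <- fit$sdev^2 / sum(fit$sdev^2)
round(var_explained[1:2], 3)
```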
Helper function to create a resampled dataset
Description
Helper function to create a resampled dataset
Usage
resample_data(data, train_rows, test_rows, train_groups, test_groups, seed)
Arguments
data |
|
train_rows |
|
test_rows |
|
train_groups |
|
test_groups |
|
seed |
|
Value
list containing new train and test data
Author(s)
Trent Henderson
Calculate z-score for all columns in a dataset using train set central tendency and spread
Description
Calculate z-score for all columns in a dataset using train set central tendency and spread
Usage
rescale_zscore(data, rescalers)
Arguments
data |
|
rescalers |
|
Value
data.frame of rescaled data
Author(s)
Trent Henderson
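Rescaling with train set central tendency and spread means the test set never contributes to the statistics used to normalise it. A minimal base R sketch of that idea follows. The helper names `train_rescalers()` and `apply_zscore()` are hypothetical, and the choice of mean and standard deviation as the centre and spread is an assumption.

```r
# Hypothetical sketch: z-score train and test data using train statistics only
set.seed(123)
train <- data.frame(f1 = rnorm(20, mean = 10, sd = 2),
                    f2 = rnorm(20, mean = -3, sd = 0.5))
test <- data.frame(f1 = rnorm(8, mean = 10, sd = 2),
                   f2 = rnorm(8, mean = -3, sd = 0.5))

# Compute per-column centre and spread from the train set
train_rescalers <- function(data) {
  list(centre = vapply(data, mean, numeric(1)),
       spread = vapply(data, stats::sd, numeric(1)))
}

# Apply stored centre and spread values to any data set with the same columns
apply_zscore <- function(data, rescalers) {
  as.data.frame(scale(data, center = rescalers$centre, scale = rescalers$spread))
}

rescalers <- train_rescalers(train)
train_z <- apply_zscore(train, rescalers)
test_z <- apply_zscore(test, rescalers)

# Train columns have mean 0 and sd 1; test columns are only approximately so
round(colMeans(test_z), 2)
```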
Helper function to select only the relevant columns for statistical testing
Description
Helper function to select only the relevant columns for statistical testing
Usage
select_stat_cols(data, by_set, metric, hypothesis)
Arguments
data |
|
by_set |
|
metric |
|
hypothesis |
|
Value
object of class data.frame
Author(s)
Trent Henderson
Use a cross-validated penalized maximum likelihood generalized linear model to perform feature selection
Description
Use a cross-validated penalized maximum likelihood generalized linear model to perform feature selection
Usage
shrink(data, threshold = c("one", "all"), plot = FALSE, ...)
Arguments
data |
|
threshold |
|
plot |
|
... |
arguments to be passed to |
Value
feature_calculations object containing a data frame of the reduced feature set
Author(s)
Trent Henderson
Examples
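library(theft)
library(theftdlc)

# Calculate catch22 features for theft's simData dataset
features <- theft::calculate_features(theft::simData,
                                      feature_set = "catch22")

# Select a reduced feature set using the default threshold
best_features <- shrink(features)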
Calculate p-values for feature sets or features relative to an empirical null or each other using resampled t-tests
Description
Calculate p-values for feature sets or features relative to an empirical null or each other using resampled t-tests
Usage
stat_test(
  data,
  iter_data,
  row_id,
  by_set = FALSE,
  hypothesis,
  metric,
  train_test_sizes,
  n_resamples
)
Arguments
data |
|
iter_data |
|
row_id |
|
by_set |
|
hypothesis |
|
metric |
|
train_test_sizes |
|
n_resamples |
|
Value
object of class data.frame
Author(s)
Trent Henderson
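Resampled accuracy values are not independent because train sets overlap across resamples, so a standard paired t-test tends to understate the variance. One widely used remedy is the corrected resampled t-test of Nadeau and Bengio, sketched below in base R. That `stat_test()` uses exactly this correction is an assumption, and the function name, the simulated accuracies, and the train and test sizes are illustrative only.

```r
# Hypothetical sketch of a corrected resampled t-test (Nadeau-Bengio style)
corrected_resampled_ttest <- function(x, y, n_train, n_test) {
  d <- x - y                      # per-resample metric differences
  n <- length(d)                  # number of resamples
  # Variance is inflated by n_test / n_train to account for overlapping splits
  t_stat <- mean(d) / sqrt((1 / n + n_test / n_train) * stats::var(d))
  p_value <- 2 * stats::pt(-abs(t_stat), df = n - 1)
  c(statistic = t_stat, p_value = p_value)
}

# Simulated accuracies for two feature sets over 30 resamples
set.seed(123)
acc_a <- rnorm(30, mean = 0.85, sd = 0.03)
acc_b <- rnorm(30, mean = 0.75, sd = 0.03)

result <- corrected_resampled_ttest(acc_a, acc_b, n_train = 135, n_test = 45)

# Uncorrected paired t-test for comparison
naive <- stats::t.test(acc_a, acc_b, paired = TRUE)

result
```

The corrected statistic is always smaller in magnitude than the uncorrected one, which makes the test more conservative.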
Analyse and Interpret Time Series Features
Description
Analyse and Interpret Time Series Features