Type: | Package |
Title: | Ensembles of Caret Models |
Version: | 4.0.1 |
Date: | 2024-08-17 |
URL: | http://zachmayer.github.io/caretEnsemble/, https://github.com/zachmayer/caretEnsemble |
BugReports: | https://github.com/zachmayer/caretEnsemble/issues |
Description: | Functions for creating ensembles of caret models: caretList() and caretStack(). caretList() is a convenience function for fitting multiple caret::train() models to the same dataset. caretStack() will make linear or non-linear combinations of these models, using a caret::train() model as a meta-model. |
Depends: | R (≥ 4.1.0) |
Suggests: | MASS, caTools, covr, earth, gbm, glmnet, klaR, knitr, lintr, mgcv, mlbench, nnet, randomForest, rmarkdown, rhub, rpart, spelling, testthat, usethis |
Imports: | caret, data.table, ggplot2, lattice, methods, patchwork, pbapply, rlang |
License: | MIT + file LICENSE |
VignetteBuilder: | knitr |
RoxygenNote: | 7.3.2 |
LazyData: | true |
Language: | en-US |
Encoding: | UTF-8 |
NeedsCompilation: | no |
Packaged: | 2024-09-12 20:07:27 UTC; zach |
Author: | Zachary A. Deane-Mayer [aut, cre, cph], Jared E. Knowles [ctb], Antón López [ctb] |
Maintainer: | Zachary A. Deane-Mayer <zach.mayer@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2024-09-12 21:50:09 UTC |
Combine several predictive models via weights
Description
Find a greedy, positive-only linear combination of several train objects.
Functions for creating ensembles of caret models: caretList and caretStack.
Usage
caretEnsemble(all.models, excluded_class_id = 0L, tuneLength = 1L, ...)
Arguments
all.models |
an object of class caretList |
excluded_class_id |
The integer level to exclude from binary classification or multiclass problems. By default no classes are excluded, as the greedy optimizer requires all classes because it cannot use negative coefficients. |
tuneLength |
The size of the grid to search for tuning the model. Defaults to 1, as the only parameter to optimize is the number of iterations, and the default of 100 works well. |
... |
additional arguments to pass to caret::train |
Details
greedyMSE works well when you want an ensemble that will never be worse than any single model in the dataset. In the worst case, it selects the single best model if no combination of models improves the overall score. It also never assigns a model a negative coefficient, which helps avoid unintuitive behavior at prediction time (e.g. if the correlations between predictors break down on new data, negative coefficients can lead to bad results).
Value
a caretEnsemble object
Note
Every model in the "library" must be a separate train object. For example, if you wish to combine random forests with several different values of mtry, you must build one model for each value of mtry. If you use several values of mtry in one train model (e.g. tuneGrid = expand.grid(.mtry = 2:5)), caret will select the best value of mtry before we get a chance to include it in the ensemble. By default, RMSE is used to ensemble regression models, and AUC is used to ensemble classification models. This function does not currently support multi-class problems.
Author(s)
Maintainer: Zachary A. Deane-Mayer zach.mayer@gmail.com [copyright holder]
Other contributors:
Jared E. Knowles jknowles@gmail.com [contributor]
Antón López anton.gomez.lopez@rai.usc.es [contributor]
See Also
Useful links:
http://zachmayer.github.io/caretEnsemble/
https://github.com/zachmayer/caretEnsemble
Report bugs at https://github.com/zachmayer/caretEnsemble/issues
Examples
set.seed(42)
models <- caretList(iris[1:50, 1:2], iris[1:50, 3], methodList = c("rpart", "rf"))
ens <- caretEnsemble(models)
summary(ens)
data for classification
Description
data for classification
Author(s)
Zachary Deane-Mayer zach.mayer@gmail.com
data for classification
Description
data for classification
Author(s)
Zachary Deane-Mayer zach.mayer@gmail.com
data for classification
Description
data for classification
data for regression
Description
data for regression
Author(s)
Zachary Deane-Mayer zach.mayer@gmail.com
Index a caretList
Description
Index a caret list to extract caret models into a new caretList object
Usage
## S3 method for class 'caretList'
object[index]
Arguments
object |
an object of class caretList |
index |
selected index |
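A hedged sketch follows, reusing the bundled models.reg data (a caretList of regression models documented later in this manual); indexing returns a smaller caretList.
Examples
data(models.reg)
small_list <- models.reg[1:2]
class(small_list)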
Aggregate mean or first
Description
For numeric data take the mean. For character data take the first value.
Usage
aggregate_mean_or_first(x)
Arguments
x |
a vector (numeric or character) |
Value
the mean for numeric input, or the first value for character input
Convert object to caretList object
Description
Converts object into a caretList
Usage
as.caretList(object)
Arguments
object |
R Object |
Value
a caretList object
Convert object to caretList object - For Future Use
Description
Converts object into a caretList - For Future Use
Usage
## Default S3 method:
as.caretList(object)
Arguments
object |
R object |
Value
NA
Convert list to caretList
Description
Converts list to caretList
Usage
## S3 method for class 'list'
as.caretList(object)
Arguments
object |
list of caret models |
Value
a caretList object
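A hedged sketch follows, assuming two train objects fit directly with caret::train (as in the c.train example below); a named list of such models can be coerced with as.caretList.
Examples
model_lm <- caret::train(Sepal.Length ~ ., data = iris, method = "lm")
model_rpart <- caret::train(Sepal.Length ~ ., data = iris, method = "rpart", tuneLength = 1L)
model_list <- as.caretList(list(lm = model_lm, rpart = model_rpart))
class(model_list)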
Convenience function for more in-depth diagnostic plots of caretStack objects
Description
This function provides a more robust series of diagnostic plots for a caretEnsemble object.
Usage
## S3 method for class 'caretStack'
autoplot(object, training_data = NULL, xvars = NULL, show_class_id = 2L, ...)
Arguments
object |
a caretStack object |
training_data |
The data used to train the ensemble. Required if xvars is not NULL. Must be in the same row order as when the models were trained. |
xvars |
a vector of the names of x variables to plot against residuals |
show_class_id |
For classification only: which class level to show on the plot |
... |
ignored |
Value
A grid of diagnostic plots. Top left is the range of the performance metric across each component model along with its standard deviation. Top right is the residuals from the ensembled model plotted against fitted values. Middle left is a bar graph of the weights of the component models. Middle right is the disagreement in the residuals of the component models (unweighted) across the fitted values. Bottom left and bottom right are the plots of the residuals against two random or user specified variables. Note that the ensemble must have been trained with savePredictions = "final", which is required to get residuals from the stack for the plot.
Examples
set.seed(42)
data(models.reg)
ens <- caretStack(models.reg[1:2], method = "lm")
autoplot(ens)
S3 definition for concatenating caretList
Description
take N objects of class caretList and concatenate them into a larger object of class caretList for future ensembling
Usage
## S3 method for class 'caretList'
c(...)
Arguments
... |
the objects of class caretList or train to bind into a caretList |
Value
a caretList object
Examples
data(iris)
model_list1 <- caretList(Sepal.Width ~ .,
data = iris,
tuneList = list(
lm = caretModelSpec(method = "lm")
)
)
model_list2 <- caretList(Sepal.Width ~ .,
data = iris, tuneLength = 1L,
tuneList = list(
rf = caretModelSpec(method = "rf")
)
)
bigList <- c(model_list1, model_list2)
S3 definition for concatenating train objects
Description
take N objects of class train and concatenate into an object of class caretList for future ensembling
Usage
## S3 method for class 'train'
c(...)
Arguments
... |
the objects of class train to bind into a caretList |
Value
a caretList object
Examples
data(iris)
model_lm <- caret::train(Sepal.Length ~ .,
data = iris,
method = "lm"
)
model_rf <- caret::train(Sepal.Length ~ .,
data = iris,
method = "rf",
tuneLength = 1L
)
model_list <- c(model_lm, model_rf)
Create a list of several train models from the caret package
Description
Build a list of train objects suitable for ensembling using the caretStack function.
Usage
caretList(
...,
trControl = NULL,
methodList = NULL,
tuneList = NULL,
metric = NULL,
continue_on_fail = FALSE,
trim = TRUE
)
Arguments
... |
arguments to pass to caret::train |
trControl |
a caret::trainControl object. If NULL, a default is constructed via defaultControl. |
methodList |
optional, a character vector of caret models to ensemble. One of methodList or tuneList must be specified. |
tuneList |
optional, a NAMED list of caretModelSpec objects. This is much more flexible than methodList and allows the specification of model-specific parameters (e.g. passing trace=FALSE to nnet) |
metric |
a string, the metric to optimize for. If NULL, we will choose a good one. |
continue_on_fail |
logical, should a valid caretList be returned that excludes models that fail, default is FALSE |
trim |
logical, should the train models be trimmed to save memory and speed up stacking |
Value
A list of train objects. If a model fails to build, it is dropped from the list.
Examples
caretList(
Sepal.Length ~ Sepal.Width,
head(iris, 50),
methodList = c("glm", "lm"),
tuneList = list(
nnet = caretModelSpec(method = "nnet", trace = FALSE, tuneLength = 1)
)
)
Generate a specification for fitting a caret model
Description
A caret model specification consists of 2 parts: a model (as a string) and the arguments to the train call for fitting that model
Usage
caretModelSpec(method = "rf", ...)
Arguments
method |
the modeling method to pass to caret::train |
... |
Other arguments that will eventually be passed to caret::train |
Value
a list of lists
Examples
caretModelSpec("rf", tuneLength = 5, preProcess = "ica")
Prediction wrapper for train
Description
This is a prediction wrapper for train with several features:
- If newdata is NULL, return stacked predictions from the training job, rather than in-sample predictions.
- Always returns probabilities for classification models.
- Optionally drops one predicted class for classification models.
- Always returns a data.table
Usage
caretPredict(object, newdata = NULL, excluded_class_id = 1L, ...)
Arguments
object |
a train object |
newdata |
New data to use for predictions. If NULL, stacked predictions from the training data are returned. |
excluded_class_id |
an integer indicating the class to exclude. If 0L, no class is excluded |
... |
additional arguments to pass to the underlying predict method |
Value
a data.table
Combine several predictive models via stacking
Description
Stack several train models using a train model.
Usage
caretStack(
all.models,
new_X = NULL,
new_y = NULL,
metric = NULL,
trControl = NULL,
excluded_class_id = 1L,
...
)
Arguments
all.models |
a caretList, or an object coercible to a caretList (such as a list of train objects) |
new_X |
Data to predict on for the caretList, prior to training the stack (for transfer learning). If NULL, the stacked predictions will be extracted from the caretList models. |
new_y |
The outcome variable to predict on for the caretList, prior to training the stack (for transfer learning). If NULL, the observed values from the first model in the caretList are used. |
metric |
the metric to use for grid search on the stacking model. |
trControl |
a trainControl object to use for training the ensemble model. If NULL, will use defaultControl. |
excluded_class_id |
The integer level to exclude from binary classification or multiclass problems. If 0, all classes are included (none are excluded). |
... |
additional arguments to pass to the stacking model |
Details
Uses either transfer learning or stacking to stack models. Assumes that all models were trained on the same number of rows of data, with the same target values. The features, cross-validation strategies, and model types (classification vs. regression) may vary, however. If your stack of models was trained with different numbers of rows, please provide new_X and new_y so the models can predict on a common set of data for stacking.
If your models were trained on different columns, you should use stacking.
If you have both differing rows and columns in your model set, you are out of luck. You need at least a common set of rows during training (for stacking) or a common set of columns at inference time for transfer learning.
Value
S3 caretStack object
References
Caruana, R., Niculescu-Mizil, A., Crew, G., & Ksikes, A. (2004). Ensemble Selection from Libraries of Models. https://www.cs.cornell.edu/~caruana/ctp/ct.papers/caruana.icml04.icdm06long.pdf
Examples
models <- caretList(
x = iris[1:50, 1:2],
y = iris[1:50, 3],
methodList = c("rpart", "glm")
)
caretStack(models, method = "glm")
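The following is a hedged sketch of the transfer-learning path described in Details: the component models are fit on one set of rows, and held-out rows are supplied as new_X and new_y when stacking (the row ranges here are illustrative).
set.seed(42)
models <- caretList(
x = iris[1:30, 1:2],
y = iris[1:30, 3],
methodList = c("rpart", "glm")
)
transfer_stack <- caretStack(
models,
new_X = iris[31:50, 1:2],
new_y = iris[31:50, 3],
method = "glm"
)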
Wrapper to train caret models
Description
This function is a wrapper around the 'train' function from the 'caret' package. It allows for the passing of local and global arguments to the 'train' function. It also allows for the option to continue on fail, and to trim the output model. Trimming the model removes components that are not needed for stacking, to save memory and speed up the stacking process. It also converts the stored predictions to a data.table. It is an internal function for use with caretList.
Usage
caretTrain(local_args, global_args, continue_on_fail = FALSE, trim = TRUE)
Arguments
local_args |
A list of model-specific arguments to pass to the 'train' function. |
global_args |
A list of arguments shared across all models, also passed to the 'train' function. |
continue_on_fail |
A logical indicating whether to continue if the 'train' function fails. If 'TRUE', the function will return 'NULL' if the 'train' function fails. |
trim |
A logical indicating whether to trim the output model. If 'TRUE', the function will remove some elements that are not needed from the output model. |
Value
The output of the 'train' function.
Validate a custom caret model info list
Description
Currently, this only ensures that all model info lists were also assigned a "method" attribute for consistency with usage of non-custom models
Usage
checkCustomModel(x)
Arguments
x |
a model info list (e.g. the list supplied as the method for a custom caret model) |
Value
validated model info list (i.e. x)
Check caretStack object
Description
Make sure a caretStack has both a caretList and a train object
Usage
check_caretStack(object)
Arguments
object |
a caretStack object |
Construct a default train control for use with caretList
Description
Unlike caret::trainControl, this function defaults to 5 fold CV. CV is good for stacking, as every observation is in the test set exactly once. We use 5 instead of 10 to save compute time, as caretList is for fitting many models. We also construct explicit fold indexes and return the stacked predictions, which are needed for stacking. For classification models we return class probabilities.
Usage
defaultControl(
target,
method = "cv",
number = 5L,
savePredictions = "final",
index = caret::createFolds(target, k = number, list = TRUE, returnTrain = TRUE),
is_class = is.factor(target) || is.character(target),
is_binary = length(unique(target)) == 2L,
...
)
Arguments
target |
the target variable. |
method |
the method to use for trainControl. |
number |
the number of folds to use. |
savePredictions |
the type of predictions to save. |
index |
the fold indexes to use. |
is_class |
logical, is this a classification or regression problem. |
is_binary |
logical, is this binary classification. |
... |
other arguments to pass to caret::trainControl |
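A brief, hedged sketch of pairing defaultControl with caretList follows; the explicit call is normally unnecessary because caretList constructs an equivalent control when trControl is NULL.
Examples
y <- iris[1:100, 3]
ctrl <- defaultControl(y)
models <- caretList(
x = iris[1:100, 1:2],
y = y,
trControl = ctrl,
methodList = c("rpart", "glm")
)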
Construct a default metric
Description
Caret defaults to Accuracy for classification and RMSE for regression. For classification, I would rather use ROC.
Usage
defaultMetric(is_class, is_binary)
Arguments
is_class |
logical, is this a classification or regression problem. |
is_binary |
logical, is this binary classification. |
Comparison dotplot for a caretStack object
Description
This is a function to make a dotplot from a caretStack. It uses dotplot from the caret package on all the models in the ensemble, excluding the final ensemble model. At the moment, this function only works if the ensembling model has the same number of resamples as the component models.
Usage
## S3 method for class 'caretStack'
dotplot(x, ...)
Arguments
x |
An object of class caretStack |
... |
passed to dotplot |
Examples
set.seed(42)
models <- caretList(
x = iris[1:100, 1:2],
y = iris[1:100, 3],
methodList = c("rpart", "glm")
)
meta_model <- caretStack(models, method = "lm")
lattice::dotplot(meta_model)
Drop Excluded Class
Description
Drop the excluded class from a prediction data.table
Usage
dropExcludedClass(x, all_classes, excluded_class_id)
Arguments
x |
a data.table of predictions |
all_classes |
a character vector of all classes |
excluded_class_id |
an integer indicating the class to exclude |
Extract the best predictions from a train object
Description
Extract the best predictions from a train object.
Usage
extractBestPreds(x)
Arguments
x |
a train object |
Value
a data.table::data.table with predictions
Extracts the target variable from a set of arguments headed to the caret::train function.
Description
This function extracts the y variable from a set of arguments headed to a caret::train model. Since there are 2 methods to call caret::train, this function also has 2 methods.
Usage
extractCaretTarget(...)
Arguments
... |
a set of arguments, as in the caret::train function |
Extracts the target variable from a set of arguments headed to the caret::train.default function.
Description
This function extracts the y variable from a set of arguments headed to a caret::train.default model.
Usage
## Default S3 method:
extractCaretTarget(x, y, ...)
Arguments
x |
an object where samples are in rows and features are in columns. This could be a simple matrix, data frame or other type (e.g. sparse matrix). See Details below. |
y |
a numeric or factor vector containing the outcome for each sample. |
... |
ignored |
Extracts the target variable from a set of arguments headed to the caret::train.formula function.
Description
This function extracts the y variable from a set of arguments headed to a caret::train.formula model.
Usage
## S3 method for class 'formula'
extractCaretTarget(form, data, ...)
Arguments
form |
A formula of the form y ~ x1 + x2 + ... |
data |
Data frame from which variables specified in formula are preferentially to be taken. |
... |
ignored |
Generic function to extract accuracy metrics from various model objects
Description
A generic function to extract cross-validated accuracy metrics from model objects.
Usage
extractMetric(x, ...)
Arguments
x |
An object from which to extract metrics.
The specific method will be dispatched based on the class of x. |
... |
Additional arguments passed to the specific methods. |
Value
A numeric or a data.table of metrics, depending on the class of x; see the specific methods.
See Also
extractMetric.train, extractMetric.caretList, extractMetric.caretStack
Extract accuracy metrics from a caretList object
Description
Extract the cross-validated accuracy metrics from each model in a caretList.
Usage
## S3 method for class 'caretList'
extractMetric(x, ...)
Arguments
x |
a caretList object |
... |
passed to extractMetric.train |
Value
A data.table with metrics from each model.
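A short, hedged sketch using the bundled models.reg data follows.
Examples
data(models.reg)
extractMetric(models.reg) # a data.table with one row of metrics per model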
Extract accuracy metrics from a caretStack object
Description
Extract the cross-validated accuracy metrics from the ensemble model and individual models in a caretStack.
Usage
## S3 method for class 'caretStack'
extractMetric(x, ...)
Arguments
x |
a caretStack object |
... |
passed to extractMetric.train and extractMetric.caretList |
Value
A data.table with metrics from the ensemble model and individual models.
Extract accuracy metrics from a train model
Description
Extract the cross-validated accuracy metrics and their SDs from caret.
Usage
## S3 method for class 'train'
extractMetric(x, metric = NULL, ...)
Arguments
x |
a train object |
metric |
a character string representing the metric to extract. If NULL, uses the metric that was used to train the model. |
... |
ignored |
Value
A numeric representing the desired metric.
Extract the method name associated with a single train object
Description
Extracts the method name associated with a single train object. Note that for standard models (i.e. those already prespecified by caret), the "method" attribute on the train object is used directly while for custom models the "method" attribute within the model$modelInfo attribute is used instead.
Usage
extractModelName(x)
Arguments
x |
a single caret train object |
Value
Name associated with model
Greedy optimization for MSE
Description
Greedy optimization for minimizing the mean squared error. Works for classification and regression.
Usage
greedyMSE(X, Y, max_iter = 100L)
Arguments
X |
A numeric matrix of features. |
Y |
A numeric matrix of target values. |
max_iter |
An integer scalar of the maximum number of iterations. |
Value
A list with components:
model_weights |
A numeric matrix of model_weights. |
RMSE |
A numeric scalar of the root mean squared error. |
max_iter |
An integer scalar of the maximum number of iterations. |
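A minimal, hedged sketch on simulated data follows (the matrices are illustrative; if greedyMSE is not exported, prefix the call with caretEnsemble:::).
Examples
set.seed(42)
X <- cbind(m1 = rnorm(100L), m2 = rnorm(100L))
Y <- matrix(0.7 * X[, 1L] + 0.3 * X[, 2L] + rnorm(100L, sd = 0.1), ncol = 1L)
fit <- greedyMSE(X, Y)
fit$model_weights # non-negative weights, one per column of X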
caret interface for greedyMSE
Description
caret interface for greedyMSE. greedyMSE works well when you want an ensemble that will never be worse than any single predictor in the dataset. It does not use an intercept and it does not allow for negative coefficients. This makes it highly constrained, and in general it does not work well on standard classification and regression problems. However, it does work well in the case of:
* The predictors are highly correlated with each other
* The predictors are highly correlated with the model
* You expect or want positive-only coefficients
In the worst case, this method will select one input and use that, but in many other cases it will return a positive, weighted average of the inputs. Since it never uses negative weights, you never get into a scenario where one model is weighted negatively and, on new data, you get weird predictions because a correlation changed. Since this model will always be a positive weighted average of the inputs, it will rarely do worse than the individual models on new data.
Usage
greedyMSE_caret()
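A hedged sketch of passing greedyMSE_caret() as the meta-model for caretStack follows.
Examples
set.seed(42)
models <- caretList(
x = iris[1:100, 1:2],
y = iris[1:100, 3],
methodList = c("rpart", "glm")
)
greedy_stack <- caretStack(models, method = greedyMSE_caret())
summary(greedy_stack)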
Is Classifier
Description
Check if a model is a classifier.
Usage
isClassifier(model)
Arguments
model |
A train object from the caret package. |
Value
A logical indicating whether the model is a classifier.
Validate a model type
Description
Validate the model type from a train object.
For classification, validates that the model can predict probabilities, and,
if stacked predictions are requested, that classProbs = TRUE.
Usage
isClassifierAndValidate(object, validate_for_stacking = TRUE)
Arguments
object |
a train object |
validate_for_stacking |
a logical indicating whether to validate the object for stacked predictions |
Value
a logical. TRUE if classifier, otherwise FALSE.
Compute MAE
Description
Compute the mean absolute error between two vectors.
Usage
mae(a, b)
Arguments
a |
A numeric vector. |
b |
A numeric vector. |
Value
A numeric scalar.
Check that the methods supplied by the user are valid caret methods
Description
This function uses modelLookup from caret to ensure the list of methods supplied by the user are all models caret can fit.
Usage
methodCheck(x)
Arguments
x |
a list of user-supplied tuning parameters and methods |
caretList of classification models
Description
Data for the caretEnsemble package
Author(s)
Zachary Deane-Mayer zach.mayer@gmail.com
caretList of regression models
Description
caretList of regression models
Author(s)
Zachary Deane-Mayer zach.mayer@gmail.com
Normalize to One
Description
Normalize a vector to sum to one.
Usage
normalize_to_one(x)
Arguments
x |
A numeric vector. |
Value
A numeric vector.
Permutation Importance
Description
Permute each variable in a dataset and use the change in predictions to calculate the importance of each variable. Based on the scikit-learn implementation of permutation importance: https://scikit-learn.org/stable/modules/permutation_importance.html. However, we don't compare to the target by a metric. We JUST look at the change in the model's predictions, as measured by MAE (for classification, this is like using a Brier score). We shuffle each variable and recompute the predictions before and after the shuffle. The difference in MAE is the importance of that variable. We normalize by computing the MAE of the shuffled original predictions, which serves as an upper bound on the MAE, and divide by this value. So a variable that, when shuffled, makes the predictions as bad as shuffling the output predictions themselves has an importance of 100%. Similarly, as with regular permutation importance, a variable that, when shuffled, gives the same MAE as the original model has an importance of 0.
This method cannot yield negative importances. It is merely a measure of how much the model uses the variable, and does not tell you which variables help or hurt generalization. Use the model's cross-validated metrics to assess generalization.
Usage
permutationImportance(model, newdata, normalize = TRUE)
Arguments
model |
A train object from the caret package. |
newdata |
A data.frame of new data to use to compute importances. Can be the training data. |
normalize |
A logical indicating whether to normalize the importances to sum to one. |
Value
A named numeric vector of variable importances.
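A short, hedged sketch on a single train model follows (if the function is not exported, it can be reached with caretEnsemble:::permutationImportance).
Examples
set.seed(42)
model <- caret::train(Sepal.Length ~ ., data = iris, method = "lm")
permutationImportance(model, newdata = iris[, -1L]) # named vector; sums to one when normalize = TRUE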
Plot a caretList object
Description
This function plots the performance of each model in a caretList object.
Usage
## S3 method for class 'caretList'
plot(x, metric = NULL, ...)
Arguments
x |
a caretList object |
metric |
which metric to plot |
... |
ignored |
Value
A ggplot2 object
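A small, hedged sketch with the bundled models.reg data follows.
Examples
data(models.reg)
plot(models.reg) # a ggplot2 object comparing the models' cross-validated metrics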
Plot a caretStack object
Description
This function plots the performance of each model in a caretStack object.
Usage
## S3 method for class 'caretStack'
plot(x, metric = NULL, ...)
Arguments
x |
a caretStack object |
metric |
which metric to plot. If NULL, will use the default metric used to train the model. |
... |
ignored |
Value
a ggplot2 object
Create a matrix of predictions for each of the models in a caretList
Description
Make a matrix of predictions from a list of caret models
Usage
## S3 method for class 'caretList'
predict(object, newdata = NULL, verbose = FALSE, excluded_class_id = 1L, ...)
Arguments
object |
an object of class caretList |
newdata |
New data for predictions. It can be NULL, but this is ill-advised. |
verbose |
Logical. If FALSE no progress bar is printed if TRUE a progress bar is shown. Default FALSE. |
excluded_class_id |
Integer. The class id to drop when predicting for multiclass |
... |
Other arguments to pass to each model's predict method |
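A hedged sketch follows, mirroring the other examples in this manual: fit a small caretList, then predict on held-out rows to get one column of predictions per model.
Examples
set.seed(42)
models <- caretList(
x = iris[1:100, 1:2],
y = iris[1:100, 3],
methodList = c("rpart", "glm")
)
preds <- predict(models, newdata = iris[101:150, 1:2])
head(preds)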
Make predictions from a caretStack
Description
Make predictions from a caretStack. This function passes the data to each model in turn to make a matrix of predictions, and then multiplies that matrix by the vector of weights to get a single, combined vector of predictions.
Usage
## S3 method for class 'caretStack'
predict(
object,
newdata = NULL,
se = FALSE,
level = 0.95,
excluded_class_id = 0L,
return_class_only = FALSE,
verbose = FALSE,
...
)
Arguments
object |
a caretStack object |
newdata |
a new dataframe to make predictions on |
se |
logical, should prediction errors be produced? Default is false. |
level |
the tolerance/confidence level that should be returned |
excluded_class_id |
Which class to exclude from predictions. Note that if the caretStack was trained with an excluded_class_id, that class is ALWAYS excluded from the predictions from the caretList of input models. excluded_class_id for predict.caretStack is for the final ensemble model. So different classes could be excluded from the caretList models and the final ensemble model. |
return_class_only |
a logical indicating whether to return only the class predictions as a factor. If TRUE, the return will be a factor rather than a data.table. This is a convenience function, and should not be widely used. For example if you have a downstream process that consumes the output of the model, you should have that process consume probabilities for each class. This will make it easier to change prediction probability thresholds if needed in the future. |
verbose |
a logical indicating whether to print progress |
... |
arguments to pass to the underlying predict method |
Details
Prediction weights are defined as variable importance in the stacked caret model. This is not available for all cases such as where the library model predictions are transformed before being passed to the stacking model.
Value
a data.table of predictions
Examples
models <- caretList(
x = iris[1:100, 1:2],
y = iris[1:100, 3],
methodList = c("rpart", "glm")
)
meta_model <- caretStack(models, method = "lm")
RMSE(predict(meta_model, iris[101:150, 1:2]), iris[101:150, 3])
Predict method for greedyMSE
Description
Predict method for greedyMSE objects.
Usage
## S3 method for class 'greedyMSE'
predict(object, newdata, return_labels = FALSE, ...)
Arguments
object |
A greedyMSE object. |
newdata |
A numeric matrix of new data. |
return_labels |
A logical scalar of whether to return labels. |
... |
Additional arguments. Ignored. |
Value
A numeric matrix of predictions.
Print a caretStack object
Description
This is a function to print a caretStack.
Usage
## S3 method for class 'caretStack'
print(x, ...)
Arguments
x |
An object of class caretStack |
... |
ignored |
Examples
models <- caretList(
x = iris[1:100, 1:2],
y = iris[1:100, 3],
methodList = c("rpart", "glm")
)
meta_model <- caretStack(models, method = "lm")
print(meta_model)
Print method for greedyMSE
Description
Print method for greedyMSE objects.
Usage
## S3 method for class 'greedyMSE'
print(x, ...)
Arguments
x |
A greedyMSE object. |
... |
Additional arguments. Ignored. |
Print a summary.caretList object
Description
This is a function to print a summary.caretList
Usage
## S3 method for class 'summary.caretList'
print(x, ...)
Arguments
x |
An object of class summary.caretList |
... |
ignored |
Print a summary.caretStack object
Description
This is a function to print a summary.caretStack.
Usage
## S3 method for class 'summary.caretStack'
print(x, ...)
Arguments
x |
An object of class summary.caretStack |
... |
ignored |
Set excluded class id
Description
Set the excluded class id for a caretStack object
Usage
set_excluded_class_id(object, is_class)
Arguments
object |
a caretStack object |
is_class |
the model type as a logical vector with length 1 |
Shuffled MAE
Description
Compute the mean absolute error of a model's predictions when a variable is shuffled.
Usage
shuffled_mae(model, original_data, target, pred_type, shuffle_idx)
Arguments
model |
A train object from the caret package. |
original_data |
A data.table of the original data. |
target |
A matrix of target values. |
pred_type |
the prediction type to use when computing predictions. |
shuffle_idx |
A vector of shuffled indices. |
Value
A numeric vector of mean absolute errors.
Extracted stacked residuals for the autoplot
Description
This function extracts the predictions, observed values, and residuals from a train object. It uses the object's stacked predictions from cross-validation.
Usage
stackedTrainResiduals(object, show_class_id = 2L)
Arguments
object |
a |
show_class_id |
For classification only: which class level to use for residuals |
Value
a data.table::data.table with predictions, observeds, and residuals
Summarize a caretList
Description
This function summarizes the performance of each model in a caretList object.
Usage
## S3 method for class 'caretList'
summary(object, metric = NULL, ...)
Arguments
object |
a caretList object |
metric |
The metric to show. If NULL will use the metric used to train each model |
... |
passed to extractMetric |
Value
A data.table with metrics from each model.
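A brief, hedged sketch with the bundled models.reg data follows.
Examples
data(models.reg)
summary(models.reg)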
Summarize a caretStack object
Description
This is a function to summarize a caretStack.
Usage
## S3 method for class 'caretStack'
summary(object, ...)
Arguments
object |
An object of class caretStack |
... |
ignored |
Examples
models <- caretList(
x = iris[1:100, 1:2],
y = iris[1:100, 3],
methodList = c("rpart", "glm")
)
meta_model <- caretStack(models, method = "lm")
summary(meta_model)
Check that the tuning parameters list supplied by the user is valid
Description
This function makes sure the tuning parameters passed by the user are valid and have the proper naming, etc.
Usage
tuneCheck(x)
Arguments
x |
a list of user-supplied tuning parameters and methods |
Validate the excluded class
Description
Helper function to ensure that the excluded level for classification is an integer. Set to 0L to exclude no class.
Usage
validateExcludedClass(arg)
Arguments
arg |
The value to check |
Value
integer
Variable importance for caretStack
Description
This is a function to extract variable importance from a caretStack.
Usage
## S3 method for class 'caretStack'
varImp(object, newdata = NULL, normalize = TRUE, ...)
Arguments
object |
An object of class caretStack |
newdata |
the data to use for computing importance. If NULL, will use the stacked predictions from the models. |
normalize |
a logical indicating whether to normalize the importances to sum to one. |
... |
passed to predict.caretList |
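A hedged sketch follows; with newdata = NULL the importances are computed from the models' stacked predictions.
Examples
set.seed(42)
models <- caretList(
x = iris[1:100, 1:2],
y = iris[1:100, 3],
methodList = c("rpart", "glm")
)
meta_model <- caretStack(models, method = "lm")
caret::varImp(meta_model)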
Variable importance for a greedyMSE model
Description
Variable importance for a greedyMSE model.
Usage
## S3 method for class 'greedyMSE'
varImp(object, ...)
Arguments
object |
A greedyMSE object. |
... |
Additional arguments. Ignored. |
Calculate a weighted standard deviation
Description
Used to weight deviations among ensembled model predictions
Usage
wtd.sd(x, w, na.rm = FALSE)
Arguments
x |
a numeric vector |
w |
a vector of weights equal to length of x |
na.rm |
a logical indicating how to handle missing values, default = FALSE |
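A tiny, hedged sketch follows (if wtd.sd is not exported, use caretEnsemble:::wtd.sd).
Examples
x <- c(1, 2, 3, 4)
w <- c(1, 1, 1, 5) # the last observation carries most of the weight
wtd.sd(x, w)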