Help for package tidyAML

Title:

Automatic Machine Learning with 'tidymodels'

Version:

0.0.6

Description:

The goal of this package will be to provide a simple interface for automatic machine learning that fits the 'tidymodels' framework. The intention is to work for regression and classification problems with a simple verb framework.

License:

MIT + file LICENSE

Encoding:

UTF-8

URL:

https://www.spsanderson.com/tidyAML/, https://github.com/spsanderson/tidyAML

BugReports:

https://github.com/spsanderson/tidyAML/issues

Depends:

parsnip, R (≥ 4.1.0)

Suggests:

knitr, rmarkdown, stats, tibble, stringr, utils, recipes, multilevelmod, rules, poissonreg, censored, baguette, bonsai, brulee, rstanarm, dbarts, kknn, ranger, randomForest, LiblineaR, flexsurv, gee, glmnet, discrim, kernlab, klaR, mda, sda, sparsediscrim

VignetteBuilder:

knitr

Imports:

rlang (≥ 0.4.11), purrr (≥ 0.3.5), dplyr (≥ 1.0.10), rsample (≥ 1.1.0), workflows (≥ 1.1.2), forcats, workflowsets, tidyr, broom, ggplot2, magrittr, tune (≥ 1.3.0)

RoxygenNote:

7.3.2

NeedsCompilation:

Packaged:

2025-05-12 15:12:24 UTC; ssanders

Author:

Steven Sanderson

[aut, cre, cph]

Maintainer:

Steven Sanderson <spsanderson@gmail.com>

Repository:

CRAN

Date/Publication:

2025-05-12 15:40:03 UTC

Pipe operator

Description

See magrittr::%>% for details.

Usage

lhs %>% rhs

Value

This does not return a value but rather is used to string functions together.

Check for Duplicate Rows in a Data Frame

Description

This function checks for duplicate rows in a data frame.

Usage

check_duplicate_rows(.data)

Arguments

.data

A data frame.

Details

This function checks for duplicate rows by comparing each row in the data frame to every other row. If a row is identical to another row, it is considered a duplicate.

Value

A logical vector indicating whether each row is a duplicate or not.

Author(s)

Steven P. Sanderson II, MPH

Examples

data <- data.frame(
  x = c(1, 2, 3, 1),
  y = c(2, 3, 4, 2),
  z = c(3, 2, 5, 3)
)

check_duplicate_rows(data)
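
For intuition, base R offers a comparable check. The sketch below is not the package's implementation. It assumes a duplicate flag could mean either "repeats an earlier row" or "shares its values with any other row", and shows both readings with base::duplicated().

```r
# Same toy data as the example above
df <- data.frame(
  x = c(1, 2, 3, 1),
  y = c(2, 3, 4, 2),
  z = c(3, 2, 5, 3)
)

# TRUE only for rows that repeat an earlier row
dup_later <- duplicated(df)

# TRUE for every row that has an identical twin anywhere in the data
dup_any <- duplicated(df) | duplicated(df, fromLast = TRUE)

dup_later  # FALSE FALSE FALSE  TRUE
dup_any    #  TRUE FALSE FALSE  TRUE
```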

Functions to Install all Core Libraries

Description

Lists the core packages necessary to run all potential modeling algorithms.

Usage

core_packages()

Details

Lists the core packages necessary to run all potential modeling algorithms.

Value

A character vector

Author(s)

Steven P. Sanderson II, MPH

Examples

core_packages()

Generate Model Specification calls to `parsnip`

Description

Creates a list/tibble of parsnip model specifications.

Usage

create_model_spec(
  .parsnip_eng = list("lm"),
  .mode = list("regression"),
  .parsnip_fns = list("linear_reg"),
  .return_tibble = TRUE
)

Arguments

.parsnip_eng

The input must be a list. The default is list("lm"). Supply one engine per model function, for example "lm" or "glm".

.mode

The input must be a list. The default is 'regression'

.parsnip_fns

The input must be a list. The default is list("linear_reg"). Supply the names of parsnip model functions as strings, for example "linear_reg" or "cubist_rules".

.return_tibble

The default is TRUE. FALSE will return a list object.

Details

Creates a list/tibble of parsnip model specifications. With this function you can generate a list/tibble output of any model specification and engine you choose that is supported by the parsnip ecosystem.

Value

A list or a tibble.

Author(s)

Steven P. Sanderson II, MPH

Examples

create_model_spec(
 .parsnip_eng = list("lm","glm","glmnet","cubist"),
 .parsnip_fns = list(
      "linear_reg","linear_reg","linear_reg",
      "cubist_rules"
     )
 )

create_model_spec(
 .parsnip_eng = list("lm","glm","glmnet","cubist"),
 .parsnip_fns = list(
      "linear_reg","linear_reg","linear_reg",
      "cubist_rules"
     ),
 .return_tibble = FALSE
 )
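
The engine and function lists are read position by position, so the first engine goes with the first function and so on. A rough sketch of that pairing, written directly against parsnip, might look like the following. It is an assumption about the general idea rather than the package's source, and it only looks up functions exported by parsnip itself.

```r
library(parsnip)

# One function name per engine, matched by position
fns  <- list("linear_reg", "linear_reg")
engs <- list("lm", "glm")

specs <- lapply(seq_along(fns), function(i) {
  # Look up the parsnip function by name, then call it with the engine
  spec_fn <- getExportedValue("parsnip", fns[[i]])
  spec_fn(mode = "regression", engine = engs[[i]])
})

# Engine recorded on each specification
vapply(specs, function(s) s$engine, character(1))
```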

Utility Create Splits Object

Description

Create a splits object.

Usage

create_splits(.data, .split_type = "initial_split", .split_args = NULL)

Arguments

.data

The data being passed to make a split on

.split_type

The default is "initial_split", you can pass any other split type from the rsample library.

.split_args

The default is NULL, which uses the default arguments of the chosen split function. To pass other arguments, supply a named list of parameter names and values.

Details

Create a splits object that returns a list object of both the splits object itself and the splits type. This function supports all splits types from the rsample package.

Value

A list object

Author(s)

Steven P. Sanderson II, MPH

Examples

create_splits(mtcars, .split_type = "vfold_cv")
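
To illustrate how a split type given as a string and a list of split arguments can be combined, here is a minimal sketch that calls rsample directly. The variable names split_type and split_args are illustrative, and this is not necessarily how create_splits() is implemented internally.

```r
library(rsample)

split_type <- "initial_split"
split_args <- list(prop = 0.8)

# Resolve the rsample function by name and call it with the extra arguments
split_fn <- getExportedValue("rsample", split_type)
splits   <- do.call(split_fn, c(list(data = mtcars), split_args))

train <- training(splits)
test  <- testing(splits)

# Every row lands in exactly one of the two partitions
nrow(train) + nrow(test) == nrow(mtcars)
```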

Create a Workflow Set Object

Description

Create a workflow set object tibble from a model spec tibble.

Usage

create_workflow_set(.model_tbl = NULL, .recipe_list = list(), .cross = TRUE)

Arguments

.model_tbl

The model table that is generated from a function like fast_regression_parsnip_spec_tbl(). The model spec column will be grabbed automatically as the class of the object must be tidyaml_base_tbl

.recipe_list

Provide a list of recipes here that will get added to the workflow set object.

.cross

The default is TRUE, can be set to FALSE. This is passed to the cross parameter as an argument to the workflow_set() function.

Details

Create a workflow set object/tibble from a model spec tibble where the object class type is tidyaml_base_tbl. This function takes a list of recipes and grabs the model specifications from the base tibble to create the workflow set object. You can also set the .cross parameter to TRUE or FALSE; it is passed as the cross argument to the workflowsets::workflow_set() function.

Value

A list object of workflows.

Author(s)

Steven P. Sanderson II, MPH

Examples

library(recipes)

rec_obj <- recipe(mpg ~ ., data = mtcars)
spec_tbl <- fast_regression_parsnip_spec_tbl(
  .parsnip_fns = "linear_reg",
  .parsnip_eng = c("lm","glm")
)

create_workflow_set(
  spec_tbl,
  list(rec_obj)
)
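
For reference, the underlying workflowsets::workflow_set() call that .cross is forwarded to can be used on its own. The sketch below builds a comparable workflow set by hand. The list names base, lm and glm are arbitrary labels chosen for this illustration.

```r
library(recipes)
library(parsnip)
library(workflowsets)

rec_obj <- recipe(mpg ~ ., data = mtcars)

# With cross = TRUE every preprocessor is combined with every model
wf_set <- workflow_set(
  preproc = list(base = rec_obj),
  models  = list(
    lm  = linear_reg(engine = "lm"),
    glm = linear_reg(engine = "glm")
  ),
  cross = TRUE
)

wf_set$wflow_id
```

One recipe crossed with two models yields two workflows, with IDs formed from the list names.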

Extract A Model Specification

Description

Extract a model specification from a tidyAML model tibble.

Usage

extract_model_spec(.data, .model_id = NULL)

Arguments

.data

The model table that must have the class tidyaml_mod_spec_tbl.

.model_id

The model number that you want to select. Must be an integer or sequence of integers, e.g. 1 or c(1,3,5) or 1:2.

Details

This function allows you to get one or more model specifications from a tibble with a class of "tidyaml_mod_spec_tbl". Models are selected by the .model_id column, which you can reference with an integer or a sequence of integers.

Value

A tibble with the chosen model specification(s).

Author(s)

Steven P. Sanderson II, MPH

Examples

spec_tbl <- fast_regression_parsnip_spec_tbl(
  .parsnip_fns = "linear_reg",
  .parsnip_eng = c("lm","glm")
)

extract_model_spec(spec_tbl, 1)
extract_model_spec(spec_tbl, 1:2)
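
Selection by .model_id amounts to a row filter. The base R sketch below uses a toy stand-in table and a hypothetical helper, pick_models(), to show the idea. The real functions additionally check the table's class and return the relevant list column.

```r
# Toy stand-in for a model table; only the id and engine columns are mimicked
model_tbl <- data.frame(
  .model_id       = 1:3,
  .parsnip_engine = c("lm", "glm", "glmnet"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the rows whose .model_id is in `ids`
pick_models <- function(tbl, ids) {
  tbl[tbl$.model_id %in% ids, , drop = FALSE]
}

pick_models(model_tbl, 1)
pick_models(model_tbl, c(1, 3))
```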

Extract Residuals from Fast Regression Models

Description

This function extracts residuals from a fast regression model table (fast_regression()).

Usage

extract_regression_residuals(.model_tbl, .pivot_long = FALSE)

Arguments

.model_tbl

A fast regression model specification table (fst_reg_spec_tbl).

.pivot_long

A logical value indicating if the output should be pivoted. The default is FALSE.

Details

The function checks if the input model specification table inherits the class 'fst_reg_spec_tbl' and if it contains the column 'pred_wflw'. It then manipulates the data, grouping it by model, and extracts residuals for each model. The result is a list of data frames, each containing residuals, actual values, and predicted values for a specific model.

Value

The function returns a list of data frames, each containing residuals, actual values, and predicted values for a specific model.

Author(s)

Steven P. Sanderson II, MPH

Examples

library(recipes, quietly = TRUE)

rec_obj <- recipe(mpg ~ ., data = mtcars)

fr_tbl <- fast_regression(mtcars, rec_obj, .parsnip_eng = c("lm","glm"),
.parsnip_fns = "linear_reg")

extract_regression_residuals(fr_tbl)
extract_regression_residuals(fr_tbl, .pivot_long = TRUE)
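
As a reminder of what each returned data frame holds, the sketch below computes actual values, predictions and residuals for a single base R lm() fit. The column names .actual, .predicted and .resid are assumptions for illustration, as is the sign convention (actual minus predicted). The package's output may differ in both.

```r
fit <- lm(mpg ~ wt, data = mtcars)

res_tbl <- data.frame(
  .actual    = mtcars$mpg,
  .predicted = unname(fitted(fit))
)

# Residual taken as actual minus predicted
res_tbl$.resid <- res_tbl$.actual - res_tbl$.predicted

head(res_tbl)
```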

Extract Tunable Parameters from Model Specifications

Description

Extract a list of tunable parameters from the .model_spec column of a tidyaml_mod_spec_tbl.

Usage

extract_tunable_params(.model_tbl)

Arguments

.model_tbl

A model table with a class of tidyaml_mod_spec_tbl.

Details

This function iterates over the .model_spec column of a model table and extracts tunable parameters for each model using tunable(). The result is a list that can be further processed into a tibble if needed.

Value

A list of tibbles, each containing the tunable parameters for a model.

Examples

library(dplyr)
mods <- fast_regression_parsnip_spec_tbl(
  .parsnip_fns = "linear_reg",
  .parsnip_eng = c("lm","glmnet")
  )
extract_tunable_params(mods)

Extract A Model Workflow

Description

Extract a model workflow from a tidyAML model tibble.

Usage

extract_wflw(.data, .model_id = NULL)

Arguments

.data

The model table that must have the class tidyaml_mod_spec_tbl.

.model_id

The model number that you want to select. Must be an integer or sequence of integers, e.g. 1 or c(1,3,5) or 1:2.

Details

This function allows you to get one or more model workflows from a tibble with a class of "tidyaml_mod_spec_tbl". Models are selected by the .model_id column, which you can reference with an integer or a sequence of integers.

Value

A tibble with the chosen model workflow(s).

Author(s)

Steven P. Sanderson II, MPH

Examples

library(recipes)

rec_obj <- recipe(mpg ~ ., data = mtcars)
frt_tbl <- fast_regression(mtcars, rec_obj, .parsnip_eng = c("lm","glm"),
                                           .parsnip_fns = "linear_reg")

extract_wflw(frt_tbl, 1)
extract_wflw(frt_tbl, 1:2)

Extract A Model Fitted Workflow

Description

Extract a model fitted workflow from a tidyAML model tibble.

Usage

extract_wflw_fit(.data, .model_id = NULL)

Arguments

.data

The model table that must have the class tidyaml_mod_spec_tbl.

.model_id

The model number that you want to select. Must be an integer or sequence of integers, e.g. 1 or c(1,3,5) or 1:2.

Details

This function allows you to get one or more fitted model workflows from a tibble with a class of "tidyaml_mod_spec_tbl". Models are selected by the .model_id column, which you can reference with an integer or a sequence of integers.

Value

A tibble with the chosen fitted model workflow(s).

Author(s)

Steven P. Sanderson II, MPH

Examples

library(recipes)

rec_obj <- recipe(mpg ~ ., data = mtcars)
frt_tbl <- fast_regression(mtcars, rec_obj, .parsnip_eng = c("lm","glm"),
                                           .parsnip_fns = "linear_reg")

extract_wflw_fit(frt_tbl, 1)
extract_wflw_fit(frt_tbl, 1:2)

Extract A Model Workflow Predictions

Description

Extract a model workflow predictions from a tidyAML model tibble.

Usage

extract_wflw_pred(.data, .model_id = NULL)

Arguments

.data

The model table that must have the class tidyaml_mod_spec_tbl.

.model_id

The model number that you want to select. Must be an integer or sequence of integers, e.g. 1 or c(1,3,5) or 1:2.

Details

This function allows you to get the workflow predictions of one or more models from a tibble with a class of "tidyaml_mod_spec_tbl". Models are selected by the .model_id column, which you can reference with an integer or a sequence of integers.

Value

A tibble with the predictions of the chosen model workflow(s).

Author(s)

Steven P. Sanderson II, MPH

Examples

library(recipes)

rec_obj <- recipe(mpg ~ ., data = mtcars)
frt_tbl <- fast_regression(mtcars, rec_obj, .parsnip_eng = c("lm","glm"),
                                           .parsnip_fns = "linear_reg")

extract_wflw_pred(frt_tbl, 1)
extract_wflw_pred(frt_tbl, 1:2)

Generate Model Specification calls to `parsnip`

Description

Creates a list/tibble of parsnip model specifications.

Usage

fast_classification(
  .data,
  .rec_obj,
  .parsnip_fns = "all",
  .parsnip_eng = "all",
  .split_type = "initial_split",
  .split_args = NULL,
  .drop_na = TRUE
)

Arguments

.data

The data being passed to the function for the classification problem

.rec_obj

The recipe object being passed.

.parsnip_fns

The default is 'all' which will create all possible classification model specifications supported.

.parsnip_eng

The default is 'all', which will use all engines supported for the classification model specifications.

.split_type

The default is 'initial_split', you can pass any type of split supported by rsample

.split_args

The default is NULL, when NULL then the default parameters of the split type will be executed for the rsample split type.

.drop_na

The default is TRUE, which will drop all NA's from the data.

Details

With this function you can generate a tibble output of any classification model specification and its fitted workflow object. Following the recipes documentation, specifically that of step_string2factor(), you are encouraged to mutate your predictor into a factor before you create your recipe.

Value

A list or a tibble.

Author(s)

Steven P. Sanderson II, MPH

Examples

library(recipes)
library(dplyr)
library(tidyr)

df <- Titanic |>
 as_tibble() |>
 uncount(n) |>
 mutate(across(everything(), as.factor))

rec_obj <- recipe(Survived ~ ., data = df)

fct_tbl <- fast_classification(
  .data = df,
  .rec_obj = rec_obj,
  .parsnip_eng = c("glm","earth")
  )

fct_tbl

Utility Classification call to `parsnip`

Description

Creates a tibble of parsnip classification model specifications.

Usage

fast_classification_parsnip_spec_tbl(
  .parsnip_fns = "all",
  .parsnip_eng = "all"
)

Arguments

.parsnip_fns

The default for this is set to all. This means that all of the parsnip classification functions will be used, for example bag_mars() or bart(). You can also choose to pass a c() vector like c("bag_mars","bart").

.parsnip_eng

The default for this is set to all. This means that all of the parsnip classification engines will be used, for example earth, or dbarts. You can also choose to pass a c() vector like c('earth', 'dbarts')

Details

Creates a tibble of parsnip classification model specifications. This will create a tibble of 32 different classification model specifications which can be filtered. The model specs are created first and then filtered out. This will only create models for classification problems. To find all of the supported models in this package you can visit https://www.tidymodels.org/find/parsnip/

Value

A tibble with an added class of 'fst_class_spec_tbl'

Author(s)

Steven P. Sanderson II, MPH

Examples

fast_classification_parsnip_spec_tbl(.parsnip_fns = "logistic_reg")
fast_classification_parsnip_spec_tbl(.parsnip_eng = c("earth","dbarts"))

Generate Model Specification calls to `parsnip`

Description

Creates a list/tibble of parsnip model specifications.

Usage

fast_regression(
  .data,
  .rec_obj,
  .parsnip_fns = "all",
  .parsnip_eng = "all",
  .split_type = "initial_split",
  .split_args = NULL,
  .drop_na = TRUE
)

Arguments

.data

The data being passed to the function for the regression problem

.rec_obj

The recipe object being passed.

.parsnip_fns

The default is 'all' which will create all possible regression model specifications supported.

.parsnip_eng

The default is 'all', which will use all engines supported for the regression model specifications.

.split_type

The default is 'initial_split', you can pass any type of split supported by rsample

.split_args

The default is NULL, when NULL then the default parameters of the split type will be executed for the rsample split type.

.drop_na

The default is TRUE, which will drop all NA's from the data.

Details

With this function you can generate a tibble output of any regression model specification and its fitted workflow object.

Value

A list or a tibble.

Author(s)

Steven P. Sanderson II, MPH

Examples

library(recipes, quietly = TRUE)

rec_obj <- recipe(mpg ~ ., data = mtcars)
frt_tbl <- fast_regression(
  mtcars,
  rec_obj,
  .parsnip_eng = c("lm","glm","gee"),
  .parsnip_fns = "linear_reg"
  )

frt_tbl
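
To make the steps bundled by this function more concrete (split the data, attach the recipe and model to a workflow, fit, predict), here is a hand-rolled version for a single lm model using tidymodels packages directly. It is a sketch of the general flow under the default initial_split, not the package's internal code, and it builds the recipe on the training partition as an illustrative choice.

```r
library(rsample)
library(recipes)
library(parsnip)
library(workflows)

set.seed(123)

# 1. Split the data
splits <- initial_split(mtcars)

# 2. Combine a recipe and a model specification in a workflow
rec_obj <- recipe(mpg ~ ., data = training(splits))
wflw <- workflow() |>
  add_recipe(rec_obj) |>
  add_model(linear_reg(engine = "lm"))

# 3. Fit on the training data
fitted_wflw <- fit(wflw, data = training(splits))

# 4. Predict on the held-out data; the result has a single .pred column
preds <- predict(fitted_wflw, new_data = testing(splits))
preds
```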

Utility Regression call to `parsnip`

Description

Creates a tibble of parsnip regression model specifications.

Usage

fast_regression_parsnip_spec_tbl(.parsnip_fns = "all", .parsnip_eng = "all")

Arguments

.parsnip_fns

The default for this is set to all. This means that all of the parsnip linear regression functions will be used, for example linear_reg(), or cubist_rules. You can also choose to pass a c() vector like c("linear_reg","cubist_rules")

.parsnip_eng

The default for this is set to all. This means that all of the parsnip linear regression engines will be used, for example lm, or glm. You can also choose to pass a c() vector like c('lm', 'glm')

Details

Creates a tibble of parsnip regression model specifications. This will create a tibble of 46 different regression model specifications which can be filtered. The model specs are created first and then filtered out. This will only create models for regression problems. To find all of the supported models in this package you can visit https://www.tidymodels.org/find/parsnip/

Value

A tibble with an added class of 'fst_reg_spec_tbl'

Author(s)

Steven P. Sanderson II, MPH

Examples

fast_regression_parsnip_spec_tbl(.parsnip_fns = "linear_reg")
fast_regression_parsnip_spec_tbl(.parsnip_eng = c("lm","glm"))

Full Internal Workflow for Model and Recipe

Description

This function creates a full internal workflow for a model and recipe combination.

Usage

full_internal_make_wflw(.model_tbl, .rec_obj)

Arguments

.model_tbl

A model specification table (tidyaml_mod_spec_tbl).

.rec_obj

A recipe object.

Details

The function checks if the input model specification table inherits the class 'tidyaml_mod_spec_tbl'. It then manipulates the input table, making adjustments for factors and creating a list of grouped models. For each model-recipe pair, it uses the appropriate internal function based on the model type to create a workflow object. The specific internal function is selected using a switch statement based on the class of the model.

Value

The function returns a workflow object for the first model-recipe pair based on the internal function selected.

Author(s)

Steven P. Sanderson II, MPH

Examples

library(dplyr)
library(recipes)

rec_obj <- recipe(mpg ~ ., data = mtcars)

mod_tbl <- make_regression_base_tbl()
mod_tbl <- mod_tbl |>
  filter(
    .parsnip_engine %in% c("lm", "glm") &
    .parsnip_fns == "linear_reg"
    )
class(mod_tbl) <- c("tidyaml_mod_spec_tbl", class(mod_tbl))
mod_spec_tbl <- internal_make_spec_tbl(mod_tbl)
result <- full_internal_make_wflw(mod_spec_tbl, rec_obj)
result

Get a Model

Description

Get a model from a tidyAML model tibble.

Usage

get_model(.data, .model_id = NULL)

Arguments

.data

The model table that must have the class tidyaml_mod_spec_tbl.

.model_id

The model number that you want to select. Must be an integer or sequence of integers, e.g. 1 or c(1,3,5) or 1:2.

Details

This function allows you to get a model or models from a tibble with a class of "tidyaml_mod_spec_tbl". Models are selected by the .model_id column, which you can reference with an integer or a sequence of integers.

Value

A tibble with the chosen models.

Author(s)

Steven P. Sanderson II, MPH

Examples

library(recipes)

rec_obj <- recipe(mpg ~ ., data = mtcars)
spec_tbl <- fast_regression_parsnip_spec_tbl(
  .parsnip_fns = "linear_reg",
  .parsnip_eng = c("lm","glm")
)

get_model(spec_tbl, 1)
get_model(spec_tbl, 1:2)

Functions to Install all Core Libraries

Description

Installs all dependencies in the core_packages() function.

Usage

install_deps()

Details

Installs all dependencies in the core_packages() function.

Value

No return value, called for side effects

Author(s)

Steven P. Sanderson II, MPH

Examples

## Not run: 
  install_deps()

## End(Not run)
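
A common pattern for this kind of helper is to install only what is missing. The sketch below shows that pattern with a hypothetical two-package list. Whether install_deps() skips packages that are already installed is an assumption, not something stated here.

```r
# Hypothetical package list; in practice it would come from core_packages()
pkgs <- c("stats", "utils")

# Check which packages can already be loaded
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
to_install   <- pkgs[!is_installed]

# Install only the missing ones
if (length(to_install) > 0) {
  install.packages(to_install)
}
```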

Internals Safely Make a Fitted Workflow from Model Spec tibble

Description

Safely Make a fitted workflow from a model spec tibble.

Usage

internal_make_fitted_wflw(.model_tbl, .splits_obj)

Arguments

.model_tbl

The model table that is generated from a function like fast_regression_parsnip_spec_tbl(), must have a class of "tidyaml_mod_spec_tbl". This is meant to be used after the function internal_make_wflw() has been run and the tibble has been saved.

.splits_obj

The splits object from the auto_ml function. It is internal to the auto_ml_ function.

Details

Create a fitted parsnip model from a workflow object.

Value

A list object of workflows.

Author(s)

Steven P. Sanderson II, MPH

Examples

library(recipes, quietly = TRUE)

mod_spec_tbl <- fast_regression_parsnip_spec_tbl(
  .parsnip_eng = c("lm","glm"),
  .parsnip_fns = "linear_reg"
)

rec_obj <- recipe(mpg ~ ., data = mtcars)
splits_obj <- create_splits(mtcars, "initial_split")

mod_tbl <- mod_spec_tbl |>
  mutate(wflw = full_internal_make_wflw(mod_spec_tbl, rec_obj))

internal_make_fitted_wflw(mod_tbl, splits_obj)

Internals Make a Model Spec tibble

Description

Make a Model Spec tibble.

Usage

internal_make_spec_tbl(.model_tbl)

Arguments

.model_tbl

This is the data that should be coming from inside of the regression/classification to parsnip spec functions.

Details

Make a Model Spec tibble.

Value

A model spec tbl.

Author(s)

Steven P. Sanderson II, MPH

Examples

make_regression_base_tbl() |>
  internal_make_spec_tbl()

make_classification_base_tbl() |>
  internal_make_spec_tbl()

Internals Safely Make Workflow from Model Spec tibble

Description

Safely Make a workflow from a model spec tibble.

Usage

internal_make_wflw(.model_tbl, .rec_obj)

Arguments

.model_tbl

The model table that is generated from a function like fast_regression_parsnip_spec_tbl(), must have a class of "tidyaml_mod_spec_tbl".

.rec_obj

The recipe object that is going to be used to make the workflow object.

Details

Create a model specification tibble that has a workflows::workflow() list column.

Value

A list object of workflows.

Author(s)

Steven P. Sanderson II, MPH

Examples

library(recipes, quietly = TRUE)

mod_spec_tbl <- fast_regression_parsnip_spec_tbl(
  .parsnip_eng = c("lm","glm","gee"),
  .parsnip_fns = "linear_reg"
)

rec_obj <- recipe(mpg ~ ., data = mtcars)

internal_make_wflw(mod_spec_tbl, rec_obj)

Internals Safely Make Workflow for GEE Linear Regression

Description

Safely Make a workflow from a model spec tibble.

Usage

internal_make_wflw_gee_lin_reg(.model_tbl, .rec_obj)

Arguments

.model_tbl

The model table that is generated from a function like fast_regression_parsnip_spec_tbl(), must have a class of "tidyaml_mod_spec_tbl".

.rec_obj

The recipe object that is going to be used to make the workflow object.

Details

Create a model specification tibble that has a workflows::workflow() list column.

Value

A list object of workflows.

Author(s)

Steven P. Sanderson II, MPH

Examples

library(dplyr)
library(recipes)
library(multilevelmod)

mod_tbl <- make_regression_base_tbl()
mod_tbl <- mod_tbl |>
  filter(
  .parsnip_engine %in% c("gee") &
  .parsnip_fns == "linear_reg"
  )

class(mod_tbl) <- c("tidyaml_mod_spec_tbl", class(mod_tbl))
mod_spec_tbl <- internal_make_spec_tbl(mod_tbl)
rec_obj <- recipe(mpg ~ ., data = mtcars)

internal_make_wflw_gee_lin_reg(mod_spec_tbl, rec_obj)

Internals Safely Make Predictions on a Fitted Workflow from Model Spec tibble

Description

Safely Make predictions on a fitted workflow from a model spec tibble.

Usage

internal_make_wflw_predictions(.model_tbl, .splits_obj)

Arguments

.model_tbl

The model table that is generated from a function like fast_regression_parsnip_spec_tbl(), must have a class of "tidyaml_mod_spec_tbl". This is meant to be used after the function internal_make_fitted_wflw() has been run and the tibble has been saved.

.splits_obj

The splits object from the auto_ml function. It is internal to the auto_ml_ function.

Details

Create predictions on a fitted parsnip model from a workflow object.

Value

A list object containing a tibble of the outcome variable and its values, along with the testing and training predictions, in a single tibble.

.data_category	.data_type	.value
actual	actual	21.0
actual	actual	21.0
actual	actual	22.8
...	...	...
predicted	training	21.0
...	...	...
predicted	training	21.0
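
The long layout above can be reproduced by stacking actual values and predictions. The base R sketch below does so for a simple lm() fit. The "testing" label and the manual train/test split are assumptions made for the illustration.

```r
set.seed(42)

# Manual 24/8 split of mtcars
train_idx <- sample(nrow(mtcars), 24)
train <- mtcars[train_idx, ]
test  <- mtcars[-train_idx, ]

fit <- lm(mpg ~ wt, data = train)

# Stack actuals, training predictions and testing predictions
pred_tbl <- rbind(
  data.frame(.data_category = "actual", .data_type = "actual",
             .value = mtcars$mpg),
  data.frame(.data_category = "predicted", .data_type = "training",
             .value = unname(predict(fit, newdata = train))),
  data.frame(.data_category = "predicted", .data_type = "testing",
             .value = unname(predict(fit, newdata = test)))
)

table(pred_tbl$.data_type)
```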

Author(s)

Steven P. Sanderson II, MPH

Examples

library(recipes, quietly = TRUE)

mod_spec_tbl <- fast_regression_parsnip_spec_tbl(
  .parsnip_eng = c("lm","glm"),
  .parsnip_fns = "linear_reg"
)

rec_obj <- recipe(mpg ~ ., data = mtcars)
splits_obj <- create_splits(mtcars, "initial_split")

mod_tbl <- mod_spec_tbl |>
  mutate(wflw = full_internal_make_wflw(mod_spec_tbl, rec_obj))

mod_fitted_tbl <- mod_tbl |>
  mutate(fitted_wflw = internal_make_fitted_wflw(mod_tbl, splits_obj))

internal_make_wflw_predictions(mod_fitted_tbl, splits_obj)

Internals Make a Tunable Model Specification

Description

Make a tuned model specification object.

Usage

internal_set_args_to_tune(.model_tbl)

Arguments

.model_tbl

The model table that is generated from a function like fast_regression_parsnip_spec_tbl(), must have a class of "tidyaml_mod_spec_tbl".

Details

This will take a model specification that is created from a function like fast_regression_parsnip_spec_tbl() and update the model_spec args to tune::tune(). This is done dynamically, meaning you do not need to know the names of the parameters inside of the model specification.

Value

A list object of workflows.

Author(s)

Steven P. Sanderson II, MPH

Examples

library(dplyr)

mod_tbl <- fast_regression_parsnip_spec_tbl()
mod_tbl$model_spec[[1]]

updated_mod_tbl <- mod_tbl |>
  mutate(model_spec = internal_set_args_to_tune(mod_tbl))
updated_mod_tbl$model_spec[[1]]

Functions to Install all Core Libraries

Description

Load all the core packages necessary to run all potential modeling algorithms.

Usage

load_deps()

Details

Load all the core packages necessary to run all potential modeling algorithms.

Value

No return value, called for side effects

Author(s)

Steven P. Sanderson II, MPH

Examples

## Not run: 
load_deps()

## End(Not run)

Internals Make Base Classification Tibble

Description

Creates a base tibble to create parsnip classification model specifications.

Usage

make_classification_base_tbl()

Details

Creates a base tibble to create parsnip classification model specifications.

Value

A tibble

Author(s)

Steven P. Sanderson II, MPH

Examples

make_classification_base_tbl()

Internals Make Base Regression Tibble

Description

Creates a base tibble to create parsnip regression model specifications.

Usage

make_regression_base_tbl()

Details

Creates a base tibble to create parsnip regression model specifications.

Value

A tibble

Author(s)

Steven P. Sanderson II, MPH

Examples

make_regression_base_tbl()

Match function arguments

Description

Match a function's arguments.

Usage

match_args(f, args)

Arguments

f

The parsnip function such as "linear_reg" as a string and without the parentheses.

args

The arguments you want to supply to f

Details

Match a function's arguments. Arguments that the function does not accept are dropped, and the remaining valid ones are returned.

Value

A list of matched arguments.

Author(s)

Steven P. Sanderson II, MPH

Examples


match_args(
  f = "linear_reg",
  args = list(
    mode = "regression",
    engine = "lm",
    trees = 1,
    mtry = 1
   )
 )
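
One way to picture the matching is a comparison of the supplied names against the target function's formal arguments. The sketch below does this with formals(). The helper name match_args_sketch() is hypothetical, and the real function may differ, for example in what messages it emits.

```r
library(parsnip)

# Hypothetical helper: keep only arguments that `f` actually accepts
match_args_sketch <- function(f, args) {
  fn <- getExportedValue("parsnip", f)
  args[names(args) %in% names(formals(fn))]
}

kept <- match_args_sketch(
  f = "linear_reg",
  args = list(mode = "regression", engine = "lm", trees = 1, mtry = 1)
)

# trees and mtry are not arguments of linear_reg(), so they are dropped
names(kept)
```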

Create ggplot2 plot of regression predictions

Description

Create a ggplot2 plot of regression predictions.

Usage

plot_regression_predictions(.data, .output = "list")

Arguments

.data

The data from the output of the extract_wflw_pred() function.

.output

The default is "list" which will return a list of plots. The other option is "facet" which will return a single faceted plot.

Details

Create a ggplot2 plot of regression predictions showing the actual, training, and testing values. The output of this function can either be a list of plots or a single faceted plot. This function takes the output of the extract_wflw_pred() function.

Value

A list of ggplot2 plots or a faceted plot.

Author(s)

Steven P. Sanderson II, MPH

Examples

library(recipes)

rec_obj <- recipe(mpg ~ ., data = mtcars)
frt_tbl <- fast_regression(
  mtcars,
  rec_obj,
  .parsnip_eng = c("lm","glm"),
  .parsnip_fns = "linear_reg"
  )

extract_wflw_pred(frt_tbl,1) |> plot_regression_predictions()
extract_wflw_pred(frt_tbl,1:nrow(frt_tbl)) |>
  plot_regression_predictions(.output = "facet")

Create ggplot2 plot of regression residuals

Description

Create a ggplot2 plot of regression residuals.

Usage

plot_regression_residuals(.data)

Arguments

.data

The data from the output of the extract_regression_residuals() function.

Details

Create a ggplot2 plot of regression residuals. The output of this function can either be a list of plots or a single faceted plot. This function takes the output of the extract_regression_residuals() function.

Value

A list of ggplot2 plots or a faceted plot.

Author(s)

Steven P. Sanderson II, MPH

Examples

library(recipes)

rec_obj <- recipe(mpg ~ ., data = mtcars)
frt_tbl <- fast_regression(
  mtcars,
  rec_obj,
  .parsnip_eng = c("lm","glm"),
  .parsnip_fns = "linear_reg"
  )

extract_regression_residuals(frt_tbl, FALSE)[1] |> plot_regression_residuals()
extract_regression_residuals(frt_tbl, TRUE)[1] |> plot_regression_residuals()

Perform quantile normalization on a numeric matrix/data.frame

Description

This function will perform quantile normalization on two or more distributions of equal length. Quantile normalization is a technique used to make the distribution of values across different samples more similar. It ensures that the distributions of values for each sample have the same quantiles. This function takes a numeric matrix as input and returns a quantile-normalized matrix.

Usage

quantile_normalize(.data, .return_tibble = FALSE)

Arguments

.data

A numeric matrix where each column represents a sample.

.return_tibble

A logical value that determines if the output should be a tibble. Default is 'FALSE'.

Details

This function performs quantile normalization on a numeric matrix by following these steps:

Sort each column of the input matrix.
Calculate the mean of each row across the sorted columns.
Replace each column's sorted values with the row means.
Unsort the columns to their original order.
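
The four steps can be sketched in base R as follows. This is an illustration that assumes ties are broken by order of appearance. It is not the package's source, and it returns only the normalized matrix rather than the full list described under Value.

```r
quantile_normalize_sketch <- function(m) {
  # 1. Sort each column
  sorted <- apply(m, 2, sort)
  # 2. Mean of each row across the sorted columns
  row_means <- rowMeans(sorted)
  # 3. and 4. Replace each value by the row mean at its rank, which puts
  #    the means back in the original (unsorted) order of each column
  ranks <- apply(m, 2, rank, ties.method = "first")
  normalized <- apply(ranks, 2, function(r) row_means[r])
  dimnames(normalized) <- dimnames(m)
  normalized
}

set.seed(1)
m  <- matrix(rnorm(20), ncol = 4)
qn <- quantile_normalize_sketch(m)

# After normalization every column contains the same set of values
apply(qn, 2, sort)
```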

Value

A list object that has the following:

A numeric matrix that has been quantile normalized.
The row means of the quantile normalized matrix.
The sorted data
The ranked indices

Author(s)

Steven P. Sanderson II, MPH

Examples

# Create a sample numeric matrix
data <- matrix(rnorm(20), ncol = 4)

# Perform quantile normalization
normalized_data <- quantile_normalize(data)
normalized_data

as.data.frame(normalized_data$normalized_data) |>
  sapply(function(x) quantile(x, probs = seq(0, 1, 1 / 4)))

quantile_normalize(data, .return_tibble = TRUE)

Tidy eval helpers

Description

This page lists the tidy eval tools reexported in this package from rlang. To learn about using tidy eval in scripts and packages at a high level, see the dplyr programming vignette and the ggplot2 in packages vignette. The Metaprogramming section of Advanced R may also be useful for a deeper dive.

The tidy eval operators {{, !!, and !!! are syntactic constructs which are specially interpreted by tidy eval functions. You will mostly need {{, as !! and !!! are more advanced operators which you should not have to use in simple cases.

The curly-curly operator {{ allows you to tunnel data-variables passed from function arguments inside other tidy eval functions. {{ is designed for individual arguments. To pass multiple arguments contained in dots, use ... in the normal way.
```
my_function <- function(data, var, ...) {
  data %>%
    group_by(...) %>%
    summarise(mean = mean({{ var }}))
}
```
enquo() and enquos() delay the execution of one or several function arguments. The former returns a single expression, the latter returns a list of expressions. Once defused, expressions will no longer evaluate on their own. They must be injected back into an evaluation context with !! (for a single expression) and !!! (for a list of expressions).
```
my_function <- function(data, var, ...) {
  # Defuse
  var <- enquo(var)
  dots <- enquos(...)

  # Inject
  data %>%
    group_by(!!!dots) %>%
    summarise(mean = mean(!!var))
}
```
In this simple case, the code is equivalent to the usage of {{ and ... above. Defusing with enquo() or enquos() is only needed in more complex cases, for instance if you need to inspect or modify the expressions in some way.
The .data pronoun is an object that represents the current slice of data. If you have a variable name in a string, use the .data pronoun to subset that variable with [[.
```
my_var <- "disp"
mtcars %>% summarise(mean = mean(.data[[my_var]]))
```

Another tidy eval operator is :=. It makes it possible to use glue and curly-curly syntax on the LHS of =. For technical reasons, the R language doesn't support complex expressions on the left of =, so we use := as a workaround.
```
my_function <- function(data, var, suffix = "foo") {
  # Use `{{` to tunnel function arguments and the usual glue
  # operator `{` to interpolate plain strings.
  data %>%
    summarise("{{ var }}_mean_{suffix}" := mean({{ var }}))
}
```
Many tidy eval functions like dplyr::mutate() or dplyr::summarise() give an automatic name to unnamed inputs. If you need to create the same sort of automatic names by yourself, use as_label(). For instance, the glue-tunnelling syntax above can be reproduced manually with:
```
my_function <- function(data, var, suffix = "foo") {
  var <- enquo(var)
  prefix <- as_label(var)
  data %>%
    summarise("{prefix}_mean_{suffix}" := mean(!!var))
}
```
Expressions defused with enquo() (or tunnelled with {{) need not be simple column names, they can be arbitrarily complex. as_label() handles those cases gracefully. If your code assumes a simple column name, use as_name() instead. This is safer because it throws an error if the input is not a name as expected.

Value

No return value, called for side effects
