Help for package processpredictR

Type: Package
Title: Process Prediction
Version: 0.1.0
Date: 2022-12-23
Description: Means to predict process flow, such as process outcome, next activity, next time, remaining time, and remaining trace. Off-the-shelf predictive models based on the concept of Transformers are provided, as well as multiple ways to customize the models. This package is partly based on work described in Zaharah A. Bukhsh, Aaqib Saeed, & Remco M. Dijkman. (2021). "ProcessTransformer: Predictive Business Process Monitoring with Transformer Network" <doi:10.48550/arXiv.2104.00721>.
License: MIT + file LICENSE
Encoding: UTF-8
RoxygenNote: 7.2.3
Imports: bupaR, edeaR, dplyr, forcats, magrittr, reticulate, tidyr, tidyselect, purrr, stringr, keras, tensorflow, rlang, data.table, mltools, ggplot2, cli, glue, plotly, progress
Config/testthat/edition:
Depends: R (≥ 2.10)
Suggests: knitr, rmarkdown, lubridate, eventdataR
VignetteBuilder: knitr
NeedsCompilation:
Packaged: 2023-01-15 22:02:43 UTC; lucp8407
Author: Ivan Esin [aut], Gert Janssenswillen [cre], Hasselt University [cph]
Maintainer: Gert Janssenswillen <gert.janssenswillen@uhasselt.be>
Repository: CRAN
Date/Publication: 2023-01-17 17:10:01 UTC

processpredictR

Description

Author(s)

Maintainer: Gert Janssenswillen gert.janssenswillen@uhasselt.be

Authors:

Ivan Esin ivan.esin@student.uhasselt

Other contributors:

Hasselt University [copyright holder]

Pipe operator

Description

See magrittr::%>% for details.

Usage

lhs %>% rhs

Arguments

lhs

A value or the magrittr placeholder.

rhs

A function call using the magrittr semantics.

Value

The result of calling rhs(lhs).

Confusion matrix for predictions

Description

Confusion matrix for predictions

Usage

confusion_matrix(predictions, ...)

Arguments

predictions

ppred_predictions: A data.frame with predicted values returned by predict.ppred_model().

...

additional arguments.

Value

A table object that can be used for plotting a confusion matrix using plot().
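As a rough illustration of what such a table holds, the sketch below builds a confusion matrix in base R from a toy data.frame. The column names `actual` and `predicted` are hypothetical and need not match the columns of a ppred_predictions object. This is not the package's implementation.

```r
# Toy predictions; the column names `actual` and `predicted` are hypothetical.
preds <- data.frame(
  actual    = c("A", "A", "B", "B", "C", "C"),
  predicted = c("A", "B", "B", "B", "C", "A"),
  stringsAsFactors = FALSE
)

# Use one shared set of levels so the table stays square even if a label
# never occurs among the predictions.
labels <- sort(unique(c(preds$actual, preds$predicted)))
cm <- table(
  actual    = factor(preds$actual, levels = labels),
  predicted = factor(preds$predicted, levels = labels)
)

# Correct predictions sit on the diagonal of the table.
accuracy <- sum(diag(cm)) / sum(cm)

print(cm)
print(accuracy)
```

Rows correspond to the observed labels and columns to the predicted ones, so off-diagonal cells show which labels are confused with each other.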

Define transformer model

Description

Defines the model using the keras functional API. The following 6 process monitoring tasks are defined:

outcome
next_activity
next_time
remaining_time
remaining_trace
remaining_trace_s2s

Usage

create_model(
  x_train,
  custom = FALSE,
  num_heads = 4,
  output_dim_emb = 36,
  dim_ff = 64,
  ...
)

Arguments

x_train

data.frame: A processed data.frame from prepare_examples().

custom

logical (default FALSE): If TRUE, returns a custom model.

num_heads

Number of attention heads in the multi-head attention layer of the Transformer block.

output_dim_emb

Dimension of the dense embedding of the keras::layer_embedding().

dim_ff

Dimensionality of the output space of the feedforward network part of the model (units argument of the keras::layer_dense()).

...

Additional arguments passed to keras::keras_model() (e.g. the name argument).

Value

An object of class ppred_model: a list containing a Transformer model (returned by keras::keras_model()) and some additional useful metrics.
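To make the roles of num_heads and output_dim_emb more concrete, the following base R sketch runs multi-head self-attention on a random matrix. It assumes the embedding dimension is split evenly across the heads (36 / 4 = 9 dimensions per head) and uses random projection weights in place of learned ones. The variable `n_positions` and the helper functions are hypothetical, and the code is independent of the keras layers the package actually builds.

```r
set.seed(1)

num_heads      <- 4   # default of create_model()
output_dim_emb <- 36  # default of create_model()
n_positions    <- 5   # hypothetical length of a trace prefix
head_dim       <- output_dim_emb / num_heads  # 9 dimensions per head

# Row-wise softmax, stabilised by subtracting each row's maximum.
softmax_rows <- function(m) {
  e <- exp(m - apply(m, 1, max))
  e / rowSums(e)
}

# Stand-in for the embedded sequence: one row per position.
x <- matrix(rnorm(n_positions * output_dim_emb), nrow = n_positions)

# One attention head: project to queries, keys and values, then mix the
# values using softmax(Q K^T / sqrt(head_dim)) as weights.
attention_head <- function(x, head_dim) {
  d <- ncol(x)
  w_q <- matrix(rnorm(d * head_dim), nrow = d)  # random, not learned
  w_k <- matrix(rnorm(d * head_dim), nrow = d)
  w_v <- matrix(rnorm(d * head_dim), nrow = d)
  q <- x %*% w_q
  k <- x %*% w_k
  v <- x %*% w_v
  weights <- softmax_rows(q %*% t(k) / sqrt(head_dim))
  list(weights = weights, output = weights %*% v)
}

heads <- lapply(seq_len(num_heads), function(i) attention_head(x, head_dim))

# Concatenating the heads restores the embedding dimension.
multi_head_output <- do.call(cbind, lapply(heads, function(h) h$output))
dim(multi_head_output)
```

Each head produces an n_positions x n_positions weight matrix whose rows sum to 1, and the concatenated output has one row per position and output_dim_emb columns.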

Create a vocabulary

Description

Creates a vocabulary of activities and outcome labels.

Usage

create_vocabulary(processed_df)

Arguments

processed_df

A preprocessed object of type ppred_examples_df returned by prepare_examples().

Value

A list consisting of:

"keys_x": list of activity labels
"keys_y": list of outcome labels (none for tasks "next_time" and "remaining_time")

Utils

Description

Utils

Usage

get_vocabulary(examples)

Arguments

examples

a preprocessed dataset returned by prepare_examples_dt().

Calculate the maximum length of a case / number of activities in the longest trace in an event log

Description

Calculate the maximum length of a case / number of activities in the longest trace in an event log

Usage

max_case_length(processed_df)

Arguments

processed_df

A processed dataset of class ppred_examples_df returned by prepare_examples().

Value

An integer number of the maximum case length (longest trace) in an event log.

Examples

library(processpredictR)
library(eventdataR)

df <- prepare_examples(patients)
max_case_length(df)
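The quantity itself is easy to reproduce by hand. The sketch below counts events per case in a toy event log using base R only. The column names are hypothetical, and the package may compute the value differently on its preprocessed data.

```r
# Toy event log: one row per event (hypothetical column names).
event_log <- data.frame(
  case_id  = c("c1", "c1", "c1", "c2", "c2", "c3", "c3", "c3", "c3"),
  activity = c("A", "B", "C", "A", "C", "A", "B", "B", "C"),
  stringsAsFactors = FALSE
)

# Count the events of each case, then take the largest count.
events_per_case <- table(event_log$case_id)
longest_trace <- max(events_per_case)
longest_trace
```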

Calculate number of outputs (target variables)

Description

Calculate number of outputs (target variables)

Usage

num_outputs(processed_df)

Arguments

processed_df

A processed dataset of class ppred_examples_df.

Value

An integer: the number of outputs to supply as an argument to a Transformer model, i.e. the number of unique labels for a specific process monitoring task.

Examples

library(processpredictR)
library(eventdataR)
df <- prepare_examples(patients)
num_outputs(df)

Plot Methods

Description

Visualize metric

Usage

## S3 method for class 'ppred_predictions'
plot(x, ...)

Arguments

x

Data to plot. An object of type ppred_predictions.

...

Additional arguments.

Value

A ggplot object, which can be customized further, if deemed necessary.

ppred_examples_df object

Description

An object of type ppred_examples_df is a transformed event log returned by prepare_examples().

ppred_model object

Description

An object of type ppred_model is a list returned by processpredictR::create_model(). It contains a custom keras functional (Transformer) model and some other useful metrics of an event log.

ppred_predictions object

Description

An object of type ppred_predictions is a data.frame with predicted values returned by predict.ppred_model().

Convert a dataset of type `log` into a preprocessed format.

Description

An event log is converted into a tibble in which each row contains a cumulative sequence of activities per case. This sequence is eventually fed to the token embedding layer of the Transformer model.

Usage

prepare_examples(
  log,
  task = c("outcome", "next_activity", "next_time", "remaining_time", "remaining_trace",
    "remaining_trace_s2s"),
  features = NULL,
  ...
)

Arguments

log

log: Object of class log or derivatives (grouped_log, eventlog, activitylog, etc.).

task

character: a process monitoring task for which to prepare an event log.

features

character (default NULL): additional features. Appends attributes (if present) numeric_features and/or categorical_features to a preprocessed event log.

...

additional arguments.

Value

A preprocessed dataset of class ppred_examples_df.

Examples

library(processpredictR)
library(eventdataR)

prepare_examples(patients, "next_activity")
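The "cumulative sequence of activities per case" can be pictured with a small base R sketch for the next_activity task. It assumes that every proper prefix of a trace becomes one row, paired with the activity that follows it. The column names, the " - " separator and the helper `prefix_rows()` are hypothetical, and the actual output of prepare_examples() may contain different columns.

```r
# Toy event log, already ordered by time within each case
# (hypothetical column names).
event_log <- data.frame(
  case_id  = c("c1", "c1", "c1", "c2", "c2"),
  activity = c("Registration", "Triage", "Check-out",
               "Registration", "Check-out"),
  stringsAsFactors = FALSE
)

# For one trace, build every proper prefix and pair it with the activity
# that follows it.
prefix_rows <- function(case_id, trace) {
  k <- seq_len(length(trace) - 1)
  data.frame(
    case_id       = rep(case_id, length(k)),
    prefix        = vapply(
      k,
      function(i) paste(trace[seq_len(i)], collapse = " - "),
      character(1)
    ),
    next_activity = trace[k + 1],
    stringsAsFactors = FALSE
  )
}

# One character vector of activities per case.
traces <- split(event_log$activity, event_log$case_id)
examples <- do.call(rbind, Map(prefix_rows, names(traces), traces))
rownames(examples) <- NULL
examples
```

Case c1, with three events, yields two rows, and case c2 yields one, so longer traces contribute more training examples.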

Print methods

Description

Print methods

Usage

## S3 method for class 'ppred_model'
print(x, ...)

Arguments

x

ppred_model: An object of class ppred_model.

...

Additional Arguments.

Value

Prints the Transformer model contained in the list returned by create_model().

Default compile function for ProcessTransformer model

Description

These objects are imported from other packages. Follow the links below to see their documentation.

keras: compile, evaluate, fit
stats: predict

Arguments

optimizer

Default optimizer for ppred_model

loss

Default loss for ppred_model

metrics

Default metrics for ppred_model

train_data

A training dataset

batch_size

A batch size

num_epochs

A number of epochs

verbose

Verbosity mode of the training output.

callbacks

list: A list of callbacks. The keras default is NULL, but callbacks can be supplied, e.g. keras::callback_csv_logger(filename = paste("log_", object$task)) or keras::callback_tensorboard().

shuffle

logical (default TRUE): If TRUE shuffles the data

validation_split

Fraction of the training data to hold out as validation data.

object

ppred_model (default NULL): ProcessTransformer model of class ppred_model.

test_data

ppred_examples_df (default NULL): preprocessed test data.

append

logical (default FALSE): If TRUE, returns the supplied data.frame with the predicted values appended.

...

Additional arguments

Splits the preprocessed `data.frame`.

Description

Returns train- and test dataframes as a list.

Usage

split_train_test(processed_df, split = 0.7)

Arguments

processed_df

A preprocessed object of type ppred_examples_df returned by prepare_examples().

split

numeric (default 0.7): A train-test split ratio.

Value

A list containing the train- and the test set objects.

Examples

library(processpredictR)
library(eventdataR)

df <- prepare_examples(patients, "next_activity")
split_train_test(df, split = 0.8)
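The sketch below shows one way a split ratio can be applied to prefix data in base R. It assumes whole cases are assigned to one side so that prefixes of the same trace never appear in both sets. Whether split_train_test() splits by case or by row is not stated here, and the column names and list element names are hypothetical.

```r
# Toy prefix table: three rows (prefixes) per case; hypothetical columns.
examples <- data.frame(
  case_id       = rep(paste0("c", 1:10), each = 3),
  prefix_length = rep(1:3, times = 10),
  stringsAsFactors = FALSE
)

split_ratio <- 0.8

# Assign whole cases to the training set so that prefixes of one trace
# never end up on both sides of the split.
case_ids <- unique(examples$case_id)
n_train <- floor(split_ratio * length(case_ids))
train_cases <- case_ids[seq_len(n_train)]

split_sets <- list(
  train = examples[examples$case_id %in% train_cases, ],
  test  = examples[!examples$case_id %in% train_cases, ]
)
vapply(split_sets, nrow, integer(1))
```

With 10 cases and a ratio of 0.8, eight cases (24 rows) go to the training set and two cases (6 rows) to the test set.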

Stacks a keras layer on top of an existing model

Description

User-friendly interface for adding a keras layer on top of an existing model.

Usage

stack_layers(object, ...)

Arguments

object

A list containing a model, as returned by create_model().

...

Functions for adding layers using the keras functional API, for example keras::layer_dense(units = 32, activation = "relu").

Value

A list containing the adapted Transformer model.

Tokenize features and target of a processed dataset of class `ppred_examples_df`

Description

Tokenize features and target of a processed ppred_examples_df object to fit the Transformer model.

Usage

tokenize(processed_df)

Arguments

processed_df

A preprocessed object of type ppred_examples_df returned by prepare_examples().

Value

A list of (sequence) tokens and additional numeric or categorical features.
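Tokenization in this sense maps each activity label to an integer key and pads the sequences to a common length. The base R sketch below illustrates the idea under stated assumptions: the vocabulary, the use of 0 as the padding key and left-padding are all hypothetical choices, not a description of what tokenize() does internally.

```r
# Toy prefixes as character vectors (hypothetical data).
prefixes <- list(
  c("Registration"),
  c("Registration", "Triage"),
  c("Registration", "Triage", "Check-out")
)

# Hypothetical vocabulary: key 0 is reserved for padding.
vocab <- c(PAD = 0L, Registration = 1L, Triage = 2L, `Check-out` = 3L)
max_len <- max(lengths(prefixes))

# Map each activity to its integer key and left-pad with the PAD key
# so that every sequence has the same length.
tokenize_prefix <- function(prefix) {
  tokens <- unname(vocab[prefix])
  c(rep(vocab[["PAD"]], max_len - length(tokens)), tokens)
}

# vapply() returns one column per prefix; transpose to one row per prefix.
token_matrix <- t(vapply(prefixes, tokenize_prefix, integer(max_len)))
token_matrix
```

The result is an integer matrix with one row per prefix, which is the shape an embedding layer typically expects as input.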

Calculate the vocabulary size, i.e. the sum of number of activities, outcome labels and padding keys

Description

Calculate the vocabulary size, i.e. the sum of number of activities, outcome labels and padding keys

Usage

vocab_size(processed_df)

Arguments

processed_df

A processed dataset of class ppred_examples_df from prepare_examples().

Value

An integer: the vocabulary size used to define the Transformer model.

Examples

library(processpredictR)
library(eventdataR)
df <- prepare_examples(patients)
vocab_size(df)
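The sum described above can be worked through on toy data. The sketch below builds a small vocabulary in base R and adds up its parts. The padding keys "PAD" and "UNK" and all labels are hypothetical, and the package may use different keys or a different number of them.

```r
# Toy activity and outcome labels (hypothetical data).
activities <- c("Registration", "Triage", "X-Ray", "Triage", "Discuss Results")
outcomes   <- c("Check-out", "Discuss Results", "Check-out")

# Hypothetical padding keys; the package may define others.
padding_keys <- c("PAD", "UNK")

keys_x <- c(padding_keys, sort(unique(activities)))  # padding + activities
keys_y <- sort(unique(outcomes))                     # outcome labels

# Vocabulary size as the sum of activities, outcome labels and padding keys.
n_activities <- length(unique(activities))
n_outcomes   <- length(keys_y)
size <- n_activities + n_outcomes + length(padding_keys)
size
```

Here the 4 distinct activities, 2 outcome labels and 2 padding keys give a vocabulary size of 8.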
