Type: | Package |
Title: | Multi-Criteria Design of Experiments for Optimal Design |
Version: | 0.9.4 |
Maintainer: | Andrea Melloncelli <andrea.melloncelli@vanlog.it> |
Description: | Multi-criteria design of experiments algorithm that simultaneously optimizes up to six different criteria ('I', 'Id', 'D', 'Ds', 'A' and 'As'). The algorithm finds the optimal Pareto front and, if requested, selects a possible symmetrical design on it. The symmetrical design is selected based on two techniques: minimum distance with the Utopia point or the 'TOPSIS' approach. |
License: | CC BY 4.0 |
URL: | https://github.com/andreamelloncelli/multiDoE |
Depends: | R (≥ 3.5.0) |
Imports: | ggplot2, magrittr, plotly, pracma |
Suggests: | devtools, here, knitr, logr, markdown, rgl, rmarkdown, testthat (≥ 3.0.0), usethis |
VignetteBuilder: | knitr |
Config/testthat/edition: | 3 |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | no |
Packaged: | 2025-04-02 17:06:07 UTC; mello |
Author: | Francesca Cucchi [aut], Andrea Melloncelli [aut, cre], Francesco Sambo [aut], Kalliopi Mylona [aut], Matteo Borrotti [aut] |
Repository: | CRAN |
Date/Publication: | 2025-04-02 21:50:06 UTC |
Pipe operator
Description
See magrittr::%>%
for details.
Usage
lhs %>% rhs
Arguments
lhs |
A value or the magrittr placeholder. |
rhs |
A function call using the magrittr semantics. |
Value
The result of calling 'rhs(lhs)'.
Experimental setup
Description
The MSOpt
function allows the user to define the
structure of the experiment, the set of optimization criteria and the a priori
model to be considered. The output is a list containing all information about
the settings of the experiment. According to the declared criteria, the list
also contains the basic matrices for their implementation, such as information
matrix, matrix of moments and matrix of weights. The returned list
is argument of the Score
and MSSearch
functions of the multiDoE
package.
Usage
MSOpt(facts, units, levels, etas, criteria, model)
Arguments
facts |
A list of vectors representing the distribution of factors
across strata. Each item in the list represents a stratum and the first item
is the highest stratum of the multi-stratum structure of the experiment.
Within the vectors, experimental factors are indicated by progressive integer
from |
units |
A list whose |
levels |
A vector containing the number of available levels for each
experimental factor in |
etas |
A list specifying the ratios of error variance between subsequent
strata. It follows that |
criteria |
A list specifying the criteria to be optimized. It can contain any combination of:
More detailed information on the available criteria is given under Details. |
model |
A string which indicates the type of model, among “main", “interaction" and “quadratic". |
Details
A little notation is introduced to show the criteria that can be
used in the multi-objective approach of the multiDoE
package.
For an experiment with N
runs and s
strata, with stratum i
having n_i
units within each unit at stratum i-1
and
stratum 0 being defined as the entire experiment (n_0 = 1
), the
general form of the model can be written as:
y = X\beta + \sum\limits_{i = 1}^{s} Z_i\varepsilon_i
where
y
is an N
-dimensional vector of responses (N = \prod_{j = 1}^{s}n_j
),
X
is an N
by p
model matrix,
\beta
is a p
-dimensional vector containing the p
fixed model parameters,
Z_i
is an N
by b_i
indicator matrix of 0
and
1
for the units in stratum i
(i.e. the (k,l
)th element
of Z_i
is 1
if the k
th run belongs to the l
th
block in stratum i
and 0
otherwise) and
b_i = \prod_{j = 1}^{i}n_j
.
Finally, the vector \varepsilon_i \sim N(0,\sigma_i^2I_{b_i})
is a b_i
-dimensional vector
containing the random effects, which are all uncorrelated. The variance
components \sigma^{2}_{i} (i = 1, \dots, s)
have to be estimated and this is usually done using
the REML (REstricted Maximum Likelihood) method.
The best linear unbiased estimator for the parameter vector \beta
is
the generalized least square estimator:
\hat{\beta}_{GLS} = (X'V^{-1}X)^{-1}X'V^{-1}y
This estimator has variance-covariance matrix:
Var(\hat{\beta}_{GLS}) = \sigma^{2}(X'V^{-1}X)^{-1}
where
V = \sum\limits_{i = 1}^{s}\eta_i Z_i'Zi
,
\eta_i = \frac{\sigma_i^{2}}{\sigma^{2}}
and
\sigma^{2} = \sigma^{2}_{s}
.
Let M = (X' V^{-1} X)
be the information matrix about \hat{\beta}
and let \eta
be the vector of the variance components.
-
D-optimality. It is based on minimizing the generalized variance of the parameter estimates. This can be done either by minimizing the determinant of the variance-covariance matrix of the factor effects' estimates or by maximizing the determinant of
M
.
The objective function to be minimized is:f_{D}(d; \eta) = \left(\frac{1}{\det(M)}\right)^{1/p}
where
d
is the design with information matrixM
andp
is the number of model parameters. -
A-optimality. This criterion is based on minimizing the average variance of the estimates of the regression coefficients. The sum of the variances of the parameter estimates (elements of
\hat{\beta}
) is taken as a measure, which is equivalent to considering the trace ofM^{-1}
.
The objective function to be minimized is:f_{A}(d; \eta) = trace(M^{-1})/p
where
d
is the design with information matrixM
andp
is the number of model parameters. -
I-optimality. It seeks to minimize the average prediction variance.
The objective function to be minimized is:f_{I}(d; \eta) = \frac{\int_{\chi} f'(x)(M)^{-1}f(x)\,dx }{\int_{\chi} \,dx}
where
d
is the design with information matrixM
and\chi
represents the design region.
It can be proved that when there arek
treatment factors each with two levels, so that the experimental region is of the form[-1, +1]^{k}
, the objective function can also be written as:f_{I}(d; \eta) = trace \left[(M)^{-1} B\right]
where
d
is the design with information matrixM
andB = 2^{-k} \int_{\chi}f'(x)f(x) \,dx
is the moments matrix. To know the implemented expression for calculating the moments matrix for a cuboidal design region see section 2.3 of Sambo, Borrotti, Mylona, and Gilmour (2016). -
Ds-optimality. Its aim is to minimize the generalized variance of the parameter estimates by excluding the intercept from the set of parameters of interest. Let
\beta_i
be the model parameter vector of dimension (p_i - 1
) to be estimated in stratumi
. LetX_i
be the associated model matrixm_i
by(p_i-1)
, wherem_i
is the number of units in stratumi
. The partition of interest of the matrix of variances and covariances of\hat{\beta}_i
is(M_i^{-1})_{22} = [X'_i (I - \frac{1}{m_i} 11^{'})X_i]^{-1}
The objective function to be minimized is:f_{D_s}(d; \eta) = (|(M_i^{-1})_{22}|)^{1/(p_i-1)}
As-optimality. This criterion is based on minimizing the average variance of the estimates of the regression coefficients by excluding the intercept from the set of parameters of interest.
With reference to the notation introduced for the previous criterion, the objective function to be minimized is:f_{A_s}(d; \eta) = trace(W_i(M_i^{-1})_{22})
where
W_i
is a diagonal matrix of weights, with the weights scaled so that the trace ofW_i
is equal to 1. Specifically the implemented matrix assigns to each main effect and each interaction effect an absolute weight equal to 1, while to the quadratic effects it assigns an absolute weight equal to 1/4.-
Id-optimality. It seeks to minimize the average prediction variance by excluding the intercept from the set of parameters of interest.
The objective function to be minimized is the same as the I-optimality criterion where the first row and columns of theB
matrix (see the Id-optimality criterion) are deleted.
Value
MSOpt
returns a list containing the following components:
facts
: The argumentfacts
.nfacts
: An integer. The number of experimental factors (blocking factors are excluded from the count).nstrat
: An integer. The number of strata.units
: The argumentunits
.runs
: An integer. The total number of runs.etas
: The argumentetas
.avlev
: A list showing the available levels for each experimental factor. The design space for each factor is [-1, 1].levs
: A vector showing the number of available levels for each experimental factor.Vinv
: The inverse of the variance-covariance matrix of the responses.model
: The argumentmodel
.crit
: The argumentcriteria
.ncrit
: An integer. The number of criteria considered.M
: The moment matrix. Only with I-optimality criteria.M0
: The moment matrix. Only with Id-optimality criteria.W
: The diagonal matrix of weights. Only with As-optimality criteria.
References
M. Borrotti and F. Sambo and K. Mylona and S. Gilmour. A multi-objective coordinate-exchange two-phase local search algorithm for multi-stratum experiments. Statistics & Computing, 2017.
S. G. Gilmour, J. M. Pardo, L. A. Trinca, K. Niranjan, D.S. Mottram. A split-plot response surface design for improving aroma retention in freeze dried coffee. In: Proceedings of the 6th. European conference on Food-Industry Statist, 2000.
Examples
## This example uses MSOpt to setup a split-plot design with
## 1 whole-plot factor and 4 subplot factors, which in the \code{facts}
## element appear numbered from 2 to 5.
## The experiment must be structured as follows: 6 whole plots and 5 subplots
## per whole plot, for a total of 30 runs.
## Each experimental factor has 3 different levels.
## To check the number of digits to be printed.
backup_options <- options()
options(digits = 10)
facts <- list(1, 2:5)
units <- list(6, 5)
levels <- 3
etas <- list(1)
criteria <- c('I', 'D', 'A')
model <- "quadratic"
msopt <- MSOpt(facts, units, levels, etas, criteria, model)
options(backup_options)
Local search algorithm for high quality design generation
Description
The MSSearch
function can be used to obtain an optimal
multi-stratum experimental design considering one or more optimality criteria,
up to a maximum of six criteria simultaneously.
This function implements the procedure MS-Opt proposed by Sambo, Borrotti,
Mylona e Gilmour (2016) as an extension of the Coordinate-Exchange (CE)
algorithm for constructing approximately optimal designs. This innovative
procedure is able to handle all possible multi-stratum experimental structures
and, instead of minimizing a single objective function as in the original CE
algorithm, it seeks to minimize the following scalarization of the objective
functions for all considered criteria:
f_W = \sum_{c \in C}{\alpha_c f_c(d; \eta)=\overline{\alpha} \cdot \overline{f}},
with
\sum_{c \in C} \alpha_c = 1,
where C
is the set of criteria to be minimized, f_c
is the
objective function for the c
criterion and \overline{\alpha}
is
the vector that controls the relative weights of the objective functions.
Usage
MSSearch(msopt, alpha, ...)
Arguments
msopt |
A list as returned by the |
alpha |
A vector of weights, whose elements must sum to one.
|
... |
optional arguments (see Details). |
Details
MSSearch
by default does not apply any normalization to the
individual objective functions f_c
before the calculation of f_w is
performed. However, it is possible to subject the vector of objective functions
\overline{f}
to the following transformation:
\overline{f}_{norm} = \frac{\overline{f} - CritTR}{CritSC},
by specifying CritTR
and CritSC
vectors as additional parameters,
as described below.
Additional arguments can be specified as follows:
-
'Start', sol
: A string and a matrix, used in pair. They provide a starting solution (or an initial design) to the algorithm. By default the initial solution is randomly generated following the SampleDesign() procedure described in Sambo, Borrotti, Mylona and Gilmour (2016). -
'Restarts', r
: A string and an integer, used in pair. Whenr=1
, the default value, the procedure implemented inMSSearch
results in a local search algorithm that optimizes the objective functionf_W
starting from one initial design in the design space. These parameters allows to restart the algorithmr
times. If no initial design is passed a different starting solution is generated for each iteration, letting the probability to find a global minimum be higher.Mssearch
returns the solution that minimizesf_W
across all ther
iterations. -
'Normalize', c(CritTR, CritSC)
: A string and a vector, used in pair. By specifying theCritTR
andCritSC
vectors, the user can establish the normalization factors to be applied to each objective function before evaluatingf_W
.CritTR
andCritSC
are vectors of length equal to the number of criteria, whose default elements are 0 and 1 respectively.
Value
MSSearch
returns a list, whose elements are:
optsol
: A design matrix. The best solution found.optscore
: A vector containing the criteria scores foroptsol
.feval
: An integer representing the number of score function evaluations (number off_W
evaluations over all iterations).trend
: A vector of lengthr
. Thei
-th element is the value that minimizesf_W
for thei
-th iteration.
References
M. Borrotti and F. Sambo and K. Mylona and S. Gilmour. A multi-objective coordinate-exchange two-phase local search algorithm for multi-stratum experiments. Statistics & Computing, 2016.
Examples
library(multiDoE)
## To check the number of digits to be printed.
backup_options <- options()
options(digits = 10)
## Definition of parameters for experimental setup
facts <- list(1, 2:5)
units <- list(21, 2)
level <- 3
etas <- list(1)
model2 <- "quadratic"
## Single-objective optimization
criteria_S <- c('I')
msopt_S <- MSOpt(facts, units, level, etas, criteria_S, model2)
mssearch_S <- MSSearch(msopt_S, alpha = 1, "Restarts", 100)
## Multi-objective optimization
criteria_M <- c('Id', 'Ds', 'As')
msopt_M <- MSOpt(facts, units, level, etas, criteria_M, model2)
mssearch_M <- MSSearch(msopt_M, alpha = c(1/2, 1/4, 1/4), "Restarts", 100)
options(backup_options)
## To reduce the computational cost of MSSearch function, you may reduce the number of restarts.
Criteria values for design matrix
Description
The Score
function returns the optimization criteria values for a
given MSOpt
list and design matrix.
Usage
Score(msopt, settings)
Arguments
msopt |
A list as returned by the function MSOpt. |
settings |
The design matrix for which criteria scores have to be calculated. |
Value
The data frame of criteria and scores.
Selection of a symmetrical design on the Pareto Front based on the utopian point
Description
The optMultiCrit
function provides an objective criterion
for the selection of the best experimental design among all Pareto front solutions.
The selection is based on minimizing the euclidean distance in the criteria space
between all the Pareto front points and an approximate utopian point. By default,
the coordinates of the utopian point correspond to the minimum value reached
by each criterion during the runTPLS
optimization procedure.
Alternatively, the utopian point can be chosen by the user.
Usage
optMultiCrit(ar, ...)
Arguments
ar |
A list as the |
... |
optional argument (see below). |
Details
Additional arguments can be specified as follows:
-
myUtopianPoint
: A vector containing the utopian point coordinates.
Value
The optMultiCrit
function returns a list whose elements are:
-
solution
: The selected optimal design matrix. -
score
: A vector containing the criteria scores forsolution
.
Selection of the best design from the Pareto Front for each criterion
Description
The optSingleCrit
function selects from the Pareto front
those designs that minimize the criteria when considered individually.
Usage
optSingleCrit(ar)
Arguments
ar |
A list as the |
Value
A list whose i
-th element corresponds to the solution that optimizes
the i
-th criterion. Every solution is a list of two elements:
-
score
: Scores vector. -
solution
: The design matrix.
Graphical representation of the Pareto Front
Description
plotPareto
returns a graphical representation (at most 3D)
of the Pareto front.
Usage
plotPareto(ar, x, y, z = NULL, mode = TRUE)
Arguments
ar |
A list as the |
x |
The criterion on the x axis. It can be one of the following: |
y |
The criterion on the y axis. It can be one of the following: |
z |
The criterion on the z axis. It can be one of the following: |
mode |
When |
Value
The Pareto front chart.
Multi-Stratum Two-Phase Local Search (MS-TPLS) Algorithm
Description
This function implements the Multi-Stratum Two-Phase Local
Search (MS-TPLS) algorithm described in Borrotti, Sambo, Mylona and Gilmour
(2016). This algorithm is useful to obtain exact optimal multi-stratum
designs through a multi-criteria approach. When using runTPLS the user must
establish the search problem (structure of the experiment, number of trials,
optimization criteria, etc.) and the total number of iterations of MS-TPLS.
The resulting experimental designs can minimize up to six criteria simultaneously
from the following: "I", "Id", "D", "Ds", "A" and "As". runTPLS
is able
to provide the set of solutions building the approximate Pareto front for
the specified optimization problem.
Usage
runTPLS(facts, units, criteria, model, iters, ...)
Arguments
facts |
A list of vectors representing the distribution of factors across strata. Each item in the list represents a stratum and the first item is the highest stratum of the multi-stratum structure of the experiment. Within the vectors, experimental factors are indicated by progressive integer from 1 (the first factor of the highest stratum) to the total number of experimental factors (the last factor of the lowest stratum). Blocking factors are differently denoted by empty vectors. |
units |
A list whose |
criteria |
A list specifying the criteria to be optimized. It can contain any combination of:
More detailed information on the available criteria is given in
|
model |
A string which indicates the type of model, among “main", “interaction" and “quadratic". |
iters |
An integer indicating the number of iterations of the MS-TPLS algorithm. |
... |
optional arguments (see below). |
Details
Additional arguments can be specified as follows:
-
'Restarts', restarts
: A string and an integer, used in pair.r
defines the number of times the MS-Opt procedure is altogether called within each iteration of the MS-TPLS algorithm. The default value isr=100
. -
'Levels', levels
: A string and a vector, used in pair.levels
is a vector containing the number of available levels for each experimental factor in the argumentfacts
(blocking factors are excluded). If all experimental factors share the number of levels one integer is sufficient. -
'Etas', etas
: A string and a list, used in pair. Inetas
the user must specify the ratios of error variance between subsequent strata, starting from the highest strata. It follows thatlength(etas)
must be equal tolength(facts)-1
. -
'RestInit', restInit
: A string and an integer, used in pair. Through these parameters, it is possible to determine how many of ther
iterations of MS-Opt should be used for each criterion in the first step of the MS-TPLS algorithm (lines 3-6 of the pseudo-code of MS-TPLS, see Borrotti, Sambo, Mylona and Gilmour (2017)). The default value isrestInit=50
. Letn
be the number of criteria under consideration. One can calculate accordingly asr - (n * restInit)
the number of times MS-Opt is called in the second step (lines 7-11 of the pseudo-code of MS-TPLS) of each iteration of MS-TPLS. -
'RngSeed', rngSeed
: A number indicating the seed for reproducibility. Default is to leave the random number generator alone.
Value
runTPLS
returns a list, whose elements are:
-
ar
: A list of length equal toiters
. Thei
-th element is a list whose elements are:-
nsols
: Number of designs produced during thei
-th iteration. -
dim
: The criteria space dimension. -
scores
: A matrix ofnsols
rows anddim
columns. Every row contains the value of the criteria for each solution of thei
-th iteration. -
solutions
: A list of length equal tonsols
containing the design matrices produced during thei
-th iteration. The values of the criteria corresponding at the first element ofsolutions
are placed in the first row of thescores
matrix and so on.
-
-
stats
: A list of length equal toiters
. Every element is a vector of sizer - (n * restInit) + 1
, wheren
is the number of the considered criteria. The first element represents the number of function evaluations during the first step of the MS-TPLS algorithm; thei
-th element (excluding the first one) is the sum of the number of evaluations for thei
-th scalarization and the maximum value in thestats
. -
megaAR
: A list whose elements are:-
nsols
: The number of the Pareto front solutions. -
dim
: The criteria space dimension. -
scores
: A matrix ofnsols
rows anddim
columns. Every row contains the criteria values for each Pareto front design. -
solutions
: A list of length equal tonsols
containing the design matrices for the Pareto front designs. The values of the criteria corresponding at the first element ofsolutions
are placed in the first row of thescores
matrix and so on.
-
References
M. Borrotti and F. Sambo and K. Mylona and S. Gilmour. A multi-objective coordinate-exchange two-phase local search algorithm for multi-stratum experiments. Statistics & Computing, 2017.
Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS)
Description
This function implements Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS).
This approach is based on the principle that the best solutions must be near to a positive ideal
solution (I+)
and far from a negative ideal solution (I-)
in the
criteria space. The weighted distance measure used to detect these similarities
allows the user to possibly assign different importance to the criteria considered.
The distance measure used is:
L_p(a,b) = \left[ \sum_{j=1}^{m}(w_j)^p(|a-b|)^p\right] ^(1/p)
The metric on the basis of which solution ranking occurs is:
S(x) = \frac{L_p(x,I-)}{(L_p(x,I+) + L_p(x,I-)}
Usage
topsisOpt(out, w = NULL, p = 2)
Arguments
out |
A list as the |
w |
A vector of weights. It must sum to 1. The default wights are uniform. |
p |
A coefficient. It determines the type of distance used. The default value is 2. |
Value
The function returns a list containing the following items:
ranking
: A dataframe containing the ranking values of S(x) and the ordered indexes according to the TOPSIS approach (from the best to the worst).bestScore
: The scores of the best solution.bestSol
: The best solution.
References
M. Méndez, M. Frutos, F. Miguel and R. Aguasca-Colomo. TOPSIS Decision on Approximate Pareto Fronts by Using Evolutionary Algorithms: Application to an Engineering Design Problem. Mathematics, 2020. https://www.mdpi.com/2227-7390/8/11/2072