Type: Package
Title: Non-Smooth Regularization for Structural Equation Models
Version: 1.5.5
Maintainer: Jannik H. Orzek <jannik.orzek@mailbox.org>
Description: Provides regularized structural equation modeling (regularized SEM) with non-smooth penalty functions (e.g., lasso) building on 'lavaan'. The package is heavily inspired by the 'regsem' (https://github.com/Rjacobucci/regsem) and 'lslx' (https://github.com/psyphh/lslx) packages.
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
Encoding: UTF-8
RoxygenNote: 7.2.3
Depends: lavaan, methods
Imports: Rcpp (≥ 1.0.8), RcppArmadillo, RcppParallel, ggplot2, tidyr, stringr, numDeriv, utils, stats, graphics, rlang, mvtnorm
Suggests: knitr, plotly, rmarkdown, Rsolnp
LinkingTo: Rcpp, RcppArmadillo, RcppParallel
VignetteBuilder: knitr
SystemRequirements: GNU make, C++17
URL: https://github.com/jhorzek/lessSEM
BugReports: https://github.com/jhorzek/lessSEM/issues
NeedsCompilation: yes
Packaged: 2024-01-21 10:06:17 UTC; jannik
Author: Jannik H. Orzek
Repository: CRAN
Date/Publication: 2024-01-22 13:20:02 UTC
lessSEM
Description
Please see the vignettes and the README on GitHub for the most up-to-date description of the package.
Details
lessSEM (lessSEM estimates sparse SEM) is an R package for regularized structural equation modeling (regularized SEM) with non-smooth penalty functions (e.g., lasso) building on lavaan. lessSEM is heavily inspired by the regsem and lslx packages, which have similar functionality. If you use lessSEM, please also cite regsem and lslx!
The objectives of lessSEM are to provide
- a flexible framework for regularizing SEM and
- optimizers for other SEM packages which can be used with an interface similar to optim.
Important: Please also check out the implementations of regularized SEM in the more mature R packages regsem and lslx. Finally, you may want to check out the Julia package StructuralEquationModels.jl.
regsem, lslx, and lessSEM
The packages regsem, lslx, and lessSEM can all be used to regularize basic SEM. In fact, as outlined above, lessSEM is heavily inspired by regsem and lslx. However, the packages differ in their targets: the objective of lessSEM is not to replace the more established packages regsem and lslx. Instead, our objective is to provide method developers with a flexible framework for regularized SEM. The following is an incomplete comparison of some features implemented in the three packages:
|                               | regsem          | lslx                | lessSEM         |
| Model specification           | based on lavaan | similar to lavaan   | based on lavaan |
| Maximum likelihood estimation | Yes             | Yes                 | Yes             |
| Least squares estimation      | No              | Yes                 | No              |
| Confidence intervals          | No              | Yes                 | No              |
| Missing data                  | FIML            | auxiliary variables | FIML            |
| Multi-group models            | No              | Yes                 | Yes             |
| Stability selection           | Yes             | No                  | No              |
| Mixed penalties               | No              | No                  | Yes             |
| Equality constraints          | Yes             | No                  | Yes             |
| Parameter transformations     | diff_lasso      | No                  | Yes             |
| Definition variables          | No              | No                  | Yes             |
Because lessSEM is fairly new, we currently recommend using lslx for cases that are covered by both lessSEM and lslx.
Introduction
You will find a short introduction to regularized SEM with the lessSEM package in vignette('lessSEM', package = 'lessSEM'). More information is also provided in the documentation of the individual functions (e.g., see ?lessSEM::scad). Finally, you will find templates for a selection of models which can be used with lessSEM (e.g., the cross-lagged panel model) in the package lessTemplates.
Example
library(lessSEM)
library(lavaan)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# Regularization:

lsem <- lasso(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = c("l6", "l7", "l8", "l9", "l10",
                  "l11", "l12", "l13", "l14", "l15"),
  # in case of lasso and adaptive lasso, we can specify the number of lambda
  # values to use. lessSEM will automatically find lambda_max and fit
  # models for nLambda values between 0 and lambda_max. For the other
  # penalty functions, lambdas must be specified explicitly
  nLambdas = 50)

# use the plot-function to plot the regularized parameters:
plot(lsem)

# use the coef-function to show the estimates
coef(lsem)

# The best parameters can be extracted with:
coef(lsem, criterion = "AIC")
coef(lsem, criterion = "BIC")

# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]

# AIC and BIC for all tuning parameter configurations:
AIC(lsem)
BIC(lsem)

# cross-validation
cv <- cvLasso(lavaanModel = lavaanModel,
              regularized = c("l6", "l7", "l8", "l9", "l10",
                              "l11", "l12", "l13", "l14", "l15"),
              lambdas = seq(0, 1, .1),
              standardize = TRUE)

# get best model according to cross-validation:
coef(cv)

#### Advanced ###
# Switching the optimizer #
# Use the "method" argument to switch the optimizer. The control argument
# must also be changed to the corresponding function:
lsemIsta <- lasso(
  lavaanModel = lavaanModel,
  regularized = paste0("l", 6:15),
  nLambdas = 50,
  method = "ista",
  control = controlIsta(
    # Here, we can also specify that we want to use multiple cores:
    nCores = 2))

# Note: The results are basically identical:
lsemIsta@parameters - lsem@parameters
Transformations
lessSEM allows for parameter transformations which could, for instance, be used to test measurement invariance in longitudinal models (e.g., Liang et al., 2018; Bauer et al., 2020). A thorough introduction is provided in vignette('Parameter-transformations', package = 'lessSEM'). As an example, we will test measurement invariance in the PoliticalDemocracy data set.
library(lessSEM)
library(lavaan)

# we will use the PoliticalDemocracy from lavaan (see ?lavaan::sem)

model <- '
  # latent variable definitions
     ind60 =~ x1 + x2 + x3
     # assuming different loadings for different time points:
     dem60 =~ y1 + a1*y2 + b1*y3 + c1*y4
     dem65 =~ y5 + a2*y6 + b2*y7 + c2*y8

  # regressions
    dem60 ~ ind60
    dem65 ~ ind60 + dem60

  # residual correlations
    y1 ~~ y5
    y2 ~~ y4 + y6
    y3 ~~ y7
    y4 ~~ y8
    y6 ~~ y8
'

fit <- sem(model, data = PoliticalDemocracy)

# We will define a transformation which regularizes differences
# between loadings over time:

transformations <- "
// which parameters do we want to use?
parameters: a1, a2, b1, b2, c1, c2, delta_a2, delta_b2, delta_c2

// transformations:
a2 = a1 + delta_a2;
b2 = b1 + delta_b2;
c2 = c1 + delta_c2;
"

# setting delta_a2, delta_b2, or delta_c2 to zero implies measurement invariance
# for the respective parameters (a1, b1, c1)
lassoFit <- lasso(lavaanModel = fit,
                  # we want to regularize the differences between the parameters
                  regularized = c("delta_a2", "delta_b2", "delta_c2"),
                  nLambdas = 100,
                  # Our model modification must make use of the modifyModel - function:
                  modifyModel = modifyModel(transformations = transformations))
Finally, we can extract the best parameters:
coef(lassoFit, criterion = "BIC")
As all differences (delta_a2, delta_b2, and delta_c2) have been zeroed, we can assume measurement invariance.
Experimental Features
The following features are relatively new, and you may still experience bugs. Please be aware of this when using them.
From lessSEM to lavaan
lessSEM supports exporting specific models to lavaan. This can be very useful when plotting the final model. In our case, the best model is given by:
lambdaBest <- coef(lsem, criterion = "BIC")@tuningParameters$lambda
We can get the lavaan model with the parameters corresponding to those of the regularized model with lambda = lambdaBest as follows:
lavaanModel <- lessSEM2Lavaan(regularizedSEM = lsem, lambda = lambdaBest)
The result can be plotted with, for instance, semPlot:
library(semPlot)
semPaths(lavaanModel,
         what = "est",
         fade = FALSE)
Multi-Group Models and Definition Variables
lessSEM supports multi-group SEM and, to some degree, definition variables.
Regularized multi-group SEMs have been proposed by Huang (2018) and are implemented in lslx (Huang, 2020). Here, differences between groups are regularized. A detailed introduction can be found in vignette(topic = "Definition-Variables-and-Multi-Group-SEM", package = "lessSEM"). Therein it is also explained how multi-group SEMs can be used to implement definition variables (e.g., for latent growth curve models). A rough sketch of the group-difference approach follows below.
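To give a flavor of the approach, the following is a minimal, hypothetical sketch of regularizing group differences via the transformation mechanism shown above. The data objects (dataGroup1, dataGroup2) and all labels are made up; that a vector of lavaan models can be passed follows the internal documentation of .multiGroupSEMFromLavaan below, but the vignette above is the authoritative reference.

library(lessSEM)
library(lavaan)

# Hypothetical example: the same one-factor model is fitted in two groups.
# Loadings l2 and l3 get group-specific labels; the transformation below
# expresses the group-2 loadings as the group-1 loadings plus a difference.
modelGroup1 <- lavaan::sem("f =~ 1*y1 + l2_g1*y2 + l3_g1*y3",
                           data = dataGroup1)  # dataGroup1 is assumed to exist
modelGroup2 <- lavaan::sem("f =~ 1*y1 + l2_g2*y2 + l3_g2*y3",
                           data = dataGroup2)  # dataGroup2 is assumed to exist

transformations <- "
parameters: l2_g1, l3_g1, l2_g2, l3_g2, delta_l2, delta_l3
l2_g2 = l2_g1 + delta_l2;
l3_g2 = l3_g1 + delta_l3;
"

# Regularizing the differences shrinks the groups toward equal loadings:
lassoFit <- lasso(lavaanModel = c(modelGroup1, modelGroup2),
                  regularized = c("delta_l2", "delta_l3"),
                  nLambdas = 50,
                  modifyModel = modifyModel(transformations = transformations))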
Mixed Penalties
lessSEM allows for defining different penalties for different parts of the model. This feature is new and very experimental; please keep that in mind when using the procedure. A detailed introduction can be found in vignette(topic = "Mixed-Penalties", package = "lessSEM").
To provide a short example, we will regularize the loadings and the regression parameters of the Political Democracy data set with different penalties. The following script is adapted from ?lavaan::sem.
model <- '
  # latent variable definitions
     ind60 =~ x1 + x2 + x3 + c2*y2 + c3*y3 + c4*y4
     dem60 =~ y1 + y2 + y3 + y4
     dem65 =~ y5 + y6 + y7 + c*y8

  # regressions
    dem60 ~ r1*ind60
    dem65 ~ r2*ind60 + r3*dem60
'

lavaanModel <- sem(model, data = PoliticalDemocracy)

# Let's add a lasso penalty on the cross-loadings c2 - c4 and
# scad penalty on the regressions r1-r3
fitMp <- lavaanModel |>
  mixedPenalty() |>
  addLasso(regularized = c("c2", "c3", "c4"),
           lambdas = seq(0, 1, .1)) |>
  addScad(regularized = c("r1", "r2", "r3"),
          lambdas = seq(0, 1, .2),
          thetas = 3.7) |>
  fit()
The best model according to the BIC can be extracted with:
coef(fitMp, criterion = "BIC")
Optimizers
Currently, lessSEM has the following optimizers:
(variants of) iterative shrinkage and thresholding (e.g., Beck & Teboulle, 2009; Gong et al., 2013; Parikh & Boyd, 2013); optimization of cappedL1, lsp, scad, and mcp is based on Gong et al. (2013)
glmnet (Friedman et al., 2010; Yuan et al., 2012; Huang, 2020)
These optimizers are implemented based on the regCtsem package. Most importantly, all optimizers in lessSEM are available for other packages. There are three ways to use them, which are documented in vignette("General-Purpose-Optimization", package = "lessSEM").
In short, these are:

1. Using the R interface: all general-purpose implementations of the functions are called with the prefix "gp" (gpLasso, gpScad, ...). More information and examples can be found in the documentation of these functions (e.g., ?lessSEM::gpLasso, ?lessSEM::gpAdaptiveLasso, ?lessSEM::gpElasticNet). The interface is similar to the optim optimizers in R; see the sketch after this list.
2. Using Rcpp, we can pass C++ function pointers to the general-purpose optimizers gpLassoCpp, gpScadCpp, ... (e.g., ?lessSEM::gpLassoCpp).
3. All optimizers are implemented as C++ header-only files in lessSEM. Thus, they can be accessed from other packages using C++. The interface is similar to that of the ensmallen library. We have implemented a simple example for elastic net regularization of linear regressions in the lessLM package. You can also find more details on the general design of the optimizer interface in vignette("The-optimizer-interface", package = "lessSEM").
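As a hypothetical sketch of the R interface mentioned in item 1, the following regularizes a linear regression with gpLasso. The exact argument names (par, regularized, fn, lambdas), whether additional data arguments are forwarded to fn, and the @parameters slot are assumptions based on the internal documentation of .gpOptimizationInternal below; consult ?lessSEM::gpLasso for the authoritative interface.

library(lessSEM)

set.seed(123)
# hypothetical regression data; only the first two predictors have an effect:
X <- matrix(rnorm(100 * 5), 100, 5)
y <- X %*% c(1, .5, 0, 0, 0) + rnorm(100)

# fit function: takes the labeled parameter vector and returns a single value
sumSquaredError <- function(par, X, y) {
  sum((y - X %*% par)^2)
}

startingValues <- rep(0, 5)
names(startingValues) <- paste0("b", 1:5)

lassoPen <- gpLasso(par = startingValues,
                    regularized = paste0("b", 1:5),
                    fn = sumSquaredError,
                    lambdas = seq(0, 100, length.out = 20),
                    X = X, y = y)

# parameter estimates across the lambda grid (slot name assumed):
lassoPen@parameters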
References
R - Packages / Software
- lavaan: Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1–36. https://doi.org/10.18637/jss.v048.i02
- regsem: Jacobucci, R. (2017). regsem: Regularized Structural Equation Modeling. arXiv:1703.08489 [stat]. https://arxiv.org/abs/1703.08489
- lslx: Huang, P.-H. (2020). lslx: Semi-confirmatory structural equation modeling via penalized likelihood. Journal of Statistical Software, 93(7). https://doi.org/10.18637/jss.v093.i07
- fasta: Another implementation of the FISTA algorithm (Beck & Teboulle, 2009).
- ensmallen: Curtin, R. R., Edel, M., Prabhu, R. G., Basak, S., Lou, Z., & Sanderson, C. (2021). The ensmallen library for flexible numerical optimization. Journal of Machine Learning Research, 22, 1–6.
- regCtsem: Orzek, J. H., & Voelkle, M. C. (in press). Regularized continuous time structural equation models: A network perspective. Psychological Methods.
Regularized Structural Equation Modeling
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Huang, P.-H. (2018). A penalized likelihood method for multi-group structural equation modelling. British Journal of Mathematical and Statistical Psychology, 71(3), 499–522. https://doi.org/10.1111/bmsp.12130
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
Penalty Functions
Candès, E. J., Wakin, M. B., & Boyd, S. P. (2008). Enhancing Sparsity by Reweighted l1 Minimization. Journal of Fourier Analysis and Applications, 14(5–6), 877–905. https://doi.org/10.1007/s00041-008-9045-x
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360. https://doi.org/10.1198/016214501753382273
Hoerl, A. E., & Kennard, R. W. (1970). Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12(1), 55–67. https://doi.org/10.1080/00401706.1970.10488634
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2), 894–942. https://doi.org/10.1214/09-AOS729
Zhang, T. (2010). Analysis of Multi-stage Convex Relaxation for Sparse Regularization. Journal of Machine Learning Research, 11, 1081–1107.
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429. https://doi.org/10.1198/016214506000000735
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
Optimizer
GLMNET
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
Variants of ISTA
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A general iterative shrinkage and thresholding algorithm for non-convex regularized optimization problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Miscellaneous
Liang, X., Yang, Y., & Huang, J. (2018). Evaluation of structural relationships in autoregressive cross-lagged models under longitudinal approximate invariance: A Bayesian analysis. Structural Equation Modeling: A Multidisciplinary Journal, 25(4), 558–572. https://doi.org/10.1080/10705511.2017.1410706
Bauer, D. J., Belzak, W. C. M., & Cole, V. T. (2020). Simplifying the Assessment of Measurement Invariance over Multiple Background Variables: Using Regularized Moderated Nonlinear Factor Analysis to Detect Differential Item Functioning. Structural Equation Modeling: A Multidisciplinary Journal, 27(1), 43–55. https://doi.org/10.1080/10705511.2019.1642754
Important Notes
THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Author(s)
Jannik Orzek orzek@mpib-berlin.mpg.de
See Also
Useful links:
- https://github.com/jhorzek/lessSEM
- Report bugs at https://github.com/jhorzek/lessSEM/issues
.SEMFromLavaan
Description
internal function. Translates an object of class lavaan to the internal model representation.
Usage
.SEMFromLavaan(
lavaanModel,
whichPars = "est",
fit = TRUE,
addMeans = TRUE,
activeSet = NULL,
dataSet = NULL,
transformations = NULL,
transformationList = list(),
transformationGradientStepSize = 1e-06
)
Arguments
lavaanModel |
model of class lavaan |
whichPars |
which parameters should be used to initialize the model. If set to "est", the parameters will be set to the estimated parameters of the lavaan model. If set to "start", the starting values of lavaan will be used. The latter can be useful if parameters are to be optimized afterwards as setting the parameters to "est" may result in the model getting stuck in a local minimum. |
fit |
should the model be fitted and compared to the lavaanModel? |
addMeans |
If lavaanModel has meanstructure = FALSE, addMeans = TRUE will add a mean structure. FALSE will set the means of the observed variables to their averages. |
activeSet |
Option to only use a subset of the individuals in the data set. Logical vector of length N indicating which subjects should remain in the sample. |
dataSet |
optional: Pass an alternative data set to lessSEM:::.SEMFromLavaan which will replace the original data set in lavaanModel. |
transformationGradientStepSize |
step size used to compute the gradients of the transformations |
Value
Object of class Rcpp_SEMCpp
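For orientation, a minimal sketch of how this internal representation can be created, fitted, and inspected, using only functions documented in this manual (.SEMFromLavaan, .fit, .getParameters). As internal functions, they must be accessed with lessSEM::: and may change without notice; the model syntax is made up for illustration.

library(lessSEM)
library(lavaan)

lavaanModel <- lavaan::sem("f =~ y1 + y2 + y3",
                           data = simulateExampleData(),
                           meanstructure = TRUE)

# translate the lavaan model to the internal representation ...
SEM <- lessSEM:::.SEMFromLavaan(lavaanModel = lavaanModel)

# ... fit it ...
SEM <- lessSEM:::.fit(SEM)

# ... and inspect the (back-transformed) parameter estimates:
lessSEM:::.getParameters(SEM, raw = FALSE)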
.SEMdata
Description
internal function. Creates internal data representation
Usage
.SEMdata(rawData)
Arguments
rawData |
matrix with raw data set |
Value
list with internal representation of data
.SEMdataWLS
Description
internal function. Creates internal data representation
Usage
.SEMdataWLS(rawData, lavaanModel)
Arguments
rawData |
matrix with raw data set |
lavaanModel |
lavaan model |
Value
list with internal representation of data
.adaptBreakingForWls
Description
WLS needs smaller breaking points than ML
Usage
.adaptBreakingForWls(lavaanModel, currentBreaking, selectedDefault)
Arguments
lavaanModel |
single model or vector of models |
currentBreaking |
current breaking condition value |
selectedDefault |
was default breaking condition selected? |
Value
updated breaking condition value
.addMeanStructure
Description
adds a mean structure to the parameter table
Usage
.addMeanStructure(parameterTable, manifestNames, MvectorElements)
Arguments
parameterTable |
table with parameters |
manifestNames |
names of manifest variables |
MvectorElements |
elements of the means vector |
Value
parameterTable
.checkLavaanModel
Description
checks model of type lavaan
Usage
.checkLavaanModel(lavaanModel)
Arguments
lavaanModel |
model of class lavaan |
Value
nothing
.checkPenalties
Description
Internal function to check a mixedPenalty object
Usage
.checkPenalties(mixedPenalty)
Arguments
mixedPenalty |
object of class mixedPenalty. This object can be created with the mixedPenalty function. Penalties can be added with the addCappedL1, addLasso, addLsp, addMcp, and addScad functions. |
.compileTransformations
Description
compiles user-defined parameter transformations to pass to a SEM
Usage
.compileTransformations(syntax, parameterLabels, compile = TRUE, notes = NULL)
Arguments
syntax |
string with user defined transformations |
parameterLabels |
names of parameters in the model |
compile |
if set to FALSE, the function will not be compiled; useful for visual inspection |
notes |
option to pass notes to the function. All notes of the current function will be added |
Value
list with parameter names and two Rcpp functions: (1) the transformation function and (2) a function to create a pointer to the transformation function. If starting values were defined, these are returned as well.
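A small sketch of a call, reusing the transformation syntax style from the examples above. That compile = FALSE returns the generated code for inspection rather than compiling it is inferred from the argument description; as an internal function, it is accessed with lessSEM:::.

# hypothetical transformation: a2 is a1 plus a regularizable difference
transformations <- "
parameters: a1, a2, delta_a2
a2 = a1 + delta_a2;
"

# inspect the generated transformation function without compiling it:
lessSEM:::.compileTransformations(syntax = transformations,
                                  parameterLabels = c("a1", "a2"),
                                  compile = FALSE)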
.computeInitialHessian
Description
computes the initial Hessian used in the optimization. Because we use the parameter estimates from lavaan as starting values, it typically makes sense to just use the Hessian of the lavaan model as initial Hessian
Usage
.computeInitialHessian(
initialHessian,
rawParameters,
lavaanModel,
SEM,
addMeans,
stepSize,
notes = NULL
)
Arguments
initialHessian |
option to provide an initial Hessian to the optimizer. Must have row and column names corresponding to the parameter labels. Use getLavaanParameters(lavaanModel) to see those labels. If set to "scoreBased", the outer product of the scores will be used as an approximation (see https://en.wikipedia.org/wiki/Berndt%E2%80%93Hall%E2%80%93Hall%E2%80%93Hausman_algorithm). If set to "compute", the initial Hessian will be computed. If set to a single value, a diagonal matrix with the single value along the diagonal will be used. The default is "lavaan", which extracts the Hessian from the lavaanModel. This Hessian will typically deviate from that of the internal SEM representation of lessSEM (due to the transformation of the variances), but works quite well in practice. |
rawParameters |
vector with raw parameters |
lavaanModel |
lavaan model object |
SEM |
internal SEM representation |
addMeans |
should a mean structure be added to the model? |
stepSize |
initial step size |
notes |
option to pass notes to the function. All notes of the current function will be added |
Value
Hessian matrix and notes
.createMultiGroupTransformations
Description
compiles the transformation function and adapts the parameter vector
Usage
.createMultiGroupTransformations(transformations, parameterValues)
Arguments
transformations |
string with transformations |
parameterValues |
values of parameters already in the model |
Value
list with extended parameter vector and transformation function pointer
.createParameterTable
Description
create a parameter table using the elements extracted from lavaan
Usage
.createParameterTable(
parameterValues,
parameterLabels,
modelParameters,
parameterIDs
)
Arguments
parameterValues |
values of parameters |
parameterLabels |
names of the parameters |
modelParameters |
model parameters from lavaan |
parameterIDs |
unique parameter IDs from lavaan; each parameter is identified with a unique number |
Value
parameter table for lessSEM
.createRcppTransformationFunction
Description
create an Rcpp function which uses the user-defined parameter transformation
Usage
.createRcppTransformationFunction(syntax, parameters)
Arguments
syntax |
syntax with user defined transformations |
parameters |
labels of parameters used in these transformations |
Value
string with functions for compilations with Rcpp
.createTransformations
Description
compiles the transformation function and adapts the parameterTable
Usage
.createTransformations(transformations, parameterLabels, parameterTable)
Arguments
transformations |
string with transformations |
parameterLabels |
labels of parameters already in the model |
parameterTable |
existing parameter table |
Value
list with parameterTable and transformation function pointer
.cvRegularizeSEMInternal
Description
Combination of regularized structural equation model and cross-validation
Usage
.cvRegularizeSEMInternal(
lavaanModel,
k,
standardize,
penalty,
weights,
returnSubsetParameters,
tuningParameters,
method,
modifyModel,
control
)
Arguments
lavaanModel |
model of class lavaan |
k |
the number of cross-validation folds. Alternatively, a matrix with pre-defined subsets can be passed to the function. See ?lessSEM::cvLasso for an example |
standardize |
should training and test sets be standardized? |
penalty |
string: name of the penalty used in the model |
weights |
labeled vector with weights for each of the parameters in the model. |
returnSubsetParameters |
if set to TRUE, the parameter estimates of the individual cross-validation training sets will be returned |
tuningParameters |
data.frame with tuning parameter values |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlIsta() and controlGlmnet() functions. |
Details
Internal function: This function computes the regularized models for all penalty functions which are implemented for glmnet and gist. Use the dedicated penalty functions (e.g., lessSEM::cvLasso) to penalize the model.
Value
model of class cvRegularizedSEM
.cvRegularizeSmoothSEMInternal
Description
Combination of smoothly regularized structural equation model and cross-validation
Usage
.cvRegularizeSmoothSEMInternal(
lavaanModel,
k,
standardize,
penalty,
weights,
returnSubsetParameters,
tuningParameters,
epsilon,
modifyModel,
method = "bfgs",
control
)
Arguments
lavaanModel |
model of class lavaan |
k |
the number of cross-validation folds. Alternatively, a matrix with pre-defined subsets can be passed to the function. See ?lessSEM::cvSmoothLasso for an example |
standardize |
should training and test sets be standardized? |
penalty |
string: name of the penalty used in the model |
weights |
labeled vector with weights for each of the parameters in the model. |
returnSubsetParameters |
if set to TRUE, the parameter estimates of the individual cross-validation training sets will be returned |
tuningParameters |
data.frame with tuning parameter values |
epsilon |
epsilon > 0; controls the smoothness of the approximation. Larger values = smoother |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
optimizer used. Currently only "bfgs" is supported. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
Details
Internal function: This function computes the regularized models for all penalty functions which are implemented for bfgs. Use the dedicated penalty functions (e.g., lessSEM::cvSmoothLasso) to penalize the model.
Value
model of class cvRegularizedSEM
.cvregsem2LavaanParameters
Description
helper function: regsem and lavaan use slightly different parameter labels. This function can be used to translate the parameter labels of a cv_regsem object to lavaan labels
Usage
.cvregsem2LavaanParameters(cvregsemModel, lavaanModel)
Arguments
cvregsemModel |
model of class cvregsem |
lavaanModel |
model of class lavaan |
Value
regsem parameters with lavaan labels
.defineDerivatives
Description
adds all elements required to compute the derivatives of the fitting function with respect to the parameters to the SEMList
Usage
.defineDerivatives(SEMList, parameterTable, modelMatrices)
Arguments
SEMList |
list representing SEM |
parameterTable |
table with parameters |
modelMatrices |
matrices of the RAM model |
Value
SEMList
.extractParametersFromSyntax
Description
extract the names of the parameters in a syntax
Usage
.extractParametersFromSyntax(syntax, parameterLabels)
Arguments
syntax |
syntax for parameter transformations |
parameterLabels |
names of parameters in the model |
Value
vector with names of parameters used in the syntax and vector with boolean indicating if parameter is transformation result
.extractSEMFromLavaan
Description
internal function. Translates an object of class lavaan to the internal model representation.
Usage
.extractSEMFromLavaan(
lavaanModel,
whichPars = "est",
fit = TRUE,
addMeans = TRUE,
activeSet = NULL,
dataSet = NULL,
transformations = NULL
)
Arguments
lavaanModel |
model of class lavaan |
whichPars |
which parameters should be used to initialize the model. If set to "est", the parameters will be set to the estimated parameters of the lavaan model. If set to "start", the starting values of lavaan will be used. The latter can be useful if parameters are to be optimized afterwards as setting the parameters to "est" may result in the model getting stuck in a local minimum. |
fit |
should the model be fitted and compared to the lavaanModel? |
addMeans |
If lavaanModel has meanstructure = FALSE, addMeans = TRUE will add a mean structure. FALSE will set the means of the observed variables to their averages. |
activeSet |
Option to only use a subset of the individuals in the data set. Logical vector of length N indicating which subjects should remain in the sample. |
dataSet |
optional: Pass an alternative data set to lessSEM:::.SEMFromLavaan which will replace the original data set in lavaanModel. |
transformations |
optional: transform parameter values. |
Value
list with SEMList (model in RAM representation) and fit (boolean indicating if the model should be fit and compared to lavaan)
.fit
Description
fits an object of class Rcpp_SEMCpp.
Usage
.fit(SEM)
Arguments
SEM |
model of class Rcpp_SEMCpp. |
Value
fitted SEM
.fitElasticNetMix
Description
Optimizes an object with mixed penalty. See ?mixedPenalty for more details.
Usage
.fitElasticNetMix(mixedPenalty)
Arguments
mixedPenalty |
object of class mixedPenalty. This object can be created with the mixedPenalty function. Penalties can be added with the addCappedL1, addElasticNet, addLasso, addLsp, addMcp, and addScad functions. |
Value
object of class regularizedSEMMixedPenalty
.fitFunction
Description
internal function which returns the objective value of the fitting function of an object of class Rcpp_SEMCpp. This function can be used in optimizers
Usage
.fitFunction(par, SEM, raw)
Arguments
par |
labeled vector with parameter values |
SEM |
model of class Rcpp_SEMCpp. |
raw |
controls if the internal transformations of lessSEM are used. |
Value
objective value of the fitting function
.fitMix
Description
Optimizes an object with mixed penalty. See ?mixedPenalty for more details.
Usage
.fitMix(mixedPenalty)
Arguments
mixedPenalty |
object of class mixedPenalty. This object can be created with the mixedPenalty function. Penalties can be added with the addCappedL1, addElasticNet, addLasso, addLsp, addMcp, and addScad functions. |
Value
object of class regularizedSEMMixedPenalty
.getGradients
Description
returns the gradients of a model of class Rcpp_SEMCpp. This is the internal model representation. Models of this class can be generated with the lessSEM:::.SEMFromLavaan function.
Usage
.getGradients(SEM, raw)
Arguments
SEM |
model of class Rcpp_SEMCpp |
raw |
controls if the internal transformations of lessSEM should be used. lessSEM will use an exponential function for all variances to avoid negative variances. When set to TRUE, the gradients will be given for the internal parameter representation. Set to FALSE to get the usual gradients |
Value
vector with derivatives of the -2log-Likelihood with respect to each parameter
.getHessian
Description
returns the Hessian of a model of class Rcpp_SEMCpp. This is the internal model representation. Models of this class can be generated with the lessSEM:::.SEMFromLavaan function. The function is adapted from lavaan::lav_model_hessian.
Usage
.getHessian(SEM, raw, eps = 1e-07)
Arguments
SEM |
model of class Rcpp_SEMCpp |
raw |
controls if the internal transformations of lessSEM should be used. lessSEM will use an exponential function for all variances to avoid negative variances. When set to TRUE, the gradients will be given for the internal parameter representation. Set to FALSE to get the usual gradients |
eps |
eps controls the step size of the numerical approximation. |
Value
matrix with second derivatives of the -2log-Likelihood with respect to each parameter
.getMaxLambda_C
Description
generates the first lambda value which sets all regularized parameters to zero
Usage
.getMaxLambda_C(
regularizedModel,
SEM,
rawParameters,
weights,
N,
approx = FALSE
)
Arguments
regularizedModel |
Model combining likelihood and lasso type penalty |
SEM |
model of class Rcpp_SEMCpp |
rawParameters |
labeled vector with starting values |
weights |
weights given to each parameter in the penalty function |
N |
sample size |
approx |
When set to TRUE, .Machine$double.xmax^(.01) is used instead of .Machine$double.xmax^(.05) |
Value
first lambda value which sets all regularized parameters to zero (plus some tolerance)
.getParameters
Description
returns the parameters of the internal model representation.
Usage
.getParameters(SEM, raw = FALSE, transformations = FALSE)
Arguments
SEM |
model of class Rcpp_SEMCpp |
raw |
controls if the parameters are returned in raw format or transformed |
transformations |
should transformed parameters be included? |
Value
labeled vector with parameter values
.getRawData
Description
Extracts the raw data from lavaan or adapts a user supplied data set to the structure of the lavaan data
Usage
.getRawData(lavaanModel, dataSet, estimator)
Arguments
lavaanModel |
model fitted with lavaan |
dataSet |
user supplied data set |
estimator |
which estimator is used? |
Value
raw data
.getScores
Description
returns the scores of a model of class Rcpp_SEMCpp. This is the internal model representation. Models of this class can be generated with the lessSEM:::.SEMFromLavaan function.
Usage
.getScores(SEM, raw)
Arguments
SEM |
model of class Rcpp_SEMCpp |
raw |
controls if the internal transformations of lessSEM should be used. lessSEM will use an exponential function for all variances to avoid negative variances. When set to TRUE, the scores will be given for the internal parameter representation. Set to FALSE to get the usual scores |
Value
matrix with derivatives of the -2log-Likelihood for each person and parameter (rows are persons, columns are parameters)
.gpGetMaxLambda
Description
generates the first lambda value which sets all regularized parameters to zero
Usage
.gpGetMaxLambda(
regularizedModel,
par,
fitFunction,
gradientFunction,
userSuppliedArguments,
weights
)
Arguments
regularizedModel |
Model combining likelihood and lasso type penalty |
par |
labeled vector with starting values |
fitFunction |
R fit function |
gradientFunction |
R gradient functions |
userSuppliedArguments |
list with arguments for fitFunction and gradientFunction |
weights |
weights given to each parameter in the penalty function |
Value
first lambda value which sets all regularized parameters to zero (plus some tolerance)
.gpOptimizationInternal
Description
Internal function: This function computes the regularized models for all penalty functions which are implemented for glmnet and gist. Use the dedicated penalty functions (e.g., lessSEM::gpLasso) to penalize the model.
Usage
.gpOptimizationInternal(
par,
weights,
fn,
gr = NULL,
additionalArguments,
isCpp = FALSE,
penalty,
tuningParameters,
method,
control
)
Arguments
par |
labeled vector with starting values |
weights |
labeled vector with weights for each of the parameters in the model. |
fn |
R function which takes the parameters AND their labels as input and returns the fit value (a single value) |
gr |
R function which takes the parameters AND their labels as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
additionalArguments |
additional arguments passed to fn and gr |
isCpp |
boolean: are fn and gr C++ function pointers? |
penalty |
string: name of the penalty used in the model |
tuningParameters |
data.frame with tuning parameter values |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
control |
used to control the optimizer. This element is generated with the controlIsta() and controlGlmnet() functions. |
Value
Object of class gpRegularized
.gradientFunction
Description
internal function which returns the gradients of an object of class Rcpp_SEMCpp. This function can be used in optimizers
Usage
.gradientFunction(par, SEM, raw)
Arguments
par |
labeled vector with parameter values |
SEM |
model of class Rcpp_SEMCpp. |
raw |
controls if the internal transformations of lessSEM are used. |
Value
gradients of the model
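Together with .fitFunction above, this allows plugging the internal model into general-purpose optimizers, as the descriptions note. A minimal sketch under two assumptions: that optim preserves the parameter names that .fitFunction expects, and that the internal functions (accessed with lessSEM:::) keep their documented signatures.

library(lessSEM)
library(lavaan)

lavaanModel <- lavaan::sem("f =~ y1 + y2 + y3",
                           data = simulateExampleData(),
                           meanstructure = TRUE)
SEM <- lessSEM:::.SEMFromLavaan(lavaanModel = lavaanModel)

# starting values in the raw (internally transformed) parameterization:
start <- lessSEM:::.getParameters(SEM, raw = TRUE)

# optim passes the additional arguments (SEM, raw) on to fn and gr:
fitOptim <- optim(par = start,
                  fn = lessSEM:::.fitFunction,
                  gr = lessSEM:::.gradientFunction,
                  SEM = SEM,
                  raw = TRUE,
                  method = "BFGS")
fitOptim$par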
.initializeMultiGroupSEMForRegularization
Description
initializes the internal C++ SEM for regularization functions
Usage
.initializeMultiGroupSEMForRegularization(
lavaanModels,
startingValues,
modifyModel
)
Arguments
lavaanModels |
vector with models of class lavaan |
startingValues |
either set to est, start, or labeled vector with starting values |
modifyModel |
user supplied model modifications |
Value
model to be used by the regularization procedure
.initializeSEMForRegularization
Description
initializes the internal C++ SEM for regularization functions
Usage
.initializeSEMForRegularization(lavaanModel, startingValues, modifyModel)
Arguments
lavaanModel |
model of class lavaan |
startingValues |
either set to est, start, or labeled vector with starting values |
modifyModel |
user supplied model modifications |
Value
model to be used by the regularization procedure
.initializeWeights
Description
initialize the adaptive lasso weights
Usage
.initializeWeights(
weights,
penalty,
method,
createAdaptiveLassoWeights,
control,
lavaanModel,
modifyModel,
startingValues,
rawParameters
)
Arguments
weights |
weight argument passed to function |
penalty |
penalty used |
createAdaptiveLassoWeights |
should adaptive lasso weights be created? |
control |
list with control elements for optimizer |
lavaanModel |
model of type lavaan |
modifyModel |
list with model modifications |
startingValues |
either set to est, start, or labeled vector with starting values |
rawParameters |
raw parameters |
Value
vector with weights
.labelLavaanParameters
Description
Adds labels to unlabeled parameters in the lavaan parameter table. Also removes fixed parameters.
Usage
.labelLavaanParameters(lavaanModel)
Arguments
lavaanModel |
fitted lavaan model |
Value
parameterTable with labeled parameters
.lavaan2regsemLabels
Description
helper function: regsem and lavaan use slightly different parameter labels. This function can be used to get both sets of labels.
Usage
.lavaan2regsemLabels(lavaanModel)
Arguments
lavaanModel |
model of class lavaan |
Value
a list with lavaan and regsem labels
.likelihoodRatioFit
Description
internal function which returns the likelihood ratio fit statistic
Usage
.likelihoodRatioFit(par, SEM, raw)
Arguments
par |
labeled vector with parameter values |
SEM |
model of class Rcpp_SEMCpp. |
raw |
controls if the internal transformations of lessSEM are used. |
Value
likelihood ratio fit statistic
.makeSingleLine
Description
checks if a parameter: or a start: statement spans multiple lines and reduces it to one line.
Usage
.makeSingleLine(syntax, what)
Arguments
syntax |
reduced syntax |
what |
which statement to look for (parameters or start) |
Value
a syntax where multi-line statements are condensed to one line
.multiGroupSEMFromLavaan
Description
internal function. Translates a vector of objects of class lavaan to the internal model representation.
Usage
.multiGroupSEMFromLavaan(
lavaanModels,
whichPars = "est",
fit = TRUE,
addMeans = TRUE,
transformations = NULL,
transformationList = list(),
transformationGradientStepSize = 1e-06
)
Arguments
lavaanModels |
vector with lavaan models |
whichPars |
which parameters should be used to initialize the model. If set to "est", the parameters will be set to the estimated parameters of the lavaan model. If set to "start", the starting values of lavaan will be used. The latter can be useful if parameters are to be optimized afterwards as setting the parameters to "est" may result in the model getting stuck in a local minimum. |
fit |
should the model be fitted |
addMeans |
If lavaanModel has meanstructure = FALSE, addMeans = TRUE will add a mean structure. FALSE will set the means of the observed variables to their averages. |
transformations |
string with transformations |
transformationList |
list for transformations |
transformationGradientStepSize |
step size used to compute the gradients of the transformations |
Value
Object of class Rcpp_mgSEMCpp
.noDotDotDot
Description
replaces the dot-dot-dot (...) part of the fitting and gradient function
Usage
.noDotDotDot(fn, fnName, ...)
Arguments
fn |
fit or gradient function. IMPORTANT: THE FIRST ARGUMENT TO THE FUNCTION MUST BE THE PARAMETER VECTOR |
fnName |
name of the function fn |
... |
additional arguments |
Value
list with (1) new function which wraps fn and (2) list with arguments passed to fn
.penaltyTypes
Description
translates the penalty from a numeric value to the character or from the character to the numeric value. The numeric value is used by the C++ backend.
Usage
.penaltyTypes(penalty)
Arguments
penalty |
either a number or the name of the penalty |
Value
number corresponding to one of the penalties
.reduceSyntax
Description
reduce user defined parameter transformation syntax to basic elements
Usage
.reduceSyntax(syntax)
Arguments
syntax |
string with user defined transformations |
Value
a cut and simplified version of the syntax
.regularizeSEMInternal
Description
Internal function: This function computes the regularized models for all penalty functions which are implemented for glmnet and gist. Use the dedicated penalty functions (e.g., lessSEM::lasso) to penalize the model.
Usage
.regularizeSEMInternal(
lavaanModel,
penalty,
weights,
tuningParameters,
method,
modifyModel,
control,
notes = NULL
)
Arguments
lavaanModel |
model of class lavaan |
penalty |
string: name of the penalty used in the model |
weights |
labeled vector with weights for each of the parameters in the model. |
tuningParameters |
data.frame with tuning parameter values |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlIsta() and controlGlmnet() functions. |
notes |
option to pass notes to the function. All notes of the current function will be added |
Value
regularized SEM
.regularizeSEMWithCustomPenaltyRsolnp
Description
Optimizes a SEM with a custom penalty function using the Rsolnp optimizer (see ?Rsolnp::solnp). This optimizer is the default in regsem (see ?regsem::cv_regsem).
Usage
.regularizeSEMWithCustomPenaltyRsolnp(
lavaanModel,
individualPenaltyFunction,
tuningParameters,
penaltyFunctionArguments,
startingValues = "est",
carryOverParameters = TRUE,
control = list(trace = 0)
)
Arguments
lavaanModel |
model of class lavaan |
individualPenaltyFunction |
penalty function which takes the current parameter values as first argument, the tuning parameters as second, and the penaltyFunctionArguments as third argument, and returns a single value: the value of the penalty function for a single person. If the true penalty function is non-differentiable (e.g., lasso), a smooth approximation of this function should be provided. |
tuningParameters |
data.frame with tuning parameter values. Important: The function will iterate over the rows of these tuning parameters and pass them to your penalty function |
penaltyFunctionArguments |
arguments passed to individualPenaltyFunction, individualPenaltyFunctionGradient, and individualPenaltyFunctionHessian |
startingValues |
option to provide initial starting values. Only used for the first lambda. Three options are supported. Setting to "est" will use the estimates from the lavaan model object. Setting to "start" will use the starting values of the lavaan model. Finally, a labeled vector with parameter values can be passed to the function which will then be used as starting values. |
carryOverParameters |
should parameters from the previous iteration be used as starting values of the next iteration? |
control |
option to set parameters of the optimizer; see ?Rsolnp::solnp |
Value
Model of class regularizedSEMWithCustomPenalty
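A minimal sketch of a call with a hand-rolled smooth lasso approximation. The penalty function signature follows the argument descriptions above; the model, the regularized labels, and the eps value are made up for illustration, and Rsolnp must be installed. As an internal function, it is accessed with lessSEM:::.

library(lessSEM)
library(lavaan)

lavaanModel <- lavaan::sem("f =~ y1 + y2 + l3*y3 + l4*y4",
                           data = simulateExampleData(),
                           meanstructure = TRUE)

# smooth approximation of the lasso penalty for a single person:
smoothLassoSinglePerson <- function(parameters,
                                    tuningParameters,
                                    penaltyFunctionArguments) {
  regularized <- parameters[penaltyFunctionArguments$regularizedParameterLabels]
  tuningParameters$lambda *
    sum(sqrt(regularized^2 + penaltyFunctionArguments$eps))
}

fitCustom <- lessSEM:::.regularizeSEMWithCustomPenaltyRsolnp(
  lavaanModel = lavaanModel,
  individualPenaltyFunction = smoothLassoSinglePerson,
  tuningParameters = data.frame(lambda = seq(0, 1, .2)),
  penaltyFunctionArguments = list(regularizedParameterLabels = c("l3", "l4"),
                                  eps = 1e-8))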
.regularizeSmoothSEMInternal
Description
Internal function: This function computes the regularized models for all smooth penalty functions which are implemented for bfgs. Use the dedicated penalty functions (e.g., lessSEM::smoothLasso) to penalize the model.
Usage
.regularizeSmoothSEMInternal(
lavaanModel,
penalty,
weights,
tuningParameters,
epsilon,
tau,
method = "bfgs",
modifyModel,
control,
notes = NULL
)
Arguments
lavaanModel |
model of class lavaan |
penalty |
string: name of the penalty used in the model |
weights |
labeled vector with weights for each of the parameters in the model. |
tuningParameters |
data.frame with tuning parameter values |
epsilon |
epsilon > 0; controls the smoothness of the approximation. Larger values = smoother |
tau |
parameters below the threshold tau will be treated as zero |
method |
optimizer used. Currently only "bfgs" is supported. |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
notes |
option to pass notes to the function. All notes of the current function will be added |
Value
regularizedSEM
.ridgeGradient
Description
ridge gradient function
Usage
.ridgeGradient(parameters, tuningParameters, penaltyFunctionArguments)
Arguments
parameters |
vector with labeled parameter values |
tuningParameters |
list with field lambda (tuning parameter value) |
penaltyFunctionArguments |
list with field regularizedParameterLabels (labels of regularized parameters) |
Value
gradient values
.ridgeHessian
Description
ridge Hessian function
Usage
.ridgeHessian(parameters, tuningParameters, penaltyFunctionArguments)
Arguments
parameters |
vector with labeled parameter values |
tuningParameters |
list with field lambda (tuning parameter value) |
penaltyFunctionArguments |
list with field regularizedParameterLabels (labels of regularized parameters) |
Value
Hessian matrix
.ridgeValue
Description
ridge penalty function
Usage
.ridgeValue(parameters, tuningParameters, penaltyFunctionArguments)
Arguments
parameters |
vector with labeled parameter values |
tuningParameters |
list with field lambda (tuning parameter value) |
penaltyFunctionArguments |
list with field regularizedParameterLabels (labels of regularized parameters) |
Value
penalty function value
.setAMatrix
Description
internal function. Populates the matrix with directed effects in RAM notation
Usage
.setAMatrix(
model,
lavaanParameterTable,
nLatent,
nManifest,
latentNames,
manifestNames
)
Arguments
model |
model of class lavaan |
lavaanParameterTable |
parameter table from lavaan |
nLatent |
number of latent variables |
nManifest |
number of manifest variables |
latentNames |
names of latent variables |
manifestNames |
names of manifest variables |
.setFmatrix
Description
returns the filter matrix of a RAM
Usage
.setFmatrix(nManifest, manifestNames, nLatent, latentNames)
Arguments
nManifest |
number of manifest variables |
manifestNames |
names of manifest variables |
nLatent |
number of latent variables |
latentNames |
names of latent variables |
Value
matrix
.setMVector
Description
internal function. Populates the vector with means in RAM notation
Usage
.setMVector(
model,
lavaanParameterTable,
nLatent,
nManifest,
latentNames,
manifestNames,
rawData
)
Arguments
model |
model of class lavaan |
lavaanParameterTable |
parameter table from lavaan |
nLatent |
number of latent variables |
nManifest |
number of manifest variables |
latentNames |
names of latent variables |
manifestNames |
names of manifest variables |
rawData |
matrix with raw data |
.setParameters
Description
change the parameters of the internal model representation.
Usage
.setParameters(SEM, labels, values, raw)
Arguments
SEM |
model of class Rcpp_SEMCpp |
labels |
vector with parameter labels |
values |
vector with parameter values |
raw |
are the parameters given in raw format or transformed? |
Value
SEM with changed parameter values
.setSMatrix
Description
internal function. Populates the matrix with undirected paths in RAM notation
Usage
.setSMatrix(
model,
lavaanParameterTable,
nLatent,
nManifest,
latentNames,
manifestNames
)
Arguments
model |
model of class lavaan |
lavaanParameterTable |
parameter table from lavaan |
nLatent |
number of latent variables |
nManifest |
number of manifest variables |
latentNames |
names of latent variables |
manifestNames |
names of manifest variables |
.setupMulticore
Description
setup for multi-core support
Usage
.setupMulticore(control)
Arguments
control |
object created with controlBFGS, controlIsta or controlGlmnet function |
Value
nothing
.smoothAdaptiveLASSOGradient
Description
smoothed version of non-differentiable adaptive LASSO gradient
Usage
.smoothAdaptiveLASSOGradient(
parameters,
tuningParameters,
penaltyFunctionArguments
)
Arguments
parameters |
vector with labeled parameter values |
tuningParameters |
list with field lambdas (vector with one tuning parameter value for each parameter) |
penaltyFunctionArguments |
list with fields regularizedParameterLabels (labels of regularized parameters) and eps (controls the smooth approximation of non-differentiable penalty functions, e.g., lasso, adaptive lasso, or elastic net; smaller values result in a closer approximation, but may also cause larger issues in optimization) |
Value
gradient values
.smoothAdaptiveLASSOHessian
Description
smoothed version of non-differentiable adaptive LASSO Hessian
Usage
.smoothAdaptiveLASSOHessian(
parameters,
tuningParameters,
penaltyFunctionArguments
)
Arguments
parameters |
vector with labeled parameter values |
tuningParameters |
list with field lambdas (vector with one tuning parameter value for each parameter) |
penaltyFunctionArguments |
list with fields regularizedParameterLabels (labels of regularized parameters) and eps (controls the smooth approximation of non-differentiable penalty functions, e.g., lasso, adaptive lasso, or elastic net; smaller values result in a closer approximation, but may also cause larger issues in optimization) |
Value
Hessian matrix
.smoothAdaptiveLASSOValue
Description
smoothed version of non-differentiable adaptive LASSO penalty
Usage
.smoothAdaptiveLASSOValue(
parameters,
tuningParameters,
penaltyFunctionArguments
)
Arguments
parameters |
vector with labeled parameter values |
tuningParameters |
list with field lambdas (vector with one tuning parameter value for each parameter) |
penaltyFunctionArguments |
list with fields regularizedParameterLabels (labels of regularized parameters) and eps (controls the smooth approximation of non-differentiable penalty functions, e.g., lasso, adaptive lasso, or elastic net; smaller values result in a closer approximation, but may also cause larger issues in optimization) |
Value
penalty function value
.smoothCappedL1Value
Description
smoothed version of capped L1 penalty
Usage
.smoothCappedL1Value(parameters, tuningParameters, penaltyFunctionArguments)
Arguments
parameters |
vector with labeled parameter values |
tuningParameters |
list with field lambda (tuning parameter value) |
penaltyFunctionArguments |
list with fields regularizedParameterLabels (labels of regularized parameters) and eps (controls the smooth approximation of non-differentiable penalty functions, e.g., lasso, adaptive lasso, or elastic net; smaller values result in a closer approximation, but may also cause larger issues in optimization) |
Value
penalty function value
.smoothElasticNetGradient
Description
smoothed version of the non-differentiable elastic net gradient
Usage
.smoothElasticNetGradient(
parameters,
tuningParameters,
penaltyFunctionArguments
)
Arguments
parameters |
vector with labeled parameter values |
tuningParameters |
list with fields lambda (tuning parameter value) and alpha (0 <= alpha <= 1; controls the weighting of the ridge and lasso terms: alpha = 1 is the lasso, alpha = 0 the ridge) |
penaltyFunctionArguments |
list with fields regularizedParameterLabels (labels of regularized parameters) and eps (controls the smooth approximation of non-differentiable penalty functions, e.g., lasso, adaptive lasso, or elastic net; smaller values result in a closer approximation, but may also cause larger issues in optimization) |
Value
gradient values
.smoothElasticNetHessian
Description
smoothed version of the non-differentiable elastic net Hessian
Usage
.smoothElasticNetHessian(
parameters,
tuningParameters,
penaltyFunctionArguments
)
Arguments
parameters |
vector with labeled parameter values |
tuningParameters |
list with fields lambda (tuning parameter value) and alpha (0 <= alpha <= 1; controls the weighting of the ridge and lasso terms: alpha = 1 is the lasso, alpha = 0 the ridge) |
penaltyFunctionArguments |
list with fields regularizedParameterLabels (labels of regularized parameters) and eps (controls the smooth approximation of non-differentiable penalty functions, e.g., lasso, adaptive lasso, or elastic net; smaller values result in a closer approximation, but may also cause larger issues in optimization) |
Value
Hessian matrix
.smoothElasticNetValue
Description
smoothed version of the non-differentiable elastic net penalty
Usage
.smoothElasticNetValue(parameters, tuningParameters, penaltyFunctionArguments)
Arguments
parameters |
vector with labeled parameter values |
tuningParameters |
list with fields lambda (tuning parameter value) and alpha (0 <= alpha <= 1; controls the weighting of the ridge and lasso terms: alpha = 1 is the lasso, alpha = 0 the ridge) |
penaltyFunctionArguments |
list with fields regularizedParameterLabels (labels of regularized parameters) and eps (controls the smooth approximation of non-differentiable penalty functions, e.g., lasso, adaptive lasso, or elastic net; smaller values result in a closer approximation, but may also cause larger issues in optimization) |
Value
penalty function value
.smoothLASSOGradient
Description
smoothed version of non-differentiable LASSO gradient
Usage
.smoothLASSOGradient(parameters, tuningParameters, penaltyFunctionArguments)
Arguments
parameters |
vector with labeled parameter values |
tuningParameters |
list with field lambda (tuning parameter value) |
penaltyFunctionArguments |
list with fields regularizedParameterLabels (labels of regularized parameters) and eps (controls the smooth approximation of non-differentiable penalty functions, e.g., lasso, adaptive lasso, or elastic net; smaller values result in a closer approximation, but may also cause larger issues in optimization) |
Value
gradient values
.smoothLASSOHessian
Description
smoothed version of non-differentiable LASSO Hessian
Usage
.smoothLASSOHessian(parameters, tuningParameters, penaltyFunctionArguments)
Arguments
parameters |
vector with labeled parameter values |
tuningParameters |
list with field lambda (tuning parameter value) |
penaltyFunctionArguments |
list with fields regularizedParameterLabels (labels of regularized parameters) and eps (controls the smooth approximation of non-differentiable penalty functions, e.g., lasso, adaptive lasso, or elastic net; smaller values result in a closer approximation, but may also cause larger issues in optimization) |
Value
Hessian matrix
.smoothLASSOValue
Description
smoothed version of non-differentiable LASSO penalty
Usage
.smoothLASSOValue(parameters, tuningParameters, penaltyFunctionArguments)
Arguments
parameters |
vector with labeled parameter values |
tuningParameters |
list with field lambda (tuning parameter value) |
penaltyFunctionArguments |
list with fields regularizedParameterLabels (labels of regularized parameters) and eps (controls the smooth approximation of non-differentiable penalty functions, e.g., lasso, adaptive lasso, or elastic net; smaller values result in a closer approximation, but may also cause larger issues in optimization) |
Value
penalty function value
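A small sketch evaluating the smoothed penalty directly, with the argument structure documented above (internal function, accessed with lessSEM:::). The expectation that the value approaches lambda * |theta| for small eps assumes the usual sqrt(theta^2 + eps) smoothing, which is not guaranteed by this documentation.

# evaluating the smoothed lasso penalty for two parameters,
# one of which (a) is regularized:
lessSEM:::.smoothLASSOValue(
  parameters = c(a = .5, b = 1),
  tuningParameters = list(lambda = .1),
  penaltyFunctionArguments = list(regularizedParameterLabels = "a",
                                  eps = 1e-8))
# for small eps, the result should be close to .1 * abs(.5) = .05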
.smoothLspValue
Description
smoothed version of lsp penalty
Usage
.smoothLspValue(parameters, tuningParameters, penaltyFunctionArguments)
Arguments
parameters |
vector with labeled parameter values |
tuningParameters |
list with field lambda (tuning parameter value) |
penaltyFunctionArguments |
list with fields regularizedParameterLabels (labels of regularized parameters) and eps (controls the smooth approximation of non-differentiable penalty functions, e.g., lasso, adaptive lasso, or elastic net; smaller values result in a closer approximation, but may also cause larger issues in optimization) |
Value
penalty function value
.smoothMcpValue
Description
smoothed version of mcp penalty
Usage
.smoothMcpValue(parameters, tuningParameters, penaltyFunctionArguments)
Arguments
parameters |
vector with labeled parameter values |
tuningParameters |
list with field lambda (tuning parameter value) |
penaltyFunctionArguments |
list with fields regularizedParameterLabels (labels of regularized parameters) and eps (controls the smooth approximation of non-differentiable penalty functions (e.g., lasso, adaptive lasso, or elastic net); smaller values result in a closer approximation but may cause more problems during optimization) |
Value
penalty function value
.smoothScadValue
Description
smoothed version of scad penalty
Usage
.smoothScadValue(parameters, tuningParameters, penaltyFunctionArguments)
Arguments
parameters |
vector with labeled parameter values |
tuningParameters |
list with field lambda (tuning parameter value) |
penaltyFunctionArguments |
list with fields regularizedParameterLabels (labels of regularized parameters) and eps (controls the smooth approximation of non-differentiable penalty functions (e.g., lasso, adaptive lasso, or elastic net); smaller values result in a closer approximation but may cause more problems during optimization) |
Value
penalty function value
.standardErrors
Description
Computes the standard errors of a fitted SEM. IMPORTANT: Assumes that the SEM has been fitted and that the parameter estimates are the ordinary maximum likelihood estimates.
Usage
.standardErrors(SEM, raw)
Arguments
SEM |
model of class Rcpp_SEMCpp. |
raw |
controls whether the internal transformations of lessSEM are used. If set to TRUE, the standard errors will be returned for the internally used parameter specification |
Value
a vector with standard errors
.updateLavaan
Description
updates a lavaan model. lavaan has an update function that does exactly that, but it does not seem to work with testthat. This function is a workaround for that issue.
Usage
.updateLavaan(lavaanModel, key, value)
Arguments
lavaanModel |
fitted lavaan model |
key |
label of the element that should be updated |
value |
new value for the updated element |
Value
lavaan model
.useElasticNet
Description
Internal function checking if elastic net is used
Usage
.useElasticNet(mixedPenalty)
Arguments
mixedPenalty |
object of class mixedPenalty. This object can be created with the mixedPenalty function. Penalties can be added with the addCappedL1, addLasso, addLsp, addMcp, and addScad functions. |
Value
TRUE if elastic net, FALSE otherwise
AIC
Description
AIC
Usage
## S4 method for signature 'Rcpp_SEMCpp'
AIC(object, ..., k = 2)
Arguments
object |
object of class Rcpp_SEMCpp |
... |
not used |
k |
multiplier for number of parameters |
Value
AIC values
AIC
Description
AIC
Usage
## S4 method for signature 'Rcpp_mgSEM'
AIC(object, ..., k = 2)
Arguments
object |
object of class Rcpp_mgSEM |
... |
not used |
k |
multiplier for number of parameters |
Value
AIC values
AIC
Description
returns the AIC
Usage
## S4 method for signature 'gpRegularized'
AIC(object, ..., k = 2)
Arguments
object |
object of class gpRegularized |
... |
not used |
k |
multiplier for number of parameters |
Value
data frame with fit values, appended with AIC
AIC
Description
returns the AIC
Usage
## S4 method for signature 'regularizedSEM'
AIC(object, ..., k = 2)
Arguments
object |
object of class regularizedSEM |
... |
not used |
k |
multiplier for number of parameters |
Value
AIC values
AIC
Description
returns the AIC
Usage
## S4 method for signature 'regularizedSEMMixedPenalty'
AIC(object, ..., k = 2)
Arguments
object |
object of class regularizedSEMMixedPenalty |
... |
not used |
k |
multiplier for number of parameters |
Value
AIC values
AIC
Description
returns the AIC. Expects penalizedParameterLabels and zeroThreshold
Usage
## S4 method for signature 'regularizedSEMWithCustomPenalty'
AIC(object, ..., k = 2)
Arguments
object |
object of class regularizedSEMWithCustomPenalty |
... |
Expects penalizedParameterLabels and zeroThreshold. penalizedParameterLabels: vector with labels of penalized parameters. zeroThreshold: penalized parameters below this threshold will be counted as zeroed. |
k |
multiplier for number of parameters |
Value
AIC values
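Because penalizedParameterLabels and zeroThreshold are passed through ..., a call could look as follows; fitCustom is a hypothetical fitted regularizedSEMWithCustomPenalty object and is not created here:
# fitCustom is a placeholder for a fitted regularizedSEMWithCustomPenalty object:
# AIC(fitCustom,
#     penalizedParameterLabels = paste0("l", 6:15), # labels of penalized parameters
#     zeroThreshold = 1e-4) # estimates below this threshold count as zeroed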
BIC
Description
BIC
Usage
## S4 method for signature 'Rcpp_SEMCpp'
BIC(object, ...)
Arguments
object |
object of class Rcpp_SEMCpp |
... |
not used |
Value
BIC values
BIC
Description
BIC
Usage
## S4 method for signature 'Rcpp_mgSEM'
BIC(object, ...)
Arguments
object |
object of class Rcpp_mgSEM |
... |
not used |
Value
BIC values
BIC
Description
returns the BIC
Usage
## S4 method for signature 'gpRegularized'
BIC(object, ...)
Arguments
object |
object of class gpRegularized |
... |
not used |
Value
data frame with fit values, appended with BIC
BIC
Description
returns the BIC
Usage
## S4 method for signature 'regularizedSEM'
BIC(object, ...)
Arguments
object |
object of class regularizedSEM |
... |
not used |
Value
BIC values
BIC
Description
returns the BIC
Usage
## S4 method for signature 'regularizedSEMMixedPenalty'
BIC(object, ...)
Arguments
object |
object of class regularizedSEMMixedPenalty |
... |
not used |
Value
BIC values
BIC
Description
returns the BIC
Usage
## S4 method for signature 'regularizedSEMWithCustomPenalty'
BIC(object, ...)
Arguments
object |
object of class regularizedSEMWithCustomPenalty |
... |
Expects penalizedParameterLabels and zeroThreshold. penalizedParameterLabels: vector with labels of penalized parameters. zeroThreshold: penalized parameters below this threshold will be counted as zeroed. |
Value
BIC values
internal representation of SEM in C++
Description
internal representation of SEM in C++
Wrapper for C++ module. See ?lessSEM::bfgsEnetMgSEM
Description
Wrapper for C++ module. See ?lessSEM::bfgsEnetMgSEM
Wrapper for C++ module. See ?lessSEM::bfgsEnetSEM
Description
Wrapper for C++ module. See ?lessSEM::bfgsEnetSEM
Wrapper for C++ module. See ?lessSEM::glmnetCappedL1MgSEM
Description
Wrapper for C++ module. See ?lessSEM::glmnetCappedL1MgSEM
Wrapper for C++ module. See ?lessSEM:::glmnetCappedL1SEM
Description
Wrapper for C++ module. See ?lessSEM:::glmnetCappedL1SEM
Wrapper for C++ module. See ?lessSEM::glmnetEnetGeneralPurpose
Description
Wrapper for C++ module. See ?lessSEM::glmnetEnetGeneralPurpose
Wrapper for C++ module. See ?lessSEM::glmnetEnetGeneralPurposeCpp
Description
Wrapper for C++ module. See ?lessSEM::glmnetEnetGeneralPurposeCpp
Wrapper for C++ module. See ?lessSEM::glmnetEnetMgSEM
Description
Wrapper for C++ module. See ?lessSEM::glmnetEnetMgSEM
Wrapper for C++ module. See ?lessSEM::glmnetEnetSEM
Description
Wrapper for C++ module. See ?lessSEM::glmnetEnetSEM
Wrapper for C++ module. See ?lessSEM::glmnetLspMgSEM
Description
Wrapper for C++ module. See ?lessSEM::glmnetLspMgSEM
Wrapper for C++ module. See ?lessSEM:::glmnetLspSEM
Description
Wrapper for C++ module. See ?lessSEM:::glmnetLspSEM
Wrapper for C++ module. See ?lessSEM::glmnetMcpMgSEM
Description
Wrapper for C++ module. See ?lessSEM::glmnetMcpMgSEM
Wrapper for C++ module. See ?lessSEM:::glmnetMcpSEM
Description
Wrapper for C++ module. See ?lessSEM:::glmnetMcpSEM
Wrapper for C++ module. See ?lessSEM::glmnetScadMgSEM
Description
Wrapper for C++ module. See ?lessSEM::glmnetScadMgSEM
Wrapper for C++ module. See ?lessSEM:::glmnetScadSEM
Description
Wrapper for C++ module. See ?lessSEM:::glmnetScadSEM
Wrapper for C++ module. See ?lessSEM::istaCappedL1GeneralPurpose
Description
Wrapper for C++ module. See ?lessSEM::istaCappedL1GeneralPurpose
Wrapper for C++ module. See ?lessSEM::istaCappedL1GeneralPurposeCpp
Description
Wrapper for C++ module. See ?lessSEM::istaCappedL1GeneralPurposeCpp
Wrapper for C++ module. See ?lessSEM::istaCappedL1SEM
Description
Wrapper for C++ module. See ?lessSEM::istaCappedL1SEM
Wrapper for C++ module. See ?lessSEM::istaCappedL1MgSEM
Description
Wrapper for C++ module. See ?lessSEM::istaCappedL1MgSEM
Wrapper for C++ module. See ?lessSEM::istaEnetGeneralPurpose
Description
Wrapper for C++ module. See ?lessSEM::istaEnetGeneralPurpose
Wrapper for C++ module. See ?lessSEM::istaEnetGeneralPurposeCpp
Description
Wrapper for C++ module. See ?lessSEM::istaEnetGeneralPurposeCpp
Wrapper for C++ module. See ?lessSEM::istaEnetMgSEM
Description
Wrapper for C++ module. See ?lessSEM::istaEnetMgSEM
Wrapper for C++ module. See ?lessSEM::istaEnetSEM
Description
Wrapper for C++ module. See ?lessSEM::istaEnetSEM
Wrapper for C++ module. See ?lessSEM::istaLSPMgSEM
Description
Wrapper for C++ module. See ?lessSEM::istaLSPMgSEM
Wrapper for C++ module. See ?lessSEM::istaLSPSEM
Description
Wrapper for C++ module. See ?lessSEM::istaLSPSEM
Wrapper for C++ module. See ?lessSEM::istaLspGeneralPurpose
Description
Wrapper for C++ module. See ?lessSEM::istaLspGeneralPurpose
Wrapper for C++ module. See ?lessSEM::istaLspGeneralPurposeCpp
Description
Wrapper for C++ module. See ?lessSEM::istaLspGeneralPurposeCpp
Wrapper for C++ module. See ?lessSEM::istaMcpGeneralPurpose
Description
Wrapper for C++ module. See ?lessSEM::istaMcpGeneralPurpose
Wrapper for C++ module. See ?lessSEM::istaMcpGeneralPurposeCpp
Description
Wrapper for C++ module. See ?lessSEM::istaMcpGeneralPurposeCpp
Wrapper for C++ module. See ?lessSEM::istaMcpMgSEM
Description
Wrapper for C++ module. See ?lessSEM::istaMcpMgSEM
Wrapper for C++ module. See ?lessSEM::istaMcpSEM
Description
Wrapper for C++ module. See ?lessSEM::istaMcpSEM
Wrapper for C++ module. See ?lessSEM::istaMixedPenaltySEM
Description
Wrapper for C++ module. See ?lessSEM::istaMixedPenaltySEM
Wrapper for C++ module. See ?lessSEM::istaMixedPenaltymgSEM
Description
Wrapper for C++ module. See ?lessSEM::istaMixedPenaltymgSEM
Wrapper for C++ module. See ?lessSEM::istaScadGeneralPurpose
Description
Wrapper for C++ module. See ?lessSEM::istaScadGeneralPurpose
Wrapper for C++ module. See ?lessSEM::istaScadGeneralPurposeCpp
Description
Wrapper for C++ module. See ?lessSEM::istaScadGeneralPurposeCpp
Wrapper for C++ module. See ?lessSEM::istaScadMgSEM
Description
Wrapper for C++ module. See ?lessSEM::istaScadMgSEM
Wrapper for C++ module. See ?lessSEM::istaScadSEM
Description
Wrapper for C++ module. See ?lessSEM::istaScadSEM
internal representation of SEM in C++
Description
internal representation of SEM in C++
SEMCpp class
Description
internal SEM representation
Fields
new
Creates a new SEMCpp.
fill
fills the SEM with the elements from an Rcpp::List
addTransformation
adds transformations to a model
implied
Computes implied means and covariance matrix
fit
Fits the model. Returns the objective value of the fitting function
getParameters
Returns a data frame with model parameters.
getEstimator
returns the estimator used in the model (e.g., fiml)
getParameterLabels
Returns a vector with unique parameter labels as used internally.
getGradients
Returns the gradients of the model.
getScores
Returns a matrix with scores.
getHessian
Returns the Hessian of the model. Expects the labels and values of the parameters as well as a boolean indicating whether these are raw. Finally, a double (eps) controls the precision of the approximation.
computeTransformations
compute the transformations.
setTransformationGradientStepSize
change the step size of the gradient computation for the transformations
adaptiveLasso
Description
Implements adaptive lasso regularization for structural equation models. The penalty function is given by:
p(x_j) = \frac{1}{w_j}\lambda |x_j|
Adaptive lasso regularization will set parameters to zero if \lambda is large enough.
Usage
adaptiveLasso(
lavaanModel,
regularized,
weights = NULL,
lambdas = NULL,
nLambdas = NULL,
reverse = TRUE,
curve = 1,
method = "glmnet",
modifyModel = lessSEM::modifyModel(),
control = lessSEM::controlGlmnet()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
weights |
labeled vector with weights for each of the parameters in the model. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object. If set to NULL, the default weights will be used: the inverse of the absolute values of the unregularized parameter estimates |
lambdas |
numeric vector: values for the tuning parameter lambda |
nLambdas |
alternative to lambdas: If alpha = 1, lessSEM can automatically compute the first lambda value which sets all regularized parameters to zero. It will then generate nLambdas values between 0 and the computed lambda. |
reverse |
if set to TRUE and nLambdas is used, lessSEM will start with the largest lambda and gradually decrease lambda. Otherwise, lessSEM will start with the smallest lambda and gradually increase it. |
curve |
Allows for unequally spaced lambda steps (e.g., .01, .02, .05, 1, 5, 20). If curve is close to 1, all lambda values will be equally spaced; if curve is large, lambda values will be more concentrated close to 0. See ?lessSEM::curveLambda for more information. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
As in regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Adaptive lasso regularization:
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429. https://doi.org/10.1198/016214506000000735
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Model of class regularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- adaptiveLasso(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
# in case of lasso and adaptive lasso, we can specify the number of lambda
# values to use. lessSEM will automatically find lambda_max and fit
# models for nLambda values between 0 and lambda_max. For the other
# penalty functions, lambdas must be specified explicitly
nLambdas = 50)
# use the plot-function to plot the regularized parameters:
plot(lsem)
# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]
# fit Measures:
fitIndices(lsem)
# The best parameters can also be extracted with:
coef(lsem, criterion = "AIC")
# or
estimates(lsem, criterion = "AIC")
#### Advanced ####
# Switching the optimizer #
# Use the "method" argument to switch the optimizer. The control argument
# must also be changed to the corresponding function:
lsemIsta <- adaptiveLasso(
lavaanModel = lavaanModel,
regularized = paste0("l", 6:15),
nLambdas = 50,
method = "ista",
control = controlIsta())
# Note: The results are basically identical:
lsemIsta@parameters - lsem@parameters
addCappedL1
Description
Implements cappedL1 regularization for structural equation models. The penalty function is given by:
p(x_j) = \lambda \min(|x_j|, \theta)
where \theta > 0. The cappedL1 penalty is identical to the lasso for parameters whose absolute value is below \theta and identical to a constant for parameters above \theta. As adding a constant to the fitting function will not change its minimum, larger parameters can stay unregularized while smaller ones are set to zero.
Usage
addCappedL1(mixedPenalty, regularized, lambdas, thetas)
Arguments
mixedPenalty |
model of class mixedPenalty created with the mixedPenalty function (see ?mixedPenalty) |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
parameters whose absolute value is above this threshold will be penalized with a constant (theta) |
Details
As in regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
CappedL1 regularization:
Zhang, T. (2010). Analysis of Multi-stage Convex Relaxation for Sparse Regularization. Journal of Machine Learning Research, 11, 1081–1107.
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Model of class mixedPenalty. Use the fit() function to fit the model.
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# We can add mixed penalties as follows:
regularized <- lavaanModel |>
# create template for regularized model with mixed penalty:
mixedPenalty() |>
# add penalty on loadings l11 - l15:
addCappedL1(regularized = paste0("l", 11:15),
lambdas = seq(0,1,.1),
thetas = 2.3) |>
# fit the model:
fit()
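To get a feeling for the penalty itself, the cappedL1 function from the Description can be evaluated directly in base R (a stand-alone sketch, not a lessSEM function):
# cappedL1 penalty from the Description: p(x) = lambda * min(|x|, theta)
x <- seq(-3, 3, length.out = 200)
plot(x, 0.5 * pmin(abs(x), 2.3), type = "l", ylab = "penalty",
     main = "cappedL1 penalty with lambda = .5, theta = 2.3")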
addElasticNet
Description
Adds an elastic net penalty to specified parameters. The penalty function is given by:
p(x_j) = \alpha\lambda|x_j| + (1-\alpha)\lambda x_j^2
Note that the elastic net combines ridge and lasso regularization. If \alpha = 0, the elastic net reduces to ridge regularization. If \alpha = 1, it reduces to lasso regularization. In between, the elastic net is a compromise between the shrinkage of the lasso and the ridge penalty.
Usage
addElasticNet(mixedPenalty, regularized, alphas, lambdas, weights = 1)
Arguments
mixedPenalty |
model of class mixedPenalty created with the mixedPenalty function (see ?mixedPenalty) |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
alphas |
numeric vector: values for the tuning parameter alpha. Set to 1 for lasso and to zero for ridge. Anything in between is an elastic net penalty. |
lambdas |
numeric vector: values for the tuning parameter lambda |
weights |
can be used to give different weights to the different parameters |
Details
As in regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Elastic net regularization:
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Model of class mixedPenalty. Use the fit() function to fit the model.
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# We can add mixed penalties as follows:
regularized <- lavaanModel |>
# create template for regularized model with mixed penalty:
mixedPenalty() |>
# add penalty on loadings l11 - l15:
addElasticNet(regularized = paste0("l", 11:15),
lambdas = seq(0,1,.1),
alphas = .4) |>
# fit the model:
fit()
addLasso
Description
Implements lasso regularization for structural equation models. The penalty function is given by:
p(x_j) = \lambda |x_j|
Lasso regularization will set parameters to zero if \lambda is large enough.
Usage
addLasso(mixedPenalty, regularized, weights = 1, lambdas)
Arguments
mixedPenalty |
model of class mixedPenalty created with the mixedPenalty function (see ?mixedPenalty) |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
weights |
can be used to give different weights to the different parameters |
lambdas |
numeric vector: values for the tuning parameter lambda |
Details
As in regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Lasso regularization:
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Model of class mixedPenalty. Use the fit() function to fit the model.
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# We can add mixed penalties as follows:
regularized <- lavaanModel |>
# create template for regularized model with mixed penalty:
mixedPenalty() |>
# add penalty on loadings l11 - l15:
addLasso(regularized = paste0("l", 11:15),
lambdas = seq(0,1,.1)) |>
# fit the model:
fit()
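Assuming that fit() returns an object of class regularizedSEMMixedPenalty, the estimates and information criteria can then be inspected with the methods documented in this manual:
coef(regularized)
AIC(regularized)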
addLsp
Description
Implements lsp regularization for structural equation models. The penalty function is given by:
p(x_j) = \lambda \log(1 + |x_j|/\theta)
where \theta > 0.
Usage
addLsp(mixedPenalty, regularized, lambdas, thetas)
Arguments
mixedPenalty |
model of class mixedPenalty created with the mixedPenalty function (see ?mixedPenalty) |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta (\theta > 0), which controls the shape of the penalty |
Details
As in regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
lsp regularization:
Candès, E. J., Wakin, M. B., & Boyd, S. P. (2008). Enhancing Sparsity by Reweighted l1 Minimization. Journal of Fourier Analysis and Applications, 14(5–6), 877–905. https://doi.org/10.1007/s00041-008-9045-x
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Model of class mixedPenalty. Use the fit() function to fit the model.
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# We can add mixed penalties as follows:
regularized <- lavaanModel |>
# create template for regularized model with mixed penalty:
mixedPenalty() |>
# add penalty on loadings l11 - l15:
addLsp(regularized = paste0("l", 11:15),
lambdas = seq(0,1,.1),
thetas = 2.3) |>
# fit the model:
fit()
addMcp
Description
Implements mcp regularization for structural equation models. The penalty function is given by:
p(x_j) = \begin{cases}
\lambda |x_j| - x_j^2/(2\theta) & \text{if } |x_j| \leq \theta\lambda\\
\theta\lambda^2/2 & \text{if } |x_j| > \theta\lambda
\end{cases}
where \theta > 0.
Usage
addMcp(mixedPenalty, regularized, lambdas, thetas)
Arguments
mixedPenalty |
model of class mixedPenalty created with the mixedPenalty function (see ?mixedPenalty) |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta (\theta > 0); parameters whose absolute value is above \theta\lambda are penalized with a constant |
Details
As in regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
mcp regularization:
Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2), 894–942. https://doi.org/10.1214/09-AOS729
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Model of class mixedPenalty. Use the fit() function to fit the model.
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# We can add mixed penalties as follows:
regularized <- lavaanModel |>
# create template for regularized model with mixed penalty:
mixedPenalty() |>
# add penalty on loadings l11 - l15:
addMcp(regularized = paste0("l", 11:15),
lambdas = seq(0,1,.1),
thetas = 2.3) |>
# fit the model:
fit()
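To see the piecewise structure of the mcp penalty from the Description, it can be evaluated directly in base R (a stand-alone sketch, not a lessSEM function):
# mcp penalty from the Description
mcp <- function(x, lambda, theta) {
  ifelse(abs(x) <= theta * lambda,
         lambda * abs(x) - x^2 / (2 * theta), # quadratic region
         theta * lambda^2 / 2) # constant region
}
x <- seq(-3, 3, length.out = 200)
plot(x, mcp(x, lambda = .5, theta = 2.3), type = "l", ylab = "penalty")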
addScad
Description
Implements scad regularization for structural equation models. The penalty function is given by:
p(x_j) = \begin{cases}
\lambda |x_j| & \text{if } |x_j| \leq \lambda\\
\frac{-x_j^2 + 2\theta\lambda |x_j| - \lambda^2}{2(\theta - 1)} & \text{if } \lambda < |x_j| \leq \theta\lambda\\
(\theta + 1)\lambda^2/2 & \text{if } |x_j| > \theta\lambda
\end{cases}
where \theta > 2.
Usage
addScad(mixedPenalty, regularized, lambdas, thetas)
Arguments
mixedPenalty |
model of class mixedPenalty created with the mixedPenalty function (see ?mixedPenalty) |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta (\theta > 2); parameters whose absolute value is above \theta\lambda are penalized with a constant |
Details
As in regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
scad regularization:
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360. https://doi.org/10.1198/016214501753382273
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Model of class mixedPenalty. Use the fit() function to fit the model.
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# We can add mixed penalties as follows:
regularized <- lavaanModel |>
# create template for regularized model with mixed penalty:
mixedPenalty() |>
# add penalty on loadings l11 - l15:
addScad(regularized = paste0("l", 11:15),
lambdas = seq(0,1,.1),
thetas = 3.1) |>
# fit the model:
fit()
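Similarly, the scad penalty from the Description can be evaluated directly in base R (a stand-alone sketch, not a lessSEM function):
# scad penalty with the three regions given in the Description
scad <- function(x, lambda, theta) {
  absX <- abs(x)
  ifelse(absX <= lambda,
         lambda * absX, # lasso region
         ifelse(absX <= theta * lambda,
                (-x^2 + 2 * theta * lambda * absX - lambda^2) / (2 * (theta - 1)),
                (theta + 1) * lambda^2 / 2)) # constant region
}
x <- seq(-3, 3, length.out = 200)
plot(x, scad(x, lambda = .5, theta = 3.1), type = "l", ylab = "penalty")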
bfgs
Description
This function allows for optimizing models built in lavaan using the BFGS optimizer implemented in lessSEM. Its elements can be accessed with the "@" operator (see examples). The main purpose is to make transformations of lavaan models more accessible.
Usage
bfgs(
lavaanModel,
modifyModel = lessSEM::modifyModel(),
control = lessSEM::controlBFGS()
)
Arguments
lavaanModel |
model of class lavaan |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. See ?controlBFGS for more details. |
Value
Model of class regularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
lsem <- bfgs(
# pass the fitted lavaan model
lavaanModel = lavaanModel)
# the coefficients can be accessed with:
coef(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters
smoothly approximated elastic net
Description
Object for smoothly approximated elastic net optimization with bfgs optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a lambda and an alpha value.
callFitFunction
Description
wrapper to call a user-defined fit function
Usage
callFitFunction(fitFunctionSEXP, parameters, userSuppliedElements)
Arguments
fitFunctionSEXP |
pointer to fit function |
parameters |
vector with parameter values |
userSuppliedElements |
list with additional elements |
Value
fit value (double)
cappedL1
Description
Implements cappedL1 regularization for structural equation models. The penalty function is given by:
p(x_j) = \lambda \min(|x_j|, \theta)
where \theta > 0. The cappedL1 penalty is identical to the lasso for parameters whose absolute value is below \theta and identical to a constant for parameters above \theta. As adding a constant to the fitting function will not change its minimum, larger parameters can stay unregularized while smaller ones are set to zero.
Usage
cappedL1(
lavaanModel,
regularized,
lambdas,
thetas,
modifyModel = lessSEM::modifyModel(),
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
parameters whose absolute value is above this threshold will be penalized with a constant (theta) |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
As in regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
CappedL1 regularization:
Zhang, T. (2010). Analysis of Multi-stage Convex Relaxation for Sparse Regularization. Journal of Machine Learning Research, 11, 1081–1107.
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Model of class regularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- cappedL1(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
lambdas = seq(0,1,length.out = 20),
thetas = seq(0.01,2,length.out = 5))
# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]
# fit Measures:
fitIndices(lsem)
# The best parameters can also be extracted with:
coef(lsem, criterion = "AIC")
# or
estimates(lsem, criterion = "AIC")
# optional: plotting the paths requires installation of plotly
# plot(lsem)
coef
Description
coef
Usage
## S4 method for signature 'Rcpp_SEMCpp'
coef(object, ...)
Arguments
object |
object of class Rcpp_SEMCpp |
... |
not used |
Value
all coefficients of the model in transformed form
coef
Description
coef
Usage
## S4 method for signature 'Rcpp_mgSEM'
coef(object, ...)
Arguments
object |
object of class Rcpp_mgSEM |
... |
not used |
Value
all coefficients of the model in transformed form
coef
Description
Returns the parameter estimates of a cvRegularizedSEM
Usage
## S4 method for signature 'cvRegularizedSEM'
coef(object, ...)
Arguments
object |
object of class cvRegularizedSEM |
... |
not used |
Value
the parameter estimates of a cvRegularizedSEM
coef
Description
Returns the parameter estimates of a gpRegularized
Usage
## S4 method for signature 'gpRegularized'
coef(object, ...)
Arguments
object |
object of class gpRegularized |
... |
criterion can be one of: "AIC", "BIC". If set to NULL, all parameters will be returned |
Value
parameter estimates
coef
Description
Returns the parameter estimates of a regularizedSEM
Usage
## S4 method for signature 'regularizedSEM'
coef(object, ...)
Arguments
object |
object of class regularizedSEM |
... |
criterion can be one of the ones returned by fitIndices. If set to NULL, all parameters will be returned |
Value
parameters of the model as data.frame
coef
Description
Returns the parameter estimates of a regularizedSEMMixedPenalty
Usage
## S4 method for signature 'regularizedSEMMixedPenalty'
coef(object, ...)
Arguments
object |
object of class regularizedSEMMixedPenalty |
... |
criterion can be one of: "AIC", "BIC". If set to NULL, all parameters will be returned |
Value
parameters of the model as data.frame
coef
Description
Returns the parameter estimates of a regularizedSEMWithCustomPenalty
Usage
## S4 method for signature 'regularizedSEMWithCustomPenalty'
coef(object, ...)
Arguments
object |
object of class regularizedSEMWithCustomPenalty |
... |
not used |
Value
data.frame with all parameter estimates
controlBFGS
Description
Control the BFGS optimizer.
Usage
controlBFGS(
startingValues = "est",
initialHessian = ifelse(all(startingValues == "est"), "lavaan", "compute"),
saveDetails = FALSE,
stepSize = 0.9,
sigma = 1e-05,
gamma = 0,
maxIterOut = 1000,
maxIterIn = 1000,
maxIterLine = 500,
breakOuter = 1e-08,
breakInner = 1e-10,
convergenceCriterion = 0,
verbose = 0,
nCores = 1
)
Arguments
startingValues |
option to provide initial starting values. Only used for the first lambda. Three options are supported. Setting to "est" will use the estimates from the lavaan model object. Setting to "start" will use the starting values of the lavaan model. Finally, a labeled vector with parameter values can be passed to the function which will then be used as starting values. |
initialHessian |
option to provide an initial Hessian to the optimizer. Must have row and column names corresponding to the parameter labels. Use getLavaanParameters(lavaanModel) to see those labels. If set to "gradNorm", the maximum of the gradients at the starting values times the stepSize will be used. This is adapted from Optim.jl (https://github.com/JuliaNLSolvers/Optim.jl/blob/f43e6084aacf2dabb2b142952acd3fbb0e268439/src/multivariate/solvers/first_order/bfgs.jl#L104). If set to a single value, a diagonal matrix with that value along the diagonal will be used. The default is "lavaan", which extracts the Hessian from the lavaanModel. This Hessian will typically deviate from that of the internal SEM representation of lessSEM (due to the transformation of the variances), but works quite well in practice. |
saveDetails |
when set to TRUE, additional details about the individual models are saved. Currently, these are the Hessian and the implied means and covariances. Note: This may take a lot of memory! |
stepSize |
Initial stepSize of the outer iteration (theta_next = theta_previous + stepSize * Stepdirection) |
sigma |
only relevant when lineSearch = 'GLMNET'. Controls the sigma parameter in Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421. |
gamma |
Controls the gamma parameter in Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421. Defaults to 0. |
maxIterOut |
Maximal number of outer iterations |
maxIterIn |
Maximal number of inner iterations |
maxIterLine |
Maximal number of iterations for the line search procedure |
breakOuter |
Stopping criterion for outer iterations |
breakInner |
Stopping criterion for inner iterations |
convergenceCriterion |
which convergence criterion should be used for the outer iterations? Possible values are 0 = GLMNET, 1 = fitChange, 2 = gradients. Note that in the case of gradients and GLMNET, we divide the gradients (and the Hessian) of the log-likelihood by N, as it would otherwise be considerably more difficult for larger sample sizes to reach the convergence criteria. |
verbose |
0 prints no additional information, > 0 prints GLMNET iterations |
nCores |
number of cores to use. Multi-core support is provided by RcppParallel and is only available for SEMs, not for general purpose optimization. |
Value
object of class controlBFGS
Examples
control <- controlBFGS()
controlGlmnet
Description
Control the GLMNET optimizer.
Usage
controlGlmnet(
startingValues = "est",
initialHessian = ifelse(all(startingValues == "est"), "lavaan", "compute"),
saveDetails = FALSE,
stepSize = 0.9,
sigma = 1e-05,
gamma = 0,
maxIterOut = 1000,
maxIterIn = 1000,
maxIterLine = 500,
breakOuter = 1e-08,
breakInner = 1e-10,
convergenceCriterion = 0,
verbose = 0,
nCores = 1
)
Arguments
startingValues |
option to provide initial starting values. Only used for the first lambda. Three options are supported. Setting to "est" will use the estimates from the lavaan model object. Setting to "start" will use the starting values of the lavaan model. Finally, a labeled vector with parameter values can be passed to the function which will then be used as starting values. |
initialHessian |
option to provide an initial Hessian to the optimizer. Must have row and column names corresponding to the parameter labels. Use getLavaanParameters(lavaanModel) to see those labels. If set to "gradNorm", the maximum of the gradients at the starting values times the stepSize will be used. This is adapted from Optim.jl (https://github.com/JuliaNLSolvers/Optim.jl/blob/f43e6084aacf2dabb2b142952acd3fbb0e268439/src/multivariate/solvers/first_order/bfgs.jl#L104). If set to "compute", the initial Hessian will be computed. If set to a single value, a diagonal matrix with that value along the diagonal will be used. The default is "lavaan", which extracts the Hessian from the lavaanModel. This Hessian will typically deviate from that of the internal SEM representation of lessSEM (due to the transformation of the variances), but works quite well in practice. |
saveDetails |
when set to TRUE, additional details about the individual models are saved. Currently, these are the Hessian and the implied means and covariances. Note: This may take a lot of memory! |
stepSize |
Initial stepSize of the outer iteration (theta_next = theta_previous + stepSize * Stepdirection) |
sigma |
only relevant when lineSearch = 'GLMNET'. Controls the sigma parameter in Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421. |
gamma |
Controls the gamma parameter in Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421. Defaults to 0. |
maxIterOut |
Maximal number of outer iterations |
maxIterIn |
Maximal number of inner iterations |
maxIterLine |
Maximal number of iterations for the line search procedure |
breakOuter |
Stopping criterion for outer iterations |
breakInner |
Stopping criterion for inner iterations |
convergenceCriterion |
which convergence criterion should be used for the outer iterations? Possible values are 0 = GLMNET, 1 = fitChange, 2 = gradients. Note that in the case of gradients and GLMNET, we divide the gradients (and the Hessian) of the log-likelihood by N, as it would otherwise be considerably more difficult for larger sample sizes to reach the convergence criteria. |
verbose |
0 prints no additional information, > 0 prints GLMNET iterations |
nCores |
number of cores to use. Multi-core support is provided by RcppParallel and is only available for SEMs, not for general purpose optimization. |
Value
object of class controlGlmnet
Examples
control <- controlGlmnet()
controlIsta
Description
Control the ISTA optimizer.
Usage
controlIsta(
startingValues = "est",
saveDetails = FALSE,
L0 = 0.1,
eta = 2,
accelerate = TRUE,
maxIterOut = 10000,
maxIterIn = 1000,
breakOuter = 1e-08,
convCritInner = 1,
sigma = 0.1,
stepSizeInheritance = ifelse(accelerate, 1, 3),
verbose = 0,
nCores = 1
)
Arguments
startingValues |
option to provide initial starting values. Only used for the first lambda. Three options are supported. Setting to "est" will use the estimates from the lavaan model object. Setting to "start" will use the starting values of the lavaan model. Finally, a labeled vector with parameter values can be passed to the function which will then be used as starting values. |
saveDetails |
when set to TRUE, additional details about the individual models are saved. Currently, these are the implied means and covariances. Note: This may take a lot of memory! |
L0 |
L0 controls the step size used in the first iteration |
eta |
eta controls by how much the step size changes in the inner iterations with (eta^i)*L, where i is the inner iteration |
accelerate |
boolean: Should the acceleration outlined in Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231., p. 152 be used? |
maxIterOut |
maximal number of outer iterations |
maxIterIn |
maximal number of inner iterations |
breakOuter |
change in fit required to break the outer iteration. Note: The value will be multiplied internally with sample size N as the -2log-Likelihood depends directly on the sample size |
convCritInner |
this is related to the inner breaking condition: 0 = ista, as presented by Beck & Teboulle (2009); see Remark 3.1 on p. 191 (ISTA with backtracking); 1 = gist, as presented by Gong et al. (2013; Equation 3) |
sigma |
sigma in (0,1) is used by the gist convergence criterion. Larger values of sigma enforce a larger improvement in fit |
stepSizeInheritance |
how should step sizes be carried forward from iteration to iteration? 0 = resets the step size to L0 in each iteration; 1 = takes the previous step size as initial value for the next iteration; 3 = Barzilai-Borwein procedure; 4 = Barzilai-Borwein procedure, but sometimes resets the step size, which can help when the optimizer is caught in a bad spot. |
verbose |
if set to a value > 0, the fit is printed every 'verbose' iterations. |
nCores |
number of cores to use. Multi-core support is provided by RcppParallel and is only supported for SEM, not for general purpose optimization. |
Value
object of class controlIsta
Examples
control <- controlIsta()
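Two further sketches based on the arguments documented above, switching the inner convergence criterion and the step-size inheritance:
# Sketch: ista-style inner convergence criterion (the Usage default is 1 = gist):
controlIstaCriterion <- controlIsta(convCritInner = 0)
# Sketch: Barzilai-Borwein step-size inheritance without acceleration:
controlBB <- controlIsta(accelerate = FALSE, stepSizeInheritance = 3)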
covariances
Description
Extract the labels of all covariances found in a lavaan model.
Usage
covariances(lavaanModel)
Arguments
lavaanModel |
fitted lavaan model |
Value
vector with parameter labels
Examples
# The following is adapted from ?lavaan::sem
library(lessSEM)
model <- '
# latent variable definitions
ind60 =~ x1 + x2 + x3
dem60 =~ y1 + a*y2 + b*y3 + c*y4
dem65 =~ y5 + a*y6 + b*y7 + c*y8
# regressions
dem60 ~ ind60
dem65 ~ ind60 + dem60
# residual correlations
y1 ~~ y5
y2 ~~ y4 + y6
y3 ~~ y7
y4 ~~ y8
y6 ~~ y8
'
fit <- sem(model, data = PoliticalDemocracy)
covariances(fit)
createSubsets
Description
create subsets for cross-validation
Usage
createSubsets(N, k)
Arguments
N |
number of samples in the data set |
k |
number of subsets to create |
Value
matrix with subsets
Examples
createSubsets(N=100, k = 5)
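The result is a boolean matrix with one row per person and one column per subset; it can be passed to the k argument of the cross-validation functions (e.g., cvLasso). A quick sketch:
# Sketch: inspect the subset matrix and the fold sizes.
subsets <- createSubsets(N = 10, k = 2)
head(subsets)     # TRUE marks the subset a person belongs to
colSums(subsets)  # number of persons per subset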
curveLambda
Description
generates lambda values between 0 and maxLambda using the function described here: https://math.stackexchange.com/questions/384613/exponential-function-with-values-between-0-and-1-for-x-values-between-0-and-1. The function is identical to the one implemented in the regCtsem package.
Usage
curveLambda(maxLambda, lambdasAutoCurve, lambdasAutoLength)
Arguments
maxLambda |
maximal lambda value |
lambdasAutoCurve |
controls the curve: a value close to 1 results in an almost linear increase, while larger values concentrate the lambdas close to 0 |
lambdasAutoLength |
number of lambda values to generate |
Value
numeric vector
Examples
library(lessSEM)
plot(curveLambda(maxLambda = 10, lambdasAutoCurve = 1, lambdasAutoLength = 100))
plot(curveLambda(maxLambda = 10, lambdasAutoCurve = 5, lambdasAutoLength = 100))
plot(curveLambda(maxLambda = 10, lambdasAutoCurve = 100, lambdasAutoLength = 100))
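Since the returned vector is ordinary numeric, it can serve directly as the lambdas argument of the regularization functions; a brief sketch:
# Sketch: exponentially spaced lambdas for use in, e.g., cvLasso or lasso:
lambdas <- curveLambda(maxLambda = 10,
                       lambdasAutoCurve = 5,
                       lambdasAutoLength = 20)
# cvLasso(lavaanModel = ..., regularized = ..., lambdas = lambdas)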
cvAdaptiveLasso
Description
Implements cross-validated adaptive lasso regularization for structural equation models. The penalty function is given by:
p(x_j) = \frac{1}{w_j}\lambda|x_j|
Adaptive lasso regularization will set parameters to zero if \lambda is large enough.
Usage
cvAdaptiveLasso(
lavaanModel,
regularized,
weights = NULL,
lambdas,
k = 5,
standardize = FALSE,
returnSubsetParameters = FALSE,
method = "glmnet",
modifyModel = lessSEM::modifyModel(),
control = lessSEM::controlGlmnet()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
weights |
labeled vector with weights for each of the parameters in the model. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object. If set to NULL, the default weights will be used: the inverse of the absolute values of the unregularized parameter estimates |
lambdas |
numeric vector: values for the tuning parameter lambda |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Adaptive lasso regularization:
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429. https://doi.org/10.1198/016214506000000735
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
model of class cvRegularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- cvAdaptiveLasso(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
lambdas = seq(0,1,.1))
# use the plot-function to plot the cross-validation fit
plot(lsem)
# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters
# The best parameters can also be extracted with:
estimates(lsem)
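If custom weights are preferred, a labeled vector can be passed via the weights argument; the following sketch (reusing lavaanModel from above) reproduces the documented default, the inverse absolute unregularized estimates:
# Sketch: manually constructed adaptive-lasso weights.
unregularized <- getLavaanParameters(lavaanModel)
customWeights <- 1 / abs(unregularized)
lsemWeighted <- cvAdaptiveLasso(
  lavaanModel = lavaanModel,
  regularized = paste0("l", 6:15),
  weights = customWeights,
  lambdas = seq(0, 1, .1))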
cvCappedL1
Description
Implements cappedL1 regularization for structural equation models. The penalty function is given by:
p(x_j) = \lambda\min(|x_j|, \theta)
where \theta > 0. The cappedL1 penalty is identical to the lasso for parameters which are below \theta and identical to a constant for parameters above \theta. As adding a constant to the fitting function will not change its minimum, larger parameters can stay unregularized while smaller ones are set to zero.
Usage
cvCappedL1(
lavaanModel,
regularized,
lambdas,
thetas,
k = 5,
standardize = FALSE,
returnSubsetParameters = FALSE,
modifyModel = lessSEM::modifyModel(),
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta. Parameters whose absolute value is above this threshold are penalized with a constant (lambda*theta) |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures. |
control |
used to control the optimizer. This element is generated with the controlIsta function. See ?controlIsta for more details. |
Details
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
CappedL1 regularization:
Zhang, T. (2010). Analysis of Multi-stage Convex Relaxation for Sparse Regularization. Journal of Machine Learning Research, 11, 1081–1107.
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
model of class cvRegularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- cvCappedL1(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
lambdas = seq(0,1,length.out = 5),
thetas = seq(0.01,2,length.out = 3))
# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters
# optional: plotting the cross-validation fit requires installation of plotly
# plot(lsem)
cvElasticNet
Description
Implements elastic net regularization for structural equation models. The penalty function is given by:
p(x_j) = \alpha\lambda|x_j| + (1-\alpha)\lambda x_j^2
Note that the elastic net combines ridge and lasso regularization. If \alpha = 0, the elastic net reduces to ridge regularization. If \alpha = 1, it reduces to lasso regularization. In between, the elastic net is a compromise between the shrinkage of the lasso and the ridge penalty.
Usage
cvElasticNet(
lavaanModel,
regularized,
lambdas,
alphas,
k = 5,
standardize = FALSE,
returnSubsetParameters = FALSE,
method = "glmnet",
modifyModel = lessSEM::modifyModel(),
control = lessSEM::controlGlmnet()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
alphas |
numeric vector with values of the tuning parameter alpha. Must be between 0 and 1. 0 = ridge, 1 = lasso. |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures. |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Elastic net regularization:
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
model of class cvRegularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- cvElasticNet(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
lambdas = seq(0,1,length.out = 5),
alphas = seq(0,1,length.out = 3))
# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters
# optional: plotting the cross-validation fit requires installation of plotly
# plot(lsem)
cvLasso
Description
Implements cross-validated lasso regularization for structural equation models. The penalty function is given by:
p(x_j) = \lambda|x_j|
Lasso regularization will set parameters to zero if \lambda is large enough.
Usage
cvLasso(
lavaanModel,
regularized,
lambdas,
k = 5,
standardize = FALSE,
returnSubsetParameters = FALSE,
method = "glmnet",
modifyModel = lessSEM::modifyModel(),
control = lessSEM::controlGlmnet()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures. |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Lasso regularization:
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
model of class cvRegularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- cvLasso(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
lambdas = seq(0,1,.1),
k = 5, # number of cross-validation folds
standardize = TRUE) # automatic standardization
# use the plot-function to plot the cross-validation fit:
plot(lsem)
# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters
# The best parameters can also be extracted with:
estimates(lsem)
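Custom cross-validation folds can also be supplied; the following sketch (reusing the dataset and lavaanModel from above) passes a boolean subset matrix from createSubsets to the k argument:
# Sketch: user-defined folds instead of k = 5.
subsets <- createSubsets(N = nrow(dataset), k = 5)
lsemCustomFolds <- cvLasso(
  lavaanModel = lavaanModel,
  regularized = paste0("l", 6:15),
  lambdas = seq(0, 1, .1),
  k = subsets)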
cvLsp
Description
Implements lsp regularization for structural equation models. The penalty function is given by:
p(x_j) = \lambda\log(1 + |x_j|/\theta)
where \theta > 0.
Usage
cvLsp(
lavaanModel,
regularized,
lambdas,
thetas,
k = 5,
standardize = FALSE,
returnSubsetParameters = FALSE,
modifyModel = lessSEM::modifyModel(),
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta (theta > 0), which controls the shape of the log penalty (see the penalty function above) |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures. |
control |
used to control the optimizer. This element is generated with the controlIsta function. See ?controlIsta |
Details
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
lsp regularization:
Candès, E. J., Wakin, M. B., & Boyd, S. P. (2008). Enhancing Sparsity by Reweighted l1 Minimization. Journal of Fourier Analysis and Applications, 14(5–6), 877–905. https://doi.org/10.1007/s00041-008-9045-x
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
model of class cvRegularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- cvLsp(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
lambdas = seq(0,1,length.out = 5),
thetas = seq(0.01,2,length.out = 3))
# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters
# optional: plotting the cross-validation fit requires installation of plotly
# plot(lsem)
cvMcp
Description
Implements mcp regularization for structural equation models. The penalty function is given by:
p(x_j) = \begin{cases} \lambda|x_j| - x_j^2/(2\theta) & \text{if } |x_j| \leq \theta\lambda\\ \theta\lambda^2/2 & \text{if } |x_j| > \theta\lambda \end{cases}
where \theta > 0.
Usage
cvMcp(
lavaanModel,
regularized,
lambdas,
thetas,
k = 5,
standardize = FALSE,
returnSubsetParameters = FALSE,
modifyModel = lessSEM::modifyModel(),
method = "ista",
control = lessSEM::controlIsta()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta (theta > 0). Parameters whose absolute value is above theta*lambda are penalized with a constant (see the penalty function above) |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures. |
control |
used to control the optimizer. This element is generated with the controlIsta function. See ?controlIsta |
Details
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
mcp regularization:
Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2), 894–942. https://doi.org/10.1214/09-AOS729
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
model of class cvRegularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- cvMcp(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
lambdas = seq(0,1,length.out = 5),
thetas = seq(0.01,2,length.out = 3))
# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters
# optional: plotting the cross-validation fit requires installation of plotly
# plot(lsem)
Class for cross-validated regularized SEM
Description
Class for cross-validated regularized SEM
Slots
parameters
data.frame with parameter estimates for the best combination of the tuning parameters
transformations
transformed parameters
cvfits
data.frame with all combinations of the tuning parameters and the sum of the cross-validation fits
parameterLabels
character vector with names of all parameters
regularized
character vector with names of regularized parameters
cvfitsDetails
data.frame with cross-validation fits for each subset
subsets
matrix indicating which person is in which subset
subsetParameters
optional: data.frame with parameter estimates for all combinations of the tuning parameters in all subsets
misc
list with additional return elements
notes
internal notes that have come up when fitting the model
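A minimal sketch of how these slots are accessed on a fitted object (using cvLasso, documented above, and a deliberately tiny model):
# Sketch: accessing cvRegularizedSEM slots with the @ operator.
library(lessSEM)
dataset <- simulateExampleData()
lavaanModel <- lavaan::sem("f =~ l1*y1 + l2*y2 + l3*y3",
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)
lsem <- cvLasso(lavaanModel = lavaanModel,
                regularized = c("l2", "l3"),
                lambdas = seq(0, 1, .5),
                k = 2)
lsem@parameters  # estimates for the best tuning-parameter combination
lsem@cvfits      # tuning parameters with summed cross-validation fits
lsem@subsets     # which person belongs to which subset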
cvRidge
Description
Implements ridge regularization for structural equation models. The penalty function is given by:
p(x_j) = \lambda x_j^2
Note that ridge regularization will not set any of the parameters to zero but will result in shrinkage towards zero.
Usage
cvRidge(
lavaanModel,
regularized,
lambdas,
k = 5,
standardize = FALSE,
returnSubsetParameters = FALSE,
method = "glmnet",
modifyModel = lessSEM::modifyModel(),
control = lessSEM::controlGlmnet()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Ridge regularization:
Hoerl, A. E., & Kennard, R. W. (1970). Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12(1), 55–67. https://doi.org/10.1080/00401706.1970.10488634
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
model of class cvRegularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- cvRidge(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
lambdas = seq(0,1,length.out = 20))
# use the plot-function to plot the cross-validation fit:
plot(lsem)
# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters
cvRidgeBfgs
Description
Implements cross-validated ridge regularization for structural equation models. The penalty function is given by:
p(x_j) = \lambda x_j^2
Note that ridge regularization will not set any of the parameters to zero but will result in shrinkage towards zero.
Usage
cvRidgeBfgs(
lavaanModel,
regularized,
lambdas,
k = 5,
standardize = FALSE,
returnSubsetParameters = FALSE,
modifyModel = lessSEM::modifyModel(),
control = lessSEM::controlBFGS()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
Details
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Ridge regularization:
Hoerl, A. E., & Kennard, R. W. (1970). Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12(1), 55–67. https://doi.org/10.1080/00401706.1970.10488634
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
Value
model of class cvRegularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- cvRidgeBfgs(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
lambdas = seq(0,1,length.out = 20))
# use the plot-function to plot the cross-validation fit:
plot(lsem)
# the coefficients can be accessed with:
coef(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters
cvScad
Description
Implements scad regularization for structural equation models. The penalty function is given by:
p(x_j) = \begin{cases} \lambda|x_j| & \text{if } |x_j| \leq \lambda\\ \frac{-x_j^2 + 2\theta\lambda|x_j| - \lambda^2}{2(\theta - 1)} & \text{if } \lambda < |x_j| \leq \theta\lambda\\ (\theta + 1)\lambda^2/2 & \text{if } |x_j| > \theta\lambda \end{cases}
where \theta > 2.
Usage
cvScad(
lavaanModel,
regularized,
lambdas,
thetas,
k = 5,
standardize = FALSE,
returnSubsetParameters = FALSE,
modifyModel = lessSEM::modifyModel(),
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta (theta > 2). Parameters whose absolute value is above theta*lambda are penalized with a constant (see the penalty function above) |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures. |
control |
used to control the optimizer. This element is generated with the controlIsta function. See ?controlIsta |
Details
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
scad regularization:
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360. https://doi.org/10.1198/016214501753382273
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
model of class cvRegularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- cvScad(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
lambdas = seq(0,1,length.out = 3),
thetas = seq(2.01,5,length.out = 3))
# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters
# optional: plotting the cross-validation fit requires installation of plotly
# plot(lsem)
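To build intuition for the scad penalty defined above, the following purely illustrative sketch (scadPenalty is a hypothetical helper, not a lessSEM function) implements and plots it:
# Illustrative sketch: the scad penalty as given in the Description above.
scadPenalty <- function(x, lambda, theta) {
  absX <- abs(x)
  ifelse(absX <= lambda,
         lambda * absX,
         ifelse(absX <= theta * lambda,
                (-x^2 + 2 * theta * lambda * absX - lambda^2) / (2 * (theta - 1)),
                (theta + 1) * lambda^2 / 2))
}
x <- seq(-3, 3, length.out = 200)
plot(x, scadPenalty(x, lambda = 1, theta = 2.5), type = "l",
     xlab = "parameter value", ylab = "penalty")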
cvScaler
Description
uses the means and standard deviations of the training set to standardize the test set. See, e.g., https://scikit-learn.org/stable/modules/cross_validation.html.
Usage
cvScaler(testSet, means, standardDeviations)
Arguments
testSet |
test data set |
means |
means of the training set |
standardDeviations |
standard deviations of the training set |
Value
scaled test set
Examples
library(lessSEM)
data <- matrix(rnorm(50),10,5)
cvScaler(testSet = data,
means = 1:5,
standardDeviations = 1:5)
cvSmoothAdaptiveLasso
Description
Implements cross-validated smooth adaptive lasso regularization for structural equation models. The penalty function is given by:
p(x_j) = \frac{1}{w_j}\lambda\sqrt{x_j^2 + \epsilon}
Usage
cvSmoothAdaptiveLasso(
lavaanModel,
regularized,
weights = NULL,
lambdas,
epsilon,
k = 5,
standardize = FALSE,
returnSubsetParameters = FALSE,
modifyModel = lessSEM::modifyModel(),
control = lessSEM::controlBFGS()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
weights |
labeled vector with weights for each of the parameters in the model. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object. If set to NULL, the default weights will be used: the inverse of the absolute values of the unregularized parameter estimates |
lambdas |
numeric vector: values for the tuning parameter lambda |
epsilon |
epsilon > 0; controls the smoothness of the approximation. Larger values = smoother |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
Details
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Adaptive lasso regularization:
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429. https://doi.org/10.1198/016214506000000735
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
Value
model of class cvRegularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- cvSmoothAdaptiveLasso(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
lambdas = seq(0,1,.1),
epsilon = 1e-8)
# use the plot-function to plot the cross-validation fit
plot(lsem)
# the coefficients can be accessed with:
coef(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters
# The best parameters can also be extracted with:
coef(lsem)
cvSmoothElasticNet
Description
Implements cross-validated smooth elastic net regularization for structural equation models. The penalty function is given by:
p(x_j) = \alpha\lambda\sqrt{x_j^2 + \epsilon} + (1-\alpha)\lambda x_j^2
Note that the smooth elastic net combines ridge and smooth lasso regularization. If \alpha = 0, the elastic net reduces to ridge regularization. If \alpha = 1, it reduces to smooth lasso regularization. In between, the elastic net is a compromise between the shrinkage of the lasso and the ridge penalty.
Usage
cvSmoothElasticNet(
lavaanModel,
regularized,
lambdas,
alphas,
epsilon,
k = 5,
standardize = FALSE,
returnSubsetParameters = FALSE,
modifyModel = lessSEM::modifyModel(),
control = lessSEM::controlBFGS()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
alphas |
numeric vector with values of the tuning parameter alpha. Must be between 0 and 1. 0 = ridge, 1 = lasso. |
epsilon |
epsilon > 0; controls the smoothness of the approximation. Larger values = smoother |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
Details
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Elastic net regularization:
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
Value
model of class cvRegularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- cvSmoothElasticNet(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
epsilon = 1e-8,
lambdas = seq(0,1,length.out = 5),
alphas = .3)
# the coefficients can be accessed with:
coef(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters
# optional: plotting the cross-validation fit requires installation of plotly
# plot(lsem)
cvSmoothLasso
Description
Implements cross-validated smooth lasso regularization for structural equation models. The penalty function is given by:
p(x_j) = \lambda \sqrt{x_j^2 + \epsilon}
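The following minimal sketch (not part of the package) shows how this smooth approximation approaches the absolute value used by the lasso as epsilon decreases:
smoothAbs <- function(x, epsilon) sqrt(x^2 + epsilon)
smoothAbs(.3, epsilon = 1e-2) # approximately .316
smoothAbs(.3, epsilon = 1e-8) # close to abs(.3) = .3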
Usage
cvSmoothLasso(
lavaanModel,
regularized,
lambdas,
epsilon,
k = 5,
standardize = FALSE,
returnSubsetParameters = FALSE,
modifyModel = lessSEM::modifyModel(),
control = lessSEM::controlBFGS()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
epsilon |
epsilon > 0; controls the smoothness of the approximation. Larger values = smoother |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of how this matrix should look. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize = TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
Details
Identical to regsem, models are specified using lavaan. Currently,
most standard SEM are supported. lessSEM also provides full information
maximum likelihood for missing data. To use this functionality,
fit your lavaan model with the argument sem(..., missing = 'ml')
.
lessSEM will then automatically switch to full information maximum likelihood
as well.
Lasso regularization:
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
Regularized SEM:
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
Value
model of class cvRegularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- cvSmoothLasso(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
lambdas = seq(0,1,.1),
k = 5, # number of cross-validation folds
epsilon = 1e-8,
standardize = TRUE) # automatic standardization
# use the plot-function to plot the cross-validation fit:
plot(lsem)
# the coefficients can be accessed with:
coef(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters
elasticNet
Description
Implements elastic net regularization for structural equation models. The penalty function is given by:
p(x_j) = \alpha\lambda|x_j| + (1-\alpha)\lambda x_j^2
Note that the elastic net combines ridge and lasso regularization. If \alpha = 0, the elastic net reduces to ridge regularization. If \alpha = 1, it reduces to lasso regularization. In between, the elastic net is a compromise between the shrinkage of the lasso and the ridge penalty.
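A minimal sketch (not part of the package) of the penalty value for a single parameter illustrates this mixing:
elasticNetPenalty <- function(x, lambda, alpha)
  alpha * lambda * abs(x) + (1 - alpha) * lambda * x^2
elasticNetPenalty(x = .5, lambda = 1, alpha = 0) # pure ridge: .25
elasticNetPenalty(x = .5, lambda = 1, alpha = 1) # pure lasso: .5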
Usage
elasticNet(
lavaanModel,
regularized,
lambdas,
alphas,
method = "glmnet",
modifyModel = lessSEM::modifyModel(),
control = lessSEM::controlGlmnet()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
alphas |
numeric vector with values of the tuning parameter alpha. Must be between 0 and 1. 0 = ridge, 1 = lasso. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the lessSEM::controlIsta() or controlGlmnet() function, matching the method argument. |
Details
Identical to regsem, models are specified using lavaan. Currently,
most standard SEM are supported. lessSEM also provides full information
maximum likelihood for missing data. To use this functionality,
fit your lavaan model with the argument sem(..., missing = 'ml')
.
lessSEM will then automatically switch to full information maximum likelihood
as well.
Elastic net regularization:
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
Regularized SEM:
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Model of class regularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- elasticNet(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
lambdas = seq(0,1,length.out = 5),
alphas = seq(0,1,length.out = 3))
# the coefficients can be accessed with:
coef(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]
# optional: plotting the paths requires installation of plotly
# plot(lsem)
#### Advanced ###
# Switching the optimizer #
# Use the "method" argument to switch the optimizer. The control argument
# must also be changed to the corresponding function:
lsemIsta <- elasticNet(
lavaanModel = lavaanModel,
regularized = paste0("l", 6:15),
lambdas = seq(0,1,length.out = 5),
alphas = seq(0,1,length.out = 3),
method = "ista",
control = controlIsta())
# Note: The results are basically identical:
lsemIsta@parameters - lsem@parameters
S4 method to extract the estimates of an object
Description
S4 method to extract the estimates of an object
Usage
estimates(object, criterion = NULL, transformations = FALSE)
Arguments
object |
a model fitted with lessSEM |
criterion |
fit index (e.g., AIC) used to select the parameters |
transformations |
boolean: Should transformations be returned? |
Value
returns a matrix with estimates
estimates
Description
estimates
Usage
## S4 method for signature 'cvRegularizedSEM'
estimates(object, criterion = NULL, transformations = FALSE)
Arguments
object |
object of class cvRegularizedSEM |
criterion |
not used |
transformations |
boolean: Should transformations be returned? |
Value
returns a matrix with estimates
estimates
Description
estimates
Usage
## S4 method for signature 'regularizedSEM'
estimates(object, criterion = NULL, transformations = FALSE)
Arguments
object |
object of class regularizedSEM |
criterion |
fit index (e.g., AIC) used to select the parameters |
transformations |
boolean: Should transformations be returned? |
Value
returns a matrix with estimates
estimates
Description
estimates
Usage
## S4 method for signature 'regularizedSEMMixedPenalty'
estimates(object, criterion = NULL, transformations = FALSE)
Arguments
object |
object of class regularizedSEMMixedPenalty |
criterion |
fit index (e.g., AIC) used to select the parameters |
transformations |
boolean: Should transformations be returned? |
Value
returns a matrix with estimates
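A usage sketch for the regularizedSEM method, assuming lsem was fitted with elasticNet as shown earlier in this manual (criterion names such as "AIC" are an assumption based on the argument description):
# estimates for all tuning parameter settings:
estimates(lsem)
# estimates selected by a fit index:
estimates(lsem, criterion = "AIC")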
fit
Description
Optimizes an object with mixed penalty. See ?mixedPenalty for more details.
Usage
fit(mixedPenalty)
Arguments
mixedPenalty |
object of class mixedPenalty. This object can be created with the mixedPenalty function. Penalties can be added with the addCappedL1, addElasticNet, addLasso, addLsp, addMcp, and addScad functions. |
Value
object of class regularizedSEMMixedPenalty; throws an error in case of undefined penalty combinations.
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# We can add mixed penalties as follows:
regularized <- lavaanModel |>
# create template for regularized model with mixed penalty:
mixedPenalty() |>
# add penalty on loadings l11 - l15:
addElasticNet(regularized = paste0("l", 11:15),
lambdas = seq(0,1,.1),
alphas = .4) |>
# fit the model:
fit()
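The fitted object can then be inspected; the following is a sketch building on the example above:
# fits for each tuning parameter configuration:
regularized@fits
# parameter estimates for each tuning parameter configuration:
regularized@parameters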
S4 method to compute fit indices (e.g., AIC, BIC, ...)
Description
S4 method to compute fit indices (e.g., AIC, BIC, ...)
Usage
fitIndices(object)
Arguments
object |
a model fitted with lessSEM |
Value
returns a data.frame with fit indices
fitIndices
Description
fitIndices
Usage
## S4 method for signature 'cvRegularizedSEM'
fitIndices(object)
Arguments
object |
object of class cvRegularizedSEM |
Value
returns a data.frame with fit indices
fitIndices
Description
fitIndices
Usage
## S4 method for signature 'regularizedSEM'
fitIndices(object)
Arguments
object |
object of class regularizedSEM |
Value
returns a data.frame with fit indices
fitIndices
Description
fitIndices
Usage
## S4 method for signature 'regularizedSEMMixedPenalty'
fitIndices(object)
Arguments
object |
object of class regularizedSEMMixedPenalty |
Value
returns a data.frame with fit indices
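A usage sketch, assuming lsem is a model of class regularizedSEM fitted with elasticNet as shown earlier in this manual:
# returns a data.frame with fit indices (e.g., AIC, BIC)
# for each tuning parameter setting:
fitIndices(lsem)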
getLavaanParameters
Description
helper function: returns a labeled vector with parameters from lavaan
Usage
getLavaanParameters(lavaanModel, removeDuplicates = TRUE)
Arguments
lavaanModel |
model of class lavaan |
removeDuplicates |
should duplicated parameters be removed? |
Value
returns a labeled vector with parameters from lavaan
Examples
library(lessSEM)
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
getLavaanParameters(lavaanModel)
getTuningParameterConfiguration
Description
Returns the lambda, theta, and alpha values for the tuning parameters of a regularized SEM with mixed penalty.
Usage
getTuningParameterConfiguration(
regularizedSEMMixedPenalty,
tuningParameterConfiguration
)
Arguments
regularizedSEMMixedPenalty |
object of type regularizedSEMMixedPenalty (see ?mixedPenalty) |
tuningParameterConfiguration |
integer indicating which tuningParameterConfiguration should be extracted (e.g., 1). See the entry in the row tuningParameterConfiguration of regularizedSEMMixedPenalty@fits and regularizedSEMMixedPenalty@parameters. |
Value
data frame with penalty and tuning parameter settings
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# We can add mixed penalties as follows:
regularized <- lavaanModel |>
# create template for regularized model with mixed penalty:
mixedPenalty() |>
# add penalty on loadings l11 - l15:
addLsp(regularized = paste0("l", 11:15),
lambdas = seq(0,1,.1),
thetas = 2.3) |>
# fit the model:
fit()
getTuningParameterConfiguration(regularizedSEMMixedPenalty = regularized,
tuningParameterConfiguration = 2)
CappedL1 optimization with glmnet optimizer
Description
Object for cappedL1 optimization with glmnet optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
CappedL1 optimization with glmnet optimizer
Description
Object for cappedL1 optimization with glmnet optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
elastic net optimization with glmnet optimizer
Description
Object for elastic net optimization with glmnet optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, an R function to compute the fit, an R function to compute the gradients, a list with elements the fit and gradient function require, a lambda and an alpha value.
elastic net optimization with glmnet optimizer
Description
Object for elastic net optimization with glmnet optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEXP function pointer to compute the fit, a SEXP function pointer to compute the gradients, a list with elements the fit and gradient function require, a lambda and an alpha value.
elastic net optimization with glmnet optimizer
Description
Object for elastic net optimization with glmnet optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a lambda and an alpha value.
elastic net optimization with glmnet optimizer
Description
Object for elastic net optimization with glmnet optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a lambda and an alpha value.
lsp optimization with glmnet optimizer
Description
Object for lsp optimization with glmnet optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
lsp optimization with glmnet optimizer
Description
Object for lsp optimization with glmnet optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
mcp optimization with glmnet optimizer
Description
Object for mcp optimization with glmnet optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
mcp optimization with glmnet optimizer
Description
Object for mcp optimization with glmnet optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
mixed optimization with glmnet optimizer
Description
Object for mixed optimization with glmnet optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
mixed optimization with glmnet optimizer
Description
Object for mixed optimization with glmnet optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
mixed optimization with glmnet optimizer
Description
Object for mixed optimization with glmnet optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
mixed optimization with glmnet optimizer
Description
Object for mixed optimization with glmnet optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
scad optimization with glmnet optimizer
Description
Object for scad optimization with glmnet optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
scad optimization with glmnet optimizer
Description
Object for scad optimization with glmnet optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
gpAdaptiveLasso
Description
Implements adaptive lasso regularization for general purpose optimization problems. The penalty function is given by:
p(x_j) = \frac{1}{w_j}\lambda|x_j|
Adaptive lasso regularization will set parameters to zero if \lambda is large enough.
Usage
gpAdaptiveLasso(
par,
regularized,
weights = NULL,
fn,
gr = NULL,
lambdas = NULL,
nLambdas = NULL,
reverse = TRUE,
curve = 1,
...,
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
par |
labeled vector with starting values |
regularized |
vector with names of parameters which are to be regularized. |
weights |
labeled vector with adaptive lasso weights. NULL will use 1/abs(par) |
fn |
R function which takes the parameters as input and returns the fit value (a single value) |
gr |
R function which takes the parameters as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
lambdas |
numeric vector: values for the tuning parameter lambda |
nLambdas |
alternative to lambdas: lessSEM can automatically compute the first lambda value which sets all regularized parameters to zero. It will then generate nLambdas values between 0 and the computed lambda. |
reverse |
if set to TRUE and nLambdas is used, lessSEM will start with the largest lambda and gradually decrease lambda. Otherwise, lessSEM will start with the smallest lambda and gradually increase it. |
curve |
Allows for unequally spaced lambda steps (e.g., .01, .02, .05, 1, 5, 20). If curve is close to 1, all lambda values will be equally spaced; if curve is large, lambda values will be concentrated close to 0. See ?lessSEM::curveLambda for more information. |
... |
additional arguments passed to fn and gr |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
The interface is similar to that of optim. Users have to supply a vector with starting values (important: this vector must have labels) and a fitting function. This fitting function must take a labeled vector with parameter values as its first argument. The remaining arguments are passed with the ... argument, similar to optim.
The gradient function gr is optional. If set to NULL, the numDeriv package will be used to approximate the gradients. Supplying a gradient function can result in considerable speed improvements.
Adaptive lasso regularization:
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429. https://doi.org/10.1198/016214506000000735
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Object of class gpRegularized
Examples
# This example shows how to use the optimizers
# for other objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)
library(lessSEM)
set.seed(123)
# first, we simulate data for our
# linear regression.
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4),
rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
# First, we must construct a fitting function
# which returns a single value. We will use
# the residual sum of squares as fitting function.
# Let's start setting up the fitting function:
fittingFunction <- function(par, y, X, N){
# par is the parameter vector
# y is the observed dependent variable
# X is the design matrix
# N is the sample size
pred <- X %*% matrix(par, ncol = 1) #be explicit here:
# we need par to be a column vector
sse <- sum((y - pred)^2)
# we scale with .5/N to get the same results as glmnet
return((.5/N)*sse)
}
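# Optional: supplying an analytic gradient function can result in
# considerable speed improvements. A minimal sketch (an assumption:
# gr takes the same arguments as fn and returns a labeled gradient
# vector); the gradient of (.5/N)*sse with respect to par is:
gradientFunction <- function(par, y, X, N){
  grad <- (1/N) * t(X) %*% (X %*% matrix(par, ncol = 1) - y)
  grad <- as.vector(grad)
  names(grad) <- names(par) # keep the parameter labels
  return(grad)
}
# pass it to gpAdaptiveLasso below with gr = gradientFunction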
# let's define the starting values:
b <- c(solve(t(X)%*%X)%*%t(X)%*%y) # we will use the lm estimates
names(b) <- paste0("b", 1:length(b))
# names of regularized parameters
regularized <- paste0("b",1:p)
# define the weight for each of the parameters
weights <- 1/abs(b)
# we will re-scale the weights for equivalence to glmnet.
# see ?glmnet for more details
weights <- length(b)*weights/sum(weights)
# optimize
adaptiveLassoPen <- gpAdaptiveLasso(
par = b,
regularized = regularized,
weights = weights,
fn = fittingFunction,
lambdas = seq(0,1,.01),
X = X,
y = y,
N = N
)
plot(adaptiveLassoPen)
# You can access the fit results as follows:
adaptiveLassoPen@fits
# Note that we won't compute any fit measures automatically, as
# we cannot be sure how the AIC, BIC, etc are defined for your objective function
# for comparison:
# library(glmnet)
# coef(glmnet(x = X,
# y = y,
# penalty.factor = weights,
# lambda = adaptiveLassoPen@fits$lambda[20],
# intercept = FALSE,
# standardize = FALSE))[,1]
# adaptiveLassoPen@parameters[20,]
gpAdaptiveLassoCpp
Description
Implements adaptive lasso regularization for general purpose optimization problems with C++ functions. The penalty function is given by:
p(x_j) = \frac{1}{w_j}\lambda|x_j|
Adaptive lasso regularization will set parameters to zero if \lambda is large enough.
Usage
gpAdaptiveLassoCpp(
par,
regularized,
weights = NULL,
fn,
gr,
lambdas = NULL,
nLambdas = NULL,
curve = 1,
additionalArguments,
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
par |
labeled vector with starting values |
regularized |
vector with names of parameters which are to be regularized. |
weights |
labeled vector with adaptive lasso weights. NULL will use 1/abs(par) |
fn |
pointer to a compiled C++ function which takes the parameters as input and returns the fit value (a single value). See the example below for how to create such a pointer |
gr |
pointer to a compiled C++ function which takes the parameters as input and returns the gradients of the objective function |
lambdas |
numeric vector: values for the tuning parameter lambda |
nLambdas |
alternative to lambdas: lessSEM can automatically compute the first lambda value which sets all regularized parameters to zero. It will then generate nLambdas values between 0 and the computed lambda. |
curve |
Allows for unequally spaced lambda steps (e.g., .01, .02, .05, 1, 5, 20). If curve is close to 1, all lambda values will be equally spaced; if curve is large, lambda values will be concentrated close to 0. See ?lessSEM::curveLambda for more information. |
additionalArguments |
list with additional arguments passed to fn and gr |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
The interface is inspired by optim, but a bit more restrictive. Users have to supply a vector with starting values (important: this vector must have labels), a fitting function, and a gradient function. These functions must take a const Rcpp::NumericVector& with parameter values as their first argument and an Rcpp::List& with additional elements as their second argument.
Adaptive lasso regularization:
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429. https://doi.org/10.1198/016214506000000735
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Object of class gpRegularized
Examples
# This example shows how to use the optimizers
# for C++ objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)
library(Rcpp)
library(lessSEM)
linreg <- '
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
// [[Rcpp::export]]
double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
// extract all required elements:
arma::colvec b = Rcpp::as<arma::colvec>(parameters);
arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
// compute the sum of squared errors:
arma::mat sse = arma::trans(y-X*b)*(y-X*b);
// other packages, such as glmnet, scale the sse with
// 1/(2*N), where N is the sample size. We will do that here as well
sse *= 1.0/(2.0 * y.n_elem);
// note: We must return a double, but the sse is a matrix
// To get a double, just return the single value that is in
// this matrix:
return(sse(0,0));
}
// [[Rcpp::export]]
arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
// extract all required elements:
arma::colvec b = Rcpp::as<arma::colvec>(parameters);
arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
// note: we want to return our gradients as row-vector; therefore,
// we have to transpose the resulting column-vector:
arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b);
// other packages, such as glmnet, scale the sse with
// 1/(2*N), where N is the sample size. We will do that here as well
gradients *= (.5/y.n_rows);
return(gradients);
}
// Dirk Eddelbuettel at
// https://gallery.rcpp.org/articles/passing-cpp-function-pointers/
typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters
Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t;
typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters
Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t;
// [[Rcpp::export]]
fitFunPtr_t fitfunPtr() {
return(fitFunPtr_t(new fitFunPtr(&fitfunction)));
}
// [[Rcpp::export]]
gradientFunPtr_t gradfunPtr() {
return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction)));
}
'
Rcpp::sourceCpp(code = linreg)
ffp <- fitfunPtr()
gfp <- gradfunPtr()
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4),
rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
data <- list("y" = y,
"X" = cbind(1,X))
parameters <- rep(0, ncol(data$X))
names(parameters) <- paste0("b", 0:(length(parameters)-1))
al1 <- gpAdaptiveLassoCpp(par = parameters,
regularized = paste0("b", 1:(length(b)-1)),
fn = ffp,
gr = gfp,
lambdas = seq(0,1,.1),
additionalArguments = data)
al1@parameters
gpCappedL1
Description
Implements cappedL1 regularization for general purpose optimization problems. The penalty function is given by:
p(x_j) = \lambda \min(|x_j|, \theta)
where \theta > 0. The cappedL1 penalty is identical to the lasso for parameters which are below \theta and identical to a constant for parameters above \theta. As adding a constant to the fitting function does not change its minimum, larger parameters can stay unregularized while smaller ones are set to zero.
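A minimal sketch (not part of the package) of the penalty values makes the capping visible:
cappedL1Penalty <- function(x, lambda, theta) lambda * pmin(abs(x), theta)
cappedL1Penalty(c(.1, .5, 2), lambda = 1, theta = .5) # returns .1 .5 .5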
Usage
gpCappedL1(
par,
fn,
gr = NULL,
...,
regularized,
lambdas,
thetas,
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
par |
labeled vector with starting values |
fn |
R function which takes the parameters AND their labels as input and returns the fit value (a single value) |
gr |
R function which takes the parameters AND their labels as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
... |
additional arguments passed to fn and gr |
regularized |
vector with names of parameters which are to be regularized. |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
parameters whose absolute value is above this threshold will be penalized with a constant (lambda * theta) |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
The interface is similar to that of optim. Users have to supply a vector with starting values (important: this vector must have labels) and a fitting function. This fitting function must take a labeled vector with parameter values as its first argument. The remaining arguments are passed with the ... argument, similar to optim.
The gradient function gr is optional. If set to NULL, the numDeriv package will be used to approximate the gradients. Supplying a gradient function can result in considerable speed improvements.
CappedL1 regularization:
Zhang, T. (2010). Analysis of Multi-stage Convex Relaxation for Sparse Regularization. Journal of Machine Learning Research, 11, 1081–1107.
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Object of class gpRegularized
Examples
# This example shows how to use the optimizers
# for other objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)
library(lessSEM)
set.seed(123)
# first, we simulate data for our
# linear regression.
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4),
rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
# First, we must construct a fitting function
# which returns a single value. We will use
# the residual sum of squares as fitting function.
# Let's start setting up the fitting function:
fittingFunction <- function(par, y, X, N){
# par is the parameter vector
# y is the observed dependent variable
# X is the design matrix
# N is the sample size
pred <- X %*% matrix(par, ncol = 1) #be explicit here:
# we need par to be a column vector
sse <- sum((y - pred)^2)
# we scale with .5/N to get the same results as glmnet
return((.5/N)*sse)
}
# let's define the starting values:
b <- c(solve(t(X)%*%X)%*%t(X)%*%y) # we will use the lm estimates
names(b) <- paste0("b", 1:length(b))
# names of regularized parameters
regularized <- paste0("b",1:p)
# optimize
cL1 <- gpCappedL1(
par = b,
regularized = regularized,
fn = fittingFunction,
lambdas = seq(0,1,.1),
thetas = c(0.001, .5, 1),
X = X,
y = y,
N = N
)
# optional: plot requires plotly package
# plot(cL1)
# for comparison
fittingFunction <- function(par, y, X, N, lambda, theta){
pred <- X %*% matrix(par, ncol = 1)
sse <- sum((y - pred)^2)
smoothAbs <- sqrt(par^2 + 1e-8)
pen <- lambda * ifelse(smoothAbs < theta, smoothAbs, theta)
return((.5/N)*sse + sum(pen))
}
round(
optim(par = b,
fn = fittingFunction,
y = y,
X = X,
N = N,
lambda = cL1@fits$lambda[15],
theta = cL1@fits$theta[15],
method = "BFGS")$par,
4)
cL1@parameters[15,]
gpCappedL1Cpp
Description
Implements cappedL1 regularization for general purpose optimization problems with C++ functions. The penalty function is given by:
p(x_j) = \lambda \min(|x_j|, \theta)
where \theta > 0. The cappedL1 penalty is identical to the lasso for parameters which are below \theta and identical to a constant for parameters above \theta. As adding a constant to the fitting function does not change its minimum, larger parameters can stay unregularized while smaller ones are set to zero.
Usage
gpCappedL1Cpp(
par,
fn,
gr,
additionalArguments,
regularized,
lambdas,
thetas,
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
par |
labeled vector with starting values |
fn |
pointer to a compiled C++ function which takes the parameters as input and returns the fit value (a single value). See the example below for how to create such a pointer |
gr |
pointer to a compiled C++ function which takes the parameters as input and returns the gradients of the objective function |
additionalArguments |
list with additional arguments passed to fn and gr |
regularized |
vector with names of parameters which are to be regularized. |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
parameters whose absolute value is above this threshold will be penalized with a constant (lambda * theta) |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
The interface is inspired by optim, but a bit more restrictive. Users have to supply a vector with starting values (important: this vector must have labels), a fitting function, and a gradient function. These functions must take a const Rcpp::NumericVector& with parameter values as their first argument and an Rcpp::List& with additional elements as their second argument.
CappedL1 regularization:
Zhang, T. (2010). Analysis of Multi-stage Convex Relaxation for Sparse Regularization. Journal of Machine Learning Research, 11, 1081–1107.
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Object of class gpRegularized
Examples
# This example shows how to use the optimizers
# for C++ objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)
library(Rcpp)
library(lessSEM)
linreg <- '
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
// [[Rcpp::export]]
double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
// extract all required elements:
arma::colvec b = Rcpp::as<arma::colvec>(parameters);
arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
// compute the sum of squared errors:
arma::mat sse = arma::trans(y-X*b)*(y-X*b);
// other packages, such as glmnet, scale the sse with
// 1/(2*N), where N is the sample size. We will do that here as well
sse *= 1.0/(2.0 * y.n_elem);
// note: We must return a double, but the sse is a matrix
// To get a double, just return the single value that is in
// this matrix:
return(sse(0,0));
}
// [[Rcpp::export]]
arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
// extract all required elements:
arma::colvec b = Rcpp::as<arma::colvec>(parameters);
arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
// note: we want to return our gradients as row-vector; therefore,
// we have to transpose the resulting column-vector:
arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b);
// other packages, such as glmnet, scale the sse with
// 1/(2*N), where N is the sample size. We will do that here as well
gradients *= (.5/y.n_rows);
return(gradients);
}
// https://gallery.rcpp.org/articles/passing-cpp-function-pointers/
typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters
Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t;
typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters
Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t;
// [[Rcpp::export]]
fitFunPtr_t fitfunPtr() {
return(fitFunPtr_t(new fitFunPtr(&fitfunction)));
}
// [[Rcpp::export]]
gradientFunPtr_t gradfunPtr() {
return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction)));
}
'
Rcpp::sourceCpp(code = linreg)
ffp <- fitfunPtr()
gfp <- gradfunPtr()
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4),
rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
data <- list("y" = y,
"X" = cbind(1,X))
parameters <- rep(0, ncol(data$X))
names(parameters) <- paste0("b", 0:(length(parameters)-1))
cL1 <- gpCappedL1Cpp(par = parameters,
regularized = paste0("b", 1:(length(b)-1)),
fn = ffp,
gr = gfp,
lambdas = seq(0,1,.1),
thetas = seq(0.1,1,.1),
additionalArguments = data)
cL1@parameters
gpElasticNet
Description
Implements elastic net regularization for general purpose optimization problems. The penalty function is given by:
p(x_j) = \alpha\lambda|x_j| + (1-\alpha)\lambda x_j^2
Note that the elastic net combines ridge and lasso regularization. If \alpha = 0, the elastic net reduces to ridge regularization. If \alpha = 1, it reduces to lasso regularization. In between, the elastic net is a compromise between the shrinkage of the lasso and the ridge penalty.
Usage
gpElasticNet(
par,
regularized,
fn,
gr = NULL,
lambdas,
alphas,
...,
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
par |
labeled vector with starting values |
regularized |
vector with names of parameters which are to be regularized. |
fn |
R function which takes the parameters AND their labels as input and returns the fit value (a single value) |
gr |
R function which takes the parameters AND their labels as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
lambdas |
numeric vector: values for the tuning parameter lambda |
alphas |
numeric vector with values of the tuning parameter alpha. Must be between 0 and 1. 0 = ridge, 1 = lasso. |
... |
additional arguments passed to fn and gr |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
The interface is similar to that of optim. Users have to supply a vector with starting values (important: this vector must have labels) and a fitting function. This fitting function must take a labeled vector with parameter values as its first argument. The remaining arguments are passed with the ... argument, similar to optim.
The gradient function gr is optional. If set to NULL, the numDeriv package will be used to approximate the gradients. Supplying a gradient function can result in considerable speed improvements.
Elastic net regularization:
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Object of class gpRegularized
Examples
# This example shows how to use the optimizers
# for other objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)
library(lessSEM)
set.seed(123)
# first, we simulate data for our
# linear regression.
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4),
rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
# First, we must construct a fitting function
# which returns a single value. We will use
# the residual sum of squares as fitting function.
# Let's start setting up the fitting function:
fittingFunction <- function(par, y, X, N){
# par is the parameter vector
# y is the observed dependent variable
# X is the design matrix
# N is the sample size
pred <- X %*% matrix(par, ncol = 1) #be explicit here:
# we need par to be a column vector
sse <- sum((y - pred)^2)
# we scale with .5/N to get the same results as glmnet
return((.5/N)*sse)
}
# let's define the starting values:
b <- c(solve(t(X)%*%X)%*%t(X)%*%y) # we will use the lm estimates
names(b) <- paste0("b", 1:length(b))
# names of regularized parameters
regularized <- paste0("b",1:p)
# optimize
elasticNetPen <- gpElasticNet(
par = b,
regularized = regularized,
fn = fittingFunction,
lambdas = seq(0,1,.1),
alphas = c(0, .5, 1),
X = X,
y = y,
N = N
)
# optional: plot requires plotly package
# plot(elasticNetPen)
# for comparison:
fittingFunction <- function(par, y, X, N, lambda, alpha){
pred <- X %*% matrix(par, ncol = 1)
sse <- sum((y - pred)^2)
return((.5/N)*sse + (1-alpha)*lambda * sum(par^2) + alpha*lambda *sum(sqrt(par^2 + 1e-8)))
}
round(
optim(par = b,
fn = fittingFunction,
y = y,
X = X,
N = N,
lambda = elasticNetPen@fits$lambda[15],
alpha = elasticNetPen@fits$alpha[15],
method = "BFGS")$par,
4)
elasticNetPen@parameters[15,]
gpElasticNetCpp
Description
Implements elastic net regularization for general purpose optimization problems with C++ functions. The penalty function is given by:
p(x_j) = \alpha\lambda|x_j| + (1-\alpha)\lambda x_j^2
Note that the elastic net combines ridge and lasso regularization. If \alpha = 0, the elastic net reduces to ridge regularization. If \alpha = 1, it reduces to lasso regularization. In between, the elastic net is a compromise between the shrinkage of the lasso and the ridge penalty.
Usage
gpElasticNetCpp(
par,
regularized,
fn,
gr,
lambdas,
alphas,
additionalArguments,
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
par |
labeled vector with starting values |
regularized |
vector with names of parameters which are to be regularized. |
fn |
pointer to a compiled C++ function which takes the parameters as input and returns the fit value (a single value). See the example below for how to create such a pointer |
gr |
pointer to a compiled C++ function which takes the parameters as input and returns the gradients of the objective function |
lambdas |
numeric vector: values for the tuning parameter lambda |
alphas |
numeric vector with values of the tuning parameter alpha. Must be between 0 and 1. 0 = ridge, 1 = lasso. |
additionalArguments |
list with additional arguments passed to fn and gr |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
The interface is inspired by optim, but a bit more restrictive. Users have to supply a vector with starting values (important: this vector must have labels), a fitting function, and a gradient function. These functions must take a const Rcpp::NumericVector& with parameter values as their first argument and an Rcpp::List& with additional elements as their second argument.
Elastic net regularization:
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Object of class gpRegularized
Examples
# This example shows how to use the optimizers
# for C++ objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)
library(Rcpp)
library(lessSEM)
linreg <- '
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
// [[Rcpp::export]]
double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
// extract all required elements:
arma::colvec b = Rcpp::as<arma::colvec>(parameters);
arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
// compute the sum of squared errors:
arma::mat sse = arma::trans(y-X*b)*(y-X*b);
// other packages, such as glmnet, scale the sse with
// 1/(2*N), where N is the sample size. We will do that here as well
sse *= 1.0/(2.0 * y.n_elem);
// note: We must return a double, but the sse is a matrix
// To get a double, just return the single value that is in
// this matrix:
return(sse(0,0));
}
// [[Rcpp::export]]
arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
// extract all required elements:
arma::colvec b = Rcpp::as<arma::colvec>(parameters);
arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
// note: we want to return our gradients as row-vector; therefore,
// we have to transpose the resulting column-vector:
arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b);
// other packages, such as glmnet, scale the sse with
// 1/(2*N), where N is the sample size. We will do that here as well
gradients *= (.5/y.n_rows);
return(gradients);
}
// Dirk Eddelbuettel at
// https://gallery.rcpp.org/articles/passing-cpp-function-pointers/
typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters
Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t;
typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters
Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t;
// [[Rcpp::export]]
fitFunPtr_t fitfunPtr() {
return(fitFunPtr_t(new fitFunPtr(&fitfunction)));
}
// [[Rcpp::export]]
gradientFunPtr_t gradfunPtr() {
return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction)));
}
'
Rcpp::sourceCpp(code = linreg)
ffp <- fitfunPtr()
gfp <- gradfunPtr()
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4),
rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
data <- list("y" = y,
"X" = cbind(1,X))
parameters <- rep(0, ncol(data$X))
names(parameters) <- paste0("b", 0:(length(parameters)-1))
en <- gpElasticNetCpp(par = parameters,
regularized = paste0("b", 1:p), # all predictors b1, ..., b10, but not the intercept b0
fn = ffp,
gr = gfp,
lambdas = seq(0,1,.1),
alphas = c(0,.5,1),
additionalArguments = data)
en@parameters
gpLasso
Description
Implements lasso regularization for general purpose optimization problems. The penalty function is given by:
p(x_j) = \lambda |x_j|
Lasso regularization will set parameters to zero if \lambda is large enough.
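For illustration, the penalty value itself is easy to compute directly; the following sketch (lassoPenalty is our own helper, not part of lessSEM) evaluates it for a parameter vector:
# minimal sketch, not part of lessSEM:
lassoPenalty <- function(x, lambda) {
  lambda * sum(abs(x))
}
lassoPenalty(x = c(.3, 0, -.2), lambda = .5) # 0.25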
Usage
gpLasso(
par,
regularized,
fn,
gr = NULL,
lambdas = NULL,
nLambdas = NULL,
reverse = TRUE,
curve = 1,
...,
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
par |
labeled vector with starting values |
regularized |
vector with names of parameters which are to be regularized. |
fn |
R function which takes the parameters as input and returns the fit value (a single value) |
gr |
R function which takes the parameters as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
lambdas |
numeric vector: values for the tuning parameter lambda |
nLambdas |
alternative to lambdas: lessSEM can automatically compute the first lambda value which sets all regularized parameters to zero. It will then generate nLambdas values between 0 and the computed lambda. |
reverse |
if set to TRUE and nLambdas is used, lessSEM will start with the largest lambda and gradually decrease lambda. Otherwise, lessSEM will start with the smallest lambda and gradually increase it. |
curve |
Allows for unequally spaced lambda steps (e.g., .01, .02, .05, 1, 5, 20). If curve is close to 1, all lambda values will be equally spaced; if curve is large, lambda values will be more concentrated close to 0. See ?lessSEM::curveLambda for more information. |
... |
additional arguments passed to fn and gr |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
The interface is similar to that of optim. Users have to supply a vector with starting values (important: this vector must have labels) and a fitting function. The fitting function must take a labeled vector with parameter values as its first argument. The remaining arguments are passed via the ... argument, as in optim.
The gradient function gr is optional. If set to NULL, the numDeriv package will be used to approximate the gradients. Supplying a gradient function can result in considerable speed improvements.
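For the linear-regression example below, an analytic gradient is straightforward. The following is our own sketch (gradientFunction is not part of lessSEM); it could be passed as gr = gradientFunction in the gpLasso call:
# sketch of an analytic gradient for the fitting function used below:
gradientFunction <- function(par, y, X, N){
  # gradient of (.5/N) * sum((y - X %*% par)^2) with respect to par
  grad <- drop((1/N) * (t(X) %*% (X %*% matrix(par, ncol = 1)) - t(X) %*% y))
  names(grad) <- names(par) # keep the parameter labels
  grad
}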
Lasso regularization:
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Object of class gpRegularized
Examples
# This example shows how to use the optimizers
# for other objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)
library(lessSEM)
set.seed(123)
# first, we simulate data for our
# linear regression.
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4),
rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
# First, we must construct a fitting function
# which returns a single value. We will use
# the residual sum of squares as the fitting function.
# Let's start setting up the fitting function:
fittingFunction <- function(par, y, X, N){
# par is the parameter vector
# y is the observed dependent variable
# X is the design matrix
# N is the sample size
pred <- X %*% matrix(par, ncol = 1) #be explicit here:
# we need par to be a column vector
sse <- sum((y - pred)^2)
# we scale with .5/N to get the same results as glmnet
return((.5/N)*sse)
}
# let's define the starting values:
b <- rep(0,p)
names(b) <- paste0("b", 1:length(b))
# names of regularized parameters
regularized <- paste0("b",1:p)
# optimize
lassoPen <- gpLasso(
par = b,
regularized = regularized,
fn = fittingFunction,
nLambdas = 100,
X = X,
y = y,
N = N
)
plot(lassoPen)
# You can access the fit results as follows:
lassoPen@fits
# Note that we won't compute any fit measures automatically, as
# we cannot be sure how the AIC, BIC, etc. are defined for your objective function.
gpLassoCpp
Description
Implements lasso regularization for general purpose optimization problems with C++ functions. The penalty function is given by:
p(x_j) = \lambda |x_j|
Lasso regularization will set parameters to zero if \lambda is large enough.
Usage
gpLassoCpp(
par,
regularized,
fn,
gr,
lambdas = NULL,
nLambdas = NULL,
curve = 1,
additionalArguments,
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
par |
labeled vector with starting values |
regularized |
vector with names of parameters which are to be regularized. |
fn |
pointer to Rcpp function which takes the parameters as input and returns the fit value (a single value) |
gr |
pointer to Rcpp function which takes the parameters as input and returns the gradients of the objective function. |
lambdas |
numeric vector: values for the tuning parameter lambda |
nLambdas |
alternative to lambdas: lessSEM can automatically compute the first lambda value which sets all regularized parameters to zero. It will then generate nLambdas values between 0 and the computed lambda. |
curve |
Allows for unequally spaced lambda steps (e.g., .01, .02, .05, 1, 5, 20). If curve is close to 1, all lambda values will be equally spaced; if curve is large, lambda values will be more concentrated close to 0. See ?lessSEM::curveLambda for more information. |
additionalArguments |
list with additional arguments passed to fn and gr |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
The interface is inspired by optim, but somewhat more restrictive. Users have to supply a vector with starting values (important: this vector must have labels), a fitting function, and a gradient function. The fitting function must take a const Rcpp::NumericVector& with parameter values as its first argument and an Rcpp::List& as its second argument.
Lasso regularization:
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Object of class gpRegularized
Examples
# This example shows how to use the optimizers
# for C++ objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)
library(Rcpp)
library(lessSEM)
linreg <- '
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
// [[Rcpp::export]]
double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
// extract all required elements:
arma::colvec b = Rcpp::as<arma::colvec>(parameters);
arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
// compute the sum of squared errors:
arma::mat sse = arma::trans(y-X*b)*(y-X*b);
// other packages, such as glmnet, scale the sse with
// 1/(2*N), where N is the sample size. We will do that here as well
sse *= 1.0/(2.0 * y.n_elem);
// note: We must return a double, but the sse is a matrix
// To get a double, just return the single value that is in
// this matrix:
return(sse(0,0));
}
// [[Rcpp::export]]
arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
// extract all required elements:
arma::colvec b = Rcpp::as<arma::colvec>(parameters);
arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
// note: we want to return our gradients as row-vector; therefore,
// we have to transpose the resulting column-vector:
arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b);
// other packages, such as glmnet, scale the sse with
// 1/(2*N), where N is the sample size. We will do that here as well
gradients *= (.5/y.n_rows);
return(gradients);
}
// function pointer approach adapted from Dirk Eddelbuettel:
// https://gallery.rcpp.org/articles/passing-cpp-function-pointers/
typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters
Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t;
typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters
Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t;
// [[Rcpp::export]]
fitFunPtr_t fitfunPtr() {
return(fitFunPtr_t(new fitFunPtr(&fitfunction)));
}
// [[Rcpp::export]]
gradientFunPtr_t gradfunPtr() {
return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction)));
}
'
Rcpp::sourceCpp(code = linreg)
ffp <- fitfunPtr()
gfp <- gradfunPtr()
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4),
rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
data <- list("y" = y,
"X" = cbind(1,X))
parameters <- rep(0, ncol(data$X))
names(parameters) <- paste0("b", 0:(length(parameters)-1))
l1 <- gpLassoCpp(par = parameters,
regularized = paste0("b", 1:p), # all predictors b1, ..., b10, but not the intercept b0
fn = ffp,
gr = gfp,
lambdas = seq(0,1,.1),
additionalArguments = data)
l1@parameters
gpLsp
Description
Implements lsp regularization for general purpose optimization problems. The penalty function is given by:
p(x_j) = \lambda \log(1 + |x_j|/\theta)
where \theta > 0.
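For illustration, the penalty value can be evaluated directly (lspPenalty is our own helper, not part of lessSEM):
# minimal sketch, not part of lessSEM:
lspPenalty <- function(x, lambda, theta) {
  stopifnot(theta > 0)
  lambda * sum(log(1 + abs(x) / theta))
}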
Usage
gpLsp(
par,
fn,
gr = NULL,
...,
regularized,
lambdas,
thetas,
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
par |
labeled vector with starting values |
fn |
R function which takes the parameters AND their labels as input and returns the fit value (a single value) |
gr |
R function which takes the parameters AND their labels as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
... |
additional arguments passed to fn and gr |
regularized |
vector with names of parameters which are to be regularized. |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
The interface is similar to that of optim. Users have to supply a vector with starting values (important: this vector must have labels) and a fitting function. The fitting function must take a labeled vector with parameter values as its first argument. The remaining arguments are passed via the ... argument, as in optim.
The gradient function gr is optional. If set to NULL, the numDeriv package will be used to approximate the gradients. Supplying a gradient function can result in considerable speed improvements.
lsp regularization:
Candès, E. J., Wakin, M. B., & Boyd, S. P. (2008). Enhancing Sparsity by Reweighted l1 Minimization. Journal of Fourier Analysis and Applications, 14(5–6), 877–905. https://doi.org/10.1007/s00041-008-9045-x
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Object of class gpRegularized
Examples
library(lessSEM)
set.seed(123)
# first, we simulate data for our
# linear regression.
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4),
rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
# First, we must construct a fitting function
# which returns a single value. We will use
# the residual sum of squares as the fitting function.
# Let's start setting up the fitting function:
fittingFunction <- function(par, y, X, N){
# par is the parameter vector
# y is the observed dependent variable
# X is the design matrix
# N is the sample size
pred <- X %*% matrix(par, ncol = 1) #be explicit here:
# we need par to be a column vector
sse <- sum((y - pred)^2)
# we scale with .5/N to get the same results as glmnet
return((.5/N)*sse)
}
# let's define the starting values:
b <- c(solve(t(X)%*%X)%*%t(X)%*%y) # we will use the lm estimates
names(b) <- paste0("b", 1:length(b))
# names of regularized parameters
regularized <- paste0("b",1:p)
# optimize
lspPen <- gpLsp(
par = b,
regularized = regularized,
fn = fittingFunction,
lambdas = seq(0,1,.1),
thetas = c(0.001, .5, 1),
X = X,
y = y,
N = N
)
# optional: plot requires plotly package
# plot(lspPen)
# for comparison
fittingFunction <- function(par, y, X, N, lambda, theta){
pred <- X %*% matrix(par, ncol = 1)
sse <- sum((y - pred)^2)
smoothAbs <- sqrt(par^2 + 1e-8) # smooth approximation of |par|
pen <- lambda * log(1.0 + smoothAbs / theta) # lsp penalty term
return((.5/N)*sse + sum(pen))
}
round(
optim(par = b,
fn = fittingFunction,
y = y,
X = X,
N = N,
lambda = lspPen@fits$lambda[15],
theta = lspPen@fits$theta[15],
method = "BFGS")$par,
4)
lspPen@parameters[15,]
gpLspCpp
Description
Implements lsp regularization for general purpose optimization problems with C++ functions. The penalty function is given by:
p(x_j) = \lambda \log(1 + |x_j|/\theta)
where \theta > 0.
Usage
gpLspCpp(
par,
fn,
gr,
additionalArguments,
regularized,
lambdas,
thetas,
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
par |
labeled vector with starting values |
fn |
pointer to Rcpp function which takes the parameters as input and returns the fit value (a single value) |
gr |
pointer to Rcpp function which takes the parameters as input and returns the gradients of the objective function |
additionalArguments |
list with additional arguments passed to fn and gr |
regularized |
vector with names of parameters which are to be regularized. |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
The interface is inspired by optim, but somewhat more restrictive. Users have to supply a vector with starting values (important: this vector must have labels), a fitting function, and a gradient function. The fitting function must take a const Rcpp::NumericVector& with parameter values as its first argument and an Rcpp::List& as its second argument.
lsp regularization:
Candès, E. J., Wakin, M. B., & Boyd, S. P. (2008). Enhancing Sparsity by Reweighted l1 Minimization. Journal of Fourier Analysis and Applications, 14(5–6), 877–905. https://doi.org/10.1007/s00041-008-9045-x
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Object of class gpRegularized
Examples
# This example shows how to use the optimizers
# for C++ objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)
library(Rcpp)
library(lessSEM)
linreg <- '
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
// [[Rcpp::export]]
double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
// extract all required elements:
arma::colvec b = Rcpp::as<arma::colvec>(parameters);
arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
// compute the sum of squared errors:
arma::mat sse = arma::trans(y-X*b)*(y-X*b);
// other packages, such as glmnet, scale the sse with
// 1/(2*N), where N is the sample size. We will do that here as well
sse *= 1.0/(2.0 * y.n_elem);
// note: We must return a double, but the sse is a matrix
// To get a double, just return the single value that is in
// this matrix:
return(sse(0,0));
}
// [[Rcpp::export]]
arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
// extract all required elements:
arma::colvec b = Rcpp::as<arma::colvec>(parameters);
arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
// note: we want to return our gradients as row-vector; therefore,
// we have to transpose the resulting column-vector:
arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b);
// other packages, such as glmnet, scale the sse with
// 1/(2*N), where N is the sample size. We will do that here as well
gradients *= (.5/y.n_rows);
return(gradients);
}
// function pointer approach adapted from Dirk Eddelbuettel:
// https://gallery.rcpp.org/articles/passing-cpp-function-pointers/
typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters
Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t;
typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters
Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t;
// [[Rcpp::export]]
fitFunPtr_t fitfunPtr() {
return(fitFunPtr_t(new fitFunPtr(&fitfunction)));
}
// [[Rcpp::export]]
gradientFunPtr_t gradfunPtr() {
return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction)));
}
'
Rcpp::sourceCpp(code = linreg)
ffp <- fitfunPtr()
gfp <- gradfunPtr()
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4),
rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
data <- list("y" = y,
"X" = cbind(1,X))
parameters <- rep(0, ncol(data$X))
names(parameters) <- paste0("b", 0:(length(parameters)-1))
l <- gpLspCpp(par = parameters,
regularized = paste0("b", 1:p), # all predictors b1, ..., b10, but not the intercept b0
fn = ffp,
gr = gfp,
lambdas = seq(0,1,.1),
thetas = seq(0.1,1,.1),
additionalArguments = data)
l@parameters
gpMcp
Description
Implements mcp regularization for general purpose optimization problems. The penalty function is given by:
p(x_j) = \begin{cases}
\lambda |x_j| - x_j^2/(2\theta) & \text{if } |x_j| \leq \theta\lambda\\
\theta\lambda^2/2 & \text{if } |x_j| > \theta\lambda
\end{cases}
where \theta > 0.
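For illustration, the penalty value can be evaluated directly (mcpPenalty is our own helper, not part of lessSEM):
# minimal sketch, not part of lessSEM:
mcpPenalty <- function(x, lambda, theta) {
  stopifnot(theta > 0)
  sum(ifelse(abs(x) <= theta * lambda,
             lambda * abs(x) - x^2 / (2 * theta), # quadratic region
             theta * lambda^2 / 2))               # constant region
}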
Usage
gpMcp(
par,
fn,
gr = NULL,
...,
regularized,
lambdas,
thetas,
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
par |
labeled vector with starting values |
fn |
R function which takes the parameters AND their labels as input and returns the fit value (a single value) |
gr |
R function which takes the parameters AND their labels as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
... |
additional arguments passed to fn and gr |
regularized |
vector with names of parameters which are to be regularized. |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
The interface is similar to that of optim. Users have to supply a vector with starting values (important: this vector must have labels) and a fitting function. The fitting function must take a labeled vector with parameter values as its first argument. The remaining arguments are passed via the ... argument, as in optim.
The gradient function gr is optional. If set to NULL, the numDeriv package will be used to approximate the gradients. Supplying a gradient function can result in considerable speed improvements.
mcp regularization:
Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2), 894–942. https://doi.org/10.1214/09-AOS729
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Object of class gpRegularized
Examples
# This example shows how to use the optimizers
# for other objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)
library(lessSEM)
set.seed(123)
# first, we simulate data for our
# linear regression.
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4),
rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
# First, we must construct a fitting function
# which returns a single value. We will use
# the residual sum of squares as the fitting function.
# Let's start setting up the fitting function:
fittingFunction <- function(par, y, X, N){
# par is the parameter vector
# y is the observed dependent variable
# X is the design matrix
# N is the sample size
pred <- X %*% matrix(par, ncol = 1) #be explicit here:
# we need par to be a column vector
sse <- sum((y - pred)^2)
# we scale with .5/N to get the same results as glmnet
return((.5/N)*sse)
}
# let's define the starting values:
# first, let's add an intercept
X <- cbind(1, X)
b <- c(solve(t(X)%*%X)%*%t(X)%*%y) # we will use the lm estimates
names(b) <- paste0("b", 0:(length(b)-1))
# names of regularized parameters
regularized <- paste0("b",1:p)
# optimize
mcpPen <- gpMcp(
par = b,
regularized = regularized,
fn = fittingFunction,
lambdas = seq(0,1,.1),
thetas = c(1.001, 1.5, 2),
X = X,
y = y,
N = N
)
# optional: plot requires plotly package
# plot(mcpPen)
gpMcpCpp
Description
Implements mcp regularization for general purpose optimization problems with C++ functions. The penalty function is given by:
p(x_j) = \begin{cases}
\lambda |x_j| - x_j^2/(2\theta) & \text{if } |x_j| \leq \theta\lambda\\
\theta\lambda^2/2 & \text{if } |x_j| > \theta\lambda
\end{cases}
where \theta > 0.
Usage
gpMcpCpp(
par,
fn,
gr,
additionalArguments,
regularized,
lambdas,
thetas,
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
par |
labeled vector with starting values |
fn |
pointer to Rcpp function which takes the parameters as input and returns the fit value (a single value) |
gr |
pointer to Rcpp function which takes the parameters as input and returns the gradients of the objective function |
additionalArguments |
list with additional arguments passed to fn and gr |
regularized |
vector with names of parameters which are to be regularized. |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
The interface is inspired by optim, but somewhat more restrictive. Users have to supply a vector with starting values (important: this vector must have labels), a fitting function, and a gradient function. The fitting function must take a const Rcpp::NumericVector& with parameter values as its first argument and an Rcpp::List& as its second argument.
mcp regularization:
Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2), 894–942. https://doi.org/10.1214/09-AOS729
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Object of class gpRegularized
Examples
# This example shows how to use the optimizers
# for C++ objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)
library(Rcpp)
library(lessSEM)
linreg <- '
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
// [[Rcpp::export]]
double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
// extract all required elements:
arma::colvec b = Rcpp::as<arma::colvec>(parameters);
arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
// compute the sum of squared errors:
arma::mat sse = arma::trans(y-X*b)*(y-X*b);
// other packages, such as glmnet, scale the sse with
// 1/(2*N), where N is the sample size. We will do that here as well
sse *= 1.0/(2.0 * y.n_elem);
// note: We must return a double, but the sse is a matrix
// To get a double, just return the single value that is in
// this matrix:
return(sse(0,0));
}
// [[Rcpp::export]]
arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
// extract all required elements:
arma::colvec b = Rcpp::as<arma::colvec>(parameters);
arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
// note: we want to return our gradients as row-vector; therefore,
// we have to transpose the resulting column-vector:
arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b);
// other packages, such as glmnet, scale the sse with
// 1/(2*N), where N is the sample size. We will do that here as well
gradients *= (.5/y.n_rows);
return(gradients);
}
// function pointer approach adapted from Dirk Eddelbuettel:
// https://gallery.rcpp.org/articles/passing-cpp-function-pointers/
typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters
Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t;
typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters
Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t;
// [[Rcpp::export]]
fitFunPtr_t fitfunPtr() {
return(fitFunPtr_t(new fitFunPtr(&fitfunction)));
}
// [[Rcpp::export]]
gradientFunPtr_t gradfunPtr() {
return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction)));
}
'
Rcpp::sourceCpp(code = linreg)
ffp <- fitfunPtr()
gfp <- gradfunPtr()
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4),
rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
data <- list("y" = y,
"X" = cbind(1,X))
parameters <- rep(0, ncol(data$X))
names(parameters) <- paste0("b", 0:(length(parameters)-1))
m <- gpMcpCpp(par = parameters,
regularized = paste0("b", 1:p), # all predictors b1, ..., b10, but not the intercept b0
fn = ffp,
gr = gfp,
lambdas = seq(0,1,.1),
thetas = seq(.1,1,.1),
additionalArguments = data)
m@parameters
Class for regularized model using general purpose optimization interface
Description
Class for regularized model using general purpose optimization interface
Slots
penalty
penalty used (e.g., "lasso")
parameters
data.frame with all parameter estimates
fits
data.frame with all fit results
parameterLabels
character vector with names of all parameters
weights
vector with weights given to each of the parameters in the penalty
regularized
character vector with names of regularized parameters
internalOptimization
list of elements used internally
inputArguments
list with elements passed by the user to the general purpose optimizer
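For illustration, the slots of such an object (here called fit; any of the gp functions returns one) are accessed with the @ operator:
# fit <- gpLasso(...) # or any other gp function
# fit@penalty     # penalty used, e.g. "lasso"
# fit@parameters  # parameter estimates for each tuning-parameter setting
# fit@fits        # fit results for each tuning-parameter setting
# fit@regularized # names of the regularized parameters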
gpRidge
Description
Implements ridge regularization for general purpose optimization problems. The penalty function is given by:
p(x_j) = \lambda x_j^2
Note that ridge regularization will not set any of the parameters to zero but results in shrinkage towards zero.
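For illustration, the penalty value can be evaluated directly (ridgePenalty is our own helper, not part of lessSEM):
# minimal sketch, not part of lessSEM:
ridgePenalty <- function(x, lambda) {
  lambda * sum(x^2)
}
ridgePenalty(x = c(.3, 0, -.2), lambda = .5) # 0.065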
Usage
gpRidge(
par,
regularized,
fn,
gr = NULL,
lambdas,
...,
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
par |
labeled vector with starting values |
regularized |
vector with names of parameters which are to be regularized. |
fn |
R function which takes the parameters as input and returns the fit value (a single value) |
gr |
R function which takes the parameters as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
lambdas |
numeric vector: values for the tuning parameter lambda |
... |
additional arguments passed to fn and gr |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
The interface is similar to that of optim. Users have to supply a vector with starting values (important: this vector must have labels) and a fitting function. The fitting function must take a labeled vector with parameter values as its first argument. The remaining arguments are passed via the ... argument, as in optim.
The gradient function gr is optional. If set to NULL, the numDeriv package will be used to approximate the gradients. Supplying a gradient function can result in considerable speed improvements.
Ridge regularization:
Hoerl, A. E., & Kennard, R. W. (1970). Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12(1), 55–67. https://doi.org/10.1080/00401706.1970.10488634
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Object of class gpRegularized
Examples
# This example shows how to use the optimizers
# for other objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)
library(lessSEM)
set.seed(123)
# first, we simulate data for our
# linear regression.
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4),
rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
# First, we must construct a fitting function
# which returns a single value. We will use
# the residual sum of squares as the fitting function.
# Let's start setting up the fitting function:
fittingFunction <- function(par, y, X, N){
# par is the parameter vector
# y is the observed dependent variable
# X is the design matrix
# N is the sample size
pred <- X %*% matrix(par, ncol = 1) #be explicit here:
# we need par to be a column vector
sse <- sum((y - pred)^2)
# we scale with .5/N to get the same results as glmnet
return((.5/N)*sse)
}
# let's define the starting values:
b <- c(solve(t(X)%*%X)%*%t(X)%*%y) # we will use the lm estimates
names(b) <- paste0("b", 1:length(b))
# names of regularized parameters
regularized <- paste0("b",1:p)
# optimize
ridgePen <- gpRidge(
par = b,
regularized = regularized,
fn = fittingFunction,
lambdas = seq(0,1,.01),
X = X,
y = y,
N = N
)
plot(ridgePen)
# for comparison:
# fittingFunction <- function(par, y, X, N, lambda){
# pred <- X %*% matrix(par, ncol = 1)
# sse <- sum((y - pred)^2)
# return((.5/N)*sse + lambda * sum(par^2))
# }
#
# optim(par = b,
# fn = fittingFunction,
# y = y,
# X = X,
# N = N,
# lambda = ridgePen@fits$lambda[20],
# method = "BFGS")$par
# ridgePen@parameters[20,]
gpRidgeCpp
Description
Implements ridge regularization for general purpose optimization problems with C++ functions. The penalty function is given by:
p(x_j) = \lambda x_j^2
Note that ridge regularization will not set any of the parameters to zero but results in shrinkage towards zero.
Usage
gpRidgeCpp(
par,
regularized,
fn,
gr,
lambdas,
additionalArguments,
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
par |
labeled vector with starting values |
regularized |
vector with names of parameters which are to be regularized. |
fn |
pointer to Rcpp function which takes the parameters as input and returns the fit value (a single value) |
gr |
pointer to Rcpp function which takes the parameters as input and returns the gradients of the objective function |
lambdas |
numeric vector: values for the tuning parameter lambda |
additionalArguments |
list with additional arguments passed to fn and gr |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
The interface is inspired by optim, but somewhat more restrictive. Users have to supply a vector with starting values (important: this vector must have labels), a fitting function, and a gradient function. The fitting function must take a const Rcpp::NumericVector& with parameter values as its first argument and an Rcpp::List& as its second argument.
Ridge regularization:
Hoerl, A. E., & Kennard, R. W. (1970). Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12(1), 55–67. https://doi.org/10.1080/00401706.1970.10488634
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Object of class gpRegularized
Examples
# This example shows how to use the optimizers
# for C++ objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)
library(Rcpp)
library(lessSEM)
linreg <- '
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
// [[Rcpp::export]]
double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
// extract all required elements:
arma::colvec b = Rcpp::as<arma::colvec>(parameters);
arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
// compute the sum of squared errors:
arma::mat sse = arma::trans(y-X*b)*(y-X*b);
// other packages, such as glmnet, scale the sse with
// 1/(2*N), where N is the sample size. We will do that here as well
sse *= 1.0/(2.0 * y.n_elem);
// note: We must return a double, but the sse is a matrix
// To get a double, just return the single value that is in
// this matrix:
return(sse(0,0));
}
// [[Rcpp::export]]
arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
// extract all required elements:
arma::colvec b = Rcpp::as<arma::colvec>(parameters);
arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
// note: we want to return our gradients as row-vector; therefore,
// we have to transpose the resulting column-vector:
arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b);
// other packages, such as glmnet, scale the sse with
// 1/(2*N), where N is the sample size. We will do that here as well
gradients *= (.5/y.n_rows);
return(gradients);
}
// function pointer approach adapted from Dirk Eddelbuettel:
// https://gallery.rcpp.org/articles/passing-cpp-function-pointers/
typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters
Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t;
typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters
Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t;
// [[Rcpp::export]]
fitFunPtr_t fitfunPtr() {
return(fitFunPtr_t(new fitFunPtr(&fitfunction)));
}
// [[Rcpp::export]]
gradientFunPtr_t gradfunPtr() {
return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction)));
}
'
Rcpp::sourceCpp(code = linreg)
ffp <- fitfunPtr()
gfp <- gradfunPtr()
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4),
rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
data <- list("y" = y,
"X" = cbind(1,X))
parameters <- rep(0, ncol(data$X))
names(parameters) <- paste0("b", 0:(length(parameters)-1))
r <- gpRidgeCpp(par = parameters,
regularized = paste0("b", 1:p), # all predictors b1, ..., b10, but not the intercept b0
fn = ffp,
gr = gfp,
lambdas = seq(0,1,.1),
additionalArguments = data)
r@parameters
gpScad
Description
Implements scad regularization for general purpose optimization problems. The penalty function is given by:
p(x_j) = \begin{cases}
\lambda |x_j| & \text{if } |x_j| \leq \lambda\\
\frac{-x_j^2 + 2\theta\lambda |x_j| - \lambda^2}{2(\theta - 1)} & \text{if } \lambda < |x_j| \leq \theta\lambda\\
(\theta + 1)\lambda^2/2 & \text{if } |x_j| > \theta\lambda
\end{cases}
where \theta > 2.
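For illustration, the penalty value can be evaluated directly (scadPenalty is our own helper, not part of lessSEM):
# minimal sketch, not part of lessSEM:
scadPenalty <- function(x, lambda, theta) {
  stopifnot(theta > 2)
  absX <- abs(x)
  sum(ifelse(absX <= lambda,
             lambda * absX,                                     # lasso region
             ifelse(absX <= theta * lambda,
                    (-x^2 + 2 * theta * lambda * absX - lambda^2) /
                      (2 * (theta - 1)),                        # transition region
                    (theta + 1) * lambda^2 / 2)))               # constant region
}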
Usage
gpScad(
par,
fn,
gr = NULL,
...,
regularized,
lambdas,
thetas,
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
par |
labeled vector with starting values |
fn |
R function which takes the parameters AND their labels as input and returns the fit value (a single value) |
gr |
R function which takes the parameters AND their labels as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
... |
additional arguments passed to fn and gr |
regularized |
vector with names of parameters which are to be regularized. |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
The interface is similar to that of optim. Users have to supply a vector with starting values (important: this vector must have labels) and a fitting function. The fitting function must take a labeled vector with parameter values as its first argument. The remaining arguments are passed via the ... argument, as in optim.
The gradient function gr is optional. If set to NULL, the numDeriv package will be used to approximate the gradients. Supplying a gradient function can result in considerable speed improvements.
scad regularization:
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360. https://doi.org/10.1198/016214501753382273
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Object of class gpRegularized
Examples
# This example shows how to use the optimizers
# for other objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)
library(lessSEM)
set.seed(123)
# first, we simulate data for our
# linear regression.
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4),
rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
# First, we must construct a fitting function
# which returns a single value. We will use
# the residual sum of squares as the fitting function.
# Let's start setting up the fitting function:
fittingFunction <- function(par, y, X, N){
# par is the parameter vector
# y is the observed dependent variable
# X is the design matrix
# N is the sample size
pred <- X %*% matrix(par, ncol = 1) #be explicit here:
# we need par to be a column vector
sse <- sum((y - pred)^2)
# we scale with .5/N to get the same results as glmnet
return((.5/N)*sse)
}
# let's define the starting values:
# first, let's add an intercept
X <- cbind(1, X)
b <- c(solve(t(X)%*%X)%*%t(X)%*%y) # we will use the lm estimates
names(b) <- paste0("b", 0:(length(b)-1))
# names of regularized parameters
regularized <- paste0("b",1:p)
# optimize
scadPen <- gpScad(
par = b,
regularized = regularized,
fn = fittingFunction,
lambdas = seq(0,1,.1),
thetas = c(2.001, 2.5, 5),
X = X,
y = y,
N = N
)
# optional: plot requires plotly package
# plot(scadPen)
# for comparison
#library(ncvreg)
#scadFit <- ncvreg(X = X[,-1],
# y = y,
# penalty = "SCAD",
# lambda = scadPen@fits$lambda[15],
# gamma = scadPen@fits$theta[15])
#coef(scadFit)
#scadPen@parameters[15,]
gpScadCpp
Description
Implements scad regularization for general purpose optimization problems with C++ functions. The penalty function is given by:
p(x_j) = \begin{cases}
\lambda |x_j| & \text{if } |x_j| \leq \lambda\\
\frac{-x_j^2 + 2\theta\lambda |x_j| - \lambda^2}{2(\theta - 1)} & \text{if } \lambda < |x_j| \leq \theta\lambda\\
(\theta + 1)\lambda^2/2 & \text{if } |x_j| > \theta\lambda
\end{cases}
where \theta > 2.
Usage
gpScadCpp(
par,
fn,
gr,
additionalArguments,
regularized,
lambdas,
thetas,
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
par |
labeled vector with starting values |
fn |
pointer to Rcpp function which takes the parameters as input and returns the fit value (a single value) |
gr |
pointer to Rcpp function which takes the parameters as input and returns the gradients of the objective function |
additionalArguments |
list with additional arguments passed to fn and gr |
regularized |
vector with names of parameters which are to be regularized. |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
The interface is inspired by optim, but somewhat more restrictive. Users have to supply a vector with starting values (important: this vector must have labels), a fitting function, and a gradient function. The fitting function must take a const Rcpp::NumericVector& with parameter values as its first argument and an Rcpp::List& as its second argument.
scad regularization:
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360. https://doi.org/10.1198/016214501753382273
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Object of class gpRegularized
Examples
# This example shows how to use the optimizers
# for C++ objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)
library(Rcpp)
library(lessSEM)
linreg <- '
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
// [[Rcpp::export]]
double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
// extract all required elements:
arma::colvec b = Rcpp::as<arma::colvec>(parameters);
arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
// compute the sum of squared errors:
arma::mat sse = arma::trans(y-X*b)*(y-X*b);
// other packages, such as glmnet, scale the sse with
// 1/(2*N), where N is the sample size. We will do that here as well
sse *= 1.0/(2.0 * y.n_elem);
// note: We must return a double, but the sse is a matrix
// To get a double, just return the single value that is in
// this matrix:
return(sse(0,0));
}
// [[Rcpp::export]]
arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
// extract all required elements:
arma::colvec b = Rcpp::as<arma::colvec>(parameters);
arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
// note: we want to return our gradients as row-vector; therefore,
// we have to transpose the resulting column-vector:
arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b);
// other packages, such as glmnet, scale the sse with
// 1/(2*N), where N is the sample size; the gradients of the scaled
// sse must therefore be scaled with 1/(2*N) as well:
gradients *= (.5/y.n_rows);
return(gradients);
}
// The following typedefs for the function pointers are adapted from
// Dirk Eddelbuettel at
// https://gallery.rcpp.org/articles/passing-cpp-function-pointers/
typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters
Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t;
typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters
Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t;
// [[Rcpp::export]]
fitFunPtr_t fitfunPtr() {
return(fitFunPtr_t(new fitFunPtr(&fitfunction)));
}
// [[Rcpp::export]]
gradientFunPtr_t gradfunPtr() {
return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction)));
}
'
Rcpp::sourceCpp(code = linreg)
ffp <- fitfunPtr()
gfp <- gradfunPtr()
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4),
rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
data <- list("y" = y,
"X" = cbind(1,X))
parameters <- rep(0, ncol(data$X))
names(parameters) <- paste0("b", 0:(length(parameters)-1))
s <- gpScadCpp(par = parameters,
regularized = paste0("b", 1:length(b)), # all slopes b1-b10; the intercept b0 is not regularized
fn = ffp,
gr = gfp,
lambdas = seq(0,1,.1),
thetas = seq(2.1,3,.1),
additionalArguments = data)
s@parameters
cappedL1 optimization with ista
Description
Object for cappedL1 optimization with ista optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta value, a lambda and an alpha value (alpha must be 1).
cappedL1 optimization with ista
Description
Object for cappedL1 optimization with ista optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta value, a lambda and an alpha value (alpha must be 1).
elastic net optimization with ista
Description
Object for elastic net optimization with ista optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, an R function to compute the fit, an R function to compute the gradients, a list with elements the fit and gradient function require, a lambda and an alpha value.
elastic net optimization with ista
Description
Object for elastic net optimization with ista optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEXP function pointer to compute the fit, a SEXP function pointer to compute the gradients, a list with elements the fit and gradient function require, a lambda and an alpha value.
elastic net optimization with ista optimizer
Description
Object for elastic net optimization with ista optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a lambda and an alpha value.
elastic net optimization with ista optimizer
Description
Object for elastic net optimization with ista optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a lambda and an alpha value.
lsp optimization with ista
Description
Object for lsp optimization with ista optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
lsp optimization with ista
Description
Object for lsp optimization with ista optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
mcp optimization with ista
Description
Object for mcp optimization with ista optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
mcp optimization with ista
Description
Object for mcp optimization with ista optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
mixed penalty optimization with ista
Description
Object for mixed penalty optimization with ista optimizer
Value
a list with fit results
Fields
new
creates a new object.
optimize
optimize the model.
mixed penalty optimization with ista
Description
Object for mixed penalty optimization with ista optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter, (2) a vector indicating which penalty is used, and (3) a list with control elements
optimize
optimize the model.
mixed penalty optimization with ista
Description
Object for mixed penalty optimization with ista optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter, (2) a vector indicating which penalty is used, and (3) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta value, a lambda and an alpha value (alpha must be 1).
mixed penalty optimization with ista
Description
Object for mixed penalty optimization with ista optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter, (2) a vector indicating which penalty is used, and (3) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta value, a lambda and an alpha value (alpha must be 1).
scad optimization with ista
Description
Object for scad optimization with ista optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
scad optimization with ista
Description
Object for scad optimization with ista optimizer
Value
a list with fit results
Fields
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
lasso
Description
Implements lasso regularization for structural equation models. The penalty function is given by:
p(x_j) = \lambda |x_j|
Lasso regularization will set parameters to zero if \lambda is large enough.
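The penalty itself is simple enough to compute directly in R. The following minimal sketch (lassoPenalty is a hypothetical helper, not part of lessSEM) evaluates the penalty for a single parameter:
# lasso penalty for a single parameter value x:
lassoPenalty <- function(x, lambda) {
  lambda * abs(x)
}
lassoPenalty(x = .3, lambda = .5) # returns 0.15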
Usage
lasso(
lavaanModel,
regularized,
lambdas = NULL,
nLambdas = NULL,
reverse = TRUE,
curve = 1,
method = "glmnet",
modifyModel = lessSEM::modifyModel(),
control = lessSEM::controlGlmnet()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
nLambdas |
alternative to lambdas: lessSEM can automatically compute the first lambda value which sets all regularized parameters to zero. It will then generate nLambdas values between 0 and the computed lambda. |
reverse |
if set to TRUE and nLambdas is used, lessSEM will start with the largest lambda and gradually decrease lambda. Otherwise, lessSEM will start with the smallest lambda and gradually increase it. |
curve |
Allows for unequally spaced lambda steps (e.g., .01,.02,.05,1,5,20). If curve is close to 1 all lambda values will be equally spaced, if curve is large lambda values will be more concentrated close to 0. See ?lessSEM::curveLambda for more information. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
Identical to regsem, models are specified using lavaan. Currently,
most standard SEM are supported. lessSEM also provides full information
maximum likelihood for missing data. To use this functionality,
fit your lavaan model with the argument sem(..., missing = 'ml').
lessSEM will then automatically switch to full information maximum likelihood
as well.
Lasso regularization:
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Model of class regularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- lasso(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
# in case of lasso and adaptive lasso, we can specify the number of lambda
# values to use. lessSEM will automatically find lambda_max and fit
# models for nLambda values between 0 and lambda_max. For the other
# penalty functions, lambdas must be specified explicitly
nLambdas = 50)
# use the plot-function to plot the regularized parameters:
plot(lsem)
# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]
# fit Measures:
fitIndices(lsem)
# The best parameters can also be extracted with:
coef(lsem, criterion = "AIC")
# or
estimates(lsem, criterion = "AIC")
#### Advanced ###
# Switching the optimizer #
# Use the "method" argument to switch the optimizer. The control argument
# must also be changed to the corresponding function:
lsemIsta <- lasso(
lavaanModel = lavaanModel,
regularized = paste0("l", 6:15),
nLambdas = 50,
method = "ista",
control = controlIsta())
# Note: The results are basically identical:
lsemIsta@parameters - lsem@parameters
lavaan2lslxLabels
Description
helper function: lslx and lavaan use slightly different parameter labels. This function can be used to get both sets of labels.
Usage
lavaan2lslxLabels(lavaanModel)
Arguments
lavaanModel |
model of class lavaan |
Value
list with lavaan labels and lslx labels
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
lavaan2lslxLabels(lavaanModel)
lessSEM2Lavaan
Description
Creates a lavaan model object from a lessSEM result (where possible). Pass either a criterion or a combination of lambda, alpha, and theta.
Usage
lessSEM2Lavaan(
regularizedSEM,
criterion = NULL,
lambda = NULL,
alpha = NULL,
theta = NULL
)
Arguments
regularizedSEM |
object created with lessSEM |
criterion |
criterion used for model selection. Currently supported are "AIC" or "BIC" |
lambda |
value for tuning parameter lambda |
alpha |
value for tuning parameter alpha |
theta |
value for tuning parameter theta |
Value
lavaan model
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
regularized <- lasso(lavaanModel,
regularized = paste0("l", 11:15),
lambdas = seq(0,1,.1))
# using criterion
lessSEM2Lavaan(regularizedSEM = regularized,
criterion = "AIC")
# using tuning parameters (note: we only have to specify the tuning
# parameters that are actually used by the penalty function. In case
# of lasso, this is lambda):
lessSEM2Lavaan(regularizedSEM = regularized,
lambda = 1)
Class for the coefficients estimated by lessSEM.
Description
Class for the coefficients estimated by lessSEM.
Slots
tuningParameters
tuning parameters
estimates
parameter estimates
transformations
transformations of parameters
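A brief sketch of how these slots can be accessed, assuming a model lsem fitted with one of the regularizing functions (e.g., lasso) as in the examples elsewhere in this documentation:
# co <- coef(lsem)     # returns an object of this class
# co@tuningParameters  # tuning parameters
# co@estimates         # parameter estimates
# co@transformations   # transformed parameters (if any)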
loadings
Description
Extract the labels of all loadings found in a lavaan model.
Usage
loadings(lavaanModel)
Arguments
lavaanModel |
fitted lavaan model |
Value
vector with parameter labels
Examples
# The following is adapted from ?lavaan::sem
library(lessSEM)
model <- '
# latent variable definitions
ind60 =~ x1 + x2 + x3
dem60 =~ y1 + a*y2 + b*y3 + c*y4
dem65 =~ y5 + a*y6 + b*y7 + c*y8
# regressions
dem60 ~ ind60
dem65 ~ ind60 + dem60
# residual correlations
y1 ~~ y5
y2 ~~ y4 + y6
y3 ~~ y7
y4 ~~ y8
y6 ~~ y8
'
fit <- sem(model, data = PoliticalDemocracy)
loadings(fit)
logLik
Description
logLik
Usage
## S4 method for signature 'Rcpp_SEMCpp'
logLik(object, ...)
Arguments
object |
object of class Rcpp_SEMCpp |
... |
not used |
Value
log-likelihood of the model
logLik
Description
logLik
Usage
## S4 method for signature 'Rcpp_mgSEM'
logLik(object, ...)
Arguments
object |
object of class Rcpp_mgSEM |
... |
not used |
Value
log-likelihood of the model
Class for the log-likelihood of regularized SEM. Note: we define a custom logLik function because the generic one uses df = number of parameters, which might be confusing.
Description
Class for the log-likelihood of regularized SEM. Note: we define a custom logLik function because the generic one uses df = number of parameters, which might be confusing.
Slots
logLik
log-Likelihood
nParameters
number of parameters in the model
N
number of persons in the data set
logicalMatch
Description
Returns the rows for which all elements of a boolean matrix X are equal to the elements in boolean vector x
Usage
logicalMatch(X, x)
Arguments
X |
matrix with booleans |
x |
vector of booleans |
Value
numerical vector with indices of matching rows
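Examples
# A minimal sketch based on the interface documented above:
X <- matrix(c(TRUE, FALSE,
              TRUE, TRUE,
              TRUE, FALSE),
            nrow = 3, byrow = TRUE)
x <- c(TRUE, FALSE)
logicalMatch(X, x) # rows 1 and 3 match x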
lsp
Description
Implements lsp regularization for structural equation models. The penalty function is given by:
p(x_j) = \lambda \log(1 + |x_j|/\theta)
where \theta > 0.
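The penalty can be sketched in plain R as follows (lspPenalty is a hypothetical helper, not part of lessSEM):
# lsp penalty for a single parameter value x; theta > 0:
lspPenalty <- function(x, lambda, theta) {
  lambda * log(1 + abs(x)/theta)
}
lspPenalty(x = .3, lambda = .5, theta = 1) # 0.5 * log(1.3)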
Usage
lsp(
lavaanModel,
regularized,
lambdas,
thetas,
modifyModel = lessSEM::modifyModel(),
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta (\theta > 0) |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
Identical to regsem, models are specified using lavaan. Currently,
most standard SEM are supported. lessSEM also provides full information
maximum likelihood for missing data. To use this functionality,
fit your lavaan model with the argument sem(..., missing = 'ml').
lessSEM will then automatically switch to full information maximum likelihood
as well.
lsp regularization:
Candès, E. J., Wakin, M. B., & Boyd, S. P. (2008). Enhancing Sparsity by Reweighted l1 Minimization. Journal of Fourier Analysis and Applications, 14(5–6), 877–905. https://doi.org/10.1007/s00041-008-9045-x
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Model of class regularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- lsp(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
lambdas = seq(0,1,length.out = 20),
thetas = seq(0.01,2,length.out = 5))
# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]
# fit Measures:
fitIndices(lsem)
# The best parameters can also be extracted with:
coef(lsem, criterion = "AIC")
# or
estimates(lsem, criterion = "AIC")
# optional: plotting the paths requires installation of plotly
# plot(lsem)
makePtrs
Description
This function helps you create the pointers necessary to use the Cpp interface
Usage
makePtrs(fitFunName, gradFunName)
Arguments
fitFunName |
name of your C++ fit function (IMPORTANT: This must be the name used in C++) |
gradFunName |
name of your C++ gradient function (IMPORTANT: This must be the name used in C++) |
Value
a string which can be copied into your C++ code to create the pointers.
Examples
# see vignette("General-Purpose-Optimization", package = "lessSEM") for an example
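# A minimal sketch, assuming the C++ functions are called
# "fitfunction" and "gradientfunction" as in the gpScadCpp example;
# makePtrs returns a string with the C++ code creating the pointers,
# which can then be printed and copied:
# ptrString <- makePtrs(fitFunName = "fitfunction",
#                       gradFunName = "gradientfunction")
# cat(ptrString)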
mcp
Description
Implements mcp regularization for structural equation models. The penalty function is given by:
p(x_j) = \begin{cases}
\lambda |x_j| - x_j^2/(2\theta) & \text{if } |x_j| \leq \theta\lambda\\
\theta\lambda^2/2 & \text{if } |x_j| > \theta\lambda
\end{cases}
where \theta > 1.
Usage
mcp(
lavaanModel,
regularized,
lambdas,
thetas,
modifyModel = lessSEM::modifyModel(),
method = "ista",
control = lessSEM::controlIsta()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta (\theta > 1) |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
Identical to regsem, models are specified using lavaan. Currently,
most standard SEM are supported. lessSEM also provides full information
maximum likelihood for missing data. To use this functionality,
fit your lavaan model with the argument sem(..., missing = 'ml').
lessSEM will then automatically switch to full information maximum likelihood
as well.
In our experience, the glmnet optimizer can run into issues with the mcp penalty. Therefore, we default to using ista.
mcp regularization:
Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2), 894–942. https://doi.org/10.1214/09-AOS729
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Model of class regularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- mcp(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
lambdas = seq(0,1,length.out = 20),
thetas = seq(0.01,2,length.out = 5))
# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]
# fit Measures:
fitIndices(lsem)
# The best parameters can also be extracted with:
coef(lsem, criterion = "AIC")
# or
estimates(lsem, criterion = "AIC")
# optional: plotting the paths requires installation of plotly
# plot(lsem)
mcpPenalty_C
Description
mcpPenalty_C
Usage
mcpPenalty_C(par, lambda_p, theta)
Arguments
par |
single parameter value |
lambda_p |
lambda value for this parameter |
theta |
theta value for this parameter |
Value
penalty value
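Examples
# A minimal sketch evaluating the penalty for a single parameter,
# using the arguments documented above:
mcpPenalty_C(par = 0.2, lambda_p = 0.5, theta = 3)
# |par| <= theta * lambda_p, so the first branch of the mcp penalty
# applies: 0.5 * 0.2 - 0.2^2/(2 * 3) = 0.0933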
mgSEM class
Description
internal mgSEM representation
Fields
new
Creates a new mgSEM.
addModel
add a model. Expects an Rcpp::List
addTransformation
adds transformations to a model
implied
Computes implied means and covariance matrix
fit
Fits the model. Returns objective value of the fitting function
getParameters
Returns a data frame with model parameters.
getParameterLabels
Returns a vector with unique parameter labels as used internally.
getEstimator
Returns a vector with names of the estimators used in the submodels.
getGradients
Returns the gradients of the model.
getScores
Returns a matrix with scores. Not yet implemented
getHessian
Returns the hessian of the model. Expects the labels of the parameters and the values of the parameters as well as a boolean indicating if these are raw. Finally, a double (eps) controls the precision of the approximation.
computeTransformations
compute the transformations.
setTransformationGradientStepSize
change the step size of the gradient computation for the transformations
mixedPenalty
Description
Provides possibility to impose different penalties on different parameters.
Usage
mixedPenalty(
lavaanModel,
modifyModel = lessSEM::modifyModel(),
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
lavaanModel |
model of class lavaan |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently supported are "glmnet" and "ista". |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
The mixedPenalty
function allows you to add multiple penalties to a single model.
For instance, you may want to regularize both loadings and regressions in a SEM.
In this case, using the same penalty (e.g., lasso) for both types of parameters may
not be what you want because the penalty function is sensitive to
the scales of the parameters. Instead, you may want to use two separate lasso
penalties for loadings and regressions. Similarly, separate penalties for
different parameters have, for instance, been proposed in multi-group models
(Geminiani et al., 2021).
Identical to regsem, models are specified using lavaan. Currently,
most standard SEM are supported. lessSEM also provides full information
maximum likelihood for missing data. To use this functionality,
fit your lavaan model with the argument sem(..., missing = 'ml').
lessSEM will then automatically switch to full information maximum likelihood
as well. Models are fitted with the glmnet or ista optimizer. Note that the
optimizers differ in which penalties they support. The following table provides
an overview:
Penalty | Function | glmnet | ista |
lasso | addLasso | x | x |
elastic net | addElasticNet | x* | - |
cappedL1 | addCappedL1 | x | x |
lsp | addLsp | x | x |
scad | addScad | x | x |
mcp | addMcp | x | x |
By default, glmnet will be used. Note that the elastic net penalty can only be combined with other elastic net penalties.
Check vignette(topic = "Mixed-Penalties", package = "lessSEM") for more details.
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Geminiani, E., Marra, G., & Moustaki, I. (2021). Single- and multiple-group penalized factor analysis: A trust-region algorithm approach with integrated automatic multiple tuning parameter selection. Psychometrika, 86(1), 65–95. https://doi.org/10.1007/s11336-021-09751-8
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Model of class regularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
# In this example, we want to regularize the loadings l6-l10
# independently of the loadings l11-15. This could, for instance,
# reflect that the items y6-y10 and y11-y15 may belong to different
# subscales.
regularized <- lavaanModel |>
# create template for regularized model with mixed penalty:
mixedPenalty() |>
# add lasso penalty on loadings l6 - l10:
addLasso(regularized = paste0("l", 6:10),
lambdas = seq(0,1,length.out = 4)) |>
# add scad penalty on loadings l11 - l15:
addScad(regularized = paste0("l", 11:15),
lambdas = seq(0,1,length.out = 3),
thetas = 3.1) |>
# fit the model:
fit()
# elements of regularized can be accessed with the @ operator:
regularized@parameters[1,]
# AIC and BIC:
AIC(regularized)
BIC(regularized)
# The best parameters can also be extracted with:
coef(regularized, criterion = "AIC")
coef(regularized, criterion = "BIC")
# The tuningParameterConfiguration corresponds to the rows
# in the lambda, theta, and alpha matrices in regularized@tuningParameterConfigurations.
# Configuration 3, for example, is given by
regularized@tuningParameterConfigurations$lambda[3,]
regularized@tuningParameterConfigurations$theta[3,]
regularized@tuningParameterConfigurations$alpha[3,]
# Note that lambda, theta, and alpha may correspond to tuning parameters
# of different penalties for different parameters (e.g., lambda for l6 is the lambda
# of the lasso penalty, while lambda for l12 is the lambda of the scad penalty).
modifyModel
Description
Modify the model from lavaan to fit your needs
Usage
modifyModel(
addMeans = FALSE,
activeSet = NULL,
dataSet = NULL,
transformations = NULL,
transformationList = list(),
transformationGradientStepSize = 1e-06
)
Arguments
addMeans |
If lavaanModel has meanstructure = FALSE, addMeans = TRUE will add a mean structure. FALSE will set the means of the observed variables to their observed means. |
activeSet |
Option to only use a subset of the individuals in the data set. Logical vector of length N indicating which subjects should remain in the sample. |
dataSet |
option to replace the data set in the lavaan model with a different data set. Can be useful for cross-validation |
transformations |
allows for transformations of parameters - useful for measurement invariance tests etc. |
transformationList |
optional list used within the transformations. NOTE: This must be used as an Rcpp::List. |
transformationGradientStepSize |
step size used to compute the gradients of the transformations |
Value
Object of class modifyModel
Examples
modification <- modifyModel(addMeans = TRUE) # adds intercepts to a lavaan object
# that was fitted without explicit intercepts
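# a further sketch (newData is a hypothetical replacement data set with
# the same variables): replace the data, e.g., for cross-validation, and
# keep only the first 50 of 100 individuals via a logical activeSet:
# modification <- modifyModel(dataSet = newData,
#                             activeSet = c(rep(TRUE, 50), rep(FALSE, 50)))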
newTau
Description
assign a new value to the parameter tau used by the approximate optimization. Any regularized parameter with an absolute value below tau will be treated as zero, which directly impacts the AIC, BIC, etc.
Usage
newTau(regularizedSEM, tau)
Arguments
regularizedSEM |
object fitted with approximate optimization |
tau |
new tau value |
Value
regularizedSEM, but with new regularizedSEM@fits$nonZeroParameters
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- smoothLasso(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
epsilon = 1e-10,
tau = 1e-4,
lambdas = seq(0,1,length.out = 50))
newTau(regularizedSEM = lsem, tau = .1)
plots the cross-validation fits
Description
plots the cross-validation fits
Usage
## S4 method for signature 'cvRegularizedSEM,missing'
plot(x, y, ...)
Arguments
x |
object of class cvRegularizedSEM |
y |
not used |
... |
not used |
Value
either an object of ggplot2 or of plotly
plots the regularized and unregularized parameters for all levels of lambda
Description
plots the regularized and unregularized parameters for all levels of lambda
Usage
## S4 method for signature 'gpRegularized,missing'
plot(x, y, ...)
Arguments
x |
object of class gpRegularized |
y |
not used |
... |
use regularizedOnly=FALSE to plot all parameters |
Value
either an object of ggplot2 or of plotly
plots the regularized and unregularized parameters for all levels of lambda
Description
plots the regularized and unregularized parameters for all levels of lambda
Usage
## S4 method for signature 'regularizedSEM,missing'
plot(x, y, ...)
Arguments
x |
object of class regularizedSEM |
y |
not used |
... |
use regularizedOnly=FALSE to plot all parameters |
Value
either an object of ggplot2 or of plotly
plots the regularized and unregularized parameters for all levels of the tuning parameters
Description
plots the regularized and unregularized parameters for all levels of the tuning parameters
Usage
## S4 method for signature 'stabSel,missing'
plot(x, y, ...)
Arguments
x |
object of class stabSel |
y |
not used |
... |
use regularizedOnly=FALSE to plot all parameters |
Value
either an object of ggplot2 or of plotly
regressions
Description
Extract the labels of all regressions found in a lavaan model.
Usage
regressions(lavaanModel)
Arguments
lavaanModel |
fitted lavaan model |
Value
vector with parameter labels
Examples
# The following is adapted from ?lavaan::sem
library(lessSEM)
model <- '
# latent variable definitions
ind60 =~ x1 + x2 + x3
dem60 =~ y1 + a*y2 + b*y3 + c*y4
dem65 =~ y5 + a*y6 + b*y7 + c*y8
# regressions
dem60 ~ ind60
dem65 ~ ind60 + dem60
# residual correlations
y1 ~~ y5
y2 ~~ y4 + y6
y3 ~~ y7
y4 ~~ y8
y6 ~~ y8
'
fit <- sem(model, data = PoliticalDemocracy)
regressions(fit)
regsem2LavaanParameters
Description
helper function: regsem and lavaan use slightly different parameter labels. This function can be used to translate the parameter labels of a cv_regsem object to lavaan labels
Usage
regsem2LavaanParameters(regsemModel, lavaanModel)
Arguments
regsemModel |
model of class regsem |
lavaanModel |
model of class lavaan |
Value
regsem parameters with lavaan labels
Examples
## The following is adapted from ?regsem::regsem.
#library(lessSEM)
#library(regsem)
## put variables on same scale for regsem
#HS <- data.frame(scale(HolzingerSwineford1939[,7:15]))
#
#mod <- '
#f =~ 1*x1 + l1*x2 + l2*x3 + l3*x4 + l4*x5 + l5*x6 + l6*x7 + l7*x8 + l8*x9
#'
## Recommended to specify meanstructure in lavaan
#lavaanModel <- cfa(mod, HS, meanstructure=TRUE)
#
#regsemModel <- regsem(lavaanModel,
# lambda = 0.3,
# gradFun = "ram",
# type="lasso",
# pars_pen=c("l1", "l2", "l6", "l7", "l8"))
# regsem2LavaanParameters(regsemModel = regsemModel,
# lavaanModel = lavaanModel)
Class for regularized SEM
Description
Class for regularized SEM
Slots
penalty
penalty used (e.g., "lasso")
parameters
data.frame with parameter estimates
fits
data.frame with all fit results
parameterLabels
character vector with names of all parameters
weights
vector with weights given to each of the parameters in the penalty
regularized
character vector with names of regularized parameters
transformations
if the model has transformations, the transformed parameters are returned
internalOptimization
list of elements used internally
inputArguments
list with elements passed by the user to the general
notes
internal notes that have come up when fitting the model
Class for regularized SEM
Description
Class for regularized SEM
Slots
penalty
penalty used (e.g., "lasso")
tuningParameterConfigurations
list with settings for the lambda, theta, and alpha tuning parameters.
parameters
data.frame with parameter estimates
fits
data.frame with all fit results
parameterLabels
character vector with names of all parameters
weights
vector with weights given to each of the parameters in the penalty
regularized
character vector with names of regularized parameters
transformations
if the model has transformations, the transformed parameters are returned
internalOptimization
list of elements used internally
inputArguments
list with elements passed by the user to the general
notes
internal notes that have come up when fitting the model
Class for regularized SEM using Rsolnp
Description
Class for regularized SEM using Rsolnp
Slots
parameters
data.frame with parameter estimates
fits
data.frame with all fit results
parameterLabels
character vector with names of all parameters
internalOptimization
list of elements used internally
inputArguments
list with elements passed by the user to the general
notes
internal notes that have come up when fitting the model
ridge
Description
Implements ridge regularization for structural equation models. The penalty function is given by:
p(x_j) = \lambda x_j^2
Note that ridge regularization will not set any of the parameters to zero; it results in shrinkage towards zero.
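The penalty can be sketched in plain R as follows (ridgePenalty is a hypothetical helper, not part of lessSEM):
# ridge penalty for a single parameter value x:
ridgePenalty <- function(x, lambda) {
  lambda * x^2
}
ridgePenalty(x = .3, lambda = .5) # returns 0.045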
Usage
ridge(
lavaanModel,
regularized,
lambdas,
method = "glmnet",
modifyModel = lessSEM::modifyModel(),
control = lessSEM::controlGlmnet()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
Identical to regsem, models are specified using lavaan. Currently,
most standard SEM are supported. lessSEM also provides full information
maximum likelihood for missing data. To use this functionality,
fit your lavaan model with the argument sem(..., missing = 'ml').
lessSEM will then automatically switch to full information maximum likelihood
as well.
Ridge regularization:
Hoerl, A. E., & Kennard, R. W. (1970). Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12(1), 55–67. https://doi.org/10.1080/00401706.1970.10488634
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Model of class regularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- ridge(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
lambdas = seq(0,1,length.out = 20))
# use the plot-function to plot the regularized parameters:
plot(lsem)
# the coefficients can be accessed with:
coef(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]
#### Advanced ###
# Switching the optimizer #
# Use the "method" argument to switch the optimizer. The control argument
# must also be changed to the corresponding function:
lsemIsta <- ridge(
lavaanModel = lavaanModel,
regularized = paste0("l", 6:15),
lambdas = seq(0,1,length.out = 20),
method = "ista",
control = controlIsta())
# Note: The results are basically identical:
lsemIsta@parameters - lsem@parameters
ridgeBfgs
Description
This function allows for regularization of models built in lavaan with the ridge penalty. Its elements can be accessed with the "@" operator (see examples).
Usage
ridgeBfgs(
lavaanModel,
regularized,
lambdas = NULL,
modifyModel = lessSEM::modifyModel(),
control = lessSEM::controlBFGS()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
Details
For more details, see:
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Value
Model of class regularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
# names of the regularized parameters:
regularized = paste0("l", 6:15)
lsem <- ridgeBfgs(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
regularized = regularized,
lambdas = seq(0,1,length.out = 50))
plot(lsem)
# the coefficients can be accessed with:
coef(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]
scad
Description
Implements scad regularization for structural equation models. The penalty function is given by:
p(x_j) = \begin{cases}
\lambda |x_j| & \text{if } |x_j| \leq \lambda\\
\frac{-x_j^2 + 2\theta\lambda |x_j| - \lambda^2}{2(\theta - 1)} & \text{if } \lambda < |x_j| \leq \theta\lambda \\
(\theta + 1) \lambda^2/2 & \text{if } |x_j| > \theta\lambda
\end{cases}
where \theta > 2.
Usage
scad(
lavaanModel,
regularized,
lambdas,
thetas,
modifyModel = lessSEM::modifyModel(),
method = "glmnet",
control = lessSEM::controlGlmnet()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta (\theta > 2) |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Details
Identical to regsem, models are specified using lavaan. Currently,
most standard SEM are supported. lessSEM also provides full information
maximum likelihood for missing data. To use this functionality,
fit your lavaan model with the argument sem(..., missing = 'ml').
lessSEM will then automatically switch to full information maximum likelihood
as well.
scad regularization:
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360. https://doi.org/10.1198/016214501753382273
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Value
Model of class regularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- scad(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
lambdas = seq(0,1,length.out = 20),
thetas = seq(2.01,5,length.out = 5))
# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]
# fit Measures:
fitIndices(lsem)
# The best parameters can also be extracted with:
coef(lsem, criterion = "AIC")
# or
estimates(lsem, criterion = "AIC")
# optional: plotting the paths requires installation of plotly
# plot(lsem)
scadPenalty_C
Description
scadPenalty_C
Usage
scadPenalty_C(par, lambda_p, theta)
Arguments
par |
single parameter value |
lambda_p |
lambda value for this parameter |
theta |
theta value for this parameter |
Value
penalty value
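Examples
# A minimal sketch evaluating the penalty for a single parameter,
# using the arguments documented above:
scadPenalty_C(par = 0.2, lambda_p = 0.5, theta = 3)
# |par| <= lambda_p, so the penalty reduces to lambda_p * |par| = 0.1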
show
Description
show
Usage
## S4 method for signature 'Rcpp_SEMCpp'
show(object)
Arguments
object |
object of class Rcpp_SEMCpp |
Value
No return value, just prints estimates
show
Description
show
Usage
## S4 method for signature 'Rcpp_mgSEM'
show(object)
Arguments
object |
object of class Rcpp_mgSEM |
Value
No return value, just prints estimates
Show method for objects of class cvRegularizedSEM.
Description
Show method for objects of class cvRegularizedSEM.
Usage
## S4 method for signature 'cvRegularizedSEM'
show(object)
Arguments
object |
object of class cvRegularizedSEM |
Value
No return value, just prints estimates
show
Description
show
Usage
## S4 method for signature 'gpRegularized'
show(object)
Arguments
object |
object of class gpRegularized |
Value
No return value, just prints estimates
show
Description
show
Usage
## S4 method for signature 'lessSEMCoef'
show(object)
Arguments
object |
object of class lessSEMCoef |
Value
No return value, just prints estimates
show
Description
show
Usage
## S4 method for signature 'logLikelihood'
show(object)
Arguments
object |
object of class logLikelihood |
Value
No return value, just prints estimates
show
Description
show
Usage
## S4 method for signature 'regularizedSEM'
show(object)
Arguments
object |
object of class regularizedSEM |
Value
No return value, just prints estimates
show
Description
show
Usage
## S4 method for signature 'regularizedSEMMixedPenalty'
show(object)
Arguments
object |
object of class regularizedSEMMixedPenalty |
Value
No return value, just prints estimates
show
Description
show
Usage
## S4 method for signature 'stabSel'
show(object)
Arguments
object |
object of class stabSel |
Value
No return value, just prints estimates
simulateExampleData
Description
simulate data for a simple CFA model
Usage
simulateExampleData(
N = 100,
loadings = c(rep(1, 5), rep(0.4, 5), rep(0, 5)),
percentMissing = 0
)
Arguments
N |
number of persons in the data set |
loadings |
loadings of the latent variable on the manifest observations |
percentMissing |
percentage of missing data |
Value
data set for a single-factor CFA.
Examples
y <- lessSEM::simulateExampleData()
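# The arguments documented above can be varied; for instance, a sketch
# with a larger sample and (assuming a 0-100 scale) 10 percent missing data:
yMissing <- lessSEM::simulateExampleData(N = 250, percentMissing = 10)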
smoothAdaptiveLasso
Description
This function allows for regularization of models built in lavaan with the smooth adaptive lasso penalty. The returned object is an S4 class; its elements can be accessed with the "@" operator (see examples).
Usage
smoothAdaptiveLasso(
lavaanModel,
regularized,
weights = NULL,
lambdas,
epsilon,
tau,
modifyModel = lessSEM::modifyModel(),
control = lessSEM::controlBFGS()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
weights |
labeled vector with weights for each of the parameters in the model. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object. If set to NULL, the default weights will be used: the inverse of the absolute values of the unregularized parameter estimates |
lambdas |
numeric vector: values for the tuning parameter lambda |
epsilon |
epsilon > 0; controls the smoothness of the approximation. Larger values = smoother |
tau |
parameters with an absolute value below the threshold tau will be treated as zero |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
Details
For more details, see:
Zou, H. (2006). The Adaptive Lasso and Its Oracle Properties. Journal of the American Statistical Association, 101(476), 1418–1429. https://doi.org/10.1198/016214506000000735
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
Lee, S.-I., Lee, H., Abbeel, P., & Ng, A. Y. (2006). Efficient L1 Regularized Logistic Regression. Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06), 401–408.
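Up to implementation details, the penalized term is the adaptive lasso penalty lambda * sum(w_j * |theta_j|), with the absolute value replaced by a differentiable approximation in the spirit of Lee et al. (2006). A minimal R sketch under the assumption that sqrt(theta^2 + epsilon) is used as this approximation:
smoothAdaptiveLassoPenalty <- function(theta, weights, lambda, epsilon) {
  # sqrt(theta^2 + epsilon) approaches |theta| as epsilon approaches 0
  lambda * sum(weights * sqrt(theta^2 + epsilon))
}
smoothAdaptiveLassoPenalty(theta   = c(.5, .01),
                           weights = c(2, 100),
                           lambda  = .1,
                           epsilon = 1e-10)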
Value
Model of class regularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
# names of the regularized parameters:
regularized = paste0("l", 6:15)
# define adaptive lasso weights:
# We use the inverse of the absolute unregularized parameters
# (this is the default in adaptiveLasso and can also be specified
# by setting weights = NULL)
weights <- 1/abs(getLavaanParameters(lavaanModel))
weights[!names(weights) %in% regularized] <- 0
lsem <- smoothAdaptiveLasso(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
regularized = regularized,
weights = weights,
epsilon = 1e-10,
tau = 1e-4,
lambdas = seq(0,1,length.out = 50))
# use the plot-function to plot the regularized parameters:
plot(lsem)
# the coefficients can be accessed with:
coef(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]
# AIC and BIC:
AIC(lsem)
BIC(lsem)
# The best parameters can also be extracted with:
coef(lsem, criterion = "AIC")
coef(lsem, criterion = "BIC")
smoothElasticNet
Description
This function allows for regularization of models built in lavaan with the smooth elastic net penalty. The returned object is an S4 class; its elements can be accessed with the "@" operator (see examples).
Usage
smoothElasticNet(
lavaanModel,
regularized,
lambdas = NULL,
nLambdas = NULL,
alphas,
epsilon,
tau,
modifyModel = lessSEM::modifyModel(),
control = lessSEM::controlBFGS()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
nLambdas |
alternative to lambdas: If alpha = 1, lessSEM can automatically compute the first lambda value which sets all regularized parameters to zero. It will then generate nLambdas values between 0 and the computed lambda. |
alphas |
numeric vector with values of the tuning parameter alpha. Must be between 0 and 1. 0 = ridge, 1 = lasso. |
epsilon |
epsilon > 0; controls the smoothness of the approximation. Larger values = smoother |
tau |
parameters with an absolute value below the threshold tau will be treated as zero |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
Details
For more details, see:
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
Lee, S.-I., Lee, H., Abbeel, P., & Ng, A. Y. (2006). Efficient L1 Regularized Logistic Regression. Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06), 401–408.
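The elastic net combines the lasso and ridge penalties, with alpha governing the mixture. A minimal R sketch of the smooth variant, assuming sqrt(theta^2 + epsilon) as the smooth stand-in for |theta|; the exact weighting of the two parts (e.g., a factor of 1/2 on the ridge term) may differ in lessSEM:
smoothElasticNetPenalty <- function(theta, lambda, alpha, epsilon) {
  # lasso part, with the absolute value smoothed:
  lassoPart <- alpha * sum(sqrt(theta^2 + epsilon))
  # ridge part, which is already smooth:
  ridgePart <- (1 - alpha) * sum(theta^2)
  lambda * (lassoPart + ridgePart)
}
# alpha = 1 recovers the (smooth) lasso, alpha = 0 the ridge penalty:
smoothElasticNetPenalty(theta = c(.5, .01), lambda = .1,
                        alpha = .5, epsilon = 1e-10)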
Value
Model of class regularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
# names of the regularized parameters:
regularized = paste0("l", 6:15)
lsem <- smoothElasticNet(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
regularized = regularized,
epsilon = 1e-10,
tau = 1e-4,
lambdas = seq(0,1,length.out = 5),
alphas = seq(0,1,length.out = 3))
# the coefficients can be accessed with:
coef(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]
smoothLasso
Description
This function allows for regularization of models built in lavaan with the smooth lasso penalty. The returned object is an S4 class; its elements can be accessed with the "@" operator (see examples). We do not recommend using this function; use lasso() instead.
Usage
smoothLasso(
lavaanModel,
regularized,
lambdas,
epsilon,
tau,
modifyModel = lessSEM::modifyModel(),
control = lessSEM::controlBFGS()
)
Arguments
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
epsilon |
epsilon > 0; controls the smoothness of the approximation. Larger values = smoother |
tau |
parameters with an absolute value below the threshold tau will be treated as zero |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
Details
For more details, see:
Lee, S.-I., Lee, H., Abbeel, P., & Ng, A. Y. (2006). Efficient L1 Regularized Logistic Regression. Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06), 401–408.
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
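Lee et al. (2006) replace the non-differentiable absolute value with a smooth approximation. Assuming the sqrt(theta^2 + epsilon) form, the following base-R snippet illustrates how epsilon trades off smoothness against accuracy:
theta <- seq(-1, 1, length.out = 7)
cbind(absolute = abs(theta),
      smooth   = sqrt(theta^2 + 1e-10), # very close to |theta|
      smoother = sqrt(theta^2 + 1e-2))  # smoother, but less accurate near zero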
Value
Model of class regularizedSEM
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Regularization:
lsem <- smoothLasso(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
epsilon = 1e-10,
tau = 1e-4,
lambdas = seq(0,1,length.out = 50))
# use the plot-function to plot the regularized parameters:
plot(lsem)
# the coefficients can be accessed with:
coef(lsem)
# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]
# AIC and BIC:
AIC(lsem)
BIC(lsem)
# The best parameters can also be extracted with:
coef(lsem, criterion = "AIC")
coef(lsem, criterion = "BIC")
Class for stability selection
Description
Class for stability selection
Slots
regularized
names of regularized parameters
tuningParameters
data.frame with tuning parameter values
stabilityPaths
matrix with the percentage of subsamples in which each parameter was non-zero, for each setting of the tuning parameters
percentSelected
percentage with which a parameter was selected over all tuning parameter settings
selectedParameters
final selected parameters
settings
internal
stabilitySelection
Description
Provides rudimentary stability selection for regularized SEM. Stability selection has been proposed by Meinshausen & Bühlmann (2010) and was extended to SEM by Li & Jacobucci (2021). The problem that stability selection tries to solve is the instability of regularization procedures: small changes in the data set may result in different parameters being selected. To address this issue, stability selection draws random subsamples from the initial data set and fits the model in each of these subsamples. For each parameter, we can then check how often it is included in the model for a given set of tuning parameters. Plotting these selection probabilities provides an overview of which parameters are often removed and which remain in the model most of the time. To arrive at a final selection, a threshold t can be defined: if a parameter is in the model at least t% of the time, it is retained.
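The core resampling logic can be sketched in a few lines of base R. The model fit below is a stand-in (a hypothetical fitAndCheckNonzero function returning random sparse selections), not the actual regularized SEM fitted by lessSEM:
set.seed(123)
N <- 100; B <- 5; threshold <- 70
data <- matrix(rnorm(N * 3), nrow = N)
# stand-in for refitting the regularized model in a subsample and
# checking which of three parameters are estimated as non-zero:
fitAndCheckNonzero <- function(subsample) runif(3) > c(.1, .5, .9)
selected <- matrix(NA, nrow = B, ncol = 3)
for (b in 1:B) {
  subsample <- data[sample(1:N, size = 80), ]
  selected[b, ] <- fitAndCheckNonzero(subsample)
}
# percentage of subsamples in which each parameter was selected:
percentSelected <- 100 * colMeans(selected)
# final selection: retain parameters selected at least threshold% of the time
which(percentSelected >= threshold)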
Usage
stabilitySelection(
modelSpecification,
subsampleSize,
numberOfSubsamples = 100,
threshold = 70,
maxTries = 10 * numberOfSubsamples
)
Arguments
modelSpecification |
a call to one of the penalty functions in lessSEM. See examples for details |
subsampleSize |
number of subjects in each subsample. Must be smaller than the number of subjects in the original data set |
numberOfSubsamples |
number of times the procedure should subsample and recompute the model. According to Meinshausen & Bühlmann (2010), 100 seems to work quite well and is also the default in regsem |
threshold |
percentage of models in which a parameter must be included in order to enter the final model |
maxTries |
fitting a model in a subsample may fail. maxTries sets the maximum number of subsamples to try. |
Value
estimates for each subsample and aggregated percentages for each parameter
References
Li, X., & Jacobucci, R. (2021). Regularized structural equation modeling with stability selection. Psychological Methods, 27(4), 497–518. https://doi.org/10.1037/met0000389
Meinshausen, N., & Bühlmann, P. (2010). Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(4), 417–473. https://doi.org/10.1111/j.1467-9868.2010.00740.x
Examples
library(lessSEM)
# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Stability selection
stabSel <- stabilitySelection(
# IMPORTANT: Wrap your call to the penalty function in an rlang::expr-Block:
modelSpecification =
rlang::expr(
lasso(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = paste0("l", 6:15),
# in case of lasso and adaptive lasso, we can specify the number of lambda
# values to use. lessSEM will automatically find lambda_max and fit
# models for nLambda values between 0 and lambda_max. For the other
# penalty functions, lambdas must be specified explicitly
nLambdas = 50)
),
subsampleSize = 80,
numberOfSubsamples = 5, # should be set to a much higher number (e.g., 100)
threshold = 70
)
stabSel
plot(stabSel)
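# slots of the returned stabSel object can be accessed with the @
# operator; e.g., the final selected parameters (see the class
# documentation above):
stabSel@selectedParameters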
summary method for objects of class cvRegularizedSEM
Description
Summary method for objects of class cvRegularizedSEM.
Usage
## S4 method for signature 'cvRegularizedSEM'
summary(object, ...)
Arguments
object |
object of class cvRegularizedSEM |
... |
not used |
Value
No return value, just prints estimates
summary
Description
Summary method for objects of class gpRegularized.
Usage
## S4 method for signature 'gpRegularized'
summary(object, ...)
Arguments
object |
object of class gpRegularized |
... |
not used |
Value
No return value, just prints estimates
summary
Description
Summary method for objects of class regularizedSEM.
Usage
## S4 method for signature 'regularizedSEM'
summary(object, ...)
Arguments
object |
object of class regularizedSEM |
... |
not used |
Value
No return value, just prints estimates
summary
Description
Summary method for objects of class regularizedSEMMixedPenalty.
Usage
## S4 method for signature 'regularizedSEMMixedPenalty'
summary(object, ...)
Arguments
object |
object of class regularizedSEMMixedPenalty |
... |
not used |
Value
No return value, just prints estimates
summary
Description
Summary method for objects of class regularizedSEMWithCustomPenalty.
Usage
## S4 method for signature 'regularizedSEMWithCustomPenalty'
summary(object, ...)
Arguments
object |
object of class regularizedSEMWithCustomPenalty |
... |
not used |
Value
No return value, just prints estimates
variances
Description
Extract the labels of all variances found in a lavaan model.
Usage
variances(lavaanModel)
Arguments
lavaanModel |
fitted lavaan model |
Value
vector with parameter labels
Examples
# The following is adapted from ?lavaan::sem
library(lessSEM)
model <- '
# latent variable definitions
ind60 =~ x1 + x2 + x3
dem60 =~ y1 + a*y2 + b*y3 + c*y4
dem65 =~ y5 + a*y6 + b*y7 + c*y8
# regressions
dem60 ~ ind60
dem65 ~ ind60 + dem60
# residual correlations
y1 ~~ y5
y2 ~~ y4 + y6
y3 ~~ y7
y4 ~~ y8
y6 ~~ y8
'
fit <- sem(model, data = PoliticalDemocracy)
variances(fit)