Title: | Composite-Based Structural Equation Modeling |
Version: | 0.6.1 |
Date: | 2025-05-15 |
Maintainer: | Florian Schuberth <f.schuberth@utwente.nl> |
Depends: | R (≥ 3.5.0) |
Description: | Estimate, assess, test, and study linear, nonlinear, hierarchical and multigroup structural equation models using composite-based approaches and procedures, including estimation techniques such as partial least squares path modeling (PLS-PM) and its derivatives (PLSc, ordPLSc, robustPLSc), generalized structured component analysis (GSCA), generalized structured component analysis with uniqueness terms (GSCAm), generalized canonical correlation analysis (GCCA), principal component analysis (PCA), factor score regression (FSR) using sum score, regression or Bartlett scores (including bias correction using Croon’s approach), as well as several tests and typical postestimation procedures (e.g., verify admissibility of the estimates, assess the model fit, test the model fit etc.). |
BugReports: | https://github.com/FloSchuberth/cSEM/issues/ |
URL: | https://github.com/FloSchuberth/cSEM/, https://floschuberth.github.io/cSEM/ |
License: | GPL-3 |
Encoding: | UTF-8 |
NeedsCompilation: | yes |
LazyData: | true |
Imports: | alabama, cli, crayon, expm (≥ 0.999-5), future.apply, future, lifecycle, lavaan, magrittr, MASS, Matrix, matrixcalc, matrixStats, polycor, progressr, psych, purrr, Rdpack, rlang, stats, symmoments, TruncatedNormal, utils |
RdMacros: | Rdpack, lifecycle |
RoxygenNote: | 7.3.2 |
Suggests: | DiagrammeR, DiagrammeRsvg, dplyr, tidyr, knitr, nnls, prettydoc, plotly, rsvg, rmarkdown, rootSolve, listviewer, testthat (≥ 3.0.0), ggplot2, openxlsx, graphics, spelling |
Config/testthat/edition: | 3 |
VignetteBuilder: | knitr |
Language: | en-US |
Packaged: | 2025-05-15 14:33:52 UTC; SchuberthF |
Author: | Manuel E. Rademaker |
Repository: | CRAN |
Date/Publication: | 2025-05-16 09:40:14 UTC |
cSEM: A package for composite-based structural equation modeling
Description
Estimate, analyze, test, and study linear, nonlinear, hierarchical and multigroup structural equation models using composite-based approaches and procedures, including estimation techniques such as partial least squares path modeling (PLS) and its derivatives (PLSc, ordPLSc, robustPLSc), generalized structured component analysis (GSCA), generalized structured component analysis with uniqueness terms (GSCAm), generalized canonical correlation analysis (GCCA), unit weights (sum score) and fixed weights, as well as several tests and typical postestimation procedures (e.g., assess the model fit, compute direct, indirect and total effects).
Author(s)
Maintainer: Florian Schuberth f.schuberth@utwente.nl (ORCID)
Authors:
Manuel E. Rademaker manuel-rademaker@outlook.de (ORCID)
Other contributors:
Tamara Schamberger tamara.schamberger@uni-wuerzburg.de (ORCID) [contributor]
Michael Klesel (ORCID) [contributor]
Huu Phuc Nguyen (ORCID) [contributor]
Theo K. Dijkstra [contributor]
Jörg Henseler (ORCID) [contributor]
See Also
Useful links:
Report bugs at https://github.com/FloSchuberth/cSEM/issues/
Data: Anime
Description
A data frame with 183 observations and 13 variables.
Usage
Anime
Format
An object of class data.frame with 183 rows and 13 columns.
Details
The data set for the example on github.com/ISS-Analytics/pls-predict/ with irrelevant variables removed.
Source
Original source: github.com/ISS-Analytics/pls-predict/
Data: Benitezetal2020
Description
A data frame containing 22 variables with 300 observations.
Usage
Benitezetal2020
Format
An object of class data.frame with 300 rows and 22 columns.
Details
The simulated data contains variables about the social executive and employee behavior. Moreover, it contains variables about the social media capability and business performance. The dataset was used as an illustrative example in Benitez et al. (2020).
Source
The dataset is provided as supplementary material by Benitez et al. (2020).
References
Benitez J, Henseler J, Castillo A, Schuberth F (2020). “How to perform and report an impactful analysis using partial least squares: Guidelines for confirmatory and explanatory IS research.” Information & Management, 57(2), 103168. doi:10.1016/j.im.2019.05.003.
Examples
#============================================================================
# Example is taken from Benitez et al. (2020)
#============================================================================
model_Benitez <- "
# Reflective measurement models
SEXB =~ SEXB1 + SEXB2 + SEXB3 + SEXB4
SEMB =~ SEMB1 + SEMB2 + SEMB3 + SEMB4
# Composite models
SMC <~ SMC1 + SMC2 + SMC3 + SMC4
BPP <~ BPP1 + BPP2 + BPP3 + BPP4 + BPP5
# Control variables
FS <~ FirmSize
Ind <~ Industry1 + Industry2 + Industry3
# Structural model
SMC ~ SEXB + SEMB
BPP ~ SMC + Ind + FS
"
out <- csem(.data = Benitezetal2020, .model = model_Benitez,
.PLS_weight_scheme_inner = 'factorial',
.tolerance = 1e-06)
Data: BergamiBagozzi2000
Description
A data frame containing 22 variables with 305 observations.
Usage
BergamiBagozzi2000
Format
An object of class data.frame with 305 rows and 22 columns.
Details
The dataset contains 22 variables and originates from a larger survey among South Korean employees conducted and reported by Bergami and Bagozzi (2000). It is also used in Hwang and Takane (2004) and Henseler (2021) for demonstration purposes, see the corresponding tutorial.
Source
Survey among South Korean employees conducted and reported by Bergami and Bagozzi (2000).
References
Bergami M, Bagozzi RP (2000).
“Self-categorization, affective commitment and group self-esteem as distinct aspects of social identity in the organization.”
British Journal of Social Psychology, 39(4), 555–577.
doi:10.1348/014466600164633.
Henseler J (2021).
Composite-Based Structural Equation Modeling: Analyzing Latent and Emergent Variables.
Guilford Press, New York.
Hwang H, Takane Y (2004).
“Generalized Structured Component Analysis.”
Psychometrika, 69(1), 81–99.
Examples
#============================================================================
# Example is taken from Henseler (2021)
#============================================================================
model_Bergami_Bagozzi_Henseler="
# Measurement models
OrgPres =~ cei1 + cei2 + cei3 + cei4 + cei5 + cei6 + cei7 + cei8
OrgIden =~ ma1 + ma2 + ma3 + ma4 + ma5 + ma6
AffLove =~ orgcmt1 + orgcmt2 + orgcmt3 + orgcmt7
AffJoy =~ orgcmt5 + orgcmt8
Gender <~ gender
# Structural model
OrgIden ~ OrgPres
AffLove ~ OrgPres + OrgIden + Gender
AffJoy ~ OrgPres + OrgIden + Gender
"
out <- csem(.data = BergamiBagozzi2000,
.model = model_Bergami_Bagozzi_Henseler,
.PLS_weight_scheme_inner = 'factorial',
.tolerance = 1e-06
)
#============================================================================
# Example is taken from Hwang and Takane (2004)
#============================================================================
model_Bergami_Bagozzi_Hwang="
# Measurement models
OrgPres =~ cei1 + cei2 + cei3 + cei4 + cei5 + cei6 + cei7 + cei8
OrgIden =~ ma1 + ma2 + ma3 + ma4 + ma5 + ma6
AffJoy =~ orgcmt1 + orgcmt2 + orgcmt3 + orgcmt7
AffLove =~ orgcmt5 + orgcmt6 + orgcmt8
# Structural model
OrgIden ~ OrgPres
AffLove ~ OrgIden
AffJoy ~ OrgIden"
out_Hwang <- csem(.data = BergamiBagozzi2000,
.model = model_Bergami_Bagozzi_Hwang,
.approach_weights = "GSCA",
.disattenuate = FALSE,
.id = "gender",
.tolerance = 1e-06)
Data: ITFlex
Description
A data frame containing 16 variables with 100 observations.
Usage
ITFlex
Format
A data frame containing the following variables:
ITCOMP1
Software applications can be easily transported and used across multiple platforms.
ITCOMP2
Our firm provides multiple interfaces or entry points (e.g., web access) for external end users.
ITCOMP3
Our firm establishes corporate rules and standards for hardware and operating systems to ensure platform compatibility.
ITCOMP4
Data captured in one part of our organization are immediately available to everyone in the firm.
ITCONN1
Our organization has electronic links and connections throughout the entire firm.
ITCONN2
Our firm is linked to business partners through electronic channels (e.g., websites, e-mail, wireless devices, electronic data interchange).
ITCONN3
All remote, branch, and mobile offices are connected to the central office.
ITCONN4
There are very few identifiable communications bottlenecks within our firm.
MOD1
Our firm possesses a great speed in developing new business applications or modifying existing applications.
MOD2
Our corporate database is able to communicate in several different protocols.
MOD3
Reusable software modules are widely used in new systems development.
MOD4
IT personnel use object-oriented and prepackaged modular tools to create software applications.
ITPSF1
Our IT personnel have the ability to work effectively in cross-functional teams.
ITPSF2
Our IT personnel are able to interpret business problems and develop appropriate technical solutions.
ITPSF3
Our IT personnel are self-directed and proactive.
ITPSF4
Our IT personnel are knowledgeable about the key success factors in our firm.
Details
The dataset was studied by Benitez et al. (2018) and is used in Henseler (2021) for demonstration purposes, see the corresponding tutorial. All questionnaire items are measured on a 5-point scale.
Source
The data was collected through a survey by Benitez et al. (2018).
References
Benitez J, Ray G, Henseler J (2018).
“Impact of Information Technology Infrastructure Flexibility on Mergers and Acquisitions.”
MIS Quarterly, 42(1), 25–43.
Henseler J (2021).
Composite-Based Structural Equation Modeling: Analyzing Latent and Emergent Variables.
Guilford Press, New York.
Examples
#============================================================================
# Example is taken from Henseler (2021)
#============================================================================
model_IT_Fex="
# Composite models
ITComp <~ ITCOMP1 + ITCOMP2 + ITCOMP3 + ITCOMP4
Modul <~ MOD1 + MOD2 + MOD3 + MOD4
ITConn <~ ITCONN1 + ITCONN2 + ITCONN3 + ITCONN4
ITPers <~ ITPSF1 + ITPSF2 + ITPSF3 + ITPSF4
# Saturated structural model
ITPers ~ ITComp + Modul + ITConn
Modul ~ ITComp + ITConn
ITConn ~ ITComp
"
out <- csem(.data = ITFlex, .model = model_IT_Fex,
.PLS_weight_scheme_inner = 'factorial',
.tolerance = 1e-06,
.PLS_ignore_structural_model = TRUE)
Data: LancelotMiltgenetal2016
Description
A data frame containing 11 variables with 1090 observations.
Usage
LancelotMiltgenetal2016
Format
An object of class data.frame with 1090 rows and 11 columns.
Details
The data was analysed by Lancelot-Miltgen et al. (2016) to study young consumers’ adoption intentions of a location tracker technology in the light of privacy concerns. It is also used in Henseler (2021) for demonstration purposes, see the corresponding tutorial.
Source
This data has been collected through a cooperation with the European Commission Joint Research Center Institute for Prospective Technological Studies, contract “Young People and Emerging Digital Services: An Exploratory Survey on Motivations, Perceptions, and Acceptance of Risk” (EC JRC Contract IPTS No: 150876-2007 F1ED-FR).
References
Henseler J (2021).
Composite-Based Structural Equation Modeling: Analyzing Latent and Emergent Variables.
Guilford Press, New York.
Lancelot-Miltgen C, Henseler J, Gelhard C, Popovic A (2016).
“Introducing new products that affect consumer privacy: A mediation model.”
Journal of Business Research, 69(10), 4659–4666.
doi:10.1016/j.jbusres.2016.04.015.
Examples
#============================================================================
# Example is taken from Henseler (2021)
#============================================================================
model_Med <- "
# Reflective measurement model
Trust =~ trust1 + trust2
PrCon =~ privcon1 + privcon2 + privcon3 + privcon4
Risk =~ risk1 + risk2 + risk3
Int =~ intent1 + intent2
# Structural model
Int ~ Trust + PrCon + Risk
Risk ~ Trust + PrCon
Trust ~ PrCon
"
out <- csem(.data = LancelotMiltgenetal2016, .model = model_Med,
.PLS_weight_scheme_inner = 'factorial',
.tolerance = 1e-06
)
Data: political democracy
Description
The Industrialization and Political Democracy dataset. This dataset is used throughout Bollen's 1989 book (see pages 12, 17, 36 in chapter 2, pages 228 and following in chapter 7, pages 321 and following in chapter 8; Bollen (1989)). The dataset contains various measures of political democracy and industrialization in developing countries.
Usage
PoliticalDemocracy
Format
A data frame of 75 observations of 11 variables.
y1
Expert ratings of the freedom of the press in 1960
y2
The freedom of political opposition in 1960
y3
The fairness of elections in 1960
y4
The effectiveness of the elected legislature in 1960
y5
Expert ratings of the freedom of the press in 1965
y6
The freedom of political opposition in 1965
y7
The fairness of elections in 1965
y8
The effectiveness of the elected legislature in 1965
x1
The gross national product (GNP) per capita in 1960
x2
The inanimate energy consumption per capita in 1960
x3
The percentage of the labor force in industry in 1960
Source
The lavaan package (version 0.6-3).
References
Bollen KA (1989). Structural Equations with Latent Variables. Wiley-Interscience. ISBN 978-0471011712.
Examples
#============================================================================
# Example is taken from the lavaan website
#============================================================================
# Note: example is modified. Across-block correlations are removed
model <- "
# Measurement model
ind60 =~ x1 + x2 + x3
dem60 =~ y1 + y2 + y3 + y4
dem65 =~ y5 + y6 + y7 + y8
# Regressions / Path model
dem60 ~ ind60
dem65 ~ ind60 + dem60
# residual correlations
y2 ~~ y4
y6 ~~ y8
"
aa <- csem(PoliticalDemocracy, model)
Data: Russett
Description
A data frame containing 10 variables with 47 observations.
Usage
Russett
Format
A data frame containing the following variables for 47 countries:
gini
The Gini index of concentration
farm
The percentage of landholders who collectively occupy one-half of all the agricultural land (starting with the farmers with the smallest plots of land and working toward the largest)
rent
The percentage of the total number of farms that rent all their land. Transformation: ln (x + 1)
gnpr
The 1955 gross national product per capita in U.S. dollars. Transformation: ln (x)
labo
The percentage of the labor force employed in agriculture. Transformation: ln (x)
inst
Instability of personnel based on the term of office of the chief executive. Transformation: exp (x - 16.3)
ecks
The total number of politically motivated violent incidents, from plots to protracted guerrilla warfare. Transformation: ln (x + 1)
deat
The number of people killed as a result of internal group violence per 1,000,000 people. Transformation: ln (x + 1)
stab
One if the country has a stable democracy, and zero otherwise
dict
One if the country experiences a dictatorship, and zero otherwise
Details
The dataset was initially compiled by Russett (1964), discussed and reprinted by Gifi (1990), and partially transformed by Tenenhaus and Tenenhaus (2011). It is also used in Henseler (2021) for demonstration purposes.
Source
From: Henseler (2021)
References
Gifi A (1990).
Nonlinear multivariate analysis.
Wiley.
Henseler J (2021).
Composite-Based Structural Equation Modeling: Analyzing Latent and Emergent Variables.
Guilford Press, New York.
Russett BM (1964).
“Inequality and Instability: The Relation of Land Tenure to Politics.”
World Politics, 16(3), 442–454.
doi:10.2307/2009581.
Tenenhaus A, Tenenhaus M (2011).
“Regularized generalized canonical correlation analysis.”
Psychometrika, 76(2), 257–284.
Examples
#============================================================================
# Example is taken from Henseler (2021)
#============================================================================
model_Russett="
# Composite model
AgrIneq <~ gini + farm + rent
IndDev <~ gnpr + labo
PolInst <~ inst + ecks + deat + stab + dict
# Structural model
PolInst ~ AgrIneq + IndDev
"
out <- csem(.data = Russett, .model = model_Russett,
.PLS_weight_scheme_inner = 'factorial',
.tolerance = 1e-06
)
Data: SQ
Description
A data frame containing 23 variables with 411 observations. The original indicators were measured on a 6-point scale. In this version of the dataset, the indicators are scaled to be between 0 and 100.
Usage
SQ
Format
An object of class data.frame with 411 rows and 23 columns.
Details
The data comes from a European manufacturer of durable consumer goods and was studied by Bliemel et al. (2004) who focused on service quality. It is also used in Henseler (2021) for demonstration purposes, see the corresponding tutorial.
Source
The dataset is provided by Jörg Henseler.
References
Bliemel FW, Adolphs K, Henseler J (2004).
“Reconceptualizing service quality. A formative measurement approach using PLS path modeling.”
In Munuera-Aleman JL (ed.), Proceedings of the 33rd EMAC Conference, 224.
Henseler J (2021).
Composite-Based Structural Equation Modeling: Analyzing Latent and Emergent Variables.
Guilford Press, New York.
Data: Summers
Description
A (18 x 18) indicator correlation matrix.
Usage
Sigma_Summers_composites
Format
An object of class matrix (inherits from array) with 18 rows and 18 columns.
Details
The indicator correlation matrix for a modified version of Summers (1965) model. All constructs are modeled as composites.
Source
Own calculation based on Dijkstra and Henseler (2015).
References
Dijkstra TK, Henseler J (2015).
“Consistent and Asymptotically Normal PLS Estimators for Linear Structural Equations.”
Computational Statistics & Data Analysis, 81, 10–23.
Summers R (1965).
“A Capital Intensive Approach to the Small Sample Properties of Various Simultaneous Equation Estimators.”
Econometrica, 33(1), 1–41.
Examples
require(cSEM)
model <- "
ETA1 ~ ETA2 + XI1 + XI2
ETA2 ~ ETA1 + XI3 + XI4
ETA1 ~~ ETA2
XI1 <~ x1 + x2 + x3
XI2 <~ x4 + x5 + x6
XI3 <~ x7 + x8 + x9
XI4 <~ x10 + x11 + x12
ETA1 <~ y1 + y2 + y3
ETA2 <~ y4 + y5 + y6
"
## Generate data
summers_dat <- MASS::mvrnorm(n = 300, mu = rep(0, 18),
Sigma = Sigma_Summers_composites, empirical = TRUE)
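## Estimate with the default path estimator
## (inconsistent here because ETA1 and ETA2 affect each other)
res <- csem(.data = summers_dat, .model = model)
## Estimate using two-stage least squares (2SLS)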
res_2SLS <- csem(.data = summers_dat, .model = model, .approach_paths = "2SLS",
.instruments = list(ETA1 = c('XI1', 'XI2', 'XI3', 'XI4'),
ETA2 = c('XI1', 'XI2', 'XI3', 'XI4'))
)
Data: Switching
Description
A data frame containing 26 variables with 767 observations.
Usage
Switching
Format
An object of class data.frame with 767 rows and 26 columns.
Details
The data contains variables about the consumers’ intention to switch a service provider. It is also used in Henseler (2021) for demonstration purposes, see the corresponding tutorial.
Source
The dataset is provided by Jörg Henseler.
References
Henseler J (2021). Composite-Based Structural Equation Modeling: Analyzing Latent and Emergent Variables. Guilford Press, New York.
Examples
#============================================================================
# Example is taken from Henseler (2021)
#============================================================================
model_Int <-"
# Measurement models
INV =~ INV1 + INV2 + INV3 + INV4
SAT =~ SAT1 + SAT2 + SAT3
INT =~ INT1 + INT2
# Structural model containing an interaction term.
INT ~ INV + SAT + INV.SAT
"
out <- csem(.data = Switching, .model = model_Int,
.PLS_weight_scheme_inner = 'factorial',
.tolerance = 1e-06)
Data: Yooetal2000
Description
A data frame containing 34 variables with 569 observations.
Usage
Yooetal2000
Format
An object of class data.frame with 569 rows and 34 columns.
Details
The data is simulated and has the identical correlation matrix as the data that was analysed by Yoo et al. (2000) to examine how five elements of the marketing mix, namely price, store image, distribution intensity, advertising spending, and price deals, are related to the so-called dimensions of brand equity, i.e., perceived brand quality, brand loyalty, and brand awareness/associations. It is also used in Henseler (2017) and Henseler (2021) for demonstration purposes, see the corresponding tutorial.
Source
Simulated data with the same correlation matrix as the data studied by Yoo et al. (2000).
References
Henseler J (2017).
“Bridging Design and Behavioral Research With Variance-Based Structural Equation Modeling.”
Journal of Advertising, 46(1), 178–192.
doi:10.1080/00913367.2017.1281780.
Henseler J (2021).
Composite-Based Structural Equation Modeling: Analyzing Latent and Emergent Variables.
Guilford Press, New York.
Yoo B, Donthu N, Lee S (2000).
“An Examination of Selected Marketing Mix Elements and Brand Equity.”
Journal of the Academy of Marketing Science, 28(2), 195–211.
doi:10.1177/0092070300282002.
Examples
#============================================================================
# Example is taken from Henseler (2021)
#============================================================================
model_HOC="
# Measurement models FOC
PR =~ PR1 + PR2 + PR3
IM =~ IM1 + IM2 + IM3
DI =~ DI1 + DI2 + DI3
AD =~ AD1 + AD2 + AD3
DL =~ DL1 + DL2 + DL3
AA =~ AA1 + AA2 + AA3 + AA4 + AA5 + AA6
LO =~ LO1 + LO3
QL =~ QL1 + QL2 + QL3 + QL4 + QL5 + QL6
# Composite model for SOC
BR <~ QL + LO + AA
# Structural model
BR ~ PR + IM + DI + AD + DL
"
out <- csem(.data = Yooetal2000, .model = model_HOC,
.PLS_weight_scheme_inner = 'factorial',
.tolerance = 1e-06)
Internal: Multiple testing correction
Description
Adjust a given significance level .alpha to accommodate multiple testing. The following corrections are implemented:
none
(Default) No correction is done.
bonferroni
A Bonferroni correction is done, i.e., alpha is divided by the number of comparisons .nr_comparisons.
Usage
adjustAlpha(
.alpha = args_default()$.alpha,
.approach_alpha_adjust = args_default()$.approach_alpha_adjust,
.nr_comparisons = args_default()$.nr_comparisons
)
Arguments
.alpha |
An integer or a numeric vector of significance levels.
Defaults to |
.approach_alpha_adjust |
Character string. Approach used to adjust the significance level to accommodate multiple testing. One of "none" or "bonferroni". Defaults to "none". |
.nr_comparisons |
Integer. The number of comparisons. Defaults to |
Value
A vector of (possibly adjusted) significance levels.
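The two corrections described above can be sketched in a few lines of base R. This is a minimal illustration, not the package's implementation: the helper name adjust_alpha_sketch and its argument names are hypothetical, and only the documented behavior (no change for "none", division by the number of comparisons for "bonferroni") is assumed.

```r
# Hypothetical helper mirroring the two documented options; not cSEM's code.
adjust_alpha_sketch <- function(alpha,
                                approach = c("none", "bonferroni"),
                                nr_comparisons = 1) {
  approach <- match.arg(approach)
  if (approach == "bonferroni") {
    # Bonferroni: divide every significance level by the number of comparisons
    alpha / nr_comparisons
  } else {
    # "none": return the significance levels unchanged
    alpha
  }
}

# Two significance levels, adjusted for five comparisons
adjusted <- adjust_alpha_sketch(c(0.01, 0.05), "bonferroni", nr_comparisons = 5)
adjusted
```

With five comparisons the levels 0.01 and 0.05 become 0.002 and 0.01.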
Complete list of assess()'s ... arguments
Description
A complete alphabetical list of all possible arguments accepted by assess()'s ... (dotdotdot) argument.
Arguments
.absolute |
Logical. Should the absolute HTMT values be returned?
Defaults to |
.alpha |
An integer or a numeric vector of significance levels.
Defaults to |
.ci |
A vector of character strings naming the confidence interval to compute.
For possible choices see |
.closed_form_ci |
Logical. Should a closed-form confidence interval be computed?
Defaults to |
.handle_inadmissibles |
Character string. How should inadmissible results be treated? One of "drop", "ignore", or "replace". If "drop", all replications/resamples yielding an inadmissible result will be dropped (i.e. the number of results returned will potentially be less than |
.inference |
Logical. Should critical values be computed? Defaults to |
.null_model |
Logical. Should the degrees of freedom for the null model
be computed? Defaults to |
.R |
Integer. The number of bootstrap replications. Defaults to |
.saturated |
Logical. Should a saturated structural model be used?
Defaults to |
.seed |
Integer or |
.type_gfi |
Character string. Which fitting function should the GFI be based on? One of "ML" for the maximum likelihood fitting function, "GLS" for the generalized least squares fitting function or "ULS" for the unweighted least squares fitting function (same as the squared Euclidean distance). Defaults to "ML". |
.type_vcv |
Character string. Which model-implied correlation matrix should be calculated? One of "indicator" or "construct". Defaults to "indicator". |
Details
Most arguments supplied to the ... argument of assess() are only accepted by a subset of the functions called by assess(). The following list shows which argument is passed to which function:
- .absolute
Accepted by/Passed down to: calculateHTMT()
- .alpha
Accepted by/Passed down to: calculateRhoT(), calculateHTMT(), calculateCN()
- .ci
Accepted by/Passed down to: calculateHTMT()
- .closed_form_ci
Accepted by/Passed down to: calculateRhoT()
- .handle_inadmissibles
Accepted by/Passed down to: calculateHTMT()
- .inference
Accepted by/Passed down to: calculateHTMT()
- .null_model
Accepted by/Passed down to: calculateDf()
- .R
Accepted by/Passed down to: calculateHTMT()
- .saturated
Accepted by/Passed down to: calculateSRMR(), calculateDG(), calculateDL(), calculateDML() and subsequently fit().
- .seed
Accepted by/Passed down to: calculateHTMT()
- .type_gfi
Accepted by/Passed down to: calculateGFI()
- .type_vcv
Accepted by/Passed down to: calculateSRMR(), calculateDG(), calculateDL(), calculateDML() and subsequently fit().
Show argument defaults or candidates
Description
Show all arguments used by package functions including default or candidate values. For argument descriptions see: csem_arguments.
Usage
args_default(.choices = FALSE)
Arguments
.choices |
Logical. Should candidate values for the arguments be returned?
Defaults to |
Details
By default args_default() returns a list of default values by argument name. If the list of accepted candidate values is required instead, use .choices = TRUE.
Value
A named list of argument names and defaults or accepted candidates.
See Also
handleArgs(), csem_arguments, csem(), foreman()
Assess model
Description
Usage
assess(
.object = NULL,
.quality_criterion = c("all", "aic", "aicc", "aicu", "bic", "fpe", "gm", "hq",
"hqc", "mallows_cp", "ave",
"rho_C", "rho_C_mm", "rho_C_weighted",
"rho_C_weighted_mm", "dg", "dl", "dml", "df",
"effects", "f2", "fl_criterion", "chi_square", "chi_square_df",
"cfi", "cn", "gfi", "ifi", "nfi", "nnfi",
"reliability",
"rmsea", "rms_theta", "srmr",
"gof", "htmt", "htmt2", "r2", "r2_adj",
"rho_T", "rho_T_weighted", "vif",
"vifmodeB"),
.only_common_factors = TRUE,
...
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.quality_criterion |
Character string. A single character string or a vector of character strings naming the quality criterion to compute. See the Details section for a list of possible candidates. Defaults to "all" in which case all possible quality criteria are computed. |
.only_common_factors |
Logical. Should only concepts modeled as common factors be included when calculating one of the following quality criteria: AVE, the Fornell-Larcker criterion, HTMT, and all reliability estimates. Defaults to |
... |
Further arguments passed to functions called by |
Details
Assess a model using common quality criteria. See the Postestimation: Assessing a model article on the cSEM website for details.
The function is essentially a wrapper around a number of internal functions that perform an "assessment task" (called a quality criterion in cSEM parlance) like computing reliability estimates, the effect size (Cohen's f^2), the heterotrait-monotrait ratio of correlations (HTMT) etc.
By default every possible quality criterion is calculated (.quality_criterion = "all"). If only a subset of quality criteria is needed, a single character string or a vector of character strings naming the criteria to be computed may be supplied to assess() via the .quality_criterion argument. Currently, the following quality criteria are implemented (in alphabetical order):
- Average variance extracted (AVE); "ave"
An estimate of the amount of variation in the indicators that is due to the underlying latent variable. Practically, it is calculated as the ratio of the (indicator) true score variances (i.e., the sum of the squared loadings) relative to the sum of the total indicator variances. The AVE is inherently tied to the common factor model. It is therefore unclear how to meaningfully interpret AVE results for constructs modeled as composites. It is possible to report the AVE for constructs modeled as composites by setting .only_common_factors = FALSE; however, results should be interpreted with caution as they may not have a conceptual meaning. Calculation is done by calculateAVE().
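The AVE ratio described in this entry can be illustrated with a short base R sketch. The loadings are made-up values for one hypothetical common factor, and the indicators are assumed to be standardized so that every total indicator variance equals 1; calculateAVE() itself is not called here.

```r
# Standardized loadings of one hypothetical common factor (made-up values)
loadings <- c(0.8, 0.7, 0.6)

# Assumption: standardized indicators, so each total indicator variance is 1
indicator_variances <- rep(1, length(loadings))

# AVE = sum of squared loadings / sum of total indicator variances
ave <- sum(loadings^2) / sum(indicator_variances)
ave
```

For these values the AVE is 1.49 / 3, i.e. roughly 0.497.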
- Congeneric reliability; "rho_C", "rho_C_mm", "rho_C_weighted", "rho_C_weighted_mm"
An estimate of the reliability assuming a congeneric measurement model (i.e., loadings are allowed to differ) and a test score (proxy) based on unit weights. There are four different versions implemented. See the Methods and Formulae section of the Postestimation: Assessing a model article on the cSEM website for details. Alternative but synonymous names for "rho_C" are: composite reliability, construct reliability, reliability coefficient, Jöreskog's rho, coefficient omega, or Dillon-Goldstein's rho. For "rho_C_weighted": (Dijkstra-Henseler's) rhoA. rho_C_mm and rho_C_weighted_mm have no corresponding names. The former uses unit weights scaled by (w'Sw)^(-1/2) and the latter weights scaled by (w'Sigma_hat w)^(-1/2) where Sigma_hat is the model-implied indicator correlation matrix. The congeneric reliability is inherently tied to the common factor model. It is therefore unclear how to meaningfully interpret congeneric reliability estimates for constructs modeled as composites. It is possible to report the congeneric reliability for constructs modeled as composites by setting .only_common_factors = FALSE; however, results should be interpreted with caution as they may not have a conceptual meaning. Calculation is done by calculateRhoC().
- Distance measures; "dg", "dl", "dml"
Measures of the distance between the model-implied and the empirical indicator correlation matrix. Currently, the geodesic distance ("dg"), the squared Euclidean distance ("dl") and the maximum likelihood-based distance function ("dml") are implemented. Calculation is done by calculateDL(), calculateDG(), and calculateDML().
- Degrees of freedom; "df"
Returns the degrees of freedom. Calculation is done by calculateDf().
- Effects; "effects"
Total and indirect effect estimates. Additionally, the variance accounted for (VAF) is computed. The VAF is defined as the ratio of a variable's indirect effect to its total effect. Calculation is done by calculateEffects().
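The relation between direct, indirect and total effects and the VAF can be sketched with base R matrix algebra. The model (X -> M -> Y plus a direct path X -> Y) and its coefficients are made up for illustration, and the identity used for total effects holds for recursive linear path models; this is not a reproduction of calculateEffects().

```r
# Path coefficients of a hypothetical recursive model (made-up values).
# B[i, j] holds the direct effect of variable j on variable i.
vars <- c("X", "M", "Y")
B <- matrix(0, 3, 3, dimnames = list(vars, vars))
B["M", "X"] <- 0.5
B["Y", "M"] <- 0.4
B["Y", "X"] <- 0.2

# Total effects sum the coefficient products along all paths:
# B + B^2 + ... = (I - B)^(-1) - I
total <- solve(diag(3) - B) - diag(3)
dimnames(total) <- dimnames(B)

# Indirect effects are total effects minus direct effects
indirect <- total - B

# VAF of X on Y: ratio of the indirect effect to the total effect
vaf <- indirect["Y", "X"] / total["Y", "X"]
vaf
```

Here the indirect effect of X on Y is 0.4 * 0.5 = 0.2 and the total effect is 0.2 + 0.2 = 0.4, so the VAF is 0.5.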
- Effect size; "f2"
An index of the effect size of an independent variable in a structural regression equation. This measure is commonly known as Cohen's f^2. The effect size of the k'th independent variable in this case is defined as the ratio (R2_included - R2_excluded)/(1 - R2_included), where R2_included and R2_excluded are the R squares of the original structural model regression equation (R2_included) and the alternative specification with the k'th variable dropped (R2_excluded). Calculation is done by calculatef2().
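The f^2 ratio given in this entry can be written as a one-line function. The sketch below is a base R illustration under stated assumptions: cohens_f2 is a hypothetical helper, and the regressions use observed variables from the built-in mtcars data, whereas cSEM applies the same ratio to structural equations between constructs.

```r
# Hypothetical helper: Cohen's f^2 from the two R squares described above
cohens_f2 <- function(r2_included, r2_excluded) {
  (r2_included - r2_excluded) / (1 - r2_included)
}

# Illustration on observed variables (an assumption made for this sketch)
fit_included <- lm(mpg ~ wt + hp, data = mtcars)  # original equation
fit_excluded <- lm(mpg ~ wt, data = mtcars)       # same equation with hp dropped

f2_hp <- cohens_f2(summary(fit_included)$r.squared,
                   summary(fit_excluded)$r.squared)
f2_hp
```

As a quick check of the formula, R squares of 0.5 (included) and 0.4 (excluded) give (0.5 - 0.4) / (1 - 0.5) = 0.2.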
- Fit indices; "chi_square", "chi_square_df", "cfi", "cn", "gfi", "ifi", "nfi", "nnfi", "rmsea", "rms_theta", "srmr"
Several absolute and incremental fit indices. Note that their suitability for models containing constructs modeled as composites is still an open research question. Also note that fit indices are not tests in a hypothesis testing sense and decisions based on common one-size-fits-all cut-offs proposed in the literature suffer from serious statistical drawbacks. Calculation is done by calculateChiSquare(), calculateChiSquareDf(), calculateCFI(), calculateGFI(), calculateIFI(), calculateNFI(), calculateNNFI(), calculateRMSEA(), calculateRMSTheta() and calculateSRMR().
- Fornell-Larcker criterion; "fl_criterion"
A rule suggested by Fornell and Larcker (1981) to assess discriminant validity. The Fornell-Larcker criterion is a decision rule based on a comparison between the squared construct correlations and the average variance extracted. The function returns a matrix with the squared construct correlations on the off-diagonal and the AVEs on the main diagonal. Calculation is done by calculateFLCriterion().
.
- Goodness of Fit (GoF); "gof"
The GoF is defined as the square root of the mean of the R squares of the structural model times the mean of the variances in the indicators that are explained by their related constructs (i.e., the average over all lambda^2_k). For the latter, only constructs modeled as common factors are considered as they explain their indicator variance in contrast to a composite where indicators actually build the construct. Note that, contrary to what the name suggests, the GoF is not a measure of model fit in a Chi-square fit test sense. Calculation is done by
calculateGoF()
.
- Heterotrait-monotrait ratio of correlations (HTMT); "htmt"
An estimate of the correlation between latent variables assuming tau-equivalent measurement models. The HTMT is used to assess convergent and/or discriminant validity of a construct. The HTMT is inherently tied to the common factor model. If the model contains fewer than two constructs modeled as common factors and
.only_common_factors = TRUE
, NA
is returned. It is possible to report the HTMT for constructs modeled as composites by setting .only_common_factors = FALSE
; however, results should be interpreted with caution as they may not have a conceptual meaning. Calculation is done by calculateHTMT()
.
- HTMT2; "htmt2"
An estimate of the correlation between latent variables assuming congeneric measurement models. The HTMT2 is used to assess convergent and/or discriminant validity of a construct. The HTMT2 is inherently tied to the common factor model. If the model contains fewer than two constructs modeled as common factors and
.only_common_factors = TRUE
, NA
is returned. It is possible to report the HTMT2 for constructs modeled as composites by setting .only_common_factors = FALSE
; however, results should be interpreted with caution as they may not have a conceptual meaning. Calculation is done by calculateHTMT()
.
- Model selection criteria: "aic", "aicc", "aicu", "bic", "fpe", "gm", "hq", "hqc", "mallows_cp"
Several model selection criteria as suggested by Sharma et al. (2019) in the context of PLS. See:
calculateModelSelectionCriteria()
for details.
- Reliability: "reliability"
As described in the Methods and Formulae section of the Postestimation: Assessing a model article on the cSEM website there are many different estimators for the (internal consistency) reliability. Choosing
.quality_criterion = "reliability"
computes the three most common measures, namely: "Cronbach's alpha" (identical to "rho_T"), "Jöreskog's rho" (identical to "rho_C_mm"), and "Dijkstra-Henseler's rho A" (identical to "rho_C_weighted_mm"). Reliability is inherently tied to the common factor model. It is therefore unclear how to meaningfully interpret reliability estimates for constructs modeled as composites. It is possible to report the three common reliability estimates for constructs modeled as composites by setting .only_common_factors = FALSE
; however, results should be interpreted with caution as they may not have a conceptual meaning.
- R square and R square adjusted; "r2", "r2_adj"
The R square and the adjusted R square for each structural regression equation. Calculated when running
csem()
.
- Tau-equivalent reliability; "rho_T"
An estimate of the reliability assuming a tau-equivalent measurement model (i.e. a measurement model with equal loadings) and a test score (proxy) based on unit weights. Tau-equivalent reliability is the preferred name for reliability estimates that assume a tau-equivalent measurement model such as Cronbach's alpha. The tau-equivalent reliability (Cronbach's alpha) is inherently tied to the common factor model. It is therefore unclear how to meaningfully interpret tau-equivalent reliability estimates for constructs modeled as composites. It is possible to report tau-equivalent reliability estimates for constructs modeled as composites by setting
.only_common_factors = FALSE
; however, results should be interpreted with caution as they may not have a conceptual meaning. Calculation is done by calculateRhoT()
.
- Variance inflation factors (VIF); "vif"
An index for the amount of (multi-)collinearity between independent variables of a regression equation. Computed for each structural equation. Practically, VIF_k is defined as the ratio of 1 over (1 - R2_k) where R2_k is the R squared from a regression of the k'th independent variable on all remaining independent variables. Calculated when running
csem()
.
- Variance inflation factors for PLS-PM mode B (VIF-ModeB); "vifmodeB"
An index for the amount of (multi-)collinearity between independent variables (indicators) in mode B regression equations. Computed only if
.object
was obtained using .weight_approach = "PLS-PM"
and at least one mode was mode B. Practically, VIF-ModeB_k is defined as the ratio of 1 over (1 - R2_k) where R2_k is the R squared from a regression of the k'th indicator of block j on all remaining indicators of the same block. Calculation is done by calculateVIFModeB()
.
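The two regression-based indices defined in the list above, Cohen's f^2 and the VIF, can be illustrated with a small base R sketch. The simulated data and the variable names (x1, x2, y) are assumptions made for this example only; cSEM computes both quantities from the estimated structural model, not from raw regressions like these.

```r
# Simulated data; all variable names are hypothetical
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.6 * x1 + rnorm(n)          # deliberately correlated with x1
y  <- 0.5 * x1 + 0.3 * x2 + rnorm(n)

# Cohen's f^2 for x2: compare R2 with and without x2 in the equation
r2_included <- summary(lm(y ~ x1 + x2))$r.squared
r2_excluded <- summary(lm(y ~ x1))$r.squared
f2_x2 <- (r2_included - r2_excluded) / (1 - r2_included)

# VIF for x2: regress x2 on the remaining independent variable(s)
r2_x2  <- summary(lm(x2 ~ x1))$r.squared
vif_x2 <- 1 / (1 - r2_x2)

c(f2 = f2_x2, vif = vif_x2)
```

Because the two equations are nested, R2_included can never be smaller than R2_excluded, so f^2 is non-negative; the VIF is always at least 1.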
For details on the most important quality criteria, see the Methods and Formulae section of the Postestimation: Assessing a model article on the cSEM website.
Some of the quality criteria are inherently tied to the classical common
factor model and therefore only meaningfully interpreted within a common
factor model (see the
Postestimation: Assessing a model
article for details).
It is possible to force computation of all quality criteria for constructs
modeled as composites by setting .only_common_factors = FALSE
; however,
we explicitly warn against interpreting quality criteria in analogy to the common factor
model in this case, as the interpretation often does not carry over to composite models.
Resampling
To resample a given quality criterion supply the name of the function
that calculates the desired quality criterion to csem()
's .user_funs
argument.
See resamplecSEMResults()
for details.
Value
A named list of quality criteria. Note that if only a single quality criterion is computed, the return value is still a list!
See Also
csem()
, resamplecSEMResults()
, exportToExcel()
Examples
# ===========================================================================
# Using the three common factors dataset
# ===========================================================================
model <- "
# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2
# Each concept is measured by 3 indicators, i.e., modeled as latent variable
eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"
res <- csem(threecommonfactors, model)
a <- assess(res) # computes all quality criteria (.quality_criterion = "all")
a
## The return value is a named list. Type for example:
a$HTMT
# You may also just compute a subset of the quality criteria
assess(res, .quality_criterion = c("ave", "rho_C", "htmt"))
## Resampling ---------------------------------------------------------------
# To resample a given quality criterion use csem()'s .user_funs argument
# Note: The output of the quality criterion needs to be a vector or a matrix.
# Matrices will be vectorized columnwise.
res <- csem(threecommonfactors, model,
  .resample_method = "bootstrap",
  .R               = 40,
  .user_funs       = cSEM:::calculateSRMR
)
## Look at the resamples
res$Estimates$Estimates_resample$Estimates1$User_fun$Resampled[1:4, ]
## Use infer() to compute e.g., the 95% percentile confidence interval
res_infer <- infer(res, .quantity = "CI_percentile")
## The results are saved under the name "User_fun"
res_infer$User_fun
## Several quality criteria can be resampled simultaneously
res <- csem(threecommonfactors, model,
  .resample_method = "bootstrap",
  .R               = 40,
  .user_funs       = list(
    "SRMR"      = cSEM:::calculateSRMR,
    "RMS_theta" = cSEM:::calculateRMSTheta
  ),
  .tolerance = 1e-04
)
res$Estimates$Estimates_resample$Estimates1$SRMR$Resampled[1:4, ]
res$Estimates$Estimates_resample$Estimates1$RMS_theta$Resampled[1:4]
Internal: Build DOT code for the SEM plot, including construct correlations.
Description
Constructs the DOT script for the SEM path diagram, including correlations between all constructs (not just exogenous ones). Only one edge is drawn per correlation.
Usage
buildDotCode(
title,
graph_attrs,
constructs,
r2_values,
measurement_edge_fun,
path_coefficients,
path_p_values,
correlations,
plot_significances,
plot_correlations,
plot_structural_model_only,
plot_labels,
is_second_order = FALSE,
model_measurement = NULL,
model_error_cor = NULL,
construct_correlations = NULL,
indicator_correlations = NULL
)
Arguments
title |
The title of the plot. |
graph_attrs |
Optional graph attributes. |
constructs |
A vector of constructs. |
r2_values |
Named vector of R2 values. |
measurement_edge_fun |
Function to generate measurement edge code. |
path_coefficients |
Matrix/data frame of path coefficients. |
path_p_values |
Named vector of path p-values. Used for construct correlations too. |
correlations |
List containing correlations (exogenous and indicator). |
plot_significances |
Logical. Whether to display significance levels. |
plot_correlations |
Option for indicator correlations ("none", "exo", or "all"). |
plot_structural_model_only |
Logical. Whether to display only the structural model. |
is_second_order |
Logical. Whether the model is second-order. |
model_measurement |
A matrix. The measurement matrix. |
model_error_cor |
A matrix. |
construct_correlations |
A matrix. The construct correlation matrix. |
indicator_correlations |
A matrix. The indicator correlation matrix. |
Value
A character string containing the complete DOT code.
Internal: Second/Third stage of the two-stage approach for second order constructs
Description
Performs the second and third stage for a model containing second order constructs.
Usage
calculate2ndStage(
.csem_model = args_default()$.csem_model,
.first_stage_results = args_default()$.first_stage_results,
.original_arguments = args_default()$.original_arguments,
.approach_2ndorder = args_default()$.approach_2ndorder
)
Arguments
.csem_model |
A (possibly incomplete) cSEMModel-list. |
.original_arguments |
The list of arguments used within |
.approach_2ndorder |
Character string. Approach used for models containing second-order constructs. One of: "2stage", or "mixed". Defaults to "2stage". |
Value
A cSEMResults object.
Average variance extracted (AVE)
Description
Calculate the average variance extracted (AVE) as proposed by Fornell and Larcker (1981). For details, see the cSEM website.
Usage
calculateAVE(
.object = NULL,
.only_common_factors = TRUE
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.only_common_factors |
Logical. Should only concepts modeled as common
factors be included when calculating one of the following quality criteria:
AVE, the Fornell-Larcker criterion, HTMT, and all reliability estimates.
Defaults to |
Details
The AVE is inherently tied to the common factor model. It is therefore
unclear how to meaningfully interpret the AVE in the context of a
composite model. It is possible, however, to force computation of the AVE for constructs
modeled as composites by setting .only_common_factors = FALSE
.
Value
A named vector of numeric values (the AVEs). If .object
is a list
of cSEMResults
objects, a list of AVEs is returned.
References
Fornell C, Larcker DF (1981). “Evaluating structural equation models with unobservable variables and measurement error.” Journal of Marketing Research, XVIII, 39–50.
Internal: Calculate composite variance-covariance matrix
Description
Calculate the sample variance-covariance (VCV) matrix of the composites/proxies.
Usage
calculateCompositeVCV(
.S = args_default()$.S,
.W = args_default()$.W
)
Arguments
.S |
The (K x K) empirical indicator correlation matrix. |
.W |
A (J x K) matrix of weights. |
Value
A (J x J) composite VCV matrix.
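As a minimal sketch of the computation described here, the VCV matrix of linear composites W x follows from the indicator matrix S as W S W'. The matrices below are hypothetical, and the snippet shows the textbook identity rather than the package's internal code.

```r
# Hypothetical example: K = 4 indicators, J = 2 composites
S <- matrix(0.3, 4, 4)
diag(S) <- 1
dimnames(S) <- list(paste0("y", 1:4), paste0("y", 1:4))

# (J x K) weight matrix; each indicator belongs to exactly one composite
W <- rbind(eta1 = c(0.6, 0.6, 0,   0),
           eta2 = c(0,   0,   0.7, 0.5))
colnames(W) <- colnames(S)

# VCV matrix of the composites
C <- W %*% S %*% t(W)
C
```

The result has one row and one column per composite, matching the (J x J) dimension stated above.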
Internal: Calculate construct variance-covariance matrix
Description
Calculate the variance-covariance matrix (VCV) of the constructs, i.e., correlations that involve common factors/latent variables are disattenuated.
Usage
calculateConstructVCV(
.C = args_default()$.C,
.Q = args_default()$.Q
)
Arguments
.C |
A (J x J) composite variance-covariance matrix. |
.Q |
A vector of composite-construct correlations with element names equal to the names of the J constructs used in the measurement model. Note Q^2 is also called the reliability coefficient. |
Value
The (J x J) construct VCV matrix. Disattenuated if requested.
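The following sketch illustrates disattenuation under the assumption that it follows the classical correction for attenuation: each composite correlation is divided by the product of the two corresponding composite-construct correlations. The values of C and Q are hypothetical; Q = 1 is used here to mark a construct modeled as a composite, for which nothing is corrected.

```r
# Hypothetical composite correlation matrix for J = 3 constructs
nms <- c("eta1", "eta2", "eta3")
C <- matrix(c(1.0, 0.4, 0.3,
              0.4, 1.0, 0.5,
              0.3, 0.5, 1.0), 3, 3, dimnames = list(nms, nms))

# Composite-construct correlations (assumed values)
Q <- c(eta1 = 0.9, eta2 = 0.8, eta3 = 1)

# Correct every off-diagonal element; keep unit variances on the diagonal
P <- C / outer(Q, Q)
diag(P) <- 1
P
```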
Internal: Calculate PLSc correction factors
Description
Calculates the correction factor used by PLSc.
Usage
calculateCorrectionFactors(
.S = args_default()$.S,
.W = args_default()$.W,
.modes = args_default()$.modes,
.csem_model = args_default()$.csem_model,
.PLS_approach_cf = args_default()$.PLS_approach_cf
)
Arguments
.S |
The (K x K) empirical indicator correlation matrix. |
.W |
A (J x K) matrix of weights. |
.modes |
A vector giving the mode for each construct in the form |
.csem_model |
A (possibly incomplete) cSEMModel-list. |
.PLS_approach_cf |
Character string. Approach used to obtain the correction
factors for PLSc. One of: "dist_squared_euclid", "dist_euclid_weighted",
"fisher_transformed", "mean_arithmetic", "mean_geometric", "mean_harmonic",
"geo_of_harmonic". Defaults to "dist_squared_euclid".
Ignored if |
Details
Currently, seven approaches are available:
"dist_squared_euclid" (default)
"dist_euclid_weighted"
"fisher_transformed"
"mean_geometric"
"mean_harmonic"
"mean_arithmetic"
"geo_of_harmonic" (not yet implemented)
See Dijkstra (2013) for details.
Value
A numeric vector of correction factors with element names equal to the names of the J constructs used in the measurement model.
References
Dijkstra TK (2013). “A Note on How to Make Partial Least Squares Consistent.” Working Paper. doi:10.13140/RG.2.1.4547.5688.
Degrees of freedom
Description
Calculate the degrees of freedom for a given model from a cSEMResults object.
Usage
calculateDf(
.object = NULL,
.null_model = FALSE,
...
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.null_model |
Logical. Should the degrees of freedom for the null model
be computed? Defaults to |
... |
Ignored. |
Details
Although composite-based estimators always retrieve parameters of the postulated models via the estimation of a composite model, the computation of the degrees of freedom depends on the postulated model.
See: cSEM website for details on how the degrees of freedom are calculated.
To compute the degrees of freedom of the null model use .null_model = TRUE
.
The degrees of freedom of the null model are identical to the number of
non-redundant off-diagonal elements of the empirical indicator correlation matrix.
This implicitly assumes a null model with model-implied indicator correlation
matrix equal to the identity matrix.
Value
A single numeric value.
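For the null model, the count described above can be reproduced directly. The sketch below assumes K = 9 indicators, as in a model with three constructs measured by three indicators each; it only counts matrix elements and does not call the package.

```r
K <- 9                       # assumed number of indicators

# Non-redundant off-diagonal elements of a (K x K) correlation matrix
df_null <- K * (K - 1) / 2

# The same count obtained from the lower triangle of a K x K matrix
df_check <- sum(lower.tri(diag(K)))

c(df_null = df_null, df_check = df_check)
```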
Internal: Matrix difference
Description
Calculates the average of the differences between all possible pairs of (symmetric) matrices in a list using a given distance measure.
Usage
calculateDistance(
.matrices = NULL,
.distance = args_default()$.distance
)
Arguments
.matrices |
A list of at least two matrices. |
.distance |
Character string. A distance measure. One of: "geodesic" or "squared_euclidean". Defaults to "geodesic". |
Details
.matrices
must be a list of at least two matrices. If more than two matrices
are supplied, the arithmetic mean of the differences between all possible pairs of
(symmetric) matrices in the list is computed. For n matrices there are
n choose 2 such pairs. Hence, supplying a large number of matrices
becomes computationally expensive.
Currently two distance measures are supported:
geodesic
(Default) The geodesic distance.
squared_euclidean
The squared Euclidean distance
Value
A numeric vector of length one containing the (arithmetic) mean of
the differences between all possible pairs of matrices supplied via .matrices
.
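The averaging over all pairs can be sketched in base R as follows. The helper sq_euclid() is hypothetical and defines the squared Euclidean distance as the sum of squared element-wise differences; the scaling used inside cSEM may differ, so treat the numbers as illustrative only.

```r
# Assumed definition of the squared Euclidean distance between two matrices
sq_euclid <- function(A, B) sum((A - B)^2)

# Three hypothetical 2 x 2 correlation matrices
mats <- list(
  diag(2),
  matrix(c(1, 0.5, 0.5, 1), 2, 2),
  matrix(c(1, 0.2, 0.2, 1), 2, 2)
)

# All n choose 2 pairs of matrices, one pair per column
pairs <- combn(length(mats), 2)

# Distance for each pair, then the arithmetic mean
d <- apply(pairs, 2, function(p) sq_euclid(mats[[p[1]]], mats[[p[2]]]))
mean_distance <- mean(d)
mean_distance
```

With n matrices the number of columns in pairs grows as n(n - 1)/2, which is why long lists become costly.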
Internal: Calculate direct, indirect and total effect
Description
The direct effects are equal to the estimated coefficients. The total effect
equals (I-B)^-1 Gamma. The indirect effect equals the difference between
the total effect and the direct effect. In addition, the variance accounted
for (VAF) is calculated. The VAF is defined as the ratio of a variable's
indirect effect to its total effect. Helper for generic functions summarize()
and assess()
.
Usage
calculateEffects(
.object = NULL,
.output_type = c("data.frame", "matrix")
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.output_type |
Character string. The type of output to return. One of "data.frame" or "matrix". See the Value section for details. Defaults to "data.frame". |
Value
A matrix or a data frame of effects.
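The relations stated in the Description can be traced with a small numeric sketch. The path coefficients below are hypothetical values for the model eta2 ~ eta1, eta3 ~ eta1 + eta2; Gamma holds the effects of the exogenous on the endogenous constructs and B the effects among the endogenous constructs.

```r
endo <- c("eta2", "eta3")

# Direct effects of the exogenous construct eta1 (assumed values)
Gamma <- matrix(c(0.5, 0.3), ncol = 1, dimnames = list(endo, "eta1"))

# Direct effects among endogenous constructs: eta3 ~ 0.4 * eta2
B <- matrix(c(0,   0,
              0.4, 0), 2, 2, byrow = TRUE, dimnames = list(endo, endo))

total    <- solve(diag(2) - B) %*% Gamma   # (I - B)^-1 Gamma
indirect <- total - Gamma                  # total minus direct effect
vaf      <- indirect / total               # variance accounted for

cbind(direct = Gamma[, 1], indirect = indirect[, 1],
      total = total[, 1], vaf = vaf[, 1])
```

Here the indirect effect of eta1 on eta3 is 0.5 * 0.4 = 0.2, so 40% of the total effect of 0.5 runs through eta2.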
See Also
assess()
, summarize()
cSEMResults
Fornell-Larcker criterion
Description
Computes the Fornell-Larcker matrix.
Usage
calculateFLCriterion(
.object = NULL,
.only_common_factors = TRUE,
...
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.only_common_factors |
Logical. Should only concepts modeled as common
factors be included when calculating one of the following quality criteria:
AVE, the Fornell-Larcker criterion, HTMT, and all reliability estimates.
Defaults to |
... |
Ignored. |
Details
The Fornell-Larcker criterion (FL criterion) is a rule suggested by Fornell and Larcker (1981) to assess discriminant validity. The Fornell-Larcker criterion is a decision rule based on a comparison between the squared construct correlations and the average variance extracted (AVE).
The FL criterion is inherently tied to the common factor model. It is therefore unclear how to meaningfully interpret the FL criterion in the context of a model that contains constructs modeled as composites.
Value
A matrix with the squared construct correlations on the off-diagonal and the AVEs on the main diagonal.
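A sketch of how such a matrix can be assembled and read, using hypothetical AVEs and construct correlations. The decision rule applied at the end (each AVE should exceed the construct's squared correlations with all other constructs) is the usual reading of the criterion and is stated here as an assumption.

```r
# Hypothetical AVEs and construct correlations
ave <- c(eta1 = 0.60, eta2 = 0.55, eta3 = 0.70)
R <- matrix(c(1.0, 0.5, 0.4,
              0.5, 1.0, 0.8,
              0.4, 0.8, 1.0), 3, 3,
            dimnames = list(names(ave), names(ave)))

# Squared correlations off the diagonal, AVEs on the diagonal
FL <- R^2
diag(FL) <- ave

# Largest squared correlation of each construct with any other construct
max_sq_cor <- apply(FL - diag(diag(FL)), 1, max)

# TRUE where the AVE exceeds all squared correlations
criterion_met <- ave > max_sq_cor
criterion_met
```

In this example eta2 fails because its squared correlation with eta3 (0.64) is larger than its AVE (0.55).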
References
Fornell C, Larcker DF (1981). “Evaluating structural equation models with unobservable variables and measurement error.” Journal of Marketing Research, XVIII, 39–50.
Internal: ANOVA F-test statistic
Description
Calculate the ANOVA F-test statistic suggested by Sarstedt et al. (2011) in the OTG testing procedure.
Usage
calculateFR(.resample_sarstedt)
Arguments
.resample_sarstedt |
A matrix containing the parameter estimates that could potentially be compared and an id column indicating the group adherence of each row. |
Value
A named scalar, the test statistic of the ANOVA F-test
References
Sarstedt M, Henseler J, Ringle CM (2011). “Multigroup Analysis in Partial Least Squares (PLS) Path Modeling: Alternative Methods and Empirical Results.” In Advances in International Marketing, 195–218. Emerald Group Publishing Limited. doi:10.1108/s1474-7979(2011)0000022012.
Goodness of Fit (GoF)
Description
Calculate the Goodness of Fit (GoF) proposed by Tenenhaus et al. (2004). Note that, contrary to what the name suggests, the GoF is not a measure of model fit in the sense of SEM. See e.g. Henseler and Sarstedt (2012) for a discussion.
Usage
calculateGoF(
.object = NULL
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
Details
The GoF is inherently tied to the common factor model. It is therefore unclear how to meaningfully interpret the GoF in the context of a model that contains constructs modeled as composites.
Value
A single numeric value.
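A minimal numeric sketch of the GoF, assuming the definition used for the "gof" quality criterion: the square root of the mean structural R squared times the mean squared loading. The R squared values and loadings are hypothetical.

```r
# Hypothetical R squared values of the structural equations
r2 <- c(eta2 = 0.25, eta3 = 0.45)

# Hypothetical loadings of all indicators of constructs modeled as common factors
lambda <- c(0.70, 0.80, 0.90,
            0.60, 0.70, 0.80,
            0.75, 0.85, 0.80)

gof <- sqrt(mean(r2) * mean(lambda^2))
gof
```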
References
Henseler J, Sarstedt M (2012).
“Goodness-of-fit Indices for Partial Least Squares Path Modeling.”
Computational Statistics, 28(2), 565–580.
doi:10.1007/s00180-012-0317-1.
Tenenhaus M, Amato S, Vinzi VE (2004).
“A Global Goodness-of-Fit Index for PLS Structural Equation Modelling.”
In Proceedings of the XLII SIS Scientific Meeting, 739–742.
HTMT
Description
Computes either the heterotrait-monotrait ratio of correlations (HTMT) based on Henseler et al. (2015) or the HTMT2 proposed by Roemer et al. (2021). While the HTMT is a consistent estimator for the construct correlation in case of tau-equivalent measurement models, the HTMT2 is a consistent estimator for congeneric measurement models. In general, they are used to assess discriminant validity.
Usage
calculateHTMT(
.object = NULL,
.type_htmt = c('htmt','htmt2'),
.absolute = TRUE,
.alpha = 0.05,
.ci = c("CI_percentile", "CI_standard_z", "CI_standard_t",
"CI_basic", "CI_bc", "CI_bca", "CI_t_interval"),
.inference = FALSE,
.only_common_factors = TRUE,
.R = 499,
.seed = NULL,
...
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.type_htmt |
Character string indicating the type of HTMT that should be calculated, i.e., the original HTMT ("htmt") or the HTMT2 ("htmt2"). Defaults to "htmt" |
.absolute |
Logical. Should the absolute HTMT values be returned?
Defaults to |
.alpha |
A numeric value giving the significance level.
Defaults to |
.ci |
A character string naming the type of confidence interval to use
to compute the 1-alpha% quantile of the bootstrap HTMT values. For possible
choices see |
.inference |
Logical. Should critical values be computed? Defaults to |
.only_common_factors |
Logical. Should only concepts modeled as common
factors be included when calculating one of the following quality criteria:
AVE, the Fornell-Larcker criterion, HTMT, and all reliability estimates.
Defaults to |
.R |
Integer. The number of bootstrap replications. Defaults to |
.seed |
Integer or |
... |
Ignored. |
Details
Computation of the HTMT/HTMT2 assumes that all intra-block and inter-block correlations between indicators are either all-positive or all-negative. A warning is given if this is not the case.
To obtain bootstrap confidence intervals for the HTMT/HTMT2 values, set .inference = TRUE
.
To choose the type of confidence interval, use .ci
. To control the bootstrap process,
arguments .R
and .seed
are available. Note that .alpha
is multiplied by two
because typically researchers are interested in one-sided bootstrap confidence intervals
for the HTMT/HTMT2.
Since the HTMT and the HTMT2 both assume a reflective measurement
model, only concepts modeled as common factors are considered by default.
For concepts modeled as composites the HTMT may be computed by setting
.only_common_factors = FALSE
, however, it is unclear how to
interpret values in this case.
Value
A named list containing:
The values of the HTMT/HTMT2, i.e., a matrix with the HTMT/HTMT2 values in its lower triangle. If
.inference = TRUE
, the upper triangle contains the upper limit of the 1-2*.alpha% bootstrap confidence interval if the HTMT/HTMT2 is positive and the lower limit if the HTMT/HTMT2 is negative.
The lower and upper limits of the 1-2*.alpha% bootstrap confidence interval if
.inference = TRUE
; otherwise it is NULL
.
The number of admissible bootstrap runs, i.e., the number of HTMT/HTMT2 values calculated during bootstrap if
.inference = TRUE
; otherwise it is NULL
. Note that the HTMT2 is based on the geometric mean and thus cannot always be calculated.
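To make the quantity concrete, the sketch below computes the HTMT for two blocks of indicators from a hypothetical indicator correlation matrix, following the definition in Henseler et al. (2015): the mean correlation between indicators of different constructs, divided by the square root of the product of the mean within-block correlations. The helper mean_offdiag() and all values are assumptions for illustration; the HTMT2 replaces the arithmetic means with geometric means.

```r
# Hypothetical blocks: indicators 1-3 belong to eta1, indicators 4-6 to eta2
blocks <- list(eta1 = 1:3, eta2 = 4:6)

S <- matrix(0.3, 6, 6)     # correlations between the two blocks
S[1:3, 1:3] <- 0.5         # correlations within block 1
S[4:6, 4:6] <- 0.5         # correlations within block 2
diag(S) <- 1

# Mean of the off-diagonal elements of a symmetric matrix
mean_offdiag <- function(M) mean(M[lower.tri(M)])

hetero <- mean(S[blocks$eta1, blocks$eta2])          # between-block mean
mono1  <- mean_offdiag(S[blocks$eta1, blocks$eta1])  # within block 1
mono2  <- mean_offdiag(S[blocks$eta2, blocks$eta2])  # within block 2

htmt <- hetero / sqrt(mono1 * mono2)
htmt
```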
References
Henseler J, Ringle CM, Sarstedt M (2015).
“A New Criterion for Assessing Discriminant Validity in Variance-based Structural Equation Modeling.”
Journal of the Academy of Marketing Science, 43(1), 115–135.
doi:10.1007/s11747-014-0403-8.
Roemer E, Schuberth F, Henseler J (2021).
“HTMT2 – an improved criterion for assessing discriminant validity in structural equation modeling.”
Industrial Management & Data Systems, 121(12), 2637–2650.
Internal: Calculate indicator correlation matrix
Description
Calculate the indicator correlation matrix using conventional or robust methods.
Usage
calculateIndicatorCor(
.X_cleaned = NULL,
.approach_cor_robust = "none"
)
Arguments
.X_cleaned |
A data.frame of processed data (cleaned and ordered). Note: |
.approach_cor_robust |
Character string. Approach used to obtain a robust
indicator correlation matrix. One of: "none" in which case the standard
Bravais-Pearson correlation is used,
"spearman" for the Spearman rank correlation, or
"mcd" via |
Details
If .approach_cor_robust = "none"
(the default) the type of correlation computed
depends on the types of the columns of .X_cleaned
(i.e., the indicators)
involved in the computation.
Numeric-numeric
If both columns (indicators) involved are numeric, the Bravais-Pearson product-moment correlation is computed (via
stats::cor()
).
Numeric-factor
If one of the columns is a factor variable and the other is numeric, the polyserial correlation (Drasgow 1988) is computed (via
polycor::polyserial()
).
Factor-factor
If both columns are factor variables, the polychoric correlation (Drasgow 1988) is computed (via
polycor::polychor()
).
Note: logical input is treated as a 0-1 factor variable.
If "mcd"
(= minimum covariance determinant), the MCD estimator
(Rousseeuw and Driessen 1999), a robust covariance estimator, is applied
(via MASS::cov.rob()
).
If "spearman"
, the Spearman rank correlation is used (via stats::cor()
).
Value
A list with elements:
$S
The (K x K) indicator correlation matrix
$cor_type
The type(s) of indicator correlation computed ( "Pearson", "Polyserial", "Polychoric")
$thre_est
Currently ignored (NULL)
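For purely numeric indicators, the difference between the "none" and "spearman" settings comes down to the method passed to stats::cor(). The sketch below uses simulated data with hypothetical column names and calls stats::cor() directly rather than the internal function.

```r
# Simulated numeric indicators; names are hypothetical
set.seed(3)
X <- data.frame(y1 = rnorm(100))
X$y2 <- 0.7 * X$y1 + rnorm(100)
X$y3 <- exp(X$y1) + rnorm(100, sd = 0.1)   # monotone but nonlinear in y1

# Bravais-Pearson correlation (what "none" yields for numeric columns)
S_pearson <- stats::cor(X)

# Spearman rank correlation
S_spearman <- stats::cor(X, method = "spearman")

round(S_pearson, 2)
round(S_spearman, 2)
```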
References
Drasgow F (1988).
“Polychoric and polyserial correlations.”
In Encyclopedia of Statistical Sciences, volume 7, 68-74.
John Wiley & Sons Inc, Hoboken.
Rousseeuw PJ, Driessen KV (1999).
“A Fast Algorithm for the Minimum Covariance Determinant Estimator.”
Technometrics, 41(3), 212–223.
doi:10.1080/00401706.1999.10485670.
Internal: Calculate the inner weights for PLS-PM
Description
PLS-PM forms each "inner" composite as a weighted sum of its I related composites. These inner weights are obtained using one of the following schemes (Lohmöller 1989):
centroid
According to the centroid weighting scheme, each inner weight used to form inner composite j is 1 if the correlation between composite j and the composite i = 1, ..., I related to it via the structural model is positive, and -1 if it is negative.
factorial
According to the factorial weighting scheme, each inner weight used to form inner composite j is equal to the correlation between composite j and the composite i = 1, ..., I related to it via the structural model.
path
Let us call all constructs that have an arrow pointing to construct j predecessors of j and all constructs that j points to followers of j. According to the path weighting scheme, inner weights are computed as follows. Take construct j:
For all predecessors of j, set the inner weight of predecessor i to the correlation of i with j.
For all followers of j, set the inner weight of follower i to the coefficient of a multiple regression of j on all followers i with i = 1,...,I.
Except for the path weighting scheme, relatedness can come in two flavors.
If .PLS_ignore_structural_model = TRUE
all constructs are considered related.
If .PLS_ignore_structural_model = FALSE
(the default) only adjacent constructs
are considered. If .PLS_ignore_structural_model = TRUE
and .PLS_weight_scheme_inner = "path"
a warning is issued and .PLS_ignore_structural_model
is changed to FALSE
.
Usage
calculateInnerWeightsPLS(
.S = args_default()$.S,
.W = args_default()$.W,
.csem_model = args_default()$.csem_model,
.PLS_ignore_structural_model = args_default()$.PLS_ignore_structural_model,
.PLS_weight_scheme_inner = args_default()$.PLS_weight_scheme_inner
)
Arguments
.S |
The (K x K) empirical indicator correlation matrix. |
.W |
A (J x K) matrix of weights. |
.csem_model |
A (possibly incomplete) cSEMModel-list. |
.PLS_ignore_structural_model |
Logical. Should the structural model be ignored
when calculating the inner weights of the PLS-PM algorithm? Defaults to |
.PLS_weight_scheme_inner |
Character string. The inner weighting scheme
used by PLS-PM. One of: "centroid", "factorial", or "path".
Defaults to "path". Ignored if |
Value
The (J x J) matrix E
of inner weights.
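The centroid and factorial schemes described above can be sketched with a hypothetical composite correlation matrix R and an adjacency matrix adj that marks which constructs are related via the structural model (here the chain eta1 -> eta2 -> eta3). Both objects and their names are assumptions for illustration; the path scheme is omitted because it additionally requires regressions.

```r
nms <- c("eta1", "eta2", "eta3")

# Hypothetical composite correlation matrix
R <- matrix(c( 1.0, 0.5, -0.2,
               0.5, 1.0,  0.4,
              -0.2, 0.4,  1.0), 3, 3, dimnames = list(nms, nms))

# 1 = adjacent in the structural model (eta1 - eta2, eta2 - eta3)
adj <- matrix(c(0, 1, 0,
                1, 0, 1,
                0, 1, 0), 3, 3, dimnames = list(nms, nms))

E_centroid  <- sign(R) * adj   # +1 / -1 for related composites
E_factorial <- R * adj         # correlation for related composites

# Ignoring the structural model treats all other constructs as related
adj_all <- 1 - diag(3)
E_centroid_all <- sign(R) * adj_all
```

With the structural model ignored, eta1 and eta3 become related and their negative correlation yields an inner weight of -1 under the centroid scheme.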
Internal: Calculate prediction metrics
Description
Currently, the following prediction measures are available:
Usage
calculateMAE(resid)
Details
Mean absolute error
Mean absolute percentage error
Mean squared error (MSE)
Root mean squared error
Theil's forecast accuracy
Theil's forecast quality
Bias proportion of MSE
Regression proportion of MSE
Disturbance proportion of MSE
Value
A vector of the prediction measures for the observed variables belonging to endogenous constructs
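A few of the listed measures can be written down directly from a vector of residuals. The sketch below uses hypothetical actual and predicted values and standard textbook definitions; the mean absolute percentage error is returned as a proportion here, and conventions on scaling it by 100 vary.

```r
# Hypothetical observed and predicted indicator values
actual    <- c(3.0, 2.5, 4.0, 5.5)
predicted <- c(2.5, 3.0, 4.0, 5.0)
resid     <- actual - predicted

mae  <- mean(abs(resid))            # mean absolute error
mse  <- mean(resid^2)               # mean squared error
rmse <- sqrt(mse)                   # root mean squared error
mape <- mean(abs(resid / actual))   # mean absolute percentage error (proportion)

c(MAE = mae, MSE = mse, RMSE = rmse, MAPE = mape)
```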
Model selection criteria
Description
Calculate several information or model selection criteria (MSC) such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC) or the Hannan-Quinn criterion (HQ).
Usage
calculateModelSelectionCriteria(
.object = NULL,
.ms_criterion = c("all", "aic", "aicc", "aicu", "bic", "fpe", "gm", "hq",
"hqc", "mallows_cp"),
.by_equation = TRUE,
.only_structural = TRUE
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.ms_criterion |
Character string. Either a single character string or a vector
of character strings naming the model selection criterion to compute.
Defaults to |
.by_equation |
Should the criteria be computed for each structural model
equation separately? Defaults to |
.only_structural |
Should the log-likelihood be based on the
structural model? Ignored if |
Details
By default, all criteria are calculated (.ms_criterion == "all"
). To compute only
a subset of the criteria a vector of criteria may be given.
If .by_equation == TRUE
(the default), the criteria are computed for each
structural equation of the model separately, as suggested by
Sharma et al. (2019) in the context of PLS. The relevant formula can be found in
Table B1 of the appendix of Sharma et al. (2019).
If .by_equation == FALSE
the AIC, the BIC and the HQ for the whole model
are calculated. All other criteria are currently ignored in this case!
The relevant formulas are (see, e.g., Akaike (1974),
Schwarz (1978),
Hannan and Quinn (1979)):
AIC = - 2*log(L) + 2*k
BIC = - 2*log(L) + k*ln(n)
HQ = - 2*log(L) + 2*k*ln(ln(n))
where log(L) is the log likelihood function of the multivariate normal distribution of the observable variables, k the (total) number of estimated parameters, and n the sample size.
If .only_structural == TRUE
, log(L) is based on the structural model only.
The argument is ignored if .by_equation == TRUE
.
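The three formulas can be checked numerically with any model that provides a log-likelihood. As a stand-in, the sketch below uses an ordinary linear model fitted to R's built-in cars data; this is an assumption made for illustration, since cSEM bases log(L) on the multivariate normal distribution of the observable variables.

```r
# Stand-in model with a known log-likelihood
fit  <- lm(dist ~ speed, data = cars)
ll   <- logLik(fit)

logL <- as.numeric(ll)     # log(L)
k    <- attr(ll, "df")     # number of estimated parameters
n    <- nobs(fit)          # sample size

aic <- -2 * logL + 2 * k
bic <- -2 * logL + k * log(n)
hq  <- -2 * logL + 2 * k * log(log(n))

c(AIC = aic, BIC = bic, HQ = hq)
```

The first two values agree with stats::AIC() and stats::BIC() for the same fit, which is a convenient sanity check on the formulas.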
Value
If .by_equation == TRUE
a named list of model selection criteria.
References
Akaike H (1974).
“A New Look at the Statistical Model Identification.”
IEEE Transactions on Automatic Control, 19(6), 716–723.
Hannan EJ, Quinn BG (1979).
“The Determination of the order of an autoregression.”
Journal of the Royal Statistical Society: Series B (Methodological), 41(2), 190–195.
Schwarz G (1978).
“Estimating the Dimension of a Model.”
The Annals of Statistics, 6(2), 461–464.
doi:10.1214/aos/1176344136.
Sharma P, Sarstedt M, Shmueli G, Kim KH, Thiele KO (2019).
“PLS-Based Model Selection: The Role of Alternative Explanations in Information Systems Research.”
Journal of the Association for Information Systems, 20(4).
Internal: Calculate the outer weights for PLS-PM
Description
Calculates outer weights in PLS-PM. Currently, the originally suggested mode A and mode B are supported. Additionally, non-negative least squares (modeBNNLS) and weights of principal component analysis (PCA) are implemented.
Usage
calculateOuterWeightsPLS(
.data = args_default()$.data,
.S = args_default()$.S,
.W = args_default()$.W,
.E = args_default()$.E,
.modes = args_default()$.modes
)
Arguments
.data |
A |
.S |
The (K x K) empirical indicator correlation matrix. |
.W |
A (J x K) matrix of weights. |
.E |
A (J x J) matrix of inner weights. |
.modes |
A vector giving the mode for each construct in the form |
Value
A (J x K) matrix of outer weights.
Internal: Parameter differences across groups
Description
Calculate the difference between one or more parameter estimates across
all possible pairs of groups (data sets) in .object
.
Usage
calculateParameterDifference(
.object = args_default()$.object,
.model = args_default()$.model
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.model |
A model in lavaan model syntax indicating which
parameters (i.e., path ( |
Value
A list of length equal to the number of possible pairs of
groups in .object (mathematically, this is n choose 2, i.e., 3 if there are
three groups and 6 if there are four groups). Each list element is itself a
list of three. The first list element contains
the difference between parameter estimates of the structural model, the second
list element the difference between estimated loadings, and the third
the difference between estimated weights.
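The pairing logic described above can be sketched in base R. This is only an illustration under stated assumptions: the group names and estimate values are invented, and the code does not reproduce the internals of calculateParameterDifference().

```r
# Hypothetical estimates of one path coefficient in four groups (values invented).
est <- c(g1 = 0.50, g2 = 0.35, g3 = 0.42, g4 = 0.61)

# All unordered pairs of groups: choose(4, 2) = 6 pairs.
group_pairs <- utils::combn(names(est), 2, simplify = FALSE)

# Difference of the estimate for each pair (first group minus second group).
param_diff <- vapply(
  group_pairs,
  function(p) est[[p[1]]] - est[[p[2]]],
  numeric(1)
)
names(param_diff) <- vapply(group_pairs, paste, character(1), collapse = "_")

length(param_diff)
param_diff
```

With three groups the same code yields three pairs, matching the "n choose 2" rule stated in the text.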
Internal: Calculation of the CDF used in Henseler et al. (2009)
Description
Calculates the probability that theta^1 is smaller than or equal to theta^2. See Equation (6) in Sarstedt et al. (2011).
Usage
calculatePr(.resample_centered = NULL, .parameters_to_compare = NULL)
Arguments
.parameters_to_compare |
A model in lavaan model syntax indicating which
parameters (i.e, path ( |
Value
A named vector
References
Sarstedt M, Henseler J, Ringle CM (2011). “Multigroup Analysis in Partial Least Squares (PLS) Path Modeling: Alternative Methods and Empirical Results.” In Advances in International Marketing, 195–218. Emerald Group Publishing Limited. doi:10.1108/s1474-7979(2011)0000022012.
Relative Goodness of Fit (relative GoF)
Description
Calculate the Relative Goodness of Fit (GoF) proposed by Vinzi et al. (2010). Note that, contrary to what the name suggests, the Relative GoF is not a measure of model fit in the sense of SEM. See e.g. Henseler and Sarstedt (2012) for a discussion.
Usage
calculateRelativeGoF(
.object = NULL
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
Value
A single numeric value.
References
Henseler J, Sarstedt M (2012).
“Goodness-of-fit Indices for Partial Least Squares Path Modeling.”
Computational Statistics, 28(2), 565–580.
doi:10.1007/s00180-012-0317-1.
Vinzi VE, Trinchera L, Amato S (2010).
“PLS path modeling: From foundations to recent developments and open issues for model assessment and improvement.”
In Vinzi VE, Wang H (eds.), Handbook of Partial Least Squares, 47–82.
Springer.
See Also
Internal: Calculate Reliabilities
Description
Internal: Calculate Reliabilities
Usage
calculateReliabilities(
.X = args_default()$.X,
.S = args_default()$.S,
.W = args_default()$.W,
.approach_weights = args_default()$.approach_weights,
.csem_model = args_default()$.csem_model,
.disattenuate = args_default()$.disattenuate,
.PLS_approach_cf = args_default()$.PLS_approach_cf,
.reliabilities = args_default()$.reliabilities
)
Arguments
.X |
A matrix of processed data (scaled, cleaned and ordered). |
.S |
The (K x K) empirical indicator correlation matrix. |
.W |
A (J x K) matrix of weights. |
.approach_weights |
Character string. Approach used to obtain composite weights. One of: "PLS-PM", "SUMCORR", "MAXVAR", "SSQCORR", "MINVAR", "GENVAR", "GSCA", "PCA", "unit", "bartlett", or "regression". Defaults to "PLS-PM". |
.csem_model |
A (possibly incomplete) cSEMModel-list. |
.disattenuate |
Logical. Should composite/proxy correlations
be disattenuated to yield consistent loadings and path estimates if at least
one of the constructs is modeled as a common factor? Defaults to |
.PLS_approach_cf |
Character string. Approach used to obtain the correction
factors for PLSc. One of: "dist_squared_euclid", "dist_euclid_weighted",
"fisher_transformed", "mean_arithmetic", "mean_geometric", "mean_harmonic",
"geo_of_harmonic". Defaults to "dist_squared_euclid".
Ignored if |
.reliabilities |
A character vector of |
Calculate variance inflation factors (VIF) for weights obtained by PLS Mode B
Description
Calculate the variance inflation factor (VIF) for weights obtained by PLS-PM's Mode B.
Usage
calculateVIFModeB(.object = NULL)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
Details
Weight estimates obtained by Mode B can suffer from multicollinearity. VIF values are commonly used to assess the severity of multicollinearity.
The function is only applicable to objects of class cSEMResults_default.
For other object classes use assess().
Value
A named list of vectors containing the VIF values. Each list name is the name of a construct whose weights were obtained by Mode B. The vectors contain the VIF values obtained from a regression of each explanatory variable of a given construct on the remaining explanatory variables of that construct.
If the weighting approach is not "PLS-PM" or Mode B is not used for any of the constructs,
the function silently returns NA.
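The VIF computation described above can be sketched in base R. The data are simulated and the indicator names are invented; the sketch illustrates the textbook definition VIF_k = 1 / (1 - R^2_k) and is not the package's own code.

```r
set.seed(1)

# Simulated indicators of one hypothetical Mode B construct (names invented).
n  <- 200
x1 <- rnorm(n)
x2 <- 0.7 * x1 + rnorm(n)
x3 <- 0.5 * x2 + rnorm(n)
X  <- data.frame(x1, x2, x3)

# VIF_k = 1 / (1 - R^2_k), where R^2_k comes from regressing indicator k
# on the remaining indicators of the same construct.
vif_by_regression <- vapply(names(X), function(k) {
  fit <- lm(reformulate(setdiff(names(X), k), response = k), data = X)
  1 / (1 - summary(fit)$r.squared)
}, numeric(1))

# Equivalent shortcut: the diagonal of the inverse indicator correlation matrix.
vif_by_inverse <- diag(solve(cor(X)))

vif_by_regression
```

Both routes give the same values, so the shortcut is convenient when only the correlation matrix of a block is available.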
See Also
Calculate composite weights using GSCA
Description
Calculate composite weights using generalized structured component analysis (GSCA). The first version of this approach was presented in Hwang and Takane (2004). Since then, several advancements have been proposed. The latest version of GSCA can be found in Hwang and Takane (2014). This is the version cSEM's implementation is based on.
Usage
calculateWeightsGSCA(
.X = args_default()$.X,
.S = args_default()$.S,
.csem_model = args_default()$.csem_model,
.conv_criterion = args_default()$.conv_criterion,
.iter_max = args_default()$.iter_max,
.starting_values = args_default()$.starting_values,
.tolerance = args_default()$.tolerance
)
Arguments
.X |
A matrix of processed data (scaled, cleaned and ordered). |
.S |
The (K x K) empirical indicator correlation matrix. |
.csem_model |
A (possibly incomplete) cSEMModel-list. |
.conv_criterion |
Character string. The criterion to use for the convergence check. One of: "diff_absolute", "diff_squared", or "diff_relative". Defaults to "diff_absolute". |
.iter_max |
Integer. The maximum number of iterations allowed.
If |
.starting_values |
A named list of vectors where the
list names are the construct names whose indicator weights the user
wishes to set. The vectors must be named vectors of |
.tolerance |
Double. The tolerance criterion for convergence.
Defaults to |
Value
A named list. J stands for the number of constructs and K for the number of indicators.
$W
A (J x K) matrix of estimated weights.
$E
NULL
$Modes
A named vector of Modes used for the outer estimation, for GSCA the mode is automatically set to "gsca".
$Conv_status
The convergence status. TRUE if the algorithm has converged and FALSE otherwise.
$Iterations
The number of iterations required.
References
Hwang H, Takane Y (2004).
“Generalized Structured Component Analysis.”
Psychometrika, 69(1), 81–99.
Hwang H, Takane Y (2014).
Generalized Structured Component Analysis: A Component-Based Approach to Structural Equation Modeling, Chapman & Hall/CRC Statistics in the Social and Behavioral Sciences.
Chapman and Hall/CRC.
Calculate weights using GSCAm
Description
Calculate composite weights using generalized structured component analysis with uniqueness terms (GSCAm) proposed by Hwang et al. (2017).
Usage
calculateWeightsGSCAm(
.X = args_default()$.X,
.csem_model = args_default()$.csem_model,
.conv_criterion = args_default()$.conv_criterion,
.iter_max = args_default()$.iter_max,
.starting_values = args_default()$.starting_values,
.tolerance = args_default()$.tolerance
)
Arguments
.X |
A matrix of processed data (scaled, cleaned and ordered). |
.csem_model |
A (possibly incomplete) cSEMModel-list. |
.conv_criterion |
Character string. The criterion to use for the convergence check. One of: "diff_absolute", "diff_squared", or "diff_relative". Defaults to "diff_absolute". |
.iter_max |
Integer. The maximum number of iterations allowed.
If |
.starting_values |
A named list of vectors where the
list names are the construct names whose indicator weights the user
wishes to set. The vectors must be named vectors of |
.tolerance |
Double. The tolerance criterion for convergence.
Defaults to |
Details
If there are only constructs modeled as common factors,
calling csem() with .approach_weights = "GSCA" will automatically call
calculateWeightsGSCAm() unless .disattenuate = FALSE.
GSCAm currently only works for pure common factor models. The reason is that the implementation
in cSEM is based on the appendix of Hwang et al. (2017).
Following the appendix, GSCAm fails if at least one construct is
modeled as a composite, because calculating weight estimates with GSCAm leads to a product
involving the measurement matrix. This matrix does not have full rank
if a construct modeled as a composite is present:
it has a zero row for every construct
that is a pure composite (i.e., all related loadings are zero),
so multiplying it by its transpose yields a non-invertible matrix.
Value
A list with the elements
$W
A (J x K) matrix of estimated weights.
$C
The (J x K) matrix of estimated loadings.
$B
The (J x J) matrix of estimated path coefficients.
$E
NULL
$Modes
A named vector of Modes used for the outer estimation, for GSCA the mode is automatically set to 'gsca'.
$Conv_status
The convergence status. TRUE if the algorithm has converged and FALSE otherwise.
$Iterations
The number of iterations required.
References
Hwang H, Takane Y, Jung K (2017). “Generalized structured component analysis with uniqueness terms for accommodating measurement error.” Frontiers in Psychology, 8(2137), 1–12.
Calculate composite weights using GCCA
Description
Calculates composite weights according to one of the five criteria "SUMCORR", "MAXVAR", "SSQCORR", "MINVAR", and "GENVAR" suggested by Kettenring (1971).
Usage
calculateWeightsKettenring(
.S = args_default()$.S,
.csem_model = args_default()$.csem_model,
.approach_gcca = args_default()$.approach_gcca
)
Arguments
.S |
The (K x K) empirical indicator correlation matrix. |
.csem_model |
A (possibly incomplete) cSEMModel-list. |
.approach_gcca |
Character string. The Kettenring approach to use for GCCA. One of "SUMCORR", "MAXVAR", "SSQCORR", "MINVAR" or "GENVAR". Defaults to "SUMCORR". |
Value
A named list. J stands for the number of constructs and K for the number of indicators.
$W
A (J x K) matrix of estimated weights.
$E
NULL
$Modes
The GCCA mode used for the estimation.
$Conv_status
The convergence status. TRUE if the algorithm has converged and FALSE otherwise. For .approach_gcca = "MINVAR" or .approach_gcca = "MAXVAR" the convergence status is NULL since both are closed-form estimators.
$Iterations
The number of iterations required. 0 for .approach_gcca = "MINVAR" or .approach_gcca = "MAXVAR".
References
Kettenring JR (1971). “Canonical Analysis of Several Sets of Variables.” Biometrika, 58(3), 433–451.
Calculate composite weights using principal component analysis (PCA)
Description
Calculate weights for each block by extracting the first principal component of the indicator correlation matrix S_jj for each block, i.e., the weights are simply the first eigenvector of S_jj.
Usage
calculateWeightsPCA(
.S = args_default()$.S,
.csem_model = args_default()$.csem_model
)
Arguments
.S |
The (K x K) empirical indicator correlation matrix. |
.csem_model |
A (possibly incomplete) cSEMModel-list. |
Value
A named list. J stands for the number of constructs and K for the number of indicators.
$W
A (J x K) matrix of estimated weights.
$E
NULL
$Modes
The mode used. Always "PCA".
$Conv_status
NULL, as there are no iterations.
$Iterations
0, as there are no iterations.
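As a rough illustration of the idea that the PCA weights of a block are the first eigenvector of S_jj, the following base R sketch uses an invented correlation matrix. Any rescaling of the weights that the package may apply afterwards is not shown, and the sign convention is an assumption of this sketch.

```r
# Hypothetical (3 x 3) indicator correlation matrix S_jj of one block.
S_jj <- matrix(
  c(1.0, 0.6, 0.5,
    0.6, 1.0, 0.4,
    0.5, 0.4, 1.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("x1", "x2", "x3"), c("x1", "x2", "x3"))
)

# eigen() returns eigenvalues in decreasing order, so the first column of
# $vectors belongs to the largest eigenvalue.
eig <- eigen(S_jj, symmetric = TRUE)
w   <- eig$vectors[, 1]

# The sign of an eigenvector is arbitrary; flip it so that the weights sum to
# a positive number (a convention chosen for this sketch only).
if (sum(w) < 0) w <- -w
names(w) <- rownames(S_jj)

w
```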
Calculate composite weights using PLS-PM
Description
Calculate composite weights using the partial least squares path modeling (PLS-PM) algorithm (Wold 1975).
Usage
calculateWeightsPLS(
.data = args_default()$.data,
.S = args_default()$.S,
.csem_model = args_default()$.csem_model,
.conv_criterion = args_default()$.conv_criterion,
.iter_max = args_default()$.iter_max,
.PLS_ignore_structural_model = args_default()$.PLS_ignore_structural_model,
.PLS_modes = args_default()$.PLS_modes,
.PLS_weight_scheme_inner = args_default()$.PLS_weight_scheme_inner,
.starting_values = args_default()$.starting_values,
.tolerance = args_default()$.tolerance
)
Arguments
.data |
A |
.S |
The (K x K) empirical indicator correlation matrix. |
.csem_model |
A (possibly incomplete) cSEMModel-list. |
.conv_criterion |
Character string. The criterion to use for the convergence check. One of: "diff_absolute", "diff_squared", or "diff_relative". Defaults to "diff_absolute". |
.iter_max |
Integer. The maximum number of iterations allowed.
If |
.PLS_ignore_structural_model |
Logical. Should the structural model be ignored
when calculating the inner weights of the PLS-PM algorithm? Defaults to |
.PLS_modes |
Either a named list specifying the mode that should be used for
each construct in the form |
.PLS_weight_scheme_inner |
Character string. The inner weighting scheme
used by PLS-PM. One of: "centroid", "factorial", or "path".
Defaults to "path". Ignored if |
.starting_values |
A named list of vectors where the
list names are the construct names whose indicator weights the user
wishes to set. The vectors must be named vectors of |
.tolerance |
Double. The tolerance criterion for convergence.
Defaults to |
Value
A named list. J stands for the number of constructs and K for the number of indicators.
$W
A (J x K) matrix of estimated weights.
$E
A (J x J) matrix of inner weights.
$Modes
A named vector of modes used for the outer estimation.
$Conv_status
The convergence status. TRUE if the algorithm has converged and FALSE otherwise. If one-step weights are used via .iter_max = 1 or a non-iterative procedure was used, the convergence status is set to NULL.
$Iterations
The number of iterations required.
References
Wold H (1975). “Path models with latent variables: The NIPALS approach.” In Blalock HM, Aganbegian A, Borodkin FM, Boudon R, Capecchi V (eds.), Quantitative Sociology, International Perspectives on Mathematical and Statistical Modeling, 307–357. Academic Press, New York.
Calculate composite weights using unit weights
Description
Calculate unit weights for all blocks, i.e., each indicator of a block is equally weighted.
Usage
calculateWeightsUnit(
.S = args_default()$.S,
.csem_model = args_default()$.csem_model,
.starting_values = args_default()$.starting_values
)
Arguments
.S |
The (K x K) empirical indicator correlation matrix. |
.csem_model |
A (possibly incomplete) cSEMModel-list. |
.starting_values |
A named list of vectors where the
list names are the construct names whose indicator weights the user
wishes to set. The vectors must be named vectors of |
Value
A named list. J stands for the number of constructs and K for the number of indicators.
$W
A (J x K) matrix of estimated weights.
$E
NULL
$Modes
The mode used. Always "unit".
$Conv_status
NULL, as there are no iterations.
$Iterations
0, as there are no iterations.
Calculate Cohen's f^2
Description
Calculate the effect size for regression analysis (Cohen 1992) known as Cohen's f^2.
Usage
calculatef2(.object = NULL)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
Value
A matrix with as many rows as there are structural equations. The number of columns is equal to the total number of right-hand side variables of these equations.
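For orientation, Cohen's f^2 of one explanatory variable compares the R^2 of an equation with and without that variable: f^2 = (R^2_included - R^2_excluded) / (1 - R^2_included). The base R sketch below applies this to simulated scores for a single equation; the variable names are invented and the code is not the package implementation.

```r
set.seed(42)

# Simulated construct scores for one hypothetical equation: eta3 ~ eta1 + eta2.
n    <- 300
eta1 <- rnorm(n)
eta2 <- rnorm(n)
eta3 <- 0.5 * eta1 + 0.3 * eta2 + rnorm(n)
scores <- data.frame(eta1, eta2, eta3)

# Helper returning the R^2 of an OLS regression on the simulated scores.
r2 <- function(formula) summary(lm(formula, data = scores))$r.squared

r2_full <- r2(eta3 ~ eta1 + eta2)

# f^2 for each right-hand side variable: drop it, refit, compare R^2 values.
f2 <- c(
  eta1 = (r2_full - r2(eta3 ~ eta2)) / (1 - r2_full),
  eta2 = (r2_full - r2(eta3 ~ eta1)) / (1 - r2_full)
)

f2
```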
References
Cohen J (1992). “A power primer.” Psychological Bulletin, 112(1), 155–159.
See Also
Internal: Check convergence
Description
Check convergence of an algorithm using one of the following criteria:
diff_absolute
Checks if the largest elementwise absolute difference between two matrices .W_new and .W_old is smaller than a given tolerance.
diff_squared
Checks if the largest elementwise squared difference between two matrices .W_new and .W_old is smaller than a given tolerance.
diff_relative
Checks if the largest elementwise absolute rate of change ((new - old) / new) for two matrices .W_new and .W_old is smaller than a given tolerance.
Usage
checkConvergence(
.W_new = args_default()$.W_new,
.W_old = args_default()$.W_old,
.conv_criterion = args_default()$.conv_criterion,
.tolerance = args_default()$.tolerance
)
Arguments
.W_new |
A (J x K) matrix of weights. |
.W_old |
A (J x K) matrix of weights. |
.conv_criterion |
Character string. The criterion to use for the convergence check. One of: "diff_absolute", "diff_squared", or "diff_relative". Defaults to "diff_absolute". |
.tolerance |
Double. The tolerance criterion for convergence.
Defaults to |
Value
TRUE
if converged; FALSE
otherwise.
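The three criteria can be sketched as a small base R function. The function name check_conv_sketch() is hypothetical, and skipping zero entries in the relative criterion is an assumption of this sketch (weight matrices contain structural zeros, which would otherwise cause a division by zero); the package function may handle this differently.

```r
# Hypothetical helper illustrating the three criteria; not the package function.
check_conv_sketch <- function(W_new, W_old,
                              criterion = c("diff_absolute", "diff_squared",
                                            "diff_relative"),
                              tolerance = 1e-05) {
  criterion <- match.arg(criterion)
  d <- switch(criterion,
    diff_absolute = max(abs(W_new - W_old)),
    diff_squared  = max((W_new - W_old)^2),
    diff_relative = {
      # Assumption: only non-zero new weights enter the rate of change.
      nz <- W_new != 0
      max(abs((W_new[nz] - W_old[nz]) / W_new[nz]))
    }
  )
  d < tolerance
}

# Two (2 x 4) weight matrices that differ in a single element.
W_old <- matrix(c(0.5, 0.5, 0, 0,
                  0,   0,   0.7, 0.3), nrow = 2, byrow = TRUE)
W_new <- W_old
W_new[1, 1] <- 0.5 + 1e-07

check_conv_sketch(W_new, W_old, "diff_absolute")
check_conv_sketch(W_new, W_old, "diff_relative")
```

Because the squared criterion squares differences that are usually far below one, it is the most lenient of the three for a given tolerance.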
Internal: Check whether two indicators belong to the same construct.
Description
Checks whether two indicators belong to the same construct.
Usage
check_connection(
.indicator1,
.indicator2,
.model_measurement,
.model_error_cor
)
Arguments
.indicator1 |
Character string. The name of the indicator 1. |
.indicator2 |
Character string. The name of the indicator 2. |
.model_measurement |
Matrix. The measurement matrix indicating the relationship between constructs and indicators. |
.model_error_cor |
Matrix. The matrix indicates the error correlation structure. |
Value
TRUE if both indicators belong to the same construct, FALSE otherwise.
Internal: Classify structural model terms by type
Description
Classify terms of the structural model according to their type.
Usage
classifyConstructs(.terms = args_default()$.terms)
Arguments
.terms |
A vector of construct names to be classified. |
Details
Classification is required to estimate nonlinear structural relationships. Currently the following terms are supported:
- Single, e.g., eta1
- Quadratic, e.g., eta1.eta1
- Cubic, e.g., eta1.eta1.eta1
- Two-way interaction, e.g., eta1.eta2
- Three-way interaction, e.g., eta1.eta2.eta3
- Quadratic and two-way interaction, e.g., eta1.eta1.eta3
Note that exponential terms are modeled as "interactions with itself",
e.g., eta1^3 = eta1.eta1.eta1.
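The dot notation makes it straightforward to break a term into its components. The following base R sketch counts how often each construct appears in a term; the helper name classify_term_sketch() is hypothetical and the output only mimics two of the columns of the documented return value.

```r
# Hypothetical helper: split a term at the dots and count each component.
classify_term_sketch <- function(term) {
  components <- strsplit(term, ".", fixed = TRUE)[[1]]
  freq <- table(components)
  data.frame(
    Component        = names(freq),
    Component_freq   = as.integer(freq),
    stringsAsFactors = FALSE
  )
}

# "Quadratic and two-way interaction": eta1 appears twice, eta3 once.
res <- classify_term_sketch("eta1.eta1.eta3")
res
```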
Value
A named list of length equal to the number of terms provided containing a data frame with columns "Term_class", "Component", "Component_type", and "Component_freq".
Internal: Clean a node name.
Description
Removes a trailing "_temp" from a node name.
Usage
cleanNode(node)
Arguments
node |
A node name. |
Value
A cleaned node name.
Internal: Convert second order cSEMModel
Description
Uses a cSEMModel containing second order constructs and turns it into an estimable model using either the "2stage" approach or the "mixed" approach.
Usage
convertModel(
.csem_model = NULL,
.approach_2ndorder = "2stage",
.stage = "first"
)
Arguments
.csem_model |
A (possibly incomplete) cSEMModel-list. |
.approach_2ndorder |
Character string. Approach used for models containing second-order constructs. One of: "2stage", or "mixed". Defaults to "2stage". |
.stage |
Character string. The stage the model is needed for. One of "first" or "second". Defaults to "first". |
Value
A cSEMModel list that may be passed to any function requiring
.csem_model as a mandatory argument.
Composite-based SEM
Description
Usage
csem(
.data = NULL,
.model = NULL,
.approach_2ndorder = c("2stage", "mixed"),
.approach_cor_robust = c("none", "mcd", "spearman"),
.approach_nl = c("sequential", "replace"),
.approach_paths = c("OLS", "2SLS"),
.approach_weights = c("PLS-PM", "SUMCORR", "MAXVAR", "SSQCORR",
"MINVAR", "GENVAR","GSCA", "PCA",
"unit", "bartlett", "regression"),
.conv_criterion = c("diff_absolute", "diff_squared", "diff_relative"),
.disattenuate = TRUE,
.dominant_indicators = NULL,
.estimate_structural = TRUE,
.id = NULL,
.instruments = NULL,
.iter_max = 100,
.normality = FALSE,
.PLS_approach_cf = c("dist_squared_euclid", "dist_euclid_weighted",
"fisher_transformed", "mean_arithmetic",
"mean_geometric", "mean_harmonic",
"geo_of_harmonic"),
.PLS_ignore_structural_model = FALSE,
.PLS_modes = NULL,
.PLS_weight_scheme_inner = c("path", "centroid", "factorial"),
.reliabilities = NULL,
.starting_values = NULL,
.resample_method = c("none", "bootstrap", "jackknife"),
.resample_method2 = c("none", "bootstrap", "jackknife"),
.R = 499,
.R2 = 199,
.handle_inadmissibles = c("drop", "ignore", "replace"),
.user_funs = NULL,
.eval_plan = c("sequential", "multicore", "multisession"),
.seed = NULL,
.sign_change_option = c("none", "individual", "individual_reestimate",
"construct_reestimate"),
.tolerance = 1e-05
)
Arguments
.data |
A |
.model |
A model in lavaan model syntax or a cSEMModel list. |
.approach_2ndorder |
Character string. Approach used for models containing second-order constructs. One of: "2stage", or "mixed". Defaults to "2stage". |
.approach_cor_robust |
Character string. Approach used to obtain a robust
indicator correlation matrix. One of: "none" in which case the standard
Bravais-Pearson correlation is used,
"spearman" for the Spearman rank correlation, or
"mcd" via |
.approach_nl |
Character string. Approach used to estimate nonlinear structural relationships. One of: "sequential" or "replace". Defaults to "sequential". |
.approach_paths |
Character string. Approach used to estimate the
structural coefficients. One of: "OLS" or "2SLS". If "2SLS", instruments
need to be supplied to |
.approach_weights |
Character string. Approach used to obtain composite weights. One of: "PLS-PM", "SUMCORR", "MAXVAR", "SSQCORR", "MINVAR", "GENVAR", "GSCA", "PCA", "unit", "bartlett", or "regression". Defaults to "PLS-PM". |
.conv_criterion |
Character string. The criterion to use for the convergence check. One of: "diff_absolute", "diff_squared", or "diff_relative". Defaults to "diff_absolute". |
.disattenuate |
Logical. Should composite/proxy correlations
be disattenuated to yield consistent loadings and path estimates if at least
one of the constructs is modeled as a common factor? Defaults to |
.dominant_indicators |
A character vector of |
.estimate_structural |
Logical. Should the structural coefficients
be estimated? Defaults to |
.id |
Character string or integer. A character string giving the name or
an integer of the position of the column of |
.instruments |
A named list of vectors of instruments. The names
of the list elements are the names of the dependent (LHS) constructs of the structural
equation whose explanatory variables are endogenous. The vectors
contain the names of the instruments corresponding to each equation. Note
that exogenous variables of a given equation must be supplied as
instruments for themselves. Defaults to |
.iter_max |
Integer. The maximum number of iterations allowed.
If |
.normality |
Logical. Should joint normality of
|
.PLS_approach_cf |
Character string. Approach used to obtain the correction
factors for PLSc. One of: "dist_squared_euclid", "dist_euclid_weighted",
"fisher_transformed", "mean_arithmetic", "mean_geometric", "mean_harmonic",
"geo_of_harmonic". Defaults to "dist_squared_euclid".
Ignored if |
.PLS_ignore_structural_model |
Logical. Should the structural model be ignored
when calculating the inner weights of the PLS-PM algorithm? Defaults to |
.PLS_modes |
Either a named list specifying the mode that should be used for
each construct in the form |
.PLS_weight_scheme_inner |
Character string. The inner weighting scheme
used by PLS-PM. One of: "centroid", "factorial", or "path".
Defaults to "path". Ignored if |
.reliabilities |
A character vector of |
.starting_values |
A named list of vectors where the
list names are the construct names whose indicator weights the user
wishes to set. The vectors must be named vectors of |
.resample_method |
Character string. The resampling method to use. One of: "none", "bootstrap" or "jackknife". Defaults to "none". |
.resample_method2 |
Character string. The resampling method to use when resampling
from a resample. One of: "none", "bootstrap" or "jackknife". For
"bootstrap" the number of draws is provided via |
.R |
Integer. The number of bootstrap replications. Defaults to |
.R2 |
Integer. The number of bootstrap replications to use when
resampling from a resample. Defaults to |
.handle_inadmissibles |
Character string. How should inadmissible results
be treated? One of "drop", "ignore", or "replace". If "drop", all
replications/resamples yielding an inadmissible result will be dropped
(i.e. the number of results returned will potentially be less than |
.user_funs |
A function or a (named) list of functions to apply to every
resample. The functions must take |
.eval_plan |
Character string. The evaluation plan to use. One of "sequential", "multicore", or "multisession". In the two latter cases all available cores will be used. Defaults to "sequential". |
.seed |
Integer or |
.sign_change_option |
Character string. Which sign change option should be used to handle flipping signs when resampling? One of "none","individual", "individual_reestimate", "construct_reestimate". Defaults to "none". |
.tolerance |
Double. The tolerance criterion for convergence.
Defaults to |
Details
Estimate linear, nonlinear, hierarchical and multigroup structural equation models using a composite-based approach. In cSEM any method or approach that involves linear compounds (scores/proxies/composites) of observables (indicators/items/manifest variables) is defined as composite-based. See the Get started section of the cSEM website for a general introduction to composite-based SEM and cSEM.
csem() estimates linear, nonlinear, hierarchical or multigroup structural
equation models using a composite-based approach.
Data and model:
The .data and .model arguments are required. .data must be
a matrix or a data.frame with column names matching
the indicator names used in the model description. Alternatively,
a list of data sets (matrices or data frames) may be provided,
in which case estimation is repeated for each data set.
Possible column types/classes of the data provided are: "logical",
"numeric" ("double" or "integer"), "factor" ("ordered" and/or "unordered"),
"character", or a mix of several types. Character columns will be treated
as (unordered) factors.
Depending on the type/class of the indicator data provided cSEM computes the indicator
correlation matrix in different ways. See calculateIndicatorCor() for details.
In the current version .data must not contain missing values. Future versions
are likely to handle missing values as well.
To provide a model use the lavaan model syntax.
Note, however, that cSEM currently only supports the "standard" lavaan
model syntax (Types 1, 2, 3, and 7 as described on the help page).
Therefore, other specifications, e.g., thresholds or scaling factors, are ignored.
Alternatively, a standardized (possibly incomplete) cSEMModel-list may be supplied.
See parseModel() for details.
Weights and path coefficients:
By default weights are estimated using the partial least squares path modeling
algorithm ("PLS-PM").
A range of alternative weighting algorithms may be supplied to
.approach_weights. Currently, the following approaches are implemented:
- (Default) Partial least squares path modeling ("PLS-PM"). The algorithm can be customized. See calculateWeightsPLS() for details.
- Generalized structured component analysis ("GSCA") and generalized structured component analysis with uniqueness terms (GSCAm). The algorithms can be customized. See calculateWeightsGSCA() and calculateWeightsGSCAm() for details. Note that GSCAm is called indirectly when the model contains constructs modeled as common factors only and .disattenuate = TRUE. See below.
- Generalized canonical correlation analysis (GCCA), including "SUMCORR", "MAXVAR", "SSQCORR", "MINVAR", "GENVAR".
- Principal component analysis ("PCA")
- Factor score regression using sum scores ("unit"), regression ("regression") or Bartlett scores ("bartlett")
It is possible to supply starting values for the weighting algorithm
via .starting_values. The argument accepts a named list of vectors where the
list names are the construct names whose indicator weights the user
wishes to set. The vectors must be named vectors of "indicator_name" = value
pairs, where value is the starting weight. See the examples section below for details.
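A list of starting values in the format described above might look as follows. The construct names (eta1, eta2), the indicator names (y11, ...) and the weight values are invented for illustration.

```r
# Hypothetical construct and indicator names; the values are arbitrary
# starting weights, one named vector per construct.
starting_values <- list(
  eta1 = c(y11 = 0.5, y12 = 0.3, y13 = 0.2),
  eta2 = c(y21 = 1.0, y22 = 1.0)
)

# Inspect the structure: a named list of named numeric vectors.
str(starting_values)
```

Such a list would then be handed to the .starting_values argument of csem(); constructs that are not listed are left to the algorithm's own initialization.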
Composite-indicator and composite-composite correlations are properly disattenuated by default to yield consistent loadings, construct correlations, and path coefficients if any of the concepts are modeled as a common factor.
For PLS-PM disattenuation is done using PLSc (Dijkstra and Henseler 2015).
For GSCA disattenuation is done implicitly by using GSCAm (Hwang et al. 2017).
Weights obtained by GCCA, unit, regression, bartlett or PCA are
disattenuated using Croon's approach (Croon 2002).
Disattenuation may be suppressed by setting .disattenuate = FALSE.
Note, however, that quantities in this case are inconsistent
estimates for their construct level counterparts if any of the constructs in
the structural model are modeled as a common factor!
By default path coefficients are estimated using ordinary least squares (.approach_paths = "OLS").
For linear models, two-stage least squares ("2SLS") is available, however, only if
instruments are internal, i.e., part of the structural model. Future versions
will add support for external instruments if possible. Instruments must be supplied to
.instruments as a named list where the names
of the list elements are the names of the dependent constructs of the structural
equations whose explanatory variables are believed to be endogenous.
The list consists of vectors of names of instruments corresponding to each equation.
Note that exogenous variables of a given equation must be supplied as
instruments for themselves.
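The shape of the .instruments list can be illustrated with a small, purely hypothetical structural model. It is assumed here that eta2 is the explanatory variable believed to be endogenous in the equation of eta3, that eta1 is exogenous in that equation, and that eta4 is another construct of the structural model serving as an internal instrument.

```r
# Hypothetical structural model (construct names invented).
model_sketch <- "
eta2 ~ eta4
eta3 ~ eta1 + eta2
"

# One list element per equation, named after its dependent construct.
# eta1 is exogenous in the eta3 equation and therefore instruments itself;
# eta4 serves as the instrument for the endogenous eta2 (assumption).
instruments <- list(
  eta3 = c("eta1", "eta4")
)

str(instruments)
```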
If reliabilities are known they can be supplied as "name" = value pairs to
.reliabilities, where value is a numeric value between 0 and 1.
Currently, this is only supported for "PLS-PM".
Nonlinear models:
If the model contains nonlinear terms csem() estimates a polynomial structural equation model
using a non-iterative method of moments approach described in
Dijkstra and Schermelleh-Engel (2014). Nonlinear terms include interactions and
exponential terms. The latter is described in model syntax as an
"interaction with itself", e.g., xi^3 = xi.xi.xi. Currently only exponential
terms up to a power of three (e.g., three-way interactions or cubic terms) are allowed:
- Single, e.g., eta1
- Quadratic, e.g., eta1.eta1
- Cubic, e.g., eta1.eta1.eta1
- Two-way interaction, e.g., eta1.eta2
- Three-way interaction, e.g., eta1.eta2.eta3
- Quadratic and two-way interaction, e.g., eta1.eta1.eta3
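A model with nonlinear terms could be written as in the sketch below. The construct and indicator names are invented; the second part merely counts the dot-separated components of each right-hand side term to check the "power of three" limit mentioned above and is not part of the package.

```r
# Hypothetical model with a two-way interaction and a quadratic term.
model_nl <- "
# Structural model
eta3 ~ eta1 + eta2 + eta1.eta2 + eta1.eta1

# Measurement model (indicator names are invented)
eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"

# Order of each right-hand side term = number of dot-separated components.
rhs        <- c("eta1", "eta2", "eta1.eta2", "eta1.eta1")
term_order <- lengths(strsplit(rhs, ".", fixed = TRUE))
names(term_order) <- rhs

# According to the text, terms above order three are not allowed.
all(term_order <= 3)
```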
The current version of the package allows two kinds of estimation:
estimation of the reduced form equation (.approach_nl = "replace") and
sequential estimation (.approach_nl = "sequential", the default). The latter does not
allow for multivariate normality of all exogenous variables, i.e.,
the latent variables and the error terms.
Distributional assumptions are kept to a minimum (an i.i.d. sample from a population with finite moments for the relevant order); for higher order models, that go beyond interaction, we work in this version with the assumption that as far as the relevant moments are concerned certain combinations of measurement errors behave as if they were Gaussian. For details see: Dijkstra and Schermelleh-Engel (2014).
Models containing second-order constructs
Second-order constructs are specified using the operators =~ and <~. These
operators are usually used with indicators on their right-hand side. For
second-order constructs the right-hand side variables are constructs instead.
If c1 and c2 are constructs forming or measuring a higher-order
construct, a model would look like this:
my_model <- "
# Structural model
SAT ~ QUAL
VAL ~ SAT

# Measurement/composite model
QUAL =~ qual1 + qual2
SAT =~ sat1 + sat2

c1 =~ x11 + x12
c2 =~ x21 + x22

# Second-order construct (in this case a second-order composite built by common
# factors)
VAL <~ c1 + c2
"
Currently, two approaches are explicitly implemented:
- (Default) "2stage". The (disjoint) two-stage approach as proposed by Agarwal and Karahanna (2000). Note that by default a correction for attenuation is applied if common factors are involved in modeling second-order constructs. For instance, the three-stage approach proposed by Van Riel et al. (2017) is applied in case of a second-order construct specified as a composite of common factors. On the other hand, if no common factors are involved the two-stage approach is applied as proposed by Schuberth et al. (2020).
- "mixed". The mixed repeated indicators/two-stage approach as proposed by Ringle et al. (2012).
The repeated indicators approach as proposed by Joereskog and Wold (1982)
and the extension proposed by Becker et al. (2012) are
not directly implemented as they simply require a respecification of the model.
In the above example the repeated indicators approach
would require changing the model and appending the repeated indicators to
the data supplied to .data. Note that the indicators need to be renamed in this case as
csem() does not allow for one indicator to be attached to multiple constructs.
my_model <- "
# Structural model
SAT ~ QUAL
VAL ~ SAT
VAL ~ c1 + c2
# Measurement/composite model
QUAL =~ qual1 + qual2
SAT =~ sat1 + sat2
VAL =~ x11_temp + x12_temp + x21_temp + x22_temp
c1 =~ x11 + x12
c2 =~ x21 + x22
"
According to the extended approach indirect effects of QUAL
on VAL
via c1
and c2
would have to be specified as well.
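Appending renamed copies of the repeated indicators to the data can be done in base R. The sketch below uses a small simulated data frame; the column names mirror the example model and the "_temp" suffix is only the naming convention used above, not a requirement of the package.

```r
# Toy data with the four first-order indicators (simulated, for illustration)
set.seed(1)
dat <- as.data.frame(
  matrix(rnorm(40), nrow = 10,
         dimnames = list(NULL, c("x11", "x12", "x21", "x22")))
)

# Indicators that are to be attached to the second-order construct as well
repeated <- c("x11", "x12", "x21", "x22")

# Copy the columns and rename the copies so that no indicator name is
# attached to more than one construct
copies <- dat[, repeated, drop = FALSE]
colnames(copies) <- paste0(repeated, "_temp")

# Data set to supply to .data together with the respecified model
dat_RI <- cbind(dat, copies)
colnames(dat_RI)
```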
Multigroup analysis
To perform a multigroup analysis provide either a list of data sets or one
data set containing a group-identifier-column whose column
name must be provided to .id
. Values of this column are taken as levels of a
factor and are interpreted as group
identifiers. csem()
will split the data by levels of that column and run
the estimation for each level separately. Note, the more levels
the group-identifier-column has, the more estimation runs are required.
This can considerably slow down estimation, especially if resampling is
requested. For the latter it will generally be faster to use
.eval_plan = "multisession"
or .eval_plan = "multicore"
.
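Conceptually, supplying a group-identifier column is similar to splitting the data by that column and estimating the model once per subset. The base R sketch below only illustrates this splitting step on simulated data; it is not the package's internal code, and the column name "group" is a hypothetical choice.

```r
# Toy data with a group-identifier column (simulated, for illustration)
set.seed(1)
dat <- data.frame(
  y1    = rnorm(6),
  y2    = rnorm(6),
  group = rep(c("a", "b", "c"), each = 2)
)

# One data set per level of the identifier column; the identifier itself
# is dropped from each subset
indicator_cols <- setdiff(colnames(dat), "group")
groups <- split(dat[, indicator_cols], dat$group)

# Three levels imply three separate estimation runs
names(groups)
sapply(groups, nrow)
```

With the package loaded, the corresponding call would pass the column name via .id, e.g. csem(.data = dat, .model = model, .id = "group"), where model is a model string matching the indicators in dat.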
Inference:
Inference is done via resampling. See resamplecSEMResults()
and infer()
for details.
Value
An object of class cSEMResults
with methods for all postestimation generics.
Technically, a call to csem()
results in an object with at least
two class attributes. The first class attribute is always cSEMResults
.
The second is one of cSEMResults_default
, cSEMResults_multi
, or
cSEMResults_2ndorder
and depends on the estimated model and/or the type of
data provided to the .model
and .data
arguments. The third class attribute
cSEMResults_resampled
is only added if resampling was conducted.
For details see the cSEMResults help file.
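The class attributes can be inspected with base R tools. A minimal sketch, assuming the threecommonfactors data set used in the Examples section of this help page:

```r
library(cSEM)

model <- "
# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2

# (Reflective) measurement model
eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"

res <- csem(threecommonfactors, model)

# All class attributes of the result
class(res)

# Single data set, no second-order construct, no resampling requested
inherits(res, "cSEMResults_default")
inherits(res, "cSEMResults_resampled")
```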
Postestimation
assess()
Assess results using common quality criteria, e.g., reliability, fit measures, HTMT, R2 etc.
infer()
Calculate common inferential quantities, e.g., standard errors, confidence intervals.
plot()
Creates a plot of the model. For the help file see plot.cSEMResults_default().
predict()
Predict endogenous indicator scores and compute common prediction metrics.
summarize()
Summarize the results. Mainly called for its side effect, the print method.
verify()
Verify/Check admissibility of the estimates.
Tests are performed using the test-family of functions. Currently the following tests are implemented:
testCVPAT()
Cross-validated predictive ability test proposed by Liengaard et al. (2021)
testOMF()
Bootstrap-based test for overall model fit based on Beran and Srivastava (1985).
testMICOM()
Permutation-based test for measurement invariance of composites proposed by Henseler et al. (2016).
testMGD()
Several (mainly) permutation-based tests for multi-group comparisons.
testHausman()
Regression-based Hausman test to test for endogeneity.
Other miscellaneous postestimation functions belong to the do-family of functions. Currently three do functions are implemented:
doIPMA()
Performs an importance-performance matrix analysis (IPMA).
doNonlinearEffectsAnalysis()
Perform a nonlinear effects analysis as described in e.g., Spiller et al. (2013)
doRedundancyAnalysis()
Perform a redundancy analysis (RA) as proposed by Hair et al. (2016) with reference to Chin (1998)
References
Agarwal R, Karahanna E (2000).
“Time Flies When You're Having Fun: Cognitive Absorption and Beliefs about Information Technology Usage.”
MIS Quarterly, 24(4), 665.
Becker J, Klein K, Wetzels M (2012).
“Hierarchical Latent Variable Models in PLS-SEM: Guidelines for Using Reflective-Formative Type Models.”
Long Range Planning, 45(5-6), 359–394.
doi:10.1016/j.lrp.2012.10.001.
Beran R, Srivastava MS (1985).
“Bootstrap Tests and Confidence Regions for Functions of a Covariance Matrix.”
The Annals of Statistics, 13(1), 95–115.
doi:10.1214/aos/1176346579.
Chin WW (1998).
“Modern Methods for Business Research.”
In Marcoulides GA (ed.), chapter The Partial Least Squares Approach to Structural Equation Modeling, 295–358.
Mahwah, NJ: Lawrence Erlbaum.
Croon MA (2002).
“Using predicted latent scores in general latent structure models.”
In Marcoulides GA, Moustaki I (eds.), Latent Variable and Latent Structure Models, chapter 10, 195–224.
Lawrence Erlbaum.
ISBN 080584046X, Pagination: 288.
Dijkstra TK, Henseler J (2015).
“Consistent and Asymptotically Normal PLS Estimators for Linear Structural Equations.”
Computational Statistics & Data Analysis, 81, 10–23.
Dijkstra TK, Schermelleh-Engel K (2014).
“Consistent Partial Least Squares For Nonlinear Structural Equation Models.”
Psychometrika, 79(4), 585–604.
Hair JF, Hult GTM, Ringle C, Sarstedt M (2016).
A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM).
Sage publications.
Henseler J, Ringle CM, Sarstedt M (2016).
“Testing Measurement Invariance of Composites Using Partial Least Squares.”
International Marketing Review, 33(3), 405–431.
doi:10.1108/imr-09-2014-0304.
Hwang H, Takane Y, Jung K (2017).
“Generalized structured component analysis with uniqueness terms for accommodating measurement error.”
Frontiers in Psychology, 8(2137), 1–12.
Joereskog KG, Wold HO (1982).
Systems under Indirect Observation: Causality, Structure, Prediction - Part II, volume 139.
North Holland.
Liengaard BD, Sharma PN, Hult GTM, Jensen MB, Sarstedt M, Hair JF, Ringle CM (2021).
“Prediction: Coveted, Yet Forsaken? Introducing a Cross-Validated Predictive Ability Test in Partial Least Squares Path Modeling.”
Decision Sciences, 52(2), 362–392.
Ringle CM, Sarstedt M, Straub D (2012).
“A Critical Look at the Use of PLS-SEM in MIS Quarterly.”
MIS Quarterly, 36(1), iii–xiv.
Schuberth F, Rademaker ME, Henseler J (2020).
“Estimating and assessing second-order constructs using PLS-PM: the case of composites of composites.”
Industrial Management & Data Systems, 120(12), 2211-2241.
doi:10.1108/imds-12-2019-0642.
Spiller SA, Fitzsimons GJ, Lynch JG, Mcclelland GH (2013).
“Spotlights, Floodlights, and the Magic Number Zero: Simple Effects Tests in Moderated Regression.”
Journal of Marketing Research, 50(2), 277–288.
doi:10.1509/jmr.12.0420.
Van Riel ACR, Henseler J, Kemeny I, Sasovova Z (2017).
“Estimating hierarchical constructs using Partial Least Squares: The case of second order composites of factors.”
Industrial Management & Data Systems, 117(3), 459–477.
doi:10.1108/IMDS-07-2016-0286.
See Also
args_default()
, cSEMArguments, cSEMResults, foreman()
, resamplecSEMResults()
,
assess()
, infer()
, plot.cSEMResults_default()
, predict()
, summarize()
, verify()
, testCVPAT()
, testOMF()
,
testMGD()
, testMICOM()
, testHausman()
Examples
# ===========================================================================
# Basic usage
# ===========================================================================
### Linear model ------------------------------------------------------------
# Most basic usage requires a dataset and a model. We use the
# `threecommonfactors` dataset.
## Take a look at the dataset
#?threecommonfactors
## Specify the (correct) model
model <- "
# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2
# (Reflective) measurement model
eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"
## Estimate
res <- csem(threecommonfactors, model)
## Postestimation
verify(res)
summarize(res)
assess(res)
# Notes:
# 1. By default no inferential quantities (e.g. Std. errors, p-values, or
# confidence intervals) are calculated. Use resampling to obtain
# inferential quantities. See "Resampling" in the "Extended usage"
# section below.
# 2. `summarize()` prints the full output by default. For a more condensed
# output use:
print(summarize(res), .full_output = FALSE)
## Dealing with endogeneity -------------------------------------------------
# See: ?testHausman()
### Models containing second-order constructs -------------------------------
## Take a look at the dataset
#?dgp_2ndorder_cf_of_c
model <- "
# Path model / Regressions
c4 ~ eta1
eta2 ~ eta1 + c4
# Reflective measurement model
c1 <~ y11 + y12
c2 <~ y21 + y22 + y23 + y24
c3 <~ y31 + y32 + y33 + y34 + y35 + y36 + y37 + y38
eta1 =~ y41 + y42 + y43
eta2 =~ y51 + y52 + y53
# Composite model (second order)
c4 =~ c1 + c2 + c3
"
res_2stage <- csem(dgp_2ndorder_cf_of_c, model, .approach_2ndorder = "2stage")
res_mixed <- csem(dgp_2ndorder_cf_of_c, model, .approach_2ndorder = "mixed")
# The standard repeated indicators approach is done by 1.) respecifying the model
# and 2.) adding the repeated indicators to the data set
# 1.) Respecify the model
model_RI <- "
# Path model / Regressions
c4 ~ eta1
eta2 ~ eta1 + c4
c4 ~ c1 + c2 + c3
# Reflective measurement model
c1 <~ y11 + y12
c2 <~ y21 + y22 + y23 + y24
c3 <~ y31 + y32 + y33 + y34 + y35 + y36 + y37 + y38
eta1 =~ y41 + y42 + y43
eta2 =~ y51 + y52 + y53
# c4 is a common factor measured by composites
c4 =~ y11_temp + y12_temp + y21_temp + y22_temp + y23_temp + y24_temp +
y31_temp + y32_temp + y33_temp + y34_temp + y35_temp + y36_temp +
y37_temp + y38_temp
"
# 2.) Update data set
data_RI <- dgp_2ndorder_cf_of_c
coln <- c(colnames(data_RI), paste0(colnames(data_RI), "_temp"))
data_RI <- data_RI[, c(1:ncol(data_RI), 1:ncol(data_RI))]
colnames(data_RI) <- coln
# Estimate
res_RI <- csem(data_RI, model_RI)
summarize(res_RI)
### Multigroup analysis -----------------------------------------------------
# See: ?testMGD()
# ===========================================================================
# Extended usage
# ===========================================================================
# `csem()` provides defaults for all arguments except `.data` and `.model`.
# Below some common options/tasks that users are likely to be interested in.
# We use the threecommonfactors data set again:
model <- "
# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2
# (Reflective) measurement model
eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"
### PLS vs PLSc and disattenuation
# In the model all concepts are modeled as common factors. If
# .approach_weights = "PLS-PM", csem() uses PLSc to disattenuate composite-indicator
# and composite-composite correlations.
res_plsc <- csem(threecommonfactors, model, .approach_weights = "PLS-PM")
res$Information$Model$construct_type # all common factors
# To obtain "original" (inconsistent) PLS estimates use `.disattenuate = FALSE`
res_pls <- csem(threecommonfactors, model,
.approach_weights = "PLS-PM",
.disattenuate = FALSE
)
s_plsc <- summarize(res_plsc)
s_pls <- summarize(res_pls)
# Compare
data.frame(
"Path" = s_plsc$Estimates$Path_estimates$Name,
"Pop_value" = c(0.6, 0.4, 0.35), # see ?threecommonfactors
"PLSc" = s_plsc$Estimates$Path_estimates$Estimate,
"PLS" = s_pls$Estimates$Path_estimates$Estimate
)
### Resampling --------------------------------------------------------------
## Not run:
## Basic resampling
res_boot <- csem(threecommonfactors, model, .resample_method = "bootstrap")
res_jack <- csem(threecommonfactors, model, .resample_method = "jackknife")
# See ?resamplecSEMResults for more examples
### Choosing a different weighting scheme -----------------------------------
res_gscam <- csem(threecommonfactors, model, .approach_weights = "GSCA")
res_gsca <- csem(threecommonfactors, model,
.approach_weights = "GSCA",
.disattenuate = FALSE
)
s_gscam <- summarize(res_gscam)
s_gsca <- summarize(res_gsca)
# Compare
data.frame(
"Path" = s_gscam$Estimates$Path_estimates$Name,
"Pop_value" = c(0.6, 0.4, 0.35), # see ?threecommonfactors
"GSCAm" = s_gscam$Estimates$Path_estimates$Estimate,
"GSCA" = s_gsca$Estimates$Path_estimates$Estimate
)
## End(Not run)
### Fine-tuning a weighting scheme ------------------------------------------
## Setting starting values
sv <- list("eta1" = c("y12" = 10, "y13" = 4, "y11" = 1))
res <- csem(threecommonfactors, model, .starting_values = sv)
## Choosing a different inner weighting scheme
#?args_csem_dotdotdot
res <- csem(threecommonfactors, model, .PLS_weight_scheme_inner = "factorial",
.PLS_ignore_structural_model = TRUE)
## Choosing different modes for PLS
# By default, concepts modeled as common factors use PLS Mode A weights.
modes <- list("eta1" = "unit", "eta2" = "modeB", "eta3" = "unit")
res <- csem(threecommonfactors, model, .PLS_modes = modes)
summarize(res)
cSEMArguments
Description
An alphabetical list of all arguments used by functions of the cSEM
package
including their description and defaults.
Mainly used for internal purposes (parameter inheritance). To list all arguments
and their defaults, use args_default()
. To list all arguments and
their possible choices, use args_default(.choices = TRUE)
.
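A short sketch of how the two calls might be used. It assumes that args_default() returns a named list whose names are the argument names including the leading dot; this return structure is an assumption, not something stated in this section.

```r
library(cSEM)

# All arguments with their default values
defaults <- args_default()
defaults$.approach_weights

# All arguments with their possible choices
choices <- args_default(.choices = TRUE)
choices$.approach_nl
```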
Arguments
.alpha |
An integer or a numeric vector of significance levels.
Defaults to |
.absolute |
Logical. Should the absolute HTMT values be returned?
Defaults to |
.approach_gcca |
Character string. The Kettenring approach to use for GCCA. One of "SUMCORR", "MAXVAR", "SSQCORR", "MINVAR" or "GENVAR". Defaults to "SUMCORR". |
.approach_2ndorder |
Character string. Approach used for models containing second-order constructs. One of: "2stage", or "mixed". Defaults to "2stage". |
.approach_alpha_adjust |
Character string. Approach used to adjust the significance level to accommodate multiple testing. One of "none" or "bonferroni". Defaults to "none". |
.approach_cor_robust |
Character string. Approach used to obtain a robust
indicator correlation matrix. One of: "none" in which case the standard
Bravais-Pearson correlation is used,
"spearman" for the Spearman rank correlation, or
"mcd" via |
.approach_mgd |
Character string or a vector of character strings. Approach used for the multi-group comparison. One of: "all", "Klesel", "Chin", "Sarstedt", "Keil", "Nitzl", "Henseler", "CI_para", or "CI_overlap". Defaults to "all" in which case all approaches are computed (if possible). |
.approach_nl |
Character string. Approach used to estimate nonlinear structural relationships. One of: "sequential" or "replace". Defaults to "sequential". |
.approach_predict |
Character string. Which approach should be used to perform predictions? One of "earliest" and "direct". If "earliest" predictions for indicators associated to endogenous constructs are performed using only indicators associated to exogenous constructs. If "direct", predictions for indicators associated to endogenous constructs are based on indicators associated to their direct antecedents. Defaults to "earliest". |
.approach_p_adjust |
Character string or a vector of character strings.
Approach used to adjust the p-value for multiple testing.
See the |
.approach_paths |
Character string. Approach used to estimate the
structural coefficients. One of: "OLS" or "2SLS". If "2SLS", instruments
need to be supplied to |
.approach_score_benchmark |
Character string. How should the aggregation
of the estimates of the truncated normal distribution be done for the
benchmark predictions? Ignored if not OrdPLS or OrdPLSc is used to obtain benchmark predictions.
One of "mean", "median", "mode" or "round".
If "round", the benchmark predictions are obtained using the traditional prediction
algorithm for PLS-PM which are rounded for categorical indicators.
If "mean", the mean of the estimated endogenous indicators is calculated.
If "median", the median of the estimated endogenous indicators is calculated.
If "mode", the maximum empirical density on the intervals defined by the thresholds
is used.
If |
.approach_score_target |
Character string. How should the aggregation of the estimates of the truncated normal distribution for the predictions using OrdPLS/OrdPLSc be done? One of "mean", "median" or "mode". If "mean", the mean of the estimated endogenous indicators is calculated. If "median", the median of the estimated endogenous indicators is calculated. If "mode", the maximum empirical density on the intervals defined by the thresholds is used. Defaults to "mean". |
.approach_weights |
Character string. Approach used to obtain composite weights. One of: "PLS-PM", "SUMCORR", "MAXVAR", "SSQCORR", "MINVAR", "GENVAR", "GSCA", "PCA", "unit", "bartlett", or "regression". Defaults to "PLS-PM". |
.args_used |
A list of function argument names whose value was modified by the user. |
.attrbutes |
Character string. Variables used as attributes in IPMA. |
.benchmark |
Character string. The procedure to obtain benchmark predictions. One of "lm", "unit", "PLS-PM", "GSCA", "PCA", "MAXVAR", or "NA". Defaults to "lm". |
.bias_corrected |
Logical. Should the standard and the tStat
confidence interval be bias-corrected using the bootstrapped bias estimate?
If |
.by_equation |
Should the criteria be computed for each structural model
equation separately? Defaults to |
.C |
A (J x J) composite variance-covariance matrix. |
.check_errors |
Logical. Should the model to parse be checked for correctness
in a sense that all necessary components to estimate the model are given?
Defaults to |
.choices |
Logical. Should candidate values for the arguments be returned?
Defaults to |
.ci |
A vector of character strings naming the confidence interval to compute.
For possible choices see |
.ci_colnames |
Internal argument used by several print helper functions. |
.closed_form_ci |
Logical. Should a closed-form confidence interval be computed?
Defaults to |
.conv_criterion |
Character string. The criterion to use for the convergence check. One of: "diff_absolute", "diff_squared", or "diff_relative". Defaults to "diff_absolute". |
.csem_model |
A (possibly incomplete) cSEMModel-list. |
.csem_resample |
A list resulting from a call to |
.cv_folds |
Integer. The number of cross-validation folds to use. Setting
|
.data |
A |
.dependent |
Character string. The name of the dependent variable. |
.disattenuate |
Logical. Should composite/proxy correlations
be disattenuated to yield consistent loadings and path estimates if at least
one of the construct is modeled as a common factor? Defaults to |
.dist |
Character string. The distribution to use for the critical value. One of "t" for Student's t-distribution or "z" for the standard normal distribution. Defaults to "z". |
.distance |
Character string. A distance measure. One of: "geodesic" or "squared_euclidean". Defaults to "geodesic". |
.df |
Character string. The method for obtaining the degrees of freedom. Choices are "type1" and "type2". Defaults to "type1" . |
.dominant_indicators |
A character vector of |
.E |
A (J x J) matrix of inner weights. |
.effect |
Internal argument used by helper printEffects(). |
.estimate_structural |
Logical. Should the structural coefficients
be estimated? Defaults to |
.eval_plan |
Character string. The evaluation plan to use. One of "sequential", "multicore", or "multisession". In the two latter cases all available cores will be used. Defaults to "sequential". |
.filename |
Character string. The file name. |
.first_resample |
A list containing the |
.fit_measures |
Logical. (EXPERIMENTAL) Should additional fit measures
be included? Defaults to |
.force |
Logical. Should .object be resampled even if it contains resamples
already? Defaults to |
.full_output |
Logical. Should the full output of summarize be printed.
Defaults to |
.graph_attrs |
Character string. Additional attributes that should be passed to the DiagrammeR syntax, e.g., c("rankdir=LR", "ranksep=1.0"). Defaults to c("rankdir=LR"). |
.H |
The (N x J) matrix of construct scores. |
.handle_inadmissibles |
Character string. How should inadmissible results
be treated? One of "drop", "ignore", or "replace". If "drop", all
replications/resamples yielding an inadmissible result will be dropped
(i.e. the number of results returned will potentially be less than |
.id |
Character string or integer. A character string giving the name or
an integer of the position of the column of |
.inference |
Logical. Should critical values be computed? Defaults to |
.independent |
Character string. The name of the independent variable. |
.instruments |
A named list of vectors of instruments. The names
of the list elements are the names of the dependent (LHS) constructs of the structural
equation whose explanatory variables are endogenous. The vectors
contain the names of the instruments corresponding to each equation. Note
that exogenous variables of a given equation must be supplied as
instruments for themselves. Defaults to |
.iter_max |
Integer. The maximum number of iterations allowed.
If |
.level |
Character. Used in |
.matrix1 |
A |
.matrix2 |
A |
.matrices |
A list of at least two matrices. |
.metrics |
Character string or a vector of character strings. Which prediction metrics should be displayed? One of: "MAE", "RMSE", "Q2", "MER", "MAPE", "MSE2", "U1", "U2", "UM", "UR", or "UD". Defaults to c("MAE", "RMSE", "Q2"). |
.model |
A model in lavaan model syntax or a cSEMModel list. |
.moderator |
Character string. The name of the moderator variable. |
.modes |
A vector giving the mode for each construct in the form |
.ms_criterion |
Character string. Either a single character string or a vector
of character strings naming the model selection criterion to compute.
Defaults to |
.n |
Integer. The number of observations of the original data. |
.n_steps |
Integer. A value giving the number of steps (the spotlights, i.e.,
values of .moderator in surface analysis or floodlight analysis)
between the minimum and maximum value of the moderator. Defaults to |
.normality |
Logical. Should joint normality of
|
.nr_comparisons |
Integer. The number of comparisons. Defaults to |
.null_model |
Logical. Should the degrees of freedom for the null model
be computed? Defaults to |
.object |
An R object of class cSEMResults resulting from a call to |
.object1 |
An R object of class cSEMResults resulting from a call to |
.object2 |
An R object of class cSEMResults resulting from a call to |
.only_common_factors |
Logical. Should only concepts modeled as common
factors be included when calculating one of the following quality criteria:
AVE, the Fornell-Larcker criterion, HTMT, and all reliability estimates.
Defaults to |
.only_structural |
Should the log-likelihood be based on the
structural model? Ignored if |
.original_arguments |
The list of arguments used within |
.output_type |
Character string. The type of output to return. One of "complete" or "structured". See the Value section for details. Defaults to "complete". |
.P |
A (J x J) construct variance-covariance matrix (possibly disattenuated). |
.parameters_to_compare |
A model in lavaan model syntax indicating which
parameters (i.e, path ( |
.path |
Character string. Path of the directory to save the file to. Defaults
to |
.path_coefficients |
List. A list that contains the resampled and the original
path coefficient estimates. Typically a part of a |
.PLS_approach_cf |
Character string. Approach used to obtain the correction
factors for PLSc. One of: "dist_squared_euclid", "dist_euclid_weighted",
"fisher_transformed", "mean_arithmetic", "mean_geometric", "mean_harmonic",
"geo_of_harmonic". Defaults to "dist_squared_euclid".
Ignored if |
.plot_correlations |
Character string. Specify which correlations should be plotted, i.e.,
between the exogenous constructs ( |
.plot_labels |
Logical. Whether to display edge labels. Defaults to TRUE. |
.plot_package |
Character string. Indicates which packages should be used for plotting. |
.plot_significances |
Logical. Should p-values in the form of stars be plotted? Defaults to |
.plot_structural_model_only |
Logical. Should only the structural model,
i.e., the constructs and their relationships be plotted? Defaults to |
.plot_type |
Character string. Indicates the type of plot that is produced. |
.PLS_ignore_structural_model |
Logical. Should the structural model be ignored
when calculating the inner weights of the PLS-PM algorithm? Defaults to |
.PLS_modes |
Either a named list specifying the mode that should be used for
each construct in the form |
.PLS_weight_scheme_inner |
Character string. The inner weighting scheme
used by PLS-PM. One of: "centroid", "factorial", or "path".
Defaults to "path". Ignored if |
.probs |
A vector of probabilities. |
.postestimation_object |
An object resulting from a call to one of cSEM's
postestimation functions (e.g. |
.quality_criterion |
Character string. A single character string or a vector of character strings naming the quality criterion to compute. See the Details section for a list of possible candidates. Defaults to "all" in which case all possible quality criteria are computed. |
.quantity |
Character string. Which statistic should be returned? One of "all", "mean", "sd", "bias", "CI_standard_z", "CI_standard_t", "CI_percentile", "CI_basic", "CI_bc", "CI_bca", "CI_t_interval". Defaults to "all" in which case all quantities that do not require additional resampling are returned, i.e., all quantities but "CI_bca", "CI_t_interval". |
.Q |
A vector of composite-construct correlations with element names equal to the names of the J construct names used in the measurement model. Note Q^2 is also called the reliability coefficient. |
.reliabilities |
A character vector of |
.resample_method |
Character string. The resampling method to use. One of: "none", "bootstrap" or "jackknife". Defaults to "none". |
.resample_method2 |
Character string. The resampling method to use when resampling
from a resample. One of: "none", "bootstrap" or "jackknife". For
"bootstrap" the number of draws is provided via |
`.resample_object` |
An R object of class |
.resample_sarstedt |
A matrix containing the parameter estimates that could potentially be compared and an id column indicating the group adherence of each row. |
.r |
Integer. The number of repetitions to use. Defaults to |
.R |
Integer. The number of bootstrap replications. Defaults to |
.R2 |
Integer. The number of bootstrap replications to use when
resampling from a resample. Defaults to |
.R_bootstrap |
Integer. The number of bootstrap runs. Ignored if |
.R_permutation |
Integer. The number of permutations. Defaults to |
.S |
The (K x K) empirical indicator correlation matrix. |
.saturated |
Logical. Should a saturated structural model be used?
Defaults to |
.second_resample |
A list containing |
.seed |
Integer or |
.sign_change_option |
Character string. Which sign change option should be used to handle flipping signs when resampling? One of "none", "individual", "individual_reestimate", "construct_reestimate". Defaults to "none". |
.sim_points |
Integer. How many samples from the truncated normal distribution should be simulated to estimate the exogenous construct scores? Defaults to "100". |
.stage |
Character string. The stage the model is needed for. One of "first" or "second". Defaults to "first". |
.standardized |
Logical. Should standardized scores be returned? Defaults
to |
.starting_values |
A named list of vectors where the
list names are the construct names whose indicator weights the user
wishes to set. The vectors must be named vectors of |
.steps_mod |
A numeric vector. Steps used for the moderator variable in calculating
the simple effects of an independent variable on the dependent variable.
Defaults to |
.terms |
A vector of construct names to be classified. |
.test_data |
A matrix of test data with the same column names as the training data. |
.testtype |
Character string. One of "twosided" (H1: The models do not perform equally in predicting indicators belonging to endogenous constructs) and "onesided" (H1: Model 1 performs better in predicting indicators belonging |
.title |
Character string. Title of an object. Defaults to "". |
.tolerance |
Double. The tolerance criterion for convergence.
Defaults to |
.treat_as_continuous |
Logical. Should the indicators for the benchmark predictions
be treated as continuous? If |
.type_gfi |
Character string. Which fitting function should the GFI be based on? One of "ML" for the maximum likelihood fitting function, "GLS" for the generalized least squares fitting function or "ULS" for the unweighted least squares fitting function (same as the squared Euclidean distance). Defaults to "ML". |
.type_ci |
Character string. Which confidence interval should be calculated?
For possible choices, see the |
.type_htmt |
Character string indicating the type of HTMT that should be calculated, i.e., the original HTMT ("htmt") or the HTMT2 ("htmt2"). Defaults to "htmt" |
.type_vcv |
Character string. Which model-implied correlation matrix should be calculated? One of "indicator" or "construct". Defaults to "indicator". |
.verbose |
Logical. Should information (e.g., progress bar) be printed
to the console? Defaults to |
.user_funs |
A function or a (named) list of functions to apply to every
resample. The functions must take |
.value_independent |
Integer. Only required for floodlight analysis; The value of the independent variable in case that it appears as a higher-order term. |
.values_moderator |
A numeric vector. The values of the moderator in a
simple effects analysis. Typically these are differences from the mean (=0)
measured in standard deviations. Defaults to |
.vcv_asymptotic |
Logical. Should the asymptotic variance-covariance matrix be used, i.e.,
VCV(b0) - VCV(b1)= VCV(b1-b0), or should VCV(b1-b0) be computed directly?
Defaults to |
.vector1 |
A vector of numeric values. |
.vector2 |
A vector of numeric values. |
.W |
A (J x K) matrix of weights. |
.what |
Internal argument used by several print helper functions. |
.W_new |
A (J x K) matrix of weights. |
.W_old |
A (J x K) matrix of weights. |
.weighted |
Logical. Should estimation be based on a score that uses
the weights of the weight approach used to obtain |
.X |
A matrix of processed data (scaled, cleaned and ordered). |
.X_cleaned |
A data.frame of processed data (cleaned and ordered). Note: |
cSEMModel
Description
cSEMModel
Details
A standardized list containing model-related information. To convert a
model written in lavaan model syntax
to a cSEMModel list use parseModel().
Value
An object of class cSEMModel is a standardized list containing the following components. J stands for the number of constructs and K for the number of indicators.
$structural
A matrix mimicking the structural relationship between constructs. If constructs are only linearly related, structural is of dimension (J x J) with row- and column names equal to the construct names. If the structural model contains nonlinear relationships structural is (J x (J + J*)) where J* is the number of nonlinear terms. Rows are ordered such that exogenous constructs are always first, followed by constructs that only depend on exogenous constructs and/or previously ordered constructs.
$measurement
A (J x K) matrix mimicking the measurement/composite relationship between constructs and their related indicators. Rows are in the same order as the matrix $structural with row names equal to the construct names. The order of the columns is such that $measurement forms a block diagonal matrix.
$error_cor
A (K x K) matrix mimicking the measurement error correlation relationship. The row and column order is identical to the column order of $measurement.
$cor_specified
A matrix indicating the correlation relationships between any variables of the model as specified by the user. Mainly for internal purposes. Note that $cor_specified may also contain inadmissible correlations such as a correlation between measurement errors indicators and constructs.
$construct_type
A named vector containing the names of each construct and their respective type ("Common factor" or "Composite").
$construct_order
A named vector containing the names of each construct and their respective order ("First order" or "Second order").
$model_type
The type of model ("Linear" or "Nonlinear").
$instruments
Only if instruments are supplied: a list of structural equations relating endogenous RHS variables to instruments.
$indicators
The names of the indicators (i.e., observed variables and/or first-order constructs)
$cons_exo
The names of the exogenous constructs of the structural model (i.e., variables that do not appear on the LHS of any structural equation)
$cons_endo
The names of the endogenous constructs of the structural model (i.e., variables that appear on the LHS of at least one structural equation)
$vars_2nd
The names of the constructs modeled as second-order constructs.
$vars_attached_to_2nd
The names of the constructs forming or building a second order construct.
$vars_not_attached_to_2nd
The names of the constructs not forming or building a second order construct.
It is possible to supply an incomplete list to parseModel(), resulting in an incomplete cSEMModel list which can be passed to all functions that require .csem_model as a mandatory argument. Currently, only the structural and the measurement matrix are required. However, specifying an incomplete cSEMModel list may lead to unexpected behavior and errors. Use with care.
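As a rough illustration of the two required components, the following base-R sketch builds a structural and a measurement matrix by hand. The model (three constructs eta1 to eta3 with two indicators each), all names, and the 0/1 coding are assumptions made for illustration; in practice parseModel() produces these matrices.

```r
# Hypothetical model: eta2 ~ eta1; eta3 ~ eta1 + eta2; two indicators per construct.
constructs <- c("eta1", "eta2", "eta3")
indicators <- paste0("y", 1:6)

# (J x J) structural matrix. Assumed coding: a 1 in row i, column j means
# construct j appears on the RHS of the equation for construct i.
structural <- matrix(0, nrow = 3, ncol = 3,
                     dimnames = list(constructs, constructs))
structural["eta2", "eta1"] <- 1
structural["eta3", c("eta1", "eta2")] <- 1

# (J x K) measurement matrix; columns are ordered so that the
# nonzero entries form a block diagonal pattern.
measurement <- matrix(0, nrow = 3, ncol = 6,
                      dimnames = list(constructs, indicators))
measurement["eta1", c("y1", "y2")] <- 1
measurement["eta2", c("y3", "y4")] <- 1
measurement["eta3", c("y5", "y6")] <- 1

# An incomplete list holding only the two required elements.
incomplete_model <- list(structural = structural, measurement = measurement)
```

Note that the exogenous construct (eta1) comes first and has an all-zero row, in line with the row ordering described for $structural.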
See Also
cSEMResults
Description
A call to csem() results in an object with at least two class attributes. The first class attribute is always cSEMResults no matter the type of data or model provided. The second is one of cSEMResults_default, cSEMResults_multi, or cSEMResults_2ndorder and depends on the estimated model and/or the type of data provided to the .model and .data arguments of csem(). The third class attribute cSEMResults_resampled is only added if resampling was conducted.
Details
Depending on the type of data and/or model provided, three different output types exist.
- _default
This will be the structure for the vast majority of applications. If the data is a single matrix or data.frame with no id-column, the result is a list with elements:
$Estimates
A list containing a list of estimated quantities.
$Information
A list containing a list of additional information.
The resulting object has classes cSEMResults and cSEMResults_default.
- _multi
If the data provided is a single matrix or data.frame containing an id-column to split the data by G group levels or if a list of G datasets is provided, the resulting object is a list of G lists, where G is equal to the number of groups or the number of datasets in the list of datasets provided. Each of the G list elements is itself a cSEMResults_default object. Hence its structure is identical to the structure described in _default.
The resulting object has classes cSEMResults and cSEMResults_multi.
- _2ndorder
A special output is generated if the model to estimate contains hierarchical constructs and the "2stage" or "mixed" approach is used to estimate the model. In this case the resulting object is a list containing two elements First_stage and Second_stage. Each list element is itself a cSEMResults_default object. Hence its structure is identical to the structure described in _default.
If .resample_method = "bootstrap" or .resample_method = "jackknife", resamples are attached to each object. For objects of class cSEMResults_default the resamples are attached to .object$Estimates$Estimates_resample. For objects of class cSEMResults_multi the same is done by group. For objects of class cSEMResults_2ndorder the resamples are attached to .object$Second_stage$Information$Resamples. All objects containing these elements gain the cSEMResults_resampled class.
cSEMSummarize
Description
cSEMSummarize
Value
An object of class cSEMSummary. Technically cSEMSummary is a named list containing the following list elements:
- '...
Not finished yet.
cSEMTest
Description
cSEMTest
Value
A standardized list of class cSEMTest. Technically cSEMTest is a named list containing the following list elements:
$Test_statistic
The value of the test statistic(s).
$Critical_value
The critical value(s).
$Decision
The test decision. One of: Reject or Do not reject
$Number_admissibles
The number of admissible runs. See verify() for what constitutes an inadmissible run.
Data: Second order common factor of composites
Description
A dataset containing 500 standardized observations on 19 indicators generated from a population model with 6 concepts, three of which (c1-c3) are composites forming a second order common factor (c4). The remaining two (eta1, eta2) are concepts modeled as common factors.
Usage
dgp_2ndorder_cf_of_c
Format
A matrix with 500 rows and 19 variables:
- y11-y12
Indicators attached to c1. Population weights are: 0.8; 0.4. Population loadings are: 0.925; 0.65
- y21-y24
Indicators attached to c2. Population weights are: 0.5; 0.3; 0.2; 0.4. Population loadings are: 0.804; 0.68; 0.554; 0.708
- y31-y38
Indicators attached to c3. Population weights are: 0.3; 0.3; 0.1; 0.1; 0.2; 0.3; 0.4; 0.2. Population loadings are: 0.496; 0.61; 0.535; 0.391; 0.391; 0.6; 0.5285; 0.53
- y41-y43
Indicators attached to eta1. Population loadings are: 0.8; 0.7; 0.7
- y51-y53
Indicators attached to eta2. Population loadings are: 0.8; 0.8; 0.7
The model is:
`c4` = gamma1 * `eta1` + zeta1
`eta2` = gamma2 * `eta1` + beta * `c4` + zeta2
with population values gamma1 = 0.6, gamma2 = 0.4 and beta = 0.35.
The second order common factor is
`c4` = lambdac1 * `c1` + lambdac2 * `c2` + lambdac3 * `c3` + epsilon
Calculate difference between S and Sigma_hat
Description
Calculate the difference between the empirical (S) and the model-implied indicator variance-covariance matrix (Sigma_hat) using different distance measures.
Usage
calculateDG(
.object = NULL,
.matrix1 = NULL,
.matrix2 = NULL,
.saturated = FALSE,
...
)
calculateDL(
.object = NULL,
.matrix1 = NULL,
.matrix2 = NULL,
.saturated = FALSE,
...
)
calculateDML(
.object = NULL,
.matrix1 = NULL,
.matrix2 = NULL,
.saturated = FALSE,
...
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.matrix1 |
A |
.matrix2 |
A |
.saturated |
Logical. Should a saturated structural model be used?
Defaults to |
... |
Ignored. |
Details
The distances may also be computed for any two matrices A and B by supplying A and B directly via the .matrix1 and .matrix2 arguments. If A and B are supplied, .object is ignored.
Value
A single numeric value giving the distance between two matrices.
Functions
- calculateDG(): The geodesic distance (dG).
- calculateDL(): The squared Euclidean distance.
- calculateDML(): The distance measure (fit function) used by ML.
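To make the three distances concrete, here is a minimal base-R sketch using common textbook definitions. The helper names d_L, d_G and d_ML and the two example matrices are hypothetical, and the scaling (the factor 0.5, natural logarithms) is an assumption; the package functions may scale differently.

```r
# Two small positive definite matrices standing in for S and Sigma_hat.
S <- matrix(c(1.0, 0.5, 0.3,
              0.5, 1.0, 0.4,
              0.3, 0.4, 1.0), nrow = 3, byrow = TRUE)
Sigma_hat <- matrix(c(1.00, 0.45, 0.35,
                      0.45, 1.00, 0.40,
                      0.35, 0.40, 1.00), nrow = 3, byrow = TRUE)

# Squared Euclidean distance: half the sum of squared elementwise differences.
d_L <- function(A, B) 0.5 * sum((A - B)^2)

# Geodesic distance: half the sum of squared log eigenvalues of A^{-1} B.
d_G <- function(A, B) {
  ev <- Re(eigen(solve(A) %*% B, only.values = TRUE)$values)
  0.5 * sum(log(ev)^2)
}

# ML fit function: log|B| - log|A| + tr(A B^{-1}) - K.
d_ML <- function(A, B) {
  K <- nrow(A)
  as.numeric(determinant(B)$modulus - determinant(A)$modulus) +
    sum(diag(A %*% solve(B))) - K
}

c(dL = d_L(S, Sigma_hat), dG = d_G(S, Sigma_hat), dML = d_ML(S, Sigma_hat))
```

All three measures are zero when the two matrices coincide and grow as the matrices diverge.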
Do an importance-performance matrix analysis
Description
Usage
doIPMA(.object)
Arguments
.object |
A |
Details
Performs an importance-performance matrix analysis (IPMA).
To calculate the performance and importance, the weights of the indicators are unstandardized using the standard deviation of the original indicators but normed to have a length of 1. Normed construct scores are calculated based on the original indicators and the unstandardized weights.
The performance is calculated as the mean of
the original indicators or the unstandardized construct scores, respectively.
The importance is calculated as the unstandardized total effect if
.level == "construct"
and as the normed weight times the unstandardized
total effect if .level == "indicator"
. The literature recommends basing the estimation passed to doIPMA() on normed indicators, e.g., by scaling all indicators to a range from 0 to 100; see, e.g., Henseler (2021); Ringle and Sarstedt (2016).
Note, indicators are not normed internally, as theoretical maximum/minimum can differ from the empirical maximum/minimum which would lead to an incorrect normalization.
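The following base-R sketch shows one way to norm an indicator to the 0 to 100 range before estimation, and why theoretical rather than empirical bounds matter. The helper rescale_0_100 and the 1-to-7 answer scale are assumptions for illustration, not part of the package.

```r
# Hypothetical helper: linear rescaling to [0, 100] given scale bounds.
rescale_0_100 <- function(x, lower, upper) {
  (x - lower) / (upper - lower) * 100
}

# Indicator measured on an assumed 1-to-7 scale; the observed values
# only cover 2 to 6.
y <- c(2, 4, 6)

# Using the theoretical bounds of the scale.
y_theoretical <- rescale_0_100(y, lower = 1, upper = 7)

# Using the empirical bounds stretches the data to the full range,
# which misrepresents the performance.
y_empirical <- rescale_0_100(y, lower = min(y), upper = max(y))

rbind(theoretical = y_theoretical, empirical = y_empirical)
```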
Value
A list of class cSEMIPMA with a corresponding method for plot(). See: plot.cSEMIPMA().
See Also
csem(), cSEMResults, plot.cSEMIPMA()
Do a nonlinear effects analysis
Description
Usage
doNonlinearEffectsAnalysis(
.object = NULL,
.dependent = NULL,
.independent = NULL,
.moderator = NULL,
.n_steps = 100,
.values_moderator = c(-2, -1, 0, 1, 2),
.value_independent = 0,
.alpha = 0.05
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.dependent |
Character string. The name of the dependent variable. |
.independent |
Character string. The name of the independent variable. |
.moderator |
Character string. The name of the moderator variable. |
.n_steps |
Integer. A value giving the number of steps (the spotlights, i.e.,
values of .moderator in surface analysis or floodlight analysis)
between the minimum and maximum value of the moderator. Defaults to |
.values_moderator |
A numeric vector. The values of the moderator in the simple effects analysis. Typically these are differences from the mean (= 0) measured in standard deviations. Defaults to |
.value_independent |
Integer. Only required for floodlight analysis; The value of the independent variable in case that it appears as a higher-order term. |
.alpha |
An integer or a numeric vector of significance levels.
Defaults to |
Details
Calculate the expected value of the dependent variable conditional on the values of an independent variable and a moderator variable. All other variables in the model are assumed to be zero, i.e., they are fixed at their mean levels. Moreover, it produces the input for the floodlight analysis.
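The idea behind the simple effects calculation can be sketched in base R for a model with a single interaction term. The coefficient values and the function expected_y are hypothetical; this is an illustration of the conditional expectation, not the package's internal code.

```r
# Hypothetical path coefficients (assumed values, not estimates).
b_x  <- 0.30   # independent variable
b_m  <- 0.20   # moderator
b_xm <- 0.10   # interaction of independent variable and moderator

# Expected value of the dependent variable with all other variables at zero.
expected_y <- function(x, m) b_x * x + b_m * m + b_xm * x * m

x_grid   <- seq(-2, 2, length.out = 5)   # values of the independent variable
m_values <- c(-2, -1, 0, 1, 2)           # moderator levels (in standard deviations)

# Rows: values of the independent variable; columns: moderator levels.
simple_effects <- outer(x_grid, m_values, expected_y)
dimnames(simple_effects) <- list(as.character(x_grid), as.character(m_values))
simple_effects
```

Each column traces one line of a simple effects plot: the slope with respect to the independent variable changes with the moderator level.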
Value
A list of class cSEMNonlinearEffects with a corresponding method for plot(). See: plot.cSEMNonlinearEffects().
See Also
csem(), cSEMResults, plot.cSEMNonlinearEffects()
Examples
## Not run:
model_Int <- "
# Measurement models
INV =~ INV1 + INV2 + INV3 + INV4
SAT =~ SAT1 + SAT2 + SAT3
INT =~ INT1 + INT2
# Structural model containing an interaction term.
INT ~ INV + SAT + INV.SAT
"
# Estimate model
out <- csem(.data = Switching, .model = model_Int,
            # ADANCO settings
            .PLS_weight_scheme_inner = 'factorial',
            .tolerance = 1e-06,
            .resample_method = 'bootstrap'
)
# Do nonlinear effects analysis
neffects <- doNonlinearEffectsAnalysis(out,
                                       .dependent = 'INT',
                                       .moderator = 'INV',
                                       .independent = 'SAT')
# Get an overview
neffects
# Simple effects plot
plot(neffects, .plot_type = 'simpleeffects')
# Surface plot using plotly
plot(neffects, .plot_type = 'surface', .plot_package = 'plotly')
# Surface plot using persp
plot(neffects, .plot_type = 'surface', .plot_package = 'persp')
# Floodlight analysis
plot(neffects, .plot_type = 'floodlight')
## End(Not run)
Do a redundancy analysis
Description
Usage
doRedundancyAnalysis(.object = NULL)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
Details
Perform a redundancy analysis (RA) as proposed by Hair et al. (2016) with reference to Chin (1998).
RA is confined to PLS-PM, specifically PLS-PM with at least one construct whose weights are obtained by mode B. In cSEM this is the case if the construct is modeled as a composite or if argument .PLS_modes was explicitly set to mode B for at least one construct. Hence RA is only conducted if .approach_weights = "PLS-PM" and if at least one construct's mode is mode B.
The principal idea of RA is to take two different measures of the same construct and regress the scores obtained for each measure on each other. If they are similar they are likely to measure the same "thing" which is then taken as evidence that both measurement models actually measure what they are supposed to measure (validity).
There are several issues with the terminology and the reasoning behind this logic. RA is therefore only implemented since reviewers are likely to demand its computation, however, its actual application for validity assessment is discouraged.
Currently, the function is not applicable to models containing second-order constructs.
Value
A named numeric vector of correlations. If the weighting approach used to obtain .object is not "PLS-PM" or none of the PLS outer modes was mode B, the function silently returns NA.
References
Chin WW (1998). “The Partial Least Squares Approach to Structural Equation Modeling.” In Marcoulides GA (ed.), Modern Methods for Business Research, 295–358. Mahwah, NJ: Lawrence Erlbaum.
Hair JF, Hult GTM, Ringle C, Sarstedt M (2016). A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM). Sage publications.
See Also
Internal: Estimate the structural coefficients
Description
Estimates the coefficients of the structural model (nonlinear and linear) using OLS or 2SLS. The latter currently works for linear models only.
Usage
estimatePath(
.approach_nl = args_default()$.approach_nl,
.approach_paths = args_default()$.approach_paths,
.csem_model = args_default()$.csem_model,
.H = args_default()$.H,
.normality = args_default()$.normality,
.P = args_default()$.P,
.Q = args_default()$.Q
)
Arguments
.approach_nl |
Character string. Approach used to estimate nonlinear structural relationships. One of: "sequential" or "replace". Defaults to "sequential". |
.approach_paths |
Character string. Approach used to estimate the
structural coefficients. One of: "OLS" or "2SLS". If "2SLS", instruments
need to be supplied to |
.csem_model |
A (possibly incomplete) cSEMModel-list. |
.H |
The (N x J) matrix of construct scores. |
.normality |
Logical. Should joint normality of
|
.P |
A (J x J) construct variance-covariance matrix (possibly disattenuated). |
.Q |
A vector of composite-construct correlations with element names equal to the J construct names used in the measurement model. Note Q^2 is also called the reliability coefficient. |
Value
A named list containing the estimated structural coefficients, the R2, the adjusted R2, and the VIFs for each regression.
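For a single linear equation, the returned quantities can be illustrated directly from a construct correlation matrix. The sketch below is plain OLS on standardized variables with a hypothetical matrix P and a hypothetical equation eta3 ~ eta1 + eta2; it is not the package's implementation and ignores nonlinear terms and 2SLS.

```r
# Hypothetical (3 x 3) construct correlation matrix P.
nm <- c("eta1", "eta2", "eta3")
P <- matrix(c(1.0, 0.4, 0.5,
              0.4, 1.0, 0.6,
              0.5, 0.6, 1.0), nrow = 3, byrow = TRUE,
            dimnames = list(nm, nm))

# Equation: eta3 ~ eta1 + eta2
dep   <- "eta3"
indep <- c("eta1", "eta2")

# OLS on standardized variables: beta = P_xx^{-1} p_xy
beta <- solve(P[indep, indep], P[indep, dep])

# R2 = beta' p_xy; adjusted R2 for an assumed sample size n
n      <- 250
r2     <- sum(beta * P[indep, dep])
r2_adj <- 1 - (1 - r2) * (n - 1) / (n - length(indep) - 1)

# VIFs are the diagonal elements of P_xx^{-1}
vif <- diag(solve(P[indep, indep]))

list(beta = beta, R2 = r2, R2adj = r2_adj, VIF = vif)
```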
Export to Excel (.xlsx)
Description
Usage
exportToExcel(
.postestimation_object = NULL,
.filename = "results.xlsx",
.path = NULL
)
Arguments
.postestimation_object |
An object resulting from a call to one of cSEM's
postestimation functions (e.g. |
.filename |
Character string. The file name. |
.path |
Character string. Path of the directory to save the file to. Defaults
to |
Details
Export results from postestimation functions assess(), predict(), summarize() and testOMF() to an .xlsx (Excel) file. The function uses the openxlsx package, which does not depend on Java.
The function is deliberately kept simple: it takes all the relevant elements in .postestimation_object and writes them (worksheet by worksheet) into an .xlsx file named .filename in the directory given by .path (the current working directory by default).
If .postestimation_object has class attribute _2ndorder, two .xlsx files named ".filename_first_stage.xlsx" and ".filename_second_stage.xlsx" are created. If .postestimation_object is a list of appropriate objects, one file for each list element is created.
Note: rerunning exportToExcel() without changing .filename and .path overwrites the file!
See Also
assess(), summarize(), predict(), testOMF()
Internal: firstOrderMeasurementEdges
Description
Build measurement edges for a first-order model.
Usage
firstOrderMeasurementEdges(
construct,
weights,
loadings,
weight_p_values,
loading_p_values,
plot_signif,
plot_labels,
constructTypes
)
Arguments
plot_labels |
Logical. Whether to display edge labels. Defaults to TRUE. |
Value
Character string containing DOT code.
Model-implied indicator or construct variance-covariance matrix
Description
Calculate the model-implied indicator or construct variance-covariance (VCV) matrix. Currently only the model-implied VCV for recursive linear models is implemented (including models containing second order constructs).
Usage
fit(
.object = NULL,
.saturated = args_default()$.saturated,
.type_vcv = args_default()$.type_vcv
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.saturated |
Logical. Should a saturated structural model be used?
Defaults to |
.type_vcv |
Character string. Which model-implied correlation matrix should be calculated? One of "indicator" or "construct". Defaults to "indicator". |
Details
Notation is taken from Bollen (1989).
If .saturated = TRUE the model-implied variance-covariance matrix is calculated for a saturated structural model (i.e., the VCV of the constructs is replaced by their correlation matrix). Hence: V(eta) = WSW' (possibly disattenuated).
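As a simplified illustration of a model-implied indicator correlation matrix, the base-R sketch below assumes a pure common factor model with standardized indicators and uncorrelated measurement errors. The loadings in Lambda and the construct correlation matrix Phi are hypothetical values; composites and second-order constructs are not covered.

```r
# Hypothetical (J x K) loading matrix for two common factors.
Lambda <- matrix(0, nrow = 2, ncol = 6,
                 dimnames = list(c("eta1", "eta2"), paste0("y", 1:6)))
Lambda["eta1", 1:3] <- c(0.8, 0.7, 0.6)
Lambda["eta2", 4:6] <- c(0.9, 0.8, 0.7)

# Construct correlation matrix V(eta); for a saturated structural model
# this would simply be the (possibly disattenuated) construct correlations.
Phi <- matrix(c(1.0, 0.5,
                0.5, 1.0), nrow = 2,
              dimnames = list(c("eta1", "eta2"), c("eta1", "eta2")))

# Sigma_hat = Lambda' Phi Lambda + Theta, where Theta is chosen such that
# the diagonal equals one (standardized indicators).
Sigma_hat <- t(Lambda) %*% Phi %*% Lambda
diag(Sigma_hat) <- 1
round(Sigma_hat, 3)
```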
Value
Either a (K x K) matrix or a (J x J) matrix depending on .type_vcv.
References
Bollen KA (1989). Structural Equations with Latent Variables. Wiley-Interscience. ISBN 978-0471011712.
See Also
csem(), foreman(), cSEMResults
Model fit measures
Description
Calculate fit measures.
Usage
calculateChiSquare(.object, .saturated = FALSE)
calculateChiSquareDf(.object)
calculateCFI(.object)
calculateGFI(.object, .type_gfi = c("ML", "GLS", "ULS"), ...)
calculateCN(.object, .alpha = 0.05, ...)
calculateIFI(.object)
calculateNFI(.object)
calculateNNFI(.object)
calculateRMSEA(.object)
calculateRMSTheta(.object)
calculateSRMR(
.object = NULL,
.matrix1 = NULL,
.matrix2 = NULL,
.saturated = FALSE,
...
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.saturated |
Logical. Should a saturated structural model be used?
Defaults to |
.type_gfi |
Character string. Which fitting function should the GFI be based on? One of "ML" for the maximum likelihood fitting function, "GLS" for the generalized least squares fitting function or "ULS" for the unweighted least squares fitting function (same as the squared Euclidean distance). Defaults to "ML". |
... |
Ignored. |
.alpha |
An integer or a numeric vector of significance levels.
Defaults to |
.matrix1 |
A |
.matrix2 |
A |
Details
See the Fit indices section of the cSEM website for details on the implementation.
Value
A single numeric value.
Functions
- calculateChiSquare(): The chi-square statistic.
- calculateChiSquareDf(): The chi-square statistic divided by its degrees of freedom.
- calculateCFI(): The comparative fit index (CFI).
- calculateGFI(): The goodness of fit index (GFI).
- calculateCN(): The Hoelter index alias Hoelter's (critical) N (CN).
- calculateIFI(): The incremental fit index (IFI).
- calculateNFI(): The normed fit index (NFI).
- calculateNNFI(): The non-normed fit index (NNFI). Also called the Tucker-Lewis index (TLI).
- calculateRMSEA(): The root mean square error of approximation (RMSEA).
- calculateRMSTheta(): The root mean squared residual covariance matrix of the outer model residuals (RMS theta).
- calculateSRMR(): The standardized root mean square residual (SRMR).
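As an example of how such a measure is built from the empirical and the model-implied matrix, the following base-R sketch computes an SRMR-type quantity. It assumes both inputs are correlation matrices and averages the squared residuals over the lower triangle including the diagonal; the helper srmr_sketch and the example matrices are hypothetical, and the package's exact formula may differ.

```r
# Hypothetical empirical (R_emp) and model-implied (R_imp) correlation matrices.
R_emp <- matrix(c(1.0, 0.5, 0.3,
                  0.5, 1.0, 0.4,
                  0.3, 0.4, 1.0), nrow = 3, byrow = TRUE)
R_imp <- matrix(c(1.00, 0.45, 0.35,
                  0.45, 1.00, 0.40,
                  0.35, 0.40, 1.00), nrow = 3, byrow = TRUE)

srmr_sketch <- function(A, B) {
  # Residual correlations on and below the diagonal.
  res <- (A - B)[lower.tri(A, diag = TRUE)]
  # Root of the mean squared residual.
  sqrt(mean(res^2))
}

srmr_sketch(R_emp, R_imp)
```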
Internal: Composite-based SEM
Description
The central hub of the cSEM package. It acts like a foreman by collecting all (estimation) tasks, distributing them to lower level package functions, and eventually recollecting all of their results. It is called by csem() to manage the actual calculations. It may be called directly by the user, however, in most cases it will likely be more convenient to use csem() instead.
Usage
foreman(
.data = args_default()$.data,
.model = args_default()$.model,
.approach_cor_robust = args_default()$.approach_cor_robust,
.approach_nl = args_default()$.approach_nl,
.approach_paths = args_default()$.approach_paths,
.approach_weights = args_default()$.approach_weights,
.conv_criterion = args_default()$.conv_criterion,
.disattenuate = args_default()$.disattenuate,
.dominant_indicators = args_default()$.dominant_indicators,
.estimate_structural = args_default()$.estimate_structural,
.id = args_default()$.id,
.instruments = args_default()$.instruments,
.iter_max = args_default()$.iter_max,
.normality = args_default()$.normality,
.PLS_approach_cf = args_default()$.PLS_approach_cf,
.PLS_ignore_structural_model = args_default()$.PLS_ignore_structural_model,
.PLS_modes = args_default()$.PLS_modes,
.PLS_weight_scheme_inner = args_default()$.PLS_weight_scheme_inner,
.reliabilities = args_default()$.reliabilities,
.starting_values = args_default()$.starting_values,
.tolerance = args_default()$.tolerance
)
Arguments
.data |
A |
.model |
A model in lavaan model syntax or a cSEMModel list. |
.approach_cor_robust |
Character string. Approach used to obtain a robust
indicator correlation matrix. One of: "none" in which case the standard
Bravais-Pearson correlation is used,
"spearman" for the Spearman rank correlation, or
"mcd" via |
.approach_nl |
Character string. Approach used to estimate nonlinear structural relationships. One of: "sequential" or "replace". Defaults to "sequential". |
.approach_paths |
Character string. Approach used to estimate the
structural coefficients. One of: "OLS" or "2SLS". If "2SLS", instruments
need to be supplied to |
.approach_weights |
Character string. Approach used to obtain composite weights. One of: "PLS-PM", "SUMCORR", "MAXVAR", "SSQCORR", "MINVAR", "GENVAR", "GSCA", "PCA", "unit", "bartlett", or "regression". Defaults to "PLS-PM". |
.conv_criterion |
Character string. The criterion to use for the convergence check. One of: "diff_absolute", "diff_squared", or "diff_relative". Defaults to "diff_absolute". |
.disattenuate |
Logical. Should composite/proxy correlations be disattenuated to yield consistent loadings and path estimates if at least one of the constructs is modeled as a common factor? Defaults to |
.dominant_indicators |
A character vector of |
.estimate_structural |
Logical. Should the structural coefficients
be estimated? Defaults to |
.id |
Character string or integer. A character string giving the name or
an integer of the position of the column of |
.instruments |
A named list of vectors of instruments. The names
of the list elements are the names of the dependent (LHS) constructs of the structural
equation whose explanatory variables are endogenous. The vectors
contain the names of the instruments corresponding to each equation. Note
that exogenous variables of a given equation must be supplied as
instruments for themselves. Defaults to |
.iter_max |
Integer. The maximum number of iterations allowed.
If |
.normality |
Logical. Should joint normality of
|
.PLS_approach_cf |
Character string. Approach used to obtain the correction
factors for PLSc. One of: "dist_squared_euclid", "dist_euclid_weighted",
"fisher_transformed", "mean_arithmetic", "mean_geometric", "mean_harmonic",
"geo_of_harmonic". Defaults to "dist_squared_euclid".
Ignored if |
.PLS_ignore_structural_model |
Logical. Should the structural model be ignored
when calculating the inner weights of the PLS-PM algorithm? Defaults to |
.PLS_modes |
Either a named list specifying the mode that should be used for
each construct in the form |
.PLS_weight_scheme_inner |
Character string. The inner weighting scheme
used by PLS-PM. One of: "centroid", "factorial", or "path".
Defaults to "path". Ignored if |
.reliabilities |
A character vector of |
.starting_values |
A named list of vectors where the
list names are the construct names whose indicator weights the user
wishes to set. The vectors must be named vectors of |
.tolerance |
Double. The tolerance criterion for convergence.
Defaults to |
See Also
Get construct scores
Description
Usage
getConstructScores(
.object = NULL,
.standardized = TRUE
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.standardized |
Logical. Should standardized scores be returned? Defaults
to |
Details
Get the standardized or unstandardized construct scores.
Value
A list of three with elements Construct_scores, W_used, Indicators_used.
See Also
Internal: Parameter names
Description
Based on a model in lavaan model syntax, returns the names of the parameters of the structural model, the measurement/composite model and the weight relationship. Used by testMGD() to extract the names of the parameters to compare across groups according to the test proposed by Chin and Dibbern (2010).
Usage
getParameterNames(
.object = args_default()$.object,
.model = args_default()$.model
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.model |
A model in lavaan model syntax indicating which
parameters (i.e, path ( |
Value
A list with elements names_path, names_loadings, and names_weights containing the names of the structural parameters, the loadings, and the weights to compare across groups.
References
Chin WW, Dibbern J (2010). “An Introduction to a Permutation Based Procedure for Multi-Group PLS Analysis: Results of Tests of Differences on Simulated Data and a Cross Cultural Analysis of the Sourcing of Information System Services Between Germany and the USA.” In Handbook of Partial Least Squares, 171–193. Springer Berlin Heidelberg. doi:10.1007/978-3-540-32827-8_8.
Internal: Extract relevant parameters from several cSEMResults_multi
Description
Extract the relevant parameters from a cSEMResults_multi object in .object.
Usage
getRelevantParameters(
.object = args_default()$.object,
.model = args_default()$.model
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.model |
A model in lavaan model syntax indicating which
parameters (i.e., path ( |
Value
A list of length equal to the number of groups in .object. Each list element is itself a list of three. The first list element contains the relevant parameter estimates of the structural model, the second list element the relevant estimated loadings, and the third the relevant estimated weights.
Internal: Helper for doNonlinearEffectsAnalysis()
Description
Function that calculates the values required for the floodlight analysis, namely 1) partial effect of the independent variable on the dependent variable for each bootstrap run and for the original estimation for each step of the moderator 2) alpha/2 and 1-alpha/2 quantile of the bootstrap estimates.
Usage
getValuesFloodlight(
.model = NULL,
.path_coefficients = args_default()$.path_coefficients,
.dependent = args_default()$.dependent,
.independent = args_default()$.independent,
.moderator = args_default()$.moderator,
.steps_mod = args_default()$.steps_mod,
.value_independent = args_default()$.value_independent,
.alpha = args_default()$.alpha
)
Arguments
.model |
A model in lavaan model syntax or a cSEMModel list. |
.path_coefficients |
List. A list that contains the resampled and the original
path coefficient estimates. Typically a part of a |
.dependent |
Character string. The name of the dependent variable. |
.independent |
Character string. The name of the independent variable. |
.moderator |
Character string. The name of the moderator variable. |
.steps_mod |
A numeric vector. Steps used for the moderator variable in calculating
the simple effects of an independent variable on the dependent variable.
Defaults to |
.value_independent |
Integer. Only required for floodlight analysis; The value of the independent variable in case that it appears as a higher-order term. |
.alpha |
An integer or a numeric vector of significance levels.
Defaults to |
Details
Only variables that comprise the independent variable are taken into account. If it contains a variable other than the independent variable and the moderator the effect is set to zero as the other variables are evaluated at their means (=0), hence the effect is zero.
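The two ingredients of the floodlight analysis described above can be sketched in base R for a model with one interaction term. The bootstrap draws are simulated and all coefficient values are hypothetical; the sketch only illustrates the calculation and is not the package's internal code.

```r
set.seed(1)

# Hypothetical bootstrap draws of the two relevant path coefficients.
n_boot    <- 500
b_x_boot  <- rnorm(n_boot, mean = 0.30, sd = 0.05)  # independent variable
b_xm_boot <- rnorm(n_boot, mean = 0.10, sd = 0.04)  # interaction term

# Hypothetical original estimates.
b_x_hat  <- 0.30
b_xm_hat <- 0.10

steps_mod <- seq(-2, 2, length.out = 9)  # steps of the moderator
alpha     <- 0.05

floodlight <- t(sapply(steps_mod, function(m) {
  # 1) Partial effect of the independent variable for each bootstrap run
  effect_boot <- b_x_boot + b_xm_boot * m
  c(moderator = m,
    effect    = b_x_hat + b_xm_hat * m,  # partial effect, original estimation
    # 2) alpha/2 and 1 - alpha/2 quantiles of the bootstrap estimates
    lower     = unname(quantile(effect_boot, alpha / 2)),
    upper     = unname(quantile(effect_boot, 1 - alpha / 2)))
}))
floodlight
```

Moderator values at which the band between lower and upper excludes zero are the regions where the partial effect would be judged significant at level alpha.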
Internal: get significance stars
Description
Transforms a p-value into stars.
Usage
get_significance_stars(
.pvalue
)
Details
.pvalue
Numeric. A p-value that is transformed into significance stars.
Value
Character string. A p-value transformed into a star.
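A mapping of this kind can be sketched as follows. The helper p_to_stars is hypothetical, and the cut-offs 0.001, 0.01 and 0.05 are assumed conventions rather than thresholds confirmed for get_significance_stars().

```r
# Hypothetical helper; thresholds are assumed, not taken from the package.
p_to_stars <- function(p) {
  ifelse(p < 0.001, "***",
         ifelse(p < 0.01, "**",
                ifelse(p < 0.05, "*", "")))
}

p_to_stars(c(0.0004, 0.02, 0.3))
```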
Internal: Handle arguments
Description
Internal helper function to handle arguments passed to any function within cSEM.
Usage
handleArgs(.args_used)
Arguments
.args_used |
A list of argument names and user picked values. |
Value
The args_default list, with default values changed to the values given by the user.
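The general pattern of overriding defaults with user-supplied values can be sketched in base R as follows. The list defaults, its entries, and the function handle_args_sketch are hypothetical stand-ins; the actual handleArgs() may validate and merge arguments differently.

```r
# Hypothetical stand-in for the package's list of default arguments.
defaults <- list(.tolerance = 1e-05, .iter_max = 100, .approach_paths = "OLS")

handle_args_sketch <- function(.args_used, .defaults = defaults) {
  # Reject argument names that have no default entry.
  unknown <- setdiff(names(.args_used), names(.defaults))
  if (length(unknown) > 0) {
    stop("Unknown argument(s): ", paste(unknown, collapse = ", "))
  }
  # Replace defaults by the values picked by the user.
  # Note: modifyList() removes an element if the new value is NULL.
  modifyList(.defaults, .args_used)
}

args <- handle_args_sketch(list(.iter_max = 200))
str(args)
```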
Inference
Description
Usage
infer(
.object = NULL,
.quantity = c("all", "mean", "sd", "bias", "CI_standard_z",
"CI_standard_t", "CI_percentile", "CI_basic",
"CI_bc", "CI_bca", "CI_t_interval"),
.alpha = 0.05,
.bias_corrected = TRUE
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.quantity |
Character string. Which statistic should be returned? One of "all", "mean", "sd", "bias", "CI_standard_z", "CI_standard_t", "CI_percentile", "CI_basic", "CI_bc", "CI_bca", "CI_t_interval". Defaults to "all" in which case all quantities that do not require additional resampling are returned, i.e., all quantities but "CI_bca", "CI_t_interval". |
.alpha |
An integer or a numeric vector of significance levels.
Defaults to |
.bias_corrected |
Logical. Should the standard and the tStat
confidence interval be bias-corrected using the bootstrapped bias estimate?
If |
Details
Calculate common inferential quantities. For users interested in the estimated standard errors, t-values, p-values and/or confidence intervals of the path, weight or loading estimates, calling summarize() directly will usually be more convenient as it has a much more user-friendly print method. infer() is useful for comparing different confidence interval estimates.
infer() is a convenience wrapper around a number of internal functions that compute a particular inferential quantity, i.e., a value or set of values to be used in statistical inference.
cSEM relies on resampling (bootstrap and jackknife) as the basis for the computation of e.g., standard errors or confidence intervals. Consequently, infer() requires resamples to work. Technically, the cSEMResults object used in the call to infer() must therefore also have class attribute cSEMResults_resampled. If the object provided by the user does not contain resamples yet, infer() will obtain bootstrap resamples first. Naturally, computation will take longer in this case.
infer() does as much as possible in the background. Hence, every time infer() is called on a cSEMResults object the quantities chosen by the user are automatically computed for every estimated parameter contained in the object. By default all possible quantities are computed (.quantity = "all"). The following table lists the available inferential quantities alongside a brief description. Implementation and terminology of the confidence intervals is based on Hesterberg (2015) and
Davison and Hinkley (1997).
"mean", "sd"
The mean or the standard deviation over all M resample estimates of a generic statistic or parameter.
"bias"
The difference between the resample mean and the original estimate of a generic statistic or parameter.
"CI_standard_z" and "CI_standard_t"
The standard confidence interval for a generic statistic or parameter with standard errors estimated by the resample standard deviation. While "CI_standard_z" assumes a standard normally distributed statistic, "CI_standard_t" assumes a t-statistic with N - 1 degrees of freedom.
"CI_percentile"
The percentile confidence interval. The lower and upper bounds of the confidence interval are estimated as the alpha and 1-alpha quantiles of the distribution of the resample estimates.
"CI_basic"
The basic confidence interval also called the reverse bootstrap percentile confidence interval. See Hesterberg (2015) for details.
"CI_bc"
The bias corrected (Bc) confidence interval. See Davison and Hinkley (1997) for details.
"CI_bca"
The bias-corrected and accelerated (Bca) confidence interval. Requires additional jackknife resampling to compute the influence values. See Davison and Hinkley (1997) for details.
"CI_t_interval"
The "studentized" t-confidence interval. If based on bootstrap resamples the interval is also called the bootstrap t-interval confidence interval. See Hesterberg (2015) on page 381. Requires resamples of resamples. See resamplecSEMResults().
By default, all but the studentized t-interval confidence interval and the bias-corrected and accelerated confidence interval are calculated. Both are excluded by default because they require an additional resampling step: the former requires a double bootstrap and the latter requires jackknife estimates to compute influence values. Both can potentially be time consuming. Hence, computation is triggered only if explicitly chosen.
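To show how the simpler quantities relate to a set of resample estimates, here is a minimal base-R sketch for a single parameter. The resample estimates are simulated, theta_hat is a hypothetical original estimate, and the intervals are assumed to be two-sided with alpha/2 in each tail; infer() itself may differ in details such as bias correction.

```r
set.seed(42)

theta_hat <- 0.40                                 # hypothetical original estimate
resamples <- rnorm(1000, mean = 0.41, sd = 0.05)  # hypothetical resample estimates
alpha     <- 0.05

boot_mean <- mean(resamples)        # "mean"
boot_sd   <- sd(resamples)          # "sd"
boot_bias <- boot_mean - theta_hat  # "bias"

# Tail quantiles of the resample distribution (two-sided, assumed alpha/2).
q <- unname(quantile(resamples, c(alpha / 2, 1 - alpha / 2)))

ci <- rbind(
  # Standard interval based on the normal distribution
  standard_z = theta_hat + qnorm(c(alpha / 2, 1 - alpha / 2)) * boot_sd,
  # Percentile interval
  percentile = q,
  # Basic (reverse percentile) interval
  basic      = 2 * theta_hat - rev(q)
)
colnames(ci) <- c("lower", "upper")
ci
```

The basic interval mirrors the percentile interval around the original estimate, which is why the two differ whenever the resample distribution is not centered on theta_hat.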
Value
A list of class cSEMInfer.
References
Davison AC, Hinkley DV (1997).
Bootstrap Methods and their Application.
Cambridge University Press.
doi:10.1017/cbo9780511802843.
Hesterberg TC (2015).
“What Teachers Should Know About the Bootstrap: Resampling in the Undergraduate Statistics Curriculum.”
The American Statistician, 69(4), 371–386.
doi:10.1080/00031305.2015.1089789.
See Also
csem(), resamplecSEMResults(), summarize(), cSEMResults
Examples
model <- "
# Structural model
QUAL ~ EXPE
EXPE ~ IMAG
SAT ~ IMAG + EXPE + QUAL + VAL
LOY ~ IMAG + SAT
VAL ~ EXPE + QUAL
# Measurement model
EXPE =~ expe1 + expe2 + expe3 + expe4 + expe5
IMAG =~ imag1 + imag2 + imag3 + imag4 + imag5
LOY =~ loy1 + loy2 + loy3 + loy4
QUAL =~ qual1 + qual2 + qual3 + qual4 + qual5
SAT =~ sat1 + sat2 + sat3 + sat4
VAL =~ val1 + val2 + val3 + val4
"
## Estimate the model with bootstrap resampling
a <- csem(satisfaction, model, .resample_method = "bootstrap", .R = 20,
.handle_inadmissibles = "replace")
## Compute inferential quantities
inf <- infer(a)
inf$Path_estimates$CI_basic
inf$Indirect_effect$sd
### Compute the bias-corrected and accelerated and/or the studentized t-interval.
## For the studentized t-interval confidence interval a double bootstrap is required.
## This is pretty time consuming.
## Not run:
inf <- infer(a, .quantity = c("all", "CI_bca")) # requires jackknife estimates
## Estimate the model with double bootstrap resampling:
# Notes:
# 1. The .resample_method2 argument triggers a bootstrap of each bootstrap sample
# 2. The double bootstrap is very time consuming, consider setting
# `.eval_plan = "multisession"`.
a1 <- csem(satisfaction, model, .resample_method = "bootstrap", .R = 499,
.resample_method2 = "bootstrap", .R2 = 199, .handle_inadmissibles = "replace")
infer(a1, .quantity = "CI_t_interval")
## End(Not run)
Internal: Helper for infer()
Description
Collection of various functions that compute an inferential quantity.
Usage
MeanResample(.first_resample)
SdResample(.first_resample, .resample_method, .n)
BiasResample(.first_resample, .resample_method, .n)
StandardCIResample(
.first_resample,
.bias_corrected,
.dist = c("z", "t"),
.df = c("type1", "type2"),
.resample_method,
.n,
.probs
)
PercentilCIResample(.first_resample, .probs)
BasicCIResample(.first_resample, .bias_corrected, .probs)
TStatCIResample(
.first_resample,
.second_resample,
.bias_corrected,
.resample_method,
.resample_method2,
.n,
.probs
)
BcCIResample(.first_resample, .probs)
BcaCIResample(.object, .first_resample, .probs)
Arguments
.first_resample |
A list containing the |
.resample_method |
Character string. The resampling method to use. One of: "none", "bootstrap" or "jackknife". Defaults to "none". |
.n |
Integer. The number of observations of the original data. |
.bias_corrected |
Logical. Should the standard and the tStat
confidence interval be bias-corrected using the bootstrapped bias estimate?
If |
.dist |
Character string. The distribution to use for the critical value. One of "t" for Student's t-distribution or "z" for the standard normal distribution. Defaults to "z". |
.df |
Character string. The method for obtaining the degrees of freedom. Choices are "type1" and "type2". Defaults to "type1" . |
.probs |
A vector of probabilities. |
.second_resample |
A list containing |
.resample_method2 |
Character string. The resampling method to use when resampling
from a resample. One of: "none", "bootstrap" or "jackknife". For
"bootstrap" the number of draws is provided via |
.object |
An R object of class cSEMResults resulting from a call to |
Details
Implementation and terminology of the confidence intervals is based on Hesterberg (2015) and Davison and Hinkley (1997).
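To make the terminology concrete, the bias-corrected (Bc) interval described by Davison and Hinkley (1997) can be sketched in a few lines of base R. The helper name bc_interval() and the toy data are hypothetical; the sketch illustrates the textbook construction and is not the code of BcCIResample().

```r
# Hypothetical helper: bias-corrected percentile interval
bc_interval <- function(theta_hat, theta_star, probs = c(0.025, 0.975)) {
  # Bias-correction constant: normal quantile of the share of resample
  # estimates that fall below the original estimate
  z0 <- qnorm(mean(theta_star < theta_hat))
  # Shift the nominal probabilities, then read off the adjusted quantiles
  adj_probs <- pnorm(2 * z0 + qnorm(probs))
  unname(quantile(theta_star, adj_probs))
}

# Toy data (assumption): bootstrap the mean of a skewed sample
set.seed(1)
x          <- rexp(40)
theta_hat  <- mean(x)
theta_star <- replicate(999, mean(sample(x, replace = TRUE)))
bc_interval(theta_hat, theta_star)
```

If exactly half of the resample estimates lie below the original estimate, the correction constant is zero and the interval coincides with the percentile interval.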
References
Davison AC, Hinkley DV (1997).
Bootstrap Methods and their Application.
Cambridge University Press.
doi:10.1017/cbo9780511802843.
Hesterberg TC (2015).
“What Teachers Should Know About the Bootstrap: Resampling in the Undergraduate Statistics Curriculum.”
The American Statistician, 69(4), 371–386.
doi:10.1080/00031305.2015.1089789.
Internal: Calculate consistent moments of a nonlinear model
Description
Collection of various moment estimators. See classifyConstructs for a list of possible moments.
Usage
SingleSingle(.i, .j, .Q, .H)
SingleQuadratic(.i, .j, .Q, .H)
SingleCubic(.i, .j, .Q, .H)
SingleTwInter(.i, .j, .Q, .H)
SingleThrwInter(.i, .j, .Q, .H)
SingleQuadTwInter(.i, .j, .Q, .H)
QuadraticQuadratic(.i, .j, .Q, .H)
QuadraticCubic(.i, .j, .Q, .H)
QuadraticTwInter(.i, .j, .Q, .H)
QuadraticThrwInter(.i, .j, .Q, .H)
QuadraticQuadTwInter(.i, .j, .Q, .H)
CubicCubic(.i, .j, .Q, .H)
CubicTwInter(.i, .j, .Q, .H)
CubicThrwInter(.i, .j, .Q, .H)
CubicQuadTwInter(.i, .j, .Q, .H)
TwInterTwInter(.i, .j, .Q, .H)
TwInterThrwInter(.i, .j, .Q, .H)
TwInterQuadTwInter(.i, .j, .Q, .H)
ThrwInterThrwInter(.i, .j, .Q, .H)
ThrwInterQuadTwInter(.i, .j, .Q, .H)
QuadTwInterQuadTwInter(.i, .j, .Q, .H)
Arguments
.i |
Row index |
.j |
Column index |
.Q |
A vector of composite-construct correlations with element names equal to the J construct names used in the measurement model. Note Q^2 is also called the reliability coefficient. |
.H |
The (N x J) matrix of construct scores. |
Details
M is the matrix of the sample counterparts (estimates) of the left-hand side terms in Equations (21)-(24) of Dijkstra and Schermelleh-Engel (2014). The label "M" does not appear in the paper and is only used in the package. A similar approach is suggested by Wall and Amemiya (2000) using classical factor scores.
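The simplest moment of this kind, the correlation between two distinct constructs, can be illustrated with the classical correction for attenuation: the correlation between two construct scores is divided by the product of the corresponding composite-construct correlations. The helper below is hypothetical and only mirrors the argument names .i, .j, .Q, and .H; it is a sketch of the idea under the assumption of standardized constructs, not one of the package's estimators.

```r
# Hypothetical helper mirroring the arguments .i, .j, .Q, and .H
disattenuated_cor <- function(.i, .j, .Q, .H) {
  if (.i == .j) return(1)  # standardized constructs have unit variance
  unname(cor(.H[, .i], .H[, .j]) / (.Q[.i] * .Q[.j]))
}

# Toy construct scores (N = 200, J = 2) and assumed composite-construct correlations
set.seed(1)
H <- cbind(eta1 = rnorm(200), eta2 = rnorm(200))
H[, "eta2"] <- 0.5 * H[, "eta1"] + H[, "eta2"]
Q <- c(eta1 = 0.9, eta2 = 0.8)

disattenuated_cor("eta1", "eta2", .Q = Q, .H = H)
```

Moments involving quadratic, cubic, or interaction terms require further corrections that this sketch does not cover.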
References
Dijkstra TK, Schermelleh-Engel K (2014).
“Consistent Partial Least Squares For Nonlinear Structural Equation Models.”
Psychometrika, 79(4), 585–604.
Wall MM, Amemiya Y (2000).
“Estimation for polynomial structural equation models.”
Journal of the American Statistical Association, 95(451), 929–940.
Internal: Utility functions for the estimation of nonlinear models
Description
Internal: Utility functions for the estimation of nonlinear models
Usage
f1(.i, .j)
f2(.i, .j, .select_from, .Q, .H)
f3(.i, .j, .Q, .H)
f4(.i, .j, .Q, .H, .var_struc_error, .temp = NULL)
f5(.i, .j, .H, .Q, .var_struc_error)
Arguments
.i |
Row index |
.j |
Column index |
.select_from |
matrix to select from |
.Q |
A vector of composite-construct correlations with element names equal to the J construct names used in the measurement model. Note Q^2 is also called the reliability coefficient. |
.H |
The (N x J) matrix of construct scores. |
Parse lavaan model
Description
Turns a model written in lavaan model syntax into a cSEMModel list.
Usage
parseModel(
.model = NULL,
.instruments = NULL,
.check_errors = TRUE
)
Arguments
.model |
A model in lavaan model syntax or a cSEMModel list. |
.instruments |
A named list of vectors of instruments. The names
of the list elements are the names of the dependent (LHS) constructs of the structural
equation whose explanatory variables are endogenous. The vectors
contain the names of the instruments corresponding to each equation. Note
that exogenous variables of a given equation must be supplied as
instruments for themselves. Defaults to |
.check_errors |
Logical. Should the model to parse be checked for correctness
in a sense that all necessary components to estimate the model are given?
Defaults to |
Details
Instruments must be supplied separately as a named list of vectors of instruments. The names of the list elements are the names of the dependent constructs of the structural equation whose explanatory variables are endogenous. The vectors contain the names of the instruments corresponding to each equation. Note that exogenous variables of a given equation must be supplied as instruments for themselves.
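As a minimal sketch of the expected structure, assume a hypothetical structural equation y3 ~ y1 + y2 in which y2 is endogenous and y1 is exogenous, and a hypothetical external instrument z1. All names are assumptions made for illustration; whether an instrument must also be specified elsewhere in the model is not addressed here.

```r
# Hypothetical equation: y3 ~ y1 + y2 (y1 exogenous, y2 endogenous)
# The list name is the dependent (LHS) construct of the equation;
# the vector holds the instruments for that equation.
instruments <- list(
  y3 = c("y1", "z1")  # y1 serves as its own instrument, z1 instruments y2
)

str(instruments)
```

A list of this shape would then be passed as the .instruments argument.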
By default, parseModel() attempts to check if the model provided is correct in the sense that all necessary components required to estimate the model are specified (e.g., a construct of the structural model has at least 1 item). To prevent checking for errors use .check_errors = FALSE.
Value
An object of class cSEMModel is a standardized list containing the following components. J stands for the number of constructs and K for the number of indicators.
$structural
A matrix mimicking the structural relationship between constructs. If constructs are only linearly related, structural is of dimension (J x J) with row and column names equal to the construct names. If the structural model contains nonlinear relationships, structural is (J x (J + J*)), where J* is the number of nonlinear terms. Rows are ordered such that exogenous constructs are always first, followed by constructs that only depend on exogenous constructs and/or previously ordered constructs.
$measurement
A (J x K) matrix mimicking the measurement/composite relationship between constructs and their related indicators. Rows are in the same order as the matrix $structural with row names equal to the construct names. The order of the columns is such that $measurement forms a block diagonal matrix.
$error_cor
A (K x K) matrix mimicking the measurement error correlation relationship. The row and column order is identical to the column order of $measurement.
$cor_specified
A matrix indicating the correlation relationships between any variables of the model as specified by the user. Mainly for internal purposes. Note that $cor_specified may also contain inadmissible correlations such as a correlation between measurement errors indicators and constructs.
$construct_type
A named vector containing the names of each construct and their respective type ("Common factor" or "Composite").
$construct_order
A named vector containing the names of each construct and their respective order ("First order" or "Second order").
$model_type
The type of model ("Linear" or "Nonlinear").
$instruments
Only if instruments are supplied: a list of structural equations relating endogenous RHS variables to instruments.
$indicators
The names of the indicators (i.e., observed variables and/or first-order constructs)
$cons_exo
The names of the exogenous constructs of the structural model (i.e., variables that do not appear on the LHS of any structural equation)
$cons_endo
The names of the endogenous constructs of the structural model (i.e., variables that appear on the LHS of at least one structural equation)
$vars_2nd
The names of the constructs modeled as second orders.
$vars_attached_to_2nd
The names of the constructs forming or building a second order construct.
$vars_not_attached_to_2nd
The names of the constructs not forming or building a second order construct.
It is possible to supply an incomplete list to parseModel(), resulting in an incomplete cSEMModel list which can be passed to all functions that require .csem_model as a mandatory argument. Currently, only the structural and the measurement matrix are required. However, specifying an incomplete cSEMModel list may lead to unexpected behavior and errors. Use with care.
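To visualize the layout of the two required matrices, the following base R sketch builds them by hand for a small hypothetical model (y1 ~ y2 + y3 with seven indicators x1 to x7). It only illustrates the dimensions and ordering described above and is an assumption about the layout; the matrices actually returned by parseModel() should be inspected directly.

```r
# Hypothetical model: y1 ~ y2 + y3; y1 =~ x1 + x2 + x3; y2 =~ x4 + x5; y3 =~ x6 + x7
constructs <- c("y2", "y3", "y1")  # exogenous constructs first, then endogenous
indicators <- list(y2 = c("x4", "x5"), y3 = c("x6", "x7"), y1 = c("x1", "x2", "x3"))

# (J x J) structural matrix: a 1 marks "row construct depends on column construct"
structural <- matrix(0, 3, 3, dimnames = list(constructs, constructs))
structural["y1", c("y2", "y3")] <- 1

# (J x K) measurement matrix with columns ordered to give a block diagonal shape
measurement <- matrix(
  0, length(constructs), length(unlist(indicators)),
  dimnames = list(constructs, unlist(indicators, use.names = FALSE))
)
for (j in constructs) measurement[j, indicators[[j]]] <- 1

structural
measurement
```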
Examples
# ===========================================================================
# Providing a model in lavaan syntax
# ===========================================================================
model <- "
# Structural model
y1 ~ y2 + y3
# Measurement model
y1 =~ x1 + x2 + x3
y2 =~ x4 + x5
y3 =~ x6 + x7
# Error correlation
x1 ~~ x2
"
m <- parseModel(model)
m
# ===========================================================================
# Providing a complete model in cSEM format (class cSEMModel)
# ===========================================================================
# If the model is already a cSEMModel object, the model is returned as is:
identical(m, parseModel(m)) # TRUE
# ===========================================================================
# Providing a list
# ===========================================================================
# It is possible to provide a list that contains at least the
# elements "structural" and "measurement". This is generally discouraged
# as this may cause unexpected errors.
m_incomplete <- m[c("structural", "measurement", "construct_type")]
parseModel(m_incomplete)
# Providing a list containing list names that are not part of a `cSEMModel`
# causes an error:
## Not run:
m_incomplete[c("name_a", "name_b")] <- c("hello world", "hello universe")
parseModel(m_incomplete)
## End(Not run)
# Failing to provide "structural" or "measurement" also causes an error:
## Not run:
m_incomplete <- m[c("structural", "construct_type")]
parseModel(m_incomplete)
## End(Not run)
cSEMIPMA
method for plot()
Description
Plot the importance-performance matrix.
Usage
## S3 method for class 'cSEMIPMA'
plot(
x = NULL,
.dependent = NULL,
.attributes = NULL,
.level = c("construct", "indicator"),
...
)
Arguments
x |
An R object of class |
.dependent |
Character string. Name of the target construct for which the importance-performance matrix should be created. |
.attributes |
Character string. A vector containing indicator/construct names that should be plotted in the importance-performance matrix. It must be at least of length 2. |
.level |
Character string. Indicates the level for which the
importance-performance matrix should be plotted. One of |
... |
Currently ignored. |
See Also
cSEMNonlinearEffects
method for plot()
Description
This plot method can be used to create plots to analyze nonlinear models in more depth. The following plot types can be selected:
.plot_type = "simpleeffects"
The plot of a simple effects analysis displays the predicted value of the dependent variable for different values of the independent variable and the moderator. As levels for the moderator, the levels provided to the doNonlinearEffectsAnalysis() function are used. Since the constructs are standardized, the values of the moderator equal the deviation from its mean measured in standard deviations.
.plot_type = "surface"
The plot of a surface analysis displays the predicted values of an independent variable (z). The values are predicted based on the values of the moderator and the independent variable including all their higher-order terms. For the values of the moderator and the independent variable, steps between their minimum and maximum values are used.
.plot_type = "floodlight"
The plot of a floodlight analysis displays the direct effect of a continuous independent variable (z) on a dependent variable (y) conditional on the values of a continuous moderator variable (x), including the confidence interval and the Johnson-Neyman points. Note that the floodlight plot only takes moderation into account; higher-order terms are ignored. For more details, see Spiller et al. (2013).
Plot the predicted values of an independent variable (z). The values are predicted based on a certain moderator and a certain independent variable including all their higher-order terms.
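The conditional effect underlying a floodlight analysis can be sketched in base R. The coefficient values, variances, and covariance below are hypothetical numbers chosen for illustration (in practice they would come from the estimated model and its resamples); the sketch follows the general logic described by Spiller et al. (2013) and is not the package's implementation.

```r
# Hypothetical estimates for y = b_z * z + b_x * x + b_xz * x * z + error
b_z   <- 0.30    # effect of z on y when the moderator x is zero
b_xz  <- 0.15    # interaction (moderation) effect
v_z   <- 0.010   # variance of the estimate of b_z
v_xz  <- 0.004   # variance of the estimate of b_xz
c_zxz <- 0.001   # covariance of the two estimates

x_grid <- seq(-3, 3, by = 0.5)  # moderator values in standard deviations
effect <- b_z + b_xz * x_grid   # effect of z on y conditional on x
se     <- sqrt(v_z + x_grid^2 * v_xz + 2 * x_grid * c_zxz)
lower  <- effect - qnorm(0.975) * se
upper  <- effect + qnorm(0.975) * se

floodlight <- data.frame(x = x_grid, effect, lower, upper)
# Moderator values for which the 95% interval excludes zero
floodlight$significant <- floodlight$lower > 0 | floodlight$upper < 0
floodlight
```

The Johnson-Neyman points are the moderator values at which the interval bound crosses zero, that is, where the significant column switches between TRUE and FALSE.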
Usage
## S3 method for class 'cSEMNonlinearEffects'
plot(x, .plot_type = "simpleeffects", .plot_package = "plotly", ...)
Arguments
x |
An R object of class |
.plot_type |
A character string indicating the type of plot that should be produced. Options are "simpleeffects", "surface", and "floodlight". Defaults to "simpleeffects". |
.plot_package |
A character vector indicating the plot package used. Options are "plotly" and "persp". Defaults to "plotly". |
... |
Additional parameters that can be passed to
|
See Also
cSEMPredict
method for plot()
Description
The cSEMPredict method for the generic function plot().
Usage
## S3 method for class 'cSEMPredict'
plot(x, ...)
Arguments
x |
An R object of class |
... |
Currently ignored. |
See Also
cSEMResults
method for plot()
for second-order models.
Description
Usage
## S3 method for class 'cSEMResults_2ndorder'
plot(
x,
.title = args_default()$.title,
.plot_significances = args_default()$.plot_significances,
.plot_correlations = args_default()$.plot_correlations,
.plot_structural_model_only = args_default()$.plot_structural_model_only,
.plot_labels = args_default()$.plot_labels,
.graph_attrs = args_default()$.graph_attrs,
...
)
Arguments
x |
An R object of class |
.title |
Character string. Title of an object. Defaults to "". |
.plot_significances |
Logical. Should p-values in the form of stars be plotted? Defaults to |
.plot_correlations |
Character string. Specify which correlations should be plotted, i.e.,
between the exogenous constructs ( |
.plot_structural_model_only |
Logical. Should only the structural model,
i.e., the constructs and their relationships be plotted? Defaults to |
.plot_labels |
Logical. Whether to display edge labels and node R² values. Defaults to TRUE. |
.graph_attrs |
Character string. Additional attributes that should be passed to the DiagrammeR syntax, e.g., c("rankdir=LR", "ranksep=1.0"). Defaults to c("rankdir=LR"). |
... |
Currently ignored. |
Details
Creates a plot of a cSEMResults_2ndorder object using the grViz function.
For more details on customizing the plot, see https://rpubs.com/nguyen_mot/1275413.
See Also
cSEMResults
method for plot()
Description
Usage
## S3 method for class 'cSEMResults_default'
plot(
x = NULL,
.title = args_default()$.title,
.plot_significances = args_default()$.plot_significances,
.plot_correlations = args_default()$.plot_correlations,
.plot_structural_model_only = args_default()$.plot_structural_model_only,
.plot_labels = args_default()$.plot_labels,
.graph_attrs = args_default()$.graph_attrs,
...
)
Arguments
x |
An R object of class |
.title |
Character string. Title of an object. Defaults to "". |
.plot_significances |
Logical. Should p-values in the form of stars be plotted? Defaults to |
.plot_correlations |
Character string. Specify which correlations should be plotted, i.e.,
between the exogenous constructs ( |
.plot_structural_model_only |
Logical. Should only the structural model,
i.e., the constructs and their relationships be plotted? Defaults to |
.plot_labels |
Logical. Whether to display edge labels and R² values in the nodes. Defaults to TRUE (i.e. original plot). |
.graph_attrs |
Character string. Additional attributes that should be passed to the DiagrammeR syntax, e.g., c("rankdir=LR", "ranksep=1.0"). Defaults to c("rankdir=LR"). |
... |
Currently ignored. |
Details
Creates a plot of a cSEMResults object using the grViz function.
For more details on customizing the plot, see https://rpubs.com/nguyen_mot/1275413.
See Also
savePlot(), csem(), cSEMResults, grViz
Examples
## Not run:
model_Bergami_int="
# Common factor and composite models
OrgPres <~ cei1 + cei2 + cei3 + cei4 + cei5 + cei6 + cei7 + cei8
OrgIden =~ ma1 + ma2 + ma3 + ma4 + ma5 + ma6
AffJoy =~ orgcmt1 + orgcmt2 + orgcmt3 + orgcmt7
AffLove =~ orgcmt5 + orgcmt6 + orgcmt8
# Structural model
OrgIden ~ OrgPres
AffLove ~ OrgPres+OrgIden+OrgPres.OrgIden
AffJoy ~ OrgPres+OrgIden
"
outBergamiInt <- csem(.data = BergamiBagozzi2000, .model = model_Bergami_int,
.disattenuate = TRUE,
.PLS_weight_scheme_inner = 'factorial',
.tolerance = 1e-6,
.resample_method = 'none')
outPlot <- plot(outBergamiInt)
outPlot
savePlot(outPlot,.file='plot.pdf')
savePlot(outPlot,.file='plot.png')
savePlot(outPlot,.file='plot.svg')
savePlot(outPlot,.file='plot.dot')
## End(Not run)
cSEMResults
method for plot()
for multiple groups.
Description
Usage
## S3 method for class 'cSEMResults_multi'
plot(
x = NULL,
.title = args_default()$.title,
.plot_significances = args_default()$.plot_significances,
.plot_correlations = args_default()$.plot_correlations,
.plot_structural_model_only = args_default()$.plot_structural_model_only,
.plot_labels = args_default()$.plot_labels,
.graph_attrs = args_default()$.graph_attrs,
...
)
Arguments
x |
An R object of class |
.title |
Character string. Title of an object. Defaults to "". |
.plot_significances |
Logical. Should p-values in the form of stars be plotted? Defaults to |
.plot_correlations |
Character string. Specify which correlations should be plotted, i.e.,
between the exogenous constructs ( |
.plot_structural_model_only |
Logical. Should only the structural model,
i.e., the constructs and their relationships be plotted? Defaults to |
.plot_labels |
Logical. Whether to display edge labels and node R² values. Defaults to TRUE. |
.graph_attrs |
Character string. Additional attributes that should be passed to the DiagrammeR syntax, e.g., c("rankdir=LR", "ranksep=1.0"). Defaults to c("rankdir=LR"). |
... |
Currently ignored. |
Details
Creates a plot of a cSEMResults object using the grViz function.
For more details on customizing the plot, see https://rpubs.com/nguyen_mot/1275413.
See Also
Predict indicator scores
Description
Usage
predict(
.object = NULL,
.benchmark = c("lm", "unit", "PLS-PM", "GSCA", "PCA", "MAXVAR", "NA"),
.approach_predict = c("earliest", "direct"),
.cv_folds = 10,
.handle_inadmissibles = c("stop", "ignore", "set_NA"),
.r = 1,
.test_data = NULL,
.approach_score_target = c("mean", "median", "mode"),
.sim_points = 100,
.disattenuate = TRUE,
.treat_as_continuous = TRUE,
.approach_score_benchmark = c("mean", "median", "mode", "round"),
.seed = NULL
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.benchmark |
Character string. The procedure to obtain benchmark predictions. One of "lm", "unit", "PLS-PM", "GSCA", "PCA", "MAXVAR", or "NA". Defaults to "lm". |
.approach_predict |
Character string. Which approach should be used to perform predictions? One of "earliest" and "direct". If "earliest" predictions for indicators associated to endogenous constructs are performed using only indicators associated to exogenous constructs. If "direct", predictions for indicators associated to endogenous constructs are based on indicators associated to their direct antecedents. Defaults to "earliest". |
.cv_folds |
Integer. The number of cross-validation folds to use. Setting
|
.handle_inadmissibles |
Character string. How should inadmissible results
be treated? One of "stop", "ignore", or "set_NA". If "stop", |
.r |
Integer. The number of repetitions to use. Defaults to |
.test_data |
A matrix of test data with the same column names as the training data. |
.approach_score_target |
Character string. How should the aggregation of the estimates of the truncated normal distribution for the predictions using OrdPLS/OrdPLSc be done? One of "mean", "median" or "mode". If "mean", the mean of the estimated endogenous indicators is calculated. If "median", the median of the estimated endogenous indicators is calculated. If "mode", the maximum empirical density on the intervals defined by the thresholds is used. Defaults to "mean". |
.sim_points |
Integer. How many samples from the truncated normal distribution should be simulated to estimate the exogenous construct scores? Defaults to 100. |
.disattenuate |
Logical. Should the benchmark predictions be based on
disattenuated parameter estimates? Defaults to |
.treat_as_continuous |
Logical. Should the indicators for the benchmark predictions
be treated as continuous? If |
.approach_score_benchmark |
Character string. How should the aggregation
of the estimates of the truncated normal distribution be done for the
benchmark predictions? Ignored unless OrdPLS or OrdPLSc is used to obtain benchmark predictions.
One of "mean", "median", "mode" or "round".
If "round", the benchmark predictions are obtained using the traditional prediction
algorithm for PLS-PM which are rounded for categorical indicators.
If "mean", the mean of the estimated endogenous indicators is calculated.
If "median", the median of the estimated endogenous indicators is calculated.
If "mode", the maximum empirical density on the intervals defined by the thresholds
is used.
If |
.seed |
Integer or |
Details
The predict function implements the procedure introduced by Shmueli et al. (2016) in the PLS context
known as "PLSPredict" (Shmueli et al. 2019) including its variants PLScPredict, OrdPLSPredict and OrdPLScPredict.
It is used to predict the indicator scores of endogenous constructs and to evaluate the out-of-sample predictive power
of a model.
For that purpose, the predict function uses k-fold cross-validation to randomly split the data into training and test datasets, and subsequently predicts the values of the test data based on the model parameter estimates obtained from the training data. The number of cross-validation folds is 10 by default but may be changed using the .cv_folds argument.
By default, the procedure is not repeated (.r = 1). You may choose to repeat cross-validation by setting a higher .r to be sure not to have a particular (unfortunate) split. See Shmueli et al. (2019) for details. Typically .r = 1 should be sufficient though.
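The mechanics of repeated k-fold cross-validation can be sketched in base R. The example uses the built-in mtcars data and a plain linear model as a stand-in for the structural equation model; the object names cv_folds and r merely play the roles of .cv_folds and .r and are assumptions for this illustration, not the package's internal code.

```r
set.seed(123)
dat      <- mtcars[, c("mpg", "wt", "hp")]
cv_folds <- 10  # plays the role of .cv_folds
r        <- 2   # plays the role of .r

predictions <- lapply(seq_len(r), function(i) {
  # Randomly assign each observation to one of the folds
  fold <- sample(rep_len(seq_len(cv_folds), nrow(dat)))
  pred <- numeric(nrow(dat))
  for (k in seq_len(cv_folds)) {
    test_idx <- fold == k
    # Estimate on the training folds, then predict the held-out fold
    fit <- lm(mpg ~ wt + hp, data = dat[!test_idx, ])
    pred[test_idx] <- stats::predict(fit, newdata = dat[test_idx, ])
  }
  pred
})

# Out-of-sample RMSE for each repetition
sapply(predictions, function(p) sqrt(mean((dat$mpg - p)^2)))
```

Each observation is predicted exactly once per repetition, always by a model that did not see it during estimation.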
Alternatively, users may supply a test dataset as a matrix or a data frame via .test_data with the same column names as those in the data used to obtain .object (the training data). In this case, arguments .cv_folds and .r are ignored and predict() uses the estimated coefficients from .object to predict the values in the columns of .test_data.
In Shmueli et al. (2016), PLS-based predictions for indicator i are compared to the predictions based on a multiple regression of indicator i on all available exogenous indicators (.benchmark = "lm") and to a simple mean-based prediction, summarized in the Q2_predict metric. predict() is more general in that it allows users to compare the predictions based on a so-called target model/specification to predictions based on an alternative benchmark. Available benchmarks include predictions based on a linear model, PLS-PM weights, unit weights (i.e., sum scores), GSCA weights, PCA weights, and MAXVAR weights.
Each estimation run is checked for admissibility using verify(). If the estimation yields inadmissible results, predict() stops with an error ("stop"). Users may choose to "ignore" inadmissible results or to simply set predictions to NA ("set_NA") for the particular run that failed.
Value
An object of class cSEMPredict with print and plot methods. Technically, cSEMPredict is a named list containing the following list elements:
$Actual
A matrix of the actual values/indicator scores of the endogenous constructs.
$Prediction_target
A list containing matrices of the predicted indicator scores of the endogenous constructs based on the target model for each repetition .r. Target refers to the procedure used to estimate the parameters in .object.
$Residuals_target
A list of matrices of the residual indicator scores of the endogenous constructs based on the target model in each repetition .r.
$Residuals_benchmark
A list of matrices of the residual indicator scores of the endogenous constructs based on a model estimated by the procedure given to .benchmark for each repetition .r.
$Prediction_metrics
A data frame containing the prediction metrics MAE, RMSE, Q2_predict, the misclassification error rate (MER), the MAPE, the MSE2, Theil's forecast accuracy (U1), Theil's forecast quality (U2), bias proportion of MSE (UM), regression proportion of MSE (UR), and disturbance proportion of MSE (UD) (Hora and Campos 2015; Watson and Teelucksingh 2002).
$Information
A list with elements Target, Benchmark, Number_of_observations_training, Number_of_observations_test, Number_of_folds, Number_of_repetitions, and Handle_inadmissibles.
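For orientation, three of the prediction metrics can be computed by hand from actual and predicted indicator scores. The helper prediction_metrics() is hypothetical, and the Q2_predict formula (one minus the ratio of the squared prediction errors to the squared errors of a prediction by the training-sample mean) is an assumption based on the usual reading of Shmueli et al. (2016); the package's own computation may differ in detail.

```r
# Hypothetical helper; train_mean is the indicator's mean in the training data
prediction_metrics <- function(actual, predicted, train_mean) {
  res <- actual - predicted
  c(
    MAE        = mean(abs(res)),
    RMSE       = sqrt(mean(res^2)),
    Q2_predict = 1 - sum(res^2) / sum((actual - train_mean)^2)
  )
}

# Toy values for a single indicator
prediction_metrics(
  actual     = c(3, 5, 4, 6),
  predicted  = c(2.5, 5.5, 4, 5),
  train_mean = 4
)
```

A positive Q2_predict indicates that the model predicts the held-out values better than the naive mean-based benchmark.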
References
Hora J, Campos P (2015).
“A review of performance criteria to validate simulation models.”
Expert Systems, 32(5), 578–595.
doi:10.1111/exsy.12111.
Shmueli G, Ray S, Estrada JMV, Chatla SB (2016).
“The Elephant in the Room: Predictive Performance of PLS Models.”
Journal of Business Research, 69(10), 4552–4564.
doi:10.1016/j.jbusres.2016.03.049.
Shmueli G, Sarstedt M, Hair JF, Cheah J, Ting H, Vaithilingam S, Ringle CM (2019).
“Predictive Model Assessment in PLS-SEM: Guidelines for Using PLSpredict.”
European Journal of Marketing, 53(11), 2322–2347.
doi:10.1108/ejm-02-2019-0189.
Watson PK, Teelucksingh SS (2002).
A practical introduction to econometric methods: Classical and modern.
University of West Indies Press, Mona, Jamaica.
See Also
csem, cSEMResults, exportToExcel()
Examples
### Anime example taken from https://github.com/ISS-Analytics/pls-predict/
# Load data
data(Anime) # data is similar to the Anime.csv found on
# https://github.com/ISS-Analytics/pls-predict/ but with irrelevant
# columns removed
# Split into training and test data the same way as it is done on
# https://github.com/ISS-Analytics/pls-predict/
set.seed(123)
index <- sample.int(dim(Anime)[1], 83, replace = FALSE)
dat_train <- Anime[-index, ]
dat_test <- Anime[index, ]
# Specify model
model <- "
# Structural model
ApproachAvoidance ~ PerceivedVisualComplexity + Arousal
# Measurement/composite model
ApproachAvoidance =~ AA0 + AA1 + AA2 + AA3
PerceivedVisualComplexity <~ VX0 + VX1 + VX2 + VX3 + VX4
Arousal <~ Aro1 + Aro2 + Aro3 + Aro4
"
# Estimate (replicating the results of the `simplePLS()` function)
res <- csem(dat_train,
model,
.disattenuate = FALSE, # original PLS
.iter_max = 300,
.tolerance = 1e-07,
.PLS_weight_scheme_inner = "factorial"
)
# Predict using a user-supplied test data set
pp <- predict(res, .test_data = dat_test)
pp
### Compute prediction metrics ------------------------------------------------
res2 <- csem(Anime, # whole data set
model,
.disattenuate = FALSE, # original PLS
.iter_max = 300,
.tolerance = 1e-07,
.PLS_weight_scheme_inner = "factorial"
)
# Predict using 10-fold cross-validation
## Not run:
pp2 <- predict(res2, .benchmark = "lm")
pp2
## There is a plot method available
plot(pp2)
## End(Not run)
### Example using OrdPLScPredict -----------------------------------------------
# Transform the numerical indicators into factors
## Not run:
data("BergamiBagozzi2000")
data_new <- data.frame(cei1 = as.ordered(BergamiBagozzi2000$cei1),
cei2 = as.ordered(BergamiBagozzi2000$cei2),
cei3 = as.ordered(BergamiBagozzi2000$cei3),
cei4 = as.ordered(BergamiBagozzi2000$cei4),
cei5 = as.ordered(BergamiBagozzi2000$cei5),
cei6 = as.ordered(BergamiBagozzi2000$cei6),
cei7 = as.ordered(BergamiBagozzi2000$cei7),
cei8 = as.ordered(BergamiBagozzi2000$cei8),
ma1 = as.ordered(BergamiBagozzi2000$ma1),
ma2 = as.ordered(BergamiBagozzi2000$ma2),
ma3 = as.ordered(BergamiBagozzi2000$ma3),
ma4 = as.ordered(BergamiBagozzi2000$ma4),
ma5 = as.ordered(BergamiBagozzi2000$ma5),
ma6 = as.ordered(BergamiBagozzi2000$ma6),
orgcmt1 = as.ordered(BergamiBagozzi2000$orgcmt1),
orgcmt2 = as.ordered(BergamiBagozzi2000$orgcmt2),
orgcmt3 = as.ordered(BergamiBagozzi2000$orgcmt3),
orgcmt5 = as.ordered(BergamiBagozzi2000$orgcmt5),
orgcmt6 = as.ordered(BergamiBagozzi2000$orgcmt6),
orgcmt7 = as.ordered(BergamiBagozzi2000$orgcmt7),
orgcmt8 = as.ordered(BergamiBagozzi2000$orgcmt8))
model <- "
# Measurement models
OrgPres =~ cei1 + cei2 + cei3 + cei4 + cei5 + cei6 + cei7 + cei8
OrgIden =~ ma1 + ma2 + ma3 + ma4 + ma5 + ma6
AffJoy =~ orgcmt1 + orgcmt2 + orgcmt3 + orgcmt7
AffLove =~ orgcmt5 + orgcmt6 + orgcmt8
# Structural model
OrgIden ~ OrgPres
AffLove ~ OrgIden
AffJoy ~ OrgIden
"
# Estimate using cSEM; note: the fact that indicators are factors triggers OrdPLSc
res <- csem(.model = model, .data = data_new[1:250,])
summarize(res)
# Predict using OrdPLSPredict
set.seed(123)
pred <- predict(
.object = res,
.benchmark = "PLS-PM",
.test_data = data_new[(251):305,],
.treat_as_continuous = TRUE, .approach_score_target = "median"
)
pred
round(pred$Prediction_metrics[, -1], 4)
## End(Not run)
cSEMAssess
method for print()
Description
The cSEMAssess method for the generic function print().
Usage
## S3 method for class 'cSEMAssess'
print(x, ...)
See Also
cSEMNonlinearEffects
method for print()
Description
The cSEMNonlinearEffects method for the generic function print().
Usage
## S3 method for class 'cSEMNonlinearEffects'
print(x, ...)
See Also
csem(), cSEMResults, doNonlinearEffectsAnalysis(), plot.cSEMNonlinearEffects()
cSEMPlotPredict
method for print()
Description
The cSEMPlotPredict method for the generic function print().
Usage
## S3 method for class 'cSEMPlotPredict'
print(x, ...)
Arguments
x |
An R object of class |
... |
Currently ignored. |
See Also
cSEMPredict
method for print()
Description
The cSEMPredict method for the generic function print().
Usage
## S3 method for class 'cSEMPredict'
print(x, .metrics = c("MAE", "RMSE", "Q2"), ...)
Arguments
.metrics |
Character string or a vector of character strings. Which prediction metrics should be displayed? One of: "MAE", "RMSE", "Q2", "MER", "MAPE", "MSE2", "U1", "U2", "UM", "UR", or "UD". Defaults to c("MAE", "RMSE", "Q2"). |
See Also
csem()
, cSEMResults, predict()
cSEMResults
method for print()
Description
The cSEMResults method for the generic function print()
.
Usage
## S3 method for class 'cSEMResults'
print(x, ...)
See Also
cSEMSummarize
method for print()
Description
The cSEMSummarize method for the generic function print()
.
Usage
## S3 method for class 'cSEMSummarize'
print(x, .full_output = TRUE, ...)
Arguments
.full_output |
Logical. Should the full output of summarize be printed?
Defaults to |
See Also
csem()
, cSEMResults, summarize()
cSEMTestCVPAT
method for print()
Description
The cSEMTestCVPAT
method for the generic function print()
.
Usage
## S3 method for class 'cSEMTestCVPAT'
print(x, ...)
See Also
csem()
, cSEMResults, testCVPAT()
cSEMTestHausman
method for print()
Description
The cSEMTestHausman
method for the generic function print()
.
Usage
## S3 method for class 'cSEMTestHausman'
print(x, ...)
See Also
csem()
, cSEMResults, testHausman()
cSEMTestMGD
method for print()
Description
The cSEMTestMGD
method for the generic function print()
.
Usage
## S3 method for class 'cSEMTestMGD'
print(
x,
.approach_mgd = c("none", "Klesel", "Chin", "Sarstedt", "Keil", "Nitzl", "Henseler",
"CI_para", "CI_overlap"),
...
)
Arguments
.approach_mgd |
Character string or a vector of character strings. For which approach should details be displayed? One of: "none", "Klesel", "Chin", "Sarstedt", "Keil", "Nitzl", "Henseler", "CI_para", or "CI_overlap". Defaults to "none" in which case no details are displayed. |
See Also
csem()
, cSEMResults, testMGD()
cSEMTestMICOM
method for print()
Description
The cSEMTestMICOM
method for the generic function print()
.
Usage
## S3 method for class 'cSEMTestMICOM'
print(x, ...)
See Also
csem()
, cSEMResults, testMICOM()
cSEMTestOMF
method for print()
Description
The cSEMTestOMF
method for the generic function print()
.
Usage
## S3 method for class 'cSEMTestOMF'
print(x, ...)
See Also
csem()
, cSEMResults, testOMF()
cSEMVerify
method for print()
Description
The cSEMVerify
method for the generic function print()
.
Usage
## S3 method for class 'cSEMVerify'
print(x, ...)
See Also
Internal: Process data
Description
Prepare, standardize, check, and clean data provided via the .data
argument.
Usage
processData(
.data = NULL,
.model = NULL,
.instruments = NULL
)
Arguments
.data |
A |
.model |
A model in lavaan model syntax or a cSEMModel list. |
.instruments |
A named list of vectors of instruments. The names
of the list elements are the names of the dependent (LHS) constructs of the structural
equation whose explanatory variables are endogenous. The vectors
contain the names of the instruments corresponding to each equation. Note
that exogenous variables of a given equation must be supplied as
instruments for themselves. Defaults to |
Value
A (N x K) data.frame containing the standardized data with columns ordered
according to the order they appear in the measurement model equations provided
via the .model
argument.
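As a rough illustration of the return value described above, the following base-R sketch standardizes a small data set and orders its columns as the indicators appear in a hypothetical measurement model. The toy data and the indicator order are assumptions made for illustration; this is not the internal implementation of processData().

```r
# Toy data; the column order differs from the order in the measurement model
set.seed(1)
dat <- data.frame(x3 = rnorm(10), x1 = rnorm(10), x2 = rnorm(10))

# Assumed order of appearance in the measurement model equations
indicator_order <- c("x1", "x2", "x3")

# Standardize (mean 0, unit variance) and reorder the columns
X <- as.data.frame(scale(dat[, indicator_order]))

colnames(X)
round(colMeans(X), 10)
apply(X, 2, sd)
```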
Reliability
Description
Compute several reliability estimates. See the Reliability section of the cSEM website for details.
Usage
calculateRhoC(
.object = NULL,
.model_implied = TRUE,
.only_common_factors = TRUE,
.weighted = FALSE
)
calculateRhoT(
.object = NULL,
.alpha = 0.05,
.closed_form_ci = FALSE,
.only_common_factors = TRUE,
.output_type = c("vector", "data.frame"),
.weighted = FALSE,
...
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.model_implied |
Logical. Should weights be scaled using the model-implied
indicator correlation matrix? Defaults to |
.only_common_factors |
Logical. Should only concepts modeled as common
factors be included when calculating one of the following quality criteria:
AVE, the Fornell-Larcker criterion, HTMT, and all reliability estimates.
Defaults to |
.weighted |
Logical. Should estimation be based on a score that uses
the weights of the weight approach used to obtain |
.alpha |
An integer or a numeric vector of significance levels.
Defaults to |
.closed_form_ci |
Logical. Should a closed-form confidence interval be computed?
Defaults to |
.output_type |
Character string. The type of output. One of "vector" or "data.frame". Defaults to "vector". |
... |
Ignored. |
Details
Since reliability is defined with respect to a classical true score measurement
model, only concepts modeled as common factors are considered by default.
For concepts modeled as composites, reliability may be estimated by setting
.only_common_factors = FALSE
. However, it is unclear how to
interpret reliability in this case.
Reliability is traditionally based on a test score (proxy) based on unit weights.
To compute congeneric and tau-equivalent reliability based on a score that
uses the weights of the weight approach used to obtain .object
use .weighted = TRUE
instead.
For the tau-equivalent reliability ("rho_T
" or "cronbachs_alpha
") a closed-form
confidence interval may be computed (Trinchera et al. 2018) by setting
.closed_form_ci = TRUE
(default is FALSE
). If .alpha
is a vector
several CIs are returned.
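To make the two reliability notions more concrete, the base-R sketch below computes the textbook formulas for tau-equivalent reliability (Cronbach's alpha) and congeneric reliability for a single construct with standardized indicators and unit weights. The correlation matrix and the loadings are made-up values, and the formulas are the standard ones from the literature; the sketch does not claim to reproduce what calculateRhoT() or calculateRhoC() do internally.

```r
# Assumed (3 x 3) indicator correlation matrix of one construct
S <- matrix(0.5, nrow = 3, ncol = 3)
diag(S) <- 1
K <- nrow(S)

# Tau-equivalent reliability (Cronbach's alpha) of the unit-weighted sum score
rho_T <- K / (K - 1) * (1 - sum(diag(S)) / sum(S))

# Congeneric reliability for standardized indicators with assumed loadings
lambda <- rep(sqrt(0.5), K)
rho_C <- sum(lambda)^2 / (sum(lambda)^2 + sum(1 - lambda^2))

c(rho_T = rho_T, rho_C = rho_C)
```

Because all loadings are equal in this toy setup, the two estimates coincide; with unequal loadings they generally differ.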
Value
For calculateRhoC()
and calculateRhoT()
(if .output_type = "vector"
)
a named numeric vector containing the reliability estimates.
If .output_type = "data.frame"
calculateRhoT()
returns a data.frame
with as many rows as there are
constructs modeled as common factors in the model (unless
.only_common_factors = FALSE
in which case the number of rows equals the
total number of constructs in the model). The first column contains the name of the construct.
The second column the reliability estimate.
If .closed_form_ci = TRUE
the remaining columns contain lower and upper bounds
for the (1 - .alpha
) confidence interval(s).
Functions
- calculateRhoC(): Calculate the congeneric reliability
- calculateRhoT(): Calculate the tau-equivalent reliability
References
Trinchera L, Marie N, Marcoulides GA (2018). “A Distribution Free Interval Estimate for Coefficient Alpha.” Structural Equation Modeling: A Multidisciplinary Journal, 25(6), 876–887. doi:10.1080/10705511.2018.1431544.
See Also
Resample data
Description
Resample data from a data set using common resampling methods.
For bootstrap or jackknife resampling, package users usually do not need to
call this function but directly use resamplecSEMResults()
instead.
Usage
resampleData(
.object = NULL,
.data = NULL,
.resample_method = c("bootstrap", "jackknife", "permutation",
"cross-validation"),
.cv_folds = 10,
.id = NULL,
.R = 499,
.seed = NULL
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.data |
A |
.resample_method |
Character string. The resampling method to use. One of: "bootstrap", "jackknife", "permutation", or "cross-validation". Defaults to "bootstrap". |
.cv_folds |
Integer. The number of cross-validation folds to use. Setting
|
.id |
Character string or integer. A character string giving the name or
an integer of the position of the column of |
.R |
Integer. The number of bootstrap runs, permutation runs
or cross-validation repetitions to use. Defaults to |
.seed |
Integer or |
Details
The function resampleData()
is general purpose. It simply resamples data
from a data set according to the resampling method provided
via the .resample_method
argument and returns a list of resamples.
Currently, bootstrap
, jackknife
, permutation
, and cross-validation
(both leave-one-out (LOOCV) and k-fold cross-validation) are implemented.
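For readers unfamiliar with these resampling schemes, the following base-R sketch shows what a list of bootstrap and jackknife resamples looks like for ungrouped data. The toy data and the object names are assumptions for illustration only; resampleData() may differ in details such as seed handling.

```r
set.seed(1)
dat <- data.frame(x1 = rnorm(20), x2 = rnorm(20))
N <- nrow(dat)
R <- 5

# Bootstrap: R resamples, each drawing N rows with replacement
boot <- lapply(seq_len(R), function(r) {
  dat[sample(N, size = N, replace = TRUE), , drop = FALSE]
})

# Jackknife: N resamples, each leaving out exactly one row
jack <- lapply(seq_len(N), function(i) dat[-i, , drop = FALSE])

length(boot)
length(jack)
```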
The user may provide the data set to resample either explicitly via the .data
argument or implicitly by providing a cSEMResults object to .object
in which case the original data used in the call that created the
cSEMResults object is used for resampling.
If both, a cSEMResults object and a data set via .data
are provided
the former is ignored.
As csem()
accepts a single data set, a list of data sets as well as data sets
that contain a column name used to split the data into groups,
the cSEMResults object may contain multiple data sets.
In this case, resampling is done by data set or group. Note that depending
on the number of data sets/groups provided this computation may be slower
as resampling will be repeated for each data set/group.
To split data provided via the .data
argument into groups, the column name or
the column index of the column containing the group levels to split the data
must be given to .id
. If data that contains grouping is taken from
a cSEMResults object, .id
is taken from the object information. Hence,
providing .id
is redundant in this case and therefore ignored.
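Resampling by group can be pictured as splitting the data on the grouping column and resampling within each group. The base-R sketch below illustrates the idea with a hypothetical column named group; it is not the code used by resampleData().

```r
set.seed(3)
dat <- data.frame(x1 = rnorm(12), group = rep(c("a", "b"), times = c(7, 5)))
R <- 4

# Split on the grouping column, then bootstrap within each group
by_group <- split(dat, dat$group)
boot_by_group <- lapply(by_group, function(d) {
  lapply(seq_len(R), function(r) {
    d[sample(nrow(d), replace = TRUE), , drop = FALSE]
  })
})

# One list element per group, each holding R resamples
lengths(boot_by_group)
```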
The number of bootstrap or permutation runs as well as the number of
cross-validation repetitions is given by .R
. The default is
499
but should be increased in real applications. See e.g.,
Hesterberg (2015), p.380 for recommendations concerning
the bootstrap. For jackknife .R
is ignored as it is based on the N leave-one-out data sets.
Choosing .resample_method = "permutation"
for ungrouped data causes an error
as permutation will simply reorder the observations which is usually not
meaningful. If a list of data is provided
each list element is assumed to represent the observations belonging to one
group. In this case, data is pooled and group adherence permuted.
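The pooling and permuting of group adherence described above can be sketched in base R as follows. The group names, the toy data, and the column name id are assumptions for illustration; the sketch only shows the principle and is not the package's internal code.

```r
set.seed(42)
groups <- list(
  male   = data.frame(x = rnorm(6)),
  female = data.frame(x = rnorm(4))
)

# Pool the observations and record the original group labels
pooled <- do.call(rbind, groups)
labels <- rep(names(groups), times = vapply(groups, nrow, integer(1)))

# Each permutation sample keeps the data fixed and shuffles the labels
R <- 3
perm <- lapply(seq_len(R), function(r) {
  data.frame(pooled, id = sample(labels))
})

table(perm[[1]]$id)
```

Group sizes are preserved in every permutation sample; only the assignment of observations to groups changes.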
For cross-validation the number of folds (k
) defaults to 10
. It may be
changed via the .cv_folds
argument. Setting k = 2
(not 1!) splits
the data into a single training and test data set. Setting k = N
(where N
is the
number of observations) produces leave-one-out cross-validation samples.
Note: 1.) At least 2 folds are required (k > 1
); 2.) k
cannot be larger than N
;
3.) If N/k
is not an integer, the last fold will have fewer observations.
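A k-fold split of this kind can be sketched in base R as shown next. The values of N and k are arbitrary, and the way the remainder is distributed across folds is an assumption of this sketch (here the later folds end up smaller); resampleData() may assign observations to folds differently.

```r
set.seed(7)
N <- 23
k <- 5

idx <- sample(N)                                   # shuffled row indices
fold_id <- sort(rep(seq_len(k), length.out = N))   # fold label per position
folds <- split(idx, fold_id)

lengths(folds)

# Training and test indices when the first fold is held out
test_idx  <- folds[[1]]
train_idx <- unlist(folds[-1], use.names = FALSE)
```

Setting k equal to N in this sketch yields folds of size one, i.e., leave-one-out cross-validation.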
Random number generation (RNG) uses the L'Ecuyer-CMRG RNG stream as implemented in the future.apply package (Bengtsson 2018). See ?future_lapply for details. By default a random seed is chosen.
Value
The structure of the output depends on the type of input and the resampling method:
- Bootstrap
If a matrix or data.frame without grouping variable is provided (i.e., .id = NULL), the result is a list of length .R (default 499). Each element of that list is a bootstrap (re)sample. If a grouping variable is specified or a list of data is provided (where each list element is assumed to contain data for one group), resampling is done by group. Hence, the result is a list of length equal to the number of groups with each list element containing .R bootstrap samples based on the N_g observations of group g.
- Jackknife
If a matrix or data.frame without grouping variable is provided (.id = NULL), the result is a list of length equal to the number of observations/rows (N) of the data set provided. Each element of that list is a jackknife (re)sample. If a grouping variable is specified or a list of data is provided (where each list element is assumed to contain data for one group), resampling is done by group. Hence, the result is a list of length equal to the number of group levels with each list element containing N jackknife samples based on the N_g observations of group g.
- Permutation
If a matrix or data.frame without grouping variable is provided an error is returned as permutation will simply reorder the observations. If a grouping variable is specified or a list of data is provided (where each list element is assumed to contain data of one group), group membership is permuted. Hence, the result is a list of length .R where each element of that list is a permutation (re)sample.
- Cross-validation
If a matrix or data.frame without grouping variable is provided a list of length .R is returned. Each list element contains a list containing the k splits/folds subsequently used as test and training data sets. If a grouping variable is specified or a list of data is provided (where each list element is assumed to contain data for one group), cross-validation is repeated .R times for each group. Hence, the result is a list of length equal to the number of groups, each containing .R list elements (the repetitions) which in turn contain the k splits/folds.
References
Bengtsson H (2018).
future.apply: Apply Function to Elements in Parallel using Futures.
R package version 1.0.1, https://CRAN.R-project.org/package=future.apply.
Hesterberg TC (2015).
“What Teachers Should Know About the Bootstrap: Resampling in the Undergraduate Statistics Curriculum.”
The American Statistician, 69(4), 371–386.
doi:10.1080/00031305.2015.1089789.
See Also
csem()
, cSEMResults, resamplecSEMResults()
Examples
# ===========================================================================
# Using the raw data
# ===========================================================================
### Bootstrap (default) -----------------------------------------------------
res_boot1 <- resampleData(.data = satisfaction)
str(res_boot1, max.level = 3, list.len = 3)
## To replicate a bootstrap draw use .seed:
res_boot1a <- resampleData(.data = satisfaction, .seed = 2364)
res_boot1b <- resampleData(.data = satisfaction, .seed = 2364)
identical(res_boot1a, res_boot1b) # TRUE
### Jackknife ---------------------------------------------------------------
res_jack <- resampleData(.data = satisfaction, .resample_method = "jackknife")
str(res_jack, max.level = 3, list.len = 3)
### Cross-validation --------------------------------------------------------
## Create dataset for illustration:
dat <- data.frame(
"x1" = rnorm(100),
"x2" = rnorm(100),
"group" = sample(c("male", "female"), size = 100, replace = TRUE),
stringsAsFactors = FALSE)
## 10-fold cross-validation (repeated 100 times)
cv_10a <- resampleData(.data = dat, .resample_method = "cross-validation",
.R = 100)
str(cv_10a, max.level = 3, list.len = 3)
# Cross-validation can be done by group if a group identifier is provided:
cv_10 <- resampleData(.data = dat, .resample_method = "cross-validation",
.id = "group", .R = 100)
## Leave-one-out-cross-validation (repeated 50 times)
cv_loocv <- resampleData(.data = dat[, -3],
.resample_method = "cross-validation",
.cv_folds = nrow(dat),
.R = 50)
str(cv_loocv, max.level = 2, list.len = 3)
### Permutation --------------------------------------------------------------
res_perm <- resampleData(.data = dat, .resample_method = "permutation",
.id = "group")
str(res_perm, max.level = 2, list.len = 3)
# Forgetting to set .id causes an error
## Not run:
res_perm <- resampleData(.data = dat, .resample_method = "permutation")
## End(Not run)
# ===========================================================================
# Using a cSEMResults object
# ===========================================================================
model <- "
# Structural model
QUAL ~ EXPE
EXPE ~ IMAG
SAT ~ IMAG + EXPE + QUAL + VAL
LOY ~ IMAG + SAT
VAL ~ EXPE + QUAL
# Measurement model
EXPE =~ expe1 + expe2 + expe3 + expe4 + expe5
IMAG =~ imag1 + imag2 + imag3 + imag4 + imag5
LOY =~ loy1 + loy2 + loy3 + loy4
QUAL =~ qual1 + qual2 + qual3 + qual4 + qual5
SAT =~ sat1 + sat2 + sat3 + sat4
VAL =~ val1 + val2 + val3 + val4
"
a <- csem(satisfaction, model)
# Create bootstrap and jackknife samples
res_boot <- resampleData(a, .resample_method = "bootstrap", .R = 499)
res_jack <- resampleData(a, .resample_method = "jackknife")
# Since `satisfaction` is the dataset used the following approaches yield
# identical results.
res_boot_data <- resampleData(.data = satisfaction, .seed = 2364)
res_boot_object <- resampleData(a, .seed = 2364)
identical(res_boot_data, res_boot_object) # TRUE
Resample cSEMResults
Description
Resample a cSEMResults object using bootstrap or jackknife resampling.
The function is called by csem()
if the user sets
csem(..., .resample_method = "bootstrap")
or
csem(..., .resample_method = "jackknife")
but may also be called directly.
Usage
resamplecSEMResults(
.object = NULL,
.resample_method = c("bootstrap", "jackknife"),
.resample_method2 = c("none", "bootstrap", "jackknife"),
.R = 499,
.R2 = 199,
.handle_inadmissibles = c("drop", "ignore", "replace"),
.user_funs = NULL,
.eval_plan = c("sequential", "multicore", "multisession"),
.force = FALSE,
.seed = NULL,
.sign_change_option = c("none","individual","individual_reestimate",
"construct_reestimate"),
...
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.resample_method |
Character string. The resampling method to use. One of: "bootstrap" or "jackknife". Defaults to "bootstrap". |
.resample_method2 |
Character string. The resampling method to use when resampling
from a resample. One of: "none", "bootstrap" or "jackknife". For
"bootstrap" the number of draws is provided via |
.R |
Integer. The number of bootstrap replications. Defaults to |
.R2 |
Integer. The number of bootstrap replications to use when
resampling from a resample. Defaults to |
.handle_inadmissibles |
Character string. How should inadmissible results
be treated? One of "drop", "ignore", or "replace". If "drop", all
replications/resamples yielding an inadmissible result will be dropped
(i.e. the number of results returned will potentially be less than |
.user_funs |
A function or a (named) list of functions to apply to every
resample. The functions must take |
.eval_plan |
Character string. The evaluation plan to use. One of "sequential", "multicore", or "multisession". In the two latter cases all available cores will be used. Defaults to "sequential". |
.force |
Logical. Should .object be resampled even if it contains resamples
already? Defaults to |
.seed |
Integer or |
.sign_change_option |
Character string. Which sign change option should be used to handle flipping signs when resampling? One of "none","individual", "individual_reestimate", "construct_reestimate". Defaults to "none". |
... |
Further arguments passed to functions supplied to |
Details
Given M
resamples (for bootstrap M = .R
and for jackknife M = N
, where
N
is the number of observations) based on the data used to compute the
cSEMResults object provided via .object
, resamplecSEMResults()
essentially calls
csem()
on each resample using the arguments of the original call (ignoring any arguments
related to resampling) and returns estimates for each of a subset of
practically useful resampled parameters/statistics computed by csem()
.
Currently, the following estimates are computed and returned by default based
on each resample: Path estimates, Loading estimates, Weight estimates.
In practical applications, users may need to resample a specific statistic (e.g.,
the heterotrait-monotrait ratio of correlations (HTMT) or differences between path
coefficients such as beta_1 - beta_2).
Such statistics may be provided by a function fun(.object, ...)
or a list of
such functions via the .user_funs
argument. The first argument of
these functions must always be .object
.
Internally, the function will be applied on each
resample to produce the desired statistic. Hence, arbitrarily complicated statistics
may be resampled as long as the body of the function draws on elements contained
in the cSEMResults object only. Output of fun(.object, ...)
should preferably
be a (named) vector but matrices are also accepted.
However, the output will be vectorized (columnwise) in this case.
See the examples section for details.
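A user-supplied function of the kind described above might look like the following sketch, which resamples the difference between two path coefficients. The reduced model, the function name path_diff, and the access path .object$Estimates$Path_estimates are assumptions made for this illustration; inspect your own cSEMResults object with str() before relying on them. The code is guarded so that it only runs if cSEM is installed.

```r
if (requireNamespace("cSEM", quietly = TRUE)) {
  library(cSEM)

  # Reduced model on the satisfaction data (illustrative only)
  model <- "
  # Structural model
  SAT ~ IMAG + EXPE
  # Measurement model
  EXPE =~ expe1 + expe2 + expe3
  IMAG =~ imag1 + imag2 + imag3
  SAT  =~ sat1 + sat2 + sat3
  "
  res <- csem(satisfaction, model)

  # Hypothetical user function: first argument must be .object
  path_diff <- function(.object, ...) {
    P <- .object$Estimates$Path_estimates   # assumed location of path matrix
    c(IMAG_minus_EXPE = P["SAT", "IMAG"] - P["SAT", "EXPE"])
  }

  # Apply the function to every bootstrap resample (small .R for speed)
  boot <- resamplecSEMResults(res, .R = 20, .seed = 1,
                              .user_funs = list(path_diff = path_diff))
}
```

Returning a named vector, as done here, keeps the resampled output easy to identify.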
Both resampling the original cSEMResults object (call it "first resample")
and resampling based on a resampled cSEMResults object (call it "second resample")
are supported. Choices for the former
are "bootstrap" and "jackknife". Resampling based on a resample is turned off
by default (.resample_method2 = "none"
) as this significantly
increases computation time (there are now M * M2
resamples to compute, where
M2
is .R2
or N
).
Resamples of a resample are required, e.g., for the studentized confidence
interval computed by the infer()
function. Typically, bootstrap resamples
are used in this case (Davison and Hinkley 1997).
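The reason a second level of resampling is needed for a studentized interval can be seen in the generic base-R sketch below, which uses the sample mean as the statistic. All names and numbers are illustrative assumptions, and the sketch follows the textbook bootstrap-t construction rather than the exact computations of infer().

```r
set.seed(2024)
x  <- rnorm(40, mean = 1)
R  <- 50   # first-level resamples
R2 <- 20   # second-level resamples per first-level resample

theta_hat <- mean(x)

# For each first-level resample, estimate the statistic and, via a second
# bootstrap, its standard error
first <- t(vapply(seq_len(R), function(r) {
  xb <- sample(x, replace = TRUE)
  inner <- vapply(seq_len(R2), function(r2) {
    mean(sample(xb, replace = TRUE))
  }, numeric(1))
  c(est = mean(xb), se = sd(inner))
}, c(est = 0, se = 0)))

# Studentized (bootstrap-t) 95% interval
se_hat <- sd(first[, "est"])
t_star <- (first[, "est"] - theta_hat) / first[, "se"]
q <- unname(quantile(t_star, c(0.975, 0.025)))
ci <- c(lower = theta_hat - q[1] * se_hat,
        upper = theta_hat - q[2] * se_hat)
ci
```

In total R * R2 = 1000 statistics are computed here, which is why double resampling is switched off by default.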
As csem()
accepts a single data set, a list of data sets as well as data sets
that contain a column name used to split the data into groups,
the cSEMResults object may contain multiple data sets.
In this case, resampling is done by data set or group. Note that depending
on the number of data sets/groups, the computation may be considerably
slower as resampling will be repeated for each data set/group. However, apart
from speed considerations users do not need to worry about the type of
input used to compute the cSEMResults object as resamplecSEMResults()
is able to deal with each case.
The number of bootstrap runs for the first and second run are given by .R
and .R2
.
The default is 499
for the first and 199
for the second run
but should be increased in real applications. See e.g.,
Hesterberg (2015), p.380,
Davison and Hinkley (1997), and
Efron and Hastie (2016) for recommendations.
For jackknife .R
and .R2
are ignored.
Resampling may produce inadmissible results (as checked by verify()
).
By default these results are dropped; however, users may choose to "ignore"
or "replace"
inadmissible results. In the latter case, resampling continues until
the necessary number of admissible results is reached.
The cSEM package supports (multi)processing via the future
framework (Bengtsson 2018). Users may simply choose an evaluation plan
via .eval_plan
and the package takes care of all the complicated backend
issues. Currently, users may choose between standard single-core/single-session
evaluation ("sequential"
) and multiprocessing ("multisession"
or "multicore"
). The future package
provides other options (e.g., "cluster"
or "remote"
), however, they probably
will not be needed in the context of the cSEM package as simulations usually
do not require high-performance clusters. Depending on the operating system, the future
package will manage to distribute tasks to multiple R sessions (Windows)
or multiple cores. Note that multiprocessing is not necessarily faster
when only a "small" number of replications is required as the overhead of
initializing new sessions or distributing tasks to different cores
will not immediately be compensated by the availability of multiple sessions/cores.
Random number generation (RNG) uses the L'Ecuyer-CMRG RNG stream as implemented in the
future.apply package (Bengtsson 2018).
It is independent of the evaluation plan. Hence, setting, e.g., .seed = 123
will
generate the same random numbers and replicates
for .eval_plan = "sequential"
, .eval_plan = "multisession"
, and .eval_plan = "multicore"
.
See ?future_lapply for details.
Value
The core structure is the same structure as that of .object
with
the following elements added:
- $Estimates_resamples: A list containing the .R resamples and the original estimates for each of the resampled quantities (Path_estimates, Loading_estimates, Weight_estimates, user defined functions). Each list element is a list containing elements $Resamples and $Original. $Resamples is a (.R x K) matrix with each row representing one resample for each of the K parameters/statistics. $Original contains the original estimates (vectorized by column if the output of the user-provided function is a matrix).
- $Information_resamples: A list containing additional information.
Use str(<.object>, list.len = 3)
on the resulting object for an overview.
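The shape of a Resamples/Original pair, including the columnwise vectorization of matrix-valued statistics, can be mimicked in base R as in the sketch below. The statistic (a correlation matrix) and all object names are assumptions for illustration; the sketch does not use or reproduce the package's internal data structures.

```r
set.seed(11)
dat <- data.frame(x1 = rnorm(30), x2 = rnorm(30))
R <- 6

# A statistic that returns a (2 x 2) matrix
stat_fun <- function(d) cor(d)

# as.vector() stacks the matrix column by column
original <- as.vector(stat_fun(dat))

# One row per resample, one column per (vectorized) statistic
resamples <- t(vapply(seq_len(R), function(r) {
  d <- dat[sample(nrow(dat), replace = TRUE), ]
  as.vector(stat_fun(d))
}, numeric(length(original))))

out <- list(Resamples = resamples, Original = original)
dim(out$Resamples)
```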
References
Bengtsson H (2018).
future: Unified Parallel and Distributed Processing in R for Everyone.
R package version 1.10.0, https://CRAN.R-project.org/package=future.
Bengtsson H (2018).
future.apply: Apply Function to Elements in Parallel using Futures.
R package version 1.0.1, https://CRAN.R-project.org/package=future.apply.
Davison AC, Hinkley DV (1997).
Bootstrap Methods and their Application.
Cambridge University Press.
doi:10.1017/cbo9780511802843.
Efron B, Hastie T (2016).
Computer Age Statistical Inference.
Cambridge University Pr.
ISBN 1107149894.
Hesterberg TC (2015).
“What Teachers Should Know About the Bootstrap: Resampling in the Undergraduate Statistics Curriculum.”
The American Statistician, 69(4), 371–386.
doi:10.1080/00031305.2015.1089789.
See Also
csem(), summarize()
, infer()
, cSEMResults
Examples
## Not run:
# Note: example not run as resampling is time consuming
# ===========================================================================
# Basic usage
# ===========================================================================
model <- "
# Structural model
QUAL ~ EXPE
EXPE ~ IMAG
SAT ~ IMAG + EXPE + QUAL + VAL
LOY ~ IMAG + SAT
VAL ~ EXPE + QUAL
# Measurement model
EXPE =~ expe1 + expe2 + expe3 + expe4 + expe5
IMAG =~ imag1 + imag2 + imag3 + imag4 + imag5
LOY =~ loy1 + loy2 + loy3 + loy4
QUAL =~ qual1 + qual2 + qual3 + qual4 + qual5
SAT =~ sat1 + sat2 + sat3 + sat4
VAL =~ val1 + val2 + val3 + val4
"
## Estimate the model without resampling
a <- csem(satisfaction, model)
## Bootstrap and jackknife estimation
boot <- resamplecSEMResults(a)
jack <- resamplecSEMResults(a, .resample_method = "jackknife")
## Alternatively use .resample_method in csem()
boot_csem <- csem(satisfaction, model, .resample_method = "bootstrap")
jack_csem <- csem(satisfaction, model, .resample_method = "jackknife")
# ===========================================================================
# Extended usage
# ===========================================================================
### Double resampling ------------------------------------------------------
# Some confidence intervals (e.g., the bias-corrected and accelerated CI)
# require double resampling. Use .resample_method2 for this.
boot1 <- resamplecSEMResults(
.object = a,
.resample_method = "bootstrap",
.R = 50,
.resample_method2 = "bootstrap",
.R2 = 20,
.seed = 1303
)
## Again, this is identical to using csem
boot1_csem <- csem(
.data = satisfaction,
.model = model,
.resample_method = "bootstrap",
.R = 50,
.resample_method2 = "bootstrap",
.R2 = 20,
.seed = 1303
)
identical(boot1, boot1_csem) # only true if .seed was set
### Inference ---------------------------------------------------------------
# To get inferential quantities such as the estimated standard error or
# the percentile confidence interval for each resampled quantity use
# postestimation function infer()
inference <- infer(boot1)
inference$Path_estimates$sd
inference$Path_estimates$CI_percentile
# As usual summarize() can be called directly
summarize(boot1)
# In the example above .R x .R2 = 50 x 20 = 1000. Multiprocessing will be
# faster on most systems here and is therefore recommended. Note that multiprocessing
# does not affect the random number generation
boot2 <- resamplecSEMResults(
.object = a,
.resample_method = "bootstrap",
.R = 50,
.resample_method2 = "bootstrap",
.R2 = 20,
.eval_plan = "multisession",
.seed = 1303
)
identical(boot1, boot2)
## End(Not run)
Data: satisfaction
Description
A data frame with 250 observations and 27 variables.
Variables from 1 to 27 refer to six latent concepts: IMAG=Image, EXPE=Expectations, QUAL=Quality, VAL=Value, SAT=Satisfaction, and LOY=Loyalty.
- imag1-imag5
Indicators attached to concept IMAG which is supposed to capture aspects such as the institution's reputation, trustworthiness, seriousness, solidness, and caring about customer.
- expe1-expe5
Indicators attached to concept EXPE which is supposed to capture aspects concerning products and services provided, customer service, providing solutions, and expectations for the overall quality.
- qual1-qual5
Indicators attached to concept QUAL which is supposed to capture aspects concerning reliability of products and services, the range of products and services, personal advice, and overall perceived quality.
- val1-val4
Indicators attached to concept VAL which is supposed to capture aspects related to beneficial services and products, valuable investments, quality relative to price, and price relative to quality.
- sat1-sat4
Indicators attached to concept SAT which is supposed to capture aspects concerning overall rating of satisfaction, fulfillment of expectations, satisfaction relative to other banks, and performance relative to customer's ideal bank.
- loy1-loy4
Indicators attached to concept LOY which is supposed to capture aspects concerning propensity to choose the same bank again, propensity to switch to other bank, intention to recommend the bank to friends, and the sense of loyalty.
Usage
satisfaction
Format
An object of class data.frame
with 250 rows and 27 columns.
Details
This dataset contains the variables from a customer satisfaction study of
a Spanish credit institution on 250 customers. The data is identical to
the dataset provided by the plspm package
but with the last column (gender
) removed. If you are looking for the original
dataset use the satisfaction_gender dataset.
Source
The plspm package (version 0.4.9). Original source according to plspm: "Laboratory of Information Analysis and Modeling (LIAM). Facultat d'Informatica de Barcelona, Universitat Politecnica de Catalunya".
Data: satisfaction including gender
Description
A data frame with 250 observations and 28 variables.
Variables from 1 to 27 refer to six latent concepts: IMAG=Image, EXPE=Expectations, QUAL=Quality, VAL=Value, SAT=Satisfaction, and LOY=Loyalty.
- imag1-imag5
Indicators attached to concept IMAG which is supposed to capture aspects such as the institution's reputation, trustworthiness, seriousness, solidness, and caring about customer.
- expe1-expe5
Indicators attached to concept EXPE which is supposed to capture aspects concerning products and services provided, customer service, providing solutions, and expectations for the overall quality.
- qual1-qual5
Indicators attached to concept QUAL which is supposed to capture aspects concerning reliability of products and services, the range of products and services, personal advice, and overall perceived quality.
- val1-val4
Indicators attached to concept VAL which is supposed to capture aspects related to beneficial services and products, valuable investments, quality relative to price, and price relative to quality.
- sat1-sat4
Indicators attached to concept SAT which is supposed to capture aspects concerning overall rating of satisfaction, fulfillment of expectations, satisfaction relative to other banks, and performance relative to customer's ideal bank.
- loy1-loy4
Indicators attached to concept LOY which is supposed to capture aspects concerning propensity to choose the same bank again, propensity to switch to other bank, intention to recommend the bank to friends, and the sense of loyalty.
- gender
The sex of the respondent.
Usage
satisfaction_gender
Format
An object of class data.frame
with 250 rows and 28 columns.
Details
This data set contains the variables from a customer satisfaction study of
a Spanish credit institution on 250 customers. The data is taken from the
plspm package. For convenience,
there is a version of the dataset with the last column (gender
) removed: satisfaction.
Source
The plspm package (version 0.4.9). Original source according to plspm: "Laboratory of Information Analysis and Modeling (LIAM). Facultat d'Informatica de Barcelona, Universitat Politecnica de Catalunya".
savePlot
Description
This function saves a given plot of a cSEMResults object to a specified file format.
Usage
savePlot(
.plot_object,
.filename,
.path = NULL)
Arguments
.plot_object |
Object returned by one of the following functions |
.filename |
Character string. The name of the file to save the plot to (supports 'pdf', 'png', 'svg', and 'dot' formats). |
.path |
Character string. Path of the directory to save the file to. Defaults to the current working directory. |
See Also
plot.cSEMResults_default()
plot.cSEMResults_multi()
plot.cSEMResults_2ndorder()
Internal: save_single_plot
Description
Helper function to save a single DiagrammeR plot based on the file extension.
diagrammer_obj
DiagrammeR plot object to be saved.
out_file
The name of the file to save the plot to (supports 'pdf', 'png', 'svg', and 'dot' formats).
Usage
save_single_plot(
diagrammer_obj,
out_file,
.path = NULL)
Arguments
.path |
Character string. Path of the directory to save the file to. Defaults
to |
Value
NULL.
Internal: Scale weights
Description
Scale weights such that the formed composite has unit variance.
Usage
scaleWeights(
.S = args_default()$.S,
.W = args_default()$.W
)
Arguments
.S |
The (K x K) empirical indicator correlation matrix. |
.W |
A (J x K) matrix of weights. |
Value
The (J x K) matrix of scaled weights.
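The scaling can be illustrated with a minimal base-R sketch. It assumes that the variance of composite j is w_j' S w_j, so each row of the weight matrix is divided by the square root of that quantity. The helper name scale_weights_sketch and the toy matrices are hypothetical and are not the package's internal code.

```r
# Hypothetical helper (not the package's internal scaleWeights()):
# rescale each row of W so that the resulting composite has unit variance.
scale_weights_sketch <- function(S, W) {
  # Variance of each composite: diagonal of W S W'
  composite_var <- diag(W %*% S %*% t(W))
  # Recycling divides row j of W by the standard deviation of composite j
  W / sqrt(composite_var)
}

# Toy (K x K) indicator correlation matrix with K = 4 indicators
S <- matrix(0.5, nrow = 4, ncol = 4)
diag(S) <- 1

# Toy (J x K) weight matrix with J = 2 constructs (block structure)
W <- rbind(eta1 = c(0.3, 0.9, 0, 0),
           eta2 = c(0, 0, 2, 1))

W_scaled <- scale_weights_sketch(S, W)

# Variances of the composites formed with the scaled weights
composite_variances <- diag(W_scaled %*% S %*% t(W_scaled))
composite_variances
```

After scaling, both composite variances equal one up to floating-point error.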
Internal: secondOrderMeasurementEdges
Description
Build measurement edges for a second-order model.
Usage
secondOrderMeasurementEdges(
construct,
weights_first,
loadings_first,
weight_p_first,
loading_p_first,
weights_second,
loadings_second,
weight_p_second,
loading_p_second,
plot_signif,
plot_labels,
constructTypes,
only_second_stage = FALSE
)
Arguments
plot_labels |
Logical. Whether to display edge labels. Defaults to TRUE. |
Value
Character string.
Internal: Set the dominant indicator
Description
Set the dominant indicator for each construct. Since the signs of the weights, and thus of the loadings, are often not determined, a dominant indicator can be chosen per block. The signs of the weights are then chosen such that the correlation between the dominant indicator and the composite is positive.
Usage
setDominantIndicator(
.W = args_default()$.W,
.dominant_indicators = args_default()$.dominant_indicators,
.S = args_default()$.S
)
Arguments
.W |
A (J x K) matrix of weights. |
.dominant_indicators |
A character vector of |
.S |
The (K x K) empirical indicator correlation matrix. |
Value
The (J x K) matrix of weights with the dominant indicator set.
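The sign rule described above can be sketched in base R as follows. The sketch assumes that the dominant indicators are supplied as a named character vector (construct name = indicator name) and that S and W carry matching dimnames; the helper set_dominant_sketch is hypothetical and only illustrates the idea.

```r
# Hypothetical helper (not the package's internal setDominantIndicator()):
# flip the sign of a weight vector if its composite correlates negatively
# with the chosen dominant indicator.
set_dominant_sketch <- function(W, dominant, S) {
  for (j in names(dominant)) {
    # Covariance between the dominant indicator and composite j;
    # its sign equals the sign of the corresponding correlation.
    cov_dj <- sum(S[dominant[[j]], ] * W[j, ])
    if (cov_dj < 0) {
      W[j, ] <- -W[j, ]
    }
  }
  W
}

indicators <- c("x1", "x2", "x3", "x4")
S <- matrix(0.5, nrow = 4, ncol = 4, dimnames = list(indicators, indicators))
diag(S) <- 1

W <- rbind(eta1 = c(-0.6, -0.6, 0, 0),
           eta2 = c(0, 0, 0.6, 0.6))
colnames(W) <- indicators

W_signed <- set_dominant_sketch(W, c(eta1 = "x1", eta2 = "x3"), S)
W_signed
```

In this toy setup the weights of eta1 are flipped because the composite correlates negatively with x1, while the weights of eta2 are left unchanged.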
Internal: Set starting values
Description
Set the starting values.
Usage
setStartingValues(
.W = args_default()$.W,
.starting_values = args_default()$.starting_values
)
Arguments
.W |
A (J x K) matrix of weights. |
.starting_values |
A named list of vectors where the
list names are the construct names whose indicator weights the user
wishes to set. The vectors must be named vectors of |
Value
The (J x K) matrix of starting values.
Internal: get structured cSEMTestMGD results
Description
Convenience function to summarize the results of all tests resulting from a call to testMGD() in a user-friendly way.
Usage
structureTestMGDDecisions(.object)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
Value
A data.frame.
Summarize model
Description
Usage
summarize(
.object = NULL,
.alpha = 0.05,
.ci = NULL,
...
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.alpha |
An integer or a numeric vector of significance levels.
Defaults to |
.ci |
A vector of character strings naming the confidence interval to compute.
For possible choices see |
... |
Further arguments to |
Details
The summary is mainly focused on estimated parameters. For quality criteria such as the average variance extracted (AVE), reliability estimates, effect size estimates etc., use assess().
If .object contains resamples, standard errors, t-values and p-values (assuming estimates are standard normally distributed) are printed as well. By default the percentile confidence interval is given as well. For other confidence intervals use the .ci argument. See infer() for possible choices and a description.
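How such quantities follow from resamples can be sketched in base R. The numbers below are simulated stand-ins for bootstrap estimates of a single parameter; they are assumptions for illustration and do not come from a cSEMResults object.

```r
set.seed(1)

# Stand-ins: a point estimate and 500 simulated "bootstrap" estimates
estimate       <- 0.45
boot_estimates <- rnorm(500, mean = 0.45, sd = 0.08)

# Standard error: standard deviation of the resampled estimates
se <- sd(boot_estimates)

# t-value and two-sided p-value based on the standard normal distribution
t_value <- estimate / se
p_value <- 2 * pnorm(abs(t_value), lower.tail = FALSE)

# 95% percentile confidence interval
alpha <- 0.05
ci_percentile <- quantile(boot_estimates, probs = c(alpha / 2, 1 - alpha / 2))

round(c(se = se, t = t_value, p = p_value), 4)
ci_percentile
```

The percentile interval uses only the empirical quantiles of the resampled estimates, which is why it needs no distributional assumption beyond the resampling itself.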
Value
An object of class cSEMSummarize. A cSEMSummarize object has the same structure as the cSEMResults object with a couple of differences:
- Elements $Path_estimates, $Loading_estimates, $Weight_estimates, and $Residual_correlation are standardized data frames instead of matrices.
- Data frames $Effect_estimates, $Indicator_correlation, and $Exo_construct_correlation are added to $Estimates.
The data frame format is usually much more convenient if users intend to present the results in, e.g., a paper or a presentation.
See Also
csem, assess(), cSEMResults, exportToExcel()
Examples
## Take a look at the dataset
#?threecommonfactors
## Specify the (correct) model
model <- "
# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2
# (Reflective) measurement model
eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"
## Estimate
res <- csem(threecommonfactors, model, .resample_method = "bootstrap", .R = 40)
## Postestimation
res_summarize <- summarize(res)
res_summarize
# Extract e.g. the loadings
res_summarize$Estimates$Loading_estimates
## By default only the 95% percentile confidence interval is printed. Users
## can have several confidence intervals computed; however, only the first
## will be printed.
res_summarize <- summarize(res, .ci = c("CI_standard_t", "CI_percentile"),
.alpha = c(0.05, 0.01))
res_summarize
# Extract the path estimates including both confidence intervals
res_summarize$Estimates$Path_estimates
Perform a Cross-Validated Predictive Ability Test (CVPAT)
Description
Usage
testCVPAT(
.object1 = NULL,
.object2 = NULL,
.approach_predict = c("earliest", "direct"),
.seed = NULL,
.cv_folds = 10,
.handle_inadmissibles = c("stop", "ignore"),
.testtype = c("twosided", "onesided"))
Arguments
.object1 |
An R object of class cSEMResults resulting from a call to |
.object2 |
An R object of class cSEMResults resulting from a call to |
.approach_predict |
Character string. Which approach should be used to perform predictions? One of "earliest" and "direct". If "earliest", predictions for indicators associated to endogenous constructs are performed using only indicators associated to exogenous constructs. If "direct", predictions for indicators associated to endogenous constructs are based on indicators associated to their direct antecedents. Defaults to "earliest". |
.seed |
Integer or |
.cv_folds |
Integer. The number of cross-validation folds to use. Setting
|
.handle_inadmissibles |
Character string. How should inadmissible results
be treated? One of "drop", "ignore", or "replace". If "drop", all
replications/resamples yielding an inadmissible result will be dropped
(i.e. the number of results returned will potentially be less than |
.testtype |
Character string. One of "twosided" (H1: The models do not perform equally in predicting indicators belonging to endogenous constructs) and "onesided" (H1: Model 1 performs better in predicting indicators belonging to endogenous constructs than model 2). Defaults to "twosided". |
Details
Perform a Cross-Validated Predictive Ability Test (CVPAT) as described in Liengaard et al. (2020). The predictive performance of two models based on the same dataset is compared. In doing so, the average difference in prediction losses between the two models is tested.
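The core idea, testing the average loss difference between two models, can be sketched as a paired t-test in base R. The per-observation losses below are simulated, and the loss definition and standard error used inside testCVPAT() may differ; treat this as an illustration of the principle only.

```r
set.seed(123)
n <- 200

# Simulated per-observation out-of-sample losses (e.g., averaged squared
# prediction errors) for two competing models; model 1 is slightly better.
loss_m1 <- 0.8 * rchisq(n, df = 1)
loss_m2 <- 1.0 * rchisq(n, df = 1)

# Loss difference per observation; positive values favor model 1
d <- loss_m2 - loss_m1

# t-statistic for H0: the average loss difference is zero
t_stat <- mean(d) / (sd(d) / sqrt(n))

# Two-sided p-value (H1: the models do not perform equally)
p_two_sided <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)

# One-sided p-value (H1: model 1 has the smaller average loss)
p_one_sided <- pt(t_stat, df = n - 1, lower.tail = FALSE)

c(t = t_stat, p_two_sided = p_two_sided, p_one_sided = p_one_sided)
```

The same numbers are returned by stats::t.test(loss_m2, loss_m1, paired = TRUE), which can serve as a cross-check.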
Value
An object of class cSEMCVPAT with print and plot methods. Technically, cSEMCVPAT is a named list containing the following list elements:
- '$Information'
Additional information.
References
Liengaard BD, Sharma PN, Hult GTM, Jensen MB, Sarstedt M, Hair JF, Ringle CM (2020). “Prediction: Coveted, Yet Forsaken? Introducing a Cross-Validated Predictive Ability Test in Partial Least Squares Path Modeling.” Decision Sciences, 52(2), 362–392. doi:10.1111/deci.12445.
See Also
csem, cSEMResults, exportToExcel()
Examples
### Anime example taken from https://github.com/ISS-Analytics/pls-predict/
# Load data
data(Anime) # data is similar to the Anime.csv found on
# https://github.com/ISS-Analytics/pls-predict/ but with irrelevant
# columns removed
# Split into training and test data the same way as it is done on
# https://github.com/ISS-Analytics/pls-predict/
set.seed(123)
index <- sample.int(dim(Anime)[1], 83, replace = FALSE)
dat_train <- Anime[-index, ]
dat_test <- Anime[index, ]
# Specify model
model <- "
# Structural model
ApproachAvoidance ~ PerceivedVisualComplexity + Arousal
# Measurement/composite model
ApproachAvoidance =~ AA0 + AA1 + AA2 + AA3
PerceivedVisualComplexity <~ VX0 + VX1 + VX2 + VX3 + VX4
Arousal <~ Aro1 + Aro2 + Aro3 + Aro4
"
# Estimate (replicating the results of the `simplePLS()` function)
res <- csem(dat_train,
model,
.disattenuate = FALSE, # original PLS
.iter_max = 300,
.tolerance = 1e-07,
.PLS_weight_scheme_inner = "factorial"
)
# Predict using a user-supplied test data set
pp <- predict(res, .test_data = dat_test)
pp
### Compute prediction metrics ------------------------------------------------
res2 <- csem(Anime, # whole data set
model,
.disattenuate = FALSE, # original PLS
.iter_max = 300,
.tolerance = 1e-07,
.PLS_weight_scheme_inner = "factorial"
)
# Predict using 10-fold cross-validation
## Not run:
pp2 <- predict(res2, .benchmark = "lm")
pp2
## There is a plot method available
plot(pp2)
## End(Not run)
### Example using OrdPLScPredict -----------------------------------------------
# Transform the numerical indicators into factors
## Not run:
data("BergamiBagozzi2000")
data_new <- data.frame(cei1 = as.ordered(BergamiBagozzi2000$cei1),
cei2 = as.ordered(BergamiBagozzi2000$cei2),
cei3 = as.ordered(BergamiBagozzi2000$cei3),
cei4 = as.ordered(BergamiBagozzi2000$cei4),
cei5 = as.ordered(BergamiBagozzi2000$cei5),
cei6 = as.ordered(BergamiBagozzi2000$cei6),
cei7 = as.ordered(BergamiBagozzi2000$cei7),
cei8 = as.ordered(BergamiBagozzi2000$cei8),
ma1 = as.ordered(BergamiBagozzi2000$ma1),
ma2 = as.ordered(BergamiBagozzi2000$ma2),
ma3 = as.ordered(BergamiBagozzi2000$ma3),
ma4 = as.ordered(BergamiBagozzi2000$ma4),
ma5 = as.ordered(BergamiBagozzi2000$ma5),
ma6 = as.ordered(BergamiBagozzi2000$ma6),
orgcmt1 = as.ordered(BergamiBagozzi2000$orgcmt1),
orgcmt2 = as.ordered(BergamiBagozzi2000$orgcmt2),
orgcmt3 = as.ordered(BergamiBagozzi2000$orgcmt3),
orgcmt5 = as.ordered(BergamiBagozzi2000$orgcmt5),
orgcmt6 = as.ordered(BergamiBagozzi2000$orgcmt6),
orgcmt7 = as.ordered(BergamiBagozzi2000$orgcmt7),
orgcmt8 = as.ordered(BergamiBagozzi2000$orgcmt8))
model <- "
# Measurement models
OrgPres =~ cei1 + cei2 + cei3 + cei4 + cei5 + cei6 + cei7 + cei8
OrgIden =~ ma1 + ma2 + ma3 + ma4 + ma5 + ma6
AffJoy =~ orgcmt1 + orgcmt2 + orgcmt3 + orgcmt7
AffLove =~ orgcmt5 + orgcmt6 + orgcmt8
# Structural model
OrgIden ~ OrgPres
AffLove ~ OrgIden
AffJoy ~ OrgIden
"
# Estimate using cSEM; note: the fact that indicators are factors triggers OrdPLSc
res <- csem(.model = model, .data = data_new[1:250,])
summarize(res)
# Predict using OrdPLScPredict
set.seed(123)
pred <- predict(
.object = res,
.benchmark = "PLS-PM",
.test_data = data_new[(251):305,],
.treat_as_continuous = TRUE, .approach_score_target = "median"
)
pred
round(pred$Prediction_metrics[, -1], 4)
## End(Not run)
Regression-based Hausman test
Description
Usage
testHausman(
.object = NULL,
.eval_plan = c("sequential", "multicore", "multisession"),
.handle_inadmissibles = c("drop", "ignore", "replace"),
.R = 499,
.resample_method = c("bootstrap", "jackknife"),
.seed = NULL
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.eval_plan |
Character string. The evaluation plan to use. One of "sequential", "multicore", or "multisession". In the two latter cases all available cores will be used. Defaults to "sequential". |
.handle_inadmissibles |
Character string. How should inadmissible results
be treated? One of "drop", "ignore", or "replace". If "drop", all
replications/resamples yielding an inadmissible result will be dropped
(i.e. the number of results returned will potentially be less than |
.R |
Integer. The number of bootstrap replications. Defaults to |
.resample_method |
Character string. The resampling method to use. One of: "bootstrap" or "jackknife". Defaults to "bootstrap". |
.seed |
Integer or |
Details
Calculates the regression-based Hausman test to be used to compare OLS to 2SLS estimates or 2SLS to 3SLS estimates. See e.g., Wooldridge (2010) (pages 131 f.) for details.
The function is somewhat experimental. Only use if you know what you are doing.
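The textbook version of the regression-based test can be sketched in base R with simulated data: regress the suspected endogenous regressor on the instrument, then add the first-stage residuals to the structural regression and test their coefficient. This is only the classical form of the test for one regressor and one instrument; the variable names are hypothetical and testHausman() additionally relies on resampling, so its output will not match this sketch.

```r
set.seed(1)
n <- 1000

z <- rnorm(n)                        # instrument
u <- rnorm(n)                        # structural error
x <- 0.8 * z + 0.6 * u + rnorm(n)    # x is correlated with u, i.e. endogenous
y <- 0.5 * x + u                     # true coefficient of x is 0.5

# Step 1: first-stage regression of x on the instrument
first_stage <- lm(x ~ z)
v_hat <- resid(first_stage)

# Step 2: augmented regression; a significant coefficient on v_hat
# indicates that OLS and 2SLS estimates differ (endogeneity)
aux <- lm(y ~ x + v_hat)
hausman_p <- summary(aux)$coefficients["v_hat", "Pr(>|t|)"]

# The coefficient on x in the augmented regression equals the 2SLS estimate
coef_2sls <- unname(coef(aux)["x"])
coef_ols  <- unname(coef(lm(y ~ x))["x"])

c(p_value = hausman_p, ols = coef_ols, tsls = coef_2sls)
```

In this simulation the OLS estimate is biased upwards, while the coefficient on x from the augmented regression is close to the true value of 0.5.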
References
Wooldridge JM (2010). Econometric Analysis of Cross Section and Panel Data, 2 edition. MIT Press.
See Also
Examples
### Example from Dijkstra & Henseler (2015)
## Preparation (values are from p. 15-16 of the paper)
Lambda <- t(kronecker(diag(6), c(0.7, 0.7, 0.7)))
Phi <- matrix(c(1.0000, 0.5000, 0.5000, 0.5000, 0.0500, 0.4000,
0.5000, 1.0000, 0.5000, 0.5000, 0.5071, 0.6286,
0.5000, 0.5000, 1.0000, 0.5000, 0.2929, 0.7714,
0.5000, 0.5000, 0.5000, 1.0000, 0.2571, 0.6286,
0.0500, 0.5071, 0.2929, 0.2571, 1.0000, sqrt(0.5),
0.4000, 0.6286, 0.7714, 0.6286, sqrt(0.5), 1.0000),
ncol = 6)
## Create population indicator covariance matrix
Sigma <- t(Lambda) %*% Phi %*% Lambda
diag(Sigma) <- 1
dimnames(Sigma) <- list(paste0("x", rep(1:6, each = 3), 1:3),
paste0("x", rep(1:6, each = 3), 1:3))
## Generate data
dat <- MASS::mvrnorm(n = 500, mu = rep(0, 18), Sigma = Sigma, empirical = TRUE)
# empirical = TRUE to show that 2SLS is in fact able to recover the true population
# parameters.
## Model to estimate
model <- "
## Structural model (nonrecursive)
eta5 ~ eta6 + eta1 + eta2
eta6 ~ eta5 + eta3 + eta4
## Measurement model
eta1 =~ x11 + x12 + x13
eta2 =~ x21 + x22 + x23
eta3 =~ x31 + x32 + x33
eta4 =~ x41 + x42 + x43
eta5 =~ x51 + x52 + x53
eta6 =~ x61 + x62 + x63
"
library(cSEM)
## Estimate
res_ols <- csem(dat, .model = model, .approach_paths = "OLS")
sum_res_ols <- summarize(res_ols)
# Note: For the example the model-implied indicator correlation is irrelevant;
# the warnings can be ignored.
res_2sls <- csem(dat, .model = model, .approach_paths = "2SLS",
.instruments = list("eta5" = c('eta1','eta2','eta3','eta4'),
"eta6" = c('eta1','eta2','eta3','eta4')))
sum_res_2sls <- summarize(res_2sls)
# Note that exogenous constructs are supplied as instruments for themselves!
## Test for endogeneity
test_ha <- testHausman(res_2sls, .R = 200)
test_ha
Tests for multi-group comparisons
Description
Usage
testMGD(
.object = NULL,
.alpha = 0.05,
.approach_p_adjust = "none",
.approach_mgd = c("all", "Klesel", "Chin", "Sarstedt",
"Keil", "Nitzl", "Henseler", "CI_para","CI_overlap"),
.output_type = c("complete", "structured"),
.parameters_to_compare = NULL,
.eval_plan = c("sequential", "multicore", "multisession"),
.handle_inadmissibles = c("replace", "drop", "ignore"),
.R_permutation = 499,
.R_bootstrap = 499,
.saturated = FALSE,
.seed = NULL,
.type_ci = "CI_percentile",
.type_vcv = c("indicator", "construct"),
.verbose = TRUE
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.alpha |
An integer or a numeric vector of significance levels.
Defaults to |
.approach_p_adjust |
Character string or a vector of character strings.
Approach used to adjust the p-value for multiple testing.
See the |
.approach_mgd |
Character string or a vector of character strings. Approach used for the multi-group comparison. One of: "all", "Klesel", "Chin", "Sarstedt", "Keil", "Nitzl", "Henseler", "CI_para", or "CI_overlap". Defaults to "all" in which case all approaches are computed (if possible). |
.output_type |
Character string. The type of output to return. One of "complete" or "structured". See the Value section for details. Defaults to "complete". |
.parameters_to_compare |
A model in lavaan model syntax indicating which
parameters (i.e, path ( |
.eval_plan |
Character string. The evaluation plan to use. One of "sequential", "multicore", or "multisession". In the two latter cases all available cores will be used. Defaults to "sequential". |
.handle_inadmissibles |
Character string. How should inadmissible results
be treated? One of "drop", "ignore", or "replace". If "drop", all
replications/resamples yielding an inadmissible result will be dropped
(i.e. the number of results returned will potentially be less than |
.R_permutation |
Integer. The number of permutations. Defaults to |
.R_bootstrap |
Integer. The number of bootstrap runs. Ignored if |
.saturated |
Logical. Should a saturated structural model be used?
Defaults to |
.seed |
Integer or |
.type_ci |
Character string. Which confidence interval should be calculated?
For possible choices, see the |
.type_vcv |
Character string. Which model-implied correlation matrix should be calculated? One of "indicator" or "construct". Defaults to "indicator". |
.verbose |
Logical. Should information (e.g., progress bar) be printed
to the console? Defaults to |
Details
This function performs various tests proposed in the context of multigroup analysis.
The following tests are implemented:
- .approach_mgd = "Klesel": Approach suggested by Klesel et al. (2019)
The model-implied variance-covariance matrix (either indicator (.type_vcv = "indicator") or construct (.type_vcv = "construct")) is compared across groups. If the model-implied indicator or construct correlation matrix based on a saturated structural model should be compared, set .saturated = TRUE. To measure the distance between the model-implied variance-covariance matrices, the geodesic distance (dG) and the squared Euclidean distance (dL) are used. If more than two groups are compared, the average distance over all groups is used.
- .approach_mgd = "Sarstedt": Approach suggested by Sarstedt et al. (2011)
Groups are compared in terms of parameter differences across groups. Sarstedt et al. (2011) tests if parameter k is equal across all groups. If several parameters are tested simultaneously it is recommended to adjust the significance level or the p-values (in cSEM correction is done by p-value). By default no multiple testing correction is done, however, several common adjustments are available via .approach_p_adjust. See stats::p.adjust() for details. Note: the test has some severe shortcomings. Use with caution.
- .approach_mgd = "Chin": Approach suggested by Chin and Dibbern (2010)
Groups are compared in terms of parameter differences across groups. Chin and Dibbern (2010) tests if parameter k is equal between two groups. If more than two groups are tested for equality, parameter k is compared between all pairs of groups. In this case, it is recommended to adjust the significance level or the p-values (in cSEM correction is done by p-value) since this is essentially a multiple testing setup. If several parameters are tested simultaneously, correction is by group and number of parameters. By default no multiple testing correction is done, however, several common adjustments are available via .approach_p_adjust. See stats::p.adjust() for details.
- .approach_mgd = "Keil": Approach suggested by Keil et al. (2000)
Groups are compared in terms of parameter differences across groups. Keil et al. (2000) tests if parameter k is equal between two groups. It is assumed that the standard errors of the coefficients are equal across groups. The calculation of the standard error of the parameter difference is adjusted as proposed by Henseler et al. (2009). If more than two groups are tested for equality, parameter k is compared between all pairs of groups. In this case, it is recommended to adjust the significance level or the p-values (in cSEM correction is done by p-value) since this is essentially a multiple testing setup. If several parameters are tested simultaneously, correction is by group and number of parameters. By default no multiple testing correction is done, however, several common adjustments are available via .approach_p_adjust. See stats::p.adjust() for details.
- .approach_mgd = "Nitzl": Approach suggested by Nitzl (2010)
Groups are compared in terms of parameter differences across groups. Similarly to Keil et al. (2000), a single parameter k is tested for equality between two groups. In contrast to Keil et al. (2000), it is assumed that the standard errors of the coefficients are unequal across groups (Sarstedt et al. 2011). If more than two groups are tested for equality, parameter k is compared between all pairs of groups. In this case, it is recommended to adjust the significance level or the p-values (in cSEM correction is done by p-value) since this is essentially a multiple testing setup. If several parameters are tested simultaneously, correction is by group and number of parameters. By default no multiple testing correction is done, however, several common adjustments are available via .approach_p_adjust. See stats::p.adjust() for details.
- .approach_mgd = "Henseler": Approach suggested by Henseler (2007)
Groups are compared in terms of parameter differences across groups. In doing so, the bootstrap estimates of one parameter are compared across groups. In the literature, this approach is also known as PLS-MGA. Originally, this test was proposed as a one-sided test. In this function we perform a left-sided and a right-sided test to investigate whether a parameter differs across two groups. In doing so, the significance level is divided by 2 and compared to the p-value of the left-sided and the right-sided test. Moreover, .approach_p_adjust is ignored and no overall decision is returned. For a more detailed description, see also Henseler et al. (2009).
- .approach_mgd = "CI_para": Approach mentioned in Sarstedt et al. (2011)
This approach is based on the confidence intervals constructed around the parameter estimates of the two groups. If the parameter of one group falls within the confidence interval of the other group and/or vice versa, it can be concluded that there is no group difference. Since it is based on the confidence intervals .approach_p_adjust is ignored.
- .approach_mgd = "CI_overlap": Approach mentioned in Sarstedt et al. (2011)
This approach is based on the confidence intervals (CIs) constructed around the parameter estimates of the two groups. If the two CIs overlap, it can be concluded that there is no group difference. Since it is based on the confidence intervals .approach_p_adjust is ignored.
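The two confidence-interval-based decision rules can be sketched in base R. The estimates and interval bounds below are made-up numbers, and the "and/or" in the parameter-in-interval rule is read here as a logical "or"; both are assumptions for illustration rather than the package's implementation.

```r
# Made-up estimates and 95% confidence intervals (lower, upper) for one
# parameter in two groups
est_g1 <- 0.32; ci_g1 <- c(0.20, 0.45)
est_g2 <- 0.55; ci_g2 <- c(0.40, 0.70)

# Helper: does a value lie inside a confidence interval?
in_ci <- function(x, ci) x >= ci[1] && x <= ci[2]

# Rule 1 (parameter in the other group's interval): no group difference if
# the estimate of one group falls into the interval of the other group
no_diff_para <- in_ci(est_g1, ci_g2) || in_ci(est_g2, ci_g1)

# Rule 2 (overlapping intervals): no group difference if the intervals overlap
no_diff_overlap <- ci_g1[1] <= ci_g2[2] && ci_g2[1] <= ci_g1[2]

c(no_diff_para = no_diff_para, no_diff_overlap = no_diff_overlap)
```

With these numbers the first rule signals a group difference while the second does not, which shows that the overlap rule is the more conservative of the two.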
Use .approach_mgd to choose the approach. By default all approaches are computed (.approach_mgd = "all").
For convenience, two types of output are available. See the "Value" section below.
By default, approaches based on parameter differences across groups compare all parameters (.parameters_to_compare = NULL). To compare only a subset of parameters provide the parameters in lavaan model syntax just like the model to estimate. Take the simple model:
model_to_estimate <- "
# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2
# Each concept is measured by 3 indicators, i.e., modeled as latent variable
eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"
If only the path from eta1 to eta3 and the loadings of eta1 are to be compared across groups, write:
to_compare <- "
# Structural parameters to compare
eta3 ~ eta1
# Loadings to compare
eta1 =~ y11 + y12 + y13
"
Note that the "model" provided to .parameters_to_compare does not need to be an estimable model!
Note also that compared to all other functions in cSEM using the argument, .handle_inadmissibles defaults to "replace" to accommodate the Sarstedt et al. (2011) approach.
Argument .R_permutation is ignored for the "Nitzl" and the "Keil" approach. .R_bootstrap is ignored if .object already contains resamples, i.e. has class cSEMResults_resampled, and if only the "Klesel" or the "Chin" approach is used.
The argument .saturated is used by "Klesel" only. If .saturated = TRUE the original structural model is ignored and replaced by a saturated model, i.e. a model in which all constructs are allowed to correlate freely. This is useful to test differences in the measurement models between groups in isolation.
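For the "Klesel" approach, the two distance measures between model-implied matrices can be sketched in base R. The definitions below (one half times the sum of squared log eigenvalues of A^{-1}B for the geodesic distance, and one half times the sum of squared element-wise differences for the squared Euclidean distance) are a common way to state them; the exact scaling used inside the package is an assumption here, and the helper names are hypothetical.

```r
# Hypothetical helpers illustrating the two distance measures
geodesic_distance <- function(A, B) {
  # Eigenvalues of A^{-1} B; Re() guards against tiny imaginary parts
  ev <- Re(eigen(solve(A) %*% B, only.values = TRUE)$values)
  0.5 * sum(log(ev)^2)
}

squared_euclidean_distance <- function(A, B) {
  0.5 * sum((A - B)^2)
}

# Toy model-implied correlation matrices of two groups
A <- diag(2)
B <- matrix(c(1.0, 0.5,
              0.5, 1.0), nrow = 2)

dG <- geodesic_distance(A, B)
dL <- squared_euclidean_distance(A, B)
c(dG = dG, dL = dL)
```

Both distances are zero when the two matrices coincide and grow as the group-specific matrices diverge.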
Value
If .output_type = "complete" a list of class cSEMTestMGD. Technically, cSEMTestMGD is a named list containing the following list elements:
$Information
Additional information.
$Klesel
A list with elements Test_statistic, P_value, and Decision.
$Chin
A list with elements Test_statistic, P_value, Decision, and Decision_overall.
$Sarstedt
A list with elements Test_statistic, P_value, Decision, and Decision_overall.
$Keil
A list with elements Test_statistic, P_value, Decision, and Decision_overall.
$Nitzl
A list with elements Test_statistic, P_value, Decision, and Decision_overall.
$Henseler
A list with elements Test_statistic, P_value, Decision, and Decision_overall.
$CI_para
A list with elements Decision and Decision_overall.
$CI_overlap
A list with elements Decision and Decision_overall.
If .output_type = "structured" a tibble (data frame) with the following columns is returned.
Test
The name of the test.
Comparision
The parameter that was compared across groups. If "overall" the overall fit of the model was compared.
alpha%
The test decision for a given "alpha" level. If TRUE the null hypothesis was rejected; if FALSE it was not rejected.
p-value_correction
The p-value correction.
CI_type
Only for the "CI_para" and the "CI_overlap" test. Which confidence interval was used.
Distance_metric
Only for Test = "Klesel". Which distance metric was used.
References
Chin WW, Dibbern J (2010).
“An Introduction to a Permutation Based Procedure for Multi-Group PLS Analysis: Results of Tests of Differences on Simulated Data and a Cross Cultural Analysis of the Sourcing of Information System Services Between Germany and the USA.”
In Handbook of Partial Least Squares, 171–193.
Springer Berlin Heidelberg.
doi:10.1007/978-3-540-32827-8_8.
Henseler J (2007).
“A new and simple approach to multi-group analysis in partial least squares path modeling.”
In Martens H, Næs T (eds.), Proceedings of PLS'07 - The 5th International Symposium on PLS and Related Methods, 104–107.
PLS, Norway: Matforsk, As.
Henseler J, Ringle CM, Sinkovics RR (2009).
“The use of partial least squares path modeling in international marketing.”
Advances in International Marketing, 20, 277–320.
doi:10.1108/S1474-7979(2009)0000020014.
Keil M, Tan BC, Wei K, Saarinen T, Tuunainen V, Wassenaar A (2000).
“A cross-cultural study on escalation of commitment behavior in software projects.”
MIS Quarterly, 24(2), 299–325.
Klesel M, Schuberth F, Henseler J, Niehaves B (2019).
“A Test for Multigroup Comparison Using Partial Least Squares Path Modeling.”
Internet Research, 29(3), 464–477.
doi:10.1108/intr-11-2017-0418.
Nitzl C (2010).
“Eine anwenderorientierte Einfuehrung in die Partial Least Square (PLS)-Methode.”
In Arbeitspapier, number 21.
Universitaet Hamburg, Institut fuer Industrielles Management, Hamburg.
Sarstedt M, Henseler J, Ringle CM (2011).
“Multigroup Analysis in Partial Least Squares (PLS) Path Modeling: Alternative Methods and Empirical Results.”
In Advances in International Marketing, 195–218.
Emerald Group Publishing Limited.
doi:10.1108/s1474-7979(2011)0000022012.
See Also
csem(), cSEMResults, testMICOM(), testOMF()
Examples
## Not run:
# ===========================================================================
# Basic usage
# ===========================================================================
model <- "
# Structural model
QUAL ~ EXPE
EXPE ~ IMAG
SAT ~ IMAG + EXPE + QUAL + VAL
LOY ~ IMAG + SAT
VAL ~ EXPE + QUAL
# Measurement model
EXPE <~ expe1 + expe2 + expe3 + expe4 + expe5
IMAG <~ imag1 + imag2 + imag3 + imag4 + imag5
LOY =~ loy1 + loy2 + loy3 + loy4
QUAL =~ qual1 + qual2 + qual3 + qual4 + qual5
SAT <~ sat1 + sat2 + sat3 + sat4
VAL <~ val1 + val2 + val3 + val4
"
## Create list of virtually identical data sets
dat <- list(satisfaction[-3,], satisfaction[-5, ], satisfaction[-10, ])
out <- csem(dat, model, .resample_method = "bootstrap", .R = 40)
## Test
testMGD(out, .R_permutation = 40, .verbose = FALSE)
# Notes:
# 1. .R_permutation (and .R in the call to csem) is small to make examples run quicker;
# should be higher in real applications.
# 2. Tests will not reject their respective H0s since the groups are virtually
# identical.
# 3. Only exception is the approach suggested by Sarstedt et al. (2011), a
# sign that the test is unreliable.
# 4. As opposed to other functions involving the argument
# '.handle_inadmissibles', the default is "replace" as this is
# required by Sarstedt et al. (2011)'s approach.
# ===========================================================================
# Extended usage
# ===========================================================================
### Test only a subset ------------------------------------------------------
# By default all parameters are compared. Select a subset by providing a
# model in lavaan model syntax:
to_compare <- "
# Path coefficients
QUAL ~ EXPE
# Loadings
EXPE <~ expe1 + expe2 + expe3 + expe4 + expe5
"
## Test
testMGD(out, .parameters_to_compare = to_compare, .R_permutation = 20,
.R_bootstrap = 20, .verbose = FALSE)
### Different p_adjustments --------------------------------------------------
# To adjust p-values to accommodate multiple testing use .approach_p_adjust.
# The number of tests to use for adjusting depends on the approach chosen. For
# the Chin approach for example it is the number of parameters to test times the
# number of possible group comparisons. To compare the results for different
# adjustments, a vector of p-adjustments may be chosen.
## Test
testMGD(out, .parameters_to_compare = to_compare,
.approach_p_adjust = c("none", "bonferroni"),
.R_permutation = 20, .R_bootstrap = 20, .verbose = FALSE)
## End(Not run)
Test measurement invariance of composites
Description
Usage
testMICOM(
.object = NULL,
.approach_p_adjust = "none",
.handle_inadmissibles = c("drop", "ignore", "replace"),
.R = 499,
.seed = NULL,
.verbose = TRUE
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to |
.approach_p_adjust |
Character string or a vector of character strings.
Approach used to adjust the p-value for multiple testing.
See the |
.handle_inadmissibles |
Character string. How should inadmissible results
be treated? One of "drop", "ignore", or "replace". If "drop", all
replications/resamples yielding an inadmissible result will be dropped
(i.e. the number of results returned will potentially be less than |
.R |
Integer. The number of bootstrap replications. Defaults to |
.seed |
Integer or |
.verbose |
Logical. Should information (e.g., progress bar) be printed
to the console? Defaults to |
Details
The function performs the permutation-based test for measurement invariance of composites across groups proposed by Henseler et al. (2016). According to the authors, measurement invariance in composite models can be assessed by a three-step procedure. The first two steps involve an assessment of configural and compositional invariance. The third step involves mean and variance comparisons across groups. Assessment of configural invariance is qualitative in nature and hence not assessed by the testMICOM() function.
As testMICOM() requires at least two groups, .object must be of class cSEMResults_multi. As of version 0.2.0 of the package, testMICOM() does not support models containing second-order constructs.
It is possible to compare more than two groups, however, multiple-testing issues arise in this case. To adjust p-values in this case several p-value adjustments are available via the .approach_p_adjust argument.
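What such an adjustment does can be illustrated with stats::p.adjust() on a small set of made-up p-values from three pairwise group comparisons; the values and names are hypothetical.

```r
# Made-up raw p-values from three pairwise group comparisons
p_raw <- c(g1_vs_g2 = 0.012, g1_vs_g3 = 0.030, g2_vs_g3 = 0.200)

# Bonferroni: multiply each p-value by the number of tests (capped at 1)
p_bonferroni <- p.adjust(p_raw, method = "bonferroni")

# Holm: step-down procedure, uniformly less conservative than Bonferroni
p_holm <- p.adjust(p_raw, method = "holm")

rbind(raw = p_raw, bonferroni = p_bonferroni, holm = p_holm)
```

At the 5% level, only the first comparison remains significant after either adjustment, whereas two comparisons are significant based on the raw p-values.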
The remaining arguments set the number of permutation runs to conduct (.R), the random number seed (.seed), instructions how inadmissible results are to be handled (.handle_inadmissibles), and whether the function should be verbose in the sense that progress is printed to the console.
The number of permutation runs defaults to args_default()$.R for performance reasons. According to Henseler et al. (2016) the number of permutations should be at least 5000 for assessment to be sufficiently reliable.
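The principle behind a permutation-based comparison, as used for instance when comparing means across groups, can be sketched in base R. The scores below are simulated stand-ins for composite scores, the helper perm_test_mean is hypothetical, and the way testMICOM() forms its test statistics and decisions may differ from this generic two-sample permutation test.

```r
set.seed(42)

# Simulated stand-ins for composite scores in two groups
scores_g1 <- rnorm(100, mean = 0)
scores_g2 <- rnorm(100, mean = 1)

# Hypothetical helper: two-sided permutation test for equal means
perm_test_mean <- function(x, y, R = 999) {
  observed <- mean(x) - mean(y)
  pooled   <- c(x, y)
  n_x      <- length(x)
  perm_diffs <- replicate(R, {
    # Randomly reassign observations to the two groups
    idx <- sample(length(pooled), n_x)
    mean(pooled[idx]) - mean(pooled[-idx])
  })
  # Share of permutations at least as extreme as the observed difference,
  # counting the observed sample itself as one permutation
  (sum(abs(perm_diffs) >= abs(observed)) + 1) / (R + 1)
}

p_val <- perm_test_mean(scores_g1, scores_g2, R = 999)
p_val
```

With R permutation runs the smallest attainable p-value is 1 / (R + 1), which is one reason why a small number of runs limits the reliability of the assessment.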
Value
A named list of class cSEMTestMICOM containing the following list elements:
$Step2
A list containing the results of the test for compositional invariance (Step 2).
$Step3
A list containing the results of the test for mean and variance equality (Step 3).
$Information
A list of additional information on the test.
References
Henseler J, Ringle CM, Sarstedt M (2016). “Testing Measurement Invariance of Composites Using Partial Least Squares.” International Marketing Review, 33(3), 405–431. doi:10.1108/imr-09-2014-0304.
See Also
csem(), cSEMResults, testOMF(), testMGD()
Examples
## Not run:
# NOTE: To run the example, download and load the newest version of cSEM.DGP
# from GitHub using devtools::install_github("M-E-Rademaker/cSEM.DGP").
# Create two data generating processes (DGPs) that only differ in how the composite
# X is built. Hence, the two groups are not compositionally invariant.
dgp1 <- "
# Structural model
Y ~ 0.6*X
# Measurement model
Y =~ 1*y1
X <~ 0.4*x1 + 0.8*x2
x1 ~~ 0.3125*x2
"
dgp2 <- "
# Structural model
Y ~ 0.6*X
# Measurement model
Y =~ 1*y1
X <~ 0.8*x1 + 0.4*x2
x1 ~~ 0.3125*x2
"
g1 <- generateData(dgp1, .N = 399, .empirical = TRUE) # requires cSEM.DGP
g2 <- generateData(dgp2, .N = 200, .empirical = TRUE) # requires cSEM.DGP
# Model is the same for both DGPs
model <- "
# Structural model
Y ~ X
# Measurement model
Y =~ y1
X <~ x1 + x2
"
# Estimate
csem_results <- csem(.data = list("group1" = g1, "group2" = g2), model)
# Test
testMICOM(csem_results, .R = 50, .alpha = c(0.01, 0.05), .seed = 1987)
## End(Not run)
Test for overall model fit
Description
Usage
testOMF(
.object = NULL,
.alpha = 0.05,
.fit_measures = FALSE,
.handle_inadmissibles = c("drop", "ignore", "replace"),
.R = 499,
.saturated = FALSE,
.seed = NULL,
...
)
Arguments
.object |
An R object of class cSEMResults resulting from a call to csem(). |
.alpha |
An integer or a numeric vector of significance levels.
Defaults to 0.05. |
.fit_measures |
Logical. (EXPERIMENTAL) Should additional fit measures
be included? Defaults to FALSE. |
.handle_inadmissibles |
Character string. How should inadmissible results
be treated? One of "drop", "ignore", or "replace". If "drop", all
replications/resamples yielding an inadmissible result will be dropped
(i.e. the number of results returned will potentially be less than .R). |
.R |
Integer. The number of bootstrap replications. Defaults to 499. |
.saturated |
Logical. Should a saturated structural model be used?
Defaults to FALSE. |
.seed |
Integer or NULL. |
... |
Can be used to determine the fitting function used in the calculateGFI function. |
Details
Bootstrap-based test for overall model fit originally proposed by Beran and Srivastava (1985). See also Dijkstra and Henseler (2015) who first suggested the test in the context of PLS-PM.
By default, testOMF()
tests the null hypothesis that the population indicator
correlation matrix equals the population model-implied indicator correlation matrix.
Several discrepancy measures may be used. By default, testOMF()
uses four distance
measures to assess the distance between the sample indicator correlation matrix
and the estimated model-implied indicator correlation matrix, namely the geodesic distance,
the squared Euclidean distance, the standardized root mean square residual (SRMR),
and the distance based on the maximum likelihood fit function.
The reference distribution for each test statistic is obtained by
the bootstrap as proposed by Beran and Srivastava (1985).
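As a rough illustration of three of these distances, the following base R sketch evaluates them for two small, made-up correlation matrices. The helper names (d_L, d_G, srmr) and the matrices are assumptions for this example. The formulas are common textbook forms; the package's own functions (see calculateDL(), calculateDG() and calculateSRMR()) may differ in scaling or other details.

```r
# Hypothetical 2 x 2 correlation matrices (illustrative values only)
S     <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)  # "sample" correlations
Sigma <- matrix(c(1, 0.4, 0.4, 1), nrow = 2)  # "model-implied" correlations

# Squared Euclidean distance: half the sum of squared element-wise differences
d_L <- function(S, Sigma) 0.5 * sum((S - Sigma)^2)

# Geodesic distance: half the sum of squared log eigenvalues of S^-1 Sigma
d_G <- function(S, Sigma) {
  ev <- eigen(solve(S) %*% Sigma, only.values = TRUE)$values
  0.5 * sum(log(Re(ev))^2)
}

# SRMR: root mean square of the non-redundant residual correlations
srmr <- function(S, Sigma) {
  res <- (S - Sigma)[lower.tri(S, diag = TRUE)]
  sqrt(mean(res^2))
}

print(c(d_L = d_L(S, Sigma), d_G = d_G(S, Sigma), SRMR = srmr(S, Sigma)))
```

All three measures are zero when the two matrices coincide and grow as the matrices move apart, which is what makes them usable as test statistics.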
It is possible to perform the bootstrap-based test using fit measures such
as the CFI, RMSEA or the GFI if .fit_measures = TRUE
. This is experimental.
To the best of our knowledge the applicability and usefulness of the fit
measures for model fit assessment have not been formally (statistically)
assessed yet. Theoretically, the logic of the test applies to these fit indices as well.
Hence, their applicability is theoretically justified.
Only use if you know what you are doing.
If .saturated = TRUE
the original structural model is ignored and replaced by
a saturated model, i.e., a model in which all constructs are allowed to correlate freely.
This is useful to test misspecification of the measurement model in isolation.
Value
A list of class cSEMTestOMF
containing the following list elements:
$Test_statistic
The values of the test statistics.
$Critical_value
The corresponding critical values obtained by the bootstrap.
$Decision
The test decision. One of: FALSE (Reject) or TRUE (Do not reject).
$Information
The .R bootstrap values; the number of admissible results; the seed used and the number of total runs.
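How the test statistic, the critical value and the decision relate can be sketched with a toy bootstrap test in base R. Everything below is an assumption made for illustration: the simulated data, the helper names (mat_power, implied, d_L) and, in particular, the stand-in "model" whose implied correlation matrix is always the identity. It is not the package's implementation; it only mirrors the general idea of transforming the data so that the null hypothesis holds before resampling.

```r
set.seed(1)
# Hypothetical data: 200 observations on 3 uncorrelated indicators
X <- matrix(rnorm(200 * 3), ncol = 3)

# Power of a symmetric matrix via its eigendecomposition
mat_power <- function(M, p) {
  e <- eigen(M, symmetric = TRUE)
  e$vectors %*% diag(e$values^p, nrow = length(e$values)) %*% t(e$vectors)
}

# Toy "model": indicators are uncorrelated, so the implied correlation
# matrix is always the identity (stand-in for a real estimated model)
implied <- function(S) diag(ncol(S))

# Squared Euclidean distance used as the test statistic
d_L <- function(S, Sigma) 0.5 * sum((S - Sigma)^2)

S    <- cor(X)
stat <- d_L(S, implied(S))

# Transform the data so that the null hypothesis holds exactly
X_null <- scale(X) %*% mat_power(S, -0.5) %*% mat_power(implied(S), 0.5)

# Reference distribution: recompute the statistic on each resample
boot_stats <- replicate(199, {
  idx <- sample(nrow(X_null), replace = TRUE)
  S_b <- cor(X_null[idx, , drop = FALSE])
  d_L(S_b, implied(S_b))
})

# Critical value for alpha = 0.05 and the resulting decision
crit     <- quantile(boot_stats, probs = 0.95)
decision <- unname(stat < crit)  # TRUE = do not reject, FALSE = reject
print(list(Test_statistic = stat, Critical_value = crit, Decision = decision))
```

The decision follows the convention described above: TRUE means the observed distance lies below the bootstrap critical value, so the null hypothesis is not rejected.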
References
Beran R, Srivastava MS (1985).
“Bootstrap Tests and Confidence Regions for Functions of a Covariance Matrix.”
The Annals of Statistics, 13(1), 95–115.
doi:10.1214/aos/1176346579.
Dijkstra TK, Henseler J (2015).
“Consistent and Asymptotically Normal PLS Estimators for Linear Structural Equations.”
Computational Statistics & Data Analysis, 81, 10–23.
See Also
csem()
, calculateSRMR()
, calculateDG()
, calculateDL()
, cSEMResults,
testMICOM()
, testMGD()
, exportToExcel()
Examples
# ===========================================================================
# Basic usage
# ===========================================================================
model <- "
# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2
# (Reflective) measurement model
eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"
## Estimate
out <- csem(threecommonfactors, model, .approach_weights = "PLS-PM")
## Test
testOMF(out, .R = 50, .seed = 320)
Data: threecommonfactors
Description
A dataset containing 500 standardized observations on 9 indicators generated from a population model with three concepts modeled as common factors.
Usage
threecommonfactors
Format
A matrix with 500 rows and 9 variables:
- y11-y13
Indicators attached to the first common factor (eta1). Population loadings are: 0.7; 0.7; 0.7
- y21-y23
Indicators attached to the second common factor (eta2). Population loadings are: 0.5; 0.7; 0.8
- y31-y33
Indicators attached to the third common factor (eta3). Population loadings are: 0.8; 0.75; 0.7
The model is:
`eta2` = gamma1 * `eta1` + zeta1
`eta3` = gamma2 * `eta1` + beta * `eta2` + zeta2
with population values gamma1 = 0.6, gamma2 = 0.4 and beta = 0.35.
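The construct and indicator correlations implied by these population values can be worked out with a few lines of base R. This is a sketch under the assumption that all constructs and indicators are standardized (unit variance) and that measurement errors are uncorrelated; the object names Phi, Lambda and Sigma are chosen here for illustration.

```r
# Population path coefficients as stated above
gamma1 <- 0.6; gamma2 <- 0.4; beta <- 0.35

# Construct correlation matrix implied by the structural model
constructs <- c("eta1", "eta2", "eta3")
Phi <- diag(3)
dimnames(Phi) <- list(constructs, constructs)
Phi["eta1", "eta2"] <- Phi["eta2", "eta1"] <- gamma1
Phi["eta1", "eta3"] <- Phi["eta3", "eta1"] <- gamma2 + beta * gamma1
Phi["eta2", "eta3"] <- Phi["eta3", "eta2"] <- gamma2 * gamma1 + beta

# Loading matrix: each indicator loads on exactly one common factor
indicators <- c("y11", "y12", "y13", "y21", "y22", "y23", "y31", "y32", "y33")
Lambda <- matrix(0, nrow = 9, ncol = 3, dimnames = list(indicators, constructs))
Lambda[1:3, "eta1"] <- c(0.7, 0.7, 0.7)
Lambda[4:6, "eta2"] <- c(0.5, 0.7, 0.8)
Lambda[7:9, "eta3"] <- c(0.8, 0.75, 0.7)

# Implied indicator correlation matrix (unit variances on the diagonal)
Sigma <- Lambda %*% Phi %*% t(Lambda)
diag(Sigma) <- 1

print(round(Phi, 3))
print(round(Sigma["y11", c("y12", "y21", "y31")], 4))
```

Under these assumptions, the sample correlation matrix of threecommonfactors should be close to Sigma, up to sampling error.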
Examples
#============================================================================
# Correct model (the model used to generate the data)
#============================================================================
model_correct <- "
# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2
# Measurement model
eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"
a <- csem(threecommonfactors, model_correct)
## The overall model fit is evidently almost perfect:
testOMF(a, .R = 30) # .R = 30 to speed up the example
Verify admissibility
Description
Usage
verify(.object)
Arguments
.object |
An R object of class cSEMResults resulting from a call to csem(). |
Details
Verify admissibility of the results obtained using csem()
.
Results exhibiting one of the following defects are deemed inadmissible: non-convergence of the algorithm used to obtain weights, loadings and/or (congeneric) reliabilities larger than 1, a construct variance-covariance (VCV) and/or model-implied VCV matrix that is not positive semi-definite.
If .object
is of class cSEMResults_2ndorder
(i.e., estimates are
based on a model containing second-order constructs) both the first and the second stage are checked separately.
Currently, a model-implied indicator VCV matrix for nonlinear models is not
available. verify()
therefore skips the check for positive semi-definiteness of the
model-implied indicator VCV matrix for nonlinear models and returns "ok".
Value
A logical vector indicating which (if any) problem occurred.
A FALSE
indicates that the specific problem did not occur. For models containing second-order
constructs estimated by the two/three-stage approach, a list of two such vectors
(one for the first and one for the second stage) is returned. Status codes are:
1: The algorithm has converged.
2: All absolute standardized loading estimates are smaller than or equal to 1. A violation implies either a negative variance of the measurement error or a correlation larger than 1.
3: The construct VCV is positive semi-definite.
4: All reliability estimates are smaller than or equal to 1.
5: The model-implied indicator VCV is positive semi-definite. This is only checked for linear models (including models containing second-order constructs).
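The logic behind these status codes can be mimicked in base R. The function check_admissible() below is hypothetical and not part of the package; its inputs (a convergence flag, loadings, reliabilities and two VCV matrices) and the tolerance are assumptions made for this sketch. As in the return value described above, TRUE flags a problem.

```r
# Hypothetical helper mirroring the five checks; not the package's code
check_admissible <- function(converged, loadings, reliabilities,
                             construct_vcv, implied_vcv, tol = 1e-8) {
  # A symmetric matrix is positive semi-definite if no eigenvalue is negative
  is_psd <- function(M) {
    min(eigen(M, symmetric = TRUE, only.values = TRUE)$values) >= -tol
  }
  c(
    "1" = !converged,                  # algorithm did not converge
    "2" = any(abs(loadings) > 1),      # some absolute loading exceeds 1
    "3" = !is_psd(construct_vcv),      # construct VCV not PSD
    "4" = any(reliabilities > 1),      # some reliability exceeds 1
    "5" = !is_psd(implied_vcv)         # model-implied VCV not PSD
  )
}

# Made-up inputs; the same small matrix is reused for both VCV arguments
Phi <- matrix(c(1, 0.6, 0.6, 1), nrow = 2)
ok  <- check_admissible(TRUE, c(0.7, 0.8), c(0.75, 0.8), Phi, Phi)
bad <- check_admissible(TRUE, c(1.1, 0.7), c(0.75, 0.8), Phi, Phi)

print(any(ok))          # is there any problem in the first case?
print(names(bad)[bad])  # which checks failed in the second case?
```

A single any() over the returned vector is enough to decide whether a result is admissible; the names identify which check failed.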
See Also
csem()
, summarize()
, cSEMResults
Examples
### Without higher order constructs --------------------------------------------
model <- "
# Structural model
eta2 ~ eta1
eta3 ~ eta1 + eta2
# (Reflective) measurement model
eta1 =~ y11 + y12 + y13
eta2 =~ y21 + y22 + y23
eta3 =~ y31 + y32 + y33
"
# Estimate
out <- csem(threecommonfactors, model)
# Check admissibility
verify(out) # ok!
## Examine the structure of a cSEMVerify object
str(verify(out))
### With higher order constructs -----------------------------------------------
# If the model contains higher order constructs both the first and the second-
# stage estimates are checked for admissibility
## Not run:
require(cSEM.DGP) # download from https://m-e-rademaker.github.io/cSEM.DGP/
# Create DGP with 2nd order construct. Loading for indicator y51 is set to 1.1
# to produce a failing first stage model
dgp_2ndorder <- "
## Path model / Regressions
eta2 ~ 0.5*eta1
eta3 ~ 0.35*eta1 + 0.4*eta2
## Composite model
eta1 =~ 0.8*y41 + 0.6*y42 + 0.6*y43
eta2 =~ 1.1*y51 + 0.7*y52 + 0.7*y53
c1 =~ 0.8*y11 + 0.4*y12
c2 =~ 0.5*y21 + 0.3*y22
## Higher order composite
eta3 =~ 0.4*c1 + 0.4*c2
"
dat <- generateData(dgp_2ndorder) # requires the cSEM.DGP package
out <- csem(dat, .model = dgp_2ndorder)
verify(out) # not ok
## End(Not run)