Type: | Package |
Title: | Selection, Fusion, Smoothing and Principal Components Analysis for Ordinal Variables |
Version: | 1.1.0 |
Maintainer: | Aisouda Hoshiyar <aisouda.hoshiyar@hsu-hh.de> |
Description: | Selection, fusion, and/or smoothing of ordinally scaled independent variables using a group lasso, fused lasso or generalized ridge penalty, as well as non-linear principal components analysis for ordinal variables using a second-order difference/smoothing penalty. |
Depends: | grplasso, mgcv, RLRsim, quadprog, glmpath |
Imports: | ordinalNet |
Suggests: | utils, psy, knitr, rmarkdown, testthat (≥ 3.0.0) |
VignetteBuilder: | knitr |
License: | GPL-2 |
LazyLoad: | yes |
NeedsCompilation: | no |
Repository: | CRAN |
Config/testthat/edition: | 3 |
Packaged: | 2023-10-10 09:45:40 UTC; aisoudahoshiyar |
Author: | Jan Gertheiss [aut], Aisouda Hoshiyar [aut, cre], Fabian Scheipl [ctb] |
Date/Publication: | 2023-10-10 10:10:03 UTC |
Selection and/or Smoothing and Principal Components Analysis for Ordinal Variables
Description
Selection, and/or smoothing/fusing of ordinally scaled independent variables using a group lasso or generalized ridge penalty. Nonlinear principal components analysis for ordinal variables using a second-order difference penalty.
Details
Package: | ordPens |
Type: | Package |
Version: | 1.1.0 |
Date: | 2023-07-10 |
Depends: | grplasso, mgcv, RLRsim, quadprog, glmpath |
Imports: | ordinalNet |
Suggests: | psy |
License: | GPL-2 |
LazyLoad: | yes |
Smoothing and selection of ordinal predictors is done by the function
ordSelect
; smoothing only, by ordSmooth
; fusion and selection of ordinal predictors by ordFusion
. For
ANOVA with ordinal factors, use ordAOV
. Nonlinear PCA, performance evaluation and selection of an optimal
penalty parameter can be done using ordPCA
.
Author(s)
Authors: Jan Gertheiss jan.gertheiss@hsu-hh.de, Aisouda Hoshiyar aisouda.hoshiyar@hsu-hh.de.
Contributors: Fabian Scheipl
Maintainer: Aisouda Hoshiyar aisouda.hoshiyar@hsu-hh.de
References
Gertheiss, J. (2014). ANOVA for factors with ordered levels, Journal of Agricultural, Biological and Environmental Statistics, 19, 258-277.
Gertheiss, J., S. Hogger, C. Oberhauser and G. Tutz (2011). Selection of ordinally scaled independent variables with applications to international classification of functioning core sets. Journal of the Royal Statistical Society C (Applied Statistics), 60, 377-395.
Gertheiss, J. and F. Oehrlein (2011). Testing relevance and linearity of ordinal predictors, Electronic Journal of Statistics, 5, 1935-1959.
Gertheiss, J., F. Scheipl, T. Lauer, and H. Ehrhardt (2022). Statistical inference for ordinal predictors in generalized linear and additive models with application to bronchopulmonary dysplasia. BMC research notes, 15, 112.
Gertheiss, J. and G. Tutz (2009). Penalized regression with ordinal predictors. International Statistical Review, 77, 345-365.
Gertheiss, J. and G. Tutz (2010). Sparse modeling of categorial explanatory variables. The Annals of Applied Statistics, 4, 2150-2180.
Hoshiyar, A., H.A.L. Kiers, and J. Gertheiss (2021). Penalized non-linear principal components analysis for ordinal variables with an application to international classification of functioning core sets, British Journal of Mathematical and Statistical Psychology, 76, 353-371.
Hoshiyar, A., Gertheiss, L.H., and Gertheiss, J. (2023). Regularization and Model Selection for Item-on-Items Regression with Applications to Food Products' Survey Data. Preprint, available from https://arxiv.org/abs/2309.16373.
Tutz, G. and J. Gertheiss (2014). Rating scales as predictors – the old question of scale level and some answers. Psychometrica, 79, 357-376.
Tutz, G. and J. Gertheiss (2016). Regularized regression for categorical data. Statistical Modelling, 16, 161-200.
See Also
ordSelect
, ordSmooth
,
ordFusion
, ordAOV
, ordPCA
Examples
## Not run:
### smooth modeling of a simulated dataset
set.seed(123)
# generate (ordinal) predictors
x1 <- sample(1:8,100,replace=TRUE)
x2 <- sample(1:6,100,replace=TRUE)
x3 <- sample(1:7,100,replace=TRUE)
# the response
y <- -1 + log(x1) + sin(3*(x2-1)/pi) + rnorm(100)
# x matrix
x <- cbind(x1,x2,x3)
# lambda values
lambda <- c(1000,500,200,100,50,30,20,10,1)
# smooth modeling
o1 <- ordSmooth(x = x, y = y, lambda = lambda)
# results
round(o1$coef,digits=3)
plot(o1)
# If for a certain plot the x-axis should be annotated in a different way,
# this can (for example) be done as follows:
plot(o1, whx = 1, xlim = c(0,9), xaxt = "n")
axis(side = 1, at = c(1,8), labels = c("no agreement","total agreement"))
### nonlinear PCA on chronic widespread pain data
# load example data
data(ICFCoreSetCWP)
# adequate coding to get levels 1,..., max
H <- ICFCoreSetCWP[, 1:67] + matrix(c(rep(1, 50), rep(5, 16), 1),
nrow(ICFCoreSetCWP), 67,
byrow = TRUE)
# nonlinear PCA
ordPCA(H, p = 2, lambda = 0.5, maxit = 1000,
Ks = c(rep(5, 50), rep(9, 16), 5),
constr = c(rep(TRUE, 50), rep(FALSE, 16), TRUE))
# k-fold cross-validation
set.seed(1234)
lambda <- 10^seq(4,-4, by = -0.1)
cvResult1 <- ordPCA(H, p = 2, lambda = lambda, maxit = 100,
Ks = c(rep(5, 50), rep(9, 16), 5),
constr = c(rep(TRUE, 50), rep(FALSE, 16), TRUE),
CV = TRUE, k = 5)
# optimal lambda
lambda[which.max(apply(cvResult1$VAFtest,2,mean))]
## End(Not run)
ICF core set for chronic widespread pain
Description
The data set contains observed levels of ICF categories from the (comprehensive) ICF Core Set for chronic widespread pain (CWP) and a physical health component summary measure for n = 420 patients.
Usage
data(ICFCoreSetCWP)
Format
The data frame has 420 rows and 68 columns. The first 67 columns contain observed levels of ICF categories from the (comprehensive) ICF Core Set for chronic widespread pain (CWP). In the last column, the physical health component summary measure is given. Each row corresponds to one patient with CWP. ICF categories have discrete ordinal values between 0 and 4 (columns 1 - 50 and 67), or between -4 and 4 (columns 51 - 66). See the given references for details.
Details
The original data set contained some missing values, which have been imputed
using R package Amelia
.
The data were collected within the study Validation of ICF Core Sets for chronic conditions, which was a collaboration effort between the ICF Research Branch of the collaborating centers for the Family of International Classifications in German, the Classification, Terminology and standards Team from the World Health Organization and the International Society for Physical and Rehabilitation Medicine.
Special thanks go to the following participating study centers: Ankara University, Turkey; Azienda Ospedaliera di Sciacca, Italy; Donauspital, Vienna, Austria; Drei-Burgen-Klinik, Bad Muenster, Germany; Edertal Klinik, Bad Wildungen, Germany; Fachklinik Bad Bentheim, Germany; Hospital das Clinicas, School of Medicine, University of Sao Paulo, Brazil; Hospital San Juan Bautista, Catamarca, Argentina; Istituto Scientifico di Montescano, Italy; Istituto Scientifico di Veruno, Italy; Kaiser-Franz-Josef-Spital, Vienna, Austria; Klinik am Regenbogen, Nittenau, Germany; Klinik Bavaria Kreischa, Germany; Klinik Hoher Meissner, Bad Sooden-Allendorf, Germany; Klinikum Berchtesgadener Land, Schoenau, Germany; Kuwait Physical Medicine and Rehabilitation Society, Safat, Kuwait; National Institute for Medical Rehabilitation, Budapest, Hungary; Neuro-Orthopaedisches Krankenhaus und Zentrum fuer Rehabilitative Medizin Soltau, Germany; Praxis fuer Physikalische Medizin und Rehabilitation, Goettingen, Germany; Rehabilitationsklinik Seehof der Bundesversicherungsanstalt fuer Angestellte, Teltow, Germany; Rehaklinik Rheinfelden, Switzerland; Spanish Society of Rheumatology, Madrid, Spain; University Hospital Zurich, Switzerland; University of Santo Tomas, Quelonchy, Philippines.
Most special thanks go to all the patients participating in the study.
If you use the data, please cite the following two references.
References
Cieza, A., G. Stucki, M. Weigl, L. Kullmann, T. Stoll, L. Kamen, N. Kostanjsek, and N. Walsh (2004). ICF Core Sets for chronic widespread pain. Journal of Rehabilitation Medicine, Suppl. 44, 63-68.
Gertheiss, J., S. Hogger, C. Oberhauser and G. Tutz (2011). Selection of ordinally scaled independent variables with applications to international classification of functioning core sets. Journal of the Royal Statistical Society C (Applied Statistics), 60, 377-395.
Examples
# load the data
data(ICFCoreSetCWP)
# available variables
names(ICFCoreSetCWP)
# adequate coding of x matrix (using levels 1,2,...)
p <- ncol(ICFCoreSetCWP) - 1
n <- nrow(ICFCoreSetCWP)
add <- c(rep(1,50),rep(5,16),1)
add <- matrix(add,n,p,byrow=TRUE)
x <- ICFCoreSetCWP[,1:p] + add
# make sure that also a coefficient is fitted for levels
# that are not observed in the data
addrow <- c(rep(5,50),rep(9,16),5)
x <- rbind(x,addrow)
y <- c(ICFCoreSetCWP$phcs,NA)
# some lambda values
lambda <- c(600,500,400,300,200,100)
# smoothing and selection
modelICF <- ordSelect(x = x, y = y, lambda = lambda)
# results
plot(modelICF)
# plot a selected ICF category (e.g. e1101 'drugs')
# with adequate class labels
plot(modelICF, whx = 51, xaxt = "n")
axis(side = 1, at = 1:9, labels = -4:4)
Stability selection for ordinal-on-ordinal regression.
Description
This function performs stability selection for the cumulative logit model.
Usage
Stability.cumu(x, y, lambda, n_iter=100, type=c("selection", "fusion"), ...)
Arguments
x |
a vector or matrix of integers 1,2,... giving the observed levels
of the ordinal factor(s). If |
y |
the vector of response values. |
lambda |
vector of penalty parameters (in decreasing order). |
n_iter |
number of subsamples. Details below. |
type |
penalty to be applied. If "selection", group lasso penalty for smoothing and selection is used. If "fusion", a fused lasso penalty for fusiona dn selection is used. |
... |
additional arguments to |
Details
The method assumes that ordinal factor levels (contained in vector/columns of
matrix x
) take values 1,2,...,max, where max denotes the highest level
of the respective factor observed in the data. Every level between 1 and max has
to be observed at least once.
Instead of selecting/fitting one model, the data are pertubed/subsampled iter
times and we choose those variables that occur in a large fraction (pi
) of runs.
The stability path then shows the order of relevance of the predictors according to stability selection.
Value
Pi |
the matrix of estimated selection probabilities. Columns correspond to different lambda values, rows correspond to covariates. |
mSize |
matrix of size |
Author(s)
Aisouda Hoshiyar
References
Hoshiyar, A., Gertheiss, L.H., and Gertheiss, J. (2023). Regularization and Model Selection for Item-on-Items Regression with Applications to Food Products' Survey Data. Preprint, available from https://arxiv.org/abs/2309.16373.
Meinshausen, N. and Buehlmann, P. (2010). Stability selection, Journal of the Royal Statistical Society B (Statistical Methodology), 72, 417-473.
See Also
ANOVA for factors with ordered levels
Description
This function performs analysis of variance when the factor(s) of interest has/have ordinal scale level. For testing, values from the null distribution are simulated.
Usage
ordAOV(x, y, type = c("RLRT", "LRT"), nsim = 10000,
null.sample = NULL, ...)
Arguments
x |
a vector or matrix of integers 1,2,... giving the observed levels
of the ordinal factor(s). If |
y |
the vector of response values. |
type |
the type of test to carry out: likelihood ratio ("LRT") or restricted likelihood ratio ("RLRT"). |
nsim |
number of values to simulate from the null distribution. |
null.sample |
a vector, or a list of vectors (in case of multi-factorial
ANOVA) containing values already simulated from the null distribution
(overrides |
... |
additional arguments to |
Details
The method assumes that ordinal factor levels (contained in vector/columns of
matrix x
) take values 1,2,...,max, where max denotes the highest level
of the respective factor observed in the data. Every level between 1 and max has
to be observed at least once.
The method uses a mixed effects formulation of the usual one- or multi-factorial
ANOVA model (with main effects only) while penalizing (squared) differences of
adjacent means. Testing for equal means across factor levels is done by
(restricted) likelihood ratio testing for a zero variance component in a linear
mixed model. For simulating values from the finite sample null
distribution of the (restricted) likelihood ratio statistic, the
algorithms implemented in Package RLRsim
are used. See
LRTSim
and RLRTSim
for further information.
If x
is a vector (or one-column matrix), one-factorial ANOVA is applied,
and it is simulated from the exact finite sample null distribution as derived by
Crainiceanu & Ruppert (2004). If x
is a matrix, multi-factorial ANOVA
(with main effects only) is done, and the approximation of the finite sample null
distribution proposed by Greven et al. (2008) is used. Simulation
studies by Gertheiss (2014) suggest that for ANOVA with ordinal factors RLRT
should rather be used than LRT.
Value
In case of one-factorial ANOVA, a list of class htest
containing the
following components (see also exactLRT
and exactRLRT
):
statistic |
the observed (restricted) likelihood ratio. |
p |
p-value for the observed test statistic. |
method |
a character string indicating what type of test was performed and how many values were simulated to determine the critical value. |
sample |
the samples from the null distribution returned by
|
In case of multi-factorial ANOVA, a list (of lists) with the jth component giving the results above when testing the main effect of factor j.
Author(s)
Jan Gertheiss
References
Crainiceanu, C. and D. Ruppert (2004). Likelihood ratio tests in linear mixed models with one variance component, Journal of the Royal Statistical Society B, 66, 165-185.
Gertheiss, J. (2014). ANOVA for factors with ordered levels, Journal of Agricultural, Biological and Environmental Statistics, 19, 258-277.
Gertheiss, J. and F. Oehrlein (2011). Testing relevance and linearity of ordinal predictors, Electronic Journal of Statistics, 5, 1935-1959.
Greven, S., C. Crainiceanu, H. Kuechenhoff, and A. Peters (2008). Restricted likelihood ratio testing for zero variance components in linear mixed models, Journal of Computational and Graphical Statistics, 17, 870-891.
Scheipl, F., S. Greven, and H. Kuechenhoff (2008). Size and power of tests for a zero random effect variance or polynomial regression in additive and linear mixed models, Computational Statistics & Data Analysis, 52, 3283-3299.
See Also
LRTSim
, RLRTSim
Examples
# load some data
data(ICFCoreSetCWP)
# the pysical health component summary
y <- ICFCoreSetCWP$phcs
# consider the first ordinal factor
x <- ICFCoreSetCWP[,1]
# adequate coding
x <- as.integer(x - min(x) + 1)
# ANOVA
ordAOV(x, y, type = "RLRT", nsim=1000000)
Cross-validation for penalized regression with ordinal predictors.
Description
Performs k-fold cross-validation in order to evaluate the performance and/or select an optimal smoothing parameter of a penalized regression model with ordinal predictors.
Usage
ordCV(x, y, u = NULL, z = NULL, k=5, lambda, offset = rep(0,length(y)),
model = c("linear", "logit", "poisson", "cumulative"),
type=c("selection", "fusion"), ...)
Arguments
x |
matrix of integers 1,2,... giving the observed levels of the ordinal factor(s). |
y |
the vector of response values. |
u |
a matrix (or |
z |
a matrix (or |
k |
number of folds. |
lambda |
vector of penalty parameters (in decreasing order). |
offset |
vector of offset values. |
model |
the model which is to be fitted. Possible choices are "linear" (default), "logit", "poisson" or "cumulative". See details below. |
type |
penalty to be applied. If "selection", group lasso penalty for smoothing and selection is used. If "fusion", a fused lasso penalty for fusion and selection is used. |
... |
additional arguments to |
Details
The method assumes that categorical covariates (contained in x
and
u
) take values 1,2,...,max, where max denotes the (columnwise) highest
level observed in the data. If any level between 1 and max is not observed for an ordinal predictor,
a corresponding (dummy) coefficient is fitted anyway. If any level > max is
not observed but possible in principle, and a corresponding coefficient is to
be fitted, the easiest way is to add a corresponding row to x
(and
u
,z
) with corresponding y
value being NA
.
If a linear regression model is fitted, response vector y
may contain
any numeric values; if a logit model is fitted, y
has to be 0/1 coded;
if a poisson model is fitted, y
has to contain count data. If a cumulative logit model is fitted, y
takes values 1,2,...,max.
For the cumulative model, the measure of performance used by the function is the brier score, being the sum of squared differences between (indicator) outcome and predicted probabilities P(Y_i=r)=P(y_{ir})=\pi_{ir}
, with observations i=1,...,n
and classes r=1,...,c
. Otherwise, the deviance is used.
Value
Returns a list containing the following components:
Train |
matrix of size ( |
Test |
Brier/deviance score matrix when looking at the test data set. |
Author(s)
Aisouda Hoshiyar
References
Hoshiyar, A., Gertheiss, L.H., and Gertheiss, J. (2023). Regularization and Model Selection for Item-on-Items Regression with Applications to Food Products' Survey Data. Preprint, available from https://arxiv.org/abs/2309.16373.
See Also
Fusion and selection of dummy coefficients of ordinal predictors
Description
Fits dummy coefficients of ordinally scaled independent variables
with a fused lasso penalty on differences of adjacent dummy coefficients. Using the ordinalNet
algorithm if cumulative logit model is fitted, otherwise glmpath
algorithm is used.
Usage
ordFusion(x, y, u = NULL, z = NULL, offset = rep(0,length(y)), lambda,
model = c("linear", "logit", "poisson", "cumulative"),
restriction = c("refcat", "effect"), scalex = TRUE, nonpenx = NULL,
frac.arclength = NULL, ...)
Arguments
x |
the matrix of ordinal predictors, with each column corresponding to one predictor and containing numeric values from {1,2,...}; for each covariate, category 1 is taken as reference category with zero dummy coefficient. |
y |
the response vector. |
u |
a matrix (or |
z |
a matrix (or |
offset |
vector of offset values. |
lambda |
vector of penalty parameters, i.e., lambda values. |
model |
the model which is to be fitted. Possible choices are "linear" (default), "logit", "poisson" or "cumulative". See details below. |
restriction |
identifiability restriction for dummy coding. "reference" takes category 1 is as reference category (default), while with "effect" dummy coefficients sum up to 0 (known as effect coding). |
scalex |
logical. Should (split-coded) design matrix corresponding to
|
nonpenx |
vectors of indices indicating columns of
|
frac.arclength |
just in case the corresponding |
... |
additional arguments to |
Details
The method assumes that categorical covariates (contained in x
and
u
) take values 1,2,...,max, where max denotes the (columnwise) highest
level observed in the data. If any level between 1 and max is not observed for an ordinal predictor,
a corresponding (dummy) coefficient is fitted anyway (by linear interpolation, due to some additional but small quadratic penalty, see glmpath
for details). If any level > max is
not observed but possible in principle, and a corresponding coefficient is to
be fitted, the easiest way is to add a corresponding row to x
(and
u
,z
) with corresponding y
value being NA
.
If a linear regression model is fitted, response vector y
may contain
any numeric values; if a logit model is fitted, y
has to be 0/1 coded;
if a poisson model is fitted, y
has to contain count data. If a cumulative logit model is fitted, y
takes values 1,2,...,max.
If scalex
is TRUE
, (split-coded) design matrix constructed from x
is scaled to have
unit variance over columns (see standardize
argument of glmpath
or/and ordinalNet
).
Value
An ordPen
object, which is a list containing:
fitted |
the matrix of fitted response values of the training data.
Columns correspond to different |
coefficients |
the matrix of fitted coefficients with respect to dummy-coded (ordinal or nominal) categorical input variables (including the reference category) as well as metric predictors. Columns correspond to different lambda values. |
model |
the type of the fitted model: "linear", "logit", "poisson", or "cumulative". |
restriction |
the type of restriction used for identifiability. |
lambda |
the used lambda values. |
xlevels |
a vector giving the number of levels of the ordinal predictors. |
ulevels |
a vector giving the number of levels of the nominal predictors (if any). |
zcovars |
the number of metric covariates (if any). |
Author(s)
Jan Gertheiss, Aisouda Hoshiyar
References
Gertheiss, J. and G. Tutz (2010). Sparse modeling of categorial explanatory variables. The Annals of Applied Statistics, 4, 2150-2180.
Hoshiyar, A., Gertheiss, L.H., and Gertheiss, J. (2023). Regularization and Model Selection for Item-on-Items Regression with Applications to Food Products' Survey Data. Preprint, available from https://arxiv.org/abs/2309.16373.
Park, M.Y. and T. Hastie (2007). L1 regularization path algorithm for generalized linear models. Journal of the Royal Statistical Society B, 69, 659-677.
Tutz, G. and J. Gertheiss (2014). Rating scales as predictors – the old question of scale level and some answers. Psychometrica, 79, 357-376.
Tutz, G. and J. Gertheiss (2016). Regularized regression for categorical data. Statistical Modelling, 16, 161-200.
See Also
plot.ordPen
, predict.ordPen
,
ICFCoreSetCWP
Examples
# fusion and selection of ordinal covariates on a simulated dataset
set.seed(123)
# generate (ordinal) predictors
x1 <- sample(1:8,100,replace=TRUE)
x2 <- sample(1:6,100,replace=TRUE)
x3 <- sample(1:7,100,replace=TRUE)
# the response
y <- -1 + log(x1) + sin(3*(x2-1)/pi) + rnorm(100)
# x matrix
x <- cbind(x1,x2,x3)
# lambda values
lambda <- c(80,70,60,50,40,30,20,10,5,1)
# fusion and selection
ofu <- ordFusion(x = x, y = y, lambda = lambda)
# results
round(ofu$coef,digits=3)
plot(ofu)
# If for a certain plot the x-axis should be annotated in a different way,
# this can (for example) be done as follows:
plot(ofu, whx = 1, xlim = c(0,9), xaxt = "n")
axis(side = 1, at = c(1,8), labels = c("no agreement","total agreement"))
Testing for differentially expressed genes
Description
This function can be used to test for genes that are differentially expressed between levels of an ordinal factor, such as dose levels or ordinal phenotypes.
Usage
ordGene(xpr, lvs, type = c("RLRT", "LRT"), nsim = 1e6,
null.sample=NULL, ...)
Arguments
xpr |
a matrix or data frame of gene expression data with Probe IDs as row names. |
lvs |
a numeric vector containing the factor levels (e.g., dose levels)
corresponding to the columns of |
type |
the type of test to carry out: likelihood ratio ("LRT") or restricted likelihood ratio ("RLRT"). |
nsim |
number of values to simulate from the null distribution. |
null.sample |
a vector containing values already simulated from the null
distribution (overrides |
... |
additional arguments to |
Details
For each gene in the dataset, ordAOV
is applied to test for
differences between levels given in lvs
. See ordAOV
for
further information on the testing procedure. Simulation studies by Gertheiss (2014)
suggest that a restricted likelihood test (RLRT) should rather be used than
a likelihood ratio test (LRT).
In addition to (R)LRT, results of usual one-way ANOVA (not taking the factor's ordinal scale level into account) and a t-test assuming a linear trend across factor levels are reported. Note that the t-test does not assume linearity in the doses (such as 0, 0.5, 2.0, 5.0, ...), if given, but in the levels, i.e., 1, 2, 3, etc.
Value
A matrix containing the raw p-values for each gene (rows) when using (R)LRT, ANOVA or a t-test (columns).
Author(s)
Jan Gertheiss
References
Crainiceanu, C. and D. Ruppert (2004). Likelihood ratio tests in linear mixed models with one variance component, Journal of the Royal Statistical Society B, 66, 165-185.
Gertheiss, J. (2014). ANOVA for factors with ordered levels, Journal of Agricultural, Biological and Environmental Statistics, 19, 258-277.
Gertheiss, J. and F. Oehrlein (2011). Testing relevance and linearity of ordinal predictors, Electronic Journal of Statistics, 5, 1935-1959.
Sweeney, E., C. Crainiceanu, and J. Gertheiss (2015). Testing differentially expressed genes in dose-response studies and with ordinal phenotypes, Statistical Applications in Genetics and Molecular Biology, 15, 213-235.
See Also
Examples
## Not run:
# generate toy gene expression data
set.seed(321)
ni <- 5
n <- sum(5*ni)
xpr <- matrix(NA, ncol = n, nrow = 100)
mu_lin <- 3:7
mu_sq2 <- (-2:2)^2 * 0.5 + 3
a <- seq(0.75, 1.25, length.out = 10)
for(i in 1:10){
xpr[i,] <- a[i] * rep(mu_lin, each = ni) + rnorm(n)
xpr[i+10,] <- a[i] * rep(mu_sq2, each = ni) + rnorm(n)
}
for(i in 21:100) xpr[i,] <- 3 + rnorm(n)
dose <- rep(c(0,0.01,0.05,0.2,1.5), each = ni)
# continuous representation
oldpar <- par(mfrow = c(2,2))
plot(dose, xpr[4,], col = as.factor(dose), lwd = 2, ylab = "expression", main = "gene 4")
lines(sort(unique(dose)), mu_lin * a[4], lty = 1, col = 1)
plot(dose, xpr[14,], col = as.factor(dose), lwd = 2, ylab = "expression", main = "gene 14")
lines(sort(unique(dose)), mu_sq2 * a[4], lty = 1, col = 1)
# dose on ordinal scale
plot(1:length(sort(unique(dose))), ylim = range(xpr[4,]), pch = "", ylab = "expression",
xlab = "levels", xaxt="n")
axis(1, at = 1:length(sort(unique(dose))) )
points(as.factor(dose), xpr[4,], col=as.factor(dose), lwd = 2)
lines(1:length(sort(unique(dose))), mu_lin * a[4], lty = 1)
plot(1:length(sort(unique(dose))), ylim = range(xpr[14,]), pch = "", ylab = "expression",
xlab = "levels", xaxt="n")
axis(1, at = 1:length(sort(unique(dose))) )
points(as.factor(dose), xpr[14,], col=as.factor(dose), lwd = 2)
lines(1:length(sort(unique(dose))), mu_sq2 * a[4], lty = 1)
par(oldpar)
# calculate p-values
library(ordPens)
pvals <- ordGene(xpr = xpr, lvs = dose, nsim = 1e6)
# compare distribution of (small) p-values
plot(ecdf(pvals[,1]), xlim = c(0,0.05), ylim = c(0, 0.25),
main = "", xlab = "p-value", ylab = "F(p-value)")
plot(ecdf(pvals[,2]), xlim = c(0, 0.05), add = TRUE, col = 2)
plot(ecdf(pvals[,3]), xlim = c(0, 0.05), add = TRUE, col = 3)
legend('topleft', colnames(pvals), col = 1:3, lwd = 2, lty = 1)
## End(Not run)
Penalized nonlinear PCA for ordinal variables
Description
This function performs nonlinear principal components analysis when the variables of interest have ordinal level scale using a second-order difference penalty.
Usage
ordPCA(H, p, lambda = c(1), maxit = 100, crit = 1e-7, qstart = NULL,
Ks = apply(H,2,max), constr = rep(FALSE, ncol(H)), trace = FALSE,
CV = FALSE, k = 5, CVfit = FALSE)
Arguments
H |
a matrix or data frame of of integers 1,2,... giving the observed levels of the ordinal variables; provides the data for the principal components analysis. |
p |
the number of principal components to be extracted. |
lambda |
a numeric value or a vector (in decreasing order) defining the amount of shrinkage; defaults to 1. |
maxit |
the maximum number of iterations; defaults to 100. |
crit |
convergence tolerance; defaults to 1e-7. |
qstart |
optional list of quantifications for the initial linear PCA. |
Ks |
a vector containing the highest level of each variable. |
constr |
a logical vector specifying whether monotonicity constraints should be applied to the variables. |
trace |
logical; if |
CV |
a logical value indicating whether k-fold cross-validation should be performed in order to evaluate the performance and/or select an optimal smoothing parameter. |
k |
the number of folds to be specified; only if |
CVfit |
logical; to be specified only if |
Details
In order to respect the ordinal scale of the data, principal components analysis is not applied to data matrix H
itself, but to newly constructed variables by assigning numerical values – the quantifications – to the categories via penalized, optimal scaling/scoring.
The calculation is done by alternately cycling through data scoring and PCA until convergence.
The penalty parameter controls the amount of shrinkage: For lambda = 0
, purely nonlinear PCA via standard, optimal scaling is obtained. As lambda
becomes very large, the quantifications are shrunken towars linearity, i.e., usual PCA is applied to levels 1,2,... ignoring the ordinal scale level of the variables.
Note that optimization starts with the first component of lambda
. Thus, if lambda
is not in decreasing order, the vector will be sorted internally and so will be corresponding results.
In case of cross-validation, for each lambda
the proportion of variance accounted for (VAF) is given for both the training and test data (see below).
Value
A List with components:
qs |
a list of quantifications, if |
Q |
data matrix after scaling, if |
X |
matrix of factor values resulting from |
A |
loadings matrix as a result from |
iter |
number of iterations used. |
pca |
object of class |
trace |
vector of VAF values in each iteration, if |
VAFtrain |
matrix with columns corresponding to |
VAFtest |
VAF matrix for the test data within cross-validation. |
If cross-validation is desired, the pca results are stored in a list called fit
with each list entry corresponding to a certain fold. Within such a list entry, all sub entries can be accessed as described above.
However, VAF values are stored in VAFtrain
or VAFtest
and can be accessed directly.
Author(s)
Aisouda Hoshiyar, Jan Gertheiss
References
Hoshiyar, A. (2020). Analyzing Likert-type data using penalized non-linear principal components analysis, in: Proceedings of the 35th International Workshop on Statistical Modelling, Vol. I, 337-340.
Hoshiyar, A., H.A.L. Kiers, and J. Gertheiss (2021). Penalized non-linear principal components analysis for ordinal variables with an application to international classification of functioning core sets, British Journal of Mathematical and Statistical Psychology, 76, 353-371.
Linting, M., J.J. Meulmann, A.J. von der Kooji, and P.J.F. Groenen (2007). Nonlinear principal components analysis: Introduction and application, Psychological Methods, 12, 336-358.
See Also
Examples
## Not run:
## load ICF data
data(ICFCoreSetCWP)
# adequate coding to get levels 1,..., max
H <- ICFCoreSetCWP[, 1:67] + matrix(c(rep(1, 50), rep(5, 16), 1),
nrow(ICFCoreSetCWP), 67,
byrow = TRUE)
xnames <- colnames(H)
# nonlinear PCA
icf_pca1 <- ordPCA(H, p = 2, lambda = c(5, 0.5, 0.0001), maxit = 1000,
Ks = c(rep(5, 50), rep(9, 16), 5),
constr = c(rep(TRUE, 50), rep(FALSE, 16), TRUE))
# estimated quantifications
icf_pca1$qs[[55]]
plot(1:9, icf_pca1$qs[[55]][,1], type="b",
xlab="category", ylab="quantification", col=1, main=xnames[55],
ylim=range(c(icf_pca1$qs[[55]][,1],icf_pca1$qs[[55]][,2],icf_pca1$qs[[55]][,3])))
lines(icf_pca1$qs[[55]][,2], type = "b", col = 2, lty = 2, pch = 2, lwd=2)
lines(icf_pca1$qs[[55]][,3], type = "b", col = 3, lty = 3, pch = 3, lwd=2)
# compare VAF
icf_pca2 <- ordPCA(H, p = 2, lambda = c(5, 0.5, 0.0001), maxit = 1000,
Ks = c(rep(5, 50), rep(9, 16), 5),
constr = c(rep(TRUE, 50), rep(FALSE, 16), TRUE),
CV = TRUE, k = 5)
icf_pca2$VAFtest
## load ehd data
require(psy)
data(ehd)
# recoding to get levels 1,..., max
H <- ehd + 1
# nonlinear PCA
ehd1 <- ordPCA(H, p = 5, lambda = 0.5, maxit = 100,
constr = rep(TRUE,ncol(H)),
CV = FALSE)
# resulting PCA on the scaled variables
summary(ehd1$pca)
# plot quantifications
oldpar <- par(mfrow = c(4,5))
for(j in 1:length(ehd1$qs))
plot(1:5, ehd1$qs[[j]], type = "b", xlab = "level", ylab = "quantification",
main = colnames(H)[j])
par(oldpar)
# include cross-validation
lambda <- 10^seq(4,-4, by = -0.1)
set.seed(456)
cvResult <- ordPCA(H, p = 5, lambda = lambda, maxit = 100,
constr = rep(TRUE,ncol(H)),
CV = TRUE, k = 5, CVfit = FALSE)
# optimal lambda
lambda[which.max(apply(cvResult$VAFtest,2,mean))]
## End(Not run)
Internal ordPens functions
Description
Internal ordPens functions
Usage
cd(x)
coding(x, constant=TRUE, splitcod=TRUE)
genRidge(x, y, offset, omega, lambda, model, delta=1e-6, maxit=25)
ordAOV1(x, y, type, nsim, null.sample, ...)
ordAOV2(x, y, type, nsim, null.sample, ...)
crO(k, d=2)
penALS(H, p, lambda, qstart, crit, maxit, Ks, constr)
ord.glasso(x, y, lambda, weights = rep(1, length(y)), penscale = sqrt,
standardize = TRUE, restriction = c("refcat", "effect"),
nonpenx = NULL, control=ordglasso_control())
ordglasso_control(control = list())
invlink(eta)
## S3 method for class 'ordinal.smooth.spec'
smooth.construct(object, data, knots)
## S3 method for class 'ordinal.smooth'
Predict.matrix(object, data)
## S3 method for class 'ordinal.smooth'
plot(x, P=NULL, data=NULL, label="", se1.mult=1, se2.mult=2,
partial.resids=FALSE, rug=TRUE, se=TRUE, scale=-1, n=100, n2=40, n3=3,
pers=FALSE, theta=30, phi=30, jit=FALSE, xlab=NULL, ylab=NULL, main=NULL,
ylim=NULL, xlim=NULL, too.far=0.1, shade=FALSE, shade.col="gray80",
shift=0, trans=I, by.resids=FALSE, scheme=0, ...)
Details
These are not to be called by the user.
Author(s)
Jan Gertheiss, Fabian Scheipl, Aisouda Hoshiyar
Selection and smoothing of dummy coefficients of ordinal predictors
Description
Fits dummy coefficients of ordinally scaled independent variables with a group lasso penalty on differences of adjacent dummy coefficients.
Usage
ordSelect(x, y, u = NULL, z = NULL, offset = rep(0,length(y)), lambda,
model = c("linear", "logit", "poisson", "cumulative"),
restriction = c("refcat", "effect"), penscale = sqrt, scalex = TRUE,
nonpenx = NULL, control = NULL, eps = 1e-3, ...)
Arguments
x |
the matrix of ordinal predictors, with each column corresponding to one predictor and containing numeric values from {1,2,...}; for each covariate, category 1 is taken as reference category with zero dummy coefficient. |
y |
the response vector. |
u |
a matrix (or |
z |
a matrix (or |
offset |
vector of offset values. |
lambda |
vector of penalty parameters (in decreasing order). Optimization starts with the first component. See details below. |
model |
the model which is to be fitted. Possible choices are "linear" (default), "logit", "poisson" or "cumulative". See details below. |
restriction |
identifiability restriction for dummy coding. "reference" takes category 1 is as reference category (default), while with "effect" dummy coefficients sum up to 0 (known as effect coding). |
penscale |
rescaling function to adjust the value of the penalty parameter to the degrees of freedom of the parameter group. See the references below. |
scalex |
logical. Should (split-coded) design matrix corresponding to
|
nonpenx |
vectors of indices indicating columns of
|
control |
a list of control parameters only if |
eps |
a (small) constant to be added to the columnwise standard deviations when scaling the design matrix, to control the effect of very small stds. See details below. |
... |
additional arguments. |
Details
The method assumes that categorical covariates (contained in x
and
u
) take values 1,2,...,max, where max denotes the (columnwise) highest
level observed in the data. If any level between 1 and max is not observed for an ordinal predictor,
a corresponding (dummy) coefficient is fitted anyway. If any level > max is
not observed but possible in principle, and a corresponding coefficient is to
be fitted, the easiest way is to add a corresponding row to x
(and
u
,z
) with corresponding y
value being NA
.
If a linear regression model is fitted, response vector y
may contain
any numeric values; if a logit model is fitted, y
has to be 0/1 coded;
if a poisson model is fitted, y
has to contain count data. If a cumulative logit model is fitted, y
takes values 1,2,...,max.
If scalex
is TRUE
, (split-coded) design matrix constructed from x
is scaled to have
unit variance over columns. If a certain x
-category,
however, is observed only a few times, variances may become very small and
scaling has enormous effects on the result and may cause numerical problems.
Hence a small constant eps
can be added to each standard deviation
when used for scaling.
Value
An ordPen
object, which is a list containing:
fitted |
the matrix of fitted response values of the training data.
Columns correspond to different |
coefficients |
the matrix of fitted coefficients with respect to dummy-coded (ordinal or nominal) categorical input variables (including the reference category) as well as metric predictors. Columns correspond to different lambda values. |
model |
the type of the fitted model: "linear", "logit", "poisson", or "cumulative". |
restriction |
the type of restriction used for identifiability. |
lambda |
the used lambda values. |
fraction |
the used fraction values ( |
xlevels |
a vector giving the number of levels of the ordinal predictors. |
ulevels |
a vector giving the number of levels of the nominal predictors (if any). |
zcovars |
the number of metric covariates (if any). |
Author(s)
Jan Gertheiss, Aisouda Hoshiyar
References
Gertheiss, J., S. Hogger, C. Oberhauser and G. Tutz (2011). Selection of ordinally scaled independent variables with applications to international classification of functioning core sets. Journal of the Royal Statistical Society C (Applied Statistics), 60, 377-395.
Hoshiyar, A., Gertheiss, L.H., and Gertheiss, J. (2023). Regularization and Model Selection for Item-on-Items Regression with Applications to Food Products' Survey Data. Preprint, available from https://arxiv.org/abs/2309.16373.
Meier, L., S. van de Geer and P. Buehlmann (2008). The group lasso for logistic regression. Journal of the Royal Statistical Society B, 70, 53-71.
Tutz, G. and J. Gertheiss (2014). Rating scales as predictors – the old question of scale level and some answers. Psychometrika, 79, 357-376.
Tutz, G. and J. Gertheiss (2016). Regularized regression for categorical data. Statistical Modelling, 16, 161-200.
Yuan, M. and Y. Lin (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society B, 68, 49-67.
See Also
plot.ordPen
, predict.ordPen
,
ICFCoreSetCWP
Examples
# smoothing and selection of ordinal covariates on a simulated dataset
set.seed(123)
# generate (ordinal) predictors
x1 <- sample(1:8,100,replace=TRUE)
x2 <- sample(1:6,100,replace=TRUE)
x3 <- sample(1:7,100,replace=TRUE)
# the response
y <- -1 + log(x1) + sin(3*(x2-1)/pi) + rnorm(100)
# x matrix
x <- cbind(x1,x2,x3)
# lambda values
lambda <- c(1000,500,200,100,50,30,20,10,1)
# smoothing and selection
osl <- ordSelect(x = x, y = y, lambda = lambda)
# results
round(osl$coef,digits=3)
plot(osl)
# If for a certain plot the x-axis should be annotated in a different way,
# this can (for example) be done as follows:
plot(osl, whx = 1, xlim = c(0,9), xaxt = "n")
axis(side = 1, at = c(1,8), labels = c("no agreement","total agreement"))
Smoothing dummy coefficients of ordinal predictors
Description
Fits dummy coefficients of ordinally scaled independent variables with the sum of squared differences of adjacent dummy coefficients being penalized.
Usage
ordSmooth(x, y, u = NULL, z = NULL, offset = rep(0,length(y)), lambda,
model = c("linear", "logit", "poisson"), restriction = c("refcat", "effect"),
penscale = identity, scalex = TRUE, nonpenx = NULL, eps = 1e-3, delta = 1e-6,
maxit = 25, ...)
Arguments
x |
the matrix (or |
y |
the response vector. |
u |
a matrix (or |
z |
a matrix (or |
offset |
vector of offset values. |
lambda |
vector of penalty parameters (in decreasing order). Optimization starts with the first component. See details below. |
model |
the model which is to be fitted. Possible choices are "linear" (default), "logit" or "poisson". See details below. |
restriction |
identifiability restriction for dummy coding. "reference" takes category 1 is as reference category (default), while with "effect" dummy coefficients sum up to 0 (known as effect coding). |
penscale |
rescaling function to adjust the value of the penalty parameter to the degrees of freedom of the parameter group. |
scalex |
logical. Should (split-coded) design matrix corresponding to |
nonpenx |
vector of indices indicating columns of
|
eps |
a (small) constant to be added to the columnwise standard deviations when scaling the design matrix, to control the effect of very small stds. See details below. |
delta |
a small positive convergence tolerance which is used as stopping criterion for the penalized Fisher scoring when a logit or poisson model is fitted. See details below. |
maxit |
integer given the maximal number of (penalized) Fisher scoring iterations. |
... |
additional arguments. |
Details
The method assumes that categorical covariates (contained in x
and
u
) take values 1,2,...,max, where max denotes the (columnwise) highest
level observed in the data. If any level between 1 and max is not observed for an ordinal predictor,
a corresponding (dummy) coefficient is fitted anyway. If any level > max is
not observed but possible, and a corresponding coefficient is to
be fitted, the easiest way is to add a corresponding row to x
(and
u
,z
) with corresponding y
value being NA
.
If a linear regression model is fitted, response vector y
may contain
any numeric values; if a logit model is fitted, y
has to be 0/1 coded;
if a poisson model is fitted, y
has to contain count data.
If scalex
is TRUE
, (split-coded) design matrix constructed from x
is scaled to have
unit variance over columns. If a certain x
-category,
however, is observed only a few times, variances may become very small and
scaling has enormous effects on the result and may cause numerical problems.
Hence a small constant eps
can be added to each standard deviation
when used for scaling.
A logit or poisson model is fitted by penalized Fisher scoring. For stopping
the iterations the criterion sqrt(sum((b.new-b.old)^2)/sum(b.old^2)) < delta
is used.
Please note, ordSmooth
is intended for use with high-dimensional ordinal predictors; more precisely, if the number of ordinal predictors is large. Package ordPens
, however, also includes auxiliary functions such that gam
from mgcv
can be used for fitting generalized linear and additive models with first- and second-order ordinal smoothing penalty as well as built-in smoothing parameter selection. In addition, mgcv
tools for further statistical inference can be used. Note, however, significance of smooth (ordinal) terms is only reliable in case of the second-order penalty. Also note, if using gam
, dummy coefficients/fitted functions are centered over the data observed. For details, please see Gertheiss et al. (2021) and examples below.
Value
An ordPen
object, which is a list containing:
fitted |
the matrix of fitted response values of the training data.
Columns correspond to different |
coefficients |
the matrix of fitted coefficients with respect to dummy-coded (ordinal or nominal) categorical input variables (including the reference category) as well as metric predictors. Columns correspond to different lambda values. |
model |
the type of the fitted model: "linear", "logit", or "poisson". |
restriction |
the type of restriction used for identifiability. |
lambda |
the used lambda values. |
fraction |
the used fraction values ( |
xlevels |
a vector giving the number of levels of the ordinal predictors. |
ulevels |
a vector giving the number of levels of the nominal predictors (if any). |
zcovars |
the number of metric covariates (if any). |
Author(s)
Jan Gertheiss, Aisouda Hoshiyar
References
Gertheiss, J., F. Scheipl, T. Lauer, and H. Ehrhardt (2022). Statistical inference for ordinal predictors in generalized linear and additive models with application to bronchopulmonary dysplasia. BMC research notes, 15, 112.
Gertheiss, J. and G. Tutz (2009). Penalized regression with ordinal predictors. International Statistical Review, 77, 345-365.
Tutz, G. and J. Gertheiss (2014). Rating scales as predictors – the old question of scale level and some answers. Psychometrica, 79, 357-376.
Tutz, G. and J. Gertheiss (2016). Regularized regression for categorical data. Statistical Modelling, 16, 161-200.
See Also
Examples
# smooth modeling of a simulated dataset
set.seed(123)
# generate (ordinal) predictors
x1 <- sample(1:8,100,replace=TRUE)
x2 <- sample(1:6,100,replace=TRUE)
x3 <- sample(1:7,100,replace=TRUE)
# the response
y <- -1 + log(x1) + sin(3*(x2-1)/pi) + rnorm(100)
# x matrix
x <- cbind(x1,x2,x3)
# lambda values
lambda <- c(1000,500,200,100,50,30,20,10,1)
# smooth modeling
osm1 <- ordSmooth(x = x, y = y, lambda = lambda)
# results
round(osm1$coef,digits=3)
plot(osm1)
# If for a certain plot the x-axis should be annotated in a different way,
# this can (for example) be done as follows:
plot(osm1, whx = 1, xlim = c(0,9), xaxt = "n")
axis(side = 1, at = c(1,8), labels = c("no agreement","total agreement"))
# add a nominal covariate to control for
u1 <- sample(1:8,100,replace=TRUE)
u <- cbind(u1)
osm2 <- ordSmooth(x = x, y = y, u = u, lambda = lambda)
round(osm2$coef,digits=3)
## Use gam() from mgcv for model fitting:
# ordinal predictors need to be ordered factors
x1 <- as.ordered(x1)
x2 <- as.ordered(x2)
x3 <- as.ordered(x3)
# model fitting with first-order penalty and smoothing parameter selection by REML
gom1 <- gam(y ~ s(x1, bs = "ordinal", m = 1) + s(x2, bs = "ordinal", m = 1) +
s(x3, bs = "ordinal", m = 1) + factor(u1), method = "REML")
# plot with confidence intervals
plot(gom1)
# use second-order penalty instead
gom2 <- gam(y ~ s(x1, bs = "ordinal", m = 2) + s(x2, bs = "ordinal", m = 2) +
s(x3, bs = "ordinal", m = 2) + factor(u1), method = "REML")
# summary including significance of smooth terms
# please note, the latter is only reliable for m = 2
summary(gom2)
# plotting
plot(gom2)
Plot method for ordPen objects
Description
Takes a fitted ordPen
object and plots estimated dummy coefficients
of ordinal predictors for different lambda
values.
Usage
## S3 method for class 'ordPen'
plot(x, whl = NULL, whx = NULL,
type = NULL, xlab = NULL, ylab = NULL, main = NULL,
xlim = NULL, ylim = NULL, col = NULL, ...)
Arguments
x |
an |
whl |
a vector of indices of |
whx |
a vector of indices indicating the ordinal predictors whose
dummy coefficients are plotted; e.g., set |
type |
1-character string giving the type of plot desired, see
|
xlab |
a label for the x axis; if supplied then this will be used as the x label for all plots. |
ylab |
a label for the y axis; if supplied then this will be used as the y label for all plots. |
main |
a main title for the plot(s); if supplied then this will be used as the title for all plots. |
xlim |
the x limits; if supplied then this pair of numbers are used as the x limits for each plot. |
ylim |
the y limits; if supplied then this pair of numbers are used as the y limits for each plot. |
col |
the plotting color; can be a vector of the same length as
|
... |
additional graphical parameters (see |
Value
The function simply generates plots.
Author(s)
Jan Gertheiss
See Also
ordFusion
, ordSelect
, ordSmooth
Examples
# see for example
help(ordSelect)
Predict method for ordPen objects
Description
Obtains predictions from an ordPen
object.
Usage
## S3 method for class 'ordPen'
predict(object, newx, newu = NULL, newz = NULL,
offset = rep(0,nrow(as.matrix(newx))),
type = c("link", "response", "class"), ...)
Arguments
object |
an |
newx |
the matrix (or |
newu |
a matrix (or |
newz |
a matrix (or |
offset |
potential offset values. |
type |
the type of prediction; |
... |
additional arguments (not supported at this time). |
Value
A matrix of predictions whose columns correspond to the different values of
the penalty parameter lambda
of the ordPen
object.
Author(s)
Jan Gertheiss, Aisouda Hoshiyar
See Also
ordSelect
, ordSmooth
, ordFusion
Examples
# the training data
set.seed(123)
# generate (ordinal) predictors
x1 <- sample(1:8,100,replace=TRUE)
x2 <- sample(1:6,100,replace=TRUE)
x3 <- sample(1:7,100,replace=TRUE)
# the response
y <- -1 + log(x1) + sin(3*(x2-1)/pi) + rnorm(100)
# x matrix
x <- cbind(x1,x2,x3)
# lambda values
lambda <- c(1000,500,200,100,50,30,20,10,1)
# selecting and/or smoothing/fusing
o1 <- ordSmooth(x = x, y = y, lambda = lambda)
o2 <- ordSelect(x = x, y = y, lambda = lambda)
o3 <- ordFusion(x = x, y = y, lambda = lambda)
# new data
x1 <- sample(1:8,10,replace=TRUE)
x2 <- sample(1:6,10,replace=TRUE)
x3 <- sample(1:7,10,replace=TRUE)
newx <- cbind(x1,x2,x3)
# prediction
round(predict(o1, newx), digits=3)
round(predict(o2, newx), digits=3)
round(predict(o3, newx), digits=3)