Type: | Package |
Title: | Cross-Validation Model Averaging for Partial Linear Functional Additive Models |
Version: | 0.1.1 |
Imports: | fda, quadprog, mgcv, MASS, stats, utils |
NeedsCompilation: | no |
Author: | Shishi Liu [aut, cre], Jingxiao Zhang [aut] |
Maintainer: | Shishi Liu <liushishi_644@163.com> |
Description: | Produces an averaging estimate/prediction by combining all candidate models for partial linear functional additive models, using a multi-fold cross-validation criterion. For details, see the arXiv e-print <doi:10.48550/arXiv.2105.00966>. |
License: | GPL (≥ 3) |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.1.0 |
Packaged: | 2025-04-28 01:47:15 UTC; liushishi |
Repository: | CRAN |
Date/Publication: | 2025-04-28 02:40:01 UTC |
Generate cross-validation folds
Description
Randomly split the data indexes into nfolds folds.
Usage
cvfolds(nfolds, datasize)
Arguments
nfolds |
The number of folds used in cross-validation. |
datasize |
The sample size. |
Value
A list. Each element contains the index vector of the sample data included in that fold.
Examples
# Given sample size 20, generate 5 folds
set.seed(1212)
cvfolds(5, 20)
#[[1]]
# [1] 6 11 14 16
#[[2]]
# [1] 3 5 10 18
#[[3]]
# [1] 4 7 8 19
#[[4]]
# [1] 2 9 12 15
#[[5]]
# [1] 1 13 17 20
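The splitting idea can be sketched in base R as follows. This is only an illustration: make_folds is a hypothetical helper written for this note, not the package's cvfolds implementation, and its folds will generally differ from the output above even with the same seed.

```r
# Hypothetical helper (not part of the package): shuffle the indexes 1..datasize
# and deal them out into nfolds groups of near-equal size.
make_folds <- function(nfolds, datasize) {
  shuffled <- sample(seq_len(datasize))
  fold_id <- rep(seq_len(nfolds), length.out = datasize)
  unname(split(shuffled, fold_id))
}

set.seed(1)
folds <- make_folds(5, 20)

# Using one fold as the held-out set and the remaining indexes for fitting
held_out <- folds[[1]]
fit_idx <- setdiff(seq_len(20), held_out)
```

Each index appears in exactly one fold, so every observation is held out exactly once over the nfolds rounds.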
Cross-Validation Model Averaging (CVMA) for Partial Linear Functional Additive Models (PLFAMs)
Description
Summarize the estimated weights for averaging across all candidate models for PLFAMs, using a multi-fold cross-validation criterion, and the corresponding mean squared prediction error risk.
Usage
cvmaPLFAM(
Y,
scalars,
functional,
Y.test = NULL,
scalars.test = NULL,
functional.test = NULL,
tt,
nump,
numfpcs,
nbasis,
nfolds,
ratio.train = NULL
)
Arguments
Y |
The vector of the scalar response variable. |
scalars |
The design matrix of scalar predictors. |
functional |
The matrix including records/measurements of the functional predictor. |
Y.test |
Test data: The vector of the scalar response variable. |
scalars.test |
Test data: The design matrix of scalar predictors. |
functional.test |
Test data: The matrix including records/measurements of the functional predictor. |
tt |
The vector of recording/measurement points for the functional predictor. |
nump |
The number of scalar predictors in candidate models. |
numfpcs |
The number of functional principal components (FPCs) for the functional predictor in candidate models. |
nbasis |
The number of basis functions used for spline approximation. |
nfolds |
The number of folds used in cross-validation. |
ratio.train |
The ratio of data for training, if test data are |
Value
A list of
cv |
Mean squared error risk in training data set, produced by CVMA method. |
wcv |
The weights for each candidate model by CVMA method. |
predcv |
Mean squared prediction error risk in test data set, produced by CVMA method. |
Examples
# Generate simulated data
simdata = data_gen(R = 0.7, K = 1, n = 50, ntest = 10, M0 = 4, typ = 1, design = 1)
train_dat = simdata[[1]]
scalars.train = train_dat[,1:4]
fd.train = train_dat[,5:104]
Y.train = train_dat[,106]
test_dat = simdata[[2]]
scalars.test = test_dat[,1:4]
fd.test = test_dat[,5:104]
Y.test = test_dat[,106]
tps = seq(0, 1, length.out = 100)
# Estimation
res = cvmaPLFAM(Y=Y.train, scalars = scalars.train, functional = fd.train,
Y.test = Y.test, scalars.test = scalars.test, functional.test = fd.test, tt = tps,
nump = 2, numfpcs = 3, nbasis = 50, nfolds = 5)
# Weights estimated by CVMA method
res$wcv
# Prediction error risk on test data set
res$predcv
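To give a feel for what the weights in wcv represent, here is a minimal base-R sketch for the special case of two candidate models. It assumes the weights are non-negative, sum to one, and minimize the squared error of the weighted out-of-fold predictions; the vectors p1 and p2 are simulated stand-ins, and none of this is taken from the package's internals.

```r
set.seed(2)
n <- 50
y <- rnorm(n)
# Stand-in out-of-fold predictions from two hypothetical candidate models
p1 <- y + rnorm(n, sd = 0.5)
p2 <- y + rnorm(n, sd = 1.0)

# Minimize sum((y - (w1 * p1 + (1 - w1) * p2))^2) over w1 in [0, 1].
# The objective is a convex quadratic in w1, so the constrained minimizer
# is the unconstrained one clamped to the interval.
d <- p1 - p2
w1 <- sum((y - p2) * d) / sum(d^2)
w1 <- min(max(w1, 0), 1)
w <- c(w1, 1 - w1)

# Cross-validation criterion value at the chosen weights
cv <- mean((y - (w[1] * p1 + w[2] * p2))^2)
```

With more than two candidate models the same problem becomes a quadratic program over the weight simplex, which is the kind of problem a solver such as quadprog (listed under Imports) handles.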
Output the prediction risks of the cross-validation model averaging (CVMA) method for partial linear functional additive models (PLFAMs)
Description
Calculate the estimated weights for averaging across all candidate models and the corresponding mean squared prediction error risk.
Usage
cvpredRisk(
M,
nump,
numq,
a2,
a3,
nfolds,
X.train,
ZZ.train,
Y.train,
X.pred,
ZZ.pred,
Y.pred,
nbasis,
tt
)
Arguments
M |
The number of candidate models. |
nump |
The number of scalar predictors in candidate models. |
numq |
The number of functional principal components (FPCs) in candidate models. |
a2 |
The number of FPCs in each candidate model. See |
a3 |
The index for each component in each candidate model. See |
nfolds |
The number of folds used in cross-validation. |
X.train |
The training data of scalar predictors. |
ZZ.train |
The training data of the functional predictor. |
Y.train |
The training data of response variable. |
X.pred |
The test data of scalar predictors. |
ZZ.pred |
The test data of the functional predictor. |
Y.pred |
The test data of response variable. |
nbasis |
The number of basis functions used for spline approximation. |
tt |
The vector of recording/measurement points for the functional predictor. |
Value
A list of
cv |
Mean squared error risk in training data set, produced by CVMA method. |
ws |
A |
predcv |
Mean squared prediction error risk in test data set, produced by CVMA method. |
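The multi-fold cross-validation step behind such a risk can be sketched generically. The example below uses two ordinary linear models as stand-in candidates (not PLFAMs), fixed equal weights, and simulated data; all names are hypothetical and chosen only for illustration.

```r
set.seed(3)
n <- 40
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + x1 + 0.5 * x2 + rnorm(n, sd = 0.3)
dat <- data.frame(y, x1, x2)

# Two stand-in candidate models
forms <- list(y ~ x1, y ~ x1 + x2)
nfolds <- 4
fold_id <- sample(rep(seq_len(nfolds), length.out = n))

# Out-of-fold predictions: one column per candidate model
oof <- matrix(NA_real_, nrow = n, ncol = length(forms))
for (j in seq_len(nfolds)) {
  held <- fold_id == j
  for (m in seq_along(forms)) {
    fit <- lm(forms[[m]], data = dat[!held, ])
    oof[held, m] <- predict(fit, newdata = dat[held, ])
  }
}

# Squared-error risk of the averaged prediction for a given weight vector
w <- c(0.5, 0.5)
cv_risk <- mean((y - as.numeric(oof %*% w))^2)
```

Every entry of oof comes from a model that did not see that observation, which is what lets the criterion estimate out-of-sample risk.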
Simulated data
Description
Simulate sample data for illustration, including an M0-column design matrix of scalar predictors, a 100-column matrix of the functional predictor, a one-column vector of mu, a one-column vector of Y, and a one-column vector of testY.
Usage
data_gen(R, K, n, ntest, M0, typ, design)
Arguments
R |
A scalar of value ranging from |
K |
A scalar. The number of replications. |
n |
A scalar. The sample size of training data. |
ntest |
A scalar. The sample size of test data. |
M0 |
A scalar. True dimension of scalar predictors. |
typ |
A scalar of value |
design |
A scalar of value |
Value
A list of K simulated training data sets and K simulated test data sets. Each data set is of matrix type, whose first M0 columns correspond to the design matrix of scalar predictors, followed by the recording/measurement matrix of the functional predictor, and vectors mu, Y.
Examples
library(MASS)
# Example: Design 1 in simulation study
set.seed(22)
data1 <- data_gen(R = 0.6, K = 2, n = 10, ntest = 5, M0 = 4, typ = 1, design = 1)
str(data1)
# List of 4
#$ : num [1:10, 1:106] -0.501 -1.266 -0.564 -0.563 -0.395 ...
#$ : num [1:10, 1:106] -1.207 -0.089 -0.782 0.123 0.66 ...
#$ : num [1:5, 1:106] 0.816 0.679 0.816 -0.563 -1.367 ...
#$ : num [1:5, 1:106] -0.089 -0.785 0.899 -0.785 -0.445 ...
# Example: Design 2 in simulation study
data_gen(R = 0.3, K = 3, n = 10, ntest = 5, M0 = 20, typ = 1, design = 2)
# Example: Design 3 in simulation study
data_gen(R = 0.9, K = 5, n = 20, ntest = 10, M0 = 4, typ = 2, design = 3)
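The column layout described under Value can be unpacked as in the sketch below. To stay self-contained it builds a stand-in list with the same shapes as the str() output above (K training matrices followed by K test matrices, each with M0 + 102 columns) rather than calling data_gen; the ordering of the list is inferred from that output.

```r
M0 <- 4; K <- 2; n <- 10; ntest <- 5
set.seed(4)
# Stand-in for the returned list: K training sets, then K test sets
sim <- c(
  replicate(K, matrix(rnorm(n * (M0 + 102)), nrow = n), simplify = FALSE),
  replicate(K, matrix(rnorm(ntest * (M0 + 102)), nrow = ntest), simplify = FALSE)
)

k <- 1                      # replication to extract
train <- sim[[k]]
test <- sim[[K + k]]

scalars <- train[, 1:M0]                     # scalar predictors
functional <- train[, (M0 + 1):(M0 + 100)]   # functional predictor records
mu <- train[, M0 + 101]                      # column labelled mu
Y <- train[, M0 + 102]                       # response
```

With M0 = 4 these index ranges reduce to columns 1:4, 5:104, 105, and 106.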
Calculate functional principal component (fpc) scores
Description
Conduct functional principal component analysis (FPCA) on the observation matrix of the functional predictor.
Usage
fpcscore(Z, nbasis, tt)
Arguments
Z |
An |
nbasis |
The number of basis functions used for spline approximation. |
tt |
The vector of recording/measurement points for the functional predictor. |
Value
A list of
score |
An |
eigv |
A vector of estimated eigenvalues from FPCA. |
varp |
A vector of the percentages of variance explained, from FPCA. |
Examples
# Generate a recording/measurement matrix of the functional predictor
fddata = matrix(rnorm(1000), nrow = 10, ncol = 100)
tpoints = seq(0, 1, length.out = 100)
library(fda)
# Using 20 basis functions for spline approximation
fpcscore(fddata, nbasis = 20, tt = tpoints)
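For intuition about the three returned quantities, here is a rough base-R analogue that runs ordinary principal component analysis directly on the discretized curves with prcomp. This is a swapped-in simplification: it skips the spline smoothing that fpcscore applies via nbasis, so its numbers will not match the package's output.

```r
set.seed(5)
Z <- matrix(rnorm(1000), nrow = 10, ncol = 100)

# Plain PCA on the centered observation matrix (no basis smoothing)
pca <- prcomp(Z, center = TRUE, scale. = FALSE)

score <- pca$x                     # one row of component scores per curve
eigv <- pca$sdev^2                 # variances along each component
varp <- 100 * eigv / sum(eigv)     # percentage of variance explained

# Cumulative percentages help in choosing how many components to keep
cumsum(varp)
```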
Generate candidate models
Description
Specify non-nested or nested candidate models, according to the prescribed number of scalar predictors and the number of functional principal components (FPCs). Each candidate model comprises at least one scalar predictor and one FPC.
Usage
modelspec(nump, numq, method = NULL)
Arguments
nump |
The number of scalar predictors used in candidate models. |
numq |
The number of functional principal components (FPCs) used in candidate models. |
method |
A character string or NULL.
If |
Value
A list of
a1 |
The number of scalar predictors in each candidate model. |
a2 |
The number of FPCs in each candidate model. |
a3 |
The index for each component in each candidate model. |
Examples
# Example 1: non-nested models
# Given nump = 2 and numq = 2, resulting in 9 candidate models
modelspec(2, 2)
#$a1
#[1] 2 2 2 1 1 1 1 1 1
#$a2
#[1] 2 1 1 2 1 1 2 1 1
#$a3
# [,1] [,2] [,3] [,4]
# [1,] 1 2 3 4
# [2,] 1 2 3 0
# [3,] 1 2 0 4
# [4,] 1 0 3 4
# [5,] 1 0 3 0
# [6,] 1 0 0 4
# [7,] 0 2 3 4
# [8,] 0 2 3 0
# [9,] 0 2 0 4
# Example 2: nested models
# Given nump = 2 and numq = 3, resulting in 6 candidate models
modelspec(2, 3, method = "nested")
#$a1
# [1] 2 2 2 1 1 1
#$a2
# [1] 3 2 1 3 2 1
#$a3
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 2 3 4 5
# [2,] 1 2 3 4 0
# [3,] 1 2 3 0 0
# [4,] 1 0 3 4 5
# [5,] 1 0 3 4 0
# [6,] 1 0 3 0 0
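The model counts in these examples follow from simple combinatorics: the non-nested case pairs every non-empty subset of scalar predictors with every non-empty subset of FPCs, giving (2^nump - 1) * (2^numq - 1) models, while the nested output above has nump * numq rows. The sketch below enumerates the non-nested case in base R. Both helpers are hypothetical, and the row order differs from what modelspec prints.

```r
# All non-empty 0/1 inclusion patterns over k items (hypothetical helper)
nonempty_subsets <- function(k) {
  grid <- as.matrix(expand.grid(rep(list(0:1), k)))
  grid[rowSums(grid) > 0, , drop = FALSE]
}

# Pair every scalar subset with every FPC subset (hypothetical helper)
enumerate_models <- function(nump, numq) {
  ps <- nonempty_subsets(nump)
  qs <- nonempty_subsets(numq)
  idx <- expand.grid(p = seq_len(nrow(ps)), q = seq_len(nrow(qs)))
  inc <- cbind(ps[idx$p, , drop = FALSE], qs[idx$q, , drop = FALSE])
  # Turn inclusion flags into component indexes; 0 marks an excluded component
  a3 <- sweep(inc, 2, seq_len(nump + numq), `*`)
  dimnames(a3) <- NULL
  list(
    a1 = rowSums(inc[, seq_len(nump), drop = FALSE]),
    a2 = rowSums(inc[, nump + seq_len(numq), drop = FALSE]),
    a3 = a3
  )
}

m <- enumerate_models(2, 2)
nrow(m$a3)   # (2^2 - 1) * (2^2 - 1) candidate models
```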
Fitting partial linear functional additive model
Description
Calculate the prediction values and prediction errors across all candidate models.
Usage
plam.fit(
M,
nump,
numq,
a3,
X.train,
ZZ.train,
y.train,
X.pred,
ZZ.pred,
y.pred,
nbasis,
tt
)
Arguments
M |
The number of candidate models. |
nump |
The number of scalar predictors in candidate models. |
numq |
The number of functional principal components (FPCs) in candidate models. |
a3 |
The index for each component in each candidate model. See |
X.train |
The training data of scalar predictors. |
ZZ.train |
The training data of the functional predictor. |
y.train |
The training data of response variable. |
X.pred |
The test data of scalar predictors. |
ZZ.pred |
The test data of the functional predictor. |
y.pred |
The test data of response variable. |
nbasis |
The number of basis functions used for spline approximation. |
tt |
The vector of recording/measurement points for the functional predictor. |
Value
A list of
muhat.train |
A |
ehat.train |
A |
muhat.pred |
A |
prederr |
A |
edf |
A |
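As a rough illustration of fitting a single candidate model, the sketch below uses mgcv::gam (mgcv is listed under Imports) with linear terms for two scalar predictors and smooth terms for two FPC-score-like covariates. Whether plam.fit does this internally is an assumption, and the data, variable names, and basis dimension k = 6 are all made up for the example.

```r
library(mgcv)

set.seed(6)
n <- 60; ntest <- 20
dat <- data.frame(
  x1 = rnorm(n + ntest), x2 = rnorm(n + ntest),   # scalar predictors
  z1 = runif(n + ntest), z2 = runif(n + ntest)    # stand-ins for FPC scores
)
dat$y <- 1 + 0.5 * dat$x1 - 0.3 * dat$x2 +
  sin(2 * pi * dat$z1) + (dat$z2 - 0.5)^2 + rnorm(n + ntest, sd = 0.2)
train <- dat[seq_len(n), ]
test <- dat[n + seq_len(ntest), ]

# Linear part in x1, x2; additive smooth part in z1, z2
fit <- gam(y ~ x1 + x2 + s(z1, k = 6) + s(z2, k = 6), data = train)

muhat_train <- as.numeric(fitted(fit))                 # fitted values
ehat_train <- train$y - muhat_train                    # training residuals
muhat_pred <- as.numeric(predict(fit, newdata = test)) # test predictions
prederr <- mean((test$y - muhat_pred)^2)               # mean squared prediction error
edf_total <- sum(fit$edf)                              # effective degrees of freedom
```

Repeating such a fit for each row of a3 would give one column of predictions per candidate model.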