Type: Package
Title: The Interpretation of Adjusted Cp Statistic
Version: 0.1.1
VignetteBuilder: knitr
Suggests: knitr, rmarkdown
Description: Several methods exist for selecting a subset of regressors from a set of k candidate variables in multiple linear regression. One possibility is to evaluate all possible regression models and compare them using Mallows's Cp statistic according to Gilmour's original study. The full model is calculated, all possible combinations of regressors are generated, the adjusted Cp for each submodel is computed, and the submodel with the minimum adjusted Cp value (ModelMin) is identified. To identify the final model, the package applies a sequence of hypothesis tests on submodels nested within ModelMin, following the approach outlined in Gilmour's original paper. For more details see the help of the function final_model() and the original study (1996) <doi:10.2307/2348411>.
License: GPL-3
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.3.2
Depends: R (≥ 3.5)
NeedsCompilation: no
Packaged: 2025-07-06 04:56:05 UTC; djose
Author: Josef Dolejs
Maintainer: Josef Dolejs <josef.dolejs@uhk.cz>
Repository: CRAN
Date/Publication: 2025-07-10 13:50:02 UTC
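A minimal usage sketch (assuming the package is attached under its CRAN name; final_model() and the Gilmour9p data are documented below):
# library("<package name as on CRAN>")   # attach the package first
MyResults <- final_model(Gilmour9p)      # full model, ModelMin and final model
MyResults$final_model                    # textual expression of the final model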
EU2019: illustrative econometric data from Eurostat
Description
Illustrative econometric data from Eurostat for five variables in 17 countries in 2019. Columns: Country, LifExp, HDP, Unempl, Obesity, APassangers. To remove the first column: rownames(EU2019) = EU2019[,1]; EU2019 = EU2019[,-1]. The data have real meaning.
Usage
EU2019
Format
An object of class data.frame
with 17 rows and 6 columns.
Author(s)
Josef Dolejs
References
Josef Dolejs
Gilmour9p: illustrative house-price data from the original article
Description
Illustrative data (house prices) published in the original article. Data with practical meaning; Gilmour9p was taken from the original study (see Table 1).
Usage
Gilmour9p
Format
An object of class data.frame
with 24 rows and 10 columns.
Author(s)
Steven G. Gilmour
References
https://rss.onlinelibrary.wiley.com/doi/10.2307/2348411
Modified_Gilmour9p: special illustrative data for more than two tests in final_model()
Description
Special illustrative data for the case in which more than two tests are performed in the loop of the function final_model(). The original Gilmour table was modified; the data have no real meaning.
Usage
Modified_Gilmour9p
Format
An object of class data.frame
with 24 rows and 10 columns.
Author(s)
Josef Dolejs
References
Josef Dolejs
Parks5p: numbers of visitors in national parks
Description
Numbers of visitors in national parks. Citation: Stemberk Josef, Josef Dolejs, Petra Maresova, Kamil Kuca. Factors affecting the number of Visitors in National Parks in the Czech Republic, Germany and Austria. ISPRS Int. J. Geo-Inf. 2018, 7(3), 124; doi:10.3390/ijgi7030124. The data have real meaning.
Usage
Parks5p
Format
An object of class data.frame
with 12 rows and 7 columns.
Author(s)
Stemberk Josef, Josef Dolejs, Petra Maresova, Kamil Kuca
References
https://www.mdpi.com/2220-9964/7/3/124
Patents5p: numbers of patents at Czech public universities
Description
Numbers of patents at universities. Citation: Maresova Petra, Soukal Ivan, Stemberkova Ruzena, Selamat Ali. Perspective and Suitable Research Area in Public Research-Patent Analysis of the Czech Public Universities. Education and Urban Society, 54(7). The data have real meaning.
Usage
Patents5p
Format
An object of class data.frame
with 11 rows and 6 columns.
Author(s)
Maresova Petra, Soukal Ivan, Stemberkova Ruzena, Selamat Ali
References
https://doi.org/10.1177/00131245211027362
T1: simulated data for which the full model is the final model
Description
T1 contains simulated data without real meaning. The null hypothesis is not rejected in the first test, which illustrates the situation in which the full model is the final model.
Usage
T1
Format
An object of class data.frame
with 16 rows and 13 columns.
Author(s)
Josef Dolejs
References
Josef Dolejs
T2: simulated data for which the loop of tests ends with the trivial model
Description
T2 illustrates the situation in which the loop of tests is finished by the trivial model. The data have no practical meaning.
Usage
T2
Format
An object of class data.frame
with 16 rows and 6 columns.
Author(s)
Josef Dolejs
References
Josef Dolejs
Trivial: simulated data for which the trivial model is model_min
Description
Trivial illustrates the situation in which the trivial model is model_min, without the testing process. The data have no real meaning.
Usage
Trivial
Format
An object of class data.frame
with 16 rows and 3 columns.
Author(s)
Josef Dolejs
References
Josef Dolejs
crit: critical values for testing at the 5% significance level
Description
Critical values for testing at the 5% significance level, published in the original article. crit was taken from the original study (see Table 2); the values were recalculated by Josef Dolejs for more possible degrees of freedom.
Usage
crit
Format
An object of class data.frame
with 28 rows and 31 columns.
Author(s)
Steven G. Gilmour; recalculated by Josef Dolejs for more possible degrees of freedom
References
https://rss.onlinelibrary.wiley.com/doi/10.2307/2348411
The function calculates all models and finds the final model according to Gilmour
Description
The original Gilmour study proposed a method for interpreting the adjusted Cp statistic to identify the best regression model when dealing with a relatively large number of numerical regressors and a smaller number of observations.
(The interpretation of Mallows's Cp-statistic. The Statistician. 1996. 45(1):49-56. doi:10.2307/2348411).
The procedure consists of two main steps (two diagrams are available in the vignette of the package):
Initial Selection:
Adjusted \bar{C}_p values are calculated for all possible combinations of regressors. The submodel with the minimum adjusted \bar{C}_p value is selected and labeled ModelMin.
Reduction and Testing:
All submodels nested within ModelMin are considered. Among them, the submodel with the lowest adjusted \bar{C}_p value is selected as a new candidate. A hypothesis test is then performed in which the null hypothesis states that ModelMin provides a better fit than the candidate submodel. If the null hypothesis is not rejected, ModelMin is accepted as the final model. If the null hypothesis is rejected, the candidate submodel becomes the new ModelMin, and the process is repeated with submodels nested within this new model. The procedure continues iteratively until a null hypothesis is not rejected or until the so-called trivial model (i.e., a model using only the arithmetic mean) is reached; a schematic sketch of this loop follows. Consequently, to identify the final model, the package applies a sequence of hypothesis tests on submodels nested within ModelMin, following the approach outlined in Gilmour's original paper.
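The loop can be sketched schematically as follows; is_trivial(), lowest_cp_nested_submodel() and null_rejected() are hypothetical placeholders, not package functions, and the actual test uses the critical values stored in the crit table:
# schematic sketch of the testing loop (hypothetical helpers, not package code)
model_min <- start_model                             # hypothetical: the submodel with minimal adjusted Cp
repeat {
  if (is_trivial(model_min)) break                   # the trivial model ends the loop
  candidate <- lowest_cp_nested_submodel(model_min)  # best nested submodel
  if (!null_rejected(model_min, candidate)) break    # H0 not rejected: model_min is final
  model_min <- candidate                             # H0 rejected: descend and repeat
}
final <- model_min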
The function final_model returns all regression results for:
the full model,
the selected ModelMin, and
the resulting final model with results of all tests.
Additionally, it outputs \bar{C}_p values for all submodels, enabling users to easily generate an adjusted \bar{C}_p vs. p plot (where p is the number of regressors + 1), as recommended by Gilmour.
The input data should be provided as a data.frame, where the first column contains the numerical dependent variable, and the subsequent columns contain the regressors. The package also includes nine illustrative examples to demonstrate its functionality.
Adjusted \bar{C}_p is calculated using the formula:
\bar{C}_p = C_p - \frac{2(k - p + 1)}{n - k - 3}
where k is the number of regressors in the full model, and p - 1 is the number of regressors in the submodel (thus, p includes the intercept term).
Further details and theoretical background can be found in Gilmour's original paper.
For example, in the case of the trivial model (i.e., a model with only an intercept, so p = 1 and k - p + 1 = k), the adjusted statistic is:
\bar{C}_1 = C_1 - \frac{2k}{n - k - 3} = \frac{var(y)(n-1)}{MSE} - n + 2 - \frac{2k}{n - k - 3}
where the sample variance function var() is used to calculate the sum of squares in the trivial model, and MSE is the estimate of \sigma^2 from the full model.
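As a self-contained numerical sketch of these formulas (not package code; it assumes the standard Mallows form C_p = RSS_p/MSE - n + 2p, to which the trivial-model identity above reduces for p = 1):
# adjusted Cp computed directly from the formulas above (illustrative only)
d <- mtcars[, 1:5]; names(d)[1] <- "y"          # y = mpg, k = 4 candidate regressors
n <- nrow(d); k <- ncol(d) - 1
MSE <- summary(lm(y ~ ., data = d))$sigma^2     # full-model estimate of sigma^2
adj_cp <- function(fit, p) {
  rss <- sum(residuals(fit)^2)
  rss/MSE - n + 2*p - 2*(k - p + 1)/(n - k - 3) # Cp minus Gilmour's adjustment term
}
adj_cp(lm(y ~ 1, data = d), p = 1)              # trivial model: equals Cbar_1 above
adj_cp(lm(y ~ cyl + disp, data = d), p = 3)     # a submodel with two regressors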
Usage
final_model(d)
Arguments
d: a data.frame; the first column contains the numerical dependent variable and the remaining columns contain the regressors.
Value
A list with the following components:
y (numeric vector of the dependent variable, the first column of the data.frame input "d");
X (numeric matrix of regressors, the remaining columns of the input "d");
n (number of observations in "d", numeric value);
k (number of regressors, numeric value);
t (degrees of freedom, t = n - k - 1, numeric value);
regressors (set of strings of regressors);
expression of full_model (string);
regression results of full model (list calculated using lm());
MSE (characteristic of the full model, numeric value);
submodels_number (number of all possible submodels, numeric value);
Cq_plus_1A (numeric value, the Cp characteristic of model_min with the minimal adjusted Cp);
q (level of model_min, the number of regressors in model_min, numeric value);
submodels (matrix with four columns: row number, p, Cp, submodel);
model_min (textual expression of model_min, string);
regression results of model_min (list calculated using lm());
FinalCp (Cp characteristic of final_model, numeric value);
expression of final_model (textual expression of the final model, string);
regression results of final_model (list calculated using lm()).
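A quick look at a few returned components (a sketch; the component names are assumed to follow the Value list above, and $submodels, $FinalCp and the model expressions also appear in the examples below):
res <- final_model(mtcars)
res$submodels_number   # number of all possible submodels
head(res$submodels)    # matrix columns: row number, p, Cp, submodel
res$model_min          # textual expression of model_min
res$FinalCp            # Cp characteristic of the final model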
Examples
mtcars ; MyResults=final_model(mtcars) # well known data in R
# the red points correspond to the starting submodel in the testing process
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
ymin= ifelse(min(yCp)<0, 1.1* min(yCp), 0.9* min(yCp))
YRange=c( ymin ,1.5*max(xp)) ; FinalCp= MyResults$FinalCp
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="", ylim=YRange ,
col=ifelse( ( round(yCp,4)== round(min(yCp),4)| round(yCp,4)== round(FinalCp,4) ),
"red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)
MyResults$full_model
summary(MyResults$full_model_results)
MyResults$final_model
summary(MyResults$final_model_results)
# 8 tables are available for illustration of the function final_model
# Gilmour9p, T1, T2, Trivial
# Modified_Gilmour9p, Parks5p, Patents5p and EU2019
# illustrative data from the original Gilmour paper: Gilmour9p
Gilmour9p ; MyResults=final_model(Gilmour9p)
# the red points correspond to the starting submodel in the testing process
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
ymin= ifelse(min(yCp)<0, 1.1* min(yCp), 0.9* min(yCp))
YRange=c( ymin ,1.5*max(xp)) ; FinalCp= MyResults$FinalCp
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="", ylim=YRange ,
col=ifelse( ( round(yCp,4)== round(min(yCp),4)| round(yCp,4)== round(FinalCp,4) ),
"red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)
MyResults$full_model
summary(MyResults$full_model_results)
MyResults$final_model
summary(MyResults$final_model_results)
# T1 contains 12 regressors and 16 observations, simulated data without real meaning
# more submodels and calculation takes about 5 seconds
# the null hypothesis is not rejected in the first test
# it illustrates the situation if full model is final model
# data has no practical meaning
T1; MyResults=final_model(T1)
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
ymin= ifelse(min(yCp)<0, 1.1* min(yCp), 0.9* min(yCp))
YRange=c( ymin ,1.5*max(xp)) ; FinalCp= MyResults$FinalCp
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="", ylim=YRange ,
col=ifelse( ( round(yCp,4)== round(min(yCp),4)| round(yCp,4)== round(FinalCp,4) ),
"red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)
MyResults$full_model
summary(MyResults$full_model_results)
MyResults$final_model
summary(MyResults$final_model_results)
# T2 illustrates the situation if
# the loop of tests is finished by the Trivial model
# data has no practical meaning
T2; MyResults=final_model(T2)
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
ymin= ifelse(min(yCp)<0, 1.1* min(yCp), 0.9* min(yCp))
YRange=c( ymin ,1.5*max(xp)) ; FinalCp= MyResults$FinalCp
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="", ylim=YRange ,
col=ifelse( ( round(yCp,4)== round(min(yCp),4)| round(yCp,4)== round(FinalCp,4) ),
"red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)
MyResults$full_model
summary(MyResults$full_model_results)
MyResults$final_model
summary(MyResults$final_model_results)
# Trivial illustrates the situation in which
# the Trivial model is model_min without testing process
# data has no practical meaning
Trivial; MyResults=final_model(Trivial)
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
ymin= ifelse(min(yCp)<0, 1.1* min(yCp), 0.9* min(yCp))
YRange=c( ymin ,1.5*max(xp)) ; FinalCp= MyResults$FinalCp
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="", ylim=YRange ,
col=ifelse( ( round(yCp,4)== round(min(yCp),4)| round(yCp,4)== round(FinalCp,4) ),
"red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)
MyResults$full_model
summary(MyResults$full_model_results)
MyResults$final_model
summary(MyResults$final_model_results)
# special illustrative data if more than two tests
# are done in the loop in the function final_model()
# original Gilmour table was modified
# data has no practical meaning
Modified_Gilmour9p ; MyResults=final_model(Modified_Gilmour9p)
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
ymin= ifelse(min(yCp)<0, 1.1* min(yCp), 0.9* min(yCp))
YRange=c( ymin ,1.5*max(xp)) ; FinalCp= MyResults$FinalCp
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="", ylim=YRange ,
col=ifelse( ( round(yCp,4)== round(min(yCp),4)| round(yCp,4)== round(FinalCp,4) ),
"red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)
MyResults$full_model
summary(MyResults$full_model_results)
MyResults$final_model
summary(MyResults$final_model_results)
# number of visitors in parks, citation:
# Stemberk Josef, Josef Dolejs, Petra Maresova, Kamil Kuca.
# Factors affecting the number of Visitors in National Parks
# in the Czech Republic, Germany and Austria.
# International Journal of Geo-Information. http://www.mdpi.com/2220-9964/7/3/124
# ISPRS Int. J. Geo-Inf. 2018, 7(3), 124; doi:10.3390/ijgi7030124
d=Parks5p ; rownames(d)= d[,1]; d=d[,-1]; d; MyResults=final_model(d); rm(d)
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
ymin= ifelse(min(yCp)<0, 1.1* min(yCp), 0.9* min(yCp))
YRange=c( ymin ,1.5*max(xp)) ; FinalCp= MyResults$FinalCp
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="", ylim=YRange ,
col=ifelse( ( round(yCp,4)== round(min(yCp),4)| round(yCp,4)== round(FinalCp,4) ),
"red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)
MyResults$full_model
summary(MyResults$full_model_results)
MyResults$final_model
summary(MyResults$final_model_results)
# number of patents in universities (see column names – regressors)
# Perspective and Suitable Research Area in
# Public Research-Patent Analysis of the Czech Public Universities
# Education and Urban Society, 54(7), https://doi.org/10.1177/00131245211027362
# Maresova Petra, Soukal Ivan, Stemberkova Ruzena, Selamat Ali
Patents5p; MyResults=final_model(Patents5p)
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
ymin= ifelse(min(yCp)<0, 1.1* min(yCp), 0.9* min(yCp))
YRange=c( ymin ,1.5*max(xp)) ; FinalCp= MyResults$FinalCp
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="", ylim=YRange ,
col=ifelse( ( round(yCp,4)== round(min(yCp),4)| round(yCp,4)== round(FinalCp,4) ),
"red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)
MyResults$full_model
summary(MyResults$full_model_results)
MyResults$final_model
summary(MyResults$final_model_results)
# illustrative econometric data from Eurostat for 5 variables in 17 countries in 2019
# columns: Country, LifExp , HDP, Unempl, Obesity, APassangers
d=EU2019 ; d
rownames(d)= d[,1]; d=d[,-1]; d # the same data without the first column (country names)
MyResults=final_model(d)
# plot of all Cp, the red line is the relationship: "Cp ~ p"
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
FinalCp= MyResults$FinalCp
# plot without y limits "ylim=c(ymin, 1.5*max(xp))"
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="",
col=ifelse( ( round(yCp,4)== round(min(yCp),4)| round(yCp,4)== round(FinalCp,4) ),
"red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)
MyResults$full_model
summary(MyResults$full_model_results)
MyResults$final_model
summary(MyResults$final_model_results)
The function calculates all possible submodels and finds the model with the minimal value of Cp according to Gilmour
Description
The original Gilmour study proposed a method for interpreting the adjusted Cp statistic to identify the best regression model when dealing with a relatively large number of numerical regressors and a smaller number of observations.
(The interpretation of Mallows's Cp-statistic. The Statistician. 1996. 45(1):49-56. doi:10.2307/2348411)
The procedure consists of two main steps (two diagrams are available in the vignette of the package):
Initial Selection:
Adjusted \bar{C}_p values are calculated for all possible combinations of regressors. The submodel with the minimum adjusted \bar{C}_p value is selected and labeled ModelMin.
Reduction and Testing (performed only in the function 'final_model'):
All submodels nested within ModelMin are considered. Among them, the submodel with the lowest adjusted \bar{C}_p value is selected as a new candidate. A hypothesis test is then performed in which the null hypothesis states that ModelMin provides a better fit than the candidate submodel. If the null hypothesis is not rejected, ModelMin is accepted as the 'final model'. If the null hypothesis is rejected, the candidate submodel becomes the new ModelMin, and the process is repeated with submodels nested within this new model. The procedure continues iteratively until a null hypothesis is not rejected or until the so-called trivial model (i.e., a model using only the arithmetic mean) is reached.
The second step is performed only in the function 'final_model'.
The function returns all regression results for the full model and ModelMin.
Additionally, it outputs \bar{C}_p values for all submodels, enabling users to easily generate an adjusted \bar{C}_p vs. p plot (where p is the number of regressors + 1), as recommended by Gilmour.
The input data should be provided as a data.frame, where the first column contains the numerical dependent variable, and the subsequent columns contain the regressors. The package also includes nine illustrative examples to demonstrate its functionality.
Adjusted \bar{C}_p is calculated using the formula:
\bar{C}_p = C_p - \frac{2(k - p + 1)}{n - k - 3}
where k is the number of regressors in the full model, and p - 1 is the number of regressors in the submodel (thus, p includes the intercept term).
Further details and theoretical background can be found in Gilmour's original paper.
For example, in the case of the trivial model (i.e., a model with only an intercept, so p = 1 and k - p + 1 = k), the adjusted statistic is:
\bar{C}_1 = C_1 - \frac{2k}{n - k - 3} = \frac{var(y)(n-1)}{MSE} - n + 2 - \frac{2k}{n - k - 3}
where the sample variance function var() is used to calculate the sum of squares in the trivial model, and MSE is the estimate of \sigma^2 from the full model.
Usage
submodels(d)
Arguments
d: a data.frame; the first column contains the numerical dependent variable and the remaining columns contain the regressors.
Value
A list with the following components:
y (numeric vector of the dependent variable, the first column of the data.frame input "d");
X (numeric matrix of regressors, the remaining columns of the input "d");
n (number of observations in "d", numeric value);
k (number of regressors, numeric value);
t (degrees of freedom, t = n - k - 1, numeric value);
regressors (set of strings of regressors);
expression of full_model (string);
regression results of full model (list calculated using lm());
MSE (characteristic of the full model, numeric value);
submodels_number (number of all possible submodels, numeric value);
Cq_plus_1A (numeric value, the Cp characteristic of model_min with the minimal adjusted Cp);
q (level of model_min, the number of regressors in model_min, numeric value);
submodels (matrix with four columns: row number, p, Cp, submodel);
model_min (textual expression of model_min, string);
regression results of model_min (list calculated using lm()).
Examples
# the standard well known data
d=mtcars ; d ; MyResults=submodels(d)
# the red points correspond to the starting submodel in the testing process
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
ymin= ifelse(min(yCp)<0, 1.1* min(yCp), 0.9* min(yCp))
YRange=c( ymin ,1.5*max(xp))
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="", ylim=YRange ,
col=ifelse( round(yCp,4)== round(min(yCp),4), "red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)
# 8 tables are available for illustration of the functions 'submodels' and 'final_model'
# illustrative data from the original Gilmour paper
Gilmour9p;MyResults=submodels(Gilmour9p)
# the red points correspond to the starting submodel in the testing process
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
ymin= ifelse(min(yCp)<0, 1.1* min(yCp), 0.9* min(yCp))
YRange=c( ymin ,1.5*max(xp))
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="", ylim=YRange ,
col=ifelse( round(yCp,4)== round(min(yCp),4), "red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)
# 12 regressors and 16 observations, simulated data without real meaning
# more submodels and calculation takes about 5 seconds
# the null hypothesis is not rejected in the first test
T1 ; MyResults=submodels(T1)
# the red points correspond to the starting submodel in the testing process
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
ymin= ifelse(min(yCp)<0, 1.1* min(yCp), 0.9* min(yCp))
YRange=c( ymin ,1.5*max(xp))
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="", ylim=YRange ,
col=ifelse( round(yCp,4)== round(min(yCp),4), "red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)
# the loop is finished by the Trivial model for data T2
T2 ; MyResults=submodels(T2)
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
ymin= ifelse(min(yCp)<0, 1.1* min(yCp), 0.9* min(yCp))
YRange=c( ymin ,1.5*max(xp))
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="", ylim=YRange ,
col=ifelse( round(yCp,4)== round(min(yCp),4), "red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)
# Trivial is illustrative data in which the trivial model is model_min without the testing process
Trivial ; MyResults=submodels(Trivial)
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
ymin= ifelse(min(yCp)<0, 1.1* min(yCp), 0.9* min(yCp))
YRange=c( ymin ,1.5*max(xp))
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="", ylim=YRange ,
col=ifelse( round(yCp,4)== round(min(yCp),4), "red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)
# special illustrative data for more than two tests in the loop in the function 'final_model'
Modified_Gilmour9p ; MyResults=submodels(Modified_Gilmour9p)
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
ymin= ifelse(min(yCp)<0, 1.1* min(yCp), 0.9* min(yCp))
YRange=c( ymin ,1.5*max(xp))
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="", ylim=YRange ,
col=ifelse( round(yCp,4)== round(min(yCp),4), "red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)
# number of visitors in parks
# citation: Stemberk Josef, Josef Dolejs, Petra Maresova, Kamil Kuca.
# Factors affecting the number of Visitors in National Parks
# in the Czech Republic, Germany and Austria.
# International Journal of Geo-Information. https://www.mdpi.com/2220-9964/7/3/124
# ISPRS Int. J. Geo-Inf. 2018, 7(3), 124; doi:10.3390/ijgi7030124
d=Parks5p ; rownames(d)= d[,1]; d=d[,-1]; d
MyResults=submodels(d)
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
ymin= ifelse(min(yCp)<0, 1.1* min(yCp), 0.9* min(yCp))
YRange=c( ymin ,1.5*max(xp))
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="", ylim=YRange ,
col=ifelse( round(yCp,4)== round(min(yCp),4), "red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)
# number of patents in universities (see column names – regressors)
# citation: Perspective and Suitable Research Area
# in Public Research-Patent Analysis of the Czech Public Universities
# Education and Urban Society, 54(7), https://doi.org/10.1177/00131245211027362
Patents5p ; MyResults=submodels(Patents5p)
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
ymin= ifelse(min(yCp)<0, 1.1* min(yCp), 0.9* min(yCp))
YRange=c( ymin ,1.5*max(xp))
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="", ylim=YRange ,
col=ifelse( round(yCp,4)== round(min(yCp),4), "red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)
# illustrative econometric data from Eurostat for 5 variables in 17 countries in 2019
# columns: LifExp , HDP, Unempl, Obesity, APassangers
d= EU2019
rownames(d)= d[,1]; d=d[,-1]; d # the same data without the first column (country names)
MyResults=submodels(d)
yCp= as.numeric(MyResults$submodels[,3]) ; xp= as.numeric(MyResults$submodels[,2])
# plot without y limits "ylim=c(ymin, 1.5*max(xp))"
plot(yCp ~ xp, xlab="Number of Parameters in Submodel",ylab="",
col=ifelse( round(yCp,4)== round(min(yCp),4), "red", "darkblue") )
lines(xp, xp, col="red")
mtext(bquote(paste( bar(C) , "p")), side=2, line=3, padj=1, cex=1.2)
mtext(bquote(paste("All Submodels: ",bar(C),"p ~ p")), side=3, line=3, padj=1, cex=1.2)