Type: | Package |
Title: | Hierarchical and Geographically Weighted Regression |
Version: | 0.6-1 |
Date: | 2024-11-15 |
Maintainer: | Yigong Hu <yigong.hu@bristol.ac.uk> |
Description: | This model divides coefficients into three types, i.e., local fixed effects, global fixed effects, and random effects (Hu et al., 2022)<doi:10.1177/23998083211063885>. If data have spatial hierarchical structures (especially are overlapping on some locations), it is worth trying this model to reach better fitness. |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
URL: | https://github.com/HPDell/hgwrr/, https://hpdell.github.io/hgwrr/ |
Imports: | Rcpp (≥ 1.0.8) |
LinkingTo: | Rcpp, RcppArmadillo |
Depends: | R (≥ 3.5.0), sf, stats, utils, MASS |
NeedsCompilation: | yes |
Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0), furrr, progressr, |
SystemRequirements: | GNU make |
RoxygenNote: | 7.2.3 |
VignetteBuilder: | knitr |
Config/Needs/website: | tidyverse, ggplot2, tmap, lme4, spdep, GWmodel |
Packaged: | 2024-11-15 16:39:59 UTC; yigong |
Author: | Yigong Hu [aut, cre], Richard Harris [aut], Richard Timmerman [aut] |
Repository: | CRAN |
Date/Publication: | 2024-11-16 11:50:02 UTC |
HGWR: Hierarchical and Geographically Weighted Regression
Description
This package provides an R and C++ implementation of the Hierarchical and Geographically Weighted Regression (HGWR) model. The model divides coefficients into three types: local fixed effects, global fixed effects, and random effects. If the data have a spatial hierarchical structure (especially when samples overlap at some locations), it is worth trying this model to achieve a better fit.
Details
Package: | hgwrr |
Type: | Package |
Title: | Hierarchical and Geographically Weighted Regression |
Version: | 0.6-1 |
Date: | 2024-11-15 |
Authors@R: | c(person(given = "Yigong", family = "Hu", role = c("aut", "cre"), email = "yigong.hu@bristol.ac.uk"), person(given = "Richard", family = "Harris", role = "aut"), person(given = "Richard", family = "Timmerman", role = "aut")) |
Maintainer: | Yigong Hu <yigong.hu@bristol.ac.uk> |
Description: | This model divides coefficients into three types, i.e., local fixed effects, global fixed effects, and random effects (Hu et al., 2022)<doi:10.1177/23998083211063885>. If data have spatial hierarchical structures (especially are overlapping on some locations), it is worth trying this model to reach better fitness. |
License: | GPL (>= 2) |
URL: | https://github.com/HPDell/hgwrr/, https://hpdell.github.io/hgwrr/ |
Imports: | Rcpp (>= 1.0.8) |
LinkingTo: | Rcpp, RcppArmadillo |
Depends: | R (>= 3.5.0), sf, stats, utils, MASS |
NeedsCompilation: | yes |
Suggests: | knitr, rmarkdown, testthat (>= 3.0.0), furrr, progressr, |
SystemRequirements: | GNU make |
Roxygen: | list(markdown = TRUE) |
RoxygenNote: | 7.2.3 |
VignetteBuilder: | knitr |
Config/Needs/website: | tidyverse, ggplot2, tmap, lme4, spdep, GWmodel |
Author: | Yigong Hu [aut, cre], Richard Harris [aut], Richard Timmerman [aut] |
Note
Acknowledgement: We gratefully acknowledge support from the China Scholarship Council.
Author(s)
Yigong Hu, Richard Harris, Richard Timmerman
References
Hu, Y., Lu, B., Ge, Y., Dong, G., 2022. Uncovering spatial heterogeneity in real estate prices via combined hierarchical linear model and geographically weighted regression. Environment and Planning B: Urban Analytics and City Science. doi:10.1177/23998083211063885
Get estimated coefficients.
Description
Get estimated coefficients.
Usage
## S3 method for class 'hgwrm'
coef(object, ...)
Arguments
object |
An hgwrm object. |
... |
Parameters received from other functions. |
Value
A DataFrame object consisting of all estimated coefficients.
See Also
hgwr(), summary.hgwrm(), fitted.hgwrm() and residuals.hgwrm().
Get fitted response.
Description
Get fitted response.
Usage
## S3 method for class 'hgwrm'
fitted(object, ...)
Arguments
object |
An hgwrm object. |
... |
Parameters received from other functions. |
Value
A vector consisting of fitted response values.
See Also
hgwr(), summary.hgwrm(), coef.hgwrm() and residuals.hgwrm().
Log likelihood function
Description
Log likelihood function
Usage
## S3 method for class 'hgwrm'
logLik(object, ...)
Arguments
object |
An hgwrm object. |
... |
Additional arguments. |
Value
A logLik instance used by the S3 method logLik().
Make Dummy Variables
Description
Function make_dummy converts categorical variables in a data frame to dummy variables.
Function make_dummy_extract converts a column to dummy variables if necessary and assigns appropriate names.
See the "Details" section for further information.
Users can define their own functions to allow the model to deal with some types of variables properly.
Usage
make_dummy(data)
make_dummy_extract(col, name)
## S3 method for class 'character'
make_dummy_extract(col, name)
## S3 method for class 'factor'
make_dummy_extract(col, name)
## S3 method for class 'logical'
make_dummy_extract(col, name)
## Default S3 method:
make_dummy_extract(col, name)
Arguments
data |
The data frame from which dummy variables need to be extracted. |
col |
A vector to extract dummy variables. |
name |
The vector's name. |
Details
If col is a character vector, the function will get the unique values of its elements and leave out the last one. Then, the unique values are combined with the name argument to form the names of new columns.
If col is a factor vector, the function will get its levels and leave out the last one. Then, the level labels are combined with the name argument to form the names of new columns.
If col is a logical vector, the function will convert it to a numeric vector with TRUE mapped to 1 and FALSE to 0.
If col is of any other type, the default behaviour is simply to copy the original values and try to convert them to numeric values.
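The rule for character vectors can be sketched in base R as follows. This is only a minimal illustration and not the package's implementation: the helper name dummy_sketch and the use of a dot to join the name and each value are assumptions made here for readability.

```r
# Hypothetical helper (not part of hgwrr): mimics the documented rule for
# character vectors, i.e. one 0/1 column per unique value except the last.
dummy_sketch <- function(col, name) {
  values <- unique(col)
  keep <- values[-length(values)]            # leave out the last unique value
  out <- lapply(keep, function(v) as.numeric(col == v))
  # Assumption: new column names join `name` and the value with a dot.
  names(out) <- sprintf("%s.%s", name, keep)
  as.data.frame(out)
}

d <- dummy_sketch(c("top", "mid", "low", "mid", "top"), "level")
print(d)
```

Leaving out one value avoids perfect collinearity with an intercept, which is presumably why the documented behaviour drops the last unique value or level.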
Value
The data frame with extracted dummy variables.
Examples
make_dummy(iris["Species"])
make_dummy_extract(iris$Species, "Species")
make_dummy_extract(c("top", "mid", "low", "mid", "top"), "level")
make_dummy_extract(factor(c("far", "near", "near")), "distance")
make_dummy_extract(c(TRUE, TRUE, FALSE), "sold")
Simulated Spatial Multisampling Data For Test (DataFrame)
Description
A simulated data set for testing, with a spatial hierarchical structure and samples overlapping at certain locations.
Usage
data(mulsam.test)
Format
A list of three items called "data", "coords" and "beta". Item "data" is a data frame with 873 observations at 25 locations and the following 6 variables.
y
a numeric vector, dependent variable y
g1
a numeric vector, group level independent variable g_1
g2
a numeric vector, group level independent variable g_2
z1
a numeric vector, sample level independent variable z_1
x1
a numeric vector, sample level independent variable x_1
group
a numeric vector, group id of each sample
where g1 and g2 are used to estimate local fixed effects; x1 is used to estimate global fixed effects and z1 is used to estimate random effects.
Author(s)
Yigong Hu yigong.hu@bristol.ac.uk
Examples
data(mulsam.test)
hgwr(formula = y ~ L(g1 + g2) + x1 + (z1 | group),
data = mulsam.test$data,
coords = mulsam.test$coords,
bw = 10, kernel = "bisquared")
Large Scale Simulated Spatial Multisampling Data (DataFrame)
Description
A simulated data set with a spatial hierarchical structure and samples overlapping at certain locations.
Usage
data(multisampling)
Format
A list of three items called "data", "coords" and "beta". Item "data" is a data frame with 21434 observations at 625 locations and the following 6 variables.
y
a numeric vector, dependent variable y
g1
a numeric vector, group level independent variable g_1
g2
a numeric vector, group level independent variable g_2
z1
a numeric vector, sample level independent variable z_1
x1
a numeric vector, sample level independent variable x_1
group
a numeric vector, group id of each sample
where g1 and g2 are used to estimate local fixed effects; x1 is used to estimate global fixed effects and z1 is used to estimate random effects.
Author(s)
Yigong Hu yigong.hu@bristol.ac.uk
Examples
## Not run:
data(multisampling)
hgwr(formula = y ~ L(g1 + g2) + x1 + (z1 | group),
data = multisampling$data,
coords = multisampling$coords,
bw = 32)
## End(Not run)
Print description of an hgwrm object.
Description
Print description of an hgwrm object.
Usage
## S3 method for class 'hgwrm'
print(x, decimal.fmt = "%.6f", ...)
Arguments
x |
An hgwrm object. |
decimal.fmt |
The format string passing to |
... |
Arguments passed on to
|
Value
No return value.
See Also
summary.hgwrm(), print_table_md().
Examples
data(mulsam.test)
model <- hgwr(
formula = y ~ L(g1 + g2) + x1 + (z1 | group),
data = mulsam.test$data,
coords = mulsam.test$coords,
bw = 10
)
print(model)
print(model, table_style = "md")
Print the result of spatial heterogeneity test
Description
Print the result of spatial heterogeneity test
Usage
## S3 method for class 'spahetbootres'
print(x, ...)
Arguments
x |
A spahetbootres object. |
... |
Other unused arguments. |
Print summary of an hgwrm object.
Description
Print summary of an hgwrm object.
Usage
## S3 method for class 'summary.hgwrm'
print(x, decimal.fmt = "%.6f", ...)
Arguments
x |
An object returned from summary.hgwrm(). |
decimal.fmt |
The format string passing to |
... |
Arguments passed on to
|
Value
No return value.
See Also
summary.hgwrm(), print_table_md().
Examples
data(mulsam.test)
model <- hgwr(
formula = y ~ L(g1 + g2) + x1 + (z1 | group),
data = mulsam.test$data,
coords = mulsam.test$coords,
bw = 10
)
summary(model)
Print a character matrix as a table.
Description
Print a character matrix as a table.
Usage
print_table_md(
x,
col_sep = "",
header_sep = "",
row_begin = "",
row_end = "",
table_before = NA_character_,
table_after = NA_character_,
table_style = c("plain", "md", "latex", "booktabs"),
...
)
Arguments
x |
A character matrix. |
col_sep |
Column separator. Defaults to "". |
header_sep |
Header separator. Defaults to "". |
row_begin |
Character at the beginning of each row. Defaults to "". |
row_end |
Character at the end of each row. Defaults to "". |
table_before |
Characters to be printed before the table. |
table_after |
Characters to be printed after the table. |
table_style |
Name of a pre-defined style. Possible values are "plain", "md", "latex" and "booktabs". |
... |
Additional style control arguments. |
Details
When table_style is specified, col_sep, header_sep, row_begin and row_end take no effect, because the function sets their values automatically.
For each possible value of table_style, the corresponding style settings are shown in the following table.
Setting | plain | md | latex |
col_sep | "" | "|" | "&" |
header_sep | "" | "-" | "" |
row_begin | "" | "|" | "" |
row_end | "" | "|" | "\\" |
In this function, characters are right padded by spaces.
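To make the "md" column of the table concrete, the sketch below lays out a small character matrix with those settings in base R. It is an illustration under assumptions rather than the package's code: the helper name md_table_sketch is hypothetical, and repeating header_sep to the width of each column is a guess at how the header rule is built.

```r
# Hypothetical helper (not the package's implementation): lays out a character
# matrix with the "md" settings, i.e. col_sep = "|", row_begin = "|", row_end = "|".
md_table_sketch <- function(x) {
  widths <- apply(matrix(nchar(x), nrow = nrow(x)), 2, max)
  padded <- x
  for (j in seq_len(ncol(x))) {
    # flag = "-" left-justifies, so each cell is right padded with spaces
    padded[, j] <- formatC(x[, j], width = widths[j], flag = "-")
  }
  rows <- paste0("|", apply(padded, 1, paste, collapse = "|"), "|")
  # Assumption: header_sep = "-" is repeated to fill each column of the rule.
  rule <- paste0("|", paste(strrep("-", widths), collapse = "|"), "|")
  c(rows[1], rule, rows[-1])
}

x <- rbind(c("term", "estimate"), c("x1", "0.512345"))
out <- md_table_sketch(x)
cat(out, sep = "\n")
```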
Value
No return value.
See Also
print.hgwrm(), summary.hgwrm().
Get residuals.
Description
Get residuals.
Usage
## S3 method for class 'hgwrm'
residuals(object, ...)
Arguments
object |
An hgwrm object. |
... |
Parameters received from other functions. |
Value
A vector consisting of residuals.
See Also
hgwr(), summary.hgwrm(), coef.hgwrm() and fitted.hgwrm().
Generic method to test spatial heterogeneity
Description
Generic method to test spatial heterogeneity
Usage
spatial_hetero_test(x, ...)
## Default S3 method:
spatial_hetero_test(x, ...)
## S3 method for class 'matrix'
spatial_hetero_test(x, coords, ...)
## S3 method for class 'numeric'
spatial_hetero_test(x, coords, ...)
## S3 method for class 'vector'
spatial_hetero_test(x, coords, ...)
## S3 method for class 'data.frame'
spatial_hetero_test(x, coords, ...)
## S3 method for class 'sf'
spatial_hetero_test(x, ...)
Arguments
x |
The data to be tested. |
... |
Arguments passed on to
|
coords |
The coordinates used for testing.
Accepts a matrix or vector.
For matrix, it needs to have the same number of rows as |
Methods (by class)
- spatial_hetero_test(default): Default behavior.
- spatial_hetero_test(matrix): For the matrix, coords is necessary.
- spatial_hetero_test(numeric): Takes x as values of a series of variables stored by column, and coords as coordinates for each row in x.
- spatial_hetero_test(vector): Takes x as values of the variable, and coords as coordinates for each element in x.
- spatial_hetero_test(data.frame): Takes x as variable values (each column is a variable), and coords as coordinates for each row in x.
- spatial_hetero_test(sf): For the sf object, coordinates of centroids are used. Only the numerical columns are tested.
Hierarchical and Geographically Weighted Regression
Description
A Hierarchical Linear Model (HLM) with group-level geographically weighted effects.
Usage
## S3 method for class 'hgwrm'
spatial_hetero_test(
x,
round = 99,
statistic = stat_glsw,
parallel = FALSE,
verbose = 0,
...
)
hgwr(
formula,
data,
...,
bw = "CV",
kernel = c("gaussian", "bisquared"),
alpha = 0.01,
eps_iter = 1e-06,
eps_gradient = 1e-06,
max_iters = 1e+06,
max_retries = 1e+06,
ml_type = c("D_Only", "D_Beta"),
f_test = FALSE,
verbose = 0
)
## S3 method for class 'sf'
hgwr(
formula,
data,
...,
bw = "CV",
kernel = c("gaussian", "bisquared"),
alpha = 0.01,
eps_iter = 1e-06,
eps_gradient = 1e-06,
max_iters = 1e+06,
max_retries = 1e+06,
ml_type = c("D_Only", "D_Beta"),
f_test = FALSE,
verbose = 0
)
## S3 method for class 'data.frame'
hgwr(
formula,
data,
...,
coords,
bw = "CV",
kernel = c("gaussian", "bisquared"),
alpha = 0.01,
eps_iter = 1e-06,
eps_gradient = 1e-06,
max_iters = 1e+06,
max_retries = 1e+06,
ml_type = c("D_Only", "D_Beta"),
f_test = FALSE,
verbose = 0
)
hgwr_fit(
formula,
data,
coords,
bw = c("CV", "AIC"),
kernel = c("gaussian", "bisquared"),
alpha = 0.01,
eps_iter = 1e-06,
eps_gradient = 1e-06,
max_iters = 1e+06,
max_retries = 1e+06,
ml_type = c("D_Only", "D_Beta"),
f_test = FALSE,
verbose = 0
)
Arguments
x |
An hgwrm object. |
round |
The number of times to sample from the model. |
statistic |
A function used to calculate the statistic on the original data and the bootstrapped data. Defaults to the variance of standardised GLSW estimates. |
parallel |
If TRUE, use |
verbose |
An integer value. Determines the log level. Possible values are:
|
... |
Further arguments for the specified type of |
formula |
A formula.
Its structure is similar to response ~ L(glsw) + fixed + (random | group) For more information, please see the |
data |
The data. |
bw |
A numeric value. It is the value of bandwidth or |
kernel |
A character value. It specifies which kernel function is used in the GWR part. Possible values are
|
alpha |
A numeric value. It is the size of the first trial step in maximum likelihood algorithm. |
eps_iter |
A numeric value. Termination threshold of back-fitting. |
eps_gradient |
A numeric value. Termination threshold of the maximum likelihood algorithm. |
max_iters |
An integer value. The maximum number of iterations. |
max_retries |
An integer value. If the algorithm tends to diverge, it stops automatically after max_retries attempts. |
ml_type |
An integer value. Represents which maximum likelihood algorithm is used. Possible values are:
|
f_test |
A logical value. Determines whether to perform an F test on GLSW effects.
If |
coords |
A 2-column matrix. It consists of coordinates for each group. |
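The two kernel names correspond to distance-decay functions commonly used in geographically weighted regression. The sketch below writes out the usual textbook forms in base R. It assumes, without confirmation from this manual, that the package uses the same definitions; the function names are illustrative only.

```r
# Textbook GWR kernels; d is a distance and b a bandwidth (same unit for both).
# Assumption: these standard forms match what the package computes internally.
gaussian_kernel <- function(d, b) exp(-0.5 * (d / b)^2)
bisquare_kernel <- function(d, b) ifelse(d < b, (1 - (d / b)^2)^2, 0)

d <- c(0, 0.5, 1, 2)
w_gauss <- gaussian_kernel(d, b = 1)
w_bisq <- bisquare_kernel(d, b = 1)
print(rbind(d, w_gauss, w_bisq))
```

The Gaussian kernel gives every group a positive weight, whereas the bisquare kernel cuts weights to zero at and beyond the bandwidth, so the latter yields strictly local estimates.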
Details
Effect Specification in Formula
In the HGWR model, there are three types of effects specified by the
formula
argument:
- Group-level spatially weighted (GLSW, a.k.a. local fixed) effects: effects wrapped by the functional symbol L.
- Sample-level random (SLR) effects: effects specified outside the functional symbol L but to the left of the symbol |.
- Fixed effects: other effects.
For example, the formula in the example of this function below is written as
y ~ L(g1 + g2) + x1 + (z1 | group)
where g1 and g2 are GLSW effects, x1 is the fixed effect, and z1 is the SLR effect grouped by the group indicator group.
Note that SLR effects can only be specified once!
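The way such a formula splits into the three effect types can be illustrated by walking its right-hand side in base R. The helper split_effects below is hypothetical and is not how the package parses formulas; it only shows which variable names fall under each rule.

```r
# Hypothetical helper (not part of hgwrr): classifies the variables on the
# right-hand side of an HGWR-style formula by walking its call tree.
split_effects <- function(f) {
  out <- list(glsw = character(), fixed = character(),
              random = character(), group = character())
  walk <- function(e) {
    if (is.call(e) && identical(e[[1]], as.name("+"))) {
      for (a in as.list(e)[-1]) walk(a)            # visit both sides of `+`
    } else if (is.call(e) && identical(e[[1]], as.name("("))) {
      walk(e[[2]])                                 # unwrap parentheses
    } else if (is.call(e) && identical(e[[1]], as.name("L"))) {
      out$glsw <<- c(out$glsw, all.vars(e[[2]]))   # wrapped by L(): GLSW effects
    } else if (is.call(e) && identical(e[[1]], as.name("|"))) {
      out$random <<- c(out$random, all.vars(e[[2]]))  # left of `|`: SLR effects
      out$group <<- all.vars(e[[3]])                  # right of `|`: group indicator
    } else {
      out$fixed <<- c(out$fixed, all.vars(e))      # everything else: fixed effects
    }
  }
  walk(f[[3]])   # f[[3]] is the right-hand side of a two-sided formula
  out
}

eff <- split_effects(y ~ L(g1 + g2) + x1 + (z1 | group))
str(eff)
```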
Value
A list describing the model with following fields.
gamma
Coefficients of group-level spatially weighted effects.
beta
Coefficients of fixed effects.
mu
Coefficients of sample-level random effects.
D
Variance-covariance matrix of sample-level random effects.
sigma
Variance of errors.
effects
A list including names of all effects.
call
Calling of this function.
frame
The DataFrame object sent to this call.
frame.parsed
Variables extracted from the data.
groups
Unique group labels extracted from the data.
f_test
A list of F tests for GLSW effects. Only exists when f_test=TRUE. Each item contains the F value, the degrees of freedom in the numerator, the degrees of freedom in the denominator, and the p value of F > F_\alpha.
Functions
- spatial_hetero_test(hgwrm): Test the spatial heterogeneity with bootstrapping.
- hgwr_fit(): Fit an HGWR model.
Examples
data(mulsam.test)
hgwr(
formula = y ~ L(g1 + g2) + x1 + (z1 | group),
data = mulsam.test$data,
coords = mulsam.test$coords,
bw = 10
)
mod_Ftest <- hgwr(
formula = y ~ L(g1 + g2) + x1 + (z1 | group),
data = mulsam.test$data,
coords = mulsam.test$coords,
bw = 10,
f_test = TRUE
)
summary(mod_Ftest)
Test the spatial heterogeneity in data based on permutation.
Description
Test the spatial heterogeneity in data based on permutation.
Usage
spatial_hetero_test_data(
x,
coords,
...,
resample = 5000,
poly = 2,
bw = 10,
kernel = c("bisquared", "gaussian"),
verbose = 0
)
Arguments
x |
A matrix of data to be tested. Each column is a variable. |
coords |
A matrix of coordinates. |
... |
Additional arguments. |
resample |
The total number of resampling iterations with replacement. Defaults to 5000. |
poly |
The number of polynomial terms used by the polynomial estimator. Defaults to 2. |
bw |
The adaptive bandwidth used by the polynomial estimator. Defaults to 10. |
kernel |
The kernel function used by the polynomial estimator. |
verbose |
The verbosity level. Defaults to 0. |
Value
A spahetbootres object of permutation-test results with the following items:
vars
The names of variables.
t0
The value of the statistic on the original values.
t
The value of the same statistic on the permuted values.
p
The p-value for each variable.
Currently, variance is used as the statistic.
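The logic behind t0, t and p can be sketched with a toy permutation test in base R. Several simplifications are assumed here and none of them is taken from the package: a k-nearest-neighbour local mean stands in for the local polynomial estimator, values are shuffled with sample() (permutation without replacement), only 199 permutations are run, and the data are simulated.

```r
set.seed(1)
n <- 100
coords <- cbind(runif(n), runif(n))
x <- 5 * coords[, 1] + rnorm(n, sd = 0.2)   # toy variable with a strong west-east trend

# Indices of the k nearest neighbours of every location (k plays the role of bw).
k <- 10
d <- as.matrix(dist(coords))
nn <- t(apply(d, 1, function(di) order(di)[seq_len(k)]))

# Simplification: a local mean stands in for the local polynomial estimator.
local_mean <- function(v) rowMeans(matrix(v[nn], nrow = nrow(nn)))

t0 <- var(local_mean(x))                               # statistic on original values
t_perm <- replicate(199, var(local_mean(sample(x))))   # statistic on permuted values
p <- (1 + sum(t_perm >= t0)) / (1 + length(t_perm))    # permutation p-value
print(p)
```

Shuffling the values breaks any link between a value and its location, so local estimates of shuffled data vary little; a variance on the original data that is much larger than on the shuffled data therefore points to spatial heterogeneity.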
Summarise an hgwrm object.
Description
Summarise an hgwrm object.
Usage
## S3 method for class 'hgwrm'
summary(object, ..., test_hetero = FALSE, verbose = 0)
Arguments
object |
An hgwrm object. |
... |
Other arguments passed from other functions. |
test_hetero |
Logical/list value.
Whether to test the spatial heterogeneity of GLSW effects.
If it is set to |
verbose |
An integer value to control whether additional messages should be reported while testing spatial heterogeneity. |
Details
The parameters used to perform the test of spatial heterogeneity are:
bw
Bandwidth (unit: number of nearest neighbours) used to make spatial kernel density estimation. Default: 10.
poly
The number of polynomial terms used in the local polynomial estimation. Default: 2.
resample
Total resampling times. Default: 5000.
kernel
The kernel function used in the local polynomial estimation. Options are "gaussian" and "bisquared". Default: "bisquared".
Value
A list containing summary information of this hgwrm object with the following fields.
diagnostic
A list of diagnostic information.
random.stddev
The standard deviation of random effects.
random.corr
The correlation matrix of random effects.
residuals
The residual vector.
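How random.stddev and random.corr relate to the variance-covariance matrix D returned by hgwr() is not spelled out here. One plausible reading, offered only as an assumption, is the standard conversion from a covariance matrix to standard deviations and correlations, shown below in base R with made-up numbers.

```r
# Toy variance-covariance matrix of two random effects (made-up numbers).
D <- matrix(c(4, 1,
              1, 9), nrow = 2, byrow = TRUE,
            dimnames = list(c("Intercept", "z1"), c("Intercept", "z1")))

random_stddev <- sqrt(diag(D))   # standard deviations: square roots of the variances
random_corr <- cov2cor(D)        # correlation matrix derived from the covariances
print(random_stddev)
print(random_corr)
```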
Examples
data(mulsam.test)
m <- hgwr(
formula = y ~ L(g1 + g2) + x1 + (z1 | group),
data = mulsam.test$data,
coords = mulsam.test$coords,
bw = 10
)
summary(m)
summary(m, test_hetero = TRUE)
summary(m, test_hetero = list(kernel = "gaussian"))
Wuhan Second-hand House Price and POI Data (DataFrame)
Description
A data set of second-hand house prices in Wuhan, China, collected in 2018.
Usage
data(wuhan.hp)
Format
A list of two items called "data" and "coords". Item "data" is a data frame with 13862 second-hand properties at 779 neighbourhoods and the following 22 variables.
Price
House price per square metre.
Floor.High
1 if a property is on a high floor, otherwise 0.
Floor.Low
1 if a property is on a low floor, otherwise 0.
Decoration.Fine
1 if a property is well decorated, otherwise 0.
PlateTower
1 if a property is of the plate-tower type, otherwise 0.
Steel
1 if a property is of 'steel' structure, otherwise 0.
BuildingArea
Building area in square metres.
Fee
Management fee per square metre per month.
d.Commercial
Distance to the nearest commercial area.
d.Greenland
Distance to the nearest green land.
d.Water
Distance to the nearest river or lake.
d.University
Distance to the nearest university.
d.HighSchool
Distance to the nearest high school.
d.MiddleSchool
Distance to the nearest middle school.
d.PrimarySchool
Distance to the nearest primary school.
d.Kindergarten
Distance to the nearest kindergarten.
d.SubwayStation
Distance to the nearest subway station.
d.Supermarket
Distance to the nearest supermarket.
d.ShoppingMall
Distance to the nearest shopping mall.
lon
Longitude coordinates (Projected CRS: EPSG 3857).
lat
Latitude coordinates (Projected CRS: EPSG 3857).
group
Group id of each sample.
The following variables are group level:
- Fee
- d.Commercial
- d.Greenland
- d.Water
- d.University
- d.HighSchool
- d.MiddleSchool
- d.PrimarySchool
- d.Kindergarten
- d.SubwayStation
- d.Supermarket
- d.ShoppingMall
The following variables are sample level:
- Price
- Floor.High
- Floor.Low
- Decoration.Fine
- PlateTower
- Steel
- BuildingArea
Item "coords" is a 779-by-2 matrix of coordinates of all neighbourhoods.
Author(s)
Yigong Hu yigong.hu@bristol.ac.uk
Examples
## Not run:
data(wuhan.hp)
hgwr(
formula = Price ~ L(d.Water + d.Commercial + d.PrimarySchool +
d.Kindergarten + Fee) + BuildingArea + (Floor.High | group),
data = wuhan.hp$data,
coords = wuhan.hp$coords, bw = 50, kernel = "bisquared")
## End(Not run)