Type: | Package |
Title: | Spatial Analysis on Network |
Version: | 0.4.4.6 |
Description: | Perform spatial analysis on network. Implement several methods for spatial analysis on network: Network Kernel Density estimation, building of spatial matrices based on network distance ('listw' objects from 'spdep' package), K functions estimation for point pattern analysis on network, k nearest neighbours on network, reachable area calculation, and graph generation. References: Okabe et al (2009) <doi:10.1080/13658810802475491>; Okabe et al (2012, ISBN:978-0470770818); Baddeley et al (2015, ISBN:9781482210200). |
License: | GPL-2 |
Encoding: | UTF-8 |
LazyData: | true |
Imports: | spdep (≥ 1.1.2), igraph (≥ 1.2.6), cubature (≥ 2.0.4.1), future.apply (≥ 1.4.0), methods (≥ 1.7.1), ggplot2 (≥ 3.3.0), progressr (≥ 0.4.0), data.table (≥ 1.12.8), Rcpp (≥ 1.0.4.6), Rdpack (≥ 2.1.1), dbscan (≥ 1.1-8), sf (≥ 1.0-3), abind (≥ 1.4-5), sfheaders (≥ 0.4.4), cppRouting (≥ 3.1) |
Depends: | R (≥ 3.6) |
Suggests: | future (≥ 1.16.0), testthat (≥ 3.0.0), kableExtra (≥ 1.1.0), RColorBrewer (≥ 1.1-2), classInt (≥ 0.4-3), reshape2 (≥ 1.4.3), rlang (≥ 0.4.6), rgl (≥ 0.107.14), tmap (≥ 3.3-1), smoothr (≥ 0.2.2), concaveman (≥ 1.1.0), covr (≥ 3.5.1), knitr, rmarkdown |
RoxygenNote: | 7.3.2 |
VignetteBuilder: | knitr |
URL: | https://jeremygelb.github.io/spNetwork/ |
BugReports: | https://github.com/JeremyGelb/spNetwork/issues |
LinkingTo: | Rcpp, RcppProgress, RcppArmadillo, BH |
RdMacros: | Rdpack |
Language: | en-CA |
SystemRequirements: | C++17 |
NeedsCompilation: | yes |
Packaged: | 2025-03-29 15:40:59 UTC; Gelb |
Author: | Jeremy Gelb |
Maintainer: | Jeremy Gelb <jeremy.gelb@ucs.inrs.ca> |
Repository: | CRAN |
Date/Publication: | 2025-03-29 16:00:02 UTC |
spNetwork: Spatial Analysis on Network
Description
Perform spatial analysis on network. Implement several methods for spatial analysis on network: Network Kernel Density estimation, building of spatial matrices based on network distance ('listw' objects from 'spdep' package), K functions estimation for point pattern analysis on network, k nearest neighbours on network, reachable area calculation, and graph generation. References: Okabe et al (2009) doi:10.1080/13658810802475491; Okabe et al (2012, ISBN:978-0470770818); Baddeley et al (2015, ISBN:9781482210200).
Author(s)
Maintainer: Jeremy Gelb jeremy.gelb@ucs.inrs.ca (ORCID)
Other contributors:
Philippe Apparicio philippe.apparicio@ucs.inrs.ca (ORCID) [contributor]
See Also
Useful links:
Report bugs at https://github.com/JeremyGelb/spNetwork/issues
Adaptive bandwidth
Description
Function to calculate Adaptive bandwidths according to Abramson’s smoothing regimen.
Usage
adaptive_bw(
grid,
events,
lines,
bw,
trim_bw,
method,
kernel_name,
max_depth,
tol,
digits,
sparse,
verbose
)
Arguments
grid |
A spatial grid to split the data within |
events |
A feature collection of points representing the events |
lines |
A feature collection of linestrings representing the network |
bw |
The fixed kernel bandwidth (can also be a vector, the value returned will be a matrix in that case) |
trim_bw |
The maximum size of local bandwidths (can also be a vector, must match bw) |
method |
The method to use when calculating the NKDE |
kernel_name |
The name of the kernel to use |
max_depth |
The maximum recursion depth |
tol |
A float indicating the spatial tolerance when snapping events on lines |
digits |
The number of digits to keep |
sparse |
A Boolean indicating if sparse matrix should be used |
verbose |
A Boolean indicating if update messages should be printed |
Value
A vector with the local bandwidths
Examples
#This is an internal function, no example provided
Adaptive bandwidth (multicore)
Description
Function to calculate Adaptive bandwidths according to Abramson’s smoothing regimen with multicore support
Usage
adaptive_bw.mc(
grid,
events,
lines,
bw,
trim_bw,
method,
kernel_name,
max_depth,
tol,
digits,
sparse,
verbose
)
Arguments
grid |
A spatial grid to split the data within |
events |
A feature collection of points representing the events |
lines |
A feature collection of linestrings representing the network |
bw |
The fixed kernel bandwidth (can also be a vector, the value returned will be a matrix in that case) |
trim_bw |
The maximum size of local bandwidths (can also be a vector, must match bw) |
method |
The method to use when calculating the NKDE |
kernel_name |
The name of the kernel to use |
max_depth |
The maximum recursion depth |
tol |
A float indicating the spatial tolerance when snapping events on lines |
digits |
The number of digits to keep |
sparse |
A Boolean indicating if sparse matrix should be used |
verbose |
A Boolean indicating if update messages should be printed |
Value
A vector with the local bandwidths
Examples
#This is an internal function, no example provided
Adaptive bw in one dimension
Description
Calculate adaptive bandwidths in one dimension
Usage
adaptive_bw_1d(events, w, bw, kernel_name)
Arguments
events |
A numeric vector representing the moments of occurrence of events |
w |
The weight of the events |
bw |
A float, the bandwidth to use |
kernel_name |
The name of the kernel to use |
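The idea behind Abramson's smoothing regimen can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: `quartic_kernel` and `abramson_bw_1d` are hypothetical helpers, the kernel is fixed to the quartic, and the optional `trim_bw` cap is borrowed from the network version of the function. Each local bandwidth is the fixed bandwidth multiplied by the inverse square root of a pilot density, rescaled by the geometric mean of these factors.

```r
# Quartic (biweight) kernel with bandwidth bw; zero outside [-bw, bw]
quartic_kernel <- function(d, bw) {
  u <- d / bw
  ifelse(abs(u) <= 1, (15 / 16) * (1 - u^2)^2 / bw, 0)
}

# Hypothetical helper: local bandwidths derived from a fixed pilot bandwidth
abramson_bw_1d <- function(events, w, bw, trim_bw = Inf) {
  # Pilot density at each event, estimated with the fixed bandwidth
  pilot <- vapply(events, function(x) {
    sum(w * quartic_kernel(x - events, bw))
  }, numeric(1))
  # Local factor proportional to 1 / sqrt(pilot density),
  # rescaled by the geometric mean of these factors
  inv_sqrt <- 1 / sqrt(pilot)
  gamma <- exp(mean(log(inv_sqrt)))
  # Cap the local bandwidths at trim_bw
  pmin(bw * inv_sqrt / gamma, trim_bw)
}

events <- c(1, 2, 2.5, 3, 20)
local_bws <- abramson_bw_1d(events, w = rep(1, 5), bw = 5, trim_bw = 8)
```

Events in dense areas receive a bandwidth smaller than the fixed one, while the isolated event receives a larger one, capped by `trim_bw`.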
Adaptive bandwidth for TNKDE
Description
Function to calculate Adaptive bandwidths according to Abramson’s smoothing regimen for TNKDE with a space-time interaction.
Usage
adaptive_bw_tnkde(
grid,
events_loc,
events,
lines,
bw_net,
bw_time,
trim_bw_net,
trim_bw_time,
method,
kernel_name,
max_depth,
div,
tol,
digits,
sparse,
verbose
)
Arguments
grid |
A spatial grid to split the data within |
events |
A feature collection of points representing the events |
lines |
A feature collection of linestrings representing the network |
bw_net |
The fixed kernel bandwidth for the network dimension. Can also be a vector if several bandwidths must be used. |
bw_time |
The fixed kernel bandwidth for the time dimension. Can also be a vector if several bandwidths must be used. |
trim_bw_net |
The maximum size of local bandwidths for network dimension. Must be a vector if bw_net is a vector |
trim_bw_time |
The maximum size of local bandwidths for time dimension. Must be a vector if bw_time is a vector |
method |
The method to use when calculating the NKDE |
kernel_name |
The name of the kernel to use |
max_depth |
The maximum recursion depth |
div |
The divisor to use for kernels |
tol |
A float indicating the spatial tolerance when snapping events on lines |
digits |
The number of digits to keep |
sparse |
A Boolean indicating if sparse matrix should be used |
verbose |
A Boolean indicating if update messages should be printed |
Value
A vector with the local bandwidths, or an array if bw_time and bw_net are vectors. In that case, the array has the following dimensions: length(bw_net) x length(bw_time) x nrow(events)
Examples
#This is an internal function, no example provided
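The layout of the returned array can be illustrated with a small sketch. The values below are placeholders and the object names are assumptions. Only the dimension order, length(bw_net) x length(bw_time) x nrow(events), comes from the Value section above.

```r
bw_net <- c(200, 400, 600)
bw_time <- c(10, 20)
n_events <- 4

# Placeholder values only; a real result would hold local bandwidths
local_bws <- array(
  seq_len(length(bw_net) * length(bw_time) * n_events),
  dim = c(length(bw_net), length(bw_time), n_events)
)

# Values of all events for bw_net = 400 and bw_time = 20
bws_400_20 <- local_bws[which(bw_net == 400), which(bw_time == 20), ]
```

Indexing the first two dimensions by the position of a bandwidth pair returns one value per event.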
Adaptive bandwidth for TNKDE (multicore)
Description
Function to calculate Adaptive bandwidths according to Abramson’s smoothing regimen for TNKDE with a space-time interaction with multicore support.
Usage
adaptive_bw_tnkde.mc(
grid,
events_loc,
events,
lines,
bw_net,
bw_time,
trim_bw_net,
trim_bw_time,
method,
kernel_name,
max_depth,
div,
tol,
digits,
sparse,
verbose
)
Arguments
grid |
A spatial grid to split the data within |
events |
A feature collection of points representing the events |
lines |
A feature collection of linestrings representing the network |
bw_net |
The fixed kernel bandwidth for the network dimension |
bw_time |
The fixed kernel bandwidth for the time dimension |
trim_bw_net |
The maximum size of local bandwidths for network dimension |
trim_bw_time |
The maximum size of local bandwidths for time dimension |
method |
The method to use when calculating the NKDE |
kernel_name |
The name of the kernel to use |
max_depth |
The maximum recursion depth |
div |
The divisor to use for kernels |
tol |
A float indicating the spatial tolerance when snapping events on lines |
digits |
The number of digits to keep |
sparse |
A Boolean indicating if sparse matrix should be used |
verbose |
A Boolean indicating if update messages should be printed |
Value
A vector with the local bandwidths
Examples
#This is an internal function, no example provided
The exposed function to calculate adaptive bandwidth with space-time interaction for TNKDE (INTERNAL)
Description
The exposed function to calculate adaptive bandwidth with space-time interaction for TNKDE (INTERNAL)
Usage
adaptive_bw_tnkde_cpp(
method,
neighbour_list,
sel_events,
sel_events_wid,
sel_events_time,
events,
events_wid,
events_time,
weights,
bws_net,
bws_time,
kernel_name,
line_list,
max_depth,
min_tol
)
Arguments
method |
a string, one of "simple", "continuous", "discontinuous" |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
sel_events |
a Numeric vector indicating the selected events (id of nodes) |
sel_events_wid |
a Numeric Vector indicating the unique id of the selected events |
sel_events_time |
a Numeric Vector indicating the time of the selected events |
events |
a NumericVector indicating the nodes in the graph being events |
events_wid |
a NumericVector indicating the unique id of all the events |
events_time |
a NumericVector indicating the timestamp of each event |
weights |
a cube with the weights associated with each event for each bws_net and bws_time. |
bws_net |
an arma::vec with the network bandwidths to consider |
bws_time |
an arma::vec with the time bandwidths to consider |
kernel_name |
a string with the name of the kernel to use |
line_list |
a DataFrame describing the lines |
max_depth |
the maximum recursion depth |
min_tol |
a double indicating the value by which zeros in the density values must be replaced |
Value
a vector with the estimated density at each event location
Examples
# no example provided, this is an internal function
The exposed function to calculate adaptive bandwidth with space-time interaction for TNKDE (INTERNAL)
Description
The exposed function to calculate adaptive bandwidth with space-time interaction for TNKDE (INTERNAL)
Usage
adaptive_bw_tnkde_cpp2(
method,
neighbour_list,
sel_events,
sel_events_wid,
sel_events_time,
events,
events_wid,
events_time,
weights,
bws_net,
bws_time,
kernel_name,
line_list,
max_depth,
min_tol
)
Arguments
method |
a string, one of "simple", "continuous", "discontinuous" |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
sel_events |
a Numeric vector indicating the selected events (id of nodes) |
sel_events_wid |
a Numeric Vector indicating the unique id of the selected events |
sel_events_time |
a Numeric Vector indicating the time of the selected events |
events |
a NumericVector indicating the nodes in the graph being events |
events_wid |
a NumericVector indicating the unique id of all the events |
events_time |
a NumericVector indicating the timestamp of each event |
weights |
a cube with the weights associated with each event for each bws_net and bws_time. |
bws_net |
an arma::vec with the network bandwidths to consider |
bws_time |
an arma::vec with the time bandwidths to consider |
kernel_name |
a string with the name of the kernel to use |
line_list |
a DataFrame describing the lines |
max_depth |
the maximum recursion depth |
min_tol |
a double indicating the value by which zeros in the density values must be replaced |
Value
a vector with the estimated density at each event location
Examples
# no example provided, this is an internal function
Add center vertex to lines
Description
Add to each feature of a feature collection of lines an additional vertex at its center.
Usage
add_center_lines(lines)
Arguments
lines |
The feature collection of linestrings to use |
Value
A feature collection of points
Examples
#This is an internal function, no example provided
Add vertices to a feature collection of linestrings
Description
Add vertices (feature collection of points) to their nearest lines (feature collection of linestrings). This may fail if the line geometries are self-intersecting.
Usage
add_vertices_lines(lines, points, nearest_lines_idx, mindist)
Arguments
lines |
The feature collection of linestrings to modify |
points |
The feature collection of points to add as vertices to the lines |
nearest_lines_idx |
For each point, the index of the nearest line |
mindist |
The minimum distance between one point and the extremity of the line to add the point as a vertex. |
Value
A feature collection of linestrings
Examples
#This is an internal function, no example provided
Events aggregation
Description
Function to aggregate points within a radius.
Usage
aggregate_points(points, maxdist, weight = "weight", return_ids = FALSE)
Arguments
points |
The feature collection of points to contract (must have a weight column) |
maxdist |
The distance to use |
weight |
The name of the column to use as weight (default is "weight"). The values of the aggregated points for this column will be summed. For all the other columns, only the max value is retained. |
return_ids |
A boolean (default is FALSE). If TRUE, then an index indicating for each point the group it belongs to is returned. If FALSE, then a feature collection of points is returned with the points already aggregated. |
Details
This function can be used to aggregate points within a radius. This is done by using the dbscan algorithm. This process is repeated until no more modification is applied.
Value
A new feature collection of points
Examples
data(bike_accidents)
bike_accidents$weight <- 1
agg_points <- aggregate_points(bike_accidents, 5)
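The aggregation rule described for the weight argument (weights are summed, the maximum is kept for the other columns) can be sketched in base R on a plain data frame. The group index below is hypothetical and stands in for the clusters that the function derives from the distance threshold. This is an illustration of the rule, not the package's code.

```r
pts <- data.frame(
  group = c(1, 1, 2, 3, 3, 3),        # hypothetical cluster index per point
  weight = c(1, 1, 1, 2, 1, 1),
  NB_VICTIME = c(1, 2, 1, 1, 3, 1)
)

# Sum the weights within each group
agg_weight <- aggregate(weight ~ group, data = pts, FUN = sum)
# Keep only the maximum of the other column within each group
agg_other <- aggregate(NB_VICTIME ~ group, data = pts, FUN = max)

agg <- merge(agg_weight, agg_other, by = "group")
```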
Road accidents including a bicycle in Montreal in 2016
Description
A feature collection (sf object) representing road accidents including a cyclist in Montreal in 2016. The EPSG is 3797, and the data comes from the Montreal OpenData website. It is only a small subset in central districts used to demonstrate the main functions of spNetwork.
Usage
bike_accidents
Format
A sf object with 347 rows and 4 variables
- NB_VICTIME
the number of victims
- AN
the year of the accident
- Date
the date of the accident (yyyy/mm/dd)
- geom
the geometry (points)
Source
https://donnees.montreal.ca/dataset/collisions-routieres
Network generation with igraph
Description
Generate an igraph object from a feature collection of linestrings
Usage
build_graph(lines, digits, line_weight, attrs = FALSE)
Arguments
lines |
A feature collection of lines |
digits |
The number of digits to keep from the coordinates |
line_weight |
The name of the column giving the weight of the lines |
attrs |
A boolean indicating if the original lines' attributes should be stored in the final object |
Details
This function can be used to generate an undirected graph object (igraph object). It uses the coordinates of the linestrings extremities to create the nodes of the graph. This is why the number of digits in the coordinates is important. Too high precision (high number of digits) might break some connections.
Value
A list containing the following elements:
graph: an igraph object;
linelist: the dataframe used to build the graph;
lines: the original feature collection of linestrings;
spvertices: a feature collection of points representing the vertices of the graph;
digits : the number of digits kept for the coordinates.
Examples
data(mtl_network)
mtl_network$length <- as.numeric(sf::st_length(mtl_network))
graph_result <- build_graph(mtl_network, 2, "length", attrs = TRUE)
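The effect of the digits argument on connectivity can be illustrated without the package. In this sketch, `node_key` is a hypothetical helper that builds a node identifier from rounded coordinates. The coordinates are made up, and the package may build its node identifiers differently.

```r
# End point of line A and start point of line B, 3 mm apart (units: metres)
end_a <- c(x = 100.001, y = 200.000)
start_b <- c(x = 100.004, y = 200.000)

# Hypothetical helper: node identifier built from rounded coordinates
node_key <- function(xy, digits) {
  paste(round(xy, digits), collapse = "_")
}

# With 2 digits both extremities collapse to the same node
same_node_2 <- node_key(end_a, 2) == node_key(start_b, 2)
# With 3 digits they become two distinct nodes and the lines are disconnected
same_node_3 <- node_key(end_a, 3) == node_key(start_b, 3)
```

A small digitizing error is enough to split one intersection into two nodes when too many digits are kept.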
Network generation with cppRouting
Description
Generate a cppRouting object from a feature collection of linestrings
Usage
build_graph_cppr(lines, digits, line_weight, attrs = FALSE, direction = NULL)
Arguments
lines |
A feature collection of lines |
digits |
The number of digits to keep from the coordinates |
line_weight |
The name of the column giving the weight of the lines |
attrs |
A boolean indicating if the original lines' attributes should be stored in the final object |
Details
This function can be used to generate an undirected graph object (cppRouting object). It uses the coordinates of the linestrings extremities to create the nodes of the graph. This is why the number of digits in the coordinates is important. Too high precision (high number of digits) might break some connections.
Value
A list containing the following elements:
graph: a cppRouting object;
linelist: the dataframe used to build the graph;
lines: the original feature collection of linestrings;
spvertices: a feature collection of points representing the vertices of the graph;
digits : the number of digits kept for the coordinates.
Examples
data(mtl_network)
mtl_network$length <- as.numeric(sf::st_length(mtl_network))
graph_result <- build_graph_cppr(mtl_network, 2, "length", attrs = TRUE)
Directed network generation
Description
Generate a directed igraph object from a feature collection of linestrings
Usage
build_graph_directed(lines, digits, line_weight, direction, attrs = FALSE)
Arguments
lines |
A feature collection of linestrings |
digits |
The number of digits to keep from the coordinates |
line_weight |
The name of the column giving the weight of the lines |
direction |
A column name indicating authorized travelling direction on lines. If NULL, then all lines can be used in both directions. Must be the name of a column otherwise. The values of the column must be "FT" (From - To), "TF" (To - From) or "Both" |
attrs |
A boolean indicating if the original lines' attributes should be stored in the final object |
Details
This function can be used to generate a directed graph object (igraph object). It uses the coordinates of the linestrings extremities to create the nodes of the graph. This is why the number of digits in the coordinates is important. Too high precision (high number of digits) might break some connections. The column used to indicate directions can only have the following values: "FT" (From-To), "TF" (To-From) and "Both".
Value
A list containing the following elements:
graph: an igraph object;
linelist: the dataframe used to build the graph;
lines: the original feature collection of lines;
spvertices: a feature collection of points representing the vertices of the graph;
digits : the number of digits kept for the coordinates.
Examples
data(mtl_network)
mtl_network$length <- as.numeric(sf::st_length(mtl_network))
mtl_network$direction <- "Both"
mtl_network[6, "direction"] <- "TF"
mtl_network_directed <- lines_direction(mtl_network, "direction")
graph_result <- build_graph_directed(lines = mtl_network_directed,
  digits = 2,
  line_weight = "length",
  direction = "direction",
  attrs = TRUE)
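How the three direction codes translate into directed edges can be sketched on a plain edge table. `directed_edges` and the node names are hypothetical, and the sketch assumes that "FT" means travel from the first to the last vertex of a line. It is not the package's implementation.

```r
lines_df <- data.frame(
  from = c("a", "b", "c"),
  to = c("b", "c", "d"),
  direction = c("Both", "FT", "TF"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per authorized travelling direction
directed_edges <- function(df) {
  # "FT" and "Both" allow travel from the start to the end of the line
  forward <- df[df$direction %in% c("FT", "Both"), c("from", "to")]
  # "TF" and "Both" allow travel from the end to the start of the line
  backward <- df[df$direction %in% c("TF", "Both"), c("to", "from")]
  names(backward) <- c("from", "to")
  out <- rbind(forward, backward)
  rownames(out) <- NULL
  out
}

edges <- directed_edges(lines_df)
```

A line marked "Both" yields two edges, while "FT" and "TF" lines yield one edge each.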
Spatial grid
Description
Generate a grid of a specified shape in the bbox of a Spatial object.
Usage
build_grid(grid_shape, spatial)
Arguments
grid_shape |
A numeric vector of length 2 indicating the number of rows and the number of columns of the grid |
spatial |
A list of spatial feature collections objects (package sf) |
Value
A feature collection of polygons representing the grid
Examples
#This is an internal function, no example provided
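Splitting a bounding box into a grid of a given shape can be sketched in base R as follows. The bounding box values are made up and the object names are assumptions. The package returns polygons, whereas this sketch only computes the extent of each cell.

```r
# Hypothetical bounding box (map units) and grid shape (rows, columns)
bbox <- c(xmin = 0, ymin = 0, xmax = 100, ymax = 60)
grid_shape <- c(2, 4)

# Cell boundaries along each axis
x_breaks <- seq(bbox[["xmin"]], bbox[["xmax"]], length.out = grid_shape[2] + 1)
y_breaks <- seq(bbox[["ymin"]], bbox[["ymax"]], length.out = grid_shape[1] + 1)

# One row per cell, with its extent
cells <- expand.grid(col = seq_len(grid_shape[2]), row = seq_len(grid_shape[1]))
cells$xmin <- x_breaks[cells$col]
cells$xmax <- x_breaks[cells$col + 1]
cells$ymin <- y_breaks[cells$row]
cells$ymax <- y_breaks[cells$row + 1]
```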
Check function for parameters in bandwidth selection methods
Description
A check function for bandwidth selection methods raising an error if a parameter is not valid
Usage
bw_checks(
check,
lines,
samples,
events,
kernel_name,
method,
bws_net = NULL,
bws_time = NULL,
arr_bws_net = NULL,
arr_bws_time = NULL,
adaptive = FALSE,
trim_net_bws = NULL,
trim_time_bws = NULL,
diggle_correction = FALSE,
study_area = NULL
)
Arguments
check |
A boolean indicating if the geometries must be checked |
lines |
A feature collection of linestrings representing the underlying network |
samples |
A feature collection of points representing the sample location |
events |
a feature collection of points representing the events |
kernel_name |
The name of the kernel to use |
method |
The name of the NKDE to use |
bws_net |
An ordered numeric vector with all the network bandwidths |
bws_time |
An ordered numeric vector with all the time bandwidths |
arr_bws_net |
An array with all the local network bandwidths precalculated (for each event, and at each possible combination of network and temporal bandwidths). The dimensions must be c(length(bws_net), length(bws_time), nrow(events)) |
arr_bws_time |
An array with all the local time bandwidths precalculated (for each event, and at each possible combination of network and temporal bandwidths). The dimensions must be c(length(bws_net), length(bws_time), nrow(events)) |
adaptive |
A boolean indicating if local bandwidths must be calculated |
trim_net_bws |
A numeric vector with the maximum local network bandwidth. If local bandwidths have higher values, they will be replaced by the corresponding value in this vector. |
trim_time_bws |
A numeric vector with the maximum local time bandwidth. If local bandwidths have higher values, they will be replaced by the corresponding value in this vector. |
diggle_correction |
A Boolean indicating if the correction factor for edge effect must be used. |
study_area |
A feature collection of polygons representing the limits of the study area. |
Examples
# no example provided, this is an internal function
Bandwidth selection by likelihood cross validation
Description
Calculate the likelihood cross validation score for multiple bandwidths to select an appropriate bandwidth in a data-driven approach
Usage
bw_cv_likelihood_calc(
bws = NULL,
lines,
events,
w,
kernel_name,
method,
diggle_correction = FALSE,
study_area = NULL,
adaptive = FALSE,
trim_bws = NULL,
mat_bws = NULL,
max_depth = 15,
digits = 5,
tol = 0.1,
agg = NULL,
sparse = TRUE,
grid_shape = c(1, 1),
sub_sample = 1,
zero_strat = "min_double",
verbose = TRUE,
check = TRUE
)
Arguments
bws |
An ordered numeric vector with the bandwidths |
lines |
A feature collection of linestrings representing the underlying network. The geometries must be simple LineStrings, not MultiLineStrings (the function may crash if some geometries are invalid). |
events |
A feature collection of points representing the events on the network. The points will be snapped on the network to their closest line. |
w |
A vector representing the weight of each event |
kernel_name |
The name of the kernel to use. Must be one of triangle, gaussian, tricube, cosine, triweight, quartic, epanechnikov or uniform. |
method |
The method to use when calculating the NKDE, must be one of simple / discontinuous / continuous (see nkde details for more information) |
diggle_correction |
A Boolean indicating if the correction factor for edge effect must be used. |
study_area |
A feature collection of polygons representing the limits of the study area. |
adaptive |
A boolean indicating if an adaptive bandwidth must be used. If adaptive = TRUE, the local bandwidths are derived from the global bandwidths (bws) |
trim_bws |
A vector indicating the maximum value an adaptive bandwidth can reach. Higher values will be trimmed. It must have the same length as bws. |
mat_bws |
A matrix giving the bandwidths for each observation and for each global bandwidth. This is useful when the user wants to use a different method from Abramson's smoothing regimen. |
max_depth |
When using the continuous and discontinuous methods, the calculation time and memory use can grow sharply if the network has many small edges (areas with many intersections and many events). To avoid this, a maximum depth can be set here. Considering that the kernel is divided at intersections, a value of 10 should yield good estimates in most cases. A larger value can be used without a problem for the discontinuous method. For the continuous method, a larger value will strongly impact calculation speed. |
digits |
The number of digits to retain from the spatial coordinates. It ensures that topology is good when building the network. Default is 5. Too high a precision (high number of digits) might break some connections |
tol |
A float indicating the minimum distance between the events and the lines' extremities when adding the point to the network. When points are closer, they are added at the extremity of the lines. |
agg |
A double indicating if the events must be aggregated within a distance. If NULL, the events are aggregated only by rounding the coordinates. |
sparse |
A Boolean indicating if sparse or regular matrices should be used by the Rcpp functions. These matrices are used to store edge indices between two nodes in a graph. Regular matrices are faster, but require more memory, in particular with multiprocessing. Sparse matrices are slower (a bit), but require much less memory. |
grid_shape |
A vector of two values indicating how the study area must be split when performing the calculus. Default is c(1,1) (no split). A finer grid could reduce memory usage and increase speed when a large dataset is used. When using multiprocessing, the work in each grid is dispatched between the workers. |
sub_sample |
A float between 0 and 1 indicating the percentage of quadrats to keep in the calculus. For large datasets, it may be useful to limit the bandwidth evaluation and thus reduce calculation time. |
zero_strat |
A string indicating what to do when the density is 0 when calculating the LOO density estimate for an isolated event. "min_double" (default) replaces the 0 value by the minimum double possible on the machine. "remove" will remove them from the final score. The first approach penalizes the small bandwidths more strongly. |
verbose |
A Boolean, indicating if the function should print messages about the process. |
check |
A Boolean indicating if the geometry checks must be run before the operation. This might take some time, but it will ensure that the CRS of the provided objects are valid and identical, and that geometries are valid. |
Details
The function calculates the likelihood cross validation score for several bandwidths in order to find the most appropriate one. The general idea is to find the bandwidth that would produce the most similar results if one event was removed from the dataset (leave one out cross validation). We use here the shortcut formula as described by the package spatstat (Baddeley et al. 2021).
LCV(h) = \sum_i \log\hat\lambda_{-i}(x_i)
Where the sum is taken over all events x_i, and where \hat\lambda_{-i}(x_i) is the leave-one-out kernel estimate at x_i for a bandwidth h. A higher value indicates a better bandwidth.
Value
A dataframe with two columns, one for the bandwidths and the second for the cross validation score (the higher the better).
References
Baddeley A, Turner R, Rubak E (2021). spatstat: Spatial Point Pattern Analysis, Model-Fitting, Simulation, Tests. R package version 2.1-0, https://CRAN.R-project.org/package=spatstat.
Examples
data(mtl_network)
data(bike_accidents)
cv_scores <- bw_cv_likelihood_calc(seq(200, 800, 50),
  mtl_network, bike_accidents,
  rep(1, nrow(bike_accidents)),
  "quartic", "simple",
  diggle_correction = FALSE, study_area = NULL,
  max_depth = 8,
  digits = 2, tol = 0.1, agg = 5,
  sparse = TRUE, grid_shape = c(1, 1),
  sub_sample = 1, verbose = TRUE, check = TRUE)
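Once the scores are available, the candidate bandwidth can be picked from the returned table. The sketch below uses a mock data frame: its column names and values are assumptions made for illustration and are not output of the function. It follows the Details section, where a higher likelihood cross validation score indicates a better bandwidth, and it indexes the columns by position because only their order is documented.

```r
# Mock result table; column names and values are placeholders
cv_scores <- data.frame(
  bw = seq(200, 800, 200),
  cv_scores = c(-2500, -2350, -2300, -2320)
)

# Bandwidth (first column) with the highest score (second column)
best_bw <- cv_scores[[1]][which.max(cv_scores[[2]])]
```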
Bandwidth selection by likelihood cross validation (multicore)
Description
Calculate the likelihood cross validation score for multiple bandwidths to select an appropriate bandwidth in a data-driven approach
Usage
bw_cv_likelihood_calc.mc(
bws,
lines,
events,
w,
kernel_name,
method,
diggle_correction = FALSE,
study_area = NULL,
adaptive = FALSE,
trim_bws = NULL,
mat_bws = NULL,
max_depth = 15,
digits = 5,
tol = 0.1,
agg = NULL,
sparse = TRUE,
grid_shape = c(1, 1),
sub_sample = 1,
zero_strat = "min_double",
verbose = TRUE,
check = TRUE
)
Arguments
bws |
An ordered numeric vector with the bandwidths |
lines |
A feature collection of linestrings representing the underlying network. The geometries must be simple LineStrings, not MultiLineStrings (the function may crash if some geometries are invalid). |
events |
A feature collection of points representing the events on the network. The points will be snapped on the network to their closest line. |
w |
A vector representing the weight of each event |
kernel_name |
The name of the kernel to use. Must be one of triangle, gaussian, tricube, cosine, triweight, quartic, epanechnikov or uniform. |
method |
The method to use when calculating the NKDE, must be one of simple / discontinuous / continuous (see nkde details for more information) |
diggle_correction |
A Boolean indicating if the correction factor for edge effect must be used. |
study_area |
A feature collection of polygons representing the limits of the study area. |
adaptive |
A boolean indicating if an adaptive bandwidth must be used. If adaptive = TRUE, the local bandwidths are derived from the global bandwidths (bws) |
trim_bws |
A vector indicating the maximum value an adaptive bandwidth can reach. Higher values will be trimmed. It must have the same length as bws. |
mat_bws |
A matrix giving the bandwidths for each observation and for each global bandwidth. This is useful when the user wants to use a different method from Abramson's smoothing regimen. |
max_depth |
When using the continuous and discontinuous methods, the calculation time and memory use can grow sharply if the network has many small edges (areas with many intersections and many events). To avoid this, a maximum depth can be set here. Considering that the kernel is divided at intersections, a value of 10 should yield good estimates in most cases. A larger value can be used without a problem for the discontinuous method. For the continuous method, a larger value will strongly impact calculation speed. |
digits |
The number of digits to retain from the spatial coordinates. It ensures that topology is good when building the network. Default is 5. Too high a precision (high number of digits) might break some connections |
tol |
A float indicating the minimum distance between the events and the lines' extremities when adding the point to the network. When points are closer, they are added at the extremity of the lines. |
agg |
A double indicating if the events must be aggregated within a distance. If NULL, the events are aggregated only by rounding the coordinates. |
sparse |
A Boolean indicating if sparse or regular matrices should be used by the Rcpp functions. These matrices are used to store edge indices between two nodes in a graph. Regular matrices are faster, but require more memory, in particular with multiprocessing. Sparse matrices are slower (a bit), but require much less memory. |
grid_shape |
A vector of two values indicating how the study area must be split when performing the calculus. Default is c(1,1) (no split). A finer grid could reduce memory usage and increase speed when a large dataset is used. When using multiprocessing, the work in each grid is dispatched between the workers. |
sub_sample |
A float between 0 and 1 indicating the percentage of quadrats to keep in the calculus. For large datasets, it may be useful to limit the bandwidth evaluation and thus reduce calculation time. |
zero_strat |
A string indicating what to do when the density is 0 when calculating the LOO density estimate for an isolated event. "min_double" (default) replaces the 0 value by the minimum double possible on the machine. "remove" will remove them from the final score. The first approach penalizes the small bandwidths more strongly. |
verbose |
A Boolean, indicating if the function should print messages about the process. |
check |
A Boolean indicating if the geometry checks must be run before the operation. This might take some time, but it will ensure that the CRS of the provided objects are valid and identical, and that geometries are valid. |
Details
See the function bw_cv_likelihood_calc for more details. The calculation is split according to the parameter grid_shape. If grid_shape = c(1,1), then parallel processing cannot be used.
Value
A dataframe with two columns, one for the bandwidths and the second for the cross validation score (the higher the better).
Examples
data(mtl_network)
data(bike_accidents)
future::plan(future::multisession(workers=1))
cv_scores <- bw_cv_likelihood_calc.mc(seq(200,800,50),
mtl_network, bike_accidents,
rep(1,nrow(bike_accidents)),
"quartic", "simple",
diggle_correction = FALSE, study_area = NULL,
max_depth = 8,
digits=2, tol=0.1, agg=5,
sparse=TRUE, grid_shape=c(1,1),
sub_sample = 1, verbose=TRUE, check=TRUE)
## make sure any open connections are closed afterward
if (!inherits(future::plan(), "sequential")) future::plan(future::sequential)
Bandwidth selection for Temporal Kernel density estimate by likelihood cross validation
Description
Calculate the likelihood cross validation score for several bandwidths for the Temporal Kernel density estimate.
Usage
bw_cv_likelihood_calc_tkde(events, w, bws, kernel_name)
Arguments
events |
A numeric vector representing the moments of occurrence of events |
w |
The weight of the events |
bws |
A numeric vector, the bandwidths to use |
kernel_name |
The name of the kernel to use |
Value
A vector with the cross validation scores (the higher the better).
Examples
data(bike_accidents)
bike_accidents$Date <- as.POSIXct(bike_accidents$Date, format = "%Y/%m/%d")
start <- min(bike_accidents$Date)
diff <- as.integer(difftime(bike_accidents$Date , start, units = "days"))
w <- rep(1,length(diff))
scores <- bw_cv_likelihood_calc_tkde(diff, w, seq(10,60,10), "quartic")
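The idea behind the score can be illustrated with a minimal base R sketch. It assumes that the score is the sum of the log leave-one-out densities and that zero densities are replaced by the smallest positive double; the helpers quartic_kernel and loo_lcv are hypothetical and are not the package's implementation, and the event times are arbitrary.

```r
# Quartic (biweight) kernel with bandwidth bw, scaled to integrate to 1
quartic_kernel <- function(d, bw) {
  u <- d / bw
  ifelse(abs(u) <= 1, (15 / 16) * (1 - u^2)^2 / bw, 0)
}

# Leave-one-out likelihood cross validation score for one bandwidth
# (hypothetical helper, not part of spNetwork)
loo_lcv <- function(times, w, bw, zero_strat = "min_double") {
  loo <- vapply(seq_along(times), function(i) {
    # density at event i estimated from all the other events
    sum(w[-i] * quartic_kernel(times[i] - times[-i], bw))
  }, numeric(1))
  if (zero_strat == "min_double") {
    # isolated events: replace 0 by the smallest positive double
    loo[loo == 0] <- .Machine$double.xmin
  } else {
    # "remove": drop isolated events from the score
    loo <- loo[loo > 0]
  }
  sum(log(loo))
}

times <- c(1, 3, 4, 8, 9, 10, 15, 30)  # arbitrary event times (days)
w <- rep(1, length(times))
bws <- c(2, 5, 10, 20)
scores <- vapply(bws, function(bw) loo_lcv(times, w, bw), numeric(1))
best_bw <- bws[which.max(scores)]
```

With "min_double", every isolated event adds a very large negative term to the score, which is why small bandwidths are strongly penalized.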
Bandwidth selection by Cronie and Van Lieshout's Criterion
Description
Calculate for multiple bandwidths the Cronie and Van Lieshout's Criterion to select an appropriate bandwidth in a data-driven approach.
Usage
bw_cvl_calc(
bws = NULL,
lines,
events,
w,
kernel_name,
method,
diggle_correction = FALSE,
study_area = NULL,
adaptive = FALSE,
trim_bws = NULL,
mat_bws = NULL,
max_depth = 15,
digits = 5,
tol = 0.1,
agg = NULL,
sparse = TRUE,
zero_strat = "min_double",
grid_shape = c(1, 1),
sub_sample = 1,
verbose = TRUE,
check = TRUE
)
Arguments
bws |
An ordered numeric vector with the bandwidths |
lines |
A feature collection of linestrings representing the underlying network. The geometries must be simple LineStrings, without MultiLineStrings (the function may crash if some geometries are invalid). |
events |
A feature collection of points representing the events on the network. The points will be snapped on the network to their closest line. |
w |
A vector representing the weight of each event |
kernel_name |
The name of the kernel to use. Must be one of triangle, gaussian, tricube, cosine, triweight, quartic, epanechnikov or uniform. |
method |
The method to use when calculating the NKDE, must be one of simple / discontinuous / continuous (see nkde details for more information) |
diggle_correction |
A Boolean indicating if the correction factor for edge effect must be used. |
study_area |
A feature collection of polygons representing the limits of the study area. |
adaptive |
A boolean indicating if an adaptive bandwidth must be used. If adaptive = TRUE, the local bandwidths are derived from the global bandwidths (bws). |
trim_bws |
A vector indicating the maximum value an adaptive bandwidth can reach. Higher values will be trimmed. It must have the same length as bws. |
mat_bws |
A matrix giving the bandwidths for each observation and for each global bandwidth. This is useful when the user wants to use a different method from Abramson's smoothing regimen. |
max_depth |
When using the continuous and discontinuous methods, the calculation time and memory use can grow sharply if the network has many small edges (areas with many intersections and many events). To avoid this, a maximum depth can be set here. Considering that the kernel is divided at intersections, a value of 10 should yield good estimates in most cases. A larger value can be used without a problem for the discontinuous method. For the continuous method, a larger value will strongly impact calculation speed. |
digits |
The number of digits to retain from the spatial coordinates. It ensures that topology is good when building the network. Default is 5. Too high a precision (high number of digits) might break some connections. |
tol |
A float indicating the minimum distance between the events and the lines' extremities when adding the point to the network. When points are closer, they are added at the extremity of the lines. |
agg |
A double indicating if the events must be aggregated within a distance. If NULL, the events are aggregated only by rounding the coordinates. |
sparse |
A Boolean indicating if sparse or regular matrices should be used by the Rcpp functions. These matrices are used to store edge indices between two nodes in a graph. Regular matrices are faster, but require more memory, in particular with multiprocessing. Sparse matrices are slower (a bit), but require much less memory. |
zero_strat |
A string indicating what to do when the density is 0 when calculating the leave-one-out (LOO) density estimate for an isolated event. "min_double" (default) replaces the 0 value with the minimum double possible on the machine. "remove" removes these values from the final score. The first approach penalizes small bandwidths more strongly. |
grid_shape |
A vector of two values indicating how the study area must be split when performing the calculation. Default is c(1,1) (no split). A finer grid can reduce memory usage and increase speed when a large dataset is used. When using multiprocessing, the work for each grid cell is dispatched among the workers. |
sub_sample |
A float between 0 and 1 indicating the proportion of quadrats to keep in the calculation. For large datasets, it may be useful to limit the bandwidth evaluation to a subsample and thus reduce calculation time. |
verbose |
A Boolean, indicating if the function should print messages about the process. |
check |
A Boolean indicating if the geometry checks must be run before the operation. This might take some time, but it ensures that the CRS of the provided objects are valid and identical, and that the geometries are valid. |
Details
The Cronie and Van Lieshout's Criterion (Cronie and Van Lieshout 2018) finds the optimal bandwidth by minimizing the difference between the size of the observation window and the sum of the reciprocals of the estimated kernel density at the event locations. In the network case, the size of the study area is the sum of the lengths of all the lines in the network. Thus, it is important to only use the necessary parts of the network.
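As an illustration only, the criterion can be sketched in base R on a single straight line, so that network distance reduces to the absolute difference between positions. The helpers quartic_kernel and cvl_score are hypothetical, the absolute difference is one possible way to measure the discrepancy, no edge correction is applied, and the event positions are arbitrary.

```r
# Quartic (biweight) kernel with bandwidth bw, scaled to integrate to 1
quartic_kernel <- function(d, bw) {
  u <- d / bw
  ifelse(abs(u) <= 1, (15 / 16) * (1 - u^2)^2 / bw, 0)
}

# Discrepancy between the sum of reciprocal densities at the events
# and the total length of the network (hypothetical helper)
cvl_score <- function(x, bw, total_length) {
  # density at each event, the event itself included (no leave-one-out)
  dens <- vapply(x, function(xi) sum(quartic_kernel(xi - x, bw)), numeric(1))
  abs(sum(1 / dens) - total_length)
}

x <- c(5, 12, 14, 30, 31, 33, 60, 75, 90)  # arbitrary positions on the line
total_length <- 100                        # length of the single line
bws <- seq(5, 40, 5)
scores <- vapply(bws, function(bw) cvl_score(x, bw, total_length), numeric(1))
best_bw <- bws[which.min(scores)]
```

Because total_length enters the score directly, including unused parts of the network shifts the selected bandwidth.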
Value
A dataframe with two columns, one for the bandwidths and the second for the Cronie and Van Lieshout's Criterion.
References
Cronie O, Van Lieshout MNM (2018). “A non-model-based approach to bandwidth selection for kernel estimators of spatial intensity functions.” Biometrika, 105(2), 455–462.
Examples
data(mtl_network)
data(bike_accidents)
cv_scores <- bw_cvl_calc(seq(200,400,50),
mtl_network, bike_accidents,
rep(1,nrow(bike_accidents)),
"quartic", "discontinuous",
diggle_correction = FALSE, study_area = NULL,
max_depth = 8,
digits=2, tol=0.1, agg=5,
sparse=TRUE, grid_shape=c(1,1),
sub_sample = 1, verbose=TRUE, check=TRUE)
Bandwidth selection by Cronie and Van Lieshout's Criterion (multicore version)
Description
Calculate for multiple bandwidths the Cronie and Van Lieshout's Criterion to select an appropriate bandwidth in a data-driven approach. A plan from the package future can be used to split the work across several cores. The different cells generated in accordance with the argument grid_shape are used for the parallelization. So if only one cell is generated (grid_shape = c(1,1)), the function will use only one core. The progress bar displays the progression for the cells.
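To make the role of grid_shape more concrete, the following base R sketch assigns points to the cells of a regular grid laid over their bounding box. The helper assign_grid_cell is hypothetical and the coordinates are arbitrary; the way spNetwork builds its cells internally may differ.

```r
# Assign points to the cells of a regular grid over their bounding box
# (hypothetical helper for illustration only)
assign_grid_cell <- function(x, y, grid_shape = c(1, 1)) {
  # breaks along each axis; rightmost.closed keeps the maximum
  # coordinate inside the last cell
  bx <- seq(min(x), max(x), length.out = grid_shape[1] + 1)
  by <- seq(min(y), max(y), length.out = grid_shape[2] + 1)
  col <- findInterval(x, bx, rightmost.closed = TRUE)
  row <- findInterval(y, by, rightmost.closed = TRUE)
  (row - 1) * grid_shape[1] + col
}

x <- c(0, 1, 9, 10, 0, 10)  # arbitrary coordinates
y <- c(0, 1, 1, 0, 10, 10)
cells <- assign_grid_cell(x, y, grid_shape = c(2, 2))

# one group of point indices per cell: each group is an independent work unit
groups <- split(seq_along(x), cells)
counts <- vapply(groups, length, integer(1))
```

With grid_shape = c(1,1) every point falls in the same cell, so there is a single work unit and nothing to distribute among workers.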
Usage
bw_cvl_calc.mc(
bws = NULL,
lines,
events,
w,
kernel_name,
method,
diggle_correction = FALSE,
study_area = NULL,
adaptive = FALSE,
trim_bws = NULL,
mat_bws = NULL,
max_depth = 15,
digits = 5,
tol = 0.1,
agg = NULL,
sparse = TRUE,
zero_strat = "min_double",
grid_shape = c(1, 1),
sub_sample = 1,
verbose = TRUE,
check = TRUE
)
Arguments
bws |
An ordered numeric vector with the bandwidths |
lines |
A feature collection of linestrings representing the underlying network. The geometries must be simple LineStrings, without MultiLineStrings (the function may crash if some geometries are invalid). |
events |
A feature collection of points representing the events on the network. The points will be snapped on the network to their closest line. |
w |
A vector representing the weight of each event |
kernel_name |
The name of the kernel to use. Must be one of triangle, gaussian, tricube, cosine, triweight, quartic, epanechnikov or uniform. |
method |
The method to use when calculating the NKDE, must be one of simple / discontinuous / continuous (see nkde details for more information) |
diggle_correction |
A Boolean indicating if the correction factor for edge effect must be used. |
study_area |
A feature collection of polygons representing the limits of the study area. |
adaptive |
A boolean indicating if an adaptive bandwidth must be used. If adaptive = TRUE, the local bandwidths are derived from the global bandwidths (bws). |
trim_bws |
A vector indicating the maximum value an adaptive bandwidth can reach. Higher values will be trimmed. It must have the same length as bws. |
mat_bws |
A matrix giving the bandwidths for each observation and for each global bandwidth. This is useful when the user wants to use a different method from Abramson's smoothing regimen. |
max_depth |
When using the continuous and discontinuous methods, the calculation time and memory use can grow sharply if the network has many small edges (areas with many intersections and many events). To avoid this, a maximum depth can be set here. Considering that the kernel is divided at intersections, a value of 10 should yield good estimates in most cases. A larger value can be used without a problem for the discontinuous method. For the continuous method, a larger value will strongly impact calculation speed. |
digits |
The number of digits to retain from the spatial coordinates. It ensures that topology is good when building the network. Default is 5. Too high a precision (high number of digits) might break some connections. |
tol |
A float indicating the minimum distance between the events and the lines' extremities when adding the point to the network. When points are closer, they are added at the extremity of the lines. |
agg |
A double indicating if the events must be aggregated within a distance. If NULL, the events are aggregated only by rounding the coordinates. |
sparse |
A Boolean indicating if sparse or regular matrices should be used by the Rcpp functions. These matrices are used to store edge indices between two nodes in a graph. Regular matrices are faster, but require more memory, in particular with multiprocessing. Sparse matrices are slower (a bit), but require much less memory. |
zero_strat |
A string indicating what to do when the density is 0 when calculating the leave-one-out (LOO) density estimate for an isolated event. "min_double" (default) replaces the 0 value with the minimum double possible on the machine. "remove" removes these values from the final score. The first approach penalizes small bandwidths more strongly. |
grid_shape |
A vector of two values indicating how the study area must be split when performing the calculation. Default is c(1,1) (no split). A finer grid can reduce memory usage and increase speed when a large dataset is used. When using multiprocessing, the work for each grid cell is dispatched among the workers. |
sub_sample |
A float between 0 and 1 indicating the proportion of quadrats to keep in the calculation. For large datasets, it may be useful to limit the bandwidth evaluation to a subsample and thus reduce calculation time. |
verbose |
A Boolean, indicating if the function should print messages about the process. |
check |
A Boolean indicating if the geometry checks must be run before the operation. This might take some time, but it ensures that the CRS of the provided objects are valid and identical, and that the geometries are valid. |
Details
For more details, see help(bw_cvl_calc)
Value
A dataframe with two columns, one for the bandwidths and the second for the Cronie and Van Lieshout's Criterion.
Examples
data(mtl_network)
data(bike_accidents)
future::plan(future::multisession(workers=1))
cv_scores <- bw_cvl_calc.mc(seq(200,400,50),
mtl_network, bike_accidents,
rep(1,nrow(bike_accidents)),
"quartic", "discontinuous",
diggle_correction = FALSE, study_area = NULL,
max_depth = 8,
digits=2, tol=0.1, agg=5,
sparse=TRUE, grid_shape=c(1,1),
sub_sample = 1, verbose=TRUE, check=TRUE)
## make sure any open connections are closed afterward
if (!inherits(future::plan(), "sequential")) future::plan(future::sequential)
Time and Network bandwidth correction calculation
Description
Calculating the border correction factor for both time and network bandwidths
Usage
bw_tnkde_corr_factor(
net_bws,
time_bws,
diggle_correction,
study_area,
events,
events_loc,
lines,
method,
kernel_name,
tol,
digits,
max_depth,
sparse
)
Arguments
net_bws |
A vector of network bandwidths |
time_bws |
A vector of time bandwidths |
diggle_correction |
A Boolean indicating if the correction factor for edge effect must be used. |
study_area |
A feature collection of polygons representing the limits of the study area. |
events |
A feature collection of points representing the events |
events_loc |
A feature collection of points representing the unique location of events |
lines |
A feature collection of linestrings representing the underlying lines of the network |
method |
The name of the NKDE to use |
kernel_name |
The name of the kernel to use |
tol |
A float indicating the minimum distance between the events and the lines' extremities when adding the point to the network. When points are closer, they are added at the extremity of the lines. |
digits |
An integer, the number of digits to keep for the spatial coordinates |
max_depth |
The maximal depth for continuous or discontinuous NKDE |
sparse |
A Boolean indicating if sparse or regular matrices should be used by the Rcpp functions. These matrices are used to store edge indices between two nodes in a graph. Regular matrices are faster, but require more memory, in particular with multiprocessing. Sparse matrices are slower (a bit), but require much less memory. |
Value
A list of two elements, first the network correction factors, then the time correction factors.
Examples
# no example provided, this is an internal function
Time and Network bandwidth correction calculation for arrays
Description
Calculating the border correction factor for both time and network bandwidths when we have to deal with adaptive bandwidths and arrays
Usage
bw_tnkde_corr_factor_arr(
net_bws,
time_bws,
diggle_correction,
study_area,
events,
events_loc,
lines,
method,
kernel_name,
tol,
digits,
max_depth,
sparse,
time_limits = NULL
)
Arguments
net_bws |
An array of network bandwidths |
time_bws |
An array of time bandwidths |
diggle_correction |
A Boolean indicating if the correction factor for edge effect must be used. |
study_area |
A feature collection of polygons representing the limits of the study area. |
events |
A feature collection of points representing the events |
events_loc |
A feature collection of points representing the unique location of events |
lines |
A feature collection of linestrings representing the underlying lines of the network |
method |
The name of the NKDE to use |
kernel_name |
The name of the kernel to use |
tol |
A float indicating the minimum distance between the events and the lines' extremities when adding the point to the network. When points are closer, they are added at the extremity of the lines. |
digits |
An integer, the number of digits to keep for the spatial coordinates |
max_depth |
The maximal depth for continuous or discontinuous NKDE |
sparse |
A Boolean indicating if sparse or regular matrices should be used by the Rcpp functions. These matrices are used to store edge indices between two nodes in a graph. Regular matrices are faster, but require more memory, in particular with multiprocessing. Sparse matrices are slower (a bit), but require much less memory. |
time_limits |
A vector with the upper and lower limit of the time period studied |
Examples
# no example provided, this is an internal function
Bandwidth selection by likelihood cross validation for temporal NKDE
Description
Calculate for multiple network and time bandwidths the cross validation likelihood to select an appropriate bandwidth in a data-driven approach
Usage
bw_tnkde_cv_likelihood_calc(
bws_net = NULL,
bws_time = NULL,
lines,
events,
time_field,
w,
kernel_name,
method,
arr_bws_net = NULL,
arr_bws_time = NULL,
diggle_correction = FALSE,
study_area = NULL,
adaptive = FALSE,
trim_net_bws = NULL,
trim_time_bws = NULL,
max_depth = 15,
digits = 5,
tol = 0.1,
agg = NULL,
sparse = TRUE,
zero_strat = "min_double",
grid_shape = c(1, 1),
sub_sample = 1,
verbose = TRUE,
check = TRUE
)
Arguments
bws_net |
An ordered numeric vector with all the network bandwidths |
bws_time |
An ordered numeric vector with all the time bandwidths |
lines |
A feature collection of linestrings representing the underlying network. The geometries must be simple LineStrings, without MultiLineStrings (the function may crash if some geometries are invalid). |
events |
A feature collection of points representing the events on the network. The points will be snapped on the network to their closest line. |
time_field |
The name of the field in events indicating when the events occurred. It must be a numeric field |
w |
A vector representing the weight of each event |
kernel_name |
The name of the kernel to use. Must be one of triangle, gaussian, tricube, cosine, triweight, quartic, epanechnikov or uniform. |
method |
The method to use when calculating the NKDE, must be one of simple / discontinuous / continuous (see nkde details for more information) |
arr_bws_net |
An array with all the local network bandwidths precalculated (for each event, and at each possible combination of network and temporal bandwidths). The dimensions must be c(length(bws_net), length(bws_time), nrow(events)). |
arr_bws_time |
An array with all the local time bandwidths precalculated (for each event, and at each possible combination of network and temporal bandwidths). The dimensions must be c(length(bws_net), length(bws_time), nrow(events)). |
diggle_correction |
A Boolean indicating if the correction factor for edge effect must be used. |
study_area |
A feature collection of polygons representing the limits of the study area. |
adaptive |
A boolean indicating if local bandwidths must be calculated |
trim_net_bws |
A numeric vector with the maximum local network bandwidth. If local bandwidths have higher values, they will be replaced by the corresponding value in this vector. |
trim_time_bws |
A numeric vector with the maximum local time bandwidth. If local bandwidths have higher values, they will be replaced by the corresponding value in this vector. |
max_depth |
When using the continuous and discontinuous methods, the calculation time and memory use can grow sharply if the network has many small edges (areas with many intersections and many events). To avoid this, a maximum depth can be set here. Considering that the kernel is divided at intersections, a value of 10 should yield good estimates in most cases. A larger value can be used without a problem for the discontinuous method. For the continuous method, a larger value will strongly impact calculation speed. |
digits |
The number of digits to retain from the spatial coordinates. It ensures that topology is good when building the network. Default is 5. Too high a precision (high number of digits) might break some connections. |
tol |
A float indicating the minimum distance between the events and the lines' extremities when adding the point to the network. When points are closer, they are added at the extremity of the lines. |
agg |
A double indicating if the events must be aggregated within a distance. If NULL, the events are aggregated only by rounding the coordinates. |
sparse |
A Boolean indicating if sparse or regular matrices should be used by the Rcpp functions. These matrices are used to store edge indices between two nodes in a graph. Regular matrices are faster, but require more memory, in particular with multiprocessing. Sparse matrices are slower (a bit), but require much less memory. |
zero_strat |
A string indicating what to do when the density is 0 when calculating the leave-one-out (LOO) density estimate for an isolated event. "min_double" (default) replaces the 0 value with the minimum double possible on the machine. "remove" removes these values from the final score. The first approach penalizes small bandwidths more strongly. |
grid_shape |
A vector of two values indicating how the study area must be split when performing the calculation. Default is c(1,1) (no split). A finer grid can reduce memory usage and increase speed when a large dataset is used. When using multiprocessing, the work for each grid cell is dispatched among the workers. |
sub_sample |
A float between 0 and 1 indicating the proportion of quadrats to keep in the calculation. For large datasets, it may be useful to limit the bandwidth evaluation to a subsample and thus reduce calculation time. |
verbose |
A Boolean, indicating if the function should print messages about the process. |
check |
A Boolean indicating if the geometry checks must be run before the operation. This might take some time, but it ensures that the CRS of the provided objects are valid and identical, and that the geometries are valid. |
Details
The function calculates the likelihood cross validation score for several time and network bandwidths in order to find the most appropriate one. The general idea is to find the pair of bandwidths that would produce the most similar results if one event is removed from the dataset (leave one out cross validation). We use here the shortcut formula as described by the package spatstat (Baddeley et al. 2021).
LCV(h) = \sum_i \log\hat\lambda_{-i}(x_i)
Where the sum is taken for all events x_i and where \hat\lambda_{-i}(x_i) is the leave-one-out kernel estimate at x_i for a bandwidth h. A higher value indicates a better bandwidth.
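The formula can be illustrated for a pair of bandwidths with a minimal base R sketch. It assumes events located on a single straight line (so that network distance is the absolute difference between positions) and a product of a spatial and a temporal quartic kernel; the helpers quartic_kernel and loo_lcv_st are hypothetical, the data are arbitrary, and this is not the package's implementation.

```r
# Quartic (biweight) kernel with bandwidth bw, scaled to integrate to 1
quartic_kernel <- function(d, bw) {
  u <- d / bw
  ifelse(abs(u) <= 1, (15 / 16) * (1 - u^2)^2 / bw, 0)
}

# Leave-one-out score for one (network, time) pair of bandwidths
# (hypothetical helper)
loo_lcv_st <- function(pos, times, bw_net, bw_time) {
  loo <- vapply(seq_along(pos), function(i) {
    # product of the spatial and temporal kernels, event i left out
    sum(quartic_kernel(pos[i] - pos[-i], bw_net) *
          quartic_kernel(times[i] - times[-i], bw_time))
  }, numeric(1))
  # same idea as zero_strat = "min_double"
  loo[loo == 0] <- .Machine$double.xmin
  sum(log(loo))
}

pos   <- c(10, 12, 15, 40, 42, 45, 80, 83)  # arbitrary positions on the line
times <- c(1, 2, 4, 10, 11, 13, 20, 22)     # arbitrary event times
bws_net  <- c(5, 10, 20)
bws_time <- c(2, 5, 10)

# one row per network bandwidth, one column per time bandwidth
scores <- outer(bws_net, bws_time,
                Vectorize(function(hn, ht) loo_lcv_st(pos, times, hn, ht)))
dimnames(scores) <- list(as.character(bws_net), as.character(bws_time))
```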
Value
A matrix with the cross validation score. Each row corresponds to a network bandwidth and each column to a time bandwidth (the higher the better).
References
Baddeley A, Turner R, Rubak E (2021). spatstat: Spatial Point Pattern Analysis, Model-Fitting, Simulation, Tests. R package version 2.1-0, https://CRAN.R-project.org/package=spatstat.
Examples
# loading the data
data(mtl_network)
data(bike_accidents)
# converting the Date field to a numeric field (counting days)
bike_accidents$Time <- as.POSIXct(bike_accidents$Date, format = "%Y/%m/%d")
bike_accidents$Time <- difftime(bike_accidents$Time, min(bike_accidents$Time), units = "days")
bike_accidents$Time <- as.numeric(bike_accidents$Time)
bike_accidents <- subset(bike_accidents, bike_accidents$Time>=89)
# calculating the cross validation values
cv_scores <- bw_tnkde_cv_likelihood_calc(
bws_net = seq(100,800,100),
bws_time = seq(10,60,5),
lines = mtl_network,
events = bike_accidents,
time_field = "Time",
w = rep(1, nrow(bike_accidents)),
kernel_name = "quartic",
method = "discontinuous",
diggle_correction = FALSE,
study_area = NULL,
max_depth = 10,
digits = 2,
tol = 0.1,
agg = 15,
sparse=TRUE,
grid_shape=c(1,1),
sub_sample=1,
verbose = FALSE,
check = TRUE)
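Once a score matrix is available, the best pair of bandwidths is the cell with the highest score. The sketch below uses a small matrix of made-up scores, with the layout described in the Value section (rows for network bandwidths, columns for time bandwidths), rather than an actual output of the function.

```r
# toy score matrix with made-up values:
# rows = network bandwidths, columns = time bandwidths
bws_net  <- seq(100, 300, 100)
bws_time <- c(10, 20)
cv_scores <- matrix(c(-520, -480, -495,
                      -510, -470, -490),
                    nrow = length(bws_net), ncol = length(bws_time))

# row and column indices of the highest score (first one in case of ties)
best <- which(cv_scores == max(cv_scores), arr.ind = TRUE)[1, ]
best_net  <- bws_net[best["row"]]
best_time <- bws_time[best["col"]]
```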
Bandwidth selection by likelihood cross validation for temporal NKDE (multicore)
Description
Calculate for multiple network and time bandwidths the cross validation likelihood to select an appropriate bandwidth in a data-driven approach with multicore support
Usage
bw_tnkde_cv_likelihood_calc.mc(
bws_net = NULL,
bws_time = NULL,
lines,
events,
time_field,
w,
kernel_name,
method,
arr_bws_net = NULL,
arr_bws_time = NULL,
diggle_correction = FALSE,
study_area = NULL,
adaptive = FALSE,
trim_net_bws = NULL,
trim_time_bws = NULL,
max_depth = 15,
digits = 5,
tol = 0.1,
agg = NULL,
sparse = TRUE,
zero_strat = "min_double",
grid_shape = c(1, 1),
sub_sample = 1,
verbose = TRUE,
check = TRUE
)
Arguments
bws_net |
An ordered numeric vector with all the network bandwidths |
bws_time |
An ordered numeric vector with all the time bandwidths |
lines |
A feature collection of linestrings representing the underlying network. The geometries must be simple LineStrings, without MultiLineStrings (the function may crash if some geometries are invalid). |
events |
A feature collection of points representing the events on the network. The points will be snapped on the network to their closest line. |
time_field |
The name of the field in events indicating when the events occurred. It must be a numeric field |
w |
A vector representing the weight of each event |
kernel_name |
The name of the kernel to use. Must be one of triangle, gaussian, tricube, cosine, triweight, quartic, epanechnikov or uniform. |
method |
The method to use when calculating the NKDE, must be one of simple / discontinuous / continuous (see nkde details for more information) |
arr_bws_net |
An array with all the local network bandwidths precalculated (for each event, and at each possible combination of network and temporal bandwidths). The dimensions must be c(length(bws_net), length(bws_time), nrow(events)). |
arr_bws_time |
An array with all the local time bandwidths precalculated (for each event, and at each possible combination of network and temporal bandwidths). The dimensions must be c(length(bws_net), length(bws_time), nrow(events)). |
diggle_correction |
A Boolean indicating if the correction factor for edge effect must be used. |
study_area |
A feature collection of polygons representing the limits of the study area. |
adaptive |
A boolean indicating if local bandwidths must be calculated |
trim_net_bws |
A numeric vector with the maximum local network bandwidth. If local bandwidths have higher values, they will be replaced by the corresponding value in this vector. |
trim_time_bws |
A numeric vector with the maximum local time bandwidth. If local bandwidths have higher values, they will be replaced by the corresponding value in this vector. |
max_depth |
When using the continuous and discontinuous methods, the calculation time and memory use can grow sharply if the network has many small edges (areas with many intersections and many events). To avoid this, a maximum depth can be set here. Considering that the kernel is divided at intersections, a value of 10 should yield good estimates in most cases. A larger value can be used without a problem for the discontinuous method. For the continuous method, a larger value will strongly impact calculation speed. |
digits |
The number of digits to retain from the spatial coordinates. It ensures that topology is good when building the network. Default is 5. Too high a precision (high number of digits) might break some connections. |
tol |
A float indicating the minimum distance between the events and the lines' extremities when adding the point to the network. When points are closer, they are added at the extremity of the lines. |
agg |
A double indicating if the events must be aggregated within a distance. If NULL, the events are aggregated only by rounding the coordinates. |
sparse |
A Boolean indicating if sparse or regular matrices should be used by the Rcpp functions. These matrices are used to store edge indices between two nodes in a graph. Regular matrices are faster, but require more memory, in particular with multiprocessing. Sparse matrices are slower (a bit), but require much less memory. |
zero_strat |
A string indicating what to do when the density is 0 when calculating the leave-one-out (LOO) density estimate for an isolated event. "min_double" (default) replaces the 0 value with the minimum double possible on the machine. "remove" removes these values from the final score. The first approach penalizes small bandwidths more strongly. |
grid_shape |
A vector of two values indicating how the study area must be split when performing the calculation. Default is c(1,1) (no split). A finer grid can reduce memory usage and increase speed when a large dataset is used. When using multiprocessing, the work for each grid cell is dispatched among the workers. |
sub_sample |
A float between 0 and 1 indicating the proportion of quadrats to keep in the calculation. For large datasets, it may be useful to limit the bandwidth evaluation to a subsample and thus reduce calculation time. |
verbose |
A Boolean, indicating if the function should print messages about the process. |
check |
A Boolean indicating if the geometry checks must be run before the operation. This might take some time, but it ensures that the CRS of the provided objects are valid and identical, and that the geometries are valid. |
Details
See the function bw_tnkde_cv_likelihood_calc for more details. Note that the calculation is split according to the grid_shape argument. If grid_shape is c(1,1), then only one process can be used.
Value
A matrix with the cross validation score. Each row corresponds to a network bandwidth and each column to a time bandwidth (the higher the better).
Examples
# loading the data
data(mtl_network)
data(bike_accidents)
# converting the Date field to a numeric field (counting days)
bike_accidents$Time <- as.POSIXct(bike_accidents$Date, format = "%Y/%m/%d")
bike_accidents$Time <- difftime(bike_accidents$Time, min(bike_accidents$Time), units = "days")
bike_accidents$Time <- as.numeric(bike_accidents$Time)
bike_accidents <- subset(bike_accidents, bike_accidents$Time>=89)
future::plan(future::multisession(workers=1))
# calculating the cross validation values
cv_scores <- bw_tnkde_cv_likelihood_calc.mc(
bws_net = seq(100,800,100),
bws_time = seq(10,60,5),
lines = mtl_network,
events = bike_accidents,
time_field = "Time",
w = rep(1, nrow(bike_accidents)),
kernel_name = "quartic",
method = "discontinuous",
diggle_correction = FALSE,
study_area = NULL,
max_depth = 10,
digits = 2,
tol = 0.1,
agg = 15,
sparse=TRUE,
grid_shape=c(1,1),
sub_sample=1,
verbose = FALSE,
check = TRUE)
## make sure any open connections are closed afterward
if (!inherits(future::plan(), "sequential")) future::plan(future::sequential)
Euclidean distance between rows of a matrix and a vector (arma mode)
Description
Calculate the Euclidean distance between each row of a matrix and a vector (arma mode).
Usage
calcEuclideanDistance3(y, x)
Arguments
y |
a matrix |
x |
a vector (same length as ncol(y)) |
Value
a vector (same length as nrow(y))
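For reference, the same quantity can be obtained in base R. The function row_euclidean_distance below is a hypothetical equivalent written for illustration, not the compiled routine itself.

```r
# Euclidean distance between each row of y and the vector x
# (hypothetical base R equivalent)
row_euclidean_distance <- function(y, x) {
  stopifnot(is.matrix(y), length(x) == ncol(y))
  # subtract x from every row, then take the norm of each row
  sqrt(rowSums(sweep(y, 2, x)^2))
}

y <- matrix(c(0, 0,
              3, 4,
              6, 8), ncol = 2, byrow = TRUE)
x <- c(0, 0)
dists <- row_euclidean_distance(y, x)
```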
Gamma parameter for Abramson’s adaptive bandwidth
Description
Function to calculate the gamma parameter in Abramson’s smoothing regimen.
Usage
calc_gamma(k)
Arguments
k |
a vector of numeric values (the estimated kernel densities) |
Value
the gamma parameter in Abramson’s smoothing regimen
Examples
#This is an internal function, no example provided
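The sketch below assumes the usual formulation of Abramson's regimen, in which gamma is the geometric mean of 1 / sqrt(k) and each local bandwidth is the global bandwidth multiplied by (1 / sqrt(k)) / gamma. The helper abramson_gamma, the density values, and the trimming value are hypothetical; the internal function may differ in its details.

```r
# Geometric mean of 1 / sqrt(k), computed on the log scale
# (hypothetical helper)
abramson_gamma <- function(k) {
  exp(mean(log(1 / sqrt(k))))
}

k <- c(0.01, 0.04, 0.16)  # arbitrary pilot densities at three events
gamma_val <- abramson_gamma(k)

# local bandwidths derived from a global bandwidth h0:
# low density gives a wider bandwidth, high density a narrower one
h0 <- 300
local_bws <- h0 * (1 / sqrt(k)) / gamma_val

# trimming: local bandwidths above the maximum are cut back to it
trimmed_bws <- pmin(local_bws, 500)
```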
Isochrones calculation
Description
Calculate isochrones on a network
Usage
calc_isochrones(
lines,
dists,
start_points,
donught = FALSE,
mindist = 1,
weight = NULL,
direction = NULL
)
Arguments
lines |
A feature collection of lines representing the edges of the network |
dists |
A vector of the size of the desired isochrones. Can also be a list of vectors when each start point must have its own distances. If so, the length of the list must be equal to the number of rows in start_points. |
start_points |
A feature collection of points representing the starting points of the isochrones |
donught |
A boolean indicating if the returned lines must overlap for each distance (FALSE, default) or if the lines must be cut between each distance step (TRUE). |
mindist |
The minimum distance between two points. When two points are too close, they might end up snapped at the same location on a line. Default is 1. |
weight |
The name of the column in lines to use as an edge weight. If NULL, the geographical length is used. Note that if lines are split during the network creation, the weight column is recalculated proportionally to the new lines' lengths. |
direction |
The name of the column indicating the authorized travelling direction on lines. If NULL, all lines can be used in both directions (undirected). The values of the column must be "FT" (From - To), "TF" (To - From) or "Both". |
Details
An isochrone is the set of reachable lines around a node in a network within a specified distance (or time). This function performs dynamic segmentation to return the part of each edge that is reached, and not only the fully covered edges. Several start points and several distances can be given. The network can also be directed. The lines returned by the function are the most accurate representation of the isochrones. However, if polygons are required for mapping, the vignette "Calculating isochrones" shows how to create smooth polygons from the returned sets of lines.
Value
A feature collection of lines representing the isochrones with the following columns:
point_id: the index of the point at the centre of the isochrone;
distance: the size of the isochrone
Examples
library(sf)
# creating a simple network
wkt_lines <- c(
"LINESTRING (0.0 0.0, 5.0 0.0)",
"LINESTRING (0.0 -5.0, 5.0 -5.0)",
"LINESTRING (5.0 0.0, 5.0 5.0)",
"LINESTRING (5.0 -5.0, 5.0 -10.0)",
"LINESTRING (5.0 0.0, 5.0 -5.0)",
"LINESTRING (5.0 0.0, 10.0 0.0)",
"LINESTRING (5.0 -5.0, 10.0 -5.0)",
"LINESTRING (10.0 0, 10.0 -5.0)",
"LINESTRING (10.0 -10.0, 10.0 -5.0)",
"LINESTRING (15.0 -5.0, 10.0 -5.0)",
"LINESTRING (10.0 0.0, 15.0 0.0)",
"LINESTRING (10.0 0.0, 10.0 5.0)")
linesdf <- data.frame(wkt = wkt_lines,
id = paste("l",1:length(wkt_lines),sep=""))
lines <- st_as_sf(linesdf, wkt = "wkt", crs = 32188)
# and the definition of the starting point
start_points <- data.frame(x=c(5),
y=c(-2.5))
start_points <- st_as_sf(start_points, coords = c("x","y"), crs = 32188)
# setting the directions
lines$direction <- "Both"
lines[6,"direction"] <- "TF"
isochrones <- calc_isochrones(lines,dists = c(10,12),
donught = TRUE,
start_points = start_points,
direction = "direction")
Geometry sanity check
Description
Function to check if the geometries given by the user are valid.
Usage
check_geometries(lines, samples, events, study_area)
Arguments
lines |
A feature collection of lines |
samples |
A feature collection of points (the samples) |
events |
A feature collection of points (the events) |
study_area |
A feature collection of polygons (the study_area) |
Value
TRUE if all the checks are passed
Examples
#This is an internal function, no example provided
Clean events geometries
Description
Function to avoid having events at the same location.
Usage
clean_events(events, digits = 5, agg = NULL)
Arguments
events |
The feature collection of points to contract (must have a weight column) |
digits |
The number of digits to keep |
agg |
A double indicating if the points must be aggregated within a distance. If NULL, then the points are aggregated by rounding the coordinates. |
Value
A new feature collection of points
Examples
#This is an internal function, no example provided
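The rounding-based aggregation (the agg = NULL case) can be pictured with the base R sketch below. It works on a plain data.frame rather than an sf object, and it assumes that the weights of merged events are summed; both the helper name `aggregate_by_rounding` and that summing rule are assumptions made for illustration.

```r
# Hypothetical helper: merge events whose rounded coordinates coincide
aggregate_by_rounding <- function(events, digits = 5) {
  events$x <- round(events$x, digits)
  events$y <- round(events$y, digits)
  # assumption: the weights of merged events are summed
  aggregate(weight ~ x + y, data = events, FUN = sum)
}

events <- data.frame(
  x = c(1.001, 1.004, 2.5),
  y = c(3.002, 3.001, 4.0),
  weight = c(1, 1, 1)
)
# the first two events collapse to the same location at 2 digits
cleaned <- aggregate_by_rounding(events, digits = 2)
```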
Find closest points
Description
Solve the nearest neighbour problem for two feature collections of points. This is a simple wrapper around the dbscan::kNN function.
Usage
closest_points(origins, targets)
Arguments
origins |
a feature collection of points |
targets |
a feature collection of points |
Value
for each origin point, the index of the nearest target point
Examples
data(mtl_libraries)
data(mtl_theatres)
close_libs <- closest_points(mtl_theatres, mtl_libraries)
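To make the returned value concrete, here is a brute-force base R sketch of the same idea on plain coordinate matrices. It does not use dbscan and is not the package's code; the name `nearest_target` is hypothetical.

```r
# Hypothetical helper: index of the nearest target for each origin
# origins, targets: numeric matrices with one row per point (x, y)
nearest_target <- function(origins, targets) {
  apply(origins, 1, function(pt) {
    d2 <- (targets[, 1] - pt[1])^2 + (targets[, 2] - pt[2])^2
    which.min(d2)
  })
}

origins <- rbind(c(0, 0), c(10, 10))
targets <- rbind(c(9, 9), c(1, 0), c(5, 5))
nearest_target(origins, targets)
# 2 1
```

A spatial index such as the one behind dbscan::kNN avoids the full pairwise comparison done here, which matters for large point sets.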
The worker function to calculate continuous NKDE (with ARMADILLO and integer matrix)
Description
The worker function to calculate continuous NKDE (with ARMADILLO and integer matrix)
Arguments
kernel_func |
a cpp pointer function (selected with the kernel name) |
samples_k |
a numeric vector of the current kernel values, updated at each recursion |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
edge_mat |
matrix, to find the id of each edge given two neighbours. |
v |
the current node to consider for the recursion (int) |
bw |
the kernel bandwidth |
line_weights |
a vector with the length of the edges |
samples_edgeid |
a vector associating each sample to an edge |
samples_x |
a vector with x coordinates of each sample |
samples_y |
a vector with y coordinates of each sample |
nodes_x |
a vector with x coordinates of each node |
nodes_y |
a vector with y coordinates of each node |
depth |
the current recursion depth |
max_depth |
the maximum recursion depth |
Value
a vector with the kernel values calculated for each sample from the first node given
The worker function to calculate continuous NKDE (with ARMADILLO and sparse matrix)
Description
The worker function to calculate continuous NKDE (with ARMADILLO and sparse matrix)
Arguments
kernel_func |
a cpp pointer function (selected with the kernel name) |
samples_k |
a numeric vector of the current kernel values, updated at each recursion |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
edge_mat |
matrix, to find the id of each edge given two neighbours. |
v |
the current node to consider for the recursion (int) |
bw |
the kernel bandwidth |
line_weights |
a vector with the length of the edges |
samples_edgeid |
a vector associating each sample to an edge |
samples_coords |
a matrix with the X and Y coordinates of the samples |
nodes_coords |
a matrix with the X and Y coordinates of the nodes |
depth |
the current recursion depth |
max_depth |
the maximum recursion depth |
Value
a vector with the kernel values calculated for each sample from the first node given
The main function to calculate continuous NKDE (with ARMADILLO and integer matrix)
Description
The main function to calculate continuous NKDE (with ARMADILLO and integer matrix)
Usage
continuous_nkde_cpp_arma(
neighbour_list,
events,
weights,
samples,
bws,
kernel_name,
nodes,
line_list,
max_depth,
verbose,
div = "bw"
)
Arguments
neighbour_list |
a list of the neighbours of each node |
events |
a numeric vector of the node id of each event |
weights |
a numeric vector of the weight of each event |
samples |
a DataFrame of the samples (with spatial coordinates and belonging edge) |
bws |
the kernel bandwidths for each event |
kernel_name |
the name of the kernel to use |
nodes |
a DataFrame representing the nodes of the graph (with spatial coordinates) |
line_list |
a DataFrame representing the lines of the graph |
max_depth |
the maximum recursion depth (after which recursion is stopped) |
verbose |
a boolean indicating if the function must print its progress |
div |
The divisor to use for the kernel. Must be "n" (the number of events within the radius around each sampling point), "bw" (the bandwidth), or "none" (the simple sum). |
Value
a DataFrame with two columns: the kernel values (sum_k) and the number of events for each sample (n)
The main function to calculate continuous NKDE (with ARMADILLO and sparse matrix)
Description
The main function to calculate continuous NKDE (with ARMADILLO and sparse matrix)
Usage
continuous_nkde_cpp_arma_sparse(
neighbour_list,
events,
weights,
samples,
bws,
kernel_name,
nodes,
line_list,
max_depth,
verbose,
div = "bw"
)
Arguments
neighbour_list |
a list of the neighbours of each node |
events |
a numeric vector of the node id of each event |
weights |
a numeric vector of the weight of each event |
samples |
a DataFrame of the samples (with spatial coordinates and belonging edge) |
bws |
the kernel bandwidths for each event |
kernel_name |
the name of the kernel to use |
nodes |
a DataFrame representing the nodes of the graph (with spatial coordinates) |
line_list |
a DataFrame representing the lines of the graph |
max_depth |
the maximum recursion depth (after which recursion is stopped) |
verbose |
a boolean indicating if the function must print its progress |
div |
The divisor to use for the kernel. Must be "n" (the number of events within the radius around each sampling point), "bw" (the bandwidth), or "none" (the simple sum). |
Value
a DataFrame with two columns: the kernel values (sum_k) and the number of events for each sample (n)
Border correction for NKDE
Description
Function to calculate the border correction factor.
Usage
correction_factor(
study_area,
events,
lines,
method,
bws,
kernel_name,
tol,
digits,
max_depth,
sparse
)
Arguments
study_area |
A feature collection of polygons or a polygon, the limit of the study area. |
events |
A feature collection of points representing the events on the network. |
lines |
The lines used to create the network |
method |
The method to use when calculating the NKDE, must be one of simple / discontinuous / continuous (see details for more information) |
bws |
The kernel bandwidth (in meters) for each event |
kernel_name |
The name of the kernel to use |
tol |
When adding the events and the sampling points to the network, the minimum distance between these points and the lines extremities. When points are closer, they are added at the extremity of the lines. |
digits |
The number of digits to keep in the spatial coordinates. It ensures that topology is good when building the network. Default is 3 |
max_depth |
When using the continuous and discontinuous methods, the calculation time and memory use can go wild if the network has a lot of small edges (area with a lot of intersections and a lot of events). To avoid it, it is possible to set here a maximum depth. Considering that the kernel is divided at intersections, a value of 8 should yield good estimates. A larger value can be used without problem for the discontinuous method. For the continuous method, a larger value will strongly impact calculation speed. |
sparse |
A boolean indicating if sparse or regular matrix should be used by the Rcpp functions. Regular matrices are faster, but require more memory and could lead to error, in particular with multiprocessing. Sparse matrices are slower, but require much less memory. |
Value
A numeric vector with the correction factor values for each event
Examples
#no example provided, this is an internal function
Time extent correction for NKDE
Description
Function to calculate the time extent correction factor in tnkde.
Usage
correction_factor_time(
events_time,
samples_time,
bws_time,
kernel_name,
time_limits = NULL
)
Arguments
events_time |
A numeric vector representing when the events occurred |
samples_time |
A numeric vector representing when the densities will be sampled |
bws_time |
A numeric vector with the temporal bandwidths |
kernel_name |
The name of the kernel to use |
time_limits |
A vector with the upper and lower limit of the time period studied |
Value
A numeric vector with the correction factor values for each event
Examples
#no example provided, this is an internal function
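The idea of a time extent correction can be sketched as follows, assuming a Diggle-style factor: for each event, the inverse of the share of its temporal kernel mass that falls inside the studied period. The Epanechnikov kernel, the helper names and this exact definition are assumptions for illustration and may differ from what the package does internally.

```r
# Epanechnikov kernel scaled by the bandwidth (integrates to 1 over [-bw, bw])
epanechnikov <- function(d, bw) {
  u <- d / bw
  ifelse(abs(u) > 1, 0, (3 / 4) * (1 - u^2) / bw)
}

# Hypothetical helper: inverse of the kernel mass inside the time limits
time_correction <- function(events_time, bws_time, time_limits) {
  mapply(function(ev_t, bw) {
    lower <- max(ev_t - bw, time_limits[1])
    upper <- min(ev_t + bw, time_limits[2])
    inside <- integrate(function(x) epanechnikov(x - ev_t, bw), lower, upper)$value
    1 / inside
  }, events_time, bws_time)
}

# an event on the lower limit loses half of its mass, a central event loses none
corr <- time_correction(c(0, 50), bws_time = c(10, 10), time_limits = c(0, 100))
```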
A function to calculate the necessary information to apply the Diggle correction factor with a continuous method
Description
A function to calculate the necessary information to apply the Diggle correction factor with a continuous method
Usage
corrfactor_continuous(neighbour_list, events, line_list, bws, max_depth)
Arguments
neighbour_list |
a list of the neighbours of each node |
events |
a numeric vector of the node id of each event |
line_list |
a DataFrame representing the lines of the graph |
bws |
the kernel bandwidth for each event |
max_depth |
the maximum recursion depth (after which recursion is stopped) |
Value
a list of dataframes, used to calculate the Diggle correction factor
A function to calculate the necessary information to apply the Diggle correction factor with a continuous method (sparse)
Description
A function to calculate the necessary information to apply the Diggle correction factor with a continuous method (sparse)
Usage
corrfactor_continuous_sparse(neighbour_list, events, line_list, bws, max_depth)
Arguments
neighbour_list |
a list of the neighbours of each node |
events |
a numeric vector of the node id of each event |
line_list |
a DataFrame representing the lines of the graph |
bws |
the kernel bandwidth for each event |
max_depth |
the maximum recursion depth (after which recursion is stopped) |
Value
a list of dataframes, used to calculate the Diggle correction factor
A function to calculate the necessary information to apply the Diggle correction factor with a discontinuous method
Description
A function to calculate the necessary information to apply the Diggle correction factor with a discontinuous method
Usage
corrfactor_discontinuous(neighbour_list, events, line_list, bws, max_depth)
Arguments
neighbour_list |
a list of the neighbours of each node |
events |
a numeric vector of the node id of each event |
line_list |
a DataFrame representing the lines of the graph |
bws |
the kernel bandwidth for each event |
max_depth |
the maximum recursion depth (after which recursion is stopped) |
Value
a list of dataframes, used to calculate the Diggle correction factor
A function to calculate the necessary information to apply the Diggle correction factor with a discontinuous method (sparse)
Description
A function to calculate the necessary information to apply the Diggle correction factor with a discontinuous method (sparse)
Usage
corrfactor_discontinuous_sparse(
neighbour_list,
events,
line_list,
bws,
max_depth
)
Arguments
neighbour_list |
a list of the neighbours of each node |
events |
a numeric vector of the node id of each event |
line_list |
a DataFrame representing the lines of the graph |
bws |
the kernel bandwidth for each event |
max_depth |
the maximum recursion depth (after which recursion is stopped) |
Value
a list of dataframes, used to calculate the Diggle correction factor
Cosine kernel
Description
Function implementing the cosine kernel.
Usage
cosine_kernel(d, bw)
Arguments
d |
The distance from the event |
bw |
The bandwidth used for the kernel |
Value
The estimated density
Examples
#This is an internal function, no example provided
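For reference, a minimal sketch of a cosine kernel, assuming the textbook form K(u) = (pi/4) cos(pi u / 2) for |u| <= 1 with u = d / bw, divided by the bandwidth. The name `cosine_kernel_sketch` is hypothetical and the package's own function may differ in details.

```r
# Assumed textbook cosine kernel, scaled by the bandwidth
cosine_kernel_sketch <- function(d, bw) {
  u <- d / bw
  # zero outside the bandwidth, cosine-shaped inside
  ifelse(abs(u) > 1, 0, (pi / 4) * cos((pi / 2) * u) / bw)
}

# density contributed at several distances from an event, bandwidth of 200
cosine_kernel_sketch(c(0, 50, 100, 250), bw = 200)
```

Under this form the kernel integrates to 1 over [-bw, bw], which can be checked with `integrate()`.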
C++ cosine kernel
Description
C++ cosine kernel
Usage
cosine_kernel_cpp(d, bw)
Arguments
d |
a vector of distances for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
C++ cosine kernel for one distance
Description
C++ cosine kernel for one distance
Usage
cosine_kernelos(d, bw)
Arguments
d |
a double, the distance for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
C++ cross g function
Description
C++ cross g function (INTERNAL)
Usage
cross_gfunc_cpp(dist_mat, start, end, step, width, Lt, na, nb, wa, wb)
Arguments
dist_mat |
A matrix with the distances between points |
start |
A float, the start value for evaluating the g-function |
end |
A float, the last value for evaluating the g-function |
step |
A float, the jump between two evaluations of the g-function |
width |
The width of each donut |
Lt |
The total length of the network |
na |
The number of points in set A |
nb |
The number of points in set B |
wa |
The weight of the points in set A (coincident points) |
wb |
The weight of the points in set B (coincident points) |
C++ cross k function
Description
C++ cross k function
Usage
cross_kfunc_cpp(dist_mat, start, end, step, Lt, na, nb, wa, wb)
Arguments
dist_mat |
A square matrix with the distances between points |
start |
A float, the start value for evaluating the k-function |
end |
A float, the last value for evaluating the k-function |
step |
A float, the jump between two evaluations of the k-function |
Lt |
The total length of the network |
na |
The number of points in set A |
nb |
The number of points in set B |
wa |
The weight of the points in set A (coincident points) |
wb |
The weight of the points in set B (coincident points) |
Network cross k and g functions (maturing)
Description
Calculate the cross k and g functions for a set of points on a network. (maturing)
Usage
cross_kfunctions(
lines,
pointsA,
pointsB,
start,
end,
step,
width,
nsim,
conf_int = 0.05,
digits = 2,
tol = 0.1,
resolution = NULL,
agg = NULL,
verbose = TRUE,
return_sims = FALSE,
calc_g_func = TRUE
)
Arguments
lines |
A feature collection of linestrings representing the underlying network. The geometries must be simple LineStrings without MultiLineString (the function may crash if some geometries are invalid) |
pointsA |
A feature collection of points representing the points to which the distances are calculated. |
pointsB |
A feature collection of points representing the points from which the distances are calculated. |
start |
A double, the lowest distance used to evaluate the k and g functions |
end |
A double, the highest distance used to evaluate the k and g functions |
step |
A double, the step between two evaluations of the k and g function. start, end and step are used to create a vector of distances with the function seq |
width |
The width of each donut for the g-function. Half of the width is applied on both sides of the considered distance |
nsim |
An integer indicating the number of Monte Carlo simulations to perform for inference |
conf_int |
A double indicating the width of the confidence interval (default = 0.05) calculated on the Monte Carlo simulations |
digits |
An integer indicating the number of digits to retain from the spatial coordinates |
tol |
When adding the points to the network, specify the minimum distance between these points and the lines' extremities. When points are closer, they are added at the extremity of the lines |
resolution |
When simulating random points on the network, selecting a resolution will reduce greatly the calculation time. When resolution is null the random points can occur everywhere on the graph. If a value is specified, the edges are split according to this value and the random points can only be vertices on the new network |
agg |
A double indicating if the events must be aggregated within a distance. If NULL, the events are aggregated only by rounding the coordinates |
verbose |
A Boolean indicating if progress messages should be displayed |
return_sims |
A Boolean indicating if the simulated k and g values must also be returned. |
calc_g_func |
A Boolean indicating if the G function must also be calculated (TRUE by default). If FALSE, then only the K function is calculated |
Details
The cross k-function is a method to characterize the dispersion of a set of points (A) around a second set of points (B). For each point in B, the numbers of other points in A in subsequent radii are calculated. This empirical cross k-function can be more or less clustered than a cross k-function obtained if the points in A were randomly located around points in B. In a network, the network distance is used instead of the Euclidean distance. This function uses Monte Carlo simulations to assess if the points are clustered or dispersed and gives the results as a line plot. If the line of the observed cross k-function is higher than the shaded area representing the values of the simulations, then the points in A are more clustered around points in B than what we can expect from randomness and vice-versa. The function also calculates the cross g-function, a modified version of the cross k-function using rings instead of disks. The width of the ring must be chosen. The main interest is to avoid the cumulative effect of the classical k-function. Note that the cross k-function of points A around B is not necessarily the same as the cross k-function of points B around A. This function is maturing, it works as expected (unit tests) but will probably be modified in the future releases (gain speed, advanced features, etc.).
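The difference between disks (k) and rings (g) can be sketched in base R from a matrix of network distances. The orientation of the matrix (rows for B, columns for A), the scaling by Lt / (na * nb) and the helper name `cross_k_g_sketch` are assumptions made for illustration. Weights for coincident points and the Monte Carlo envelope are left out.

```r
# Hypothetical helper: empirical cross k (disks) and cross g (rings)
# dist_mat: network distances, rows = points of B, columns = points of A
cross_k_g_sketch <- function(dist_mat, radii, width, Lt) {
  nb <- nrow(dist_mat)
  na <- ncol(dist_mat)
  norm_factor <- Lt / (na * nb)  # assumed normalisation
  # k: pairs closer than r (cumulative)
  k_vals <- sapply(radii, function(r) norm_factor * sum(dist_mat <= r))
  # g: pairs falling in a ring of the given width centred on r
  g_vals <- sapply(radii, function(r) {
    norm_factor * sum(dist_mat > r - width / 2 & dist_mat <= r + width / 2)
  })
  data.frame(distance = radii, k = k_vals, g = g_vals)
}

dist_mat <- rbind(c(1, 4, 9), c(2, 6, 3))
res <- cross_k_g_sketch(dist_mat, radii = seq(0, 10, by = 5), width = 2, Lt = 60)
```

The k column can only grow with the distance, while the g column reflects what happens at each distance separately, which is the cumulative effect mentioned above.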
Value
A list with the following values:
plotk |
A ggplot2 object representing the values of the cross k-function |
plotg |
A ggplot2 object representing the values of the cross g-function |
values |
A DataFrame with the values used to build the plots |
Examples
data(main_network_mtl)
data(mtl_libraries)
data(mtl_theatres)
result <- cross_kfunctions(main_network_mtl, mtl_theatres, mtl_libraries,
start = 0, end = 2500, step = 10, width = 250,
nsim = 50, conf_int = 0.05, digits = 2,
tol = 0.1, agg = NULL, verbose = FALSE)
Network cross k and g functions (maturing, multicore)
Description
Calculate the cross k and g functions for a set of points on a network. For more details, see the documentation of the function cross_kfunctions.
Usage
cross_kfunctions.mc(
lines,
pointsA,
pointsB,
start,
end,
step,
width,
nsim,
conf_int = 0.05,
digits = 2,
tol = 0.1,
resolution = NULL,
agg = NULL,
verbose = TRUE,
return_sims = FALSE,
calc_g_func = TRUE,
grid_shape = c(1, 1)
)
Arguments
lines |
A feature collection of linestrings representing the underlying network. The geometries must be simple LineStrings without MultiLineString (the function may crash if some geometries are invalid) |
pointsA |
A feature collection of points representing the points to which the distances are calculated. |
pointsB |
A feature collection of points representing the points from which the distances are calculated. |
start |
A double, the lowest distance used to evaluate the k and g functions |
end |
A double, the highest distance used to evaluate the k and g functions |
step |
A double, the step between two evaluations of the k and g function. start, end and step are used to create a vector of distances with the function seq |
width |
The width of each donut for the g-function. Half of the width is applied on both sides of the considered distance |
nsim |
An integer indicating the number of Monte Carlo simulations to perform for inference |
conf_int |
A double indicating the width of the confidence interval (default = 0.05) calculated on the Monte Carlo simulations |
digits |
An integer indicating the number of digits to retain from the spatial coordinates |
tol |
When adding the points to the network, specify the minimum distance between these points and the lines' extremities. When points are closer, they are added at the extremity of the lines |
resolution |
When simulating random points on the network, selecting a resolution will reduce greatly the calculation time. When resolution is null the random points can occur everywhere on the graph. If a value is specified, the edges are split according to this value and the random points can only be vertices on the new network |
agg |
A double indicating if the events must be aggregated within a distance. If NULL, the events are aggregated only by rounding the coordinates |
verbose |
A Boolean indicating if progress messages should be displayed |
return_sims |
A Boolean indicating if the simulated k and g values must also be returned. |
calc_g_func |
A Boolean indicating if the G function must also be calculated (TRUE by default). If FALSE, then only the K function is calculated |
grid_shape |
A vector of two values indicating how the study area must be split when performing the calculation. Default is c(1,1) (no split). A finer grid could reduce memory usage and increase speed when a large dataset is used. When using multiprocessing, the work in each grid is dispatched between the workers. |
Value
A list with the following values:
plotk |
A ggplot2 object representing the values of the cross k-function |
plotg |
A ggplot2 object representing the values of the cross g-function |
values |
A DataFrame with the values used to build the plots |
Examples
data(main_network_mtl)
data(mtl_libraries)
data(mtl_theatres)
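future::plan(future::multisession, workers = 1)
result <- cross_kfunctions.mc(main_network_mtl, mtl_theatres, mtl_libraries,
start = 0, end = 2500, step = 10, width = 250,
nsim = 50, conf_int = 0.05, digits = 2,
tol = 0.1, agg = NULL, verbose = FALSE)
## make sure any open connections are closed afterward
if (!inherits(future::plan(), "sequential")) future::plan(future::sequential)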
Cut lines at a specified distance
Description
Cut lines in a feature collection of linestrings at a specified distance from the beginning of the lines.
Usage
cut_lines_at_distance(lines, dists)
Arguments
lines |
The feature collection of linestrings to cut |
dists |
A vector of distances, if only one value is given, each line will be cut at that distance. |
Value
A feature collection of linestrings
Examples
# This is an internal function, no example provided
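The geometric operation can be sketched on a plain matrix of vertices: accumulate segment lengths, locate the segment containing the cut distance, and interpolate the cut point. The helper `cut_coords_at_distance` is hypothetical and works on coordinates only, not on sf objects as the package function does.

```r
# Hypothetical helper: split a polyline (two-column matrix) at distance d
cut_coords_at_distance <- function(coords, d) {
  seg_len <- sqrt(rowSums(diff(coords)^2))  # length of each segment
  cum_len <- c(0, cumsum(seg_len))          # distance from the start at each vertex
  stopifnot(d > 0, d < max(cum_len))
  i <- max(which(cum_len < d))              # segment containing the cut
  frac <- (d - cum_len[i]) / seg_len[i]     # position of the cut along that segment
  cut_pt <- coords[i, ] + frac * (coords[i + 1, ] - coords[i, ])
  list(
    first = rbind(coords[seq_len(i), , drop = FALSE], cut_pt),
    second = rbind(cut_pt, coords[(i + 1):nrow(coords), , drop = FALSE])
  )
}

line <- rbind(c(0, 0), c(4, 0), c(4, 3))  # total length 7
parts <- cut_coords_at_distance(line, 5)  # cut point falls at (4, 1)
```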
Make a network directed
Description
Function to create complementary lines for a directed network.
Usage
direct_lines(lines, direction)
Arguments
lines |
The original feature collection of linestrings |
direction |
A vector of integers. 0 indicates a bidirectional line and 1 a unidirectional line |
Value
A feature collection of linestrings with some lines duplicated according to direction
Examples
#This is an internal function, no example provided
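The principle can be illustrated on a simple edge table: bidirectional lines (0) receive a reversed copy, while unidirectional lines (1) are kept as they are. The data.frame layout with `from` and `to` columns and the helper name `add_reverse_edges` are assumptions for this sketch. The package function operates on linestring geometries.

```r
# Hypothetical helper: add a reversed copy of every bidirectional edge
# edges: data.frame with columns from and to
# direction: 0 = bidirectional, 1 = unidirectional
add_reverse_edges <- function(edges, direction) {
  stopifnot(nrow(edges) == length(direction))
  two_way <- edges[direction == 0, , drop = FALSE]
  reversed <- data.frame(from = two_way$to, to = two_way$from)
  rbind(edges[, c("from", "to")], reversed)
}

edges <- data.frame(from = c("a", "b"), to = c("b", "c"))
# a-b can be travelled both ways, b-c only from b to c
directed <- add_reverse_edges(edges, direction = c(0, 1))
```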
The worker function to calculate discontinuous NKDE (with ARMADILLO and Integer matrix)
Description
The worker function to calculate discontinuous NKDE (with ARMADILLO and Integer matrix)
Arguments
kernel_func |
a cpp pointer function (selected with the kernel name) |
edge_mat |
matrix, to find the id of each edge given two neighbours |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
v |
the current node to consider for the recursion (int) |
bw |
the kernel bandwidth |
line_weights |
a vector with the length of the edges |
samples_edgeid |
a vector associating each sample to an edge |
samples_x |
a vector with x coordinates of each sample |
samples_y |
a vector with y coordinates of each sample |
nodes_x |
a vector with x coordinates of each node |
nodes_y |
a vector with y coordinates of each node |
depth |
the current recursion depth |
max_depth |
the maximum recursion depth |
Value
a vector with the kernel values calculated for each sample from the first node given
The worker function to calculate discontinuous NKDE (with ARMADILLO and sparse matrix)
Description
The worker function to calculate discontinuous NKDE (with ARMADILLO and sparse matrix)
Arguments
kernel_func |
a cpp pointer function (selected with the kernel name) |
edge_mat |
matrix, to find the id of each edge given two neighbours |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
v |
the current node to consider for the recursion (int) |
bw |
the kernel bandwidth |
line_weights |
a vector with the length of the edges |
samples_edgeid |
a vector associating each sample to an edge |
samples_coords |
a matrix with the X and Y coordinates of the samples |
nodes_coords |
a matrix with the X and Y coordinates of the nodes |
depth |
the current recursion depth |
max_depth |
the maximum recursion depth |
Value
a vector with the kernel values calculated for each sample from the first node given
The main function to calculate discontinuous NKDE (ARMA and sparse matrix)
Description
The main function to calculate discontinuous NKDE (ARMA and sparse matrix)
The main function to calculate discontinuous NKDE (ARMA and Integer matrix)
Usage
discontinuous_nkde_cpp_arma_sparse(
neighbour_list,
events,
weights,
samples,
bws,
kernel_name,
nodes,
line_list,
max_depth,
verbose,
div = "bw"
)
discontinuous_nkde_cpp_arma(
neighbour_list,
events,
weights,
samples,
bws,
kernel_name,
nodes,
line_list,
max_depth,
verbose,
div = "bw"
)
Arguments
neighbour_list |
a list of the neighbours of each node |
events |
a numeric vector of the node id of each event |
weights |
a numeric vector of the weight of each event |
samples |
a DataFrame of the samples (with spatial coordinates and belonging edge) |
bws |
the kernel bandwidth for each event |
kernel_name |
the name of the kernel function to use |
nodes |
a DataFrame representing the nodes of the graph (with spatial coordinates) |
line_list |
a DataFrame representing the lines of the graph |
max_depth |
the maximum recursion depth (after which recursion is stopped) |
verbose |
a boolean indicating if the function must print its progress |
div |
The divisor to use for the kernel. Must be "n" (the number of events within the radius around each sampling point), "bw" (the bandwidth), or "none" (the simple sum). |
Value
a DataFrame with two columns: the kernel values (sum_k) and the number of events for each sample (n)
Distance matrix with duplicated vertices
Description
Function to create a distance matrix when some vertices are duplicated.
Usage
dist_mat_dupl(graph, start, end, ...)
Arguments
graph |
The Graph to use |
start |
The vertices to use as starting points |
end |
The vertices to use as ending points |
... |
parameters passed to the function igraph::distances |
Value
A matrix with the distances between the vertices
Examples
#This is an internal function, no example provided
Epanechnikov kernel
Description
Function implementing the Epanechnikov kernel.
Usage
epanechnikov_kernel(d, bw)
Arguments
d |
The distance from the event |
bw |
The bandwidth used for the kernel |
Value
The estimated density
Examples
#This is an internal function, no example provided
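For reference, a minimal sketch of an Epanechnikov kernel, assuming the textbook form K(u) = (3/4)(1 - u^2) for |u| <= 1 with u = d / bw, divided by the bandwidth. The name `epanechnikov_sketch` is hypothetical and the package's own function may differ in details.

```r
# Assumed textbook Epanechnikov kernel, scaled by the bandwidth
epanechnikov_sketch <- function(d, bw) {
  u <- d / bw
  # zero outside the bandwidth, parabolic inside
  ifelse(abs(u) > 1, 0, (3 / 4) * (1 - u^2) / bw)
}

# density contributed at several distances from an event, bandwidth of 200
epanechnikov_sketch(c(0, 50, 100, 250), bw = 200)
```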
C++ Epanechnikov kernel
Description
C++ Epanechnikov kernel
Usage
epanechnikov_kernel_cpp(d, bw)
Arguments
d |
a vector of distances for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
C++ Epanechnikov kernel for one distance
Description
C++ Epanechnikov kernel for one distance
Usage
epanechnikov_kernelos(d, bw)
Arguments
d |
a double, the distance for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
The worker function to calculate continuous TNKDE likelihood cv
Description
The worker function to calculate continuous TNKDE likelihood cv (INTERNAL)
Arguments
kernel_func |
a cpp pointer function (selected with the kernel name) |
edge_mat |
matrix, to find the id of each edge given two neighbours. |
events |
a NumericVector indicating the nodes in the graph being events |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
v |
the current node to consider (int) |
bws_net |
an arma::vec with the network bandwidths to consider |
line_weights |
a vector with the length of the edges |
depth |
the current recursion depth |
max_depth |
the maximum recursion depth |
Value
a cube with the impact of the event v on each other event for each pair of bandwidths (cube(bws_net, bws_time, events))
The worker function to calculate continuous TNKDE likelihood cv
Description
The worker function to calculate continuous TNKDE likelihood cv (INTERNAL)
Arguments
kernel_func |
a cpp pointer function (selected with the kernel name) |
edge_mat |
matrix, to find the id of each edge given two neighbours. |
events |
a NumericVector indicating the nodes in the graph being events |
time_events |
a NumericVector indicating the timestamp of each event |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
v |
the current node to consider (int) |
v_time |
the time of v (double) |
bws_net |
an arma::vec with the network bandwidths to consider |
bws_time |
an arma::vec with the time bandwidths to consider |
line_weights |
a vector with the length of the edges |
depth |
the current recursion depth |
max_depth |
the maximum recursion depth |
Value
a cube with the impact of the event v on each other event for each pair of bandwidths (cube(bws_net, bws_time, events))
The worker function to calculate continuous TNKDE likelihood cv (adaptive case)
Description
The worker function to calculate continuous TNKDE likelihood cv (INTERNAL)
Arguments
kernel_func |
a cpp pointer function (selected with the kernel name) |
edge_mat |
matrix, to find the id of each edge given two neighbours. |
events |
a NumericVector indicating the nodes in the graph being events |
time_events |
a NumericVector indicating the timestamp of each event |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
v |
the current node to consider (int) |
v_time |
the time of v (double) |
bws_net |
an arma::mat with the network bandwidths to consider |
bws_time |
an arma::mat with the time bandwidths to consider |
line_weights |
a vector with the length of the edges |
depth |
the actual recursion depth |
max_depth |
the maximum recursion depth |
Value
a cube with the impact of the event v on each other event for each pair of bandwidths (cube(bws_net, bws_time, events))
The worker function to calculate discontinuous TNKDE likelihood cv
Description
The worker function to calculate discontinuous TNKDE likelihood cv (INTERNAL)
Arguments
kernel_func |
a cpp pointer function (selected with the kernel name) |
edge_mat |
matrix, to find the id of each edge given two neighbours. |
events |
a NumericVector indicating the nodes in the graph being events |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
v |
the actual node to consider (int) |
bws_net |
an arma::vec with the network bandwidths to consider |
line_weights |
a vector with the length of the edges |
depth |
the actual recursion depth |
max_depth |
the maximum recursion depth |
Value
a cube with the impact of the event v on each other event for each pair of bandwidths (cube(bws_net, bws_time, events))
The worker function to calculate discontinuous TNKDE likelihood cv
Description
The worker function to calculate discontinuous TNKDE likelihood cv (INTERNAL)
Arguments
kernel_func |
a cpp pointer function (selected with the kernel name) |
edge_mat |
matrix, to find the id of each edge given two neighbours. |
events |
a NumericVector indicating the nodes in the graph being events |
time_events |
a NumericVector indicating the timestamp of each event |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
v |
the actual node to consider (int) |
v_time |
the time of v (double) |
bws_net |
an arma::vec with the network bandwidths to consider |
bws_time |
an arma::vec with the time bandwidths to consider |
line_weights |
a vector with the length of the edges |
depth |
the actual recursion depth |
max_depth |
the maximum recursion depth |
Value
a cube with the impact of the event v on each other event for each pair of bandwidths (cube(bws_net, bws_time, events))
The worker function to calculate discontinuous TNKDE likelihood cv (adaptive case)
Description
The worker function to calculate discontinuous TNKDE likelihood cv (INTERNAL)
Arguments
kernel_func |
a cpp pointer function (selected with the kernel name) |
edge_mat |
matrix, to find the id of each edge given two neighbours. |
events |
a NumericVector indicating the nodes in the graph being events |
time_events |
a NumericVector indicating the timestamp of each event |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
v |
the actual node to consider (int) |
v_time |
the time of v (double) |
bws_net |
an arma::mat with the network bandwidths to consider |
bws_time |
an arma::mat with the time bandwidths to consider |
line_weights |
a vector with the length of the edges |
depth |
the actual recursion depth |
max_depth |
the maximum recursion depth |
Value
a cube with the impact of the event v on each other event for each pair of bandwidths (cube(bws_net, bws_time, events))
Worker for simple NKDE algorithm
Description
The worker function to perform the simple nkde.
Usage
ess_kernel(graph, y, bw, kernel_func, ok_samples, nodes, ok_edges, N)
Arguments
graph |
a graph object from igraph representing the network |
y |
the index of the actual event |
bw |
a float indicating the kernel bandwidth (in meters) |
kernel_func |
a function obtained with the function select_kernel |
ok_samples |
a feature collection of points representing the sampling points. The samples must be snapped on the network. A column edge_id must indicate for each sample on which edge it is snapped. |
nodes |
a feature collection of points representing the nodes of the network |
ok_edges |
a feature collection of linestrings representing the edges of the network |
Examples
#This is an internal function, no example provided
The worker function to calculate simple NKDE likelihood cv
Description
The worker function to calculate simple NKDE likelihood cv
Arguments
kernel_func |
a cpp pointer function (selected with the kernel name) |
edge_mat |
matrix, to find the id of each edge given two neighbours. |
events |
a NumericVector indicating the nodes in the graph being events |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
v |
the actual node to consider (int) |
bws_net |
an arma::vec with the network bandwidths to consider |
line_weights |
a vector with the length of the edges |
depth |
the actual recursion depth |
max_depth |
the maximum recursion depth |
Value
a matrix with the impact of the event v on each other event for each network bandwidth (mat(event, bws_net))
The worker function to calculate simple TNKDE likelihood cv
Description
The worker function to calculate simple TNKDE likelihood cv
Arguments
kernel_func |
a cpp pointer function (selected with the kernel name) |
edge_mat |
matrix, to find the id of each edge given two neighbours. |
events |
a NumericVector indicating the nodes in the graph being events |
time_events |
a NumericVector indicating the timestamp of each event |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
v |
the actual node to consider (int) |
v_time |
the time of v (double) |
bws_net |
an arma::vec with the network bandwidths to consider |
bws_time |
an arma::vec with the time bandwidths to consider |
line_weights |
a vector with the length of the edges |
depth |
the actual recursion depth |
max_depth |
the maximum recursion depth |
Value
a cube with the impact of the event v on each other event for each pair of bandwidths (cube(bws_net, bws_time, events))
The worker function to calculate simple TNKDE likelihood cv (adaptive case)
Description
The worker function to calculate simple TNKDE likelihood cv (adaptive case)
Arguments
kernel_func |
a cpp pointer function (selected with the kernel name) |
edge_mat |
matrix, to find the id of each edge given two neighbours. |
events |
a NumericVector indicating the nodes in the graph being events |
time_events |
a NumericVector indicating the timestamp of each event |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
v |
the actual node to consider (int) |
v_time |
the time of v (double) |
bws_net |
an arma::mat with the network bandwidths to consider |
bws_time |
an arma::mat with the time bandwidths to consider |
line_weights |
a vector with the length of the edges |
depth |
the actual recursion depth |
max_depth |
the maximum recursion depth |
Value
a cube with the impact of the event v on each other event for each pair of bandwidths (cube(bws_net, bws_time, events))
c++ g space-time function
Description
c++ g space-time function
Usage
g_nt_func_cpp(
dist_mat_net,
dist_mat_time,
start_net,
end_net,
step_net,
width_net,
start_time,
end_time,
step_time,
width_time,
Lt,
Tt,
n,
w
)
Arguments
dist_mat_net |
A square matrix with the distances between points on the network |
dist_mat_time |
A square matrix with the distances between points in time |
start_net |
A float, the start value for evaluating the g-function on the network |
end_net |
A float, the last value for evaluating the g-function on the network |
step_net |
A float, the jump between two evaluations of the g-function on the network |
width_net |
The width of each donut on the network |
start_time |
A float, the start value for evaluating the g-function in time |
end_time |
A float, the last value for evaluating the g-function in time |
step_time |
A float, the jump between two evaluations of the g-function in time |
width_time |
The width of each donut in time |
Lt |
The total length of the network |
Tt |
The total duration of the study period |
n |
The number of points |
w |
The weight of the points (coincident points) |
Gaussian kernel
Description
Function implementing the gaussian kernel.
Usage
gaussian_kernel(d, bw)
Arguments
d |
The distance from the event |
bw |
The bandwidth used for the kernel |
Value
The estimated density
Examples
#This is an internal function, no example provided
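As a rough illustration of what a Gaussian kernel with bandwidth bw computes, the sketch below uses the textbook Gaussian density. The name gaussian_kernel_sketch is hypothetical, and the internal implementation of the package may scale or truncate the kernel differently.

```r
# Hypothetical helper, not part of spNetwork: textbook Gaussian kernel.
# d  : numeric vector of distances from the event
# bw : kernel bandwidth (same unit as d)
gaussian_kernel_sketch <- function(d, bw) {
  u <- d / bw
  (1 / bw) * (1 / sqrt(2 * pi)) * exp(-0.5 * u^2)
}

# density contributed by one event at increasing distances
dists <- c(0, 100, 200, 300)
dens <- gaussian_kernel_sketch(dists, bw = 150)
print(dens)
```

The density decreases as the distance grows, and under this assumed form it matches dnorm(d, sd = bw).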
c++ gaussian kernel
Description
c++ gaussian kernel
Usage
gaussian_kernel_cpp(d, bw)
Arguments
d |
a vector of distances for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
Scaled gaussian kernel
Description
Function implementing the scaled gaussian kernel.
Usage
gaussian_kernel_scaled(d, bw)
Arguments
d |
The distance from the event |
bw |
The bandwidth used for the kernel |
Value
The estimated density
Examples
#This is an internal function, no example provided
c++ scaled gaussian kernel
Description
c++ scaled gaussian kernel
Usage
gaussian_kernel_scaled_cpp(d, bw)
Arguments
d |
a vector of distances for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
c++ scaled gaussian kernel for one distance
Description
c++ scaled gaussian kernel for one distance
Usage
gaussian_kernel_scaledos(d, bw)
Arguments
d |
a double, the distance for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
c++ gaussian kernel for one distance
Description
c++ gaussian kernel for one distance
Usage
gaussian_kernelos(d, bw)
Arguments
d |
a double, the distance for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
c++ g function counting worker
Description
c++ g function counting (INTERNAL)
Usage
gfunc_counting(dist_mat, wc, wr, breaks, width)
Arguments
dist_mat |
A matrix with the distances between points |
wc |
The weight of the points represented by the columns (destinations) |
wr |
The weight of the points represented by the rows (origins) |
breaks |
A numeric vector with the distances to consider |
width |
The width of each donut |
Value
A numeric matrix with the counts of the g function evaluated at the required distances
c++ g function
Description
c++ g function (INTERNAL)
Usage
gfunc_cpp(dist_mat, start, end, step, width, Lt, n, w)
Arguments
dist_mat |
A square matrix with the distances between points |
start |
A float, the start value for evaluating the g-function |
end |
A float, the last value for evaluating the g-function |
step |
A float, the jump between two evaluations of the g-function |
width |
The width of each donut |
Lt |
The total length of the network |
n |
The number of points |
w |
The weight of the points (coincident points) |
Value
A numeric vector with the values of the g function evaluated at the required distances
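The role of start, end, step and width can be pictured with a small base R sketch. It only counts, for each evaluated distance, the ordered pairs of distinct points whose distance falls inside a ring (donut) centred on that distance. The function g_counts_sketch is hypothetical; weights and the final normalisation used by the package are deliberately left out, and the inclusive ring bounds are an assumption.

```r
# Hypothetical sketch, not the package's C++ code: unweighted ring counts.
# dist_mat : square matrix of network distances between points
# breaks   : distances at which the ring counts are evaluated
# width    : ring width, half applied on each side of every break
g_counts_sketch <- function(dist_mat, breaks, width) {
  diag(dist_mat) <- NA  # a point is not its own neighbour
  sapply(breaks, function(b) {
    in_ring <- dist_mat >= (b - width / 2) & dist_mat <= (b + width / 2)
    sum(in_ring, na.rm = TRUE)
  })
}

# four points along a single 30 m line, located at 0, 10, 20 and 30 m
pos <- c(0, 10, 20, 30)
dist_mat <- as.matrix(dist(pos))
breaks <- seq(from = 10, to = 30, by = 10)
counts <- g_counts_sketch(dist_mat, breaks, width = 4)
print(counts)
```

With these toy points, the rings around 10, 20 and 30 m contain 6, 4 and 2 ordered pairs respectively, which shows that the ring counts are not cumulative.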
c++ g function
Description
c++ g function (INTERNAL)
Usage
gfunc_cpp2(dist_mat, start, end, step, width, Lt, n, wc, wr)
Arguments
dist_mat |
A square matrix with the distances between points |
start |
A float, the start value for evaluating the g-function |
end |
A float, the last value for evaluating the g-function |
step |
A float, the jump between two evaluations of the g-function |
width |
The width of each donut |
Lt |
The total length of the network |
n |
The number of points |
wc |
The weight of the points represented by the columns (destinations) |
wr |
The weight of the points represented by the rows (origins) |
Value
A numeric vector with the values of the g function evaluated at the required distances
Geometric mean
Description
Function to calculate the geometric mean.
Usage
gm_mean(x, na.rm = TRUE)
Arguments
x |
A vector of numeric values |
na.rm |
A boolean indicating if we filter the NA values |
Value
The geometric mean of x
Examples
#This is an internal function, no example provided
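For readers unfamiliar with the geometric mean, a minimal sketch is given below. It assumes strictly positive values and the usual log-based formula; gm_mean_sketch is a hypothetical name and the internal function may handle zeros or negative values differently.

```r
# Hypothetical re-implementation for illustration (not the internal gm_mean).
gm_mean_sketch <- function(x, na.rm = TRUE) {
  if (na.rm) {
    x <- x[!is.na(x)]
  }
  # mean of the logs, back-transformed; assumes strictly positive values
  exp(mean(log(x)))
}

print(gm_mean_sketch(c(1, 4, 16, NA)))
```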
Topological error
Description
A utility function to find topological errors in a network.
Usage
graph_checking(lines, digits, max_search = 5, tol = 0.1)
Arguments
lines |
A feature collection of linestrings representing the network |
digits |
An integer indicating the number of digits to retain for coordinates |
max_search |
The maximum number of nearest neighbours to search to find close_nodes |
tol |
The minimum distance expected between two nodes. If two nodes are closer, they are returned in the result of the function. |
Details
This function can be used to check for three common problems in networks: disconnected components, dangle nodes and close nodes. A network with disconnected components is made of several graphs that are not connected to each other. This can be caused by topological errors in the dataset. Dangle nodes are nodes connected to only one other node. This type of node can be normal at the border of a network, but can also be caused by topological errors. Close nodes are nodes that are not coincident, but so close that they probably should be coincident.
Value
A list with three elements. The first is a feature collection of points indicating for each node of the network to which component it belongs. The second is a feature collection of points with nodes that are too close to each other. The third is a feature collection of points with the dangle nodes of the network.
Examples
data(mtl_network)
topo_errors <- graph_checking(mtl_network, 2)
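The three problems described in the Details can be illustrated on a toy network with base R only. This is a sketch under stated assumptions, not the algorithm used by graph_checking: the nodes and edges tables are invented, and the component search is a simple label propagation.

```r
# Hypothetical toy network, not spNetwork code: nodes with coordinates and
# edges given as pairs of node ids.
nodes <- data.frame(id = 1:6,
                    x = c(0, 10, 20, 20.05, 50, 60),
                    y = c(0, 0, 0, 0, 0, 0))
edges <- rbind(c(1, 2), c(2, 3), c(4, 5), c(5, 6))

# dangle nodes: nodes that appear in exactly one edge
degree <- tabulate(as.vector(edges), nbins = nrow(nodes))
dangle_nodes <- nodes$id[degree == 1]

# close nodes: distinct nodes separated by less than tol
tol <- 0.1
d <- as.matrix(dist(nodes[, c("x", "y")]))
close_pairs <- which(d < tol & upper.tri(d), arr.ind = TRUE)

# components: propagate the smallest label along edges until stable
comp <- nodes$id
repeat {
  old <- comp
  for (i in seq_len(nrow(edges))) {
    comp[edges[i, ]] <- min(comp[edges[i, ]])
  }
  if (identical(comp, old)) break
}

print(dangle_nodes)
print(close_pairs)
print(comp)
```

In this toy case, nodes 3 and 4 are 0.05 units apart but not coincident, so the network falls into two components. This is the kind of error the tol argument is meant to reveal.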
Heal edges
Description
Merge lines if they form a longer linestring without external intersections (experimental)
Usage
heal_edges(lines, digits = 3, verbose = TRUE)
Arguments
lines |
A feature collection of linestrings |
digits |
An integer indicating the number of digits to keep in coordinates |
verbose |
A boolean indicating if a progress bar should be displayed |
Value
A feature collection of linestrings with the possibly merged geometries. Note that if lines are merged, only the attributes of the first line are preserved
Examples
#This is an internal function, no example provided
Projection test
Description
Check if a feature collection is in a projected CRS
Usage
is_projected(obj)
Arguments
obj |
A feature collection |
Value
A boolean
Examples
#This is an internal function, no example provided
c++ k space-time function
Description
c++ k space-time function
c++ k and g space-time function
c++ k space-time function
Usage
k_nt_func_cpp(
dist_mat_net,
dist_mat_time,
start_net,
end_net,
step_net,
start_time,
end_time,
step_time,
Lt,
Tt,
n,
w
)
k_g_nt_func_cpp2(
dist_mat_net,
dist_mat_time,
start_net,
end_net,
step_net,
start_time,
end_time,
step_time,
width_net,
width_time,
Lt,
Tt,
n,
wc,
wr,
cross = FALSE
)
k_nt_func_cpp2(
dist_mat_net,
dist_mat_time,
start_net,
end_net,
step_net,
start_time,
end_time,
step_time,
Lt,
Tt,
n,
wc,
wr,
cross = FALSE
)
Arguments
dist_mat_net |
A square matrix with the distances between points (network) |
dist_mat_time |
A square matrix with the distances between points (time) |
start_net |
A float, the start value for evaluating the k-function (network) |
end_net |
A float, the last value for evaluating the k-function (network) |
step_net |
A float, the jump between two evaluations of the k-function (network) |
start_time |
A float, the start value for evaluating the k-function (time) |
end_time |
A float, the last value for evaluating the k-function (time) |
step_time |
A float, the jump between two evaluations of the k-function (time) |
Lt |
The total length of the network |
Tt |
The total duration of the study period |
n |
The number of points |
w |
The weight of the points (coincident points) |
width_net |
A float indicating the width of the donut of the g-function (network) |
width_time |
A float indicating the width of the donut of the g-function (time) |
cross |
a boolean indicating if we are calculating a cross k or g function |
Network k and g functions for spatio-temporal data (experimental, NOT READY FOR USE)
Description
Calculate the k and g functions for a set of points on a network and in time (experimental, NOT READY FOR USE).
Usage
k_nt_functions(
lines,
points,
points_time,
start_net,
end_net,
step_net,
width_net,
start_time,
end_time,
step_time,
width_time,
nsim,
conf_int = 0.05,
digits = 2,
tol = 0.1,
resolution = NULL,
agg = NULL,
verbose = TRUE,
calc_g_func = TRUE
)
Arguments
lines |
A feature collection of linestrings representing the underlying network. The geometries must be simple Linestrings (may crash if some geometries are invalid) without MultiLineString |
points |
A feature collection of points representing the points on the network. These points will be snapped on their nearest line |
points_time |
A numeric vector indicating when each point occurred |
start_net |
A double, the lowest network distance used to evaluate the k and g functions |
end_net |
A double, the highest network distance used to evaluate the k and g functions |
step_net |
A double, the step between two evaluations of the k and g functions for the network distance. start_net, end_net and step_net are used to create a vector of distances with the function seq |
width_net |
The width (network distance) of each donut for the g-function. Half of the width is applied on both sides of the considered distance |
start_time |
A double, the lowest time distance used to evaluate the k and g functions |
end_time |
A double, the highest time distance used to evaluate the k and g functions |
step_time |
A double, the step between two evaluations of the k and g functions for the time distance. start_time, end_time and step_time are used to create a vector of distances with the function seq |
width_time |
The width (time distance) of each donut for the g-function. Half of the width is applied on both sides of the considered distance |
nsim |
An integer indicating the number of Monte Carlo simulations to perform for inference |
conf_int |
A double indicating the width of the confidence interval (default = 0.05) calculated on the Monte Carlo simulations |
digits |
An integer indicating the number of digits to retain from the spatial coordinates |
tol |
When adding the points to the network, specify the minimum distance between these points and the lines' extremities. When points are closer, they are added at the extremity of the lines |
resolution |
When simulating random points on the network, selecting a resolution will reduce greatly the calculation time. When resolution is null the random points can occur everywhere on the graph. If a value is specified, the edges are split according to this value and the random points can only be vertices on the new network |
agg |
A double indicating if the events must be aggregated within a distance. If NULL, the events are aggregated only by rounding the coordinates |
verbose |
A Boolean indicating if progress messages should be displayed |
calc_g_func |
A boolean indicating if the G function must also be calculated |
Details
The k-function is a method to characterize the dispersion of a set of points. For each point, the numbers of other points in subsequent radii are calculated in both space and time. This empirical k-function can be more or less clustered than a k-function obtained if the points were randomly located. In a network, the network distance is used instead of the Euclidean distance. This function uses Monte Carlo simulations to assess if the points are clustered or dispersed. The function also calculates the g-function, a modified version of the k-function using rings instead of disks. The width of the ring must be chosen. The main interest is to avoid the cumulative effect of the classical k-function. This function is maturing: it works as expected (unit tests) but will probably be modified in future releases (gain speed, advanced features, etc.).
Value
A list with the following values:
obs_k: A matrix with the observed k-values
lower_k: A matrix with the lower bounds of the simulated k-values
upper_k: A matrix with the upper bounds of the simulated k-values
obs_g: A matrix with the observed g-values
lower_g: A matrix with the lower bounds of the simulated g-values
upper_g: A matrix with the upper bounds of the simulated g-values
distances_net: A vector with the used network distances
distances_time: A vector with the used time distances
Examples
data(mtl_network)
data(bike_accidents)
# converting the Date field to a numeric field (counting days)
bike_accidents$Time <- as.POSIXct(bike_accidents$Date, format = "%Y/%m/%d")
start <- as.POSIXct("2016/01/01", format = "%Y/%m/%d")
bike_accidents$Time <- difftime(bike_accidents$Time, start, units = "days")
bike_accidents$Time <- as.numeric(bike_accidents$Time)
values <- k_nt_functions(
lines = mtl_network,
points = bike_accidents,
points_time = bike_accidents$Time,
start_net = 0,
end_net = 2000,
step_net = 10,
width_net = 200,
start_time = 0,
end_time = 360,
step_time = 7,
width_time = 14,
nsim = 50,
conf_int = 0.05,
digits = 2,
tol = 0.1,
resolution = NULL,
agg = 15,
verbose = TRUE)
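To make the space-time counting described in the Details more concrete, the sketch below counts, for every pair of network and time distances, the ordered pairs of distinct events that are within both distances. It is a hypothetical, unweighted illustration in base R (st_counts_sketch is not a package function), and the assumption that rows correspond to network distances and columns to time distances is made for the example only.

```r
# Hypothetical sketch (not the package's C++ implementation): unweighted
# space-time counts for every pair of network and time distances.
st_counts_sketch <- function(dist_mat_net, dist_mat_time,
                             breaks_net, breaks_time) {
  off_diag <- row(dist_mat_net) != col(dist_mat_net)  # ignore self-pairs
  outer(
    seq_along(breaks_net), seq_along(breaks_time),
    Vectorize(function(i, j) {
      sum(off_diag &
            dist_mat_net <= breaks_net[i] &
            dist_mat_time <= breaks_time[j])
    })
  )
}

# three events: positions along one line (metres) and occurrence days
pos <- c(0, 100, 500)
days <- c(1, 3, 40)
dist_mat_net <- as.matrix(dist(pos))
dist_mat_time <- as.matrix(dist(days))

# vectors of distances built with seq, as for start, end and step
breaks_net <- seq(from = 200, to = 600, by = 200)
breaks_time <- seq(from = 7, to = 42, by = 35)
counts <- st_counts_sketch(dist_mat_net, dist_mat_time,
                           breaks_net, breaks_time)
print(counts)
```

Each cell of the resulting matrix combines one network distance with one time distance, which is why the observed and simulated values are returned as matrices rather than vectors.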
Network k and g functions for spatio-temporal data (multicore, experimental, NOT READY FOR USE)
Description
Calculate the k and g functions for a set of points on a network and in time (multicore, experimental, NOT READY FOR USE).
Usage
k_nt_functions.mc(
lines,
points,
points_time,
start_net,
end_net,
step_net,
width_net,
start_time,
end_time,
step_time,
width_time,
nsim,
conf_int = 0.05,
digits = 2,
tol = 0.1,
resolution = NULL,
agg = NULL,
verbose = TRUE,
calc_g_func = TRUE,
grid_shape = c(1, 1)
)
Arguments
lines |
A feature collection of linestrings representing the underlying network. The geometries must be simple Linestrings (may crash if some geometries are invalid) without MultiLineString |
points |
A feature collection of points representing the points on the network. These points will be snapped on their nearest line |
points_time |
A numeric vector indicating when each point occurred |
start_net |
A double, the lowest network distance used to evaluate the k and g functions |
end_net |
A double, the highest network distance used to evaluate the k and g functions |
step_net |
A double, the step between two evaluations of the k and g functions for the network distance. start_net, end_net and step_net are used to create a vector of distances with the function seq |
width_net |
The width (network distance) of each donut for the g-function. Half of the width is applied on both sides of the considered distance |
start_time |
A double, the lowest time distance used to evaluate the k and g functions |
end_time |
A double, the highest time distance used to evaluate the k and g functions |
step_time |
A double, the step between two evaluations of the k and g functions for the time distance. start_time, end_time and step_time are used to create a vector of distances with the function seq |
width_time |
The width (time distance) of each donut for the g-function. Half of the width is applied on both sides of the considered distance |
nsim |
An integer indicating the number of Monte Carlo simulations to perform for inference |
conf_int |
A double indicating the width of the confidence interval (default = 0.05) calculated on the Monte Carlo simulations |
digits |
An integer indicating the number of digits to retain from the spatial coordinates |
tol |
When adding the points to the network, specify the minimum distance between these points and the lines' extremities. When points are closer, they are added at the extremity of the lines |
resolution |
When simulating random points on the network, selecting a resolution will reduce greatly the calculation time. When resolution is null the random points can occur everywhere on the graph. If a value is specified, the edges are split according to this value and the random points can only be vertices on the new network |
agg |
A double indicating if the events must be aggregated within a distance. If NULL, the events are aggregated only by rounding the coordinates |
verbose |
A Boolean indicating if progress messages should be displayed |
calc_g_func |
A boolean indicating if the G function must also be calculated |
grid_shape |
A vector of two values indicating how the study area must be split when performing the calculation. Default is c(1,1) (no split). A finer grid could reduce memory usage and increase speed when a large dataset is used. When using multiprocessing, the work in each grid cell is dispatched between the workers. |
Details
The k-function is a method to characterize the dispersion of a set of points. For each point, the numbers of other points in subsequent radii are calculated. This empirical k-function can be more or less clustered than a k-function obtained if the points were randomly located in space. In a network, the network distance is used instead of the Euclidean distance. This function uses Monte Carlo simulations to assess if the points are clustered or dispersed, and gives the results as a line plot. If the line of the observed k-function is higher than the shaded area representing the values of the simulations, then the points are more clustered than what we can expect from randomness and vice-versa. The function also calculates the g-function, a modified version of the k-function using rings instead of disks. The width of the ring must be chosen. The main interest is to avoid the cumulative effect of the classical k-function. This function is maturing: it works as expected (unit tests) but will probably be modified in future releases (gain speed, advanced features, etc.).
Value
A list with the following values:
obs_k: A matrix with the observed k-values
lower_k: A matrix with the lower bounds of the simulated k-values
upper_k: A matrix with the upper bounds of the simulated k-values
obs_g: A matrix with the observed g-values
lower_g: A matrix with the lower bounds of the simulated g-values
upper_g: A matrix with the upper bounds of the simulated g-values
distances_net: A vector with the used network distances
distances_time: A vector with the used time distances
c++ k function counting worker
Description
c++ k function counting (INTERNAL)
Usage
kfunc_counting(dist_mat, wc, wr, breaks, cross = FALSE)
Arguments
dist_mat |
A matrix with the distances between points |
wc |
The weight of the points represented by the columns (destinations) |
wr |
The weight of the points represented by the rows (origins) |
breaks |
A numeric vector with the distance to consider |
cross |
A boolean indicating if we are calculating a cross k function or not (default is FALSE) |
Value
A numeric matrix with the counts of the k function evaluated at the required distances
c++ k function
Description
c++ k function (INTERNAL)
Usage
kfunc_cpp(dist_mat, start, end, step, Lt, n, w)
Arguments
dist_mat |
A square matrix with the distances between points |
start |
A float, the start value for evaluating the k-function |
end |
A float, the last value for evaluating the k-function |
step |
A float, the jump between two evaluations of the k-function |
Lt |
The total length of the network |
n |
The number of points |
w |
The weight of the points (coincident points) |
Value
A numeric vector with the values of the k function evaluated at the required distances
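The way dist_mat, start, end, step, Lt and n fit together can be sketched in base R. The estimator below, K(r) = Lt / (n (n - 1)) multiplied by the number of ordered pairs of distinct points within r, is a common form of the network k-function and is an assumption here. The function k_sketch is hypothetical, ignores the weights w, and may not match the normalisation of the C++ code.

```r
# Hypothetical sketch, not the package's C++ code.
# Assumed estimator: K(r) = Lt / (n * (n - 1)) * number of ordered pairs
# of distinct points separated by at most r on the network.
k_sketch <- function(dist_mat, start, end, step, Lt) {
  n <- nrow(dist_mat)
  breaks <- seq(from = start, to = end, by = step)
  off_diag <- row(dist_mat) != col(dist_mat)
  counts <- sapply(breaks, function(r) sum(off_diag & dist_mat <= r))
  data.frame(distance = breaks, k = Lt / (n * (n - 1)) * counts)
}

# four points located at 0, 10, 20 and 30 m on a single 30 m line
dist_mat <- as.matrix(dist(c(0, 10, 20, 30)))
k_values <- k_sketch(dist_mat, start = 0, end = 30, step = 10, Lt = 30)
print(k_values)
```

Unlike ring-based counts, these values are cumulative: every pair counted at a given distance is counted again at all larger distances.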
c++ k function 2
Description
c++ k function (INTERNAL)
Usage
kfunc_cpp2(dist_mat, start, end, step, Lt, n, wc, wr, cross = FALSE)
Arguments
dist_mat |
A square matrix with the distances between points |
start |
A float, the start value for evaluating the k-function |
end |
A float, the last value for evaluating the k-function |
step |
A float, the jump between two evaluations of the k-function |
Lt |
The total length of the network |
n |
The number of points |
wc |
The weight of the points represented by the columns (destinations) |
wr |
The weight of the points represented by the rows (origins) |
cross |
A boolean indicating if we are calculating a cross k function or not (default is FALSE) |
Value
A numeric vector with the values of the k function evaluated at the required distances
Network k and g functions (maturing)
Description
Calculate the k and g functions for a set of points on a network (maturing).
Usage
kfunctions(
lines,
points,
start,
end,
step,
width,
nsim,
conf_int = 0.05,
digits = 2,
tol = 0.1,
agg = NULL,
verbose = TRUE,
return_sims = FALSE,
calc_g_func = TRUE,
resolution = NULL
)
Arguments
lines |
A feature collection of linestrings representing the underlying network. The geometries must be simple Linestrings (may crash if some geometries are invalid) without MultiLineString |
points |
A feature collection of points representing the points on the network. These points will be snapped on their nearest line |
start |
A double, the lowest distance used to evaluate the k and g functions |
end |
A double, the highest distance used to evaluate the k and g functions |
step |
A double, the step between two evaluations of the k and g functions. start, end and step are used to create a vector of distances with the function seq |
width |
The width of each donut for the g-function. Half of the width is applied on both sides of the considered distance |
nsim |
An integer indicating the number of Monte Carlo simulations to perform for inference |
conf_int |
A double indicating the width of the confidence interval (default = 0.05) calculated on the Monte Carlo simulations |
digits |
An integer indicating the number of digits to retain from the spatial coordinates |
tol |
When adding the points to the network, specify the minimum distance between these points and the lines' extremities. When points are closer, they are added at the extremity of the lines |
agg |
A double indicating if the events must be aggregated within a distance. If NULL, the events are aggregated only by rounding the coordinates |
verbose |
A Boolean indicating if progress messages should be displayed |
return_sims |
a boolean indicating if the simulated k and g values must also be returned. |
calc_g_func |
A Boolean indicating if the G function must also be calculated (TRUE by default). If FALSE, then only the K function is calculated |
resolution |
When simulating random points on the network, selecting a resolution will reduce greatly the calculation time. When resolution is null the random points can occur everywhere on the graph. If a value is specified, the edges are split according to this value and the random points can only be vertices on the new network |
Details
The k-function is a method to characterize the dispersion of a set of points. For each point, the numbers of other points in subsequent radii are calculated. This empirical k-function can be more or less clustered than a k-function obtained if the points were randomly located in space. In a network, the network distance is used instead of the Euclidean distance. This function uses Monte Carlo simulations to assess if the points are clustered or dispersed, and gives the results as a line plot. If the line of the observed k-function is higher than the shaded area representing the values of the simulations, then the points are more clustered than what we can expect from randomness and vice-versa. The function also calculates the g-function, a modified version of the k-function using rings instead of disks. The width of the ring must be chosen. The main interest is to avoid the cumulative effect of the classical k-function. This function is maturing: it works as expected (unit tests) but will probably be modified in future releases (gain speed, advanced features, etc.).
Value
A list with the following values:
plotk: A ggplot2 object representing the values of the k-function
plotg: A ggplot2 object representing the values of the g-function
values: A data.frame with the values used to build the plots
Examples
data(main_network_mtl)
data(mtl_libraries)
result <- kfunctions(main_network_mtl, mtl_libraries,
start = 0, end = 2500, step = 100,
width = 200, nsim = 50,
conf_int = 0.05, tol = 0.1, agg = NULL,
calc_g_func = TRUE,
verbose = FALSE)
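The Monte Carlo envelope mentioned in the Details can be pictured as follows. This is a sketch under assumptions: the simulated values are random stand-ins rather than k-values computed on a network, and the bounds are assumed to be the conf_int / 2 and 1 - conf_int / 2 quantiles of the simulations at each distance. The package may compute its envelope differently.

```r
# Hypothetical sketch: how a simulation envelope can be derived.
# Assumption: the bounds are the conf_int / 2 and 1 - conf_int / 2 quantiles
# of the simulated values at each distance.
set.seed(42)
nsim <- 50
distances <- seq(from = 0, to = 2500, by = 100)

# stand-in for simulated k-values: one row per distance, one column per run
sim_k <- sapply(seq_len(nsim), function(s) {
  cumsum(runif(length(distances)))
})

conf_int <- 0.05
lower_k <- apply(sim_k, 1, quantile, probs = conf_int / 2)
upper_k <- apply(sim_k, 1, quantile, probs = 1 - conf_int / 2)

# an observed curve above upper_k would suggest clustering at that distance,
# one below lower_k would suggest dispersion
obs_k <- cumsum(rep(0.9, length(distances)))
clustered <- obs_k > upper_k
dispersed <- obs_k < lower_k
```

A larger nsim gives a smoother envelope at the cost of computation time, since each simulation requires its own set of network distances.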
Network k and g functions (multicore)
Description
Calculate the k and g functions for a set of points on a network with multicore support. For details, please see the function kfunctions. (maturing)
Usage
kfunctions.mc(
lines,
points,
start,
end,
step,
width,
nsim,
conf_int = 0.05,
digits = 2,
tol = 0.1,
agg = NULL,
verbose = TRUE,
return_sims = FALSE,
calc_g_func = TRUE,
resolution = NULL,
grid_shape = c(1, 1)
)
Arguments
lines |
A feature collection of linestrings representing the underlying network. The geometries must be simple Linestrings (may crash if some geometries are invalid) without MultiLineString |
points |
A feature collection of points representing the points on the network. These points will be snapped on their nearest line |
start |
A double, the lowest distance used to evaluate the k and g functions |
end |
A double, the highest distance used to evaluate the k and g functions |
step |
A double, the step between two evaluations of the k and g function. start, end and step are used to create a vector of distances with the function seq |
width |
The width of each donut for the g-function. Half of the width is applied on both sides of the considered distance |
nsim |
An integer indicating the number of Monte Carlo simulations to perform for inference |
conf_int |
A double indicating the width of the confidence interval (default = 0.05) calculated on the Monte Carlo simulations |
digits |
An integer indicating the number of digits to retain from the spatial coordinates |
tol |
When adding the points to the network, specify the minimum distance between these points and the lines' extremities. When points are closer, they are added at the extremity of the lines |
agg |
A double indicating if the events must be aggregated within a distance. If NULL, the events are aggregated only by rounding the coordinates |
verbose |
A Boolean indicating if progress messages should be displayed |
return_sims |
A Boolean indicating if the simulated k and g values must also be returned |
calc_g_func |
A Boolean indicating if the G function must also be calculated (TRUE by default). If FALSE, then only the K function is calculated |
resolution |
When simulating random points on the network, selecting a resolution will greatly reduce the calculation time. When resolution is NULL, the random points can occur anywhere on the graph. If a value is specified, the edges are split according to this value and the random points can only be vertices of the new network |
grid_shape |
A vector of two values indicating how the study area must be split when performing the calculation. Default is c(1,1) (no split). A finer grid could reduce memory usage and increase speed when a large dataset is used. When using multiprocessing, the work in each grid cell is dispatched between the workers. |
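As a quick illustration of how start, end, step and width interact, the following base R sketch builds the vector of evaluation distances with seq and the bounds of the ring around each distance. Clamping the lower bound at 0 is an assumption made here for readability and is not taken from the package.

```r
start <- 0
end <- 2500
step <- 100
width <- 200

# Distances at which the k and g functions are evaluated
distances <- seq(start, end, step)

# Half of the width is applied on both sides of each distance
rings <- data.frame(
  distance = distances,
  lower = pmax(distances - width / 2, 0),  # assumption: no negative distances
  upper = distances + width / 2
)
head(rings, 3)
```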
Details
For details, please look at the function kfunctions.
Value
A list with the following values:
plotk: A ggplot2 object representing the values of the k-function
plotg: A ggplot2 object representing the values of the g-function
values: A data.frame with the values used to build the plots
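The shaded area of the plots summarizes the simulated values. A minimal sketch of how such an envelope could be derived is given below, assuming that the bounds are the conf_int / 2 and 1 - conf_int / 2 quantiles of the simulations at each distance. The simulated and observed values are random placeholders, not outputs of the package.

```r
set.seed(42)
nsim <- 50
conf_int <- 0.05
distances <- seq(0, 500, 100)

# Placeholder simulations: one row per distance, one column per simulation
sims <- sapply(seq_len(nsim), function(i) cumsum(runif(length(distances))))

# Envelope: quantiles of the simulated values at each distance
lower <- apply(sims, 1, quantile, probs = conf_int / 2)
upper <- apply(sims, 1, quantile, probs = 1 - conf_int / 2)

# Placeholder observed k-function, compared with the envelope
obs <- cumsum(rep(0.9, length(distances)))
status <- ifelse(obs > upper, "clustered",
                 ifelse(obs < lower, "dispersed", "within envelope"))
data.frame(distances, lower, obs, upper, status)
```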
Examples
data(main_network_mtl)
data(mtl_libraries)
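future::plan(future::multisession(workers=1))
result <- kfunctions.mc(main_network_mtl, mtl_libraries,
start = 0, end = 2500, step = 10,
width = 200, nsim = 50,
conf_int = 0.05, tol = 0.1, agg = NULL,
verbose = FALSE)
## make sure any open connections are closed afterward
if (!inherits(future::plan(), "sequential")) future::plan(future::sequential)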
c++ k and g function counting worker
Description
c++ k function counting (INTERNAL)
Usage
kgfunc_counting(dist_mat, wc, wr, breaks, width, cross = FALSE)
Arguments
dist_mat |
A matrix with the distances between points |
wc |
The weight of the points represented by the columns (destinations) |
wr |
The weight of the points represented by the rows (origins) |
breaks |
A numeric vector with the distance to consider |
width |
The width of each donut |
cross |
A boolean indicating if we are calculating a cross k function or not (default is FALSE) |
Value
A list of two numeric matrices with the values of the k and g function evaluated at the required distances
c++ k and g function
Description
c++ g function (INTERNAL)
Usage
kgfunc_cpp2(dist_mat, start, end, step, width, Lt, n, wc, wr, cross = FALSE)
Arguments
dist_mat |
A square matrix with the distances between points |
start |
A float, the start value for evaluating the k and g functions |
end |
A float, the last value for evaluating the k and g functions |
step |
A float, the jump between two evaluations of the k and g functions |
width |
The width of each donut |
Lt |
The total length of the network |
n |
The number of points |
wc |
The weight of the points represented by the columns (destinations) |
wr |
The weight of the points represented by the rows (origins) |
cross |
A boolean indicating if we are calculating a cross k function or not (default is FALSE) |
Value
A numeric matrix with the values of the k (first col) and g (second col) function evaluated at the required distances
c++ k and g function counting worker
Description
c++ k function counting (INTERNAL)
Usage
kgfunc_time_counting(
dist_mat_net,
dist_mat_time,
wc,
wr,
breaks_net,
breaks_time,
width_net,
width_time,
cross = FALSE
)
kfunc_time_counting(
dist_mat_net,
dist_mat_time,
wc,
wr,
breaks_net,
breaks_time,
cross = FALSE
)
Arguments
dist_mat_net |
A matrix with the distances between points on the network |
dist_mat_time |
A matrix with the distances between points in time |
wc |
The weight of the points represented by the columns (destinations) |
wr |
The weight of the points represented by the rows (origins) |
breaks_net |
A numeric vector with the distance to consider on network |
breaks_time |
A numeric vector with the distance to consider in time |
width_net |
The width of each donut for the network dimension |
width_time |
The width of each donut for the time dimension |
cross |
A boolean indicating if we are calculating a cross k function or not (default is FALSE) |
Value
A list of two numeric cubes with the values of the k and g function evaluated at the required distances
Centre points of lines
Description
Generate a feature collection of points at the centre of the lines of a feature collection of linestrings. The length of the lines is used to determine their centres.
Usage
lines_center(lines)
Arguments
lines |
A feature collection of linestrings to use |
Value
A feature collection of points
Examples
data(mtl_network)
centers <- lines_center(mtl_network)
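The idea of a centre determined by length can be sketched in base R for a single line given as a matrix of coordinates. The helper `line_midpoint` is hypothetical and only illustrates the principle; it is not the package's implementation.

```r
# Midpoint of a polyline, measured along its length (base R only)
line_midpoint <- function(coords) {
  seg_len <- sqrt(rowSums(diff(coords)^2))   # length of each segment
  cum_len <- c(0, cumsum(seg_len))           # distance travelled at each vertex
  half <- sum(seg_len) / 2
  # index of the segment that contains the midpoint
  i <- findInterval(half, cum_len, rightmost.closed = TRUE)
  t <- (half - cum_len[i]) / seg_len[i]      # position within that segment (0 to 1)
  coords[i, ] + t * (coords[i + 1, ] - coords[i, ])
}

coords <- rbind(c(0, 0), c(4, 0), c(4, 2))   # an L-shaped line of length 6
mid <- line_midpoint(coords)
mid
```

The centre is located halfway along the line, not at the middle vertex and not at the centroid of the coordinates.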
Lines coordinates as list
Description
A function to get the coordinates of some lines as a list of matrices
Usage
lines_coordinates_as_list(lines)
Arguments
lines |
A sf object with linestring type geometries |
Value
A list of matrices
Examples
#This is an internal function, no example provided
Unify lines direction
Description
A function to deal with the directions of lines. It ensures that only From-To situations are present by reversing To-From lines: for the lines labelled as To-From, the order of their vertices is reversed.
Usage
lines_direction(lines, field)
Arguments
lines |
A sf object with linestring type geometries |
field |
The name of a field giving information about the authorized travelling direction on lines. If NULL, then all lines can be used in both directions. Must be the name of a column otherwise. The values of the column must be "FT" (From - To), "TF" (To - From) or "Both". |
Value
A sf object with linestring type geometries
Examples
data(mtl_network)
mtl_network$length <- as.numeric(sf::st_length(mtl_network))
mtl_network$direction <- "Both"
mtl_network[6, "direction"] <- "TF"
mtl_network_directed <- lines_direction(mtl_network, "direction")
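The reversal applied to To-From lines can be pictured with plain coordinate matrices. The sketch below is a base R illustration only: the objects `coords_list` and `direction` are made up, and no sf object is involved.

```r
# Two lines stored as matrices of vertices (one row per vertex)
coords_list <- list(
  rbind(c(0, 0), c(1, 0)),
  rbind(c(1, 0), c(2, 1), c(3, 1))
)
direction <- c("FT", "TF")

# Reverse the order of the vertices of the lines labelled "TF"
unified <- Map(function(m, d) {
  if (d == "TF") m[nrow(m):1, , drop = FALSE] else m
}, coords_list, direction)

unified[[2]]   # the second line now starts at (3, 1)
```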
Get lines extremities
Description
Generate a feature collection of points with the first and last vertex of each line in a feature collection of linestrings.
Usage
lines_extremities(lines)
Arguments
lines |
A feature collection of linestrings (simple Linestrings) |
Value
A feature collection of points
Examples
wkt_lines <- c(
"LINESTRING (0 0, 1 0)",
"LINESTRING (1 0, 2 0)",
"LINESTRING (2 0, 3 0)",
"LINESTRING (0 1, 1 1)")
linesdf <- data.frame(wkt = wkt_lines,
id = paste("l",1:length(wkt_lines),sep=""))
all_lines <- sf::st_as_sf(linesdf, wkt = "wkt")
all_lines <- cbind(linesdf$wkt,all_lines)
points <- lines_extremities(all_lines)
Points along lines
Description
Generate a feature collection of points along the lines of a feature collection of linestrings.
Usage
lines_points_along(lines, dist)
Arguments
lines |
A feature collection of linestrings to use |
dist |
The distance between the points along the lines |
Value
A feature collection of points
Examples
data(mtl_network)
new_pts <- lines_points_along(mtl_network,50)
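For intuition, regularly spaced points along one line can be obtained in base R by interpolating the coordinates against the cumulative length. The helper `points_along` is hypothetical; whether the package also returns the first and last vertices is not stated here, so including them is an assumption of this sketch.

```r
# Points every `dist` units along a polyline given as a coordinate matrix
points_along <- function(coords, dist) {
  cum_len <- c(0, cumsum(sqrt(rowSums(diff(coords)^2))))
  at <- seq(0, max(cum_len), by = dist)      # positions along the line
  cbind(
    x = approx(cum_len, coords[, 1], xout = at)$y,
    y = approx(cum_len, coords[, 2], xout = at)$y
  )
}

coords <- rbind(c(0, 0), c(4, 0), c(4, 2))   # an L-shaped line of length 6
pts <- points_along(coords, dist = 2)
pts
```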
List of coordinates as lines
Description
A function to convert a list of matrices to an sf object with linestring geometry type
Usage
list_coordinates_as_lines(coord_list, crs)
Arguments
coord_list |
A list of matrices |
crs |
The CRS to use to create the lines |
Value
A sf object with linestring type geometries
Examples
#This is an internal function, no example provided
Cut lines into lixels
Description
Cut the lines of a feature collection of linestrings into lixels with a specified minimal distance. May fail if the line geometries are self-intersecting.
Usage
lixelize_lines(lines, lx_length, mindist = NULL)
Arguments
lines |
The sf object with linestring geometry type to modify |
lx_length |
The length of a lixel |
mindist |
The minimum length of a lixel. After cut, if the length of the final lixel is shorter than the minimum distance, then it is added to the previous lixel. If NULL, then mindist = maxdist/10. Note that the segments that are already shorter than the minimum distance are not modified. |
Value
An sf object with linestring geometry type
Examples
data(mtl_network)
lixels <- lixelize_lines(mtl_network,150,50)
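The role of lx_length and mindist can be illustrated on lengths alone. The hypothetical helper below returns the lengths of the lixels obtained for one line, following the rule described for mindist (a final piece shorter than mindist is merged with the previous lixel). It is a sketch of the rule, not the package's code.

```r
# Lengths of the lixels produced when cutting a line of a given length
lixel_lengths <- function(total_length, lx_length, mindist) {
  # Lines not longer than one lixel are left untouched
  if (total_length <= lx_length) return(total_length)
  n_full <- floor(total_length / lx_length)
  remainder <- total_length - n_full * lx_length
  lengths <- rep(lx_length, n_full)
  if (remainder > 0) {
    if (remainder < mindist) {
      # Final piece too short: add it to the previous lixel
      lengths[n_full] <- lengths[n_full] + remainder
    } else {
      lengths <- c(lengths, remainder)
    }
  }
  lengths
}

short_tail <- lixel_lengths(480, lx_length = 150, mindist = 50)
long_tail <- lixel_lengths(520, lx_length = 150, mindist = 50)
short_tail   # the 30-unit tail is merged: 150 150 180
long_tail    # the 70-unit tail is kept:   150 150 150 70
```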
Cut lines into lixels (multicore)
Description
Cut the lines of a feature collection of linestrings into lixels with a specified minimal distance, with multicore support. May fail if the line geometries are self-intersecting.
Usage
lixelize_lines.mc(
lines,
lx_length,
mindist = NULL,
verbose = TRUE,
chunk_size = 100
)
Arguments
lines |
A feature collection of linestrings to convert to lixels |
lx_length |
The length of a lixel |
mindist |
The minimum length of a lixel. After cut, if the length of the final lixel is shorter than the minimum distance, then it is added to the previous lixel. If NULL, then mindist = maxdist/10 |
verbose |
A Boolean indicating if a progress bar must be displayed |
chunk_size |
The size of a chunk used for multiprocessing. Default is 100. |
Value
A feature collection of linestrings
Examples
data(mtl_network)
future::plan(future::multisession(workers=1))
lixels <- lixelize_lines.mc(mtl_network,150,50)
## make sure any open connections are closed afterward
if (!inherits(future::plan(), "sequential")){
future::plan(future::sequential)
}
Primary road network of Montreal
Description
A feature collection (sf object) representing the primary road network of Montreal. The EPSG is 3797, and the data comes from the Montreal OpenData website.
Usage
main_network_mtl
Format
A sf object with 2945 rows and 2 variables
- TYPE
the type of road
- geom
the geometry (linestrings)
Source
https://donnees.montreal.ca/dataset/geobase
Libraries of Montreal
Description
A feature collection (sf object) representing the libraries of Montreal. The EPSG is 3797 and the data comes from the Montreal OpenData website.
Usage
mtl_libraries
Format
A sf object with 55 rows and 3 variables.
- CP
the postal code
- NAME
the name of the library
- geom
the geometry (points)
Source
https://donnees.montreal.ca/dataset/lieux-culturels
Road network of Montreal
Description
A feature collection (sf object) representing the road network of Montreal. The EPSG is 3797, and the data comes from the Montreal OpenData website. It is only a small subset in central districts used to demonstrate the main functions of spNetwork.
Usage
mtl_network
Format
A sf object with 2945 rows and 2 variables
- ClsRte
the category of the road
- geom
the geometry (linestrings)
Source
https://donnees.montreal.ca/dataset/geobase
Theatres of Montreal
Description
A feature collection (sf object) representing the theatres of Montreal. The EPSG is 3797 and the data comes from the Montreal OpenData website.
Usage
mtl_theatres
Format
A sf object with 54 rows and 3 variables.
- CP
the postal code
- NAME
the name of the theatre
- geom
the geometry (points)
Source
https://donnees.montreal.ca/dataset/lieux-culturels
Nearest point on Line
Description
Find the nearest projected point on a LineString (from maptools)
Usage
nearestPointOnLine(coordsLine, coordsPoint)
Arguments
coordsLine |
The coordinates of the line (matrix) |
coordsPoint |
The coordinates of the point |
Value
A numeric vector with the coordinates of the projected point
Examples
#This is an internal function, no example provided
Nearest point on segment
Description
Find the nearest projected point on a segment (from maptools)
Usage
nearestPointOnSegment(s, p)
Arguments
s |
The coordinates of the segment |
p |
The coordinates of the point |
Value
A numeric vector with the coordinates of the projected point
Examples
#This is an internal function, no example provided
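Although the internal function has no example, the underlying geometry is simple to sketch. The hypothetical helper below projects a point onto a segment and clamps the result to the segment's extremities. It assumes that the segment is a 2 x 2 matrix (one row per extremity) and the point a numeric vector of length 2, which may differ from the internal representation.

```r
# Nearest point on a segment: orthogonal projection, clamped to the segment
nearest_on_segment <- function(s, p) {
  a <- s[1, ]
  b <- s[2, ]
  ab <- b - a
  t <- sum((p - a) * ab) / sum(ab^2)   # relative position of the projection
  t <- max(0, min(1, t))               # stay between the two extremities
  a + t * ab
}

s <- rbind(c(0, 0), c(10, 0))
inside <- nearest_on_segment(s, c(3, 4))     # projects inside the segment
outside <- nearest_on_segment(s, c(-5, 2))   # falls back on the first extremity
```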
Nearest line for points
Description
Find for each point its nearest LineString
Usage
nearest_lines(points, lines, snap_dist = 300, max_iter = 10)
Arguments
points |
A feature collection of points |
lines |
A feature collection of linestrings |
snap_dist |
A distance (float) used to find, for each point, its nearest line in a spatial index. Too large a value will produce unnecessary distance calculations, and too small a value will lead to more iterations to find neighbours. In extreme cases, too small a value could lead to points not associated with any line (index = -1). |
max_iter |
An integer indicating how many iterations the search algorithm must perform in the spatial index to find lines close to a point. At each iteration, the snap_dist is doubled to find candidates. |
Examples
# this is an internal function, no example provided
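The interplay between snap_dist and max_iter can nevertheless be illustrated with a toy search. In the sketch below, the distances from one point to each line are assumed to be known in advance, which a real spatial index avoids; only the doubling of the search radius and the -1 fallback are meant to mirror the description of the arguments.

```r
# Toy version of the search: widen the radius until a candidate is found
find_nearest <- function(dists, snap_dist = 300, max_iter = 10) {
  radius <- snap_dist
  for (i in seq_len(max_iter)) {
    candidates <- which(dists <= radius)
    if (length(candidates) > 0) {
      return(candidates[which.min(dists[candidates])])
    }
    radius <- radius * 2   # no candidate: double the search distance
  }
  -1                       # no line found within the allowed iterations
}

dists <- c(900, 2000, 1500)   # distances from one point to three lines (assumed)
found <- find_nearest(dists, snap_dist = 300, max_iter = 10)
missed <- find_nearest(dists, snap_dist = 1, max_iter = 3)
```

With snap_dist = 300, the first line is found at the third iteration (radius 1200). With snap_dist = 1 and only three iterations, the radius never exceeds 4 and the point is left without a line.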
K-nearest points on network
Description
Calculate the K-nearest points for a set of points on a network.
Usage
network_knn(
origins,
lines,
k,
destinations = NULL,
maxdistance = 0,
snap_dist = Inf,
line_weight = "length",
direction = NULL,
grid_shape = c(1, 1),
verbose = FALSE,
digits = 3,
tol = 0.1
)
Arguments
origins |
A feature collection of points, for each point, its k nearest neighbours will be found on the network. |
lines |
A feature collection of linestrings representing the underlying network |
k |
An integer indicating the number of neighbours to find. |
destinations |
A feature collection of points. It can be used if the neighbours must be found in a separate set of points. NULL if the neighbours must be found in origins. |
maxdistance |
The maximum distance between two observations to consider them as neighbours. It is useful only if a grid is used, a lower value will reduce calculating time, but one must be sure that the k nearest neighbours are within this radius. Otherwise NAs will be present in the results. |
snap_dist |
The maximum distance to snap the start and end points on the network. |
line_weight |
The weighting to use for lines. Default is "length" (the geographical length), but can be the name of a column. The value is considered proportional to the geographical length of the lines. |
direction |
The name of a column indicating authorized travelling direction on lines. if NULL, then all lines can be used in both directions. Must be the name of a column otherwise. The values of the column must be "FT" (From - To), "TF" (To - From) or "Both". |
grid_shape |
A vector of length 2 indicating the shape of the grid to use for splitting the dataset. Default is c(1,1), so all the calculation is done in one go. It might be necessary to split it if the dataset is large. |
verbose |
A Boolean indicating if the function should print its progress |
digits |
The number of digits to retain from the spatial coordinates (simplification used to reduce risk of topological error) |
tol |
A float indicating the minimum distance between the points and the lines' extremities when adding the point to the network. When points are closer, they are added at the extremity of the lines. |
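To picture what grid_shape does, the sketch below assigns random points to the cells of a 2 x 2 grid covering their bounding box, using base R only. How the package builds its grid and handles neighbours located in adjacent cells is not described here; the need for some buffer around each cell, related to maxdistance, is an assumption of this illustration.

```r
set.seed(1)
xy <- cbind(x = runif(20, 0, 1000), y = runif(20, 0, 500))
grid_shape <- c(2, 2)

# Regular breaks over the bounding box of the points
x_breaks <- seq(min(xy[, 1]), max(xy[, 1]), length.out = grid_shape[1] + 1)
y_breaks <- seq(min(xy[, 2]), max(xy[, 2]), length.out = grid_shape[2] + 1)

# Column and row of the cell containing each point
cell_x <- findInterval(xy[, 1], x_breaks, rightmost.closed = TRUE)
cell_y <- findInterval(xy[, 2], y_breaks, rightmost.closed = TRUE)

# Indices of the points falling in each non-empty cell
cells <- split(seq_len(nrow(xy)), list(cell_x, cell_y), drop = TRUE)
lengths(cells)
```

Each cell can then be processed separately, which is why a small maxdistance keeps the work per cell limited.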
Details
The k nearest neighbours of each point are found by using the network distance. The results might not be exact if some points share the exact same location. As an example, consider the following case: A and B are two points at the exact same location, and C is a third point close to A and B. If the 1 nearest neighbour is requested for C, the function could return either A or B, but not both. When such a situation occurs, a warning is raised by the function.
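The case of A, B and C can be reproduced with a small distance matrix. The base R sketch below builds the two matrices (indices and distances) for k = 1 and flags the points for which several candidates are tied; the distances are made up and the tie check is only an illustration of why a warning is useful.

```r
pts <- c("A", "B", "C")
# A and B share the same location; C is 50 units away from both
d <- matrix(c( 0,  0, 50,
               0,  0, 50,
              50, 50,  0), nrow = 3, byrow = TRUE, dimnames = list(pts, pts))
k <- 1

idx <- matrix(NA_integer_, nrow(d), k)   # indices of the neighbours
dst <- matrix(NA_real_, nrow(d), k)      # distances to the neighbours
for (i in seq_len(nrow(d))) {
  others <- setdiff(seq_len(ncol(d)), i)
  nearest <- others[order(d[i, others])][seq_len(k)]
  idx[i, ] <- nearest
  dst[i, ] <- d[i, nearest]
}

# A tie occurs when more than k points lie at the k-th distance
has_tie <- sapply(seq_len(nrow(d)), function(i) sum(d[i, -i] == dst[i, k]) > k)
pts[idx[3, 1]]   # neighbour returned for C: only one of A and B
has_tie[3]
```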
Value
A list with two matrices, one with the index of the neighbours and one with the distances.
Examples
data(main_network_mtl)
data(mtl_libraries)
results <- network_knn(mtl_libraries, main_network_mtl,
k = 3, maxdistance = 1000, line_weight = "length",
grid_shape=c(1,1), verbose = FALSE)
K-nearest points on network (multicore version)
Description
Calculate the K-nearest points for a set of points on a network with multicore support.
Usage
network_knn.mc(
origins,
lines,
k,
destinations = NULL,
maxdistance = 0,
snap_dist = Inf,
line_weight = "length",
direction = NULL,
grid_shape = c(1, 1),
verbose = FALSE,
digits = 3,
tol = 0.1
)
Arguments
origins |
A feature collection of points, for each point, its k nearest neighbours will be found on the network. |
lines |
A feature collection of linestrings representing the underlying network |
k |
An integer indicating the number of neighbours to find. |
destinations |
A feature collection of points. It can be used if the neighbours must be found in a separate set of points. NULL if the neighbours must be found in origins. |
maxdistance |
The maximum distance between two observations to consider them as neighbours. It is useful only if a grid is used, a lower value will reduce calculating time, but one must be sure that the k nearest neighbours are within this radius. Otherwise NAs will be present in the results. |
snap_dist |
The maximum distance to snap the start and end points on the network. |
line_weight |
The weighting to use for lines. Default is "length" (the geographical length), but can be the name of a column. The value is considered proportional to the geographical length of the lines. |
direction |
The name of a column indicating authorized travelling direction on lines. if NULL, then all lines can be used in both directions. Must be the name of a column otherwise. The values of the column must be "FT" (From - To), "TF" (To - From) or "Both". |
grid_shape |
A vector of length 2 indicating the shape of the grid to use for splitting the dataset. Default is c(1,1), so all the calculation is done in one go. It might be necessary to split it if the dataset is large. |
verbose |
A Boolean indicating if the function should print its progress |
digits |
The number of digits to retain from the spatial coordinates (simplification used to reduce risk of topological error) |
tol |
A float indicating the minimum distance between the points and the lines' extremities when adding the point to the network. When points are closer, they are added at the extremity of the lines. |
Value
A list with two matrices, one with the index of the neighbours and one with the distances.
Examples
data(main_network_mtl)
data(mtl_libraries)
future::plan(future::multisession(workers=1))
results <- network_knn.mc(mtl_libraries, main_network_mtl,
k = 3, maxdistance = 1000, line_weight = "length",
grid_shape=c(1,1), verbose = FALSE)
## make sure any open connections are closed afterward
if (!inherits(future::plan(), "sequential")) future::plan(future::sequential)
worker function for K-nearest points on network
Description
The worker function calculating the K-nearest points for a set of points on a network.
Usage
network_knn_worker(
points,
lines,
k,
direction = NULL,
use_dest = FALSE,
verbose = verbose,
digits = digits,
tol = tol
)
Arguments
points |
A feature collection of points, for each point, its k nearest neighbours will be found on the network. |
lines |
A feature collection of lines representing the network |
k |
An integer indicating the number of neighbours to find. |
direction |
Indicates a field providing information about authorized travelling direction on lines. if NULL, then all lines can be used in both directions. Must be the name of a column otherwise. The values of the column must be "FT" (From - To), "TF" (To - From) or "Both". |
use_dest |
A Boolean indicating if the origins and destinations are separate sets of points (TRUE), or FALSE if only origins are used. |
verbose |
A Boolean indicating if the function should print its progress |
digits |
The number of digits to retain in the spatial coordinates (simplification used to reduce risk of topological error) |
tol |
A float indicating the spatial tolerance when points are added as vertices to lines. |
Value
A list with two matrices, one with the index of the neighbours and one with the distances.
Examples
#no example provided, this is an internal function
Network distance listw
Description
Generate listw object (spdep like) based on network distances.
Usage
network_listw(
origins,
lines,
maxdistance,
method = "centroid",
point_dist = NULL,
snap_dist = Inf,
line_weight = "length",
mindist = 10,
direction = NULL,
dist_func = "inverse",
matrice_type = "B",
grid_shape = c(1, 1),
verbose = FALSE,
digits = 3,
tol = 0.1
)
Arguments
origins |
A feature collection of lines, points, or polygons for which the spatial neighbouring list will be built |
lines |
A feature collection of lines representing the network |
maxdistance |
The maximum distance between two observations to consider them as neighbours. |
method |
A string indicating how the starting points will be built. If 'centroid' is used, then the centre of lines or polygons is used. If 'pointsalong' is used, then points will be placed along polygons' borders or along lines as starting and end points. If 'ends' is used (only for lines) the first and last vertices of lines are used as starting and ending points. |
point_dist |
A float, defining the distance between points when the method 'pointsalong' is selected. |
snap_dist |
The maximum distance to snap the start and end points on the network. |
line_weight |
The weighting to use for lines. Default is "length" (the geographical length), but can be the name of a column. The value is considered proportional to the geographical length of the lines. |
mindist |
The minimum distance between two different observations. It is important for it to be different from 0 when a W style is used. |
direction |
Indicates a field providing information about authorized travelling direction on lines. if NULL, then all lines can be used in both directions. Must be the name of a column otherwise. The values of the column must be "FT" (From - To), "TF" (To - From) or "Both". |
dist_func |
Indicates the function to use to convert the distances between observations into spatial weights. Can be 'identity', 'inverse', 'squared inverse' or a function with one parameter x that will be vectorized internally |
matrice_type |
The type of the weighting scheme. Can be 'B' for Binary, 'W' for row weighted, or 'I' (identity), see the documentation of spdep::nb2listw for details |
grid_shape |
A vector of length 2 indicating the shape of the grid to use for splitting the dataset. Default is c(1,1), so all the calculation is done in one go. It might be necessary to split it if the dataset is large. |
verbose |
A Boolean indicating if the function should print its progress |
digits |
The number of digits to retain in the spatial coordinates (simplification used to reduce risk of topological error) |
tol |
A float indicating the spatial tolerance when points are added as vertices to lines. |
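The effect of dist_func, mindist and a row-standardized ("W") scheme can be illustrated on a handful of distances. In the base R sketch below, raising the distances smaller than mindist to mindist is an assumption about how the minimum is applied, and the `custom` function is a hypothetical user-supplied example.

```r
dists <- c(5, 25, 100, 400)   # network distances to four neighbours (assumed)
mindist <- 10

# Assumption: distances below mindist are raised to mindist,
# which avoids dividing by values close to 0
d <- pmax(dists, mindist)

dist_funcs <- list(
  identity = function(x) x,
  inverse = function(x) 1 / x,
  `squared inverse` = function(x) 1 / x^2,
  custom = function(x) exp(-x / 200)   # hypothetical function of one parameter x
)
weights <- sapply(dist_funcs, function(f) f(d))

# Row standardization: the weights of one observation sum to 1
w_row <- weights[, "inverse"] / sum(weights[, "inverse"])
round(weights, 4)
```

With an inverse function, a distance of 0 would give an infinite weight, which is why a non-zero mindist matters when the weights are standardized.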
Value
A listw object (spdep like) if matrice_type is "B" or "W". If matrice_type is "I", then a list with a nblist object and a list of weights is returned.
Examples
data(mtl_network)
listw <- network_listw(mtl_network,
mtl_network,
maxdistance = 500,
method = "centroid",
line_weight = "length",
dist_func = 'squared inverse',
matrice_type='B',
grid_shape = c(2,2))
Network distance listw (multicore)
Description
Generate listw object (spdep like) based on network distances with multicore support.
Usage
network_listw.mc(
origins,
lines,
maxdistance,
method = "centroid",
point_dist = NULL,
snap_dist = Inf,
line_weight = "length",
mindist = 10,
direction = NULL,
dist_func = "inverse",
matrice_type = "B",
grid_shape = c(1, 1),
verbose = FALSE,
digits = 3,
tol = 0.1
)
Arguments
origins |
A feature collection of linestrings, points or polygons for which the spatial neighbouring list will be built. |
lines |
A feature collection of linestrings representing the network |
maxdistance |
The maximum distance between two observations to consider them as neighbours. |
method |
A string indicating how the starting points will be built. If 'centroid' is used, then the centre of lines or polygons is used. If 'pointsalong' is used, then points will be placed along polygons' borders or along lines as starting and end points. If 'ends' is used (only for lines) the first and last vertices of lines are used as starting and ending points. |
point_dist |
A float, defining the distance between points when the method pointsalong is selected. |
snap_dist |
The maximum distance to snap the start and end points on the network. |
line_weight |
The weights to use for lines. Default is "length" (the geographical length), but can be the name of a column. The value is considered proportional to the geographical length of the lines. |
mindist |
The minimum distance between two different observations. It is important for it to be different from 0 when a W style is used. |
direction |
Indicates a field giving information about authorized travelling direction on lines. if NULL, then all lines can be used in both directions. Must be the name of a column otherwise. The values of the column must be "FT" (From - To), "TF" (To - From) or "Both". |
dist_func |
Indicates the function to use to convert the distances between observations into spatial weights. Can be 'identity', 'inverse', 'squared inverse' or a function with one parameter x that will be vectorized internally |
matrice_type |
The type of the weighting scheme. Can be 'B' for Binary, 'W' for row weighted, or 'I' (identity) see the documentation of spdep::nb2listw for details |
grid_shape |
A vector of length 2 indicating the shape of the grid to use for splitting the dataset. Default is c(1,1), so all the calculation is done in one go. It might be necessary to split it if the dataset is large. |
verbose |
A Boolean indicating if the function should print its progress |
digits |
The number of digits to retain in the spatial coordinates (simplification used to reduce risk of topological error) |
tol |
A float indicating the spatial tolerance when points are added as vertices to lines. |
Value
A listw object (spdep like) if matrice_type is "B" or "W". If matrice_type is "I", then a list with a nblist object and a list of weights is returned.
Examples
data(mtl_network)
future::plan(future::multisession(workers=1))
listw <- network_listw.mc(mtl_network,mtl_network,maxdistance=500,
method = "centroid", line_weight = "length",
dist_func = 'squared inverse', matrice_type='B', grid_shape = c(2,2))
## make sure any open connections are closed afterward
if (!inherits(future::plan(), "sequential")) future::plan(future::sequential)
network_listw worker
Description
The worker function of network_listw.
Usage
network_listw_worker(
points,
lines,
maxdistance,
dist_func,
direction = NULL,
mindist = 10,
matrice_type = "B",
verbose = FALSE,
digits = 3,
tol = 0.1
)
Arguments
points |
A feature collection of points corresponding to start and end points. It must have a column fid, grouping the points if necessary. |
lines |
A feature collection of lines representing the network |
maxdistance |
The maximum distance between two observations to consider them as neighbours. |
dist_func |
A vectorized function converting spatial distances into weights. |
direction |
Indicate a field giving information about authorized travelling direction on lines. if NULL, then all lines can be used in both directions. Must be the name of a column otherwise. The values of the column must be "FT" (From - To), "TF" (To - From) or "Both". |
mindist |
The minimum distance between two different observations. It is important for it to be different from 0 when a W style is used. |
matrice_type |
The type of the weighting scheme. Can be 'B' for Binary, 'W' for row weighted, or 'I' (identity), see the documentation of spdep::nb2listw for details |
verbose |
A Boolean indicating if the function should print its progress |
digits |
The number of digits to keep in the spatial coordinates (simplification used to reduce risk of topological error) |
tol |
A float indicating the spatial tolerance when points are added as vertices to lines. |
Value
A list of neighbours as weights.
Examples
#no example provided, this is an internal function
Network Kernel density estimate
Description
Calculate the Network Kernel Density Estimate based on a network of lines, sampling points, and events
Usage
nkde(
lines,
events,
w,
samples,
kernel_name,
bw,
adaptive = FALSE,
trim_bw = NULL,
method,
div = "bw",
diggle_correction = FALSE,
study_area = NULL,
max_depth = 15,
digits = 5,
tol = 0.1,
agg = NULL,
sparse = TRUE,
grid_shape = c(1, 1),
verbose = TRUE,
check = TRUE
)
Arguments
lines |
A feature collection of linestrings representing the underlying network. The geometries must be simple Linestrings, without MultiLineString (the function may crash if some geometries are invalid). |
events |
A feature collection of points representing the events on the network. The points will be snapped on the network to their closest line. |
w |
A vector representing the weight of each event |
samples |
A feature collection of points representing the locations for which the densities will be estimated. |
kernel_name |
The name of the kernel to use. Must be one of triangle, gaussian, tricube, cosine, triweight, quartic, epanechnikov or uniform. |
bw |
The kernel bandwidth (using the scale of the lines), can be a single float or a numeric vector if a different bandwidth must be used for each event. |
adaptive |
A Boolean, indicating if an adaptive bandwidth must be used |
trim_bw |
A float, indicating the maximum value for the adaptive bandwidth |
method |
The method to use when calculating the NKDE, must be one of simple / discontinuous / continuous (see nkde details for more information) |
div |
The divisor to use for the kernel. Must be "n" (the number of events within the radius around each sampling point), "bw" (the bandwidth) or "none" (the simple sum). |
diggle_correction |
A Boolean indicating if the correction factor for edge effect must be used. |
study_area |
A feature collection of polygons representing the limits of the study area. |
max_depth |
When using the continuous and discontinuous methods, the calculation time and memory use can grow dramatically if the network has many small edges (areas with many intersections and many events). To avoid this, it is possible to set a maximum depth here. Considering that the kernel is divided at intersections, a value of 10 should yield good estimates in most cases. A larger value can be used without a problem for the discontinuous method. For the continuous method, a larger value will strongly impact calculation speed. |
digits |
The number of digits to retain from the spatial coordinates. It ensures that topology is good when building the network. Default is 5. Too high a precision (high number of digits) might break some connections |
tol |
A float indicating the minimum distance between the events and the lines' extremities when adding the point to the network. When points are closer, they are added at the extremity of the lines. |
agg |
A double indicating if the events must be aggregated within a distance. If NULL, the events are aggregated only by rounding the coordinates. |
sparse |
A Boolean indicating if sparse or regular matrices should be used by the Rcpp functions. These matrices are used to store edge indices between two nodes in a graph. Regular matrices are faster, but require more memory, in particular with multiprocessing. Sparse matrices are slower (a bit), but require much less memory. |
grid_shape |
A vector of two values indicating how the study area must be split when performing the calculation. Default is c(1,1) (no split). A finer grid could reduce memory usage and increase speed when a large dataset is used. When using multiprocessing, the work in each grid cell is dispatched between the workers. |
verbose |
A Boolean, indicating if the function should print messages about the process. |
check |
A Boolean indicating if the geometry checks must be run before the operation. This might take some time, but it will ensure that the CRS of the provided objects are valid and identical, and that geometries are valid. |
Details
The three NKDE methods
Estimating the density of a point process is commonly done by using an
ordinary two-dimensional kernel density function. However, there are numerous
cases for which the events do not occur in a two-dimensional space but on a
network (like car crashes, outdoor crimes, leaks in pipelines, etc.). New
methods were developed to adapt the methodology to networks, three of them
are available in this package.
The simple method: This first method was presented by Xie and Yan (2008) and proposes an intuitive solution. The distances between events and sampling points are replaced by network distances, and the formula of the kernel is adapted to calculate the density over a linear unit instead of an areal unit.
The discontinuous method: The previous method was criticized by Okabe et al. (2009), who argue that the proposed estimator is biased and overestimates density in event hot spots. More specifically, the simple method does not conserve mass, and the induced kernel is not a probability density along the network. They thus proposed a discontinuous version of the kernel function on a network, which divides the mass density of an event equally at intersections.
The continuous method: Although the discontinuous method is unbiased, it leads to a discontinuous kernel function, which is somewhat counter-intuitive. Okabe et al. (2009) proposed another version of the kernel, which divides the mass of the density at intersections but adjusts the density before the intersection to make the function continuous.
The three methods are available because, even though the simple method is
less precise statistically speaking, it might be more intuitive. From a
purely geographical view, it might be seen as a sort of distance decay
function as used in Geographically Weighted Regression.
Adaptive bandwidth
It is possible to use an adaptive bandwidth instead of a fixed
bandwidth. Adaptive bandwidths are calculated using Abramson's smoothing
regimen (Abramson 1982). To do so, an initial
fixed bandwidth must be specified (bw parameter); it is used to estimate the
prior density at event locations. These densities are then used to
calculate local bandwidths. The maximum size of the local bandwidths can be
limited with the parameter trim_bw. For more details, see the vignettes.
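As a rough illustration of the adaptive step, the sketch below applies the usual form of Abramson's square-root rule in base R: each local bandwidth is the fixed bandwidth scaled by the inverse square root of the prior density relative to its geometric mean. The function name abramson_bw and the toy densities are hypothetical, and the package's internal computation may differ in its details.

```r
# Hypothetical helper (not part of spNetwork): Abramson-style local bandwidths.
# prior_density: prior density estimated at each event with the fixed bandwidth
# bw: the fixed (global) bandwidth; trim_bw: optional upper limit
abramson_bw <- function(prior_density, bw, trim_bw = NULL) {
  # geometric mean of the prior densities
  geo_mean <- exp(mean(log(prior_density)))
  # events in sparse areas get a larger bandwidth, events in dense areas a smaller one
  local_bw <- bw * (prior_density / geo_mean)^(-0.5)
  if (!is.null(trim_bw)) {
    local_bw <- pmin(local_bw, trim_bw)
  }
  local_bw
}

prior_density <- c(0.001, 0.004, 0.016)  # toy values
local_bws <- abramson_bw(prior_density, bw = 300, trim_bw = 500)
print(local_bws)
```

In this toy case the event with the lowest prior density would receive a bandwidth of 600, which trim_bw caps at 500.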
Optimization parameters
The grid_shape parameter makes it possible to
split the calculation of the NKDE according to a grid dividing the study area.
This might be necessary for large datasets to reduce memory use. If the
grid_shape is c(1,1), then a full network is built for the area. If the
grid_shape is c(2,2), then the area is split into 4 rectangles. For each
rectangle, the sample points falling in the rectangle are used, together with
the events and the lines within a distance of one bandwidth. The results are
combined at the end and ordered to match the original order of the samples.
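The splitting and recombination logic can be pictured with a small base R sketch. It only assigns sample points to the cells of a 2 x 2 grid, processes each cell separately with a placeholder computation, and restores the original order; the selection of events and lines within one bandwidth of each cell is left out, and all object names are hypothetical.

```r
set.seed(42)
# toy sample points in a 100 x 50 study area
samples_xy <- cbind(x = runif(20, 0, 100), y = runif(20, 0, 50))
grid_shape <- c(2, 2)

# regular breaks along each axis
x_breaks <- seq(0, 100, length.out = grid_shape[1] + 1)
y_breaks <- seq(0, 50, length.out = grid_shape[2] + 1)

# column and row of the cell containing each sample
col_id <- findInterval(samples_xy[, "x"], x_breaks, all.inside = TRUE)
row_id <- findInterval(samples_xy[, "y"], y_breaks, all.inside = TRUE)
cell_id <- (row_id - 1) * grid_shape[1] + col_id

# process each cell on its own (placeholder: x + y), keeping the original index
parts <- lapply(split(seq_len(nrow(samples_xy)), cell_id), function(idx) {
  data.frame(oid = idx, value = samples_xy[idx, "x"] + samples_xy[idx, "y"])
})

# combine the pieces and restore the original order of the samples
result <- do.call(rbind, parts)
result <- result[order(result$oid), ]
```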
The geographical coordinates of the start and end of lines are used to
build the network. To avoid troubles with digits, we truncate the coordinates
according to the digits parameter. A minimal loss of precision is expected, but
it results in a fast construction of the network.
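To see why the number of digits matters, the sketch below builds a text key from the coordinates of two line extremities that differ only by numerical noise. It uses rounding as a stand-in for the simplification described above, and the helper coord_key is hypothetical, not a function of the package.

```r
# Hypothetical helper: one character key per coordinate pair
coord_key <- function(coords, digits) {
  rounded <- round(coords, digits)
  paste(
    formatC(rounded[, 1], format = "f", digits = digits),
    formatC(rounded[, 2], format = "f", digits = digits),
    sep = "_"
  )
}

# end of one line and start of the next, almost identical
end_of_a <- matrix(c(100.12341, 200.56781), ncol = 2)
start_of_b <- matrix(c(100.12342, 200.56779), ncol = 2)

# with 3 digits both extremities share a key, so the lines are connected
same_node_3 <- coord_key(end_of_a, 3) == coord_key(start_of_b, 3)
# with 5 digits the keys differ and the connection is lost
same_node_5 <- coord_key(end_of_a, 5) == coord_key(start_of_b, 5)
```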
To calculate the
distances on the network, all the events are added as vertices. To reduce the
size of the network, it is possible to reduce the number of vertices by
adding the events at the extremity of the lines if they are close to them.
This is controlled by the parameter tol.
In the same way, it is
possible to limit the number of vertices by aggregating the events that are
close to each other. In that case, the weights of the aggregated events are
summed. Given an aggregation distance, a buffer is drawn around the
first event, and all events falling in that buffer are aggregated to the first
event, forming a new event. The coordinates of this new event are the means of
the original events' coordinates. This procedure is repeated until no events
are aggregated. The aggregation distance can be fixed with the parameter agg.
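The aggregation procedure described above can be sketched in base R as follows. This is only an illustration under simplifying assumptions: it uses Euclidean distances on a coordinate matrix, and the function aggregate_events is hypothetical, not the package's implementation.

```r
# Hypothetical helper: iterative aggregation of events closer than `agg`
aggregate_events <- function(xy, w, agg) {
  repeat {
    changed <- FALSE
    out_xy <- matrix(numeric(0), ncol = 2)
    out_w <- numeric(0)
    remaining <- seq_len(nrow(xy))
    while (length(remaining) > 0) {
      first <- remaining[1]
      # distance from the first remaining event to all remaining events
      d <- sqrt((xy[remaining, 1] - xy[first, 1])^2 +
                (xy[remaining, 2] - xy[first, 2])^2)
      grp <- remaining[d <= agg]
      if (length(grp) > 1) changed <- TRUE
      # new event: mean coordinates, summed weights
      out_xy <- rbind(out_xy, colMeans(xy[grp, , drop = FALSE]))
      out_w <- c(out_w, sum(w[grp]))
      remaining <- setdiff(remaining, grp)
    }
    xy <- out_xy
    w <- out_w
    # stop once a full pass aggregates nothing
    if (!changed) break
  }
  list(xy = xy, w = w)
}

events_xy <- rbind(c(0, 0), c(10, 0), c(100, 0))
agg_res <- aggregate_events(events_xy, w = c(1, 2, 1), agg = 15)
```

With these toy values the first two events merge into one event at (5, 0) with weight 3, and the third event is left unchanged.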
When using the continuous and discontinuous kernel, the density is
reduced at each intersection crossed. In the discontinuous case, after 5
intersections with four directions each, the density value is divided by 243
leading to very small values. In the same situation but with the continuous
NKDE, the density value is divided by approximately 7.6. The max_depth
parameter allows the user to control the maximum depth of these two NKDE.
The default value is 15, but a value of 10 would yield very close estimates. A
lower value can have a strong impact on speed when the bandwidth is large.
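The two figures quoted above can be checked with a short calculation. The sketch assumes that each intersection divides the density by 3 in the discontinuous case (one share per outgoing direction at a four-way intersection) and by 1.5 in the continuous case; these per-intersection factors are inferred from the totals stated in the text, not taken from the package code.

```r
n_intersections <- 5

# assumed per-intersection divisors (inferred from the totals in the text)
div_discontinuous <- 3^n_intersections
div_continuous <- 1.5^n_intersections

print(div_discontinuous)  # 243
print(div_continuous)     # 7.59375
```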
When using the continuous and discontinuous kernel, the connections
between graph nodes are stored in a matrix. This matrix is typically sparse,
and so a sparse matrix object is used to limit memory use. If the network is
small (typically when the grid used to split the data has small rectangles)
then a classical matrix could be used instead of a sparse one. It
significantly increases speed, but could lead to memory issues.
Value
A vector of density estimates at the sampling points.
References
Abramson IS (1982).
“On bandwidth variation in kernel estimates-a square root law.”
The Annals of Statistics, 1217–1223.
Okabe A, Satoh T, Sugihara K (2009).
“A kernel density estimation method for networks, its computational method and a GIS-based tool.”
International Journal of Geographical Information Science, 23(1), 7–32.
Xie Z, Yan J (2008).
“Kernel density estimation of traffic accidents in a network space.”
Computers, Environment and Urban Systems, 32(5), 396–406.
Examples
data(mtl_network)
data(bike_accidents)
lixels <- lixelize_lines(mtl_network,200,mindist = 50)
samples <- lines_center(lixels)
densities <- nkde(mtl_network,
                  events = bike_accidents,
                  w = rep(1, nrow(bike_accidents)),
                  samples = samples,
                  kernel_name = "quartic",
                  bw = 300, div = "bw",
                  adaptive = FALSE,
                  method = "discontinuous", digits = 1, tol = 1,
                  agg = 15,
                  grid_shape = c(1, 1),
                  verbose = FALSE)
Network Kernel density estimate (multicore)
Description
Calculate the Network Kernel Density Estimate based on a network of lines, sampling points, and events with multicore support.
Usage
nkde.mc(
lines,
events,
w,
samples,
kernel_name,
bw,
adaptive = FALSE,
trim_bw = NULL,
method,
div = "bw",
diggle_correction = FALSE,
study_area = NULL,
max_depth = 15,
digits = 5,
tol = 0.1,
agg = NULL,
sparse = TRUE,
grid_shape = c(1, 1),
verbose = TRUE,
check = TRUE
)
Arguments
lines |
A feature collection of linestrings representing the underlying network. The geometries must be simple LineStrings, not MultiLineStrings (the function may crash if some geometries are invalid). |
events |
A feature collection of points representing the events on the network. The points will be snapped on the network to their closest line. |
w |
A vector representing the weight of each event |
samples |
A feature collection of points representing the locations for which the densities will be estimated. |
kernel_name |
The name of the kernel to use. Must be one of triangle, gaussian, tricube, cosine, triweight, quartic, epanechnikov or uniform. |
bw |
The kernel bandwidth (using the scale of the lines), can be a single float or a numeric vector if a different bandwidth must be used for each event. |
adaptive |
A Boolean, indicating if an adaptive bandwidth must be used |
trim_bw |
A float, indicating the maximum value for the adaptive bandwidth |
method |
The method to use when calculating the NKDE, must be one of simple / discontinuous / continuous (see nkde details for more information) |
div |
The divisor to use for the kernel. Must be "n" (the number of events within the radius around each sampling point), "bw" (the bandwidth), or "none" (the simple sum). |
diggle_correction |
A Boolean indicating if the correction factor for edge effect must be used. |
study_area |
A feature collection of polygons representing the limits of the study area. |
max_depth |
When using the continuous and discontinuous methods, the calculation time and memory use can grow sharply if the network has many small edges (areas with many intersections and many events). To avoid this, a maximum depth can be set here. Considering that the kernel is divided at intersections, a value of 10 should yield good estimates in most cases. A larger value can be used without a problem for the discontinuous method. For the continuous method, a larger value will strongly impact calculation speed. |
digits |
The number of digits to retain from the spatial coordinates. It ensures that topology is good when building the network. Default is 5. Too high a precision (a high number of digits) might break some connections. |
tol |
A float indicating the minimum distance between the events and the lines' extremities when adding the point to the network. When points are closer, they are added at the extremity of the lines. |
agg |
A double giving the distance within which the events must be aggregated. If NULL, the events are aggregated only by rounding the coordinates. |
sparse |
A Boolean indicating if sparse or regular matrices should be used by the Rcpp functions. These matrices are used to store edge indices between two nodes in a graph. Regular matrices are faster, but require more memory, in particular with multiprocessing. Sparse matrices are slower (a bit), but require much less memory. |
grid_shape |
A vector of two values indicating how the study area must be split when performing the calculus. Default is c(1,1) (no split). A finer grid could reduce memory usage and increase speed when a large dataset is used. When using multiprocessing, the work in each grid is dispatched between the workers. |
verbose |
A Boolean, indicating if the function should print messages about the process. |
check |
A Boolean indicating if the geometry checks must be run before the operation. This might take some time, but it will ensure that the CRS of the provided objects are valid and identical, and that geometries are valid. |
Details
For more details, see help(nkde)
Value
A vector of density estimates at the sampling points.
Examples
data(mtl_network)
data(bike_accidents)
future::plan(future::multisession, workers = 1)
lixels <- lixelize_lines(mtl_network,200,mindist = 50)
samples <- lines_center(lixels)
densities <- nkde.mc(mtl_network,
                     events = bike_accidents,
                     w = rep(1, nrow(bike_accidents)),
                     samples = samples,
                     kernel_name = "quartic",
                     bw = 300, div = "bw",
                     adaptive = FALSE, agg = 15,
                     method = "discontinuous", digits = 1, tol = 1,
                     grid_shape = c(3, 3),
                     verbose = TRUE)
## make sure any open connections are closed afterward
if (!inherits(future::plan(), "sequential")) future::plan(future::sequential)
The exposed function to calculate NKDE likelihood cv
Description
The exposed function to calculate NKDE likelihood cv (INTERNAL)
Usage
nkde_get_loo_values(
method,
neighbour_list,
sel_events,
sel_events_wid,
events,
events_wid,
weights,
bws_net,
kernel_name,
line_list,
max_depth,
cvl
)
Arguments
method |
a string, one of "simple", "continuous", "discontinuous" |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
sel_events |
a Numeric vector indicating the selected events (id of nodes) |
sel_events_wid |
a Numeric Vector indicating the unique id of the selected events |
events |
a NumericVector indicating the nodes in the graph being events |
events_wid |
a NumericVector indicating the unique id of all the events |
weights |
a matrix with the weights associated with each event (row) for each bws_net (cols). |
bws_net |
an arma::mat with the network bandwidths to consider for each event |
kernel_name |
a string with the name of the kernel to use |
line_list |
a DataFrame describing the lines |
max_depth |
the maximum recursion depth |
cvl |
a boolean indicating if the Cronie (TRUE) or CV likelihood (FALSE) must be used |
Value
a vector with the CV score for each bandwidth and the densities if required
Examples
# no example provided, this is an internal function
NKDE worker
Description
The worker function for nkde and nkde.mc
Usage
nkde_worker(
lines,
events,
samples,
kernel_name,
bw,
bws,
method,
div,
digits,
tol,
sparse,
max_depth,
verbose = FALSE
)
Arguments
lines |
A feature collection of linestrings representing the network. The geometries must be simple lines (may crash if some geometries are invalid) |
events |
A feature collection of points representing the events on the network. The points will be snapped on the network. |
samples |
A feature collection of points representing the locations for which the densities will be estimated. |
kernel_name |
The name of the kernel to use |
bw |
The global kernel bandwidth |
bws |
The kernel bandwidth (in meters) for each event. Is usually a vector but could also be a matrix if several global bandwidths were used. In this case, the output value is also a matrix. |
method |
The method to use when calculating the NKDE, must be one of simple / discontinuous / continuous (see details for more information) |
div |
The divisor to use for the kernel. Must be "n" (the number of events within the radius around each sampling point), "bw" (the bandwidth), or "none" (the simple sum). |
digits |
The number of digits to keep in the spatial coordinates. It ensures that topology is good when building the network. Default is 3 |
tol |
The minimum distance between the events or sampling points and the lines' extremities when adding these points to the network. When points are closer, they are added at the extremity of the lines. |
sparse |
A Boolean indicating if sparse or regular matrices should be used by the Rcpp functions. Regular matrices are faster, but require more memory and could lead to error, in particular with multiprocessing. Sparse matrices are slower, but require much less memory. |
max_depth |
When using the continuous and discontinuous methods, the calculation time and memory use can grow sharply if the network has many small edges (areas with many intersections and many events). To avoid this, a maximum depth can be set here. Considering that the kernel is divided at intersections, a value of 8 should yield good estimates. A larger value can be used without a problem for the discontinuous method. For the continuous method, a larger value will strongly impact calculation speed. |
verbose |
A Boolean, indicating if the function should print messages about the process. |
Value
A numeric vector with the nkde values
Examples
#This is an internal function, no example provided
Bandwidth selection by likelihood cross validation worker function
Description
Worker function calculating the cross-validation likelihood for multiple bandwidths, used to select an appropriate bandwidth with a data-driven approach.
Usage
nkde_worker_bw_sel(
lines,
quad_events,
events_loc,
events,
w,
kernel_name,
bws_net,
method,
div,
digits,
tol,
sparse,
max_depth,
zero_strat = "min_double",
verbose = FALSE,
cvl = FALSE
)
Arguments
lines |
A feature collection of linestrings representing the underlying network |
quad_events |
a feature collection of points indicating for which events the densities must be calculated |
events_loc |
A feature collection of points representing the location of the events |
events |
A feature collection of points representing the events. Multiple events can share the same location. They are linked by the goid column |
w |
A numeric matrix with the weight of the events for each bandwidth |
kernel_name |
The name of the kernel to use (string) |
bws_net |
A numeric matrix with the network bandwidths for each event |
method |
The type of NKDE to use (string) |
digits |
The number of digits to retain from the spatial coordinates. It ensures that topology is good when building the network. Default is 3. Too high a precision (high number of digits) might break some connections |
tol |
A float indicating the minimum distance between the events and the lines' extremities when adding the point to the network. When points are closer, they are added at the extremity of the lines. |
sparse |
A Boolean indicating if sparse or regular matrices should be used by the Rcpp functions. These matrices are used to store edge indices between two nodes in a graph. Regular matrices are faster, but require more memory, in particular with multiprocessing. Sparse matrices are slower (a bit), but require much less memory. |
zero_strat |
A string indicating what to do when the density is 0 when calculating the LOO density estimate for an isolated event. "min_double" (default) replaces the 0 value by the minimum double possible on the machine. "remove" removes these values from the final score. The first approach penalizes small bandwidths more strongly. |
verbose |
A Boolean, indicating if the function should print messages about the process. |
cvl |
A boolean indicating if the cvl method (TRUE) or the loo (FALSE) method must be used |
Examples
# no example provided, this is an internal function
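The effect of the two zero_strat options can be pictured with a small base R sketch. It assumes the score is a sum of log leave-one-out densities; the helper loo_score and the toy densities are hypothetical, and the score computed by the package may be defined differently.

```r
# Hypothetical helper: log-likelihood score from leave-one-out densities
loo_score <- function(loo_dens, zero_strat = c("min_double", "remove")) {
  zero_strat <- match.arg(zero_strat)
  if (zero_strat == "min_double") {
    # replace zeros by the smallest positive normalized double
    loo_dens[loo_dens == 0] <- .Machine$double.xmin
  } else {
    # drop isolated events with a zero density
    loo_dens <- loo_dens[loo_dens > 0]
  }
  sum(log(loo_dens))
}

loo_dens <- c(0.01, 0, 0.02)  # toy values, the second event is isolated
score_min_double <- loo_score(loo_dens, "min_double")
score_remove <- loo_score(loo_dens, "remove")
```

Because log(.Machine$double.xmin) is a large negative number, a bandwidth that leaves isolated events with a zero density is penalized much more heavily under "min_double" than under "remove".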
pairwise distance between two vectors
Description
pairwise distance between two vectors
Usage
pair_dists(x, y)
Arguments
x |
a numeric vector |
y |
a numeric vector |
Value
a matrix with dimensions length(x) * length(y)
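A base R equivalent can be sketched with outer(), assuming that the distance between two values is their absolute difference; this assumption is not stated in the documentation above.

```r
x <- c(0, 5, 10)
y <- c(2, 8)

# rows follow x, columns follow y
dist_mat <- abs(outer(x, y, "-"))
print(dim(dist_mat))  # 3 2
```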
Plot graph
Description
Function to plot a graph (useful to check connectivity).
Usage
plot_graph(graph)
Arguments
graph |
A graph object (produced with build_graph) |
Examples
#This is an internal function, no example provided
Preparing results for K functions
Description
Prepare the final results at the end of the execution of the main functions calculating K or G functions.
Usage
prep_kfuncs_results(
k_vals,
g_vals,
all_values,
conf_int,
calc_g_func,
cross,
dist_seq,
return_sims
)
Arguments
k_vals |
a numeric vector with the real K values |
g_vals |
a numeric vector with the real g values |
all_values |
a list with the simulated K and G values that must be arranged. |
conf_int |
the confidence interval parameter. |
calc_g_func |
a boolean indicating if the G function has been calculated. |
cross |
a boolean indicating if we have calculated a simple (FALSE) or a cross function. |
dist_seq |
a numeric vector representing the distance used for calculation |
return_sims |
a boolean, indicating if the simulations must be returned |
Value
A list with the following values:
plotk: A ggplot2 object representing the values of the k-function
plotg: A ggplot2 object representing the values of the g-function
values: A DataFrame with the values used to build the plots
Examples
# no example, this is an internal function
Prior data preparation
Description
A simple function to prepare data before the NKDE calculation.
Usage
prepare_data(samples, lines, events, w, digits, tol, agg)
Arguments
samples |
A feature collection of points representing the samples points |
lines |
A feature collection of Linestrings representing the network |
events |
A feature collection of points representing the events points |
w |
A numeric vector representing the weight of the events |
digits |
The number of digits to keep |
tol |
A float indicating the spatial tolerance when snapping events on lines |
agg |
A double giving the distance within which the points must be aggregated. If NULL, the points are aggregated only by rounding the coordinates. |
Value
the data prepared for the rest of the operations
Examples
#This is an internal function, no example provided
Data preparation for network_listw
Description
Function to prepare selected points and selected lines during the process.
Usage
prepare_elements_netlistw(is, grid, snapped_points, lines, maxdistance)
Arguments
is |
The indices of the quadrats to use in the grid |
grid |
A feature collection of polygons representing the quadrats used to split the calculation |
snapped_points |
The start and end points snapped to the lines |
lines |
The lines representing the network |
maxdistance |
The maximum distance between two observations to consider them as neighbours. |
Value
A list of two elements: selected points and selected lines
Examples
#no example provided, this is an internal function
Quartic kernel
Description
Function implementing the quartic kernel.
Usage
quartic_kernel(d, bw)
Arguments
d |
The distance from the event |
bw |
The bandwidth used for the kernel |
Value
The estimated density
Examples
#This is an internal function, no example provided
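For reference, the textbook form of the quartic (biweight) kernel scaled by the bandwidth can be written in a few lines of base R. The function quartic_sketch is a hypothetical stand-in, and the normalisation used by the package may differ.

```r
# Hypothetical stand-in for a quartic kernel: K(u) = 15/16 * (1 - u^2)^2, u = d / bw
quartic_sketch <- function(d, bw) {
  u <- d / bw
  k <- (15 / 16) * (1 - u^2)^2 / bw
  # the kernel is zero beyond the bandwidth
  ifelse(abs(d) > bw, 0, k)
}

k_values <- quartic_sketch(c(0, 150, 400), bw = 300)

# the kernel integrates to 1 over [-bw, bw]
total_mass <- integrate(function(d) quartic_sketch(d, 300), -300, 300)$value
```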
c++ quartic kernel
Description
c++ quartic kernel
c++ quartic kernel integral
Usage
quartic_kernel_cpp(d, bw)
quartic_kernel_int_cpp(d_start, d_end, bw)
Arguments
d |
a vector of distances for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
d_start |
a vector of start distances for which the density must be calculated |
d_end |
a vector of end distances for which the density must be calculated |
c++ quartic kernel for one distance
Description
c++ quartic kernel for one distance
Usage
quartic_kernelos(d, bw)
Arguments
d |
a double, the distance for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
Remove loops
Description
Remove from a sf object with linestring type geometries the lines that have the same starting and ending point.
Usage
remove_loop_lines(lines, digits)
Arguments
lines |
A sf object with linestring type geometries |
digits |
An integer indicating the number of digits to keep for the spatial coordinates |
Value
A sf object with linestring type geometries
Examples
#This is an internal function, no example provided
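The test behind this operation can be sketched without sf by storing each line as a matrix of coordinates and comparing its rounded first and last vertices. The helper is_loop and the toy lines are hypothetical.

```r
# Hypothetical helper: TRUE when a line starts and ends at the same rounded point
is_loop <- function(coords, digits) {
  first <- round(coords[1, ], digits)
  last <- round(coords[nrow(coords), ], digits)
  all(first == last)
}

line_coords <- list(
  rbind(c(0, 0), c(10, 0), c(10, 10)),        # ordinary line
  rbind(c(5, 5), c(6, 6), c(5.0004, 5.0001))  # closes on itself at 3 digits
)

keep <- !vapply(line_coords, is_loop, logical(1), digits = 3)
kept_lines <- line_coords[keep]
```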
Remove mirror edges
Description
Keep unique edges based on start and end point
Usage
remove_mirror_edges(lines, keep_shortest = TRUE, digits = 3, verbose = TRUE)
Arguments
lines |
A feature collection of linestrings |
keep_shortest |
A boolean, if TRUE, then the shortest line is kept if several lines have the same starting point and ending point. If FALSE, then the longest line is kept. |
digits |
An integer indicating the number of digits to keep in coordinates |
Value
A feature collection of linestrings with the mirror edges removed
Examples
#This is an internal function, no example provided
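The selection rule can be illustrated on a plain data frame of edges. The sketch assumes that two edges are mirrors when they share the same pair of extremities regardless of direction; the column names and node labels are hypothetical.

```r
edges <- data.frame(
  start = c("A", "B", "A", "C"),
  end = c("B", "A", "C", "D"),
  length = c(120, 95, 40, 60)
)
keep_shortest <- TRUE

# direction-independent key for each pair of extremities
key <- ifelse(edges$start < edges$end,
              paste(edges$start, edges$end),
              paste(edges$end, edges$start))

# sort so that the edge to keep comes first within each key
ord <- order(key, if (keep_shortest) edges$length else -edges$length)
sorted_edges <- edges[ord, ]
unique_edges <- sorted_edges[!duplicated(key[ord]), ]
```

Here the two edges joining A and B are mirrors, and only the one of length 95 is kept.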
Reverse the elements in a matrix
Description
Reverse the order of the elements in a matrix, both column-wise and row-wise.
Usage
rev_matrix(mat)
Arguments
mat |
The matrix to reverse |
Value
A matrix
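In base R, the same result can be obtained by indexing, assuming that reversing both row-wise and column-wise means flipping the two axes. The helper name flip_both is hypothetical.

```r
# Hypothetical base R equivalent: flip rows and columns
flip_both <- function(mat) {
  mat[rev(seq_len(nrow(mat))), rev(seq_len(ncol(mat))), drop = FALSE]
}

m <- matrix(1:6, nrow = 2)
flipped <- flip_both(m)
print(flipped)
```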
Reverse lines
Description
A function to reverse the order of the vertices of lines
Usage
reverse_lines(lines)
Arguments
lines |
A sf object with linestring type geometries |
Value
A sf object with linestring type geometries
Examples
#This is an internal function, no example provided
Sanity check for the knn functions
Description
Check if all the parameters are valid for the knn functions
Usage
sanity_check_knn(
origins,
destinations,
lines,
k,
maxdistance,
snap_dist,
line_weight,
direction,
grid_shape,
verbose,
digits,
tol
)
Arguments
origins |
A feature collection of points; for each point, its k nearest neighbours will be found on the network. |
destinations |
A feature collection of points, might be used if the neighbours must be found in a separate dataset. NULL if the neighbours must be found in origins. |
lines |
A feature collection of linestrings representing the network |
k |
An integer indicating the number of neighbours to find. |
maxdistance |
The maximum distance between two observations to consider them as neighbours. It is useful only if a grid is used, a lower value will reduce calculating time, but one must be sure that the k nearest neighbours are within this radius. Otherwise NAs will be present in the final matrices. |
snap_dist |
The maximum distance to snap the start and end points on the network. |
line_weight |
The weighting to use for lines. Default is "length" (the geographical length), but can be the name of a column. The value is considered proportional to the geographical length of the lines. |
direction |
Indicates a field providing information about authorized travelling direction on lines. if NULL, then all lines can be used in both directions. Must be the name of a column otherwise. The values of the column must be "FT" (From - To), "TF" (To - From) or "Both". |
grid_shape |
A vector of length 2 indicating the shape of the grid to use for splitting the dataset. Default is c(1,1), so all the calculation is done in one go. It might be necessary to split it if the dataset is large. |
verbose |
A Boolean indicating if the function should print its progress |
digits |
The number of digits to retain in the spatial coordinates (simplification used to reduce the risk of topological errors) |
tol |
A float indicating the spatial tolerance when points are added as vertices to lines. |
Value
A list with two matrices, one with the index of the neighbours and one with the distances.
Examples
#no example provided, this is an internal function
Select the distance to weight function
Description
Select a function to convert distances to weights. If a function is provided, it will be vectorized.
Usage
select_dist_function(dist_func = "inverse")
Arguments
dist_func |
Could be a name in c('inverse', 'identity', 'squared inverse') or a function with only one parameter x |
Value
A vectorized function used to convert distance into spatial weights
Examples
#This is an internal function, no example provided
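The behaviour described above can be sketched with switch() and Vectorize(). The function pick_dist_function is a hypothetical stand-in, and the formulas attached to each name (1/x, x, and 1/x^2) are assumptions based on the names only.

```r
# Hypothetical stand-in: return a vectorized distance-to-weight function
pick_dist_function <- function(dist_func = "inverse") {
  if (is.function(dist_func)) {
    # a user function written for one value is vectorized
    return(Vectorize(dist_func))
  }
  switch(dist_func,
    "inverse" = function(x) 1 / x,
    "identity" = function(x) x,
    "squared inverse" = function(x) 1 / x^2,
    stop("unknown distance function name")
  )
}

inv_weights <- pick_dist_function("inverse")(c(2, 4))

# custom rule: weight 1 up to 100 units, 0 beyond
step_fun <- pick_dist_function(function(x) if (x > 100) 0 else 1)
step_weights <- step_fun(c(50, 150))
```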
Select kernel function
Description
select the kernel function with its name.
Usage
select_kernel(name)
Arguments
name |
The name of the kernel to use |
Value
A kernel function
Examples
#This is an internal function, no example provided
LineString to simple Line
Description
Split the polylines of a feature collection of linestrings in simple segments at each vertex. The values of the columns are duplicated for each segment.
Usage
simple_lines(lines)
Arguments
lines |
The feature collection of linestrings to modify |
Value
A feature collection of linestrings
Examples
data(mtl_network)
new_lines <- simple_lines(mtl_network)
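What the splitting does to a single polyline can be sketched on a plain coordinate matrix: a line with n vertices yields n - 1 two-point segments. The helper split_into_segments is hypothetical and ignores the attribute columns that the function duplicates for each segment.

```r
# Hypothetical helper: split a polyline (matrix of vertices) into segments
split_into_segments <- function(coords) {
  n <- nrow(coords)
  lapply(seq_len(n - 1), function(i) coords[i:(i + 1), , drop = FALSE])
}

polyline <- rbind(c(0, 0), c(10, 0), c(10, 5), c(20, 5))
segments <- split_into_segments(polyline)

# the segment lengths add up to the length of the original polyline
seg_lengths <- vapply(segments, function(s) sqrt(sum((s[2, ] - s[1, ])^2)), numeric(1))
```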
Simple NKDE algorithm
Description
Function to perform the simple nkde.
Usage
simple_nkde(graph, events, samples, bws, kernel_func, nodes, edges, div = "bw")
Arguments
graph |
a graph object from igraph representing the network |
events |
a feature collection of points representing the events. It must be snapped on the network, and be nodes of the network. A column vertex_id must indicate for each event its corresponding node |
samples |
a feature collection of points representing the sampling points. The samples must be snapped on the network. A column edge_id must indicate for each sample on which edge it is snapped. |
bws |
a vector indicating the kernel bandwidth (in meters) for each event |
kernel_func |
a function obtained with the function select_kernel |
nodes |
a feature collection of points representing the nodes of the network |
edges |
a feature collection of linestrings representing the edges of the network |
div |
The divisor to use for the kernel. Must be "n" (the number of events within the radius around each sampling point), "bw" (the bandwidth), or "none" (the simple sum). |
Value
a dataframe with two columns. sum_k is the sum for each sample point of the kernel values. n is the number of events influencing each sample point
Examples
#This is an internal function, no example provided
Simple TNKDE algorithm
Description
Function to perform the simple tnkde.
Usage
simple_tnkde(
graph,
events,
samples,
samples_time,
bws_net,
bws_time,
kernel_func,
nodes,
edges,
div
)
Arguments
graph |
a graph object from igraph representing the network |
events |
a feature collection of points representing the events. It must be snapped on the network, and be nodes of the network. A column vertex_id must indicate for each event its corresponding node |
samples |
a feature collection of points representing the sampling points. The samples must be snapped on the network. A column edge_id must indicate for each sample on which edge it is snapped. |
samples_time |
a numeric vector indicating when the densities must be sampled |
bws_net |
a vector indicating the network kernel bandwidth (in meters) for each event |
bws_time |
a vector indicating the time kernel bandwidth for each event |
kernel_func |
a function obtained with the function select_kernel |
nodes |
a feature collection of points representing the nodes of the network |
edges |
a feature collection of linestrings representing the edges of the network |
div |
The divisor to use for the kernel. Must be "n" (the number of events within the radius around each sampling point), "bw" (the bandwidth), or "none" (the simple sum). |
Value
a list of two matrices. The first one is the matrix of the densities: the rows are the samples and the columns the times. The second has the same dimensions and contains the number of events influencing each sample
Examples
#This is an internal function, no example provided
Simplify a network
Description
Simplify a network by applying two corrections: Healing edges and Removing mirror edges (experimental).
Usage
simplify_network(
lines,
digits = 3,
heal = TRUE,
mirror = TRUE,
keep_shortest = TRUE,
verbose = TRUE
)
Arguments
lines |
A feature collection of linestrings |
digits |
An integer indicating the number of digits to keep in coordinates |
heal |
A boolean indicating if the healing operation must be performed |
mirror |
A boolean indicating if the mirror edges must be removed |
keep_shortest |
A boolean, if TRUE, then the shortest line is kept from mirror edges. If FALSE, then the longest line is kept. |
verbose |
A boolean indicating if messages and a progress bar should be displayed |
Details
Healing is the operation of merging two connected linestrings if they intersect at one extremity and do not intersect any other linestring. It reduces the complexity of the network and can thus reduce calculation time. Removing mirror edges is the operation of removing edges that have the same extremities. If two edges start at the same point and end at the same point, they do not add information to the network, and one can be removed to simplify it. One can decide to keep either the longest or the shortest of the two edges. NOTE: edge healing does not currently consider line directions!
Value
A feature collection of linestrings
Examples
data(mtl_network)
edited_lines <- simplify_network(mtl_network, digits = 3, verbose = FALSE)
Smaller subset road network of Montreal
Description
A feature collection (sf object) representing the road network of Montreal. The EPSG is 3797, and the data comes from the Montreal OpenData website. It is only a small extract in central districts used to demonstrate the main functions of spNetwork. It is mainly used internally for tests.
Usage
small_mtl_network
Format
A sf object with 1244 rows and 2 variables
- TYPE
the type of road
- geom
the geometry (linestrings)
Source
https://donnees.montreal.ca/dataset/geobase
Snap points to lines
Description
Snap points to their nearest lines (edited from maptools)
Usage
snapPointsToLines2(points, lines, idField = NA, ...)
Arguments
points |
A feature collection of points |
lines |
A feature collection of linestrings |
idField |
The name of the column to use as index for the lines |
... |
unused |
Value
A feature collection of points with the projected geometries
Examples
# reading the data
data(mtl_network)
data(bike_accidents)
mtl_network$LineID <- 1:nrow(mtl_network)
# snapping point to lines
snapped_points <- snapPointsToLines2(bike_accidents,
mtl_network,
"LineID"
)
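The geometric step behind snapping, projecting a point onto its nearest position on a segment, can be sketched in base R. The helper project_on_segment is hypothetical and handles a single segment only, whereas the function above searches for the nearest line among all lines.

```r
# Hypothetical helper: orthogonal projection of point p on segment a-b,
# clamped to the segment's extremities
project_on_segment <- function(p, a, b) {
  ab <- b - a
  # position of the projection along the segment, between 0 (at a) and 1 (at b)
  t <- sum((p - a) * ab) / sum(ab * ab)
  t <- max(0, min(1, t))
  a + t * ab
}

snapped_inside <- project_on_segment(c(3, 4), a = c(0, 0), b = c(10, 0))
snapped_before <- project_on_segment(c(-2, 1), a = c(0, 0), b = c(10, 0))
```

The first point falls on the segment at (3, 0); the second lies before the start of the segment and is therefore snapped to the extremity (0, 0).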
Coordinates to unique character vector
Description
Generate a character vector based on a coordinates matrix and the maximum number of digits to keep.
Usage
sp_char_index(coords, digits)
Arguments
coords |
A n * 2 matrix representing the coordinates |
digits |
The number of digits to keep from the coordinates |
Value
A character vector of length n
Examples
#This is an internal function, no example provided
Split boundary of polygon
Description
A function to cut the boundary of the study area into chunks.
Usage
split_border(polygon, bw)
Arguments
polygon |
The polygon representing the study area |
bw |
The maximum bandwidth |
Value
A feature collection of linestrings
Split data with a grid
Description
Function to split the dataset according to a grid.
Usage
split_by_grid(grid, samples, events, lines, bw, tol, digits, split_all = TRUE)
Arguments
grid |
A spatial grid to split the data within |
samples |
A feature collection of points representing the samples points |
events |
A feature collection of points representing the events points |
lines |
A feature collection of linestrings representing the network |
bw |
The kernel bandwidth (used to avoid edge effect) |
tol |
A float indicating the spatial tolerance when snapping events on lines |
digits |
The number of digits to keep |
split_all |
A boolean indicating if we must split the lines at each vertex (TRUE) or only at event vertices (FALSE) |
Value
A list with the split dataset
Examples
#This is an internal function, no example provided
Split data with a grid
Description
Function to split the dataset according to a grid.
Usage
split_by_grid.mc(
grid,
samples,
events,
lines,
bw,
tol,
digits,
split_all = TRUE
)
Arguments
grid |
A spatial grid to split the data within |
samples |
A feature collection of points representing the samples points |
events |
A feature collection of points representing the events points |
lines |
A feature collection of linestrings representing the network |
bw |
The kernel bandwidth (used to avoid edge effect) |
tol |
A float indicating the spatial tolerance when snapping events on lines |
digits |
The number of digits to keep |
split_all |
A boolean indicating if we must split the lines at each vertex (TRUE) or only at event vertices (FALSE) |
Value
A list with the split dataset
Examples
#This is an internal function, no example provided
Split data with a grid for the adaptive bw function
Description
Function to split the dataset according to a grid for the adaptive bw function.
Usage
split_by_grid_abw(grid, events, lines, bw, tol, digits)
Arguments
grid |
A spatial grid to split the data within |
events |
A feature collection of points representing the events |
lines |
A feature collection of lines representing the network |
bw |
The kernel bandwidth (used to avoid edge effect) |
tol |
A float indicating the spatial tolerance when snapping events on lines |
digits |
The number of digits to keep |
Value
A list with the split dataset
Examples
#This is an internal function, no example provided
Split data with a grid for the adaptive bw function (multicore)
Description
Function to split the dataset according to a grid for the adaptive bw function with multicore support
Usage
split_by_grid_abw.mc(grid, events, lines, bw, tol, digits)
Arguments
grid |
A spatial grid to split the data within |
events |
A feature collection of points representing the events points |
lines |
A feature collection of lines representing the network |
bw |
The kernel bandwidth (used to avoid edge effect) |
tol |
A float indicating the spatial tolerance when snapping events on lines |
digits |
The number of digits to keep |
Value
A list with the split dataset
Examples
#This is an internal function, no example provided
Split graph components
Description
Function to split the results of build_graph and build_graph_directed into their sub components
Usage
split_graph_components(graph_result)
Arguments
graph_result |
A list typically obtained from the function build_graph or build_graph_directed |
Value
A list of lists, the graph_result split for each graph component
Examples
data(mtl_network)
mtl_network$length <- as.numeric(sf::st_length(mtl_network))
graph_result <- build_graph(mtl_network, 2, "length", attrs = TRUE)
sub_elements <- split_graph_components(graph_result)
Split lines at vertices in a feature collection of linestrings
Description
Split lines (feature collection of linestrings) at their nearest vertices (feature collection of points), may fail if the line geometries are self intersecting.
Usage
split_lines_at_vertex(lines, points, nearest_lines_idx, mindist)
Arguments
lines |
The feature collection of linestrings to split |
points |
The feature collection of points to add as vertices to the lines |
nearest_lines_idx |
For each point, the index of the nearest line |
mindist |
The minimum distance between one point and the extremity of the line to add the point as a vertex. |
Value
A feature collection of linestrings
Examples
# reading the data
data(mtl_network)
data(bike_accidents)
# aggregating points within a 5 metres radius
bike_accidents$weight <- 1
agg_points <- aggregate_points(bike_accidents, 5)
mtl_network$LineID <- 1:nrow(mtl_network)
# snapping point to lines
snapped_points <- snapPointsToLines2(agg_points,
mtl_network,
"LineID"
)
# splitting lines
new_lines <- split_lines_at_vertex(mtl_network, snapped_points,
snapped_points$nearest_line_id, 1)
Obtain all the bounding boxes of a feature collection
Description
Obtain all the bounding boxes of a feature collection (INTERNAL).
Usage
st_bbox_by_feature(x)
Arguments
x |
a feature collection |
Value
a matrix (xmin, ymin, xmax, ymax)
Examples
#This is an internal function, no example provided
sf geometry bbox
Description
Generate polygon as the bounding box of a feature collection
Usage
st_bbox_geom(x)
Arguments
x |
A feature collection |
Value
A feature collection of polygons
Examples
#This is an internal function, no example provided
Points along polygon boundary
Description
Generate a feature collection of points by placing points along the border of polygons of a feature collection.
Usage
surrounding_points(polygons, dist)
Arguments
polygons |
A feature collection of polygons |
dist |
The distance between the points |
Value
A feature collection of points representing the points around the polygons
Examples
#This is an internal function, no example provided
Temporal Kernel density estimate
Description
Calculate the Temporal kernel density estimate based on sampling points in time and events
Usage
tkde(events, w, samples, bw, kernel_name, adaptive = FALSE)
Arguments
events |
A numeric vector representing the moments of occurrence of events |
w |
The weight of the events |
samples |
A numeric vector representing the moments to sample |
bw |
A float, the bandwidth to use |
kernel_name |
The name of the kernel to use |
adaptive |
Boolean |
Value
A numeric vector with the density values at the requested timestamps
Examples
data(bike_accidents)
bike_accidents$Date <- as.POSIXct(bike_accidents$Date, format = "%Y/%m/%d")
start <- min(bike_accidents$Date)
diff <- as.integer(difftime(bike_accidents$Date , start, units = "days"))
density <- tkde(diff, rep(1,length(diff)), seq(0,max(diff),1), 2, "quartic")
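To make the idea of a temporal kernel density concrete, here is a minimal base R sketch with a quartic kernel and a fixed bandwidth. The functions quartic_kernel and simple_tkde are hypothetical; the kernel scaling (division by the bandwidth) is an assumption, and tkde may standardize its output differently.

```r
# Quartic (biweight) kernel scaled by the bandwidth; zero beyond the bandwidth.
quartic_kernel <- function(d, bw) {
  u <- d / bw
  ifelse(abs(u) <= 1, (15 / 16) * (1 - u^2)^2 / bw, 0)
}

# Hypothetical helper: weighted sum of kernel contributions at each sampled time.
simple_tkde <- function(events, w, samples, bw) {
  vapply(samples,
         function(t) sum(w * quartic_kernel(t - events, bw)),
         numeric(1))
}

events <- c(2, 3, 10)          # moments of occurrence
samples <- 0:12                # moments at which the density is sampled
dens <- simple_tkde(events, w = rep(1, 3), samples = samples, bw = 2)
```

Each element of dens corresponds to one value of samples; sampled moments farther than the bandwidth from every event get a density of 0.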
Temporal Network Kernel density estimate
Description
Calculate the Temporal Network Kernel Density Estimate based on a network of lines, sampling points in space and times, and events in space and time.
Usage
tnkde(
lines,
events,
time_field,
w,
samples_loc,
samples_time,
kernel_name,
bw_net,
bw_time,
adaptive = FALSE,
adaptive_separate = TRUE,
trim_bw_net = NULL,
trim_bw_time = NULL,
method,
div = "bw",
diggle_correction = FALSE,
study_area = NULL,
max_depth = 15,
digits = 5,
tol = 0.1,
agg = NULL,
sparse = TRUE,
grid_shape = c(1, 1),
verbose = TRUE,
check = TRUE
)
Arguments
lines |
A feature collection of linestrings representing the underlying network. The geometries must be simple Linestrings (may crash if some geometries are invalid) without MultiLineString. |
events |
A feature collection of points representing the events on the network. The points will be snapped on the network to their closest line. |
time_field |
The name of the field in events indicating when the events occurred. It must be a numeric field |
w |
A vector representing the weight of each event |
samples_loc |
A feature collection of points representing the locations for which the densities will be estimated. |
samples_time |
A numeric vector indicating when the densities will be sampled |
kernel_name |
The name of the kernel to use. Must be one of triangle, gaussian, tricube, cosine, triweight, quartic, epanechnikov or uniform. |
bw_net |
The network kernel bandwidth (using the scale of the lines), can be a single float or a numeric vector if a different bandwidth must be used for each event. |
bw_time |
The time kernel bandwidth, can be a single float or a numeric vector if a different bandwidth must be used for each event. |
adaptive |
A Boolean, indicating if an adaptive bandwidth must be used. Both spatial and temporal bandwidths are adapted but separately. |
adaptive_separate |
A boolean indicating if the adaptive bandwidths for the time and the network dimensions must be calculated separately (TRUE) or in interaction (FALSE) |
trim_bw_net |
A float, indicating the maximum value for the adaptive network bandwidth |
trim_bw_time |
A float, indicating the maximum value for the adaptive time bandwidth |
method |
The method to use when calculating the NKDE, must be one of simple / discontinuous / continuous (see nkde details for more information) |
div |
The divisor to use for the kernel. Must be "n" (the number of events within the radius around each sampling point), "bw" (the bandwidth) or "none" (the simple sum). |
diggle_correction |
A Boolean indicating if the correction factor for edge effect must be used. |
study_area |
A feature collection of polygons representing the limits of the study area. |
max_depth |
When using the continuous and discontinuous methods, the calculation time and memory use can grow sharply if the network has many small edges (areas with many intersections and many events). To avoid this, a maximum depth can be set here. Considering that the kernel is divided at intersections, a value of 10 should yield good estimates in most cases. A larger value can be used without a problem for the discontinuous method. For the continuous method, a larger value will strongly impact calculation speed. |
digits |
The number of digits to retain from the spatial coordinates. It ensures that topology is good when building the network. Default is 5. Too high a precision (high number of digits) might break some connections. |
tol |
A float indicating the minimum distance between the events and the lines' extremities when adding the point to the network. When points are closer, they are added at the extremity of the lines. |
agg |
A double indicating if the events must be aggregated within a distance. If NULL, the events are aggregated only by rounding the coordinates. |
sparse |
A Boolean indicating if sparse or regular matrices should be used by the Rcpp functions. These matrices are used to store edge indices between two nodes in a graph. Regular matrices are faster, but require more memory, in particular with multiprocessing. Sparse matrices are slower (a bit), but require much less memory. |
grid_shape |
A vector of two values indicating how the study area must be split when performing the calculus. Default is c(1,1) (no split). A finer grid could reduce memory usage and increase speed when a large dataset is used. When using multiprocessing, the work in each grid is dispatched between the workers. |
verbose |
A Boolean, indicating if the function should print messages about the process. |
check |
A Boolean indicating if the geometry checks must be run before the operation. This might take some time, but it will ensure that the CRS of the provided objects are valid and identical, and that geometries are valid. |
Details
Temporal Network Kernel Density Estimate
The TNKDE is an extension of the NKDE considering both the location of events on the network and
in time. Thus, density estimation (density sampling) can be done along the lines of the network and
at different times. It can be used with the three NKDE methods (simple, discontinuous and continuous).
density in time and space
Two bandwidths must be provided, one for the network distance and one for the
time distance. They are both used to calculate the contribution of each event
to each sampling point. Let us consider one event E and a sample S. dnet(E,S)
is the contribution to network density of E at S location and dtime(E,S) is
the contribution to time density of E at S time. The total contribution is
thus dnet(E,S) * dtime(E,S). If one of the two densities is 0, then the total
density is 0 because the sampling point is out of the covered area by the
event in time or in the network space.
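The product rule above can be sketched in a few lines of R. This is only an illustration: the function quartic, its scaling, and the distances below are hypothetical, and the network part is reduced to a plain distance rather than one of the three NKDE methods.

```r
# Quartic kernel scaled by the bandwidth; zero beyond the bandwidth.
quartic <- function(d, bw) {
  u <- d / bw
  ifelse(abs(u) <= 1, (15 / 16) * (1 - u^2)^2 / bw, 0)
}

# Hypothetical distances between one event E and three samples S.
d_net  <- c(100, 400, 800)  # network distances (same unit as the lines)
d_time <- c(5, 70, 10)      # time distances (same unit as the time field)

dnet  <- quartic(d_net,  bw = 700)  # contribution to network density
dtime <- quartic(d_time, bw = 60)   # contribution to time density

# Total contribution of E to each sample.
contrib <- dnet * dtime
```

The second sample is outside the time bandwidth and the third is outside the network bandwidth, so both receive a total contribution of 0 even though one of their two factors is positive.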
adaptive bandwidth
It is possible to use an adaptive bandwidth both on the network and in time.
Adaptive bandwidths are calculated using the Abramson’s smoothing regimen
(Abramson 1982). To do so, the original fixed
bandwidths must be specified (bw_net and bw_time parameters).
The maximum size of the two local bandwidths can be limited with
the parameters trim_bw_net and trim_bw_time.
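As a sketch of how local bandwidths can be derived with Abramson's rule, the code below scales a fixed bandwidth by the inverse square root of a pilot density, standardized by its geometric mean, then applies a trimming value. The function abramson_bw and the pilot densities are hypothetical, and the package may compute the pilot densities and the standardization differently.

```r
# Hypothetical helper: local bandwidths from pilot densities at the events.
abramson_bw <- function(pilot_density, bw0, trim = NULL) {
  inv_sqrt <- 1 / sqrt(pilot_density)
  # Geometric mean of the inverse square-root densities.
  gamma <- exp(mean(log(inv_sqrt)))
  bws <- bw0 * inv_sqrt / gamma
  # Optional upper limit, in the spirit of trim_bw_net / trim_bw_time.
  if (!is.null(trim)) bws <- pmin(bws, trim)
  bws
}

# Pilot densities estimated with the fixed bandwidth (made-up values).
pilot <- c(0.04, 0.01, 0.0025)

bws     <- abramson_bw(pilot, bw0 = 700)
trimmed <- abramson_bw(pilot, bw0 = 700, trim = 900)
```

Events located in dense areas receive a bandwidth smaller than the fixed one, and events in sparse areas a larger one, up to the trimming value.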
Diggle correction factor
A set of events can be limited in both space (limits of the study
area) and time (beginning and ending of the data collection period). These
limits induce lower densities at the border of the set of events, because
they are not sampled outside the limits. It is possible to apply the Diggle
correction factor (Diggle 1985) in both the
network and time spaces to minimize this effect.
Separated or simultaneous adaptive bandwidth
When the parameter adaptive is TRUE, one can choose between using separated
calculation of network and temporal bandwidths, and calculating them
simultaneously. In the first case (default), the network bandwidths are
determined for each event by considering only their locations and the time
bandwidths are determined by considering only their timestamps. In the second
case, for each event, the spatio-temporal density at its location on the
network and in time is estimated and used to determine both the network and
temporal bandwidths. This second approach must be preferred if the events are
characterized by a high level of spatio-temporal autocorrelation.
Value
A matrix with the estimated density for each sample point (rows) at each timestamp (columns). If adaptive = TRUE, the function returns a list with two slots: k (the matrix with the density values) and events (a feature collection of points with the local bandwidths).
Examples
# loading the data
data(mtl_network)
data(bike_accidents)
# converting the Date field to a numeric field (counting days)
bike_accidents$Time <- as.POSIXct(bike_accidents$Date, format = "%Y/%m/%d")
start <- as.POSIXct("2016/01/01", format = "%Y/%m/%d")
bike_accidents$Time <- difftime(bike_accidents$Time, start, units = "days")
bike_accidents$Time <- as.numeric(bike_accidents$Time)
# creating sample points
lixels <- lixelize_lines(mtl_network, 50)
sample_points <- lines_center(lixels)
# choosing sample in times (every 10 days)
sample_time <- seq(0, max(bike_accidents$Time), 10)
# calculating the densities
tnkde_densities <- tnkde(lines = mtl_network,
events = bike_accidents, time_field = "Time",
w = rep(1, nrow(bike_accidents)),
samples_loc = sample_points,
samples_time = sample_time,
kernel_name = "quartic",
bw_net = 700, bw_time = 60, adaptive = TRUE,
trim_bw_net = 900, trim_bw_time = 80,
method = "discontinuous", div = "bw",
max_depth = 10, digits = 2, tol = 0.01,
agg = 15, grid_shape = c(1,1),
verbose = FALSE)
Temporal Network Kernel density estimate (multicore)
Description
Calculate the Temporal Network Kernel Density Estimate based on a network of lines, sampling points in space and times, and events in space and time with multicore support.
Usage
tnkde.mc(
lines,
events,
time_field,
w,
samples_loc,
samples_time,
kernel_name,
bw_net,
bw_time,
adaptive = FALSE,
adaptive_separate = TRUE,
trim_bw_net = NULL,
trim_bw_time = NULL,
method,
div = "bw",
diggle_correction = FALSE,
study_area = NULL,
max_depth = 15,
digits = 5,
tol = 0.1,
agg = NULL,
sparse = TRUE,
grid_shape = c(1, 1),
verbose = TRUE,
check = TRUE
)
Arguments
lines |
A feature collection of linestrings representing the underlying network. The geometries must be simple Linestrings (may crash if some geometries are invalid) without MultiLineString. |
events |
A feature collection of points representing the events on the network. The points will be snapped on the network to their closest line. |
time_field |
The name of the field in events indicating when the events occurred. It must be a numeric field |
w |
A vector representing the weight of each event |
samples_loc |
A feature collection of points representing the locations for which the densities will be estimated. |
samples_time |
A numeric vector indicating when the densities will be sampled |
kernel_name |
The name of the kernel to use. Must be one of triangle, gaussian, tricube, cosine, triweight, quartic, epanechnikov or uniform. |
bw_net |
The network kernel bandwidth (using the scale of the lines), can be a single float or a numeric vector if a different bandwidth must be used for each event. |
bw_time |
The time kernel bandwidth, can be a single float or a numeric vector if a different bandwidth must be used for each event. |
adaptive |
A Boolean, indicating if an adaptive bandwidth must be used. Both spatial and temporal bandwidths are adapted but separately. |
adaptive_separate |
A boolean indicating if the adaptive bandwidths for the time and the network dimensions must be calculated separately (TRUE) or in interaction (FALSE) |
trim_bw_net |
A float, indicating the maximum value for the adaptive network bandwidth |
trim_bw_time |
A float, indicating the maximum value for the adaptive time bandwidth |
method |
The method to use when calculating the NKDE, must be one of simple / discontinuous / continuous (see nkde details for more information) |
div |
The divisor to use for the kernel. Must be "n" (the number of events within the radius around each sampling point), "bw" (the bandwidth) or "none" (the simple sum). |
diggle_correction |
A Boolean indicating if the correction factor for edge effect must be used. |
study_area |
A feature collection of polygons representing the limits of the study area. |
max_depth |
When using the continuous and discontinuous methods, the calculation time and memory use can grow sharply if the network has many small edges (areas with many intersections and many events). To avoid this, a maximum depth can be set here. Considering that the kernel is divided at intersections, a value of 10 should yield good estimates in most cases. A larger value can be used without a problem for the discontinuous method. For the continuous method, a larger value will strongly impact calculation speed. |
digits |
The number of digits to retain from the spatial coordinates. It ensures that topology is good when building the network. Default is 5. Too high a precision (high number of digits) might break some connections. |
tol |
A float indicating the minimum distance between the events and the lines' extremities when adding the point to the network. When points are closer, they are added at the extremity of the lines. |
agg |
A double indicating if the events must be aggregated within a distance. If NULL, the events are aggregated only by rounding the coordinates. |
sparse |
A Boolean indicating if sparse or regular matrices should be used by the Rcpp functions. These matrices are used to store edge indices between two nodes in a graph. Regular matrices are faster, but require more memory, in particular with multiprocessing. Sparse matrices are slower (a bit), but require much less memory. |
grid_shape |
A vector of two values indicating how the study area must be split when performing the calculus. Default is c(1,1) (no split). A finer grid could reduce memory usage and increase speed when a large dataset is used. When using multiprocessing, the work in each grid is dispatched between the workers. |
verbose |
A Boolean, indicating if the function should print messages about the process. |
check |
A Boolean indicating if the geometry checks must be run before the operation. This might take some time, but it will ensure that the CRS of the provided objects are valid and identical, and that geometries are valid. |
Details
For details, see help(tnkde) and help(nkde)
Value
A matrix with the estimated density for each sample point (rows) at each timestamp (columns). If adaptive = TRUE, the function returns a list with two slots: k (the matrix with the density values) and events (a feature collection of points with the local bandwidths).
Examples
# loading the data
data(mtl_network)
data(bike_accidents)
# converting the Date field to a numeric field (counting days)
bike_accidents$Time <- as.POSIXct(bike_accidents$Date, format = "%Y/%m/%d")
start <- as.POSIXct("2016/01/01", format = "%Y/%m/%d")
bike_accidents$Time <- difftime(bike_accidents$Time, start, units = "days")
bike_accidents$Time <- as.numeric(bike_accidents$Time)
# creating sample points
lixels <- lixelize_lines(mtl_network, 50)
sample_points <- lines_center(lixels)
# choosing sample in times (every 10 days)
sample_time <- seq(0, max(bike_accidents$Time), 10)
future::plan(future::multisession(workers=1))
# calculating the densities
tnkde_densities <- tnkde.mc(lines = mtl_network,
events = bike_accidents, time_field = "Time",
w = rep(1, nrow(bike_accidents)),
samples_loc = sample_points,
samples_time = sample_time,
kernel_name = "quartic",
bw_net = 700, bw_time = 60, adaptive = TRUE,
trim_bw_net = 900, trim_bw_time = 80,
method = "discontinuous", div = "bw",
max_depth = 10, digits = 2, tol = 0.01,
agg = 15, grid_shape = c(1,1),
verbose = FALSE)
## make sure any open connections are closed afterward
if (!inherits(future::plan(), "sequential")) future::plan(future::sequential)
The exposed function to calculate TNKDE likelihood cv
Description
The exposed function to calculate TNKDE likelihood cv (INTERNAL)
Usage
tnkde_get_loo_values(
method,
neighbour_list,
sel_events,
sel_events_wid,
sel_events_time,
events,
events_wid,
events_time,
weights,
bws_net,
bws_time,
kernel_name,
line_list,
max_depth,
min_tol
)
Arguments
method |
a string, one of "simple", "continuous", "discontinuous" |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
sel_events |
a Numeric vector indicating the selected events (id of nodes) |
sel_events_wid |
a Numeric Vector indicating the unique id of the selected events |
sel_events_time |
a Numeric Vector indicating the time of the selected events |
events |
a NumericVector indicating the nodes in the graph being events |
events_wid |
a NumericVector indicating the unique id of all the events |
events_time |
a NumericVector indicating the timestamp of each event |
weights |
a cube with the weights associated with each event for each bws_net and bws_time. |
bws_net |
an arma::vec with the network bandwidths to consider |
bws_time |
an arma::vec with the time bandwidths to consider |
kernel_name |
a string with the name of the kernel to use |
line_list |
a DataFrame describing the lines |
max_depth |
the maximum recursion depth |
min_tol |
a double indicating by how much 0 in density values must be replaced |
Value
a matrix with the CV score for each pair of bandwidths
Examples
# no example provided, this is an internal function
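The general principle of a leave-one-out likelihood score can be sketched in one dimension (time only). This is a simplified illustration, not the internal algorithm: the function loo_score, the division by n - 1 and the toy data are assumptions. It shows the role of min_tol, which replaces null densities so that their logarithm stays finite.

```r
# Quartic kernel scaled by the bandwidth; zero beyond the bandwidth.
quartic <- function(d, bw) {
  u <- d / bw
  ifelse(abs(u) <= 1, (15 / 16) * (1 - u^2)^2 / bw, 0)
}

# Hypothetical helper: leave-one-out log-likelihood for one bandwidth.
loo_score <- function(events, bw, min_tol = 1e-10) {
  n <- length(events)
  loo_dens <- vapply(seq_len(n), function(i) {
    # Density at event i estimated from all the other events.
    sum(quartic(events[i] - events[-i], bw)) / (n - 1)
  }, numeric(1))
  # Null densities are replaced by min_tol before taking the logarithm.
  loo_dens[loo_dens == 0] <- min_tol
  sum(log(loo_dens))
}

events <- c(1, 2, 2.5, 4, 9)
bws <- c(0.5, 1, 2, 4, 8)

scores  <- vapply(bws, function(bw) loo_score(events, bw), numeric(1))
best_bw <- bws[which.max(scores)]
```

With two dimensions (network and time), the same score would be computed for every pair of bandwidths, giving a matrix of scores instead of a vector.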
The exposed function to calculate TNKDE likelihood cv
Description
The exposed function to calculate TNKDE likelihood cv (INTERNAL) when an adaptive bandwidth is used
Usage
tnkde_get_loo_values2(
method,
neighbour_list,
sel_events,
sel_events_wid,
sel_events_time,
events,
events_wid,
events_time,
weights,
bws_net,
bws_time,
kernel_name,
line_list,
max_depth,
min_tol
)
Arguments
method |
a string, one of "simple", "continuous", "discontinuous" |
neighbour_list |
a List, giving for each node an IntegerVector with its neighbours |
sel_events |
a Numeric vector indicating the selected events (id of nodes) |
sel_events_wid |
a Numeric Vector indicating the unique id of the selected events |
sel_events_time |
a Numeric Vector indicating the time of the selected events |
events |
a NumericVector indicating the nodes in the graph being events |
events_wid |
a NumericVector indicating the unique id of all the events |
events_time |
a NumericVector indicating the timestamp of each event |
weights |
a cube with the weights associated with each event for each bws_net and bws_time. |
bws_net |
an arma::cube of three dimensions with the network bandwidths calculated for each observation for each global time and network bandwidths |
bws_time |
an arma::cube of three dimensions with the time bandwidths calculated for each observation for each global time and network bandwidths |
kernel_name |
a string with the name of the kernel to use |
line_list |
a DataFrame describing the lines |
max_depth |
the maximum recursion depth |
min_tol |
a double indicating by how much 0 in density values must be replaced |
Value
a matrix with the CV score for each pair of global bandwidths
Examples
# no example provided, this is an internal function
TNKDE worker
Description
The worker function for tnkde and tnkde.mc
Usage
tnkde_worker(
lines,
events_loc,
events,
samples_loc,
samples_time,
kernel_name,
bw_net,
bw_time,
bws_net,
bws_time,
method,
div,
digits,
tol,
sparse,
max_depth,
verbose = FALSE
)
Arguments
lines |
A feature collection of linestrings with the sampling points. The geometries must be simple Linestrings (may crash if some geometries are invalid) |
events_loc |
A feature collection of points representing the aggregated events on the network. The points will be snapped on the network. |
events |
A feature collection of points representing the base events on the network |
samples_loc |
A feature collection of points representing the locations for which the densities will be estimated. |
samples_time |
A numeric vector representing when each density will be estimated |
kernel_name |
The name of the kernel to use |
bw_net |
The global network kernel bandwidth |
bw_time |
The global time kernel bandwidth |
bws_net |
The network kernel bandwidth (in meters) for each event |
bws_time |
The time bandwidth for each event |
method |
The method to use when calculating the NKDE, must be one of simple / discontinuous / continuous (see details for more information) |
div |
The divisor to use for the kernel. Must be "n" (the number of events within the radius around each sampling point), "bw" (the bandwidth) or "none" (the simple sum). |
digits |
The number of digits to keep in the spatial coordinates. It ensures that topology is good when building the network. Default is 3 |
tol |
When adding the events and the sampling points to the network, the minimum distance between these points and the lines extremities. When points are closer, they are added at the extremity of the lines. |
sparse |
A Boolean indicating if sparse or regular matrices should be used by the Rcpp functions. Regular matrices are faster, but require more memory and could lead to error, in particular with multiprocessing. Sparse matrices are slower, but require much less memory. |
max_depth |
When using the continuous and discontinuous methods, the calculation time and memory use can grow sharply if the network has a lot of small edges (areas with a lot of intersections and a lot of events). To avoid this, a maximum depth can be set here. Considering that the kernel is divided at intersections, a value of 8 should yield good estimates. A larger value can be used without problem for the discontinuous method. For the continuous method, a larger value will strongly impact calculation speed. |
verbose |
A Boolean, indicating if the function should print messages about the process. |
Value
A numeric matrix with the nkde values
Examples
#This is an internal function, no example provided
Worker function for bandwidth selection by likelihood cross validation for temporal NKDE
Description
Calculate for multiple network and time bandwidths the cross validation likelihood to select an appropriate bandwidth in a data-driven approach (INTERNAL)
Usage
tnkde_worker_bw_sel(
lines,
quad_events,
events_loc,
events,
w,
kernel_name,
bws_net,
bws_time,
method,
div,
digits,
tol,
sparse,
max_depth,
verbose = FALSE,
cvl = FALSE
)
Arguments
lines |
A feature collection of linestrings representing the underlying network |
quad_events |
a feature collection of points indicating for which events the densities must be calculated |
events_loc |
A feature collection of points representing the location of the events |
events |
A feature collection of points representing the events. Multiple events can share the same location. They are linked by the goid column |
w |
A numeric array with the weight of the events for each pair of bandwidth |
kernel_name |
The name of the kernel to use (string) |
bws_net |
A numeric vector with the network bandwidths. Could also be an array if an adaptive bandwidth is calculated. |
bws_time |
A numeric vector with the time bandwidths. Could also be an array if an adaptive bandwidth is calculated. |
method |
The type of NKDE to use (string) |
div |
The type of divisor (not used currently) |
digits |
The number of digits to retain from the spatial coordinates. It ensures that topology is good when building the network. Default is 3. Too high a precision (high number of digits) might break some connections |
tol |
A float indicating the minimum distance between the events and the lines' extremities when adding the point to the network. When points are closer, they are added at the extremity of the lines. |
sparse |
A Boolean indicating if sparse or regular matrices should be used by the Rcpp functions. These matrices are used to store edge indices between two nodes in a graph. Regular matrices are faster, but require more memory, in particular with multiprocessing. Sparse matrices are slower (a bit), but require much less memory. |
max_depth |
The maximum depth of recursion |
verbose |
A boolean |
cvl |
A boolean indicating if the cvl method (TRUE) or the loo (FALSE) method must be used |
Value
An array with the CV score for each pair of bandwidths (rows and columns) for each event (slices)
Examples
# no example provided, this is an internal function
The main function to calculate continuous TNKDE (with Armadillo and sparse matrix)
Description
The main function to calculate continuous TNKDE (with Armadillo and sparse matrix)
The main function to calculate continuous TNKDE (with Armadillo and integer matrix)
Usage
continuous_tnkde_cpp_arma_sparse(
neighbour_list,
events,
events_time,
weights,
samples,
samples_time,
bws_net,
bws_time,
kernel_name,
nodes,
line_list,
max_depth,
verbose,
div
)
continuous_tnkde_cpp_arma(
neighbour_list,
events,
events_time,
weights,
samples,
samples_time,
bws_net,
bws_time,
kernel_name,
nodes,
line_list,
max_depth,
verbose,
div
)
Arguments
neighbour_list |
a list of the neighbours of each node |
events |
a numeric vector of the node id of each event |
events_time |
a numeric vector with the time for the events |
weights |
a numeric vector of the weight of each event |
samples |
a DataFrame of the samples (with spatial coordinates and belonging edge) |
samples_time |
a NumericVector indicating when to do the samples |
bws_net |
the network kernel bandwidths for each event |
bws_time |
the time kernel bandwidths for each event |
kernel_name |
the name of the kernel to use |
nodes |
a DataFrame representing the nodes of the graph (with spatial coordinates) |
line_list |
a DataFrame representing the lines of the graph |
max_depth |
the maximum recursion depth (after which recursion is stopped) |
verbose |
a boolean indicating if the function must print its progress |
div |
a string indicating how to standardize the kernel values |
Value
a List with two matrices: the kernel values (sum_k) and the number of events for each sample (n)
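The space-time product that underlies a TNKDE can be sketched in plain R for a single sample point. This is only an illustration under simplifying assumptions: the network distances are taken as given, the correction applied at intersections by the continuous method is ignored, and `quartic` and `tnkde_at_sample` are hypothetical helpers, not functions of the package.

```r
# Hypothetical quartic kernel, scaled by 1 / bw (not the package's code)
quartic <- function(d, bw) {
  u <- abs(d) / bw
  k <- (15 / 16) * (1 - u^2)^2 / bw
  k[u > 1] <- 0          # no contribution beyond the bandwidth
  k
}

# Density at one sample point: sum over events of
# weight * network kernel * time kernel
tnkde_at_sample <- function(net_dist, time_dist, weights, bws_net, bws_time) {
  k_net  <- quartic(net_dist, bws_net)
  k_time <- quartic(time_dist, bws_time)
  c(sum_k = sum(weights * k_net * k_time),
    n     = sum(k_net > 0 & k_time > 0))  # events actually reaching the sample
}

res <- tnkde_at_sample(
  net_dist  = c(0, 150, 400),   # network distances to three events
  time_dist = c(0, 5, 2),       # time differences to the same events
  weights   = c(1, 1, 2),
  bws_net   = rep(300, 3),
  bws_time  = rep(10, 3)
)
print(res)
```

The third event lies beyond the network bandwidth, so it adds nothing to `sum_k` and is not counted in `n`.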
The main function to calculate discontinuous TNKDE (Armadillo and integer matrix)
Description
The main function to calculate discontinuous TNKDE (Armadillo and integer matrix)
Usage
discontinuous_tnkde_cpp_arma(
neighbour_list,
events,
weights,
events_time,
samples,
samples_time,
bws_net,
bws_time,
kernel_name,
nodes,
line_list,
max_depth,
verbose,
div = "bw"
)
Arguments
neighbour_list |
a list of the neighbours of each node |
events |
a numeric vector of the node id of each event |
weights |
a numeric vector of the weight of each event |
events_time |
a numeric vector with the time for the events |
samples |
a DataFrame of the samples (with spatial coordinates and belonging edge) |
samples_time |
a NumericVector indicating when to do the samples |
bws_net |
the network kernel bandwidths for each event |
bws_time |
the time kernel bandwidths for each event |
kernel_name |
the name of the kernel function to use |
nodes |
a DataFrame representing the nodes of the graph (with spatial coordinates) |
line_list |
a DataFrame representing the lines of the graph |
max_depth |
the maximum recursion depth (after which recursion is stopped) |
verbose |
a boolean indicating if the function must print its progress |
div |
a string indicating how to standardize the kernel values |
Value
a List with two matrices: the kernel values (sum_k) and the number of events for each sample (n)
The main function to calculate discontinuous TNKDE (Armadillo and sparse matrix)
Description
The main function to calculate discontinuous TNKDE (Armadillo and sparse matrix)
Usage
discontinuous_tnkde_cpp_arma_sparse(
neighbour_list,
events,
weights,
events_time,
samples,
samples_time,
bws_net,
bws_time,
kernel_name,
nodes,
line_list,
max_depth,
verbose,
div = "bw"
)
Arguments
neighbour_list |
a list of the neighbours of each node |
events |
a numeric vector of the node id of each event |
weights |
a numeric vector of the weight of each event |
events_time |
a numeric vector with the time for the events |
samples |
a DataFrame of the samples (with spatial coordinates and belonging edge) |
samples_time |
a NumericVector indicating when to do the samples |
bws_net |
the network kernel bandwidths for each event |
bws_time |
the time kernel bandwidths for each event |
kernel_name |
the name of the kernel function to use |
nodes |
a DataFrame representing the nodes of the graph (with spatial coordinates) |
line_list |
a DataFrame representing the lines of the graph |
max_depth |
the maximum recursion depth (after which recursion is stopped) |
verbose |
a boolean indicating if the function must print its progress |
div |
a string indicating how to standardize the kernel values |
Value
a List with two matrices: the kernel values (sum_k) and the number of events for each sample (n)
triangle kernel
Description
Function implementing the triangle kernel.
Usage
triangle_kernel(d, bw)
Arguments
d |
The distance from the event |
bw |
The bandwidth used for the kernel |
Value
The estimated density
Examples
#This is an internal function, no example provided
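For orientation, the textbook triangle kernel can be written in a few lines of R. The sketch assumes the usual definition K(u) = 1 - |u| with u = d / bw, scaled by 1 / bw; `triangle_sketch` is a hypothetical stand-in and may differ in detail from the internal function.

```r
# Hypothetical re-implementation of the textbook triangle kernel
triangle_sketch <- function(d, bw) {
  u <- abs(d) / bw
  k <- (1 - u) / bw      # linear decay, scaled so the kernel integrates to 1
  k[u > 1] <- 0          # zero outside the bandwidth
  k
}

# Density decreases linearly and reaches zero at d = bw
print(triangle_sketch(c(0, 50, 100, 150), bw = 100))
```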
C++ triangle kernel
Description
C++ triangle kernel
Usage
triangle_kernel_cpp(d, bw)
Arguments
d |
a vector of distances for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
C++ triangle kernel for one distance
Description
C++ triangle kernel for one distance
Usage
triangle_kernelos(d, bw)
Arguments
d |
a double, the distance for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
Tricube kernel
Description
Function implementing the tricube kernel.
Usage
tricube_kernel(d, bw)
Arguments
d |
The distance from the event |
bw |
The bandwidth used for the kernel |
Value
The estimated density
Examples
#This is an internal function, no example provided
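A minimal sketch of the tricube kernel, assuming the standard form K(u) = (70 / 81)(1 - |u|^3)^3 with u = d / bw and a 1 / bw scaling. The name `tricube_sketch` is hypothetical and the internal function may be written differently.

```r
# Hypothetical re-implementation of the textbook tricube kernel
tricube_sketch <- function(d, bw) {
  u <- abs(d) / bw
  k <- (70 / 81) * (1 - u^3)^3 / bw
  k[u > 1] <- 0          # zero outside the bandwidth
  k
}

# The kernel is flat near the event and drops quickly close to the bandwidth
print(tricube_sketch(c(0, 25, 50, 75, 100), bw = 100))
```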
C++ tricube kernel
Description
C++ tricube kernel
Usage
tricube_kernel_cpp(d, bw)
Arguments
d |
a vector of distances for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
C++ tricube kernel for one distance
Description
C++ tricube kernel for one distance
Usage
tricube_kernelos(d, bw)
Arguments
d |
a double, the distance for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
Helper for cutting isochrone lines
Description
Last operation of the isochrone calculation: cutting the lines at their beginning and end. This is a worker function for calc_isochrones.
Usage
trim_lines_at(df1, graph_result, d, dd, i, donught)
Arguments
df1 |
A features collection of linestrings with some specific fields. |
graph_result |
A list produced by the functions build_graph_directed or build_graph. |
d |
the end distance of this isochrone. |
dd |
the start distance of this isochrone. |
i |
the current iteration. |
donught |
A boolean indicating if the returned isochrone will be full or a donut. |
Value
A feature collection of lines
Triweight kernel
Description
Function implementing the triweight kernel.
Usage
triweight_kernel(d, bw)
Arguments
d |
The distance from the event |
bw |
The bandwidth used for the kernel |
Value
The estimated density
Examples
#This is an internal function, no example provided
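The triweight kernel is commonly defined as K(u) = (35 / 32)(1 - u^2)^3 with u = d / bw. The following sketch assumes that definition and a 1 / bw scaling; `triweight_sketch` is a hypothetical name used for illustration only.

```r
# Hypothetical re-implementation of the textbook triweight kernel
triweight_sketch <- function(d, bw) {
  u <- d / bw
  k <- (35 / 32) * (1 - u^2)^3 / bw
  k[abs(u) > 1] <- 0     # zero outside the bandwidth
  k
}

# Compare the density at the event, at half the bandwidth and beyond it
print(triweight_sketch(c(0, 50, 120), bw = 100))
```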
C++ triweight kernel
Description
C++ triweight kernel
Usage
triweight_kernel_cpp(d, bw)
Arguments
d |
a vector of distances for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
C++ triweight kernel for one distance
Description
C++ triweight kernel for one distance
Usage
triweight_kernelos(d, bw)
Arguments
d |
a double, the distance for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
Uniform kernel
Description
Function implementing the uniform kernel.
Usage
uniform_kernel(d, bw)
Arguments
d |
The distance from the event |
bw |
The bandwidth used for the kernel |
Value
The estimated density
Examples
#This is an internal function, no example provided
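As a point of comparison with the other kernels, the uniform kernel gives every event within the bandwidth the same density, 1 / (2 * bw). The sketch below assumes this textbook definition; `uniform_sketch` is a hypothetical name and not the internal function itself.

```r
# Hypothetical re-implementation of the textbook uniform kernel
uniform_sketch <- function(d, bw) {
  k <- rep(1 / (2 * bw), length(d))  # constant density inside the bandwidth
  k[abs(d) > bw] <- 0                # zero outside the bandwidth
  k
}

# Same value for all distances up to bw, then zero
print(uniform_sketch(c(0, 60, 100, 101), bw = 100))
```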
C++ uniform kernel
Description
C++ uniform kernel
Usage
uniform_kernel_cpp(d, bw)
Arguments
d |
a vector of distances for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
C++ uniform kernel for one distance
Description
C++ uniform kernel for one distance
Usage
uniform_kernelos(d, bw)
Arguments
d |
a double, the distance for which the density must be calculated |
bw |
a double representing the size of the kernel bandwidth |
Worker function for adaptive bandwidth for TNKDE
Description
The worker function to calculate adaptive bandwidths according to Abramson’s smoothing regimen for TNKDE with a space-time interaction (INTERNAL).
Usage
worker_adaptive_bw_tnkde(
lines,
quad_events,
events_loc,
events,
w,
kernel_name,
bw_net,
bw_time,
method,
div,
digits,
tol,
sparse,
max_depth,
verbose = FALSE
)
Arguments
lines |
A feature collection of linestrings representing the underlying network |
quad_events |
a feature collection of points indicating for which events the densities must be calculated |
events_loc |
A feature collection of points representing the location of the events |
events |
A feature collection of points representing the events. Multiple events can share the same location. They are linked by the goid column |
w |
A numeric vector with the weight of the events |
kernel_name |
The name of the kernel to use (string) |
bw_net |
The fixed kernel bandwidth for the network dimension. Can also be a vector if several bandwidths must be used. |
bw_time |
The fixed kernel bandwidth for the time dimension. Can also be a vector if several bandwidths must be used. |
method |
The type of NKDE to use (string) |
div |
The divisor to use for the kernel. Must be "n" (the number of events within the radius around each sampling point), "bw" (the bandwidth), or "none" (the simple sum). |
digits |
The number of digits to retain from the spatial coordinates. It ensures that topology is good when building the network. Default is 3. Too high a precision (high number of digits) might break some connections. |
tol |
A float indicating the minimum distance between the events and the lines' extremities when adding the point to the network. When points are closer, they are added at the extremity of the lines. |
sparse |
A Boolean indicating if sparse or regular matrices should be used by the Rcpp functions. These matrices are used to store edge indices between two nodes in a graph. Regular matrices are faster, but require more memory, in particular with multiprocessing. Sparse matrices are slightly slower, but require much less memory. |
max_depth |
An integer, the maximum depth to reach for continuous and discontinuous NKDE |
verbose |
A Boolean, indicating if the function should print messages about the process. |
Value
A vector with the local bandwidths or an array if bw_net and bw_time are vectors
Examples
#This is an internal function, no example provided
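Abramson's rule rescales a fixed bandwidth by the inverse square root of a pilot density, divided by the geometric mean of these inverse square roots. The sketch below shows that rule in isolation, assuming the pilot densities at the events are already available; `abramson_bw` is a hypothetical helper and the worker may organize the calculation differently (for a TNKDE, the same rescaling would presumably be applied to both bw_net and bw_time).

```r
# Hypothetical helper illustrating Abramson's smoothing regimen
abramson_bw <- function(pilot_density, bw) {
  stopifnot(all(pilot_density > 0))
  inv_sqrt <- 1 / sqrt(pilot_density)
  gamma <- exp(mean(log(inv_sqrt)))  # geometric mean of the inverse square roots
  bw * inv_sqrt / gamma              # one local bandwidth per event
}

pilot <- c(0.002, 0.008, 0.0005)     # assumed pilot densities at three events
local_bws <- abramson_bw(pilot, bw = 300)
print(local_bws)
```

Events in dense areas receive a bandwidth smaller than the fixed one and events in sparse areas a larger one, while the geometric mean of the local bandwidths stays equal to the fixed bandwidth.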