Type: | Package |
Title: | Fit Vector Fields and Potential Landscapes from Intensive Longitudinal Data |
Version: | 0.1.0 |
Description: | A toolbox for estimating vector fields from intensive longitudinal data and constructing potential landscapes thereafter. The vector fields can be estimated with two nonparametric methods: the Multivariate Vector Field Kernel Estimator (MVKE) by Bandi & Moloche (2018) <doi:10.1017/S0266466617000305> and the Sparse Vector Field Consensus (SparseVFC) algorithm by Ma et al. (2013) <doi:10.1016/j.patcog.2013.05.017>. The potential landscapes can be constructed with a simulation-based approach with the 'simlandr' package (Cui et al., 2021) <doi:10.31234/osf.io/pzva3>, or the Bhattacharya et al. (2011) method for path integration <doi:10.1186/1752-0509-5-85>. |
License: | GPL (≥ 3) |
URL: | https://sciurus365.github.io/fitlandr/, https://github.com/Sciurus365/fitlandr |
BugReports: | https://github.com/Sciurus365/fitlandr/issues |
Imports: | cli, dplyr, furrr, future.apply, ggplot2, glue, grDevices, grid, magrittr, MASS, numDeriv, plotly, R.utils, Rfast, rlang, rootSolve, simlandr (≥ 0.3.0), SparseVFC, tidyr |
Suggests: | akima, colorRamps, future |
Encoding: | UTF-8 |
RoxygenNote: | 7.2.2 |
NeedsCompilation: | no |
Packaged: | 2023-02-09 10:01:08 UTC; jingm |
Author: | Jingmeng Cui |
Maintainer: | Jingmeng Cui <jingmeng.cui@outlook.com> |
Repository: | CRAN |
Date/Publication: | 2023-02-10 10:40:02 UTC |
fitlandr: Fit Vector Fields and Potential Landscapes from Intensive Longitudinal Data
Description
A toolbox for estimating vector fields from intensive longitudinal data and constructing potential landscapes thereafter. The vector fields can be estimated with two nonparametric methods: the Multivariate Vector Field Kernel Estimator (MVKE) by Bandi & Moloche (2018) doi:10.1017/S0266466617000305 and the Sparse Vector Field Consensus (SparseVFC) algorithm by Ma et al. (2013) doi:10.1016/j.patcog.2013.05.017. The potential landscapes can be constructed with a simulation-based approach with the 'simlandr' package (Cui et al., 2021) doi:10.31234/osf.io/pzva3, or the Bhattacharya et al. (2011) method for path integration doi:10.1186/1752-0509-5-85.
Author(s)
Maintainer: Jingmeng Cui jingmeng.cui@outlook.com (ORCID)
See Also
Useful links:
Report bugs at https://github.com/Sciurus365/fitlandr/issues
Pipe operator
Description
See magrittr::%>% for details.
Usage
lhs %>% rhs
Arguments
lhs |
A value or the magrittr placeholder. |
rhs |
A function call using the magrittr semantics. |
Value
The result of calling rhs(lhs).
Multivariate vector field kernel estimator
Description
See references for details.
Usage
MVKE(d, h = 0.2, kernel = c("exp", "Gaussian"))
Arguments
d |
The dataset. Should be a matrix or a data frame, with each row representing a random vector. |
h |
The bandwidth for the kernel estimator. |
kernel |
The type of kernel estimator used. "exp" by default ( |
Value
A function(x), which then returns the \mu and a estimators at the position x.
References
Bandi, F. M., & Moloche, G. (2018). On the functional estimation of multivariate diffusion processes. Econometric Theory, 34(4), 896-946. https://doi.org/10.1017/S0266466617000305
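As a rough illustration of the kernel idea behind this kind of estimator, the drift at a position can be sketched as a kernel-weighted average of the observed increments (a Nadaraya-Watson-style estimate). This is not the implementation of MVKE(): the helper name nw_drift, the Gaussian weights, and the toy random-walk data are assumptions made for this example, and the diffusion estimate is omitted.

```r
# Hypothetical helper (not part of fitlandr): kernel-weighted mean increment.
nw_drift <- function(d, h = 0.2) {
  d <- as.matrix(d)
  x_now <- d[-nrow(d), , drop = FALSE]  # states at time t
  dx <- diff(d)                         # increments x[t + 1] - x[t]
  function(x) {
    # squared Euclidean distance from x to every observed state
    dist2 <- rowSums(sweep(x_now, 2, x)^2)
    w <- exp(-dist2 / (2 * h^2))        # Gaussian kernel weights
    colSums(w * dx) / sum(w)            # weighted average of the increments
  }
}

set.seed(1)
# toy data: a 2D random walk, each row is one observation in time order
d <- cbind(x = cumsum(rnorm(200, sd = 0.1)),
           y = cumsum(rnorm(200, sd = 0.1)))
f <- nw_drift(d, h = 0.2)
mu_hat <- f(c(0, 0))  # estimated drift (per time step) at the origin
```

As with the value described above, the sketch returns a function of the position x, so the estimate can be evaluated lazily at any point of interest.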
Add a grid to a vectorfield object to enable linear interpolation
Description
Add a grid to a vectorfield object to enable linear interpolation
Usage
add_interp_grid(vf, lims = vf$lims, n = vf$n)
Arguments
vf |
A |
lims |
The limits of the range for the vector field estimation as |
n |
The number of equally spaced points in each axis, at which the vectors are to be estimated. |
Value
A vectorfield object with an interp_grid field.
Align potential values
Description
Align the path potentials so that they all end up at the same global minimum, then generate the potential surface by interpolation on a grid.
Usage
align_pot_B(resultB, n = 200, digits = 2, linear = TRUE, ...)
Arguments
resultB |
Result from |
n |
The number of equally spaced points in each axis, at which the landscape is to be estimated. |
digits |
Currently, the raw sample points in some regions are so dense that they may crash the interpolation. To avoid this problem, only one point is kept among all points that share the same first several digits. Use this parameter to indicate how many digits are considered. Note that this is a temporary solution and might be changed in the near future. |
linear |
logical – indicating whether linear or spline interpolation should be used. |
... |
Other parameters passed to |
Value
list with 3 components:
x , y |
vectors of x- and y- coordinates of output grid, the same as the input
argument |
z |
matrix of fitted z-values. The value |
If input is a SpatialPointsDataFrame, a SpatialPixelsDataFrame is returned.
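The thinning controlled by digits can be pictured with a small base R sketch. It assumes that "the same first several digits" means equal coordinates after rounding to digits decimal places; the package may define this differently, and the column names and values below are made up for illustration.

```r
# Toy sample points along paths; pot is the potential value at each point.
pts <- data.frame(
  x   = c(0.1012, 0.1049, 0.2501, 0.2533),
  y   = c(1.0011, 1.0020, 2.0000, 2.0049),
  pot = c(-0.50, -0.49, -0.10, -0.11)
)
digits <- 2

# Assumption: points count as duplicates when both rounded coordinates agree.
key <- paste(round(pts$x, digits), round(pts$y, digits))
thinned <- pts[!duplicated(key), ]  # keep the first point of each group
```

With these values, rows 1 and 2 share one rounded position and rows 3 and 4 share another, so two points remain for interpolation.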
A fast bilinear interpolation function
Description
It assumes equal grid intervals and can therefore find the correct position in O(1) time.
Usage
fast_bilinear(x, y, z, x0, y0)
Arguments
x |
a vector containing the |
y |
a vector containing the |
z |
a matrix containing the |
x0 |
|
y0 |
|
Details
The following is adapted from the documentation of akima::bilinear().
This is an implementation of a bilinear interpolating function. For a point (x0,y0) contained in a rectangle (x1,y1), (x2,y1), (x2,y2), (x1,y2) with x1<x2 and y1<y2, the first step is to get z() at the locations (x0,y1) and (x0,y2) as convex linear combinations z(x0,y*) = a*z(x1,y*) + (1-a)*z(x2,y*), where a = (x2-x0)/(x2-x1), for y* = y1, y2. In a second step, z(x0,y0) is calculated as a convex linear combination of z(x0,y1) and z(x0,y2): z(x0,y0) = b*z(x0,y1) + (1-b)*z(x0,y2), where b = (y2-y0)/(y2-y1).
Finally, z(x0,y0) is a convex linear combination of the z values at the corners of the containing rectangle with weights according to the distance from (x0,y0) to these corners.
Value
A list which contains only one element, z.
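The two-step interpolation and the O(1) cell lookup can be sketched in base R as follows. This is an illustrative reimplementation under the equal-spacing assumption, not the source of fast_bilinear(); the function name bilinear_equal_grid and the clamping at the grid border are choices made for this example.

```r
# Hypothetical sketch: bilinear interpolation on an equally spaced grid.
bilinear_equal_grid <- function(x, y, z, x0, y0) {
  dx <- x[2] - x[1]
  dy <- y[2] - y[1]
  # O(1) lookup of the lower-left corner of the containing cell;
  # indices are clamped so that i + 1 and j + 1 stay inside the grid
  i <- min(max(floor((x0 - x[1]) / dx) + 1, 1), length(x) - 1)
  j <- min(max(floor((y0 - y[1]) / dy) + 1, 1), length(y) - 1)
  a <- (x[i + 1] - x0) / dx  # weight of the left edge (x1)
  b <- (y[j + 1] - y0) / dy  # weight of the lower edge (y1)
  z_low  <- a * z[i, j]     + (1 - a) * z[i + 1, j]      # z(x0, y1)
  z_high <- a * z[i, j + 1] + (1 - a) * z[i + 1, j + 1]  # z(x0, y2)
  list(z = b * z_low + (1 - b) * z_high)
}

x <- seq(0, 1, length.out = 5)
y <- seq(0, 2, length.out = 5)
z <- outer(x, y, function(x, y) 2 * x + 3 * y)  # values of a plane on the grid
res <- bilinear_equal_grid(x, y, z, x0 = 0.3, y0 = 1.1)
```

Because bilinear interpolation reproduces a plane exactly (up to floating-point error), res$z agrees with 2 * 0.3 + 3 * 1.1 here, which makes the example easy to check by hand.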
Find equilibrium points for a vector field
Description
Find equilibrium points for a vector field
Usage
find_eqs(vf, starts, jacobian_params = list(), ...)
Arguments
vf |
A |
starts |
A vector indicating the starting value for solving the equilibrium point, or a list of vectors providing multiple starting values together. |
jacobian_params |
Parameters passed to |
... |
Parameters passed to |
Value
A list of equilibrium points and their details. Use print.vectorfield_eqs() to inspect it.
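The general idea of solving for an equilibrium from a starting value and then inspecting the Jacobian can be sketched in base R. The toy vector field f, the helpers num_jacobian and newton_eq, and the plain Newton iteration are assumptions for illustration; they do not reproduce how find_eqs() is implemented.

```r
# Toy 2D vector field with equilibria at (-1, 0), (0, 0) and (1, 0).
f <- function(p) c(p[1] - p[1]^3, -p[2])

# Central-difference Jacobian: column k holds the derivative w.r.t. p[k].
num_jacobian <- function(f, p, eps = 1e-6) {
  sapply(seq_along(p), function(k) {
    e <- replace(numeric(length(p)), k, eps)
    (f(p + e) - f(p - e)) / (2 * eps)
  })
}

# Newton iteration for f(p) = 0 from a given starting value.
newton_eq <- function(f, start, tol = 1e-10, max_iter = 50) {
  p <- start
  for (i in seq_len(max_iter)) {
    step <- solve(num_jacobian(f, p), f(p))
    p <- p - step
    if (sqrt(sum(step^2)) < tol) break
  }
  p
}

eq <- newton_eq(f, start = c(0.8, 0.3))
# The equilibrium is locally stable if all eigenvalues have negative real part.
stable <- all(Re(eigen(num_jacobian(f, eq))$values) < 0)
```

Different starting values can converge to different equilibria, which is why supplying a list of starting vectors is useful for multistable systems.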
Estimate a 2D vector field
Description
Estimate a 2D vector field from intensive longitudinal data. Two methods can be used: Multivariate Vector Field Kernel Estimator (MVKE, using MVKE()), or Sparse Vector Field Consensus (SparseVFC, using SparseVFC::SparseVFC()). Note that the input data are automatically normalized before being sent to the estimation engines, so that the default parameter settings are close to optimal. Therefore, you do not need to scale the parameters of MVKE() or SparseVFC::SparseVFC() up or down. We suggest using the MVKE method for psychological data because it has more realistic assumptions and produces more reasonable output.
Usage
fit_2d_vf(
data,
x,
y,
lims,
n = 20,
vector_position = "start",
na_action = "omit_data_points",
method = c("MVKE", "VFC"),
...
)
Arguments
data |
The data set used for estimating the vector field. Should be a data frame or a matrix. |
x , y |
Characters to indicate the name of the two variables. |
lims |
The limits of the range for the vector field estimation as |
n |
The number of equally spaced points in each axis, at which the vectors are to be estimated. |
vector_position |
Only useful if |
na_action |
One of "omit_data_points" or "omit_vectors". If using "omit_data_points", then only the |
method |
One of "MVKE" or "VFC". |
... |
Other parameters to be passed to |
Value
A vectorfield object.
See Also
Examples
# generate data
single_output_grad <- simlandr::sim_fun_grad(length = 200, seed = 1614)
# fit the vector field
v2 <- fit_2d_vf(single_output_grad, x = "x", y = "y", method = "MVKE")
plot(v2)
Estimate a 3D potential landscape from a vector field
Description
Two methods are available: method = "pathB" and method = "simlandr". See Details section.
Usage
fit_3d_vfld(
vf,
method = c("simlandr", "pathB"),
.pathB_options = pathB_options(vf),
.sim_vf_options = sim_vf_options(vf),
.simlandr_options = simlandr_options(vf),
linear_interp = FALSE
)
Arguments
vf |
A |
method |
The method used for landscape construction. Can be |
.pathB_options |
Only for |
.sim_vf_options |
Only for |
.simlandr_options |
Only for |
linear_interp |
Use linear interpolation method to estimate the drift vector (and the diffusion matrix). This can speed up the calculation. If |
Details
For method = "simlandr", the landscape is constructed based on the generalized potential landscape by Wang et al. (2008), implemented by the simlandr package. This function is a wrapper of sim_vf() and simlandr::make_3d_static(). Use those two functions separately for more customization.
For method = "pathB", the landscape is constructed based on the deterministic path-integral quasi-potential defined by Bhattacharya et al. (2011).
We recommend the simlandr method for psychological data because it is more stable.
Parallel computing based on future is supported for both methods. Use future::plan("multisession") to enable this and speed up computation.
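To illustrate the idea behind the simulation-based approach, the following sketch turns a sample of states into a potential surface via U = -log(P), where P is a kernel density estimate. It is not the code path of fit_3d_vfld(): the bimodal stand-in sample replaces real simulation output, MASS::kde2d() is used directly, and capping at 5 only mimics the role of a maximum displayed potential.

```r
library(MASS)  # for kde2d()

set.seed(1614)
# Stand-in for simulation output: a sample with two modes at x = -1 and x = 1.
sim <- cbind(
  x = c(rnorm(2000, mean = -1, sd = 0.4), rnorm(2000, mean = 1, sd = 0.4)),
  y = rnorm(4000, mean = 0, sd = 0.5)
)

# Kernel density estimate of the distribution on a 50 x 50 grid.
dens <- MASS::kde2d(sim[, "x"], sim[, "y"], n = 50, lims = c(-3, 3, -3, 3))

# Potential as the negative log density, capped at an upper bound.
U <- pmin(-log(dens$z), 5)
```

The two modes of the sample become two wells in U, separated by a barrier near x = 0; regions that are rarely visited sit at the cap.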
Value
A landscape object as described in simlandr::make_3d_static(), or a 3d_static_landscape_B object, which inherits from the landscape class and contains the following elements: dist, the distribution estimation for landscapes; plot, a 3D plot using plotly; plot_2, a 2D plot using ggplot2; x, y, from vf.
Examples
# generate data
single_output_grad <- simlandr::sim_fun_grad(length = 200, seed = 1614)
# fit the vector field
v2 <- fit_2d_vf(single_output_grad, x = "x", y = "y", method = "MVKE")
plot(v2)
# fit the landscape
future::plan("multisession")
set.seed(1614)
l2 <- fit_3d_vfld(v2,
.sim_vf_options = sim_vf_options(chains = 16, stepsize = 1, forbid_overflow = TRUE),
.simlandr_options = simlandr_options(adjust = 5, Umax = 4))
plot(l2, 2)
future::plan("sequential")
Return a normalized prediction function
Description
Return a normalized prediction function
Usage
normalize_predict_f(vf)
Arguments
vf |
A |
Value
A function that takes a vector x and returns a list of v, the drift part, and a, the diffusion part.
Options controlling the path-integral algorithm
Description
See path_integral_B(), align_pot_B() for details.
Usage
pathB_options(
vf,
lims = rlang::expr(vf$lims),
n_path_int = 20,
stepsize = 0.01,
tol = 0.01,
numTimeSteps = 1400,
n = 200,
digits = 2,
linear = TRUE,
...
)
Arguments
vf |
A |
lims |
The limits of the range for the estimation as |
n_path_int |
The number of equally spaced points in each axis, at which the path integrals are to be calculated. |
stepsize |
The stepsize for Euler–Maruyama simulation of the system. |
tol |
The tolerance to test convergence. |
numTimeSteps |
Number of time steps for integrating along each path (to ensure uniform arrays). Choose a number high enough for convergence with the given stepsize. |
n |
The number of equally spaced points in each axis, at which the landscape is to be estimated. |
digits |
Currently, the raw sample points in some regions are so dense that they may crash the interpolation. To avoid this problem, only one point is kept among all points that share the same first several digits. Use this parameter to indicate how many digits are considered. Note that this is a temporary solution and might be changed in the near future. |
linear |
logical – indicating whether linear or spline interpolation should be used. |
... |
Not in use. |
Value
A list containing the parameters of the corresponding function. Only intended to be used within fit_3d_vfld().
Bhattacharya method for path integration
Description
A method to construct potential landscapes using path integration. See references for details.
Usage
path_integral_B(
f,
lims,
n_path_int = 20,
stepsize = 0.01,
tol = 0.01,
numTimeSteps = 1400,
...
)
Arguments
f |
The vector field function. It should return |
lims |
The limits of the range for the estimation as |
n_path_int |
The number of equally spaced points in each axis, at which the path integrals are to be calculated. |
stepsize |
The time step used in each iteration. |
tol |
The tolerance to test convergence. |
numTimeSteps |
Number of time steps for integrating along each path (to ensure uniform arrays). Choose a number high enough for convergence with the given stepsize. |
... |
Not in use. |
Value
A list with the following elements:
- numPaths: Integer. Total number of paths for the defined grid spacing.
- pot_path: Matrix. Potential along the paths.
- path_tag: Vector. Tag for given paths.
- attractors_pot: Vector. Potential value of each attractor identified by the path integral approach.
- x_path: Vector. x-coordinate along the path.
- y_path: Vector. y-coordinate along the path.
References
Bhattacharya, S., Zhang, Q., & Andersen, M. E. (2011). A deterministic map of Waddington’s epigenetic landscape for cell fate specification. BMC Systems Biology, 5(1), 85. https://doi.org/10.1186/1752-0509-5-85. The functions in this file were translated from the Matlab code provided with the reference above, and its Python translation at https://dynamo-release.readthedocs.io/en/v0.95.2/_modules/dynamo/vectorfield/Bhattacharya.html
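The core of the path-integral idea can be sketched for a single path in base R: follow the deterministic trajectory with small Euler steps and accumulate the change in quasi-potential as minus the dot product of the vector and the displacement. This is a simplified illustration, not the translated algorithm: the toy field f, the helper path_potential, and the stopping rule based on the vector norm are assumptions, and the alignment across paths is left out.

```r
# Toy gradient field; its true potential is -x^2/2 + x^4/4 + y^2/2.
f <- function(p) c(p[1] - p[1]^3, -p[2])

# Hypothetical helper: integrate the potential change along one path.
path_potential <- function(f, start, stepsize = 0.01, tol = 0.01,
                           numTimeSteps = 1400) {
  p <- start
  pot <- 0
  for (t in seq_len(numTimeSteps)) {
    v <- f(p)
    if (sqrt(sum(v^2)) < tol) break  # simplistic convergence check
    dp <- v * stepsize               # Euler displacement along the path
    pot <- pot - sum(v * dp)         # potential change: -(v . dp)
    p <- p + dp
  }
  list(end = p, pot = pot)
}

res <- path_potential(f, start = c(0.5, 0.5))
```

For this toy field the path ends near the attractor at (1, 0), and res$pot approximates the difference in the true potential between the attractor and the starting point.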
Plot a 2D vector field
Description
Plot a 2D vector field estimated by fit_2d_vf(). Powered by ggplot2::ggplot().
Usage
## S3 method for class 'vectorfield'
plot(
x,
arrow = grid::arrow(length = grid::unit(0.1, "cm")),
show_estimated_vector = TRUE,
estimated_vector_enlarge = 1,
estimated_vector_options = list(),
show_point = TRUE,
point_options = list(size = 0.5),
show_original_vector = FALSE,
original_vector_enlarge = 1,
original_vector_options = list(),
show_used_vector = FALSE,
used_vector_options = list(color = "red"),
show_v_norm = FALSE,
v_norm_options = list(),
...
)
Arguments
x |
A |
arrow |
The description of the arrow heads of the vectors on the plot (representing the vector field). Generated by |
show_estimated_vector |
Show the vectors from the estimated model? |
estimated_vector_enlarge |
A number. How many times should the vectors (representing the estimated vector field) be enlarged on the plot? This can be useful when the estimated vector field is too strong or too weak. |
estimated_vector_options |
A list passing other customized parameters to |
show_point |
Show the original data points? |
point_options |
A list passing other customized parameters to |
show_original_vector |
Show the original vectors (i.e., the vectors between data points)? |
original_vector_enlarge |
A number. How many times should the original vectors be enlarged on the plot? |
original_vector_options |
A list passing other customized parameters to |
show_used_vector |
Only for vector fields estimated by the "VFC" method. Should the vectors from the original data that are considered inliers be specially marked? |
used_vector_options |
Only for vector fields estimated by the "VFC" method. A list passing other customized parameters to |
show_v_norm |
Show the norm of the estimated vectors (the strength of the vector field)? |
v_norm_options |
A list passing other customized parameters to |
... |
Not in use. |
Value
A ggplot2 plot.
Calculate the vector value at a given position
Description
Calculate the vector value at a given position
Usage
## S3 method for class 'vectorfield'
predict(object, pos, linear_interp = FALSE, calculate_a = TRUE, ...)
Arguments
object |
A |
pos |
A vector, the position of the vector. |
linear_interp |
Use linear interpolation method to estimate the drift vector (and the diffusion matrix). This can speed up the calculation. If |
calculate_a |
Effective when |
... |
Not in use. |
Value
A list of v, the drift part that is used for vector fields, and a (when calculate_a == TRUE), the diffusion part at a given position.
See Also
Reorder a simulation output in time order
Description
Reorder a simulation output in time order, so that simlandr::check_conv() can be used meaningfully.
Usage
reorder_output(s, chains)
Arguments
s |
A simulation output, possibly generated by |
chains |
How many simulation chains should be performed? |
Value
A reordered matrix of the simulation output.
Simulation from vector fields
Description
Parallel computing based on future is supported. Use future::plan("multisession") to enable this.
Usage
sim_vf(
vf,
noise = 1,
noise_warmup = noise,
chains = 10,
length = 10000,
discard = 0.3,
stepsize = 0.01,
sparse = 1,
forbid_overflow = FALSE,
linear_interp = FALSE,
inits = matrix(c(stats::runif(chains, min = vf$lims[1], max = vf$lims[2]),
stats::runif(chains, min = vf$lims[3], max = vf$lims[4])), ncol = 2)
)
Arguments
vf |
A |
noise |
Relative noise of the simulation. Set this smaller when the simulation is unstable (e.g., when the elements in the diffusion matrix are not finite), and set this larger when the simulation converges too slowly. |
noise_warmup |
The noise used for the warming-up period. |
chains |
How many simulation chains should be performed? |
length |
The simulation length for each chain. |
discard |
How much of the starting part of each chain should be discarded? (Warming-up period.) |
stepsize |
The stepsize for Euler–Maruyama simulation of the system. |
sparse |
A number. By how much should the output be thinned? When the noise is small, thinning the output may make the density estimation more efficient. |
forbid_overflow |
If |
linear_interp |
Use linear interpolation method to estimate the drift vector (and the diffusion matrix). This can speed up the calculation. If |
inits |
The initial values of each chain. |
Value
A matrix of the simulated data.
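The Euler–Maruyama scheme mentioned for stepsize can be sketched for a single chain in base R. The sketch assumes a known toy drift function and constant isotropic noise, whereas sim_vf() works from an estimated vector field; the helper euler_maruyama and its argument n_steps are hypothetical names.

```r
# Toy drift function standing in for an estimated vector field.
f <- function(p) c(p[1] - p[1]^3, -p[2])

# Hypothetical helper: one Euler-Maruyama chain with isotropic noise.
euler_maruyama <- function(f, init, n_steps = 1000, stepsize = 0.01,
                           noise = 1) {
  out <- matrix(NA_real_, nrow = n_steps, ncol = 2,
                dimnames = list(NULL, c("x", "y")))
  p <- init
  for (t in seq_len(n_steps)) {
    # deterministic drift step plus Gaussian noise scaled by sqrt(stepsize)
    p <- p + f(p) * stepsize + noise * sqrt(stepsize) * rnorm(2)
    out[t, ] <- p
  }
  out
}

set.seed(1614)
chain <- euler_maruyama(f, init = c(0, 0), n_steps = 2000)

# Discard the first 30% of the chain as a warming-up period.
kept <- chain[-seq_len(round(0.3 * nrow(chain))), ]
```

Running several such chains from different initial values and stacking the retained parts gives a sample from which a density, and hence a landscape, can be estimated.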
Options controlling the vector field simulation
Description
See sim_vf() for details.
Usage
sim_vf_options(
vf,
noise = 1,
noise_warmup = noise,
chains = 10,
length = 10000,
discard = 0.3,
stepsize = 0.01,
sparse = 1,
forbid_overflow = FALSE,
inits = rlang::expr(matrix(c(stats::runif(chains, min = vf$lims[1], max = vf$lims[2]),
stats::runif(chains, min = vf$lims[3], max = vf$lims[4])), ncol = 2))
)
Arguments
vf |
A |
noise |
Relative noise of the simulation. Set this smaller when the simulation is unstable (e.g., when the elements in the diffusion matrix are not finite), and set this larger when the simulation converges too slowly. |
noise_warmup |
The noise used for the warming-up period. |
chains |
How many simulation chains should be performed? |
length |
The simulation length for each chain. |
discard |
How much of the starting part of each chain should be discarded? (Warming-up period.) |
stepsize |
The stepsize for Euler–Maruyama simulation of the system. |
sparse |
A number. By how much should the output be thinned? When the noise is small, thinning the output may make the density estimation more efficient. |
forbid_overflow |
If |
inits |
The initial values of each chain. |
Value
A list containing the parameters of the corresponding function. Only intended to be used within fit_3d_vfld().
Options controlling the landscape construction
Description
Controls the behavior of simlandr::make_3d_static(), but with default values adapted for fitlandr. See simlandr::make_3d_static() for details.
Usage
simlandr_options(
vf,
x = rlang::expr(vf$x),
y = rlang::expr(vf$y),
lims = rlang::expr(vf$lims),
kde_fun = c("ks", "MASS"),
n = 200,
adjust = 1,
h,
Umax = 5
)
Arguments
vf |
A |
x , y |
The names of the target variables. |
lims |
The limits of the range for the density estimator as |
kde_fun |
Which kernel estimator to use? Choices: "ks" |
n |
The number of equally spaced points in each axis, at which the density is to be estimated. |
adjust |
The multiplier to the bandwidth. The bandwidth used is actually |
h |
A number, or possibly a vector for 3D and 4D landscapes, specifying the smoothing bandwidth to be used. If missing, the default value of the kernel estimator will be used (but |
Umax |
The maximum displayed value of potential. |
Value
A list containing the parameters of the corresponding function. Only intended to be used within fit_3d_vfld().