Title: | Shape-Constrained Kernel Density Estimation |
Version: | 1.0.3 |
Description: | Implements methods for obtaining kernel density estimates subject to a variety of shape constraints (unimodality, bimodality, symmetry, tail monotonicity, bounds, and constraints on the number of inflection points). Enforcing constraints can eliminate unwanted waves or kinks in the estimate, which improves its subjective appearance and can also improve statistical performance. The main function scdensity() is very similar to the density() function in 'stats', allowing shape-restricted estimates to be obtained with little effort. The methods implemented in this package are described in Wolters and Braun (2017) <doi:10.1080/03610918.2017.1288247>, Wolters (2012) <doi:10.18637/jss.v047.i06>, and Hall and Huang (2002) https://www3.stat.sinica.edu.tw/statistica/j12n4/j12n41/j12n41.htm. See the scdensity() help for for full citations. |
Depends: | R (≥ 3.3.0) |
License: | GPL-2 |
Encoding: | UTF-8 |
Suggests: | testthat |
Imports: | quadprog, lpSolve |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | no |
Packaged: | 2024-08-27 13:07:41 UTC; krams |
Author: | Mark A. Wolters |
Maintainer: | Mark A. Wolters <mark@mwolters.com> |
Repository: | CRAN |
Date/Publication: | 2024-08-27 13:50:02 UTC |
scdensity: Shape-constrained kernel density estimation.
Description
This package computes one-dimensional Gaussian kernel density estimates subject to a variety of shape constraints, including unimodality, bimodality, symmetry, and others.
Details
All of the package's functionality can be accessed through the function
scdensity()
. See that function's help file. The function is used
just like stats::density()
, but with extra arguments for handling the constraints.
Author(s)
Maintainer: Mark A. Wolters mark@mwolters.com (ORCID)
A small function to add the objects needed for the binning QP to the problem list.
Description
A small function to add the objects needed for the binning QP to the problem list.
Usage
AddBinObjects(P, s)
Arguments
P |
A list with the problem details. |
s |
A vector of centers. |
Value
A modified problem list.
Carry out the binning step.
Description
Carry out the binning step.
Usage
BinningStep(P)
Arguments
P |
The list of problem objects. |
Value
The input list, with extra members modified.
A function to add the constraint-checking grid
Description
A function to add the constraint-checking grid
Usage
BuildConCheckGrid(P)
Arguments
P |
List with problem details. |
Value
Modified list.
Build objects Apos, Aunit, Aeq, Ashape, bshape
Description
This function builds the matrices/vectors needed to implement shape constraints in the estimation step.
Usage
BuildConstraints(P)
Arguments
P |
A list of problem obejcts |
Value
The same list, with Apos, Aunit, Aeq, Ashape, bshape filled in.
Clean out near-zeros from a probability weights vector and re-normalize
Description
Clean out near-zeros from a probability weights vector and re-normalize
Usage
CleanWeights(w)
Arguments
w |
A vector of probability weights |
Value
A vector with the near-zeros removed, still summing to 1.
Computes a matrix of Gaussian kernel convolution values given two vectors.
Description
If x
and s
are n- and m-vectors, respectively, returns the n-by-m matrix
of convolution values using Gaussian kernel with bandwidth h
. If x
and
s
are equal, the spectral shift is done to ensure the matrix is numerically
positive definite.
Usage
ConvolutionMatrix(x, s, h, threshold = 1e-10)
Arguments
x |
A numeric vector. |
s |
A numeric vector. |
h |
A positive bandwidth. |
threshold |
Threshold value passed to SpectralShift. |
Value
The matrix of convolution values.
Carry out the shape-constrained estimation
Description
Carry out the shape-constrained estimation
Usage
EstimationStep(P)
Arguments
P |
The problem list |
Value
Problem list with estimate objects.
Initialize the list of problem-related objects.
Description
Initialize the list of problem-related objects.
Usage
InitializeP(x, h, constraints, method, opts)
Arguments
x |
The data vector. |
h |
The bandwidth (a positive scalar). |
constraints |
The vector of constraint strings. |
method |
Either "weightedKDE" or "adjustedKDE". |
Value
A list in the form needed by WeightedKDE(), with additional elements initialized to their default values.
Create a vector of kernel centers covering [LB, UB].
Description
Make the spacing as large as possible without going over width
.
If symmetric, ensure we have an even number of centers that are symmetric around PoS
.
Usage
MakeCenters(LB, UB, width, minspace, PoS = NULL)
Arguments
LB |
The lower bound. |
UB |
The upper bound. |
width |
The maximum possible spacing. |
minspace |
The minimum allowable spacing. |
PoS |
Point of symmetry (default NULL) |
Value
A vector of kernel centers.
Compute the values of n normal PDFs (or their derivatives at m grid points).
Description
Grid points are specified in g. Returns an m-by-n matrix. The (i,j)th element of the matrix is the rth derivative of N(mu(j),sd(j)) at g(i).
Usage
NormalGridFcn(g, r, mu, sd)
Arguments
g |
Locations at which to evaluate the normal densities. |
r |
Derivative degree of the densities to evaluate (0, 1, 2, or 3). |
mu |
Means/locations for the normal densities. |
sd |
Standard deviations of the normal densities. |
Value
An m-by-n matrix as described above.
A wrapper to call solve.QP.
Description
This function takes arguments slightly differently from solve.QP, to make it more convenient for internal use. It also implements measures to robustify calls to solve.QP:
A rounding hack to prevent a bug in solve.QP that occasionally produces all-NaN solutions without returning a warning or error. Rounding has been found to eliminate almost all such bugs.
A call to lpSolve's lp() to check feasibility before running solve.QP
solve.QP is called within tryCatch to eliminate unwanted crashes.
The output of this function is a list with elements
-
flag
is 0 for successful completion, 1 for failure at the LP check stage, and 2 for failure at the QP stage (usually the "NaN solution" bug). -
QP
is the list returned bysolve.QP
. If the QP was not run due to infeasibility, this element is NULL.
Usage
QPsolve(D, d, A, b, Aeq, beq)
Arguments
D |
The matrix of the quadratic objective. |
d |
The vector in the linear term of the quadratic objective. |
A |
The matrix of inequality constraints. |
b |
The vector of RHS of the inequalities. |
Aeq |
The matrix of equality constraints. |
beq |
The vector of RHS of the equalities. |
Details
solve.QP defines its quadratic program as minimizing 1/2 * x'Dx - x'd, subject to constraints A'x >= b. Equality constraints have to be in the first rows of A'.
This function minimizes x'Dx - x'd, subject to inequality constraints Ax >= b and Equality constraints Aeq*x = beq.
Value
A list with elements described above.
Minimize a function of r variables by sequential univariate searches.
Description
The function seeks to minimize fcn
, a scalar function of r
variables. v0
is a starting solution and bounds
is a 2-vector giving upper and lower limits for
elements of the solution.
Usage
SequentialLineMin(fcn, bounds, v0, tol = .Machine$double.eps^0.25)
Arguments
fcn |
A function with taking an r-vector as its first argument: call as |
bounds |
A 2-vector giving the upper and lower limits for elements of a solution. |
v0 |
A starting solution, with increasing elements. An r-vector. Not used if r == 1. |
tol |
Tolerance passed to |
Details
This algorithm is designed to search for solutions of the form v = [v_1 v_2 \ldots v_r]
,
where bounds(1)
< v_1 < v_2 < ... < v_r <
bounds(2)
. It loops through the solution vector
one variable at a time, and does a 1-D line search using optimize()
for an improving
value of that variable. So when optimizing v_i
, it searches the interval (v_{i-1},
v_{i+1})
to maintain the increasing nature of v
. The overall search terminates once a
pass through all r
elements of v
fails to produce any changes to v
.
Value
a list with elements:
minimizer |
An r-vector containing the solution. |
minimum |
The objective function value at the solution. |
Examples
fcn <- function(v) (v[1]+1)^2 + (v[2]-1)^2
SequentialLineMin(fcn, c(-5,5), c(-3,3))
Performs the spectral shift on a matrix to make it numerically positive definite.
Description
Matrix L
is assumed to have eigenvalues that are either all positive, or very
close to zero. If any eigenvalues are less than less than threshold
, a positive
quantity is added to the diagonal.
Usage
SpectralShift(L, threshold = 1e-10)
Arguments
L |
A square numeric matrix. |
threshold |
The eigenvalue threshold. Default 1E-10. |
Value
The spectral-shifted matrix.
Function to carry out the weighted or adjusted KDE optimization.
Description
This function sets up the problem and finds an optimal shape-constrained estimate for a specified set of important points.
Usage
WeightedKDE(P)
Arguments
P |
A list, as created by InitializeP(). |
Value
A mutated version of the input list, with additional elements giving the optimization output.
Display estimation results to console in an unsuccessful case.
Description
Display estimation results to console in an unsuccessful case.
Usage
displayFailure(pts)
Arguments
pts |
The important points. |
Display estimation results to console in a successful case.
Description
Display estimation results to console in a successful case.
Usage
displaySuccess(pts, value)
Arguments
pts |
The important points. |
value |
The objective function value. |
Estimate a specific quantile of a pdf given abscissa and ordinate values.
Description
Estimate a specific quantile of a pdf given abscissa and ordinate values.
Usage
getQuantile(x, y, p)
Arguments
x |
The abscissa values of the pdf |
y |
The ordinate values of the pdf |
p |
The probability at which to evaluate the quantile. |
Value
The estimated quantile.
Build a quantile function for a given constrained density estimator.
Description
This function implements a crude but numerically reliable method to approximate the
quantile function of an scdensity
estimate.
Usage
getQuantileFunction(x)
Arguments
x |
An object of S3 class |
Value
A function that takes a fraction and returns the quantile.
Move points closer to a target while maintaining a constraint.
Description
improve(startValue, x, confun)
uses a greedy algorithm to move the elements of a
user-supplied vector startValue
closer to their target values x
, while
continually satisfying the constraint-checking function confun
.
Usage
improve(
startValue,
x,
confun,
verbose = FALSE,
maxpasses = 500,
tol = diff(range(c(startValue, x))/1e+05)
)
Arguments
startValue |
The vector of starting values for the search. Must satisfy
|
x |
The target values. |
confun |
The constraint-checking function. |
verbose |
A logical value indicating whether or not information about iteration progress should be printed to the console. |
maxpasses |
The maximum allowable number of sweeps through the data points. At each pass, every point that is not pinned at the constraint boundary is moved toward its target point in a stepping-out procedure. |
tol |
Numerical tolerance for constraint checking. A point is considered to be at the
constraint boundary if adding |
Details
The algorithm implemented here is the one in Wolters (2012), "A Greedy Algorithm for Unimodal Kernel Density Estimation by Data Sharpening," Journal of Statistical Software, 47(6). It could conceivably be useful as a part of other gradient-free optimization schemes where we have an infeasible point and a feasible one, and we seek a point that is on the constraint boundary near the infeasible one.
Value
A vector of the same length as startValue
, with elements closer to x
.
Examples
#Constrain points to be inside the hypercube with vertices at -1 and +1.
#The target point is a vector of independent random standard normal variates.
#Start at rep(0,n) and "improve" the solution toward the target.
n <- 20
incube <- function(x) all(x <= 1 & x >= -1)
x0 <- rep(0,n)
target <- sort(rnorm(n))
xstar <- improve(x0, target, incube, verbose=TRUE)
dist <- abs(target - xstar)
zapsmall(cbind(target, xstar, dist), 4)
Check for zeros at the left side of a vector of function values.
Description
Given a vector of n function values and an index ix, check whether values 1:ix are zero. Returns TRUE if they are, FALSE otherwise.
Usage
isBoundedL(f, ix)
Arguments
f |
A vector of function values for increasing abscissa values. |
ix |
An index giving the cutoff for checking for zero. |
Details
As in isUnimodal
, the values are first scaled to fill [0, 1] and then rounded to
four decimal places. Because of this it is still possible to use the "bounded support"
constraints with the Gaussian kernel.
This function is intended to be called from other functions in the scdensity package. It does not implement any argument checking.
Value
A logical value indicating if the constraint is satisfied.
Check for zeros at the right side of a vector of function values.
Description
Given a vector of n function values and an index ix, check whether values with indices greater than or equal to ix are zero. Returns TRUE if they are, FALSE otherwise.
Usage
isBoundedR(f, ix)
Arguments
f |
A vector of function values for increasing abscissa values. |
ix |
An index giving the cutoff for checking for zero. |
Details
As in isUnimodal
, the values are first scaled to fill [0, 1] and then rounded to
four decimal places. Because of this it is still possible to use the "bounded support"
constraints with the Gaussian kernel.
This function is intended to be called from other functions in the scdensity package. It does not implement any argument checking.
Value
A logical value indicating if the constraint is satisfied.
Check for monotonicity of function values in the left tail.
Description
Given a vector of n function values and an index ix, determines whether the function values having indices less than or equal to ix are non-decreasing or non-increasing. Returns TRUE if they are, FALSE otherwise.
Usage
isMonotoneL(f, ix)
Arguments
f |
A vector of function values for increasing abscissa values. |
ix |
An index giving the cutoff for checking monotonicity. |
Details
As in isUnimodal
, the values are first scaled to fill [0, 1] and then rounded to
four decimal places. This eliminates unwanted detection of tiny differences as modes.
This function is intended to be called from other functions in the scdensity package. It does not implement any argument checking.
Value
A logical value indicating if the constraint is satisfied.
Check for monotonicity of function values in the right tail.
Description
Given a vector of n function values and an index ix, determines whether the function values having indices greater than or equal to ix are non-decreasing or non-increasing. Returns TRUE if they are, FALSE otherwise.
Usage
isMonotoneR(f, ix)
Arguments
f |
A vector of function values for increasing abscissa values. |
ix |
An index giving the cutoff for checking monotonicity. |
Details
As in isUnimodal
, the values are first scaled to fill [0, 1] and then rounded to
four decimal places. This eliminates unwanted detection of tiny differences as modes.
This function is intended to be called from other functions in the scdensity package. It does not implement any argument checking.
Value
A logical value indicating if the constraint is satisfied.
Check for unimodality of function values.
Description
Given a set of function values for increasing abscissa values, we call this unimodal if there are zero or one values that are greater than all of their neighbors. Before checking for modes, the values are scaled to fill [0, 1] and then rounded to four decimal places. This eliminates unwanted detection of tiny differences as modes.
Usage
isUnimodal(f)
Arguments
f |
A vector of function values for increasing abscissa values. |
Details
This function is intended to be called from other functions in the scdensity package. It does not implement any argument checking.
Value
A logical value indicating if unimodality is satisfied.
A function factory for making the search objective function.
Description
Used when we need to search for important points. P is the problem list. It should have already gone through BuildConCheckGrid and BinningStep. The returned function must return a value even if WeightedKDE() fails. In case of failure, just assign a large random value to the objective value (to keep the search from stagnating or moving systematically in one direction).
Usage
makeOF(P)
Arguments
P |
The list of problem details. |
Value
The objective function.
Plot method for class scdensity
.
Description
Creates a plot of a shape-constrained kernel density estimate. The amount of information
in the plot is controlled by detail
.
Usage
## S3 method for class 'scdensity'
plot(
x,
detail = 4,
main = c("Density Estimate", "Q-Q Plot"),
xlab = c(x$data.name, "Constrained KDE Quantiles"),
ylab = c("Density", "Sample Quantiles"),
type = c("l", "l", "p"),
lty = c(1, 2, 0),
pch = c(-1, -1, 1),
col = c("black", gray(0.4), "black"),
lwd = c(2, 1, 0),
zero.line = TRUE,
...
)
Arguments
x |
An object of S3 class |
detail |
An integer from 1 to 4, indicating the level of information to include in the plot. 1: plot only the constrained estimate. 2: draw both the constrained and unconstrained estimates on the same plot. 3: add a rug showing the data points. 4: additionally plot a Q-Q plot of the observed data versus the constrained estimate in a second panel (for qualitative assessment of goodness-of-fit). |
main |
A string passed on to the |
xlab |
A string passed on to the |
ylab |
A string passed on to the |
type |
A vector of up to 3 strings specifying the |
lty |
A vector of up to length 3, specifying the |
pch |
A vector of up to 3 integers specifying the |
col |
A vector of up to 3 strings specifying the |
lwd |
A vector of up to length 3 specifying the |
zero.line |
A logical value indicating whether or not a horizontal line should be drawn through zero to aid visualization. |
... |
Extra parameters passed to the initial |
Examples
# Basic usage:
x <- rlnorm(30)
scKDE <- scdensity(x)
plot(scKDE)
# Show only the constrained estimate
plot(scKDE, detail=1)
# Show the constrained and unconstrained estimates. Change line color and width.
plot(scKDE, detail=2, col=c("red","blue"), lwd=c(3,2))
# Show the Q-Q plot, but change that plot's symbol and its size.
plot(scKDE, detail=4, pch=c(-1, -1, 3), cex=0.5)
Print method for class scdensity
.
Description
Displays the names of the elements of the scdensity list object and their sizes and types. Includes minimal comments about the most important elements.
Usage
## S3 method for class 'scdensity'
print(x, ...)
Arguments
x |
An object of S3 class |
... |
Included for consistency with generic functions. |
Prints the information in a summary.scdensity
object to the console.
Description
Prints the information in a summary.scdensity
object to the console.
Usage
## S3 method for class 'summary.scdensity'
print(x, ...)
Arguments
x |
An object of S3 class |
... |
Included for consistency with generic functions. |
Shape-constrained kernel density estimation.
Description
scdensity
computes kernel density estimates that satisfy specified shape
restrictions. It is used in the same way as stats::density()
, and takes
most of that function's arguments. Its default behavior is to compute a unimodal estimate.
Use argument constraint
to choose different shape constraints, method
to
choose a different estimation method, and opts
to specify method- and
constraint-specific options. The result is a list of S3 class scdensity
, which
may be inspected via print, summary, and plot methods.
Usage
scdensity(
x,
bw = "nrd0",
constraint = c("unimodal", "monotoneRightTail", "monotoneLeftTail", "twoInflections",
"twoInflections+", "boundedLeft", "boundedRight", "symmetric", "bimodal"),
method = c("adjustedKDE", "weightedKDE", "greedySharpenedKDE"),
opts = NULL,
adjust = 1,
n = 512,
na.rm = FALSE
)
Arguments
x |
A vector of data from which the estimate is to be computed. |
bw |
The bandwidth. It is specified as either a numerical value or as one of the
character strings |
constraint |
A vector of strings giving the operative shape constraints. Elements
must partially match different alternatives among |
method |
A string giving the method of enforcing shape constraints. It must
paritally match one of |
opts |
A list giving options specific to the chosen constraints and/or method. E.g.
use |
adjust |
A scaling factor for the bandwidth, just as in |
n |
The number of points returned in the density estimate. Same as in
|
na.rm |
Logical indicating whether or not to remove missing values from |
Details
All density estimates in this package use the Gaussian kernel. It is the only common
kernel function with three continuous derivatives everywhere. The adjustedKDE
and
weightedKDE
methods require continuous derivatives to ensure numerical stability.
The default estimation method, adjustedKDE
, can handle all of the available constraints. The
weightedKDE
method can handle every constraint except symmetric
, while the
greedySharpenedKDE
method can handle only unimodal
, monotoneRightTail
,
monotoneLeftTail
, boundedLeft
, and boundedRight
. The opts
list can
also be used to supply method-specific control parameters. See the "Method details" section
for more.
Each constraint has a corresponding control parameter that can be supplied as an element of
opts
. The control parameters are described in the following table. See the "Constraint
details" section for definitions of each constraint.
More than one shape constraint can be specified simultaneously. Certain combinations of constraints
(e.g., unimodal
and monotoneRightTail
) are redundant, and will cause a warning. Other
combinations (e.g., unimodal
and bimodal
) are incompatible and will cause an error.
The figure below summarizes the valid constraint combinations.
Value
A list with the following elements:
-
constraint
The constraint(s) used for estimation. Might differ from the constraints supplied to the function if they included redundant constraints. -
method
The estimation method used. -
f0
A function. Usef0(v)
to evaluate the unconstrained KDE at the points inv
. -
fhat
A function. Usefhat(v)
to evaluate the constrained KDE at the points inv
. -
data
The data used to generate the estimate. -
bw
The bandwidth used. -
extra
A list holding additional outputs that are specific to the chosen method. See the "method details" section. -
x
A vector of abscissa values for plotting the estimate. Same as instats::density()
. -
y
A vector of ordinate values for plotting the estimate. Same as instats::density()
. -
n
The sample size, not including missing values. Note, thisn
has no relation to then
provided in the arguments. -
data.name
Deparsed name of thex
argument, used in plotting. -
call
The call to the function. -
has.na
AlwaysFALSE
. Included for consistency withstats::density()
.
Constraint details
All of the constraints other than symmetric
are restrictions on the sign of the estimate, or
its derviatives, over certain intervals. The boundaries of the intervals may be called
important points. If method="greedySharpenedKDE"
, the important points are determined
implicitly during estimation. For the other methods, the locations of the important points may be
supplied in opts
; in most cases they are optional. If they are not provided, estimation
will be run iteratively inside a search routine (SequentialLineMin
) to find good values,
and these values will be returned in the extra
list.
Here is a list of the constraints with their definitions and any relevant comments about their usage.
-
unimodal
: The estimate is nondecreasing to the left ofopts$modeLocation
, and nonincreasing to the right. IfmodeLocation
is not supplied, it is found by search. -
monotoneRightTail
: The estimate is nonincreasing to the right of theopts$rightTail
percentile of the unconstrained estimate.rightTail
is a numeric value between 0 and 100. If it is not supplied, it is set to its default value, 90. -
monotoneLeftTail
: The estimate is nondecreasing to the left of theopts$leftTail
percentile of the unconstrained estimate.leftTail
is a numeric value between 0 and 100. If it is not supplied, it is set to its default value, 10. -
twoInflections
: The estimate has two inflection points, found atopts$inflectionPoints[1]
andopts$inflectionPoints[2]
. This constraint implies unimodality, but provides greater smoothness thanunimodal
. IfinflectionPoints
is not supplied, it is found by search. -
twoInflections+
: The derivative of the estimate has three inflection points, located atopts$inflectionPoints[1]
,opts$inflectionPoints[2]
, andopts$inflectionPoints[3]
. This constraint impliestwoInflections
but is even smoother. Most parametric densities with two tails satisfy this constraint. IfinflectionPoints
is not supplied, it is found by search. -
boundedLeft
: The estimate is zero to the left ofopts$lowerBound
. The value oflowerBound
must be specified inopts
. This constraint is implemented only up to a numerical tolerance. Consequently it is still possible to use it with the Gaussian kernel. -
boundedRight
: The estimate is zero to the right ofopts$upperBound
. The value ofupperBound
must be specified inopts
. This constraint is implemented only up to a numerical tolerance. Consequently it is still possible to use it with the Gaussian kernel. -
symmetric
: The estimate is symmetric aroundopts$pointOfSymmetry
. IfpointOfSymmetry
is not provided, it is found by search. -
bimodal
: The estimate has modes atopts$modeLocation[1]
andopts$modeLocation[3]
, with an antimode (local minimum) atopts$modeLocation[2]
. IfmodeLocation
is not specified, it is found by search.
Method details
The adjustedKDE
and weightedKDE
methods are implemented using a common framework
where the standard KDE is first approximated by a binning step, after which the constrained estimate
is obtained. The greedySharpenedKDE
method uses a different approach.
adjustedKDE and weightedKDE
The adjustedKDE
method is based on the method of Wolters and Braun (2017). The method
uses the usual unconstrained kernel density estimate as a pilot estimate, and adjusts the shape of
this estimate by adding a function to it. The function is selected to minimally change the
shape of the pilot estimate while ensuring the constraints are satisfied. Any of the constraints
can be used with this method.
The weightedKDE
method is based on the method of Hall and Huang (2002).
The method uses a weighted kernel density estimator, with the weights minimally
perturbed such that the constraint is satisfied. Any of the constraints except symmetric
may be used with this method.
For either of these methods, the following optional arguments can be provided as elements of opts
:
-
ncheck
: The number of abscissa points used for constraint checking. By default, this is set tomax(100, ceiling((diff(range(x)) + 6*h) / h))
, whereh
is the bandwidth. With this default it should be rare to encounter constraint violations large enough to be visible in a plot. In the event that constraint violations are observed, re-run the estimation with a larger value ofncheck
. -
verbose
: IfTRUE
, progress information will be displayed in the console. The main use of this is to track the progress of the search for important points. Default isFALSE
.
When either of these methods are used, the output list extra
contains elements giving the locations of the
important points used in the final estimate (e.g., modeLocation
if the estimate is unimodal or
bimodal). Additionally, it containts the following elements:
-
conCheckGrid
: A vector giving the abscissa values at which the constraints were enforced. -
binnedCenters
: A vector giving the locations of the kernel centers determined in the binning step. -
binnedWeights
: The weights corresponding to the binned centers. -
finalCenters
: The kernel centers used for the final estimate. -
finalWeights
: The weights used for the final estimate.
greedySharpenedKDE
The greedySharpenedKDE
method is described in Wolters (2012a, 2012b). It uses a data sharpening
(shifting the data points) approach. Starting from an initial solution that satisfies the constraints,
a greedy algorithm (implemented in the function improve
) is used to move the points as close as
possible to the observed data while maintaining feasibility.
The following optional arguments can be provided as elements of opts
:
-
startValue
— A vector of the same length asx
, giving the feasible initial solution from which the algorithm is started. If not specified, a vector with all data points at the location of the unconstrained estimate's highest mode will be used. Note, it is not guaranteed that the default will satisfy every constraint for every data set. -
verbose
: IfTRUE
, information about iteration progress will be printed to the console. Default isFALSE
. -
maxpasses
: Each "pass" through the data points moves each point one-by-one in a greedy fasion. This option limits the maximum number of passes. Default is 500. -
tol
: A numerical tolerance for constraint checking. Seeimprove
. -
ILS
: An integer greater than zero. If supplied, the greedy algorithm is run inside an iterated local search metaheuristic, as described in Wolters (2012b, sec. 3.4). This can improve solution quality, but requires the greedy search to be run2*ILS
extra times.
When this method is used, the output list extra
contains the following elements:
-
xstar
: The final vector of "sharpened" data points used to generate the estimate.
References
Hall and Huang (2002), Unimodal Density Estimation Using Kernel Methods, Statistica Sinica, 12, 965-990.
Wolters and Braun (2017), Enforcing Shape Constraints on a Probability Density Estimate Using an Additive Adjustment curve, Communications in Statistics - Simulation and Computation, 47(3), 672-691.
Wolters (2012a), A Greedy Algorithm for Unimodal Kernel Density Estimation by Data Sharpening, Journal of Statistical Software, 46(6), 1–26.
Wolters (2012b), Methods for Shape-Constrained Kernel Density Estimation. Ph.D. Thesis, University of Western Ontario.
See Also
plot.scdensity
plot method, print.scdensity
print
method, and summary.scdensity
summary method.
Examples
# Default method gives a unimodal estimate using adjustment curve method.
x <- rlnorm(30)
scKDE <- scdensity(x)
scKDE
summary(scKDE)
plot(scKDE, detail=2)
plot(scKDE, detail=4)
# Constrain the first and fourth quartiles to be monotone, using greedy sharpening method.
x <- rt(50, df=3)
scKDE <- scdensity(x, bw="SJ", adjust=0.5, constraint=c("monotoneL", "monotoneR"),
opts=list(verbose=TRUE, leftTail=25, rightTail=75), method="greedy")
plot(scKDE)
# Compare unimodal, twoInflections, and twoInflections+ constraints
x <- rnorm(100)
h <- 0.5 * bw.SJ(x)
fhat1 <- scdensity(x, bw=h, constraint="unimodal")
fhat2 <- scdensity(x, bw=h, constraint="twoInflections")
fhat3 <- scdensity(x, bw=h, constraint="twoInflections+")
plot(density(x, bw=h))
lines(fhat1$x, fhat1$y, col="red")
lines(fhat2$x, fhat2$y, col="blue")
lines(fhat3$x, fhat3$y, col="green", lwd=2)
Summary method for class scdensity
.
Description
Collects high-level information about the scdensity
object and some descriptive
statistics.
Usage
## S3 method for class 'scdensity'
summary(object, ...)
Arguments
object |
An object of S3 class |
... |
Included for consistency with generic functions. |