Gaussian Kernel Density Trace

Gaussian Kernel AMISE Optimal Bandwidth Analyzer

  • Efficient use of a kernel density estimate, which is widely used in data mining and pattern recognition, depends on computing the optimal bandwidth of the kernel. Although the AMISE (Asymptotic Mean Integrated Squared Error) optimal bandwidth can provide the best estimate in the asymptotic mean-integrated-squared-error sense, computing it is extensive and time-consuming.
  • Gigawiz has developed a unique hybrid solver for processing the data and estimating the corresponding AMISE optimal bandwidth.
    • The Gigawiz hybrid solver, which has been stress tested on 10,000 data sets (both real-world and simulated), is fast in the majority of cases, so a kernel density graph can be generated on the fly with the AMISE optimal bandwidth calculated automatically.
    • In addition, the optimal bandwidth estimator in the Stats Analyzer is designed for data sets that require very heavy, time-consuming computation: the bandwidth is computed there first, and the result is then used for graphing.
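For reference, the quantity being optimized can be written out. The notation below is the standard textbook form and is an assumption, not taken from the Gigawiz material: N is the sample size, K the kernel, h the bandwidth (distinct from the density height h(x) used further down this page), f the unknown density, R(g) = \int g(x)^2 dx, and \mu_2(K) = \int x^2 K(x) dx.

```latex
\mathrm{AMISE}(h) = \frac{R(K)}{N h} + \frac{1}{4}\, h^{4}\, \mu_2(K)^{2}\, R(f'')

h_{\mathrm{AMISE}} = \left[ \frac{R(K)}{\mu_2(K)^{2}\, R(f'')\, N} \right]^{1/5}

% Gaussian kernel: R(K) = 1/(2\sqrt{\pi}) and \mu_2(K) = 1, so
h_{\mathrm{AMISE}} = \left[ \frac{1}{2\sqrt{\pi}\, R(f'')\, N} \right]^{1/5}
```

Because R(f'') depends on the second derivative of the unknown density f, it has to be estimated from the data itself, which is where the heavy computation comes from.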
The Gigawiz Solver for Computing the AMISE Optimal Bandwidth of the Kernel
  • The Gigawiz solver processes the data and selects the AMISE optimal bandwidth by:
    • Computing the univariate density derivatives for each data point with the methodology recommended by Raykar and Duraiswami (2005)*, and then applying the Sheather and Jones (1991)** method for the actual bandwidth estimation.

    * Raykar, V.C., and Duraiswami, R. (2005). Very Fast Optimal Bandwidth Selection for Univariate Kernel Density Estimation. Perceptual Interfaces and Reality Laboratory [CS-TR-4774/UMIACS-TR-2005-73]: December 20, 2005.

    ** Sheather, S.J., and Jones, M.C. (1991). A Reliable Data-Based Bandwidth Selection Method for Kernel Density Estimation. Journal of the Royal Statistical Society, Series B (Methodological), Volume 53, Issue 3, 683-690.
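
The two cited steps can be sketched in Python. This is a minimal, direct O(N^2) illustration of the Sheather and Jones solve-the-equation plug-in in the form summarized by Raykar and Duraiswami, assuming a Gaussian kernel and the sample standard deviation as the scale estimate. It is not the Gigawiz hybrid solver, and it leaves out the fast approximation that the Raykar and Duraiswami report contributes. The function names (`phi4_hat`, `phi6_hat`, `sj_bandwidth`) are hypothetical.

```python
import numpy as np
from scipy.optimize import brentq

SQRT_PI = np.sqrt(np.pi)
SQRT_2PI = np.sqrt(2.0 * np.pi)


def _gauss(u):
    """Standard normal density."""
    return np.exp(-0.5 * u ** 2) / SQRT_2PI


def phi4_hat(x, g):
    """Kernel estimate of Phi_4 = integral of f''(x)^2 dx (all N^2 pairs)."""
    u = (x[:, None] - x[None, :]) / g
    k4 = (u ** 4 - 6.0 * u ** 2 + 3.0) * _gauss(u)  # 4th derivative of the kernel
    return k4.sum() / (x.size ** 2 * g ** 5)


def phi6_hat(x, g):
    """Kernel estimate of Phi_6 = -(integral of f'''(x)^2 dx) (all N^2 pairs)."""
    u = (x[:, None] - x[None, :]) / g
    k6 = (u ** 6 - 15.0 * u ** 4 + 45.0 * u ** 2 - 15.0) * _gauss(u)  # 6th derivative
    return k6.sum() / (x.size ** 2 * g ** 7)


def sj_bandwidth(x):
    """Sheather-Jones solve-the-equation bandwidth for a Gaussian kernel."""
    x = np.asarray(x, dtype=float)
    n = x.size
    sigma = np.std(x, ddof=1)  # assumption: plain sample standard deviation

    # Stage 1: normal-scale guesses for the 6th and 8th density functionals.
    phi6_ns = -15.0 / (16.0 * SQRT_PI) * sigma ** -7
    phi8_ns = 105.0 / (32.0 * SQRT_PI) * sigma ** -9

    # Stage 2: pilot bandwidths, then kernel estimates of Phi_4 and Phi_6.
    g1 = (-6.0 / (SQRT_2PI * phi6_ns * n)) ** (1.0 / 7.0)
    g2 = (30.0 / (SQRT_2PI * phi8_ns * n)) ** (1.0 / 9.0)
    c = (-6.0 * np.sqrt(2.0) * phi4_hat(x, g1) / phi6_hat(x, g2)) ** (1.0 / 7.0)

    # Stage 3: solve h = [1 / (2 sqrt(pi) Phi_4(gamma(h)) N)]^(1/5) for h.
    def equation(h):
        gamma = c * h ** (5.0 / 7.0)
        return h - (1.0 / (2.0 * SQRT_PI * phi4_hat(x, gamma) * n)) ** 0.2

    # Grow a bracket around the normal-reference bandwidth until the sign changes.
    lo = hi = 1.06 * sigma * n ** -0.2
    for _ in range(50):
        if equation(lo) <= 0.0:
            break
        lo /= 2.0
    for _ in range(50):
        if equation(hi) >= 0.0:
            break
        hi *= 2.0
    return brentq(equation, lo, hi)


if __name__ == "__main__":
    sample = np.random.default_rng(0).normal(size=400)
    print("Estimated bandwidth:", sj_bandwidth(sample))
```

Each evaluation of the equation repeats a sum over all N^2 pairs of points inside the root finder, which is why this direct form becomes slow on large data sets.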

An Example of the Output From the Kernel AMISE Bandwidth Analyzer in a Table Format

How Kernel Density Trace Charts Differ From Traditional Histograms
  • In a kernel density trace, the graph gives the density h(x) at every value of X, i.e., a continuous trace of h(x) against X (see the right-hand graph). A histogram of the same data set is shown below.
  • In a histogram, the number of bars is defined by hard binning; the height of each bar (h(x)) measures the density of data points within that hard bin.
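
The contrast between the two displays can be sketched in Python with NumPy. This is an illustrative example under stated assumptions, not the Gigawiz implementation: the function names are hypothetical, the data are simulated, and Silverman's rule of thumb stands in for the AMISE optimal bandwidth to keep the example short.

```python
import numpy as np


def kernel_density_trace(data, grid, bandwidth):
    """Gaussian kernel density h(x) evaluated at every point of `grid`."""
    u = (grid[:, None] - data[None, :]) / bandwidth
    kernel_sum = np.exp(-0.5 * u ** 2).sum(axis=1)
    return kernel_sum / (data.size * bandwidth * np.sqrt(2.0 * np.pi))


def histogram_density(data, bins):
    """Bar heights h(x) and bin edges of a density-scaled histogram."""
    heights, edges = np.histogram(data, bins=bins, density=True)
    return heights, edges


def silverman_bandwidth(data):
    """Rule-of-thumb bandwidth, used here only as a stand-in."""
    q75, q25 = np.percentile(data, [75, 25])
    scale = min(np.std(data, ddof=1), (q75 - q25) / 1.34)
    return 0.9 * scale * data.size ** -0.2


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Simulated two-group data, purely for illustration.
    data = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(1.0, 1.0, 700)])
    bw = silverman_bandwidth(data)

    # Continuous trace: one density value per grid point.
    grid = np.linspace(data.min() - 4 * bw, data.max() + 4 * bw, 512)
    trace = kernel_density_trace(data, grid, bw)

    # Hard binning: one density value per bar.
    heights, edges = histogram_density(data, bins=20)

    print("trace points:", trace.size, "histogram bars:", heights.size)
```

The histogram yields one height per bin and changes when the bin edges move, while the trace yields a value at every grid point and changes only with the bandwidth.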

An Example of a Gaussian Kernel Density Chart Based on Computation of AMISE Optimal Bandwidth