Class

NumCosmoMathStatsDistKernel

Description

abstract class NumCosmoMath.StatsDistKernel : GObject.Object
{
  /* No available fields */
}

An N-dimensional kernel used to compute the kernel density estimation function (KDE) in the NcmStatsDist class.

This class provides the tools to generate a kernel function to be used in a kernel density estimation method. Below is a quick review of the kernel density estimation method and some properties of the kernel function, which are generalized for multidimensional problems. For further information, check [Density Estimation for Statistics and Data Analysis, B.W. Silverman].

Starting with the one-dimensional case, let $x_1,\ldots,x_n$ be independent and identically distributed (i.i.d.) samples drawn from a distribution $f(x)$. The kernel density estimate of $f$ is \begin{align} \tilde{f}(x) = \frac{1}{nh}\sum_{i=1}^{n}K\left(\frac{x-x_i}{h}\right) ,\end{align} where $K$ is the kernel function and $h$ is the bandwidth parameter. The estimator should be close to the true density $f(x)$. A standard measure of this closeness is the mean square error (MSE) \begin{align} \label{eqmse} MSE_x(\tilde{f}) = E\left[\left(\tilde{f}(x) - f(x)\right)^2\right] ,\end{align} where $E$ denotes the expected value; a good estimator makes this quantity as small as possible. The MSE depends on the choice of the kernel function, the data and the bandwidth. If the estimator $\tilde{f}(x)$ is close enough to the true function, it can be used to generate samples distributed according to $f(x)$.

The kernel $K$ is a symmetric function that must satisfy \begin{align} \int K(x)~dx = 1 .\end{align} Usually, the kernel is a symmetric probability density function that is easy to sample from, but the choice is left to the user. With a simple kernel, such as the Gaussian kernel, the kernel density estimator offers a convenient way to generate samples when the target distribution is a complicated function.

For the multidimensional case, given i.i.d. $d$-dimensional sample points $x_1,\ldots,x_n$ distributed according to $f(x)$, the multivariate kernel density estimator $\tilde{f}(x)$ is given by \begin{align} \tilde{f}(x) = \frac{1}{h^d} \sum_{i=1}^n w_i K\left(\frac{x-x_i}{h}, \Sigma_i\right) ,\end{align} where $\Sigma_i$ is the covariance matrix of the $i$-th point (the kernels used in this library depend on the covariance matrix), $d$ is the dimension and $w_i$ is the weight attached to the $i$-th kernel, chosen to minimize the error in equation \eqref{eqmse}.

The methods in this class define the type of kernel $K$, compute the bandwidth factor $h$, evaluate the kernel function at a given $d$-dimensional point $x$ or at a given vector of points $\vec{x}$, and, given the weights $w_i$, compute the kernel density estimation function $\tilde{f}(x)$.

Apart from ncm_stats_dist_kernel_get_dim(), this class has only virtual methods. Therefore, to use it, the user must instantiate one of the child classes (NcmStatsDistKernelGauss or NcmStatsDistKernelST). The child classes implement the functions that must be defined for each specific type of kernel. See the documentation of the child classes for more information. More information about how the algorithm should be implemented is given below:

- This class is used by the `NcmStatsDist` class, where `NcmStatsDistKernel` defines
the type of kernel used in the interpolation function of `NcmStatsDist` and how to compute values such as
the weighted sum of the kernels, the bandwidth, and so on. The user may also use these objects
to perform other kernel calculations, although some of the methods are not implemented outside the
`NcmStatsDist` class.

- This class does not provide methods to compute the weights of each kernel. These are found in the
`NcmStatsDist` class.

- Every child object of this class can be used in either the `NcmStatsDistKDE` class or the `NcmStatsDistVKDE` class.

Ancestors

GObject

Functions

ncm_stats_dist_kernel_clear

Decreases the reference count of *sdk by one and sets the pointer *sdk to NULL.

Instance methods

ncm_stats_dist_kernel_eval_sum0_gamma_lambda

Computes the weighted sum of kernels at $\chi^2=$chi2 (the density estimator function), $$ e^\gamma (1+\lambda) = \sum_i w_i\bar{K} (\chi^2_i) / u_i,$$ where $\gamma = \ln(w_a\bar{K} (\chi^2_a) / u_a)$ and $a$ labels the largest term of the sum. This function should be used when each kernel has a different normalization factor.

ncm_stats_dist_kernel_eval_sum1_gamma_lambda

Computes the weighted sum of kernels at $\chi^2=$chi2 (the density estimator function), $$ e^\gamma (1+\lambda) = \sum_i w_i\bar{K} (\chi^2_i) / u,$$ where $\gamma = \ln(w_a\bar{K} (\chi^2_a) / u)$ and $a$ labels the largest term of the sum. This function should be used when all the kernels have the same normalization factor.

ncm_stats_dist_kernel_eval_unnorm

Computes the unnormalized kernel at $\chi^2=$chi2.

ncm_stats_dist_kernel_eval_unnorm_vec

Computes the unnormalized kernel at $\chi^2=$chi2 for all elements of chi2 and stores the results in Ku.

ncm_stats_dist_kernel_free

Decreases the reference count of sdk by one.

ncm_stats_dist_kernel_get_dim

Gets the current kernel dimension.

ncm_stats_dist_kernel_get_lnnorm

Computes the logarithm of the kernel normalization for a given covariance decomposition cov_decomp.

ncm_stats_dist_kernel_get_rot_bandwidth

Computes the rule-of-thumb bandwidth for an interpolation using n kernels.

ncm_stats_dist_kernel_ref

Increases the reference count of sdk by one.

ncm_stats_dist_kernel_sample

Generates a random vector from the kernel distribution using the covariance cov_decomp, bandwidth href and location vector mu. The result is stored in y.

Methods inherited from GObject (43)

Please see GObject for a full list of methods.

Properties

NumCosmoMath.StatsDistKernel:dimension
No description available.

Signals

Signals inherited from GObject (1)
GObject::notify

The notify signal is emitted on an object when one of its properties has its value set through g_object_set_property(), g_object_set(), et al.

Class structure

struct NumCosmoMathStatsDistKernelClass {
  GObjectClass parent_class;
  void (* set_dim) (
    NcmStatsDistKernel* sdk,
    const guint dim
  );
  guint (* get_dim) (
    NcmStatsDistKernel* sdk
  );
  gdouble (* get_rot_bandwidth) (
    NcmStatsDistKernel* sdk,
    const gdouble n
  );
  gdouble (* get_lnnorm) (
    NcmStatsDistKernel* sdk,
    NcmMatrix* cov_decomp
  );
  gdouble (* eval_unnorm) (
    NcmStatsDistKernel* sdk,
    const gdouble chi2
  );
  void (* eval_unnorm_vec) (
    NcmStatsDistKernel* sdk,
    NcmVector* chi2,
    NcmVector* Ku
  );
  void (* eval_sum0_gamma_lambda) (
    NcmStatsDistKernel* sdk,
    NcmVector* chi2,
    NcmVector* weights,
    NcmVector* lnnorms,
    NcmVector* lnK,
    gdouble* gamma,
    gdouble* lambda
  );
  void (* eval_sum1_gamma_lambda) (
    NcmStatsDistKernel* sdk,
    NcmVector* chi2,
    NcmVector* weights,
    gdouble lnnorm,
    NcmVector* lnK,
    gdouble* gamma,
    gdouble* lambda
  );
  void (* sample) (
    NcmStatsDistKernel* sdk,
    NcmMatrix* cov_decomp,
    const gdouble href,
    NcmVector* mu,
    NcmVector* y,
    NcmRNG* rng
  );
  
}

The virtual function table for NcmStatsDistKernel.

Class members
parent_class: GObjectClass

The parent class.

set_dim: void (* set_dim) ( NcmStatsDistKernel* sdk, const guint dim )

Sets the dimension of the kernel.

get_dim: guint (* get_dim) ( NcmStatsDistKernel* sdk )

Gets the dimension of the kernel.

get_rot_bandwidth: gdouble (* get_rot_bandwidth) ( NcmStatsDistKernel* sdk, const gdouble n )

Gets the rule-of-thumb bandwidth of the kernel.

get_lnnorm: gdouble (* get_lnnorm) ( NcmStatsDistKernel* sdk, NcmMatrix* cov_decomp )

Gets the log of the normalization constant of the kernel.

eval_unnorm: gdouble (* eval_unnorm) ( NcmStatsDistKernel* sdk, const gdouble chi2 )

Evaluates the unnormalized kernel at a given chi2.

eval_unnorm_vec: void (* eval_unnorm_vec) ( NcmStatsDistKernel* sdk, NcmVector* chi2, NcmVector* Ku )

Evaluates the unnormalized kernel at a given chi2 vector.

eval_sum0_gamma_lambda: void (* eval_sum0_gamma_lambda) ( NcmStatsDistKernel* sdk, NcmVector* chi2, NcmVector* weights, NcmVector* lnnorms, NcmVector* lnK, gdouble* gamma, gdouble* lambda )

Evaluates the kernels sum0, gamma and lambda at a given chi2 vector.

eval_sum1_gamma_lambda: void (* eval_sum1_gamma_lambda) ( NcmStatsDistKernel* sdk, NcmVector* chi2, NcmVector* weights, gdouble lnnorm, NcmVector* lnK, gdouble* gamma, gdouble* lambda )

Evaluates the kernels sum1, gamma and lambda at a given chi2 vector.

sample: void (* sample) ( NcmStatsDistKernel* sdk, NcmMatrix* cov_decomp, const gdouble href, NcmVector* mu, NcmVector* y, NcmRNG* rng )

Samples the kernel.

Virtual methods

NumCosmoMath.StatsDistKernelClass.eval_sum0_gamma_lambda

Computes the weighted sum of kernels at $\chi^2=$chi2 (the density estimator function), $$ e^\gamma (1+\lambda) = \sum_i w_i\bar{K} (\chi^2_i) / u_i,$$ where $\gamma = \ln(w_a\bar{K} (\chi^2_a) / u_a)$ and $a$ labels the largest term of the sum. This function should be used when each kernel has a different normalization factor.

NumCosmoMath.StatsDistKernelClass.eval_sum1_gamma_lambda

Computes the weighted sum of kernels at $\chi^2=$chi2 (the density estimator function), $$ e^\gamma (1+\lambda) = \sum_i w_i\bar{K} (\chi^2_i) / u,$$ where $\gamma = \ln(w_a\bar{K} (\chi^2_a) / u)$ and $a$ labels the largest term of the sum. This function should be used when all the kernels have the same normalization factor.

NumCosmoMath.StatsDistKernelClass.eval_unnorm

Computes the unnormalized kernel at $\chi^2=$chi2.

NumCosmoMath.StatsDistKernelClass.eval_unnorm_vec

Computes the unnormalized kernel at $\chi^2=$chi2 for all elements of chi2 and stores the results in Ku.

NumCosmoMath.StatsDistKernelClass.get_dim

Gets the current kernel dimension.

NumCosmoMath.StatsDistKernelClass.get_lnnorm

Computes the logarithm of the kernel normalization for a given covariance decomposition cov_decomp.

NumCosmoMath.StatsDistKernelClass.get_rot_bandwidth

Computes the rule-of-thumb bandwidth for an interpolation using n kernels.

NumCosmoMath.StatsDistKernelClass.sample

Generates a random vector from the kernel distribution using the covariance cov_decomp, bandwidth href and location vector mu. The result is stored in y.

NumCosmoMath.StatsDistKernelClass.set_dim

Sets the dimension of the kernel.