Class
NumCosmoMathStatsDistKernel
Description
abstract class NumCosmoMath.StatsDistKernel : GObject.Object
{
/* No available fields */
}
An N-dimensional kernel used to compute the kernel density estimation function (KDE)
in the NcmStatsDist class.
This class provides the tools to generate a kernel function to be used in a kernel density estimation method. Below is a quick review of the kernel density estimation method and some properties of the kernel function, which are generalized for multidimensional problems. For further information, check [Density Estimation for Statistics and Data Analysis, B.W. Silverman].
Starting with the one-dimensional case, let $x_1,\dots,x_n$ be independent and identically distributed (iid) samples drawn from a distribution $f(x)$. The kernel density estimate of $f$ is \begin{align} \tilde{f}(x) = \frac{1}{nh}\sum_{i=1}^{n}K\left(\frac{x-x_i}{h}\right) ,\end{align} where $K$ is the kernel function and $h$ is the bandwidth parameter. The estimator $\tilde{f}(x)$ should be close to the true density $f(x)$. One way to quantify this is the mean square error (MSE), \begin{align} \label{eqmse} MSE_x(\tilde{f}) = E\left[\left(\tilde{f}(x) - f(x)\right)^2\right] ,\end{align} where $E$ denotes the expected value; a good estimator keeps this quantity small. The MSE depends on the choice of the kernel function, the data and the bandwidth. If the estimator $\tilde{f}(x)$ is close enough to the true density, it can be used to generate samples that approximately follow $f(x)$.
The kernel $K$ is a symmetric function that must satisfy \begin{align} \int K(x)~dx = 1 .\end{align} Usually, the kernel function is a symmetric probability density function that is easy to sample from, but the choice is left to the user. Using simple kernels, such as the Gaussian kernel, makes kernel density estimation a convenient way to generate samples when the target distribution is a complicated function.
For the multidimensional case, given iid $d$-dimensional sample points $x_1,\dots,x_n$ distributed according to $f(x)$, the multivariate kernel density estimator $\tilde{f}(x)$ is given by \begin{align} \tilde{f}(x) = \frac{1}{h^d} \sum_{i=1}^n w_i K\left(\frac{x-x_i}{h}, \Sigma_i\right) ,\end{align} where $\Sigma_i$ is the covariance matrix of the $i$-th point (the kernels used in this library depend on the covariance matrix), $d$ is the dimension and $w_i$ is the weight attached to the $i$-th kernel, chosen to minimize the error in equation \eqref{eqmse}.
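For a Gaussian kernel, the multivariate estimator can be written in terms of a $\chi^2$ variable. The expressions below are a sketch under that assumption, using $\bar{K}$ for the unnormalized kernel and $u_i$ for the normalization factor, in the same sense as in the method descriptions on this page. The exact conventions adopted by each child class, in particular where the factor $h^d$ is absorbed, may differ. \begin{align} \chi^2_i &= \frac{1}{h^2}(x-x_i)^\intercal\Sigma_i^{-1}(x-x_i), & \bar{K}(\chi^2) &= e^{-\chi^2/2}, & u_i &= \sqrt{(2\pi)^d\det\Sigma_i} ,\end{align} so that \begin{align} \tilde{f}(x) = \frac{1}{h^d}\sum_{i=1}^n w_i \frac{\bar{K}(\chi^2_i)}{u_i} .\end{align}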
The methods in this class define the type of kernel $K$, compute the bandwidth factor $h$, evaluate the kernel function at a given $d$-dimensional point $x$ or at a given vector of points $\vec{x}$, and, given the weights $w_i$, compute the kernel density estimation function $\tilde{f}(x)$.
Apart from ncm_stats_dist_kernel_get_dim(), this class has only virtual methods.
Therefore, to use it, the user must instantiate one of the child classes (NcmStatsDistKernelGauss or NcmStatsDistKernelST).
The child classes implement the functions that must be defined for each specific type of kernel.
Check the child classes' documentation for more information. More information about how the algorithm should be implemented is given below:
-This class is used by the `NcmStatsDist` class, where the `NcmStatsDistKernel` object defines
the type of kernel used in the interpolation function in `NcmStatsDist` and how to compute quantities such as
the weighted sum of the kernels, the bandwidth, and so on. The user may also use these objects
to perform other kernel calculations, although some of the methods are not implemented outside the
`NcmStatsDist` class.
-This class does not provide methods to compute the weights of each kernel. Those methods are found in the
`NcmStatsDist` class.
-Every child object of this class can be used either in the `NcmStatsDistKDE` class or in the `NcmStatsDistVKDE` class.
Functions
ncm_stats_dist_kernel_clear
Decreases the reference count of *sdk by one and sets the pointer *sdk to
NULL.
Instance methods
ncm_stats_dist_kernel_eval_sum0_gamma_lambda
Computes the weighted sum of kernels at $\chi^2=$chi2 (the density estimator function),
$$ e^\gamma (1+\lambda) = \sum_i w_i\bar{K} (\chi^2_i) / u_i,$$
where $\gamma = \ln(w_a\bar{K} (\chi^2_a) / u_a)$ and $a$ labels
the largest term of the sum. This function should be used when
each kernel has a different normalization factor.
ncm_stats_dist_kernel_eval_sum1_gamma_lambda
Computes the weighted sum of kernels at $\chi^2=$chi2 (the density estimator function),
$$ e^\gamma (1+\lambda) = \sum_i w_i\bar{K} (\chi^2_i) / u,$$
where $\gamma = \ln(w_a\bar{K} (\chi^2_a) / u)$ and $a$ labels
the largest term of the sum. This function should be used when
all the kernels have the same normalization factor.
ncm_stats_dist_kernel_eval_unnorm_vec
Computes the unnormalized kernel at $\chi^2=$chi2 for all elements of chi2
and stores the results in Ku.
ncm_stats_dist_kernel_get_lnnorm
Computes the logarithm of the kernel normalization for a given covariance decomposition cov_decomp.
ncm_stats_dist_kernel_get_rot_bandwidth
Computes the rule-of-thumb bandwidth for an interpolation
using n kernels.
ncm_stats_dist_kernel_sample
Generates a random vector from the kernel distribution
using the covariance cov_decomp, bandwidth href and
location vector mu. The result is stored in y.
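For a Gaussian kernel, a draw of this kind is typically an affine transformation of a standard normal vector $z$: $y = \mu + h_\mathrm{ref} L z$, where $L$ is the lower Cholesky factor of the covariance. The sketch below shows only that deterministic step, assuming cov_decomp stores $L$ in row-major order and that $z$ has already been drawn from a random number generator. The name `kernel_sample_affine` is hypothetical and this is not the library's code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Computes y = mu + href * L * z, where L is a d x d lower-triangular
 * matrix (row-major) and z holds d standard normal deviates.
 */
void
kernel_sample_affine (const double *L, double href, const double *mu,
                      const double *z, double *y, size_t d)
{
  size_t i, j;

  assert (href > 0.0);

  for (i = 0; i < d; i++)
  {
    double acc = 0.0;

    /* Only the lower triangle of L contributes. */
    for (j = 0; j <= i; j++)
      acc += L[i * d + j] * z[j];

    y[i] = mu[i] + href * acc;
  }
}
```

The resulting vector has mean $\mu$ and covariance $h_\mathrm{ref}^2 LL^\intercal$ when the entries of $z$ are independent standard normal deviates.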
Signals
Signals inherited from GObject (1)
GObject::notify
The notify signal is emitted on an object when one of its properties has its value set through g_object_set_property(), g_object_set(), et al.
Class structure
struct NumCosmoMathStatsDistKernelClass {
GObjectClass parent_class;
void (* set_dim) (
NcmStatsDistKernel* sdk,
const guint dim
);
guint (* get_dim) (
NcmStatsDistKernel* sdk
);
gdouble (* get_rot_bandwidth) (
NcmStatsDistKernel* sdk,
const gdouble n
);
gdouble (* get_lnnorm) (
NcmStatsDistKernel* sdk,
NcmMatrix* cov_decomp
);
gdouble (* eval_unnorm) (
NcmStatsDistKernel* sdk,
const gdouble chi2
);
void (* eval_unnorm_vec) (
NcmStatsDistKernel* sdk,
NcmVector* chi2,
NcmVector* Ku
);
void (* eval_sum0_gamma_lambda) (
NcmStatsDistKernel* sdk,
NcmVector* chi2,
NcmVector* weights,
NcmVector* lnnorms,
NcmVector* lnK,
gdouble* gamma,
gdouble* lambda
);
void (* eval_sum1_gamma_lambda) (
NcmStatsDistKernel* sdk,
NcmVector* chi2,
NcmVector* weights,
gdouble lnnorm,
NcmVector* lnK,
gdouble* gamma,
gdouble* lambda
);
void (* sample) (
NcmStatsDistKernel* sdk,
NcmMatrix* cov_decomp,
const gdouble href,
NcmVector* mu,
NcmVector* y,
NcmRNG* rng
);
}
The virtual function table for NcmStatsDistKernel.
Class members
parent_class: GObjectClass
The parent class.
set_dim: void (* set_dim) ( NcmStatsDistKernel* sdk, const guint dim )
Sets the dimension of the kernel.
get_dim: guint (* get_dim) ( NcmStatsDistKernel* sdk )
Gets the dimension of the kernel.
get_rot_bandwidth: gdouble (* get_rot_bandwidth) ( NcmStatsDistKernel* sdk, const gdouble n )
Gets the rule-of-thumb bandwidth of the kernel.
get_lnnorm: gdouble (* get_lnnorm) ( NcmStatsDistKernel* sdk, NcmMatrix* cov_decomp )
Gets the log of the normalization constant of the kernel.
eval_unnorm: gdouble (* eval_unnorm) ( NcmStatsDistKernel* sdk, const gdouble chi2 )
Evaluates the unnormalized kernel at a given chi2.
eval_unnorm_vec: void (* eval_unnorm_vec) ( NcmStatsDistKernel* sdk, NcmVector* chi2, NcmVector* Ku )
Evaluates the unnormalized kernel at a given chi2 vector.
eval_sum0_gamma_lambda: void (* eval_sum0_gamma_lambda) ( NcmStatsDistKernel* sdk, NcmVector* chi2, NcmVector* weights, NcmVector* lnnorms, NcmVector* lnK, gdouble* gamma, gdouble* lambda )
Evaluates the kernels sum0, gamma and lambda at a given chi2 vector.
eval_sum1_gamma_lambda: void (* eval_sum1_gamma_lambda) ( NcmStatsDistKernel* sdk, NcmVector* chi2, NcmVector* weights, gdouble lnnorm, NcmVector* lnK, gdouble* gamma, gdouble* lambda )
Evaluates the kernels sum1, gamma and lambda at a given chi2 vector.
sample: void (* sample) ( NcmStatsDistKernel* sdk, NcmMatrix* cov_decomp, const gdouble href, NcmVector* mu, NcmVector* y, NcmRNG* rng )
Samples the kernel.
Virtual methods
NumCosmoMath.StatsDistKernelClass.eval_sum0_gamma_lambda
Computes the weighted sum of kernels at $\chi^2=$chi2 (the density estimator function),
$$ e^\gamma (1+\lambda) = \sum_i w_i\bar{K} (\chi^2_i) / u_i,$$
where $\gamma = \ln(w_a\bar{K} (\chi^2_a) / u_a)$ and $a$ labels
the largest term of the sum. This function should be used when
each kernel has a different normalization factor.
NumCosmoMath.StatsDistKernelClass.eval_sum1_gamma_lambda
Computes the weighted sum of kernels at $\chi^2=$chi2 (the density estimator function),
$$ e^\gamma (1+\lambda) = \sum_i w_i\bar{K} (\chi^2_i) / u,$$
where $\gamma = \ln(w_a\bar{K} (\chi^2_a) / u)$ and $a$ labels
the largest term of the sum. This function should be used when
all the kernels have the same normalization factor.
NumCosmoMath.StatsDistKernelClass.eval_unnorm_vec
Computes the unnormalized kernel at $\chi^2=$chi2 for all elements of chi2
and stores the results in Ku.
NumCosmoMath.StatsDistKernelClass.get_lnnorm
Computes the logarithm of the kernel normalization for a given covariance decomposition cov_decomp.
NumCosmoMath.StatsDistKernelClass.get_rot_bandwidth
Computes the rule-of-thumb bandwidth for an interpolation
using n kernels.
NumCosmoMath.StatsDistKernelClass.sample
Generates a random vector from the kernel distribution
using the covariance cov_decomp, bandwidth href and
location vector mu. The result is stored in y.