Class

NumCosmoMathStatsDist

Description

abstract class NumCosmoMath.StatsDist : GObject.Object
{
  /* No available fields */
}

Base class for implementing N-dimensional probability distributions.

Abstract class to reconstruct an arbitrary N-dimensional probability distribution. This class provides the tools to perform a radial basis interpolation of a multidimensional function and then to generate a new sample using the interpolation function as the kernel. The generated sample follows the original distribution, but is simpler to obtain because the kernels used are easier to sample from. For more information about radial basis interpolation, check [Radial Basis Function Interpolation, Wilna du Toit]. A brief description of the radial basis interpolation method can be found below.

Given a d-dimensional function $g(x): \mathbf{R}^d \rightarrow \mathbf{R}$, a radial basis function $\phi(x, \Sigma)$ is used to build the interpolant \begin{align} \label{Interpolation_eq} s(x) = \sum_{i=1}^n \lambda_i \phi(|x-x_i|, \Sigma_i), \quad x \in \mathbf{R}^d . \end{align} The variables $\lambda_i$ represent the weights and are found such that \begin{align} \label{eqnnls1} s(x_i) = g(x_i) , \end{align} where $x_i$ are the sample points. The values of $\phi(|x_j-x_i|, \Sigma_i)$ at the sample points are arranged in a symmetric $n \times n$ matrix $\Phi$. This function depends on the distance between the points and on the covariance matrix $\Sigma_i$ associated with each point. The weights $\lambda_i$ are collected in a vector $\lambda$ such that equation \eqref{eqnnls1} becomes \begin{align} \label{eqnnls} G = \Phi \lambda ,\end{align} where $G$ is the vector containing all the function values $g(x_i)$. Once $\lambda$ is found, one may use $s(x)$ to sample values from $g(x)$, which is easier to do since $s(x)$ is a weighted sum of simple kernels.

We want $s(x)$ to be a probability distribution so we can sample from it. Therefore the weights $\lambda$ are interpreted as probabilities: they must be non-negative and sum up to one. To this end, the algorithm solves equation \eqref{eqnnls} for $\lambda$ as a least-squares problem with non-negativity constraints, using the NNLS method, which can be found in the nnls.c file. The algorithm can then randomly choose a kernel $\phi(|x-x_i|, \Sigma_i)$ with the probability given by the corresponding entry of $\lambda$ and sample a point from it.
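The kernel selection step described above can be sketched as an inverse-CDF draw over the normalized weights. This is only an illustration under stated assumptions: the functions normalize_weights and choose_kernel are hypothetical, and the uniform random number u in [0, 1) is supplied by the caller rather than by the library's random number generator.

```c
#include <assert.h>
#include <stddef.h>

/* Rescales non-negative weights in place so that they sum to one.
 * Returns 0 on success, -1 if a weight is negative or the sum is zero. */
int normalize_weights(size_t n, double *w)
{
  double sum = 0.0;
  for (size_t i = 0; i < n; i++) {
    if (w[i] < 0.0) return -1;
    sum += w[i];
  }
  if (sum <= 0.0) return -1;
  for (size_t i = 0; i < n; i++) w[i] /= sum;
  return 0;
}

/* Returns the index i of the chosen kernel, selected with probability w[i],
 * given normalized weights and a uniform number u in [0, 1). */
size_t choose_kernel(size_t n, const double *w, double u)
{
  assert(n > 0);
  double cum = 0.0;
  for (size_t i = 0; i < n; i++) {
    cum += w[i];
    if (u < cum) return i;
  }
  return n - 1; /* guards against round-off when u is very close to 1 */
}
```

A kernel with zero weight is never selected, since its interval in the cumulative sum is empty. Once the index is known, a point would be drawn from the corresponding kernel centered at $x_i$.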

In this object, the radial basis interpolation function is not completely defined. One must choose one of the kernel implementations, the NcmStatsDistKernelST object or the NcmStatsDistKernelGauss object, which use a multivariate Student’s t function and a Gaussian function as the kernel, respectively. After initializing the desired object for the interpolation function, one may use the methods of this class to generate the interpolation and to sample from the new interpolated function.

The user must provide the following input values: over_smooth - ncm_stats_dist_set_over_smooth(), split_frac - ncm_stats_dist_set_split_frac(), $v(x)$ - ncm_stats_dist_prepare_interp(). The other parameters must be provided when the NcmStatsDistKDE or the NcmStatsDistVKDE object is initialized. To perform a calculation with this class, one needs to instantiate one of its subclasses (NcmStatsDistKDE or NcmStatsDistVKDE) and provide a child object of the class NcmStatsDistKernel (NcmStatsDistKernelGauss or NcmStatsDistKernelST). For more information about the algorithm, see the description below.

- Since this class does not define what type of kernel will be used in the calculation (the fixed kernel in the NcmStatsDistKDE class or the variable kernel in the NcmStatsDistVKDE class), one cannot compute the sample using this class alone. The function to be used as the kernel must also be provided; it is implemented in the children of the class NcmStatsDistKernel. When initializing the NcmStatsDistKDE or NcmStatsDistVKDE classes, the function to be used as the kernel is defined in the object initialization function.

- This class also needs a child object to compute the interpolation matrix $IM$ and the covariance matrices stored in cov_decomp to perform the interpolation. These are kernel dependent and are therefore also computed by the child objects.

- Regarding the kernel types based on the radial basis function, $\phi(|x-x_i|)$, and how the sample points in ncm_stats_dist_sample() are generated, see the different implementations of NcmStatsDistKernel, e.g., NcmStatsDistKernelGauss and NcmStatsDistKernelST.

- Regarding how the functions ncm_stats_dist_eval() and ncm_stats_dist_eval_m2lnp() are implemented, see the different implementations of NcmStatsDist, i.e., NcmStatsDistKDE and NcmStatsDistVKDE. These objects also compute the covariance matrix of each sample point and other objects needed for the least-squares problem, when computing the weights matrix ($\lambda$).

Ancestors

GObject

Descendants

NcmStatsDistKDE

Functions

ncm_stats_dist_clear

Decreases the reference count of sd and sets the pointer sd to NULL.

Instance methods

ncm_stats_dist_add_obs

Adds a new point y to the sample with weight 1.0. This function must be called to insert an initial sample into the object, so the interpolation can be computed.

ncm_stats_dist_eval

Evaluate the distribution at $\vec{x}=$x. The method ncm_stats_dist_eval_m2lnp() can be used to avoid underflow.

ncm_stats_dist_eval_m2lnp

Evaluate the distribution at $\vec{x}=$x. This method is more stable than ncm_stats_dist_eval() since it avoids underflows and overflows.

ncm_stats_dist_free

Decreases the reference count of sd.

ncm_stats_dist_get_Ki

Return all information about the i-th kernel.

ncm_stats_dist_get_cv_type

No description available.

ncm_stats_dist_get_dim

No description available.

ncm_stats_dist_get_href

No description available.

ncm_stats_dist_get_kernel

Gets the kernel to be used in the interpolation.

ncm_stats_dist_get_lnnorm

Gets the logarithm of the i-th kernel normalization.

ncm_stats_dist_get_n_kernels

After the prepare call, this function returns the number of kernels used in the interpolation.

ncm_stats_dist_get_over_smooth

No description available.

ncm_stats_dist_get_print_fit

No description available.

ncm_stats_dist_get_rnorm

Gets the value of the last $\chi^2$ fit obtained when computing the interpolation through ncm_stats_dist_prepare_interp().

ncm_stats_dist_get_sample_size

After the prepare call, this function returns the size of the sample used in the interpolation.

ncm_stats_dist_get_shrink

Gets the shrink factor, which is used to shrink the weights of the sample points in the interpolation.

ncm_stats_dist_get_split_frac

No description available.

ncm_stats_dist_get_use_threads

No description available.

ncm_stats_dist_kernel_choose

Chooses a random kernel based on the computed weights, using the pseudo-random number generator rng.

ncm_stats_dist_peek_cov_decomp

Gets the covariance matrix associated with the i-th kernel.

ncm_stats_dist_peek_full_cov

Gets the full covariance matrix of the whole sample.

ncm_stats_dist_peek_full_cov_decomp

Gets the full covariance matrix decomposition. This is the Cholesky decomposition of the covariance matrix of the whole sample.

ncm_stats_dist_peek_kernel

Gets the kernel to be used in the interpolation.

ncm_stats_dist_peek_sample_array

No description available.

ncm_stats_dist_peek_weights

No description available.

ncm_stats_dist_prepare

Prepares the object for calculations. This function prepares the weight matrix and sets all the weights to 1.0/sample size. It also calls the prepare_kernel function, implemented by a child, and calls the get_href function.

ncm_stats_dist_prepare_interp

Prepares the object for calculations using the distribution values at the sample points. This function calls the prepare function and prepares the objects needed to solve the least-squares problem. The interpolation matrix IM is computed by a child object, which is called from this function. Then, depending on the cross-validation method, the function solves the least-squares problem using the ncm_nnls object.

ncm_stats_dist_prepare_kernel

Prepares the object for computations of the individual kernels. This is usually done as part of ncm_stats_dist_prepare() and should not be called directly.

ncm_stats_dist_ref

Increases the reference count of sd.

ncm_stats_dist_reset

Reset the object discarding all added points.

ncm_stats_dist_sample

Generates a point from the distribution using the pseudo-random number generator rng and copies it to x.

ncm_stats_dist_set_cv_type

Sets the cross-validation method to cv_type. If the selected method is none, all the sample points are used to compute the interpolation. If cv_type is cv_split, a fraction of the points (the split fraction) is randomly excluded and the interpolation is computed as a best fit to the remaining sample points, which makes the interpolation less dependent on individual points.

ncm_stats_dist_set_kernel

Sets the kernel to be used in the interpolation. The available kernel types are the Gaussian kernel and the Student’s t kernel, implemented in the files ncm_stats_dist_kernel_gauss.c and ncm_stats_dist_kernel_st.c, respectively.

ncm_stats_dist_set_over_smooth

Sets the over-smooth factor to over_smooth.

ncm_stats_dist_set_print_fit

Whether to print steps during the fitting process.

ncm_stats_dist_set_shrink

Sets the shrink factor to shrink. The shrink factor is used to shrink the weights of the sample points in the interpolation. With a shrink factor of 0.0 no shrinkage is applied; with a shrink factor of 1.0 full shrinkage is applied.

ncm_stats_dist_set_split_frac

Sets the cross-validation split fraction to split_frac. This method should be used when cv_type is cv_split. The split fraction determines the fraction of sample points that are left out for cross-validation.

ncm_stats_dist_set_use_threads

Sets whether to use OpenMP threads during the computation.

Methods inherited from GObject (43)

Please see GObject for a full list of methods.

Properties

NumCosmoMath.StatsDist:CV-type

No description available.

NumCosmoMath.StatsDist:N

No description available.

NumCosmoMath.StatsDist:kernel

No description available.

NumCosmoMath.StatsDist:over-smooth

No description available.

NumCosmoMath.StatsDist:print-fit

No description available.

NumCosmoMath.StatsDist:shrink

No description available.

NumCosmoMath.StatsDist:split-frac

No description available.

NumCosmoMath.StatsDist:use-threads

No description available.

Signals

Signals inherited from GObject (1)

GObject::notify

The notify signal is emitted on an object when one of its properties has its value set through g_object_set_property(), g_object_set(), et al.

Class structure

struct NumCosmoMathStatsDistClass {
  void (* set_dim) (
    NcmStatsDist* sd,
    const guint dim
  );
  gdouble (* get_href) (
    NcmStatsDist* sd
  );
  void (* prepare_kernel) (
    NcmStatsDist* sd,
    GPtrArray* sample_array
  );
  void (* prepare) (
    NcmStatsDist* sd
  );
  void (* prepare_interp) (
    NcmStatsDist* sd,
    NcmVector* m2lnp
  );
  void (* compute_IM) (
    NcmStatsDist* sd,
    NcmMatrix* IM
  );
  NcmMatrix* (* peek_cov_decomp) (
    NcmStatsDist* sd,
    guint i
  );
  NcmMatrix* (* peek_full_cov_decomp) (
    NcmStatsDist* sd
  );
  NcmMatrix* (* peek_full_cov) (
    NcmStatsDist* sd
  );
  gdouble (* get_lnnorm) (
    NcmStatsDist* sd,
    guint i
  );
  gdouble (* eval_weights) (
    NcmStatsDist* sd,
    NcmVector* weights,
    NcmVector* x
  );
  gdouble (* eval_weights_m2lnp) (
    NcmStatsDist* sd,
    NcmVector* weights,
    NcmVector* x
  );
  void (* reset) (
    NcmStatsDist* sd
  );
  
}

No description available.

Class members

set_dim: void (* set_dim) ( NcmStatsDist* sd, const guint dim ): No description available.
get_href: gdouble (* get_href) ( NcmStatsDist* sd ): No description available.
prepare_kernel: void (* prepare_kernel) ( NcmStatsDist* sd, GPtrArray* sample_array ): No description available.
prepare: void (* prepare) ( NcmStatsDist* sd ): No description available.
prepare_interp: void (* prepare_interp) ( NcmStatsDist* sd, NcmVector* m2lnp ): No description available.
compute_IM: void (* compute_IM) ( NcmStatsDist* sd, NcmMatrix* IM ): No description available.
peek_cov_decomp: NcmMatrix* (* peek_cov_decomp) ( NcmStatsDist* sd, guint i ): No description available.
peek_full_cov_decomp: NcmMatrix* (* peek_full_cov_decomp) ( NcmStatsDist* sd ): No description available.
peek_full_cov: NcmMatrix* (* peek_full_cov) ( NcmStatsDist* sd ): No description available.
get_lnnorm: gdouble (* get_lnnorm) ( NcmStatsDist* sd, guint i ): No description available.
eval_weights: gdouble (* eval_weights) ( NcmStatsDist* sd, NcmVector* weights, NcmVector* x ): No description available.
eval_weights_m2lnp: gdouble (* eval_weights_m2lnp) ( NcmStatsDist* sd, NcmVector* weights, NcmVector* x ): No description available.
reset: void (* reset) ( NcmStatsDist* sd ): No description available.

Virtual methods

NumCosmoMath.StatsDistClass.compute_IM

No description available.

NumCosmoMath.StatsDistClass.eval_weights

No description available.

NumCosmoMath.StatsDistClass.eval_weights_m2lnp

No description available.

NumCosmoMath.StatsDistClass.get_href

No description available.

NumCosmoMath.StatsDistClass.get_lnnorm

Gets the logarithm of the i-th kernel normalization.

NumCosmoMath.StatsDistClass.peek_cov_decomp

Gets the covariance matrix associated with the i-th kernel.

NumCosmoMath.StatsDistClass.peek_full_cov

Gets the full covariance matrix of the whole sample.

NumCosmoMath.StatsDistClass.peek_full_cov_decomp

Gets the full covariance matrix decomposition. This is the Cholesky decomposition of the covariance matrix of the whole sample.

NumCosmoMath.StatsDistClass.prepare

Prepares the object for calculations. This function prepares the weight matrix and sets all the weights to 1.0/sample size. It also calls the prepare_kernel function, implemented by a child, and calls the get_href function.

NumCosmoMath.StatsDistClass.prepare_interp

Prepares the object for calculations using the distribution values at the sample points. This function calls the prepare function and prepares the objects needed to solve the least-squares problem. The interpolation matrix IM is computed by a child object, which is called from this function. Then, depending on the cross-validation method, the function solves the least-squares problem using the ncm_nnls object.