Simulating Halo, Cluster and Member Catalogs

Author

NumCosmo developers

Purpose

This tutorial shows how to generate mock halo/cluster catalogs and their galaxy members entirely from NumCosmo’s C objects, and how to chain them together with the MockPipeline orchestrator.

The generation is split into small, reusable pieces, each assembled from the existing modelling machinery rather than reimplemented:

  • NcHaloCatalogGenerator draws a Poisson realization of the cluster number counts from a NcClusterAbundance, samples the observed redshift and mass through the cluster proxies, and (optionally) adds sky positions over an NcmSkyFootprint and the spherical-overdensity radius r_Delta;
  • NcGalaxyHODZheng07 is a halo occupation distribution turning a halo mass into central and satellite galaxy counts;
  • NcHaloCatalogMemberGenerator populates each host with its galaxies, placing them in 3D within r_Delta and projecting them to sky coordinates and redshift;
  • MockPipeline wires the cluster and member generators together and returns the two linked NcHaloCatalog tables.

The resulting catalogs are ready to feed the matching tools of the Sky Match tutorial or the cluster number-counts likelihood.

Defining the Models

We start with a flat \(\Lambda\)CDM cosmology and the ingredients of the halo mass function: a linear matter power spectrum, a top-hat variance filter and a multiplicity function (NcMultiplicityFuncBocquet) with a critical \(\Delta = 200\) mass definition. The mass definition carried here is what fixes the meaning of r_Delta later on.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import HTML

from numcosmo_py import Nc, Ncm
from numcosmo_py.catalog import MockPipeline, catalog_to_table

Ncm.cfg_init()


def show_table(table, float_format="%.3f", max_rows=10):
    """Render an astropy Table as a compact HTML table."""
    return HTML(table.to_pandas().to_html(float_format=float_format, max_rows=max_rows))


cosmo = Nc.HICosmoDEXcdm.new()
cosmo.add_submodel(Nc.HIReionCamb.new())
cosmo.add_submodel(Nc.HIPrimPowerLaw.new())
cosmo["H0"] = 70.0
cosmo["Omegab"] = 0.05
cosmo["Omegac"] = 0.25
cosmo["Omegax"] = 0.70
cosmo["w"] = -1.0

dist = Nc.Distance.new(2.0)
psml = Nc.PowspecMLTransfer.new(Nc.TransferFuncEH())
psml.require_kmin(1.0e-3)
psml.require_kmax(1.0e3)
psf = Ncm.PowspecFilter.new(psml, Ncm.PowspecFilterType.TOPHAT)
psf.set_best_lnr0()

mulf = Nc.MultiplicityFuncBocquet.new()
mulf.set_mdef(Nc.MultiplicityFuncMassDef.CRITICAL)
mulf.set_Delta(200.0)
mulf.set_sim(Nc.MultiplicityFuncBocquetSim.DM)

hmf = Nc.HaloMassFunction.new(dist, psf, mulf)
hmf.prepare(cosmo)

Observable Proxies and Abundance

The cluster abundance needs proxies relating the true mass and redshift to the observed ones. We use a log-normal mass proxy (NcClusterMassLnnormal) and a global Gaussian photometric redshift (NcClusterPhotozGaussGlobal), and we collect every model into an NcmMSet.

cluster_m = Nc.ClusterMassLnnormal(
    lnMobs_min=np.log(1.0e14), lnMobs_max=np.log(1.0e16)
)
cluster_m["bias"] = 0.0
cluster_m["sigma"] = 0.2

cluster_z = Nc.ClusterPhotozGaussGlobal(
    pz_min=0.0, pz_max=0.7, z_bias=0.0, sigma0=0.03
)

cad = Nc.ClusterAbundance.new(hmf, None)

mset = Ncm.MSet.new_array([cosmo, cluster_z, cluster_m])

Footprint and Halo Occupation

The footprint (NcmSkyFootprintRectangular) is the sky region over which positions are drawn; the HOD turns each halo mass into a number of central and satellite galaxies.

footprint = Ncm.SkyFootprintRectangular.new(10.0, 14.0, -2.0, 2.0)

cad.set_area(footprint.get_area())
cad.prepare(cosmo, cluster_z, cluster_m)

hod = Nc.GalaxyHODZheng07.new()

Generating the Mock

MockPipeline assembles the cluster generator (with the footprint and radius enabled) and the member generator, so a single call produces both catalogs.

pipeline = MockPipeline(cad, hod, footprint)

rng = Ncm.RNG.seeded_new(None, 1)
mock = pipeline.generate(mset, rng)

clusters = mock.clusters
members = mock.members

print(f"Generated {len(clusters)} clusters and {len(members)} members.")
Generated 239 clusters and 864 members.

The cluster catalog carries the true and observed redshift/mass, the sky position and the spherical-overdensity radius r_Delta, plus a cluster_id join key:

Code
show_table(clusters)
z_true lnM_true z_obs_0 lnM_obs_0 ra dec r_Delta cluster_id
0 0.319 32.252 0.285 32.240 12.480 -1.182 0.863 0
1 0.501 32.648 0.434 32.796 13.796 1.025 0.917 1
2 0.545 32.505 0.537 32.480 12.417 -1.332 0.860 2
3 0.245 32.591 0.244 32.886 13.194 -0.615 0.993 3
4 0.598 33.060 0.607 33.123 12.299 -1.162 1.013 4
... ... ... ... ... ... ... ... ...
234 0.573 32.551 0.643 32.309 10.630 0.659 0.864 234
235 0.357 32.464 0.353 32.497 12.005 -0.951 0.913 235
236 0.059 33.351 0.056 33.245 12.139 0.664 1.363 236
237 0.633 32.116 0.578 32.409 10.029 -0.122 0.729 237
238 0.702 32.232 0.667 32.473 11.605 1.370 0.738 238
Table 1: First rows of the generated cluster catalog.

Each member links back to its host cluster through parent_id, which matches the cluster_id column of the cluster table:

Code
show_table(members)
galaxy_id parent_id is_central ra dec z
0 0 0 True 12.480 -1.182 0.319
1 1 0 False 12.445 -1.203 0.319
2 2 1 True 13.796 1.025 0.501
3 3 1 False 13.786 1.003 0.501
4 4 1 False 13.801 1.047 0.501
... ... ... ... ... ... ...
859 859 236 False 12.161 0.487 0.059
860 860 236 False 12.247 0.581 0.059
861 861 236 False 12.265 0.661 0.059
862 862 237 True 10.029 -0.122 0.633
863 863 238 True 11.605 1.370 0.702
Table 2: First rows of the generated member catalog.
n_central = int(np.sum(members["is_central"]))
print(f"{n_central} centrals and {len(members) - n_central} satellites.")
239 centrals and 625 satellites.

Visualising the Catalogs

The clusters trace the footprint; member galaxies cluster tightly around their hosts. We size the cluster markers by mass and colour them by redshift.

fig, ax = plt.subplots(figsize=(7, 6))

ax.scatter(members["ra"], members["dec"], s=2, color="0.7", label="members")
sc = ax.scatter(
    clusters["ra"],
    clusters["dec"],
    s=8 + 6 * (clusters["lnM_true"] - clusters["lnM_true"].min()),
    c=clusters["z_true"],
    cmap="viridis",
    edgecolor="k",
    linewidth=0.3,
    label="clusters",
)
fig.colorbar(sc, ax=ax, label="cluster $z$")
ax.set_xlabel("RA [deg]")
ax.set_ylabel("Dec [deg]")
ax.legend(loc="upper right")
plt.show()
Figure 1: Mock clusters (coloured by redshift) and their member galaxies.

Zooming on the most massive cluster shows the members distributed around the host position, with the central sitting exactly on it.

i = int(np.argmax(clusters["lnM_true"]))
cid = int(clusters["cluster_id"][i])
sel = members[members["parent_id"] == cid]
is_cen = np.asarray(sel["is_central"], dtype=bool)

fig, ax = plt.subplots(figsize=(5, 5))
ax.scatter(sel["ra"][~is_cen], sel["dec"][~is_cen], s=12, label="satellites")
ax.scatter(sel["ra"][is_cen], sel["dec"][is_cen], s=80, marker="*",
           color="crimson", label="central")
ax.scatter(clusters["ra"][i], clusters["dec"][i], s=120, facecolor="none",
           edgecolor="k", label="host")
ax.set_xlabel("RA [deg]")
ax.set_ylabel("Dec [deg]")
ax.legend()
ax.set_title(f"Members of cluster {cid}")
plt.show()
Figure 2: Members of the most massive cluster within its r_Delta.

Working with the Generators Directly

MockPipeline is a thin wrapper; the underlying generators can be used on their own. For example, to obtain just the cluster catalog with positions and radius, enable them with nc_halo_catalog_generator_set_footprint and nc_halo_catalog_generator_set_with_radius:

cluster_gen = Nc.HaloCatalogGenerator.new(cad)
cluster_gen.set_footprint(footprint)
cluster_gen.set_with_radius(True)

hcat = cluster_gen.generate(mset, Ncm.RNG.seeded_new(None, 2))
print(list(hcat.peek_columns()))
['z_true', 'lnM_true', 'z_obs_0', 'lnM_obs_0', 'ra', 'dec', 'r_Delta']
Code
show_table(catalog_to_table(hcat))
z_true lnM_true z_obs_0 lnM_obs_0 ra dec r_Delta
0 0.560 32.346 0.591 32.251 11.747 1.237 0.811
1 0.617 32.779 0.627 32.749 13.403 1.291 0.915
2 0.295 32.178 0.324 32.240 11.957 0.146 0.849
3 0.556 32.280 0.547 32.262 10.881 -0.045 0.794
4 0.645 32.431 0.641 32.541 10.111 -0.633 0.806
... ... ... ... ... ... ... ...
244 0.688 32.351 0.691 32.261 12.477 0.331 0.771
245 0.594 32.311 0.648 32.265 13.552 1.552 0.790
246 0.627 32.218 0.671 32.307 11.621 -1.219 0.756
247 0.587 32.825 0.536 32.854 12.574 1.978 0.941
248 0.584 32.353 0.653 32.367 13.869 -1.642 0.805
Table 3: Cluster catalog produced directly by the generator.

The member generator can then populate any host catalog carrying the ra, dec, z_true, lnM_true and r_Delta columns, via nc_halo_catalog_member_generator_generate:

member_gen = Nc.HaloCatalogMemberGenerator.new(hod)
member_hcat = member_gen.generate(hcat, mset, Ncm.RNG.seeded_new(None, 3))
print(f"{member_hcat.len()} members generated.")
850 members generated.

Summary

We generated a linked cluster and member mock entirely from NumCosmo C objects: the cluster abundance and proxies produce the cluster catalog, a halo occupation distribution populates each cluster with galaxies, and MockPipeline returns both as astropy tables linked by id. These catalogs feed naturally into the Sky Match tools and the cluster number-counts likelihood.

API Reference

The objects used in this tutorial:

The Python-level helpers MockPipeline and catalog_to_table live in the numcosmo_py.catalog subpackage.