API

You may find more practical guidance in example notebooks: nb.

High-level correlation function interface

Implements high-level interface to estimate 2-point correlation function.

pycorr.correlation_function.TwoPointCorrelationFunction(mode, edges, data_positions1, data_positions2=None, randoms_positions1=None, randoms_positions2=None, shifted_positions1=None, shifted_positions2=None, data_weights1=None, data_weights2=None, randoms_weights1=None, randoms_weights2=None, shifted_weights1=None, shifted_weights2=None, data_samples1=None, data_samples2=None, randoms_samples1=None, randoms_samples2=None, shifted_samples1=None, shifted_samples2=None, D1D2_weight_type='auto', D1R2_weight_type='auto', R1D2_weight_type='auto', R1R2_weight_type='auto', S1S2_weight_type='auto', D1S2_weight_type='auto', S1D2_weight_type='auto', S1R2_weight_type='auto', D1D2_twopoint_weights=None, D1R2_twopoint_weights=None, R1D2_twopoint_weights=None, R1R2_twopoint_weights=None, S1S2_twopoint_weights=None, D1S2_twopoint_weights=None, S1D2_twopoint_weights=None, S1R2_twopoint_weights=None, estimator='auto', boxsize=None, selection_attrs=None, mpicomm=None, mpiroot=None, **kwargs)

Compute two-point counts and correlation function estimation, optionally with jackknife realizations.

Note

To compute the cross-correlation of samples 1 and 2, provide data_positions2 (and optionally randoms_positions2, shifted_positions2 for the selection function / shifted random catalogs of population 2). To compute (with the correct normalization estimate) the auto-correlation of sample 1, but with 2 weights, provide data_positions1 (but no data_positions2, nor randoms_positions2 and shifted_positions2), data_weights1 and data_weights2; randoms_weights2 and shited_weights2 default to randoms_weights1 and shited_weights1, resp.

Parameters:
  • mode (string) –

    Type of correlation function, one of:

    • ”theta”: as a function of angle (in degree) between two particles

    • ”s”: as a function of distance between two particles

    • ”smu”: as a function of distance between two particles and cosine angle \(\mu\) w.r.t. the line-of-sight

    • ”rppi”: as a function of distance transverse (\(r_{p}\)) and parallel (\(\pi\)) to the line-of-sight

    • ”rp”: same as “rppi”, without binning in \(\pi\)

  • edges (tuple, array) – Tuple of bin edges (arrays), for the first (e.g. \(r_{p}\)) and optionally second (e.g. \(\pi \in [-\infty, \infty]\), \(\mu \in [-1, 1]\)) dimensions. In case of single-dimension binning (e.g. mode is “theta”, “s” or “rp”), the single array of bin edges can be provided directly. Edges are inclusive on the low end, exclusive on the high end, i.e. a pair separated by \(s\) falls in bin i if edges[i] <= s < edges[i+1]. In case mode is “smu” however, the first \(\mu\)-bin is also exclusive on the low end (increase the \(\mu\)-range by a tiny value to include \(\mu = \pm 1\)). Pairs at separation \(s = 0\) are included in the \(\mu = 0\) bin. Similarly, in case mode is “rppi”, the first \(\pi\)-bin is also exclusive on the low end and pairs at separation \(s = 0\) are included in the \(\pi = 0\) bin. In case of auto-correlation (no positions2 provided), auto-pairs (pairs of same objects) are not counted. In case of cross-correlation, all pairs are counted. In any case, duplicate objects (with separation zero) will be counted.

  • data_positions1 (array) – Positions in the first data catalog. Typically of shape (3, N), but can be (2, N) when mode is “theta”. See position_type.

  • data_positions2 (array, default=None) – Optionally, for cross-correlations, data positions in the second catalog. See data_positions1.

  • randoms_positions1 (array, default=None) – Optionally, positions of the random catalog representing the first selection function. If no randoms are provided, and estimator is “auto”, or “natural”, NaturalTwoPointEstimator will be used to estimate the correlation function, with analytical two-point counts for R1R2.

  • randoms_positions2 (array, default=None) – Optionally, for cross-correlations, positions of the random catalog representing the second selection function. See randoms_positions1.

  • shifted_positions1 (array, default=None) – Optionally, in case of BAO reconstruction, positions of the first shifted catalog.

  • shifted_positions2 (array, default=None) – Optionally, in case of BAO reconstruction, positions of the second shifted catalog.

  • data_weights1 (array, default=None) – Weights in the first data catalog. Not required if weight_type is either None or “auto”. See weight_type.

  • data_weights2 (array, default=None) – Optionally, for cross-two-point counts, weights in the second data catalog. See data_weights1.

  • randoms_weights1 (array, default=None) – Optionally, weights of the first random catalog. See data_weights1.

  • randoms_weights2 (array, default=None) – Optionally, for cross-correlations, weights of the second random catalog. See randoms_weights1.

  • shifted_weights1 (array, default=None) – Optionally, weights of the first shifted catalog. See data_weights1.

  • shifted_weights2 (array, default=None) – Optionally, weights of the second shifted catalog. See shifted_weights1.

  • data_samples1 (array, default=None) – Optionally, (integer) labels of subsamples for the first data catalog. This is used to obtain an estimate of the covariance matrix from jackknife realizations.

  • data_samples2 (array, default=None) – Same as data_samples1, for the second data catalog.

  • randoms_samples1 (array, default=None) – Same as data_samples1, for the first randoms catalog.

  • randoms_samples2 (array, default=None) – Same as data_samples1, for the second randoms catalog.

  • shifted_samples1 (array, default=None) – Same as data_samples1, for the first shifted catalog.

  • shifted_samples2 (array, default=None) – Same as data_samples1, for the second shifted catalog.

  • D1D2_weight_type (string, default='auto') –

    The type of weighting to apply to provided weights. One of:

    • None: no weights are applied.

    • ”product_individual”: each pair is weighted by the product of weights \(w_{1} w_{2}\).

    • ”inverse_bitwise”: each pair is weighted by \(\mathrm{nrealizations}/(\mathrm{noffset} + \mathrm{popcount}(w_{1} \& w_{2}))\).

      Multiple bitwise weights can be provided as a list. Individual weights can additionally be provided as float arrays. In case of cross-correlations with floating weights, bitwise weights are automatically turned to IIP weights, i.e. \(\mathrm{nrealizations}/(\mathrm{noffset} + \mathrm{popcount}(w_{1}))\).

    • ”auto”: automatically choose weighting based on input weights1 and weights2,

      i.e. None when weights1 and weights2 are None, “inverse_bitwise” if one of input weights is integer, else “product_individual”.

    In addition, angular upweights can be provided with D1D2_twopoint_weights, D1R2_twopoint_weights, etc.

  • D1R2_weight_type (string, default='auto') – Same as D1D2_weight_type, for D1R2 two-point counts.

  • R1D2_weight_type (string, default='auto') – Same as D1D2_weight_type, for R1D2 two-point counts.

  • R1R2_weight_type (string, default='auto') – Same as D1D2_weight_type, for R1R2 two-point counts.

  • S1S2_weight_type (string, default='auto') – Same as D1D2_weight_type, for S1S2 two-point counts.

  • D1S2_weight_type (string, default='auto') – Same as D1D2_weight_type, for D1S2 two-point counts.

  • S1D2_weight_type (string, default='auto') – Same as D1D2_weight_type, for S1D2 two-point counts.

  • D1D2_twopoint_weights (WeightTwoPointEstimator, default=None) – Weights to be applied to each pair of particles between first and second data catalogs. A WeightTwoPointEstimator instance or any object with arrays sep (separations) and weight (weight at given separation) as attributes (i.e. to be accessed through twopoint_weights.sep, twopoint_weights.weight) or as keys (i.e. twopoint_weights['sep'], twopoint_weights['weight']) or as element (i.e. sep, weight = twopoint_weights).

  • D1R2_twopoint_weights (WeightTwoPointEstimator, default=None) – Weights to be applied to each pair of particles between first data catalog and second randoms catalog. See D1D2_twopoint_weights.

  • R1D2_twopoint_weights (WeightTwoPointEstimator, default=None) – Weights to be applied to each pair of particles between second data catalog and first randoms catalog. See D1D2_twopoint_weights.

  • R1R2_twopoint_weights (WeightTwoPointEstimator, default=None) – Weights to be applied to each pair of particles between first and second randoms catalogs. See D1D2_twopoint_weights.

  • S1S2_twopoint_weights (WeightTwoPointEstimator, default=None) – Weights to be applied to each pair of particles between first and second shifted catalogs. See D1D2_twopoint_weights.

  • D1S2_twopoint_weights (WeightTwoPointEstimator, default=None) – Weights to be applied to each pair of particles between first data catalog and second shifted catalog. See D1D2_twopoint_weights.

  • S1D2_twopoint_weights (WeightTwoPointEstimator, default=None) – Weights to be applied to each pair of particles between second data catalog and first shifted catalog. See D1D2_twopoint_weights.

  • estimator (string, default='auto') – Estimator name, one of [“auto”, “natural”, “landyszalay”, “davispeebles”, “weight”, “residual”]. If “auto”, “landyszalay” will be chosen if random or shifted catalog(s) is/are provided, else “natural”.

  • bin_type (string, default='auto') – Binning type for first dimension, e.g. \(r_{p}\) when mode is “rppi”. Set to lin for speed-up in case of linearly-spaced bins. In this case, the bin number for a pair separated by a (3D, projected, angular…) separation sep is given by (sep - edges[0])/(edges[-1] - edges[0])*(len(edges) - 1), i.e. only the first and last bins of input edges are considered. Then setting compute_sepsavg is virtually costless. For non-linear binning, set to “custom”. “auto” allows for auto-detection of the binning type: linear binning will be chosen if input edges are within rtol = 1e-05 (relative tolerance) or atol = 1e-08 (absolute tolerance) of the array np.linspace(edges[0], edges[-1], len(edges)).

  • position_type (string, default='auto') –

    Type of input positions, one of:

    • ”rd”: RA/Dec in degree, only if mode is “theta”

    • ”rdd”: RA/Dec in degree, distance, for any mode

    • ”xyz”: Cartesian positions, shape (3, N)

    • ”pos”: Cartesian positions, shape (N, 3).

  • weight_attrs (dict, default=None) – Dictionary of weighting scheme attributes. In case weight_type is “inverse_bitwise”, one can provide “nrealizations”, the total number of realizations (including current one; defaulting to the number of bits in input weights plus one); “noffset”, the offset to be added to the bitwise counts in the denominator (defaulting to 1) and “default_value”, the default value of pairwise weights if the denominator is zero (defaulting to 0). The method used to compute the normalization with PIP weights can be specified with the keyword “normalization”: if None or “total”, normalization is given by eq. 22 of arXiv:1912.08803; “brute_force” (using OpenMP’ed C code) or “brute_force_npy” (slower, using numpy only methods; both methods match within machine precision) loop over all pairs; “counter” to normalize each pair by eq. 19 of arXiv:1912.08803. For normalizations “total” or “counter”, “nalways” specifies the number of bits systematically set to 1 minus the number of bits systematically set to 0 (defaulting to 0). For example, for the “zero-truncated” estimator (arXiv:1912.08803), one would use noffset = 0, nalways = 1.

  • selection_attrs (dict, default=None) – To select pairs to be counted, provide mapping between the quantity (string) and the interval (tuple of floats), e.g. {'rp': (0., 20.)} to select pairs with transverse separation ‘rp’ between 0 and 20, {'theta': (0., 20.)} to select pairs with separation angle ‘theta’ between 0 and 20 degrees. One can additionally provide e.g. 'counts': ['D1D2', 'D1R2'] to specify counts for which the selection is to be applied, except analytic counts; defaults to all counts.

  • los (string, default='midpoint') –

    Line-of-sight to be used when mode is “smu”, “rppi” or “rp”; one of:

    • ”x”, “y” or “z”: Cartesian axis

    • ”midpoint”: the mean position of the pair: \(\mathbf{\eta} = (\mathbf{r}_{1} + \mathbf{r}_{2})/2\)

    • ”firstpoint”: the first position of the pair: \(\mathbf{\eta} = \mathbf{r}_{1}\)

    • ”endpoint”: the second position of the pair: \(\mathbf{\eta} = \mathbf{r}_{2}\)

    WARNING: “endpoint” is obtained by reversing “firstpoint” (which is the only line-of-sight implemented in the counter). This means, if “s” or “rp” edges starts at 0, and the number of “mu” or “pi” bins is even, zero separation pairs (due to duplicate objects) will be counted in counts[0, (counts.shape[1] - 1) // 2] instead of counts[0, counts.shape[1] // 2].

  • boxsize (array, float, default=None) – For periodic wrapping, the side-length(s) of the periodic cube.

  • compute_sepsavg (bool, default=True) – Set to False to not calculate the average separation for each bin. This can make the two-point counts faster if bin_type is “custom”. In this case, sep will be set the midpoint of input edges.

  • dtype (string, np.dtype, default='f8') – Array type for positions and weights. If None, defaults to type of first array of positions. Double precision is highly recommended in case mode is “theta”, twopoint_weights is provided (due to cosine), or compute_sepsavg is True. dtype=’f8’ highly recommended for mode = 'theta' or for \(\theta\)-cut.

  • nthreads (int, default=None) – Number of OpenMP threads to use.

  • mpicomm (MPI communicator, default=None) – The MPI communicator, to MPI-distribute calculation.

  • mpiroot (int, default=None) – In case mpicomm is provided, if None, input positions and weights are assumed to be scattered across all ranks. Else the MPI rank where input positions and weights are gathered.

  • kwargs (dict) –

    One can provide “engine”, one of [‘corrfunc’, ‘cucount’], and engine-specific options, e.g. For ‘corrfunc’: - ‘isa’: one of ‘fallback’, ‘sse42’, ‘avx’, ‘fastest’ - ‘mesh_refine_factors’: an integer for ech dimension (2 for mode = 'theta'),

    which increases the resolution of the grid used to speed-up pair counting (only impacts running time).

    For ‘cucount’: - ‘refine’: a float of order 1; > 1 to increase the resolution of the grid used to speed-up pair counting (only impact running time); < 1 to decrease the resolution.

    One can also provide precomputed two-point counts, e.g. R1R2.

Returns:

estimator – Estimator with correlation function estimation BaseTwoPointEstimator.corr at separations BaseTwoPointEstimator.sep.

Return type:

BaseTwoPointEstimator

Correlation function estimators

Implements correlation function estimators, natural and Landy-Szalay.

class pycorr.twopoint_estimator.BaseTwoPointEstimator(D1D2=None, R1R2=None, D1R2=None, R1D2=None, S1S2=None, D1S2=None, S1D2=None, S1R2=None)

Bases: BaseClass

Base class for estimators. Extend this class to implement a new estimator.

corr

Correlation function estimation.

Type:

array

Initialize BaseTwoPointEstimator, and set correlation estimation corr (calling run()).

Parameters:
property XX

Return first two-point counts.

classmethod concatenate_x(*others)

Concatenate input estimators along sep by concatenating their two-point counts; see BaseTwoPointCounter.concatenate_x().

property count_names

Return list of counts used in estimator.

property edges

Edges for two-point count calculation, taken from XX.

get_corr(return_sep=False, return_cov=None, return_mask=False, mode='auto', **kwargs)

Return correlation function, optionally its covariance estimate, if available.

Parameters:
  • return_sep (bool, default=False) – Whether (True) to return average separation(s) sepavg.

  • return_cov (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), return covariance estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • return_mask (bool, default=False) – Return mask of bins that are summed over along dimension 1, for each dimension 0 point.

  • mode (str, default='auto') – If ‘poles’, return multipoles. If ‘wedges’, return \(\mu\)-wedges. If ‘wp’, return projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), return multipoles; else if ‘wedges’ provided, return \(\mu\)-wedges. else if ‘pimax’ provided, return projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ell, project_to_wedges() (if mode is ‘smu’), e.g. wedges, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

  • sep (array) – Optionally, separation values.

  • corr (array) – Estimated correlation function.

  • cov (array) – Optionally, covariance estimate (of the flattened corr), see return_cov.

property mode

Two-point counting mode, taken from XX.

property ndim

Return binning dimensionality.

normalize(wnorm='XX')

Renormalize all counts (BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm). This is useful when combining measurements in various regions.

Parameters:

wnorm (float, string, default='XX') – If float, rescale all BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm such that BaseTwoPointCounter.wnorm matches wnorm. Else, name of counts (e.g. ‘D1D2’, ‘R1R2’, etc.) to take wnorm from. ‘XX’ is the first counts of the estimator (usually ‘D1D2’).

Returns:

new – New estimator, with all counts renormalized. corr is expected to be exactly the same.

Return type:

BaseTwoPointEstimator

plot(plot_std=None, mode='auto', ax=None, fn=None, kw_save=None, show=False, **kwargs)

Plot correlation function.

Parameters:
  • plot_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), plot standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • ax (matplotlib.axes.Axes, default=None) – Axes where to plot samples. If None, takes current axes.

  • fn (string, default=None) – If not None, file name where to save figure.

  • kw_save (dict, default=None) – Optional arguments for matplotlib.figure.Figure.savefig().

  • show (bool, default=False) – Whether to show figure.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

ax

Return type:

matplotlib.axes.Axes

rebin(*args, **kwargs)

Rebin estimator, by rebinning all two-point counts. See BaseTwoPointCounter.rebin().

classmethod requires(join=None, **kwargs)

List required counts.

save(filename)

Save estimator to filename.

save_txt(filename, fmt='%.12e', delimiter=' ', header=None, comments='# ', return_std=None, mode='auto', **kwargs)

Save correlation function as txt file.

Warning

Attributes are not all saved, hence there is load_txt() method.

Parameters:
  • filename (str) – File name.

  • fmt (str, default='%.12e') – Format for floating types.

  • delimiter (str, default=' ') – String or character separating columns.

  • header (str, list, default=None) – String that will be written at the beginning of the file. If multiple lines, provide a list of one-line strings.

  • comments (str, default=' #') – String that will be prepended to the header string.

  • return_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), save standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

select(*args, **kwargs)

Restrict estimator to provided coordinate limits in place.

For example:

estimator.select((0, 0.3)) # restrict first axis to (0, 0.3)
estimator.select(None, (0, 0.2)) # restrict second axis to (0, 0.2)
property sep

Array of separation values of first dimension; if not set by run(), taken from R1R2 if provided, else XX.

sepavg(*args, **kwargs)

Return average of separation for input axis; this is an 1D array of size shape[axis].

property seps

Array of separation values, if not set by run(), taken from R1R2 if provided, else XX.

property shape

Return shape of obtained correlation corr.

slice(*args, **kwargs)

Slice estimator, by rebinning all two-point counts. See BaseTwoPointCounter.slice().

classmethod sum(*others, exclude=None, uniques=True)

Sum input estimators (their two-point counts, actually). See e.g. https://arxiv.org/pdf/1905.01133.pdf for a use case. Input two-point estimators must have same edges for this operation to make sense (no checks performed).

Parameters:
  • exclude (list, default=None) – List of counts to exclude, e.g. [‘D1D2’].

  • uniques (bool, default=True) – If True, sum only unique (in the sense of same BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm) count instances.

Returns:

new

Return type:

BaseTwoPointEstimator

wrap()

Return new ‘smu’ or ‘rppi’ two-point estimators with 2nd coordinate wrapped to positive values, \(\mu > 0\) or \(\pi > 0\).

class pycorr.twopoint_estimator.DavisPeeblesTwoPointEstimator(D1D2=None, R1R2=None, D1R2=None, R1D2=None, S1S2=None, D1S2=None, S1D2=None, S1R2=None)

Bases: BaseTwoPointEstimator

Initialize BaseTwoPointEstimator, and set correlation estimation corr (calling run()).

Parameters:
property XX

Return first two-point counts.

classmethod concatenate_x(*others)

Concatenate input estimators along sep by concatenating their two-point counts; see BaseTwoPointCounter.concatenate_x().

property count_names

Return list of counts used in estimator.

property edges

Edges for two-point count calculation, taken from XX.

get_corr(return_sep=False, return_cov=None, return_mask=False, mode='auto', **kwargs)

Return correlation function, optionally its covariance estimate, if available.

Parameters:
  • return_sep (bool, default=False) – Whether (True) to return average separation(s) sepavg.

  • return_cov (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), return covariance estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • return_mask (bool, default=False) – Return mask of bins that are summed over along dimension 1, for each dimension 0 point.

  • mode (str, default='auto') – If ‘poles’, return multipoles. If ‘wedges’, return \(\mu\)-wedges. If ‘wp’, return projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), return multipoles; else if ‘wedges’ provided, return \(\mu\)-wedges. else if ‘pimax’ provided, return projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ell, project_to_wedges() (if mode is ‘smu’), e.g. wedges, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

  • sep (array) – Optionally, separation values.

  • corr (array) – Estimated correlation function.

  • cov (array) – Optionally, covariance estimate (of the flattened corr), see return_cov.

property mode

Two-point counting mode, taken from XX.

property ndim

Return binning dimensionality.

normalize(wnorm='XX')

Renormalize all counts (BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm). This is useful when combining measurements in various regions.

Parameters:

wnorm (float, string, default='XX') – If float, rescale all BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm such that BaseTwoPointCounter.wnorm matches wnorm. Else, name of counts (e.g. ‘D1D2’, ‘R1R2’, etc.) to take wnorm from. ‘XX’ is the first counts of the estimator (usually ‘D1D2’).

Returns:

new – New estimator, with all counts renormalized. corr is expected to be exactly the same.

Return type:

BaseTwoPointEstimator

plot(plot_std=None, mode='auto', ax=None, fn=None, kw_save=None, show=False, **kwargs)

Plot correlation function.

Parameters:
  • plot_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), plot standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • ax (matplotlib.axes.Axes, default=None) – Axes where to plot samples. If None, takes current axes.

  • fn (string, default=None) – If not None, file name where to save figure.

  • kw_save (dict, default=None) – Optional arguments for matplotlib.figure.Figure.savefig().

  • show (bool, default=False) – Whether to show figure.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

ax

Return type:

matplotlib.axes.Axes

rebin(*args, **kwargs)

Rebin estimator, by rebinning all two-point counts. See BaseTwoPointCounter.rebin().

classmethod requires(join=None, **kwargs)

List required counts.

run()

Set correlation function estimate corr based on the Davis-Peebles estimator \((D1D2 - D1S2)/D1R2\).

save(filename)

Save estimator to filename.

save_txt(filename, fmt='%.12e', delimiter=' ', header=None, comments='# ', return_std=None, mode='auto', **kwargs)

Save correlation function as txt file.

Warning

Attributes are not all saved, hence there is load_txt() method.

Parameters:
  • filename (str) – File name.

  • fmt (str, default='%.12e') – Format for floating types.

  • delimiter (str, default=' ') – String or character separating columns.

  • header (str, list, default=None) – String that will be written at the beginning of the file. If multiple lines, provide a list of one-line strings.

  • comments (str, default=' #') – String that will be prepended to the header string.

  • return_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), save standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

select(*args, **kwargs)

Restrict estimator to provided coordinate limits in place.

For example:

estimator.select((0, 0.3)) # restrict first axis to (0, 0.3)
estimator.select(None, (0, 0.2)) # restrict second axis to (0, 0.2)
property sep

Array of separation values of first dimension; if not set by run(), taken from R1R2 if provided, else XX.

sepavg(*args, **kwargs)

Return average of separation for input axis; this is an 1D array of size shape[axis].

property seps

Array of separation values, if not set by run(), taken from R1R2 if provided, else XX.

property shape

Return shape of obtained correlation corr.

slice(*args, **kwargs)

Slice estimator, by rebinning all two-point counts. See BaseTwoPointCounter.slice().

classmethod sum(*others, exclude=None, uniques=True)

Sum input estimators (their two-point counts, actually). See e.g. https://arxiv.org/pdf/1905.01133.pdf for a use case. Input two-point estimators must have same edges for this operation to make sense (no checks performed).

Parameters:
  • exclude (list, default=None) – List of counts to exclude, e.g. [‘D1D2’].

  • uniques (bool, default=True) – If True, sum only unique (in the sense of same BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm) count instances.

Returns:

new

Return type:

BaseTwoPointEstimator

wrap()

Return new ‘smu’ or ‘rppi’ two-point estimators with 2nd coordinate wrapped to positive values, \(\mu > 0\) or \(\pi > 0\).

class pycorr.twopoint_estimator.LandySzalayTwoPointEstimator(D1D2=None, R1R2=None, D1R2=None, R1D2=None, S1S2=None, D1S2=None, S1D2=None, S1R2=None)

Bases: BaseTwoPointEstimator

Initialize BaseTwoPointEstimator, and set correlation estimation corr (calling run()).

Parameters:
property XX

Return first two-point counts.

classmethod concatenate_x(*others)

Concatenate input estimators along sep by concatenating their two-point counts; see BaseTwoPointCounter.concatenate_x().

property count_names

Return list of counts used in estimator.

property edges

Edges for two-point count calculation, taken from XX.

get_corr(return_sep=False, return_cov=None, return_mask=False, mode='auto', **kwargs)

Return correlation function, optionally its covariance estimate, if available.

Parameters:
  • return_sep (bool, default=False) – Whether (True) to return average separation(s) sepavg.

  • return_cov (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), return covariance estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • return_mask (bool, default=False) – Return mask of bins that are summed over along dimension 1, for each dimension 0 point.

  • mode (str, default='auto') – If ‘poles’, return multipoles. If ‘wedges’, return \(\mu\)-wedges. If ‘wp’, return projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), return multipoles; else if ‘wedges’ provided, return \(\mu\)-wedges. else if ‘pimax’ provided, return projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ell, project_to_wedges() (if mode is ‘smu’), e.g. wedges, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

  • sep (array) – Optionally, separation values.

  • corr (array) – Estimated correlation function.

  • cov (array) – Optionally, covariance estimate (of the flattened corr), see return_cov.

property mode

Two-point counting mode, taken from XX.

property ndim

Return binning dimensionality.

normalize(wnorm='XX')

Renormalize all counts (BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm). This is useful when combining measurements in various regions.

Parameters:

wnorm (float, string, default='XX') – If float, rescale all BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm such that BaseTwoPointCounter.wnorm matches wnorm. Else, name of counts (e.g. ‘D1D2’, ‘R1R2’, etc.) to take wnorm from. ‘XX’ is the first counts of the estimator (usually ‘D1D2’).

Returns:

new – New estimator, with all counts renormalized. corr is expected to be exactly the same.

Return type:

BaseTwoPointEstimator

plot(plot_std=None, mode='auto', ax=None, fn=None, kw_save=None, show=False, **kwargs)

Plot correlation function.

Parameters:
  • plot_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), plot standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • ax (matplotlib.axes.Axes, default=None) – Axes where to plot samples. If None, takes current axes.

  • fn (string, default=None) – If not None, file name where to save figure.

  • kw_save (dict, default=None) – Optional arguments for matplotlib.figure.Figure.savefig().

  • show (bool, default=False) – Whether to show figure.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

ax

Return type:

matplotlib.axes.Axes

rebin(*args, **kwargs)

Rebin estimator, by rebinning all two-point counts. See BaseTwoPointCounter.rebin().

classmethod requires(join=None, **kwargs)

List required counts.

run()

Set correlation function estimate corr based on the Landy-Szalay estimator \((D1D2 - D1S2 - S1D2 - S1S2)/R1R2\).

save(filename)

Save estimator to filename.

save_txt(filename, fmt='%.12e', delimiter=' ', header=None, comments='# ', return_std=None, mode='auto', **kwargs)

Save correlation function as txt file.

Warning

Attributes are not all saved, hence there is load_txt() method.

Parameters:
  • filename (str) – File name.

  • fmt (str, default='%.12e') – Format for floating types.

  • delimiter (str, default=' ') – String or character separating columns.

  • header (str, list, default=None) – String that will be written at the beginning of the file. If multiple lines, provide a list of one-line strings.

  • comments (str, default=' #') – String that will be prepended to the header string.

  • return_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), save standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

select(*args, **kwargs)

Restrict estimator to provided coordinate limits in place.

For example:

estimator.select((0, 0.3)) # restrict first axis to (0, 0.3)
estimator.select(None, (0, 0.2)) # restrict second axis to (0, 0.2)
property sep

Array of separation values of first dimension; if not set by run(), taken from R1R2 if provided, else XX.

sepavg(*args, **kwargs)

Return average of separation for input axis; this is an 1D array of size shape[axis].

property seps

Array of separation values, if not set by run(), taken from R1R2 if provided, else XX.

property shape

Return shape of obtained correlation corr.

slice(*args, **kwargs)

Slice estimator, by rebinning all two-point counts. See BaseTwoPointCounter.slice().

classmethod sum(*others, exclude=None, uniques=True)

Sum input estimators (their two-point counts, actually). See e.g. https://arxiv.org/pdf/1905.01133.pdf for a use case. Input two-point estimators must have same edges for this operation to make sense (no checks performed).

Parameters:
  • exclude (list, default=None) – List of counts to exclude, e.g. [‘D1D2’].

  • uniques (bool, default=True) – If True, sum only unique (in the sense of same BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm) count instances.

Returns:

new

Return type:

BaseTwoPointEstimator

wrap()

Return new ‘smu’ or ‘rppi’ two-point estimators with 2nd coordinate wrapped to positive values, \(\mu > 0\) or \(\pi > 0\).

class pycorr.twopoint_estimator.NaturalTwoPointEstimator(D1D2=None, R1R2=None, D1R2=None, R1D2=None, S1S2=None, D1S2=None, S1D2=None, S1R2=None)

Bases: BaseTwoPointEstimator

Initialize BaseTwoPointEstimator, and set correlation estimation corr (calling run()).

Parameters:
property XX

Return first two-point counts.

classmethod concatenate_x(*others)

Concatenate input estimators along sep by concatenating their two-point counts; see BaseTwoPointCounter.concatenate_x().

property count_names

Return list of counts used in estimator.

property edges

Edges for two-point count calculation, taken from XX.

get_corr(return_sep=False, return_cov=None, return_mask=False, mode='auto', **kwargs)

Return correlation function, optionally its covariance estimate, if available.

Parameters:
  • return_sep (bool, default=False) – Whether (True) to return average separation(s) sepavg.

  • return_cov (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), return covariance estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • return_mask (bool, default=False) – Return mask of bins that are summed over along dimension 1, for each dimension 0 point.

  • mode (str, default='auto') – If ‘poles’, return multipoles. If ‘wedges’, return \(\mu\)-wedges. If ‘wp’, return projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), return multipoles; else if ‘wedges’ provided, return \(\mu\)-wedges. else if ‘pimax’ provided, return projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ell, project_to_wedges() (if mode is ‘smu’), e.g. wedges, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

  • sep (array) – Optionally, separation values.

  • corr (array) – Estimated correlation function.

  • cov (array) – Optionally, covariance estimate (of the flattened corr), see return_cov.

property mode

Two-point counting mode, taken from XX.

property ndim

Return binning dimensionality.

normalize(wnorm='XX')

Renormalize all counts (BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm). This is useful when combining measurements in various regions.

Parameters:

wnorm (float, string, default='XX') – If float, rescale all BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm such that BaseTwoPointCounter.wnorm matches wnorm. Else, name of counts (e.g. ‘D1D2’, ‘R1R2’, etc.) to take wnorm from. ‘XX’ is the first counts of the estimator (usually ‘D1D2’).

Returns:

new – New estimator, with all counts renormalized. corr is expected to be exactly the same.

Return type:

BaseTwoPointEstimator

plot(plot_std=None, mode='auto', ax=None, fn=None, kw_save=None, show=False, **kwargs)

Plot correlation function.

Parameters:
  • plot_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), plot standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • ax (matplotlib.axes.Axes, default=None) – Axes where to plot samples. If None, takes current axes.

  • fn (string, default=None) – If not None, file name where to save figure.

  • kw_save (dict, default=None) – Optional arguments for matplotlib.figure.Figure.savefig().

  • show (bool, default=False) – Whether to show figure.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

ax

Return type:

matplotlib.axes.Axes

rebin(*args, **kwargs)

Rebin estimator, by rebinning all two-point counts. See BaseTwoPointCounter.rebin().

classmethod requires(join=None, **kwargs)

List required counts.

run()

Set correlation function estimate corr based on the natural estimator \((D1D2 - S1S2)/R1R2\).

save(filename)

Save estimator to filename.

save_txt(filename, fmt='%.12e', delimiter=' ', header=None, comments='# ', return_std=None, mode='auto', **kwargs)

Save correlation function as txt file.

Warning

Attributes are not all saved, hence there is load_txt() method.

Parameters:
  • filename (str) – File name.

  • fmt (str, default='%.12e') – Format for floating types.

  • delimiter (str, default=' ') – String or character separating columns.

  • header (str, list, default=None) – String that will be written at the beginning of the file. If multiple lines, provide a list of one-line strings.

  • comments (str, default=' #') – String that will be prepended to the header string.

  • return_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), save standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

select(*args, **kwargs)

Restrict estimator to provided coordinate limits in place.

For example:

estimator.select((0, 0.3)) # restrict first axis to (0, 0.3)
estimator.select(None, (0, 0.2)) # restrict second axis to (0, 0.2)
property sep

Array of separation values of first dimension; if not set by run(), taken from R1R2 if provided, else XX.

sepavg(*args, **kwargs)

Return average of separation for input axis; this is an 1D array of size shape[axis].

property seps

Array of separation values, if not set by run(), taken from R1R2 if provided, else XX.

property shape

Return shape of obtained correlation corr.

slice(*args, **kwargs)

Slice estimator, by rebinning all two-point counts. See BaseTwoPointCounter.slice().

classmethod sum(*others, exclude=None, uniques=True)

Sum input estimators (their two-point counts, actually). See e.g. https://arxiv.org/pdf/1905.01133.pdf for a use case. Input two-point estimators must have same edges for this operation to make sense (no checks performed).

Parameters:
  • exclude (list, default=None) – List of counts to exclude, e.g. [‘D1D2’].

  • uniques (bool, default=True) – If True, sum only unique (in the sense of same BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm) count instances.

Returns:

new

Return type:

BaseTwoPointEstimator

wrap()

Return new ‘smu’ or ‘rppi’ two-point estimators with 2nd coordinate wrapped to positive values, \(\mu > 0\) or \(\pi > 0\).

class pycorr.twopoint_estimator.RegisteredTwoPointEstimator(name, bases, class_dict)

Bases: BaseMetaClass

Metaclass registering BaseTwoPointEstimator-derived classes.

mro()

Return a type’s method resolution order.

set_logger()

Add attributes for logging:

  • logger

  • methods log_debug, log_info, log_warning, log_error, log_critical

class pycorr.twopoint_estimator.ResidualTwoPointEstimator(D1D2=None, R1R2=None, D1R2=None, R1D2=None, S1S2=None, D1S2=None, S1D2=None, S1R2=None)

Bases: BaseTwoPointEstimator

Initialize BaseTwoPointEstimator, and set correlation estimation corr (calling run()).

Parameters:
property XX

Return first two-point counts.

classmethod concatenate_x(*others)

Concatenate input estimators along sep by concatenating their two-point counts; see BaseTwoPointCounter.concatenate_x().

property count_names

Return list of counts used in estimator.

property edges

Edges for two-point count calculation, taken from XX.

get_corr(return_sep=False, return_cov=None, return_mask=False, mode='auto', **kwargs)

Return correlation function, optionally its covariance estimate, if available.

Parameters:
  • return_sep (bool, default=False) – Whether (True) to return average separation(s) sepavg.

  • return_cov (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), return covariance estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • return_mask (bool, default=False) – Return mask of bins that are summed over along dimension 1, for each dimension 0 point.

  • mode (str, default='auto') – If ‘poles’, return multipoles. If ‘wedges’, return \(\mu\)-wedges. If ‘wp’, return projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), return multipoles; else if ‘wedges’ provided, return \(\mu\)-wedges. else if ‘pimax’ provided, return projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ell, project_to_wedges() (if mode is ‘smu’), e.g. wedges, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

  • sep (array) – Optionally, separation values.

  • corr (array) – Estimated correlation function.

  • cov (array) – Optionally, covariance estimate (of the flattened corr), see return_cov.

property mode

Two-point counting mode, taken from XX.

property ndim

Return binning dimensionality.

normalize(wnorm='XX')

Renormalize all counts (BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm). This is useful when combining measurements in various regions.

Parameters:

wnorm (float, string, default='XX') – If float, rescale all BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm such that BaseTwoPointCounter.wnorm matches wnorm. Else, name of counts (e.g. ‘D1D2’, ‘R1R2’, etc.) to take wnorm from. ‘XX’ is the first counts of the estimator (usually ‘D1D2’).

Returns:

new – New estimator, with all counts renormalized. corr is expected to be exactly the same.

Return type:

BaseTwoPointEstimator

plot(plot_std=None, mode='auto', ax=None, fn=None, kw_save=None, show=False, **kwargs)

Plot correlation function.

Parameters:
  • plot_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), plot standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • ax (matplotlib.axes.Axes, default=None) – Axes where to plot samples. If None, takes current axes.

  • fn (string, default=None) – If not None, file name where to save figure.

  • kw_save (dict, default=None) – Optional arguments for matplotlib.figure.Figure.savefig().

  • show (bool, default=False) – Whether to show figure.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

ax

Return type:

matplotlib.axes.Axes

rebin(*args, **kwargs)

Rebin estimator, by rebinning all two-point counts. See BaseTwoPointCounter.rebin().

classmethod requires(join=None, **kwargs)

List required counts.

run()

Set correlation function estimate corr based on the residual estimator \((D1R2 - S1R2)/R1R2\).

save(filename)

Save estimator to filename.

save_txt(filename, fmt='%.12e', delimiter=' ', header=None, comments='# ', return_std=None, mode='auto', **kwargs)

Save correlation function as txt file.

Warning

Attributes are not all saved, hence there is load_txt() method.

Parameters:
  • filename (str) – File name.

  • fmt (str, default='%.12e') – Format for floating types.

  • delimiter (str, default=' ') – String or character separating columns.

  • header (str, list, default=None) – String that will be written at the beginning of the file. If multiple lines, provide a list of one-line strings.

  • comments (str, default=' #') – String that will be prepended to the header string.

  • return_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), save standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

select(*args, **kwargs)

Restrict estimator to provided coordinate limits in place.

For example:

estimator.select((0, 0.3)) # restrict first axis to (0, 0.3)
estimator.select(None, (0, 0.2)) # restrict second axis to (0, 0.2)
property sep

Array of separation values of first dimension; if not set by run(), taken from R1R2 if provided, else XX.

sepavg(*args, **kwargs)

Return average of separation for input axis; this is an 1D array of size shape[axis].

property seps

Array of separation values, if not set by run(), taken from R1R2 if provided, else XX.

property shape

Return shape of obtained correlation corr.

slice(*args, **kwargs)

Slice estimator, by rebinning all two-point counts. See BaseTwoPointCounter.slice().

classmethod sum(*others, exclude=None, uniques=True)

Sum input estimators (their two-point counts, actually). See e.g. https://arxiv.org/pdf/1905.01133.pdf for a use case. Input two-point estimators must have same edges for this operation to make sense (no checks performed).

Parameters:
  • exclude (list, default=None) – List of counts to exclude, e.g. [‘D1D2’].

  • uniques (bool, default=True) – If True, sum only unique (in the sense of same BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm) count instances.

Returns:

new

Return type:

BaseTwoPointEstimator

wrap()

Return new ‘smu’ or ‘rppi’ two-point estimators with 2nd coordinate wrapped to positive values, \(\mu > 0\) or \(\pi > 0\).

class pycorr.twopoint_estimator.TwoPointEstimator

Bases: BaseClass

Entry point to two-point estimators.

Parameters:
  • estimator (string, default='landyszalay') – Estimator name, one of [“auto”, “natural”, “davispeebles”, “landyszalay”].

  • args (list) – Arguments for two-point estimator, see TwoPointEstimator.

  • kwargs (dict) – Arguments for two-point estimator, see TwoPointEstimator.

Returns:

estimator – Estimator instance.

Return type:

BaseTwoPointEstimator

static from_state(state, load=False)

Return new estimator based on state dictionary.

exception pycorr.twopoint_estimator.TwoPointEstimatorError

Bases: Exception

Exception raised when issue with estimator.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class pycorr.twopoint_estimator.WeightTwoPointEstimator(D1D2=None, R1R2=None, D1R2=None, R1D2=None, S1S2=None, D1S2=None, S1D2=None, S1R2=None)

Bases: NaturalTwoPointEstimator

Initialize BaseTwoPointEstimator, and set correlation estimation corr (calling run()).

Parameters:
property XX

Return first two-point counts.

classmethod concatenate_x(*others)

Concatenate input estimators along sep by concatenating their two-point counts; see BaseTwoPointCounter.concatenate_x().

property count_names

Return list of counts used in estimator.

property edges

Edges for two-point count calculation, taken from XX.

get_corr(return_sep=False, return_cov=None, return_mask=False, mode='auto', **kwargs)

Return correlation function, optionally its covariance estimate, if available.

Parameters:
  • return_sep (bool, default=False) – Whether (True) to return average separation(s) sepavg.

  • return_cov (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), return covariance estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • return_mask (bool, default=False) – Return mask of bins that are summed over along dimension 1, for each dimension 0 point.

  • mode (str, default='auto') – If ‘poles’, return multipoles. If ‘wedges’, return \(\mu\)-wedges. If ‘wp’, return projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), return multipoles; else if ‘wedges’ provided, return \(\mu\)-wedges. else if ‘pimax’ provided, return projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ell, project_to_wedges() (if mode is ‘smu’), e.g. wedges, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

  • sep (array) – Optionally, separation values.

  • corr (array) – Estimated correlation function.

  • cov (array) – Optionally, covariance estimate (of the flattened corr), see return_cov.

property mode

Two-point counting mode, taken from XX.

property ndim

Return binning dimensionality.

normalize(wnorm='XX')

Renormalize all counts (BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm). This is useful when combining measurements in various regions.

Parameters:

wnorm (float, string, default='XX') – If float, rescale all BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm such that BaseTwoPointCounter.wnorm matches wnorm. Else, name of counts (e.g. ‘D1D2’, ‘R1R2’, etc.) to take wnorm from. ‘XX’ is the first counts of the estimator (usually ‘D1D2’).

Returns:

new – New estimator, with all counts renormalized. corr is expected to be exactly the same.

Return type:

BaseTwoPointEstimator

plot(plot_std=None, mode='auto', ax=None, fn=None, kw_save=None, show=False, **kwargs)

Plot correlation function.

Parameters:
  • plot_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), plot standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • ax (matplotlib.axes.Axes, default=None) – Axes where to plot samples. If None, takes current axes.

  • fn (string, default=None) – If not None, file name where to save figure.

  • kw_save (dict, default=None) – Optional arguments for matplotlib.figure.Figure.savefig().

  • show (bool, default=False) – Whether to show figure.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

ax

Return type:

matplotlib.axes.Axes

rebin(*args, **kwargs)

Rebin estimator, by rebinning all two-point counts. See BaseTwoPointCounter.rebin().

classmethod requires(join=None, **kwargs)

List required counts.

run()

Set weight estimate corr following \(R1R2/D1D2\), typically used for angular weights, with R1R2 from parent sample and D1D2 from fibered sample.

save(filename)

Save estimator to filename.

save_txt(filename, fmt='%.12e', delimiter=' ', header=None, comments='# ', return_std=None, mode='auto', **kwargs)

Save correlation function as txt file.

Warning

Attributes are not all saved, hence there is load_txt() method.

Parameters:
  • filename (str) – File name.

  • fmt (str, default='%.12e') – Format for floating types.

  • delimiter (str, default=' ') – String or character separating columns.

  • header (str, list, default=None) – String that will be written at the beginning of the file. If multiple lines, provide a list of one-line strings.

  • comments (str, default=' #') – String that will be prepended to the header string.

  • return_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), save standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

select(*args, **kwargs)

Restrict estimator to provided coordinate limits in place.

For example:

estimator.select((0, 0.3)) # restrict first axis to (0, 0.3)
estimator.select(None, (0, 0.2)) # restrict second axis to (0, 0.2)
property sep

Array of separation values of first dimension; if not set by run(), taken from R1R2 if provided, else XX.

sepavg(*args, **kwargs)

Return average of separation for input axis; this is an 1D array of size shape[axis].

property seps

Array of separation values, if not set by run(), taken from R1R2 if provided, else XX.

property shape

Return shape of obtained correlation corr.

slice(*args, **kwargs)

Slice estimator, by rebinning all two-point counts. See BaseTwoPointCounter.slice().

classmethod sum(*others, exclude=None, uniques=True)

Sum input estimators (their two-point counts, actually). See e.g. https://arxiv.org/pdf/1905.01133.pdf for a use case. Input two-point estimators must have same edges for this operation to make sense (no checks performed).

Parameters:
  • exclude (list, default=None) – List of counts to exclude, e.g. [‘D1D2’].

  • uniques (bool, default=True) – If True, sum only unique (in the sense of same BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm) count instances.

Returns:

new

Return type:

BaseTwoPointEstimator

property weight

Another name for corr.

wrap()

Return new ‘smu’ or ‘rppi’ two-point estimators with 2nd coordinate wrapped to positive values, \(\mu > 0\) or \(\pi > 0\).

pycorr.twopoint_estimator.get_twopoint_estimator(estimator='auto', with_DR=True, with_jackknife=False)

Return BaseTwoPointEstimator subclass corresponding to input estimator name.

Parameters:
  • estimator (string, default='auto') – Estimator name, one of [“auto”, “natural”, “davispeebles”, “landyszalay”]. If “auto”, “landyszalay” will be chosen if with_DR, else “natural”.

  • with_DR (bool, default=True) – If estimator will be provided with two-point counts from data x random catalogs. See above.

Returns:

estimator – Estimator class.

Return type:

type

pycorr.twopoint_estimator.project_to_multipoles(estimator, ells=(0, 2, 4), return_sep=True, return_cov=None, ignore_nan=False, rp=None, return_mask=False, **kwargs)

Project \((s, \mu)\) correlation function estimation onto Legendre polynomials.

Parameters:
  • estimator (BaseTwoPointEstimator) – Estimator for \((s, \mu)\) correlation function.

  • ells (tuple, int, default=(0, 2, 4)) – Order of Legendre polynomial.

  • return_sep (bool, default=True) – Whether (True) to return separation.

  • return_cov (bool, default=None) – If True or None and input estimator holds (jackknife) realization(), return covariance matrix estimate (for all successive ells). If True and input estimator does not have realization(), raise TwoPointEstimatorError.

  • ignore_nan (bool, default=False) – If True, ignore NaN values of the correlation function in the integration.

  • rp (tuple, default=None) – Optionally, tuple of min and max values for a \(r_{p} = s \sqrt(1 - \mu^{2})\) cut.

  • return_mask (bool, default=False) – Return mask of \(\mu\)-bins that are summed over, for each \(s\); of shape estimator.shape.

  • kwargs (dict) – Optional arguments for JackknifeTwoPointEstimator.realization(), when relevant.

Returns:

  • sep (array) – Optionally, array of separation values.

  • poles (array) – Correlation function multipoles.

  • cov (array) – Optionally, covariance estimate (for all successive ells), see return_cov.

pycorr.twopoint_estimator.project_to_poles(estimator, ells=(0, 2, 4), return_sep=True, return_cov=None, ignore_nan=False, rp=None, return_mask=False, **kwargs)

Project \((s, \mu)\) correlation function estimation onto Legendre polynomials.

Parameters:
  • estimator (BaseTwoPointEstimator) – Estimator for \((s, \mu)\) correlation function.

  • ells (tuple, int, default=(0, 2, 4)) – Order of Legendre polynomial.

  • return_sep (bool, default=True) – Whether (True) to return separation.

  • return_cov (bool, default=None) – If True or None and input estimator holds (jackknife) realization(), return covariance matrix estimate (for all successive ells). If True and input estimator does not have realization(), raise TwoPointEstimatorError.

  • ignore_nan (bool, default=False) – If True, ignore NaN values of the correlation function in the integration.

  • rp (tuple, default=None) – Optionally, tuple of min and max values for a \(r_{p} = s \sqrt(1 - \mu^{2})\) cut.

  • return_mask (bool, default=False) – Return mask of \(\mu\)-bins that are summed over, for each \(s\); of shape estimator.shape.

  • kwargs (dict) – Optional arguments for JackknifeTwoPointEstimator.realization(), when relevant.

Returns:

  • sep (array) – Optionally, array of separation values.

  • poles (array) – Correlation function multipoles.

  • cov (array) – Optionally, covariance estimate (for all successive ells), see return_cov.

pycorr.twopoint_estimator.project_to_wedges(estimator, wedges=[(-1.0, -0.6666666666666666), (-0.6666666666666666, -0.3333333333333333), (-0.3333333333333333, 0.0), (0.0, 0.3333333333333333), (0.3333333333333333, 0.6666666666666666), (0.6666666666666666, 1.0)], return_sep=True, return_cov=None, ignore_nan=False, rp=None, return_mask=False, **kwargs)

Project \((s, \mu)\) correlation function estimation onto wedges (integrating over \(\mu\)).

Parameters:
  • estimator (BaseTwoPointEstimator) – Estimator for \((s, \mu)\) correlation function.

  • wedges (tuple, default=(-1., -2./3, -1./3, 0., 1./3, 2./3, 1.)) – \(mu\)-wedges. Single or list of tuples \((\mu_{\mathrm{min}}, \mu_{\mathrm{max}})\), or \(\mu\)-edges \((\mu_{0}, ..., \mu_{n})\),

  • return_sep (bool, default=True) – Whether (True) to return separation.

  • return_cov (bool, default=None) – If True or None and input estimator holds (jackknife) realization(), return covariance matrix estimate (for all successive ells). If True and input estimator does not have realization(), raise TwoPointEstimatorError.

  • ignore_nan (bool, default=False) – If True, ignore NaN values of the correlation functions in the integration.

  • rp (tuple, default=None) – Optionally, tuple of min and max values for a \(r_{p} = s \sqrt(1 - \mu^{2})\) cut.

  • return_mask (bool, default=False) – Return mask of \(\mu\)-bins that are summed over, for each \(s\); of shape (n,) + estimator.shape, with \(n\) the number of wedges.

  • kwargs (dict) – Optional arguments for JackknifeTwoPointEstimator.realization(), when relevant.

Returns:

  • sep (array) – Optionally, array of separation values.

  • wedges (array) – Correlation function wedges.

  • cov (array) – Optionally, covariance estimate (for all successive wedges), see return_cov.

pycorr.twopoint_estimator.project_to_wp(estimator, pimax=None, return_sep=True, return_cov=None, ignore_nan=False, return_mask=False, **kwargs)

Integrate \((r_{p}, \pi)\) correlation function over \(\pi\) to obtain \(w_{p}(r_{p})\).

Parameters:
  • estimator (BaseTwoPointEstimator) – Estimator for \((r_{p}, \pi)\) correlation function.

  • pimax (float, tuple, default=None) – Upper bound for summation of \(\pi\), or tuple of (lower bound, upper bound).

  • return_sep (bool, default=True) – Whether (True) to return separation.

  • return_cov (bool, default=None) – If True or None and input estimator holds (jackknife) realization(), return covariance matrix estimate (for all successive ells). If True and input estimator does not have realization(), raise TwoPointEstimatorError.

  • ignore_nan (bool, default=False) – If True, ignore NaN values of the correlation functions in the integration.

  • return_mask (bool, default=False) – Return mask of \(\pi\)-bins that are summed over, for each \(r_{p}\); of shape estimator.shape.

  • kwargs (dict) – Optional arguments for JackknifeTwoPointEstimator.realization(), when relevant.

Returns:

  • sep (array) – Optionally, array of separation values.

  • wp (array) – Estimated \(w_{p}(r_{p})\).

  • cov (array) – Optionally, covariance estimate, see return_cov.

Base two-point counter class

Implements base two-point counter, to be extended when implementing a new engine.

class pycorr.twopoint_counter.AnalyticTwoPointCounter(mode, edges, boxsize, size1=10, size2=None, los='z', selection_attrs=None)

Bases: BaseTwoPointCounter

Analytic two-point counter. Assume periodic wrapping and no data weights.

wcounts

Analytical two-point counts.

Type:

array

Initialize AnalyticTwoPointCounter, and set wcounts and sep.

Parameters:
  • mode (string) –

    Two-point counting mode, one of:

    • ”s”: two-point counts as a function of distance between two particles

    • ”smu”: two-point counts as a function of distance between two particles and cosine angle \(\mu\) w.r.t. the line-of-sight

    • ”rppi”: two-point counts as a function of distance transverse (\(r_{p}\)) and parallel (\(\pi\)) to the line-of-sight

    • ”rp”: same as “rppi”, without binning in \(\pi\)

  • edges (tuple, array) – Tuple of bin edges (arrays), for the first (e.g. \(r_{p}\)) and optionally second (e.g. \(\pi\)) dimensions. In case of single-dimension binning (e.g. mode is “theta”, “s” or “rp”), the single array of bin edges can be provided directly.

  • boxsize (array, float) – The side-length(s) of the periodic cube.

  • size1 (int, default=2) – Length of the first catalog.

  • size2 (int, default=None) – Optionally, for cross-two-point counts, length of second catalog.

  • los (string, default='z') – Line-of-sight to be used when mode is “rp”, in case of non-cubic box; one of cartesian axes “x”, “y” or “z”.

property compute_sepavg

Whether to compute average of separation values for first dimension (e.g. \(s\) if mode is “smu”).

classmethod concatenate_x(*others)

Concatenate input two-point counts along sep; useful when running two-point counts at different particle densities, e.g. high density on small scales, and lower density on larger scales, to keep computing time tractable.

Warning

wnorm is cast to a ndim array.

property ndim

Return binning dimensionality.

normalization()

Return two-point count normalization, i.e., in case of cross-correlation size1 * size2, and in case of auto-correlation size1 * (size1 - 1).

normalize(wnorm)

Rescale both wcounts and wnorm such that new wnorm matches wnorm. This is useful when combining counts in various regions.

Parameters:

wnorm (float) – New normalization wnorm.

Returns:

new – Normalized counts.

Return type:

BaseTwoPointCounter

normalized_wcounts()

Return normalized two-point counts, i.e. wcounts divided by normalization().

property periodic

Whether periodic wrapping is used (i.e. boxsize is not None).

rebin(factor=1)

Rebin two-point counts, by factor(s) factor. Input factors must divide shape.

Warning

If current instance is the result of concatenate_x(), rebinning is exact only if factor divides each of the constant-wnorm chunks.

reverse()

Return counts for reversed positions1/weights1 and positions2/weights2.

run()

Set analytical two-point counts.

save(filename)

Save two-point counts to filename.

save_txt(filename, fmt='%.12e', delimiter=' ', header=None, comments='# ')

Save two-point counts as txt file.

Warning

Attributes are not all saved, hence there is load_txt() method.

Parameters:
  • filename (str) – File name.

  • fmt (str, default='%.12e') – Format for floating types.

  • delimiter (str, default=' ') – String or character separating columns.

  • header (str, list, default=None) – String that will be written at the beginning of the file. If multiple lines, provide a list of one-line strings.

  • comments (str, default=' #') – String that will be prepended to the header string.

select(*xlims)

Restrict counts to provided coordinate limits in place.

For example:

counts.select((0, 0.3))  # restrict first axis to (0, 0.3)
counts.select(None, (0, 0.2))  # restrict second axis to (0, 0.2)
statistic.select((0, 30, 4))   # rebin to match step size of 4 and restrict to (0, 30)
property sep

Array of separation values of first dimension (e.g. \(s\) if mode is “smu”).

sepavg(axis=0, method=None)

Return average of separation for input axis.

Parameters:
  • axis (int, default=0) – Axis; if mode is “smu”, 0 is \(s\), 1 is \(mu\); if mode is “rppi”, 0 is \(r_{p}\), 1 is \(\pi\).

  • method (str, default=None) – If None, return average separation from seps. If ‘mid’, return bin mid-points.

Returns:

sepavg – 1D array of size shape[axis].

Return type:

array

property shape

Return shape of obtained counts wcounts.

slice(*slices)

Slice counts in place. If slice step is not 1, use rebin(). For example:

counts.slice(slice(0, 10, 2), slice(0, 6, 3)) # rebin by factor 2 (resp. 3) along axis 0 (resp. 1), up to index 10 (resp. 6)
counts[:10:2,:6:3] # same as above, but return new instance.
classmethod sum(*others)

Sum input two-point counts; useful when splitting input sample of particles; e.g. https://arxiv.org/pdf/1905.01133.pdf.

Warning

If > 1 input two-point counts, size1, size2 attributes will be lost. Input two-point counts must have same edges for this operation to make sense (no checks performed).

property with_mpi

Whether to use MPI.

wrap()

Return new ‘smu’ or ‘rppi’ two-point counts with 2nd coordinate wrapped to positive values, \(\mu > 0\) or \(\pi > 0\).

class pycorr.twopoint_counter.BaseTwoPointCounter(mode, edges, positions1, positions2=None, weights1=None, weights2=None, bin_type='auto', position_type='auto', weight_type='auto', weight_attrs=None, twopoint_weights=None, selection_attrs=None, los='midpoint', boxsize=None, compute_sepsavg=True, dtype='f8', nthreads=None, mpicomm=None, mpiroot=None, **kwargs)

Bases: BaseClass

Base class for two-point counters. Extend this class to implement a new two-point counter engine.

wcounts

(Optionally weighted) two-point counts.

Type:

array

wnorm

Two-point count normalization.

Type:

float

Initialize BaseTwoPointCounter, and run actual two-point counts (calling run()), setting wcounts and sep.

Parameters:
  • mode (string) –

    Type of two-point counts, one of:

    • ”theta”: as a function of angle (in degree) between two particles

    • ”s”: as a function of distance between two particles

    • ”smu”: as a function of distance between two particles and cosine angle \(\mu\) w.r.t. the line-of-sight

    • ”rppi”: as a function of distance transverse (\(r_{p}\)) and parallel (\(\pi\)) to the line-of-sight

    • ”rp”: same as “rppi”, without binning in \(\pi\)

  • edges (tuple, array) – Tuple of bin edges (arrays), for the first (e.g. \(r_{p}\)) and optionally second (e.g. \(\pi \in [-\infty, \infty]\), \(\mu \in [-1, 1]\)) dimensions. In case of single-dimension binning (e.g. mode is “theta”, “s” or “rp”), the single array of bin edges can be provided directly. Edges are inclusive on the low end, exclusive on the high end, i.e. a pair separated by \(s\) falls in bin i if edges[i] <= s < edges[i+1]. In case mode is “smu” however, the first \(\mu\)-bin is also exclusive on the low end (increase the \(\mu\)-range by a tiny value to include \(\mu = \pm 1\)). Pairs at separation \(s = 0\) are included in the \(\mu = 0\) bin. Similarly, in case mode is “rppi”, the first \(\pi\)-bin is also exclusive on the low end and pairs at separation \(s = 0\) are included in the \(\pi = 0\) bin. In case of auto-correlation (no positions2 provided), auto-pairs (pairs of same objects) are not counted. In case of cross-correlation, all pairs are counted. In any case, duplicate objects (with separation zero) will be counted.

  • positions1 (list, array) – Positions in the first catalog. Typically of shape (3, N), but can be (2, N) when mode is “theta”. See position_type.

  • positions2 (list, array, default=None) – Optionally, for cross-two-point counts, positions in the second catalog. See positions1.

  • weights1 (array, list, default=None) – Weights of the first catalog. Not required if weight_type is either None or “auto”. See weight_type.

  • weights2 (array, list, default=None) – Optionally, for cross-two-point counts, weights in the second catalog. See weights1.

  • bin_type (string, default='auto') – Binning type for first dimension, e.g. \(r_{p}\) when mode is “rppi”. Set to lin for speed-up in case of linearly-spaced bins. In this case, the bin number for a pair separated by a (3D, projected, angular…) separation sep is given by (sep - edges[0])/(edges[-1] - edges[0])*(len(edges) - 1), i.e. only the first and last bins of input edges are considered. Then setting compute_sepsavg is virtually costless. For non-linear binning, set to “custom”. “auto” allows for auto-detection of the binning type: linear binning will be chosen if input edges are within rtol = 1e-05 (relative tolerance) or atol = 1e-08 (absolute tolerance) of the array np.linspace(edges[0], edges[-1], len(edges)).

  • position_type (string, default='auto') –

    Type of input positions, one of:

    • ”rd”: RA/Dec in degree, only if mode is “theta”

    • ”rdd”: RA/Dec in degree, distance, for any mode

    • ”xyz”: Cartesian positions, shape (3, N)

    • ”pos”: Cartesian positions, shape (N, 3).

  • weight_type (string, default='auto') –

    The type of weighting to apply to provided weights. One of:

    • None: no weights are applied.

    • ”product_individual”: each pair is weighted by the product of weights \(w_{1} w_{2}\).

    • ”inverse_bitwise”: each pair is weighted by \(\mathrm{nrealizations}/(\mathrm{noffset} + \mathrm{popcount}(w_{1} \& w_{2}))\).

      Multiple bitwise weights can be provided as a list. Individual weights can additionally be provided as float arrays. In case of cross-correlations with floating weights, bitwise weights are automatically turned to IIP weights, i.e. \(\mathrm{nrealizations}/(\mathrm{noffset} + \mathrm{popcount}(w_{1}))\).

    • ”auto”: automatically choose weighting based on input weights1 and weights2,

      i.e. None when weights1 and weights2 are None, “inverse_bitwise” if one of input weights is integer, else “product_individual”.

    In addition, angular upweights can be provided with twopoint_weights.

  • weight_attrs (dict, default=None) – Dictionary of weighting scheme attributes. In case weight_type is “inverse_bitwise”, one can provide “nrealizations”, the total number of realizations (including current one; defaulting to the number of bits in input weights plus one); “noffset”, the offset to be added to the bitwise counts in the denominator (defaulting to 1) and “default_value”, the default value of pairwise weights if the denominator is zero (defaulting to 0). The method used to compute the normalization with PIP weights can be specified with the keyword “normalization”: if None or “total”, normalization is given by eq. 22 of arXiv:1912.08803; “brute_force” (using OpenMP’ed C code) or “brute_force_npy” (slower, using numpy only methods; both methods match within machine precision) loop over all pairs; “counter” to normalize each pair by eq. 19 of arXiv:1912.08803. For normalizations “total” or “counter”, “nalways” specifies the number of bits systematically set to 1 minus the number of bits systematically set to 0 (defaulting to 0). For example, for the “zero-truncated” estimator (arXiv:1912.08803), one would use noffset = 0, nalways = 1.

  • twopoint_weights (WeightTwoPointEstimator, default=None) – Weights to be applied to each pair of particles. A WeightTwoPointEstimator instance or any object with arrays sep (separations) and weight (weight at given separation) as attributes (i.e. to be accessed through twopoint_weights.sep, twopoint_weights.weight) or as keys (i.e. twopoint_weights['sep'], twopoint_weights['weight']) or as element (i.e. sep, weight = twopoint_weights).

  • selection_attrs (dict, default=None) – To select pairs to be counted, provide mapping between the quantity (string) and the interval (tuple of floats), e.g. {'rp': (0., 20.)} to select pairs with transverse separation ‘rp’ between 0 and 20, {‘theta’: (0., 20.)}` to select pairs with separation angle ‘theta’ between 0 and 20 degrees.

  • los (string, default='midpoint') –

    Line-of-sight to be used when mode is “smu”, “rppi” or “rp”; one of:

    • ”x”, “y” or “z”: Cartesian axis

    • ”midpoint”: the mean position of the pair: \(\mathbf{\eta} = (\mathbf{r}_{1} + \mathbf{r}_{2})/2\)

    • ”firstpoint”: the first position of the pair: \(\mathbf{\eta} = \mathbf{r}_{1}\)

    • ”endpoint”: the second position of the pair: \(\mathbf{\eta} = \mathbf{r}_{2}\)

    WARNING: “endpoint” is obtained by reversing “firstpoint” (which is the only line-of-sight implemented in the counter). This means, if “s” or “rp” edges starts at 0, and the number of “mu” or “pi” bins is even, zero separation pairs (due to duplicate objects) will be counted in counts[0, (counts.shape[1] - 1) // 2] instead of counts[0, counts.shape[1] // 2].

  • boxsize (array, float, default=None) – For periodic wrapping, the side-length(s) of the periodic cube.

  • compute_sepsavg (bool, default=True) – Set to False to not calculate the average separation for each bin. This can make the two-point counts faster if bin_type is “custom”. In this case, sep will be set the midpoint of input edges.

  • dtype (string, np.dtype, default='f8') – Array type for positions and weights. If None, defaults to type of first positions1 array. Double precision is highly recommended in case mode is “theta”, twopoint_weights is provided (due to cosine), or compute_sepsavg is True.

  • nthreads (int, default=None) – Number of OpenMP threads to use.

  • mpicomm (MPI communicator, default=None) – The MPI communicator, to MPI-distribute calculation.

  • mpiroot (int, default=None) – In case mpicomm is provided, if None, input positions and weights are assumed to be scattered across all ranks. Else the MPI rank where input positions and weights are gathered.

  • kwargs (dict) – Two-point counter engine-specific options.

property compute_sepavg

Whether to compute average of separation values for first dimension (e.g. \(s\) if mode is “smu”).

classmethod concatenate_x(*others)

Concatenate input two-point counts along sep; useful when running two-point counts at different particle densities, e.g. high density on small scales, and lower density on larger scales, to keep computing time tractable.

Warning

wnorm is cast to a ndim array.

property ndim

Return binning dimensionality.

normalization()

Return two-point count normalization, i.e., in case of cross-correlation:

\[\left(\sum_{i=1}^{N_{1}} w_{1,i}\right) \left(\sum_{j=1}^{N_{2}} w_{2,j}\right)\]

with the sums running over the weights of the first and second catalogs, and in case of auto-correlation:

\[\left(\sum_{i=1}^{N_{1}} w_{1,i}\right)^{2} - \sum_{i=1}^{N_{1}} w_{1,i}^{2}\]
normalize(wnorm)

Rescale both wcounts and wnorm such that new wnorm matches wnorm. This is useful when combining counts in various regions.

Parameters:

wnorm (float) – New normalization wnorm.

Returns:

new – Normalized counts.

Return type:

BaseTwoPointCounter

normalized_wcounts()

Return normalized two-point counts, i.e. wcounts divided by normalization().

property periodic

Whether periodic wrapping is used (i.e. boxsize is not None).

rebin(factor=1)

Rebin two-point counts, by factor(s) factor. Input factors must divide shape.

Warning

If current instance is the result of concatenate_x(), rebinning is exact only if factor divides each of the constant-wnorm chunks.

reverse()

Return counts for reversed positions1/weights1 and positions2/weights2.

run()

Method that computes the actual two-point counts and set wcounts and sep, to be implemented in your new engine.

save(filename)

Save two-point counts to filename.

save_txt(filename, fmt='%.12e', delimiter=' ', header=None, comments='# ')

Save two-point counts as txt file.

Warning

Attributes are not all saved, hence there is load_txt() method.

Parameters:
  • filename (str) – File name.

  • fmt (str, default='%.12e') – Format for floating types.

  • delimiter (str, default=' ') – String or character separating columns.

  • header (str, list, default=None) – String that will be written at the beginning of the file. If multiple lines, provide a list of one-line strings.

  • comments (str, default=' #') – String that will be prepended to the header string.

select(*xlims)

Restrict counts to provided coordinate limits in place.

For example:

counts.select((0, 0.3))  # restrict first axis to (0, 0.3)
counts.select(None, (0, 0.2))  # restrict second axis to (0, 0.2)
statistic.select((0, 30, 4))   # rebin to match step size of 4 and restrict to (0, 30)
property sep

Array of separation values of first dimension (e.g. \(s\) if mode is “smu”).

sepavg(axis=0, method=None)

Return average of separation for input axis.

Parameters:
  • axis (int, default=0) – Axis; if mode is “smu”, 0 is \(s\), 1 is \(mu\); if mode is “rppi”, 0 is \(r_{p}\), 1 is \(\pi\).

  • method (str, default=None) – If None, return average separation from seps. If ‘mid’, return bin mid-points.

Returns:

sepavg – 1D array of size shape[axis].

Return type:

array

property shape

Return shape of obtained counts wcounts.

slice(*slices)

Slice counts in place. If slice step is not 1, use rebin(). For example:

counts.slice(slice(0, 10, 2), slice(0, 6, 3)) # rebin by factor 2 (resp. 3) along axis 0 (resp. 1), up to index 10 (resp. 6)
counts[:10:2,:6:3] # same as above, but return new instance.
classmethod sum(*others)

Sum input two-point counts; useful when splitting input sample of particles; e.g. https://arxiv.org/pdf/1905.01133.pdf.

Warning

If > 1 input two-point counts, size1, size2 attributes will be lost. Input two-point counts must have same edges for this operation to make sense (no checks performed).

property with_mpi

Whether to use MPI.

wrap()

Return new ‘smu’ or ‘rppi’ two-point counts with 2nd coordinate wrapped to positive values, \(\mu > 0\) or \(\pi > 0\).

class pycorr.twopoint_counter.MetaTwoPointCounter(name, bases, class_dict)

Bases: BaseMetaClass

Metaclass to return correct two-point counter engine.

mro()

Return a type’s method resolution order.

set_logger()

Add attributes for logging:

  • logger

  • methods log_debug, log_info, log_warning, log_error, log_critical

class pycorr.twopoint_counter.RegisteredTwoPointCounter(name, bases, class_dict)

Bases: BaseMetaClass

Metaclass registering BaseTwoPointCounter-derived classes.

mro()

Return a type’s method resolution order.

set_logger()

Add attributes for logging:

  • logger

  • methods log_debug, log_info, log_warning, log_error, log_critical

class pycorr.twopoint_counter.TwoPointCounter(*args, engine='corrfunc', **kwargs)

Bases: BaseClass

Entry point to two-point counter engines.

Parameters:
  • engine (string, default='corrfunc') – Name of two-point counter engine, one of [“corrfunc”, “cucount”, “analytical”].

  • args (list) – Arguments for two-point counter engine, see BaseTwoPointCounter.

  • kwargs (dict) – Arguments for two-point counter engine, see BaseTwoPointCounter.

Returns:

engine

Return type:

BaseTwoPointCounter

classmethod from_state(state, load=False)

Return new two-point counter based on state dictionary.

exception pycorr.twopoint_counter.TwoPointCounterError

Bases: ValueError

Exception raised when issue with two-point counting.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class pycorr.twopoint_counter.TwoPointWeight(sep, weight)

Bases: tuple

Create new instance of TwoPointWeight(sep, weight)

count(value, /)

Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

sep

Alias for field number 0

weight

Alias for field number 1

pycorr.twopoint_counter.get_default_nrealizations(weights)

Return default number of realizations given input bitwise weights = the number of bits in input weights plus one.

pycorr.twopoint_counter.get_inverse_probability_weight(*weights, noffset=1, nrealizations=None, default_value=0.0, correction=None, dtype='f8')

Return inverse probability weight given input bitwise weights. Inverse probability weight is computed as:\(\mathrm{nrealizations}/(\mathrm{noffset} + \mathrm{popcount}(w_{1} \& w_{2} \& ...))\). If denominator is 0, weight is set to default_value.

Parameters:
  • weights (int arrays) – Bitwise weights.

  • noffset (int, default=1) – The offset to be added to the bitwise counts in the denominator (defaults to 1).

  • nrealizations (int, default=None) – Number of realizations (defaults to the number of bits in input weights plus one).

  • default_value (float, default=0.) – Default weight value, if the denominator is zero (defaults to 0).

  • correction (2D array, default=None) – Optionally, divide weight by correction[nbits1, nbits2] with nbits1, nbits2 the number of non-zero bits in weights.

  • dtype (string, np.dtype) – Type for output weight.

Returns:

weight – IIP weight.

Return type:

array

pycorr.twopoint_counter.get_twopoint_counter(engine='corrfunc')

Return BaseTwoPointCounter-subclass corresponding to input engine name.

Parameters:

engine (string, default='corrfunc') – Name of two-point counter engine, one of [“corrfunc”, “cucount”, “analytic”, “jackknife”].

Returns:

counter – Two-point counter class.

Return type:

type

pycorr.twopoint_counter.normalization(weights1, weights2=None, weight_type='auto', weight_attrs=None, dtype='f8', nthreads=None, mpicomm=None, mpiroot=None)

Initialize BaseTwoPointCounter, and run actual two-point counts (calling run()), setting wcounts and sep.

Parameters:
  • weights1 (int, array, list) – Weights (or local size, if no weights) of the first catalog. See weight_type.

  • weights2 (array, list, default=None) – Optionally, for cross-two-point counts, weights (or local size, if no weights) in the second catalog. See weights1.

  • weight_type (string, default='auto') –

    The type of weighting to apply to provided weights. One of:

    • None: no weights are applied.

    • ”product_individual”: each pair is weighted by the product of weights \(w_{1} w_{2}\).

    • ”inverse_bitwise”: each pair is weighted by \(\mathrm{nrealizations}/(\mathrm{noffset} + \mathrm{popcount}(w_{1} \& w_{2}))\).

      Multiple bitwise weights can be provided as a list. Individual weights can additionally be provided as float arrays. In case of cross-correlations with floating weights, bitwise weights are automatically turned to IIP weights, i.e. \(\mathrm{nrealizations}/(\mathrm{noffset} + \mathrm{popcount}(w_{1}))\).

    • ”auto”: automatically choose weighting based on input weights1 and weights2,

      i.e. None when weights1 and weights2 are None, “inverse_bitwise” if one of input weights is integer, else “product_individual”.

  • weight_attrs (dict, default=None) – Dictionary of weighting scheme attributes. In case weight_type is “inverse_bitwise”, one can provide “nrealizations”, the total number of realizations (including current one; defaulting to the number of bits in input weights plus one); “noffset”, the offset to be added to the bitwise counts in the denominator (defaulting to 1) and “default_value”, the default value of pairwise weights if the denominator is zero (defaulting to 0). One can also provide “nalways”, stating the number of bits systematically set to 1 minus the number of bits systematically set to 0 (defaulting to 0). These will only impact the normalization factors, if not computed with the brute-force approach. For example, for the “zero-truncated” estimator (arXiv:1912.08803), one would use noffset = 0, nalways = 1. The method used to compute the normalization with PIP weights can be specified with the keyword “normalization”: if None, normalization is given by eq. 22 of arXiv:1912.08803; “brute_force” (using OpenMP’ed C code) or “brute_force_npy” (slower, using numpy only methods; both methods match within machine precision) loop over all pairs; “counter” to normalize each pair by eq. 19 of arXiv:1912.08803.

  • dtype (string, np.dtype, default='f8') – Array type for weights. If None, defaults to type of first weights1 array if a floating-point array is provided, else ‘f8’. Double precision is highly recommended.

  • nthreads (int, default=None) – Number of OpenMP threads to use.

  • mpicomm (MPI communicator, default=None) – The MPI communicator, to MPI-distribute calculation.

  • mpiroot (int, default=None) – In case mpicomm is provided, if None, input weights are assumed to be scattered across all ranks. Else the MPI rank where input weights are gathered.

Jackknife two-point counter class

Implements methods to perform jackknife estimates of the correlation function covariance matrix:

  • subsamplers, to split footprint into subregions

  • two-point counters, to run jackknife two-point realizations

  • two-point estimators, using the jackknife realizations to estimate a covariance matrix.

class pycorr.twopoint_jackknife.BaseSubsampler(mode, positions, weights=None, nsamples=8, position_type='auto', dtype=None, mpicomm=None, mpiroot=None, **kwargs)

Bases: BaseClass

Base class for subsamplers. Extend this class to implement a new subsampler; in particular, one should implement BaseSubsampler.run() and BaseSubsampler.label(), which provides a subsample label to input positions.

Initialize BaseSubsampler.

Parameters:
  • mode (How to divide space, one of:) –

    • “angular”: on the sky

    • ”3d”: in Cartesian, 3D space

  • positions (list, array) – Positions of (typically randoms) particles to define subsamples. Typically of shape (3, N), but can be (2, N) when mode is “angular”. See position_type.

  • weights (array, default=None) – Optionally, weights of (typically randoms) particles to define subsamples.

  • nsamples (int, default=8) – Number of subsamples to define.

  • position_type (string, default='auto') –

    Type of input positions, one of:

    • ”rd”: RA/Dec in degree, only if mode is “angular”

    • ”rdd”: RA/Dec in degree, distance, for any mode

    • ”xyz”: Cartesian positions, shape (3, N)

    • ”pos”: Cartesian positions, shape (N, 3).

  • dtype (string, np.dtype, default=None) – Array type for positions and weights. If None, defaults to type of positions array.

  • mpicomm (MPI communicator, default=None) – The MPI communicator, to MPI-distribute calculation.

  • mpiroot (int, default=None) – In case mpicomm is provided, if None, input positions and weights are assumed to be scattered across all ranks. Else the MPI rank where input positions and weights are gathered.

  • kwargs (dict) – Subsampler engine-specific options.

label(positions, position_type=None)

Method that returns subsample labels corresponding to input positions.

property with_mpi

Whether to use MPI.

class pycorr.twopoint_jackknife.BoxSubsampler(positions=None, boxsize=None, boxcenter=None, nsamples=8, position_type='auto', wrap=False, dtype=None, mpicomm=None, mpiroot=None)

Bases: BaseSubsampler

Basic subsampler, that divides a box into subboxes in 3D space.

Initialize BoxSubsampler.

Parameters:
  • positions (list, array) – If boxsize and / or boxcenter is None, use these positions to determine boxsize and / or boxcenter. Typically of shape (3, N), see position_type.

  • boxsize (array, float, default=None) – Physical size of the box. If not provided, see positions.

  • boxcenter (array, float, default=None) – Box center. If not provided, see positions.

  • nsamples (int, tuple, default=8) – Total number of subsamples to define. Can be a 3-tuple, corresponding to the number of divisions along each x, y, z axis.

  • position_type (string, default='auto') –

    Type of input positions, one of:

    • ”rdd”: RA/Dec in degree, distance, shape (3, N)

    • ”xyz”: Cartesian positions, shape (3, N)

    • ”pos”: Cartesian positions, shape (N, 3).

  • wrap (bool, default=False) – Whether to wrap input positions in [boxcenter - boxsize/2, boxcenter + boxsize/2). If False and input positions do not fit in that range, raise a ValueError.

  • dtype (string, np.dtype, default=None) – Array type for positions and weights. If None, defaults to type of positions array.

  • mpicomm (MPI communicator, default=None) – The MPI communicator, to MPI-distribute calculation.

  • mpiroot (int, default=None) – In case mpicomm is provided, if None, input positions are assumed to be scattered across all ranks. Else the MPI rank where input positions are gathered.

label(positions, position_type=None)

Return subsample labels given input positions.

Parameters:
  • positions (list, array) – Positions to which labels will be attributed. Typically of shape (3, N), see position_type.

  • position_type (string, default='auto') –

    Type of input positions, one of:

    • ”rdd”: RA/Dec in degree, distance, shape (3, N)

    • ”xyz”: Cartesian positions, shape (3, N)

    • ”pos”: Cartesian positions, shape (N, 3).

Returns:

labels – Labels corresponding to input positions.

Return type:

array

run()

Set edges for binning along each axis.

property with_mpi

Whether to use MPI.

class pycorr.twopoint_jackknife.JackknifeDavisPeeblesTwoPointEstimator(D1D2=None, R1R2=None, D1R2=None, R1D2=None, S1S2=None, D1S2=None, S1D2=None, S1R2=None)

Bases: DavisPeeblesTwoPointEstimator, JackknifeTwoPointEstimator

Initialize BaseTwoPointEstimator, and set correlation estimation corr (calling run()).

Parameters:
  • D1D2 (BaseTwoPointCounter, default=None) – D1D2 two-point counts.

  • R1R2 (BaseTwoPointCounter, default=None) – R1R2 two-point counts.

  • D1R2 (BaseTwoPointCounter, default=None) – D1R2 two-point counts, e.g. for LandySzalayTwoPointEstimator.

  • R1D2 (BaseTwoPointCounter, default=None) – R1D2 two-point counts, e.g. for LandySzalayTwoPointEstimator, in case of cross-correlation.

  • S1S2 (BaseTwoPointCounter, default=None) – S1S2 two-point counts, e.g. with reconstruction, the Landy-Szalay estimator is commonly written: \((D1D2 - D1S2 - S1D2 - S1S2)/R1R2\), with S1 and S2 shifted random catalogs. Defaults to R1R2.

  • D1S2 (BaseTwoPointCounter, default=None) – D1S2 two-point counts, see S1S2. Defaults to D1R2.

  • S1D2 (BaseTwoPointCounter, default=None) – S1D2 two-point counts, see S1S2. Defaults to R1D2.

  • S1R2 (BaseTwoPointCounter, default=None) – S1R2 two-point counts. Defaults to R1R2.

property XX

Return first two-point counts.

classmethod concatenate(*others)

Concatenate input JackknifeTwoPointEstimator instances; typically used when calculation has been split into different samples, see argument samples of JackknifeTwoPointCounter.__init__().

classmethod concatenate_x(*others)

Concatenate input estimators along sep by concatenating their two-point counts; see BaseTwoPointCounter.concatenate_x().

property count_names

Return list of counts used in estimator.

property edges

Edges for two-point count calculation, taken from XX.

extend(other)

Extend current instance with another; see concatenate().

get_corr(return_sep=False, return_cov=None, return_mask=False, mode='auto', **kwargs)

Return correlation function, optionally its covariance estimate, if available.

Parameters:
  • return_sep (bool, default=False) – Whether (True) to return average separation(s) sepavg.

  • return_cov (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), return covariance estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • return_mask (bool, default=False) – Return mask of bins that are summed over along dimension 1, for each dimension 0 point.

  • mode (str, default='auto') – If ‘poles’, return multipoles. If ‘wedges’, return \(\mu\)-wedges. If ‘wp’, return projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), return multipoles; else if ‘wedges’ provided, return \(\mu\)-wedges. else if ‘pimax’ provided, return projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ell, project_to_wedges() (if mode is ‘smu’), e.g. wedges, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

  • sep (array) – Optionally, separation values.

  • corr (array) – Estimated correlation function.

  • cov (array) – Optionally, covariance estimate (of the flattened corr), see return_cov.

property mode

Two-point counting mode, taken from XX.

property ndim

Return binning dimensionality.

normalize(wnorm='XX')

Renormalize all counts (BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm). This is useful when combining measurements in various regions.

Parameters:

wnorm (float, string, default='XX') – If float, rescale all BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm such that BaseTwoPointCounter.wnorm matches wnorm. Else, name of counts (e.g. ‘D1D2’, ‘R1R2’, etc.) to take wnorm from. ‘XX’ is the first counts of the estimator (usually ‘D1D2’).

Returns:

new – New estimator, with all counts renormalized. corr is expected to be exactly the same.

Return type:

BaseTwoPointEstimator

property nrealizations

Number of jackknife realizations.

plot(plot_std=None, mode='auto', ax=None, fn=None, kw_save=None, show=False, **kwargs)

Plot correlation function.

Parameters:
  • plot_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), plot standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • ax (matplotlib.axes.Axes, default=None) – Axes where to plot samples. If None, takes current axes.

  • fn (string, default=None) – If not None, file name where to save figure.

  • kw_save (dict, default=None) – Optional arguments for matplotlib.figure.Figure.savefig().

  • show (bool, default=False) – Whether to show figure.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

ax

Return type:

matplotlib.axes.Axes

realization(ii, **kwargs)

Return jackknife realization ii.

Parameters:
Returns:

estimator – Two-point estimator for realization ii.

Return type:

BaseTwoPointEstimator

property realizations

List of jackknife realizations.

rebin(*args, **kwargs)

Rebin estimator, by rebinning all two-point counts. See BaseTwoPointCounter.rebin().

classmethod requires(join=None, **kwargs)

List required counts.

run()

Set correlation function estimate corr based on the Davis-Peebles estimator \((D1D2 - D1S2)/D1R2\).

save(filename)

Save estimator to filename.

save_txt(filename, fmt='%.12e', delimiter=' ', header=None, comments='# ', return_std=None, mode='auto', **kwargs)

Save correlation function as txt file.

Warning

Attributes are not all saved, hence there is load_txt() method.

Parameters:
  • filename (str) – File name.

  • fmt (str, default='%.12e') – Format for floating types.

  • delimiter (str, default=' ') – String or character separating columns.

  • header (str, list, default=None) – String that will be written at the beginning of the file. If multiple lines, provide a list of one-line strings.

  • comments (str, default=' #') – String that will be prepended to the header string.

  • return_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), save standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

select(*args, **kwargs)

Restrict estimator to provided coordinate limits in place.

For example:

estimator.select((0, 0.3)) # restrict first axis to (0, 0.3)
estimator.select(None, (0, 0.2)) # restrict second axis to (0, 0.2)
property sep

Array of separation values of first dimension; if not set by run(), taken from R1R2 if provided, else XX.

sepavg(*args, **kwargs)

Return average of separation for input axis; this is an 1D array of size shape[axis].

property seps

Array of separation values, if not set by run(), taken from R1R2 if provided, else XX.

property shape

Return shape of obtained correlation corr.

slice(*args, **kwargs)

Slice estimator, by rebinning all two-point counts. See BaseTwoPointCounter.slice().

classmethod sum(*others, exclude=None, uniques=True)

Sum input estimators (their two-point counts, actually). See e.g. https://arxiv.org/pdf/1905.01133.pdf for a use case. Input two-point estimators must have same edges for this operation to make sense (no checks performed).

Parameters:
  • exclude (list, default=None) – List of counts to exclude, e.g. [‘D1D2’].

  • uniques (bool, default=True) – If True, sum only unique (in the sense of same BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm) count instances.

Returns:

new

Return type:

BaseTwoPointEstimator

wrap()

Return new ‘smu’ or ‘rppi’ two-point estimators with 2nd coordinate wrapped to positive values, \(\mu > 0\) or \(\pi > 0\).

class pycorr.twopoint_jackknife.JackknifeLandySzalayTwoPointEstimator(D1D2=None, R1R2=None, D1R2=None, R1D2=None, S1S2=None, D1S2=None, S1D2=None, S1R2=None)

Bases: LandySzalayTwoPointEstimator, JackknifeTwoPointEstimator

Initialize BaseTwoPointEstimator, and set correlation estimation corr (calling run()).

Parameters:
  • D1D2 (BaseTwoPointCounter, default=None) – D1D2 two-point counts.

  • R1R2 (BaseTwoPointCounter, default=None) – R1R2 two-point counts.

  • D1R2 (BaseTwoPointCounter, default=None) – D1R2 two-point counts, e.g. for LandySzalayTwoPointEstimator.

  • R1D2 (BaseTwoPointCounter, default=None) – R1D2 two-point counts, e.g. for LandySzalayTwoPointEstimator, in case of cross-correlation.

  • S1S2 (BaseTwoPointCounter, default=None) – S1S2 two-point counts, e.g. with reconstruction, the Landy-Szalay estimator is commonly written: \((D1D2 - D1S2 - S1D2 - S1S2)/R1R2\), with S1 and S2 shifted random catalogs. Defaults to R1R2.

  • D1S2 (BaseTwoPointCounter, default=None) – D1S2 two-point counts, see S1S2. Defaults to D1R2.

  • S1D2 (BaseTwoPointCounter, default=None) – S1D2 two-point counts, see S1S2. Defaults to R1D2.

  • S1R2 (BaseTwoPointCounter, default=None) – S1R2 two-point counts. Defaults to R1R2.

property XX

Return first two-point counts.

classmethod concatenate(*others)

Concatenate input JackknifeTwoPointEstimator instances; typically used when calculation has been split into different samples, see argument samples of JackknifeTwoPointCounter.__init__().

classmethod concatenate_x(*others)

Concatenate input estimators along sep by concatenating their two-point counts; see BaseTwoPointCounter.concatenate_x().

property count_names

Return list of counts used in estimator.

property edges

Edges for two-point count calculation, taken from XX.

extend(other)

Extend current instance with another; see concatenate().

get_corr(return_sep=False, return_cov=None, return_mask=False, mode='auto', **kwargs)

Return correlation function, optionally its covariance estimate, if available.

Parameters:
  • return_sep (bool, default=False) – Whether (True) to return average separation(s) sepavg.

  • return_cov (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), return covariance estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • return_mask (bool, default=False) – Return mask of bins that are summed over along dimension 1, for each dimension 0 point.

  • mode (str, default='auto') – If ‘poles’, return multipoles. If ‘wedges’, return \(\mu\)-wedges. If ‘wp’, return projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), return multipoles; else if ‘wedges’ provided, return \(\mu\)-wedges. else if ‘pimax’ provided, return projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ell, project_to_wedges() (if mode is ‘smu’), e.g. wedges, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

  • sep (array) – Optionally, separation values.

  • corr (array) – Estimated correlation function.

  • cov (array) – Optionally, covariance estimate (of the flattened corr), see return_cov.

property mode

Two-point counting mode, taken from XX.

property ndim

Return binning dimensionality.

normalize(wnorm='XX')

Renormalize all counts (BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm). This is useful when combining measurements in various regions.

Parameters:

wnorm (float, string, default='XX') – If float, rescale all BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm such that BaseTwoPointCounter.wnorm matches wnorm. Else, name of counts (e.g. ‘D1D2’, ‘R1R2’, etc.) to take wnorm from. ‘XX’ is the first counts of the estimator (usually ‘D1D2’).

Returns:

new – New estimator, with all counts renormalized. corr is expected to be exactly the same.

Return type:

BaseTwoPointEstimator

property nrealizations

Number of jackknife realizations.

plot(plot_std=None, mode='auto', ax=None, fn=None, kw_save=None, show=False, **kwargs)

Plot correlation function.

Parameters:
  • plot_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), plot standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • ax (matplotlib.axes.Axes, default=None) – Axes where to plot samples. If None, takes current axes.

  • fn (string, default=None) – If not None, file name where to save figure.

  • kw_save (dict, default=None) – Optional arguments for matplotlib.figure.Figure.savefig().

  • show (bool, default=False) – Whether to show figure.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

ax

Return type:

matplotlib.axes.Axes

realization(ii, **kwargs)

Return jackknife realization ii.

Parameters:
Returns:

estimator – Two-point estimator for realization ii.

Return type:

BaseTwoPointEstimator

property realizations

List of jackknife realizations.

rebin(*args, **kwargs)

Rebin estimator, by rebinning all two-point counts. See BaseTwoPointCounter.rebin().

classmethod requires(join=None, **kwargs)

List required counts.

run()

Set correlation function estimate corr based on the Landy-Szalay estimator \((D1D2 - D1S2 - S1D2 - S1S2)/R1R2\).

save(filename)

Save estimator to filename.

save_txt(filename, fmt='%.12e', delimiter=' ', header=None, comments='# ', return_std=None, mode='auto', **kwargs)

Save correlation function as txt file.

Warning

Attributes are not all saved, hence there is load_txt() method.

Parameters:
  • filename (str) – File name.

  • fmt (str, default='%.12e') – Format for floating types.

  • delimiter (str, default=' ') – String or character separating columns.

  • header (str, list, default=None) – String that will be written at the beginning of the file. If multiple lines, provide a list of one-line strings.

  • comments (str, default=' #') – String that will be prepended to the header string.

  • return_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), save standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

select(*args, **kwargs)

Restrict estimator to provided coordinate limits in place.

For example:

estimator.select((0, 0.3)) # restrict first axis to (0, 0.3)
estimator.select(None, (0, 0.2)) # restrict second axis to (0, 0.2)
property sep

Array of separation values of first dimension; if not set by run(), taken from R1R2 if provided, else XX.

sepavg(*args, **kwargs)

Return average of separation for input axis; this is an 1D array of size shape[axis].

property seps

Array of separation values, if not set by run(), taken from R1R2 if provided, else XX.

property shape

Return shape of obtained correlation corr.

slice(*args, **kwargs)

Slice estimator, by rebinning all two-point counts. See BaseTwoPointCounter.slice().

classmethod sum(*others, exclude=None, uniques=True)

Sum input estimators (their two-point counts, actually). See e.g. https://arxiv.org/pdf/1905.01133.pdf for a use case. Input two-point estimators must have same edges for this operation to make sense (no checks performed).

Parameters:
  • exclude (list, default=None) – List of counts to exclude, e.g. [‘D1D2’].

  • uniques (bool, default=True) – If True, sum only unique (in the sense of same BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm) count instances.

Returns:

new

Return type:

BaseTwoPointEstimator

wrap()

Return new ‘smu’ or ‘rppi’ two-point estimators with 2nd coordinate wrapped to positive values, \(\mu > 0\) or \(\pi > 0\).

class pycorr.twopoint_jackknife.JackknifeNaturalTwoPointEstimator(D1D2=None, R1R2=None, D1R2=None, R1D2=None, S1S2=None, D1S2=None, S1D2=None, S1R2=None)

Bases: NaturalTwoPointEstimator, JackknifeTwoPointEstimator

Initialize BaseTwoPointEstimator, and set correlation estimation corr (calling run()).

Parameters:
  • D1D2 (BaseTwoPointCounter, default=None) – D1D2 two-point counts.

  • R1R2 (BaseTwoPointCounter, default=None) – R1R2 two-point counts.

  • D1R2 (BaseTwoPointCounter, default=None) – D1R2 two-point counts, e.g. for LandySzalayTwoPointEstimator.

  • R1D2 (BaseTwoPointCounter, default=None) – R1D2 two-point counts, e.g. for LandySzalayTwoPointEstimator, in case of cross-correlation.

  • S1S2 (BaseTwoPointCounter, default=None) – S1S2 two-point counts, e.g. with reconstruction, the Landy-Szalay estimator is commonly written: \((D1D2 - D1S2 - S1D2 - S1S2)/R1R2\), with S1 and S2 shifted random catalogs. Defaults to R1R2.

  • D1S2 (BaseTwoPointCounter, default=None) – D1S2 two-point counts, see S1S2. Defaults to D1R2.

  • S1D2 (BaseTwoPointCounter, default=None) – S1D2 two-point counts, see S1S2. Defaults to R1D2.

  • S1R2 (BaseTwoPointCounter, default=None) – S1R2 two-point counts. Defaults to R1R2.

property XX

Return first two-point counts.

classmethod concatenate(*others)

Concatenate input JackknifeTwoPointEstimator instances; typically used when calculation has been split into different samples, see argument samples of JackknifeTwoPointCounter.__init__().

classmethod concatenate_x(*others)

Concatenate input estimators along sep by concatenating their two-point counts; see BaseTwoPointCounter.concatenate_x().

property count_names

Return list of counts used in estimator.

property edges

Edges for two-point count calculation, taken from XX.

extend(other)

Extend current instance with another; see concatenate().

get_corr(return_sep=False, return_cov=None, return_mask=False, mode='auto', **kwargs)

Return correlation function, optionally its covariance estimate, if available.

Parameters:
  • return_sep (bool, default=False) – Whether (True) to return average separation(s) sepavg.

  • return_cov (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), return covariance estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • return_mask (bool, default=False) – Return mask of bins that are summed over along dimension 1, for each dimension 0 point.

  • mode (str, default='auto') – If ‘poles’, return multipoles. If ‘wedges’, return \(\mu\)-wedges. If ‘wp’, return projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), return multipoles; else if ‘wedges’ provided, return \(\mu\)-wedges. else if ‘pimax’ provided, return projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ell, project_to_wedges() (if mode is ‘smu’), e.g. wedges, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

  • sep (array) – Optionally, separation values.

  • corr (array) – Estimated correlation function.

  • cov (array) – Optionally, covariance estimate (of the flattened corr), see return_cov.

property mode

Two-point counting mode, taken from XX.

property ndim

Return binning dimensionality.

normalize(wnorm='XX')

Renormalize all counts (BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm). This is useful when combining measurements in various regions.

Parameters:

wnorm (float, string, default='XX') – If float, rescale all BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm such that BaseTwoPointCounter.wnorm matches wnorm. Else, name of counts (e.g. ‘D1D2’, ‘R1R2’, etc.) to take wnorm from. ‘XX’ is the first counts of the estimator (usually ‘D1D2’).

Returns:

new – New estimator, with all counts renormalized. corr is expected to be exactly the same.

Return type:

BaseTwoPointEstimator

property nrealizations

Number of jackknife realizations.

plot(plot_std=None, mode='auto', ax=None, fn=None, kw_save=None, show=False, **kwargs)

Plot correlation function.

Parameters:
  • plot_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), plot standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • ax (matplotlib.axes.Axes, default=None) – Axes where to plot samples. If None, takes current axes.

  • fn (string, default=None) – If not None, file name where to save figure.

  • kw_save (dict, default=None) – Optional arguments for matplotlib.figure.Figure.savefig().

  • show (bool, default=False) – Whether to show figure.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

ax

Return type:

matplotlib.axes.Axes

realization(ii, **kwargs)

Return jackknife realization ii.

Parameters:
Returns:

estimator – Two-point estimator for realization ii.

Return type:

BaseTwoPointEstimator

property realizations

List of jackknife realizations.

rebin(*args, **kwargs)

Rebin estimator, by rebinning all two-point counts. See BaseTwoPointCounter.rebin().

classmethod requires(join=None, **kwargs)

List required counts.

run()

Set correlation function estimate corr based on the natural estimator \((D1D2 - S1S2)/R1R2\).

save(filename)

Save estimator to filename.

save_txt(filename, fmt='%.12e', delimiter=' ', header=None, comments='# ', return_std=None, mode='auto', **kwargs)

Save correlation function as txt file.

Warning

Attributes are not all saved, hence there is load_txt() method.

Parameters:
  • filename (str) – File name.

  • fmt (str, default='%.12e') – Format for floating types.

  • delimiter (str, default=' ') – String or character separating columns.

  • header (str, list, default=None) – String that will be written at the beginning of the file. If multiple lines, provide a list of one-line strings.

  • comments (str, default=' #') – String that will be prepended to the header string.

  • return_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), save standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

select(*args, **kwargs)

Restrict estimator to provided coordinate limits in place.

For example:

estimator.select((0, 0.3)) # restrict first axis to (0, 0.3)
estimator.select(None, (0, 0.2)) # restrict second axis to (0, 0.2)
property sep

Array of separation values of first dimension; if not set by run(), taken from R1R2 if provided, else XX.

sepavg(*args, **kwargs)

Return average of separation for input axis; this is an 1D array of size shape[axis].

property seps

Array of separation values, if not set by run(), taken from R1R2 if provided, else XX.

property shape

Return shape of obtained correlation corr.

slice(*args, **kwargs)

Slice estimator, by rebinning all two-point counts. See BaseTwoPointCounter.slice().

classmethod sum(*others, exclude=None, uniques=True)

Sum input estimators (their two-point counts, actually). See e.g. https://arxiv.org/pdf/1905.01133.pdf for a use case. Input two-point estimators must have same edges for this operation to make sense (no checks performed).

Parameters:
  • exclude (list, default=None) – List of counts to exclude, e.g. [‘D1D2’].

  • uniques (bool, default=True) – If True, sum only unique (in the sense of same BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm) count instances.

Returns:

new

Return type:

BaseTwoPointEstimator

wrap()

Return new ‘smu’ or ‘rppi’ two-point estimators with 2nd coordinate wrapped to positive values, \(\mu > 0\) or \(\pi > 0\).

class pycorr.twopoint_jackknife.JackknifeResidualTwoPointEstimator(D1D2=None, R1R2=None, D1R2=None, R1D2=None, S1S2=None, D1S2=None, S1D2=None, S1R2=None)

Bases: ResidualTwoPointEstimator, JackknifeTwoPointEstimator

Initialize BaseTwoPointEstimator, and set correlation estimation corr (calling run()).

Parameters:
  • D1D2 (BaseTwoPointCounter, default=None) – D1D2 two-point counts.

  • R1R2 (BaseTwoPointCounter, default=None) – R1R2 two-point counts.

  • D1R2 (BaseTwoPointCounter, default=None) – D1R2 two-point counts, e.g. for LandySzalayTwoPointEstimator.

  • R1D2 (BaseTwoPointCounter, default=None) – R1D2 two-point counts, e.g. for LandySzalayTwoPointEstimator, in case of cross-correlation.

  • S1S2 (BaseTwoPointCounter, default=None) – S1S2 two-point counts, e.g. with reconstruction, the Landy-Szalay estimator is commonly written: \((D1D2 - D1S2 - S1D2 - S1S2)/R1R2\), with S1 and S2 shifted random catalogs. Defaults to R1R2.

  • D1S2 (BaseTwoPointCounter, default=None) – D1S2 two-point counts, see S1S2. Defaults to D1R2.

  • S1D2 (BaseTwoPointCounter, default=None) – S1D2 two-point counts, see S1S2. Defaults to R1D2.

  • S1R2 (BaseTwoPointCounter, default=None) – S1R2 two-point counts. Defaults to R1R2.

property XX

Return first two-point counts.

classmethod concatenate(*others)

Concatenate input JackknifeTwoPointEstimator instances; typically used when calculation has been split into different samples, see argument samples of JackknifeTwoPointCounter.__init__().

classmethod concatenate_x(*others)

Concatenate input estimators along sep by concatenating their two-point counts; see BaseTwoPointCounter.concatenate_x().

property count_names

Return list of counts used in estimator.

property edges

Edges for two-point count calculation, taken from XX.

extend(other)

Extend current instance with another; see concatenate().

get_corr(return_sep=False, return_cov=None, return_mask=False, mode='auto', **kwargs)

Return correlation function, optionally its covariance estimate, if available.

Parameters:
  • return_sep (bool, default=False) – Whether (True) to return average separation(s) sepavg.

  • return_cov (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), return covariance estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • return_mask (bool, default=False) – Return mask of bins that are summed over along dimension 1, for each dimension 0 point.

  • mode (str, default='auto') – If ‘poles’, return multipoles. If ‘wedges’, return \(\mu\)-wedges. If ‘wp’, return projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), return multipoles; else if ‘wedges’ provided, return \(\mu\)-wedges. else if ‘pimax’ provided, return projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ell, project_to_wedges() (if mode is ‘smu’), e.g. wedges, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

  • sep (array) – Optionally, separation values.

  • corr (array) – Estimated correlation function.

  • cov (array) – Optionally, covariance estimate (of the flattened corr), see return_cov.

property mode

Two-point counting mode, taken from XX.

property ndim

Return binning dimensionality.

normalize(wnorm='XX')

Renormalize all counts (BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm). This is useful when combining measurements in various regions.

Parameters:

wnorm (float, string, default='XX') – If float, rescale all BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm such that BaseTwoPointCounter.wnorm matches wnorm. Else, name of counts (e.g. ‘D1D2’, ‘R1R2’, etc.) to take wnorm from. ‘XX’ is the first counts of the estimator (usually ‘D1D2’).

Returns:

new – New estimator, with all counts renormalized. corr is expected to be exactly the same.

Return type:

BaseTwoPointEstimator

property nrealizations

Number of jackknife realizations.

plot(plot_std=None, mode='auto', ax=None, fn=None, kw_save=None, show=False, **kwargs)

Plot correlation function.

Parameters:
  • plot_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), plot standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • ax (matplotlib.axes.Axes, default=None) – Axes where to plot samples. If None, takes current axes.

  • fn (string, default=None) – If not None, file name where to save figure.

  • kw_save (dict, default=None) – Optional arguments for matplotlib.figure.Figure.savefig().

  • show (bool, default=False) – Whether to show figure.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

ax

Return type:

matplotlib.axes.Axes

realization(ii, **kwargs)

Return jackknife realization ii.

Parameters:
Returns:

estimator – Two-point estimator for realization ii.

Return type:

BaseTwoPointEstimator

property realizations

List of jackknife realizations.

rebin(*args, **kwargs)

Rebin estimator, by rebinning all two-point counts. See BaseTwoPointCounter.rebin().

classmethod requires(join=None, **kwargs)

List required counts.

run()

Set correlation function estimate corr based on the residual estimator \((D1R2 - S1R2)/R1R2\).

save(filename)

Save estimator to filename.

save_txt(filename, fmt='%.12e', delimiter=' ', header=None, comments='# ', return_std=None, mode='auto', **kwargs)

Save correlation function as txt file.

Warning

Attributes are not all saved, hence there is load_txt() method.

Parameters:
  • filename (str) – File name.

  • fmt (str, default='%.12e') – Format for floating types.

  • delimiter (str, default=' ') – String or character separating columns.

  • header (str, list, default=None) – String that will be written at the beginning of the file. If multiple lines, provide a list of one-line strings.

  • comments (str, default=' #') – String that will be prepended to the header string.

  • return_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), save standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

select(*args, **kwargs)

Restrict estimator to provided coordinate limits in place.

For example:

estimator.select((0, 0.3)) # restrict first axis to (0, 0.3)
estimator.select(None, (0, 0.2)) # restrict second axis to (0, 0.2)
property sep

Array of separation values of first dimension; if not set by run(), taken from R1R2 if provided, else XX.

sepavg(*args, **kwargs)

Return average of separation for input axis; this is an 1D array of size shape[axis].

property seps

Array of separation values, if not set by run(), taken from R1R2 if provided, else XX.

property shape

Return shape of obtained correlation corr.

slice(*args, **kwargs)

Slice estimator, by rebinning all two-point counts. See BaseTwoPointCounter.slice().

classmethod sum(*others, exclude=None, uniques=True)

Sum input estimators (their two-point counts, actually). See e.g. https://arxiv.org/pdf/1905.01133.pdf for a use case. Input two-point estimators must have same edges for this operation to make sense (no checks performed).

Parameters:
  • exclude (list, default=None) – List of counts to exclude, e.g. [‘D1D2’].

  • uniques (bool, default=True) – If True, sum only unique (in the sense of same BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm) count instances.

Returns:

new

Return type:

BaseTwoPointEstimator

wrap()

Return new ‘smu’ or ‘rppi’ two-point estimators with 2nd coordinate wrapped to positive values, \(\mu > 0\) or \(\pi > 0\).

class pycorr.twopoint_jackknife.JackknifeTwoPointCounter(mode, edges, positions1, samples1, weights1=None, positions2=None, samples2=None, weights2=None, bin_type='auto', position_type='auto', weight_type='auto', weight_attrs=None, twopoint_weights=None, selection_attrs=None, los='midpoint', boxsize=None, compute_sepsavg=True, dtype='f8', nthreads=None, mpicomm=None, mpiroot=None, nprocs_per_real=1, samples=None, **kwargs)

Bases: BaseTwoPointCounter

Perform jackknife two-point counts.

Initialize JackknifeTwoPointCounter.

Parameters:
  • mode (string) –

    Type of two-point counts, one of:

    • ”theta”: as a function of angle (in degree) between two particles

    • ”s”: as a function of distance between two particles

    • ”smu”: as a function of distance between two particles and cosine angle \(\mu\) w.r.t. the line-of-sight

    • ”rppi”: as a function of distance transverse (\(r_{p}\)) and parallel (\(\pi\)) to the line-of-sight

    • ”rp”: same as “rppi”, without binning in \(\pi\)

  • edges (tuple, array) – Tuple of bin edges (arrays), for the first (e.g. \(r_{p}\)) and optionally second (e.g. \(\pi > 0\), \(\mu \in [-1, 1]\)) dimensions. In case of single-dimension binning (e.g. mode is “theta”, “s” or “rp”), the single array of bin edges can be provided directly. Edges are inclusive on the low end, exclusive on the high end, i.e. a pair separated by \(s\) falls in bin i if edges[i] <= s < edges[i+1]. In case mode is “smu” however, the first \(\mu\)-bin is exclusive on the low end (increase the \(\mu\)-range by a tiny value to include \(\mu = \pm 1\)). Pairs at separation \(s = 0\) are included in the \(\mu = 0\) bin. In case of auto-correlation (no positions2 provided), auto-pairs (pairs of same objects) are not counted. In case of cross-correlation, all pairs are counted. In any case, duplicate objects (with separation zero) will be counted.

  • positions1 (list, array) – Positions in the first catalog. Typically of shape (3, N), but can be (2, N) when mode is “theta”. See position_type.

  • samples1 (array) – Labels of subsamples for the first catalog.

  • weights1 (array, list, default=None) – Weights of the first catalog. Not required if weight_type is either None or “auto”. See weight_type.

  • positions2 (list, array, default=None) – Optionally, for cross-two-point counts, positions in the second catalog. See positions1.

  • samples2 (list, array, default=None) – Optionally, for cross-two-point counts, labels in the second catalog. See samples1.

  • weights2 (array, list, default=None) – Optionally, for cross-two-point counts, weights in the second catalog. See weights1.

  • bin_type (string, default='auto') – Binning type for first dimension, e.g. \(r_{p}\) when mode is “rppi”. Set to lin for speed-up in case of linearly-spaced bins. In this case, the bin number for a pair separated by a (3D, projected, angular…) separation sep is given by (sep - edges[0])/(edges[-1] - edges[0])*(len(edges) - 1), i.e. only the first and last bins of input edges are considered. Then setting compute_sepsavg is virtually costless. For non-linear binning, set to “custom”. “auto” allows for auto-detection of the binning type: linear binning will be chosen if input edges are within rtol = 1e-05 (relative tolerance) or atol = 1e-08 (absolute tolerance) of the array np.linspace(edges[0], edges[-1], len(edges)).

  • position_type (string, default='auto') –

    Type of input positions, one of:

    • ”rd”: RA/Dec in degree, only if mode is “theta”

    • ”rdd”: RA/Dec in degree, distance, for any mode

    • ”xyz”: Cartesian positions, shape (3, N)

    • ”pos”: Cartesian positions, shape (N, 3).

  • weight_type (string, default='auto') –

    The type of weighting to apply to provided weights. One of:

    • None: no weights are applied.

    • ”product_individual”: each pair is weighted by the product of weights \(w_{1} w_{2}\).

    • ”inverse_bitwise”: each pair is weighted by \(\mathrm{nrealizations}/(\mathrm{noffset} + \mathrm{popcount}(w_{1} \& w_{2}))\).

      Multiple bitwise weights can be provided as a list. Individual weights can additionally be provided as float arrays. In case of cross-correlations with floating weights, bitwise weights are automatically turned to IIP weights, i.e. \(\mathrm{nrealizations}/(\mathrm{noffset} + \mathrm{popcount}(w_{1}))\).

    • ”auto”: automatically choose weighting based on input weights1 and weights2,

      i.e. None when weights1 and weights2 are None, “inverse_bitwise” if one of input weights is integer, else “product_individual”.

    In addition, angular upweights can be provided with twopoint_weights.

  • weight_attrs (dict, default=None) – Dictionary of weighting scheme attributes. In case weight_type is “inverse_bitwise”, one can provide “nrealizations”, the total number of realizations (including current one; defaulting to the number of bits in input weights plus one); “noffset”, the offset to be added to the bitwise counts in the denominator (defaulting to 1) and “default_value”, the default value of pairwise weights if the denominator is zero (defaulting to 0). The method used to compute the normalization with PIP weights can be specified with the keyword “normalization”: if None or “total”, normalization is given by eq. 22 of arXiv:1912.08803; “brute_force” (using OpenMP’ed C code) or “brute_force_npy” (slower, using numpy only methods; both methods match within machine precision) loop over all pairs; “counter” to normalize each pair by eq. 19 of arXiv:1912.08803. For normalizations “total” or “counter”, “nalways” specifies the number of bits systematically set to 1 minus the number of bits systematically set to 0 (defaulting to 0). For example, for the “zero-truncated” estimator (arXiv:1912.08803), one would use noffset = 0, nalways = 1.

  • twopoint_weights (WeightTwoPointEstimator, default=None) – Weights to be applied to each pair of particles. A WeightTwoPointEstimator instance or any object with arrays sep (separations) and weight (weight at given separation) as attributes (i.e. to be accessed through twopoint_weights.sep, twopoint_weights.weight) or as keys (i.e. twopoint_weights['sep'], twopoint_weights['weight']) or as element (i.e. sep, weight = twopoint_weights)

  • selection_attrs (dict, default=None) – To select pairs to be counted, provide mapping between the quantity (string) and the interval (tuple of floats), e.g. {'rp': (0., 20.)} to select pairs with ‘rp’ between 0 and 20.

  • los (string, default='midpoint') –

    Line-of-sight to be used when mode is “smu”, “rppi” or “rp”; one of:

    • ”midpoint”: the mean position of the pair: \(\mathbf{\eta} = (\mathbf{r}_{1} + \mathbf{r}_{2})/2\)

    • ”x”, “y” or “z”: cartesian axis

  • boxsize (array, float, default=None) – For periodic wrapping, the side-length(s) of the periodic cube.

  • compute_sepsavg (bool, default=True) – Set to False to not calculate the average separation for each bin. This can make the two-point counts faster if bin_type is “custom”. In this case, sep will be set the midpoint of input edges.

  • dtype (string, np.dtype, default='f8') – Array type for positions and weights. If None, defaults to type of first positions1 array. Double precision is highly recommended in case mode is “theta”, twopoint_weights is provided (due to cosine), or compute_sepsavg is True.

  • nthreads (int, default=None) – Number of OpenMP threads to use.

  • mpicomm (MPI communicator, default=None) – The MPI communicator, to MPI-distribute calculation.

  • mpiroot (int, default=None) – In case mpicomm is provided, if None, input positions and weights are assumed to be scattered across all ranks. Else the MPI rank where input positions and weights are gathered.

  • nprocs_per_real (int, default=1) – In case mpicomm is provided, the number of MPI processes to devote to the calculation of two-point counts for each jackknife realization. If nprocs_per_real is e.g. 1 (default), the parallelization is exclusively on the jackknife realizations. If nprocs_per_real is e.g. 2, the parallelization is on the jackknife realizations (with mpicomm.size // n_procs_per_real realizations treated in parallel) and the counts within each jackknife realization use 2 MPI processes.

  • samples (list, array, default=None) – Whether to restrict jackknife counts to these subsamples. This may be useful to manually distribute the calculation. At the end of the computation, the different JackknifeTwoPointCounter instances can be concatenated with concatenate().

  • kwargs (dict) – Two-point counter engine-specific options.

property compute_sepavg

Whether to compute average of separation values for first dimension (e.g. \(s\) if mode is “smu”).

classmethod concatenate(*others)

Concatenate input JackknifeTwoPointCounter instances; typically used when calculation has been split into different samples, see argument samples of __init__().

classmethod concatenate_x(*others)

Concatenate input two-point counts along sep; see BaseTwoPointCounter.concatenate_x().

cov(**kwargs)

Return jackknife covariance (of flattened counts).

Parameters:

kwargs (dict) – Optional arguments for realization().

Returns:

cov – Covariance matrix.

Return type:

array

extend(other)

Extend current instance with another; see concatenate().

property ndim

Return binning dimensionality.

normalization()

Return two-point count normalization, i.e., in case of cross-correlation:

\[\left(\sum_{i=1}^{N_{1}} w_{1,i}\right) \left(\sum_{j=1}^{N_{2}} w_{2,j}\right)\]

with the sums running over the weights of the first and second catalogs, and in case of auto-correlation:

\[\left(\sum_{i=1}^{N_{1}} w_{1,i}\right)^{2} - \sum_{i=1}^{N_{1}} w_{1,i}^{2}\]
normalize(wnorm)

Rescale both wcounts and wnorm such that new wnorm matches wnorm. This is useful when combining counts in various regions.

Parameters:

wnorm (float) – New normalization wnorm.

Returns:

new – Normalized counts.

Return type:

BaseTwoPointCounter

normalized_wcounts()

Return normalized two-point counts, i.e. wcounts divided by normalization().

property nrealizations

Number of jackknife realizations.

property periodic

Whether periodic wrapping is used (i.e. boxsize is not None).

realization(ii, correction='mohammad21')

Return jackknife realization ii.

Parameters:
  • ii (int) – Label of jackknife realization.

  • correction (string, default='mohammad') – Correction to apply to computed counts. If None, no correction is applied. Else, if “mohammad21”, rescale cross-pairs by factor eq. 27 in arXiv:2109.07071. Else, rescale cross-pairs by provided correction factor.

Returns:

counts – Two-point counts for realization ii.

Return type:

BaseTwoPointCounter

property realizations

List of jackknife realizations, corresponding to input samples.

rebin(factor=1)

Rebin two-point counts, by factor(s) factor. A tuple must be provided in case ndim is greater than 1. Input factors must divide shape.

Warning

If current instance is the result of concatenate_x(), rebinning is exact only if factor divides each of the constant-wnorm chunks.

reverse()

Return counts for reversed positions1/weights1 and positions2/weights2.

run(samples=None)

Run jackknife two-point counts.

save(filename)

Save two-point counts to filename.

save_txt(filename, fmt='%.12e', delimiter=' ', header=None, comments='# ')

Save two-point counts as txt file.

Warning

Attributes are not all saved, hence there is load_txt() method.

Parameters:
  • filename (str) – File name.

  • fmt (str, default='%.12e') – Format for floating types.

  • delimiter (str, default=' ') – String or character separating columns.

  • header (str, list, default=None) – String that will be written at the beginning of the file. If multiple lines, provide a list of one-line strings.

  • comments (str, default=' #') – String that will be prepended to the header string.

select(*xlims)

Restrict counts to provided coordinate limits in place.

For example:

counts.select((0, 0.3))  # restrict first axis to (0, 0.3)
counts.select(None, (0, 0.2))  # restrict second axis to (0, 0.2)
statistic.select((0, 30, 4))   # rebin to match step size of 4 and restrict to (0, 30)
property sep

Array of separation values of first dimension (e.g. \(s\) if mode is “smu”).

sepavg(axis=0, method=None)

Return average of separation for input axis.

Parameters:
  • axis (int, default=0) – Axis; if mode is “smu”, 0 is \(s\), 1 is \(mu\); if mode is “rppi”, 0 is \(r_{p}\), 1 is \(\pi\).

  • method (str, default=None) – If None, return average separation from seps. If ‘mid’, return bin mid-points.

Returns:

sepavg – 1D array of size shape[axis].

Return type:

array

property shape

Return shape of obtained counts wcounts.

slice(*slices)

Slice counts in place. If slice step is not 1, use rebin(). For example:

counts.slice(slice(0, 10, 2), slice(0, 6, 3)) # rebin by factor 2 (resp. 3) along axis 0 (resp. 1), up to index 10 (resp. 6)
counts[:10:2,:6:3] # same as above, but return new instance.
classmethod sum(*others)

Sum input two-point counts, see BaseTwoPointCounter.sum().

property with_mpi

Whether to use MPI.

wrap()

Return new ‘smu’ or ‘rppi’ two-point counts with 2nd coordinate wrapped to positive values, \(\mu > 0\) or \(\pi > 0\).

class pycorr.twopoint_jackknife.JackknifeTwoPointEstimator(D1D2=None, R1R2=None, D1R2=None, R1D2=None, S1S2=None, D1S2=None, S1D2=None, S1R2=None)

Bases: BaseTwoPointEstimator

Extend BaseTwoPointEstimator with methods to handle jackknife realizations.

Initialize BaseTwoPointEstimator, and set correlation estimation corr (calling run()).

Parameters:
  • D1D2 (BaseTwoPointCounter, default=None) – D1D2 two-point counts.

  • R1R2 (BaseTwoPointCounter, default=None) – R1R2 two-point counts.

  • D1R2 (BaseTwoPointCounter, default=None) – D1R2 two-point counts, e.g. for LandySzalayTwoPointEstimator.

  • R1D2 (BaseTwoPointCounter, default=None) – R1D2 two-point counts, e.g. for LandySzalayTwoPointEstimator, in case of cross-correlation.

  • S1S2 (BaseTwoPointCounter, default=None) – S1S2 two-point counts, e.g. with reconstruction, the Landy-Szalay estimator is commonly written: \((D1D2 - D1S2 - S1D2 - S1S2)/R1R2\), with S1 and S2 shifted random catalogs. Defaults to R1R2.

  • D1S2 (BaseTwoPointCounter, default=None) – D1S2 two-point counts, see S1S2. Defaults to D1R2.

  • S1D2 (BaseTwoPointCounter, default=None) – S1D2 two-point counts, see S1S2. Defaults to R1D2.

  • S1R2 (BaseTwoPointCounter, default=None) – S1R2 two-point counts. Defaults to R1R2.

property XX

Return first two-point counts.

classmethod concatenate(*others)

Concatenate input JackknifeTwoPointEstimator instances; typically used when calculation has been split into different samples, see argument samples of JackknifeTwoPointCounter.__init__().

classmethod concatenate_x(*others)

Concatenate input estimators along sep by concatenating their two-point counts; see BaseTwoPointCounter.concatenate_x().

property count_names

Return list of counts used in estimator.

property edges

Edges for two-point count calculation, taken from XX.

extend(other)

Extend current instance with another; see concatenate().

get_corr(return_sep=False, return_cov=None, return_mask=False, mode='auto', **kwargs)

Return correlation function, optionally its covariance estimate, if available.

Parameters:
  • return_sep (bool, default=False) – Whether (True) to return average separation(s) sepavg.

  • return_cov (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), return covariance estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • return_mask (bool, default=False) – Return mask of bins that are summed over along dimension 1, for each dimension 0 point.

  • mode (str, default='auto') – If ‘poles’, return multipoles. If ‘wedges’, return \(\mu\)-wedges. If ‘wp’, return projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), return multipoles; else if ‘wedges’ provided, return \(\mu\)-wedges. else if ‘pimax’ provided, return projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ell, project_to_wedges() (if mode is ‘smu’), e.g. wedges, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

  • sep (array) – Optionally, separation values.

  • corr (array) – Estimated correlation function.

  • cov (array) – Optionally, covariance estimate (of the flattened corr), see return_cov.

property mode

Two-point counting mode, taken from XX.

property ndim

Return binning dimensionality.

normalize(wnorm='XX')

Renormalize all counts (BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm). This is useful when combining measurements in various regions.

Parameters:

wnorm (float, string, default='XX') – If float, rescale all BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm such that BaseTwoPointCounter.wnorm matches wnorm. Else, name of counts (e.g. ‘D1D2’, ‘R1R2’, etc.) to take wnorm from. ‘XX’ is the first counts of the estimator (usually ‘D1D2’).

Returns:

new – New estimator, with all counts renormalized. corr is expected to be exactly the same.

Return type:

BaseTwoPointEstimator

property nrealizations

Number of jackknife realizations.

plot(plot_std=None, mode='auto', ax=None, fn=None, kw_save=None, show=False, **kwargs)

Plot correlation function.

Parameters:
  • plot_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), plot standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • ax (matplotlib.axes.Axes, default=None) – Axes where to plot samples. If None, takes current axes.

  • fn (string, default=None) – If not None, file name where to save figure.

  • kw_save (dict, default=None) – Optional arguments for matplotlib.figure.Figure.savefig().

  • show (bool, default=False) – Whether to show figure.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

ax

Return type:

matplotlib.axes.Axes

realization(ii, **kwargs)

Return jackknife realization ii.

Parameters:
Returns:

estimator – Two-point estimator for realization ii.

Return type:

BaseTwoPointEstimator

property realizations

List of jackknife realizations.

rebin(*args, **kwargs)

Rebin estimator, by rebinning all two-point counts. See BaseTwoPointCounter.rebin().

classmethod requires(join=None, **kwargs)

List required counts.

save(filename)

Save estimator to filename.

save_txt(filename, fmt='%.12e', delimiter=' ', header=None, comments='# ', return_std=None, mode='auto', **kwargs)

Save correlation function as txt file.

Warning

Attributes are not all saved, hence there is load_txt() method.

Parameters:
  • filename (str) – File name.

  • fmt (str, default='%.12e') – Format for floating types.

  • delimiter (str, default=' ') – String or character separating columns.

  • header (str, list, default=None) – String that will be written at the beginning of the file. If multiple lines, provide a list of one-line strings.

  • comments (str, default=' #') – String that will be prepended to the header string.

  • return_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), save standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

select(*args, **kwargs)

Restrict estimator to provided coordinate limits in place.

For example:

estimator.select((0, 0.3)) # restrict first axis to (0, 0.3)
estimator.select(None, (0, 0.2)) # restrict second axis to (0, 0.2)
property sep

Array of separation values of first dimension; if not set by run(), taken from R1R2 if provided, else XX.

sepavg(*args, **kwargs)

Return average of separation for input axis; this is an 1D array of size shape[axis].

property seps

Array of separation values, if not set by run(), taken from R1R2 if provided, else XX.

property shape

Return shape of obtained correlation corr.

slice(*args, **kwargs)

Slice estimator, by rebinning all two-point counts. See BaseTwoPointCounter.slice().

classmethod sum(*others, exclude=None, uniques=True)

Sum input estimators (their two-point counts, actually). See e.g. https://arxiv.org/pdf/1905.01133.pdf for a use case. Input two-point estimators must have same edges for this operation to make sense (no checks performed).

Parameters:
  • exclude (list, default=None) – List of counts to exclude, e.g. [‘D1D2’].

  • uniques (bool, default=True) – If True, sum only unique (in the sense of same BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm) count instances.

Returns:

new

Return type:

BaseTwoPointEstimator

wrap()

Return new ‘smu’ or ‘rppi’ two-point estimators with 2nd coordinate wrapped to positive values, \(\mu > 0\) or \(\pi > 0\).

class pycorr.twopoint_jackknife.JackknifeWeightTwoPointEstimator(D1D2=None, R1R2=None, D1R2=None, R1D2=None, S1S2=None, D1S2=None, S1D2=None, S1R2=None)

Bases: WeightTwoPointEstimator, JackknifeTwoPointEstimator

Initialize BaseTwoPointEstimator, and set correlation estimation corr (calling run()).

Parameters:
  • D1D2 (BaseTwoPointCounter, default=None) – D1D2 two-point counts.

  • R1R2 (BaseTwoPointCounter, default=None) – R1R2 two-point counts.

  • D1R2 (BaseTwoPointCounter, default=None) – D1R2 two-point counts, e.g. for LandySzalayTwoPointEstimator.

  • R1D2 (BaseTwoPointCounter, default=None) – R1D2 two-point counts, e.g. for LandySzalayTwoPointEstimator, in case of cross-correlation.

  • S1S2 (BaseTwoPointCounter, default=None) – S1S2 two-point counts, e.g. with reconstruction, the Landy-Szalay estimator is commonly written: \((D1D2 - D1S2 - S1D2 - S1S2)/R1R2\), with S1 and S2 shifted random catalogs. Defaults to R1R2.

  • D1S2 (BaseTwoPointCounter, default=None) – D1S2 two-point counts, see S1S2. Defaults to D1R2.

  • S1D2 (BaseTwoPointCounter, default=None) – S1D2 two-point counts, see S1S2. Defaults to R1D2.

  • S1R2 (BaseTwoPointCounter, default=None) – S1R2 two-point counts. Defaults to R1R2.

property XX

Return first two-point counts.

classmethod concatenate(*others)

Concatenate input JackknifeTwoPointEstimator instances; typically used when calculation has been split into different samples, see argument samples of JackknifeTwoPointCounter.__init__().

classmethod concatenate_x(*others)

Concatenate input estimators along sep by concatenating their two-point counts; see BaseTwoPointCounter.concatenate_x().

property count_names

Return list of counts used in estimator.

property edges

Edges for two-point count calculation, taken from XX.

extend(other)

Extend current instance with another; see concatenate().

get_corr(return_sep=False, return_cov=None, return_mask=False, mode='auto', **kwargs)

Return correlation function, optionally its covariance estimate, if available.

Parameters:
  • return_sep (bool, default=False) – Whether (True) to return average separation(s) sepavg.

  • return_cov (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), return covariance estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • return_mask (bool, default=False) – Return mask of bins that are summed over along dimension 1, for each dimension 0 point.

  • mode (str, default='auto') – If ‘poles’, return multipoles. If ‘wedges’, return \(\mu\)-wedges. If ‘wp’, return projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), return multipoles; else if ‘wedges’ provided, return \(\mu\)-wedges. else if ‘pimax’ provided, return projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ell, project_to_wedges() (if mode is ‘smu’), e.g. wedges, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

  • sep (array) – Optionally, separation values.

  • corr (array) – Estimated correlation function.

  • cov (array) – Optionally, covariance estimate (of the flattened corr), see return_cov.

property mode

Two-point counting mode, taken from XX.

property ndim

Return binning dimensionality.

normalize(wnorm='XX')

Renormalize all counts (BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm). This is useful when combining measurements in various regions.

Parameters:

wnorm (float, string, default='XX') – If float, rescale all BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm such that BaseTwoPointCounter.wnorm matches wnorm. Else, name of counts (e.g. ‘D1D2’, ‘R1R2’, etc.) to take wnorm from. ‘XX’ is the first counts of the estimator (usually ‘D1D2’).

Returns:

new – New estimator, with all counts renormalized. corr is expected to be exactly the same.

Return type:

BaseTwoPointEstimator

property nrealizations

Number of jackknife realizations.

plot(plot_std=None, mode='auto', ax=None, fn=None, kw_save=None, show=False, **kwargs)

Plot correlation function.

Parameters:
  • plot_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), plot standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • ax (matplotlib.axes.Axes, default=None) – Axes where to plot samples. If None, takes current axes.

  • fn (string, default=None) – If not None, file name where to save figure.

  • kw_save (dict, default=None) – Optional arguments for matplotlib.figure.Figure.savefig().

  • show (bool, default=False) – Whether to show figure.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

Returns:

ax

Return type:

matplotlib.axes.Axes

realization(ii, **kwargs)

Return jackknife realization ii.

Parameters:
Returns:

estimator – Two-point estimator for realization ii.

Return type:

BaseTwoPointEstimator

property realizations

List of jackknife realizations.

rebin(*args, **kwargs)

Rebin estimator, by rebinning all two-point counts. See BaseTwoPointCounter.rebin().

classmethod requires(join=None, **kwargs)

List required counts.

run()

Set weight estimate corr following \(R1R2/D1D2\), typically used for angular weights, with R1R2 from parent sample and D1D2 from fibered sample.

save(filename)

Save estimator to filename.

save_txt(filename, fmt='%.12e', delimiter=' ', header=None, comments='# ', return_std=None, mode='auto', **kwargs)

Save correlation function as txt file.

Warning

Attributes are not all saved, hence there is load_txt() method.

Parameters:
  • filename (str) – File name.

  • fmt (str, default='%.12e') – Format for floating types.

  • delimiter (str, default=' ') – String or character separating columns.

  • header (str, list, default=None) – String that will be written at the beginning of the file. If multiple lines, provide a list of one-line strings.

  • comments (str, default=' #') – String that will be prepended to the header string.

  • return_std (bool, default=None) – If True or None and estimator holds a covariance estimate cov() (e.g. from jackknife realizations), save standard deviation estimate. If True and estimator does not have cov(), raise TwoPointEstimatorError.

  • mode (str, default='auto') – If ‘poles’, save multipoles. If ‘wp’, save projected correlation function. If ‘auto’, and ‘ells’ provided (see kwargs), save multipoles; else if ‘pimax’ provided, save projected correlation function.

  • kwargs (dict) – Optionally arguments for project_to_poles() (if mode is ‘smu’), e.g. ells, project_to_wp (if mode is ‘rpppi’), e.g. pimax, and cov().

select(*args, **kwargs)

Restrict estimator to provided coordinate limits in place.

For example:

estimator.select((0, 0.3)) # restrict first axis to (0, 0.3)
estimator.select(None, (0, 0.2)) # restrict second axis to (0, 0.2)
property sep

Array of separation values of first dimension; if not set by run(), taken from R1R2 if provided, else XX.

sepavg(*args, **kwargs)

Return average of separation for input axis; this is an 1D array of size shape[axis].

property seps

Array of separation values, if not set by run(), taken from R1R2 if provided, else XX.

property shape

Return shape of obtained correlation corr.

slice(*args, **kwargs)

Slice estimator, by rebinning all two-point counts. See BaseTwoPointCounter.slice().

classmethod sum(*others, exclude=None, uniques=True)

Sum input estimators (their two-point counts, actually). See e.g. https://arxiv.org/pdf/1905.01133.pdf for a use case. Input two-point estimators must have same edges for this operation to make sense (no checks performed).

Parameters:
  • exclude (list, default=None) – List of counts to exclude, e.g. [‘D1D2’].

  • uniques (bool, default=True) – If True, sum only unique (in the sense of same BaseTwoPointCounter.wcounts and BaseTwoPointCounter.wnorm) count instances.

Returns:

new

Return type:

BaseTwoPointEstimator

property weight

Another name for corr.

wrap()

Return new ‘smu’ or ‘rppi’ two-point estimators with 2nd coordinate wrapped to positive values, \(\mu > 0\) or \(\pi > 0\).

class pycorr.twopoint_jackknife.KMeansSubsampler(mode, positions, nside=None, random_state=None, **kwargs)

Bases: BaseSubsampler

Subsampler using k-means scikit-learn algorithm to group particles together.

Initialize KMeansSubsampler.

Parameters:
  • mode (How to divide space, one of:) –

    • “angular”: on the sky

    • ”3d”: in Cartesian, 3D space

  • positions (list, array) – Positions of (typically randoms) particles to define subsamples. Typically of shape (3, N), but can be (2, N) when mode is “angular”.

  • nside (int, default=None) – Only if mode is “angular”. If not None, Healpix nside to pixelate input positions and weights. Smaller nside allows faster runtime, but coarser angular binning. If None, no Healpix pixelation is performed.

  • random_state (int, np.random.RandomState instance, default=None) – Determines random number generation for centroid initialization.

  • kwargs (dict) – Other arguments, see BaseSubsampler. One can also provide arguments for sklearn.cluster.KMeans.

label(positions, position_type=None)

Return subsample labels given input positions.

Parameters:
  • positions (list, array) – Positions of (typically randoms) particles to define subsamples. Typically of shape (3, N), but can be (2, N) when mode is “angular”. See position_type.

  • position_type (string, default='auto') –

    Type of input positions, one of:

    • ”rd”: RA/Dec in degree, only if mode is “angular”

    • ”rdd”: RA/Dec in degree, distance, for any mode

    • ”xyz”: Cartesian positions, shape (3, N)

    • ”pos”: Cartesian positions, shape (N, 3).

Returns:

labels – Labels corresponding to input positions.

Return type:

array

run()

Set kmeans instance to group particles together.

property with_mpi

Whether to use MPI.

pycorr.twopoint_jackknife.cls

alias of JackknifeTwoPointEstimator

Corrfunc two-point counter

Implement Corrfunc two-point counter engine.

class pycorr.corrfunc.CorrfuncTwoPointCounter(mode, edges, positions1, positions2=None, weights1=None, weights2=None, bin_type='auto', position_type='auto', weight_type='auto', weight_attrs=None, twopoint_weights=None, selection_attrs=None, los='midpoint', boxsize=None, compute_sepsavg=True, dtype='f8', nthreads=None, mpicomm=None, mpiroot=None, **kwargs)

Bases: BaseTwoPointCounter

Extend BaseTwoPointCounter for Corrfunc two-point counting code.

Initialize BaseTwoPointCounter, and run actual two-point counts (calling run()), setting wcounts and sep.

Parameters:
  • mode (string) –

    Type of two-point counts, one of:

    • ”theta”: as a function of angle (in degree) between two particles

    • ”s”: as a function of distance between two particles

    • ”smu”: as a function of distance between two particles and cosine angle \(\mu\) w.r.t. the line-of-sight

    • ”rppi”: as a function of distance transverse (\(r_{p}\)) and parallel (\(\pi\)) to the line-of-sight

    • ”rp”: same as “rppi”, without binning in \(\pi\)

  • edges (tuple, array) – Tuple of bin edges (arrays), for the first (e.g. \(r_{p}\)) and optionally second (e.g. \(\pi \in [-\infty, \infty]\), \(\mu \in [-1, 1]\)) dimensions. In case of single-dimension binning (e.g. mode is “theta”, “s” or “rp”), the single array of bin edges can be provided directly. Edges are inclusive on the low end, exclusive on the high end, i.e. a pair separated by \(s\) falls in bin i if edges[i] <= s < edges[i+1]. In case mode is “smu” however, the first \(\mu\)-bin is also exclusive on the low end (increase the \(\mu\)-range by a tiny value to include \(\mu = \pm 1\)). Pairs at separation \(s = 0\) are included in the \(\mu = 0\) bin. Similarly, in case mode is “rppi”, the first \(\pi\)-bin is also exclusive on the low end and pairs at separation \(s = 0\) are included in the \(\pi = 0\) bin. In case of auto-correlation (no positions2 provided), auto-pairs (pairs of same objects) are not counted. In case of cross-correlation, all pairs are counted. In any case, duplicate objects (with separation zero) will be counted.

  • positions1 (list, array) – Positions in the first catalog. Typically of shape (3, N), but can be (2, N) when mode is “theta”. See position_type.

  • positions2 (list, array, default=None) – Optionally, for cross-two-point counts, positions in the second catalog. See positions1.

  • weights1 (array, list, default=None) – Weights of the first catalog. Not required if weight_type is either None or “auto”. See weight_type.

  • weights2 (array, list, default=None) – Optionally, for cross-two-point counts, weights in the second catalog. See weights1.

  • bin_type (string, default='auto') – Binning type for first dimension, e.g. \(r_{p}\) when mode is “rppi”. Set to lin for speed-up in case of linearly-spaced bins. In this case, the bin number for a pair separated by a (3D, projected, angular…) separation sep is given by (sep - edges[0])/(edges[-1] - edges[0])*(len(edges) - 1), i.e. only the first and last bins of input edges are considered. Then setting compute_sepsavg is virtually costless. For non-linear binning, set to “custom”. “auto” allows for auto-detection of the binning type: linear binning will be chosen if input edges are within rtol = 1e-05 (relative tolerance) or atol = 1e-08 (absolute tolerance) of the array np.linspace(edges[0], edges[-1], len(edges)).

  • position_type (string, default='auto') –

    Type of input positions, one of:

    • ”rd”: RA/Dec in degree, only if mode is “theta”

    • ”rdd”: RA/Dec in degree, distance, for any mode

    • ”xyz”: Cartesian positions, shape (3, N)

    • ”pos”: Cartesian positions, shape (N, 3).

  • weight_type (string, default='auto') –

    The type of weighting to apply to provided weights. One of:

    • None: no weights are applied.

    • ”product_individual”: each pair is weighted by the product of weights \(w_{1} w_{2}\).

    • ”inverse_bitwise”: each pair is weighted by \(\mathrm{nrealizations}/(\mathrm{noffset} + \mathrm{popcount}(w_{1} \& w_{2}))\).

      Multiple bitwise weights can be provided as a list. Individual weights can additionally be provided as float arrays. In case of cross-correlations with floating weights, bitwise weights are automatically turned to IIP weights, i.e. \(\mathrm{nrealizations}/(\mathrm{noffset} + \mathrm{popcount}(w_{1}))\).

    • ”auto”: automatically choose weighting based on input weights1 and weights2,

      i.e. None when weights1 and weights2 are None, “inverse_bitwise” if one of input weights is integer, else “product_individual”.

    In addition, angular upweights can be provided with twopoint_weights.

  • weight_attrs (dict, default=None) – Dictionary of weighting scheme attributes. In case weight_type is “inverse_bitwise”, one can provide “nrealizations”, the total number of realizations (including current one; defaulting to the number of bits in input weights plus one); “noffset”, the offset to be added to the bitwise counts in the denominator (defaulting to 1) and “default_value”, the default value of pairwise weights if the denominator is zero (defaulting to 0). The method used to compute the normalization with PIP weights can be specified with the keyword “normalization”: if None or “total”, normalization is given by eq. 22 of arXiv:1912.08803; “brute_force” (using OpenMP’ed C code) or “brute_force_npy” (slower, using numpy only methods; both methods match within machine precision) loop over all pairs; “counter” to normalize each pair by eq. 19 of arXiv:1912.08803. For normalizations “total” or “counter”, “nalways” specifies the number of bits systematically set to 1 minus the number of bits systematically set to 0 (defaulting to 0). For example, for the “zero-truncated” estimator (arXiv:1912.08803), one would use noffset = 0, nalways = 1.

  • twopoint_weights (WeightTwoPointEstimator, default=None) – Weights to be applied to each pair of particles. A WeightTwoPointEstimator instance or any object with arrays sep (separations) and weight (weight at given separation) as attributes (i.e. to be accessed through twopoint_weights.sep, twopoint_weights.weight) or as keys (i.e. twopoint_weights['sep'], twopoint_weights['weight']) or as element (i.e. sep, weight = twopoint_weights).

  • selection_attrs (dict, default=None) – To select pairs to be counted, provide mapping between the quantity (string) and the interval (tuple of floats), e.g. {'rp': (0., 20.)} to select pairs with transverse separation ‘rp’ between 0 and 20, {‘theta’: (0., 20.)}` to select pairs with separation angle ‘theta’ between 0 and 20 degrees.

  • los (string, default='midpoint') –

    Line-of-sight to be used when mode is “smu”, “rppi” or “rp”; one of:

    • ”x”, “y” or “z”: Cartesian axis

    • ”midpoint”: the mean position of the pair: \(\mathbf{\eta} = (\mathbf{r}_{1} + \mathbf{r}_{2})/2\)

    • ”firstpoint”: the first position of the pair: \(\mathbf{\eta} = \mathbf{r}_{1}\)

    • ”endpoint”: the second position of the pair: \(\mathbf{\eta} = \mathbf{r}_{2}\)

    WARNING: “endpoint” is obtained by reversing “firstpoint” (which is the only line-of-sight implemented in the counter). This means, if “s” or “rp” edges starts at 0, and the number of “mu” or “pi” bins is even, zero separation pairs (due to duplicate objects) will be counted in counts[0, (counts.shape[1] - 1) // 2] instead of counts[0, counts.shape[1] // 2].

  • boxsize (array, float, default=None) – For periodic wrapping, the side-length(s) of the periodic cube.

  • compute_sepsavg (bool, default=True) – Set to False to not calculate the average separation for each bin. This can make the two-point counts faster if bin_type is “custom”. In this case, sep will be set the midpoint of input edges.

  • dtype (string, np.dtype, default='f8') – Array type for positions and weights. If None, defaults to type of first positions1 array. Double precision is highly recommended in case mode is “theta”, twopoint_weights is provided (due to cosine), or compute_sepsavg is True.

  • nthreads (int, default=None) – Number of OpenMP threads to use.

  • mpicomm (MPI communicator, default=None) – The MPI communicator, to MPI-distribute calculation.

  • mpiroot (int, default=None) – In case mpicomm is provided, if None, input positions and weights are assumed to be scattered across all ranks. Else the MPI rank where input positions and weights are gathered.

  • kwargs (dict) – Two-point counter engine-specific options.

property compute_sepavg

Whether to compute average of separation values for first dimension (e.g. \(s\) if mode is “smu”).

classmethod concatenate_x(*others)

Concatenate input two-point counts along sep; useful when running two-point counts at different particle densities, e.g. high density on small scales, and lower density on larger scales, to keep computing time tractable.

Warning

wnorm is cast to a ndim array.

property ndim

Return binning dimensionality.

normalization()

Return two-point count normalization, i.e., in case of cross-correlation:

\[\left(\sum_{i=1}^{N_{1}} w_{1,i}\right) \left(\sum_{j=1}^{N_{2}} w_{2,j}\right)\]

with the sums running over the weights of the first and second catalogs, and in case of auto-correlation:

\[\left(\sum_{i=1}^{N_{1}} w_{1,i}\right)^{2} - \sum_{i=1}^{N_{1}} w_{1,i}^{2}\]
normalize(wnorm)

Rescale both wcounts and wnorm such that new wnorm matches wnorm. This is useful when combining counts in various regions.

Parameters:

wnorm (float) – New normalization wnorm.

Returns:

new – Normalized counts.

Return type:

BaseTwoPointCounter

normalized_wcounts()

Return normalized two-point counts, i.e. wcounts divided by normalization().

property periodic

Whether periodic wrapping is used (i.e. boxsize is not None).

rebin(factor=1)

Rebin two-point counts, by factor(s) factor. Input factors must divide shape.

Warning

If current instance is the result of concatenate_x(), rebinning is exact only if factor divides each of the constant-wnorm chunks.

reverse()

Return counts for reversed positions1/weights1 and positions2/weights2.

run()

Compute the two-point counts and set wcounts and sep.

save(filename)

Save two-point counts to filename.

save_txt(filename, fmt='%.12e', delimiter=' ', header=None, comments='# ')

Save two-point counts as txt file.

Warning

Attributes are not all saved, hence there is load_txt() method.

Parameters:
  • filename (str) – File name.

  • fmt (str, default='%.12e') – Format for floating types.

  • delimiter (str, default=' ') – String or character separating columns.

  • header (str, list, default=None) – String that will be written at the beginning of the file. If multiple lines, provide a list of one-line strings.

  • comments (str, default=' #') – String that will be prepended to the header string.

select(*xlims)

Restrict counts to provided coordinate limits in place.

For example:

counts.select((0, 0.3))  # restrict first axis to (0, 0.3)
counts.select(None, (0, 0.2))  # restrict second axis to (0, 0.2)
statistic.select((0, 30, 4))   # rebin to match step size of 4 and restrict to (0, 30)
property sep

Array of separation values of first dimension (e.g. \(s\) if mode is “smu”).

sepavg(axis=0, method=None)

Return average of separation for input axis.

Parameters:
  • axis (int, default=0) – Axis; if mode is “smu”, 0 is \(s\), 1 is \(mu\); if mode is “rppi”, 0 is \(r_{p}\), 1 is \(\pi\).

  • method (str, default=None) – If None, return average separation from seps. If ‘mid’, return bin mid-points.

Returns:

sepavg – 1D array of size shape[axis].

Return type:

array

property shape

Return shape of obtained counts wcounts.

slice(*slices)

Slice counts in place. If slice step is not 1, use rebin(). For example:

counts.slice(slice(0, 10, 2), slice(0, 6, 3)) # rebin by factor 2 (resp. 3) along axis 0 (resp. 1), up to index 10 (resp. 6)
counts[:10:2,:6:3] # same as above, but return new instance.
classmethod sum(*others)

Sum input two-point counts; useful when splitting input sample of particles; e.g. https://arxiv.org/pdf/1905.01133.pdf.

Warning

If > 1 input two-point counts, size1, size2 attributes will be lost. Input two-point counts must have same edges for this operation to make sense (no checks performed).

property with_mpi

Whether to use MPI.

wrap()

Return new ‘smu’ or ‘rppi’ two-point counts with 2nd coordinate wrapped to positive values, \(\mu > 0\) or \(\pi > 0\).

Utilities

A few utilities.

class pycorr.utils.BaseClass

Bases: object

Base class that implements copy(). To be used throughout this package.

class pycorr.utils.BaseMetaClass(name, bases, class_dict)

Bases: type

Metaclass to add logging attributes to BaseClass derived classes.

mro()

Return a type’s method resolution order.

set_logger()

Add attributes for logging:

  • logger

  • methods log_debug, log_info, log_warning, log_error, log_critical

class pycorr.utils.BaseTaskManager(**kwargs)

Bases: BaseClass

A dumb task manager, that simply iterates through the tasks in series.

iterate(tasks)

Iterate through a series of tasks.

Parameters:

tasks (iterable) – An iterable of tasks that will be yielded.

Yields:

task – The individual items of `tasks, iterated through in series.

map(function, tasks)

Apply a function to all of the values in a list and return the list of results.

If tasks contains tuples, the arguments are passed to function using the *args syntax.

Parameters:
  • function (callable) – The function to apply to the list.

  • tasks (list) – The list of tasks.

Returns:

results – The list of the return values of function.

Return type:

list

class pycorr.utils.DistanceToRedshift(distance, zmax=100.0, nz=2048, interp_order=3)

Bases: object

Class that holds a conversion distance -> redshift.

Initialize DistanceToRedshift. Creates an array of redshift -> distance in log(redshift) and instantiates a spline interpolator distance -> redshift.

Parameters:
  • distance (callable) – Callable that provides distance as a function of redshift (array).

  • zmax (float, default=100.) – Maximum redshift for redshift <-> distance mapping.

  • nz (int, default=2048) – Number of points for redshift <-> distance mapping.

  • interp_order (int, default=3) – Interpolation order, e.g. 1 for linear interpolation, 3 for cubic splines.

pycorr.utils.TaskManager(mpicomm=None, **kwargs)

Switch between non-MPI (ntasks=1) and MPI task managers. To be called as:

with TaskManager(...) as tm:
    # do stuff
pycorr.utils.cartesian_to_sky(positions, wrap=True, degree=True, dtype=None)

Transform cartesian coordinates into RA, Dec, distance.

Parameters:
  • positions (array of shape (3, N), list of 3 arrays) – Positions in cartesian coordinates.

  • wrap (bool, default=True) – Whether to wrap RA in \([0, 2 \pi]\).

  • degree (bool, default=True) – Whether RA, Dec are in degrees (True) or radians (False).

Returns:

rdd – Right ascension, declination and distance.

Return type:

list of 3 arrays

pycorr.utils.cov_to_corrcoef(cov)

Return correlation matrix corresponding to input covariance matrix cov. If cov is scalar, return 1.

pycorr.utils.distance(positions)

Return cartesian distance, taking coordinates along position first axis.

pycorr.utils.exception_handler(exc_type, exc_value, exc_traceback)

Print exception with a logger.

pycorr.utils.get_mpi()

Return mpi module.

pycorr.utils.is_sequence(item)

Whether input item is a tuple or list.

pycorr.utils.joint_occurences(nrealizations=128, max_occurences=None, noffset=1, default_value=0)

Return expected value of inverse counts, i.e. eq. 21 of arXiv:1912.08803.

Parameters:
  • nrealizations (int) – Number of realizations (including current realization).

  • max_occurences (int, default=None) – Maximum number of occurences (including noffset). If None, defaults to nrealizations.

  • noffset (int, default=1) – The offset added to the bitwise count, typically 0 or 1. See “zero truncated estimator” and “efficient estimator” of arXiv:1912.08803.

  • default_value (float, default=0.) – The default value of pairwise weights if the denominator is zero (defaulting to 0).

Returns:

occurences – Expected value of inverse counts.

Return type:

list

pycorr.utils.mkdir(dirname)

Try to create dirname and catch OSError.

pycorr.utils.pack_bitarrays(*arrays, dtype=<class 'numpy.uint64'>)

Pack bit arrays into a list of integer arrays. Inverse operation is unpack_bitarray(), i.e. unpack_bitarrays(pack_bitarrays(*arrays, dtype=dtype))``is ``arrays, whatever integer dtype is.

Parameters:
  • arrays (bool arrays) – Arrays of integers or booleans whose elements should be packed to bits.

  • dtype (string, dtype) – Type of output integer arrays.

Returns:

arrays – List of integer arrays of type dtype, representing input boolean arrays.

Return type:

list

pycorr.utils.pascal_triangle(n_rows)

Compute Pascal triangle. Taken from https://stackoverflow.com/questions/24093387/pascals-triangle-for-python.

Parameters:

n_rows (int) – Number of rows in the Pascal triangle, i.e. maximum number of elements \(n\).

Returns:

triangle – List of list of binomial coefficients. The binomial coefficient \((k, n)\) is triangle[n][k].

Return type:

list

pycorr.utils.popcount(*arrays)

Return number of 1 bits in each value of input array. Inspired from https://github.com/numpy/numpy/issues/16325.

pycorr.utils.rebin(array, new_shape, statistic=<function sum>)

Bin an array in all axes based on the target shape, by summing or averaging. Number of output dimensions must match number of input dimensions and new axes must divide old ones.

Taken from https://stackoverflow.com/questions/8090229/resize-with-averaging-or-rebin-a-numpy-2d-array and https://nbodykit.readthedocs.io/en/latest/_modules/nbodykit/binned_statistic.html#BinnedStatistic.reindex.

Example

>>> m = np.arange(0,100,1).reshape((10,10))
>>> n = rebin(m, new_shape=(5,5), statistic=np.sum)
>>> print(n)
[[ 22 30 38 46 54]

[102 110 118 126 134] [182 190 198 206 214] [262 270 278 286 294] [342 350 358 366 374]]

pycorr.utils.reformat_bitarrays(*arrays, dtype=<class 'numpy.uint64'>, copy=True)

Reformat input integer arrays into list of arrays of type dtype. If, e.g. 6 arrays of type np.uint8 are input, and dtype is np.uint32, a list of 2 arrays is returned.

Parameters:
  • arrays (integer arrays) – Arrays of integers to reformat.

  • dtype (string, dtype) – Type of output integer arrays.

  • copy (bool, default=True) – If False, avoids copy of input arrays if dtype is uint8.

Returns:

arrays – List of integer arrays of type dtype, representing input integer arrays.

Return type:

list

pycorr.utils.savefig(filename, fig=None, bbox_inches='tight', pad_inches=0.1, dpi=200, **kwargs)

Save figure to filename.

Warning

Take care to close figure at the end, plt.close(fig).

Parameters:
  • filename (string) – Path where to save figure.

  • fig (matplotlib.figure.Figure, default=None) – Figure to save. Defaults to current figure.

  • kwargs (dict) – Optional arguments for matplotlib.figure.Figure.savefig().

Returns:

fig

Return type:

matplotlib.figure.Figure

pycorr.utils.setup_logging(level=20, stream=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, filename=None, filemode='w', **kwargs)

Set up logging.

Parameters:
  • level (string, int, default=logging.INFO) – Logging level.

  • stream (_io.TextIOWrapper, default=sys.stdout) – Where to stream.

  • filename (string, default=None) – If not None stream to file name.

  • filemode (string, default='w') – Mode to open file, only used if filename is not None.

  • kwargs (dict) – Other arguments for logging.basicConfig().

pycorr.utils.sky_to_cartesian(rdd, degree=True, dtype=None)

Transform RA, Dec, distance into cartesian coordinates.

Parameters:
  • rdd (array of shape (3, N), list of 3 arrays) – Right ascension, declination and distance.

  • degree (default=True) – Whether RA, Dec are in degrees (True) or radians (False).

Returns:

positions – Positions x, y, z in cartesian coordinates.

Return type:

list of 3 arrays

pycorr.utils.unpack_bitarrays(*arrays)

Unpack integer arrays into a bit array. Inverse operation is pack_bitarray(), i.e. pack_bitarrays(unpack_bitarrays(*arrays), dtype=arrays.dtype)``is ``arrays.

Parameters:

arrays (integer arrays) – Arrays of integers whose elements should be unpacked to bits.

Returns:

arrays – List of boolean arrays of type np.uint8, representing input integer arrays.

Return type:

list

MPI

class pycorr.mpi.MPITaskManager(nprocs_per_task=1, use_all_nprocs=False, mpicomm=mpi4py.MPI.COMM_WORLD)

Bases: BaseClass

A MPI task manager that distributes tasks over a set of MPI processes, using a specified number of independent workers to compute each task.

Given the specified number of independent workers (which compute tasks in parallel), the total number of available CPUs will be divided evenly.

The main function is iterate which iterates through a set of tasks, distributing the tasks in parallel over the available ranks.

Initialize MPITaskManager.

Parameters:
  • nprocs_per_task (int, optional) – The desired number of processes assigned to compute each task.

  • mpicomm (MPI communicator, optional) – The global communicator that will be split so each worker has a subset of CPUs available; default is COMM_WORLD.

  • use_all_nprocs (bool, optional) – If True, use all available CPUs, including the remainder if nprocs_per_task does not divide the total number of CPUs evenly; default is False.

is_root()

Is the current process the root process? Root is responsible for distributing the tasks to the other available ranks.

is_worker()

Is the current process a valid worker? Workers wait for instructions from the root.

iterate(tasks)

Iterate through a series of tasks in parallel.

Notes

This is a collective operation and should be called by all ranks.

Parameters:

tasks (iterable) – An iterable of task items that will be yielded in parallel across all ranks.

Yields:

task – The individual items of tasks, iterated through in parallel.

map(function, tasks)

Apply a function to all of the values in a list and return the list of results.

If tasks contains tuples, the arguments are passed to function using the *args syntax.

Notes

This is a collective operation and should be called by all ranks.

Parameters:
  • function (callable) – The function to apply to the list.

  • tasks (list) – The list of tasks.

Returns:

results – The list of the return values of function.

Return type:

list

pycorr.mpi.bcast(data, mpiroot=0, mpicomm=mpi4py.MPI.COMM_WORLD)

Broadcast the input data array across all ranks, assuming data is initially only on mpiroot (and None on other ranks). This uses Scatterv, which avoids mpi4py pickling, and also avoids the 2 GB mpi4py limit for bytes using a custom datatype.

Parameters:
  • data (array_like or None) – On mpiroot, this gives the data to broadcast.

  • mpiroot (int, default=0) – The rank number that initially has the data.

  • mpicomm (MPI communicator, default=COMM_WORLD) – The MPI communicator.

Returns:

recvbufferdata on each rank.

Return type:

array_like

pycorr.mpi.domain_decompose(mpicomm, smoothing, positions1, weights1=None, positions2=None, weights2=None, boxsize=None, domain_factor=None)

Adapted from https://github.com/bccp/nbodykit/blob/master/nbodykit/algorithms/pair_counters/domain.py. Decompose positions and weights on a grid of MPI processes. Requires mpi4py and pmesh.

Parameters:
  • mpicomm (MPI communicator) – The MPI communicator.

  • smoothing (float) – The maximum Cartesian separation implied by the user’s binning.

  • positions1 (list, array) – Positions in the first catalog. Typically of shape (3, N) or (2, N); in the latter case input arrays are assumed to be angular coordinates in degrees (these will be projected onto the unit sphere for further decomposition).

  • positions2 (list, array, default=None) – Optionally, for cross-pair counts, positions in the second catalog. See positions1.

  • weights1 (list, array, default=None) – Optionally, weights of the first catalog.

  • weights2 (list, array, default=None) – Optionally, weights in the second catalog.

  • boxsize (array, default=None) – For periodic wrapping, the 3 side-lengths of the periodic cube.

  • domain_factor (int, default=None) – Multiply the size of the MPI mesh by this factor. If None, defaults to 2 in case boxsize is None, else (periodic wrapping) 1.

Returns:

(positions1, weights1), (positions2, weights2) – The (decomposed) set of positions and weights.

Return type:

arrays

pycorr.mpi.enum(*sequential, **named)

Enumeration values to serve as status tags passed between processes.

pycorr.mpi.gather(data, mpiroot=0, mpicomm=mpi4py.MPI.COMM_WORLD)

Taken from https://github.com/bccp/nbodykit/blob/master/nbodykit/utils.py. Gather the input data array from all ranks to the specified mpiroot. This uses Gatherv, which avoids mpi4py pickling, and also avoids the 2 GB mpi4py limit for bytes using a custom datatype.

Parameters:
  • data (array_like) – The data on each rank to gather.

  • mpiroot (int, Ellipsis, default=0) – The rank number to gather the data to. If mpiroot is Ellipsis or None, broadcast the result to all ranks.

  • mpicomm (MPI communicator, default=MPI.COMM_WORLD) – The MPI communicator.

Returns:

recvbuffer – The gathered data on mpiroot, and None otherwise.

Return type:

array_like, None

pycorr.mpi.local_size(size, mpicomm=mpi4py.MPI.COMM_WORLD)

Divide global size into local (process) size.

Parameters:
  • size (int) – Global size.

  • mpicomm (MPI communicator, default=MPI.COMM_WORLD) – The MPI communicator.

Returns:

localsize – Local size. Sum of local sizes over all processes equals global size.

Return type:

int

pycorr.mpi.scatter(data, size=None, mpiroot=0, mpicomm=mpi4py.MPI.COMM_WORLD)

Taken from https://github.com/bccp/nbodykit/blob/master/nbodykit/utils.py Scatter the input data array across all ranks, assuming data is initially only on mpiroot (and None on other ranks). This uses Scatterv, which avoids mpi4py pickling, and also avoids the 2 GB mpi4py limit for bytes using a custom datatype

Parameters:
  • data (array_like or None) – On mpiroot, this gives the data to split and scatter.

  • size (int) – Length of data on current rank.

  • mpiroot (int, default=0) – The rank number that initially has the data.

  • mpicomm (MPI communicator, default=MPI.COMM_WORLD) – The MPI communicator.

Returns:

recvbuffer – The chunk of data that each rank gets.

Return type:

array_like

pycorr.mpi.split_ranks(nranks, nranks_per_worker, include_all=False)

Divide the ranks into chunks, attempting to have nranks_per_worker ranks in each chunk. This removes the master (0) rank, such that nranks - 1 ranks are available to be grouped.

Parameters:
  • nranks (int) – The total number of ranks available.

  • nranks_per_worker (int) – The desired number of ranks per worker.

  • include_all (bool, optional) – if True, then do not force each group to have exactly nranks_per_worker ranks, instead including the remainder as well; default is False.