mikeswang / triumvirate

A Python/C++ package for three-point clustering measurements in LSS analyses

Home Page: https://mikeswang.github.io/Triumvirate/

License: GNU General Public License v3.0

Makefile 1.97% C++ 48.15% Python 39.66% Cython 6.04% TeX 2.23% Shell 1.92% Dockerfile 0.01% Jinja 0.02%
python cpp cython clustering-statistics large-scale-structure-cosmology

triumvirate's Introduction

Three-Point Clustering Measurements in LSS

Triumvirate is a Python/C++ software package for measuring three-point (and two-point) clustering statistics in large-scale structure (LSS) cosmological analyses.

Documentation

Comprehensive documentation including the scientific background, installation instructions, tutorials and API reference can be found at triumvirate.readthedocs.io.

Installation

Python package

Triumvirate as a Python package is distributed through PyPI and Conda. Instructions for installation can be found on the Installation page in the documentation.

Tip

CUDA variants of the Python package are/will be made available as Triumvirate-CUDA on PyPI and triumvirate-cuda through Conda.

C++ library & program

Triumvirate as either a static library or a binary executable can be built using make. Instructions for compilation can be found on the Installation page in the documentation.

Development mode

Both the Python package and the C++ library/program can be set up in development mode with make, provided that dependency requirements are satisfied (GSL and FFTW3 libraries are mandatory while an OpenMP library is optional).

First git clone the desired branch/release from the GitHub repository and change into the repository directory path:

git clone git@github.com:MikeSWang/Triumvirate.git --branch <branch-or-release>
cd Triumvirate

Then, execute in shell:

make clean
make ([py|cpp]install)|(cpp[libinstall|appbuild]) [useomp=(true|1)] [usecuda=(true|1)]

where cpplibinstall or cppappbuild respectively builds the C++ static library or binary executable only, cppinstall builds both, pyinstall builds the Python package only, and install builds all of the above. To enable OpenMP parallelisation, append useomp=true or useomp=1 to the end of the second line as shown above. To enable CUDA support, append usecuda=true or usecuda=1 to the end of the second line as shown above.

Note

The latest release is on the main branch. The default Makefile (located at the repository directory root) should work in most build environments, but may need to be modified as appropriate.

Note

See the Installation page in the documentation for more details about dependency requirements.

Important

If enabling OpenMP, ensure the C++ compiler used supports it and is configured accordingly. The default Makefile (located at the repository directory root) assumes the GCC compiler and OpenMP library. See the Installation page in the documentation for more details.

Important

If enabling CUDA capability, ensure there is a CUDA-capable GPU with the appropriate driver installed. For atypical CUDA Toolkit paths, you may need to append the header and library paths to DEP_INCLUDES and DEP_LDFLAGS in the default Makefile (located at the repository directory root). See the Installation page in the documentation for more details.

Tip

Pass option -j[N] -O to make to run multiple concurrent jobs for parallel building (optional parameter N is the number of parallel jobs; see GNU Make Manual).

Attribution

JOSS Zenodo arXiv MNRAS MNRAS

To acknowledge the use of Triumvirate in your published research, please cite the publications linked above; for convenience, you can refer to the files CITATION.cff and CITATION.md for the relevant information in different formats.

Acknowledgement

This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (Grant agreement ID: 853291).

Key underlying numerical algorithms were originally developed by Naonori S Sugiyama, and are available in the GitHub repository hitomi.

We thank the JOSS reviewers, William Coulton (@wcoulton) and Alfonso Veropalumbo (@alfonso-veropalumbo), for their valuable feedback and suggestions (openjournals/joss-reviews#5571), which have improved the functionality and documentation of the code.

Contributing/Development

User feedback and contributions are very welcome. Please refer to the contribution guidelines.

Discussions & Wiki

A community forum for users and developers exists, where you can receive announcements, post questions, share ideas and get updates.

A wiki site collects tips for specific use cases and user environments.

Releases

Release notes are included in the change log.

Licence

Triumvirate is made freely available under the GPLv3+ licence. Please see LICENCE (located at the repository directory root) for full terms and conditions.

© 2023 Mike S Wang & Naonori S Sugiyama

triumvirate's People

Contributors

dependabot[bot], dfm, mikeswang, misharash, pre-commit-ci[bot]

triumvirate's Issues

[FEAT] Support mesh-grid source data

Is the requested feature related to an issue?

No.

Summary

Currently the only source data type supported is catalogues of discrete particles with positions and weights. An additional source data type would be mesh grids of sampled density fields, with lines of sight given by the grid position vectors.

Alternatives

N/A

Implementation

Overload existing field operation and clustering statistic computation methods.

Additional context

This scenario occurs e.g. when reconstruction techniques are used.

Mesh-grid source data should demand less computational time as spherical harmonics can be precomputed and stored on the mesh grids.

[FEAT] Full 2D three-point statistics

Is the requested feature related to an issue?

No.

Summary

Allow the user to compute the full 2D three-point statistics without fixing the first coordinate.

Alternatives

N/A

Implementation

This should be straightforward but there are two points of consideration:

  • the output measurement data array needs to be arranged to a specific layout, e.g. fix $k_1$ to the first bin and vary $k_2$ before moving to the next bin for $k_1$;
  • the computation time for a single execution is likely to be $N_\mathrm{bin}$ times longer.
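The layout described in the first point can be sketched as follows. This is a minimal illustration only: the function name `flatten_2d_bins` and the bin values are hypothetical and are not part of Triumvirate.

```python
from itertools import product


def flatten_2d_bins(k_bins):
    """Return (k1, k2) pairs with k1 held fixed while k2 varies fastest."""
    return list(product(k_bins, repeat=2))


k_bins = [0.01, 0.02, 0.03]  # illustrative bin coordinates
layout = flatten_2d_bins(k_bins)

# A fixed-k1 measurement has len(k_bins) entries; the full 2D layout
# has len(k_bins) ** 2 entries, i.e. N_bin times as many.
for k1, k2 in layout:
    print(k1, k2)
```

With this ordering, the entries for the i-th $k_1$ bin occupy the contiguous slice `layout[i * N_bin:(i + 1) * N_bin]`, which keeps each fixed-$k_1$ block identical in shape to the current output.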

Additional context

This is in response to the JOSS review suggestion (openjournals/joss-reviews#5571 (comment)).

[FEAT] Automatic numerical parameters (e.g. box size and grid cell number)

Is the requested feature related to an issue?

No.

Summary

Allow the box size and grid cell number to be unset and automatically calculated based on alternative parameters.

Alternatives

N/A

Implementation

An additional expansion factor (as a ratio of the particle coordinate spans) can be added for automatic box sizes. Based on the box sizes and an additional requested Nyquist cutoff (wavenumber or separation), the grid cell numbers are calculated.
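A minimal sketch of this calculation along a single axis is given below. The function names and the rounding to an even grid cell number are assumptions made for illustration; the only relation relied upon is the standard Nyquist wavenumber $k_\mathrm{N} = \pi N / L$ for $N$ grid cells in a box of size $L$. A separation cutoff is not handled here.

```python
import math


def auto_box_size(coords, expansion=1.1):
    """Box size along one axis: the coordinate span times an expansion factor."""
    return expansion * (max(coords) - min(coords))


def auto_ngrid(boxsize, k_nyquist):
    """Smallest even grid cell number N with pi * N / boxsize >= k_nyquist.

    Rounding up to an even number is an assumption of this sketch.
    """
    ngrid = math.ceil(k_nyquist * boxsize / math.pi)
    return ngrid + ngrid % 2


x = [-480.0, 120.5, 500.0]  # illustrative particle x-coordinates
boxsize_x = auto_box_size(x, expansion=1.25)
ngrid_x = auto_ngrid(boxsize_x, k_nyquist=0.3)
print(boxsize_x, ngrid_x)
```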

Additional context

This saves the user from having to examine the catalogue files in advance to determine the appropriate box sizes.

[FEAT] Julia wrapper?

Hi, since the library seems to be built primarily in C++ with a Python interface, I wonder how difficult it would be to write a thin wrapper for Julia. Julia supports C function calls natively, so in principle a shared library exposing functions with C-compatible signatures, for example bispec(real *posx, real *posy, real *posz), should be enough.

I can look into it at some point but I am not familiar with the library so it will be harder.

Thanks in advance!

[FEAT] Add command-line utilities as console-script entry points

Is the requested feature related to an issue?

No.

Summary

Add console-script entry points so that triumvirate can be extended with command-line utilities.

As an example,

[python -m] triumvirate [--param-file=]params.yml

performs clustering statistic computations.
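As a rough sketch of how the argument handling for such an entry point might look, the parser below accepts the parameter file either positionally or through --param-file. All names here (`build_parser`, `resolve_param_file`) are hypothetical and do not correspond to an existing Triumvirate interface.

```python
import argparse


def build_parser():
    """Build a hypothetical parser for the proposed command-line utility."""
    parser = argparse.ArgumentParser(prog="triumvirate")
    # The parameter file may be given positionally ...
    parser.add_argument(
        "param_file_pos", nargs="?", default=None, metavar="PARAM_FILE"
    )
    # ... or through an explicit option.
    parser.add_argument("--param-file", dest="param_file_opt", default=None)
    return parser


def resolve_param_file(argv):
    """Return the parameter file path from either form of the argument."""
    args = build_parser().parse_args(argv)
    param_file = args.param_file_opt or args.param_file_pos
    if param_file is None:
        raise SystemExit("a parameter file is required")
    return param_file
```

A function wrapping this logic could then be registered as a console script through the packaging configuration (for example, a `[project.scripts]` table in pyproject.toml), which is how the bare `triumvirate` command would become available after installation.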

Alternatives

N/A

Implementation

Not provided yet.

Additional context

Utilities under consideration:

  1. Re-binning of measurements;
  2. Symmetric exchange of bispectrum multipole measurements;
  3. Data vector concatenation and aggregation;
  4. Parameter file generation.

[FEAT] Utilise parity/conjugation relations for three-point measurements

Is the requested feature related to an issue?

No.

Summary

Since $Y_\ell^{-m} = (-1)^m {Y_\ell^m}^*$, computation for $(-m_1, -m_2, -M)$ is redundant if the $(m_1, m_2, M)$ term has already been computed. Computational cost is reduced if the latter result is reused at practically no additional memory cost.

Alternatives

N/A

Implementation

For each $(m_1, m_2, M)$, check if the term corresponding to its additive inverse $(-m_1, -m_2, -M)$ has been computed. If not, add a parity-like factor with conjugation to account for the term corresponding to its additive inverse, else skip.
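The bookkeeping described above can be sketched as follows. This is an illustration of the skip logic only, with a hypothetical function name; it does not compute any spherical harmonics and does not apply any further selection rules on the indices.

```python
from itertools import product


def representative_triples(ell1, ell2, ELL):
    """Return one triple from each {(m1, m2, M), (-m1, -m2, -M)} pair."""
    done = set()
    representatives = []
    ranges = (range(-ell, ell + 1) for ell in (ell1, ell2, ELL))
    for m1, m2, M in product(*ranges):
        if (-m1, -m2, -M) in done:
            continue  # the additive inverse already accounts for this term
        done.add((m1, m2, M))
        representatives.append((m1, m2, M))
    return representatives


reps = representative_triples(1, 1, 2)
total = 3 * 3 * 5  # number of triples without the reduction
print(len(reps), "of", total)
```

Only $(0, 0, 0)$ is its own additive inverse, so the number of terms that need explicit computation drops from $n$ to $(n + 1)/2$, where $n$ is the total number of triples.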

Additional context

Thanks to Yunchen Xie ([email protected]) at National Astronomical Observatories, Chinese Academy of Sciences for the suggestion.

[MAINT] NumPy 2 ABI breaking change(s)

Is the maintenance request a bug report or feature suggestion?

This can be regarded as a build/installation issue, a bug report or a feature suggestion depending on the viewpoint. It is mostly a maintenance issue though due to upstream/dependency API change.

Summary

NumPy 2 has introduced potentially breaking changes to its ABI. Currently triumvirate may be built with numpy<2 while the runtime dependency is numpy>=2. This causes either

ImportError: numpy.core.multiarray failed to import (auto-generated because you didn't call 'numpy.import_array()' after cimporting numpy; use '<void>numpy._import_array' to disable if you are certain you don't need it).

or

ValueError: numpy.dtype size changed, may indicate binary incompatibility.

Scope

Changes to pyproject.toml and/or setup.cfg may be required to enforce NumPy 2 compatibility. cibuildwheel-specific configuration may need modification in pyproject.toml, and Conda meta.yaml recipes may also need updating.

Implementation

As a temporary fix, numpy<2 constraint is put in place (ce3b26d).

However, builds against NumPy 2 should be backward-compatible with NumPy 1. For either cibuildwheel or conda build, numpy>=2 may be added as a build dependency (specified through pyproject.toml and Conda meta.yaml recipes). See this NumPy documentation section.

The meta-package dependency oldest-supported-numpy may no longer be needed once numpy>=1.25 (or numpy>=2) constraint is put in place. See this NumPy documentation section.

Additional context

See also the NumPy migration guide.

[FEAT] Add sky-to-Cartesian coordinate transformation to Python interface

Is the requested feature related to an issue?

No.

Summary

Add sky-to-Cartesian coordinate transformation functions.

Alternatives

N/A

Implementation

Can be implemented in the Python interface through existing astropy dependency.
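The geometric part of the transformation can be sketched as below, using the standard library only. It assumes that the radial (comoving) distance has already been obtained from the redshift for a chosen cosmology, a step that is not shown here and could be delegated to astropy. The function name is hypothetical.

```python
import math


def sky_to_cartesian(ra_deg, dec_deg, distance):
    """Convert right ascension and declination (in degrees) and a radial
    distance to Cartesian coordinates (x, y, z)."""
    ra = math.radians(ra_deg)
    dec = math.radians(dec_deg)
    x = distance * math.cos(dec) * math.cos(ra)
    y = distance * math.cos(dec) * math.sin(ra)
    z = distance * math.sin(dec)
    return x, y, z


print(sky_to_cartesian(45.0, 30.0, 1000.0))
```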

Additional context

This saves the user from creating separate catalogues with the required Cartesian coordinates.

[FEAT] Add CMake/Meson builds

Is the requested feature related to an issue?

No.

Summary

Consider supporting cross-platform build systems including CMake and Meson.

Alternatives

N/A

Implementation

Follow the documentation of CMake/Meson.

Additional context

N/A

[MAINT] Update GitHub Actions CI workflow to match the latest macOS runner image

Is the maintenance request a bug report or feature suggestion?

No.

Summary

The CI workflow is failing on GitHub Actions because of a change in the latest macOS runner image.

Scope

The workflow configuration for CI, and probably for CD and cross-platform CD as well, needs to change.

Implementation

Update compilation commands and flags, rerun actions and assess the outcome.

Additional context

This is only required before the next stable release to PyPI and Anaconda.

[FEAT] Offload FFTs to non-CUDA GPUs using HIP

Is the requested feature related to an issue?

This is the sequel to issue gh-17.

Summary

Offload FFT operations to non-CUDA GPUs using HIP/hipFFT.

Alternatives

N/A

Implementation

The HIP toolchain including hipFFT/rocFFT libraries is leveraged.

Additional context

HIP/hipFFT toolchain also allows the compilation of CUDA-compatible code to run on CUDA-capable GPUs.

[FEAT] Embed progress bar

Is the requested feature related to an issue?

No.

Summary

Embed a progress bar to track long-runtime functions.

Alternatives

N/A

Implementation

To estimate the progress, a task needs to be defined and a total task count needs to be provided. As an example, each task can be an FFT, and the total number of FFTs is the total task count.
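A minimal task-count tracker along these lines might look as follows. The class name and its interface are hypothetical, and each call to `update` stands in for one completed task such as an FFT.

```python
import sys


class ProgressTracker:
    """Track completed tasks against a known total and draw a text bar."""

    def __init__(self, total_tasks, stream=sys.stderr, width=30):
        if total_tasks <= 0:
            raise ValueError("total_tasks must be positive")
        self.total = total_tasks
        self.done = 0
        self.stream = stream
        self.width = width

    def update(self, ntasks=1):
        """Record completed tasks and redraw the bar; return the fraction done."""
        self.done = min(self.done + ntasks, self.total)
        frac = self.done / self.total
        filled = int(self.width * frac)
        bar = "#" * filled + "-" * (self.width - filled)
        self.stream.write(f"\r[{bar}] {100 * frac:5.1f}%")
        self.stream.flush()
        return frac


tracker = ProgressTracker(total_tasks=4)
fractions = [tracker.update() for _ in range(4)]  # one update per task
```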

Additional context

Given the short run time of two-point statistics, this is needed for three-point statistics only.

[FEAT] Allow 3PCF to be evaluated at bin centres

Is the requested feature related to an issue?

No.

Summary

The 3PCF statistics are evaluated at configuration-space coordinates rather than binned except for their shot noise components. Currently the coordinates are chosen to be bin averages based on the shot noise components which are binned from a mesh grid.

Where the shot noise components don't change significantly with binning, the raw 3PCF components can be evaluated at arbitrary input coordinates, of which bin centres are a valid choice in this feature suggestion.
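To illustrate how the two coordinate choices can differ, the sketch below compares a bin centre with a volume-weighted average separation over a spherical shell. The shell average is a continuum approximation adopted here for illustration; it is not the mesh-grid average that Triumvirate computes, and the function names are hypothetical.

```python
def bin_centre(r_lo, r_hi):
    """Midpoint of a separation bin."""
    return 0.5 * (r_lo + r_hi)


def shell_average(r_lo, r_hi):
    """Volume-weighted mean separation over the shell r_lo <= r < r_hi,
    i.e. the integral of r * r**2 divided by the integral of r**2."""
    return 0.75 * (r_hi**4 - r_lo**4) / (r_hi**3 - r_lo**3)


r_lo, r_hi = 10.0, 20.0  # illustrative bin edges
print(bin_centre(r_lo, r_hi), shell_average(r_lo, r_hi))
```

For wide bins the averaged coordinate is shifted noticeably towards the upper edge, which is one reason a count-based estimator quoted at bin centres may not line up with the current output without this option.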

Alternatives

N/A

Implementation

Not provided yet. This can be added through an additional parameter setting, but clarity to users should be considered carefully.

Additional context

When comparing with count-based configuration-space estimators, this helps with consistency between binning choices.

[FEAT] Offload (FFTs) to GPUs

Is the requested feature related to an issue?

No.

Summary

Offload portions of the C++ code to GPUs, namely the FFT operations.

Alternatives

N/A

Implementation

Possible routes for offloading:

  1. Directive-based OpenMP/OpenACC offloading;

  2. Library-based substitution.

Route 2 is preferred with cuFFTW or hipFFT/rocFFT libraries.


Possible routes for packaging:

  1. Auto-detect GPUs;

  2. Separate GPU builds and distributions.

Additional context

This makes use of GPUs on HPCs.
