alexander-held / pyhep-resources

This project forked from hsf-training/pyhep-resources

Python libraries of interest to particle physicists. This is meant for educational purposes.

Home Page: https://gitter.im/HSF/PyHEP

License: Creative Commons Attribution 4.0 International

Python Libraries of Interest to Particle Physics

Join the chat at https://gitter.im/HSF/PyHEP

Python libraries of interest to particle physicists. This is meant to be a living document. Therefore, if you have suggestions, click the edit button then make a pull request with your proposed change(s). This is meant to focus on Python resources for new-ish users; if you just want to look for different libraries for a particular purpose, please visit the Python section of the awesome-hep list.

You are more than welcome to join the HSF/PyHEP Gitter channel and contribute to the informal discussions there. The HSF/PyHEP-newcomers Gitter channel specifically targets, well, newcomers.

See the HSF PyHEP Working Group page for a full list of Gitter channels and resources.

New to Python

If you are new to Python, the following resources contain general information on using Python in science.

| Name | Use |
| --- | --- |
| Software Carpentry Python Lesson | Lesson aimed at people who have never used Python before. |
| Scipy tutorials | You'll want the beginner courses; the intermediate/advanced courses are actually quite advanced. Setup instructions are linked on the page, videos are here |
| Dive into Python 3 | Very useful for learning Python, though it's a bit old and doesn't cover any of the scientific Python stuff you really need. |
| Code academy | |
| Many EdX and Coursera courses | Often introductory CS courses which can teach other useful skills (algorithms and data structures) |
| The python docs | |
| Level Up Your Python | Advanced topics: debuggers, static typing, logging, decorators, generators and more as an interactive Jupyter book |

Otherwise, just search for "python" plus a description of your problem; the answer is usually on Stack Overflow.

YouTube channels with talks / tutorials:

PyCon, e.g.:

Some more advanced talks of interest:

Getting Python

| Name | Use |
| --- | --- |
| Conda | Anaconda packages most scientific Python libraries while also living purely in user space. Therefore, you don't need special permissions to set it up. Anaconda is a metapackage of 100 or so scientific Python packages for Conda. |
| pipenv | Very slick all-in-one combination of virtual environments and package installation; can manage Python installs too. |
| ripa | ripa solves the packaging issue by letting you install packages (or a requirements.txt), giving priority to conda channels and otherwise fetching from PyPI. (Works but rough) |

Scientific Python Stack

The packages that are used in physics and/or data science within Python grew somewhat organically before forming the current ecosystem. Detailed information on what exists can be found here, but the main packages are summarized below.

| Name | Use |
| --- | --- |
| jupyter notebook | One of the main ways of doing interactive and/or exploratory analysis. |
| numpy | Array and matrix operations (including math operations) at C speeds. |
| pandas | A very elegant way to work with tabular data (e.g. ntuples) with in-memory calculations. Especially good at time series. |
| xarray | Extension of pandas to N-dimensional structures. |
| h5py | Simple numpy to HDF5 bindings (backend for Keras saved models). |
| scipy | Various scientific routines like minimization. |
| matplotlib | Main Python plotting library. Start from the matplotlib gallery, then adapt to your application. |
| scikit-learn | Very easy to use machine learning routines with great examples. |

Visualisation:

| Name | Use |
| --- | --- |
| matplotlib | Main Python plotting library. Start from the matplotlib gallery, then adapt to your application. |
| seaborn | Easier to use plotting library with some statistical routines. Builds on matplotlib, but annoying to customize. |
| vegascope | View Vega/Vega-Lite plots in your web browser from local or remote Python processes. |

Machine learning:

| Name | Use |
| --- | --- |
| scikit-learn | Popular package. Very easy to use ML routines, with great examples. |
| tensorflow | By Google, for deep neural networks and more. |
| pytorch | Deep learning framework for fast, flexible experimentation with dynamic computational graphs. |
| keras | Higher level neural network interfaces. |

General information that may be useful is available through PyData talks (various conferences each year):

Data manipulation

| Name | Use |
| --- | --- |
| boost-histogram | Python bindings for the C++14 Boost::Histogram library. |
| hist | Analyst-friendly front-end for boost-histogram. |

Statistical analysis and fitting

| Name | Use |
| --- | --- |
| iminuit | Jupyter-friendly Python frontend to the MINUIT2 C++ library. |
| uncertainties | Calculations with numbers with uncertainties. It also yields the derivatives of any expression. |
| pyhf | Pure Python implementation of the HistFactory specification with auto-diff enabled backends in TensorFlow, PyTorch, and MXNet. |
| hepstats | HEP statistics tools and utilities. |
| zfit | Scalable pythonic fitting. |

Particle Physics packages

| Name | Use |
| --- | --- |
| DecayLanguage | Describe, manipulate and display particle decays. |
| hepstats | HEP statistics tools and utilities. |
| hepunits | Units and constants in the HEP system of units. |
| Particle | PDG particle data and identification codes. |
| numpythia | Interface between PYTHIA and NumPy. |
| pylhe | Interface to read Les Houches Event (LHE) files. |
| pyjet | Interface between FastJet and NumPy. |
| vector | Arrays of 2D, 3D, and Lorentz vectors. |

ROOT and interoperability with ROOT

For many particle physics experiments, a lot of data is stored in ROOT files. This means that, at the very least, one must be able to read ROOT files. ROOT also serves as a tool suite designed to solve many computational problems encountered in HEP, so one may want access to parts of that suite as well. The following packages are worth knowing for these situations:

| Package name | Use | Pro | Con | Further information |
| --- | --- | --- | --- | --- |
| ostap | User-friendly and more intuitive interface to (Py)ROOT | Many decorations to ROOT classes | Requires C++ code compilation | |
| uproot | Native Python ROOT I/O | Easy to install, fast, no dependence on C++ ROOT | Although it can read all ROOT files, it can only write ROOT files with specific objects. | |
| Conda-Forge ROOT | Using ROOT within Anaconda | Full-featured ROOT and PyROOT on Linux and macOS | | |
| PyROOT | Official ROOT Python bindings | Good support and many examples | Raw C++ wrapping results in weird Python code (improved in 6.22) | |
| rootpy | Pythonic ROOT access | More logical for people who know Python | Abandoned, mostly replaced by the new bindings in ROOT 6.22+ or the uproot family. | Repository |
| alphatwirl | Summarizing ROOT data into categorical data as pandas data frames | Small output size. Easy one-function interface with qtwirl | Not for data type conversion | |
| pyhf | Statistical analysis / fitting | Pure Python implementation of the HistFactory specification with auto-diff enabled backends in TensorFlow, PyTorch, and MXNet | Not yet interoperable with ROOT-based RooFit models | |

Jupyter extensions

Jupyter has a wide ecosystem of extensions that can be used to extend the functionality. Some useful extensions for HEP data analysis are summarised here.

| Name | Use |
| --- | --- |
| nbdime | Simplifies diffing and merging of Jupyter notebooks that are stored in version control. |
| jupytext | Splits notebooks into a .ipynb and a .py file for easier version control and to allow them to be run as scripts independently of Jupyter. |

Speeding up code

It is often no longer necessary to write C/C++ routines and wrap them, since there are other ways to speed up your Python code. Namely:

| Name | Use |
| --- | --- |
| numba | Tight loops are often the slow part of Python code; numba JIT-compiles them! |
| Pythran | Whole scripts |
| numpy | Expressing your code as array operations means you get native-C speeds per sub-expression. |
| jax | Allows compiling and running NumPy operations on accelerated hardware such as GPUs. Also offers a JIT compiler (less sophisticated than Numba), automatic analytic differentiation (gradient) of Python functions and efficient vectorization. Was developed with the aim of being able to develop efficient machine learning algorithms "from scratch" with NumPy-like code. |
| NumExpr | Single pass "mapper" operations (one input → one output). |
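
As a minimal sketch of what "expressing your code as array operations" means for the numpy entry above, the example below computes a transverse momentum twice: once in an explicit Python loop and once as whole-array operations. The function names `pt_loop` and `pt_vectorized` and the random input data are made up for illustration; they are not part of any package listed here.

```python
import math

import numpy as np


def pt_loop(px, py):
    """Transverse momentum computed element by element in a Python loop."""
    out = []
    for x, y in zip(px, py):
        out.append(math.sqrt(x * x + y * y))
    return out


def pt_vectorized(px, py):
    """Same calculation written as whole-array operations.

    Each sub-expression (px**2, py**2, the sum, the square root) runs
    as a compiled loop inside NumPy instead of in the interpreter.
    """
    return np.sqrt(px**2 + py**2)


# Hypothetical input: random momentum components for 10,000 particles.
rng = np.random.default_rng(seed=0)
px = rng.normal(size=10_000)
py = rng.normal(size=10_000)

pt_slow = pt_loop(px, py)
pt_fast = pt_vectorized(px, py)

# Both approaches should agree to floating-point precision.
print(np.allclose(pt_slow, pt_fast))
```

Timing the two functions (for example with `timeit`) is a simple way to check how much the array form gains on your own machine; the size of the gain depends on the array length and the operations involved.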

Binding C/C++ to Python

Before you read this, realize that this is for existing C++ code. If you want to write new C/C++ code for speed, see the section above.

Python entered the particle physics ecosystem because it was useful as a "glue language": it lets you get multiple pieces of software written in different languages to work with one another. Given the large ecosystem of Python packages that has grown up in the last decade, this is less common now. However, the situation still arises where you want to call some existing C/C++ code from Python.

Please be aware that wrapping C code is significantly easier than wrapping C++ code, because C++ compilers mangle function names when exporting them from libraries; the tools below can nevertheless make wrapping C++ easy as well.
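
To illustrate why plain C is the easy case, here is a minimal sketch that calls the C standard library's `strlen` through `ctypes` from the Python standard library. Note that `ctypes` is not one of the tools in the table below; it is used here only because it needs no build step. The sketch assumes a POSIX system (Linux or macOS) where the C library can be located or is already loaded into the process. A C function is exported under its plain name, whereas a C++ function such as `int add(int, int)` would typically be exported under a mangled name (for example `_Z3addii` with GCC or Clang).

```python
import ctypes
import ctypes.util

# Locate the C standard library. On POSIX systems, find_library may
# return None in minimal environments; CDLL(None) then falls back to the
# symbols already loaded into the current process, which include libc.
libc_path = ctypes.util.find_library("c")
libc = ctypes.CDLL(libc_path)

# C symbols keep their plain names, so "strlen" can be looked up directly.
# Declare the signature so ctypes converts arguments and results correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"pyhep")
print(length)  # 5
```

For C++ code there is no such direct lookup by source-level name, which is the gap that the binding tools listed below are designed to fill.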

At present, the best summary of how to bind code in HEP applications comes from Henry Schreiner in a 2018 PyHEP talk.

| Package name | Use | Pro | Con | Further information |
| --- | --- | --- | --- | --- |
| pybind11 | Wrapping existing C++ code | Small elegant package, simple build. | Young but gaining popularity quickly. | Henry's slides |
| Cython | Wrapping C++ code | Widely used, freely mixing Python and C++. | Mix of C and Python is a new language, incomplete coverage of C++. | |
| SWIG | Wrapping C++ code | Widely used, multiple languages. | Have to write a wrapper file, harder to customize, and development is slow/dated. | |
| Boost::Python | Wrapping C++ code | Widely used. | Giant dependency since Boost does many other things, uses "jam" to build. | |

Miscellaneous

Packages that do not easily fit in any of the above topics.

| Name | Use |
| --- | --- |
| comparxiv | Compare 2 versions of an arXiv preprint with a single command |

Tutorials

See tutorials here and other resources collected by IML HEP-ML Resources.

Experimental codes

Stealing code from other physicists is its own sign of flattery. Code that has been abandoned for more than two years will be struck through:

| Name | Collaboration | Use | Further information | Date added to list |
| --- | --- | --- | --- | --- |

pyhep-resources's People

Contributors

chrisburr, cranmer, dguest, eduardo-rodrigues, henryiii, jpivarski, klieret, meliache, reikdas, shantanu-gontia, taisakuma, tunnell
