jacanchaplais / heparchy

Hierarchical database storage and access for high energy physics event data.

Home Page: https://heparchy.readthedocs.io/

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%
high-energy-physics database hdf5 event-generator pythia8 madgraph5 hepmc lhe

heparchy's People

Contributors

jacanchaplais

Forkers

kieran-mg1

heparchy's Issues

Make HepMC reader consistent with hdf reader

Use the same ABCs as HdfReader for HepMC parsing. If it is necessary to define a folder structure with additional TOML files for metadata, keep the old method as well, for externally generated HepMC data. Otherwise, replace the current parsing method entirely.

Error while trying to install package pyhepmc-ng

Running pip install heparchy produces the following error on macOS 12.3.1 with an Apple M1 Pro chip:

Building wheels for collected packages: pyhepmc-ng
Building wheel for pyhepmc-ng (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [18 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build/lib.macosx-10.9-x86_64-cpython-38
creating build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
copying src/pyhepmc_ng/_version.py -> build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
copying src/pyhepmc_ng/_io.py -> build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
copying src/pyhepmc_ng/__init__.py -> build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
copying src/pyhepmc_ng/view.py -> build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
running build_ext
building 'pyhepmc_ng._bindings' extension
creating build/temp.macosx-10.9-x86_64-cpython-38
creating build/temp.macosx-10.9-x86_64-cpython-38/extern
creating build/temp.macosx-10.9-x86_64-cpython-38/extern/HepMC3
creating build/temp.macosx-10.9-x86_64-cpython-38/extern/HepMC3/src
creating build/temp.macosx-10.9-x86_64-cpython-38/src
error: [Errno 2] No such file or directory: 'x86_64-apple-darwin13.4.0-clang'
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for pyhepmc-ng
Running setup.py clean for pyhepmc-ng
Failed to build pyhepmc-ng
Installing collected packages: pyhepmc-ng, h5py, heparchy
Running setup.py install for pyhepmc-ng ... error
error: subprocess-exited-with-error

× Running setup.py install for pyhepmc-ng did not run successfully.
│ exit code: 1
╰─> [20 lines of output]
running install
/Users/kieran/opt/anaconda3/envs/tree_extractor/lib/python3.8/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
running build
running build_py
creating build
creating build/lib.macosx-10.9-x86_64-cpython-38
creating build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
copying src/pyhepmc_ng/_version.py -> build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
copying src/pyhepmc_ng/_io.py -> build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
copying src/pyhepmc_ng/__init__.py -> build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
copying src/pyhepmc_ng/view.py -> build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
running build_ext
building 'pyhepmc_ng._bindings' extension
creating build/temp.macosx-10.9-x86_64-cpython-38
creating build/temp.macosx-10.9-x86_64-cpython-38/extern
creating build/temp.macosx-10.9-x86_64-cpython-38/extern/HepMC3
creating build/temp.macosx-10.9-x86_64-cpython-38/extern/HepMC3/src
creating build/temp.macosx-10.9-x86_64-cpython-38/src
error: [Errno 2] No such file or directory: 'x86_64-apple-darwin13.4.0-clang'
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> pyhepmc-ng

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

Versions
Python: 3.8.13

Further Information
pip install pyhepmc runs successfully. The authors of pyhepmc state: "The pyhepmc-ng package is continued as the package pyhepmc. Please install pyhepmc, pyhepmc-ng is no longer updated." Perhaps this is the issue?

Speed improvement by grouping events

HDF5 performance suffers badly when large numbers of groups are stored at the same level; see h5py/h5py#1055. For datasets of ~1M events within a single process, calling h5py's iterators causes the code to hang.

Solution idea: store events in multiple subgroups within one process, each with a maximum length. Once the maximum length has been reached, create another.
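
One way to picture the subgroup idea is as a mapping from an event's index to a nested path, so that no single level holds more than a fixed number of entries. The sketch below is an illustration only: the function name event_path, the chunk_/event_ naming scheme, and the default limit of 10,000 are all assumptions, not heparchy's actual layout.

```python
def event_path(index: int, max_per_group: int = 10_000) -> str:
    """Return a nested HDF5-style path that caps the entries per subgroup.

    Hypothetical naming scheme: events are bucketed into 'chunk_NNNNNN'
    subgroups, each holding at most ``max_per_group`` events.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    if max_per_group < 1:
        raise ValueError("max_per_group must be at least 1")
    # Integer division picks the bucket; a new bucket starts once one is full.
    chunk = index // max_per_group
    return f"chunk_{chunk:06d}/event_{index:09d}"


# With a limit of 3, seven events are spread over three subgroups.
paths = [event_path(i, max_per_group=3) for i in range(7)]
```

Because h5py creates intermediate groups when given a nested path, a path of this shape could be passed directly when creating each event's group, and iteration would then walk a small number of chunks rather than one very wide level.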

Add interface for distributed reading / writing

HDF5 supports MPI-integrated writing. This is a bit fiddly, and the communication overhead slows the I/O considerably when scaled across many nodes.

To start, provide data files which are split up to reflect the parallel topology, and use HDF5 external links to bring the data together into a single file.
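
A rough sketch of the external-link approach, assuming one data file per MPI rank. The helper names rank_file_plan and link_rank_files, and the rank_NNNN / stem_NNNN.hdf5 naming, are hypothetical; h5py.ExternalLink is h5py's own class for links that point into another file. h5py is imported inside the linking function so that the naming plan can be inspected without it installed.

```python
from typing import List, Tuple


def rank_file_plan(stem: str, num_ranks: int) -> List[Tuple[str, str]]:
    """Pair a link name with a per-rank file name (hypothetical scheme)."""
    if num_ranks < 1:
        raise ValueError("num_ranks must be at least 1")
    return [
        (f"rank_{rank:04d}", f"{stem}_{rank:04d}.hdf5")
        for rank in range(num_ranks)
    ]


def link_rank_files(master_path: str, stem: str, num_ranks: int) -> None:
    """Create a master file whose top-level entries link to per-rank files."""
    import h5py  # lazy import: only needed when the links are written

    with h5py.File(master_path, "w") as master:
        for link_name, file_name in rank_file_plan(stem, num_ranks):
            # Each entry points at the root group of the per-rank file.
            master[link_name] = h5py.ExternalLink(file_name, "/")
```

Each rank would write its own file independently, and a single process would call link_rank_files afterwards, so no inter-node communication is needed during the write itself.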

Include deprecation notice of read_event method

Following on from the off-topic comments in #15, it has been identified that removing the read_event method from ProcessReaderBase may introduce breaking changes without warning for users updating their versions. Re-insert the method, adding a deprecation notice, until the 1.0 release.
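
A minimal sketch of how such a deprecation notice could be attached, using the standard warnings module. The deprecated decorator and the ProcessReaderSketch class are stand-ins for illustration, not heparchy's code, and the suggested replacement (subscripting the reader) is an assumption.

```python
import functools
import warnings


def deprecated(replacement: str, removed_in: str = "1.0"):
    """Wrap a method so that calling it emits a DeprecationWarning."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated and will be removed in "
                f"release {removed_in}; use {replacement} instead.",
                DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller's line
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


class ProcessReaderSketch:
    """Hypothetical stand-in for ProcessReaderBase."""

    def __init__(self, events):
        self._events = list(events)

    def __getitem__(self, index):
        return self._events[index]

    @deprecated(replacement="subscripting, e.g. reader[0]")
    def read_event(self, index):
        # Old entry point kept as a thin alias of the new one.
        return self[index]
```

Existing calls to read_event keep working, and users see the warning when DeprecationWarning is enabled in their environment.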

Incompatibility with python 3.7

From version heparchy==0.2b6, when trying to import it I get the following error:
ImportError: cannot import name 'cached_property' from 'functools'

According to Stack Overflow, cached_property was added in Python 3.8, so it is not available in Python 3.7.
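
One possible workaround, sketched here under the assumption that Python 3.7 support is wanted, is to fall back to a small local descriptor when the import fails. The Event class and its pmu attribute are hypothetical and only demonstrate the caching behaviour.

```python
try:
    from functools import cached_property  # available from Python 3.8
except ImportError:
    class cached_property:
        """Simplified fallback: compute once, then store on the instance."""

        def __init__(self, func):
            self.func = func
            self.attrname = None
            self.__doc__ = func.__doc__

        def __set_name__(self, owner, name):
            self.attrname = name

        def __get__(self, instance, owner=None):
            if instance is None:
                return self
            value = self.func(instance)
            # Later lookups find this entry in __dict__ and skip __get__.
            instance.__dict__[self.attrname] = value
            return value


class Event:
    """Hypothetical class used only to demonstrate the caching."""

    def __init__(self):
        self.reads = 0

    @cached_property
    def pmu(self):
        self.reads += 1  # counts how often the expensive read runs
        return [0.0, 0.0, 0.0, 0.0]
```

The alternative is to declare Python 3.8 as the minimum supported version in the package metadata, so that pip refuses to install the newer releases on 3.7.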

Replace get_mask and get_custom methods with property interface

Since the reader exposes data as properties to the user, it would be more consistent to expose masks and custom datasets through a dictionary-like interface. This would also give semantic value to keeping them separate, rather than having them be aliases of the same function.

Desired behaviour:
Replace .get_mask() with .masks, and .get_custom() with .custom. These should return dictionaries with keys as the dataset names, and values as their numpy array contents. Ideally, if subscripted directly, this should only read the subscripted dataset from file, rather than wastefully instantiating and populating the whole dictionary, only for it to be discarded.
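
The lazy behaviour described above can be sketched with collections.abc.Mapping: the object knows the dataset names up front, but only calls the loader for the key that is subscripted. LazyDatasetMap and fake_load are hypothetical names, and the loader returns a plain list in place of a numpy array read from file.

```python
from collections.abc import Mapping
from typing import Callable, Iterable, List


class LazyDatasetMap(Mapping):
    """Read-only mapping that loads a dataset only when it is subscripted."""

    def __init__(self, names: Iterable[str], load: Callable[[str], object]):
        self._names = tuple(names)
        self._load = load

    def __getitem__(self, name):
        if name not in self._names:
            raise KeyError(name)
        return self._load(name)  # the only place data is actually read

    def __contains__(self, name):
        # Overridden so membership tests do not trigger a load.
        return name in self._names

    def __iter__(self):
        return iter(self._names)

    def __len__(self):
        return len(self._names)


loaded: List[str] = []


def fake_load(name: str):
    """Stand-in for reading one dataset from file; records each access."""
    loaded.append(name)
    return [True, False]


masks = LazyDatasetMap(["final", "hard"], fake_load)
final_mask = masks["final"]  # reads only the "final" dataset
```

Calling dict(masks) would still materialise everything for users who want the full dictionary, so both access patterns are covered by one object.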

Add append mode to hdf writer class

The writer currently overwrites existing data if there is a filename collision.

Add a mode='w'|'a' option. Future versions may unify the read and write classes into a file handler class which provides all modes, like more traditional IO in Python.
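
A minimal sketch of what the option might look like, assuming the mode string is validated in the constructor and then passed straight through to h5py.File, where 'w' truncates an existing file and 'a' opens it for read/write, creating it if absent. HdfWriterSketch is a hypothetical stand-in, not the real writer class.

```python
class HdfWriterSketch:
    """Hypothetical writer that accepts an overwrite or append mode."""

    _MODES = ("w", "a")

    def __init__(self, path: str, mode: str = "w"):
        if mode not in self._MODES:
            raise ValueError(f"mode must be one of {self._MODES}, got {mode!r}")
        self.path = path
        self.mode = mode
        self._file = None

    def __enter__(self):
        import h5py  # lazy import: only needed once the file is opened

        # 'w' truncates any existing file; 'a' keeps existing content.
        self._file = h5py.File(self.path, self.mode)
        return self

    def __exit__(self, exc_type, exc, traceback):
        if self._file is not None:
            self._file.close()
            self._file = None


writer = HdfWriterSketch("events.hdf5", mode="a")
```

Keeping 'w' as the default preserves the current overwrite behaviour for existing callers.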

Add yaml configuration

Should provide options for:

  • compression type
  • which data to store

and perhaps should mirror options in the API, too.

Replace .read_process with string subscripting

Much more Pythonic and consistent with the event getters.

with HdfReader(data_path) as hep_file:
    for event in hep_file["default"]:
        ...

Put in a deprecation notice for the method, for it to be removed in release 1.0.
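
As a rough illustration of the proposal, the reader could implement __getitem__ for process names and keep read_process as a warning alias. ReaderSketch below is a hypothetical stand-in backed by an in-memory dictionary rather than an HDF5 file, so the event values are just placeholder strings.

```python
import warnings
from typing import Dict, List


class ReaderSketch:
    """Hypothetical stand-in for HdfReader, backed by a plain dictionary."""

    def __init__(self, processes: Dict[str, List[str]]):
        self._processes = processes

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, traceback):
        return None  # nothing to close in this in-memory sketch

    def __getitem__(self, name: str) -> List[str]:
        # New interface: subscript the reader with the process name.
        return self._processes[name]

    def read_process(self, name: str) -> List[str]:
        warnings.warn(
            "read_process is deprecated and will be removed in release 1.0; "
            "subscript the reader with the process name instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self[name]


with ReaderSketch({"default": ["event_0", "event_1"]}) as hep_file:
    names = [event for event in hep_file["default"]]
```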
