jacanchaplais / heparchy
Hierarchical database storage and access for high energy physics event data.
Home Page: https://heparchy.readthedocs.io/
License: BSD 3-Clause "New" or "Revised" License
Combine or refactor reader and writer classes into a factory class, which is passed a mode, so can be used for reading or writing.
Either remove copy method, or create dataclass to be copied instead.
Use the same ABCs as HdfReader for HepMC parsing. If it is necessary to define a folder structure with additional toml files for metadata, leave the old method in place as well for externally generated HepMC data. Otherwise, replace the current parsing method entirely.
For particularly large LHE files, it may not always be feasible to load the whole file into memory.
Check whether lxml allows a file streaming approach. Other XML libraries allow this.
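A minimal sketch of the streaming idea, assuming the events sit in <event> elements under a single root. It uses the standard library's xml.etree.ElementTree.iterparse rather than lxml, so it is not heparchy's parser; lxml.etree.iterparse offers a similar interface. The helper name iter_event_blocks and the sample data are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def iter_event_blocks(source):
    """Yield the text of each <event> element, one at a time."""
    # "end" events fire once an element has been fully parsed.
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "event":
            yield elem.text or ""
            # Drop the element's content so it does not accumulate in memory.
            elem.clear()


# Tiny stand-in for an LHE file (hypothetical content, not real event data).
SAMPLE = b"""<LesHouchesEvents version="3.0">
<event>
first
</event>
<event>
second
</event>
</LesHouchesEvents>"""

blocks = [text.strip() for text in iter_event_blocks(io.BytesIO(SAMPLE))]
print(blocks)
```

Only one event's text is held at a time, though the cleared (empty) elements remain attached to the root until parsing finishes.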
Running pip install heparchy produces the following error on macOS version 12.3.1 with an Apple M1 Pro chip:
Building wheels for collected packages: pyhepmc-ng
Building wheel for pyhepmc-ng (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [18 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build/lib.macosx-10.9-x86_64-cpython-38
creating build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
copying src/pyhepmc_ng/_version.py -> build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
copying src/pyhepmc_ng/_io.py -> build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
copying src/pyhepmc_ng/__init__.py -> build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
copying src/pyhepmc_ng/view.py -> build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
running build_ext
building 'pyhepmc_ng._bindings' extension
creating build/temp.macosx-10.9-x86_64-cpython-38
creating build/temp.macosx-10.9-x86_64-cpython-38/extern
creating build/temp.macosx-10.9-x86_64-cpython-38/extern/HepMC3
creating build/temp.macosx-10.9-x86_64-cpython-38/extern/HepMC3/src
creating build/temp.macosx-10.9-x86_64-cpython-38/src
error: [Errno 2] No such file or directory: 'x86_64-apple-darwin13.4.0-clang'
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for pyhepmc-ng
Running setup.py clean for pyhepmc-ng
Failed to build pyhepmc-ng
Installing collected packages: pyhepmc-ng, h5py, heparchy
Running setup.py install for pyhepmc-ng ... error
error: subprocess-exited-with-error
× Running setup.py install for pyhepmc-ng did not run successfully.
│ exit code: 1
╰─> [20 lines of output]
running install
/Users/kieran/opt/anaconda3/envs/tree_extractor/lib/python3.8/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
running build
running build_py
creating build
creating build/lib.macosx-10.9-x86_64-cpython-38
creating build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
copying src/pyhepmc_ng/_version.py -> build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
copying src/pyhepmc_ng/_io.py -> build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
copying src/pyhepmc_ng/__init__.py -> build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
copying src/pyhepmc_ng/view.py -> build/lib.macosx-10.9-x86_64-cpython-38/pyhepmc_ng
running build_ext
building 'pyhepmc_ng._bindings' extension
creating build/temp.macosx-10.9-x86_64-cpython-38
creating build/temp.macosx-10.9-x86_64-cpython-38/extern
creating build/temp.macosx-10.9-x86_64-cpython-38/extern/HepMC3
creating build/temp.macosx-10.9-x86_64-cpython-38/extern/HepMC3/src
creating build/temp.macosx-10.9-x86_64-cpython-38/src
error: [Errno 2] No such file or directory: 'x86_64-apple-darwin13.4.0-clang'
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> pyhepmc-ng
note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
Versions
Python: 3.8.13
Further Information
pip install pyhepmc runs successfully. The authors of pyhepmc state "The pyhepmc-ng package is continued as the package pyhepmc. Please install pyhepmc, pyhepmc-ng is no longer updated." Perhaps this is the issue?
Move into graphicle.
Enable option for users to write out their data in HepMC files.
HDF5 performance degrades badly when large numbers of groups are stored at the same level, see h5py/h5py#1055. For datasets of ~1M events within a single process, calling h5py's iterators causes the code to hang.
Solution idea: store events in multiple subgroups within one process, each with a maximum length. Once the maximum length has been reached, create another.
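The subgroup idea can be sketched as a mapping from a flat event index to a nested group path. This is an illustration only: the function name event_path, the chunk_/event_ naming scheme, and the default limit are assumptions, not heparchy's layout.

```python
def event_path(event_index, max_per_group=10_000):
    """Return a nested HDF5-style path for a flat event index.

    Events are bucketed into subgroups holding at most `max_per_group`
    entries, so no single level has an unbounded number of children.
    """
    if event_index < 0:
        raise ValueError("event_index must be non-negative")
    chunk = event_index // max_per_group
    return f"chunk_{chunk:05d}/event_{event_index:09d}"


print(event_path(0))
print(event_path(10_000))
print(event_path(25, max_per_group=10))
```

With one million events and the default limit, this gives 100 subgroups of 10,000 events each, instead of one million sibling groups.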
Initial functionality to be moved from showerpipe (see jacanchaplais/showerpipe#5).
Further functionality could include:
Can do MPI integrated HDF5 writing. This is a bit fiddly, and the comms do slow the I/O considerably when scaled across many nodes.
To start, provide data files which are split up to reflect the parallel topology, and use HDF5 external links to bring the data together into a single file.
Following on from the off-topic comments in #15, it has been identified that removing the read_event method from ProcessReaderBase may introduce breaking changes without warning for users updating their versions. Re-insert the method, adding a deprecation notice, until the 1.0 release.
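A possible shape for the deprecation notice, using the standard warnings module. The class ProcessReaderSketch is a hypothetical stand-in for ProcessReaderBase, and forwarding to subscripting is an assumption about what the replacement would be.

```python
import warnings


class ProcessReaderSketch:
    """Hypothetical stand-in for ProcessReaderBase."""

    def __init__(self, events):
        self._events = list(events)

    def __getitem__(self, index):
        return self._events[index]

    def read_event(self, index):
        """Deprecated alias, kept so existing user code keeps working."""
        warnings.warn(
            "read_event() is deprecated and will be removed in 1.0; "
            "use subscripting instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
        return self[index]
```

Existing calls keep returning the same data, and users see the warning when DeprecationWarning is enabled (for example under pytest or with python -W default).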
From version heparchy==0.2b6, when trying to import it I get the following error:
ImportError: cannot import name 'cached_property' from 'functools'
Stack Overflow says that cached_property was only added in Python 3.8, so the import fails on earlier versions.
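One way to keep older interpreters working is a guarded import with a small fallback. This is a sketch, not heparchy's fix: the fallback class below is a simplified, non-thread-safe substitute for functools.cached_property, and Example is a hypothetical class used only to show the caching.

```python
try:
    from functools import cached_property  # available from Python 3.8
except ImportError:

    class cached_property:  # simplified fallback for older interpreters
        def __init__(self, func):
            self.func = func
            self.name = func.__name__
            self.__doc__ = func.__doc__

        def __get__(self, instance, owner=None):
            if instance is None:
                return self
            value = self.func(instance)
            # Store the result on the instance; later lookups find it in
            # the instance dict and never reach this descriptor again.
            instance.__dict__[self.name] = value
            return value


class Example:
    def __init__(self):
        self.calls = 0

    @cached_property
    def expensive(self):
        self.calls += 1
        return 42
```

The alternative is to declare python_requires=">=3.8" in the package metadata, so that pip refuses to install the release on unsupported interpreters.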
Enable the ability to append new processes and events to existing HDF5 files.
Currently mostly just looks up the data by the key, and doesn't verify that the correct method is being used to retrieve this kind of data.
Since the reader exposes data as properties to the user, it would be more consistent that masks and custom datasets would be exposed with a dictionary-like interface. This would also add semantic value to having them as separate methods, rather than just aliases of the same function.
Desired behaviour:
Replace .get_mask() with .masks, and .get_custom() with .custom. These should return dictionaries with keys as the dataset names, and values as their numpy array contents. Ideally, if subscripted directly, this should only read the subscripted dataset from file, rather than wastefully instantiating and populating the whole dictionary, only for it to be discarded.
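The lazy dictionary-like behaviour could be sketched with collections.abc.Mapping. The class LazyDatasets and the fake_loader callback are hypothetical; in a real reader the loader would read a single dataset from the HDF5 file, and plain lists stand in for numpy arrays here.

```python
from collections.abc import Mapping


class LazyDatasets(Mapping):
    """Read-only mapping that loads a dataset only when it is subscripted."""

    def __init__(self, names, loader):
        self._names = tuple(names)
        self._loader = loader  # called with a dataset name, returns its data

    def __getitem__(self, name):
        if name not in self._names:
            raise KeyError(name)
        return self._loader(name)

    def __iter__(self):
        return iter(self._names)  # iterating keys reads no data

    def __len__(self):
        return len(self._names)


loaded = []  # records which datasets were actually read


def fake_loader(name):
    loaded.append(name)
    return [True, False] if name == "final" else [False, True]


masks = LazyDatasets(["final", "hard"], fake_loader)
first = masks["final"]  # only "final" is read
```

Subscripting reads one dataset, while dict(masks) would still populate everything for users who want the whole dictionary.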
Writer currently overwrites existing data if there is a filename collision.
Add a mode='w'|'a' option. Future versions may unify the read and write classes into a file handler class which gives all modes, like more traditional IO in Python.
Should provide options for:
and perhaps should mirror options in the API, too.
Much more Pythonic and consistent with the event getters.
with HdfReader(data_path) as hep_file:
    for event in hep_file["default"]:
        ...
Put in a deprecation notice for the method, for it to be removed in release 1.0.