trex-coe / trexio

TREX I/O library

Home Page: https://trex-coe.github.io/trexio/

License: BSD 3-Clause "New" or "Revised" License

Makefile 2.56% C 41.04% Shell 5.77% Python 21.54% Fortran 4.51% Emacs Lisp 0.97% M4 3.84% SWIG 1.71% CMake 2.97% Scheme 0.64% OCaml 0.64% Rust 13.81%
quantum-chemistry library wave-function

trexio's Introduction

TREXIO

TREXIO is an open-source file format and library developed for the storage and manipulation of data produced by quantum chemistry calculations. It is designed with the goal of providing a reliable and efficient method of storing and exchanging wave function parameters and matrix elements. The library consists of a front-end implemented in the C programming language and two different back-ends: a text back-end and a binary back-end utilizing the HDF5 library which enables fast read and write operations. It is compatible with a variety of platforms and has interfaces for the Fortran, Python, OCaml and Rust programming languages.

Installation

Installation using a package manager

Conda

The official releases of TREXIO >2.0.0 are also available via the conda-forge channel. The pre-compiled stable binaries of trexio can be installed as follows:

conda install -c conda-forge trexio

More details can be found in the corresponding trexio-feedstock. Note that both parallel (see mpi_openmpi prefix) and serial (nompi) variants are provided.

Spack

The official releases >=2.0.0 and the development version of TREXIO can be installed using the Spack package manager. The trexio/package.py file contains the Spack specifications required to build different variants of the trexio library. It can be installed as follows:

spack install --jobs `getconf _NPROCESSORS_ONLN` trexio

Guix

The official releases of TREXIO >=2.0.0 can be installed using the GNU Guix functional package manager. The trexio.scm Scheme file contains the manifest specification for the trexio package. It can be installed as follows:

guix package --cores=`getconf _NPROCESSORS_ONLN` --install-from-file=trexio.scm

Debian/Ubuntu

The official release of TREXIO 2.2.0 is available as a Debian (.deb) package thanks to the Debichem Team. The source code is hosted here and the pre-built binary files are available via the Debian package registry.

TREXIO is also available on Ubuntu 23.04 (Lunar Lobster) and newer and can be installed as follows:

sudo apt-get update && sudo apt-get install libtrexio-dev

Installation from source

Minimal requirements (for users):

  • Autotools (autoconf >= 2.69, automake >= 1.11, libtool >= 2.2) or CMake (>= 3.16)
  • C compiler (gcc/icc/clang)
  • Fortran compiler (gfortran/ifort)
  • HDF5 library (>= 1.8) [optional, recommended for high performance]

Recommended: Installation from the release tarball

  1. Download the trexio-<version>.tar.gz file from the GitHub release page
  2. gzip -cd trexio-<version>.tar.gz | tar xvf -
  3. cd trexio-<version>
  4. ./configure
  5. make -j 4
  6. make -j 4 check
  7. sudo make install

If sudo access is unavailable, make install/uninstall can be run without superuser privileges by passing an installation prefix inside the user's home directory to the configure script, for instance ./configure --prefix=$HOME/.local. The $HOME/.local directory is a common choice for user-space installations, but any writable directory can serve as the prefix.

Regarding the integration with an MPI (Message Passing Interface) enabled HDF5 library, it's typical to specify the MPI compiler wrapper for the C compiler. This is done by appending a directive like CC=mpicc to the ./configure command. However, as TREXIO does not utilize MPI features, it is advisable to link against a non-MPI (serial) version of the HDF5 library for the sake of simplicity.

Compilation without the HDF5 library

By default, the configuration step searches for the HDF5 library. This search can be disabled if HDF5 is not present or cannot be installed on the user machine. To build TREXIO without the HDF5 back end, append the --without-hdf5 option to the configure script or the -DENABLE_HDF5=OFF option to cmake. For example,

  • ./configure --without-hdf5
  • cmake -S. -Bbuild -DENABLE_HDF5=OFF

For TREXIO developers: from the GitHub repo clone

Additional requirements:

  • Python3 (>= 3.6)
  • Emacs (>= 26.0)
  • SWIG (>= 4.0) [required for the Python API]

Note: The source code is auto-generated from the Emacs org-mode (.org) files following the literate programming approach. This is why the src directory is initially empty.

  1. git clone https://github.com/TREX-CoE/trexio.git
  2. cd trexio
  3. ./autogen.sh
  4. ./configure
  5. make -j 4
  6. make -j 4 check
  7. sudo make install

Using CMake instead of Autotools

The aforementioned instructions rely on the Autotools build system. CMake users can achieve the same with the following steps (an example of an out-of-source build):

  1. cmake -S. -Bbuild
  2. cd build
  3. make -j 4
  4. ctest -j 4
  5. sudo make install

Note: on systems with no sudo access, one can add -DCMAKE_INSTALL_PREFIX=build as an argument to the cmake command so that make install/uninstall can be run without sudo privileges.

Note: when linking against an MPI-enabled HDF5 library one usually has to specify the MPI wrapper for the C compiler by adding, e.g., -DCMAKE_C_COMPILER=mpicc to the cmake command.

Using TREXIO

Naming convention

The primary TREXIO API is composed of the following functions:

  • trexio_open
  • trexio_write_[group]_[variable]
  • trexio_read_[group]_[variable]
  • trexio_has_[group]_[variable]
  • trexio_close

where [group] and [variable] substitutions correspond to the contents of the trex.json configuration file (for more details, see the corresponding documentation page). For example, consider the coord variable (array), which belongs to the nucleus group. The TREXIO user can write or read it using trexio_write_nucleus_coord or trexio_read_nucleus_coord functions, respectively.

Note: the [variable] names have to be unique only within the corresponding parent [group]. There is no naming conflict when, for example, num variable exists both in the nucleus group (i.e. the number of nuclei) and in the mo group (i.e. the number of molecular orbitals). These quantities can be accessed using the corresponding trexio_[has|read|write]_nucleus_num and trexio_[has|read|write]_mo_num, respectively.
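
The naming pattern can be made concrete with a short sketch. The helper below is hypothetical and not part of TREXIO; it only composes the function names that the convention above implies for a given [group] and [variable], assuming the trexio_ prefix of the C API.

```python
def api_function_names(group, variable, prefix="trexio_"):
    """Compose the has/read/write function names for one [group]_[variable].

    Hypothetical helper for illustration; TREXIO itself does not provide it.
    """
    return {op: f"{prefix}{op}_{group}_{variable}" for op in ("has", "read", "write")}


# The same variable name in two different groups yields distinct function names.
print(api_function_names("nucleus", "num")["read"])
print(api_function_names("mo", "num")["read"])
```

Because the group name is always part of the function name, the uniqueness requirement applies only within a group.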

Tutorial

TREXIO tutorials in Jupyter notebook format can be found in the corresponding GitHub repository or on Binder.

For example, the tutorial covering TREXIO basics using benzene molecule as an example can be viewed and executed online by clicking on this badge: Binder

Documentation

Documentation generated from TREXIO org-mode files.

Linking to your program

The make install command takes care of installing the TREXIO shared library on the user machine. After installation, append -ltrexio to the list of compiler ($LIBS) options.

In some cases (e.g. when using custom installation prefix during configuration), the TREXIO library might end up installed in a directory, which is absent in the default $LD_LIBRARY_PATH. In order to link the program against TREXIO, the search path can be modified as follows:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<path_to_trexio>/lib

where the <path_to_trexio> has to be replaced by the prefix used during the installation.

If your project relies on the CMake build system, feel free to use the FindTREXIO.cmake module to find and link the TREXIO library automatically.

In Fortran applications, make sure that the trexio_f.f90 module file is included in the source tree. You might have to manually copy it into your program source directory. The trexio_f.f90 module file can be found in the include/ directory of the TREXIO source code distribution.

Note: there is no need to include trexio.h header file during compilation of Fortran programs. Only the installed library and the Fortran module file are required.

Distributing TREXIO with your code

The TREXIO software is distributed under the 3-clause BSD license, renowned for its permissiveness. Consequently, it is entirely acceptable for you to provide the TREXIO release tarball in conjunction with your own code. Should you opt to include TREXIO with your software, it is recommended to distribute the release tarball, instead of the content of the git repository. The release tarballs contain pre-generated source files. This not only accelerates the compilation process but also significantly reduces dependency requirements.

APIs for other languages

Python

For more details regarding the installation and usage of the TREXIO Python API, see this page.

The aforementioned instructions are adapted for users installing from the source code distribution (periodically updated). To install the Python API with the latest changes, follow the developer installation guide and then run the following command:

make python-install

Note: this implies that SWIG is installed and available.

We rely on the pytest package for unit testing. It can be installed via pip install pytest. To test the installation, run

make python-test

We highly recommend using virtual environments to avoid compatibility issues and to improve reproducibility.

Rust

The Rust API is available on Crates.io, so you can simply run

cargo add trexio

in your Rust project.

If you prefer to install the Rust API provided with this repository:

cargo add --path /path/to/trexio/rust/trexio

OCaml

The TREXIO OCaml API is available in OPAM:

opam install trexio

If you prefer to install it from this repository,

cd ocaml/trexio
make
opam install .

Citation

The journal article reference describing TREXIO can be cited as follows:

@article{10.1063/5.0148161,
    author = {Posenitskiy, Evgeny and Chilkuri, Vijay Gopal and Ammar, Abdallah and Hapka, Michał and Pernal, Katarzyna and Shinde, Ravindra and Landinez Borda, Edgar Josué and Filippi, Claudia and Nakano, Kosuke and Kohulák, Otto and Sorella, Sandro and de Oliveira Castro, Pablo and Jalby, William and Ríos, Pablo López and Alavi, Ali and Scemama, Anthony},
    title = "{TREXIO: A file format and library for quantum chemistry}",
    journal = {The Journal of Chemical Physics},
    volume = {158},
    number = {17},
    year = {2023},
    month = {05},
    issn = {0021-9606},
    doi = {10.1063/5.0148161},
    url = {https://doi.org/10.1063/5.0148161},
    note = {174801},
    eprint = {https://pubs.aip.org/aip/jcp/article-pdf/doi/10.1063/5.0148161/17355866/174801\_1\_5.0148161.pdf},
}

Journal paper: doi

ArXiv paper: arXiv

Miscellaneous

The code should be compliant with the C99 CERT C coding standard. This can be checked with the cppcheck tool.

If you loaded an HDF5 module and the configure script can't find the HDF5 library, it is probably because the path to the HDF5 library is missing from your $LIBRARY_PATH variable. It happens that when building the HDF5 modules, the system administrators only append the path to the libraries to the $LD_LIBRARY_PATH variable, but forget to append it also to $LIBRARY_PATH, which is required for linking. A simple workaround for the user is to do

export LIBRARY_PATH=$LD_LIBRARY_PATH

before running configure, but it is preferable to inform the system administrators of the problem.


The TREX: Targeting Real Chemical Accuracy at the Exascale project has received funding from the European Union’s Horizon 2020 - Research and Innovation program - under grant agreement no. 952165. The content of this document does not represent the opinion of the European Union, and the European Union is not responsible for any use that might be made of such content.

trexio's People

Contributors

abdammar, addman2, joguenzl, kousuke-nakano, plopezrios, q-posev, scemama, stefabat

trexio's Issues

bugged when closing file [text]

example:

  1. writing nucleus_num and nucleus_charge without nucleus_coord leads to an error upon closing;
  2. the same happens if only nucleus_num is written.

Guards have to be introduced in the text_flush_group set of functions.

Versioning of python packages

Please do not forget to update the Python API version when you change it in configure/CMakeLists

I really think that having two different version numbers for C/Fortran and Python is extremely confusing. If we tell users to get version 2.3.2 of the library, they will automatically do pip install trexio-2.3.2 and will get it wrong. It adds an extra layer of complexity.

As the Python package is distributed with the C library in a single repo, I think we should really have a common version number.

Still reachable memory reported by Valgrind

Valgrind sometimes reports still reachable 1,864 bytes in 3 blocks on programs using the TREXIO_HDF5 back end. In fact, even a simple open+close program using only the HDF5 library (see below) can be used to debug it:

 #include "hdf5.h"
 int main(void){
         hid_t file = H5Fcreate("test.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
         herr_t status = H5Fclose(file);
         return 0;
 }

This is an HDF5-related issue (see this discussion on the HDF5 forum). Recompiling the HDF5 library with the --enable-using-memchecker configure option should fix it.

sanity check of dimensions in the front end

Guards have to be introduced for the case where the user writes or reads an array whose dimensions are not consistent with the dimensions stored in the file.
Example of the current vulnerability:
int num = 6;
write_nucleus_num(file, num);
double charges[2] = {1., 1.};
write_nucleus_charge(file, charges);
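
A minimal sketch of such a guard, written in Python for brevity. The FakeFile class and its methods are hypothetical stand-ins for the front end, not TREXIO code; the point is only that writing a dimensioned array compares the array length with the dimensioning attribute already stored in the file.

```python
class FakeFile:
    """Hypothetical in-memory stand-in for a TREXIO file (illustration only)."""

    def __init__(self):
        self.data = {}

    def write_nucleus_num(self, num):
        if num <= 0:
            raise ValueError("nucleus_num must be positive")
        self.data["nucleus_num"] = num

    def write_nucleus_charge(self, charges):
        # Guard 1: the dimensioning attribute must be written before the array.
        if "nucleus_num" not in self.data:
            raise ValueError("nucleus_num must be written before nucleus_charge")
        # Guard 2: the array length must match the stored dimension.
        expected = self.data["nucleus_num"]
        if len(charges) != expected:
            raise ValueError(
                f"nucleus_charge has {len(charges)} elements, expected {expected}"
            )
        self.data["nucleus_charge"] = list(charges)


f = FakeFile()
f.write_nucleus_num(6)
try:
    f.write_nucleus_charge([1.0, 1.0])  # same mismatch as in the example above
except ValueError as err:
    print("rejected:", err)
```

In C the front end cannot infer the length of an array from a raw pointer, so an equivalent guard would presumably need the caller to pass the array size explicitly.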

Compilation : User side improvement

I found a potential issue which can be added as a FAQ.


  • Issue: Compilation fails with the following message

make: *** No rule to make target 'include/trexio_f.f90', needed by 'tests/trexio_f.f90'. Stop.

  • Steps to reproduce

Clone the repository
git clone https://github.com/TREX-CoE/trexio
configure
./autogen -i;./configure
Issue make command
make


I know that if one is a developer and follows the README, things are OK.
However, if one is following the above steps, it would be nice to have an FAQ or a work around for this.

Loss of precision in Python

I wrote the nuclear repulsion in a TREXIO file. The nuclear repulsion computed by quantum package is

* Nuclear repulsion energy                           9.189533758614488    

I store it in a TREXIO file, and when I read it back in Python I get:

trexio.read_nucleus_repulsion(f)
9.189534187316895

The value given by Python is the same with both the text and the HDF5 back-ends.

When I inspect the text files produced by the text backend, I can see that the value of the nuclear repulsion is correct:

nucleus_repulsion   9.1895337586144876e+00                                                   

So I believe there is a conversion from float64 to float32 in the python interface after reading, but I could not figure out where it comes from...

It seems that I get the problem only for scalars. Arrays seem to be OK.
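
The read-back value is consistent with a round trip through single precision, which can be checked without TREXIO. The sketch below uses only the Python standard library and says nothing about where such a conversion would happen in the interface.

```python
import struct


def through_float32(x):
    """Round a Python float (float64) to float32 and convert it back."""
    return struct.unpack("f", struct.pack("f", x))[0]


stored = 9.189533758614488
# Compare the printed value with the one returned by the Python API above.
print(through_float32(stored))
```

If the two numbers agree, the loss happens at a float64 to float32 conversion somewhere on the scalar read path, not in the file itself.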

Add return code to open

If we want to return an exit code like all other functions, the first argument (file) would need to be of type trexio_t**, which is quite inconvenient for users.

So it is probably better to add the return code as a last argument of this function, passed by address.

ecp_local_coef does not have L=0 data for Z>1

@q-posev @scemama
I was using the python-API version to generate .hdf5 file from a gamess calculation. After looking at the h5dump output, it looks like only L=1 data of ECP (local component) is stored (Z>1 elements). In the case of hydrogen, it is stored properly.

Symmetry in indices of sparse data

We could simply give the list of equivalent indices obtained by permutations. This would imply the automatic creation of a variable nperm containing the number of equivalent permutations and perm, the list of permutations.
The default is nperm=1 and perm=[(1,2,3,4)] (for 4 index data). We can encode the potential sign change in the sign of the first element.

For example, for two-e integrals:

int32_t perm[8][4] = {
  1, 2, 3, 4, 
  3, 2, 1, 4, 
  1, 4, 3, 2, 
  3, 4, 1, 2,
  2, 1, 4, 3,                                                                                                                
  2, 3, 4, 1, 
  4, 1, 2, 3, 
  4, 3, 2, 1 };

trexio_write_ao_2e_int_eri_perm(trexio_file, 8, perm);

We could also store the most common predefined values:

int32_t trexio_4_index_1_perm[1][4] = {
  1, 2, 3, 4 };

int32_t trexio_6_index_1_perm[1][6] = {
  1, 2, 3, 4, 5, 6 };

int32_t trexio_4_index_4_perm_real[4][4] = {
  1, 2, 3, 4, 
  3, 2, 1, 4, 
  1, 4, 3, 2, 
  3, 4, 1, 2 };

int32_t trexio_4_index_4_perm_cplx[4][4] = {
  1, 2, 3, 4, 
  -3, 2, 1, 4, 
  -1, 4, 3, 2, 
  3, 4, 1, 2 };

int32_t trexio_4_index_8_perm_real[8][4] = {
  1, 2, 3, 4, 
  3, 2, 1, 4, 
  1, 4, 3, 2, 
  3, 4, 1, 2,
  2, 1, 4, 3,                                                                                                                
  2, 3, 4, 1, 
  4, 1, 2, 3, 
  4, 3, 2, 1 };

int32_t trexio_4_index_8_perm_cplx[8][4] = {
  1, 2, 3, 4, 
  -3, 2, 1, 4, 
  -1, 4, 3, 2, 
  3, 4, 1, 2,
  2, 1, 4, 3,                                                                                                                
  -2, 3, 4, 1, 
  -4, 1, 2, 3, 
  4, 3, 2, 1 };
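
How such a permutation table could be consumed is sketched below in Python. The convention is an assumption made for illustration: entry p at position n selects the |p|-th index of the original tuple (1-based), and a negative first entry is returned as a sign of -1, whose meaning (for instance complex conjugation) is left to the caller. The function name is hypothetical.

```python
# Same table as trexio_4_index_8_perm_real above.
PERM_8_REAL = [
    (1, 2, 3, 4), (3, 2, 1, 4), (1, 4, 3, 2), (3, 4, 1, 2),
    (2, 1, 4, 3), (2, 3, 4, 1), (4, 1, 2, 3), (4, 3, 2, 1),
]


def equivalent_indices(indices, perms):
    """Return (sign, permuted_indices) for every permutation in perms.

    Assumed convention: abs(p) is a 1-based position in `indices`;
    the sign of the first entry of a permutation carries the sign flag.
    """
    result = []
    for perm in perms:
        sign = -1 if perm[0] < 0 else 1
        permuted = tuple(indices[abs(p) - 1] for p in perm)
        result.append((sign, permuted))
    return result


for sign, idx in equivalent_indices((10, 20, 30, 40), PERM_8_REAL):
    print(sign, idx)
```

With repeated indices, several permutations map to the same tuple, so a reader would need to deduplicate before expanding stored values.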

Write numerical attributes that are not positive or float

So far, numerical attributes have been treated as dimensioning variables and thus could not be negative or float.

This is consistent with the current trex.json configuration but may change in the future. So additional flexibility has to be added to the API.

Add the ability to read arbitrary hyper-rectangles of dense arrays

For large arrays, it is often impossible to "read in replicated" or "read in on root and scatter" due to memory constraints. Therefore, it is highly desirable to read our data directly into distributed data structures (e.g. block-cyclic, etc). This can be done without parallel IO (although it's drastically improved by it) given the ability to read arbitrary blocks (hyper-rectangles) of data from the calling program.
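
What reading a hyper-rectangle means for a row-major dense array is sketched below in pure Python on an in-memory list. The function name and the (offset, count) parameterization are assumptions for illustration, not an existing TREXIO interface.

```python
from itertools import product


def read_block(flat, dims, offset, count):
    """Extract the block [offset, offset + count) along every axis.

    `flat` holds a row-major array of shape `dims`; the block is
    returned as a flat row-major list of shape `count`.
    """
    if not (len(dims) == len(offset) == len(count)):
        raise ValueError("dims, offset and count must have the same rank")
    for d, o, c in zip(dims, offset, count):
        if o < 0 or c < 0 or o + c > d:
            raise ValueError("requested block exceeds the array bounds")

    # Row-major strides: the last axis is contiguous.
    strides = [1] * len(dims)
    for axis in range(len(dims) - 2, -1, -1):
        strides[axis] = strides[axis + 1] * dims[axis + 1]

    ranges = [range(o, o + c) for o, c in zip(offset, count)]
    return [
        flat[sum(i * s for i, s in zip(index, strides))]
        for index in product(*ranges)
    ]


matrix = list(range(12))  # a 3 x 4 array stored row by row
print(read_block(matrix, (3, 4), (1, 1), (2, 2)))
```

Each process of a distributed program could call such a routine with its own offset and count, so that no process ever holds the full array.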

decouple read/write from the corresponding group

In the current implementation, when reading/writing a dataset, its dimensions are obtained by scanning the file for the corresponding attribute (e.g. nucleus_num). This is ok for groups like nucleus where dimensions of all datasets depend on nucleus_num.
However, this will not work for groups like ecp, mo etc., where some datasets depend on _num variables determined outside of the current group.

Co-existing HDF5 (serial and parallel) + PKG_CHECK_MODULES issue

Upon compilation with the following modules loaded:

Currently Loaded Modulefiles:
  1) rocks-openmpi                     4) gcc/9.2.0                         7) apps/automake-1.16.1-gcc-7.3.0
  2) julia/1.6.3-gcc-9.2.0-ivybridge   5) apps/autogen-5.18.12-gcc-8.2.0    8) apps/libtool-2.4.6-gcc-7.3.0
  3) intel/2019.3       

The following issues are encountered:

./configure: line 17380: syntax error near unexpected token `PKG_CFLAGS=""'
./configure: line 17380: `PKG_CFLAGS=""'
checking for matching HDF5 Fortran wrapper... /usr/bin/h5fc
./configure: line 17855: syntax error near unexpected token `HDF5,'
./configure: line 17855: `PKG_CHECK_MODULES(HDF5, hdf5 >= 1.8,'

Thanks,
Vijay

pkg-config is broken after HDF5 detection rewriting

Calling pkg-config --libs trexio after make install returns the following:

Package @PKG_HDF5@ was not found in the pkg-config search path.
Perhaps you should add the directory containing `@PKG_HDF5@.pc'
to the PKG_CONFIG_PATH environment variable
Package '@PKG_HDF5@', required by 'trexio', not found

The new HDF5 detection macro does not define PKG_HDF5 so it is not propagated to the trexio.pc

Backwards incompatibility of `TREXIO_TEXT` back end

TEXT back end relies on reading the entire group before each I/O operation (for sync-ing). The group is read from the .txt file in a particular order, according to trex.json. Thus, addition of a new attribute to a group changes the source code of the generated trexio_text_read_[group] function. As a result, the group produced before addition of an attribute cannot be read in the updated version of TREXIO.
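
The failure mode can be illustrated with a toy reader in Python. Everything below is hypothetical (the line format, the attribute names and both reader functions); it only contrasts a reader that expects attributes in a fixed schema order with one that looks them up by key.

```python
def read_positional(lines, schema):
    """Read attributes assuming they appear exactly in schema order."""
    values = {}
    for expected, line in zip(schema, lines):
        key, _, payload = line.partition(" ")
        if key != expected:
            raise ValueError(f"expected {expected!r}, found {key!r}")
        values[key] = payload.strip()
    return values


def read_keyed(lines, schema):
    """Read attributes by name; attributes missing from the file become None."""
    found = {}
    for line in lines:
        key, _, payload = line.partition(" ")
        found[key] = payload.strip()
    return {key: found.get(key) for key in schema}


old_file = ["num 2", "charge 1.0 1.0"]      # written with the old schema
old_schema = ["num", "charge"]
new_schema = ["num", "repulsion", "charge"]  # one attribute added later

print(read_positional(old_file, old_schema))
try:
    read_positional(old_file, new_schema)
except ValueError as err:
    print("positional reader fails:", err)
print(read_keyed(old_file, new_schema))
```

A key-based reader is one possible way to tolerate attributes added after a file was written; whether it fits the generated text back end is a separate question.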

version-dependent compilation issue with Autotools

TREXIO compilation in the developer mode (TREXIO_DEVEL=1 ./configure) with current Makefile.am and configure.ac fails on CALMIP HPC.

Version:
Autoconf 2.69
Automake 1.13.4
Libtool 2.4.2
GNU make 3.82

Output of make command:

Makefile:807: warning: overriding recipe for target `src/.dirstamp'
Makefile:795: warning: ignoring old recipe for target `src/.dirstamp'
Makefile:834: warning: overriding recipe for target `tests/.dirstamp'
Makefile:822: warning: ignoring old recipe for target `tests/.dirstamp'
Makefile:837: warning: overriding recipe for target `tests/.deps/.dirstamp'
Makefile:825: warning: ignoring old recipe for target `tests/.deps/.dirstamp'
Makefile:858: src/.deps/trexio.Plo: No such file or directory
Makefile:859: src/.deps/trexio_hdf5.Plo: No such file or directory
Makefile:860: src/.deps/trexio_text.Plo: No such file or directory
make: *** No rule to make target `src/.deps/trexio_text.Plo'. Stop.

This issue does not occur on Ubuntu with the following versions:
Autoconf 2.69
Automake 1.16.1
Libtool 2.4.6
GNU make 4.2.1

trexio --without-hdf5 needs hdf5

Johannes (the numerical AO contributor) has access to a cluster where HDF5 is not installed, and we were not able to use TREXIO: the text back end failed at its purpose!

When we configure TREXIO with --without-hdf5, the produced library still needs to call some HDF5 functions.
We have missed #ifdef HAVE_HDF5 guards in many places.

Name convention for trexio_open

Hello,

I am interfacing TREXIO to CP2K, and I am a bit confused about trexio_open and trexio_open_c. According to the Fortran example on the webpage I should use trexio_open to open the trexio file, but the compiler complains that it cannot find the correct reference. On the other hand, with trexio_open_c, it compiles without issues and the example of the webpage works fine (even though the filename is wrong). I see that trexio_open is a private function inside the module, but I wonder why you have not swapped the two function names?

Warnings when compiling

Problem with a const:

In file included from src/trexio_hdf5.h:10,
                 from src/trexio_hdf5.c:6:
src/trexio_hdf5.c:8700:39: warning: passing argument 1 of 'free' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
 8700 |     if (index_p != index_sparse) FREE(index_p);
      |                                       ^~~~~~~
src/trexio_private.h:12:24: note: in definition of macro 'FREE'
   12 | #define FREE(X) { free(X) ; (X)=NULL; }
      |                        ^
In file included from src/trexio_hdf5.h:14,
                 from src/trexio_hdf5.c:6:
/usr/include/stdlib.h:482:25: note: expected 'void *' but argument is of type 'const void *'
  482 | extern void free (void *__ptr) __THROW;
      |                   ~~~~~~^~~~~

Version number in file

We should store in the file the version of the library that produced the file.
It is already stored in include/config.h as

/* major version */                                                                          
#define TREXIO_VERSION_MAJOR 0

/* minor version */
#define TREXIO_VERSION_MINOR 2

/* patch version */
#define TREXIO_VERSION_PATCH 0

/* Version number of package */
#define VERSION "0.2.0"

search bar for the website

either find a way to integrate it into the htmlized org-based website
or generate readthedocs from org-mode files (in this case search engine comes natively)

Broken backwards compatibility

PR #86 breaks it due to the removal of some blocks. We have to see with the users if these blocks have been used and change the major version if they are.

Does the TREXIO format support Cartesian contaminants?

Generally, one can have the angular factors of basis functions expressed as Cartesian functions (e.g. 6 d functions) or spherical harmonics (5 d functions), and I see the ao group includes a cartesian flag. But it is also possible to express them as spherical harmonics plus contaminant(s) (5 d functions plus 1 s * r^2 function). At least OpenMolcas can do it. Does the TREXIO format support this?

user name and machine in file?

It could be convenient to store the user name and name of the machine of the user who produced the file. We can easily get this info from the environment variables and write it when we close the file.
The question is: is it too intrusive?

It is also possible to have an environment variable (or a configuration file stored in the home directory of the user) which activates this feature.
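
An opt-in version of this idea could look like the Python sketch below. The environment variable name TREXIO_STORE_PROVENANCE and the function are hypothetical; the user name is taken from the USER environment variable and the machine name from the standard platform module.

```python
import os
import platform


def provenance_metadata(environ=None, flag="TREXIO_STORE_PROVENANCE"):
    """Return user and host names only if the opt-in flag is set to "1".

    `environ` defaults to os.environ; a plain dict can be passed for testing.
    The flag name is an assumption made for this sketch.
    """
    env = os.environ if environ is None else environ
    if env.get(flag) != "1":
        return {}  # feature disabled: nothing is recorded
    return {
        "user": env.get("USER", "unknown"),
        "host": platform.node() or "unknown",
    }


print(provenance_metadata({"USER": "alice"}))
print(provenance_metadata({"USER": "alice", "TREXIO_STORE_PROVENANCE": "1"}))
```

Keeping the feature off by default avoids writing personal information into files without the user asking for it.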

verbose error reporting

I implemented the recent version of trexio (2.2.0) and found that the trexio_assert subroutine could be better. I made a local copy of that subroutine which might be useful for verbose error reporting.

Call with:

call trexio_error(rc, TREXIO_SUCCESS, 'trexio_read_mo_num failed', __FILE__, __LINE__)

Subroutine:

      subroutine trexio_error(trexio_rc, check_rc, message, filename, line)
            !> This subroutine handles the error in reading/writing with trexio data
            !> @author Ravindra Shinde ([email protected])
            !> @date 01 June 2022
            !> \param[in] trexio_rc : the return code from the trexio library
            !> \param[in] check_rc  : the return code to compare against trexio_rc
            !> \param[in] message   : the error message for printing
            !> \param[in] filename  : the name of the file where the error occurred
            !> \param[in] line      : the line number where the error occurred


            use contrl_file,        only: ounit, errunit
            use mpi,                only: mpi_abort, MPI_COMM_WORLD
            implicit none

            integer, intent(in), value :: trexio_rc
            integer, intent(in), value :: check_rc
            integer, intent(in), value :: line
            character(len=*), intent(in), optional  :: message
            character(len=*), intent(in), optional  :: filename
            integer :: ierr

            if (trexio_rc /= check_rc) then
                  write(ounit,'(a)') "Error reading/writing data from trexio file :: ", trim(message)
                  write(errunit,'(a)') "Error reading/writing data from trexio file :: ", trim(message)
                  write(errunit,'(3a,i6)') "Debug source file :: ", trim(filename), " at line " , line
                  call mpi_abort(MPI_COMM_WORLD,-100,ierr)
            endif

      end subroutine trexio_error

Thread locking

This should be done in the front end.

We need to be careful because the front end calls its own functions. So we should acquire the lock just before calling the back-end function in the switch statement.

pip installation without HDF5 installed.

On a machine where HDF5 is not installed, I can't install the python binding.
However, h5py is installed.
Now that we no longer use hdf5_hl, maybe we can create a dependency on h5py and use the same libhdf5?
Or maybe we can create a fall-back for text-interface only?

Collecting trexio
  Using cached trexio-1.3.2.tar.gz (291 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [24 lines of output]
      Traceback (most recent call last):
        File "<string>", line 78, in <module>
      AssertionError

      During handling of the above exception, another exception occurred:

      Traceback (most recent call last):
        File "/n/home08/joonholee/miniconda3/envs/ipie_env/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/n/home08/joonholee/miniconda3/envs/ipie_env/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/n/home08/joonholee/miniconda3/envs/ipie_env/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-oh_8qnrs/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 341, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-oh_8qnrs/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 323, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-oh_8qnrs/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 338, in run_setup
          exec(code, locals())
        File "<string>", line 80, in <module>
      Exception: pkg-config could not locate HDF5
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

MO-coefficients get mangled when written and read again

Dear developers,

thanks for this nice library. I encountered an issue that seems related to #127. Please see the reproducer below:

Everything was tested on Python 3.12.1 and trexio==2.4.2.

It mimics the reading & writing of information from an unrestricted calculation done on 2 atoms with a total of 2 AOs and 2 alpha MOs Ca and 2 beta MOs Cb.

from pathlib import Path

import numpy as np
import trexio

np.set_printoptions(suppress=True, precision=4, linewidth=180)

nucleus_num = 2
ao_num = 2

fn = Path("issue.h5")
if fn.exists():
    fn.unlink()
tf = trexio.File(str(fn), mode="w", back_end=trexio.TREXIO_HDF5)

# Write file
coords3d = np.zeros((2, 3))
coords3d[1, 2] = 1.889
trexio.write_nucleus_num(tf, nucleus_num)
trexio.write_nucleus_coord(tf, coords3d)
mo_num = ao_num
Ca = np.random.rand(ao_num, mo_num)
Cb = np.random.rand(ao_num, mo_num)
# Mos are in columns; C.shape is (num_ao, 2*num_ao)
# Concatenate alpha and beta MOs
C = np.concatenate((Ca, Cb), axis=1)
mo_spin = ([0] * mo_num) + ([1] * mo_num)
mo_num = C.shape[1]
trexio.write_ao_num(tf, ao_num)
trexio.write_mo_num(tf, mo_num)
trexio.write_mo_coefficient(tf, C)
trexio.write_mo_spin(tf, mo_spin)
tf.close()

# Read again
with trexio.File(str(fn), mode="r", back_end=trexio.TREXIO_HDF5) as tf:
    coords3d_read = trexio.read_nucleus_coord(tf)
    # coords3d is fine
    np.testing.assert_allclose(coords3d, coords3d_read)
    C_read = trexio.read_mo_coefficient(tf)
    # C_read is mangled ...
    np.testing.assert_allclose(C, C_read)

When the MO coefficients are stored with the shape outlined in the paper/documentation, (ao_num, mo_num), they get mangled when read again.

When I build the matrix C with shape (mo_num, ao_num) and store it, everything is fine, but then C is also read back with this shape, which seems inconsistent with the documentation.

In contrast, the 3D Cartesian coordinates seem fine; nothing gets transposed. So it does not look like I have to provide the data in column-major order initially.

So, did I misuse the API or is this a bug/documentation issue?
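
One way to see how such mangling can arise, offered as a possible reading and not as a diagnosis of the library: if a row-major buffer written with shape (ao_num, mo_num) is reinterpreted on reading with shape (mo_num, ao_num), the result is a regrouping of the same numbers, not a transpose. The pure-Python sketch below uses hypothetical helper names and small integer data.

```python
def flatten(matrix):
    """Row-major flattening of a nested list."""
    return [x for row in matrix for x in row]


def reshape(flat, nrows, ncols):
    """Regroup a flat row-major buffer into nrows rows of ncols entries."""
    if len(flat) != nrows * ncols:
        raise ValueError("buffer size does not match the requested shape")
    return [flat[r * ncols:(r + 1) * ncols] for r in range(nrows)]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


ao_num, mo_num = 2, 4
C = [[11, 12, 13, 14],
     [21, 22, 23, 24]]  # shape (ao_num, mo_num)

reinterpreted = reshape(flatten(C), mo_num, ao_num)
print(reinterpreted)  # same numbers, regrouped
print(transpose(C))   # what a true transpose would look like
```

Under this reading, a matrix built directly with shape (mo_num, ao_num) survives the round trip unchanged, which matches the observation reported above.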

All the best
Johannes

External group

psi4/psi4#2847 (comment)

We could probably add functionality that allows writing an arbitrary variable to, e.g., an "external" group via a generic trexio_write|read_(file, variable-str, datatype-str, size-max). I can implement it easily for the HDF5 back end, but the TEXT one is more tricky.

Handle index/value structs

{
double value;
int i;
int j;
int k;
int l;
}

indices(4,batch_size)
values (batch_size)

rc = _read_(file, batch_size, indices, values)
rc = number of values read.
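
The conversion between an array of index/value structs and the separate indices and values buffers can be sketched as follows in Python. The Element type and read_batch function are hypothetical; indices are packed four per element, and the function returns the number of elements actually read, as suggested above.

```python
from typing import NamedTuple


class Element(NamedTuple):
    """One sparse entry: a value and its four indices."""
    value: float
    i: int
    j: int
    k: int
    l: int


def read_batch(elements, offset, batch_size):
    """Return (n_read, indices, values) for up to batch_size elements.

    `indices` is flat, with the four indices of each element stored
    contiguously; `n_read` is smaller than batch_size at the end of the data.
    """
    chunk = elements[offset:offset + batch_size]
    indices = [idx for e in chunk for idx in (e.i, e.j, e.k, e.l)]
    values = [e.value for e in chunk]
    return len(chunk), indices, values


data = [
    Element(0.5, 1, 1, 1, 1),
    Element(0.25, 1, 2, 1, 2),
    Element(0.125, 2, 2, 2, 2),
]
print(read_batch(data, 0, 2))
print(read_batch(data, 2, 2))  # fewer values than requested
```

A caller can loop until the returned count is smaller than the requested batch size to detect the end of the data.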
