hdfgroup / hdf.pinvoke

Raw HDF5 Power for .NET

Home Page: http://www.hdfgroup.org/HDF5

License: Other

C# 91.11% Smalltalk 7.48% CMake 1.14% Batchfile 0.02% F# 0.26%

hdf.pinvoke's Introduction


What it is (not)

HDF.PInvoke is a collection of PInvoke signatures for the HDF5 C-API. It's practically code-free, which means we can blame all the bugs on Microsoft or The HDF Group 😄

It is not a high-level .NET interface for HDF5. "It's the GCD of .NET bindings for HDF5, not the LCM." :bowtie:

Current Release Version(s)

HDF5 Release Version   Assembly Version   Assembly File Version   Git Tag
1.8.21                 1.8.21.1           1.8.21.1                v1.8.21.1
1.10.11                1.10.11            1.10.11                 v1.10.11

How "stuff" is versioned.

Quick Install:

To install the latest HDF.PInvoke 1.8, run the following command in the Package Manager Console

    Install-Package HDF.PInvoke -Version 1.8.21.1

To install the latest HDF.PInvoke 1.10, run the following command in the Package Manager Console

    Install-Package HDF.PInvoke -Version 1.10.11

Prerequisites

The HDF.PInvoke.dll managed assembly depends on the following native DLLs (32-bit and 64-bit):

  • HDF5 core API, hdf5.dll
  • HDF5 high-level APIs, hdf5_hl.dll
  • Gzip compression, zlib.dll
  • Szip compression, szip.dll
  • The C-runtime of the Visual Studio version used to build the former, e.g., msvcr120.dll for Visual Studio 2013

All native dependencies, built with thread-safety enabled, are included in the NuGet packages, except the Visual Studio C-runtime, which is available from Microsoft as Visual C++ Redistributable Packages for Visual Studio 2013. In the unlikely event that they aren't already installed on your system, go get 'em! (See this link for the rationale behind not distributing the Visual Studio C-runtime in the NuGet package.)

The DLL Resolution Process

On the first call to an H5* function, the application's configuration file (e.g., YourApplication.exe.config) is searched for the key NativeDependenciesAbsolutePath, whose value, if found, is added to the DLL-search path. If this key is not specified in the application's config-file, then the HDF.PInvoke.dll assembly detects the processor architecture (32- or 64-bit) of the hosting process and expects to find the native DLLs in the bin32 or bin64 subdirectories, relative to its location. For example, if HDF.PInvoke.dll lives in C:\bin, it looks for the native DLLs in C:\bin\bin32 and C:\bin\bin64. Finally, the PATH environment variable of the running process is searched for other locations, such as installed by the HDF5 installers.
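For example, the search-path override described above could be set in the application's configuration file like this (the key name is taken from the text above; the path value is a placeholder):

```
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <appSettings>
    <!-- Absolute path prepended to the native-DLL search path
         before the bin32/bin64 and PATH fallbacks are tried. -->
    <add key="NativeDependenciesAbsolutePath" value="C:\native\hdf5" />
  </appSettings>
</configuration>
```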

One Major HDF5 Version

The HDF Group currently maintains one major HDF5 release family, HDF5 1.14. The Visual Studio Solution is set up to build the HDF.PInvoke.dll .NET assemblies for the "Any CPU" platform in the Debug and Release configurations. Support for the HDF5 1.8 or 1.10 API is toggled via the HDF5_VER1_10 conditional compilation symbol in the Build properties of the HDF.PInvoke and UnitTest projects.

License

HDF.PInvoke is part of HDF5. It is subject to the same terms and conditions as HDF5. Please review COPYING or https://www.hdfgroup.org/licenses/ for the details. If you have any questions, please contact us.

Supporting HDF.PInvoke

The best way to support HDF.PInvoke is to contribute to it, either by reporting bugs, writing documentation (e.g., the cookbook), or sending pull requests.



hdf.pinvoke's People

Contributors

dsanchen, gheber, hokb, janwosnitza, rbelokurov


hdf.pinvoke's Issues

A rewritten attribute needs H5.close() to be applied to the file

I created a variable-length string and stored it in an attribute. If I rewrite the string, HDFView does not see the change until H5.close() is called. This is not the case for datasets; why is it the case for attributes?

Probably this is also the case with freshly written attributes, but I have not tested that yet.

Pseudocode:

H5.open()
H5F.open()
H5A.write()
H5F.close()
-----------------HDF does not see the change
H5.close()
-----------------HDF does see the change
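One thing that might be worth trying (an assumption on my part, not a confirmed fix): flushing the file's buffers explicitly after the attribute write, which pushes pending metadata to disk without shutting the library down:

```
// Sketch: ask HDF5 to flush all buffers for the file so an external
// viewer such as HDFView can see the updated attribute.
// fileId, attributeId, typeId, and buffer are placeholders.
H5A.write(attributeId, typeId, buffer);
H5F.flush(fileId, H5F.scope_t.GLOBAL);
```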

H5Odisable_mdc_flushesTestSWMR[3,4] fail

Both fail with a file locking message such as this:

Test Name:  H5Odisable_mdc_flushesTestSWMR3
Test FullName:  UnitTests.H5SWMRTest.H5Odisable_mdc_flushesTestSWMR3
Test Source:    c:\Users\gerd\GitHub\HDF.PInvoke\UnitTests\H5SWMRTest\H5Odisable_mdc_flushes.cs : line 56
Test Outcome:   Failed
Test Duration:  0:00:00.0223386

Result Message: 
Assert.IsTrue failed. 
TestCleanup method UnitTests.H5SWMRTest.Cleanup threw exception. System.IO.IOException: System.IO.IOException: The process cannot access the file 'C:\Users\gerd\AppData\Local\Temp\tmp3A19.tmp' because it is being used by another process..
Result StackTrace:  
at UnitTests.H5SWMRTest.H5Odisable_mdc_flushesTestSWMR3() in c:\Users\gerd\GitHub\HDF.PInvoke\UnitTests\H5SWMRTest\H5Odisable_mdc_flushes.cs:line 62

TestCleanup Stack Trace
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.File.InternalDelete(String path, Boolean checkHost)
   at UnitTests.H5SWMRTest.Cleanup() in c:\Users\gerd\GitHub\HDF.PInvoke\UnitTests\H5SWMRTest\H5SWMRTest.cs:line 107

Versioning scheme

Currently, AssemblyInfo.cs defines the versions as:

#if HDF5_VER1_10
[assembly: AssemblyVersion("1.10.0.*")]
[assembly: AssemblyFileVersion("1.10.0.*")] 
#else
[assembly: AssemblyVersion("1.8.17.*")]
[assembly: AssemblyFileVersion("1.8.17.*")]
#endif

This produces a build warning, since AssemblyFileVersion does not support the asterisk (while AssemblyVersion does).
Versioning is important in .NET, so we should make a conscious decision about which versioning scheme to follow for HDF.PInvoke.
AssemblyVersion is used to identify a strongly named assembly during DLL probing. Using the '*' here can lead to surprises on the user side. Commonly, updates within the same major.minor version are distributed as drop-in replacements for the older, buggy version (at least, this is how Microsoft does it), so the build and revision numbers would not change for patches and bugfix updates within the same major.minor version. Replacing them is easy for the user: she does not need to change any project references; the new DLL simply replaces the old one.
To distinguish actual builds, AssemblyFileVersion is more useful. Among other places, it is displayed in the properties window when right-clicking a DLL file and selecting "Properties".

My recommendation would be to keep AssemblyVersion the same except for major and minor updates, and to use an auto-incrementing scheme for AssemblyFileVersion. Unfortunately, there is no really easy way to auto-increment AssemblyFileVersion; commonly, people use MSBuild or Visual Studio trickery to achieve this.
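One way to implement this recommendation (a sketch; the concrete version numbers are illustrative placeholders, and the build-number stamping step is left to MSBuild):

```
// AssemblyInfo.cs sketch -- version numbers are placeholders.
#if HDF5_VER1_10
// AssemblyVersion stays fixed within a major.minor line, so patched
// DLLs remain drop-in replacements for strongly named references.
[assembly: AssemblyVersion("1.10.0.0")]
// AssemblyFileVersion identifies the actual build; it does not accept
// '*', so it must be stamped explicitly (e.g., by an MSBuild step).
[assembly: AssemblyFileVersion("1.10.0.17")]
#else
[assembly: AssemblyVersion("1.8.17.0")]
[assembly: AssemblyFileVersion("1.8.17.42")]
#endif
```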

H5D.read returns -1

I filed a pull request for H5D.read(). Unfortunately, I was not able to get H5D.read() working for me, but this should be really easy:

var strPtr = IntPtr.Zero;
H5D.read(datasetId, H5S.ALL, H5S.ALL, H5P.Default, strPtr)
var text = Marshal.PtrToStringAuto(strPtr);

I'm also new to HDF5, so maybe I overlooked something?!
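Two problems stand out in the snippet above: passing IntPtr.Zero gives the library no buffer to write into, and H5D.read also takes a memory datatype id, which the call omits. A sketch of reading a variable-length string (assumptions: the dataset holds a single variable-length string; datasetId, typeId, and spaceId are placeholder handles):

```
// Sketch: HDF5 writes a *pointer* to the string into the buffer we
// supply, so we pin a one-element IntPtr slot to receive it.
IntPtr[] rdata = new IntPtr[1];
GCHandle handle = GCHandle.Alloc(rdata, GCHandleType.Pinned);
try
{
    if (H5D.read(datasetId, typeId, H5S.ALL, H5S.ALL, H5P.DEFAULT,
                 handle.AddrOfPinnedObject()) < 0)
        throw new Exception("H5D.read failed");

    string text = Marshal.PtrToStringAnsi(rdata[0]);

    // Release the memory HDF5 allocated for the string contents.
    H5D.vlen_reclaim(typeId, spaceId, H5P.DEFAULT,
                     handle.AddrOfPinnedObject());
}
finally
{
    handle.Free();
}
```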

H5DOappendTestSWMR1 fails

Not sure what's going on. Sometimes it seems to be locking the file. Other times it appears that the call itself is failing. Weird.

2D jagged variable string array write, read, extend

I found this example to create a 1D array with variable-length writing and reading.

I was wondering how to modify it to allow the same for 2D? Also, extending this 2D array would be interesting.

Let's say I know from the beginning the number of columns but I then want to add a variable number of rows to each column. This should be possible at any time (extend).

-------a------------b-----------c------------d-----------e-----------f------
|           ||           ||           ||           ||           ||           |
      a1          b1           c1           d1           e1           f1
      a2          b2           c2           d2           e2           f2
      a3                       c3           d3           e3           f3
      a4                                    d4           e4           f4
      a5                                    d5           e5              
      a6                                    d6           e7              
                                                         e8              

I think in general this is possible, right? How would I do this with this library?
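As a starting point for the extend part, here is a sketch under assumptions: chunked layout with one unlimited dimension, a rectangular (not truly jagged) grid, and the variable-length string type and actual writes omitted. fileId and typeId are placeholder handles:

```
// Sketch: rows grow over time; columns are fixed at 6. hid_t shown as long.
ulong[] dims    = { 0, 6 };                // start with 0 rows, 6 columns
ulong[] maxdims = { H5S.UNLIMITED, 6 };    // rows are extendable
long spaceId = H5S.create_simple(2, dims, maxdims);

long dcpl = H5P.create(H5P.DATASET_CREATE);
ulong[] chunk = { 64, 6 };
H5P.set_chunk(dcpl, 2, chunk);             // extension requires chunking

long dsetId = H5D.create(fileId, "table", typeId, spaceId,
                         H5P.DEFAULT, dcpl, H5P.DEFAULT);

// Later: grow to 10 rows, then select the new region and write into it.
ulong[] newDims = { 10, 6 };
H5D.set_extent(dsetId, newDims);
```

Uneven column lengths would then have to be represented on top of this, e.g. by padding or by a per-column length attribute.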

HDF dataset strides storage (pot. enhancement?)

@gheber Should I post this to the HDF forum?

The original element datatype of a dataset is stored by HDF5 and is required to identify how to read the elements back into memory. The HDF5 API allows providing the dataset values as arbitrary types; internally, the conversion is done transparently to the user. The dimension lengths are also stored as an obligatory dataset attribute.

In order to be compatible with arbitrary applications, a third piece of information, next to the dimension lengths and the element type, would be necessary to store and manage all properties of the array (let's call the dataset this way here): the strides of the dimensions. This would enable HDF5 to handle row-major versus column-major order, and also any other custom stride the dataset might be stored in.

The HDF5 dataset documentation (I cannot find the exact place right now) points out the difference in storage scheme between languages. Datasets stored from ILNumerics / FORTRAN will have their dimensions flipped when read in C, and vice versa.

Currently, I see the following options to handle this situation:

  • The user of HDF5 needs to keep the actual dimensions in mind. Dimensions need to be flipped manually when reading data that were stored using the other convention.
  • The hyperslab feature seems flexible enough to read from the dataset storage using arbitrary strides (?). Drawback: the flexibility comes at the price of a performance drop compared to sequential memory reads.

HDF5 stores the dataset in row-major order, no matter what. Currently, a user who wants the data to be readable under any storage convention could / must use an attribute to attach this information to the dataset. The necessary dimension permutations must still be implemented in a layer outside of HDF5. There is no standard place / attribute name recommended for this (?).

I am interested in your experiences regarding this issue and in your recommendation on how to handle different storage schemes. Should we consider HDF5 as 'being row-major' only? Or is it rather meant to be a general storage API, capable of handling any storage scheme?

Maybe there are already plans to incorporate this extra bit into HDF5? Having the storage scheme of the dataset available would at least help the user identify the actions necessary to clean up any dimension-order mess. The attribute option is a reasonable fallback, but an official 'standard' place to store / read the strides could help compatibility significantly. I am not aware of all potential side effects in HDF5, but defining a standard attribute might be possible without introducing too many problems? [For an optimal solution, HDF5 would auto-permute the dimensions, but this would obviously be more demanding.]

What do you think?

H5DO.write_chunk missing in hdf5.dll

I just found that you made write_chunk available and I wanted to try it but it gives me the following error:
Additional information: Unable to find an entry point named 'H5DOwrite_chunk' in DLL 'hdf5.dll'.

This function is part of the high level of HDF5.

In the constants, the DLL file name is defined like this:

public const string DLLFileName = "hdf5.dll";

Shouldn't this entry point be looked up in the high-level DLL (hdf5_hl.dll) rather than hdf5.dll? How could we fix this?
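A hedged sketch of the likely fix: since the report says H5DOwrite_chunk belongs to the high-level part of HDF5, its P/Invoke declaration would need to name hdf5_hl.dll instead of the core DLL constant. The constant name and the managed parameter types below are assumptions, not the project's actual code:

```
// Sketch (assumption): point the import at the high-level DLL.
public const string HLDLLFileName = "hdf5_hl.dll";

[DllImport(HLDLLFileName, EntryPoint = "H5DOwrite_chunk",
    CallingConvention = CallingConvention.Cdecl)]
public extern static int write_chunk
    (long dset_id, long dxpl_id, uint filters,
     ulong[] offset, IntPtr data_size, IntPtr buf);
// (hid_t/herr_t/hsize_t shown via common 64-bit aliases here)
```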

H5T Errors with Mono

When trying to use the HDF.PInvoke library with Mono, the calls to GetModuleHandle and GetProcAddress fail. Mono seems able to resolve the PInvokes, even changing the name of the called library to be able to locate it on a Linux system, but it can't handle the runtime calls against kernel32.dll. I created a pull request with a new class for handling the runtime DLL imports on Linux. This currently runs without throwing any dlerrors on the imports, but dlsym is still returning incorrect values for the types I tried (e.g., a return value of -1 for H5T_NATIVE_FLOAT_g).

HDF5 Architecture

Dear all,

In one of my projects I used the legacy wrapper, which works with the HDF5 1.8.9 libraries.
Unfortunately, when I push some data I get a memory-corruption error in Visual Studio (access violation), and I don't know why.
So I hope the HDF.PInvoke library can resolve the problem.
To test this, I tried to create a file and add a chunked H5D compound dataset, but without success.
I don't understand how to do what I did before.

You can find a sample of the file that I am able to create with the legacy wrapper at this link:
https://drive.google.com/file/d/0B5l_eCCuTAyiVUNtX2luTGtxaW8/view?usp=sharing
Thanks for your help.

AnyCPU support - how far are we going?

+1 for AnyCPU support. However, next to the typing issue, AnyCPU raises the question of how far we want to go, especially regarding unmanaged binaries:

  1. Poor man's AnyCPU: let the user pick a set of binaries for a specific platform (from hdfgroup.org?) and place them next to the HDF.PInvoke assembly. This buys her the comfort of not having to compile HDF.PInvoke for a specific platform (because it is well prepared for both platforms). However, the choice of unmanaged binaries depends on the target of the entry assembly (exe), which needs to be fixed in this case and must match the selected set of HDF5 unmanaged libs.

  2. More convenient AnyCPU: we release the user (partly) from having to think about the platform target. The user can easily try / test both platforms by simply switching between 32- and 64-bit compilation back and forth. This gives a true AnyCPU feeling and supports AnyCPU applications as well.

In the second case, we should provide some mechanism to distinguish between 32- and 64-bit unmanaged dependencies at runtime. The user could still fetch the unmanaged binaries from hdfgroup.org herself, I guess. (We do not want to track updates here, do we?)

Several options exist to pick the correct platform type binaries at runtime.

Which option would you prefer?

PInvoke and UTF-8 strings

I believe that in order to support UTF-8 we will have to marshal HDF5 path names, link names, and attribute names as byte[] rather than string or StringBuilder. Maybe there is a PInvoke trick, but I can't make it work. I've mocked up H5L/H5Lcreate_hard. Have a look at H5Lcreate_hardTest2 in H5LTest/H5Lcreate_hard.cs and you'll see what I mean. The net inconvenience is that people will have to pass Encoding.UTF8.GetBytes("Ελληνικά") instead of "Ελληνικά". Thoughts?
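The inconvenience described above would look roughly like this at a call site (a sketch; fileId is a placeholder handle and the argument list is abbreviated relative to the real signature):

```
// Sketch: pass a NUL-terminated UTF-8 byte array instead of a string.
// Note: Encoding.UTF8.GetBytes does NOT append the terminating zero byte
// that the C API expects, hence the explicit "\0".
byte[] linkName = Encoding.UTF8.GetBytes("Ελληνικά\0");
H5L.create_hard(fileId, linkName, fileId, linkName);
```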

H5PT

Hi,

in one of my programs, i used the lib H5PT in version 1.8.9.
I need to use it now with the PInvoke lib but i cant found that.

The objective is to store multiple datatype in one dataset.
Before, i was able to store data like that:

col1 | col2 | col3
double | strings | int

How can i do it today please?

Passing "object" to native DLL is dangerous

Running all tests on my x86 machine failed at H5Aiterate_by_nameTest1() and H5AiterateTest1(). The reason could be that passing a managed object (here an ArrayList) directly to the native DLL (to the H5Aiterate2 function), which then does a callback, is not a good idea: objects can be moved around in memory at any time.
The native DLL does not know anything about managed objects and trusts that it gets a valid pointer in the void* op_data parameter. When the object "is not there anymore" by the time the callback is called, the callback will fail, as it has no idea what "object" it should operate on.
I guess we have to use pinned GC-Handles.
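A common pattern for this (a sketch; the managed iterate signature and callback parameter types are assumptions): wrap the managed object in a GCHandle, pass the handle's IntPtr through op_data, and recover the object inside the callback. A non-blittable object like ArrayList cannot actually be pinned, but a normal handle keeps it alive and findable regardless of GC moves:

```
// Sketch: pass a managed collection safely through void* op_data.
var list = new ArrayList();
GCHandle handle = GCHandle.Alloc(list);   // keeps 'list' alive, no pinning
try
{
    H5A.iterate(objectId, H5.index_t.NAME, H5.iter_order_t.NATIVE,
                ref pos, Callback, GCHandle.ToIntPtr(handle));
}
finally
{
    handle.Free();
}

// Inside the callback, recover the object from the opaque pointer.
static int Callback(long location_id, IntPtr attr_name,
                    ref H5A.info_t ainfo, IntPtr op_data)
{
    var list = (ArrayList)GCHandle.FromIntPtr(op_data).Target;
    list.Add(Marshal.PtrToStringAnsi(attr_name));
    return 0;   // continue iteration
}
```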

How about passing out-Strings as StringBuilder instead of IntPtr ?

IntPtr buf = H5.allocate_memory(new IntPtr(256), 0);
H5F.get_name(m_v0_test_file, buf, new IntPtr(255));
string name = Marshal.PtrToStringAnsi(buf);

➡️

StringBuilder nameBuilder = new StringBuilder(256);
H5F.get_name(m_v0_test_file, nameBuilder, new IntPtr(nameBuilder.Capacity));
string name = nameBuilder.ToString();

Looks a little more C#-ish.

Call H5.open on each H5* static constructor

H5.open could be called from every static constructor of the H5* classes. That way, one would not have to remember to call H5.open explicitly. Nevertheless, H5.close must still be called "manually".
This would resemble the behavior of the native DLL, where calling any H5_* function makes sure that H5_init_library has been called...
What do others think?

NuGet updates do not remove event commands

The update from HDF.PInvoke.1.10.0-pre2 to HDF.PInvoke.1.10.0-pre3 did not remove the event commands, which is also why this problem arose. Is there a possibility to remove the old ones when updating?

IntPtr overloads for System.Array arguments

Considering functions which receive or give back data in terms of arrays, as in H5Apublic.cs:

    public extern static ssize_t get_name_by_idx(hid_t loc_id, byte[] obj_name, H5.index_t idx_type, H5.iter_order_t order, hsize_t n, [Out] byte[] name, size_t size, hid_t lapl_id = H5P.DEFAULT);

It is often useful to be able to call the function with a pointer instead of a byte[]. Reason: the latter needs to live on the heap, while an IntPtr can point to the stack. Often the data can be considered temporary anyway and needs to be reworked in the calling function, so this is a perfect place for stackalloc.

We should consider adding corresponding overloads in all places where IntPtr would be feasible:

    public extern static ssize_t get_name_by_idx(hid_t loc_id, byte[] obj_name, H5.index_t idx_type, H5.iter_order_t order, hsize_t n, IntPtr name, size_t size, hid_t lapl_id = H5P.DEFAULT);
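Such an overload would enable call sites like the following sketch (requires unsafe code; locId and objName are placeholders, and the enum values are illustrative):

```
// Sketch: a stack-allocated scratch buffer passed through the
// proposed IntPtr overload -- no heap allocation for temporary data.
unsafe
{
    const int size = 256;
    byte* buf = stackalloc byte[size];
    IntPtr len = H5A.get_name_by_idx(locId, objName, H5.index_t.NAME,
        H5.iter_order_t.INC, 0, (IntPtr)buf, new IntPtr(size));
    string name = Marshal.PtrToStringAnsi((IntPtr)buf, len.ToInt32());
}
```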

How to deal with / mark thread safety

=== This is NOT related to SWMR in 1.10. ===

Is HDF.PInvoke thread-safe? That is, can I call the same function from multiple threads / AppDomains concurrently, as long as I am working on individual files?

@gheber Maybe we could have another wiki page (I hope I haven't missed it somewhere?) and put some related content there, with links from / to the official docs?

strong naming

Could you please strong-name sign your library? When I add it to a project via NuGet, I get an error message that your DLL doesn't have a strong name. There are workarounds for this problem, but it would be much easier if you gave HDF.PInvoke a strong name.
By the way, I created a project on GitHub that uses your library. It's still in its early stages and there are still some issues I need help with. Any other suggestions are also welcome.

Thanks, by the way, for this project. Your unit tests were a lot of help to me in understanding how to read and write HDF5 files.

Please include a copy of the license in the repository, for easy reference and clarity

When opening/viewing the HDF.PInvoke repository, it is not clear under what license the software is made available. I know it is available from http://www.hdfgroup.org/HDF5/doc/Copyright.html, since this reference is given in the source code and the nuspec files. However, it would be much nicer if a LICENSE file were present in the root of the repository.

Furthermore, I think the Readme.md should also have a paragraph that discusses the licensing terms and a reference to the actual license.

H5A.read NATIVE_INT appears to treat the attribute as an unsigned integer

The following code stores -1 into the HDF5 file as an 8-bit integer:

  int value = -1;
  hid_t attributeSpace = -1;
  hid_t typeId = -1;
  hid_t attributeId = -1;

  try
  {
    attributeSpace = H5S.create(H5S.class_t.SCALAR);
    typeId = H5T.copy(H5T.NATIVE_INT);
    H5T.set_size(typeId, new IntPtr(1));
    attributeId = H5A.create(objectId, title, typeId, attributeSpace);

    var array = new int[] { value };
    GCHandle pinnedArray = GCHandle.Alloc(array, GCHandleType.Pinned);
    H5A.write(attributeId, typeId, pinnedArray.AddrOfPinnedObject());
    pinnedArray.Free();

    return true;
  }
  catch (Exception ex)
  {
    return false;
  }
  finally
  {
    if (attributeId != -1) H5A.close(attributeId);
    if (typeId != -1) H5T.close(typeId);
    if (attributeSpace != -1) H5S.close(attributeSpace);
  }

When reading the same value back in, it returns 255, appearing to treat the NATIVE_INT as an unsigned 8-bit integer. Based on the documentation, I do not expect this to occur.

  int value = -9999999;

  if (H5A.exists(objectId, title) == 0)
  {
    return false;
  }

  hid_t typeId = -1;
  hid_t attributeId = -1;

  try
  {
    typeId = H5T.copy(H5T.NATIVE_INT);
    H5T.set_size(typeId, new IntPtr(1));
    attributeId = H5A.open(objectId, title);
    if (attributeId == -1)
      return false;

    int[] array = new int[1];
    GCHandle pinnedArray = GCHandle.Alloc(array, GCHandleType.Pinned);
    H5A.read(attributeId, typeId, pinnedArray.AddrOfPinnedObject());
    pinnedArray.Free();

    value = array[0];

    return true;
  }
  catch (Exception ex)
  {
    if (logError)
    {
      log.Error("Exception in reading attribute: {0}", ex.Message);
    }
    return false;
  }
  finally
  {
    if (typeId != -1) H5T.close(typeId);
    if (attributeId != -1) H5A.close(attributeId);
  }
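For what it's worth (an observation, not an official diagnosis): H5T.set_size(typeId, new IntPtr(1)) truncates the copied NATIVE_INT to a single byte, so the file stores only the low byte 0xFF of -1, and reading it back through the one-byte type yields 255. A sketch of a write that keeps the full 32-bit signed type (identifiers as in the report above):

```
// Sketch: write a 32-bit signed attribute without shrinking the type.
attributeSpace = H5S.create(H5S.class_t.SCALAR);
typeId = H5T.copy(H5T.NATIVE_INT);   // leave the size at 4 bytes
attributeId = H5A.create(objectId, title, typeId, attributeSpace);

var array = new int[] { -1 };
GCHandle pinnedArray = GCHandle.Alloc(array, GCHandleType.Pinned);
try
{
    H5A.write(attributeId, typeId, pinnedArray.AddrOfPinnedObject());
}
finally
{
    pinnedArray.Free();
}
```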
