laurentmazare / npy-ocaml
Numpy file format support for ocaml.
License: Apache License 2.0
The link to the "npy format spec" in your README is invalid.
Some 1d arrays end up with a shape of "(42,)" which is not parsed properly.
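To illustrate the kind of parsing this issue asks for, here is a minimal sketch of a shape parser that tolerates the trailing comma Python uses for one-element tuples. The name `parse_shape` is hypothetical and not part of the library; the sketch assumes the shape field has already been extracted from the header as a string.

```ocaml
(* Hypothetical helper, not the library's own code: parse a Python-style
   shape tuple such as "(42,)", "(3, 4)" or "()" into an int array. *)
let parse_shape (s : string) : int array =
  let s = String.trim s in
  let n = String.length s in
  if n < 2 || s.[0] <> '(' || s.[n - 1] <> ')' then
    invalid_arg "parse_shape: expected a parenthesised tuple";
  String.sub s 1 (n - 2)
  |> String.split_on_char ','
  |> List.map String.trim
  (* A trailing comma, as in "(42,)", leaves an empty field: drop it. *)
  |> List.filter (fun field -> field <> "")
  |> List.map int_of_string
  |> Array.of_list
```

Dropping empty fields also makes the zero-dimensional shape "()" parse to an empty array.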
OCaml 5.2 added support for float16 bigarrays, which makes npy
fail to build with:
#=== ERROR while compiling npy.0.0.9 ==========================================#
# context 2.2.0~beta2~dev | linux/x86_64 | ocaml-variants.5.2.0+trunk | file:///home/opam/opam-repository
# path ~/.opam/5.2/.opam-switch/build/npy.0.0.9
# command ~/.opam/5.2/bin/dune build -p npy -j 1
# exit-code 1
# env-file ~/.opam/log/npy-19-ee8120.env
# output-file ~/.opam/log/npy-19-ee8120.out
### output ###
# (cd _build/default && /home/opam/.opam/5.2/bin/ocamlopt.opt -w -40 -g -I src/.npy.objs/byte -I src/.npy.objs/native -I /home/opam/.opam/5.2/lib/camlzip -I /home/opam/.opam/5.2/lib/ocaml/unix -I /home/opam/.opam/5.2/lib/zip -intf-suffix .ml -no-alias-deps -o src/.npy.objs/native/npy.cmx -c -impl src/npy.ml)
# File "src/npy.ml", lines 13-26, characters 4-68:
# 13 | ....match packed_kind with
# 14 | | P Bigarray.Int32 -> "i4"
# 15 | | P Bigarray.Int64 -> "i8"
# 16 | | P Bigarray.Float32 -> "f4"
# 17 | | P Bigarray.Float64 -> "f8"
# ...
# 23 | | P Bigarray.Complex32 -> "c8" (* 2 32bits float. *)
# 24 | | P Bigarray.Complex64 -> "c16" (* 2 64bits float. *)
# 25 | | P Bigarray.Int -> failwith "Int is not supported"
# 26 | | P Bigarray.Nativeint -> failwith "Nativeint is not supported."
# Error (warning 8 [partial-match]): this pattern-matching is not exhaustive.
# Here is an example of a case that is not matched:
# P Float16
# (cd _build/default && /home/opam/.opam/5.2/bin/ocamlc.opt -w -40 -g -bin-annot -I src/.npy.objs/byte -I /home/opam/.opam/5.2/lib/camlzip -I /home/opam/.opam/5.2/lib/ocaml/unix -I /home/opam/.opam/5.2/lib/zip -intf-suffix .ml -no-alias-deps -o src/.npy.objs/byte/npy.cmo -c -impl src/npy.ml)
# File "src/npy.ml", lines 13-26, characters 4-68:
# 13 | ....match packed_kind with
# 14 | | P Bigarray.Int32 -> "i4"
# 15 | | P Bigarray.Int64 -> "i8"
# 16 | | P Bigarray.Float32 -> "f4"
# 17 | | P Bigarray.Float64 -> "f8"
# ...
# 23 | | P Bigarray.Complex32 -> "c8" (* 2 32bits float. *)
# 24 | | P Bigarray.Complex64 -> "c16" (* 2 64bits float. *)
# 25 | | P Bigarray.Int -> failwith "Int is not supported"
# 26 | | P Bigarray.Nativeint -> failwith "Nativeint is not supported."
# Error (warning 8 [partial-match]): this pattern-matching is not exhaustive.
# Here is an example of a case that is not matched:
# P Float16
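One way to keep such a match compiling across compiler versions is a catch-all case, sketched below. The type `packed_kind` and the function `dtype_suffix` are stand-ins written for this example (the library's own definitions may differ), and only the cases visible in the log above are reproduced. On OCaml 5.2 or later an explicit `P Bigarray.Float16 -> "f2"` case could be added instead, since "f2" is numpy's half-precision dtype, but that constructor does not exist on older compilers.

```ocaml
(* Stand-in for the library's packed kind type; assumed, not copied. *)
type packed_kind = P : ('a, 'b) Bigarray.kind -> packed_kind

(* Map a bigarray kind to a numpy dtype suffix. The final catch-all keeps the
   match exhaustive when a newer compiler adds kinds such as Float16. *)
let dtype_suffix (packed_kind : packed_kind) : string =
  match packed_kind with
  | P Bigarray.Int32 -> "i4"
  | P Bigarray.Int64 -> "i8"
  | P Bigarray.Float32 -> "f4"
  | P Bigarray.Float64 -> "f8"
  | P Bigarray.Complex32 -> "c8" (* two 32-bit floats *)
  | P Bigarray.Complex64 -> "c16" (* two 64-bit floats *)
  | P Bigarray.Int -> failwith "Int is not supported"
  | P Bigarray.Nativeint -> failwith "Nativeint is not supported."
  | P _ -> failwith "unsupported bigarray kind"
```

The trade-off is that a catch-all silences the exhaustiveness warning, so a future new kind would fail at run time rather than at build time.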
Looking at npy-ocaml more or less by accident, I was thinking that it might be nice to add functions
val to_bigarrayN: 'c Bigarray.layout -> ('a,'b) kind -> packed_arrayN -> ('a,'b,'c) Bigarray.ArrayN.t option
(for N∈{1,2,3} and Genarray) that return Some a
if the type of the packed array matches the layout and kind arguments, and None
otherwise. Such functions may ease manipulation of the underlying bigarray. Would you be interested in an implementation of such functions?
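A rough sketch of how the Genarray variant of such a function could be written, using type-equality witnesses to recover the static types. Everything here is an assumption made for illustration: the `packed` type is a stand-in for the library's packed array type, and `eq`, `layout_eq`, `kind_eq` and `to_genarray` are hypothetical names.

```ocaml
(* Stand-in for the library's packed array type; assumed for this sketch. *)
type packed = P : ('a, 'b, 'c) Bigarray.Genarray.t -> packed

(* Type-equality witness: matching on Refl teaches the type checker a = b. *)
type (_, _) eq = Refl : ('a, 'a) eq

let layout_eq : type a b. a Bigarray.layout -> b Bigarray.layout -> (a, b) eq option =
  fun l1 l2 ->
  match l1, l2 with
  | Bigarray.C_layout, Bigarray.C_layout -> Some Refl
  | Bigarray.Fortran_layout, Bigarray.Fortran_layout -> Some Refl
  | _, _ -> None

let kind_eq : type a b c d.
    (a, b) Bigarray.kind -> (c, d) Bigarray.kind -> (a * b, c * d) eq option =
  fun k1 k2 ->
  match k1, k2 with
  | Bigarray.Float32, Bigarray.Float32 -> Some Refl
  | Bigarray.Float64, Bigarray.Float64 -> Some Refl
  | Bigarray.Int8_signed, Bigarray.Int8_signed -> Some Refl
  | Bigarray.Int8_unsigned, Bigarray.Int8_unsigned -> Some Refl
  | Bigarray.Int16_signed, Bigarray.Int16_signed -> Some Refl
  | Bigarray.Int16_unsigned, Bigarray.Int16_unsigned -> Some Refl
  | Bigarray.Int32, Bigarray.Int32 -> Some Refl
  | Bigarray.Int64, Bigarray.Int64 -> Some Refl
  | Bigarray.Complex32, Bigarray.Complex32 -> Some Refl
  | Bigarray.Complex64, Bigarray.Complex64 -> Some Refl
  | _, _ -> None (* kinds differ, or a kind this sketch does not list *)

(* Return the underlying array only if its layout and kind match the request. *)
let to_genarray : type a b c.
    c Bigarray.layout -> (a, b) Bigarray.kind -> packed
    -> (a, b, c) Bigarray.Genarray.t option =
  fun layout kind packed ->
  match packed with
  | P arr ->
    (match
       ( layout_eq (Bigarray.Genarray.layout arr) layout,
         kind_eq (Bigarray.Genarray.kind arr) kind )
     with
     | Some Refl, Some Refl -> Some arr
     | _ -> None)
```

With this sketch, a float32 C-layout array packed with P is returned as Some when requested with `Bigarray.c_layout` and `Bigarray.float32`, and as None when requested with `Bigarray.float64` or `Bigarray.fortran_layout`. The ArrayN variants would follow the same pattern with their own packed types.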
Platform: MacOS High Sierra 10.13.6
Python: 3.6
For small files (< 100 MB) Npz seems to work, but numpy reports CRC errors on large files (~1 GB) that I write with it:
open Core
let mk_big_file name npz_file =
  let open Bigarray in
  let arr = Array2.create int8_signed c_layout 10_000_000 2_048 in
  let npz = Npy.Npz.open_out npz_file in
  Exn.protectx npz ~finally:Npy.Npz.close_out ~f:(fun npz ->
    let big_arr = arr |> Bigarray.genarray_of_array2 in
    let () = Npy.Npz.write npz name big_arr in
    ())
let () =
  mk_big_file "a" "a.npz"
Then in IPython:
import numpy as np
a = np.load('a.npz')
a['a']
And you get:
~/miniconda3/lib/python3.6/zipfile.py in _update_crc(self, newdata)
865 # Check the CRC if we're at the end of the file
866 if self._eof and self._running_crc != self._expected_crc:
--> 867 raise BadZipFile("Bad CRC-32 for file %r" % self.name)
868
869 def read1(self, n):
BadZipFile: Bad CRC-32 for file 'a.npy'
I just wrote a Fortran-ordered bigarray using write1. The file starts with
�NUMPY��F�{'descr': '<f8', 'fortran_order': True, 'shape': (184), }
Note that the shape is not a Python tuple (it would have to be (184,)).
numpy refuses to load the file, probably for this reason. Indeed, after I edit the file manually I can import it.
(Maybe add a test case for write/read in all dimensions?)
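For illustration, a minimal sketch of formatting a shape so that the one-dimensional case yields a valid Python tuple. The name `format_shape` is hypothetical; this is not the library's writer code.

```ocaml
(* Hypothetical helper: render bigarray dimensions as a Python tuple literal.
   A single dimension needs a trailing comma, since "(184)" is just an int. *)
let format_shape (dims : int array) : string =
  match Array.to_list dims with
  | [] -> "()"
  | [ d ] -> Printf.sprintf "(%d,)" d
  | ds -> "(" ^ String.concat ", " (List.map string_of_int ds) ^ ")"
```

A write/read round trip over one, two and three dimensions, as suggested above, would exercise each branch.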
Tests fail with python 3.
$ dune runtest
bash alias tests/runtest (exit 2)
(cd _build/default/tests && /bin/bash -e -u -o pipefail -c ./test.exe)
Running: python3 -c 'import numpy as np
arr = np.array([ [ 843124160., 305941280., 741140288. ], [ 531715328., 304366752., 573273728. ] ])
np.save("ptest_g.npy", arr.astype("f4"))'
test_g.npy 09823d7cdfe3688f0032c01a207e5bb0 de364fc012daa8645c888b985f16fe99