
c-cube / iter


Simple iterator abstract datatype, intended to iterate efficiently on collections while performing some transformations.

Home Page: https://c-cube.github.io/iter

License: BSD 2-Clause "Simplified" License


iter's Introduction


Clean and efficient loop fusion for all your iterating needs!

# #require "iter";;
# let p x = x mod 5 = 0 in
  Iter.(1 -- 5_000 |> filter p |> map (fun x -> x * x) |> fold (+) 0);;
- : int = 8345837500

Iter is a simple abstraction over iter functions, intended to iterate efficiently on collections while performing some transformations. Common operations supported by Iter include filter, map, take, drop, append, flat_map, etc. Iter is not designed to be as general-purpose or flexible as Seq. Rather, it aims at providing a very simple and efficient way of iterating on a finite number of values, only allocating (most of the time) one intermediate closure to do so. For instance, it can iterate on the keys or values of a Hashtbl.t without creating a list. Similarly, with flambda, the code above is turned into a single optimized for loop.
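
To make the idea concrete, here is a minimal sketch of what such an abstraction can look like. It assumes the callback-based representation ('a -> unit) -> unit described in the comparison with Seq further down; the names iter, range, filter, map and fold are local stand-ins written for illustration, not the library's source.

```ocaml
(* Illustrative stand-in, not the library's code: an iterator is a function
   that feeds every element to a callback. *)
type 'a iter = ('a -> unit) -> unit

(* [range a b] calls the callback on a, a+1, ..., b. *)
let range a b : int iter = fun k -> for i = a to b do k i done

(* Each combinator wraps the callback; no intermediate collection is built. *)
let filter p (it : 'a iter) : 'a iter = fun k -> it (fun x -> if p x then k x)
let map f (it : 'a iter) : 'b iter = fun k -> it (fun x -> k (f x))

(* [fold] runs the iteration once, updating a local accumulator. *)
let fold f init (it : 'a iter) =
  let acc = ref init in
  it (fun x -> acc := f !acc x);
  !acc

let result =
  range 1 5_000
  |> filter (fun x -> x mod 5 = 0)
  |> map (fun x -> x * x)
  |> fold ( + ) 0
```

Every stage here is a closure wrapped around the next one, which is why an inlining compiler has a chance of collapsing the whole pipeline into one loop.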

Documentation

There is only one important type, 'a Iter.t, and lots of functions built around this type. See the online API for more details on the set of available functions. Some examples can be found below.

The library used to be called Sequence. Some historical perspective is provided in this talk given by @c-cube at some OCaml meeting.

Short Tutorial

Transferring Data

Conversion between n container types would take n² functions. In practice, for a given collection we can at best hope for to_list and of_list. With Iter, if the source structure provides an iter function (or a to_iter wrapper), transferring data becomes:

# let q : int Queue.t = Queue.create();;
val q : int Queue.t = <abstr>
# Iter.( 1 -- 10 |> to_queue q);;
- : unit = ()
# Iter.of_queue q |> Iter.to_list ;;
- : int list = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]

# let s : int Stack.t = Stack.create();;
val s : int Stack.t = <abstr>
# Iter.(of_queue q |> to_stack s);;
- : unit = ()
# Iter.of_stack s |> Iter.to_list ;;
- : int list = [10; 9; 8; 7; 6; 5; 4; 3; 2; 1]

Note how the list of elements is reversed when we transfer them from the queue to the stack.

Another example is extracting the list of values of a hashtable (in an undefined order that depends on the underlying hash function):

# let h: (int, string) Hashtbl.t = Hashtbl.create 16;;
val h : (int, string) Hashtbl.t = <abstr>
# for i = 0 to 10 do
     Hashtbl.add h i (string_of_int i)
  done;;
- : unit = ()

# Hashtbl.length h;;
- : int = 11

# (* now to get the values *)
  Iter.of_hashtbl h |> Iter.map snd |> Iter.to_list;;
- : string list = ["6"; "2"; "8"; "7"; "3"; "5"; "4"; "9"; "0"; "10"; "1"]
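
The reason any iter function is enough can be sketched with the standard library alone. The type iter and the helpers of_hashtbl_values, of_list and to_list below are hypothetical stand-ins defined for this example; they only assume the callback-based representation, not the library's actual code.

```ocaml
(* Stand-in type for illustration. *)
type 'a iter = ('a -> unit) -> unit

(* Wrapping an existing [iter]-style traversal is a one-liner. *)
let of_hashtbl_values (h : ('k, 'v) Hashtbl.t) : 'v iter =
  fun k -> Hashtbl.iter (fun _ v -> k v) h

let of_list (l : 'a list) : 'a iter = fun k -> List.iter k l

(* Collect the elements in iteration order. *)
let to_list (it : 'a iter) : 'a list =
  let acc = ref [] in
  it (fun x -> acc := x :: !acc);
  List.rev !acc

let h : (int, string) Hashtbl.t = Hashtbl.create 16
let () = for i = 0 to 10 do Hashtbl.add h i (string_of_int i) done

(* Hashtable order is unspecified, so sort before inspecting the result. *)
let values = of_hashtbl_values h |> to_list |> List.sort compare
```

With one "from" wrapper and one "to" function per container, n containers need about 2n functions instead of n².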

Replacing for loops

The for loop is a bit limited, and lacks compositionality. Instead, it can be more convenient and readable to use Iter.(--) : int -> int -> int Iter.t.

# Iter.(1 -- 10_000_000 |> fold (+) 0);;
- : int = 50000005000000

# let p x = x mod 5 = 0 in
  Iter.(1 -- 5_000
    |> filter p
    |> map (fun x -> x * x)
    |> fold (+) 0
  );;
- : int = 8345837500

NOTE: with flambda under sufficiently strong optimization flags, such compositions of operators should be compiled to an actual loop with no overhead!
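
For reference, this is the kind of hand-written loop the pipeline above is equivalent to. It is written by hand for comparison and is not actual compiler output; the function name is made up for this example.

```ocaml
(* One loop, one accumulator, no intermediate collection. *)
let sum_of_squares_of_multiples_of_5 n =
  let acc = ref 0 in
  for x = 1 to n do
    if x mod 5 = 0 then acc := !acc + (x * x)
  done;
  !acc
```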

Iterating on sub-trees

A small λ-calculus AST, and some operations on it.

# type term =
  | Var of string
  | App of term * term
  | Lambda of term ;;
type term = Var of string | App of term * term | Lambda of term

# let rec subterms : term -> term Iter.t =
  fun t ->
    let open Iter.Infix in
    Iter.cons t
      (match t with
      | Var _ -> Iter.empty
      | Lambda u -> subterms u
      | App (a,b) ->
        Iter.append (subterms a) (subterms b))
  ;;
val subterms : term -> term Iter.t = <fun>

# (* Now we can define many other functions easily! *)
  let vars t =
    Iter.filter_map
      (function Var s -> Some s | _ -> None)
      (subterms t) ;;
val vars : term -> string Iter.t = <fun>

# let size t = Iter.length (subterms t) ;;
val size : term -> int = <fun>

# let vars_list l = Iter.(of_list l |> flat_map vars);;
val vars_list : term list -> string Iter.t = <fun>
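
A small worked example on a concrete term may help. To stay self-contained it does not use the library: the type iter and the helper to_list are stand-ins, and subterms is rewritten directly against the callback, so treat it as a sketch of the same traversal rather than the code above.

```ocaml
(* Stand-in type for illustration. *)
type 'a iter = ('a -> unit) -> unit

type term =
  | Var of string
  | App of term * term
  | Lambda of term

(* Pre-order traversal: yield the term itself, then its sub-terms. *)
let rec subterms (t : term) : term iter =
  fun k ->
    k t;
    match t with
    | Var _ -> ()
    | Lambda u -> subterms u k
    | App (a, b) -> subterms a k; subterms b k

let to_list (it : 'a iter) =
  let acc = ref [] in
  it (fun x -> acc := x :: !acc);
  List.rev !acc

(* (fun. x) y, with an unnamed binder as in the AST above. *)
let example = App (Lambda (Var "x"), Var "y")

let size t = List.length (to_list (subterms t))

let vars t =
  to_list (subterms t)
  |> List.filter_map (function Var s -> Some s | _ -> None)
```

Here size example is 4 (the application, the lambda, and the two variables) and vars example is ["x"; "y"].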

Permutations

Iter makes it easy to write backtracking code (a non-deterministic function returning several 'a will just return an 'a Iter.t). Here, we generate all permutations of a list by enumerating the ways we can insert an element in a list.

# open Iter.Infix;;
# let rec insert x l = match l with
  | [] -> Iter.return [x]
  | y :: tl ->
    Iter.append
      (insert x tl >|= fun tl' -> y :: tl')
      (Iter.return (x :: l)) ;;
val insert : 'a -> 'a list -> 'a list Iter.t = <fun>

# let rec permute l = match l with
  | [] -> Iter.return []
  | x :: tl -> permute tl >>= insert x ;;
val permute : 'a list -> 'a list Iter.t = <fun>

# permute [1;2;3;4] |> Iter.take 2 |> Iter.to_list ;;
- : int list list = [[4; 3; 2; 1]; [4; 3; 1; 2]]
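
To see why this works as backtracking, the combinators involved can be sketched in a few lines. The definitions of return, append, map and flat_map below are stand-ins written for this example (the library's operators >|= and >>= are assumed to play the roles of map and flat_map); they are not the library's source.

```ocaml
(* Stand-in type for illustration. *)
type 'a iter = ('a -> unit) -> unit

let return x : 'a iter = fun k -> k x
let append (a : 'a iter) (b : 'a iter) : 'a iter = fun k -> a k; b k
let map f (it : 'a iter) : 'b iter = fun k -> it (fun x -> k (f x))

(* Bind: run [f] on every element and forward all of its results. *)
let flat_map f (it : 'a iter) : 'b iter = fun k -> it (fun x -> f x k)

(* All ways of inserting [x] into [l]. *)
let rec insert x l = match l with
  | [] -> return [x]
  | y :: tl ->
    append (map (fun tl' -> y :: tl') (insert x tl)) (return (x :: l))

let rec permute l = match l with
  | [] -> return []
  | x :: tl -> flat_map (insert x) (permute tl)

let to_list it =
  let acc = ref [] in
  it (fun p -> acc := p :: !acc);
  List.rev !acc

let all = to_list (permute [1; 2; 3])
```

Each "choice point" is simply a nested call to the callback, so every alternative is explored in turn without building the search tree in memory.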

Advanced example

The module examples/sexpr.mli exposes the interface of the S-expression example library. It requires OCaml >= 4.0 to compile, because of the GADT structure used in the monadic parser combinators part of examples/sexpr.ml. Be warned that this code is quite obscure.

Comparison with Seq from the standard library, and with Gen

  • Seq is an external iterator. It means that the code which consumes some iterator of type 'a Seq.t is the one which decides when to go to the next element. This gives a lot of flexibility, for example when iterating on several iterators at the same time:

    let rec zip a b () = match a (), b () with
      | Seq.Nil, _
      | _, Seq.Nil -> Seq.Nil
      | Seq.Cons (x, a'), Seq.Cons (y, b') -> Seq.Cons ((x, y), zip a' b')
  • Iter is an internal iterator. When one wishes to iterate over an 'a Iter.t, one has to give a callback f : 'a -> unit that is called in succession over every element of the iterator. Control is not handed back to the caller before the whole iteration is over. This makes zip impossible to implement. However, the type 'a Iter.t is general enough that it can be extracted from any classic iter function, including from data structures such as Map.S.t or Set.S.t or Hashtbl.t; one cannot obtain an 'a Seq.t from these without having access to the internal data structure.

  • Gen (from the gen library) is an external iterator, like Seq, but it is imperative, mutable, and consumable (you can't iterate twice on the same 'a Gen.t). It looks a lot like iterators in Rust/Java/… and can be pretty efficient in some cases. Since you control iteration, you can also write map2, for_all2, etc., but only with linear use of input generators (since you can traverse them only once). That requires some trickery for cartesian_product (like storing already produced elements internally).

In short, 'a Seq.t is more expressive than 'a Iter.t, but it also requires more knowledge of the underlying source of items. For some operations such as map or flat_map, Iter is also extremely efficient and will, if flambda permits, be totally removed at compile time (e.g. Iter.(--) becomes a for loop, and Iter.filter becomes an if test).
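
The asymmetry between the two styles can be sketched as follows. The type iter and the helpers of_seq and to_seq_buffered are stand-ins defined for this example (the second name in particular is hypothetical), using only the standard Seq and List modules.

```ocaml
(* Stand-in type for illustration. *)
type 'a iter = ('a -> unit) -> unit

(* Seq -> iter is trivial: push every element to the callback. *)
let of_seq (s : 'a Seq.t) : 'a iter = fun k -> Seq.iter k s

(* iter -> Seq: the producer cannot be paused, so one simple option is to
   run the whole iteration up front and buffer the elements. *)
let to_seq_buffered (it : 'a iter) : 'a Seq.t =
  let acc = ref [] in
  it (fun x -> acc := x :: !acc);
  List.to_seq (List.rev !acc)

let round_trip =
  List.of_seq (to_seq_buffered (of_seq (List.to_seq [1; 2; 3])))
```

The first direction costs nothing, while the second gives up laziness and allocates a list, which is the price of handing control back to the consumer.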

For more details, you can read http://gallium.inria.fr/blog/generators-iterators-control-and-continuations/ or see the slides about Iter by me (c-cube) when Iter was still called Sequence.

Build

  1. via opam: opam install iter
  2. manually (needs OCaml >= 4.02.0): make all install

If you have qtest installed, you can build and run tests with

$ make test

If you have benchmarks installed, you can build and run benchmarks with

$ make benchs

To see how to use the library, check the following tutorial. The tests and examples directories also have some examples, but they're a bit arcane.

License

Iter is available under the BSD license.

iter's People

Contributors

bridgethemasterbuilder, c-cube, copy, drup, ghulette, jan-pi-sona-lili, mooreryan, oliviernicole, rgrinberg, struktured


iter's Issues

Can't run tests with OUnit2 and qcheck-0.13

OUnit2 is now the replacement for OUnit.
[builder@localhost ocaml-iter-1.2.1]$ make test
Done: 53/58 (jobs: 1)File "qtest/run_qtest.ml", line 3, characters 5-11:
3 | open OUnit2;;
^^^^^^
Error: Unbound module OUnit2
Hint: Did you mean Unit?
ocamlc qtest/.run_qtest.eobjs/byte/run_qtest.{cmi,cmo,cmt} (exit 2)
(cd _build/default && /usr/bin/ocamlc.opt -w @1[email protected]@30..39@[email protected]@[email protected] -strict-sequence -strict-formats -short-paths -keep-locs -warn-error -a+8 -safe-string -w -33 -g -bin-annot -I qtest/.run_qtest.eobjs/byte -I /usr/lib64/ocaml/bytes -I /usr/lib64/ocaml/qcheck-core -I /usr/lib64/ocaml/result -I src/.iter.objs/byte -I src/.iter.objs/native -no-alias-deps -opaque -o qtest/.run_qtest.eobjs/byte/run_qtest.cmo -c -impl qtest/run_qtest.ml)
make: *** [Makefile:8: test] Error 1

After correcting qtest/dune, the error looks like this:
ocaml-iter-1.2.1]$ make test
Done: 53/58 (jobs: 1)File "../src/Iter.ml", line 32, characters 5-7:
Error: This pattern matches values of type unit
but a pattern was expected which matches values of type test_ctxt
ocamlc qtest/.run_qtest.eobjs/byte/run_qtest.{cmi,cmo,cmt} (exit 2)
(cd _build/default && /usr/bin/ocamlc.opt -w @1[email protected]@30..39@[email protected]@[email protected] -strict-sequence -strict-formats -short-paths -keep-locs -warn-error -a+8 -safe-string -w -33 -g -bin-annot -I qtest/.run_qtest.eobjs/byte -I /usr/lib64/ocaml/bytes -I /usr/lib64/ocaml/ounit2 -I /usr/lib64/ocaml/ounit2/advanced -I /usr/lib64/ocaml/qcheck-core -I /usr/lib64/ocaml/result -I src/.iter.objs/byte -I src/.iter.objs/native -no-alias-deps -opaque -o qtest/.run_qtest.eobjs/byte/run_qtest.cmo -c -impl qtest/run_qtest.ml)
make: *** [Makefile:8: test] Error 1

zipping two sequences?

I feel like I must be missing something obvious. I could not find a function 'a t -> 'b t -> ('a * 'b) t which zips the sequences, stopping when the first one runs out, or something which does the same thing but combines the sequences with a function 'a->'b->'c instead. I found only more complicated combinations. ??
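
As the comparison with Seq in the README explains, an internal iterator cannot be paused, so a general zip of two iterators is not directly expressible. One possible workaround, when one of the two sources can be provided as a Seq.t, is sketched below. The type iter and the helpers zip_with_seq, of_list and to_list are hypothetical stand-ins written for this example; this is not a function the library is known to provide.

```ocaml
(* Stand-in type for illustration. *)
type 'a iter = ('a -> unit) -> unit

(* Drive the internal iterator and pull from the Seq on each element.
   A local exception stops the traversal once the Seq is exhausted. *)
let zip_with_seq (it : 'a iter) (s : 'b Seq.t) : ('a * 'b) iter =
  fun k ->
    let exception Stop in
    let rest = ref s in
    try
      it (fun x ->
        match !rest () with
        | Seq.Nil -> raise Stop
        | Seq.Cons (y, tl) ->
          rest := tl;
          k (x, y))
    with Stop -> ()

let of_list (l : 'a list) : 'a iter = fun k -> List.iter k l

let to_list (it : 'a iter) =
  let acc = ref [] in
  it (fun x -> acc := x :: !acc);
  List.rev !acc

let zipped = to_list (zip_with_seq (of_list [1; 2; 3]) (List.to_seq ["a"; "b"]))
```

The result stops at the shorter of the two inputs. If neither side is available as a Seq.t, one of them has to be buffered first (for instance into a list).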

make doc fails

Also triggered by installation through opam when OPAMBUILDDOC=true is set.

File "src/Iter.mli", line 4, characters 1-35:
'1': bad section level (2-4 allowed)
File "src/Iter.mli", line 42, characters 51-62:
'@since' must begin on its own line
File "src/Iter.mli", line 490, characters 17-62:
'@since' must begin on its own line
odoc: internal error, uncaught exception:
      Parser___Helpers.InvalidReference("unknown qualifier `(-'")

File "src/IterLabels.mli", line 5, characters 1-35:
'1': bad section level (2-4 allowed)
File "src/IterLabels.mli", line 18, characters 51-62:
'@since' must begin on its own line
File "src/IterLabels.mli", line 461, characters 17-62:
'@since' must begin on its own line
odoc: internal error, uncaught exception:
      Parser___Helpers.InvalidReference("unknown qualifier `(-'")

Sequence.(--^) inconsistent with CCList.(--^)

Sequence.(--^) returns a reversed range, while CCList.(--^) returns a range with the right bound excluded.

See:

CCList.(--^) 0 5;;
(* val _1 : int list = [0; 1; 2; 3; 4] *)
Sequence.(--^) 5 0 |> Sequence.to_list;;
(* val _2 : int list = [5; 4; 3; 2; 1; 0] *)
Sequence.(--^) 0 5 |> Sequence.to_list;;
(* val _3 : int list = [] *)

I would suggest deprecating Sequence.(--^), since the CCList behaviour is more useful and the operator looks like a half-open range (the hat being on the right; I would expect a reversed range to include an arrow of some sort).

opam config needs qtest dependency?

Not a big deal, but, I tried to install sequence via opam -- I did not have qtest installed -- and it fails:

#=== ERROR while installing sequence.0.10 =====================================#
opam-version 1.2.2
os linux
command make build
path /usr/app/lib/opam/4.03.0/build/sequence.0.10
compiler 4.03.0
exit-code 2
env-file /usr/app/lib/opam/4.03.0/build/sequence.0.10/sequence-28014-c61e7b.env
stdout-file /usr/app/lib/opam/4.03.0/build/sequence.0.10/sequence-28014-c61e7b.out
stderr-file /usr/app/lib/opam/4.03.0/build/sequence.0.10/sequence-28014-c61e7b.err
stdout
ocaml setup.ml -build
make[1]: Entering directory '/usr/app/lib/opam/4.03.0/build/sequence.0.10'
make[1]: Leaving directory '/usr/app/lib/opam/4.03.0/build/sequence.0.10'
stderr
make[1]: *** [Makefile:55: qtest-gen] Error 2
E: Failure("Command 'make qtest-gen' terminated with error code 2")
make: *** [Makefile:7: build] Error 1

After installing qtest, sequence installs with no problem.

Thanks for all the great code btw.

Iter.IO.write_lines with "empty" iter still writes a newline char to output file

Not sure if this is a bug or intended behavior, but on an "empty" iter, the write_lines function will write a newline to the output file, rather than writing nothing.

Here is a simplified example of what's happening. The snoc/intersperse bit is from the write_bytes_line function, which is called by write_lines.

# let i = Iter.of_list [] in Iter.snoc (Iter.intersperse "\n" i) "\n" |> Iter.to_list;;
- : string list = ["\n"]

# let i = Iter.of_list ["a"] in Iter.snoc (Iter.intersperse "\n" i) "\n" |> Iter.to_list;;
- : string list = ["a"; "\n"]

# let i = Iter.of_list ["a"; "b"] in Iter.snoc (Iter.intersperse "\n" i) "\n" |> Iter.to_list;;
- : string list = ["a"; "\n"; "b"; "\n"]

My expectation was that for an empty iter, nothing would be written to the file, but instead a newline is written. Is this the intended behavior?
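
For comparison, here is a sketch of one alternative that emits nothing for an empty input: terminate each line instead of separating lines and appending a final newline. The type iter and the helpers terminate_lines, of_list and to_list are hypothetical stand-ins for this example; whether the library should behave this way is a question for the maintainers.

```ocaml
(* Stand-in type for illustration. *)
type 'a iter = ('a -> unit) -> unit

let of_list (l : 'a list) : 'a iter = fun k -> List.iter k l

let to_list (it : 'a iter) =
  let acc = ref [] in
  it (fun x -> acc := x :: !acc);
  List.rev !acc

(* Emit a newline after every line, so an empty input yields no output. *)
let terminate_lines (it : string iter) : string iter =
  fun k -> it (fun line -> k line; k "\n")

let empty_case = to_list (terminate_lines (of_list []))
let two_lines = to_list (terminate_lines (of_list ["a"; "b"]))
```

For non-empty inputs the chunks are the same as with snoc/intersperse; only the empty case differs.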

Missing test dependency

Seems like some dependencies for the build-test opam target are not listed, for instance qcheck. This breaks the build when OPAMBUILDTEST=true.

Is it the normal behavior for group_by?

Hi

I had unexpected results with group_by, but I'm not sure I'm right to find them unexpected. Anyway, I expected to get for s2 all pairs with the same first item in the same list, i.e. the same result as s3 when using group_by... Am I doing something wrong here (perhaps because of strings and hashing?)?

utop # module S = Sequence;;
module S = Sequence
utop # open Containers;;
utop # let s = 
List.product (fun x y -> (x, y)) [ "a"; "b"; "c" ] [ "d"; "e"; "f" ]
|> S.of_list;;
val s : (string * string) S.t = <fun>
utop # let print s = Format.printf "%a" 
(Format.within "[" "]" @@ S.pp_seq 
@@ Format.within "(" ")" @@ Pair.pp String.print String.print) s;;
val print : (string * string) S.t -> unit = <fun>
utop # print s;;
[("a", "d"), ("a", "e"), ("a", "f"), ("b", "d"), ("b", "e"), ("b", "f"), ("c", 
"d"), ("c", "e"), ("c", 
"f")]- : unit = ()
utop # let s2 = S.group_by ~hash:(Hash.pair String.hash String.hash)
~eq:(fun (a1, _) (b1, _) -> String.equal a1 b1) s;;
val s2 : (string * string) list S.t = <fun>
utop # let print2 s =
S.iter Format.(printf "%a@\n" (Format.within "[" "]" @@ list @@ within "(" ")" 
@@ hbox @@ Pair.pp String.print String.print)) s;;
val print2 : (string * string) list S.t -> unit = <fun>
utop # print2 s2;;
[("b", "d")]
[("a", "e")]
[("b", "f")]
[("a", "d")]
[("c", "d")]
[("a", "f")]
[("c", "f")]
[("c", "e")]
[("b", "e")]
- : unit = ()
utop # let s3 = S.sort_uniq ~cmp:(Pair.compare String.compare String.compare) s 
|> S.group_succ_by ~eq:(fun (a1, _) (b1, _) -> String.equal a1 b1);;
val s3 : (string * string) list S.t = <fun>
utop # print2 s3;;
[("a", "f"), ("a", "e"), ("a", "d")]
[("b", "f"), ("b", "e"), ("b", "d")]
[("c", "f"), ("c", "e"),
("c", "d")]
- : unit = ()
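
A likely explanation, offered here as an assumption rather than a confirmed diagnosis: the ~hash argument hashes both components of each pair, while ~eq compares only the first one. Hash-based grouping generally needs elements that are equal under eq to have the same hash (this is the contract of the standard Hashtbl.HashedType). The sketch below illustrates the point with a stand-in group_by over plain lists, built on Hashtbl.Make; it is not the library's implementation.

```ocaml
(* Stand-in grouping function for illustration, built on the stdlib. *)
let group_by (type a) ~(hash : a -> int) ~(eq : a -> a -> bool)
    (l : a list) : a list list =
  let module Tbl = Hashtbl.Make (struct
    type t = a
    let equal = eq
    let hash = hash
  end) in
  let tbl = Tbl.create 16 in
  List.iter
    (fun x ->
      let group = try Tbl.find tbl x with Not_found -> [] in
      Tbl.replace tbl x (x :: group))
    l;
  Tbl.fold (fun _ group acc -> group :: acc) tbl []

let pairs = [ ("a", "d"); ("a", "e"); ("b", "d") ]

(* Hash only the first component, consistently with [eq]. *)
let groups =
  group_by
    ~hash:(fun (a, _) -> Hashtbl.hash a)
    ~eq:(fun (a1, _) (b1, _) -> String.equal a1 b1)
    pairs

let group_sizes = List.sort compare (List.map List.length groups)
```

With a hash computed on the whole pair, elements sharing a first component usually land in different buckets and are never compared, which would match the singleton groups shown above.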

split into several modules

  • use module alias to reduce binary bloat
  • use (wrapped true) in dune
  • move all heavy components (relational combinators, IO, mlist) into separate modules
  • keep only the lightweight and common combinators in Iter (probably via an include)

Compatibility with mdx 2.0.0

It looks like the mdx binary will no longer be part of the installation of mdx (the new binary is called ocaml-mdx).
However it seems there is a simpler method with dune: https://dune.readthedocs.io/en/stable/dune-files.html?highlight=mdx#mdx-since-2-4

#=== ERROR while compiling iter.1.2.1 =========================================#
# context              2.1.0 | linux/x86_64 | ocaml-variants.4.14.0+trunk | file:///home/opam/opam-repository
# path                 ~/.opam/4.14/.opam-switch/build/iter.1.2.1
# command              ~/.opam/opam-init/hooks/sandbox.sh build dune runtest -p iter -j 31
# exit-code            1
# env-file             ~/.opam/log/iter-2311-bf1f39.env
# output-file          ~/.opam/log/iter-2311-bf1f39.out
### output ###
# File "dune", line 6, characters 15-18:
# 6 |           (run mdx test %{deps})
#                    ^^^
# Error: Program mdx not found in the tree or in PATH
#  (context: default)
