ocaml-flambda / flambda-backend

The Flambda backend project for OCaml

flambda-backend's Introduction

This repository is for more experimental work, of production quality, on the middle end and backend of the OCaml compiler. This is also the home of the Flambda 2 optimiser and the Cfg backend.

The Flambda backend is currently based on OCaml 5.1 and supports both the OCaml 4 and OCaml 5 runtime systems.

The following gives basic instructions for getting set up. Please see HACKING.md for more detailed instructions if you want to develop in this repo. That file also contains instructions for installing the Flambda backend compiler in a way that it can be used to build OPAM packages.

One-time setup for dev work or installation

Currently tested only on Linux/x86-64 and macOS/x86-64.

One-time setup (you can also use other 4.14.x releases):

$ opam switch 4.14.1  # or "opam switch create 4.14.1" if you haven't got that switch already
$ eval $(opam env)
$ opam install dune.3.15.2 menhir.20210419

You will probably then want to fork the ocaml-flambda/flambda-backend repo to your own GitHub org.

Branching and configuring

Use normal commands to make a branch from the desired upstream branch (typically main), e.g.:

$ git clone https://github.com/ocaml-flambda/flambda-backend
$ cd flambda-backend
$ git checkout -b myfeature origin/main

The Flambda backend tree has to be configured before building. The configure script is not checked in; you have to run autoconf. For example:

$ autoconf
$ ./configure --prefix=/path/to/install/dir

Building and installing

To build and install the Flambda backend, which produces a compiler installation directory whose layout is compatible with upstream, run:

$ make install

flambda-backend's People

Contributors

alanechang, antalsz, apilatjs, azewierzejew, ccasin, chambart, dkalinichenko-js, dvulakh, ekdohibs, forestryks, freemagma, gbury, goldfirere, gretay-js, keryan-dev, liam923, lpw25, lthls, lukemaurer, mshinwell, nathanreb, ncik-roberts, oscarpi, poechsel, riaqn, rleshchinskiy, stedolan, thenumbat, trishume, xclerc

flambda-backend's Issues

Bug in Dissect_letrec with partial applications

There is an example in the JS tree which has a let rec binding various full and partial applications (including polymorphic recursive definitions). This fails to compile with the "Unallowed recursive access" error (dissect_letrec.ml line 512 at the time of writing). The error cites an Lapply node.

Possible optimisation: delaying allocations

Here is an optimisation that I would like the compiler to do, which I was somewhat surprised that none of Closure / Flambda1 / Flambda2 currently perform. Currently, this function:

let go c =
  let s = Some c in
  if c < 0 then None else s

compiles to this cmm:

(function camlTest__go_81 (c/83: val)
 (let s/84 (alloc 1024 c/83) (if (< c/83 1) 1 s/84)))

which unconditionally allocates. It would be better to substitute away the let, moving the allocation under the if, so that it only allocates when c >= 0 (the branch that actually returns s).
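As a hand-written illustration of the intended result (this is not compiler output), the sketch below places the original function next to a version in which the allocation has been moved under the if. The name go_sunk is made up for this example.

```ocaml
(* Original: [Some c] is allocated before the test, on every call. *)
let go c =
  let s = Some c in
  if c < 0 then None else s

(* Hand-sunk version: [Some c] is allocated only in the [else] branch. *)
let go_sunk c =
  if c < 0 then None else Some c
```

Both functions return the same results; the difference is only in when the allocation happens.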

Some more realistic examples:

  • Local functions:

    let f x ys =
      let aux s = s + x in
      match ys with
      | [] -> 42
      | y :: ys -> ... aux ...

    Ideally, the closure aux would only be allocated in the _ :: _ branch.

  • Inlined functions:

    let[@inline] log_change ch = if !logging_enabled then log := ch :: !log
    let do_stuff n m = log_change (Do_stuff (n, m)); n + m

    Ideally, the allocation Do_stuff (n, m) would only occur when logging_enabled is true.
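To make the inlined-function case concrete, here is a self-contained sketch of what one would hope to get after inlining log_change and sinking the allocation. The type change, the log and logging_enabled references, and the name do_stuff_sunk are assumptions added for illustration; the rewritten function is written by hand and is not what any of the optimisers currently emit.

```ocaml
(* Hypothetical definitions so that the example is self-contained. *)
type change = Do_stuff of int * int

let logging_enabled = ref false
let log : change list ref = ref []

let[@inline] log_change ch = if !logging_enabled then log := ch :: !log

(* As written in the source: [Do_stuff (n, m)] is built before the call. *)
let do_stuff n m = log_change (Do_stuff (n, m)); n + m

(* Desired shape after inlining and sinking: the block is built only
   when logging is enabled. *)
let do_stuff_sunk n m =
  if !logging_enabled then log := Do_stuff (n, m) :: !log;
  n + m
```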

Port #10401: Fix `include` and with `constraints` handling of "ghost" components

#10401

From the Changelog:

- #6654, #9774, #10401: make `include` and with `constraints` handle correctly
  the ghost components of signatures. For instance, in

    include struct class c = object end end type c

   the type `c` shadows the `class c` to avoid shadowing only the ghost type
   c introduced by the class.
  (Florian Angeletti, report by Eduardo Rafael, review by Gabriel Scherer)

Dune library for the native runtime linker errors

The dune file at ocaml/runtime/dune defines a dummy library that links with the runtime.
As a result, it tries to create runtime_native.cmxs, to link dynamically with the static runtime library.
This should fail, but since the OCaml code linked in this library doesn't even use the runtime it actually works.

But with -Oclassic, the code for the dummy module actually allocates, so it really depends on the runtime and we get colourful linker warnings complaining that we're trying to link a static library in a shared object.

I don't know why these dummy libraries exist, but I don't think we should build shared versions of them.

Compiler is linking systhreads

This is apparently what stops ocamldebug working on Flambda backend compilers. The compiler wasn't intended to link systhreads; maybe there is a default in Dune that is causing it?

Non-terminating module initialisers cause linker errors

It looks like the let-symbol bindings for module blocks are being dropped if the module initialiser is non-terminating. It's not entirely clear to me how to fix this without compile-time performance overhead. There seems to be a relevant comment in closure_conversion.ml on line 1351. The following should reproduce the linker error:

x.ml:  let x = while true do () done
x.mli: val x : unit
y.ml:  let a = X.x, X.x

-Oclassic makes all calls indirect, breaking probes

Currently -Oclassic gives all calls the kind Indirect_unknown_arity, which has the correct runtime behavior in general. Unfortunately, the translation from flambda2 to Cmm requires that any application with a probe_name be a direct call, so this breaks any code that uses probes.

Run the testsuite with the _build1 compiler

It would be useful to be able to run the testsuite using the compiler that resides in _build1 (which would probably require the stdlib in _build2), and make that feature available as a makefile target. That would help significantly when the flambda2 compiler generates code that produces segfaults (in which case the build of the compiler in _build2 might not succeed, or can succeed but result in a compiler that segfaults, which is not easy to debug).

Performance problem in Flambda1 Inline_and_simplify with large partial applications

The following code takes ~40s to compile under flambda1, with most of the time spent in Inline_and_simplify:

let many_args
      ~arg0 ~arg1 ~arg2 ~arg3 ~arg4 ~arg5 ~arg6 ~arg7 ~arg8 ~arg9
      ~arg10 ~arg11 ~arg12 ~arg13 ~arg14 ~arg15 ~arg16 ~arg17 ~arg18 ~arg19
      ~arg20 ~arg21 ~arg22 ~arg23 ~arg24 ~arg25 ~arg26 ~arg27 ~arg28 ~arg29
      ~arg30 ~arg31 ~arg32 ~arg33 ~arg34 ~arg35 ~arg36 ~arg37 ~arg38 ~arg39
      ~arg40 ~arg41 ~arg42 ~arg43 ~arg44 ~arg45 ~arg46 ~arg47 ~arg48 ~arg49
      ~arg50 ~arg51 ~arg52 ~arg53 ~arg54 ~arg55 ~arg56 ~arg57 ~arg58 ~arg59
       () = ()
let some_args = many_args ~arg59:12
let more_args = some_args ~arg58:27

This problem has been around for a while. Since the recent merge of local-allocs, it also occurs when all of the arguments in many_args are made optional, due to a change in Lambda that makes optional and mandatory labelled arguments compile in mostly the same way. (This change made the issue more prevalent, since functions with very many optional arguments seem to be more common than functions with very many mandatory labelled arguments.)

Port #10475: Change to Filename.chop_suffix

#10475

From the Changelog:

#7812, #10475: `Filename.chop_suffix name suff` now checks that `suff`
  is actually a suffix of `name` and raises Invalid_argument otherwise.
  (Xavier Leroy, report by whitequark, review by David Allsopp)
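A short sketch of how calling code can cope with the stricter check, assuming a compiler that includes this change. The helper chop_suffix_safe is a made-up name; Filename.chop_suffix_opt is the existing standard library function that returns an option instead of raising.

```ocaml
(* Well-formed use: ".ml" really is a suffix of "foo.ml". *)
let base = Filename.chop_suffix "foo.ml" ".ml"

(* Hypothetical defensive wrapper: map the Invalid_argument raised by
   [Filename.chop_suffix] to [None]. *)
let chop_suffix_safe name suff =
  match Filename.chop_suffix name suff with
  | stem -> Some stem
  | exception Invalid_argument _ -> None

(* Standard library alternative that never raises. *)
let stem_opt = Filename.chop_suffix_opt ~suffix:".mli" "foo.ml"
```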

Port #10539: Field kinds should be kept when copying types

#10539

From the Changelog:

#10539: Field kinds should be kept when copying types
  Losing the sharing meant that one could desynchronize them between several
  occurrences of self, allowing a method to be both public and hidden,
  which broke type soundness.
  (Jacques Garrigue, review by Leo White)

Refactor cfg_dataflow, cfg_liveness, cfgize

This issue is to keep track of CFG-related improvements identified during the review of PR#547.

  • remove duplicate "is_pure" after #555 merges
  • liveness: Poptrap, Prolog, Pushtrap live across should be set to "value" not empty. See comment.
  • forward: use less_equal not compare for fixpoint check
  • backward: simplify special treatment of trap handler blocks. See comment.
  • backward: change the types of transformers (don't return instruction) and the result (per instruction not per block)
  • merge forward and backward if possible
  • Cfgize.Trap_depth_and_exn : simplify exceptional_successor, it's always (pop stack) if the block can raise. See comment.
  • reenable cfg_equivalence check for layout
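As one reading of the "use less_equal not compare" item, the sketch below shows a toy forward worklist analysis whose fixpoint test asks whether the new value is already below the stored one in the lattice order, rather than testing equality through a total compare. Everything here (the Domain module, the forward function and its parameters, the example graph) is hypothetical and far simpler than cfg_dataflow.

```ocaml
module IntSet = Set.Make (Int)

(* Hypothetical toy domain: sets of block ids ordered by inclusion. *)
module Domain = struct
  type t = IntSet.t
  let bot = IntSet.empty
  let join = IntSet.union
  let less_equal = IntSet.subset
end

(* Forward worklist analysis over blocks numbered 0 .. n-1.
   [succs b] lists the successors of block [b]; [transfer b x] is the
   value at the exit of [b] given the value [x] at its entry. *)
let forward ~n ~entry ~init ~succs ~transfer =
  let input = Array.make n Domain.bot in
  input.(entry) <- init;
  let work = Queue.create () in
  Queue.add entry work;
  while not (Queue.is_empty work) do
    let b = Queue.pop work in
    let out = transfer b input.(b) in
    List.iter
      (fun s ->
        (* Fixpoint check: propagate only when [out] adds information. *)
        if not (Domain.less_equal out input.(s)) then begin
          input.(s) <- Domain.join input.(s) out;
          Queue.add s work
        end)
      (succs b)
  done;
  input

(* Example graph: 0 -> 1 -> 2 -> 1; each block adds its own id. *)
let result =
  forward ~n:3 ~entry:0 ~init:Domain.bot
    ~succs:(function 0 -> [1] | 1 -> [2] | _ -> [1])
    ~transfer:(fun b x -> IntSet.add b x)
```

With a less_equal test, a successor is revisited only when the incoming value is not already covered, which is the condition the lattice order expresses directly.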

Flambda2 test list

Currently, the list of the tests to run with the flambda2 variant
is controlled by the file named "testsuite/flambda2-test-list".
As noted in #494, it might make sense to rather use ocamltest's
mechanism to determine whether a test should be run or not.

Fix menhir build for flambda2

We need to fix the menhir build for flambda2 so the parser rebuilds like everything else. I'm not certain of the versioning problems here but I think what's needed is to depend on a newer version of menhir, so that other tools (e.g. ocamlformat?) which might have specific version constraints are still ok.

This work should also include suppressing the many errors that menhir produces during the build.

Code_id <foo> is present with code metadata ... but imported with code metadata ...

This happened with jenga in polling mode, the file compiled fine afterwards. Might be a problem with the JS jenga build rules, or possibly something on the compiler side. We also saw a case a few weeks back where a .cmx load failed; I'm not certain but I suspect it probably read a malformed (truncated?) file. Might be another instance of the same problem.

Port #10568: remove Obj.marshal and Obj.unmarshal

#10568

From the Changelog:

#10568: remove Obj.marshal and Obj.unmarshal
  (these functions have been deprecated for a while and are superseded
   by the functions from module Marshal)
  (François Pottier, review by Gabriel Scherer and Kate Deplaix)
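For code that still calls the removed functions, a minimal sketch of the replacement using the Marshal module follows. The value names are illustrative only; as with any unmarshalling, the type annotation on the result is the caller's responsibility and is not checked at runtime.

```ocaml
let original = [ (1, "one"); (2, "two") ]

(* Serialise to a string with no extra flags. *)
let bytes = Marshal.to_string original []

(* Read the value back, starting at offset 0 of the string. *)
let restored : (int * string) list = Marshal.from_string bytes 0
```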

Port #10207 and #10312: deprecate consecutive letters in warning specifications

#10207, #10312

From the Changelog:

- #10207, #10312: deprecate consecutive letters in warning specifications.
  The form `-w aBcD` was equivalent to `-w -a+b-c+d`.
  It is now deprecated to improve the coexistence with warning mnemonics.
  However, using isolated single letter is not deprecated to allow the form
  `-w "A-32..50-45"`.
  (Florian Angeletti, review by Damien Doligez and Gabriel Scherer)

Trap actions referencing deleted continuations

This was seen whilst debugging another patch. It looks like continuations named in trap actions may sometimes be left behind (with the continuation definition having been deleted) if an Invalid occurs:

>> Fatal error: Continuation k88 not found in env

Context is: translating function camlCmi_format__read_cmi_1_6_code to Cmm with return cont k213, exn cont k214 and body:
((let
  (name/639 = filename/635)
  ((apply
    ((Stdlib.camlStdlib__open_in_gen_119〈k219〉《k214》
      (Stdlib.camlStdlib__const_block592 0 filename/635))
     (call_kind
      (Direct (code_id camlStdlib__open_in_gen_30_120_code)
       (closure_id Stdlib.open_in_gen/30) (return_arity 𝕍)))
     (dbg ...)
     (inline Default_inline)
     (inlining_state
      (depth 1, arguments
       ((max_inlining_depth 1) (call_cost 0.625000) (alloc_cost 0.875000)
        (prim_cost 0.375000) (branch_cost 0.625000)
        (indirect_call_cost 0.500000) (poly_compare_cost 1.250000)
        (small_function_size 10) (large_function_size 10)
        (threshold 10.000000)))) (probe_name ())))
   k219 (param/640 ∷ 𝕍) #One: goto k218))
 k218 #One:
  (push_trap k88 then goto k220
   k220 #One: Invalid Halt_and_catch_fire))

Add some checks on closure offsets

There are a few undocumented invariants about sets of closures that must be respected, else the GC might do weird things. One of them is that there should not be any gap between the start of a set of closures and the first closure; in other words, the first closure slot in a set of closures should have offset 0.

Specifically, this code in the major GC assumes there is no gap at the start of a set of closures, since it reads the 2nd field of the set of closures:

if (Tag_val(block) == Closure_tag) {
  /* Skip the code pointers and integers at beginning of closure;
     start scanning at the first word of the environment part. */
  /* It might be the case that [mark_stack_push] has been called
     while we are traversing a closure block but have not enough
     budget to finish the block. In that specific case, we should not
     update [m.offset] */
  if (offset == 0)
    offset = Start_env_closinfo(Closinfo_val(block));

and for the Closinfo macro:

#define Closinfo_val(val) Field((val), 1) /* Arity and start env */

Currently this is unlikely to happen, though not impossible: the probability depends on the degree of sharing of closure_id between sets of closures, which is quite low at present.

It would be hard to ensure that this is enforced by the current greedy algorithm, and we may need another/better algorithm at some point, but in the meantime, we should at the very least add a check so that if it happens, we get a compile-time error rather than a segfault at runtime.
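The kind of check being asked for could look roughly like the sketch below. It works on a deliberately simplified, hypothetical model (a list of slot records with a name and an offset in words) and does not use the compiler's actual data structures for closure offsets.

```ocaml
(* Hypothetical, simplified layout: each function slot of a set of
   closures is described by a name and its offset in words. *)
type slot = { name : string; offset : int }

(* Returns [Ok ()] when the lowest-offset closure sits at offset 0, and an
   error message otherwise, so the caller can fail at compile time. *)
let check_first_closure_at_zero (slots : slot list) : (unit, string) result =
  match List.sort (fun a b -> compare a.offset b.offset) slots with
  | [] -> Ok ()
  | { offset = 0; _ } :: _ -> Ok ()
  | { name; offset } :: _ ->
    Error
      (Printf.sprintf "first closure %s is at offset %d, expected 0"
         name offset)
```

Returning a result rather than raising leaves the choice of how to report the failure (for instance as a fatal compiler error) to the caller.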

Port #10039: Safepoints

#10039

From the Changelog:

- #10039: Safepoints
  Add poll points to native generated code. These are effectively
  zero-sized allocations and fix some signal and remembered set
  issues. Also multicore prerequisite.
  (Sadiq Jaffer, Stephen Dolan, Damien Doligez, Xavier Leroy,
   Anmol Sahoo, Mark Shinwell, review by Damien Doligez, Xavier Leroy,
   and Mark Shinwell)

Get_tag primitive in Flambda 2 with multicore

This primitive is deemed completely pure in Flambda 2, which is what we want, but we will need to be certain there are no problematic interactions with lazy tag-changing operations in multicore.

Port #10140: enable warning 6 [labels-omitted] by default

#10140

From the Changelog:

* #10118, #10140: enable warning 6 [labels-omitted] by default.
  The following now warns:
    let f ~x y = ... in f 3 5
  the callsite (f 3 5) has to be turned into (f ~x:3 5).
  This prevents mistakes where two arguments of the same types are swapped.
  (Note: Dune already enables this warning by default.)
  (Gabriel Scherer, review by Xavier Leroy and Florian Angeletti,
   report by ygrek)
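A small runnable version of the changelog example, with a concrete body chosen for f purely for illustration. The unlabelled call is left in a comment so that the snippet compiles without triggering the warning.

```ocaml
let f ~x y = x - y

(* With warning 6 enabled, writing [f 3 5] here is reported because the
   label [x] was omitted. Spelling the label out makes the intent explicit. *)
let labelled = f ~x:3 5

(* Once labelled, the arguments may be passed in either order. *)
let swapped = f 5 ~x:3
```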

Changing --prefix does not cause a proper rebuild

Reconfiguring to change the --prefix and then building, without cleaning the tree, doesn't seem to work (the old prefix gets left around in some places). We should try to fix this or add some other check saying that the tree has to be cleaned; several people have hit this recently.

Join loses sharing, causes OOM

In the Jane Street tree, we have a large functor (Merlin reports an expanded module type that's 5400 lines long). It has several sub-modules that reference each other heavily.

module Make(Config : Config.S) : S = struct
  module Config = Config
  module My_foo = Foo(Config)
  module My_bar = Bar(My_foo)
  
  (* ... many, many more things ... *)
end

Config.S is basically a record type with 18 small fields, but Foo and Bar are pretty substantial functors in their own right. Then the driver comes along and says:

module Implementations = struct
  module I1 = Make(Config1)
  module I2 = Make(Config2)
  module I3 = Make(Config3)
end

let run n =
  let module Implementation =
    (val
      match n with
      | 1 -> (module Implementations.I1 : S)
      | 2 -> (module Implementations.I2)
      | _ -> (module Implementations.I3))
  in
  ...

Up to run, all of this compiles fine: with aggressive inlining, the three functor calls in Implementations generate a lot of code, but nothing we can't handle (in about 30 s). The let module, however, is trouble: It produces a join point whose argument type is I1 ∨ I2 ∨ I3, which is a three-way join over the entire signature S. Worse, join operations produce expanded types that have no sharing: every reference to Config in Foo becomes a new record type with 18 fields, and then every reference to My_foo in Bar becomes a giant new record type in which Config has been expanded many times. The result of all this (with Foo doing something similar on its own) is an OOM on a 96G machine.

(For reasons I haven't quite pinned down, the OOM only happens if the max inlining depth is at least 9, even though I'm running this with functor return types, so in principle the types should get huge even without inlining. My working theory is that inlining often causes functions to have more free variables, and thus bigger closures: a function where I1 occurs free, for instance, might become one where I1 and I1.My_bar separately occur free, probably ~doubling the size of the fully-expanded type of the closure.)

It's not entirely clear what to do here. As a workaround, one can wrap each branch in Sys.opaque_identity (as it happens, the join isn't that useful anyway), but that's a messy hack for perfectly reasonable code. It should suffice for Meet_and_join.join to use memoization to preserve sharing, but its recursive calls take an environment that keeps changing, so it would have to be done carefully if it's possible at all. Of course, join could give up when its arguments get too large; that might be a bit brutal in cases like this, but I doubt it happens often that

  1. there are modules with enough sharing to explode like this, and
  2. they're large enough to eat this much RAM, and
  3. there would be much benefit from computing the join in full.
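The Sys.opaque_identity workaround mentioned above can be sketched as follows. The signature S and the modules I1, I2 and I3 are tiny stand-ins invented for this example; in the real code they would be the large functor applications.

```ocaml
(* Hypothetical stand-ins for the real signature and implementations. *)
module type S = sig
  val name : string
end

module I1 : S = struct let name = "one" end
module I2 : S = struct let name = "two" end
module I3 : S = struct let name = "three" end

let run n =
  let module Implementation =
    (val (match n with
          (* Each branch is hidden behind [Sys.opaque_identity], so the
             optimiser has no precise information to join across them. *)
          | 1 -> Sys.opaque_identity (module I1 : S)
          | 2 -> Sys.opaque_identity (module I2 : S)
          | _ -> Sys.opaque_identity (module I3 : S))
      : S)
  in
  Implementation.name
```

The behaviour of run is unchanged; only the information available to the optimiser at the join point is reduced.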
