fram-lang / dbl Goto Github PK

An experimental programming language that combines effect handlers and implicit parameters

License: MIT License

OCaml 99.61% Shell 0.39%

dbl's Introduction

DBL

DBL is an interpreter of Fram, an experimental programming language designed around the idea of combining lexically scoped effect handlers with a powerful mechanism of implicit and named parameters in a strongly-typed setting. The main goal of DBL interpreter is providing a tool for bootstrapping Fram compiler. DBL implements core and some of advanced language features, including ML-style parametric polymorphism, rank-N types, mutually recursive data-types and definitions, existential types, and pattern-matching.

Requirements

DBL is written in pure OCaml and tested with OCaml system version 5.1.0. DBL uses dune as a build system (tested with version 3.14.0).

Installation

Simply type dune build to compile the project. If you use opam package manager, you can locally install DBL by typing dune install.

Usage

Just type dbl (or rlwrap dbl for better readline support) to run the interpreter in a REPL mode. In this mode you can interactively provide definitions and expressions to evaluate, separated by double semicolon ;;. The example REPL session is shown below.

$ dbl
> let id x = x ;;
> id () ;;
: Unit
= ()
>

You can also run programs in a batch mode, by providing a file to execute as a command-line argument (e.g., dbl filename.fram).

If you didn't install DBL via dune install (or OPAM, generally) then it's recommended to set the DBL_LIB environmental variable to the lib directory of your local installation of DBL.

Examples

Several simple examples can be found in the examples directory. Moreover, in a test/ok directory you can find many tiny examples used to test implementation of various language features.

Source Code and Development

Source code can be found in the src directory. For a high-level overview of the implementation we encourage to look into the src/Pipeline.ml file and src/Lang directory to see phases and intermediate languages of the interpreter. To run DBL interpreter on all test programs just type ./test.sh dbl test/test_suite. Bugs can be reported on our issue tracker.

dbl's People

Contributors

Stargazers

Watchers

Forkers

foxinio jauzatopauza zielinsky kwasielewski minionjakub wojpok fatmage florian3k wbukowski bartlomiejkrolikowski bohun9 pa000

dbl's Issues

Recursive functions

Recursive functions are not implemented yet (but they are expressible via recursive types: see fix in the Prelude). We should add (mutually) recursive functions to all intermediate languages in the pipeline (Surface, Unif, Core, and Untyped).

Abstract type constructors can be accessed in pattern matching

Constructors of abstract types can still be used in patterns when the type is know, e.g. from an annotation. This doesn't seem desirable. The following program typechecks.

module M
  abstr data T = C
end

let f (C : M.T) = ()

Some method definitions can lead to `Internal error: type mismatch`

The following program exhibits the issue.

data U = I

method foo {self = I} = I

let baz I (x : U) = x.foo

Here’s the full error message.

Internal error: type mismatch
at:
(fn (x#16 tp#1b) (let x#1d (tapp foo#10 tp#8)) (app x#1d x#16))
requested:
(tp#1f -> tp#1f)
provided:
(tp#1f -> tp#1f (effect tp#8 tp#8))
Fatal error: exception Failure("Internal error: type mismatch")

Another example without pattern matching, forcing impurity through the annotation.

method foo {self : U} = (fn g => g : U ->[] U) (fn x => x) I

This issue was originally reported by @forell

REPL pretty-printing based on `toString` method

The current pretty-printer in REPL behaves very poorly.

$ dbl
> data T = A ;;
> A ;;
: T
= <ctor>

Better pretty-printer could base on toString method.

Optional parameters

To the mechanism of named/implicit parameters we could add optional parameters.

ADT constructors in a type-checking mode

In a bidirectional type-checker we can take advantage of a type-checking mode in case of constructors of ADTs, and do not require the full module path to be provided by the programmer. However, we treat constructors as regular functions in expressions, and we switch from a type-checking mode to a type-inferring mode in case of a function application. This two decisions in combination make type-checking of shadowed or not imported constructors more challenging. Consider the following code, which could be considered valid.

data A = A
data B = A of A
data C = A of B

let _ = A (A A)

At the moment I do not see any simple solution to this problem, so any ideas are welcome.

Modules as ADTs

Each module could have distinguished type, that could be accessed by providing only the module name (the same feature as this types in Helium). The motivation is to write List instead of List.T or List.List for library-defined types. This feature will work well with #44, as regular ADT could be a module with constructors and single distinguished type. On the other hand, without #34 it would give terrible error messages. The concrete syntax should be determined.

Non-fatal errors in REPL do not abort the pipeline

Non-fatal errors in REPL are raised, but do not abort the pipeline, allowing to run ill-typed code. The error can be reproduced by typing in REPL:

> let f (x : Unit -> Unit) = x () ;;
> f () ;;
<stdin>:1:3-4: error: ...
: Unit
Fatal error: exception Failure ("Runtime error!")

Types introduced by implicit parameters may escape their scope

See implementation of the DLetFun case in TypeInference.check_def. Any example of code showing buggy behavior is welcome.

`abstract` visibility modifier for datatypes.

Datatype could be marked as abstract, which means that the type variable is publicly visible, but its constructors remains private. The keyword used to mark datatypes as abstract can be discussed, as abstract seems to be bit long. Other options that can be considered are abs or abstr.

Check for `NTerm` effect for recursive functions in Core

The internal type checker of Core doesn't check if recursive functions have the NTerm effect on correct position.

Implement method visibility

Methods can be declared as public using pub, but it currently has no effect: all methods are public regardless of the visibility modifier. The visibility parameter is passed to the Env.add_poly_method function in the typechecker, but is not used.

Type variable escaped via type parameter

The minimal example, that type-checks, but it shouldn’t. It raises internal error on ToCore phase.

data Box X = Box of X
label l
data A = A
let foo (Box A) = ()
handle {label=l} _ = ()
  return  _ => foo
  finally c => c (Box A)

Scopes of unification variables represented as levels

Currently, unification variables (in Unif) stores scope constraints represented as sets of type variables. An alternative implementation can be based on levels: each unification variable remembers its level, i.e., number of its scope (higher numbers are wider or more nested scopes). Level-based approach is less intuitive, but has many advantages: it would be simpler in some places and is definitely more efficient (let polymorphism generalizes only variables on a specific level, so there is no need to traverse all the environment for searching unification variables that may escape).

Setup Github Actions for running tests automatically

Tests could be run at github automatically for each pushed commit.

Add source code location tracking in module paths

Currently, there is no source code location assigned within module paths in Surface. Adding it would make error messages more accurate.

References with backtracking

Unification variables in Unif rely on mutable references provided by the BRef module. Currently, the BRef module is just a wrapper for references from the standard library. However, extending this module by transactions and rolling back the state of references to the last valid point in case of type errors, would allow to produce better error messages, especially in REPL.

Method functions in type-check mode

Method functions (method fn foo) cannot be used without any parameter.

> (+) ;;
<stdin>:1:1-3: fatal error: Variable (+) is registered as method add and cannot be used without argument

However, it would make sense to allow them provided that they are in type-check mode, when type type is an arrow with a neutral type on argument position. The motivating example is a sum of elements of a list:

let sum (xs : List Int) = xs.foldl (+) 0

or writing a code like in Eval.ml:

let extern_map =
  [ "dbl_addInt",      int_binop ( + );
    "dbl_subInt",      int_binop ( - );
    "dbl_mulInt",      int_binop ( * );
    "dbl_divInt",      int_binop ( / );
  ...

Method operators declaration in Prelude

Standard arithmetic and comparison operators should be declared as methods in Prelude. For instance:

method fn (+) = add

Delayed solving of method constraints

In the current implementation, the self parameter of the method call should have a concrete type. Sometimes it requires heavy type annotation, to make sure that the type of self is known at the place of the method call. In some cases we could delay method selection by assigning method constraints to unification variables. The motivating example (assuming that we have higher-order methods #18 and infix operators #12) is the following definition of overloaded addition, that could be a part of the Prelude.

let (+) {type T, method add : T -> T ->[] T} (x : T) y = x.add y

let x = 13 + 29
let str = "Hello " + "World!"

In the current approach to the method resolution this code would not type-check, because in 13 + 29 we start with type-checking of (+), so we guess the type T (as a fresh unification variable) and we have to find its method add. This in turn would raise an error similar to Calling method of expression of unknown type.

Higher-order methods

Method constraints can be treated as another form of named parameters. Consider the following code.

method toString {A, method toString : A ->[] String, self : List A} = ...

Of course, concrete syntax can be negotiated.

Recursive methods

As it stands recursive methods are not usable. The type guessed for the recursive definition before it's type-checked does not contain enough information to add the method to the environment.

Simple namespaces

Namespaces are used for grouping definitions. They can be nested, opened, and included. The concrete syntax should be determined. In the future, this mechanism will be combined with dividing a program into separate files (translation units).

Operators as ADT constructors

It should be possible to define ADT with constructors that are operators. For instance, the standard List type.

data rec List X =
  | []
  | (::) of X, List X

Kind annotations

Kind annotations are not supported yet. They could be particularly useful in definitions of ADTs, as in the following code.

data rec Fix (F : type -> type) = Fix of F (Fix F)

Unification variables in REPL

Currently, REPL commands cannot leave unsolved unification variables.

$ dbl
> id ;;
fatal error: Unsolved unification variables left.
>

It comes from the fact, that unification variables are not supported in Core.

A solution that was proposed is to add some dummy type to Core, that is used to translation of unification variables. In batch mode this type should never appear, so most of function for manipulating types in Core (especially CorePriv.WellTypedInvariant) can raise internal error when encounter this dummy type. The only minor drawback of this approach is that internal type-checker and some of possible program optimizations would not be available in REPL.

Rename `equal` method to `hequal` in `Map1`

Name equal used by the Map1 module (part of the Utils library) collides with the usual equality, e.g., in the Core.TVar module. Name hequal that stands for heterogeneous equality seems to be a better option.

Improve the error message for methods as named parameters

Methods as named parameters were introduced in #58
However, when an error occurs only the name of the method is printed, the owner is omitted even though it would be useful information for programmers. Introducing such a feature is non-trivial since the function to acquire the owner of a method is part of the module TypeUtils that utilizes the Error module. Thus, using that function would create a dependency cycle.

More precise error message when some module is missing within a module path.

Failure to resolve a path like X.Y.Z.x because there is no module Y in X results in an error message printing the whole path, which is not ideal.

Wild-cards in type parameters

Wild-cards in type parameters, like in the following code, are not supported yet. They could be easily added.

data T _ = A

Implicit resolution may fall into infinite loop

The buggy behavior is exhibited by one of negative tests (test/err/tc_0000_implicitLoop.dbl) that is already included in the repository.

implicit `n
let `n = `n
let _ = `n

This test worked at some point, but now it doesn't.

Negative tests

Testing script is capable of running negative tests (i.e., invalid input programs, like test/err/tc_*), but the test suite does not use them.

Infix and prefix operators

They could be changed into function application at the desugaring phase.

Reisntatniation of named parameters in implicit instantiation resolution

Currently, regular named parameters can be implicitly introduced, but not instantiated. This prevents an instantiation of an implicit parameter of type { x : A} -> B, assuming that there is such an implicit in the scope. We could allow the instantiation algorithm to instantiate named parameters that were recently introduced by Env.open_scheme.

Warn about useless `abstr` modifier

Within a recursive block marked as pub, the abstr modifier on a data type has no effect, since the public visibility is propagated to the constructors regardless. This should probably trigger a warning, as it might not be the programmer's intention.

module M
  pub rec
    abstr data T = C
  end
end

`Utils/BRef.mli` is poorly documented

Functions in the BRef module are not documented, even though they are not very exciting. Moreover, the format of documentation comment of BRef.bracket is not compatible with ocamldoc (the second line starts with asterisk).

Selection from a module (`M.x`) as syntactic sugar (`open M in x`) might be counter-intuitive

This syntactic sugar allows to select variables that are not publicly available in the module, but defined in other opened modules. This might lead to very counter-intuitive behavior, when variable is defined as private, as in the following code:

module M
  let x = 42
end
let x = 13
let y = M.x

Current implementation raises no errors and assigns 13 to y.

This issue was originally reported by @kwasielewski

Pretty-printing of named types prefixed by a module path

Type pretty-printer in TypeInference is not adjusted to modules yet. The main challenge of this task is to find a proper representation of pretty-printing information stored in the environment (Env).

Syntactic sugar for `&&`, `||`, and `;`

These operators need special treatment. Operators && and || are lazy and can be desugared to conditional expression. Operator ; is could be defined in Prelude as

let (;) () x = x

but in e1 ; e2 expression e2 should be at tail position. It is better to desugar e1 ; e2 as let () = e1 in e2 or let _ = (e1 : Unit) in e2.

ADTs as recursive values

Fully-applied ADT constructors could be considered productive values, provided the recursive occurrences are guarded by lambdas.

let rec r = C { f = fn x => ... r ... }

`pub` definition modifier as syntactic sugar for public patterns

As discussed in #29 there should be no DPub constructor in Surface. Public definitions should be desugared to definitions with public patterns. This solution seems to be cleaner and scales better to other features that we aim for, including the following.

Public and abstract visibility modifiers for datatypes.
Selectively marking as public some of mutually recursive definitions or datatypes.

Multi-parameter functions in `Core` and `Untyped`

We could consider modifying Core and Untyped languages to support real multi-parameter functions. I see several benefits:

it would simplify some parts of compilation of deep pattern-matching;
it would allow to get rid of a dummy unit parameter that delays a computation (currently, such things appear in the translation of deep pattern-matching in avoiding duplication of code used by several patterns);
it will pave the path towards multi-parameter functions in Unif and Surface (I'm not sure if we want that);
a code will run faster (if we care about that): polymorphic functions with named/implicit parameters are true multi-parameter functions, so we could avoid creating of unnecessary closures.

However, it complicates these intermediate languages and raises some questions about how multi-parameter functions should be combined with parametric polymorphism (should they be combined into a single construct or treated separately?).

Mixed mutually recursive blocks

We should be able to mix data type and function definitions in a single rec ... end block. Additionally, other kinds of definitions could be allowed in these blocks, such as module definitions.

Check name uniqueness in mutually recursive definitions

The type checker should check whether all names in a rec ... end block are unique.

Early pattern-matching of function parameters

Patterns in formal parameters generate a pattern-matching just after a lambda that binds a parameter. As a result, a function that perform pattern-matching on a parameter cannot be pure. This might be a problem for named or implicit parameters, as they form only pure functions. The following code would not type-check.

data U = I
let foo {`x=I} y = y

A possible solution to this problem is to delay this pattern-matching until all parameters are taken. Such a behavior is different than in OCaml: assuming non-exhaustive patterns allowed, the following would run correctly, while in OCaml (and the current implementation) it would fail.

data T = A | B
let foo A _ = B
let _ = foo B

Generating methods for accessing fields of records

For record-like types, e.g.,

data Vec X = Vec { x : X, y : X }

we could automatically generate methods for accessing fields of these types. In this example it would be

method x {self = Vec { x }} = x
method y {self = Vec { y }} = y

There are some design decisions that should be made. First of all, when should we generate such methods? The possible options I see are the following (I prefer option 2 or 4).

Always, for ADT with single constructor (with some limitation related to existential types: see below).
For types explicitly marked as records (e.g., by a special keyword, or when the constructor name is omitted or the same as the name of type).
For each field that is shared between all constructors.
By explicitly marking the fields, e.g., by something like deriving in Haskell.

Moreover, for types that stores existential types, e.g.,

data T = T { X, a : X, b : Int }

it is not possible to generate some methods (method a in this example, but method b is still reasonable). If we follow option 2, then we should forbid existential types in records, or omit some automatically generated methods.

ADTs as modules

Each definition of a ADT could also be a definition of a module that contains all the constructors. It is possible due to fact that types and modules form separate namespaces that both start with an uppercase letter. The motivation is that it would be easier to work with multiple data-types with the same names of constructors.

data T = A | B
data U = A | B

let x = T.A
let y = U.B

Error messages about escaping type variables

Escaping type variables are detected, but error messages are not very informative. For instance, the following program

let foo f =
  data T = A in
  f A

produces the following error.

test.dbl:3:5: error: This expression has type T, but an expression was expected of type T

We could produce better error messages.

Support for dividing a program to separate files

Since modules are now supported, each file of the program could be seen as a separate module.

Syntactic sugar for unary conditional expression

Expression if e1 then e2 could be desugared to if e1 then e2 else ().