travisbrown / dhallj

Dhall for Java

License: BSD 3-Clause "New" or "Revised" License

dhallj's Introduction

This project is an implementation of the Dhall configuration language for the Java Virtual Machine.

Our goal for this project is to make it as easy as possible to integrate Dhall into JVM build systems (see the dhall-kubernetes demonstration below for a concrete example of why you might want to do this).

The core modules have no external dependencies, are Java 7-compatible, and are fairly minimal:

$ du -h modules/core/target/dhall-core-0.10.0-M1.jar
168K    modules/core/target/dhall-core-0.10.0-M1.jar

$ du -h modules/parser/target/dhall-parser-0.10.0-M1.jar
108K    modules/parser/target/dhall-parser-0.10.0-M1.jar

There are also several Scala modules that are published for Scala 2.12, 2.13, and 3.0. While most of the examples in this README are focused on Scala, you shouldn't need to know or care about Scala to use the core DhallJ modules.

The initial development of this project was supported in part by Permutive.

Status

The current release of this project supports Dhall 21.0.0. We're running the Dhall acceptance test suites for parsing, normalization, CBOR encoding and decoding, hashing, and type inference, and currently all tests are passing (with three exceptions; see the 0.10.0-M1 release notes for details).

There are several known issues, which are tracked in the project's issue list.

While we think the project is reasonably well-tested, it's very new, is sure to be full of bugs, and nothing about the API should be considered stable at the moment. Please use responsibly.

Getting started

The easiest way to try things out is to add the Scala wrapper module to your build. If you're using sbt that would look like this:

libraryDependencies += "org.dhallj" %% "dhall-scala" % "0.10.0-M1"

This dependency includes two packages: org.dhallj.syntax and org.dhallj.ast.

The syntax package provides some extension methods, including a parseExpr method for strings (note that this method returns an Either[ParsingFailure, Expr], which we unwrap here with Right):

scala> import org.dhallj.syntax._
import org.dhallj.syntax._

scala> val Right(expr) = "\\(n: Natural) -> [n + 0, n + 1, 1 + 1]".parseExpr
expr: org.dhallj.core.Expr = λ(n : Natural) → [n + 0, n + 1, 1 + 1]

Now that we have a Dhall expression, we can type-check it:

scala> val Right(exprType) = expr.typeCheck
exprType: org.dhallj.core.Expr = ∀(n : Natural) → List Natural

We can "reduce" (or β-normalize) it:

scala> val normalized = expr.normalize
normalized: org.dhallj.core.Expr = λ(n : Natural) → [n, n + 1, 2]

We can also α-normalize it, which replaces all named variables with indexed underscores:

scala> val alphaNormalized = normalized.alphaNormalize
alphaNormalized: org.dhallj.core.Expr = λ(_ : Natural) → [_, _ + 1, 2]

We can encode it as a CBOR byte array:

scala> alphaNormalized.getEncodedBytes
res0: Array[Byte] = Array(-125, 1, 103, 78, 97, 116, 117, 114, 97, 108, -123, 4, -10, 0, -124, 3, 4, 0, -126, 15, 1, -126, 15, 2)
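
The signed bytes printed above are easier to read once each one is split into its CBOR major type and additional information (the top three and bottom five bits of the byte). The following is a minimal JDK-only sketch of that split; the class and method names are made up for illustration and are not part of DhallJ.

```java
final class CborHeadSketch {
    // Names of the eight CBOR major types, indexed by the top three bits.
    private static final String[] MAJOR_TYPES = {
        "unsigned integer", "negative integer", "byte string", "text string",
        "array", "map", "tag", "simple/float"
    };

    // Describes the initial byte of a CBOR data item.
    static String describe(byte initial) {
        int unsigned = initial & 0xff;      // Java bytes are signed; recover 0..255
        int majorType = unsigned >> 5;      // top three bits
        int additionalInfo = unsigned & 0x1f; // bottom five bits
        return String.format("0x%02x: %s, additional info %d",
            unsigned, MAJOR_TYPES[majorType], additionalInfo);
    }

    public static void main(String[] args) {
        // The first byte of the output above, then the byte that introduces "Natural".
        System.out.println(describe((byte) -125)); // 0x83: array, additional info 3
        System.out.println(describe((byte) 103));  // 0x67: text string, additional info 7
    }
}
```

Read this way, the encoding starts with a three-element array whose second element is the seven-character text string "Natural".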

And we can compute its semantic hash:

scala> alphaNormalized.hash
res1: String = c57cdcdae92638503f954e63c0b3ae8de00a59bc5e05b4dd24e49f42aca90054

If we have the official dhall CLI installed, we can confirm that this hash is correct:

$ dhall hash <<< '\(n: Natural) -> [n + 0, n + 1, 1 + 1]'
sha256:c57cdcdae92638503f954e63c0b3ae8de00a59bc5e05b4dd24e49f42aca90054
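
The Dhall standard defines the semantic hash as the SHA-256 digest of the CBOR encoding of the normalized expression. Assuming that is also how the hash above was produced, the following JDK-only sketch recomputes the digest from the bytes shown in the getEncodedBytes output; comparing its result with the hash above is a quick way to check that assumption. The class name is hypothetical and this is not DhallJ's own code.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class SemanticHashSketch {
    // CBOR bytes copied from the getEncodedBytes output earlier in this section.
    static final byte[] ENCODED = {
        -125, 1, 103, 78, 97, 116, 117, 114, 97, 108, -123, 4, -10, 0,
        -124, 3, 4, 0, -126, 15, 1, -126, 15, 2
    };

    // Returns the SHA-256 digest of the given bytes as lowercase hex.
    static String sha256Hex(byte[] bytes) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(bytes)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("sha256:" + sha256Hex(ENCODED));
    }
}
```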

We can also compare expressions:

scala> val Right(other) = "\\(n: Natural) -> [n, n + 1, 3]".parseExpr
other: org.dhallj.core.Expr = λ(n : Natural) → [n, n + 1, 3]

scala> normalized == other
res2: Boolean = false

scala> val Some(diff) = normalized.diff(other)
diff: (Option[org.dhallj.core.Expr], Option[org.dhallj.core.Expr]) = (Some(2),Some(3))

And apply them to other expressions:

scala> val Right(arg) = "10".parseExpr
arg: org.dhallj.core.Expr = 10

scala> expr(arg)
res3: org.dhallj.core.Expr = (λ(n : Natural) → [n + 0, n + 1, 1 + 1]) 10

scala> expr(arg).normalize
res4: org.dhallj.core.Expr = [10, 11, 2]

We can also resolve expressions containing imports (although at the moment dhall-scala doesn't support remote imports or caching; please see the section on import resolution below for details about how to set up remote import resolution if you need it):

scala> val Right(enumerate) =
     |   "./dhall-lang/Prelude/Natural/enumerate".parseExpr.flatMap(_.resolve)
enumerate: org.dhallj.core.Expr = let enumerate : Natural → List Natural = ...

scala> enumerate(arg).normalize
res5: org.dhallj.core.Expr = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Note that we're working with values of type Expr, which comes from dhall-core, which is a Java module. The Expr class includes static methods for creating Expr values:

scala> import org.dhallj.core.Expr
import org.dhallj.core.Expr

scala> Expr.makeTextLiteral("foo")
res6: org.dhallj.core.Expr = "foo"

scala> Expr.makeEmptyListLiteral(Expr.Constants.BOOL)
res7: org.dhallj.core.Expr = [] : Bool

If you're working from Scala, though, you're generally better off using the constructors included in the org.dhallj.ast package, which provide more type-safety:

scala> TextLiteral("foo")
res8: org.dhallj.core.Expr = "foo"

scala> NonEmptyListLiteral(BoolLiteral(true), Vector())
res9: org.dhallj.core.Expr = [True]

The ast package also includes extractors that let you pattern match on Expr values:

scala> expr match {
     |   case Lambda(name, _, NonEmptyListLiteral(first +: _)) => (name, first)
     | }
res10: (String, org.dhallj.core.Expr) = (n,n + 0)

Note that we don't have exhaustivity checking for these extractors, although we might be able to add that in an eventual Dotty version.

In addition to dhall-scala, there's a (more experimental) dhall-scala-codec module, which supports encoding and decoding Scala types to and from Dhall expressions. If you add it to your build, you can write the following:

scala> import org.dhallj.codec.syntax._
import org.dhallj.codec.syntax._

scala> List(List(1, 2), Nil, List(3, -4)).asExpr
res0: org.dhallj.core.Expr = [[+1, +2], [] : List Integer, [+3, -4]]

You can even decode Dhall functions into Scala functions (assuming you have the appropriate codecs for the input and output types):

val Right(f) = """

  let enumerate = ./dhall-lang/Prelude/Natural/enumerate

  let map = ./dhall-lang/Prelude/List/map

  in \(n: Natural) ->
    map Natural Integer Natural/toInteger (enumerate n)

""".parseExpr.flatMap(_.resolve)

And then:

scala> val Right(scalaEnumerate) = f.as[BigInt => List[BigInt]]
scalaEnumerate: BigInt => List[BigInt] = org.dhallj.codec.Decoder$$anon$11$$Lambda$15614/0000000050B06E20@94b036

scala> scalaEnumerate(BigInt(3))
res1: List[BigInt] = List(0, 1, 2)

Eventually we'll probably support generic derivation for encoding Dhall expressions to and from algebraic data types in Scala, but we haven't implemented this yet.

Converting to other formats

DhallJ currently includes several ways to export Dhall expressions to other formats. The core module includes very basic support for printing Dhall expressions as JSON:

scala> import org.dhallj.core.converters.JsonConverter
import org.dhallj.core.converters.JsonConverter

scala> import org.dhallj.parser.DhallParser.parse
import org.dhallj.parser.DhallParser.parse

scala> val expr = parse("(λ(n: Natural) → [n, n + 1, n + 2]) 100")
expr: org.dhallj.core.Expr.Parsed = (λ(n : Natural) → [n, n + 1, n + 2]) 100

scala> JsonConverter.toCompactString(expr.normalize)
res0: String = [100,101,102]

This conversion supports the same subset of Dhall expressions as dhall-to-json. For example, it can't produce a JSON representation of functions, which means the normalization in the example above is necessary: if we hadn't normalized, the conversion would fail.

There's also a module that provides integration with Circe, allowing you to convert Dhall expressions directly to (and from) io.circe.Json values without intermediate serialization to strings:

scala> import org.dhallj.circe.Converter
import org.dhallj.circe.Converter

scala> import io.circe.syntax._
import io.circe.syntax._

scala> Converter(expr.normalize)
res0: Option[io.circe.Json] =
Some([
  100,
  101,
  102
])

scala> Converter(List(true, false).asJson)
res1: org.dhallj.core.Expr = [True, False]

Another module supports converting to any JSON representation for which you have a Jawn facade. For example, the following build configuration would allow you to export spray-json values:

libraryDependencies ++= Seq(
  "org.dhallj"    %% "dhall-jawn" % "0.4.0",
  "org.typelevel" %% "jawn-spray" % "1.0.0"
)

And then:

scala> import org.dhallj.jawn.JawnConverter
import org.dhallj.jawn.JawnConverter

scala> import org.typelevel.jawn.support.spray.Parser
import org.typelevel.jawn.support.spray.Parser

scala> val toSpray = new JawnConverter(Parser.facade)
toSpray: org.dhallj.jawn.JawnConverter[spray.json.JsValue] = org.dhallj.jawn.JawnConverter@be3ffe1d

scala> toSpray(expr.normalize)
res0: Option[spray.json.JsValue] = Some([100,101,102])

Note that unlike the dhall-circe module, the integration provided by dhall-jawn is only one way (you can convert Dhall expressions to JSON values, but not the other way around).

We also support YAML export via SnakeYAML (which doesn't require a Scala dependency):

scala> import org.dhallj.parser.DhallParser.parse
import org.dhallj.parser.DhallParser.parse

scala> import org.dhallj.yaml.YamlConverter
import org.dhallj.yaml.YamlConverter

scala> val expr = parse("{foo = [1, 2, 3], bar = [4, 5]}")
expr: org.dhallj.core.Expr.Parsed = {foo = [1, 2, 3], bar = [4, 5]}

scala> println(YamlConverter.toYamlString(expr))
foo:
- 1
- 2
- 3
bar:
- 4
- 5

You can use the YAML exporter with dhall-kubernetes, for example. Instead of maintaining a lot of verbose, repetitive, and error-prone YAML files, you can keep your configuration in well-typed Dhall files (like this example) and have your build system export them to YAML:

import org.dhallj.syntax._, org.dhallj.yaml.YamlConverter

val kubernetesExamplePath = "../dhall-kubernetes/1.17/examples/deploymentSimple.dhall"
val Right(kubernetesExample) = kubernetesExamplePath.parseExpr.flatMap(_.resolve)

And then:

scala> println(YamlConverter.toYamlString(kubernetesExample.normalize))
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      name: nginx
  template:
    metadata:
      name: nginx
    spec:
      containers:
      - image: nginx:1.15.3
        name: nginx
        ports:
        - containerPort: 80

It's not currently possible to convert to YAML without the SnakeYAML dependency, although we may support a simplified version of this in the future (something similar to what we have for JSON in the core module).

Import resolution

There are currently two modules that implement import resolution (to different degrees).

dhall-imports

The first is dhall-imports, which is a Scala library built on cats-effect that uses http4s for its HTTP client. This module is intended to be a complete implementation of the import resolution and caching specification.

It requires a bit of ceremony to set up:

import cats.effect.{IO, Resource}
import org.dhallj.core.Expr
import org.dhallj.imports.syntax._
import org.dhallj.parser.DhallParser
import org.http4s.blaze.client.BlazeClientBuilder
import org.http4s.client.Client
import scala.concurrent.ExecutionContext

val client: Resource[IO, Client[IO]] = BlazeClientBuilder[IO](ExecutionContext.global).resource

And then if we have some definitions like this:

val concatSepImport = DhallParser.parse("https://prelude.dhall-lang.org/Text/concatSep")

val parts = DhallParser.parse("""["foo", "bar", "baz"]""")
val delimiter = Expr.makeTextLiteral("-")

We can use them with a function from the Dhall Prelude like this:

scala> val resolved = client.use { implicit c =>
     |   concatSepImport.resolveImports[IO]
     | }
resolved: cats.effect.IO[org.dhallj.core.Expr] = IO(...)

scala> import cats.effect.unsafe.implicits.global
import cats.effect.unsafe.implicits.global

scala> val result = resolved.map { concatSep =>
     |   Expr.makeApplication(concatSep, Array(delimiter, parts)).normalize
     | }
result: cats.effect.IO[org.dhallj.core.Expr] = IO(...)

scala> result.unsafeRunSync()
res0: org.dhallj.core.Expr = "foo-bar-baz"

(Note that we could use dhall-scala to avoid the use of Array above.)

Classpath imports

We support an extension of the spec that also allows you to import expressions from the classpath, using the syntax let e = classpath:/absolute/path/to/file in e. The semantics are subject to change as we get more experience with it, but currently it should generally have the same behaviour as an absolute path import of a local file (but files on the classpath can import each other using relative paths). This includes being protected by the referential sanity check, so that remote imports cannot exfiltrate information from the classpath.

Also note that classpath imports with as Location are currently not supported, because the spec requires that an import as Location return an expression of type <Local Text | Remote Text | Environment Text | Missing>.
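
As a rough illustration of what loading such an import involves on the JVM, here is a JDK-only sketch that strips the classpath: prefix and reads the named resource. It is an assumption about one way this could be done, not a description of how DhallJ implements it, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

final class ClasspathImportSketch {
    static final String PREFIX = "classpath:";

    // Returns the text of the resource named by an import such as
    // "classpath:/absolute/path/to/file", or empty if no such resource exists.
    static Optional<String> read(String importText) {
        if (!importText.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a classpath import: " + importText);
        }
        // A leading '/' makes Class.getResourceAsStream treat the name as absolute.
        String resourcePath = importText.substring(PREFIX.length());
        try (InputStream in = ClasspathImportSketch.class.getResourceAsStream(resourcePath)) {
            if (in == null) {
                return Optional.empty();
            }
            return Optional.of(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sketch uses InputStream.readAllBytes, so it needs Java 9 or later.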

dhall-imports-mini

The other implementation is dhall-imports-mini, which is a Java library that depends only on the core and parser modules, but that doesn't support remote imports or caching.

The previous example could be rewritten as follows using dhall-imports-mini and a local copy of the Prelude:

import org.dhallj.core.Expr
import org.dhallj.imports.mini.Resolver
import org.dhallj.parser.DhallParser

val concatSep = Resolver.resolve(DhallParser.parse("./dhall-lang/Prelude/Text/concatSep"), false)

val parts = DhallParser.parse("""["foo", "bar", "baz"]""")
val delimiter = Expr.makeTextLiteral("-")

And then:

scala> Expr.makeApplication(concatSep, Array(delimiter, parts)).normalize
res0: org.dhallj.core.Expr = "foo-bar-baz"

It's likely that eventually we'll provide a complete pure-Java implementation of import resolution, but this isn't currently a high priority for us.

Command-line interface

We include a command-line interface that supports some common operations. It's currently similar to the official dhall and dhall-to-json binaries, but with many fewer options.

If GraalVM Native Image is available on your system, you can build the CLI as a native binary (thanks to sbt-native-packager).

$ sbt cli/graalvm-native-image:packageBin

$ cd cli/target/graalvm-native-image/

$ du -h dhall-cli
8.2M    dhall-cli

$ time ./dhall-cli hash --normalize --alpha <<< "λ(n: Natural) → [n, n + 1]"
sha256:a8d9326812aaabeed29412e7b780dc733b1e633c5556c9ea588e8212d9dc48f3

real    0m0.009s
user    0m0.000s
sys     0m0.009s

$ time ./dhall-cli type <<< "{foo = [1, 2, 3]}"
{foo : List Natural}

real    0m0.003s
user    0m0.000s
sys     0m0.003s

$ time ./dhall-cli json <<< "{foo = [1, 2, 3]}"
{"foo":[1,2,3]}

real    0m0.005s
user    0m0.004s
sys     0m0.001s

Even on the JVM it's close to usable, although you can definitely feel the slow startup:

$ cd ..

$ time java -jar ./cli-assembly-0.4.0-SNAPSHOT.jar hash --normalize --alpha <<< "λ(n: Natural) → [n, n + 1]"
sha256:a8d9326812aaabeed29412e7b780dc733b1e633c5556c9ea588e8212d9dc48f3

real    0m0.104s
user    0m0.106s
sys     0m0.018s

There's probably not really any reason you'd want to use dhall-cli right now, but I think it's a pretty neat demonstration of how Graal can make Java (or Scala) a viable language for building native CLI applications.

Other stuff

dhall-testing

The dhall-testing module provides support for property-based testing with ScalaCheck in the form of Arbitrary (and Shrink) instances:

scala> import org.dhallj.core.Expr
import org.dhallj.core.Expr

scala> import org.dhallj.testing.instances._
import org.dhallj.testing.instances._

scala> import org.scalacheck.Arbitrary
import org.scalacheck.Arbitrary

scala> Arbitrary.arbitrary[Expr].sample
res0: Option[org.dhallj.core.Expr] = Some(Optional (Optional (List Double)))

scala> Arbitrary.arbitrary[Expr].sample
res1: Option[org.dhallj.core.Expr] = Some(Optional (List <neftfEahtuSq : Double | kg...

It includes (fairly basic) support for producing both well-typed and probably-not-well-typed expressions, and for generating arbitrary values of specified Dhall types:

scala> import org.dhallj.testing.WellTypedExpr
import org.dhallj.testing.WellTypedExpr

scala> Arbitrary.arbitrary[WellTypedExpr].sample
res2: Option[org.dhallj.testing.WellTypedExpr] = Some(WellTypedExpr(8436008296256993755))

scala> genForType(Expr.Constants.BOOL).flatMap(_.sample)
res3: Option[org.dhallj.core.Expr] = Some(True)

scala> genForType(Expr.Constants.BOOL).flatMap(_.sample)
res4: Option[org.dhallj.core.Expr] = Some(False)

scala> genForType(Expr.makeApplication(Expr.Constants.LIST, Expr.Constants.INTEGER)).flatMap(_.sample)
res5: Option[org.dhallj.core.Expr] = Some([+1522471910085416508, -9223372036854775809, ...

This module is currently fairly minimal, and is likely to change substantially in future releases.

dhall-javagen and dhall-prelude

The dhall-javagen module lets you take a DhallJ representation of a Dhall expression and use it to generate Java code that will build the DhallJ representation of that expression.

This is mostly a toy, but it allows us for example to distribute a "pre-compiled" jar containing the Dhall Prelude:

scala> import java.math.BigInteger
import java.math.BigInteger

scala> import org.dhallj.core.Expr
import org.dhallj.core.Expr

scala> val ten = Expr.makeNaturalLiteral(new BigInteger("10"))
ten: org.dhallj.core.Expr = 10

scala> val Prelude = org.dhallj.prelude.Prelude.instance
Prelude: org.dhallj.core.Expr = ...

scala> val Natural = Expr.makeFieldAccess(Prelude, "Natural")
Natural: org.dhallj.core.Expr = ...

scala> val enumerate = Expr.makeFieldAccess(Natural, "enumerate")
enumerate: org.dhallj.core.Expr = ...

scala> Expr.makeApplication(enumerate, ten).normalize
res0: org.dhallj.core.Expr = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Note that the resulting jar (which is available from Maven Central as dhall-prelude) is many times smaller than either the Prelude source or the Prelude serialized as CBOR.

Developing

The project includes the currently-supported version of the Dhall language repository as a submodule, so if you want to run the acceptance test suites, you'll need to clone recursively:

git clone --recurse-submodules git@github.com:travisbrown/dhallj.git

Or if you're like me and always forget to do this, you can initialize the submodule after cloning:

git submodule update --init

This project is built with sbt, and you'll need to have sbt installed on your machine.

We're using the JavaCC parser generator for the parsing module, and we have our own sbt plugin for integrating JavaCC into our build. This plugin is open source and published to Maven Central, so you don't need to do anything to get it, but you will need to run it manually the first time you build the project (or any time you update the JavaCC grammar):

sbt:root> javacc
Java Compiler Compiler Version 7.0.5 (Parser Generator)
File "Provider.java" does not exist.  Will create one.
File "StringProvider.java" does not exist.  Will create one.
File "StreamProvider.java" does not exist.  Will create one.
File "TokenMgrException.java" does not exist.  Will create one.
File "ParseException.java" does not exist.  Will create one.
File "Token.java" does not exist.  Will create one.
File "SimpleCharStream.java" does not exist.  Will create one.
Parser generated with 0 errors and 1 warnings.
[success] Total time: 0 s, completed 12-Apr-2020 08:48:53

After this is done, you can run the tests:

sbt:root> test
...
[info] Passed: Total 1319, Failed 0, Errors 0, Passed 1314, Skipped 5
[success] Total time: 36 s, completed 12-Apr-2020 08:51:07

Note that a few tests require the dhall-haskell dhall CLI. If you don't have it installed on your machine, these tests will be skipped.

There are also a few additional slow tests that must be run manually:

sbt:root> slow:test
...
[info] Passed: Total 4, Failed 0, Errors 0, Passed 4
[success] Total time: 79 s (01:19), completed 12-Apr-2020 08:52:41

Community

This project supports the Scala code of conduct and wants all of its channels (Gitter, GitHub, etc.) to be inclusive environments.

Copyright and license

All code in this repository is available under the 3-Clause BSD License.

Copyright Travis Brown and Tim Spence, 2020.

dhallj's People

Contributors

amesgen, note, scala-steward, timwspence, travisbrown

dhallj's Issues

JSON and YAML export doesn't respect toMap format

There's no good reason for this; I've just not implemented it yet.

scala> import org.dhallj.parser.DhallParser
import org.dhallj.parser.DhallParser

scala> import org.dhallj.core.converters.JsonConverter
import org.dhallj.core.converters.JsonConverter

scala> val expr = DhallParser.parse("""[{mapKey = "foo", mapValue = [1]}]""")
expr: org.dhallj.core.Expr.Parsed = [{mapKey = "foo", mapValue = [1]}]

scala> JsonConverter.toCompactString(expr)
res0: String = [{"mapKey":"foo","mapValue":[1]}]

It should do the same thing as dhall-to-json:

$ dhall-to-json <<< '[{mapKey = "foo", mapValue = [1]}]'
{
  "foo": [
    1
  ]
}

I am considering this one a blocker for the 0.1.0 release.
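
The rule that dhall-to-json applies here can be sketched independently of DhallJ: a list whose elements are all records with exactly the fields mapKey (a string) and mapValue becomes a single object. The following JDK-only sketch works on plain Java maps and lists rather than on Expr values, and its names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ToMapSketch {
    // Returns the records as one JSON-like object if every element has exactly
    // the fields mapKey (a String) and mapValue; returns null otherwise, meaning
    // "export as an ordinary list". An empty list also returns null, because
    // without type information it cannot be told apart from any other empty list.
    static Map<String, Object> asObject(List<Map<String, Object>> records) {
        if (records.isEmpty()) {
            return null;
        }
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map<String, Object> record : records) {
            boolean isEntry = record.size() == 2
                && record.get("mapKey") instanceof String
                && record.containsKey("mapValue");
            if (!isEntry) {
                return null;
            }
            // If a key is repeated, the later entry replaces the earlier one.
            result.put((String) record.get("mapKey"), record.get("mapValue"));
        }
        return result;
    }
}
```

Applied to the example above, the single entry with mapKey "foo" and mapValue [1] yields an object with one field, foo, whose value is the list [1].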

Fix toString's handling of text literals, etc.

The implementation of toString (taking an expression and writing it back to Dhall code as a string) is currently pretty rough and needs some attention in general, but in particular it doesn't serialize text literals containing newlines correctly. toString hasn't been a high priority for me so far, but the newline issue at least needs to be fixed before the 0.1.0 release.

Parse records with keywords as keys

Thanks for this awesome new package! A minor parsing bug:

@ val Right(expr) = """{ if : Text }""".parseExpr 
scala.MatchError: Left(org.dhallj.core.DhallException$ParsingFailure: Encountered unexpected token: "if" "if"
    at line 1, column 3.

Was expecting one of:

    ","
    "="
    "Location"
    "Some"
    "Text"
    "}"
    <BUILT_IN>
    <NONRESERVED_LABEL>
) (of class scala.util.Left)

With dhall-haskell:

$ dhall <<< '{ if : Text }'
{ if : Text }

Parse quoted URLs correctly

parse("https://example.com/foo/\"bar?baz\"?qux") quietly catches a java.net.URISyntaxException e in org.dhallj.parser.support.ParsingHelpers and returns null.

From a very brief look, I'd say we're parsing it correctly as a string but the java.net.URI constructor won't accept it. We might have to roll our own URI, which would be very annoying!
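
The JDK behaviour behind this can be reproduced without DhallJ. The sketch below shows java.net.URI rejecting the double-quote character, first with the exception swallowed (the pattern described above) and then with the failure surfaced to the caller. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class QuotedUrlSketch {
    // Mirrors the behaviour described above: the failure is swallowed.
    static URI parseOrNull(String url) {
        try {
            return new URI(url);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    // One alternative: report the failure instead of returning null.
    static URI parseStrict(String url) {
        try {
            return new URI(url);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid java.net.URI: " + url, e);
        }
    }

    public static void main(String[] args) {
        // A literal double quote is not a legal URI character, so this prints null.
        System.out.println(parseOrNull("https://example.com/foo/\"bar?baz\"?qux"));
    }
}
```

For comparison, java.net.URI does accept the percent-encoded form https://example.com/foo/%22bar%3Fbaz%22?qux, although whether rewriting the URL that way is appropriate here is a separate question.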

Fix type annotation precedence issue

The parser currently gets precedence wrong for type annotations in three situations:

  • Empty lists
  • merge
  • toMap

These cases are some of the few places where the JavaCC grammar isn't a fairly straightforward translation of the Dhall ABNF file. I made the adjustments as a hack because I wasn't initially able to get JavaCC to handle backtracking correctly at the expression level for these cases. The current approach works for most cases, including everything in the parsing acceptance tests, but fails on some valid Dhall code—for example:

scala> import org.dhallj.parser.DhallParser.parse
import org.dhallj.parser.DhallParser.parse

scala> parse("[]: List Natural: Type")
org.dhallj.core.DhallException$ParsingFailure: Encountered unexpected token: ":" ":"
    at line 1, column 17.

Was expecting one of:

    <EOF>
    <WHSP>

Compare dhall:

$ dhall <<< '[]: List Natural: Type'
[] : List Natural

This is parsed as []: (List Natural: Type):

$ dhall encode --json <<< "[]: List Natural: Type"
[
    28,
    [
        26,
        [
            0,
            "List",
            "Natural"
        ],
        "Type"
    ]
]
$ dhall encode --json <<< "[]: ((List Natural): Type)"
[
    28,
    [
        26,
        [
            0,
            "List",
            "Natural"
        ],
        "Type"
    ]
]

I don't believe this bug in our parser can result in incorrect parses, only failed ones, and it's easy enough to work around:

scala> parse("[]: (List Natural: Type)").normalize.hash
res1: String = d79a2e0e14809ab2dbd2d180e60da8e129a5fb197bdd0caed57e3828402e48a9

Or:

scala> parse("[]: List (Natural: Type)").normalize.hash
res2: String = d79a2e0e14809ab2dbd2d180e60da8e129a5fb197bdd0caed57e3828402e48a9

Which gives the same hash as the Haskell implementation:

$ dhall hash <<< "[]: List Natural: Type"
sha256:d79a2e0e14809ab2dbd2d180e60da8e129a5fb197bdd0caed57e3828402e48a9

I've decided not to let this block the 0.1.0 release, but I'm planning to fix it soon.

Fill in gaps in single-quoted literal support

I just noticed that there are some language features related to escape sequences in single-quoted literals that we don't currently support. I'm working on fixing these now, and will put together some tests for the standard acceptance suite as well.

Import resolution against consul/etcd

Ideally users should be able to provide their own resolver to load configs from stores like consul, etcd, or vault. I used to hack this with an includer for typesafe-config.

Fix with precedence

The parser currently accepts some inputs it shouldn't:

scala> org.dhallj.parser.DhallParser.parse("foo { x = 0 } with x = 1")
val res0: org.dhallj.core.Expr.Parsed = (foo {x = 0}) with x = 1

scala> org.dhallj.parser.DhallParser.parse("{ x = 0 } with x = 1 : T")
val res1: org.dhallj.core.Expr.Parsed = ({x = 0} with x = 1) : T

Both of these should be parsing failures, but our handling of the precedence of with is currently somewhat awkward. I don't think it would be excessively hard to fix these cases, but it's not trivial, and it would be much easier if we moved to this approach, so I haven't done it yet.

Note that we're currently ignoring the WithPrecedence2 and WithPrecedence3 tests, which catch this bug.

In any case this bug should not affect most users, and it's unlikely to cause problems. As you can see in the printed code above, even when the parser does accept code it shouldn't, the parses it comes up with aren't unreasonable.

Clean up escaping

There are currently many different string encodings with different escapings, with lots of transition points between them. These transitions are currently handled in a fairly ad-hoc way, with escaping done at each of them as needed to get tests to pass.

I don't know of any specific bugs in this respect at the moment, but I'm sure there are some in there, and anyway the current situation won't be fun to maintain in the longer term. I need to work through exactly what needs to be escaped where, and to clean up the transitions. This should be relatively easy given the tests we have now.

Document identifier index limit

The Haskell implementation will happily parse an arbitrarily large index on an identifier (although it looks like something's overflowing either in the representation or the encoding):

$ dhall encode --json <<< "x @ 9223372036854775808"
[
    "x",
    -9223372036854775808
]

We stop at Long.MaxValue:

scala> import org.dhallj.parser.DhallParser
import org.dhallj.parser.DhallParser

scala> DhallParser.parse("x @ 9223372036854775807")
res0: org.dhallj.core.Expr.Parsed = x@9223372036854775807

scala> DhallParser.parse("x @ 9223372036854775808")
java.lang.NumberFormatException: For input string: "9223372036854775808"
  at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:68)
  at java.base/java.lang.Long.parseLong(Long.java:703)
  ...

I think this is fine, since nobody's ever going to bind 9,223,372,036,854,775,808 xs, so this will never type-check anyway, but we should document the difference and maybe wrap the NumberFormatException in a ParseException.
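
Both behaviours can be reproduced with the JDK alone. In the sketch below, parseIndex wraps the NumberFormatException in a parse-level exception, and truncate shows that cutting 9223372036854775808 down to 64 bits gives exactly the negative number in the Haskell output. IndexParseException is a hypothetical stand-in for whatever exception type the parser would actually use.

```java
import java.math.BigInteger;

final class IdentifierIndexSketch {
    // Hypothetical exception type standing in for the parser's own failure type.
    static final class IndexParseException extends RuntimeException {
        IndexParseException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Parses an identifier index, reporting out-of-range values as parse failures.
    static long parseIndex(String digits) {
        try {
            return Long.parseLong(digits);
        } catch (NumberFormatException e) {
            throw new IndexParseException("Identifier index out of range: " + digits, e);
        }
    }

    // Keeps only the low 64 bits of an arbitrarily large value, which wraps around.
    static long truncate(String digits) {
        return new BigInteger(digits).longValue();
    }

    public static void main(String[] args) {
        System.out.println(parseIndex("9223372036854775807")); // 9223372036854775807
        System.out.println(truncate("9223372036854775808"));   // -9223372036854775808
    }
}
```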

YAML export escapes quotes

Another one:

@ YamlConverter.toYamlString(parse(""" { a = "\"\n" } """)) 
res19: String = """a: |
  \"
"""

EDIT: the newline is irrelevant:

@ YamlConverter.toYamlString(parse(""" { a = "\"" } """)) 
res5: String = """a: \"
"""

dhall-scala dependency is broken: dhall-ast not found

I was trying to add a dependency on "org.dhallj" %% "dhall-scala" % "0.3.2" (for a 2.13 project), and I get this error:

[error] (game-data / update) sbt.librarymanagement.ResolveException: Error downloading org.dhallj:dhall-ast_2.13:0.3.2
[error]   Not found
[error]   Not found
[error]   not found: /Users/gavin/.ivy2/local/org.dhallj/dhall-ast_2.13/0.3.2/ivys/ivy.xml
[error]   not found: https://repo1.maven.org/maven2/org/dhallj/dhall-ast_2.13/0.3.2/dhall-ast_2.13-0.3.2.pom
[error] Total time: 2 s, completed May 26, 2020 12:01:58 PM

Leverage Truffle?

I wonder whether leveraging Truffle (from the Graal project) would be something we'd be interested in, if we'd want to JIT Dhall rather than always interpreting it?

Decide what to do about type checker failure test failure

The type checker acceptance tests include the following failure test case:

{ x: Natural, x: Natural }

We currently type check this without a failure:

scala> import org.dhallj.parser.DhallParser
import org.dhallj.parser.DhallParser

scala> DhallParser.parse("{ x: Natural, x: Natural }").typeCheck
res0: org.dhallj.core.Expr = Type

We could easily change the type checker to make it fail here, but the spec seems to explicitly say that we don't need to:

Carefully note that there should be no need to handle duplicate fields by this point because the desugaring rules for record literals merge duplicate fields into unique fields.

Right now I'm ignoring this failure test case, but I need to figure out whether I'm misunderstanding the language in the spec, and fix the type checker if I am.

Parser is not stack-safe for deep records

All operations currently work just fine on arbitrarily long lists:

scala> import org.dhallj.core.Expr
import org.dhallj.core.Expr

scala> import org.dhallj.parser.DhallParser
import org.dhallj.parser.DhallParser

scala> import org.dhallj.core.converters.JsonConverter
import org.dhallj.core.converters.JsonConverter

scala> val longList = DhallParser.parse((0 to 100000).mkString("[", ",", "]"))
longList: org.dhallj.core.Expr.Parsed = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...

scala> longList.normalize
res0: org.dhallj.core.Expr = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, ...

scala> longList.typeCheck
res1: org.dhallj.core.Expr = List Natural

scala> JsonConverter.toCompactString(longList)
res2: String = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,...

…and most things work just fine on arbitrarily deep record literals (or record types, or union types):

scala> val deepRecord = (0 to 10000).foldLeft(longList: Expr) { 
     |   case (expr, _) => Expr.makeRecordLiteral("foo", expr)
     | }
deepRecord: org.dhallj.core.Expr = {foo = {foo = {foo = {foo = {foo = {foo = ...

scala> deepRecord.normalize.alphaNormalize.hash
res3: String = f41d556f987dd60c59e9b49a367b0bf907dba111c904c88dfda27e2a599a07bc

scala> JsonConverter.toCompactString(deepRecord)
res4: String = {"foo":{"foo":{"foo":{"foo":{"foo":{"foo":{"foo":{"foo":{"foo": ...

Note that dhall produces the same hash for this expression:

$ dhall hash < deep-record.dhall
sha256:f41d556f987dd60c59e9b49a367b0bf907dba111c904c88dfda27e2a599a07bc

Unfortunately the parser can't handle this expression:

scala> DhallParser.parse(deepRecord.toString)
java.lang.StackOverflowError
  at org.dhallj.parser.support.JavaCCParser.jj_ntk_f(JavaCCParser.java:4508)
  at org.dhallj.parser.support.JavaCCParser.BASE_EXPRESSION(JavaCCParser.java:2425)
  at org.dhallj.parser.support.JavaCCParser.RECORD_LITERAL_ENTRY(JavaCCParser.java:390)
  at org.dhallj.parser.support.JavaCCParser.RECORD_LITERAL_OR_TYPE(JavaCCParser.java:666)
  at org.dhallj.parser.support.JavaCCParser.PRIMITIVE_EXPRESSION(JavaCCParser.java:874)
  at org.dhallj.parser.support.JavaCCParser.SELECTOR_EXPRESSION(JavaCCParser.java:1011)
  at org.dhallj.parser.support.JavaCCParser.COMPLETION_EXPRESSION(JavaCCParser.java:1080)
  ...

I think this should just be a matter of doing some more left-factoring, but I'm fairly new to JavaCC and I don't really know how much work this will be, so I've decided not to let this issue block the 0.1.0 release.

root directory for imports

For filesystem imports, it would be nice to be able to specify the root directory (especially as there is no easy and reliable way to change the current working directory on the JVM). imports-mini already supports this.

This should only involve adding e.g. rootDirectory: Path to ResolutionConfig and respecting it in ResolveImportsVisitor. I could create a PR for this.
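A rough sketch of what the resolution step might look like, assuming a configurable root directory. `RootDirectorySketch` and `resolveLocal` are illustrative names only; nothing here reflects the actual `ResolutionConfig` or `ResolveImportsVisitor` code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not DhallJ API.
final class RootDirectorySketch {
  // Resolves a relative import against the configured root instead of the
  // JVM's working directory. Absolute imports are only normalized.
  static Path resolveLocal(Path rootDirectory, Path importPath) {
    return importPath.isAbsolute()
        ? importPath.normalize()
        : rootDirectory.resolve(importPath).normalize();
  }

  public static void main(String[] args) {
    Path root = Paths.get("config");
    // The separator in the printed path depends on the platform.
    System.out.println(resolveLocal(root, Paths.get("./package.dhall")));
  }
}
```

Passing the root explicitly keeps the behaviour independent of wherever the JVM happened to be started.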

cannot parse string literals with a #

Any string literal containing a # is rejected.

@ val Right(expr) = """ "#" """.parseExpr 
org.dhallj.parser.support.TokenMgrException: Lexical error at line 1, column 3.  Encountered: "#" (35), after : ""
  org.dhallj.parser.support.JavaCCParserTokenManager.getNextToken(JavaCCParserTokenManager.java:7734)
  org.dhallj.parser.support.JavaCCParser.jj_ntk_f(JavaCCParser.java:4508)
  org.dhallj.parser.support.JavaCCParser.DOUBLE_QUOTE_LITERAL(JavaCCParser.java:96)
  org.dhallj.parser.support.JavaCCParser.TEXT_LITERAL(JavaCCParser.java:158)
  org.dhallj.parser.support.JavaCCParser.PRIMITIVE_EXPRESSION(JavaCCParser.java:870)
  org.dhallj.parser.support.JavaCCParser.SELECTOR_EXPRESSION(JavaCCParser.java:1011)
  org.dhallj.parser.support.JavaCCParser.COMPLETION_EXPRESSION(JavaCCParser.java:1080)
  org.dhallj.parser.support.JavaCCParser.IMPORT_EXPRESSION(JavaCCParser.java:1211)
  org.dhallj.parser.support.JavaCCParser.APPLICATION_EXPRESSION(JavaCCParser.java:1306)
  org.dhallj.parser.support.JavaCCParser.WITH_EXPRESSION(JavaCCParser.java:1388)
  org.dhallj.parser.support.JavaCCParser.EQUIVALENT_EXPRESSION(JavaCCParser.java:1410)
  org.dhallj.parser.support.JavaCCParser.NOT_EQUALS_EXPRESSION(JavaCCParser.java:1450)
  org.dhallj.parser.support.JavaCCParser.EQUALS_EXPRESSION(JavaCCParser.java:1490)
  org.dhallj.parser.support.JavaCCParser.TIMES_EXPRESSION(JavaCCParser.java:1530)
  org.dhallj.parser.support.JavaCCParser.COMBINE_TYPES_EXPRESSION(JavaCCParser.java:1570)
  org.dhallj.parser.support.JavaCCParser.PREFER_EXPRESSION(JavaCCParser.java:1610)
  org.dhallj.parser.support.JavaCCParser.COMBINE_EXPRESSION(JavaCCParser.java:1650)
  org.dhallj.parser.support.JavaCCParser.AND_EXPRESSION(JavaCCParser.java:1690)
  org.dhallj.parser.support.JavaCCParser.LIST_APPEND_EXPRESSION(JavaCCParser.java:1730)
  org.dhallj.parser.support.JavaCCParser.TEXT_APPEND_EXPRESSION(JavaCCParser.java:1770)
  org.dhallj.parser.support.JavaCCParser.PLUS_EXPRESSION(JavaCCParser.java:1810)
  org.dhallj.parser.support.JavaCCParser.OR_EXPRESSION(JavaCCParser.java:1841)
  org.dhallj.parser.support.JavaCCParser.IMPORT_ALT_EXPRESSION(JavaCCParser.java:1881)
  org.dhallj.parser.support.JavaCCParser.OPERATOR_EXPRESSION(JavaCCParser.java:1908)
  org.dhallj.parser.support.JavaCCParser.FUNCTION_TYPE_OR_ANNOTATED_EXPRESSION(JavaCCParser.java:2366)
  org.dhallj.parser.support.JavaCCParser.BASE_EXPRESSION(JavaCCParser.java:2472)
  org.dhallj.parser.support.JavaCCParser.COMPLETE_EXPRESSION(JavaCCParser.java:2500)
  org.dhallj.parser.support.JavaCCParser.TOP_LEVEL(JavaCCParser.java:2514)
  org.dhallj.parser.support.Parser.parse(Parser.java:12)
  org.dhallj.parser.DhallParser.parse(DhallParser.java:11)
  org.dhallj.syntax.package$DhallStringOps$.parseExpr$extension(package.scala:13)
  ammonite.$sess.cmd10$.<clinit>(cmd10.sc:1)

Both dhall-haskell and dhall-rust accept this.

YAML export escapes newlines

Using 0.1.1:

@ YamlConverter.toYamlString(parse(""" { a = "\n" } """).normalize()) 
res15: String = """a: \n
"""

@ JsonConverter.toCompactString(parse(""" { a = "\n" } """).normalize()) 
res16: String = "{\"a\":\"\\n\"}"

The JSON output is correct, but in the YAML output, the value does not contain a newline (without quotes, \n is not an escape sequence).
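A minimal sketch of one way an emitter could avoid this, assuming it switches to YAML's double-quoted style, where \n is an escape sequence. `YamlScalarSketch` and `quote` are hypothetical names and are not part of DhallJ's `YamlConverter`.

```java
// Hypothetical helper, not DhallJ API: emits a YAML double-quoted scalar.
final class YamlScalarSketch {
  static String quote(String value) {
    StringBuilder out = new StringBuilder("\"");
    for (int i = 0; i < value.length(); i++) {
      char c = value.charAt(i);
      switch (c) {
        case '\\': out.append("\\\\"); break; // backslash must be escaped first-class
        case '"':  out.append("\\\""); break; // closing-quote character
        case '\n': out.append("\\n"); break;  // newline becomes the \n escape
        case '\r': out.append("\\r"); break;
        case '\t': out.append("\\t"); break;
        default:   out.append(c);
      }
    }
    return out.append('"').toString();
  }

  public static void main(String[] args) {
    // Prints: a: "\n"
    System.out.println("a: " + quote("\n"));
  }
}
```

A block scalar would be another option for multi-line text. The sketch only covers the handful of escapes relevant to this report.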

Validate that core modules actually work on Java 7

We're currently using -source 1.7 and -target 1.7 for the Java modules, and I'm pretty sure we're not using any post-7 standard library stuff, but I'm not setting the bootstrap class path, and I haven't actually bothered to try things out on a Java 7 JVM (only Java 8 at the earliest, both locally and in CI).

Unions don't seem to be typechecked correctly

In the following Dhall expression:

let Package =
      < GitHub : { repository : Text, revision : Text }
      >

in [ Package.GitHub { repository = {} } ]

We should get a type error because repository is not a Text and revision is missing. Dhall's editor on the website says:

Error: Wrong type of function argument

{ - revision : 
,   repository : - Text
                 + Type
}

5      Package.GitHub { repository = {} }

(input):5:6

Instead, if I try to normalize this expression and encode it as JSON using the circe encoder:

import $ivy.`org.dhallj::dhall-scala:0.8.0-M1`
import $ivy.`org.dhallj::dhall-circe:0.8.0-M1`

import org.dhallj.parser.DhallParser._

val input = """
let Package =
      < GitHub : { repository : Text, revision : Text }
      >

in [ Package.GitHub { repository = {} } ]
"""

import org.dhallj.circe.Converter

println(parse(input))
println(parse(input).normalize)
println(Converter(parse(input).normalize))

The last line prints:

Some([
  {
    
  }
])

The result of normalization prints as [(<GitHub : {repository : Text, revision : Text}>.GitHub) {repository = {}}].

If I skip repository = {} completely, I get Some([\n]).

If Package is just a record and not a union, it works as expected.

Clean up toString implementation

This is a follow-up to #7. I've now got toString for Expr working well enough that it can be used to round-trip the unnormalised prelude, which was my goal for 0.1.0, but the implementation is still a disaster. I'd originally been using it for debugging and didn't really care about producing valid Dhall code, and I just threw together the current version in the past couple of days. It parenthesises unnecessarily, probably still gets precedence wrong in some cases, is an unmaintainable mess, etc.

We'll also probably want some kind of Dhall code pretty-printing at some point, but I'll open a separate issue for that.

Set up JavaScript build

I hacked this together as an experiment this afternoon with J2CL, and it's not too bad—a few Bazel build files, some minimal implementations of java.net and java.nio stuff, and (the worst part) rewriting all instances of String.format.

You can try it out from a web console here.

I think we should at least consider publishing JavaScript artifacts from here. On the "do it" side:

  • Compiling dhall-core and dhall-parser to JavaScript ended up being much smaller than I expected: J2CL / Closure gets a module that exports parsing, normalisation, and type-checking down to 240K. For comparison, the current GHCJS-built JavaScript used on dhall-lang.org is 6.6M minimised (this isn't a direct comparison, since dhall-core and dhall-parser don't provide import resolution and the GHCJS build does, but the difference was still a little surprising to me (25-30x)).

On the "no" side:

  • Living without String.format is really annoying.
  • I don't know if anyone would actually use something like this.

If someone can come up with a single real use case for this I'll clean up my branch and open a PR. I'll also need some help with packaging, etc.—I've essentially not used JavaScript outside of the context of Scala.js for at least a decade.

Run acceptance tests related to import resolution

We're not currently running any of the dhall-lang/tests/import tests, or the two type-inference tests related to caching (CacheImports and CacheImportsCanonicalize). I don't think there's any particular reason for this now, and we have our own tests for the imports module that cover some of the same ground, so we should give it a try.

Investigate type-checking performance

Right now the normalized Prelude type-checks ~instantaneously:

scala> import org.dhallj.syntax._
import org.dhallj.syntax._

scala> val path = "./dhall-lang/Prelude/package.dhall"
path: String = ./dhall-lang/Prelude/package.dhall

scala> val Right(prelude) = path.parseExpr.flatMap(_.resolve)
prelude: org.dhallj.core.Expr = {`Bool` = {and = let ...

scala> val normalized = prelude.normalize
normalized: org.dhallj.core.Expr = {`Bool` = {and = ...

scala> normalized.typeCheck
res0: Either[org.dhallj.core.typechecking.TypeCheckFailure,org.dhallj.core.Expr] = Right(...

If you don't normalize it, though, it takes a little over a minute:

scala> prelude.typeCheck
res1: Either[org.dhallj.core.typechecking.TypeCheckFailure,org.dhallj.core.Expr] = Right(...

It does produce the correct result, but something is obviously wrong. I haven't really looked into this yet, but I'm assuming it's because it's type-checking the same imported trees over and over.
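If that assumption is right, the cost pattern would resemble the standalone sketch below: a tree that references the same subtree object many times is processed once per reference unless results are cached. All names (`SharedTreeSketch`, `Tree`, `sizeNaive`, `sizeCached`) are hypothetical. This is not DhallJ's type checker, and whether caching by object identity would be sound there depends on how typing contexts are handled.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical illustration, not DhallJ code.
final class SharedTreeSketch {
  static final class Tree {
    final Tree left, right;
    Tree(Tree left, Tree right) { this.left = left; this.right = right; }
  }

  // Counts how many non-null nodes were actually processed.
  static long processed = 0;

  // Each level points at the previous level twice, like a repeated import.
  static Tree build(int levels) {
    Tree t = null;
    for (int i = 0; i < levels; i++) t = new Tree(t, t);
    return t;
  }

  // Walks every reference, so shared subtrees are processed repeatedly.
  static long sizeNaive(Tree t) {
    if (t == null) return 0;
    processed++;
    return 1 + sizeNaive(t.left) + sizeNaive(t.right);
  }

  // Same result, but each distinct object is processed only once.
  static long sizeCached(Tree t, Map<Tree, Long> cache) {
    if (t == null) return 0;
    Long known = cache.get(t);
    if (known != null) return known;
    processed++;
    long size = 1 + sizeCached(t.left, cache) + sizeCached(t.right, cache);
    cache.put(t, size);
    return size;
  }

  public static void main(String[] args) {
    Tree shared = build(20);
    processed = 0;
    long naive = sizeNaive(shared);
    long naiveWork = processed;
    processed = 0;
    long cached = sizeCached(shared, new IdentityHashMap<Tree, Long>());
    System.out.println(naive == cached);               // true
    System.out.println(naiveWork + " vs " + processed); // 1048575 vs 20
  }
}
```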

Fix DontCacheIfHash test

We're currently ignoring this new-ish test, which turned up a bug in the dhall-imports module. It involves the cache not being consulted when a duplicate import provides a hash, and should not affect most users.

Publish API docs

We have sbt-unidoc set up and I've been using it locally, and we could pretty easily publish the API docs now with sbt-ghpages, but the Scaladoc presentation of the Java API docs is pretty terrible in my view, and I'd rather wait and take the time to do this more nicely (probably by publishing the Javadoc and Scaladoc sites separately—I think I'd give up cross-Java-Scala-module linking to have real Javadoc output for the Java modules).

Investigate whether Paths.get is safe across platforms

Right now we use Paths.get(pathToken) for local imports during parsing. The parser ensures that only / will be used as a separator, but I'm not sure I know for a fact that Paths.get is guaranteed to handle / appropriately on all platforms.

Alternatively we could switch away from using Path in our representation of local imports, which I think might be the better choice.
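One possible shape for that alternative, sketched with hypothetical names (`LocalImportSketch`, `segments`, `toPath`): keep the parsed path as a list of segments split on `/`, and only build a `Path` at resolution time via `Paths.get(first, more...)`, which joins the components with the platform's own separator. The sketch handles relative paths only; an absolute or home-relative prefix would have to be tracked separately.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical representation, not DhallJ API.
final class LocalImportSketch {
  // Splits a parser-level path token on '/', which the parser guarantees
  // is the only separator. Empty parts (e.g. from "a//b") are dropped.
  static List<String> segments(String pathToken) {
    List<String> result = new ArrayList<>();
    for (String part : pathToken.split("/")) {
      if (!part.isEmpty()) result.add(part);
    }
    return result;
  }

  // Builds a platform path from segments without embedding any separator.
  static Path toPath(List<String> segments) {
    if (segments.isEmpty()) return Paths.get("");
    String first = segments.get(0);
    String[] rest = segments.subList(1, segments.size()).toArray(new String[0]);
    return Paths.get(first, rest);
  }

  public static void main(String[] args) {
    // Prints: [., foo, bar.dhall]
    System.out.println(segments("./foo/bar.dhall"));
  }
}
```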

Fix SomeXYZ parsing case

The parser acceptance suite includes the following test input:

{-
This is always a type error, but if this
was a parse error it would be quite confusing
because Some looks like a builtin.
-}
Some x y z

And the expected CBOR encoding looks like this:

[0, [5, null, ["x", 0]], ["y", 0], ["z", 0]]

We parse this as Some applied to three arguments, and end up with a different encoding:

scala> import org.dhallj.parser.DhallParser
import org.dhallj.parser.DhallParser

scala> DhallParser.parse("Some x y z")
res0: org.dhallj.core.Expr.Parsed = Some x y z

scala> DhallParser.parse("Some x y z").getEncodedBytes
res1: Array[Byte] = Array(-125, 5, -10, -126, 97, 120, 0, -126, 97, 121, 0, -126, 97, 122, 0)

I've currently set this test case to be ignored, and since it seems like a minor issue and doesn't affect well-typed code (as far as I can see?) I've decided not to let it block the 0.1.0 release, but we need to fix it.

Type checker is not stack-safe for (extremely) deep records

Similar to #2, although the cause is different, and the point at which it becomes a problem is much deeper.

The type checker will happily give you a type for a record many thousands of nestings deep (although 20k layers takes about a minute on my laptop, so it's not fast):

scala> import org.dhallj.core.Expr
import org.dhallj.core.Expr

scala> val deepRecord = (0 to 20000).foldLeft(Expr.Constants.TRUE) {
     |   case (expr, _) => Expr.makeRecordLiteral("foo", expr)
     | }
deepRecord: org.dhallj.core.Expr = {foo = {foo = {foo = {foo = {foo = {foo = ...

scala> deepRecord.typeCheck
res0: org.dhallj.core.Expr = {foo : {foo : {foo : {foo : {foo : {foo : {foo : ...

All operations except type checking work for arbitrarily deep values—e.g. normalizing or hashing a record 100k layers deep takes seconds:

scala> val deeperRecord = (0 to 100000).foldLeft(Expr.Constants.TRUE) {
     |   case (expr, _) => Expr.makeRecordLiteral("foo", expr)
     | }
deeperRecord: org.dhallj.core.Expr = {foo = {foo = {foo = {foo = {foo = {foo = ...

scala> deeperRecord.normalize
res2: org.dhallj.core.Expr = {foo = {foo = {foo = {foo = {foo = {foo = {foo = {foo = ...

scala> deeperRecord.hash
res3: String = 8a8477b86e27cd48496db13bbd71bb9845c700cb88b9a8bfacd2391541ff38cc

(I'm guessing dhall would agree on this hash, but it's been running for five minutes and I've not heard back from it yet.)

Type checking blows up somewhere between 20k and 100k:

scala> deeperRecord.typeCheck
java.lang.StackOverflowError
  at org.dhallj.core.typechecking.TypeCheck.onRecord(TypeCheck.java:404)
  at org.dhallj.core.typechecking.TypeCheck.onRecord(TypeCheck.java:27)
  at org.dhallj.core.Constructors$RecordLiteral.accept(constructors-gen.java:249)
  at org.dhallj.core.typechecking.TypeCheck.onRecord(TypeCheck.java:408)
  at org.dhallj.core.typechecking.TypeCheck.onRecord(TypeCheck.java:27)
  at org.dhallj.core.Constructors$RecordLiteral.accept(constructors-gen.java:249)
  ...

This is because the type checker is currently implemented as an external visitor, where the visitor drives the recursion manually, while all of the other core operations are implemented as internal visitors, where the driver manages the recursion and maintains its own stack.
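The difference can be illustrated with a small standalone sketch. It doesn't use DhallJ's visitor types; `StackSafetySketch`, `Node`, `depthRecursive`, and `depthIterative` are hypothetical names, and the example only contrasts recursion on the JVM call stack with a traversal that keeps its own stack on the heap.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration, not DhallJ code.
final class StackSafetySketch {
  // Stand-in for a nested expression such as {foo = {foo = ...}}.
  static final class Node {
    final List<Node> children = new ArrayList<>();
  }

  // Recursion driven by the call stack: one JVM frame per level of nesting,
  // so very deep input may throw StackOverflowError.
  static int depthRecursive(Node node) {
    int max = 0;
    for (Node child : node.children) {
      max = Math.max(max, depthRecursive(child));
    }
    return max + 1;
  }

  // The driver keeps pending work on explicit heap-allocated stacks,
  // so the call stack stays flat however deep the input is.
  static int depthIterative(Node root) {
    Deque<Node> nodes = new ArrayDeque<>();
    Deque<Integer> depths = new ArrayDeque<>();
    nodes.push(root);
    depths.push(1);
    int max = 0;
    while (!nodes.isEmpty()) {
      Node node = nodes.pop();
      int depth = depths.pop();
      max = Math.max(max, depth);
      for (Node child : node.children) {
        nodes.push(child);
        depths.push(depth + 1);
      }
    }
    return max;
  }

  // Builds a chain of the given number of levels.
  static Node nest(int levels) {
    Node root = new Node();
    Node current = root;
    for (int i = 1; i < levels; i++) {
      Node child = new Node();
      current.children.add(child);
      current = child;
    }
    return root;
  }

  public static void main(String[] args) {
    System.out.println(depthIterative(nest(200000))); // 200000
  }
}
```

Whether `depthRecursive` survives a given depth depends on the thread's stack size, which is why the usage above only exercises the iterative version on the deep input.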

20k layers should be enough for any record, so I'm not letting this block the 0.1.0 release, but I'm planning to rework the type checker as an internal visitor soon, anyway, and I'm opening this issue to track the problem.
