typelevel / grackle

Grackle: Functional GraphQL for the Typelevel stack

Home Page: http://typelevel.org/grackle/

License: Apache License 2.0


grackle's Introduction

Grackle - GraphQL for the Typelevel stack


Overview

Grackle is a GraphQL server written in functional Scala, built on the Typelevel stack.

It's powered by cats, cats-effect, fs2 and http4s, and supports GraphQL queries, mutations and subscriptions.

It has an abstract model of data sources implemented in terms of declarative mappings between GraphQL schemas and backing data, and cursors into that data. It supports in-memory, DB-backed, and effectful data sources.

Grackle is structured as a compiler/interpreter. Queries are type-checked against a GraphQL schema and compiled into an internal query algebra. The query algebra may be further compiled in a backend-specific way to materialize data. In particular it can be compiled to efficient SQL and in that regard currently supports Postgres via Doobie or Skunk.

Grackle is an Apache 2.0 licensed Typelevel project and is available for Scala 2/3 and for Scala.js and Scala Native.

Work has been generously sponsored by Aura/Gemini and ITV over the last four years.

Getting Started

To add Grackle to your project you should add the following to your build.sbt,

// Required: Scala 2.13/3.3+
libraryDependencies += "org.typelevel" %% "grackle-core" % "0.20.0"

// Optional: support for in-memory Json backend using circe
libraryDependencies += "org.typelevel" %% "grackle-circe" % "0.20.0"

// Optional: support for in-memory generic Scala backend using shapeless
libraryDependencies += "org.typelevel" %% "grackle-generic" % "0.20.0"

// Optional: support for Postgres backend via Doobie (JVM only)
libraryDependencies += "org.typelevel" %% "grackle-doobie-pg" % "0.20.0"

// Optional: support for Postgres backend via Skunk
libraryDependencies += "org.typelevel" %% "grackle-skunk" % "0.20.0"
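
To get a feel for the API, here is a minimal sketch, assuming the current top-level grackle package name (older releases used edu.gemini.grackle): parsing a schema from SDL text, which is the starting point for defining a mapping.

import grackle.Schema

// Parse and validate a schema from SDL text; the result is a failure
// value rather than an exception if the text is malformed.
val schema = Schema(
  """
    type Query {
      greeting: String!
    }
  """
)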

Community

Grackle is proud to be a Typelevel project. We are committed to providing a friendly, safe and welcoming environment for all, and ask that the community adhere to the Typelevel Code of Conduct in all venues.

Conversations around Grackle are currently happening on GitHub issues, PR discussions, and Discord.

The Typelevel Discord has a #grackle channel, as well as channels for related projects such as #cats, #cats-effect, #fs2, #doobie and #skunk. If you're new to the Typelevel ecosystem the #beginners channel might also be useful. Please join us!

Talks

Contributing

This project exists thanks to all the people who contribute.

We welcome all kinds of contribution, including but not limited to,

  • documentation improvements, explanatory images/diagrams, typo fixes, useful links
  • refactorings of messy code, build structure, increasing test coverage or quality
  • new features and bugfixes (including bug reports and feature requests).

Writing documentation is valuable for learning, so if you find some explanation insufficient, overly complicated or incorrect, it's a perfect opportunity to make a change to it!

If at any point you run into problems, you can always ask a question on the #grackle channel on the Typelevel Discord server.

More information, including on how to build locally and submit pull requests, can be found in the project's contributing guide.

grackle's People

Contributors

ahoy196, armanbilge, cquiroz, danicheg, dantb, jatcwang, jbwheatley, keirlawson, lucuma-steward[bot], michludw, milessabin, mvalle, phdoerfler, quithub, rpiaggio, rpiotrow, swalker2m, timwspence, tpolecat, typelevel-steward[bot], umazalakain, valencik, yasuba


grackle's Issues

Coalesce multiple root queries in the doobie interpreter

Staging generates multiple root queries for the next stage, and when #13 is done, client queries will be able to do so directly. Currently these are handled independently, resulting in multiple DB queries. In many cases they can be merged and satisfied with a single DB query.

Validate mappings

Currently there is no validation of mappings. This means that mappings for objects or fields could be missing or have incorrect names compared with the schema. Doobie object mappings also require that there is at least one field or attribute which is marked as a "key" for an object (ie. grouping rows on that field/attribute will collect all the rows relevant to a given object together). If a key isn't defined then currently subobjects will silently vanish. Any of the above will result in hard-to-diagnose runtime errors.

Some simple consistency checks and validation checks between the mapping and the schema should eliminate most of the problems.

It would also be nice to be able to validate table and column names against the DB schema, but it's not so obvious how to approach that.
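
A first cut at the consistency check could be a simple set comparison between the schema's fields and the mapping's; a sketch with illustrative helper names:

// Every field the mapping declares should exist on the corresponding
// schema type, and every schema field should be covered by the mapping.
def unknownFields(schemaFields: Set[String], mappedFields: Set[String]): Set[String] =
  mappedFields -- schemaFields

def unmappedFields(schemaFields: Set[String], mappedFields: Set[String]): Set[String] =
  schemaFields -- mappedFields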

Allow predicates to traverse Postgres array and jsonb columns

Currently doobie array (and in the future jsonb) columns are handled as special kinds of leaf types as far as GraphQL query execution is concerned. It should be possible to consider array elements and json fields as an extension of the model and allow predicates to index into them.
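
For illustration, the kind of predicate we'd like the compiler to be able to reach, written directly as doobie fragments (a hand-written sketch, not Grackle output):

import doobie._
import doobie.implicits._

// Postgres operators the predicates should be able to target: `->>`
// extracts a jsonb field as text; `= ANY(...)` tests array membership.
def byKind(kind: String): Fragment =
  fr"data ->> 'kind' = $kind"

def hasTag(tag: String): Fragment =
  fr"$tag = ANY(tags)"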

Support GraphQL extensions

The GraphQL spec defines a simple extension mechanism for all types. This allows the definition of a new type by the addition of new elements of the appropriate sort to an existing type (eg. an additional field on an object type). This is currently not supported.

This can most likely be implemented as an elaboration/validation phase.

Test framework to sanity check generated SQL

To avoid runtime surprises it would be useful to have some sort of mechanism to check that SQL generated for particular schema/query combinations is reasonable. @tpolecat suggested that we might be able to use information extracted from EXPLAIN ANALYSE.

The main thing we want to be able to catch is a top-level query generating thousands of subqueries.
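
A hedged sketch of how such a check might run under doobie: pass the generated SQL through EXPLAIN and let the test assert on the plan text.

import doobie._
import doobie.implicits._

// Return the EXPLAIN plan for a generated query as a list of lines; a
// test could then fail if the plan shows an unexpected explosion of
// subplans or nested loops.
def explainPlan(generatedSql: String): ConnectionIO[List[String]] =
  (Fragment.const("EXPLAIN ") ++ Fragment.const(generatedSql))
    .query[String]
    .to[List]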

(De)serialize schema values to/from GraphQL.

GraphQL schemas are represented internally as values of a Scala ADT; the spec defines an external GraphQL representation. The latter is a straightforward serialization form of the former.

We should implement the latter and replace the manual schema construction in tests with schema text.

Validate schema

Currently there's no validation of schemas beyond basic syntactic well-formedness. Consequently it's possible to construct schemas with references to undefined types (typically due to a typo in the definition or use) or multiple definitions (eg. I've accidentally defined a type first as a scalar, then as an enum, but forgotten to remove the first definition). Invalid schemas of this sort can result in all sorts of confusing runtime failures later.

The GraphQL spec has a fairly comprehensive list of validation criteria for queries. The validation criteria for schemas are less explicit and are spread throughout section 3 of the spec: https://spec.graphql.org/June2018/#sec-Type-System. In practice I think that checking for references to undefined or multiply defined types will cover the most common issues.

GraphQL schemas are represented as values of the Schema type, and the validation above would be a matter of checking the NamedType elements of its types member for consistency and completeness.
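
For example, a schema like the following currently parses even though Droid is never defined, and only fails confusingly later at query time (an illustrative sketch):

// `Droid` is referenced but never defined; the proposed validation pass
// would reject this at schema construction time.
val schema = Schema(
  """
    type Query {
      hero: Droid
    }
  """
)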

Rework the compiler to use table aliases, subqueries, limit, offset and order by

Currently the SQL compiler targets a very limited subset of SQL: left joins and where clauses with simple, non-subquery, expressions. The staging/batching mechanism mitigates the lack of table aliasing somewhat but isn't ideal. In some cases this has led to query algebra constructs being squeezed into the current translation inappropriately (see #119 for a consequence of handling what should be a subquery as a join).

The compiler needs to be reworked in at least the following ways,

  • recursive queries should be compiled in terms of table aliases
  • projected predicates should compile to subqueries
  • interfaces and unions should be compiled in terms of SQL unions
  • limit and order by should be compiled to the natural SQL equivalents rather than being implemented programmatically in the interpreter.

Enable full validation prior to execution

Currently all (implemented) validation rules are always applied; however, some are applied late, ie. after query execution has started. We should fully validate before any query execution.

Get rid of `NoType`

NoType was an idea borrowed from scalac/dotc as an alternative for Option[Type]. It's error-prone (cp. null) and although it's possible that it's justified in terms of memory footprint/heap churn in those cases, it really isn't here, where queries are much smaller than typical Scala programs.

Consider a lower level/more performant response representation

Currently we construct responses as Circe JsonObjects. @djspiewak observed that this is very costly, and also most likely unnecessary given that we're constructing a single response value bottom up: we could just as easily glom byte arrays together rather than constructing an intermediate structured value.

Simply ripping out Circe (for this role, not everywhere) and replacing with something lower level should be fairly straightforward, but probably isn't an immediate priority.

A slightly more ambitious move would be to support response streaming (ie. start returning the response to the client as soon as the first bytes are available). This might involve some tricky scheduling of nested subqueries for the best results, but I think also should be possible.

Schema permits duplicate enum values

If I'm reading the spec correctly (and understanding the code correctly!) duplicate enum values in schemas should not be permitted (https://spec.graphql.org/June2018/#sec-Enums), but presently the following test fails:

val schema = Schema(
  """
    enum Direction {
      NORTH
      NORTH
    }
  """
)

assert(schema.isLeft)

Seems to me that we should be holding these values as a Set rather than a List? Happy to work on a fix if someone can clarify whether or not this is intended behaviour :)
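
Whatever the chosen representation, the check itself could be as simple as this sketch:

// Collect enum value names that occur more than once and reject the
// schema if any exist.
val values = List("NORTH", "NORTH", "SOUTH")
val duplicates = values.groupBy(identity).collect { case (v, vs) if vs.sizeIs > 1 => v }
assert(duplicates.isEmpty, s"duplicate enum values: ${duplicates.mkString(", ")}")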

Projected predicates not implemented correctly

The Project predicate combinator introduced in #94 should be implemented in terms of a sql subquery or something similar. There are two disabled tests in ProjectionSpec which illustrate how the current implementation fails.

Add support for narrowing a doobie row based on a discriminator

GraphQL supports subtype polymorphism via interface extension and safe downcasting using fragments. Currently for a doobie backed model this has to be implemented via subqueries, ie. we first have to read a discriminator value and then use that to create a subtype-specific continuation query.

We should be able to support inlining small sets of subtypes in a single row with a discriminator to avoid having to make an additional query.

Schema String interpolator to allow compile time schema validation

Embedding a GraphQL schema in code currently involves runtime parsing (and, after #68, validation) which could fail. It would be better to have this done at compile time.

One model we could follow is to have a schema string interpolator similar to circe's json interpolator which will fail at compile time if the schema isn't well-formed and valid.
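
Hypothetical usage, modeled on circe's json interpolator (no such schema interpolator exists yet; this is what the proposal would enable):

// Hypothetical interpolator: a malformed or invalid schema becomes a
// compile error rather than a runtime failure.
val schema =
  schema"""
    type Query {
      hello: String!
    }
  """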

Add support for parent path/occurrence parameterization of doobie object mappings

Currently a doobie object mapping for a given GraphQL type maps each leaf field of that type to exactly one table/column. This means that if the same type appears as the type of a field in more than one parent type, or appears as the type of more than one field in a single parent type, then it can't be inlined into the table of the parent and must instead be split out into a separate table and recovered via a join.

It should be possible to parameterize an object mapping with a parent path or occurrence to allow the same type to be mapped to different tables/columns depending on context.
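
For a concrete picture of the limitation, consider this illustrative schema (not from the issue), where Address is the type of two fields of the same parent:

// A single object mapping for `Address` cannot say which columns back
// `shippingAddress` and which back `billingAddress`; parameterizing the
// mapping by parent path/occurrence would disambiguate.
val schema = Schema(
  """
    type Customer {
      name: String!
      shippingAddress: Address
      billingAddress: Address
    }
    type Address {
      street: String!
      city: String!
    }
  """
)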

Evaluate switching from non-null by default to nullable by default

The GraphQL spec defines types as nullable by default and provides a Non-Null "wrapping type" to assert that a field is not nullable.

Early on I decided that for the internals of the compiler and interpreter it would be better to flip this around and treat types as non-nullable by default and add an explicit NullableType wrapper, ie. something analogous to Option.

This has worked out well in the compiler and interpreter, but was a bit awkward in the introspection component because we need to jump through a few hoops to get the types back into the form that introspection has to deliver back to clients.

This isn't a huge deal, but it gives me pause, because to the extent that people come to Grackle internals with prior GraphQL experience there will be a potentially off-putting mismatch between their expectations and what they'll find. It would at least be good to get a sense for whether this pain is worth the gain.
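
For reference, a standalone model of the flipped convention (these are not Grackle's actual classes, just an illustration of the idea):

// GraphQL `String!` maps to the bare type; plain (nullable) `String` gets
// an explicit wrapper, analogous to Option.
sealed trait GType
case object StringGType extends GType
case class NullableType(tpe: GType) extends GType

val nonNull: GType  = StringGType               // GraphQL: String!
val nullable: GType = NullableType(StringGType) // GraphQL: String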

Feature Request: Ability to disable schema introspection via configuration

We have Grackle running in QA and Production as part of a PoC, and we were discussing the implications of anyone being able to plug a GraphQL client into a live service.

Since GraphQL allows for introspection, this would mean that anyone could theoretically trawl through all of our live data. In our industry, that could put us in violation of media usage rights (on images, for example).

Would it be possible to disable introspection by some form of configuration settings, so that we can switch it off in production?

Support more sophisticated joins in the doobie module

Currently we only compute joins up until the point that we hit a select that passes through a type we've already visited (ie. we pass through a recursion point in the schema). If we hit such a select the query will be split into two or more sequential queries. We might be able to avoid this by computing more sophisticated joins.

Complete introspection integration

Currently the introspection implementation operates alongside the non-introspective mechanism, distinguished by an explicit operation name IntrospectionQuery. It should be properly composed with the non-introspection mechanism, ie. introspection queries should be available as if defined as part of the non-introspective schema.

Support propagating filters to subqueries

Grackle handles field arguments by rewriting queries, eg. rewriting a select with an argument to a select with no argument, but with a child query that includes a filter operation parameterized with the field argument.

This works well in many cases, but there are some scenarios where a top level filter might logically imply a nested filter. For example, suppose you have a model of a series with episodes. You might want to filter on the availability of some feature, eg. the availability of subtitles. A series has subtitles if at least one of its episodes does, but it might have a mix of both subtitled and non-subtitled episodes. So we might want to be able to distinguish between queries of the form,

series(subtitles: true) {
  episodes {
    title
  }
}

which yields all the episodes (subtitled or not) of all the series which have at least one subtitled episode, and,

series(subtitles: true) {
  episodes(subtitled: true) {
    title
  }
}

which yields all the (subtitled only) episodes of all the series which have at least one subtitled episode.

We can do this currently, so far so good.

However, it would seem more natural in this example for the top level series filter to be propagated automatically to the episode level without having to be repeated: if I ask for a subtitled series I'm most likely not interested in unsubtitled episodes.

This can be kludged together as things stand, but it's a bit awkward. To make it easier the SelectElaborator should support propagating context from outer selects to inner.

Scala 3 support

Would be great to have Scala 3 support for this library, though I'm not sure what would be required to do this. As there are no macros presently, I guess this might be fairly straightforward?

Tracing for Skunk back-end

Tracing is always available if you're using Skunk, so we can replace SkunkMonitor with something that records events in context.

Roadmap for 1.0

  • Rename master to main (#130)
  • Get rid of NoType (#129)
  • Switch from NullableType to NonNullType (#33)
  • Unify fields and attributes
  • Move to CE3 (#131)
  • Cross build for Scala 3 (#104)
  • Rework SQL compiler (#120)
  • Initial subscriptions implementation (#133)
  • Factor out doobie/skunk tests (#198)
  • Set up Scala Steward (#195)
  • Investigate alternative to List as result type of Cursor#asList (#204)
  • Investigate eliminating circe from core (#203)
  • Rename doobie module to reflect that it's Postgres-specific (#197)
  • Remove GroupList from the query algebra (#237)
  • Ensure no internal errors leak out through the GraphQL error channel (#199)
  • Review use of sys.error and assert in the sql module (#200)
  • Source file naming and definition partitioning (#192)
  • Cross-build for Scala.js and Scala Native (#202)
  • Update doc site and demo (#194)
  • Move to Typelevel (#193)
  • Change license to Apache 2 (#196)
  • Move to sbt-typelevel-site (#419)

Review use of scala.Enumeration

Comment from @djspiewak "holy sh*t you used Enumeration!"

scala.Enumeration is used to signal that a data type model value is usable as a value of a GraphQL enum type. It's not a completely stupid idea but as Daniel's comment hints, Enumeration has a fraught history.

We should probably continue to allow Enumeration to be used in this way, but support (and recommend) a different mechanism for representing GraphQL enums.
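
The obvious candidate for the recommended alternative is a plain sealed ADT, sketched here:

// A sealed ADT carries the same information as an Enumeration with none
// of its pitfalls, and pattern matches on it are checked for exhaustivity.
sealed trait Direction extends Product with Serializable
object Direction {
  case object NORTH extends Direction
  case object SOUTH extends Direction

  val values: List[Direction] = List(NORTH, SOUTH)
}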

Add support for circe backed models

Currently Grackle has support for models backed by Scala ADTs (via either manual or derived mappings) and SQL databases via doobie.

It would be nice to also be able to use a (collection of) circe Json objects as the model with GraphQL queries having their natural interpretation as traversals through and selections from those objects.

A basic implementation should be fairly straightforward ... we would need a new subtype of Mapping which provides a Grackle Cursor backed by a circe cursor.
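
Roughly, the model would just be a circe Json value with cursors walking it (a sketch of the data side only; the Mapping subtype is the part to be designed):

import io.circe.Json
import io.circe.literal._

// A GraphQL query like `{ user { name } }` would have its natural
// interpretation as traversal into this value.
val model: Json =
  json"""
    {
      "greeting": "Hello, world",
      "user": { "name": "Mary" }
    }
  """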

Extract test suite from spec and reference implementation

There is no official GraphQL test suite; however, there are a large number of positive and negative examples in the spec and extensive tests in the reference implementation. We should attempt to extract these out as a conformance/interop test suite.

It would be a good idea to participate in the GraphQL working group to cooperate with any other similar efforts.

Create a DSL for the doobie mapping

A doobie mapping is represented as a value of a Scala ADT. Currently this value has to be constructed programmatically by hand, which is a bit awkward and error-prone. A concise DSL would be preferable.
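
One possible shape, with minimal stand-in types so the sketch is self-contained (a real DSL would produce the existing doobie mapping ADT instead):

// Stand-in types for illustration only.
case class Column(table: String, name: String)
case class FieldMapping(fieldName: String, column: Column, key: Boolean = false)
case class ObjectMapping(tpe: String, table: String, fields: List[FieldMapping])

// The DSL surface we're after: concise, declarative, hard to get wrong.
val countryMapping =
  ObjectMapping(
    tpe   = "Country",
    table = "country",
    fields = List(
      FieldMapping("code", Column("country", "code"), key = true),
      FieldMapping("name", Column("country", "name"))
    )
  )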

Spec coverage checklist

Current coverage of the GraphQL Specification, June 2018.

  • 2 Language
    • 2.1 Source Text
      • 2.1.1 Unicode
      • 2.1.2 White Space
      • 2.1.3 Line Terminators
      • 2.1.4 Comments
      • 2.1.5 Insignificant Commas
      • 2.1.6 Lexical Tokens
      • 2.1.7 Ignored Tokens
      • 2.1.8 Punctuators
      • 2.1.9 Names
    • 2.2 Document
    • 2.3 Operations
    • 2.4 Selection Sets
    • 2.5 Fields
    • 2.6 Arguments
    • 2.7 Field Alias
    • 2.8 Fragments
      • 2.8.1 Type Conditions
      • 2.8.2 Inline Fragments
    • 2.9 Input Values
      • 2.9.1 Int Value
      • 2.9.2 Float Value
      • 2.9.3 Boolean Value
      • 2.9.4 String Value
      • 2.9.5 Null Value
      • 2.9.6 Enum Value
      • 2.9.7 List Value
      • 2.9.8 Input Object Values
    • 2.10 Variables
    • 2.11 Type References
    • 2.12 Directives
  • 3 Type System
    • 3.1 Type System Extensions
    • 3.2 Schema
      • 3.2.1 Root Operation Types
      • 3.2.2 Schema Extension
    • 3.3 Descriptions
    • 3.4 Types
      • 3.4.1 Wrapping Types
      • 3.4.2 Input and Output Types
      • 3.4.3 Type Extensions
    • 3.5 Scalars
      • 3.5.1 Int
      • 3.5.2 Float
      • 3.5.3 String
      • 3.5.4 Boolean
      • 3.5.5 ID
      • 3.5.6 Scalar Extensions
    • 3.6 Objects
      • 3.6.1 Field Arguments
      • 3.6.2 Field Deprecation
      • 3.6.3 Object Extensions
    • 3.7 Interfaces
      • 3.7.1 Interface Extensions
    • 3.8 Unions
      • 3.8.1 Union Extensions
    • 3.9 Enums
      • 3.9.1 Enum Extensions
    • 3.10 Input Objects
      • 3.10.1 Input Object Extensions
    • 3.11 List
    • 3.12 Non-Null
      • 3.12.1 Combining List and Non-Null
    • 3.13 Directives
      • 3.13.1 @skip
      • 3.13.2 @include
      • 3.13.3 @deprecated
  • 4 Introspection
    • 4.1 Reserved Names
    • 4.2 Documentation
    • 4.3 Deprecation
    • 4.4 Type Name Introspection
    • 4.5 Schema Introspection
      • 4.5.1 The __Type Type
      • 4.5.2 Type Kinds
        • 4.5.2.1 Scalar
        • 4.5.2.2 Object
        • 4.5.2.3 Union
        • 4.5.2.4 Interface
        • 4.5.2.5 Enum
        • 4.5.2.6 Input Object
        • 4.5.2.7 List
        • 4.5.2.8 Non-Null
      • 4.5.3 The __Field Type
      • 4.5.4 The __InputValue Type
      • 4.5.5 The __EnumValue Type
      • 4.5.6 The __Directive Type
  • 5 Validation
    • 5.1 Documents
      • 5.1.1 Executable Definitions
    • 5.2 Operations
      • 5.2.1 Named Operation Definitions
        • 5.2.1.1 Operation Name Uniqueness
      • 5.2.2 Anonymous Operation Definitions
        • 5.2.2.1 Lone Anonymous Operation
      • 5.2.3 Subscription Operation Definitions
        • 5.2.3.1 Single root field
    • 5.3 Fields
      • 5.3.1 Field Selections on Objects, Interfaces, and Unions Types
      • 5.3.2 Field Selection Merging
      • 5.3.3 Leaf Field Selections
    • 5.4 Arguments
      • 5.4.1 Argument Names
      • 5.4.2 Argument Uniqueness
        • 5.4.2.1 Required Arguments
    • 5.5 Fragments
      • 5.5.1 Fragment Declarations
        • 5.5.1.1 Fragment Name Uniqueness
        • 5.5.1.2 Fragment Spread Type Existence
        • 5.5.1.3 Fragments On Composite Types
        • 5.5.1.4 Fragments Must Be Used
      • 5.5.2 Fragment Spreads
        • 5.5.2.1 Fragment spread target defined
        • 5.5.2.2 Fragment spreads must not form cycles
        • 5.5.2.3 Fragment spread is possible
          • 5.5.2.3.1 Object Spreads In Object Scope
          • 5.5.2.3.2 Abstract Spreads in Object Scope
          • 5.5.2.3.3 Object Spreads In Abstract Scope
          • 5.5.2.3.4 Abstract Spreads in Abstract Scope
    • 5.6 Values
      • 5.6.1 Values of Correct Type
      • 5.6.2 Input Object Field Names
      • 5.6.3 Input Object Field Uniqueness
      • 5.6.4 Input Object Required Fields
    • 5.7 Directives
      • 5.7.1 Directives Are Defined
      • 5.7.2 Directives Are In Valid Locations
      • 5.7.3 Directives Are Unique Per Location
    • 5.8 Variables
      • 5.8.1 Variable Uniqueness
      • 5.8.2 Variables Are Input Types
      • 5.8.3 All Variable Uses Defined
      • 5.8.4 All Variables Used
      • 5.8.5 All Variable Usages are Allowed
  • 6 Execution
    • 6.1 Executing Requests
      • 6.1.1 Validating Requests
      • 6.1.2 Coercing Variable Values
    • 6.2 Executing Operations
      • 6.2.1 Query
      • 6.2.2 Mutation
      • 6.2.3 Subscription
        • 6.2.3.1 Source Stream
        • 6.2.3.2 Response Stream
        • 6.2.3.3 Unsubscribe
    • 6.3 Executing Selection Sets
      • 6.3.1 Normal and Serial Execution
      • 6.3.2 Field Collection
    • 6.4 Executing Fields
      • 6.4.1 Coercing Field Arguments
      • 6.4.2 Value Resolution
      • 6.4.3 Value Completion
      • 6.4.4 Errors and Non-Nullability
  • 7 Response
    • 7.1 Response Format
      • 7.1.1 Data
      • 7.1.2 Errors
    • 7.2 Serialization Format
      • 7.2.1 JSON Serialization
      • 7.2.2 Serialized Map Ordering
