
GraphQL Schema Stitching for Ruby

License: MIT License


graphql-stitching-ruby's Introduction

GraphQL Stitching for Ruby

GraphQL stitching composes a single schema from multiple underlying GraphQL resources, then smartly proxies portions of incoming requests to their respective locations in dependency order and returns the merged results. This allows an entire location graph to be queried through one combined GraphQL surface area.

Stitched graph

Supports:

  • All operation types: query, mutation, and subscription.
  • Merged object and abstract types.
  • Shared objects, fields, enums, and inputs across locations.
  • Multiple and composite type keys.
  • Combining local and remote schemas.
  • File uploads via multipart form spec.
  • Tested with all minor versions of graphql-ruby.

NOT Supported:

  • Computed fields (ie: federation-style @requires).
  • Defer/stream.

This Ruby implementation is a sibling to GraphQL Tools (JS) and Bramble (Go), and its capabilities fall somewhere in between them. GraphQL stitching is similar in concept to Apollo Federation, though more generic. The opportunity here is for a Ruby application to stitch its local schemas together or onto remote sources without requiring an additional proxy service running in another language. If your goal is to build a purely high-throughput federated reverse proxy, consider not using Ruby.

Getting started

Add to your Gemfile:

gem "graphql-stitching"

Run bundle install, then require unless running an autoloading framework (Rails, etc):

require "graphql/stitching"

Usage

The quickest way to start is to use the provided Client component that wraps a stitched graph in an executable workflow (with optional query plan caching hooks):

movies_schema = <<~GRAPHQL
  type Movie { id: ID! name: String! }
  type Query { movie(id: ID!): Movie }
GRAPHQL

showtimes_schema = <<~GRAPHQL
  type Showtime { id: ID! time: String! }
  type Query { showtime(id: ID!): Showtime }
GRAPHQL

client = GraphQL::Stitching::Client.new(locations: {
  movies: {
    schema: GraphQL::Schema.from_definition(movies_schema),
    executable: GraphQL::Stitching::HttpExecutable.new(url: "http://localhost:3000"),
  },
  showtimes: {
    schema: GraphQL::Schema.from_definition(showtimes_schema),
    executable: GraphQL::Stitching::HttpExecutable.new(url: "http://localhost:3001"),
  },
  my_local: {
    schema: MyLocal::GraphQL::Schema,
  },
})

result = client.execute(
  query: "query FetchFromAll($movieId:ID!, $showtimeId:ID!){
    movie(id:$movieId) { name }
    showtime(id:$showtimeId) { time }
    myLocalField
  }",
  variables: { "movieId" => "1", "showtimeId" => "2" },
  operation_name: "FetchFromAll"
)

Schemas provided in location settings may be class-based schemas with local resolvers (locally-executable schemas), or schemas built from SDL strings (parsed with GraphQL::Schema.from_definition) and mapped to remote locations via executables.

While the Client constructor is an easy quick start, the library also has several discrete components that can be assembled into custom workflows:

  • Composer - merges and validates many schemas into one supergraph.
  • Supergraph - manages the combined schema, location routing maps, and executable resources. Can be exported, cached, and rehydrated.
  • Request - manages the lifecycle of a stitched GraphQL request.
  • HttpExecutable - proxies requests to remotes with multipart file upload support.

Merged types

Object and Interface types may exist with different fields in different graph locations, and will get merged together in the combined schema.

Merging types

To facilitate this merging of types, stitching must know how to cross-reference and fetch each variant of a type from its source location using resolver queries. For those in an Apollo ecosystem, there's also limited support for merging types through a federation _entities protocol.

Merged type resolver queries

Types merge through resolver queries identified by a @stitch directive:

directive @stitch(key: String!, arguments: String) repeatable on FIELD_DEFINITION

This directive (or static configuration) is applied to root queries where a merged type may be accessed in each location, and a key argument specifies a field needed from other locations to be used as a query argument.

products_schema = <<~GRAPHQL
  directive @stitch(key: String!, arguments: String) repeatable on FIELD_DEFINITION

  type Product {
    id: ID!
    name: String!
  }

  type Query {
    product(id: ID!): Product @stitch(key: "id")
  }
GRAPHQL

catalog_schema = <<~GRAPHQL
  directive @stitch(key: String!, arguments: String) repeatable on FIELD_DEFINITION

  type Product {
    id: ID!
    price: Float!
  }

  type Query {
    products(ids: [ID!]!): [Product]! @stitch(key: "id")
  }
GRAPHQL

client = GraphQL::Stitching::Client.new(locations: {
  products: {
    schema: GraphQL::Schema.from_definition(products_schema),
    executable: GraphQL::Stitching::HttpExecutable.new(url: "http://localhost:3001"),
  },
  catalog: {
    schema: GraphQL::Schema.from_definition(catalog_schema),
    executable: GraphQL::Stitching::HttpExecutable.new(url: "http://localhost:3002"),
  },
})

Focusing on the @stitch directive usage:

type Product {
  id: ID!
  name: String!
}
type Query {
  product(id: ID!): Product @stitch(key: "id")
}
  • The @stitch directive is applied to a root query where the merged type may be accessed. The merged type identity is inferred from the field return.
  • The key: "id" parameter indicates that an { id } must be selected from prior locations so it may be submitted as an argument to this query. The query argument used to send the key is inferred when possible (more on arguments later).

Each location that provides a unique variant of a type must provide at least one resolver query for the type. The exceptions to this requirement are outbound-only types and foreign-key types that contain no exclusive data:

type Product {
  id: ID!
}

The above representation of a Product type contains nothing but a key that is available in other locations. Therefore, this representation will never require an inbound request to fetch it, and its resolver query may be omitted.

List queries

It's okay (even preferable in most circumstances) to provide a list accessor as a resolver query. The only requirement is that the field argument and return type must both be lists, and that query results are returned as a mapped set, with null holding the position of missing results.

type Query {
  products(ids: [ID!]!): [Product]! @stitch(key: "id")
}

# input:  ["1", "2", "3"]
# result: [{ id: "1" }, null, { id: "3" }]

See error handling tips for list queries.
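The mapped-set contract above can be sketched as a plain Ruby resolver (names and data here are hypothetical; a real implementation would do a batched database lookup):

```ruby
# Hypothetical list resolver sketch: return records mapped onto the requested
# ids, preserving input order, with nil holding the position of missing results.
RECORDS_BY_ID = {
  "1" => { "id" => "1" },
  "3" => { "id" => "3" },
}

def products(ids:)
  # In practice this would be one batched lookup, not an in-memory hash.
  ids.map { |id| RECORDS_BY_ID[id] } # nil marks a missing result
end

products(ids: ["1", "2", "3"])
# => [{ "id" => "1" }, nil, { "id" => "3" }]
```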

Abstract queries

It's okay for resolver queries to be implemented through abstract types. An abstract query will provide access to all of its possible types by default, each of which must implement the key.

interface Node {
  id: ID!
}
type Product implements Node {
  id: ID!
  name: String!
}
type Query {
  nodes(ids: [ID!]!): [Node]! @stitch(key: "id")
}

To customize which types an abstract query provides and their respective keys, you may extend the @stitch directive with a typeName constraint. This can be repeated to select multiple types.

directive @stitch(key: String!, arguments: String, typeName: String) repeatable on FIELD_DEFINITION

type Product { sku: ID! }
type Order { id: ID! }
type Customer { id: ID! } # << not stitched
union Entity = Product | Order | Customer

type Query {
  entity(key: ID!): Entity
    @stitch(key: "sku", typeName: "Product")
    @stitch(key: "id", typeName: "Order")
}

Argument shapes

Stitching infers which argument to use for queries with a single argument, or when the key name matches its intended argument. For custom mappings, the arguments option may specify a template of GraphQL arguments that insert key selections:

type Product {
  id: ID!
}
type Query {
  product(byId: ID, bySku: ID): Product
    @stitch(key: "id", arguments: "byId: $.id")
}

Key insertions are prefixed by $ and specify a dot-notation path to any selections made by the resolver key, or __typename. This syntax allows sending multiple arguments that intermix stitching keys with complex input shapes and other static values:

type Product {
  id: ID!
}
union Entity = Product
input EntityKey {
  id: ID!
  type: String!
}
enum EntitySource {
  DATABASE
  CACHE
}

type Query {
  entities(keys: [EntityKey!]!, source: EntitySource = DATABASE): [Entity]!
    @stitch(key: "id", arguments: "keys: { id: $.id, type: $.__typename }, source: CACHE")
}

See resolver arguments for full documentation on shaping input.
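To illustrate how a `$`-prefixed dot-notation path maps onto a key selection, here is a toy sketch (this is not the library's internal implementation, just the concept):

```ruby
# Illustrative only: resolve a dot-notation path (the part after "$.")
# against a hash of fields selected by the resolver key.
def resolve_key_path(key_selection, path)
  path.split(".").reduce(key_selection) { |obj, segment| obj && obj[segment] }
end

key = { "id" => "1", "__typename" => "Product", "owner" => { "id" => "7" } }
resolve_key_path(key, "id")         # => "1"
resolve_key_path(key, "__typename") # => "Product"
resolve_key_path(key, "owner.id")   # => "7"
```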

Composite type keys

Resolver keys may make composite selections for multiple key fields and/or nested scopes, for example:

interface FieldOwner {
  id: ID!
  type: String!
}
type CustomField {
  owner: FieldOwner!
  key: String!
  value: String
}
input CustomFieldLookup {
  ownerId: ID!
  ownerType: String!
  key: String!
}

type Query {
  customFields(lookups: [CustomFieldLookup!]!): [CustomField]! @stitch(
    key: "owner { id type } key",
    arguments: "lookups: { ownerId: $.owner.id, ownerType: $.owner.type, key: $.key }"
  )
}

Note that composite key selections may not be distributed across locations. The complete selection criteria must be available in each location that provides the key.

Multiple type keys

A type may exist in multiple locations across the graph using different keys, for example:

type Product { id:ID! }          # storefronts location
type Product { id:ID! sku:ID! }  # products location
type Product { sku:ID! }         # catalog location

In the above graph, the storefronts and catalog locations have different keys that join through an intermediary. This pattern is perfectly valid and resolvable as long as the intermediary provides resolver queries for each possible key:

type Product {
  id: ID!
  sku: ID!
}
type Query {
  productById(id: ID!): Product @stitch(key: "id")
  productBySku(sku: ID!): Product @stitch(key: "sku")
}

The @stitch directive is also repeatable, allowing a single query to associate with multiple keys:

type Product {
  id: ID!
  sku: ID!
}
type Query {
  product(id: ID, sku: ID): Product @stitch(key: "id") @stitch(key: "sku")
}

Class-based schemas

The @stitch directive can be added to class-based schemas with a directive class:

class StitchingResolver < GraphQL::Schema::Directive
  graphql_name "stitch"
  locations FIELD_DEFINITION
  repeatable true
  argument :key, String, required: true
  argument :arguments, String, required: false
end

class Query < GraphQL::Schema::Object
  field :product, Product, null: false do
    directive StitchingResolver, key: "id"
    argument :id, ID, required: true
  end
end

The @stitch directive can be exported from a class-based schema to an SDL string by calling schema.to_definition.

SDL-based schemas

A clean schema may also have stitching directives applied via static configuration by passing a stitch array in location settings:

sdl_string = <<~GRAPHQL
  type Product {
    id: ID!
    sku: ID!
  }
  type Query {
    productById(id: ID!): Product
    productBySku(sku: ID!): Product
  }
GRAPHQL

supergraph = GraphQL::Stitching::Composer.new.perform({
  products:  {
    schema: GraphQL::Schema.from_definition(sdl_string),
    executable: ->() { ... },
    stitch: [
      { field_name: "productById", key: "id" },
      { field_name: "productBySku", key: "sku", arguments: "mySku: $.sku" },
    ]
  },
  # ...
})

Custom directive names

The library is configured to use a @stitch directive by default. You may customize this by setting a new name during initialization:

GraphQL::Stitching.stitch_directive = "resolver"
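Assuming the rename above, SDL annotations would then use the new directive name in place of @stitch, for example:

```graphql
directive @resolver(key: String!, arguments: String) repeatable on FIELD_DEFINITION

type Query {
  product(id: ID!): Product @resolver(key: "id")
}
```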

Executables

An executable resource performs location-specific GraphQL requests. Executables may be GraphQL::Schema classes, or any object that responds to .call(request, source, variables) and returns a raw GraphQL response:

class MyExecutable
  def call(request, source, variables)
    # process a GraphQL request...
    return {
      "data" => { ... },
      "errors" => [ ... ],
    }
  end
end

A Supergraph is composed with executable resources provided for each location. Any location that omits the executable option will use the provided schema as its default executable:

supergraph = GraphQL::Stitching::Composer.new.perform({
  first: {
    schema: FirstSchema,
    # executable:^^^^^^ delegates to FirstSchema,
  },
  second: {
    schema: SecondSchema,
    executable: GraphQL::Stitching::HttpExecutable.new(url: "http://localhost:3001", headers: { ... }),
  },
  third: {
    schema: ThirdSchema,
    executable: MyExecutable.new,
  },
  fourth: {
    schema: FourthSchema,
    executable: ->(req, query, vars) { ... },
  },
})

The GraphQL::Stitching::HttpExecutable class is provided as a simple executable wrapper around Net::HTTP.post with file upload support. You should build your own executables to leverage your existing libraries and to add instrumentation. Note that you must manually assign all executables to a Supergraph when rehydrating it from cache (see docs).
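A custom executable with timing instrumentation might look like the following sketch (the class name and the on_timing hook are assumptions, not part of the library; any object responding to call(request, source, variables) and returning a raw GraphQL response hash will do):

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical custom executable: posts a standard GraphQL-over-HTTP request
# and reports request duration to a caller-provided instrumentation hook.
class InstrumentedHttpExecutable
  def initialize(url:, headers: {}, on_timing: nil)
    @url = URI(url)
    @headers = { "Content-Type" => "application/json" }.merge(headers)
    @on_timing = on_timing # e.g. ->(secs) { StatsD.timing("graphql", secs) }
  end

  # Builds the conventional GraphQL-over-HTTP request body.
  def build_body(source, variables)
    JSON.generate("query" => source, "variables" => variables)
  end

  def call(request, source, variables)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    response = Net::HTTP.post(@url, build_body(source, variables), @headers)
    JSON.parse(response.body)
  ensure
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    @on_timing&.call(elapsed)
  end
end
```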

Batching

The stitching executor automatically batches subgraph requests so that only one request is made per location per generation of data. This is done using batched queries that combine all data access for a given location. For example:

query MyOperation_2($_0_key:[ID!]!, $_1_0_key:ID!, $_1_1_key:ID!, $_1_2_key:ID!) {
  _0_result: widgets(ids: $_0_key) { ... } # << 3 Widget
  _1_0_result: sprocket(id: $_1_0_key) { ... } # << 1 Sprocket
  _1_1_result: sprocket(id: $_1_1_key) { ... } # << 1 Sprocket
  _1_2_result: sprocket(id: $_1_2_key) { ... } # << 1 Sprocket
}

Tips:

  • List queries (like the widgets selection above) are generally preferable as resolver queries because they keep the batched document consistent regardless of set size, and make for smaller documents that parse and validate faster.
  • Ensure that root field resolvers across your subgraphs implement batching to anticipate cases like the three sprocket selections above.

Otherwise, there's no developer intervention necessary (or generally possible) to improve upon data access. Note that multiple generations of data may still force the executor to return to a previous location for more data.

Concurrency

The Executor component builds atop the Ruby fiber-based implementation of GraphQL::Dataloader. Non-blocking concurrency requires setting a fiber scheduler via Fiber.set_scheduler, see graphql-ruby docs. You may also need to build your own remote clients using corresponding HTTP libraries.

Additional topics

Examples

This repo includes working examples of stitched schemas running across small Rack servers. Clone the repo, cd into each example and try running it following its README instructions.

Tests

bundle install
bundle exec rake test [TEST=path/to/test.rb]

graphql-stitching-ruby's People

Contributors

drich10, gmac, mikeharty, rsperko, thomasmarshall


graphql-stitching-ruby's Issues

Arguments are dropped during composition

From @mikeharty:

# Getting double args sometimes... why?
return if owner.arguments.any? { _1.first == argument_name }

This line causes arguments to be dropped entirely - I haven't dug into the root of why the double args, but if I change that return to be next, my arguments stop being dropped.

I'm testing this against a reasonably substantial (~400 distinct types) schema and with these changes am generating identical schema to without stitching.

Issue mapping Enum values to keys in Composer.build_enum_type

Hello,

First, thanks for writing this, the implementation is very clean and easy to walk through.

I have a case where I'm upgrading a pretty stale version of GraphQL and I've run into an issue with Enums. The project makes use of the value property on Enums to translate between GraphQL Enum "labels" and Ruby values, e.g.:

Enum.value('UNSPECIFIED', 'Unspecified value', value: 'none') will be UNSPECIFIED in the Schema, but if passed as a query argument, will appear as "none" in Ruby.

The issue comes up during introspection, when Composer.build_enum_type attempts to build the enum types.

On this line: https://github.com/gmac/graphql-stitching-ruby/blob/main/lib/graphql/stitching/composer.rb#L238
it is constructing a new EnumValue via EnumValue.value, but it passes the value as the first argument, rather than the graphql_name.

In my case, this causes two issues:

  1. Some of my Enum values are not valid EnumValue names, they are Ruby primitives or don't meet the naming validation rules
  2. Those that do pass validation are still detached from their original EnumValue which links the name and value together.

I experimented pretty thoroughly with different approaches to solve this, I was hoping I could achieve it by implementing a custom enum_value_class, but the necessary values aren't passed down. For now, I've resorted to monkey-patching build_enum_type, which gets the job done for my narrow case, but there's likely a better general solution. For my Schema, I don't have overlapping types, so I can reliably pick the first location an EnumValue was seen and pull the graphql_name off of that, which I've done by simply:

enum_values_by_value_location.each do |value, enum_values_by_location|
  # Getting the first location
  location = enum_values_by_location.keys.first
  # Getting the GraphQL name off of it, or falling back to original behavior
  graphql_name = enum_values_by_location[location]&.graphql_name || value
  enum_value = value(graphql_name,
               value: value,
               description: builder.merge_descriptions(type_name, enum_values_by_location, enum_value: value),
               deprecation_reason: builder.merge_deprecations(type_name, enum_values_by_location, enum_value: value))

Any tips or thoughts appreciated, happy to offer any additional info as needed.

Support schema visibility controls

GraphQL Ruby supports visibility controls for selectively hiding parts of a schema from view. Stitching should be able to piggy-back on the GraphQL Ruby implementation of the feature to allow portions of the combined schema to be hidden. Visibility controls would make sense as directives structured similar to Apollo authorizations.

Support composite keys/inputs

It would add practical value if composite key selections were allowed. Composite keys would require an additional argument mapping to express how the composite selections map into query arguments:

input WidgetKey {
  group: String!
  name: String!
}

widgets(keys: [WidgetKey!]!, other: String): [Widget]! @stitch(
  key: "scope { group name }",
  arguments: "keys: {group: $scope.group, name: $scope.name}, other: 'Sfoo'"
)

The arguments param is parsed as a GraphQL inner-arguments literal. Then, paths from the key are inserted into the literal as a namespaced path prefixed by "$", ie: $scope.group. Some scoping rules would apply, as a repeatable key field can only be inserted into a repeatable argument scope.

Arguments

type Widget {
  scope: String!
  name: String!
}

type Query {
  widget1(scope: String!, name: String!): Widget @stitch(
    key: "scope name",
    arguments: "scope: $scope, name: $name"
  )
  widget2(s: String!, n: String!): Widget @stitch(
    key: "scope name",
    arguments: "s: $scope, n: $name"
  )
}

Input objects

type Widget {
  scope: String!
  name: String!
}

input WidgetKey {
  scope: String!
  name: String!
}
input WidgetKey2 {
  s: String!
  n: String!
}
type Query {
  widgets(keys: [WidgetKey!]!): [Widget]! @stitch(
    key: "scope name", 
    arguments: "keys: {scope: $scope, name: $name}"
  )

  widget1(key: WidgetKey!): Widget @stitch(
    key: "scope name", 
    arguments: "key: {scope: $scope, name: $name}"
  )
  widget2(key: WidgetKey2!): Widget @stitch(
    key: "scope name", 
    arguments: "key: {s: $scope, n: $name}"
  )
  
  widget1(key: WidgetKey, other: String): Widget @stitch(
    key: "scope name", 
    arguments: "key: {scope: $scope, name: $name}, other: 'Sfoo'"
  )
  widget2(key: WidgetKey2, other: String): Widget @stitch(
    key: "scope name", 
    arguments: "key: {s: $scope, n: $name}, other: 'Sfoo'"
  )
}

Nested selections

type WidgetScope {
  group: String!
  name: String!
}
type Widget {
  scope: WidgetScope
  title: String
}

input WidgetKey {
  group: String!
  key: String!
}
type Query {
  widgets(keys: [WidgetKey!]!, other: String): [Widget]! @stitch(
    key: "scope { group name }",
    arguments: "keys: {group: $scope.group, name: $scope.name}, other: 'Sfoo'"
  )
}

Entity representations (Apollo Federation protocol)

type WidgetScope {
  group: String!
  name: String!
}
type Widget {
  scope: WidgetScope
  title: String
}

union _Entity = Widget
scalar _Any
type Query {
  # sends keys as JSON blobs:
  # [{"group": "a", name: "b", "__typename": "Widget"}, ...]
  _entities(representations: [_Any!]!): [_Entity]! @stitch(
    key: "scope { group name } __typename", 
    arguments: "representations: { group: $scope.group, name: $scope.name, __typename: $__typename }",
  )
}

Simple parser:

class ArgumentsParser
  class << self
    # "reps: {group: $scope.group, name: $scope.name}, other: 'Sfoo'"
    def parse(template)
      template = template.gsub("'", '"').gsub(/(\$[\w\.]+)/) { %|"#{_1}"| }
      GraphQL.parse("{ f(#{template}) }")
        .definitions.first
        .selections.first
        .arguments
    end
  end
end

Support field-level authorization

It'd be nice to formally support field-level authorization through the query planner, similar to other federation libraries. A few specs:

  • Unauthorized fields are simply filtered out of the request by default.
  • A setting opts requests with unauthorized fields into returning immediately with an error.

It looks like @mikeharty has been doing some auth work in his custom executor. Mike – any chance you could elaborate here with more on how the feature could/should work with what you're already doing?

Composer directives for omitting elements

Similar to other composer libraries, there should be some controls for prioritizing and hiding elements in a schema. Specifically:

  • @override: prioritize a field from a given location. This can replace the current root_field_location_selector proc. This might make more sense as a @priority directive that can assign a numeric priority to fields. Priority determines the location’s rank in the field’s delegation set (the planner searches field locations from first-to-last, so the first location has priority). It might also be nice to prioritize fields as “-1” to omit them from the delegation set entirely so they are never considered (the planner will still pick non-priority locations when it allows making fewer requests).

  • @inaccessible: eliminates a field or argument from the combined schema. The element is omitted from the supergraph when any subschema marks it as inaccessible. This feature has a few implications:

    • This can lead to empty object scopes (object types with zero fields), which should not be allowed. Raise composer error if a type resolves with zero fields.
    • This can also lead to unreachable types left in the schema. We’d need to traverse from root query and mutation types before building the supergraph and prune any types without field/argument usage.
    • Inaccessible elements should still be known to resolver queries, which may operate with subgraph information beyond the scope of the public supergraph. This extended criteria should only ever be selected through exports, so should be compatible with the current Shaper, but tests should confirm that.

Argument defaults are ignored during composition

From @mikeharty:

I'm not totally clear on whether "default_value" is officially supported in the GraphQL gem, but we are using it and it does work. It appears to be left out of the implementation here: https://github.com/gmac/graphql-stitching-ruby/blob/main/lib/graphql/stitching/composer.rb#L369

I've worked around this adding a merge_default_values function into the monkey patch I mentioned above:

def merge_default_values(type_name, members_by_location, argument_name: nil, field_name: nil)
  default_values = members_by_location.values.map(&:default_value)
  
  return nil if default_values.any?(&:nil?)
  
  if default_values.uniq.length != 1
    path = [type_name, field_name, argument_name].compact.join('.')
    raise ComposerError, "Default values for `#{path}` must be the same."
  end
  
  default_values.first
end

and I updated the code at the point I linked to conditionally add the default_value via kwargs if it has a value, so that it doesn't default fields to nulls that previously had no default (caused validation issues otherwise)

type = merge_value_types(type_name, value_types, argument_name: argument_name, field_name: field_name)
default_value = merge_default_values(type_name, arguments_by_location, argument_name: argument_name, field_name: field_name)
kwargs = {}
kwargs[:default_value] = default_value unless default_value.nil? && type.non_null?
schema_argument = owner.argument(
  argument_name,
  description: merge_descriptions(type_name, arguments_by_location, argument_name: argument_name, 
                                                                    field_name: field_name),
  deprecation_reason: merge_deprecations(type_name, arguments_by_location, argument_name: argument_name, 
                                                                           field_name: field_name),
  type: GraphQL::Stitching::Util.unwrap_non_null(type),
  required: type.non_null?,
  camelize: false,
  **kwargs
)

Support foreign key → relation transform

GraphQL foreign keys are commonly handled as Product.imageId is here:

# -- Products schema:

type Product {
  id: ID!
  imageId: ID!
}

# -- Images schema:

type Image {
  id: ID!
  url: String!
}

However, stitching wants this schema to be shaped as:

# -- Products schema:

type Product {
  id: ID!
  image: Image!
}

type Image {
  id: ID!
}

# -- Images schema:

type Image {
  id: ID!
  url: String!
}

Rather than forcing services to be reshaped (assuming we even have ownership and are able to), it would be nice if stitching would handle the transformation of key fields into typed relations, such as:

# -- Products schema:

type Product {
  id: ID!
  imageId: ID! @relation(fieldName: "image", typeName: "Image", foreignKey: "id")
  # --> image: Image! // type Image { id: ID! }
}

# -- Images schema:

type Image {
  id: ID!
  url: String!
}

All hashes should use string keys

The library does some work using hashes... this is because we want to facilitate serialization and deserialization of critical data structures (delegation maps and query plans). However, right now some parts use string keys and other parts use symbol keys... this is at odds with running JSON.parse on a cached structure and being ready to go.

While symbol keys look nicer, delegation maps have to use all string keys for name matching. It probably makes the most sense to use string keys everywhere. We could also potentially wrap some common structures like boundaries in Structs, but that just adds a step and more object creation, so I'm not sure it's worth it.
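The round-trip problem is easy to demonstrate with the standard library alone:

```ruby
require "json"

# Symbol keys are lost through JSON serialization: a structure cached as JSON
# comes back with string keys, so symbol lookups silently return nil.
plan = { step: 1, location: "products" }   # symbol keys in memory
restored = JSON.parse(JSON.generate(plan)) # => { "step" => 1, "location" => "products" }

plan[:location]      # => "products"
restored[:location]  # => nil — symbol lookup misses after the round trip
restored["location"] # => "products"
```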

Feat: hoist inlined inputs

Summary

Input values embedded into a GraphQL document make that document unique and prevent it from hashing consistently when looking for a cached query plan:

query { 
  product(id: "1") { name }
}

A nice add-on would be a utility that traverses a request document and extracts input literals and hoists them up to document variables. Then no matter how the request was submitted, it will be subject to plan caching with a normalized body:

query($_hoist_0: ID!){ 
  product(id: $_hoist_0) { name }
}

# variables { "_hoist_0": "1" }

This normalization would be appropriate to happen in the Document object.

n+1 / batching

Hello!

I'm considering using this gem for a graphql-ruby / packwerk project. The idea of having in-process graphql stitching is compelling. I'm curious if you have thoughts about how I could avoid n+1 problems. I don't have a great understanding about how gateways / schema stitching tools handle this in general.

Need async executor execution

Summary

The Executor currently runs request executions synchronously.

def exec!
  # @todo make this async
  next_ops = @queue.select { _1[:after_key].nil? }

  while next_ops.any?
    next_ops.each do |op|
      # Each of these "next operations" should be run in parallel...
      # Also, each individual completion should trigger a next round looking for new operations
      # (we do NOT need to await all operations in this round before looking for followups)
      @status[op[:key]] = perform_operation(op)
    end

    next_ops = @queue.select do |op|
      after_key = op[:after_key]
      after_key && @status[after_key] == :completed && @status[op[:key]].nil?
    end
  end
end

We need to explore async options for running batches of requests concurrently. We'd ideally align with GraphQL Ruby's async implementation to avoid new dependencies, or match GraphQL Batch. Things to look at:

  • GraphQL Ruby dataloader docs
  • The GraphQL Interpreter, which is what all GraphQL::Schema.execute calls go to. All requests are multiplexed (single execution is just a wrapper for a multiplex of one)... and multiplexing uses GraphQL's built-in dataloader.
  • Might want to talk to @swalkinshaw about how GraphQL Batch gets its async event reactor. Following GraphQL Batch would be a good secondary approach.

Consider renaming `@boundary` to `@stitch`

The @boundary directive is terminology borrowed from Bramble, and while I like it, it's not necessary to carry it over, and it's potentially confusing given that Bramble's boundary annotations work quite differently.

Support GraphQL v1.13

There's presently one failing test involving a late-bound type when running with GraphQL v1.13. Let's assess and fix/ignore to expand gem compatibility down to GraphQL Ruby v1.13.

Batch all queries to a location per execution frame

Right now, many operations may have the same after_key yet all target the same location on behalf of different insertion paths. This results in several requests made to the same service during the same execution frame. We should expand batching to write a single query for all of the different operations being delegated during a given frame.

Need a Gateway component

Summary

The library is organized around composable pieces:

Composer -> Supergraph -> Planner -> Executor -> Shaper

These are intentionally discrete so that parts and pieces of the stitching workflow can be mixed and matched (ie: precompose and caching of supergraph, caching and restoration of query plans, etc). However, this makes the library difficult to use quickly out of the box. We need a Gateway component that rolls up a boilerplate workflow of all the parts and pieces into one unit.

Desired API

Should be easy to build a stitched gateway and use it to execute requests:

gateway = GraphQL::Stitching::Gateway.new({
  products: {
    schema: GraphQL::Schema.from_definition(movies_schema),
    client: GraphQL::Stitching::RemoteClient.new(url: "http://localhost:3000"),
  },
  showtimes: {
    schema: GraphQL::Schema.from_definition(showtimes_schema),
    client: GraphQL::Stitching::RemoteClient.new(url: "http://localhost:3001"),
  },
  local: {
    schema: MyLocal::GraphQL::Schema
  },
})

gateway.cache_read do |key|
  $redis.get(key) # << 3P code
end

gateway.cache_write do |key, payload|
  $redis.set(key, payload) # << 3P code
end

result = gateway.execute(
  # Same basic arguments as GraphQL Ruby (https://graphql-ruby.org/queries/executing_queries)
  query: ,
  document: ,
  variables: ,
)

Steps

This general workflow is mostly laid out in example/gateway.rb. Use that for reference...

  1. Run Composer during gateway initialization – make sure all location names are input as strings.
  2. Also during initialization, add provided clients to the composed Supergraph. A client is anything that responds to .call(). We should consolidate Supergraph's assign_location_url and assign_location_handler into a single assign_location_client method that accepts any callable object.
  3. The Gateway cache_read and cache_write methods should stash the provided procs as instance variables. These are not required to be set.
  4. Add an execute method, the signature should be a subset of the GraphQL::Schema.execute method. When invoking execute:
  • Generate a Document (stitching lib) from the provided document or query
  • Validate the document AST against the Supergraph schema. Format and return any validation errors.
  • Bonus: add a variable hoisting routine to normalize the document.
  • If there's a cache reader, then generate a document SHA and request it from the cache accessor. Parse any returned results (currently requires JSON to parse with symbolized keys). If we have a cached plan, we can skip the next step.
  • Plan the submitted request document unless a cached plan was found.
    • If there's a cache writer, then provide the generated plan and the request SHA to the cache writer.
  • Execute the plan with provided query variables.
  • Pass the raw execution result to the Shaper (in progress).
  5. Needs tests.

Extract batched stitching ids into request variables

At present, executor batching inlines all stitching IDs into their resolver queries:

query MyOperation_2 {
  _0_result: widgets(ids:["a","b","c"]) { ... }
  _1_0_result: sprocket(id:"x") { ... }
  _1_1_result: sprocket(id:"y") { ... }
}

This is not ideal because it creates high request cardinality, which may defeat some backend caches. Requests would generally be better kept consistent where possible, with keys submitted as request variables:

query MyOperation_2($_0_key: [ID!]!, $_1_0_key: ID!, $_1_1_key: ID!) {
  _0_result: widgets(ids: $_0_key) { ... }
  _1_0_result: sprocket(id: $_1_0_key) { ... }
  _1_1_result: sprocket(id: $_1_1_key) { ... }
}

# variables: { "_0_key": ["a","b","c"], "_1_0_key": "x", "_1_1_key": "y" }

Need conditional type checks during execution

Summary

Requests are statically planned up front, which means that fragment selections may generate operations forking from types that are not actually resolved. We need to perform a type check after each execution and (recursively?) eliminate child operations that don't actually apply to the resolved type.
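For example, a plan built from a request like this (hypothetical schema) includes a child operation for each fragment fork, but only the fork matching the resolved concrete type should execute:

```graphql
query {
  node(id: "1") {
    ... on Product { price }  # should only run when node resolves to a Product
    ... on Order { total }    # must be eliminated when node resolves to a Product
  }
}
```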

Root fields with fragments do not export typename

I discovered that when the data for a corresponding type cannot be retrieved from an external server (which returns null), the results can vary based on how fragments are used, even when the queries are semantically equivalent.

In my case, access to the corresponding type may be restricted due to user permissions, leading to its unavailability from the external server.

I have created a sample to reproduce the issue and would like to share it with you.

Schema Definition

ServiceA

  • Client-facing server
  • Manages ParentResource
type Query {
  parentResource(id: ID!): ParentResource

  subResourcesByIds(ids: [ID!]!): [SubResource]!
}

type ParentResource {
  id: ID!
  subResource: SubResource
}

type SubResource {
  id: ID!
}

ServiceB

  • Server behind ServiceA
  • Manages SubResource
  • Viewing SubResource may not be permitted depending on the user
  • Therefore, subResourcesByIds may return null elements
type Query {
  subResourcesByIds(ids: [ID!]!): [SubResource]!
}

type SubResource {
  id: ID!
  serviceBField: String!
}

Query Results

In the following examples, ServiceB returns { "data": { "_0_result": [null] } }

Query that Returns the Expected Result

Query

query {
  parentResource(id: "ParentResource:1000") {
    id
    subResource {
      ...SubResourceFragment
    }
  }
}

fragment SubResourceFragment on SubResource {
  id
  serviceBField
}

Result

{
  "data": {
    "parentResource": {
      "id": "ParentResource:1000",
      "subResource": null
    }
  }
}

Query that Returns the Unexpected Result

Query

query {
  parentResource(id: "ParentResource:1000") {
    ...ParentResourceFragment
  }
}

fragment ParentResourceFragment on ParentResource {
  id
  subResource {
    id
    serviceBField
  }
}

Result

{
  "data": {
    "parentResource": {
      "id": "ParentResource:1000",
      // `serviceBField` is defined as non-nullable, but the field is not being returned
      // This makes the response violate the Schema. I expect `subResource` to be returned as null.
      "subResource": {
        "id": "SubResource:2000",
        "_export_id": "SubResource:2000",
        "_export___typename": "SubResource"
      }
    }
  }
}

I hope this information is helpful in addressing the issue.

Support for multipart form file uploads?

I recently added support for multipart form file uploads in the project where I've implemented this library. I've been debating whether or not it's something that would make sense to incorporate here.

My implementation has deviated pretty far in this particular area: I implemented a custom HttpExecutable in order to swap the client to RestClient (I needed a proxy) and to add pre/post-request hook points for manipulating queries and responses. This is also where the multipart form handling happens; I'm using apollo_upload_server-ruby, which follows the spec but is a Rails-specific implementation.

I say all of that essentially to say: maybe that's where this belongs, in a custom implementation, if needed. I figured I'd get your take before making that assumption. The thought also crossed my mind that a plugin-style approach might make sense. Let me know if there's any interest here; I'm happy to formally propose a solution if there's an appetite for it. I just wanted to gauge whether this feels messy or out of scope first.

Need Shaper component

Summary

Right now, a raw execution result is returned directly. This raw result has many possible inaccuracies:

  • Might contain stitching keys that were automatically added.
  • Might be missing requested fields rather than providing the requested field with a null value.
  • Needs to apply schema nullability constraints to the resolved payload.

We need a final-pass algorithm that traverses the original request, prunes extra payload fields, adds missing payload fields as null, and then bubbles nullability constraints up through the document tree. Same basic idea as the Apollo resultsShaper or Bramble bubbleUpNullValuesInPlace.

There's a dev branch setup for this work here: https://github.com/gmac/graphql-stitching-ruby/compare/dev_shaper?expand=1

Tests run using:

bundle exec rake test TEST=test/graphql/stitching/shaper_test.rb

Example

The user requested this:

query {
  storefront(id: "1") {
    id
    products {
      upc
      name
      price
      nullableField
    }
  }
}

But the raw execution result looks like this:

{
  "data": {
    "storefront": {
      "id": "1",
      "products": [
        {
          "upc": "1",
          "_STITCH_upc": "1",
          "_STITCH_typename": "Product",
          "name": "iPhone",
          "price": 699.99,
          "nullableField": 1
        },
        {
          "upc": "2",
          "_STITCH_upc": "2",
          "_STITCH_typename": "Product",
          "name": "Apple Watch",
          "price": 399.99
        }
      ]
    }
  }
}

In the above, the user didn't request the stitching keys, so they should be removed. They did request nullableField but a value only came back for one record, so the field is missing from the second and should be added as "nullableField": null. Lastly, we'd need to bubble up errors in place based on schema null constraints.

Variables with boolean value `false` are overwritten in Request.prepare!

In the Request class, the prepare! method applies default values to variables using conditional assignment:

operation.variables.each do |v|
  @variables[v.name] ||= v.default_value
end

Unfortunately, `false` is falsey (obviously), so an explicitly provided `false` is always overwritten with the value in v.default_value (which could be nil or true).

This is critical, I'd think, considering it can flip false boolean values to true if the default is true.

Script to reproduce (run from project root):

require_relative 'lib/graphql/stitching'
require_relative 'lib/graphql/stitching/request'

query = <<~GRAPHQL
  query($a: Boolean, $b: Boolean, $c: Boolean = true) {
    base(a: $a, b: $b, c: $c) { id }
  }
GRAPHQL

variables = { "a" => true, "b" => false, "c" => false }
request = GraphQL::Stitching::Request.new(GraphQL.parse(query), variables: variables)
request.prepare!

puts request.variables
# {"a"=>true, "b"=>nil, "c"=>true}

Fix is simple, I'll open a PR in a moment with the fix + a test.

Plan runtime directives, @skip and @include

At present, runtime directives are completely ignored by the planner. Need to:

  • Extract variables from all runtime directives while planning.
  • Confirm that runtime directives are passed through in subqueries.
  • Make planned operations aware of runtime directives at their root scope, and skip an operation entirely when its directives exclude it.
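For example, in a request like this (hypothetical fields), the $withPrice variable must travel with any subquery containing the directive, and an operation whose entire selection is excluded can be skipped outright:

```graphql
query($withPrice: Boolean!) {
  product(upc: "1") {
    name
    price @include(if: $withPrice)
  }
}
```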

Remote errors are not properly propagated to clients

When one of our locations returns an error, GraphQL::Stitching::Client doesn't propagate that error and instead raises `no implicit conversion of nil into Array (TypeError)` at this line:

@executor.errors.concat(extract_errors!(origin_sets_by_operation, errors)) if errors&.any?

This is because extract_errors!(origin_sets_by_operation, errors) returns nil.

Looking at the implementation,

end
errors_result.flatten!
end

This flatten! is the crux: Array#flatten! returns nil when it makes no modifications, which happens whenever errors_result contains nothing to flatten.
