federation-next's Introduction

Apollo Federation

Apollo Federation is an architecture for declaratively composing APIs into a unified graph. Each team can own their slice of the graph independently, empowering them to deliver autonomously and incrementally.

Federation 2 is an evolution of the original Apollo Federation with an improved shared ownership model, enhanced type merging, and cleaner syntax for a smoother developer experience. It’s backwards compatible, requiring no major changes to your subgraphs.

Check out the Federation 2 docs and demo repo to take it for a spin and let us know what you think!

Usage

TODO

Contributing

TODO

Security

For more info on how to contact the team for security issues, see our Security Policy.

License

Source code in this repository is covered by the Elastic License 2.0. This is the default throughout the repository, unless a file header or a license file in a subdirectory specifies another license. See LICENSE for the full license text.

federation-next's Issues

Compute best fetch dependency graph given the options for each branch (query plan business logic)

This GH issue is the first part of #31.

Specifically:

  • Pruning the options for each branch.
  • Generating the initial partial fetch dependency graph and remaining options.
  • Iterating through all possible fetch dependency graphs from the initial one, and returning the one with best cost.

More specifically, this issue is for porting QueryPlanningTraversal.computeBestPlanFromClosedBranches() and its code dependencies.

bug: ensure consistent order of how we print supergraph/api schemas

Printed SDL should match the order of the JS implementation. By default, JS prints the schema definition first, followed by directives, and closes with types. Directives and types are printed in alphabetical order (meaning a < A).

The current RS implementation attempts to do a similar thing:

    // Clone the schema, then sort types and directive definitions by key so
    // the printed SDL has a deterministic order.
    fn print_sdl(schema: &Schema) -> String {
        let mut schema = schema.clone();
        schema.types.sort_keys();
        schema.directive_definitions.sort_keys();
        schema.to_string()
    }

The difference is that RS compares characters by ASCII value, so A < a. We should avoid any behavioral difference from the JS implementation, to ensure that users can seamlessly upgrade from the JS to the RS implementation without any changes to their supergraph.
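If we decide to match the JS ordering exactly, a minimal sketch of what that comparator could look like (assuming the map keys can be compared as strings; js_style_cmp is a hypothetical name, not existing code):

    use std::cmp::Ordering;
    use indexmap::IndexMap;

    // Hypothetical comparator for the JS-style ordering described above:
    // case-insensitive, with lowercase sorting before uppercase on ties
    // (so that a < A).
    fn js_style_cmp(a: &str, b: &str) -> Ordering {
        a.to_lowercase()
            .cmp(&b.to_lowercase())
            // On a case-insensitive tie, reverse the byte comparison so that
            // lowercase (higher ASCII) sorts first.
            .then_with(|| b.cmp(a))
    }

    fn main() {
        let mut types: IndexMap<String, ()> = ["Query", "author", "Author", "book"]
            .iter()
            .map(|s| (s.to_string(), ()))
            .collect();
        types.sort_by(|k1, _, k2, _| js_style_cmp(k1, k2));
        // Prints: ["author", "Author", "book", "Query"]
        println!("{:?}", types.keys().collect::<Vec<_>>());
    }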

[Operation Processing] Compute normalized keys in operations per-element instead of per-normalized-map

When we normalize apollo-rs operations in #87 and convert them to our normalized representation, we compute a key for each selection in a selection set. In the JS code, this key is computed once and stored in the selection's element itself (i.e. the field/inline fragment/named fragment spread). This allows it to stay the same when that element is copied, such as when a named fragment is expanded, or when selections are copied from a supergraph query into a subgraph query.

However, the Rust code currently computes this when the NormalizedMap is populated during creation. This means that copying the selection (or even moving it around) will change its key. This may have been fine if the key were only used here, but it turns out we check for key equivalence when we try to reuse fragments in subgraph queries. For this code to work, we'll need to do a few things (sketched in code after the lists below):

  1. The different kinds of selections (fields/inline fragments/named fragment spreads) need a field for storing a u64 ID.
  2. When the selection is created (either from the apollo-rs representation, or programmatically via e.g. new()), we set the ID from a global incrementing counter (starting at 1).
  3. Selection types need a key() method, which computes and returns the current NormalizedKey, but the label is 0 if there are no @defer directive applications, and the above ID otherwise.
  4. Cloning should copy this ID. There could optionally be a method to change the ID to a new one taken from the global incrementing counter, if needed (e.g. to change the identity of a @defer usage post-clone).

Note that for this to work, we'll also need to do two other things:

  1. We'll need to use the same normalized representation throughout query planning (we can't inter-convert with the apollo-rs representation).
  2. We'll need to ensure named fragments are converted to a normalized form before they are expanded within an operation, so that @defer'ed selections within them are assigned the same ID (currently apollo-rs named fragments are expanded directly into inline fragments). This depends on #104.
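A minimal sketch of the ID scheme from points 1-4 above (the type and method names here are hypothetical stand-ins, not the actual normalized types):

    use std::sync::atomic::{AtomicU64, Ordering};

    // Global incrementing counter, starting at 1 (0 is reserved for the
    // "no @defer" label).
    static NEXT_SELECTION_ID: AtomicU64 = AtomicU64::new(1);

    fn next_selection_id() -> u64 {
        NEXT_SELECTION_ID.fetch_add(1, Ordering::Relaxed)
    }

    // Stand-in for a normalized selection (field/inline fragment/spread).
    #[derive(Clone)] // Clone preserves the ID, keeping the key stable.
    struct NormalizedField {
        name: String,
        has_defer: bool,
        id: u64,
    }

    impl NormalizedField {
        fn new(name: String, has_defer: bool) -> Self {
            Self { name, has_defer, id: next_selection_id() }
        }

        /// The key's label: 0 when there are no @defer applications,
        /// the ID otherwise.
        fn key_label(&self) -> u64 {
            if self.has_defer { self.id } else { 0 }
        }

        /// Optionally re-identify (e.g. to change a @defer usage's identity
        /// after cloning).
        fn with_new_id(mut self) -> Self {
            self.id = next_selection_id();
            self
        }
    }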

Consider IndexMap + shift_remove instead of LinkedHashMap

src/query_plan/operation.rs contains this comment:

Note that this must be a LinkedHashMap so that removals don't change the order.

Instead of a remove method, IndexMap has both swap_remove and shift_remove. Only the former changes the iteration order of some remaining items. The trade-off is that the latter takes O(n) time (like Vec::remove). This may still be a win over LinkedHashMap, which despite better algorithmic complexity in theory probably has more memory fragmentation. Let’s try it when we can get some performance numbers.
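A quick illustration of the difference:

    use indexmap::IndexMap;

    fn main() {
        let mut map: IndexMap<&str, i32> =
            [("a", 1), ("b", 2), ("c", 3), ("d", 4)].into_iter().collect();

        // swap_remove is O(1) but moves the last entry into the hole:
        let mut swapped = map.clone();
        swapped.swap_remove("b");
        assert_eq!(swapped.keys().copied().collect::<Vec<_>>(), ["a", "d", "c"]);

        // shift_remove is O(n) but preserves the order of remaining entries:
        map.shift_remove("b");
        assert_eq!(map.keys().copied().collect::<Vec<_>>(), ["a", "c", "d"]);
    }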

The linked-hash-map crate is also “in maintenance mode, due to insufficient maintainer resources”.

document release process

This issue might take us some time to iron out, but it'd be good to note down how we release in the future. It should include documentation for how #142 fits into the process.

[operation processing] normalize query planner operation selection sets

apollo-rs represents SelectionSet as a Vec<Selection>, which matches the GraphQL specification. For query-planner purposes we need to normalize those selection sets so we can efficiently generate query plans.

Normalization involves:

  • named fragment expansion
  • deduplication of field selections (see the sketch after this list)
  • removal of top level introspection fields
  • #89
  • #90
  • #91
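As an illustration of the deduplication step, here is a much-simplified sketch (with hypothetical stand-in types) that merges fields sharing a response key:

    use indexmap::map::Entry;
    use indexmap::IndexMap;

    // Stand-in for a normalized field selection.
    #[derive(Clone)]
    struct FieldSelection {
        response_key: String, // alias if present, otherwise the field name
        sub_selections: Vec<FieldSelection>,
    }

    fn dedupe(selections: Vec<FieldSelection>) -> Vec<FieldSelection> {
        let mut merged: IndexMap<String, FieldSelection> = IndexMap::new();
        for sel in selections {
            match merged.entry(sel.response_key.clone()) {
                // Seen this response key before: merge the sub-selections.
                Entry::Occupied(mut entry) => {
                    entry.get_mut().sub_selections.extend(sel.sub_selections)
                }
                Entry::Vacant(entry) => {
                    entry.insert(sel);
                }
            }
        }
        // Recursively dedupe the merged sub-selections.
        merged
            .into_values()
            .map(|mut sel| {
                sel.sub_selections = dedupe(std::mem::take(&mut sel.sub_selections));
                sel
            })
            .collect()
    }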

Plan selection and generation (query plan business logic)

During query planning, we should be able to select and generate the lowest-cost query plan using the options generated during traversal.

This is specifically steps 9-11 of this query-planning summary, i.e.:

  1. Prune the options for each closed branch, to reduce cartesian product blow-up.
  2. Compute fetch dependency graphs, evaluate costs, and determine lowest-cost dependency graph.
  3. Generate query plan from the chosen dependency graph.

Note that if it's possible to cleanly punt on porting feature-specific code for later (e.g. subscriptions), we should take the opportunity to do so, in the interests of delivering incrementally.

epic: wasm support

WASM support covers:

  • gateway integration
  • studio integration
  • harness integration

Remove V8 from Apollo Router

Before this can happen, various pieces of functionality currently provided by router-bridge need to have a Rust replacement.

epic: dev tooling

As an engineer, I would like a means to test federation and federation-next against a sample dataset, so that I have a fast feedback loop for my work, as well as a CI/CD process that permits the team to build and test all their changes together.

Acceptance Criteria

  • Articulate a list of dev tools that would empower the team. As a later step we can groom this further into issues or sets of issues based on our thoughts on their rough sizing.

Query graph traversal (query plan business logic)

During query planning, we need to use the operation to traverse the query graph and generate options for each closed branch (leaf field).

This is specifically step 8 of this query-planning summary. Note that if it's possible to cleanly punt on porting feature-specific code for later (e.g. subscriptions), we should take the opportunity to do so, in the interests of delivering incrementally.
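As a toy illustration of the traversal shape (not the actual algorithm, which walks the query graph under the guidance of the operation), here is a sketch that collects every path to each leaf using petgraph, which this repo already depends on:

    use petgraph::graph::{Graph, NodeIndex};

    // Toy sketch: collect, for each leaf ("closed branch"), the paths that
    // reach it. In the real planner, each path is an option for that branch.
    fn collect_options(
        graph: &Graph<&str, ()>,
        node: NodeIndex,
        path: &mut Vec<String>,
        options: &mut Vec<(String, Vec<String>)>,
    ) {
        let mut neighbors = graph.neighbors(node).peekable();
        if neighbors.peek().is_none() {
            // Leaf field: record the path taken so far as one option.
            options.push((graph[node].to_string(), path.clone()));
            return;
        }
        for next in neighbors {
            path.push(graph[next].to_string());
            collect_options(graph, next, path, options);
            path.pop();
        }
    }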

Craft a basic data set

Can we craft a small toy example of subgraphs to merge into a supergraph? In this way we have a known example we can demonstrate with Fed 2 and try out in our new draft.

Ongoing log of issues and features to be backported

[SPLIT] Operation processing (query plan business logic)

During query planning, we should be able to parse/validate operations into a representation suitable for query planning, and process operations (and their fragments) so that they may be used for query planning.

This is specifically steps 5-7 of this query-planning summary, i.e.:

  1. Parse and validate the operation against the API schema (similar to schemas, federation-internals has a representation of operations that provides capabilities useful for federation-y things, e.g. rebasing).
  2. Modify operation and fragments (e.g. adding __typename where needed, removing introspection fields, normalizing @defer usages; a sketch of the __typename step follows this list).
  3. Expand fragments in operations for query-planning purposes, and rebase fragments on subgraphs (to be used in subgraph fetches).
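A much-simplified sketch of the __typename step from point 2 (the real code must restrict this to positions where the runtime type matters, e.g. abstract types; the types here are hypothetical stand-ins):

    // Hypothetical stand-in for a selection tree.
    struct Field {
        name: String,
        selections: Vec<Field>,
    }

    fn add_typename(selections: &mut Vec<Field>) {
        if selections.is_empty() {
            return; // leaf field: nothing to add
        }
        if !selections.iter().any(|f| f.name == "__typename") {
            selections.push(Field {
                name: "__typename".into(),
                selections: Vec::new(),
            });
        }
        for field in selections.iter_mut() {
            add_typename(&mut field.selections);
        }
    }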

Refactor `extract_subgraphs_from_supergraph()` to carry around state in a struct

This is a follow-up to a comment on PR #56.

Currently, extract_subgraphs_from_supergraph() will call many functions during its execution, and those functions take a lot of arguments. We'd like to hoist those arguments (where possible) into a new struct (e.g. SubgraphExtractor), and have those functions be methods on that struct. This issue is for performing that conversion. This should hopefully reduce the number of explicit lifetimes used in those functions.
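A sketch of the direction (the field and method names are hypothetical):

    // Before: free functions threading the same arguments everywhere, e.g.
    //   fn extract_type(supergraph: &..., graph_names: &[String], ty: &...).
    // After: the shared arguments become fields, and helpers become methods.
    struct SubgraphExtractor<'a> {
        supergraph: &'a str, // stand-in for the real supergraph schema type
        graph_names: &'a [String],
    }

    impl SubgraphExtractor<'_> {
        // One lifetime on the struct replaces the per-function lifetimes
        // previously needed on every helper's arguments.
        fn extract_type(&self, type_name: &str) -> String {
            format!(
                "{} from {} ({} subgraphs)",
                type_name,
                self.supergraph,
                self.graph_names.len()
            )
        }
    }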

[Plan Generation] Fetch group processing

Port the logic for fetch group processing from the JS codebase to Rust in fetch_dependency_graph_processor.rs (there are some stubs there already). That is:

  • Port the TypeScript interface FetchGroupProcessor to a trait named FetchDependencyGraphProcessor (a rough sketch follows this list).
  • Port the defaultCostFunction global object to the Rust struct FetchDependencyGraphToCostProcessor by having it implement the trait.
  • Port the fetchGroupToPlanProcessor() function to the Rust struct FetchDependencyGraphToQueryPlanProcessor by having it implement the trait.
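A rough sketch of the shape this could take (the signatures are guesses for illustration, not the actual stubs in fetch_dependency_graph_processor.rs):

    // Guessed shape of the ported trait; the real one follows the stubs.
    trait FetchDependencyGraphProcessor<TProcessed, TReduced> {
        fn on_node(&mut self, node_label: &str) -> TProcessed;
        fn reduce_sequence(&mut self, values: Vec<TProcessed>) -> TReduced;
    }

    /// Cost analogue of defaultCostFunction: "processing" yields a cost.
    struct FetchDependencyGraphToCostProcessor;

    impl FetchDependencyGraphProcessor<f64, f64> for FetchDependencyGraphToCostProcessor {
        fn on_node(&mut self, node_label: &str) -> f64 {
            node_label.len() as f64 // toy stand-in for a real cost model
        }
        fn reduce_sequence(&mut self, values: Vec<f64>) -> f64 {
            values.into_iter().sum()
        }
    }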

Note that I think we need more operation processing logic to be done before we can port FetchGroup.toPlanNode() to FetchDependencyGraphNode.toPlanNode(), so I'd leave a TODO there for now.

Part of #78.

Convert `SpecDefinition` to inner struct instead of trait

This is a follow-up to a comment on PR #56.

Currently, SpecDefinition is a trait being used as a mixin (where trait implementers define two methods, and SpecDefinition provides various other method implementations based on those two methods). We could instead imagine this as a struct with the state for those two methods (url and minimum_federation_version), where implementers instead declare a field containing this struct. This issue is for performing that conversion.

Note that implementers will still need to implement some trait to indicate that the inner struct can be accessed, for structs like SpecDefinitions to still work. This could be Deref<Target = SpecDefinition>, so that SpecDefinitions would be SpecDefinitions<T: Deref<Target = SpecDefinition>> instead of SpecDefinitions<T: SpecDefinition>. However, we do not want to enable deref coercions for implementers (and Deref implies the implementer is a smart pointer when it really isn't). So we should instead make a new trait that's like Deref, maybe AsSpecDefinition (we may also be fine with AsRef<SpecDefinition>, but not sure here).
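A sketch of the conversion (with simplified field types; the implementer name is illustrative):

    // The former trait's state becomes a plain struct...
    struct SpecDefinition {
        url: String,
        minimum_federation_version: Option<String>,
    }

    impl SpecDefinition {
        // ...and its default method implementations become inherent methods.
        fn url(&self) -> &str {
            &self.url
        }
    }

    // A Deref-like trait, without implying smart-pointer semantics or
    // enabling deref coercions.
    trait AsSpecDefinition {
        fn spec_definition(&self) -> &SpecDefinition;
    }

    // Implementers declare a field containing the struct.
    struct JoinSpecDefinition {
        spec: SpecDefinition,
    }

    impl AsSpecDefinition for JoinSpecDefinition {
        fn spec_definition(&self) -> &SpecDefinition {
            &self.spec
        }
    }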

epic: composition - advanced

Advanced use cases of composition: composeDirective, the directive framework, @authenticated, @requiresScopes, @interfaceObject.

[Operation Processing] Add the ability to represent normalized named fragments and named fragment spreads

The normalized representation of operations introduced in #87 cannot currently represent named fragment spreads (e.g. ...Foo for named fragment Foo) or the named fragments themselves. This is fine for now, since we needed to expand fragments anyway prior to query planning. However:

  1. When we get to fragment optimization of subgraph queries after query planning, we'll need to represent subgraph queries with named fragment spreads (and their associated named fragments).
  2. In order to facilitate the key logic in #103, we'll need to be able to convert apollo-rs named fragments to normalized named fragments first, and then expand.

This ticket is for adding the ability to represent normalized named fragments, along with named fragment spreads in normalized selection sets.
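Schematically (with hypothetical type names), this adds a variant alongside fields and inline fragments:

    // Existing normalized selections, plus the new spread variant.
    enum NormalizedSelection {
        Field(NormalizedField),
        InlineFragment(NormalizedInlineFragment),
        FragmentSpread(NormalizedFragmentSpread), // new: e.g. ...Foo
    }

    struct NormalizedField {
        name: String,
    }

    struct NormalizedInlineFragment {
        type_condition: Option<String>,
    }

    // The spread references a normalized named fragment by name; the fragment
    // definitions themselves travel alongside the operation so that subgraph
    // queries can include them.
    struct NormalizedFragmentSpread {
        fragment_name: String,
    }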

epic: operation signature

As an engineer, I would like to replace common graphql-js functions with native rust functions so I may leave graphql-js by the side of the road.

Acceptance Criteria

  • Articulate a list of validation, introspection, and operation-signature jobs to be done. As a later step we can groom this further into issues or sets of issues based on our thoughts on their rough sizing.

epic: testing - performance (router)

  • performance tests (router) (placeholder @jeff to follow up w/ Polaris team)
  • Using the data set defined in the query planner performance epic, establish a comparable end-to-end router performance test that will permit us to exercise federation and federation-next to measure CPU, memory, and schema reload latency.

spike: what are the primary tasks performed by query-graph-js?

Acceptance Criteria: Review query-graph-js and write a document providing an analysis of key objects, functions (inputs and outputs), and purpose. Where there are still unknowns, write follow-up issues linked to this issue.

Let's time-box this to three business days. I know this seems short, but let's use this as an opportunity to discover what we can at a high level. I fully expect that more time will be needed, but let's use this issue as our mechanism to put transparency on the unknowns.

[Plan Selection] Computing best plan

Port the logic for generateAllPlansAndFindBest() from generateAllPlans.ts in the JS codebase over to Rust. The logic is relatively self-contained, so it's a candidate for a good first issue in the sense it doesn't require much prerequisite reading. The testing is also pretty self-contained (and there's not too much test logic), so it shouldn't be too bad to port tests either.
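The core idea, in a much-reduced sketch (the real function also prunes and short-circuits on cost; the names here are illustrative): exhaustively combine one option per branch and keep the lowest-cost combination.

    // Hypothetical simplification of "generate all plans, find best".
    fn find_best_plan<T: Clone>(
        branches: &[Vec<T>],
        cost: impl Fn(&[T]) -> f64,
    ) -> Option<(Vec<T>, f64)> {
        fn rec<T: Clone>(
            branches: &[Vec<T>],
            chosen: &mut Vec<T>,
            cost: &impl Fn(&[T]) -> f64,
            best: &mut Option<(Vec<T>, f64)>,
        ) {
            match branches.split_first() {
                None => {
                    // All branches have a choice: score this complete plan.
                    let c = cost(chosen);
                    if best.as_ref().map_or(true, |(_, b)| c < *b) {
                        *best = Some((chosen.clone(), c));
                    }
                }
                Some((first, rest)) => {
                    // Try each option for the current branch.
                    for option in first {
                        chosen.push(option.clone());
                        rec(rest, chosen, cost, best);
                        chosen.pop();
                    }
                }
            }
        }
        let mut best = None;
        rec(branches, &mut Vec::new(), &cost, &mut best);
        best
    }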

Part of #77.

add fuzzing and corpus testing to release process

As we continue to make improvements to the testing process, we should try baking it into the release process for the query planner. This should use cluster-fuzz and a form of corpus harness testing. Includes documentation.

add default members to workspace and exclude apollo-harness

apollo-harness builds router-bridge, which requires deno-core, which requires libz-ng-sys, which requires cmake. Not all systems have cmake installed.

Most developers won't need to build apollo-harness natively on their host system. To avoid requiring that they have cmake installed, add a default-members list to the workspace which excludes apollo-harness.

Document in the apollo-harness README.md the requirement to have cmake installed if you are modifying code in src.
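A sketch of the workspace change (the member names are illustrative, not the actual workspace layout):

    [workspace]
    members = ["apollo-federation", "apollo-harness"]
    # Plain `cargo build` from the root now skips apollo-harness, so cmake is
    # only required when building it explicitly, e.g.
    # `cargo build -p apollo-harness`.
    default-members = ["apollo-federation"]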

Refactor `Subgraph` and `Supergraph` to use `FederationSchema`

This is a follow-up to a comment on PR #56.

We've created FederationSchema to help keep track of federation metadata and referencers as we build subgraph schemas in extract_subgraphs_from_supergraph(). We could change the Subgraph and Supergraph structs (used by composition) to wrap this FederationSchema instead of Schema. This issue is for performing that transformation.

Note that FederationSchema doesn't allow arbitrary access to its underlying Schema, as it must update metadata appropriately with each insert()/remove(); this may consequently require some refactoring of our Rust composition code. Additionally (similar to the JS codebase), FederationSchema requires that types be "declared" (via pre_insert()) before type references of that type may be used, to keep track of type referencers appropriately. This will mean doing a similar step as in JS composition where we make a pass to collect existing type/directive names.
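A sketch of the declare-then-insert discipline (simplified; the real FederationSchema tracks referencers and federation metadata):

    use std::collections::HashSet;

    struct FederationSchema {
        declared_types: HashSet<String>,
        // ... underlying Schema, metadata, and referencers elided
    }

    impl FederationSchema {
        /// Declare a type name, so references to it can be tracked before
        /// its definition is inserted.
        fn pre_insert(&mut self, name: &str) {
            self.declared_types.insert(name.to_string());
        }

        /// Insert the definition; referencer metadata is updated here, which
        /// is why arbitrary mutable access to the Schema isn't exposed.
        fn insert(&mut self, name: &str) -> Result<(), String> {
            if !self.declared_types.contains(name) {
                return Err(format!("type `{name}` was not pre-inserted"));
            }
            // ... update referencers and the underlying Schema
            Ok(())
        }
    }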

Generate query plan from fetch dependency graph (query plan business logic)

This GH issue is the second part of #31.

Specifically:

  • Reducing and optimizing the fetch dependency graph.
  • Generating a query plan from the reduced fetch dependency graph.

More specifically, this issue is for porting FetchDependencyGraph.process() and its code dependencies.
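As a toy illustration of the final step (not the real FetchDependencyGraph.process() logic), a reduced dependency graph can be turned into a plan by grouping nodes into sequential layers whose members are fetchable in parallel:

    use petgraph::graph::Graph;
    use petgraph::Direction;

    // Toy sketch: assumes `graph` is a DAG of fetch nodes. Each returned
    // layer depends only on earlier layers, so its members can run in
    // parallel (roughly, a sequence of parallel nodes in the plan).
    fn plan_layers(graph: &Graph<&str, ()>) -> Vec<Vec<String>> {
        let mut indegree: Vec<usize> = graph
            .node_indices()
            .map(|n| graph.neighbors_directed(n, Direction::Incoming).count())
            .collect();
        let mut remaining: Vec<_> = graph.node_indices().collect();
        let mut layers = Vec::new();
        while !remaining.is_empty() {
            // Everything with no unsatisfied dependencies forms the layer.
            let layer: Vec<_> = remaining
                .iter()
                .copied()
                .filter(|n| indegree[n.index()] == 0)
                .collect();
            remaining.retain(|n| indegree[n.index()] != 0);
            for &n in &layer {
                for m in graph.neighbors_directed(n, Direction::Outgoing) {
                    indegree[m.index()] -= 1;
                }
            }
            layers.push(layer.iter().map(|&n| graph[n].to_string()).collect());
        }
        layers
    }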

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Awaiting Schedule

These updates are awaiting their schedule. Click on a checkbox to get an update now.

  • chore(deps): lock file maintenance

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

Vulnerabilities

Renovate has not found any CVEs on osv.dev.

Detected dependencies

cargo
Cargo.toml
  • apollo-compiler =1.0.0-beta.12
  • derive_more 0.99.17
  • indexmap 2.1.0
  • lazy_static 1.4.0
  • petgraph 0.6.4
  • salsa 0.16.1
  • serde_json 1.0.108
  • strum 0.25.0
  • strum_macros 0.25.2
  • thiserror 1.0
  • url 2
  • insta 1.34.0
circleci
.circleci/config.yml
  • rust 1.6.1
  • gh 2.3.0
  • secops 2.0.7

epic: testing - planning

As an engineer, I would like to:

  • set up unit, integration tests, performance tests

So that we have quality metrics in place at the start of the project, establishing a baseline.

Acceptance Criteria

  • unit tests and integration tests should be written as part of standard development
  • migrating existing jest tests should be done as part of the development process
  • performance test (query-planner) (get a random sample from test harness, establish a baseline) (M)
  • performance tests (router) (@jeff to follow up w/ Polaris team)
  • harness tests (we need to revisit the test harness retro tasks) (M)
  • additional integration test for wasm support (needed for Studio where fetch plan is visually represented) (@jeff to follow up w/ Pulsar team)
  • backwards compatibility tests - runs fed2x and fed-next, compares output to validate equivalence. (tool, take dataset from perf tests, CI) (M)
  • setup test fuzzing for unit testing (S)

considerations for query graph on router reloads

This issue is for us to look into the relationship between the query graph and router reloads. Does the query graph need to be cached? Or is the computation essentially small enough that it doesn't matter?

This doesn’t have to be solved/looked into until we have some (performance) data on the new query planner in the router.
