cambria-project's Introduction

Cambria

Cambria is a JavaScript/TypeScript library for converting JSON data between related schemas.

You specify (in YAML or JSON) a lens, which describes a data transformation. Cambria then uses this lens to convert data between the two schemas.

Lenses are bidirectional. Once you've converted a document from schema A to schema B, you can edit the document in schema B and propagate those edits backwards through the same lens to schema A.
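
As a rough illustration of what bidirectionality means, a rename lens can be modeled as a pair of functions. This is a standalone sketch and does not use Cambria's API; `Lens`, `renameKey` and `renameLens` are hypothetical names chosen for the example.

```typescript
// Standalone sketch of a bidirectional "rename" lens.
// This does NOT use Cambria's API; all names here are hypothetical.
type Doc = Record<string, unknown>

interface Lens {
  forward: (doc: Doc) => Doc // schema A -> schema B
  backward: (doc: Doc) => Doc // schema B -> schema A
}

// Move the value stored under `from` to `to`, leaving other fields untouched.
function renameKey(doc: Doc, from: string, to: string): Doc {
  if (!(from in doc)) return { ...doc }
  const { [from]: value, ...rest } = doc
  return { ...rest, [to]: value }
}

function renameLens(source: string, destination: string): Lens {
  return {
    forward: (doc) => renameKey(doc, source, destination),
    backward: (doc) => renameKey(doc, destination, source),
  }
}

const lens = renameLens('body', 'description')

// Convert a schema-A document to schema B, edit it there, then carry the edit back.
const docA: Doc = { title: 'Bug', body: 'It crashes' }
const docB = lens.forward(docA)
const editedB: Doc = { ...docB, description: 'It crashes on startup' }
const editedA = lens.backward(editedB)
```

Here `editedA` carries the edit made in schema B under the original `body` key, which is the behavior the paragraph above describes.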

For more background on why Cambria exists and what it can do, see the research essay.

⚠ Cambria is still immature software, and isn't yet ready for production use

Use cases

  • Manage backwards compatibility in a JSON API
  • Manage database migrations for JSON data
  • Transform a JSON document into a different shape on the command line
  • Combine with cambria-automerge to collaborate on documents across multiple versions of local-first software

CLI Usage

Cambria includes a simple CLI tool for converting JSON from the command line.

(You'll want to run yarn build to compile the latest code.)

Convert the GitHub issue into an arthropod-style issue:

cat ./demo/github-issue.json | node ./dist/cli.js -l ./demo/github-arthropod.lens.yml

To get a live updating pipeline using entr:

echo ./demo/github-arthropod.lens.yml | entr bash -c "cat ./demo/github-issue.json | node ./dist/cli.js -l ./demo/github-arthropod.lens.yml > ./demo/simple-issue.json"

Compile back from an updated "simple issue" to a new github issue file:

cat ./demo/simple-issue.json | node ./dist/cli.js -l ./demo/github-arthropod.lens.yml -r -b ./demo/github-issue.json

Live updating pipeline backwards:

echo ./demo/simple-issue.json | entr bash -c "cat ./demo/simple-issue.json | node ./dist/cli.js -l ./demo/github-arthropod.lens.yml -r -b ./demo/github-issue.json > ./demo/new-github-issue.json"

API Usage

Cambria is mostly intended to be used as a TypeScript/JavaScript library. Here's a simple example of converting an entire document.

import { readFileSync } from 'fs'
// assumes these functions are available from the package's top-level exports
import { applyLensToDoc, loadYamlLens, reverseLens } from 'cambria'

// `program` holds the parsed command-line options used below
// (input, base, lens, reverse, schema)

// read doc from stdin (file descriptor 0) if no input specified
const input = readFileSync(program.input || 0, 'utf-8')
const doc = JSON.parse(input)

// we can (optionally) apply the contents of the changed document to a target document
const targetDoc = program.base ? JSON.parse(readFileSync(program.base, 'utf-8')) : {}

// now load a (yaml) lens definition
const lensData = readFileSync(program.lens, 'utf-8')
let lens = loadYamlLens(lensData)

// should we reverse this lens?
if (program.reverse) {
  lens = reverseLens(lens)
}

// finally, apply the lens to the document, with the schema, onto the target document!
const newDoc = applyLensToDoc(lens, doc, program.schema, targetDoc)
console.log(JSON.stringify(newDoc, null, 4))

Install

If you're using npm, run npm install cambria. If you're using yarn, run yarn add cambria. Then you can import it with require('cambria') as in the examples (or import * as Cambria from 'cambria' if using ES2015 or TypeScript).

Tests

npm run test

cambria-project's People

Contributors

geoffreylitt, orionz, pvh, singingwolfboy

cambria-project's Issues

postinstall hook fails

Running yarn install on a node project that depends on cambria results in this error:

error /Users/geoffreylitt/dev/arthropod2/arthropod/node_modules/cambria: Command failed.
Exit code: 127
Command: yarn run build
Arguments:
Directory: /Users/geoffreylitt/dev/arthropod2/arthropod/node_modules/cambria
Output:
yarn run v1.17.3
$ tsc --outDir ./dist

The problem is that we don't install dependencies before attempting to run yarn run build in the post-install hook.

Options:

  • replace yarn run build w/ a cross-platform-friendly version of yarn install && yarn run build (&& not windows-friendly)
  • put dist/ in git

I think this won't be a problem once we publish to npm -- I think we can publish dist to npm without putting it in source control.

Issue importing doc

When I import a doc that has two array items (each with a property set to null), everything works fine:

const importedDoc = cambria.importDoc({
  items: [
    { name: 'Peter Johnson', nationality: null },
    // { name: 'Ashley Appleseed', nationality: null },
    { name: 'John Doe', nationality: null }
  ]
})

but if I have 3 items in the array:

const importedDoc = cambria.importDoc({
  items: [
    { name: 'Peter Johnson', nationality: null },
    { name: 'Ashley Appleseed', nationality: null },
    { name: 'John Doe', nationality: null }
  ]
})

it fails with this error:

(node:21691) UnhandledPromiseRejectionWarning: TypeError: Cannot read property 'type' of null
    at Object.mergeSchemaObjs (.../node_modules/to-json-schema/lib/helpers.js:40:19)
    at .../node_modules/to-json-schema/lib/index.js:93:24
    at Array.reduce (<anonymous>)
    at ToJsonSchema.getCommonArrayItemSchema (...node_modules/to-json-schema/lib/index.js:92:22)

Node version: v12.18.3

Support for subschemas and references

I ran into some errors trying to update a schema that had subschemas using the allOf keyword. The schema I was trying to update can be seen here.

I've implemented a naive fix that strictly addresses this problem for my use case, but then I run into the problem that $ref's are not yet supported. I dug around a bit and found a mention of this issue in #1 (comment), and I assume this is still the status, as @geoffreylitt mentioned there:

We could start adding support for refs everywhere, but this seems like the right moment to pause and consider Peter's suggestion of finding a better way of working with json schemas -- either find a helper lib or roll our own functions to help navigate a schema.

So I guess I'm wondering if there have been any further decisions or discussion about how this will move forward, and whether contributions would be considered. If so, I'd be happy to lend a hand.

For my own use case, it's fairly trivial to just pull out the only allOf schema that's really significant (frankly, it's annoying it's there at all), and then "dereference" my schema definitions as normal properties. I'm already doing so in my demo repo. So I have a workaround. But I figured it could be useful to document here, and I appreciate any updates if and when there is progress on these features.
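
The "dereferencing" workaround described above could look roughly like the following. This is a minimal standalone sketch, not code from Cambria or from the demo repo mentioned above: it assumes refs are local JSON pointers such as `#/definitions/...` without escaped characters, and that definitions are not recursive. `resolvePointer` and `dereference` are hypothetical helper names.

```typescript
// Standalone sketch: inline local "#/..." references in a JSON Schema.
// Assumptions: refs are local pointers without escaped characters, and the
// definitions are not recursive (a cycle would recurse forever).
type Json = null | boolean | number | string | Json[] | { [key: string]: Json }
type JsonObject = { [key: string]: Json }

function isObject(value: Json): value is JsonObject {
  return value !== null && typeof value === 'object' && !Array.isArray(value)
}

// Walk a local pointer like "#/definitions/label" from the schema root.
function resolvePointer(root: Json, ref: string): Json {
  if (ref.indexOf('#/') !== 0) throw new Error(`Only local refs are handled, got: ${ref}`)
  let node: Json = root
  for (const segment of ref.slice(2).split('/')) {
    if (!isObject(node) || !(segment in node)) throw new Error(`Cannot resolve ${ref}`)
    node = node[segment]
  }
  return node
}

// Return a copy of `node` in which every string-valued $ref is replaced by its target.
function dereference(node: Json, root: Json): Json {
  if (Array.isArray(node)) return node.map((item) => dereference(item, root))
  if (!isObject(node)) return node
  const result: JsonObject = {}
  const ref = node['$ref']
  if (typeof ref === 'string') {
    const target = dereference(resolvePointer(root, ref), root)
    if (isObject(target)) Object.assign(result, target)
  }
  // Sibling keywords next to $ref (e.g. a title) are kept and override the target's.
  for (const key of Object.keys(node)) {
    if (key !== '$ref' || typeof ref !== 'string') result[key] = dereference(node[key], root)
  }
  return result
}

const schema: Json = {
  definitions: { label: { type: 'string', maxLength: 50 } },
  type: 'object',
  properties: {
    name: { $ref: '#/definitions/label' },
    tag: { $ref: '#/definitions/label', title: 'Tag' },
  },
}
const flat = dereference(schema, schema)
```

In `flat`, both `name` and `tag` are plain property schemas with the definition's keywords inlined, and `tag` keeps its own `title`. Letting sibling keywords override the target is a design choice made for this sketch.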

JSON Schema - use of allOf, oneOf

I'm thinking about using Cambria in a tool that already uses JSON Schema (CRUD tool for JSONs). However, things like allOf and oneOf are not supported (and will not be supported as they make things too complicated, in the code but also from the perspective of the created data structures).

I was wondering why those two things are used in the Cambria JSON schema.

allOf

If I understand correctly, this is used to first import the $ref and then set a custom title/description? Couldn't the title/description then just be set on the original $refd definition? Or will this probably be used in other places as well?

Also - and again I'm not sure - can't $ref be used to import parts of another schema? So instead of re-defining the basic JSON schema types, could the $ref not point directly to http://json-schema.org/draft-07/schema/definitions/simpleTypes? Although allOf would then definitely have to be used to set the title/description ...

oneOf

Instead of the oneOf in lensOp, couldn't just one big object be created with all the ops as keys? It would allow adding multiple but distinct ops in one step, but is that a problem? (I'm not sure whether some of the JSON Schema validation rules could be used to prevent that; we're also not making use of those.)

Not pure (but deterministic) lenses

Currently, lenses have to be pure and cannot obtain additional data. This is an issue for cases like the Stripe API example from the doc. In practice such transformations are pretty common, at least in my case, which focuses on API versioning transformations: for example, you move part of the data to another API endpoint. I was thinking this could be solved nicely for API versioning by letting a lens call the API at the version for which the lens was written. That way the lens continues to work even many versions later, and the API versioning transformations back-migrate the data the lens needs from the latest API version to the version the lens expects.

The issue I am having is how one would describe such API requests in a declarative way, as current lenses are. I am finding that this is mostly impossible, and it looks like I would have to write imperative code for forward and backward migration. But then it is hard to also automatically implement transformations of schemas, and so on.

So I am curious if you have any thoughts on this subject?

Enforcing conservation

It seems to me, at first sight, that it should be possible to enforce the conservation design goal at compile time, if desired.

For instance, in the example from appendix III of the research paper where the single-value writer is [...] overwriting data they are unaware of, maybe Cambria could track which schema is aware/unaware of which bits of data, and make sure no write that happens under a given schema ever affects data that is not covered by that schema.

One could maybe model the lens more precisely by keeping the common information separate from the one-sided information, in an intermediate step:

Schema A: assignee: string | null
<- lens 1 (trivially) ->
Intermediate: { mainAssignee: string | null, otherAssignees: string[] }
<- lens 2 (lossless) ->
Schema B: assignees: string[]

I'll think about this some more.
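
To make the two-step model above concrete, here is a minimal standalone sketch. It is not Cambria's API; every type and function name is hypothetical, and it is only one possible reading of the idea.

```typescript
// Standalone sketch of the two-step model above; not Cambria's API.
// All type and function names are hypothetical.
interface SchemaA { assignee: string | null }
interface Intermediate { mainAssignee: string | null; otherAssignees: string[] }
interface SchemaB { assignees: string[] }

// Lens 1, read direction: schema A only sees the main assignee.
function intermediateToA(doc: Intermediate): SchemaA {
  return { assignee: doc.mainAssignee }
}

// Lens 1, write direction: an A-side write replaces only the field A knows about.
// The other assignees are carried over from the previous intermediate state.
function applyWriteFromA(previous: Intermediate, written: SchemaA): Intermediate {
  return { mainAssignee: written.assignee, otherAssignees: previous.otherAssignees.slice() }
}

// Lens 2: intermediate -> schema B.
function intermediateToB(doc: Intermediate): SchemaB {
  const head: string[] = doc.mainAssignee === null ? [] : [doc.mainAssignee]
  return { assignees: head.concat(doc.otherAssignees) }
}

// Lens 2: schema B -> intermediate. The first assignee is treated as the main one.
function bToIntermediate(doc: SchemaB): Intermediate {
  return {
    mainAssignee: doc.assignees.length > 0 ? doc.assignees[0] : null,
    otherAssignees: doc.assignees.slice(1),
  }
}

// A multi-assignee document written under schema B...
const stateFromB = bToIntermediate({ assignees: ['alice', 'bob', 'carol'] })
// ...then edited by a writer that only knows schema A.
const afterWrite = applyWriteFromA(stateFromB, { assignee: 'dave' })
const seenByB = intermediateToB(afterWrite)
```

The A-side write changes only `mainAssignee`, so `seenByB.assignees` still contains `bob` and `carol`. One gap in this sketch: a state where `mainAssignee` is null but `otherAssignees` is non-empty does not survive a round trip through schema B, so a fuller model would have to decide how to treat that case.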

Missing property type in reversed `remove` op

Description

This demo mentioned in README.md seems broken:

Compile back from an updated "simple issue" to a new github issue file:

cat ./demo/simple-issue.json | node ./dist/cli.js -l ./demo/github-arthropod.lens.yml -r -b ./demo/github-issue.json

In practice:

$ cat ./demo/simple-issue.json | node ./dist/cli.js -l ./demo/github-arthropod.lens.yml -r -b ./demo/github-issue.json
./dist/json-schema.js:35
        throw new Error(`Missing property name in addProperty.\nFound:\n${JSON.stringify(property)}`);
        ^

Error: Missing property name in addProperty.
Found:
{"op":"add","name":"labels"}
    at addProperty (./dist/json-schema.js:35:15)
    at applyLensOperation (./dist/json-schema.js:338:20)
    at ./cambria-project/dist/json-schema.js:366:16
    at Array.reduce (<anonymous>)
    at Object.updateSchema (./dist/json-schema.js:363:17)
    at Object.applyLensToDoc (./dist/doc.js:51:40)
    at Object.<anonymous> (./dist/cli.js:25:22)
    at Module._compile (internal/modules/cjs/loader.js:1085:14)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1114:10)
    at Module.load (internal/modules/cjs/loader.js:950:32)

Cause

github-arthropod.lens.yml defines a valid remove op, with no type required:

- remove:
    name: labels

And reverse.ts just swaps in the add op:

case 'remove':
  return {
    ...lensOp,
    op: 'add',
  }

But an add op must specify a type! It's the type that is missing, not the name:

if (!name || !type) {
  throw new Error(`Missing property name in addProperty.\nFound:\n${JSON.stringify(property)}`)
}

Solutions

  • Require types on remove ops.
  • Set an unrestrictive default type when reversing a remove op without a type. I sketched that out here, and confirmed the demo works as expected and unit tests pass: lukasschwab@77b602e
  • Loosen the commitment to reversibility. This seems incompatible with the project goals.
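
The second option might be sketched as follows. This is a standalone illustration with simplified stand-in types, not Cambria's actual `LensOp` definitions and not the code from the linked commit. In particular, modeling "unrestrictive" as a union of every JSON type is an assumption made for this example.

```typescript
// Standalone sketch of the second option; these types are simplified stand-ins,
// not Cambria's actual LensOp definitions.
type JsonType = 'string' | 'number' | 'boolean' | 'object' | 'array' | 'null'

interface PropertyOp {
  op: 'add' | 'remove'
  name: string
  type?: JsonType | JsonType[]
}

// Assumption: "unrestrictive" is modeled here as a union of every JSON type.
// The linked commit may choose a different default.
const ANY_TYPE: JsonType[] = ['string', 'number', 'boolean', 'object', 'array', 'null']

function reversePropertyOp(lensOp: PropertyOp): PropertyOp {
  switch (lensOp.op) {
    case 'remove':
      // Reversing a remove yields an add, which needs a type:
      // keep the declared one if present, otherwise fall back to the default.
      return { ...lensOp, op: 'add', type: lensOp.type !== undefined ? lensOp.type : ANY_TYPE }
    case 'add':
      return { ...lensOp, op: 'remove' }
  }
}

// The remove op from the demo lens, which declares no type.
const reversed = reversePropertyOp({ op: 'remove', name: 'labels' })
```

With this fallback, `reversed` is an add op that always carries a type, so a check like the one in addProperty would no longer throw for this lens.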

Let me know if you'd like me to open a PR.

Cool stuff; enjoyed the HYTRADBOI talk!
