condenast / atjson

atjson is a living content format for annotating content

Home Page: https://atjson.condenast.io

License: Apache License 2.0

TypeScript 95.47% HTML 0.31% CSS 0.45% JavaScript 0.58% Shell 0.01% MDX 3.19%

atjson's People

Contributors

a-rena, adityagandhamal, akondakov-cn, allison-zhao, andrealandonio, anurag-cn, bachbui, blaine, colin-alexa, dependabot[bot], dkorenblyum, donmclean, fedeava, foobarrio, gmedina, gnorsilva, indrani-gostu, jaylonez, kmaxxo, mattbedell, nayeemrehman, neilius, nosamanuel, pgoldrbx, renovate-bot, renovate[bot], rnsell, tim-evans, vinay-pr, vladimirjv

atjson's Issues

Fix splitting delimiter runs

We have a utility in our Commonmark renderer to adjust the boundaries of certain annotations when they would produce an invalid delimiter run. This logic had assumed that the rules for valid delimiter runs were the same regardless of what the specific delimiter character was, but this is not the case.

Here are the rules for delimiters, from least to most restrictive:

If the delimiter is ^ or ~:

  • the inner boundary must not be a whitespace character

If the delimiter is *, **, or ~~

  • the inner boundary for a delimiter run must not be a whitespace character
  • the outer boundary for a delimiter run must be a whitespace or punctuation character if the inner boundary is a punctuation character

If the delimiter run is _ or __

  • the inner boundary for a delimiter run must not be a whitespace character
  • the outer boundary for a delimiter run must be a whitespace or punctuation character

Here are some examples of the correct behavior. Here square brackets represent the delimiter boundary, an underscore represents a whitespace character, and a dash represents a punctuation character:

Original    Split for ^, ~    Split for *, **, ~~    Split for _, __
[_a_b]      _[a_b]            _[a_b]                 _[a_b]
a[-b]       a[-b]             a-[b]                  a-[b]
a[b_c]      a[b_c]            a[b_c]                 ab_[c]
a[bc]       a[bc]             a[bc]                  abc[]
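
A minimal sketch of how the three rule sets above could be applied, assuming the run is narrowed from both sides until each boundary is valid. The names `splitDelimiterRun` and `isValidBoundary` are hypothetical (they are not the utility in the Commonmark renderer), the start and end of the text are treated as whitespace, and only ASCII punctuation is recognized:

```typescript
type Delimiter = "^" | "~" | "*" | "**" | "~~" | "_" | "__";

// Returns undefined outside the text, so the text edges can count as whitespace.
function charAt(text: string, index: number): string | undefined {
  return index >= 0 && index < text.length ? text[index] : undefined;
}

function isWhitespace(ch: string | undefined): boolean {
  return ch === undefined || /\s/.test(ch);
}

// ASCII punctuation only; a simplification for this sketch.
function isPunctuation(ch: string | undefined): boolean {
  return ch !== undefined && /[!-\/:-@\[-`{-~]/.test(ch);
}

function isValidBoundary(
  delimiter: Delimiter,
  inner: string | undefined,
  outer: string | undefined
): boolean {
  // Every delimiter requires a non-whitespace inner boundary.
  if (isWhitespace(inner)) return false;
  const outerIsBreak = isWhitespace(outer) || isPunctuation(outer);
  if (delimiter === "^" || delimiter === "~") return true;
  if (delimiter === "_" || delimiter === "__") return outerIsBreak;
  // *, ** and ~~ only constrain the outer boundary when the inner one is punctuation.
  return !isPunctuation(inner) || outerIsBreak;
}

// Narrows [start, end) until both boundaries are valid for the delimiter.
function splitDelimiterRun(
  text: string,
  start: number,
  end: number,
  delimiter: Delimiter
): { start: number; end: number } {
  while (
    start < end &&
    !isValidBoundary(delimiter, charAt(text, start), charAt(text, start - 1))
  ) {
    start++;
  }
  while (
    end > start &&
    !isValidBoundary(delimiter, charAt(text, end - 1), charAt(text, end))
  ) {
    end--;
  }
  return { start, end };
}
```

For example, `splitDelimiterRun("ab c", 1, 4, "_")` narrows the run to the final `c`, matching the third row of the table, while the same call with `"*"` leaves the run unchanged.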

Return subdocuments in the ReactRenderer as a React node

In a recent meeting, we discussed improving the developer ergonomics of the @atjson/renderer-react package, and wanted to make subdocuments usable in a more natural React way.

This change requires looking at an annotation to see if it has subdocuments, rendering each subdocument using the components given to the top-level render, and then passing the rendered result in place of that property. This does change compatibility with the AttributesOf type definition.

So, to summarize, using an annotation like so:

import { ObjectAnnotation } from "@atjson/document";

export class Image extends ObjectAnnotation<{
  src: string;
  caption: CaptionSource;
}> {
  static vendorPrefix = "test";
  static type = "image";
  static subdocuments = { caption: CaptionSource };
}

Previously, the React component would require calling ReactRenderer.render again:

import { AttributesOf } from "@atjson/document";
import ReactRenderer from "@atjson/renderer-react";
import * as React from "react";
import { FC } from "react";
import { Image as Annotation } from "./annotation";
import components from '../components';

export const Image: FC<AttributesOf<Annotation>> = props => {
  return (
    <figure>
      <img src={props.src} />
      <figcaption>{ReactRenderer.render(props.caption, components)}</figcaption>
    </figure>
  );
};

With this proposal, this code could be simplified:

import { AttributesOf } from "@atjson/renderer-react";
import * as React from "react";
import { FC } from "react";
import { Image as Annotation } from "./annotation";

export const Image: FC<AttributesOf<Annotation>> = props => {
  return (
    <figure>
      <img src={props.src} />
      <figcaption>{props.caption}</figcaption>
    </figure>
  );
};

To summarize, the suggested changes here are:

  • Add a new export to @atjson/renderer-react for a React-aware AttributesOf
  • Update the react renderer with a breaking change that recursively calls ReactRenderer.render on any subdocuments that are found.
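
The second change can be sketched without React. This is only an illustration of the idea, assuming a renderer that produces strings instead of React nodes; `resolveSubdocuments`, `Doc`, and the `subdocumentKeys` parameter are hypothetical names, not part of @atjson/renderer-react:

```typescript
// Stand-in for an atjson document; real documents also carry annotations.
interface Doc {
  content: string;
}

// Replaces every subdocument attribute with its rendered output, so a
// component receives ready-to-use values instead of raw documents.
function resolveSubdocuments<T>(
  attributes: Record<string, unknown>,
  subdocumentKeys: string[],
  render: (doc: Doc) => T
): Record<string, unknown> {
  const resolved: Record<string, unknown> = { ...attributes };
  for (const key of subdocumentKeys) {
    const value = attributes[key];
    if (value !== undefined) {
      // In the React renderer this would be the recursive render call.
      resolved[key] = render(value as Doc);
    }
  }
  return resolved;
}

const props = resolveSubdocuments(
  { src: "test.jpg", caption: { content: "A caption" } },
  ["caption"],
  (doc) => `<p>${doc.content}</p>`
);
```

Here `props.caption` holds the rendered caption and `props.src` is passed through untouched, which is the shape the simplified component above expects.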

Add cumulative performance time to performance profile summary

Currently we don't have this readily available for our performance profiles, which makes comparison difficult in cases like #394, where function names changed along with the code. A cumulative summary with a confidence interval would help in this case, so we're more aware of the effect of the changes.

Annotation.equals returns false positives

When the annotation being compared has additional attributes, Annotation.equals returns a false positive.

What do you expect to happen?

Annotation.equals should return false when the attributes are not strictly equal.
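
One way to get the expected behaviour is to compare attribute keys in both directions, so extra attributes on either side fail the check. This is a sketch under that assumption, using a simplified annotation shape; `annotationEquals` and `attributesEqual` are hypothetical helpers, not the existing implementation:

```typescript
interface SimpleAnnotation {
  type: string;
  start: number;
  end: number;
  attributes: Record<string, unknown>;
}

// Deep equality that also fails when one side has keys the other lacks.
function attributesEqual(a: unknown, b: unknown): boolean {
  if (a === b) return true;
  if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) {
    return false;
  }
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  const left = a as Record<string, unknown>;
  const right = b as Record<string, unknown>;
  const leftKeys = Object.keys(left);
  // Comparing key counts is what rules out additional attributes.
  if (leftKeys.length !== Object.keys(right).length) return false;
  return leftKeys.every(
    (key) =>
      Object.prototype.hasOwnProperty.call(right, key) &&
      attributesEqual(left[key], right[key])
  );
}

function annotationEquals(a: SimpleAnnotation, b: SimpleAnnotation): boolean {
  return (
    a.type === b.type &&
    a.start === b.start &&
    a.end === b.end &&
    attributesEqual(a.attributes, b.attributes)
  );
}
```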

Only short close HTML tags that are valid to short close

We were seeing usage of shortClose cause parsing issues, because the HTML spec restricts short-closed elements to void elements.

The shortClose property should be removed from the $ method in the HTML renderer and we should short close automatically according to whether the tag name is in the list of valid void elements.
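
A rough sketch of the proposed behaviour, where the decision to short close is derived from the tag name alone. `renderElement` is a hypothetical stand-in for the `$` method, and the set below lists the void elements defined by the HTML Standard:

```typescript
// Void elements per the HTML Standard; these never have a closing tag.
const VOID_ELEMENTS = new Set([
  "area", "base", "br", "col", "embed", "hr", "img",
  "input", "link", "meta", "source", "track", "wbr",
]);

function escapeAttribute(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

function renderElement(
  tagName: string,
  attributes: Record<string, string> = {},
  children = ""
): string {
  const attrs = Object.keys(attributes)
    .map((name) => ` ${name}="${escapeAttribute(attributes[name])}"`)
    .join("");
  // Short close only when the tag is a void element; callers cannot opt in.
  if (VOID_ELEMENTS.has(tagName.toLowerCase())) {
    return `<${tagName}${attrs} />`;
  }
  return `<${tagName}${attrs}>${children}</${tagName}>`;
}
```

With this shape, an empty `div` is always rendered with an explicit closing tag, so it can no longer swallow the content that follows it when parsed.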

Remove the concept of "tight" lists

Commonmark has a concept of "tight" lists, which is a shorthand that says whether there should be surrounding whitespace in a list item or not. We've adopted this into the offset-annotations package, which I believe is a mistake. It has the potential of causing changes in behavior because a client is unaware of how they should render the markup with / without the tight parameter.

In addition to this, it changes how annotations should be rendered according to the outer context.

The proposal is to wrap list items in paragraphs if tight is false in the converter from markdown to atjson.

Index annotations by type to speed up `where` queries that check the type

The where API (for documents, not collections) is like a relational db api, so indexes make sense for us to keep. We shouldn't create a dsl (currently) for creating indexes.

Reindexing has to happen less often than we call where queries for us to get the performance benefits.

This will only be beneficial if the index is long-lived. (cf. if the index exists on a collection, the index isn't super useful). In the case of converters, we create a new collection and discard the collection along with the annotations. To see a large improvement with this, we may need to rearchitect consuming code.

Cache busting will be hard, and we'll need to cache-bust the indexes when addAnnotations, removeAnnotations, and replaceAnnotations are called.

👩‍🔬 Hypothesis:

Most of the where queries we run are faceted by type. We suspect that indexing annotations by type will make all code consuming atjson faster, since query results could then be faceted by type at a cost of O(1).

⚠️ The functional API will not use this and this should not be a breaking change to atjson.
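
To make the hypothesis concrete, here is a small sketch of a lazily built type index that is discarded whenever annotations change. The `AnnotationStore` class and its `whereType` method are hypothetical; only the method names `addAnnotations` and `removeAnnotations` come from the discussion above:

```typescript
interface IndexedAnnotation {
  id: string;
  type: string;
  start: number;
  end: number;
}

class AnnotationStore {
  private annotations: IndexedAnnotation[] = [];
  // null means the index is stale and must be rebuilt on the next query.
  private typeIndex: Map<string, IndexedAnnotation[]> | null = null;

  addAnnotations(...added: IndexedAnnotation[]): void {
    this.annotations.push(...added);
    this.typeIndex = null; // cache-bust
  }

  removeAnnotations(...removed: IndexedAnnotation[]): void {
    const ids = new Set(removed.map((annotation) => annotation.id));
    this.annotations = this.annotations.filter((a) => !ids.has(a.id));
    this.typeIndex = null; // cache-bust
  }

  // O(1) lookup once the index exists; rebuilding it is O(n).
  whereType(type: string): IndexedAnnotation[] {
    if (this.typeIndex === null) {
      this.typeIndex = this.buildIndex();
    }
    return this.typeIndex.get(type) ?? [];
  }

  private buildIndex(): Map<string, IndexedAnnotation[]> {
    const index = new Map<string, IndexedAnnotation[]>();
    for (const annotation of this.annotations) {
      const bucket = index.get(annotation.type);
      if (bucket) {
        bucket.push(annotation);
      } else {
        index.set(annotation.type, [annotation]);
      }
    }
    return index;
  }
}
```

The sketch also shows the caveat raised above: if every mutation is followed by a single query, the index is rebuilt each time and nothing is gained.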

Strikethrough is missing from renderer-html

We're working on a CKEditor integration, and strikethrough is failing to work.

What do you expect to happen?

A strikethrough annotation is rendered into <s>strikethrough</s>

What happened instead?

Strikethroughs are rendered as strikethrough (no tags).

Complete Open Source Checklist

This is a master issue to work through the Condé Nast Open Source Checklist.

Documentation

  • README
  • CONTRIBUTING
  • CODE_OF_CONDUCT
  • LICENSE
  • ISSUE_TEMPLATE
  • PULL_REQUEST_TEMPLATE
  • CHANGELOG
  • Announcement Blog Post

Development

  • Tests
  • Linter
  • Travis
  • Examples
  • Clear Github History

README Checklist

  • Title
  • Badges
  • Description
  • Prerequisite for using software ( if any )
  • Install
  • Example Usage
  • Screenshots and GIFs
  • Contributors
  • Conde Nast Technology Logo
  • Attributions ( mention 3rd party libs used etc. )
  • Benchmarks (if any)
  • Prior Art (if any)

Badges Checklist:

  • License Badge
  • Testing Badge
  • CI Badge

Not all annotations have type definitions on them

Some annotation definitions don't have an annotation schema defined on them.

The following packages have incomplete / incorrectly defined annotation definitions on them:

  • @atjson/source-gdocs-paste
  • @atjson/source-mobiledoc
  • @atjson/source-prism
  • @atjson/source-url

Define Schema interface

As a kickoff of #183, let's have a discussion of how schemas should be structured.

In the branch that I created, the schema interface looks like:

interface SchemaDefinition {
  type: string;
  version: string;
  annotations: {
    [key: string]: typeof Annotation;
  }
}

The other option here is the most minimal approach, which is an annotation lookup table:

interface SchemaDefinition {
  [key: string]: typeof Annotation;
}

Before we start writing this, we want to solicit some feedback and ensure that we understand requirements that we may want on this.

Some general questions:

  • We've made a "content negotiation framework" to support dynamic markdown fetching so we don't need to optimistically write migrations. Do we want to add some friendly hooks here to handle this more easily?
  • We've had some discussions around versioning, and were wondering if it made sense for the version of the schema to be computed by the hash of the annotations included in the document.
  • How do we handle code that currently does lookups via strings? Should we require type to retain this behaviour, or should we sunset that pattern?

@bachbui, @colinarobinson, @blaine have all contributed to this discussion prior to this issue being opened
❤️ Thank you ❤️

Nested lists are flattened when converting from gdocs paste source

In a Google Doc, you can create a nested list like

1. List item
2. List item
   a. Nested list item
   b. Nested list item
3. List item

This is represented in GDocs as a single list with 5 list items, where the outer items have the attribute ls_nest: 0 and the nested items have the attribute ls_nest: 1. When converting this from the GDocs source to Offset, we drop the ls_nest attribute and just produce a list with 5 elements.

What do you expect to happen?

Produce annotations like:

List item\nList item\nNested list item\nNested list item\nList item
                      ^-----item-----^  ^-----item-----^
                      ^------ List { level: 2 } -------^ 
^-item--^  ^-item--^  ^------------- item -------------^  ^-item--^
^---------------------- List { level: 1 } ------------------------^

What happened instead?

Produced annotations like:

List item\nList item\nNested list item\nNested list item\nList item
^-item--^  ^-item--^  ^-----item-----^  ^-----item-----^  ^-item--^
^----------------------------- List ------------------------------^
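
A sketch of how the list ranges in the expected output could be derived from the flat items, assuming each item carries its ls_nest value. `nestLists` is a hypothetical helper, and it only computes the List annotations; the extra item that wraps the nested list in the expected output is left out:

```typescript
interface FlatListItem {
  start: number;
  end: number;
  nest: number; // value of ls_nest in the GDocs paste
}

interface ListRange {
  start: number;
  end: number;
  level: number;
}

// For every nesting depth, each maximal run of consecutive items at that
// depth or deeper becomes one list.
function nestLists(items: FlatListItem[]): ListRange[] {
  const lists: ListRange[] = [];
  const maxNest = items.reduce((max, item) => Math.max(max, item.nest), 0);
  for (let nest = 0; nest <= maxNest; nest++) {
    let run: FlatListItem[] = [];
    const flush = () => {
      if (run.length > 0) {
        lists.push({
          start: run[0].start,
          end: run[run.length - 1].end,
          level: nest + 1,
        });
        run = [];
      }
    };
    for (const item of items) {
      if (item.nest >= nest) {
        run.push(item);
      } else {
        flush();
      }
    }
    flush();
  }
  return lists;
}

// Offsets for the five items of the example document above.
const exampleLists = nestLists([
  { start: 0, end: 9, nest: 0 },
  { start: 10, end: 19, nest: 0 },
  { start: 20, end: 36, nest: 1 },
  { start: 37, end: 53, nest: 1 },
  { start: 54, end: 63, nest: 0 },
]);
```

For the example this yields a level 1 list covering the whole text and a level 2 list covering only the two nested items.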

Environment

Software Version(s)
Node 10
Lerna 3.2
npm 6.4
Browser Chrome 80

Add pandoc source and renderer

What are the results of this discussion?

I'd like to propose primarily using the Pandoc document model for source and rendering. Pandoc is a universal document converter covering Markdown, Office, TeX and many more. It has been developed and heavily used for years, so it also covers most edge cases and pitfalls of these formats. Pandoc internally converts document formats to its document model, which can be read and written as JSON, e.g.:

echo '# Hello _World_' | pandoc -t json

For a reference of the model, see this Haskell package. This mapping to Perl, which I wrote a few years ago, may also be of use. So the workflow for converting documents to and from atjson would be:

  • Document => Pandoc JSON => atjson
  • atjson => Pandoc JSON => Document

Once the atjson specification is finished, support for atjson could also be added to the Pandoc source code.

What do you think about using Pandoc for conversion from and to atjson?

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Warning

These dependencies are deprecated:

Datasource    Name                                         Replacement PR?
npm           @babel/plugin-proposal-class-properties      Unavailable
npm           @babel/plugin-proposal-object-rest-spread    Unavailable
npm           @ckeditor/ckeditor5-build-classic            Unavailable
npm           @types/entities                              Unavailable
npm           @types/parse5                                Unavailable
npm           @types/prettier                              Unavailable

Awaiting Schedule

These updates are awaiting their schedule. Click on a checkbox to get an update now.

  • chore(deps): update dependency lint-staged to v13.3.0
  • chore(deps): update dependency node to v20.15.0
  • chore(deps): update dependency ts-loader to v9.5.1
  • chore(deps): update dependency typescript to v5.5.3
  • chore(deps): update eslint (@typescript-eslint/eslint-plugin, @typescript-eslint/parser, eslint, eslint-config-prettier, eslint-plugin-jest)
  • chore(deps): update react monorepo (@types/react, @types/react-dom, react, react-dom)
  • fix(deps): update dependency @wordpress/shortcode to v3.58.0
  • fix(deps): update dependency classnames to v2.5.1
  • fix(deps): update dependency sax to v1.4.1
  • fix(deps): update docusaurus monorepo to v3.4.0 (@docusaurus/core, @docusaurus/preset-classic)
  • chore(deps): update eslint (major) (@typescript-eslint/eslint-plugin, @typescript-eslint/parser, eslint, eslint-config-prettier, eslint-plugin-jest, eslint-plugin-prettier)
  • chore(deps): update major deps (major) (@ckeditor/ckeditor5-build-classic, @ckeditor/ckeditor5-engine, @commitlint/cli, @commitlint/config-conventional, @types/markdown-it, @types/node, @types/parse5, @types/prettier, @wordpress/shortcode, actions/checkout, actions/github-script, actions/setup-node, actions/upload-artifact, conventional-changelog-core, entities, husky, jsdom, lerna, lint-staged, prettier, react, react-dom)

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

Detected dependencies

github-actions
.github/workflows/ci.yml
  • actions/checkout v3
  • actions/setup-node v3
  • actions/checkout v3
  • actions/setup-node v3
.github/workflows/docs.yml
  • actions/checkout v3
  • actions/setup-node v3
.github/workflows/perf.yml
  • actions/setup-node v3
  • actions/checkout v3
  • actions/checkout v3
  • actions/github-script v6
  • actions/upload-artifact v3
.github/workflows/prerelease.yml
  • actions/github-script v6
  • actions/github-script v3
  • actions/checkout v3
  • actions/setup-node v3
  • actions/github-script v6
  • actions/github-script v6
.github/workflows/release.yml
  • actions/checkout v3
  • actions/setup-node v3
npm
package.json
  • @babel/core 7.24.7
  • @babel/plugin-proposal-class-properties 7.18.6
  • @babel/preset-env 7.24.7
  • @babel/preset-react 7.24.7
  • @babel/preset-typescript 7.24.7
  • @ckeditor/ckeditor5-build-classic 37.0.1
  • @ckeditor/ckeditor5-engine 35.3.2
  • @commitlint/cli 17.8.1
  • @commitlint/config-conventional 17.8.1
  • @condenast/perf-kit 0.1.4
  • @types/chance 1.1.6
  • @types/entities 2.0.0
  • @types/jest 29.5.12
  • @types/jsdom 21.1.7
  • @types/markdown-it 12.2.3
  • @types/minimist 1.2.5
  • @types/node 18.19.39
  • @types/parse5 6.0.3
  • @types/prettier 2.7.3
  • @types/react 18.2.70
  • @types/react-dom 18.2.22
  • @types/sax 1.2.7
  • @types/wordpress__shortcode 2.3.6
  • @typescript-eslint/eslint-plugin 5.58.0
  • @typescript-eslint/parser 5.58.0
  • babel-jest 29.7.0
  • chance 1.1.11
  • commonmark 0.31.0
  • commonmark-spec 0.31.2
  • conventional-changelog-core 4.2.4
  • eslint 8.38.0
  • eslint-config-prettier 8.8.0
  • eslint-plugin-jest 27.2.1
  • eslint-plugin-prettier 4.2.1
  • husky 8.0.3
  • jest 29.7.0
  • jest-environment-jsdom 29.7.0
  • jsdom 21.1.2
  • lerna 6.6.2
  • lint-staged 13.2.3
  • markdown-it 14.1.0
  • minimist 1.2.8
  • prettier 2.8.8
  • react 17.0.2
  • react-dom 17.0.2
  • ts-loader 9.4.4
  • typescript 5.4.5
  • uuid-random 1.3.2
packages/@atjson/document/package.json
  • uuid-random ^1.3.0
packages/@atjson/hir/package.json
packages/@atjson/offset-annotations/package.json
packages/@atjson/react/package.json
  • react *
packages/@atjson/renderer-commonmark/package.json
packages/@atjson/renderer-graphviz/package.json
packages/@atjson/renderer-hir/package.json
packages/@atjson/renderer-html/package.json
  • entities ^4.3.1
packages/@atjson/renderer-plain-text/package.json
packages/@atjson/renderer-react/package.json
  • react *
packages/@atjson/renderer-webcomponent/package.json
packages/@atjson/source-ckeditor/package.json
packages/@atjson/source-commonmark/package.json
  • entities ~4.5.0
  • markdown-it 14.1.0
packages/@atjson/source-gdocs-paste/package.json
packages/@atjson/source-html/package.json
  • parse5 ^7.1.2
packages/@atjson/source-mobiledoc/package.json
packages/@atjson/source-prism/package.json
  • @types/sax ^1.2.7
  • entities ^4.5.0
  • sax ^1.3.0
packages/@atjson/source-url/package.json
packages/@atjson/source-wordpress-shortcode/package.json
  • @wordpress/shortcode 3.54.0
packages/@atjson/util/package.json
website/package.json
  • @babel/plugin-proposal-class-properties 7.18.6
  • @babel/plugin-proposal-object-rest-spread 7.20.7
  • @babel/preset-typescript 7.24.7
  • @docusaurus/core 3.0.0
  • @docusaurus/preset-classic 3.0.0
  • classnames 2.3.2
  • react 18.2.0
  • react-dom 18.2.0
  • resize-observer 1.0.4
  • styled-components 6.1.11
  • @types/react 18.2.70
  • @types/styled-components 5.1.34
nvm
.nvmrc
  • node 20.12.2

  • Check this box to trigger a request for Renovate to run again on this repository

Unicode normalization of content

What are the results of this discussion?

The character position within a Unicode string depends on whether the string is normalized and which Unicode normalization form is used. atjson should specify that content strings be normalized, to avoid character position mismatches and to ensure that the same content results in the same character sequence.

When are two content strings assumed to be equivalent? Does atjson recommend or require a Unicode normalization form, and if so, which one?
I recommend NFC (Normalization Form Canonical Composition).
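
The position mismatch is easy to reproduce with the standard `String.prototype.normalize` method. The sample strings and variable names are only for illustration:

```typescript
// "e" followed by U+0301 COMBINING ACUTE ACCENT: two code units for one letter.
const decomposed = "Cafe\u0301";
// NFC composes the pair into the single code point U+00E9.
const composed = decomposed.normalize("NFC");

// An annotation covering the accented letter needs different offsets
// depending on which form the content string is stored in.
const accentInDecomposed = { start: 3, end: decomposed.length };
const accentInComposed = { start: 3, end: composed.length };

console.log(decomposed.length, composed.length);
console.log(decomposed === composed);
console.log(accentInDecomposed.end, accentInComposed.end);
```

Both strings display identically, but they differ in length and are not strictly equal, so annotation offsets computed against one form do not apply to the other. If normalization were applied to existing documents, annotation positions would also need to be remapped.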

Add performance regression testing on pull requests

We currently have some fairly rudimentary performance regression testing in renderer-commonmark. It turns out that it's really hard to catch these regressions, because they don't cause any test failures, nor do the tests give any indication of whether a change in atjson caused any relative change in performance.

The proposal here is the following:

  • move these benchmarks / integration tests to a top level tests folder
  • run a GitHub action on each pull request that stores the previous performance runs in memory and provides an indication, with a probability, of whether a change in performance is actually meaningful.
  • improve our performance tests so we can track down performance regressions more accurately.

Write schema interface

Implement schema interface as decided in #311

Please include type "macros" for grabbing annotation names and annotation classes.
Given a schema:

import { Bold, Italic } from "@atjson/offset-annotations";
const MySchema = {
  annotations: {
    Bold,
    Italic
  }
};

The annotations name type should return a type of "Bold" | "Italic" and the annotations class type should return typeof Bold | typeof Italic.

You can reference the document-in-test branch on this repository for some examples of how to do this. Ask @tim-evans if you have questions on handling this via conditional types.
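
As a starting point, the two type "macros" could be written with indexed access types. This is a sketch only: the stand-in `Annotation`, `Bold`, and `Italic` classes replace the real imports so the example is self-contained, and `AnnotationNamesOf`, `AnnotationClassesOf`, and `annotationNames` are hypothetical names rather than what the document-in-test branch uses:

```typescript
// Stand-ins for the classes from @atjson/document and @atjson/offset-annotations.
class Annotation {
  constructor(public start: number, public end: number) {}
}
class Bold extends Annotation {
  static type = "bold";
}
class Italic extends Annotation {
  static type = "italic";
}

interface SchemaDefinition {
  annotations: { [key: string]: typeof Annotation };
}

const MySchema = {
  annotations: {
    Bold,
    Italic,
  },
};

// Union of the annotation names: "Bold" | "Italic" for MySchema.
type AnnotationNamesOf<S extends SchemaDefinition> = keyof S["annotations"] & string;

// Union of the annotation classes: typeof Bold | typeof Italic for MySchema.
type AnnotationClassesOf<S extends SchemaDefinition> =
  S["annotations"][AnnotationNamesOf<S>];

// Runtime counterpart of AnnotationNamesOf.
function annotationNames<S extends SchemaDefinition>(schema: S): AnnotationNamesOf<S>[] {
  return Object.keys(schema.annotations) as AnnotationNamesOf<S>[];
}

// Both assignments type-check; "Underline" or an unrelated class would not.
const someName: AnnotationNamesOf<typeof MySchema> = "Bold";
const SomeClass: AnnotationClassesOf<typeof MySchema> = Italic;
```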

Experiment with converters as renderers

Conceptually, renderers are a more general case of converters, being essentially a function from a document to any type. With the idea from #285 of adding additional safeguards and guarantees around renderer implementations, writing converters as renderers could help ensure that the converter satisfies some useful properties such as handling all the possible annotations in the source document.

(Written by @colinarobinson ❤️)

Natural/native renderers

There is potential to use atjson strictly as a document manipulation tool for any format for which we have a source defined. For example:

// Add target "_blank" to all links
let doc = HTMLSource.fromRaw(someHtmlString);
let links = doc.where({type: "-html-a"});

links.update(link => {
  link.attributes.target = "_blank";
});

One thing that makes this difficult currently is that in order to render this back to HTML, we have to convert the document to a common format, which seems unnecessary. It would be nice if each source could define its own "natural" or "native" renderer which acts on the schema of the source rather than on a common format schema. We have currently only been writing renderers acting on a common format schema in order to avoid the temptation of writing converters between every pair of document formats, but it seems this proposed natural renderer still is in line with that philosophy since it only acts on the source schema.

Currently, there is nothing stopping users from adding these renderer definitions. I wonder if we would want to formalize it in the api somehow:

// Add target "_blank" to all links
let doc = HTMLSource.fromRaw(someHtmlString);
let links = doc.where({type: "-html-a"});

links.update(link => {
  link.attributes.target = "_blank";
});
let newHtmlString = doc.render(); // calls the natural renderer of the source doc 

It is possible that the natural rendering of a document produces different results than converting to a common format and using the existing renderers. I'm not sure if that is a problem or not.

      Converter 
Source ---> Common Format
      \     |
       \    |
        \   |  HTMLRenderer
Natural  \  |
Renderer  \ |
           === html?

What are the results of this discussion?

Expressing equivalency classes for annotations

We often have to determine whether two documents are equivalent, which is exposed as document.equals(). This is implemented by comparing the canonical versions of the documents and checking that their content and annotations are equal. For annotations, this equivalency is implemented by checking their start and end positions match, and then doing a deep comparison of their attributes properties.

However, an annotation might have some properties, particularly in their attributes, which might not represent a meaningful difference. For example, if an annotation was created during a conversion, it is sometimes useful to include some properties from the original annotation in the converted version as signposts for verification. These properties should be ignored when determining if two annotations are equivalent.

It's currently possible to override equals on the annotation but we could provide nicer hooks. One possibility is to add a declarative API to annotations where one could list these 'non-data' attributes.
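
A sketch of what the declarative option might look like, assuming the 'non-data' attributes are given as a list of top-level attribute names. `isEquivalent` and the `nonDataAttributes` parameter are hypothetical, and the comparison relies on JSON serialization, so it only suits JSON-compatible attribute values:

```typescript
interface ComparableAnnotation {
  type: string;
  start: number;
  end: number;
  attributes: Record<string, unknown>;
}

// Sorts object keys recursively so serialization does not depend on key order.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (typeof value === "object" && value !== null) {
    const source = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      sorted[key] = canonicalize(source[key]);
    }
    return sorted;
  }
  return value;
}

// Drops the listed top-level attributes before comparing.
function dataAttributes(
  attributes: Record<string, unknown>,
  nonDataAttributes: string[]
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const key of Object.keys(attributes)) {
    if (nonDataAttributes.indexOf(key) === -1) result[key] = attributes[key];
  }
  return result;
}

function isEquivalent(
  a: ComparableAnnotation,
  b: ComparableAnnotation,
  nonDataAttributes: string[] = []
): boolean {
  if (a.type !== b.type || a.start !== b.start || a.end !== b.end) return false;
  const left = canonicalize(dataAttributes(a.attributes, nonDataAttributes));
  const right = canonicalize(dataAttributes(b.attributes, nonDataAttributes));
  return JSON.stringify(left) === JSON.stringify(right);
}
```

With an empty list, the check behaves like a strict attribute comparison; listing a signpost attribute makes a converted annotation equivalent to one created without it.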

What are the results of this discussion?

Action Required: Fix Renovate Configuration

There is an error with this repository's Renovate configuration that needs to be fixed. As a precaution, Renovate will stop PRs until it is resolved.

Error type: undefined. Note: this is a nested preset so please contact the preset author if you are unable to fix it yourself.

💬 Discuss: Project layout

What are the results of this discussion?

Package names and project layout after #57 is merged.

Annotation classes begin the process of binding sources and renderers more tightly together. It's becoming clearer to me that it would be easier to move renderers and sources into the same package.

renderer-commonmark + source-commonmark = format-commonmark

There are also some questions about where render code for annotations should go. One thought is that rendering code for annotations should live next to the annotations. A possible file structure for this is:

📁 annotations
  📁 bold
    📄 annotation.ts
    📄 component.ts
    📄 style.css
    📄 template.html

Apple News Renderer / Source

Our brands use Apple News as a distribution channel, and we have expertise in publishing to Apple News. We have a service at Condé Nast that supports this, and I'm interested in outlining a more holistic approach to Apple News that provides better support for folks.

From a high level, I'd like to provide the following:

  • Rendering an Apple News article.json that can be sent to Apple for use on the News app.
  • Previewing an Apple News article.json using a React application by leveraging atjson's React renderer.
  • Importing an Apple News article.json so there can be rich preview (and eventual addition / manipulation) using atjson

I played around with type definitions for the Apple News Format, and ended up creating a programmatically generated type definition file from Apple's own documentation, which can be found in this gist. The code used to generate this file can be found here.

Using these definitions, we can create annotations for the Apple News format. I recommend that the renderer and source for Apple News use the same annotations. This means that if you want to render to Apple News, you should convert your schema into the Apple News annotations.

At a high level, I expect us to have annotations for Components and the ArticleDocument. There's some additional parsing for handling inline formatting provided by HTML or markdown, which is fairly minimal (bold, italic, strikethrough, links, and a few others).

We can then use the React renderer to render a facsimile of how the article would appear on the News app. This would involve a bunch of work building out components for Apple News. For our teams at Condé, it would be very beneficial, because they can preview their content as it would (approximately) appear on Apple News before publishing or sending it to Apple.

In addition, they would be able to catch errors with the document because the preview would be using the same toolchain as our service delivering the article.json to Apple.

The goal here is to:

  • streamline work at Condé
  • provide Apple News specific capabilities to enrich / customize content
  • provide great tools that other editorial teams can benefit from

Image "description" should be string; not a document

I was asked by @balaclark about writing a test against a React component that maps to the Image annotation, and looked into the commonmark spec again for what the alt text should do. It turns out that commonmark expects the alt text to have markdown stripped when parsing.

What do you expect to happen?

Image#attributes.description should be a string.

Note: We are also seeing failures related to this on our nightly job that verifies that we can rewrite our content using converters that we've defined.

What happened instead?

Image#attributes.description is a Document and renders markdown, which is supposed to be stripped according to https://spec.commonmark.org/0.29/#image-description

📓 Example Document

![Markdown **is stripped** from *this*](test.jpg)
{
  "content": "",
  "contentType": "application/vnd.atjson+commonmark",
  "annotations": [
    {
      "type": "-commonmark-image",
      "start": 0,
      "end": 0,
      "attributes": {
        "-commonmark-src": "test.jpg",
        "-commonmark-alt": "Markdown is stripped from this"
      }
    },
    {
      "type": "-commonmark-paragraph",
      "start": 0,
      "end": 0,
      "attributes": {}
    }
  ]
}

Deficient Complex Boundary Behaviour

@neilius encountered an issue where it's impossible to insert text between two adjacent annotations, and have the text not covered by either.

What do you expect to happen?

Given two annotations, { start: 10, end: 20 } and { start: 20, end: 30 }, it should be possible to insert text at position 20 and have the resulting annotations be { start: 10, end: 20 } and { start: 21, end: 31 }

This is not the default insertion behaviour, but it should be an option.

What happened instead?

Instead, we're able to modify the boundary behaviour for one or the other annotations, but not both. The default behaviour of insertText results in { start: 10, end: 21 } and { start: 21, end: 31 } (the first annotation is extended right, and the second annotation's coverage is unmodified), and the AdjacentBoundaryBehaviour.preserve behaviour results in { start: 10, end: 20 } and { start: 20, end: 31 } (i.e., the first annotation is unmodified and second annotation is extended left to include the new text).

📓 Example Document

Before: https://files.slack.com/files-pri/T5Y8VC3HU-FT9GKNJ9M/before.json
After (with preserve): https://files.slack.com/files-pri/T5Y8VC3HU-FT7MA4HMK/after.json

Proposal

@tim-evans @neilius and I discussed, and agreed that a good possible solution is to add a (backwards-compatible) way to specify boundary behaviour for text insertion for adjacent annotations both before and after the insertion. This should clarify the relevant bit of code, since currently both before and after boundaries are handled in the same method, despite being subtly different in their handling.
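
A sketch of the proposal, assuming the two boundaries are controlled by separate flags. The `InsertBehaviour` options and this standalone `insertText` function are hypothetical and are not the existing insertText or AdjacentBoundaryBehaviour API:

```typescript
interface AnnotationRange {
  start: number;
  end: number;
}

interface InsertBehaviour {
  // Whether annotations ending at the insertion point grow to cover the new text.
  extendEnding: boolean;
  // Whether annotations starting at the insertion point move after the new text.
  pushStarting: boolean;
}

function insertText(
  ranges: AnnotationRange[],
  position: number,
  length: number,
  behaviour: InsertBehaviour
): AnnotationRange[] {
  return ranges.map(({ start, end }) => {
    const newStart =
      start > position || (start === position && behaviour.pushStarting)
        ? start + length
        : start;
    const newEnd =
      end > position || (end === position && behaviour.extendEnding)
        ? end + length
        : end;
    // Keep zero-length annotations at the insertion point well-formed.
    return { start: newStart, end: Math.max(newStart, newEnd) };
  });
}

const adjacent = [
  { start: 10, end: 20 },
  { start: 20, end: 30 },
];

// Matches the default behaviour described above.
const extended = insertText(adjacent, 20, 1, { extendEnding: true, pushStarting: true });
// Matches the preserve behaviour described above.
const preserved = insertText(adjacent, 20, 1, { extendEnding: false, pushStarting: false });
// The requested option: the new text is covered by neither annotation.
const uncovered = insertText(adjacent, 20, 1, { extendEnding: false, pushStarting: true });
```

Handling the two sides with independent flags keeps the existing behaviours expressible while adding the missing combination.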

Add support for common embed codes from around the web

A lot of websites have embed codes to embed their content directly into other apps, and we currently don't do anything smart about it for atjson.

The proposal here is to detect and convert a set of very common embed codes so folks can paste these codes directly in and have it be understood correctly.

  • YouTube
    <iframe width="560" height="315" src="https://www.youtube.com/embed/BriBDiBxaMY" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
  • Instagram
    <blockquote class="instagram-media" data-instgrm-captioned data-instgrm-permalink="https://www.instagram.com/p/B3M8RM-HqdP/?utm_source=ig_embed&amp;utm_campaign=loading" data-instgrm-version="12" style=" background:#FFF; border:0; border-radius:3px; box-shadow:0 0 1px 0 rgba(0,0,0,0.5),0 1px 10px 0 rgba(0,0,0,0.15); margin: 1px; max-width:540px; min-width:326px; padding:0; width:99.375%; width:-webkit-calc(100% - 2px); width:calc(100% - 2px);"><div style="padding:16px;"> <a href="https://www.instagram.com/p/B3M8RM-HqdP/?utm_source=ig_embed&amp;utm_campaign=loading" style=" background:#FFFFFF; line-height:0; padding:0 0; text-align:center; text-decoration:none; width:100%;" target="_blank"> <div style=" display: flex; flex-direction: row; align-items: center;"> <div style="background-color: #F4F4F4; border-radius: 50%; flex-grow: 0; height: 40px; margin-right: 14px; width: 40px;"></div> <div style="display: flex; flex-direction: column; flex-grow: 1; justify-content: center;"> <div style=" background-color: #F4F4F4; border-radius: 4px; flex-grow: 0; height: 14px; margin-bottom: 6px; width: 100px;"></div> <div style=" background-color: #F4F4F4; border-radius: 4px; flex-grow: 0; height: 14px; width: 60px;"></div></div></div><div style="padding: 19% 0;"></div> <div style="display:block; height:50px; margin:0 auto 12px; width:50px;"><svg width="50px" height="50px" viewBox="0 0 60 60" version="1.1" xmlns="https://www.w3.org/2000/svg" xmlns:xlink="https://www.w3.org/1999/xlink"><g stroke="none" stroke-width="1" fill="none" fill-rule="evenodd"><g transform="translate(-511.000000, -20.000000)" fill="#000000"><g><path d="M556.869,30.41 C554.814,30.41 553.148,32.076 553.148,34.131 C553.148,36.186 554.814,37.852 556.869,37.852 C558.924,37.852 560.59,36.186 560.59,34.131 C560.59,32.076 558.924,30.41 556.869,30.41 M541,60.657 C535.114,60.657 530.342,55.887 530.342,50 C530.342,44.114 535.114,39.342 541,39.342 C546.887,39.342 551.658,44.114 551.658,50 C551.658,55.887 
546.887,60.657 541,60.657 M541,33.886 C532.1,33.886 524.886,41.1 524.886,50 C524.886,58.899 532.1,66.113 541,66.113 C549.9,66.113 557.115,58.899 557.115,50 C557.115,41.1 549.9,33.886 541,33.886 M565.378,62.101 C565.244,65.022 564.756,66.606 564.346,67.663 C563.803,69.06 563.154,70.057 562.106,71.106 C561.058,72.155 560.06,72.803 558.662,73.347 C557.607,73.757 556.021,74.244 553.102,74.378 C549.944,74.521 548.997,74.552 541,74.552 C533.003,74.552 532.056,74.521 528.898,74.378 C525.979,74.244 524.393,73.757 523.338,73.347 C521.94,72.803 520.942,72.155 519.894,71.106 C518.846,70.057 518.197,69.06 517.654,67.663 C517.244,66.606 516.755,65.022 516.623,62.101 C516.479,58.943 516.448,57.996 516.448,50 C516.448,42.003 516.479,41.056 516.623,37.899 C516.755,34.978 517.244,33.391 517.654,32.338 C518.197,30.938 518.846,29.942 519.894,28.894 C520.942,27.846 521.94,27.196 523.338,26.654 C524.393,26.244 525.979,25.756 528.898,25.623 C532.057,25.479 533.004,25.448 541,25.448 C548.997,25.448 549.943,25.479 553.102,25.623 C556.021,25.756 557.607,26.244 558.662,26.654 C560.06,27.196 561.058,27.846 562.106,28.894 C563.154,29.942 563.803,30.938 564.346,32.338 C564.756,33.391 565.244,34.978 565.378,37.899 C565.522,41.056 565.552,42.003 565.552,50 C565.552,57.996 565.522,58.943 565.378,62.101 M570.82,37.631 C570.674,34.438 570.167,32.258 569.425,30.349 C568.659,28.377 567.633,26.702 565.965,25.035 C564.297,23.368 562.623,22.342 560.652,21.575 C558.743,20.834 556.562,20.326 553.369,20.18 C550.169,20.033 549.148,20 541,20 C532.853,20 531.831,20.033 528.631,20.18 C525.438,20.326 523.257,20.834 521.349,21.575 C519.376,22.342 517.703,23.368 516.035,25.035 C514.368,26.702 513.342,28.377 512.574,30.349 C511.834,32.258 511.326,34.438 511.181,37.631 C511.035,40.831 511,41.851 511,50 C511,58.147 511.035,59.17 511.181,62.369 C511.326,65.562 511.834,67.743 512.574,69.651 C513.342,71.625 514.368,73.296 516.035,74.965 C517.703,76.634 519.376,77.658 521.349,78.425 C523.257,79.167 525.438,79.673 
528.631,79.82 C531.831,79.965 532.853,80.001 541,80.001 C549.148,80.001 550.169,79.965 553.369,79.82 C556.562,79.673 558.743,79.167 560.652,78.425 C562.623,77.658 564.297,76.634 565.965,74.965 C567.633,73.296 568.659,71.625 569.425,69.651 C570.167,67.743 570.674,65.562 570.82,62.369 C570.966,59.17 571,58.147 571,50 C571,41.851 570.966,40.831 570.82,37.631"></path></g></g></g></svg></div><div style="padding-top: 8px;"> <div style=" color:#3897f0; font-family:Arial,sans-serif; font-size:14px; font-style:normal; font-weight:550; line-height:18px;"> View this post on Instagram</div></div><div style="padding: 12.5% 0;"></div> <div style="display: flex; flex-direction: row; margin-bottom: 14px; align-items: center;"><div> <div style="background-color: #F4F4F4; border-radius: 50%; height: 12.5px; width: 12.5px; transform: translateX(0px) translateY(7px);"></div> <div style="background-color: #F4F4F4; height: 12.5px; transform: rotate(-45deg) translateX(3px) translateY(1px); width: 12.5px; flex-grow: 0; margin-right: 14px; margin-left: 2px;"></div> <div style="background-color: #F4F4F4; border-radius: 50%; height: 12.5px; width: 12.5px; transform: translateX(9px) translateY(-18px);"></div></div><div style="margin-left: 8px;"> <div style=" background-color: #F4F4F4; border-radius: 50%; flex-grow: 0; height: 20px; width: 20px;"></div> <div style=" width: 0; height: 0; border-top: 2px solid transparent; border-left: 6px solid #f4f4f4; border-bottom: 2px solid transparent; transform: translateX(16px) translateY(-4px) rotate(30deg)"></div></div><div style="margin-left: auto;"> <div style=" width: 0px; border-top: 8px solid #F4F4F4; border-right: 8px solid transparent; transform: translateY(16px);"></div> <div style=" background-color: #F4F4F4; flex-grow: 0; height: 12px; width: 16px; transform: translateY(-4px);"></div> <div style=" width: 0; height: 0; border-top: 8px solid #F4F4F4; border-left: 8px solid transparent; transform: translateY(-4px) 
translateX(8px);"></div></div></div></a> <p style=" margin:8px 0 0 0; padding:0 4px;"> <a href="https://www.instagram.com/p/B3M8RM-HqdP/?utm_source=ig_embed&amp;utm_campaign=loading" style=" color:#000; font-family:Arial,sans-serif; font-size:14px; font-style:normal; font-weight:normal; line-height:17px; text-decoration:none; word-wrap:break-word;" target="_blank">4 of 6 “I consider myself [to be] a crazy cat lady: I have six cats and a dog. I have a strong connection with them. They call me the ‘cat whisperer’. We understand each other without understanding each other. We don’t need language. It’s another level of love. ⠀ “I have a [tattoo of] cat on my arm that says: ‘Doing things? No thanks.’ I have an Arabic tattoo that says: ‘Why are you frowning?’ I have a cartoon by Coco Capitán that says: ‘Is it tomorrow yet?’ I have a lot of sarcastic tattoos. I know this is a bit macabre—but I know I’m going to die. I know that my body is something I can play with in the meantime.” ⠀ Follow Beirut-based model @noursaliba_ and her story this week on @vogue.</a></p> <p style=" color:#c9c8cd; font-family:Arial,sans-serif; font-size:14px; line-height:17px; margin-bottom:0; margin-top:8px; overflow:hidden; padding:8px 0 7px; text-align:center; text-overflow:ellipsis; white-space:nowrap;">A post shared by <a href="https://www.instagram.com/vogue/?utm_source=ig_embed&amp;utm_campaign=loading" style=" color:#c9c8cd; font-family:Arial,sans-serif; font-size:14px; font-style:normal; font-weight:normal; line-height:17px;" target="_blank"> Vogue International</a> (@vogue) on <time style=" font-family:Arial,sans-serif; font-size:14px; line-height:17px;" datetime="2019-10-04T16:00:27+00:00">Oct 4, 2019 at 9:00am PDT</time></p></div></blockquote> <script async src="//www.instagram.com/embed.js"></script>
  • Twitter
    <blockquote class="twitter-tweet"><p lang="en" dir="ltr">i just keep getting hotter and smarter</p>&mdash; skelejenn (@jennschiffer) <a href="https://twitter.com/jennschiffer/status/708888255828250625?ref_src=twsrc%5Etfw">March 13, 2016</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
  • Facebook
    <iframe src="https://www.facebook.com/plugins/post.php?href=https%3A%2F%2Fwww.facebook.com%2FCondeNastTraveler%2Fposts%2F10157914140568982&width=500" width="500" height="491" style="border:none;overflow:hidden" scrolling="no" frameborder="0" allowTransparency="true" allow="encrypted-media"></iframe>
  • Other embed codes that use the standard <iframe> code.

Extract out performance profiling into its own repository

We currently have a pretty dope profiling framework thanks mostly to @bachbui and @colinarobinson, which we'd like to use to profile other parts of the atjson ecosystem that we have. (Also for other JS libs that we'd like to profile!)

Tasks:

  • Name it!
  • Define the package structure / how it should work for clients

Add efficient method for annotation class comparison, "is"

#269 introduced some checking to atjson that was resilient to whether the annotation constructor matched. We have also found during our testing that instanceof is a fairly expensive operation, and we would like to use it as little as possible.

I had the idea to generalize the problem of isBold / isItalic to an is method:

function is<T extends AnnotationConstructor>(
  annotation: Annotation<any>,
  Class: T
): annotation is InstanceType<T> {
  let AnnotationClass = annotation.constructor as AnnotationConstructor;
  return AnnotationClass.vendorPrefix === Class.vendorPrefix &&
    annotation.type === Class.type;
}

We could then use this like so:

import { is } from "@atjson/document";

if (is(annotation, Bold)) {
  // This is now type narrowed to "Bold"!
}

🏖 TypeScript Playground →
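
To illustrate why comparing vendorPrefix and type is more forgiving than instanceof, here is a minimal, self-contained sketch. SketchAnnotation, SketchConstructor and the two Bold classes are stand-ins invented for this example, not the real @atjson/document types; the two Bold classes simulate the same annotation being loaded twice from different bundles. The signature is also slightly simplified compared to the proposal above.

```typescript
// Stand-in base class (hypothetical; not the real atjson Annotation).
class SketchAnnotation {
  static vendorPrefix = "";
  static type = "";
  constructor(public start: number, public end: number) {}
  get type(): string {
    return (this.constructor as unknown as { type: string }).type;
  }
}

// A constructor that carries the static fields used for comparison.
type SketchConstructor<T extends SketchAnnotation> = {
  vendorPrefix: string;
  type: string;
  new (start: number, end: number): T;
};

// Compares by vendor prefix and type instead of by constructor identity.
function is<T extends SketchAnnotation>(
  annotation: SketchAnnotation,
  Class: SketchConstructor<T>
): annotation is T {
  const AnnotationClass = annotation.constructor as unknown as {
    vendorPrefix: string;
  };
  return (
    AnnotationClass.vendorPrefix === Class.vendorPrefix &&
    annotation.type === Class.type
  );
}

// Two copies of "the same" class, as if bundled twice.
class BoldFromBundleA extends SketchAnnotation {
  static vendorPrefix = "offset";
  static type = "bold";
}
class BoldFromBundleB extends SketchAnnotation {
  static vendorPrefix = "offset";
  static type = "bold";
}
class Italic extends SketchAnnotation {
  static vendorPrefix = "offset";
  static type = "italic";
}

const boldInstance: SketchAnnotation = new BoldFromBundleA(0, 4);
const matchesByInstanceof = boldInstance instanceof BoldFromBundleB;
const matchesByIs = is(boldInstance, BoldFromBundleB);
const italicMatchesBold = is(new Italic(0, 4), BoldFromBundleB);
```

In this sketch instanceof fails across the duplicated classes while is still matches, which is the resilience #269 was after.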

Tombstoned attributes

Currently, when we're doing conversions, we have a big grab bag of attributes that are semi-messily put into the same property on the annotation— attributes.

We've encountered a bunch of issues doing so, where we've added properties that aren't meaningful to the output to the annotation for metrics purposes, which have caused some issues with determining annotation equality. (Side note: we're using canonical document equality to determine whether we can edit a document in our rich text editor)

Another issue that has come up a few times is a TypeScript issue where we are casting an attribute as any to get around the type system, because we're using one of these "grab bag" attributes.

The suggestion here is to add a graveyard / tombstoned attributes property to the Annotation class that will serve as a way to safely access these attributes. These attributes will not affect annotation equality (that is, if two annotations are identical except for their tombstoned attributes, they will be treated as if they were equal).

Proposal

Add a new property to annotations that is set on initialization of an annotation called $attributes. This will contain all attributes that are in a different "vendor space" than the current annotation. (This is the first pass of this feature, since doing this more correctly requires us to have a schema that we can directly read from in the code).

abstract class Annotation<Attributes = {}> {
  id: string;
  start: number;
  end: number;
  attributes: Attributes;
  $attributes: { [key: string]: unknown };
}

$attributes will be serialized as-is into JSON, resulting in a JSON object that looks like: attributes & $attributes.

For example:

import HTMLSource from "@atjson/source-html";
import OffsetSource from "@atjson/offset-annotations";

let doc = HTMLSource
  .fromRaw(`<h1 style="text-align:center;">Guilt</h1>`)
  .convertTo(OffsetSource);

let [heading] = doc.annotations.where({ type: "-offset-heading" });
console.log(heading);
// Heading {
//   start: 0,
//   end: 41,
//   attributes: {},
//   $attributes: {
//     "-html-style": "text-align:center;"
//   }
// }
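
A rough sketch of how the split and the equality rule could work, under the assumption that raw attribute keys carry a vendor prefix such as -offset- or -html-. The helper names splitAttributes and isEqual are hypothetical and only illustrate the proposal; they are not existing atjson APIs.

```typescript
type AttributeBag = { [key: string]: unknown };

interface SketchAnnotation {
  type: string;
  start: number;
  end: number;
  attributes: AttributeBag;
  $attributes: AttributeBag;
}

// Hypothetical helper: keys in the annotation's own vendor space become
// `attributes` (prefix removed); everything else is tombstoned.
function splitAttributes(
  vendorPrefix: string,
  raw: AttributeBag
): { attributes: AttributeBag; $attributes: AttributeBag } {
  const prefix = "-" + vendorPrefix + "-";
  const attributes: AttributeBag = {};
  const $attributes: AttributeBag = {};
  for (const key of Object.keys(raw)) {
    if (key.indexOf(prefix) === 0) {
      attributes[key.slice(prefix.length)] = raw[key];
    } else {
      $attributes[key] = raw[key];
    }
  }
  return { attributes, $attributes };
}

// Shallow comparison of two attribute bags.
function sameAttributes(a: AttributeBag, b: AttributeBag): boolean {
  const aKeys = Object.keys(a);
  return (
    aKeys.length === Object.keys(b).length &&
    aKeys.every(
      key => Object.prototype.hasOwnProperty.call(b, key) && a[key] === b[key]
    )
  );
}

// Equality deliberately ignores $attributes.
function isEqual(a: SketchAnnotation, b: SketchAnnotation): boolean {
  return (
    a.type === b.type &&
    a.start === b.start &&
    a.end === b.end &&
    sameAttributes(a.attributes, b.attributes)
  );
}

const split = splitAttributes("offset", {
  "-offset-level": 1,
  "-html-style": "text-align:center;"
});
const withStyle: SketchAnnotation = {
  type: "heading", start: 0, end: 41,
  attributes: split.attributes, $attributes: split.$attributes
};
const withoutStyle: SketchAnnotation = {
  type: "heading", start: 0, end: 41,
  attributes: { level: 1 }, $attributes: {}
};
const otherLevel: SketchAnnotation = {
  type: "heading", start: 0, end: 41,
  attributes: { level: 2 }, $attributes: {}
};
```

The comparison here is shallow on purpose; nested attribute values would need a deep comparison.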

Unicode whitespace is stripped from leading and trailing positions in markdown paragraphs

Following up on #307, we should handle a longer list of unicode whitespace characters:

Name Code Point Entity Size
No Break Space \u00A0 &nbsp; 👉 👈
En Quad \u2000 &#8192; 👉 👈
Em Quad \u2001 &#8193; 👉 👈
En Space \u2002 &ensp; 👉 👈
Em Space \u2003 &emsp; 👉 👈
Thick Space \u2004 &#8196; 👉 👈
Mid Space \u2005 &#8197; 👉 👈
Six-per-em Space \u2006 &#8198; 👉 👈
Figure Space \u2007 &#8199; 👉 👈
Punctuation Space \u2008 &#8200; 👉 👈
Thin Space \u2009 &#8201; 👉 👈
Hair Space \u200A &#8202; 👉 👈
Zero Width Space \u200B &#8203; 👉​👈
Narrow No-break Space \u202F &#8239; 👉 👈
Medium Mathematical Space \u205F &#8287; 👉 👈
Ideographic Space \u3000 &#12288; 👉 👈
Zero Width No-break Space \uFEFF &#65279; 👉👈

I think this is a fairly exhaustive list of spaces, but if any more should be added, please comment 😄
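
As a starting point, the characters in the table could be detected at the edges of a paragraph with something like the sketch below. The function name splitEdgeWhitespace is made up for this example, and what the renderer should then do with the leading and trailing runs (for instance, emit them as entities) is left open. Note that JavaScript's \s does not match the zero width space (\u200B), so the code points are listed explicitly.

```typescript
// Character class built from the code points listed in the table.
// \u2000-\u200B covers En Quad through Zero Width Space.
const EDGE_CLASS = "[\\u00A0\\u2000-\\u200B\\u202F\\u205F\\u3000\\uFEFF]";
const LEADING_RUN = new RegExp("^" + EDGE_CLASS + "+");
const TRAILING_RUN = new RegExp(EDGE_CLASS + "+$");

// Splits a paragraph into its leading whitespace run, body, and trailing run.
// Plain ASCII spaces are intentionally not treated as edge whitespace here.
function splitEdgeWhitespace(
  text: string
): { leading: string; body: string; trailing: string } {
  const leadingMatch = LEADING_RUN.exec(text);
  const leading = leadingMatch ? leadingMatch[0] : "";
  const rest = text.slice(leading.length);
  const trailingMatch = TRAILING_RUN.exec(rest);
  const trailing = trailingMatch ? trailingMatch[0] : "";
  const body = rest.slice(0, rest.length - trailing.length);
  return { leading, body, trailing };
}

const edges = splitEdgeWhitespace("\u00A0\u200Bhello\u3000");
const asciiOnly = splitEdgeWhitespace(" hello");
```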

Remove commented-out code

What are the results of this discussion?

Commented-out code has been removed from the codebase and has found a more appropriate location.

While looking at the text insertion, I noticed a few instances of commented-out code still lingering in the codebase. We should get rid of this, either relying on version control for instances where we need to reference this code in the future or moving it to documentation if it's intended to illustrate functionality for users of the library.

A few examples: in insertText and in deleteText

Add safeguards around Unknown annotations and unhandled annotations in renderers

As per a discussion that we had here, we found that it's particularly annoying when unknown annotations occur and when rendering cases aren't handled.

Unknown annotations are created automatically by atjson, and we'd like to make this an explicit decision by the user, because often this is not intentional and causes bugs.

Sussing out what to do about this structurally will take some time, but for now, we'd like to do the following:

  • Add assertions when encountering any UnknownAnnotation in a Renderer.
  • Add warnings when a Renderer doesn't have a rendering hook for a given annotation. Having no render method for an annotation is undefined behavior in atjson, and the result differs for each renderer. This logging will improve introspection for applications that use @atjson/renderer-react.

We expect that these will make the ergonomics of rendering and converting a bit easier.
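
A minimal sketch of what the two safeguards could look like. Everything here is an assumption made for illustration: renderOne, RenderHooks, the "unknown" type string, and the choice to pass children through when a hook is missing are not taken from an actual atjson renderer.

```typescript
interface SketchAnnotation {
  type: string;
  start: number;
  end: number;
}

type RenderHooks = {
  [type: string]: (annotation: SketchAnnotation, children: string) => string;
};

// Renders a single annotation, asserting on unknown annotations and
// warning (instead of silently continuing) when no hook exists.
function renderOne(
  annotation: SketchAnnotation,
  children: string,
  hooks: RenderHooks,
  warn: (message: string) => void
): string {
  // Assumption: unknown annotations are marked with the type "unknown".
  if (annotation.type === "unknown") {
    throw new Error(
      "Unknown annotation at " + annotation.start + ".." + annotation.end +
        "; convert or remove it before rendering."
    );
  }
  if (!Object.prototype.hasOwnProperty.call(hooks, annotation.type)) {
    warn('No rendering hook for "' + annotation.type + '"; passing children through.');
    return children;
  }
  return hooks[annotation.type](annotation, children);
}

const warnings: string[] = [];
const collectWarning = (message: string): void => {
  warnings.push(message);
};
const sketchHooks: RenderHooks = {
  bold: (_annotation, children) => "<strong>" + children + "</strong>"
};

const renderedBold = renderOne({ type: "bold", start: 0, end: 4 }, "feel", sketchHooks, collectWarning);
const renderedUnhandled = renderOne({ type: "italic", start: 0, end: 4 }, "feel", sketchHooks, collectWarning);
```

Injecting warn as a parameter keeps the sketch testable; a real renderer would more likely log through console.warn or a shared logger.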

Orphaned parse tokens on modified documents.

I'm curious about what, if anything, should be done about parse tokens that are no longer relevant after a document has been modified.

Let's say we have the following Markdown:

{
    "body": "I **feel** happy."
}

When we ingest it and look at the document annotations, we have:

    [ { type: 'parse-token',
        start: 0,
        end: 1,
        attributes: { type: 'paragraph_open' } },
      { type: 'parse-token',
        start: 3,
        end: 4,
        attributes: { type: 'strong_open' } },
      { type: 'parse-token',
        start: 8,
        end: 9,
        attributes: { type: 'strong_close' } },
      { type: 'bold', start: 3, end: 9, attributes: {} },
      { type: 'parse-token',
        start: 16,
        end: 17,
        attributes: { type: 'paragraph_close' } },
      { type: 'paragraph', start: 0, end: 17, attributes: {} } ]

Now if we modify the document and delete the bold type annotation, the strong_ parse-token type annotations remain.

doc.where({ type: 'bold' }).transform(bold => doc.removeAnnotation(bold as Annotation));

The annotations are then:

    [ { type: 'parse-token',
        start: 0,
        end: 1,
        attributes: { type: 'paragraph_open' } },
      { type: 'parse-token',
        start: 3,
        end: 4,
        attributes: { type: 'strong_open' } },
      { type: 'parse-token',
        start: 8,
        end: 9,
        attributes: { type: 'strong_close' } },
      { type: 'parse-token',
        start: 16,
        end: 17,
        attributes: { type: 'paragraph_close' } },
      { type: 'paragraph', start: 0, end: 17, attributes: {} } ]

This could lead to confusion or bloated documents down the road, but is this likely enough to be a concern?
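
One possible approach, sketched on plain objects rather than on the real Document API: when an annotation is removed, also remove the parse tokens that sit on its start or end boundary. The function removeWithParseTokens is hypothetical, and the boundary rule is a heuristic assumed for this example.

```typescript
interface PlainAnnotation {
  type: string;
  start: number;
  end: number;
  attributes: { [key: string]: unknown };
}

// Removes `target` plus any parse-token that lies inside its range and
// touches its start or end position.
function removeWithParseTokens(
  annotations: PlainAnnotation[],
  target: PlainAnnotation
): PlainAnnotation[] {
  return annotations.filter(annotation => {
    if (annotation === target) {
      return false;
    }
    const isBoundaryToken =
      annotation.type === "parse-token" &&
      annotation.start >= target.start &&
      annotation.end <= target.end &&
      (annotation.start === target.start || annotation.end === target.end);
    return !isBoundaryToken;
  });
}

// The annotations from the example above.
const boldAnnotation: PlainAnnotation = { type: "bold", start: 3, end: 9, attributes: {} };
const before: PlainAnnotation[] = [
  { type: "parse-token", start: 0, end: 1, attributes: { type: "paragraph_open" } },
  { type: "parse-token", start: 3, end: 4, attributes: { type: "strong_open" } },
  { type: "parse-token", start: 8, end: 9, attributes: { type: "strong_close" } },
  boldAnnotation,
  { type: "parse-token", start: 16, end: 17, attributes: { type: "paragraph_close" } },
  { type: "paragraph", start: 0, end: 17, attributes: {} }
];

const after = removeWithParseTokens(before, boldAnnotation);
```

The heuristic would over-delete when two annotations share a boundary (say, bold and italic both starting at 3), so a real fix would probably need an explicit link between an annotation and its tokens.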

Correctly type check annotation constructors.

Currently, our type checking around annotation constructors isn't strict enough and allows invalid construction of annotations.

What do you expect to happen?

Instantiating a new annotation should be a type error when the annotation requires attributes and the attributes are omitted from the constructor call.
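
One way this could be expressed in TypeScript is with a conditional type that makes the attributes key optional only when the annotation has no required attributes. This is a sketch under assumptions: SketchAnnotation, SketchInit and RequiredKeys are names invented here, and the real Annotation constructor takes more fields than shown.

```typescript
// Keys of T that are required (not declared with `?`).
type RequiredKeys<T> = {
  [K in keyof T]-?: {} extends Pick<T, K> ? never : K;
}[keyof T];

// `attributes` may be omitted only when no attribute is required.
type SketchInit<A> = { start: number; end: number } & ([RequiredKeys<A>] extends [never]
  ? { attributes?: A }
  : { attributes: A });

class SketchAnnotation<A = {}> {
  start: number;
  end: number;
  attributes: A;

  constructor(init: SketchInit<A>) {
    // The cast sidesteps property access on a not-yet-resolved conditional type.
    const { start, end, attributes } = init as unknown as {
      start: number;
      end: number;
      attributes?: A;
    };
    this.start = start;
    this.end = end;
    this.attributes = attributes === undefined ? ({} as unknown as A) : attributes;
  }
}

class Bold extends SketchAnnotation {}
class Link extends SketchAnnotation<{ url: string; title?: string }> {}

// Fine: Bold has no required attributes.
const boldSketch = new Bold({ start: 0, end: 4 });

// Fine: the required `url` attribute is provided.
const linkSketch = new Link({
  start: 0,
  end: 4,
  attributes: { url: "https://example.com" }
});

// Rejected by the compiler, because `attributes` is required for Link:
// new Link({ start: 0, end: 4 });
```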

HTML attributes are not escaped when rendering a document to HTML

When rendering a document to HTML, double quotes inside attribute values aren't escaped, which produces broken markup.

What do you expect to happen?

Attribute values should be escaped when rendering to HTML.

📓 Example Document

import OffsetSource, { Link } from "@atjson/offset-annotations";
import HTMLRenderer from "@atjson/renderer-html";

let doc = new OffsetSource({
  content: "Malika Favre’s “Sweeping Into Fall”",
  annotations: [
    new Link({
      start: 0,
      end: 35,
      attributes: {
        url: "https://www.newyorker.com/culture/cover-story/cover-story-2019-09-09",
        title: "Malika Favre’s \"Sweeping Into Fall\""
      }
    })
  ]
});

HTMLRenderer.render(doc);

Expected result:

<a
  href="https://www.newyorker.com/culture/cover-story/cover-story-2019-09-09"
  title="Malika Favre’s &quot;Sweeping Into Fall&quot;">Malika Favre’s “Sweeping Into Fall”</a>
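
For reference, inside a double-quoted HTML attribute value a double quote is written as &quot; (backslash escapes have no meaning in HTML). A minimal sketch of the escaping follows; the function names are invented for this example and are not part of @atjson/renderer-html.

```typescript
// Escapes a string for use inside a double-quoted HTML attribute value.
// `&` is replaced first so that later entities are not escaped twice.
function escapeHtmlAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Escapes a string for use as HTML text content.
function escapeHtmlText(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical link renderer that applies both helpers.
function renderLinkSketch(url: string, title: string, text: string): string {
  return (
    '<a href="' + escapeHtmlAttribute(url) +
    '" title="' + escapeHtmlAttribute(title) +
    '">' + escapeHtmlText(text) + "</a>"
  );
}

const escapedTitle = escapeHtmlAttribute('Malika Favre’s "Sweeping Into Fall"');
const renderedLink = renderLinkSketch("https://example.com/?a=1&b=2", 'Say "hi"', "A < B");
```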

Getting started with atjson

I'm keen to learn how to use this - had spoken with @blaine about it at a recent p2p gathering. I've visited this repo a couple of times but got swamped reading the source.

Here's my current best attempt from reading and copying examples in the code. It's not yet working, and I don't know what it looks like when it is working:

import Document from '@atjson/document'

// Web components in the registry can't be redefined,
// so reload the page on every change
if (module.hot) {
  module.hot.dispose(() => {
    window.location.reload()
  })
}

document.addEventListener('DOMContentLoaded', () => {
  let editor = document.querySelector('offset-editor')

  let doc = new Document({
    content: 'Some text that is both bold and italic plus something after.',
    annotations: [
      { type: 'bold', display: 'inline', start: 23, end: 31 },
      { type: 'link', display: 'inline', start: 20, end: 24, attributes: { url: 'https://google.com' } },
      { type: 'italic', display: 'inline', start: 28, end: 38 },
      { type: 'underline', display: 'inline', start: 28, end: 38 },
      { type: 'paragraph', display: 'block', start: 0, end: 61 }
    ]
  })

  editor.setDocument(doc)

  console.log('done!')
})

I get an error:

TypeError: options.annotations is undefined

Are these modules still in use? Are they being deprecated?
I would love a demo repo showing me how to wire things together. My ideal use case is being able to have a rich accessible editor, a way to output the content/annotations, then a way to render content/annotations once it's "published" (to scuttlebutt).

I care about this because we need to move beyond markdown to be more accessible for a wider range of peers ... and the promise of tidy extensibility is really exciting.

Document.annotations.in(<schema>)

Right now the Document is the primary typed entity in atjson, which works fairly well in practice but is conceptually a little confusing. When we convert documents between different sources what we're really converting is the collection of annotations associated with that document. I think it would be conceptually simpler if the notion of a type or annotation schema existed primarily at the level of the annotation.

I propose adding a few types to the library:
AnnotationCollection<T extends AnnotationSchema>, with two public members:

  • convertTo<U extends AnnotationSchema>(schema: U): AnnotationCollection<U> is Document.convertTo, just scoped to work on a list of annotations instead.
  • in<U extends AnnotationSchema>(schema: U): AnnotationCollection<U> returns a new AnnotationCollection where any unknown annotations from the original collection that belong to schema are made known, and all other annotations are made unknown.

AnnotationSchema is the parent type of schema definitions. This would take over much of the current role occupied by Document and its subtypes. Converters would be defined between subtypes of AnnotationSchema rather than subtypes of Document. They would also obviously 'own' the definition of their annotations, and would have a content type string for marking their annotations during de/serialization.

This proposal doesn't give the library any additional power (since a document is just a piece of un-annotated media and an AnnotationCollection, and the media portion doesn't at all complicate the associated schema), but I think it would be helpful for explaining the library and would help separate independent concerns within the system.
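
To make the shape of the proposal more concrete, here is a rough sketch of in. The fields on AnnotationSchema (contentType, vendorPrefix, knownTypes) and the known flag on each annotation are assumptions made for this example only; convertTo is left out because it depends on how converters would be registered between schemas.

```typescript
// Hypothetical schema definition: owns a vendor prefix and a list of types.
interface AnnotationSchema {
  contentType: string;
  vendorPrefix: string;
  knownTypes: string[];
}

interface CollectedAnnotation {
  vendorPrefix: string;
  type: string;
  start: number;
  end: number;
  known: boolean;
}

class AnnotationCollection<T extends AnnotationSchema> {
  constructor(
    readonly schema: T,
    readonly annotations: CollectedAnnotation[]
  ) {}

  // Returns a new collection viewed through `schema`: annotations that the
  // schema defines become known, all others become unknown.
  in<U extends AnnotationSchema>(schema: U): AnnotationCollection<U> {
    const annotations = this.annotations.map(annotation => ({
      vendorPrefix: annotation.vendorPrefix,
      type: annotation.type,
      start: annotation.start,
      end: annotation.end,
      known:
        annotation.vendorPrefix === schema.vendorPrefix &&
        schema.knownTypes.indexOf(annotation.type) !== -1
    }));
    return new AnnotationCollection(schema, annotations);
  }
}

const htmlSchema: AnnotationSchema = {
  contentType: "example/html", // placeholder value
  vendorPrefix: "html",
  knownTypes: ["h1", "a"]
};
const offsetSchema: AnnotationSchema = {
  contentType: "example/offset", // placeholder value
  vendorPrefix: "offset",
  knownTypes: ["heading", "link"]
};

const htmlView = new AnnotationCollection(htmlSchema, [
  { vendorPrefix: "html", type: "h1", start: 0, end: 41, known: true },
  { vendorPrefix: "offset", type: "heading", start: 0, end: 41, known: false }
]);
const offsetView = htmlView.in(offsetSchema);
```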

Project-wide integration tests

We have quite robust unit testing in atjson, but we have encountered quite a few cases where we want to establish systemic properties of atjson as a set of tools and have found that difficult to do in unit tests. A good example of testing such a property is checking that our CommonMark rendering and parsing is compliant with the CommonMark spec.

We'd also like to test additional properties of atjson in a full-mesh approach where we can test the rigor of source libraries. This may be best done with property-based testing, as suggested by @colinarobinson, to make the testing sufficiently generic while having full coverage.

Meeting Notes 9/26

Attendees: @blaine / @balaclark / @gmedina / @tim-evans

Open source maintainership

Basic idea is to follow an open source model: submit a ticket / issue to GitHub, it's discussed, and then we implement!

Good Contributions

  • Documentation

Things to look forward to

How do we manage versioning of the content format (change over time)?

Ideas

  • React renderer to turn nested documents into a react node instead of a doc. (So, props.caption could be used directly in react instead of having to render them again)

Add formal specification of atjson format

One lesson I learned in decades of working with data: without specification and validation, eventually data will not conform to an expected data format anymore. If atjson format is going to be used beyond its current software, it needs a formal specification.

I've started a JSON Schema in my specification branch. The format is quite simple, so the schema is not complex. One open question I stumbled upon is whether the end position is optional (I don't think so).

To validate atjson documents against the schema, I am looking for actual examples and use cases. Validation should at least be added to the unit tests.
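
Until a schema is settled, the kind of check it would encode can be sketched by hand. The rules below are assumptions drawn from the examples in these issues (a content string plus annotations with type, start and end), and end is treated as required, following the tentative answer above. validateDocument is a made-up name, not an atjson API.

```typescript
function isNonNegativeInteger(value: unknown): boolean {
  return (
    typeof value === "number" &&
    isFinite(value) &&
    Math.floor(value) === value &&
    value >= 0
  );
}

// Returns a list of human-readable problems; an empty list means valid.
function validateDocument(value: unknown): string[] {
  if (typeof value !== "object" || value === null) {
    return ["document must be an object"];
  }
  const candidate = value as { content?: unknown; annotations?: unknown };
  const errors: string[] = [];
  if (typeof candidate.content !== "string") {
    errors.push("content must be a string");
  }
  if (!Array.isArray(candidate.annotations)) {
    errors.push("annotations must be an array");
    return errors;
  }
  candidate.annotations.forEach((item: unknown, index: number) => {
    const path = "annotations[" + index + "]";
    if (typeof item !== "object" || item === null) {
      errors.push(path + " must be an object");
      return;
    }
    const annotation = item as { type?: unknown; start?: unknown; end?: unknown };
    if (typeof annotation.type !== "string") {
      errors.push(path + ".type must be a string");
    }
    const startOk = isNonNegativeInteger(annotation.start);
    const endOk = isNonNegativeInteger(annotation.end);
    if (!startOk) {
      errors.push(path + ".start must be a non-negative integer");
    }
    if (!endOk) {
      errors.push(path + ".end must be a non-negative integer");
    }
    if (startOk && endOk && (annotation.start as number) > (annotation.end as number)) {
      errors.push(path + ".start must not exceed end");
    }
  });
  return errors;
}

const validErrors = validateDocument({
  content: "I feel happy.",
  annotations: [{ type: "bold", start: 2, end: 6, attributes: {} }]
});
const missingEndErrors = validateDocument({
  content: "I feel happy.",
  annotations: [{ type: "bold", start: 2 }]
});
```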
