json-schema-org / json-schema-spec

The JSON Schema specification

Home Page: http://json-schema.org/

License: Other

Topics: json-schema, api-documentation, json, validation, jsonschema

json-schema-spec's Introduction

Welcome to JSON Schema

Project status: Active – the project has reached a stable, usable state and is being actively developed.

JSON Schema is a vocabulary that allows you to validate, annotate, and manipulate JSON documents.

This repository contains the work-in-progress sources for the next set of JSON Schema IETF Internet-Draft (I-D) documents. For the latest released I-Ds, please see the Specification page on the website.

Call for contributions and feedback

Reviews, comments and suggestions are most welcome! Please read our guidelines for contributing.

Status

For the current status of issues and pull requests, please see the following labels

Available In Progress Review Needed

Critical High Medium Low

Labels are assigned based on Sensible Github Labels.

Authoring and Building

Specification

To build the spec files to HTML from the Markdown sources, run npm run build. You can also build each individually with npm run build-core and npm run build-validation.

The spec is built using Remark, a Markdown engine with good plugin support and a large ecosystem of existing plugins we can use.

Plugins

The following is a not-necessarily-complete list of configured plugins and the features they make available to you.

  • remark-lint -- Enforce the Markdown style guide.

  • remark-validate-links -- Check for broken links.

  • remark-gfm -- Adds support for GitHub Flavored Markdown (GFM) features such as autolink literals, footnotes, strikethrough, tables, and task lists.

  • remark-heading-id -- Adds support for {#my-anchor} syntax to add an id to an element so it can be referenced using URI fragment syntax.

  • remark-headings -- A collection of enhancements for headings.

    • Adds hierarchical section numbers to headings.
      • Use the [Appendix] prefix on headings that should be numbered as an appendix.
    • Adds id anchors to headers that don't have one
      • Example: #section-2-13
      • Example: #appendix-a
    • Makes the heading a link utilizing its anchor
  • remark-reference-links -- Adds new syntax for referencing a section of the spec using the section number as the link text.

    • Example:
    ## Foo {#foo}
    
    ## Bar
    This is covered in {{foo}} // --> Renders to "This is covered in [Section 2.3](#foo)"
    - Link text will use "Section" or "Appendix" as needed
  • remark-table-of-contents -- Adds a table of contents in a section with a header called "Table of Contents".

  • remark-code-titles -- Add titles to code blocks

    • Example:
      \`\`\`jsonschema "My Fun Title"
      { "type": "string" }
      \`\`\`
    • The languages jsonschema and json have special styling
    • The title will be parsed as a JSON string, but you have to double-escape escaped characters. So, to get My "quoted" title, you would need to write "My \\\\"quoted\\\\" title".
  • remark-torchlight -- Syntax highlighting and more using https://torchlight.dev. Features include line numbers and line highlighting.

  • remark-flexible-containers -- Add a callout box using the following syntax. Supported container types are warning, note, and experimental.

    ::: {type} {title}
    {content}
    :::
    

Internet-Drafts

To build components that are being maintained as IETF Internet-Drafts, run make. The Makefile will create the necessary Python venv for you as part of the regular make target.

make clean will remove all output including the venv. To clean just the spec output and keep the venv, use make spec-clean.

If you want to run xml2rfc manually after running make for the first time, you will need to activate the virtual environment: source .venv/bin/activate.

The version of "xml2rfc" that this project uses is updated by modifying requirements.in and running pip-compile requirements.in.

Descriptions of the xml2rfc, I-D documents, and RFC processes:

Test suites

Conformance tests for JSON Schema and its vocabularies may be found in their own repository.

The website

The JSON Schema web site is at http://json-schema.org/

The source for the website is maintained in a separate repository.

Contributors

Code Contributors

This project exists thanks to all the people who contribute. [Contribute].

Financial Contributors

Become a financial contributor and help us sustain our community. [Contribute]

Sponsors

Here are our top sponsors. You could be next! [Become a sponsor]


License

The contents of this repository are licensed under either the BSD 3-clause license or the Academic Free License v3.0.

json-schema-spec's People

Contributors

about-code, adamvoss, awwright, benjagm, chapel, davidlehn, dependabot[bot], dlax, dlongley, dvv, fge, garycourt, geraintluff, gregsdennis, handrews, hkosova, jdesrosiers, julian, jviotti, karenetheridge, kriszyp, laurie71, levbishop, nickl-, notethan, nschonni, relequestual, richievos, ssilverman, vanou


json-schema-spec's Issues

v6 hyper-schema: extended templating (vars+template)

Originally written by @geraintluff at https://github.com/json-schema/json-schema/wiki/Extended-templating-syntax-(v5-proposal)

Proposed keywords

Uses existing keyword - proposes extension of href in LDOs.

(Should this syntax also be allowed inside "$ref" to allow templating of references?)

Purpose

Currently, the only values available for templating in href are the object itself, and the immediate children of the object (which must be referred to by their exact name).

This proposed new syntax would allow more powerful templating, specifying values from the data using Relative JSON Pointer.

It would also allow the re-naming of template variables. This is useful because in some templates, the variable name is actually included in the results, e.g.

  • /prefix/{?foo,bar,baz} -> /prefix/?foo=1&bar=2&baz=3

Values

In addition to the existing string values/behaviour for href, the following is proposed:

The value of href may be an object, containing the following properties:

  • template - a URI Template, taking the role of the existing string form of href
  • vars - an object mapping template variable names to Relative JSON Pointers into the instance data

Behaviour

To obtain the URI for the link, the URI template in template is expanded. When a variable is referenced by the template, its value is obtained like so:

  • if the variable name is a defined property in vars, then:
    • the corresponding value in vars is interpreted as a Relative JSON Pointer, and resolved relative to the current data instance.
    • the result of resolving the relative pointer is used as the value for the variable in the template
  • otherwise, the variable name is percent-decoded and taken as the name of an immediate property.

(Note the complete lack of pre-processing rules - they are not needed here, due to the expressive power of Relative JSON Pointers.)

Example

Data:

{
    "author": {"id": 241, "name": "Jenny"},
    ...
}

Schema:

{
    "links": [
        {
            "rel": "author",
            "href": {
                "template": "/users/{authorId}",
                "vars": {
                    "authorId": "0/author/id"
                }
            }
        }
    ]
}

Concerns

Mismatched pre-processing rules

This syntax is in many ways much simpler than the existing syntax, because there is no need for escaping rules. (The current syntax does pre-processing using (, ) and $.)

We are faced with a choice, then - to make the new syntax equally complex, or to have complex pre-processing rules for URI Templates in some situations but not others. (Or, of course, remove the old plain-string syntax, which will impact brevity as well as backwards-compatibility.)

Usage inside $ref

Allowing templating inside $ref would force all validators to implement link-parsing - currently, validators can ignore all hyper-schema aspects, which is convenient.

Use inside $ref would limit static analysis for schemas. However, (like $data) allowing this keyword in $ref would open up quite a lot of expressive power.

Use inside $ref would also ruin our ability to describe $ref relationships in the meta-schema. Currently, the $ref behaviour is characterised by a full link, but allowing templating would undermine that.

Current behaviour, and rel="describedby"

The behaviour of templating $ref can currently be mirrored by adding a rel="describedby" link:

{
    "links": [{
        "rel": "describedby",
        "href": "/schemas/{type}"
    }]
}

The only difference is that validators are not obliged to take any notice of links. "Hyper-validators" should, but it is not expected that plain validators would.

Add restrictions on how to pull schemas over HTTP

Misbehaved clients might pose a problem if they pull a schema over the network every time an instance is validated against it, even though the schema could be cached for a long period of time. Server owners won't like JSON Schema very much if this becomes a problem.

JSON Schema does not rely on or need HTTP, even if schemas are referenced with an http or https URI. However, in some hypermedia cases, it is still useful to download schemas over the network.

For these cases, add a section about behavior of clients when they make HTTP requests:

  • Clients SHOULD set or prepend a User-Agent header specific to the JSON Schema implementation, not merely the HTTP library being used (if any). For example, instead of User-Agent: curl/7.43.0, use User-Agent: so-cool-json-schema/1.0.2 curl/7.43.0. Since symbols are listed in decreasing order of significance, the JSON Schema library name/version goes first, then the more generic HTTP library name (if any).
  • Clients SHOULD set a From header so that server operators can contact the owner of a potentially misbehaving script.
  • Clients SHOULD observe caching headers and not re-request documents within their freshness period

Support `autoIncrement` and `indexes` properties

I would like to see standardization on the following properties for any given schema (especially the root): autoIncrement and keyPath, as allowed for stores in IndexedDB. Perhaps these could be within a store-metadata property or such.

I'd also like to see an indexes property which could standardize on at least three subproperties used in IndexedDB indexes: unique, multiEntry, and keyPath. I'd also hope the i18n work (which I still intend to expand on in PR #12) would tie into Mozilla's locale property, so that such a property would not need to be added in this location as well. Perhaps a name property could also be provided to allow for complete store and index generation.

Such properties could be used for auto-generating database stores based on the schemas, without needing to rebuild for each database implementation. IndexedDB is a particularly suitable choice, imo, since it is (or at least seems it will be) ubiquitous in browsers, and can also be used as an API in server-side implementations; the most promising for Node, imo, appears to be IndexedDBShim (with this PR), and possibly also for NoSQL implementations that lack an IndexedDB API.
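
A hedged sketch of how such properties might look on a root schema, assuming illustrative keyword names (autoIncrement, keyPath, indexes) drawn from the discussion above; none of this is part of any draft:

{
    "type": "object",
    "keyPath": "id",
    "autoIncrement": true,
    "indexes": {
        "byTitle": {
            "keyPath": "title",
            "unique": false,
            "multiEntry": false
        }
    },
    "properties": {
        "id": { "type": "integer" },
        "title": { "type": "string" }
    }
}

A tool could read these annotations to create an IndexedDB object store and its indexes directly from the schema.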

v6 hyper-schema: propertyLinks

Originally written by @geraintluff at https://github.com/json-schema/json-schema/wiki/propertyLinks-(v5-proposal)

Proposed keywords

  • propertyLinks

Purpose

Currently, we can describe the format of a parent object (with title/description/etc.), and we can describe the format of a child instance (with title/description/etc.) but we cannot describe the relationship between the two.

Motivating example

For instance, say you have an instance representing a programming project:

{
    "title": "AwesomeNet",
    "author": {
        "name": "Petra the Programmer",
        "homepage": "..."
    }
}

When describing the format for this instance, you can easily write a schema for the parent format:

{
    "title": "Programming project",
    "description": "A description of a programming project",
    "type": "object",
    ...
}

... and for the child format:

{
    "title": "Person"
    "description": "A representation of a person",
    "type": "object",
    ...
}

These are completely accurate descriptions of the format, but there is no way to express anything about the relationship between the two - considered on its own, the entry in "author" is indeed a "Person", but its relationship to the parent is more than that.

One current hack is to extend the "Person" format, so you can give it a title/whatever. However, that isn't really accurate - nothing has changed about the format at all, it's the parent-child link that's the interesting part.

Values

The value of propertyLinks would be an object whose keys are property names. The values would themselves be objects, containing zero or more of the following properties:

  • title - a name for the parent/child relationship
  • description - a more detailed description of the parent/child relationship
  • rel - a link relation (URI) representing the parent/child relationship

Example

Extending the "Motivating example" above:

{
    "title": "Programming project",
    "description": "A description of a programming project",
    "type": "object",
    "properties": {
        ...,
        "author": {"$ref": "/schemas/user"}
    },
    "propertyLinks": {
        "author": {
            "title": "Author",
            "rel": "http://schema.org/author"
        }
    }
}

Concerns

Duplicated keys

In that example, you end up listing "author" twice in the schema.

However, the alternatives are either an intermediate object (hard-to-read and not concise) or sticking extra info in the child schema (which requires ugly/awkward allOf extension, and still suffers from similar conceptual concerns to the existing workaround).

annotation: Multilingual meta data

This was originally proposed on the old wiki at https://github.com/json-schema/json-schema/wiki/multilingual-meta-data-(v5-proposal) by @geraintluff with further contributions from @brettz9 and @sonnyp

The translation alternative discussed at the end of this comment was originally proposed at https://github.com/json-schema/json-schema/wiki/translations-(v5-proposal) by @geraintluff based on an email thread with @fge.

Proposed keywords

This proposal modifies the existing properties:

  • title
  • description

This proposal would also apply to the named enumerations proposed in issue #57 , if that makes it in.

Purpose

This modification would allow inclusions of multiple translated values for the specified properties.

Currently, schemas can only specify meta-data in one language at a time. Different localisations may be requested by the client using the HTTP Accept-Language header, but that requires multiple (largely redundant) requests to get multiple localisations, and is only available over HTTP (not when pre-loading schemas, for instance).

Values

In addition to the current string values (which are presumed to be in the language of the document), the values of these keywords may be an object.

The keys of such an object should be IETF Language Tags, and the values must be strings.

Behaviour

When the value of the keyword is an object, the most appropriate language tag should be selected by the client, and the string value used as the value of the keyword.

Example

{
    "title": {
        "en": "Example schema",
        "de": "..."
    }
}

Concerns

Schemas with many languages could end up quite bulky.

In fact, the Accept-Language option is in many ways more elegant, as the majority of the time only one language will be used by the client (and the other localisations will simply be noise). However, this option is not available in all situations. One might also avoid the extra bulk by using JSON references (and thereby also enable localisation files to contain all translatable text).

An alternative approach to the above would be to reserve localeKey as a property for any schema object or sub-object and localization-strings as a top-level property:

{
    "localization-strings": {
        "en": {
            "example": {
                "title": "Example schema",
                "description": "Example schema description"
            }
        },
        "de": {
            "example": {}
        }
    },
    "type": "object",
    "localeKey": "example"
}

The advantage to this approach would be that, as typically occurs with locale files (for reasons of convenience in independent editing by different translators), all language strings could be stored together. Thus, if leveraging JSON references, it would be a simple matter of:

{
    "localization-strings": {
        "en": {
            "$ref": "locale_en-US.json"
        },
        "de": {
            "$ref": "locale_de.json"
        }
    },
    "type": "object",
    "localeKey": "example"
}

or yet simpler:

{
    "localization-strings": {"$ref": "locales.json"},
    "type": "object",
    "localeKey": "example"
}

Alternative: translation objects

This alternative proposes a translations keyword which would sit alongside the title and description keywords.

translation object Values

The value of translations would be an object whose keys are JSON Schema meta-data keywords (such as title and description). The values would themselves be objects, where each property key MUST be a language tag in accordance with RFC 3066.

Example translation object

When translating title and description, you can easily write an object where the keys under each meta keyword are RFC 3066 conformant language tags:

{
    "title": "postal code",
    "description": "A postal code.",
    "translations": { 
        "title": { 
            "en-GB": "postcode",
            "en-US": "zip code",
            "de": "Postleitzahl",
            "fr": "code postal"
        },
        "description": {
            "en-GB": "A Royal Mail postcode.",
            "en-US": "An USPS ZIP code.",
            //  ...
        }
    }
    //  ...
}

Translation object concerns: where to apply?

"What would be left to specify is of course what "relevant" is here.
Apart from "title", there is "description". But I don't think we want any other keyword to be affected."

Improve handling of oneOf error reporting

In cases where oneOf is used for multiple mutually exclusive options, it is frequently the case that the option to pick is a single key within the instance. E.g. the "type" property will be from the enum "Animal, Vegetable, Mineral" and the appropriate schema to apply is picked based on the value of this property.

Right now, all schemas must be tested (similar to an O(n) operation). Only one schema corresponding with an e.g. "type" property need be tested (similar to an O(log n) or O(1) operation), and only errors against that schema need be reported. Otherwise we get bizarre, non-helpful errors like so:

(1 of 1) instance.o3 is not exactly one from <http://example.org/Animal>,<http://example.org/Vegetable>,<http://example.org/Mineral>
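
For illustration, a hedged sketch of the discriminator pattern described above, with each oneOf branch keyed to a value of the "type" property (branch contents reduced to the discriminating property; the Animal/Vegetable/Mineral schemas themselves are elided):

{
    "oneOf": [
        {
            "properties": { "type": { "enum": ["Animal"] } },
            "required": ["type"]
        },
        {
            "properties": { "type": { "enum": ["Vegetable"] } },
            "required": ["type"]
        },
        {
            "properties": { "type": { "enum": ["Mineral"] } },
            "required": ["type"]
        }
    ]
}

An implementation that recognised this shape could test only the branch whose "type" value matches, and report errors only from that branch.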

Investigate CBOR compatibility

CBOR (RFC 7049, http://cbor.io/) is considered a binary version of JSON; however, it implements a superset of functionality, including native dates, byte (octet) strings (JSON strings are Unicode text), integers, URIs, and different storage formats for floating point and fixed- and variable-sized integers.

For draft-5, we need not add features specific to CBOR, but we should consider the ways that CBOR might be used and make sure there are no definitions in outright opposition to this goal.

What is the status of JSON Schema maintainers?

@ACubed
I am the current maintainer of JSON Schema Form (including the popular Angular Schema Form)
https://github.com/json-schema-form

That project has had a view definition for years, but recently I have begun working with other form-building tool providers to create a json-ui-schema: a universal view definition to marry up with a json-schema model definition, generating forms with more precision than you could, or perhaps should, with a json-schema definition alone. We have a pure JavaScript core and an Angular implementation, with React, Node and other ports potentially in the works soon.

I am concerned that some v5 proposals appear to be suggesting more view capability in json-schema (like choices), and I am interested in having a discussion with the maintainers, as I have not been able to get any email response from the json-schema github org members. @ACubed, I would love to talk to you and any other json-schema maintainers about collaboration options. You can reach me at, iamanthropic, at g mail .

IRI support (i.e. Unicode in URIs)

Most modern Web/hypermedia formats support IRIs instead of just URIs, which are limited to 7-bit ASCII. IRIs are a superset of URIs that support full Unicode. For standards that only support URIs, IRIs have to be converted/escaped into a URI-compatible format.

Defining a JSON-based incremental upgrade path for schemas and associated instance docs

Another topic related to IndexedDB (as in issue #17) and perhaps leveraging the same syntax proposed in #15 ...

Although versioning of schemas may not be of large consequence when server-side databases are in use (since an upgrade can in many cases be forced at once on all new visitors), with client-side IndexedDB in particular (and also for server-side databases accessed via Ajax that maintain separate stores for a given user), users may need to be allowed to continue operating with an old version of the database. When the version change can occur, there needs to be a safe migration path, potentially passing through multiple schema upgrades if the user has been making changes to the database offline long before visiting the site online again.

IndexedDB has an upgradeneeded event which can be leveraged for such migrations (and service workers could be used to grab the latest upgrades without the user needing to load a new page or refresh the old one). It would be handy for the IndexedDB-friendly JSON Schema (proposed in issue #17) to also have a formal JSON definition for expressing diffs between schemas, in a way which would also cause changes in the instance documents, even if it could not have the robustness of all potential programmatic changes (such as changing individual records between versions).

For example, one might wish to indicate that for version 2 of a schema, such-and-such a store should be added and an object modified, while for version 3, one store should be deleted, one schema object should be renamed, and one object should be moved elsewhere within the schema (and data also migrated--at least when "move" and "copy" operations are used on the schema diffs).
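
As a hedged illustration only, such a schema diff could resemble a JSON Patch (RFC 6902) document applied to the schema; the issue does not commit to any particular syntax:

[
    { "op": "add", "path": "/properties/newStore", "value": { "type": "object" } },
    { "op": "remove", "path": "/properties/oldStore" },
    { "op": "move", "from": "/properties/title", "path": "/properties/name" }
]

A migration tool could apply such a diff to the schema and, for "move" and "copy" operations, mirror the change in the stored instance documents.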

Add logical "if" schema

It would be great if there were a logical if schema in the vein of and/or/not. It could look like

{ "if": [ conditional, consequent ] }

whose semantics would be exactly equivalent to:

{ "or": [ { "not": conditional },
          { "and": [ conditional, consequent ] } ] }

Thank you!

Update json-schema/json-schema to redirect people here!

  • Edit the description to redirect people to this repo.
  • Add a github issues template to redirect people here.
  • Add a github pull request template to redirect people here.

I can't do this so... @awwright ?
I mean, I can make the templates and make a PR if it would help time wise.

allow more complex conditional branching for individual keys

Sorry for the somewhat verbose title there, but this is a proposal to discuss addition of a feature similar to Joi's when.

Joi lets you do something like:

{
  query: Joi.object({
    type: Joi.string().required(),
    value: Joi.string().max(255).when('type', {
      is: 'optionalValue',
      then: Joi.optional(),
      otherwise: Joi.required()
    })
  })
}

This is a bit of a contrived (and simple) example, but I think it shows how easy the mental mapping is versus json-schema. json-schema supports the same functionality, but it very quickly becomes difficult to define: you need to mix and match dependencies, and in the worst case you have to specify duplicate schemas with a single key's definition changed and jam them into an anyOf or oneOf. What's worse, when you have to use the aforementioned anyOf approach you generally end up with resultant errors which are incredibly difficult to reason about, when all you really need is a single error that a given field doesn't match the criteria (not, e.g., 3 errors: one indicating that the field was incorrect according to one schema, another indicating that the field is incorrect in the other possible schema, and finally that neither of the schemas provided in anyOf were matched).
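
For comparison, a hedged sketch of how the Joi example above might be expressed in draft-4 JSON Schema using the anyOf approach described (property names taken from the Joi snippet):

{
    "type": "object",
    "required": ["type"],
    "properties": {
        "type": { "type": "string" },
        "value": { "type": "string", "maxLength": 255 }
    },
    "anyOf": [
        {
            "properties": { "type": { "enum": ["optionalValue"] } }
        },
        {
            "not": { "properties": { "type": { "enum": ["optionalValue"] } } },
            "required": ["value"]
        }
    ]
}

A failure here produces errors from both anyOf branches rather than the single "value is required" message the Joi version implies.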

Is there any room for discussion on adding these types of features to later versions of json-schema?

Link attributes in JSON Hyper-schema

The HTTP Link header, HTML, and Atom each define slightly different attributes on link relations. Things like hints at the target resource's media type, language, title, and other metadata that would otherwise require dereferencing the resource.

Perhaps JSON Schema should normatively reference these link-extensions or similar.

Link relation for pointing to network-addressable schema

JSON Schema presently specifies to use the "profile" link relation, e.g.:

Link: <http://example.com/Schema.json>;rel="profile"

However, the "profile" link relation is not supposed to be dereferenced. There's no way for an automated user agent to be able to follow this link and actually download the schema if it doesn't already have it.

Perhaps publish a link relation type, similar to profile, that is supposed to be downloaded?

v6 hyper-schema: baseUri

Originally proposed by @geraintluff at https://github.com/json-schema/json-schema/wiki/baseUri-(v5-proposal)
The content below is exactly as it appears on the old wiki:

Proposed keywords

  • baseUri

Purpose

For convenience, specify a base URI against which schema-defined links will be resolved. This allows shorter href values.

Values

baseUri must be a URI Template (resolved against current base URI, or request URI).

(v4 actually mentioned that rel="self" links could be used for this, but that's not ideal.)

Example

{
    "baseUri": "/items/{id}/",
    "links": [
        {
            "rel": "comments",
            "href": "comments/"
        },
        {
            "rel": "related",
            "href": "related/"
        }
    ]
}

Concerns

Does this propagate into children? Either:

  • You have to also specify baseUri for every schema that defines links
  • baseUri applies to the data - at which point, what if multiple schemas have multiple values? Ideally, each schema would use its own baseUri for its own links, but that gets complicated when it comes to child properties.

$merge and $patch for "schema extensions"

And again, those are the only two reliable mechanisms allowing for truly extending schemas in an unambiguous fashion (yes, I love unambiguous definitions). Recall:

  • implementations would be REQUIRED to implement $merge;
  • implementations MAY implement $patch.

Why, then, define $patch? Simply because it allows for schema alterations which $merge cannot do. However, $merge answers the vast majority of cases.

Recall of the rules:

  • those keywords take precedence over all other JSON Schema keywords;
  • however, $ref still takes precedence.
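
A sketch of the $merge mechanism, assuming the source/with object form from the original v5 wiki proposal, shown only to make the idea concrete:

{
    "$merge": {
        "source": { "$ref": "http://json-schema.org/draft-04/schema#" },
        "with": {
            "properties": {
                "customKeyword": { "type": "string" }
            }
        }
    }
}

The result would behave like the draft-04 meta-schema extended with an extra customKeyword property; $patch would instead apply a JSON Patch (RFC 6902) document, allowing alterations (such as removing array entries) that a merge cannot express.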

v6 validation: "constant"

Originally written by @geraintluff at https://github.com/json-schema/json-schema/wiki/constant-(v5-proposal)

Proposed keywords

  • constant

Purpose

For ordinary use, this would be equivalent to a single-valued enum, simply tidier.

The only real difference comes in its behaviour with $data-substitution. $data-substitution would be allowed for this keyword, which means that this keyword is not capable of specifying literal values of the form: {"$data":...}, because these would be interpreted as $data-substitutions.

However, literal values of this form can still be specified using enum, so there is no loss of functionality.

Values

The value of this keyword would be any value - however, it would be subject to $data-substitution.

Validation

Instances are only valid if they are exactly equal to the value of this keyword.

Example

Simple constant

{
    "type": "object",
    "properties": {
        "five": {
            "constant": 5
        }
    }
}

Valid: {}, {"five": 5}
Invalid: {"five": 0}, {"five": "5"}

Using $data to specify equality

{
    "type": "object",
    "properties": {
        "a": {"type": "string"},
        "b": {
            "constant": {"$data": "1/a"}
        }
    },
    "required": ["a", "b"]
}

Valid: {"a": "foo", "b": "foo"}, {"a": "bar", "b": "bar"}
Invalid: {"a": "foo", "b": "bar"}

Concerns

Similarity to enum

Unless $data is being used, the same effect can be obtained using fewer actual characters:

  • {"constant":"whatever"} - 23 characters
  • {"enum":["whatever"]} - 21 characters

However, when used in combination with $data, it opens up possibilities that are not otherwise available.

Make id informational only

This would solve all the addressing woes currently plaguing JSON Schema.

If id becomes informational only, then it may be completely ignored as a requirement for addressing, which has the very nice consequence that addressing is now completely unambiguous, since it only relies on JSON Reference and JSON Pointer.

Implementations wishing to rely on id to define the current resolution context MAY do so; however, such implementations MUST NOT expect that peer implementations use this mechanism.
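
For illustration, a hedged sketch of the kind of addressing that would remain under this proposal: a $ref resolved purely by JSON Reference plus JSON Pointer, with any id values along the way carrying no addressing weight.

{
    "id": "http://example.com/root.json",
    "definitions": {
        "positiveInteger": {
            "id": "#positiveInteger",
            "type": "integer",
            "minimum": 1
        }
    },
    "properties": {
        "count": { "$ref": "#/definitions/positiveInteger" }
    }
}

The reference resolves unambiguously through the /definitions/positiveInteger pointer, whether or not an implementation also honours the inner id.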

A net win for everybody.

(I said I wasn't involved in JSON Schema anymore, but this problem is still nagging me, and this simple change makes implementation of JSON Schema MUCH easier)

Direct people to StackOverflow for questions that are not discussion based

I personally feel that using the Google Group for help is fine, but for people asking questions with clear problem statements, I would much rather direct them to StackOverflow (with a tag).

I strongly believe that specific solved problems or questions should be easy to find, and they would be much better suited to SO.
If agreed, I'll add a link under "more" on the website, but we will need to add a note about this on the Google Group, and agree that we should direct people there.

(Speaking of which, should we expand those who have admin rights on the Google Group? Myself and @awwright ?)

How to validate a hyperschema LDO ?

Hello there,

I wrote similar tools to committee, including a router with coercive validation for node.js / express …
There seem to be various implementations / opinions about how to validate the variables coming directly from (e.g. path) variables of an RFC 6570 URI template; see this gist.

Hyper-schema: "create" is not an IANA link relation

The JSON Hyper-schema spec currently lists the following

       {
           "title": "Post a comment",
           "rel": "create",
           "href": "/{id}/comments",
           "method": "POST",
           "schema": {
               "type": "object",
               "properties": {
                   "message": {
                       "type": "string"
                   }
               },
               "required": ["message"]
           }
       }

But this isn't a useful link relation because there's no such thing as a "create" link.

The place that would best define how to create resources would probably be AtomPub; see https://tools.ietf.org/rfc/rfc5023.txt

The specification does later define "create", but it should normatively reference the IANA registry instead.

Profiles are not Schemas

The current draft talks about a profile media type parameter and recommends using it to link to a JSON Schema. The existing https://tools.ietf.org/html/rfc6906 about profiles (which also talks about "profile" media type parameters) is not referenced, but at least there is some overlap. Please consider that the idea of profiles (as specified in RFC 6906) is not to interlink instances and schemas; instead, the idea is that a profile identifies a set of rules applied to an instance which can be processed with or without knowing the profile-specific processing rules.

v6 annotation: named enumerations

The Problem

Enumerations are often cryptic, particularly when they exist to match legacy systems that valued storage efficiency over readability. While it is possible to include more information with the title and description fields at the same level as the enum, it is not possible to associate any additional information with each enum value.

There are two use cases:

Documentation

This falls squarely within JSON Schema’s goals, and is simply about providing an easily-understood-by-humans string for each enum value.

UI Generation

This is analogous to the value+label tuples common in web application framework geared towards producing select widgets. While JSON Schema is intended to help build UIs, it is debatable as to whether this is enough of a core goal to motivate features on its own. See also issue #55

The Proposals

There have been several proposals to address this. The options so far are:

  • A parallel array of human-readable names under a different keyword adjacent to ”enum”
  • A parallel-ish array of [enumValue, humanName] tuples under a different keyword adjacent to ”enum”
  • Replacing the current “enum” array with an array of tuples of (enumValue, humanName)

Due to ”enum” values supporting any JSON type, it is not possible to have a JSON object mapping values to names. This is why lists of tuples are proposed instead.

@geraintluff proposed the parallel array of names, under the keyword ”enumNames”: https://github.com/json-schema/json-schema/wiki/enumNames-(v5-proposal)

@nemesisdesign proposed replacing with a tuple array, using the keyword ”choices”, drawn from web app frameworks: https://github.com/json-schema/json-schema/wiki/choices-(v5-proposal-to-enhance-enum)

@sam-at-github proposed the parallel-ish array of tuples, under the keyword ”enumLut” (although this is more or less the same as the proposed transitional period for moving the “choices”). See the comments in the issue filed for "choices" at the old repository (and also for a discussion of the validity of UI generation as a goal): json-schema/json-schema#211
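
To make the three shapes concrete, a hedged sketch using a two-value enumeration (keyword names as proposed above; none are part of a published draft):

A parallel array of names ("enumNames"):

{
    "enum": ["AL", "AK"],
    "enumNames": ["Alabama", "Alaska"]
}

A parallel-ish array of [value, name] tuples ("enumLut"):

{
    "enum": ["AL", "AK"],
    "enumLut": [["AL", "Alabama"], ["AK", "Alaska"]]
}

Replacing "enum" with an array of tuples ("choices"):

{
    "choices": [["AL", "Alabama"], ["AK", "Alaska"]]
}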

Pros and cons

  • separate keywords for enum values and human-readable names preserves our existing distinction between validation keywords and annotation keywords
  • Parallel arrays are error-prone and very difficult to manage with anything but a very short enumeration
  • Making the array hold tuples so that the order is irrelevant makes it more robust, but involves duplication. If the enum value itself is a complex object or list, the duplication can get non-trivial
  • Replacing ”enum” with a new keyword that holds tuples is disruptive, and combines validation and annotation into one keyword, which we’ve otherwise avoided
  • A list of tuples, whether in addition to or in place of ”enum”, matches how many web development frameworks set up <select> inputs in forms.

In terms of schema design purity, the parallel array of names is the best solution. ”enum” remains a validation property, and ”enumNames” (or whatever we call the parallel array) is an annotation property.

In terms of ease of use, replacing the current value list with a tuple list is the best option. It removes any possibility of mis-matching values and names, and avoids any duplication. The cost is some syntactic noise for unnamed enums as the entries need to be tuples whether there are names or not.

In terms of flexibility, the parallel-ish array of tuples, which is keyed by the value rather than matched strictly by order, is the best option. It allows unnamed enums to continue to work exactly as they already do. We also preserve the validation vs annotation property separation. And it is not vulnerable to mismatches by miscounting. The cost is needing to duplicate the enum values, and then the values can get out of sync.

Steps towards a resolution

We should decide whether the separation of validation and annotation keywords is a fundamental part of the JSON Schema approach (again, see issue #55). If it is, then we can discard the "replace with a list of tuples" option, as it would be used for both validation and annotation. It would be the only annotation that leaves noise in the validation syntax even when it is not used. The value itself may be a tuple, so the top level must always be a tuple in order to avoid ambiguity, even if there is no name present.

If we do settle on the validation/annotation split principle, we're down to either adding a list of names that must be strictly parallel to the list of values, or we must add a list of tuples that are correlated by the value in the tuple. The former option is likely to get out of order or end up with the wrong number of entries, while the latter is likely to end up with values out of sync.

For simple values, keeping the values in sync should be pretty easy, but if enums supply complex data structure values, bugs are likely. I suspect that complex values in enums are quite rare.

For small sets of values, keeping lists in parallel should be easy, but long enums will lead to bugs. I suspect that long lists are more common than complex values.

If long lists are more common than complex values, we should choose the option that is more robust for long lists, which is the list of tuples. I'd appropriate the "enumName" keyword for it, even though that was proposed for the list of names, because it clearly ties the list of tuples to the "enum" property.

One mitigation for bugs involving values getting out of sync is that a debug mode could easily check that every value in the tuple list is an actual value of the corresponding enum. I am NOT proposing this as a step in validating instances- JSON Schema seems to generally be fine with nonsensical schemas (although that's another principle that we should confirm in issue #55). I am just speculating about an additional tool, like a linter for JSON Schema.

The point being that it would be possible to detect the most likely bugs from using a list of tuples with a theoretical linter, but the only thing such a linter could check with the list of names is that it is not longer than the enumeration. I think this, plus the likelihood of long enumerations vs complex values, gives the list of tuples alongside the existing "enum" list the edge.

v6 hyper-schema: linkSource

This proposal originally written by @geraintluff at https://github.com/json-schema/json-schema/wiki/linkSource-(v5-proposal)

Proposed keywords

This proposal would introduce the following keyword to LDOs:

  • linkSource

Purpose

Currently, links described in links apply to the instance being described by that schema.

Sometimes, however, it would be good to be able to describe links for other data items.

Values

The value of linkSource would be a relative JSON Pointer.

Behaviour

When parsing a link definition, the substitution (for href and possibly rel) would be processed as normal.

Once the link had been determined, though, the Relative JSON Pointer in linkSource would be resolved. The result of resolving that pointer should be considered the "source" of the link, instead of the current instance.

Example

Take this data for example:

{
    "postType": "blog",
    "authors": [
        "someuser123",
        "otheruser"
    ],
    ...
}

The entries in "authors" represent authors for the post - but the best we can currently do is to define a rel="author" link on the string itself (e.g. "someuser123"), or perhaps just define a rel="full" link (not specify an author link at all).

This would be incorrect - the links shouldn't apply to the individual entries in "authors", but to the post itself. Using linkSource, we could represent this as:

{
    "type": "object",
    "properties": {
        "authors": {
            "type": "array",
            "items": {
                "type": "string",
                "links": [{
                    "rel": "author",
                    "href": {
                        "template": "/users/{username}",
                        "vars": {"username": "0"}
                    },
                    "linkSource": "2"
                }]
            }
        }
    }
}

Concerns

Walking the whole instance tree

If link definitions can be defined outside of the data they describe, then in order to find all the links that apply to the instance, it would no longer be enough to process the "immediate" schemas for that data - tools would have to inspect the schemas for all children in the entire instance.

v5 validation: Clearly document validation principles

NOTE: This is a request for clarification in v5, and is not a proposal for changed behavior.

The Problem

There are several underlying principles to validation which are currently poorly articulated, or even just implied. Some of the more contentious arguments over feature proposals are due to unclear understanding of these principles. Plainly stating these in the specification will help keep the evolution of JSON Schema focused and reduce feature debate noise.

Terminology: indexing into a schema

You can index into JSON data by a property name or an array index. This can be written in JavaScript access form, e.g. A["foo"], A.foo, or A[0].

Indexing into a schema by a property name or array index number will, within this issue, mean finding the schema that would validate a similarly indexed instance. So if schema X validates instance A, then:

X.foo is the schema that is used to validate A.foo in the course of validating A with X.
X[5] is similarly the schema used to validate A[5]

Note that X.foo will in truth be one of:
X.properties.foo
X.patternProperties.patternThatMatchesFoo
X.additionalProperties # if neither of the above and additionalProperties is a schema
{} # the blank schema, if none of the above and additionalProperties is true

Similarly, X[5] will in truth be one of:
X.items[5] # if items is an array with at least six members
X.additionalItems # if items is an array with fewer than six members and additionalItems is a schema
X.items # if items is a schema rather than an array
{} # if none of the above and additionalItems is true

"allOf"/"anyOf"/"oneOf"/"not" involve special considerations, which we will revisit within the principles below. Here are the basics of how indexing applies to them:

if X is an "allOf" with two branches X1 and X2, then:
X.foo is {"allOf": [X1.foo, X2.foo]}

if X is an "anyOf" or "oneOf" with two branches X1 and X2, then X.foo must only take into account the schema(s) that validated A. In the case of "anyOf" that may be both or just one, while in the case of "oneOf" it will always be just one of the branches.

If X2 is the branch of "oneOf" that validates A, then X.foo is X2.foo
If both X1 and X2 validate A in an "anyOf", then X.foo is {"anyOf": [X1.foo, X2.foo]}

if X is a "not" schema {"not": Y}, then there is no meaningful index into X. Depending on the rest of how Y is defined, Y.foo may or may not validate against A.foo, even though Y as a whole is guaranteed to fail validation with A due to the "not".

Known or Suspected Principles

I am totally making these up off the top of my head. They are a starting point: some are missing, and some are probably wrong. Some are defined, and others are more of a request for someone to explain the principle involved.

Context-free validation

Validation of a schema should succeed or fail independent of whether or where it appears within another schema.

A corollary of this is that if instance A validates against schema X, then indexing into both will produce a sub-instance that validates against the sub-schema. Since A.foo validates against X.foo in the context of A and X, it must also validate when pulled out to stand alone.

Notably, if X is {"not": Y}, the impact of this principle is unclear because there is no meaningful X.foo. The overall context of the "not" must be taken into account in order to say anything.

Schemas that cannot possibly validate any instance are considered valid

That this is an underlying principle is clear from reading the spec. However, I have not seen any explanation as to the benefit. Is it intended to facilitate extensibility somehow? Is it to avoid burdening validator implementors with expensive and difficult checks? If it is the latter, is having the validation succeed the only possible solution to this requirement?

One generalized example is section 4.1 of draft 04, which says: "Some validation keywords only apply to one or more primitive types. When the primitive type of the instance cannot be validated by a given keyword, validation for this keyword and instance SHOULD succeed."

Why should a schema of {"type": "string", "maximum": 10} which is clearly nonsensical validate cleanly against the string "foo"?

Furthermore, why should default or enum values that fail validation be allowed?

A minimally conforming validator need only validate syntactical/structural constraints

It may ignore all annotation fields, all hypermedia fields, and all semantic validation fields (currently "format" is the only semantic field).

This is important for answering the objection that a new annotation field (for instance) places a burden on validator implementors. Since any minimal validator must already ignore any unrecognized fields in a schema, there is no validator burden for non-validation schema fields.

This principle can be inferred from what is marked required or optional and how each field behaves, but clearly articulating it will avoid some arguments based on observations of other issue discussions.

Does anyone actually use JSON Hyper-schema

There are a lot of broken features defined in JSON Hyper-Schema, and I want to ask implementors how much I can be allowed to "break" (i.e. make compliant with normative references).

Mostly things like quirks about how it defines URI templates, uses "rel", and uses "method".

Anyone?

v6 annotation: custom error messages while validating

Originally written by @epoberezkin at https://github.com/json-schema/json-schema/wiki/Custom-error-messages-(v5-proposal)
with additional requests by @the-t-in-rtf at json-schema/json-schema#222

Add a keyword, errors, that would contain error messages, potentially templated, to be added to the errors reported by validators when some keyword fails validation. Example:

{
  "properties": {
    "age": {
      "minimum": 13,
      "errors": {
        "minimum": "Should be at least ${schema} years, ${data} years is too young."
      }
    },
    "gender": {
      "enum": ["male", "female"],
      "errors": {
        "enum": {
          "text": "Gender should be ${schema/0} or ${schema/1}",
          "action": "replace"
        }
      }
    }
  }
}

They can be merged using absolute or relative JSON pointers:

{
  "properties": {
    "age": { "minimum": 13 },
    "gender": { "enum": ["male", "female"] }
  },
  "errors": {
    "#/properties/age/minimum": "Should be at least ${schema} years, ${data} years is too young.",
    "#/properties/gender/enum": {
      "text": "Gender should be ${schema/0} or ${schema/1}",
      "action": "replace"
    }
  }
}

Disallow fragments in id URIs

It seems that some of the problems with $refs and id could be alleviated by disallowing fragments in id URIs.

Currently such fragments

  • collide with JSON pointers and
  • make strange distinctions between outer-most schema (which currently can not have fragment in id) and inner schemas (which can have fragments in id)
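
A hedged illustration of the collision, using draft-4 style id values (names invented): the fragment #foo declared as an id competes with the JSON Pointer fragment #/definitions/foo for addressing the same subschema.

{
    "id": "http://example.com/root.json",
    "definitions": {
        "foo": {
            "id": "#foo",
            "type": "string"
        }
    }
}

Here both http://example.com/root.json#foo and http://example.com/root.json#/definitions/foo identify the inner schema; disallowing the former removes the ambiguity.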

Allow regular JSON pointers for $data reference (draft5)

Right now, the proposed $data reference only supports relative JSON pointers. While this is useful and more compact for referencing data nearby, it's a big pain in the ass for referencing data that is further away. As an example (note that I'm using YAML for readability):

type: object
properties: 
    foo:
        enum:
            $data: "1/enumList"
    bar:
        type: object
        properties:
            baz:
                enum: 
                    $data: "2/enumList"
    enumList: 
        type: array     

At a small scale, the only real issue is the lack of code reuse, as you can't reuse the enum if the data is closer or further from the root. However, the problem gets progressively worse the larger and more complex the schema gets. It's not, however, an insurmountable problem. The real problem is when you start using recursive schemas:

type: object
properties:
    foo:
        $ref: "#/definitions/recursiveSchema"
    enumList: 
        type: array

definitions:
    recursiveSchema:
        type: object
        properties:
            bar:
                enum:
                    $data: "2/enumList"
            foobar:
                $ref: "#/definitions/recursiveSchema"

Example Data:

foo:
    bar: "Allowed"
    foobar:
      bar:  "Not Allowed"
      foobar:
        bar:  "Allowed"
enumList: ["Allowed"]

Now, what I expected to happen here was the relative pointer only works on the first level, and then simply fails to resolve the enum for every level after. What actually happens is AJV realizes the relative pointer is completely out of whack and assumes something is wrong with the JSON, and rejects any data you give it that follows the schema. Remove either the recursion or the enum, and whatever is left works just fine.

With regular JSON pointers, the fix is easy: instead of "[number]/enumList", it would simply be "/enumList", and it would resolve properly regardless of where you are in the document. All absolute pointers start with a slash, and all relative pointers start with a number, so there would never be any confusion about which is which.

Adding in JSON pointers shouldn't be hard for the people who've already implemented the $data reference, but it would be nice if it was part of the actual specification.

Define the abstract instance validation function

It may be useful to define, in somewhat mathematical terms, what it means to validate an instance, and which inputs are used.

I imagine the validation function being defined as such:

Validate[collection, schema, version, iriBase, instance] → Boolean ∪ Indeterminate

Where:

  • collection ∈ set of all Map[ IRI → valid JSON Schema instance ]
  • schema ∈ set of all IRIs
  • version ∈ set of all IRIs
  • iriBase ∈ set of all IRIs
  • instance ∈ set of all JSON documents (i.e. with a media type application/json)

This may also help to resolve issue #4. If the validation function is defined to have no side-effects, then we can just reiterate that point within the "default" keyword. We can also say the keyword is "not used for validation, but may be used for other purposes not defined here."

This is not to say that JSON Schema libraries can't implement other functions, they might desire to implement a "coerce" function that turns an arbitrary JSON instance into a validating one (casting strings to numbers, filling in missing required values using the default, etc).

Aside: Defining a "coerce" might be something useful for v6 (or, the next version with feature additions).

Release plan for draft 5

  • Is there some release plan / time schedule for draft 5?
  • Who is working on this? Is this currently a personal project of @ACubed, who appointed himself, or are there more people involved?

v6 validation: "contains"

Originally written by @geraintluff at https://github.com/json-schema/json-schema/wiki/contains-(v5-proposal)

Proposed keywords

  • contains

We also might want an equivalent for objects (like containsProperty).

Purpose

Specifying that an array must contain at least one matching item is awkward. It can currently be done, but only using some inside-out syntax:

{
    "type": "array",
    "not": {
        "items": {
            "not": {... whatever ...}
        }
    }
}

This would replace it with the much neater:

{
    "type": "array",
    "contains": {... whatever ...}
}

It would also enable us to specify multiple schemas that must be matched by distinct items (which is currently not supported).

Values

The value of contains would be either a schema, or an array of schemas.

Validation

If the value of contains is a schema, then validation would only succeed if at least one of the items in the array matches the provided sub-schema.

If the value of contains is an array, then validation would only succeed if it is possible to map each sub-schema in contains to a distinct array item matching that sub-schema. Two sub-schemas in contains cannot be mapped to the same array index.

Example

Plain schema

{
    "type": "array",
    "contains": {
        "type": "string"
    }
}

Valid: ["foo"], [5, null, "foo"]
Invalid: [], [5, null]

Array of schemas

{
    "type": "array",
    "items": {"type": "object"},
    "contains": [
        {"required": ["propA"]},
        {"required": ["propB"]}
    ]
}

Valid:

  • [{"propA": true}, {"propB": true}]
  • [{"propA": true}, {"propA": true, "propB": true}]

Invalid:

  • []
  • [{"propA": true}] - no match for second entry
  • [{"propA": true, "propB": true}] - entries in contains must describe different items

Concerns

Implementation

The plain-schema case is simple.

The array case is equivalent to Hall's Marriage Theorem. There are relatively efficient solutions for the general problem - but, I suspect a brute-force search will be surprisingly effective and efficient (due to the relatively small number of entries in contains).

It may or may not be worth warning schema authors about stuffing hundreds of entries into contains, because a naive implementation could easily end up having O(n³m) complexity.

Complexity of understanding (for humans)

Behaviour for the array form may be slightly complicated. For example:

{
    "type": "array",
    "contains": [
        {"enum": ["A", "B"]},
        {"enum": ["A", "B", "C"]},
        {"enum": ["A", "D"]},
    ]
}

In this case, ["A", "B", "C"] is valid.

However, this is not due to the syntax - it's simply a complex constraint.

Criteria for building hyperlinks

Right now, links that use variables that aren't defined are just cast as empty. There should be some way to specify a default value, or to depend on certain variables existing and skip the hyperlink if they don't.

For example, if there's a schema like

{ links: [
    { href:"/doc/{uuid}", rel:"self"  }
]
}

but multiple posts don't have a "uuid" property, then they all get the same URI of </doc/> and all of a sudden we're saying multiple different posts are actually the "same". Oops!

This can sort of be done right now like so:

{ anyOf: [
   {
      required: ['uuid'],
      links: [ {href:'/doc/{uuid}', rel:'self'} ],
    }
]
}

... but this is bulky.

syntactic fallback validation for "format"

The Problem: "format" is frequently re-implemented using "pattern" because it is unreliable

The "format" keyword is currently defined as an optional feature of JSON Schema. This frees implementations from the relatively burdensome requirements of performing the specified semantic validations, but also intentionally makes the feature unreliable. As a result, schema authors frequently re-define validation schemas for fields that could be completely described with the "format" keyword were its implementation consistent.

This places an undue burden on schema writers who wish to both take advantage of any full implementations and work around any minimal implementations.

Here is an example of a document (written in YAML for human-friendliness) that provides JSON Schemas for ipv4 and ipv6 addresses, for use in other schemas from the same product in place of the "format" keyword:

https://support.riverbed.com/apis/sh.common/1.0/service.yml

The Proposal

JSON Schema can provide a standard "pattern"-based schema for each format value in its meta-schema, which will provide a documented level of purely syntactical validation for instances. This requires only trivial additional work from implementations as shown below under "Mechanism".

Each such schema MUST successfully validate all possible valid instances. It MAY also accept some invalid instances, either because of the limits of regular expressions or because the JSON Schema standard decides that the fully accurate pattern would be too complex or too expensive to evaluate.

Mechanism

A "formats" section would be added to the "definitions" within the meta-schema:

{
    "definitions": {
        "formats": {
            "definitions": {
                "ipv4": {
                    "minLength": 7,
                    "maxLength": 15,
                    "pattern": "^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$"
                },
                "email": {
                    "pattern": "you-get-the-idea"
                }
            }
        }
    }
}

The purpose of the nested "definitions" section is to clearly differentiate between definitions used only for format validation and definitions used to build the actual meta-schema.

If an implementation does not handle "format": "ipv4" directly, then the schema:

{
    "$schema": "http://json-schema.org/schema#",
    "type": "string",
    "readOnly": true,
    "format": "ipv4"
}

should be interpreted as:

{
    "$schema": "http://json-schema.org/schema#",
    "allOf": [
        {
            "type": "string",
            "readOnly": true
        },
        { "$ref": "http://json-schema.org/schema#/definitions/formats/definitions/ipv4" }
    ]
}

combining the fallback schema with whatever schema elements beyond "format" were already present.
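
A minimal sketch of that rewrite, assuming a JavaScript implementation (applyFormatFallback and NATIVE_FORMATS are hypothetical names; the $ref target simply follows the definitions layout shown above):

// Formats this (hypothetical) implementation validates natively; anything else
// falls back to the meta-schema's pattern-based definition.
const NATIVE_FORMATS = new Set(["date-time"]);

function applyFormatFallback(schema) {
    if (typeof schema !== "object" || schema === null) return schema;
    if (typeof schema.format !== "string" || NATIVE_FORMATS.has(schema.format)) {
        return schema;
    }
    const { format, $schema, ...rest } = schema;
    const rewritten = {
        allOf: [
            rest,
            { $ref: "http://json-schema.org/schema#/definitions/formats/definitions/" + format }
        ]
    };
    if ($schema !== undefined) rewritten.$schema = $schema;
    return rewritten;
}

// applyFormatFallback({$schema: "http://json-schema.org/schema#", type: "string",
//                      readOnly: true, format: "ipv4"})
// produces the allOf form shown above.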

Correctness Concerns

While all of the formats can be at least somewhat validated by regular expressions, several are either extremely complex to validate fully or cannot be validated entirely by a regex. Is this a problem? I argue that it is not: properly implemented, this provides substantial validation assistance that schema authors are otherwise writing themselves each time. Schema authors may examine the supplied regexes, determine whether they are sufficient for the given application, and re-implement them if they are not. This is no worse than what currently happens.

Performance Concerns

Due to the complexity of the regular expressions involved, the performance impact of using them is a valid concern. However, the "format" specification already states that implementations SHOULD provide an option to disable the keyword. That requirement should be left as-is. Disabling the "format" keyword should disable it entirely, including the fallback validation.

$data

Originally written by @geraintluff at https://github.com/json-schema/json-schema/wiki/%24data-(v5-proposal)

NOTE: Relative JSON Pointer is defined as an extension of JSON Pointer, which means that an absolute JSON Pointer is legal anywhere a relative pointer is mentioned (but not vice versa).

Absolute JSON Pointers always begin with /, while Relative JSON Pointers always begin with a digit. Resolving a pointer beginning with / behaves the same whether or not it is being resolved "relative" to a specific location, just as the URI reference "/foo/bar" resolves to the same result whether or not the base URI has a path component.

Proposed keywords

  • $data

This keyword would be available:

  • inside any schema
  • contained in an object ({"$data": ...}) for the following schema properties:
    • minimum/maximum
    • exclusiveMinimum/exclusiveMaximum
    • minItems/maxItems
    • enum
    • more...
  • contained in an object ({"$data": ...}) for the following LDO properties:
    • href
    • rel
    • title
    • mediaType
    • more...

Purpose

This keyword would allow schemas to use values from the data, specified using Relative JSON Pointers.

This allows more complex behaviour, including interaction between different parts of the data.

When used inside LDOs, this allows extraction of many more link attributes/parameters from the data.

Values

Wherever it is used, the value of $data is a Relative JSON Pointer.

Behaviour

If the $data keyword is defined in a schema, then before any further processing of the schema:

  • The value of $data is interpreted as a Relative JSON Pointer.
  • The pointer is resolved relative to the current instance being validated/processed/etc.
  • The resolved value is taken to be the value of the schema for all further processing.

When used in one of the permitted schema/LDO properties, then before any further processing of the schema/LDO:

  • The value of $data is interpreted as a Relative JSON Pointer.
  • The pointer is resolved relative to the current instance being validated/processed/etc.
  • The resolved value is substituted as the property value.

Example

{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "type": "object",
    "properties": {
        "smaller": {"type": "number"},
        "larger": {
            "type": "number",
            "minimum": {"$data": "1/smaller"},
            "exclusiveMinimum": true
        }
    },
    "required": ["larger", "smaller"]
}

In the above example, the "larger" property must be strictly greater than the "smaller" property.
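
To make the pointer resolution in that example concrete, here is a minimal sketch of resolving the digit-prefixed form of a Relative JSON Pointer (resolveRelative is a hypothetical helper; it ignores the "#" variant, and absolute pointers beginning with "/" would be handled by ordinary JSON Pointer resolution):

function resolveRelative(instance, currentPath, relativePointer) {
    // Leading digits say how many levels to climb; the rest is a JSON Pointer.
    const match = /^(\d+)(\/.*)?$/.exec(relativePointer);
    if (!match) throw new Error("not a digit-prefixed relative pointer");
    const up = Number(match[1]);
    const pointer = match[2] || "";

    // Climb `up` levels from the current location, then walk back down.
    const basePath = currentPath.slice(0, currentPath.length - up);
    let value = basePath.reduce((doc, key) => doc[key], instance);
    for (const token of pointer.split("/").slice(1)) {
        value = value[token.replace(/~1/g, "/").replace(/~0/g, "~")];
    }
    return value;
}

// While validating the "larger" property of {"smaller": 5, "larger": 7},
// the value for "minimum" resolves as:
resolveRelative({smaller: 5, larger: 7}, ["larger"], "1/smaller"); // => 5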

Concerns

Theoretical purity

Currently, validation is "context-free": one part of the data has minimal effect on the validation of another part, which keeps things like referencing sub-schemas simple. Changing this is a big step and should not be done lightly.

Some interplay of different parts of the data can currently be specified using oneOf (and the proposed switch) - but crucially, these constraints are specified in the schema for a common parent node, meaning that sub-schema referencing is still simple.

The use of $data also (in some cases) limits the amount of static analysis that can be done on schemas, because their behaviour becomes much more data-dependent. However, the expressive power it opens up is quite substantial.

Not available for all keywords

It's also tempting to allow its use for all schema keywords - however, not only is that a bad idea for keywords such as properties/id, but it also might present an obstacle to anybody extending the standard.

Not available inside enum values

It should be noted that while {"enum": {"$data":...}} would extract a list of possible values from the data, {"enum": [{"$data":...}]} would not - it would in fact specify that there is only one valid value: {"$data":...}.
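
As a side-by-side illustration (written as JavaScript object literals; the instance property name "allowedValues" is made up):

// Resolves against the instance: the list of allowed values comes from
// the sibling property "allowedValues" (assuming the $data proposal above).
const dynamicEnum = { enum: { "$data": "1/allowedValues" } };

// Does NOT resolve: this is an ordinary one-element enum whose only
// permitted value is the literal object {"$data": "1/allowedValues"}.
const literalEnum = { enum: [{ "$data": "1/allowedValues" }] };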

Similar concerns would exist with an extra keyword like constant - what if you want the constant value to be a literal {"$data":...}? However, perhaps constant could be given this data-templating ability, and if you want a literal {"$data":...}, then you can still use enum.

Describing using the meta-schema

The existing mechanics of $ref can be nicely described using a rel="full" link relation.

The mechanics of $data, however, would be impossible to even approach in the meta-schema. We could describe the syntax, but nothing more. Is this a problem?
