
Remove normalization completely from JSON-LD spec or make it return a JSON-LD representation in the form of a string (json-ld.org, closed)

json-ld avatar json-ld commented on July 22, 2024
Remove normalization completely from JSON-LD spec or make it return a JSON-LD representation in the form of a string

from json-ld.org.

Comments (30)

gkellogg avatar gkellogg commented on July 22, 2024

This really comes down to what the ultimate driver for normalization in JSON-LD, or in any Linked Data/RDF representation, really is. To be useful, a graph requires an identical normalized representation that is independent of the data format originally used for markup, and of the way in which language features or publisher preferences create differences in the markup of identical graphs.

An open question is whether the result must have a single serialization, a reproducible signature, or any representation of a graph in which all nodes are equivalent regardless of graph store location. (From the RDF 1.1 WG, it's equivalent to asking if two g-snaps have the same meaning.)

One use case driving JSON-LD normalization is the need to create a cryptographic signature
that can be determined reliably by multiple parties using multiple processors; this argues that
the purpose of JSON-LD graph normalization is to reliably create such a signature.

As a result, it may be that JSON-LD normalization is not what's required, but an API method that will return a graph signature for the graph represented by the JSON-LD input document.

There may be a need for something similar to normalization: flattening the graph to a single array of object definitions. This probably does not require that BNode identifiers be consistently mapped, or that object order be consistent. If it does, then something very much like normalization is required for JSON-LD (need use case), which could be created by re-serializing a normalized N-Triples representation back to a flat representation of JSON-LD, preserving subject order and BNode identifiers.

from json-ld.org.

dlongley avatar dlongley commented on July 22, 2024

I think the original idea behind the normalize API was to return a JSON-LD object that represents the normalized graph in JSON-LD form. Then, it would be a very simple further step to convert that form into a normalized N-Triples format for string comparison. The JSON-LD object could still be serialized to a JSON string and be compared reliably as well; but for signing graphs, the JSON-LD object would first be converted into N-Triples using a very trivial algorithm. The triples API method in JSON-LD could be leveraged to accomplish the conversion to N-Triples.

The normalization API has other uses including generating consistent bnode identifiers, producing simple, flattened, consistent JSON-LD objects, and it is currently reused to simplify the framing algorithm.

I think we can have this both ways; there's a normalized JSON-LD object form that is useful in the JSON-LD world, and, for the digital signature use case, this form can be trivially converted to an N-triples serialization.
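To make the digital signature use case concrete, here is a minimal Python sketch of the last step only: turning an already normalized graph into a string that can be hashed. It assumes the graph contains only IRIs and no blank nodes, so the hard part of normalization (consistent bnode labeling) is deliberately left out. The names `ntriples_line` and `graph_digest` are made up for this illustration, and SHA-256 is just one possible hash.

```python
import hashlib

def ntriples_line(s, p, o):
    """Serialize one triple of IRIs as an N-Triples line (IRIs only, for brevity)."""
    return f"<{s}> <{p}> <{o}> ."

def graph_digest(triples):
    """Hash a blank-node-free graph: serialize, sort lexicographically, then SHA-256."""
    # A graph is a set of triples, so duplicates are dropped before sorting.
    lines = sorted({ntriples_line(*t) for t in triples})
    canonical = "\n".join(lines) + "\n"
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

a = [
    ("http://example.org/a", "http://example.org/knows", "http://example.org/b"),
    ("http://example.org/b", "http://example.org/knows", "http://example.org/c"),
]
b = list(reversed(a))  # same graph, different statement order

print(graph_digest(a) == graph_digest(b))
```

Because the lines are sorted before hashing, the digest does not depend on the order in which a processor happened to emit the statements.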

from json-ld.org.

gkellogg avatar gkellogg commented on July 22, 2024

After some more discussion with Manu, it seems that the normalization algorithm MUST return the same serialization for equivalent input graphs, independent of the input format; this implies a common serialized output form that can be used to construct a cryptographic signature.

Normalized JSON-LD is required for, among other things, framing, to ensure that the appropriate objects are created, and values are expressed in the proper order.

This implies that one step in the JSON-LD normalization procedure will be to take the serialized N-Triples and parse them into JSON-LD normalized form. It's not clear that white-space must be normalized in output, as that is not directly used for creating a signature. In any case, if the result of JSON-LD normalization is an object, rather than a DOMString, it's not really a serialized form where whitespace becomes significant.
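A small Python illustration of the object-versus-string point, using only the standard `json` module. The two documents below are invented for the example; they describe the same node but differ in whitespace and key order.

```python
import json

compact = '{"@id":"http://example.org/a","http://example.org/name":[{"@value":"A"}]}'
pretty = '''{
  "http://example.org/name": [ { "@value": "A" } ],
  "@id": "http://example.org/a"
}'''

# As strings the two documents differ, so a byte-level signature would differ too.
same_string = compact == pretty
# As parsed objects they are equal: whitespace and key order carry no meaning.
same_object = json.loads(compact) == json.loads(pretty)

print(same_string, same_object)
```

Note that array order still matters for the parsed comparison, which is presumably why the order of values and subjects remains part of the normalized form.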

from json-ld.org.

dlongley avatar dlongley commented on July 22, 2024

Sounds like a good solution to me: the JSON-LD normalization algorithm parses the internally generated N-triples string from the generalized normalization algorithm and outputs a JSON-LD object in normalized form. We may want an option to simply output the serialized N-triples string, however, so that JSON-LD processors can provide a simple call that takes JSON-LD as input and produces its associated normalization string.

from json-ld.org.

lanthaler avatar lanthaler commented on July 22, 2024

I think I was slightly misunderstood. What I tried to say was that the serialization of the normalized graph should be in JSON-LD instead of being in N-Triples. We don't need to return an object from the API, returning a string is fine. But the contents of that string should be a valid JSON-LD document. This doesn't prevent any of the mentioned use cases and has the, in my opinion, very important advantage of being able to parse such a normalization without having to support N-Triples in a JSON-LD parser.

If the said string contains N-Triples, we cannot serve it with the media type "application/ld+json;form=normalized" as it is not JSON-LD. I do not see any advantage of returning that string in N-Triples. What is it that I miss?

-----Original Message-----
From: Dave Longley [mailto:reply+i-2800332-
[email protected]]
Sent: Friday, January 13, 2012 1:26 AM
To: Markus Lanthaler
Subject: Re: [json-ld.org] Remove normalization completely from JSON-LD
or make it return a JSON-LD object (#53)

Sounds like a good solution to me: the JSON-LD normalization algorithm
takes an N-triples string from the generalized normalization algorithm
and outputs a JSON-LD object in normalized form. We may want an option
to simply output the serialized N-triples string, however, so that
JSON-LD processors can provide a simple call that takes JSON-LD as
input and produces its associated normalization string.


Reply to this email directly or view it on GitHub:
#53 (comment)

from json-ld.org.

gkellogg avatar gkellogg commented on July 22, 2024

On Jan 13, 2012, at 5:36 AM, "Markus Lanthaler" [email protected] wrote:

I think I was slightly misunderstood. What I tried to say was that the serialization of the normalized graph should be in JSON-LD instead of being in N-Triples. We don't need to return an object from the API, returning a string is fine. But the contents of that string should be a valid JSON-LD document. This doesn't prevent any of the mentioned use cases and has the, in my opinion, very important advantage of being able to parse such a normalization without having to support N-Triples in a JSON-LD parser.

If the said string contains N-Triples, we cannot serve it with the media type "application/ld+json;form=normalized" as it is not JSON-LD. I do not see any advantage of returning that string in N-Triples. What is it that I miss?

If you look at the language now, it goes to N-Triples for normalization, and processes the normalized results back into normalized JSON-LD.

Gregg


from json-ld.org.

lanthaler avatar lanthaler commented on July 22, 2024

The current language says the following as to my understanding:

JSON-LD document -> normalized graph -> string (containing a
normalized serialization in **N-Triples**) -> JSON-LD object

What I'm asking is why do we introduce N-Triples at all. We could as well do the following:

JSON-LD document -> normalized graph -> string (containing a
normalized serialization in **JSON-LD**)

So the output of the API would be a string but this string contains valid JSON-LD instead of N-Triples. If someone would like to use this normalized document, he just takes the string and parses it with a JSON-LD parser.
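As a rough sketch of what "a string that contains valid JSON-LD" could look like in practice, the Python below serializes a flat list of node objects deterministically. It assumes the nodes are already expanded, have no blank nodes, and that each value array is already in its normalized order; `to_normalized_string` is a hypothetical helper, not something from the spec.

```python
import json

def to_normalized_string(nodes):
    """Serialize a flat list of node objects to one deterministic JSON string."""
    # Order the nodes by subject IRI, sort the keys, and fix the separators
    # so that equal inputs always yield the same characters.
    ordered = sorted(nodes, key=lambda n: n["@id"])
    return json.dumps(ordered, sort_keys=True, separators=(",", ":"), ensure_ascii=False)

nodes_1 = [
    {"@id": "http://example.org/b", "http://example.org/name": [{"@value": "B"}]},
    {"http://example.org/name": [{"@value": "A"}], "@id": "http://example.org/a"},
]
nodes_2 = [
    {"@id": "http://example.org/a", "http://example.org/name": [{"@value": "A"}]},
    {"http://example.org/name": [{"@value": "B"}], "@id": "http://example.org/b"},
]

s1 = to_normalized_string(nodes_1)
s2 = to_normalized_string(nodes_2)
print(s1 == s2)

# The string is plain JSON, so any JSON parser can read it back.
parsed = json.loads(s1)
```

The trade-off discussed in this thread still applies: such a string is only comparable between implementations that agree on every serialization detail (key order, separators, escaping).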

-----Original Message-----
From: Gregg Kellogg [mailto:reply+i-2800332-
[email protected]]
Sent: Saturday, January 14, 2012 7:31 AM
To: Markus Lanthaler
Subject: Re: [json-ld.org] Remove normalization completely from JSON-LD
or make it return a JSON-LD object (#53)

If you look at the language now, it goes to N-Triples for normalization,
and processes the normalized results back into normalized JSON-LD.

Gregg


from json-ld.org.

gkellogg avatar gkellogg commented on July 22, 2024

As I understood it, the need for graph normalization goes beyond JSON-LD and is a more generalized need. And, there's no interest outside of JSON-LD to use it just for this purpose. This has led to the conclusion that the normalization work will be RDF based, to reach the broadest audience. Given a dependency on RDF normalization, we're left with going through RDF serialization/deserialization.

Of course, the RDF normalization group could consider some format other than N-Triples, or there could be some API that returned "raw" triples, but if one of the requirements is to obtain a cryptographic signature, some serialization format will be required; it's hard to imagine it not being N-Triples.

BTW, I'm somewhat closer to your TZ, now in Guam on the way to Palau, but probably won't be too responsive for the next week+. You should pursue this on Tuesday's call so we can get to a consensus opinion.

Gregg Kellogg
Sent from my iPhone

On Jan 14, 2012, at 3:42 PM, "Markus Lanthaler" [email protected] wrote:

The current language says the following as to my understanding:

JSON-LD document -> normalized graph -> string (containing a
normalized serialization in N-Triples) -> JSON-LD object

What I'm asking is why do we introduce N-Triples at all. We could as well do the following:

JSON-LD document -> normalized graph -> string (containing a
normalized serialization in JSON-LD)

So the output of the API would be a string but this string contains valid JSON-LD instead of N-Triples. If someone would like to use this normalized document, he just takes the string and parses it with a JSON-LD parser.


from json-ld.org.

lanthaler avatar lanthaler commented on July 22, 2024

The question is whether the normalization algorithm really has to be bound to a serialization format. I'm inclined to say no. The normalization algorithm could be described completely independently of the serialization format; eventually serializing the result to some specific format, whether that's JSON-LD, RDF/XML, Turtle, or N-Triples, does not really matter. I think that would even have the advantage that every new serialization format would be able to use the normalization algorithm and then independently specify its serialization format.

So I would propose the following:

  • specify the normalization algorithm independently of a serialization format
  • specify a JSON-LD serialization of a normalized graph
  • optionally, specify serializations in N-Triples, Turtle, etc.

The advantage would be that each system could directly parse a normalized graph without having to implement support for another serialization format, so that there's full round-tripping support.

What do you think?

Awesome, are you going for a diving trip? Enjoy it!

-----Original Message-----
From: Gregg Kellogg [mailto:reply+i-2800332-
[email protected]]
Sent: Saturday, January 14, 2012 6:21 PM
To: Markus Lanthaler
Subject: Re: [json-ld.org] Remove normalization completely from JSON-LD
or make it return a JSON-LD object (#53)

As I understood it, the need for graph normalization goes beyond JSON-
LD and is a more generalized need. And, there's no interest outside of
JSON-LD to use it just for this purpose. This has led to the conclusion
that the normalization work will be RDF based, to reach the broadest
audience. Given a dependency on RDF normalization, we're left with
going through RDF serialization/deserialization.

Of course, the RDF normalization group could consider some format other
than N-Triples, or there could be some API that returned "raw" triples,
but if one of the requirements is to obtain a cryptographic signature,
some serialization format will be required; it's hard to imagine it
not being N-Triples.

BTW, I'm somewhat closer to your TZ, now in Guam on the way to Palau,
but probably won't be too responsive for the next week+. You should
pursue this on Tuesday's call so we can get to a consensus opinion.

Gregg Kellogg
Sent from my iPhone


from json-ld.org.

gkellogg avatar gkellogg commented on July 22, 2024

On Jan 14, 2012, at 9:45 PM, "Markus Lanthaler" [email protected] wrote:

The question is whether the normalization algorithm really has to be bound to a serialization format. I'm inclined to say no. The normalization algorithm could be described completely independently of the serialization format; eventually serializing the result to some specific format, whether that's JSON-LD, RDF/XML, Turtle, or N-Triples, does not really matter. I think that would even have the advantage that every new serialization format would be able to use the normalization algorithm and then independently specify its serialization format.

So I would propose the following:

  • specify the normalization algorithm independently of a serialization format
  • specify a JSON-LD serialization of a normalized graph
  • optionally, specify serializations in N-Triples, Turtle, etc.

The advantage would be that each system could directly parse a normalized graph without having to implement support for another serialization format, so that there's full round-tripping support.

What do you think?

It comes down to this: if you need a signature (e.g., SHA-1), you need a serialization that is the same across different technologies; if each does it natively for its own language, is it generally useful?

Awesome, are you going for a diving trip? Enjoy it!


from json-ld.org.

lanthaler avatar lanthaler commented on July 22, 2024

The need for a signature is an application-specific requirement (which therefore we aren't going to specify). It thus depends completely on the application how this signature gets calculated. Which serialization format is used doesn't matter in this instance. I wouldn't like to force all (JSON-LD) applications to support N-Triples just to be able to normalize a graph.

from json-ld.org.

dlongley avatar dlongley commented on July 22, 2024

There's an issue in that the normalization algorithm itself requires some kind of lexicographical comparison of nodes in order to produce a canonical output. Whatever that serialization is, a JSON-LD processor will have to understand it to produce an ordered output in JSON-LD. Either this, or the normalization algorithm can output triples in order ... and the JSON-LD processor can take it from there. That doesn't mean a JSON-LD application that links to a JSON-LD processor has to know anything about an internally used serialization; the output of the JSON-LD API can be either a JSON-LD document in "normalized form" where it is an expanded array with a particular order for the subjects, or it could be a simple string (likely in N-Triples format) that can be seen as an opaque value that uniquely identifies the particular graph (and can therefore be hashed and signed however your application sees fit). To be clear, I'm saying that the API should give you the option to choose either output.

from json-ld.org.

msporny avatar msporny commented on July 22, 2024

PROPOSAL: Add a flag to the normalize call that specifies whether the return value should be a string in N-Triples format or a native language structure used to represent a JSON-LD document.
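For discussion purposes, the proposed flag might look roughly like the Python sketch below. Everything here is hypothetical: the function name `normalize`, the parameter `as_ntriples`, and the choice of input. To stay self-contained, the input is a list of IRI-only triples rather than a real JSON-LD document, and blank nodes and literals are ignored.

```python
def normalize(triples, as_ntriples=False):
    """Hypothetical normalize call: one flag selects the return type."""
    # Stand-in for real normalization: de-duplicate and sort the statements.
    ordered = sorted(set(triples))
    if as_ntriples:
        # Opaque string suitable for hashing or signing.
        return "".join(f"<{s}> <{p}> <{o}> .\n" for s, p, o in ordered)
    # Native structure: a flat array of node objects in subject order.
    nodes = {}
    for s, p, o in ordered:
        nodes.setdefault(s, {"@id": s}).setdefault(p, []).append({"@id": o})
    return [nodes[s] for s in sorted(nodes)]

t = [
    ("http://example.org/b", "http://example.org/knows", "http://example.org/a"),
    ("http://example.org/a", "http://example.org/knows", "http://example.org/b"),
]

as_string = normalize(t, as_ntriples=True)
as_object = normalize(t)
print(type(as_string).__name__, type(as_object).__name__)
```

One design concern with a flag like this is that the return type of the call changes with its arguments, which is part of what the following comments argue about.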

from json-ld.org.

gkellogg avatar gkellogg commented on July 22, 2024

I think it is important that Graph normalization be independent of serialization format, so that it can be considered universally. However, we can settle on a Normalized JSON-LD form, which is derived from the normalized Graph form. By having an API call that takes the results of graph normalization and deterministically reproduces JSON-LD, we will have accomplished that. The current algorithm does just this.

From a JSON-LD API perspective having the API call return normalized JSON-LD accomplishes this. A Normalization API should be separate, and take arbitrary serializations and return normalized N-Triples. Serializations other than JSON-LD could then define their own normalized forms based on this.

So, I'm -1 on the proposal to add a flag. The normalize API method works fine, but should be defined in terms of transforming the JSON-LD to RDF represented as N-Triples, invoking the normalization API and transforming the result back to normalized JSON-LD. Just as the JSON-LD to RDF conversion is defined as an API method, we might also define an RDF to JSON-LD conversion.
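The "transform the result back to normalized JSON-LD" step could be sketched in Python as below. This is only an illustration under strong assumptions: the regular expression handles just IRI subjects and predicates, and objects that are either IRIs or plain string literals (no blank nodes, datatypes, or language tags), and literal escapes are not decoded. `from_ntriples` is a made-up name.

```python
import re

# Matches: <subject> <predicate> (<iri> | "plain literal") .
LINE = re.compile(r'^<([^>]*)> <([^>]*)> (?:<([^>]*)>|"((?:[^"\\]|\\.)*)") \.$')

def from_ntriples(text):
    """Turn (already ordered) N-Triples lines into a flat array of node objects."""
    nodes = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = LINE.match(line)
        if m is None:
            raise ValueError(f"unsupported N-Triples line: {line!r}")
        s, p, iri, literal = m.groups()
        value = {"@id": iri} if iri is not None else {"@value": literal}
        # dict keeps insertion order, so subject and value order follow the input.
        nodes.setdefault(s, {"@id": s}).setdefault(p, []).append(value)
    return list(nodes.values())

text = (
    '<http://example.org/a> <http://example.org/knows> <http://example.org/b> .\n'
    '<http://example.org/a> <http://example.org/name> "A" .\n'
)
result = from_ntriples(text)
print(result)
```

Because the function only groups statements in the order it receives them, any ordering guarantee has to come from the normalization step that produced the N-Triples.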

from json-ld.org.

lanthaler avatar lanthaler commented on July 22, 2024

I agree with Gregg: the normalization algorithm should be independent of the serialization format, and we should settle on a normalized JSON-LD form which is derived from the normalized graph.

The output of the API should be a string which can be parsed into JSON by any _JSON_ parser.

from json-ld.org.

gkellogg avatar gkellogg commented on July 22, 2024

Relevant emails relating to performing normalization:
http://lists.w3.org/Archives/Public/public-linked-json/2012Mar/0017.html
http://lists.w3.org/Archives/Public/public-linked-json/2012Mar/0018.html
http://lists.w3.org/Archives/Public/public-linked-json/2012Mar/0019.html

from json-ld.org.

gkellogg avatar gkellogg commented on July 22, 2024

IRC conversation on normalization:

[5:12pm] gkellogg: dlongley: does the normalization algorithm make any guarantees about statement order of rdf:List statements, relative to each other and/or to the referencing object?
[5:20pm] taaz: wouldn't they just get sorted like anything else?
[5:21pm] dlongley: they'll all just be lexicographically sorted like any other triple
[5:21pm] dlongley: all of the triples generated by rdf:List (the linked list triples) will just be treated like any other triple ...
[5:21pm] dlongley: there's no special relational information in the normalization algorithm for lists
[5:21pm] gkellogg: Figured, I'm tweaking the fromTriples algorithm to be independent of order, in this case.
[5:22pm] dlongley: ok
[5:23pm] gkellogg: Requires first passing over the output array of triples looking for rdf:first/rdf:rest before a second pass to serialize other statements.
[5:23pm] dlongley: yeah, we'll have to search through and find the related list info
[5:24pm] gkellogg: My thought is that the result of normalization is an array of entity objects, not a string. Markus has a different opinion.
[5:24pm] dlongley: or keep some placeholders around and something to track list triples as we find them
[5:24pm] gkellogg: Two-pass is easier.
[5:24pm] dlongley: ok... i'll have to re-read Markus's email ...
[5:25pm] dlongley: it sounded like he wanted it to be an array of triples not a string
[5:25pm] gkellogg: IMO, a normalized graph signature needs to be independent of serialization format.
[5:25pm] dlongley: hmm, what do you mean?
[5:25pm] dlongley: in order to produce the signature you'll have to serialize somehow
[5:26pm] gkellogg: Yes, but that should be the same result regardless if the graph was communicated as JSON-LD, RDFa or Turtle. This is why ordered N-Triples is appealing.
[5:27pm] gkellogg: From Markus: "I think would need to return a string to preserve the order."
[5:28pm] dlongley: i see ... i thought he was talking about the result of fromTriples
[5:28pm] dlongley: not the result of normalization
[5:28pm] gkellogg: fromTriples is the last step in normalization.
[5:28pm] dlongley: the result of normalization, i think all 3 of us agree, should be an array of triples
[5:28pm] dlongley: oh, you mean "JSON-LD normalization"
[5:28pm] gkellogg: Yes.
[5:28pm] dlongley: right, well, he has a point in that the returned object won't necessarily preserve key order
[5:29pm] dlongley: it will in some languages, but not in others
[5:29pm] gkellogg: I don't see this as an issue, as the only reason to preserve key order is for creating a signature, which should be independent of JSON-LD anyway.
[5:29pm] dlongley: ok, i understand your point.
[5:30pm] dlongley: sorry, i had to sew those independent pieces together in my brain real quick.
[5:30pm] dlongley: ok, so since a graph signature should use the same serialization regardless of how the graph is communicated ...
[5:30pm] dlongley: there's no point in requiring the JSON-LD normalization algorithm to output an object with strictly-ordered keys
[5:31pm] dlongley: of course, this means that you can't simply serialize to JSON and compare JSON-LD normalized output
[5:31pm] gkellogg: The normalization algorithm probably needs a "signature" API method.
[5:31pm] dlongley: and people might not be expecting that
[5:31pm] dlongley: i don't know if we want to do that ...
[5:32pm] gkellogg: What you're doing is comparing strings, not JSON. No reason that you shouldn't be able to do a deep-JSON comparison. In Ruby, objects are Hashes, and comparison of hashes does not depend on key order.
[5:32pm] dlongley: i don't think that's the right way to go (specifying all of the various signature options etc that could be passed to such a method)
[5:32pm] dlongley: i agree with your point...
[5:32pm] dlongley: the objects that are returned are equivalent.
[5:33pm] dlongley: so they are truly "normalized".
[5:33pm] gkellogg: If I want to do string comparison, it doesn't seem that what's in the string is particularly important, it may as well be N-Triples.
[5:33pm] dlongley: i think the normalization spec should recommend a serialization format (N-Triples) for hashing/signatures, but not go so far to add API calls to do these things.
[5:33pm] gkellogg: object key order in JSON is not defined, so saying that they're ordered has no meaning in JSON.
[5:33pm] dlongley: right
[5:34pm] dlongley: i think i'm in agreement with you.... and maybe these arguments would persuade markus as well.
[5:34pm] gkellogg: The Normalization algorithm MUST provide a way for different implementations to retrieve the same signature; doesn't an API aid in doing this?
[5:34pm] dlongley: i think what he ultimately wants is for the normalized JSON-LD output to be very simple and useful
[5:35pm] gkellogg: To be useful as JSON, it needs to be JSON, not a string.
[5:35pm] gkellogg: Otherwise, parsing the string to JSON will lose order anyway.
[5:35pm] dlongley: it would aid in doing it, but putting all of the crypto work into an API call in that spec i think would be overly complex.
[5:35pm] dlongley: all the normalization algorithm has to do is say how to generate two equivalent strings.
[5:35pm] dlongley: what you do with those strings is entirely up to you
[5:36pm] gkellogg: Okay, I'll buy that reasoning, but interoperability is going to need to describe unambiguously how to obtain a signature from that string.
[5:36pm] dlongley: i think it would be a mistake to specify an API call that performs a signature
[5:36pm] manu-db: gkellogg: The JSON-LD normalization spec will detail an additional API method that can be tacked onto a JSON-LD processor, I think... this decouples the JSON-LD API from normalization... which is what we want for W3C Process purposes.
[5:36pm] dlongley: additional method for what?
[5:37pm] manu-db: purely from a W3C process perspective, we want JSON-LD Syntax, JSON-LD API, and JSON-LD Normalization to be separate, independent specs.
[5:37pm] dlongley: you mean the normalization method?
[5:37pm] gkellogg: I just want to remove the normalized serialization format from JSON-LD into the Normalization spec. The issue of getting a signature is separate from JSON-LD.
[5:37pm] manu-db: dlongley: woops, yes.
[5:37pm] dlongley: manu-db: ok
[5:37pm] manu-db: we /may/ have an additional .signature() method in another spec, that would tack onto the JSON-LD processor in the same way that the .normalize() method would.
[5:37pm] dlongley: gkellogg: i agree that the signature stuff is separate from JSON-LD.
[5:38pm] gkellogg: I'll take this IRC dialog and put it in the issue for reference.
[5:38pm] manu-db: gkellogg: Yes, let's move normalization and signature stuff to separate specs... I thought we had already decided to do that, but it may not have been clear.
[5:38pm] dlongley: however, i think it might be a bit odd to be specifying the JSON-LD normalization format in the normalization spec ... which, afaict, is moving towards being almost entirely independent of various serializations...
[5:38pm] manu-db: (that is, neither of those things belong in the JSON-LD API spec)
[5:38pm] dlongley: save for the N-Triples mention for generating equivalent strings
[5:39pm] dlongley: we might want a JSON-LD normalization spec ...
[5:39pm] dlongley: that is in addition to the generic graph normalization spec.
[5:39pm] manu-db: sure, we could do that.
[5:39pm] dlongley: and that spec provides the API call to get normalized JSON-LD.
[5:40pm] gkellogg: The existing algorithm describes turning JSON-LD into an RDF graph, invoking normalization, and turning the results back into a JSON-LD array of objects. What I communicated in my email is that the result of normalization should use some IDL description of the results, as an ordered array of statements.
[5:41pm] gkellogg: The point is that the result of the JSON-LD normalization call is a JSON array containing flat JSON Object definitions of entities, rather than a string which could be re-parsed as JSON.
[5:44pm] dlongley: that sounds like the result of the JSON-LD normalization call isn't valid JSON-LD?
[5:45pm] gkellogg: No, it is JSON-LD. It's an array of objects, each of which is the JSON-LD representation of a common subject.
[5:45pm] dlongley: would these entities be valid JSON-LD?
[5:45pm] dlongley: ok.
[5:46pm] advatar joined the chat room.
[5:46pm] dlongley: i'm fine with that result.
[5:46pm] gkellogg: … with lists serialized back into those objects.
[5:46pm] dlongley: i don't have a problem with what you're proposing
[5:46pm] gkellogg: GitHub issue is #53. I'll update that with conversation and relevant email tracks.
[5:46pm] dlongley: ok
[5:47pm] dlongley: i do think we should keep in mind using iterable interfaces or similar ...
[5:47pm] dlongley: rather than (or in addition to) lists
[5:47pm] gkellogg: Don't follow.
[5:47pm] dlongley: if we have large graphs of information
[5:48pm] dlongley: we may want the normalization algorithm to output triples in some streaming fashion
[5:48pm] dlongley: and have our fromTriples method similarly accept them in a streaming fashion
[5:48pm] dlongley: so we might want fromTriples to not just accept a list of triples, but possibly some iterable interface of triples...
[5:49pm] dlongley: which was also requested by a few people earlier on who were wondering how to create JSON-LD from an iterable triples interface
[5:49pm] gkellogg: Sure, but I don't see how to do it with @list support without doing two passes.
[5:49pm] dlongley: it has nothing to do with that, sorry.
[5:49pm] dlongley: just generally talking about the changes we're discussing.
[5:49pm] dlongley: regarding normalization and to/from triples.
[5:50pm] gkellogg: The point is, I'll need to iterate over it to get list references out, and then iterate again to do the actual conversion.
[5:50pm] dlongley: or you can do what i mentioned earlier
[5:50pm] gkellogg: Perhaps the API should take an iterator then, not an array. Not sure how that's done in WebIDL.
[5:50pm] dlongley: and use placeholders and store the list triples as you get them in
[5:50pm] dlongley: it would be nice to take both
[5:50pm] advatar left the chat room. (Read error: Connection reset by peer)
[5:51pm] dlongley: but maybe we should just pick the most common case...
[5:51pm] dlongley: (which is probably a list)
[5:51pm] gkellogg: I thought we might be able to do this, but given that I could get the third element in a list before the first, and both of those before the referencing object, that seems like a real problem.
[5:51pm] dlongley: and let others extend it to an iterable interface as they see fit
[5:51pm] advatar joined the chat room.
[5:51pm] dlongley: hmm, i see.
[5:52pm] gkellogg: That's what motivated my original question, which would mean that you could do as is currently described.
[5:52pm] dlongley: what does an rdf:list look like in triples?
[5:52pm] dlongley: have a quick link or example?
[5:52pm] gkellogg: _:l1
[5:52pm] gkellogg: _:l1 rdf:first "foo"; rdf:rest _:l2 .
[5:52pm] gkellogg: _:l2 rdf:first "bar"; rdf:rest rdf:nil .
[5:53pm] gkellogg: Just a sec.
[5:53pm] advatar left the chat room. (Client Quit)
[5:53pm] dlongley: so does it just generate a bunch of unique blank nodes then?
[5:53pm] gkellogg: Yes, one for each element in the list.
[5:54pm] dlongley: ok
[5:54pm] gkellogg: ref: http://www.w3.org/TR/rdf-primer/#collections
[5:54pm] advatar joined the chat room.
[5:55pm] dlongley: i ask because we do want to make sure it isn't too difficult of a problem to solve ...
[5:55pm] dlongley: if you do need to use an iterable interface
[5:56pm] gkellogg: I'll take a stab at it.
[5:56pm] dlongley: so you can just store all of the list related triples and a placeholder until each list is built
[5:56pm] dlongley: it isn't as simple as a two pass
[5:57pm] dlongley: but i don't think it's too difficult which is what is important
[5:57pm] gkellogg: Well, no, but basically the first pass removes the rdf:first/rdf:rest bits, then you create ordered arrays. In the second pass, when you see a reference to an object that is in the list map, you replace it with that array definition as {@list: []}
[5:58pm] dlongley: right
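
The two-pass approach described at the end of the log can be sketched roughly as follows. This is only an illustration under simplifying assumptions: triples are plain `(subject, predicate, object)` tuples of strings, literals and identifiers are not distinguished, nested lists are not handled, and the name `from_triples` simply echoes the method discussed in the chat rather than any specified API.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST, RDF_REST, RDF_NIL = RDF + "first", RDF + "rest", RDF + "nil"


def from_triples(triples):
    """Turn (s, p, o) tuples into a flat array of subject objects,
    folding rdf:first/rdf:rest chains into {"@list": [...]} values."""
    triples = list(triples)  # two passes need a re-iterable sequence

    # Pass 1: collect the rdf:first / rdf:rest statements per list node.
    first, rest = {}, {}
    for s, p, o in triples:
        if p == RDF_FIRST:
            first[s] = o
        elif p == RDF_REST:
            rest[s] = o

    def build_list(node):
        # Walk the chain from the head node until rdf:nil.
        # Assumes a well-formed, acyclic list.
        items = []
        while node != RDF_NIL:
            items.append(first[node])
            node = rest[node]
        return items

    # Pass 2: serialize the other statements, replacing a reference to a
    # list head with the ordered array built from pass 1.
    subjects = {}
    for s, p, o in triples:
        if p in (RDF_FIRST, RDF_REST):
            continue
        is_list = o in first or o == RDF_NIL
        value = {"@list": build_list(o)} if is_list else o
        subjects.setdefault(s, {"@id": s}).setdefault(p, []).append(value)

    # Emit subjects in a stable order (sorted by @id).
    return [subjects[s] for s in sorted(subjects)]
```

Because the list statements are gathered before anything is emitted, it does not matter if a later list element arrives before the first one or before the referencing subject, which is the ordering problem raised in the chat.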

lanthaler avatar lanthaler commented on July 22, 2024

I partly understand why you wanna return an object instead of a string as the result of normalization but I'm not really convinced that that's the right thing to do. Maybe it's just because I have different use cases in mind, I don't know. Let me try to explain my reasoning again.

JSON-LD is first and foremost a serialization format for linked data graphs. That being said, I always thought that the output of normalization is a unique representation of a graph. How I make use of that unique representation is completely up to me. I could take it and compute a hash or a signature (that's what payswarm is doing) or just use it to check if two graphs are the same (that's what can be used for testing, e.g.). How I do that, is also completely up to me.
The point is, that every variation of the same graph (I assume you understand what I mean) will result in exactly the same representation - and with representation I mean the bytes on the wire or in a file, not necessarily the parsed representation in main memory. And here lies the crux. If I parse and re-serialize such a normalized representation and don't specify in which order the unordered object properties have to be serialized, I'm unable to reproduce that unique representation. As far as I know, the only way for a JSON-based API is to return a string representation to achieve that.

Please also note that in the current spec there is a MIME type parameter "form=normalized". How would you create such a representation? Define another method that takes a "normalized object" and turn it into a "normalized string representation"? Maybe normalizeToString()?

I have problems in seeing value of a normalize() method that outputs an object, but maybe it's just the word "normalize" that confuses me. Perhaps "flatten()" would be a better word for what you are trying to achieve!? But then I don't really understand why you are advocating N-Triples as the standard normalization format and create parsers of that format for JSON-LD. The interoperability argument is, IMHO, a weak argument as the systems won't be interoperable anyway if they are talking in two different languages (N-Triples/JSON-LD). If someone needs that kind of interoperability, he can write a N-Triples to JSON-LD converter to fulfill his need. But that's out of scope for JSON-LD, that's a requirement of a specific application of JSON-LD.

I tried to write this comment as objectively as possible but I think it sounds a bit aggressive - that's not what it is intended to be. That's merely a weakness in my English skills to express it in a nicer way :-)

gkellogg avatar gkellogg commented on July 22, 2024

On Mar 24, 2012, at 4:16 AM, Markus Lanthaler wrote:

I partly understand why you wanna return an object instead of a string as the result of normalization but I'm not really convinced that that's the right thing to do. Maybe it's just because I have different use cases in mind, I don't know. Let me try to explain my reasoning again.

JSON-LD is first and foremost a serialization format for linked data graphs. That being said, I always thought that the output of normalization is a unique representation of a graph.

If it is a unique representation of a graph, then shouldn't it be the same representation no matter what the input format is? By separating normalization from JSON-LD into RDF Normalization, we're doing just that: providing a means of turning any graph representation into a single output representation. This is why the output of a normalizer needs to be something like N-Triples, which is a reasonably agnostic way to describe a graph.

How I make use of that unique representation is completely up to me. I could take it and compute a hash or a signature (that's what payswarm is doing) or just use it to check if two graphs are the same (that's what can be used for testing, e.g.). How I do that, is also completely up to me.
The point is, that every variation of the same graph (I assume you understand what I mean) will result in exactly the same representation - and with representation I mean the bytes on the wire or in a file, not necessarily the parsed representation in main memory. And here lies the crux. If I parse and re-serialize such a normalized representation and don't specify in which order the unordered object properties have to be serialized, I'm unable to reproduce that unique representation. As far as I know, the only way for a JSON-based API is to return a string representation to achieve that.

For generic graph normalization, there is (will be) a single unique serialization, just not one that parses directly as JSON. Of course, there is a standardized way to turn N-Triples into a normalized JSON representation, but JSON does not have the notion of objects with ordered keys; keys are unordered, so as a (parsed) JSON representation, ordering has no meaning.

Please also note that in the current spec there is a MIME type parameter "form=normalized". How would you create such a representation? Define another method that takes a "normalized object" and turn it into a "normalized string representation"? Maybe normalizeToString()?

There is a normalized representation, but it is normalized in that the parsed representation will be equivalent, not that the stream-of-bytes is equivalent. For graph equivalence to be meaningful, it must be true in spite of the original representation, RDF/XML, Turtle or JSON-LD.

I have problems in seeing value of a normalize() method that outputs an object, but maybe it's just the word "normalize" that confuses me. Perhaps "flatten()" would be a better word for what you are trying to achieve!?

No, it is actually normalizing it. The order of objects in the array is identical, the names of blank nodes are identical. It is more than just a flattened representation.

But then I don't really understand why you are advocating N-Triples as the standard normalization format and create parsers of that format for JSON-LD. The interoperability argument is, IMHO, a weak argument as the systems won't be interoperable anyway if they are talking in two different languages (N-Triples/JSON-LD).

Why do you think that two systems that can perform content negotiation and be agnostic to the input format are not interoperable? If there is a minimum syntax required (N-Triples), then they can certainly interoperate.

If someone needs that kind of interoperability, he can write a N-Triples to JSON-LD converter to fulfill his need. But that's out of scope for JSON-LD, that's a requirement of a specific application of JSON-LD.

As long as we depend on an RDF normalizer, then JSON-LD to RDF and RDF to JSON-LD is not out of scope; it's part of the API. If you're advocating not using a generic RDF normalizer, and making it JSON-LD specific, that seems like a step backwards, and won't have appeal to a wider community. Just look how useful (not) XML Canonical representations have been.

I tried to write this comment as objectively as possible but I think it sounds a bit aggressive - that's not what it is intended to be. That's merely a weakness in my English skills to express it in a nicer way :-)

A certain amount of conflict is part of the process. As long as we don't personalize things, "heated" discussion is fine, and helps create better results. In any case, I didn't take the tone of this email, or any of your other emails as being too aggressive (I hope this is true of mine as well).

Gregg


lanthaler avatar lanthaler commented on July 22, 2024

JSON-LD is first and foremost a serialization format for linked data
graphs. That being said, I always thought that the output of
normalization is a unique representation of a graph.

If it is a unique representation of a graph, then shouldn't it be the
same representation no matter what the input format is? By separating
normalization from JSON-LD into RDF Normalization, we're doing just
that: providing a means of turning any graph representation into a
single output representation. This is why the output of a normalizer
needs to be something like N-Triples, which is a reasonably agnostic
way to describe a graph.

This is the point where I don't agree. I agree that the algorithm can be generic, but not the serialization format. I think we need a unique JSON-LD normalization, e.g., to write such a thing to a file. I would also like to work with such a normalized version directly instead of having to convert it to another format before I can use it. What if there's a little bug in the conversion algorithm? I would accept different graphs as being the same even though they are not.

Nothing stops us from also defining a normalized N-Triples representation. It's just fine in my opinion. Systems deciding to talk in JSON-LD to each other will use a JSON-LD serialization of a normalized graph. Systems talking in RDF/XML are going to use that and others might stick to N-Triples. It's similar to hashing. There are many different algorithms to calculate a hash, and two systems need to use the same algorithm to be able to understand each other.

For generic graph normalization, there is (will be) a single unique
serialization, just not one that parses directly as JSON. Of course,
there is a standardized way to turn N-Triples into a normalized JSON
representation, but JSON does not have the notion of objects with
ordered keys; keys are unordered, so as a (parsed) JSON representation,
ordering has no meaning.

What you are talking about here is to parse N-Triples with a JSON-LD API. You are not talking about turning a normalized N-Triples representation into a normalized JSON-LD representation (on the disk/on the wire), right?

Please also note that in the current spec there is a MIME type
parameter "form=normalized". How would you create such a
representation? Define another method that takes a "normalized object"
and turn it into a "normalized string representation"? Maybe
normalizeToString()?

There is a normalized representation, but it is normalized in that the
parsed representation will be equivalent, not that the stream-of-bytes
is equivalent. For graph equivalence to be meaningful, it must be true
in spite of the original representation, RDF/XML, Turtle or JSON-LD.

No matter how the input looks, the parsed representation will always be equivalent. The point of the "form=normalized" parameter was to express that this is the unique representation of that specific graph. If that's not true, then we have to remove that parameter as there are different "form=normalized" representations for the same graph, i.e., the object properties will be serialized in different orders.

But then I don't really understand why you are advocating N-Triples
as the standard normalization format and create parsers of that format
for JSON-LD. The interoperability argument is, IMHO, a weak argument as
the systems won't be interoperable anyway if they are talking in two
different languages (N-Triples/JSON-LD).

Why do you think that two systems that can perform content
negotiation and be agnostic to the input format are not interoperable?
Certainly if there is a minimum syntax required (N-Triples), then they
can certainly interoperate.

Isn't that a contradiction in itself? Why do you need content negotiation if the two systems are agnostic to the input format? As soon as there is content negotiation, both, N-Triples and JSON-LD will be fine as long as the two systems can agree on one.

If someone needs that kind of interoperability, he can write a N-
Triples to JSON-LD converter to fulfill his need. But that's out of
scope for JSON-LD, that's a requirement of a specific application of
JSON-LD.

As long as we depend on an RDF normalizer, then JSON-LD to RDF and RDF
to JSON-LD is not out of scope; it's part of the API. If you're
advocating not using a generic RDF normalizer, and making it JSON-LD
specific, that seems like a step backwards, and won't have appeal to a
wider community. Just look how useful (not) XML Canonical
representations have been.

RDF is something abstract. We are not talking about RDF/XML here, we are talking about the abstract form of triples. I could just as well call this EAV, it doesn't matter at all. And no, I'm not advocating not using a generic normalization algorithm - I think that makes perfect sense. But the output of the algorithm is still something rather abstract, it's an ordered list of triples. How these triples are then serialized is up to other specifications using that algorithm. In case of JSON-LD that would mean that that ordered list of triples is converted into a JSON representation in the form of a string as there's no other way to preserve that order.

In any case, I didn't take the tone of this email, or
any of your other emails as being too aggressive (I hope this is true
of mine as well).

Good, this is certainly true for your mails as well :-)

gkellogg avatar gkellogg commented on July 22, 2024

Probably best to go back to use cases to understand requirements; it may well be that there's not a need for a normalization method. One use case is to support framing, where I think there are simpler algorithms; for example, flattening the graph by translating to RDF and back would accomplish much the same thing, where there's not an absolute need for determinism. As I've said before, the main use case for normalization is digital signatures, and I think a generic RDF normal form addresses this.

If we do need to support a normalized serialization, we could add on to the existing algorithm by specifying key ordering and whitespace in the serialized output.
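
As a rough illustration of what "specifying key ordering and whitespace" could look like, the sketch below fixes both with Python's standard `json` module. The function name `serialize_normalized` is hypothetical, and this is not a complete canonical form: it leaves number formatting and string escaping to the library, and it assumes array order has already been fixed by the normalization algorithm.

```python
import json


def serialize_normalized(doc):
    """Serialize with lexicographically sorted keys and no
    insignificant whitespace, so equal objects yield equal strings."""
    return json.dumps(doc, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)


# Two objects that differ only in key insertion order.
left = {"@id": "_:c0", "http://example.org/name": "foo"}
right = {"http://example.org/name": "foo", "@id": "_:c0"}

same_bytes = serialize_normalized(left) == serialize_normalized(right)
```

With such a rule in place, a string comparison or hash over the output becomes meaningful, at the cost of every implementation having to follow exactly the same serialization rules.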

Time for others to chime in.

lanthaler avatar lanthaler commented on July 22, 2024

Agree, let's try to find some more use cases first. Maybe a flatten() method would be handy as well if someone prefers to work with triples but still use JSON. Let's see what the rest of the group thinks.

dlongley avatar dlongley commented on July 22, 2024

After going back and forth on this, I think the output of the JSON-LD normalization algorithm should be an object (more precisely, an array). The object will be a unique representation of a graph, but it won't necessarily be a unique object. And by that I mean that the order of objects in a property list can vary in JSON-LD without changing the actual graph.

Also, as we've discussed, there is no actual ordering for keys in JSON. This indicates to me that if we're going to serialize this object to JSON, we shouldn't have the expectation that every other JSON serializer will produce exactly the same byte-for-byte output. We simply can't rely on that because the JSON format doesn't support it. If we must specify a serialization to use, then we must pick one where order is relevant.
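
A small sketch of the point about key order, using Python's standard `json` module and made-up example data: two serializations of the same object can differ byte for byte while the parsed values are equal.

```python
import json

# Same object, keys written in a different order.
a = '{"@id": "_:c0", "http://example.org/name": "foo"}'
b = '{"http://example.org/name": "foo", "@id": "_:c0"}'

strings_equal = (a == b)                          # byte-level comparison
objects_equal = (json.loads(a) == json.loads(b))  # deep comparison
```

Here `strings_equal` is false and `objects_equal` is true, which is why a byte-level comparison of serialized JSON cannot be relied on unless the serialization itself is constrained.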

If you want to compare two normalized JSON-LD objects (arrays), then the correct thing to do is to compare each subject in the arrays, checking their @ids and each of their properties to ensure that the same objects and number of objects appear. This isn't a terribly difficult algorithm to write and it avoids the issue of trying to force key order into a format that doesn't support it.
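
One possible shape for such a comparison is sketched below. It assumes both inputs are already-normalized arrays whose subjects appear in the same order, treats the values of a property as an unordered collection, and keeps `@list` contents ordered. The function name `same_normalized_graph` is hypothetical.

```python
import json
from collections import Counter


def _bag(values):
    # Multiset of values; each value is keyed by a JSON encoding with
    # sorted keys so that nested key order does not matter.
    return Counter(json.dumps(v, sort_keys=True) for v in values)


def _as_list(value):
    return value if isinstance(value, list) else [value]


def same_normalized_graph(left, right):
    """Compare two normalized JSON-LD arrays subject by subject."""
    if len(left) != len(right):
        return False
    for a, b in zip(left, right):
        # Same subject and same set of properties.
        if a.get("@id") != b.get("@id") or a.keys() != b.keys():
            return False
        for prop in a:
            if prop == "@id":
                continue
            # Same objects and same number of objects, in any order.
            if _bag(_as_list(a[prop])) != _bag(_as_list(b[prop])):
                return False
    return True
```

Because each value is encoded as a whole, the items inside a `{"@list": [...]}` value are still compared in order, while the top-level values of a property are not.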

Ultimately, we should recommend that N-Triples be the serialization format of choice if graphs are to be compared lexicographically or are to be hashed (eg: for digital signatures).

So I support the following:

A JSON-LD normalization algorithm that takes the result of a generic normalization algorithm (where the result is an ordered list of triples) and outputs an order-specific JSON-LD array of subjects.

dlongley avatar dlongley commented on July 22, 2024

On 03/24/2012 11:59 PM, Markus Lanthaler wrote:

Please also note that in the current spec there is a MIME type
parameter "form=normalized". How would you create such a
representation? Define another method that takes a "normalized object"
and turn it into a "normalized string representation"? Maybe
normalizeToString()?
There is a normalized representation, but it is normalized in that the
parsed representation will be equivalent, not that the stream-of-bytes
is equivalent. For graph equivalence to be meaningful, it must be true
in spite of the original representation, RDF/XML, Turtle or JSON-LD.
No matter how the input looks, the parsed representation will always be equivalent. The point of the "form=normalized" parameter was to express that this is the unique representation of that specific graph. If that's not true, then we have to remove that parameter as there are different "form=normalized" representations for the same graph, i.e., the object properties will be serialized in different orders.

Well, "form=normalized" is actually being applied to the graph, not just the
over-the-wire serialization. I guess another way of looking at this is
that I suppose it's possible to request an "application/json"
representation from two different servers, that are both members of the
same cluster, and receive two different byte-for-byte results that
aren't actually different objects. That doesn't make "application/json"
useless.

Similarly, you may receive two different serialized byte streams when
requesting "form=normalized", but their meaning will be the same. They
will both represent the same unique graph.

lanthaler avatar lanthaler commented on July 22, 2024

Dave, sorry, but I have problems following your arguments.

Well, "form=normalized" is actually being applied to graph,
not the over-the-wire serialization.

It's a MIME type parameter. That gets applied to the representation you get from a server. What you do with that representation is beyond the scope of a media type definition.

I guess another way of looking at this is
that I suppose it's possible to request an "application/json"
representation from two different servers, that are both members of the
same cluster, and receive two different byte-for-byte results that
aren't actually different objects. That doesn't make "application/json"
useless.

Similarly, you may receive two different serialized byte streams when
requesting "form=normalized", but their meaning will be the same. They
will both represent the same unique graph.

Same applies to application/ld+json. We can get the graph in several forms and that was initially the motivation for "form=normalized" (at least that was my understanding). If I can't calculate a hash or something straight from the representation without having to fire up a parser or something, "form=normalized" has no value - at least not for me. The use cases I had in mind were, e.g., that you receive a JSON-LD document and check the signature of it before parsing and processing a potentially large graph that in the end turns out to be invalid. If we don't require something like that, I suggest the simplest thing is to just drop "form=normalized" and the normalized form in JSON-LD. If someone needs to do graph comparison or signature calculation he can use the RDF normalization algorithm or do it in another way that is beyond the scope of the JSON-LD spec.
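
The "check before parsing" use case can be sketched as below. It is a simplification: a plain SHA-256 digest stands in for a real signature scheme, the function name `verify_then_parse` is hypothetical, and it only works if the sender and receiver agree on a byte-exact serialization, which is exactly the property being argued for here.

```python
import hashlib
import hmac
import json


def verify_then_parse(raw: bytes, expected_hex: str):
    """Check a digest over the received bytes, and only parse the
    document if it matches."""
    actual = hashlib.sha256(raw).hexdigest()
    # Constant-time comparison of the two hex digests.
    if not hmac.compare_digest(actual, expected_hex):
        raise ValueError("digest mismatch; document not parsed")
    return json.loads(raw)
```

Nothing is parsed when the digest does not match, so a large invalid document is rejected after a single pass over its bytes.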

dlongley avatar dlongley commented on July 22, 2024

My point about the MIME type parameter was that you seemed to be suggesting that "form=normalized" was useless because you might receive two different byte-for-byte serializations of the same resource representation. I may have misunderstood this as your point -- but if not, then I disagree about the lack of utility for the same reason I provided w/respect to "application/json". A serialized representation doesn't have to always be in the same order, so long as the representation doesn't actually change; which is the case with key-order in JSON and property object order in JSON-LD.

I don't necessarily buy the argument about avoiding processing a potentially large graph (that may turn out to be invalid) by simply checking a signature. If we're defining "waste" as the work required to determine whether or not something is invalid, what amount of waste is reasonably acceptable? In your example, you'll have to receive the entire invalid graph (waste) and then process it to produce a signature (waste). You seem to argue that this amount of "waste" is acceptable -- but why?

If we're talking processing time/effort, then I wouldn't expect most modern computers to have an issue with the added overhead of parsing the graph to check its validity. And, if we're looking at waste in terms of developer effort, it's only going to be an API call or two. I don't think it will make much of a difference either way.

I think the two primary uses for a normalized JSON-LD object are: ensuring that you're working with a unique graph in JSON that has a flat structure and for in-memory graph comparisons (different from the simple: "is this graph the same?" "yes/no" question). I would expect most people to convert to N-Triples to hash and digitally sign/verify because of interoperability.
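
A minimal sketch of hashing via N-Triples, assuming the statements have already been through normalization so that blank node labels are deterministic. The choice of SHA-256, the line-sorting rule, and the name `graph_digest` are assumptions made for the example, not something this thread specifies.

```python
import hashlib


def graph_digest(ntriples_lines):
    """Hash a graph given as N-Triples statements, one per line.
    Lines are sorted so the digest does not depend on statement order."""
    lines = sorted(line.strip() for line in ntriples_lines if line.strip())
    canonical = "".join(line + "\n" for line in lines)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


doc_a = [
    '_:c0 <http://example.org/name> "foo" .',
    '_:c0 <http://example.org/knows> _:c1 .',
]
doc_b = list(reversed(doc_a))  # same statements, different order

digests_match = graph_digest(doc_a) == graph_digest(doc_b)
```

Since the input is a list of statements rather than a JSON document, the same digest can be produced regardless of whether the graph originally arrived as JSON-LD, RDFa, or Turtle.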

dlongley avatar dlongley commented on July 22, 2024

Another way to think about this that might help:

We're talking about "a normalized graph in JSON-LD" not "normalized JSON-LD". There's a subtle but important difference that might be getting lost in the language here.

dlongley avatar dlongley commented on July 22, 2024

Perhaps "form=normalized" is poor word choice and we can choose something else. What you're getting when you request that is this:

A JSON-LD representation of a deterministically-labeled graph. This means you're getting a JSON array of JSON-LD subjects sorted by @id.

lanthaler avatar lanthaler commented on July 22, 2024

We're talking about "a normalized graph in JSON-LD" not "normalized JSON-LD".
There's a subtle but important difference that might be getting lost in the language here.

I think that's exactly the crux: I would like to have normalized JSON-LD while you and Gregg would like to have a normalized graph in JSON-LD. I want to have a unique byte-representation of a graph.

lanthaler avatar lanthaler commented on July 22, 2024

RESOLVED: Remove the normalization algorithm and API from the JSON-LD API specification. The normalization algorithm will be placed into a separate RDF Graph Normalization specification which contains an API for retrieving a set of normalized statements.
