Comments (12)

SmithSamuelM commented on July 1, 2024

@peacekeeper I agree that if one takes extra measures like using hash links, one can go a long way toward addressing some of my security concerns about @context. However, I agree with @dhh1128 that there is no guarantee that that is how they will be used. Fundamentally, from a secure system design standpoint, it's not that someone could use @context in a more or less secure way (which, even using the best suggestions, still has security vulnerabilities) but that when used as intended (which is in an open world model) there is no way to guarantee any level of security. Security is about reducing the attack surfaces that malicious users can exploit in ways that are difficult to anticipate. @context is intended as an extensibility mechanism that enables injection of information not under the control of the issuer. Any resolver, or any other intermediary such as a relay, can use @context to inject arbitrary information and a json-ld verifier will accept it. This means that we would have to add a new protocol that is not normative to json-ld, and that is not normative to the did method itself, to protect against such an injection. A normative did doc itself is not end-verifiable (it is not signed); one has to trust the did resolver or any other type of intermediary that provides it. Therefore a malicious attacker could impersonate any intermediary and inject in such a way that the end-verifier can't detect such an injection, because any intermediate '@context' injection is normatively correct vis-a-vis JSON-LD and the did doc spec.

I know that "best practice" would be to only run one's own resolver in one's own infrastructure so that it's more protected than some resolver running in someone else's infrastructure. Moreover, given that we have a universal resolver that is meant to be run anywhere, a verifier can't guarantee that best practices have been followed, if for no other reason than that we have made it easy to violate them by virtue of publishing implementations that do not use them. But even if one runs their own resolver in their own infrastructure, it is still a much weaker security posture than one that is end-verifiable. One can have trusted intermediaries, but such intermediaries are still subject to various attacks. Attacks on such intermediaries are well-known exploits for DNS resolvers and BGP resolvers even when in one's own infrastructure. Running one's own DID resolver is not very strong protection, relatively speaking. Furthermore, @context injection, because it is normatively correct, makes such an attack undetectable downstream. To reiterate, because @context is a normatively correct mechanism that allows arbitrary injection of information, a successful attack is undetectable. Protocols that enable undetectable but successful attacks are pathologically broken protocols from a security standpoint. The harm from such an attack is unbounded because the attack can persist indefinitely without detection and hence without mitigation.

Another way of appreciating the problems of non-normative protection mechanisms is by viewing them as a form of ad-hoc security. Because one can't reliably reason about ad-hoc security, one can't build secure systems on top of such mechanisms. At the very least one must incur the cost of continual manual monitoring of all implementations of ad-hoc mechanisms. We should be wary of any such approaches.

KERI has two important security guarantees for KEL-backed data.

  1. Any successful attack must begin with the successful compromise of one or more private keys.
  2. Any successful attack is detectable as duplicity. This is true because it requires publication to the KEL in a way that is observable to the controller of the KEL or any other verifier of the KEL.

A normative mechanism that is correct but allows injection of arbitrary data violates both the first and second properties of KERI. The first because the information can be injected without compromising at least one private key of the issuer. The second, because any such injection is undetectable downstream and may hide in plain sight despite any attempt to make it KEL backed.

In the book Practical Cryptography by Ferguson and Schneier there are two fundamental security system design principles.

  1. Complexity is the worst enemy of security.
  2. Correctness must be a local property

@context violates principle 2 in about as salient a fashion as one can violate it.

The best way to protect against that type of attack is to convert a did doc into an ersatz verifiable credential that elides the @context. Then the ersatz VC can be securely attributed to its issuer, not to any intermediary such as a did resolver. Any end user can then verify the ersatz VC without the @context, and then the end user can choose to inject an @context, and thereby inject whatever information comes with it, and then do expansion to an RDF graph. This makes the ersatz VC, acting as the did doc, end-verifiable with no possibility of malicious injection mid-stream, and only the end user can inject the @context after verification.

Thus the presence of @context in a did doc provided by an intermediary would be a sign of exploit and must be discarded. Arbitrary injection then cannot happen in any intermediary but is guaranteed to only happen post verification by the end user. This gives us a strong guarantee that removes such an attack from the attack surface regardless of how any intermediary is implemented or how well the intermediary follows best practices for non-normatively protecting against exploit of @context. It's off the table. It has therefore simplified our security posture and we can reason about it correctly.
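The verify-first, inject-later flow described above can be sketched roughly as follows. This is only an illustration under loud assumptions: a plain SHA-256 digest over a sorted-key JSON serialization stands in for real KERI signature and SAID verification, `anchored_digest` is assumed to have been obtained from the issuer's already-verified KEL, and all function names are hypothetical. Rejecting the whole document when an intermediary supplies an @context is one reading of "must be discarded".

```python
import hashlib
import json


def digest_without_context(did_doc: dict) -> str:
    """SHA-256 over a sorted-key serialization of the doc with @context removed.

    Stand-in for real end-verification (signatures, SAIDs); assumption only.
    """
    stripped = {k: v for k, v in did_doc.items() if k != "@context"}
    canonical = json.dumps(stripped, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def accept_from_intermediary(did_doc: dict, anchored_digest: str) -> dict:
    """Check a doc received from a resolver or relay before using it."""
    if "@context" in did_doc:
        # An intermediary-supplied @context is treated as a sign of exploit.
        raise ValueError("@context supplied by intermediary; discarding document")
    if digest_without_context(did_doc) != anchored_digest:
        raise ValueError("document does not match the digest anchored by the issuer")
    return did_doc


def expand_locally(verified_doc: dict, context: list) -> dict:
    """Only the end user, after verification, adds an @context of their choosing."""
    return {"@context": context, **verified_doc}
```

The point of the ordering is that nothing an intermediary adds can influence the verification step; any context-driven interpretation happens afterwards, under the end user's control.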

Now if we are NOT going to make a did doc an ersatz verifiable credential, then we can't trust any information it provides, so we must verify every item of information it provides. This means the did doc itself is merely a discovery mechanism and all the information in it must eventually lead to the discovery of end-verifiable KEL backed data. Furthermore, as mentioned above, because @context injection is not KEL backable it has no place in such a did doc, e.g. one that is meant to be secured by KERI mechanisms. It would violate KERI's security guarantees.

from tswg-did-method-webs-specification.

swcurran commented on July 1, 2024

Agree


dhh1128 commented on July 1, 2024

Disagree with the reasoning, but could go along with the conclusion.

Re. "vast majority of DID methods", I think this is entirely irrelevant. Being similar to arbitrary other methods has no interop benefits, and the vast majority of the devs who implement or the people who make or consume these DIDs won't care -- either about the theoretical question, or about any feature delta attributable to the difference.

Re. did:web: This argument is more relevant, IMO. However, I am becoming increasingly uncomfortable that we are setting up did:webs as "did:web plus an extra, optional security feature". What we should be saying is that did:webs is "like did:web, but with an upgraded approach to security." These two characterizations are similar, but the difference in them matters -- and the did:web argument feels to me like it's part of the first characterization.

The "did:web plus an extra, optional security feature" narrative encourages a false mental model -- or, if it's a true mental model, it results in a DID method that is unsafe. How many times have we heard that you can't bolt on security after the fact? What makes did:webs secure isn't just a bolted on feature, it's thinking about the basis of security differently. And what we are building here is an upgrade path, not a perpetual parallel where either did method is equally good and has the same feature set. If we tell people that they don't have to think about the basis differently -- they can use their entire mental model, unmodified, and just tack on one nifty thing at the end -- we are actually going to end up with a method that is secure in theory, but used wrong in practice, undermining the security and helping nobody.

<predictable_soapbox>JSON-LD is insecure unless your contexts are locked down with hashlinks or similar, or unless you agree not to parse any unfamiliar contexts. You can stipulate that no unfamiliar contexts are allowed, but then of course you don't have the open world assumption that its advocates cite as its raison d'etre. Security-wise, it is thus the opposite of what this method is trying to accomplish.</predictable_soapbox>

What could convince me to go along is:

  • We justify this choice without referencing the vast majority of other DID methods.
  • We do NOT encourage a mental model of "did:web + extra, optional security" but rather "similar to did:web but with an upgraded approach to security" -- and we therefore justify this with a phrase like "in order to facilitate upgrades from did:web usages that assume JSON-LD".
  • To avoid any possibility of re-contaminating the security of the method, we stipulate that no JSON-LD features are legal other than @context. No vocabs, no extensibility, no lang, etc. It is JSON-LD not for open world model, but purely to ease migration.
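As a rough sketch of what the last stipulation could look like in code, assuming the rule is read as "the only JSON-LD keyword allowed anywhere in the document is a top-level @context, and that @context may not itself define @vocab, @language, or similar". The function name and the path notation are hypothetical.

```python
def find_forbidden_keywords(node, path="$"):
    """Return the paths of every '@'-prefixed key other than a top-level @context."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            is_top_level_context = key == "@context" and path == "$"
            if key.startswith("@") and not is_top_level_context:
                found.append(child)
            # Recurse everywhere, including into the @context value itself,
            # so inline definitions such as @vocab are also reported.
            found.extend(find_forbidden_keywords(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(find_forbidden_keywords(item, f"{path}[{index}]"))
    return found
```

A document passes only when the returned list is empty; a validator following this reading would reject anything else.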

Something like that.

@SmithSamuelM


SmithSamuelM commented on July 1, 2024

Disagree. json-ld used correctly is meant to support an open world model which is antithetical to the security properties we need to enforce for did:webs to have any reason for existence. Dumbing down the security is entirely problematic. Every feature we add to did:webs needs to adhere to a minimal level of security properties otherwise we should just move this discussion to the did:web working group and add the features there.

That said, however, a secure JSON document can be transformed into a json-ld by anyone who wants to expand it into an insecure open world model. Thus, there is no need for did:webs to incur the complexity or the security flaws of json-ld no matter how well caveated. More appropriate would be some external spec that instructs how a given application could convert the json did:doc into a json-ld along with all the caveats.


SmithSamuelM commented on July 1, 2024

I couldn't be more adamant: no @context! None, nada. Have we learned nothing over the last several years?


pfeairheller commented on July 1, 2024

Disagree for all the security arguments already made and in addition JSON-LD adds no value. A pure JSON did doc is spec compliant and will work fine.


swcurran commented on July 1, 2024

I care that the document be spec compliant. If that can be done without json-ld I’m fine with that. If the spec requires it, that too is fine, as it would be ignored by most.


peacekeeper commented on July 1, 2024

Understood..

I don't think JSON-LD is inherently insecure. In this DID method, the @context would only contain one or two well-known entries that define the main terms used by the DID document. I have never seen any DID resolver which downloads contexts from a remote, insecure location. And if the DID document contained any additional unknown @context entries, or if any "open world" extensions are included in the DID document, then in these cases a did:webs resolver should throw an error anyway, since it verifies the DID document contents against the KEL.
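A minimal sketch of the "throw an error on unknown @context entries" behavior, assuming a fixed allow-list. The two entries used here are simply the ones from the DID core example and are placeholders; which contexts a did:webs resolver would actually permit is not settled in this thread, and the names below are hypothetical.

```python
# Placeholder allow-list; the real set for did:webs is an assumption here.
ALLOWED_CONTEXTS = frozenset({
    "https://www.w3.org/ns/did/v1",
    "https://w3id.org/security/suites/ed25519-2020/v1",
})


class UnknownContextError(ValueError):
    """Raised when a DID document carries a context outside the allow-list."""


def check_contexts(did_doc: dict, allowed=ALLOWED_CONTEXTS) -> None:
    """Raise if any @context entry is not a well-known string identifier."""
    context = did_doc.get("@context", [])
    entries = context if isinstance(context, list) else [context]
    for entry in entries:
        # Inline (object) contexts are rejected too, since they can define terms.
        if not isinstance(entry, str) or entry not in allowed:
            raise UnknownContextError(f"unexpected @context entry: {entry!r}")
```

Note that this check only constrains which identifiers appear; it says nothing about where the meaning of those identifiers comes from.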

In fact, this could be a powerful demonstration.. We could show how a JSON-LD DID document with @context resolves just fine with did:web, but fails to resolve with did:webs, if any insecure or compromised JSON-LD elements are detected in the DID document :)

I agree with the "similar to did:web but with an upgraded approach to security" mental model, but not sure whether using JSON-LD or pure JSON better communicates this mental model.

But it's true of course that a pure JSON DID document is also completely spec compliant, so I'm also fine with doing that.


dhh1128 commented on July 1, 2024

> I don't think JSON-LD is inherently insecure. In this DID method, the @context would only contain one or two well-known entries that define the main terms used by the DID document. I have never seen any DID resolver which downloads contexts from a remote, insecure location.

@peacekeeper : Nobody on the planet knows more about DID resolvers than you do, I'm sure. And maybe nobody knows more about DIDs in general, either. Therefore, when I read what you say here, very carefully, I can find a way to not disagree with you. However, I think casual readers of your statement might get the wrong idea from it. So I'm going to respond a bit pedantically, even though I know you already know all of this, because I think the subtleties need to be documented in this repo's history.

Casual analyzers of DIDs and security often make dangerous assumptions about what "secure" should mean in a DID context. The first example of a DID doc in DID core begins like this:

  "@context": [
    "https://www.w3.org/ns/did/v1",
    "https://w3id.org/security/suites/ed25519-2020/v1"
  ] 

When people read the DID standard, and they hear world-class experts like you say that resolvers never resolve contexts from a "remote, insecure location", they conclude that, since JSON-LD theory would call for resolvers to download contexts, and since this JSON-LD fragment shows contexts at URLs like "https://www.w3.org/ns/did/v1" and "https://w3id.org/security/suites/ed25519-2020/v1", those two URLs must be "secure". After all, they are managed by the W3C and they begin with "https", and every JSON-LD DID doc in the world MUST reference them...

In fact, the reason your statement is true is not because those remote locations are secure -- they are not -- but because none of the resolvers with market gravitas EVER downloads contexts from them at all, period. This is the security workaround advocated by JSON-LD experts, but not necessarily obvious to every developer who writes a new DID resolver. I have seen the insecurity of these URLs publicly acknowledged in writing by JSON-LD experts on at least 3 or 4 occasions.

Why am I claiming that these URLs are insecure? Again, I know you know this, Markus, but for other readers with fewer battle scars: because a malicious party with access to W3C's hosting infrastructure (hacker, evil sysadmin), or to any proxy or intermediary that returns content over the HTTPS session, or any party that can conduct a DNS poisoning attack, can swap the content that a client receives over HTTPS if it fetches that URL. And that means that the meaning of the terms in a JSON-LD doc -- even if there are only a handful of simple ones -- can be changed. If I were an evil hacker who wanted to subtly undermine a digital trust ecosystem that claimed to provide rock-solid guarantees about control of identifiers and credentials, this is EXACTLY how I would do it.
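To make "the meaning of the terms can be changed" concrete, here is a deliberately simplified toy model of term expansion. It is not the real JSON-LD expansion algorithm, and the example.org IRIs are invented for illustration; the only point is that the same document bytes expand to different meanings depending on which context content the client happened to receive.

```python
def expand_terms(doc: dict, context: dict) -> dict:
    """Toy expansion: replace each short term with the IRI the context assigns."""
    return {
        context.get(term, term): value
        for term, value in doc.items()
        if term != "@context"
    }


# Context content as originally published (hypothetical IRIs).
honest_context = {"controller": "https://example.org/vocab#controller"}
# Context content served after the URL's content has been swapped.
swapped_context = {"controller": "https://example.org/vocab#alsoKnownAs"}

doc = {"controller": "did:example:123"}

honest_view = expand_terms(doc, honest_context)
swapped_view = expand_terms(doc, swapped_context)

# The document itself is unchanged, yet its expanded meaning differs.
assert honest_view != swapped_view
```

Nothing in the document would reveal the difference, which is why such a swap is hard to detect downstream.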

The workaround of not downloading the URL during resolution is correctly called a "workaround", not a "solution". The reason is that it just changes the timescale of the attack; it doesn't eliminate it. Sure, a developer can hardcode the contexts into their resolver. But where did they get the hardcoded contexts, or where did they go to study those contexts as they were hardcoding semantics into their resolver? Answer: they fetched that URL. A developer who read the context on Jan 1 for inclusion in resolver 1's code, and a developer who read the context on July 15 for resolver 2's code, are not guaranteed to have downloaded the same thing, because of the URL's fundamental insecurity. The chances that this will happen, on these timescales, are lower than the chances that it will happen if everybody did dynamic resolution -- but those chances are still not zero. We know of sophisticated, subtle attacks on security infrastructure that lurk for years, precisely to create ways that trust in security tech can be manipulated, so I don't think this is an imaginary concern.

This vulnerability is inherent in URLs on the web, which are designed to allow different content to be served at different times. As early as about 2004, I believe I was reading recommendations from W3C that most URLs should be permalinks, but they've basically given up on that advice AFAICT, marginalizing immutable stuff like purls. Today, the vulnerability can be eliminated with hashlinks (or far better, with the SAIDs that KERI is built from, which have numerous advantages). This makes it accurate to claim that JSON-LD is not insecure, THEORETICALLY. However, hashlinks are still not fully standardized, let alone broadly implemented. Even if they were, it will take another generation, at least, before developers generally understand the need to use them to achieve security. The fact that we're not even seeing them in the DID spec itself tells you how far we are from having this behavior be mainstream. So, even if there's a theoretical argument that JSON-LD can be used in a secure way, it is too practically insecure to be an appropriate foundation for high-security applications. Not all DIDs have high security goals, but in general, they do. In this method, specifically -- where we aim to fix security problems inherent in the pure-web-and-URL trust basis of did:web -- keeping a very high standard for how we define "secure" is crucial. And JSON-LD doesn't meet that standard.
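The idea behind locking content to a hash can be sketched in a few lines. This is a simplification: it compares a plain SHA-256 hex digest, whereas actual hashlink and SAID encodings differ in detail, and the function name and error type are hypothetical.

```python
import hashlib


def verify_pinned(content: bytes, pinned_sha256_hex: str) -> bytes:
    """Return the content only if it matches the digest the reference was pinned to."""
    actual = hashlib.sha256(content).hexdigest()
    if actual != pinned_sha256_hex:
        raise ValueError(
            f"content digest {actual} does not match pinned digest {pinned_sha256_hex}"
        )
    return content
```

With a check like this, a developer who fetched the context on Jan 1 and one who fetched it on July 15 either get byte-identical content or get an error, regardless of what the URL serves in between.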


swcurran commented on July 1, 2024

I think we are all on the same page. As long as the doc is spec. compliant we are good with as little JSON-LD as needed. If that is nothing, great. If we need a context element great.

No LD Tools will be used in the making of this DID Method.

Can we close this issue?


peacekeeper commented on July 1, 2024

I'm fine with closing, but - also for documentation purposes - allow me to play the role of "JSON-LD apologist" one more time..

> JSON-LD theory would call for resolvers to download contexts

> The workaround of not downloading the URL during resolution is correctly called a "workaround"

I have never understood JSON-LD quite like this. To me, the @context entries are primarily identifiers (URIs, not URLs). Their context files may be downloadable from those locations (and I admit the JSON-LD spec mentions this), but they don't have to be dynamically downloaded or even be downloadable at all. I have never considered dynamic downloading of these files to be the primary way of establishing the context. For example, our JSON-LD document loader doesn't download by default, so the JSON-LD processing is entirely local, unless remote downloading is explicitly turned on.
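A local-by-default document loader of the kind described might look roughly like this. It is a generic sketch, not the API of any particular JSON-LD library; `make_local_loader`, `allow_remote`, and `fetch` are hypothetical names, and `bundled` is assumed to be a mapping from context identifiers to context documents shipped with the software.

```python
class RemoteLoadingDisabled(LookupError):
    """Raised when a context is not bundled and remote loading is off."""


def make_local_loader(bundled: dict, allow_remote: bool = False, fetch=None):
    """Build a loader that treats context URIs as identifiers into a local table."""

    def load(uri: str):
        if uri in bundled:
            # Purely local lookup: no network access is involved.
            return bundled[uri]
        if allow_remote and fetch is not None:
            # Only reached when the caller explicitly opted in.
            return fetch(uri)
        raise RemoteLoadingDisabled(uri)

    return load
```

This keeps processing local at resolution time; it does not by itself address how the bundled copies were obtained in the first place.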

In plain JSON you have other ways of learning the semantics of the data. In JWT you may have to download the IANA registry of claims at some point (kind of like a centralized "context"). In a JSON DID document, you may have to download the DID Spec Registries. Since these resources are on the web, they could be attacked in the same way as a JSON-LD context, no?

But I can see a difference between theory and practice. I can see that in practice, since JSON-LD contexts CAN (easily) be downloaded, and since they ARE machine-readable, the risk of developers downloading them in an insecure way is greater than with other technologies.

> hashlinks

I agree ACDC's SAID approach is much nicer than "bolting on" hashes to HTTP URLs.

> we're not even seeing them in the DID spec

In the DID spec there is a DID parameter called "hl" (hashlink), but this is not widely used, and its value would only protect the DID document itself, not the contents of the contexts, so it wouldn't solve the concerns here.


peacekeeper commented on July 1, 2024

Thanks for the detailed explanations, I read them with interest. I fully believe in end-verifiability as provided by KERI/ACDC.

My comments in this issue were not really about JSON-LD vs. KERI/ACDC.

They were more about JSON-LD vs. plain JSON, and in this comparison, I don't think JSON-LD is so fundamentally more insecure. If the @context (needed by JSON-LD) has security problems, then so does the IANA JWT term registry and the DID Spec Registry (needed by JSON), since those are not end-verifiable either.

Oh and I wonder, isn't ACDC also a form of linked data (<- lower case!) and based on an open-world semantic model? Just a better, more secure, end-verifiable form of it? :)

Anyway, I am closing this now, maybe we can continue in an IIW session.

