Comments (11)
I've been putting some thought into this, and I think I have some ideas.
The GSD namespace is really what we want to be THE source of information. If we want to see updates and/or corrections, that's really what we want. I think we should figure out how to turn existing NVD entries into GSD (OSV) format.
If we want to modify the data, that's the place to do it. Keeping NVD changes current will raise some issues, but that's a different discussion, so let's just ignore that point for the moment.
If someone wants to add their own data, that's where a namespace makes sense I think. Some namespaces will be merged into GSD, some won't. It'll just depend, but the point is your namespace is for whatever you want.
from gsd-tools.
So several questions/comments:
- we want to import the data automatically from various sources as much as possible (scaling/efficiency, etc)
- what do we do when that data is broken or incorrect? do we fix it? where?
- what happens when (if?) the original source changes it? was that a fix? (or is it more broken?) do we want someone to review it? do we flag it?
- as for where we fix the data, if we write the NVD data in, then fix it, we have that trail in git, e.g. we still have the original data (if anyone cares to go look for it)
- if we are fixing it by overwriting it, how do we indicate we changed it from the original? "hey GSD, you have a typo, NVD says CVSS score of X, you say Y"
- how do we indicate WHY we changed it, e.g. what evidence caused us to change this data?
- if we fix the data outside the namespace where it exists, and then for example have the API overlay our improved data when serving the file, do we put the overlay data in the root namespace, or what?
from gsd-tools.
Also your commit seems to assume the NVD namespace is empty, which it isn't, so the commit needs to be reworked to overwrite (merge? overlay? some word like that) the data into the existing "nvd.nist.gov" space (or else the json breaks, can't have multiple identically named keys).
This also all assumes we want to overwrite the "broken" entry and not just insert a more correct one.
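To make the duplicate-key problem concrete, here is a minimal Python sketch. It assumes an entry shaped like {"namespaces": {"nvd.nist.gov": {...}}}; the function name merge_into_namespace and the field names are hypothetical, and the shallow merge is just one possible policy, not what the bot actually does.

```python
import copy
import json

def merge_into_namespace(entry, namespace, new_data):
    """Return a copy of entry with new_data merged into one namespace.

    Top-level keys in new_data overwrite keys of the same name; all other
    existing keys are kept. This is a shallow merge, chosen for illustration.
    """
    result = copy.deepcopy(entry)
    target = result.setdefault("namespaces", {}).setdefault(namespace, {})
    target.update(copy.deepcopy(new_data))
    return result

# Appending a second identically named key does not merge anything:
# Python's json module keeps only the last of the duplicate keys.
duplicated = json.loads('{"nvd.nist.gov": {"a": 1}, "nvd.nist.gov": {"b": 2}}')

# Hypothetical entry and correction, for illustration only.
entry = {"namespaces": {"nvd.nist.gov": {"score": 5.0, "id": "EXAMPLE-1"}}}
merged = merge_into_namespace(entry, "nvd.nist.gov", {"score": 7.5})
```

Because the original entry is left untouched, the same function could also be used to write the corrected copy somewhere else instead of overwriting the "broken" one.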
from gsd-tools.
More thinking out loud:
This is why we picked git initially. We can do things like have a random person overwrite a "Read only" namespace and 1) we can revert it easily and 2) we can know exactly who did what and when, and, with a good commit message, why.
So I'm inclined, for now, to allow overwrites of these "Read only" spaces, and we see where this goes and how to best deal with it. Worst case, we clean it up by overwriting it with the NVD data and move the altered data somewhere else.
from gsd-tools.
I was basing it off of what you'd done at CloudSecurityAlliance/gsd-database@1efab96, though once it was fixed in NVD someone else reverted that change. The bot that runs to populate the nvd.nist.gov namespace is definitely going to clobber anything we put there, so I guess it doesn't really make sense to suggest changes there at all.
from gsd-tools.
The whole point to a namespace is keeping it off limits to others. I like that we can't monkey with the NVD or CVE data. What we have is what they have. If they update something, we pick up that update quickly.
We know we want to be able to add enrichment data, that's basically the whole point right now. There's no good way to enrich the existing NVD upstream data without a lot of pain.
I think there are two types of enrichment (let's just think about this in the context of NVD for the moment, it will help keep the scope sane)
- Correcting an existing read only namespace
- Adding new data
For adding new data, I think adding something to either the GSD namespace or your own namespace would be fine.
For corrections, I'm not aware of a great way to correct portions of the NVD data.
from gsd-tools.
Ok so adding data is easy if you use your own namespace, the trick becomes knowing when/how to overlay it, e.g. let's assume I use seifried.org, and people trust/want to use my data in my namespace. If I have something like (again for the sake of argument):
overlay: { namespaces: { cve.org: [some CVE data like an affects set of data] } }
are we adding my data to the cve.org data? replacing it entirely? Because two very common cases are "they got it wrong, here's the correct one" and "they are incomplete, here's more data". When you can only have one item, like a description it's easy, it overwrites the existing one, but when you have lists (e.g. affects, or references) then what?
So we may also need a way to indicate that this data is "in addition to" or "replaces" whatever keys are in the same space. We also need a way to specify what we're adding or overwriting (originally I used the term "overlay", I still can't think of a better one).
One option would be:
overlay: { replace: { namespaces: { cve.org: [some CVE data like an affects set of data] } } }
overlay: { addto: { namespaces: { cve.org: [some CVE data like CVSS environmental data] } } }
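The "replace" versus "addto" idea above could be sketched roughly as follows in Python. This is only an illustration under assumptions: the function name apply_overlay is hypothetical, and each namespace is assumed to hold a dict of fields (rather than the bare list used in the pseudo-JSON) so that single-valued fields and lists can be told apart.

```python
import copy

def apply_overlay(namespaces, overlay):
    """Apply 'replace' and 'addto' overlays to a copy of the namespaces dict."""
    result = copy.deepcopy(namespaces)

    # "replace": the overlay value wins outright, whatever was there before.
    for ns, fields in overlay.get("replace", {}).get("namespaces", {}).items():
        for key, value in fields.items():
            result.setdefault(ns, {})[key] = copy.deepcopy(value)

    # "addto": lists are extended; anything else falls back to overwrite.
    for ns, fields in overlay.get("addto", {}).get("namespaces", {}).items():
        for key, value in fields.items():
            existing = result.setdefault(ns, {}).setdefault(key, [])
            if isinstance(existing, list) and isinstance(value, list):
                for item in value:
                    if item not in existing:  # skip exact duplicates
                        existing.append(copy.deepcopy(item))
            else:
                result[ns][key] = copy.deepcopy(value)
    return result

# Hypothetical data, for illustration only.
namespaces = {
    "cve.org": {
        "description": "old text",
        "references": ["https://example.com/a"],
    }
}
overlay = {
    "replace": {"namespaces": {"cve.org": {"description": "corrected text"}}},
    "addto": {"namespaces": {"cve.org": {
        "references": ["https://example.com/a", "https://example.com/b"],
    }}},
}
overlaid = apply_overlay(namespaces, overlay)
```

Here the description is overwritten ("they got it wrong") while the references list only grows ("they are incomplete"), which covers the two common cases mentioned above.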
Now having said this all, there may be a better solution:
We populate the root osv:{} based on data in the root and in namespaces (e.g. CVE, NVD, etc.). We can basically just sort it out and write "the best truth" in osv:{}. If people don't like it, they can choose to have their own parsing rules (e.g. "do we trust the seifried.org namespace to overlay cve.org?") and so on.
My vote, for now:
People write stuff to their namespaces. We parse it and write it to the root osv:{}. This will be a natural extension of the GSD to osv:{} conversion I'm working on (it already has this mindset).
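As a rough picture of "parse the namespaces and write the result to the root osv:{}", here is a Python sketch. It is not the actual GSD to osv:{} conversion: the function name build_root_osv, the trusted_order parameter, and the assumption that every namespace already holds OSV-style fields are all hypothetical simplifications.

```python
def build_root_osv(entry, trusted_order):
    """Return a copy of entry whose "osv" key is filled from its namespaces.

    Namespaces are consulted in trusted_order; the first one that provides a
    field wins. Namespaces not listed in trusted_order are ignored.
    """
    osv = {}
    namespaces = entry.get("namespaces", {})
    for ns in trusted_order:
        for key, value in namespaces.get(ns, {}).items():
            osv.setdefault(key, value)  # keep the value from the earlier namespace
    return {**entry, "osv": osv}

# Hypothetical entry, for illustration only.
entry = {
    "namespaces": {
        "cve.org": {"summary": "from CVE", "details": "CVE details"},
        "seifried.org": {"summary": "corrected summary"},
        "untrusted.example": {"summary": "noise"},
    }
}
trusting = build_root_osv(entry, ["seifried.org", "cve.org"])
upstream_only = build_root_osv(entry, ["cve.org"])
```

Someone who does not trust seifried.org to overlay cve.org simply passes a different trusted_order, which is the "own parsing rules" idea in code form.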
from gsd-tools.
To describe changes to JSON there is JSON Patch, defined in RFC 6902, and there is also a definition for merging JSON objects (JSON Merge Patch) in RFC 7396.
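For reference, the merge algorithm from RFC 7396 is short enough to sketch in plain Python. The function below follows the pseudocode in the RFC; the sample data is made up for illustration.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7396) and return the result."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target entirely (this includes lists).
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for name, value in patch.items():
        if value is None:
            # null in the patch means "remove this member".
            result.pop(name, None)
        else:
            result[name] = merge_patch(result.get(name), value)
    return result

# Hypothetical data, for illustration only.
original = {
    "description": "old text",
    "references": ["https://example.com/a"],
    "withdrawn": "2021-01-01",
}
patch = {
    "description": "corrected text",
    "references": ["https://example.com/a", "https://example.com/b"],
    "withdrawn": None,
}
patched = merge_patch(original, patch)
```

Note that a merge patch can only replace a list as a whole, so "add one more reference" means resending the full list. JSON Patch (RFC 6902) can append to a list with an "add" operation whose path ends in "/-", which matters for the "in addition to" versus "replaces" question raised earlier.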
from gsd-tools.
The challenge here is we then... store the original JSON files and a series of patches and the final file (e.g. so we don't make everyone apply the series to get up to date), or... something else? For now, the solution is git: this gives us the history, the ability to roll back, easy hosting (GitHub), and distribution (git clone/git pull), all in one tidy bundle.
from gsd-tools.
First let me clarify that I am not proposing that JSON Patch or other forms of diffs should be used. But the question was raised of how this could be encoded, and before a new format is defined I wanted to mention that others have already worked on this problem.
It is also possible to include a JSON patch inside the object that should be changed.
{
"foo": 1,
"patch": [
{ "op": "replace", "path": "/foo", "value": 2 }
]
}
So the patch could be put into a namespace and there is no need to manage multiple files for one GSD entry.
But this is only a viable solution if you want to store changes to the read-only data, for example to highlight false data in CVE and co. or to clarify contradictions between GSD and CVE.
I agree with @joshbressers that improving the GSD namespace is the most sensible approach.
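To show how the embedded-patch example above might be consumed, here is a small Python sketch. It deliberately handles only the "replace" operation from RFC 6902, using the standard library instead of a full JSON Patch implementation; the function name apply_embedded_patch and the patch_key parameter are hypothetical.

```python
import copy

def _unescape(token):
    # JSON Pointer (RFC 6901): "~1" decodes to "/", then "~0" decodes to "~".
    return token.replace("~1", "/").replace("~0", "~")

def apply_embedded_patch(doc, patch_key="patch"):
    """Return a copy of doc with its embedded "replace" operations applied.

    The patch list itself is dropped from the returned document.
    """
    result = copy.deepcopy(doc)
    for op in result.pop(patch_key, []):
        if op["op"] != "replace":
            raise ValueError("only 'replace' is supported in this sketch")
        tokens = [_unescape(t) for t in op["path"].split("/")[1:]]
        if not tokens:
            raise ValueError("replacing the whole document is not supported")
        parent = result
        for token in tokens[:-1]:
            parent = parent[int(token)] if isinstance(parent, list) else parent[token]
        last = tokens[-1]
        if isinstance(parent, list):
            parent[int(last)] = op["value"]  # IndexError if the index is missing
        else:
            if last not in parent:
                # "replace" requires that the target location already exists.
                raise KeyError(op["path"])
            parent[last] = op["value"]
    return result

doc = {
    "foo": 1,
    "patch": [
        {"op": "replace", "path": "/foo", "value": 2},
    ],
}
patched_doc = apply_embedded_patch(doc)
```

The stored document keeps both the original value and the proposed change, and a reader only sees the corrected value after choosing to apply the patch.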
from gsd-tools.
So there's two issues here:
- what technical method do we use to update the JSON data (e.g. direct overwrite? patch?)
- depending on the technical method, if we overwrite the data, how do we handle updates from NIST, for example? If we use patch(es), how do we apply them and in what order? What happens when a patch gets out of sync with the data (e.g. NIST updates it to delete an entry or something)?
My thinking here is we don't touch cve.org/nist, we synthesize that data, patch it, whatever, and put the result into the GSD namespace. Then for example when the API is serving the data the requestor gets the best up to date complete data from GSD.
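One way to picture "synthesize and patch, without touching cve.org/nist" is the Python sketch below. It is only an illustration under assumptions: each correction records the upstream value it was written against (similar in spirit to the "test" operation in JSON Patch), so a correction that no longer matches upstream is set aside for review instead of being applied blindly. The names synthesize, field, expected, value and reason are all hypothetical.

```python
def synthesize(upstream, corrections):
    """Return (merged, stale) for one upstream record.

    merged is a copy of upstream with every still-valid correction applied.
    stale lists corrections whose expected upstream value no longer matches,
    e.g. because upstream changed or deleted the field.
    """
    merged = dict(upstream)  # upstream itself is never modified
    stale = []
    for correction in corrections:
        if upstream.get(correction["field"]) == correction["expected"]:
            merged[correction["field"]] = correction["value"]
        else:
            stale.append(correction)
    return merged, stale

# Hypothetical data, for illustration only.
upstream = {"cvss_score": 5.0, "description": "text"}
corrections = [
    {"field": "cvss_score", "expected": 5.0, "value": 7.5,
     "reason": "vendor advisory gives a higher score"},
    {"field": "description", "expected": "old text", "value": "new text",
     "reason": "typo fix written against an earlier upstream version"},
]
merged, stale = synthesize(upstream, corrections)
```

The merged result is what would land in the GSD namespace, the reason field records why something was changed, and the stale list is the set of corrections a human would need to look at again.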
from gsd-tools.