ntia-conformance-checker's People

Contributors

anthonyharrison, csatarigergely, dependabot[bot], devbysn, goneall, jspeed-meyers, lumjjb, meretp, prayag-09, starswaterbrook, thireo, vargenau, yong-aan

ntia-conformance-checker's Issues

Use SPDX-ID instead of name in machine-readable output

Potential improvement: Use SPDX-ID instead of name in machine-readable output when listing components.

The problem? The code cannot list the names of components that have no name.

The solution: Use SPDX ID instead of names to list nonconformant components.

This is low priority and can potentially await a re-architecture in which the messages list is no longer the central data structure of the codebase.
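A rough sketch of the idea, not the project's actual code: the helper name and the sample records below are hypothetical, and only the "SPDXID" and "name" keys mirror the SPDX JSON format. Because every package carries an SPDX ID, a package with no name can still be reported.

```python
# Hypothetical package records; only "SPDXID" and "name" mirror SPDX JSON keys.
packages = [
    {"SPDXID": "SPDXRef-Package-a", "name": "a"},
    {"SPDXID": "SPDXRef-Package-b"},              # no name at all
    {"SPDXID": "SPDXRef-Package-c", "name": ""},  # empty name
]


def components_without_names(packages):
    """Return the SPDX IDs of packages whose name is missing or empty."""
    return [p["SPDXID"] for p in packages if not p.get("name")]


print(components_without_names(packages))
```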

Add Troubleshooting FAQ Section

Add section that answers troubleshooting FAQs. For instance, see issue #49.

This would be a separate markdown document and linked from the main README.

Rework Print Output Function to Make Better Suited for SPDX Online Tools

@goneall, I noticed that SPDX Online Tools uses the default print functionality. IMO, it's a bit hard to grok as a user since it prints a long list of issues, rather than a more structured output.

Would you be interested if I created a print mode that was optimized for SPDX Online Tools? I could make some suggestions (or you could) in this issue and then I could try to implement it.

Note: Almost any suggestion will require some re-architecting of the codebase. But, TBH, that's on the docket anyway, so that's inevitable and doesn't need to be a constraint while we brainstorm. But once we decide on a new print mode (if we do), I'll also open a ticket about re-architecting and then I can kill two birds with one PR.

Add "Build Passing" Badge

Should we add a "build passing" badge?

I noticed https://github.com/spdx/tools-golang has this badge and others.

Having the badge lets maintainers know if the build is broken without digging around in the Actions tab. Additionally, assuming the build is passing, it gives potential users confidence in the project.

NTIA "Other unique identifiers" check needs review

NTIA component identifiers check passes for the attached file (please remove .txt from it before running).


Is this SBOM NTIA minimum element conformant? False

Individual elements                            | Status
-------------------------------------------------------
All component names provided?                  | True
All component versions provided?               | False
All component identifiers provided?            | True
All component suppliers provided?              | False
SBOM author name provided?                     | True
SBOM creation timestamp provided?              | True
Dependency relationships provided?             | True

The script only expects the presence of an SPDXID, which is indeed unique for every package.

However, NTIA's intent with "Other unique identifiers" appears to be a check for PURL/CPE/SWID (or equivalent). From the NTIA doc: "Other unique identifiers support automated efforts to map data across data uses and ecosystems and can reinforce certainty in instances of uncertainty. Examples of commonly used unique identifiers are Common Platform Enumeration (CPE), Software Identification (SWID) tags, and Package Uniform Resource Locators (PURL). These other identifiers may not be available for every piece of software, but should be used if they exist."

With the CPE/PURL/SWID interpretation, only 8 out of 15 components have a unique identifier, e.g.:

ExternalRef: PACKAGE-MANAGER purl pkg:oci/busybox@sha256:f4ed5f2163110c26d42741fdc92bd1710e118aed4edb19212548e8ca4e5fca22?mediaType=application%2Fvnd.docker.distribution.manifest.list.v2+json&repository_url=index.docker.io%2Flibrary

but it is completely missing from the following package:

PackageName: sha256:3d8a17fefa47b7be9e46147c5e670fb74d3de4a45889e307c5b7e85da5bee3d0

On this issue, the sbomqs implementation differs from ntia-conformance-checker, so I would like to get SPDX's interpretation for a consistent implementation.

PS: Thanks to @kestewart for pointing me to this tool
bom-alpine-3.15.spdx.txt
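For discussion, here is a minimal sketch of what a check under the CPE/PURL/SWID interpretation might look like. It assumes an SPDX 2.x JSON document already loaded into a dict; the function name is hypothetical and this is not how ntia-conformance-checker or sbomqs currently implement the check.

```python
# referenceType values used by SPDX 2.x for PURL, CPE, and SWID references
IDENTIFIER_TYPES = {"purl", "cpe22Type", "cpe23Type", "swid"}


def packages_missing_identifiers(spdx_doc):
    """Return SPDX IDs of packages with no PURL/CPE/SWID external reference."""
    missing = []
    for package in spdx_doc.get("packages", []):
        ref_types = {
            ref.get("referenceType") for ref in package.get("externalRefs", [])
        }
        if not ref_types & IDENTIFIER_TYPES:
            missing.append(package["SPDXID"])
    return missing


# Illustrative document (made-up data, not taken from the attached file)
doc = {
    "packages": [
        {
            "SPDXID": "SPDXRef-Package-busybox",
            "externalRefs": [
                {
                    "referenceCategory": "PACKAGE-MANAGER",
                    "referenceType": "purl",
                    "referenceLocator": "pkg:oci/busybox",
                }
            ],
        },
        {"SPDXID": "SPDXRef-Package-layer"},
    ]
}
print(packages_missing_identifiers(doc))
```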

What should "SBOM Author" name be if SBOM created by a tool?

When I run this tool on an SPDX document created by Tern, I get a False status for the "SBOM author name provided" field. My question is, what should this field be when a document is created by a tool? According to the spec, https://spdx.github.io/spdx-spec/v2.3/how-to-use/#k22-mapping-ntia-minimum-elements-to-spdx-fields, Author maps to the Creator field. In this case, the creator is a tool and the SBOM includes this information:

Creator: Tool: tern-2dd359916884b250e8b66d94c175506e387df07e

What is the tool looking for?

Remove main.py File

This file seems to be a template file that does not contain functionality related to the project. I'd be glad to put in a PR that removes it.

Improve Usage Instructions

It could be nice to:

  • Show a user how to use the --help flag
  • Show a user how to specify --file as part of the command rather than interactively.

Make Test Case Documents Consistent Across Each Format For Each Set of Tests

It could become a source of confusion and maintenance burden to have test documents for a particular test case that are divergent across formats. See PR #68 for an example of a PR that introduces this type of problem. The supplier test documents, per the PR, are a good example.

It could be helpful to treat one format (say, JSON) as the source of truth and to have tooling that auto-generates the other formats for each test case.

Make Output Machine Readable

It could be helpful for machine consumers of this tool to have access to JSON output.

Is there any interest in this feature?

Use Upstream of tools-python, Not Fork

Related to issue #28

In the course of examining a bug related to parsing, @goneall discovered that ntia-conformance-checker is using a fork, not the upstream, of tools-python.

This codebase should use the upstream to take advantage of ongoing improvements.

The only open question: Should this project switch to the upstream before the upstream accepts three commits from @linynjosh's fork? Or should the project switch to the upstream and, in the meantime, try to merge those changes? (Assuming those changes haven't been submitted and merged in the past sometime.)

Improve Error Message When File Not Found

I discovered that providing an input of a file that does not exist leads to a potentially confusing error message.

(ntia-conformance-checker) bash-3.2$ python3 checker.py 
File name: nosuchfile.json

which returns:

['Document cannot be parsed.']

I would expect an error like "Document not found" rather than the "Document cannot be parsed." The "cannot be parsed" phrasing could unintentionally imply to the user that the document does exist.

Add Field for Total Number of Components to JSON Output

Add a field for total number of components to JSON output. One observer pointed out to me that without information about the total number of components it is harder to evaluate whether there are "many" or "few" components with missing values. For instance, 20 components missing version info might seem like a lot, but not if there are 2000 components overall.
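As a sketch of what the output could look like: the key names below are hypothetical and are not the tool's actual JSON schema. The point is only that a total count sits alongside the per-check lists so a consumer can compute a ratio.

```python
import json


def build_summary(all_component_ids, ids_without_versions):
    """Build a JSON-ready summary that includes the overall component count."""
    return {
        "totalNumberOfComponents": len(all_component_ids),  # hypothetical key
        "componentsWithoutVersions": ids_without_versions,  # hypothetical key
    }


summary = build_summary(["a", "b", "c", "d"], ["b"])
print(json.dumps(summary, indent=2))
```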

This should be a simple PR.

Exit gracefully when file not found

When running the tool, I accidentally provided a file that did not exist (fat-finger typo). The tool raised an UnboundLocalError, which is likely to confuse users who are not familiar with reading Python tracebacks. Suggestion: exit gracefully when a non-existent file is supplied, with a clearer error message.

Currently:

(ternenv) rose@rose-vm:~/ternenv/ntia-conformance-checker/ntia_conformance_checker$ python3 main.py -v --file dne.spdx
ERROR:root:Filename dne.spdx not found.
ERROR:root:Document cannot be parsed: [Errno 2] No such file or directory: 'dne.spdx'
Traceback (most recent call last):
  File "/home/rose/ternenv/ntia-conformance-checker/ntia_conformance_checker/main.py", line 51, in <module>
    main()  # pylint: disable=no-value-for-parameter
  File "/home/rose/ternenv/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/rose/ternenv/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/rose/ternenv/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/rose/ternenv/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/rose/ternenv/ntia-conformance-checker/ntia_conformance_checker/main.py", line 36, in main
    sbom = sbom_checker.SbomChecker(file)
  File "/home/rose/ternenv/ntia-conformance-checker/ntia_conformance_checker/sbom_checker.py", line 17, in __init__
    self.doc = self.parse_file()
  File "/home/rose/ternenv/ntia-conformance-checker/ntia_conformance_checker/sbom_checker.py", line 42, in parse_file
    return doc
UnboundLocalError: local variable 'doc' referenced before assignment

Could be improved to something like:

(ternenv) rose@rose-vm:~/ternenv/ntia-conformance-checker/ntia_conformance_checker$ python3 main.py -v --file dne.spdx
Warning: file 'dne.spdx' not found.
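One possible shape for the fix, as a sketch only: `load_sbom_text` and `main` below are hypothetical stand-ins, not the functions in sbom_checker.py. The idea is to check for the file before parsing, so the missing-file case never reaches the code path that returns an unassigned `doc`.

```python
import sys
from pathlib import Path


def load_sbom_text(filename):
    """Read the SBOM file, raising FileNotFoundError with a clear message."""
    path = Path(filename)
    if not path.is_file():
        raise FileNotFoundError(f"file '{filename}' not found.")
    return path.read_text(encoding="utf-8")


def main(argv):
    """Return a process exit code instead of letting a traceback escape."""
    try:
        load_sbom_text(argv[0])
    except FileNotFoundError as err:
        print(f"Error: {err}", file=sys.stderr)
        return 1
    return 0
```

A caller would pass the result to `sys.exit`, for example `sys.exit(main(sys.argv[1:]))`.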

Add Check for Depth

@puerco mentioned to me an aspect of the NTIA minimum requirements document of which I was unaware:

Depth. An SBOM should contain all primary (top level) components, with all their transitive
dependencies listed. At a minimum, all top-level dependencies must be listed with enough
detail to seek out the transitive dependencies recursively.

The question: Should ntia-conformance-checker attempt to account for this "depth" requirement? If so, how?

NOASSERTION is accepted as a valid supplier

A package supplier can be defined as an Organization or a Person. It can also be defined as NOASSERTION (see the SPDX Specification).

It appears that if a PackageSupplier tag exists, this is sufficient to pass the 'are all package details provided' even if the supplier is marked as NOASSERTION. This doesn't seem correct.
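A stricter check might look something like the following sketch. The function name is hypothetical and the supplier is assumed to arrive as a plain string (or None when the tag is absent); the real parser objects may differ.

```python
def supplier_is_provided(supplier):
    """Treat a missing, blank, or NOASSERTION supplier as not provided."""
    if supplier is None:
        return False
    value = supplier.strip()
    return bool(value) and value.upper() != "NOASSERTION"


print(supplier_is_provided("Organization: Example Corp"))
print(supplier_is_provided("NOASSERTION"))
```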

Release?

I'm going to make a new release of the SPDX Online Tools in the next week or so - I'm going to include an updated NTIA Conformance Checker.

Let me know if there are any additional pull requests or issues we should resolve before updating the online tools.

Fix all use of sys.exit(-1) to sys.exit(1)

Please use exit code 0 on success and 1 on error, not -1. A negative exit code is system-dependent, and many systems only support unsigned values.
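To see the difference, the small sketch below spawns a child Python process and reports the exit status the parent observes. The helper name is made up for illustration. An exit code of 1 is reported as 1, while -1 is reported differently depending on the platform (on POSIX systems it is typically seen as 255).

```python
import subprocess
import sys


def observed_exit_code(code):
    """Run a child interpreter that calls sys.exit(code); return what we see."""
    result = subprocess.run(
        [sys.executable, "-c", f"import sys; sys.exit({code})"]
    )
    return result.returncode


print(observed_exit_code(0))
print(observed_exit_code(1))
print(observed_exit_code(-1))  # platform-dependent value
```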

Empty versions can pass for valid value

NTIA version check passes for the attached file (please remove .txt from it before running).

bom-alpine-3.15.spdx.json.txt

However, a typical version information field is empty:

"versionInfo": "",

The root cause appears to be that the check here lacks an empty-string test (or an even stricter check for semver or a derivative).
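A minimal sketch of the empty-string test, assuming packages are dicts shaped like SPDX JSON with a "versionInfo" key. The function name is hypothetical, and the sketch deliberately does not attempt semver validation.

```python
def version_is_provided(package):
    """Return True only if versionInfo is a non-blank string."""
    version = package.get("versionInfo")
    return isinstance(version, str) and version.strip() != ""


print(version_is_provided({"versionInfo": ""}))
print(version_is_provided({"versionInfo": "3.15"}))
```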

PS: Ignore these messages from the output

'{'packageVerificationCodeValue': ''}' is not a valid value for PKG_VERIF_CODE_FIELD

This is a known issue with bom filed here -
kubernetes-sigs/bom#230

we are tracking that and other known issues with formats here if you are curious to follow along - interlynk-io/sbomqs#39

['Document cannot be parsed.'] - SPDX format file being used

(ntia-conformance-checker) (base) ricardo@MB cli_tools % python checker.py
File name: /Users/ricardo/_git/spdx_sboms/us-demo-org-2_react.spdx
['Document cannot be parsed.']

Is there a way to produce more details on the error why the document cannot be parsed?

I'm referencing a standard SPDX file format. I have presented absolute path to the source file, and I have even moved the file inside the cli_tools/ directory where checker.py is located.

(ntia-conformance-checker) (base) ricardo@MC cli_tools % python checker.py
File name: us-demo-org-2_react.spdx
['Document cannot be parsed.']

Thanks.

Add --version Flag

If possible with argparse, add a version flag and perhaps show the commit hash of the git commit associated with the source from which that version was built.
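For reference, argparse has a built-in "version" action, so the flag itself is straightforward. The sketch below uses a placeholder version string and a made-up program name; showing the git commit hash is left out because it depends on how releases are built.

```python
import argparse

__version__ = "0.0.0"  # placeholder, not the project's real version


def build_parser():
    """Build a parser with a --version flag that prints and exits."""
    parser = argparse.ArgumentParser(prog="ntia-checker")  # hypothetical name
    parser.add_argument(
        "--version", action="version", version=f"%(prog)s {__version__}"
    )
    parser.add_argument("--file", help="path to the SBOM to check")
    return parser
```

Calling `build_parser().parse_args(["--version"])` prints the program name and version, then exits with status 0.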

Will need to investigate.

Re-Architecture from "Conveyor Belt" to "Singleton"

In tandem with #37, it's time, IMO, for a re-architecture. Fortunately, this codebase is only ~250 lines, so I actually don't think it will be that painful. Let me explain the current architecture, the motivation for changing this architecture, and my proposed new architecture.

The Current Architecture: A Conveyor Belt

The codebase currently uses a messages list data structure that holds all messages to the user about the minimum elements checks. I compare it to a conveyor belt because all the messages are in a line, one after the other, and the codebase simply adds new messages to the messages conveyor belt. This is a simple architecture, which is an important point in its favor, but I think the codebase has outgrown this data structure.

Why Change?

Because a conveyor belt is great for picking up your luggage due to the simplicity of the operation (wait for your particular piece or pieces of luggage), but it's not great for presenting structure to a user. In particular, the conveyor belt approach is why it's hard to quickly rework the print functionality into a mode optimized for the online-tools web app. To make this work, one has to write parsing code that grabs lots of elements from the messages data structure and then re-arranges them. It's also why the JSON output depends on convoluted (and brittle) parsing code.

So, TL;DR: The current messages data structure requires after-the-fact parsing in order to present output to the user in any form other than a long list.

The Case for a Singleton Architecture

A little bit of object orientation could go a long way in this codebase. In particular, I propose an SBOM class that would be created each time the tool is invoked and that would hold all the data (in a structured way) that is now put on the messages conveyor belt. But instead of one long line of messages, there would be properties specifically for each check. This way, when a programmer wants to write a print function, the programmer simply needs that object, and not complicated parsing code that dissects the messages list.
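To make the proposal concrete, here is a rough sketch of such an object. The class and attribute names are hypothetical and only meant to show one property per check; the eventual implementation may look quite different.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class SbomCheckResult:
    """Hypothetical container holding one structured result per check."""

    components_without_names: List[str] = field(default_factory=list)
    components_without_versions: List[str] = field(default_factory=list)
    components_without_identifiers: List[str] = field(default_factory=list)
    components_without_suppliers: List[str] = field(default_factory=list)
    author_provided: bool = False
    timestamp_provided: bool = False
    dependency_relationships_provided: bool = False

    @property
    def conformant(self) -> bool:
        """True only if every per-component list is empty and all flags hold."""
        nothing_missing = not (
            self.components_without_names
            or self.components_without_versions
            or self.components_without_identifiers
            or self.components_without_suppliers
        )
        return (
            nothing_missing
            and self.author_provided
            and self.timestamp_provided
            and self.dependency_relationships_provided
        )

    def to_dict(self) -> dict:
        """Return a JSON-ready dict; no parsing of a messages list required."""
        data = asdict(self)
        data["conformant"] = self.conformant
        return data


result = SbomCheckResult(
    components_without_versions=["SPDXRef-Package-b"],
    author_provided=True,
    timestamp_provided=True,
    dependency_relationships_provided=True,
)
print(result.conformant)
print(result.to_dict())
```

With an object like this, a print mode for the online tools and the JSON output could both read the same attributes directly.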

@goneall, sound good? @linynjosh, feel free to weigh in too!

Remove --pre Flag

Because ntia-conformance-checker no longer relies on a pre-release of tools-python, the --pre flag can be removed from the GitHub CI automation.

Add Basic Usage Information to README

It took me a couple of minutes to find checker.py. It could speed up the time it takes for a user to understand how to use this cool tool as a command line tool if there were some usage information on the README. I'm glad to put in a draft PR if anyone thinks this would be useful.

urllib3 module not found when trying to install ntia-conformance-checker

I am trying to install the conformance checker tool according to the directions in the README but hit the following ModuleNotFound error:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/pipenv/vendor/requests/packages/__init__.py", line 27, in <module>
    from . import urllib3
  File "/usr/lib/python3/dist-packages/pipenv/vendor/requests/packages/urllib3/__init__.py", line 8, in <module>
    from .connectionpool import (
  File "/usr/lib/python3/dist-packages/pipenv/vendor/requests/packages/urllib3/connectionpool.py", line 35, in <module>
    from .connection import (
  File "/usr/lib/python3/dist-packages/pipenv/vendor/requests/packages/urllib3/connection.py", line 54, in <module>
    from ._collections import HTTPHeaderDict
  File "/usr/lib/python3/dist-packages/pipenv/vendor/requests/packages/urllib3/_collections.py", line 2, in <module>
    from collections import Mapping, MutableMapping
ImportError: cannot import name 'Mapping' from 'collections' (/usr/lib/python3.10/collections/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/pipenv", line 33, in <module>
    sys.exit(load_entry_point('pipenv==11.9.0', 'console_scripts', 'pipenv')())
  File "/usr/lib/python3/dist-packages/pipenv/vendor/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/pipenv/vendor/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/pipenv/vendor/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3/dist-packages/pipenv/vendor/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/pipenv/vendor/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/pipenv/cli.py", line 347, in install
    from .import core
  File "/usr/lib/python3/dist-packages/pipenv/core.py", line 21, in <module>
    import requests
  File "/usr/lib/python3/dist-packages/pipenv/vendor/requests/__init__.py", line 62, in <module>
    from .packages.urllib3.exceptions import DependencyWarning
  File "/usr/lib/python3/dist-packages/pipenv/vendor/requests/packages/__init__.py", line 29, in <module>
    import urllib3
ModuleNotFoundError: No module named 'urllib3'

urllib3 was already installed so I tried to upgrade it but still get the same error

(ternenv) rose@rose-vm:~/ternenv/ntia-conformance-checker$ pip install urllib3 --upgrade
Requirement already satisfied: urllib3 in /home/rose/ternenv/lib/python3.10/site-packages (1.26.9)
Collecting urllib3
  Downloading urllib3-1.26.14-py2.py3-none-any.whl (140 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 140.6/140.6 kB 2.4 MB/s eta 0:00:00
Installing collected packages: urllib3
  Attempting uninstall: urllib3
    Found existing installation: urllib3 1.26.9
    Uninstalling urllib3-1.26.9:
      Successfully uninstalled urllib3-1.26.9
Successfully installed urllib3-1.26.14

Add quiet option

Would be very useful if the tool operated with a quiet option and just returned a value 0 (conformant) or -1 (non-conformant). This would then allow the tool to be easily added to a CI/CD pipeline.
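A sketch of how a quiet mode could behave, assuming an argparse-based CLI. The `run` function and its `check` callback are hypothetical placeholders for the real conformance check. Note that the sketch returns 1 rather than -1 for non-conformant, in line with the separate request in this tracker to avoid negative exit codes.

```python
import argparse


def run(argv, check):
    """Parse arguments, run the check, and return an exit code.

    `check` is a placeholder callable taking a file path and returning
    True when the SBOM is conformant.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--file", required=True)
    parser.add_argument("-q", "--quiet", action="store_true")
    args = parser.parse_args(argv)

    conformant = check(args.file)
    if not args.quiet:
        print(f"Is this SBOM NTIA minimum element conformant? {conformant}")
    return 0 if conformant else 1
```

In a CI/CD pipeline the returned value would be handed to `sys.exit`, so the job fails when the SBOM is non-conformant.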
