spdx / ntia-conformance-checker
Check SPDX SBOM for NTIA minimum elements
License: Apache License 2.0
This is for the sake of consistency and to ease the mental load on current and future maintainers.
import ntia_conformance_checker as ntia
sbom = ntia.SbomChecker("SBOM_filepath")
print(sbom.ntia_minimum_elements_compliant)
Adding this code block as an example with a little additional text should be sufficient.
Potential improvement: Use SPDX-ID instead of name in machine-readable output when listing components.
The problem? The code cannot list the names of components that have no name.
The solution: Use SPDX ID instead of names to list nonconformant components.
This is low priority and can potentially await a re-architecture in which the messages
list is no longer the central data structure of the codebase.
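A minimal sketch of what listing by SPDX ID could look like; the Component shape and field names here are hypothetical illustrations, not the checker's actual data model:

```python
# Sketch: report nonconformant components by SPDX ID rather than by name.
# The Component dataclass below is a hypothetical stand-in for whatever
# structure the re-architected codebase ends up with.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Component:
    spdx_id: str
    name: Optional[str]
    version: Optional[str]

def components_without_names(components: List[Component]) -> List[str]:
    """Return the SPDX IDs of components that lack a name.

    Listing by SPDX ID works even when the name itself is missing,
    which is exactly the case a name-based report cannot handle.
    """
    return [c.spdx_id for c in components if not c.name]
```

The same pattern would apply to the other "missing field" reports (version, supplier), since every package is guaranteed an SPDXID even when other fields are absent.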
Add section that answers troubleshooting FAQs. For instance, see issue #49.
This would be a separate markdown document and linked from the main README.
@goneall, I noticed that SPDX Online Tools uses the default print functionality. IMO, it's a bit hard to grok as a user since it prints a long list of issues, rather than a more structured output.
Would you be interested if I created a print mode that was optimized for SPDX Online Tools? I could make some suggestions (or you could) in this issue and then I could try to implement it.
Note: Almost any suggestion will require some re-architecting of the codebase. But, TBH, that's on the docket anyway, so that's inevitable and doesn't need to be a constraint while we brainstorm. But once we decide on a new print mode (if we do), I'll also open a ticket about re-architecting and then I can kill two birds with one PR.
Should we add a "build passing" badge?
I noticed https://github.com/spdx/tools-golang has this badge and others.
Having the badge lets maintainers know if the build is broken without digging around in the Actions tab. Additionally, assuming the build is passing, the badge provides confidence to potential users.
NTIA component identifiers check passes for the attached file (please remove .txt from it before running).
Is this SBOM NTIA minimum element conformant? False
Individual elements | Status
-------------------------------------------------------
All component names provided? | True
All component versions provided? | False
All component identifiers provided? | True
All component suppliers provided? | False
SBOM author name provided? | True
SBOM creation timestamp provided? | True
Dependency relationships provided? | True
The script expects each package to have a truly unique SPDXID.
However, the NTIA's intent with "Other unique identifiers" appears to be checking for PURL/CPE/SWID (or equivalent). From the NTIA doc: "Other unique identifiers support automated efforts to map data across data uses and ecosystems and can reinforce certainty in instances of uncertainty. Examples of commonly used unique identifiers are Common Platform Enumeration (CPE), Software Identification (SWID) tags, and Package Uniform Resource Locators (PURL). These other identifiers may not be available for every piece of software, but should be used if they exist."
With the CPE/PURL/SWID interpretation, only 8 out of 15 components have a unique identifier, e.g.:
ExternalRef: PACKAGE-MANAGER purl pkg:oci/busybox@sha256:f4ed5f2163110c26d42741fdc92bd1710e118aed4edb19212548e8ca4e5fca22?mediaType=application%2Fvnd.docker.distribution.manifest.list.v2+json&repository_url=index.docker.io%2Flibrary
but it is completely missing from the following package:
PackageName: sha256:3d8a17fefa47b7be9e46147c5e670fb74d3de4a45889e307c5b7e85da5bee3d0
On this issue, the sbomqs implementation differs from ntia-conformance-checker, so I would like to get SPDX's interpretation for a consistent implementation.
PS: Thanks to @kestewart for pointing me to this tool
bom-alpine-3.15.spdx.txt
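For reference, a sketch of what an identifier check based on the CPE/PURL/SWID interpretation could look like. The dict shape mirrors SPDX JSON `externalRefs` entries, but this is an illustration of the interpretation, not the checker's actual logic:

```python
# Sketch: a component counts as having a "unique identifier" only if it
# carries a PURL, CPE, or SWID external reference, per the NTIA wording.
# These referenceType values come from the SPDX 2.x spec's external
# repository identifier types.
IDENTIFIER_REF_TYPES = {"purl", "cpe22Type", "cpe23Type", "swid"}

def has_unique_identifier(package: dict) -> bool:
    """Check an SPDX-JSON-style package dict for a PURL/CPE/SWID reference."""
    refs = package.get("externalRefs", [])
    return any(ref.get("referenceType") in IDENTIFIER_REF_TYPES for ref in refs)
```

Under this reading, a package that has only an SPDXID (like the sha256-named one above) would fail the identifier check, while the busybox package with its PURL would pass.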
When I run this tool on an SPDX document created by Tern, I get a False status for the SBOM author name provided field. My question is: what should this field be when a document is created by a tool? According to the spec (https://spdx.github.io/spdx-spec/v2.3/how-to-use/#k22-mapping-ntia-minimum-elements-to-spdx-fields), Author maps to the Creator field. In this case, the creator is a tool and the SBOM includes this information:
Creator: Tool: tern-2dd359916884b250e8b66d94c175506e387df07e
What is the tool looking for?
I don't think (but I could be wrong) that Click adds any functionality (for this app at this time) above and beyond what the Python standard library provides. To minimize dependencies, I propose removing Click.
This file seems to be a template file that does not contain functionality related to the project. I'd be glad to put in a PR that removes it.
The tool does not check for identifiers; instead, it checks for suppliers in check_component_indentifiers(). This can be corrected by checking the SPDX ID instead.
It could be nice to:
- add a --help flag
- accept --file as part of the command rather than interactively

In check_sbom_author, only a Person is valid.
I'm wondering if we should also allow Organizations.
@kestewart - do you know if the NTIA conformance guidance is specific on the creator being a Person?
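If Organizations were allowed, the check might look like the following sketch, which assumes creators arrive as SPDX tag-style strings (an assumption for illustration, not the checker's actual parsing):

```python
# Sketch: accept either a Person or an Organization creator as an SBOM
# "author". SPDX tag/value creator entries begin with "Person:",
# "Organization:", or "Tool:".
from typing import List

def has_author(creators: List[str]) -> bool:
    """True if at least one creator is a Person or an Organization."""
    return any(c.startswith(("Person:", "Organization:")) for c in creators)
```

Under this version, a document created only by a Tool (like the Tern example above) would still fail, but an Organization creator would pass.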
The current setup.py resulted in a recent hiccup for spdx-online-tools: spdx/spdx-online-tools#418
It could become a source of confusion and maintenance burden to have test documents for a particular test case that are divergent across formats. See PR #68 for an example of a PR that introduces this type of problem. The supplier test documents, per the PR, are a good example.
It could be helpful to treat one format (say, JSON) as the source of truth and to have tooling that auto-generates the other formats for each test case.
If there is interest and willingness to have automated testing on pull requests, I could put in a PR that adds this capability via GitHub Actions.
@goneall and @anthonyharrison, I'm going to try cutting a new release on Thursday. Sound good?
Fingers crossed I don't set anything on fire. If I do, I'll write a GitHub issue describing the problems I encountered and we can debug.
It could be helpful for machine consumers of this tool to have access to JSON output.
Is there any interest in this feature?
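A sketch of what that could look like, built on a plain results dict; the keys below are assumptions for illustration, not an agreed schema:

```python
# Sketch: machine-readable JSON output built from a structured results
# dict, so downstream tools can consume the checks without scraping
# terminal output. The key names here are hypothetical.
import json

def to_json(results: dict) -> str:
    """Serialize check results deterministically for machine consumers."""
    return json.dumps(results, indent=2, sort_keys=True)
```

With `sort_keys=True` the output is stable across runs, which makes it diff-friendly in CI logs.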
$ ntia-checker --help
zsh: command not found: ntia-checker
@anthonyharrison, I did the pip install route in the README and then got this. What am I missing? Thank you!!
See spdx/spdx-online-tools#428 (comment)
The current printing mode is optimized for a terminal, but there should be a print mode optimized for SPDX Online Tools.
This will require some investigation of SPDX-online-tools and its current UI for output.
Related to issue #28
In the course of examining a bug related to parsing, @goneall discovered that ntia-conformance-checker is using a fork, not the upstream, of tools-python.
This codebase should use the upstream to take advantage of ongoing improvements.
The only open question: Should this project switch to the upstream only after the upstream accepts the three commits from @linynjosh's fork? Or should the project switch to the upstream now and, in the meantime, try to merge those changes? (Assuming those changes haven't already been submitted and merged.)
If not, fix so it can.
I discovered that providing an input of a file that does not exist leads to a potentially confusing error message.
(ntia-conformance-checker) bash-3.2$ python3 checker.py
File name: nosuchfile.json
which returns:
['Document cannot be parsed.']
I would expect an error like "Document not found" rather than "Document cannot be parsed." The "cannot be parsed" phrasing could unintentionally imply to the user that the document does exist.
Add a field for total number of components to JSON output. One observer pointed out to me that without information about the total number of components it is harder to evaluate whether there are "many" or "few" components with missing values. For instance, 20 components missing version info might seem like a lot, but not if there are 2000 components overall.
This should be a simple PR.
When I was running the tool, I accidentally provided a file to it that did not exist (fat-finger typo). The tool gave me an UnboundLocalError traceback that might be confusing for users not familiar with reading Python tracebacks. Suggestion: the tool should exit gracefully with a clearer error message when a non-existent file is supplied.
Currently:
(ternenv) rose@rose-vm:~/ternenv/ntia-conformance-checker/ntia_conformance_checker$ python3 main.py -v --file dne.spdx
ERROR:root:Filename dne.spdx not found.
ERROR:root:Document cannot be parsed: [Errno 2] No such file or directory: 'dne.spdx'
Traceback (most recent call last):
File "/home/rose/ternenv/ntia-conformance-checker/ntia_conformance_checker/main.py", line 51, in <module>
main() # pylint: disable=no-value-for-parameter
File "/home/rose/ternenv/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/home/rose/ternenv/lib/python3.10/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/home/rose/ternenv/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/rose/ternenv/lib/python3.10/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/home/rose/ternenv/ntia-conformance-checker/ntia_conformance_checker/main.py", line 36, in main
sbom = sbom_checker.SbomChecker(file)
File "/home/rose/ternenv/ntia-conformance-checker/ntia_conformance_checker/sbom_checker.py", line 17, in __init__
self.doc = self.parse_file()
File "/home/rose/ternenv/ntia-conformance-checker/ntia_conformance_checker/sbom_checker.py", line 42, in parse_file
return doc
UnboundLocalError: local variable 'doc' referenced before assignment
Could be improved to something like:
(ternenv) rose@rose-vm:~/ternenv/ntia-conformance-checker/ntia_conformance_checker$ python3 main.py -v --file dne.spdx
Warning: file 'dne.spdx' not found.
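One way to achieve that graceful failure is to check for the file before handing it to the parser. The sketch below uses a hypothetical helper name, not the tool's actual function:

```python
# Sketch: fail fast with a clear message when the input file does not
# exist, instead of letting a later UnboundLocalError escape from the
# parsing code. parse_file_or_exit is a hypothetical helper.
import os
import sys

def parse_file_or_exit(path: str) -> str:
    if not os.path.isfile(path):
        print(f"Warning: file '{path}' not found.", file=sys.stderr)
        sys.exit(1)
    # At this point the real implementation would hand off to the
    # SPDX parser; returning the path keeps this sketch self-contained.
    return path
```

Exiting with status 1 (rather than raising) also plays nicely with the exit-code conventions discussed elsewhere in this tracker.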
To help with code quality, it could be helpful to add a pylint GH action. Info on pylint is here: https://github.com/PyCQA/pylint
This would likely involve some sprucing up of the codebase too, but probably wouldn't be too bad.
I'm glad to put in this PR.
@puerco mentioned to me an aspect of the NTIA minimum requirements document of which I was unaware:
Depth. An SBOM should contain all primary (top level) components, with all their transitive
dependencies listed. At a minimum, all top-level dependencies must be listed with enough
detail to seek out the transitive dependencies recursively.
The question: Should ntia-conformance-checker attempt to account for this "depth" requirement? If so, how?
For some technical documentation on how this could be done, see: https://github.com/slsa-framework/slsa-github-generator/blob/main/internal/builders/generic/README.md
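As a starting point for discussion, here is a sketch of how the set of reachable transitive dependencies could be collected from relationship records. The (source, type, target) tuples are a simplification of SPDX relationships, not the checker's actual representation:

```python
# Sketch: walk DEPENDS_ON relationships from a root element to collect
# everything reachable, which a "depth" check could then compare against
# the packages actually present in the document.
def transitive_dependencies(relationships, root):
    """Collect all elements reachable from `root` via DEPENDS_ON edges."""
    graph = {}
    for src, rel_type, dst in relationships:
        if rel_type == "DEPENDS_ON":
            graph.setdefault(src, []).append(dst)
    seen, stack = set(), [root]
    while stack:
        for dep in graph.get(stack.pop(), []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen
```

A depth check might then flag top-level dependencies whose own dependencies are entirely absent, though whether absence means "not recorded" or "genuinely none" is exactly the ambiguity the question raises.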
Adding the Python Software Foundation's black formatter could be a way to simplify development.
Info here: https://github.com/psf/black
I'm glad to put in a PR.
On PR #41, no test coverage showed up in PR. Huh?
Add explanation to README
A package supplier can be defined as an Organisation or a Person. It can also be defined as NOASSERTION (see the SPDX Specification).
It appears that if a PackageSupplier tag exists, this is sufficient to pass the 'are all package details provided' even if the supplier is marked as NOASSERTION. This doesn't seem correct.
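A sketch of a stricter supplier check that would treat NOASSERTION (and empty values) as missing; the helper name is hypothetical:

```python
# Sketch: a PackageSupplier value only satisfies the check if it is
# present AND is not the explicit NOASSERTION placeholder.
from typing import Optional

def has_meaningful_supplier(supplier: Optional[str]) -> bool:
    """True only for a real supplier value, not NOASSERTION or empty."""
    return bool(supplier) and supplier.strip() not in ("", "NOASSERTION")
```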
Is pyproject.toml all that is needed?
Need to investigate.
See PR #74.
@anthonyharrison notes:
I don't think you need Pipfile and Pipfile.lock, as they are related to pipenv (the lock file should be autogenerated by pipenv lock).
I'll plan on doing releases every two weeks (or at least try that for a month or two) unless anyone objects.
I'm going to make a new release of the SPDX Online Tools in the next week or so - I'm going to include an updated NTIA Conformance Checker.
Let me know if there are any additional pull requests or issues we should resolve before updating the online tools.
Bandit is a static analysis security tool for Python: https://bandit.readthedocs.io/en/latest/
Adding it to CI can help us know, fix, and prevent some security issues at relatively low cost.
I'm glad to put in a PR.
Please use exit code 0 on success and 1 on error, not -1, which is system-dependent; many systems only support unsigned values.
Fix all uses of sys.exit(-1) to sys.exit(1).
NTIA version check passes for the attached file (please remove .txt from it before running).
However, a typical version information field is empty:
"versionInfo": "",
The root cause appears to be that the check here is missing an empty-string check (or an even stricter check for semver or a derivative).
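A sketch of the version check with the missing empty-string guard added; has_version is a hypothetical helper, not the tool's actual function:

```python
# Sketch: treat None, "" and whitespace-only values as a missing
# version, so that  "versionInfo": ""  no longer passes the check.
from typing import Optional

def has_version(version_info: Optional[str]) -> bool:
    return version_info is not None and version_info.strip() != ""
```

A stricter variant could additionally validate against semver, but the empty-string guard alone would catch the attached file.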
PS: Ignore these messages from the output:
'{'packageVerificationCodeValue': ''}' is not a valid value for PKG_VERIF_CODE_FIELD
This is a known issue with bom, filed here: kubernetes-sigs/bom#230. We are tracking that and other known issues with formats at interlynk-io/sbomqs#39, if you are curious to follow along.
See issue #52 for the relevant background.
(ntia-conformance-checker) (base) ricardo@MB cli_tools % python checker.py
File name: /Users/ricardo/_git/spdx_sboms/us-demo-org-2_react.spdx
['Document cannot be parsed.']
Is there a way to produce more details on why the document cannot be parsed?
I'm referencing a standard SPDX file format. I have provided the absolute path to the source file, and I have even moved the file inside the cli_tools/ directory where checker.py is located.
(ntia-conformance-checker) (base) ricardo@MC cli_tools % python checker.py
File name: us-demo-org-2_react.spdx
['Document cannot be parsed.']
Thanks.
I think signing PyPI releases with Sigstore is possible.
If possible with argparse, add a version flag and perhaps show the commit hash of the git commit associated with the source from which that version was built.
Will need to investigate.
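A sketch of how the flag could be wired up with argparse; the VERSION and COMMIT values are placeholders, since the real metadata plumbing (reading the package version, embedding the commit hash at build time) is exactly what needs investigating:

```python
# Sketch: a --version flag via argparse's built-in "version" action.
# VERSION and COMMIT are hypothetical placeholders.
import argparse

VERSION = "0.0.0"    # placeholder; would come from package metadata
COMMIT = "deadbeef"  # placeholder; would be embedded at build time

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="ntia-checker")
    parser.add_argument("--version", action="version",
                        version=f"%(prog)s {VERSION} (commit {COMMIT})")
    parser.add_argument("--file", help="SPDX SBOM file to check")
    return parser
```

The `version` action prints the string and exits, so no extra handling code is needed in main().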
This would help submitters and reviewers know whether the PR provides test coverage for any new code.
In tandem with #37, it's time, IMO, for a re-architecture. Fortunately, this codebase is only ~250 lines, so I actually don't think it will be that painful. Let me explain the current architecture, the motivation for changing this architecture, and my proposed new architecture.
The Current Architecture: A Conveyor Belt
The codebase currently uses a messages list data structure that holds all messages to the user about the minimum elements checks. I compare it to a conveyor belt because all the messages are in a line, one after the other, and the codebase simply adds new messages to the messages conveyor belt. This is a simple architecture, which is an important point in its favor, but I think the codebase has outgrown this data structure.
Why Change?
Because a conveyor belt is great for picking up your luggage due to the simplicity of the operation (wait for your particular piece or pieces of luggage), but it's not great for presenting structure to a user. In particular, the conveyor belt approach is why it's hard to quickly re-architect the print functionality to make a print mode optimized for the online-tools web app. To make this work, one has to write parsing code that grabs lots of elements from the messages data structure and then re-arranges them. It's also why the JSON output depends on convoluted (and brittle) parsing code.
So, TL;DR: The current messages data structure requires after-the-fact parsing in order to present output to the user in any form other than a long list.
The Case for a Singleton Architecture
A little bit of object orientation could go a long way in this codebase. In particular, I propose an SBOM class that would be created each time the tool is invoked and that would hold all the data (in a structured way) that is now put on the messages conveyor belt. But instead of one long line of messages, there would be properties specifically for each check. This way, when a programmer wants to write a print functionality, the programmer simply needs that object, and not complicated parsing functionality that dissects the messages list.
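To make the proposal concrete, here is a sketch of what such a structured-results object could look like; all names are illustrative, not a final API:

```python
# Sketch of the proposed re-architecture: one object per invocation,
# with a field per check instead of a flat messages list. Printers
# (terminal, online-tools, JSON) consume the structured data directly,
# with no message parsing.
from dataclasses import dataclass

@dataclass
class SbomCheckResults:
    all_names_provided: bool = False
    all_versions_provided: bool = False
    all_identifiers_provided: bool = False
    all_suppliers_provided: bool = False
    author_provided: bool = False
    timestamp_provided: bool = False
    dependency_relationships_provided: bool = False

    @property
    def compliant(self) -> bool:
        # NTIA minimum elements require every individual check to pass.
        return all(vars(self).values())

    def as_dict(self) -> dict:
        # One structured view that any print mode can render.
        return dict(vars(self), compliant=self.compliant)
```

Each checker method would set its own field, and the terminal table, the online-tools view, and the JSON output would all just be different renderings of `as_dict()`.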
@goneall, sound good? @linynjosh, feel free to weigh in too!
Because ntia-conformance-checker no longer relies on a pre-release of tools-python, the --pre flag can be removed from the GitHub CI automation.
It took me a couple of minutes to find checker.py. It could speed up the time it takes for a user to understand how to use this cool tool as a command-line tool if there were some usage information in the README. I'm glad to put in a draft PR if anyone thinks this would be useful.
Initial thoughts:
Other thoughts, ideas, welcome.
I am trying to install the conformance checker tool according to the directions in the README but hit the following ModuleNotFound error:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/pipenv/vendor/requests/packages/__init__.py", line 27, in <module>
from . import urllib3
File "/usr/lib/python3/dist-packages/pipenv/vendor/requests/packages/urllib3/__init__.py", line 8, in <module>
from .connectionpool import (
File "/usr/lib/python3/dist-packages/pipenv/vendor/requests/packages/urllib3/connectionpool.py", line 35, in <module>
from .connection import (
File "/usr/lib/python3/dist-packages/pipenv/vendor/requests/packages/urllib3/connection.py", line 54, in <module>
from ._collections import HTTPHeaderDict
File "/usr/lib/python3/dist-packages/pipenv/vendor/requests/packages/urllib3/_collections.py", line 2, in <module>
from collections import Mapping, MutableMapping
ImportError: cannot import name 'Mapping' from 'collections' (/usr/lib/python3.10/collections/__init__.py)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/bin/pipenv", line 33, in <module>
sys.exit(load_entry_point('pipenv==11.9.0', 'console_scripts', 'pipenv')())
File "/usr/lib/python3/dist-packages/pipenv/vendor/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/usr/lib/python3/dist-packages/pipenv/vendor/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/usr/lib/python3/dist-packages/pipenv/vendor/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/lib/python3/dist-packages/pipenv/vendor/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/lib/python3/dist-packages/pipenv/vendor/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/usr/lib/python3/dist-packages/pipenv/cli.py", line 347, in install
from .import core
File "/usr/lib/python3/dist-packages/pipenv/core.py", line 21, in <module>
import requests
File "/usr/lib/python3/dist-packages/pipenv/vendor/requests/__init__.py", line 62, in <module>
from .packages.urllib3.exceptions import DependencyWarning
File "/usr/lib/python3/dist-packages/pipenv/vendor/requests/packages/__init__.py", line 29, in <module>
import urllib3
ModuleNotFoundError: No module named 'urllib3'
urllib3 was already installed, so I tried to upgrade it, but I still get the same error:
(ternenv) rose@rose-vm:~/ternenv/ntia-conformance-checker$ pip install urllib3 --upgrade
Requirement already satisfied: urllib3 in /home/rose/ternenv/lib/python3.10/site-packages (1.26.9)
Collecting urllib3
Downloading urllib3-1.26.14-py2.py3-none-any.whl (140 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 140.6/140.6 kB 2.4 MB/s eta 0:00:00
Installing collected packages: urllib3
Attempting uninstall: urllib3
Found existing installation: urllib3 1.26.9
Uninstalling urllib3-1.26.9:
Successfully uninstalled urllib3-1.26.9
Successfully installed urllib3-1.26.14
It would be very useful if the tool operated with a quiet option and just returned an exit code: 0 (conformant) or -1 (non-conformant). This would then allow the tool to be easily added to a CI/CD pipeline.
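A sketch of how a quiet mode could work. Note that per the exit-code request elsewhere in this tracker, 1 (rather than -1) is the portable choice for non-conformance; the function names here are hypothetical:

```python
# Sketch: a quiet mode that prints nothing and signals conformance
# through the exit code alone, so CI/CD pipelines can gate on it.
# Using 1 rather than -1, since -1 is system-dependent.
def exit_code(conformant: bool) -> int:
    return 0 if conformant else 1

def run(conformant: bool, quiet: bool = False) -> int:
    if not quiet:
        print(f"Is this SBOM NTIA minimum element conformant? {conformant}")
    return exit_code(conformant)
```

A pipeline step would then be as simple as `ntia-checker --quiet --file sbom.spdx && deploy`.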