bom's Introduction

Investigating BoM tools and formats

Contains the BoM fields of interest and how we believe we can represent them in CycloneDX and SPDX.

Tools Investigations

Syft

Command run for CycloneDX XML (from source): syft packages <path-to-source> -o cyclonedx

Command run for CycloneDX XML (from image): syft packages <image-name> -o cyclonedx

Includes examples in both Syft enriched JSON format (not CycloneDX) and CycloneDX (XML)


Tern

Command run for true SPDX JSON format: ./docker_run.sh ternd "report -f spdxjson -i built-app-image:latest"

Includes examples in both Tern enriched JSON format (not SPDX) and true SPDX (JSON)

Conclusions

Time Averages for Scanning

  • Syft on a pre-built image: 4.8062s
  • Syft on application source code: 0.7624s
  • Tern on a pre-built image: 136.4412s
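
The averages above could be reproduced with a small timing harness. The following is a minimal sketch, not the method actually used for these numbers: the number of runs, the `average_runtime` helper, and the stand-in command are all assumptions made for illustration.

```python
import statistics
import subprocess
import sys
import time


def average_runtime(command, runs=5):
    """Run `command` `runs` times and return the mean wall-clock seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the scanner exits non-zero, so failed
        # runs are never averaged in.
        subprocess.run(
            command,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations)


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere. Swap in a real scan,
    # e.g. ["syft", "packages", "<image-name>", "-o", "cyclonedx"].
    cmd = [sys.executable, "-c", "pass"]
    print(f"{average_runtime(cmd, runs=3):.4f}s")
```

Wall-clock timing like this includes image pulls and cache effects, so the first run of a scanner may be noticeably slower than later ones.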

Tern

  • We generated both a Tern-specific (enriched) JSON file and a "true" SPDX JSON file.
  • While neither has all of the metadata we are looking for, the "true" SPDX format has the least.
    • It does not have CPEs, SHAs, or layer paths

Syft

  • The JSON output of Syft is NOT CycloneDX format, but rather it's a superset of all metadata that can be retrieved.

  • The real CycloneDX format from Syft (XML) is missing some information.

    • It does not have CPEs, SHAs, or layer paths
  • This issue on Syft looks like a request for what we might want

  • The enriched formats have all of the information we need, but they don't seem to align with CycloneDX or SPDX as well as we once thought
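
One way to make the "missing information" comparison concrete is to check each output for the fields of interest. This is a rough sketch under stated assumptions: the package records are hypothetical, already-parsed dictionaries, and the key names (`cpes`, `shas`, `layer_path`) are placeholders rather than the real Syft, CycloneDX, or SPDX field names.

```python
# Hypothetical, normalized key names for the fields called out above.
FIELDS_OF_INTEREST = ("cpes", "shas", "layer_path")


def missing_fields(package):
    """Return the fields of interest that are absent or empty in one package."""
    return [field for field in FIELDS_OF_INTEREST if not package.get(field)]


def coverage_report(outputs):
    """Map each output name to the fields missing from at least one package."""
    report = {}
    for name, packages in outputs.items():
        missing = set()
        for package in packages:
            missing.update(missing_fields(package))
        report[name] = sorted(missing)
    return report


if __name__ == "__main__":
    # Made-up sample data standing in for parsed scanner output.
    outputs = {
        "enriched": [
            {
                "name": "example-lib",
                "version": "1.2.3",
                "cpes": ["cpe:2.3:a:example:example-lib:1.2.3:*:*:*:*:*:*:*"],
                "shas": ["sha256:0000"],
                "layer_path": "/layers/example",
            }
        ],
        "cyclonedx": [{"name": "example-lib", "version": "1.2.3"}],
    }
    for name, missing in coverage_report(outputs).items():
        print(name, "missing:", missing or "nothing")
```

A real comparison would first need a per-format adapter that maps each tool's output into this common shape.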

Syft enriched JSON seems to fit our use case

  • Gets full information on all OS level and indirectly installed packages

    • CPEs, licenses, urls, shas, layer location, name, version
    • These are all fields we feel strongly about being first-class citizens
    • The SPDX output from Tern did not surface this metadata as fully or as clearly
  • All of our language modules (packages installed by go-mod and npm) can easily be retrieved with fully-fledged metadata

  • It might be nice to integrate with this tooling rather than build out our own custom logic

  • No information about the actual dependencies we directly install (node, go, etc.), but this is information that we can easily provide

  • Does a better job of conveying the information we think is important

  • The tooling ecosystem feels better fleshed out for the use cases we have (language module metadata collection)
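
Integrating with the enriched JSON rather than writing custom logic might look roughly like the sketch below. It assumes the Syft JSON document has a top-level `artifacts` list whose entries carry `name`, `version`, `type`, `cpes`, `licenses`, and `locations` (each with `path` and `layerID`). That layout is an assumption and varies between Syft versions, so check it against real output. The inline sample document is made up.

```python
import json

# Made-up sample shaped like the assumed Syft JSON layout.
SAMPLE = """
{"artifacts": [
  {"name": "example-lib", "version": "1.2.3", "type": "npm",
   "cpes": ["cpe:2.3:a:example:example-lib:1.2.3:*:*:*:*:*:*:*"],
   "licenses": ["MIT"],
   "locations": [
     {"path": "/app/node_modules/example-lib/package.json",
      "layerID": "sha256:0000"}
   ]}
]}
"""


def summarize(document):
    """Pull the fields of interest out of each artifact entry."""
    rows = []
    for artifact in document.get("artifacts", []):
        locations = artifact.get("locations", [])
        rows.append(
            {
                "name": artifact.get("name"),
                "version": artifact.get("version"),
                "type": artifact.get("type"),
                "cpes": artifact.get("cpes", []),
                "licenses": artifact.get("licenses", []),
                "paths": [loc.get("path") for loc in locations],
                "layers": [loc.get("layerID") for loc in locations],
            }
        )
    return rows


if __name__ == "__main__":
    for row in summarize(json.loads(SAMPLE)):
        print(row["name"], row["version"], row["licenses"], row["layers"])
```

Using `.get` with defaults keeps the sketch tolerant of entries that omit a field, which matters if some package types report less metadata than others.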

Format Concerns

These scanning tools all seem to have enriched BOM outputs that sit outside both SPDX and CycloneDX, and their translations to those formats are sparse. That is interesting and a cause for pause.

  • Why do these tools not just try and do everything in the existing BOM formats that they support?
  • Why is the translation between their format and the "official" format so sparse?
  • Is it sparse because they don't care about those formats or because they cannot get more specific?

To Do

  • Conversion tooling
  • Offline environments?
  • Check CycloneDX Scanning Tools

bom's People

Contributors

foresteckhardt, sophiewigmore
