Comments (4)
Thanks for the response
I ended up creating a script that
- parses the selected package
- walks its dependencies (including packages from the workspace, and their dependencies etc.)
- creates a new temp package which represents those dependencies, but flattened
- copies in the pnpm lock file from the top level of the monorepo
- performs a pnpm install (which is quite fast because it's just setting up links, and the copy of the lock file makes it use the already established versions)
- generates the SBOM
Hacky script here:
https://gist.github.com/mc-alt/b0c27dd7621b3ea2f984b43a619877c2
This seems to work for us
Note: this would not be performant were it not for the way pnpm's cache and linking approach work
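The dependency-walking and flattening steps listed above can be sketched roughly as follows. This is a minimal illustration, not the script from the gist: the function names, the in-memory `workspace` mapping (package name to parsed package.json), and the temp package name `sbom-temp` are all assumptions made for the example, and only the `dependencies` field is considered.

```python
import json
from typing import Dict


def flatten_dependencies(root: str, workspace: Dict[str, dict]) -> Dict[str, str]:
    """Collect the external dependencies reachable from `root`.

    `workspace` maps each workspace package name to its parsed package.json
    (hypothetical in-memory form). Dependencies that are themselves workspace
    packages are walked recursively; everything else is recorded as external.
    """
    flattened: Dict[str, str] = {}
    seen = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # guards against cycles between workspace packages
        seen.add(name)
        for dep, spec in workspace[name].get("dependencies", {}).items():
            if dep in workspace:
                stack.append(dep)  # workspace package: walk into it
            else:
                flattened.setdefault(dep, spec)  # external: keep first spec seen
    return flattened


def temp_manifest(root: str, workspace: Dict[str, dict]) -> str:
    """Render a package.json for a temp package holding the flattened deps."""
    manifest = {
        "name": "sbom-temp",  # assumed name, purely illustrative
        "private": True,
        "dependencies": flatten_dependencies(root, workspace),
    }
    return json.dumps(manifest, indent=2)


workspace = {
    "app": {"dependencies": {"lib-a": "workspace:*", "react": "^18.2.0"}},
    "lib-a": {"dependencies": {"lib-b": "workspace:*", "lodash": "^4.17.21"}},
    "lib-b": {"dependencies": {"lodash": "^4.17.21", "app": "workspace:*"}},
    "unrelated": {"dependencies": {"express": "^4.18.0"}},
}
flat = flatten_dependencies("app", workspace)
```

The remaining steps (copying the top-level pnpm-lock.yaml next to the generated manifest, running the pnpm install, and generating the SBOM) are left out of the sketch.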
Unfortunately I've been pulled on to other things, but I will try and find time to prepare a public example repository for a setup like ours
from syft.
Hi, @mc-alt, there are a few things to mention here, so let me start by suggesting a few options with what is available today: are you able to scan the subdirectories directly? If you wanted separate SBOMs, I'd think just scanning like syft project-root/packages/sub-package-1 could do the trick. If you tried this, I suspect the challenge you ran into is that, since this is a directory scan, it doesn't pick up any package.json information by default; you'd need to enable the javascript-package-cataloger (by using the flag --select-catalogers +javascript-package-cataloger). This isn't perfect, though, as Syft won't resolve transitive dependencies; it only reads what it finds on the filesystem. This is one reason Syft prefers lock files, but what I understand about this setup is that the lock file only exists as the top-level pnpm-lock.yaml, so it wouldn't be read when scanning a subdirectory, and syft wouldn't necessarily know how to determine which packages to exclude anyway. If, however, you had all the appropriate dependencies installed in node_modules, these would show up as you expect using the javascript-package-cataloger, with another caveat that it will also include build-time dependencies that are downloaded into node_modules. I don't claim to have a lot of familiarity with PNPM; does this option get you close to having something usable?
I could definitely see some sort of enhancements we could implement -- namely looking outside the requested directory to attempt to find some additional pnpm-lock.yaml, node_modules, or other pertinent files. But we haven't done a lot of this and it's a little unclear to me if this should be the default behavior -- in other words: if I scan a directory, did I mean to treat it as a directory or as part of a larger workspace? Another option to explore is to add some sort of --workspace or similar flag that can be used by catalogers that have knowledge of workspaces. PNPM certainly isn't the only one that does something like this, and perhaps we can find some commonality across different package managers. The last thing I'd note is that we've also investigated shelling out to tools (such as mvn dependency:tree or a similar pnpm call), but would very much like Syft to avoid doing this as much as possible.
That said, would you be able to provide some public repo(s) with a similar setup that we could have a look at?
from syft.
@mc-alt glad you figured it out with your script!
I think the interesting thing to take out of this is that there may be something missing in the syft ecosystem in terms of "scanning 1 thing and generating N many SBOMs", which is outside of the scope of syft, but may be hinting at a separate tool that wraps syft. This is similar to (but not the same as) #562 .
The new use case highlighted here is "what is the prescription for using syft in a monorepo setting?". This probably warrants some discussion.
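As a rough illustration of the "scan 1 thing, generate N SBOMs" wrapper idea, the sketch below only builds one syft command line per sub-package and does not execute anything. The function name and the list of package directories are hypothetical; the cataloger flag is the one mentioned earlier in this thread, and whether it behaves as needed for a given setup is an assumption.

```python
from pathlib import PurePosixPath
from typing import List


def syft_commands(root: str, package_dirs: List[str]) -> List[List[str]]:
    """Build one syft directory-scan command per sub-package.

    `package_dirs` are paths relative to `root` (hypothetical input; a real
    wrapper would probably read them from the workspace configuration).
    """
    commands = []
    for rel in package_dirs:
        target = str(PurePosixPath(root) / rel)
        commands.append([
            "syft",
            target,
            # flag quoted from the discussion above; enables package.json cataloging
            "--select-catalogers",
            "+javascript-package-cataloger",
        ])
    return commands


cmds = syft_commands(
    "project-root",
    ["packages/sub-package-1", "packages/sub-package-2"],
)
```

Each entry in `cmds` could then be passed to something like `subprocess.run`, with the limitations around transitive dependencies and the top-level lock file described earlier still applying.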
from syft.
Another example where a user might want to perform a similar scan is a maven multi-module project, where a subdirectory contains something like a deployable web application and the user wants to include parent and sibling directories to properly resolve modules and parent poms with relative paths.
This seems to boil down to separating the set of files included in the source from the target directory to catalog. Today, for example, a user running a directory scan uses: syft my/dir and syft indexes and scans everything within that directory only. If there was a way to specify a different directory to scan while retaining the larger set of files for context, it could be possible to accomplish what's asked for here, with some work in the catalogers to follow relative links. For example: syft /some/root/path --only-catalog sub/dir or syft /some/root/path/sub/dir --root /some/root/path to select a subset of files for the cataloging functions when an alternate root is provided.
It seems there may be a path forward for this, but certainly more investigation is needed.
from syft.