rust-secure-code / cargo-auditable

Make production Rust binaries auditable

License: Apache License 2.0

Rust 95.33% Roff 4.67%

cargo-plugin cargo-subcommand rust rust-lang sbom security-audit security-automation security-tools

cargo-auditable's Issues

Some examples in docs are excluded from doctests

There are multiple examples in the docs that are marked `rust,ignore` because they require other crates that are not normally in the dependency tree. We should investigate whether adding extra dependencies in doctest mode only is possible.

Tests involving `cdylib` fail when cross-compiling to musl - not enough binary artifacts

Steps to reproduce:

rustup target add x86_64-unknown-linux-musl
AUDITABLE_TEST_TARGET=x86_64-unknown-linux-musl cargo test --all-features

The test test_cargo_auditable_workspaces fails:

---- test_cargo_auditable_workspaces stdout ----
Test fixture binary map: {"binary_and_cdylib_crate": ["/home/shnatsel/Code/cargo-auditable/cargo-auditable/tests/fixtures/workspace/target/x86_64-unknown-linux-musl/debug/binary_and_cdylib_crate"], "crate_with_features": ["/home/shnatsel/Code/cargo-auditable/cargo-auditable/tests/fixtures/workspace/target/x86_64-unknown-linux-musl/debug/crate_with_features_bin"]}
/home/shnatsel/Code/cargo-auditable/cargo-auditable/tests/fixtures/workspace/target/x86_64-unknown-linux-musl/debug/binary_and_cdylib_crate dependency info: VersionInfo { packages: [Package { name: "binary_and_cdylib_crate", version: Version { major: 0, minor: 1, patch: 0 }, source: "local", kind: Runtime, dependencies: [1], features: [] }, Package { name: "library_crate", version: Version { major: 0, minor: 1, patch: 0 }, source: "local", kind: Runtime, dependencies: [], features: [] }] }
thread 'test_cargo_auditable_workspaces' panicked at 'assertion failed: crate_with_features_bins.len() == 2', cargo-auditable/tests/it.rs:149:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Strongly typed source in `auditable_serde`

Right now the source field in auditable_serde::Package is a String. It should be made into a #[non_exhaustive] enum instead.
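
A minimal sketch of what such an enum could look like. The variant names and the "git" and "registry" strings are assumptions made for illustration; only "local" and "crates.io" appear in the examples in these issues, and this is not necessarily the shape the crate would adopt.

```rust
/// Hypothetical strongly typed replacement for the `source: String` field.
/// `#[non_exhaustive]` lets new variants be added without a breaking change.
#[non_exhaustive]
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Source {
    CratesIo,
    Local,
    Git,      // assumed string form: "git"
    Registry, // assumed string form: "registry"
    /// Fallback so that strings written by newer tools are not lost.
    Other(String),
}

impl From<&str> for Source {
    fn from(s: &str) -> Self {
        match s {
            "crates.io" => Source::CratesIo,
            "local" => Source::Local,
            "git" => Source::Git,
            "registry" => Source::Registry,
            other => Source::Other(other.to_string()),
        }
    }
}

impl Source {
    /// Converts back to the string form used in the serialized data.
    pub fn as_str(&self) -> &str {
        match self {
            Source::CratesIo => "crates.io",
            Source::Local => "local",
            Source::Git => "git",
            Source::Registry => "registry",
            Source::Other(s) => s,
        }
    }
}
```

The `Other` variant keeps the conversion lossless: `Source::from(s).as_str()` returns the original string for any input.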

Use a stabilizable format

Cargo.lock is not a stable format and is not expected to become stable. We need to store dependency information in a format we can eventually stabilize.

JSON isomorphic to Cargo.lock with some fields redacted sounds like a good start. Some discussion on the format can be found in the RFC: rust-lang/rfcs#2801

Why do I need to install Visual Studio for using this?

I am trying to use this but while trying to compile, it is asking that I need to install VS. Just wondering why is that?

Drop start/stop markers

We now put the dependency data in a separate link section. Parsing the executable format should be sufficient to locate the start and end of the dependency data.

Fails when a rustc wrapper like sccache is set

Having a rustc wrapper defined (like build caching with RUSTC_WRAPPER=sccache) makes build fail:

$ export RUSTC_WRAPPER=sccache
$ cargo auditable build --release
error: failed to run `rustc` to learn about target-specific information

Caused by:
  process didn't exit successfully: `/home/amousset/.cargo/bin/sccache /home/amousset/.cargo/bin/cargo-auditable rustc - --crate-name ___ --print=file-names --crate-type bin --crate-type rlib --crate-type dylib --crate-type cdylib --crate-type staticlib --crate-type proc-macro --print=sysroot --print=cfg` (exit status: 2)
  --- stderr
  sccache: error: failed to execute compile
  sccache: caused by: Compiler not supported: "Unrecognized command: \"-E\"\n"

Make cyclic dependencies impossible to represent in the data format

Right now it is possible to represent cyclic dependencies in the data format. For example:

{"packages":[
{"name":"foo","version":"0.3.1","source":"local","dependencies":[1]},
{"name":"bar","version":"0.2.1","source":"crates.io","dependencies":[0]}
]}

But Cargo does not actually allow cyclic dependencies.

This quirk of the format requires any consumer of this data to check for cyclic dependencies first, or risk an infinite loop during traversal.
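
For illustration, the check a consumer has to perform today could be sketched as follows. The `Package` struct here is a hypothetical stand-in that carries only the `dependencies` indices from the JSON above, not the real `auditable_serde` type.

```rust
/// Hypothetical minimal package entry: only the dependency indices matter here.
struct Package {
    dependencies: Vec<usize>,
}

#[derive(Clone, Copy, PartialEq)]
enum Mark {
    Unvisited,
    InProgress,
    Done,
}

/// Returns true if the dependency indices contain a cycle,
/// or if any index points outside the package list.
fn has_cycle(packages: &[Package]) -> bool {
    let mut marks = vec![Mark::Unvisited; packages.len()];
    for start in 0..packages.len() {
        if marks[start] != Mark::Unvisited {
            continue;
        }
        // Explicit stack of (package, next dependency position), so that
        // hostile input cannot overflow the call stack.
        let mut stack = vec![(start, 0usize)];
        marks[start] = Mark::InProgress;
        while let Some(&(node, child)) = stack.last() {
            if let Some(&dep) = packages[node].dependencies.get(child) {
                stack.last_mut().unwrap().1 += 1;
                match marks.get(dep).copied() {
                    None => return true,                   // index out of range
                    Some(Mark::InProgress) => return true, // back edge: cycle
                    Some(Mark::Unvisited) => {
                        marks[dep] = Mark::InProgress;
                        stack.push((dep, 0));
                    }
                    Some(Mark::Done) => {}
                }
            } else {
                // All dependencies of this package have been explored.
                marks[node] = Mark::Done;
                stack.pop();
            }
        }
    }
    false
}
```

Applied to the foo/bar example above, where each package lists the other as a dependency, `has_cycle` returns true.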

It would be nice to make cyclic dependencies impossible to represent, for example:

{"packages":[{"name":"foo","version":"0.3.1","source":"local"},{"name":"bar","version":"0.2.1","source":"crates.io"}]}
{"dependencies":{"0": [{"1":{[]}]}}

Or something along those lines. The idea is to encode the relationships in a JSON tree that is guaranteed to be acyclic.

Dependency data not embedded unless `inject_dependency_list!()` return value is used

Dependency information will not be present in the final binary, unless the data returned by auditable::inject_dependency_list!() is actually used somewhere in the binary (printed, put into test::black_box, etc).

This is actually a bug in rustc: rust-lang/rust#47384

Check compatibility with sparse crates.io registry

https://blog.rust-lang.org/2022/06/22/sparse-registry-testing.html discusses a sparse registry feature. This may impact the reported source URL for crates coming from crates.io; we need to check if our detection of crates.io still works with sparse registry.

Surface dependency info in the dependent executable

Right now rust-audit requires running a separate executable to get the Cargo.lock information. It'd be really useful to surface the Cargo.lock information to the running program itself - even as just a &'static str so that it can be reported dynamically.

I am specifically thinking about some production code I own that returns build info through a /buildinfo HTTP GET endpoint. It'd be awesome if I could return the whole Cargo.lock through something like /buildinfo/dependencies

Integration tests fail on Windows

Relevant lines in CI log: https://github.com/Shnatsel/rust-audit/runs/7619314273?check_suite_focus=true#step:6:149

It appears that Cargo creates many more binary artifacts than our tests expect - namely the .pdb, .lib and .exp files on top of the .dll and .exe that we expect.
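
One possible way to make the tests tolerant of the extra files is to filter the artifact list by extension before counting. This is only a sketch: the function name and the exact extension list are assumptions, not what the test suite currently does.

```rust
use std::path::Path;

/// Hypothetical filter: keeps executables and dynamic libraries,
/// and drops auxiliary outputs such as .pdb, .lib and .exp files.
fn is_binary_artifact(path: &Path) -> bool {
    match path.extension() {
        // Unix executables typically have no extension at all.
        None => true,
        Some(ext) => match ext.to_str() {
            Some(ext) => matches!(
                ext.to_ascii_lowercase().as_str(),
                "exe" | "dll" | "so" | "dylib"
            ),
            // A non-UTF-8 extension is not one of the expected ones.
            None => false,
        },
    }
}
```

A test could then apply it as `paths.iter().filter(|p| is_binary_artifact(p))` before asserting on the number of artifacts.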

cc @tofay who wrote the code for parsing Cargo output

Store build dependencies separately

Right now all dependencies are stored together. We could split out build dependencies and store them separately.

A common case where this distinction would be useful is RUSTSEC-2018-0006: this is used by the current version of clap as a build dependency and poses no security risk in that context; however, its use as a runtime dependency would be problematic.

We need to keep information about build dependencies because a bug in a code generator such as protobuf or cap'n proto only included as a build dependency may still pose a security risk at runtime.
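
A rough sketch of the split, assuming each package carries a `kind` tag as in the `kind: Runtime` field visible in the test output of the musl issue above. The struct and function here are simplified, hypothetical stand-ins.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DependencyKind {
    Runtime,
    Build,
}

/// Hypothetical simplified package entry.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Package {
    name: String,
    kind: DependencyKind,
}

/// Splits the list into (runtime, build) dependencies, preserving order.
/// Build dependencies are kept rather than dropped, so that tooling can
/// still flag a vulnerable code generator.
fn split_by_kind(packages: Vec<Package>) -> (Vec<Package>, Vec<Package>) {
    packages
        .into_iter()
        .partition(|p| p.kind == DependencyKind::Runtime)
}
```

Audit tooling could then report advisories for the two groups with different severity instead of treating them identically.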

Externally inject auditable data

I was recently thinking about what it would take to integrate something like this into a cargo install process. The biggest issue I see is that it requires modifying the binary sources to have the data added. I think a potentially more useful approach is a way to inject this data into an arbitrary binary build; maybe via something like a cargo wrapper cargo auditable build.

This would also avoid issues #9, #11 and #13 (but probably introduce others 😀).

Is there a reason to prefer the current approach where each binary needs to be configured to include the data?

Doesn't work on Mac OS

The section name we use for Linux cannot be reused as-is:

mach-o section specifier requires a segment and section separated by a comma.

Prototype configuration via `[package.metadata.cargo-auditable]`

https://github.com/bnjbvr/cargo-machete implements configuration via [package.metadata.cargo-machete], we should consider using this as a configuration mechanism for cargo-auditable.

auditable::version_info() should return `&str`, not `&[u8]`

auditable::version_info() returning &str instead of &[u8] would be much more ergonomic.

Problem is, we need to store inline data in a variable instead of a pointer, and it has to be sized. AFAIK there's no statically-sized version of str, that's why we currently use [u8; weird_length_calculation()].

So either we need to store the version info twice (once as bytes and once as &str), or use unsafe to convert from bytes without doing a full scan for non-UTF-8 characters on every call to auditable::version_info(). The latter is only sound if we can statically ensure that our slice is UTF-8. Using include_str! to verify UTF-8 compliance would leave us open to time-of-check/time-of-use attacks.
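
On toolchains where `core::str::from_utf8` is usable in const contexts (Rust 1.63 and later, which is newer than the compiler this issue was written against), the validation can in principle run once at compile time on the same sized array that gets embedded. The sketch below is an illustration under that assumption; the constant names and the sample payload are hypothetical.

```rust
/// Stand-in for the embedded data; the real crate would fill this from the
/// generated file. The array length is part of the type, as in the issue.
const VERSION_INFO_BYTES: [u8; 15] = *b"{\"packages\":[]}";

/// Validated at compile time: a non-UTF-8 payload fails the build
/// instead of being checked on every call.
const VERSION_INFO: &str = match core::str::from_utf8(&VERSION_INFO_BYTES) {
    Ok(s) => s,
    Err(_) => panic!("embedded version info is not valid UTF-8"),
};

/// Returns the embedded data as a string without any runtime scan
/// and without `unsafe`.
fn version_info() -> &'static str {
    VERSION_INFO
}
```

Because the check and the embedded bytes come from the same constant, there is no window between checking and using the data.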

Add integration tests for RustSec interop

It would be nice to verify that the recovered information is indeed read correctly by cargo auditable and/or the underlying rustsec crate, and that it does indeed report vulnerable versions when they're present.

There is a test advisory specifically for this purpose: https://github.com/rustsec/advisory-db/blob/main/crates/rustsec-example-crate/RUSTSEC-2019-0024.md

`auditable` embeds its own dependencies into the dependency info

Could be possible to fix if we parse the Cargo.lock in more detail and doctor it.

This would automatically not be an issue for an implementation of the idea within Cargo.

Long initial compilation time

auditable crate currently adds ~30 seconds to compilation time due to the dependencies on syn and serde.

Since serde-json is what Cargo itself uses, we have to stick to it for this to be a reasonably faithful implementation of the Cargo RFC. This issue is going to disappear once this functionality is upstreamed into Cargo.

`auditable` breaks `cargo test`

This stems from two issues:

Cargo doesn't run build.rs when running tests, so the dependency file doesn't exist and the environment variable with a path to it is not set
When running cargo test, the test attribute is both set and not set at the same time, so we cannot even inject a dummy value (I tried)

Demo of Schrödinger's cfg attribute: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=4538ce9eba5d72975dda87d696a717ea (make sure you select the "test" action, not "run")

Test Windows MinGW target on CI

Rust defaults to the x86_64-pc-windows-msvc platform on Windows, but both it and x86_64-pc-windows-gnu are Tier 1. We should test cargo auditable on both.

Cargo Resolver V2 (different feature sets for build and runtime dependencies) is not supported

Cargo has made it possible to depend on the same version of a given crate with different feature sets, provided that one version is a runtime dependency and another is a build dependency.

The dependency resolution in rust-audit was written prior to that change, and it's possible that auditable-serde collates these two packages.

The deduplication is done on the package ID from cargo-metadata, and we'll need to double-check that this is in fact correct even in the presence of the new Cargo feature resolver:

https://github.com/Shnatsel/rust-audit/blob/d7fa6fff1861799adab41638267e0457b7ba4698/auditable-serde/src/lib.rs#L219

Cargo.lock lookup is Linux-specific

We should use std::path::MAIN_SEPARATOR instead of hardcoded /
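
An alternative that avoids handling separators by hand is to build paths with `std::path::Path`, which applies the platform's separator itself. The sketch below lists candidate lockfile locations by walking up from a starting directory; the function name is hypothetical and this is not the crate's current lookup code.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: returns `Cargo.lock` candidates for `start` and
/// each of its parent directories, nearest first. `Path::join` inserts
/// the correct separator for the platform the code runs on.
fn candidate_lockfiles(start: &Path) -> Vec<PathBuf> {
    start
        .ancestors()
        .map(|dir| dir.join("Cargo.lock"))
        .collect()
}
```

A caller would then pick the first candidate that exists, for example with `candidates.into_iter().find(|p| p.is_file())`.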

`cargo audit` does not report vulnerabilities in recovered TOML

After using the json-to-toml example and feeding the data to cargo-audit, it reports that it has succeeded and found no vulnerabilities. However, in practice the presence of vulnerabilities is not reported.

For example, RUSTSEC-2021-0003 is not reported when the bundled hello-world sample depends on a vulnerable SmallVec version.

Dependencies not regenerated after first build

Since the dependency extraction logic is in build.rs of a dependency crate, it is not re-run whenever the toplevel crate is recompiled. This may lead to the embedded dependency info being stale compared to the actual state of affairs.

External injection interacts poorly with workspaces

The external-injection branch implements a Cargo subcommand to inject the audit data without requiring a build script and avoiding several issues associated with that.

It currently assumes that only one binary artifact is being built, and handles cases where several binaries or an entire workspace is being built very poorly. Cargo doesn't make the information about which binary artifacts are being built readily available to subcommands.

According to @tofay:

You can get info on binaries built in an external tool, but you have to enable cargo json messages and then parse them (https://doc.rust-lang.org/cargo/reference/external-tools.html#artifact-messages). we have a wrapper at work that does this to do some post-processing of binaries, and that's the approach I've proposed for cargo-spdx at alilleybrinker/cargo-spdx#9

This will probably have to be implemented before cargo auditable can be actually used in the wild for arbitrary projects.

Compress data

Dependency info is highly compressible. We should utilize that fact to reduce the binary size overhead.

Build fails (due to an update to the binfarce crate, I think)

Hey I tried

git clone https://github.com/Shnatsel/rust-audit.git        
cd rust-audit
cargo build --release

and got:

error[E0004]: non-exhaustive patterns: `SectionIsMissing(_)` and `UnexpectedSectionType { .. }` not covered
  --> auditable-extract/src/lib.rs:96:15
   |
96 |         match e {
   |               ^ patterns `SectionIsMissing(_)` and `UnexpectedSectionType { .. }` not covered
   | 
  ::: /home/harry/.cargo/git/checkouts/binfarce-e74c3427e3f3ff61/32b9eed/src/error.rs:5:5
   |
5  |     SectionIsMissing(&'static str),
   |     ---------------- not covered
6  |     UnexpectedSectionType { expected: u32, actual: u32 },
   |     --------------------- not covered
   |
   = help: ensure that all possible cases are being handled, possibly by adding wildcards or more match arms
   = note: the matched value is of type `ParseError`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0004`.
error: could not compile `auditable-extract`

To learn more, run the command again with --verbose.
warning: build failed, waiting for other jobs to finish...
error: build failed

Registry URL omitted from dependency information

This means you can't identify if crates are from crates-io or an alternative registry, which may result in false positives in subsequent scanning.

Perhaps crates-io could be special-cased, so that the URL is included? That would at least allow differentiation between crates-io and alternative registries. (I'm assuming rust-audit doesn't include the source from cargo_metadata directly for privacy reasons, e.g. leaking URLs or git repo structure.)

Provide a single function that extracts audit data from an executable

Right now we have the full extraction pipeline in examples, which is not super complicated but is nevertheless manual.

rust-audit-info shows how it's all tied together; we should just put that into a function and make that a crate.

auditable-build panics on Beta

I have a crate, mbot, for which auditable-build runs successfully and everything works on Stable, but for which auditable-build panics when building on Beta.
Some relevant command output:

jmn@neogreen:~/Projects/mprojects/mbot$ RUST_BACKTRACE=1 cargo +beta run --release --features auditable-data
   Compiling mbot v0.1.0 (/home/jmn/Projects/mprojects/mbot)
error: failed to run custom build command for `mbot v0.1.0 (/home/jmn/Projects/mprojects/mbot)`

Caused by:
  process didn't exit successfully: `/home/jmn/Projects/mprojects/mbot/target/release/build/mbot-35641d5a1f06aff7/build-script-build` (exit code: 101)
  --- stdout
  cargo:rerun-if-changed=data/maddie.json

  --- stderr
  thread 'main' panicked at 'no entry found for key', cargo/registry/src/github.com-1ecc6299db9ec823/auditable-serde-0.1.0/src/lib.rs:307:51
  stack backtrace:
     0: rust_begin_unwind
               at /rustc/9f0e6fa94be6f97c736e51811d7b58904edfa8cb/library/std/src/panicking.rs:475
     1: core::panicking::panic_fmt
               at /rustc/9f0e6fa94be6f97c736e51811d7b58904edfa8cb/library/core/src/panicking.rs:85
     2: core::option::expect_failed
               at /rustc/9f0e6fa94be6f97c736e51811d7b58904edfa8cb/library/core/src/option.rs:1213
     3: core::option::Option<T>::expect
     4: <std::collections::hash::map::HashMap<K,V,S> as core::ops::index::Index<&Q>>::index
     5: <auditable_serde::VersionInfo as core::convert::TryFrom<&cargo_metadata::Metadata>>::try_from
     6: auditable_build::collect_dependency_list
     7: build_script_build::main
     8: core::ops::function::FnOnce::call_once
  note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Support RISC-V

RISC-V is the only architecture for which the data exposed by the object crate is not sufficient, and information from the compiler internals is needed. That's why it is currently stubbed out:

cargo-auditable/cargo-auditable/src/object_file.rs

Lines 119 to 126 in a90fb30

// TODO
// Architecture::Riscv64 if sess.target.options.features.contains("+d") => {
// // copied from `riscv64-linux-gnu-gcc foo.c -c`, note though
// // that the `+d` target feature represents whether the double
// // float abi is enabled.
// let e_flags = elf::EF_RISCV_RVC | elf::EF_RISCV_FLOAT_ABI_DOUBLE;
// file.flags = FileFlags::Elf { e_flags };
// }

We need to find some way to deal with this.

WebAssembly support

It is technically possible to support WebAssembly, since they do allow custom sections: https://webassembly.github.io/spec/core/appendix/custom.html

This may be useful since the overhead of the audit info is just a few kilobytes and WebAssembly is being applied not just for the web.

Hardcoded / as a path separator

I'm not sure how to include_bytes! with a platform-agnostic path separator, so it would insert / or \ depending on the host OS.
This works but is not portable: include_bytes!(concat!(env!("OUT_DIR"), "/myfile"));
This doesn't work: include_bytes!(concat!(env!("OUT_DIR"), std::path::MAIN_SEPARATOR, "myfile"));
I've tried googling this but nothing comes up

Explore effect on reproducible builds

Now that the dependency info is part of the executable, we need to ensure that it allows for reproducible builds. We need to ensure that the package version info we embed is deterministic.

One potential source of non-determinism is the fact that we store data as ordered collections, but don't enforce any particular order.
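
One way to enforce an order is to sort the package list by a fixed key and rewrite the dependency indices to match. The following is a sketch under the assumption that packages refer to each other by index, as in the JSON examples in these issues; the `Package` struct is a simplified, hypothetical stand-in and all indices are assumed to be in range.

```rust
/// Hypothetical simplified package entry.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Package {
    name: String,
    version: String,
    source: String,
    dependencies: Vec<usize>,
}

/// Sorts packages by (name, version, source) and remaps dependency
/// indices, so the output does not depend on the input order.
/// Panics if a dependency index is out of range.
fn canonicalize(packages: Vec<Package>) -> Vec<Package> {
    // order[new_index] = old_index
    let mut order: Vec<usize> = (0..packages.len()).collect();
    order.sort_by(|&a, &b| {
        let (pa, pb) = (&packages[a], &packages[b]);
        (&pa.name, &pa.version, &pa.source).cmp(&(&pb.name, &pb.version, &pb.source))
    });
    // new_index_of[old_index] = new_index
    let mut new_index_of = vec![0usize; packages.len()];
    for (new, &old) in order.iter().enumerate() {
        new_index_of[old] = new;
    }
    order
        .iter()
        .map(|&old| {
            let mut pkg = packages[old].clone();
            pkg.dependencies = pkg.dependencies.iter().map(|&d| new_index_of[d]).collect();
            // Dependency lists get a fixed order as well.
            pkg.dependencies.sort_unstable();
            pkg.dependencies.dedup();
            pkg
        })
        .collect()
}
```

Two inputs that describe the same dependency graph in a different order then serialize to identical bytes, provided the serializer itself is deterministic.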

Investigate pre-existing formats for storing dependency info

Apparently there are a number of formats designed to encode package info already: https://gitbom.dev/glossary/sbom/

We need to check if any of them are suitable for our use case. Notably, we redact some fields such as git repo URLs, and also include information about enabled features, so it might not be 100% compatible.

Also, the degree of adoption of these formats needs to be understood; perhaps we should provide conversion utilities, even if we don't end up using the format internally.

Omit dependencies disabled via features

Right now we include the data equivalent to the contents of Cargo.lock, which lists all dependencies declared in the workspace. Some of those dependencies may be disabled via features; including them may result in false positives in whatever tooling uses this data.

Dependencies for all platforms are included

Windows-only dependencies are currently included in a Linux build and vice versa. We should filter deps by platform.

Fails on msys2 with --target x86_64-pc-windows-gnu on Github Actions

Hi, nice to see all the progress on the "injection" approach. On msys2 I see that

cargo build --release --target x86_64-pc-windows-gnu

works fine, but

cargo auditable build --release --target x86_64-pc-windows-gnu

fails with

error[E0463]: can't find crate for `std`
  |
  = note: the `x86_64-pc-windows-gnu` target may not be installed
  = help: consider downloading the target with `rustup target add x86_64-pc-windows-gnu`

error: cannot find macro `println` in this scope
 --> src\main.rs:2:5
  |
2 |     println!("Hello, world!");
  |     ^^^^^^^

error: requires `sized` lang_item

For more information about this error, try `rustc --explain E0463`.

I tried to create a minimal reproducible example here: https://github.com/niklasf/cargo-auditable-issue/runs/7665968148?check_suite_focus=true

Naming is confusing with cargo-audit

cargo-audit already has decent developer penetration, so having another audit crate is confusing. Could I maybe suggest the name rust-traceable instead?

Investigate `build-info` crate

https://crates.io/crates/build-info

It exists and does something similar. Perhaps some of the concerns can be offloaded to it, or perhaps they've got some cool techniques I couldn't come up with.

`cargo auditable`: use `anyhow` crate for error handling

Right now cargo auditable just has .unwrap() all over the place. That's fine for a prototype, but we'll need proper error handling to show nice error messages.

We should use anyhow crate for error handling, because that's what Cargo already uses, and it will make upstreaming simpler.

v0.3.0 release checklist

Update repository URL in Cargo.toml files
Remove all references to auditable crate and replace them with cargo auditable
Create a test to check rlib and bin in a single crate to fulfill this TODO
Figure out how to make Cargo publish the toplevel README.md to crates.io under the cargo auditable package
~~#52~~

Cargo.lock lookup fails if CARGO_TARGET_DIR is overridden

Right now auditable relies on the assumption that the target directory is somewhere below the directory containing Cargo.lock. This is true by default, with the build happening in target/, but the user can override it via CARGO_TARGET_DIR and violate this assumption.

The problem here is that Cargo does not support any kind of cross-crate communication via build.rs. Even though we can run literally arbitrary code from build.rs, we don't know where the code of the other crate is located. Specifically:

The toplevel crate has no way to inject its Cargo.lock back into auditable because by the time its build.rs runs, auditable is already compiled
The build.rs in auditable has no clean way to know what is the toplevel crate when it's in the dependency tree

As I see it, we have to either:

Use a heuristic (like we do now) and require merely adding the crate as a dependency, but fail if CARGO_TARGET_DIR is overridden. We can provide an env variable to explicitly point to the appropriate Cargo.lock as a workaround. This is in the spirit of making easy things easy and hard things possible.
Put all the logic in the toplevel crate, requiring the user to both create/modify build.rs and inject some macro into their own code. This is reliable, but the ergonomics are poor. Obtaining the string from the code is especially tricky.

This issue will no longer exist once this mechanism is moved to Cargo, since Cargo has full knowledge of the Cargo.lock of the toplevel crate.

`auditable` breaks cross-compilation from Windows to Unix or vice versa

We need to call include_bytes! with a platform-agnostic path separator, so it would insert / or \ depending on the host OS.

The best way so far is #[cfg(unix)] and #[cfg(windows)] which is what we currently use. But these are set depending on the target platform, not host platform. There are no cfg options for host platform.

Doesn't work on Mac

Building on Mac without passing linker flags doesn't preserve the audit data. The Mac ld doesn't support the --undefined flag.

There is a documented flag -u that seems to do what we need it to, but trying to actually use it results in the following error:

ld: unknown option: -u AUDITABLE_VERSION_INFO

The -u flag is documented in the manpage, so it's weird to see the linker reject it.

Both -u AUDITABLE_VERSION_INFO (with the space) and -uAUDITABLE_VERSION_INFO (without the space) fail.

Set up CI tests for Windows, Mac

I only have a Linux machine to test on. We should write end-to-end tests and run them in CI on a variety of platforms.

Determine the root package

The current format doesn't have information to determine direct or transitive dependence. In the following example, we can know ansi_term depends on bitflags, but the project may also directly depend on bitflags.

{
  "packages": [
    {
      "name": "ansi_term",
      "version": "0.12.1",
      "source": "crates.io",
      "dependencies": [
        1
      ]
    },
    {
      "name": "bitflags",
      "version": "1.2.1",
      "source": "crates.io",
      "features": [
        "default"
      ]
    }
  ]
}

For example, package-lock.json has "" so that we can know direct dependencies.
https://github.com/firebase/functions-samples/blob/3515b7f38a3c598cdb20152a263372e81719ecda/package-lock.json#L7-L17
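
With the current format, the closest approximation is to look for packages that nothing else depends on. The sketch below does that over a hypothetical minimal `Package` struct. It is only a heuristic: it yields root candidates, and it cannot tell direct from transitive dependencies unless the root package itself is recorded in the list.

```rust
/// Hypothetical minimal package entry.
struct Package {
    name: String,
    dependencies: Vec<usize>,
}

/// Returns the indices of packages that no other package depends on.
/// Out-of-range dependency indices are ignored.
fn root_candidates(packages: &[Package]) -> Vec<usize> {
    let mut depended_on = vec![false; packages.len()];
    for pkg in packages {
        for &dep in &pkg.dependencies {
            if let Some(flag) = depended_on.get_mut(dep) {
                *flag = true;
            }
        }
    }
    (0..packages.len()).filter(|&i| !depended_on[i]).collect()
}
```

For the ansi_term/bitflags example above, the only candidate is index 0 (ansi_term), which shows the limitation: the project's own direct dependency on bitflags, if any, is not recoverable from this data.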

Public release checklist

A more robust way to locate Cargo.lock file in auditable
Store dependencies in a format other than raw Cargo.lock; will likely involve cargo-lock crate for parsing it
Create a pure-Rust recovery tool, based on RazrFalcon/cargo-bloat#59 or libgoblin
crates.io name acquisition and publishing
