Comments (5)
Totally agree that execution metadata can be easy to miss, will work on making it more readable!
from buildbuddy.
Hey @aaronmondal - if a remote executor doesn't already have an input needed to perform a remote action in its local disk cache, it must fetch it from the cache. If the executor already has the artifact in its local cache, it will not be fetched again. There can be many (dozens, hundreds, +) executors running at a given time, so it can take a few builds for a given cache artifact to wind up on every executor. You can explore the Executions tab and click on an execution to see the inputs and how large they are.
If much of this size is from toolchains / some large inputs that get pulled in for every action - you can consider putting these in a custom container image, which gets pulled from a Docker registry rather than from the cache: https://www.buildbuddy.io/docs/rbe-platforms/#using-a-custom-docker-image
from buildbuddy.
@siggisim Thanks for the swift reply! I might have an idea where that download size is coming from, though this might be a "bug" in the UI.
After looking at the build again, it just didn't make sense that there would be such a large download size from the Bazel artifacts:
https://app.buildbuddy.io/invocation/be9102f9-6977-47b0-8ab9-25ddb38c6265
The executors tab here shows ~6000 actions with a max read size of ~0.28KB, which would amount to a max download of ~2MB. That looks a bit strange, but since this was a clean build it might just be profiling artifacts exchanged between executors.
The large number of CAS hits probably plays some part in this. ~500,000 hits at a max artifact size of ~130KB are still just ~65GB at most, though.
However, we already do use a custom image, and that image is quite large at ~2.8 GB. If the Docker pull of that image on 50 workers is tracked in the download size, that would explain the ~140GB that I can't find in the logs (though 50 jobs=50 separate pulls?).
Another part is that http_archive fetches seem to be missing from the logs as well. I expect ~700MB of fetches there which on 50 workers again could contribute to the overall download size.
So one of these, or both, might be missing metrics in the UI (or I just couldn't find them?):
- Docker pull size
- `http_archive` (and similar) fetch size
It might also be relevant that we are using bzlmod for all of this, and the log might be missing information because fetches triggered by module resolution are not tracked correctly.
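The back-of-the-envelope estimates earlier in this comment can be reproduced with a short Python sketch. It only multiplies the approximate counts and maximum sizes quoted above. The decimal units (1 KB = 1000 bytes), the helper name `upper_bound_bytes`, and the dictionary labels are assumptions made for illustration. The results are hypothesized worst-case bounds, not values reported by BuildBuddy.

```python
# Rough upper bounds from the approximate figures quoted in this comment.
# Assumption: decimal units (1 KB = 1000 bytes), integer byte counts.
KB = 1000
MB = 1000 ** 2
GB = 1000 ** 3


def upper_bound_bytes(count, max_size_bytes):
    """Worst case: every item is as large as the largest one."""
    return count * max_size_bytes


estimates = {
    # ~6000 actions, max read size ~0.28KB (280 bytes)
    "executor reads": upper_bound_bytes(6_000, 280),
    # ~500,000 CAS hits, max artifact size ~130KB
    "CAS hits": upper_bound_bytes(500_000, 130 * KB),
    # Hypothesis: one ~2.8GB image pull on each of 50 workers
    "image pulls": upper_bound_bytes(50, 2_800 * MB),
    # Hypothesis: ~700MB of http_archive fetches on each of 50 workers
    "http_archive fetches": upper_bound_bytes(50, 700 * MB),
}

for label, size in estimates.items():
    print(f"{label}: {size / GB:.4f} GB")
```

Only the image-pull hypothesis lands near the unexplained ~140GB, which is why it looked like the most plausible candidate at this point in the thread.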
from buildbuddy.
Hey @aaronmondal - you can take a look at the Executions tab and sort by "File size downloaded"
It shows that many of the (5,323) remote actions have 700MB + of inputs (these numbers only show networked downloads, and skip artifacts that are already present on the remote executor). You can click on these individual actions and explore the input files to see where this is going.
Docker pull size doesn't affect these stats, since images are pulled from a Docker registry rather than from the cache.
`http_archive` fetches don't count because they are downloaded to the machine that is hosting Bazel, not to the remote executor (unless the archive is listed as an input to a particular remote action).
from buildbuddy.
@siggisim Ahhh now I see it. OK, then this is all clear, and this of course fully explains everything.
Maybe it would be a good idea to make that small text larger and higher contrast. On a (fairly high quality) 4k display this is so small and low contrast that I overlooked that text even after looking at these logs for a really long time. I just always read that 0.28KB number, which takes all the visual focus since it is so much more pronounced. I assumed that those 0.28KB values were the download size, completely overlooking the 808 MB value.
I don't have a visual impairment, but an occasional case of "being a very dumb user". Maybe this classifies as an accessibility issue regardless 😅
from buildbuddy.