Comments (7)
Yes, this is expected behaviour: both executions generate a conflicting output.txt file under the output_custom path, and we don't know how to merge the output directory when we get conflicting file names. The error is telling the user that we can't merge results from the multiple executions, but they can still download the raw results.
This job works because both executions generate randomly named files under output_custom:
Name: Docker Job with Output
Type: batch
Namespace: default
Count: 2
Tasks:
  - Name: main
    Engine:
      Type: docker
      Params:
        Image: ubuntu:latest
        Entrypoint:
          - /bin/bash
        Parameters:
          - -c
          - >
            dd if=/dev/urandom of=/output_custom/$(tr -dc A-Za-z0-9 </dev/urandom | head -c 13 ; echo '').txt bs=1M count=1
    Publisher:
      Type: local
    ResultPaths:
      - Name: output_custom
        Path: /output_custom
The result directory structure looks like:
→ tree job-j-518c8f47
job-j-518c8f47
├── exitCode
├── output_custom
│ ├── 4xpHSwz4corBo.txt
│ └── F2LWEK6Urmzin.txt
├── stderr
└── stdout
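The merge failure described above can be sketched as follows; merge_results is a hypothetical illustration, not Bacalhau's actual merge code:

```python
import shutil
from pathlib import Path

def merge_results(execution_dirs, merged_dir):
    """Copy each execution's result files into one merged directory,
    failing when two executions produced the same relative path.
    (Illustrative sketch only, not Bacalhau's implementation.)"""
    merged = Path(merged_dir)
    merged.mkdir(parents=True, exist_ok=True)
    for exec_dir in execution_dirs:
        exec_dir = Path(exec_dir)
        for src in exec_dir.rglob("*"):
            if src.is_dir():
                continue
            dst = merged / src.relative_to(exec_dir)
            if dst.exists():
                # Same file name from two executions: we can't merge.
                raise FileExistsError(
                    f"cannot merge results: {dst.name} was produced by multiple executions"
                )
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
```

With randomly named files (as in the job above) collisions are unlikely and the merge succeeds; two executions both writing output.txt abort it.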
from bacalhau.
Yes, I understand this behaviour is expected given the current implementation of the code, but from a UX perspective it is unexpected, IMO. We're basically saying "you're holding it wrong" for a relatively simple task, and implying we expect users to write jobs that name all their output files differently when run on multiple nodes.
Perhaps we could make the merge behaviour something users opt into rather than out of?
Agreed, we should save both simultaneously and tell the user that, e.g.
/<output name>-<short node ID>
for all nodes.
Would something like:
job-j-1234/
  node-n-abc/
    exitcode
    stderr
    stdout
    output-name/
      ...
  node-n-def/
    exitcode
    stderr
    stdout
    output-name/
      ...
be acceptable?
The problem is you won't get deterministic outputs or a deterministic structure for the result directory. At the same time, I hate this kind of magic that works most of the time but fails at others. Also, we are not really doing a good job of merging stdout, stderr or exit codes.
I also wouldn't group results by node ID, as the same node might run multiple executions for the same job, such as when we have a fat node in the network. We don't do this today, but we should.
So the options can be:
- Group by an execution counter. If the job has count=3, then we will have 000, 001 and 002 sub-directories, each holding the results and logs for a specific execution. This solves the deterministic behaviour, but it is not simple to implement today as we don't have this execution-counter logic (though we should).
- Group by execution ID: much easier to implement, and can work nicely if we allow users to download results for specific executions.
the problem is you won't get deterministic outputs or structure of the result directory
Is determinism in output still something we care about? I'm not sure I see this as a problem, rather a side effect of the system.
I hate this kind of magic that works most of the time, but fails on others. Also we are not really doing a good job in merging stdout, stderr or exit code.
I hate the magic too, because it's the furthest thing from magic when it doesn't work. I don't think we should be merging outputs, it's almost always not what I want when I download the results of a job.
also I wouldn't group results by node-id as the same node might run multiple executions for the same job
Hmm, yeah, this is a fair point. On the other hand, I like seeing which result came from which node. So...
Group by executionID: much easier to implement and can work nicely if we allow users to download results for specific executions
What if we group by node and execution, that is:
Job
  Node-1
    Execution-1
    Execution-2
  Node-2
    Execution-1
  Node-3
    Execution-1
    Execution-2
    Execution-3
I am not sure what this should look like when there is more than one task, thoughts?
We need to balance correctness with a great UX.
Is determinism in output still something we care about? I'm not sure I see this as a problem, rather a side effect of the system.
It is not a great user experience if users want to pipe the download command, or do some sort of automation, when the download path is not predictable. I understand they can play with wildcards, but that's still not a nice UX.
On the other hand I like seeing which result came from which node
Why do users care about the node ID when reading results? I understand it might be useful for debugging, which users can find out by calling bacalhau job executions <job_id>, but it's not necessarily useful in the result directory structure.
What if we group by node and execution
Too many levels is not a great experience. We can always add a metadata file inside each path that holds some metadata about the execution, such as its full ID, node ID, start time, end time, etc.
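Such a metadata file could be as simple as one small JSON document per execution directory; a sketch, where the file name metadata.json and the field names are assumptions, not an existing Bacalhau format:

```python
import json
from pathlib import Path

def write_execution_metadata(result_dir, execution_id, node_id, start_time, end_time):
    """Write a per-execution metadata file next to its results, keeping
    the directory tree flat while preserving debugging context.
    (Hypothetical file name and schema, not a Bacalhau format.)"""
    meta = {
        "ExecutionID": execution_id,
        "NodeID": node_id,
        "StartTime": start_time,
        "EndTime": end_time,
    }
    path = Path(result_dir) / "metadata.json"
    path.write_text(json.dumps(meta, indent=2))
    return path
```

This keeps the node ID available for debugging without adding a node level to the directory hierarchy.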