papaemmelab / toil_container
:whale: Toil + Docker and Singularity.
License: MIT License
Hi!
This is my first visit to this fine repo, but it seems you have been working hard to keep all dependencies updated so far.
Once you have closed this issue, I'll create separate pull requests for every update as soon as I find one.
That's it for now!
Happy merging!
it doesn't make sense to pass relative paths to containers as cwd
toil_container version: 1.1.6
Python version: 3.9.4
Operating System: Ubuntu 20.10
Currently the most useful entry point to toil_container() is the call() function of ContainerJob. Unfortunately, it uses the options dictionary, which is defined at initialization time. This means that:
a) All options have to be known before the job actually runs. This is not always true, as bind mounts might be dynamic, and even the docker/singularity choice could be changed dynamically.
b) We have to actually pass options every time we instantiate a job. This is verbose and not DRY.
I propose that we:
1) Factor ContainerJob.call out into a static function such as containerCall(). containerCall() then takes the "options" (docker/singularity, mounts etc.) as an argument, and does not rely on self.options. This resolves my first point.
2) Have ContainerJob use containerCall() internally for backwards-compatibility (I have no intention to use ContainerJob myself, though).
3) Have containerCall() pull its default options directly from the Toil config object, accessed through self.jobStore.config from any given job. These can still be overridden by arguments. This system means that, if you have --singularity enabled, all jobs will automatically use singularity (by default) without having to pass options around (my second point above).
Currently all the versions of all dependencies are pinned:
Lines 18 to 24 in 2afa800
This is unfortunate, as it forces me to downgrade toil, docker etc. just to install this library.
Good packaging practice [1], [2] dictates that library requirements should be as permissive/abstract as possible, to allow compatibility with other libraries. This can include an upper and lower bound, but pinning a specific version, and possibly even using ~=some_patch_version, is problematic because it severely restricts the dependencies.
Do you think it might be possible to convert these pins to lower bounds? It may be necessary to also add upper bounds for things like Toil, but I'm willing to help fix this library to make it compatible if this is the case.
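As a rough illustration of the conversion being asked about, the sketch below rewrites == and ~= pins into plain lower bounds. The helper name relax_pin and the version numbers in the example are hypothetical; they are not taken from this repository's requirements.

```python
import re

# Matches "name==version" or "name~=version" (no extras, markers, or ranges).
PIN = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(==|~=)\s*([^\s;,]+)\s*$")


def relax_pin(requirement):
    """Turn an exact or compatible-release pin into a lower bound.

    Anything that is not a simple pin is returned unchanged (stripped).
    """
    match = PIN.match(requirement)
    if match is None:
        return requirement.strip()
    name, _operator, version = match.groups()
    # "1.2.*" is only valid with ==, so drop the wildcard for >=.
    if version.endswith(".*"):
        version = version[:-2]
    return f"{name}>={version}"


if __name__ == "__main__":
    # Hypothetical pins, for illustration only.
    for line in ["toil==1.2.3", "docker~=2.0", "click>=7.0"]:
        print(line, "->", relax_pin(line))
```

Note that relaxing ~= drops its implicit upper bound, so an explicit upper bound may still need to be added by hand where a dependency is known to break compatibility.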
Apparently there are some security issues with using shell=True. But without it, we also lose a lot of features, like piping, wildcards (*), etc.
Example:
> !echo 'It Works' > tmp
> !cat tmp | xargs echo
It Works
> subprocess.check_output(['cat', 'tmp', '|', 'xargs', 'echo'])
CalledProcessError: Command '['cat', 'tmp', '|', 'xargs', 'echo']' returned non-zero exit status 1.
> subprocess.check_output('cat tmp | xargs echo', shell = True)
b'It Works\n'
A workaround is to use bash to run a script or command:
> subprocess.check_output(['bash', '-c', 'cat tmp | xargs echo'])
b'It Works\n'
The question is whether using bash -c <command> or bash <script> is the same as shell=True. If it is, the next question is whether we should add support for it as a param option.
A good resource for this: https://stackoverflow.com/a/51950538/3949081
cc: @mflevine
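For comparison, here is a minimal sketch of two ways to keep pipe support without shell=True. It assumes a POSIX system with cat and xargs available. The helper names run_pipeline and bash_command are hypothetical and are not part of toil_container. Passing a string to bash -c has much the same injection exposure as shell=True (which runs /bin/sh -c on POSIX), so the second helper quotes every interpolated argument with shlex.quote.

```python
import shlex
import subprocess
import tempfile
from pathlib import Path


def run_pipeline(first, second):
    """Run the equivalent of `first | second` without invoking a shell."""
    producer = subprocess.Popen(first, stdout=subprocess.PIPE)
    try:
        # The consumer reads directly from the producer's stdout pipe.
        output = subprocess.check_output(second, stdin=producer.stdout)
    finally:
        producer.stdout.close()
        producer.wait()
    if producer.returncode != 0:
        raise subprocess.CalledProcessError(producer.returncode, first)
    return output


def bash_command(template, *args):
    """Build a `bash -c` argv, shell-quoting each argument before interpolation."""
    quoted = [shlex.quote(str(arg)) for arg in args]
    return ["bash", "-c", template.format(*quoted)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = Path(tmp_dir) / "tmp"
        path.write_text("It Works\n")
        print(run_pipeline(["cat", str(path)], ["xargs", "echo"]))
        # Only builds the argv; pass it to subprocess.check_output to run it.
        print(bash_command("cat {} | xargs echo", path))
```

The first approach never parses a command string, so wildcards would have to be expanded in Python (for example with glob). The second keeps full shell syntax but is only as safe as the quoting of its arguments.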
We need to do something like this, so that every container call uses a new tmp directory:
--workdir $TMP_DIR/${USER}_toil_container_`openssl rand -hex 12` \
Perhaps cleaning tmp afterwards?
We also need to use --containall so that $HOME is not mapped inside the container.
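One possible shape for this, as a sketch rather than the library's actual implementation: a context manager creates a fresh working directory per call and removes it afterwards, and a helper builds the singularity exec argument list with --containall and --workdir. The names container_workdir and singularity_argv are hypothetical, and tempfile.mkdtemp's random suffix stands in for the openssl rand call.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def container_workdir(base_dir=None):
    """Create a unique per-call working directory and remove it on exit."""
    base_dir = base_dir or tempfile.gettempdir()
    user = os.environ.get("USER", "user")
    # mkdtemp appends a random suffix, so every call gets its own directory.
    path = tempfile.mkdtemp(prefix=f"{user}_toil_container_", dir=base_dir)
    try:
        yield path
    finally:
        shutil.rmtree(path, ignore_errors=True)


def singularity_argv(image, command, workdir):
    """Build an argv list for `singularity exec` with an isolated environment."""
    return [
        "singularity", "exec",
        "--containall",          # do not map $HOME or other host state
        "--workdir", workdir,    # scratch space for this call only
        image, *command,
    ]


if __name__ == "__main__":
    with container_workdir() as workdir:
        print(singularity_argv("image.sif", ["echo", "hello"], workdir))
```

Cleaning up in the finally clause means the directory is removed even when the container call fails, which addresses the "cleaning tmp afterwards" question at the cost of losing the scratch files for debugging.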