Comments (20)
Hi @eburgueno , I'm glad you found our project interesting.
I'm not sure I follow what you are asking. The user inside the container is always 'biodocker' and, as far as I know, that doesn't interfere with external users, which means you can share files between your external users and the container user with no problems.
I have used some of the containers on a shared server with multiple users and I haven't seen any problems so far.
I hope that answers your question; please feel free to elaborate a little more if you find it necessary.
from specs.
Hi @prvst, this is what I mean:
[user@host1 dummy]$ pwd
/workspace/user/dummy
[user@host1 dummy]$ ll -nd
drwxrwsr-x. 3 550 10024 389 Jun 15 16:48 .
[user@host1 dummy]$ ll -n
total 188809
-rw-rw-r--. 1 550 10024 73119119 Jan 14 2014 accepted_hits.bam
-rw-rw-r--. 1 550 10024 73119119 Jan 14 2014 accepted_hits.sorted.bam
-rw-rw-r--. 1 550 10024 101552 Jan 14 2014 accepted_hits.sorted.bam.bai
-rw-rw-r--. 1 550 10024 25188 Jan 14 2014 deletions.bed
-rw-rw-r--. 1 550 10024 31378 Jan 14 2014 insertions.bed
-rw-rw-r--. 1 550 10024 546489 Jan 14 2014 junctions.bed
-rw-rw-r--. 1 550 10024 66 Jan 14 2014 left_kept_reads.info
drwxrwxr-x. 2 550 10024 831 Jan 14 2014 logs
-rw-rw-r--. 1 550 10024 66 Jan 14 2014 right_kept_reads.info
-rw-rw-r--. 1 550 10024 6764172 Jan 14 2014 unmapped_left.fq.z
-rw-rw-r--. 1 550 10024 6185737 Jan 14 2014 unmapped_right.fq.z
[user@host1 dummy]$ docker search biodckr/blast
NAME DESCRIPTION STARS OFFICIAL AUTOMATED
biodckr/blast NCBI BLAST+ 2 [OK]
[user@host1 dummy]$ docker run -ti --rm -v $(pwd):/data biodckr/blast bash
Unable to find image 'biodckr/blast:latest' locally
latest: Pulling from biodckr/blast
8387d9ff0016: Pull complete
3b52deaaf0ed: Pull complete
4bd501fad6de: Pull complete
a3ed95caeb02: Pull complete
b8dd2122b2bc: Pull complete
9c8ffe978896: Pull complete
d9ec616dfddb: Pull complete
88ea212f7810: Pull complete
ff0c7281174b: Pull complete
475239cf827c: Pull complete
Digest: sha256:37076ba5533524ba05703950160ca6becc82526225b9f5cf2974c1b0f5d8ccef
Status: Downloaded newer image for biodckr/blast:latest
biodocker@502394e2c896:/data$ id
uid=1000(biodocker) gid=1000(biodocker) groups=1000(biodocker),27(sudo),105(fuse)
biodocker@502394e2c896:/data$ cd logs
biodocker@502394e2c896:/data/logs$ ls
bowtie.left_kept_reads.fixmap.log bowtie.right_kept_reads.fixmap.log bowtie_build.log long_spanning_reads.segs.log run.log
bowtie.left_kept_reads_seg1.fixmap.log bowtie.right_kept_reads_seg1.fixmap.log bowtie_inspect_recons.log prep_reads.log sam_merge.log
bowtie.left_kept_reads_seg2.fixmap.log bowtie.right_kept_reads_seg2.fixmap.log juncs_db.log reports.log segment_juncs.log
bowtie.left_kept_reads_seg3.fixmap.log bowtie.right_kept_reads_seg3.fixmap.log long_spanning_reads.log reports.samtools_sort.log
biodocker@502394e2c896:/data/logs$ rm *
rm: remove write-protected regular file 'bowtie.left_kept_reads.fixmap.log'? y
rm: cannot remove 'bowtie.left_kept_reads.fixmap.log': Permission denied
rm: remove write-protected regular file 'bowtie.left_kept_reads_seg1.fixmap.log'? y
rm: cannot remove 'bowtie.left_kept_reads_seg1.fixmap.log': Permission denied
rm: remove write-protected regular file 'bowtie.left_kept_reads_seg2.fixmap.log'? y
rm: cannot remove 'bowtie.left_kept_reads_seg2.fixmap.log': Permission denied
rm: remove write-protected regular file 'bowtie.left_kept_reads_seg3.fixmap.log'? ^C
biodocker@502394e2c896:/data/logs$ cd ..
biodocker@502394e2c896:/data$ blast
blast_formatter blastdb_aliastool blastdbcheck blastdbcmd blastdbcp blastn blastp blastx
biodocker@502394e2c896:/data$ blastx --help > output
bash: output: Permission denied
In a shared environment, multiple users write to the same PATHs. We preserve the user ownership information because we want to know who did what, and we enable collaboration by setting a SGID bit in the top level directory (this enforces group ownership inheritance on new files). UPG standards take care of file permissions (write access for group). I believe this is pretty much common practice in an HPC environment.
My apologies, as my example above doesn't really run blast (I'm no bioinformatician, just a sysadmin), but hopefully the idea stands out: the user running the process inside the container won't be able to write to a directory on the container host, as it obviously doesn't belong to the 10024 group.
Even if you were able to write into a directory, the ownership of the new file generated inside the container will be UID=1000. In our particular case UID=1000 has already been assigned to someone. Changing this person's UID for another one is not exactly ideal. But even if you were to do that, you still have a problem in that you can't know who created said file (ie: was it user1 or user2 who started the container?).
Hope this makes sense.
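The shared-directory convention described above (SGID on the top-level directory plus a group-writable umask) can be illustrated locally. This is only a minimal sketch, assuming a POSIX shell on Linux; the temporary directory and the file name are hypothetical stand-ins for a real shared path such as /workspace/user/dummy.

```shell
#!/bin/sh
# Sketch: a shared directory where new files inherit the directory's group.
# The temp directory stands in for a real shared project path (assumption).
shared=$(mktemp -d)

# 2775 = setgid bit + rwxrwxr-x: files created inside inherit the directory's group.
chmod 2775 "$shared"

# With umask 0002, new files are created group-writable.
( umask 0002; touch "$shared/result.txt" )

# Helper: print only the 10-character mode string of a path.
mode_of() { ls -ld "$1" | cut -c1-10; }
# Helper: print the numeric group ID of a path (4th column of ls -n).
gid_of() { ls -ldn "$1" | awk '{print $4}'; }

echo "dir:  $(mode_of "$shared") gid=$(gid_of "$shared")"
echo "file: $(mode_of "$shared/result.txt") gid=$(gid_of "$shared/result.txt")"
```

A process running as UID=1000 and GID=1000 inside a container falls outside this scheme, which is why the writes in the transcript above are denied.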
from specs.
hum, OK, now I see your problem.
@sauloal, @ypriverol; what do you think about this?
from specs.
@eburgueno we have an ongoing discussion about data inside the containers and why that data should not be modified there. Why do you need to modify the data inside the container? Have you tried other solutions?
from specs.
Hi @ypriverol, I'm not talking about modifying data inside the container. The output data will be created on a volume mapped to the running container when it is started (using docker run -v ...).
My problem is that the user inside your containers has UID=1000. In my environment we have hundreds of users, each one with their own UID. When two different users run your containers, the output files are created as if they belong to the same user (the one with UID=1000), not as the one that actually started the container. This makes it impossible to figure out who did what.
There is no solution to this problem yet as far as I'm aware.
from specs.
@eburgueno @prvst could we turn the biodocker user into a virtual user that all the users in your environment are members of? Could this be one of the possible solutions?
from specs.
There's already a docker group, which everyone running docker already belongs to....
from specs.
@sauloal Then the only thing @eburgueno would need to do on his side is add all his cluster users to the biodocker group?
from specs.
If they want to run docker they must belong to the docker group
from specs.
@sauloal can we define a more generic name now that we are not only Docker, probably something like a biocontainer USER? @bgruening how does this work in rkt?
from specs.
rkt does not have a server; it's a client-only concept. Moreover, for Docker there are other possible workarounds: https://github.com/ekorpela/cloud-vm-workshop/blob/master/materials/Docker/ProvisioningTrick.md
from specs.
@bgruening @sauloal which do you think is the better and more consistent way: providing a docker virtual group that all the users belong to, or what @abdulrahmanazab described in https://github.com/ekorpela/cloud-vm-workshop/blob/master/materials/Docker/ProvisioningTrick.md? I would like to standardize this across all of our docker-based containers.
from specs.
Ok, I guess the solution of using a wrapper script (as described in https://github.com/ekorpela/cloud-vm-workshop/blob/master/materials/Docker/ProvisioningTrick.md) may work. It's just slightly inconvenient having to create one for each container out there, or come up with a more generic wrapper. But at least this hides the complexity from the users, so that they don't need to run docker run -u... themselves with their own UID and GID (some might not even know what those are).
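A generic wrapper along these lines could look roughly like the sketch below. It is not the script from the linked page: the function name biodocker_run, the DRY_RUN switch and the /data working directory are assumptions made for illustration. It only relies on the standard docker run options --user, -v, -w and --rm.

```shell
#!/bin/sh
# Sketch of a generic wrapper (hypothetical name: biodocker_run).
# It runs the container as the calling user's UID:GID so that output files
# on the mounted volume keep the caller's ownership.
# With DRY_RUN=1 the docker command is printed instead of executed.
biodocker_run() {
    image=$1
    shift
    # Rebuild the argument list as the full docker command line.
    set -- docker run --rm \
        --user "$(id -u):$(id -g)" \
        -v "$(pwd):/data" -w /data \
        "$image" "$@"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Print the command that would be run for a blastx call.
DRY_RUN=1 biodocker_run biodckr/blast blastx -help
```

Note that a numeric UID with no matching entry in the container's /etc/passwd has no home directory or user name inside the container, so tools that depend on those may still need extra handling.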
The only other thing needed for the containers to work well in a shared environment is to ensure that the default umask is 0002. At the moment I believe you're using 0022:
$ docker run -ti --rm -v $(pwd):/data biodckr/blast bash
biodocker@29b4fce05367:/data$ umask
0022
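The practical difference between the two umask values can be checked locally, without a container. A minimal sketch, assuming a POSIX shell; the temporary directory and file names are hypothetical.

```shell
#!/bin/sh
# Sketch: compare the permissions of files created under umask 0022 and 0002.
dir=$(mktemp -d)

( umask 0022; touch "$dir/with_0022" )   # value reported by the container above
( umask 0002; touch "$dir/with_0002" )   # value suggested for shared directories

# Keep only the 10-character mode string of each file.
mode_0022=$(ls -l "$dir/with_0022" | cut -c1-10)
mode_0002=$(ls -l "$dir/with_0002" | cut -c1-10)

echo "umask 0022 -> $mode_0022"   # group can only read
echo "umask 0002 -> $mode_0002"   # group can also write
```

With 0022, files written by one user cannot be modified by other members of the group, even when the SGID bit has given the files the right group.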
from specs.
@eburgueno I think you are right, but this is also related to the topic of having a wrapper script for all containers, #27. Can we finally define this helper script, @sauloal?
from specs.
There is another way: place a helper script INSIDE the container, to be called before each run. This script would create a new user with the same ID as the current user, which takes milliseconds. The biggest problem would be that all programs would need to be installed in a default path (not home), and that conda would not be suitable for that.
A wrapper should not be a problem, but it would have to be cross-platform. Always fun to do.
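The in-container helper idea could be sketched as below. Everything here is an assumption for illustration: the names entrypoint, hostuser and hostgroup, the HOST_UID and HOST_GID environment variables (which would have to be passed with docker run -e), and the DRY_RUN switch. The sketch assumes the script starts as root and that groupadd, useradd and setpriv are available in the image; it does not handle IDs that already exist in the container.

```shell
#!/bin/sh
# Sketch of an in-container helper (hypothetical), meant to run as root.
# HOST_UID and HOST_GID are assumed to be passed in via `docker run -e ...`.
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

entrypoint() {
    uid=${HOST_UID:-1000}
    gid=${HOST_GID:-1000}
    # Create a group and a user matching the caller's numeric IDs.
    run groupadd --gid "$gid" hostgroup
    run useradd --uid "$uid" --gid "$gid" --no-create-home hostuser
    # Drop privileges and run the requested tool with its arguments.
    # (A real entrypoint would likely use `exec` here.)
    run setpriv --reuid "$uid" --regid "$gid" --init-groups "$@"
}

# Show what would be executed for the host user from the listing (550:10024).
DRY_RUN=1 HOST_UID=550 HOST_GID=10024 entrypoint blastx -help
```

Passing the tool and its arguments through "$@" avoids hard-coding a single binary, but the IDs still have to be supplied by whoever starts the container.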
from specs.
@sauloal we tried this approach and found two issues:
- Some tools don't have just one binary to call at script exit (blast here is a good example). Capturing both the command to run and any command line arguments gets tricky.
- The only way for the helper script to receive information about the user (and any common groups) to add is via the docker run arguments (for example, via environment variables). This relegates the responsibility to the user running the container, who may not know how, forget to, or do it wrong.
from specs.
@eburgueno @sauloal @bgruening I don't have much experience as a sysadmin, but the idea of having a virtual biodocker user, so that people can be part of that group and access the data, makes a lot of sense to me. This is something easy to control and easy to configure.
from specs.
This is already how docker works: everybody belongs to the 'docker' group. The problem is that the files are still saved as root.
The outside wrapper seems the best idea. I wonder if writing it in Go, which is multiplatform by nature, would be a possibility.
from specs.
The problem with the wrapper script and the other alternatives is that we should propose a clean and well-documented implementation. BioContainers should mainly provide the docker containers and advice on how to run them. @sauloal can we implement a general tool, as suggested by @abdulrahmanazab, that can take a docker-based BioContainers container and execute it in cluster environments? We could probably put these tools in a separate repository for BioContainers tools.
from specs.
IMHO we should not enforce any strategy for how to use our containers. There are many possible solutions from the outside, like https://github.com/indigo-dc/udocker, the way HTCondor handles containers, socker and what not. We should concentrate on the containers and their content, not on how people should use them. This will change so fast in the near future that we would only restrict ourselves too much with it.
from specs.