prvst commented on June 17, 2024

Hi @eburgueno, I'm glad you found our project interesting.

I'm not sure I follow what you are asking. The user inside the container is always 'biodocker' and, as far as I know, that doesn't interfere with the external users, which means you can share files between your external users and the container user with no problems.

I have used some of the containers on a shared server with multiple users and I haven't seen any problems so far.

I hope that answers your question; please feel free to elaborate a little more if you find it necessary.

from specs.

eburgueno commented on June 17, 2024

Hi @prvst, this is what I mean:

[user@host1 dummy]$ pwd
/workspace/user/dummy
[user@host1 dummy]$ ll -nd
drwxrwsr-x. 3 550 10024 389 Jun 15 16:48 .
[user@host1 dummy]$ ll -n
total 188809
-rw-rw-r--. 1 550 10024 73119119 Jan 14  2014 accepted_hits.bam
-rw-rw-r--. 1 550 10024 73119119 Jan 14  2014 accepted_hits.sorted.bam
-rw-rw-r--. 1 550 10024   101552 Jan 14  2014 accepted_hits.sorted.bam.bai
-rw-rw-r--. 1 550 10024    25188 Jan 14  2014 deletions.bed
-rw-rw-r--. 1 550 10024    31378 Jan 14  2014 insertions.bed
-rw-rw-r--. 1 550 10024   546489 Jan 14  2014 junctions.bed
-rw-rw-r--. 1 550 10024       66 Jan 14  2014 left_kept_reads.info
drwxrwxr-x. 2 550 10024      831 Jan 14  2014 logs
-rw-rw-r--. 1 550 10024       66 Jan 14  2014 right_kept_reads.info
-rw-rw-r--. 1 550 10024  6764172 Jan 14  2014 unmapped_left.fq.z
-rw-rw-r--. 1 550 10024  6185737 Jan 14  2014 unmapped_right.fq.z
[user@host1 dummy]$ docker search biodckr/blast
NAME            DESCRIPTION   STARS     OFFICIAL   AUTOMATED
biodckr/blast   NCBI BLAST+   2                    [OK]
[user@host1 dummy]$ docker run -ti --rm -v $(pwd):/data biodckr/blast bash
Unable to find image 'biodckr/blast:latest' locally
latest: Pulling from biodckr/blast
8387d9ff0016: Pull complete 
3b52deaaf0ed: Pull complete 
4bd501fad6de: Pull complete 
a3ed95caeb02: Pull complete 
b8dd2122b2bc: Pull complete 
9c8ffe978896: Pull complete 
d9ec616dfddb: Pull complete 
88ea212f7810: Pull complete 
ff0c7281174b: Pull complete 
475239cf827c: Pull complete 
Digest: sha256:37076ba5533524ba05703950160ca6becc82526225b9f5cf2974c1b0f5d8ccef
Status: Downloaded newer image for biodckr/blast:latest
biodocker@502394e2c896:/data$ id
uid=1000(biodocker) gid=1000(biodocker) groups=1000(biodocker),27(sudo),105(fuse)
biodocker@502394e2c896:/data$ cd logs
biodocker@502394e2c896:/data/logs$ ls
bowtie.left_kept_reads.fixmap.log       bowtie.right_kept_reads.fixmap.log       bowtie_build.log           long_spanning_reads.segs.log  run.log
bowtie.left_kept_reads_seg1.fixmap.log  bowtie.right_kept_reads_seg1.fixmap.log  bowtie_inspect_recons.log  prep_reads.log                sam_merge.log
bowtie.left_kept_reads_seg2.fixmap.log  bowtie.right_kept_reads_seg2.fixmap.log  juncs_db.log               reports.log                   segment_juncs.log
bowtie.left_kept_reads_seg3.fixmap.log  bowtie.right_kept_reads_seg3.fixmap.log  long_spanning_reads.log    reports.samtools_sort.log
biodocker@502394e2c896:/data/logs$ rm *
rm: remove write-protected regular file 'bowtie.left_kept_reads.fixmap.log'? y
rm: cannot remove 'bowtie.left_kept_reads.fixmap.log': Permission denied
rm: remove write-protected regular file 'bowtie.left_kept_reads_seg1.fixmap.log'? y
rm: cannot remove 'bowtie.left_kept_reads_seg1.fixmap.log': Permission denied
rm: remove write-protected regular file 'bowtie.left_kept_reads_seg2.fixmap.log'? y
rm: cannot remove 'bowtie.left_kept_reads_seg2.fixmap.log': Permission denied
rm: remove write-protected regular file 'bowtie.left_kept_reads_seg3.fixmap.log'? ^C
biodocker@502394e2c896:/data/logs$ cd ..
biodocker@502394e2c896:/data$ blast
blast_formatter    blastdb_aliastool  blastdbcheck       blastdbcmd         blastdbcp          blastn             blastp             blastx             
biodocker@502394e2c896:/data$ blastx --help > output
bash: output: Permission denied

In a shared environment, multiple users write to the same paths. We preserve the user ownership information because we want to know who did what, and we enable collaboration by setting the SGID bit on the top-level directory (this enforces group ownership inheritance on new files). UPG (user private group) conventions take care of file permissions (write access for the group). I believe this is pretty much common practice in an HPC environment.

My apologies, as my example above doesn't really run blast (I'm no bioinformatician, just a sysadmin), but hopefully the idea comes across: the user running the process inside the container won't be able to write to a directory on the container host, as it obviously doesn't belong to the 10024 group.

Even if you were able to write into a directory, the new file generated inside the container would be owned by UID=1000. In our particular case UID=1000 has already been assigned to someone. Changing this person's UID to another one is not exactly ideal. But even if you were to do that, you would still have the problem that you can't know who created the file (i.e. was it user1 or user2 who started the container?).

Hope this makes sense.
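
As a rough illustration of why the writes fail, the classic owner/group/other check can be sketched as a small shell function. The function name `can_write` is hypothetical, and the sketch deliberately ignores supplementary groups, ACLs and the execute bit; the numbers are the ones from the listing above (directory owned by 550:10024 with mode 2775, container process running as 1000:1000).

```shell
# Decide whether a process (uid, gid) passes the write-permission check on a
# directory. Simplified: supplementary groups and ACLs are not considered.
# Usage: can_write PROC_UID PROC_GID DIR_UID DIR_GID MODE   (MODE in octal, e.g. 2775)
can_write() {
    proc_uid=$1 proc_gid=$2 dir_uid=$3 dir_gid=$4
    mode=$((0$5))                     # the leading 0 makes the shell read MODE as octal
    if [ "$proc_uid" -eq 0 ]; then
        echo yes                      # root bypasses the mode bits
    elif [ "$proc_uid" -eq "$dir_uid" ]; then
        [ $((mode & 0200)) -ne 0 ] && echo yes || echo no   # owner write bit
    elif [ "$proc_gid" -eq "$dir_gid" ]; then
        [ $((mode & 0020)) -ne 0 ] && echo yes || echo no   # group write bit
    else
        [ $((mode & 0002)) -ne 0 ] && echo yes || echo no   # other write bit
    fi
}

can_write 1000 1000 550 10024 2775    # biodocker inside the container
can_write 550 10024 550 10024 2775    # the directory owner on the host
```

The first call prints `no` because 1000:1000 only matches the "other" bits (r-x), while the second prints `yes`.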

prvst commented on June 17, 2024

Hmm, OK, now I see your problem.
@sauloal, @ypriverol, what do you think about this?

ypriverol commented on June 17, 2024

@eburgueno we have an ongoing discussion about data inside the containers and why we should not modify this data in the containers. Why do you need to modify the data inside the container? Have you tried other solutions?

eburgueno commented on June 17, 2024

Hi @ypriverol, I'm not talking about modifying data inside the container. The output data will be created on a volume mapped to the running container when it is started (using docker run -v ...).

My problem is that the user inside your containers has UID=1000. In my environment we have hundreds of users, each one with their own UID. When two different users run your containers, the output files are created as if they belong to the same user (the one with UID=1000), not as the one that actually started the container. This makes it impossible to figure out who did what.

There is no solution to this problem yet as far as I'm aware.

ypriverol commented on June 17, 2024

@eburgueno @prvst can we convert the biodocker user into a virtual user, with all the users within your environment being members of it? Could this be one of the possible solutions?

sauloal commented on June 17, 2024

There's already a docker group, which everyone running Docker already belongs to.

ypriverol commented on June 17, 2024

@sauloal Then, the only thing that @eburgueno should do on their side is to add all their cluster users to the biodocker group?

sauloal commented on June 17, 2024

If they want to run Docker, they must belong to the docker group.

ypriverol commented on June 17, 2024

@sauloal can we define a more generic name now that we are not only Docker, probably something like a biocontainer user? @bgruening, how does this work in rkt?

bgruening commented on June 17, 2024

rkt does not have a server; it's a client-only concept. Moreover, for Docker there are also other possible workarounds: https://github.com/ekorpela/cloud-vm-workshop/blob/master/materials/Docker/ProvisioningTrick.md

ypriverol commented on June 17, 2024

@bgruening @sauloal which do you think is the better and more consistent way: providing a Docker virtual group that all the users should belong to, or what @abdulrahmanazab describes in https://github.com/ekorpela/cloud-vm-workshop/blob/master/materials/Docker/ProvisioningTrick.md? I would like to standardize this in all of our Docker-based containers.

eburgueno commented on June 17, 2024

OK, I guess that the solution of using a wrapper script (as described in https://github.com/ekorpela/cloud-vm-workshop/blob/master/materials/Docker/ProvisioningTrick.md) may work. It's just slightly inconvenient to have to create one for each container out there, or to come up with a more generic wrapper. But at least this hides the complexity from the users, so that they don't need to run docker run -u ... themselves with their own UID and GID (some might not even know what those are).
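
A generic wrapper of that kind could look roughly like the sketch below. It is not the script from the linked page: the function name `biodocker_run`, the `DRY_RUN` switch and the choice of mounting the current directory at /data are assumptions made for illustration. It only relies on the standard `docker run` options `--rm`, `-u`, `-v` and `-w`.

```shell
# Hypothetical generic wrapper: run IMAGE as the calling user, with the
# current directory mounted at /data and used as the working directory.
# Usage: biodocker_run IMAGE [COMMAND [ARG...]]
biodocker_run() {
    image=$1
    shift
    # id -u / id -g give the caller's numeric UID and primary GID, so files
    # written to /data keep the caller's ownership on the host.
    set -- run --rm -u "$(id -u):$(id -g)" -v "$(pwd):/data" -w /data "$image" "$@"
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "docker $*"     # show the command instead of running it
    else
        docker "$@"
    fi
}

# Print the command that would be executed for a blastn call.
DRY_RUN=1 biodocker_run biodckr/blast blastn -version
```

Only the primary group is passed here; users who depend on a supplementary group for write access would need something extra, such as `--group-add`. The numeric UID also usually has no passwd entry inside the container, so tools that look up the user name or home directory may need extra care.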

The only other thing needed for the containers to work well in a shared environment is to ensure that the default umask is 0002. At the moment I believe you're using 0022:

$ docker run -ti --rm -v $(pwd):/data biodckr/blast bash
biodocker@29b4fce05367:/data$ umask
0022
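
To make the difference concrete, here is a small sketch of the mode a newly created regular file ends up with under each umask, assuming the program asks for the usual 0666. The helper name `file_mode_under` is made up for this example.

```shell
# Mode of a newly created regular file under a given umask:
# the requested 0666 with the umask bits cleared.
# Usage: file_mode_under UMASK   (octal, e.g. 0022)
file_mode_under() {
    printf '%03o\n' $((0666 & ~0$1))
}

file_mode_under 0022    # current container default
file_mode_under 0002    # value wanted for shared directories
```

This prints `644` and then `664`, so only the second umask leaves new files writable by the group. If rebuilding the images is not an option, one possible workaround is to start the tool through `sh -c 'umask 0002; exec "$@"' sh TOOL ARGS`, assuming the image ships a POSIX shell.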

ypriverol commented on June 17, 2024

@eburgueno I think you are right, but this is also related to the topic of having a wrapper script for all containers (#27). Can we finally define this helper script, @sauloal?

sauloal commented on June 17, 2024

There is another way: place a helper script INSIDE the container, to be called before each run. This script would create a new user with the same ID as the current user, which takes milliseconds. The biggest problem would be that all programs would need to be installed in a default path (not home), and conda would not be suitable for that.

A wrapper should not be a problem, but it would have to be cross-platform. Always fun to do.
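
A minimal sketch of such an in-container helper, under several assumptions: the container starts as root, the host IDs arrive through the hypothetical environment variables `HOST_UID` and `HOST_GID` (for example via `docker run -e`), and the image provides `groupadd`, `useradd` and `setpriv`. The names `step`, `run_as_host_user`, `hostuser` and `hostgroup` are invented for this example.

```shell
# Run a command, or just print it when DRY_RUN is set.
step() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical entrypoint helper: create a user matching the host IDs,
# then run the requested tool as that user.
# Usage: run_as_host_user COMMAND [ARG...]
run_as_host_user() {
    uid=${HOST_UID:?HOST_UID is not set}
    gid=${HOST_GID:?HOST_GID is not set}
    # -o tolerates IDs that are already taken inside the image (e.g. 1000).
    step groupadd -o -g "$gid" hostgroup
    step useradd -o -M -u "$uid" -g "$gid" hostuser
    # Drop root and hand over to the tool with its arguments untouched.
    step setpriv --reuid="$uid" --regid="$gid" --init-groups "$@"
}

# Show the steps for the IDs used in the listing earlier in this thread.
HOST_UID=550 HOST_GID=10024 DRY_RUN=1 run_as_host_user blastn -version
```

Passing the command through as `"$@"` avoids having to know each tool's binary name in advance, but the caller still has to supply the two IDs correctly.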

eburgueno commented on June 17, 2024

@sauloal we tried this approach and found two issues:

  • Some tools don't have just one binary to call at script exit (blast here is a good example). Capturing both the command to run and any command-line arguments gets tricky.
  • The only way for the helper script to receive information about the user (and any common groups) to add is via the docker run arguments (for example, via environment variables). This shifts the responsibility to the user running the container, who may not know how, forget to, or do it wrong.

ypriverol commented on June 17, 2024

@eburgueno @sauloal @bgruening I don't have much experience as a sysadmin, but the idea of having a virtual user biodocker, so that people can basically be part of that group and access the data, makes a lot of sense to me. This is something easy to control and easy to configure.

sauloal commented on June 17, 2024

This is already how Docker works: everybody belongs to the 'docker' group. The problem is that the files are still saved as root.

The outside wrapper seems the best idea. I wonder if writing it in Go, which is multiplatform by nature, would be a possibility.

ypriverol commented on June 17, 2024

The problem with the wrapper script and the other alternatives is that we should propose a clean and well-documented implementation. BioContainers should mainly provide the Docker containers and advice on how to run them. @sauloal can we implement a general tool, as suggested by @abdulrahmanazab, that can take a Docker-based BioContainers container and execute it in cluster environments? We could probably put such tools into a separate repository for BioContainers tools.

bgruening commented on June 17, 2024

IMHO we should not enforce any strategy for how to use our containers. There are many possible solutions from the outside, like https://github.com/indigo-dc/udocker, the way HTCondor handles containers, socker and whatnot. We should concentrate on the containers and their content, not on how people should use them. This will change so fast in the near future that we would only restrict ourselves too much with it.
