containers's People

Contributors

bruhwiler, e-carlin, elventear, nselem, robnagler, rorour

Forkers

elventear

containers's Issues

create SRW-light repo to avoid tarball copy

The standard SRW repo is over 600MB, and GitHub sometimes times out during clones. We created a tarball snapshot to avoid this issue, but now that @mrakitin is actively developing SRW, we need a solution that pulls from a repo. We'll create a trimmed-down repo, as is done with the create_srw_tar_gz function in beamsim/codes/srw.sh.

Ensure backwards compatibility with docker

@7d9bd292 introduced an incompatibility with docker before 1.10: those versions require -f with docker tag to overwrite an existing tag. I've got a fix for this pending; this issue is just a placeholder.
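
A minimal sketch of one way to stay compatible, assuming the Docker version string is obtained elsewhere and that -f should only be passed to versions before 1.10. The function name docker_tag_command is hypothetical and is not the pending fix mentioned above.

```python
def docker_tag_command(docker_version, source, target):
    """Build the argv for `docker tag`, adding -f only for Docker before 1.10.

    docker_version is a dotted string such as "1.7.1" (assumed to be
    supplied by the caller).
    """
    major_minor = tuple(int(part) for part in docker_version.split(".")[:2])
    argv = ["docker", "tag"]
    if major_minor < (1, 10):
        # Older versions refuse to overwrite an existing tag without -f.
        argv.append("-f")
    argv += [source, target]
    return argv
```

Comparing (major, minor) as a tuple of integers avoids the string-comparison trap where "1.9" would sort after "1.10".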

proxy image builds

To mitigate network failures when building images, we can put a persistent proxy-cache in front of the build container. Need to figure this out, because this week has seen a lot of network failures.

@elventear have you seen this done before?

vagrant build needs to unset DISPLAY

  Downloading six-1.10.0-py2.py3-none-any.whl
Installing collected packages: six, python-dateutil, pytz, cycler, pyparsing, matplotlib
  Running setup.py install for matplotlib: started
X11 connection rejected because of wrong authentication.
X11 connection rejected because of wrong authentication.
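
A hedged sketch of the idea in the title: run the build step with DISPLAY removed from the child environment so nothing in the install tries to reach an X server. The helper names are hypothetical, and how the vagrant build actually invokes its steps is not shown in this issue.

```python
import os
import subprocess


def env_without_display(environ=None):
    """Return a copy of the environment with DISPLAY removed."""
    env = dict(os.environ if environ is None else environ)
    env.pop("DISPLAY", None)
    return env


def run_headless(argv):
    """Run a command without DISPLAY set; raises if the command fails."""
    return subprocess.run(argv, env=env_without_display(), check=True)
```

For example, run_headless(["pip", "install", "matplotlib"]) would run the install with no DISPLAY visible to setup.py.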

jupyter image

Build a Jupyter image from beamsim:

  • environment for synergia must be right
  • constrain container to cpuset, memory, disk to avoid resource issues
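
As a rough sketch of the second bullet, the function below builds a docker run command line using the --cpuset-cpus and --memory options. The function name and the example values are assumptions. A disk limit is left out because it depends on the storage driver in use.

```python
def limited_run_argv(image, cpuset_cpus, memory, command=()):
    """Build a `docker run` argv that pins CPUs and caps memory.

    cpuset_cpus is a CPU list such as "0-1"; memory is a size such as "2g".
    """
    argv = [
        "docker", "run",
        "--cpuset-cpus", cpuset_cpus,
        "--memory", memory,
        image,
    ]
    argv.extend(command)
    return argv
```

The result can be passed to subprocess.check_call once the limits for a given host have been chosen.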

VirtualBox fails due to bios settings on PC laptop

Hardware: HP Spectre 360 laptop
OS: Windows 10 Pro
Processor: Intel Core i7-6560U CPU @ 2.20 GHz (dual)
System type: 64-bit OS, x64-based processor

The Vagrantfile contains the following:
Vagrant.configure(2) do |config|
  config.vm.box = "radiasoft/develop"
  config.vm.hostname = "rsdev"
  config.ssh.forward_x11 = true
end

The following error is seen:
$ vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Checking if box 'radiasoft/radtrack' is up to date...
==> default: Clearing any previously set forwarded ports...
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
default: Adapter 1: nat
==> default: Forwarding ports...
default: 22 (guest) => 2222 (host) (adapter 1)
==> default: Booting VM...
There was an error while executing VBoxManage, a CLI used by Vagrant
for controlling VirtualBox. The command and stderr is shown below.

Command: ["startvm", "9cb3b6e8-c8ab-4838-89e3-defa4ac7a6f7", "--type", "headless"]

Stderr: VBoxManage.exe: error: VT-x is disabled in the BIOS for all CPU modes (VERR_VMX_MSR_ALL_VMX_DISABLED)
VBoxManage.exe: error: Details: code E_FAIL (0x80004005), component ConsoleWrap, interface IConsole

Automate vagrant install

I just had a difficult install on a Windows box that was running Ubuntu in VirtualBox with too small a virtual disk. If we are to make this process seamless, we are going to need to make a one-click installer that creates a small VM with a large enough disk to run Docker. We also want a hands-free install, that is, once the VM is running, it will stay running across reboots.

I also tried to install on a 2010 MacBook Pro with 4GB. At this point the installer should warn the user that it will be unacceptably slow. There's only so much we can do.

Just noting here that if we decide to automate the installer, we have some work to do.

JupyterHub Security

JupyterHub runs as root. It needs to run as an unprivileged user. It also needs to be protected from spying on the local network. One solution is to run JH in the cloud on an untrusted network.

If JH is to run on multiple machines, it will need access to docker on those machines. The docker port will have to be on the internet, and have to be locked down.

Need to work through the security issues and list them here.

Jupyter sqlite locking issue

Reported by @cchall: I've been getting some strange behavior from the Jupyter server. When I try to start a new notebook or restart a running notebook it just hangs for a bit then tells me the kernel has died.

Trying to start an ipython instance from the command line gives:

[2.7.10; jupyter]$ ipython                                                                                                         
Traceback (most recent call last):                                                                                                 
  File "/home/vagrant/.pyenv/versions/2.7.10/bin/ipython", line 11, in <module>                                                    
    sys.exit(start_ipython())                                                                                                      
  File "/home/vagrant/.pyenv/versions/2.7.10/lib/python2.7/site-packages/IPython/__init__.py", line 119, in start_ipython          
    return launch_new_instance(argv=argv, **kwargs)                                                                                
  File "/home/vagrant/.pyenv/versions/2.7.10/lib/python2.7/site-packages/traitlets/config/application.py", line 588, in launch_instance
    app.initialize(argv)                                                                                                           
  File "<decorator-gen-111>", line 2, in initialize                                                                                
  File "/home/vagrant/.pyenv/versions/2.7.10/lib/python2.7/site-packages/traitlets/config/application.py", line 74, in catch_config_error
    return method(app, *args, **kwargs)                                                                                            
  File "/home/vagrant/.pyenv/versions/2.7.10/lib/python2.7/site-packages/IPython/terminal/ipapp.py", line 314, in initialize       
    self.init_shell()                                                                                                              
  File "/home/vagrant/.pyenv/versions/2.7.10/lib/python2.7/site-packages/IPython/terminal/ipapp.py", line 330, in init_shell       
    ipython_dir=self.ipython_dir, user_ns=self.user_ns)                                                                            
  File "/home/vagrant/.pyenv/versions/2.7.10/lib/python2.7/site-packages/traitlets/config/configurable.py", line 404, in instance  
    inst = cls(*args, **kwargs)                                                                                                    
  File "/home/vagrant/.pyenv/versions/2.7.10/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 513, in __init__  
    self.init_history()                                                                                                            
  File "/home/vagrant/.pyenv/versions/2.7.10/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 1636, in init_history
    self.history_manager = HistoryManager(shell=self, parent=self)                                                                 
  File "/home/vagrant/.pyenv/versions/2.7.10/lib/python2.7/site-packages/IPython/core/history.py", line 520, in __init__           
    self.new_session()                                                                                                             
  File "<decorator-gen-21>", line 2, in new_session                                                                                
  File "/home/vagrant/.pyenv/versions/2.7.10/lib/python2.7/site-packages/IPython/core/history.py", line 68, in needs_sqlite        
    return f(self, *a, **kw)                                                                                                       
  File "/home/vagrant/.pyenv/versions/2.7.10/lib/python2.7/site-packages/IPython/core/history.py", line 538, in new_session        
    NULL, "") """, (datetime.datetime.now(),))                                                                                     
OperationalError: database is locked                                                                                               

If you suspect this is an IPython bug, please report it at:                                                                        
    https://github.com/ipython/ipython/issues                                                                                      
or send an email to the mailing list at [email protected]                                                                      

You can print a more detailed traceback right now with "%tb", or use "%debug"                                                      
to interactively debug it.                                                                                                         

Extra-detailed tracebacks for bug-reporting purposes can be enabled via:                                                           
    c.Application.verbose_crash=True                                                                                               
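
The traceback ends in SQLite's generic busy error, raised while IPython opens a new history session. As an illustration only, not a diagnosis of this report, the sketch below reproduces the same message by letting one connection hold an exclusive lock while a second connection tries to write. The table name and the function name are made up for the example.

```python
import os
import sqlite3
import tempfile


def second_writer_error(db_path):
    """Return the error text seen by a writer while another holds the lock."""
    # isolation_level=None gives manual transaction control on this connection.
    holder = sqlite3.connect(db_path, isolation_level=None)
    holder.execute("CREATE TABLE IF NOT EXISTS sessions (id INTEGER PRIMARY KEY)")
    holder.execute("BEGIN EXCLUSIVE")  # take and keep the write lock
    # Short timeout so the second writer gives up quickly.
    other = sqlite3.connect(db_path, timeout=0.1)
    try:
        other.execute("INSERT INTO sessions DEFAULT VALUES")
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        other.close()
        holder.execute("ROLLBACK")
        holder.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "history.sqlite")
    print(second_writer_error(path))
```

If something similar is happening here, a process that still has the history database open, or a stale lock on a shared or synced folder, would be the first things to check.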

Update to `warp.init_tools`

Depends on fix to radiasoft/sirepo#472. Remi emailed that the warp init tools are now part of warp. We need to switch to a specific commit. We should add the uninstall (see below) to support existing VM updates.

Remi writes:

In order to use the new initialization tools from Warp:

Update your version of Warp (using git pull and make install). In your input scripts, replace the line

from warp_init_tools import *

by

from warp.init_tools import *

(Notice the . instead of _.)
It is also recommended that you uninstall the old package warp_init_tools:

pip uninstall warp_init_tools
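
For existing input scripts, the replacement Remi describes could be applied mechanically. This is a small sketch under the assumption that the old import always appears as a from-import at the start of a line; update_warp_imports is a hypothetical helper name.

```python
import re

# Matches "from warp_init_tools import" at the start of a line,
# allowing leading spaces or tabs.
_OLD_IMPORT = re.compile(r"^([ \t]*from[ \t]+)warp_init_tools([ \t]+import\b)", re.MULTILINE)


def update_warp_imports(script_text):
    """Rewrite old warp_init_tools imports to the new warp.init_tools module."""
    return _OLD_IMPORT.sub(r"\1warp.init_tools\2", script_text)
```

Lines that merely mention warp_init_tools in a comment or string are left untouched.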

locales screwed up on CentOS docker

Building radiasoft/python2 on CentOS 6 docker (version 1.7.1, build 786b29d/1.7.1) causes the locales not to get saved. Running the same build on a Fedora 21 VM works fine. The locales are there at the end of the build, but they disappear after the docker step, i.e. putting this in the Dockerfile after the build-run.sh:

RUN localedef --list

Results in no locales listed and running the resultant image does not work either.

It's not clear why this is happening at all, and it took quite some time to see that this works on Fedora.

beamsim-jupyter: include emacs in the container

Please include emacs in the container.

In order to debug and develop IPy Notebooks on the server, it's necessary to

  • edit files
  • pip install
  • test
  • push to GitHub

For example, I find I have to do this with rssynergia in order to get the notebooks working.
This is too painful with vi.

I don't need the GUI or X-version; command-line only version is fine, if that's helpful in terms of saving space.

Imported image does not work

I performed building of the image using the following set of commands:

cd ~/src/radiasoft
gcl containers
cd containers
docker pull radiasoft/beamsim:alpha
bin/build docker radiasoft/sirepo
docker save -o sirepo.tar radiasoft/sirepo:20160413.155446

Then I transferred the .tar file to cpu-001 and imported it:

root@cpu-001:/var/lib/sirepo# docker import sirepo_20160413.204336.tar radiasoft/sirepo:alpha
sha256:66c0f2e7df01fe60f9bf83bf582ec95a57a21e01014c697bbb35a040df7ca658
root@cpu-001:/var/lib/sirepo#

Here is the information regarding containers and images:

root@cpu-001:/var/lib/sirepo# docker ps -a
CONTAINER ID        IMAGE               COMMAND                   CREATED             STATUS                         PORTS               NAMES
f0935a14ae1c        42e5585f46bb        "bash -c '/radia-run "    43 minutes ago      Exited (137) 2 minutes ago                         sirepo
0d039691aaf3        42e5585f46bb        "bash -c '/radia-run "    43 minutes ago      Exited (137) 2 minutes ago                         celery-sirepo
9e558e522178        42e5585f46bb        "bash -c '/radia-run "    43 minutes ago      Exited (137) 3 minutes ago                         rabbitmq
92948c87dd28        28564a53830f        "/bin/sh -c \"/conf/bu"   About an hour ago   Exited (1) About an hour ago                       loving_raman
root@cpu-001:/var/lib/sirepo# docker images
REPOSITORY            TAG                 IMAGE ID            CREATED             SIZE
radiasoft/sirepo      alpha               66c0f2e7df01        59 seconds ago      1.956 GB
<none>                <none>              28564a53830f        About an hour ago   1.682 GB
<none>                <none>              42e5585f46bb        23 hours ago        1.898 GB
<none>                <none>              1f478560322d        24 hours ago        1.898 GB
<none>                <none>              bcbbdabf3222        25 hours ago        1.898 GB
<none>                <none>              127043ad605e        12 days ago         1.898 GB
radiasoft/beamsim     alpha               d1badf4a7ff1        2 weeks ago         1.682 GB
radiasoft/beamsim     latest              d1badf4a7ff1        2 weeks ago         1.682 GB
s10                   latest              21aa964e2a93        3 weeks ago         1.923 GB
<none>                <none>              080425aab593        4 weeks ago         1.914 GB
radiasoft/sirepo      latest              65a8b38e724b        5 weeks ago         1.906 GB
agileek/cpuset-test   latest              4e8863106dde        6 weeks ago         1.455 MB
hello-world           latest              690ed74de00f        6 months ago        960 B

Here is the output when I start the services:

root@cpu-001:/var/lib/sirepo# for f in rabbitmq celery-sirepo sirepo; do systemctl status $f; done
● rabbitmq.service - LSB: AMQP service provided by RabbitMQ
   Loaded: loaded (/etc/init.d/rabbitmq)
   Active: inactive (dead) since Wed 2016-04-13 17:38:44 EDT; 3min 16s ago
  Process: 4623 ExecStop=/etc/init.d/rabbitmq stop (code=exited, status=0/SUCCESS)
  Process: 73469 ExecStart=/etc/init.d/rabbitmq start (code=exited, status=0/SUCCESS)

Apr 13 16:58:00 cpu-001 rabbitmq[73469]: Starting rabbitmq: rabbitmq failed!
Apr 13 17:38:44 cpu-001 rabbitmq[4623]: Stopping rabbitmq:  failed!
● celery-sirepo.service - LSB: Celery
   Loaded: loaded (/etc/init.d/celery-sirepo)
   Active: inactive (dead) since Wed 2016-04-13 17:38:55 EDT; 3min 5s ago
  Process: 4722 ExecStop=/etc/init.d/celery-sirepo stop (code=exited, status=0/SUCCESS)
  Process: 1604 ExecStart=/etc/init.d/celery-sirepo start (code=exited, status=0/SUCCESS)

Apr 13 16:58:16 cpu-001 celery-sirepo[1604]: Starting celery-sirepo: celery-sirepo failed!
Apr 13 17:38:55 cpu-001 celery-sirepo[4722]: Stopping celery-sirepo:  failed!
● sirepo.service - LSB: Enable sirepo
   Loaded: loaded (/etc/init.d/sirepo)
   Active: inactive (dead) since Wed 2016-04-13 17:39:06 EDT; 2min 54s ago
  Process: 5150 ExecStop=/etc/init.d/sirepo stop (code=exited, status=0/SUCCESS)
  Process: 2485 ExecStart=/etc/init.d/sirepo start (code=exited, status=0/SUCCESS)

Apr 13 16:58:33 cpu-001 sirepo[2485]: Starting sirepo: sirepo failed!
Apr 13 17:39:06 cpu-001 sirepo[5150]: Stopping sirepo:  failed!

Did I do anything wrong?

update Synergia in containers

Synergia now includes a special-purpose 'linear' space charge algorithm, which is important for IOTA simulations.

We need the latest Synergia release to be available by default in our containers, in particular on JupyterHub.

Use Docker within Vagrant

I think we should deploy Docker inside Vagrant for "end-user" containers. We can then provide a base Vagrant box that doesn't need updating. This box would have docker and a runner. The runner would know how to pull the latest images, cleaning dangling images.
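
A rough sketch of what such a runner might do, assuming it shells out to the docker CLI: pull the image, list dangling image IDs with the dangling=true filter, and remove them. The function names are hypothetical, and the run parameter exists only so the command sequence can be exercised without Docker.

```python
import subprocess


def pull_command(image):
    return ["docker", "pull", image]


def list_dangling_command():
    # --quiet prints only image IDs, one per line.
    return ["docker", "images", "--quiet", "--filter", "dangling=true"]


def remove_command(image_ids):
    """Return the `docker rmi` argv, or None when there is nothing to remove."""
    if not image_ids:
        return None
    return ["docker", "rmi"] + list(image_ids)


def refresh(image, run=subprocess.check_output):
    """Pull the latest image, then remove dangling images. Returns removed IDs."""
    run(pull_command(image))
    ids = run(list_dangling_command()).decode().split()
    cmd = remove_command(ids)
    if cmd is not None:
        run(cmd)
    return ids
```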

The reason for this change is that maintaining Vagrant images is extra work, and it introduces a reliability issue, because the Vagrant version may differ from the Docker version.

We will still maintain the ability to build Vagrant so that we can set up development machines. However, those setups would best be solved with a curl installer like we do now for codes. Developers are different from end-users.

Feedback?

rpmdb lock error in build

We get a lock error every now and then when a build happens. This may be related to CentOS 6 so perhaps we have to move the build master to Fedora 21+.

^[[0m^[[91mDownload: https://depot.radiasoft.org/foss/SDDSToolKit-3.3.1-1.fedora.21.x86_64.rpm
^[[0mBDB2053 Freeing read locks for locker 0x106: 1137/139993741190912
[...]
BDB2053 Freeing read locks for locker 0x105: 1137/139993741190912
^[[91merror: db5 error(-30986) from dbcursor->c_get: BDB0075 DB_PAGE_NOTFOUND: Requested page not found
^[[0m^[[91merror: db5 error(-30986) from dbcursor->c_get: BDB0075 DB_PAGE_NOTFOUND: Requested page not found
^[[0m^[[91mError: Rpmdb checksum is invalid: pkg checksums: fontpackages-filesystem-0:1.44-10.fc21.noarch
^[[0mThe command '/bin/sh -c "/conf/build-run.sh"' returned a non-zero code: 1
Build: ERROR TRAP

jupyterhub image testing

In order to create a JupyterHub image which satisfies our needs, we need to figure out how to do the following:

  • Configure basic JupyterHub and work around bugs (e.g. jupyterhub/jupyterhub#181)
  • Install JupyterHub on apa20
  • Proxy JupyterHub through apa11
  • GitHub OAuth
  • Db persistence (sqlite or postgres?)
  • DockerSpawner locally
  • DockerSpawner remotely
  • Mount shared volume for homes
  • Creating users dynamically

There are a number of issues with the way JupyterHub works, including a fixed user whitelist. We'll need to dynamically append to the whitelist and add the users to JupyterHub.

Probably the biggest issue is that there doesn't appear to be automatic garbage collection on Jupyter servers. There is no "persist container" with the DockerSpawner.

rabbitmq needs to start with /radia-run

We can't control the vagrant uid/gid so we have to use /radia-run to start rabbitmq inside docker so it sets the uid/gid for vagrant in the container to match the outside.

jupyter needs to clear sys.argv

When importing warp from within jupyter, make sure sys.argv is clear so it doesn't produce this.

from warp import *
usage: __main__.py [-h] [-p DECOMP DECOMP DECOMP] [-l LOCALFLAGS]
                   [--pnumb PNUMB]
__main__.py: error: unrecognized arguments: -f /home/vagrant/.local/share/jupyter/runtime/kernel-62e3bd20-edd5-47a6-b640-505c1451d747.json
An exception has occurred, use %tb to see the full traceback.

SystemExit: 2


To exit: use 'exit', 'quit', or Ctrl-D.
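
One possible shape for the fix, sketched under the assumption that warp parses sys.argv at import time (which the usage error above suggests): temporarily trim sys.argv to the program name while the import runs. The context manager name cleared_argv is hypothetical.

```python
import contextlib
import sys


@contextlib.contextmanager
def cleared_argv():
    """Temporarily hide command-line arguments from code that reads sys.argv."""
    saved = sys.argv[:]
    sys.argv[:] = saved[:1]  # keep only the program name
    try:
        yield
    finally:
        sys.argv[:] = saved  # restore the kernel's original arguments


# Intended use in a notebook cell (requires warp to be installed):
#
#     with cleared_argv():
#         from warp import *
```

Restoring the original list afterwards keeps the kernel's own -f argument available to anything else that expects it.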

test codes in build

We need to run tests for all codes during the build. Some codes don't have tests, so we need to add them. We want end-to-end tests, so we probably need to create all tests, including using "code runner" (#1).

Easier container builds

@mrakitin, do you think you'll have to change SRW frequently? I'm concerned we'll be rebuilding radiasoft/beamsim too frequently. It takes hours now, because it contains all codes.

One alternative is to create RPMs for the codes. However, that defeats the purpose of containers.

Any thoughts?

clean old files

remove *-conf and libexec
fix bundler in radiasoft-installer

automate docker run for codes

Forwarded conversation

Subject: Running codes in docker

From: Rob Nagler [email protected]
Date: Sun, Sep 27, 2015 at 8:36 AM
To: "[email protected]" [email protected]

Turns out that docker does not map UIDs. This means the containers are even more difficult to use than I had previously discussed.

Up until now, there was a user vagrant with a UID=1000 on the host, which in the docker container maps to vagrant=1000 so you can do something like:

docker run -i -t -u vagrant -v $PWD:/home/vagrant/work radiasoft/beamsim bash -c '. ~/.bashrc && cd ~/work && synergia fodo.py'

However, if you are running as your normal user (nagler=601) on apa11 and try this command, you'll get permission denied, because $PWD is owned by nagler, not vagrant.

It is possible to deal with this situation, but it requires running usermod -u 601 vagrant when the container boots. That alone is not sufficient: while usermod changes the files in ~vagrant to the new UID, the group is still wrong. You need to run groupmod -g 601 vagrant (the nagler group is 601 on apa11), which doesn't change the files in ~vagrant in the container, so you need to follow with chgrp -R vagrant ~vagrant.

So, it looks like a more complex wrapper is needed, one that extracts the UID and GID of /home/vagrant/work inside the container and then runs usermod before su'ing to vagrant to actually run the command.
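
The steps described above can be sketched as follows. This is only an outline of the wrapper's logic, assuming it runs as root inside the container before switching to vagrant; owner_of and remap_commands are hypothetical names, and the commands are returned as strings rather than executed.

```python
import os


def owner_of(path):
    """Return the (uid, gid) that owns path, e.g. the mounted work directory."""
    st = os.stat(path)
    return st.st_uid, st.st_gid


def remap_commands(uid, gid, user="vagrant"):
    """Shell commands that make `user` match the owner of the mounted directory."""
    return [
        "usermod -u {} {}".format(uid, user),    # also re-owns files in the home dir
        "groupmod -g {} {}".format(gid, user),   # changes the group ID only
        "chgrp -R {} ~{}".format(user, user),    # fix group on existing home files
    ]
```

Inside the container, the wrapper would call owner_of("/home/vagrant/work"), run the resulting commands, and then su to vagrant to run the requested code.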

People don't run into this problem in production, because they use the same users on the host and guest (as we do for sirepo.com). And during development, we run in a VM with Docker so it works just fine since the containers are built with vagrant=1000. And, helpfully, VirtualBox maps UIDs as the invoking user.

The question is: should I create a magic command that gets installed in ~/bin on the host which does these tricks for the end user? An alternative is to configure root, not vagrant, in the image so that you don't need to switch UIDs. This isn't recommended, because root does have some privileges on the host, e.g. you can chown files on the host from the guest. I would rather run as vagrant in the container (and others recommend that in general).

Thoughts?

Rob


From: David Bruhwiler [email protected]
Date: Sun, Sep 27, 2015 at 10:19 AM
To: [email protected]

Very interesting.

I think the built in magic commands would be cool.

I agree that configuring as root is a bad idea.

David


From: Rob Nagler [email protected]
Date: Wed, Sep 30, 2015 at 8:37 AM
To: "[email protected]" [email protected]

Added it to my list.

:)

Rob

Add srw to sirepo build

We need to build latest srw with sirepo for the time being since it is changing frequently.

warp: numpy.core.multiarray failed to import

Visiting the warp simulations page is causing a server error on alpha.
It looks like something isn't right with OpenPMDTimeSeries on alpha -
/var/lib/sirepo/app.log:

from opmd_viewer import OpenPMDTimeSeries

File "build/bdist.linux-x86_64/egg/opmd_viewer/__init__.py", line 2, in <module>
File "build/bdist.linux-x86_64/egg/opmd_viewer/openpmd_timeseries/__init__.py", line 2, in <module>
File "build/bdist.linux-x86_64/egg/opmd_viewer/openpmd_timeseries/main.py", line 9, in <module>
File "build/bdist.linux-x86_64/egg/opmd_viewer/openpmd_timeseries/plotter.py", line 7, in <module>
File "/home/vagrant/.pyenv/versions/2.7.10/lib/python2.7/site-packages/matplotlib/pyplot.py", line 114, in <module>
_backend_mod, new_figure_manager, draw_if_interactive, _show = pylab_setup()
File "/home/vagrant/.pyenv/versions/2.7.10/lib/python2.7/site-packages/matplotlib/backends/__init__.py", line 32, in pylab_setup
globals(),locals(),[backend_name],0)
File "/home/vagrant/.pyenv/versions/2.7.10/lib/python2.7/site-packages/matplotlib/backends/backend_tkagg.py", line 13, in <module>
import matplotlib.backends.tkagg as tkagg
File "/home/vagrant/.pyenv/versions/2.7.10/lib/python2.7/site-packages/matplotlib/backends/tkagg.py", line 9, in <module>
from matplotlib.backends import _tkagg
ImportError: numpy.core.multiarray failed to import

@moellep
