thecodeteam / mesos-module-dvdi
Mesos Docker Volume Driver Isolator module
License: Apache License 2.0
Some storage platforms, and most file systems, prevent concurrent mounts of a volume by multiple users.
If a Mesos agent mounts a volume and terminates abnormally, and the same or an alternate workload subsequently tries to mount the volume without a dismount from the original location, the second mount can potentially fail. Some storage platforms support a "pre-emptive mount" option which forcibly removes the first mount when the second mount is attempted.
Resolution of this issue is dependent on resolution of a corresponding issue in the RexRay project. See: rexray/rexray#30
This is related to and dependent on issue #21.
Mount point specification will be exactly like that used by the existing Mesos containerizer shared filesystem isolator.
Hi, I installed this module on Mesos 0.25.0 with REX-Ray 0.3.0 on OpenStack.
I tested the example in the README, and the Cinder volume was created.
But when I test with a Docker container in Mesos, the Cinder volume is not created at all. Here is my JSON file, deployed via Marathon 0.13.0. I also want to know whether this module supports Docker containers at this moment.
{
  "id": "demo",
  "mem": 512,
  "cpus": 0.5,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": {
      "network": "BRIDGE",
      "image": "nginx:1.9.6",
      "portMappings": [{
        "containerPort": 80,
        "protocol": "tcp"
      }]
    },
    "volumes": [{
      "containerPath": "/usr/share/nginx/html",
      "hostPath": "/var/lib/rexray/volumes/demo",
      "mode": "RO"
    }]
  },
  "env": {
    "DVDI_VOLUME_NAME": "demo",
    "DVDI_VOLUME_DRIVER": "rexray",
    "DVDI_VOLUME_OPTS": "size=5,iops=150,newfstype=xfs,overwritefs=true",
    "DVDI_VOLUME_CONTAINERPATH": "/var/lib/rexray/volumes/demo"
  }
}
When I try to add the module to mesos it fails to start and spits out:
Jan 13 03:00:01 ip-10-0-4-73 mesos-slave[29629]: WARNING: Logging before InitGoogleLogging() is written to STDERR
Jan 13 03:00:01 ip-10-0-4-73 mesos-slave[29629]: I0113 03:00:01.719745 29605 main.cpp:243] Build: 2016-11-16 01:30:49 by ubuntu
Jan 13 03:00:01 ip-10-0-4-73 mesos-slave[29629]: I0113 03:00:01.719815 29605 main.cpp:244] Version: 1.1.0
Jan 13 03:00:01 ip-10-0-4-73 mesos-slave[29629]: I0113 03:00:01.719827 29605 main.cpp:247] Git tag: 1.1.0
Jan 13 03:00:01 ip-10-0-4-73 mesos-slave[29629]: I0113 03:00:01.719835 29605 main.cpp:251] Git SHA: a44b077ea0df54b77f05550979e1e97f39b15873
Jan 13 03:00:01 ip-10-0-4-73 mesos-slave[29629]: I0113 03:00:01.723484 29605 logging.cpp:194] INFO level logging started!
Jan 13 03:00:01 ip-10-0-4-73 mesos-slave[29629]: Error loading modules: Error opening library: 'libmesos_dvdi_isolator-1.1.0.so': Could not load library 'libmesos_dvdi_isolator-1.1.0.so': /usr/lib/libmesos_dvdi_isolator-1.1.0.so: undefined symbol: _ZNK6google8protobuf7Message11GetTypeNameEv
Jan 13 03:00:01 ip-10-0-4-73 systemd[1]: mesos-slave.service: Main process exited, code=exited, status=1/FAILURE
The code currently performs its own disk I/O through an isolator-specific implementation based on the C++ standard library. Using the existing Mesos slave checkpoint implementation is expected to eliminate or reduce code by reusing an existing, proven internal Mesos component.
Google protocol buffers are the standard mechanism within Mesos for rendering and passing configuration and parameters, and are likely to handle versioning and schema definition better.
Hi,
how can I debug the fact that the Mesos slave does not call the module? Using dvdcli directly is fine.
I launched my slave with:
/usr/sbin/mesos-slave --master=zk://a.b.c.d:2181/mesos --log_dir=/var/log/mesos --containerizers=docker,mesos --modules=file:///usr/lib/dvdi-mod.json
I can see at startup:
.... --log_dir="/var/log/mesos" --logbufsecs="0" --logging_level="INFO" --master="zk://a.b.c.d:2181/mesos" --modules="libraries {
file: "/usr/lib/libmesos_dvdi_isolator-0.23.0.so"
modules {
name: "com_emccode_mesos_DockerVolumeDriverIsolator"
}
}
When I call my Docker plugin with dvdcli, I can see my plugin is activated, etc. However, when I execute a job, nothing happens.
I have set the environment variables in the Mesos offer protobuf, in CommandInfo, though I cannot check that they are correctly set. The job is executed "as usual", without calling the Docker plugin.
I am using Mesos 0.23.0.
Thanks
Either via a series of ifdefs or some type of abstraction interface, support versions of Mesos going back to 0.23.0.
Hello, the library will not work on CoreOS (e.g. DC/OS) due to the following:
isolator/isolator/docker_volume_driver_isolator.hpp:
static constexpr char DVDCLI_MOUNT_CMD[] = "/usr/bin/dvdcli mount";
static constexpr char DVDCLI_UNMOUNT_CMD[] = "/usr/bin/dvdcli unmount";
whereas dvdcli installs to /opt/bin on CoreOS:
if [ -n "$IS_COREOS" ]; then
BIN_DIR=/opt/bin
BIN_FILE=$BIN_DIR/$BIN_NAME
Related to #88: if calls to os::shell to execute dvdcli hang or block for significant amounts of time, then the task launch pipeline breaks down and tasks become stuck in STAGING. Part of the reason this happens is that the isolator module invokes potentially blocking operations synchronously from within the Mesos module API handlers.
A better approach would be to invoke such commands asynchronously, perhaps by using, for example, Subprocess. The HDFS code in Mesos provides an example of this approach: https://github.com/apache/mesos/blob/4d2b1b793e07a9c90b984ca330a3d7bc9e1404cc/src/hdfs/hdfs.cpp#L53
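The idea can be sketched in plain C++ (a toy illustration only, not the actual libprocess Subprocess API): run the external command on another thread and return a future, so the module API handler itself never blocks waiting for dvdcli.

```cpp
#include <cstdio>
#include <future>
#include <string>

// Run a shell command without blocking the calling thread and return a
// future for its stdout. Sketch only: the real fix would use libprocess's
// process::subprocess, which integrates with the Mesos actor model
// instead of spawning a raw std::async thread.
std::future<std::string> runAsync(const std::string& cmd) {
  return std::async(std::launch::async, [cmd]() {
    std::string output;
    FILE* pipe = popen(cmd.c_str(), "r");
    if (pipe == nullptr) {
      return output;  // failed to launch; empty output
    }
    char buffer[256];
    while (fgets(buffer, sizeof(buffer), pipe) != nullptr) {
      output += buffer;
    }
    pclose(pipe);
    return output;
  });
}
```

The caller can then attach continuations (or poll the future) instead of stalling the task launch pipeline while dvdcli runs.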
Mesos 1.3.0 was recently released and should be supported by the isolator
When testing out mesos-module-dvdi in the context of DC/OS 1.7, a user reported this abort upon integrating the module and restarting the agent:
Jun 05 06:33:23 dcostest-gra1-slavepub01 mesos-slave[4942]: I0605 06:33:23.179944 4951 state.cpp:58] Recovering state from '/tmp/mesos'
Jun 05 06:33:23 dcostest-gra1-slavepub01 mesos-slave[4942]: ABORT: (/pkg/src/mesos/3rdparty/libprocess/3rdparty/stout/include/stout/result.hpp:114): Result::get() but state == NONE
Jun 05 06:33:23 dcostest-gra1-slavepub01 mesos-slave[4942]: *** Aborted at 1465108403 (unix time) try "date -d @1465108403" if you are using GNU date ***
It looks like the module did not honor the MESOS_WORK_DIR=/var/lib/mesos default in DC/OS, and attempted to read from a missing folder. This was confirmed by creating the folder with appropriate permissions, which unblocked mesos-slave.
It's possible that this is the correct location for that path, in which case the module may want to create it if it doesn't exist.
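A minimal sketch of that suggestion (a hypothetical helper, not the module's actual code): create the directory if it is missing instead of aborting during recovery.

```cpp
#include <sys/stat.h>
#include <sys/types.h>
#include <cerrno>
#include <string>

// Hypothetical helper: ensure a directory exists before the isolator
// tries to recover checkpointed state from it. Treats "already exists"
// as success. A real implementation would also create parent directories
// (as stout's os::mkdir can) and verify the existing path is a directory.
bool ensureDirectory(const std::string& path) {
  if (mkdir(path.c_str(), 0755) == 0) {
    return true;           // created just now
  }
  return errno == EEXIST;  // already present counts as success
}
```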
Mesos 1.2.0 was recently released and should be supported by the isolator
ABORT: (../../3rdparty/libprocess/3rdparty/stout/include/stout/result.hpp:114): Result::get() but state == NONE
Jun 13 19:17:13 slave-1 mesos-slave[22560]: *** Aborted at 1465845433 (unix time) try "date -d @1465845433" if you are using GNU date ***
Jun 13 19:17:13 slave-1 mesos-slave[22560]: PC: @ 0x7f5180a0f5f7 __GI_raise
Jun 13 19:17:13 slave-1 mesos-slave[22560]: *** SIGABRT (@0x57f8) received by PID 22520 (TID 0x7f517a2dd700) from PID 22520; stack trace: ***
Jun 13 19:17:13 slave-1 mesos-slave[22560]: @ 0x7f51812c8100 (unknown)
Jun 13 19:17:13 slave-1 mesos-slave[22560]: @ 0x7f5180a0f5f7 __GI_raise
Jun 13 19:17:13 slave-1 mesos-slave[22560]: @ 0x7f5180a10ce8 __GI_abort
Jun 13 19:17:13 slave-1 mesos-slave[22560]: @ 0x40b71c _Abort()
Jun 13 19:17:13 slave-1 mesos-slave[22560]: @ 0x40b75c _Abort()
Jun 13 19:17:13 slave-1 mesos-slave[22560]: @ 0x7f518218a2db Result<>::get()
Jun 13 19:17:13 slave-1 mesos-slave[22560]: @ 0x7f517aaf724f mesos::slave::DockerVolumeDriverIsolator::recover()
Jun 13 19:17:13 slave-1 mesos-slave[22560]: @ 0x7f51822a6a55 mesos::internal::slave::MesosContainerizerProcess::recoverIsolators()
Jun 13 19:17:13 slave-1 mesos-slave[22560]: @ 0x7f51822b0207 mesos::internal::slave::MesosContainerizerProcess::_recover()
Jun 13 19:17:13 slave-1 mesos-slave[22560]: @ 0x7f51822cd797 _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchI7NothingN5mesos8internal5slave25MesosContainerizerProcessERKSt4listINS6_5slave14ContainerStateESaISC_EERK7hashsetINS6_11ContainerIDESt4hashISI_ESt8equal_toISI_EESE_SN_EENS0_6FutureIT_EERKNS0_3PIDIT0_EEMSU_FSS_T1_T2_ET3_T4_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
Jun 13 19:17:13 slave-1 mesos-slave[22560]: @ 0x7f518276e8a1 process::ProcessManager::resume()
Jun 13 19:17:13 slave-1 mesos-slave[22560]: @ 0x7f518276eba7 _ZNSt6thread5_ImplISt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS6_EEEvEEE6_M_runEv
Jun 13 19:17:13 slave-1 mesos-slave[22560]: @ 0x7f5181066220 (unknown)
Jun 13 19:17:13 slave-1 mesos-slave[22560]: @ 0x7f51812c0dc5 start_thread
Jun 13 19:17:13 slave-1 mesos-slave[22560]: @ 0x7f5180ad021d __clone
Jun 13 19:17:13 slave-1 systemd[1]: mesos-slave.service: main process exited, code=killed, status=6/ABRT
Command used:
Jun 13 19:28:13 slave-1 mesos-slave[27764]: I0613 19:28:13.036092 27723 slave.cpp:194] Flags at startup: --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="/tmp/mesos/store/appc" --attributes="flavor:m1-slave;java:1.8.0;os:centos7" --authenticatee="crammd5" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="docker,mesos" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="15secs" --docker_store_dir="/tmp/mesos/store/docker" --enforce_container_disk_quota="true" --executor_environment_variables="{"DATACENTER":"foo","JAVA_HOME":"\/usr\/jdk1.8.0_31"}" --executor_registration_timeout="5mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/mesos/fetch" --fetcher_cache_size=
Jun 13 19:28:13 slave-1 mesos-slave[27764]: "2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="/usr/hadoop-2.6.3" --help="false" --hostname="slave-1.redacted.fqdn" --hostname_lookup="true" --image_provisioner_backend="copy" --initialize_driver_logging="true" --isolation="com_emccode_mesos_DockerVolumeDriverIsolator" --launcher_dir="/usr/libexec/mesos" --logbufsecs="0" --logging_level="INFO" --master="zk://10.211.194.118:2181,10.211.194.121:2181,10.211.194.122:2181/mesos" --modules="libraries {
Jun 13 19:28:13 slave-1 mesos-slave[27764]: file: "/usr/lib/libmesos_dvdi_isolator-0.28.2.so"
Jun 13 19:28:13 slave-1 mesos-slave[27764]: modules {
Jun 13 19:28:13 slave-1 mesos-slave[27764]: name: "com_emccode_mesos_DockerVolumeDriverIsolator"
Jun 13 19:28:13 slave-1 mesos-slave[27764]: }
Jun 13 19:28:13 slave-1 mesos-slave[27764]: }
Jun 13 19:28:13 slave-1 mesos-slave[27764]: " --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --port="5050" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="1secs" --resources="ports:[1025-8999,9011-65535]" --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --systemd_enable_support="true" --systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/var/lib/mesos"
Works great on 0.28.1.
Support for Mesos 1.4.1 Needed
Since https://github.com/codedellemc/mesos-module-libstorage also uses dvdcli, why would one be used vs the other?
In the v0.23.0 version of Mesos, the slave work directory was not readily available to an isolator implemented as a module. It is expected that the slave work directory will be available in an upcoming version of Mesos. When it is, the existing isolator parameter will be deprecated and the slave work directory will automatically be used instead. This will simplify configuration of the isolator.
@cantbewong I'm planning to work on https://issues.apache.org/jira/browse/MESOS-4355, which is exactly what you are doing now for this project. Do you have any comments on how we can make this upstream to Mesos?
It would be nice if we could add support for building the DVDI module with Mesos v0.27.0.
The isolator determines the 'legacy mounts' (mounts that no active container is using) during slave recovery by just looking at the checkpointed state, and will unmount those legacy mounts immediately. However, known orphans might not have been killed yet (known orphans are those containers that are known by the launcher). The Mesos agent will do an async cleanup on those known orphan containers.
Unmounting the volumes while those orphan containers are still using them might be problematic. The correct way is to wait for 'cleanup' to be called for those known orphan containers (i.e., still create an Info struct for those orphan containers, and do proper cleanup in the 'cleanup' function).
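The recovery rule described above can be sketched as follows (hypothetical types; the real isolator works with ContainerID and its Info structs):

```cpp
#include <map>
#include <set>
#include <string>

// Classify checkpointed mounts during recovery: mounts held by known
// orphans must wait for cleanup() (the agent destroys those containers
// asynchronously); only mounts whose container is neither active nor a
// known orphan are truly "legacy" and safe to unmount immediately.
enum class Action { UnmountNow, DeferToCleanup };

std::map<std::string, Action> classifyMounts(
    const std::map<std::string, std::string>& checkpointed,  // container -> volume
    const std::set<std::string>& active,
    const std::set<std::string>& knownOrphans) {
  std::map<std::string, Action> result;

  for (const auto& entry : checkpointed) {
    const std::string& container = entry.first;
    const std::string& volume = entry.second;

    if (active.count(container) > 0) {
      continue;  // still in use; keep tracking it, don't unmount
    }

    result[volume] = knownOrphans.count(container) > 0
        ? Action::DeferToCleanup  // wait for cleanup() to be called
        : Action::UnmountNow;     // no container references it
  }

  return result;
}
```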
This describes desired behavior:
Implement these behaviors
There is a 0.2.0 release of dvdcli coming out shortly. I reviewed the current isolator code, and I believe there should not be any net effect as is.
There was a change in dvdcli to update it against the Docker 1.11.0 volume API. The biggest change is that we now have the ability to check whether a volume exists prior to any operation. Previously, we would implicitly create volumes no matter what when commands were run with dvdcli.
There is a new flag, --explicitCreate=true, that needs to be set to enable this new functionality. Without it, implicit creation will still work for mesos-module-dvdi as before, and new volumes can be specified with the task. Let's figure out how to get this into the next release of the module, where a parameter can be set for the agent that defines whether explicitCreate is used or not.
The Docker volume plugin has Create/Mount/Unmount/Delete operations.
Will all those methods be called on container startup/destroy? At which step of the Mesos container life cycle?
Thanks
Greetings,
Wondering if someone can supply an example of a JSON file where a Docker container would mount a DVDI-provided volume? Let's say I want an elasticsearch container to mount a volume called elkvol001 as /data, provided by the DVDI module. Something along the lines of this:
{
  "id": "elasticsearch",
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "elasticsearch",
      "network": "BRIDGE"
    }
  },
  "volumes": [{
    "containerPath": "/data",
    "hostPath": "/var/lib/rexray/volumes/dbdata",
    "mode": "RW"
  }],
  "cpus": 0.2,
  "mem": 512.0,
  "env": {
    "DVDI_VOLUME_NAME": "elkvol001",
    "DVDI_VOLUME_DRIVER": "rexray",
    "DVDI_VOLUME_OPTS": "size=5,iops=150,volumetype=io1,newfstype=xfs,overwritefs=true"
  },
  "instances": 1
}
Currently, the isolator checkpoints the mount AFTER doing the actual mount. The consequence is that if the slave crashes after doing the mount but before checkpointing, it loses track of that mount during recovery. One obvious issue is mount leaking. There might be other issues.
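One way to avoid the leak, sketched here with a toy in-memory journal (hypothetical types, not the isolator's actual checkpoint code), is to reverse the order: record the intended mount first, then mount, and roll the record back only on a clean failure. A crash between the two steps then leaves a checkpoint entry that recovery can reconcile, at worst re-checking a mount that never happened, rather than leaking a real mount that was never recorded.

```cpp
#include <set>
#include <string>

// Toy stand-in for the isolator's checkpointed state.
struct MountJournal {
  std::set<std::string> checkpointed;

  // Checkpoint BEFORE mounting. If the mount itself fails cleanly we
  // erase the record; if the process crashes in between, recovery sees
  // the record and can verify whether the mount actually happened.
  template <typename MountFn>
  void mount(const std::string& volume, MountFn doMount) {
    checkpointed.insert(volume);
    try {
      doMount(volume);
    } catch (...) {
      checkpointed.erase(volume);  // clean failure: roll back the record
      throw;
    }
  }
};
```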
Are there any files that can enable me to do this check?
The reason is that when I was working on https://issues.apache.org/jira/browse/MESOS-4355, the shepherd had a question: he thinks that we do not need to do dvdcli unmount during cleanup or recover, and we want to make sure there is no impact if we do not call unmount during cleanup or recover.
Another point from the shepherd is that we run dvdcli mount in the container's namespace, so we do not need to unmount during cleanup: the mount point will be managed by the container, and once the container goes away, the mount point also goes away. So I was verifying those issues and want to get some help from you. Thanks.
Here comes the question: why does the DVDI isolator need to call "unmount" in the isolator's cleanup? I did not see much impact if we do not unmount during cleanup or recover; the only issue is that there will be some garbage mount info in /proc/mounts. Any comments?
**Environment**
OS: RHEL 7.2
Apache Mesos: 0.26.0
Rexray: 0.3.1
DVDI: 0.4.1-dev
Built the DVDI isolator module for Mesos 0.26.0 as per the documentation. Trying to load the module on the Mesos slave upon start:
File: _/etc/mesos-slave/modules_
file:///usr/lib/dvdi-mod.json
File _/etc/mesos-slave/isolation_
com_emccode_mesos_DockerVolumeDriverIsolator
Trying to start mesos-slave:
systemctl daemon-reload && systemctl restart mesos-slave
Unfortunately it is failing on startup:
systemctl -l status mesos-slave
● mesos-slave.service - Mesos Slave
Loaded: loaded (/usr/lib/systemd/system/mesos-slave.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/mesos-slave.service.d
└─mesos-slave-containerizer.conf
Active: activating (auto-restart) (Result: exit-code) since Fri 2016-01-29 15:08:47 EST; 5s ago
Process: 2202 ExecStart=/usr/bin/mesos-init-wrapper slave (code=exited, status=1/FAILURE)
Main PID: 2202 (code=exited, status=1/FAILURE)
Jan 29 15:08:47 node1.local.net systemd[1]: Unit mesos-slave.service entered failed state.
Jan 29 15:08:47 node1.local.net systemd[1]: mesos-slave.service failed.
[root@node1 mesos-slave]#
_error log entries_
Jan 29 15:00:00 node1.local.net mesos-slave[1741]: Error loading modules: Error opening library: '/usr/lib/dvdi_isolator-0.26.0.so': Could not load library '/usr/lib/dvdi_isolator-0.26.0.so': /usr/lib/dvdi_isolator-0.26.0.so: cannot dynamically load executable
Jan 29 15:00:00 node1.local.net systemd: mesos-slave.service: main process exited, code=exited, status=1/FAILURE
Jan 29 15:00:00 node1.local.net systemd: Unit mesos-slave.service entered failed state.
Jan 29 15:00:00 node1.local.net systemd: mesos-slave.service failed.
Any clues as to where I should be focusing?
Thanks,
Alex
Bumping SVN_VER := 1.9.2 to SVN_VER := 1.9.3 solves the issue.
We need to ensure that we have covered the recovery scenarios considering the information we have in 0.23.0.
Some of which can be
I created a task in Marathon, watched it come up, let it run for a few minutes, then suspended the app (which sets instances to zero). I ended up with a hung unmount operation on the slave the task was running on. The task appears as KILLED in Mesos. I have a single-slave-node cluster. Now I can't launch a new task that tries to use a (different) REX-Ray volume; attempting to do so results in a task stuck in STAGING. So this hung unmount operation is blocking somewhere.
logs:
Apr 02 19:23:11 ip-10-0-0-249.us-west-2.compute.internal mesos-slave[1129]: I0402 19:23:11.981403 1134 slave.cpp:3528] executor(1)@10.0.0.249:53756 exited
Apr 02 19:23:12 ip-10-0-0-249.us-west-2.compute.internal mesos-slave[1129]: I0402 19:23:12.070981 1136 containerizer.cpp:1608] Executor for container '05ca518e-1fe1-4fee-bc2f-abfbb0cb9fa9' has exited
Apr 02 19:23:12 ip-10-0-0-249.us-west-2.compute.internal mesos-slave[1129]: I0402 19:23:12.071041 1136 containerizer.cpp:1392] Destroying container '05ca518e-1fe1-4fee-bc2f-abfbb0cb9fa9'
Apr 02 19:23:12 ip-10-0-0-249.us-west-2.compute.internal mesos-slave[1129]: I0402 19:23:12.072430 1133 cgroups.cpp:2427] Freezing cgroup /sys/fs/cgroup/freezer/mesos/05ca518e-1fe1-4fee-bc2f-abfbb0cb9fa9
Apr 02 19:23:12 ip-10-0-0-249.us-west-2.compute.internal mesos-slave[1129]: I0402 19:23:12.073606 1132 cgroups.cpp:1409] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos/05ca518e-1fe1-4fee-bc2f-abfbb0cb9fa9 after 1.126912ms
Apr 02 19:23:12 ip-10-0-0-249.us-west-2.compute.internal mesos-slave[1129]: I0402 19:23:12.074779 1139 cgroups.cpp:2445] Thawing cgroup /sys/fs/cgroup/freezer/mesos/05ca518e-1fe1-4fee-bc2f-abfbb0cb9fa9
Apr 02 19:23:12 ip-10-0-0-249.us-west-2.compute.internal mesos-slave[1129]: I0402 19:23:12.075850 1139 cgroups.cpp:1438] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/05ca518e-1fe1-4fee-bc2f-abfbb0cb9fa9 after 1.031936ms
Apr 02 19:23:12 ip-10-0-0-249.us-west-2.compute.internal mesos-slave[1129]: I0402 19:23:12.076619 1133 docker_volume_driver_isolator.cpp:356] rexray/test is being unmounted on cleanup()
Apr 02 19:23:12 ip-10-0-0-249.us-west-2.compute.internal mesos-slave[1129]: I0402 19:23:12.078660 1133 docker_volume_driver_isolator.cpp:362] Invoking /opt/mesosphere/bin/dvdcli unmount --volumedriver=rexray --volumename=test
Apr 02 19:23:12 ip-10-0-0-249.us-west-2.compute.internal rexray[931]: time="2016-04-02T19:23:12Z" level=info msg=vdm.Create driverName=docker moduleName=default-docker opts=map[] volumeName=test
Apr 02 19:23:12 ip-10-0-0-249.us-west-2.compute.internal rexray[931]: time="2016-04-02T19:23:12Z" level=info msg="initialized count" count=0 moduleName=default-docker volumeName=test
Apr 02 19:23:12 ip-10-0-0-249.us-west-2.compute.internal rexray[931]: time="2016-04-02T19:23:12Z" level=info msg="creating volume" driverName=docker moduleName=default-docker volumeName=test volumeOpts=map[]
Apr 02 19:23:12 ip-10-0-0-249.us-west-2.compute.internal rexray[931]: time="2016-04-02T19:23:12Z" level=info msg=sdm.GetVolume driverName=ec2 moduleName=default-docker volumeID= volumeName=test
Apr 02 19:23:13 ip-10-0-0-249.us-west-2.compute.internal rexray[931]: time="2016-04-02T19:23:13Z" level=info msg=vdm.Unmount driverName=docker moduleName=default-docker volumeID= volumeName=test
Apr 02 19:23:13 ip-10-0-0-249.us-west-2.compute.internal rexray[931]: time="2016-04-02T19:23:13Z" level=info msg="initialized count" count=0 moduleName=default-docker volumeName=test
Apr 02 19:23:13 ip-10-0-0-249.us-west-2.compute.internal rexray[931]: time="2016-04-02T19:23:13Z" level=info msg="unmounting volume" driverName=docker moduleName=default-docker volumeID= volumeName=test
Apr 02 19:23:13 ip-10-0-0-249.us-west-2.compute.internal rexray[931]: time="2016-04-02T19:23:13Z" level=info msg=sdm.GetVolume driverName=ec2 moduleName=default-docker volumeID= volumeName=test
Apr 02 19:23:13 ip-10-0-0-249.us-west-2.compute.internal rexray[931]: time="2016-04-02T19:23:13Z" level=info msg=sdm.GetVolumeAttach driverName=ec2 instanceID=i-8677785c moduleName=default-docker volumeID=vol-23fe679a
Apr 02 19:23:13 ip-10-0-0-249.us-west-2.compute.internal rexray[931]: time="2016-04-02T19:23:13Z" level=info msg=odm.GetMounts deviceName="/dev/xvdc" driverName=linux moduleName=default-docker mountPoint=
Apr 02 19:23:13 ip-10-0-0-249.us-west-2.compute.internal rexray[931]: time="2016-04-02T19:23:13Z" level=info msg=odm.Unmount driverName=linux moduleName=default-docker mountPoint="/var/lib/rexray/volumes/test"
Apr 02 19:23:13 ip-10-0-0-249.us-west-2.compute.internal rexray[931]: time="2016-04-02T19:23:13Z" level=info msg=sdm.DetachVolume driverName=ec2 instanceID= moduleName=default-docker runAsync=false volumeID=vol-23fe679a
Apr 02 19:23:13 ip-10-0-0-249.us-west-2.compute.internal rexray[931]: time="2016-04-02T19:23:13Z" level=info msg="waiting for volume detachment to complete" driverName=ec2 force=false moduleName=default-docker runAsync=false volumeID=vol-23fe679a
Apr 02 19:23:14 ip-10-0-0-249.us-west-2.compute.internal kernel: vbd vbd-51744: 16 Device in use; refusing to close
/proc/mounts:
core@ip-10-0-0-249 ~ $ cat /proc/mounts
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,nosuid,size=7688856k,nr_inodes=1922214,mode=755 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,nodev,mode=755 0 0
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_cls,net_prio 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
/dev/xvda9 / ext4 rw,relatime,data=ordered 0 0
/dev/xvda3 /usr ext4 ro,relatime 0 0
mqueue /dev/mqueue mqueue rw,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
tmpfs /media tmpfs rw,nosuid,nodev,noexec,relatime 0 0
systemd-1 /boot autofs rw,relatime,fd=32,pgrp=1,timeout=0,minproto=5,maxproto=5,direct 0 0
tmpfs /tmp tmpfs rw 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime 0 0
xenfs /proc/xen xenfs rw,relatime 0 0
/dev/xvda6 /usr/share/oem ext4 rw,nodev,relatime,commit=600,data=ordered 0 0
/dev/xvda1 /boot vfat rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,errors=remount-ro 0 0
/dev/xvdb /var/lib ext4 rw,relatime,data=ordered 0 0
/dev/xvdc /tmp/test-rexray-volume ext4 rw,relatime,data=ordered 0 0
I was using the 0.27.0 DVDI driver, and the env JSON is as follows:
root@mesos002:~/src/mesos/27/mesos/build# cat /opt/env.json
{
  "DVDI_VOLUME_NAME": "123",
  "DVDI_VOLUME_DRIVER": "convoy"
}
I was always getting this error when launching a task:
E0314 20:17:41.391485 2008 docker_volume_driver_isolator.cpp:485] Environment variable DVDI_VOLUME_NAME rejected because it's value contains prohibited characters
We should avoid that, since that directory is owned by the Mesos agent and Mesos does not expect an isolator to put checkpointed data there. We should put it under /var/run/mesos/isolators/.
The directory in which the checkpointed data is put should be cleaned up on reboot. Otherwise, the isolator will try to recover the checkpointed data after the reboot, thinking that the mounts are still there. Therefore, putting checkpointed data under /var/run is recommended, as it will get cleaned up on reboot.
Are there any plans about adding the support for the current mesos version (1.5.0)?
Limit code to 80 characters per line.
All comments should end with a punctuation mark.
Capitalize the beginning of comments.
Error messages should generally begin with capital letters and should not end with punctuation marks.
Make a bit more liberal use of blank lines to provide vertical spacing. I would have a look at the Mesos code and try to emulate the patterns used there: often a blank line before/after an if statement (unless a preceding declaration/assignment is directly related to the if), separating declarations and assignments into logical groupings using blank lines, etc.
I notice in your code that you show a general preference for working with stringstreams rather than manipulating strings; I'm not sure how much it matters, but the Mesos codebase relies primarily on string manipulation.
Make sure that you're taking advantage of all the using statements at the beginning of the code, using std::string for example.
Check out the indent rules in the style guide; there are a couple of blocks in the isolator code that are indented four spaces rather than two.
The includes at the beginning of your code should be divided into logical groupings with newlines; you can check out the Mesos code for examples of this.
Avoid using namespace directives in Mesos, instead utilizing standard namespace declarations.
Good use of the stout library's foreach helper, but you revert to the standard C++11 range-based for loop in several places; for consistency, you could stick to the foreach function exclusively.
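For illustration, here is a toy function (not taken from the isolator) that applies several of these points: using declarations instead of a using namespace directive, two-space indentation, and blank lines around the if statements.

```cpp
#include <string>
#include <vector>

// Standard namespace declarations rather than `using namespace std;`.
using std::string;
using std::vector;

// Join the non-empty parts with a separator.
string joinNonEmpty(const vector<string>& parts, const string& separator) {
  string result;

  for (const string& part : parts) {
    if (part.empty()) {
      continue;
    }

    if (!result.empty()) {
      result += separator;
    }

    result += part;
  }

  return result;
}
```

(In Mesos style this loop would use stout's foreach helper; a plain range-based for is used here only to keep the example dependency-free.)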
In the v0.23.0 release of the DVDI isolator, a mount is exposed to other applications running on the slave. Reuse code from the Mesos containerizer shared filesystem isolator to impose isolation between mount points on an agent node.