mysos's Issues

slave error: Failed to fetch URIs for container

I'm trying to run Mysos on OpenStack. Since I'm not using Vagrant (I can't start a VM inside an OpenStack instance), I had to change the scripts. Here is the list of modifications:
• changed the hardcoded IP address to the host's private IP address in the config and script files.
• changed the username from vagrant to ubuntu in the config and script files.
I was able to install and start ZooKeeper, Mesos, and the Mysos scheduler; they are all connected through ZooKeeper.
Here is how I run the different services:

mesos-master:


sudo mesos-master \
--zk=zk://1.125.1.5:2181/mesos/master \
--ip=1.125.1.5 \
--work_dir=/home/ubuntu/var/local/mesos/master/db \
--quorum=1 \
--roles=mysos \
--credentials=/home/ubuntu/mysos/vagrant/etc/framework_keys.txt \
--log_dir=/home/ubuntu/log-mysos/master \
--no-authenticate_slave

mesos-slave:


sudo mesos-slave \
--master=zk://1.125.1.5:2181/mesos/master \
--ip=1.125.1.5 \
--hostname=1.125.1.5 \
--resources="cpus(mysos):4;mem(mysos):1024;disk(mysos):20000;ports(mysos):[31000-32000]" \
--isolation="cgroups/cpu,cgroups/mem" \
--cgroups_enable_cfs \
--log_dir=/home/ubuntu/log-mysos/slave \
--frameworks_home=/home/ubuntu/mysos/vagrant/bin

mysos-scheduler:


mysos_scheduler \
    --port=55001 \
    --framework_user=ubuntu \
    --mesos_master=zk://1.125.1.5:2181/mesos/master \
    --executor_uri=/home/ubuntu/mysos/dist/mysos-0.1.0-dev0.zip \
    --executor_cmd=/home/ubuntu/mysos/vagrant/bin/mysos_executor.sh \
    --zk_url=zk://1.125.1.5:2181/mysos \
    --admin_keypath=/home/ubuntu/mysos/vagrant/etc/admin_keyfile.yml \
    --framework_failover_timeout=1m \
    --framework_role=mysos \
    --framework_authentication_file=/home/ubuntu/mysos/vagrant/etc/fw_auth_keyfile.yml \
    --scheduler_keypath=/home/ubuntu/mysos/vagrant/etc/scheduler_keyfile.txt \
    --executor_source_prefix='vagrant.devcluster' \
    --executor_environ='[{"name": "MYSOS_DEFAULTS_FILE", "value": "/etc/mysql/conf.d/my5.6.cnf"}]'

Now, when I try to create a cluster with the following command:

curl -X POST mysos_host_ip:55001/clusters/test_cluster3 --form "cluster_user=mysos"

On the Mysos scheduler, I see:


I0707 14:12:27.885504 13297 connection.py:276] Sending request(xid=119): Exists(path='/mysos/state/clusters', watcher=None)
I0707 14:12:27.886852 13297 connection.py:360] Received response(xid=119): ZnodeStat(czxid=9180, mzxid=9180, ctime=1436278298268, mtime=1436278298268, version=0, cversion=1, aversion=0, ephemeralOwner=0, dataLength=0, numChildren=1, pzxid=9181)
I0707 14:12:27.887795 13297 connection.py:276] Sending request(xid=120): Exists(path='/mysos/state/clusters/test_cluster2', watcher=None)
I0707 14:12:27.888803 13297 connection.py:360] Received response(xid=120): ZnodeStat(czxid=9181, mzxid=9214, ctime=1436278298278, mtime=1436278345754, version=29, cversion=0, aversion=0, ephemeralOwner=0, dataLength=1400, numChildren=0, pzxid=9181)
I0707 14:12:27.889123 13297 connection.py:276] Sending request(xid=121): SetData(path='/mysos/state/clusters/test_cluster2', data="ccopy_reg\n_reconstructor\np1\n(cmysos.scheduler.state\nMySQLCluster\np2\nc__builtin__\nobject\np3\nNtRp4\n(dp5\nS'encrypted_password'\np6\ng1\n(cnacl.utils\nEncryptedMessage\np7\nc__b  uiltin__\nstr\np8\nS'A\\xdeI!J\\xef\\x86\\xae\\\\\\xd3\\x92!\\xef0\\xab\\x91\\xf1\\xab\\xbcP\\x95\\xd1\\x18<\\xcf\\xe5}Gu\\xc9\\nK&\\xd3\\x0eAF\\x80D\\x89T\\x06s\\xc1w\\xf4\\x1c\\xe9\\xa4\\xe5\\x10$\\xa2\\x94\\r\\x86\\x00=8&\\xff'\ntRp9\n(dp10\nS'_ciphertext'\np11\nS'\\xcf\\xe5}Gu\\xc9\\nK&\\xd3\\x0eAF\\x80D\\x89T  \\x06s\\xc1w\\xf4\\x1c\\xe9\\xa4\\xe5\\x10$\\xa2\\x94\\r\\x86\\x00=8&\\xff'\np12\nsS'_nonce'\np13\nS'A\\xdeI!J\\xef\\x86\\xae\\\\\\xd3\\x92!\\xef0\\xab\\x91\\xf1\\xab\\xbcP\\x95\\xd1\\x18<'\np14\nsbsS'backup_id'\np15\nNsS'name'\np16\nS'test_cluster2'\np17\nsS'mem'\np18\ng1\n(ctwitter.common.quantity\nAmount\np19\n  g3\nNtRp20\n(dp21\nS'_unit'\np22\ng1\n(ctwitter.common.quantity\nData\np23\ng3\nNtRp24\n(dp25\nS'_multiplier'\np26\nI1048576\nsS'_display'\np27\nS'MB'\np28\nsbsS'_amount'\np29\nI512\nsbsS'cpus'\np30\nF1\nsS'num_nodes'\np31\nI1\nsS'tasks'\np32\n(dp33\nsS'user'\np34\nS'mysos'\np35\nsS'members'\np36\n(dp37\nsS'master  _id'\np38\nNsS'next_epoch'\np39\nI0\nsS'next_id'\np40\nI15\nsS'disk'\np41\ng1\n(g19\ng3\nNtRp42\n(dp43\ng22\ng1\n(g23\ng3\nNtRp44\n(dp45\ng26\nI1073741824\nsg27\nS'GB'\np46\nsbsg29\nI2\nsbsb.", version=-1)
I0707 14:12:27.909009 13297 connection.py:360] Received response(xid=121): ZnodeStat(czxid=9181, mzxid=9215, ctime=1436278298278, mtime=1436278347889, version=30, cversion=0, aversion=0, ephemeralOwner=0, dataLength=1133, numChildren=0, pzxid=9181)
I0707 14:12:27.909272 13297 launcher.py:484] Checkpointed the status update for task mysos-test_cluster2-14 of cluster test_cluster2
I0707 14:12:28.751266 13297 launcher.py:185] Launcher test_cluster2 accepted offer 20150707-140838-83983617-5050-13042-22 on Mesos slave 20150707-140838-83983617-5050-13042-0 (1.125.1.5)
I0707 14:12:28.751960 13297 launcher.py:305] Executor will use environment variable: {u'name': u'MYSOS_DEFAULTS_FILE', u'value': u'/etc/mysql/conf.d/my5.6.cnf'}
I0707 14:12:28.752923 13297 connection.py:276] Sending request(xid=122): Exists(path='/mysos/state/clusters', watcher=None)
I0707 14:12:28.754126 13297 connection.py:360] Received response(xid=122): ZnodeStat(czxid=9180, mzxid=9180, ctime=1436278298268, mtime=1436278298268, version=0, cversion=1, aversion=0, ephemeralOwner=0, dataLength=0, numChildren=1, pzxid=9181)
I0707 14:12:28.754930 13297 connection.py:276] Sending request(xid=123): Exists(path='/mysos/state/clusters/test_cluster2', watcher=None)
I0707 14:12:28.755887 13297 connection.py:360] Received response(xid=123): ZnodeStat(czxid=9181, mzxid=9215, ctime=1436278298278, mtime=1436278347889, version=30, cversion=0, aversion=0, ephemeralOwner=0, dataLength=1133, numChildren=0, pzxid=9181)
I0707 14:12:28.756345 13297 connection.py:276] Sending request(xid=124): SetData(path='/mysos/state/clusters/test_cluster2', data="ccopy_reg\n_reconstructor\np1\n(cmysos.scheduler.state\nMySQLCluster\np2\nc__builtin__\nobject\np3\nNtRp4\n(dp5\nS'encrypted_password'\np6\ng1\n(cnacl.utils\nEncryptedMessage\np7\nc__b  uiltin__\nstr\np8\nS'A\\xdeI!J\\xef\\x86\\xae\\\\\\xd3\\x92!\\xef0\\xab\\x91\\xf1\\xab\\xbcP\\x95\\xd1\\x18<\\xcf\\xe5}Gu\\xc9\\nK&\\xd3\\x0eAF\\x80D\\x89T\\x06s\\xc1w\\xf4\\x1c\\xe9\\xa4\\xe5\\x10$\\xa2\\x94\\r\\x86\\x00=8&\\xff'\ntRp9\n(dp10\nS'_ciphertext'\np11\nS'\\xcf\\xe5}Gu\\xc9\\nK&\\xd3\\x0eAF\\x80D\\x89T  \\x06s\\xc1w\\xf4\\x1c\\xe9\\xa4\\xe5\\x10$\\xa2\\x94\\r\\x86\\x00=8&\\xff'\np12\nsS'_nonce'\np13\nS'A\\xdeI!J\\xef\\x86\\xae\\\\\\xd3\\x92!\\xef0\\xab\\x91\\xf1\\xab\\xbcP\\x95\\xd1\\x18<'\np14\nsbsS'backup_id'\np15\nNsS'name'\np16\nS'test_cluster2'\np17\nsS'mem'\np18\ng1\n(ctwitter.common.quantity\nAmount\np19\n  g3\nNtRp20\n(dp21\nS'_unit'\np22\ng1\n(ctwitter.common.quantity\nData\np23\ng3\nNtRp24\n(dp25\nS'_multiplier'\np26\nI1048576\nsS'_display'\np27\nS'MB'\np28\nsbsS'_amount'\np29\nI512\nsbsS'cpus'\np30\nF1\nsS'num_nodes'\np31\nI1\nsS'tasks'\np32\n(dp33\nVmysos-test_cluster2-15\np34\ng1\n(cmysos.scheduler.state\nMySQL  Task\np35\ng3\nNtRp36\n(dp37\nS'hostname'\np38\nV1.125.1.5\np39\nsS'task_id'\np40\ng34\nsS'mesos_slave_id'\np41\nV20150707-140838-83983617-5050-13042-0\np42\nsS'cluster_name'\np43\ng17\nsS'state'\np44\nI6\nsS'port'\np45\nI31400\nsbssS'user'\np46\nS'mysos'\np47\nsS'members'\np48\n(dp49\nsS'master_id'\np50\nNsS'next  _epoch'\np51\nI0\nsS'next_id'\np52\nI16\nsS'disk'\np53\ng1\n(g19\ng3\nNtRp54\n(dp55\ng22\ng1\n(g23\ng3\nNtRp56\n(dp57\ng26\nI1073741824\nsg27\nS'GB'\np58\nsbsg29\nI2\nsbsb.", version=-1)
I0707 14:12:28.763336 13297 connection.py:360] Received response(xid=124): ZnodeStat(czxid=9181, mzxid=9216, ctime=1436278298278, mtime=1436278348756, version=31, cversion=0, aversion=0, ephemeralOwner=0, dataLength=1400, numChildren=0, pzxid=9181)
I0707 14:12:28.763725 13297 launcher.py:202] Launching task mysos-test_cluster2-15 on Mesos slave 20150707-140838-83983617-5050-13042-0 (1.125.1.5)
I0707 14:12:29.879532 13297 launcher.py:395] Updating state of task mysos-test_cluster2-15 of cluster test_cluster2 from TASK_STAGING to TASK_LOST
E0707 14:12:29.879740 13297 launcher.py:443] Task mysos-test_cluster2-15 is now in terminal state TASK_LOST with message 'Executor terminated'
W0707 14:12:29.879869 13297 launcher.py:474] Slave mysos-test_cluster2-15 of cluster test_cluster2 failed to start running

On the mesos-master:


I0707 14:12:19.743690 13045 master.cpp:3559] Sending 1 offers to framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:19.778714 13043 master.cpp:2169] Processing reply for offers: [ 20150707-140838-83983617-5050-13042-19 ] on slave 20150707-140838-83983617-5050-13042-0 at slave(1)@1.125.1.5:5051 (1.125.1.5) for framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:19.779111 13043 master.hpp:829] Adding task mysos-test_cluster2-12 with resources cpus(mysos):0.99; mem(mysos):480; disk(mysos):2047; ports(mysos):[31531-31531] on slave 20150707-140838-83983617-5050-13042-0 (1.125.1.5)
I0707 14:12:19.779166 13043 master.cpp:2318] Launching task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000 with resources cpus(mysos):0.99; mem(mysos):480; disk(mysos):2047; ports(mysos):[31531-31531] on slave 20150707-140838-83983617-5050-13042-0 at slave(1)@1.125.1.5:5051 (1.125.1.5)
I0707 14:12:19.779368 13043 hierarchical_allocator_process.hpp:563] Recovered cpus(mysos):3; mem(mysos):512; disk(mysos):17952; ports(mysos):[31000-31530, 31532-32000](total allocatable: cpus%28mysos%29:3; mem%28mysos%29:512; disk%28mysos%29:17952; ports%28mysos%29:[31000-31530, 31532-32000]) on slave 20150707-140838-83983617-5050-13042-0 from framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.833107 13049 master.cpp:3229] Executor mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000 on slave 20150707-140838-83983617-5050-13042-0 at slave(1)@1.125.1.5:5051 (1.125.1.5) exited with status 1
I0707 14:12:21.833279 13049 hierarchical_allocator_process.hpp:563] Recovered cpus(mysos):0.01; mem(mysos):32; disk(mysos):1 (total allocatable: cpus(mysos):3.01; mem(mysos):544; disk(mysos):17953; ports(mysos):[31000-31530, 31532-32000]) on slave 20150707-140838-83983617-5050-13042-0 from framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.873121 13050 master.cpp:3180] Forwarding status update TASK_LOST (UUID: bdc5ce90-70c6-4ac4-bf7a-edde3b20c791) for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.873198 13050 master.cpp:3146] Status update TASK_LOST (UUID: bdc5ce90-70c6-4ac4-bf7a-edde3b20c791) for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000 from slave 20150707-140838-83983617-5050-13042-0 at slave(1)@1.125.1.5:5051 (1.125.1.5)
I0707 14:12:21.873272 13050 master.hpp:847] Removing task mysos-test_cluster2-12 with resources cpus(mysos):0.99; mem(mysos):480; disk(mysos):2047; ports(mysos):[31531-31531] on slave 20150707-140838-83983617-5050-13042-0 (1.125.1.5)
I0707 14:12:21.873401 13050 hierarchical_allocator_process.hpp:563] Recovered cpus(mysos):0.99; mem(mysos):480; disk(mysos):2047; ports(mysos):[31531-31531](total allocatable: cpus%28mysos%29:4; mem%28mysos%29:1024; disk%28mysos%29:20000; ports%28mysos%29:[31000-32000]) on slave 20150707-140838-83983617-5050-13042-0 from framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.887816 13046 master.cpp:2661] Forwarding status update acknowledgement bdc5ce90-70c6-4ac4-bf7a-edde3b20c791 for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000 to slave 20150707-140838-83983617-5050-13042-0 at slave(1)@1.125.1.5:5051 (1.125.1.5)

On the mesos-slave:


I0707 14:12:19.779917 13068 slave.cpp:1002] Got assigned task mysos-test_cluster2-12 for framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:19.780154 13068 slave.cpp:3536] Checkpointing FrameworkInfo to '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/framework.info'
I0707 14:12:19.780479 13068 slave.cpp:3543] Checkpointing framework pid '[email protected]:50160' to '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/framework.pid'
I0707 14:12:19.780894 13068 gc.cpp:84] Unscheduling '/tmp/mesos/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000' from gc
I0707 14:12:19.781018 13068 gc.cpp:84] Unscheduling '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000' from gc
I0707 14:12:19.781136 13068 slave.cpp:1112] Launching task mysos-test_cluster2-12 for framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:19.782254 13068 slave.cpp:3857] Checkpointing ExecutorInfo to '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/executors/mysos-test_cluster2-12/executor.info'
I0707 14:12:19.782737 13068 slave.cpp:3972] Checkpointing TaskInfo to '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/executors/mysos-test_cluster2-12/runs/e89291b4-cf6e-43ad-9423-e2740beb9f4d/tasks/mysos-test_cluster2-12/task.info'
I0707 14:12:19.782922 13064 containerizer.cpp:394] Starting container 'e89291b4-cf6e-43ad-9423-e2740beb9f4d' for executor 'mysos-test_cluster2-12' of framework '20150707-140838-83983617-5050-13042-0000'
I0707 14:12:19.782939 13068 slave.cpp:1222] Queuing task 'mysos-test_cluster2-12' for executor mysos-test_cluster2-12 of framework '20150707-140838-83983617-5050-13042-0000
I0707 14:12:19.785490 13064 mem.cpp:479] Started listening for OOM events for container e89291b4-cf6e-43ad-9423-e2740beb9f4d
I0707 14:12:19.786022 13064 mem.cpp:293] Updated 'memory.soft_limit_in_bytes' to 512MB for container e89291b4-cf6e-43ad-9423-e2740beb9f4d
I0707 14:12:19.786504 13068 cpushare.cpp:338] Updated 'cpu.shares' to 1024 (cpus 1) for container e89291b4-cf6e-43ad-9423-e2740beb9f4d
I0707 14:12:19.787155 13064 mem.cpp:358] Updated 'memory.limit_in_bytes' to 512MB for container e89291b4-cf6e-43ad-9423-e2740beb9f4d
I0707 14:12:19.787747 13068 cpushare.cpp:359] Updated 'cpu.cfs_period_us' to 100ms and 'cpu.cfs_quota_us' to 100ms (cpus 1) for container e89291b4-cf6e-43ad-9423-e2740beb9f4d
I0707 14:12:19.789216 13068 linux_launcher.cpp:191] Cloning child process with flags = 0
I0707 14:12:19.790909 13068 containerizer.cpp:678] Checkpointing executor's forked pid 13451 to '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/executors/mysos-test_cluster2-12/runs/e89291b4-cf6e-43ad-9423-e2740beb9f4d/pids/forked.pid'
I0707 14:12:19.793015 13068 containerizer.cpp:510] Fetching URIs for container 'e89291b4-cf6e-43ad-9423-e2740beb9f4d' using command '/usr/local/libexec/mesos/mesos-fetcher'
I0707 14:12:20.824784 13070 containerizer.cpp:882] Destroying container 'e89291b4-cf6e-43ad-9423-e2740beb9f4d'
E0707 14:12:20.825043 13067 slave.cpp:2485] Container 'e89291b4-cf6e-43ad-9423-e2740beb9f4d' for executor 'mysos-test_cluster2-12' of framework '20150707-140838-83983617-5050-13042-0000' failed to start: Failed to fetch URIs for container 'e89291b4-cf6e-43ad-9423-e2740beb9f4d': exit status 256
I0707 14:12:20.826287 13070 cgroups.cpp:2208] Freezing cgroup /sys/fs/cgroup/freezer/mesos/e89291b4-cf6e-43ad-9423-e2740beb9f4d
I0707 14:12:20.827714 13063 cgroups.cpp:1375] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos/e89291b4-cf6e-43ad-9423-e2740beb9f4d after 1.239808ms
I0707 14:12:20.828982 13063 cgroups.cpp:2225] Thawing cgroup /sys/fs/cgroup/freezer/mesos/e89291b4-cf6e-43ad-9423-e2740beb9f4d
I0707 14:12:20.830205 13063 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/e89291b4-cf6e-43ad-9423-e2740beb9f4d after 1.078016ms
I0707 14:12:21.826225 13070 containerizer.cpp:997] Executor for container 'e89291b4-cf6e-43ad-9423-e2740beb9f4d' has exited
I0707 14:12:21.831550 13067 slave.cpp:2596] Executor 'mysos-test_cluster2-12' of framework 20150707-140838-83983617-5050-13042-0000 exited with status 1
E0707 14:12:21.831750 13065 slave.cpp:2866] Failed to unmonitor container for executor mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000: Not monitored
I0707 14:12:21.832567 13067 slave.cpp:2088] Handling status update TASK_LOST (UUID: bdc5ce90-70c6-4ac4-bf7a-edde3b20c791) for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000 from @0.0.0.0:0
W0707 14:12:21.832794 13064 containerizer.cpp:788] Ignoring update for unknown container: e89291b4-cf6e-43ad-9423-e2740beb9f4d
I0707 14:12:21.833605 13064 status_update_manager.cpp:320] Received status update TASK_LOST (UUID: bdc5ce90-70c6-4ac4-bf7a-edde3b20c791) for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.833945 13064 status_update_manager.hpp:342] Checkpointing UPDATE for status update TASK_LOST (UUID: bdc5ce90-70c6-4ac4-bf7a-edde3b20c791) for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.872320 13064 status_update_manager.cpp:373] Forwarding status update TASK_LOST (UUID: bdc5ce90-70c6-4ac4-bf7a-edde3b20c791) for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000 to [email protected]:5050
I0707 14:12:21.888372 13064 status_update_manager.cpp:398] Received status update acknowledgement (UUID: bdc5ce90-70c6-4ac4-bf7a-edde3b20c791) for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.888582 13064 status_update_manager.hpp:342] Checkpointing ACK for status update TASK_LOST (UUID: bdc5ce90-70c6-4ac4-bf7a-edde3b20c791) for task mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.927053 13064 slave.cpp:2732] Cleaning up executor 'mysos-test_cluster2-12' of framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.927609 13064 slave.cpp:2807] Cleaning up framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.927803 13067 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/executors/mysos-test_cluster2-12/runs/e89291b4-cf6e-43ad-9423-e2740beb9f4d' for gc 6.99998926540444days in the future
I0707 14:12:21.927835 13068 status_update_manager.cpp:282] Closing status update streams for framework 20150707-140838-83983617-5050-13042-0000
I0707 14:12:21.927999 13067 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/executors/mysos-test_cluster2-12' for gc 6.99998926446815days in the future
I0707 14:12:21.928270 13067 gc.cpp:56] Scheduling '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/executors/mysos-test_cluster2-12/runs/e89291b4-cf6e-43ad-9423-e2740beb9f4d' for gc 6.99998926416593days in the future
I0707 14:12:21.928318 13067 gc.cpp:56] Scheduling '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000/executors/mysos-test_cluster2-12' for gc 6.99998926392296days in the future
I0707 14:12:21.928351 13067 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000' for gc 6.99998926238222days in the future
I0707 14:12:21.928390 13067 gc.cpp:56] Scheduling '/tmp/mesos/meta/slaves/20150707-140838-83983617-5050-13042-0/frameworks/20150707-140838-83983617-5050-13042-0000' for gc 6.99998926207111days in the future

When I check the mesos-slave logs, I see this:


E0707 14:12:20.825043 13067 slave.cpp:2485] Container 'e89291b4-cf6e-43ad-9423-e2740beb9f4d' for executor 'mysos-test_cluster2-12' of framework '20150707-140838-83983617-5050-13042-0000' failed to start: Failed to fetch URIs for container 'e89291b4-cf6e-43ad-9423-e2740beb9f4d': exit status 256
E0707 14:12:21.831750 13065 slave.cpp:2866] Failed to unmonitor container for executor mysos-test_cluster2-12 of framework 20150707-140838-83983617-5050-13042-0000: Not monitored

And here is my sandbox stderr, which explains more:

I0707 15:47:32.898041 15795 fetcher.cpp:76] Fetching URI '/home/ubuntu/mysos/dist/mysos-0.1.0-dev0.zip'
I0707 15:47:32.898449 15795 fetcher.cpp:179] Copying resource from '/home/ubuntu/mysos/dist/mysos-0.1.0-dev0.zip' to '/tmp/mesos/slaves/20150707-154245-83983617-5050-14919-0/frameworks/20150707-154245-83983617-5050-14919-0000/executors/mysos-test_cluster2-78/runs/fe5ddc90-5cdf-49fd-8013-e8eb0e451e3f'
cp: cannot stat ‘/home/ubuntu/mysos/dist/mysos-0.1.0-dev0.zip’: No such file or directory
E0707 15:47:32.909354 15795 fetcher.cpp:184] Failed to copy '/home/ubuntu/mysos/dist/mysos-0.1.0-dev0.zip' : Exit status 256
Failed to fetch: /home/ubuntu/mysos/dist/mysos-0.1.0-dev0.zip
Failed to synchronize with slave (it's probably exited) 

When I check my dist dir, mysos-0.1.0_dev0-py2.7.egg is there, but there are no .zip files!
What have I missed during the installation? The build must have produced it at some point!

  • A few notes about my setup:
    • all of the services run on the same node, but they use the private IP (1.125.1.5), not localhost.
    • my network has a man-in-the-middle proxy.

Any idea what is wrong here?

Cannot build package

Or it might just be that I cannot figure out how.

Steps to reproduce:

First, start with a clean Ubuntu 14.04 x64 server (e.g. from digitalocean.com)
Then, run these commands:

# ===========
# Follow instructions from https://docs.mesosphere.com/getting-started/datacenter/install/
# Setup
apt-key adv --keyserver keyserver.ubuntu.com --recv E56151BF
DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]')
CODENAME=$(lsb_release -cs)

# Add the repository
echo "deb http://repos.mesosphere.io/${DISTRO} ${CODENAME} main" | \
  sudo tee /etc/apt/sources.list.d/mesosphere.list
apt-get update

apt-get install mesos marathon
# ===========

apt-get install git
git clone https://github.com/twitter/mysos.git
cd mysos
apt-get install python-pip
pip install virtualenv
virtualenv venv
source venv/bin/activate
cd 3rdparty
wget http://downloads.mesosphere.io/master/ubuntu/14.04/mesos-0.22.1-py2.7-linux-x86_64.egg
wheel convert mesos-0.22.1-py2.7-linux-x86_64.egg
cd ..

python setup.py install
mysos_scheduler

Expected: the scheduler starts.
Actual:

Traceback (most recent call last):
  File "/root/mysos/venv/bin/mysos_scheduler", line 9, in <module>
    load_entry_point('mysos==0.1.0.dev0', 'console_scripts', 'mysos_scheduler')()
  File "/root/mysos/venv/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 552, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/root/mysos/venv/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2672, in load_entry_point
    return ep.load()
  File "/root/mysos/venv/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2344, in load
    self.require(*args, **kwargs)
  File "/root/mysos/venv/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2361, in require
    items = working_set.resolve(reqs, env, installer)
  File "/root/mysos/venv/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 833, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'pynacl<1,>=0.3.0' distribution was not found and is required by the application
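
A likely fix (a guess based on the error above, not a verified resolution) is to install the missing dependency into the virtualenv manually and retry:

pip install "pynacl>=0.3.0,<1"
mysos_scheduler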

Production executor entry point broken

The mysos_executor entry point references mysos.executor.mysos_executor:proxy_main, which does not exist. What is the proper way to get an executor for use outside of Vagrant?

finding the MySQL Cluster port

I'm trying to find an easy way to get the host and port of the cluster that gets created with the curl POST.
I tried writing a simple client that just gets the host and port from ZooKeeper using the utilities provided at https://github.com/twitter/commons/tree/master/src/java/com/twitter/common/zookeeper and/or
https://github.com/twitter/commons/tree/master/src/python/twitter/common/zookeeper.
However, the problems are:

  1. I don't know which class to use/call, since there is no Javadoc or other documentation available.
  2. I tried running a few of the classes (e.g. cli.py), but they all depend on twitter commons, which I can't manage to build (pants does not work behind my server's proxy).

Can you help me with my two questions:

  1. Is there a ready-built client that I can just run? If so, can you tell me where it is?
  2. If I have to write a client, which class (in Java or Python) should I use?

I appreciate any help.
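
For what it's worth, here is a minimal sketch of a Python client that reads the cluster endpoints directly from ZooKeeper with kazoo, sidestepping the twitter commons build entirely. The znode path below is an assumption about where the scheduler registers cluster members (derived from the --zk_url values shown on this page); the exact layout may differ:

#!/usr/bin/env python
# Sketch only: ZK_HOSTS and CLUSTER_PATH are illustrative assumptions.
from kazoo.client import KazooClient

ZK_HOSTS = '1.125.1.5:2181'                     # the host from your --zk_url
CLUSTER_PATH = '/mysos/discover/test_cluster3'  # assumed member registry path

zk = KazooClient(hosts=ZK_HOSTS)
zk.start()
try:
    # Each child znode should correspond to one MySQL instance; its data
    # holds the serialized endpoint (host and port) for that instance.
    for member in zk.get_children(CLUSTER_PATH):
        data, _stat = zk.get('%s/%s' % (CLUSTER_PATH, member))
        print member, repr(data)
finally:
    zk.stop()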

Master and slave have equal MySQL server ids

Fatal error: The slave I/O thread stops because master and slave have equal MySQL server ids; these ids must be different for replication to work (or the --replicate-same-server-id option must be used on slave but this does not always make sense; please check the manual before using it).

Replication does not start.
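
For reference, MySQL replication requires every instance to have a distinct server_id, so each instance's defaults file needs its own value (the values below are illustrative):

# master's defaults file
[mysqld]
server-id = 1

# slave's defaults file (must differ from the master's)
[mysqld]
server-id = 2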

about mysos-0.1.0-dev0.zip

Hello, sorry for the stupid question.

What is mysos-0.1.0-dev0.zip, and where can I find it or how can I build it?
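
mysos-0.1.0-dev0.zip is the executor bundle that --executor_uri points at. Judging by another report further down this page, which passes --executor_uri=/home/huajianfeng/.tox/distshare/mysos-0.1.0-dev0.zip (tox's default distshare directory), it appears to be produced by the project's tox build. This is an inference from that path, not verified against the build scripts:

pip install tox
tox                                      # run from the repo root
ls ~/.tox/distshare/mysos-0.1.0-dev0.zip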

Running MySQL on shared Mesos slave pool

Currently Mysos requires Mesos slaves to dedicate all of their resources to it, and MySQL usually requires special boxes with large disks. However, for certain test use cases this may not hold, and it would be nice if Mysos could be configured to colocate its tasks with tasks from other Mesos frameworks on shared slaves (see the sketch below).
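
For illustration, Mesos already allows a slave to split its resources between roles, so a shared slave could advertise only part of its capacity under the mysos role and leave the rest in the default role. The flags below are a sketch of that idea, not a tested configuration:

sudo mesos-slave \
--master=zk://<zk_host>:2181/mesos/master \
--resources="cpus(mysos):2;cpus(*):6;mem(mysos):4096;mem(*):12288;disk(mysos):20000;disk(*):80000"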

Leverage GTID to properly failover MySQL master instances.

The current Mysos master failover code is capable of reliably detecting the dead instance, sending queries to find the "most current" slave, and sending the commands to promote the new master and reparent the slaves. However, without GTID, the scripts that Mysos invokes to do these things aren't sufficient, because some files need to be copied out of band by tools such as MHA.

If we leverage GTID in MySQL 5.6, we can make failover work without relying on external tools.
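
For context, enabling GTID-based replication in MySQL 5.6 requires settings along these lines in each instance's defaults file (these are standard MySQL 5.6 requirements, not Mysos-specific configuration):

[mysqld]
gtid_mode                = ON
enforce_gtid_consistency = ON          # 5.6 requires this with gtid_mode=ON
log_bin                  = mysql-bin   # binary logging must be enabled
log_slave_updates        = ON          # 5.6 requires this on replicas with GTID

With GTID enabled, reparenting a slave reduces to CHANGE MASTER TO MASTER_HOST='<new_master>', MASTER_AUTO_POSITION=1, with no binlog file/position bookkeeping or out-of-band file copies.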

"Mysos scheduler is still connecting...".

I started the Mysos scheduler listening on port 55001. The page at http://myhost:55001 always displays "Mysos scheduler is still connecting...". Below are the scheduler's logs. Is something wrong?
My command is:
#!/bin/sh
ZK_HOST=10.175.100.231
API_PORT=55001

# NOTE: In --executor_environ we are pointing MYSOS_DEFAULTS_FILE to an empty MySQL defaults file.
# The file 'my5.6.cnf' is pre-installed by the 'mysql-server-5.6' package on the VM.
mysos_scheduler \
    --port=$API_PORT \
    --framework_user=vagrant \
    --mesos_master=zk://$ZK_HOST:2184/mesos \
    --executor_uri=/home/huajianfeng/.tox/distshare/mysos-0.1.0-dev0.zip \
    --executor_cmd=/home/huajianfeng/incubator-cotton-master/vagrant/bin/mysos_executor.sh \
    --zk_url=zk://$ZK_HOST:2184/mysos \
    --admin_keypath=/home/huajianfeng/incubator-cotton-master/vagrant/etc/admin_keyfile.yml \
    --framework_failover_timeout=1m \
    --framework_role=* \
    --scheduler_keypath=/home/huajianfeng/incubator-cotton-master/vagrant/etc/scheduler_keyfile.txt \
    --executor_source_prefix='vagrant.devcluster' \
    --executor_environ='[{"name": "MYSOS_DEFAULTS_FILE", "value": "/etc/mysql/conf.d/my5.6.cnf"}]'

My output is:
I1124 00:13:35.048789 178455 mysos_scheduler.py:219] Extracted web assets into /tmp/mysos
I1124 00:13:35.048928 178455 mysos_scheduler.py:244] Starting Mysos scheduler
I1124 00:13:35.050659 178455 connection.py:566] Connecting to 10.175.100.231:2184
I1124 00:13:35.051512 178455 connection.py:276] Sending request(xid=None): Connect(protocol_version=0, last_zxid_seen=0, time_out=10000, session_id=0, passwd='\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', read_only=None)
I1124 00:13:35.055763 178455 client.py:378] Zookeeper connection established, state: CONNECTED
I1124 00:13:35.058011 178455 mysos_scheduler.py:250] Using ZooKeeper (path: /mysos) for state storage
I1124 00:13:35.058290 178455 connection.py:276] Sending request(xid=1): GetData(path='/mysos/state/scheduler', watcher=None)
I1124 00:13:35.059406 178455 connection.py:360] Received response(xid=1): ("ccopy_reg\n_reconstructor\np1\n(cmysos.scheduler.state\nScheduler\np2\nc__builtin__\nobject\np3\nNtRp4\n(dp5\nS'framework_info'\np6\ncmesos.interface.mesos_pb2\nFrameworkInfo\np7\n(tRp8\n(dp9\nS'serialized'\np10\nS'\n\x07vagrant\x12\x05mysos!\x00\x00\x00\x00\x00\x00N@(\x012\x01*'\np11\nsbsS'clusters'\np12\ng1\n(ctwitter.common.collections.orderedset\nOrderedSet\np13\ng3\nNtRp14\n(dp15\nS'map'\np16\n(dp17\nsS'end'\np18\n(lp19\nNag19\nag19\nasbsb.", ZnodeStat(czxid=259437, mzxid=259437, ctime=1448290870564, mtime=1448290870564, version=0, cversion=0, aversion=0, ephemeralOwner=0, dataLength=410, numChildren=0, pzxid=259437))
I1124 00:13:35.059720 178455 mysos_scheduler.py:262] Successfully restored scheduler state
2015-11-24 00:13:35,066:178455(0x7f36717f2700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
2015-11-24 00:13:35,066:178455(0x7f36717f2700):ZOO_INFO@log_env@716: Client environment:host.name=hare229
2015-11-24 00:13:35,066:178455(0x7f36717f2700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
2015-11-24 00:13:35,066:178455(0x7f36717f2700):ZOO_INFO@log_env@724: Client environment:os.arch=3.16.0-30-generic
2015-11-24 00:13:35,066:178455(0x7f36717f2700):ZOO_INFO@log_env@725: Client environment:os.version=#40~14.04.1-Ubuntu SMP Thu Jan 15 17:43:14 UTC 2015
I1124 00:13:35.066289 178455 sched.cpp:164] Version: 0.25.0
2015-11-24 00:13:35,066:178455(0x7f36717f2700):ZOO_INFO@log_env@733: Client environment:user.name=pangbingqiang
2015-11-24 00:13:35,066:178455(0x7f36717f2700):ZOO_INFO@log_env@741: Client environment:user.home=/root
2015-11-24 00:13:35,066:178455(0x7f36717f2700):ZOO_INFO@log_env@753: Client environment:user.dir=/home/huajianfeng/incubator-cotton-master
2015-11-24 00:13:35,066:178455(0x7f36717f2700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=10.175.100.231:2184 sessionTimeout=10000 watcher=0x7f3690e18692 sessionId=0 sessionPasswd= context=0x7f3630000e90 flags=0
2015-11-24 00:13:35,070:178455(0x7f365ffff700):ZOO_INFO@check_events@1703: initiated connection to server [10.175.100.231:2184]
2015-11-24 00:13:35,073:178455(0x7f365ffff700):ZOO_INFO@check_events@1750: session establishment complete on server [10.175.100.231:2184], sessionId=0x15134d48dc00014, negotiated timeout=10000
I1124 00:13:35.073601 178493 group.cpp:331] Group process (group(1)@10.175.102.229:58721) connected to ZooKeeper
I1124 00:13:35.073673 178493 group.cpp:805] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I1124 00:13:35.073709 178493 group.cpp:403] Trying to create path '/mesos' in ZooKeeper
I1124 00:13:35.076272 178487 detector.cpp:156] Detected a new leader: (id='56')
I1124 00:13:35.076438 178492 group.cpp:674] Trying to get '/mesos/info_0000000056' in ZooKeeper
W1124 00:13:35.077445 178479 detector.cpp:444] Leading master [email protected]:5050 is using a Protobuf binary format when registering with ZooKeeper (info): this will be deprecated as of Mesos 0.24 (see MESOS-2340)
I1124 00:13:35.077500 178479 detector.cpp:481] A new leading master ([email protected]:5050) is detected
I1124 00:13:35.077623 178476 sched.cpp:262] New master detected at [email protected]:5050
I1124 00:13:35.077847 178476 sched.cpp:272] No credentials provided. Attempting to register without authentication
Bottle v0.11.6 server starting up (using CherryPyServer())...
Listening on http://0.0.0.0:55001/
Hit Ctrl-C to quit.

Mysos CLI.

It would be nice to have a CLI that wraps the RESTful API.
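
As a starting point, here is a minimal sketch of such a CLI using python-requests. The only call grounded in this page is cluster creation (the curl example in the first issue); everything else would need to be filled in as the API grows:

#!/usr/bin/env python
# Sketch of a 'mysos' CLI wrapping the scheduler's REST API.
import argparse
import requests

def main():
    parser = argparse.ArgumentParser(prog='mysos')
    parser.add_argument('--api', default='http://localhost:55001',
                        help='base URL of the Mysos scheduler API')
    sub = parser.add_subparsers(dest='command')
    create = sub.add_parser('create', help='create a MySQL cluster')
    create.add_argument('name')
    create.add_argument('--cluster_user', default='mysos')
    args = parser.parse_args()

    if args.command == 'create':
        # Equivalent to: curl -X POST <api>/clusters/<name> --form "cluster_user=..."
        resp = requests.post('%s/clusters/%s' % (args.api, args.name),
                             data={'cluster_user': args.cluster_user})
        print resp.status_code, resp.text

if __name__ == '__main__':
    main()

(requests sends the field as application/x-www-form-urlencoded rather than multipart like curl --form; Bottle's request.forms should accept either.)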

is mysos runnable?

I started the Mysos scheduler listening on port 5500. The page at http://myhost:5500 always displays "Mysos scheduler is still connecting...". Below are the scheduler's logs. Is something wrong?

My command is:

python ./scheduler.py \
    --mesos_master=zk://172.31.15.246:2181/mesos \
    --port=5500 \
    --framework_user=mysos \
    --executor_uri=/home/vagrant/mysos/dist/mysos-0.1.0-dev0.zip \
    --executor_cmd=/home/vagrant/mysos/vagrant/bin/mysos_executor.sh \
    --zk_url=zk://172.31.15.246:2181/mysos \
    --admin_keypath=/home/ubuntu/mysos/vagrant/etc/admin_keyfile.yml \
    --framework_failover_timeout=1m \
    --framework_role=mysos \
    --scheduler_keypath=/home/ubuntu/mysos/vagrant/etc/scheduler_keyfile.txt \
    --executor_source_prefix='vagrant.devcluster' \
    --executor_environ='[{"name": "MYSOS_DEFAULTS_FILE", "value": "/etc/mysql/conf.d/my5.6.cnf"}]'

And the content of scheduler.py is:

#!/usr/bin/python

from os.path import join, abspath
import sys

ROOT_DIR = abspath('./')
sys.path.insert(0, ROOT_DIR)

from mysos.scheduler.mysos_scheduler import proxy_main

if __name__ == '__main__':
    proxy_main()

The output is:

root@mingqi-dev:~/mysos# ./scheduler.sh
I0603 15:34:57.013078 17733 mysos_scheduler.py:177] Options in use: {'framework_failover_timeout': '1m', 'twitter_common_log_simple': False, 'verbose': None, 'twitter_common_app_daemon_stdout': '/dev/null', 'twitter_common_log_scribe_category': 'python_default', 'api_port': 5500, 'twitter_common_log_log_dir': '/var/tmp', 'twitter_common_app_daemonize': False, 'twitter_common_app_ignore_rc_file': False, 'twitter_common_app_profiling': False, 'work_dir': '/tmp/mysos', 'twitter_common_app_pidfile': None, 'twitter_common_log_scribe_buffer': False, 'executor_source_prefix': 'vagrant.devcluster', 'election_timeout': '60s', 'twitter_common_app_rc_filename': False, 'framework_role': 'mysos', 'executor_environ': '[{"name": "MYSOS_DEFAULTS_FILE", "value": "/etc/mysql/conf.d/my5.6.cnf"}]', 'twitter_common_log_scribe_log_level': 'NONE', 'executor_uri': '/home/vagrant/mysos/dist/mysos-0.1.0-dev0.zip', 'twitter_common_log_disk_log_level': 'NONE', 'twitter_common_log_stderr_log_level': 'ERROR', 'framework_authentication_file': None, 'state_storage': 'zk', 'executor_cmd': '/home/vagrant/mysos/vagrant/bin/mysos_executor.sh', 'twitter_common_app_profile_output': None, 'framework_user': 'mysos', 'zk_url': 'zk://172.31.15.246:2181/mysos', 'twitter_common_app_debug': False, 'twitter_common_log_scribe_port': 1463, 'twitter_common_log_scribe_host': 'localhost', 'scheduler_keypath': '/home/ubuntu/mysos/vagrant/etc/scheduler_keyfile.txt', 'installer_args': None, 'backup_store_args': None, 'mesos_master': 'zk://172.31.15.246:2181/mesos', 'admin_keypath': '/home/ubuntu/mysos/vagrant/etc/admin_keyfile.yml', 'twitter_common_app_daemon_stderr': '/dev/null'}
I0603 15:34:57.016001 17733 mysos_scheduler.py:219] Extracted web assets into /tmp/mysos
I0603 15:34:57.016411 17733 mysos_scheduler.py:244] Starting Mysos scheduler
I0603 15:34:57.027303 17733 connection.py:566] Connecting to 172.31.15.246:2181
I0603 15:34:57.028554 17733 connection.py:276] Sending request(xid=None): Connect(protocol_version=0, last_zxid_seen=0, time_out=10000, session_id=0, passwd='\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', read_only=None)
I0603 15:34:57.031117 17733 client.py:378] Zookeeper connection established, state: CONNECTED
I0603 15:34:57.034564 17733 mysos_scheduler.py:250] Using ZooKeeper (path: /mysos) for state storage
I0603 15:34:57.034933 17733 connection.py:276] Sending request(xid=1): GetData(path='/mysos/state/scheduler', watcher=None)
I0603 15:34:57.036081 17733 connection.py:360] Received response(xid=1): ("ccopy_reg\n_reconstructor\np1\n(cmysos.scheduler.state\nScheduler\np2\nc__builtin__\nobject\np3\nNtRp4\n(dp5\nS'framework_info'\np6\ncmesos.interface.mesos_pb2\nFrameworkInfo\np7\n(tRp8\n(dp9\nS'serialized'\np10\nS'\\n\\x05mysos\\x12\\x05mysos!\\x00\\x00\\x00\\x00\\x00\\x00N@(\\x012\\x05mysosB\\x05mysos'\np11\nsbsS'clusters'\np12\ng1\n(ctwitter.common.collections.orderedset\nOrderedSet\np13\ng3\nNtRp14\n(dp15\nS'map'\np16\n(dp17\nsS'end'\np18\n(lp19\nNag19\nag19\nasbsb.", ZnodeStat(czxid=17, mzxid=17, ctime=1433316463490, mtime=1433316463490, version=0, cversion=0, aversion=0, ephemeralOwner=0, dataLength=422, numChildren=0, pzxid=17))
I0603 15:34:57.036864 17733 mysos_scheduler.py:262] Successfully restored scheduler state
I0603 15:34:57.039196 17733 sched.cpp:139] Version: 0.20.1
2015-06-03 15:34:57,041:17733(0x7f90d37fe700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
2015-06-03 15:34:57,053:17733(0x7f90d37fe700):ZOO_INFO@log_env@716: Client environment:host.name=mingqi-dev
2015-06-03 15:34:57,053:17733(0x7f90d37fe700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
2015-06-03 15:34:57,054:17733(0x7f90d37fe700):ZOO_INFO@log_env@724: Client environment:os.arch=3.13.0-44-generic
2015-06-03 15:34:57,054:17733(0x7f90d37fe700):ZOO_INFO@log_env@725: Client environment:os.version=#73-Ubuntu SMP Tue Dec 16 00:22:43 UTC 2014
2015-06-03 15:34:57,054:17733(0x7f90d37fe700):ZOO_INFO@log_env@733: Client environment:user.name=ubuntu
2015-06-03 15:34:57,055:17733(0x7f90d37fe700):ZOO_INFO@log_env@741: Client environment:user.home=/root
2015-06-03 15:34:57,055:17733(0x7f90d37fe700):ZOO_INFO@log_env@753: Client environment:user.dir=/home/ubuntu/mysos
2015-06-03 15:34:57,055:17733(0x7f90d37fe700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=172.31.15.246:2181 sessionTimeout=10000 watcher=0x7f90e525bcc0 sessionId=0 sessionPasswd=<null> context=0x7f90cc0013a0 flags=0
2015-06-03 15:34:57,057:17733(0x7f90d09c9700):ZOO_INFO@check_events@1703: initiated connection to server [172.31.15.246:2181]
2015-06-03 15:34:57,059:17733(0x7f90d09c9700):ZOO_INFO@check_events@1750: session establishment complete on server [172.31.15.246:2181], sessionId=0x14db7918a73000e, negotiated timeout=10000
I0603 15:34:57.060551 17745 group.cpp:313] Group process (group(1)@127.0.0.1:38704) connected to ZooKeeper
I0603 15:34:57.060619 17745 group.cpp:787] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I0603 15:34:57.060670 17745 group.cpp:385] Trying to create path '/mesos' in ZooKeeper
Bottle v0.11.6 server starting up (using CherryPyServer())...
Listening on http://0.0.0.0:5500/
I0603 15:34:57.076939 17745 detector.cpp:138] Detected a new leader: (id='1')
I0603 15:34:57.077203 17745 group.cpp:658] Trying to get '/mesos/info_0000000001' in ZooKeeper
Hit Ctrl-C to quit.

I0603 15:34:57.085907 17745 detector.cpp:426] A new leading master ([email protected]:5050) is detected
I0603 15:34:57.086035 17745 sched.cpp:235] New master detected at [email protected]:5050
I0603 15:34:57.086138 17745 sched.cpp:243] No credentials provided. Attempting to register without authentication

Cross DC/Mesos pool replication support.

Organizations often enable cross-DC replication for MySQL clusters, but Mysos currently makes the tacit assumption that all hosts are in the same DC, since a Mesos cluster typically runs in a single DC. We would need to implement a higher-level scheduler that can manage replication across DCs.

Use 'disk' resource in Mesos offers

We haven't been using it because it is not enforced by the Mesos slave, and we plan to leverage the persistent resource primitives soon.

However, with disk not being accounted for, we sometimes run into situations where disk space becomes a resource bottleneck on hosts (e.g., when an instance restores from a huge backup). We should start with a simple implementation that specifies disk resources as if they were enforced.

Support backing up MySQL master

Mysos currently supports restoring a MySQL cluster from a backup (see BackupStore). We also need to support creating backups on a regular basis; we could add a BackupStore.backup() method to the interface (see the sketch below).
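
A rough sketch of what the extended interface could look like; the names, signatures, and docstrings below are illustrative, and the real BackupStore in mysos may differ:

from abc import ABCMeta, abstractmethod

class BackupStore(object):
    """Interface for storing and retrieving MySQL backups."""
    __metaclass__ = ABCMeta

    @abstractmethod
    def restore(self):
        """Restore the MySQL state from the store (existing behavior)."""

    @abstractmethod
    def backup(self):
        """Proposed: snapshot the local MySQL data, upload it to the store,
        and return an identifier that restore() can later consume."""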

Questions about the state of mysos within twitter?

hello @xujyan

I'm not sure of the best medium for posting these questions, whether a GitHub issue or one of the following mailing lists; let me know:
[email protected]
[email protected]

I have some general questions about the state of mysos/cotton within Twitter. My guess is that each of the features currently available within Twitter will be slowly open sourced and made generic so that different implementations can be contributed by the community. I am curious what they look like at Twitter right now.

I am curious about how Twitter currently uses Mysos to do failover. It seems like it is using reparent.sh; as you mentioned here, without GTID "some files need to be copied out of band by tools such as MHA". Does the version running inside Twitter already have integration with MHA, or is there another mechanism that brings the binlog to the new master?

Also, I see that there is the concept of a BackupStore. Does Twitter currently use HDFS to back up the files? I see some sample code related to it here, but I'm not sure if it is just an idea or whether Twitter already has an implementation of it.

Thanks again for open sourcing Cotton; we look forward to more news.

Better MySQL master failure detection

Currently the Mysos executor monitors the health of the MySQL master only by making sure the process is alive.

More subtle failure cases could be captured by more sophisticated health checking (e.g., having executors regularly attempt SQL queries against the master and report to the scheduler if they fail; see the sketch below). The scheduler could then determine the point at which a master election is required.
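
A minimal sketch of such a check, using the MySQLdb (mysql-python) driver; the credentials and timeout are illustrative, and how the result is reported back to the scheduler is left open:

import MySQLdb

def master_healthy(host, port, user, passwd, timeout=2):
    """Return True if the MySQL master answers a trivial query in time."""
    try:
        conn = MySQLdb.connect(host=host, port=port, user=user,
                               passwd=passwd, connect_timeout=timeout)
        try:
            cursor = conn.cursor()
            cursor.execute('SELECT 1')
            return cursor.fetchone() == (1,)
        finally:
            conn.close()
    except MySQLdb.Error:
        return False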

Support slave-only Mysos clusters

This way, MySQL slaves in Mysos would provide no master failover, but rather a scale-out option for adding more replicas to serve more read-only (RO) traffic. This may be an interesting use case for helping existing MySQL DBs transition into Mysos.

add support for adding more instances

Currently the README talks about being able to scale up and down:

An elastic solution that allows users to easily scale up and down a MySQL cluster by changing the number of slave instances

But this is neither documented in the user guide nor part of http.py, so I assume it is missing. We should add it then :) (a sketch of a possible endpoint follows).
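
Here is a sketch of what the missing endpoint might look like in http.py. Mysos serves its API with Bottle (per the "Bottle v0.11.6 server starting up" lines in the logs above), but the route, form field, and scheduler hook below are all hypothetical:

import bottle

@bottle.post('/clusters/<clustername>/instances')
def add_instances(clustername):
    # 'count' is a hypothetical form field naming how many slaves to add.
    count = int(bottle.request.forms.get('count', 1))
    # scheduler.add_instances(clustername, count)  # hypothetical scheduler hook
    return 'Adding %d instance(s) to cluster %s\n' % (count, clustername)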
