Docker image containing Golang that is compatible with the Raspberry Pi.
make build
make version
- First, use a
docker login
with username, password and email address.
- Second, push the Docker image to the official Docker Hub:
make push
Hypriot Cluster Lab
Home Page: http://blog.hypriot.com
License: MIT License
Work in progress here: https://github.com/hypriot/cluster-lab/tree/move_to_Travis-CI
In the Hypriot community channel (https://gitter.im/hypriot/talk), lispmeister reported:
@MathiasRenner we tested the new (0.1.1) cluster lab SD image and we see more than one node declaring itself swarm master. Still trying to figure out why that's happening. Using the old image (0.1) with a patched cluster-start.sh script works reliably, though.
Although our own tests and several community users confirmed that the latest release works fine, this error also occurred in my environment tonight.
Quickfix:
On the node that should be slave, run
sudo systemctl start cluster-stop
sudo systemctl start cluster-start
Let's debug and fix this:
The relevant parts of the log output of journalctl:
...
Jan 01 01:00:13 slave1 ntpd[545]: Listen normally on 3 eth0 192.168.0.20 UDP 123
IP address via DHCP is assigned. OK.
Jan 01 01:00:13 slave1 ntpd[545]: Listening on routing socket on fd #24 for interface updates
Jan 01 01:00:13 slave1 systemd[1]: Started LSB: Start NTP daemon.
Jan 01 01:00:13 slave1 ifup[246]: Restarting ntp (via systemctl): ntp.service.
Jan 01 01:00:14 slave1 dhclient[254]: bound to 192.168.0.20 -- renewal in 375250 seconds.
Jan 01 01:00:14 slave1 ifup[246]: bound to 192.168.0.20 -- renewal in 375250 seconds.
Jan 01 01:00:14 slave1 ntpdate[588]: the NTP socket is in use, exiting
Jan 01 01:00:14 slave1 systemd[1]: Reloading OpenBSD Secure Shell server.
Jan 01 01:00:14 slave1 sshd[261]: Received SIGHUP; restarting.
Jan 01 01:00:14 slave1 systemd[1]: Reloaded OpenBSD Secure Shell server.
Jan 01 01:00:14 slave1 sshd[261]: Could not load host key: /etc/ssh/ssh_host_rsa_key
Jan 01 01:00:14 slave1 sshd[261]: Could not load host key: /etc/ssh/ssh_host_dsa_key
Jan 01 01:00:14 slave1 sshd[261]: Could not load host key: /etc/ssh/ssh_host_ecdsa_key
Jan 01 01:00:14 slave1 sshd[261]: Could not load host key: /etc/ssh/ssh_host_ed25519_key
Jan 01 01:00:14 slave1 sshd[261]: Server listening on 0.0.0.0 port 22.
Jan 01 01:00:14 slave1 sshd[261]: Server listening on :: port 22.
Jan 01 01:00:14 slave1 ntpd_intres[312]: parent died before we finished, exiting
Jan 01 01:00:15 slave1 ntpdate[667]: the NTP socket is in use, exiting
Jan 01 01:00:15 slave1 systemd[1]: Reloading OpenBSD Secure Shell server.
Jan 01 01:00:15 slave1 sshd[261]: Received SIGHUP; restarting.
Jan 01 01:00:15 slave1 systemd[1]: Reloaded OpenBSD Secure Shell server.
Jan 01 01:00:15 slave1 sshd[261]: Could not load host key: /etc/ssh/ssh_host_rsa_key
Jan 01 01:00:15 slave1 sshd[261]: Could not load host key: /etc/ssh/ssh_host_dsa_key
Jan 01 01:00:15 slave1 sshd[261]: Could not load host key: /etc/ssh/ssh_host_ecdsa_key
Jan 01 01:00:15 slave1 sshd[261]: Server listening on 0.0.0.0 port 22.
Jan 01 01:00:15 slave1 sshd[261]: Server listening on :: port 22.
Jan 01 01:00:15 slave1 sshd[261]: Could not load host key: /etc/ssh/ssh_host_ed25519_key
Jan 31 22:01:56 slave1 systemd[1]: Time has been changed
Jan 31 22:02:01 slave1 rc.local[263]: Creating SSH2 RSA key; this may take some time ...
Jan 31 22:02:01 slave1 rc.local[263]: 2048 a2:ce:f9:0a:db:e3:57:20:5a:63:68:4b:a7:4a:24:8b /etc
Jan 31 22:02:04 slave1 cluster-start.sh[413]: W: Failed to fetch http://mirrordirector.raspbian
Jan 31 22:02:04 slave1 cluster-start.sh[413]: W: Failed to fetch http://mirrordirector.raspbian
Jan 31 22:02:04 slave1 cluster-start.sh[413]: W: Failed to fetch https://packagecloud.io/Hyprio
Jan 31 22:02:04 slave1 cluster-start.sh[413]: W: Some index files failed to download. They have
Jan 31 22:02:04 slave1 cluster-start.sh[413]: install required packages
Jan 31 22:02:06 slave1 rc.local[263]: Creating SSH2 DSA key; this may take some time ...
Jan 31 22:02:06 slave1 rc.local[263]: 1024 8c:2c:1a:d2:dc:95:70:b4:7a:34:24:27:d4:0c:bc:6b /etc
Jan 31 22:02:06 slave1 rc.local[263]: Creating SSH2 ECDSA key; this may take some time ...
Jan 31 22:02:06 slave1 rc.local[263]: 256 85:e4:b0:ac:97:72:c3:bf:6b:49:b6:1a:f5:e5:ba:08 /etc/
Jan 31 22:02:07 slave1 rc.local[263]: Creating SSH2 ED25519 key; this may take some time ...
Jan 31 22:02:07 slave1 rc.local[263]: 256 a2:94:18:ac:b9:57:ed:ff:3c:06:6e:2a:7b:3b:02:d8 /etc/
Jan 31 22:02:07 slave1 cluster-start.sh[413]: WARNING: The following packages cannot be authent
Jan 31 22:02:07 slave1 cluster-start.sh[413]: libavahi-client3 avahi-utils vlan
Jan 31 22:02:07 slave1 cluster-start.sh[413]: E: There are problems and -y was used without --f
Jan 31 22:02:07 slave1 cluster-start.sh[413]: create vlan with tag 200 on eth0
Jan 31 22:02:07 slave1 kernel: 8021q: 802.1Q VLAN Support v1.8
Jan 31 22:02:07 slave1 cluster-start.sh[413]: configure avahi only on eth0.200 \(vlan with id 2
Jan 31 22:02:07 slave1 avahi-daemon[347]: Files changed, reloading.
Jan 31 22:02:07 slave1 avahi-daemon[347]: No service file found in /etc/avahi/services.
Jan 31 22:02:07 slave1 avahi-daemon[347]: Files changed, reloading.
Jan 31 22:02:07 slave1 avahi-daemon[347]: No service file found in /etc/avahi/services.
Jan 31 22:02:07 slave1 avahi-daemon[347]: Files changed, reloading.
Jan 31 22:02:07 slave1 avahi-daemon[347]: No service file found in /etc/avahi/services.
Jan 31 22:02:07 slave1 avahi-daemon[347]: Files changed, reloading.
Jan 31 22:02:07 slave1 avahi-daemon[347]: No service file found in /etc/avahi/services.
Jan 31 22:02:07 slave1 avahi-daemon[347]: Files changed, reloading.
Jan 31 22:02:07 slave1 avahi-daemon[347]: No service file found in /etc/avahi/services.
Jan 31 22:02:07 slave1 avahi-daemon[347]: Files changed, reloading.
Jan 31 22:02:07 slave1 avahi-daemon[347]: No service file found in /etc/avahi/services.
Jan 31 22:02:07 slave1 avahi-daemon[347]: Files changed, reloading.
Jan 31 22:02:07 slave1 avahi-daemon[347]: No service file found in /etc/avahi/services.
Jan 31 22:02:07 slave1 avahi-daemon[347]: Joining mDNS multicast group on interface eth0.200.IP
Jan 31 22:02:07 slave1 avahi-daemon[347]: New relevant interface eth0.200.IPv4 for mDNS.
Jan 31 22:02:07 slave1 avahi-daemon[347]: Registering new address record for 192.168.200.5 on e
Jan 31 22:02:07 slave1 cluster-start.sh[413]: #-----------------
Jan 31 22:02:07 slave1 cluster-start.sh[413]: # check if leader
Jan 31 22:02:07 slave1 cluster-start.sh[413]: #-----------------
Jan 31 22:02:07 slave1 cluster-start.sh[413]: set ip address on vlan 200
Jan 31 22:02:07 slave1 cluster-start.sh[413]: /usr/local/bin/cluster-start.sh: line 215: avahi-
Jan 31 22:02:07 slave1 avahi-daemon[347]: Withdrawing address record for 192.168.200.5 on eth0.
Jan 31 22:02:07 slave1 avahi-daemon[347]: Leaving mDNS multicast group on interface eth0.200.IP
Jan 31 22:02:07 slave1 avahi-daemon[347]: Interface eth0.200.IPv4 no longer relevant for mDNS.
Jan 31 22:02:07 slave1 cluster-start.sh[413]: if CLUSTERMASTERIP is empty then this machine is
Jan 31 22:02:07 slave1 cluster-start.sh[413]: #####################
Jan 31 22:02:07 slave1 cluster-start.sh[413]: # #
Jan 31 22:02:07 slave1 cluster-start.sh[413]: # configure node as #
Jan 31 22:02:07 slave1 cluster-start.sh[413]: # #
Jan 31 22:02:07 slave1 cluster-start.sh[413]: # cluster master #
Jan 31 22:02:07 slave1 cluster-start.sh[413]: # #
Jan 31 22:02:07 slave1 cluster-start.sh[413]: #####################
...
What's truncated at this line:
Jan 31 22:02:04 slave1 cluster-start.sh[413]: W: Failed to fetch http://mirrordirector.raspbian
... is a .gpg at the end. Without this key, installing the avahi-browse client fails later (the log says WARNING: The following packages cannot be authent [...]).
Thus, the error must occur somewhere before the Failed to fetch - any ideas?
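If the missing key really is the packagecloud repository key, one thing worth trying is re-importing it before retrying the install. This is only a sketch: the repository path Hypriot/Schatzkiste is taken from the install script mentioned elsewhere in this thread, and the /gpgkey endpoint is an assumption about packagecloud's URL layout.

```shell
# Sketch, not verified: re-import the Hypriot packagecloud repo key,
# then refresh the package index so apt can authenticate the packages.
refresh_hypriot_key() {
  curl -fsSL https://packagecloud.io/Hypriot/Schatzkiste/gpgkey | sudo apt-key add -
  sudo apt-get update
}
```

After that, re-running cluster-start should get past the authentication warning, assuming the key was the only problem.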
Hi,
Starting from the basic HypriotOS image (the latest one), I installed the cluster-lab package, and when I run systemctl start cluster-start, at the end I get:
firstboot_done: command not found
So I hope it was just some harmless output that I missed.
At least the Consul UI is up and running.
Btw, on the project's home page on GitHub, the package is cluster-lab and not hypriot-cluster-lab.
Since Consul takes some time to complete its startup process on the slave nodes, the swarm join does not register the current node in Consul's key-value store.
An additional delay between starting Consul and the swarm join should help to overcome this issue.
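Rather than a fixed sleep, such a delay could be bounded retries. A minimal sketch, assuming a hypothetical wait_for helper were added to the start script; Consul's HTTP status endpoint on port 8500 is a standard way to check readiness:

```shell
# Hypothetical helper: retry a command once per second until it succeeds,
# giving up after $1 attempts (returns 1 on timeout).
wait_for() {
  retries=$1; shift
  i=0
  until "$@"; do
    i=$((i + 1))
    [ "$i" -ge "$retries" ] && return 1
    sleep 1
  done
}

# In cluster-start.sh this could gate the swarm join, e.g.:
#   wait_for 30 curl -fs "http://${CLUSTER_NODE_IP}:8500/v1/status/leader" >/dev/null
#   docker run -d hypriot/rpi-swarm join ...
```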
Hi,
I started my PicoCluster and the Cluster Lab with sd-card-rpi-v0.5.14.img
and upgraded it.
As noted in #36, I saw that Docker was no longer working, so I removed the /etc/docker/daemon.json
file, but to no avail. Docker and the Cluster Lab start fine, but the consul container keeps restarting.
From what I can see on my master node:
HypriotOS/armv7: pirate@pico-master in ~
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c0398b11ece3 hypriot/rpi-swarm:1.2.2 "/swarm manage --repl" 9 seconds ago Up 7 seconds 0.0.0.0:2378->2375/tcp cluster_lab_swarmmanage
7495b4163adb hypriot/rpi-swarm:1.2.2 "/swarm join --advert" 11 seconds ago Up 9 seconds 2375/tcp cluster_lab_swarm
82457f97e74d hypriot/rpi-consul:0.6.4 "/consul agent -serve" 15 seconds ago Restarting (1) 2 seconds ago cluster_lab_consul
$ docker logs cluster_lab_consul
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Joining cluster...
==> dial tcp 192.168.200.1:8301: getsockopt: connection refused
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Joining cluster...
==> dial tcp 192.168.200.1:8301: getsockopt: connection refused
and:
$ sudo systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/etc/systemd/system/docker.service; enabled)
Active: active (running) since Fri 2016-05-27 21:34:19 UTC; 55s ago
Docs: https://docs.docker.com
Main PID: 1116 (docker)
CGroup: /system.slice/docker.service
├─1116 /usr/bin/docker daemon --storage-driver overlay --host fd:// --debug --host tcp://192.168.200.31:2375 --cluster-advertise 192.168.200.31:2375 --cluster-sto...
├─1121 docker-containerd -l /var/run/docker/libcontainerd/docker-containerd.sock --runtime docker-runc --debug --metrics-interval=0
├─1395 docker-containerd-shim 7495b4163adb8d323bfb41671212d75aef65d04ca5264519aa90f4dbd0f91e12 /var/run/docker/libcontainerd/7495b4163adb8d323bfb41671212d75aef65d...
├─1476 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 2378 -container-ip 172.17.0.3 -container-port 2375
└─1480 docker-containerd-shim c0398b11ece30d3c24cc0c8c5ec1851302dff382b983472719dbecb0ba64036a /var/run/docker/libcontainerd/c0398b11ece30d3c24cc0c8c5ec1851302dff...
May 27 21:34:48 pico-master docker[1116]: time="2016-05-27T21:34:48.214344343Z" level=debug msg="logs: begin stream"
May 27 21:34:48 pico-master docker[1116]: time="2016-05-27T21:34:48.219792562Z" level=debug msg="logs: end stream"
May 27 21:34:53 pico-master docker[1116]: time="2016-05-27T21:34:53.723090296Z" level=debug msg="received containerd event: &types.Event{Type:\"start-container\",...x5748bd7d}"
May 27 21:34:53 pico-master docker[1116]: time="2016-05-27T21:34:53.726771710Z" level=debug msg="event unhandled: type:\"start-container\" id:\"82457f97e74d5251a6...464384893 "
May 27 21:34:53 pico-master docker[1116]: time="2016-05-27T21:34:53Z" level=debug msg="containerd: process exited" id=82457f97e74d5251a6f5b5a619f7bd61db00b1c5c92e...temPid=1784
May 27 21:34:53 pico-master docker[1116]: time="2016-05-27T21:34:53.924103578Z" level=debug msg="received containerd event: &types.Event{Type:\"exit\", Id:\"82457...x5748bd7d}"
May 27 21:34:58 pico-master docker[1116]: time="2016-05-27T21:34:58.647580399Z" level=warning msg="Registering as \"192.168.200.31:2375\" in discovery failed: can...n sessions"
May 27 21:34:58 pico-master docker[1116]: time="2016-05-27T21:34:58.683884447Z" level=error msg="discovery error: Get http://192.168.200.31:8500/v1/kv/docker/node...on refused"
May 27 21:34:58 pico-master docker[1116]: time="2016-05-27T21:34:58.684968644Z" level=error msg="discovery error: Put http://192.168.200.31:8500/v1/kv/docker/node...on refused"
May 27 21:34:58 pico-master docker[1116]: time="2016-05-27T21:34:58.686406219Z" level=error msg="discovery error: Unexpected watch error"
Hint: Some lines were ellipsized, use -l to show in full.
$ sudo systemctl status cluster-lab -l
● cluster-lab.service - hypriot-cluster-lab
Loaded: loaded (/etc/systemd/system/cluster-lab.service; enabled)
Active: active (exited) since Fri 2016-05-27 21:34:30 UTC; 12min ago
Main PID: 888 (code=exited, status=0/SUCCESS)
CGroup: /system.slice/cluster-lab.service
└─975 dhclient eth0.200
May 27 21:33:36 pico-master cluster-lab[327]: dpkg-query: error: error writing to '<standard output>': Broken pipe
May 27 21:33:46 pico-master cluster-lab[327]: Device "eth0.200" does not exist.
May 27 21:33:46 pico-master cluster-lab[327]: dpkg-query: error: error writing to '<standard output>': Broken pipe
May 27 21:33:47 pico-master cluster-lab[327]: dpkg-query: error: error writing to '<standard output>': Broken pipe
May 27 21:33:49 pico-master cluster-lab[888]: dpkg-query: error: error writing to '<standard output>': Broken pipe
May 27 21:33:49 pico-master cluster-lab[888]: dpkg-query: error: error writing to '<standard output>': Broken pipe
May 27 21:33:53 pico-master dhclient[965]: DHCPREQUEST on eth0.200 to 255.255.255.255 port 67
May 27 21:33:53 pico-master dhclient[965]: DHCPACK from 192.168.200.1
May 27 21:34:30 pico-master systemd[1]: Started hypriot-cluster-lab.
What else do you need? How can I fix it?
Thanks,
Nicolas
Are there any plans to support the Rancher project?
Hello, I've followed the instructions in this Hypriot blog post. Can the Cluster Lab be set up using the Pi 3's WiFi chip instead of eth0 (Ethernet)?
I tried to edit /etc/cluster-lab/cluster.conf
manually. When I run cluster-lab stop
and then start, it shows me:
Got more than one IP address on wlan0.200: 192.168.200.1
Is it a bad idea to build a cluster over WLAN? If not, how do I configure the Cluster Lab on the Pi 3?
Thanks!
When the Cluster Lab starts up, it often shows the following error
[FAIL] Consul is able to talk to Docker-Engine on port 7946 (Serf)
while actually everything is working.
So we need to either give it more time or make the test more reliable.
See High Availability in Docker Swarm for possible solutions on how to do that.
Having more than one Swarm Manager would make the Cluster Lab more resilient in the event that the master node goes down.
Hello,
I recently decided to try the Hypriot Cluster Lab on my Pi 3B(s) and 2B(s). On the 2B(s) it worked perfectly, but the 3B(s) refuse to boot: when I plug them in, the red light turns on and nothing happens. Help?
Thanks in advance,
Zach Hilman
Refactor Cluster-Lab to make it more reliable
Based on a fresh Jessie Armbian installation, following the blog guide as a base.
Installed docker-compose
from dpkg directly, since the following script did not provide that package, only the docker-hypriot
package:
curl -s https://packagecloud.io/install/repositories/Hypriot/Schatzkiste/script.deb.sh | sudo bash
Armbian uses init instead of systemd by default, which leads to:
$ sudo systemctl start cluster-start
Failed to get D-Bus connection: Unknown error -1
I found the following solution for switching to systemd: http://forum.armbian.com/index.php/topic/342-dbus-error-when-using-systemctl/
The last, not so easily solved problem is that the kernel does not support VXLAN, which leads to non-working overlay networks. To make this work, the kernel has to be recompiled with the option -> Device Drivers -> Network device support -> Network core driver support -> Virtual eXtensible Local Area Network (VXLAN) [m]
.
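Before recompiling, it may be worth checking whether VXLAN is already available. A rough sketch; the /boot/config-* path is typical for Debian-style kernels and may differ on Armbian builds:

```shell
# Check for VXLAN support: either the module is known to the system, or the
# kernel config (if present) shows CONFIG_VXLAN built in or as a module.
if modinfo vxlan >/dev/null 2>&1 \
   || grep -qs 'CONFIG_VXLAN=[ym]' "/boot/config-$(uname -r)"; then
  echo "vxlan available"
else
  echo "vxlan missing - rebuild the kernel with CONFIG_VXLAN=m"
fi
```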
==> follower1: dpkg-deb: building package 'hypriot-cluster-lab-src' in 'hypriot-cluster-lab-src_0.2.12-1.deb'.
==> follower1: /cluster-lab-src/vagrant
==> follower1: dpkg: error processing archive ./hypriot-cluster-lab-src_0.1.1-1.deb (--install):
==> follower1: cannot access archive: No such file or directory
==> follower1: Errors were encountered while processing:
==> follower1: ./hypriot-cluster-lab-src_0.1.1-1.deb
==> follower1: /tmp/vagrant-shell: line 39: cluster-lab: command not found
Happens on each box, on latest master.
We are finding that Consul is unstable on large networks; members do not join properly, for instance. For more flexibility, it would be great to allow the use of a hosted swarm that uses token:// with a unique token address rather than Consul. This means the system needs to come up with a slightly modified token, and you should get the token from the user or by running docker run swarm create
, which will give you a GUID. I haven't tried it, but docker run hypriot/rpi-swarm create
should do the same.
The specific change is to modify cluster-lab.conf so that there is a DOCKER_CONSUL_CONF for Consul and a DOCKER_SWARM_CONF for non-Consul. Then the cluster-lab script needs to be able to switch; perhaps add a new command to the cluster-lab script. There are a bunch of different approaches, but the core line that needs to change is the following, assuming $TOKEN holds the unique id:
DOCKER_OPTS='{\\n \
\"storage-driver\": \"overlay\", \\n \
\"hosts\": [\"fd://\", \"tcp://${CLUSTER_NODE_IP}:2375\"], \\n \
\"cluster-advertise\": \"${CLUSTER_NODE_IP}:2375\", \\n \
\"cluster-store\": \"consul://${CLUSTER_NODE_IP}:8500\", \\n \
\"label\": [\"hypriot.arch=${ARCHITECTURE}\",\"hypriot.hierarchy=${CLUSTER_NODE_ROLE}\"] \\n \
}'
For the slaves, you need to be smart about which swarm image you execute, depending roughly on Intel vs. ARM (I have not yet figured out all the escapes needed):
DOCKER_OPTS='{\\n \
\"storage-driver\": \"overlay\", \\n \
\"swarm\" \\n \
\"swarm-image\": \"hypriot/rpi-swarm\", \\n \
\"swarm-advertise\": \"${CLUSTER_TOKEN}\", \\n \
\"label\": [\"hypriot.arch=${ARCHITECTURE}\",\"hypriot.hierarchy=${CLUSTER_NODE_ROLE}\"] \\n \
}'
For the master you need all of the above plus
\"swarmmaster\": \"overlay\", \\n \
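Put together, the token:// flow could look roughly like this. The image name and ports are taken from this thread; the exact join/manage invocations are assumptions based on how classic standalone Swarm token discovery works and are untested here:

```shell
# Sketch of token-based discovery (classic standalone Swarm), wrapped in a
# function so the steps can be reviewed before running them anywhere.
setup_token_swarm() {
  # once, on any node: create a unique cluster token (a GUID)
  TOKEN=$(docker run --rm hypriot/rpi-swarm create)

  # on every node: join via the token instead of Consul
  docker run -d hypriot/rpi-swarm join \
    --advertise="${CLUSTER_NODE_IP}:2375" "token://${TOKEN}"

  # on the master: run the Swarm manager against the same token
  docker run -d -p 2378:2375 hypriot/rpi-swarm manage "token://${TOKEN}"
}
```

Note that token:// discovery relies on Docker's hosted discovery service being reachable, which trades the local Consul dependency for an internet dependency.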
Hi,
on an rPi Model B Revision 2.0 with 512 MB RAM,
using http://downloads.hypriot.com/hypriot_20160121-235123_clusterlab.img.zip,
I had to modify the start script so the system picks up the master.
(Otherwise 192.168.200.1 was always assigned on node-1, so I had to debug the avahi-browse output, and it only 'refreshes' when the daemon is restarted.)
echo -e "#-----------------\n# check if leader\n#-----------------"
setip 192.168.200.254
sleep 5
systemctl restart avahi-daemon.service
sleep 5
CLUSTERMASTERIP=$(avahi-browse _cluster._tcp -t -r -p | grep 'os-release=hypriot' | grep '^=' | grep ';Cluster-Master' | grep 'eth0\.' | grep IPv4 | awk -F ';' 'BEGIN { format="%s\n" }{ printf(format,$8) }')
Is this something worth changing (via a PR from me), or do you have other plans related to this?
I've tried this three times: I turn the Pi on with the Hypriot cluster image; a few seconds later the router's login page becomes unresponsive, and after a minute or so all network connectivity is lost. I have to restart the router to get connectivity back.
Hopefully the information below will help:
Pi -> Gigabit switch -> Virgin Media SuperHub 2ac
SuperHub software version: V1.01.11, Hardware: 1.03
I'm 99% sure the router supports VLAN. And the switch should definitely support VLAN.
By default, the Docker daemon is started with ExecStart=/usr/bin/docker daemon -H fd://
. This behavior must be overridden for the json
configuration to apply.
I suggest writing a file in /etc/systemd/system/docker.service.d
containing
[Service]
ExecStart=
ExecStart=/usr/bin/docker daemon
This overrides the default start behavior. Remember to call systemctl daemon-reload
before starting Docker again.
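As a sketch of the steps (the drop-in file name 10-execstart.conf is arbitrary), the file can be staged in a temp directory first and inspected before being copied into place as root:

```shell
# Stage the systemd drop-in; afterwards, as root:
#   cp "$dropin_dir/10-execstart.conf" /etc/systemd/system/docker.service.d/
#   systemctl daemon-reload && systemctl restart docker
dropin_dir=$(mktemp -d)
cat > "${dropin_dir}/10-execstart.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/docker daemon
EOF
```

The empty ExecStart= line is what clears the default command so the second ExecStart can replace it.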
While setting up a cluster-lab-imaged Pi as a DHCP server, I noticed that resolv.conf points to private nameservers in Germany.
I know the expected use case is for the cluster-lab hosts to get their IP/name-resolution data via DHCP, but on the off chance that someone is exploring (or for a serious edge case like a travelling cloud-in-a-box), would it make sense to point it by default at the Google public nameservers (8.8.8.8 and 8.8.4.4)?
Currently the communication between Consul nodes, and also between Swarm nodes, is not secured.
We need to evaluate the options for securing this communication and then implement one.
Hi,
I plan to use CL in a DMZ context where I have neither DHCP nor DNS by default, so I don't know if the issue should be reported here or in hypriot/flash. Let me know which is best.
At the end, what I would like is something like:
- --ipaddress for the IP of the node, with a /24 or similar; e.g. --ipaddress=192.168.8.100/24; a variant could be --ipaddress plus --netmask, depending on the expected format
- --gateway for the IP of my gateway to the internet; e.g. --gateway=192.168.8.1
- --dns to set the DNS, with at least one but ideally a list of DNS servers, like --dns="8.8.8.8 8.8.4.4"
It would lead to:
flash --hostname cl-master --ipaddress=192.168.8.100/24 --gateway=192.168.8.1 --dns="8.8.8.8 8.8.4.4" http://downloads.hypriot.com/hypriot-rpi-20151128-152209-docker-swarm-cluster.img.zip
Would it make sense ?
Hello,
The Cluster Lab main example shows using Vagrant with VirtualBox. I'm on a headless CentOS 7 system with only Docker available.
I've been able to run a Vagrant + Docker example such as: https://github.com/bubenkoff/vagrant-docker-example
But when I try to add --provider=docker to the first vagrant instruction, I get:
[root@test-vm vagrant]# /usr/bin/vagrant up --provider=docker
No usable default provider could be found for your system.
Or is using a Docker provider for cluster-lab not possible?
I have installed the latest Cluster Lab on my MacBook with El Capitan. I am able to successfully log in to the Vagrant leader via vagrant ssh leader from the Mac. But when I execute sudo su and then docker -H tcp://192.168.200.1:2378 info, I get FATA[0000] Cannot connect to the Docker daemon. Is 'docker -d' running on this host?
Thanks.
Vasu
The current implementation does not fully support systemd
control. Even after executing systemctl disable clusterlab
, the Cluster Lab still starts up after rebooting.
Anyone else having trouble starting the UI container?
docker -H MASTER_IP:2378 run --cidfile=dockeruipull -d -p 9000:9000 --env="constraint:node==hypriot11" --name dockerui1 hypriot/rpi-dockerui -e http://MASTER_IP:2378
7496213a97ad621da51d809dc16c6e6d4574a5958fc48355aff36b71d0efca15
Error response from daemon: Cannot start container 7496213a97ad621da51d809dc16c6e6d4574a5958fc48355aff36b71d0efca15: [8] System error: exec: "./dockerui": stat ./dockerui: no such file or directory
When running the OS for the first time, I can see that the Docker container for Consul is not starting:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3528081eb5dc hypriot/rpi-swarm "/swarm manage consul" About a minute ago Up About a minute 0.0.0.0:2378->2375/tcp bin_swarmmanage_1
db3898d5fc5f hypriot/rpi-consul "/consul agent -serve" About a minute ago Restarting (1) 9 seconds ago bin_consul_1
7d9b0936d53e hypriot/rpi-swarm "/swarm join --advert" About a minute ago Up About a minute 2375/tcp bin_swarm_1
So I ran docker logs bin_consul_1
to see what was happening:
==> Error starting agent: Failed to get advertise address: Multiple private IPs found. Please configure one.
==> WARNING: BootstrapExpect Mode is specified as 1; this is the same as Bootstrap mode.
==> WARNING: Bootstrap mode enabled! Do not enable unless necessary
==> Starting Consul agent...
==> Error starting agent: Failed to get advertise address: Multiple private IPs found. Please configure one.
I've seen in the blog's comments that I am not the only one with this issue.
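One workaround worth trying, as a sketch only: pin Consul's advertise address explicitly so it no longer has to guess among multiple private IPs. The container name matches the docker ps output above; -advertise, -client and -bootstrap-expect are standard Consul agent options, but the exact flag set this image expects is an assumption.

```shell
# Hypothetical restart of the Consul container with a fixed advertise address.
restart_consul_with_advertise() {
  node_ip=$1   # e.g. 192.168.200.1
  docker rm -f bin_consul_1
  docker run -d --name bin_consul_1 --net host hypriot/rpi-consul \
    agent -server -bootstrap-expect 1 \
    -advertise "${node_ip}" -client 0.0.0.0
}
```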
To make cluster-lab compatible with our latest image-builder-rpi v0.5.17, which is now compatible with docker-machine, we also have to move the config from daemon.json back to docker.service.
A configuration option for the docker daemon would be nice.
Let's say I want to run a local Docker registry and use it with all my daemons. I would have to change the /etc/default/docker
file to add --insecure-registry myregistrydomain.com:5000
for every daemon in my cluster. The only possible way currently is to change the cluster-start.sh
file, since the setting would be overwritten otherwise.
Even better would be the ability to supply this config with the flash tool, too. That would make setup very easy.
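For illustration, the per-daemon setting being asked for might look like this in /etc/default/docker; the registry host is the placeholder from the text above:

```shell
# /etc/default/docker (sketch): extra flags every daemon should pick up
DOCKER_OPTS="--insecure-registry myregistrydomain.com:5000"
```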
- docker run -ti --rm hypriot/rpi-consul members -rpc-addr=192.168.200.1:8400
. Now, I saw that Consul uses many ports; could you explain which port does what?
- systemctl start cluster-lab
rather than cluster-start
: why doesn't it start automatically from the .deb
pkg, where you only enable the services?
- a .conf
file somewhere, where you may specify e.g. network settings
- merge the systemd
services into one? Or make a cluster-lab
service with ExecStart
and ExecStop
pointing to your files. If required, you can also make a second service which runs Before shutdown
and stops cluster-lab
Overall, this is a major step in the right direction.
After some months of tuning, this will be a very good starting point for enthusiasts like me.
Can you add my kubernetes-on-arm as a related project? It is very related.
This week, I'm probably going to release v0.6.2
of my repo, which will feature HypriotOS
support.
That puts Kubernetes only this far away:
# Install
wget <deb file> from github
dpkg -i <deb file>
kube-config install
# After that, kubectl, hyperkube, etcdctl and all required binaries are in $PATH
kube-config enable-master
# Now Kubernetes is up and running...
kubectl get no,po,svc,rc,secrets,serviceAccounts --all-namespaces
# To connect a worker, just do this from another node
kube-config enable-worker [master-ip]
# Spin up dns and a registry as cluster addons
kube-config enable-addon dns
kube-config enable-addon registry
Hi,
sed -i -e 's/use-ipv6=no/use-ipv6=yes' /etc/avahi/avahi-daemon.conf
should be
sed -i -e 's/use-ipv6=no/use-ipv6=yes/' /etc/avahi/avahi-daemon.conf
Thanks.
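The difference is easy to verify offline on a scratch copy (the temp file path here is generated, not the real config): without the terminating /, sed rejects the expression as an unterminated s command, while the corrected form applies cleanly.

```shell
# Apply the corrected substitution to a scratch copy of the config line.
conf=$(mktemp)
echo 'use-ipv6=no' > "$conf"
sed -i -e 's/use-ipv6=no/use-ipv6=yes/' "$conf"
cat "$conf"   # -> use-ipv6=yes
```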
While starting the cluster-lab with a vagrant up
after a vagrant destroy
, I get the following log output:
==> follower2: Setting up hypriot-cluster-lab-src (0.2.12-1) ...
==> follower2: Created symlink from /etc/systemd/system/multi-user.target.wants/cluster-lab.service to /etc/systemd/system/cluster-lab.service.
==> follower2: cp:
==> follower2: cannot stat ‘/etc/systemd/system/docker.service’
==> follower2: : No such file or directory
==> follower2: sed: can't read /etc/systemd/system/docker.service: No such file or directory
A docker info
against Swarm results in the following output:
root@follower1:/home/vagrant# DOCKER_HOST=tcp://192.168.200.1:2378 docker info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 3
(unknown): 192.168.200.45:2375
└ Status: Pending
└ Containers: 0
└ Reserved CPUs: 0 / 0
└ Reserved Memory: 0 B / 0 B
└ Labels:
└ Error: Cannot connect to the docker engine endpoint
└ UpdatedAt: 2016-06-08T05:03:41Z
(unknown): 192.168.200.1:2375
└ Status: Pending
└ Containers: 0
└ Reserved CPUs: 0 / 0
└ Reserved Memory: 0 B / 0 B
└ Labels:
└ Error: Cannot connect to the docker engine endpoint
└ UpdatedAt: 2016-06-08T04:58:31Z
(unknown): 192.168.200.26:2375
└ Status: Pending
└ Containers: 0
└ Reserved CPUs: 0 / 0
└ Reserved Memory: 0 B / 0 B
└ Labels:
└ Error: Cannot connect to the docker engine endpoint
└ UpdatedAt: 2016-06-08T05:01:01Z
Plugins:
Volume:
Network:
Kernel Version: 4.2.0-30-generic
Operating System: linux
Architecture: amd64
CPUs: 0
Total Memory: 0 B
Name: e62a0f42529d
Docker Root Dir:
Debug mode (client): false
Debug mode (server): false
WARNING: No kernel memory limit support
A docker info
against the local Docker installation results in
root@follower1:/home/vagrant# docker info
Containers: 3
Running: 3
Paused: 0
Stopped: 0
Images: 2
Server Version: 1.11.2
Storage Driver: overlay
Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge null host
Kernel Version: 4.2.0-30-generic
Operating System: Ubuntu 15.10
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 992.9 MiB
Name: follower1
ID: FJYP:QGBI:QQRC:DCXS:OEOW:36JV:JMPV:DTFV:6B6K:C4XO:PEQO:LJYE
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
A cluster-lab health
shows
root@follower1:/home/vagrant# cluster-lab health
Internet Connection
[PASS] eth1 exists
[PASS] eth1 has an ip address
[PASS] Internet is reachable
[PASS] DNS works
Networking
[PASS] eth1.200 exists
[PASS] eth1.200 has correct IP from vlan network
[PASS] Cluster leader is reachable
[PASS] eth1.200 has exactly one IP
[PASS] eth1.200 has no local link address
[PASS] Avahi process exists
[PASS] Avahi is using eth1.200
Docker
[PASS] Docker is running
[FAIL] Docker is configured to use Consul as key-value store
[FAIL] Docker is configured to listen via tcp at port 2375
[FAIL] Docker listens on 192.168.200.26 via tcp at port 2375 (Docker-Engine)
Consul
[PASS] Consul Docker image exists
[PASS] Consul Docker container is running
[PASS] Consul is listening on port 8300
[PASS] Consul is listening on port 8301
[PASS] Consul is listening on port 8302
[PASS] Consul is listening on port 8400
[PASS] Consul is listening on port 8500
[PASS] Consul is listening on port 8600
[PASS] Consul API works
[PASS] Cluster-Node is pingable with IP 192.168.200.26
[PASS] Cluster-Node is pingable with IP 192.168.200.45
[PASS] Cluster-Node is pingable with IP 192.168.200.1
[PASS] No Cluster-Node is in status 'failed'
[FAIL] Consul is able to talk to Docker-Engine on port 7946 (Serf)
Swarm
[PASS] Swarm-Join Docker container is running
[PASS] Swarm-Manage Docker container is running
[PASS] Number of Swarm and Consul nodes is equal which means our cluster is healthy
It seems the Docker daemon was not configured correctly by the cluster-lab.
I guess the problem is related to the following lines:
https://github.com/hypriot/cluster-lab/blob/master/package/usr/local/lib/cluster-lab/docker_lib#L79-L81
@firecyberice What do you think?
Allow additional labels for all containers, e.g. to add traefik.enable=false