Running an autoscaler to test arm64 builds.
AWS EC2 m6g.large
Sporadically seeing failed builds on newly launched instances:
clone: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
latest: Pulling from drone/git
Digest: sha256:091ecd02ee4ac5154fd76133c5055b2345a61cbc17182b00612df1fa7eef1510
Status: Downloaded newer image for drone/git:latest
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
However, not consistently; it happens maybe 10% of the time. Most of the time there are no errors.
What comes to mind is to put a "sleep 1" or "systemctl restart docker" somewhere. A very small change.
It looks like the restart is already present: "[ systemctl, restart, docker ]".
Now trying a modification in drivers/internal/userdata/userdata.go. Change
- path: /etc/docker/daemon.json
  content: |
    {
      "hosts": [ "0.0.0.0:2376", "unix:///var/run/docker.sock" ],
      "tls": true,
      "tlsverify": true,
      "tlscacert": "/etc/docker/ca.pem",
      "tlscert": "/etc/docker/server-cert.pem",
      "tlskey": "/etc/docker/server-key.pem"
    }
to
- path: /etc/docker/daemon.json
  content: |
    {
      "hosts": [ "0.0.0.0:2376", "unix:///var/run/docker.sock" ],
      "containerd": "/run/containerd/containerd.sock",
      "tls": true,
      "tlsverify": true,
      "tlscacert": "/etc/docker/ca.pem",
      "tlscert": "/etc/docker/server-cert.pem",
      "tlskey": "/etc/docker/server-key.pem"
    }
Adding "containerd": "/run/containerd/containerd.sock", based on this Stack Overflow question: https://stackoverflow.com/questions/68823645/what-is-purpose-of-the-switch-containerd-in-command-dockerd
I'll report back on whether it improves the condition.