Comments (10)
Ugh, it looks like I was on an outdated version of the repo. After updating the repo and reinstalling, it is working now.
from consul-helm.
Does it never stabilize? It looks like it's starting to work, it just hasn't joined all the servers yet.
Also, PVCs might be a real issue for sure. I didn't actually realize that StatefulSets with PVCs can be started before the PVC is available (I'm spoiled by the environment we run in, I guess). Do you know what that looks like? Is the directory just not available yet? That might be something we have to build into an init container or similar (to wait for it to be ready).
from consul-helm.
@mitchellh no, it never does. Funnily enough, it used to work with the previous release of Rook (v0.7).
In my understanding, the StatefulSet starts and attempts to bind the PVCs. Until that is done, the pod should report an unbound PVC issue, which should be easily accessible via kubectl describe or kubectl logs.
from consul-helm.
But during that time, the containers are started?
Sorry, the easiest way to figure this out would be for you to do more digging, or for me to get a reproduction. For the latter, is there an easy way for me to get a similar environment up and running?
from consul-helm.
They appear to be - the logs are there.
I'm happy to do more digging. However, for you to get a repro, I'd have to provide Terraform files, Helm values files, and k8s manifests so that you can stand up a copy of my env, OR I can simply share a kubeconfig file and leave the cluster running for the night so that you can poke around.
from consul-helm.
@mitchellh I've cloned the repo and made some alterations:
values.yaml
server.storageclass: rook-ceph-block
server-statefulset.yaml
readinessProbe.initialDelaySeconds: 60
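For reference, the two alterations above might look like this as YAML. This is only a sketch, not the chart's actual files: the camel-cased storageClass key is an assumption about how the chart spells the value (the comment writes it in lowercase), and the probe fields other than initialDelaySeconds are copied from the kubectl describe output further down in this thread.

```yaml
# values.yaml override (the key spelling `storageClass` is an assumption)
server:
  storageClass: rook-ceph-block
---
# Fragment of the server container spec in server-statefulset.yaml.
# Only initialDelaySeconds was changed (5 -> 60); the other values are
# the ones reported by `kubectl describe` further down in this thread.
readinessProbe:
  exec:
    command:
      - "/bin/sh"
      - "-ec"
      - |
        curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | \
        grep -E '".+"'
  initialDelaySeconds: 60
  periodSeconds: 3
  timeoutSeconds: 5
  successThreshold: 1
  failureThreshold: 2
```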
Results
- All consul-* pods were up and running very quickly, but they were not Ready for about 50s.
- The consul-server-* pods did not come up immediately: they were stuck in the ContainerCreating state for approx. 50s, when the 1st pod reported Running. None of the pods were Ready.
- By the time all 3 consul-server pods were up and running, the initialDelaySeconds period had passed and they started to pass the readiness test.
helm status consul
LAST DEPLOYED: Wed Sep 26 23:34:25 2018
NAMESPACE: service-discovery
STATUS: DEPLOYED
RESOURCES:
==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
consul-kktl5 1/1 Running 0 2m
consul-tbjmt 1/1 Running 0 2m
consul-x5cqq 1/1 Running 0 2m
consul-server-0 1/1 Running 0 2m
consul-server-1 1/1 Running 0 2m
consul-server-2 1/1 Running 0 2m
==> v1/ConfigMap
NAME DATA AGE
consul-client-config 1 2m
consul-server-config 1 2m
==> v1/Service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
consul-dns ClusterIP 10.3.228.30 <none> 53/TCP,53/UDP 2m
consul-server ClusterIP None <none> 8500/TCP,8301/TCP,8301/UDP,8302/TCP,8302/UDP,8300/TCP,8600/TCP,8600/UDP 2m
consul-ui ClusterIP 10.3.98.102 <none> 80/TCP 2m
==> v1/DaemonSet
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
consul 3 3 3 3 3 <none> 2m
==> v1/StatefulSet
NAME DESIRED CURRENT AGE
consul-server 3 3 2m
==> v1beta1/PodDisruptionBudget
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
consul-server N/A 0 0 2m
Compare to previous results
Compare this with the previous results below, where 0/1 means that the pod is running but its readiness test is failing.
NAME READY STATUS RESTARTS AGE
consul-56lvs 0/1 Running 0 36s
consul-jttwp 0/1 Running 0 36s
consul-qpgdn 0/1 Running 0 36s
consul-server-0 0/1 ContainerCreating 0 36s
consul-server-1 0/1 ContainerCreating 0 36s
consul-server-2 0/1 Running 0 36s
Hypothesis (failureThreshold?)
This is what the docs say:
failureThreshold: When a Pod starts and the probe fails, Kubernetes will try failureThreshold times
before giving up. Giving up in case of liveness probe means restarting the Pod. In case of readiness
probe the Pod will be marked Unready. Defaults to 3. Minimum value is 1.
The failure threshold in server-statefulset.yaml is:
failureThreshold: 2
and the initial delay is: initialDelaySeconds: 5
Does that mean Kubernetes started failing the checks and gave up before the PVC(s) were bound to the pods?
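One way to reason about this hypothesis is to put rough timings next to the original probe settings. The sketch below assumes periodSeconds: 3, taken from the kubectl describe output further down in this thread, and the timings are approximate. Note that a readiness probe, unlike a liveness probe, keeps running after it has marked a pod unready, so on its own it should not stop a pod from becoming Ready later.

```yaml
# Original probe timing, annotated. The clock starts when the container
# starts, so time spent in Pending or ContainerCreating (for example while
# waiting for a volume) does not count against these values.
readinessProbe:
  initialDelaySeconds: 5   # first probe roughly 5s after the container starts
  periodSeconds: 3         # then one probe roughly every 3s (assumed, see above)
  failureThreshold: 2      # 2 consecutive failures (about 8s in) -> pod marked unready
  successThreshold: 1      # a single later success marks the pod ready again
```

Read this way, a longer initialDelaySeconds mainly delays the first check rather than changing whether the pod can eventually become Ready.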
from consul-helm.
@mmisztal1980 I made the adjustments that you mentioned, but I am still seeing pods failing health checks and servers sitting in a Pending state.
consul consul-dzjnx 0/1 Running 0 4m
consul consul-g8lmf 0/1 Running 0 4m
consul consul-kx8l6 0/1 Running 0 4m
consul consul-server-0 0/1 Pending 0 4m
consul consul-server-1 0/1 Pending 0 4m
consul consul-server-2 0/1 Pending 0 4m
Consul logs seem to indicate that the stateful set isn't binding the PVC. I am also using Rook/Ceph for storage.
pod description
Name: consul-server-0
Namespace: consul
Priority: 0
PriorityClassName: <none>
Node: <none>
Labels: app=consul
chart=consul-0.1.0
component=server
controller-revision-hash=consul-server-5cf54754b
hasDNS=true
release=consul
statefulset.kubernetes.io/pod-name=consul-server-0
Annotations: consul.hashicorp.com/connect-inject: false
Status: Pending
IP:
Controlled By: StatefulSet/consul-server
Containers:
consul:
Image: consul:1.2.3
Ports: 8500/TCP, 8301/TCP, 8302/TCP, 8300/TCP, 8600/TCP, 8600/UDP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/UDP
Command:
/bin/sh
-ec
CONSUL_FULLNAME="consul"
exec /bin/consul agent \
-advertise="${POD_IP}" \
-bind=0.0.0.0 \
-bootstrap-expect=3 \
-client=0.0.0.0 \
-config-dir=/consul/config \
-datacenter=dc1 \
-data-dir=/consul/data \
-domain=consul \
-hcl="connect { enabled = true }" \
-ui \
-retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-server
Readiness: exec [/bin/sh -ec curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | \
grep -E '".+"'
] delay=60s timeout=5s period=3s #success=1 #failure=2
Environment:
POD_IP: (v1:status.podIP)
NAMESPACE: consul (v1:metadata.namespace)
Mounts:
/consul/config from config (rw)
/consul/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-lv4v8 (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-consul-server-0
ReadOnly: false
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: consul-server-config
Optional: false
default-token-lv4v8:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-lv4v8
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 11s (x245 over 10m) default-scheduler pod has unbound PersistentVolumeClaims (repeated 3 times)
I checked the claims (kubectl get pvc) but don't see any claims made by Consul showing up in there.
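For context, a StatefulSet creates one PVC per pod from its volumeClaimTemplates, named <template name>-<pod name>, which is where the ClaimName data-consul-server-0 in the description above comes from. The fragment below is a sketch of what such a template might look like once rendered. The 10Gi size and the access mode are placeholder assumptions, not values taken from this thread. PVCs are namespaced, so kubectl get pvc only lists them when run against the namespace the release lives in (consul here, for example with -n consul).

```yaml
# Sketch of a rendered StatefulSet fragment; size and access mode are assumed.
volumeClaimTemplates:
  - metadata:
      name: data                          # PVC name becomes data-<pod name>
    spec:
      accessModes:
        - ReadWriteOnce
      storageClassName: rook-ceph-block   # must match an existing StorageClass
      resources:
        requests:
          storage: 10Gi
```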
from consul-helm.
Added a PR where the probe settings are configurable via the chart values. That should help in our case.
from consul-helm.
@mitchellh would the above PR be satisfactory? Tweaking the probe settings seems to have fixed the issue for myself and @jmreicha
from consul-helm.
As mentioned in a comment on the PR, these configuration options won't be added to values.yaml at this time. If you need further customization of the Helm chart, read more about options here.
from consul-helm.