Comments (6)
/sig node
from kubernetes.
Hi, could you please confirm which version of OLM you have installed in your cluster? I am using a v1.30 kubeadm cluster with 2 nodes, but I cannot get OLM to install properly. I installed the latest version, and this is what I see for one of the pods:
kubectl describe pod operatorhubio-catalog-4f4pn -n olm
Name: operatorhubio-catalog-4f4pn
Namespace: olm
Priority: 0
Service Account: operatorhubio-catalog
Node: kubenode01/192.168.56.21
Start Time: Thu, 11 Jul 2024 07:32:19 +0000
Labels: olm.catalogSource=operatorhubio-catalog
olm.managed=true
olm.pod-spec-hash=2x1nBHsbQOubqsGtZedVVKLKuv3chn8Oxx6Yio
Annotations: cluster-autoscaler.kubernetes.io/safe-to-evict: true
Status: Running
SeccompProfile: RuntimeDefault
IP: 10.244.1.16
IPs:
IP: 10.244.1.16
Controlled By: CatalogSource/operatorhubio-catalog
Containers:
registry-server:
Container ID: cri-o://662c558544d511a2be80abe0a31d31a9d53f7db384aaf9373cd57e2e0630b6fe
Image: quay.io/operatorhubio/catalog:latest
Image ID: quay.io/operatorhubio/catalog@sha256:ebb371353d720d380e6accf4b338e3c4b4cdb8594cc4bb53481f9aff07f95909
Port: 50051/TCP
Host Port: 0/TCP
State: Running
Started: Thu, 11 Jul 2024 07:42:29 +0000
Last State: Terminated
Reason: Error
Message: time="2024-07-11T07:40:49Z" level=info msg="starting pprof endpoint" address="localhost:6060"
time="2024-07-11T07:40:49Z" level=info msg="found existing cache contents" backend=pogreb.v1 cache=/tmp/cache configs=/configs
Exit Code: 2
Started: Thu, 11 Jul 2024 07:40:46 +0000
Finished: Thu, 11 Jul 2024 07:42:21 +0000
Ready: True
Restart Count: 6
Requests:
cpu: 10m
memory: 50Mi
Liveness: exec [grpc_health_probe -addr=:50051] delay=10s timeout=5s period=10s #success=1 #failure=3
Readiness: exec [grpc_health_probe -addr=:50051] delay=5s timeout=5s period=10s #success=1 #failure=3
Startup: exec [grpc_health_probe -addr=:50051] delay=0s timeout=5s period=10s #success=1 #failure=10
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ccmxs (ro)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-ccmxs:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 51m default-scheduler Successfully assigned olm/operatorhubio-catalog-4f4pn to kubenode01
Normal Pulled 51m kubelet Successfully pulled image "quay.io/operatorhubio/catalog:latest" in 2.469s (8.499s including waiting). Image size: 708569125 bytes.
Normal Killing 49m kubelet Container registry-server failed startup probe, will be restarted
Normal Pulling 49m (x2 over 51m) kubelet Pulling image "quay.io/operatorhubio/catalog:latest"
Normal Pulled 49m kubelet Successfully pulled image "quay.io/operatorhubio/catalog:latest" in 2.96s (2.974s including waiting). Image size: 708569125 bytes.
Normal Created 49m (x2 over 51m) kubelet Created container registry-server
Normal Started 49m (x2 over 51m) kubelet Started container registry-server
Warning Unhealthy 41m (x60 over 51m) kubelet Startup probe failed: timeout: failed to connect service ":50051" within 1s
I will have to reproduce this issue to validate it so that it can be marked triage accepted. Then maybe I will be able to work on the solution. I tried it on minikube as well and did not hit this issue there. The problem with minikube is that it uses a different OS that does not have AppArmor installed. In that case, I run into a problem when creating an AppArmor profile through kubectl.
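As a rough cross-check of the describe output above, the arithmetic below compares the startup probe's time budget with how long the last terminated container instance actually ran. It is only a sketch: it assumes the kubelet allows roughly initial delay plus period times failure threshold before restarting the container, and the variable and function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Startup probe settings copied from the "Startup:" line of the describe output.
initial_delay=0
period=10
failure_threshold=10

# Approximate window the container has to start answering the probe
# (assumption: initial delay plus one period per allowed failure).
budget=$(( initial_delay + period * failure_threshold ))

# Convert an HH:MM:SS timestamp to seconds since midnight.
# The 10# prefix forces base 10 so values such as "08" are not read as octal.
to_seconds() {
  local h m s
  IFS=: read -r h m s <<< "$1"
  echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# "Started" and "Finished" times of the last terminated instance.
started=$(to_seconds "07:40:46")
finished=$(to_seconds "07:42:21")
lifetime=$(( finished - started ))

echo "startup probe budget:    ${budget}s"
echo "last container lifetime: ${lifetime}s"
```

The two numbers come out close to each other (100s and 95s), which fits the "failed startup probe, will be restarted" event: the registry server was probably killed for not serving on :50051 in time rather than exiting on its own.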
@Aaina26
Here's how I install OLM in my cluster:
export ARCH=$(case $(uname -m) in x86_64) echo -n amd64 ;; aarch64) echo -n arm64 ;; *) echo -n $(uname -m) ;; esac)
export OS=$(uname | awk "{print tolower(\$0)}")
export OPERATOR_SDK_DL_URL=https://github.com/operator-framework/operator-sdk/releases/latest/download/
curl -LO ${OPERATOR_SDK_DL_URL}/operator-sdk_${OS}_${ARCH}
gpg --keyserver keyserver.ubuntu.com --recv-keys 052996E2A20B5C7E
curl -LO ${OPERATOR_SDK_DL_URL}/checksums.txt
curl -LO ${OPERATOR_SDK_DL_URL}/checksums.txt.asc
gpg -u "Operator SDK (release) <[email protected]>" --verify checksums.txt.asc
grep operator-sdk_${OS}_${ARCH} checksums.txt | sha256sum -c -
chmod +x operator-sdk_${OS}_${ARCH} && sudo install operator-sdk_${OS}_${ARCH} /usr/local/bin/operator-sdk
operator-sdk olm install
Hope that helps.
Thank you for sharing this! I followed the same steps from the OLM documentation, though, and I am getting the same issue. I'll try to troubleshoot this in my cluster then.
No problem. Maybe check the logs on the relevant node with crictl.
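A minimal sketch of what that check could look like on the node (kubenode01 in the output above). The `run` and `inspect_container` helpers are hypothetical and only print the commands by default; the container name and the shortened container ID are taken from the describe output, and an exited instance will have a different ID, which `crictl ps -a` lists.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints each command by default. Set DRY_RUN=0 on the
# node itself to execute them instead (crictl usually needs root).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper that groups the crictl calls for one container.
inspect_container() {
  local name="$1" id="$2"
  # List every instance of the container, including exited ones.
  run crictl ps -a --name "$name"
  # Show the last 50 log lines of the given container instance.
  run crictl logs --tail 50 "$id"
  # Dump runtime details (state, exit code, mounts) for that instance.
  run crictl inspect "$id"
}

# Name and ID prefix from the "Containers:" section of the describe output.
inspect_container registry-server 662c558544d51
```

Run as-is, this only echoes the three commands; the ID can be replaced with one of the exited instances once `crictl ps -a` has listed them.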
I just did a clean install on my cluster with Kubernetes 1.30.0. Here's what it looks like on my end. The install procedure was as documented in the previous post:
NAME READY STATUS RESTARTS AGE
pod/catalog-operator-78857dfb48-n2dqc 1/1 Running 0 84s
pod/olm-operator-6bf4f9c984-2fqwq 1/1 Running 0 85s
pod/operatorhubio-catalog-txnnm 1/1 Running 0 55s
pod/packageserver-6d7c785b6-25j8x 1/1 Running 0 58s
pod/packageserver-6d7c785b6-m6zdl 1/1 Running 0 58s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/operatorhubio-catalog ClusterIP 10.97.206.107 <none> 50051/TCP 55s
service/packageserver-service ClusterIP 10.103.57.85 <none> 5443/TCP 59s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/catalog-operator 1/1 1 1 84s
deployment.apps/olm-operator 1/1 1 1 85s
deployment.apps/packageserver 2/2 2 2 58s
NAME DESIRED CURRENT READY AGE
replicaset.apps/catalog-operator-78857dfb48 1 1 1 84s
replicaset.apps/olm-operator-6bf4f9c984 1 1 1 85s
replicaset.apps/packageserver-6d7c785b6 2 2 2 58s
sig-node meeting notes:
- @tallclair, who's working on AppArmor GA: can you take a look?
/priority important-soon
/triage accepted