Comments (14)

yprokule commented on July 30, 2024

/cc @mcornea @achuzhoy @celebdor

from dev-scripts.

hardys commented on July 30, 2024

Thanks for the report. Can you please provide the keepalived logs from all masters?

yprokule commented on July 30, 2024

@hardys logs from the keepalived containers

yprokule commented on July 30, 2024

Worth mentioning that both master and worker nodes end up in NotReady status:

oc get nodes
NAME       STATUS     ROLES    AGE     VERSION
master-0   NotReady   master   3d2h    v1.13.4+1ad602308
master-1   NotReady   master   3d2h    v1.13.4+1ad602308
master-2   NotReady   master   3d2h    v1.13.4+1ad602308
worker-0   NotReady   worker   2d22h   v1.13.4+1ad602308

jtaleric commented on July 30, 2024

I hit this in my deployment; only master-2 went to NotReady. It is the exact same issue as described here.

yboaron commented on July 30, 2024

@yprokule, it seems there is an L2 connectivity issue for 192.168.123.5 (as ping doesn't work).

  1. Could you please run the same test for 192.168.123.6 (DNS) and 192.168.123.10 (INGRESS)?
  2. Could you please attach the output of arp table (arp -a) from all nodes?

yprokule commented on July 30, 2024

@yboaron

master-0

[root@master-0 ~]# ping -c1 192.168.123.6
PING 192.168.123.6 (192.168.123.6) 56(84) bytes of data.
64 bytes from 192.168.123.6: icmp_seq=1 ttl=64 time=0.029 ms

--- 192.168.123.6 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.029/0.029/0.029/0.000 ms
[root@master-0 ~]# ping -c1 192.168.123.5
connect: Invalid argument

master-1

[root@master-1 ~]# ping -c1 192.168.123.5
PING 192.168.123.5 (192.168.123.5) 56(84) bytes of data.
64 bytes from 192.168.123.5: icmp_seq=1 ttl=64 time=0.174 ms

--- 192.168.123.5 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.174/0.174/0.174/0.000 ms
[root@master-1 ~]# ping -c1 192.168.123.6
PING 192.168.123.6 (192.168.123.6) 56(84) bytes of data.
64 bytes from 192.168.123.6: icmp_seq=1 ttl=64 time=0.213 ms

--- 192.168.123.6 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.213/0.213/0.213/0.000 ms

master-2

[root@master-2 ~]# ping -c1 192.168.123.6
PING 192.168.123.6 (192.168.123.6) 56(84) bytes of data.
64 bytes from 192.168.123.6: icmp_seq=1 ttl=64 time=0.163 ms

--- 192.168.123.6 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.163/0.163/0.163/0.000 ms
[root@master-2 ~]# ping -c1 192.168.123.5
PING 192.168.123.5 (192.168.123.5) 56(84) bytes of data.
64 bytes from 192.168.123.5: icmp_seq=1 ttl=64 time=0.030 ms

--- 192.168.123.5 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.030/0.030/0.030/0.000 ms

yprokule commented on July 30, 2024

On other nodes:

Apr 15 16:15:41 master-2 hyperkube[121307]: E0415 16:15:41.413449  121307 kubelet.go:2273] node "master-2" not found
Apr 15 16:15:41 master-2 hyperkube[121307]: E0415 16:15:41.497345  121307 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:444: Failed to list *v1.Service: services is forbidden: User "system:anonymous" cannot list resource "services" in API group "" at the cluster scope
Apr 15 16:15:41 master-2 hyperkube[121307]: E0415 16:15:41.498449  121307 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: pods is forbidden: User "system:anonymous" cannot list resource "pods" in API group "" at the cluster scope

yboaron commented on July 30, 2024

@yprokule, I think I found something.

master-0 doesn't hold the API VIP (192.168.123.5), but I can still see the following host entry in its routing table:
192.168.123.5 dev ens4 proto kernel scope link src 192.168.123.5 metric 101

So when master-0 tries to send any packet to 192.168.123.5, the network stack fails with 'connect: Invalid argument' (the route still names 192.168.123.5 as its source address, which master-0 does not hold).

I deleted the 192.168.123.5 route from master-0, and now I'm able to ping 192.168.123.5.

[core@master-0 ~]$ sudo ip route del 192.168.123.5/32
[core@master-0 ~]$ ping 192.168.123.5
PING 192.168.123.5 (192.168.123.5) 56(84) bytes of data.
64 bytes from 192.168.123.5: icmp_seq=1 ttl=64 time=0.170 ms
64 bytes from 192.168.123.5: icmp_seq=2 ttl=64 time=0.098 ms
64 bytes from 192.168.123.5: icmp_seq=3 ttl=64 time=0.252 ms
64 bytes from 192.168.123.5: icmp_seq=4 ttl=64 time=0.210 ms
^C
--- 192.168.123.5 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 92ms
rtt min/avg/max/mdev = 0.098/0.182/0.252/0.058 ms
[core@master-0 ~]$
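
For anyone who wants to check other nodes for the same leftover route, here is a minimal sketch in Python. It is not part of dev-scripts or of any fix referenced in this thread. It assumes you have captured the text output of `ip -o -4 addr show` and `ip route show` on the node, and it flags IPv4 host routes whose `src` address is not assigned to any interface. The function names and the sample addresses (192.168.123.14 for the node, 192.168.123.1 for the gateway) are made up for illustration.

```python
import ipaddress
from typing import List, Set


def assigned_addresses(ip_addr_output: str) -> Set[str]:
    """Collect the IPv4 addresses listed in `ip -o -4 addr show` output."""
    addrs = set()
    for line in ip_addr_output.splitlines():
        fields = line.split()
        if "inet" in fields and fields.index("inet") + 1 < len(fields):
            cidr = fields[fields.index("inet") + 1]
            addrs.add(str(ipaddress.ip_interface(cidr).ip))
    return addrs


def stale_host_routes(ip_route_output: str, addrs: Set[str]) -> List[str]:
    """Return host-route destinations whose `src` is not a local address."""
    stale = []
    for line in ip_route_output.splitlines():
        fields = line.split()
        if "src" not in fields or fields.index("src") + 1 >= len(fields):
            continue
        try:
            # A bare address such as "192.168.123.5" parses as a /32 network.
            net = ipaddress.ip_network(fields[0], strict=False)
        except ValueError:
            continue  # e.g. "default"
        if net.version != 4 or net.prefixlen != 32:
            continue  # only IPv4 host routes are of interest here
        src = fields[fields.index("src") + 1]
        if src not in addrs:
            stale.append(fields[0])
    return stale


# Hypothetical sample data; only the 192.168.123.5 route line is from this thread.
sample_addr = "2: ens4    inet 192.168.123.14/24 brd 192.168.123.255 scope global ens4"
sample_routes = "\n".join([
    "default via 192.168.123.1 dev ens4 proto dhcp metric 101",
    "192.168.123.0/24 dev ens4 proto kernel scope link src 192.168.123.14 metric 101",
    "192.168.123.5 dev ens4 proto kernel scope link src 192.168.123.5 metric 101",
])
print(stale_host_routes(sample_routes, assigned_addresses(sample_addr)))
```

A node that currently holds the VIP would list 192.168.123.5 among its addresses, so the same route would not be flagged there. The sketch only reports candidates; removing a route is still a manual `ip route del` as shown above.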

karmab commented on July 30, 2024

Yep, deleting the incorrect route fixed the NotReady state of the node, which simply wasn't able to reach the API to report its status.

russellb commented on July 30, 2024

I see @karmab has a WIP patch for this here: #369

yboaron commented on July 30, 2024

This seems like a RHCOS/RHEL bug. I filed a BZ for it: https://bugzilla.redhat.com/show_bug.cgi?id=1700415

russellb commented on July 30, 2024

A different workaround is here: #377

russellb commented on July 30, 2024

We've got an open bug tracking the kernel issue. In the meantime, we've updated our config such that the undeleted route won't cause a problem anymore. See #377
