Comments (12)

ioolkos commented on August 18, 2024

@pbwur Thanks. The change must be in PR #380, #382, #384 or #385 then. What does the VerneMQ log say?

@ashtonian does this ring a bell for you, from the changes to add optional listeners?


👉 Thank you for supporting VerneMQ: https://github.com/sponsors/vernemq
👉 Using the binary VerneMQ packages commercially (.deb/.rpm/Docker) requires a paid subscription.

from docker-vernemq.

pbwur commented on August 18, 2024

I don't see anything in the logging that points to a problem with the healthcheck. When the first pod (of 3) starts there are a lot of log statements like:

vmq_swc_store:handle_info/2:555: Replica meta4: Can't initialize AE exchange due to no peer available

After a while VerneMQ exits. But before that I'm able to execute the healthcheck using http://localhost:8888/health successfully.

2024-05-02T08:53:35.711676+00:00 [debug] <0.292.0> vmq_swc_store:handle_info/2:555: Replica meta9: Can't initialize AE exchange due to no peer available
2024-05-02T08:53:36.920696+00:00 [debug] <0.247.0> vmq_swc_store:handle_info/2:555: Replica meta4: Can't initialize AE exchange due to no peer available
2024-05-02T08:53:37.434670+00:00 [debug] <0.238.0> vmq_swc_store:handle_info/2:555: Replica meta3: Can't initialize AE exchange due to no peer available
2024-05-02T08:53:37.790656+00:00 [debug] <0.283.0> vmq_swc_store:handle_info/2:555: Replica meta8: Can't initialize AE exchange due to no peer available
2024-05-02T08:53:38.419727+00:00 [debug] <0.301.0> vmq_swc_store:handle_info/2:555: Replica meta10: Can't initialize AE exchange due to no peer available
2024-05-02T08:53:38.744695+00:00 [debug] <0.229.0> vmq_swc_store:handle_info/2:555: Replica meta2: Can't initialize AE exchange due to no peer available
2024-05-02T08:53:40.392832+00:00 [debug] <0.265.0> vmq_swc_store:handle_info/2:555: Replica meta6: Can't initialize AE exchange due to no peer available
2024-05-02T08:53:41.044680+00:00 [debug] <0.256.0> vmq_swc_store:handle_info/2:555: Replica meta5: Can't initialize AE exchange due to no peer available
2024-05-02T08:53:41.835692+00:00 [debug] <0.220.0> vmq_swc_store:handle_info/2:555: Replica meta1: Can't initialize AE exchange due to no peer available
2024-05-02T08:53:42.212673+00:00 [debug] <0.292.0> vmq_swc_store:handle_info/2:555: Replica meta9: Can't initialize AE exchange due to no peer available
I'm the only pod remaining. Not performing leave and/or state purge.
2024-05-02T08:53:42.465663+00:00 [debug] <0.274.0> vmq_swc_store:handle_info/2:555: Replica meta7: Can't initialize AE exchange due to no peer available
2024-05-02T08:53:42.839671+00:00 [debug] <0.283.0> vmq_swc_store:handle_info/2:555: Replica meta8: Can't initialize AE exchange due to no peer available
2024-05-02T08:53:42.944858+00:00 [notice] <0.44.0> application_controller:info_exited/3:2129: Application: vmq_server. Exited: stopped. Type: permanent.
2024-05-02T08:53:42.945013+00:00 [notice] <0.44.0> application_controller:info_exited/3:2129: Application: stdout_formatter. Exited: stopped. Type: permanent.
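
For anyone who wants to repeat the manual health check mentioned above from inside the pod, here is a minimal sketch in Python. It only assumes that the endpoint at http://localhost:8888/health answers with HTTP 200 when the broker is healthy; the function name `is_healthy` and the timeout value are illustrative and not part of VerneMQ or the Docker image.

```python
import urllib.request


def is_healthy(url: str = "http://localhost:8888/health", timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers with HTTP 200, False otherwise.

    Assumption: a healthy broker answers with status 200 on this URL.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error statuses
        # (urllib raises HTTPError, a subclass of OSError, for 4xx/5xx).
        return False


if __name__ == "__main__":
    print("healthy" if is_healthy() else "unhealthy")
```

If this reports healthy from inside the container while Kubernetes still restarts the pod, the probe is likely reaching a different address or port than the one the listener is bound to.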

ioolkos commented on August 18, 2024

Those "Replica" logs are normal when you have debug log level on.
I guess Kubernetes terminates the pods here, since it cannot reach the health endpoint.


ashtonian commented on August 18, 2024

Probably need to add this back:
https://github.com/vernemq/docker-vernemq/pull/382/files#diff-95359b2d5d846bb085015977b06cde6a1facdc4ac553c06adb7d12e47aa39373L224-L226
May need to add the cluster port back as well.

ioolkos commented on August 18, 2024

@ashtonian Thanks, I reverted this here: #387
cc @pbwur let's see whether this resolves the issue. I can build new images tomorrow.


ioolkos commented on August 18, 2024

@pbwur I have now uploaded 2.0.0 images with a tentative fix to Docker Hub. Can you test one of those to check whether the Kubernetes health check works now?


pbwur commented on August 18, 2024

@ioolkos, it seems to work now. All 3 nodes of the cluster are starting now. Thanks for the great response!

Although probably not related, I do get an error with the second node after the first node starts successfully. After I delete the PersistentVolumeClaim and start the cluster again, everything is ok.

This is part of the logging:

2024-05-03T09:00:36.793105+00:00 [info] <0.686.0> vmq_diversity_app:start/2:85: enable auth script for postgres "./share/lua/auth/postgres.lua"
Error! Failed to eval: vmq_server_cmd:node_join('VerneMQ@vernemq-0.vernemq-headless.mdtis-poc-mqtt.svc.cluster.local')

Runtime terminating during boot ({{badkey,{'VerneMQ@vernemq-1.vernemq-headless.mdtis-poc-mqtt.svc.cluster.local',<<34,100,99,27,209,16,239,117,147,202,59,36,181,234,60,253,91,83,95,77>>}},[{erlang,map_get,[{'VerneMQ@vernemq-1.vernemq-headless.mdtis-poc-mqtt.svc.cluster.local',<<34,100,99,27,209,16,239,117,147,202,59,36,181,234,60,253,91,83,95,77>>},#{}],[{error_info,#{module=>erl_erts_errors}}]},{vmq_swc_plugin,'-summary/1-lc$^1/1-1-',3,[{file,"/opt/vernemq/apps/vmq_swc/src/vmq_swc_plugin.erl"},{line,220}]},{vmq_swc_plugin,'-summary/1-lc$^1/1-1-',3,[{file,"/opt/vernemq/apps/vmq_swc/src/vmq_swc_plugin.erl"},{line,220}]},{vmq_swc_plugin,history,1,[{file,"/opt/vernemq/apps/vmq_swc/src/vmq_swc_plugin.erl"},{line,230}]},{vmq_swc_peer_service,attempt_join,1,[{file,"/opt/vernemq/apps/vmq_swc/src/vmq_swc_peer_service.erl"},{line,57}]},{vmq_server_cli,'-vmq_cluster_join_cmd/0-fun-1-',3,[{file,"/opt/vernemq/apps/vmq_server/src/vmq_server_cli.erl"},{line,516}]},{clique_command,run,1,[{file,"/opt/vernemq/_build/default/
2024-05-03T09:00:37.798996+00:00 [error] <0.9.0>: Error in process <0.9.0> on node 'VerneMQ@vernemq-1.vernemq-headless.mdtis-poc-mqtt.svc.cluster.local' with exit value:, {{badkey,{'VerneMQ@vernemq-1.vernemq-headless.mdtis-poc-mqtt.svc.cluster.local',<<34,100,99,27,209,16,239,117,147,202,59,36,181,234,60,253,91,83,95,77>>}},[{erlang,map_get,[{'VerneMQ@vernemq-1.vernemq-headless.mdtis-poc-mqtt.svc.cluster.local',<<34,100,99,27,209,16,239,117,147,202,59,36,181,234,60,253,91,83,95,77>>},#{}],[{error_info,#{module => erl_erts_errors}}]},{vmq_swc_plugin,'-summary/1-lc$^1/1-1-',3,[{file,"/opt/vernemq/apps/vmq_swc/src/vmq_swc_plugin.erl"},{line,220}]},{vmq_swc_plugin,'-summary/1-lc$^1/1-1-',3,[{file,"/opt/vernemq/apps/vmq_swc/src/vmq_swc_plugin.erl"},{line,220}]},{vmq_swc_plugin,history,1,[{file,"/opt/vernemq/apps/vmq_swc/src/vmq_swc_plugin.erl"},{line,230}]},{vmq_swc_peer_service,attempt_join,1,[{file,"/opt/vernemq/apps/vmq_swc/src/vmq_swc_peer_service.erl"},{line,57}]},{vmq_server_cli,'-vmq_cluster_join_cmd/0-fun-1-',3,[{file,"/opt/vernemq/apps/vmq_server/src/vmq_server_cli.erl"},{line,516}]},{clique_command,run,1,[{file,"/opt/vernemq/_build/default/lib/clique/src/clique_command.erl"},{line,87}]},{vmq_server_cli,command,2,[{file,"/opt/vernemq/apps/vmq_server/src/vmq_server_cli.erl"},{line,45}]}]}

Crash dump is being written to: /erl_crash.dump...[os_mon] memory supervisor port (memsup): Erlang has closed
[os_mon] cpu supervisor port (cpu_sup): Erlang has closed
Stream closed EOF for mdtis-poc-mqtt/vernemq-1 (vernemq)

hsudbrock commented on August 18, 2024

@pbwur I have the same issue as the one you describe in your last comment above: when restarting a pod of the vernemq stateful set, I get the exact same error; only after deleting the PVC (and underlying PV) and restarting the pod does it come up again. This issue started with 2.0.0; I did not have it with 1.13.

Did you, by any chance, resolve that issue on your side? If yes, I would be thankful to hear how :)

ioolkos commented on August 18, 2024

@pbwur @hsudbrock Currently looking into the PVC related start error; it looks like some sort of regression.

The following setting in vernemq.conf should prevent it (by switching to the previous join logic):

vmq_swc.prevent_nonempty_join = off

pbwur commented on August 18, 2024

Hi @hsudbrock and @ioolkos, apologies for the late response. The issue still happens here as well.
It would be great if that setting fixed it. What would be the correct environment variable to set it? DOCKER_VERNEMQ_VMQ_SWC__PREVENT__NONEMPTY__JOIN?

ioolkos commented on August 18, 2024

@pbwur DOCKER_VERNEMQ_VMQ_SWC__PREVENT_NONEMPTY_JOIN

(translate . to __, keep _ as _)
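
The naming rule above (dots become double underscores, single underscores stay as they are) can be sketched as a small Python helper. The function name `conf_key_to_env` is hypothetical and only for illustration; the `DOCKER_VERNEMQ_` prefix and the upper-casing are taken from the variable name given in this thread.

```python
def conf_key_to_env(key: str, prefix: str = "DOCKER_VERNEMQ_") -> str:
    """Map a vernemq.conf key to the matching container environment variable.

    Rule from the thread: translate "." to "__", keep "_" as "_".
    Upper-casing and the prefix follow the example variable name above.
    """
    return prefix + key.replace(".", "__").upper()


if __name__ == "__main__":
    # Prints DOCKER_VERNEMQ_VMQ_SWC__PREVENT_NONEMPTY_JOIN
    print(conf_key_to_env("vmq_swc.prevent_nonempty_join"))
```

Setting that variable to `off` in the container environment would then correspond to the `vmq_swc.prevent_nonempty_join = off` line in vernemq.conf.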

hsudbrock commented on August 18, 2024

Thanks for the hint and the PR for fixing the issue! For me, it looks good so far: disabling the nonempty join check has resulted in no errors when restarting my VerneMQ cluster.
