[Bug] operator crash (mariadb-operator, CLOSED)

Guzzi commented on June 10, 2024
[Bug] operator crash

from mariadb-operator.

Comments (10)

mmontes11 commented on June 10, 2024

Hey @Guzzi! Thanks for reporting this.

Unfortunately, I'm unable to reproduce this. Here are the steps I followed:

  • Install mariadb-operator v0.0.19 via the helm chart using the default values
  • Apply this example, which uses MariaDB 11.0.2
  • Downgrade the MariaDB resource to 10.11.3 by updating spec.image.tag

Could you provide more details? Which version are you moving from, and which version are you moving to?

Guzzi commented on June 10, 2024

Hi @mmontes11,
I downgraded from 11.0.2 to 10.11.4 (I wanted to switch to the LTS version, though I didn't expect a downgrade to work).
The controller is now in a state where it fails immediately as soon as I apply a new CR.
If I delete the CR, it stops crashing and restarting.

The last crash log was this:

{"level":"info","ts":1690989085.6401832,"msg":"Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference","controller":"mariadb","controllerGroup":"mariadb.mmontes.io","controllerKind":"MariaDB","mariaDB":{"name":"infoplatform-galera","namespace":"dev-ctraffic-infoplatform"},"namespace":"dev-ctraffic-infoplatform","name":"infoplatform-galera","reconcileID":"0f53d0a1-8277-432a-b2b1-953fbc3cc61e"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x14bf858]

goroutine 809 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:118 +0x1f4
panic({0x16ead40, 0x2775f30})
/usr/local/go/src/runtime/panic.go:884 +0x213
github.com/mariadb-operator/mariadb-operator/pkg/builder.(*Builder).buildGaleraAgentContainer.func1(0xc00074a700?, 0x0?, 0xc00011fea0)
/app/pkg/builder/statefulset_container_builder.go:57 +0x518
github.com/mariadb-operator/mariadb-operator/pkg/builder.(*Builder).buildGaleraAgentContainer(_, _)
/app/pkg/builder/statefulset_container_builder.go:67 +0x23f
github.com/mariadb-operator/mariadb-operator/pkg/builder.(*Builder).buildStsContainers(0x422ec7?, 0xc00074a700, 0x0)
/app/pkg/builder/statefulset_container_builder.go:27 +0x36a
github.com/mariadb-operator/mariadb-operator/pkg/builder.(*Builder).buildStsPodTemplate(0x16dbdc0?, 0xc00074a700, 0x18f5163?, 0x1a?)
/app/pkg/builder/statefulset_builder.go:102 +0x3e
github.com/mariadb-operator/mariadb-operator/pkg/builder.(*Builder).BuildStatefulSet(0xc00011fea0, 0xc00074a700, {{0xc0003ce460?, 0xc002d50000?}, {0xc000048498?, 0xc00095ab60?}}, 0xc0019ab800?)
/app/pkg/builder/statefulset_builder.go:75 +0x4ba
github.com/mariadb-operator/mariadb-operator/controllers.(*MariaDBReconciler).reconcileStatefulSet(0xc00015c4d0, {0x1b403f0, 0xc001c3bda0}, 0xc00074a700)
/app/controllers/mariadb_controller.go:231 +0x149
github.com/mariadb-operator/mariadb-operator/controllers.(*MariaDBReconciler).Reconcile(0xc00015c4d0, {0x1b403f0, 0xc001c3bda0}, {{{0xc0003ce460?, 0x0?}, {0xc000048498?, 0x40de67?}}})
/app/controllers/mariadb_controller.go:155 +0x632
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x1b40348?, {0x1b403f0?, 0xc001c3bda0?}, {{{0xc0003ce460?, 0x1817320?}, {0xc000048498?, 0x1b2ee98?}}})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:121 +0xc8
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000198b40, {0x1b40348, 0xc00042caa0}, {0x1746340?, 0xc000430620?})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:320 +0x327
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000198b40, {0x1b40348, 0xc00042caa0})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273 +0x1d9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:230 +0x587

I was using this example.

Kubernetes: 1.25.5 (AKS)
Linkerd mesh injection (no errors in the sidecar)

Would I probably have to reinstall the operator to "fix it"?

What else would be of help?

Should I try to apply the simple example?
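
Editorial note: the trace above shows a deferred function in controller-runtime's Reconcile logging "Observed a panic in reconciler" and then panicking again, which is why a single bad object takes the whole operator process down. The following is a minimal, stdlib-only sketch of the general alternative, converting a recovered panic into an ordinary error. The names `spec`, `faultyReconcile` and `reconcileSafely` are hypothetical and are not taken from mariadb-operator or controller-runtime; this is not how either project is implemented.

```go
package main

import "fmt"

// spec is a hypothetical object with an optional pointer field.
type spec struct {
	Galera *struct{ Enabled bool }
}

// faultyReconcile stands in for a reconcile step that dereferences
// an optional field without checking it first.
func faultyReconcile(s *spec) error {
	fmt.Println("galera enabled:", s.Galera.Enabled) // panics when s.Galera is nil
	return nil
}

// reconcileSafely runs fn and converts a panic into an ordinary error,
// so one bad object is reported instead of terminating the process.
func reconcileSafely(fn func(*spec) error, s *spec) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered panic in reconciler: %v", r)
		}
	}()
	return fn(s)
}

func main() {
	// The empty spec has a nil Galera pointer, so the step panics
	// and the wrapper reports it as an error.
	err := reconcileSafely(faultyReconcile, &spec{})
	fmt.Println("error returned:", err != nil) // prints "error returned: true"
}
```

Recovering only hides the crash loop; the underlying nil dereference still has to be fixed at its source.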

Guzzi commented on June 10, 2024

Here is the exact CR I applied:

apiVersion: mariadb.mmontes.io/v1alpha1
kind: MariaDB
metadata:
  labels:
  name: infoplatform-galera
  namespace: xxx
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - topologyKey: kubernetes.io/hostname
  connection:
    secretName: mariadb-conn
    secretTemplate:
      key: dsn
  database: mariadb
  galera:
    agent:
      gracefulShutdownTimeout: 5s
      image:
        pullPolicy: IfNotPresent
        repository: ghcr.io/mariadb-operator/agent
        tag: v0.0.2
      kubernetesAuth:
        authDelegatorRoleName: mariadb-galera-auth
        enabled: true
      port: 5555
    enabled: true
    initContainer:
      image:
        pullPolicy: IfNotPresent
        repository: ghcr.io/mariadb-operator/init
        tag: v0.0.5
    recovery:
      enabled: true
    replicaThreads: 1
    sst: mariabackup
    volumeClaimTemplate:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 50Mi
      storageClassName: managed-premium
  image:
    pullPolicy: IfNotPresent
    repository: mariadb
    tag: 10.11.4
  myCnf: |
    [mariadb]
    bind-address=0.0.0.0
    default_storage_engine=InnoDB
    binlog_format=row
    innodb_autoinc_lock_mode=2
    max_allowed_packet=256M
  passwordSecretKeyRef:
    key: password
    name: mariadb
  podDisruptionBudget:
    maxUnavailable: 66%
  podSecurityContext:
    runAsUser: 0
  port: 3306
  replicas: 3
  resources:
    limits:
      memory: 1Gi
    requests:
      cpu: 300m
      memory: 256Mi
  rootPasswordSecretKeyRef:
    key: root-password
    name: mariadb
  service:
    type: ClusterIP
  tolerations:
  - effect: NoSchedule
    key: env
    operator: Equal
    value: dev
  - effect: NoSchedule
    key: project
    operator: Equal
    value: dev-cGroup
  updateStrategy:
    type: RollingUpdate
  username: mariadb
  volumeClaimTemplate:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 5Gi
    storageClassName: managed-premium

mmontes11 commented on June 10, 2024

Hey there!

All your panics are happening in the same Galera() function:

func (m *MariaDB) Galera() Galera {

You seem to be correctly providing the spec.galera field, so I think it is not being mapped correctly on the apiserver side, causing the panics.
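
Editorial note: to illustrate the kind of failure described here, the following is a minimal sketch that assumes spec.galera is modelled as an optional pointer that stays nil when the field does not survive decoding. The types and the methods `UnsafeGalera` and `SafeGalera` are simplified, hypothetical stand-ins; the operator's real definitions and its actual fix may differ.

```go
package main

import "fmt"

// Galera, MariaDBSpec and MariaDB are hypothetical, simplified stand-ins
// for the operator's API types.
type Galera struct {
	Enabled        bool
	ReplicaThreads int
}

type MariaDBSpec struct {
	// Optional field: nil when the object is decoded without spec.galera.
	Galera *Galera
}

type MariaDB struct {
	Spec MariaDBSpec
}

// UnsafeGalera dereferences the pointer unconditionally and panics when it is nil.
func (m *MariaDB) UnsafeGalera() Galera {
	return *m.Spec.Galera
}

// SafeGalera returns the zero value (Galera disabled) when the field is unset.
func (m *MariaDB) SafeGalera() Galera {
	if m == nil || m.Spec.Galera == nil {
		return Galera{}
	}
	return *m.Spec.Galera
}

// panics reports whether calling f panics.
func panics(f func()) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	empty := &MariaDB{} // spec.galera was never populated
	fmt.Println(panics(func() { _ = empty.UnsafeGalera() })) // true
	fmt.Println(empty.SafeGalera().Enabled)                  // false

	set := &MariaDB{Spec: MariaDBSpec{Galera: &Galera{Enabled: true, ReplicaThreads: 1}}}
	fmt.Println(set.SafeGalera().Enabled) // true
}
```

With a guard like this, an object that reaches the controller without spec.galera is treated as "Galera disabled" rather than crashing the reconciler.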

You may have inconsistencies in your MariaDB CRD, so I suggest trying the following:

  • Uninstall mariadb-operator:
    helm uninstall mariadb-operator
  • Delete the current MariaDB CRD:
    kubectl delete crd mariadbs.mariadb.mmontes.io
  • Install mariadb-operator again. This installs a new MariaDB CRD:
    helm install mariadb-operator mariadb-operator/mariadb-operator
  • Apply the MariaDB resource again

Guzzi commented on June 10, 2024

Hi @mmontes11,
thanks for the feedback. I actually had this issue initially: the CR couldn't be created at all for that reason (it claimed spec.galera was missing). I deleted the CR and the CRD and reinstalled the operator, as I had seen that recommended in another ticket here, and after that the error about the missing spec.galera was gone.
I will repeat that tomorrow and see if it helps.

mmontes11 commented on June 10, 2024

Hey @Guzzi ! Thanks, let me know if this finally works for you.

Guzzi commented on June 10, 2024

Hi @mmontes11,
I tried redeploying after the deletion and reinstalling the operator, but it is still crashing:

{"level":"info","ts":1691103806.636239,"msg":"Starting workers","controller":"user","controllerGroup":"mariadb.mmontes.io","controllerKind":"User","worker count":1}
{"level":"info","ts":1691103806.636245,"msg":"Starting workers","controller":"grant","controllerGroup":"mariadb.mmontes.io","controllerKind":"Grant","worker count":1}
{"level":"info","ts":1691103806.6378896,"msg":"Starting workers","controller":"connection","controllerGroup":"mariadb.mmontes.io","controllerKind":"Connection","worker count":1}
{"level":"info","ts":1691103806.6379483,"msg":"Starting workers","controller":"statefulset","controllerGroup":"apps","controllerKind":"StatefulSet","worker count":1}
{"level":"info","ts":1691103806.6385517,"msg":"Starting workers","controller":"database","controllerGroup":"mariadb.mmontes.io","controllerKind":"Database","worker count":1}
{"level":"info","ts":1691103806.6388893,"msg":"Starting workers","controller":"restore","controllerGroup":"mariadb.mmontes.io","controllerKind":"Restore","worker count":1}
{"level":"info","ts":1691103806.639121,"msg":"Starting workers","controller":"backup","controllerGroup":"mariadb.mmontes.io","controllerKind":"Backup","worker count":1}
{"level":"info","ts":1691103806.6402593,"msg":"Starting workers","controller":"sqljob","controllerGroup":"mariadb.mmontes.io","controllerKind":"SqlJob","worker count":1}
{"level":"info","ts":1691103806.6414099,"msg":"Starting workers","controller":"pod","controllerGroup":"","controllerKind":"Pod","worker count":1}
{"level":"info","ts":1691103806.6425667,"msg":"Starting workers","controller":"mariadb","controllerGroup":"mariadb.mmontes.io","controllerKind":"MariaDB","worker count":1}
{"level":"info","ts":1691103806.6430128,"msg":"Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference","controller":"mariadb","controllerGroup":"mariadb.mmontes.io","controllerKind":"MariaDB","mariaDB":{"name":"infoplatform-galera","namespace":"dev-ctraffic-infoplatform"},"namespace":"dev-ctraffic-infoplatform","name":"infoplatform-galera","reconcileID":"de85ddbc-80cd-4e2e-b56b-9220bf4cf02e"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x14bf858]

goroutine 558 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:118 +0x1f4
panic({0x16ead40, 0x2775f30})
/usr/local/go/src/runtime/panic.go:884 +0x213
github.com/mariadb-operator/mariadb-operator/pkg/builder.(*Builder).buildGaleraAgentContainer.func1(0xc000ae6380?, 0x0?, 0xc0005459a0)
/app/pkg/builder/statefulset_container_builder.go:57 +0x518
github.com/mariadb-operator/mariadb-operator/pkg/builder.(*Builder).buildGaleraAgentContainer(_, _)
/app/pkg/builder/statefulset_container_builder.go:67 +0x23f
github.com/mariadb-operator/mariadb-operator/pkg/builder.(*Builder).buildStsContainers(0x40d2fe?, 0xc000ae6380, 0x0)
/app/pkg/builder/statefulset_container_builder.go:27 +0x36a
github.com/mariadb-operator/mariadb-operator/pkg/builder.(*Builder).buildStsPodTemplate(0x16dbdc0?, 0xc000ae6380, 0x18f5163?, 0x1a?)
/app/pkg/builder/statefulset_builder.go:102 +0x3e
github.com/mariadb-operator/mariadb-operator/pkg/builder.(*Builder).BuildStatefulSet(0xc0005459a0, 0xc000ae6380, {{0xc000fc4920?, 0xc002682000?}, {0xc000addc50?, 0xc0008a6340?}}, 0xc0024ca120?)
/app/pkg/builder/statefulset_builder.go:75 +0x4ba
github.com/mariadb-operator/mariadb-operator/controllers.(*MariaDBReconciler).reconcileStatefulSet(0xc000294620, {0x1b403f0, 0xc0019772c0}, 0xc000ae6380)
/app/controllers/mariadb_controller.go:231 +0x149
github.com/mariadb-operator/mariadb-operator/controllers.(*MariaDBReconciler).Reconcile(0xc000294620, {0x1b403f0, 0xc0019772c0}, {{{0xc000a29d60?, 0x0?}, {0xc0003a8ae0?, 0x40de67?}}})
/app/controllers/mariadb_controller.go:155 +0x632
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x1b40348?, {0x1b403f0?, 0xc0019772c0?}, {{{0xc000a29d60?, 0x1817320?}, {0xc0003a8ae0?, 0x1b2ee98?}}})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:121 +0xc8
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc00019caa0, {0x1b40348, 0xc0002d0c80}, {0x1746340?, 0xc00013e2e0?})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:320 +0x327
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc00019caa0, {0x1b40348, 0xc0002d0c80})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273 +0x1d9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:230 +0x587
Logs from 4.8.2023, 01:02:31

Is there anything I can do to help solve this?

mmontes11 commented on June 10, 2024

Hey @Guzzi! Thanks for looking into this.

This is most likely an inconsistency in your CRDs. What do you get if you run the following?

 kubectl explain mariadb.spec.galera

Guzzi commented on June 10, 2024

Hi @mmontes11,
that results in the following output:

kubectl explain mariadb.spec.galera
GROUP:      mariadb.mmontes.io
KIND:       MariaDB
VERSION:    v1alpha1

FIELD: galera

DESCRIPTION:
    Galera allows you to enable multi-master HA via Galera in your MariaDB
    cluster.

FIELDS:
  agent
    GaleraAgent is a sidecar agent that co-operates with mariadb-operator. More
    info: https://github.com/mariadb-operator/agent.

  enabled
    Enabled is a flag to enable Galera.

  initContainer
    InitContainer is an init container that co-operates with mariadb-operator.
    More info: https://github.com/mariadb-operator/init.

  recovery
    GaleraRecovery is the recovery process performed by the operator whenever
    the Galera cluster is not healthy. More info:
    https://galeracluster.com/library/documentation/crash-recovery.html.

  replicaThreads
    ReplicaThreads is the number of replica threads used to apply Galera write
    sets in parallel. More info:
    https://mariadb.com/kb/en/galera-cluster-system-variables/#wsrep_slave_threads.

  sst
    SST is the Snapshot State Transfer used when new Pods join the cluster. More
    info: https://galeracluster.com/library/documentation/sst.html.

  volumeClaimTemplate
    VolumeClaimTemplate is a template for the PVC that will contain the Galera
    configuration files shared between the InitContainer, Agent and MariaDB.

mmontes11 commented on June 10, 2024

I have merged a potential fix:

To be released in v0.0.20, stay tuned!
