Describe the bug Once the creation of a resource f

[Bug] Database, User, Grant creation is not repeated, about mariadb-operator/mariadb-operator

Comments (8)

mmontes11 commented on June 9, 2024

Hey there @andreasgeisslerdt. v0.0.21 will be out this week.

from mariadb-operator.

mmontes11 commented on June 9, 2024

Hey there @andreasgeisslerdt ! Thanks for reporting this!

Could you provide the status of your resources to understand in which reconciliation stage the operator is? For example:

❯ kubectl get database data-test -o jsonpath="{.status}"

{"conditions":[{"lastTransitionTime":"2023-10-10T13:46:44Z","message":"Error connecting to MariaDB","reason":"Failed","status":"False","type":"Ready"}]}%

❯ kubectl get database data-test -o jsonpath="{.status}"

{"conditions":[{"lastTransitionTime":"2023-10-10T13:49:16Z","message":"Created","reason":"Created","status":"True","type":"Ready"}]}%
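The two outputs above differ only in the Ready condition. A minimal sketch of checking that condition programmatically, assuming the JSON has been captured without the trailing `%` that the shell prints; the helper name `is_ready` is hypothetical and not part of the operator or kubectl:

```python
import json


def is_ready(status_json: str) -> bool:
    """Return True if the status has a Ready condition whose status is "True"."""
    status = json.loads(status_json)
    for cond in status.get("conditions", []):
        if cond.get("type") == "Ready":
            return cond.get("status") == "True"
    # No Ready condition at all: treat the resource as not ready.
    return False


# The two status payloads shown above, copied without the shell's trailing "%".
failed = '{"conditions":[{"lastTransitionTime":"2023-10-10T13:46:44Z","message":"Error connecting to MariaDB","reason":"Failed","status":"False","type":"Ready"}]}'
created = '{"conditions":[{"lastTransitionTime":"2023-10-10T13:49:16Z","message":"Created","reason":"Created","status":"True","type":"Ready"}]}'

print(is_ready(failed), is_ready(created))  # False True
```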

The creation of resources (database, user, grant) should be restarted up to a configurable number of retries.

The operator does perform retries on the SQL resources, utilizing an exponential backoff strategy. This means that the longer your connections encounter errors, the more time the operator will wait before retrying. However, this approach is designed to eventually succeed, saving database connections and avoiding unnecessary overhead.
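To illustrate what an exponential backoff schedule looks like, here is a minimal sketch. The base delay, factor and cap are assumptions chosen for the example; they are not the operator's actual values.

```python
def backoff_delays(base: float, factor: float, cap: float, attempts: int) -> list:
    """Return the wait time before each retry, growing by `factor` up to `cap`."""
    delays = []
    delay = base
    for _ in range(attempts):
        delays.append(min(delay, cap))
        delay *= factor
    return delays


# Hypothetical parameters: start at 1s, double each time, never wait more than 60s.
print(backoff_delays(base=1.0, factor=2.0, cap=60.0, attempts=8))
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

The longer a resource keeps failing, the further apart the attempts become, which is why a long outage can leave a resource waiting a long time for its next retry.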

Thanks!

andreasgeisslerdt commented on June 9, 2024

Hi Martin,
here are the details.
I have 3 DBs:

mariadb-galera (galera cluster) + user
cds-db (single pod, no HA) + user + db + grant
policy-mariadb (single pod, no HA) + user + db + grant

Here is the status:

ubuntu@control01-daily-master-sm:~$ kubectl -n onap get mariadb
NAME             READY   STATUS    PRIMARY POD        AGE
cds-db           True    Running   cds-db-0           8h
mariadb-galera   True    Running   mariadb-galera-0   8h
policy-mariadb   True    Running   policy-mariadb-0   7h50m
ubuntu@control01-daily-master-sm:~$ kubectl -n onap get user
NAME          READY   STATUS                        MAXCONNS   MARIADB          AGE
my-user       True    Created                       100        mariadb-galera   8h
policy-user   False   Error connecting to MariaDB   100        policy-mariadb   7h53m
sdnctl        False   Error connecting to MariaDB   100        cds-db           8h
ubuntu@control01-daily-master-sm:~$ kubectl -n onap get database
NAME          READY   STATUS                        CHARSET   COLLATE           MARIADB          AGE     NAME
policyadmin   False   Error connecting to MariaDB   utf8      utf8_general_ci   policy-mariadb   7h53m   
sdnctl        False   Error connecting to MariaDB   utf8      utf8_general_ci   cds-db           8h      
ubuntu@control01-daily-master-sm:~$ kubectl -n onap get grant
NAME                                     READY   STATUS                        DATABASE      TABLE   USERNAME      GRANTOPT   MARIADB          AGE
policy-user-policyadmin-policy-mariadb   False   Error connecting to MariaDB   policyadmin   *       policy-user   true       policy-mariadb   7h54m
sdnctl-sdnctl-cds-db                     False   Error connecting to MariaDB   sdnctl        *       sdnctl        true       cds-db           8h

Here are the requested "database" statuses:

ubuntu@control01-daily-master-sm:~$ kubectl -n onap get database sdnctl -o jsonpath="{.status}"
{"conditions":[{"lastTransitionTime":"2023-10-10T22:54:59Z","message":"Error connecting to MariaDB","reason":"Failed","status":"False","type":"Ready"}]}
ubuntu@control01-daily-master-sm:~$ kubectl -n onap get database policyadmin -o jsonpath="{.status}"
{"conditions":[{"lastTransitionTime":"2023-10-11T04:56:20Z","message":"Error connecting to MariaDB","reason":"Failed","status":"False","type":"Ready"}]}

Here are the logs of the mariadb-operator:

[mysql] 2023/10/10 23:56:20 packets.go:37: unexpected EOF
[mysql] 2023/10/10 23:56:20 packets.go:37: unexpected EOF
[mysql] 2023/10/10 23:56:21 packets.go:37: unexpected EOF
[mysql] 2023/10/11 00:56:20 packets.go:37: unexpected EOF
[mysql] 2023/10/11 00:56:20 packets.go:37: unexpected EOF
[mysql] 2023/10/11 00:56:21 packets.go:37: unexpected EOF
[mysql] 2023/10/11 01:56:20 packets.go:37: unexpected EOF
[mysql] 2023/10/11 01:56:20 packets.go:37: unexpected EOF
{"level":"error","ts":1696989380.774429,"msg":"Reconciler error","controller":"grant","controllerGroup":"mariadb.mmontes.io","controllerKind":"Grant","grant":{"name":"sdnctl-sdnctl-cds-db","namespace":"onap"},"namespace":"onap","name":"sdnctl-sdnctl-cds-db","reconcileID":"2f3f9d82-2504-43cd-a755-c0581e55a884","error":"error reconciling in TemplateReconciler: error creating MariaDB client: 1 error occurred:\n\t* driver: bad connection\n\n","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234"}
{"level":"error","ts":1696989380.776826,"msg":"Reconciler error","controller":"database","controllerGroup":"mariadb.mmontes.io","controllerKind":"Database","database":{"name":"sdnctl","namespace":"onap"},"namespace":"onap","name":"sdnctl","reconcileID":"2e8c4bec-546f-435c-91ad-ea38e3600b90","error":"error reconciling in TemplateReconciler: error creating MariaDB client: 1 error occurred:\n\t* driver: bad connection\n\n","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234"}
[mysql] 2023/10/11 01:56:21 packets.go:37: unexpected EOF
{"level":"error","ts":1696989381.1669478,"msg":"Reconciler error","controller":"user","controllerGroup":"mariadb.mmontes.io","controllerKind":"User","user":{"name":"sdnctl","namespace":"onap"},"namespace":"onap","name":"sdnctl","reconcileID":"4a2f3fc1-8228-45b4-883d-0176210a8c63","error":"error reconciling in TemplateReconciler: error creating MariaDB client: 1 error occurred:\n\t* driver: bad connection\n\n","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234"}

The connection problems occur only for the single-pod MariaDBs, and IMHO they are caused by the fact that the "PeerAuthentication" is not yet active at the time the user/grant/database is created by the mariadb-operator.

andreasgeisslerdt commented on June 9, 2024

After this situation, I restarted all mariadb-operator pods, deleted the "sdnctl" user and redeployed it.
The result is that the new user is not handled at all:

ubuntu@control01-daily-master-sm:~$ kubectl -n onap get user
NAME          READY   STATUS                        MAXCONNS   MARIADB          AGE
my-user       True    Created                       100        mariadb-galera   8h
policy-user   False   Error connecting to MariaDB   100        policy-mariadb   8h
sdnctl                                              100        cds-db           82s

mmontes11 commented on June 9, 2024

Hey @andreasgeisslerdt ! Thanks a lot for your input, very much appreciated.

Judging by your comment, it seems like exponential backoff is not quite working for your case. We would need to retry more often.

After this situation, I restarted all mariadb-operator pods, deleted the "sdnctl" user and redeployed it.
The result is, that the new user is not handled at all:

Right, you have re-deployed the resource with the same name and not restarted the operator. The thing is that the operator has an internal retry cache indexed by name/namespace, so even if you recreate the resource, it won't retry as it hits the cache.
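The cache behaviour described above can be illustrated with a small sketch. This is not the operator's code: the class `RetryCache` and its parameters are hypothetical, and it only shows why a backoff state keyed by namespace and name survives deleting and recreating a resource under the same name.

```python
class RetryCache:
    """Hypothetical backoff state keyed by (namespace, name)."""

    def __init__(self, base: float = 1.0, factor: float = 2.0):
        self.base = base
        self.factor = factor
        self._next_delay = {}

    def next_delay(self, namespace: str, name: str) -> float:
        """Return the delay for this failure and grow the stored delay."""
        key = (namespace, name)
        delay = self._next_delay.get(key, self.base)
        self._next_delay[key] = delay * self.factor
        return delay


cache = RetryCache()

# Five failed attempts for the original "sdnctl" user: 1, 2, 4, 8, 16 seconds.
for _ in range(5):
    cache.next_delay("onap", "sdnctl")

# Deleting and recreating the resource does not change its key,
# so the recreated resource inherits the old, long delay.
recreated = cache.next_delay("onap", "sdnctl")

# A resource with a different name starts from the base delay.
fresh = cache.next_delay("onap", "sdnctl-new")

print(recreated, fresh)  # 32.0 1.0
```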

My suggestion is to introduce a spec.retryInterval field in the SQL resources (User, Grant, Database) so you can opt out of exponential backoff and explicitly define the retry interval:

apiVersion: mariadb.mmontes.io/v1alpha1
kind: User
metadata:
  name: user
spec:
  mariaDbRef:
    name: mariadb
  passwordSecretKeyRef:
    name: user
    key: password
  maxUserConnections: 20
  host: "%"
  retryInterval: 5s

apiVersion: mariadb.mmontes.io/v1alpha1
kind: Grant
metadata:
  name: grant
spec:
  mariaDbRef:
    name: mariadb
  privileges:
    - "SELECT"
    - "INSERT"
    - "UPDATE"
  database: "*"
  table: "*"
  username: user
  grantOption: true
  host: "%"
  retryInterval: 5s

Bear in mind that this can potentially increase the number of connections in your database.
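A rough back-of-the-envelope sketch of that trade-off, counting connection attempts for one failing resource over an hour. The 5s interval matches the example above; the backoff factor and the 300s cap are assumptions for illustration only, not the operator's real settings.

```python
def count_attempts(window: float, first_delay: float,
                   factor: float = 1.0, cap: float = float("inf")) -> int:
    """Count attempts started within `window` seconds, beginning at t=0."""
    t, delay, attempts = 0.0, first_delay, 0
    while t < window:
        attempts += 1
        t += min(delay, cap)
        delay *= factor
    return attempts


one_hour = 3600.0

# Fixed retryInterval of 5s: one attempt every 5 seconds.
fixed = count_attempts(one_hour, first_delay=5.0)

# Hypothetical exponential backoff: 5s doubling each time, capped at 300s.
backoff = count_attempts(one_hour, first_delay=5.0, factor=2.0, cap=300.0)

print(fixed, backoff)  # 720 17
```

Under these assumptions a short fixed interval makes far more attempts per failing resource than a capped backoff, which is the extra connection load to weigh against faster recovery.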

Thoughts?

andreasgeisslerdt commented on June 9, 2024

I just tested again the procedure to deploy the MariaDB (cds-db) and user.

I deleted the old user and DB
Deleted all mariadb-operator pods
deployed mariadb and user
deployment was successful (logs showed retries to create the user until DB was ready)

So the problem might not be the retries but the caching.
Could it be that, although the connection to the DB was not successful (EOF, see above) and the creation failed, the resource was added to the cache and never retried?

mmontes11 commented on June 9, 2024

Hey there @andreasgeisslerdt ! I was working on a PR for this and fixed a bug in the controller responsible for SQL resources. Basically, after a resource has been not ready for a while, the controller does not set it back to ready once it becomes healthy. The fix will be released in v0.0.21 along with the spec.retryInterval feature, which I still think can be useful for many use cases.

andreasgeisslerdt commented on June 9, 2024

OK, thanks for the info. What is the planned release date for v0.0.21?
In the meantime I will test:

enable galera for the 2 failing DBs or
disable istio sidecar injection for all DBs
