
Comments (8)

mmontes11 commented on June 9, 2024

Hey there @andreasgeisslerdt. v0.0.21 will be out this week.

from mariadb-operator.

mmontes11 commented on June 9, 2024

Hey there @andreasgeisslerdt ! Thanks for reporting this!

Could you provide the status of your resources so we can see which reconciliation stage the operator is in? For example:

❯ kubectl get database data-test -o jsonpath="{.status}"

{"conditions":[{"lastTransitionTime":"2023-10-10T13:46:44Z","message":"Error connecting to MariaDB","reason":"Failed","status":"False","type":"Ready"}]}%

❯ kubectl get database data-test -o jsonpath="{.status}"

{"conditions":[{"lastTransitionTime":"2023-10-10T13:49:16Z","message":"Created","reason":"Created","status":"True","type":"Ready"}]}%
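The same check can be scripted when many resources need inspecting. The following is a minimal sketch, assuming the `.status` JSON has the shape shown in the two outputs above; the helper names `ready_condition` and `is_ready` are hypothetical and not part of the operator or kubectl.

```python
import json
from typing import Optional


def ready_condition(status_json: str) -> Optional[dict]:
    """Return the condition of type "Ready" from a .status JSON string, or None.

    An empty string is treated as "no status yet", which is what a jsonpath
    query prints for a resource the operator has not reconciled.
    """
    if not status_json.strip():
        return None
    status = json.loads(status_json)
    for condition in status.get("conditions", []):
        if condition.get("type") == "Ready":
            return condition
    return None


def is_ready(status_json: str) -> bool:
    """True only if a Ready condition exists and its status is the string "True"."""
    condition = ready_condition(status_json)
    return condition is not None and condition.get("status") == "True"
```

Feeding it the output of `kubectl get database data-test -o jsonpath="{.status}"` distinguishes three cases: ready, failed with a message, and not reconciled at all.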

The creation of resources (database, user, grant) should be restarted up to a configurable number of retries.

The operator does perform retries on the SQL resources, utilizing an exponential backoff strategy. This means that the longer your connections encounter errors, the more time the operator will wait before retrying. However, this approach is designed to eventually succeed, saving database connections and avoiding unnecessary overhead.
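To illustrate how the wait grows under exponential backoff, here is a minimal sketch. The base delay of 1 second and the cap of 300 seconds are assumptions chosen for readability, not the operator's actual settings, and `backoff_delay` is a hypothetical helper.

```python
def backoff_delay(failures: int, base: float = 1.0, cap: float = 300.0) -> float:
    """Seconds to wait before the next retry after `failures` consecutive failures.

    The delay doubles with every failure and is clamped to `cap`.
    `base` and `cap` are illustrative values, not the operator's configuration.
    """
    if failures < 1:
        raise ValueError("failures must be >= 1")
    # Clamp the exponent so very large failure counts cannot overflow a float.
    exponent = min(failures - 1, 60)
    return min(cap, base * 2 ** exponent)


# Delays for the first ten consecutive failures.
delays = [backoff_delay(n) for n in range(1, 11)]
```

With these assumed values the first retries happen within seconds, while a resource that has been failing for a long time is only retried once per `cap` interval.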

Thanks!


andreasgeisslerdt commented on June 9, 2024

Hi Martin,
Here are the details.
I have 3 DBs:

  • mariadb-galera (galera cluster) + user
  • cds-db (single pod, no HA) + user + db + grant
  • policy-mariadb (single pod, no HA) + user + db + grant

Here is the status:

ubuntu@control01-daily-master-sm:~$ kubectl -n onap get mariadb
NAME             READY   STATUS    PRIMARY POD        AGE
cds-db           True    Running   cds-db-0           8h
mariadb-galera   True    Running   mariadb-galera-0   8h
policy-mariadb   True    Running   policy-mariadb-0   7h50m
ubuntu@control01-daily-master-sm:~$ kubectl -n onap get user
NAME          READY   STATUS                        MAXCONNS   MARIADB          AGE
my-user       True    Created                       100        mariadb-galera   8h
policy-user   False   Error connecting to MariaDB   100        policy-mariadb   7h53m
sdnctl        False   Error connecting to MariaDB   100        cds-db           8h
ubuntu@control01-daily-master-sm:~$ kubectl -n onap get database
NAME          READY   STATUS                        CHARSET   COLLATE           MARIADB          AGE     NAME
policyadmin   False   Error connecting to MariaDB   utf8      utf8_general_ci   policy-mariadb   7h53m   
sdnctl        False   Error connecting to MariaDB   utf8      utf8_general_ci   cds-db           8h      
ubuntu@control01-daily-master-sm:~$ kubectl -n onap get grant
NAME                                     READY   STATUS                        DATABASE      TABLE   USERNAME      GRANTOPT   MARIADB          AGE
policy-user-policyadmin-policy-mariadb   False   Error connecting to MariaDB   policyadmin   *       policy-user   true       policy-mariadb   7h54m
sdnctl-sdnctl-cds-db                     False   Error connecting to MariaDB   sdnctl        *       sdnctl        true       cds-db           8h

Here are the requested "database" statuses:

ubuntu@control01-daily-master-sm:~$ kubectl -n onap get database sdnctl -o jsonpath="{.status}"
{"conditions":[{"lastTransitionTime":"2023-10-10T22:54:59Z","message":"Error connecting to MariaDB","reason":"Failed","status":"False","type":"Ready"}]}
ubuntu@control01-daily-master-sm:~$ kubectl -n onap get database policyadmin -o jsonpath="{.status}"
{"conditions":[{"lastTransitionTime":"2023-10-11T04:56:20Z","message":"Error connecting to MariaDB","reason":"Failed","status":"False","type":"Ready"}]}

Here are the logs of the mariadb-operator:

[mysql] 2023/10/10 23:56:20 packets.go:37: unexpected EOF
[mysql] 2023/10/10 23:56:20 packets.go:37: unexpected EOF
[mysql] 2023/10/10 23:56:21 packets.go:37: unexpected EOF
[mysql] 2023/10/11 00:56:20 packets.go:37: unexpected EOF
[mysql] 2023/10/11 00:56:20 packets.go:37: unexpected EOF
[mysql] 2023/10/11 00:56:21 packets.go:37: unexpected EOF
[mysql] 2023/10/11 01:56:20 packets.go:37: unexpected EOF
[mysql] 2023/10/11 01:56:20 packets.go:37: unexpected EOF
{"level":"error","ts":1696989380.774429,"msg":"Reconciler error","controller":"grant","controllerGroup":"mariadb.mmontes.io","controllerKind":"Grant","grant":{"name":"sdnctl-sdnctl-cds-db","namespace":"onap"},"namespace":"onap","name":"sdnctl-sdnctl-cds-db","reconcileID":"2f3f9d82-2504-43cd-a755-c0581e55a884","error":"error reconciling in TemplateReconciler: error creating MariaDB client: 1 error occurred:\n\t* driver: bad connection\n\n","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234"}
{"level":"error","ts":1696989380.776826,"msg":"Reconciler error","controller":"database","controllerGroup":"mariadb.mmontes.io","controllerKind":"Database","database":{"name":"sdnctl","namespace":"onap"},"namespace":"onap","name":"sdnctl","reconcileID":"2e8c4bec-546f-435c-91ad-ea38e3600b90","error":"error reconciling in TemplateReconciler: error creating MariaDB client: 1 error occurred:\n\t* driver: bad connection\n\n","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234"}
[mysql] 2023/10/11 01:56:21 packets.go:37: unexpected EOF
{"level":"error","ts":1696989381.1669478,"msg":"Reconciler error","controller":"user","controllerGroup":"mariadb.mmontes.io","controllerKind":"User","user":{"name":"sdnctl","namespace":"onap"},"namespace":"onap","name":"sdnctl","reconcileID":"4a2f3fc1-8228-45b4-883d-0176210a8c63","error":"error reconciling in TemplateReconciler: error creating MariaDB client: 1 error occurred:\n\t* driver: bad connection\n\n","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234"}

The connection problems occur only for the single-pod MariaDBs. IMHO they are caused by the fact that the "PeerAuthentication" is not yet active at the time the user/grant/database is created by the mariadb-operator.


andreasgeisslerdt commented on June 9, 2024

After this situation, I restarted all mariadb-operator pods, deleted the "sdnctl" user and redeployed it.
The result is that the new user is not handled at all:

ubuntu@control01-daily-master-sm:~$ kubectl -n onap get user
NAME          READY   STATUS                        MAXCONNS   MARIADB          AGE
my-user       True    Created                       100        mariadb-galera   8h
policy-user   False   Error connecting to MariaDB   100        policy-mariadb   8h
sdnctl                                              100        cds-db           82s


mmontes11 commented on June 9, 2024

Hey @andreasgeisslerdt ! Thanks a lot for your input, very much appreciated.

Judging by your comment, it seems like exponential backoff is not quite working for your case. We would need to retry more often.

After this situation, I restarted all mariadb-operator pods, deleted the "sdnctl" user and redeployed it.
The result is, that the new user is not handled at all:

Right, you have re-deployed the resource with the same name and not restarted the operator. The thing is that the operator has an internal retry cache indexed by name/namespace, so even if you recreate the resource, it won't retry as it hits the cache.
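As a toy model of that behaviour (not the operator's actual code), the sketch below keeps failure counts in a dictionary keyed by `(namespace, name)`. The `RetryCache` class and the assumption that the cache lives only in the operator's memory are both hypothetical; the point is only that a recreated resource with the same name and namespace lands on the same key, while a fresh process starts from an empty cache.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class RetryCache:
    """Toy failure tracker keyed by (namespace, name) rather than by object UID."""

    failures: Dict[Tuple[str, str], int] = field(default_factory=dict)

    def record_failure(self, namespace: str, name: str) -> int:
        """Increment and return the failure count stored under this key."""
        key = (namespace, name)
        self.failures[key] = self.failures.get(key, 0) + 1
        return self.failures[key]

    def forget(self, namespace: str, name: str) -> None:
        """Drop the entry, for example after a successful reconciliation."""
        self.failures.pop((namespace, name), None)


cache = RetryCache()
for _ in range(5):
    cache.record_failure("onap", "sdnctl")

# Deleting and recreating the User does not change its namespace or name,
# so the new object inherits the old failure count.
count_after_recreate = cache.record_failure("onap", "sdnctl")

# A new process (assumed in-memory cache) starts from scratch.
count_after_restart = RetryCache().record_failure("onap", "sdnctl")
```

Keying by the object's UID instead, or calling `forget` when a delete event is observed, would be two ways to avoid carrying state over to a recreated resource in a model like this one.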

My suggestion is to introduce a spec.retryInterval in the SQL resources (User, Grant, Database) so you can opt out of exponential backoff and explicitly define the retry interval:

apiVersion: mariadb.mmontes.io/v1alpha1
kind: User
metadata:
  name: user
spec:
  mariaDbRef:
    name: mariadb
  passwordSecretKeyRef:
    name: user
    key: password
  maxUserConnections: 20
  host: "%"
  retryInterval: 5s
---
apiVersion: mariadb.mmontes.io/v1alpha1
kind: Grant
metadata:
  name: grant
spec:
  mariaDbRef:
    name: mariadb
  privileges:
    - "SELECT"
    - "INSERT"
    - "UPDATE"
  database: "*"
  table: "*"
  username: user
  grantOption: true
  host: "%"
  retryInterval: 5s

Bear in mind that this can potentially increase the number of connections in your database.
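To get a feel for that trade-off, the sketch below counts how many retries fall inside a time window for a fixed interval versus a capped exponential backoff. It assumes each retry opens one connection to the database; the backoff base of 1 second and cap of 300 seconds are illustrative values, and both function names are hypothetical.

```python
def attempts_fixed(window: float, interval: float) -> int:
    """Number of retries fired within `window` seconds at a fixed interval."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    return int(window // interval)


def attempts_backoff(window: float, base: float = 1.0, cap: float = 300.0) -> int:
    """Number of retries fired within `window` seconds when the delay doubles
    after every failure and is clamped to `cap` (illustrative values)."""
    elapsed = 0.0
    attempts = 0
    delay = base
    while elapsed + delay <= window:
        elapsed += delay
        attempts += 1
        delay = min(cap, delay * 2)
    return attempts


# One hour of an unreachable database.
fixed = attempts_fixed(3600, 5)
backoff = attempts_backoff(3600)
```

Under these assumptions, a `retryInterval` of 5s produces 720 attempts per resource in an hour of downtime, against 19 for the capped backoff, so the interval is worth sizing against the number of SQL resources and the database's connection limits.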

Thoughts?


andreasgeisslerdt commented on June 9, 2024

I just re-tested the procedure for deploying the MariaDB (cds-db) and user.

  • I deleted the old user and DB
  • Deleted all mariadb-operator pods
  • deployed mariadb and user
  • deployment was successful (logs showed retries to create the user until DB was ready)

So the problem might not be the retries but the caching.
Could it be that, although the connection to the DB was not successful (EOF, see above) and the creation failed, the resource was added to the cache and never retried?


mmontes11 commented on June 9, 2024

Hey there @andreasgeisslerdt! I was working on a PR for this and fixed a bug in the controller responsible for SQL resources. Basically, after a resource has been not ready for a while, the controller was not setting it back to ready once it became healthy. The fix will be released in v0.0.21 along with the spec.retryInterval feature, which I still think can be useful for many use cases.


andreasgeisslerdt commented on June 9, 2024

OK, thanks for the info. What is the planned release date for v0.0.21?
In the meantime I will test:

  • enable galera for the 2 failing DBs or
  • disable istio sidecar injection for all DBs

