Comments (12)

vitabaks commented on June 2, 2024

I will also add updating the pgbackrest package on the backup server to the automation.

UPD: #648

from postgresql_cluster.

chuegel commented on June 2, 2024

Try reinit replica
patronictl reinit postgres-cluster <problem replica name>

After you have fixed the problem with the different versions of the pgbackrest packages between the database servers and the backup servers, try running the update_pgcluster.yml playbook again to complete the cluster update.

That worked!!!

patronictl list postgres-cluster
+ Cluster: postgres-cluster (7253014758852064969) --+----+-----------+
| Member       | Host         | Role    | State     | TL | Lag in MB |
+--------------+--------------+---------+-----------+----+-----------+
| postgresql01 | 10.83.200.12 | Leader  | running   | 15 |           |
| postgresql02 | 10.83.200.13 | Replica | streaming | 15 |         0 |
| postgresql03 | 10.83.200.14 | Replica | streaming | 15 |         0 |
+--------------+--------------+---------+-----------+----+-----------+
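A check like the one above can also be scripted. The following is a minimal sketch, assuming the table layout shown in this comment (pipe-separated cells, a header row with "Member", "Role" and "State") and assuming that a healthy replica reports the state "streaming", as it does here. The function names are made up for illustration and are not part of Patroni. Other Patroni versions may print different columns or states, and a machine-readable output format, where available, would be more robust than parsing the table.

```python
# Sample copied from the patronictl output in this thread.
SAMPLE = """\
+ Cluster: postgres-cluster (7253014758852064969) --+----+-----------+
| Member       | Host         | Role    | State     | TL | Lag in MB |
+--------------+--------------+---------+-----------+----+-----------+
| postgresql01 | 10.83.200.12 | Leader  | running   | 15 |           |
| postgresql02 | 10.83.200.13 | Replica | streaming | 15 |         0 |
| postgresql03 | 10.83.200.14 | Replica | streaming | 15 |         0 |
+--------------+--------------+---------+-----------+----+-----------+
"""


def parse_patronictl_list(text):
    """Parse the ASCII table into a list of dicts keyed by column name."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines and the "+ Cluster: ..." title
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first pipe-delimited line is the header row
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def unhealthy_replicas(rows):
    """Return names of replicas whose state is not 'streaming' (assumed healthy state)."""
    return [r["Member"] for r in rows
            if r["Role"] == "Replica" and r["State"] != "streaming"]


if __name__ == "__main__":
    members = parse_patronictl_list(SAMPLE)
    print(unhealthy_replicas(members))  # [] for the table above
```

An empty list means every replica in the table is streaming from the leader.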

Thank you, Sir!

vitabaks commented on June 2, 2024

Please attach the Ansible log.

chuegel commented on June 2, 2024

Thanks for your reply.
Digging deeper into the logs, it turned out that the pgbackrest version on the pgbackrest server was lagging behind:

2024-04-30 00:00:06.440 P00   INFO: archive-get command end: aborted with exception [103]
2024-04-30 00:00:06 CEST [442637-1]  LOG:  started streaming WAL from primary at 27/B0000000 on timeline 15
2024-04-30 00:00:06 CEST [442637-2]  FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 0000000F00000027000000B0 has already been removed
2024-04-30 00:00:06.490 P00   INFO: archive-get command begin 2.51: [00000010.history, pg_wal/RECOVERYHISTORY] --exec-id=442639-759ac39e --log-level-console=info --log-level-file=detail --log-path=/var/log/pgbackrest --pg1-path=/var/lib/postgresql/15/main --process-max=4 --repo1-host=10.83.43.119 --repo1-host-user=postgres --repo1-path=/var/lib/pgbackrest --repo1-type=posix --stanza=postgres-cluster
WARN: repo1: [ProtocolError] expected value '2.51' for greeting key 'version' but got '2.50'
      HINT: is the same version of pgBackRest installed on the local and remote host?
ERROR: [103]: unable to find a valid repository
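To spot this kind of mismatch quickly in a large log, the warning can be matched with a regular expression. This is only a sketch, assuming the warning has exactly the wording shown above. The function name and the "local"/"remote" labels are illustrative and not part of pgBackRest. In this log the local binary (the replica) is 2.51 and the repository host answered with 2.50.

```python
import re

# Pattern based on the WARN line quoted above; the wording is assumed to be stable.
GREETING_RE = re.compile(
    r"expected value '([^']+)' for greeting key 'version' but got '([^']+)'"
)


def find_version_mismatch(log_text):
    """Return {'local': ..., 'remote': ...} for the first mismatch, or None."""
    match = GREETING_RE.search(log_text)
    if match is None:
        return None
    return {"local": match.group(1), "remote": match.group(2)}


if __name__ == "__main__":
    line = ("WARN: repo1: [ProtocolError] expected value '2.51' "
            "for greeting key 'version' but got '2.50'")
    print(find_version_mismatch(line))  # {'local': '2.51', 'remote': '2.50'}
```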

After upgrading the pgbackrest server, I now get this error:

2024-04-30 09:25:12 CEST [29612-1]  LOG:  started streaming WAL from primary at 27/B0000000 on timeline 15
2024-04-30 09:25:12 CEST [29612-2]  FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 0000000F00000027000000B0 has already been removed
2024-04-30 09:25:12.412 P00   INFO: archive-get command begin 2.51: [00000010.history, pg_wal/RECOVERYHISTORY] --exec-id=29614-ee35ac41 --log-level-console=info --log-level-file=detail --log-path=/var/log/pgbackrest --pg1-path=/var/lib/postgresql/15/main --process-max=4 --repo1-host=10.83.43.119 --repo1-host-user=postgres --repo1-path=/var/lib/pgbackrest --repo1-type=posix --stanza=postgres-cluster
2024-04-30 09:25:12.670 P00   INFO: unable to find 00000010.history in the archive
2024-04-30 09:25:12.771 P00   INFO: archive-get command end: completed successfully (363ms)
2024-04-30 09:25:12 CEST [859-806]  LOG:  waiting for WAL to become available at 27/B0002000
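The file names in these messages encode the timeline and WAL position, which helps when reading the log. The sketch below assumes the usual PostgreSQL naming (8 hex digits of timeline, 8 of the high LSN half, 8 of the segment index) and the default 16 MB segment size. Both function names are hypothetical helpers. Decoded this way, 0000000F00000027000000B0 is timeline 15 starting at 27/B0000000, matching the "started streaming" line, and 00000010.history refers to timeline 16. As far as I can tell, a standby on timeline 15 probing for a timeline 16 history file that does not exist is not itself the failure here; the missing WAL segment is.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: PostgreSQL's default 16 MB segments


def decode_wal_segment(name, segment_size=WAL_SEGMENT_SIZE):
    """Split a 24-hex-digit WAL file name into (timeline, start LSN)."""
    if len(name) != 24:
        raise ValueError(f"not a WAL segment name: {name!r}")
    timeline = int(name[0:8], 16)
    log_id = int(name[8:16], 16)   # high 32 bits of the LSN
    seg_no = int(name[16:24], 16)  # segment index within that log_id
    offset = seg_no * segment_size
    return timeline, f"{log_id:X}/{offset:X}"


def history_timeline(name):
    """Return the timeline number encoded in a '<8 hex digits>.history' name."""
    return int(name.split(".")[0], 16)


if __name__ == "__main__":
    print(decode_wal_segment("0000000F00000027000000B0"))  # (15, '27/B0000000')
    print(history_timeline("00000010.history"))            # 16
```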

chuegel commented on June 2, 2024

The other replica seems to be fine:

2024-04-30 09:57:30 CEST [700-23]  LOG:  recovery restart point at 2C/B4010EA8
2024-04-30 09:57:30 CEST [700-24]  DETAIL:  Last completed transaction was at log time 2024-04-30 09:57:22.977464+02.
2024-04-30 10:12:28 CEST [700-25]  LOG:  restartpoint starting: time
2024-04-30 10:12:30 CEST [700-26]  LOG:  restartpoint complete: wrote 21 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; write=2.018 s, sync=0.003 s, total=2.037 s; sync files=16, longest=0.002 s, average=0.001 s; distance=46 kB, estimate=14701 kB

vitabaks commented on June 2, 2024

It is strange that the pgbackrest package has not been updated with target=system

vitabaks commented on June 2, 2024

I understand now: you are using a dedicated pgbackrest server, and that is where the old package is installed. So yes, it's worth updating the pgbackrest server first.

P.S. I switched to minio (s3) and I no longer have similar problems with the pgbackrest versions.

chuegel commented on June 2, 2024

I understand now: you are using a dedicated pgbackrest server, and that is where the old package is installed. So yes, it's worth updating the pgbackrest server first.

P.S. I switched to minio (s3) and I no longer have similar problems with the pgbackrest versions.

Yes, I use a dedicated pgbackrest server. After aligning the versions, one replica complains with:

2024-04-30 10:29:30 CEST [56835-1]  LOG:  started streaming WAL from primary at 27/B0000000 on timeline 15
2024-04-30 10:29:30 CEST [56835-2]  FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 0000000F00000027000000B0 has already been removed
2024-04-30 10:29:30.877 P00   INFO: archive-get command begin 2.51: [00000010.history, pg_wal/RECOVERYHISTORY] --exec-id=56837-4247d19a --log-level-console=info --log-level-file=detail --log-path=/var/log/pgbackrest --pg1-path=/var/lib/postgresql/15/main --process-max=4 --repo1-host=10.83.43.119 --repo1-host-user=postgres --repo1-path=/var/lib/pgbackrest --repo1-type=posix --stanza=postgres-cluster
2024-04-30 10:29:31.129 P00   INFO: unable to find 00000010.history in the archive

But there is no 00000010.history on the pgbackrest server:

ls -la /var/lib/pgbackrest/archive/postgres-cluster/15-1/
total 192
drwxr-x--- 7 postgres postgres  4096 Apr 27 00:01 .
drwxr-x--- 3 postgres postgres  4096 Apr 30 00:01 ..
-rw-r----- 1 postgres postgres   610 Mar  5 15:35 0000000F.history
drwxr-x--- 2 postgres postgres 36864 Apr 27 00:01 0000000F00000028
drwxr-x--- 2 postgres postgres 32768 Apr 16 07:01 0000000F00000029
drwxr-x--- 2 postgres postgres 36864 Apr 21 12:31 0000000F0000002A
drwxr-x--- 2 postgres postgres 36864 Apr 26 17:31 0000000F0000002B
drwxr-x--- 2 postgres postgres 24576 Apr 30 09:01 0000000F0000002C
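The listing also suggests why the replica cannot catch up from the archive. Judging from the directory names above, archived segments appear to be grouped in directories named after the first 16 hex digits of the segment name. The segment the replica asks for, 0000000F00000027000000B0, would then live under 0000000F00000027, and no such directory is listed. The sketch below checks this. It relies on that assumed layout, and the helper names are illustrative only.

```python
# Directory names copied from the ls output above.
ARCHIVE_DIRS = [
    "0000000F00000028",
    "0000000F00000029",
    "0000000F0000002A",
    "0000000F0000002B",
    "0000000F0000002C",
]


def archive_dir_for(segment):
    """Directory expected to hold a segment (assumed: first 16 hex digits)."""
    if len(segment) != 24:
        raise ValueError(f"not a WAL segment name: {segment!r}")
    return segment[:16]


def segment_dir_present(segment, listing):
    """True if the directory that should contain the segment is in the listing."""
    return archive_dir_for(segment) in set(listing)


if __name__ == "__main__":
    wanted = "0000000F00000027000000B0"
    print(archive_dir_for(wanted))                    # 0000000F00000027
    print(segment_dir_present(wanted, ARCHIVE_DIRS))  # False
```

If the directory is absent, the segment has most likely expired from the repository, so the replica can neither stream it from the primary nor restore it from the archive.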

chuegel commented on June 2, 2024

It is strange that the pgbackrest package has not been updated with target=system

That's because the playbook didn't run against the pgbackrest host. Not sure why.

vitabaks commented on June 2, 2024

The playbook is designed to update the postgres cluster, not the backup server.

chuegel commented on June 2, 2024

I understand. The pgbackrest package was updated successfully on the replicas. On the leader, the playbook does a switchover before upgrading packages. Since the switchover failed, the leader also still had an older version of pgbackrest.
I manually upgraded the pgbackrest package on the leader and on the pgbackrest server:
apt install --only-upgrade pgbackrest

I'm not quite sure which steps to take to recover the one replica.

vitabaks commented on June 2, 2024

Try reinitializing the replica:
patronictl reinit postgres-cluster <problem replica name>

After you have fixed the problem with the different versions of the pgbackrest packages between the database servers and the backup servers, try running the update_pgcluster.yml playbook again to complete the cluster update.
