Comments (4)
It seems like you are upgrading from an ancient version of Postgres. This issue was fixed here: cloudfoundry/bpm-release#152
from bosh.
Thank you so much for the response @rkoster! Indeed, we're operating an "outdated" BOSH environment and have not upgraded as regularly as we should. We have seen this issue intermittently on a few runs of BOSH Director upgrade testing.
How can we move forward with this BOSH Director v280.0.14 upgrade and ensure that this issue won't happen in our existing production BOSH environments?
Option 1: Can we first manually shut down Postgres 10 on the BOSH Director VM before attempting the BOSH Director upgrade? If yes, which command sequence should be used to properly shut down Postgres 10 and the other BOSH Director related services?
Option 2: First update the BPM component to v1.1.14 or higher (cloudfoundry/bpm-release#152 (comment)), which includes the fix, on the current BOSH Director v271.2.0 before upgrading to BOSH Director v280.0.14.
Any other options? Greatly appreciate your suggestions here.
from bosh.
Updating BPM would still be an update of the instance, and as such would have a chance of an improper Postgres shutdown.
@bgandon do you remember if there was a workaround that was used before the fix was implemented?
from bosh.
Hi @bgandon,
As @rkoster confirmed, Option 2 will likely run into the same improper Postgres shutdown. Could you please advise on the workaround you used before the BPM fix was implemented, if possible?
We're thinking of using Option 1 as a workaround: manually shutting down Postgres 10 on the BOSH Director VM before attempting the BOSH Director upgrade. Please help confirm whether the following steps will work.
- SSH into BOSH Director VM.
- Use Monit to stop all processes except Postgres.
bosh/0:~# for name in "credhub" "uaa" "health_monitor" "director_nginx" "director_sync_dns" "director_scheduler" "blobstore_nginx" "nats" "director"; do monit stop "${name}"; done
bosh/0:~# monit summary
The Monit daemon 5.2.5 uptime: 7d 2h 19m
Process 'nats' not monitored
Process 'postgres' running
Process 'blobstore_nginx' not monitored
Process 'director' not monitored
Process 'worker_1' not monitored
Process 'worker_2' not monitored
Process 'worker_3' not monitored
Process 'worker_4' not monitored
Process 'director_scheduler' not monitored
Process 'director_sync_dns' not monitored
Process 'director_nginx' not monitored
Process 'health_monitor' not monitored
Process 'uaa' not monitored
Process 'credhub' not monitored
System 'system_be0914a6-1473-47f1-58d9-4f3aacbe2ab5' running
- Unmonitor the Postgres process, so Monit won't restart it when Postgres is later shut down directly with the "kill" command.
bosh/0:~# monit unmonitor postgres
bosh/0:~# monit summary
The Monit daemon 5.2.5 uptime: 7d 2h 54m
Process 'nats' not monitored
Process 'postgres' not monitored
Process 'blobstore_nginx' not monitored
Process 'director' not monitored
Process 'worker_1' not monitored
Process 'worker_2' not monitored
Process 'worker_3' not monitored
Process 'worker_4' not monitored
Process 'director_scheduler' not monitored
Process 'director_sync_dns' not monitored
Process 'director_nginx' not monitored
Process 'health_monitor' not monitored
Process 'uaa' not monitored
Process 'credhub' not monitored
System 'system_be0914a6-1473-47f1-58d9-4f3aacbe2ab5' running
- Shut down Postgres by sending the SIGINT signal with the "kill" command, which triggers a fast mode shutdown.
bosh/0:~# postgres_pid=$(/var/vcap/packages/bpm/bin/bpm pid postgres-10) && kill -s SIGINT "${postgres_pid}"
- Check the Postgres database cluster state and confirm that it was shut down properly: the state should be "shut down" rather than "in production".
bosh/0:~# su - vcap -c "/var/vcap/packages/postgres-10/bin/pg_controldata -D /var/vcap/store/postgres-10" | grep -F "Database cluster state"
Database cluster state: shut down
- If the Postgres database cluster state is "shut down", exit the BOSH Director VM and proceed with the BOSH Director upgrade as usual.
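The two verification steps above (checking `monit summary` and the cluster state) could be scripted roughly as follows. This is only a sketch under assumptions: the function names are hypothetical, the paths are the ones used in the steps above, and the polling loop assumes the state may not read "shut down" immediately after the signal is sent. It is not a confirmed or tested workaround.

```shell
# Sketch only. Function names are hypothetical; paths are copied from the steps above.
PG_CONTROLDATA="/var/vcap/packages/postgres-10/bin/pg_controldata"
PG_DATA_DIR="/var/vcap/store/postgres-10"

# Read `monit summary` output on stdin; print the names of processes still "running".
running_processes() {
  awk -F"'" '$1 == "Process " && $3 ~ /^[ \t]*running[ \t]*$/ { print $2 }'
}

# Read pg_controldata output on stdin; print only the cluster state value.
cluster_state() {
  sed -n 's/^Database cluster state:[[:space:]]*//p'
}

# Run pg_controldata as the vcap user, as in the manual check above.
read_controldata() {
  su - vcap -c "${PG_CONTROLDATA} -D ${PG_DATA_DIR}"
}

# Poll once per second until the state is "shut down" or the timeout expires.
# Assumption: the default timeout of 60 seconds is arbitrary.
wait_for_clean_shutdown() {
  _timeout="${1:-60}"
  _elapsed=0
  _state=""
  while [ "${_elapsed}" -lt "${_timeout}" ]; do
    _state="$(read_controldata | cluster_state)"
    if [ "${_state}" = "shut down" ]; then
      echo "Postgres cluster state: shut down"
      return 0
    fi
    sleep 1
    _elapsed=$((_elapsed + 1))
  done
  echo "Postgres cluster state is still '${_state}' after ${_timeout}s" >&2
  return 1
}
```

With these defined, `monit summary | running_processes` would be expected to list only `postgres` before the unmonitor step, and `wait_for_clean_shutdown 120` could be run right after the `kill -s SIGINT` command, continuing with the upgrade only if it returns successfully.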
from bosh.