
postgres-release


This is a BOSH release for PostgreSQL.

Deploying

To deploy the postgres-release, follow the standard steps for deploying software with BOSH.

  1. Deploy and run a BOSH director. Please refer to BOSH documentation for instructions on how to do that. Bosh-lite specific instructions can be found here.

  2. Install the BOSH Command Line Interface (CLI) v2+. Please refer to the BOSH CLI documentation. Use the CLI to target your director.

  3. Upload the desired stemcell to the director. bosh.io provides a resource to find and download stemcells.

    # Example for bosh-lite
    bosh upload-stemcell https://bosh.io/d/stemcells/bosh-warden-boshlite-ubuntu-xenial-go_agent
    
  4. Upload the latest release from bosh.io:

    bosh upload-release https://bosh.io/d/github.com/cloudfoundry/postgres-release
    

    or create and upload a development release:

    cd ~/workspace/postgres-release
    bosh -n create-release --force && bosh -n upload-release
    
  5. Generate the manifest. You can optionally pass an operations file as input to customize the manifest:

    ~/workspace/postgres-release/scripts/generate-deployment-manifest \
    -o OPERATION-FILE-PATH > OUTPUT_MANIFEST_PATH
    

    You can use the operations file to specify postgres job properties or to override the configuration if your BOSH director's cloud-config is not compatible.

    This example operations file is a great starting point. Note: when using this operations file, you will need to inject pgadmin_database_password at bosh deploy time, which is a good pattern for keeping credentials out of manifests.

    You are also provided with options to enable TLS in the PostgreSQL server or to use static IPs.

  6. Deploy:

    bosh -d DEPLOYMENT_NAME deploy OUTPUT_MANIFEST_PATH
    

    Example, injecting the pgadmin_database_password variable:

    bosh -d DEPLOYMENT_NAME deploy -v pgadmin_database_password=foobarbaz OUTPUT_MANIFEST_PATH
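
As a sketch of what the operations file from step 5 might contain — the ops paths, database name, and role below are illustrative, not taken from the repository's templates:

```yaml
# Hypothetical operations file: adds one database and one admin role.
# The ((pgadmin_database_password)) variable is supplied at deploy time with -v.
- type: replace
  path: /instance_groups/name=postgres/jobs/name=postgres/properties/databases/databases?
  value:
  - name: sandbox
- type: replace
  path: /instance_groups/name=postgres/jobs/name=postgres/properties/databases/roles?
  value:
  - name: pgadmin
    password: ((pgadmin_database_password))
```

Pass such a file to the generation script with -o, then inject the variable at deploy time as shown above.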
    

Customizing

The table below shows the most significant properties you can use to customize your PostgreSQL installation. The complete list of available properties can be found in the spec.

Property Description
databases.port The database port. Default: 5432
databases.databases A list of databases and associated properties to create when Postgres starts
databases.databases[n].name Database name
databases.databases[n].citext If true, the citext extension is created for the database
databases.roles A list of database roles and associated properties to create
databases.roles[n].name Role name
databases.roles[n].password Login password for the role. If not provided, TLS certificate authentication is assumed for the user.
databases.roles[n].common_name The cn attribute of the certificate for the user. It only applies to TLS certificate authentication.
databases.roles[n].permissions A list of attributes for the role. For the complete list of attributes, refer to ALTER ROLE command options.
databases.tls.certificate PEM-encoded certificate for secure TLS communication
databases.tls.private_key PEM-encoded key for secure TLS communication
databases.tls.ca PEM-encoded certification authority for secure TLS communication. Only needed to let users authenticate with TLS certificate.
databases.max_connections Maximum number of database connections
databases.log_line_prefix The printf-style string that is output at the beginning of each log line. Default: %m:
databases.collect_statement_statistics Enable the pg_stat_statements extension and collect statement execution statistics. Default: false
databases.additional_config A map of additional key/value pairs to include as extra configuration properties in postgresql.conf
databases.monit_timeout Monit timeout in seconds for the postgres job start. Default: 90
databases.trust_local_connection Whether postgres trusts local connections. The vcap user is always trusted. Default: true
databases.skip_data_copy_in_minor Whether or not a copy of the data directory is created during PostgreSQL minor upgrades. A copy is created by default.
databases.hooks.timeout Time limit in seconds for the hook script. Default: 0 (no time limit)
databases.hooks.pre-start Script to run before starting PostgreSQL.
databases.hooks.post-start Script to run after PostgreSQL has started.
databases.hooks.pre-stop Script to run before stopping PostgreSQL.
databases.hooks.post-stop Script to run after PostgreSQL has stopped.
databases.logging.format.timestamp Format for the timestamp in control job logs. Default: rfc3339
janitor.script If specified, this script is run periodically. Useful for housekeeping tasks.
janitor.interval Interval in seconds between two invocations of the janitor script. Default: 1 day
janitor.timeout Time limit in seconds for the janitor script. Default: 0 (no time limit)

Note

  • Removing a database from databases.databases list and deploying again does not trigger a physical deletion of the database in PostgreSQL.
  • Removing a role from databases.roles list and deploying again does not trigger a physical deletion of the role in PostgreSQL.
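
To make the table above concrete, a properties block for the postgres job might look like the following sketch (all names and values here are illustrative):

```yaml
properties:
  databases:
    port: 5432
    max_connections: 300
    collect_statement_statistics: true
    additional_config:
      shared_buffers: 2GB        # extra key/value pairs land in postgresql.conf
    databases:
    - name: sandbox
      citext: true               # create the citext extension for this database
    roles:
    - name: pgadmin
      password: ((pgadmin_database_password))
      permissions:
      - CREATEDB
```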

Enabling TLS on the PostgreSQL server

PostgreSQL has native support for using TLS connections to encrypt client/server communications for increased security. You can enable it by setting the databases.tls.certificate and the databases.tls.private_key properties.

A script is provided that creates a CA, generates a key pair, and signs it with the CA:

./scripts/generate-postgres-certs -n HOSTNAME_OR_IP_ADDRESS

The common name of the server certificate must be set to the DNS hostname, if any, or to the IP address of the PostgreSQL server. This is because, in TLS mode verify-full, the hostname is matched against the common name. Refer to the PostgreSQL documentation for more details.

You can also use BOSH variables to generate the certificates. See, for example, the operations file used by the manifest generation script:

~/workspace/postgres-release/scripts/generate-deployment-manifest \
   -s -h HOSTNAME_OR_IP_ADDRESS \
   -o OPERATION-FILE-PATH > OUTPUT_MANIFEST_PATH
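
A sketch of the BOSH variables section that would generate such certificates — the variable names here are illustrative; consult the repository's operations files for the names the release actually expects:

```yaml
variables:
- name: postgres_ca
  type: certificate
  options:
    is_ca: true
    common_name: postgres_ca
- name: postgres_server_cert
  type: certificate
  options:
    ca: postgres_ca
    common_name: HOSTNAME_OR_IP_ADDRESS   # must match what clients verify
```

The generated values would then be wired into the job properties, e.g. databases.tls.certificate: ((postgres_server_cert.certificate)) and databases.tls.private_key: ((postgres_server_cert.private_key)).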

Enabling TLS certificate authentication

In order to perform authentication using TLS client certificates, you must not specify a user password and you must configure the following properties:

  • databases.tls.certificate
  • databases.tls.private_key
  • databases.tls.ca

The cn (Common Name) attribute of the certificate will be compared to the requested database user name, and if they match the login will be allowed.

Optionally you can map the common_name to a different database user by specifying property databases.roles[n].common_name.

A script is provided that creates client certificates:

./scripts/generate-postgres-client-certs --ca-cert <PATH-TO-CA-CERT> --ca-key <PATH-TO-CA-KEY> --client-name <USER_NAME>
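
Assuming the generated files are named USER_NAME.crt / USER_NAME.key and the CA certificate is ca.crt (hypothetical file names), a client could then authenticate without a password using libpq's TLS connection parameters:

```
psql "host=HOSTNAME_OR_IP_ADDRESS port=5432 dbname=DB_NAME user=USER_NAME \
  sslmode=verify-full sslcert=USER_NAME.crt sslkey=USER_NAME.key sslrootcert=ca.crt"
```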

Hooks

You can run custom code before or after PostgreSQL starts or stops or periodically. For details, see hooks documentation.
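
As a sketch (the script bodies here are illustrative), hooks are configured as properties alongside the rest of the databases block:

```yaml
properties:
  databases:
    hooks:
      timeout: 60          # fail a hook that runs longer than 60 seconds
      pre-start: |
        #!/bin/bash
        echo "about to start PostgreSQL"
      post-start: |
        #!/bin/bash
        echo "PostgreSQL is up"
```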

Backup and Restore

You can enable backup and restore through bbr by colocating the bbr-postgres-db job with the postgres job and setting its release_level_backup option to true. If enabled, a backup is collected using pg_dump for each database specified in the databases.databases property.

If you don't colocate bbr-postgres-db with postgres, you must specify in the postgres.dbuser property a database user with sufficient permissions to run backup and restore.

If your PostgreSQL server is configured with TLS, backup and restore run with sslmode=verify-full by default. You can change this to sslmode=verify-ca by setting postgres.ssl_verify_hostname to false.

Caveats:

  • Restore does not drop the database, the extensions, or the schema; therefore the schema of the restored and existing databases must be the same.
  • If a backup is not present for one of the configured databases in the databases.databases property, the restore issues a message and continues.
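
A minimal sketch of colocating the two jobs in a manifest, assuming the release and instance-group names used elsewhere in this README:

```yaml
instance_groups:
- name: postgres
  jobs:
  - name: postgres
    release: postgres
    properties:
      databases:
        databases:
        - name: sandbox
  - name: bbr-postgres-db
    release: postgres
    properties:
      release_level_backup: true
```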

Contributing

Contributor License Agreement

Contributors must sign the Contributor License Agreement before their contributions can be merged. Follow the directions here to complete that process.

Developer Workflow

  1. Fork the repository and make a local clone

  2. Create a feature branch from the development branch

    cd postgres-release
    git checkout develop
    git checkout -b feature-branch
  3. Make changes on your branch

  4. Test your changes by running acceptance tests

  5. Push to your fork (git push origin feature-branch) and submit a pull request selecting develop as the target branch. PRs submitted against other branches will need to be resubmitted with the correct branch targeted.

Known Limitations

The postgres-release does not directly support high availability. Even if you deploy more instances, no replication is configured.

Upgrading

Refer to versions.yml to determine whether a given postgres-release version upgrades the PostgreSQL version.

Upgrade Test Policy

The maintainers of the postgres-release test the following upgrade paths:

  • From the previous postgres-release
  • From the latest postgres-release that bumps the previous PostgreSQL version
  • From the latest cf-deployment that bumps the previous PostgreSQL version

Considerations before deploying

  1. A copy of the database is made for the upgrade, so you may need to adjust the persistent disk capacity of the postgres job.
    • For major upgrades the copy is always created
    • For minor upgrades the copy is created unless databases.skip_data_copy_in_minor is set to true.
  2. The upgrade happens as part of the pre-start and its duration may vary depending on your environment.
    • In case of a PostgreSQL minor upgrade a simple copy of the old data directory is made.
    • In case of a PostgreSQL major upgrade the pg_upgrade utility is used.
  3. Postgres will be unavailable during this upgrade.

Considerations after a successful deployment

A PostgreSQL upgrade may require some post-upgrade processing. The administrator should check the /var/vcap/store/postgres/pg_upgrade_tmp directory for generated script files and run them if needed. See pg_upgrade post-upgrade processing for more details.

If a copy of the old database is kept (see considerations above), it is moved to /var/vcap/store/postgres/postgres-previous. The postgres-previous directory is kept until the next PostgreSQL upgrade is performed. You are free to remove it once you have verified that the new database works and you want to reclaim the space.

Recovering a failure during deployment

With a long upgrade, the deployment may time out; however, BOSH does not stop the actual upgrade process. In this case you can simply wait for the upgrade to complete and, once postgres is up and running, rerun bosh deploy.

If the upgrade fails:

  • The old data directory is still available at /var/vcap/store/postgres/postgres-x.x.x where x.x.x is the old PostgreSQL version
  • The new data directory is at /var/vcap/store/postgres/postgres-y.y.y where y.y.y is the new PostgreSQL version
  • If the upgrade is a PostgreSQL major upgrade:
    • A marker file is kept at /var/vcap/store/postgres/POSTGRES_UPGRADE_LOCK to prevent the upgrade from happening again.
    • pg_upgrade logs that may have details of why the migration failed can be found in /var/vcap/sys/log/postgres/postgres_ctl.log

If you want to attempt the upgrade again or roll back to the previous release, remove the new data directory and, if present, the marker file.
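
The cleanup described above can be sketched as a couple of commands on the postgres VM (reachable via bosh ssh); y.y.y stands for the new PostgreSQL version, as in the list above:

```shell
STORE_DIR=/var/vcap/store/postgres

# Remove the new (failed) data directory...
rm -rf "${STORE_DIR}/postgres-y.y.y"
# ...and, for major upgrades, the marker file that blocks a retry.
rm -f "${STORE_DIR}/POSTGRES_UPGRADE_LOCK"
```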

CI

The CI pipeline runs:

postgres-release's People

Contributors

aalbanes, bernharddenner, beyhan, cf-rabbit-bot, christianang, drnic, fhambrec, jobsby, jpalermo, lnguyen, ramonskie, rkoster, suhlig, svrc, trevorwhitney, valeriap, wendorf, zankich


postgres-release's Issues

Bbr support

As an operator I would like to be able to use bbr to back up and restore my PostgreSQL databases.

The database.address value is not used

The databases.address value is not being used in the job. Rather the address (host) value is being picked up from the first static IP from the default network section (jobs.postgres.networks.default.static_ips.[0]).

Is the databases.address field required?

bbr-postgres-db not being installed

Hello, I'm trying to use bbr-postgres-db, but it does not seem to be installed during deployment. I colocated it within the postgresql deployment, but I get the error: - Can't find link with type 'database' for job 'joe-db' in deployment 'joe'. If I move it within the same job as postgres, I don't get an error, but it doesn't exist.

- name: ((deployment_name))-db
  instances: 1
  url: http://((db_ip)):5432
  azs: [z1]
  networks:
  - default:
    - dns
    - gateway
    name: default
  - name: public
    static_ips: ((db_ip))
  stemcell: trusty
  vm_type: ((db_vm_type))
  persistent_disk_type: ((db_persistent_disk_type))
  jobs:
  - release: postgres
    name: postgres
    type: database
    properties:
      databases:
        port: 5432
        databases:
        - name: ((postgres_database))
        roles:
        - *db_role
        - name: postgres
          password: ((postgres_password))
          permissions:
          - SUPERUSER
          - CREATEDB
          - CREATEROLE
          - REPLICATION
        tls:
          certificate: ((atc_tls.certificate))
          private_key: ((atc_tls.private_key))
        name: bbr-postgres-db
        type: database
        properties:
          databases:
          - name: ((postgres_database))
          release_level_backup:
            default: true
#   name: bbr-postgres-db
#   type: database
#   properties:
#     databases:
#       - name: ((postgres_database))
#     release_level_backup:
#       default: true
  - *syslog-forwarder
  - *telegraf

How to make this postgres a service broker with my native Cloud Foundry?

I want to push this PHP demo to my native Cloud Foundry, but the demo needs a pgsql service.
https://github.com/cloudfoundry-samples/cf-ex-pgbouncer

I have deployed this postgres-release on my native Cloud Foundry, but how can I make it a service that is bindable by apps (PHP, Java, Python, etc.)?
cf create-service-broker SERVICE_BROKER USERNAME PASSWORD URL
cf marketplace
cf create-service SERVICE PLAN SERVICE_INSTANCE [-b BROKER]
much thanks!

my env: vSphere-ESXi.6.5 + VCSA-all.6.7.0 + bosh-deployment(bosh.268.6.0) + cf-deployment(cf.v8.1.0)

Upgrading stemcell for postgres release (v28) vm resulted in data loss

We manage a concourse that uses postgres-release v28, and upgrading the stemcell resulted in our concourse displaying very old data (it looks like the state of the database right after the original fly set-pipeline command). Based on the result of du -h we still have the data sitting on the persistent disk, but the DB is serving old data instead. We could not find anything in the bosh logs that would point to something obvious happening, and we are not sure where to look in the postgres logs.

Please complete the Cloudfoundry Component Log Timestamp Audit - as per: CF-RFC#030

Hi There!

In an effort to assure all CF components use a consistent logging timestamp as per CF-RFC#030, I'm submitting this issue requesting a little action from y'all on this x-component-team effort.

First

Please complete this audit as soon as possible.

Second

If additional work is required to meet the requirements outlined in CF-RFC#030 please create, and take action to address, github issue(s) describing the work required to meet those requirements.

Thanks so much!

The CF-RFC#030 Authors (Josh Collins and Amelia Downs)

Additional description needed for databases.databases

Hello maintainers of postgres-release,

Today we were looking over our BOSH manifest, and noticed that we'd defined some database config that looked like:

databases:
- tag: uaa
  name: uaa

We were wondering what the tag was for and whether it was still used. When we looked at your job spec file, there wasn't any explanation for the expected contents of each hash in the list. We started to dig through your job templates, but didn't find a mention of the tag key. As a general convention in BOSH releases, it would be great to always provide an example or extended description for lists of hashes, to enumerate the available keys and their uses. We're also curious about the exact purpose of tag in this context, and whether we can remove it from our manifest.

Thanks!
Dave and Raina (@rainmaker)
PCF Release Engineering

Postgres 10

Hello, I didn't know where I should ask about this. I just wanted to know if there were plans to use postgres 10 in this release?

How do I add config for streaming replication?

Hi.

Now I'm going to build a streaming replication environment. To realize it, I need to add a role for replication and a connection setting in pg_hba.conf.
The role config is editable in this release from here, but pg_hba.conf does not seem to be.

Is there a way to edit pg_hba.conf?

BOSH says postgres not running after update; no helpful logs

Attempting to test that Routing API can communicate with postgresql over TLS. We have tested that without configuring databases.tls.certificate and databases.tls.private_key, the deployment succeeds. However, upon configuring these properties, BOSH/monit believes postgres fails to start.

Task 5458 | 19:32:02 | Updating instance singleton-database: singleton-database/f63ca082-5799-4b34-a9e3-58bd16c1cea0 (0) (canary) (00:11:02)
                     L Error: 'singleton-database/f63ca082-5799-4b34-a9e3-58bd16c1cea0 (0)' is not running after update. Review logs for failed jobs: postgres
Task 5458 | 19:43:05 | Error: 'singleton-database/f63ca082-5799-4b34-a9e3-58bd16c1cea0 (0)' is not running after update. Review logs for failed jobs: postgres
Process 'postgres'                  not monitored

The postgres logs don't have any errors in them: https://gist.github.com/shalako/7cd886afdb6ac9f8924e60f253553b78

We looked for a manifest property to increase the log level but couldn't find one. We ended up modifying /var/vcap/jobs/postgres/config/postgresql.conf by adding the following line and using monit to restart

log_min_messages = 'DEBUG5'

The logs didn't seem to change at all.

We don't know how to troubleshoot the problem.

could not open shared memory segment error in v46

psql: error: connection to server at "127.0.0.1", port 5432 failed: FATAL: could not open shared memory segment "/PostgreSQL.231763232": No such file or directory

The issue seems to be documented here and can be worked around by adding RemoveIPC=no to /etc/systemd/logind.conf and then systemctl restart systemd-logind.service

Not sure if we want to add that to the pre-start script, or figure out a better way to solve this problem.

Startup scripts silently fail if DB password includes a '$'

If you include a password containing a '$' under databases.roles, the postgres job will report as running but the DB won't have a password set. In /var/vcap/sys/log/postgres/postgres_ctl.err.log I see:

[2017-08-18 20:46:02+0000] + echo 'Setting password for role cloud_controller...'
[2017-08-18 20:46:02+0000] /var/vcap/jobs/postgres/bin/postgres_start.sh: line 81: $6: unbound variable

My cloud_controller DB password does have a '...$6...' in the middle. Looks like this line is not doing any escaping so special characters get expanded. On CAPI we usually use Ruby's Shellwords.shellescape in our ERB templates around passwords.

Also a big +1 on this comment around failing the job if setting the roles fails:

# TODO: This script is responsible for both
# starting PostgreSQL and running some queries
# (create DBs, roles, applying grants). One problem
# that needs to be addressed in the future is that
# if some queries fail job is still considered running.
# Later we'll change it to use a more involved approach
# (i.e. script that brings DB to sync)
It would have been much easier to debug if the postgres job failed immediately rather than silently erroring.
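
The expansion bug is easy to reproduce in plain shell; this sketch (not the release's actual script) shows why unquoted interpolation loses part of the password, and how passing the value as an argument avoids it:

```shell
password='secret$6pass'

# Interpolating the value into a command string lets the inner shell
# re-expand "$6" (an unset positional parameter), silently corrupting it:
naive=$(sh -c "echo ${password}")

# Passing the value as an argument keeps it literal:
safe=$(sh -c 'echo "$1"' _ "${password}")

echo "naive: ${naive}"   # secretpass
echo "safe:  ${safe}"    # secret$6pass
```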

Upgrade to Postgres 12

There were some vulnerability issues in the older version of Postgres, so upgrading to v12 will resolve some problems in our project.

Prometheus upgrade from 23.3.0 to 26.2.0 - DB Start Failed

Hi Team

We try to upgrade from 23.3.0 to 26.2.0 and during upgrade we failed with upgrade.

Task 605996 | 06:44:14 | Updating instance database: database/53a1cdb3-5aa9-47b9-91c3-2acb065d5c26 (0) (canary) (00:04:38)
L Error: Action Failed get_task: Task 923b6aa6-84dc-4f78-6658-84fe3841d7da result: 1 of 2 pre-start scripts failed. Failed Jobs: postgres. Successful Jobs: bosh-dns.

DB logs:
database/53a1cdb3-5aa9-47b9-91c3-2acb065d5c26:/var/vcap/jobs/postgres/bin#
[2020-02-21 06:44:14+0000] 3898 /var/vcap/packages/postgres-11.6/bin/pg_ctl -D
/var/vcap/store/postgres/postgres-11.6 -l logfile start
[2020-02-21 06:44:14+0000] 3898
[2020-02-21 06:44:14+0000] 3898 /var/vcap/store/postgres/pg_upgrade_tmp ~
[2020-02-21 06:44:14+0000] 3898 Running in verbose mode
[2020-02-21 06:44:14+0000] 3898
[2020-02-21 06:44:14+0000] 3898 check for "/var/vcap/packages/postgres-9.6.8/bin" failed: No such file or directory
[2020-02-21 06:44:14+0000] 3898
[2020-02-21 06:44:14+0000] 3898 Failure, exiting

Existing files inside the directory:
database/53a1cdb3-5aa9-47b9-91c3-2acb065d5c26:/var/vcap/packages# ls -lrt
total 28
lrwxrwxrwx 1 root root 80 Feb 21 06:44 postgres-common -> /var/vcap/data/packages/postgres-common/48f53292835d7ef420d4c6b5d22abbdba7b5e04e
lrwxrwxrwx 1 root root 82 Feb 21 06:44 postgres_exporter -> /var/vcap/data/packages/postgres_exporter/89784b5241e6d926fdadf182be9cb8ecda34e47e
lrwxrwxrwx 1 root root 75 Feb 21 06:44 filesnitch -> /var/vcap/data/packages/filesnitch/a7787657f5f2fb52ee8320e450f5f8ef587fc134
lrwxrwxrwx 1 root root 73 Feb 21 06:44 bosh-dns -> /var/vcap/data/packages/bosh-dns/28a34d61581ca20880c7e5dbad623541fe6bd142
lrwxrwxrwx 1 root root 78 Feb 21 06:44 postgres-11.6 -> /var/vcap/data/packages/postgres-11.6/a7c6794a8f9dc2dc4ee66fa4e43962f85f2f6f36
lrwxrwxrwx 1 root root 74 Feb 21 06:44 fim_utils -> /var/vcap/data/packages/fim_utils/87595ea2ffd48a8dd439758dccd5f46e977a3f0e
lrwxrwxrwx 1 root root 80 Feb 21 06:44 postgres-9.6.10 -> /var/vcap/data/packages/postgres-9.6.10/448f10d2c6429ef66b3888209903c6f5b5

Kindly let us know what is causing the upgrade to look for the old version directory.

Upgrade to postgres 13

As BOSH is now also using postgres 13, wouldn't it be a good idea to bump this release to postgres 13 as well?

The question of bumping was raised 2 years ago in #58, but no action has been taken since then.

So are there any updates on this?

Upgrading to v45 fails during pre-start

check for "/var/vcap/packages/postgres-11.15/bin" failed: No such file or directory

This happens if going from v44 to v45 because during run_major_upgrade, the PACKAGE_DIR_OLD references 11.15, but v45 contains the 11.20 package.

What role should a service broker use to connect to the postgres service?

Hello Team,

I've deployed postgres bosh release on AliCloud using bosh V2 Manifest.

I'm using https://github.com/cloudfoundry/postgres-release/blob/develop/templates/operations/set_properties.yml file to define role and I've provided password for 'pgadmin' role at deployment time from bosh CLI.

"bosh -d DEPLOYMENT_NAME deploy -v pgadmin_database_password=foobarbaz OUTPUT_MANIFEST_PATH"

Now I've configured the service broker to use the 'pgadmin' role; however, I'm getting an error while creating a service instance of postgres:
FAILED
Server error, status code: 502, error code: 10001, message: Service broker error: PostgreSQL server is not reachable

I tried to connect to the postgres server with the psql client from a jumpbox, and I'm able to connect with the 'pgadmin' role.

"psql -h <host_ip> -p 5524 postgres pgadmin"

Please see the screenshot.

Any suggestions on what I'm missing?

postgres fails to restart after VM reboot

Steps to reproduce

  • use bosh to deploy a postgres-release v23 VM
  • reboot the VM on IaaS level (e.g. nova reboot <VM_CID> if on openstack)

Observed behaviour

  • postgres consistently keeps failing to start because the pid file directory /var/vcap/sys/run/postgres/ does not exist.

from postgres_ctl.err.log :

[2018-01-05 12:53:31+0000] + main start
[2018-01-05 12:53:31+0000] + local action
[2018-01-05 12:53:31+0000] + action=start
[2018-01-05 12:53:31+0000] + source /var/vcap/jobs/postgres/bin/pgconfig.sh
[2018-01-05 12:53:31+0000] ++ current_version=9.6.4
[2018-01-05 12:53:31+0000] ++ pgversion_current=postgres-9.6.4
[2018-01-05 12:53:31+0000] ++ pgversion_old=postgres-9.6.3
[2018-01-05 12:53:31+0000] ++ pgversion_older=postgres-9.6.2
[2018-01-05 12:53:31+0000] ++ JOB_DIR=/var/vcap/jobs/postgres
[2018-01-05 12:53:31+0000] ++ PACKAGE_DIR=/var/vcap/packages/postgres-9.6.4
[2018-01-05 12:53:31+0000] ++ STORE_DIR=/var/vcap/store
[2018-01-05 12:53:31+0000] ++ PG_STORE_DIR=/var/vcap/store/postgres
[2018-01-05 12:53:31+0000] ++ DATA_DIR=/var/vcap/store/postgres/postgres-9.6.4
[2018-01-05 12:53:31+0000] ++ DATA_DIR_PREVIOUS=/var/vcap/store/postgres/postgres-previous
[2018-01-05 12:53:31+0000] ++ DATA_DIR_OLD=/var/vcap/store/postgres/postgres-unknown
[2018-01-05 12:53:31+0000] ++ PACKAGE_DIR_OLD=/var/vcap/packages/postgres-unknown
[2018-01-05 12:53:31+0000] ++ POSTGRES_UPGRADE_LOCK=/var/vcap/store/postgres/POSTGRES_UPGRADE_LOCK
[2018-01-05 12:53:31+0000] ++ pgversion_upgrade_from=postgres-unknown
[2018-01-05 12:53:31+0000] ++ '[' -d /var/vcap/store/postgres/postgres-9.6.3 -a -f /var/vcap/store/postgres/postgres-9.6.3/postgresql.conf ']'
[2018-01-05 12:53:31+0000] ++ '[' -d /var/vcap/store/postgres/postgres-9.6.2 -a -f /var/vcap/store/postgres/postgres-9.6.2/postgresql.conf ']'
[2018-01-05 12:53:31+0000] ++ RUN_DIR=/var/vcap/sys/run/postgres
[2018-01-05 12:53:31+0000] ++ LOG_DIR=/var/vcap/sys/log/postgres
[2018-01-05 12:53:31+0000] ++ PIDFILE=/var/vcap/sys/run/postgres/postgres.pid
[2018-01-05 12:53:31+0000] ++ CONTROL_JOB_PIDFILE=/var/vcap/sys/run/postgres/postgresctl.pid
[2018-01-05 12:53:31+0000] ++ HOST=0.0.0.0
[2018-01-05 12:53:31+0000] ++ PORT=5524
[2018-01-05 12:53:31+0000] ++ [[ -n '' ]]
[2018-01-05 12:53:31+0000] ++ LD_LIBRARY_PATH=/var/vcap/packages/postgres-9.6.4/lib
[2018-01-05 12:53:31+0000] + set +u
[2018-01-05 12:53:31+0000] + source /var/vcap/packages/postgres-common/utils.sh
[2018-01-05 12:53:31+0000] + set -u
[2018-01-05 12:53:31+0000] + case "${action}" in
[2018-01-05 12:53:31+0000] + pid_guard /var/vcap/sys/run/postgres/postgresctl.pid 'Postgres control job' false
[2018-01-05 12:53:31+0000] + local pidfile=/var/vcap/sys/run/postgres/postgresctl.pid
[2018-01-05 12:53:31+0000] + local 'name=Postgres control job'
[2018-01-05 12:53:31+0000] + local logall=false
[2018-01-05 12:53:31+0000] + '[' false == true ']'
[2018-01-05 12:53:31+0000] + '[' -f /var/vcap/sys/run/postgres/postgresctl.pid ']'
[2018-01-05 12:53:31+0000] + echo 1315
[2018-01-05 12:53:31+0000] /var/vcap/jobs/postgres/bin/postgres_ctl: line 18: /var/vcap/sys/run/postgres/postgresctl.pid: No such file or directory
[2018-01-05 12:55:11+0000] + main start
[2018-01-05 12:55:11+0000] + local action
[2018-01-05 12:55:11+0000] + action=start
[2018-01-05 12:55:11+0000] + source /var/vcap/jobs/postgres/bin/pgconfig.sh
[2018-01-05 12:55:11+0000] ++ current_version=9.6.4
[2018-01-05 12:55:11+0000] ++ pgversion_current=postgres-9.6.4
[2018-01-05 12:55:11+0000] ++ pgversion_old=postgres-9.6.3
[2018-01-05 12:55:11+0000] ++ pgversion_older=postgres-9.6.2
[2018-01-05 12:55:11+0000] ++ JOB_DIR=/var/vcap/jobs/postgres
[2018-01-05 12:55:11+0000] ++ PACKAGE_DIR=/var/vcap/packages/postgres-9.6.4
[2018-01-05 12:55:11+0000] ++ STORE_DIR=/var/vcap/store
[2018-01-05 12:55:11+0000] ++ PG_STORE_DIR=/var/vcap/store/postgres
[2018-01-05 12:55:11+0000] ++ DATA_DIR=/var/vcap/store/postgres/postgres-9.6.4
[2018-01-05 12:55:11+0000] ++ DATA_DIR_PREVIOUS=/var/vcap/store/postgres/postgres-previous
[2018-01-05 12:55:11+0000] ++ DATA_DIR_OLD=/var/vcap/store/postgres/postgres-unknown
[2018-01-05 12:55:11+0000] ++ PACKAGE_DIR_OLD=/var/vcap/packages/postgres-unknown
[2018-01-05 12:55:11+0000] ++ POSTGRES_UPGRADE_LOCK=/var/vcap/store/postgres/POSTGRES_UPGRADE_LOCK
[2018-01-05 12:55:11+0000] ++ pgversion_upgrade_from=postgres-unknown
[2018-01-05 12:55:11+0000] ++ '[' -d /var/vcap/store/postgres/postgres-9.6.3 -a -f /var/vcap/store/postgres/postgres-9.6.3/postgresql.conf ']'
[2018-01-05 12:55:11+0000] ++ '[' -d /var/vcap/store/postgres/postgres-9.6.2 -a -f /var/vcap/store/postgres/postgres-9.6.2/postgresql.conf ']'
[2018-01-05 12:55:11+0000] ++ RUN_DIR=/var/vcap/sys/run/postgres
[2018-01-05 12:55:11+0000] ++ LOG_DIR=/var/vcap/sys/log/postgres
[2018-01-05 12:55:11+0000] ++ PIDFILE=/var/vcap/sys/run/postgres/postgres.pid
[2018-01-05 12:55:11+0000] ++ CONTROL_JOB_PIDFILE=/var/vcap/sys/run/postgres/postgresctl.pid
[2018-01-05 12:55:11+0000] ++ HOST=0.0.0.0
[2018-01-05 12:55:11+0000] ++ PORT=5524
[2018-01-05 12:55:11+0000] ++ [[ -n '' ]]
[2018-01-05 12:55:11+0000] ++ LD_LIBRARY_PATH=/var/vcap/packages/postgres-9.6.4/lib
[2018-01-05 12:55:11+0000] + set +u
[2018-01-05 12:55:11+0000] + source /var/vcap/packages/postgres-common/utils.sh
[2018-01-05 12:55:11+0000] + set -u
[2018-01-05 12:55:11+0000] + case "${action}" in
[2018-01-05 12:55:11+0000] + pid_guard /var/vcap/sys/run/postgres/postgresctl.pid 'Postgres control job' false
[2018-01-05 12:55:11+0000] + local pidfile=/var/vcap/sys/run/postgres/postgresctl.pid
[2018-01-05 12:55:11+0000] + local 'name=Postgres control job'
[2018-01-05 12:55:11+0000] + local logall=false
[2018-01-05 12:55:11+0000] + '[' false == true ']'
[2018-01-05 12:55:11+0000] + '[' -f /var/vcap/sys/run/postgres/postgresctl.pid ']'
[2018-01-05 12:55:11+0000] + echo 1473
[2018-01-05 12:55:11+0000] /var/vcap/jobs/postgres/bin/postgres_ctl: line 18: /var/vcap/sys/run/postgres/postgresctl.pid: No such file or directory

the important line from log above is

[2018-01-05 12:53:31+0000] /var/vcap/jobs/postgres/bin/postgres_ctl: line 18: /var/vcap/sys/run/postgres/postgresctl.pid: No such file or directory

Expected behaviour

postgres is able to survive a VM reboot.

Workaround

bosh ssh into the postgres VM and manually execute

mkdir -p /var/vcap/sys/run/postgres/
chown -R vcap:vcap /var/vcap/sys/run/
monit start postgres

Possible fix

there are similar bugs already fixed in other bosh releases, see e.g. cloudfoundry/capi-release#25

I think the problem is that the control scripts rely on pre-start to create $RUN_DIR:

mkdir -p "${RUN_DIR}"
chown -R vcap:vcap "${RUN_DIR}"

and in case of an external VM reboot (i.e. outside of any bosh operation), pre-start is not executed.

/var/vcap/data/sys/run/ is a temp file system

$ mount | grep sys/run
tmpfs on /var/vcap/data/sys/run type tmpfs (rw,size=1m)

and thus is empty after VM reboot (which is fine).

although there is an mkdir -p $RUN_DIR in

, this is too late in case of VM reboot because of this line executed earlier:

echo $$ > ${CONTROL_JOB_PIDFILE}

which assumes the directory $RUN_DIR already exists.

Initialization of $RUN_DIR should probably happen as the very first thing when start) is invoked, see the fix for a similar problem in cloud_controller_ng

In our scenario (we run CCDB on postgres-release) this bug has caused an extended Cloudfoundry downtime triggered by forced VM reboots which were required to deploy urgent security fixes on the IaaS level.
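
A sketch of the proposed ordering fix, demonstrated against a temporary directory rather than the real /var/vcap paths (which need root):

```shell
# Create RUN_DIR before anything writes the pidfile into it, so the job
# also starts when a reboot has wiped the tmpfs under sys/run.
BASE="$(mktemp -d)"
RUN_DIR="${BASE}/sys/run/postgres"
CONTROL_JOB_PIDFILE="${RUN_DIR}/postgresctl.pid"

mkdir -p "${RUN_DIR}"                 # must come first
echo $$ > "${CONTROL_JOB_PIDFILE}"    # now the write cannot fail
```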

postgres 9.6.11 is out

v9.6.11 is out with some tasty bug fixes.

Could the blobs/packages/jobs please be updated?

postgres creates roles and databases in the `post-start` script now.

I noticed in the latest postgres-release, you guys have moved the logic of creating roles and databases to the post-start script. Is there a reason you did this?

In many releases (UAA and CredHub), those roles and databases must exist before the job can start. If they are only created in `post-start`, the job can't start.

DB upgrade leads to bosh deploy timeout

We recently upgraded the postgres release used in Concourse, as advertised in the Concourse 3.5.0 release notes.

I did read https://github.com/cloudfoundry/postgres-release/#upgrading
and increased the databases.monit_timeout to 300 seconds.

This allowed the DB upgrade to finish without a monit timeout (it took about 2 minutes) according to /var/vcap/sys/log/postgres/postgres_ctl.log, but bosh deploy still failed with:

{"time":1508245109,"stage":"Updating instance","tags":["db"],"total":1,"task":"db/c58db631-411d-4390-9787-734be1d88eca (0) (canary)","index":1,"state":"failed","progress":100,"data":{"error":"''db/c58db631-411d-4390-9787-734be1d88eca (0)'' is not running after update. Review logs for failed jobs: postgres"}}
{"time":1508245109,"error":{"code":400007,"message":"''db/c58db631-411d-4390-9787-734be1d88eca (0)'' is not running after update. Review logs for failed jobs: postgres"}}
', "result_output" = '', "context_id" = '' WHERE ("id" = 11605)
D, [2017-10-17 12:58:29 #10554] [task:11605] DEBUG -- DirectorJobRunner: (0.001595s) COMMIT
I, [2017-10-17 12:58:29 #10554] []  INFO -- DirectorJobRunner: Task took 2 minutes 53.07573839 seconds to process.

According to the BOSH lifecycle docs, there is another timeout (probably update_watch_time) that is being exceeded.

Rather than increasing update_watch_time for all jobs, the BOSH lifecycle docs suggest that a pre-start script would be a better lifecycle hook for long-running tasks like a DB upgrade, because pre-start is not subject to a BOSH-level timeout.
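In the meantime, if raising the timeout is still necessary, BOSH allows overriding the update block per instance group, so only the database job gets a longer watch window. A sketch (instance group name and values are illustrative, not taken from the release):

```yaml
# Deployment manifest fragment: per-instance-group update override so
# only the db job waits longer ("min-max" range, in milliseconds).
instance_groups:
- name: db
  update:
    update_watch_time: 1000-600000   # wait up to 10 minutes after update
```

This avoids slowing down deploys of every other job in the deployment.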

db buffer values hardcoded

It would be nice to have the PostgreSQL buffer values templated:
shared_buffers = 128MB
temp_buffers = 8MB
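One way this could look (a sketch only; the property names below are hypothetical and not part of the release's actual spec) is to render the values from job properties in the postgresql.conf ERB template, keeping the current values as defaults:

```erb
# postgresql.conf.erb -- property names below are hypothetical
shared_buffers = <%= p("databases.shared_buffers", "128MB") %>
temp_buffers = <%= p("databases.temp_buffers", "8MB") %>
```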

Postgres doesn't restart when the VM is manually restarted via vCenter

When restarting the postgres VM on vSphere via vCenter, the startup script fails because the /var/vcap/sys/run/postgres directory is missing:

/var/vcap/jobs/postgres/bin/postgres_ctl: line 18: /var/vcap/sys/run/postgres/postgresctl.pid: No such file or directory

The monit start control script fails because it cannot write the PID file it expects.

When this folder structure is manually created before starting the process, postgres restarts correctly.

When restarted via bosh, the process restarts successfully; however, when restarted manually in vCenter, it fails.

The start script should be changed to ensure the directory exists before writing the PID file.

Upgrade to Postgres 11.3 once it's out

Hi there! We on the Concourse team are affected by a bug in Postgres 11.2 that causes Postgres to lock up until it is force-restarted. The fix is expected to land in 11.3, which should be out this week (on May 9) according to their roadmap.

We're at a point where we're evaluating downgrading to Postgres 9.6, switching to CloudSQL, or just waiting on 11.3 and force-restarting our Postgres whenever it falls over. The last option involves the least amount of work, so I figured I'd open this preemptively in hopes y'all have the bandwidth to publish a new release shortly after it's out. 🙂

Alternatively, we may be able to work around the bug by disabling parallel queries, so no rush.

Thanks!

How do I run PGATS on bosh-lite?

I've been trying to get the tests to pass for a while now, but simply can't get them to work.

The SSL tests require a config server to generate the certificates, so I configured bosh-lite with CredHub and UAA. But the bosh-cli version used by PGATS to talk to the director doesn't seem to support OAuth authentication via UAA.

I then switched the director back to using local user_management, but now there seems to be a problem generating the credentials:

Task 71 | 18:10:02 | Preparing deployment: Preparing deployment (00:00:01)
                   L Error: Config Server failed to generate value for '/Bosh-Lite-Director/pgats-fresh-cbd4b17c-29a1-9dd3-891d-94a7a8a220b4/certuser_matching_certs' with type 'certificate'. HTTP Code '404', Error: 'The request could not be completed because the credential does not exist or you do not have sufficient authorization.'

Looking at credhub.log I see

2018-04-20T18:10:03.231Z [https-jsse-nio-8844-exec-8] .... ERROR --- ExceptionHandlers: The request could not be completed because the credential does not exist or you do not have sufficient authorization.
2018-04-20T18:10:03.359Z [https-jsse-nio-8844-exec-8] ....  INFO --- CREDHUB_SECURITY_EVENTS: CEF:0|cloud_foundry|credhub|1.6.5|POST /api/v1/data|POST /api/v1/data|0|rt=1524247803253 suser=null suid=uaa-client:director_to_credhub cs1Label=userAuthenticationMechanism cs1=oauth-access-token request=/api/v1/data requestMethod=POST cs3Label=result cs3=clientError cs4Label=httpStatusCode cs4=404 src=192.168.50.6 dst=192.168.50.6

I don't know if this is related to the director using local users. I tried adding the CredHub accounts as local users with identical passwords, but that doesn't make any difference (CredHub probably wants a UAA token rather than a password anyway).

Could you please provide some documentation on how to run the tests successfully?

roles.sql.erb has a bug

Hello,

I am trying to leverage the permissions attribute in the manifest with the following:

roles:
  - name: postgres
    password: postgrespassword
    permissions: SUPERUSER

In the customizing documentation:
databases.roles[n].permissions: A list of attributes for the role. For the complete list of attributes, refer to the ALTER ROLE command options.

Looking at the command options for postgres, I found this:

ALTER ROLE name [ [ WITH ] option [ ... ] ]

where option can be:

  SUPERUSER | NOSUPERUSER
| CREATEDB | NOCREATEDB
| CREATEROLE | NOCREATEROLE
| CREATEUSER | NOCREATEUSER
| INHERIT | NOINHERIT
| LOGIN | NOLOGIN
| REPLICATION | NOREPLICATION
| CONNECTION LIMIT connlimit
| [ ENCRYPTED | UNENCRYPTED ] PASSWORD 'password'
| VALID UNTIL 'timestamp'

so my above manifest should be correct. However, the roles.permissions attribute is rendered by roles.sql.erb:

<% p("databases.roles", []).each do |role| %>
DO
$body$
BEGIN
IF NOT EXISTS (SELECT FROM pg_catalog.pg_user WHERE usename = '<%= role["name"] %>') THEN
CREATE ROLE "<%= role["name"] %>" WITH LOGIN;
END IF;
END
$body$;
<% if role["password"] %>
ALTER ROLE "<%= role["name"] %>" WITH LOGIN PASSWORD '<%= role["password"] %>';
<% end %>
<% if role["permissions"] %>
ALTER ROLE "<%= role["name"] %>" WITH <%= role["permissions"].join(' ') %>;
<% end %>
<% end %>

This successfully creates the role and sets the password. However, when role["permissions"] exists, the template runs ALTER ROLE role["name"] WITH role["permissions"].join(' '). What is the purpose of the .join(' ') at the end? It errors out with:

  • Error filling in template 'roles.sql.erb' (line 14: undefined method `join' for "SUPERUSER":String)

Shouldn't .join be removed?
