pivotal-cf / cf-rabbitmq-release
A BOSH Release of RabbitMQ
License: Apache License 2.0
We noticed that max_in_flight is set to 4 in the deployment templates. We're running our RMQ clusters with 3 nodes. With max_in_flight = 4, the remaining 2 nodes are updated simultaneously after the canary, so all HA queues are moved to a single node (the canary), which we think is not optimal.
Is there a specific reason why you set max_in_flight to 4, or can we safely reduce it to 1?
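For reference, the serial behaviour we're after would look roughly like this in the manifest's update block (a sketch; canaries and max_in_flight are standard BOSH update settings, the watch times are placeholders, not this release's defaults):

```yaml
update:
  canaries: 1
  # update one node at a time so HA queues can re-mirror between steps
  max_in_flight: 1
  canary_watch_time: 30000-180000
  update_watch_time: 30000-180000
```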
Currently, to set up clustered Rabbit nodes, the IPs of all the nodes need to be passed in via the manifest. I attempted to set up clustering by passing in hostnames, which will always be the same, but the Rabbit jobs kept failing. Is there a property to allow hostnames instead of IPs, or a way to tweak the scripts to accommodate this?
Thanks!
The release 266.0.0 installs RabbitMQ version 3.7.14, but shouldn't it be 3.7.15 from this commit:
The CLI/GUI print version 3.7.14 after using the release 266.0.0.
When https is enabled, the RabbitMQ Management console node is https only, e.g. https://[ip]:15762, and the proxy forwards https as well.
However, the route registered with the router won't work (timeout error): pivotal-rabbitmq.10.244.0.34.xip.io
That's because the router -> Rabbit proxy hop is http only.
Shaozhen Ding
Pivotal Services
I'm getting an error when executing scripts/deploy-bosh-lite on my bosh-lite: cf-rabbitmq-test not found on the Bosh Lite Director. I cloned the rabbitmq GitHub repo and ran the script scripts/deploy-bosh-lite.
➜ bosh-lite git:(master) ✗ bosh releases
Acting as user 'admin' on 'Bosh Lite Director'
+--------------+------------+-------------+
| Name | Versions | Commit Hash |
+--------------+------------+-------------+
| cf | 245+dev.1* | 1a84cd71 |
| cf-mysql | 31* | 41fda3be+ |
| local-volume | 0+dev.1 | dc5a6282 |
| routing | 0.136.0 | d29132da+ |
| syslog | 7 | 8e3ce3ac+ |
+--------------+------------+-------------+
(*) Currently deployed
(+) Uncommitted changes
Releases total: 5
➜ bosh-lite git:(master) ✗ cd ../cf-rabbitmq-release && scripts/deploy-bosh-lite
….
……
Release uploaded
Acting as user 'admin' on 'Bosh Lite Director'
Successfully updated cloud config
Deployment set to `/Users/bhaarat/code/cloudfoundry/cf-rabbitmq-release/manifests/cf-rabbitmq.yml'
in deploy
Acting as user 'admin' on deployment 'cf-rabbitmq' on 'Bosh Lite Director'
Getting deployment properties from director...
Unable to get properties list from director, trying without it...
Release 'cf-rabbitmq-test' not found on director. Unable to resolve 'latest' alias in manifest.
➜ cf-rabbitmq-release git:(master) ✗ bosh releases
Acting as user 'admin' on 'Bosh Lite Director'
+--------------+---------------+-------------+
| Name | Versions | Commit Hash |
+--------------+---------------+-------------+
| cf | 245+dev.1* | 1a84cd71 |
| cf-mysql | 31* | 41fda3be+ |
| cf-rabbitmq | 224.0.0+dev.6 | f46b35c7+ |
| local-volume | 0+dev.1 | dc5a6282 |
| routing | 0.136.0 | d29132da+ |
| syslog | 7 | 8e3ce3ac+ |
+--------------+---------------+-------------+
(*) Currently deployed
(+) Uncommitted changes
Releases total: 6
We were wondering why cf-rabbitmq-release uses the IP address of the node as its RMQ node name (in setup.sh). RMQ seems to rely on this node name in different cases, so coupling the cluster to fixed IPs has some drawbacks:
Side note: We're aware that we can back up and restore the configuration using the /api/definitions endpoint, but this does not include the queue data.
Could the node name be changed to use another unique attribute of the node, e.g. job_name/job_index? This would still be unique but would decouple the cluster from the IPs in use, i.e. allow us to move the deployment around.
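For illustration only, a node name decoupled from the IP could be derived in the setup.sh ERB template from BOSH-provided identity (spec.name and spec.index are standard BOSH ERB accessors; using them here is our suggestion, not the release's current code):

```
# hypothetical change in the setup.sh template:
RABBITMQ_NODENAME="rabbit@<%= spec.name %>-<%= spec.index %>"
export RABBITMQ_NODENAME
```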
I'm trying to deploy this on an OpenStack cloud. So far I've had no luck at all. The error I'm getting is:
Error 100: Unable to process links for deployment. Errors are:
- Multiple instance groups provide links of type 'rabbitmq-server'. Cannot decide which one to use for instance group 'rmq_z1'.
cf-rabbitmq.rmq_z1.rabbitmq-server.rabbitmq-server
cf-rabbitmq.rmq_z2.rabbitmq-server.rabbitmq-server
- Multiple instance groups provide links of type 'rabbitmq-server'. Cannot decide which one to use for instance group 'rmq_z2'.
cf-rabbitmq.rmq_z1.rabbitmq-server.rabbitmq-server
cf-rabbitmq.rmq_z2.rabbitmq-server.rabbitmq-server
The thing is, I can't see anywhere in the manifest that shows any links. All references to rabbitmq-server are just templates. I tried changing their template references to rabbitmq-server1 and rabbitmq-server2, but got:
Error 100: Job 'rabbitmq-server1' not found in Template table
I've been trying to read up on links and how they're used, but from what I can tell, there are no links in this manifest at all. Nothing provides or consumes anything.
I also tried removing the secondary instances, and then received this error:
Error 140014: Link path was not provided for required link 'rabbitmq-broker' in instance group 'rmq-broker'
I'm at a loss for what to even look for next.
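In case it helps anyone who lands here: with BOSH links, when two instance groups provide the same link type, the consumer has to pick one explicitly. A sketch of the disambiguation (the instance-group and link names mirror the error output above, but the exact manifest layout is an assumption):

```yaml
instance_groups:
- name: rmq_z1
  jobs:
  - name: rabbitmq-server
    release: cf-rabbitmq
    provides:
      rabbitmq-server: {as: rmq_z1_server}   # give each provider a unique alias
- name: rmq_z2
  jobs:
  - name: rabbitmq-server
    release: cf-rabbitmq
    provides:
      rabbitmq-server: {as: rmq_z2_server}
- name: rmq-broker
  jobs:
  - name: rabbitmq-broker
    release: cf-rabbitmq
    consumes:
      rabbitmq-server: {from: rmq_z1_server} # consume one alias explicitly
```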
Hi All,
While migrating our RabbitMQ cluster from a single AZ (z1) to multi-AZ (z1, z2, z3), two new RabbitMQ nodes are created in z2 and z3 and join the cluster, and the two existing nodes in z1 are deleted.
But when I check the cluster status, it reports 5 nodes in the cluster, of which 3 are currently running.
Example:
Cluster status of node rabbit@d6c3dc3f73a21864ffd1f702be9442d8
[{nodes,[{disc,[rabbit@3c316ebd7ecca8c1dd6ed8953d387a39,
rabbit@ac2b27f80f64fab4371a20d8dc27c71d,
rabbit@b03884f0dcaec0453b7331fd5aaf17c7,
rabbit@d6c3dc3f73a21864ffd1f702be9442d8,
rabbit@df2526feccc0fc58d96e914e40f20ca8]}]},
{running_nodes,[rabbit@3c316ebd7ecca8c1dd6ed8953d387a39,
rabbit@ac2b27f80f64fab4371a20d8dc27c71d,
rabbit@d6c3dc3f73a21864ffd1f702be9442d8]},
{cluster_name,<<"rabbit@localhost">>},
{partitions,[]},
{alarms,[{rabbit@3c316ebd7ecca8c1dd6ed8953d387a39,[]},
{rabbit@ac2b27f80f64fab4371a20d8dc27c71d,[]},
{rabbit@d6c3dc3f73a21864ffd1f702be9442d8,[]}]}]
I expect the cluster status to contain only the 3 nodes, but the cleanup of the other two nodes never happens. I can see that /var/vcap/store/rabbitmq/mnesia/db/cluster_nodes.config contains 5 nodes, even though my env is set up as a 3-node cluster.
ENV:
SERVER_START_ARGS='-rabbit log_levels [{connection,info}] -rabbit cluster_nodes {[rabbit@d6c3dc3f73a21864ffd1f702be9442d8,rabbit@ac2b27f80f64fab4371a20d8dc27c71d,rabbit@3c316ebd7ecca8c1dd6ed8953d387a39],disc} -rabbit disk_free_limit 3221225472 -rabbit cluster_partition_handling pause_minority -rabbit halt_on_upgrade_failure false -rabbitmq_mqtt subscription_ttl 1800000 -rabbitmq_management http_log_dir "/var/vcap/sys/log/rabbitmq-server"'
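As a possible workaround (our assumption, not something this release documents): RabbitMQ can be told to drop the stale members explicitly, e.g.:

```
# run on any running cluster member, once per stale node
# (node names taken from the cluster status output above):
rabbitmqctl forget_cluster_node rabbit@b03884f0dcaec0453b7331fd5aaf17c7
rabbitmqctl forget_cluster_node rabbit@df2526feccc0fc58d96e914e40f20ca8
rabbitmqctl cluster_status
```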
It appears that we currently allow only one RabbitMQ cluster per CF foundation because the service-uuid is set as a constant. Please see
The client is using this release for both PCF and CF (OSS) foundations (vSphere and AWS, respectively) and views this as a potential capacity constraint, since it will limit them to one Rabbit cluster per CF.
I apologize if this is already in the Tracker, but I do not currently have access to it. Please add me, if possible, so I can review our User Stories prior to submitting and to track stories as needed.
Thanks!
It appears that only the rabbitmq-server job can be configured to forward its logs via syslog (see spec and rsyslog config).
As far as I can tell, you cannot configure the rabbitmq-haproxy (spec) or rabbitmq-broker (spec) jobs to forward their logs.
Is this correct? If so, is this an intentional omission, or just a missing "feature"?
Thanks!
Currently you can customize the management_domain (used in the Dashboard URL advertised to CF services). However, the route-registrar process on the rabbitmq-broker job does not advertise that domain in a route; instead it advertises pivotal-rabbitmq.<system-domain>. These values should probably be synchronized.
What is the logic behind committing new releases with the tag versions out of order?
For example, days ago there was a commit with the tag v270.0.0, and now the latest commit has the lower version v267.3.0. Going by time (as also listed here: https://github.com/pivotal-cf/cf-rabbitmq-release/releases), it looks like the latest stable is v267.3.0, but then what is the point of v270.0.0?
It would probably be better to have separate branches for "possible test" releases, but in any case I would highly suggest keeping a healthy sequence in the tags.
I see it referenced in the README (https://github.com/pivotal-cf/cf-rabbitmq-broker-release which gives a 404) and in https://github.com/pivotal-cf/cf-rabbitmq-release/blob/master/manifests/cf-rabbitmq-colocated-with-multitenant-broker-template.yml.
Are there plans to make it public or do we need to package our own?
Hi!
We're currently firefighting a nasty issue in 3.7.4 and 3.7.5 with the vhost supervisor. It appears the issue has been fixed in 3.7.6 (rabbitmq/rabbitmq-management#575).
Could we bump to RMQ 3.7.6 in the bump of this boshrelease?
Thank you!
We have configured RabbitMQ clusters with HAProxy using the cf-multitentant-rabbitmq-broker release without TLS. Now we configure self-signed certs on the RabbitMQ nodes with the following details:
cn = rabbitmq node-1 ip
sn = rabbitmq node-1 ip, rabbitmq node-2 ip
We are using rabbit-example-app to test it with TLS:
https://github.com/pivotal-cf/rabbit-example-app.git
When we use the same client and server certs as those configured on the nodes, we get the following error:
"Connection to amqps://***@10.x.x.x failed to start hostname haproxy-ip does not match server certificate"
We are using the following release versions,
cf-multitentant-rabbitmq-broker - v49
cf-rabbitmq-release - v265
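For what it's worth, the error suggests the address the client dials (the HAProxy IP) is not covered by the certificate's CN/SANs. A standard openssl check (not specific to this release) to see what a cert actually covers:

```
openssl x509 -in server.crt -noout -text | grep -A1 'Subject Alternative Name'
```

If the HAProxy address is missing there, either reissue the cert with it included in the SANs, or point the client at a name/IP the cert does cover.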
There are a number of instances in the base manifests that have some quite unusual variable names, an example being:
management_domain: ((rabbitmq-management-domain)).((bosh-hostname))
or
uris:
- ((rabbitmq-broker-domain)).((bosh-hostname))
Considering a URI would normally read <hostname>.<domain>, this becomes quite confusing.
Hi guys
The property rabbitmq-server.config allows specifying additional RabbitMQ server config. Currently, it expects a base64-encoded string, as described in the docs.
However, this relies on the bosh CLI to interpret ERB expressions in the deployment manifest. This was the case for the old Ruby CLI, but the new Golang CLI (v2) intentionally does not support this:
cloudfoundry/bosh-cli#282
This forces bosh CLI v2 users of this release to somehow (manually) preprocess the deployment manifest before deploying, or to put base64 directly in the manifest; neither option is nice. It would be nicer if we could put the config as a plain string, e.g. rabbitmq-server.raw_config or similar. Would this be possible?
I can't see why passing rabbitmq-server.config as a plain string could lead to issues, i.e.:
rabbitmq-server:
config: |
[{rabbit, [{vm_memory_high_watermark,0.5}]}].
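In the meantime, the base64 value for the current rabbitmq-server.config property can be produced from plain shell (using the example config above):

```shell
# encode the raw Erlang config for the rabbitmq-server.config property
cfg='[{rabbit, [{vm_memory_high_watermark,0.5}]}].'
encoded=$(printf '%s' "$cfg" | base64)
echo "$encoded"

# sanity check: decoding returns the original string
printf '%s' "$encoded" | base64 --decode
```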
Here is the listing of /var/vcap/packages
after an upgrade to the release v240.0.0:
# ls -l
total 32
lrwxrwxrwx 1 root root 73 Mar 14 22:17 bosh-dns -> /var/vcap/data/packages/bosh-dns/67b977dca6e9b86cad54af73c08150b34e99d309
lrwxrwxrwx 1 root root 71 Mar 14 22:17 erlang -> /var/vcap/data/packages/erlang/1c7771c7774d4c7c97a1ba9a666b5ad5fb45a0c2
lrwxrwxrwx 1 root root 78 Mar 14 22:17 node_exporter -> /var/vcap/data/packages/node_exporter/923a6fbd61d30904b8ff3da59fdba3e57fc2743a
lrwxrwxrwx 1 root root 80 Mar 14 22:17 rabbitmq-common -> /var/vcap/data/packages/rabbitmq-common/30344ae448f136ceda4ac0fb595d561674514d9f
lrwxrwxrwx 1 root root 80 Mar 14 22:17 rabbitmq-server -> /var/vcap/data/packages/rabbitmq-server/e2bba8e813a3677de354efdedca9da94e4d12cb9
lrwxrwxrwx 1 root root 84 Mar 28 16:11 rabbitmq-server-3.6 -> /var/vcap/data/packages/rabbitmq-server-3.6/ccf881b215c3493c2f215d4c51af19e585e8ddc9
lrwxrwxrwx 1 root root 84 Mar 28 16:11 rabbitmq-server-3.7 -> /var/vcap/data/packages/rabbitmq-server-3.7/9d526adf0dd98198e4a1d73ca00cd1002473ff27
lrwxrwxrwx 1 root root 93 Mar 28 16:11 rabbitmq-upgrade-preparation -> /var/vcap/data/packages/rabbitmq-upgrade-preparation/61fc96b42e4be83a9065284a00608be67eb0cba7
The rabbitmq-server link is pointing at the former package e2bba8e... from the deployment of the previous version 238 of the BOSH release. Here you need to know that BOSH keeps the previous packages around in order to speed up any subsequent rollback.
And here is what configure_rmq_version() (from the pre-start.bash template) has created:
# ls -l /var/vcap/packages/rabbitmq-server/rabbitmq-server*
lrwxrwxrwx 1 root root 42 Mar 28 16:11 /var/vcap/packages/rabbitmq-server/rabbitmq-server-3.7 -> /var/vcap/packages/rabbitmq-server-3.7
In pre-start.bash, the use of ln without first removing any pre-existing /var/vcap/packages/rabbitmq-server (file, directory, or link) is a classic tricky case:
configure_rmq_version() {
ln -f -s /var/vcap/packages/rabbitmq-server-"$RMQ_SERVER_VERSION" /var/vcap/packages/rabbitmq-server
}
Instead, any existing /var/vcap/packages/rabbitmq-server link should be removed first, and the ln invocation should then not need the -f flag.
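To make the pitfall concrete, here is a small reproduction in a scratch directory (package names are stand-ins; using ln's -n flag, which repoints the link itself instead of dereferencing it, is one alternative to removing the link first):

```shell
# Reproduce the pitfall in a scratch directory:
demo=$(mktemp -d)
mkdir "$demo/pkg-3.6" "$demo/pkg-3.7"
ln -s "$demo/pkg-3.6" "$demo/current"   # simulate the link left by the previous deploy

# the pattern used by configure_rmq_version():
ln -f -s "$demo/pkg-3.7" "$demo/current"

# 'current' still points at pkg-3.6; ln dereferenced the existing
# directory symlink and created a stray 'pkg-3.7' link INSIDE pkg-3.6:
readlink "$demo/current"
ls "$demo/pkg-3.6"

# with -n (do not dereference the destination), the link itself is replaced:
ln -s -f -n "$demo/pkg-3.7" "$demo/current"
readlink "$demo/current"
```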
I tried to fix the Bosh Release with this code:
configure_rmq_version() {
rm -rf /var/vcap/packages/rabbitmq-server
ln -s /var/vcap/packages/rabbitmq-server-"$RMQ_SERVER_VERSION" /var/vcap/packages/rabbitmq-server
}
But then I hit an issue when actually upgrading the cluster. The canary node in my deployment fails at starting the rabbitmq-server job with the new RabbitMQ 3.7 binary.
To my understanding, configure_rmq_version() is called after rabbitmq-config-vars.bash has been loaded, because the actual engine version is defined there. The problem is that this should be done earlier, in order to ensure that no operations involving rabbitmqctl are made before the package link is properly created.
Currently, run_rabbitmq_upgrade_preparation_shutdown_cluster is called before configure_rmq_version(), and when rabbitmq-upgrade-preparation is run, it uses the rabbitmqctl from the previous deployment (because the link to the new one is not yet set).
Normally, a fresh new deployment should not even work, because rabbitmq-upgrade-preparation is not supposed to find any /var/vcap/packages/rabbitmq-server/bin/rabbitmqctl, since the /var/vcap/packages/rabbitmq-server link is not created yet. I didn't test this case though.
Anyway, I'll submit my work-in-progress patch and let you dig into the issue further.
This BOSH release is hard to debug, especially the Bash scripts from the rabbitmq-server job templates: though they are individually well written, and functions are individually properly named (which is obviously the result of good programming skills in the first place), the scripts are too complicated as a whole. They need to be refactored in order to simplify things. I'm still trying to figure out which awesome features of this BOSH release could possibly lead to such tangled code.
As a return on experience with the Cassandra BOSH release, we leveraged the move to BPM to drive a major cut into bloated Bash scripts. But even that situation was not close to the number of script lines we can see in these rabbitmq-server job templates.
The current issue I'm raising here really looks like a consequence of this complexity. Thus the remark.
Hey y'all, one of the other RabbitMQ BOSH releases (https://github.com/rabbitmq/rabbitmq-server-boshrelease/blob/master/jobs/rabbitmq-server/spec#L141) supports HiPE (High Performance Erlang), which I was interested in playing around with. Is this something that would be desirable to add to this BOSH release? Edit: Just realized/remembered that the other BOSH release is experimental.
It seems something is missing from the example.
WARNING: You're currently running as root; probably by accident.
Press control-C to abort or Enter to continue as root.
Set LEIN_ROOT to disable this warning.
Reflection warning, /tmp/form-init7134059142373307675.clj:1:1014 - call to static method invokeStaticMethod on clojure.lang.Reflector can't be resolved (argument types: unknown, java.lang.String, unknown).
Reflection warning, clj_yaml/core.clj:27:5 - call to org.yaml.snakeyaml.Yaml ctor can't be resolved.
Reflection warning, clj_yaml/core.clj:59:22 - reference to field getLine can't be resolved.
Reflection warning, clj_yaml/core.clj:60:23 - reference to field getIndex can't be resolved.
Reflection warning, clj_yaml/core.clj:61:24 - reference to field getColumn can't be resolved.
Reflection warning, clj_yaml/core.clj:111:3 - call to method dump can't be resolved (target class is unknown).
Reflection warning, clj_yaml/core.clj:116:11 - call to method load can't be resolved (target class is unknown).
columns.clj:130 recur arg for primitive local: current_indent is not matching primitive, had: Object, needed: long
Auto-boxing loop arg: current-indent
exception.clj:44 recur arg for primitive local: i is not matching primitive, had: Object, needed: long
Auto-boxing loop arg: i
Reflection warning, clj_http/multipart.clj:26:4 - call to org.apache.http.entity.mime.content.FileBody ctor can't be resolved.
Reflection warning, cheshire/core.clj:63:3 - call to method createJsonGenerator on com.fasterxml.jackson.core.JsonFactory can't be resolved (argument types: unknown).
2016-10-20 13:25:28.259:INFO::main: Logging initialized @86645ms
Reflection warning, ring/adapter/jetty9.clj:51:23 - reference to field getRemote can't be resolved.
Reflection warning, ring/adapter/jetty9.clj:51:23 - call to method sendString can't be resolved (target class is unknown).
Reflection warning, ring/adapter/jetty9.clj:47:5 - reference to field getRemote can't be resolved.
Reflection warning, ring/adapter/jetty9.clj:43:23 - reference to field getRemote can't be resolved.
Reflection warning, ring/adapter/jetty9.clj:43:23 - call to method sendString can't be resolved (target class is unknown).
Reflection warning, ring/adapter/jetty9.clj:39:23 - reference to field getRemote can't be resolved.
Reflection warning, ring/adapter/jetty9.clj:39:23 - call to method sendBytes can't be resolved (target class is unknown).
Reflection warning, ring/adapter/jetty9.clj:89:7 - call to method onWebSocketClose can't be resolved (target class is unknown).
Reflection warning, ring/adapter/jetty9.clj:82:7 - call to method onWebSocketConnect can't be resolved (target class is unknown).
Reflection warning, ring/adapter/jetty9.clj:212:30 - call to org.eclipse.jetty.server.ServerConnector ctor can't be resolved.
Reflection warning, ring/adapter/jetty9.clj:220:33 - call to org.eclipse.jetty.server.ServerConnector ctor can't be resolved.
Reflection warning, ring/middleware/basic_authentication.clj:21:40 - reference to field getBytes can't be resolved.
Reflection warning, io/pivotal/pcf/rabbitmq/main.clj:76:13 - reference to field join can't be resolved.
Config validation failed. Errors:
Hi,
I'm facing the following problem deploying rabbitmq 224 on OpenStack.
D, [2017-02-13 12:15:56 #29852] [task:443] DEBUG -- DirectorJobRunner: SENT: hm.director.alert {"id":"fa0bc75b-d46f-4192-a41d-74ede69af2a5","severity":3,"source":"director","title":"director - error during update deployment","summary":"Error during update deployment for 'cf-rabbitmq' against Director '87c25ba6-be78-4c13-8c7a-3ca9017594c7': #<Bosh::Director::JobMissingLink: Link path was not provided for required link 'rabbitmq-broker' in instance group 'rmq-broker'>","created_at":1486988156}
E, [2017-02-13 12:15:56 #29852] [task:443] ERROR -- DirectorJobRunner: Link path was not provided for required link 'rabbitmq-broker' in instance group 'rmq-broker'
Here is my stub.
meta:
  stemcell:
    version: latest
  compilation_az: us-east-1a
  az1: us-east-1a
  az2: us-east-1b
director_uuid: 87c25ba6-be78-4c13-8c7a-3ca9017594c7
name: cf-rabbitmq
releases:
- name: cf-rabbitmq
  version: latest
jobs:
- name: rmq_z1
  instances: 1
  networks:
  - name: rabbitmq_z1
    static_ips:
    - 10.0.0.210
- name: rmq-broker
  instances: 1
  networks:
  - name: rabbitmq_z1
    static_ips:
    - 10.0.0.212
- name: haproxy_z1
  instances: 1
  networks:
  - name: rabbitmq_z1
    static_ips:
    - 10.0.0.213
- name: broker-registrar
  properties:
    broker:
      name: p-rabbitmq
properties:
  # for broker and route registrars
  cf:
    domain: de.cloudlab.com
    admin_password: password
    admin_username: admin
    api_url: "http://api.de.cloudlab.com"
  nats:
    host: "10.0.0.127"
    port: "4222"
    password: password
    username: nats-user
  rabbitmq-server:
    administrators:
      broker:
        username: username
        password: password
    static_ips:
    - 10.0.0.210
    - 10.0.0.211
    ssl:
      cert: |
        -----BEGIN CERTIFICATE-----
        your certificate
        -----END CERTIFICATE-----
      key: |
        -----BEGIN RSA PRIVATE KEY-----
        your key
        -----END RSA PRIVATE KEY-----
      cacert: |
        -----BEGIN CERTIFICATE-----
        certificate for your CA
        -----END CERTIFICATE-----
  rabbitmq-haproxy:
    stats:
      username: username
      password: password
  rabbitmq-broker:
    ip: 10.0.0.212
    cc_endpoint: http://api.de.cloudlab.com
    service:
      username: "p1-rabbit"
      password: "password"
    logging:
      level: debug
      print_stack_traces: false
    rabbitmq:
      management_domain: pivotal-rabbitmq.de.cloudlab.com
      management_ip: 10.0.0.213
      ssl: false
      hosts:
      - 10.0.0.210
networks:
- name: rabbitmq_z1
  type: manual
  subnets:
  - range: 10.0.0.0/24
    gateway: 10.0.0.1
    reserved:
    - 10.0.0.1 - 10.0.0.209
    static:
    - 10.0.0.210 - 10.0.0.220
    dns:
    - 139.25.25.243
    cloud_properties:
      subnet: 116355d4-c925-4c66-b49e-6ef8f64764cc
      security_groups:
      - bosh
resource_pools:
- name: services-small-z1
  cloud_properties:
    instance_type: m3.medium
When CF platform users need an RMQ service, they can obtain it in a self-service manner via the marketplace/broker.
But we would also like to use the RMQ service at an infrastructure level, and for this we would prefer to configure the needed vhosts and users directly in the deployment manifest, so that they're automatically provisioned at deployment time of the BOSH release.
E.g. the cf-mysql release (which is not only used as a service, but very often also at infrastructure level, e.g. to back CF components themselves) offers such an option:
https://github.com/cloudfoundry/cf-mysql-release/blob/62b1e35c31ba412c1647e2a7f916c5fa5f89a6dc/jobs/mysql/spec#L127
https://github.com/cloudfoundry/mariadb_ctrl/blob/b2b272a1ef3bca647bd314b8524dcbb1ca3bba8e/mariadb_helper/mariadb_helper.go#L202
Would you welcome such a PR that adds the option to specify such "seeded" vhosts/users to the RMQ release?
How we imagine it:
# add a new configuration option:
rabbitmq-server.seeded_vhosts:
  description: 'Set of vhosts to seed'
  default: {}
  example: |
    - name: vhost1
      username: user1
      password: pw1
      tags: administrator
      permissions: .* .* .* # maybe also split this up into the individual conf/read/write fields?
    - name: vhost2
      username: user2
      password: pw2
      tags: monitoring
      permissions: conf read write
These vhosts, users and permissions would be created upon startup (effectively overwriting already existing user attributes/permissions, similar to how the admin user provisioning is done).
This could then be implemented in a couple of lines of shell script in rabbitmq-server.init.bash, reusing (and probably refactoring) the methods handling the admin/mgmt user setup.
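For illustration, the per-entry seeding could boil down to standard rabbitmqctl calls like these (names taken from the example above; error handling for already-existing users omitted):

```
rabbitmqctl add_vhost vhost1
rabbitmqctl add_user user1 pw1
rabbitmqctl set_user_tags user1 administrator
rabbitmqctl set_permissions -p vhost1 user1 ".*" ".*" ".*"
```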
Are there any plans to upgrade the RabbitMQ version to 3.6.1?
This would fix the security issue (CVE-2015-8786) in the management plugin
Hi all
While changing configuration of our rabbitmq deployment lately, we ran into the following issue:
Result: the whole cluster was down. The post-deploy script then failed on the nodes, so we noticed something was wrong.
If we added a post-start script, this scenario could not happen: since post-start scripts are run on each VM before BOSH marks it as healthy, individual node failures would be caught before moving on to the next instance.
UAA already does that in a similar fashion: https://github.com/cloudfoundry/uaa-release/blob/develop/jobs/uaa/templates/bin/post-start
What do you think? We're happy to provide the post-start
as a PR.
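A minimal sketch of the retry logic such a post-start could use (the helper and its parameters are ours; calling rabbitmqctl node_health_check is an assumption about the right probe for this release):

```shell
# Generic retry helper for a post-start script: keep probing until the
# health command succeeds, or give up after a number of attempts.
wait_until_healthy() {
  local cmd="$1" retries="${2:-30}" sleep_secs="${3:-5}"
  local i
  for ((i = 0; i < retries; i++)); do
    if $cmd >/dev/null 2>&1; then
      return 0   # BOSH will mark the VM healthy
    fi
    sleep "$sleep_secs"
  done
  return 1       # the deploy halts here instead of rolling on to the next node
}

# in the real job template, something like:
# wait_until_healthy "/var/vcap/packages/rabbitmq-server/bin/rabbitmqctl node_health_check"
```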
Hey everyone,
We're trying to deploy a new RabbitMQ deployment using the latest version, v253.0.0. We interpolated the manifest as per the installation instructions in the readme and hit this error:
16:52:23 | Preparing deployment: Preparing deployment
16:52:24 | Warning: IP address not available for the link provider instance: rmq/89c5ac2f-39bf-4acf-a187-676397e7fff1
16:52:24 | Warning: IP address not available for the link provider instance: rmq/2de67b09-8bc1-4f4e-90ad-dc775848cc3a
16:52:24 | Warning: IP address not available for the link provider instance: rmq/5e4a215e-0d45-429f-b5ba-64e3a8f4c541
16:52:24 | Warning: IP address not available for the link provider instance: rmq/89c5ac2f-39bf-4acf-a187-676397e7fff1
16:52:24 | Warning: IP address not available for the link provider instance: rmq/2de67b09-8bc1-4f4e-90ad-dc775848cc3a
16:52:24 | Warning: IP address not available for the link provider instance: rmq/5e4a215e-0d45-429f-b5ba-64e3a8f4c541
16:52:24 | Warning: IP address not available for the link provider instance: nats/0686302f-a30a-43b9-b5d4-2a0268835ff7
16:52:24 | Warning: IP address not available for the link provider instance: nats/e8b2af09-e742-457a-9e58-4c59e6671521
16:52:24 | Warning: IP address not available for the link provider instance: nats/ed8fd81a-4839-4b15-af45-55255da6b882
16:52:24 | Preparing deployment: Preparing deployment (00:00:01)
16:52:26 | Preparing package compilation: Finding packages to compile (00:00:00)
16:52:26 | Compiling packages: erlang/58d3f5df2010e2843e46aca77719f6e1d5a9c2eac6f3a5e3a28307a766cbfd38 (00:10:04)
17:03:37 | Creating missing vms: rmq/2de67b09-8bc1-4f4e-90ad-dc775848cc3a (2)
17:03:37 | Creating missing vms: rmq/89c5ac2f-39bf-4acf-a187-676397e7fff1 (0)
17:03:37 | Creating missing vms: rmq/5e4a215e-0d45-429f-b5ba-64e3a8f4c541 (1)
17:03:37 | Creating missing vms: haproxy/aa0f1e50-94a2-4038-bf67-40617c6a40b5 (0)
17:04:15 | Creating missing vms: rmq/2de67b09-8bc1-4f4e-90ad-dc775848cc3a (2) (00:00:38)
17:04:18 | Creating missing vms: rmq/5e4a215e-0d45-429f-b5ba-64e3a8f4c541 (1) (00:00:41)
17:04:20 | Creating missing vms: rmq/89c5ac2f-39bf-4acf-a187-676397e7fff1 (0) (00:00:43)
17:04:21 | Creating missing vms: haproxy/aa0f1e50-94a2-4038-bf67-40617c6a40b5 (0) (00:00:44)
17:04:25 | Updating instance haproxy: haproxy/aa0f1e50-94a2-4038-bf67-40617c6a40b5 (0) (canary)
17:04:25 | Updating instance rmq: rmq/89c5ac2f-39bf-4acf-a187-676397e7fff1 (0) (canary) (00:00:38)
L Error: Action Failed get_task: Task ce089a46-f3fe-4a2a-5ffb-f85d15a2f8be result: 1 of 1 pre-start scripts failed. Failed Jobs: rabbitmq-server.
17:05:12 | Updating instance haproxy: haproxy/aa0f1e50-94a2-4038-bf67-40617c6a40b5 (0) (canary) (00:00:47)
17:05:12 | Error: Action Failed get_task: Task ce089a46-f3fe-4a2a-5ffb-f85d15a2f8be result: 1 of 1 pre-start scripts failed. Failed Jobs: rabbitmq-server.
When I ssh into rmq/89c5ac2f-39bf-4acf-a187-676397e7fff1 and check the logs, in pre-start.stderr.log I see the following error:
2018-11-15 17:05:00 inet_config: parse error in ~ts~n^M
2018-11-15 17:05:01 inet_config: parse error in ~ts~n^M
comm: file 1 is not in sorted order
And in pre-start.stdout.log, I see:
=ERROR REPORT==== 15-Nov-2018::17:05:00 ===
inet_config: parse error in /var/vcap/store/rabbitmq/erl_inetrc
I tried deploying v252.0.0 and also got the same error. Could you help with what's gone wrong here?
Hi team!!
I'm installing rabbitmq on vsphere using the below deployment manifest.
https://gist.github.com/cpraveen412/fc724dccd4338e9dc771b3b5597e2e10
The deployment was successful, but I'm getting an error while running the errand command:
bosh run errand broker-registrar
Below are the errors.
My Environment
vsphere vcenter (with single datacenter, single cluster and only one node in that cluster)
bosh stemcell v3263.7
cf v245
rabbitmq v224.0.0
So we've hit the memory threshold on one of our rabbit-server node today. Looking at the configuration found this:
[{rabbit, [{vm_memory_high_watermark, 0.4}]}].
How can we change this 40% limit? 40% seems quite low on my BOSH-managed VM: looking at the BOSH vitals when the alarm was triggered, only 51% of the RAM was in use, so a 40% threshold wastes a lot of memory.
I can't find anywhere to update this value.
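For anyone else hitting this before the release exposes a property: the watermark is a standard RabbitMQ setting, so it can at least be raised at runtime per node (not persistent across restarts), or set via the classic config format:

```
# runtime change on one node; lost when the node restarts:
rabbitmqctl set_vm_memory_high_watermark 0.6

# persistent equivalent in the classic config format:
# [{rabbit, [{vm_memory_high_watermark, 0.6}]}].
```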
The RABBITMQ_MNESIA_BASE and _DIR directories are on /var/vcap/store rather than the persistent volume /var/vcap/data. Our OpenStack stemcells only have a 3GB root partition, so they can fill up fairly quickly.
Is there a reason it doesn't use the volume? What is the volume on rabbit-server for?
I think the v225 release is made for use with BOSH v2.
But the "scripts/deploy-bosh-lite" file still contains BOSH v1 CLI commands.
For example,
Hi!
At the moment I need to add some additional community plugins to an existing RabbitMQ BOSH installation. Is there a simple way to add external plugins without reworking the existing release and switching to a develop version?
Thanks!
Hello,
Please, can anyone help me with the differences between version 215.0.0 and the latest version?
For example, in the cf-rabbitmq-template file: instance groups, properties, and so on.
Thank you
When we update/migrate from 3.6 to 3.7 (3-node cluster), we get the error mentioned below. After all the nodes get updated, sometimes the cluster never comes up.
2018-05-17 06:23:04.073 [error] <0.5.0>
Error description:
init:do_boot/3
init:start_em/1
rabbit:start_it/1 line 445
rabbit:'-boot/0-fun-0-'/0 line 296
rabbit_upgrade:run_mnesia_upgrades/2 line 155
rabbit_upgrade:die/2 line 212
throw:{upgrade_error,"\n\n****\n\nCluster upgrade needed but other disc nodes shut down after this one.\nPlease first start the last disc node to shut down.\n\nNote: if several disc nodes were shut down simultaneously they may all\nshow this message. In which case, remove the lock file on one of them and\nstart that node. The lock file on this node is:\n\n /var/vcap/store/rabbitmq/mnesia/db/nodes_running_at_shutdown \n\n****\n\n\n"}
Log file(s) (may contain more information):
#36 is related to this issue.
The cf-rabbitmq.yml manifest seems to have settings that the team is using for testing purposes. This breaks running scripts/deploy-bosh-lite for end users trying to deploy cf-rabbitmq on their own instances of bosh-lite.
Could the balance option be made configurable in https://github.com/pivotal-cf/cf-rabbitmq-release/blob/master/jobs/rabbitmq-haproxy/templates/haproxy.config.erb#L41, or default to balance roundrobin?
Currently, only one NATS server can be entered into the RabbitMQ manifest under the global properties for the broker.
Ideally, an array would be allowed for this entry. Having tried entering one, it will not take it, since the release does not appear to allow for it.
Hi guys
If I check the releases, there are three different branches maintained:
We were wondering: What's the difference between them? Is there some doc about it?
We also noticed that only the artifacts based on master are published on bosh.io.
Is there a reason for this?
We found this in Pivotal CloudFoundry when upgrading the Rabbit Tile.
SPEC file:
rabbitmq-server.disk_alarm_threshold:
  description: "The threshold in bytes of free disk space at which rabbitmq will raise an alarm"
  default: "{mem_relative,0.4}"
setup.sh.erb:
SERVER_START_ARGS="SERVER_START_ARGS='-rabbitmq_clusterer config "${CLUSTER_CONFIG}" -rabbit log_levels [{connection,info}] -rabbit disk_free_limit <%= disk_alarm_threshold %> -rabbit cluster_partition_handling <%= cluster_partition_handling %> -rabbit halt_on_upgrade_failure false -rabbitmq_mqtt subscription_ttl 1800000"
It should either be "[{mem_relative,0.4}]" in the spec file, or the injection into setup.sh.erb should be [<%= disk_alarm_threshold %>].
It probably passed tests because monit does not realize that the startup of the job actually errors out with:
BOOT FAILED
Error description:
{could_not_start,rabbit,
{error,
{{shutdown,
{failed_to_start_child,rabbit_memory_monitor,
{badarg,
[{lists,member,[disk,{error,bad_module}],[]},
{rabbit_memory_monitor,init,1,
[{file,"src/rabbit_memory_monitor.erl"},
{line,121}]},
{gen_server2,init_it,6,
[{file,"src/gen_server2.erl"},{line,554}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,247}]}]}}},
{child,undefined,rabbit_memory_monitor_sup,
{rabbit_restartable_sup,start_link,
[rabbit_memory_monitor_sup,
{rabbit_memory_monitor,start_link,[]},
false]},
transient,infinity,supervisor,
[rabbit_restartable_sup]}}}}
We are deploying this BOSH release in an HA mode and we're struggling to get it working.
In terms of the HAProxy and RabbitMQ nodes, it's working fine. We tested killing a RabbitMQ node and the HAProxy connection balanced to the other nodes.
But we are unable to get the broker to reconnect to other HAProxies when we kill one of them. If we kill the first HAProxy (as it appears in the rabbitmq-broker.rabbitmq.hosts property), we get an exception from the broker because it is unable to connect. This does not happen if we kill the second HAProxy, but we believe that is only because the broker never connects to the second one. I'm not a Clojure expert, but it seems the connection uses a single IP address, regardless of how many proxies/nodes are configured in that property.
We later used the rabbitmq-broker.rabbitmq.dns_host property by registering an HAProxy route (using the routing release, as shown in the cf-rabbitmq.yml example manifest file). But the broker was unable to start because it could not connect to the RabbitMQ API. We believe this is because the connection uses a hardcoded port (15672), whereas when using a CF route the port must be 80 or 443. We didn't find any way to override this port with a manifest property.
We manually modified the broker code to use port 80 and then it worked, but we wonder whether we missed a step or whether there is another way to configure all of this in HA mode.
Hello,
We have deployed version 252.0.0; however, after the upgrade we are still seeing the version reported as 3.6.16, while we were expecting 3.7.7.
Is there something we have missed here?
Thanks.
I'm working on a Grails 2.5.3 application using the rabbitmq plugin. I'm deploying the application to PCF Dev and have created the p-rabbitmq service. I get the connection properties in my code like this:
rabbitmq {
connectionfactory {
def dbInfo = cloud?.getServiceInfo('rabbitmq')
username = dbInfo?.userName
password = dbInfo?.password
hostname = dbInfo?.host
port = dbInfo?.port
}
queues = {
exchange name: 'amq.direct', type: direct, durable: true, autoDelete: false, {
queue1 durable: true
queue2 durable: true
}
}
}
As you can see, I am creating two queues, queue1 and queue2. When I deploy the application, I can see the following environment variables:
{
"VCAP_SERVICES": {
"p-mysql": [
{
"credentials": {
"hostname": "mysql-broker.local.pcfdev.io",
"jdbcUrl": "jdbc:mysql://mysql-broker.local.pcfdev.io:3306/cf_f7626b32_02e4_46f0_84d2_7002c383c3d6?user=JW89THSa6voZbcyb\u0026password=96lFfCeFmzPZCDo0",
"name": "cf_f7626b32_02e4_46f0_84d2_7002c383c3d6",
"password": "96lFfCeFmzPZCDo0",
"port": 3306,
"uri": "mysql://JW89THSa6voZbcyb:[email protected]:3306/cf_f7626b32_02e4_46f0_84d2_7002c383c3d6?reconnect=true",
"username": "JW89THSa6voZbcyb"
},
"label": "p-mysql",
"name": "mysql",
"plan": "512mb",
"provider": null,
"syslog_drain_url": null,
"tags": [
"mysql"
],
"volume_mounts": []
}
],
"p-rabbitmq": [
{
"credentials": {
"dashboard_url": "https://rabbitmq-management.local.pcfdev.io/#/login/1b200e1d-139e-44ca-a284-59c6c6def87f/u15slucb59vosh3888fe1hm48m",
"hostname": "rabbitmq.local.pcfdev.io",
"hostnames": [
"rabbitmq.local.pcfdev.io"
],
"http_api_uri": "https://1b200e1d-139e-44ca-a284-59c6c6def87f:[email protected]/api/",
"http_api_uris": [
"https://1b200e1d-139e-44ca-a284-59c6c6def87f:[email protected]/api/"
],
"password": "u15slucb59vosh3888fe1hm48m",
"protocols": {
"amqp": {
"host": "rabbitmq.local.pcfdev.io",
"hosts": [
"rabbitmq.local.pcfdev.io"
],
"password": "u15slucb59vosh3888fe1hm48m",
"port": 5672,
"ssl": false,
"uri": "amqp://1b200e1d-139e-44ca-a284-59c6c6def87f:[email protected]:5672/913f0907-5fb1-4a37-ae7f-219d91aa7807",
"uris": [
"amqp://1b200e1d-139e-44ca-a284-59c6c6def87f:[email protected]:5672/913f0907-5fb1-4a37-ae7f-219d91aa7807"
],
"username": "1b200e1d-139e-44ca-a284-59c6c6def87f",
"vhost": "913f0907-5fb1-4a37-ae7f-219d91aa7807"
},
"management": {
"host": "rabbitmq.local.pcfdev.io",
"hosts": [
"rabbitmq.local.pcfdev.io"
],
"password": "u15slucb59vosh3888fe1hm48m",
"path": "/api/",
"port": 15672,
"ssl": false,
"uri": "http://1b200e1d-139e-44ca-a284-59c6c6def87f:[email protected]:15672/api/",
"uris": [
"http://1b200e1d-139e-44ca-a284-59c6c6def87f:[email protected]:15672/api/"
],
"username": "1b200e1d-139e-44ca-a284-59c6c6def87f"
}
},
"ssl": false,
"uri": "amqp://1b200e1d-139e-44ca-a284-59c6c6def87f:[email protected]/913f0907-5fb1-4a37-ae7f-219d91aa7807",
"uris": [
"amqp://1b200e1d-139e-44ca-a284-59c6c6def87f:[email protected]/913f0907-5fb1-4a37-ae7f-219d91aa7807"
],
"username": "1b200e1d-139e-44ca-a284-59c6c6def87f",
"vhost": "913f0907-5fb1-4a37-ae7f-219d91aa7807"
},
"label": "p-rabbitmq",
"name": "rabbitmq",
"plan": "standard",
"provider": null,
"syslog_drain_url": null,
"tags": [
"rabbitmq",
"messaging",
"message-queue",
"amqp",
"stomp",
"mqtt",
"pivotal"
],
"volume_mounts": []
}
]
}
}
{
"VCAP_APPLICATION": {
"application_id": "1f71603e-88d2-45ab-a64e-83ab42751d52",
"application_name": "cf-sample",
"application_uris": [
"cf-sample.local.pcfdev.io"
],
"application_version": "0a6a61d4-108a-4698-b272-93a305d8d0b5",
"limits": {
"disk": 512,
"fds": 16384,
"mem": 1024
},
"name": "cf-sample",
"space_id": "ec596a9f-cd56-4529-8f77-9396a886a8b8",
"space_name": "pcfdev-space",
"uris": [
"cf-sample.local.pcfdev.io"
],
"users": null,
"version": "0a6a61d4-108a-4698-b272-93a305d8d0b5"
}
}
User-Provided:
JBP_CONFIG_OPEN_JDK_JRE: {jre: { version: 1.7.0_+ }}
JBP_CONFIG_TOMCAT: {tomcat: { version: 7.0.+ }}
Notice the dashboard_url: https://rabbitmq-management.local.pcfdev.io/#/login/. However, this URL results in a 404.
Additionally, no queues are being created, as can be seen in the management console screenshot below:
Sample Grails app to re-create the issue: https://github.com/Omnipresent/cf-rabbitmq-grails
We are trying to configure RabbitMQ to use two-way TLS for authenticating users with BOSH. However, when we specify {verify, verify_peer} and {fail_if_no_peer_cert, true} in the BOSH deployment manifest, it does not take effect.
Further investigation revealed that the following line in setup.sh actually specifies {verify, verify_none} and {fail_if_no_peer_cert, false}:
https://github.com/pivotal-cf/cf-rabbitmq-release/blob/master/jobs/rabbitmq-server/templates/setup.sh.erb#L140
This setting takes effect instead of the settings specified in the BOSH deployment manifest for bosh-lite:
config: <%= ["[{rabbit,[{auth_mechanisms, ['EXTERNAL']},{auth_backends, [rabbit_auth_backend_internal]}, {ssl_listeners, [5671]}, {ssl_options, [{cacertfile,\"/var/vcap/jobs/rabbitmq-server/bin/../etc/cacert.pem\"},{certfile,\"/var/vcap/jobs/rabbitmq-server/bin/../etc/cert.pem\"},{keyfile,\"/var/vcap/jobs/rabbitmq-server/bin/../etc/key.pem\"},{verify,verify_peer},{fail_if_no_peer_cert,true},{versions,['tlsv1.2','tlsv1.1',tlsv1]}]}]},{rabbitmq_mqtt,[{allow_anonymous, false},{ssl_listeners, [8883]},{ssl_cert_login, true}]}]."].pack("m0") %>
Is this causing our configuration not to be honored? How can we get RabbitMQ to enforce two-way TLS with BOSH? Maybe providing BOSH deployment manifest properties for these options and using them in setup.sh.erb would help?
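For reference, the ssl_options stanza that two-way TLS requires in the classic Erlang-term rabbitmq.config format (as used by releases of this vintage) looks like the sketch below. The file paths are placeholders:

```erlang
%% rabbitmq.config fragment — certificate paths are placeholders
[{rabbit, [
   {ssl_listeners, [5671]},
   {ssl_options, [{cacertfile, "/path/to/cacert.pem"},
                  {certfile,   "/path/to/cert.pem"},
                  {keyfile,    "/path/to/key.pem"},
                  {verify,     verify_peer},        %% demand a client certificate
                  {fail_if_no_peer_cert, true}]}    %% reject clients without one
 ]}].
```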
When I deployed v225.0.0 to bosh-lite, I got these errors.
bosh task error
Error: Action Failed get_task: Task d8ab5d09-9d62-46f7-6bf6-68cf09a70401 result: 1 of 4 post-deploy scripts failed. Failed Jobs: rabbitmq-server. Successful Jobs: rabbitmq-statsdb-reset-cron-test, permissions-test, syslog-configuration-test.
error log in rabbitmq-server vm (when trying "rabbitmqctl start_app")
Error description:
{could_not_start,rabbit,
{error,
{{shutdown,
{failed_to_start_child,rabbit_memory_monitor,
{badarg,
[{lists,member,[disk,{error,bad_module}],[]},
{rabbit_memory_monitor,init,1,
[{file,"src/rabbit_memory_monitor.erl"},
{line,121}]},
{gen_server2,init_it,6,
[{file,"src/gen_server2.erl"},{line,554}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,247}]}]}}},
{child,undefined,rabbit_memory_monitor_sup,
{rabbit_restartable_sup,start_link,
[rabbit_memory_monitor_sup,
{rabbit_memory_monitor,start_link,[]},
false]},
transient,infinity,supervisor,
[rabbit_restartable_sup]}}}}
Hi Team, we have enabled the 'rabbitmq_tracing' plugin in Pivotal RabbitMQ release version 210.8.0.
We are able to see the Tracing tab in the administrator console; however, when we click on it, it returns a 500 error with the description below. Request your help here.
Error trace:
webmachine error: path="/api/trace-files"
{error,{error,{badmatch,{error,eacces}},
[{rabbit_tracing_files,list,0,[]},
{rabbit_tracing_wm_files,to_json,2,[]},
{webmachine_resource,resource_call,3,[]},
{webmachine_resource,do,3,[]},
{webmachine_decision_core,resource_call,1,[]},
{webmachine_decision_core,decision,1,[]},
{webmachine_decision_core,handle_request,2,[]},
{rabbit_webmachine,'-makeloop/1-fun-0-',2,[]}]}}
Thanks,
Guru.
Hi,
Please see the details here:
http://stackoverflow.com/questions/32783762/how-to-provide-more-broker-users-in-bosh-release-rabbitmq
Regards M.
We've been using the v226
branch of this repo, and at that point the release contained a rabbitmq-broker
job and broker-registrar
/ broker-deregistrar
errands. It looks like this repo no longer contains those things, and they have now been split out into https://github.com/pivotal-cf/cf-rabbitmq-multitenant-broker-release. Is that accurate?
So is the idea that you use this release to deploy your RabbitMQ server infrastructure, then deploy the "multitenant-broker" separately, pointing it at that infrastructure?
All of this is very unclear from the supplied README documentation, release notes, and previous issues. It would be nice if changes to the structure of the release and its intended usage were documented somewhere. Maybe they are and I just don't know where to look?
I deployed cf-rabbitmq-release v226.0.0 and included "rabbitmq_management" in the plugins.
When I browsed to http://pivotal-rabbitmq.[MY_DOMAIN]/cli, it displayed the following:
rabbitmqadmin
Download it from here (Right click, Save as), make executable, and drop it in your path.
But when I click "here" to download rabbitmqadmin, the response is {"error":"Object Not Found","reason":"Not Found"} and I fail to get rabbitmqadmin.
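For what it's worth, the management plugin serves the script at the /cli/rabbitmqadmin path, so if a route or proxy only forwards /cli/ itself, the download link can 404. A sketch, where the hostname is a placeholder for your deployment's route:

```shell
# Hypothetical management hostname; replace with your deployment's route.
MGMT_HOST="pivotal-rabbitmq.example.com"
# The management plugin serves rabbitmqadmin at /cli/rabbitmqadmin.
URL="http://${MGMT_HOST}/cli/rabbitmqadmin"
echo "$URL"
# To actually fetch and install it:
#   curl -fO "$URL" && chmod +x rabbitmqadmin
```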
Hi Team!
I'm getting the following message while updating the release using the ./scripts/update-release command:
The agent has no identities.
Please add your ssh key
Can anyone help me?
Thanks in advance.
When configuring the certs etc. to enable SSL, it appears that SSL doesn't get properly enabled. After confirming that the certs, ciphers, etc. are all valid using openssl, we are still getting handshake errors.
It sounds like we need to set auth_mechanisms to 'EXTERNAL' to allow SSL auth, but that option doesn't seem to be present in this release?
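For reference, certificate-based (EXTERNAL) authentication in RabbitMQ requires enabling the rabbitmq_auth_mechanism_ssl plugin and listing EXTERNAL in auth_mechanisms in rabbitmq.config; a minimal sketch in the classic Erlang-term format:

```erlang
%% rabbitmq.config fragment: allow EXTERNAL (x509) alongside the defaults.
%% Requires the rabbitmq_auth_mechanism_ssl plugin to be enabled.
[{rabbit, [
   {auth_mechanisms, ['PLAIN', 'AMQPLAIN', 'EXTERNAL']}
 ]}].
```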
When I ran "./scripts/generate-manifest openstack" in v225.0.0 with no modifications, I got the error below.
2017/04/24 09:16:17 error generating manifest: unresolved nodes:
(( merge )) in /service-v3/cf-rabbitmq-release/templates/cf-rabbitmq-infrastructure-openstack.yml networks (networks)