cloudfoundry / bpm-release Goto Github PK

View Code? Open in Web Editor NEW

34.0 27.0 26.0 12.17 MB

isolated bosh jobs

License: Apache License 2.0

Shell 2.54% Go 97.03% HTML 0.17% Dockerfile 0.26%

bosh containers security bosh-addon

bpm-release's People

Contributors

Stargazers

Watchers

bpm-release's Issues

Can't chdir to '/var/vcap/store/redis': No such file or directory

My bpm-managed redis-server cannot access its persistent storage area.

==> /var/vcap/sys/log/redis/redis.log <==
1:C 13 Oct 09:48:11.763 # Can't chdir to '/var/vcap/store/redis': No such file or directory

The bpm.yml is:

---
processes:
- name: redis
  executable: /var/vcap/packages/redis/bin/redis-server
  args:
  - /var/vcap/jobs/redis/config/redis.conf
  volumes:
  - /var/vcap/store/redis

The pre-start for the job is:

#!/bin/bash

mkdir -p /var/vcap/store/redis
chown vcap:vcap -R /var/vcap/store/redis

Is there something else I need to do to allow a process inside bpm to access its volumes?

Repo is at https://github.com/cloudfoundry-community/redis-boshrelease/tree/bpm

Do not try and limit swap if the host machine does not use swap

I'm using a machine which does not have a swap partition. Running the tests fails due to a failure in the memory limit test.

container_linux.go:348: starting container process caused "process_linux.go:402: container init caused \"process_linux.go:367: setting cgroup config for procHooks process caused \\\"failed to write 8388608 to memory.memsw.limit_in_bytes: open /sys/fs/cgroup/memory/G5QTEMBRGBRDCLLBGJRWELJUGMZTKLJZMMZWGLLCMFTDSMZVMMYDCMTGGY------/memory.memsw.limit_in_bytes: permission denied\\\"\""

If I remove the swap limit line from adapter.go then this problem goes away. Can we programatically detect if swap is enabled on the machine and not apply this limit in that case?

BPM doesn't wait with `start` action until all `pre-start` actions completed

Hello,

I'm co-locating an addon that needs to start before the other jobs of the release. Lacking other methods, I was using the pre-start of the addon to execute the start action. I understand that this is not optimal, however, it used to work nicely: the release's start actions weren't executed until the addon's pre-start was finished.
After bpm-ifying the release, it seems that this doesn't work reliably anymore. Some jobs of the release seem to be started before the addon's pre-start is finished. I didn't thoroughly test this, though.

Should this be working the way I expect it to work? Or is this not a case that is currently supported by bpm?

Pass additional volumes via flag to bpm run

The Platform Recovery team have explored using bpm run in BOSH Backup and Restore (BBR) scripts.

Part of the BBR contract is $BBR_ARTIFACT_DIRECTORY will be set, which is the location of the backup artifact for backup or restore.

This location must be mounted within the BPM container. Because it is only determined at backup or restore time, it is not possible to template it into the BOSH job at deployment time.

Currently the BBR CLI creates $BBR_ARTIFACT_DIRECTORY in the form: /var/vcap/store/bbr-backup/JOB_NAME. We would like to avoid the magic string /var/vcap/store/bbr-backup becoming encoded in bpm.yml config files across the many BOSH releases that support BBR. What do you think?

To avoid extending the BBR contract perhaps it would be possible for BPM to allow us to specify additional volumes as a flag to the bpm run command?

@cloudfoundry-incubator/bosh-backup-and-restore-team

Brief error in bpm list when starting an errand

failed to list jobs: invalid character 'l' looking for beginning of value
Error: invalid character 'l' looking for beginning of value

conflict when colocating job using bpm with concourse worker garden runc

Since bosh switch to bpm by default (without option to disable it), concourse worker, which we colocate with bosh in BUCC, stops working.

When concourse tries to create a garden (runc) container it encounters the following error:

runc create: exit status 1: container_linux.go:295: starting container process caused "process_linux.go:398: container init caused \"rootfs_linux.go:57: mounting \\\"cgroup\\\" to rootfs \\\"/var/vcap/data/baggageclaim/volumes/live/7bc52091-48a7-4f14-69c8-a802bfeb3348/volume\\\" at \\\"/sys/fs/cgroup\\\" caused \\\"stat /cgroup/bpm/blkio/garden/97409a1f-dd85-4003-76de-cddb509aceec: permission denied\\\"\""

After tries different combinations of garden version's with and without bpm enabled. We found we could work around this issue by unmounting /cgroup/bpm/*:

for id in $(/var/vcap/packages/runc/bin/runc list | cut -d' ' -f1); do /var/vcap/packages/runc/bin/runc delete "${id}"; done
umount /cgroup/bpm/*

This proves to us that these mountpoints affect other processes, which is unexpected.

Feature request - `pre-stop`

Currently we have to do some very elaborate bash scripting to make sure we can trap signals and coordinate with other jobs on the VM in uaa-release.

It would be nice if BPM provided a pre-stop hook.

The UAA, runs as a Java web application on a servlet container. Today we don't have a way to intercept signals in the Java process before the container, Apache Tomcat, starts shutting down.

This feature request may become obsolete once we move to Spring Boot, where we could control the shutdown sequence ourselves.

bpm run -v flag parses multiple options incorrectly

Whilst experimenting with bpm run -v ... in v0.12.0 we discovered that when multiple options are provided in the flag they are parsed incorrectly. For example:

$ bpm run -p my_process -v /var/vcap/store/my_dir:writeable,mount_only
Error: invalid volume path: mount_only must be within (/var/vcap/data, /var/vcap/store)

After inspecting we discovered the issue is in the flag parsing package used here: https://github.com/cloudfoundry-incubator/bpm-release/blob/be92a3b0427d78d4d6c3cb1497a1e485a5ee4127/src/bpm/commands/run.go#L39

StringSliceVarP splits flag values on ,. For example -v /var/vcap/store/my_dir:writeable,mount_only is parsed to []string{"/var/vcap/store/my_dir:writeable","mount_only"}.

We're not sure what the best solution is for this issue. Perhaps a different separator for the volume options, or a different flag parser?

Currently we only need to pass a single volume with a single option, so we are not blocked by this issue at the moment.

Josh and @aclevername
cc: @cloudfoundry-incubator/bosh-backup-and-restore-team

creation of pre-defined files?

We'd like to be able to setup a file, say for access logging, in some well known location. For now, the only way there seems to be able to do this is to actually have a pre-start script that creates said file. Would be nicer if bpm could make a known file for us if requested.

`bpm start` fails when a directory cannot be `chown`ed

The Cloud Controller jobs in capi-release support using a mounted NFS volume as their "blobstore". We tried adding this mount to the additional_volumes section of our bpm config (see example), but this caused our job to fail to start.

We found this was due to the following error:

api/8bcc8305-a5d4-4ba3-bff5-ab8253656aef:~# /var/vcap/jobs/bpm/bin/bpm start cloud_controller_ng
Error: failed to start job-process: failed to create system files: chown /var/vcap/data/nfs: operation not permitted

After some investigation, we discovered this was due to our NFS server having "root squashing" enabled (more information here) which meant that actions taken as root on the NFS client (such as this chown) would actually be executed as a "nobody" user that lacked the necessary permissions.

We would like for an option to tell bpm to not attempt to modify this directory (or some other solution that lets us mount NFS volumes within a bpm container).

Additional context: https://www.pivotaltracker.com/story/show/157880726

Thanks!

Cassandra requires a writable temporary directory with "exec" permissions

Hi,

In the course of building our Cassandra BOSH Release we are interested to having it rely on BPM.

The problem is that Cassandra uses JNA (Java Native Access) which requires to compile native code and execute it. Currently, we remount /tmp in our pre-start with exec option for this reason.

The problem with BPM is that all rw mounted volumes have a hard-wired noexec option. We would like to see a executions: true possible option to additional volumes in order to support Java Native Access and thus the Cassandra release.

Please note that I might submit a PR for this in the upcoming hours or so. The point here is to discuss the opportunity for a new executions: true option (defaulting to false) that we could add to additional_volumes.

Thanks!

bosh.io/docs presence

it would be nice to add a page to bosh.io/docs to describe that we recommend bpm and how we typically expect people to use it. may be also list options (similar to http://bosh.io/docs/package-vendoring.html) when to use it and when not to.

package directories are not consistently named with package names (inside spec files)

we recently added a check to cli for jobs but didnt for packages.

Please configure GITBOT

Pivotal uses GITBOT to synchronize Github issues and pull requests with Pivotal Tracker.
Please add your new repo to the GITBOT config-production.yml in the Gitbot configuration repo.
If you don't have access you can send an ask ticket to the CF admins. We prefer teams to submit their changes via a pull request.

Steps:

Fork this repo: cfgitbot-config
Add your project to config-production.yml file
Submit a PR

If there are any questions, please reach out to [email protected].

Canonical example release that maintains latest bpm release?

@cppforlife zookeeper-release is using bpm 0.1.0 and its bpm.yml syntax is failing with v0.2.0. diego-release has a similar looking bpm.yml and its manifest doesn't specify a bpm release version https://github.com/cloudfoundry/diego-release/blob/develop/operations/enable-bpm.yml

Which releases are you maintaining that will automatically/soon be upgraded to bpm 0.2.0 examples?

As an aside in case its me, not the version, the error I'm getting is:

{"timestamp":"1507787796.620288849","source":"bpm","message":"bpm.start.failed-to-parse-config","log_level":2,"data":{"error":"yaml: unmarshal errors:\n  line 2: cannot unmarshal !!map into []*config.ProcessConfig","job":"redis","process":"redis","session":"1"}}

And my bpm.yml looks like:

processes:
  redis:
    executable: /var/vcap/packages/bin/redis-server
    args:
    - /var/vcap/jobs/redis/config/redis.conf

Update: I've found the docs example https://github.com/cloudfoundry-incubator/bpm-release/blob/master/docs/config.md#job-configuration - I'd still be interested in the README/docs pointing to some example repos that are maintained in sync with bpm releases too.

And I'm still getting the above error. Source file https://github.com/cloudfoundry-community/redis-boshrelease/compare/bpm?expand=1#diff-b5d39bdc8d532c326b4ff356bc62060b

mkdir volumes if missing

I'm upgrading redis-boshrelease, and I want to configure the container to allocate and permit redis to access /var/vcap/store/redis:

---
processes:
- name: redis
  executable: /var/vcap/packages/redis/bin/redis-server
  args:
  - /var/vcap/jobs/redis/config/redis.conf
  volumes:
  - /var/vcap/store/redis

But the folder /var/vcap/store/redis is not created within the /var/vcap/store host volume.

Currently this will mean I need to write a wrapper script for redis-server or write a pre-start script. Perhaps bpm could create each host folder for any volumes that are specified?

Expose the job's socket directory with a single configuration option

The same way we have ephemeral_disk: and persistent_disk: to mount in a default volume for the current job we could have the same for the socket directory.

The first pass proposal for this was to add sockets: but @jfmyers9 wasn't a fan.

/cc @aramprice @jfmyers9

Pass environment variables as flags to bpm run

Unfortunately, when we raised #74 we missed this 😢

When we use bpm run in BOSH Backup and Restore (BBR) scripts, we need to inject the $BBR_ARTIFACT_DIRECTORY that is created by bbr into the container running the BBR script. This needs to be injected as an additional volume (addressed by #81) and an environment variable.

Would it be possible to extend the bpm run command to accept environment variables as flags? For example, here is how a BBR backup script might invoke bpm run:

/var/vcap/jobs/bpm/bin/bpm run s3-versioned-blobstore-backup-restorer \
  -p backup \
  -v "${BBR_ARTIFACT_DIRECTORY}:writable" \
  -e "BBR_ARTIFACT_DIRECTORY=${BBR_ARTIFACT_DIRECTORY}"

Mirah and @jamesjoshuahill
cc @cloudfoundry-incubator/bosh-backup-and-restore-team

Update bosh.io with new documentation

Specifically as it regards to migrating to bpm with errands.

Delete bpm/mounts package

Hi there!

It looks like the bpm/mounts package can be deleted since the three functions within that package already exist within bpm dependenies.

mount.Mount -> unix.Mount
mount.Unmount -> unix.Unmount
mount.Mounts -> github.com/opencontainers/runc/libcontainer/mount.GetMounts()

I would have PRed this but before I do I wanted to find out about vendoring methods in bpm. It looks like github.com/opencontainers/runc/libcontainer/mount isn't venored right now, and when I try to add the import path to bpm/cgroups (instead using runc's GetMounts()) and run dep ensure, it's making changes to a large numer of files. I had a look at the Gopkg.lock and it looks like you've vendored 0.1.1 of runc and I assume my dep ensure was trying to update runc since the runc mount package didn't exist in 0.1.1.

Looking at the commits to the bosh packages, it looks like you're currently packaging 1.0.0-rc6 - perhaps the runc library being used should be kept in sync with the binary bpm builds.

Additionally I can see that the version of runc is not locked in the Gopkg.toml - would you accept two PRs to:

lock the version of the runc library, and bump to 1.0.0-rc6
delete the bpm/mount package in favour of libcontainer/mount

Add version numbers to blob names

Because blobs can be named anything which makes it hard to track it's origins (src and version), most folks have adopted a pattern that would suggest naming the two blobs something more like:

golang/go1.9.2-linux-amd64.tar.gz
runc/runc-1.0.0-rc4-linux-amd64

nil-dereference panic on interrupting `bpm trace`

Steps to reproduce:

run bpm trace JOB on a BPM-enabled job
run ^C to interrupt the trace
observe panic and goroutine dump such as the following:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x5c8284]

goroutine 1 [running]:
bpm/commands.trace(0x76ef60, 0xc42007d940, 0x1, 0x1, 0x0, 0x0)
	/var/vcap/packages/bpm/src/bpm/commands/trace.go:98 +0x494
github.com/spf13/cobra.(*Command).execute(0x76ef60, 0xc42007d900, 0x1, 0x1, 0x76ef60, 0xc42007d900)
	/var/vcap/packages/bpm/src/github.com/spf13/cobra/command.go:650 +0x456
github.com/spf13/cobra.(*Command).ExecuteC(0x76f860, 0xc4200d5f40, 0x5c7d5d, 0x76f860)
	/var/vcap/packages/bpm/src/github.com/spf13/cobra/command.go:729 +0x2fe
github.com/spf13/cobra.(*Command).Execute(0x76f860, 0x641906, 0xc4200d5f70)
	/var/vcap/packages/bpm/src/github.com/spf13/cobra/command.go:688 +0x2b
main.main()
	/var/vcap/packages/bpm/src/bpm/main.go:25 +0x31

Observed during acceptance on https://www.pivotaltracker.com/story/show/151765952.

bpm pre-start question

If I reboot vm, bpm pre-start will not be run and cgroup will not be mounted?

Read-only log dir mounting for log-forwarding add-ons?

We recently discovered that we're running blackbox in the syslog-release with cap_dac_read_search in our ctl script. Permissions around log files as currently written by current releases are inconsistent/inconvenient, so we need this for our vcap process to read pre-start logs as root, etc.

So we were thinking it might be good to constrain that ability to read from the entire filesystem.
We've got thoughts on forming an explicit contract about file permissions in the future, but for now/soontimes, we were thinking maybe BPM could let us run blackbox with that cap, but only mount the /var/vcap/sys/log directory. Of course, this isn't part of your interface right now that I can see.

We'd ideally get it as a read only mount.

(Also, would that setcap still work/be available in BPM? Changing the user removes caps, but if it were set after the user...)

Can we break BPM into a submodule with properly vendored dependencies?

Is this something we want to do? I opened this issue as a place for discussion. I don't believe this is something urgent that we need to solve now, but I would like to hear thoughts and opinions.

Pros:

We can vendor our dependencies in a standard way i.e. dep
BPM can be built outside of the BOSH releases context and encodes it's own dependencies.
BPM code can be imported into other codebases (may not be valuable)

Cons:

Our tooling is built around submodules (bumping dependencies, commit messages, etc)
BPM as a tool never needs to be built outside of a BOSH release.
We would have a separate repository for the code.

Thoughts?

Missing start errors in bpm.log?

Hi - I'm not sure if this is expected or not, but I had a misconfiguration in my job's BPM stuff which was causing a failure, but the only way to discover it was by manually running bpm start myself. Given I was able to see messages from bpm.log, I initially expected to see the error message to show up there.

Initially, I tailed the job's bpm.log, but found nothing useful...

$ tail /var/vcap/sys/log/ipfs/bpm.log
{"timestamp":"1527679613.576059580","source":"bpm","message":"bpm.start.acquiring-lifecycle-lock.starting","log_level":1,"data":{"job":"ipfs","process":"ipfs","session":"1.1"}}
{"timestamp":"1527679613.576259851","source":"bpm","message":"bpm.start.acquiring-lifecycle-lock.complete","log_level":1,"data":{"job":"ipfs","process":"ipfs","session":"1.1"}}
{"timestamp":"1527679613.576652765","source":"bpm","message":"bpm.start.starting","log_level":1,"data":{"job":"ipfs","process":"ipfs","session":"1"}}
{"timestamp":"1527679613.588210583","source":"bpm","message":"bpm.start.start-process.starting","log_level":1,"data":{"job":"ipfs","process":"ipfs","session":"1.2"}}
{"timestamp":"1527679613.588368654","source":"bpm","message":"bpm.start.start-process.creating-job-prerequisites","log_level":1,"data":{"job":"ipfs","process":"ipfs","session":"1.2"}}
{"timestamp":"1527679613.589195251","source":"bpm","message":"bpm.start.start-process.complete","log_level":1,"data":{"job":"ipfs","process":"ipfs","session":"1.2"}}
{"timestamp":"1527679613.589248419","source":"bpm","message":"bpm.start.complete","log_level":1,"data":{"job":"ipfs","process":"ipfs","session":"1"}}

After looking through all the other log files I could find, I eventually tried running bpm start manually and was able to see the error...

$ bpm start ipfs
Error: failed to start job-process: failed to create system files: requested persistent disk does not exist

A little surprised, I verified I didn't miss the message in a log somewhere else...

$ grep -snr 'requested persistent disk does not exist' /var/vcap/sys/log/

It kind of makes sense that I could be missing logs since I'm not redirecting the output of my monit start/stop commands like I do with non-bpm jobs... but I also had originally been assuming bpm was redirecting everything possible to its log file.

If an odd edge case like this were to show up in a production environment, it would be difficult to diagnose the issue from logs. Could/should this sort of error something which could be written to the regular bpm.log file instead of STDOUT only? If not, I start wanting to go back to wrap all my start/stop bpm calls so I can be sure that all output will be sent to a log file somewhere... but that's extra overhead I'd prefer to avoid.

Thanks!

Please cut github releases

Can you please cut github releases for each new version - this will trigger GitHub to send emails to everyone who is watching; and we'll know that there is a new version. All my releases are still on 0.2.0 and just noticed there is now an 0.4.0.

Bonus question: what's changed between 0.2.0 and 0.4.0? Can I just upgrade or do I need to change my bpm.yml?

Support of post-start script in bpm

Hi,
if we reboot manually a vm, can bpm on the restart execute the post-start?

This will allow us to solve this CFCR issue

Thanks!

Should we allow secrets to be passed via the -e flags?

I wanted to split this out from the review comments so that it doesn't get lost when that PR is merged. This issue expands on this comment thread.

Pro
We don't want secrets to appear in the process table so allowing secrets to be pulled through from the environment will keep them from leaking in there.

Con
People probably won't know about this feature so they're going to pass the secret on the command-line anyway.

Pro
If they do know then they'll at least have some way of achieving this rather than being forced to do it the "wrong way".

Con
If they do know then they'll come to us and we can add the feature when they want it anyway. We can maybe then advise them to pass their secrets through the regular BPM configuration file or through their own configuration file.

I think I've summed my thought process around this issue so far.

@dpb587-pivotal @aramprice Have I missed or misrepresented anything?

Mount propagation support for docker volume plugins running as BPM jobs

As a prerequisite for migrating volume services to BPM, we would need for BPM to optionally support mount propagation. Mount propagation is necessary because volume plugins mounting stuff into a container need for the resulting mount to be visible out in the diego cell OS so that it can get picked up by the executor and handed off to Garden.

To do this, BPM would need to be able to:

create a shared mount path that can be bound into the BPM container like so
launch the BPM container with process.linux.rootfsPropagation set to shared like so

Note that this is probably not the only thing we'd need to figure out. Some other stuff we do is:

install kernel modules in pre-start scripts. Modules like fuse and nfs-common are missing from the stemcell, so we have to install them in our pre-start scripts. Not clear if we can do this from a container?
fuse modules require access to /dev/fuse. This can be bind mounted in, which is no biggie, but I suspect that it's not there for BPM?

Submodule Bumper is chatty, but not informative

Could Submodule Bumper be updated to specify what submodules it updated in the commit message?

Explicit stop?

Thanks for the BPM talk at cfsummit yesterday.

I'm looking at upgrading redis-boshrelease. Its ctl stop command delegates to a redis-cli shutdown. Is there a way to configure a shutdown command to help bpm know how its managed process would like to be shutdown nicely, prior to being shutdown forcefully?

https://github.com/cloudfoundry-community/redis-boshrelease/blob/master/jobs/redis/templates/bin/ctl#L22

Concern: single bosh release of `bpm` per deployment

In https://github.com/cloudfoundry-community/kafka-service-broker-boshrelease/blob/master/manifests/kafka-service-broker.yml I have three different jobs from two different BOSH releases using bpm. Additionally, in future the https://github.com/cppforlife/zookeeper-release/tree/bpm/jobs/zookeeper job will support bpm hopefully.

Between bpm 0.1.0 and 0.2.0 there was a breaking change to the bpm.yml file. As such, each job using bpm needed to be upgraded - it must either run against bpm 0.1.0 or 0.2.0.

But BOSH does not support a deployment manifest containing multiple versions of the same release name.

This might cause BOSH release developers to not move forward to newer bpm versions because they can't run it in the deployments which contain other jobs using the older bpm version. People will be stuck. bpm 0.2.0 might be the last bpm version some people ever use!

Proposals:

bpm commits to never making breaking changes to bpm.yml ever again
bpm adds a schema: 0.2.0 to bpm.yml and it supports historical versions of the bpm.yml file format

BOSH adds support for multiple releases with the same name in a deployment manifest. E.g. /cc @cppforlife @dpb587-pivotal

releases:
- name: bpm
  version: 0.2.0
  alias: bpm-0-2-0
- name: bpm
  version: 0.3.0
  alias: bpm-0-3-0

instance_groups:
- name: using-old-bpm
  jobs:
  - name: old-bpm
    release: bpm-0-2-0
- name: using-new-bpm
  jobs:
  - name: new-bpm
    release: bpm-0-3-0

Please configure GITBOT

Steps:

Fork this repo: cfgitbot-config
Add your project to config-production.yml file
Submit a PR

If there are any questions, please reach out to [email protected].

Publish pre-compiled releases

Could bpm CI please start publishing pre-compiled releases? Currently, a no-op bosh release with bpm addon takes 5 mins on a bosh-lite/bucc-lite to compile golang/bpm:

Task 4 | 01:08:43 | Compiling packages: bpm-runc/c0b41921c5063378870a7c8867c6dc1aa84e7d85
Task 4 | 01:08:43 | Compiling packages: golang/65c792cb5cb0ba6526742b1a36e57d1b195fe8be
Task 4 | 01:10:34 | Compiling packages: bpm-runc/c0b41921c5063378870a7c8867c6dc1aa84e7d85 (00:01:51)
Task 4 | 01:11:21 | Compiling packages: golang/65c792cb5cb0ba6526742b1a36e57d1b195fe8be (00:02:38)
Task 4 | 01:11:21 | Compiling packages: bpm/b5e1678ac76dd5653bfa65cb237c0282e083894a (00:00:17)

An example pipeline PR for adding compiled release generation + use-compiled-releases.yml operator file updates is https://github.com/starkandwayne/pipeline-templates/pull/28/files

persistent_disk allow_executions ?

When setting persistend_disk: true in bpm.yml, the job automatically gets access to data directory at /var/vcap/store/JOB. However, this directory is mounted with a no exec option.

Is there a way to have the default data share with exec permissions enabled?

If I try to workaround that, with something like:

persistent_disk: false
additional_volumes:
- path: /var/vcap/store/JOB
  writable: true
  allow_executions: true

bpm stops with the error message invalid volume path: /var/vcap/store/JOB cannot conflict with default job data or store directories.

Config Documentation is Incorrect

Example Documentation for Config
https://github.com/cloudfoundry-incubator/bpm-release/blob/21958b5/docs/config.md#job-configuration

Shows that <job> and <worker> are keys. Looking at the source code it reveals that the processes is an array, and that the job name is part of name

It seems that this is a more accurate configuration, but it's hard to tell

processes:
- name: uaa
  executable: /var/vcap/jobs/uaa/bin/uaa
  env:
    CATALINA_BASE: /var/vcap/data/uaa/tomcat
    CATALINA_HOME: /var/vcap/data/uaa/tomcat
    CLOUD_FOUNDRY_CONFIG_PATH: /var/vcap/jobs/uaa/config
    CATALINA_OPTS: "-Xmx768m -XX:MaxMetaspaceSize=256m"
  args: []

The readme never says what bpm stands for

sorry to be that guy

Allow /var/vcap/sys/run/{job} to be specified as an additional volume

BOSH release authors using BPM currently put their sockets within /var/vcap/data/{job}. Release authors should be free to use /var/vcap/sys/run/{job} for sockets however there is a restriction within BPM that won't allow specifying that volume as an additional volume. Since the FHS specifies that sockets must live with /var/run (which I believe is analogous to /var/vcap/sys/run) I propose that we open up /var/vcap/sys/run/{job} to be allowed as an additional volume.

The slack conversation around this message shows a conversation on this topic: https://cloudfoundry.slack.com/archives/C7A0K6NMU/p1538067508000100

cgroup-path

TLDR:

prepend bpm to the cgroup path
do not base32 encode the container id

Details:

In a previous story, the decision was to base32 encode the job name for the runc container-id which becomes the cgroup-path in the file /proc/[pid]/cgroup:

hierarchy-ID:controller-list:cgroup-path

I would like to write a sysdig-bosh release which will be colocated on VMs and which should ideally also be able to monitor bpm containers. Sysdig parses the cgroup-path in order to classify the container type (see here for LXC).
For example, the cgroup-path for an LXC container looks like /lxc/my-container:

cat /proc/<pid of lxc container>/cgroup
11:hugetlb:/lxc/my-container

It's the default for LXC (see here):

lxc.cgroup.pattern
Format string used to generate the cgroup path (e.g. lxc/%n).

According to the sysdig code, the same applies to mesos, libvirt-lxc and docker.
Only the bpm container has this base32 encoded cgroup-path like

12:pids:/MJWG6YTTORXXEZI-

for the blobstore for example.

Having prepended something like bpm. to the cgroup-path would allow to parse it easily in sysdig such that the filter csysdig -pc container.type=bpm could match BPM containers.

But even though if the cgroup-path looks like bpm.MJWG6YTTORXXEZI-, the output for above filter would show a container column with the name MJWG6YTTORXXEZI- rather than blobstore.

Therefore, it would also be better to not base32 encode the whole container name, but encode maybe only those chars which do not match the regex ^[\w+-\.]+$ (see code). This would probably require to have a special delimiter char for the decoding process in order to identify the encoded chars.

See also Slack discussion for more context.

If the above two requirements are okay with you, I could also create a PR.

Include additional log files in `bpm logs`

Is there a way, or could we add a way, for any additional log files created by running processes to be included in the bpm logs output?

The release tags the wrong commit sha.

We are currently tagging the bump to the dev version rather than the creation of the final version. See the v0.5.0 here 145aabc.

Issue with gorouter allowing users to stream access logs to sys logs

Issue

Gorouter fails to deploy with router.enable_access_log_streaming set to true.

Context

The gorouter exposes a spec property to enable access log streaming. This was exposed as a feature since some users had concerns about writing logs to disk and the performance implication + the volume of logs thereof. There is another property that operators can use to stop writing access logs to disk.

The gorouter now uses BPM by default and this means that the deploy now fails when the Gorouter is deployed with the router.enable_access_log_streaming set to true.

The issue was reported by someone from the community here

Steps to Reproduce

Set router.enable_access_log_streaming set to true in the deployment manifest
Deploy
Deployment fails

Expected result

Deployment should succeed.

Current result

Deployment fails.

Possible Options

We can deprecate this feature for the gorouter if the performance concerns are mitigated by BPM
We can make this a feature for BPM
We can implement this in the Gorouter, though this will be a bespoke implementation and mean that we will diverge from the standard BPM is trying to push across components.

`dev` mounts for volume plugins running as BPM jobs

This code line hard codes BPM mounts to be nodev but in order for fuse mounting to work in a BPM process, we need to be able to mount \dev\fuse as a device.

So, I think we need for that to be an option on unsafe.unrestricted_volume mounts, just like exec/noexec is.

Could we have an allow_device flag in that context maybe?

Related persi team spike story here: https://www.pivotaltracker.com/story/show/160187993

Messages sent to syslog inside the job container cannot be seen on the host

This issue was mentioned here: cloudfoundry/bosh#1999.

The implementation of this could be as simple as exposing /dev/log to the job container.

Maybe there are other concerns with doing this? Can jobs interfere with one another through this? Should we encourage all jobs to use stdout and stderr or write to log files instead?

Please configure GITBOT

Steps:

Fork this repo: cfgitbot-config
Add your project to config-production.yml file
Submit a PR

If there are any questions, please reach out to [email protected].

`bpm stop` times out when stopping in a resource constrained environment

By default, monit imposes a 30 second timeout when running monit stop on a process. Right now, bpm stop will:

Send a SIGTERM to the process
If the process does not exit in 20 seconds, send a SIGQUIT to the process
Wait up to 5 seconds for the process to exit
runc delete -f which will send a SIGKILL and delete the container
Delete the runc bundle in /var/vcap/data/bpm/bundles

In our upgrade tests, we are seeing that this takes ~30 seconds and causes the monit stop to fail:

{"timestamp":"1528407119.558125257","source":"bpm","message":"bpm.stop.acquiring-lifecycle-lock.starting","log_level":1,"data":{"job":"test-server","process":"alt-test-server","session":"1.1"}}
{"timestamp":"1528407119.558376551","source":"bpm","message":"bpm.stop.acquiring-lifecycle-lock.complete","log_level":1,"data":{"job":"test-server","process":"alt-test-server","session":"1.1"}}
{"timestamp":"1528407119.558422804","source":"bpm","message":"bpm.stop.starting","log_level":1,"data":{"job":"test-server","process":"alt-test-server","session":"1"}}
{"timestamp":"1528407144.594295740","source":"bpm","message":"bpm.stop.failed-to-stop","log_level":2,"data":{"error":"failed to stop job within timeout","job":"test-server","process":"alt-test-server","session":"1"}}
{"timestamp":"1528407149.619541407","source":"bpm","message":"bpm.stop.complete","log_level":1,"data":{"job":"test-server","process":"alt-test-server","session":"1"}}
{"timestamp":"1528407149.619646311","source":"bpm","message":"bpm.stop.releasing-lifecycle-lock.starting","log_level":1,"data":{"job":"test-server","process":"alt-test-server","session":"1.2"}}
{"timestamp":"1528407149.619714737","source":"bpm","message":"bpm.stop.releasing-lifecycle-lock.complete","log_level":1,"data":{"job":"test-server","process":"alt-test-server","session":"1.2"}}

This is due to the fact that runc delete -f and deleting the bundle are taking about 5 seconds on top of the 25 seconds that we wait for the process to exit.

We should separate actions that may be IO constrained from bpm stop.

A different way to stop processes could be:

Send a SIGTERM to the process
If the process does not exit in 20 seconds, send a SIGQUIT to the process
If the process does not exit in 5 seconds, send a SIGKILL to the process
Out of band, delete the container and clean up the bundle.

I'm not quite sure what it means to clean up the artifacts "out of band" yet.

Thoughts?

Mount points order is not deterministic

The diego team ran into an issue today where the Rep couldn't rename files from /var/vcap/data/rep/tmp to /var/vcap/data/rep/download_cache with the following golang error:

rename /var/vcap/data/rep/tmp/executor-work/transformed488409585 /var/vcap/data/rep/download_cache/057f5645ffeb205f110c33a19f078e94-1529077853772195780-330: invalid cross-device link

It turns out the order of mount points in the process ns are different (from a healthy cell):

diego-cell/0ff998a2-1b9e-4182-82f9-f8dbb5f844b6:~# cat /proc/708342/mountinfo  | grep /var/vcap/data                                                          
102 94 8:3 /rep /var/vcap/data/rep rw,nosuid,nodev,noexec,relatime - ext4 /dev/sda3 rw,data=ordered                                                           
103 102 0:40 / /var/vcap/data/rep/instance_identity rw,relatime - tmpfs tmpfs rw,size=2784k                                                                   
104 94 8:3 /garden /var/vcap/data/garden rw,nosuid,nodev,noexec,relatime - ext4 /dev/sda3 rw,data=ordered                                                     
113 102 8:3 /rep/tmp /var/vcap/data/rep/tmp rw,nosuid,nodev,noexec,relatime - ext4 /dev/sda3 rw,data=ordered                                                  
114 94 8:3 /voldrivers /var/vcap/data/voldrivers rw,nosuid,nodev,noexec,relatime - ext4 /dev/sda3 rw,data=ordered                                             
117 94 8:3 /packages /var/vcap/data/packages ro,nosuid,nodev,relatime - ext4 /dev/sda3 rw,data=ordered

vs a healthy cell

113 97 202:18 /packages /var/vcap/data/packages ro,nosuid,nodev,relatime - ext4 /dev/xvdb2 rw,data=ordered                                                    
117 97 202:18 /rep/tmp /var/vcap/data/rep/tmp rw,nosuid,nodev,noexec,relatime - ext4 /dev/xvdb2 rw,data=ordered                                               
118 97 202:18 /rep /var/vcap/data/rep rw,nosuid,nodev,noexec,relatime - ext4 /dev/xvdb2 rw,data=ordered                                                       
119 118 0:41 / /var/vcap/data/rep/instance_identity rw,relatime - tmpfs tmpfs rw,size=2792k                                                                   
120 97 202:18 /voldrivers /var/vcap/data/voldrivers rw,nosuid,nodev,noexec,relatime - ext4 /dev/xvdb2 rw,data=ordered                                         
121 97 202:18 /garden /var/vcap/data/garden rw,nosuid,nodev,noexec,relatime - ext4 /dev/xvdb2 rw,data=ordered

Notice that the order of /var/vcap/data/rep/tmp and /var/vcap/data/rep are different. On the healthy cell, /var/vcap/data/rep follows /var/vcap/data/rep/tmp which practically hides the previous mount point and makes the two paths /var/vcap/data/rep/tmp/foo & /var/vcap/data/rep/download_cache/foo looks like they are on the same mountpoint.

On the unhealthy cell the order is reversed which causes /var/vcap/data/rep/tmp to look like a different mount point from /var/vcap/data/rep. This combined with the fact that rename will return EXDEV even if the two mountpoints share the underlying filesystem/block device.

The non deterministic behavior seems to be a result of this deduping logic

Inconsistency in docu

Looking at https://bosh.io/docs/bpm/runtime/ and https://github.com/cloudfoundry-incubator/bpm-release/blob/master/docs/config.md we found a inconsistency in the documentation.
For shared data different key words are given (volumes vs additional_volumes) and even the explanation how to use it is misleading.

cloudfoundry / bpm-release Goto Github PK

bpm-release's People

Contributors

Stargazers

Watchers

Forkers

bpm-release's Issues

Issue

Context

Steps to Reproduce

Expected result

Current result

Possible Options

Recommend Projects

Recommend Topics

Recommend Org

Jobs