renatomefi / php-fpm-healthcheck

A POSIX-compliant sh script to health-check php-fpm status. It can be used just for pinging or to check specific metrics.

License: MIT License


php-fpm-healthcheck's Introduction

A PHP fpm Health Check script

With the rise of containerized applications, it becomes more and more useful to have a php-fpm healthcheck.

This POSIX-compliant sh script fetches the php-fpm status page using the cgi-fcgi tool, parses its output and lets you choose which metrics you want to check. A ping mode is also available, which only makes sure php-fpm is answering.
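The parsing step can be sketched in a few lines of POSIX sh. This is a minimal illustration and not the script's actual code: `get_metric_value` and `STATUS_SAMPLE` are hypothetical names, and the sample text only imitates the format of the status output shown under Usage.

```shell
#!/bin/sh
# Hypothetical helper (not taken from php-fpm-healthcheck itself):
# print the value of one metric from php-fpm status text.
# $1 = status page text, $2 = metric name exactly as php-fpm prints it
get_metric_value() {
    printf '%s\n' "$1" | grep "^$2:" | sed 's/^[^:]*:[[:space:]]*//'
}

# Sample text in the format of the php-fpm status page.
STATUS_SAMPLE='pool:                 www
accepted conn:        6
listen queue:         0
max listen queue:     0
listen queue len:     128'

get_metric_value "$STATUS_SAMPLE" "listen queue"   # prints 0
get_metric_value "$STATUS_SAMPLE" "accepted conn"  # prints 6
```

Anchoring the pattern at the start of the line and including the colon keeps "listen queue" from also matching "max listen queue" or "listen queue len".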

Motivation

Previously at work we had Docker containers running both php-fpm and Nginx, managed by another process such as Supervisord or s6 overlay. One good example is this image from Ric Harvey.

It works really well, but I wanted a few other things: using the official images and their release cycle, and keeping each process's logs separate rather than mixed. I also didn't want to rely on Supervisord, since I had bad experiences with it in the past, and I wanted other things related to the "Docker way". I'm not saying it's perfect, but I wanted some of those things.

Now comes the php-fpm healthcheck part. The healthcheck I had in place requested a URL in the application to ask whether it was alive, which indirectly tested the whole chain: Nginx -> php-fpm -> application. Now I had the chance to keep testing the whole chain via Nginx but also monitor how busy and stable php-fpm is. Its /status page has quite a lot of useful information, so why not monitor it? For instance, you could make a container unhealthy after a certain number of requests, or if the queue is too long, or if there are slow requests, and that's what this script tries to achieve!

The good news is that you can still do this with the mixed container approach, but I wanted to take the time to explain why I do it like this now! The advantage, in my opinion, is that with separate containers you have a better grasp of where the problem lies and you can restart only what's failing rather than everything. You also avoid having Supervisord restart it for you, since you are already behind a container orchestration tool.

Installation

Enable php-fpm status page

In your php-fpm pool configuration add: pm.status_path = /status

For instance on the official php image you can alter the file /usr/local/etc/php-fpm.d/zz-docker.conf

See a simple example

More about PHP fpm pool configuration
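As a rough sketch, the directive can be appended from a shell step during an image build. `enable_status_page` is a hypothetical helper, and the demonstration below runs against a throwaway file rather than a real pool configuration. It assumes that appending to the end of the file lands inside the pool section you want, which you should verify for your own image.

```shell
#!/bin/sh
# Hypothetical helper: append pm.status_path to a pool file unless it is already set.
# $1 = pool configuration file, $2 = status path (defaults to /status)
enable_status_page() {
    conf_file=$1
    status_path=${2:-/status}
    if grep -q '^[[:space:]]*pm\.status_path' "$conf_file"; then
        return 0
    fi
    printf 'pm.status_path = %s\n' "$status_path" >> "$conf_file"
}

# Demonstration on a temporary file standing in for a real pool config.
demo_conf=$(mktemp)
printf '[www]\nlisten = 9000\n' > "$demo_conf"
enable_status_page "$demo_conf"
enable_status_page "$demo_conf"   # second call changes nothing
cat "$demo_conf"
```

On the official image this could be called as `enable_status_page /usr/local/etc/php-fpm.d/zz-docker.conf`.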

Requirements

The script is POSIX sh but also relies on a few tools from your operating system, namely:

  • cgi-fcgi
  • sed
  • tail
  • grep

In case you're using alpine you only need to make sure you have installed busybox and fcgi packages.

See a simple Dockerfile based on the official PHP image
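To see up front which of these tools are absent from an image, a small check along these lines can help. `missing_tools` is a hypothetical name, and this is only a sketch of the idea, not how the script itself performs its check.

```shell
#!/bin/sh
# Hypothetical helper: print each tool from the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" > /dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools cgi-fcgi sed tail grep)
if [ -n "$missing" ]; then
    # Join the names on one line for a compact message.
    printf 'Missing required tools: %s\n' "$(printf '%s' "$missing" | tr '\n' ' ')" >&2
fi
```

`command -v` is specified by POSIX, so the check itself needs nothing beyond the shell.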

Download

wget -O /usr/local/bin/php-fpm-healthcheck \
https://raw.githubusercontent.com/renatomefi/php-fpm-healthcheck/master/php-fpm-healthcheck \
&& chmod +x /usr/local/bin/php-fpm-healthcheck

Update

wget -O $(which php-fpm-healthcheck) \
https://raw.githubusercontent.com/renatomefi/php-fpm-healthcheck/master/php-fpm-healthcheck \
&& chmod +x $(which php-fpm-healthcheck)

Manually

You can always of course manually download and maintain the file, as long as you follow the MIT License

Usage

Ping mode

If you only want to make sure php-fpm is alive and answering requests, you can run:

$ php-fpm-healthcheck
$ echo $?
0

Or with verbose to see php-fpm status output:

$ php-fpm-healthcheck -v
Trying to connect to php-fpm via: localhost:9000/status
php-fpm status output:
pool:                 www
process manager:      dynamic
start time:           11/Sep/2018:10:47:06 +0000
start since:          436
accepted conn:        1
listen queue:         0
max listen queue:     0
listen queue len:     0
idle processes:       1
active processes:     1
total processes:      2
max active processes: 1
max children reached: 0
slow requests:        0
$ echo $?
0

Metric mode

Let's say you want your healthcheck to fail after your fpm has handled more than 3000 requests:

$ php-fpm-healthcheck --accepted-conn=3000
$ echo $?
0

And you can also check if you have more than 10 processes in the queue:

$ php-fpm-healthcheck --accepted-conn=3000 --listen-queue=10
$ echo $?
0

What a failing metric looks like

$ php-fpm-healthcheck --accepted-conn=1
'accepted conn' value '6' is greater than expected '1'
$ echo $?
1
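The comparison behind this output can be sketched as follows. `check_metric_max` is a hypothetical function; the message format is copied from the output above, while sending it to stderr is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: fail when a metric's current value exceeds its threshold.
# $1 = metric name, $2 = current value, $3 = threshold
check_metric_max() {
    if [ "$2" -gt "$3" ]; then
        printf "'%s' value '%s' is greater than expected '%s'\n" "$1" "$2" "$3" >&2
        return 1
    fi
    return 0
}

check_metric_max "accepted conn" 6 3000 && echo "healthy"
check_metric_max "accepted conn" 6 1 || echo "unhealthy"
```

A value equal to the threshold still passes, which matches the "more than" wording used for the options.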

Connection via socket or another host

Set the FCGI_CONNECT variable to your connection URI:

$ FCGI_CONNECT=/var/run/php-fpm.sock php-fpm-healthcheck
$ echo $?
0

Alternative status page path

Since v0.5.0

While the default status page path is /status, you can change it in your php-fpm configuration. To change it in the script as well, set the FCGI_STATUS_PATH environment variable:

$ FCGI_STATUS_PATH=/custom-status-path php-fpm-healthcheck -v
Trying to connect to php-fpm via: localhost:9000/custom-status-path
...
$ echo $?
0
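How the two variables combine can be sketched with parameter-expansion defaults. `fpm_target` is a hypothetical function; the default values `localhost:9000` and `/status` are the ones visible in the verbose output, but the script's real handling may differ.

```shell
#!/bin/sh
# Hypothetical helper: build the connection target from the environment,
# falling back to the defaults seen in the verbose output.
fpm_target() {
    connect=${FCGI_CONNECT:-localhost:9000}
    status_path=${FCGI_STATUS_PATH:-/status}
    printf '%s%s\n' "$connect" "$status_path"
}

echo "Trying to connect to php-fpm via: $(fpm_target)"
```

The `${VAR:-default}` form treats an empty variable the same as an unset one, so exporting `FCGI_CONNECT=""` still falls back to the default.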

Docker example

You can use the HEALTHCHECK instruction in a Dockerfile to define the health of your container. According to the Docker docs, the possible exit codes are 0 for success and 1 for unhealthy; 2 is reserved and must not be used.

HEALTHCHECK --interval=5s --timeout=1s \
    CMD php-fpm-healthcheck || exit 1
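The `|| exit 1` part collapses every failure into exit code 1, so a code such as 2 never reaches Docker. The same idea as a reusable function, where `healthcheck_status` is a hypothetical name and `sh -c 'exit 253'` merely stands in for a failing check:

```shell
#!/bin/sh
# Hypothetical wrapper: run a command and collapse any failure to exit status 1.
healthcheck_status() {
    if "$@" > /dev/null 2>&1; then
        return 0
    fi
    return 1
}

healthcheck_status sh -c 'exit 253' || echo "reported as $?"
```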

Kubernetes example

More and more people are looking for php-fpm health checks on Kubernetes. Here is an example of a livenessProbe and a readinessProbe:

livenessProbe

# PodSpec: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.10/#podspec-v1-core
    spec:
        containers:
        - name: "php-fpm"
          livenessProbe:
              exec:
                  command:
                      - php-fpm-healthcheck
                      - --listen-queue=10 # fails if there are more than 10 processes waiting in the fpm queue
                      - --accepted-conn=5000 # fails after fpm has served more than 5k requests, this will force the pod to reset, use with caution
              initialDelaySeconds: 0
              periodSeconds: 10

readinessProbe

# PodSpec: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.10/#podspec-v1-core
    spec:
        containers:
        - name: "php-fpm"
          readinessProbe:
              exec:
                  command:
                      - php-fpm-healthcheck # a simple ping since this means it's ready to handle traffic
              initialDelaySeconds: 1
              periodSeconds: 5

The Docker HEALTHCHECK instruction is ignored on Kubernetes, so you must define the checks in the pod specification.

Why POSIX sh

Most containers have very little software installed, so using POSIX sh aims to be compatible with most of the OS images around.

Author

Made with love by Renato Mefi

Distributed under MIT License

php-fpm-healthcheck's People

Contributors

caugner, rdohms, renatomefi, roynasser, s4mur4i, smatyas, stevenjm, tjespers, toadjaune, wandersonwhcr


php-fpm-healthcheck's Issues

Metric check for listen queue len seems wrong

Based on https://serverfault.com/a/355546 the listen queue len value is decreased on connection. So initially the value might be for example 128. When I run the check via php-fpm-healthcheck --listen-queue-len=10 it fails with 'listen queue len' value '128' is greater than expected '10'. In this case this should only fail if the value of listen queue len is lower than 10.
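The reversed comparison this issue asks for could look like the sketch below. `check_metric_min` is a hypothetical function written for illustration; it is not confirmed to exist in the script, and the message wording is invented to mirror the existing one.

```shell
#!/bin/sh
# Hypothetical helper: fail when a metric's current value drops below its threshold.
# $1 = metric name, $2 = current value, $3 = threshold
check_metric_min() {
    if [ "$2" -lt "$3" ]; then
        printf "'%s' value '%s' is lower than expected '%s'\n" "$1" "$2" "$3" >&2
        return 1
    fi
    return 0
}

check_metric_min "listen queue len" 128 10 && echo "healthy"
check_metric_min "listen queue len" 4 10 || echo "unhealthy"
```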

Support all php-fpm metrics

Context

Support all php-fpm metrics which can be seen via: php-fpm-healthcheck -v
Of course ignoring non numeric/measurable data

Impact

  • Documentation and tests

Script Bugs with Image based on Debian 9 stretch

Environment:

# cat /etc/os-release 
PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

How to reproduce :

Run php-fpm-healthcheck --listen-queue=10
And you will get this error :

/usr/local/bin/php-fpm-healthcheck: 78: test: Illegal number:
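Nothing follows "Illegal number:" in this message, so one plausible reading is that an empty string reached a numeric `test`. That is an assumption; the issue does not confirm the cause. A guard of the following kind would turn that case into a clear error. `is_unsigned_integer` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical guard: succeed only when $1 is a non-empty string of digits.
is_unsigned_integer() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

value=""   # stands in for a metric value that could not be parsed
if is_unsigned_integer "$value"; then
    if [ "$value" -gt 10 ]; then
        echo "too high"
    fi
else
    echo "metric value '$value' is not numeric" >&2
fi
```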

Understanding FastCGI (with Ubuntu 22.04)

⏰ Background

  • I am building "PHP-FPM + NGINX" and "PHP + Apache2" images with S6 Overlay 3.1.2.1

👉 Problem I am trying to solve

🤔 My Approach to a Solution

I found this lovely project, but I am confused on what FastCGI really is and why it's a dependency for checking the status of PHP-FPM.

What I am attempting to do is:

  • Start services in order, but ensure each service is healthy before continuing to the next

Here is how I am starting my services:

graph TD;
    A[S6 Overlay] --> B[PHP-FPM];
     B[PHP-FPM] --> C[NGINX];

The folks over at S6 Overlay have been very helpful and pointed me in the right direction for the script side of things: just-containers/s6-overlay#482

Now I just need to get that "definition of healthy" with an exit code of 0.

That's when I found this project and saw the dependency for FastCGI.

I googled "Install FastCGI Ubuntu 22.04" and this was the first result: https://installati.one/ubuntu/22.04/fcgiwrap/

👉 My Problem

Even with having fcgiwrap installed, I am getting this error:


Error:

Make sure fcgi is installed (i.e. apk add --no-cache fcgi). Aborting.

👨‍💻 My Source Code

I have all my source code available on GitHub: https://github.com/serversideup/docker-php/tree/feature/add-readiness/src

Note: The fpm-nginx image is built from the fpm image (so you understand the image dependency and what's included).

❓ Questions

  1. Is there a lightweight way to run this FastCGI server on Ubuntu 22.04 without it installing a bunch of other things in my Docker Image?

Thanks for all your hard work! This project is incredible!! 🤩

Ubuntu - cgi-fcgi isn't terminating

Hi all,

When trying to use this script with ubuntu I'm seeing an ever-growing number of active cgi-fcgi instances and the health checks aren't terminating. Any idea why?

Test failed metrics

Context

Test the case where a metric check fails.

Suggestion

  • Use the metric --accepted-conn since it increases every time a request is served by php-fpm
  • Also test with multiple metrics in different orders

When environment variables exceed 8kb fpm will close the connection

Context

If you have more than 8kb exported in your environment, it is sent by default with the fcgi call to php-fpm. This makes fpm close the connection, causing an exit status of either 104 or 253, depending on when it happens in the internal calls.

How to fix

Send only the needed variables towards php fpm

It could be possible by replacing the fcgi call with:
env -i cgi-fcgi...

env -i clears the inherited environment, so only the variables named on the command line are passed on.
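A minimal sketch of that idea follows. `run_with_minimal_env` is a hypothetical wrapper, and the three CGI variable names are assumptions about what a status request needs; they are not taken from this issue. The demonstration runs `env` instead of cgi-fcgi so that the resulting environment is visible.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with an empty environment plus a few
# CGI variables. The variable names are assumed here, not confirmed by the issue.
# $1 = status path, remaining arguments = command to run
run_with_minimal_env() {
    status_path=$1
    shift
    env -i \
        REQUEST_METHOD=GET \
        SCRIPT_NAME="$status_path" \
        SCRIPT_FILENAME="$status_path" \
        "$@"
}

# Demonstration: export a value larger than 8 kB, then list what the child sees.
LARGE_VALUE=$(printf '%9000s' '' | tr ' ' 'x')
export LARGE_VALUE
run_with_minimal_env /status "$(command -v env)"
```

In the script the call would be closer to `run_with_minimal_env "$FCGI_STATUS_PATH" cgi-fcgi -bind -connect "$FCGI_CONNECT"`. Because `env -i` also clears PATH, giving the full path to cgi-fcgi is the safer choice.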

Log

An strace would look like:

strace cgi-fcgi -bind -connect /var/run/php-fpm.sock
execve("/usr/bin/cgi-fcgi", ["cgi-fcgi", "-bind", "-connect", "/var/run/php-fpm.sock"], 0x7ffed28a0c08 /* 191 vars */) = 0
arch_prctl(ARCH_SET_FS, 0x7f1e94c06b88) = 0
set_tid_address(0x7f1e94c06bc0)         = 16859
open("/etc/ld-musl-x86_64.path", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/lib/libfcgi.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/local/lib/libfcgi.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/lib/libfcgi.so.0", O_RDONLY|O_CLOEXEC) = 3
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
fstat(3, {st_mode=S_IFREG|0755, st_size=38488, ...}) = 0
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240+\0\0\0\0\0\0"..., 960) = 960
mmap(NULL, 2138112, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x7f1e94771000
mmap(0x7f1e94979000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x8000) = 0x7f1e94979000
close(3)                                = 0
mprotect(0x7f1e94979000, 4096, PROT_READ) = 0
mprotect(0x7f1e94c03000, 4096, PROT_READ) = 0
mprotect(0x561b68d55000, 4096, PROT_READ) = 0
rt_sigaction(SIGPIPE, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RT_1 RT_2], NULL, 8) = 0
rt_sigaction(SIGPIPE, {sa_handler=0x7f1e9477694e, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f1e949c10ea}, NULL, 8) = 0
rt_sigaction(SIGUSR1, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGUSR1, {sa_handler=0x7f1e94776af8, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f1e949c10ea}, NULL, 8) = 0
socket(AF_UNIX, SOCK_STREAM, 0)         = 3
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/php-fpm.sock"}, 23) = 0
write(3, "\1\1\0\1\0\10\0\0\0\1\0\0\0\0\0\0", 16) = 16
brk(NULL)                               = 0x561b693c7000
brk(0x561b693ca000)                     = 0x561b693ca000
write(3, "\1\4\0\1\37\370\0\0008\16FEEDBACK_COLLECTOR_HTT"..., 8192) = 8192
write(3, "\1\4\0\1\1\245\3\0SERVICE_PORT8081\30\3BOTKIT"..., 440) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=16859, si_uid=0} ---
rt_sigreturn({mask=[]})                 = -1 EPIPE (Broken pipe)
fcntl(3, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
select(4, [3], [3], NULL, NULL)         = 2 (in [3], out [3])
read(3, "", 8192)                       = 0
shutdown(3, SHUT_WR)                    = 0
poll([{fd=3, events=POLLIN}], 1, 2000)  = 1 ([{fd=3, revents=POLLIN|POLLHUP}])
read(3, "", 1024)                       = 0
close(3)                                = 0
exit_group(-3)                          = ?
+++ exited with 253 +++
/opt/search-api # strace cgi-fcgi -bind -connect /var/run/php-fpm.sock
execve("/usr/bin/cgi-fcgi", ["cgi-fcgi", "-bind", "-connect", "/var/run/php-fpm.sock"], 0x7ffdc1d0c428 /* 191 vars */) = 0
arch_prctl(ARCH_SET_FS, 0x7feba6734b88) = 0
set_tid_address(0x7feba6734bc0)         = 16862
open("/etc/ld-musl-x86_64.path", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/lib/libfcgi.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/local/lib/libfcgi.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/lib/libfcgi.so.0", O_RDONLY|O_CLOEXEC) = 3
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
fstat(3, {st_mode=S_IFREG|0755, st_size=38488, ...}) = 0
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240+\0\0\0\0\0\0"..., 960) = 960
mmap(NULL, 2138112, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x7feba629f000
mmap(0x7feba64a7000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x8000) = 0x7feba64a7000
close(3)                                = 0
mprotect(0x7feba64a7000, 4096, PROT_READ) = 0
mprotect(0x7feba6731000, 4096, PROT_READ) = 0
mprotect(0x55cb34157000, 4096, PROT_READ) = 0
rt_sigaction(SIGPIPE, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RT_1 RT_2], NULL, 8) = 0
rt_sigaction(SIGPIPE, {sa_handler=0x7feba62a494e, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7feba64ef0ea}, NULL, 8) = 0
rt_sigaction(SIGUSR1, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGUSR1, {sa_handler=0x7feba62a4af8, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7feba64ef0ea}, NULL, 8) = 0
socket(AF_UNIX, SOCK_STREAM, 0)         = 3
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/php-fpm.sock"}, 23) = 0
write(3, "\1\1\0\1\0\10\0\0\0\1\0\0\0\0\0\0", 16) = 16
brk(NULL)                               = 0x55cb34ba0000
brk(0x55cb34ba3000)                     = 0x55cb34ba3000
write(3, "\1\4\0\1\37\370\0\0008\16FEEDBACK_COLLECTOR_HTT"..., 8192) = 8192
write(3, "\1\4\0\1\1\245\3\0SERVICE_PORT8081\30\3BOTKIT"..., 440) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=16862, si_uid=0} ---
rt_sigreturn({mask=[]})                 = -1 EPIPE (Broken pipe)
fcntl(3, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
select(4, [3], [3], NULL, NULL)         = 2 (in [3], out [3])
read(3, "", 8192)                       = 0
shutdown(3, SHUT_WR)                    = 0
poll([{fd=3, events=POLLIN}], 1, 2000)  = 1 ([{fd=3, revents=POLLIN|POLLHUP}])
read(3, "", 1024)                       = 0
close(3)                                = 0
exit_group(-3)                          = ?
+++ exited with 253 +++

Readiness probe error not handled due to php-fpm-healthcheck timeout

When php-fpm is stuck and doesn't respond, php-fpm-healthcheck can't handle it and hangs while trying to connect to the port.

root@deployment-php-5bddfd7964-7ctmf:/app# php-fpm-healthcheck -v
Trying to connect to php-fpm via: localhost:9000/status
^C
So php-fpm-healthcheck doesn't return a failure status in a reasonable time.

Since K8s doesn't handle health check timeouts, this doesn't work: Kubernetes removes the pod from the service only when the health check returns a failure.
Readiness probe errored: rpc error: code = DeadlineExceeded desc = context deadline exceeded
The readiness probe errored, but the pod was not removed from the services.

Is there any way to setup a reasonable timeout to connect to the PHP port for the php-fpm-healthcheck?
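One possible workaround, sketched under assumptions, is to wrap the call in the `timeout` utility so that a hung check turns into a failure. `run_with_deadline` is a hypothetical wrapper. `timeout` ships with GNU coreutils and BusyBox but is not part of POSIX, and some older BusyBox builds expect `timeout -t SECONDS` instead. Here `sleep 3` stands in for a check that hangs.

```shell
#!/bin/sh
# Hypothetical wrapper: give a command at most $1 seconds, then report failure.
# Relies on the `timeout` utility, which is not part of POSIX.
run_with_deadline() {
    seconds=$1
    shift
    if timeout "$seconds" "$@"; then
        return 0
    fi
    return 1
}

run_with_deadline 1 sleep 3 || echo "check failed or timed out"
```

In a probe this would become something like `run_with_deadline 2 php-fpm-healthcheck`, with the limit kept below the probe's own period.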

Missing implementation of exit status codes

Hi,

When the cgi-fcgi command fails to connect to a php-fpm instance, the script exits.

Shouldn't there be an implementation for the following exit codes as described in the script header:

2,9,111 - Couldn't connect to PHP fpm, is it running?

For example:

    if test "$FPM_STATUS" = "Could not connect to $1"; then
        >&2 printf "Failed to connect to php-fpm instance via $1. Is it running? \\n";
        exit 2;
    fi;

Disable access logs

Is there a way to disable access logs for path /status in php-fpm?

    • 21/Oct/2019:19:58:00 +0000 "GET /status" 200

I run this in Kubernetes and have separate containers for php-fpm and nginx. I don't see the log line above in the nginx logs, only in the php-fpm logs.

I can't understand the livenessProbe example, Can you explain it in detail?

https://github.com/renatomefi/php-fpm-healthcheck#livenessprobe

--listen-queue=10 # fails if there are more than 10 processes waiting in the fpm queue
In this case, shouldn't the pod be scaled up rather than reset?

--accepted-conn=5000 # fails after fpm has served more than 5k requests, this will force the pod to reset, use with caution
In this case, the pod doesn't have to be reset. Why do you configure it this way?

So how do you configure it in production?

CentOS does not support ' | tail +5'

I use this script on CentOS, but it fails.

The error output looks like this:

[root@root /] echo "any text" | tail +5
[root@root /] tail: cannot open `+5' for reading: No such file or directory

or

[root@root /] FCGI_CONNECT=/dev/shm/php-fpm.socket FCGI_STATUS_PATH=/fpm/status ./php-fpm-healthcheck.sh --verbose
[root@root /] Trying to connect to php-fpm via: /dev/shm/php7-fpm.socket/fpm/status
[root@root /] tail: cannot open `+5' for reading: No such file or directory
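The `+5` form is an obsolete spelling that some `tail` implementations read as a file name. POSIX specifies `tail -n +5` for "start output at line 5", which the sketch below uses. `skip_first_four_lines` is a hypothetical name, and the input is made-up sample text.

```shell
#!/bin/sh
# Hypothetical helper: drop the first four lines of standard input,
# using the POSIX spelling instead of the obsolete `tail +5`.
skip_first_four_lines() {
    tail -n +5
}

printf 'h1\nh2\nh3\nh4\npool: www\n' | skip_first_four_lines   # prints "pool: www"
```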

not working in debian linux with FPM socket communication

Hi guys,
We had to switch from alpine to debian linux (unfortunately). Since then, the healthcheck script is no longer running. I only receive the answer: Make sure fcgi is installed (i.e. apk add --no-cache fcgi). Aborting.
I'm running the script with: FCGI_CONNECT=/var/run/socket/php5-fpm.sock php-fpm-healthcheck

Could we find out the reason for it?

Test other distros than alpine

Context

Currently the test matrix only tests Alpine images; it would be good to be able to test stretch, for instance.

Impact

  • tests might differ, I recommend using pytest mark feature to separate those distro specific
  • docker.sh has to adhere to the marks, build and run this matrix, maybe the make file could inform which Dockerfile to look at

36: set: Illegal option -o pipefail

Installed as

wget -O /usr/local/bin/php-fpm-healthcheck \
https://raw.githubusercontent.com/renatomefi/php-fpm-healthcheck/master/php-fpm-healthcheck \
&& chmod +x /usr/local/bin/php-fpm-healthcheck

result is:

# php-fpm-healthcheck
/usr/local/bin/php-fpm-healthcheck: 36: set: Illegal option -o pipefail

# php-fpm-healthcheck -v
/usr/local/bin/php-fpm-healthcheck: 36: set: Illegal option -o pipefail
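`pipefail` was long absent from POSIX sh, and some implementations, such as older versions of dash, reject it with exactly this message. A hedged way around it is to probe for the option in a subshell before enabling it. `enable_pipefail_if_supported` is a hypothetical helper, not code from the script.

```shell
#!/bin/sh
# Hypothetical helper: turn on pipefail only where the shell accepts it.
# The probe runs in a subshell, so a rejected option cannot abort the script.
enable_pipefail_if_supported() {
    if (set -o pipefail) 2> /dev/null; then
        set -o pipefail
    fi
    return 0
}

enable_pipefail_if_supported
echo "still running"
```

On shells without the option the script continues without it, so failures inside a pipeline then have to be detected another way.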
