vmware-archive / healthcheck

A library for implementing Kubernetes liveness and readiness probe handlers in your Go application.

License: Apache License 2.0

Go 100.00%
kubernetes healthcheck liveness readiness

healthcheck's Introduction

healthcheck

Healthcheck is a library for implementing Kubernetes liveness and readiness probe handlers in your Go application.

Features

  • Integrates easily with Kubernetes. This library explicitly separates liveness vs. readiness checks instead of lumping everything into a single category of check.

  • Optionally exposes each check as a Prometheus gauge metric. This allows for cluster-wide monitoring and alerting on individual checks.

  • Supports asynchronous checks, which run in a background goroutine at a fixed interval. These are useful for expensive checks that you don't want to add latency to the liveness and readiness endpoints.

  • Includes a small library of generically useful checks for validating upstream DNS, TCP, HTTP, and database dependencies as well as checking basic health of the Go runtime.

Usage

See the GoDoc examples for more detail.

  • Install with go get or your favorite Go dependency manager: go get -u github.com/heptiolabs/healthcheck

  • Import the package: import "github.com/heptiolabs/healthcheck"

  • Create a healthcheck.Handler:

    health := healthcheck.NewHandler()
  • Configure some application-specific liveness checks (whether the app itself is unhealthy):

    // Our app is not happy if we've got more than 100 goroutines running.
    health.AddLivenessCheck("goroutine-threshold", healthcheck.GoroutineCountCheck(100))
  • Configure some application-specific readiness checks (whether the app is ready to serve requests):

    // Our app is not ready if we can't resolve our upstream dependency in DNS.
    health.AddReadinessCheck(
        "upstream-dep-dns",
        healthcheck.DNSResolveCheck("upstream.example.com", 50*time.Millisecond))
    
    // Our app is not ready if we can't connect to our database (`var db *sql.DB`) in <1s.
    health.AddReadinessCheck("database", healthcheck.DatabasePingCheck(db, 1*time.Second))
  • Expose the /live and /ready endpoints over HTTP (on port 8086):

    go http.ListenAndServe("0.0.0.0:8086", health)
  • Configure your Kubernetes container with HTTP liveness and readiness probes (see the Kubernetes documentation for more detail):

    # this is a bare bones example
    # copy and paste livenessProbe and readinessProbe as appropriate for your app
    apiVersion: v1
    kind: Pod
    metadata:
      name: heptio-healthcheck-example
    spec:
      containers:
      - name: liveness
        image: your-registry/your-container
    
        # define a liveness probe that checks every 5 seconds, starting after 5 seconds
        livenessProbe:
          httpGet:
            path: /live
            port: 8086
          initialDelaySeconds: 5
          periodSeconds: 5
    
        # define a readiness probe that checks every 5 seconds
        readinessProbe:
          httpGet:
            path: /ready
            port: 8086
          periodSeconds: 5
  • If one of your readiness checks fails, Kubernetes will stop routing traffic to that pod within a few seconds (depending on periodSeconds and other factors).

  • If one of your liveness checks fails or your app becomes totally unresponsive, Kubernetes will restart your container.

HTTP Endpoints

When you run go http.ListenAndServe("0.0.0.0:8086", health), two HTTP endpoints are exposed:

  • /live: liveness endpoint (HTTP 200 if healthy, HTTP 503 if unhealthy)
  • /ready: readiness endpoint (HTTP 200 if healthy, HTTP 503 if unhealthy)

Pass the ?full=1 query parameter to see the full check results as JSON. These are omitted by default for performance.

healthcheck's People

Contributors

bryanl, davecheney, mattmoyer

healthcheck's Issues

collectChecks may run the same checks more than once

In setting up this library, I initially added a database check to both AddLivenessCheck and AddReadinessCheck. In looking at handlers.go, it looks like the ready handler will run both liveness and readiness checks. I didn't see this documented anywhere, so I assumed each only ran their own specific checks as added.

Would it be possible to either 1) document the behavior in the godoc and README, or 2) treat liveness and readiness as separate sets of checks? Maybe expose a third, more explicit method, AddLivenessReadinessCheck(), for anyone who wants to add a check to both.

If both checks are going to be combined, it might make sense to treat the name as a unique identifier and ensure the same check (which may be registered in both places) is not run more than once. Otherwise the ready checks may call (for example) db.PingContext() twice instead of once for a single /ready check from Kubernetes.
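The name-based deduplication suggested here could look roughly like the following. This is a sketch of the idea only, not a patch against handlers.go: `Check` is a local stand-in type and `collectOnce` is a hypothetical function that runs each named check at most once across the groups it is given.

```go
package main

import (
	"errors"
	"fmt"
)

// Check is a local stand-in for a health check: nil means healthy.
type Check func() error

// collectOnce (hypothetical) runs the checks in each group in turn, skipping
// any name that already has a result. It returns the per-check results and
// whether every check that ran succeeded.
func collectOnce(groups ...map[string]Check) (map[string]string, bool) {
	results := make(map[string]string)
	healthy := true
	for _, checks := range groups {
		for name, check := range checks {
			if _, seen := results[name]; seen {
				continue // same name already ran in an earlier group
			}
			if err := check(); err != nil {
				results[name] = err.Error()
				healthy = false
			} else {
				results[name] = "OK"
			}
		}
	}
	return results, healthy
}

func main() {
	pings := 0
	database := func() error { pings++; return nil }
	dns := func() error { return errors.New("lookup timed out") }

	// "database" is registered for both liveness and readiness.
	liveness := map[string]Check{"database": database}
	readiness := map[string]Check{"database": database, "upstream-dep-dns": dns}

	results, healthy := collectOnce(liveness, readiness)
	fmt.Println("healthy:", healthy)
	fmt.Println("results:", results)
	fmt.Println("database pings:", pings)
}
```

Keying on the name means two different checks registered under the same name would shadow each other, so this approach only works if names are treated as unique identifiers.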

Allow customizing of route names

Currently, only the routes '/live' and '/ready' are exposed.
It would be great to allow customization of the routes, e.g. '/liveness', '/readiness', or '/super/site/readiness'.

HealthCheck endpoint does not support HTTP Method HEAD

I was attempting to curl a healthcheck endpoint as part of a wait-for-it script in Docker and noticed an issue with the HealthCheck server. The server, by default, doesn't respond to the HTTP Method HEAD which is often used with curl to limit the response.

Example with --head

# curl -v --head http://api:8071/_ready
*   Trying 172.22.0.8...
* TCP_NODELAY set
* Connected to api (172.22.0.8) port 8071 (#0)
> HEAD /_ready HTTP/1.1
> Host: api:8071
> User-Agent: curl/7.52.1
> Accept: */*
> 
< HTTP/1.1 405 Method Not Allowed
HTTP/1.1 405 Method Not Allowed
< Content-Type: text/plain; charset=utf-8
Content-Type: text/plain; charset=utf-8
< X-Content-Type-Options: nosniff
X-Content-Type-Options: nosniff
< Date: Fri, 17 Jan 2020 19:55:24 GMT
Date: Fri, 17 Jan 2020 19:55:24 GMT
< Content-Length: 19
Content-Length: 19

< 
* Curl_http_done: called premature == 0
* Connection #0 to host api left intact

Example without --head

# curl -v http://api:8071/_ready
*   Trying 172.22.0.8...
* TCP_NODELAY set
* Connected to api (172.22.0.8) port 8071 (#0)
> GET /_ready HTTP/1.1
> Host: api:8071
> User-Agent: curl/7.52.1
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: application/json; charset=utf-8
< Date: Fri, 17 Jan 2020 19:55:32 GMT
< Content-Length: 3
< 
{}
* Curl_http_done: called premature == 0
* Connection #0 to host api left intact

Calling the GET endpoint with the HEAD method should return just the HTTP headers and a 200.

healthcheck function only executed once?

Hi,
I'm trying to get this little example to run, but the mycheck function does not seem to be executed more than once, at startup. I thought it would run every time I access the /live endpoint?

package main

import (
	"errors"
	"fmt"
	h "github.com/heptiolabs/healthcheck"
	"math/rand"
	"net/http"
)

func main() {
	Example()
}

func Example() {
	health := h.NewHandler()
	health.AddLivenessCheck("mycheck", mycheck())
	http.ListenAndServe("0.0.0.0:8080", health)
}

func mycheck() h.Check {
	if rand.Intn(2) == 1 {
		return func() error {
			return nil
		}
	} else {
		return func() error {
			return errors.New("0")
		}
	}
}
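A likely explanation: mycheck() is called once, when the check is registered, and the random choice is made at that moment. Only the closure it returns runs on each request, and that closure always returns the same value. Moving the decision inside the closure makes it run on every call. The sketch below uses a local `Check` type as a stand-in for the library's type so that it runs on its own.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// Check is a local stand-in for a health check: nil means healthy.
type Check func() error

// mycheck returns a single closure; the random decision now happens inside
// it, so it is re-evaluated on every invocation of the check.
func mycheck() Check {
	return func() error {
		if rand.Intn(2) == 1 {
			return nil
		}
		return errors.New("0")
	}
}

func main() {
	check := mycheck() // evaluated once, as it is at registration time

	// Each call re-rolls, so the results vary from call to call.
	for i := 0; i < 5; i++ {
		fmt.Println(check())
	}
}
```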

Consider subpackage for various checks

At the moment, even if you don't want specific checks from this package, you still have to depend on their dependencies (e.g. prometheus) and vendor them accordingly.
It would be nice to have separate subpackages for the handlers and for the checks to avoid this situation.

Liveness check - database ping issue

Hello,

I'm trying to make use of this library to implement some healthchecks in my application.

My liveness check consists of a database ping check like this:

health.AddLivenessCheck("database-ping", healthcheck.DatabasePingCheck(db, 2*time.Second))

But database.Ping() returns an error only on the initial request (i.e. if the database goes down at some point later, it will not return an error), so a request to the /live endpoint returns 200 OK even if the database is down.

Are there any plans to fix this, perhaps by using database.Query("SELECT 1") instead of database.Ping()?

Thanks a lot.

Is this repo still active?

Hello everyone 👋,

First of all, thank you for creating such a repo, a simple health check tooling that can easily be integrated with Kube. I really liked it.

I have a question regarding the activity of this repo. I've seen that the last contribution was about 2 years ago, which is concerning. Many PRs are also just parked and waiting for either some comments or reviews.

This project would definitely benefit from the new Go 1.14 features, like: go mod, new request context, etc. In case this is not being maintained anymore, I would be happy to help in keeping it up.

Thank you all!

suggestion to add monitoring healthcheck

Hello,
Kubelet uses the health check response to trigger a restart (liveness) or to remove its IP (readiness), but it offers no monitoring action for container health checks that do not require any action on the pod.
For example: monitoring the availability of a non-critical application process that has stopped working within the pod, or warning about an application process that is still functioning.
The ability to run monitoring actions through health probing would help report the correct status of application behavior in the cluster and detect problems without affecting the pod lifecycle.

Suggestion:

Define a monitoring probe with a monitoring command that sends the status and return values to a specified destination, or exposes them to cAdvisor through an endpoint.

example: monitoring.yaml

spec:
  containers:
  - name: appmonitoring
    image: imagex
    args:
    monitoringProbe:
      exec:
        command:
        - /bin/sh
        - app1health.sh
      initialDelaySeconds: 5
      periodSeconds: 5
