log-forwarder's Introduction

Journald SumoLogic forwarder

Reads journald entries and uploads them to SumoLogic.

Quickstart

Use prebuilt images from Docker Hub: https://hub.docker.com/r/bsycorp/log-forwarder/

Example systemd service file:

[Unit]
After=docker.service

[Service]
ExecStartPre=-/usr/bin/docker rm -f log-forwarder
ExecStart=/usr/bin/docker run --init --rm --name log-forwarder \
  -e "SUMO_SOURCE_CATEGORY=dev/your-aws-account/linux/<something>" \
  -e "SUMO_SOURCE_NAME=your-system-<something>" \
  -e "SUMO_TRUSTED_TIMESTAMP_COLLECTOR_URL=<unique-collector-url>" \
  -e "SUMO_UNTRUSTED_TIMESTAMP_COLLECTOR_URL=<unique-collector-url>" \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /etc:/etc \
  -v /var/log/journal:/var/log/journal:ro \
  -v /var/lib/log-forwarder:/var/lib/log-forwarder \
  --log-driver none \
  --network host \
  "bsycorp/log-forwarder:latest"
StateDirectory=log-forwarder
Restart=always
RestartSec=1m
TimeoutStartSec=60

[Install]
WantedBy=multi-user.target

Configuration

Required environment variables

See the SumoLogic documentation for more information.

  • SUMO_SOURCE_CATEGORY
  • SUMO_SOURCE_NAME
  • SUMO_TRUSTED_TIMESTAMP_COLLECTOR_URL - This should be a collector configured with 'Enable Timestamp Parsing' ON / ENABLED
  • SUMO_UNTRUSTED_TIMESTAMP_COLLECTOR_URL - This should be a collector configured with 'Enable Timestamp Parsing' OFF / DISABLED

Optional environment variables

See: https://www.freedesktop.org/software/systemd/man/systemd.journal-fields.html for a list of valid Journald transports.

  • SUMO_SOURCE_HOST - Overrides the hostname. If unset, the hostname is auto-detected from cloud provider metadata (such as EC2 metadata) or /etc/hostname, in that order.
  • JOURNAL_INCLUDE_TRANSPORTS - Comma-separated list of journald transports to collect from. Default: all valid transports.
  • JOURNAL_EXCLUDE_TRANSPORTS - If set, excludes the listed journald transports from collection. Default: empty.
  • JOURNAL_EXCLUDE_UNITS - If set, excludes messages from the listed systemd units (comma-separated list). Useful for excluding log-forwarder itself.
  • FORMAT_MESSAGE_EXCLUDE_UNITS - If set, disables custom formatting for the listed systemd units. Default: docker.service.
  • SUMO_EXCLUDE_SOURCE_CATEGORIES - Comma-separated list of strings. A message is dropped if its source category contains any of them ("string contains" match). For example, a value of kubernetes/kube-system/weave-net prevents weave net messages from being forwarded to Sumo.
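
As a rough illustration of how the list-valued variables above could be applied, here is a minimal Python sketch. It is not the forwarder's actual code: the function names parse_list and should_forward are hypothetical, and the way the include and exclude lists combine is an assumption. Only the "string contains" rule for source categories comes from the description above.

```python
def parse_list(value):
    """Split a comma-separated environment value into trimmed, non-empty items."""
    return [item.strip() for item in (value or "").split(",") if item.strip()]


def should_forward(transport, source_category, env):
    """Return True if an entry passes the transport and source-category filters."""
    include = parse_list(env.get("JOURNAL_INCLUDE_TRANSPORTS"))
    exclude = parse_list(env.get("JOURNAL_EXCLUDE_TRANSPORTS"))
    dropped = parse_list(env.get("SUMO_EXCLUDE_SOURCE_CATEGORIES"))

    # Assumption: an empty include list means "all transports".
    if include and transport not in include:
        return False
    if transport in exclude:
        return False
    # "String contains" match, as described for SUMO_EXCLUDE_SOURCE_CATEGORIES.
    return not any(fragment in source_category for fragment in dropped)
```

With SUMO_EXCLUDE_SOURCE_CATEGORIES set to kubernetes/kube-system/weave-net, any entry whose generated source category contains that string would be dropped, whatever base category precedes it.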

Proxy environment variables

Standard proxy environment variables are supported.

  • http_proxy or HTTP_PROXY
  • https_proxy or HTTPS_PROXY
  • no_proxy or NO_PROXY

Hostname Lookup

The SUMO_SOURCE_HOST environment variable can be set to override the hostname emitted to Sumo. By default, the hostname is auto-detected from the system itself, using cloud provider metadata or /etc/hostname, in that order. Both AWS and GCP are supported.
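
The lookup order can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the project's implementation: resolve_hostname is a hypothetical name, and the metadata_lookups callables stand in for whatever cloud provider metadata calls are really made.

```python
def resolve_hostname(env, metadata_lookups, hostname_file="/etc/hostname"):
    """Pick the source host: explicit override, then cloud metadata, then a file."""
    override = env.get("SUMO_SOURCE_HOST")
    if override:
        return override

    # metadata_lookups: callables returning a hostname or None (for example one
    # per cloud provider). They are hypothetical stand-ins for real metadata calls.
    for lookup in metadata_lookups:
        try:
            name = lookup()
        except OSError:
            name = None  # metadata service unreachable, try the next source
        if name:
            return name.strip()

    # Last resort: read the hostname from the file on disk.
    with open(hostname_file) as handle:
        return handle.read().strip()
```

Passing the lookups in as callables keeps the ordering logic testable without a real metadata service.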

Source Category / Source Name Generation

For all journald event sources, a source category and/or source name is generated, overriding or extending what is specified in environment variables. This effectively makes the values specified in environment variables the base values, especially for Source Category.

There are 4 kinds of generation behaviour currently supported:

Kubernetes Pods

If an event is from Docker, has a CONTAINER_ID, and its CONTAINER_NAME starts with k8s_, log-forwarder queries the Kubernetes API for the pod name, namespace and owner name (daemonset, deployment, etc.) of that pod.

  • Source Category will be set to: $SUMO_SOURCE_CATEGORY/kubernetes/<kubernetes namespace>/<kubernetes owner name / pod name>
  • Source Name will be set to: <kubernetes pod name>

Docker Containers

If an event is from docker and has a CONTAINER_ID but isn't from kubernetes, it is treated like a vanilla docker process.

  • Source Category will be set to: $SUMO_SOURCE_CATEGORY/docker/<docker container name>
  • Source Name will be set to: <docker container name>

Systemd Unit

If an event has SYSTEMD_SLICE set and doesn't match the above scenarios it is treated as a vanilla systemd process.

  • Source Category will be set to: $SUMO_SOURCE_CATEGORY/systemd/<systemd slice name>
  • Source Name will be set to: <systemd slice name>

Journald Entry

Otherwise, if an event doesn't match any of the above scenarios, it is treated as a vanilla journald entry.

  • Source Category will be set to: $SUMO_SOURCE_CATEGORY/journald/<journal transport name>
  • Source Name will be set to: <journal transport name>
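
The four cases above amount to a decision chain, sketched here in Python. This is illustrative only and makes several assumptions: source_for and pod_lookup are hypothetical names, the presence of CONTAINER_ID is taken as enough to mean "from Docker", "owner name / pod name" is read as "owner name, falling back to pod name", and the transport is read from the standard journald _TRANSPORT field. Field keys otherwise follow the spelling used in the text.

```python
def source_for(entry, base_category, pod_lookup=None):
    """Return (source_category, source_name) for one journald entry (a dict)."""
    container_id = entry.get("CONTAINER_ID")
    container_name = entry.get("CONTAINER_NAME", "")

    if container_id and container_name.startswith("k8s_") and pod_lookup:
        # pod_lookup is a hypothetical stand-in for the Kubernetes API query;
        # it returns a dict with "namespace", "pod" and optionally "owner".
        pod = pod_lookup(container_id)
        owner = pod.get("owner") or pod["pod"]  # assumption: owner, else pod name
        return f"{base_category}/kubernetes/{pod['namespace']}/{owner}", pod["pod"]

    if container_id:
        # Plain Docker container (or no Kubernetes lookup available).
        return f"{base_category}/docker/{container_name}", container_name

    slice_name = entry.get("SYSTEMD_SLICE")
    if slice_name:
        return f"{base_category}/systemd/{slice_name}", slice_name

    # Fallback: plain journald entry keyed by transport ("unknown" is an assumption).
    transport = entry.get("_TRANSPORT", "unknown")
    return f"{base_category}/journald/{transport}", transport
```

The order of the checks matters: a Kubernetes container also has a CONTAINER_ID, so the more specific case has to be tested first.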

Timestamp parsing

Because log-forwarder is meant to collect all log entries from a given host, likely from a number of sources as described above, some of those sources probably log timestamps in different formats, or not at all. This creates a problem in SumoLogic: by default it tries to parse each log event and makes it searchable under whatever time (in the nominated timezone) it finds. Where it doesn't find a timezone, it applies the default timezone for the collector.

As log-forwarder can't pass source-specific timezone information to SumoLogic (it's not part of the upload API), we have simplified this down to two classes of sources: those whose timestamps we can 'trust' and those whose timestamps we can't. Trusted log entries are those in a format SumoLogic supports that includes a timezone. Untrusted entries are those that should be marked with the receipt timestamp in SumoLogic (so now-ish).

We can't simply mark every event with its receipt time, because that would subtly change the order of messages for transactions that cross host boundaries. This matters most for application logs, which are luckily a trusted timestamp source and so don't have this problem.

Enabling trusted timestamps from a service

Currently only kubernetes pods may have trusted timestamps (not docker containers or systemd services).

To send logs to the "trusted timestamp" collector, set an annotation like com.sumologic/trusted-timestamp=true on a pod.
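
The routing between the two collectors can be pictured with a short Python sketch. It is an assumption-laden illustration rather than the real code: collector_url is a hypothetical name, and treating only the literal value "true" as trusted is an assumption.

```python
TRUSTED_ANNOTATION = "com.sumologic/trusted-timestamp"


def collector_url(pod_annotations, env):
    """Choose the collector URL for an entry from its pod annotations (or None)."""
    annotations = pod_annotations or {}
    # Assumption: only the literal value "true" marks a pod as trusted.
    if annotations.get(TRUSTED_ANNOTATION) == "true":
        return env["SUMO_TRUSTED_TIMESTAMP_COLLECTOR_URL"]
    # Everything else (docker containers, systemd units, unannotated pods)
    # goes to the collector that stamps entries with receipt time.
    return env["SUMO_UNTRUSTED_TIMESTAMP_COLLECTOR_URL"]
```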

log-forwarder's People

Contributors

bls, nhoughto, mengxuzhao

log-forwarder's Issues

sumologic fails to properly detect message boundaries at certain log volume

It appears that SumoLogic, at a certain log volume to a single HTTP collector, starts misbehaving: instead of detecting the bounds of a log message via timestamp detection (or however it works), it creates a SumoLogic message for every line (ish?) submitted to the HTTP collector.

There don't appear to be any knobs to turn in the SumoLogic config to change this, and there appears to be a threshold we've crossed to see this behaviour; it previously worked OK (and appears to work if log volumes drop enough).

Maybe we should be balancing emitted logs across multiple HTTP collector URLs? 😬

Sample Kubernetes DaemonSet

Hello, do you have any sample K8S yaml files you can share for your project? I'm using fluentd currently, and running into some memory consumption issues similar to those you reported.

Sometimes gets stuck and requires restart, should exit instead?

Oct 24 02:14:09 proxy-0cc62565e01bc8767 sh[4144]: 2019/10/24 02:14:09 warning: GetEntry:  failed to get realtime timestamp: 99
Oct 24 02:14:09 proxy-0cc62565e01bc8767 sh[4144]: 2019/10/24 02:14:09 Opening journal
Oct 24 02:14:09 proxy-0cc62565e01bc8767 sh[4144]: 2019/10/24 02:14:09 Closed old journal
Oct 24 02:14:09 proxy-0cc62565e01bc8767 sh[4144]: 2019/10/24 02:14:09 Seeking to:  s=384da0f830fd4a768962d90956b51e3c;i=ba241d;b=8b6a5e01a1bc4263ae88e2500daf9c81;m=3c65601acfa;t=5959e96b86e90;x=68fd708a7c03c2de
Oct 24 02:14:09 proxy-0cc62565e01bc8767 sh[4144]: 2019/10/24 02:14:09 Journal opened OK
Dec 04 04:10:15 proxy-0cc62565e01bc8767 sh[4144]: 2019/12/04 04:10:15 warning: GetEntry:  failed to get realtime timestamp: 99
Dec 04 04:10:15 proxy-0cc62565e01bc8767 sh[4144]: 2019/12/04 04:10:15 Opening journal
Dec 04 04:10:15 proxy-0cc62565e01bc8767 sh[4144]: 2019/12/04 04:10:15 Closed old journal
Dec 04 04:10:15 proxy-0cc62565e01bc8767 sh[4144]: 2019/12/04 04:10:15 Seeking to:  s=384da0f830fd4a768962d90956b51e3c;i=bbe99e;b=8b6a5e01a1bc4263ae88e2500daf9c81;m=700bca19762;t=598d8fd5858f9;x=c55d339b8d05f99c
Dec 04 04:10:15 proxy-0cc62565e01bc8767 sh[4144]: 2019/12/04 04:10:15 Journal opened OK

restart fixed it

Jan 08 22:47:32 proxy-0cc62565e01bc8767 systemd[1]: Starting log-forwarder.service...
Jan 08 22:47:32 proxy-0cc62565e01bc8767 docker[20657]: Error: No such container: log-forwarder
Jan 08 22:47:32 proxy-0cc62565e01bc8767 systemd[1]: Started log-forwarder.service.
Jan 08 22:47:32 proxy-0cc62565e01bc8767 sh[20665]: 2020/01/08 22:47:32 Opening journal
Jan 08 22:47:32 proxy-0cc62565e01bc8767 sh[20665]: 2020/01/08 22:47:32 Seeking to:  s=384da0f830fd4a768962d90956b51e3c;i=bcd5d4;b=8b6a5e01a1bc4263ae88e2500daf9c81;m=9d06cbbc4fa;t=59ba8ad728690;x=56256c024bb5e074
Jan 08 22:47:32 proxy-0cc62565e01bc8767 sh[20665]: 2020/01/08 22:47:32 Journal opened OK
Jan 08 22:47:32 proxy-0cc62565e01bc8767 sh[20665]: 2020/01/08 22:47:32 Listening for journald transports:  [audit driver syslog journal stdout kernel]
Jan 08 22:47:32 proxy-0cc62565e01bc8767 sh[20665]: 2020/01/08 22:47:32 Not formatting message for systemd units:  [docker.service]
Jan 08 22:47:32 proxy-0cc62565e01bc8767 sh[20665]: 2020/01/08 22:47:32 Excluding messages for sumo source categories:  [kubernetes/kube-system/kube-flannel-ds]
Jan 08 22:47:32 proxy-0cc62565e01bc8767 sh[20665]: 2020/01/08 22:47:32 metrics reporting to dogstatd at:  127.0.0.1:8125
