
Needs Maintainer: python daemon that munches on logs and sends their contents to logstash

Home Page: https://python-beaver.readthedocs.org/

License: MIT License

Python 94.04% Shell 5.96%

python-beaver's Introduction

Beaver


python daemon that munches on logs and sends their contents to logstash

Requirements

  • Python 2.6+
  • Optional zeromq support: install libzmq (brew install zmq or apt-get install libzmq-dev) and pyzmq (pip install pyzmq==2.1.11)

Installation

Using PIP:

From Github:

pip install git+git://github.com/python-beaver/python-beaver.git@36.3.1#egg=beaver

From PyPI:

pip install beaver==36.3.1

Documentation

Full documentation is available online at http://python-beaver.readthedocs.org/

You can also build the docs locally:

# get sphinx installed
pip install sphinx

# retrieve the repository
git clone git://github.com/python-beaver/beaver.git

# build the html output
cd beaver/docs
make html

HTML docs will be available in beaver/docs/_build/html.

Contributing

When contributing to Beaver, please review the full guidelines here: https://github.com/python-beaver/python-beaver/blob/master/CONTRIBUTING.md. If you would like, you can open an issue to let others know about your work in progress. Documentation must be included and tests must pass on Python 2.6 and 2.7 for pull requests to be accepted.

Credits

Based on work from Giampaolo and Lusis:

Real time log files watcher supporting log rotation.

Original Author: Giampaolo Rodola' <g.rodola [AT] gmail [DOT] com>
http://code.activestate.com/recipes/577968-log-watcher-tail-f-log/

License: MIT

Other hacks (ZMQ, JSON, optparse, ...): lusis

python-beaver's Issues

Fix issues when running under supervisor

When running with supervisor, it appears that logging is blocked at program start, as long as the connection to redis succeeds. This might have to do with environment settings, but it is odd that it works fine outside of supervisor and not under it.

The connection issue also presumably occurs with other brokers as well.

Beaver not re-connecting after a transport exception

I see this behavior happening all the time when I converge my Chef nodes. The Chef run causes RabbitMQ to restart, and beaver loses its connection to it. This causes a transport exception that, in theory, should be sorted moments after with a respawn.

However, the respawn of the transport does not seem to work, as every respawn causes another exception until max_tries is reached.

The only solution is to restart beaver completely.

Example log (restarted RabbitMQ at 00:39:59):

[2013-01-30 06:39:34,635] INFO    [801g40282] - watching logfile /var/log/syslog
[2013-01-30 06:39:34,635] INFO    Working...
[2013-01-30 06:39:34,635] INFO    Starting queue consumer
ERROR:pika.adapters.base_connection:Socket Error on fd 23: 104
WARNING:pika.adapters.blocking_connection:Received Channel.Close, closing: None
[2013-01-31 00:39:59,302] INFO    Caught transport exception, respawning in 3 seconds
[2013-01-31 00:40:02,305] INFO    Caught transport exception, respawning in 9 seconds
[2013-01-31 00:40:11,308] INFO    Caught transport exception, respawning in 27 seconds
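The delays in the log triple on each attempt (3, 9, 27 seconds). A sketch of that backoff schedule, with hypothetical names rather than Beaver's actual code:

```python
def respawn_delays(base=3, factor=3, max_tries=5):
    """Yield the respawn delays seen in the log: 3, 9, 27, ... seconds."""
    delay = base
    for _ in range(max_tries):
        yield delay
        delay *= factor

print(list(respawn_delays(max_tries=4)))  # [3, 9, 27, 81]
```

The bug reported here is that the transport never actually re-establishes the connection, so this schedule just counts down to max_tries.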

Add an event parser

This event parser may also be configured to leave out certain keys as specified in the config file.
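A minimal sketch of such a key-filtering parser (all names here are hypothetical, not from Beaver itself):

```python
def filter_event(event, drop_keys):
    """Return a copy of the event dict without the configured keys.

    drop_keys would come from a hypothetical config option listing
    keys to leave out before the event is shipped.
    """
    return {k: v for k, v in event.items() if k not in drop_keys}

event = {"@message": "GET /health 200", "@source_path": "/var/log/app.log", "pid": 1234}
print(filter_event(event, {"pid"}))  # the 'pid' key is dropped
```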

Exception: Unsupported UTF-8 sequence length

I'm not exactly sure what caused this, but I think it might be because my log path contained binary files (apache ssl_cache). The problem is that beaver just crashes when this happens. Changing the path not to include binary files is the obvious solution in my case, but it's possible that real log files contain bad data. It would be nicer if beaver just ignored the bad data and continued on.

[2012-09-11 23:38:12] [fd00g4e4d3] - watching logfile /var/log/httpd/access-ssl.log
[2012-09-11 23:38:12] [fd00g4e501] - watching logfile /var/log/httpd/access.log
[2012-09-11 23:38:12] [fd00g4e4d2] - watching logfile /var/log/httpd/error-ssl.log
[2012-09-11 23:38:12] [fd00g4e4b3] - watching logfile /var/log/httpd/error.log
[2012-09-11 23:38:12] [fd00g4e521] - watching logfile /var/log/httpd/httpd.pid
[2012-09-11 23:38:12] [fd00g4e51f] - watching logfile /var/log/httpd/ssl_scache(512000).dir
[2012-09-11 23:38:12] [fd00g4e520] - watching logfile /var/log/httpd/ssl_scache(512000).pag
[2012-09-11 23:38:12] Working...
[2012-09-11 23:50:36] Unhandled Exception: Unsupported UTF-8 sequence length when encoding string
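One way to get the tolerance the reporter asks for is to decode with replacement rather than raising; a minimal sketch, not Beaver's actual code:

```python
def safe_decode(raw):
    # Substitute U+FFFD for invalid sequences instead of raising, so a
    # stray binary file (e.g. apache's ssl_scache) cannot crash the watcher.
    return raw.decode("utf-8", errors="replace")

print(safe_decode(b"good \xff\xfe bytes"))  # invalid bytes become replacement chars
```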

beaver crashes if redis not available

While doing some maintenance on redis this morning, it looks like beaver crashed because it was not able to connect to it. Is there a way to make it more graceful with such a situation?

Thanks!

ConnectionError: Error 111 connecting 10.x.x.x:6379. Connection refused.
[2013-01-09 10:00:06,148] INFO Starting queue consumer
Process Process-22:
Traceback (most recent call last):
File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in _bootstrap
self.run()
File "/usr/lib64/python2.6/multiprocessing/process.py", line 88, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.6/site-packages/Beaver-21-py2.6.egg/beaver/queue.py", line 18, in run_queue
transport.callback(*data)
File "/usr/lib/python2.6/site-packages/Beaver-21-py2.6.egg/beaver/redis_transport.py", line 46, in callback
self._pipeline.execute()
File "/usr/lib/python2.6/site-packages/redis-2.4.11-py2.6.egg/redis/client.py", line 1528, in execute
return execute(conn, stack)
File "/usr/lib/python2.6/site-packages/redis-2.4.11-py2.6.egg/redis/client.py", line 1485, in _execute_pipeline
connection.send_packed_command(all_cmds)
File "/usr/lib/python2.6/site-packages/redis-2.4.11-py2.6.egg/redis/connection.py", line 241, in send_packed_command
self.connect()
File "/usr/lib/python2.6/site-packages/redis-2.4.11-py2.6.egg/redis/connection.py", line 189, in connect
raise ConnectionError(self._error_message(e))
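A sketch of the graceful handling requested: wrap the pipeline flush in a retry loop instead of letting the process die. All names are hypothetical; the local ConnectionError class stands in for redis.exceptions.ConnectionError so the sketch is self-contained.

```python
import time

class ConnectionError(Exception):
    """Stand-in for redis.exceptions.ConnectionError in this sketch."""

def execute_with_retry(execute, retries=3, delay=1.0, sleep=time.sleep):
    # Retry the pipeline flush instead of letting the whole daemon die
    # while redis is briefly down for maintenance.
    for attempt in range(1, retries + 1):
        try:
            return execute()
        except ConnectionError:
            if attempt == retries:
                raise
            sleep(delay * attempt)  # simple linear backoff between tries

calls = {"n": 0}
def flaky_execute():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("Error 111 connecting 10.x.x.x:6379.")
    return "OK"

print(execute_with_retry(flaky_execute, sleep=lambda s: None))  # OK
```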

How to use add_field in configuration file

Hi,
I want to add some fields to the event before beaver sends it to the redis server. I added 'add_field' to the config.ini file, but the values are not forwarded to stdout. The examples only show the 'tags' and 'type' metadata words. Is add_field working?

P.S. Beaver is a great tool, and I use it in a production environment. Sorry about my English :)

Can ujson >= 1.19 be used?

I'd like to use 1.23, since there's an rpm built for it and I wouldn't have to install compilers on all my machines to use this.

Do you know of any specific problems?

beaver events missing tags after some time

Noticed an issue with beaver last night when I started using it with a system that produces a good amount of logs. After some time of running, beaver stopped tagging the event stream it was sending to my redis queue. The events were still coming in, but the tags were missing. Restarting beaver corrected it. Not sure how to reproduce the issue or what sort of debug info would be helpful.

So far, about 12 hours after restarting to correct above issue, the problem has not occurred again.

Files specified with -p PATH or BEAVER_PATH are not checked

The directory specified with -p PATH or BEAVER_PATH is not checked.
The default directory of /var/log is similarly not checked.
Files specified with -f are checked, including those with globs.

touch /tmp/beaver-1.log

Window 1:
beaver -p /tmp/
Window 2:
date >> /tmp/beaver-1.log

No output. Output should be a JSON line with date.

Window 1:
BEAVER_PATH="/tmp" beaver
Window 2:
date >> /tmp/beaver-1.log

No output. Output should be a JSON line with date.

Window 1:
beaver
Window 2:
sudo date >> /var/log/beaver-1.log

No output. Output should be a JSON line with date.

Window 1:
beaver -f /tmp/beaver-1.log
Window 2:
date >> /tmp/beaver-1.log

Entry appears in "Window 1", as expected.

Window 1:
beaver -f /tmp/*.log
Window 2:
date >> /tmp/beaver-1.log

Entry appears in "Window 1", as expected.

current trunk outputs no data for stdout transport

When the transport is stdout there is zero data output.

This is because the utils.log() helper function is calling logging.log() but the default log instance does not define a stream handler, so the data is dropped on the floor.
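A minimal illustration of the fix, assuming the logger just needs a stream handler attached (this is a sketch, not Beaver's actual utils code):

```python
import logging
import sys

# A logger with no handler silently discards records, which is why the
# stdout transport emits nothing. Attaching a StreamHandler fixes that.
logger = logging.getLogger("beaver")
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter("[%(asctime)s] %(levelname)s %(message)s"))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("this line now reaches stdout")
```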

signal to force beaver to reconnect to redis

In my setup I'm thinking about having a number of redis instances behind a load balancer. This is for ease of maintenance and redundancy and such.

However, since beaver's redis connections are sticky, should I take one of them down and put it back in service, it will never get any more connections.

I think being able to send a HUP or something to beaver to make it reconnect to redis would be a great option. This would allow me to then tell my beavers to reconnect and the load balancer should take care of spreading things back out again.
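A sketch of what the proposed HUP handling could look like (all names hypothetical; a real transport's reconnect() would tear down and redial redis, letting the load balancer pick a fresh backend):

```python
import signal

class RedisTransport:
    """Toy transport tracking how many times it has (re)connected."""
    def __init__(self):
        self.connections = 1

    def reconnect(self):
        self.connections += 1

transport = RedisTransport()

def handle_hup(signum, frame):
    transport.reconnect()

# After this, `kill -HUP <beaver pid>` forces a reconnect.
signal.signal(signal.SIGHUP, handle_hup)
```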

Deprecate all environment variables

It is getting a bit messy to check both environment variables, configuration files, and parsed arguments. We should just support two things:

  • Argparse - It is good at what it does
  • Conf files - For specialization of files

I'll start putting in a deprecation notice in the next release, removing all references to env vars from the readme, and remove them completely by release 20.

amqp transport and RabbitMQ

Hi! Well done, I'd love to have a much more lightweight logstash shipper on our boxes! Anyway, is the amqp transport tailored to ZeroMQ or will it ship to a RabbitMQ queue as well?

More fields in @fields

Hello,
After the last patch I can add a field in config.ini, and it is forwarded in the output message. The problem is when I want to add more than one field.

In config.ini I wrote:

add_field: Env,RG,App,Yeti

In output message beaver send:

"@fields":{"Env":["RG","Yeti"]}
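Assuming the intent is that the comma-separated list alternates key/value pairs, a parser sketch (hypothetical, not Beaver's actual parsing) would produce one field per pair:

```python
def parse_add_field(value):
    """Parse 'add_field: Env,RG,App,Yeti' as alternating key,value pairs."""
    parts = [p.strip() for p in value.split(",")]
    if len(parts) % 2:
        raise ValueError("add_field needs an even number of items")
    return dict(zip(parts[::2], parts[1::2]))

print(parse_add_field("Env,RG,App,Yeti"))  # {'Env': 'RG', 'App': 'Yeti'}
```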

Logging from beaver

Hi,

At the moment, logging from beaver itself goes to stdout by default.
When daemonized it goes nowhere unless you specify '-o OUTPUT, --output OUTPUT'.
The naming of this option does not match what it does.

Can this be renamed to --logfile / -l?

string and json logfiles

Hi,

It is very unclear at the moment whether beaver supports mixing 'plain' (string) logs and json_event logs.
It should be possible to set the format per file so each is processed correctly.

Use setuptools instead of distutils for packaging

distutils gives warnings about how it doesn't know about install_requires.

Might make sense to use requires for distutils, but I don't know how that affects packaging. Any pythonistas who want to chime in would be very much appreciated.

Add support for start_position

From http://logstash.net/docs/1.1.3/inputs/file#setting_start_position

"Choose where logstash starts initially reading files - at the beginning or at the end. The default behavior treats files like live streams and thus starts at the end. If you have old data you want to import, set this to 'beginning'

This option only modifies "first contact" situations where a file is new and not seen before. If a file has already been seen before, this option has no effect."

Value can be any of: "beginning", "end"
Default value is "end"

This is very handy if you want to process the whole file (quite a common case in our scenario). Of course it must be coupled with the sincedb support discussed in #6, otherwise we will have duplicates at any Beaver (re)start.
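A sketch of the first-contact logic described above (names hypothetical, not Beaver's code):

```python
import os

def initial_offset(path, start_position="end", seen_before=False):
    """Choose the first read offset for a watched file.

    Mirrors logstash's start_position semantics: it only affects
    "first contact" files; a file already recorded in sincedb keeps
    its saved offset.
    """
    if seen_before:
        return None  # caller should use the sincedb offset instead
    if start_position == "beginning":
        return 0     # import old data from the top of the file
    return os.path.getsize(path)  # "end": treat the file as a live stream
```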

Beaver stops working after logrotate

Beaver==9 stops sending logs after log rotation. This is my logrotate configuration:

/var/log/celery/*.log
{
        compress
        copytruncate
        create 644 www-data www-data
        daily
        maxage 365
        maxsize 100M
        missingok
        nodelaycompress
        notifempty
        rotate 999
} 

And my beaver conf:

$ cat /etc/beaver.ini
[/var/log/celery/fc.*log]
type: celery
tags: celery,unstable
add_field: site,fc,server_type,unstable

$ REDIS_NAMESPACE='logstash' REDIS_URL="redis://...:6379/0" beaver -t redis -c /etc/beaver.ini
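With copytruncate, the inode stays the same but the file size drops below the tailer's saved offset, so the tailer can detect rotation and seek back to the start. A minimal detection sketch (not Beaver's actual code):

```python
import os

def was_truncated(path, last_offset):
    """Detect logrotate's copytruncate: if the file shrank below the
    saved read offset, reading should restart from the beginning."""
    try:
        return os.path.getsize(path) < last_offset
    except OSError:
        return True  # file was moved/removed (rotation without copytruncate)
```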

Unhandled Exception: 'NoneType' object has no attribute '__getitem__'

While tailing a logfile with beaver and writing to redis, the following exception occurs:

Unhandled Exception: 'NoneType' object has no attribute '__getitem__'

Setup:

  • Fedora 17
  • redis-2.4.10-1.fc17.x86_64
  • python-redis-2.4.9-2.fc17.noarch
  • beaver from git
  • commandline: REDIS_URL="redis://127.0.0.1:6379/0" sudo python beaver -t redis -f /var/log/mcollective.log

The same happens on CentOS 6.3 and Ubuntu 12.04 with different redis versions. Any idea?

-f does nothing at all

Regardless of what is used for -f, it will just fall back to scanning /var/log.

The only way I've found to correctly set the files to watch is by setting BEAVER_FILES.

Add a tcp transport

This should handle anyone who wants something that is "syslog" compatible, though with better options - such as redis - I only see this useful as a last resort.

use socket.getfqdn rather than socket.gethostname?

My machines are set up with a hierarchical fqdn and often have identical hostnames. For example: app1v-role.group.environment.location.example.com

Fortunately, socket.gethostname on these machines does return the full fqdn. However, the manual says that this is not always the case, and that if you always want the fqdn you should use the socket.getfqdn function instead.

I'd like to use getfqdn instead of gethostname. I need to be able to know which app1v-role I'm getting logs from and be able to rely on that being available going forward. I realize that it has the potential to break backward compatibility if it's just changed, since some people (my own local test bed included, for some reason) are only getting the hostname part and not the fqdn. Maybe this should be optional? With a command line flag or ini file setting?

Let me know your thoughts, and I'll make a pull request.

Or if you want, I can implement it both ways and you can just pick one!
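The optional-flag idea could be as simple as the following sketch (the flag name is hypothetical):

```python
import socket

def source_host(prefer_fqdn=True):
    # A config flag preserves backward compatibility for users whose
    # events already carry the short hostname.
    return socket.getfqdn() if prefer_fqdn else socket.gethostname()

print(source_host())
```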

Allow configuration of all arguments via config file

Allowing some things to be specified via config file but not connection info is inconsistent. Fixing this will also let us move forward with a threaded version where each file might have a slightly different configuration.
