ruby-cabin's Introduction

Logging kind of sucks.

I want:

Context and Structured Data

because logging with printf makes it hard to read later.

Why write code that's easy to maintain but not write logs that are the same? Structured data means you don't need crazy regular expression skills to make sense of logs.

Output logs to multiple targets

Why not log to a file, a database, and a websocket at the same time? What if you could log to any output logstash supported right from your application?

Log levels

What did the application programmer think of the importance and meaning of a log message?

Is the usual list of fatal, error, warning, info, and debug sufficient?

Easy shared logging configuration through an application

It should be easy for your entire application (and all libraries you use) to use the same logging configuration.

API that encourages tracking metrics, latencies, etc

Your applications and libraries would be vastly easier to debug, scale, and maintain if they exposed metrics about ongoing behaviors. Keep a count of HTTP hits by response code, count errors, time latencies, etc.

Separation of Data and View

Using printf or similar logging methods is bad because you are merging your data with your view of that data.

I want to be able to log in a structured way and have the log output know how that should be formatted. Maybe for humans, maybe for computers (as JSON), maybe as some other format. Maybe you want to log to a CSV file because that's easy to load into Excel for analysis, but you don't want to change all your application's log methods?
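To make the data/view split concrete, here is a minimal sketch using only Ruby's standard library. The `formatters` table and its `:json` and `:human` entries are names made up for this illustration; they are not part of any library's API.

```ruby
require "json"
require "time"

# One structured event: plain data, no formatting decisions yet.
event = {
  :timestamp => Time.utc(2011, 9, 25, 13, 44, 37).iso8601,
  :message   => "PAM: authentication error for illegal user",
  :user      => "amelia",
  :pid       => 4374
}

# Hypothetical formatters: each one is a different "view" of the same data.
formatters = {
  :json  => lambda { |e| JSON.generate(e) },
  :human => lambda { |e|
    extras = e.reject { |k, _| [:timestamp, :message].include?(k) }
    "#{e[:timestamp]} #{e[:message]} " + extras.map { |k, v| "#{k}=#{v}" }.join(" ")
  }
}

json_line  = formatters[:json].call(event)
human_line = formatters[:human].call(event)
puts json_line
puts human_line
```

The call site only ever builds the hash; switching the output format means swapping the formatter, not touching the code that logs.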

What is out there?

log4j has the context bits (see MDC and NDC).

Ruby's Logger has almost none of this. Same with Python's standard 'logging' module. Node doesn't really have any logging tools. Java has many, including log4j mentioned above, and misses much of the above.

Zoom out for a moment

Many logging tools are fixated on only one purpose. Some logging tools are for debugging and troubleshooting. Some are for logging usage for billing and accounting. Some logs are for recording transactions for rollback or replay.

Ultimately all of these things are, roughly, a timestamp and some data. Debug logs will have messages and context. Billing logs will have customer info and usage metrics. Transaction logs will include operations performed.

For troubleshooting-style logs, it can make sense to use a "level" concept where some logs have a higher degree of importance or different meaning. In billing logs, what is "info" vs "fatal," and would you even have such a thing?

We can do better than requiring three different kinds of log libraries and tools for each of these three problems.

Why experiment with this?

Logging plain-text strings is just plain shit. You need to be a regexp ninja to make any kind of aggregated sense out of anything more than a single log event.

  • How many customers signed up yesterday?
  • Have any recent database queries failed?
  • What is the average SQL query latency in the past hour?
  • How many unique users are visiting the site?
  • What's in my logs that matters to my goals? (Business or otherwise?)

Lots of this data finds its way into your logs (rather than your metrics/graphing systems).

How about we skip the level 70 Regular Expression skill requirement? Log structured data, yo. Pretty sure every language can parse JSON. Don't like JSON? That's fine, JSON is just a serialization - a data representation - there are plenty of choices...

... but I digress. Your applications have context at the time of logging. Most of the time you try to embed it in some silly printf or string-interpolated meatball, right? Stop that.

Instead of code like this:

logger.error("#{hostname} #{program}[#{pid}]: error: PAM: authentication error for illegal user #{user} from #{client}")

and output like this:

Sep 25 13:44:37 fbsd1 sshd[4374]: error: PAM: authentication error for illegal user amelia from e210255180014.ec-userreverse.dion.ne.jp

and a regex to parse it like this:

/haha, just kidding I'm not writing a regex to parse that crap./

How about this:

logger.error("PAM: authentication error for illegal user", {
  :hostname => "fbsd1",
  :program => "sshd",
  :pid => 4374,
  :user => "amelia",
  :client => "e210255180014.ec-userreverse.dion.ne.jp"
})

And output in any structured data format, like json:

{ 
  "timestamp": "2011-09-25T13:44:37.034Z",
  "message": "PAM: authentication error for illegal user",
  "hostname": "fbsd1",
  "program": "sshd",
  "pid": 4374,
  "user": "amelia",
  "client": "e210255180014.ec-userreverse.dion.ne.jp"
}

Log structured stuff and you can trivially do some nice analytics and searching on your logs.
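As a rough illustration of that kind of analysis, the sketch below counts authentication failures per user from JSON lines. The sample log lines are invented for the example, and only Ruby's bundled `json` library is used.

```ruby
require "json"

# Sample JSON-lines log data (invented for illustration).
log_lines = [
  '{"message":"PAM: authentication error for illegal user","user":"amelia","program":"sshd"}',
  '{"message":"PAM: authentication error for illegal user","user":"bob","program":"sshd"}',
  '{"message":"session opened","user":"carol","program":"sshd"}',
  '{"message":"PAM: authentication error for illegal user","user":"amelia","program":"sshd"}'
]

# Parsing is one call per line; no regular expressions involved.
events = log_lines.map { |line| JSON.parse(line) }

failures = events.select { |e| e["message"].start_with?("PAM: authentication error") }

# Tally failures per user.
failures_by_user = Hash.new(0)
failures.each { |e| failures_by_user[e["user"]] += 1 }

puts failures_by_user.inspect
```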

Latency matters.

Want to time something and log it?

n = 30
logger[:input] = n
logger.time("fibonacci duration") do
  fib(n)
end

Output in JSON format:

{
  "timestamp":"2011-10-11T02:17:12.447487-0700",
  "input":30,
  "message":"fibonacci duration",
  "duration":1.903017632
}
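For readers curious how a timing helper like this might work, here is a standalone sketch. `timed_event` is a hypothetical helper written for this example; it is not Cabin's implementation and makes no claim about how `time` works internally.

```ruby
# Hypothetical helper, not Cabin's implementation: time a block and
# return both the block's result and an event hash with the elapsed seconds.
def timed_event(message, context = {})
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  duration = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  event = context.merge(:message => message, :duration => duration)
  [result, event]
end

# Naive recursive Fibonacci, used only as something to time.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

n = 20
result, event = timed_event("fibonacci duration", :input => n) { fib(n) }
puts event.inspect
```

A monotonic clock is used so that the measured duration is not affected by wall-clock adjustments.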

Metrics like counters?

metrics = Cabin::Channel.new
logger = Cabin::Channel.new

# Pretend you can subscribe rrdtool, graphite, or statsd to the 'metrics'
# channel. Pretend the 'logger' channel has been subscribed to something
# that writes to stdout or disk.

begin
  logger.time("Handling request") do
    handle_request
  end
  metrics[:hits] += 1
rescue => e
  metrics[:errors] += 1
  logger.error("An error occurred", :exception => e, :backtrace => e.backtrace)
end
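The counting pattern above can be tried without any subscribers attached. The sketch below uses a hypothetical `Counters` class (not Cabin's metrics code) and a made-up request handler that fails for one path.

```ruby
# Hypothetical thread-safe counter store (not Cabin's Metrics class).
class Counters
  def initialize
    @lock = Mutex.new
    @counts = Hash.new(0)
  end

  # Add `by` to the named counter, creating it at zero if needed.
  def increment(name, by = 1)
    @lock.synchronize { @counts[name] += by }
  end

  # Return a snapshot copy of all counters.
  def to_h
    @lock.synchronize { @counts.dup }
  end
end

counters = Counters.new

# Simulated request handler: raises for one of the inputs.
handle_request = lambda { |path| raise ArgumentError, "bad path" if path == "/boom" }

["/", "/about", "/boom"].each do |path|
  begin
    handle_request.call(path)
    counters.increment(:hits)
  rescue StandardError
    counters.increment(:errors)
  end
end

puts counters.to_h.inspect
```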

ruby-cabin's People

Contributors

driskell, grantr, gsandie, jlambert121, jordansissel, jsvd, portertech, sanemat

ruby-cabin's Issues

Context data

I discovered your library yesterday and I really like it. I hate printf logging more and more with each project, but I never managed to find a nice way to wrap an alternative.

I started playing around with the idea of using a hash-like syntax instead of plain strings after I stumbled upon logstash. The closest thing I found to my needs is radar (http://radargem.com/), which does something similar but is aimed at handling errors.

There is a nice thing in radar that you don't seem to have in your gem (though I admit I have not taken a deep look at it yet): a way to have contextual information added automatically, without having to include it in every message you log.

Here is some code to give you a better idea of what I mean:

class AddUserInfo
  def update_message(msg)
    if Thread.current[:user]
      msg.merge(
          :user_id => Thread.current[:user].id
        )
    end
  end
end

logger.add_extension(AddUserInfo)

logger.info("Yeah it worked", :something => 123)
# => {"message" => "Yeah it worked", "user_id" => 9}

logger.remove_extension(AddUserInfo)

Something like this could also be useful:

logger.with_extension(AddUserInfo) do
  logger.info(...)
end

What do you think of this?
This is something I could work on, but only if there is a chance it can get merged into the gem.

Edit: After looking into the code I found the context class which looks similar to what I want to achieve with this but not as flexible.
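To make the proposal easier to discuss, here is a self-contained sketch of the idea. Everything in it (`ContextLogger`, `add_extension`, `remove_extension`, `with_extension`) is hypothetical and exists neither in Cabin nor in radar; extensions are modeled as callables that return extra fields or nil.

```ruby
# Hypothetical logger with pluggable context providers. The names
# ContextLogger, add_extension, and with_extension are illustrative only.
class ContextLogger
  attr_reader :events

  def initialize
    @extensions = []
    @events = []
  end

  def add_extension(ext)
    @extensions << ext
  end

  def remove_extension(ext)
    @extensions.delete(ext)
  end

  # Enable an extension only for the duration of the block.
  def with_extension(ext)
    add_extension(ext)
    yield
  ensure
    remove_extension(ext)
  end

  def info(message, data = {})
    event = data.merge(:message => message)
    # Each extension returns extra fields (or nil) to merge into the event.
    @extensions.each { |ext| event = event.merge(ext.call(event) || {}) }
    @events << event
    event
  end
end

# Adds the current user's id when one is set on the thread.
add_user_info = lambda do |_event|
  user_id = Thread.current[:user_id]
  user_id ? { :user_id => user_id } : nil
end

logger = ContextLogger.new
Thread.current[:user_id] = 9
logger.with_extension(add_user_info) do
  logger.info("Yeah it worked", :something => 123)
end
logger.info("No extension here")
```

Returning nil from an extension means "nothing to add", so a missing user does not wipe out the event.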

include globally shortcut?

Twp/logging has a nice feature where it's possible to call Logging.globally and have a logger instance created and named for every instance. This is done via this code; it's very convenient.

I've no idea whether there are any drawbacks to it.

Would that be feasible to copy (assuming drawbacks are minimal)?

Add `trace` method

Background: Logstash has moved to log4j2 instead of Cabin. With Logstash wanting plugins to run on both Logstash 2.4 and 5.x, and log4j2 providing a trace method for logging, we want to support log4j2-style trace logs in Logstash plugins running on 2.4 (without log4j2).

So, let's add a trace log level to Cabin.

Emit logs as JSON rather than Ruby.inspect when writing to a file

As discussed on IRC.

I'd like to try experimenting with logging as JSON & handling un-encodable characters when you give ruby-cabin a filehandle as a handler.

I figured it'd be better to do it here: smaller gem to work with & non-logstash users can benefit. Also, I don't have to use JIRA, and that always makes me 😄

Alternative: provide a bundled handler with ruby-cabin that does this.

add configuration file support

It would be really great to add the ability to set a configuration file which defines loglevel for each logger, and also which output(s) to use.

Something like log4j.xml, for example:

    <?xml version="1.0" encoding="UTF-8"?>
    <Configuration status="debug" strict="true" name="XMLConfigTest"
                   packages="org.apache.logging.log4j.test">
      <Appenders>
        <Appender type="Console" name="STDOUT">
          <Layout type="PatternLayout" pattern="%m MDC%X%n"/>
        </Appender>
        <Appender type="File" name="File" fileName="${filename}">
          <Layout type="PatternLayout">
            <Pattern>%d %p %C{1.} [%t] %m%n</Pattern>
          </Layout>
        </Appender>
      </Appenders>

      <Loggers>
        <Logger name="org.apache.logging.log4j.test1" level="debug" additivity="false">
          <AppenderRef ref="STDOUT"/>
        </Logger>

        <Logger name="org.apache.logging.log4j.test2" level="debug" additivity="false">
          <AppenderRef ref="File"/>
        </Logger>

        <Root level="trace">
          <AppenderRef ref="File"/>
        </Root>
      </Loggers>

    </Configuration>

Cannot subscribe to Logger.new(STDOUT)

I have started looking for a logger I can output directly to JSON (or at least the message) for easy import to logstash. Thought I would give this a try.

On OSX 10.8.2 I get an error whenever I try to subscribe to a Logger instance:

$ irb
>> require 'cabin'
=> true
>> log = Cabin::Channel.new
=> #<Cabin::Channel:0x1037b8690 @subscriber_lock=#<Mutex:0x1037b85c8>, @data={}, @subscribers={}, @level=:info, @metrics=#<Cabin::Metrics:0x1037b85f0 @metrics_lock=#<Mutex:0x1037b85a0>, @channel=#<Cabin::Channel:0x1037b8690 ...>, @metrics={}>>
>> log.subscribe(Logger.new(STDOUT))
=> 2176688180
>> log.info "vomit"
NoMethodError: undefined method `downcase' for :info:Symbol
    from /Library/Ruby/Gems/1.8/gems/cabin-0.6.0/lib/cabin/outputs/stdlib-logger.rb:20:in `<<'
    from /Library/Ruby/Gems/1.8/gems/cabin-0.6.0/lib/cabin/channel.rb:173:in `publish'
    from /Library/Ruby/Gems/1.8/gems/cabin-0.6.0/lib/cabin/channel.rb:172:in `each'
    from /Library/Ruby/Gems/1.8/gems/cabin-0.6.0/lib/cabin/channel.rb:172:in `publish'
    from /Library/Ruby/Gems/1.8/gems/cabin-0.6.0/lib/cabin/channel.rb:171:in `synchronize'
    from /Library/Ruby/Gems/1.8/gems/cabin-0.6.0/lib/cabin/channel.rb:171:in `publish'
    from /Library/Ruby/Gems/1.8/gems/cabin-0.6.0/lib/cabin/mixins/logger.rb:102:in `_log'
    from /Library/Ruby/Gems/1.8/gems/cabin-0.6.0/lib/cabin/mixins/logger.rb:79:in `log_with_level'
    from /Library/Ruby/Gems/1.8/gems/cabin-0.6.0/lib/cabin/mixins/logger.rb:64:in `info'
    from (irb):4

Is there something I am doing wrong or is this a bug? Ruby 1.8.7 (no rvm), Cabin 0.6.0.

Remove Ruby 1.8.7 support

I am wondering what the opinion on removing Ruby 1.8.7 support would be? It is officially end-of-life and the Travis tests are failing on 1.8.7. Moreover, it is somewhat difficult to debug on 1.8.7, since ruby-install has dropped it and there is no official 1.8.7 Docker image either.

I would be glad to do the work. I just want to make sure that it would be useful.

Convenient class methods for logging

The laziest of our trade (i.e. me) would love to have some "wundervolle" class methods like:

require 'cabin'
...
class Foo
include Cabin::MagicClassStuffWithPhaserBeams
...
  Log.info('using class method', awesome: true)

To my understanding logging is mostly a global issue by application (at least for my stuff) so class methods are fine.

Add new mixin providing terminal/ui reporting output.

There are cases where a program wishes to report things to a human watching the screen, but if there is no human watching, then there is sometimes no benefit in actually logging.

Origin: elastic/logstash#2163

Details: Add a new mixin that adds a 'terminal' method to Cabin::Channel that emits output to any subscriber that is a tty. Include this mixin in Cabin::Channel by default.

Parsing backtrace doesn't work in ruby 1.8.7

So if you do, you get a nasty exception:

undefined method `[]' for nil:NilClass
/home/dkowis/.rvm/gems/ruby-1.8.7-p371/gems/cabin-0.6.0/lib/cabin/mixins/logger.rb:112:in `debugharder'
/home/dkowis/.rvm/gems/ruby-1.8.7-p371/gems/cabin-0.6.0/lib/cabin/mixins/logger.rb:100:in `_log'
/home/dkowis/.rvm/gems/ruby-1.8.7-p371/gems/cabin-0.6.0/lib/cabin/mixins/logger.rb:79:in `log_with_level'
/home/dkowis/.rvm/gems/ruby-1.8.7-p371/gems/cabin-0.6.0/lib/cabin/mixins/logger.rb:64:in `debug'
./lib/tasks/packaging.rake:87

Diagnosing with puts'es from a project of mine:
This stems from the BACKTRACE_RE:

BACKTRACE_RE: (?-mix:([^:]+):([0-9]+):in `(.*)')

The callinfo variable when used in a rake task is:

CALLINFO: ./lib/tasks/packaging.rake:87

So yeah, that regexp doesn't match, and then it's all over :(
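The mismatch can be reproduced without Cabin. In the sketch below, `strict_re` is the pattern quoted above written as a Ruby literal, while `tolerant_re` and `parse_frame` are assumptions made for this example, not the fix that was applied to the gem.

```ruby
# The pattern quoted above, written as a Ruby literal.
strict_re = /([^:]+):([0-9]+):in `(.*)'/

# A more tolerant variant (an assumption, not Cabin's fix): the
# ":in ..." suffix is optional, and the opening quote may be ` or '.
tolerant_re = /\A(.+?):([0-9]+)(?::in [`'](.*)')?\z/

# Returns a hash describing the frame, or nil when the pattern does not match.
def parse_frame(frame, re)
  m = re.match(frame)
  return nil unless m
  { :file => m[1], :line => m[2].to_i, :method => m[3] }
end

with_method = "lib/cabin/mixins/logger.rb:64:in `debug'"
top_level   = "./lib/tasks/packaging.rake:87"

strict_ok   = parse_frame(with_method, strict_re)
strict_miss = parse_frame(top_level, strict_re)
tolerant_ok = parse_frame(top_level, tolerant_re)

p strict_ok
p strict_miss
p tolerant_ok
```

Whatever pattern is used, checking for a nil match before indexing into it avoids the `undefined method '[]' for nil` failure.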

#Channel.time could allow mid-block logs emitting so-far-duration

This is nice:

logger.time("Doing stuff") do
  do_something
end

But this would be better -

logger.time("Doing stuff") do
  logger.info("foo")
  do_something1
  logger.info("bar")
  do_something2
end

In the above example, logger.info("foo") while nested in a 'time' block should emit something like:

{ "message": "foo", "parent": "Doing stuff", "duration": 0.124 }
{ "message": "bar", "parent": "Doing stuff", "duration": 0.547 }

Or something like that, basically maintaining "we are in a timer block" ? Maybe not 'parent' but something like it.

Missing 0.9.0 tag

Hi,

It seems the 0.9.0 release available as a gem hasn't been tagged; would it be possible to do so?

Thank you!

Unexpected error, using Cabin to log Aws debug output

I'm enjoying the approach you've taken, regarding structured logs, contexts, and metrics. I'm quite new to the library, so perhaps I'm missing something... But, I have found that when I use Cabin as the logger for Aws and turn on Aws wire trace debugging, Cabin dies. I'm using ruby-2.1.2, aws-sdk-2.0.30, and cabin-0.7.1, and I've put together a gist, to help demonstrate.

In a nutshell, it seems like Cabin::Channel isn't a drop in replacement for a logger ... Or I haven't set it up correctly. So, when Aws tries to append (calling #<<) to the configured logger, Cabin can not respond to the message.

I suppose the easiest solution would be to create a Logger object, have a channel subscribe to it, and then have Aws log to the Logger. I was hoping to avoid that, because I'd like to have consistent output. My application outputs JSON, adding ruby's silly prefix to my logs seems like a waste.
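One possible shape for a workaround is a thin adapter that responds to #<< and forwards each chunk as a structured event. The sketch below is hypothetical: `AppendAdapter` and the `:source` field are invented for the example, and the block stands in for whatever would publish the event to a channel.

```ruby
# Hypothetical adapter: exposes the #<< that append-style log targets
# expect and forwards each chunk as a structured event.
class AppendAdapter
  def initialize(&sink)
    @sink = sink
  end

  def <<(text)
    @sink.call(:message => text.to_s.chomp, :source => "wire-trace")
    self   # returning self allows chained appends
  end
end

events = []
adapter = AppendAdapter.new { |event| events << event }

adapter << "opening connection\n" << "-> GET /\n"
puts events.inspect
```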

Time.now.strftime("%6N") doesn't work in Ruby 1.8.7.

$ ruby --version
ruby 1.8.7 (2010-06-23 patchlevel 299) [x86_64-linux]
$ ruby -e 'print Time.now.strftime("%Y-%m-%dT%H:%M:%S.%6N%z\n")'
2014-01-30T20:36:36.   %6N+0400

This leads to log messages in fpm that look like this:

{:level=>:info, :message=>"Searching for Twisted==13.0.0", :timestamp=>"2014-01-30T20:08:23.   %6N+0400"}
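One way to sidestep the unsupported format directive is to format the fractional part separately with Time#usec. This is a sketch of that approach, not necessarily the change made in the gem; `portable_timestamp` is a name chosen for the example.

```ruby
# Build an ISO8601-style timestamp with microseconds without "%N".
# Time#usec is used instead, which older Rubies also provide.
def portable_timestamp(time)
  time.strftime("%Y-%m-%dT%H:%M:%S") + (".%06d" % time.usec) + time.strftime("%z")
end

t = Time.at(1391099796, 123)   # 123 microseconds past the second
stamp = portable_timestamp(t)
puts stamp
```

On Rubies that do support "%6N", the result should match `strftime("%Y-%m-%dT%H:%M:%S.%6N%z")` for the same time.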

logstash-ready by default

I want to create structured logs to feed logstash with no effort (think: lazy-driven engineering).
The provided examples mostly log to STDOUT, and that output is not ready for logstash to parse (for example as JSON) because it's a serialized hash.

Since ruby-cabin's USP is structured logging, I think structured output should be built in.

@jordansissel What do you think?

Methods created with `define_method` seem .... slower ?

Under Ruby 2.2.1 and Jruby 1.7.24 (and other versions), Cabin's Logger definitions of things like debug? and debug make Ruby choose some very poor things. At least, this is what @colinsurprenant and I have observed. The reason we have this hypothesis was helping a Logstash user with high cpu usage and noticed several jstack results with the high-cpu thread running non-JIT'd code (JRuby interpreter) in code that probably should be JIT'd at that point in the runtime, at least by our assumptions ;P

Code to highlight this:

require "cabin"
require "benchmark"

c = Cabin::Channel.get
iterations = 100_000

msg = "ok"
Benchmark.bmbm(30) do |x|
  [:info, :debug].each do |level|
    c.level = level
    x.report("#{level}: nothing") do
      iterations.times { }
    end
    x.report("#{level}: debug? (#{c.debug?})") do
      iterations.times { c.debug? }
    end
    x.report("#{level}: debug? && debug() (#{c.debug?})") do
      iterations.times { c.debug? && c.debug(msg) }
    end
    x.report("#{level}: debug() (#{c.debug?})") do
      iterations.times { c.debug(msg) }
    end
  end
end

The short-circuit debug? && debug(...) is ineffective, for some reason, in both MRI and JRuby. The following is the report from the benchmark code (pasted above) on JRuby 1.7.24.

(I ignored the warmup/rehearsal output)

                                      user     system      total        real
info: nothing                     0.000000   0.000000   0.000000 (  0.002000)
info: debug? (false)              0.040000   0.000000   0.040000 (  0.057000)
info: debug? && debug() (false)   8.540000   0.040000   8.580000 (  8.601000)
info: debug() (false)             9.290000   0.020000   9.310000 (  9.371000)
debug: nothing                    0.000000   0.000000   0.000000 (  0.003000)
debug: debug? (true)              0.070000   0.000000   0.070000 (  0.094000)
debug: debug? && debug() (true)   9.400000   0.030000   9.430000 (  9.459000)
debug: debug() (true)             8.980000   0.040000   9.020000 (  9.048000)

Notice that debug? && debug() and debug() had similar runtimes, where the expectation is that debug? && ... would short circuit and not invoke debug() at all, and should have similar runtime to the debug? alone. I tested debug? && debug(), debug? and debug(), and debug() if debug? and all had the same runtime results -- unexpectedly slow.

Even going to an extreme and trying to precompute the predicate has the same slow performance.

I'm still working on a more comprehensive benchmark so we can try to isolate and fix the problem...

Add log level option to subscribe method

The idea behind this is to allow

logger = Cabin::Channel.get
logger.level = :info
logger.subscribe(File.open("/tmp/log", "a"), :info) # :debug would be the same as nil
logger.subscribe(STDOUT, :warn)

Such that:

  • logger.debug("test") won't be sent anywhere
  • logger.info("test") is sent only to the logfile
  • logger.fatal("test") is sent to both
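The intended behavior can be sketched in isolation. `LeveledChannel` below is a hypothetical class written for this example, not Cabin's API, and StringIO objects stand in for the log file and STDOUT so the result can be inspected.

```ruby
require "stringio"

# Hypothetical channel with a minimum level per subscriber.
# This is not Cabin's API, only an illustration of the idea.
class LeveledChannel
  LEVELS = [:debug, :info, :warn, :error, :fatal]

  def initialize(level = :info)
    @level = level
    @subscribers = []
  end

  # min_level defaults to :debug, i.e. no extra filtering.
  def subscribe(io, min_level = :debug)
    @subscribers << [io, min_level]
  end

  # Define debug, info, warn, error, and fatal.
  LEVELS.each do |name|
    define_method(name) { |message| publish(name, message) }
  end

  private

  def rank(level)
    LEVELS.index(level)
  end

  def publish(level, message)
    return if rank(level) < rank(@level)        # channel-wide threshold
    @subscribers.each do |io, min_level|
      next if rank(level) < rank(min_level)     # per-subscriber threshold
      io.puts("#{level}: #{message}")
    end
  end
end

logfile = StringIO.new   # stands in for File.open("/tmp/log", "a")
console = StringIO.new   # stands in for STDOUT

logger = LeveledChannel.new(:info)
logger.subscribe(logfile, :info)
logger.subscribe(console, :warn)

logger.debug("test")
logger.info("test")
logger.fatal("test")
```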

@logger modifies hashes passed to it

This threw me as I was debugging a third-party library which throws exceptions if unknown keys are in the hash.

require 'rubygems'
require 'cabin'

logger = Cabin::Channel.new

somehash = {:server => "foo", :port => "5555"}
puts "before: #{somehash}"

logger.info("config:", somehash)
puts "after: #{somehash}"
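For comparison, here is a sketch of a logging call that leaves the caller's hash alone by building the event with Hash#merge, which returns a new hash. `build_event` and the `:level` field are hypothetical and only illustrate the non-mutating approach.

```ruby
# Hypothetical logging function that adds fields without mutating
# the caller's hash: Hash#merge returns a new hash.
def build_event(message, data = {})
  data.merge(:message => message, :level => :info)
end

somehash = { :server => "foo", :port => "5555" }
before = somehash.dup

event = build_event("config:", somehash)

puts "event: #{event.inspect}"
puts "after: #{somehash.inspect}"
```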

Cabin 2: What have we learned?

This ticket aims to track good and bad things learned from the cabin experiment so far.

My intent is to drop features that are bad, keep ones that are good, and pave the way for new hopefully-good features.

Cabin should redact/replace non UTF-8 characters

Cabin causes my process to die, when it encounters non UTF-8 characters such as binary data. My application moves command output around, and tries to play nice with the encoding. Logging the output leads to an encode() fail.

Currently on phone, will post example.
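Until a reproduction is posted, here is one way such input could be made safe before serialization. It is a sketch under assumptions: `sanitize` is a hypothetical helper, the sample bytes are invented, and String#scrub requires Ruby 2.1 or later.

```ruby
require "json"

# Replace bytes that are invalid in UTF-8 before serializing.
# String#scrub is available in Ruby 2.1 and later.
def sanitize(value)
  text = value.dup.force_encoding(Encoding::UTF_8)
  text.valid_encoding? ? text : text.scrub("?")
end

raw = "output: \xFF done".b   # a stray binary byte mixed into command output
clean = sanitize(raw)
line = JSON.generate("message" => clean)
puts line
```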

Logs not showing up in logstash's own logfile until shutdown

If only a few messages (2-3) are produced, they don't show up in logstash's own logfile until the process is shut down.

Test with:

bin/logstash -e 'input { generator { count => 1000} } filter { ruby { code => "sleep 1" exclude_tags => [""] } } output { stdout { codec => rubydebug } }'  --log test.log

Now run tailf test.log; it doesn't show anything.

If logstash is shut down, the deprecated warning with the timestamp of startup suddenly appears.
