shopify / rotoscope

High-performance logger of Ruby method invocations

License: MIT License

Ruby 69.94% C 29.64% Shell 0.42%
ruby introspection callgraph high-performance invocation

rotoscope's Introduction

Rotoscope

Rotoscope is a high-performance logger of Ruby method invocations.

Status

Rotoscope is subject to breaking changes in minor versions until 1.0 is available.

Example

require 'rotoscope'

class Dog
  def bark
    Noisemaker.speak('woof!')
  end
end

class Noisemaker
  def self.speak(str)
    puts(str)
  end
end

log_file = File.expand_path('dog_trace.log')
puts "Writing to #{log_file}..."

Rotoscope::CallLogger.trace(log_file) do
  dog1 = Dog.new
  dog1.bark
end

The resulting method calls are saved in the specified dest in the order they were received.

Sample output:

entity,method_name,method_level,filepath,lineno,caller_entity,caller_method_name,caller_method_level
Dog,new,class,example/dog.rb,19,<ROOT>,<UNKNOWN>,<UNKNOWN>
Dog,initialize,instance,example/dog.rb,19,Dog,new,class
Dog,bark,instance,example/dog.rb,20,<ROOT>,<UNKNOWN>,<UNKNOWN>
Noisemaker,speak,class,example/dog.rb,5,Dog,bark,instance
Noisemaker,puts,class,example/dog.rb,11,Noisemaker,speak,class
IO,puts,instance,example/dog.rb,11,Noisemaker,puts,class
IO,write,instance,example/dog.rb,11,IO,puts,instance
IO,write,instance,example/dog.rb,11,IO,puts,instance
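
As a rough illustration of how this output can be consumed, the sketch below parses the sample rows above and counts calls per receiving entity. It assumes no field contains a comma or quote (true for this sample, not guaranteed in general), so a plain split is enough; the variable names are illustrative only.

```ruby
# Sample trace copied from the output above.
sample = <<~TRACE
  entity,method_name,method_level,filepath,lineno,caller_entity,caller_method_name,caller_method_level
  Dog,new,class,example/dog.rb,19,<ROOT>,<UNKNOWN>,<UNKNOWN>
  Dog,initialize,instance,example/dog.rb,19,Dog,new,class
  Dog,bark,instance,example/dog.rb,20,<ROOT>,<UNKNOWN>,<UNKNOWN>
  Noisemaker,speak,class,example/dog.rb,5,Dog,bark,instance
  Noisemaker,puts,class,example/dog.rb,11,Noisemaker,speak,class
  IO,puts,instance,example/dog.rb,11,Noisemaker,puts,class
  IO,write,instance,example/dog.rb,11,IO,puts,instance
  IO,write,instance,example/dog.rb,11,IO,puts,instance
TRACE

lines = sample.lines.map(&:chomp)
header = lines.shift.split(",")

# Turn each row into a Hash keyed by column name.
calls = lines.map { |line| header.zip(line.split(",")).to_h }

# Count how many calls each entity received.
calls_per_entity = calls.map { |call| call["entity"] }.tally

# Collect the calls made from the top of the trace.
root_calls = calls.select { |call| call["caller_entity"] == "<ROOT>" }
                  .map { |call| "#{call['entity']}.#{call['method_name']}" }
```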

API

Default Logging Interface

Rotoscope ships with a default logger, Rotoscope::CallLogger. This provides a simple-to-use interface to the tracing engine that maintains performance as much as possible.

Rotoscope::CallLogger.trace(dest, excludelist: [])

Writes all calls of methods to dest, except for those whose filepath contains any entry in excludelist. dest is either a filename or an IO. Methods invoked at the top of the trace will have a caller entity of <ROOT> and a caller method name of <UNKNOWN>.

Rotoscope::CallLogger.trace(dest) { |call| ... }
# or...
Rotoscope::CallLogger.trace(dest, excludelist: ["/.gem/"]) { |call| ... }
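
The substring rule described above can be pictured with a small stand-alone sketch. The `excluded?` helper is hypothetical and is not part of Rotoscope's API; it only illustrates what "filepath contains any entry in excludelist" means.

```ruby
# Hypothetical helper: true when the path contains any excludelist entry
# as a plain substring.
def excluded?(filepath, excludelist)
  excludelist.any? { |entry| filepath.include?(entry) }
end

gem_path = "/home/me/.gem/ruby/gems/rake/lib/rake.rb"
app_path = "/home/me/app/models/dog.rb"

gem_excluded = excluded?(gem_path, ["/.gem/"])
app_excluded = excluded?(app_path, ["/.gem/"])
```

Under this reading, an empty excludelist excludes nothing, which matches the default of `[]`.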

Rotoscope::CallLogger.new(dest, excludelist: [])

Same interface as Rotoscope::CallLogger::trace, but returns a Rotoscope::CallLogger instance, allowing fine-grained control via Rotoscope::CallLogger#start_trace and Rotoscope::CallLogger#stop_trace.

rs = Rotoscope::CallLogger.new(dest)
# or...
rs = Rotoscope::CallLogger.new(dest, excludelist: ["/.gem/"])

Rotoscope::CallLogger#trace(&block)

Similar to Rotoscope::CallLogger::trace, but does not need to create a file handle on invocation.

rs = Rotoscope::CallLogger.new(dest)
rs.trace do |rotoscope|
  # code to trace...
end

Rotoscope::CallLogger#start_trace

Begins writing method calls to the dest specified in the initializer.

rs = Rotoscope::CallLogger.new(dest)
rs.start_trace
# code to trace...
rs.stop_trace

Rotoscope::CallLogger#stop_trace

Stops writing method invocations to the dest. Subsequent calls to Rotoscope::CallLogger#start_trace may be invoked to resume tracing.

rs = Rotoscope::CallLogger.new(dest)
rs.start_trace
# code to trace...
rs.stop_trace

Rotoscope::CallLogger#mark(str = "")

Inserts a marker '--- ' to divide output. Useful for segmenting multiple blocks of code that are being profiled. If str is provided, the line will be prefixed by '--- ', followed by the string passed.

rs = Rotoscope::CallLogger.new(dest)
rs.start_trace
# code to trace...
rs.mark('Something goes wrong here') # produces `--- Something goes wrong here` in the output
# more code ...
rs.stop_trace

Rotoscope::CallLogger#close

Flushes the buffer and closes the file handle. Once this is invoked, no more writes can be performed on the Rotoscope::CallLogger object. Sets state to :closed.

rs = Rotoscope::CallLogger.new(dest)
rs.trace { |rotoscope| ... }
rs.close

Rotoscope::CallLogger#state

Returns the current state of the Rotoscope::CallLogger object. Valid values are :open, :tracing and :closed.

rs = Rotoscope::CallLogger.new(dest)
rs.state # :open
rs.trace do
  rs.state # :tracing
end
rs.close
rs.state # :closed

Rotoscope::CallLogger#closed?

Shorthand to check if the state is set to :closed.

rs = Rotoscope::CallLogger.new(dest)
rs.closed? # false
rs.close
rs.closed? # true

Low-level API

For those who prefer to define their own logging logic, Rotoscope also provides a low-level API. This is the same one used by Rotoscope::CallLogger internally. Users may specify a block that is invoked on each detected method call.

Rotoscope.new(&blk)

Creates a new instance of the Rotoscope class. The block argument is invoked on every call detected by Rotoscope. The block is passed the same instance returned by Rotoscope.new, allowing the low-level methods to be called on it.

rs = Rotoscope.new do |call|
  # We likely don't want to record calls to Rotoscope
  next if self == call.receiver
  ...
end

Rotoscope#trace(&blk)

The equivalent of calling Rotoscope#start_trace and then Rotoscope#stop_trace. The call to #stop_trace is within an ensure block so it is always called when the block terminates.

rs = Rotoscope.new do |call|
  ...
end

rs.trace do
  # call some code
end
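
The start/ensure/stop shape described above can be sketched with a stand-in class. `TinyTracer` is hypothetical and does no real tracing; it only shows why the ensure block guarantees that tracing stops when the traced block raises.

```ruby
# Hypothetical stand-in, not Rotoscope's implementation.
class TinyTracer
  def initialize
    @tracing = false
  end

  def tracing?
    @tracing
  end

  def start_trace
    @tracing = true
  end

  def stop_trace
    @tracing = false
  end

  # Equivalent to start_trace followed by stop_trace, with the stop
  # placed in an ensure clause so it runs even if the block raises.
  def trace
    start_trace
    yield self
  ensure
    stop_trace
  end
end

tracer = TinyTracer.new
state_inside_block = nil

begin
  tracer.trace do |t|
    state_inside_block = t.tracing?
    raise "boom"
  end
rescue RuntimeError
  # The error still propagates; tracing has already been stopped.
end

state_after_block = tracer.tracing?
```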

Rotoscope#start_trace

Begins detecting method calls invoked after this point.

rs = Rotoscope.new do |call|
  ...
end

rs.start_trace
# Calls after this point invoke the
# block passed to `Rotoscope.new`

Rotoscope#stop_trace

Stops detecting method calls invoked after this point.

rs = Rotoscope.new do |call|
  ...
end

rs.start_trace
...
rs.stop_trace
# Calls after this point will no longer
# invoke the block passed to `Rotoscope.new`

Rotoscope#tracing?

Identifies whether the Rotoscope object is actively tracing method calls.

rs = Rotoscope.new do |call|
  ...
end

rs.tracing? # => false
rs.start_trace
rs.tracing? # => true

Rotoscope#receiver

Returns the object that the method is being called against.

rs = Rotoscope.new do |call|
  call.receiver # => #<Foo:0x00007fa3d2197c10>
end

Rotoscope#receiver_class

Returns the class of the object that the method is being called against.

rs = Rotoscope.new do |call|
  call.receiver_class # => Foo
end

Rotoscope#receiver_class_name

Returns the stringified class of the object that the method is being called against.

rs = Rotoscope.new do |call|
  call.receiver_class_name # => "Foo"
end

Rotoscope#method_name

Returns the name of the method being invoked.

rs = Rotoscope.new do |call|
  call.method_name # => "bar"
end

Rotoscope#singleton_method?

Returns true if the method called is defined at the class level. If the call is to an instance method, this returns false.

rs = Rotoscope.new do |call|
  call.singleton_method? # => false
end

Rotoscope#caller_object

Returns the object whose context we invoked the call from.

rs = Rotoscope.new do |call|
  call.caller_object # => #<SomeClass:0x00008aa6d2cd91b61>
end

Rotoscope#caller_class

Returns the class of the object whose context we invoked the call from.

rs = Rotoscope.new do |call|
  call.caller_class # => SomeClass
end

Rotoscope#caller_class_name

Returns the stringified class of the object whose context we invoked the call from.

rs = Rotoscope.new do |call|
  call.caller_class_name # => "SomeClass"
end

Rotoscope#caller_method_name

Returns the name of the method that the call was invoked from.

rs = Rotoscope.new do |call|
  call.caller_method_name # => "call_foobar"
end

Rotoscope#caller_singleton_method?

Returns true if the method invoking the call is defined at the class level. If the call was invoked from an instance method, this returns false.

rs = Rotoscope.new do |call|
  call.caller_singleton_method? # => true
end

Rotoscope#caller_path

Returns the path to the file where the call was invoked.

rs = Rotoscope.new do |call|
  call.caller_path # => "/rotoscope_test.rb"
end

Rotoscope#caller_lineno

Returns the line number corresponding to the #caller_path where the call was invoked. If unknown, returns -1.

rs = Rotoscope.new do |call|
  call.caller_lineno # => 113
end

rotoscope's People

Contributors

airhorns, byroot, casperisfine, cursedcoder, dylanahsmith, exterm, jahfer, peterzhu2118, tmlayton

rotoscope's Issues

Release 0.3.0

We're currently on v0.3.0.pre.4 (soon to be pre.5 probably), but the changes in v0.3.0 have been largely stable for a while. Should cut a proper release once #54 goes in.

Can this be used with Ruby 1.8.7

I built this extension, ruby-debug and debugger-ruby_code_sources with support for 1.8.7.

I am finally stuck on the missing debug.h dependency, which does not seem to be part of the Ruby 1.8.7 sources :-(.

Any help?

Flatten does not account for threads

Rotoscope#flatten_into keeps a single call stack, which I don't expect would work properly if multiple threads are running. I think we would need to have a per-thread stack, which would require having the thread id as part of the event trace.
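
A minimal sketch of the per-thread idea, assuming each trace event carries a thread id. The `Event` struct and `flatten_calls` method are hypothetical names used for illustration, not Rotoscope's actual data structures.

```ruby
# Hypothetical event shape: which thread, :call or :return, method name.
Event = Struct.new(:thread_id, :kind, :name)

# Resolve each call's caller using one stack per thread, so interleaved
# events from different threads do not corrupt each other's call stack.
def flatten_calls(events)
  stacks = Hash.new { |hash, thread_id| hash[thread_id] = [] }
  edges = []

  events.each do |event|
    stack = stacks[event.thread_id]
    case event.kind
    when :call
      edges << [event.thread_id, stack.last || "<ROOT>", event.name]
      stack.push(event.name)
    when :return
      # Only pop when the return matches the top of this thread's stack.
      stack.pop if stack.last == event.name
    end
  end

  edges
end

# Two threads whose events are interleaved in the trace.
events = [
  Event.new(1, :call, "a"),
  Event.new(2, :call, "x"),
  Event.new(1, :call, "b"),
  Event.new(2, :call, "y"),
  Event.new(1, :return, "b"),
  Event.new(2, :return, "y"),
]

edges = flatten_calls(events)
```

With a single shared stack, the call to "b" on thread 1 would be attributed to "x" from thread 2; keying the stacks by thread id attributes it to "a".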

Calls from default argument values show inconsistent caller information

For example, if we have a method defined with

def create_applied_discount_from(discount, code: discount.try(:code))

the discount.try(:code) call shows that line in the file path and line number for the call, but the entity and method name refer to the caller of create_applied_discount_from which is inconsistent with the file path. Ideally, for the discount.try(:code) call, the caller method would show up as create_applied_discount_from and the caller entity would be the class of the object that method was called on.

This is happening because rotoscope keeps its own stack of caller information, pushing a frame onto that stack on call events and popping one off on return events. However, in this case the call event hasn't happened yet.

Quotes in method names aren't escaped in the CSV output

When we define tests using the test method, we pass a string that sometimes includes quote characters. However, rotoscope just uses printf-style formatting for the output with no escaping of quote characters (e.g. fprintf(stream, "\"%s\",\"%s\",%s,\"%s\",%d\n", ...)). This causes problems when parsing the CSV output.
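
For illustration, the usual CSV convention is to wrap a field in double quotes and double any quote characters inside it. The Ruby sketch below shows that rule; `csv_field` is a hypothetical helper, and the actual fix would have to live in rotoscope's C output code.

```ruby
# Hypothetical helper: quote a value as a CSV field, doubling any
# embedded double quotes.
def csv_field(value)
  '"' + value.to_s.gsub('"', '""') + '"'
end

method_name = 'test_says "hello" loudly'
escaped = csv_field(method_name)

# Reversing the rule recovers the original value.
unescaped = escaped[1..-2].gsub('""', '"')
```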

Deduplication for flattened output can omit non-duplicate lines

The deduplication is done by using a hash, but it uses the hash of the line as the key rather than the line itself. Since different lines may hash to the same hash value, we could be omitting lines that aren't duplicates.

I'm not aware of specific places where we have gotten the wrong data, but I thought I would open this issue to document the possible issue.

Doing this properly would use a lot more memory. However, proper deduplication could be done outside of rotoscope using the simple awk script !seen[$0]++.
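
For reference, an exact deduplication that mirrors the awk one-liner could look like the following in Ruby. This is a sketch for post-processing a trace outside of rotoscope; `dedup_lines` is a hypothetical name.

```ruby
require "set"

# Keep the first occurrence of each line, comparing the full line
# rather than only its hash value.
def dedup_lines(lines)
  seen = Set.new
  # Set#add? returns nil when the line is already present.
  lines.select { |line| seen.add?(line) }
end

trace_lines = [
  "Dog,bark,instance",
  "IO,write,instance",
  "Dog,bark,instance",
  "IO,puts,instance",
  "IO,write,instance",
]

unique_lines = dedup_lines(trace_lines)
```

As the issue notes, this keeps every distinct line in memory, which is the cost of avoiding false positives from hash collisions.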

Accurately retrieving caller location inside of TracePoint excessively expensive

Problem

tl;dr TracePoint logger is too slow to compute caller filepaths

Rotoscope is implemented in C using the Ruby TracePoint API. By default, the :call and :return Ruby trace events allow you to retrieve the filepath and line number (via rb_tracearg_lineno and rb_tracearg_path in C). While this is quite fast, it has the unfortunate side effect of pointing to the callee of the method (i.e. where the method is received), rather than the caller.
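
The callee-versus-caller behaviour can be observed from plain Ruby with the TracePoint class. The sketch below is only an illustration of the symptom, not rotoscope's C code; the `greet` method and the local variable names are made up for the example.

```ruby
# Remember the line where the method is defined.
def_line = __LINE__ + 1
def greet
  :hi
end

seen = []
tracer = TracePoint.new(:call) do |tp|
  # On a :call event, tp.lineno reports where the method is defined.
  seen << tp.lineno if tp.method_id == :greet
end

# Remember the line where the method is actually called.
call_line = __LINE__ + 1
tracer.enable { greet }

reported_line = seen.first
```

Here `reported_line` matches `def_line` rather than `call_line`, which is why the caller's location has to be recovered some other way.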

The path of the caller is highly important to maintain an accurate "blacklist/whitelist" option so we don't log every single method invocation inside of our framework or hot paths that we don't care to know the internals of. Introducing a blacklist dramatically improves the performance abilities of tracing method invocations.

Solution Attempts

In an attempt to remain performant and accurate, I examined a rudimentary implementation where I installed a :line TracePoint (performant, hah) and kept the last read line in memory. This does technically work but it isn't easily possible to use this to determine the return of a method, since it will only know the deepest level of the stack, not when a frame pop occurs.

At the suggestion of @dylanahsmith, I tried another variation where I invoked Kernel.caller_locations to compute the backtrace of the current trace, and extract the filepath/lineno. This does produce 100% correct results, but is incredibly slow in comparison to the other attempts.

Using the tracearg methods, I'm able to complete a Shopify/shopify CI run in ~10 minutes with Rotoscope wrapping all tests. Using caller_locations and bumping up the CI timeouts lets most of the containers finish in 2.5 hours. Boo. This is the version currently in the master branch.

I've compiled a table below where you can see the retrieved caller results of the different attempts:

table 1

We can shortcut the complexity of Kernel.caller_locations on C-invocations and instead use the default TracePoint methods, since the top-level stack won't be able to point to the method definition, and it falls back to the caller instead.

Next Steps

Examining how rb_tracearg_path is implemented, we can see rb_vm_get_ruby_level_next_cfp is invoked. Looking at the definition of the method, the logic is relatively simple: increment the frame pointer until a Ruby frame is found. In our case, we want to find a Ruby frame, and return the frame after that one. While the method is not static (a miracle?!), it is unfortunately declared inside of internal.h and uses a significant number of internal data structures whose definitions are prohibitively expensive to copy as a workaround (e.g. rb_control_frame_t has a field for rb_iseq_t, which itself holds a field for rb_iseq_constant_body, which is a large struct of its own).

At this point, it's not particularly obvious to me how I can get around this performance problem without forking Ruby to implement a method that returns the second Ruby VM frame, but I'd like to avoid that if possible.

I've added a comment to the already-open issue in the Ruby bugtracker that detailed this problem, but haven't received any traction on it. The original issue was opened over four years ago.

/cc @dylanahsmith @airhorns @nick-mcdonald @stephenminded

Methods defined via blocks produce incorrect call-locations

Introduction

As identified in #38.

When using dynamically defined methods via define_method and define_singleton_method, the call stack is inconsistent compared with how predefined Ruby methods look.

Failing test here:

def test_dynamic_methods_in_blacklist
  skip <<-FAILING_TEST_CASE
    Return events for dynamically created methods (define_method, define_singleton_method)
    do not have the correct stack frame information (the call of a dynamically defined method
    is correctly treated as a Ruby :call, but its return must be treated as a :c_return)
  FAILING_TEST_CASE
  contents = rotoscope_trace(blacklist: [MONADIFY_PATH]) { Example.apply("my value!") }
  assert_equal [
    { event: "call", entity: "Example", method_name: "apply", method_level: "class", filepath: "/rotoscope_test.rb", lineno: -1 },
    { event: "call", entity: "Example", method_name: "monad", method_level: "class", filepath: "/rotoscope_test.rb", lineno: -1 },
    { event: "return", entity: "Example", method_name: "monad", method_level: "class", filepath: "/rotoscope_test.rb", lineno: -1 },
    { event: "return", entity: "Example", method_name: "apply", method_level: "class", filepath: "/rotoscope_test.rb", lineno: -1 },
  ], parse_and_normalize(contents)
end

Problem

On :call events, the frame points to where the method is received in Ruby, i.e. the block passed to define_method or define_singleton_method. This is the expected behaviour for Ruby invocations:

[TracePoint] has the unfortunate side effect of pointing to the callee of the method (i.e. where the method is received), rather than the caller.

But when the :return event fires, instead of pointing to the end of the method definition like a normal Ruby method, it points one frame up the stack, to where the original call occurred. Since we assume Ruby methods point one level deeper, Rotoscope pops the frame one level further, and points to where the caller's caller came from. This means the call and return events won't match up.

Solution?

I tried to figure out how to get around this one (by using Ruby's bitmasks to determine if it was a dynamically defined method, etc.), but I've been unsuccessful thus far.

This problem is currently avoidable via flatten: true, since unmatched returns are ignored.

Calls from blocks show up as from caller method

This is happening for a similar reason as #57:

This is happening because rotoscope keeps its own stack of caller information, pushing a frame onto that stack on call events and popping one off on return events.

except that it doesn't keep track of calls to blocks.

Unlike that issue, this one is easier to fix with the current approach by also tracing block calls and returns so they can be included in the rotoscope stack.
