janestreet / magic-trace

magic-trace collects and displays high-resolution traces of what a process is doing

Home Page: https://magic-trace.org

License: MIT License

Makefile 0.01% OCaml 99.69% C 0.30% Shell 0.01%

intel x86 visualizer tracing profile performance-tools introspection

magic-trace's Introduction

magic-trace

Overview

magic-trace collects and displays high-resolution traces of what a process is doing. People have used it to:

figure out why an application running in production handles some requests slowly while simultaneously handling a sea of uninteresting requests,
look at what their code is actually doing instead of what they think it's doing,
get a history of what their application was doing before it crashed, instead of a mere stacktrace at that final instant,
...and much more!

magic-trace:

has 2%-10% overhead,
doesn't require application changes to use,
traces every function call with ~40ns resolution, and
renders a timeline of call stacks going back (a configurable) ~10ms.

You use it like perf: point it to a process and off it goes. The key difference from perf is that instead of sampling call stacks throughout time, magic-trace uses Intel Processor Trace to snapshot a ring buffer of all control flow leading up to a chosen point in time¹. Then, you can explore an interactive timeline of what happened.

You can point magic-trace at a function such that when your application calls it, magic-trace takes a snapshot. Alternatively, attach it to a running process and detach it with Ctrl+C, to see a trace of an arbitrary point in your program.

Testimonials

"Magic-trace is one of the simplest command-line debugging tools I have ever used."

Francis Ricci, Jane Street

"Magic-trace is not just for performance. The tool gives insight directly into what happens in your program, when, and why. Consider using it for all your introspective goals!"

Andrew Hunter, Jane Street

"I use perf a ton, and I think that both perf and magic-trace give perspectives that the other doesn't. The benefit I got from magic-trace was entirely based on the fact that it works in slices at any zoom level, so I was able to see all the function calls that a 70ns function was performing, which was invisible in perf."

Doug Patti, Jane Street

more testimonials...

Install

1. Make sure the system you want to trace is supported. The constraints that most commonly trip people up are: VMs are mostly not supported, Intel only (Skylake² or later), Linux only.
2. Grab a release binary from the latest release page.
   1. If downloading the prebuilt binary (not package), chmod +x magic-trace³
   2. If downloading the package, run sudo dpkg -i magic-trace*.deb
3. Then, test it by running magic-trace -help, which should bring up some help text.

Getting started

Here's a sample C program to try out. It's a slightly modified version of the example in man 3 dlopen. Download that, build it with gcc demo.c -ldl -o demo, then leave it running ./demo. We're going to use that program to learn how dlopen works.
Run magic-trace attach -pid $(pidof demo). When you see the message that it's successfully attached, wait a couple seconds and Ctrl+C magic-trace. It will output a file called trace.fxt in your working directory.

Open magic-trace.org, click "Open trace file" in the top-left-hand corner and give it the trace file generated in the previous step.

That should have expanded into a trace. Zoom in until you can see an individual loop through dlopen/dlsym/cos/printf/dlclose.
- W zooms into wherever your mouse cursor is pointed (you'll need to zoom in a bunch to see anything useful),
- S zooms out,
- A moves left,
- D moves right, and
- scroll wheel moves your viewport up and down the stack. You'll only need to scroll to see particularly deep stack traces, it's probably not useful for this example.

Click and drag on the white space around the call stacks to measure. Plant flags by clicking in the timeline along the top. Using the measurement tool, measure how long it takes to run cos. On my screen it takes ~5.7us.

Congratulations, you just magically traced your first program!

In contrast to traditional perf workflows, magic-trace excels at hypothesis generation. For example, you might notice that taking 6us to run cos is a really long time! If you zoom in even more, you'll see that there are actually five pink "[untraced]" cells in there. If you re-run magic-trace with root and pass it -trace-include-kernel, you'll see stacktraces for those. They're page fault handlers! The demo program actually calls cos twice. If you zoom in even more near the end of the 6us cos call, you'll see that the second call takes far less time and does not page fault.

How to use it

magic-trace continuously records control flow into a ring buffer. Upon some sort of trigger, it takes a snapshot of that buffer and reconstructs call stacks.

There are two ways to take a snapshot:

We just did this one: Ctrl+C magic-trace. If magic-trace terminates without already having taken a snapshot, it takes a snapshot of the end of the program.

You can also trigger snapshots when the application calls a function. To do so, pass magic-trace the -trigger flag.

-trigger '?' brings up a fuzzy-finding selector that lets you choose from all symbols in your executable,
-trigger SYMBOL selects a specific, fully mangled, symbol you know ahead of time, and
-trigger . selects the default symbol magic_trace_stop_indicator.

Stop indicators are powerful. Here are some ideas for where you might want to place one:

If you're using an asynchronous runtime, any time a scheduler cycle takes too long.
In a server, when a request takes a surprisingly long time.
After the garbage collector runs, to see what it's doing and what it interrupted.
After a compiler pass has completed.

You may leave the stop indicator in production code. It doesn't need to do anything in particular; magic-trace just needs the name. It is just an empty, but not inlined, function. It will cost ~10us to call, but only when magic-trace actually uses it to take a snapshot.

Documentation

More documentation is available on the magic-trace wiki.

Discussion

Join us on Discord to chat synchronously, or the GitHub discussion group to do so asynchronously.

Contributing

If you'd like to contribute:

read the build instructions,
set up your editor,
take a quick tour through the codebase, then
hit up the issue tracker for a good starter project.

Privacy policy

magic-trace does not send your code or derivatives of your code (including traces) anywhere.

magic-trace.org is a lightly modified fork of Perfetto, and runs entirely in your browser. As far as we can tell, it does not send your trace anywhere. If you're worried about that changing one day, set up your own local copy of the Perfetto UI and use that instead.

Acknowledgements

Tristan Hume is the original author of magic-trace. He wrote it while working at Jane Street, who currently maintains it.

Intel PT is the foundational technology upon which magic-trace rests. We'd like to thank the people at Intel for their years-long efforts to make it available, despite its slow uptake in the greater software community.

magic-trace would not be possible without perf's extensive support for Intel PT. perf does most of the work in interpreting Intel PT's output, and magic-trace likely wouldn't exist were it not for their efforts. Thank you, perf developers.

magic-trace.org is a fork of Perfetto, with minor modifications. We'd like to thank the people at Google responsible for it. It's a high quality codebase that solves a hard problem well.

The ideas behind magic-trace are in no way unique. We've written down a list of prior art that has influenced its design.

¹ perf can do this too, but that's not how most people use it. In fact, if you peek under the hood you'll see that magic-trace uses perf to drive Intel PT.
² Strictly speaking, anything newer than Broadwell, but this is not a platform we regularly test on, and timing resolution is worse (~1us).
³ https://github.com/actions/upload-artifact/issues/38


magic-trace's Issues

Don't crash on the v8 demo

The v8 demo is in demo/demo.js

Right now, attempts to magic-trace it crash while trying to parse perf output:

   "3435768/3435768 1283684.240499276:   int                      556263730105 v8::base::OS::Abort+0x15 =>     556263608d40 v8::internal::Snapshot::DefaultSnapshotBlob+0x57e00"))

or even:

   "3435768/3435768 1283760.029165481:   int                      55626368b83f v8::internal::Snapshot::DefaultSnapshotBlob+0xda8ff =>     5562635f7e80 v8::internal::Snapshot::DefaultSnapshotBlob+0x46f40"))

At this point, I'd normally go add parsing for software interrupts, int. But I don't understand what's going on here. Looking at V8's source code, OS::Abort is supposed to abort the process. But the process didn't abort! Also, DefaultSnapshotBlob is a pretty trivial function that doesn't look like it should be generating software interrupts.

Upload CI artifacts to tags

This would remove the need to upload them manually, and drop the requirement for people to be logged into GitHub in https://github.com/janestreet/magic-trace/wiki/Getting-started.

Better capability checking for if we can trace the kernel

Ref https://perf.wiki.kernel.org/index.php/Perf_tools_support_for_Intel%C2%AE_Processor_Trace#Adding_capabilities_to_perf

Right now we just check if the user is root, which covers most cases but isn't precise.
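
A more precise check might look roughly like the sketch below. It is written in Python for illustration only (magic-trace itself is OCaml). It assumes that kernel tracing is allowed for root, for a process holding CAP_PERFMON or CAP_SYS_ADMIN, or when perf_event_paranoid is 1 or lower, which mirrors the kernel's general rule for kernel profiling. Whether Intel PT needs anything beyond that is not confirmed here, and the function names are hypothetical.

```python
CAP_SYS_ADMIN = 21  # capability bit numbers from linux/capability.h
CAP_PERFMON = 38    # added in Linux 5.8


def may_trace_kernel(euid, cap_eff, paranoid):
    """Pure decision function, so it can be tested without privileges."""
    if euid == 0:
        return True
    wanted = (1 << CAP_PERFMON) | (1 << CAP_SYS_ADMIN)
    return bool(cap_eff & wanted) or paranoid <= 1


def read_effective_caps(status_path="/proc/self/status"):
    """Parse the CapEff bitmask of the current process."""
    with open(status_path) as f:
        for line in f:
            if line.startswith("CapEff:"):
                return int(line.split()[1], 16)
    return 0


def read_paranoid(path="/proc/sys/kernel/perf_event_paranoid"):
    """Read the system-wide perf_event_paranoid setting."""
    with open(path) as f:
        return int(f.read().strip())
```

On a Linux machine this would be called as may_trace_kernel(os.geteuid(), read_effective_caps(), read_paranoid()). Note that capabilities granted to the perf binary itself (as the linked wiki page describes) would have to be inspected on that file rather than on the calling process.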

Handle `perf` failing to start more gracefully

If perf fails to start for any reason, magic-trace hangs and does not propagate the error up. An easy way to test this is to intentionally mess up the flags to perf record, or unintentionally by running on a too-old perf version.

`trace` is race-y with process startup

If a process takes less than ~500ms to execute, then trace will produce an empty tracefile. We should synchronize process startup with perf, probably through some SIGSTOP/SIGCONT dance.
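
One way the SIGSTOP/SIGCONT dance could work is sketched below in Python, purely as an illustration of the idea rather than how magic-trace does or will do it. The child stops itself before exec, the parent waits until it is really stopped, attaches the tracer, and only then resumes it. The attach callback is a hypothetical hook standing in for "start perf on this pid".

```python
import os
import signal


def run_with_tracer(argv, attach):
    """Start argv stopped, call attach(pid), resume it, return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: stop before exec so no program code runs untraced.
        os.kill(os.getpid(), signal.SIGSTOP)
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    # Parent: wait until the child has actually stopped.
    _, status = os.waitpid(pid, os.WUNTRACED)
    if not os.WIFSTOPPED(status):
        raise RuntimeError("child exited before it could be traced")
    attach(pid)                   # hypothetical hook: start the tracer here
    os.kill(pid, signal.SIGCONT)  # resume; the child now execs the program
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)
```

A side effect of this design is that the trace starts before exec, so it also covers the exec itself and the dynamic loader.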

Support increasing the snapshot buffer size

The snapshot buffer can be ~arbitrarily sized, and is passed as the argument to --snapshot in perf. Larger buffers would allow for more data to be captured and displayed.

Ref https://man7.org/linux/man-pages/man1/perf-intel-pt.1.html:

   To select snapshot mode a new option has been added:

       -S

   Optionally it can be followed by the snapshot size e.g.

       -S0x100000

   The default snapshot size is the auxtrace mmap size. If neither
   auxtrace mmap size nor snapshot size is specified, then the
   default is 4MiB for privileged users (or if
   /proc/sys/kernel/perf_event_paranoid < 0), 128KiB for
   unprivileged users. If an unprivileged user does not specify mmap
   pages, the mmap pages will be reduced as described in the new
   auxtrace mmap size option section below.

   The snapshot size is displayed if the option -vv is used e.g.

       Intel PT snapshot size: %zu

magic-trace gives [unknown] symbol for symbols containing spaces

magic-trace gives [unknown] symbol in the perfetto profile for symbols containing whitespace.

This is a very common scenario for C++ template types.

Perf handles those fine by itself.

The regex at https://github.com/janestreet/magic-trace/blob/master/src/perf_tool_backend.ml#L104 is failing. It should probably use proper parsing instead.
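
As an illustration of what "proper parsing" could mean, here is a small Python sketch (not the OCaml code in perf_tool_backend.ml). It assumes the perf script line layout shown elsewhere on this page: pid/tid, timestamp, event kind, then source and destination separated by " => ". It also assumes symbols never contain " => ". The function names are hypothetical. The key idea is to split off the address first and treat everything up to the final +0x offset as the symbol, spaces included.

```python
import re

# Event names from perf's sample_flags table.
KINDS = ["call", "return", "jcc", "jmp", "int", "iret", "syscall", "sysret",
         "async", "hw int", "tx abrt", "tr strt", "tr end", "vmentry", "vmexit"]
HEAD = re.compile(r"^\s*(\d+)/(\d+)\s+(\d+\.\d+):\s+(.*)$")
OFFSET = re.compile(r"^(.*)\+0x([0-9a-f]+)$")


def parse_location(text):
    """Parse 'ADDR SYMBOL[+0xOFF]'; the symbol may contain whitespace."""
    addr, symbol = text.strip().split(None, 1)
    m = OFFSET.match(symbol)
    if m:
        return int(addr, 16), m.group(1), int(m.group(2), 16)
    return int(addr, 16), symbol, None  # e.g. "[unknown]" has no offset


def parse_line(line):
    m = HEAD.match(line)
    if m is None:
        raise ValueError("unrecognized perf line: %r" % line)
    pid, tid, time, rest = m.groups()
    kind = next((k for k in KINDS if rest.startswith(k + " ")), None)
    if kind is None:
        raise ValueError("unknown event kind in: %r" % line)
    src, dst = rest[len(kind):].split(" => ", 1)
    return {"pid": int(pid), "tid": int(tid), "time": time, "kind": kind,
            "src": parse_location(src), "dst": parse_location(dst)}
```

The timestamp is kept as a string so that no precision is lost to floating point.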

Process search doesn't show non-root processes when running as root

It should, because those processes are still traceable.

Fail gracefully

When magic-trace first starts up, it should detect conditions that we know will never work and bail out early with a good error message when they're hit. This wiki page contains a list of constraints we know about. Let's automate as much as we can.

If `-symbol` is specified, SIGINT shouldn't capture a trace

Otherwise there's no good way to exit if you realize you've accidentally selected the wrong symbol.

(Ran into this while showing someone how to use magic-trace for the first time.)

common events trigger too fast

If I run a trace for an event I expect to trigger very often (say, at 1000Hz), then magic-trace stops almost as soon as it starts recording. As a consequence, I get a very limited history.

While my software trigger could be written to only trigger probabilistically, I'd like it if magic-trace had an option to "refuse to trigger until we've been recording for TIME-SPAN".
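
The requested behavior amounts to a small time gate in front of the snapshot logic. The Python sketch below only illustrates that idea; the class and parameter names are hypothetical and do not correspond to any existing magic-trace option.

```python
import time


class SnapshotGate:
    """Refuses to snapshot until recording has run for a minimum time."""

    def __init__(self, min_recording_seconds, clock=time.monotonic):
        self.min_recording_seconds = min_recording_seconds
        self.clock = clock
        self.started_at = clock()  # recording is assumed to start now

    def should_snapshot(self):
        # Called each time the trigger symbol is hit.
        elapsed = self.clock() - self.started_at
        return elapsed >= self.min_recording_seconds
```

Triggers that arrive while should_snapshot() is false would simply be ignored, so the ring buffer has time to fill before the first accepted trigger.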

`setjmp`/`longjmp` support

We cannot support arbitrary _setjmp/longjmp, but if we see a longjmp we can probably avoid totally breaking the trace by resetting the stack, and ignoring future rets underflowing our call stack (because we wouldn't know how many stack frames should have gotten popped).

They appear as Call "_setjmp" and Call "longjmp" tokens in the stream.
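
A minimal Python sketch of that recovery strategy follows, assuming a simple stream of call and return events. The class is hypothetical and much simpler than the real trace writer: a call to longjmp empties the tracked stack, and later returns that would underflow it are counted and ignored instead of breaking the trace.

```python
class CallStack:
    """Tracks open frames and tolerates longjmp by resetting."""

    def __init__(self):
        self.frames = []
        self.ignored_returns = 0

    def on_call(self, symbol):
        if symbol == "longjmp":
            # We cannot know how many frames were unwound, so drop them all.
            self.frames.clear()
            return
        self.frames.append(symbol)

    def on_return(self):
        if not self.frames:
            # Underflow after a reset: ignore instead of corrupting the trace.
            self.ignored_returns += 1
            return None
        return self.frames.pop()
```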

Any support planned for managed languages?

From what I can tell magic-trace right now only works with native binaries as you need to select a symbol address up front.

Do you guys have any plans for managed languages? Maybe via the perf symbol map files?

Support more perf event kinds

Ref https://github.com/torvalds/linux/blob/7ee022567bf9e2e0b3cd92461a2f4986ecc99673/tools/perf/builtin-script.c#L1546:

static struct {
	u32 flags;
	const char *name;
} sample_flags[] = {
	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL, "call"},
	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_RETURN, "return"},
	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CONDITIONAL, "jcc"},
	{PERF_IP_FLAG_BRANCH, "jmp"},
	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_INTERRUPT, "int"},
	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_RETURN | PERF_IP_FLAG_INTERRUPT, "iret"},
	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_SYSCALLRET, "syscall"},
	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_RETURN | PERF_IP_FLAG_SYSCALLRET, "sysret"},
	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_ASYNC, "async"},
	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_ASYNC |	PERF_IP_FLAG_INTERRUPT, "hw int"},
	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TX_ABORT, "tx abrt"},
	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_BEGIN, "tr strt"},
	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_END, "tr end"},
	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_VMENTRY, "vmentry"},
	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_VMEXIT, "vmexit"},
	{0, NULL}
};

CI artifacts should be built with flambda

The decoding stage gets a fair bit faster if they are.

Trace should include absolute times

This'd make it easier to line up with events the application may be logging.

magic-trace trace decoding fails on OCaml programs compiled on my system

$ cat test.ml
let () = print_endline "Hello World"

$ ~/.opam/4.13.1/bin/ocamlopt -o test ./test.ml

$ magic-trace trace -o test.trace ./test
[Couldn't find symbol. Will still snapshot on end]
Hello World
[ perf record: Woken up 1 times to write data ]
[Finished recording!]
[Snapshot taken!]
[ perf record: Captured and wrote 0.002 MB /tmp/magic_trace.tmp.e5ca22/perf.data ]
[Decoding, this may take 30s or so...]
(monitor.ml.Error
 ("Owee_buf.Invalid_format(\"unknown .debug_line version\")")
 ("Raised at Owee_buf.invalid_format in file \"src/owee_buf.ml\", line 22, characters 25-51"
  "Called from Owee_buf.assert_format in file \"src/owee_buf.ml\" (inlined), line 26, characters 4-22"
  "Called from Owee_debug_line.read_header in file \"src/owee_debug_line.ml\", line 54, characters 2-80"
  "Called from Owee_debug_line.read_chunk in file \"src/owee_debug_line.ml\", line 82, characters 12-27"
  "Called from Magic_trace_core__Elf.addr_table.(fun).load_table_next in file \"core/elf.ml\", line 103, characters 12-45"
  "Called from Base__Option.iter in file \"src/option.ml\" (inlined), line 68, characters 14-17"
  "Called from Magic_trace_core__Elf.addr_table in file \"core/elf.ml\", line 90, characters 2-1023"
  "Called from Magic_trace_lib__Trace.Make_commands.decode_to_trace.(fun) in file \"src/trace.ml\", line 70, characters 25-43"
  "Called from Tracing__Tool_output.write_and_view in file \"src/tool_output.ml\", line 32, characters 16-19"
  "Called from Async_kernel__Deferred0.bind.(fun) in file \"src/deferred0.ml\", line 54, characters 64-69"
  "Called from Async_kernel__Job_queue.run_jobs in file \"src/job_queue.ml\", line 167, characters 6-47"
  "Caught by monitor Monitor.protect"))

this is magic-trace v0.15.0 as packaged on opam
System: Fedora 34
owee is version 0.4 (I don't know if 0.5 would fix it, but the magic-trace package on opam requires owee 0.4)

Update the getting-started guide with magic-trace.org

When the magic-trace.org rebrand is complete, redo the demo in the readme using the newly-rebranded website.

Filter events from after the stop indicator

Magic-trace doesn't stop recording exactly when the stop indicator is hit. That can be quite confusing, especially if you get unlucky and the stop event is nowhere near the right hand side of the trace.

To fix this, let's filter out events from the trace that happen after the stop indicator returns.

It's conceivable that people will want the current behavior behind a flag, but I weakly think it's not worth the extra UI complexity.
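
The proposed filter could be as simple as the Python sketch below. It assumes events arrive in trace order as (time, kind, src_symbol, dst_symbol) tuples, which is a hypothetical representation chosen for the example, and it keeps everything up to and including the return out of the stop indicator.

```python
def drop_events_after_stop(events, stop_symbol):
    """Keep events up to and including the return from stop_symbol."""
    kept = []
    for event in events:
        kept.append(event)
        _, kind, src_symbol, _ = event
        if kind == "return" and src_symbol == stop_symbol:
            break  # everything after this point is post-trigger noise
    return kept
```

If the stop indicator never returns within the captured events, nothing is dropped.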

Trace state tracking is broken by hardware interrupts

Perf outputs two events magic-trace uses to construct its [untraced] spans: tr strt and tr end.

As we discovered after a deep dive today, magic-trace isn't handling these events properly and that is the cause of some staircase traces:

Subtleties include:

tr end doesn't need to be accompanied by a tr strt. For example, there's an implicit tr end during a hardware interrupt (hw int). But when tracing userspace only, you don't even see hardware interrupts.
Due to an Intel PT bug (?), there are sometimes two tr strts instead of one.

and they make it challenging to figure out what the correct trace state should be.

We propose the following algorithm for tracking trace state, instead of what magic-trace does today:

Explicitly track trace state per-thread, one of Tracing | Not_tracing.
Initial state is Tracing.
Tracing -> Not_tracing on tr end
Not_tracing -> Tracing on tr strt
On tr end while Not_tracing, print a warning and disbelieve it.
On tr strt while Tracing, print a warning and disbelieve it.
There's one exception: A tr strt is permitted as the first event of a thread, even though it's prohibited by the other rules.
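
The rules above can be sketched as a small state machine. This Python version is only an illustration of the proposal, not magic-trace's OCaml implementation. The class name is hypothetical, and "disbelieve" is modeled as leaving the state unchanged and recording a warning.

```python
from enum import Enum


class State(Enum):
    TRACING = "Tracing"
    NOT_TRACING = "Not_tracing"


class TraceStateTracker:
    def __init__(self):
        self.states = {}    # tid -> State
        self.warnings = []  # (tid, message) pairs

    def on_event(self, tid, kind):
        """Feed one perf event kind for a thread; return the new state."""
        first = tid not in self.states
        state = self.states.setdefault(tid, State.TRACING)  # initial state
        if kind == "tr end":
            if state is State.TRACING:
                self.states[tid] = State.NOT_TRACING
            else:
                self.warnings.append((tid, "tr end while Not_tracing"))
        elif kind == "tr strt":
            if state is State.NOT_TRACING:
                self.states[tid] = State.TRACING
            elif not first:  # exception: allowed as a thread's first event
                self.warnings.append((tid, "tr strt while Tracing"))
        return self.states[tid]
```

Any other event kind leaves the state untouched, so the tracker can be fed the full event stream.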

This usually, but not always, happens around a call to memmove. I've left some example perf script output below:

 1139/1139  428146.916343395:   jcc                            40af03 itch_bbo::book::Book::add_order+0x3b3 =>           40b012 itch_bbo::book::Book::add_order+0x4c2
 1139/1139  428146.916343397:   call                           40b06c itch_bbo::book::Book::add_order+0x51c =>     7ffff7329220 __memmove_ssse3_back+0x0
 1139/1139  428146.916343397:   jmp                      7ffff732924a __memmove_ssse3_back+0x2a =>     7ffff732ba10 __memmove_ssse3_back+0x27f0
 1139/1139  428146.916343398:   return                   7ffff732ba16 __memmove_ssse3_back+0x27f6 =>           40b072 itch_bbo::book::Book::add_order+0x522
 1139/1139  428146.916343398:   call                           40b093 itch_bbo::book::Book::add_order+0x543 =>     7ffff7329220 __memmove_ssse3_back+0x0
 1139/1139  428146.916343445:   tr strt                             0 [unknown] =>     7ffff732bbd0 __memmove_ssse3_back+0x29b0
 1139/1139  428146.916343561:   return                   7ffff732bbd4 __memmove_ssse3_back+0x29b4 =>           40b099 itch_bbo::book::Book::add_order+0x549
 1139/1139  428146.916343592:   jmp                            40b0a8 itch_bbo::book::Book::add_order+0x558 =>           40b1ba itch_bbo::book::Book::add_order+0x66a
 1139/1139  428146.916343592:   jmp                            40b1ce itch_bbo::book::Book::add_order+0x67e =>           40b62d itch_bbo::book::Book::add_order+0xadd

 1139/1139  428146.916323767:   jmp                            40d53e itch_bbo::main+0x9ee =>           40d540 itch_bbo::main+0x9f0
 1139/1139  428146.916323767:   jcc                            40d54e itch_bbo::main+0x9fe =>           40d63f itch_bbo::main+0xaef
 1139/1139  428146.916324004:   hw int                         40d64f itch_bbo::main+0xaff => ffffffff8ad90750 [unknown]
 1139/1139  428146.916324247:   tr strt                             0 [unknown] => ffffffff8ad90762 [unknown]
 1139/1139  428146.916324607:   tr strt                             0 [unknown] =>           40d64f itch_bbo::main+0xaff
 1139/1139  428146.916324732:   jmp                            40d679 itch_bbo::main+0xb29 =>           40d730 itch_bbo::main+0xbe0
 1139/1139  428146.916324732:   call                           40d78b itch_bbo::main+0xc3b =>           40c650 itch_bbo::maybe_sanity_check_execution+0x0

 1139/1139  428146.916294568:   call                           40a5b2 alloc::collections::btree::remove::<impl alloc::collections::btree::node::Handle<alloc::collections::btree::node::NodeRef<alloc::collections::btree::node::marker::Mut,K,V,alloc::collections::btree::node::marker::Leaf>,alloc::collections::btree::node::marker::KV>>::remove_leaf_kv+0x62 =>     7ffff7329220 __memmove_ssse3_back+0x0
 1139/1139  428146.916294568:   jcc                      7ffff7329226 __memmove_ssse3_back+0x6 =>     7ffff732924e __memmove_ssse3_back+0x2e
 1139/1139  428146.916294569:   jmp                      7ffff732926c __memmove_ssse3_back+0x4c =>     7ffff732b4b0 __memmove_ssse3_back+0x2290
 1139/1139  428146.916294569:   return                   7ffff732b4b6 __memmove_ssse3_back+0x2296 =>           40a5b5 alloc::collections::btree::remove::<impl alloc::collections::btree::node::Handle<alloc::collections::btree::node::NodeRef<alloc::collections::btree::node::marker::Mut,K,V,alloc::collections::btree::node::marker::Leaf>,alloc::collections::btree::node::marker::KV>>::remove_leaf_kv+0x65
 1139/1139  428146.916294570:   call                           40a5d6 alloc::collections::btree::remove::<impl alloc::collections::btree::node::Handle<alloc::collections::btree::node::NodeRef<alloc::collections::btree::node::marker::Mut,K,V,alloc::collections::btree::node::marker::Leaf>,alloc::collections::btree::node::marker::KV>>::remove_leaf_kv+0x86 =>     7ffff7329220 __memmove_ssse3_back+0x0
 1139/1139  428146.916294615:   tr strt                             0 [unknown] =>     7ffff732924e __memmove_ssse3_back+0x2e
 1139/1139  428146.916294690:   jmp                      7ffff732926c __memmove_ssse3_back+0x4c =>     7ffff732b5c0 __memmove_ssse3_back+0x23a0
 1139/1139  428146.916294690:   return                   7ffff732b5c8 __memmove_ssse3_back+0x23a8 =>           40a5d9 alloc::collections::btree::remove::<impl alloc::collections::btree::node::Handle<alloc::collections::btree::node::NodeRef<alloc::collections::btree::node::marker::Mut,K,V,alloc::collections::btree::node::marker::Leaf>,alloc::collections::btree::node::marker::KV>>::remove_leaf_kv+0x89
 1139/1139  428146.916294690:   jmp                            40a65c alloc::collections::btree::remove::<impl alloc::collections::btree::node::Handle<alloc::collections::btree::node::NodeRef<alloc::collections::btree::node::marker::Mut,K,V,alloc::collections::btree::node::marker::Leaf>,alloc::collections::btree::node::marker::KV>>::remove_leaf_kv+0x10c =>           40a6c0 alloc::collections::btree::remove::<impl alloc::collections::btree::node::Handle<alloc::collections::btree::node::NodeRef<alloc::collections::btree::node::marker::Mut,K,V,alloc::collections::btree::node::marker::Leaf>,alloc::collections::btree::node::marker::KV>>::remove_leaf_kv+0x170
 1139/1139  428146.916294690:   call                           40a6c3 alloc::collections::btree::remove::<impl alloc::collections::btree::node::Handle<alloc::collections::btree::node::NodeRef<alloc::collections::btree::node::marker::Mut,K,V,alloc::collections::btree::node::marker::Leaf>,alloc::collections::btree::node::marker::KV>>::remove_leaf_kv+0x173 =>           40a1d0 alloc::collections::btree::node::BalancingContext<K,V>::merge_tracking_child_edge+0x0
 1139/1139  428146.916294707:   call                           40a297 alloc::collections::btree::node::BalancingContext<K,V>::merge_tracking_child_edge+0xc7 =>     7ffff7329220 __memmove_ssse3_back+0x0


Clean up the CLI surface

The CLI surface of magic-trace could use some polish. Here is a small list of minor complaints I'd like to address before 1.0:

support non-standard characters in symbol names on command line

I have a binary with a symbol whose C name according to e.g. readelf is like:

foo_bar_$5_baz$3_qux_12345

This fails:

magic-trace attach <stuff> -symbol 'foo_bar_$5_baz$3_qux_12345'

Surprisingly if I do:

magic-trace attach <stuff> -symbol foo_bar

I see the right symbol in my fzf, can select it, and everything works (it does not appear to have any escapes etc.) I wonder if we're not escaping the characters in a regex? I also wonder if we should separate symbol-regex from this-is-the-symbol-i-know-it.
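
The regex hypothesis is easy to reproduce in isolation. The Python sketch below uses Python's re module, not magic-trace's actual matcher, and the function name is hypothetical. Used as a pattern, the symbol cannot match itself, because $ is an end-of-string anchor, while escaping it or using a plain prefix works. This also shows what a separate "literal symbol" mode could look like.

```python
import re


def matching_symbols(query, symbols, literal=False):
    """Return symbols matching query, as a regex or as a literal string."""
    pattern = re.escape(query) if literal else query
    return [s for s in symbols if re.search(pattern, s)]


symbols = ["foo_bar_$5_baz$3_qux_12345", "unrelated_symbol"]
name = "foo_bar_$5_baz$3_qux_12345"

as_regex = matching_symbols(name, symbols)                  # no match: $ anchors
as_literal = matching_symbols(name, symbols, literal=True)  # matches itself
by_prefix = matching_symbols("foo_bar", symbols)            # matches, as observed
```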

External debug symbols for snapshot symbol selection

Sometimes people package up debug symbols separately from the executable. I'm not sure how that works yet, but we should definitely support it.

Fix broken stack traces on Go code

In this simple Go example, it's clear that Go's stack switch causes stacktraces to wander off the right hand side of the screen. I think this is easy to fix: when trace_writer.ml sees the symbol runtime.newstack, it should mark all currently-open stack frames as closed.

I'm not sure if there is any more custom control flow in Go code. e.g. do Go stacks shrink? Please file more bugs if you notice any.

Support tracing into kernel mode

perf can already trace into the kernel, given sufficient perms. Adding support to magic-trace involves (at least) supporting iret and hw int decoding / state machine updating, but after that it might "just work".

Speed up decoding

Looks like the scanfs are the most expensive bit, and could be stuck into the regex.

/cc @billduff re: #78 (comment) :)

Support more control over timing data

http://halobates.de/blog/p/432 for context

We should be able to specify noretcomp and also the mtc_period (independently) for cases where we need more precise timestamps and are willing to eat the cost in trace size.

Related: #36

Changing c-states breaks magic-trace / IPT

I was experiencing issues where my traces were completely empty around the snapshot. After investigating the perf file it hinted at "instruction trace errors" which led me to

https://perf.wiki.kernel.org/index.php/Perf_tools_support_for_Intel%C2%AE_Processor_Trace

which mentions

It is not uncommon to get overflows when transitioning to a C-state, so these errors are not significant.

I was testing this on a TGL laptop and after disabling turbo boost I got pretty stable traces again.

I am wondering whether other people share the same experience with switching c-states or whether there is maybe something else behind it?

If not, it might be worth mentioning disabling c-states in the readme / tutorial? Disabling turbo boost was enough for me, but something like the max_cstate kernel flags would probably work as well.

This is on 5.15.17.

trace-many-times

I frequently want as many traces of $CONDITION as I can get. Right now I build a command line, watch it go, wait til it fires, wait for it to decode, and rerun.

I can write a shell loop around this, but you know what sounds great? A flag to "once you're done, re-arm and do it again with a fresh filename".
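
Until such a flag exists, the re-arm loop can be scripted externally. The Python sketch below is one hypothetical way to do it. It uses the attach subcommand and the -pid, -trigger and -output flags as they appear elsewhere on this page, so check magic-trace -help for the exact spelling in your version. The file naming scheme is an assumption.

```python
import subprocess


def trace_command(pid, trigger, index):
    """Build one magic-trace invocation with a fresh output filename."""
    output = "trace-%04d.fxt" % index
    return ["magic-trace", "attach", "-pid", str(pid),
            "-trigger", trigger, "-output", output]


def trace_many(pid, trigger, count, run=subprocess.run):
    """Run magic-trace count times in a row; return the output filenames."""
    outputs = []
    for index in range(count):
        argv = trace_command(pid, trigger, index)
        run(argv, check=True)  # blocks until the snapshot is decoded
        outputs.append(argv[-1])
    return outputs
```

The run parameter exists only so the loop can be exercised without magic-trace installed.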

[Question] Breakpoints using perf hardware breakpoints

Thanks for this interesting project especially its amazingly written accompanying blog post on the janestreet tech blog!

I was intrigued by the following line in the post:

It turns out that perf_event_open can use hardware breakpoints and notify you when a memory address is executed or accessed

Very cool! So I understand that (1) You get notified (probably via a fd) that a hardware breakpoint has been reached (2) You enable intel processor trace for that thread (3) You resume the thread paused on the hardware breakpoint

My question is how do you do (3)? How do you resume the thread? Do you send it a SIGCONT or something like that?

Perf_tool_backend doesn't recognize "tr strt tr end"

This shows up for me in our Go demo. E.g. we fail to parse this perf line:

"2118573/2118573 770614.599007116: tr strt tr end 0 [unknown] => 4591e1 [unknown]"

I included a commented-out test for this in my recent pull request to the line-parsing code, for convenience of whoever fixes this.

magic-trace cannot find fzf executable even though it's in $PATH

Installed magic-trace as described in https://blog.janestreet.com/magic-trace/.

~ $ magic-trace attach -output magic.ftf
(monitor.ml.Error
 (Unix.Unix_error "No such file or directory" execvp
  "((prog /usr/bin/fzf) (argv (/usr/bin/fzf)))")
 ("Raised at Base__Result.ok_exn in file \"src/result.ml\", line 249, characters 17-26"
  "Called from Async_kernel__Deferred1.M.map.(fun) in file \"src/deferred1.ml\", line 17, characters 40-45"
  "Called from Async_kernel__Job_queue.run_jobs in file \"src/job_queue.ml\", line 167, characters 6-47"))
No pid selected

~ $ which fzf
/home/omer/rcbackup/nvim/pack/plugins/start/fzf/bin/fzf

CI .deb packages should be built on Ubuntu 18.04

Currently they target 20.04 and thus limit compatibility to that or newer, but we should be able to ~easily downgrade to 18.04 (the oldest release supported by GitHub Actions).

Support bigger traces at lower resolution

My understanding is that one of the main blockers preventing taking traces for longer time periods is that perfetto can't handle the size of traces that it produces. I would happily settle for longer traces which only showed function calls that took more than a certain amount of time -- to give a high-level view of where the time was spent. Combined with some option to filter a trace down to a given time range this should allow for exploring large traces reasonably ergonomically.
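
The two filters described here, a minimum duration and a time range, compose naturally. The Python sketch below assumes spans are (name, start_ns, end_ns) tuples with integer nanosecond timestamps, a hypothetical representation for the example, and is not tied to magic-trace's trace format.

```python
def coarsen(spans, min_duration_ns, window=None):
    """Keep spans lasting at least min_duration_ns that overlap window.

    window is an optional (start_ns, end_ns) pair; None means no time filter.
    """
    kept = []
    for name, start, end in spans:
        if end - start < min_duration_ns:
            continue  # too short to matter at this zoom level
        if window is not None and (end < window[0] or start > window[1]):
            continue  # entirely outside the requested time range
        kept.append((name, start, end))
    return kept
```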

Support rust demangling

https://github.com/rust-lang/rustc-demangle has a nice C API we could use. I don't know if there's an easy way to detect if a symbol is rust vs. C++ vs. C, but I'd like to avoid forcing the user to specify, if possible.

Rewrite the README

It should have the following sections:

standard github badges
the magic trace logo
an overview of what this is and why anyone should care, with a picture
installation, including how to turn off perf paranoid mode
examples
supported platforms etc.
links to documentation
how to contribute, including a quick tour of the tree and a request to please squash/rebase PRs

If `-symbol` is specified, the name should also be printed

We currently print the address (as an int; it should probably be printed as a pointer). We should also print the name, in case the user misselected their symbol. A name @ addr format sounds reasonable.

(Ran into this while showing someone how to use magic-trace for the first time.)

Run tests in CI

We have a couple of tests, we should figure out how to run them in the CI.

Erroneous warning when symbol not specified

Running magic-trace without -symbol prints out

[Couldn't find symbol. Will still snapshot on end]

even though having no symbol is expected in that case.
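
The intended behavior could look roughly like this sketch, which only warns when a symbol was actually requested. `snapshot_warning`, `requested`, and `lookup` are hypothetical names, and the real lookup code is structured differently:

```ocaml
(* [requested] is the value of -symbol, if one was given;
   [lookup] resolves a symbol name to an address. Both are assumptions. *)
let snapshot_warning ~(requested : string option) ~(lookup : string -> int64 option) =
  match requested with
  | None -> None (* no -symbol given: snapshotting on exit is expected, stay quiet *)
  | Some name ->
    (match lookup name with
     | Some _ -> None
     | None -> Some "[Couldn't find symbol. Will still snapshot on end]")
```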

Binaries with DWARF5 debug info fail at decoding time

It seems that magic-trace struggles with DWARF 5, which GCC 11 now uses by default. Testing a simple program compiled with DWARF 5 debug info gives:

[Attaching to 4199088]
[Snapshot taken!]
...
[ perf record: Woken up 2 times to write data ]
[Finished recording!]
[ perf record: Captured and wrote 4.012 MB /tmp/magic_trace.tmp.848576/perf.data ]
[Decoding, this may take 30s or so...]
(monitor.ml.Error
 ("Owee_buf.Invalid_format(\"unknown .debug_line version\")")
 ("Raised at Owee_buf.invalid_format in file \"src/owee_buf.ml\", line 22, characters 25-51"
  "Called from Owee_buf.assert_format in file \"src/owee_buf.ml\" (inlined), line 26, characters 4-22"
  "Called from Owee_debug_line.read_header in file \"src/owee_debug_line.ml\", line 54, characters 2-80"
  "Called from Owee_debug_line.read_chunk in file \"src/owee_debug_line.ml\", line 82, characters 12-27"
  "Called from Magic_trace_core__Elf.addr_table.(fun).load_table_next in file \"core/elf.ml\", line 103, characters 12-45"
  "Called from Base__Option.iter in file \"src/option.ml\" (inlined), line 68, characters 14-17"
  "Called from Magic_trace_core__Elf.addr_table in file \"core/elf.ml\", line 90, characters 2-1023"
  "Called from Magic_trace_lib__Trace.Make_commands.decode_to_trace.(fun) in file \"src/trace.ml\", line 70, characters 25-43"
  "Called from Tracing__Tool_output.write_and_view in file \"src/tool_output.ml\", line 32, characters 16-19"
  "Called from Async_kernel__Deferred0.bind.(fun) in file \"src/deferred0.ml\", line 54, characters 64-69"
  "Called from Async_kernel__Job_queue.run_jobs in file \"src/job_queue.ml\", line 167, characters 6-47"
  "Caught by monitor Monitor.protect"))

Passing -gdwarf-4 to gcc makes the issue go away.
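
For context, the version being rejected lives near the start of each `.debug_line` unit: a 4-byte `unit_length` (or the `0xffffffff` escape followed by an 8-byte length for 64-bit DWARF), then a 2-byte version. The sketch below reads that field from raw bytes, assuming little-endian data. `debug_line_version` and `describe` are hypothetical helpers and are not part of owee:

```ocaml
(* Read the version field of the first .debug_line unit in [buf].
   Assumes little-endian data; returns None if [buf] is too short. *)
let debug_line_version (buf : Bytes.t) : int option =
  let len = Bytes.length buf in
  if len < 6 then None
  else if Int32.equal (Bytes.get_int32_le buf 0) (-1l) then
    (* 0xffffffff escape: 64-bit DWARF, 8-byte length precedes the version. *)
    if len < 14 then None else Some (Bytes.get_uint16_le buf 12)
  else Some (Bytes.get_uint16_le buf 4)

let describe = function
  | Some v when v >= 2 && v <= 4 -> Printf.sprintf "DWARF %d line table" v
  | Some 5 -> "DWARF 5 line table (header layout differs from versions 2-4)"
  | Some v -> Printf.sprintf "unknown .debug_line version %d" v
  | None -> "truncated .debug_line header"
```

A check like this would at least let the decoder skip line-table information for unsupported versions instead of aborting the whole trace.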

Provide a better error message when fzf isn't found

When fzf isn't available in the user's PATH, we crash with an OCaml exception. It's a pretty jarring experience for someone who's never seen one before, and a terrible first impression.

Instead, let's give them a human-readable error message that invites the user to either:

install fzf, or
pass -pid to magic-trace
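
A rough sketch of such a check using only the OCaml standard library. `find_in_path` and `fzf_missing_message` are hypothetical names, the message wording is only a suggestion, and the check tests for existence rather than executability:

```ocaml
(* Search each directory of a colon-separated [path] for [name].
   [exists] defaults to Sys.file_exists and can be replaced in tests. *)
let find_in_path ?(exists = Sys.file_exists) ~path name =
  String.split_on_char ':' path
  |> List.filter (fun dir -> dir <> "")
  |> List.map (fun dir -> Filename.concat dir name)
  |> List.find_opt exists

let fzf_missing_message =
  "Could not find fzf in PATH. Either install fzf, or pass -pid to choose a process directly."

let () =
  let path = Option.value (Sys.getenv_opt "PATH") ~default:"" in
  match find_in_path ~path "fzf" with
  | Some _ -> ()
  | None -> prerr_endline fzf_missing_message
```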

Represent pointers as pointers in the Fuchsia trace

We represent them as signed int63s right now.

Rebrand the magic-trace website

We got some new assets from Christy.

Implement a `perf script --dlfilter` backend

Currently, we spend a lot of time sscan-ing textual output from perf script and filtering out branches within the same symbol in OCaml.

On recent (5.14+) perf installs, we could offload this to perf script --dlfilter, and have a short native object that does the filtering for us. Ref https://man7.org/linux/man-pages/man1/perf-dlfilter.1.html.
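
For reference, the filtering step that a dlfilter would take over amounts to something like the sketch below. The `branch` record is a hypothetical simplification; perf script's real output carries more fields (timestamps, addresses, offsets):

```ocaml
(* Hypothetical, simplified branch event. *)
type branch = { src_symbol : string; dst_symbol : string }

(* A branch is interesting only if it leaves the symbol it started in. *)
let crosses_symbols b = not (String.equal b.src_symbol b.dst_symbol)

let drop_intra_symbol_branches branches = List.filter crosses_symbols branches
```

Doing this inside perf itself would mean the uninteresting lines are never formatted as text or parsed again on the OCaml side.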

Command line flag to trace or attach without converting the results to Fuchsia trace

That way, other tools can use the magic_trace library to decode events and convert them to different formats.

Build static binaries in CI

This would require:

building against musl
possibly different dune targets

...but would allow the attached binaries to be run on anything, not just Ubuntu 20.04-glibc-equivalent systems.

Time range filtering

As we know, Perfetto falls over at some point with large enough traces; can we write a tool that slices our output .ftfs by time range so we can look at parts successfully?

This isn't a good solution but it would be great for usability.

Admittedly it's also a pure Fuchsia Trace Format problem as opposed to a magic-trace one, but nevertheless there might not be a better place for this to live.
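
The slicing logic itself is small once events are decoded. The sketch below keeps events that overlap a window and clips them to it. The `event` record and `slice` are hypothetical, and a real tool would also need to read and re-emit the binary Fuchsia trace records, which this does not attempt:

```ocaml
(* Hypothetical decoded duration event, with timestamps in nanoseconds. *)
type event = { label : string; start_ns : int; end_ns : int }

(* Keep events overlapping the window [lo, hi], clipped to that window. *)
let slice ~lo ~hi events =
  List.filter_map
    (fun e ->
      if e.end_ns < lo || e.start_ns > hi then None
      else Some { e with start_ns = max lo e.start_ns; end_ns = min hi e.end_ns })
    events
```

Clipping rather than dropping partially overlapping events keeps long-running outer calls visible, so the call stack in the sliced view still makes sense.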

Demangle symbol names in fzf

When showing the user an fzf to select a trigger symbol, show demangled symbols instead of mangled ones.

Symbols specified at the command line, I think, should still be mangled to ease copy+paste between apps and to avoid forcing us to write a name mangler too.

The message we show after a user selects a symbol should print its mangled name so the user can copy+paste it into future magic-trace invocations.
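
One way to show demangled names while still recovering the mangled one is to feed fzf tab-separated `mangled<TAB>demangled` lines and ask it to display only the second field, for example with `--delimiter='\t' --with-nth=2`. fzf still prints the full original line on selection. A sketch of the two string helpers involved, with hypothetical names and assuming mangled names never contain a tab:

```ocaml
(* Build the line handed to fzf: mangled name, a tab, then the demangled name. *)
let fzf_line ~mangled ~demangled = mangled ^ "\t" ^ demangled

(* Recover the mangled name from the line fzf prints after selection. *)
let mangled_of_selection line =
  match String.index_opt line '\t' with
  | Some i -> String.sub line 0 i
  | None -> line
```

The recovered mangled name is then what gets printed for the user to copy into later invocations.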

Create some demos

Using magic-trace is a visual experience. We should create example programs and demonstrations of magic-tracing them in a few different formats for people's varying attention spans.

a <10s GIF like the one speedscope has, demonstrating the value magic-trace can provide
a 2-minute video quickly walking through a short debugging session
a text-based walkthrough where we spell out exactly what commands to run at each step