uber-archive / pyflame

🔥 Pyflame: A Ptracing Profiler For Python. This project is deprecated and not maintained.

License: Apache License 2.0

Makefile 0.96% Shell 2.28% M4 10.76% C++ 58.42% Python 27.32% C 0.26%
ptrace uwsgi flame-charts fedora docker debian profiler python

pyflame's Introduction

Pyflame: A Ptracing Profiler For Python


(This project is deprecated and not maintained.)

Pyflame is a high performance profiling tool that generates flame graphs for Python. Pyflame is implemented in C++, and uses the Linux ptrace(2) system call to collect profiling information. It can take snapshots of the Python call stack without explicit instrumentation, meaning you can profile a program without modifying its source code. Pyflame is capable of profiling embedded Python interpreters like uWSGI. It fully supports profiling multi-threaded Python programs.

Pyflame usually introduces significantly less overhead than the builtin profile (or cProfile) modules, and emits richer profiling data. The profiling overhead is low enough that you can use it to profile live processes in production.

Full Documentation: https://pyflame.readthedocs.io


Quickstart

Building And Installing

For Debian/Ubuntu, install the following:

# Install build dependencies on Debian or Ubuntu.
sudo apt-get install autoconf automake autotools-dev g++ pkg-config python-dev python3-dev libtool make

Once you have the build dependencies installed:

./autogen.sh
./configure
make

The make command will produce an executable at src/pyflame that you can run and use.

Optionally, if you have virtualenv installed, you can test the executable you produced using make check.

Using Pyflame

The full documentation for using Pyflame is here. But here's a quick guide:

# Attach to PID 12345 and profile it for 1 second
pyflame -p 12345

# Attach to PID 768 and profile it for 5 seconds, sampling every 0.01 seconds
pyflame -s 5 -r 0.01 -p 768

# Run py.test against tests/, emitting sample data to prof.txt
pyflame -o prof.txt -t py.test tests/

In all of these cases you will get flame graph data on stdout (or to a file if you used -o). This data is in the format expected by flamegraph.pl, which you can find here.

FAQ

The full FAQ is here.

What's The Deal With (idle) Time?

Full answer here. tl;dr: use the -x flag to suppress (idle) output.

What About These Ptrace Errors?

See here.

How Do I Profile Threaded Applications?

Use the --threads option.

Is There A Way To Just Dump Stack Traces?

Yes, use the -d option.

pyflame's People

Contributors

abeutot, akatrevorjay, amboar, batterseapower, eklitzke, emfree, faicker, fmichea, inikolaev, jackvreeken, jamespic, jeevandev, jmphilli, leezu, realfatcat, simonzheng, stevenkaras, tijko, ziyili66


pyflame's Issues

Occasional segfault with --threads

I am sometimes getting segfaults when using --threads. Here's some info from a core dump:

(env) evan@localhost ~/code/pyflame (276f0c3...) $ gdb python
GNU gdb (GDB) Fedora 7.12.1-46.fc25
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...Reading symbols from /usr/lib/debug/usr/bin/python2.7.debug...done.
done.
(gdb) core core.python.6569.localhost.localdomain.1489091075
/home/evan/code/pyflame/core.python.6569.localhost.localdomain.1489091075: No such file or directory.
(gdb) core /tmp/core.python.6569.localhost.localdomain.1489091075
warning: core file may not match specified executable file.
[New LWP 6571]
[New LWP 6570]
[New LWP 6569]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `python tests/threaded_sleeper.py'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f3ef11af959 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x56268986fce0)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:205
205	  int err = lll_futex_timed_wait_bitset (futex_word, expected, abstime,
[Current thread is 1 (Thread 0x7f3ee3fff700 (LWP 6571))]
(gdb) bt
#0  0x00007f3ef11af959 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x56268986fce0)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  do_futex_wait (sem=sem@entry=0x56268986fce0, abstime=0x0) at sem_waitcommon.c:111
#2  0x00007f3ef11afa04 in __new_sem_wait_slow (sem=0x56268986fce0, abstime=0x0) at sem_waitcommon.c:181
#3  0x00007f3ef11afaaa in __new_sem_wait (sem=<optimized out>) at sem_wait.c:29
#4  0x00007f3ef14cff15 in PyThread_acquire_lock (lock=0x56268986fce0, waitflag=waitflag@entry=1) at /usr/src/debug/Python-2.7.13/Python/thread_pthread.h:324
#5  0x00007f3ef149bb29 in PyEval_EvalFrameEx (
    f=f@entry=Frame 0x7f3ef17fb730, for file tests/threaded_sleeper.py, line 25, in do_sleep (target=<float at remote 0x562689806608>), throwflag=throwflag@entry=0)
    at /usr/src/debug/Python-2.7.13/Python/ceval.c:1193
#6  0x00007f3ef14a10ae in fast_function (nk=0, na=<optimized out>, n=<optimized out>, pp_stack=0x7f3ee3ffe560, func=<optimized out>)
    at /usr/src/debug/Python-2.7.13/Python/ceval.c:4514
#7  call_function (oparg=<optimized out>, pp_stack=0x7f3ee3ffe560) at /usr/src/debug/Python-2.7.13/Python/ceval.c:4449
#8  PyEval_EvalFrameEx (f=f@entry=Frame 0x7f3ef19456f0, for file tests/threaded_sleeper.py, line 34, in sleep_b (), throwflag=throwflag@entry=0)
    at /usr/src/debug/Python-2.7.13/Python/ceval.c:3063
#9  0x00007f3ef14a4adc in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=locals@entry=0x0, args=args@entry=0x7f3ef1960068, argcount=0, 
    kws=kws@entry=0x7f3ef1960068, kwcount=0, defs=0x0, defcount=0, closure=0x0) at /usr/src/debug/Python-2.7.13/Python/ceval.c:3661
#10 0x00007f3ef142d04d in function_call (func=<function at remote 0x7f3ef1800140>, arg=(), kw={}) at /usr/src/debug/Python-2.7.13/Objects/funcobject.c:523
#11 0x00007f3ef1408003 in PyObject_Call (func=func@entry=<function at remote 0x7f3ef1800140>, arg=arg@entry=(), kw=kw@entry={})
    at /usr/src/debug/Python-2.7.13/Objects/abstract.c:2547
#12 0x00007f3ef149f093 in ext_do_call (nk=<optimized out>, na=0, flags=<optimized out>, pp_stack=0x7f3ee3ffe808, func=<function at remote 0x7f3ef1800140>)
    at /usr/src/debug/Python-2.7.13/Python/ceval.c:4743
#13 PyEval_EvalFrameEx (
    f=f@entry=Frame 0x7f3ef17feb00, for file /usr/lib64/python2.7/threading.py, line 757, in run (self=<Thread(_Thread__ident=139908089902848, _Thread__block=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0x7f3ef1933190>, acquire=<built-in method acquire of thread.lock object at remote 0x7f3ef1933190>, _Condition__waiters=[], release=<built-in method release of thread.lock object at remote 0x7f3ef1933190>) at remote 0x7f3ef17ff1d0>, _Thread__name='Thread-2', _Thread__daemonic=False, _Thread__started=<_Event(_Verbose__verbose=False, _Event__flag=True, _Event__cond=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0x7f3ef1933170>, acquire=<built-in method acquire of thread.lock object at remote 0x7f3ef1933170>, _Condition__waiters=[], release=<built-in method release of thread.lock object at remote 0x7f3ef1933170>) at remote 0x7f3ef17ff190>) at remote 0x7f3ef17ff150>, _Thread__stderr=<file at remote 0x7f3ef19821e0>, _Thread__target=<function at remote 0x7f3ef1800140>, ...(truncated), throwflag=throwflag@entry=0) at /usr/src/debug/Python-2.7.13/Python/ceval.c:3102
#14 0x00007f3ef14a10ae in fast_function (nk=0, na=<optimized out>, n=<optimized out>, pp_stack=0x7f3ee3ffe940, func=<optimized out>)
    at /usr/src/debug/Python-2.7.13/Python/ceval.c:4514
#15 call_function (oparg=<optimized out>, pp_stack=0x7f3ee3ffe940) at /usr/src/debug/Python-2.7.13/Python/ceval.c:4449
#16 PyEval_EvalFrameEx (
    f=f@entry=Frame 0x7f3edc000910, for file /usr/lib64/python2.7/threading.py, line 804, in __bootstrap_inner (self=<Thread(_Thread__ident=139908089902848, _Thread__b---Type <return> to continue, or q <return> ---Type <return> to continue, or q <return> to quit---
lock=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0x7f3ef1933190>, acquire=<built-in method acquire of thread.lock object at remote 0x7f3ef1933190>, _Condition__waiters=[], release=<built-in method release of thread.lock object at remote 0x7f3ef1933190>) at remote 0x7f3ef17ff1d0>, _Thread__name='Thread-2', _Thread__daemonic=False, _Thread__started=<_Event(_Verbose__verbose=False, _Event__flag=True, _Event__cond=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0x7f3ef1933170>, acquire=<built-in method acquire of thread.lock object at remote 0x7f3ef1933170>, _Condition__waiters=[], release=<built-in method release of thread.lock object at remote 0x7f3ef1933170>) at remote 0x7f3ef17ff190>) at remote 0x7f3ef17ff150>, _Thread__stderr=<file at remote 0x7f3ef19821e0>, _Thread__target=<function at remote 0x7...(truncated), throwflag=throwflag@entry=0) at /usr/src/debug/Python-2.7.13/Python/ceval.c:3063
#17 0x00007f3ef14a10ae in fast_function (nk=0, na=<optimized out>, n=<optimized out>, pp_stack=0x7f3ee3ffea80, func=<optimized out>) at /usr/src/debug/Python-2.7.13/Python/ceval.c:4514
#18 call_function (oparg=<optimized out>, pp_stack=0x7f3ee3ffea80) at /usr/src/debug/Python-2.7.13/Python/ceval.c:4449
#19 PyEval_EvalFrameEx (
    f=f@entry=Frame 0x7f3ef1803210, for file /usr/lib64/python2.7/threading.py, line 777, in __bootstrap (self=<Thread(_Thread__ident=139908089902848, _Thread__block=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0x7f3ef1933190>, acquire=<built-in method acquire of thread.lock object at remote 0x7f3ef1933190>, _Condition__waiters=[], release=<built-in method release of thread.lock object at remote 0x7f3ef1933190>) at remote 0x7f3ef17ff1d0>, _Thread__name='Thread-2', _Thread__daemonic=False, _Thread__started=<_Event(_Verbose__verbose=False, _Event__flag=True, _Event__cond=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0x7f3ef1933170>, acquire=<built-in method acquire of thread.lock object at remote 0x7f3ef1933170>, _Condition__waiters=[], release=<built-in method release of thread.lock object at remote 0x7f3ef1933170>) at remote 0x7f3ef17ff190>) at remote 0x7f3ef17ff150>, _Thread__stderr=<file at remote 0x7f3ef19821e0>, _Thread__target=<function at remote 0x7f3ef18...(truncated), 
    throwflag=throwflag@entry=0) at /usr/src/debug/Python-2.7.13/Python/ceval.c:3063
#20 0x00007f3ef14a4adc in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=locals@entry=0x0, args=args@entry=0x7f3ef18a06e8, argcount=1, kws=kws@entry=0x0, kwcount=0, defs=0x0, 
    defcount=0, closure=0x0) at /usr/src/debug/Python-2.7.13/Python/ceval.c:3661
#21 0x00007f3ef142cf6c in function_call (func=<function at remote 0x7f3ef17fd0c8>, 
    arg=(<Thread(_Thread__ident=139908089902848, _Thread__block=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0x7f3ef1933190>, acquire=<built-in method acquire of thread.lock object at remote 0x7f3ef1933190>, _Condition__waiters=[], release=<built-in method release of thread.lock object at remote 0x7f3ef1933190>) at remote 0x7f3ef17ff1d0>, _Thread__name='Thread-2', _Thread__daemonic=False, _Thread__started=<_Event(_Verbose__verbose=False, _Event__flag=True, _Event__cond=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0x7f3ef1933170>, acquire=<built-in method acquire of thread.lock object at remote 0x7f3ef1933170>, _Condition__waiters=[], release=<built-in method release of thread.lock object at remote 0x7f3ef1933170>) at remote 0x7f3ef17ff190>) at remote 0x7f3ef17ff150>, _Thread__stderr=<file at remote 0x7f3ef19821e0>, _Thread__target=<function at remote 0x7f3ef1800140>, _Thread__kwargs={}, _Verbose__verbose=False, _Thread__args=(), _Thread__stopped=False, _...(truncated), 
    kw=0x0) at /usr/src/debug/Python-2.7.13/Objects/funcobject.c:523
#22 0x00007f3ef1408003 in PyObject_Call (func=func@entry=<function at remote 0x7f3ef17fd0c8>, 
    arg=arg@entry=(<Thread(_Thread__ident=139908089902848, _Thread__block=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0x7f3ef1933190>, acquire=<built-in method acquire of thread.lock object at remote 0x7f3ef1933190>, _Condition__waiters=[], release=<built-in method release of thread.lock object at remote 0x7f3ef1933190>) at remote 0x7f3ef17ff1d0>, _Thread__name='Thread-2', _Thread__daemonic=False, _Thread__started=<_Event(_Verbose__verbose=False, _Event__flag=True, _Event__cond=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0x7f3ef1933170>, acquire=<built-in method acquire of thread.lock object at remote 0x7f3ef1933170>, _Condition__waiters=[], release=<built-in method release of thread.lock object at remote 0x7f3ef1933170>) at remote 0x7f3ef17ff190>) at remote 0x7f3ef17ff150>, _Thread__stderr=<file at remote 0x7f3ef19821e0>, _Thread__target=<function at remote 0x7f3ef1800140>, _Thread__kwargs={}, _Verbose__verbose=False, _Thread__args=(), _Thread__stopped=False, _...(truncated), kw=kw@entry=0x0) at /usr/src/debug/Python-2.7.13/Objects/abstract.c:2547
#23 0x00007f3ef1416efc in instancemethod_call (func=<function at remote 0x7f3ef17fd0c8>, 
    arg=(<Thread(_Thread__ident=139908089902848, _Thread__block=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0x7f3ef1933190>, acquire=<built-in method acquire of thread.lock object at remote 0x7f3ef1933190>, _Condition__waiters=[], release=<built-in method release of thread.lock object at remote 0x7f3ef1933190>) at remote 0x7f3ef17ff1d0>, _Thread__name='Thread-2', _Thread__daemonic=False, _Thread__started=<_Event(_Verbose__verbose=False, _Event__flag=True, _Event__cond=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0x7f3ef1933170>, acquire=<built-in method acquire of thread.lock object at remote 0x7f3ef1933170>, _Condition__waiters=[], release=<built-in method release of thread.lock object at remote 0x7f3ef1933170>) at remote 0x7f3ef17ff190>) at remote 0x7f3ef17ff150>, _Thread__stderr=<file at remote 0x7f3ef19821e0>, _Thread__target=<function at remote 0x7f3ef1800140>, _Thread__kwargs={}, _Verbose__verbose=False, _Thread__args=(), _Thread__stopped=False, _...(truncated), 
    kw=0x0) at /usr/src/debug/Python-2.7.13/Objects/classobject.c:2602
#24 0x00007f3ef1408003 in PyObject_Call (func=func@entry=<instancemethod at remote 0x7f3ef186c230>, arg=arg@entry=(), kw=<optimized out>) at /usr/src/debug/Python-2.7.13/Objects/abstract.c:2547
#25 0x00007f3ef149abc7 in PyEval_CallObjectWithKeywords (func=<instancemethod at remote 0x7f3ef186c230>, arg=(), kw=<optimized out>) at /usr/src/debug/Python-2.7.13/Python/ceval.c:4298
#26 0x00007f3ef14d41a2 in t_bootstrap (boot_raw=0x5626898392e0) at /usr/src/debug/Python-2.7.13/Modules/threadmodule.c:620
#27 0x00007f3ef11a76ca in start_thread (arg=0x7f3ee3fff700) at pthread_create.c:333
#28 0x00007f3ef07d1f7f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
(gdb) disas
Dump of assembler code for function do_futex_wait:
   0x00007f3ef11af920 <+0>:	push   %r12
   0x00007f3ef11af922 <+2>:	push   %rbp
   0x00007f3ef11af923 <+3>:	mov    %rdi,%rbp
   0x00007f3ef11af926 <+6>:	push   %rbx
   0x00007f3ef11af927 <+7>:	sub    $0x10,%rsp
   0x00007f3ef11af92b <+11>:	mov    0x8(%rdi),%ebx
   0x00007f3ef11af92e <+14>:	callq  0x7f3ef11b0280 <__pthread_enable_asynccancel>
   0x00007f3ef11af933 <+19>:	mov    $0xffffffff,%r9d
   0x00007f3ef11af939 <+25>:	mov    %eax,%r12d
   0x00007f3ef11af93c <+28>:	xor    %r8d,%r8d
   0x00007f3ef11af93f <+31>:	xor    $0x189,%ebx
   0x00007f3ef11af945 <+37>:	xor    %r10d,%r10d
   0x00007f3ef11af948 <+40>:	xor    %edx,%edx
   0x00007f3ef11af94a <+42>:	movslq %ebx,%rsi
   0x00007f3ef11af94d <+45>:	mov    %rbp,%rdi
   0x00007f3ef11af950 <+48>:	mov    $0xca,%eax
   0x00007f3ef11af955 <+53>:	syscall 
   0x00007f3ef11af957 <+55>:	cmp    $0xfffffffffffff000,%rax
   0x00007f3ef11af95d <+61>:	ja     0x7f3ef11af978 <do_futex_wait+88>
   0x00007f3ef11af95f <+63>:	mov    %r12d,%edi
   0x00007f3ef11af962 <+66>:	callq  0x7f3ef11b02e0 <__pthread_disable_asynccancel>
   0x00007f3ef11af967 <+71>:	xor    %eax,%eax
   0x00007f3ef11af969 <+73>:	add    $0x10,%rsp
   0x00007f3ef11af96d <+77>:	pop    %rbx
   0x00007f3ef11af96e <+78>:	pop    %rbp
   0x00007f3ef11af96f <+79>:	pop    %r12
   0x00007f3ef11af971 <+81>:	retq   
   0x00007f3ef11af972 <+82>:	nopw   0x0(%rax,%rax,1)
   0x00007f3ef11af978 <+88>:	mov    %r12d,%edi
   0x00007f3ef11af97b <+91>:	mov    %rax,0x8(%rsp)
   0x00007f3ef11af980 <+96>:	callq  0x7f3ef11b02e0 <__pthread_disable_asynccancel>
   0x00007f3ef11af985 <+101>:	mov    0x8(%rsp),%rax
   0x00007f3ef11af98a <+106>:	cmp    $0xfffffff5,%eax
   0x00007f3ef11af98d <+109>:	je     0x7f3ef11af9a8 <do_futex_wait+136>
   0x00007f3ef11af98f <+111>:	cmp    $0xfffffffc,%eax
   0x00007f3ef11af992 <+114>:	je     0x7f3ef11af9a8 <do_futex_wait+136>
   0x00007f3ef11af994 <+116>:	cmp    $0xffffff92,%eax
   0x00007f3ef11af997 <+119>:	je     0x7f3ef11af9a8 <do_futex_wait+136>
   0x00007f3ef11af999 <+121>:	lea    0x3420(%rip),%rdi        # 0x7f3ef11b2dc0
   0x00007f3ef11af9a0 <+128>:	callq  0x7f3ef11a5630 <__libc_fatal@plt>
   0x00007f3ef11af9a5 <+133>:	nopl   (%rax)
   0x00007f3ef11af9a8 <+136>:	neg    %eax
   0x00007f3ef11af9aa <+138>:	jmp    0x7f3ef11af969 <do_futex_wait+73>
End of assembler dump.
(gdb) info registers
rax            0xfffffffffffffff7	-9
rbx            0x189	393
rcx            0x7f3ef11af959	139908309776729
rdx            0x0	0
rsi            0x189	393
rdi            0x56268986fce0	94723516071136
rbp            0x56268986fce0	0x56268986fce0
rsp            0x7f3ee3ffe2a0	0x7f3ee3ffe2a0
r8             0x0	0
r9             0xffffffff	4294967295
r10            0x0	0
r11            0x246	582
r12            0x0	0
r13            0x562689806620	94723515639328
r14            0x7f3ef18f1c56	139908317387862
r15            0x7f3ef17fb730	139908316378928
rip            0x7f3ef11af959	0x7f3ef11af959 <do_futex_wait+57>
eflags         0x10246	[ PF ZF IF RF ]
cs             0x33	51
ss             0x2b	43
ds             0x0	0
es             0x0	0
fs             0x0	0
gs             0x0	0

What's suspicious here is that %rip is at 0x7f3ef11af959, which is not a valid instruction boundary. In fact, it's two bytes past one, and the syscall instruction is two bytes. So it looks to me like there is an issue where the instruction pointer isn't being restored properly after the syscall.

Compilation of frob34.cc fails with Python.h not found

I am on a fully updated Ubuntu 16.04 system. For some reason the compilation process does not set the right include flags (which should be /usr/include/python3.5m).

./configure

checking for PY26... yes
checking for PY34... no
checking for PY35... yes
checking for PY36... no
configure: WARNING: Building without Python 3.6 support

make

libtool: link: ranlib .libs/libfrob26.a
libtool: link: ( cd ".libs" && rm -f "libfrob26.la" && ln -s "../libfrob26.la" "libfrob26.la" )
/bin/bash ../libtool  --tag=CXX   --mode=compile g++ -DHAVE_CONFIG_H -I.     -g -O2 -std=c++11 -Wall -MT libfrob34_la-frob34.lo -MD -MP -MF .deps/libfrob34_la-frob34.Tpo -c -o libfrob34_la-frob34.lo `test -f 'frob34.cc' || echo './'`frob34.cc
libtool: compile:  g++ -DHAVE_CONFIG_H -I. -g -O2 -std=c++11 -Wall -MT libfrob34_la-frob34.lo -MD -MP -MF .deps/libfrob34_la-frob34.Tpo -c frob34.cc  -fPIC -DPIC -o .libs/libfrob34_la-frob34.o
In file included from frob34.cc:18:0:
./frob.cc:28:20: fatal error: Python.h: No such file or directory
compilation terminated.

./configure failure on OSX

I'm on OSX 10.11.5

I checked out the git repo at 16958de

master!pyflame *> ./autogen.sh
configure.ac:12: installing 'build-aux/compile'
configure.ac:8: installing 'build-aux/install-sh'
configure.ac:8: installing 'build-aux/missing'
src/Makefile.am: installing 'build-aux/depcomp'

master!pyflame *> ./configure
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... build-aux/install-sh -c -d
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking for g++... g++
checking whether the C++ compiler works... yes
checking for C++ compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking for style of include used by make... GNU
checking dependency style of g++... gcc3
checking for gcc... gcc
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether gcc understands -c and -o together... yes
checking dependency style of gcc... gcc3
checking whether C compiler accepts "-std=c++11"... no
checking whether C compiler accepts "-std=c++0x"... no
configure: error: failed to detect C++11 support

Pyflame always runs as if I had passed --threads

Using pyflame 1.3.1 with Python 2.7.12 on Ubuntu 16.04, pyflame always returns output as if I had passed the --threads option. This makes it hard to get a useful profile of the running thread. Here is some sample code that should illustrate the problem:

import threading, time

def main():
    busyThread = threading.Thread(target=busy)
    busyThread.start()
    while True:
        time.sleep(1000)

def busy():
    number = 0
    while True:
        number += 1

if __name__ == '__main__':
    main()

Here is some sample output
sudo pyflame -s 0 PID results in:
profilable.py::15;profilable.py:main:7 1
/usr/lib/python2.7/threading.py:__bootstrap:774;/usr/lib/python2.7/threading.py:__bootstrap_inner:801;/usr/lib/python2.7/threading.py:run:754;profilable.py:busy:11 1

Why would -s 0 without the --threads option return with two samples?

sudo pyflame -s 0 --threads PID results in:
profilable.py::15;profilable.py:main:7 1
/usr/lib/python2.7/threading.py:__bootstrap:774;/usr/lib/python2.7/threading.py:__bootstrap_inner:801;/usr/lib/python2.7/threading.py:run:754;profilable.py:busy:12 1

This is as expected, but it is identical to running pyflame without --threads.

sudo pyflame -s 30 PID results in:
profilable.py::15;profilable.py:main:7 24549
/usr/lib/python2.7/threading.py:__bootstrap:774;/usr/lib/python2.7/threading.py:__bootstrap_inner:801;/usr/lib/python2.7/threading.py:run:754;profilable.py:busy:12 24548
/usr/lib/python2.7/threading.py:__bootstrap:774;/usr/lib/python2.7/threading.py:__bootstrap_inner:801;/usr/lib/python2.7/threading.py:run:754;profilable.py:busy:11 1

The total number of samples multiplied by the default sampling period exceeds the time spent profiling. It also seems unlikely that sampling only the active thread would land in the sleeping thread so often.

sudo pyflame -s 30 --threads PID results in:
profilable.py::15;profilable.py:main:7 25126
/usr/lib/python2.7/threading.py:__bootstrap:774;/usr/lib/python2.7/threading.py:__bootstrap_inner:801;/usr/lib/python2.7/threading.py:run:754;profilable.py:busy:11 12537
/usr/lib/python2.7/threading.py:__bootstrap:774;/usr/lib/python2.7/threading.py:__bootstrap_inner:801;/usr/lib/python2.7/threading.py:run:754;profilable.py:busy:12 12589

This is nearly identical to running pyflame without the --threads option.

Travis tests started failing for Python 3.4 and 3.5

Something about the Travis VM environment changed that causes the Python 3 tests to always fail. This is not related to a code change: old revisions that previously passed on Travis are now failing. I installed a Trusty VM locally and cannot reproduce the issue, so Travis must have changed something about their VM environment that caused the breakage.

The errors are all like:

       AssertionError: assert not 'Failed to PTRACE_PEEKDATA at 0x10: Input/output error\n'

This indicates a null pointer dereference via ptrace. Basically ptrace has a pointer to some struct and is trying to dereference a field in the struct, but the struct pointer is null.

Ideally I could get an actual VM image used by Travis so I could debug locally. Otherwise I can add lots of sanity checks to the code, run many builds, and try to work things out backwards. I'm worried, though, that if the error is something like the ELF symbol table being laid out in some unusual way on Travis, it would be extremely time consuming to debug the issue with this approach.

Wrong parameter to trace a process

The README.md file mentions a parameter (-t) to trace a process from start to finish.
ex: pyflame -t py.test tests/

However, if I try to run this command I get an "Invalid option" message.
And if I look into the source code there is no mention of a "-t" parameter, even in the first commit of pyflame.cc: 0a507c0

I guess this should be somewhere around these lines, shouldn't it? https://github.com/uber/pyflame/blob/master/src/pyflame.cc#L118

So I guess either the README or the source code should be updated, or am I missing something?

Add code to walk the list of thread states

This is how you do it:

  • each interpreter has a field called tstate_head which is the head of a linked list of thread states
  • each thread state has a field called next which is a pointer to the next thread state
  • the last thread state has a null pointer for next

Likewise:

  • each thread state has a back reference to the interpreter state in a field called interp
  • there's a single linked list of interpreters whose head is the static symbol interp_head

In the single-threaded case there is one interpreter and one thread state.

In the generic case the way you enumerate the thread states is:

  • locate _PyThreadState_Current
  • follow interp up to the interpreter
  • follow tstate_head to the first thread state (which will always be the same as _PyThreadState_Current for a single-threaded program, but could be different for a multi-threaded program)
  • follow the next field until NULL is encountered

It should be fine to ignore the multiple-interpreter case; hardly anyone uses that feature nowadays.
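
The enumeration above can be sketched like this in C++. This is illustrative only: it assumes a hypothetical PtracePeek(pid, addr) helper that reads one pointer-sized word from the traced process, and it computes field offsets with offsetof() against the CPython headers of that era (Python 2.7 and 3.x, where PyThreadState and PyInterpreterState are fully defined in pystate.h); the real code would need per-version offsets.

#include <Python.h>     // PyInterpreterState / PyThreadState definitions
#include <sys/types.h>

#include <cstddef>      // offsetof
#include <vector>

// Hypothetical helper: read one pointer-sized word from the traced process
// (e.g. via PTRACE_PEEKDATA).
unsigned long PtracePeek(pid_t pid, unsigned long addr);

// Enumerate all thread states, starting from the value of
// _PyThreadState_Current read out of the target process.
std::vector<unsigned long> EnumerateThreadStates(pid_t pid,
                                                 unsigned long current_tstate) {
  // Follow the back reference from the current thread state to the interpreter.
  unsigned long interp =
      PtracePeek(pid, current_tstate + offsetof(PyThreadState, interp));
  // tstate_head is the head of the interpreter's linked list of thread states.
  unsigned long tstate =
      PtracePeek(pid, interp + offsetof(PyInterpreterState, tstate_head));
  std::vector<unsigned long> tstates;
  // Walk the next pointers until NULL.
  while (tstate != 0) {
    tstates.push_back(tstate);
    tstate = PtracePeek(pid, tstate + offsetof(PyThreadState, next));
  }
  return tstates;
}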

BSD Support

For real BSDs (FreeBSD, OpenBSD, etc.) the following parts of the code have to be updated:

  • The ptrace() code in ptrace.cc needs to be updated to use the BSD ptrace() interface
  • The ASLR code in aslr.cc has to be updated
  • The setns() stuff should be disabled on BSD, since it's Linux-specific

For macOS, all of the above need to be done, plus Mach-O parsing code needs to be added to symbol.cc.

Implement proper filesystem namespace support

The process will be like this:

  • open /proc/self/ns/mnt to get a file descriptor for our existing namespace
  • open /proc/PID/ns/mnt to get a file descriptor for the target PID's namespace
  • call setns(2) with the target fd
  • do symbol resolution for the ELF executable as usual (i.e. on the executable path we got from examining /proc/PID/cmdline)
  • call setns(2) with the fd for our original namespace to return back to the original namespace
  • close the two fds that were created
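
A minimal sketch of these steps (error handling omitted; ResolveSymbols is a stand-in for the existing ELF symbol-resolution code, and the real implementation must restore the original namespace even if symbol resolution fails):

#include <fcntl.h>
#include <sched.h>       // setns(2); needs _GNU_SOURCE, which g++ defines on Linux
#include <sys/types.h>
#include <unistd.h>

#include <string>

// Stand-in for the existing ELF symbol-resolution logic.
void ResolveSymbols(const std::string &exe_path);

void ResolveInTargetNamespace(pid_t pid, const std::string &exe_path) {
  // fd for our own mount namespace, so we can switch back afterwards.
  int self_ns = open("/proc/self/ns/mnt", O_RDONLY);
  // fd for the target PID's mount namespace.
  std::string target_path = "/proc/" + std::to_string(pid) + "/ns/mnt";
  int target_ns = open(target_path.c_str(), O_RDONLY);

  setns(target_ns, 0);        // enter the target's mount namespace
  ResolveSymbols(exe_path);   // parse the ELF executable as usual
  setns(self_ns, 0);          // return to the original namespace

  close(target_ns);
  close(self_ns);
}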

PauseChildThreads() sometimes hangs

I think this is related to #53 (cc @jamespic). I am consistently seeing Pyflame hang when profiling threads on my Chromebook. This is running Debian Jessie in a chroot, and the host kernel is 3.14.0. IIRC I did test this branch on a regular Jessie host, so I think it might be related to the ChromeOS kernel? I'll have to check.

What I am seeing is that it goes to pause both threads in a Python process running test_threaded_sleeper.py, and gets stuck on the second thread. By "stuck" I mean the call to wait() that is made after the PTRACE_ATTACH never completes. The full stack is like:

(gdb) bt
#0  0x00007f0358495ab2 in __libc_wait (stat_loc=0x0) at ../sysdeps/unix/sysv/linux/wait.c:30
#1  0x00000000004065eb in pyflame::PtraceAttach (pid=pid@entry=26554) at ptrace.cc:42
#2  0x00000000004079a1 in PauseChildThreads (pid=26553) at ptrace.cc:209
#3  pyflame::PtraceCallFunction (pid=26553, addr=4794543) at ptrace.cc:222
#4  0x000000000040a5f1 in pyflame::PyFrob::DetectPython (this=this@entry=0x7fffd588a130) at pyfrob.cc:135
#5  0x0000000000403845 in main (argc=<optimized out>, argv=<optimized out>) at pyflame.cc:268

This is a blocker for tagging a new release.

Some question about profiling gevent based web application

My application is a web application hosted by gunicorn, using the gevent async worker.

gunicorn pre-forks two processes, each a single-threaded gevent worker. The attached graph is a profile of one worker.

From the docs, pyflame can't profile IO and C code, so I think the IDLE part on the left is related to that.

The middle part is my real business-logic code.

For the long part on the right, everything is in gevent.threadpool:_worker (task = task_queue.get()). I'm not familiar with gevent internals; is that normal? It seems to be gevent's main greenlet waiting for available tasks. My guess is that gevent is blocked by some CPU-heavy code, but then how can it be profiled at the same level as my real application code? (If my assumption were right, the right part and the middle part should be mixed together.)

If I look only at the middle part (my business-logic code), does it reflect only the CPU usage of my call stack? For example, say I have a very complex function that does both heavy CPU work and calls a slow RPC service (waiting for IO). With pyflame the waiting-for-IO part should show up as IDLE, so the RPC part of the flame graph will be narrow. Is my understanding right?

Thanks for any advice.

(Screenshot: flame graph of the profiled gevent worker.)

Crash in setns code

Here's a trace from strace:

ptrace(PTRACE_ATTACH, 129837, 0, 0)     = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_TRAPPED, si_pid=129837, si_uid=33, si_status=SIGSTOP, si_utime=0, si_stime=0} ---
wait4(-1, NULL, 0, NULL)                = 129837
lstat("/proc/self/ns/mnt", {st_mode=S_IFLNK|0777, st_size=0, ...}) = 0
readlink("/proc/self/ns/mnt", "mnt:[4026531840]", 4096) = 16
readlink("/proc/129837/ns/mnt", "mnt:[4026532986]", 4096) = 16
open("/proc/self/ns/mnt", O_RDONLY)     = 3
open("/proc/129837/ns/mnt", O_RDONLY)   = 4
readlink("/proc/129837/exe", "/usr/sbin/nginx", 4096) = 15
setns(4, 0)                             = 0
open("/usr/sbin/nginx", O_RDONLY)       = 5
setns(3, 0)                             = 0
lseek(5, 0, SEEK_END)                   = 979976
mmap(NULL, 979976, PROT_READ, MAP_SHARED, 5, 0) = 0x7f0b9c2d0000
close(5)                                = 0
open("/proc/129837/maps", O_RDONLY)     = 5
read(5, "00400000-004d7000 r-xp 00000000 fc:00 50202960                           /usr/sbin/nginx\n006d7000-006d8000 r--p 000d7000 fc:00 50202960                           /usr/sbin/nginx\n006d8000-006ef000 rw-p 000d8000 fc:00 50202960                           /usr/sbin/nginx\n006ef000-006fe000 rw-p 00000000 00:00 0 \n00af7000-00b56000 rw-p 00000000 00:00 0                                  [heap]\n7f37cbe62000-7f37cbe67000 r-xp 00000000 fc:00 50202886                   /usr/lib/x86_64-linux-gnu/libXdmcp.so.6.0.0\n7f37cbe67000-7f37cc066000 ---p 00005000 fc:00 50202886                   /usr/lib/x86_64-linux-gnu/libXdmcp.so.6.0.0\n7f37cc066000-7f37cc067000 rw-p 00004000 fc:00 50202886                   /usr/lib/x86_64-linux-gnu/libXdmcp.so.6.0.0\n7f37cc067000-7f37cc06a000 r-xp 00000000 fc:00 50202884                   /usr/lib/x86_64-linux-gnu/libXau.so.6.0.0\n7f37cc06a000-7f37cc269000 ---p 00003000 fc:00 50202884                   /usr/lib/x86_64-linux-gnu/libXau.so.6.0.0\n7f37cc269000-7f37cc26a000 r--p "..., 8191) = 4061
read(5, "7f37cd25e000-7f37cd26f000 r-xp 00000000 fc:00 50202888                   /usr/lib/x86_64-linux-gnu/libXpm.so.4.11.0\n7f37cd26f000-7f37cd46e000 ---p 00011000 fc:00 50202888                   /usr/lib/x86_64-linux-gnu/libXpm.so.4.11.0\n7f37cd46e000-7f37cd46f000 r--p 00010000 fc:00 50202888                   /usr/lib/x86_64-linux-gnu/libXpm.so.4.11.0\n7f37cd46f000-7f37cd470000 rw-p 00011000 fc:00 50202888                   /usr/lib/x86_64-linux-gnu/libXpm.so.4.11.0\n7f37cd470000-7f37cd4ab000 r-xp 00000000 fc:00 50202904                   /usr/lib/x86_64-linux-gnu/libfontconfig.so.1.8.0\n7f37cd4ab000-7f37cd6aa000 ---p 0003b000 fc:00 50202904                   /usr/lib/x86_64-linux-gnu/libfontconfig.so.1.8.0\n7f37cd6aa000-7f37cd6ac000 r--p 0003a000 fc:00 50202904                   /usr/lib/x86_64-linux-gnu/libfontconfig.so.1.8.0\n7f37cd6ac000-7f37cd6ad000 rw-p 0003c000 fc:00 50202904                   /usr/lib/x86_64-linux-gnu/libfontconfig.so.1.8.0\n7f37cd6ad000-7f37cd751000 r-xp 00000000 fc:00 50"..., 8191) = 4024
read(5, "7f37ce820000-7f37ce822000 r--p 00019000 fc:00 121637035                  /lib/x86_64-linux-gnu/libaudit.so.1.0.0\n7f37ce822000-7f37ce823000 rw-p 0001b000 fc:00 121637035                  /lib/x86_64-linux-gnu/libaudit.so.1.0.0\n7f37ce823000-7f37ce82d000 rw-p 00000000 00:00 0 \n7f37ce82d000-7f37ce9cf000 r-xp 00000000 fc:00 122817824                  /lib/x86_64-linux-gnu/libc-2.19.so\n7f37ce9cf000-7f37cebce000 ---p 001a2000 fc:00 122817824                  /lib/x86_64-linux-gnu/libc-2.19.so\n7f37cebce000-7f37cebd2000 r--p 001a1000 fc:00 122817824                  /lib/x86_64-linux-gnu/libc-2.19.so\n7f37cebd2000-7f37cebd4000 rw-p 001a5000 fc:00 122817824                  /lib/x86_64-linux-gnu/libc-2.19.so\n7f37cebd4000-7f37cebd8000 rw-p 00000000 00:00 0 \n7f37cebd8000-7f37cec06000 r-xp 00000000 fc:00 122819021                  /usr/lib/x86_64-linux-gnu/libGeoIP.so.1.6.2\n7f37cec06000-7f37cee05000 ---p 0002e000 fc:00 122819021                  /usr/lib/x86_64-linux-gnu/libGeoIP.so.1.6.2\n7f37cee050"..., 8191) = 4054
read(5, "7f37cfc4f000-7f37cfe7f000 r-xp 00000000 fc:00 122819040                  /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0\n7f37cfe7f000-7f37d007e000 ---p 00230000 fc:00 122819040                  /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0\n7f37d007e000-7f37d009c000 r--p 0022f000 fc:00 122819040                  /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0\n7f37d009c000-7f37d00ac000 rw-p 0024d000 fc:00 122819040                  /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0\n7f37d00ac000-7f37d00b0000 rw-p 00000000 00:00 0 \n7f37d00b0000-7f37d010f000 r-xp 00000000 fc:00 122819110                  /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0\n7f37d010f000-7f37d030f000 ---p 0005f000 fc:00 122819110                  /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0\n7f37d030f000-7f37d0313000 r--p 0005f000 fc:00 122819110                  /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0\n7f37d0313000-7f37d031a000 rw-p 00063000 fc:00 122819110                  /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0\n7f37d031a000-7f37d0"..., 8191) = 4023
read(5, "7fff8ac6e000-7fff8ac70000 r--p 00000000 00:00 0                          [vvar]\n7fff8ac70000-7fff8ac72000 r-xp 00000000 00:00 0                          [vdso]\nffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]\n", 8191) = 244
read(5, "", 8191)                       = 0
close(5)                                = 0
setns(4, 0)                             = 0
open("", O_RDONLY)                      = -1 ENOENT (No such file or directory)
setns(3, 0)                             = 0
munmap(0x7f0b9c2d0000, 979976)          = 0
close(3)                                = 0
close(4)                                = 0
write(2, "Failed to open target : No such file or directory", 49) = 49
write(2, "\n", 1)                       = 1
exit_group(1)                           = ?
+++ exited with 1 +++

Failed to locate libpython within timeout period.

I'm using Python 3.5. When I execute the command "src/pyflame -s 60 -r 0.01 -p 25192 | /FlameGraph/flamegraph.pl > test_flame.svg",
the error 'Failed to locate libpython within timeout period' is raised. Is this a compatibility issue?

Improve Python 3 Support

Currently Pyflame works with Python 3 only as long as the file names are ASCII. If the file names include Unicode characters, Pyflame will fail to decode them.

Some Background

If you download the source code for Python, you can find the implementation of strings (actually unicode objects) in Objects/unicodeobject.c. There are a few other files, but this is the main one.

A unicode object in Python actually has one of three representations; it can be any of the following:

  • PyASCIIObject
  • PyCompactUnicodeObject
  • PyUnicodeObject

Which one is used depends on which codepoints are in the string. If the string contains only ASCII characters, the compact PyASCIIObject representation is used. This representation is just like a Python 2 string: it has a size and a pointer to bytes. Currently Pyflame assumes that string objects are always of type PyASCIIObject.

A Solution

A fix for this would include:

  • logic to detect what underlying type the unicode object actually is
  • logic for getting the raw bytes from the Unicode object
  • logic for converting the raw bytes to UTF-8 for non-ASCII strings

You should be pretty familiar with the internals of unicodeobject.c before attempting to implement a fix here.

As of this writing, the code here should probably go into src/frob.cc which is where the existing string handling code is. Although depending on the complexity of the solution, it may make sense to break out into another file.
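
A rough sketch of the detection step, written against the CPython 3.x unicodeobject.h of that era (the state bitfield with its compact and ascii flags). In Pyflame the struct would be read out of the target process with ptrace rather than dereferenced directly, so treat this as illustrative only:

#include <Python.h>

// The three in-memory representations described above.
enum class UnicodeRepr { Ascii, CompactUnicode, Legacy };

UnicodeRepr ClassifyUnicode(const PyASCIIObject *obj) {
  if (obj->state.compact) {
    // Compact objects store their character data inline, immediately after
    // the header: after PyASCIIObject for pure-ASCII strings, after
    // PyCompactUnicodeObject otherwise.
    return obj->state.ascii ? UnicodeRepr::Ascii : UnicodeRepr::CompactUnicode;
  }
  // Non-compact ("legacy") strings keep a separate data pointer inside
  // PyUnicodeObject.
  return UnicodeRepr::Legacy;
}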

Python 3.6.1 support

Bleeding edge I know. Testing with Ubuntu 16.04 x64,

virtualenv -p python3.6 env
source env/bin/activate
pyflame -t python -c 'println(sum(i for i in range(0, 100000)))'

I only get (idle) 2 out of it. The same invocation works for python 3.5.2 and 2.7 on the same system. Attaching to a 3.6.1 process also only shows idle time. Sometimes, I instead get Failed to PTRACE_PEEKDATA at 0x10: Input/output error with different running processes, but I haven't figured out how to reproduce that.

Compilation failed due to `ENABLE_PY2` not defined

I can't compile pyflame successfully. My environment has Python 2.7 with the development files installed. See the configure log:

checking for PY26... yes
checking for PY34... no
checking for PY34... no
configure: WARNING: Building without Python 3.4/3.5 support
checking for PY36... no
configure: WARNING: Building without Python 3.6 support

But the compilation still fails:

pyflame.cc:40:1: error: static assertion failed: Need Python2 or Python3 support to build
static_assert(false, "Need Python2 or Python3 support to build");

It looks to me like only ENABLE_PY26 is defined, but not ENABLE_PY2. If I temporarily change the check to ENABLE_PY26, the build succeeds.

Use PTRACE_TRACEME in trace mode

The current tracing code has a race condition when detecting if the child process is ready to be traced. The code should use PTRACE_TRACEME. I read the man page and I think this is the correct way to do it:

  • parent calls fork()
  • child does PTRACE_TRACEME and then raise(SIGSTOP)
  • parent calls waitpid(0, 0, __WALL)
  • parent checks for SIGCHLD
  • parent sets PTRACE_O_TRACEEXEC on child
  • parent resumes child
  • child calls exec()
  • parent calls waitpid(0, 0, __WALL)
  • parent checks for PTRACE_EVENT_EXEC
  • parent probes child for Python symbols
  • parent resumes child and begins the poll loop

Also related, instead of repeating PTRACE_ATTACH and PTRACE_DETACH I think it would be better to use PTRACE_SEIZE followed by PTRACE_CONT and PTRACE_INTERRUPT. That way another process can't race Pyflame.
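
A minimal sketch of the handshake described in the list above (error handling and the decoding of the PTRACE_EVENT_EXEC stop status are omitted):

#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Fork a child that will be traced across its exec() of argv.
pid_t SpawnTraced(char **argv) {
  pid_t child = fork();
  if (child == 0) {
    // Child: request tracing, then stop so the parent can set options
    // before we exec.
    ptrace(PTRACE_TRACEME, 0, nullptr, nullptr);
    raise(SIGSTOP);
    execvp(argv[0], argv);
    _exit(127);  // only reached if exec fails
  }
  int status;
  waitpid(child, &status, __WALL);               // wait for the SIGSTOP
  ptrace(PTRACE_SETOPTIONS, child, nullptr,
         reinterpret_cast<void *>(PTRACE_O_TRACEEXEC));
  ptrace(PTRACE_CONT, child, nullptr, nullptr);  // resume; child calls exec()
  waitpid(child, &status, __WALL);               // stops again at PTRACE_EVENT_EXEC
  // At this point the parent can probe the child for Python symbols,
  // then resume it and begin the sampling loop.
  return child;
}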

"idle" detalization

Do you have any plans to support sampling "idle" call stacks too?

pyflame's design looks best to me, making it potentially the best sampling Python profiler, but the lack of an "idle" breakdown is currently a show-stopper: I'm working on web software that performs various IO (e.g. networking to several servers, locks) and I need to know which calls caused the extra latency ("off-CPU").
Threading support is another story, but as I see from #13 it is already planned, isn't it?

I had to write a custom profiler to run continuously in production with very low overhead. It works well and has proved itself useful, but I'm struggling to release it to the public because it needs good documentation (a complication: English is not my native language) and quite a bit more polishing, and those are boring, demotivating tasks for me alone. So when I read the blog post about pyflame, I thought I could abandon my project for a clearly superior one (sad, but a relief). So now I need to decide :) I could probably try to contribute, though I'm new to Python internals (and to ptrace-related things). All my experience with Python internals is reducing the GIL impact in my profiler's sampling loop: it runs in a thread, and I reduced the GIL-related calls to a minimum (using Cython tricks) because they turned out to be very expensive (I profiled this with VTune).

Pyflame can't profile its own test suite

I wanted to Pyflame the Pyflame test suite, but Pyflame exited very early with a 0 status code. I haven't dug into this yet, but just from thinking about it my guess is that when Pyflame becomes the parent of the traced process it's somehow getting notifications from grandchild processes or something, and then the waitid() loop thinks the traced process has exited. In any event, after building Pyflame the following command should not print profiling data until the test suite is actually done running:

./src/pyflame -t py.test tests/

This is tangentially related to #67, if my waitid() theory is true.

It doesn't work with app under uwsgi

Hi!

It is a great tool, but I have problems with an app under uwsgi. Pyflame shows only (idle) for it. Could you help me, please?

Python application (app.py):

# -*- coding: utf-8 -*-
from flask import Flask


app = Flask(__name__)


@app.route('/')
def hello_world():
    return 'Hello, World!'

Starting uwsgi:

[~/pyflame_test]$ uwsgi --workers 1 --module app:app --http-socket=127.0.0.1:8011
*** Starting uWSGI 2.0.12 (64bit) on [Wed Oct  5 04:53:43 2016] ***
...
spawned uWSGI worker 1 (and the only) (pid: 29707, cores: 1)

Starting pyflame:

sudo ./pyflame 29707 -s 5

Making http request:

[~/]$ curl http://127.0.0.1:8011/
Hello, World!

pyflame has printed: (idle) 4363

If I run strace on this worker I see:

strace -p 29707
...
read(5, "GET / HTTP/1.1\r\nUser-Agent: curl"..., 4096) = 78
writev(5, [{"HTTP/1.1 200 OK\r\nContent-Type: t"..., 79}, {"Hello, World!", 13}], 2) = 92
...
[~/pyflame_test]$ uwsgi --version
2.0.12

--threads -t crash: Failed to PTRACE_GETREGS: No such process

pyflame 1.5.0 (compiled from d1a76174e6b570c7e98af79a694f8769271f922a)
Python 2.7.12
$ uname -a
Linux test-VirtualBox 4.10.0-28-generic #32~16.04.2-Ubuntu SMP Thu Jul 20 10:19:48 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
(I had the same problem on 14 ubuntu, IIRC)
$ cat pyflametest.py
import time

def sleep300 ():
    time.sleep(.300)

def sleep700 ():
    time.sleep(.700)

def main ():
    t1 = time.time()
    while True:
        if time.time() - t1 >= 5:
            return
        sleep300()
        sleep700()

main()
works:
$ pyflame -t python pyflametest.py 
(idle) 3470
/usr/lib/python2.7/site.py:<module>:563;/usr/lib/python2.7/site.py:main:545;/usr/lib/python2.7/site.py:addusersitepackages:272;/usr/lib/python2.7/site.py:getusersitepackages:247;/usr/lib/python2.7/site.py:getuserbase:237;/usr/lib/python2.7/sysconfig.py:get_config_var:582;/usr/lib/python2.7/sysconfig.py:get_config_vars:509;/usr/lib/python2.7/re.py:<module>:105;/usr/lib/python2.7/sre_compile.py:<module>:15;/usr/lib/python2.7/sre_parse.py:<module>:706 1

crashes:
test@test-VirtualBox:~$ pyflame --threads -t python pyflametest.py
terminate called after throwing an instance of 'pyflame::PtraceException'
  what():  Failed to PTRACE_GETREGS: No such process
Aborted (core dumped)
$ make test
./runtests.sh
Running test suite against Python 2.7.12
.....s..............
19 passed, 1 skipped in 13.38 seconds
Running test suite against Python 3.5.2
Already using interpreter /usr/bin/python3
....................
20 passed in 14.47 seconds

Add Memory Safety Tests

I occasionally run Pyflame through valgrind, and it always gets a completely clean report. However, I'd like to test this more rigorously. Things I'd like to see:

  • Add tests that check that valgrind --tool=memcheck on Pyflame returns a clean output report
  • Add ASAN tests that build Pyflame with ASAN and make sure no errors are reported
  • Add tcmalloc heap checker tests (although I think the above two checks should really be sufficient)

I'd kind of like to write this as a separate library/framework, since I've wanted to do these kinds of tests with other C++ programs I've written. These should be written in a way that they're optional, so the tests will only run if the tools are actually available on the system. And of course, the Travis job should run the full suite of memory checking tests.

C compiler needs C++ standard option?

Running configure with clang/clang++ 3.9 fails

checking whether C compiler accepts "-std=c++11"... no
checking whether C compiler accepts "-std=c++0x"... no
configure: error: failed to detect C++11 support

configure:3938: clang -c -std=gnu11 -march=native -O3 -pipe -std=c++11 conftest.c >&5
error: invalid argument '-std=c++11' not allowed with 'C/ObjC'

Not sure why the C compiler has to support that.

Permission errors when testing in a Docker container

I tried to build and test this entirely in a Docker container and got permission errors. I'll try to be as explicit as possible so this is reproducible. Please let me know if I'm just doing something silly.

This is running on a Lucid machine, using the Trusty base image.

General approach

I'm running the Docker image from the Dockerfile below then attaching to the container with a new bash shell (docker exec -i -t <container> /bin/bash) and running pyflame <test_script.py's PID>. I always get the permission error message Failed to attach to PID <PID>: Operation not permitted.

test_script.py is just a trivial python script I was using to test.

Dockerfile

FROM ubuntu:trusty

ENV DEBIAN_FRONTEND noninteractive

RUN apt-get update && apt-get -y dist-upgrade
RUN apt-get install -y \
    autoconf \
    autotools-dev \
    cmake \
    bash \
    git \
    g++ \
    pkg-config \
    python-dev

RUN mkdir -p work
WORKDIR work

ADD test_script.py /work/test_script.py
ADD test_file /work/test_file

RUN git clone https://github.com/uber/pyflame.git
RUN cd pyflame && \
    ./autogen.sh && \
    ./configure && \
    make && \
    make install

USER nobody
CMD python test_script.py

Things I've already tried

  • I've confirmed that kernel.yama.ptrace_scope = 0
  • I've run the container as root, attached, and run pyflame as root

Need to add apt install automake

I was getting a Can't exec "aclocal": No such file or directory at /usr/share/autoconf/Autom4te/FileUtils.pm line 326 error and needed to run apt install automake on my Linux Mint box.

Gathered data is lost if process terminates before Pyflame

Currently, if the process being profiled terminates before Pyflame has finished, the Pyflame process errors rather than printing what has been profiled up until that point.

Ideally, if the process being profiled terminates before Pyflame has finished, the Pyflame process would exit cleanly and the output be as normal.

I'm not sure of the feasibility of this request so feel free to close the issue if appropriate.


Reproducible example

Running the following will demonstrate the error:

$ python -c "from time import sleep; sleep(1)" & sudo pyflame "$!" -s 5
[1] 11075
Failed to attach to PID 11075: No such process

Use case

I'm trying to profile a set of tests that are invoked with py.test ....

Currently, I start this process in the background with py.test ... &. I determine the PID of the process (by writing and reading from a file) and then call Pyflame on this PID. Given that I want to profile the whole test suite, I have to match the time that Pyflame is called for with the time the test suite should take to run (which is naturally variable).

It would be especially convenient to call Pyflame for longer than necessary to ensure the whole test suite is profiled.
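
One way this could be handled (a sketch under assumptions, not a description of Pyflame's current behavior): treat ESRCH from ptrace as the target having exited, so the sampling loop stops cleanly and whatever has already been collected is still printed.

#include <cerrno>
#include <stdexcept>
#include <sys/ptrace.h>
#include <sys/types.h>

// Returns false if the target has exited (ESRCH), so the caller can break
// out of the sampling loop and emit the samples gathered so far instead of
// aborting with an error. Other ptrace failures remain fatal; the exception
// type here is a stand-in for the project's own error handling.
bool TargetStillAlive(pid_t pid) {
  if (ptrace(PTRACE_ATTACH, pid, nullptr, nullptr) == -1) {
    if (errno == ESRCH) {
      return false;  // process is gone: flush samples and exit cleanly
    }
    throw std::runtime_error("Failed to attach to PID");
  }
  return true;
}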

Empty flame graphs when there's a lot of idle time

I ran this very simple Python program:

import time

def compute(x):
    res = x**2*7/9+31
    return res

index = 0
while True:
    index += 1
    compute(index)
    time.sleep(1)

And I profiled it using the following command line:

pyflame -s 20 -r 0.00000000000001 {pid} | ./flamegraph.pl > flames.svg

I understand that compute() completes and returns very rapidly. And I also understand that time.sleep() takes the majority of my program's time. However, given that the sampling interval is very small, I would have expected the flame graph to contain at least a few samples for the compute() function. But this isn't the case; the flame graph always shows 100% idle time.

Now, if I considerably reduce the value passed to time.sleep() (e.g. 0.0000001) in my program, then the flame graph contains samples for the compute() method, as we could expect.

Is this the normal behaviour of pyflame? Or is it a technical limitation?

Thanks!

Introduce -p for specifying the PID to trace

I think that instead of supplying the target PID as the last argument, we should use -p to specify the PID to trace. Thus, instead of pyflame 1678 you'd use pyflame -p 1678. This change would make Pyflame more consistent with other debugging tools like lsof, strace, gdb, etc. The rollout plan will be like:

  • Add a new -p option that the PID can be used in, but keep the existing PID parsing behavior if -p is not used.
  • Update all docs to use -p.
  • Print a deprecation warning to stderr if a PID is passed without -p.
  • (Optional) At some future point in time, remove the legacy PID parsing code and make -p mandatory.

I have a branch named dashp that starts this work.
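
A sketch of what the compatible parsing could look like (illustrative only; the real CLI has many more options, and the loop below is trimmed to just the PID handling):

#include <unistd.h>

#include <cstdio>
#include <cstdlib>

// Accept both "pyflame -p PID" and the legacy "pyflame PID" form, warning
// about the latter so it can eventually be removed.
int ParsePid(int argc, char **argv) {
  int pid = -1;
  int c;
  while ((c = getopt(argc, argv, "p:")) != -1) {
    if (c == 'p') {
      pid = std::atoi(optarg);
    }
  }
  if (pid == -1 && optind < argc) {
    // Legacy behavior: bare PID as the last argument.
    pid = std::atoi(argv[optind]);
    std::fprintf(stderr,
                 "WARNING: passing a bare PID is deprecated; use -p PID\n");
  }
  return pid;
}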

Add option to profile child processes

I've hit a few use cases where it would be handy to profile a process and its children (multiprocessing, forking web servers, PySpark). Would there be interest in a patch implementing this?

uwsgi: Target is Python 0, which is not supported by this pyflame build.

Hi,

I'm getting "Target is Python 0, which is not supported by this pyflame build." error when running pyflame against uwsgi (despite what the issue #23 says, that it should be working).

OS: Ubuntu Xenial 16.04
Python: 2.7
uwsgi: 2.0.14
pyflame: 1.2.1

Steps to reproduce:

  • the same as #23

Running uwsgi:

$ uwsgi --workers 1 --module app:app --http-socket=127.0.0.1:8011
...
Python version: 2.7.12 (default, Nov 19 2016, 06:48:10)  [GCC 5.4.0 20160609]
...

Running pyflame:

$ sudo ./pyflame/src/pyflame `pgrep uwsgi`
Target is Python 0, which is not supported by this pyflame build.
$

uwsgi really is loading libpython2.7

$ sudo lsof -p `pgrep uwsgi` | grep libpython
uwsgi   11149 ubuntu  mem       REG  253,1  3582904  19947 /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
$

Looking at src/pyfrob.cc, you seem to fall back to "libpython2.7.so", which does exist on my system:

$ ldconfig -p | grep libpython
	libpython3.5m.so.1.0 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
	libpython2.7.so.1.0 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
	libpython2.7.so (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libpython2.7.so
$

I have also tried uwsgi options --enable-threads, --single-interpreter, --master/workers with no luck.

Add config file, for setting default parameters

At the minimum, we want to support setting a default sample rate (same as the -r option) and sample time (same as the -s option).

Ideally this could be done using the XDG spec, so that it would be something like:

  • First read /etc/pyflame/config.yaml
  • Merge with ~/.config/pyflame/config.yaml

That way a site operator can update the default settings globally. It looks like there are a few C and C++ libraries that implement basic XDG support.

Make it so Pyflame can be built simultaneously against Python2 & Python3

Since Pyflame doesn't actually link against libpython, we can build it against both Python2 and Python3. This will remove the need for a configure option, and it will also simplify Debian packaging, since a single package can be built that supports both Python versions.

This might require some weird autoconf hacks :-/
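
One way this could work (a sketch; the names are illustrative): compile the version-specific frobbing code once per supported Python, each inside its own C++ namespace, and dispatch at runtime on the interpreter version detected in the target, so a single binary supports both.

#include <sys/types.h>

// Each namespace is compiled against a different set of Python headers
// (e.g. by building the same source file twice with different include paths).
namespace py2 { void DumpStacks(pid_t pid); }
namespace py3 { void DumpStacks(pid_t pid); }

// Runtime dispatch based on the Python major version detected in the target.
void DumpStacks(pid_t pid, int python_major_version) {
  if (python_major_version == 2) {
    py2::DumpStacks(pid);
  } else {
    py3::DumpStacks(pid);
  }
}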

Resolve function names

Right now we resolve the file name and line number; but I suspect most people would rather use the filename and function name.

The mode should be configurable via a command line option.
