GithubHelp home page GithubHelp logo

knobs-dials / straced Goto Github PK

View Code? Open in Web Editor NEW
9.0 2.0 2.0 106 KB

runs strace on processes that spend time in iowait

License: Apache License 2.0

Python 100.00%
strace linux iowait system-administration

straced's Introduction

straceD

Intended as an automated answer to "what programs are making my drives churn so hard, and why?"

  • Periodically checks for processes that are in D state (uninterruptable sleep, typically "waiting on IO" and then usually "waiting on disk"),
  • ...and straces them if they keep doing that.
  • Stops stracing and gives a summary when
    • it stops being in D state, OR
    • the traced process stops running, OR
    • you press Ctrl-C on this program (helps give summaries on processes that rarely or never exit, like databases and other daemons)

By default you get a summary (strace's -c argument), only once we stop tracing for any of those reasons. If you want a more realtime (and much spammier) feed, add -C to get all the syscalls of the process. You may then also want to use -e to have strace filter a bit more.

Considerations

You need root rights.

Programs run more slowly while having strace attached to it.

I'm fairly sure that a process can be straced by only one thing at a time, so consider possible clashes with similar tools, debuggers and such.

The "if it stays in D state" check is actually "if runs of ps reports D state more than once, some time apart", which is coarse and can miss things. I consider this a feature because it ends up ignoring some (series of) shorter-but-not-instant syscalls (like smartd's), but for specific purposes you may wish to lower it.

Example

# straceD
[20:24:56.17] Starting to trace process music-sync-disk(28324)
[20:25:45.51] music-sync-disk(28324):
[20:25:45.51] music-sync-disk(28324): % time     seconds  usecs/call     calls    errors syscall
[20:25:45.51] music-sync-disk(28324): ------ ----------- ----------- --------- --------- ----------------
[20:25:45.51] music-sync-disk(28324):  80.21    0.434608          30     14722           stat
[20:25:45.51] music-sync-disk(28324):  13.77    0.074588         174       429           getdents
[20:25:45.51] music-sync-disk(28324):   4.69    0.025420           5      5021           lstat
[20:25:45.51] music-sync-disk(28324):   0.58    0.003118          11       293           munmap
[20:25:45.51] music-sync-disk(28324):   0.29    0.001575           7       211           openat
[20:25:45.51] music-sync-disk(28324):   0.22    0.001198           6       214           close
[20:25:45.51] music-sync-disk(28324):   0.13    0.000721           3       211           fstat
[20:25:45.51] music-sync-disk(28324):   0.07    0.000360          10        35           write
[20:25:45.51] music-sync-disk(28324):   0.02    0.000125           8        16           read
[20:25:45.51] music-sync-disk(28324):   0.01    0.000068          11         6           mmap
[20:25:45.51] music-sync-disk(28324):   0.00    0.000015          15         1           rt_sigreturn
[20:25:45.51] music-sync-disk(28324):   0.00    0.000015          15         1           getpid
[20:25:45.51] music-sync-disk(28324):   0.00    0.000004           4         1           brk
[20:25:45.51] music-sync-disk(28324):   0.00    0.000004           4         1           rt_sigaction
[20:25:45.51] music-sync-disk(28324):   0.00    0.000000           0         1           sendto
[20:25:45.51] music-sync-disk(28324): ------ ----------- ----------- --------- --------- ----------------

...which was a process doing a large directory treewalk.

Arguments

Usage: straceD [options]

Options:
  -h, --help            show this help message and exit
  -c                    use strace's -c, which prints a syscall summary but
                        not the individual ones (default)
  -C                    use strace's -C, which prints both individual syscalls
                        and a summary
  -f                    use strace's -f, which follows forked processes.
  --forget=FORGET       After how many seconds of behaving (no longer in D
                        state) we forget about a process. Default: 5
  -s SLEEP, --sleep=SLEEP
                        time to sleep between checks, in seconds. Default: 0.5
  -v, --verbose         How much extra detail to print (our own stuff, plus
                        e.g. whether to filter out strace's attach/detach
                        stuff). Can be repeated.
  -e EXPR               expression to pass through to strace -e, for example
                        -e file. Note that unlike strace itself, our option
                        parsing needs that space there. Note also that this
                        also filters the summary output.

TODO

  • portablity, e.g. compatibility with non-linux ps implementations

  • the subprocess code had zombie issues. I think I fixed it, but should still read up more on edge cases.

  • add 'ignore process called X' parameter, e.g. to ignore the database on a database host

straced's People

Contributors

knobs-dials avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

straced's Issues

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.