
yugr / sortcheck


Tool for detecting violations of ordering axioms in qsort/bsearch callbacks.

License: MIT License


sortcheck's Introduction


What is this?

SortChecker is a tool for detecting violations of ordering axioms in comparison functions passed to qsort (and also bsearch, lfind, etc.). For complex data structures it is very easy to violate one of the requirements. Such violations cause undefined behavior and may lead to all sorts of runtime errors, including unexpected results, inconsistent results across different platforms or even aborts.
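To make the kind of bug concrete, here is a minimal sketch of a comparator that breaks the symmetry axiom, next to a corrected one. The struct and function names are hypothetical and chosen only for this illustration; they do not come from SortChecker or from any program it was run on.

```c
#include <stdlib.h>

/* Hypothetical record type used only for this illustration. */
struct item {
  int key;
};

/* Broken: returns a boolean "less than" (as a C++ std::sort predicate
   would), so it yields 1 or 0 and never a negative value.
   cmp(a, b) == 1 does not imply cmp(b, a) < 0: symmetry is violated. */
int cmp_broken(const void *pa, const void *pb) {
  const struct item *a = pa, *b = pb;
  return a->key < b->key;
}

/* Fixed: negative, zero or positive, consistently in both directions. */
int cmp_fixed(const void *pa, const void *pb) {
  const struct item *a = pa, *b = pb;
  return (a->key > b->key) - (a->key < b->key);
}
```

Passing cmp_broken to qsort compiles without warnings, which is why this class of error tends to go unnoticed until the results differ between libc implementations.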

The tool works by intercepting qsort and friends through LD_PRELOAD and performing various checks before passing control to libc. It can be applied to both C and C++ programs, although for the latter std::sort and std::binary_search are more typical (use my SortChecker++ tool to diagnose errors in them).
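The interception idea can be sketched roughly as follows. This is not SortChecker's actual code: the counter, the helper names and the choice to check only symmetry on up to 32 leading elements are assumptions made for illustration. The only real APIs used are dlsym with RTLD_NEXT (glibc) and the standard qsort signature.

```c
#define _GNU_SOURCE /* for RTLD_NEXT; must precede the first #include */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

typedef int (*cmp_fn)(const void *, const void *);
typedef void (*qsort_fn)(void *, size_t, size_t, cmp_fn);

/* Hypothetical counter, exposed only so the sketch is easy to observe. */
unsigned long symmetry_violations;

static int sign(int v) { return (v > 0) - (v < 0); }

/* Check sign(cmp(x, y)) == -sign(cmp(y, x)) on up to 32 leading elements. */
static void check_symmetry(const char *base, size_t n, size_t size, cmp_fn cmp) {
  if (n > 32) n = 32;
  for (size_t i = 0; i < n; ++i)
    for (size_t j = i + 1; j < n; ++j) {
      const void *x = base + i * size, *y = base + j * size;
      if (sign(cmp(x, y)) != -sign(cmp(y, x))) {
        ++symmetry_violations;
        fprintf(stderr, "qsort: asymmetric comparator at elements %zu and %zu\n", i, j);
        return; /* one report per call is enough for a sketch */
      }
    }
}

/* Same name and signature as libc's qsort, so calls resolve to this
   definition first when the library is preloaded. */
void qsort(void *base, size_t nmemb, size_t size, cmp_fn cmp) {
  static qsort_fn real_qsort;
  if (!real_qsort)
    real_qsort = (qsort_fn)dlsym(RTLD_NEXT, "qsort"); /* next definition: libc */
  if (!real_qsort)
    abort();
  check_symmetry(base, nmemb, size, cmp);
  real_qsort(base, nmemb, size, cmp);
}

/* Deliberately buggy demo comparator (never returns 0). It would live in
   the application under test, not in the preloaded library. */
int demo_cmp_never_equal(const void *pa, const void *pb) {
  int a = *(const int *)pa, b = *(const int *)pb;
  return a < b ? -1 : 1;
}
```

Built as a shared object (for example with gcc -shared -fPIC, plus -ldl on older glibc) and listed in LD_PRELOAD, a wrapper like this sees every qsort call the program makes through the dynamic linker, with no need to rebuild the program.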

The tool is quite robust: I've successfully booted stock Ubuntu 14, Fedora 22 and a Debian chroot under it, and bootstrapped GCC 4.9.

The project is MIT-licensed. It has no fancy dependencies, just Glibc and Bash.

What are current results?

I've done some basic testing of Ubuntu 14.04 and Fedora 22 distros under SortChecker (opening file/web browsers, navigating system menus, installing various apps, etc.).

The tool has found errors in many programs. Here are some trophies:

I haven't seen a noticeable slowdown when working in a fully checked distro or building C++ projects with a checked compiler.

Usage

You do not need to rebuild your app to test it under SortChecker. Just run with preloaded libsortcheck.so:

$ LD_PRELOAD=libsortcheck.so myapp ...

(you'll probably want to combine this with some kind of regression or random/fuzz testing to achieve good coverage).

You could also use a helper script sortcheck to do this for you:

$ sortcheck myapp ...

To debug the issue, you can run with

$ SORTCHECK_OPTIONS=raise=1 sortcheck myapp ...

and then examine the generated core dump in gdb.

By default SortChecker enables a set of common checks which should be enough for most users. You can also customize its behavior through the SORTCHECK_OPTIONS environment variable, which is a colon-separated list of option assignments, e.g.

$ export SORTCHECK_OPTIONS=debug=1:max_errors=10
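For readers curious how such a colon-separated string might be consumed, here is a minimal parsing sketch. It is an assumption about the format only, not SortChecker's real parser: the struct, the function names and the restriction to two integer options are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical subset of settings; the real tool supports more options. */
struct options {
  int debug;
  int max_errors;
};

/* True if the first len bytes of s spell exactly the given name. */
static int name_is(const char *s, size_t len, const char *name) {
  return strlen(name) == len && memcmp(s, name, len) == 0;
}

/* Parse "name=value:name=value" into *opts. Returns 0 on success and -1
   on a malformed or unknown assignment. The input is not modified. */
int parse_options(const char *s, struct options *opts) {
  while (*s) {
    const char *end = strchr(s, ':');
    size_t len = end ? (size_t)(end - s) : strlen(s);
    const char *eq = memchr(s, '=', len);
    if (!eq)
      return -1;
    size_t name_len = (size_t)(eq - s);
    int value = atoi(eq + 1); /* atoi stops at ':' or end of string */
    if (name_is(s, name_len, "debug"))
      opts->debug = value;
    else if (name_is(s, name_len, "max_errors"))
      opts->max_errors = value;
    else
      return -1;
    s += len;
    if (*s == ':')
      ++s; /* skip the separator */
  }
  return 0;
}
```

A caller would typically initialize the struct with the documented defaults (debug off, max_errors 10) and then pass the result of getenv("SORTCHECK_OPTIONS") when it is non-NULL.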

You can also put the option string in the /SORTCHECK_OPTIONS file (this is particularly useful for testing daemon processes).

Supported options are

  • max_errors - maximum number of errors to report (default 10)
  • debug - print debug info (default false)
  • print_to_file - print warnings to specified file (rather than default stderr)
  • print_to_syslog - print warnings to syslog instead of stderr (default false)
  • do_report_error - print reports (only used for benchmarking, default true)
  • raise - raise signal on detecting violation (useful for inspecting issues in debugger)
  • sleep - sleep for N seconds before printing error and continuing (may be useful for attaching with gdb and examining the situation)
  • check - comma-separated list of checks to perform; available options are
    • default - default set of checks (see below)
    • basic - check that comparison functions return stable results and do not modify inputs (enabled by default)
    • sorted - check that arrays passed to bsearch are sorted (enabled by default)
    • symmetry - check that cmp(x,y) == -cmp(y,x) (enabled by default)
    • transitivity - check that if x < y && y < z, then x < z (enabled by default)
    • reflexivity - check that cmp(x,x) == 0 (usually not very important so disabled by default, on the other hand may trigger on otherwise undetected asymmetry bugs)
    • unique - check that cmp does not compare different objects as equal (to avoid random orderings on different platforms)
    • good_bsearch - bsearch uses a restricted (non-symmetric) form of comparison function so some checks are not generally applicable; this option tells SortChecker that it should test bsearch more aggressively (unsafe so disabled by default). Note that this option may cause runtime errors or crashes if applied inappropriately.
    • for each option XYZ there's a dual no_XYZ (which disables corresponding check)
  • start - check the start-th group of 32 leading elements (default 0); a value of rand will select random group.
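As an illustration of what the transitivity check and the start option could amount to, here is a brute-force sketch over one group of 32 elements. It assumes that start indexes consecutive groups of 32 elements, which is one reading of the description above; the function names and the demo comparator are hypothetical and not taken from SortChecker.

```c
#include <stddef.h>

typedef int (*cmp_fn)(const void *, const void *);
enum { GROUP_SIZE = 32 };

/* Count triples (x, y, z) in the start-th group of GROUP_SIZE elements
   for which x < y and y < z hold but x < z does not. */
size_t count_transitivity_violations(const void *base, size_t nmemb,
                                     size_t size, cmp_fn cmp, size_t start) {
  const char *p = base;
  size_t lo = start * GROUP_SIZE;
  if (lo >= nmemb)
    return 0; /* group lies past the end of the array */
  size_t hi = (nmemb - lo < GROUP_SIZE) ? nmemb : lo + GROUP_SIZE;
  size_t bad = 0;
  for (size_t i = lo; i < hi; ++i)
    for (size_t j = lo; j < hi; ++j)
      for (size_t k = lo; k < hi; ++k) {
        const void *x = p + i * size, *y = p + j * size, *z = p + k * size;
        if (cmp(x, y) < 0 && cmp(y, z) < 0 && !(cmp(x, z) < 0))
          ++bad;
      }
  return bad;
}

/* Demo comparator on values 0, 1, 2 with a "rock-paper-scissors" cycle:
   0 < 1, 1 < 2, 2 < 0. It is symmetric but not transitive. */
int cmp_cyclic(const void *pa, const void *pb) {
  int a = *(const int *)pa, b = *(const int *)pb;
  if (a == b)
    return 0;
  return ((b - a + 3) % 3 == 1) ? -1 : 1;
}
```

The cubic loop is affordable only because the group is capped at 32 elements, which is presumably why the check is limited to a small window rather than the whole array.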

Applying to full distribution

You can run full Linux distro under SortChecker:

  • add full path to libsortcheck.so to /etc/ld.so.preload

  • create a global config:

    $ echo print_to_syslog=1:check=default:start=rand | sudo tee /SORTCHECK_OPTIONS 
    $ sudo chmod a+r /SORTCHECK_OPTIONS
    
  • reboot

Because the order of checks is randomized, it makes sense to reboot several times and look for reported errors after each boot to detect more of them.

Disclaimer: in this mode libsortcheck.so will be preloaded into all your processes, so any malfunction may permanently break your system. It's highly recommended to back up the disk or take a VM snapshot first.

Build

To build the tool, simply run make from the project's top directory. The Makefile supports various candies (e.g. AddressSanitizer, debug build, etc.); run make help for more details.

If you enable AddressSanitizer you'll need to add libasan.so to LD_PRELOAD (before libsortcheck.so).

To test the tool, run make check. Note that I've myself only tested SortChecker on Ubuntu and Fedora.

Known issues

  • SortChecker is not fully thread-safe yet (should be easy to fix though)
  • SortChecker is currently Linux-only (relies on LD_PRELOAD)

Future plans

The tool only supports C now which rules out most of C++ code because it uses (inline) std::sort and std::binary_search (and other similar APIs). For those see another tool SortChecker++ which does a simple compile-time instrumentation via Clang.

It would be great to make SortChecker a part of standard debugging tools like UBSan. There was a discussion on the LLVM mailing list, which unfortunately didn't go too far.

It may also make sense to check other popular sorting APIs:

  • qsort_s, bsearch_s (are they available/used?)
  • fts_open, scandir
  • Berkeley DB's set_bt_compare, set_dup_compare, etc.
  • Glib2's g_qsort_with_data and other users of GCompareFunc/GCompareDataFunc
  • Gnulib's gl_listelement_compar_fn and friends
  • Libiberty's splay_tree API
  • OpenSSL's objects.h API
  • etc.

Here's less high-level stuff (sorted by priority):

  • ensure that code is thread-safe (may need lots of platform-dependent code for atomics...)
  • print complete backtrace rather than just address of caller (libunwind?)
  • print array elements which triggered errors (i.e. hex dumps)
  • use random array subsets for testing
  • other minor TODO/FIXME are scattered all over the codebase
