silky / sqlsmith

This project forked from anse1/sqlsmith

SQLsmith is a tool that can generate random SQL queries

License: GNU General Public License v3.0

Makefile 0.84% M4 0.65% C++ 88.75% XSLT 4.00% PLpgSQL 5.75%

sqlsmith's Introduction

SQLsmith

Description

SQLsmith is a tool that can generate random SQL queries. Its paragon is Csmith, which proved valuable for quality assurance in C compilers.

SQLsmith is still in an early prototyping stage but has already found a couple of minor bugs in Postgres. Postgres is the only supported RDBMS at the moment.

It might also be useful in its current stage for safely putting arbitrary databases under random load.

Dependencies:

  • C++11
  • libpqxx

Usage

sqlsmith connects to the target database, retrieves the schema for query generation, and sends the generated queries to that same database. Currently, all generated statements are rolled back.

Example invocation:

cd sqlsmith
make sqlsmith
./sqlsmith --verbose --target="host=/tmp port=65432"

The following options are currently supported:

--target=connstr     target database (default: libpq defaults)
--log-to=connstr     database for logging errors (default: don’t log)
--verbose            emit progress output
--version            show version information
--seed=int           seed RNG with specified integer instead of PID
--dry-run            print queries instead of executing them
--max-queries=long   terminate after generating this many queries
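
As a minimal sketch of how these options combine, the Python helper below assembles a command line for a bounded dry run with a fixed seed. The helper name `build_sqlsmith_command` and its keyword arguments are hypothetical and not part of sqlsmith; only the flags themselves come from the list above. That a fixed seed is meant to make runs repeatable is an assumption drawn from the option's description.

```python
import shlex


def build_sqlsmith_command(target=None, seed=None, max_queries=None,
                           dry_run=False, verbose=False, binary="./sqlsmith"):
    """Assemble an argv list using only the options documented above.

    This helper is hypothetical; it is not shipped with sqlsmith.
    """
    argv = [binary]
    if verbose:
        argv.append("--verbose")
    if dry_run:
        argv.append("--dry-run")
    if seed is not None:
        argv.append(f"--seed={int(seed)}")
    if max_queries is not None:
        argv.append(f"--max-queries={int(max_queries)}")
    if target is not None:
        # The connection string may contain spaces, so keep it as one argument.
        argv.append(f"--target={target}")
    return argv


# Print 100 generated queries instead of executing them, seeded with 42.
cmd = build_sqlsmith_command(
    target="host=/tmp port=65432",
    seed=42,
    max_queries=100,
    dry_run=True,
)
print(shlex.join(cmd))
```

The resulting list can be passed directly to `subprocess.run(cmd)`, which avoids shell quoting problems with the connection string.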

Sample output:

--verbose makes sqlsmith emit some progress indication to stderr. A symbol is output for each query sent to the server. Currently the following ones are generated:

symbol   meaning             details
.        ok                  Query generated and executed with ok sqlstate
s        syntax error        These are bugs in sqlsmith - please report
t        timeout             SQLsmith sets a statement timeout of 1s
c        broken connection   These happen when a query crashes the server
e        other error
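
The symbol stream can be summarised with a few lines of Python. This is only a sketch: the `sample` string is invented for illustration, and the code assumes the captured text contains progress symbols only, without the periodic error reports mixed in.

```python
from collections import Counter

# Symbol meanings copied from the table above.
SYMBOL_MEANINGS = {
    ".": "ok",
    "s": "syntax error",
    "t": "timeout",
    "c": "broken connection",
    "e": "other error",
}


def tally_progress(progress):
    """Count progress symbols by meaning; unrecognised symbols become 'unknown'."""
    counts = Counter()
    for ch in progress:
        if ch.isspace():
            continue  # ignore newlines and other whitespace
        counts[SYMBOL_MEANINGS.get(ch, "unknown")] += 1
    return counts


# Hypothetical stderr capture, invented for illustration.
sample = "....e..t...s....c.e\n"
summary = tally_progress(sample)
for meaning, count in summary.most_common():
    print(f"{count:4d}  {meaning}")
```

A non-zero count for "syntax error" would point at a bug in sqlsmith itself, and "broken connection" at a possible server crash, per the table.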

It also periodically emits error reports. In the following example, these are mostly caused by the primitive type system.

queries: 39000 (202.399 gen/s, 298.942 exec/s)
AST stats (avg): height = 5.599 nodes = 37.8489
82	ERROR:  invalid regular expression: quantifier operand invalid
70	ERROR:  canceling statement due to statement timeout
44	ERROR:  operator does not exist: point = point
27	ERROR:  operator does not exist: xml = xml
22	ERROR:  cannot compare arrays of different element types
11	ERROR:  could not determine which collation to use for string comparison
5	ERROR:  invalid regular expression: nfa has too many states
4	ERROR:  cache lookup failed for index 2619
4	ERROR:  invalid regular expression: brackets [] not balanced
3	ERROR:  operator does not exist: polygon = polygon
2	ERROR:  invalid regular expression: parentheses () not balanced
1	ERROR:  invalid regular expression: invalid character range
error rate: 0.00705128
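
The numbers in this report are consistent with the error rate being the total error count divided by the number of queries. That relationship is inferred from the sample and not stated by sqlsmith. The Python sketch below parses the report text and checks it; the function name `parse_report` is hypothetical.

```python
import re

# The sample report from above, with counts and messages separated by a tab.
REPORT = """\
queries: 39000 (202.399 gen/s, 298.942 exec/s)
AST stats (avg): height = 5.599 nodes = 37.8489
82\tERROR:  invalid regular expression: quantifier operand invalid
70\tERROR:  canceling statement due to statement timeout
44\tERROR:  operator does not exist: point = point
27\tERROR:  operator does not exist: xml = xml
22\tERROR:  cannot compare arrays of different element types
11\tERROR:  could not determine which collation to use for string comparison
5\tERROR:  invalid regular expression: nfa has too many states
4\tERROR:  cache lookup failed for index 2619
4\tERROR:  invalid regular expression: brackets [] not balanced
3\tERROR:  operator does not exist: polygon = polygon
2\tERROR:  invalid regular expression: parentheses () not balanced
1\tERROR:  invalid regular expression: invalid character range
error rate: 0.00705128
"""


def parse_report(text):
    """Return (query_count, {error_message: count}) from a report like the one above."""
    queries = None
    errors = {}
    for line in text.splitlines():
        m = re.match(r"queries:\s+(\d+)", line)
        if m:
            queries = int(m.group(1))
            continue
        m = re.match(r"(\d+)\s+ERROR:\s+(.*)", line)
        if m:
            errors[m.group(2)] = int(m.group(1))
    return queries, errors


queries, errors = parse_report(REPORT)
total_errors = sum(errors.values())
error_rate = total_errors / queries
print(f"{total_errors} errors / {queries} queries = {error_rate:.6g}")
```

For this sample the sum of the per-message counts is 275, and 275 / 39000 matches the reported rate to the printed precision.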

The only error in the sample report that looks interesting is the cache lookup failure. A closer look reveals that it happens when you query a certain catalog view like this:

self=# select indexdef from pg_catalog.pg_indexes where indexdef is not NULL;
ERROR:  cache lookup failed for index 2619

This is because the planner then puts pg_get_indexdef(oid) in a context where it sees non-index-oids, which causes it to croak:

                                     QUERY PLAN                                     
------------------------------------------------------------------------------------
 Hash Join  (cost=17.60..30.65 rows=9 width=4)
   Hash Cond: (i.oid = x.indexrelid)
   ->  Seq Scan on pg_class i  (cost=0.00..12.52 rows=114 width=8)
         Filter: ((pg_get_indexdef(oid) IS NOT NULL) AND (relkind = 'i'::"char"))
   ->  Hash  (cost=17.31..17.31 rows=23 width=4)
         ->  Hash Join  (cost=12.52..17.31 rows=23 width=4)
               Hash Cond: (x.indrelid = c.oid)
               ->  Seq Scan on pg_index x  (cost=0.00..4.13 rows=113 width=8)
               ->  Hash  (cost=11.76..11.76 rows=61 width=8)
                     ->  Seq Scan on pg_class c  (cost=0.00..11.76 rows=61 width=8)
                           Filter: (relkind = ANY ('{r,m}'::"char"[]))

Now this is more of a curiosity than a bug, but maybe someday SQLsmith finds a real one…

Building on OSX

To build on Mac OS X, assuming you use Homebrew, run the following:

brew install libpqxx automake libtool autoconf autoconf-archive
cd sqlsmith
autoreconf -i
./configure
make sqlsmith

License

See COPYING for using and distributing this code.

Authors

Andreas Seltenreich

Contributors

anse1, lfittl, xxorde
