A framework for distributed systems verification, with fault injection

Jepsen

Breaking distributed systems so you don't have to.

Jepsen is a Clojure library. A test is a Clojure program which uses the Jepsen library to set up a distributed system, run a bunch of operations against that system, and verify that the history of those operations makes sense. Jepsen has been used to verify everything from eventually-consistent commutative databases to linearizable coordination systems to distributed task schedulers. It can also generate graphs of performance and availability, helping you characterize how a system responds to different faults. See jepsen.io for examples of the sorts of analyses you can carry out with Jepsen.

Design Overview

A Jepsen test runs as a Clojure program on a control node. That program uses SSH to log into a bunch of db nodes, where it sets up the distributed system you're going to test using the test's pluggable os and db.

Once the system is running, the control node spins up a set of logically single-threaded processes, each with its own client for the distributed system. A generator generates new operations for each process to perform. Processes then apply those operations to the system using their clients. The start and end of each operation are recorded in a history. While operations are in flight, a special nemesis process introduces faults into the system--also scheduled by the generator.

Finally, the DB and OS are torn down. Jepsen uses a checker to analyze the test's history for correctness, and to generate reports, graphs, etc. The test, history, analysis, and any supplementary results are written to the filesystem under store/<test-name>/<date>/ for later review. Symlinks to the latest results are maintained at each level for convenience.
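
As a rough illustration of the store layout described above, this shell sketch resolves the directory of the most recent run. It assumes the top-level symlink is named latest and that store/ sits in the current directory; both are assumptions, so adjust them to whatever your checkout actually contains. latest_run is a hypothetical helper name, not part of Jepsen.

```shell
#!/bin/sh
# latest_run [STORE_DIR]
# Prints the canonical path of the run that STORE_DIR/latest points to.
# ASSUMPTION: the symlink is called "latest"; check your own store/ directory.
latest_run() {
  store_dir="${1:-store}"
  if [ ! -L "$store_dir/latest" ]; then
    echo "no latest symlink under $store_dir" >&2
    return 1
  fi
  # readlink -f follows the symlink and prints the resolved path.
  readlink -f "$store_dir/latest"
}
```

For example, ls "$(latest_run)" would list the files written by the last run.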

Documentation

This tutorial walks you through writing a Jepsen test from scratch.

For reference, see the API documentation.

An independent translation is available in Chinese.

Setting up a Jepsen Environment

So, you've got a Jepsen test, and you'd like to run it! Or maybe you'd like to start learning how to write tests. You've got several options:

AWS

If you have an AWS account, you can launch a full Jepsen cluster---control and DB nodes---from the AWS Marketplace. Click "Continue to Subscribe", "Continue to Configuration", and choose "CloudFormation Template". You can choose the number of nodes you'd like to deploy, adjust the instance types and disk sizes, and so on. These are full VMs, which means they can test clock skew.

The AWS marketplace clusters come with an hourly fee (generally $1/hr/node), which helps fund Jepsen development.

LXC

You can set up your DB nodes as LXC containers, and use your local machine as the control node. See the LXC documentation for guidelines. This might be the easiest setup for hacking on tests: you'll be able to edit source code, run profilers, etc. on the local node. Containers don't have real clocks, so you generally can't use them to test clock skew.

VMs, Real Hardware, etc.

You should be able to run Jepsen against almost any machines which have:

  • A TCP network
  • An SSH server
  • Sudo or root access

Each DB node should be accessible from the control node via SSH: you need to be able to run ssh myuser@some-node, and get a shell. By default, DB nodes are named n1, n2, n3, n4, and n5, but that (along with SSH username, password, identity files, etc) is all definable in your test, or at the CLI. The account you use on those boxes needs sudo access to set up DBs, control firewalls, etc.
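
Before starting a test, it can save time to confirm that every node meets these requirements. The sketch below is one way to do that from the control node. check_nodes is a hypothetical helper, and the username and node names are placeholders. It assumes key-based login and passwordless sudo, so a password-only setup would be reported as failing even though Jepsen can be configured to use one.

```shell
#!/bin/sh
# check_nodes USER NODE...
# Tries to log into each node and run a no-op under sudo.
# Prints one status line per node; returns non-zero if any node failed.
check_nodes() {
  user="$1"; shift
  failed=0
  for node in "$@"; do
    # -n: don't read stdin. BatchMode: fail instead of prompting for a password.
    # sudo -n: fail instead of prompting if sudo would need a password.
    if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$user@$node" sudo -n true; then
      echo "$node: ok"
    else
      echo "$node: FAILED"
      failed=1
    fi
  done
  return "$failed"
}
```

With the default node names this would be invoked as check_nodes myuser n1 n2 n3 n4 n5.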

BE ADVISED: tests may mess with clocks, add apt repos, run killall -9 on processes, and generally break things, so you shouldn't, you know, point Jepsen at your prod machines unless you like to live dangerously, or you wrote the test and know exactly what it's doing.

NOTE: Most Jepsen tests are written with more specific requirements in mind---like running on Debian, using iptables for network manipulation, etc. See the specific test code for more details.

Docker (Unsupported)

There is a Docker Compose setup for running a Jepsen cluster on a single machine. Sadly the Docker platform has been something of a moving target; this environment tends to break in new and exciting ways on various platforms every few months. If you're a Docker whiz and can get this going reliably on Debian & OS X that's great--pull requests would be a big help.

Like other containers, Docker containers don't have real clocks--that means you generally can't use them to test clock skew.

Setting Up Control Nodes

For AWS and Docker installs, your control node comes preconfigured with all the software you'll need to run most Jepsen tests. If you build your own control node (or if you're using your local machine as a control node), you'll need a few things:

  • A JVM---version 1.8 or higher.
  • JNA, so the JVM can talk to your SSH.
  • Leiningen: a Clojure build tool.
  • Gnuplot: how Jepsen renders performance plots.
  • Graphviz: how Jepsen renders transactional anomalies.

On Debian, try:

sudo apt install openjdk-17-jdk libjna-java gnuplot graphviz

... to get the basic requirements in place. Debian's Leiningen packages are ancient, so download lein from the web instead.
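
To see which of these tools a hand-built control node still lacks, a small check like the following may help. It is a sketch, not part of Jepsen: missing_tools is a hypothetical helper, and it assumes the usual executable names (java, lein, gnuplot, and Graphviz's dot). JNA is a library rather than a command, so it isn't covered.

```shell
#!/bin/sh
# missing_tools TOOL...
# Prints the name of every tool that is not found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    # command -v succeeds only if the shell can resolve the name.
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

Running missing_tools java lein gnuplot dot prints nothing when everything is installed.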

Running a Test

Once you've got everything set up, you should be able to run cd aerospike; lein test, and it'll spit out something like

INFO  jepsen.core - Analysis invalid! (ノಥ益ಥ)ノ ┻━┻

{:valid? false,
 :counter
 {:valid? false,
  :reads
  [[190 193 194]
   [199 200 201]
   [253 255 256]
   ...}}

Working With the REPL

Jepsen tests emit .jepsen files in the store/ directory. You can use these to investigate a test at the REPL. Run lein repl in the test directory (which should contain the store/ directory), then load a test using store/test:

user=> (def t (store/test -1))

-1 is the last test run, -2 is the second-to-last. 0 is the first, 1 is the second, and so on. You can also load a test by its directory name, given as a string. As a handy shortcut, clicking on the title of a test in the web interface will copy its path to the clipboard.

user=> (def t (store/test "/home/aphyr/jepsen.etcd/store/etcd append etcdctl kill/20221003T124714.485-0400"))

These have the same structure as the test maps you're used to working with in Jepsen, though without some fields that wouldn't make sense to serialize--no :checker, :client, etc.

jepsen.etcd=> (:name t)
"etcd append etcdctl kill"
jepsen.etcd=> (:ops-per-key t)
200

These test maps are also lazy: to speed up working at the REPL, they won't load the history or results until you ask for them. Then they're loaded from disk and cached.

jepsen.etcd=> (count (:history t))
52634

You can use all the usual Clojure tricks to introspect results and histories. Here's an aborted read (G1a) anomaly--we'll pull out the op which wrote the aborted value and the op which read it:

jepsen.etcd=> (def writer (-> t :results :workload :anomalies :G1a first :writer))
#'jepsen.etcd/writer
jepsen.etcd=> (def reader (-> t :results :workload :anomalies :G1a first :op))
#'jepsen.etcd/reader

The writer appended 11 and 12 to key 559, but failed, returning a duplicate key error:

jepsen.etcd=> (:value writer)
[[:r 559 nil] [:r 558 nil] [:append 559 11] [:append 559 12]]
jepsen.etcd=> (:error writer)
[:duplicate-key "rpc error: code = InvalidArgument desc = etcdserver: duplicate key given in txn request"]

The reader, however, observed a value for 559 beginning with 12!

jepsen.etcd=> (:value reader)
[[:r 559 [12]] [:r 557 [1]]]

Let's find all successful transactions:

jepsen.etcd=> (def txns (->> t :history (filter #(and (= :txn (:f %)) (= :ok (:type %)))) (map :value)))
#'jepsen.etcd/txns

And restrict those to just operations which affected key 559:

jepsen.etcd=> (->> txns (filter (partial some (comp #{559} second))) pprint)
([[:r 559 [12]] [:r 557 [1]]]
 [[:r 559 [12]] [:append 559 1] [:r 559 [12 1]]]
 [[:append 556 32]
  [:r 556 [1 18 29 32]]
  [:r 556 [1 18 29 32]]
  [:r 559 [12 1]]]
 [[:r 559 [12 1]]]
 [[:append 559 9] [:r 557 [1 5]] [:r 558 [1]] [:r 558 [1]]]
 [[:r 559 [12 1 9]] [:r 559 [12 1 9]]]
 [[:append 559 17]]
 [[:r 559 [12 1 9 17]] [:append 558 5]]
 [[:r 559 [12 1 9 17]]
  [:append 557 22]
  [:append 559 27]
  [:r 557 [1 5 12 22]]])

Sure enough, no OK appends of 12 to key 559!

You'll find more functions for slicing-and-dicing tests in jepsen.store.

FAQ

JSCH auth errors

If you see com.jcraft.jsch.JSchException: Auth fail, this means something about your test's :ssh map is wrong, or your control node's SSH environment is a bit weird.

  1. Confirm that you can ssh to the node that Jepsen failed to connect to. Try ssh -v for verbose information--pay special attention to whether it uses a password or private key.
  2. If you intend to use a username and password, confirm that they're specified correctly in your test's :ssh map.
  3. If you intend to log in with a private key, make sure your SSH agent is running.
    • ssh-add -l should show the key you use to log in.
    • If your agent isn't running, try launching one with ssh-agent.
    • If your agent shows no keys, you might need to add one with ssh-add.
    • If you're SSHing to a control node, SSH might be forwarding your local agent's keys rather than using those on the control node. Try ssh -a to disable agent forwarding.
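
The agent checks in step 3 can be rolled into one command. This sketch relies on the exit status of ssh-add -l, which OpenSSH documents as 0 on success, 1 when the command fails (for example, when the agent holds no identities), and 2 when no agent can be contacted. agent_status is a hypothetical helper name.

```shell
#!/bin/sh
# agent_status
# Reports whether an SSH agent is reachable and whether it holds any keys.
agent_status() {
  status=0
  ssh-add -l >/dev/null 2>&1 || status=$?
  case "$status" in
    0) echo "agent running with keys loaded" ;;
    1) echo "agent running but no keys: try ssh-add" ;;
    2) echo 'no agent reachable: try eval "$(ssh-agent -s)"' ;;
    *) echo "unexpected ssh-add exit status: $status" ;;
  esac
}
```

Run it in the same shell session you launch the test from, since the agent is located through that session's environment.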

If you've SSHed to a DB node already, you might also encounter a jsch bug which doesn't know how to read hashed known_hosts files. Remove all keys for the DB hosts from your known_hosts file, then:

ssh-keyscan -t rsa n1 >> ~/.ssh/known_hosts
ssh-keyscan -t rsa n2 >> ~/.ssh/known_hosts
ssh-keyscan -t rsa n3 >> ~/.ssh/known_hosts
ssh-keyscan -t rsa n4 >> ~/.ssh/known_hosts
ssh-keyscan -t rsa n5 >> ~/.ssh/known_hosts

to add unhashed versions of each node's hostkey to your ~/.ssh/known_hosts.
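
The same steps can be scripted for any list of nodes. The sketch below removes each host's old entries with ssh-keygen -R and appends a freshly scanned key. rescan_host_keys and the KNOWN_HOSTS variable are hypothetical names introduced here. The key type is a parameter, so rsa matches the commands above.

```shell
#!/bin/sh
# rescan_host_keys KEY_TYPE NODE...
# Replaces known_hosts entries for each node with freshly scanned, unhashed keys.
# Set KNOWN_HOSTS to target a file other than ~/.ssh/known_hosts.
rescan_host_keys() {
  key_type="$1"; shift
  known_hosts="${KNOWN_HOSTS:-$HOME/.ssh/known_hosts}"
  for node in "$@"; do
    # Drop existing (possibly hashed) entries for this host. ssh-keygen keeps a
    # .old backup; errors such as a missing file are ignored.
    ssh-keygen -R "$node" -f "$known_hosts" >/dev/null 2>&1 || true
    # ssh-keyscan writes unhashed entries unless given -H.
    ssh-keyscan -t "$key_type" "$node" >> "$known_hosts"
  done
}
```

For the default cluster this would be rescan_host_keys rsa n1 n2 n3 n4 n5.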

SSHJ auth errors

If you get an exception like net.schmizz.sshj.transport.TransportException: Could not verify 'ssh-ed25519' host key with fingerprint 'bf:4a:...' for 'n1' on port 22, but you're sure you've got the keys in your ~/.ssh/known_hosts, this is because (I think) SSHJ tries to verify only the ed25519 key and ignores the RSA key. You can add the ed25519 keys explicitly via:

ssh-keyscan -t ed25519 n1 >> ~/.ssh/known_hosts
...

Other Projects

Additional projects that may be of interest:

  • Jecci: A wrapper framework around Jepsen
  • Porcupine: a linearizability checker written in Go.
  • elle-cli: command-line frontend to transactional consistency checkers for black-box databases.

jepsen's People

Contributors

akihirosuda, alexanderabramov, andreidan, aphyr, bdarnell, bitbckt, cwen0, dakrone, danielmai, frozenspider, fruitcupwarrior, hptabster, jacobmbr, jhalterman, jpfuentes2, knz, leapsky, manishrjain, map7000, martinmr, mkcp, nurturenature, overvenus, pavlobaron, siddontang, sprsquish, stevana, ttyusupov, uncp, vjuranek

jepsen's Issues

travis-ci

Any reason why we have no CI setup?

Test failures: is this expected?

After switching to Oracle Java 8, Jepsen starts to run! However, it seems to fail eventually.
The ZooKeeper test either runs overnight without stopping, or stops after running out of memory.

No report is produced for ZooKeeper.


clojure.main.main (main.java:37)

Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
at java.util.concurrent.FutureTask.report (FutureTask.java:122)
java.util.concurrent.FutureTask.get (FutureTask.java:192)
clojure.core$deref_future.invoke (core.clj:2186)
clojure.core$future_call$reify__6736.deref (core.clj:6683)
clojure.core$deref.invoke (core.clj:2206)
clojure.core$pmap$step__6749$fn__6751.invoke (core.clj:6733)
clojure.lang.LazySeq.sval (LazySeq.java:40)
clojure.lang.LazySeq.seq (LazySeq.java:49)
clojure.lang.Cons.next (Cons.java:39)
clojure.lang.RT.next (RT.java:674)
clojure.core/next (core.clj:64)
clojure.core$concat$cat__4217$fn__4218.invoke (core.clj:707)
clojure.lang.LazySeq.sval (LazySeq.java:40)
clojure.lang.LazySeq.seq (LazySeq.java:49)
clojure.lang.ChunkedCons.chunkedNext (ChunkedCons.java:59)
clojure.lang.ChunkedCons.next (ChunkedCons.java:43)
clojure.lang.RT.next (RT.java:674)
clojure.core/next (core.clj:64)
clojure.core.protocols$naive_seq_reduce.invoke (protocols.clj:65)
clojure.core.protocols$interface_or_naive_reduce.invoke (protocols.clj:73)
clojure.core.protocols/fn (protocols.clj:171)
clojure.core.protocols$fn__6478$G__6473__6487.invoke (protocols.clj:19)
clojure.core.protocols$seq_reduce.invoke (protocols.clj:31)
clojure.core.protocols/fn (protocols.clj:101)
clojure.core.protocols$fn__6452$G__6447__6465.invoke (protocols.clj:13)
clojure.core$reduce.invoke (core.clj:6519)
knossos.linear$step.invoke (linear.clj:251)
clojure.core$partial$fn__4529.invoke (core.clj:2501)
clojure.lang.PersistentVector.reduce (PersistentVector.java:333)
clojure.core$reduce.invoke (core.clj:6518)
knossos.linear$analysis.invoke (linear.clj:312)
jepsen.checker$reify__6560.check (checker.clj:53)
jepsen.checker$compose$reify__6600$fn__6602.invoke (checker.clj:256)
clojure.core$pmap$fn__6744$fn__6745.invoke (core.clj:6729)
clojure.core$binding_conveyor_fn$fn__4444.invoke (core.clj:1916)
clojure.lang.AFn.call (AFn.java:18)
java.util.concurrent.FutureTask.run (FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
java.lang.Thread.run (Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
at [empty stack trace]

lein test :only jepsen.zookeeper-test/zk-test

FAIL in (zk-test) (zookeeper_test.clj:7)
expected: (:valid? (:results (jepsen/run! (zk/zk-test "3.4.5+dfsg-2"))))
actual: false

Ran 1 tests containing 1 assertions.
1 failures, 0 errors.
Tests failed.

Failed to run chronos test on ubuntu

To give you an overview of my situation and avoid chasing a non-existent problem: I'm running a modified version of the project on Ubuntu (modified to point the jepsen dependency at my own fork, which adds username/password and proxy support, ports the projects to the latest version of Jepsen, etc.). So far I've managed to get lein test running for the jepsen project with 0 errors and 0 failures, but I still get a bunch of "indeterminate state" exceptions for one of the tests.

When I run the test for chronos, I get the error Caused by: java.lang.RuntimeException: [sudo] password for master: stop: Unknown instance:. Looking at the log, it seems that the test attempts to stop and remove any existing mesos/chronos installation on the host without checking whether it is actually installed. I couldn't find any installation-related output from mesosphere.clj in the log below--only starting and stopping the services.

Would you please confirm whether my speculation is correct or not?

Full log below:

master@master-host:~/jepsen/chronos$ lein test
INFO  org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.4.0-1202560, built on 11/16/2011 07:18 GMT
INFO  org.apache.zookeeper.ZooKeeper - Client environment:host.name=master-host
INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.8.0_91
INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Oracle Corporation
INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.home=/usr/lib/jvm/java-8-oracle/jre
INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=/home/master/jepsen/chronos/test:/home/master/jepsen/chronos/src:/home/master/jepsen/chronos/dev-resources:/home/master/jepsen/chronos/resources:/home/master/jepsen/chronos/target/classes:/home/master/.m2/repository/clj-time/clj-time/0.11.0/clj-time-0.11.0.jar:/home/master/.m2/repository/org/clojure/math.combinatorics/0.1.1/math.combinatorics-0.1.1.jar:/home/master/.m2/repository/org/apache/zookeeper/zookeeper/3.4.0/zookeeper-3.4.0.jar:/home/master/.m2/repository/org/clojure/clojure/1.8.0/clojure-1.8.0.jar:/home/master/.m2/repository/clj-http/clj-http/3.0.1/clj-http-3.0.1.jar:/home/master/.m2/repository/org/slf4j/slf4j-log4j12/1.7.21/slf4j-log4j12-1.7.21.jar:/home/master/.m2/repository/org/clojure/algo.generic/0.1.2/algo.generic-0.1.2.jar:/home/master/.m2/repository/tigris/tigris/0.1.1/tigris-0.1.1.jar:/home/master/.m2/repository/commons-codec/commons-codec/1.10/commons-codec-1.10.jar:/home/master/.m2/repository/org/apache/httpcomponents/httpmime/4.5.2/httpmime-4.5.2.jar:/home/master/.m2/repository/org/jboss/netty/netty/3.2.2.Final/netty-3.2.2.Final.jar:/home/master/.m2/repository/com/jcraft/jsch.agentproxy.usocket-nc/0.0.9/jsch.agentproxy.usocket-nc-0.0.9.jar:/home/master/.m2/repository/net/sf/trove4j/trove4j/3.0.3/trove4j-3.0.3.jar:/home/master/.m2/repository/dk/brics/automaton/automaton/1.11-8/automaton-1.11-8.jar:/home/master/.m2/repository/interval-metrics/interval-metrics/1.0.0/interval-metrics-1.0.0.jar:/home/master/.m2/repository/potemkin/potemkin/0.4.3/potemkin-0.4.3.jar:/home/master/.m2/repository/com/fasterxml/jackson/dataformat/jackson-dataformat-smile/2.5.3/jackson-dataformat-smile-2.5.3.jar:/home/master/.m2/repository/knossos/knossos/0.2.6/knossos-0.2.6.jar:/home/master/.m2/repository/io/aleph/dirigiste/0.1.3/dirigiste-0.1.3.jar:/home/master/.m2/repository/com/jcraft/jsch.agentproxy.core/0.0.9/jsch.agentproxy.core-0.0.9.jar:/home/master/.m2/repository/org/clojure/tools.nrepl/0.2.
12/tools.nrepl-0.2.12.jar:/home/master/.m2/repository/org/clojure/data.fressian/0.2.1/data.fressian-0.2.1.jar:/home/master/.m2/repository/zookeeper-clj/zookeeper-clj/0.9.3/zookeeper-clj-0.9.3.jar:/home/master/.m2/repository/commons-io/commons-io/2.4/commons-io-2.4.jar:/home/master/.m2/repository/clojure-complete/clojure-complete/0.2.4/clojure-complete-0.2.4.jar:/home/master/.m2/repository/org/apache/httpcomponents/httpclient/4.5.2/httpclient-4.5.2.jar:/home/master/.m2/repository/manifold/manifold/0.1.4/manifold-0.1.4.jar:/home/master/.m2/repository/args4j/args4j/2.0.29/args4j-2.0.29.jar:/home/master/.m2/repository/com/boundary/high-scale-lib/1.0.6/high-scale-lib-1.0.6.jar:/home/master/.m2/repository/com/fasterxml/jackson/core/jackson-core/2.5.3/jackson-core-2.5.3.jar:/home/master/.m2/repository/org/apache/httpcomponents/httpcore/4.4.4/httpcore-4.4.4.jar:/home/master/.m2/repository/org/clojars/pallix/analemma/1.0.0/analemma-1.0.0.jar:/home/master/.m2/repository/hiccup/hiccup/1.0.5/hiccup-1.0.5.jar:/home/master/.m2/repository/clj-tuple/clj-tuple/0.2.2/clj-tuple-0.2.2.jar:/home/master/.m2/repository/junit/junit/3.8.1/junit-3.8.1.jar:/home/master/.m2/repository/primitive-math/primitive-math/0.1.5/primitive-math-0.1.5.jar:/home/master/.m2/repository/com/jcraft/jsch.agentproxy.jsch/0.0.9/jsch.agentproxy.jsch-0.0.9.jar:/home/master/.m2/repository/com/fasterxml/jackson/dataformat/jackson-dataformat-cbor/2.5.3/jackson-dataformat-cbor-2.5.3.jar:/home/master/.m2/repository/org/clojure/tools.logging/0.3.1/tools.logging-0.3.1.jar:/home/master/.m2/repository/slingshot/slingshot/0.12.2/slingshot-0.12.2.jar:/home/master/.m2/repository/loco/loco/0.3.0/loco-0.3.0.jar:/home/master/.m2/repository/com/jcraft/jsch/0.1.53/jsch-0.1.53.jar:/home/master/.m2/repository/com/jcraft/jsch.agentproxy.usocket-jna/0.0.9/jsch.agentproxy.usocket-jna-0.0.9.jar:/home/master/.m2/repository/org/javabits/jgrapht/jgrapht-core/0.9.3/jgrapht-core-0.9.3.jar:/home/master/.m2/repository/org/fressian/fressian/0.6
.6/fressian-0.6.6.jar:/home/master/.m2/repository/jepsen/jepsen/0.1.1-SNAPSHOT/jepsen-0.1.1-SNAPSHOT.jar:/home/master/.m2/repository/log4j/log4j/1.2.17/log4j-1.2.17.jar:/home/master/.m2/repository/org/slf4j/slf4j-api/1.7.7/slf4j-api-1.7.7.jar:/home/master/.m2/repository/net/java/dev/jna/jna-platform/4.1.0/jna-platform-4.1.0.jar:/home/master/.m2/repository/jepsen/zookeeper/jepsen.zookeeper/0.1.0-SNAPSHOT/jepsen.zookeeper-0.1.0-SNAPSHOT.jar:/home/master/.m2/repository/com/jcraft/jsch.agentproxy.sshagent/0.0.9/jsch.agentproxy.sshagent-0.0.9.jar:/home/master/.m2/repository/net/java/dev/jna/jna/4.1.0/jna-4.1.0.jar:/home/master/.m2/repository/riddley/riddley/0.1.12/riddley-0.1.12.jar:/home/master/.m2/repository/commons-logging/commons-logging/1.2/commons-logging-1.2.jar:/home/master/.m2/repository/cheshire/cheshire/5.5.0/cheshire-5.5.0.jar:/home/master/.m2/repository/org/choco-solver/choco-solver/3.3.0/choco-solver-3.3.0.jar:/home/master/.m2/repository/byte-streams/byte-streams/0.2.2/byte-streams-0.2.2.jar:/home/master/.m2/repository/gnuplot/gnuplot/0.1.1/gnuplot-0.1.1.jar:/home/master/.m2/repository/org/clojars/achim/multiset/0.1.0/multiset-0.1.0.jar:/home/master/.m2/repository/clj-ssh/clj-ssh/0.5.14/clj-ssh-0.5.14.jar:/home/master/.m2/repository/com/jcraft/jsch.agentproxy.pageant/0.0.9/jsch.agentproxy.pageant-0.0.9.jar:/home/master/.m2/repository/jline/jline/0.9.94/jline-0.9.94.jar:/home/master/.m2/repository/joda-time/joda-time/2.8.2/joda-time-2.8.2.jar:/home/master/.m2/repository/avout/avout/0.5.4/avout-0.5.4.jar
INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
INFO  org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
INFO  org.apache.zookeeper.ZooKeeper - Client environment:os.version=4.2.0-35-generic
INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.name=master
INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.home=/home/master
INFO  org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/home/master/jepsen/chronos

lein test jepsen.chronos-test
INFO  jepsen.os.debian - :n3 setting up debian
INFO  jepsen.os.debian - :n2 setting up debian
INFO  jepsen.os.debian - :n5 setting up debian
INFO  jepsen.os.debian - :n4 setting up debian
INFO  jepsen.os.debian - :n1 setting up debian
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (cat /etc/hosts)
INFO  jepsen.control - (cat /etc/hosts)
INFO  jepsen.control - (cat /etc/hosts)
INFO  jepsen.control - (date +%s)
INFO  jepsen.control - (date +%s)
INFO  jepsen.control - (cat /etc/hosts)
INFO  jepsen.control - (date +%s)
INFO  jepsen.control - (stat -c %Y /var/cache/apt/pkgcache.bin)
INFO  jepsen.control - (stat -c %Y /var/cache/apt/pkgcache.bin)
INFO  jepsen.control - (date +%s)
INFO  jepsen.control - (stat -c %Y /var/cache/apt/pkgcache.bin)
INFO  jepsen.control - (dpkg --get-selections man-db curl iputils-ping logrotate rsyslog psmisc sysvinit-utils faketime vim unzip wget iptables)
INFO  jepsen.control - (dpkg --get-selections man-db curl iputils-ping logrotate rsyslog psmisc sysvinit-utils faketime vim unzip wget iptables)
INFO  jepsen.control - (stat -c %Y /var/cache/apt/pkgcache.bin)
INFO  jepsen.control - (cat /etc/hosts)
INFO  jepsen.control - (dpkg --get-selections man-db curl iputils-ping logrotate rsyslog psmisc sysvinit-utils faketime vim unzip wget iptables)
INFO  jepsen.control - (dpkg --get-selections systemd)
INFO  jepsen.control - (dpkg --get-selections systemd)
INFO  jepsen.control - (dpkg --get-selections man-db curl iputils-ping logrotate rsyslog psmisc sysvinit-utils faketime vim unzip wget iptables)
INFO  jepsen.control - (date +%s)
INFO  jepsen.control - (dpkg --get-selections systemd)
INFO  jepsen.control - (iptables -F)
INFO  jepsen.control - (iptables -F)
INFO  jepsen.control - (dpkg --get-selections systemd)
INFO  jepsen.control - (stat -c %Y /var/cache/apt/pkgcache.bin)
INFO  jepsen.control - (iptables -F)
INFO  jepsen.control - (iptables -X)
INFO  jepsen.control - (iptables -X)
INFO  jepsen.control - (iptables -F)
INFO  jepsen.control - (dpkg --get-selections man-db curl iputils-ping logrotate rsyslog psmisc sysvinit-utils faketime vim unzip wget iptables)
INFO  jepsen.control - (iptables -X)
INFO  jepsen.control - (iptables -X)
INFO  jepsen.control - (dpkg --get-selections systemd)
INFO  jepsen.control - (iptables -F)
INFO  jepsen.control - (iptables -X)
INFO  jepsen.chronos - :n2 stopping chronos
INFO  jepsen.chronos - :n4 stopping chronos
INFO  jepsen.chronos - :n5 stopping chronos
INFO  jepsen.chronos - :n3 stopping chronos
INFO  jepsen.chronos - :n1 stopping chronos
INFO  jepsen.control - (service chronos stop)
INFO  jepsen.control - (service chronos stop)
INFO  jepsen.control - (service chronos stop)
INFO  jepsen.control - (service chronos stop)
INFO  jepsen.control - (service chronos stop)
INFO  jepsen.control - (ps aux | grep /usr/bin/chronos | grep -v grep | awk "{print \$2}" | xargs kill -9)
INFO  jepsen.control - (ps aux | grep /usr/bin/chronos | grep -v grep | awk "{print \$2}" | xargs kill -9)
INFO  jepsen.control - (ps aux | grep /usr/bin/chronos | grep -v grep | awk "{print \$2}" | xargs kill -9)
INFO  jepsen.control - (ps aux | grep /usr/bin/chronos | grep -v grep | awk "{print \$2}" | xargs kill -9)
INFO  jepsen.control - (ps aux | grep /usr/bin/chronos | grep -v grep | awk "{print \$2}" | xargs kill -9)
INFO  jepsen.mesosphere - :n2 stopping mesos-slave
INFO  jepsen.control - (killall -9 mesos-slave)
INFO  jepsen.mesosphere - :n4 stopping mesos-slave
INFO  jepsen.control - (killall -9 mesos-slave)
INFO  jepsen.mesosphere - :n3 stopping mesos-slave
INFO  jepsen.control - (killall -9 mesos-slave)
INFO  jepsen.mesosphere - :n1 stopping mesos-slave
INFO  jepsen.control - (killall -9 mesos-slave)
INFO  jepsen.mesosphere - :n5 stopping mesos-slave
INFO  jepsen.control - (killall -9 mesos-slave)
INFO  jepsen.control - (rm -rf /var/run/mesos/slave.pid)
INFO  jepsen.control - (rm -rf /var/run/mesos/slave.pid)
INFO  jepsen.control - (rm -rf /var/run/mesos/slave.pid)
INFO  jepsen.control - (rm -rf /var/run/mesos/slave.pid)
INFO  jepsen.control - (rm -rf /var/run/mesos/slave.pid)
INFO  jepsen.mesosphere - :n2 stopping mesos-master
INFO  jepsen.control - (killall -9 mesos-master)
INFO  jepsen.mesosphere - :n4 stopping mesos-master
INFO  jepsen.control - (killall -9 mesos-master)
INFO  jepsen.mesosphere - :n3 stopping mesos-master
INFO  jepsen.control - (killall -9 mesos-master)
INFO  jepsen.mesosphere - :n1 stopping mesos-master
INFO  jepsen.control - (killall -9 mesos-master)
INFO  jepsen.mesosphere - :n5 stopping mesos-master
INFO  jepsen.control - (killall -9 mesos-master)
INFO  jepsen.control - (rm -rf /var/run/mesos/master.pid)
INFO  jepsen.control - (rm -rf /var/run/mesos/master.pid)
INFO  jepsen.control - (rm -rf /var/run/mesos/master.pid)
INFO  jepsen.control - (rm -rf /var/run/mesos/master.pid)
INFO  jepsen.control - (rm -rf /var/run/mesos/master.pid)
INFO  jepsen.control - (rm -rf /var/lib/mesos/master/* /var/lib/mesos/slave/* /var/log/mesos/*)
INFO  jepsen.control - (rm -rf /var/lib/mesos/master/* /var/lib/mesos/slave/* /var/log/mesos/*)
INFO  jepsen.control - (rm -rf /var/lib/mesos/master/* /var/lib/mesos/slave/* /var/log/mesos/*)
INFO  jepsen.control - (rm -rf /var/lib/mesos/master/* /var/lib/mesos/slave/* /var/log/mesos/*)
INFO  jepsen.control - (rm -rf /var/lib/mesos/master/* /var/lib/mesos/slave/* /var/log/mesos/*)
INFO  jepsen.zookeeper - :n4 tearing down ZK
INFO  jepsen.control - (service zookeeper stop)
INFO  jepsen.zookeeper - :n2 tearing down ZK
INFO  jepsen.control - (service zookeeper stop)
INFO  jepsen.zookeeper - :n3 tearing down ZK
INFO  jepsen.control - (service zookeeper stop)
INFO  jepsen.zookeeper - :n5 tearing down ZK
INFO  jepsen.control - (service zookeeper stop)
INFO  jepsen.zookeeper - :n1 tearing down ZK
INFO  jepsen.control - (service zookeeper stop)
INFO  jepsen.chronos - setting up db on  :n4
INFO  jepsen.chronos - setting up db on  :n3
INFO  jepsen.zookeeper - :n3 installing ZK 3.4.5+dfsg-2
INFO  jepsen.chronos - setting up db on  :n2
INFO  jepsen.zookeeper - :n2 installing ZK 3.4.5+dfsg-2
INFO  jepsen.zookeeper - :n4 installing ZK 3.4.5+dfsg-2
INFO  jepsen.control - (apt-cache policy zookeeper)
INFO  jepsen.control - (apt-cache policy zookeeper)
INFO  jepsen.control - (apt-cache policy zookeeper)
INFO  jepsen.chronos - setting up db on  :n5
INFO  jepsen.zookeeper - :n5 installing ZK 3.4.5+dfsg-2
INFO  jepsen.control - (apt-cache policy zookeeper)
INFO  jepsen.chronos - setting up db on  :n1
INFO  jepsen.zookeeper - :n1 installing ZK 3.4.5+dfsg-2
INFO  jepsen.control - (apt-cache policy zookeeper)
INFO  jepsen.os.debian - Installing :zookeeper 3.4.5+dfsg-2
INFO  jepsen.os.debian - Installing :zookeeper 3.4.5+dfsg-2
INFO  jepsen.control - (apt-get install -y --force-yes zookeeper=3.4.5+dfsg-2)
INFO  jepsen.os.debian - Installing :zookeeper 3.4.5+dfsg-2
INFO  jepsen.control - (apt-get install -y --force-yes zookeeper=3.4.5+dfsg-2)
INFO  jepsen.control - (apt-get install -y --force-yes zookeeper=3.4.5+dfsg-2)
INFO  jepsen.os.debian - Installing :zookeeper 3.4.5+dfsg-2
INFO  jepsen.os.debian - Installing :zookeeper 3.4.5+dfsg-2
INFO  jepsen.control - (apt-get install -y --force-yes zookeeper=3.4.5+dfsg-2)
INFO  jepsen.control - (apt-get install -y --force-yes zookeeper=3.4.5+dfsg-2)
INFO  jepsen.chronos - :n5 stopping chronos
INFO  jepsen.chronos - :n4 stopping chronos
INFO  jepsen.control - (service chronos stop)
INFO  jepsen.chronos - :n3 stopping chronos
INFO  jepsen.control - (service chronos stop)
INFO  jepsen.chronos - :n2 stopping chronos
INFO  jepsen.control - (service chronos stop)
INFO  jepsen.chronos - :n1 stopping chronos
INFO  jepsen.control - (service chronos stop)
INFO  jepsen.control - (service chronos stop)
INFO  jepsen.control - (ps aux | grep /usr/bin/chronos | grep -v grep | awk "{print \$2}" | xargs kill -9)
INFO  jepsen.control - (ps aux | grep /usr/bin/chronos | grep -v grep | awk "{print \$2}" | xargs kill -9)
INFO  jepsen.control - (ps aux | grep /usr/bin/chronos | grep -v grep | awk "{print \$2}" | xargs kill -9)
INFO  jepsen.control - (ps aux | grep /usr/bin/chronos | grep -v grep | awk "{print \$2}" | xargs kill -9)
INFO  jepsen.control - (ps aux | grep /usr/bin/chronos | grep -v grep | awk "{print \$2}" | xargs kill -9)
INFO  jepsen.mesosphere - :n3 stopping mesos-slave
INFO  jepsen.control - (killall -9 mesos-slave)
INFO  jepsen.mesosphere - :n2 stopping mesos-slave
INFO  jepsen.control - (killall -9 mesos-slave)
INFO  jepsen.mesosphere - :n4 stopping mesos-slave
INFO  jepsen.control - (killall -9 mesos-slave)
INFO  jepsen.mesosphere - :n1 stopping mesos-slave
INFO  jepsen.control - (killall -9 mesos-slave)
INFO  jepsen.mesosphere - :n5 stopping mesos-slave
INFO  jepsen.control - (killall -9 mesos-slave)
INFO  jepsen.control - (rm -rf /var/run/mesos/slave.pid)
INFO  jepsen.control - (rm -rf /var/run/mesos/slave.pid)
INFO  jepsen.control - (rm -rf /var/run/mesos/slave.pid)
INFO  jepsen.control - (rm -rf /var/run/mesos/slave.pid)
INFO  jepsen.control - (rm -rf /var/run/mesos/slave.pid)
INFO  jepsen.mesosphere - :n3 stopping mesos-master
INFO  jepsen.control - (killall -9 mesos-master)
INFO  jepsen.mesosphere - :n2 stopping mesos-master
INFO  jepsen.control - (killall -9 mesos-master)
INFO  jepsen.mesosphere - :n4 stopping mesos-master
INFO  jepsen.control - (killall -9 mesos-master)
INFO  jepsen.mesosphere - :n1 stopping mesos-master
INFO  jepsen.control - (killall -9 mesos-master)
INFO  jepsen.mesosphere - :n5 stopping mesos-master
INFO  jepsen.control - (killall -9 mesos-master)
INFO  jepsen.control - (rm -rf /var/run/mesos/master.pid)
INFO  jepsen.control - (rm -rf /var/run/mesos/master.pid)
INFO  jepsen.control - (rm -rf /var/run/mesos/master.pid)
INFO  jepsen.control - (rm -rf /var/run/mesos/master.pid)
INFO  jepsen.control - (rm -rf /var/run/mesos/master.pid)
INFO  jepsen.control - (rm -rf /var/lib/mesos/master/* /var/lib/mesos/slave/* /var/log/mesos/*)
INFO  jepsen.control - (rm -rf /var/lib/mesos/master/* /var/lib/mesos/slave/* /var/log/mesos/*)
INFO  jepsen.control - (rm -rf /var/lib/mesos/master/* /var/lib/mesos/slave/* /var/log/mesos/*)
INFO  jepsen.control - (rm -rf /var/lib/mesos/master/* /var/lib/mesos/slave/* /var/log/mesos/*)
INFO  jepsen.control - (rm -rf /var/lib/mesos/master/* /var/lib/mesos/slave/* /var/log/mesos/*)
INFO  jepsen.zookeeper - :n2 tearing down ZK
INFO  jepsen.control - (service zookeeper stop)
INFO  jepsen.zookeeper - :n3 tearing down ZK
INFO  jepsen.control - (service zookeeper stop)
INFO  jepsen.zookeeper - :n1 tearing down ZK
INFO  jepsen.control - (service zookeeper stop)
INFO  jepsen.zookeeper - :n4 tearing down ZK
INFO  jepsen.control - (service zookeeper stop)
INFO  jepsen.zookeeper - :n5 tearing down ZK
INFO  jepsen.control - (service zookeeper stop)

lein test :only jepsen.chronos-test/install-test

ERROR in (install-test) (FutureTask.java:122)
expected: (:valid? (:results (jepsen/run! (simple-test "0.28.1-2.0.20.ubuntu1204" "2.4.0-0.1.20151007110204.ubuntu1204"))))
  actual: java.util.concurrent.ExecutionException: java.lang.RuntimeException: [sudo] password for master: stop: Unknown instance:


 at java.util.concurrent.FutureTask.report (FutureTask.java:122)
    java.util.concurrent.FutureTask.get (FutureTask.java:192)
    clojure.core$deref_future.invokeStatic (core.clj:2208)
    clojure.core$future_call$reify__6962.deref (core.clj:6688)
    clojure.core$deref.invokeStatic (core.clj:2228)
    clojure.core$deref.invoke (core.clj:2214)
    clojure.core$map$fn__4785.invoke (core.clj:2646)
    clojure.lang.LazySeq.sval (LazySeq.java:40)
    clojure.lang.LazySeq.seq (LazySeq.java:49)
    clojure.lang.RT.seq (RT.java:521)
    clojure.core$seq__4357.invokeStatic (core.clj:137)
    clojure.core$dorun.invokeStatic (core.clj:3024)
    clojure.core$dorun.invoke (core.clj:3024)
    jepsen.core$on_nodes.invokeStatic (core.clj:85)
    jepsen.core$on_nodes.invoke (core.clj:81)
    jepsen.core$run_BANG_$fn__6798.invoke (core.clj:415)
    jepsen.core$run_BANG_.invokeStatic (core.clj:383)
    jepsen.core$run_BANG_.invoke (core.clj:333)
    jepsen.chronos_test$fn__11146.invokeStatic (chronos_test.clj:7)
    jepsen.chronos_test/fn (chronos_test.clj:6)
    clojure.test$test_var$fn__7983.invoke (test.clj:716)
    clojure.test$test_var.invokeStatic (test.clj:716)
    clojure.test$test_var.invoke (test.clj:707)
    clojure.test$test_vars$fn__8005$fn__8010.invoke (test.clj:734)
    clojure.test$default_fixture.invokeStatic (test.clj:686)
    clojure.test$default_fixture.invoke (test.clj:682)
    clojure.test$test_vars$fn__8005.invoke (test.clj:734)
    clojure.test$default_fixture.invokeStatic (test.clj:686)
    clojure.test$default_fixture.invoke (test.clj:682)
    clojure.test$test_vars.invokeStatic (test.clj:730)
    clojure.test$test_all_vars.invokeStatic (test.clj:736)
    clojure.test$test_ns.invokeStatic (test.clj:757)
    clojure.test$test_ns.invoke (test.clj:742)
    clojure.core$map$fn__4785.invoke (core.clj:2646)
    clojure.lang.LazySeq.sval (LazySeq.java:40)
    clojure.lang.LazySeq.seq (LazySeq.java:49)
    clojure.lang.Cons.next (Cons.java:39)
    clojure.lang.RT.boundedLength (RT.java:1749)
    clojure.lang.RestFn.applyTo (RestFn.java:130)
    clojure.core$apply.invokeStatic (core.clj:648)
    clojure.test$run_tests.invokeStatic (test.clj:767)
    clojure.test$run_tests.doInvoke (test.clj:767)
    clojure.lang.RestFn.applyTo (RestFn.java:137)
    clojure.core$apply.invokeStatic (core.clj:646)
    clojure.core$apply.invoke (core.clj:641)
    user$eval85$fn__144$fn__175.invoke (form-init2456026430981548870.clj:1)
    user$eval85$fn__144$fn__145.invoke (form-init2456026430981548870.clj:1)
    user$eval85$fn__144.invoke (form-init2456026430981548870.clj:1)
    user$eval85.invokeStatic (form-init2456026430981548870.clj:1)
    user$eval85.invoke (form-init2456026430981548870.clj:1)
    clojure.lang.Compiler.eval (Compiler.java:6927)
    clojure.lang.Compiler.eval (Compiler.java:6917)
    clojure.lang.Compiler.load (Compiler.java:7379)
    clojure.lang.Compiler.loadFile (Compiler.java:7317)
    clojure.main$load_script.invokeStatic (main.clj:275)
    clojure.main$init_opt.invokeStatic (main.clj:277)
    clojure.main$init_opt.invoke (main.clj:277)
    clojure.main$initialize.invokeStatic (main.clj:308)
    clojure.main$null_opt.invokeStatic (main.clj:342)
    clojure.main$null_opt.invoke (main.clj:339)
    clojure.main$main.invokeStatic (main.clj:421)
    clojure.main$main.doInvoke (main.clj:384)
    clojure.lang.RestFn.invoke (RestFn.java:421)
    clojure.lang.Var.invoke (Var.java:383)
    clojure.lang.AFn.applyToHelper (AFn.java:156)
    clojure.lang.Var.applyTo (Var.java:700)
    clojure.main.main (main.java:37)
Caused by: java.lang.RuntimeException: [sudo] password for master: stop: Unknown instance:


 at jepsen.control$throw_on_nonzero_exit.invokeStatic (control.clj:109)
    jepsen.control$throw_on_nonzero_exit.invoke (control.clj:104)
    jepsen.control$exec_STAR_.invokeStatic (control.clj:125)
    jepsen.control$exec_STAR_.doInvoke (control.clj:121)
    clojure.lang.RestFn.applyTo (RestFn.java:137)
    clojure.core$apply.invokeStatic (core.clj:646)
    clojure.core$apply.invoke (core.clj:641)
    jepsen.control$exec.invokeStatic (control.clj:141)
    jepsen.control$exec.doInvoke (control.clj:135)
    clojure.lang.RestFn.invoke (RestFn.java:436)
    jepsen.zookeeper$db$reify__10462.teardown_BANG_ (zookeeper.clj:63)
    jepsen.mesosphere$db$reify__10516.teardown_BANG_ (mesosphere.clj:155)
    jepsen.chronos$db$reify__11075.teardown_BANG_ (chronos.clj:77)
    jepsen.db$eval4143$fn__4144$G__4135__4148.invoke (db.clj:4)
    jepsen.db$eval4143$fn__4144$G__4134__4153.invoke (db.clj:4)
    clojure.core$partial$fn__4759.invoke (core.clj:2516)
    jepsen.core$on_nodes$fn__6673$fn__6675.invoke (core.clj:89)
    clojure.core$binding_conveyor_fn$fn__4676.invoke (core.clj:1938)
    clojure.lang.AFn.call (AFn.java:18)
    java.util.concurrent.FutureTask.run (FutureTask.java:266)
    java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
    java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
    java.lang.Thread.run (Thread.java:745)

lein test jepsen.chronos.checker-test

Ran 3 tests containing 5 assertions.
0 failures, 1 errors.
Tests failed.

A few tests from jepsen/test/jepsen are failing

I ran all the tests from https://github.com/aphyr/jepsen/tree/master/jepsen/test/jepsen
Here are log fragments from the failing tests:

test-jepsen.checker-test.log:Caused by: java.lang.IllegalArgumentException: No single method: check of interface: jepsen.checker.Checker found for function: check of protocol: Checker

test-jepsen.independent-test.log:Caused by: java.lang.IllegalArgumentException: Can't define method not in interfaces: check

test-jepsen.perf-test.log:Caused by: java.lang.IllegalArgumentException: No single method: check of interface: jepsen.checker.Checker found for function: check of protocol: Checker

I can post the full tracebacks as well, although I believe the failures are easy to reproduce.

Error removing systemd

When running Jepsen for the first time on a set of machines, the Debian OS setup fails while attempting to remove systemd:

systemd is the active init system, please switch to another before removing systemd.
dpkg: error processing package systemd (--purge):
 subprocess installed pre-removal script returned error exit status 1
Failed to stop lib-init-rw.mount: Unit lib-init-rw.mount not loaded.
Errors were encountered while processing:
 systemd
E: Sub-process /usr/bin/dpkg returned an error code (1)

The Debian bug describing this issue mentions that the machine must be rebooted after sysvinit is installed before systemd can be uninstalled. Manually rebooting my machines solves the problem. I'm not sure there's a nice way for Jepsen to handle this, but it's something to consider.

Having trouble getting started

I'm not a Clojure user, but I want to reproduce your test results.

I've done the following (on a Mac):

  1. install clojure via brew install clojure
    • installed version is 1.5.1
    • java version is 1.6.0_33
  2. install lein following the instructions on its github page
  3. clone jepsen repo
  4. install jepsen deps via lein install
  5. ?? lein midje and lein test both fail

I've tried lein midje as suggested in the README, but got an error:

jareds-partybus-4:aphyr-jepsen jhirsch$ lein midje
'midje' is not a task. See 'lein help'.

Did you mean this?
         do

I've also tried lein test, a bit of a shot in the dark, but got weird failures; see this gist for the full output.

What am I missing here?

Reinventing the shell wheel?

I'm not an expert in Jepsen or Clojure, but I'm starting to think about how to use it for one of my projects, and one thing that struck me is the DSL you have for getting things set up, e.g.:

(when-not (debian/installed? "mongodb-org")
  (c/su
    (try
      (debian/install {:mongodb-org version})
      (catch RuntimeException e
        (c/exec :apt-key :adv :--keyserver "keyserver.ubuntu.com" :--recv "7F0CEB10")
        (c/exec :echo "deb http://downloads-distro.mongodb.org/repo/debian-sysvinit dist 10gen" :> "/etc/apt/sources.list.d/mongodb.list")
        (c/exec :apt-get :update)
        (debian/install {:mongodb-org version})
        (c/exec :update-rc.d :mongod :remove :-f)))))

Is there any reason it couldn't just be written as a simple shell script (not to mention one of the many config management tools out there, but that might be overkill)?

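# Try the install first; on failure, add the MongoDB apt key and repo, then retry.
# Note: "sudo echo ... > file" performs the redirect as the calling user, so pipe through "sudo tee" instead.
sudo apt-get install mongodb-org="$VERSION" || {
    sudo apt-key adv --keyserver "keyserver.ubuntu.com" --recv "7F0CEB10" &&
    echo "deb http://downloads-distro.mongodb.org/repo/debian-sysvinit dist 10gen" | sudo tee "/etc/apt/sources.list.d/mongodb.list" > /dev/null &&
    sudo apt-get update &&
    sudo apt-get install mongodb-org="$VERSION"
}
sudo update-rc.d mongod remove -f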

It just seems like one more thing to learn in an already-complex system :-)

User-provided SSH info is ignored

Caveat: I'm not too experienced with Clojure, but...

It looks like Jepsen uses threads created by pmap to open SSH sessions, and keeps the connection information in dynamic variables. This doesn't work, because the dynamic variable values are set in a different thread from the one where the SSH connection attempt takes place. As a result, the user-supplied username, password, etc. are ignored, and the defaults are always used.
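
To illustrate the general Clojure behaviour described above, here is a minimal sketch; it is not Jepsen's actual code. The names `*username*`, `describe-login`, and `run-in-thread` are hypothetical stand-ins. A `binding` is thread-local, so a function run on a freshly created thread sees the var's root value unless the bindings are explicitly carried over, for example with `bound-fn*`. Whether this is exactly how the reported problem arises in Jepsen is an assumption.

```clojure
;; Hypothetical stand-ins for illustration; not Jepsen's actual vars or functions.
(def ^:dynamic *username* "root")

(defn describe-login
  "Returns the login that a session for `node` would use."
  [node]
  (str *username* "@" node))

(defn run-in-thread
  "Runs the zero-argument function f on a brand-new thread and returns its result."
  [f]
  (let [result (promise)]
    (doto (Thread. (fn [] (deliver result (f))))
      (.start)
      (.join))
    @result))

(def plain-result
  (binding [*username* "admin"]
    ;; A raw Thread starts with the root bindings, so "admin" is not visible.
    (run-in-thread #(describe-login "n1"))))

(def conveyed-result
  (binding [*username* "admin"]
    ;; bound-fn* snapshots the current bindings and reinstalls them on
    ;; whichever thread eventually calls the function.
    (run-in-thread (bound-fn* #(describe-login "n1")))))

(println plain-result)
(println conveyed-result)
```

In this sketch the first call falls back to the default user, while the second keeps the user-supplied value. Passing the connection settings as explicit arguments would avoid the problem altogether.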

Problem with SSH loggers?

I have spun up the jepsen-vagrant environment and gotten past the issues noted in #39, but now I get the error "Auth fail". I'm able to run 'ssh root@n1' manually. It would be nice to see more verbose messages about this failure; the output also seems to complain that the logger for clj-ssh hasn't been set up properly:

-- snip --

vagrant@jepsen:/jepsen$ !lein
lein with-profile +rabbitmq test jepsen.system.rabbitmq-test

lein test jepsen.system.rabbitmq-test
SLF4J: The following loggers will not work becasue they were created
SLF4J: during the default configuration phase of the underlying logging system.
SLF4J: See also http://www.slf4j.org/codes.html#substituteLogger
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh

lein test :only jepsen.system.rabbitmq-test/rabbit-test

ERROR in (rabbit-test) (Session.java:512)
Uncaught exception, not in assertion.
expected: nil
actual: com.jcraft.jsch.JSchException: Auth fail
at com.jcraft.jsch.Session.connect (Session.java:512)
com.jcraft.jsch.Session.connect (Session.java:183)
clj_ssh.ssh$connect.invoke (ssh.clj:327)
jepsen.control$session.invoke (control.clj:182)
clojure.lang.AFn.applyToHelper (AFn.java:154)
clojure.lang.AFn.applyTo (AFn.java:144)
clojure.core$apply.invoke (core.clj:624)
jepsen.core$fcatch$wrapper__4829.doInvoke (core.clj:39)
clojure.lang.RestFn.invoke (RestFn.java:408)
clojure.core$pmap$fn__6328$fn__6329.invoke (core.clj:6463)
clojure.core$binding_conveyor_fn$fn__4145.invoke (core.clj:1910)
clojure.lang.AFn.call (AFn.java:18)
java.util.concurrent.FutureTask.run (FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
java.lang.Thread.run (Thread.java:745)

Ran 1 tests containing 1 assertions.
0 failures, 1 errors.
Tests failed.
Error encountered performing task 'test' with profile(s): 'base,system,user,provided,dev,rabbitmq'
Tests failed.

-- snip --

I'm happy to help continue troubleshooting this, and have even sent a PR to jepsen-vagrant to note some of the steps I took to get past hurdles, but I'm drawing a blank here so far.

Thanks in advance to anyone who has time to help with this!

Basic instruction needed

I am trying to run this against ES 1.5 (I also tried 2.0).
The command I ran is: lein test

I got the error below:

java.io.FileNotFoundException: /home/qa/sathish/ES/elasticsearch/target/classes/META-INF/maven/elasticsearch/elasticsearch/pom.properties (No such file or directory)
at java.io.FileOutputStream.open0 (FileOutputStream.java:-2)
java.io.FileOutputStream.open (FileOutputStream.java:270)
java.io.FileOutputStream.<init> (FileOutputStream.java:213)
clojure.java.io$fn__9522.invokeStatic (io.clj:230)
clojure.java.io/fn (io.clj:230)
clojure.java.io$fn__9459$G__9428__9466.invoke (io.clj:69)
clojure.java.io$fn__9534.invokeStatic (io.clj:263)
clojure.java.io/fn (io.clj:259)
clojure.java.io$fn__9459$G__9428__9466.invoke (io.clj:69)
clojure.java.io$fn__9496.invokeStatic (io.clj:166)
clojure.java.io/fn (io.clj:166)
clojure.java.io$fn__9472$G__9424__9479.invoke (io.clj:69)
clojure.java.io$writer.invokeStatic (io.clj:119)
clojure.java.io$writer.doInvoke (io.clj:104)
clojure.lang.RestFn.invoke (RestFn.java:410)
clojure.lang.AFn.applyToHelper (AFn.java:154)
clojure.lang.RestFn.applyTo (RestFn.java:132)
clojure.core$apply.invokeStatic (core.clj:648)
clojure.core$spit.invokeStatic (core.clj:6668)
clojure.core$spit.doInvoke (core.clj:6668)
clojure.lang.RestFn.invoke (RestFn.java:425)
leiningen.core.eval$write_pom_properties.invokeStatic (eval.clj:40)
leiningen.core.eval$write_pom_properties.invoke (eval.clj:35)
leiningen.core.eval$prep.invokeStatic (eval.clj:83)
leiningen.core.eval$prep.invoke (eval.clj:72)
leiningen.core.eval$eval_in_project.invokeStatic (eval.clj:369)
leiningen.core.eval$eval_in_project.invoke (eval.clj:363)
leiningen.test$test.invokeStatic (test.clj:196)
leiningen.test$test.doInvoke (test.clj:163)
clojure.lang.RestFn.invoke (RestFn.java:410)
clojure.lang.Var.invoke (Var.java:379)
clojure.lang.AFn.applyToHelper (AFn.java:154)
clojure.lang.Var.applyTo (Var.java:700)
clojure.core$apply.invokeStatic (core.clj:648)
clojure.core$apply.invoke (core.clj:641)
leiningen.core.main$partial_task$fn__5829.doInvoke (main.clj:272)
clojure.lang.RestFn.invoke (RestFn.java:410)
clojure.lang.AFn.applyToHelper (AFn.java:154)
clojure.lang.RestFn.applyTo (RestFn.java:132)
clojure.lang.AFunction$1.doInvoke (AFunction.java:29)
clojure.lang.RestFn.applyTo (RestFn.java:137)
clojure.core$apply.invokeStatic (core.clj:648)
clojure.core$apply.invoke (core.clj:641)
leiningen.core.main$apply_task.invokeStatic (main.clj:322)
leiningen.core.main$apply_task.invoke (main.clj:308)
leiningen.core.main$resolve_and_apply.invokeStatic (main.clj:328)
leiningen.core.main$resolve_and_apply.invoke (main.clj:324)
leiningen.core.main$_main$fn__5895.invoke (main.clj:401)
leiningen.core.main$_main.invokeStatic (main.clj:394)
leiningen.core.main$_main.doInvoke (main.clj:391)
clojure.lang.RestFn.invoke (RestFn.java:408)
clojure.lang.Var.invoke (Var.java:379)
clojure.lang.AFn.applyToHelper (AFn.java:154)
clojure.lang.Var.applyTo (Var.java:700)
clojure.core$apply.invokeStatic (core.clj:646)
clojure.main$main_opt.invokeStatic (main.clj:314)
clojure.main$main_opt.invoke (main.clj:310)
clojure.main$main.invokeStatic (main.clj:421)
clojure.main$main.doInvoke (main.clj:384)
clojure.lang.RestFn.invoke (RestFn.java:436)
clojure.lang.Var.invoke (Var.java:388)
clojure.lang.AFn.applyToHelper (AFn.java:160)
clojure.lang.Var.applyTo (Var.java:700)
clojure.main.main (main.java:37)

Could someone help me resolve this?

Authentication failure when using the base setup task.

14:43:38 Authentication failed for user ubuntu@n5
14:43:38 /home/xeno/.gem/ruby/2.1.0/gems/net-ssh-2.8.0/lib/net/ssh.rb:217:in `start'
         /home/xeno/.gem/ruby/2.1.0/gems/salticid-0.9.8/lib/salticid/host.rb:491:in `block in ssh'
         /home/xeno/.gem/ruby/2.1.0/gems/salticid-0.9.8/lib/salticid/host.rb:483:in `synchronize'
         /home/xeno/.gem/ruby/2.1.0/gems/salticid-0.9.8/lib/salticid/host.rb:483:in `ssh'
         /home/xeno/.gem/ruby/2.1.0/gems/salticid-0.9.8/lib/salticid/host.rb:179:in `exec!'
         /home/xeno/dev/jepsen/salticid/main.rb:14:in `block (5 levels) in load'
         /home/xeno/.gem/ruby/2.1.0/gems/salticid-0.9.8/lib/salticid/host.rb:48:in `as'
         /home/xeno/.gem/ruby/2.1.0/gems/salticid-0.9.8/lib/salticid/host.rb:500:in `sudo'
         /home/xeno/dev/jepsen/salticid/main.rb:13:in `block (4 levels) in load'
         /home/xeno/.gem/ruby/2.1.0/gems/salticid-0.9.8/lib/snippets/object/instance_exec.rb:8:in `instance_exec'
         /home/xeno/.gem/ruby/2.1.0/gems/salticid-0.9.8/lib/salticid/task.rb:36:in `run'
         /home/xeno/.gem/ruby/2.1.0/gems/salticid-0.9.8/lib/salticid/role_proxy.rb:24:in `method_missing'
         command line:1:in `block (2 levels) in <top (required)>'
         /home/xeno/.gem/ruby/2.1.0/gems/salticid-0.9.8/bin/salticid:98:in `instance_eval'
         /home/xeno/.gem/ruby/2.1.0/gems/salticid-0.9.8/bin/salticid:98:in `block (2 levels) in <top (required)>'

Is net-ssh supposed to be pinned?

Error running etcd test

I'm getting the following error when running the etcd test using the Docker option:

$ lein with-profile etcd test :only jepsen.system.etcd-test/register-test

WARN ignoring checkouts directory knossos as it does not contain a project.clj file.

lein test jepsen.system.etcd-test

lein test :only jepsen.system.etcd-test/register-test

ERROR in (register-test) (AFn.java:429)
Uncaught exception, not in assertion.
expected: nil
actual: clojure.lang.ArityException: Wrong number of args (1) passed to: generator/delay
at clojure.lang.AFn.throwArity (AFn.java:429)
clojure.lang.AFn.invoke (AFn.java:32)
jepsen.system.etcd_test/fn (etcd_test.clj:31)
clojure.test$test_var$fn__7187.invoke (test.clj:704)
clojure.test$test_var.invoke (test.clj:704)
clojure.test$test_vars$fn__7209$fn__7214.invoke (test.clj:722)
clojure.test$default_fixture.invoke (test.clj:674)
clojure.test$test_vars$fn__7209.invoke (test.clj:722)
clojure.test$default_fixture.invoke (test.clj:674)
clojure.test$test_vars.invoke (test.clj:718)
clojure.test$test_all_vars.invoke (test.clj:728)
clojure.test$test_ns.invoke (test.clj:747)
clojure.core$map$fn__4245.invoke (core.clj:2559)
clojure.lang.LazySeq.sval (LazySeq.java:40)
clojure.lang.LazySeq.seq (LazySeq.java:49)
clojure.lang.Cons.next (Cons.java:39)
clojure.lang.RT.boundedLength (RT.java:1654)
clojure.lang.RestFn.applyTo (RestFn.java:130)
clojure.core$apply.invoke (core.clj:626)
clojure.test$run_tests.doInvoke (test.clj:762)
clojure.lang.RestFn.applyTo (RestFn.java:137)
clojure.core$apply.invoke (core.clj:624)
user$eval85$fn__200$fn__251.invoke (form-init4707164460779162580.clj:1)
user$eval85$fn__200$fn__201.invoke (form-init4707164460779162580.clj:1)
user$eval85$fn__200.invoke (form-init4707164460779162580.clj:1)
user$eval85.invoke (form-init4707164460779162580.clj:1)
clojure.lang.Compiler.eval (Compiler.java:6703)
clojure.lang.Compiler.eval (Compiler.java:6693)
clojure.lang.Compiler.load (Compiler.java:7130)
clojure.lang.Compiler.loadFile (Compiler.java:7086)
clojure.main$load_script.invoke (main.clj:274)
clojure.main$init_opt.invoke (main.clj:279)
clojure.main$initialize.invoke (main.clj:307)
clojure.main$null_opt.invoke (main.clj:342)
clojure.main$main.doInvoke (main.clj:420)
clojure.lang.RestFn.invoke (RestFn.java:421)
clojure.lang.Var.invoke (Var.java:383)
clojure.lang.AFn.applyToHelper (AFn.java:156)
clojure.lang.Var.applyTo (Var.java:700)
clojure.main.main (main.java:37)

Ran 1 tests containing 1 assertions.
0 failures, 1 errors.
Tests failed.
Error encountered performing task 'test' with profile(s): 'etcd'
Tests failed

What am I doing wrong?

Aerospike tests failed

I ran the Aerospike test and got the following errors:

lein test :only aerospike.core-test/cas-register

ERROR in (cas-register) (ArrayList.java:177)
Uncaught exception, not in assertion.
expected: nil
  actual: java.lang.NullPointerException: null
 at java.util.ArrayList.<init> (ArrayList.java:177)
    clojure.core$shuffle.invoke (core.clj:6665)
    jepsen.report$linearizability.invoke (report.clj:36)
    aerospike.core_test/fn (core_test.clj:13)
    clojure.test$test_var$fn__7187.invoke (test.clj:704)
    clojure.test$test_var.invoke (test.clj:704)
    clojure.test$test_vars$fn__7209$fn__7214.invoke (test.clj:722)
    clojure.test$default_fixture.invoke (test.clj:674)
    clojure.test$test_vars$fn__7209.invoke (test.clj:722)
    clojure.test$default_fixture.invoke (test.clj:674)
    clojure.test$test_vars.invoke (test.clj:718)
    clojure.test$test_all_vars.invoke (test.clj:728)
    clojure.test$test_ns.invoke (test.clj:747)
    clojure.core$map$fn__4245.invoke (core.clj:2559)
    clojure.lang.LazySeq.sval (LazySeq.java:40)
    clojure.lang.LazySeq.seq (LazySeq.java:49)
    clojure.lang.Cons.next (Cons.java:39)
    clojure.lang.RT.boundedLength (RT.java:1654)
    clojure.lang.RestFn.applyTo (RestFn.java:130)
    clojure.core$apply.invoke (core.clj:626)
    clojure.test$run_tests.doInvoke (test.clj:762)
    clojure.lang.RestFn.applyTo (RestFn.java:137)
    clojure.core$apply.invoke (core.clj:624)
    user$eval85$fn__140$fn__171.invoke (form-init6726222666022464519.clj:1)
    user$eval85$fn__140$fn__141.invoke (form-init6726222666022464519.clj:1)
    user$eval85$fn__140.invoke (form-init6726222666022464519.clj:1)
    user$eval85.invoke (form-init6726222666022464519.clj:1)
    clojure.lang.Compiler.eval (Compiler.java:6703)
    clojure.lang.Compiler.eval (Compiler.java:6693)
    clojure.lang.Compiler.load (Compiler.java:7130)
    clojure.lang.Compiler.loadFile (Compiler.java:7086)
    clojure.main$load_script.invoke (main.clj:274)
    clojure.main$init_opt.invoke (main.clj:279)
    clojure.main$initialize.invoke (main.clj:307)
    clojure.main$null_opt.invoke (main.clj:342)
    clojure.main$main.doInvoke (main.clj:420)
    clojure.lang.RestFn.invoke (RestFn.java:421)
    clojure.lang.Var.invoke (Var.java:383)
    clojure.lang.AFn.applyToHelper (AFn.java:156)
    clojure.lang.Var.applyTo (Var.java:700)
    clojure.main.main (main.java:37)

Ran 2 tests containing 3 assertions.
2 failures, 1 errors.
Tests failed. 

redis.clj bitrot?

Just a heads up, I believe redis/sentinel-app/add should either return ok or error, but it currently returns a long. This causes the following stacktrace when I try to run redis-sentinel:

java.lang.ClassCastException: java.lang.Long cannot be cast to clojure.lang.Associative
        at clojure.lang.RT.assoc(RT.java:703)
        at clojure.core$assoc.invoke(core.clj:187)
        at jepsen.load$wrap_latency$measure_latency__2589.invoke(load.clj:102)
        at jepsen.load$wrap_record_req$record_req__2601.invoke(load.clj:129)
        at jepsen.console$wrap_ordered_log$logger__2560.invoke(console.clj:109)
        at jepsen.load$map_fixed_rate$boss__2582$fn__2584.invoke(load.clj:65)
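
The failure mode in this trace can be reproduced in isolation. The sketch below is not the actual `jepsen.load` or redis code; `wrap-latency`, `add-returning-long`, and `add-returning-map` are hypothetical stand-ins, and the shape of the result map is an assumption. It shows that a wrapper which calls `assoc` on an operation's result works when the operation returns a map and throws `ClassCastException` when it returns a bare number.

```clojure
;; Hypothetical stand-ins for illustration; not the actual jepsen.load or redis code.
(defn wrap-latency
  "Wraps app so that its result map also carries the elapsed time in nanoseconds."
  [app]
  (fn [req]
    (let [start  (System/nanoTime)
          result (app req)]
      ;; assoc requires an associative value such as a map.
      (assoc result :latency (- (System/nanoTime) start)))))

(defn add-returning-long
  "Mimics a client call that leaks the raw numeric reply."
  [req]
  1)

(defn add-returning-map
  "Converts the raw reply into a result map before returning it."
  [req]
  {:state :ok, :reply 1})

(def good ((wrap-latency add-returning-map) {:element 42}))

(def failure
  (try
    ((wrap-latency add-returning-long) {:element 42})
    (catch ClassCastException e
      (class e))))

(println (:state good))
(println failure)
```

Under these assumptions, the fix would be on the client side: convert the numeric reply into a result value before returning it, rather than changing the wrapper.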

Hazelcast and Coherence

I would love to see these added to the suite of tests. If no one grabs them, I'll work on them when I have a chance.

Thanks

ERROR in (create-test) (FutureTask.java:122)

FYI: I've been running the Elasticsearch tests in a loop and recently saw this failure (the test runs before and after it finished just fine). I'm dumping the full Jepsen output here, as I don't understand its internals:

lein test jepsen.system.elasticsearch-test
INFO  jepsen.os.debian - :n5 setting up debian
INFO  jepsen.os.debian - :n4 setting up debian
INFO  jepsen.os.debian - :n3 setting up debian
INFO  jepsen.os.debian - :n1 setting up debian
INFO  jepsen.os.debian - :n2 setting up debian
INFO  jepsen.os.debian - :n3 debian set up
INFO  jepsen.os.debian - :n1 debian set up
INFO  jepsen.os.debian - :n5 debian set up
INFO  jepsen.os.debian - :n4 debian set up
INFO  jepsen.os.debian - :n2 debian set up
INFO  jepsen.system.elasticsearch - :n3 elasticsearch nuked
INFO  jepsen.system.elasticsearch - :n2 elasticsearch nuked
INFO  jepsen.system.elasticsearch - :n5 elasticsearch nuked
INFO  jepsen.system.elasticsearch - :n4 elasticsearch nuked
INFO  jepsen.system.elasticsearch - :n1 elasticsearch nuked
INFO  jepsen.system.elasticsearch - :n4 configuring elasticsearch
INFO  jepsen.system.elasticsearch - :n3 configuring elasticsearch
INFO  jepsen.system.elasticsearch - :n2 configuring elasticsearch
INFO  jepsen.system.elasticsearch - :n1 configuring elasticsearch
INFO  jepsen.system.elasticsearch - :n5 configuring elasticsearch
INFO  jepsen.system.elasticsearch - :n4 starting elasticsearch
INFO  jepsen.system.elasticsearch - :n2 starting elasticsearch
INFO  jepsen.system.elasticsearch - :n1 starting elasticsearch
INFO  jepsen.system.elasticsearch - :n5 starting elasticsearch
INFO  jepsen.system.elasticsearch - :n4 elasticsearch ready
INFO  jepsen.system.elasticsearch - :n2 elasticsearch ready
INFO  jepsen.system.elasticsearch - :n5 elasticsearch ready
INFO  jepsen.system.elasticsearch - :n1 elasticsearch ready
INFO  jepsen.system.elasticsearch - :n3 elasticsearch nuked
INFO  jepsen.system.elasticsearch - :n4 elasticsearch nuked
INFO  jepsen.system.elasticsearch - :n5 elasticsearch nuked
INFO  jepsen.system.elasticsearch - :n2 elasticsearch nuked
INFO  jepsen.system.elasticsearch - :n1 elasticsearch nuked

lein test :only jepsen.system.elasticsearch-test/create-test

ERROR in (create-test) (FutureTask.java:122)
Uncaught exception, not in assertion.
expected: nil
  actual: java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: com.jcraft.jsch.JSchException: verify: false
 at java.util.concurrent.FutureTask.report (FutureTask.java:122)
    java.util.concurrent.FutureTask.get (FutureTask.java:188)
    clojure.core$deref_future.invoke (core.clj:2180)
    clojure.core$future_call$reify__6320.deref (core.clj:6417)
    clojure.core$deref.invoke (core.clj:2200)
    clojure.core$map$fn__4245.invoke (core.clj:2559)
    clojure.lang.LazySeq.sval (LazySeq.java:40)
    clojure.lang.LazySeq.seq (LazySeq.java:49)
    clojure.lang.Cons.next (Cons.java:39)
    clojure.lang.RT.next (RT.java:599)
    clojure.core$next.invoke (core.clj:64)
    clojure.core$dorun.invoke (core.clj:2856)
    jepsen.core$on_nodes.invoke (core.clj:72)
    jepsen.core$run_BANG_.invoke (core.clj:315)
    jepsen.system.elasticsearch_test/fn (elasticsearch_test.clj:87)
    clojure.test$test_var$fn__7187.invoke (test.clj:704)
    clojure.test$test_var.invoke (test.clj:704)
    clojure.test$test_vars$fn__7209$fn__7214.invoke (test.clj:721)
    clojure.test$default_fixture.invoke (test.clj:674)
    clojure.test$test_vars$fn__7209.invoke (test.clj:721)
    clojure.test$default_fixture.invoke (test.clj:674)
    clojure.test$test_vars.invoke (test.clj:718)
    clojure.test$test_all_vars.invoke (test.clj:727)
    clojure.test$test_ns.invoke (test.clj:746)
    clojure.core$map$fn__4245.invoke (core.clj:2559)
    clojure.lang.LazySeq.sval (LazySeq.java:40)
    clojure.lang.LazySeq.seq (LazySeq.java:49)
    clojure.lang.Cons.next (Cons.java:39)
    clojure.lang.RT.boundedLength (RT.java:1655)
    clojure.lang.RestFn.applyTo (RestFn.java:130)
    clojure.core$apply.invoke (core.clj:626)
    clojure.test$run_tests.doInvoke (test.clj:761)
    clojure.lang.RestFn.applyTo (RestFn.java:137)
    clojure.core$apply.invoke (core.clj:624)
    user$eval85$fn__140$fn__171.invoke (form-init968023791120871162.clj:1)
    user$eval85$fn__140$fn__141.invoke (form-init968023791120871162.clj:1)
    user$eval85$fn__140.invoke (form-init968023791120871162.clj:1)
    user$eval85.invoke (form-init968023791120871162.clj:1)
    clojure.lang.Compiler.eval (Compiler.java:6676)
    clojure.lang.Compiler.eval (Compiler.java:6666)
    clojure.lang.Compiler.load (Compiler.java:7103)
    clojure.lang.Compiler.loadFile (Compiler.java:7059)
    clojure.main$load_script.invoke (main.clj:274)
    clojure.main$init_opt.invoke (main.clj:279)
    clojure.main$initialize.invoke (main.clj:307)
    clojure.main$null_opt.invoke (main.clj:342)
    clojure.main$main.doInvoke (main.clj:420)
    clojure.lang.RestFn.invoke (RestFn.java:421)
    clojure.lang.Var.invoke (Var.java:383)
    clojure.lang.AFn.applyToHelper (AFn.java:156)
    clojure.lang.Var.applyTo (Var.java:700)
    clojure.main.main (main.java:37)
Caused by: java.util.concurrent.ExecutionException: com.jcraft.jsch.JSchException: verify: false
 at java.util.concurrent.FutureTask.report (FutureTask.java:122)
    java.util.concurrent.FutureTask.get (FutureTask.java:188)
    clojure.core$deref_future.invoke (core.clj:2180)
    clojure.core$future_call$reify__6320.deref (core.clj:6417)
    clojure.core$deref.invoke (core.clj:2200)
    clojure.core$map$fn__4245.invoke (core.clj:2557)
    clojure.lang.LazySeq.sval (LazySeq.java:40)
    clojure.lang.LazySeq.seq (LazySeq.java:49)
    clojure.lang.RT.seq (RT.java:484)
    clojure.core$seq.invoke (core.clj:133)
    clojure.core$map$fn__4249.invoke (core.clj:2562)
    clojure.lang.LazySeq.sval (LazySeq.java:40)
    clojure.lang.LazySeq.seq (LazySeq.java:49)
    clojure.lang.RT.seq (RT.java:484)
    clojure.core$seq.invoke (core.clj:133)
    clojure.core.protocols$seq_reduce.invoke (protocols.clj:30)
    clojure.core.protocols/fn (protocols.clj:54)
    clojure.core.protocols$fn__6031$G__6026__6044.invoke (protocols.clj:13)
    clojure.core$reduce.invoke (core.clj:6286)
    clojure.core$into.invoke (core.clj:6338)
    jepsen.system.elasticsearch$reify__5932$fn__5933.invoke (elasticsearch.clj:110)
    jepsen.system.elasticsearch$reify__5932.setup_BANG_ (elasticsearch.clj:74)
    jepsen.db$cycle_BANG_.invoke (db.clj:22)
    clojure.lang.AFn.applyToHelper (AFn.java:160)
    clojure.lang.AFn.applyTo (AFn.java:144)
    clojure.core$apply.invoke (core.clj:626)
    clojure.core$partial$fn__4228.doInvoke (core.clj:2468)
    clojure.lang.RestFn.invoke (RestFn.java:421)
    jepsen.core$on_nodes$fn__7382.invoke (core.clj:71)
    clojure.core$pmap$fn__6328$fn__6329.invoke (core.clj:6463)
    clojure.core$binding_conveyor_fn$fn__4145.invoke (core.clj:1910)
    clojure.lang.AFn.call (AFn.java:18)
    java.util.concurrent.FutureTask.run (FutureTask.java:262)
    java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1145)
    java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:615)
    java.lang.Thread.run (Thread.java:744)
Caused by: com.jcraft.jsch.JSchException: verify: false
 at com.jcraft.jsch.Session.connect (Session.java:330)
    com.jcraft.jsch.Session.connect (Session.java:183)
    clj_ssh.ssh$connect.invoke (ssh.clj:327)
    jepsen.control$session.invoke (control.clj:177)
    jepsen.system.elasticsearch$reify__5932$fn__5933$fn__5943$fn__5944.invoke (elasticsearch.clj:110)
    clojure.core$binding_conveyor_fn$fn__4145.invoke (core.clj:1910)
    clojure.lang.AFn.call (AFn.java:18)
    java.util.concurrent.FutureTask.run (FutureTask.java:262)
    java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1145)
    java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:615)
    java.lang.Thread.run (Thread.java:744)

Ran 1 tests containing 1 assertions.
0 failures, 1 errors.
Error encountered performing task 'test' with profile(s): 'base,system,user,provided,dev,elasticsearch'
Tests failed.

SmartOS Support

Based on a Twitter discussion, there seems to be demand for SmartOS support (principally for testing Manatee).

https://twitter.com/notmatt/status/638806846241812480

This would encompass a jepsen.os.smartos namespace with appropriate analogues to the functions in jepsen.os.debian, as well as some work with jepsen.net, jepsen.control.net, and jepsen.control.util.

I'm happy to PR this if there is interest.

Broken link in https://github.com/aphyr/jepsen/blob/master/doc/db.md

Quoting https://github.com/aphyr/jepsen/blob/master/doc/db.md,

noop-test, like all Jepsen tests, is a map with keys like :os, :name, :db, etc. See jepsen.core for an overview of test structure, and jepsen.core/run for the full definition of a test.

The jepsen.core here is a link to https://github.com/aphyr/jepsen/blob/master/doc/jepsen/src/jepsen/core.clj which is no longer there. Should it be https://github.com/aphyr/jepsen/blob/master/jepsen/src/jepsen/core.clj instead?

jepsen tests failed, not #39?

Every test fails as shown below. I have ssh StrictHostKeyChecking no set in ~/ssh/config, and there is no known_hosts file.

The system is Debian Jessie 8.4. Installing OpenJDK 1.8 is not straightforward, as it is still in testing.

I think this is similar to issue #39, but not the same. It looks like some sort of Java JNA conflict.
This happens 100% of the time, for all tests. Here is the output for the ZooKeeper test.

Please help me!

gary@debian-jepsen1:~/jepsen/zookeeper$ lein test
INFO org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.4.0-1202560, built on 11/16/2011 07:18 GMT
INFO org.apache.zookeeper.ZooKeeper - Client environment:host.name=debian-jepsen1
INFO org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.8.0_91
INFO org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Oracle Corporation
INFO org.apache.zookeeper.ZooKeeper - Client environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre
INFO org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=/home/gary/jepsen/zookeeper/test:/home/gary/jepsen/zookeeper/src:/home/gary/jepsen/zookeeper/dev-resources:/home/gary/jepsen/zookeeper/resources:/home/gary/jepsen/zookeeper/target/classes:/home/gary/.m2/repository/net/java/dev/jna/platform/3.4.0/platform-3.4.0.jar:/home/gary/.m2/repository/org/clojure/math.combinatorics/0.1.1/math.combinatorics-0.1.1.jar:/home/gary/.m2/repository/org/apache/zookeeper/zookeeper/3.4.0/zookeeper-3.4.0.jar:/home/gary/.m2/repository/org/clojure/algo.generic/0.1.2/algo.generic-0.1.2.jar:/home/gary/.m2/repository/org/jboss/netty/netty/3.2.2.Final/netty-3.2.2.Final.jar:/home/gary/.m2/repository/net/java/dev/jna/jna/3.4.0/jna-3.4.0.jar:/home/gary/.m2/repository/interval-metrics/interval-metrics/1.0.0/interval-metrics-1.0.0.jar:/home/gary/.m2/repository/knossos/knossos/0.2.4/knossos-0.2.4.jar:/home/gary/.m2/repository/clj-ssh/clj-ssh/0.5.11/clj-ssh-0.5.11.jar:/home/gary/.m2/repository/com/jcraft/jsch.agentproxy.jsch/0.0.7/jsch.agentproxy.jsch-0.0.7.jar:/home/gary/.m2/repository/joda-time/joda-time/2.2/joda-time-2.2.jar:/home/gary/.m2/repository/jepsen/jepsen/0.0.9/jepsen-0.0.9.jar:/home/gary/.m2/repository/org/clojure/tools.nrepl/0.2.12/tools.nrepl-0.2.12.jar:/home/gary/.m2/repository/org/slf4j/slf4j-api/1.6.1/slf4j-api-1.6.1.jar:/home/gary/.m2/repository/com/jcraft/jsch.agentproxy.sshagent/0.0.7/jsch.agentproxy.sshagent-0.0.7.jar:/home/gary/.m2/repository/zookeeper-clj/zookeeper-clj/0.9.3/zookeeper-clj-0.9.3.jar:/home/gary/.m2/repository/clojure-complete/clojure-complete/0.2.4/clojure-complete-0.2.4.jar:/home/gary/.m2/repository/com/jcraft/jsch.agentproxy.pageant/0.0.7/jsch.agentproxy.pageant-0.0.7.jar:/home/gary/.m2/repository/commons-codec/commons-codec/1.7/commons-codec-1.7.jar:/home/gary/.m2/repository/clj-tuple/clj-tuple/0.1.2/clj-tuple-0.1.2.jar:/home/gary/.m2/repository/com/jcraft/jsch.agentproxy.usocket-nc/0.0.7/jsch.agentproxy.usocket-nc-0.0.7.jar:/home/gary/.m
2/repository/com/jcraft/jsch.agentproxy.core/0.0.7/jsch.agentproxy.core-0.0.7.jar:/home/gary/.m2/repository/com/boundary/high-scale-lib/1.0.6/high-scale-lib-1.0.6.jar:/home/gary/.m2/repository/org/clojars/pallix/analemma/1.0.0/analemma-1.0.0.jar:/home/gary/.m2/repository/clj-time/clj-time/0.6.0/clj-time-0.6.0.jar:/home/gary/.m2/repository/hiccup/hiccup/1.0.5/hiccup-1.0.5.jar:/home/gary/.m2/repository/org/clojure/tools.logging/0.2.6/tools.logging-0.2.6.jar:/home/gary/.m2/repository/junit/junit/3.8.1/junit-3.8.1.jar:/home/gary/.m2/repository/potemkin/potemkin/0.3.4/potemkin-0.3.4.jar:/home/gary/.m2/repository/org/clojure/clojure/1.7.0/clojure-1.7.0.jar:/home/gary/.m2/repository/org/slf4j/slf4j-log4j12/1.6.1/slf4j-log4j12-1.6.1.jar:/home/gary/.m2/repository/com/jcraft/jsch/0.1.51/jsch-0.1.51.jar:/home/gary/.m2/repository/log4j/log4j/1.2.17/log4j-1.2.17.jar:/home/gary/.m2/repository/byte-streams/byte-streams/0.1.4/byte-streams-0.1.4.jar:/home/gary/.m2/repository/org/fressian/fressian/0.6.3/fressian-0.6.3.jar:/home/gary/.m2/repository/com/jcraft/jsch.agentproxy.usocket-jna/0.0.7/jsch.agentproxy.usocket-jna-0.0.7.jar:/home/gary/.m2/repository/gnuplot/gnuplot/0.1.1/gnuplot-0.1.1.jar:/home/gary/.m2/repository/org/clojars/achim/multiset/0.1.0/multiset-0.1.0.jar:/home/gary/.m2/repository/org/clojure/data.fressian/0.2.0/data.fressian-0.2.0.jar:/home/gary/.m2/repository/jline/jline/0.9.94/jline-0.9.94.jar:/home/gary/.m2/repository/avout/avout/0.5.4/avout-0.5.4.jar:/home/gary/.m2/repository/riddley/riddley/0.1.6/riddley-0.1.6.jar
INFO org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
INFO org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
INFO org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=
INFO org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
INFO org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
INFO org.apache.zookeeper.ZooKeeper - Client environment:os.version=3.16.0-4-amd64
INFO org.apache.zookeeper.ZooKeeper - Client environment:user.name=gary
INFO org.apache.zookeeper.ZooKeeper - Client environment:user.home=/home/gary
INFO org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/home/gary/jepsen/zookeeper

lein test jepsen.zookeeper-test

lein test :only jepsen.zookeeper-test/zk-test

ERROR in (zk-test) (FutureTask.java:122)
expected: (:valid? (:results (jepsen/run! (zk/zk-test "3.4.5+dfsg-2"))))
actual: java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: Could not initialize class com.jcraft.jsch.agentproxy.usocket.JNAUSocketFactory$CLibrary
at java.util.concurrent.FutureTask.report (FutureTask.java:122)
java.util.concurrent.FutureTask.get (FutureTask.java:192)
clojure.core$deref_future.invoke (core.clj:2186)
clojure.core$future_call$reify__6736.deref (core.clj:6683)
clojure.core$deref.invoke (core.clj:2206)
clojure.core$map$fn__4553.invoke (core.clj:2622)
clojure.lang.LazySeq.sval (LazySeq.java:40)
clojure.lang.LazySeq.seq (LazySeq.java:56)
clojure.lang.RT.seq (RT.java:507)
clojure.core/seq (core.clj:137)
clojure.core$dorun.invoke (core.clj:3009)
clojure.core$doall.invoke (core.clj:3025)
jepsen.core$run_BANG_$fn__6833.invoke (core.clj:398)
jepsen.core$run_BANG_.invoke (core.clj:379)
jepsen.zookeeper_test/fn (zookeeper_test.clj:7)
clojure.test$test_var$fn__7670.invoke (test.clj:704)
clojure.test$test_var.invoke (test.clj:704)
clojure.test$test_vars$fn__7692$fn__7697.invoke (test.clj:722)
clojure.test$default_fixture.invoke (test.clj:674)
clojure.test$test_vars$fn__7692.invoke (test.clj:722)
clojure.test$default_fixture.invoke (test.clj:674)
clojure.test$test_vars.invoke (test.clj:718)
clojure.test$test_all_vars.invoke (test.clj:728)
clojure.test$test_ns.invoke (test.clj:747)
clojure.core$map$fn__4553.invoke (core.clj:2624)
clojure.lang.LazySeq.sval (LazySeq.java:40)
clojure.lang.LazySeq.seq (LazySeq.java:49)
clojure.lang.Cons.next (Cons.java:39)
clojure.lang.RT.boundedLength (RT.java:1735)
clojure.lang.RestFn.applyTo (RestFn.java:130)
clojure.core$apply.invoke (core.clj:632)
clojure.test$run_tests.doInvoke (test.clj:762)
clojure.lang.RestFn.applyTo (RestFn.java:137)
clojure.core$apply.invoke (core.clj:630)
user$eval85$fn__144$fn__175.invoke (form-init802407220195414458.clj:1)
user$eval85$fn__144$fn__145.invoke (form-init802407220195414458.clj:1)
user$eval85$fn__144.invoke (form-init802407220195414458.clj:1)
user$eval85.invoke (form-init802407220195414458.clj:1)
clojure.lang.Compiler.eval (Compiler.java:6782)
clojure.lang.Compiler.eval (Compiler.java:6772)
clojure.lang.Compiler.load (Compiler.java:7227)
clojure.lang.Compiler.loadFile (Compiler.java:7165)
clojure.main$load_script.invoke (main.clj:275)
clojure.main$init_opt.invoke (main.clj:280)
clojure.main$initialize.invoke (main.clj:308)
clojure.main$null_opt.invoke (main.clj:343)
clojure.main$main.doInvoke (main.clj:421)
clojure.lang.RestFn.invoke (RestFn.java:421)
clojure.lang.Var.invoke (Var.java:383)
clojure.lang.AFn.applyToHelper (AFn.java:156)
clojure.lang.Var.applyTo (Var.java:700)
clojure.main.main (main.java:37)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class com.jcraft.jsch.agentproxy.usocket.JNAUSocketFactory$CLibrary
at com.jcraft.jsch.agentproxy.usocket.JNAUSocketFactory.open (JNAUSocketFactory.java:114)
com.jcraft.jsch.agentproxy.connector.SSHAgentConnector.open (SSHAgentConnector.java:80)
com.jcraft.jsch.agentproxy.connector.SSHAgentConnector.<init> (SSHAgentConnector.java:48)
clj_ssh.agent$sock_agent_connector.invoke (agent.clj:15)
clj_ssh.agent$connect.invoke (agent.clj:35)
clj_ssh.ssh$ssh_agent.invoke (ssh.clj:148)
jepsen.control$session.invoke (control.clj:193)
clojure.lang.AFn.applyToHelper (AFn.java:154)
clojure.lang.AFn.applyTo (AFn.java:144)
clojure.core$apply.invoke (core.clj:630)
clojure.core$with_bindings_STAR_.doInvoke (core.clj:1868)
clojure.lang.RestFn.applyTo (RestFn.java:142)
clojure.core$apply.invoke (core.clj:634)
clojure.core$bound_fn_STAR_$fn__4439.doInvoke (core.clj:1890)
clojure.lang.RestFn.applyTo (RestFn.java:137)
clojure.core$apply.invoke (core.clj:630)
jepsen.core$fcatch$wrapper__6696.doInvoke (core.clj:55)
clojure.lang.RestFn.invoke (RestFn.java:408)
clojure.core$pmap$fn__6744$fn__6745.invoke (core.clj:6729)
clojure.core$binding_conveyor_fn$fn__4444.invoke (core.clj:1916)
clojure.lang.AFn.call (AFn.java:18)
java.util.concurrent.FutureTask.run (FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
java.lang.Thread.run (Thread.java:745)

Ran 1 tests containing 1 assertions.
0 failures, 1 errors.
Tests failed.
gary@debian-jepsen1:~/jepsen/zookeeper$

mongodb idempotent error conversion seems to be backwards?

Hi - I might just be misunderstanding the intention, but in the macro "with-errors" in mongodb.core, the following code:

  `(let [error-type# (if (~idempotent-ops (:f ~op))
                   :fail
                   :info)]

Seems to be doing the opposite of what is intended: when called as (with-errors op #{:read} ...), it turns all read errors into :fail and all other errors into :info.

Apologies if I'm just not understanding the goal here.
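
To make the behaviour in question concrete, here is a minimal sketch of the quoted branch as a plain function. The name error-type is hypothetical and chosen for illustration; the real logic lives inside the with-errors macro, and only the if shown above is taken from it.

```clojure
;; Hypothetical helper, not part of mongodb.core: it reproduces only the
;; quoted branch so the mapping can be tried at a REPL.
(defn error-type
  "Returns :fail when the op's :f is in idempotent-ops, otherwise :info."
  [idempotent-ops op]
  (if (idempotent-ops (:f op))
    :fail
    :info))

(error-type #{:read} {:f :read})   ; => :fail
(error-type #{:read} {:f :write})  ; => :info
```

One possible reading, offered as an assumption rather than the author's stated intent: an errored read cannot have changed the database, so recording it as a definite :fail is safe, while an errored write may or may not have taken effect and so stays indeterminate (:info).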

jepsen depends on a version of knossos that doesn't exist

currently, jepsen depends on version 0.2.1-SNAPSHOT of knossos, which doesn't exist in clojars (the latest is 0.2)

igalic@levix ~/src/jepsen/jepsen (git)-[es-1.4] % lein deps :tree
Could not find artifact knossos:knossos:jar:0.2.1-SNAPSHOT in clojars (https://clojars.org/repo/)
This could be due to a typo in :dependencies or network issues.
If you are behind a proxy, try setting the 'http_proxy' environment variable.
1 igalic@levix ~/src/jepsen/jepsen (git)-[es-1.4] %

this seems to have been a very recent addition:

igalic@levix ~/src/jepsen/jepsen (git)-[es-1.4] % git blame project.clj | grep -i knoss
b3ba335f (Aphyr 2015-02-25 11:45:06 -0800  8)                  [knossos "0.2.1-SNAPSHOT"]
igalic@levix ~/src/jepsen/jepsen (git)-[es-1.4] %

i.e. since b3ba335

The scaffolding tutorial fails with java.lang.UnsatisfiedLinkError: Can't obtain static newInstance method for class com.sun.jna.Structure at com.sun.jna.Native.initIDs

While following https://github.com/aphyr/jepsen/blob/master/doc/scaffolding.md, I hit the following problem at the step "Next, we'll replace the example test that lein generated (test/jepsen/zookeeper_test.clj) with one that calls the zk-test function, runs the test that function returns, looks at the results, and ensures that the :valid? key is true."

The stack trace is as follows:

alexey@instance-1:~/jepsen.hazelcast$ lein test

lein test jepsen.hazelcast-test

lein test :only jepsen.hazelcast-test/hc-test

ERROR in (hc-test) (FutureTask.java:122)
expected: (:valid? (:results (jepsen/run! (hc/hc-test "3.6.3"))))
  actual: java.util.concurrent.ExecutionException: java.lang.UnsatisfiedLinkError: Can't obtain static newInstance method for class com.sun.jna.Structure
 at java.util.concurrent.FutureTask.report (FutureTask.java:122)
    java.util.concurrent.FutureTask.get (FutureTask.java:192)
    clojure.core$deref_future.invoke (core.clj:2186)
    clojure.core$future_call$reify__6736.deref (core.clj:6683)
    clojure.core$deref.invoke (core.clj:2206)
    clojure.core$pmap$step__6749$fn__6751.invoke (core.clj:6733)
    clojure.lang.LazySeq.sval (LazySeq.java:40)
    clojure.lang.LazySeq.seq (LazySeq.java:49)
    clojure.lang.RT.seq (RT.java:507)
    clojure.core/seq (core.clj:137)
    clojure.core$dorun.invoke (core.clj:3009)
    clojure.core$doall.invoke (core.clj:3025)
    jepsen.core$run_BANG_$fn__6833.invoke (core.clj:398)
    jepsen.core$run_BANG_.invoke (core.clj:379)
    jepsen.hazelcast_test/fn (hazelcast_test.clj:7)
    clojure.test$test_var$fn__7670.invoke (test.clj:704)
    clojure.test$test_var.invoke (test.clj:704)
    clojure.test$test_vars$fn__7692$fn__7697.invoke (test.clj:722)
    clojure.test$default_fixture.invoke (test.clj:674)
    clojure.test$test_vars$fn__7692.invoke (test.clj:722)
    clojure.test$default_fixture.invoke (test.clj:674)
    clojure.test$test_vars.invoke (test.clj:718)
    clojure.test$test_all_vars.invoke (test.clj:728)
    clojure.test$test_ns.invoke (test.clj:747)
    clojure.core$map$fn__4553.invoke (core.clj:2624)
    clojure.lang.LazySeq.sval (LazySeq.java:40)
    clojure.lang.LazySeq.seq (LazySeq.java:49)
    clojure.lang.Cons.next (Cons.java:39)
    clojure.lang.RT.boundedLength (RT.java:1735)
    clojure.lang.RestFn.applyTo (RestFn.java:130)
    clojure.core$apply.invoke (core.clj:632)
    clojure.test$run_tests.doInvoke (test.clj:762)
    clojure.lang.RestFn.applyTo (RestFn.java:137)
    clojure.core$apply.invoke (core.clj:630)
    user$eval85$fn__144$fn__175.invoke (form-init3003048779867791666.clj:1)
    user$eval85$fn__144$fn__145.invoke (form-init3003048779867791666.clj:1)
    user$eval85$fn__144.invoke (form-init3003048779867791666.clj:1)
    user$eval85.invoke (form-init3003048779867791666.clj:1)
    clojure.lang.Compiler.eval (Compiler.java:6782)
    clojure.lang.Compiler.eval (Compiler.java:6772)
    clojure.lang.Compiler.load (Compiler.java:7227)
    clojure.lang.Compiler.loadFile (Compiler.java:7165)
    clojure.main$load_script.invoke (main.clj:275)
    clojure.main$init_opt.invoke (main.clj:280)
    clojure.main$initialize.invoke (main.clj:308)
    clojure.main$null_opt.invoke (main.clj:343)
    clojure.main$main.doInvoke (main.clj:421)
    clojure.lang.RestFn.invoke (RestFn.java:421)
    clojure.lang.Var.invoke (Var.java:383)
    clojure.lang.AFn.applyToHelper (AFn.java:156)
    clojure.lang.Var.applyTo (Var.java:700)
    clojure.main.main (main.java:37)
Caused by: java.lang.UnsatisfiedLinkError: Can't obtain static newInstance method for class com.sun.jna.Structure
 at com.sun.jna.Native.initIDs (Native.java:-2)
    com.sun.jna.Native.<clinit> (Native.java:135)
    com.jcraft.jsch.agentproxy.usocket.JNAUSocketFactory$CLibrary.<clinit> (JNAUSocketFactory.java:47)
    com.jcraft.jsch.agentproxy.usocket.JNAUSocketFactory.open (JNAUSocketFactory.java:114)
    com.jcraft.jsch.agentproxy.connector.SSHAgentConnector.open (SSHAgentConnector.java:80)
    com.jcraft.jsch.agentproxy.connector.SSHAgentConnector.<init> (SSHAgentConnector.java:48)
    clj_ssh.agent$sock_agent_connector.invoke (agent.clj:15)
    clj_ssh.agent$connect.invoke (agent.clj:35)
    clj_ssh.ssh$ssh_agent.invoke (ssh.clj:148)
    jepsen.control$session.invoke (control.clj:193)
    clojure.lang.AFn.applyToHelper (AFn.java:154)
    clojure.lang.AFn.applyTo (AFn.java:144)
    clojure.core$apply.invoke (core.clj:630)
    clojure.core$with_bindings_STAR_.doInvoke (core.clj:1868)
    clojure.lang.RestFn.applyTo (RestFn.java:142)
    clojure.core$apply.invoke (core.clj:634)
    clojure.core$bound_fn_STAR_$fn__4439.doInvoke (core.clj:1890)
    clojure.lang.RestFn.applyTo (RestFn.java:137)
    clojure.core$apply.invoke (core.clj:630)
    jepsen.core$fcatch$wrapper__6696.doInvoke (core.clj:55)
    clojure.lang.RestFn.invoke (RestFn.java:408)
    clojure.core$pmap$fn__6744$fn__6745.invoke (core.clj:6729)
    clojure.core$binding_conveyor_fn$fn__4444.invoke (core.clj:1916)
    clojure.lang.AFn.call (AFn.java:18)
    java.util.concurrent.FutureTask.run (FutureTask.java:266)
    java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
    java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
    java.lang.Thread.run (Thread.java:745)
Ran 1 tests containing 1 assertions.
0 failures, 1 errors.
Tests failed.
alexey@instance-1:~/jepsen.hazelcast$

(I did call the test jepsen.hazelcast, but otherwise it's the same dummy test)

The system is as follows:

alexey@instance-1:~$ uname -a
Linux instance-1 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2 (2016-04-08) x86_64 GNU/Linux
alexey@instance-1:~$ dpkg -s libjna-java
Package: libjna-java
Status: install ok installed
Priority: optional
Section: java
Installed-Size: 233
Maintainer: Debian Java maintainers <[email protected]>
Architecture: all
Version: 4.1.0-1

Nemesis iptables rules issues

On :start, the nemesis should drop related and established connections as well as new ones. This would better simulate traffic isolation for systems whose connections are tracked.
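
As a rough illustration of the request, the sketch below assembles an iptables command that drops packets from a peer in all three connection-tracking states. drop-all-states-cmd is a hypothetical helper, not part of jepsen.net; it only builds the command string, and the use of the conntrack match on the INPUT chain is an assumption about how such a rule might look.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: returns the command as a string, runs nothing.
(defn drop-all-states-cmd
  "Builds an iptables rule that drops NEW, ESTABLISHED and RELATED
  packets arriving from src-ip."
  [src-ip]
  (str/join " " ["iptables" "-A" "INPUT"
                 "-s" src-ip
                 "-m" "conntrack"
                 "--ctstate" "NEW,ESTABLISHED,RELATED"
                 "-j" "DROP"]))

(drop-all-states-cmd "10.0.0.2")
;; => "iptables -A INPUT -s 10.0.0.2 -m conntrack --ctstate NEW,ESTABLISHED,RELATED -j DROP"
```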

Elasticsearch test - not working?

Hello,

On my Ubuntu 14.04 machine I have 5 LXC Ubuntu nodes with working SSH. When I run the sample elasticsearch test, that is:

   lein with-profile +elasticsearch test jepsen.system.elasticsearch-test

I get the error (in short):

INFO  jepsen.system.elasticsearch - :n4 elasticsearch nuked
INFO  jepsen.system.elasticsearch - :n3 elasticsearch nuked
INFO  jepsen.system.elasticsearch - :n1 elasticsearch nuked
INFO  jepsen.system.elasticsearch - :n2 elasticsearch nuked
INFO  jepsen.system.elasticsearch - :n5 elasticsearch nuked

lein test :only jepsen.system.elasticsearch-test/create-test

FAIL in (create-test) (elasticsearch_test.clj:88)
expected: (:valid? (:results test))
  actual: false
{:valid? false,
 :html {:valid? true},
 :set
 {:valid? false,
  :lost
  "#{654..655 657 661..663 666..668 670..679 681..698 701..703 706..708 711..713 716..719 721..723 726..728 731..732 734..735 837 839 841..842}",
  :recovered
  "#{132..133 136..143 145..146 148..155 157..204 207..208 215 233 257 280 302 326 349 396 419 807 815 820..821 829 838}",
  :ok
  "#{0..134 136..143 145..146 148..155 157..371 373..441 443..464 466..487 489..510 512..533 535..556 558..579 581..601 603..624 626..647 649..653 656 658..660 664..665 669 680 699..700 704..705 709..710 714..715 720 724..725 729..730 733 740..744 747 749 756 761 763 768 774..776 778 782..783 790..791 795..796 798 803 805..807 810..811 815 820..827 829 832..836 838 840 843..846 848..856 858..879 881..902 904..925 927..948 950..971 973..993 995..1010}",
  :recovered-frac 85/1011,
  :unexpected-frac 0,
  :unexpected "#{}",
  :lost-frac 64/1011,
  :ok-frac 866/1011}}

Ran 1 tests containing 1 assertions.
1 failures, 0 errors.
Tests failed.
Error encountered performing task 'test' with profile(s): 'base,system,user,provided,dev,elasticsearch'
Tests failed.

The full log is over 2K lines long, so I've put it on pastebin:
http://pastebin.com/KLfiLv7u

What's the expected output for this test? Am I missing something obvious...?
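
For reading results like these, a small sketch can help check the reported fractions against the printed sets. range-count is a hypothetical helper, not part of Jepsen; it assumes the compact notation used above, where a..b is an inclusive range and a bare number is a single element.

```clojure
;; Hypothetical helper for the compact set notation in the results map.
(defn range-count
  "Counts the integers described by a string such as \"#{1..3 7}\"."
  [s]
  (->> (re-seq #"(\d+)(?:\.\.(\d+))?" s)
       (map (fn [[_ lo hi]]
              (let [lo (Long/parseLong lo)
                    hi (if hi (Long/parseLong hi) lo)]
                ;; inclusive range: 841..842 contributes 2 elements
                (inc (- hi lo)))))
       (reduce + 0)))

(range-count "#{837 839 841..842}") ; => 4
```

Applied to the full :lost string above, it gives 64, matching the numerator of :lost-frac 64/1011. The non-empty :lost set is presumably what makes the set checker report :valid? false, so the failure reflects writes the checker considers lost rather than a setup problem.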

Trouble running etcd tests.

I get the following error:

~/jepsen/jepsen $ lein with-profile +etcd test jepsen.system.etcd-test
WARN ignoring checkouts directory knossos as it does not contain a project.clj file.

lein test jepsen.system.etcd-test

lein test :only jepsen.system.etcd-test/register-test

ERROR in (register-test) (AFn.java:429)
Uncaught exception, not in assertion.
expected: nil
  actual: clojure.lang.ArityException: Wrong number of args (1) passed to: generator/delay
 at clojure.lang.AFn.throwArity (AFn.java:429)
    clojure.lang.AFn.invoke (AFn.java:32)
    jepsen.system.etcd_test/fn (etcd_test.clj:31)

etcd_test.clj is calling (gen/delay 100), but the definition of delay in generator.clj seems to require two arguments.

Is the etcd test functional right now? If not, is there another test you'd recommend as a starting point for writing a new test?
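
The error can be reproduced in isolation. In this sketch, two-arg-delay is a hypothetical stand-in with the two-argument shape the report describes, not the real generator/delay; calling it with one argument raises the same exception type.

```clojure
;; Hypothetical stand-in: takes a delay and the generator it wraps.
(defn two-arg-delay
  [dt gen]
  {:delay dt :gen gen})

(defn arity-error-arg-count
  "Calls f with args. Returns the argument count reported by an
  ArityException, or nil when the call succeeds."
  [f & args]
  (try
    (apply f args)
    nil
    (catch clojure.lang.ArityException e
      (.-actual e))))

(arity-error-arg-count two-arg-delay 100)       ; => 1
(arity-error-arg-count two-arg-delay 100 :gen)  ; => nil
```

If the real delay does take a generator as its second argument, the call in etcd_test.clj would need to pass one; which generator belongs there is not something this report settles.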

Basic usage instructions

Is there a page somewhere explaining basic usage? I'd like to run the elasticsearch tests but when I use the syntax in the readme I get a FileNotFoundException:

Caused by: java.io.FileNotFoundException: Could not locate jepsen/tests__init.class or jepsen/tests.clj on classpath:

I'm not too familiar with Clojure though, so if this is a really basic error feel free to tell me to go pound sand.
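
The message follows from how Clojure maps a namespace name to files on the classpath: dots become directory separators and hyphens become underscores. The sketch below mirrors that mapping; ns->candidates is a hypothetical helper written for illustration, not a Clojure API.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: lists the two resources named in the error message.
(defn ns->candidates
  "Returns the classpath-relative files searched for a namespace name."
  [ns-str]
  (let [base (-> ns-str
                 (str/replace "-" "_")
                 (str/replace "." "/"))]
    [(str base "__init.class") (str base ".clj")]))

(ns->candidates "jepsen.tests")
;; => ["jepsen/tests__init.class" "jepsen/tests.clj"]
```

So the error suggests that no directory or jar on the classpath contains jepsen/tests.clj. Checking which directory lein is run from, and which source paths that project declares, may be a reasonable first step.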

jepsen.core-test succeeds but with many indeterminate warnings

I know that I asked the same question in #122, but this time it is Jepsen's own test suite, so it's a bit odd for the client to crash on every action in the test (which would mean that we can't test any property of the system).

I'm currently running the snapshot at 2e48c81, with some modifications to run on my setup (which should not affect Jepsen's behavior).
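
One detail that helps when reading the warnings in this report's log: with five workers, the indeterminate process ids go 1, 6, 11, 16, and 3 is followed by 8. That pattern is consistent with a crashed process being replaced by a fresh one whose id is the old id plus the worker count. The sketch below only reproduces the arithmetic; next-process is a hypothetical name, and the rule is inferred from the log rather than quoted from jepsen.core.

```clojure
;; Hypothetical helper illustrating the id pattern seen in the log.
(defn next-process
  "Returns the id a replacement process would get under the inferred rule."
  [concurrency process]
  (+ process concurrency))

(take 4 (iterate (partial next-process 5) 1)) ; => (1 6 11 16)
(next-process 5 3)                            ; => 8
```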

lein test jepsen.core-test
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (rm /tmp/jepsen-test)
INFO  jepsen.control - (rm /tmp/jepsen-test)
INFO  jepsen.control - (rm /tmp/jepsen-test)
INFO  jepsen.control - (rm /tmp/jepsen-test)
INFO  jepsen.control - (rm /tmp/jepsen-test)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (echo 2132535907 > /tmp/jepsen-test)
INFO  jepsen.control - (echo 2132535907 > /tmp/jepsen-test)
INFO  jepsen.control - (echo 2132535907 > /tmp/jepsen-test)
INFO  jepsen.control - (echo 2132535907 > /tmp/jepsen-test)
INFO  jepsen.control - (echo 2132535907 > /tmp/jepsen-test)
INFO  jepsen.control - (hostname)
INFO  jepsen.core - nemesis done
INFO  jepsen.core - Worker 4 starting
INFO  jepsen.core - Worker 4 done
INFO  jepsen.core - Worker 1 starting
INFO  jepsen.core - Worker 1 done
INFO  jepsen.core - Worker 2 starting
INFO  jepsen.core - Worker 2 done
INFO  jepsen.core - Worker 3 starting
INFO  jepsen.core - Worker 3 done
INFO  jepsen.core - Worker 0 starting
INFO  jepsen.core - Worker 0 done
INFO  jepsen.core - Waiting for nemesis to complete
INFO  jepsen.core - nemesis done.
INFO  jepsen.core - Tearing down nemesis
INFO  jepsen.core - Nemesis torn down
INFO  jepsen.core - Snarfing log files
INFO  jepsen.core - downloading /tmp/jepsen-test to jepsen-test
INFO  jepsen.core - downloading /tmp/jepsen-test to jepsen-test
INFO  jepsen.core - downloading /tmp/jepsen-test to jepsen-test
INFO  jepsen.core - downloading /tmp/jepsen-test to jepsen-test
INFO  jepsen.core - downloading /tmp/jepsen-test to jepsen-test
INFO  jepsen.core - Run complete, writing
INFO  jepsen.store - Wrote /root/jepsen/jepsen/store/ssh test/20160524T113828.000+0700/history.txt
INFO  jepsen.store - Wrote /root/jepsen/jepsen/store/ssh test/20160524T113828.000+0700/results.edn
INFO  jepsen.core - Analyzing
INFO  jepsen.core - Analysis complete
INFO  jepsen.store - Wrote /root/jepsen/jepsen/store/ssh test/20160524T113828.000+0700/history.txt
INFO  jepsen.store - Wrote /root/jepsen/jepsen/store/ssh test/20160524T113828.000+0700/results.edn
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (rm /tmp/jepsen-test)
INFO  jepsen.control - (rm /tmp/jepsen-test)
INFO  jepsen.control - (rm /tmp/jepsen-test)
INFO  jepsen.control - (rm /tmp/jepsen-test)
INFO  jepsen.control - (rm /tmp/jepsen-test)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.control - (hostname)
INFO  jepsen.core - Everything looks good! ヽ(‘ー`)ノ

{:valid? true, :configs ({:model {}, :pending []}), :final-paths ()}

INFO  jepsen.core - Worker 4 starting
INFO  jepsen.core - Worker 2 starting
INFO  jepsen.core - Worker 0 starting
INFO  jepsen.core - nemesis done
INFO  jepsen.core - Worker 3 starting
INFO  jepsen.core - Worker 1 starting
INFO  jepsen.util - 3   :invoke :write  0
INFO  jepsen.util - 3   :ok     :write  0
INFO  jepsen.util - 0   :invoke :write  2
INFO  jepsen.util - 0   :ok     :write  2
INFO  jepsen.util - 0   :invoke :cas    [1 1]
INFO  jepsen.util - 0   :fail   :cas    [1 1]
INFO  jepsen.util - 0   :invoke :cas    [4 0]
INFO  jepsen.util - 0   :fail   :cas    [4 0]
INFO  jepsen.util - 0   :invoke :write  3
INFO  jepsen.util - 0   :ok     :write  3
INFO  jepsen.util - 0   :invoke :read   nil
INFO  jepsen.util - 0   :ok     :read   3
INFO  jepsen.util - 2   :invoke :write  3
INFO  jepsen.core - Worker 0 done
INFO  jepsen.util - 2   :ok     :write  3
INFO  jepsen.core - Worker 2 done
INFO  jepsen.util - 4   :invoke :cas    [4 3]
INFO  jepsen.util - 4   :fail   :cas    [4 3]
INFO  jepsen.core - Worker 4 done
INFO  jepsen.util - 3   :invoke :cas    [3 1]
INFO  jepsen.util - 3   :ok     :cas    [3 1]
INFO  jepsen.core - Worker 3 done
INFO  jepsen.util - 1   :invoke :cas    [1 0]
INFO  jepsen.util - 1   :ok     :cas    [1 0]
INFO  jepsen.core - Worker 1 done
INFO  jepsen.core - Waiting for nemesis to complete
INFO  jepsen.core - nemesis done.
INFO  jepsen.core - Tearing down nemesis
INFO  jepsen.core - Nemesis torn down
INFO  jepsen.core - Run complete, writing
INFO  jepsen.store - Wrote /root/jepsen/jepsen/store/noop/20160524T113831.000+0700/results.edn
INFO  jepsen.store - Wrote /root/jepsen/jepsen/store/noop/20160524T113831.000+0700/history.txt
INFO  jepsen.core - Analyzing
INFO  jepsen.core - Analysis complete
INFO  jepsen.store - Wrote /root/jepsen/jepsen/store/noop/20160524T113831.000+0700/history.txt
INFO  jepsen.store - Wrote /root/jepsen/jepsen/store/noop/20160524T113831.000+0700/results.edn
INFO  jepsen.core - Everything looks good! ヽ(‘ー`)ノ

{:valid? true,
 :configs ({:model {:value 0}, :pending []}),
 :final-paths ()}

INFO  jepsen.core - nemesis done
INFO  jepsen.core - Worker 1 starting
INFO  jepsen.core - Worker 4 starting
INFO  jepsen.util - 1   :invoke :dequeue        nil
INFO  jepsen.util - 4   :invoke :dequeue        nil
INFO  jepsen.core - Worker 2 starting
INFO  jepsen.util - 2   :invoke :dequeue        nil
INFO  jepsen.core - Worker 3 starting
WARN  jepsen.core - Process 1 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 6   :invoke :dequeue        nil
WARN  jepsen.core - Process 6 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 11  :invoke :dequeue        nil
WARN  jepsen.core - Process 11 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 16  :invoke :enqueue        1
INFO  jepsen.core - Worker 0 starting
WARN  jepsen.core - Process 16 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 3   :invoke :enqueue        0
INFO  jepsen.util - 21  :invoke :enqueue        3
WARN  jepsen.core - Process 3 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 8   :invoke :enqueue        4
WARN  jepsen.core - Process 8 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 13  :invoke :dequeue        nil
WARN  jepsen.core - Process 13 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 18  :invoke :dequeue        nil
WARN  jepsen.core - Process 18 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 23  :invoke :dequeue        nil
WARN  jepsen.core - Process 23 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 28  :invoke :enqueue        5
WARN  jepsen.core - Process 28 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 33  :invoke :dequeue        nil
INFO  jepsen.util - 0   :invoke :enqueue        2
WARN  jepsen.core - Process 0 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 5   :invoke :dequeue        nil
WARN  jepsen.core - Process 5 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 10  :invoke :dequeue        nil
WARN  jepsen.core - Process 10 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 15  :invoke :dequeue        nil
WARN  jepsen.core - Process 15 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 20  :invoke :enqueue        6
WARN  jepsen.core - Process 20 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 25  :invoke :dequeue        nil
WARN  jepsen.core - Process 25 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
WARN  jepsen.core - Process 2 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 7   :invoke :dequeue        nil
WARN  jepsen.core - Process 7 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 12  :invoke :enqueue        7
WARN  jepsen.core - Process 12 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 17  :invoke :dequeue        nil
WARN  jepsen.core - Process 17 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 22  :invoke :dequeue        nil
WARN  jepsen.core - Process 22 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 27  :invoke :enqueue        8
WARN  jepsen.core - Process 27 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 32  :invoke :dequeue        nil
WARN  jepsen.core - Process 32 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 37  :invoke :dequeue        nil
WARN  jepsen.core - Process 37 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 42  :invoke :enqueue        9
WARN  jepsen.core - Process 42 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 47  :invoke :enqueue        10
WARN  jepsen.core - Process 47 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.util - 52  :invoke :enqueue        11
WARN  jepsen.core - Process 52 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
WARN  jepsen.core - Process 33 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
WARN  jepsen.core - Process 4 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.core - Worker 4 done
WARN  jepsen.core - Process 21 indeterminate
java.lang.AssertionError: Assert failed: false
        at jepsen.core_test$fn$reify__7379.invoke_BANG_(core_test.clj:96)
        at jepsen.core$worker$fn__6900$fn__6901.invoke(core.clj:142)
        at jepsen.core$worker$fn__6900.invoke(core.clj:140)
        at clojure.core$binding_conveyor_fn$fn__4676.invoke(core.clj:1938)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  jepsen.core - Worker 3 done
INFO  jepsen.core - Worker 2 done
INFO  jepsen.core - Worker 0 done
INFO  jepsen.core - Worker 1 done
INFO  jepsen.core - Waiting for nemesis to complete
INFO  jepsen.core - nemesis done.
INFO  jepsen.core - Tearing down nemesis
INFO  jepsen.core - Nemesis torn down
INFO  jepsen.core - Run complete, writing
INFO  jepsen.store - Wrote /root/jepsen/jepsen/store/noop/20160524T113832.000+0700/results.edn
INFO  jepsen.store - Wrote /root/jepsen/jepsen/store/noop/20160524T113832.000+0700/history.txt
INFO  jepsen.core - Analyzing
INFO  jepsen.core - Analysis complete
INFO  jepsen.store - Wrote /root/jepsen/jepsen/store/noop/20160524T113832.000+0700/results.edn
INFO  jepsen.store - Wrote /root/jepsen/jepsen/store/noop/20160524T113832.000+0700/history.txt
INFO  jepsen.core - Everything looks good! ヽ(‘ー`)ノ

{:valid? true}

Is it normal to see many exceptions logged as WARN in the ZooKeeper test, and a high cost to analyze the history?

When running the ZooKeeper test, I keep getting a bunch of warning logs. Some examples:

INFO  jepsen.util - 170 :invoke :cas    [4 0]
WARN  jepsen.core - Process 170 indeterminate
java.lang.IllegalMonitorStateException
INFO  jepsen.util - :nemesis    :info   :start  "Cut off {:n5 #{:n3 :n4 :n2}, :n1 #{:n3 :n4 :n2}, :n3 #{:n5 :n1}, :n4 #{:n5 :n1}, :n2 #{:n5 :n1}}"
INFO  jepsen.util - 144 :invoke :cas    [4 0]
INFO  jepsen.util - 172 :invoke :cas    [2 3]
WARN  jepsen.core - Process 172 indeterminate
java.lang.IllegalMonitorStateException
INFO  jepsen.control - (iptables -A INPUT -s 192.168.99.222 -j DROP -w)
INFO  jepsen.util - 211 :invoke :read   nil
WARN  jepsen.core - Process 211 indeterminate
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /jepsen/data
INFO  jepsen.core - Worker 1 done
INFO  org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 3335ms for sessionid 0x454cc6ebfda0000, closing socket connection and attempting reconnect
WARN  jepsen.core - Process 144 indeterminate
java.lang.IllegalMonitorStateException

Is this normal?

Also, the cost of analyzing the history seems to be pretty high:

INFO  jepsen.core - Analyzing
INFO  knossos.linear - :space 2720 :cost 2.92E+24 :op {:type :ok, :f :read, :value 4, :process 32, :time 17632658846, :index 83}

Is the high cost normal?
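
For context on what the "Process N indeterminate" lines mean, here is a minimal sketch, not Jepsen's actual source. It assumes that when a client throws, the worker cannot tell whether the operation took effect, records the completion as :info, and carries on under a new process id offset by the test's concurrency (the noop log above shows ids going 6, 11, 16 with five workers). The names apply-op, next-process and flaky-client are hypothetical and exist only for illustration.

```clojure
(defn apply-op
  "Applies op using invoke-fn. If invoke-fn throws, we cannot know whether
  the operation took effect, so the completion is recorded as :info
  (indeterminate) rather than :ok or :fail."
  [invoke-fn op]
  (try
    (invoke-fn op)
    (catch Throwable t
      (assoc op :type :info :error (str t)))))

(defn next-process
  "After an indeterminate op the old process id is retired. The worker
  thread continues under a fresh id, offset by the test's concurrency."
  [process concurrency]
  (+ process concurrency))

;; Hypothetical client that always fails, standing in for a session
;; that expired during a partition.
(defn flaky-client [op]
  (throw (IllegalStateException. "connection lost")))

(def completed
  (apply-op flaky-client
            {:type :invoke, :f :cas, :value [4 0], :process 170}))
```

Under that assumption, warnings like these are expected whenever the nemesis cuts clients off from the cluster. Each indeterminate operation also stays pending for the rest of the history, which probably contributes to the large :space and :cost figures that Knossos reports.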

Using quorum flag for Etcd still fails for linearizable test

Hi @aphyr

I know that etcd has a quorum flag for linearizable reads, so I tried testing it with the following change:

:read  (let [value (-> client
                       (v/get k {:consistent? true :quorum? true})
                       (json/parse-string true))]
         (assoc op :type :ok :value value))

I also upgraded to jepsen 0.0.7 and clojure 1.7.0 in project.clj.

To my surprise, the test fails. I read the etcd source and found that if the quorum flag is set, etcd commits a read-only entry to the log and performs the query only after a majority of followers have replicated it, so I think this mechanism is linearizable. Am I wrong, or is my test wrong?

Below is the failed test output:

{:html {:valid? true},
 :linear
 {:valid? false,
  :configs
  ({:model {:value 1},
    :pending
    [{:type :invoke,
      :f :cas,
      :value [4 2],
      :process 3,
      :time 6831014019,
      :index 58}
     {:type :invoke,
      :f :write,
      :value 3,
      :process 6,
      :time 8839541714,
      :index 74}
     {:type :invoke,
      :f :read,
      :process 8,
      :time 8838654383,
      :value 2,
      :index 75}]}
   {:model {:value 1},
    :pending
    [{:type :invoke,
      :f :write,
      :value 0,
      :process 1,
      :time 6830095656,
      :index 57}
     {:type :invoke,
      :f :cas,
      :value [4 2],
      :process 3,
      :time 6831014019,
      :index 58}
     {:type :invoke,
      :f :write,
      :value 3,
      :process 6,
      :time 8839541714,
      :index 74}
     {:type :invoke,
      :f :read,
      :process 8,
      :time 8838654383,
      :value 2,
      :index 75}]}),
  :final-paths
  ([{:op
     {:type :ok,
      :f :read,
      :process 0,
      :time 8833496421,
      :value 1,
      :index 73},
     :model {:value 1}}
    {:op
     {:type :invoke,
      :f :write,
      :value 0,
      :process 1,
      :time 6830095656,
      :index 57},
     :model {:value 0}}
    {:op
     {:type :invoke,
      :f :write,
      :value 3,
      :process 6,
      :time 8839541714,
      :index 74},
     :model {:value 3}}
    {:op
     {:type :invoke,
      :f :cas,
      :value [4 2],
      :process 3,
      :time 6831014019,
      :index 58},
     :model {:msg "can't CAS 3 from 4 to 2"}}]
   [{:op
     {:type :ok,
      :f :read,
      :process 0,
      :time 8833496421,
      :value 1,
      :index 73},
     :model {:value 1}}
    {:op
     {:type :invoke,
      :f :cas,
      :value [4 2],
      :process 3,
      :time 6831014019,
      :index 58},
     :model {:msg "can't CAS 1 from 4 to 2"}}]
   [{:op
     {:type :ok,
      :f :read,
      :process 0,
      :time 8833496421,
      :value 1,
      :index 73},
     :model {:value 1}}
    {:op
     {:type :invoke,
      :f :write,
      :value 3,
      :process 6,
      :time 8839541714,
      :index 74},
     :model {:value 3}}
    {:op
     {:type :invoke,
      :f :write,
      :value 0,
      :process 1,
      :time 6830095656,
      :index 57},
     :model {:value 0}}
    {:op
     {:type :ok,
      :f :read,
      :process 8,
      :time 8852473707,
      :value 2,
      :index 76},
     :model {:msg "can't read 2 from register 0"}}]
   [{:op
     {:type :ok,
      :f :read,
      :process 0,
      :time 8833496421,
      :value 1,
      :index 73},
     :model {:value 1}}
    {:op
     {:type :invoke,
      :f :write,
      :value 0,
      :process 1,
      :time 6830095656,
      :index 57},
     :model {:value 0}}
    {:op
     {:type :invoke,
      :f :write,
      :value 3,
      :process 6,
      :time 8839541714,
      :index 74},
     :model {:value 3}}
    {:op
     {:type :ok,
      :f :read,
      :process 8,
      :time 8852473707,
      :value 2,
      :index 76},
     :model {:msg "can't read 2 from register 3"}}]
   [{:op
     {:type :ok,
      :f :read,
      :process 0,
      :time 8833496421,
      :value 1,
      :index 73},
     :model {:value 1}}
    {:op
     {:type :invoke,
      :f :write,
      :value 3,
      :process 6,
      :time 8839541714,
      :index 74},
     :model {:value 3}}
    {:op
     {:type :invoke,
      :f :cas,
      :value [4 2],
      :process 3,
      :time 6831014019,
      :index 58},
     :model {:msg "can't CAS 3 from 4 to 2"}}]
   [{:op
     {:type :ok,
      :f :read,
      :process 0,
      :time 8833496421,
      :value 1,
      :index 73},
     :model {:value 1}}
    {:op
     {:type :invoke,
      :f :write,
      :value 3,
      :process 6,
      :time 8839541714,
      :index 74},
     :model {:value 3}}
    {:op
     {:type :invoke,
      :f :write,
      :value 0,
      :process 1,
      :time 6830095656,
      :index 57},
     :model {:value 0}}
    {:op
     {:type :invoke,
      :f :cas,
      :value [4 2],
      :process 3,
      :time 6831014019,
      :index 58},
     :model {:msg "can't CAS 0 from 4 to 2"}}]
   [{:op
     {:type :ok,
      :f :read,
      :process 0,
      :time 8833496421,
      :value 1,
      :index 73},
     :model {:value 1}}
    {:op
     {:type :invoke,
      :f :write,
      :value 3,
      :process 6,
      :time 8839541714,
      :index 74},
     :model {:value 3}}
    {:op
     {:type :ok,
      :f :read,
      :process 8,
      :time 8852473707,
      :value 2,
      :index 76},
     :model {:msg "can't read 2 from register 3"}}]
   [{:op
     {:type :ok,
      :f :read,
      :process 0,
      :time 8833496421,
      :value 1,
      :index 73},
     :model {:value 1}}
    {:op
     {:type :invoke,
      :f :write,
      :value 0,
      :process 1,
      :time 6830095656,
      :index 57},
     :model {:value 0}}
    {:op
     {:type :ok,
      :f :read,
      :process 8,
      :time 8852473707,
      :value 2,
      :index 76},
     :model {:msg "can't read 2 from register 0"}}]
   [{:op
     {:type :ok,
      :f :read,
      :process 0,
      :time 8833496421,
      :value 1,
      :index 73},
     :model {:value 1}}
    {:op
     {:type :ok,
      :f :read,
      :process 8,
      :time 8852473707,
      :value 2,
      :index 76},
     :model {:msg "can't read 2 from register 1"}}]
   [{:op
     {:type :ok,
      :f :read,
      :process 0,
      :time 8833496421,
      :value 1,
      :index 73},
     :model {:value 1}}
    {:op
     {:type :invoke,
      :f :write,
      :value 0,
      :process 1,
      :time 6830095656,
      :index 57},
     :model {:value 0}}
    {:op
     {:type :invoke,
      :f :cas,
      :value [4 2],
      :process 3,
      :time 6831014019,
      :index 58},
     :model {:msg "can't CAS 0 from 4 to 2"}}]),
  :previous-ok
  {:type :ok,
   :f :read,
   :process 0,
   :time 8833496421,
   :value 1,
   :index 73},
  :op
  {:type :ok,
   :f :read,
   :process 8,
   :time 8852473707,
   :value 2,
   :index 76}},
 :valid? false}


lein test :only jepsen.etcd-test/register-test

FAIL in (register-test) (etcd_test.clj:46)
expected: (:valid? (:results test))
  actual: false
Not linearizable. Linearizable prefix was:

Followed by inconsistent operation:
        nil nil

-------------------------------------------------------------
Just prior to that operation, possible interpretations of the
linearizable prefix were:

lein test :only jepsen.etcd-test/register-test

ERROR in (register-test) (ArrayList.java:177)
Uncaught exception, not in assertion.
expected: nil
  actual: java.lang.NullPointerException: null
 at java.util.ArrayList.<init> (ArrayList.java:177)
    clojure.core$shuffle.invoke (core.clj:6981)
    jepsen.report$linearizability.invoke (report.clj:36)
    jepsen.etcd_test/fn (etcd_test.clj:47)
    clojure.test$test_var$fn__7670.invoke (test.clj:704)
    clojure.test$test_var.invoke (test.clj:704)
    clojure.test$test_vars$fn__7692$fn__7697.invoke (test.clj:722)
    clojure.test$default_fixture.invoke (test.clj:674)
    clojure.test$test_vars$fn__7692.invoke (test.clj:722)
    clojure.test$default_fixture.invoke (test.clj:674)
    clojure.test$test_vars.invoke (test.clj:718)
    clojure.test$test_all_vars.invoke (test.clj:728)
    clojure.test$test_ns.invoke (test.clj:747)
    clojure.core$map$fn__4553.invoke (core.clj:2624)
    clojure.lang.LazySeq.sval (LazySeq.java:40)
    clojure.lang.LazySeq.seq (LazySeq.java:49)
    clojure.lang.Cons.next (Cons.java:39)
    clojure.lang.RT.boundedLength (RT.java:1735)
    clojure.lang.RestFn.applyTo (RestFn.java:130)
    clojure.core$apply.invoke (core.clj:632)
    clojure.test$run_tests.doInvoke (test.clj:762)
    clojure.lang.RestFn.applyTo (RestFn.java:137)
    clojure.core$apply.invoke (core.clj:630)
    user$eval85$fn__144$fn__175.invoke (form-init5183894627931871576.clj:1)
    user$eval85$fn__144$fn__145.invoke (form-init5183894627931871576.clj:1)
    user$eval85$fn__144.invoke (form-init5183894627931871576.clj:1)
    user$eval85.invoke (form-init5183894627931871576.clj:1)
    clojure.lang.Compiler.eval (Compiler.java:6782)
    clojure.lang.Compiler.eval (Compiler.java:6772)
    clojure.lang.Compiler.load (Compiler.java:7227)
    clojure.lang.Compiler.loadFile (Compiler.java:7165)
    clojure.main$load_script.invoke (main.clj:275)
    clojure.main$init_opt.invoke (main.clj:280)
    clojure.main$initialize.invoke (main.clj:308)
    clojure.main$null_opt.invoke (main.clj:343)
    clojure.main$main.doInvoke (main.clj:421)
    clojure.lang.RestFn.invoke (RestFn.java:421)
    clojure.lang.Var.invoke (Var.java:383)
    clojure.lang.AFn.applyToHelper (AFn.java:156)
    clojure.lang.Var.applyTo (Var.java:700)
    clojure.main.main (main.java:37)

Ran 1 tests containing 2 assertions.
1 failures, 1 errors.
Tests failed.

Thank you.

Nemesis crashes on concurrent iptables operations

I'm experiencing fairly frequent nemesis crashes on the partition! call in a test I'm working on, which AFAICT is being caused by concurrent iptables operations:

18:00:15.321 WARN  [jepsen nemesis]: jepsen.core - Nemesis crashed evaluating {:time 79828671383, :process :nemesis, :type :info, :f :start}
java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?


  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
  at clojure.core$deref_future.invoke(core.clj:2180)
  at clojure.core$future_call$reify__6320.deref(core.clj:6420)
  at clojure.core$deref.invoke(core.clj:2200)
  at clojure.core$map$fn__4245.invoke(core.clj:2559)
  at clojure.lang.LazySeq.sval(LazySeq.java:40)
  at clojure.lang.LazySeq.seq(LazySeq.java:49)
  at clojure.lang.Cons.next(Cons.java:39)
  at clojure.lang.RT.next(RT.java:598)
  at clojure.core$next.invoke(core.clj:64)
  at clojure.core$dorun.invoke(core.clj:2856)
  at jepsen.nemesis$partition_BANG_.invoke(nemesis.clj:28)
  at jepsen.nemesis$partitioner$reify__3494.invoke_BANG_(nemesis.clj:85)
  at jepsen.core$nemesis_worker$fn__3163$fn__3168.invoke(core.clj:192)
  at jepsen.core$nemesis_worker$fn__3163.invoke(core.clj:190)
  at clojure.core$binding_conveyor_fn$fn__4145.invoke(core.clj:1910)
  at clojure.lang.AFn.call(AFn.java:18)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)

I'm not sure what the best approach for avoiding this is, but wanted to at least file it and see what you thought...
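
The error text itself points at the iptables -w option, which makes iptables wait for the xtables lock instead of exiting. Where changing the command line is not practical, retrying the failed call is another possible workaround. The following is a minimal sketch, not Jepsen code: `xtables-busy?` and `retry-on-xtables-lock` are hypothetical helper names, and the match on the message text assumes the wording shown in the log above.

```clojure
;; Hypothetical helpers for illustration; neither is part of Jepsen.

(defn xtables-busy?
  "True if e, or any exception in its cause chain, mentions the xtables lock.
  Assumption: the message wording matches the log in this report."
  [^Throwable e]
  (boolean
    (some #(re-find #"xtables lock" (str (.getMessage ^Throwable %)))
          (take-while some? (iterate #(.getCause ^Throwable %) e)))))

(defn retry-on-xtables-lock
  "Calls (f) up to `attempts` times, sleeping delay-ms between tries.
  Retries only when the failure looks like xtables lock contention;
  any other exception, or the final failure, is rethrown."
  [attempts delay-ms f]
  (loop [n attempts]
    (let [result (try
                   {:value (f)}
                   (catch Exception e
                     (if (and (> n 1) (xtables-busy? e))
                       ::retry
                       (throw e))))]
      (if (= ::retry result)
        (do (Thread/sleep (long delay-ms))
            (recur (dec n)))
        (:value result)))))
```

A caller would wrap whatever function performs the partition, for example `(retry-on-xtables-lock 5 200 #(do-partition))`, where `do-partition` stands in for the real call. Retrying only masks the contention; it does not explain which process holds the lock.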

jcraft Auth failure in set_app.clj

Hey,

I've got 5 ubuntu VMs running on a server, each configured as follows:

  • username/password: ubuntu
  • hostnames n1, n2, n3, n4, n5
  • passwordless sudo privileges
  • ~/.ssh/authorized_keys contains the public key I'm using for these tests
  • /etc/hosts pointing to each other's IP addresses

I've successfully run salticid base.setup and salticid <APP>.setup for three values of <APP>: postgres, zk, and mongo.

On the guest machine, I've tried running:

lein run <APP> for pg, zk, and mongo.

All three of them eventually fail on line 126 of set_app.clj. Here's the stacktrace:

Caused by: com.jcraft.jsch.JSchException: Auth fail
        at com.jcraft.jsch.Session.connect(Session.java:512)
        at com.jcraft.jsch.Session.connect(Session.java:183)
        at clj_ssh.ssh$connect.invoke(ssh.clj:327)
        at jepsen.set_app$run$fn__2074$fn__2075.invoke(set_app.clj:126)
        at clojure.core$binding_conveyor_fn$fn__4107.invoke(core.clj:1836)
        at clojure.lang.AFn.call(AFn.java:18)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:701)

Any idea what might be going wrong?

Thanks!

test is stuck at (jepsen/synchronize ) due to not configuring all the nodes.

Hi,
I am using the Jepsen framework inside a Docker container. There is one Docker container in which I set up the Jepsen testing framework; inside it, I create further Docker containers according to the number of nodes the test requires. All my tests intermittently get stuck in the following way:

INFO jepsen.os.debian - :n2 setting up debian
INFO jepsen.os.debian - :n1 setting up debian
INFO jepsen.os.debian - :n3 setting up debian
INFO jepsen.os.debian - Installing #{rsyslog}
INFO jepsen.os.debian - Installing #{rsyslog}
INFO jepsen.os.debian - Installing #{rsyslog}
INFO tacc.core - :n2 not tearing down system
INFO tacc.core - :n3 not tearing down system
INFO tacc.core - :n1 not tearing down system
INFO tacc.core - :n1 tore down
INFO tacc.core - :n2 tore down
INFO tacc.core - :n3 tore down
INFO jepsen.os.debian - setting up :optumsoft apt repo
INFO jepsen.os.debian - setting up :optumsoft apt repo
INFO jepsen.os.debian - setting up :optumsoft apt repo
INFO jepsen.os.debian - Installing #{}
INFO jepsen.os.debian - Installing #{}
INFO jepsen.os.debian - Installing #{}
INFO tacc.core - :n1 done install
INFO tacc.core - :n1 {scheduler.json [:n1]}
INFO tacc.core - inside :n1 scheduler.json
INFO tacc.core - :n1 done configure
INFO tacc.core - :n1 system set up

Basically, the package installation step does not start on one of the nodes, and the test gets stuck at (jepsen/synchronize ).
Is there anything I should take care of in my test?

Thanks

License please

This work is amazing @aphyr ! Please add a license to the project (open source maybe?)

rabbit-test JSchException: Auth fail

lein test :only jepsen.rabbitmq-test/rabbit-test

ERROR in (rabbit-test) (Session.java:512)
Uncaught exception, not in assertion.
expected: nil
actual: com.jcraft.jsch.JSchException: Auth fail
at com.jcraft.jsch.Session.connect (Session.java:512)
com.jcraft.jsch.Session.connect (Session.java:183)
clj_ssh.ssh$eval5931$fn__5938.invoke (ssh.clj:118)
clj_ssh.ssh.protocols$eval5857$fn__5880$G__5848__5889.invoke (protocols.clj:4)
clj_ssh.ssh$connect.invoke (ssh.clj:401)
jepsen.control$session.invoke (control.clj:197)
clojure.lang.AFn.applyToHelper (AFn.java:154)
clojure.lang.AFn.applyTo (AFn.java:144)
clojure.core$apply.invoke (core.clj:624)
clojure.core$with_bindings_STAR_.doInvoke (core.clj:1862)
clojure.lang.RestFn.applyTo (RestFn.java:142)
clojure.core$apply.invoke (core.clj:628)
clojure.core$bound_fn_STAR_$fn__4140.doInvoke (core.clj:1884)
clojure.lang.RestFn.applyTo (RestFn.java:137)
clojure.core$apply.invoke (core.clj:624)
jepsen.core$fcatch$wrapper__7445.doInvoke (core.clj:53)
clojure.lang.RestFn.invoke (RestFn.java:408)
clojure.core$pmap$fn__6328$fn__6329.invoke (core.clj:6466)
clojure.core$binding_conveyor_fn$fn__4145.invoke (core.clj:1910)
clojure.lang.AFn.call (AFn.java:18)
java.util.concurrent.FutureTask.run (FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
java.lang.Thread.run (Thread.java:745)

Note:
The ssh test passes though (lein test :only jepsen.core-test/ssh-test)

How to run test case with 6 machines - one control node and 5 db nodes?

I set up 6 Ubuntu 14.04 VMs: 1 control node and 5 to-be-db-nodes. I have set up password-less SSH and made sure that the known_hosts file stores the host names/IP addresses in plain text instead of hashed. The 6 VMs are behind a firewall, so I have the http_proxy and https_proxy environment variables set.

From my understanding, the LXC setup is for running tests on a single host, by spawning VMs on that host. I already have 6 separate VMs set up (equivalent to having 6 separate machines), so this step should not be necessary.

However, when I run lein test on aerospike (following the docs), I keep getting this error:

ERROR in (cas-register) (Util.java:349)
Uncaught exception, not in assertion.
expected: nil
  actual: com.jcraft.jsch.JSchException: java.net.UnknownHostException: n1
 at com.jcraft.jsch.Util.createSocket (Util.java:349)
    com.jcraft.jsch.Session.connect (Session.java:215)
    com.jcraft.jsch.Session.connect (Session.java:183)
    clj_ssh.ssh$eval5935$fn__5942.invoke (ssh.clj:118)
    clj_ssh.ssh.protocols$eval5861$fn__5884$G__5852__5893.invoke (protocols.clj:4)
    clj_ssh.ssh$connect.invoke (ssh.clj:401)
    jepsen.control$session.invoke (control.clj:197)
    clojure.lang.AFn.applyToHelper (AFn.java:154)
    clojure.lang.AFn.applyTo (AFn.java:144)
    clojure.core$apply.invoke (core.clj:624)
    clojure.core$with_bindings_STAR_.doInvoke (core.clj:1862)
    clojure.lang.RestFn.applyTo (RestFn.java:142)
    clojure.core$apply.invoke (core.clj:628)
    clojure.core$bound_fn_STAR_$fn__4140.doInvoke (core.clj:1884)
    clojure.lang.RestFn.applyTo (RestFn.java:137)
    clojure.core$apply.invoke (core.clj:624)
    jepsen.core$fcatch$wrapper__7449.doInvoke (core.clj:53)
    clojure.lang.RestFn.invoke (RestFn.java:408)
    clojure.core$pmap$fn__6328$fn__6329.invoke (core.clj:6466)
    clojure.core$binding_conveyor_fn$fn__4145.invoke (core.clj:1910)
    clojure.lang.AFn.call (AFn.java:18)
    java.util.concurrent.FutureTask.run (FutureTask.java:266)
    java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
    java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
    java.lang.Thread.run (Thread.java:745)
Caused by: java.net.UnknownHostException: n1
 at java.net.AbstractPlainSocketImpl.connect (AbstractPlainSocketImpl.java:184)
    java.net.SocksSocketImpl.connect (SocksSocketImpl.java:392)
    java.net.Socket.connect (Socket.java:589)
    java.net.Socket.connect (Socket.java:538)
    java.net.Socket.<init> (Socket.java:434)
    java.net.Socket.<init> (Socket.java:211)
    com.jcraft.jsch.Util.createSocket (Util.java:343)
    com.jcraft.jsch.Session.connect (Session.java:215)
    com.jcraft.jsch.Session.connect (Session.java:183)
    clj_ssh.ssh$eval5935$fn__5942.invoke (ssh.clj:118)
    clj_ssh.ssh.protocols$eval5861$fn__5884$G__5852__5893.invoke (protocols.clj:4)
    clj_ssh.ssh$connect.invoke (ssh.clj:401)
    jepsen.control$session.invoke (control.clj:197)
    clojure.lang.AFn.applyToHelper (AFn.java:154)
    clojure.lang.AFn.applyTo (AFn.java:144)
    clojure.core$apply.invoke (core.clj:624)
    clojure.core$with_bindings_STAR_.doInvoke (core.clj:1862)
    clojure.lang.RestFn.applyTo (RestFn.java:142)
    clojure.core$apply.invoke (core.clj:628)
    clojure.core$bound_fn_STAR_$fn__4140.doInvoke (core.clj:1884)
    clojure.lang.RestFn.applyTo (RestFn.java:137)
    clojure.core$apply.invoke (core.clj:624)
    jepsen.core$fcatch$wrapper__7449.doInvoke (core.clj:53)
    clojure.lang.RestFn.invoke (RestFn.java:408)
    clojure.core$pmap$fn__6328$fn__6329.invoke (core.clj:6466)
    clojure.core$binding_conveyor_fn$fn__4145.invoke (core.clj:1910)
    clojure.lang.AFn.call (AFn.java:18)
    java.util.concurrent.FutureTask.run (FutureTask.java:266)
    java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
    java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
    java.lang.Thread.run (Thread.java:745)

I have also tried changing hosts-map in jepsen/src/jepsen/control/net.clj, but to no avail.

What should I do to get Jepsen to run on this setup with separate machines?
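
The root cause in the trace is java.net.UnknownHostException: n1, meaning the JVM on the control node could not resolve the name n1. One way to check this from a REPL on the control node is sketched below. `resolvable?` and `unresolved-nodes` are hypothetical helpers, not part of Jepsen, and the node names n1 through n5 are assumed from the reports on this page.

```clojure
;; Hypothetical diagnostic helpers; not part of Jepsen.

(defn resolvable?
  "True if this JVM can resolve host to an address."
  [^String host]
  (try
    (some? (java.net.InetAddress/getByName host))
    (catch java.net.UnknownHostException _
      false)))

(defn unresolved-nodes
  "Returns a vector of the nodes that fail the check.
  The check defaults to resolvable?; passing another predicate
  makes the function easy to exercise without a network."
  ([nodes] (unresolved-nodes resolvable? nodes))
  ([check nodes] (vec (remove check nodes))))

;; Usage at a REPL on the control node (node names are an assumption):
;; (unresolved-nodes ["n1" "n2" "n3" "n4" "n5"])
```

If any names come back, adding entries for them to /etc/hosts on the control node is one thing to try.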

JSchException: Packet corrupt

I see this error intermittently running the chronos test:

ERROR in (install-test) (FutureTask.java:122)
expected: (:valid? (:results (run! (simple-test "0.23.0-1.0.debian81" "2.4.0-0.1.20150828104228.debian81"))))
  actual: java.util.concurrent.ExecutionException: com.jcraft.jsch.JSchException: Packet corrupt
 at java.util.concurrent.FutureTask.report (FutureTask.java:122)
    java.util.concurrent.FutureTask.get (FutureTask.java:192)
    clojure.core$deref_future.invoke (core.clj:2180)
    clojure.core$future_call$reify__6320.deref (core.clj:6420)
    clojure.core$deref.invoke (core.clj:2200)
    clojure.core$map$fn__4245.invoke (core.clj:2559)
    clojure.lang.LazySeq.sval (LazySeq.java:40)
    clojure.lang.LazySeq.seq (LazySeq.java:49)
    clojure.lang.Cons.next (Cons.java:39)
    clojure.lang.RT.next (RT.java:598)
    clojure.core$next.invoke (core.clj:64)
    clojure.core$dorun.invoke (core.clj:2856)
    jepsen.core$on_nodes.invoke (core.clj:86)
    jepsen.core$run_BANG_$fn__7420.invoke (core.clj:392)
    jepsen.core$run_BANG_.invoke (core.clj:362)
    jepsen.chronos_test/fn (chronos_test.clj:7)
    clojure.test$test_var$fn__7187.invoke (test.clj:704)
    clojure.test$test_var.invoke (test.clj:704)
    clojure.test$test_vars$fn__7209$fn__7214.invoke (test.clj:722)
    clojure.test$default_fixture.invoke (test.clj:674)
    clojure.test$test_vars$fn__7209.invoke (test.clj:722)
    clojure.test$default_fixture.invoke (test.clj:674)
    clojure.test$test_vars.invoke (test.clj:718)
    clojure.test$test_all_vars.invoke (test.clj:728)
    clojure.test$test_ns.invoke (test.clj:747)
    clojure.core$map$fn__4245.invoke (core.clj:2559)
    clojure.lang.LazySeq.sval (LazySeq.java:40)
    clojure.lang.LazySeq.seq (LazySeq.java:49)
    clojure.lang.Cons.next (Cons.java:39)
    clojure.lang.RT.boundedLength (RT.java:1654)
    clojure.lang.RestFn.applyTo (RestFn.java:130)
    clojure.core$apply.invoke (core.clj:626)
    clojure.test$run_tests.doInvoke (test.clj:762)
    clojure.lang.RestFn.applyTo (RestFn.java:137)
    clojure.core$apply.invoke (core.clj:624)
    user$eval85$fn__140$fn__171.invoke (form-init5900053764853348907.clj:1)
    user$eval85$fn__140$fn__141.invoke (form-init5900053764853348907.clj:1)
    user$eval85$fn__140.invoke (form-init5900053764853348907.clj:1)
    user$eval85.invoke (form-init5900053764853348907.clj:1)
    clojure.lang.Compiler.eval (Compiler.java:6703)
    clojure.lang.Compiler.eval (Compiler.java:6693)
    clojure.lang.Compiler.load (Compiler.java:7130)
    clojure.lang.Compiler.loadFile (Compiler.java:7086)
    clojure.main$load_script.invoke (main.clj:274)
    clojure.main$init_opt.invoke (main.clj:279)
    clojure.main$initialize.invoke (main.clj:307)
    clojure.main$null_opt.invoke (main.clj:342)
    clojure.main$main.doInvoke (main.clj:420)
    clojure.lang.RestFn.invoke (RestFn.java:421)
    clojure.lang.Var.invoke (Var.java:383)
    clojure.lang.AFn.applyToHelper (AFn.java:156)
    clojure.lang.Var.applyTo (Var.java:700)
    clojure.main.main (main.java:37)
Caused by: com.jcraft.jsch.JSchException: Packet corrupt
 at com.jcraft.jsch.Session.start_discard (Session.java:1050)
    com.jcraft.jsch.Session.read (Session.java:920)
    com.jcraft.jsch.Session.connect (Session.java:309)
    com.jcraft.jsch.Session.connect (Session.java:183)
    clj_ssh.ssh$eval5794$fn__5801.invoke (ssh.clj:118)
    clj_ssh.ssh.protocols$eval5720$fn__5743$G__5711__5752.invoke (protocols.clj:4)
    clj_ssh.ssh$connect.invoke (ssh.clj:401)
    clj_ssh.ssh$ssh.invoke (ssh.clj:722)
    jepsen.control$ssh_STAR_.invoke (control.clj:115)
    jepsen.control$exec_STAR_.doInvoke (control.clj:121)
    clojure.lang.RestFn.applyTo (RestFn.java:137)
    clojure.core$apply.invoke (core.clj:624)
    jepsen.control$exec.doInvoke (control.clj:135)
    clojure.lang.RestFn.invoke (RestFn.java:482)
    jepsen.mesosphere$db$reify__9742$fn__9743.invoke (mesosphere.clj:153)
    jepsen.mesosphere$db$reify__9742.teardown_BANG_ (mesosphere.clj:150)
    jepsen.chronos$db$reify__10301.teardown_BANG_ (chronos.clj:76)
    jepsen.db$eval5519$fn__5520$G__5509__5524.invoke (db.clj:4)
    jepsen.db$eval5519$fn__5520$G__5508__5529.invoke (db.clj:4)
    clojure.lang.AFn.applyToHelper (AFn.java:160)
    clojure.lang.AFn.applyTo (AFn.java:144)
    clojure.core$apply.invoke (core.clj:626)
    clojure.core$partial$fn__4228.doInvoke (core.clj:2468)
    clojure.lang.RestFn.invoke (RestFn.java:421)
    jepsen.core$on_nodes$fn__7322.invoke (core.clj:85)
    clojure.core$pmap$fn__6328$fn__6329.invoke (core.clj:6466)
    clojure.core$binding_conveyor_fn$fn__4145.invoke (core.clj:1910)
    clojure.lang.AFn.call (AFn.java:18)
    java.util.concurrent.FutureTask.run (FutureTask.java:266)
    java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
    java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
    java.lang.Thread.run (Thread.java:745)

lein test jepsen.chronos.checker-test

Configuration: jepsen-vagrant VM running debian/jessie64; the jepsen boxes themselves are lxc containers. Jepsen is a git checkout of cf35539. I had to make a few tweaks to the configuration for other reasons: ssh is using password-based auth, and sshd_config has "MaxSessions 50" (due to https://stackoverflow.com/questions/6947651/is-there-a-limit-to-how-many-channels-can-be-open-per-session-in-jsch), and "PermitRootLogin yes". If there's any more information I can provide to help repro the issue, just let me know.

/var/cache/apt/pkgcache.bin is missing in jepsen debian container.

I tried to reproduce the snapshot isolation issue (codership/galera#336) on my local machine, but dirty-reads-test exits with an error saying that /var/cache/apt/pkgcache.bin is missing in the Jepsen Debian containers.

Local machine info:
OS: Ubuntu 15.04

I followed these instructions to set up Jepsen: https://github.com/aphyr/jepsen/tree/master/docker

root@aa912f8def96:/jepsen/galera# lein test

lein test jepsen.galera-test
INFO jepsen.os.debian - :n3 setting up debian
INFO jepsen.os.debian - :n4 setting up debian
INFO jepsen.os.debian - :n1 setting up debian
INFO jepsen.os.debian - :n2 setting up debian
INFO jepsen.os.debian - :n5 setting up debian
INFO jepsen.os.debian - Installing #{man-db curl iputils-ping logrotate sysvinit-core rsyslog faketime vim unzip wget iptables sysvinit}
INFO jepsen.os.debian - Installing #{man-db curl iputils-ping logrotate sysvinit-core rsyslog faketime vim unzip wget iptables sysvinit}
INFO jepsen.os.debian - Installing #{man-db curl iputils-ping logrotate sysvinit-core rsyslog faketime vim unzip wget iptables sysvinit}
INFO jepsen.os.debian - Installing #{man-db curl iputils-ping logrotate sysvinit-core rsyslog faketime vim unzip wget iptables sysvinit}

lein test :only jepsen.galera-test/dirty-reads-test

ERROR in (dirty-reads-test) (FutureTask.java:122)
expected: (:valid? (:results (run! (dirty-reads/test- version 4))))
actual: java.util.concurrent.ExecutionException: java.lang.RuntimeException: stat: cannot stat '/var/cache/apt/pkgcache.bin': No such file or directory

Failed to run jepsen sample test on ubuntu and ubuntu lxc containers

Has anyone successfully run Jepsen tests on Ubuntu 14.04 with Ubuntu containers? Since Ubuntu containers do not accept a root password as described in lxc.md, I simply set PasswordAuthentication no to allow ssh in. However, the test fails to locate sysvinit-core when it runs. Do I have to create Debian containers instead?


lein test aerospike.core-test
INFO jepsen.os.debian - :n4 setting up debian
INFO jepsen.os.debian - :n1 setting up debian
INFO jepsen.os.debian - :n5 setting up debian
INFO jepsen.os.debian - :n3 setting up debian
INFO jepsen.os.debian - :n2 setting up debian
INFO jepsen.os.debian - Installing #{man-db curl sysvinit-core faketime unzip wget iptables sysvinit}
INFO jepsen.os.debian - Installing #{man-db curl sysvinit-core faketime unzip wget iptables sysvinit}
INFO jepsen.os.debian - Installing #{man-db curl sysvinit-core faketime unzip wget iptables sysvinit}
INFO jepsen.os.debian - Installing #{man-db curl sysvinit-core faketime unzip wget iptables sysvinit}

lein test :only aerospike.core-test/counter

ERROR in (counter) (FutureTask.java:122)
Uncaught exception, not in assertion.
expected: nil
INFO jepsen.os.debian - :n4 setting up debian
INFO jepsen.os.debian - :n5 setting up debian
INFO jepsen.os.debian - :n2 setting up debian
INFO jepsen.os.debian - :n3 setting up debian
INFO jepsen.os.debian - :n1 setting up debian
INFO jepsen.os.debian - Installing #{man-db curl sysvinit-core faketime unzip wget iptables sysvinit}
INFO jepsen.os.debian - Installing #{man-db curl sysvinit-core faketime unzip wget iptables sysvinit}
INFO jepsen.os.debian - Installing #{man-db curl sysvinit-core faketime unzip wget iptables sysvinit}
INFO jepsen.os.debian - Installing #{man-db curl sysvinit-core faketime unzip wget iptables sysvinit}
actual: java.util.concurrent.ExecutionException: java.lang.RuntimeException: E: Unable to locate package sysvinit-core
E: Package 'sysvinit' has no installation candidate

Reading package lists...
Building dependency tree...
Reading state information...
Package sysvinit is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
However the following packages replace it:
upstart sysvinit-utils

at java.util.concurrent.FutureTask.report (FutureTask.java:122)
java.util.concurrent.FutureTask.get (FutureTask.java:192)
clojure.core$deref_future.invoke (core.clj:2180)
clojure.core$future_call$reify__6320.deref (core.clj:6420)
clojure.core$deref.invoke (core.clj:2200)
clojure.core$map$fn__4245.invoke (core.clj:2559)
clojure.lang.LazySeq.sval (LazySeq.java:40)
clojure.lang.LazySeq.seq (LazySeq.java:56)
clojure.lang.RT.seq (RT.java:484)
clojure.core$seq.invoke (core.clj:133)
clojure.core$dorun.invoke (core.clj:2855)
jepsen.core$on_nodes.invoke (core.clj:86)
jepsen.core$run_BANG_$fn__7561.invoke (core.clj:391)
jepsen.core$run_BANG_.invoke (core.clj:362)
aerospike.core_test/fn (core_test.clj:16)
clojure.test$test_var$fn__7187.invoke (test.clj:704)
clojure.test$test_var.invoke (test.clj:704)
clojure.test$test_vars$fn__7209$fn__7214.invoke (test.clj:722)
clojure.test$default_fixture.invoke (test.clj:674)
clojure.test$test_vars$fn__7209.invoke (test.clj:722)
clojure.test$default_fixture.invoke (test.clj:674)
clojure.test$test_vars.invoke (test.clj:718)
clojure.test$test_all_vars.invoke (test.clj:728)
clojure.test$test_ns.invoke (test.clj:747)
clojure.core$map$fn__4245.invoke (core.clj:2559)
clojure.lang.LazySeq.sval (LazySeq.java:40)
clojure.lang.LazySeq.seq (LazySeq.java:49)
clojure.lang.Cons.next (Cons.java:39)
clojure.lang.RT.boundedLength (RT.java:1654)
clojure.lang.RestFn.applyTo (RestFn.java:130)
clojure.core$apply.invoke (core.clj:626)
clojure.test$run_tests.doInvoke (test.clj:762)
clojure.lang.RestFn.applyTo (RestFn.java:137)
clojure.core$apply.invoke (core.clj:624)
user$eval85$fn__144$fn__175.invoke (form-init23722721549352995.clj:1)
user$eval85$fn__144$fn__145.invoke (form-init23722721549352995.clj:1)
user$eval85$fn__144.invoke (form-init23722721549352995.clj:1)
user$eval85.invoke (form-init23722721549352995.clj:1)
clojure.lang.Compiler.eval (Compiler.java:6703)
clojure.lang.Compiler.eval (Compiler.java:6693)
clojure.lang.Compiler.load (Compiler.java:7130)
clojure.lang.Compiler.loadFile (Compiler.java:7086)
clojure.main$load_script.invoke (main.clj:274)
clojure.main$init_opt.invoke (main.clj:279)
clojure.main$initialize.invoke (main.clj:307)
clojure.main$null_opt.invoke (main.clj:342)
clojure.main$main.doInvoke (main.clj:420)
clojure.lang.RestFn.invoke (RestFn.java:421)
clojure.lang.Var.invoke (Var.java:383)
clojure.lang.AFn.applyToHelper (AFn.java:156)
clojure.lang.Var.applyTo (Var.java:700)
clojure.main.main (main.java:37)
Caused by: java.lang.RuntimeException: E: Unable to locate package sysvinit-core
E: Package 'sysvinit' has no installation candidate

Reading package lists...
Building dependency tree...
Reading state information...
Package sysvinit is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
However the following packages replace it:
upstart sysvinit-utils

at jepsen.control$throw_on_nonzero_exit.invoke (control.clj:105)
jepsen.control$exec_STAR_.doInvoke (control.clj:121)
clojure.lang.RestFn.applyTo (RestFn.java:137)
clojure.core$apply.invoke (core.clj:624)
jepsen.control$exec.doInvoke (control.clj:135)
clojure.lang.RestFn.applyTo (RestFn.java:137)
clojure.core$apply.invoke (core.clj:630)
jepsen.os.debian$install.invoke (debian.clj:98)
jepsen.os.debian$reify__8012$fn__8013.invoke (debian.clj:132)
jepsen.os.debian$reify__8012.setup_BANG_ (debian.clj:130)
jepsen.os$eval5601$fn__5602$G__5593__5606.invoke (os.clj:4)
jepsen.os$eval5601$fn__5602$G__5592__5611.invoke (os.clj:4)
clojure.lang.AFn.applyToHelper (AFn.java:160)
clojure.lang.AFn.applyTo (AFn.java:144)
clojure.core$apply.invoke (core.clj:626)
clojure.core$partial$fn__4228.doInvoke (core.clj:2468)
clojure.lang.RestFn.invoke (RestFn.java:421)
jepsen.core$on_nodes$fn__7463.invoke (core.clj:85)
clojure.core$pmap$fn__6328$fn__6329.invoke (core.clj:6466)
clojure.core$binding_conveyor_fn$fn__4145.invoke (core.clj:1910)
clojure.lang.AFn.call (AFn.java:18)
java.util.concurrent.FutureTask.run (FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
java.lang.Thread.run (Thread.java:745)

lein test :only aerospike.core-test/cas-register

ERROR in (cas-register) (FutureTask.java:122)
Uncaught exception, not in assertion.
expected: nil
actual: java.util.concurrent.ExecutionException: java.lang.RuntimeException: E: Unable to locate package sysvinit-core
E: Package 'sysvinit' has no installation candidate

Reading package lists...
Building dependency tree...
Reading state information...
Package sysvinit is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
However the following packages replace it:
upstart sysvinit-utils

at java.util.concurrent.FutureTask.report (FutureTask.java:122)
java.util.concurrent.FutureTask.get (FutureTask.java:192)
clojure.core$deref_future.invoke (core.clj:2180)
clojure.core$future_call$reify__6320.deref (core.clj:6420)
clojure.core$deref.invoke (core.clj:2200)
clojure.core$map$fn__4245.invoke (core.clj:2559)
clojure.lang.LazySeq.sval (LazySeq.java:40)
clojure.lang.LazySeq.seq (LazySeq.java:56)
clojure.lang.RT.seq (RT.java:484)
clojure.core$seq.invoke (core.clj:133)
clojure.core$dorun.invoke (core.clj:2855)
jepsen.core$on_nodes.invoke (core.clj:86)
jepsen.core$run_BANG_$fn__7561.invoke (core.clj:391)
jepsen.core$run_BANG_.invoke (core.clj:362)
aerospike.core_test/fn (core_test.clj:9)
clojure.test$test_var$fn__7187.invoke (test.clj:704)
clojure.test$test_var.invoke (test.clj:704)
clojure.test$test_vars$fn__7209$fn__7214.invoke (test.clj:722)
clojure.test$default_fixture.invoke (test.clj:674)
clojure.test$test_vars$fn__7209.invoke (test.clj:722)
clojure.test$default_fixture.invoke (test.clj:674)
clojure.test$test_vars.invoke (test.clj:718)
clojure.test$test_all_vars.invoke (test.clj:728)
clojure.test$test_ns.invoke (test.clj:747)
clojure.core$map$fn__4245.invoke (core.clj:2559)
clojure.lang.LazySeq.sval (LazySeq.java:40)
clojure.lang.LazySeq.seq (LazySeq.java:49)
clojure.lang.Cons.next (Cons.java:39)
clojure.lang.RT.boundedLength (RT.java:1654)
clojure.lang.RestFn.applyTo (RestFn.java:130)
clojure.core$apply.invoke (core.clj:626)
clojure.test$run_tests.doInvoke (test.clj:762)
clojure.lang.RestFn.applyTo (RestFn.java:137)
clojure.core$apply.invoke (core.clj:624)
user$eval85$fn__144$fn__175.invoke (form-init23722721549352995.clj:1)
user$eval85$fn__144$fn__145.invoke (form-init23722721549352995.clj:1)
user$eval85$fn__144.invoke (form-init23722721549352995.clj:1)
user$eval85.invoke (form-init23722721549352995.clj:1)
clojure.lang.Compiler.eval (Compiler.java:6703)
clojure.lang.Compiler.eval (Compiler.java:6693)
clojure.lang.Compiler.load (Compiler.java:7130)
clojure.lang.Compiler.loadFile (Compiler.java:7086)
clojure.main$load_script.invoke (main.clj:274)
clojure.main$init_opt.invoke (main.clj:279)
clojure.main$initialize.invoke (main.clj:307)
clojure.main$null_opt.invoke (main.clj:342)
clojure.main$main.doInvoke (main.clj:420)
clojure.lang.RestFn.invoke (RestFn.java:421)
clojure.lang.Var.invoke (Var.java:383)
clojure.lang.AFn.applyToHelper (AFn.java:156)
clojure.lang.Var.applyTo (Var.java:700)
clojure.main.main (main.java:37)
Caused by: java.lang.RuntimeException: E: Unable to locate package sysvinit-core
E: Package 'sysvinit' has no installation candidate

Reading package lists...
Building dependency tree...
Reading state information...
Package sysvinit is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
However the following packages replace it:
upstart sysvinit-utils

at jepsen.control$throw_on_nonzero_exit.invoke (control.clj:105)
jepsen.control$exec_STAR_.doInvoke (control.clj:121)
clojure.lang.RestFn.applyTo (RestFn.java:137)
clojure.core$apply.invoke (core.clj:624)
jepsen.control$exec.doInvoke (control.clj:135)
clojure.lang.RestFn.applyTo (RestFn.java:137)
clojure.core$apply.invoke (core.clj:630)
jepsen.os.debian$install.invoke (debian.clj:98)
jepsen.os.debian$reify__8012$fn__8013.invoke (debian.clj:132)
jepsen.os.debian$reify__8012.setup_BANG_ (debian.clj:130)
jepsen.os$eval5601$fn__5602$G__5593__5606.invoke (os.clj:4)
jepsen.os$eval5601$fn__5602$G__5592__5611.invoke (os.clj:4)
clojure.lang.AFn.applyToHelper (AFn.java:160)
clojure.lang.AFn.applyTo (AFn.java:144)
clojure.core$apply.invoke (core.clj:626)
clojure.core$partial$fn__4228.doInvoke (core.clj:2468)
clojure.lang.RestFn.invoke (RestFn.java:421)
jepsen.core$on_nodes$fn__7463.invoke (core.clj:85)
clojure.core$pmap$fn__6328$fn__6329.invoke (core.clj:6466)
clojure.core$binding_conveyor_fn$fn__4145.invoke (core.clj:1910)
clojure.lang.AFn.call (AFn.java:18)
java.util.concurrent.FutureTask.run (FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
java.lang.Thread.run (Thread.java:745)

Ran 2 tests containing 2 assertions.
0 failures, 2 errors.
Tests failed.

What are pending operations in the checker/total-queue?

I'm very sorry to be asking questions in the issues, but could you clarify (in the docs?) what the pending values reported by the checker are? For example, in this report:

{:queue
 {:valid? true,
  :final-queue
  {:pending
   #{10859 ... snip ...
     8879 10860 13488 9063}}},
 :total-queue
 {:valid? false,
  :lost
  #{ ... snip ... 9063},
  :recovered #{},
  :recovered-frac 0,
  :unexpected-frac 0,
  :unexpected #{},
  :lost-frac 142/16775,
  :duplicated-frac 0,
  :ok-frac 6986/16775,
  :duplicated #{}},

The pending value 10859 was only invoked for enqueue, with an unknown result. In the history:
4288 :invoke :enqueue 10859
while 9063 was enqueued OK, then reported as lost:
285 :invoke :enqueue 9063
285 :ok :enqueue 9063
...but it still remains in the pending list!

What magic lies behind this? Do you think it would be feasible to add a pending-frac number to the report as well? And the most pressing question: if test run A has X pending and run B has Y, with Y < X, does that mean B has a better result than A?
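
One reading that is consistent with the report above: :pending holds every enqueue that was attempted and never returned by a successful dequeue, whether or not the enqueue was acknowledged, while :lost holds only acknowledged enqueues that were never dequeued. Under that reading, an acknowledged-but-lost value such as 9063 appears in both sets. The sketch below illustrates this with plain sets. It is an illustration under those assumptions, not the checker's actual source: `classify-queue-history` is a hypothetical name, the processes 7 and 8 and the value 1 are made up, and duplicates are ignored (tracking them would need multisets).

```clojure
(require '[clojure.set :as set])

(defn classify-queue-history
  "Sketch only. history is a seq of op maps with :type, :f and :value.
  Returns sets of values under the reading described above."
  [history]
  (let [values   (fn [type f]
                   (set (for [op history
                              :when (and (= type (:type op))
                                         (= f (:f op)))]
                          (:value op))))
        attempts (values :invoke :enqueue)   ; every enqueue that was tried
        acked    (values :ok :enqueue)       ; enqueues the system confirmed
        dequeued (values :ok :dequeue)]      ; values actually read back
    {:pending    (set/difference attempts dequeued)
     :lost       (set/difference acked dequeued)
     :recovered  (set/difference (set/intersection dequeued attempts) acked)
     :unexpected (set/difference dequeued attempts)}))

(def example-history
  [{:process 285,  :type :invoke, :f :enqueue, :value 9063}
   {:process 285,  :type :ok,     :f :enqueue, :value 9063}
   {:process 4288, :type :invoke, :f :enqueue, :value 10859}
   ;; The ops below are invented so that one value is dequeued.
   {:process 7,    :type :invoke, :f :enqueue, :value 1}
   {:process 7,    :type :ok,     :f :enqueue, :value 1}
   {:process 8,    :type :invoke, :f :dequeue, :value nil}
   {:process 8,    :type :ok,     :f :dequeue, :value 1}])

(def example-result (classify-queue-history example-history))
```

On this reading, :pending mixes indeterminate enqueues with lost ones, so a smaller pending set on its own may not indicate a better run.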
