Comments (20)
Hi @seedoilz, could you elaborate on how you identified that snapshot isolation is violated?
from dgraph.
It is a timestamp-based determination method. All transactions are sorted in ascending order by commit timestamp. The checker then iterates through each transaction and, based on its start timestamp, determines which committed transactions it should have read from, checking that this is consistent with the data it actually read. It also checks whether concurrent transactions have write conflicts.
We are working on a paper on this, but it's only in draft form at the moment, so if you don't mind and are interested, we'd be happy to share it.
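To make the description above concrete, here is a minimal sketch of such a timestamp-based check. It is not the actual checker's code: the `Txn` record, the `check_snapshot_isolation` function, and the message strings are all hypothetical names chosen for illustration. It assumes a transaction should see exactly the writes whose commit timestamp is strictly smaller than its own start timestamp, and that it does not read its own writes.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    # Hypothetical record shape; a real history format will differ.
    txn_id: str
    start_ts: int
    commit_ts: int
    reads: dict = field(default_factory=dict)   # key -> value the txn observed
    writes: dict = field(default_factory=dict)  # key -> value the txn wrote


def check_snapshot_isolation(txns, initial=None):
    """Return a list of violation messages; an empty list means none were found."""
    violations = []
    # Sort all transactions in ascending order of commit timestamp.
    by_commit = sorted(txns, key=lambda t: t.commit_ts)

    for t in by_commit:
        # Rebuild the snapshot t should have read: every write committed
        # before t started, applied in commit order.
        snapshot = dict(initial or {})
        for other in by_commit:
            if other.commit_ts < t.start_ts:
                snapshot.update(other.writes)
        # Compare the expected snapshot with what t actually read.
        for key, seen in t.reads.items():
            expected = snapshot.get(key)
            if seen != expected:
                violations.append(
                    f"inconsistent read: {t.txn_id} read {key}={seen!r}, "
                    f"expected {expected!r}"
                )

    # Two transactions are concurrent if the later committer started
    # before the earlier one committed; they must not write the same key.
    for i, first in enumerate(by_commit):
        for second in by_commit[i + 1:]:
            overlap = second.start_ts < first.commit_ts
            shared = set(first.writes) & set(second.writes)
            if overlap and shared:
                violations.append(
                    f"write-write conflict: {first.txn_id} and "
                    f"{second.txn_id} on {sorted(shared)}"
                )
    return violations
```

The sketch rebuilds the snapshot from scratch for every transaction, which is quadratic in the history size; that is fine for illustration but would likely be too slow for a history of 120000 transactions.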
from dgraph.
If you could send us a way to reproduce the issue, we would be happy to look into it. We saw something like this before too (#8146) but later found out that the issue was with the application code.
from dgraph.
What you need to do is download the [json file](https://box.nju.edu.cn/f/64a49141b4e44368bf41/) and clone this [code](https://github.com/Tsunaou/dbcdc-runner). Then set up the environment, including Leiningen and Java (which Jepsen needs). After that, you can run the following command in the root directory of the code, replacing ${dbcop-workload-path} with the path of that json file.
lein run test-all -w rw \
  --txn-num 120000 \
  --time-limit 43200 \
  -r 10000 \
  --node dummy-node \
  --isolation snapshot-isolation \
  --expected-consistency-model snapshot-isolation \
  --nemesis none \
  --existing-postgres \
  --no-ssh \
  --database dgraph \
  --dbcop-workload-path ${dbcop-workload-path} \
  --dbcop-workload
from dgraph.
--lru_mb is an old parameter. What version of Dgraph are you using? The compose file that you are using looks really old to me.
from dgraph.
Sorry, that compose file is actually not the one I used. I accidentally deleted mine, but it was based on the one I gave, so I don't think it is a big deal.
from dgraph.
This is the compose file I was using. You can change it a bit (the node names) and use it.
version: "3.2"
networks:
  dgraph:
services:
  zero:
    image: dgraph/dgraph:latest
    volumes:
      - data-volume:/dgraph
    ports:
      - 5080:5080
      - 6080:6080
    networks:
      - dgraph
    deploy:
      placement:
        constraints:
          - node.hostname == VM-0-7-tencentos
    command: dgraph zero --my=zero:5080 --replicas 3
  alpha1:
    image: dgraph/dgraph:latest
    hostname: "alpha1"
    volumes:
      - data-volume:/dgraph
    ports:
      - 8080:8080
      - 9080:9080
    networks:
      - dgraph
    deploy:
      placement:
        constraints:
          - node.hostname == VM-0-7-tencentos
    command: dgraph alpha --my=alpha1:7080 --security whitelist=10.0.0.0/8,172.0.0.0/8,192.168.0.0/16,127.0.0.1 --zero=zero:5080
  alpha2:
    image: dgraph/dgraph:latest
    hostname: "alpha2"
    volumes:
      - data-volume:/dgraph
    ports:
      - 8081:8081
      - 9081:9081
    networks:
      - dgraph
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == VM-0-12-tencentos
    command: dgraph alpha --my=alpha2:7081 --security whitelist=10.0.0.0/8,172.0.0.0/8,192.168.0.0/16,127.0.0.1 --zero=zero:5080 -o 1
  alpha3:
    image: dgraph/dgraph:latest
    hostname: "alpha3"
    volumes:
      - data-volume:/dgraph
    ports:
      - 8082:8082
      - 9082:9082
    networks:
      - dgraph
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == VM-0-14-tencentos
    command: dgraph alpha --my=alpha3:7082 --security whitelist=10.0.0.0/8,172.0.0.0/8,192.168.0.0/16,127.0.0.1 --zero=zero:5080 -o 2
volumes:
  data-volume:
from dgraph.
@mangalaman93 hello, could you help me?
from dgraph.
Sorry about the delay. The compose file still doesn't look right because the same volume is mounted in all the alphas.
from dgraph.
After removing the volume, I get this null pointer exception. Am I running it right?
INFO [2023-08-25 20:18:08,431] jepsen test runner - jepsen.db Tearing down DB
INFO [2023-08-25 20:18:08,433] jepsen test runner - jepsen.db Setting up DB
INFO [2023-08-25 20:18:08,434] jepsen test runner - jepsen.core Relative time begins now
WARN [2023-08-25 20:18:08,442] main - jepsen.core Test crashed!
java.lang.NullPointerException: null
at disalg.dbcdc.client$open.invokeStatic(client.clj:56)
at disalg.dbcdc.client$open.invoke(client.clj:51)
at disalg.dbcdc.rw.Client.open_BANG_(rw.clj:107)
at jepsen.core$run_case_BANG_$fn__9727.invoke(core.clj:220)
at dom_top.core$real_pmap_helper$build_thread__213$fn__214.invoke(core.clj:146)
at clojure.lang.AFn.applyToHelper(AFn.java:152)
at clojure.lang.AFn.applyTo(AFn.java:144)
at clojure.core$apply.invokeStatic(core.clj:667)
at clojure.core$with_bindings_STAR_.invokeStatic(core.clj:1990)
at clojure.core$with_bindings_STAR_.doInvoke(core.clj:1990)
at clojure.lang.RestFn.invoke(RestFn.java:425)
at clojure.lang.AFn.applyToHelper(AFn.java:156)
at clojure.lang.RestFn.applyTo(RestFn.java:132)
at clojure.core$apply.invokeStatic(core.clj:671)
at clojure.core$bound_fn_STAR_$fn__5818.doInvoke(core.clj:2020)
at clojure.lang.RestFn.invoke(RestFn.java:397)
at clojure.lang.AFn.run(AFn.java:22)
at java.base/java.lang.Thread.run(Thread.java:829)
WARN [2023-08-25 20:18:08,446] main - jepsen.cli Test crashed
java.lang.NullPointerException: null
at disalg.dbcdc.client$open.invokeStatic(client.clj:56)
at disalg.dbcdc.client$open.invoke(client.clj:51)
at disalg.dbcdc.rw.Client.open_BANG_(rw.clj:107)
at jepsen.core$run_case_BANG_$fn__9727.invoke(core.clj:220)
at dom_top.core$real_pmap_helper$build_thread__213$fn__214.invoke(core.clj:146)
at clojure.lang.AFn.applyToHelper(AFn.java:152)
at clojure.lang.AFn.applyTo(AFn.java:144)
at clojure.core$apply.invokeStatic(core.clj:667)
at clojure.core$with_bindings_STAR_.invokeStatic(core.clj:1990)
at clojure.core$with_bindings_STAR_.doInvoke(core.clj:1990)
at clojure.lang.RestFn.invoke(RestFn.java:425)
at clojure.lang.AFn.applyToHelper(AFn.java:156)
at clojure.lang.RestFn.applyTo(RestFn.java:132)
at clojure.core$apply.invokeStatic(core.clj:671)
at clojure.core$bound_fn_STAR_$fn__5818.doInvoke(core.clj:2020)
at clojure.lang.RestFn.invoke(RestFn.java:397)
at clojure.lang.AFn.run(AFn.java:22)
at java.base/java.lang.Thread.run(Thread.java:829)
Error parsing edn file 'null': Cannot open <nil> as a Reader.
from dgraph.
No, but I think this null pointer exception is raised because the data volume is not mounted on the host, so Jepsen cannot find the data file.
In addition, on my servers the same volume is mounted in all the alphas, and that works well.
from dgraph.
If you mount the same volume inside zero and all the alphas, they will end up using the same p directory, which is a problem. And why is the test trying to read a file that Dgraph has written? I'm not sure how this compose file is working for you. Am I missing something?
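As a rough illustration of the concern, the following sketch lists volumes that more than one service mounts. It assumes the `services` section of the compose file has already been loaded into a plain dictionary (for example by a YAML parser, not shown) and that mounts use the short `source:target` string syntax. The function name `shared_volumes` is hypothetical, and the check only looks at the file, not at where the volumes physically live on each node.

```python
from collections import defaultdict


def shared_volumes(services):
    """Map each volume source to the services mounting it, keeping only shared ones."""
    users = defaultdict(list)
    for name, spec in services.items():
        for mount in spec.get("volumes", []):
            # Short syntax is "source:target"; the source names the volume.
            source = mount.split(":", 1)[0]
            users[source].append(name)
    return {src: names for src, names in users.items() if len(names) > 1}


# The volume mounts from the compose file posted earlier in this thread.
services = {
    "zero": {"volumes": ["data-volume:/dgraph"]},
    "alpha1": {"volumes": ["data-volume:/dgraph"]},
    "alpha2": {"volumes": ["data-volume:/dgraph"]},
    "alpha3": {"volumes": ["data-volume:/dgraph"]},
}

print(shared_volumes(services))
```

For that compose file the result names `data-volume` together with all four services, which is the sharing being questioned here.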
from dgraph.
I know why you got the null pointer exception: I forgot to tell you that you need to put this file in dbcdc-runner/resources/.
Of course, if you set a password for Dgraph, you need to change this file a little.
from dgraph.
It is running now, but I do not see any new predicate in the cluster. Is the code writing data into dgraph?
from dgraph.
Since we cannot access Dgraph via 127.0.0.1, we use the public IP address to operate the database.
So you need to replace 175.27.241.31 in dbcdc-runner/src/disalg/dbcdc/impls/dgraph.clj with your own IP address, or with localhost (127.0.0.1) if you can access the database that way.
from dgraph.
It is still somehow hitting the 175.27.241.31 IP even after I changed it everywhere and ran lein clean.
from dgraph.
Maybe you forgot to change the IP address in the .edn file that I gave you recently.
> I know why you have the null pointer exception. It is because that I forgot to tell u that you need to put this file in dbcdc-runner/resources/ Of course if you set password for dgraph, you need to change this file a little bit.
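Because the address appears in both the Clojure source and the .edn resource file, a small search can help confirm that no copy was missed. This is only a sketch: `find_occurrences` is a hypothetical helper, and the default suffix list assumes that only `.clj` and `.edn` files matter.

```python
from pathlib import Path


def find_occurrences(root, needle, suffixes=(".clj", ".edn")):
    """Return (path, line number, line) for every line under root containing needle."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling `find_occurrences("dbcdc-runner", "175.27.241.31")` from the directory that contains the clone should list each remaining file and line; an empty list suggests the address has been replaced in all `.clj` and `.edn` files.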
from dgraph.
That was it. I do see 1000 values for the value predicate. How long is the test configured to run? What did you observe when it failed?
from dgraph.
I am able to complete a full run of the test now, though it fails in the analysis step due to limited memory on my laptop. I am thinking of running it on a bigger machine, but before that, could you share the results of the run from which you concluded that it failed? And how did you reach that conclusion?
from dgraph.
> It is a timestamp-based determination method. All transactions are sorted in ascending order by commit timestamps. Then iterates through each transaction and, based on its start timestamps, determines which committed transactions it should have read, checking for consistency with the data it actually read. It also checks to see if concurrent transactions have write conflicts. We are working on a paper on this, but it's only in draft form at the moment, so if you don't mind and are interested, we'd be happy to share it.

We reached that conclusion by running this jar file:
java -jar TimeKiller.jar --history_path THE_JSON_FILE_PATH --enable_session false
In addition, the text I quoted above describes how we test snapshot isolation. The code is here: https://github.com/FertileFragrance/TimeKiller
from dgraph.