pchampin / sophia_benchmark

Benchmarking sophia and comparing it to other RDF libraries

License: MIT License

Makefile 0.24% Java 0.54% Shell 0.05% C 1.28% JavaScript 0.65% Python 0.88% Rust 0.45% Jupyter Notebook 95.91%

sophia_benchmark's Introduction

Benchmarking for Sophia

This is an environment for benchmarking the sophia library, and comparing it with other RDF libraries.

See the results

The results are available in benchmark_results.ipynb. They should display correctly on GitHub; otherwise, you need Jupyter to view them.

Reproduce the results

The tests were designed for my machine, running Ubuntu 18.10. To load and build all the necessary files, run make in the root directory of the project (see benchmark_results.ipynb for dependencies). To regenerate the CSV files, use the run_benchmark command with the appropriate arguments.

Further Requirements

n3js

export NODE_OPTIONS=--max_old_space_size=16000

librdf

Install it with one of the following commands (depending on your distribution):

  • pacman -S redland
  • apt install librdf-dev

Adding libraries to the benchmark

If you want to add another library to the benchmark, have a look at the BENCHMARK_INTERFACE.txt file.

sophia_benchmark's Issues

measuring time to retrieve all triples

In all benchmarks, t0 is reset to the current time after time_first is measured.
However, this means that the diagram title "Time (in s) to retrieve all triples" does not match what is measured: the reported value excludes the time taken to reach the first triple.
Either the title should be changed to "Time (in s) to retrieve the rest of the triples", or t0 should not be reset.

For example here in Python:

    for triple in g.triples(pattern):
        if time_first is None:
            time_first = get_time() - t0
            t0 = get_time()  # <-- this line should be removed
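
To illustrate the second option (not resetting t0), here is a minimal sketch that measures both values from the same starting point. The function name time_query and its injectable get_time parameter are hypothetical and are not taken from the benchmark scripts, which may be structured differently.

```python
import time


def time_query(triples, get_time=time.perf_counter):
    """Iterate over `triples` and time the iteration.

    Returns (time_first, time_all, count). Both durations are measured
    from the same t0, so time_all really covers *all* triples.
    `time_query` and `get_time` are illustrative names (assumptions).
    """
    t0 = get_time()
    time_first = None
    count = 0
    for _ in triples:
        if time_first is None:
            # Time to the first result; t0 is deliberately NOT reset here.
            time_first = get_time() - t0
        count += 1
    time_all = get_time() - t0
    return time_first, time_all, count
```

With this shape, "time to retrieve the rest of the triples" is still available as time_all - time_first whenever time_first is not None, so neither measurement is lost.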

make fails on Arch Linux

I don't know much about make, but it seems it is missing clang. Can this be included somehow?

sophia_benchmark$ make
cd data; make
make[1]: Entering directory '/home/konrad/tmp/sophia_benchmark/data'
wget http://downloads.dbpedia.org/2016-10/core-i18n/en/persondata_en.ttl.bz2
--2022-11-10 09:07:34--  http://downloads.dbpedia.org/2016-10/core-i18n/en/persondata_en.ttl.bz2
Resolving downloads.dbpedia.org (downloads.dbpedia.org)... 139.18.16.66
Connecting to downloads.dbpedia.org (downloads.dbpedia.org)|139.18.16.66|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 66513289 (63M) [application/octet-stream]
Saving to: ‘persondata_en.ttl.bz2’

persondata_en.ttl.bz2                   100%[=============================================================================>]  63.43M  47.9MB/s    in 1.3s    

2022-11-10 09:07:35 (47.9 MB/s) - ‘persondata_en.ttl.bz2’ saved [66513289/66513289]

bunzip2 persondata_en.ttl.bz2
tail -10000 "persondata_en.ttl" >"persondata_en_10k.ttl"
tail -20000 "persondata_en.ttl" >"persondata_en_20k.ttl"
tail -40000 "persondata_en.ttl" >"persondata_en_40k.ttl"
tail -80000 "persondata_en.ttl" >"persondata_en_80k.ttl"
tail -100000 "persondata_en.ttl" >"persondata_en_100k.ttl"
tail -1000000 "persondata_en.ttl" >"persondata_en_1M.ttl"
make[1]: Leaving directory '/home/konrad/tmp/sophia_benchmark/data'
cd b_sophia; make
make[1]: Entering directory '/home/konrad/tmp/sophia_benchmark/b_sophia'
cargo build --release
    Updating git repository `https://github.com/pchampin/sophia_rs`
    Updating git submodule `https://github.com/w3c/json-ld-api`
    Updating git submodule `https://github.com/w3c/rdf-tests.git`
  Downloaded language-tag v0.9.0
  Downloaded proc-macro2 v1.0.10
  Downloaded cfg-if v0.1.10
  Downloaded rio_api v0.4.2
  Downloaded rustversion v1.0.2
  Downloaded regex-syntax v0.6.17
  Downloaded syn v1.0.18
  Downloaded time-macros v0.1.0
  Downloaded time-macros-impl v0.1.1
  Downloaded thiserror-impl v1.0.15
  Downloaded thiserror v1.0.15
  Downloaded proc-macro-hack v0.5.15
  Downloaded memchr v2.3.3
  Downloaded standback v0.2.6
  Downloaded regex v1.3.7
  Downloaded aho-corasick v0.7.10
  Downloaded unicode-xid v0.2.0
  Downloaded quote v1.0.3
  Downloaded rio_turtle v0.4.2
  Downloaded weak-table v0.2.3
  Downloaded libc v0.2.69
  Downloaded time v0.2.10
  Downloaded 22 crates (1.7 MB) in 1.57s
   Compiling proc-macro2 v1.0.10
   Compiling unicode-xid v0.2.0
   Compiling syn v1.0.18
   Compiling standback v0.2.6
   Compiling memchr v2.3.3
   Compiling quote v0.3.15
   Compiling rustversion v1.0.2
   Compiling once_cell v1.12.0
   Compiling regex-syntax v0.6.17
   Compiling proc-macro-hack v0.5.15
   Compiling libc v0.2.69
   Compiling lazy_static v1.4.0
   Compiling weak-table v0.2.3
   Compiling rio_api v0.4.2
   Compiling cfg-if v0.1.10
   Compiling resiter v0.4.0
   Compiling thread_local v1.1.4
   Compiling peg v0.5.7
   Compiling rio_turtle v0.4.2
   Compiling aho-corasick v0.7.10
   Compiling quote v1.0.3
   Compiling language-tag v0.9.0
   Compiling regex v1.3.7
   Compiling thiserror-impl v1.0.15
   Compiling time-macros-impl v0.1.1
   Compiling time v0.2.10
   Compiling time-macros v0.1.0
   Compiling thiserror v1.0.15
   Compiling sophia_term v0.5.2 (https://github.com/pchampin/sophia_rs?tag=v0.5.2#53ee96a8)
   Compiling sophia v0.5.2 (https://github.com/pchampin/sophia_rs?tag=v0.5.2#53ee96a8)
   Compiling sophia_benchmark v0.1.0 (/home/konrad/tmp/sophia_benchmark/b_sophia)
    Finished release [optimized] target(s) in 12.31s
make[1]: Leaving directory '/home/konrad/tmp/sophia_benchmark/b_sophia'
cd b_librdf; make
make[1]: Entering directory '/home/konrad/tmp/sophia_benchmark/b_librdf'
clang -c parse.c -o parse.o -Ofast -Wall -I/usr/include/raptor2 -I/usr/include/rasqal
make[1]: clang: No such file or directory
make[1]: *** [Makefile:12: parse.o] Error 127
make[1]: Leaving directory '/home/konrad/tmp/sophia_benchmark/b_librdf'
make: *** [Makefile:5: all] Error 2

After installing clang manually, I get the following error instead:

cd data; make
make[1]: Entering directory '/home/konrad/tmp/sophia_benchmark/data'
make[1]: Nothing to be done for 'all'.
make[1]: Leaving directory '/home/konrad/tmp/sophia_benchmark/data'
cd b_sophia; make
make[1]: Entering directory '/home/konrad/tmp/sophia_benchmark/b_sophia'
cargo build --release
    Finished release [optimized] target(s) in 0.02s
make[1]: Leaving directory '/home/konrad/tmp/sophia_benchmark/b_sophia'
cd b_librdf; make
make[1]: Entering directory '/home/konrad/tmp/sophia_benchmark/b_librdf'
clang -c query.c -o query.o -Ofast -Wall -I/usr/include/raptor2 -I/usr/include/rasqal
query.c:30:10: fatal error: 'redland.h' file not found
#include <redland.h>
         ^~~~~~~~~~~
1 error generated.
make[1]: *** [Makefile:15: query.o] Error 1
make[1]: Leaving directory '/home/konrad/tmp/sophia_benchmark/b_librdf'
make: *** [Makefile:5: all] Error 2
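
Both failures above come from prerequisites that make does not check for: the clang executable and the redland.h header. As a rough sketch only, a preflight check along the following lines could report them before the build starts. The function name missing_prerequisites and the default include directories are assumptions made for illustration; they are not part of the repository, and the header may live elsewhere on a given distribution.

```python
import shutil
from pathlib import Path


def missing_prerequisites(compiler="clang", header="redland.h",
                          include_dirs=("/usr/include", "/usr/local/include")):
    """Return the names of prerequisites that could not be found.

    The default include_dirs are an assumption; pass the directories
    that apply to your system.
    """
    missing = []
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(compiler) is None:
        missing.append(compiler)
    # The header counts as present if it exists in any listed directory.
    if not any((Path(d) / header).is_file() for d in include_dirs):
        missing.append(header)
    return missing


if __name__ == "__main__":
    for name in missing_prerequisites():
        print(f"missing prerequisite: {name}")
```

An empty result only means that the two files were found, not that the build will succeed.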

benchmark with HDT?

I would be very interested in seeing how CPU and memory performance compare when using an HDT-based in-memory store instead of the default ones.
I also think that, in general, an up-to-date benchmark against today's other triple stores would be useful.
Unfortunately, I don't have the same CPU as you and can't run the code, see #5.
Could you rerun the benchmark using the new version of Sophia and with HDT?
Alternatively, if #5 is fixed, I could do that myself.
I plan to release hdt crate 0.0.1 soon, which will include an HDT-based Sophia graph with optimized triples_with_s queries and maybe triples_with_o.
triples_with_p will probably still be slow, but I'm mainly interested in memory consumption.
