LMDB OOM's for a Large Dataset (rdf4j, open issue)

benherber commented on June 10, 2024
LMDB OOM's for a Large Dataset

from rdf4j.

Comments (8)

hmottestad commented on June 10, 2024

You can create a PR with the fixes you want to backport and we can merge them into main. If the code ends up being identical between main and develop then there shouldn't be any problems. If not then we will need to be a bit more careful when merging main into develop later.

kenwenzel commented on June 10, 2024

The current implementation in 4.x.x is experimental, and I've made some fixes and enhancements in the develop branch.
Those could be backported to 4.x.x, but I don't know the correct procedure for doing this.

@hmottestad Could you give some advice here?

kenwenzel commented on June 10, 2024

It could be due to the order by clause that needs to materialize all values in a sorted set.
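To illustrate the point about ordering, here is a minimal sketch in plain Java. It is not RDF4J's actual implementation; the class and method names are hypothetical, and integers stand in for query solutions. It only shows why counting can discard rows as they arrive while sorting has to hold on to all of them first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

class OrderByMaterialization {

    // Unordered consumption: each row can be discarded as soon as it is counted.
    static long countStreaming(Iterator<Integer> rows) {
        long count = 0;
        while (rows.hasNext()) {
            rows.next();
            count++;
        }
        return count;
    }

    // Ordered consumption: the smallest row might arrive last, so every row
    // has to be buffered before the first sorted row can be returned.
    static List<Integer> sortAll(Iterator<Integer> rows) {
        List<Integer> buffer = new ArrayList<>();
        rows.forEachRemaining(buffer::add);
        buffer.sort(Comparator.naturalOrder());
        return buffer;
    }

    public static void main(String[] args) {
        List<Integer> rows = List.of(3, 1, 2);
        System.out.println(countStreaming(rows.iterator()));
        System.out.println(sortAll(rows.iterator()));
    }
}
```

With a large result set, the buffer in `sortAll` grows with the number of rows, which is the kind of memory pressure an ORDER BY could plausibly cause.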

@JervenBolleman I think you have worked on the persistent sets?

@benherber BTW, you should/could use write batches with 100k triples. It would also be better to use the 5.0.0-SNAPSHOT, as the 4.x.x version has several bugs in the LmdbStore.
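A rough sketch of what batched writes could look like, kept independent of RDF4J so it runs on its own. `BatchedWriter` and `writeInBatches` are hypothetical names; the assumption is that the callback would wrap one transaction per batch, for example `conn.begin(); conn.add(batch); conn.commit();` on a `RepositoryConnection`.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

class BatchedWriter {

    // Batch size suggested in this thread.
    static final int BATCH_SIZE = 100_000;

    // Splits the input into lists of at most batchSize items and hands each
    // list to writeBatch. Returns the number of batches written.
    static <T> int writeInBatches(Iterator<T> items, int batchSize, Consumer<List<T>> writeBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<T> batch = new ArrayList<>();
        while (items.hasNext()) {
            batch.add(items.next());
            if (batch.size() == batchSize) {
                writeBatch.accept(batch);
                batches++;
                // Start a fresh list so the callback may keep the old one.
                batch = new ArrayList<>();
            }
        }
        // Flush the final, possibly smaller, batch.
        if (!batch.isEmpty()) {
            writeBatch.accept(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> items = List.of(1, 2, 3, 4, 5);
        // Assumption: in real use the callback would commit one transaction per batch.
        int batches = writeInBatches(items.iterator(), 2, b -> System.out.println("writing " + b));
        System.out.println(batches + " batches");
    }
}
```

Committing per batch keeps each transaction bounded in size instead of accumulating the whole dataset in one transaction.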

benherber commented on June 10, 2024

> It could be due to the order by clause that needs to materialize all values in a sorted set.
>
> @JervenBolleman I think you have worked on the persistent sets?
>
> @benherber BTW, you should/could use write batches with 100k triples. It would also be better to use the 5.0.0-SNAPSHOT, as the 4.x.x version has several bugs in the LmdbStore.

Ah that would make sense. Oh good to know! Will try it out thanks. Just to confirm, since 5.0.0 is coming down the pipe rather soon, does that mean that the lmdb implementation in 4.x.x will not get future bug fixes?

kenwenzel commented on June 10, 2024

@benherber How do you execute the queries? Are all results materialized?

benherber commented on June 10, 2024

> @benherber How do you execute the queries? Are all results materialized?

I just iterate through the result set, counting the number of triples:

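// try-with-resources closes the query result once iteration finishes
try (final TupleQueryResult res = query.evaluate()) {
	for (final BindingSet set : res) {
		// each BindingSet is one solution; the inner loop counts its individual bindings
		for (final var ignore : set) {
			count++;
		}
	}
}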

kenwenzel commented on June 10, 2024

OK, that looks good. Could you investigate the memory usage with VisualVM while running the query?
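As a lightweight complement to VisualVM, heap usage can also be sampled from inside the loop with the standard `java.lang.management` API. This is only a sketch: `HeapSampler` and `countWithHeapLog` are hypothetical names, and a plain iterator stands in for the query result. Note that LMDB memory-maps its files, so heap figures will not reflect that part of the process's memory.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.Iterator;
import java.util.List;

class HeapSampler {

    // Currently used Java heap in bytes, as reported by the JVM.
    static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    // Counts the rows and prints the used heap every sampleEvery rows.
    static <T> long countWithHeapLog(Iterator<T> rows, long sampleEvery) {
        if (sampleEvery <= 0) {
            throw new IllegalArgumentException("sampleEvery must be positive");
        }
        long count = 0;
        while (rows.hasNext()) {
            rows.next();
            count++;
            if (count % sampleEvery == 0) {
                System.out.printf("rows=%d usedHeap=%d MiB%n", count, usedHeapBytes() / (1024 * 1024));
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Stand-in data; in real use the iterator would come from the query result.
        long total = countWithHeapLog(List.of("a", "b", "c").iterator(), 2);
        System.out.println("total rows: " + total);
    }
}
```

A steadily climbing heap figure during iteration would point toward results being buffered on the Java side rather than streamed.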

benherber commented on June 10, 2024

> OK, that looks good. Could you investigate the memory usage with VisualVM while running the query?

Yeah, I'll probably get to that later this week if I get the chance. Will update once I do.
