Comments (5)

maxxkia commented on June 6, 2024

I tried to get around this problem by just parsing the body and references section:

Element body = extractor.getBodyAsNLM();
List<Element> references = extractor.getReferencesAsNLM();

but then a different problem occurs when parsing the references section:

Exception in thread "main" java.lang.NullPointerException
	at pl.edu.icm.cermine.InternalContentExtractor.getReferencesAsNLM(InternalContentExtractor.java:225)
	at pl.edu.icm.cermine.ContentExtractor.getReferencesAsNLM(ContentExtractor.java:444)
	at pl.edu.icm.cermine.ContentExtractor.getReferencesAsNLM(ContentExtractor.java:460)

from cermine.

dtkaczyk commented on June 6, 2024

@maxxkia Thanks for reporting. Unfortunately, I haven't been able to reproduce this on my laptop so far. Could you specify which CERMINE and Java versions you are using? Also, please provide the exact code you are running, starting from where the PDF files are read in.

maxxkia commented on June 6, 2024

I'm using CERMINE 1.13, and running Java 1.8.0_131.

Ok, after some debugging I figured out where the problem was. In my older code I was instantiating the extractor once:

ContentExtractor extractor = new ContentExtractor();

and then processing the documents in a loop:

for (InputStream is : inputStreams)
{
	extractor.setPDF(is);
	Element body = extractor.getContentAsNLM();
}

Then, changing the code as follows fixed the problem:

for (InputStream is : inputStreams)
{
	ContentExtractor extractor = new ContentExtractor();
	extractor.setPDF(is);
	Element body = extractor.getContentAsNLM();
}

So something seems to go wrong in setPDF: it looks like the extractor is not reinitialized properly after a document has been processed. Or am I using it the wrong way?
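
To illustrate the kind of failure I suspect, here is a toy model, not CERMINE's code. The class ToyExtractor, its fields, and its methods are all hypothetical; the mechanism (a setter that clears an intermediate result but leaves an "already parsed" flag set) is only an assumption about how a reused object could work for the first document and throw a NullPointerException on the second.

```java
// Toy model only: the class, its fields and its methods are hypothetical
// and are not taken from CERMINE's source.
class StaleStateDemo {

    static class ToyExtractor {
        private String input;     // raw text of the current document
        private String[] lines;   // intermediate result derived from input
        private boolean parsed;   // true once lines has been built

        // Buggy setter: drops the intermediate result but leaves the
        // "already parsed" flag untouched, so lines is never rebuilt.
        void setInputBuggy(String input) {
            this.input = input;
            this.lines = null;
        }

        // Fixed setter: resets every piece of derived state.
        void setInput(String input) {
            this.input = input;
            this.lines = null;
            this.parsed = false;
        }

        // Counts lines that look like reference entries, e.g. "[1] ...".
        int countReferences() {
            if (!parsed) {
                lines = input.split("\n");
                parsed = true;
            }
            int count = 0;
            for (String line : lines) {   // throws NullPointerException if lines is null
                if (line.startsWith("[")) {
                    count++;
                }
            }
            return count;
        }
    }

    public static void main(String[] args) {
        String first = "Intro\n[1] Ref A\n[2] Ref B";
        String second = "Body\n[1] Ref C";

        // Reusing one instance with the buggy setter fails on the second document.
        ToyExtractor reused = new ToyExtractor();
        reused.setInputBuggy(first);
        System.out.println("first document: " + reused.countReferences());
        reused.setInputBuggy(second);
        try {
            reused.countReferences();
        } catch (NullPointerException e) {
            System.out.println("second document: NullPointerException");
        }

        // With the fixed setter, the same instance can be reused safely.
        ToyExtractor fixed = new ToyExtractor();
        fixed.setInput(first);
        System.out.println("first document: " + fixed.countReferences());
        fixed.setInput(second);
        System.out.println("second document: " + fixed.countReferences());
    }
}
```

Creating a new object per document, as in my workaround, sidesteps this class of problem because no derived state can survive from the previous document.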

dtkaczyk commented on June 6, 2024

No, you are right, it was a bug. I believe it is fixed by commit 82c28dc. Could you take a look at the newest code in the master branch and test it in your setting?

maxxkia commented on June 6, 2024

Yes, the problem is solved now. Thanks.
