Comments (3)

jamilatta avatar jamilatta commented on August 27, 2024

@bnewbold is this a constant error or a sporadic one?

I would like to know whether this occurs on every run. I need to know whether you are unable to get SciELO metadata at all, so that we can classify and prioritize this request.

from articlemetaapi.

bnewbold avatar bnewbold commented on August 27, 2024

@jamilatta Thank you for your rapid reply!

This error occurred on my first attempt, after iterating through about 19,700 identifiers. Here is the script I am writing:

https://gist.github.com/bnewbold/9918634282f6013e13174badbce64a93

I am running it a second time now and have gotten past 50,000 identifiers, so this is probably sporadic. I'll note that I get requests.exceptions.ReadTimeout errors almost immediately (in both runs, and from two separate machines). The complete failure happens if:

fail retrieving data from (http://articlemeta.scielo.org/api/v1/article/identifiers) attempt(1/10)

... all the attempts fail. I assume this is due to rate limiting, as mentioned in the source. Perhaps there should be an extra delay by default to prevent these timeouts?
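To make the "extra delay" idea concrete, here is a minimal sketch of retrying with exponential backoff and jitter. It is not how articlemetaapi retries internally; `call_with_backoff`, its parameters, and the default delays are all hypothetical names and values chosen for illustration, and `fetch` stands in for whatever call retrieves one batch of identifiers.

```python
import random
import time


def call_with_backoff(fetch, max_attempts=10, base_delay=1.0, max_delay=60.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts is exhausted.

    Hypothetical helper, not part of articlemetaapi. The wait doubles after
    each failure (capped at max_delay), plus random jitter so that several
    clients do not retry in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))
```

On the client side this could wrap the failing call with `retry_on=(requests.exceptions.ReadTimeout,)`, so that only timeouts are retried and other errors surface immediately. Whether longer waits actually help depends on how the server applies its rate limit, which I have not verified.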

As some context, I am hoping to extract the full metadata for all 900k to 1 million articles as a JSON snapshot, to archive and include in https://fatcat.wiki, particularly for articles which do not have a DOI. If there is a more efficient way to achieve this, please let me know!

Thank you for maintaining articlemetaapi.

jamilatta avatar jamilatta commented on August 27, 2024

@bnewbold I will think of a way to avoid having all the attempts fail.

Let me talk it over with my coworkers, and I will get back to you soon.

Thanks.
