Comments (7)

deragent commented on July 22, 2024

Is this the full arXiv email that you receive?

In my case, there is usually also a footer at the bottom, in the form:

%%%---%%%---%%%---%%%---%%%---%%%---%%%---%%%---%%%---%%%---%%%---%%%---%%%---
For subscribe options to combined physics archives,
e-mail To: [[email protected]](mailto:[email protected]), Subject: subscribe
-----------------------------------------------------------------------------
For help on viewing and making submissions, see http://arxiv.org/help/

The parser also looks for this part and removes it during parsing.
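
The footer removal described above could be sketched roughly as follows. This is only an illustration, not the actual arxiv_filter code: the helper name `strip_footer` and the rule "everything from the last `%%%---` line onward is footer" are assumptions.

```python
import re

# Assumption: the footer begins at the last line that consists only of
# repeated "%%%---" groups, whatever text (if any) follows that line.
FOOTER_RULE = re.compile(r"^(?:%%%---)+\s*$")


def strip_footer(body: str) -> str:
    """Return the email body without the trailing footer, if one is found."""
    lines = body.splitlines()
    # Search from the end so that similar separators earlier in the
    # email are not mistaken for the footer.
    for index in range(len(lines) - 1, -1, -1):
        if FOOTER_RULE.match(lines[index]):
            return "\n".join(lines[:index])
    # No footer rule found: leave the body untouched instead of failing.
    return body
```

Matching only the separator line, and not the subscribe/help text after it, would let the same function accept both a long footer and a footer that stops right after the `%%%---` line.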

from arxivfilter.

louiskirsch commented on July 22, 2024

This is the footer I receive. It looks a bit different from yours, so maybe that is the issue.

------------------------------------------------------------------------------
\\
arXiv:2308.12896
replaced with revised version Tue, 29 Aug 2023 15:57:02 GMT   (2999kb,D)

Title: Beyond Document Page Classification: Design, Datasets, and Challenges
Authors: Jordy Van Landeghem, Sanket Biswas, Matthew B. Blaschko,
 Marie-Francine Moens
Categories: cs.CV cs.CL cs.LG
Comments: 8 pages, under review
\\ ( https://arxiv.org/abs/2308.12896 ,  2999kb)
%%%---%%%---%%%---%%%---%%%---%%%---%%%---%%%---%%%---%%%---%%%---%%%---%%%---

deragent commented on July 22, 2024

I'll look into this, but this should be an easy fix!

Could you attach a full copy of one of your emails for testing?

louiskirsch commented on July 22, 2024

Thanks!
arxiv.txt

deragent commented on July 22, 2024

So, the E-Mail with the simple footer should no longer lead to an error.

See version 0.3.4: https://pypi.org/project/arxiv-filter/0.3.4/

@louiskirsch: One other thing I noticed in the arxiv.txt file that you posted is that there are a lot of newlines in unexpected places. For example, the author lists are often split across multiple lines (see lines 46 - 50).
Is this always the case in your emails, or is it due to the way you copied the email?

The current version of arxiv_filter cannot handle such newlines, and thus will not properly parse the author lists in the arxiv.txt file that you posted.

louiskirsch commented on July 22, 2024

I checked the source of the email; these line breaks (in the authors list) are in the original email I received.

deragent commented on July 22, 2024

Indeed, I just checked, and my emails contain similar line breaks, but the number of spaces at the beginning of a line that continues the previous line seems to differ. I created a new issue for this: #8

But the original issue should be fixed with the latest release (0.3.4) - closing.
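
For reference, one way to handle the wrapped author lists, regardless of how many spaces start a continuation line, might look like the sketch below. It is not the fix tracked in #8; the function name `unfold_lines` and the rule "a line starting with whitespace continues the previous line" are assumptions based on the emails discussed here.

```python
def unfold_lines(text: str) -> list[str]:
    """Join continuation lines onto the line they continue.

    Assumption: a non-empty line that starts with a space or tab
    continues the previous non-empty line, however many spaces it has.
    """
    unfolded: list[str] = []
    for line in text.splitlines():
        is_continuation = (
            line[:1] in (" ", "\t")      # starts with whitespace
            and line.strip()             # but is not blank
            and unfolded                 # there is a previous line
            and unfolded[-1].strip()     # and that line is not blank
        )
        if is_continuation:
            # Collapse the indentation into a single separating space.
            unfolded[-1] = unfolded[-1].rstrip() + " " + line.strip()
        else:
            unfolded.append(line)
    return unfolded
```

Applied to the entry quoted earlier, the `Authors:` line and the indented ` Marie-Francine Moens` line would become a single line before the author list is parsed.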
