arxivfilter's People

Contributors

deragent, mod20388

arxivfilter's Issues

Date format warnings when running in non-English locale

When running in a non-English locale (fi_FI.utf8), I get warnings like these when processing the arXiv email text:

time data '16 Nov 2023 21:30:05 GMT' does not match format '%d %b %Y %H:%M:%S %Z'
time data '17 Nov 2023 14:07:49 GMT' does not match format '%d %b %Y %H:%M:%S %Z'
time data '17 Nov 2023 15:35:57 GMT' does not match format '%d %b %Y %H:%M:%S %Z'

These go away when the locale is set to English or C (e.g., running LC_ALL=en_US.utf8 arxiv-filter or LC_ALL=C arxiv-filter).
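
A possible locale-independent workaround, as a minimal sketch: `%b` in `strptime` is interpreted in the current locale, so the month name can instead be looked up in a fixed English table. The function name `parse_arxiv_date` is hypothetical and not part of arxiv-filter, and the sketch assumes the timestamps always have the shape shown in the warnings above, ending in GMT.

```python
from datetime import datetime, timezone

# Fixed English month abbreviations, independent of the process locale.
_MONTHS = {
    name: number
    for number, name in enumerate(
        ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
        start=1,
    )
}


def parse_arxiv_date(text):
    """Parse e.g. '16 Nov 2023 21:30:05 GMT' without consulting the locale."""
    day, month, year, clock, zone = text.split()
    if zone != "GMT":
        # Assumption: arXiv emails only use GMT here.
        raise ValueError(f"unexpected time zone: {zone!r}")
    hour, minute, second = (int(part) for part in clock.split(":"))
    return datetime(int(year), _MONTHS[month], int(day),
                    hour, minute, second, tzinfo=timezone.utc)
```

Since the lookup table is hard-coded, the result should be the same under fi_FI.utf8, en_US.utf8, or C.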

Request: Support negative weights

Right now the filter only supports positive weights. This is a request to also support negative weights. The use case is filtering out keywords, authors, etc. that the user is not interested in.
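
To illustrate the request, here is a rough sketch of how a score could combine positive and negative weights. The function `score_entry` and the plain keyword-to-weight dictionary are assumptions made for this example; the actual filter may score entries differently.

```python
def score_entry(text, weights):
    """Sum the weights of all keywords that occur in the text.

    Negative weights lower the score, so unwanted topics can push
    an entry down or below a display threshold.
    """
    lowered = text.lower()
    return sum(weight for keyword, weight in weights.items()
               if keyword.lower() in lowered)


# Hypothetical weights: one topic of interest, one to filter out.
weights = {"dark matter": 2.0, "simulations": -1.5}
score = score_entry("Dark matter halos in simulations", weights)
```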

Improve name matching for short names

Currently, name matching does not consider where the name occurs within the author string, so it ignores name "boundaries".

This gives unsatisfying results for very short names such as "Li".

For example, "Li" matches all of the following names:

  • Olivio
  • Polli
  • Julian
  • Bellido
  • etc.

This needs to be improved.
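
One way this could be approached, as a sketch only: require that the name is not directly preceded or followed by another word character. The helper `name_matches` is hypothetical and not taken from the arxiv-filter code.

```python
import re


def name_matches(name, author):
    """Return True if `name` occurs in `author` as a whole word."""
    # Lookarounds reject matches that sit inside a longer word.
    pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
    return re.search(pattern, author, flags=re.IGNORECASE) is not None


for candidate in ["Olivio", "Polli", "Julian", "Bellido", "Wei Li"]:
    print(candidate, name_matches("Li", candidate))
```

With this check, only "Wei Li" in the loop above is reported as a match.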

Parsing of submission meta-data is incorrect in some cases.

Currently, the parser assumes that lines continuing the previous one in the metadata section are prefixed with two spaces ('  '). See the "Title" tag here:

\\
arXiv:2308.14777
Date: Mon, 28 Aug 2023 12:38:02 GMT   (4607kb,D)

Title: Power-spectrum space decomposition of frequency tomographic data for
  intensity mapping experiments
Authors: Chang Feng, Filipe B. Abdalla
Categories: astro-ph.IM astro-ph.CO
Comments: 5 pages, 3 figures
\\

This is not universal: in some cases a continuation line is indicated by a single space (' '). See the "Title" tag here:

\\
arXiv:2306.07015
replaced with revised version Tue, 29 Aug 2023 09:17:05 GMT   (151kb,D)

Title: Combining Primal and Dual Representations in Deep Restricted Kernel
 Machines Classifiers
Authors: Francesco Tonin, Panagiotis Patrinos, Johan A. K. Suykens
Categories: cs.LG
\\ ( https://arxiv.org/abs/2306.07015 ,  151kb)

It is unclear why this variation exists and whether it is linked to subscriptions to different archives (physics vs. cs).
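
A tolerant parser could treat any line that starts with whitespace as a continuation, regardless of how many spaces are used. The following is a minimal sketch under that assumption; `unfold_metadata` is a hypothetical helper, not the parser used by arxiv-filter.

```python
import re

# A tagged line looks like "Title: ..." or "Authors: ...".
_TAG = re.compile(r"^([A-Za-z][\w-]*):\s*(.*)$")


def unfold_metadata(lines):
    """Collect 'Tag: value' fields, joining indented continuation lines."""
    fields = {}
    current = None
    for line in lines:
        tag = _TAG.match(line)
        if tag:
            current = tag.group(1)
            fields[current] = tag.group(2).strip()
        elif current is not None and line[:1] in (" ", "\t") and line.strip():
            # One space, two spaces, or a tab all count as a continuation.
            fields[current] += " " + line.strip()
        else:
            # Blank or untagged line: stop extending the previous field.
            current = None
    return fields
```

Applied to either example above, the "Title" field comes out as a single line.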

Does not parse arXiv emails with "empty" footer

When pasting the arXiv email below, I get 'This is not an arxiv email!':

------------------------------------------------------------------------------
------------------------------------------------------------------------------
Send any comments regarding submissions directly to submitter.
------------------------------------------------------------------------------
Archives at http://arxiv.org/
To unsubscribe, e-mail To: [email protected], Subject: cancel
------------------------------------------------------------------------------
Submissions to:
Artificial Intelligence
Machine Learning
received from  Mon 28 Aug 23 18:00:00 GMT  to  Tue 29 Aug 23 18:00:00 GMT
------------------------------------------------------------------------------
------------------------------------------------------------------------------
\\
arXiv:2308.14815
Date: Mon, 28 Aug 2023 18:06:24 GMT   (3725kb,D)

Title: Distributionally Robust Statistical Verification with Imprecise Neural
 Networks
Authors: Souradeep Dutta, Michele Caprio, Vivian Lin, Matthew Cleaveland, Kuk
 Jin Jang, Ivan Ruchkin, Oleg Sokolsky, Insup Lee
Categories: cs.AI cs.LG cs.RO
\\
 A particularly challenging problem in AI safety is providing guarantees on
the behavior of high-dimensional autonomous systems. Verification approaches
centered around reachability analysis fail to scale, and purely statistical
approaches are constrained by the distributional assumptions about the sampling
process. Instead, we pose a distributionally robust version of the statistical
verification problem for black-box systems, where our performance guarantees
hold over a large family of distributions. This paper proposes a novel approach
based on a combination of active learning, uncertainty quantification, and
neural network verification. A central piece of our approach is an ensemble
technique called Imprecise Neural Networks, which provides the uncertainty to
guide active learning. The active learning uses an exhaustive neural-network
verification tool Sherlock to collect samples. An evaluation on multiple
physical simulators in the openAI gym Mujoco environments with
reinforcement-learned controllers demonstrates that our approach can provide
useful and scalable guarantees for high-dimensional systems.
\\ ( https://arxiv.org/abs/2308.14815 ,  3725kb)
------------------------------------------------------------------------------
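
How arxiv-filter decides whether the text is an arXiv email is not shown here. As a hedged sketch of a check that does not depend on the footer or on leading blank lines, the text could simply be searched for at least one submission identifier line. The name `looks_like_arxiv_email` is hypothetical.

```python
import re

# A submission entry starts with a line such as "arXiv:2308.14815".
_ENTRY_ID = re.compile(r"^arXiv:\d{4}\.\d{4,5}", re.MULTILINE)


def looks_like_arxiv_email(text):
    """Return True if the text contains at least one arXiv entry header."""
    return _ENTRY_ID.search(text) is not None
```

This accepts the email above even though nothing follows the last entry's separator line.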

Use different background color depending on which part matches.

For example, use a different background color for the title (e.g. green instead of blue) when the match is on an author.

This would make the match types easier to tell apart, since some papers are worth reading because of the author rather than the topic.

The new colors should integrate with the existing fade-out of the background color toward weaker matches.
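
One way the fading could be kept while varying the hue, sketched without any Qt dependency: pick a base color per match type and scale only the alpha channel with the score. The color table, the match type names, and `background_rgba` are assumptions for illustration.

```python
# Hypothetical base colors per match type, as (red, green, blue).
BASE_COLORS = {
    "keyword": (0, 0, 255),   # blue, as used today
    "author": (0, 128, 0),    # green for author matches
}


def background_rgba(match_kind, score, max_score):
    """Return (r, g, b, alpha) where alpha fades with the relative score."""
    red, green, blue = BASE_COLORS[match_kind]
    if max_score <= 0:
        strength = 0.0
    else:
        # Clamp to [0, 1] so out-of-range scores stay valid colors.
        strength = max(0.0, min(1.0, score / max_score))
    return (red, green, blue, round(255 * strength))
```

The resulting tuple could be passed on to whatever color type the UI uses.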

Show which Keywords got matched by the filter

It would be nice to quickly see which keywords were matched by the filter.

One solution (preferred) would be to highlight the matched words in the title and in the abstract, for example by making them bold.

A solution that is simpler to implement would be to add an additional list (for example below the author list) with all keywords that were matched.
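
The preferred variant could be sketched as follows, assuming the title and abstract are shown in widgets that accept simple HTML-style rich text. The helper `highlight_keywords` is hypothetical; it escapes the text and wraps each matched keyword in `<b>` tags.

```python
import html
import re


def highlight_keywords(text, keywords):
    """Return `text` as escaped HTML with matched keywords in bold."""
    keywords = [keyword for keyword in keywords if keyword]
    if not keywords:
        return html.escape(text)
    # Try longer keywords first so they win over their own substrings.
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in ordered),
                         re.IGNORECASE)
    parts = []
    last = 0
    for match in pattern.finditer(text):
        parts.append(html.escape(text[last:match.start()]))
        parts.append("<b>" + html.escape(match.group(0)) + "</b>")
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

The simpler variant would only need the list of keywords for which the pattern found a match.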

Improve the README.md

  • Better explain the usage
  • Add note on installation of qt and pyqt5
  • Add example config file

Crash due to passing float value to Qt

I am getting a consistent crash on Python 3.11 and PyQt 6.6:

Traceback (most recent call last):
  File "/home/mod20/.local/pipx/venvs/arxiv-filter/lib/python3.11/site-packages/arxiv_filter/ui/ArxivFilter.py", line 141, in parsedCallback
    self.tableFiltered.setEntries(filtered)
  File "/home/mod20/.local/pipx/venvs/arxiv-filter/lib/python3.11/site-packages/arxiv_filter/ui/ListView.py", line 52, in setEntries
    widget = ListEntry(entry, self._filtered)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mod20/.local/pipx/venvs/arxiv-filter/lib/python3.11/site-packages/arxiv_filter/ui/ListEntry.py", line 69, in __init__
    self.initUI()
  File "/home/mod20/.local/pipx/venvs/arxiv-filter/lib/python3.11/site-packages/arxiv_filter/ui/ListEntry.py", line 98, in initUI
    header_lo.setStretch(0, 0.01)
TypeError: setStretch(self, index: int, stretch: int): argument 2 has unexpected type 'float'

According to the Qt documentation, setStretch only accepts int values. Correcting the offending values in ui/ListEntry.py seems to fix it.
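
If the float values were meant as relative proportions, one option is to scale them to integers before passing them on. This is a sketch only; `to_int_stretch` and the scale of 100 are assumptions, and the actual fix in ui/ListEntry.py may differ.

```python
def to_int_stretch(factors, scale=100):
    """Convert relative float factors into non-negative integer stretches."""
    return [max(0, round(factor * scale)) for factor in factors]


# Example: 0.01 and 1.0 keep their 1:100 ratio as integers.
stretches = to_int_stretch([0.01, 1.0])
# Each value can then be used as the second argument of setStretch,
# which per the traceback above has the signature
# setStretch(index: int, stretch: int).
```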

Improve text color in "filtered" item list

Black text on a dark blue background has poor contrast and is hard to read.

[Screenshot: bad contrast in normal mode]

The problem is even worse when the user has a dark mode theme enabled:

[Screenshot: white text on a white background when using a dark mode theme]

This leads to white text on a nearly white background and renders the titles unreadable.
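
A possible approach, independent of the system theme, is to choose the text color from the background color itself. The sketch below uses the WCAG relative luminance and contrast ratio definitions to pick black or white, whichever contrasts more. The function names are hypothetical and colors are assumed to be 8-bit sRGB tuples.

```python
def _linear(channel):
    """Convert an 8-bit sRGB channel to linear light."""
    value = channel / 255
    if value <= 0.04045:
        return value / 12.92
    return ((value + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (r, g, b) color, from 0.0 to 1.0."""
    red, green, blue = (_linear(channel) for channel in rgb)
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def readable_text_color(background):
    """Return black or white, whichever has more contrast on `background`."""
    luminance = relative_luminance(background)
    contrast_with_white = 1.05 / (luminance + 0.05)
    contrast_with_black = (luminance + 0.05) / 0.05
    if contrast_with_white >= contrast_with_black:
        return (255, 255, 255)
    return (0, 0, 0)
```

For a dark blue background this selects white text, and for a nearly white background it selects black text.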

Pasting Arxiv E-Mail text fails

Message: "This is not an arXiv email."

Possibly caused by an additional empty line at the beginning of the pasted text.

To be investigated.

Drag and drop of the same text works.
