deragent / arxivfilter

A small program for custom filtering of arXiv daily email blasts via drag and drop.

License: MIT License

Python 100.00%

arxivfilter's Issues

Date format warnings when running in non-English locale

When running in a non-English locale (fi_FI.utf8), I get warnings like these when processing the arXiv email text:

time data '16 Nov 2023 21:30:05 GMT' does not match format '%d %b %Y %H:%M:%S %Z'
time data '17 Nov 2023 14:07:49 GMT' does not match format '%d %b %Y %H:%M:%S %Z'
time data '17 Nov 2023 15:35:57 GMT' does not match format '%d %b %Y %H:%M:%S %Z'

These go away when the locale is set to English or C (e.g., running LC_ALL=en_US.utf8 arxiv-filter or LC_ALL=C arxiv-filter).
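
One way to avoid the locale dependence is to not use `%b` at all, since `strptime` matches month abbreviations against the current locale. The following is a minimal sketch, assuming the timestamps always have the `DD Mon YYYY HH:MM:SS GMT` shape shown in the warnings above; `parse_arxiv_date` is a hypothetical helper and not a function of arxivfilter.

```python
from datetime import datetime, timezone

# English month abbreviations, fixed here so the result never depends on LC_TIME.
_MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}


def parse_arxiv_date(text):
    """Parse e.g. '16 Nov 2023 21:30:05 GMT' without consulting the locale."""
    day, month, year, clock, zone = text.split()
    if zone != "GMT":
        raise ValueError(f"unexpected time zone: {zone!r}")
    hour, minute, second = (int(part) for part in clock.split(":"))
    # An unknown month abbreviation raises KeyError here.
    return datetime(int(year), _MONTHS[month], int(day),
                    hour, minute, second, tzinfo=timezone.utc)
```

Setting `LC_ALL=C` as described above remains the workaround that needs no code change.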

Make program close on Ctrl+C in terminal

As in the title. The window should close upon Ctrl+C when started from a terminal.
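
A common idiom for Python Qt applications is to restore the default SIGINT action before starting the event loop, because Python's own handler only runs once control returns to the interpreter. This is a sketch only; `enable_ctrl_c` is a hypothetical name, and where it would be called in arxivfilter's entry point is an assumption.

```python
import signal


def enable_ctrl_c():
    """Restore the default SIGINT action so Ctrl+C terminates the process."""
    # Must be called from the main thread, before the Qt event loop starts.
    signal.signal(signal.SIGINT, signal.SIG_DFL)


# Hypothetical placement in the program's entry point:
#   enable_ctrl_c()
#   app = QApplication(sys.argv)
#   ...
#   sys.exit(app.exec())
```

With the default action the process ends immediately, so no Python cleanup code runs on Ctrl+C.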

Request: Support negative weights

Right now the filter supports only positive weights. This is a request to also support negative weights. The use case is filtering out keywords, authors, etc. that the user is not interested in.
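
How the filter computes its score is not shown in this issue. As an illustration only, assuming an additive scheme in which each matched keyword contributes its weight, negative weights would simply lower the total; `score_text` and the example weights are hypothetical.

```python
def score_text(text, weights):
    """Sum the weights of all keywords that occur in `text` (case-insensitive)."""
    lowered = text.lower()
    return sum(weight for keyword, weight in weights.items()
               if keyword.lower() in lowered)


# Hypothetical configuration: one wanted and one unwanted keyword.
example_weights = {"neural network": 2.0, "survey": -1.5}
```

Entries whose total drops to zero or below could then be hidden or sorted to the end, depending on the desired behavior.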

Improve name matching for short names

Name matching currently ignores where in a name the match occurs, so it does not respect name "boundaries".

This gives very unsatisfying results for very short names such as "Li".

To illustrate this, all the following names would be matched by "Li":

Olivio
Polli
Julian
Bellido
etc.

This needs to be improved.
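
One possible approach is to require word boundaries around the name, for example with the regular-expression anchor `\b`. A minimal sketch, assuming case-insensitive matching is wanted; `name_matches` is a hypothetical helper, not the project's matching code.

```python
import re


def name_matches(pattern_name, author):
    """Return True if `pattern_name` occurs in `author` as a whole word."""
    regex = r"\b" + re.escape(pattern_name) + r"\b"
    return re.search(regex, author, flags=re.IGNORECASE) is not None
```

With this check "Li" no longer matches "Olivio" or "Julian", but still matches "Chang Li". Names that end in punctuation, such as initials written as "K.", would need extra care, because `\b` only matches between a word character and a non-word character.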

Parsing of submission meta-data is incorrect in some cases.

The parser currently assumes that continuation lines in the meta-data section are prefixed with two spaces ('  '). See the "Title" tag here:

\\
arXiv:2308.14777
Date: Mon, 28 Aug 2023 12:38:02 GMT   (4607kb,D)

Title: Power-spectrum space decomposition of frequency tomographic data for
  intensity mapping experiments
Authors: Chang Feng, Filipe B. Abdalla
Categories: astro-ph.IM astro-ph.CO
Comments: 5 pages, 3 figures
\\

This is not universal, however: in some cases the continuation of a line is indicated by a single space (' '). See the "Title" tag here:

\\
arXiv:2306.07015
replaced with revised version Tue, 29 Aug 2023 09:17:05 GMT   (151kb,D)

Title: Combining Primal and Dual Representations in Deep Restricted Kernel
 Machines Classifiers
Authors: Francesco Tonin, Panagiotis Patrinos, Johan A. K. Suykens
Categories: cs.LG
\\ ( https://arxiv.org/abs/2306.07015 ,  151kb)

It is unclear why this variation exists and whether it is linked to subscribing to different archives (physics vs. cs).
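
A parser that treats any leading whitespace as a continuation marker would handle both variants. The sketch below assumes that field lines have the form `Key: value` and start in the first column; `parse_metadata` is a hypothetical function, not the project's parser.

```python
import re

# A field line looks like "Title: ..." or "Authors: ...".
_FIELD = re.compile(r"^([A-Za-z-]+):\s*(.*)$")


def parse_metadata(lines):
    """Collect 'Key: value' fields, folding indented continuation lines."""
    fields = {}
    current = None  # key of the field a continuation line would extend
    for line in lines:
        if not line.strip():
            current = None  # a blank line ends any running field
            continue
        if line[0].isspace() and current is not None:
            # Any amount of leading whitespace marks a continuation.
            fields[current] += " " + line.strip()
            continue
        match = _FIELD.match(line)
        if match:
            current = match.group(1)
            fields[current] = match.group(2).strip()
        else:
            current = None  # e.g. the "replaced with revised version" line
    return fields
```

Applied to the two samples above, both titles come out as a single line regardless of whether the continuation is indented by one or two spaces.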

Does not parse arXiv emails with "empty" footer

When pasting the arxiv email, I get 'This is not an arxiv email!'

------------------------------------------------------------------------------
------------------------------------------------------------------------------
Send any comments regarding submissions directly to submitter.
------------------------------------------------------------------------------
Archives at http://arxiv.org/
To unsubscribe, e-mail To: [email protected], Subject: cancel
------------------------------------------------------------------------------
Submissions to:
Artificial Intelligence
Machine Learning
received from  Mon 28 Aug 23 18:00:00 GMT  to  Tue 29 Aug 23 18:00:00 GMT
------------------------------------------------------------------------------
------------------------------------------------------------------------------
\\
arXiv:2308.14815
Date: Mon, 28 Aug 2023 18:06:24 GMT   (3725kb,D)

Title: Distributionally Robust Statistical Verification with Imprecise Neural
 Networks
Authors: Souradeep Dutta, Michele Caprio, Vivian Lin, Matthew Cleaveland, Kuk
 Jin Jang, Ivan Ruchkin, Oleg Sokolsky, Insup Lee
Categories: cs.AI cs.LG cs.RO
\\
 A particularly challenging problem in AI safety is providing guarantees on
the behavior of high-dimensional autonomous systems. Verification approaches
centered around reachability analysis fail to scale, and purely statistical
approaches are constrained by the distributional assumptions about the sampling
process. Instead, we pose a distributionally robust version of the statistical
verification problem for black-box systems, where our performance guarantees
hold over a large family of distributions. This paper proposes a novel approach
based on a combination of active learning, uncertainty quantification, and
neural network verification. A central piece of our approach is an ensemble
technique called Imprecise Neural Networks, which provides the uncertainty to
guide active learning. The active learning uses an exhaustive neural-network
verification tool Sherlock to collect samples. An evaluation on multiple
physical simulators in the openAI gym Mujoco environments with
reinforcement-learned controllers demonstrates that our approach can provide
useful and scalable guarantees for high-dimensional systems.
\\ ( https://arxiv.org/abs/2308.14815 ,  3725kb)
------------------------------------------------------------------------------

Use different background color depending on which part matches.

For example, use a different background color for the title (e.g. green instead of blue) when the match is on an author.

This would make matches easier to tell apart, since some papers are worth reading because of the author rather than the topic.

The new colors should integrate properly with the existing fade-out of the background color towards weaker matches.
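
As a rough illustration of how a per-match-type color could combine with the fade, the sketch below picks a base color by match type and scales only the alpha channel with the score. How the existing fade is implemented is not described here, so the alpha-based fade, the color values, and the names `BASE_COLORS` and `background_rgba` are all assumptions.

```python
# Hypothetical base colors (RGB) per match type; blue and green follow the
# example in the request.
BASE_COLORS = {"keyword": (0, 0, 255), "author": (0, 160, 0)}


def background_rgba(match_type, score, max_score):
    """Return (r, g, b, alpha); alpha shrinks as the score gets weaker."""
    r, g, b = BASE_COLORS[match_type]
    if max_score <= 0:
        strength = 0.0
    else:
        # Clamp the relative score to the range 0..1.
        strength = max(0.0, min(1.0, score / max_score))
    return (r, g, b, round(255 * strength))
```

Keeping the hue fixed and fading only the opacity means every match type fades in the same way.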

Show which Keywords got matched by the filter

It would be nice to quickly see which keywords were matched by the filter.

The preferred solution would be to highlight the matched words in the title and in the abstract (for example by making them bold).

A solution that is simpler to implement would be to add an additional list (for example below the author list) with all keywords that were matched.
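
The preferred variant could be sketched as follows, assuming the title and abstract are shown in widgets that render rich text, so that `<b>` tags appear as bold. `highlight_keywords` is a hypothetical helper, and the plain case-insensitive substring matching is an assumption about how keywords are matched.

```python
import html
import re


def highlight_keywords(text, keywords):
    """Return HTML in which every keyword occurrence is wrapped in <b> tags."""
    words = [k for k in keywords if k]  # ignore empty keywords
    if not words:
        return html.escape(text)
    # Try longer keywords first so they win over shorter ones they contain.
    words.sort(key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(w) for w in words), re.IGNORECASE)
    parts = []
    position = 0
    for match in pattern.finditer(text):
        # Escape the plain text between matches, then the match itself.
        parts.append(html.escape(text[position:match.start()]))
        parts.append("<b>" + html.escape(match.group(0)) + "</b>")
        position = match.end()
    parts.append(html.escape(text[position:]))
    return "".join(parts)
```

Collecting the matched strings inside the same loop would also provide the data for the simpler variant, the separate list of matched keywords.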

Improve the README.md

Explain the usage better
Add a note on installing Qt and PyQt5
Add an example config file

Crash due to passing float value to Qt

I am getting a consistent crash on Python 3.11 and PyQt 6.6:

Traceback (most recent call last):
  File "/home/mod20/.local/pipx/venvs/arxiv-filter/lib/python3.11/site-packages/arxiv_filter/ui/ArxivFilter.py", line 141, in parsedCallback
    self.tableFiltered.setEntries(filtered)
  File "/home/mod20/.local/pipx/venvs/arxiv-filter/lib/python3.11/site-packages/arxiv_filter/ui/ListView.py", line 52, in setEntries
    widget = ListEntry(entry, self._filtered)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mod20/.local/pipx/venvs/arxiv-filter/lib/python3.11/site-packages/arxiv_filter/ui/ListEntry.py", line 69, in __init__
    self.initUI()
  File "/home/mod20/.local/pipx/venvs/arxiv-filter/lib/python3.11/site-packages/arxiv_filter/ui/ListEntry.py", line 98, in initUI
    header_lo.setStretch(0, 0.01)
TypeError: setStretch(self, index: int, stretch: int): argument 2 has unexpected type 'float'

According to the Qt documentation, setStretch accepts only int values. Correcting the offending values in ui/ListEntry.py seems to fix the crash.
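
If the float values were meant to express relative proportions, one option is to scale them to integers before passing them on. This is only a sketch: `to_int_stretch` is a hypothetical helper, and which integer values the layout in `ListEntry.py` should actually use is not stated in this report.

```python
def to_int_stretch(factors):
    """Scale non-negative stretch factors to ints, roughly keeping their ratios."""
    positive = [f for f in factors if f > 0]
    if not positive:
        return [0 for _ in factors]
    smallest = min(positive)
    # The smallest positive factor becomes 1; the others scale accordingly.
    return [round(f / smallest) for f in factors]


# Hypothetical usage with a layout object such as header_lo:
#   for index, stretch in enumerate(to_int_stretch([0.01, 1.0])):
#       header_lo.setStretch(index, stretch)
```

The second factor in the usage comment is made up for illustration; only the `0.01` appears in the traceback.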

Improve text color in "filtered" item list

The contrast of black text on a dark blue background is not amazing for readability!

The problem is even worse when the user has a dark mode theme enabled on the computer. This leads to white text on a nearly white background and renders the titles unreadable.

Pasting Arxiv E-Mail text fails

Message: "This is not an arXiv email."

Maybe due to an additional empty line at the beginning of the pasted text?

To be investigated.

Drag and drop of the same text works!
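
If the leading empty line does turn out to be the cause, which is unconfirmed, pasted text could be normalized before it reaches the parser. A minimal sketch; `normalize_pasted_text` is a hypothetical helper and not part of arxivfilter.

```python
def normalize_pasted_text(text):
    """Unify line endings and drop blank lines at the start and end."""
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    while lines and not lines[0].strip():
        lines.pop(0)
    while lines and not lines[-1].strip():
        lines.pop()
    return "\n".join(lines)
```

Only whole blank lines are removed, so the leading spaces that mark continuation lines inside the email are left untouched.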
