dlesbre / bibtex-autocomplete

Python package to autocomplete bibtex bibliographies

License: MIT License

Makefile 2.11% Python 97.01% TeX 0.79% Dockerfile 0.08%
python bibtex cli terminal research script scraper rest-api arxiv-api crossref-api

bibtex-autocomplete's Issues

SSL Error with researchr

CONNECTION ERROR: [SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1000)

ssl verify failed on Windows

My previous issue was resolved with the following commits and version 1.1.1. Thank you for that.

However, there is more. Now that the .bib file is opened properly, SSL Certificate problems arise.
The log looks like this:

==== Reading files =============================================================
Reading file 1 / 1 from 'References.bib'
Read 62 entries from 1 file

==== Completing entries ========================================================
[researchr] WARNING: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)
[researchr] WARNING: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)
[unpaywall] WARNING: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)
[researchr] WARNING: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)
[crossref] WARNING: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)
[researchr] WARNING: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)
[researchr] WARNING: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)
[unpaywall] WARNING: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)
[researchr] WARNING: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)
[crossref] WARNING: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)
[researchr] WARNING: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)
[researchr] WARNING: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)
[researchr] WARNING: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)
[unpaywall] WARNING: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)
[researchr] WARNING: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)
[crossref] WARNING: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)

Again, this happens only on Windows (Python 3.9).

Modifying the HTTPSConnection call in https.py to
connection = HTTPSConnection(domain, timeout=self.connection_timeout, context=ssl._create_unverified_context())
works around this by disabling certificate verification, as described in this StackExchange post.
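
A less drastic alternative to the unverified context is to keep verification on and point Python at an up-to-date CA bundle. The sketch below is not btac's code: `make_connection` and its `cafile` parameter are hypothetical names, and it assumes the failures come from a stale certificate store on the Windows machine rather than from the servers themselves.

```python
import ssl
from http.client import HTTPSConnection
from typing import Optional


def make_connection(domain: str, timeout: float = 10.0,
                    cafile: Optional[str] = None) -> HTTPSConnection:
    """Build an HTTPSConnection that still verifies certificates.

    `cafile` may point to a PEM bundle of trusted CAs (for example the one
    shipped by the third-party `certifi` package); None uses the system store.
    """
    context = ssl.create_default_context(cafile=cafile)
    # No network traffic happens here; the socket opens on the first request.
    return HTTPSConnection(domain, timeout=timeout, context=context)


# The default context requires a valid certificate and a matching hostname.
default_context = ssl.create_default_context()
connection = make_connection("api.crossref.org")
```

If the system store cannot be refreshed, passing a current bundle through `cafile` keeps the protection that `ssl._create_unverified_context()` gives up.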

`btac` command not found on windows

When installing via pip install bibtexautocomplete, pip copies btac to the Scripts folder. However, Windows requires a btac.exe (or .cmd) in that folder to work properly.

According to this reference:

Although setup() supports a scripts keyword for pointing to pre-made scripts to install, the recommended approach to achieve cross-platform compatibility is to use console_scripts entry points (see below).

Thus, replacing scripts with console_scripts entry points in setup.py should fix the issue, e.g.:

    entry_points={
        'console_scripts': [
            # entry points use the form name=package.module:function
            'btac=bibtexautocomplete.core.main:main',
        ],
    },

How to reference bibtex-autocomplete

I would like to add bibtex-autocomplete to the contribution section of my paper.

If I am not mistaken, this is not yet possible for this great project.

There are various ways to support this:

via Zenodo (see for example ASReview, https://zenodo.org/records/10203469, and these Stack Exchange questions: https://academia.stackexchange.com/questions/151533/how-to-cite-a-github-repository-that-i-used-for-data-collection-research-paper, https://academia.stackexchange.com/questions/14010/how-do-you-cite-a-github-repository),

or directly via GitHub.

@dlesbre, would you be okay with supporting one of these two options?
Or can I assist in some way?

I would love to reference your great work.
Regards Edzo

problem with installation: type error prevents execution

Hello,

I was very happy to find bibtexautocomplete. It seems to do exactly what I would like to achieve (systematically adding doi field to manually curated bibtex file). Unfortunately, I was unable to properly install it. I am using Ubuntu 20.04 LTS and python 3.8. Below is the error message when I execute btac --version on the command line.

I have no experience with python and am thus unable to debug the problem. Would you be able to tell me how I can get bibtex running?

Thank you in advance and best wishes,

Kim

Traceback (most recent call last):
  File "/home/hubert/.local/bin/btac", line 3, in <module>
    from bibtexautocomplete.core.main import main
  File "/home/hubert/.local/lib/python3.8/site-packages/bibtexautocomplete/core/__init__.py", line 1, in <module>
    from .autocomplete import BibtexAutocomplete
  File "/home/hubert/.local/lib/python3.8/site-packages/bibtexautocomplete/core/autocomplete.py", line 17, in <module>
    from ..lookups.abstract_base import LookupType
  File "/home/hubert/.local/lib/python3.8/site-packages/bibtexautocomplete/lookups/abstract_base.py", line 36, in <module>
    LookupType = type[LookupProtocol]
TypeError: 'type' object is not subscriptable
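
The traceback points at `type[LookupProtocol]`. Subscripting the built-in `type` at runtime is only supported from Python 3.9 (PEP 585), so the module fails to import on 3.8. A minimal sketch of a 3.8-compatible alias follows; the `LookupProtocol` body and `DummyLookup` are stand-ins for illustration, not the project's actual definitions.

```python
from typing import Protocol, Type


class LookupProtocol(Protocol):
    """Placeholder protocol standing in for the project's real one."""

    def query(self) -> object: ...


# typing.Type[...] is subscriptable on Python 3.8, unlike the built-in type.
LookupType = Type[LookupProtocol]


class DummyLookup:
    """Hypothetical class that structurally satisfies LookupProtocol."""

    def query(self) -> object:
        return None


lookup_class: LookupType = DummyLookup
```

Upgrading to Python 3.9 or later avoids the error without touching the code.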

Overwrite specific field

Hi!

I've noticed you have a flag (-f --force-overwrite) that enables overwriting existing fields. Can it be limited to specific fields, for example to overwrite only existing authors?
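
To make the request concrete, here is a sketch of the merge behaviour being asked for. `merge_fields` and its `overwrite` parameter are hypothetical and do not describe an existing btac option; the field values, including the DOI, are made-up placeholders.

```python
from typing import Dict, Iterable, Optional


def merge_fields(existing: Dict[str, str], found: Dict[str, str],
                 overwrite: Optional[Iterable[str]] = None) -> Dict[str, str]:
    """Return a copy of `existing` completed with values from `found`.

    Missing fields are always added; fields already present are replaced
    only if their name is listed in `overwrite`.
    """
    allowed = set(overwrite or ())
    merged = dict(existing)
    for name, value in found.items():
        if name not in merged or name in allowed:
            merged[name] = value
    return merged


entry = {"author": "Dubois, D.", "year": "2012"}
found = {
    "author": "Dubois, Didier and Prade, Henri",
    "year": "1988",
    "doi": "10.1000/example",  # placeholder DOI, not a real record
}
result = merge_fields(entry, found, overwrite=["author"])
```

Here the author is replaced and the DOI is added, while the existing year is left alone.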

Mistakes / Inaccuracies

Here are a few mistakes the tool made on my .bib file.

Reference management is by no means a trivial task, and no one has managed to fully automate it yet, but maybe this gets the project one step closer ;)


Original reference, pointing to this book.
@book{Dubois2012,
author = {Dubois, Didier and Prade, Henri},
publisher = {Springer Science & Business Media},
title = {Possibility Theory: An Approach to Computerized Processing of Uncertainty},
year = {2012}
}

Autocompleted version with a nonexistent doi, and a journal as booktitle?
@book{Dubois2012,
title = {Possibility Theory: An Approach to Computerized Processing of Uncertainty},
author = {Dubois, Didier and Prade, Henri},
booktitle = {Journal of the American Society for Information Science},
doi = {10.1007/978-1-4684-5287-7},
issn = {0002-8231},
month = {3},
pages = {1-263},
publisher = {Springer Science & Business Media},
year = {2012},
}


Original reference, meaning this book
@book{Augustin2014,
author = {Augustin, Thomas and Coolen, Frank PA and De Cooman, Gert and Troffaes, Matthias CM},
publisher = {John Wiley & Sons},
title = {Introduction to imprecise probabilities},
year = {2014}
}

Autocompleted version with a doi pointing to a paper from different authors
@book{Augustin2014,
title = {Introduction to imprecise probabilities},
author = {Augustin, Thomas and Coolen, Frank PA and De Cooman, Gert and Troffaes, Matthias CM},
booktitle = {Optimization Under Uncertainty with Applications to Aerospace Engineering},
doi = {10.1007/978-3-030-60166-9_2},
isbn = {9783030601652},
issn = {1940-6517},
month = {5},
pages = {35-79},
publisher = {John Wiley & Sons},
year = {2014},
}

Regression: uncaught error when checking DOI

New error in 1.3:

UNEXPECTED ERROR: 
 | Uncaught exception when checking DOI resolution                                 
 | Entry = <built-in function id>
 | DOI = 10.1007/s10703-022-00405-8
 | 
 | As a result, this DOI will NOT be added to the entry
 | 
 | Traceback (most recent call last):
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexautocomplete/bibtex/fields.py", line 108, in slow_check
 |     return doi_checker.query() is True
 |            ^^^^^^^^^^^^^^^^^^^
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexautocomplete/lookups/abstract_base.py", line 114, in query
 |     return super().query()
 |            ^^^^^^^^^^^^^^^
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexautocomplete/lookups/abstract_base.py", line 94, in query
 |     return self.process_data(data)
 |            ^^^^^^^^^^^^^^^^^^^^^^^
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexautocomplete/APIs/doi.py", line 92, in process_data
 |     if self.check_url(value["data"]["value"].to_str()):
 |        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexautocomplete/APIs/doi.py", line 108, in check_url
 |     text = normalize_str_weak(final.data.decode())
 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexautocomplete/bibtex/normalize.py", line 58, in normalize_str_weak
 |     string = latex_to_unicode(string)
 |              ^^^^^^^^^^^^^^^^^^^^^^^^
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexparser/latexenc.py", line 65, in latex_to_unicode
 |     string = _replace_all_latex(string, itertools.chain(
 |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexparser/latexenc.py", line 53, in _replace_all_latex
 |     string = _replace_latex(string, l.rstrip(), u)
 |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexparser/latexenc.py", line 35, in _replace_latex
 |     if unicodedata.combining(unicod):
 |        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 | TypeError: combining() argument must be a unicode character, not str
 | 
 | You can report this bug at https://github.com/dlesbre/bibtex-autocomplete/issues
 | 
UNEXPECTED ERROR: 
 | Uncaught exception when checking DOI resolution                                 
 | Entry = <built-in function id>
 | DOI = 10.1007/978-3-030-65474-0_14
 | 
 | As a result, this DOI will NOT be added to the entry
 | 
 | Traceback (most recent call last):
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexautocomplete/bibtex/fields.py", line 108, in slow_check
 |     return doi_checker.query() is True
 |            ^^^^^^^^^^^^^^^^^^^
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexautocomplete/lookups/abstract_base.py", line 114, in query
 |     return super().query()
 |            ^^^^^^^^^^^^^^^
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexautocomplete/lookups/abstract_base.py", line 94, in query
 |     return self.process_data(data)
 |            ^^^^^^^^^^^^^^^^^^^^^^^
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexautocomplete/APIs/doi.py", line 92, in process_data
 |     if self.check_url(value["data"]["value"].to_str()):
 |        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexautocomplete/APIs/doi.py", line 108, in check_url
 |     text = normalize_str_weak(final.data.decode())
 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexautocomplete/bibtex/normalize.py", line 58, in normalize_str_weak
 |     string = latex_to_unicode(string)
 |              ^^^^^^^^^^^^^^^^^^^^^^^^
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexparser/latexenc.py", line 65, in latex_to_unicode
 |     string = _replace_all_latex(string, itertools.chain(
 |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexparser/latexenc.py", line 53, in _replace_all_latex
 |     string = _replace_latex(string, l.rstrip(), u)
 |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 |   File "/home/dorian/.local/pipx/venvs/bibtexautocomplete/lib/python3.11/site-packages/bibtexparser/latexenc.py", line 35, in _replace_latex
 |     if unicodedata.combining(unicod):
 |        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 | TypeError: combining() argument must be a unicode character, not str
 | 
 | You can report this bug at https://github.com/dlesbre/bibtex-autocomplete/issues
 | 
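
The last frame in both tracebacks shows `unicodedata.combining` receiving a string whose length is not exactly one; that function accepts a single character only. The guard below illustrates the failure mode with a hypothetical helper, `is_combining`. It is not the fix applied in bibtexparser or btac.

```python
import unicodedata


def is_combining(text: str) -> bool:
    """True only for a single combining character; never raises on other lengths."""
    return len(text) == 1 and unicodedata.combining(text) != 0


# Reproduce the error from the traceback with a two-character string.
try:
    unicodedata.combining("ab")
    raised = False
except TypeError:
    raised = True
```

Since the exception is raised inside the DOI check, catching `TypeError` around that call would be another way to keep the rest of the run going.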

UTF-8 decoding issue on macOS and Linux

Hi,

I'm trying to autocomplete the DOIs in a bib file and get a crash both on macOS (Ventura 13.2.1 / Python 3.10.2) and Linux (Ubuntu 20.04.5 LTS / Python 3.8.10). On both systems btac is pip-installed in its own venv and crashes with (minor path variations on) the following error:

Querying databases: |██████████████████████████████████████⚠︎ | (!) [95%] in 35:47.3
Traceback (most recent call last):
  File "/home/leon/src/bibtex-autocomplete/venv/bin/btac", line 8, in <module>
    sys.exit(main())
  File "/home/leon/src/bibtex-autocomplete/venv/lib/python3.8/site-packages/bibtexautocomplete/core/main.py", line 106, in main
    completer.autocomplete(args.verbose < 0)
  File "/home/leon/src/bibtex-autocomplete/venv/lib/python3.8/site-packages/bibtexautocomplete/core/autocomplete.py", line 170, in autocomplete
    self.update_entry(entries[position], threads, position)
  File "/home/leon/src/bibtex-autocomplete/venv/lib/python3.8/site-packages/bibtexautocomplete/core/autocomplete.py", line 193, in update_entry
    to_add = self.sanitize(to_add)
  File "/home/leon/src/bibtex-autocomplete/venv/lib/python3.8/site-packages/bibtexautocomplete/core/autocomplete.py", line 235, in sanitize
    if doi_checker.query() is not True:
  File "/home/leon/src/bibtex-autocomplete/venv/lib/python3.8/site-packages/bibtexautocomplete/lookups/condition_mixin.py", line 27, in query
    return super().query()
  File "/home/leon/src/bibtex-autocomplete/venv/lib/python3.8/site-packages/bibtexautocomplete/lookups/abstract_base.py", line 119, in query
    return self.process_data(data)
  File "/home/leon/src/bibtex-autocomplete/venv/lib/python3.8/site-packages/bibtexautocomplete/APIs/doi.py", line 92, in process_data
    if self.check_url(value["data"]["value"].to_str()):
  File "/home/leon/src/bibtex-autocomplete/venv/lib/python3.8/site-packages/bibtexautocomplete/APIs/doi.py", line 101, in check_url
    text = normalize_str_weak(final.data.decode())
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 10: invalid start byte

The bibfile has 580 entries; it seems they all get queried before the crash... Does the above give any indication that could help to drill down to the offending entry?

cheers,
Leon
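
`bytes.decode()` defaults to strict UTF-8, so a single response body that is not valid UTF-8 is enough to abort the run. One possible mitigation is to decode leniently, sketched below with a hypothetical `decode_body` helper and a made-up payload. Whether lossy decoding is acceptable for the DOI check is an assumption.

```python
def decode_body(data: bytes, encoding: str = "utf-8") -> str:
    """Decode a response body, substituting U+FFFD for undecodable bytes."""
    return data.decode(encoding, errors="replace")


# Made-up payload with an invalid start byte (0x80) at position 10.
payload = b"0123456789\x80rest"

try:
    payload.decode()  # strict decoding, as in the traceback
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

text = decode_body(payload)
```

Logging the entry key or DOI alongside the failure would also answer the question of which entry is responsible.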

Decoding error on Windows

Hi,

Originally coming from "the stackexchange post", I ran into this error with the bare script there, where it struggles to open (and write) one of my .bib files. Adding encoding="utf8" to the with open(...) (and write()) fixed it in the script.

The same error appears with this package on Windows; on a Linux VM, however, the package works out of the box on the same .bib file.

This is the error message:

Reading file 1 / 1 from 'References.bib'
Traceback (most recent call last):
  File "...user\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "...user\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "...user\AppData\Local\Programs\Python\Python39\lib\site-packages\bibtexautocomplete\__main__.py", line 4, in <module>
    main()
  File "...user\AppData\Local\Programs\Python\Python39\lib\site-packages\bibtexautocomplete\core\main.py", line 73, in main
    databases = BibtexAutocomplete.read(args.input)
  File "...user\AppData\Local\Programs\Python\Python39\lib\site-packages\bibtexautocomplete\core\autocomplete.py", line 273, in read
    dbs.append(file_read(file))
  File "...user\AppData\Local\Programs\Python\Python39\lib\site-packages\bibtexautocomplete\bibtex\io.py", line 72, in file_read
    bibtex = file.read()
  File "...user\AppData\Local\Programs\Python\Python39\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 8540: character maps to <undefined>

Really like the project you have going on here. It did return a couple of false entries and inaccuracies, but is overall a really convenient tool.

Tom
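
The traceback shows `file.read()` going through cp1252, the locale default on this Windows setup, because no encoding was given to `open`. A minimal sketch of the fix that worked in the script follows; `read_bibtex` is a hypothetical helper and the sample entry is invented for the demonstration.

```python
import os
import tempfile


def read_bibtex(path: str) -> str:
    """Read a .bib file as UTF-8 regardless of the platform's locale encoding."""
    with open(path, "r", encoding="utf-8") as file:
        return file.read()


# Throwaway file with a non-ASCII author name; the second UTF-8 byte of the
# first letter is 0x81, which cp1252 cannot decode.
sample = "@book{demo, author = {\u00c1lvarez, Ana}}\n"
fd, tmp_path = tempfile.mkstemp(suffix=".bib")
with os.fdopen(fd, "w", encoding="utf-8") as handle:
    handle.write(sample)

text = read_bibtex(tmp_path)
os.remove(tmp_path)
```

The same `encoding="utf-8"` argument belongs on the `open` call used for writing, so the output round-trips on every platform.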

Zenodo

Some conferences in my field routinely publish their papers on Zenodo, so I would like to implement a Zenodo lookup. I've looked at the code, and it should be similar to the DBLP API class, but I can't quite work out the expected input and output types of each method.

Documentation for the Zenodo API is here: https://developers.zenodo.org/#list36

Cannot save data

I ran btac -i my-bib.bib and it completed the search successfully.
However, when writing the file I get the error below.

(Please let me know what specs or detailed info you need.)

Querying databases: |████████████████████████████████████████| [100%] in 30:43.9 
Modified 368 / 374 entries, added 400 fields

==== Writing files =============================================================

Writing file 1 / 1 to '[PosixPath('my-bib.bib')]'
Traceback (most recent call last):
  File "/opt/homebrew/bin/btac", line 5, in <module>
    main()
  File "/opt/homebrew/lib/python3.9/site-packages/bibtexautocomplete/core/main.py", line 79, in main
    completer.write(args.output)
  File "/opt/homebrew/lib/python3.9/site-packages/bibtexautocomplete/core/autocomplete.py", line 223, in write
    wrote += file_write(file, db)
  File "/opt/homebrew/lib/python3.9/site-packages/bibtexautocomplete/bibtex/io.py", line 56, in file_write
    with open(filepath, "w") as file:
TypeError: expected str, bytes or os.PathLike object, not list
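
The log line `Writing file 1 / 1 to '[PosixPath('my-bib.bib')]'` suggests that a list of paths reached `open()` where a single path was expected. The sketch below shows one way to flatten such a nested value before writing. `flatten_paths` is a hypothetical helper, and how btac actually builds `args.output` is an assumption.

```python
from pathlib import Path
from typing import Iterable, List, Union

PathLike = Union[str, Path]


def flatten_paths(items: Iterable[Union[PathLike, Iterable[PathLike]]]) -> List[Path]:
    """Turn a possibly nested collection of paths into a flat list of Path objects."""
    flat: List[Path] = []
    for item in items:
        if isinstance(item, (list, tuple)):
            flat.extend(Path(p) for p in item)
        else:
            flat.append(Path(item))
    return flat


nested = [[Path("my-bib.bib")]]  # shape implied by the log line
outputs = flatten_paths(nested)
```

Each element of `outputs` can then be passed to `open()` on its own.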

BibTeX-compatible articles

Hi, I see that btac adds publisher to the @article entry type.
This is compatible with neither BibTeX nor BibLaTeX.

Running biber --validate-datamodel thesis yields warnings about the invalid field, e.g.:

WARN - Datamodel: article entry 'teokarevic2011ten' (refs.bib): Invalid field 'publisher' for entrytype 'article'

The field is indeed not required by APA style.

Do you think a --strict mode could be implemented?
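
A strict mode could amount to filtering new fields against a per-entry-type allow-list. The sketch below only illustrates the idea: `strict_filter` is hypothetical, and `ALLOWED_FIELDS` is a deliberately tiny, made-up subset, not the real BibTeX or BibLaTeX data model.

```python
from typing import Dict, Set, Tuple

# Hypothetical allow-list for illustration only; a real implementation would
# need the complete data model of the targeted backend.
ALLOWED_FIELDS: Dict[str, Set[str]] = {
    "article": {"author", "title", "journal", "year", "volume",
                "number", "pages", "doi"},
}


def strict_filter(entry_type: str,
                  fields: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split `fields` into (kept, dropped) according to the allow-list.

    Entry types missing from the allow-list are passed through unchanged.
    """
    allowed = ALLOWED_FIELDS.get(entry_type.lower())
    if allowed is None:
        return dict(fields), {}
    kept = {k: v for k, v in fields.items() if k.lower() in allowed}
    dropped = {k: v for k, v in fields.items() if k.lower() not in allowed}
    return kept, dropped


kept, dropped = strict_filter("article", {"title": "Ten", "publisher": "ACME"})
```

Returning the dropped fields separately lets the tool report what it skipped instead of discarding it silently.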

Don't set URL to dx.doi.org

When URL is dx.doi.org/10.nnnnn/mmmmmm (or just doi.org/...), this is redundant with DOI information. It should be detected and moved to the doi field, or just removed if the DOI is already set.
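
The detection described above can be sketched as follows. `doi_from_url` and `normalize_url_field` are hypothetical helpers, the host list and the DOI pattern are simplifying assumptions, and any query string or fragment in the URL is ignored.

```python
import re
from typing import Dict, Optional
from urllib.parse import unquote, urlparse

DOI_HOSTS = {"doi.org", "dx.doi.org", "www.doi.org"}
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")  # simplified DOI shape


def doi_from_url(url: str) -> Optional[str]:
    """Return the DOI embedded in a doi.org / dx.doi.org URL, else None."""
    # Tolerate scheme-less values such as "dx.doi.org/10.nnnnn/mmmmmm".
    parsed = urlparse(url if "://" in url else "https://" + url)
    if (parsed.hostname or "").lower() not in DOI_HOSTS:
        return None
    candidate = unquote(parsed.path.lstrip("/"))
    return candidate if DOI_PATTERN.match(candidate) else None


def normalize_url_field(fields: Dict[str, str]) -> Dict[str, str]:
    """Drop a URL that only restates a DOI, moving it to `doi` if that is unset."""
    result = dict(fields)
    doi = doi_from_url(result.get("url", ""))
    if doi is not None:
        del result["url"]
        result.setdefault("doi", doi)
    return result


moved = normalize_url_field({"url": "https://dx.doi.org/10.1007/s10703-022-00405-8"})
```

URLs on any other host are left untouched, and an existing doi field is never overwritten.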

btac removes capitalization

For example, title = {{HermiT}: {A}n {OWL} 2 reasoner}, becomes title = {HermiT: An OWL 2 reasoner},, which some bibliography styles incorrectly format in all lower case.
Is there a way to tell btac to keep the capitalization in place?
