ferru97 / PyPaperBot
PyPaperBot is a Python tool for downloading scientific papers using Google Scholar, Crossref, and SciHub.
License: MIT License
I ran into a situation where different articles with the same keys appear in the bibtex.bib file. For example:
@inproceedings{Hosseini_2016,
doi = {10.1109/ism.2016.0028},
url = {https://doi.org/10.1109%2Fism.2016.0028},
year = 2016,
month = {dec},
publisher = {{IEEE}},
author = {Mohammad Hosseini and Viswanathan Swaminathan},
title = {Adaptive 360 {VR} Video Streaming: Divide and Conquer},
booktitle = {2016 {IEEE} International Symposium on Multimedia ({ISM})}
}
@inproceedings{Hosseini_2016,
doi = {10.1109/ism.2016.0093},
url = {https://doi.org/10.1109%2Fism.2016.0093},
year = 2016,
month = {dec},
publisher = {{IEEE}},
author = {Mohammad Hosseini and Viswanathan Swaminathan},
title = {Adaptive 360 {VR} Video Streaming Based on {MPEG}-{DASH} {SRD}},
booktitle = {2016 {IEEE} International Symposium on Multimedia ({ISM})}
}
Because of this, I cannot correctly process the records with a BibTeX parsing library: the parser assumes that entries sharing a key refer to the same article, which is not the case here. Is there a way to avoid assigning the same key to different articles? For example, an option that appends a sequence number or random characters to the key.
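Until the tool does this itself, duplicate keys can be patched after the fact. A rough sketch (not part of PyPaperBot) that appends a counter to repeated entry keys:

```python
import re

def deduplicate_bibtex_keys(bibtex_text):
    """Append a numeric suffix to repeated BibTeX entry keys.

    Illustrative only: a second `@inproceedings{Hosseini_2016,` becomes
    `@inproceedings{Hosseini_2016_2,` so parsers see unique keys.
    """
    seen = {}

    def rename(match):
        entry_type, key = match.group(1), match.group(2)
        seen[key] = seen.get(key, 0) + 1
        if seen[key] > 1:
            key = f"{key}_{seen[key]}"
        return f"@{entry_type}{{{key},"

    # Match the "@type{key," that opens each entry; field lines have no "@type{".
    return re.sub(r"@(\w+)\{([^,]+),", rename, bibtex_text)
```

Running this over the file above would leave the first `Hosseini_2016` untouched and rename the second to `Hosseini_2016_2`.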
If you call it without arguments, it tells you:
Error, provide at least one of the following arguments: --query or --file
The correct argument appears to be `--doi-file` (not `--file`).
Hello,
Thank you for your tool. It is magnificent and very useful. I just want to highlight a minor thing: when it is searching through the list of DOIs and can't find one, this causes an error when it later tries to download that paper, and the program stops.
Hi!
Is there any reason the .bib file is saved in latin-1 encoding?
PyPaperBot/PyPaperBot/Paper.py
Line 96 in ee5b502
Why not utf-8? Because of this, I have to change the encoding of the file before opening it.
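The change this issue asks for amounts to passing an explicit encoding when the .bib file is opened. A minimal sketch (the function name and arguments are placeholders, not PyPaperBot's actual code):

```python
def write_bibtex(path, bibtex_text):
    # Write with an explicit UTF-8 encoding so author names with accented
    # characters survive, instead of being written (or failing) in Latin-1.
    with open(path, "w", encoding="utf-8") as f:
        f.write(bibtex_text)
```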
I encountered an error using a .txt with 12 DOIs that traces back to the regular expressions in `Paper.py`. The `re` package won't download because it has been deprecated. Could you update the `Paper` module to import `regex` instead of `re`?
I can't understand whether the proxy is used only for downloading papers, or also for Crossref?
I'd like it to be used for both, so that frequent use doesn't get blocked.
Hello!
I got this error while downloading
Download 202 of 8701 -> None
Traceback (most recent call last):
File "C:\Users\kir-m\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "C:\Users\kir-m\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\kir-m\AppData\Local\Programs\Python\Python37\lib\site-packages\PyPaperBot\__main__.py", line 122, in <module>
main()
File "C:\Users\kir-m\AppData\Local\Programs\Python\Python37\lib\site-packages\PyPaperBot\__main__.py", line 118, in main
start(args.query, args.scholar_pages, dwn_dir, args.min_year , max_dwn, max_dwn_type , args.journal_filter, args.restrict, DOIs)
File "C:\Users\kir-m\AppData\Local\Programs\Python\Python37\lib\site-packages\PyPaperBot\__main__.py", line 45, in start
downloadPapers(to_download, dwn_dir, num_limit)
File "C:\Users\kir-m\AppData\Local\Programs\Python\Python37\lib\site-packages\PyPaperBot\Downloader.py", line 62, in downloadPapers
pdf_dir = getSaveDir(dwnl_dir, p.getFileName())
File "C:\Users\kir-m\AppData\Local\Programs\Python\Python37\lib\site-packages\PyPaperBot\Paper.py", line 31, in getFileName
return re.sub('[^\w\-_\. ]', '_', self.title)+".pdf"
File "C:\Users\kir-m\AppData\Local\Programs\Python\Python37\lib\re.py", line 192, in sub
return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object
I understand that all download problems are difficult to fix, but I need to download quite a few articles. I would like an option, or default behavior, where such errors do not abort the run but are written to a log instead. I think this is easy to do by adding a try/except.
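The try/except this report suggests could look like the sketch below. `download_one` and the paper objects are hypothetical stand-ins, and `safe_filename` mirrors the failing `Paper.getFileName` pattern while adding a fallback for the `None` title that triggers the `TypeError` above:

```python
import logging
import re

def safe_filename(title, fallback="untitled"):
    # Paper.getFileName crashes when title is None; fall back instead.
    if not isinstance(title, str):
        title = fallback
    return re.sub(r"[^\w\-_. ]", "_", title) + ".pdf"

def download_all(papers, download_one):
    # Log failures and keep going instead of aborting the whole run.
    for paper in papers:
        try:
            download_one(paper)
        except Exception:
            logging.exception("Download failed for %r, skipping", paper)
```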
Hi,
Thank you so much for this nice tool. I wanted to download papers in HTML format, how can I use it for such a purpose?
Thanks.
Hello and thanks for making this tool. So I encountered an error while trying to download a paper, here is the output
$ python -m PyPaperBot --query="Machine Learning" --scholar-pages=1 --min-year=2020 --dwn-dir="~/current"
PyPaperBot is a Python tool for downloading scientific papers using Google Scholar, Crossref and SciHub.
Query: Machine Learning
Google Scholar page 1 : 5 papers found
Searching paper 1 of 5 on Crossref...
Searching paper 2 of 5 on Crossref...
Searching paper 3 of 5 on Crossref...
Searching paper 4 of 5 on Crossref...
Searching paper 5 of 5 on Crossref...
Papers found on Crossref: 4/5
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/hskalin/.local/lib/python3.8/site-packages/PyPaperBot/__main__.py", line 122, in <module>
main()
File "/home/hskalin/.local/lib/python3.8/site-packages/PyPaperBot/__main__.py", line 118, in main
start(args.query, args.scholar_pages, dwn_dir, args.min_year , max_dwn, max_dwn_type , args.journal_filter, args.restrict, DOIs)
File "/home/hskalin/.local/lib/python3.8/site-packages/PyPaperBot/__main__.py", line 37, in start
to_download = filter_min_date(to_download,min_date)
File "/home/hskalin/.local/lib/python3.8/site-packages/PyPaperBot/PapersFilters.py", line 50, in filter_min_date
if paper.sc_year!=None and int(paper.sc_year)>=min_year:
AttributeError: 'Paper' object has no attribute 'sc_year'
So what might be causing this?
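The crash comes from `filter_min_date` touching `sc_year` on a `Paper` that never got one (the Scholar parse evidently failed for that result). One defensive fix, sketched here rather than taken from the codebase, is to treat a missing `sc_year` as "no year" via `getattr`:

```python
def filter_min_date(papers, min_year):
    # Treat papers with no parsed year as non-matching, instead of
    # raising AttributeError when sc_year was never set.
    kept = []
    for paper in papers:
        year = getattr(paper, "sc_year", None)
        if year is not None and int(year) >= min_year:
            kept.append(paper)
    return kept
```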
Please add heroku support so that we can deploy it on heroku and use it on telegram
Use some service/website to automatically detect a Sci-Hub working link
Hi,
Thanks for your excellent tools for paper download.
Could you add a function that skips papers already downloaded to the folder?
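A skip-if-present check is a small addition; sketched here as a hypothetical helper the download loop could consult before fetching:

```python
import os

def needs_download(dwn_dir, filename):
    # Skip papers whose PDF already exists in the download folder;
    # a zero-byte file is treated as a failed earlier attempt and retried.
    path = os.path.join(dwn_dir, filename)
    return not (os.path.isfile(path) and os.path.getsize(path) > 0)
```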
Best wishes
Hi, thanks for this nice package.
I was wondering how I can provide a Google Scholar advanced search string?
I would like something like: --query="string 1 "string 2 that is an exact phrase""
Also, how can I add --max-year, like --min-year, so that I can limit my search to a time window between [min-year, max-year]?
TIA.
The package only ever detects and downloads files from up to 10 pages.
Hi, this script is great, although I need to generate a list of active URLs where people can access the PDFs rather than just downloading the PDFs locally to my computer. Could we add a new parameter that simply copies the URL of each PDF (which the script already knows and uses) into the spreadsheet output? Thanks.
Good morning,
I am trying to download a pdf of a science paper with this code line:
!python -m PyPaperBot --query="10.1038/s41598-023-43091-0" --scholar-pages=2 --dwn-dir="path/to/download/dir"
The query string is a DOI and if you search for it in Google Scholar it does find only one paper (which is the one I am searching for).
Unfortunately, the code line gives me this error:
Query: 10.1038/s41598-023-43091-0
Google Scholar page 1 : 10 papers found
Paper not found...
Google Scholar page 2 : 10 papers found
Paper not found...
Work completed!
I tried to use the title of the paper, but it does not work. I tried the URL, but again it returns an error.
How can I fix it? Can you help me please?
I am using Colab right now, Python 3.10 and I would like to use Google Scholar option and not Scihub.
Thank you so much in advance!
Matteo
I have successfully installed all dependencies, ensured correct configuration settings, and the application runs without any immediate errors. However, the papers still aren't downloading.
result.csv
Downloading papers from DOIs
Searching paper 1 of 13 with DOI 10.1108/IJIS-01-2021-0022
Python 3
Searching paper 2 of 13 with DOI 10.3390/app11219816
Python 3
Searching paper 3 of 13 with DOI 10.1007/s10457-017-0145-y
Python 3
Searching paper 4 of 13 with DOI 10.1016/j.deveng.2018.07.001
Python 3
Searching paper 5 of 13 with DOI 10.12775/EQ.2017.006
Python 3
Searching paper 6 of 13 with DOI 10.1016/j.aquaculture.2016.05.012
Python 3
Searching paper 7 of 13 with DOI 10.1080/14754835.2013.754293
Python 3
Searching paper 8 of 13 with DOI 10.4113/jom.2010.1086
Python 3
Searching paper 9 of 13 with DOI 10.1016/j.foodpol.2006.05.005
Python 3
Searching paper 10 of 13 with DOI 10.1142/9789812703040_0140
Python 3
Searching paper 11 of 13 with DOI 10.1038/s41598-023-33042-0
Python 3
Searching paper 12 of 13 with DOI 10.1016/j.still.2023.105744
Python 3
Searching paper 13 of 13 with DOI 10.1016/j.ecoinf.2023.102075
Python 3
Using https://sci-hub.shop as Sci-Hub instance
Download 1 of 13 -> The intertwined relationship of shadow banking and commercial banks’ deposit growth: evidence from India
Download 2 of 13 -> A Novel Approach in Prediction of Crop Production Using Recurrent Cuckoo Search Optimization Neural Networks
Download 3 of 13 -> FAO guidelines and geospatial application for agroforestry suitability mapping: case study of Ranchi, Jharkhand state of India
Download 4 of 13 -> Sustainable development as successful technology transfer: Empowerment through teaching, learning, and using digital participatory mapping techniques in Mazvihwa, Zimbabwe
Download 5 of 13 -> Land Evaluation in terms of Agroforestry Suitability, an Approach to Improve Livelihood and Reduce Poverty: A FAO based Methodology by Geospatial Solution: A case study of Palamu district, Jharkhand, India
Download 6 of 13 -> Hierarchical clustering and partitioning to characterize shrimp grow-out farms in northeast Brazil
Download 7 of 13 -> Fictions of Humanitarian Responsibility: Narrating Microfinance
Download 8 of 13 -> Roads to Participatory Planning: Integrating Cognitive Mapping and GIS for Transport Prioritization in Rural Lesotho
Download 9 of 13 -> Growth options and poverty reduction in Ethiopia – An economy-wide model analysis
Download 10 of 13 -> An Integrated Approach of Remote Sensing and GIS to Poverty Alleviation and Coastal Development in Cox’s Bazar, Bangladesh
Download 11 of 13 -> Towards reducing chemical usage for weed control in agriculture using UAS imagery analysis and computer vision techniques
Download 12 of 13 -> Delineation and optimization of cotton farmland management zone based on time series of soil-crop properties at landscape scale in south Xinjiang, China
Download 13 of 13 -> Machine learning-based spatial-temporal assessment and change transition analysis of wetlands: An application of Google Earth Engine in Sylhet, Bangladesh (1985–2022)
Work completed!
If you like this project, you can offer me a cup of coffee at --> https://www.paypal.com/paypalme/ferru97 <-- :)
Is someone else facing this issue? Am I missing some step? Should we explicitly add api keys to some page?
--scholar_start is used to choose the page from which the search on Scholar starts.
Hello.
I am trying to download 100 PDFs from DOIs using PyPaperBot, but only 41 get downloaded and I get this error.
Here are the error messages. It finished downloading number 40 and then printed: TypeError: expected string or bytes-like object. Detailed error below.
Thanks a lot in advance.
Download 40 of 100 -> Biochar decreased rhizodeposits stabilization via opposite effects on bacteria and fungi: diminished fungi-promoted aggregation and enhanced bacterial mineralization
Download 41 of 100 -> None
Traceback (most recent call last):
File "/mnt/home/bandopad/miniconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/mnt/home/bandopad/miniconda3/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/mnt/ufs18/rs-033/ShadeLab/WorkingSpace/Bandopadhyay_WorkingSpace/metaanalysis_doi/environment/lib/python3.7/site-packages/PyPaperBot/__main__.py", line 122, in <module>
main()
File "/mnt/ufs18/rs-033/ShadeLab/WorkingSpace/Bandopadhyay_WorkingSpace/metaanalysis_doi/environment/lib/python3.7/site-packages/PyPaperBot/__main__.py", line 118, in main
start(args.query, args.scholar_pages, dwn_dir, args.min_year , max_dwn, max_dwn_type , args.journal_filter, args.restrict, DOIs)
File "/mnt/ufs18/rs-033/ShadeLab/WorkingSpace/Bandopadhyay_WorkingSpace/metaanalysis_doi/environment/lib/python3.7/site-packages/PyPaperBot/__main__.py", line 45, in start
downloadPapers(to_download, dwn_dir, num_limit)
File "/mnt/ufs18/rs-033/ShadeLab/WorkingSpace/Bandopadhyay_WorkingSpace/metaanalysis_doi/environment/lib/python3.7/site-packages/PyPaperBot/Downloader.py", line 62, in downloadPapers
pdf_dir = getSaveDir(dwnl_dir, p.getFileName())
File "/mnt/ufs18/rs-033/ShadeLab/WorkingSpace/Bandopadhyay_WorkingSpace/metaanalysis_doi/environment/lib/python3.7/site-packages/PyPaperBot/Paper.py", line 31, in getFileName
return re.sub('[^\w\-_\. ]', '_', self.title)+".pdf"
File "/mnt/home/bandopad/miniconda3/lib/python3.7/re.py", line 192, in sub
return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object
(environment) (base) -bash-4.2$
The script currently searches for all articles regardless of the year, and then filters them if the --min-year option is specified. Because of this, far fewer articles are downloaded from a page than actually match. To get around this, I use a trick like this:
python -m PyPaperBot --query="stereoscopic&as_ylo=2010" --scholar-pages=10 --dwn-dir="./"
It would be cool to set the as_ylo option inside the script itself.
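Scholar already accepts year bounds as URL parameters, so the filtering could happen server-side. A sketch of building the query URL with `as_ylo` (and `as_yhi` for a `--max-year`); the URL layout here is the public Scholar one, not PyPaperBot's internal code:

```python
from urllib.parse import urlencode

def scholar_url(query, page=0, min_year=None, max_year=None):
    # `start` is the 0-based result offset; Scholar shows 10 results per page.
    params = {"q": query, "start": page * 10}
    if min_year is not None:
        params["as_ylo"] = min_year  # lower year bound
    if max_year is not None:
        params["as_yhi"] = max_year  # upper year bound
    return "https://scholar.google.com/scholar?" + urlencode(params)
```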
Sounds like users are having similar problems with downloading and there aren't many updates.
Good day! When I try to download an article, it only creates the bibtex and csv files.
python -m PyPaperBot --doi=":10.4304/jetwi.2.3.258-268" --dwn-dir="/home/___/Desktop/Thesis/Experiment"
Downloading papers from DOIs
Searching paper 1 of 1 with DOI :10.4304/jetwi.2.3.258-268
Python 3
Using https://sci-hub.ee as Sci-Hub instance
Download 1 of 1 -> A Survey of Text Summarization Extractive Techniques
Work completed!
Hello!
Is there any reason for this line to exist? Because of it, after changing the IP, the previous page is downloaded again.
PyPaperBot/PyPaperBot/Scholar.py
Line 33 in a380ee0
When using the --query option, each article is downloaded 10 times and some are skipped.
Apologies for not getting back to you sooner.
I switched to a different computer and managed to get two separate downloads into my specified directory. The only thing is, I now have two instances of a 'bibtex.bib' and an Excel file named 'result', which simply contains the book's information in separate fields.
I tried changing the download directory and used a different DOI from a different article: same result, same two files. Any help would be appreciated.
Restrict mode 0 downloads the papers anyway.
Hello, I've been attempting to download books via the --doi command, but after inputting the relevant information for the DOI number and correct download dir, I get a
FileNotFoundError: [Errno 2] No such file or directory: 'C:/Users/XYZ/Downloads/result.csv'
Any help would be appreciated, thanks
As a result of the fix in #45, executing the command with or without --max-dwn-cites=10
!python -m PyPaperBot --query="Machine learning" --scholar-pages=1 --min-year=2018 --max-dwn-cites=10 --dwn-dir="\content\papers" --scihub-mirror="https://sci-hub.do"
Now results in
PyPaperBot is a Python tool for downloading scientific papers using Google Scholar, Crossref and SciHub.
If you like this project, you can give me a cup of coffee at --> https://www.paypal.com/paypalme/ferru97 <-- :)
Query: Machine learning
Google Scholar page 1 : 10 papers found
Paper not found...
Traceback (most recent call last):
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.7/dist-packages/PyPaperBot/__main__.py", line 148, in <module>
main()
File "/usr/local/lib/python3.7/dist-packages/PyPaperBot/__main__.py", line 145, in main
start(args.query, args.scholar_results, scholar_pages, dwn_dir, proxy, args.min_year , max_dwn, max_dwn_type , args.journal_filter, args.restrict, DOIs, args.scihub_mirror)
File "/usr/local/lib/python3.7/dist-packages/PyPaperBot/__main__.py", line 48, in start
Paper.generateReport(to_download,dwn_dir+"result.csv")
File "/usr/local/lib/python3.7/dist-packages/PyPaperBot/Paper.py", line 65, in generateReport
with open(path, mode="w", encoding='utf-8', newline='', buffering=1) as w_file:
FileNotFoundError: [Errno 2] No such file or directory: '/content/papers/result.csv'
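The path in the traceback is built as `dwn_dir+"result.csv"`, which breaks when the download directory lacks a trailing separator or doesn't exist yet. A sketch of a safer construction (not the project's actual code):

```python
import os

def report_path(dwn_dir, name="result.csv"):
    # Expand ~, create the directory if needed, and join with the
    # platform separator instead of concatenating strings.
    dwn_dir = os.path.expanduser(dwn_dir)
    os.makedirs(dwn_dir, exist_ok=True)
    return os.path.join(dwn_dir, name)
```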
Hi,
Your software works great, but it is a little bit slow when searching for queries on google scholar. Is it possible to parallelize for example the search on the single pages?
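Since each result page is fetched independently, the pages could be requested concurrently. A sketch using a small thread pool, with `fetch_page` as a hypothetical stand-in for the Scholar request; note that parallel requests make Google's rate limiting more likely, so the pool is deliberately kept small:

```python
from concurrent.futures import ThreadPoolExecutor

def search_pages(query, pages, fetch_page, max_workers=3):
    # Fetch several Scholar result pages concurrently; map() returns
    # results in page order regardless of completion order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda p: fetch_page(query, p), pages))
```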
When trying to download papers using DOI I got the following error:
C:\Users\sparadis>python -m PyPaperBot --doi="10.0086/s41037-711-0132-1" --dwn-dir="C:\User\example\papers"
PyPaperBot is a Python tool for downloading scientific papers using Google Scholar, Crossref and SciHub.
If you like this project, you can give me a cup of coffee at --> https://www.paypal.com/paypalme/ferru97 <-- :)
Traceback (most recent call last):
File "C:\Users\sparadis\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\sparadis\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\sparadis\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPaperBot\__main__.py", line 139, in <module>
main()
File "C:\Users\sparadis\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPaperBot\__main__.py", line 136, in main
start(args.query, scholar_pages, dwn_dir, args.min_year , max_dwn, max_dwn_type , args.journal_filter, args.restrict, DOIs, args.scihub_mirror)
UnboundLocalError: local variable 'scholar_pages' referenced before assignment
The search is sometimes rate-limited, but PyPaperBot's response is simply "Paper not found...". In this case, PyPaperBot should display the message returned in the HTML response (see below). Optionally, for those working on their own local network, an option could be added to open the URL in a browser and solve the CAPTCHA there.
Response:
HTML status code: 429
HTML response:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head><meta http-equiv="content-type" content="text/html; charset=utf-8"><meta name="viewport" content="initial-scale=1"><title>https://scholar.google.com/scholar?hl=en&q=abc%22&as_vis=1&as_sdt=1,5&start=380</title></head>
<body style="font-family: arial, sans-serif; background-color: #fff; color: #000; padding:20px; font-size:18px;" onload="e=document.getElementById('captcha');if(e){e.focus();} if(solveSimpleChallenge) {solveSimpleChallenge(,);}">
<div style="max-width:400px;">
<hr noshade size="1" style="color:#ccc; background-color:#ccc;"><br>
<form id="captcha-form" action="index" method="post">
<noscript>
<div style="font-size:13px;">
In order to continue, please enable javascript on your web browser.
</div>
</noscript>
<script src="https://www.google.com/recaptcha/api.js" async defer></script>
<script>var submitCallback = function(response) {document.getElementById('captcha-form').submit();};</script>
<div id="recaptcha" class="g-recaptcha" data-sitekey="6LfwuyUTAAAAAOAmoS0fdqijC2PbbdH4kjq62Y1b" data-callback="submitCallback" data-s="KskQ5aUxKskQnKskQaho-qeu-uodlwNquodlPzUtHOt0SgxuodlK-LDBK8m5HPeJXBMS9x8m5HPeJI0J2v8m5HPeJltxo_1M0kQRb8m5HPeJbfd8pHy0kNPRa2Z_RFJpvQHAs6zrLM1aI5Lca58_waI5Lca51aI5Lca5x3IDmu1ffftae0mAEAvsm4Un_7xFpkcSr7xFpkcSFkD7xFpkcSVwXjYIOOdb_jc"></div>
<input type='hidden' name='q' value='NWU4I8GNWU4I8GIhAxOzGIhAxOZT-uGIhAxOMgFy'><input type="hidden" name="continue" value="https://scholar.google.com/scholar?hl=en&q=abc%22&as_vis=1&as_sdt=1,5&start=380">
</form>
<hr noshade size="1" style="color:#ccc; background-color:#ccc;">
<div style="font-size:13px;">
<b>About this page</b><br><br>
Our systems have detected unusual traffic from your computer network. This page checks to see if it's really you sending the requests, and not a robot. <a href="#" onclick="document.getElementById('infoDiv').style.display='block';">Why did this happen?</a><br><br>
<div id="infoDiv" style="display:none; background-color:#eee; padding:10px; margin:0 0 15px 0; line-height:1.4em;">
This page appears when Google automatically detects requests coming from your computer network which appear to be in violation of the <a href="//www.google.com/policies/terms/">Terms of Service</a>. The block will expire shortly after those requests stop. In the meantime, solving the above CAPTCHA will let you continue to use our services.<br><br>This traffic may have been sent by malicious software, a browser plug-in, or a script that sends automated requests. If you share your network connection, ask your administrator for help — a different computer using the same IP address may be responsible. <a href="//support.google.com/websearch/answer/86640">Learn more</a><br><br>Sometimes you may be asked to solve the CAPTCHA if you are using advanced terms that robots are known to use, or sending requests very quickly.
</div>
IP address: xxx.xxx.xxx.xxx<br>Time: 2022-01-30T12:01:36Z<br>URL: https://scholar.google.com/scholar?hl=en&q=abc&as_vis=1&as_sdt=1,5&start=380<br>
</div>
</div>
</body>
</html>
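Surfacing the rate limit could be as simple as checking the status code and page content before concluding "Paper not found...". A sketch, assuming the fetch returns an object with `status_code`, `text`, and `url` attributes (e.g. a `requests.Response`):

```python
def check_scholar_response(response):
    # Distinguish Scholar's CAPTCHA/rate-limit page from a genuinely
    # empty result set, and tell the user what to do about it.
    if response.status_code == 429 or "CAPTCHA" in response.text:
        raise RuntimeError(
            "Google Scholar is rate-limiting this IP (HTTP %d). "
            "Open the URL in a browser to solve the CAPTCHA, or wait "
            "and retry: %s" % (response.status_code, response.url)
        )
```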
Hi, is there any opportunity to have this grab abstracts? That would be extremely convenient and helpful. Let me know what you think.
Hi Vito, This is a neat tool you've got here, I came across this error when I tried to use --max-dwn-cites
Environment: Google Colab
Input:
!python -m PyPaperBot --query="Machine learning" --scholar-pages=1 --min-year=2018 --max-dwn-cites=10 --dwn-dir="\content\papers" --scihub-mirror="https://sci-hub.do"
Output:
PyPaperBot is a Python tool for downloading scientific papers using Google Scholar, Crossref and SciHub.
If you like this project, you can give me a cup of coffee at --> https://www.paypal.com/paypalme/ferru97 <-- :)
Query: Machine learning
Google Scholar page 1 : 10 papers found
Searching paper 1 of 9 on Crossref...
Searching paper 2 of 9 on Crossref...
Python 3
Searching paper 3 of 9 on Crossref...
Python 3
Searching paper 4 of 9 on Crossref...
Python 3
Searching paper 5 of 9 on Crossref...
Python 3
Searching paper 6 of 9 on Crossref...
Python 3
Searching paper 7 of 9 on Crossref...
Python 3
Python 3
Searching paper 8 of 9 on Crossref...
Python 3
Searching paper 9 of 9 on Crossref...
Python 3
Papers found on Crossref: 8/9
Traceback (most recent call last):
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.7/dist-packages/PyPaperBot/__main__.py", line 148, in <module>
main()
File "/usr/local/lib/python3.7/dist-packages/PyPaperBot/__main__.py", line 145, in main
start(args.query, args.scholar_results, scholar_pages, dwn_dir, proxy, args.min_year , max_dwn, max_dwn_type , args.journal_filter, args.restrict, DOIs, args.scihub_mirror)
File "/usr/local/lib/python3.7/dist-packages/PyPaperBot/__main__.py", line 43, in start
to_download.sort(key=lambda x: int(x.sc_cites) if x.sc_cites!=None else 0, reverse=True)
File "/usr/local/lib/python3.7/dist-packages/PyPaperBot/__main__.py", line 43, in <lambda>
to_download.sort(key=lambda x: int(x.sc_cites) if x.sc_cites!=None else 0, reverse=True)
AttributeError: 'Paper' object has no attribute 'sc_cites'
Also when I tried to use --max-dwn-year
Input:
!python -m PyPaperBot --query="Machine learning" --scholar-pages=1 --min-year=2018 --max-dwn-year=10 --dwn-dir="\content\papers" --scihub-mirror="https://sci-hub.do"
Output:
PyPaperBot is a Python tool for downloading scientific papers using Google Scholar, Crossref and SciHub.
If you like this project, you can give me a cup of coffee at --> https://www.paypal.com/paypalme/ferru97 <-- :)
Query: Machine learning
Google Scholar page 1 : 10 papers found
Searching paper 1 of 9 on Crossref...
Searching paper 2 of 9 on Crossref...
Python 3
Searching paper 3 of 9 on Crossref...
Python 3
Searching paper 4 of 9 on Crossref...
Python 3
Searching paper 5 of 9 on Crossref...
Python 3
Searching paper 6 of 9 on Crossref...
Python 3
Searching paper 7 of 9 on Crossref...
Python 3
Python 3
Searching paper 8 of 9 on Crossref...
Python 3
Searching paper 9 of 9 on Crossref...
Python 3
Papers found on Crossref: 8/9
Traceback (most recent call last):
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.7/dist-packages/PyPaperBot/__main__.py", line 148, in <module>
main()
File "/usr/local/lib/python3.7/dist-packages/PyPaperBot/__main__.py", line 145, in main
start(args.query, args.scholar_results, scholar_pages, dwn_dir, proxy, args.min_year , max_dwn, max_dwn_type , args.journal_filter, args.restrict, DOIs, args.scihub_mirror)
File "/usr/local/lib/python3.7/dist-packages/PyPaperBot/__main__.py", line 40, in start
to_download.sort(key=lambda x: int(x.sc_year) if x.sc_year!=None else 0, reverse=True)
File "/usr/local/lib/python3.7/dist-packages/PyPaperBot/__main__.py", line 40, in <lambda>
to_download.sort(key=lambda x: int(x.sc_year) if x.sc_year!=None else 0, reverse=True)
AttributeError: 'Paper' object has no attribute 'sc_year'
What's the reason for this? 🐺