Comments (8)

proverbs53 commented on July 26, 2024

I wrapped the whole body of download_book in an if-statement (if req.headers['Content-Type'] == 'application/pdf':) and got the following warnings:

So those are indeed the only three books that have been revoked.

from springer_free_books.

astrodextro commented on July 26, 2024

I solved this by adding KeyError to the exceptions caught in the download_books(books, folder, patches) function, at line 134 of helper.py.

from
except (OSError, IOError) as e:

to:
except (OSError, IOError, KeyError) as e:

This way when the KeyError is encountered it is caught and I get
`Overall Progress: 85%|█████████████████████████████████████████████▋ | 329/389 [1:02:36<1:29:43, 89.72s/it]'content-length'

  • Problem downloading: Introduction to Programming with Fortran, so skipping it.`

and the download continues with the next book.
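A minimal sketch of the idea, assuming a simplified loop rather than the project's actual helper.py. The names download_all, fetch_book, and fake_fetch are hypothetical stand-ins used only for illustration:

```python
def download_all(titles, fetch_book):
    """Try each title and skip the ones that raise a caught error.

    `fetch_book` is a hypothetical callable standing in for the real
    per-book download; it is not the project's function.
    """
    skipped = []
    for title in titles:
        try:
            fetch_book(title)
        except (OSError, IOError, KeyError) as e:
            # KeyError covers the failed 'content-length' header lookup.
            print(e)
            print("  * Problem downloading: {}, so skipping it.".format(title))
            skipped.append(title)
    return skipped


def fake_fetch(title):
    # Simulates the failure reported in this thread for one title.
    if title == "Introduction to Programming with Fortran":
        raise KeyError("content-length")


skipped = download_all(
    ["Some Other Book", "Introduction to Programming with Fortran"],
    fake_fetch,
)
```

Because the except clause sits inside the loop, one failing book does not stop the remaining downloads.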

proverbs53 commented on July 26, 2024

Replaced

chunk_size = 1024
file_size = int(req.headers['Content-Length'])
num_bars = file_size // chunk_size

with

chunk_size = 1024
if 'Content-Length' in req.headers:
    file_size = int(req.headers['Content-Length'])
    num_bars = file_size // chunk_size
else:
    print("Warning: missing key 'Content-Length' in request headers; taking default length of 100 for progress bar.")
    num_bars = 100

However, I got security errors when trying to push my local branch (it would be my first time contributing).
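The same fallback can be written as a small standalone function. This is only a sketch: progress_bar_length and its parameters are hypothetical names, and headers is assumed to be any mapping of response headers. A plain dict is case-sensitive, unlike the headers object in requests, so the key must match exactly here.

```python
def progress_bar_length(headers, chunk_size=1024, default=100):
    """Return the number of progress-bar steps for a response.

    Falls back to `default` when the Content-Length header is missing.
    """
    length = headers.get("Content-Length")
    if length is None:
        print("Warning: missing key 'Content-Length' in request headers; "
              "taking default length of {} for progress bar.".format(default))
        return default
    return int(length) // chunk_size


with_length = progress_bar_length({"Content-Length": "4096"})
without_length = progress_bar_length({})
```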

pokui commented on July 26, 2024

Just ran into that error too; not sure which book it was trying to download at the time.

Traceback (most recent call last):
  File "main.py", line 88, in <module>
    download_books(books, folder, patches)
  File "/usr/home/pokui/code/springer_free_books/helper.py", line 133, in download_books
libunwind: EHHeaderParser::decodeTableEntry: bad fde: CIE ID is not zero
    download_book(request, output_file, patch)
  File "/usr/home/pokui/code/springer_free_books/helper.py", line 87, in download_book
    file_size = int(req.headers['Content-Length'])
  File "/home/pokui/.local/lib/python3.7/site-packages/requests/structures.py", line 54, in __getitem__
    return self._store[key.lower()][1]
KeyError: 'content-length'

proverbs53 commented on July 26, 2024

Investigated a little bit further, but the hack will not work. In these cases the book has been split across multiple PDFs (or, actually, still seems to be behind the paywall), so the download link won't work. Content-Type in that case is 'text/html;charset=utf-8' instead of 'application/pdf'.

Files impacted (using the new indexing method):
295 "A Beginner's Guide to Scala, Object Orientation and Functional Programming"
331 "Introduction to Programming with Fortran"
388 "Advanced Guide to Python 3 Programming"
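One way to detect these cases before writing anything to disk is to inspect the Content-Type header. The sketch below is an assumption about how such a check could look, not the project's code; is_pdf_response is a hypothetical name and headers is any mapping of response headers:

```python
def is_pdf_response(headers):
    """Return True only if the response declares a PDF body."""
    content_type = headers.get("Content-Type", "")
    # Drop parameters such as ';charset=utf-8' before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/pdf"


html_page = is_pdf_response({"Content-Type": "text/html;charset=utf-8"})
real_pdf = is_pdf_response({"Content-Type": "application/pdf"})
```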

pokui commented on July 26, 2024

OK, then it means the file is in the process of being removed from the list. The answer would then be to emit an error saying that the file is no longer available for download.
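A rough sketch of that suggestion, assuming a dedicated exception type. BookUnavailableError and check_available are hypothetical names that do not exist in the project:

```python
class BookUnavailableError(Exception):
    """Raised when a download link no longer serves a PDF."""


def check_available(title, headers):
    """Raise BookUnavailableError unless the response declares a PDF."""
    content_type = headers.get("Content-Type", "")
    if not content_type.lower().startswith("application/pdf"):
        raise BookUnavailableError(
            "'{}' is no longer available for download "
            "(got Content-Type {!r}).".format(title, content_type)
        )


try:
    check_available(
        "Advanced Guide to Python 3 Programming",
        {"Content-Type": "text/html;charset=utf-8"},
    )
    message = None
except BookUnavailableError as e:
    message = str(e)
```

A caller could catch this exception alongside OSError and skip the book with a clearer message than a bare 'content-length'.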

idlogin commented on July 26, 2024

Also getting this error at the following index. Initially I thought it was due to a dropped internet/VPN connection, but restarting always gives the same result.

This is a straight dump without any filters.

:~/springer_free_books$ python3 main.py

389 titles ready to be downloaded...
Overall Progress: 75%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉ | 293/389 [02:22<00:46, 2.05it/s]
Traceback (most recent call last):
  File "main.py", line 88, in <module>
    download_books(books, folder, patches)
  File "/home/uduo/springer_free_books/helper.py", line 133, in download_books
    download_book(request, output_file, patch)
  File "/home/uduo/springer_free_books/helper.py", line 87, in download_book
    file_size = int(req.headers['Content-Length'])
  File "/home/uduo/springer_free_books/.venv/lib/python3.6/site-packages/requests/structures.py", line 54, in __getitem__
    return self._store[key.lower()][1]
KeyError: 'content-length'

proverbs53 commented on July 26, 2024

I retrieved the latest code version, and it contains this line, which defeats the exception catch because the KeyError is never raised: file_size = int(req.headers['Content-Length']) if req.headers.get('Content-Length') else 30000.

After removing that, it goes on as planned. Many kudos both for this solution of just catching the exception and for adding the retry when any exception occurs.
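The project's actual retry code is not shown in this thread, so the following is only an illustrative sketch of retrying on any exception. with_retries, attempts, and delay are hypothetical names:

```python
import time


def with_retries(action, attempts=3, delay=0.0):
    """Call `action` until it succeeds or `attempts` is exhausted."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as e:  # retry on any exception, as described
            last_error = e
            print("Attempt {} of {} failed: {!r}".format(attempt, attempts, e))
            if attempt < attempts:
                time.sleep(delay)
    # All attempts failed: re-raise the last error for the caller to handle.
    raise last_error


calls = {"n": 0}


def flaky():
    # Fails twice, then succeeds, to exercise the retry loop.
    calls["n"] += 1
    if calls["n"] < 3:
        raise OSError("temporary failure")
    return "ok"


result = with_retries(flaky, attempts=3)
```

Retrying helps with dropped connections, but it will not help for books whose link permanently returns an HTML page, which is why the skip on KeyError still matters.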