magzdb's Introduction

magzdb - magzdb.org Downloader

Installation

Install using pip

$ pip install -U magzdb

Usage

usage: magzdb [-h] [-V] -i MAGAZINE_ID [-e [EDITION [EDITION ...]]]
              [-f FILTER] [-l] [-P DIRECTORY_PREFIX] [--downloader DOWNLOADER]
              [--debug]

Magzdb.org Downloader

required arguments:
  -i MAGAZINE_ID, --id MAGAZINE_ID
                        ID of the Magazine to Download. eg. http://magzdb.org/j/<ID>.

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         Print program version and exit
  -e [EDITION [EDITION ...]], --editions [EDITION [EDITION ...]]
                        Select Edition
  -f FILTER, --filter FILTER
                        Use filter. See README#Filters
  -l, --latest          Download only latest edition.
  -P DIRECTORY_PREFIX, --directory-prefix DIRECTORY_PREFIX
                        Download directory.
  --downloader DOWNLOADER
                        Use External downloader (RECOMMENDED). Currently supported: aria2, wget, curl
  --debug               Print debug information.
  --skip-download       Don't download files.

Usage Examples

Docker

docker build . -t magzdb
docker run -v "$(pwd)":/tmp magzdb -h

# Add alias to shell (single quotes so $(pwd) is evaluated each time the alias runs)
alias magzdb='docker run -v "$(pwd)":/tmp magzdb'
magzdb -h

Download all editions

$ magzdb -i 1826

Filters

You can supply a filter with -f. For example, to download the issues whose IDs lie strictly between 4063895 and 4063901, run

$ magzdb -i 1826 -f "eid > 4063895 and eid < 4063901"

Filter expressions can use the fields eid and year.

More examples of filter expressions
  • "eid > 4063895 and eid < 4063901", or "eid >= 4063895 and eid <= 4063901" for an inclusive range
  • "eid >= 4063895" or "eid != 4063895"
  • "year >= 2018", "year <= 2018", "year == 2018" or even "year != 2018"
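To make the meaning of these expressions concrete, here is a minimal Python sketch of how such a filter could be evaluated against one edition. It is not taken from magzdb's source: the function name matches_filter and the AST-based approach are assumptions for illustration only.

```python
import ast
import operator

# Comparison operators accepted in a filter expression.
_COMPARE = {
    ast.Gt: operator.gt,
    ast.GtE: operator.ge,
    ast.Lt: operator.lt,
    ast.LtE: operator.le,
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
}


def _evaluate(node, values):
    """Recursively evaluate a restricted expression tree."""
    if isinstance(node, ast.BoolOp):
        results = [_evaluate(v, values) for v in node.values]
        return all(results) if isinstance(node.op, ast.And) else any(results)
    if isinstance(node, ast.Compare):
        left = _evaluate(node.left, values)
        for op, comparator in zip(node.ops, node.comparators):
            compare = _COMPARE.get(type(op))
            if compare is None:
                raise ValueError("unsupported comparison operator")
            right = _evaluate(comparator, values)
            if not compare(left, right):
                return False
            left = right
        return True
    if isinstance(node, ast.Name):
        if node.id not in values:
            raise ValueError(f"unknown field: {node.id}")
        return values[node.id]
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return node.value
    raise ValueError("unsupported filter syntax")


def matches_filter(expression, eid, year):
    """Return True if an edition with this eid and year passes the filter."""
    tree = ast.parse(expression, mode="eval")
    return bool(_evaluate(tree.body, {"eid": eid, "year": year}))
```

Walking the parsed tree instead of calling eval keeps the expression limited to the two fields, integer constants, comparisons, and and/or.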

Download only latest edition

$ magzdb -i 1826 -l

Download only latest edition to a custom directory (here, magazine)

$ magzdb -i 1826 -l -P magazine

Use external downloader

$ magzdb -i 1826 -l -P magazine --downloader wget

This is recommended because the internal downloader does not support resuming interrupted downloads.

Python Installation Recommendation

If you don't want to install the official Python on your system globally, you can set up a pyenv environment with the pyenv installer under your own user account. This is the preferred method for macOS users, because macOS High Sierra and later ship with the old Python 2.7.10.

Contributing

Found a bug or missing a feature? You are more than welcome to contribute.

License

MIT

magzdb's People

Contributors

coool, esurdam, larseberhart, mayurnix, model-map, pyup-bot, skyme5

magzdb's Issues

Is a range of edition possible?

  • magzdb version: 1.0.4
  • Python version: 3.7.3
  • Operating System: Debian 10.5

Description

Is it possible to download a range of editions, for example all between the years 2000 and 2020?

What I Did

For example,
magzdb -i 1234 -e 100-200, did not work.

Possible to include other sources? https://magazinelib.com?

Just curious if it is possible to adapt the script to pull from other magazine resources, like https://magazinelib.com for example? It looks slightly more complicated (I'm not a programmer though) as magazines and releases are not organized by an ID number. That said, there are many magazines there that are not available on Magzdb and it would be a great addition!

Just a thought....

Cheers!

Downloads aren't working instantly skips to next file

  • magzdb version: 1.1.8
  • Python version: 3.10
  • Operating System: Win 11

Description

Trying to do a download
magzdb -i 1798
It gets back the correct list of files to download, but it instantly reports each file as downloaded. It runs through the whole list at once, yet the folder stays empty; no actual download occurs.

Add docker to Unraid community application

First of all, thanks for the development. This is great!!!

Would be great to see native support for Unraid by adding the docker container to Unraid's application.

Currently, it can be installed via Docker Hub, but I think it would see wider use if added as a community application.

Download URL not found when multiple download links exist

  • magzdb version: 1.1.5
  • Python version: 3.9.7
  • Operating System: Ubuntu 21.10

Description

I'm trying to download an edition of Scientific American from 1866. This issue is available and has two download links. The first link is dead: it goes to a page that says the file is not available on that resource. The second download link works, but magzdb doesn't see it.

http://magzdb.org/num/3694138

Format: pdf (12.62 megabytes / 13230323 b.) Download from [freelibrary.lib] Scan
author: no
Note: no
http://magzdb.org/file/262482/dl (The file is not available on this resource.)

Format: pdf (12.62 megabytes / 13230323 b.) Download from [file.magzdb.org] Scan
author: no
Note: no
http://magzdb.org/file/463968/dl
http://file.magzdb.org/ul/2490/1866/Scientific%20American%201866%20v015%20n16.pdf

magzdb -i 2490 -e 3694138 --downloader wget
2021-12-06 13:31:07.510 | INFO     | magzdb.magzdb:download:206 - Found 1 editions of Scientific_American
2021-12-06 13:31:07.510 | INFO     | magzdb.magzdb:download:211 - Downloading year 1866 id 3694138
2021-12-06 13:31:08.259 | ERROR    | magzdb.magzdb:download:226 - Download Url not found for http://magzdb.org/num/3694138/dl
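
One way to handle an edition with several mirrors is to collect every file link on the page and try them in order. The Python sketch below illustrates that idea only. The link pattern is an assumption based on the http://magzdb.org/file/<id>/dl URLs quoted in this report, and candidate_file_ids, first_available and the resolve callback are hypothetical names, not part of magzdb.

```python
import re

# Assumed link shape, taken from the URLs quoted in the report above.
FILE_LINK = re.compile(r"file/(?P<id>\d+)/dl")


def candidate_file_ids(html):
    """Return every distinct file ID found in the page, in page order."""
    seen = []
    for match in FILE_LINK.finditer(html):
        file_id = match.group("id")
        if file_id not in seen:
            seen.append(file_id)
    return seen


def first_available(file_ids, resolve):
    """Return the first download URL that `resolve` yields, trying IDs in order.

    `resolve` is a hypothetical callback: it takes a file ID and returns a
    download URL, or None when the mirror reports the file as unavailable.
    """
    for file_id in file_ids:
        url = resolve(file_id)
        if url:
            return url
    return None
```

With the two links from this report, the first ID would resolve to nothing and the second would supply the working URL.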

Automating script for magazine download

Really love the script. A few ideas how this can be automated:

  • Have a folder where I can place magazine ID (as file without extension). This can be used to run the script automatically for all magazine IDs placed in this folder

  • Rename and tag files after download

  • Check if magazine issue already downloaded and not download again if exists
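
The first and last ideas can be sketched outside the tool with a small wrapper script. The Python below is only an illustration under stated assumptions: the helper names magazine_ids, build_commands and already_downloaded are hypothetical, and the only magzdb options used are -i, -l and -P from the usage text.

```python
from pathlib import Path


def magazine_ids(folder):
    """Return the all-digit file names in `folder`, sorted numerically.

    Each such file name is treated as one magazine ID.
    """
    names = [
        p.name for p in Path(folder).iterdir() if p.is_file() and p.name.isdigit()
    ]
    return sorted(names, key=int)


def build_commands(folder, download_dir):
    """Build one magzdb argument list per magazine ID (latest edition only)."""
    return [
        ["magzdb", "-i", magazine_id, "-l", "-P", str(download_dir)]
        for magazine_id in magazine_ids(folder)
    ]


def already_downloaded(path):
    """Return True if `path` exists as a non-empty file."""
    target = Path(path)
    return target.is_file() and target.stat().st_size > 0
```

Each argument list can be passed to subprocess.run as is, for example from a cron job.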

Download exits (seemingly before complete)

  • magzdb version: 1.1.1
  • Python version: 3.7.3
  • Operating System: Debian 10.5

Description

Magzdb quits downloading and exits with the following,

Traceback (most recent call last):
  File "/home/user01/.local/bin/magzdb", line 10, in <module>
    sys.exit(main())
  File "/home/user01/.local/lib/python3.7/site-packages/magzdb/cli.py", line 87, in main
    filter=args.filter,
  File "/home/user01/.local/lib/python3.7/site-packages/magzdb/magzdb.py", line 214, in download
    r'''<a\s*href\=\.\.\/file\/(?P<id>\d+)/dl>''',
AttributeError: 'NoneType' object has no attribute 'group'

What I Did

I've had this error several times, but in this instance I ran the following,
/home/user01/.local/bin/magzdb -i 1654 -f "year >= 2009"
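
The AttributeError suggests that a regular-expression search returned None, which is what Python's re module does when the pattern does not match, and that .group was then called on that result. A minimal sketch of a guarded lookup follows. The pattern is copied from the traceback; the function name find_file_id and the surrounding structure are assumptions, not magzdb's actual code.

```python
import re

# Pattern copied from the traceback above.
FILE_LINK = re.compile(r'''<a\s*href\=\.\.\/file\/(?P<id>\d+)/dl>''')


def find_file_id(html):
    """Return the file ID from an edition page, or None if no link matches."""
    match = FILE_LINK.search(html)
    if match is None:
        # No download link on this page: let the caller skip the edition
        # instead of crashing on match.group().
        return None
    return match.group("id")
```

A caller could log a warning and move on to the next edition when the function returns None.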


custom output filename

Description

Currently, filenames are used as provided by magzdb.org. It would be nice to be able to customize filenames with attributes such as eid, year, and issue.
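
As an illustration of what such a feature could look like, here is a short Python sketch based on str.format templates. The function name format_filename and the template syntax are hypothetical; nothing here reflects an existing magzdb option.

```python
def format_filename(template, eid, year, issue, extension="pdf"):
    """Fill a filename template with edition attributes.

    The placeholders {eid}, {year} and {issue} follow str.format syntax.
    """
    name = template.format(eid=eid, year=year, issue=issue)
    # Replace path separators so the result stays a single file name.
    name = name.replace("/", "_").replace("\\", "_")
    return f"{name}.{extension}"
```

For example, the template "{year}-{issue:02d}_{eid}" would give names that sort chronologically within a magazine.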

wget exiting with error - not creating folder?

  • magzdb version: 1.1.2
  • Python version: 3.7.3
  • Operating System: Debian 10.5

Description

When running with --downloader wget, magzdb exits with an error. It seems that wget isn't creating the correct folders.

What I Did

I ran the following,

user01@s107487:~/books/temp_magazines$ /home/user01/.local/bin/magzdb -i 1593 -e 4056530 --downloader wget --debug
2020-09-25 06:01:11.903 | INFO     | magzdb.magzdb:download:199 - Found 1 editions of Everyday_practical_electronics
2020-09-25 06:01:11.903 | INFO     | magzdb.magzdb:download:204 - Downloading year 2020 issue 8
2020-09-25 06:01:11.904 | DEBUG    | magzdb.magzdb:_print:52 - Issue ID: 4056530
2020-09-25 06:01:11.988 | DEBUG    | magzdb.magzdb:_print:52 - Download Link ID: 675386
2020-09-25 06:01:12.084 | DEBUG    | magzdb.magzdb:_print:52 - Download URL: http://file.magzdb.org/ul/1593/Practical Electronics - 2020-08.pdf
2020-09-25 06:01:12.085 | DEBUG    | magzdb.magzdb:_print:52 - wget -c -O "/home/user01/books/temp_magazines/Everyday_practical_electronics/Practical_Electronics_-_2020-08.pdf" "http://file.magzdb.org/ul/1593/Practical Electronics - 2020-08.pdf"
Traceback (most recent call last):
  File "/home/user01/.local/bin/magzdb", line 10, in <module>
    sys.exit(main())
  File "/home/user01/.local/lib/python3.7/site-packages/magzdb/cli.py", line 96, in main
    filter=args.filter,
  File "/home/user01/.local/lib/python3.7/site-packages/magzdb/magzdb.py", line 241, in download
    subprocess.run(command)
  File "/usr/lib/python3.7/subprocess.py", line 472, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.7/subprocess.py", line 775, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.7/subprocess.py", line 1522, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'wget -c -O "/home/user01/books/temp_magazines/Everyday_practical_electronics/Practical_Electronics_-_2020-08.pdf" "http://file.magzdb.org/ul/1593/Practical Electronics - 2020-08.pdf"': 'wget -c -O "/home/user01/books/temp_magazines/Everyday_practical_electronics/Practical_Electronics_-_2020-08.pdf" "http://file.magzdb.org/ul/1593/Practical Electronics - 2020-08.pdf"'

If I first run the command without --downloader, as expected the file downloads and I get...

user01@s107487:~/books/temp_magazines$ /home/user01/.local/bin/magzdb -i 1593 -e 4056530 --debug
2020-09-25 06:50:37.830 | WARNING  | magzdb.cli:main:81 - Use of external downloader like wget or aria2 is recommended
2020-09-25 06:50:38.245 | INFO     | magzdb.magzdb:download:199 - Found 1 editions of Everyday_practical_electronics
2020-09-25 06:50:38.245 | INFO     | magzdb.magzdb:download:204 - Downloading year 2020 issue 8
2020-09-25 06:50:38.245 | DEBUG    | magzdb.magzdb:_print:52 - Issue ID: 4056530
2020-09-25 06:50:38.328 | DEBUG    | magzdb.magzdb:_print:52 - Download Link ID: 675386
2020-09-25 06:50:38.448 | DEBUG    | magzdb.magzdb:_print:52 - Download URL: http://file.magzdb.org/ul/1593/Practical Electronics - 2020-08.pdf
2020-09-25 06:50:38.926 | DEBUG    | magzdb.magzdb:_print:52 - Downloading to /home/user01/books/temp_magazines/Everyday_practical_electronics/Practical_Electronics_-_2020-08.pdf

Without removing the file/folder created in the last step, if I then run the wget command that DEBUG printed in the first attempt, I get

user01@s107487:~/books/temp_magazines$ wget -c -O "/home/user01/books/temp_magazines/Everyday_practical_electronics/Practical_Electronics_-_2020-08.pdf" "http://file.magzdb.org/ul/1593/Practical Electronics - 2020-08.pdf"
--2020-09-25 06:51:41--  http://file.magzdb.org/ul/1593/Practical%20Electronics%20-%202020-08.pdf
Resolving file.magzdb.org (file.magzdb.org)... 84.39.241.145
Connecting to file.magzdb.org (file.magzdb.org)|84.39.241.145|:80... connected.
HTTP request sent, awaiting response... 416 Requested Range Not Satisfiable

    The file is already fully retrieved; nothing to do.

Again, without deleting the file/folder created, running the first command again exits with the same error,

user01@s107487:~/books/temp_magazines$ /home/user01/.local/bin/magzdb -i 1593 -e 4056530 --downloader wget --debug
2020-09-25 06:55:39.899 | INFO     | magzdb.magzdb:download:199 - Found 1 editions of Everyday_practical_electronics
2020-09-25 06:55:39.900 | INFO     | magzdb.magzdb:download:204 - Downloading year 2020 issue 8
2020-09-25 06:55:39.900 | DEBUG    | magzdb.magzdb:_print:52 - Issue ID: 4056530
2020-09-25 06:55:39.976 | DEBUG    | magzdb.magzdb:_print:52 - Download Link ID: 675386
2020-09-25 06:55:40.090 | DEBUG    | magzdb.magzdb:_print:52 - Download URL: http://file.magzdb.org/ul/1593/Practical Electronics - 2020-08.pdf
2020-09-25 06:55:40.090 | DEBUG    | magzdb.magzdb:_print:52 - wget -c -O "/home/user01/books/temp_magazines/Everyday_practical_electronics/Practical_Electronics_-_2020-08.pdf" "http://file.magzdb.org/ul/1593/Practical Electronics - 2020-08.pdf"
Traceback (most recent call last):
  File "/home/user01/.local/bin/magzdb", line 10, in <module>
    sys.exit(main())
  File "/home/user01/.local/lib/python3.7/site-packages/magzdb/cli.py", line 96, in main
    filter=args.filter,
  File "/home/user01/.local/lib/python3.7/site-packages/magzdb/magzdb.py", line 241, in download
    subprocess.run(command)
  File "/usr/lib/python3.7/subprocess.py", line 472, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.7/subprocess.py", line 775, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.7/subprocess.py", line 1522, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'wget -c -O "/home/user01/books/temp_magazines/Everyday_practical_electronics/Practical_Electronics_-_2020-08.pdf" "http://file.magzdb.org/ul/1593/Practical Electronics - 2020-08.pdf"': 'wget -c -O "/home/user01/books/temp_magazines/Everyday_practical_electronics/Practical_Electronics_-_2020-08.pdf" "http://file.magzdb.org/ul/1593/Practical Electronics - 2020-08.pdf"'
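
The FileNotFoundError names the entire command line as the missing file. That matches how subprocess.run behaves on POSIX when it receives a single string without shell=True: the whole string is taken as the program name. The Python sketch below shows the usual remedy, passing an argument list. The helper names are hypothetical and this is not magzdb's actual code.

```python
import shlex
import subprocess
from pathlib import Path


def build_wget_command(destination, url):
    """Return the wget invocation as an argument list, one item per argument."""
    return ["wget", "-c", "-O", str(destination), url]


def ensure_parent_dir(destination):
    """Create the target folder first; wget -O does not create missing folders."""
    Path(destination).parent.mkdir(parents=True, exist_ok=True)


def run_wget(destination, url):
    """Download `url` to `destination` with wget, resuming partial files."""
    ensure_parent_dir(destination)
    # A list is executed directly, so spaces in paths and URLs need no quoting.
    subprocess.run(build_wget_command(destination, url), check=True)


def split_command_line(command_line):
    """Split an already formatted command string into an argument list."""
    return shlex.split(command_line)
```

If the command already exists as one formatted string, as in the DEBUG output, shlex.split turns it into an equivalent list.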

Found 0 editions of ____

  • magzdb version: 1.1.4
  • Python version: 3.7.3
  • Operating System: Debian 10.5

Description

This issue seems to be back again. I get 'Found Zero Editions' when running the below, but do not have any editions in the download folder. It doesn't seem to happen with all magazines though; 5423 is the only example I have at the moment

What I Did

magzdb -i 5423 -f "year >= 2018" -P /home/books/magazines --downloader wget
2020-11-07 09:24:32.888 | INFO     | magzdb.magzdb:download:202 - Found 0 editions of TIME

Download stuck

  • magzdb version: 1.0.3
  • Python version: 3.8
  • Operating System: Win10

Description

The download gets stuck after the first PDF has been downloaded.

Local file server down (file.magzdb.org)

I haven't tried the script, but it seems the local file server is down? According to the web page, this is the case for many months already. Does this script still function regardless? Any VPN required?

New update of the website

It seems that, after a few days of 503 errors, the file server is no longer file.magzdb.org and may have changed to elibrary.keenetic.pro.

Found 0 Editions of ...

  • magzdb version: 1.0.2
  • Python version: 3.7.3
  • Operating System: Debian 10.5

Description

All attempts to download return 'Found 0 editions of ____ '

What I Did

magzdb -i 4274
magzdb -i 4274 -e 4058710
magzdb -I 2703
magzdb -i 2869
