ebook-tools's People

Contributors

di-dc, heliostatic, na--

ebook-tools's Issues

Error in lib.sh

Hi!

I ran
./organize-ebooks.sh -v -ocr=true -owi -o=/mnt/d/Books/organized -ofu=/mnt/d/Books/uncertain -ofc=/mnt/d/Books/corrupt -ofp=/mnt/d/Books/pamphlet /path/to/books/
and got the following error:

[/mnt/d/Re../Medsklad/files/011912.pdf] Fetching metadata from Google...
[/mnt/d/Re../Medsklad/files/011912.pdf] Calling fetch-ebook-metadata --verbose --allowed-plugin=Google --isbn=3000050000
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] b"Running identify query with parameters:\n{'title': None, 'authors': [], 'identifiers': {'isbn':
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] '3000050000'}, 'timeout': 30}\nUsing plugins: Google (1, 0, 1)\nThe log from individual plugins
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] is below\n\n****************************** Google (1, 0, 1) ******************************\nFound
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] 1 results\nDownloading from Google took 0.6083431243896484\n\n\n---\nTitle : Das
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] Hunderttage-Stadion: Entstehungsgeschichte des Bad Nauheimer Kunsteisstadions unter Colonel Paul
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] R. Knight\nAuthor(s) : Heinrich Burk\nPublisher : Stadt Bad Nauheim\nLanguages
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] : deu\nPublished : 1999-09-15T11:52:30.012039+00:00\nIdentifiers
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] : google:163xAgAACAAJ, isbn:9783000050008\nMaking query:
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] https://books.google.com/books/feeds/volumes?q=isbn%3A3000050000&max-results=20&start-index=1&min-viewability=none\n\n********************************************************************************\nThe
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] identify phase took 0.80 seconds\nThe longest time (0.608343) was taken by: Google\nMerging
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] results from different sources\nWe have 1 merged results, merging took: 0.00 seconds\n"
[/mnt/d/Re../Medsklad/files/011912.pdf] Successfully fetched metadata:
[/mnt/d/Re../Medsklad/files/011912.pdf] [meta] b'Title : Das Hunderttage-Stadion: Entstehungsgeschichte des Bad
[/mnt/d/Re../Medsklad/files/011912.pdf] [meta] Nauheimer Kunsteisstadions unter Colonel Paul R. Knight\nAuthor(s) : Heinrich
[/mnt/d/Re../Medsklad/files/011912.pdf] [meta] Burk\nPublisher : Stadt Bad Nauheim\nLanguages : deu\nPublished :
[/mnt/d/Re../Medsklad/files/011912.pdf] [meta] 1999-09-15T11:52:30.012039+00:00\nIdentifiers : google:163xAgAACAAJ, isbn:9783000050008'
[/mnt/d/Re../Medsklad/files/011912.pdf] Addding additional metadata to the end of the metadata file...
[/mnt/d/Re../Medsklad/files/011912.pdf] Organizing '/mnt/d/RecipeCD/books/Medsklad/files/011912.pdf' (with '/tmp/tmp.m4sWetIL4y.txt')...
[/mnt/d/Re../Medsklad/files/011912.pdf] Variables that will be used for the new filename construction:
[/mnt/d/Re../Medsklad/files/011912.pdf] BTITLE Das Hunderttage-Stadion: Entstehungsgeschichte des Bad Nauheimer Kunsteisstadions unter Colonel Paul
[/mnt/d/Re../Medsklad/files/011912.pdf] EXT pdf
[/mnt/d/Re../Medsklad/files/011912.pdf] METADATA_SOURCE Google
[/mnt/d/Re../Medsklad/files/011912.pdf] OLD_FILE_PATH _mnt_d_RecipeCD_books_Medsklad_files_011912.pdf
[/mnt/d/Re../Medsklad/files/011912.pdf] ALL_FOUND_ISBNS 5225009204,3000050000
[/mnt/d/Re../Medsklad/files/011912.pdf] ISBN 3000050000
[/mnt/d/Re../Medsklad/files/011912.pdf] /home/magi/ebook-tools/lib.sh: line 682: d[TITLE]: unbound variable
[/mnt/d/Re../Medsklad/files/011912.pdf] ERROR on line 682 /home/magi/ebook-tools/lib.sh!
[/mnt/d/Re../Medsklad/files/011912.pdf] ERROR on line 136 ./organize-ebooks.sh!
ERROR on line 313 ./organize-ebooks.sh!
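
The "unbound variable" message is what bash prints under `set -u` when a missing array key is read. The log above lists `BTITLE` but no `TITLE`, which may be why the lookup fails. The following is only a minimal sketch of a defensive lookup, not the code in lib.sh; the array contents and the fallback text are made up for illustration.

```shell
#!/usr/bin/env bash
set -u  # nounset: reading a missing key without a default aborts the script

# Hypothetical metadata array mirroring the log: BTITLE is set, TITLE is not.
declare -A d=( [BTITLE]="Das Hunderttage-Stadion" [EXT]="pdf" )

# ${d[TITLE]:-...} substitutes a fallback instead of failing under set -u.
title_or_default() {
    printf '%s\n' "${d[TITLE]:-${d[BTITLE]:-Unknown title}}"
}

title_or_default
```

With the sample array this falls back to the `BTITLE` value; with neither key set it would print the placeholder text.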

Docker (Windows): Ran out of memory for input buffer

Here is a screenshot of the entire error message. It happens when processing seemingly random books (only three times while processing around 50,000 books). When it crashes, it creates a folder named ERROR at 35 in the output folder.
image

metadata related error

Hello,

I'm running the Docker container in an UnRAID environment; I spun it up and manage it with Portainer.

The command used is

docker run -it -v /blahblah:/unorganized -v /blahblahblah:/organized ebooktools/scripts:latest

then within the container

$ organize-ebooks.sh -v /unorganized -o /organized

which gets me

Recursively scanning '/unorganized' for files
[/unorgani..cence - Lyn Macdonald.epub] Testing '/unorganized/5k books/1915 The Death of Innocence - Lyn Macdonald.epub' for corruption...
[/unorgani..cence - Lyn Macdonald.epub] The file has a '.epub' extension, testing with 7z...
[/unorgani..cence - Lyn Macdonald.epub] File passed the corruption test, looking for ISBNs...
[/unorgani..cence - Lyn Macdonald.epub] Searching file '/unorganized/5k books/1915 The Death of Innocence - Lyn Macdonald.epub' for ISBN numbers...
[/unorgani..cence - Lyn Macdonald.epub] Ebook MIME type: application/zip
[/unorgani..cence - Lyn Macdonald.epub] This application failed to start because it could not find or load the Qt platform plugin "headless"
[/unorgani..cence - Lyn Macdonald.epub] in "/usr/lib/calibre/calibre/plugins".
[/unorgani..cence - Lyn Macdonald.epub]
[/unorgani..cence - Lyn Macdonald.epub] Reinstalling the application may fix this problem.
[/unorgani..cence - Lyn Macdonald.epub] ERROR on line 569 /ebook-tools/lib.sh!
[/unorgani..cence - Lyn Macdonald.epub] ERROR on line 569 /ebook-tools/lib.sh!
[/unorgani..cence - Lyn Macdonald.epub] ERROR on line 296 /ebook-tools/organize-ebooks.sh!
ERROR on line 311 /ebook-tools/organize-ebooks.sh!

So, a bunch of things:

  • file is epub but is being recognized as "application/zip"
  • 'could not find or load the Qt platform plugin "headless"'
  • ERROR lib.sh line 569 (ebookmeta="$(ebook-meta "$file_path")")
  • ERROR on organize-ebooks.sh line 296 (isbns="$(search_file_for_isbns "$file_path")")
  • ERROR on organize-ebooks.sh line 311 ('for fpath in "$@"; do')

How to use Last name then first name for author name?

I don't always know the first names of authors so I would prefer to sort them based on their last name.
So instead of:
Cory Doctorow - [Little Brother #1] - Little Brother (2008) [0765319853].pdf
how can I obtain:
Doctorow, Cory - [Little Brother #1] - Little Brother (2008) [0765319853].pdf

Thanks.
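
One way to get the "Last, First" form is plain shell parameter expansion. This is a naive sketch that assumes a single author whose surname is the final space-separated word; the function name is hypothetical, and how to hook it into the filename template is not something these issues confirm.

```shell
#!/usr/bin/env bash
# Turn "First [Middle] Last" into "Last, First [Middle]".
# Naive assumption: the surname is the final space-separated word.
last_name_first() {
    local name="$1"
    case "$name" in
        *" "*) printf '%s, %s\n' "${name##* }" "${name% *}" ;;
        *)     printf '%s\n' "$name" ;;  # single word: nothing to swap
    esac
}

last_name_first "Cory Doctorow"   # prints: Doctorow, Cory
```

Multi-word surnames and multiple authors would need extra handling.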

Docker image doesn't have less installed

Steps to repro: Use interactive-organizer.sh and select Read in terminal
Expected: Able to read book in terminal
Actual: Command fails because less is not installed

File    'Aesop - Aesop Fables (2016) [9789887739401,9781434001467].epub' (394.5KiB in '/unorganized-books/cleaned/uncertain/') [has metadata]
Old     'Aesop's Fables - Aesop.epub' (in '/unorganized-books/Fiction and Short Stories/1001 Books You Must Read Before You Die/Aesop/')
No missing words from the old filename in the new!
Possible actions:
 0/spb) Move file and metadata to '/unorganized-books/cleaned/'
 m/tab) Move to another folder          | i/bs)  Interactively reorganize the file
 o/ent) Open file in external viewer    | l)     Read in terminal
 c)     Read the saved metadata file    | ?)     Run ebook-meta on the file
 t/`)   Run shell in terminal           | e)     Eval code (change env vars)
 s)     Skip file                       | q/esc) Quit
Chosen option: l
Reading '/unorganized-books/cleaned/uncertain/Aesop - Aesop Fables (2016) [9789887739401,9781434001467].epub' (application/epub+zip) with less...
Converting ebook '/unorganized-books/cleaned/uncertain/Aesop - Aesop Fables (2016) [9789887739401,9781434001467].epub' to text format in file '/tmp/tmp.Fo8jJonmgK.txt'...
./interactive-organizer.sh: line 121: less: command not found
ERROR on line 121 ./interactive-organizer.sh!
ERROR on line 356 ./interactive-organizer.sh!
user@5e0343e69f50:/ebook-tools$ less
bash: less: command not found
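
Installing less in the image would be the direct fix. Until then, a script could fall back to whatever pager exists. The sketch below is a hypothetical helper, not part of interactive-organizer.sh.

```shell
#!/usr/bin/env bash
# Print the first pager from the argument list that exists on this system;
# cat is appended as a last resort.
pick_pager() {
    local candidate
    for candidate in "$@" cat; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

pager="$(pick_pager less more)"
echo "Using pager: $pager"
```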

Error on line 311

I keep getting this error. I suspect a file name is not being escaped properly, but as there's no verbose mode, I can't see which files are causing it.

Edit: found the verbose mode at the bottom of README.md (derp); for some reason I couldn't see it in --help or the source.

File Sort Flags not bound

./organize-ebooks.sh: line 311: FILE_SORT_FLAGS[@]: unbound variable

FILE_SORT_FLAGS is not a bound variable right after cloning the repo.

Ubuntu 16.04.3 LTS
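
Ubuntu 16.04 ships bash 4.3, and bash versions before 4.4 treat the expansion of an empty array as an unbound variable when `set -u` is active. Assuming FILE_SORT_FLAGS is empty by default (not confirmed here), the sketch below shows the usual workaround; it is not the project's actual fix.

```shell
#!/usr/bin/env bash
set -u

FILE_SORT_FLAGS=()   # assumed default: an empty array

# On bash < 4.4, a plain "${FILE_SORT_FLAGS[@]}" aborts here with
# "unbound variable". The +-expansion yields nothing when the array is
# empty and every element, properly quoted, when it is not.
sort_lines() {
    sort ${FILE_SORT_FLAGS[@]+"${FILE_SORT_FLAGS[@]}"}
}

printf '%s\n' b a c | sort_lines
```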

organize-ebooks.sh uses epubs as folders

I've been using the script for a long time. Now, all of a sudden, it opens epubs like folders and sorts every single HTML page into the pamphlet folder, rendering the original epub useless.

Pls update instruction sample script

Very useful script, but it is difficult for a Windows audience to figure out from the sample command in the instructions:

  1. How do I use the pamphlet folder option so that pamphlets go into a subfolder of the working directory (not a centralized location), allowing them to stay in their original subfolder?
  2. How do I send the unorganized files to a folder other than the working directory? This would avoid reprocessing files when resuming after errors.
  3. How do I keep my existing custom folder structure while recursively renaming all files in subfolders?
  4. Please add some common RegEx samples for file names, e.g. Year, Publisher, Title.

Thanks.

organize-ebooks.sh keeps failing

I run the scripts installed from the zip archive. No matter which directory I try, a command line like ./organize-ebooks.sh -d -v -km "/Volumes/Public/bearbeiten/Texte/benennen Buch/sortiert nach Sprache/en/klassifiziert/006.22" gives

Recursively scanning '/Volumes/Public/bearbeiten/Texte/benennen Buch/sortiert nach Sprache/en/klassifiziert/006.22' for files

[/Volumes/..nventor's Guide (2017).pdf] Testing '/Volumes/Public/bearbeiten/Texte/benennen Buch/sortiert nach Sprache/en/klassifiziert/006.22/The Arduino Inventor's Guide (2017).pdf' for corruption...

[/Volumes/..nventor's Guide (2017).pdf] tr: Illegal byte sequence
[/Volumes/..nventor's Guide (2017).pdf] Checking pdf file for integrity...
[/Volumes/..nventor's Guide (2017).pdf] fmt: illegal option -- -
[/Volumes/..nventor's Guide (2017).pdf] usage: fmt [-cmps] [-d chars] [-l num] [-t num]
[/Volumes/..nventor's Guide (2017).pdf] [-w width | -width | goal [maximum]] [file ...]
[/Volumes/..nventor's Guide (2017).pdf] Options: -c center each line instead of formatting
[/Volumes/..nventor's Guide (2017).pdf] -d double-space after at line end
[/Volumes/..nventor's Guide (2017).pdf] -l turn each spaces at start of line into a tab
[/Volumes/..nventor's Guide (2017).pdf] -m try to make sure mail header lines stay separate
[/Volumes/..nventor's Guide (2017).pdf] -n format lines beginning with a dot
[/Volumes/..nventor's Guide (2017).pdf] -p allow indented paragraphs
[/Volumes/..nventor's Guide (2017).pdf] -s coalesce whitespace inside lines
[/Volumes/..nventor's Guide (2017).pdf] -t have tabs every columns
[/Volumes/..nventor's Guide (2017).pdf] -w set maximum width to
[/Volumes/..nventor's Guide (2017).pdf] goal set target width to goal
[/Volumes/..nventor's Guide (2017).pdf] pdfinfo returned successfully
[/Volumes/..nventor's Guide (2017).pdf] fmt: illegal option -- -
[/Volumes/..nventor's Guide (2017).pdf] usage: fmt [-cmps] [-d chars] [-l num] [-t num]
[/Volumes/..nventor's Guide (2017).pdf] [-w width | -width | goal [maximum]] [file ...]
[/Volumes/..nventor's Guide (2017).pdf] Options: -c center each line instead of formatting
[/Volumes/..nventor's Guide (2017).pdf] -d double-space after at line end
[/Volumes/..nventor's Guide (2017).pdf] -l turn each spaces at start of line into a tab
[/Volumes/..nventor's Guide (2017).pdf] -m try to make sure mail header lines stay separate
[/Volumes/..nventor's Guide (2017).pdf] -n format lines beginning with a dot
[/Volumes/..nventor's Guide (2017).pdf] -p allow indented paragraphs
[/Volumes/..nventor's Guide (2017).pdf] -s coalesce whitespace inside lines
[/Volumes/..nventor's Guide (2017).pdf] -t have tabs every columns
[/Volumes/..nventor's Guide (2017).pdf] -w set maximum width to
[/Volumes/..nventor's Guide (2017).pdf] goal set target width to goal
[/Volumes/..nventor's Guide (2017).pdf] ERROR on line 191 /Users/Guy/Eingang/ebook-tools-master/lib.sh!
[/Volumes/..nventor's Guide (2017).pdf] ERROR on line 191 /Users/Guy/Eingang/ebook-tools-master/lib.sh!
[/Volumes/..nventor's Guide (2017).pdf] ERROR on line 391 /Users/Guy/Eingang/ebook-tools-master/lib.sh!
[/Volumes/..nventor's Guide (2017).pdf] ERROR on line 268 ./organize-ebooks.sh!
ERROR on line 313 ./organize-ebooks.sh!
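
The usage text in the log is that of BSD fmt, which does not accept GNU-style long options, and "Illegal byte sequence" is typical of BSD tr meeting bytes that are not valid in the current locale. The sketch below assumes GNU coreutils are installed with a g prefix (as Homebrew does); the helper name is hypothetical and this is not the project's code.

```shell
#!/usr/bin/env bash
# Prefer the GNU variant of a tool (g-prefixed) and fall back to the
# plain name on systems where that variant is not installed.
gnu_or_plain() {
    local tool="$1"
    if command -v "g$tool" >/dev/null 2>&1; then
        printf 'g%s\n' "$tool"
    else
        printf '%s\n' "$tool"
    fi
}

FMT="$(gnu_or_plain fmt)"
# LC_ALL=C makes tr treat its input as raw bytes, which avoids
# "Illegal byte sequence" on data that is not valid UTF-8.
printf 'caf\351 text\n' | LC_ALL=C tr -cd '[:print:]\n' | "$FMT" -w 40
```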

curl link is not valid anymore?

While building the Docker image, I found that the link fetched by

RUN curl 'https://www.mobileread.com/forums/attachment.php?attachmentid=163537' > goodreads.zip

is no longer valid. The website just returns an HTML page with the message:
'Invalid Attachment specified. If you followed a valid link, please notify the administrator'

Is there any workaround?

Use of the tool with docker issue

Hello,

First, thank you for this tool to organize books. I have a huge collection of books that I want to organize, so I decided to try your tool via Docker because I am on Windows.

  1. I pulled the image into my local Docker:

Inline image 1

  2. Because I am on Windows with WSL, my files are supposed to be where my Linux distribution is: \wsl$\Ubuntu-20.04\home\a\unorganized-books

Inline image 2

  3. I ran Docker with this command:

docker run -it -v //wsl$/Ubuntu-20.04/home/a:/unorganized-books ebooktools/scripts:latest

  4. Finally, I ran the last command:

organize-ebooks.sh --output-folder=organized-books/

As a result, I got this:

eBook Organizer v0.5.1

Usage: organize-ebooks.sh [OPTIONS] EBOOK_FOLDERS...

For information about the possible options, see the README.md file or the script source itself

5 Summary

Inline image 3

What did I do wrong? Can you help me, please?

Regards
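
The usage line printed above expects at least one EBOOK_FOLDERS argument, and the command as given passes only an output folder. The dry-run sketch below prints a command that also names the input folder. /unorganized-books is the container path from the docker command above; /organized-books is a hypothetical output path that would need its own -v mount to persist on the host.

```shell
#!/usr/bin/env bash
# Dry run: print the command instead of executing it.
input_dir="/unorganized-books"   # container path from the -v mount
output_dir="/organized-books"    # hypothetical; mount it with a second -v

build_cmd() {
    printf 'organize-ebooks.sh --output-folder=%s %s\n' "$output_dir" "$input_dir"
}

build_cmd
```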

Always No such file or directory error when running in docker

Hi

My directory structure

/data
├── Books
│   ├── corrupt
│   ├── organized
│   ├── pamphlet
│   └── uncertain
├── Downloads

Error when running via docker.

docker run -it ebooktools/scripts:latest organize-ebooks.sh -ocr=true -v -o=/data/Books/organized -ofc=/data/Books/corrupt -ofp=/data/Books/pamphlet -owi --output-folder-uncertain=/data/Books/uncertain -mfo="WorldCat xISBN,Google,Goodreads,ISBNDB,OZON.ru" -owis="Goodreads,Amazon.com,Google,Edelweiss,Open Library,Big Book Search" -oft='"${d[TITLE]} ${d[AUTHORS]} ${d[PUBLISHED]:+${d[PUBLISHED]%%-*}}.${d[EXT]}"' /data/Downloads

Recursively scanning '/data/Downloads' for files
find: ‘/data/Downloads’: No such file or directory
ERROR on line 311 organize-ebooks.sh!

Please help!
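
The docker command above has no -v option, so the host's /data tree is not visible inside the container, which would explain the find error. The dry-run sketch below prints an invocation with a bind mount. Mounting at the same path inside the container is only one possible choice, and the option list is shortened for readability.

```shell
#!/usr/bin/env bash
# Dry run: assemble the docker invocation and print it rather than run it.
host_dir="/data"        # host directory from the tree above
container_dir="/data"   # same path inside the container keeps the options unchanged

docker_cmd() {
    printf 'docker run -it -v %s:%s ebooktools/scripts:latest organize-ebooks.sh -v -o=%s/Books/organized %s/Downloads\n' \
        "$host_dir" "$container_dir" "$container_dir" "$container_dir"
}

docker_cmd
```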

Organize ebooks

I'm crap at regexp and things like that. But I'd like to organize my books by Author and then by series, if any. I guess that is possible, but how? And it would be great to have the year in the title as well. Something like this:

Dan Brown
    - Robert Langdon Series
        - Origin (2017)
        - Inferno (2013)

If it's done with a oneliner in bash or in ebook-tools doesn't really matter.
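
The filename templates quoted in other issues on this page are bash expansions over an array named d, so a layout like this can be prototyped outside the tool. In the sketch below the array is a stand-in with sample values, and the keys are the ones that appear in those templates. It has not been checked against the tool itself.

```shell
#!/usr/bin/env bash
# Stand-in for the metadata array that the templates refer to.
declare -A d=(
    [AUTHORS]="Dan Brown"
    [SERIES]="Robert Langdon Series"
    [TITLE]="Origin"
    [PUBLISHED]="2017-10-03"
    [EXT]="epub"
)

# Layout: Author/[Series/]Title (Year).ext
# ${d[SERIES]:+...} adds the series folder only when SERIES is non-empty;
# ${d[PUBLISHED]%%-*} keeps the part before the first dash (the year).
build_path() {
    printf '%s\n' "${d[AUTHORS]}/${d[SERIES]:+${d[SERIES]}/}${d[TITLE]}${d[PUBLISHED]:+ (${d[PUBLISHED]%%-*})}.${d[EXT]}"
}

build_path   # prints: Dan Brown/Robert Langdon Series/Origin (2017).epub
```

The same expression, wrapped in the '"..."' quoting used by the -oft examples elsewhere on this page, is the form those templates take.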

Is there any way to embed metadata directly to ebooks?

I want to thank you and congratulate you for this amazing piece of code.
Your scripting skills are amazing.
I'm using the organize-ebooks.sh and interactive-organizer.sh scripts to make sure the books are correctly renamed, but I'd like to embed the metadata directly in the files.
At the moment, I end up with the metadata in human-readable format (incompatible with calibre) and a set of books, more or less well renamed and checked by me with the interactive script, that I have to feed to calibre to scrape again (which is prone to failure) before I can embed the metadata.
Is there any way for the first script to generate metadata in OPF format, so that it can be imported directly into calibre, or better, for the script to embed the metadata itself?

Thank you again.
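
Outside of ebook-tools, calibre's own command-line tools can probably cover this: fetch-ebook-metadata has an --opf output option and ebook-meta has --from-opf. The sketch below only prints the two commands for a given ISBN and file; the function name and the default OPF file name are hypothetical, and the options are worth checking against --help on the installed calibre version.

```shell
#!/usr/bin/env bash
# Print the two calibre commands that would fetch OPF metadata for an ISBN
# and then write that metadata into the book file.
embed_metadata_plan() {
    local isbn="$1" book="$2" opf="${3:-metadata.opf}"
    printf 'fetch-ebook-metadata --isbn=%s --opf > "%s"\n' "$isbn" "$opf"
    printf 'ebook-meta "%s" --from-opf="%s"\n' "$book" "$opf"
}

embed_metadata_plan 0765319853 "Little Brother.pdf"
```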

Folder organization options

Enhancement request: allow identified ebooks to be organized into a predictable folder structure without renaming the ebooks themselves.

My goal is to organize all books by author, but I can imagine others may want to organize by different metadata.

Note: I may also need to add sharding, as file-sharing setups like Samba typically don't do well with folder lists that are thousands of entries long. Sharding may be off topic for this request but is included for completeness.

Example for demonstration:

From:

Frank Herbert Dune.epub
Douglas Adams The Hitchhikers Guide to the Galaxy.epub
George Orwell Nineteen Eighty-Four.epub
Isaac Asimov Foundation.epub

To:

\organized-books\D\Douglas Adams The Hitchhikers Guide to the Galaxy.epub
\organized-books\F\Frank Herbert Dune.epub
\organized-books\G\George Orwell Nineteen Eighty-Four.epub
\organized-books\I\Isaac Asimov Foundation.epub

The drivers for this are:

  • I am less interested in having things named correctly than in having them filed in the correct location
  • I have found that the match rates for the author are typically much higher than for the title
  • I prefer to keep files unrenamed so as not to lose any extra useful data the filename may contain

Excellent work. It took me far too long to stumble upon this excellent project idea.
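
The requested layout can be sketched as a small path helper. This is not an existing ebook-tools option. The author string is passed in explicitly here, whereas in practice it would come from the fetched metadata, and the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Compute a sharded destination for a file without renaming it.
# The shard is the upper-cased first character of the author string.
shard_path() {
    local out_root="$1" author="$2" file="$3" shard
    shard="$(printf '%s' "$author" | cut -c1 | tr '[:lower:]' '[:upper:]')"
    printf '%s/%s/%s\n' "$out_root" "$shard" "${file##*/}"
}

shard_path organized-books "Frank Herbert" "/in/Frank Herbert Dune.epub"
# prints: organized-books/F/Frank Herbert Dune.epub
```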

Pamphlet option does not seem to work with absolute pathing

Using the Docker image:

Whenever I attempt to sort pamphlets with an absolute path, the folder structure becomes weird, almost as if the filename includes the old path. Note the double slash in the TO line.

OK:	/vault/Literature/test/example_pamphlet.pdf

TO:	/vault/Literature/Pamphlets//vault/Literature/test/example_pamphlet.pdf

The command I'm using at the moment:

organize-ebooks.sh -o=/vault/Literature/Books -owi -ofu=/vault/Literature/Uncertain -ofp=/vault/Literature/Pamphlets -ofc=/vault/Literature/z_corrupted --keep-metadata /vault/Literature/test

A small hotfix that I have found to work is to change the current directory to my unsorted folder and pass the current folder as the argument instead. It then outputs to "/vault/Literature/Pamphlets/./Proofmarks/example_pamphlet.pdf", which at least makes it work as intended.
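
The TO path above looks like the full source path appended to the pamphlet folder. A possible approach is to strip the scanned folder's prefix first. The sketch below uses a hypothetical helper to show the idea and is not the project's code.

```shell
#!/usr/bin/env bash
# Destination for a pamphlet: keep the path relative to the scanned folder
# instead of appending the whole absolute path.
pamphlet_dest() {
    local pamphlet_root="$1" scanned_root="$2" file="$3"
    local rel="${file#"${scanned_root%/}"/}"
    printf '%s/%s\n' "${pamphlet_root%/}" "$rel"
}

pamphlet_dest /vault/Literature/Pamphlets /vault/Literature/test \
    /vault/Literature/test/example_pamphlet.pdf
# prints: /vault/Literature/Pamphlets/example_pamphlet.pdf
```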

ERROR on lines 136 and 313

Hello,
When running:

sudo ./organize-ebooks.sh -o=/media/storage/certain --organize--without--isbn 
--output-folder-uncertain=/media/storage/uncertain 
--output-folder-corrupt=/media/storage/corrupt 
-ofp=/media/storage/pamphlets --keep-metadata 
--output-filename-template='"${d[AUTHORS]// & /, }/${d[AUTHORS]// & /, } - ${d[SERIES]:+ [${d[SERIES]//:/ -}] - }
${d[TITLE]//:/ -}${d[PUBLISHED]:+ (${d[PUBLISHED]%%-*})}${d[ISBN]:+ [${d[ISBN]}]}.${d[EXT]}"' .

I get this:

ERROR on line 136 ./organize-ebooks.sh!
ERROR on line 313 ./organize-ebooks.sh!

I am running this on WSL Ubuntu with a mounted hard drive. Installing the dependencies went smoothly.
