na-- / ebook-tools
Shell scripts for organizing and managing ebook collections
License: GNU General Public License v3.0
Hi!
I run
./organize-ebooks.sh -v -ocr=true -owi -o=/mnt/d/Books/organized -ofu=/mnt/d/Books/uncertain -ofc=/mnt/d/Books/corrupt -ofp=/mnt/d/Books/pamphlet /path/to/books/ and get the following error.
[/mnt/d/Re../Medsklad/files/011912.pdf] Fetching metadata from Google...
[/mnt/d/Re../Medsklad/files/011912.pdf] Calling fetch-ebook-metadata --verbose --allowed-plugin=Google --isbn=3000050000
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] b"Running identify query with parameters:\n{'title': None, 'authors': [], 'identifiers': {'isbn':
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] '3000050000'}, 'timeout': 30}\nUsing plugins: Google (1, 0, 1)\nThe log from individual plugins
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] is below\n\n****************************** Google (1, 0, 1) ******************************\nFound
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] 1 results\nDownloading from Google took 0.6083431243896484\n\n\n---\nTitle : Das
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] Hunderttage-Stadion: Entstehungsgeschichte des Bad Nauheimer Kunsteisstadions unter Colonel Paul
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] R. Knight\nAuthor(s) : Heinrich Burk\nPublisher : Stadt Bad Nauheim\nLanguages
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] : deu\nPublished : 1999-09-15T11:52:30.012039+00:00\nIdentifiers
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] : google:163xAgAACAAJ, isbn:9783000050008\nMaking query:
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] https://books.google.com/books/feeds/volumes?q=isbn%3A3000050000&max-results=20&start-index=1&min-viewability=none\n\n********************************************************************************\nThe
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] identify phase took 0.80 seconds\nThe longest time (0.608343) was taken by: Google\nMerging
[/mnt/d/Re../Medsklad/files/011912.pdf] [fetch-meta-Google] results from different sources\nWe have 1 merged results, merging took: 0.00 seconds\n"
[/mnt/d/Re../Medsklad/files/011912.pdf] Successfully fetched metadata:
[/mnt/d/Re../Medsklad/files/011912.pdf] [meta] b'Title : Das Hunderttage-Stadion: Entstehungsgeschichte des Bad
[/mnt/d/Re../Medsklad/files/011912.pdf] [meta] Nauheimer Kunsteisstadions unter Colonel Paul R. Knight\nAuthor(s) : Heinrich
[/mnt/d/Re../Medsklad/files/011912.pdf] [meta] Burk\nPublisher : Stadt Bad Nauheim\nLanguages : deu\nPublished :
[/mnt/d/Re../Medsklad/files/011912.pdf] [meta] 1999-09-15T11:52:30.012039+00:00\nIdentifiers : google:163xAgAACAAJ, isbn:9783000050008'
[/mnt/d/Re../Medsklad/files/011912.pdf] Addding additional metadata to the end of the metadata file...
[/mnt/d/Re../Medsklad/files/011912.pdf] Organizing '/mnt/d/RecipeCD/books/Medsklad/files/011912.pdf' (with '/tmp/tmp.m4sWetIL4y.txt')...
[/mnt/d/Re../Medsklad/files/011912.pdf] Variables that will be used for the new filename construction:
[/mnt/d/Re../Medsklad/files/011912.pdf] BTITLE Das Hunderttage-Stadion: Entstehungsgeschichte des Bad Nauheimer Kunsteisstadions unter Colonel Paul
[/mnt/d/Re../Medsklad/files/011912.pdf] EXT pdf
[/mnt/d/Re../Medsklad/files/011912.pdf] METADATA_SOURCE Google
[/mnt/d/Re../Medsklad/files/011912.pdf] OLD_FILE_PATH _mnt_d_RecipeCD_books_Medsklad_files_011912.pdf
[/mnt/d/Re../Medsklad/files/011912.pdf] ALL_FOUND_ISBNS 5225009204,3000050000
[/mnt/d/Re../Medsklad/files/011912.pdf] ISBN 3000050000
[/mnt/d/Re../Medsklad/files/011912.pdf] /home/magi/ebook-tools/lib.sh: line 682: d[TITLE]: unbound variable
[/mnt/d/Re../Medsklad/files/011912.pdf] ERROR on line 682 /home/magi/ebook-tools/lib.sh!
[/mnt/d/Re../Medsklad/files/011912.pdf] ERROR on line 136 ./organize-ebooks.sh!
ERROR on line 313 ./organize-ebooks.sh!
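One detail in the log above may be relevant: the variable list shows BTITLE rather than TITLE, and the fetched metadata begins with b'Title, which looks like a Python bytes literal leaking into the metadata text. The following is a minimal sketch of how that could lead to the unbound d[TITLE]. It assumes keys are derived by uppercasing the field label and dropping non-alphanumeric characters; normalize_key is a hypothetical helper, not the confirmed code in lib.sh.

```shell
#!/usr/bin/env bash
set -u

# Hypothetical key normalization: uppercase the label, keep only A-Z, 0-9 and _.
normalize_key() {
  printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr -cd 'A-Z0-9_'
}

declare -A d
line="b'Title : Das Hunderttage-Stadion"   # start of the metadata, as in the log
label="${line%% : *}"                      # everything before the first " : "
d["$(normalize_key "$label")"]="${line#* : }"

echo "keys: ${!d[*]}"
# A plain "${d[TITLE]}" would abort here under set -u; a default expansion does not.
echo "title: ${d[TITLE]:-<missing>}"
```

If this is what happens, stripping the b'...' wrapper before parsing would be the real fix; the default expansion only avoids the crash.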
When I try running the script I get this error
ERROR on line 313 ./organize-ebooks.sh!
Hello,
I'm running the docker container in an UnRAID environment, spun it up and managing it with Portainer.
command used is
docker run -it -v /blahblah:/unorganized -v /blahblahblah:/organized ebooktools/scripts:latest
then within the container
$ organize-ebooks.sh -v /unorganized -o /organized
which gets me
Recursively scanning '/unorganized' for files
[/unorgani..cence - Lyn Macdonald.epub] Testing '/unorganized/5k books/1915 The Death of Innocence - Lyn Macdonald.epub' for corruption...
[/unorgani..cence - Lyn Macdonald.epub] The file has a '.epub' extension, testing with 7z...
[/unorgani..cence - Lyn Macdonald.epub] File passed the corruption test, looking for ISBNs...
[/unorgani..cence - Lyn Macdonald.epub] Searching file '/unorganized/5k books/1915 The Death of Innocence - Lyn Macdonald.epub' for ISBN numbers...
[/unorgani..cence - Lyn Macdonald.epub] Ebook MIME type: application/zip
[/unorgani..cence - Lyn Macdonald.epub] This application failed to start because it could not find or load the Qt platform plugin "headless"
[/unorgani..cence - Lyn Macdonald.epub] in "/usr/lib/calibre/calibre/plugins".
[/unorgani..cence - Lyn Macdonald.epub]
[/unorgani..cence - Lyn Macdonald.epub] Reinstalling the application may fix this problem.
[/unorgani..cence - Lyn Macdonald.epub] ERROR on line 569 /ebook-tools/lib.sh!
[/unorgani..cence - Lyn Macdonald.epub] ERROR on line 569 /ebook-tools/lib.sh!
[/unorgani..cence - Lyn Macdonald.epub] ERROR on line 296 /ebook-tools/organize-ebooks.sh!
ERROR on line 311 /ebook-tools/organize-ebooks.sh!
So, a bunch of things:
ebookmeta="$(ebook-meta "$file_path")"
isbns="$(search_file_for_isbns "$file_path")"
Steps to repro: Use interactive-organizer.sh and select Read in terminal
Expected: Able to read book in terminal
Actual: Command fails because less is not installed
File 'Aesop - Aesop Fables (2016) [9789887739401,9781434001467].epub' (394.5KiB in '/unorganized-books/cleaned/uncertain/') [has metadata]
Old 'Aesop's Fables - Aesop.epub' (in '/unorganized-books/Fiction and Short Stories/1001 Books You Must Read Before You Die/Aesop/')
No missing words from the old filename in the new!
Possible actions:
0/spb) Move file and metadata to '/unorganized-books/cleaned/'
m/tab) Move to another folder | i/bs) Interactively reorganize the file
o/ent) Open file in external viewer | l) Read in terminal
c) Read the saved metadata file | ?) Run ebook-meta on the file
t/`) Run shell in terminal | e) Eval code (change env vars)
s) Skip file | q/esc) Quit
Chosen option: l
Reading '/unorganized-books/cleaned/uncertain/Aesop - Aesop Fables (2016) [9789887739401,9781434001467].epub' (application/epub+zip) with less...
Converting ebook '/unorganized-books/cleaned/uncertain/Aesop - Aesop Fables (2016) [9789887739401,9781434001467].epub' to text format in file '/tmp/tmp.Fo8jJonmgK.txt'...
./interactive-organizer.sh: line 121: less: command not found
ERROR on line 121 ./interactive-organizer.sh!
ERROR on line 356 ./interactive-organizer.sh!
user@5e0343e69f50:/ebook-tools$ less
bash: less: command not found
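Until less is available in the image, one option is to fall back to whatever pager is installed. This is a minimal sketch, not the script's actual code; pick_pager is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first candidate that is installed, else cat.
pick_pager() {
  local candidate
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  printf 'cat\n'
}

pager="$(pick_pager less more)"
echo "Using pager: $pager"
```

Falling back to cat keeps the "Read in terminal" action usable, at the cost of losing paging.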
I keep getting the error below. I suspect a file name isn't being escaped properly, but since I couldn't find a verbose mode, I can't see which files are causing the problem.
Edit: found the verbose mode at the bottom of README.md (derp); for some reason I couldn't see it in --help or the source.
./organize-ebooks.sh: line 311: FILE_SORT_FLAGS[@]: unbound variable
FILE_SORT_FLAGS is reported as an unbound variable on a fresh clone of the repo.
Ubuntu 16.04.3 LTS
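A likely explanation, assuming FILE_SORT_FLAGS is an empty array by default: Ubuntu 16.04 ships bash 4.3, and bash versions before 4.4 treat "${array[@]}" of an empty array as unbound when set -u is active. The sketch below shows the usual guard; count_args is only there to make the result visible.

```shell
#!/usr/bin/env bash
set -u

FILE_SORT_FLAGS=()   # assumed to be empty when no sort flags are configured

# Helper for the demonstration: print how many arguments were received.
count_args() { echo "$#"; }

# "${FILE_SORT_FLAGS[@]}" alone can abort on bash < 4.4 under set -u.
# The ${var+word} form expands to nothing when the array has no elements.
count_args ${FILE_SORT_FLAGS[@]+"${FILE_SORT_FLAGS[@]}"}

FILE_SORT_FLAGS=(-z -n)
count_args ${FILE_SORT_FLAGS[@]+"${FILE_SORT_FLAGS[@]}"}
```

Setting a non-empty value for the flags, or upgrading bash to 4.4 or later, should avoid the error as well.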
I have epubs in different languages.
It would be great to be able to separate epubs by language.
Adding "${d[LANGUAGE]}" to -oft gives an error.
Anything else I couldn't find/figure out...
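The metadata printed in another report on this page labels the field "Languages", so the key may be LANGUAGES rather than LANGUAGE; that is an assumption, not something confirmed from the script source. Whichever key it is, an unset key aborts under set -u unless the template supplies a fallback. A minimal sketch with sample values:

```shell
#!/usr/bin/env bash
set -u

# Sample values only; the key name LANGUAGES is an assumption.
declare -A d=([TITLE]="Origin" [EXT]="epub" [LANGUAGES]="eng")

# A key that is not set aborts under set -u:
#   echo "${d[LANGUAGE]}"      # -> unbound variable
# The :- form supplies a fallback, so every book lands in some language folder.
path="${d[LANGUAGES]:-unknown}/${d[TITLE]}.${d[EXT]}"
echo "$path"

unset 'd[LANGUAGES]'
echo "${d[LANGUAGES]:-unknown}/${d[TITLE]}.${d[EXT]}"
```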
I've been using the script for a long time. Now, all of a sudden, it opens epubs like folders and sorts every single HTML page into the pamphlet folder, rendering the original epub useless.
Very useful script, but it is difficult for a Windows audience to figure out from the sample command in the instructions:
Thanks.
I run the scripts installed from the zip archive. No matter what directory I try, a command line like ./organize-ebooks.sh -d -v -km "/Volumes/Public/bearbeiten/Texte/benennen Buch/sortiert nach Sprache/en/klassifiziert/006.22"
gives
Recursively scanning '/Volumes/Public/bearbeiten/Texte/benennen Buch/sortiert nach Sprache/en/klassifiziert/006.22' for files
[/Volumes/..nventor's Guide (2017).pdf] Testing '/Volumes/Public/bearbeiten/Texte/benennen Buch/sortiert nach Sprache/en/klassifiziert/006.22/The Arduino Inventor's Guide (2017).pdf' for corruption...
[/Volumes/..nventor's Guide (2017).pdf] tr: Illegal byte sequence
[/Volumes/..nventor's Guide (2017).pdf] Checking pdf file for integrity...
[/Volumes/..nventor's Guide (2017).pdf] fmt: illegal option -- -
[/Volumes/..nventor's Guide (2017).pdf] usage: fmt [-cmps] [-d chars] [-l num] [-t num]
[/Volumes/..nventor's Guide (2017).pdf] [-w width | -width | goal [maximum]] [file ...]
[/Volumes/..nventor's Guide (2017).pdf] Options: -c center each line instead of formatting
[/Volumes/..nventor's Guide (2017).pdf] -d double-space after at line end
[/Volumes/..nventor's Guide (2017).pdf] -l turn each spaces at start of line into a tab
[/Volumes/..nventor's Guide (2017).pdf] -m try to make sure mail header lines stay separate
[/Volumes/..nventor's Guide (2017).pdf] -n format lines beginning with a dot
[/Volumes/..nventor's Guide (2017).pdf] -p allow indented paragraphs
[/Volumes/..nventor's Guide (2017).pdf] -s coalesce whitespace inside lines
[/Volumes/..nventor's Guide (2017).pdf] -t have tabs every columns
[/Volumes/..nventor's Guide (2017).pdf] -w set maximum width to
[/Volumes/..nventor's Guide (2017).pdf] goal set target width to goal
[/Volumes/..nventor's Guide (2017).pdf] pdfinfo returned successfully
[/Volumes/..nventor's Guide (2017).pdf] fmt: illegal option -- -
[/Volumes/..nventor's Guide (2017).pdf] usage: fmt [-cmps] [-d chars] [-l num] [-t num]
[/Volumes/..nventor's Guide (2017).pdf] [-w width | -width | goal [maximum]] [file ...]
[/Volumes/..nventor's Guide (2017).pdf] Options: -c center each line instead of formatting
[/Volumes/..nventor's Guide (2017).pdf] -d double-space after at line end
[/Volumes/..nventor's Guide (2017).pdf] -l turn each spaces at start of line into a tab
[/Volumes/..nventor's Guide (2017).pdf] -m try to make sure mail header lines stay separate
[/Volumes/..nventor's Guide (2017).pdf] -n format lines beginning with a dot
[/Volumes/..nventor's Guide (2017).pdf] -p allow indented paragraphs
[/Volumes/..nventor's Guide (2017).pdf] -s coalesce whitespace inside lines
[/Volumes/..nventor's Guide (2017).pdf] -t have tabs every columns
[/Volumes/..nventor's Guide (2017).pdf] -w set maximum width to
[/Volumes/..nventor's Guide (2017).pdf] goal set target width to goal
[/Volumes/..nventor's Guide (2017).pdf] ERROR on line 191 /Users/Guy/Eingang/ebook-tools-master/lib.sh!
[/Volumes/..nventor's Guide (2017).pdf] ERROR on line 191 /Users/Guy/Eingang/ebook-tools-master/lib.sh!
[/Volumes/..nventor's Guide (2017).pdf] ERROR on line 391 /Users/Guy/Eingang/ebook-tools-master/lib.sh!
[/Volumes/..nventor's Guide (2017).pdf] ERROR on line 268 ./organize-ebooks.sh!
ERROR on line 313 ./organize-ebooks.sh!
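The usage text in the log is the one BSD fmt prints, and BSD fmt does not accept GNU-style long options, so the scripts probably expect GNU coreutils. On macOS these are commonly installed with a g prefix (gfmt, gtr). The sketch below is a hypothetical check, not part of ebook-tools: it looks for a fmt that understands long options and shows the C-locale workaround often used for "tr: Illegal byte sequence".

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of a GNU-compatible fmt, if any.
find_gnu_fmt() {
  local candidate
  for candidate in fmt gfmt; do
    # GNU fmt accepts --version; BSD fmt rejects long options.
    if "$candidate" --version >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

if gnu_fmt="$(find_gnu_fmt)"; then
  printf 'some long line of text to wrap\n' | "$gnu_fmt" --width=20
else
  echo "No GNU fmt found; GNU coreutils may need to be installed" >&2
fi

# Forcing the C locale makes tr treat its input as raw bytes.
printf 'caf\303\251\n' | LC_ALL=C tr -d '\200-\377'
```

Putting the GNU tools first in PATH, if the installation offers unprefixed names, would let the scripts run unmodified.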
While building the Docker image, I found that the step
RUN curl 'https://www.mobileread.com/forums/attachment.php?attachmentid=163537' > goodreads.zip
no longer works because the link is not valid anymore. The website just returns an HTML page with the message:
'Invalid Attachment specified. If you followed a valid link, please notify the administrator'
Is there any workaround?
Hello ,
First, thank you for this tool to organize books. I have a huge collection of books that I want to organize, so I decided to try your tool via Docker because I am on Windows.
docker run -it -v //wsl$/Ubuntu-20.04/home/a:/unorganized-books ebooktools/scripts:latest
organize-ebooks.sh --output-folder=organized-books/
As a result I get this:
eBook Organizer v0.5.1
Usage: organize-ebooks.sh [OPTIONS] EBOOK_FOLDERS...
For information about the possible options, see the README.md file or the script source itself
What did I do wrong? Can you help me, please?
Regards
Hi
My directory structure
/data
├── Books
│ ├── corrupt
│ ├── organized
│ ├── pamphlet
│ └── uncertain
├── Downloads
Error when running via docker.
docker run -it ebooktools/scripts:latest organize-ebooks.sh -ocr=true -v -o=/data/Books/organized -ofc=/data/Books/corrupt -ofp=/data/Books/pamphlet -owi --output-folder-uncertain=/data/Books/uncertain -mfo="WorldCat xISBN,Google,Goodreads,ISBNDB,OZON.ru" -owis="Goodreads,Amazon.com,Google,Edelweiss,Open Library,Big Book Search" -oft='"${d[TITLE]} ${d[AUTHORS]} ${d[PUBLISHED]:+${d[PUBLISHED]%%-*}}.${d[EXT]}"' /data/Downloads
Recursively scanning '/data/Downloads' for files
find: ‘/data/Downloads’: No such file or directory
ERROR on line 311 organize-ebooks.sh!
Please help!
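The docker run command above has no -v option, so the host's /data directory is not visible inside the container, which would explain find's "No such file or directory". The sketch below assembles a command with a bind mount and prints it for review instead of running it. The container-side mount point is an assumption, and only a few of the original options are repeated for brevity.

```shell
#!/usr/bin/env bash
# Dry run: assemble the docker command as an array and print it for review.
host_data="/data"        # host directory from the report above
container_data="/data"   # assumed mount point inside the container

cmd=(
  docker run -it
  -v "${host_data}:${container_data}"
  ebooktools/scripts:latest
  organize-ebooks.sh -v
  -o="${container_data}/Books/organized"
  "${container_data}/Downloads"
)

printf '%q ' "${cmd[@]}"
echo
# To actually run it: "${cmd[@]}"
```

Building the command as an array keeps arguments with spaces or quotes intact, which matters once a filename template is added back.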
I'm crap at regexp and things like that. But I'd like to organize my books by Author and then by series, if any. I guess that is possible, but how? And it would be great to have the year in the title as well. Something like this:
Dan Brown
- Robert Langdon Series
- Origin (2017)
- Inferno (2018)
Whether it's done with a one-liner in bash or in ebook-tools doesn't really matter.
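Other reports on this page pass a filename template built from d[AUTHORS], d[SERIES], d[TITLE], d[PUBLISHED] and d[EXT], which suggests the layout above could be expressed with bash parameter expansion alone. The following sketch evaluates such a template against sample values; whether SERIES is actually filled in depends on the metadata source, and the eval step is an assumption about how the scripts treat the template.

```shell
#!/usr/bin/env bash
set -u

# Sample values only; in the scripts, d would be filled from fetched metadata.
declare -A d=(
  [AUTHORS]="Dan Brown"
  [SERIES]="Robert Langdon Series"
  [TITLE]="Origin"
  [PUBLISHED]="2017-10-03T00:00:00+00:00"
  [EXT]="epub"
)

# Author folder, optional series folder, then "Title (Year).ext".
template='"${d[AUTHORS]}/${d[SERIES]:+${d[SERIES]}/}${d[TITLE]}${d[PUBLISHED]:+ (${d[PUBLISHED]%%-*})}.${d[EXT]}"'

new_path="$(eval echo "$template")"
echo "$new_path"
```

The :+ form emits the series folder only when a series is known, so standalone books end up directly in the author folder.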
I want to thank you and congratulate you for this amazing piece of code.
Your scripting skills are amazing.
I'm using the organize-ebooks.sh and interactive-organizer.sh scripts to make sure the books are correctly renamed, but I'd like to embed the metadata directly in the files.
At the moment, I end up with the metadata in a human-readable format (incompatible with calibre) and a set of books that are more or less well renamed and checked by me (with the interactive script). I'm then forced to feed them to calibre, which scrapes the metadata again (and is prone to failure), before I can embed it.
Is there any way for the first script to generate metadata in OPF format so that calibre can embed it directly or, better, for the script to embed the metadata itself?
Thank you again.
Enhancement request. Allow for identified ebooks to be organized into a predictable folder structure without renaming the ebooks themselves.
My goal is to organize all books by Author but I can imagine others may want to organise by different metadata.
Note: I may also need to add sharding, as filesystems like samba typically don't do well with folder listings that are several thousand entries long. Sharding may be off topic for this request but is included for completeness.
Example for demonstration:
From:
Frank Herbert Dune.epub
Douglas Adams The Hitchhikers Guide to the Galaxy.epub
George Orwell Nineteen Eighty-Four.epub
Isaac Asimov Foundation.epub
To:
\organized-books\D\Douglas Adams The Hitchhikers Guide to the Galaxy.epub
\organized-books\F\Frank Herbert Dune.epub
\organized-books\G\George Orwell Nineteen Eighty-Four.epub
\organized-books\I\Isaac Asimov Foundation.epub
The drivers for this are:
named correctly
as I am filed in the correct location
Excellent work. It took me far too long to stumble upon this excellent project idea.
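The first-letter sharding in the example above could be sketched as follows. shard_path is a hypothetical helper, not an existing ebook-tools option, and it uses forward slashes rather than the backslashes shown in the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a filename to a one-letter shard folder.
shard_path() {
  local root="$1" name="$2" first
  first="${name:0:1}"
  first="${first^^}"                       # uppercase (bash 4+)
  [[ "$first" == [A-Z0-9] ]] || first="_"  # catch-all shard for anything else
  printf '%s/%s/%s\n' "$root" "$first" "$name"
}

for f in "Frank Herbert Dune.epub" "Douglas Adams The Hitchhikers Guide to the Galaxy.epub"; do
  shard_path "organized-books" "$f"
done
```

The files keep their original names; only the containing folder is derived from the first character.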
Using the docker image -
It seems that whenever I attempt to sort pamphlets with an absolute path, the folder structure becomes weird, almost as if the filename includes the old path. Note the double slash in the TO line.
OK: /vault/Literature/test/example_pamphlet.pdf
TO: /vault/Literature/Pamphlets//vault/Literature/test/example_pamphlet.pdf
The command i'm using at the moment:
organize-ebooks.sh -o=/vault/Literature/Books -owi -ofu=/vault/Literature/Uncertain -ofp=/vault/Literature/Pamphlets -ofc=/vault/Literature/z_corrupted --keep-metadata /vault/Literature/test
A small hotfix that I have found to work is to change the current directory to my unsorted folder and pass the current folder as the argument instead. It then outputs to "/vault/Literature/Pamphlets/./Proofmarks/example_pamphlet.pdf", which at least makes it work as intended.
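The TO line looks like the output folder joined with the full source path. A minimal sketch of the alternative, where only the part of the path below the scanned folder is kept; dest_for is a hypothetical helper and not the script's actual code.

```shell
#!/usr/bin/env bash
# Hypothetical helper: destination path that keeps only the part of the
# source path below the scanned folder.
dest_for() {
  local scan_root="${1%/}" out_root="${2%/}" file="$3"
  local rel="${file#"$scan_root"/}"   # strip the scanned folder prefix
  printf '%s/%s\n' "$out_root" "$rel"
}

dest_for /vault/Literature/test /vault/Literature/Pamphlets \
  /vault/Literature/test/example_pamphlet.pdf
```

This is the same effect the hotfix achieves by running from inside the unsorted folder, since the paths are then already relative.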
Hello,
When running:
sudo ./organize-ebooks.sh -o=/media/storage/certain --organize--without--isbn
--output-folder-uncertain=/media/storage/uncertain
--output-folder-corrupt=/media/storage/corrupt
-ofp=/media/storage/pamphlets --keep-metadata
--output-filename-template='"${d[AUTHORS]// & /, }/${d[AUTHORS]// & /, } - ${d[SERIES]:+ [${d[SERIES]//:/ -}] - }
${d[TITLE]//:/ -}${d[PUBLISHED]:+ (${d[PUBLISHED]%%-*})}${d[ISBN]:+ [${d[ISBN]}]}.${d[EXT]}"' .
I get this:
ERROR on line 136 ./organize-ebooks.sh!
ERROR on line 313 ./organize-ebooks.sh!
I am running this on wsl ubuntu and with mounted hard drive. Installation of deps went easily.
The title says it all, my specific case was "Debbie Macomber - [The Manning Men #3] - The Manning Grooms - Bride on the Loose\Same Time, Next Year (2008) [9780778326021]"
Hi!
Tell me how to move unorganized books to a separate directory?
Thanks
Is it planned, or can someone port this amazing tool to a Windows executable (or .bat) and provide a binary that runs on Windows 7+ systems?
This would really be helpful I believe.
Maybe this sh to bat converter also helps:
https://daniel-sc.github.io/bash-shell-to-bat-converter/