haroldtreen / epub-press-clients Goto Github PK

View Code? Open in Web Editor NEW

566.0 17.0 67.0 15.49 MB

📦 Clients for building books with EpubPress.

Home Page: https://epub.press

License: GNU General Public License v3.0

HTML 0.46% JavaScript 98.96% CSS 0.58%

ebook article kindle epub extension firefox npm javascript chrome

epub-press-clients's Introduction

epub-press-clients

Easy to use clients for building ebooks with EpubPress.

Backend code can be found in haroldtreen/epub-press.

Overview

EpubPress is a service for stitching articles/blogs/webpages into a customized ebook.

🌟 Benefits 🌟

EpubPress makes reading the web more enjoyable!

Downloads your articles for offline reading.
Books are compatible with all your iPhone, Android, Kindle, Nook, etc.
Removes website boilerplate and ads. Just. Clean. Content.
Lets you group information together (eg. "News from this week", "Top 10 travel articles").
Easy to sharing with friends.

Packages 📦

epub-press-chrome

Source code for the EpubPress chrome extension. The extension allows you to build ebooks by selecting articles from your currently open tabs.

It is available on the Chrome Store

See the Readme here

epub-press-js

A javascript library for creating books with EpubPress.

It is available on npm

See the Readme here

epub-press-widgets

A set of ready to use widgets for integration by publishers. Give your users the ability to download your content in an ebook.

Development/release is on the Roadmap.*

See the Readme here

Roadmap 🛣

Widgets.
Ability to share your generated creations with others.
Custom cover page.
Extra metadata (eg. author).
Detection for pages with a list of links (eg. OneTab, Feedly, RSS Feeds) and show those links.

Have any awesome ideas? Suggestions? Feature requests? Would love to hear them!
[email protected]

Bug Reporting 🐛

Create an issue in Github.

Send a help request to [email protected].
Please include as much information as possible (eg. version, os, reproduction steps, screenshots)

Acknowledgements 👏

Icon created by Picol.org (http://www.picol.org/)
Content extraction using node-readability (https://www.npmjs.com/package/node-readability)
epub construction using nodepub (https://www.npmjs.com/package/nodepub)

epub-press-clients's People

Contributors

Stargazers

Watchers

epub-press-clients's Issues

code block rendering error

I'm not fluent in English and it's my first issue, please excuse any mistakes I've made :)

I'm trying to convert GJS documentation into epub.
The code block in it should be this

but it looks like

const {GLib} = imports.gi;
      const loop = new GLib.MainLoop(null, false);
      // Returns a Promise that randomly fails or succeeds after one second
      function unreliablePromise() {
          return new Promise((resolve, reject) => {
              GLib.timeout_add_seconds(GLib.PRIORITY_DEFAULT, 1, () => {
                  if (Math.random() >= 0.5)
                      resolve('success');
                  else
                      reject(Error('failure'));
                  
                  return GLib.SOURCE_REMOVE;
              });
          });
      }
      // An example async function, demonstrating how Promises can be resolved
      // sequentially while catching errors in a try..catch block.
      async function exampleAsyncFunction() {
          try {
              let count = 0;
              
              while (true) {
                  await unreliablePromise();
                  log(`Promises resolved: ${++count}`);
              }
          } catch (e) {
              logError(e);
              loop.quit();
          }
      }
      // Run the async function
      exampleAsyncFunction();
      loop.run();
      
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42

Generate Cover Image w/ Book Titles

Currently, every book is given the exact same cover image. This makes it difficult to distinguish books from each other.

Solution: Add the title to each cover image so that books are distinguishable by the cover.

Where's the donate button?

I would like to support the project and have some Bitcoin. Do you accept donations?

Extension is unresponsive when sending many pages

Current Behavior

When a user with a slow connection or many tabs uses the service, clicking the Download button makes the extension unresponsive.

Expected Behaviour

When Download is clicked, the steps before talking to the server (grabbing the HTML, making the request) should be shown within the loader.

Info

EpubPress is receiving messages from all the tabs containing all their HTML. So if you have 20 pages open, each with 10 Mb of HTML, that's 200 MB+ of RAM that will be needed just to transfer the HTML.

The solution to this problem could be both showing a loader to make things look more responsive, but also chunking the data.

epub.press down?

Seems to be a problem publishing and visiting the site -502 Bad Gateway

Support Custom Cover Images

Add a field for Cover Image url.

EpubPress could then download the image and use that instead of the default.

Image extraction for sites requiring authentication

I would like to enable image extraction on sites which require authentication. Currently the server
fails downloading images and I get an epub with only the content in the HTML. Looking through the code it looks like my options would be either:

Port the book generation portion of the back end into the browser plugin and build the book in browser.
Have the browser plugin send all images in the HTML to the back end.

I think I could probably come up to speed and work on this but would appreciate some direction and any ideas you might have.

Document compatible Linux e-reader software

The main site currently shows supported readers for a variety of platforms. It doesn't give any examples for Linux.

Some (such as Calibre) work on Linux, but the docs should be updated to show this.

rtl issues

using.mobi I get words in reversed order like this https://tarjema.files.wordpress.com/2018/03/20180311_132838cropped.jpg?w=640
using .epub I get characters in a wrong way like this https://i.ytimg.com/vi/8iGlWF9G_KY/maxresdefault.jpg

I think it was working well in previous versions so please check this also I can contribute to this issue but I want a guide :)

Picture of Content can't get

The ebook can't get picture of webpage

Omit table of contents if it's only one article

Is it possible to remove the TOC if it's only one artice?

Extract article author and date. Include both in the references.

The server should be better at extracting the author and date from articles.

It would also be great to include these field in the references page.

Open source EpubPress backend

I'm now working for Squarespace and have less time to maintain the backend code. I'd like to open source it so others can add features without waiting on me.

Todo

Audit commit history for sensitive data.
Write better documentation.
Keep the clients and backend within an EpubPress organization?
Add appropriate license info.

Chrome Extension throws "Unexpected token < in JSON at position 0"

Preserve text indentation

When converting content that contains indented text, for things such as lines of code, the resulting ePub removes all the indenting and left aligns the lines of code. This makes it hard to read the code examples.

Revamp Service Limits

Currently EpubPress sets limits in a few categories.

Number of items.
Number of images.
Size of individual images.

The real reason for these limitations is space. Each image/section requires more data to store and send. The result is that single articles with many images don't work well. Books with many small chapters aren't possible.

A better system would only set limits on total bandwidth.

Firefox extension omits all text from some pages, splits it one line per page from other pages

Tested in WF 56 and FF 66. Can't use debugging instructions, since they're for Chrome.

Omits almost all text: http://www.kolumbus.fi/taglarsson/dokumentit/sov.htm

Splits across pages: http://www.kolumbus.fi/taglarsson/dokumentit/sovburo.htm

macOS Safari extension

I've seen the Safari extension is already planned. Is there already any ETA available? Or is there any code which can already be used or testdriven? Would love to help out here, since Safari is my primary content browser.

Generate table of content from headings

Hi,

I think it would be easier to navigate when reading long articles if we could generate full table content for an article from its headings.

I wonder if we could implement that feature. Thank you.

Detect language of articles and set the language in the ebook

Current Behavior

If you publish an ebook with a foreign language in it, you can't use epub listening software because the language on the ebook is set to english.

Expected Behavior

EpubPress should detect the language of the content, or the user should be able to provide a value.

Support for .txt files

Not Downloading

I am trying to download multiple pages (chapters) from the same webpages, but whenever I add more than one page to the potential epub, pressing the download button doesn't do anything

Breaking Articles into Chapters

It would be nice if the user could break articles into Chapters. For instance.

Chapter 1
- Article 1
- Article 2
Chapter 2
- Article 3
- Article 4

Expected Behavior

This could involve adding a + button to the list of articles.
Clicking on it would insert another checkbox but with a textfield beside it.
Filling the textfield would create a "Chapter" with whatever that text was. Articles after that break would be included in the "Chapter".
Articles not preceded by a break would be "Top Level" articles.
The Table of contents would have a hierarchy to show what articles are in each chapter.

Support Custom Cover Pages

Summary

@wrobbins recently added support for user provided cover images:

haroldtreen/epub-press#34

The clients however are still very specific about what they send to the backend:

https://github.com/haroldtreen/epub-press-clients/blob/master/packages/epub-press-js/epub-press.js#L37-L54

In order for users to take advantage of the backend feature we will need to update the clients.

Acceptance Criteria

epub-press-js can accept the additional metadata
epub-press-js has docs for these new params (added here )
Bonus: A field is added in epub-press-chrome for the extension.
- Additional fields should be hidden by default and only shown if expanded? Maybe we save if the user has expanded these extra fields and consistently show the fields between usages.
- This is a stretch - another issue to followup on this would be sufficient.

feature request: config to download comments on reddit

Can this also download the comment section of reddit?

Failed - Unexpected token < in JSON at position 0

macOS: Sierra 10.12.5
Chrome: 58.0.3029.110 (64-bit)

Steps to reproduce:
1/ Open new Chrome window
2/ Navigate to https://seat14c.com/future_ideas
3/ Open each of the entries in the “Passenger Manifest” on the left-hand-side in a new tab, from top to bottom, including “The Origin Story” (but not including “Claim Your Seat”)
4/ Close the root tab, so that “The Origin Story” has focus
5/ Click the Epub.Press icon
6/ Enter the title “Seat 14C”
7/ Click “Select All”
8/ Click “Download”

Expected outcome:

completed ePub downloads successfully

Actual outcome:

“Failed - Unexpected token < in JSON at position 0"

Workaround:

Selecting a single tab, the ePub is generated and downloaded successfully.

iOS Safari "Add to iBooks"

Are there any plans for adding a Share Extension to Safari on iOS to print an article as book to iBooks? iBooks currently only supports PDF - see the attached screenshot.

FWIW: I might help out creating an iOS app, but I don't know the specs of ePub.press yet.

tilde ~ in filename causes EpubPress to hang in Brave browser

I have used EpubPress for months now on Firefox and never had any problems using tilde ~ in a filename. But I recently switched to the Brave browser (Chromium-based) and I found the EpubPress hangs if I use a tilde ~ in a filename. It gets all the way to "Done! Pressing Your Epub" with the progress bar at 100% but it never shows success. It times out after a long time (maybe 5 minutes?) and says "Failed Download timed out". Following your debugging tips, I found this error: Unchecked runtime.lastError: Invalid filename.

If I do everything exactly the same but leave the tilde ~ out of the filename, it works normally. And it also works in Firefox even with a tilde ~. I have only seen the problem with Brave and tilde ~ in filename.

OS is Windows 10.

EpubPressJS only works in the browser.

But building book generating web-scrapers with NodeJs would be so much fun!

Support for RTL Languages

Hey there, What an excellent add-on! I found it yesterday on the Mozilla site.

I see that content I create from an article in Hebrew comes out aligned to the left rather than the right, which I guess means that the directionality of the text is not extracted and preserved from the DOM when creating the epub. I didn't dive into the code yet, but I bet it should be an easy one to fix if the code already looks for other local and global settings like it for indentations and the like.

In case I can't find the place and submit a PR (JS is not my forte, I'm an infrastructure guy), I'm also leaving here this item, in case anyone else can pick this up.

Thanks!

Debugging Instructions

When people encounter errors, it would be great to have instructions for debugging.

Eg.

How to patch EpubPress to inspect javascript client internals.
A debug mode to create logs.
Inspecting the chrome extension.

Incrementing default titles

Current behavior

If you create a book with the Chrome Extension and don't provide a title, a default of EpubPress - <Month> <Day> <Year> is used.

This can be annoying if the user wants to create multiple books and not give each one a unique title.

Desired Behavior

When a book is created, the extension should start a count of how many books have been created that day. Each subsequent book created in the same day, will include that count so that each book has a unique title.

Add a CLI client

CLIs just are the best interfaces. I can automate them and integrate the service into bigger scripts, or just it interactively on any device regardless of the browser. (iOS, cough cough.)

why did we had the size limitations of EpubPress?

why did we had the limitations of EpubPress?
Books are limited to containing 50 articles.
Books must be 10 Mb or less for email delivery to work.
Images in an article must be 1 Mb or less. Images that exceed this limit will be removed.
No more than 30 images will be downloaded.

file title in unknown | kindle

when I save file ( send to kindle email) I got unknown title

Lazyloaded image can't show

The ebook can't get picture of webpage
Lazyloaded image can't show

Firefox extension is stuck

My firefox extension is stuck at following position

I tried to reopen firefox, clear cache, etc...
even after disable/enable the extension is in the same state.
Is it possible to cancel the process or reset the extension state?
Thanks for your help.

List of References

Is it possible to add a list of references to all article URL's at the end of the book?

Firefox extension sticks on "fetching images" again and again

Tested in both Waterfox 56 and Firefox 66. Same behavior in both. Uninstalled and reinstalled. Same behavior afterwards.

When EpubPress gets stuck "fetching images" it fails to produce and epub, and on checking, it shows it trying "fetching images" again without the usual EpubPress menu.

Extension dark mode icon

Current Behavior

Set your browser theme / OS theme to dark mode.
The extension icon is black and not visible in the toolbar.

Expected Behavior

There should be a white icon to provide contrast in dark mode.
Maybe the UI should also update / invert (black background + white text? 🤔)

.mobi file always gets the default title

I'm using the Firefox add-on and download the files as .mobi. Even though I set a title in the add-on, I always get an ebook with the default title ("EpubPress" and then the date) that I have to adapt manually in Calibre.

How Can I Change The User-Agent of EpubPressJs?

I'm trying to use EpubPressJs (talking to a local server hosted on my machine) to download articles, but I'm getting blocked right off the bat on some websites with super aggressive firewalls.

The same thing was happening when I was using Beautiful Soup (Python scraper), and just changing the user-agent to

Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36

Solved the problem. How can I do the same on EpubPress? Please bear in mind that I'm not a programmer, so if you could ELI5, it'd be very very much appreciated!

P.S. @haroldtreen thank you for building such an awesome application! I can't put into words how awesome it is to read properly formatted articles offline :')

Meta data

Can we set the Author of the Book to the article Author instead of "ePub.Press"? Maybe only, if only one article is converted?

If you'd like to keep ePub.Press in the meta data, there's a Field something like "creator" which contains the app created the book.

Add UI for creating ebooks from a list of links

User submitted suggestion: Create a UI that allows you to create an ebook from a list of URLs rather than open tabs.

safari file unknow

Safari downloads the file as Unknown. You then must manually add the file extension (eg. .epub or .mobi)

Must I add the file extension after the download is completed?

Support chrome webp images format

Hi,

Thanks for this great chrome extension.

If the website uses the webp images format with chrome, there will be no images in the ebook. I think the backend is not able to fetch this type of image.

Example page:
https://www.republik.ch/2018/07/24/zimbabwes-geschichte-zeigt-ein-duesteres-bild

Example url from chrome:
https://cdn.republik.space/s3/republik-assets/github/republik/article-zimbabwe-und-der-hoellenhund/images/28f7748f4e791665e88a0a324c212fc3c8d72a2b.jpeg.webp?size=3600x2328&resize=3000x

Same image when using firefox:
https://cdn.republik.space/s3/republik-assets/github/republik/article-zimbabwe-und-der-hoellenhund/images/28f7748f4e791665e88a0a324c212fc3c8d72a2b.jpeg?size=3600x2328&resize=3000x

My workaround for now is to use firefox for epub generation.

Filter files (.pdf, .png, etc.) from appearing in tab list.

Reproduction steps:

Open a PDF in Chrome (eg. http://www.gameaipro.com/GameAIPro/GameAIPro_Chapter15_Runtime_Compiled_C++_for_Rapid_AI_Development.pdf)
Click the extension.
Notice that it is listed as a page you can have in your ebook.

When you send these types of files, it just results in an error:

Would be better to not even show these types of files.

Various images sometimes missing from output

Sometimes various images are missing from generated ebooks. I encountered this while trying to convert all of the chapters/sections of Tempest: Geometries of Play to an epub. After opening all pages in tabs, selecting them in epubpress, and generating, some images are present in the output but some are not. Which ones are present or missing varies on each attempt. I've seen this issue occur with several different websites. Perhaps epubpress is exceeding some servers' maximum concurrent connections or something so some images timeout?

Latex formulas are huge

Latex formulas result in giant height and width images in the output epub. (For instance try to conver this equation). When I convert with dotepub they are slightly large but better.

I added [some_url] to my book but the chapter is empty

Some possible causes:

Javascript is required to load the page (should be fixed in the latest
The website HTML is poorly structured which caused content extraction to fail.
The content extractor needs to be updated.

Feel free to post the URL here or email it to [email protected] and we'll look into getting that fixed for you 👍 .

haroldtreen / epub-press-clients Goto Github PK

epub-press-clients's Introduction

epub-press-clients

Overview

🌟 Benefits 🌟

Packages 📦

epub-press-chrome

epub-press-js

epub-press-widgets

Roadmap 🛣

Bug Reporting 🐛

Acknowledgements 👏

epub-press-clients's People

Contributors

Stargazers

Watchers

Forkers

epub-press-clients's Issues

Current Behavior

Expected Behaviour

Info

Todo

Expected Behavior

Summary

Acceptance Criteria

Current behavior

Desired Behavior

Recommend Projects

Recommend Topics

Recommend Org

Jobs