Binary release with Nuitka (pdfgrabber, open issue)

BigB84 commented on June 21, 2024
Binary release with Nuitka

from pdfgrabber.

Comments (4)

FelixFrog commented on June 21, 2024

Thanks for letting me know about it. I wasn't aware of its existence. Sadly, I think this would add a lot of complexity for very little benefit. As of now, pdfgrabber's speed is capped by the download speed, by pymupdf's speed (which is already a C binding) and by being generally single-threaded. It also consumes a lot of memory, because it often keeps many 100+ MB zip files in memory. I think fixing these issues would give a much better UX improvement for the effort required.
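To illustrate the memory point, here is a minimal sketch of spooling a downloaded zip to a temporary file instead of holding it in memory. It is not pdfgrabber's code: the function names `spool_to_disk` and `list_members` are hypothetical, and it assumes the download is available as a readable binary stream.

```python
import shutil
import tempfile
import zipfile


def spool_to_disk(stream, chunk_size=1024 * 1024):
    """Copy a readable binary stream into a temporary file, one chunk at a time."""
    tmp = tempfile.TemporaryFile()
    # copyfileobj reads at most chunk_size bytes per iteration,
    # so peak memory stays near chunk_size rather than the archive size.
    shutil.copyfileobj(stream, tmp, chunk_size)
    tmp.seek(0)
    return tmp


def list_members(stream):
    """Spool a zip stream to disk, then read its member names from the file."""
    with spool_to_disk(stream) as tmp, zipfile.ZipFile(tmp) as archive:
        return archive.namelist()
```

Any file-like object should work as `stream`, for example the response returned by `urllib.request.urlopen`. The temporary file is deleted when it is closed.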

Also, as of now I am looking into adding a CLI interface. Once that has been added, it would be much easier to create a binary release with a simple GUI, in particular for Windows (even if I am reluctant about that, I have to acknowledge that the majority of the users here are on Windows and are not particularly experienced with the CLI).

BigB84 commented on June 21, 2024

Right!

When it comes to a GUI, I would think carefully about it. With ease of use comes popularity, and this app may be controversial for service owners and copyright holders. But you decide, ofc.

A CLI interface is a good idea. Would you eventually accept a PR for that?

Maybe you could open a milestone for that?

BigB84 commented on June 21, 2024

Maybe a separate issue or milestone could be opened for optimizations (or this issue renamed)?

For unzipping, for instance, I've heard somewhere that WinZip is hardware accelerated, so some Python bindings to WinZip could be found and used.

Also something for simultaneous reading and writing, like this one in Go.

FelixFrog commented on June 21, 2024

> A CLI interface is a good idea. Would you eventually accept a PR for that?

Yes, ofc. For now the main interface has very low code quality, and I would like to eventually rewrite it (using SQLite as the database instead of TinyDB). But having a CLI interface would not stand in the way of that, and it would be a great idea regardless. Implementing it with argparse would make using something like this trivial.
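As a rough idea of what an argparse-based entry point could look like, here is a minimal sketch. The subcommands (`download`, `list`) and the arguments (`service`, `book_id`, `--output`) are hypothetical placeholders, not pdfgrabber's actual interface.

```python
import argparse


def build_parser():
    """Build a hypothetical command-line parser with two subcommands."""
    parser = argparse.ArgumentParser(prog="pdfgrabber")
    sub = parser.add_subparsers(dest="command", required=True)

    # Hypothetical "download" subcommand: fetch one book from one service.
    download = sub.add_parser("download", help="download one book")
    download.add_argument("service", help="service code (placeholder name)")
    download.add_argument("book_id", help="book identifier (placeholder name)")
    download.add_argument("-o", "--output", default=".", help="output directory")

    # Hypothetical "list" subcommand: takes no arguments.
    sub.add_parser("list", help="list available services")
    return parser
```

Calling `build_parser().parse_args()` reads `sys.argv`, and passing a list such as `["download", "svc", "123"]` parses that list instead. Keeping the parser in its own function lets a later GUI or a test reuse it without touching the command line.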

> For unzipping, for instance, I've heard somewhere that WinZip is hardware accelerated, so some Python bindings to WinZip could be found and used.

I will look into it. As of now my choice leans towards py7zr. The only place where this is really needed, though, is the RPLUS_EPUB extraction, so I'm not too sure about adding it as a dependency just for that. Maybe there is a way of using 7z directly if it is installed, and otherwise falling back to the pythonic way.
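The "use 7z if installed, otherwise fall back" idea could be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the function name `extract_archive` is hypothetical, the archive is assumed to be a plain zip, and the fallback uses the standard-library `zipfile` module as a stand-in rather than py7zr.

```python
import shutil
import subprocess
import zipfile
from pathlib import Path


def extract_archive(archive, dest, use_external=True):
    """Extract a zip archive, preferring a 7z binary found on PATH."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)

    # shutil.which returns None when no "7z" executable is on PATH.
    exe = shutil.which("7z") if use_external else None
    if exe:
        # "x" extracts with full paths, "-o<dir>" sets the output
        # directory (no space after -o), "-y" answers yes to prompts.
        subprocess.run(
            [exe, "x", str(archive), f"-o{dest}", "-y"],
            check=True,
            stdout=subprocess.DEVNULL,
        )
        return "7z"

    # Pure-Python fallback, assuming the archive is a zip file.
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    return "zipfile"
```

The return value only reports which path was taken, which makes it easy to log or test. Some systems ship the binary as `7za` or `7zz` instead, so a real implementation would probably need to probe more than one name.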
