mjourdan / paperwork

This project forked from openpaperwork/paperwork

Using scanners and OCR to grep dead trees the easy way (Linux only)

License: GNU General Public License v3.0

Shell 0.53% Python 99.47%

Paperwork

Description

Paperwork is a personal document manager for scanned documents (and PDFs).

It's designed to be easy and fast to use. The idea behind Paperwork is "scan & forget": You should be able to just scan a new document and forget about it until the day you need it again.

In other words, let the machine do most of the work for you.

Screenshots

Main Window & Scan

Search Suggestions

Labels

Settings window

Details

Papers are organized into documents. Each document contains pages.

Paperwork relies mainly on four other pieces of software:

  • SANE: to scan the pages
  • Tesseract: to extract the words from the pages (OCR)
  • GTK/Glade: for the user interface
  • Whoosh: to index and search documents, and to provide keyword suggestions

Page orientation is automatically guessed using OCR.
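One way to picture OCR-based orientation guessing is to run OCR on the page at each of the four possible rotations and keep the angle whose output looks most like real text. The sketch below is only an illustration of that idea under stated assumptions, not Paperwork's actual code: `guess_orientation`, `count_known_words`, the `ocr_at_angle` callback and the vocabulary-based score are all hypothetical names and choices introduced here.

```python
from typing import Callable, Set

# The four rotations a scanned page can plausibly have.
ANGLES = (0, 90, 180, 270)


def count_known_words(text: str, vocabulary: Set[str]) -> int:
    """Count whitespace-separated tokens of `text` found in `vocabulary`."""
    return sum(
        1
        for token in text.lower().split()
        if token.strip(".,;:!?") in vocabulary
    )


def guess_orientation(ocr_at_angle: Callable[[int], str],
                      vocabulary: Set[str]) -> int:
    """Return the rotation whose OCR output contains the most known words.

    `ocr_at_angle` is a hypothetical callback: given an angle, it returns
    the text an OCR engine produced for the page rotated by that angle.
    Ties are resolved in favour of the first angle in ANGLES.
    """
    scores = {
        angle: count_known_words(ocr_at_angle(angle), vocabulary)
        for angle in ANGLES
    }
    return max(ANGLES, key=lambda angle: scores[angle])
```

In a real setup the callback would rotate the image and call the OCR engine; for experimenting, a dictionary of canned strings keyed by angle is enough to exercise the scoring.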

Since OCR is not perfect, and since some documents don't contain useful keywords, Paperwork also lets you put labels on each document.
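To see why labels help when OCR fails, here is a minimal sketch of the documents/pages/labels model described above. It uses plain Python rather than Whoosh, and the `Document` class and `search` function are hypothetical names chosen for illustration; Paperwork's real index and data structures may look quite different.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Document:
    """A document made of pages (OCR text) plus user-assigned labels."""
    name: str
    pages: List[str] = field(default_factory=list)  # one OCR string per page
    labels: Set[str] = field(default_factory=set)

    def keywords(self) -> Set[str]:
        """All searchable terms: OCR words and labels, lowercased."""
        words = {w.lower() for page in self.pages for w in page.split()}
        return words | {label.lower() for label in self.labels}


def search(documents: List[Document], query: str) -> List[Document]:
    """Return the documents that match every term of `query`."""
    terms = query.lower().split()
    return [
        doc for doc in documents
        if all(term in doc.keywords() for term in terms)
    ]


# The OCR text of this page is garbled, so only the label makes it findable.
docs = [
    Document("scan-001", pages=["1nv0ice t0tal 42"], labels={"invoice"}),
    Document("scan-002", pages=["electricity contract 2012"]),
]
matches = search(docs, "invoice")
```

Here `matches` contains only `scan-001`, which the OCR text alone would not have surfaced.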

Licence

GPLv3 or later. See COPYING.

Installation

Archives

GitHub can automatically provide .tar.gz and .zip archives if needed. However, they are not required to install Paperwork. They are listed here as a convenience for package maintainers.

Contact/Help

Development

All the information can be found on the wiki
