This project forked from 4lex4/scantailor-advanced

License: GNU General Public License v3.0

Scan Tailor Advanced

The Scan Tailor version that merges the features of the Scan Tailor Featured and Scan Tailor Enhanced versions, brings new ones and fixes.

Description

Scan Tailor is an interactive post-processing tool for scanned pages. It performs operations such as page splitting, deskewing, content selection, and margin adjustment.

You give it raw scans, and you get pages ready to be printed or assembled into a PDF or DjVu file. Scanning, optical character recognition, and assembling multi-page documents are out of scope of this project.

Features

Scan Tailor Enhanced features

  • Auto margins

The Auto margins feature keeps page content in its original place. In the Margins step you can choose between the Auto, Manual (default), and Original modes. Manual mode is the original behavior. Auto mode tries to decide whether it is better to align the page to the top, bottom, or center. Original mode keeps each page at its original vertical position.

  • Page detect

The Page detect feature detects the page within black margins, or lets you switch off page content detection and keep the original page layout.

  • Deviation

The Deviation feature highlights pages that differ from the rest. Pages are highlighted in red in the Deskew filter when their skew is too high, in the Select Content filter when their content size differs, and in the Margins filter when they do not match the others.
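
The text above does not say how deviation is measured. As a rough sketch only, assume a page is flagged when its value (skew angle, content size, and so on) lies more than a chosen number of standard deviations from the mean of all pages. The function name `flagDeviatingPages` and the `maxDeviations` parameter are hypothetical and are not taken from the Scan Tailor sources.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: marks pages whose value is far from the mean of all pages.
// "Far" means more than maxDeviations population standard deviations (an assumption).
std::vector<bool> flagDeviatingPages(const std::vector<double>& values, double maxDeviations) {
  std::vector<bool> flags(values.size(), false);
  if (values.empty()) {
    return flags;
  }

  double sum = 0.0;
  for (double v : values) {
    sum += v;
  }
  const double mean = sum / static_cast<double>(values.size());

  double squaredError = 0.0;
  for (double v : values) {
    squaredError += (v - mean) * (v - mean);
  }
  const double stddev = std::sqrt(squaredError / static_cast<double>(values.size()));
  if (stddev == 0.0) {
    return flags;  // All pages are identical, so nothing stands out.
  }

  for (std::size_t i = 0; i < values.size(); ++i) {
    flags[i] = std::fabs(values[i] - mean) > maxDeviations * stddev;
  }
  return flags;
}
```

With skew angles of 0.1, 0.2, 0.1, 0.15 and 5.0 degrees and `maxDeviations` set to 1.5, only the last page would be flagged.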

  • Picture shape

The Picture shape feature adds an option for mixed pages to choose between free-shape and rectangular images. This patch does not improve the original algorithm; it creates rectangular shapes from the detected "blobs" and joins intersecting rectangles into one.
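
The joining of intersecting rectangles can be illustrated with a minimal sketch. It assumes each blob has already been reduced to an axis-aligned bounding rectangle; the `Rect` type and the function names are hypothetical and do not come from the Scan Tailor sources.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Axis-aligned rectangle covering [left, right) x [top, bottom).
struct Rect {
  int left, top, right, bottom;
};

// True if the two rectangles share at least one pixel.
bool intersects(const Rect& a, const Rect& b) {
  return a.left < b.right && b.left < a.right && a.top < b.bottom && b.top < a.bottom;
}

// Smallest rectangle that contains both inputs.
Rect unite(const Rect& a, const Rect& b) {
  return Rect{std::min(a.left, b.left), std::min(a.top, b.top),
              std::max(a.right, b.right), std::max(a.bottom, b.bottom)};
}

// Repeatedly joins intersecting rectangles until no two of them overlap.
std::vector<Rect> mergeIntersecting(std::vector<Rect> rects) {
  bool merged = true;
  while (merged) {
    merged = false;
    for (std::size_t i = 0; i < rects.size() && !merged; ++i) {
      for (std::size_t j = i + 1; j < rects.size(); ++j) {
        if (intersects(rects[i], rects[j])) {
          rects[i] = unite(rects[i], rects[j]);
          rects.erase(rects.begin() + static_cast<std::ptrdiff_t>(j));
          merged = true;  // The grown rectangle may now touch others, so rescan.
          break;
        }
      }
    }
  }
  return rects;
}
```

The rescan after every merge matters: two rectangles that do not touch each other can still end up in one zone when a third rectangle bridges them.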

  • Multi column thumbnails view [reworked]

This lets you expand and undock the thumbnails view to see more thumbnails at a time.

This feature had performance and drawing issues and has been reworked.

Scan Tailor Featured features

  • Scan Tailor Featured fixes & improvements
  1. Deleted 3 red points: the 3 central red points on the topmost (bottom-most) horizontal blue line of the dewarping mesh are now eliminated.
  2. Manual dewarping mode auto switch: the dewarping mode is now set to MANUAL (from OFF) after the user has moved the dewarping mesh.
  3. Auto dewarping vertical half correction: this patch corrects the original auto-dewarping in half of the cases where it fails. If the vertical content boundary angle (calculated by auto-dewarping) exceeds an empirical value (2.75 degrees from vertical), the patch adds a new point to the distortion model (with coordinates equal to those of the neighboring points) to make this boundary vertical. The patch works ONLY for the linear end of the top (bottom) horizontal line of the blue mesh (and not for the opposite curved end).
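
The angle test in item 3 can be sketched as follows. Only the 2.75 degree threshold comes from the text; the `Point` type, the function names, and the way the boundary is represented by two endpoints are assumptions made for illustration. The sketch covers the check only, not the insertion of the new point into the distortion model.

```cpp
#include <cmath>

struct Point {
  double x, y;
};

// Empirical threshold quoted in the text: 2.75 degrees from vertical.
constexpr double kMaxAngleFromVerticalDeg = 2.75;

// Angle in degrees between the segment (top, bottom) and the vertical axis.
double angleFromVerticalDeg(const Point& top, const Point& bottom) {
  const double pi = std::acos(-1.0);
  const double dx = std::fabs(bottom.x - top.x);
  const double dy = std::fabs(bottom.y - top.y);
  return std::atan2(dx, dy) * 180.0 / pi;
}

// True if the boundary leans more than the threshold and would need correcting.
bool needsVerticalCorrection(const Point& top, const Point& bottom) {
  return angleFromVerticalDeg(top, bottom) > kMaxAngleFromVerticalDeg;
}
```

For a boundary 100 pixels tall, a horizontal drift of 4 pixels (about 2.3 degrees) stays below the threshold, while a drift of 5 pixels (about 2.9 degrees) exceeds it.
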
  • Line vertical dragging on dewarp

You can move the topmost (bottom-most) horizontal blue line of the dewarping mesh up and down as a whole by holding down the CTRL key while dragging it by its leftmost (rightmost) red point.

  • Square picture zones

Hold down the CTRL key to create rectangular picture zones. Holding CTRL also lets you move the corners of rectangular picture zones orthogonally.

  • Auto save project [optimized]

Check "auto-save project" in the Settings menu and your project will be saved automatically, provided you have already saved the new project once. This also works during batch processing.

This feature had performance issues and has been optimized.

  • Quadro Zoner

Another rectangular picture zone shape. This option is based on Picture shape and Square picture zones. It squeezes every Picture shape zone down to the real rectangular picture outline and then replaces the resulting raster zone with a vector rectangular zone, so that the user can easily adjust it afterwards by moving its corners orthogonally.

  • Marginal dewarping

An automatic dewarping mode. Works ONLY with raw scans that show the curved top and bottom page borders against a black background. It automatically places the red points of the blue mesh along these borders (to create a distortion model) and then dewarps the scan according to them. Works best on scans with low curvature.

*Other features of this version, such as Export, Dont_Equalize_Illumination_Pic_Zones, and Original_Foreground_Mixed, have not been ported because of their unclean implementation. Their functionality is fully covered by the Full control over settings on output and Splitting output features.

Scan Tailor Advanced features

  • Scan Tailor Advanced fixes & improvements
  1. Portability. Settings are stored in the program's folder.

  2. Page splitting used to affect the output only in B&W mode with dewarping disabled. It now works in all modes.

  3. The page layout and all other views now take the splitting settings into account. Thumbnails have been improved accordingly.

  4. Changed Scan Tailor's behavior at the page split stage.

    1. Reworked the apply cut feature. When a cut is applied to pages whose dimensions differ from those of the page the cut was defined on, Scan Tailor now tries to adapt the cutters instead of rejecting the cut setting entirely and switching those pages to auto mode, as it did before. The old behavior was annoying because pages could be similar and differ by only a few pixels.
    2. Added a check that rejects invalid cut settings in manual mode.
    3. UI: cutters now interact with each other. They can no longer intersect, which previously produced an invalid page layout configuration.
  5. Optimized memory usage on the output stage.

  6. Reworked the Multi column thumbnails view feature from the Enhanced version. Thumbnails are now distributed evenly.

  7. Added an option to control the highlighting (with red asterisks) of thumbnails of pages with high deviation. The option refreshes the thumbnails instantly.

  8. Fixed other bugs from the official, Enhanced, and Featured versions and made many other improvements.

  • Light and Dark color schemes

You can choose a desired color scheme in settings.

  • Multi-threading support for batch processing

This significantly increases processing speed. The number of threads can be adjusted while processing is running.

Warning! More threads require more memory. Make sure the thread count you choose does not exhaust the available memory.
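
One way to reason about the warning is to cap the thread count by a memory budget. The sketch below is an illustration, not how Scan Tailor Advanced picks its thread count: the function name `chooseThreadCount` and the per-page memory estimate are hypothetical, and the real memory use per page depends on scan size and output settings.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: picks a worker count that stays within a memory budget.
// perPageBytes is an assumed estimate of the peak memory needed to process one page.
unsigned chooseThreadCount(unsigned hardwareThreads, std::uint64_t memoryBudgetBytes,
                           std::uint64_t perPageBytes) {
  if (hardwareThreads == 0) {
    hardwareThreads = 1;  // Treat an unknown core count as a single thread.
  }
  if (perPageBytes == 0) {
    return hardwareThreads;  // No estimate available, so memory imposes no cap.
  }
  const std::uint64_t fitting = memoryBudgetBytes / perPageBytes;
  const std::uint64_t capped = std::min<std::uint64_t>(hardwareThreads, fitting);
  return static_cast<unsigned>(std::max<std::uint64_t>(1, capped));
}
```

The first argument could come from `std::thread::hardware_concurrency()`, which may return 0 when the value cannot be determined; that is why the sketch treats 0 as one thread.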

  • Full control over settings on output

This feature gives you control over the cut margins, illumination normalization before binarization, illumination normalization in color areas, and Savitzky-Golay and morphological smoothing options on output in any mode (limited, of course, to the settings that apply in the current mode).

  • Filling outside areas

Pixels in the outside areas can now be filled with the background color of the page.

A fill setting has been added with the following options:

  1. Background: estimate the background and fill outside pixels with its color.
  2. White: always fill with white.
  • Tiff compression

The TIFF compression options let you disable compression or change the compression method used in TIFF files.

The settings dialog has two options: B&W compression and color compression.

  1. The B&W one has None, LZW, Deflate and CCITT G4 (Default) options.
  2. The color one has None, LZW (Default), Deflate and JPEG options.
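
The two option lists can be summarized in a small sketch. The numeric values are those of the TIFF Compression tag (tag 259); the type and function names are hypothetical, and which Deflate code the program actually writes is not stated in the text, so the choice of 8 here is an assumption.

```cpp
#include <cstdint>

// Values of the TIFF Compression tag (tag 259) for the schemes listed above.
enum class TiffCompression : std::uint16_t {
  None = 1,
  CcittG4 = 4,  // CCITT Group 4 (T.6), bilevel data only
  Lzw = 5,
  Jpeg = 7,
  Deflate = 8,  // Adobe-style Deflate; a legacy code 32946 also exists
};

// Hypothetical settings holder using the defaults named in the lists above.
struct TiffCompressionSettings {
  TiffCompression bw = TiffCompression::CcittG4;
  TiffCompression color = TiffCompression::Lzw;
};

// B&W list: None, LZW, Deflate, CCITT G4 (JPEG is not offered).
constexpr bool isOfferedForBw(TiffCompression c) {
  return c != TiffCompression::Jpeg;
}

// Color list: None, LZW, Deflate, JPEG (CCITT G4 is not offered).
constexpr bool isOfferedForColor(TiffCompression c) {
  return c != TiffCompression::CcittG4;
}
```

The split mirrors what the formats can encode: CCITT G4 handles 1-bit images only, which is why it appears only in the B&W list.
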
  • Adaptive binarization

Sauvola and Wolf binarization algorithms have been added. They can be applied when normalizing illumination does not help.
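
For orientation, the published threshold formulas of both methods can be sketched as below. This is a naive illustration, not the implementation used in Scan Tailor Advanced: the function names, the window handling at image borders, and the parameter values passed by the caller (`k`, `R`, `radius`) are assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sauvola: T = m * (1 + k * (s / R - 1)), with m and s the local mean and
// standard deviation and R the dynamic range of s (128 is usual for 8-bit data).
double sauvolaThreshold(double mean, double stddev, double k, double R) {
  return mean * (1.0 + k * (stddev / R - 1.0));
}

// Wolf: T = (1 - k) * m + k * M + k * (s / R) * (m - M), with M the minimum gray
// level of the image and R the largest local standard deviation (must be > 0).
double wolfThreshold(double mean, double stddev, double k, double minGray, double maxStddev) {
  return (1.0 - k) * mean + k * minGray + k * (stddev / maxStddev) * (mean - minGray);
}

// Binarizes a row-major 8-bit grayscale image with a (2 * radius + 1)^2 window
// clipped at the borders. Returns 1 for dark (foreground) pixels, 0 otherwise.
std::vector<std::uint8_t> binarizeSauvola(const std::vector<std::uint8_t>& gray, int width,
                                          int height, int radius, double k) {
  std::vector<std::uint8_t> out(gray.size(), 0);
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      double sum = 0.0, sumSq = 0.0;
      int count = 0;
      for (int wy = std::max(0, y - radius); wy <= std::min(height - 1, y + radius); ++wy) {
        for (int wx = std::max(0, x - radius); wx <= std::min(width - 1, x + radius); ++wx) {
          const double v = gray[static_cast<std::size_t>(wy) * width + wx];
          sum += v;
          sumSq += v * v;
          ++count;
        }
      }
      const double mean = sum / count;
      const double variance = std::max(0.0, sumSq / count - mean * mean);
      const double t = sauvolaThreshold(mean, std::sqrt(variance), k, 128.0);
      const std::size_t idx = static_cast<std::size_t>(y) * width + x;
      out[idx] = gray[idx] < t ? 1 : 0;
    }
  }
  return out;
}
```

Because the threshold follows the local mean and contrast, a dark stroke on an unevenly lit page is compared against its own neighborhood rather than against one global value. Production implementations usually compute the window statistics with integral images instead of the nested loops shown here.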

  • Splitting output

This feature splits mixed output scans into pairs of layers: a foreground (letters) and a background (images).

You can choose between B&W or color (original) foreground.

It can be useful:

  • for further DjVu encoding,
  • to apply different filters to letters and images, since applying them to the whole image gives worse results,
  • to apply binarization to the letters with a third-party app without affecting the images.

Note: This does not rename the files to 0001, 0002... That can be done with a third-party app, for example Bulk Rename Utility.

Building

Go to this repository and follow the instructions given there.
