
smile8848 / retcom


This project forked from imifume/retcom


A fast comic cleaner/typesetter/translator utility.

License: MIT License

Python 100.00%


RetCom

A fast comic cleaner/typesetter/translator utility.

Features

  • Open and edit .jpg, .png, .tiff, .zip, .rar, .cbz, .cbr images/archives.
  • Recognize text using optical character recognition.
  • Content-aware cleaning using advanced inpainting techniques.
  • Mask-based cleaning using text boxes.
  • Translate text using Google Translate, straight from the software.
  • Add text bubbles and change font size and family.
  • Easily export and share text boxes and text bubbles with others.
  • Immediately export cleaned and typeset pages.

Sounds too good to be true? Check out this demo: to be added.

Installation

Simply run the installer for your platform, and you can use the core features.

If you want to use OCR (optical character recognition) features, you will also need to install Tesseract-OCR. If Tesseract is not installed, the app warns you at each launch that OCR features are unavailable and shows installation instructions for your platform. The same platform-specific instructions are repeated here:

macOS

Install Brew or MacPorts and run the command that matches your package manager:

# brew
brew install tesseract

# ports
sudo port install tesseract

Windows

Use the Tesseract installer from UB Mannheim. In particular, install tesseract-ocr-w64-setup-v4.1.0.20190314.exe (simply go through the installation without checking any additional options). Locate the folder where Tesseract-OCR is installed (usually C:\Program Files\Tesseract-OCR or C:\Program Files (x86)\Tesseract-OCR), and add this to your PATH variable as follows:

  1. Press the Windows button and search for 'Edit the system environment variables'.
  2. Click the 'Environment Variables' button on the bottom left.
  3. Select 'Path' in the 'System variables' list, and press 'Edit'.
  4. In the window that just opened, press 'New', and paste the path to Tesseract-OCR.

That's it, you should be good to go now.

Linux

Install Tesseract-OCR following the instructions here. Be sure to install Tesseract v4, since this is directly compatible with the bundled language files.

If you are on Ubuntu, simply run sudo apt install tesseract-ocr.
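
Whichever platform you are on, you may want to confirm that Tesseract is reachable from your PATH before launching the app. The following is a minimal sketch, not part of RetCom itself; the helper names find_tesseract and tesseract_version are hypothetical, and it only relies on the standard tesseract --version command:

```python
import shutil
import subprocess


def find_tesseract():
    """Return the full path of the tesseract executable, or None if it is not on PATH."""
    return shutil.which("tesseract")


def tesseract_version(executable):
    """Return the first line printed by `<executable> --version`."""
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Some Tesseract builds print the version banner to stderr instead of stdout.
    output = result.stdout or result.stderr
    lines = output.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract not found on PATH; OCR features will be unavailable")
    else:
        print(path)
        print(tesseract_version(path))
```

If the script reports that tesseract was not found, recheck the PATH step for your platform and open a new terminal so the change is picked up.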

Usage

More detailed instructions will be added later. For now, see the keyboard shortcuts below:

Scene Editor

Keystroke                      Action
I / J / K / L                  Move selected; hold shift to nudge
A                              Add box
shift+A                        Select all items
D / backspace / del            Remove selected items
shift+{D / backspace / del}    Remove flagged items
E                              Edit box
F                              Flag selected
shift+F                        Unflag selected
G                              Group selected
shift+G                        Ungroup selected
H                              Hide selected
shift+H                        Unhide all
S                              OCR rubberband selection
shift+S                        OCR rubberband tight selection
R                              Restore selected
W                              Inpaint black
shift+W                        Inpaint white
[                              Scale down
shift+[                        Fine scale down
]                              Scale up
shift+]                        Fine scale up

General Shortcuts

Note: Replace ctrl with command on macOS.

Keystroke       Action
ctrl+E          Export LSTMBox
ctrl+shift+E    Export TXTEll
ctrl+I          Page information
ctrl+L          Load LSTMBox
ctrl+shift+L    Load TXTEll
ctrl+O          Open image/archive
ctrl+P          Prescan page
ctrl+S          Save cleaned image
ctrl+shift+S    Save current scene as image
ctrl+T          Translate page

Settings

You can edit the config.json file in your installation folder (the exact location depends on your operating system). Most settings should be self-explanatory; detailed documentation will follow.

Changing OCR/translation language

One thing to look out for is the language tag. The default values are for vertical Japanese text, but you can change this by setting "isVertical" : false and changing the language tag to match one of the files in the tessdata folder. You can download more .traineddata files here; simply add the files to the tessdata folder in the installation location.

The translation language is easily changed by changing the translationLanguage tag to a two-letter language code. These language codes can be found in translator_constants.py. The source language is set to be detected automatically at the moment, although we will probably add a feature to change this manually.
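
As an illustration of the two changes described above, here is a minimal sketch that edits config.json programmatically. The keys "isVertical" and "translationLanguage" come from this README; the key name "language" for the OCR language tag, the helper names, and the example values are assumptions, so check your own config.json for the actual key before relying on it:

```python
import json
from pathlib import Path


def apply_language_settings(config, ocr_language, translation_language, vertical=False):
    """Return a copy of the config dict with the language settings changed."""
    updated = dict(config)
    updated["isVertical"] = vertical
    # ASSUMPTION: the OCR language tag is stored under the key "language".
    updated["language"] = ocr_language
    updated["translationLanguage"] = translation_language
    return updated


def update_config_file(path, ocr_language, translation_language, vertical=False):
    """Read config.json at `path`, apply the settings, and write it back."""
    path = Path(path)
    config = json.loads(path.read_text(encoding="utf-8"))
    updated = apply_language_settings(
        config, ocr_language, translation_language, vertical
    )
    path.write_text(
        json.dumps(updated, indent=4, ensure_ascii=False), encoding="utf-8"
    )
    return updated
```

For example, update_config_file("config.json", "eng", "en") would switch to horizontal English OCR, provided an eng.traineddata file is present in the tessdata folder.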

Details / Q&A

Note: This section will be greatly expanded upon in the future.

How do you save OCR results?

We use a modified version of Tesseract's LSTM .box format (see here for more info), encoding the box to which each glyph belongs, as well as what group it belongs to, if any. This format has the following form:

<symbol> <left> <bottom> <right> <top> <group>

and uses the .box extension.

We use a special character, known as the unit separator ('␟', U+001F), to prevent a textbox from being resized based on its content. This allows you to create masking boxes that only serve to cover text, but are ignored during collation/translation.
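
To make the format above concrete, here is a minimal parsing sketch. It is not RetCom's own loader: the names BoxEntry and parse_box_line are hypothetical, and it assumes fields are separated by single spaces, coordinates are integers, and the group is always present (kept as a raw string because its exact encoding is not specified here):

```python
from dataclasses import dataclass

UNIT_SEPARATOR = "\x1f"  # U+001F, marks a masking box


@dataclass
class BoxEntry:
    symbol: str
    left: int
    bottom: int
    right: int
    top: int
    group: str  # kept as text; the exact group encoding is an assumption

    @property
    def is_mask(self):
        """True if the box only covers text and should be skipped in collation."""
        return UNIT_SEPARATOR in self.symbol


def parse_box_line(line):
    """Parse one '<symbol> <left> <bottom> <right> <top> <group>' line."""
    # Python's argument-less split() and strip() treat U+001F as whitespace,
    # so strip only line endings and split on an explicit space from the right.
    fields = line.rstrip("\r\n").rsplit(" ", 5)
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    symbol, left, bottom, right, top, group = fields
    return BoxEntry(symbol, int(left), int(bottom), int(right), int(top), group)
```

Splitting from the right keeps the symbol field intact even when it is the unit separator or contains a space.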

How do you save text bubbles?

We call text bubbles 'ellipsoids' in RetCom so that they are not confused with textboxes. Each ellipsoid stores its content, font size and family, color, position, and width and height.

We use the following convention:

<content> <size> <family> <color> <x> <y> <w> <h>

and use the .ell extension. For details on serialization, check the source code.

How can I make text bold/italic?

You can simply use HTML formatting in the text bubbles, so bold text would be <b>bold</b>, and italics would be <i>italics</i>. Try out other HTML tags and see what works for you!

Can I contribute?

Sure! Just open an issue and we can talk.

X is not working!

Same as above.

Can you add feature Y?

Maybe. Open an issue describing how you would go about implementing it and why it's useful.

Dependencies

Besides Tesseract-OCR, we use the following amazing open source dependencies:

Package      Use
fontTools    Character size determination
NumPy        Numerical support
OpenCV       Inpainting
Pillow       Image cropping and I/O
rarfile      RAR file support
requests     Communication with Google Translate API

License

MIT

retcom's People

Contributors

imifume
