sohaib023 / splerge-tab-aug

Forked from pyxploiter/deep-splerge

Code for: U. Khan, S. Zahid, M. A. Ali, A. Ul-Hasan and F. Shafait, "TabAug: Data Driven Augmentation for Enhanced Table Structure Recognition" (2021)

About

This repository contains the split model for table structure extraction. The model predicts row/column separators for an input image. It has five executable scripts:

- prepare_data.py
- train.py
- infer.py
- merge.py
- eval.py

A pre-trained model is provided with this repository at model_out/split_model.pth. It was trained on an augmented dataset created from the originally provided labelled dataset.

Usage

1. Prepare Data

prepare_data.py takes as input the original labelled dataset (images, XML files and OCR files) and prepares the data for use by the split model. Specifically, it creates table crops from the original dataset and generates the corresponding split model labels and OCR.

Note: If the provided OCR directory does not contain the corresponding OCR files, the program will generate the OCR data and write it to that folder. The XMLs are data annotations in PascalVOC format.

usage: prepare_data.py [-h] -img IMAGE_DIR -xml XML_DIR -ocr OCR_DIR -o OUT_DIR

optional arguments:
  -h, --help            show this help message and exit
  -img IMAGE_DIR, --image_dir IMAGE_DIR
                        Directory containing images
  -xml XML_DIR, --xml_dir XML_DIR
                        Directory containing ground-truth XMLs in PascalVOC format
  -ocr OCR_DIR, --ocr_dir OCR_DIR
                        Directory containing ocr files. (If an
                        OCR file is not found, it will be generated and saved
                        in this directory for future use)
  -o OUT_DIR, --out_dir OUT_DIR
                        Path of output directory for generated data

Sample Command: python prepare_data.py -img data/images/ -xml data/xmls/ -ocr data/ocr/ -o data/prepared/
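Since the ground-truth annotations are PascalVOC XMLs, here is a minimal sketch of reading bounding boxes from one. The tag names (`object`, `name`, `bndbox`, `xmin`, ...) follow the standard PascalVOC layout; the exact schema this repository uses is an assumption.

```python
# Minimal sketch: reading bounding boxes from a PascalVOC-style XML.
# Tag names follow the standard PascalVOC layout; this repository's
# exact schema may differ.
import xml.etree.ElementTree as ET

def read_pascal_voc_boxes(xml_text):
    """Return a list of (label, (xmin, ymin, xmax, ymax)) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bb = obj.find("bndbox")
        coords = tuple(int(bb.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((label, coords))
    return boxes

sample = """<annotation>
  <object><name>table</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>300</xmax><ymax>150</ymax></bndbox>
  </object>
</annotation>"""
print(read_pascal_voc_boxes(sample))  # [('table', (10, 20, 300, 150))]
```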

2. Train Split Model

train.py takes as input the data generated by the prepare_data.py script and trains the split model. The script has three required arguments: images_dir, labels_dir and output_weight_path. The rest of the arguments are optional and have default values.

usage: train.py [-h] -img TRAIN_IMAGES_DIR -l TRAIN_LABELS_DIR -o
                OUTPUT_WEIGHT_PATH [-e NUM_EPOCHS] [-s SAVE_EVERY]
                [--log_every LOG_EVERY] [--val_every VAL_EVERY]
                [--lr LEARNING_RATE] [--dr DECAY_RATE] [--vs VALIDATION_SPLIT]

optional arguments:
  -h, --help            show this help message and exit
  -img TRAIN_IMAGES_DIR, --images_dir TRAIN_IMAGES_DIR
                        Path to training table images (generated by
                        prepare_data.py).
  -l TRAIN_LABELS_DIR, --labels_dir TRAIN_LABELS_DIR
                        Path to labels for split model (generated by
                        prepare_data.py).
  -o OUTPUT_WEIGHT_PATH, --output_weight_path OUTPUT_WEIGHT_PATH
                        Output folder path for model checkpoints and summary.
  -e NUM_EPOCHS, --num_epochs NUM_EPOCHS
                        Number of epochs.
  -s SAVE_EVERY, --save_every SAVE_EVERY
                        Save checkpoints after given epochs
  --log_every LOG_EVERY
                        Print logs after every given steps
  --val_every VAL_EVERY
                        perform validation after given steps
  --lr LEARNING_RATE, --learning_rate LEARNING_RATE
                        learning rate
  --dr DECAY_RATE, --decay_rate DECAY_RATE
                        weight decay rate
  --vs VALIDATION_SPLIT, --validation_split VALIDATION_SPLIT
                        validation split in data

Sample Command: python train.py -img data/prepared/table_images/ -l data/prepared/table_split_labels/ -o model_out/
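To make the `--vs/--validation_split` option concrete, here is a sketch of what a validation-split fraction typically means: hold out that fraction of the shuffled samples for validation. The repository's actual splitting logic is an assumption and may differ.

```python
# Sketch of a validation split by fraction (illustrative only; the
# repository's actual splitting logic may differ).
import random

def split_dataset(samples, validation_split, seed=0):
    samples = list(samples)
    random.Random(seed).shuffle(samples)       # deterministic shuffle
    n_val = int(len(samples) * validation_split)
    return samples[n_val:], samples[:n_val]    # (train, val)

train, val = split_dataset(range(100), 0.2)
print(len(train), len(val))  # 80 20
```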

3. Inference of Split Model

infer.py takes a trained model and infers split results for a folder of cropped table images. Inside {OUTPUT_PATH} it generates two folders: one contains predictions as XML files (same format as the ground-truth XMLs), the other contains visualizations of the split results.

usage: infer.py [-h] -img TEST_IMAGES_DIR -m MODEL_WEIGHTS -o OUTPUT_PATH

optional arguments:
  -h, --help            show this help message and exit
  -img TEST_IMAGES_DIR, --test_images_dir TEST_IMAGES_DIR
                        Path to testing data table images (generated by
                        prepare_data.py).
  -m MODEL_WEIGHTS, --model_weights MODEL_WEIGHTS
                        path to model weights.
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        path to the output directory

Sample Command: python infer.py -img data/prepared/table_images/ -m model_out/split_model.pth -o out_infer
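The split model's raw output is a row/column separator prediction per image row or column. A minimal sketch of one plausible post-processing step: threshold the 1-D separator profile and take the midpoint of each above-threshold run as a separator position. The repository's actual post-processing may differ.

```python
# Sketch: 1-D separator probability profile -> separator positions,
# by thresholding and taking the midpoint of each run.
# (Illustrative only; the repository's post-processing may differ.)
def separator_positions(probs, threshold=0.5):
    positions, start = [], None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i                              # run begins
        elif p < threshold and start is not None:
            positions.append((start + i - 1) // 2)  # midpoint of run
            start = None
    if start is not None:                           # run reaches the edge
        positions.append((start + len(probs) - 1) // 2)
    return positions

probs = [0.1, 0.2, 0.9, 0.95, 0.8, 0.1, 0.1, 0.7, 0.9, 0.2]
print(separator_positions(probs))  # [3, 7]
```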

4. Apply Merge Heuristics

merge.py applies merge heuristics to the XML files predicted by the split model through infer.py. It can optionally be given table-level images if visualization of the merges is required; if they are not provided, visualization is skipped and only XMLs are written to {OUTPUT_DIR}.

usage: merge.py [-h] -i INPUT_XML_DIR -o OUTPUT_DIR -ocr OCR_DIR [-img IMAGES_DIR]

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT_XML_DIR, --input_xml_dir INPUT_XML_DIR
                        Path to folder containing XML files predicted by
                        infer.py
  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
                        Path to folder for writing output XML files and
                        visualization (optional) of merge heuristics.
  -ocr OCR_DIR, --ocr_dir OCR_DIR
                        Path to folder containing table-level OCR files
                        generated by prepare_data.py
  -img IMAGES_DIR, --images_dir IMAGES_DIR
                        Path to table-level images generated by
                        prepare_data.py (Optional. If not provided merge
                        visualization will not be written).

Sample Command: python merge.py -i out_infer/predicted_xmls/ -o merge_output/ -ocr data/prepared/table_ocr -img data/prepared/table_images/

5. Evaluation

Once results have been generated by infer.py or merge.py in XML format, they can be evaluated using the eval.py script. It has five inputs. The first three are the original document-level images, ground-truth XMLs and OCR files; note that these are the original data, not the files generated by prepare_data.py. The fourth is the path to the prediction XMLs generated by infer.py or merge.py. The fifth is the output directory where the evaluation results are written.

usage: eval.py [-h] -i IMAGES_DIR -xml XML_DIR -o OCR_DIR -p PRED_DIR -e
               EVAL_OUT

optional arguments:
  -h, --help            show this help message and exit
  -i IMAGES_DIR, --images_dir IMAGES_DIR
                        path to directory containing document-level images.
  -xml XML_DIR, --xml_dir XML_DIR
                        path to directory containing document-level ground-
                        truth XML files.
  -o OCR_DIR, --ocr_dir OCR_DIR
                        path to directory containing document-level ocr.
  -p PRED_DIR, --pred_dir PRED_DIR
                        path to directory containing table-level prediction
                        XML files.
  -e EVAL_OUT, --eval_out EVAL_OUT
                        path of directory in which to write the evaluation
                        results.

Sample Command: python eval.py -i data/images/ -xml data/xmls/ -o data/ocr -p out_infer/predicted_xmls/ -e evaluation/
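For intuition, here is a sketch of a simple cell-level evaluation: precision, recall and F1 over exact cell matches between prediction and ground truth. This is illustrative only; the repository's eval.py may use a different metric (e.g. adjacency-relation based scoring).

```python
# Sketch: cell-level precision/recall/F1 over exact matches.
# (Illustrative only; eval.py's actual metric may differ.)
def cell_f1(predicted, ground_truth):
    pred, gt = set(predicted), set(ground_truth)
    tp = len(pred & gt)                                  # true positives
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gt) if gt else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gt = {(0, 0), (0, 1), (1, 0), (1, 1)}      # ground-truth cells (row, col)
pred = {(0, 0), (0, 1), (1, 0)}            # one cell missed
print(cell_f1(pred, gt))                    # precision 1.0, recall 0.75
```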

Citation

If this work is useful for your research or if you use this implementation in your academic projects, please cite the following paper:

@InProceedings{ICDAR2019,
  author    = {Christopher Tensmeyer and Vlad Morariu and Brian Price and Scott Cohen and Tony Martinez},
  title     = {Deep Splitting and Merging for Table Structure Decomposition},
  booktitle = {The 15th IAPR International Conference on Document Analysis and Recognition (ICDAR)},
  month     = {September},
  year      = {2019}
}

Contributors

sohaib023, pyxploiter, uyousafzai
