P2PaLA

Page to PAGE Layout Analysis (P2PaLA) is a toolkit for Document Layout Analysis based on Neural Networks.

If you find this toolkit useful in your research, please cite:

@misc{p2pala2017,
  author = {Lorenzo Quirós},
  title = {P2PaLA: Page to PAGE Layout Analysis toolkit},
  year = {2017},
  publisher = {GitHub},
  note = {GitHub repository},
  howpublished = {\url{https://github.com/lquirosd/P2PaLA}},
}

Or see the paper on arXiv for details.

Requirements

Linux (OSX may work, but is untested).
Python (2.7 under a conda virtual environment is recommended)
Python future: pip install future
Numpy (installed by default using conda)
PyTorch (0.3.0): conda install pytorch torchvision -c pytorch
OpenCV (3.1.0): conda install -c menpo opencv
NVIDIA GPU + CUDA CuDNN (CPU mode and CUDA without CuDNN may work with minimal modification, but are untested).
tensorboard-pytorch (v0.9) [Optional]: pip install tensorboardX. A different conda env is recommended to keep TensorFlow separated from PyTorch.

For a full list of dependencies, see the conda env file.
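To confirm that an environment has the packages listed above before running the tool, a small check like the following may help. It is only a sketch: the import names (future, numpy, torch, cv2, tensorboardX) are the usual ones for these packages, and the helper missing_modules is hypothetical, not part of P2PaLA. It does not check versions.

```python
import importlib

# Usual import names for the packages listed above (an assumption of this
# sketch): OpenCV is imported as cv2, PyTorch as torch.
REQUIRED = ["future", "numpy", "torch", "cv2"]
OPTIONAL = ["tensorboardX"]


def missing_modules(names):
    """Return the module names from `names` that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling missing_modules(REQUIRED) inside the conda environment returns an empty list when every required package can be imported.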

Usage

Input data must follow the folder structure data_tag/page, where images go in the data_tag folder and XML files go in its page subfolder. For example:

mkdir -p data/{train,val,test,prod}/page;
tree data;

data
├── prod
│   ├── page
│   │   ├── prod_0.xml
│   │   └── prod_1.xml
│   ├── prod_0.jpg
│   └── prod_1.jpg
├── test
│   ├── page
│   │   ├── test_0.xml
│   │   └── test_1.xml
│   ├── test_0.jpg
│   └── test_1.jpg
├── train
│   ├── page
│   │   ├── train_0.xml
│   │   └── train_1.xml
│   ├── train_0.jpg
│   └── train_1.jpg
└── val
    ├── page
    │   ├── val_0.xml
    │   └── val_1.xml
    ├── val_0.jpg
    └── val_1.jpg
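Before training, it can be useful to verify that a folder follows this layout. The sketch below assumes that each image is paired with the XML file of the same base name, as in the example tree (train_0.jpg with page/train_0.xml). The list of image extensions and the helper check_data_folder are assumptions made for illustration, not part of the toolkit.

```python
import os

# Accepted image extensions are an assumption; the example tree only shows .jpg.
IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".tif", ".tiff")


def check_data_folder(data_dir, image_exts=IMAGE_EXTS):
    """Return (images_without_xml, xml_without_image) for one data_tag folder.

    Files are matched by base name:
    <data_dir>/foo.jpg  <->  <data_dir>/page/foo.xml
    """
    page_dir = os.path.join(data_dir, "page")
    images = set(
        os.path.splitext(name)[0]
        for name in os.listdir(data_dir)
        if name.lower().endswith(image_exts)
    )
    xmls = set()
    if os.path.isdir(page_dir):
        xmls = set(
            os.path.splitext(name)[0]
            for name in os.listdir(page_dir)
            if name.lower().endswith(".xml")
        )
    return sorted(images - xmls), sorted(xmls - images)
```

For a well-formed folder such as ./data/train above, both returned lists are empty.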

Run the tool.

python P2PaLA.py --config config.txt --tr_data ./data/train --te_data ./data/test --log_comment "_foo"
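When several experiments are launched from a script, the same command can be assembled programmatically. The following is a minimal sketch that uses only the flags shown above. The helpers build_command and run are hypothetical, and the sketch assumes it is executed from the repository root so that P2PaLA.py is found.

```python
import subprocess
import sys


def build_command(tr_data, te_data, log_comment, config="config.txt"):
    """Assemble the command line shown above as an argument list."""
    return [
        sys.executable, "P2PaLA.py",
        "--config", config,
        "--tr_data", tr_data,
        "--te_data", te_data,
        "--log_comment", log_comment,
    ]


def run(cmd):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    subprocess.check_call(cmd)
```

For instance, run(build_command("./data/train", "./data/test", "_foo")) reproduces the invocation above using the current Python interpreter.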

Use TensorBoard to visualize train status:

tensorboard --logdir ./work/runs

The resulting PAGE-XML files should be at "./work/results/test/".

We recommend Transkribus or nw-page-editor to visualize and edit PAGE-xml files.
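For a quick programmatic look at the results without a GUI, the PAGE-XML files can also be read with the Python standard library. The sketch below assumes the files use the PAGE convention of TextRegion elements holding a Coords child with a points attribute ("x,y x,y ..."). Which region types P2PaLA actually writes depends on the configuration, so treat the element names and the helper functions here as illustrative.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tags."""
    return tag.rsplit("}", 1)[-1]


def parse_points(points):
    """Convert a PAGE 'points' string ('x,y x,y ...') to a list of (x, y) ints."""
    return [
        tuple(int(float(value)) for value in pair.split(","))
        for pair in points.split()
    ]


def read_regions(xml_path):
    """Return [(region_id, polygon)] for every TextRegion in a PAGE-XML file."""
    regions = []
    root = ET.parse(xml_path).getroot()
    for elem in root.iter():
        if local_name(elem.tag) != "TextRegion":
            continue
        # Use the region's own Coords child, not those of nested text lines.
        for child in elem:
            if local_name(child.tag) == "Coords" and child.get("points"):
                regions.append((elem.get("id"), parse_points(child.get("points"))))
                break
    return regions
```

Matching on the local tag name keeps the sketch independent of the exact PAGE schema version declared in the file.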

For details about the arguments and the config file, see docs or run python P2PaLA.py -h.
For more detailed examples, see egs:
- Bozen dataset
- cBAD complex competition dataset
- OHG dataset

License

GNU General Public License v3.0. See LICENSE for the full text.

Acknowledgments

The code is inspired by pix2pix and pytorch-CycleGAN-and-pix2pix.

To-do

Save best model under criteria [best train L1, best val L1, ...]
Stop training after X epochs without improvement
Provide an example of use
Provide Docker
Include BaselinePage to detect baselines.
Test on macOS.
