RE_TESTR

This repository is our reproduction of the paper Text Spotting Transformers (CVPR 2022).

We reproduce the authors' setup instructions below.

Getting Started

We use the following environment in our experiments. It's recommended to install the dependencies via Anaconda.

  • CUDA 11.3
  • Python 3.8
  • PyTorch 1.10.1
  • Official Pre-Built Detectron2

Installation

Please refer to the Installation section of AdelaiDet: README.md.

If you have not installed Detectron2, follow the official guide: INSTALL.md.

After that, build this repository with

python setup.py build develop
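
Before building, it can help to confirm that the active environment matches the versions listed under Getting Started. The following is a minimal sketch, not part of the repository: the function names are hypothetical, only the major.minor part of the PyTorch version is compared, and Detectron2 is only checked for presence.

```python
import sys
from importlib import metadata

# Versions taken from the environment list above.
EXPECTED_PYTHON = (3, 8)
# None means "any installed version is accepted" (an assumption of this sketch).
EXPECTED_PACKAGES = {"torch": "1.10", "detectron2": None}


def major_minor(version):
    """Reduce a version such as '1.10.1+cu113' to '1.10'."""
    base = version.split("+", 1)[0]
    return ".".join(base.split(".")[:2])


def _installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(python_version=None, installed=None):
    """Return a list of problems; an empty list means the environment matches.

    Both arguments are optional overrides, mainly useful for testing.
    """
    problems = []
    found_python = tuple(python_version or sys.version_info[:2])
    if found_python != EXPECTED_PYTHON:
        problems.append(f"Python {found_python} found, {EXPECTED_PYTHON} expected")
    for package, wanted in EXPECTED_PACKAGES.items():
        if installed is not None:
            found = installed.get(package)
        else:
            found = _installed_version(package)
        if found is None:
            problems.append(f"{package} is not installed")
        elif wanted is not None and major_minor(found) != wanted:
            problems.append(f"{package} {found} found, {wanted}.x expected")
    return problems
```

Calling check_environment() with no arguments inspects the running interpreter and prints nothing by itself; print the returned list to see any mismatches.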

Preparing Datasets

Please download TotalText, CTW1500, MLT, and CurvedSynText150k according to the guide provided by AdelaiDet: README.md.

The ICDAR2015 dataset can be downloaded via link.

Extract all the datasets and make sure you organize them as follows:

- datasets
  | - CTW1500
  |   | - annotations
  |   | - ctwtest_text_image
  |   | - ctwtrain_text_image
  | - totaltext (or icdar2015)
  |   | - test_images
  |   | - train_images
  |   | - test.json
  |   | - train.json
  | - mlt2017 (or syntext1, syntext2)
      | - annotations
      | - images

After that, download the polygonal annotations along with the evaluation files, and extract them under the datasets folder.
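
The directory layout above can be checked with a few lines of Python before starting a long training run. This is a minimal sketch under stated assumptions: the helper name missing_entries is hypothetical, and the default layout covers only CTW1500, totaltext, and mlt2017. Pass your own dictionary for icdar2015, syntext1, or syntext2, which the tree above shows as following the totaltext and mlt2017 layouts respectively.

```python
from pathlib import Path

# Expected entries per dataset folder, copied from the tree above.
REQUIRED = {
    "CTW1500": ["annotations", "ctwtest_text_image", "ctwtrain_text_image"],
    "totaltext": ["test_images", "train_images", "test.json", "train.json"],
    "mlt2017": ["annotations", "images"],
}


def missing_entries(root, layout=None):
    """Return the relative paths that are expected under root but absent."""
    root = Path(root)
    layout = REQUIRED if layout is None else layout
    return [
        (Path(dataset) / entry).as_posix()
        for dataset, entries in layout.items()
        for entry in entries
        if not (root / dataset / entry).exists()
    ]
```

For example, missing_entries("datasets") returns an empty list when everything in the default layout is in place.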

Visualization Demo

You can try to visualize the predictions of the network using the following command:

python demo/demo.py --config-file <PATH_TO_CONFIG_FILE> --input <FOLDER_TO_INPUT_IMAGES> --output <OUTPUT_FOLDER> --opts MODEL.WEIGHTS <PATH_TO_MODEL_FILE> MODEL.TRANSFORMER.INFERENCE_TH_TEST 0.3

You may want to adjust INFERENCE_TH_TEST to filter out predictions with lower scores.
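
When comparing several thresholds, it may be convenient to assemble the demo command programmatically. The sketch below only builds the argument list from the flags shown in the command above. The function name build_demo_command and the example paths in the comment are hypothetical, and the script is assumed to be run from the repository root.

```python
import sys


def build_demo_command(config, input_dir, output_dir, weights, threshold=0.3):
    """Return the demo command as an argument list suitable for subprocess.run."""
    return [
        sys.executable, "demo/demo.py",
        "--config-file", str(config),
        "--input", str(input_dir),
        "--output", str(output_dir),
        "--opts",
        "MODEL.WEIGHTS", str(weights),
        "MODEL.TRANSFORMER.INFERENCE_TH_TEST", str(threshold),
    ]


# Example usage (hypothetical paths), one output folder per threshold:
#   import subprocess
#   for th in (0.3, 0.5):
#       cmd = build_demo_command("config.yaml", "images", f"out_{th}", "model.pth", th)
#       subprocess.run(cmd, check=True)
```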

Training

You can train from scratch or finetune the model by putting pretrained weights in the weights folder.

Example commands:

python tools/train_net.py --config-file <PATH_TO_CONFIG_FILE> --num-gpus 8

All configuration files can be found in configs/TESTR, excluding those files named Base-xxxx.yaml.

TESTR_R_50.yaml is the config for TESTR-Bezier, while TESTR_R_50_Polygon.yaml is for TESTR-Polygon.
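
To see which configuration files are meant to be passed to --config-file, the Base-xxxx.yaml files can be filtered out with a short script. This is a sketch, assuming the configs may sit in subfolders of configs/TESTR (hence the recursive search); the function name trainable_configs is hypothetical.

```python
from pathlib import Path


def trainable_configs(config_root="configs/TESTR"):
    """Return all .yaml files under config_root except the Base-*.yaml files."""
    return sorted(
        path
        for path in Path(config_root).rglob("*.yaml")
        if not path.name.startswith("Base-")
    )


# Example usage, run from the repository root:
#   for path in trainable_configs():
#       print(path)
```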

Evaluation

python tools/train_net.py --config-file <PATH_TO_CONFIG_FILE> --eval-only MODEL.WEIGHTS <PATH_TO_MODEL_FILE>

The authors' model results

Dataset    Annotation Type  Lexicon  Det-P  Det-R  Det-F  E2E-P  E2E-R  E2E-F  Link
Pretrain   Bezier           None     88.87  76.47  82.20  63.58  56.92  60.06  OneDrive
Pretrain   Polygonal        None     88.18  77.51  82.50  66.19  61.14  63.57  OneDrive
TotalText  Bezier           None     92.83  83.65  88.00  74.26  69.05  71.56  OneDrive
TotalText  Bezier           Full     -      -      -      86.42  80.35  83.28
TotalText  Polygonal        None     93.36  81.35  86.94  76.85  69.98  73.25  OneDrive
TotalText  Polygonal        Full     -      -      -      88.00  80.13  83.88
CTW1500    Bezier           None     89.71  83.07  86.27  55.44  51.34  53.31  OneDrive
CTW1500    Bezier           Full     -      -      -      83.05  76.90  79.85
CTW1500    Polygonal        None     92.04  82.63  87.08  59.14  53.09  55.95  OneDrive
CTW1500    Polygonal        Full     -      -      -      86.16  77.34  81.51
ICDAR15    Polygonal        None     90.31  89.70  90.00  65.49  65.05  65.27  OneDrive
ICDAR15    Polygonal        Strong   -      -      -      87.11  83.29  85.16
ICDAR15    Polygonal        Weak     -      -      -      80.36  78.38  79.36
ICDAR15    Polygonal        Generic  -      -      -      73.82  73.33  73.57
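
When comparing a replication run against these numbers, it helps to recall that the Det-F and E2E-F columns are F-measures, that is, the harmonic mean of the corresponding precision and recall columns. The sketch below recomputes F from P and R; the function name f_measure is hypothetical. Because the published P and R are already rounded to two decimals, the recomputed value can differ from the table in the last digit.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Pretrain, Bezier, detection: P = 88.87, R = 76.47; the table lists F = 82.20.
recomputed = f_measure(88.87, 76.47)
print(f"recomputed Det-F: {recomputed:.2f}")
```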

Our replication

Our notebooks, config files, and modified modules are stored in the RE_Experiments folder.

It contains three subfolders:

  • fine_tune: contains the notebooks and config files for the fine-tuning experiments.
  • box-to-poly: contains the notebooks, config files and modified layer for the experiments with the box-to-poly process.
  • factorized_atten: contains the notebooks, config files and modified layers for the experiments with the factorized attention module.

To replicate our results, you can follow the steps below:

  1. Set up the environment following the authors' instructions above
  2. Download the datasets, annotations and model weights from the links above
  3. Choose an experiment folder in RE_Experiments
  4. If the experiment folder contains a subfolder called 'layer', replace the code in adet/layers/deformable_transformer.py with the file it contains.
  5. Run the notebook
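
Step 4 can be scripted so that the original file is kept as a backup and can be restored between experiments. This is a minimal sketch: the function name install_modified_layer and the .orig backup suffix are hypothetical, and it assumes the file inside the 'layer' folder is itself named deformable_transformer.py.

```python
import shutil
from pathlib import Path


def install_modified_layer(experiment_dir, repo_root,
                           filename="deformable_transformer.py"):
    """Copy <experiment_dir>/layer/<filename> over adet/layers/<filename>.

    Returns False when the experiment has no modified layer, True otherwise.
    The original file is saved once as <filename>.orig next to the target.
    """
    source = Path(experiment_dir) / "layer" / filename  # assumed file name
    target = Path(repo_root) / "adet" / "layers" / filename
    if not source.is_file():
        return False
    backup = target.with_name(target.name + ".orig")
    if target.exists() and not backup.exists():
        shutil.copy2(target, backup)  # keep the authors' version
    shutil.copy2(source, target)
    return True
```

To go back to the authors' code, copy the .orig file over deformable_transformer.py again.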
