This project forked from simplify23/cdistnet


Official Pytorch implementations of CDistNet

License: Apache License 2.0

CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition

The official code of CDistNet.

Paper: arXiv:2111.11011

CDistNet follows a different paradigm, and we are confident it will keep delivering high recognition performance across multiple scenarios. To support this, we explored the reasons for ABINet's high performance: the separation of the language model (LM) and the vision model (VM) gives ABINet additional performance gains. We have also applied training strategies to CDistNetv2 to find more room for improvement. Stay tuned for updates on CDistNet.

[Figure: CDistNet pipeline]

To Do List

  • HA-IC13 & CA-IC13
  • Pre-train model
  • Cleaned Code
  • Document
  • Distributed Training

Two New Datasets

We test other SOTA methods on the HA-IC13 and CA-IC13 datasets.

[Figure: HA_CA]

CDistNet has a performance advantage over other SOTA methods as the character distance increases (1-6).

HA-IC13

| Method | 1 | 2 | 3 | 4 | 5 | 6 | Code & pretrained model |
|---|---|---|---|---|---|---|---|
| VisionLAN (ICCV 2021) | 93.58 | 92.88 | 89.97 | 82.26 | 72.23 | 61.03 | Official Code |
| ABINet (CVPR 2021) | 95.92 | 95.22 | 91.95 | 85.76 | 73.75 | 64.99 | Official Code |
| RobustScanner* (ECCV 2020) | 96.15 | 95.33 | 93.23 | 88.91 | 81.10 | 71.53 | -- |
| Transformer-baseline* | 96.27 | 95.45 | 92.42 | 86.46 | 79.35 | 72.46 | -- |
| CDistNet | 96.62 | 96.15 | 94.28 | 89.96 | 83.43 | 77.71 | -- |

CA-IC13

| Method | 1 | 2 | 3 | 4 | 5 | 6 | Code & pretrained model |
|---|---|---|---|---|---|---|---|
| VisionLAN (ICCV 2021) | 94.87 | 92.77 | 84.01 | 75.03 | 64.29 | 52.74 | Official Code |
| ABINet (CVPR 2021) | 96.62 | 95.92 | 87.86 | 76.31 | 65.46 | 54.49 | Official Code |
| RobustScanner* (ECCV 2020) | 95.22 | 94.87 | 85.30 | 76.55 | 68.38 | 60.79 | -- |
| Transformer-baseline* | 95.68 | 94.40 | 85.88 | 75.85 | 65.93 | 58.58 | -- |
| CDistNet | 96.27 | 95.57 | 88.45 | 79.58 | 70.36 | 63.13 | -- |
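
To make the trend across character-distance levels easier to see, here is a small sketch that computes the score gap between CDistNet and ABINet at each level. The numbers are copied from the HA-IC13 and CA-IC13 results above; the names `RESULTS` and `accuracy_gap` are illustrative only and are not part of this repository.

```python
# Scores at character-distance levels 1-6, copied from the results above.
RESULTS = {
    "HA-IC13": {
        "ABINet": [95.92, 95.22, 91.95, 85.76, 73.75, 64.99],
        "CDistNet": [96.62, 96.15, 94.28, 89.96, 83.43, 77.71],
    },
    "CA-IC13": {
        "ABINet": [96.62, 95.92, 87.86, 76.31, 65.46, 54.49],
        "CDistNet": [96.27, 95.57, 88.45, 79.58, 70.36, 63.13],
    },
}


def accuracy_gap(dataset, method="CDistNet", baseline="ABINet"):
    """Return (method - baseline) per level, rounded to two decimals."""
    scores = RESULTS[dataset]
    return [round(m - b, 2) for m, b in zip(scores[method], scores[baseline])]


if __name__ == "__main__":
    for name in RESULTS:
        print(name, accuracy_gap(name))
```

On HA-IC13 the gap grows from 0.70 at level 1 to 12.72 at level 6. On CA-IC13, ABINet is slightly ahead at levels 1 and 2 (a gap of -0.35), and CDistNet leads by 8.64 at level 6.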

Datasets

The datasets are the same as those used by ABINet.

Environment

The required packages are listed in env_cdistnet.yaml.

# Install
conda create -n CDistNet python=3.7
# Activate the new environment so the packages below are installed into it
conda activate CDistNet
conda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=9.2 -c pytorch
pip install opencv-python mmcv notebook numpy einops tensorboardX Pillow thop timm tornado tqdm matplotlib lmdb
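
After installation, a quick check that the packages can be found in the active environment may save a failed training run. The sketch below is not part of the repository. It assumes the usual import names for the packages listed above (for example `cv2` for opencv-python, `PIL` for Pillow and `torch` for pytorch), and `missing_packages` is a hypothetical helper name.

```python
import importlib.util

# Install name -> import name (assumed; adjust if your setup differs).
REQUIRED = {
    "pytorch": "torch",
    "torchvision": "torchvision",
    "opencv-python": "cv2",
    "mmcv": "mmcv",
    "notebook": "notebook",
    "numpy": "numpy",
    "einops": "einops",
    "tensorboardX": "tensorboardX",
    "Pillow": "PIL",
    "thop": "thop",
    "timm": "timm",
    "tornado": "tornado",
    "tqdm": "tqdm",
    "matplotlib": "matplotlib",
    "lmdb": "lmdb",
}


def missing_packages(required=REQUIRED):
    """Return the install names whose module cannot be found."""
    return [
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

This only checks that each module can be located. It does not verify versions or CUDA support.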

Pretrained Models

Get the pretrained models from BaiduNetdisk (passwd: d6jd) or GoogleDrive. (Both also include the training log and result.csv in the same folder.) Place the pretrained model in models/reconstruct_CDistNetv3_3_10.

The performance of the pretrained models is summarized as follows:

Train

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --config=configs/CDistNet_config.py

Eval

CUDA_VISIBLE_DEVICES=0 python eval.py --config=configs/CDistNet_config.py
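
If you prefer to start training or evaluation from a Python script, the commands above can be assembled as in the sketch below. It only relies on the script names and the `--config` flag shown above. The helper `build_launch` is hypothetical and not provided by this repository.

```python
import os
import subprocess
import sys


def build_launch(script, gpu_ids, config="configs/CDistNet_config.py", base_env=None):
    """Build the command and environment for train.py or eval.py.

    CUDA_VISIBLE_DEVICES limits which GPUs the launched process can see.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    cmd = [sys.executable, script, f"--config={config}"]
    return cmd, env


if __name__ == "__main__":
    # Equivalent to: CUDA_VISIBLE_DEVICES=0 python eval.py --config=configs/CDistNet_config.py
    cmd, env = build_launch("eval.py", gpu_ids=[0])
    subprocess.run(cmd, env=env, check=True)
```

Passing `"train.py"` with `gpu_ids=[0, 1, 2, 3]` reproduces the training command in the same way.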

Citation

@article{Zheng2021CDistNetPM,
  title={CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition},
  author={Tianlun Zheng and Zhineng Chen and Shancheng Fang and Hongtao Xie and Yu-Gang Jiang},
  journal={ArXiv},
  year={2021},
  volume={abs/2111.11011}
}

