
This project forked from lbh1024/can


When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition (ECCV’2022 Poster).

License: MIT License



When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition

This is the official PyTorch implementation of CAN (ECCV'2022).

Bohan Li, Ye Yuan, Dingkang Liang, Xiao Liu, Zhilong Ji, Jinfeng Bai, Wenyu Liu, Xiang Bai

Abstract

Recently, most handwritten mathematical expression recognition (HMER) methods adopt the encoder-decoder networks, which directly predict the markup sequences from formula images with the attention mechanism. However, such methods may fail to accurately read formulas with complicated structure or generate long markup sequences, as the attention results are often inaccurate due to the large variance of writing styles or spatial layouts. To alleviate this problem, we propose an unconventional network for HMER named Counting-Aware Network (CAN), which jointly optimizes two tasks: HMER and symbol counting. Specifically, we design a weakly-supervised counting module that can predict the number of each symbol class without the symbol-level position annotations, and then plug it into a typical attention-based encoder-decoder model for HMER. Experiments on the benchmark datasets for HMER validate that both joint optimization and counting results are beneficial for correcting the prediction errors of encoder-decoder models, and CAN consistently outperforms the state-of-the-art methods. In particular, compared with an encoder-decoder model for HMER, the extra time cost caused by the proposed counting module is marginal.
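The joint optimization described in the abstract can be pictured as a recognition loss plus a counting loss. The following is a minimal sketch in plain Python, not the code from this repository. It assumes a cross-entropy term over decoding steps and a smooth L1 term over per-class counts; the function names and the `counting_weight` parameter are hypothetical and chosen only for illustration.

```python
import math


def cross_entropy(logits, target_index):
    """Negative log-likelihood of the target class under a softmax over logits."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return log_sum_exp - logits[target_index]


def smooth_l1(predicted, target, beta=1.0):
    """Mean smooth L1 distance between two equal-length count vectors."""
    total = 0.0
    for p, t in zip(predicted, target):
        diff = abs(p - t)
        # Quadratic near zero, linear for large errors.
        total += 0.5 * diff * diff / beta if diff < beta else diff - 0.5 * beta
    return total / len(predicted)


def joint_loss(step_logits, target_ids, predicted_counts, target_counts,
               counting_weight=1.0):
    """Recognition loss (averaged over decoding steps) plus weighted counting loss.

    Assumption: the two terms are simply added; the weighting is hypothetical.
    """
    recognition = sum(
        cross_entropy(logits, target) for logits, target in zip(step_logits, target_ids)
    ) / len(target_ids)
    counting = smooth_l1(predicted_counts, target_counts)
    return recognition + counting_weight * counting


if __name__ == "__main__":
    # One decoding step over a two-symbol vocabulary, two symbol classes counted.
    print(joint_loss([[0.0, 0.0]], [0], [1.0, 2.5], [1.0, 2.0]))
```

Because both terms depend on the same encoder features, gradients from the counting term also shape the features used for recognition, which is the intuition behind training the two tasks together.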

Pipeline

Counting Module
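The counting module is weakly supervised: it needs only the number of occurrences of each symbol class, not symbol positions. One way to obtain such targets is to count tokens in the ground-truth markup. The sketch below assumes space-separated LaTeX tokens and a fixed vocabulary list; the function name `counting_targets` and the example vocabulary are hypothetical, and whether structural tokens such as `{` or `^` are counted is a design choice not stated here.

```python
from collections import Counter


def counting_targets(markup, vocabulary):
    """Return per-class symbol counts for one space-separated markup string.

    The result is aligned with `vocabulary`: entry i is the number of times
    vocabulary[i] occurs in the markup. Tokens outside the vocabulary are ignored.
    """
    counts = Counter(markup.split())
    return [counts.get(symbol, 0) for symbol in vocabulary]


if __name__ == "__main__":
    vocab = ["x", "2", "+", "y"]  # hypothetical vocabulary for illustration
    print(counting_targets("x ^ { 2 } + x", vocab))  # [2, 1, 1, 0]
```

No bounding boxes or stroke-level labels are involved; the target vector comes entirely from the label sequence already used for recognition.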

Datasets

Download the CROHME dataset from BaiduYun (access code: 1234) and place it in datasets/.

The HME100K dataset can be downloaded from the official HME100K website.

Training

Check the config file config.yaml and train with the CROHME dataset:

python train.py --dataset CROHME

By default, the batch size is set to 8, and you may need a GPU with 32 GB of memory to train the model.

Testing

Set the checkpoint (the path to the pretrained model) in the config file config.yaml and test with the CROHME dataset:

python inference.py --dataset CROHME

Note that the testing dataset path is set in inference.py.

Citation

@inproceedings{CAN,
  title={When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition},
  author={Li, Bohan and Yuan, Ye and Liang, Dingkang and Liu, Xiao and Ji, Zhilong and Bai, Jinfeng and Liu, Wenyu and Bai, Xiang},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  pages={197--214},
  year={2022}
}

Recommendation

Other excellent open-source HMER algorithms can be found here:

WAP [PR'2017], DWAP-TD [ICML'2020], BTTR [ICDAR'2021], ABM [AAAI'2022], SAN [CVPR'2022], CoMER [ECCV'2022]
