CharNet: Convolutional Character Networks

License: Other

Python 99.74% Shell 0.26%

research-charnet's Introduction

Convolutional Character Networks

This project hosts the testing code for CharNet, described in our paper:

Convolutional Character Networks
Linjie Xing, Zhi Tian, Weilin Huang, and Matthew R. Scott;
In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2019.

Installation

pip install torch torchvision
python setup.py build develop

Run

Please run bash download_weights.sh to download our trained weights.
For ICDAR 2015, please run the following command line. Please replace images_dir with the directory containing ICDAR 2015 testing images. The results will be in results_dir.
```
python tools/test_net.py configs/icdar2015_hourglass88.yaml <images_dir> <results_dir>
```

Citation

If you find this work useful for your research, please cite as:

@inproceedings{xing2019charnet,
title={Convolutional Character Networks},
author={Xing, Linjie and Tian, Zhi and Huang, Weilin and Scott, Matthew R},
booktitle={Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
year={2019}
}

Contact

For any questions, please feel free to reach:

[email protected]

License

CharNet is CC-BY-NC 4.0 licensed, as found in the LICENSE file. It is released for academic research / non-commercial use only. If you wish to use for commercial purposes, please contact [email protected].

research-charnet's Issues

speed up inference

Hello,

How can I speed up CharNet inference as run in tools/test_net.py?
I tried to apply torch multiprocessing, but it failed due to lambda usage in the model definition.

Some clues on how to run batch prediction with CharNet would help a lot here.

Tnx!
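
The failure described above is consistent with general Python behavior: multiprocessing sends objects between processes by pickling them, and the standard pickle module cannot serialize a lambda. The sketch below only illustrates that behavior with the standard library. It does not touch the CharNet model, and the names `is_picklable`, `double_lambda` and `double_partial` are made up for illustration.

```python
import operator
import pickle
from functools import partial


def is_picklable(obj):
    """Return True if obj survives pickle.dumps, which multiprocessing relies on."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A lambda has no importable name, so the standard pickler rejects it.
double_lambda = lambda x: x * 2

# functools.partial over an importable function (here operator.mul) pickles fine.
double_partial = partial(operator.mul, 2)

print(is_picklable(double_lambda))
print(is_picklable(double_partial))
```

Under that assumption, one possible workaround is to replace lambdas in the model definition with named module-level functions or `functools.partial` objects. Whether that is enough for this model is not confirmed by the source.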

How to improve the results of recognizing numbers?

While applying the code to pictures like this

some problems arise:
1. Two-digit numbers cannot be detected.
2. Some numbers are recognized as English letters, e.g. 4→A.

While some numbers in one picture can be recognized, the txt file in <results_dir> corresponding to another picture with the same content, but from a slightly different view, is empty.

Could you please tell me how to solve these problems?
Can your code generate bounding boxes?
Thanks a lot!

Could you please upload the weight model trained on ICDAR 2017 MLT?

The test demo cannot recognize numbers using the icdar2015_hourglass88.pth model.

The image simply contains only numbers.

Is a lexicon used for text detection?

Hi. Thank you for the interesting work.
I'm confused about your implementation. In the paper, training with your iterative character detection module uses only the ground-truth word length. For ICDAR2015 Task 4.1 (text localization), a single ground-truth "word" is provided in string form (the accurate word or ###). But your implementation uses a lexicon, i.e. a list of candidate words! Is that fair for a "localization (or detection) only" task?
Using a lexicon obviously improves the detection result! Or is there anything I missed?

Thanks for reading.

Training Code

Will the training code be provided later?

Training mode error

I have added a training module and noticed that forwarding hourglass88 under torch.no_grad works just fine, but when I remove torch.no_grad the GPU (12GB) runs out of memory just from forwarding one image sample. I get this message:

RuntimeError: CUDA out of memory. Tried to allocate 46.00 MiB (GPU 0; 10.91 GiB total capacity; 10.21 GiB already allocated; 44.75 MiB free; 92.41 MiB cached).

According to the paper, you train with not 1 but 4 images per mini-batch.

Can it be extended to Chinese?

Question about Character and Word detection branches

In your paper, the images you show have rotated character bounding boxes and simple bounding boxes for the word detector. However, in your code, the WordDetector has an orientation prediction head and output, while the CharDetector returns None for the character orientation. Is it possible that the two have been accidentally swapped?

About training code

I appreciate the great job! I'm looking forward to the training code...

Unable to download weights

It creates an empty file in the weights directory.

About the raw training code

Hi, I have read your CharNet paper and am interested in it. But only a portion of the evaluation code has been open-sourced. I would like to know whether I can obtain the full raw code as soon as possible (for academic purposes) through other means (such as purchase).
Thank you for your time and consideration.

Trained weights

The website for downloading the trained weights has been disabled.
Could anyone please provide the weights?

Save single char boxes

Your code saves word boxes. I want to save char boxes. How can I do that?
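
A minimal sketch of writing character boxes to a text file follows. It assumes each box is available as 8 numbers (four corner points) with one recognized label per box, written one per line as `x1,y1,x2,y2,x3,y3,x4,y4,label`. That layout, the function names `format_box_line` and `write_char_boxes`, and the sample data are assumptions for illustration. How the repository exposes its character boxes is not shown in the source.

```python
import io


def format_box_line(box, text):
    """Format one box as 'x1,y1,x2,y2,x3,y3,x4,y4,text' (assumed layout)."""
    if len(box) != 8:
        raise ValueError("expected 8 coordinates, got %d" % len(box))
    # int() truncates fractional coordinates toward zero.
    coords = ",".join(str(int(v)) for v in box)
    return "%s,%s" % (coords, text)


def write_char_boxes(fileobj, char_boxes, char_labels):
    """Write one line per character box to an open text file object."""
    for box, label in zip(char_boxes, char_labels):
        fileobj.write(format_box_line(box, label) + "\n")


# Hypothetical data: two character boxes with their recognized labels.
boxes = [[0, 0, 10, 0, 10, 12, 0, 12], [11, 0, 20, 0, 20, 12, 11, 12]]
labels = ["4", "2"]
buffer = io.StringIO()  # stands in for open(path, "w")
write_char_boxes(buffer, boxes, labels)
print(buffer.getvalue())
```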

Please add requirements.txt

I think there should be a requirements file as well.

Question about text instance segmentation

Firstly, thanks for your paper, it is great work.
The paper says that there is an instance segmentation sub-branch in the character branch, but how do you obtain the instance segmentation ground-truth masks? Do you obtain them the same way as Mask TextSpotter?
@mscottml

Is the model in download_weights.sh trained on real-world images or not?

As the title says: was the model downloaded by download_weights.sh trained on real-world images with "iterative character detection"?

ModuleNotFoundErrors and GPU-only Support

Summary

Hi! Great project and very interesting paper. I've encountered two issues:

1. setup.py doesn't include all of the dependencies.
2. The GPU-only support isn't explicit.

Steps to Recreate

After following the setup instructions in your README on Ubuntu 18.04 with Python 3.7.4, I tried running:

python tools/test_net.py configs/icdar2015_hourglass88.yaml images_dir results_dir

and received the following error:

python tools/test_net.py configs/icdar2015_hourglass88.yaml images_dir results_dir
Traceback (most recent call last):
  File "tools/test_net.py", line 9, in <module>
    from charnet.modeling.model import CharNet
  File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/model.py", line 17, in <module>
    from .postprocessing import OrientedTextPostProcessing
  File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/postprocessing.py", line 11, in <module>
    import editdistance
ModuleNotFoundError: No module named 'editdistance'

As a side note, I'm running Ubuntu on the Windows Subsystem for Linux.

After pip installing editdistance, I got another import error:

python tools/test_net.py configs/icdar2015_hourglass88.yaml images_dir results_dir
Traceback (most recent call last):
  File "tools/test_net.py", line 9, in <module>
    from charnet.modeling.model import CharNet
  File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/model.py", line 17, in <module>
    from .postprocessing import OrientedTextPostProcessing
  File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/postprocessing.py", line 13, in <module>
    from .rotated_nms import nms, nms_with_char_cls, \
  File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/rotated_nms.py", line 9, in <module>
    import pyclipper
ModuleNotFoundError: No module named 'pyclipper'

I pip installed that only to see that yacs was also required. Interestingly, yacs is in your setup.py file.

pip installing those dependencies got me past the import errors, but then I got the following error:

Traceback (most recent call last):
  File "tools/test_net.py", line 69, in <module>
    charnet.cuda()
  File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 305, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
    module._apply(fn)
  File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
    module._apply(fn)
  File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
    module._apply(fn)
  File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 224, in _apply
    param_applied = fn(param)
  File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 305, in <lambda>
    return self._apply(lambda t: t.cuda(device))
  File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/cuda/__init__.py", line 192, in _lazy_init
    _check_driver()
  File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/cuda/__init__.py", line 102, in _check_driver
    http://www.nvidia.com/Download/index.aspx""")
AssertionError: 
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx

Recommended Fix

Update setup.py to include all of the dependencies, and update the README to make it clear that a GPU machine is needed to run the code.
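
Until the dependency list is fixed, a small pre-flight check can report every missing module at once instead of one ImportError per run. This is a sketch, not part of the repository: the module list is taken from the errors reported in this tracker (editdistance, pyclipper, yacs, shapely) plus torch and torchvision from the README, and the names `REQUIRED` and `missing_modules` are made up for illustration.

```python
import importlib.util

# Import names collected from the README and the tracebacks in this tracker.
# The list may be incomplete.
REQUIRED = ["torch", "torchvision", "editdistance", "pyclipper", "yacs", "shapely"]


def missing_modules(names):
    """Return the top-level module names that cannot be found, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules were found.")
```

For the GPU requirement, `torch.cuda.is_available()` reports whether a CUDA device can be used. Whether the rest of the code runs correctly on CPU is not confirmed by the source.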

Not able to download weights

I get an error while executing the attached download_weights.sh file:
Unable to verify the issuer's authority.

Invalid syntax error

python tools/test_net.py configs/icdar2015_hourglass88.yaml <images_dir> <results_dir>

This command shows an invalid syntax error when run on Ubuntu.

I am using Python 2.7.17 and Ubuntu 18.04.

Error

I am getting this error. Could you help?
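
The issue does not show the actual error text, so its cause is unknown. Two things in the report could plausibly produce a syntax error: the Python 2.7 interpreter (the other report in this tracker ran Python 3.7.4), and literal angle brackets around the directory arguments, which a shell such as bash interprets as redirection operators. The sketch below checks for both. The function name `invocation_problems` is hypothetical, and the Python 3 requirement is an assumption rather than something the README states.

```python
import sys


def invocation_problems(version_info, args):
    """List likely problems with how the test script is being invoked."""
    problems = []
    major, minor = version_info[0], version_info[1]
    if major < 3:
        # Assumption: the code targets Python 3.
        problems.append("Python 3 is assumed; found %d.%d" % (major, minor))
    for arg in args:
        # <images_dir> and <results_dir> in the README are placeholders.
        if arg.startswith("<") and arg.endswith(">"):
            problems.append("placeholder not replaced: %s" % arg)
    return problems


print(invocation_problems(sys.version_info, sys.argv[1:]))
```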

Which version of PyTorch is used?

Loss function

Could you describe the loss function and the data loader used in the paper? We would like to reproduce your paper. Many thanks.

Training Script

Hi, thank you for this excellent work! I would be very grateful if you could include the training file for this model.
Looking forward to hearing from you.

What are the results of the given trained weights on ICDAR2015?

Following the description, I downloaded the trained weights, then tested and evaluated them on the IC15 detection dataset. The results seem strange:
precision: 0.966, recall: 0.744, hmean: 0.841
Is the result correct? Thanks for your reply.
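
As a quick consistency check on the three numbers above: hmean is the harmonic mean of precision and recall, 2PR / (P + R). The sketch below only verifies that the reported values agree with each other. It says nothing about whether they match the paper.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (the F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# 2 * 0.966 * 0.744 / (0.966 + 0.744) = 1.437408 / 1.710
print(round(hmean(0.966, 0.744), 3))  # 0.841
```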

How to get a synthetic dataset with char labels (both bounding box and character label)?

Hello,
To the best of my knowledge, the annotation format of the 800k synthetic dataset is as follows:

http://www.robots.ox.ac.uk/~vgg/data/scenetext/readme.txt

It is composed of charBB, wordBB, txt, and imnames. However, txt and wordBB are not strictly aligned.

How can I get char labels with both the bounding box and the character label?

Looking forward to your reply and thanks in advance!
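
One commonly used way to align these fields is sketched below. It rests on two assumptions that should be checked against the dataset readme: each `txt` entry may hold several words separated by spaces or newlines, with word boxes following the whitespace-split words in order, and character boxes following the non-whitespace characters in the same order. The function name `words_and_char_spans` is hypothetical.

```python
def words_and_char_spans(txt_entries):
    """Split txt entries into words and give each word its range of char-box indices.

    Assumes word boxes follow whitespace-split words in order and char boxes
    follow non-whitespace characters in order.
    """
    words = []
    spans = []
    start = 0
    for entry in txt_entries:
        for word in entry.split():  # splits on spaces and newlines
            words.append(word)
            spans.append((start, start + len(word)))
            start += len(word)
    return words, spans


words, spans = words_and_char_spans(["Lines:\nI lost", "the"])
for word, (begin, end) in zip(words, spans):
    print(word, begin, end)
```

Under these assumptions, the character labels for word `k` are the characters of `words[k]`, and their boxes are the char-box columns `begin` to `end - 1`. A useful sanity check is that the number of words equals the number of word boxes and the final `end` equals the number of character boxes.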

Bad results?

I tried your great work, but the results on my test images were not so good.

Are there any parameters that I need to adjust to achieve better results?

Line-level text detection and recognition instead of word-level

How do I train the model to detect and recognize text lines instead of words?

How do you recognize multi-language characters?

The paper says that there are only 68 classes for character recognition, but for the 2017 MLT dataset, how do you recognize non-English characters?
Thank you.

Release of Training code

Are you guys planning to release the training code? If not, can you share the details of how the loss functions are used for combined learning?

Training scripts

Hi, thanks for the work. When will the training scripts be shared?

The trained weights have been uploaded to Baidu Net Disk

When I downloaded the trained weights, I ran into "network error" and "slow download" problems, so I uploaded the weights to Baidu Net Disk, hoping this helps.

The trained weights: icdar2015_hourglass88.pth
Size: 356.9MB (in Ubuntu)

Link: https://pan.baidu.com/s/1M85Cd_V8NxqZzS41JHJP3A
Extraction code: dmbj

How to distinguish between upper and lower case

I see that the results are output in uppercase. How can I distinguish between upper and lower case?

The process hangs and never produces output

I ran it for about 1 hour and still got no output. I am using a CPU.

When will the training code be released?

Firstly, thanks for your paper, it is great work. I would like to ask when the training code will be released. I am looking forward to it.

OS error

Hello everyone, I am getting this error when running test_net.py on Windows 10:

(charnet) C:\Users\karti\research-charnet>python tools/test_net.py configs/icdar2015_hourglass88.yaml C:\Users\karti\research-charnet\inputimages C:\Users\karti\research-charnet\result
Traceback (most recent call last):
  File "tools/test_net.py", line 9, in <module>
    from charnet.modeling.model import CharNet
  File "c:\users\karti\research-charnet\charnet\modeling\model.py", line 17, in <module>
    from .postprocessing import OrientedTextPostProcessing
  File "c:\users\karti\research-charnet\charnet\modeling\postprocessing.py", line 13, in <module>
    from .rotated_nms import nms, nms_with_char_cls, \
  File "c:\users\karti\research-charnet\charnet\modeling\rotated_nms.py", line 10, in <module>
    from shapely.geometry import Polygon
  File "C:\Users\karti\Miniconda3\envs\charnet\lib\site-packages\shapely\geometry\__init__.py", line 4, in <module>
    from .base import CAP_STYLE, JOIN_STYLE
  File "C:\Users\karti\Miniconda3\envs\charnet\lib\site-packages\shapely\geometry\base.py", line 18, in <module>
    from shapely.coords import CoordinateSequence
  File "C:\Users\karti\Miniconda3\envs\charnet\lib\site-packages\shapely\coords.py", line 8, in <module>
    from shapely.geos import lgeos
  File "C:\Users\karti\Miniconda3\envs\charnet\lib\site-packages\shapely\geos.py", line 145, in <module>
    lgeos = CDLL(os.path.join(sys.prefix, 'Library', 'bin', 'geos_c.dll'))
  File "C:\Users\karti\Miniconda3\envs\charnet\lib\ctypes\__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found

please help me with this
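
The traceback shows shapely trying to load `geos_c.dll` from `Library\bin` under `sys.prefix`. A small diagnostic sketch that rebuilds that path and reports whether the file is present is given below. The helper name `expected_geos_dll` is hypothetical. Note that WinError 126 can also be raised when the DLL exists but one of its own dependencies is missing, so a "found" result does not rule out the problem.

```python
import os
import sys


def expected_geos_dll(prefix=None):
    """Rebuild the DLL path that the traceback shows shapely passing to CDLL."""
    if prefix is None:
        prefix = sys.prefix
    return os.path.join(prefix, "Library", "bin", "geos_c.dll")


path = expected_geos_dll()
print(path, "found" if os.path.isfile(path) else "missing")
```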

One and two character recognition/detection

Hello, I am having some issues trying to recognize images with only one character (or two).
I'm running this model on racing bib numbers, and some runners have 2-digit numbers on their bibs.

Is there any parameter related to the char/text length?

How are the bounding boxes of curved text handled? Does the detection branch output curved text regions as quadrilaterals or polygons?

Are the bounding boxes of curved text (e.g., total_text) quadrilateral or polygonal?

Are the bounding boxes of curved text (e.g., total_text) quadrilateral or polygonal? If quadrilateral, how does the code output a bounding box for curved text?

I want to test the performance of charnet.

I want to test the performance of charnet. How can I use existing code to test ICDAR2015 and Total_text? I am a beginner in deep learning and hope you can help me.

How to train on my own datasets?

Time-consuming postprocessing

I noticed that the network spends too much time on postprocessing (only 0.1s for running through the network, but around 2s for postprocessing). Is there anything I can do to speed up the postprocessing? Thank you!
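
Before optimizing, it helps to confirm where the time goes. The sketch below is a generic stage timer built on the standard library. The name `timed` and the stand-in workload are illustrative and not part of the repository. One caveat when timing GPU code: CUDA calls are asynchronous, so without calling `torch.cuda.synchronize()` before reading the clock, part of the network's time can be attributed to whichever later step first waits for the result.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(name, totals):
    """Accumulate the wall-clock time of the enclosed block into totals[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        totals[name] = totals.get(name, 0.0) + (time.perf_counter() - start)


totals = {}
with timed("postprocess", totals):
    sorted(range(100000), reverse=True)  # stand-in for the real postprocessing call
print(totals)
```

Wrapping the forward pass and the postprocessing call in separate `timed` blocks over many images gives per-stage totals. The standard `cProfile` module can then show which functions dominate inside the slow stage.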

The source code does not include the training method and loss function

The code is open-sourced, yet others are not allowed to train it. Are you afraid of being proven wrong? Or are you looking for free testers?

How to collect the "correct" char-level bounding boxes detected in real-world images

Author, I want to know how to collect the "correct" char-level bounding boxes detected in real-world images in the second step of Section 3.4, Iterative Character Detection. I also want to know how to get synthetic data, such as Synth800k, with char-level labels. Looking forward to your reply. Thanks!
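
An earlier issue on this page notes that, in the paper, only the ground-truth word length is used for this step. Read that way, a detection is treated as "correct" when the number of character boxes found inside a word equals the number of characters in its transcript. The sketch below implements only that reading. The data layout, the name `select_verified_instances`, and the skipping of `###` (don't-care) words are assumptions, not the authors' code.

```python
def select_verified_instances(instances):
    """Keep word instances whose detected char count matches the transcript length.

    Each instance is assumed to be a dict with a "transcript" string and a
    "char_boxes" list holding the character boxes detected inside that word.
    """
    kept = []
    for inst in instances:
        transcript = inst["transcript"]
        if transcript == "###":  # assumed: don't-care words carry no usable label
            continue
        if len(inst["char_boxes"]) == len(transcript):
            kept.append(inst)
    return kept


# Hypothetical detections for three ground-truth words.
detections = [
    {"transcript": "EXIT", "char_boxes": ["b0", "b1", "b2", "b3"]},
    {"transcript": "OPEN", "char_boxes": ["b0", "b1", "b2"]},
    {"transcript": "###", "char_boxes": ["b0", "b1", "b2"]},
]
print([inst["transcript"] for inst in select_verified_instances(detections)])
```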

Some questions

Hello author, after reading the CharNet paper and code, I have some questions:

1. Character Branch

Section 3.2, Character Branch, of the paper says:

This branch contains three sub-branches, for text instance segmentation, character detection and character recognition, respectively.

But in model.py, I didn't find the text instance segmentation sub-branch depicted in Figure 2. In your code, is it replaced by a shrunk char region score prediction branch, just like in the EAST model?

Below are some visualization samples using your pretrained model:

(I used cv2.applyColorMap(), cv2.addWeighted() and cv2.polylines() for better visualization)
(the angle output is None???)

So, is CharNet's Character Branch in fact an EAST-like head (shrunk char score map & geometry map) + a char recognition head?

2. IC15 test set performance

I used the pretrained model and the default config file. The result on the IC15 test set is:

precision:0.966   recall:0.744   hmean:0.841

which is far from the results reported in the paper. I noticed that pred_char_orient in the CharDetector class is None. So is this open-sourced code incomplete?

3. Iterative Character Detection

The Iterative Character Detection method is the key to training CharNet on real-world datasets. During each step (2nd to 4th), are the parameters of Model A, which generates the pseudo-ground-truth char bboxes, fixed and different from those of Model B being trained? Or is there only one model during the whole training schedule?
Looking forward to your reply, thanks!