research-charnet's Introduction

Convolutional Character Networks

This project hosts the testing code for CharNet, described in our paper:

Convolutional Character Networks
Linjie Xing, Zhi Tian, Weilin Huang, and Matthew R. Scott;
In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2019.

Installation

pip install torch torchvision
python setup.py build develop

Run

  1. Please run bash download_weights.sh to download our trained weights.

  2. For ICDAR 2015, run the following command, replacing <images_dir> with the directory containing the ICDAR 2015 test images. The results are written to <results_dir>.

    python tools/test_net.py configs/icdar2015_hourglass88.yaml <images_dir> <results_dir>
    

Citation

If you find this work useful for your research, please cite as:

@inproceedings{xing2019charnet,
title={Convolutional Character Networks},
author={Xing, Linjie and Tian, Zhi and Huang, Weilin and Scott, Matthew R},
booktitle={Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
year={2019}
}

Contact

For any questions, please feel free to reach:

License

CharNet is CC-BY-NC 4.0 licensed, as found in the LICENSE file. It is released for academic research / non-commercial use only. If you wish to use it for commercial purposes, please contact [email protected].

research-charnet's People

Contributors

cyril9227, mscottml

research-charnet's Issues

speed up inference

Hello,

How can I speed up inference with CharNet as run in tools/test_net.py?
I tried torch multiprocessing, but it failed because of the lambdas used in the model definition.

Any pointers on how to run batched prediction with CharNet would help a lot.

Thanks!
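
A note on the lambda failure described above: assuming it was a pickling error (multiprocessing with the spawn start method pickles what it sends to workers), the usual workaround is to replace each lambda with a module-level function, optionally bound with functools.partial. The sketch below uses only the standard library; `scale`, `lambda_op` and `partial_op` are hypothetical names and do not come from the CharNet code.

```python
import pickle
from functools import partial


def scale(x, factor):
    """Module-level function, so pickle can store it by reference."""
    return x * factor


def is_picklable(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# Hypothetical stand-in for a lambda stored inside a model definition.
lambda_op = lambda x: x * 2
# Replacement with the same behaviour, built from a named function.
partial_op = partial(scale, factor=2)

print(is_picklable(lambda_op))   # False
print(is_picklable(partial_op))  # True
print(partial_op(21))            # 42
```

Whether this alone makes the whole model usable across processes is not confirmed here; it only removes the lambda-specific obstacle.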

How to improve the results of recognizing numbers?

While applying the code to pictures like this one, some problems arise:
1. Two-digit numbers are not detected.
2. Some numbers are recognized as English letters, e.g. 4→A.

Also, while some numbers in one picture are recognized, the txt file in <results_dir> corresponding to another picture with the same content, taken from a slightly different view, is empty.

Could you please tell me how to solve these problems?
Can your code generate bounding boxes?
Thanks a lot!

Is a lexicon used for text detection?

Hi. Thank you for the interesting work.
I'm confused about your implementation. In the paper, training with your iterative character detection module uses only the ground-truth word length. For ICDAR 2015 Task 4.1 (text localization), a single ground-truth "word" is provided in string form (the exact word or ###). But your implementation uses a lexicon, i.e. the candidate words. Is that fair for a localization-only (detection-only) task?
Using a lexicon obviously improves the detection result. Or is there anything I missed?

Thanks for reading.

Training mode error

I have added a training module and noticed that forwarding through hourglass88 under torch.no_grad works fine, but without torch.no_grad the GPU (12 GB) runs out of memory when forwarding a single image sample. I get this message:

RuntimeError: CUDA out of memory. Tried to allocate 46.00 MiB (GPU 0; 10.91 GiB total capacity; 10.21 GiB already allocated; 44.75 MiB free; 92.41 MiB cached).

According to the paper, you train with 4 images per mini-batch, not 1.

Question about Character and Word detection branches

In your paper, the images you show have rotated character bounding boxes and simple bounding boxes for the word detector. However, in your code, the WordDetector has an orientation prediction head and output, while the CharDetector returns None for the character orientation. Is it possible that the two have been accidentally swapped?

about raw training code.

Hi, I have read your CharNet paper and am interested in it. However, only part of the evaluation code was open-sourced. Could I obtain the full original code soon (for academic purposes) through other means, such as purchase?
Thank you for your time and consideration.

Trained weights

The website for downloading the trained weights has been disabled.
Could anyone please provide the weights?

Question about text instance segmentation.

First, thank you for the paper; it is great work.
In the paper, the character branch is said to include an instance segmentation sub-branch, but how do you obtain the ground-truth instance segmentation masks? Do you obtain them the same way as Mask TextSpotter?
@mscottml

ModuleNotFoundErrors and GPU-only Support

Summary

Hi! Great project and very interesting paper. I've encountered two issues:

  1. setup.py doesn't include all of the dependencies
  2. The GPU-only support isn't explicit.

Steps to Recreate

After following the setup instructions in your README on Ubuntu 18.04 with Python 3.7.4, I tried running:

python tools/test_net.py configs/icdar2015_hourglass88.yaml images_dir results_dir

and received the following error:

python tools/test_net.py configs/icdar2015_hourglass88.yaml images_dir results_dir
Traceback (most recent call last):
  File "tools/test_net.py", line 9, in <module>
    from charnet.modeling.model import CharNet
  File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/model.py", line 17, in <module>
    from .postprocessing import OrientedTextPostProcessing
  File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/postprocessing.py", line 11, in <module>
    import editdistance
ModuleNotFoundError: No module named 'editdistance'

As a side note, I'm running Ubuntu on the Windows Subsystem for Linux

After pip installing editdistance, I got another import error:

python tools/test_net.py configs/icdar2015_hourglass88.yaml images_dir results_dir
Traceback (most recent call last):
  File "tools/test_net.py", line 9, in <module>
    from charnet.modeling.model import CharNet
  File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/model.py", line 17, in <module>
    from .postprocessing import OrientedTextPostProcessing
  File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/postprocessing.py", line 13, in <module>
    from .rotated_nms import nms, nms_with_char_cls, \
  File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/rotated_nms.py", line 9, in <module>
    import pyclipper
ModuleNotFoundError: No module named 'pyclipper'

I pip installed that only to see that yacs was also required. Interestingly, yacs is in your setup.py file.

pip installing those dependencies got me past the import errors, but then I got the following error:

Traceback (most recent call last):
  File "tools/test_net.py", line 69, in <module>
    charnet.cuda()
  File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 305, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
    module._apply(fn)
  File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
    module._apply(fn)
  File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
    module._apply(fn)
  File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 224, in _apply
    param_applied = fn(param)
  File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 305, in <lambda>
    return self._apply(lambda t: t.cuda(device))
  File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/cuda/__init__.py", line 192, in _lazy_init
    _check_driver()
  File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/cuda/__init__.py", line 102, in _check_driver
    http://www.nvidia.com/Download/index.aspx""")
AssertionError: 
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx

Recommended Fix

Update setup.py to include all of the dependencies, and update the README to make it clear that a GPU machine is required to run the code.
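
Until that happens, a pre-flight check along these lines can surface the same problems before tools/test_net.py is launched. It is a minimal sketch, not part of the repository: the module list is assembled from the README and the tracebacks on this page (torch, yacs, editdistance, pyclipper, shapely) and may not be complete, and `find_missing` and `cuda_is_available` are hypothetical helper names.

```python
import importlib.util

# Assumption: gathered from the README and the tracebacks on this page.
REQUIRED_MODULES = ["torch", "yacs", "editdistance", "pyclipper", "shapely"]


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def cuda_is_available():
    """Report CUDA availability; False when torch itself is not installed."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return torch.cuda.is_available()


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules; try: pip install " + " ".join(missing))
    if not cuda_is_available():
        # The traceback above shows tools/test_net.py calling charnet.cuda().
        print("No usable CUDA device found; the test script expects one.")
```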

Invalid syntax error

python tools/test_net.py configs/icdar2015_hourglass88.yaml <images_dir> <results_dir>

This command produces an invalid syntax error when run on Ubuntu.
I am using Python 2.7.17 and Ubuntu 18.04.
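
The screenshot is not preserved, so the cause can only be guessed at. Two candidates are running the code under Python 2 (another report on this page used Python 3.7.4) and typing the angle-bracket placeholders literally, which a shell treats as redirection operators. The sketch below checks the first and builds the command with real paths for the second. `MIN_PYTHON` and the `python3` executable name are assumptions, not values stated by the README.

```python
import sys

MIN_PYTHON = (3, 6)  # assumption: the README does not state a minimum version


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True when the interpreter version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


def build_command(images_dir, results_dir, python="python3"):
    """Build the README command with real paths instead of <placeholders>."""
    return [python, "tools/test_net.py", "configs/icdar2015_hourglass88.yaml",
            images_dir, results_dir]


print(python_is_supported((2, 7, 17)))  # False
print(python_is_supported((3, 7, 4)))   # True
print(" ".join(build_command("images_dir", "results_dir")))
```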

Error

I am getting this error. Could you help ?

Screenshot 2020-02-03 at 9 39 38 PM

Loss function

Could you describe the loss function and the data loader used in the paper? We would like to reproduce your paper. Many thanks.

Training Script

Hi, thank you for this excellent work! I would be very grateful if you could include the training file for this model.
Looking forward to hearing from you.

Bad result ?

I tried your great work, but the results on my test images were not very good.

Are there any parameters that I need to adjust to achieve better results?

Release of Training code

Are you planning to release the training code? If not, can you share details of the loss functions used for combined learning?

Training scripts

Hi, Thanks for the work. When will the training scripts be shared?

OS error

Hello everyone, I am getting this error when running test_net.py on Windows 10:

(charnet) C:\Users\karti\research-charnet>python tools/test_net.py configs/icdar2015_hourglass88.yaml C:\Users\karti\research-charnet\inputimages C:\Users\karti\research-charnet\result
Traceback (most recent call last):
  File "tools/test_net.py", line 9, in <module>
    from charnet.modeling.model import CharNet
  File "c:\users\karti\research-charnet\charnet\modeling\model.py", line 17, in <module>
    from .postprocessing import OrientedTextPostProcessing
  File "c:\users\karti\research-charnet\charnet\modeling\postprocessing.py", line 13, in <module>
    from .rotated_nms import nms, nms_with_char_cls, \
  File "c:\users\karti\research-charnet\charnet\modeling\rotated_nms.py", line 10, in <module>
    from shapely.geometry import Polygon
  File "C:\Users\karti\Miniconda3\envs\charnet\lib\site-packages\shapely\geometry\__init__.py", line 4, in <module>
    from .base import CAP_STYLE, JOIN_STYLE
  File "C:\Users\karti\Miniconda3\envs\charnet\lib\site-packages\shapely\geometry\base.py", line 18, in <module>
    from shapely.coords import CoordinateSequence
  File "C:\Users\karti\Miniconda3\envs\charnet\lib\site-packages\shapely\coords.py", line 8, in <module>
    from shapely.geos import lgeos
  File "C:\Users\karti\Miniconda3\envs\charnet\lib\site-packages\shapely\geos.py", line 145, in <module>
    lgeos = CDLL(os.path.join(sys.prefix, 'Library', 'bin', 'geos_c.dll'))
  File "C:\Users\karti\Miniconda3\envs\charnet\lib\ctypes\__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found

Please help me with this.
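
The traceback shows shapely trying to load geos_c.dll from Library\bin under sys.prefix, and WinError 126 generally means that the DLL, or something it depends on, could not be found. A small diagnostic, run inside the same environment, can confirm whether the file is there. This is a sketch; `geos_dll_path` and `geos_dll_present` are hypothetical helper names.

```python
import os
import sys


def geos_dll_path(prefix=None):
    """Path that shapely/geos.py passes to CDLL, per the traceback above."""
    if prefix is None:
        prefix = sys.prefix
    return os.path.join(prefix, "Library", "bin", "geos_c.dll")


def geos_dll_present(prefix=None):
    """Return True if the GEOS DLL exists at the expected location."""
    return os.path.isfile(geos_dll_path(prefix))


if __name__ == "__main__":
    status = "found" if geos_dll_present() else "missing"
    print(f"{geos_dll_path()}: {status}")
```

If the file is missing, installing shapely through conda rather than pip is a commonly suggested remedy, since the conda package ships the GEOS library; this thread does not confirm that it resolves the error here.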

One and two character recognition/detection

Hello, I am having trouble recognizing images that contain only one or two characters.
I'm running this model on racing bib numbers, and some runners have two-digit numbers on their bibs.

Is there any parameter related to the character/text length?

Time consuming on postprocessing

I noticed that the pipeline spends most of its time on postprocessing (only 0.1 s for the forward pass through the network, but around 2 s for postprocessing). Is there any way to speed up the postprocessing? Thank you!
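
A first step for a question like this is to find which postprocessing calls account for the 2 s. The sketch below wraps any callable with the standard-library profiler and returns a short report. `slow_postprocess` is a hypothetical stand-in, not CharNet code; in practice the real postprocessing call would be passed to `profile_call` instead.

```python
import cProfile
import io
import pstats


def slow_postprocess(n=200):
    """Hypothetical stand-in for a postprocessing step (quadratic pairwise loop)."""
    total = 0
    for i in range(n):
        for j in range(n):
            total += (i * j) % 7
    return total


def profile_call(func, *args, top=5, **kwargs):
    """Run func under cProfile; return (result, report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


result, report = profile_call(slow_postprocess, 200)
print(report)
```

Sorting by cumulative time puts the outermost expensive calls first, which is usually enough to tell whether the cost sits in NMS, polygon operations, or lexicon matching.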

some questions

Hello author, after reading the CharNet paper and code, I have some questions:

1. Character Branch

Section 3.2 (Character Branch) of the paper says:

This branch contains three sub-branches, for text instance segmentation, character detection and character recognition, respectively.

But in model.py, I couldn't find the text instance segmentation sub-branch depicted in Figure 2. In your code, is it replaced by a shrunk character region score prediction branch, as in the EAST model?

Below are some visualization samples using your pretrained model:
(I used cv2.applyColorMap(), cv2.addWeighted() and cv2.polylines() for better visualization)
(the angle output is None???)

So CharNet's Character Branch is in fact an EAST-like head (shrunk character score map and geometry map) plus a character recognition head?

2. ic15 testset performance

I used the pretrained model and the default config file; the result on the IC15 test set is:

precision:0.966   recall:0.744   hmean:0.841

which is far from the result reported in the paper. I noticed that pred_char_orient in the CharDetector class is None. So is this open-sourced code incomplete?
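
For reference, the three numbers quoted above are consistent with one another if hmean is the harmonic mean of precision and recall (the F1 score). That is the usual definition, though it is assumed here rather than confirmed from the evaluation script. A quick check:

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


print(round(hmean(0.966, 0.744), 3))  # 0.841
```

The gap is on the recall side: precision is already 0.966, so the hmean is held down by the 0.744 recall.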

3. Iterative Character Detection

Iterative Character Detection is the key to training CharNet on real-world datasets. During each step (2nd to 4th), are the parameters of Model A, which generates the pseudo-ground-truth character bounding boxes, fixed and separate from the Model B being trained? Or is there only one model during the whole training schedule?
Looking forward to your reply, thanks!

loss function

@mscottml Thank you very much for your great work. Could you tell me the specific definition of the loss function? Could you provide it in advance? Thanks.
