msight-tech / research-charnet
CharNet: Convolutional Character Networks
License: Other
Hello everyone, I am getting this error when running test_net.py on Windows 10:
(charnet) C:\Users\karti\research-charnet>python tools/test_net.py configs/icdar2015_hourglass88.yaml C:\Users\karti\research-charnet\inputimages C:\Users\karti\research-charnet\result
Traceback (most recent call last):
File "tools/test_net.py", line 9, in <module>
from charnet.modeling.model import CharNet
File "c:\users\karti\research-charnet\charnet\modeling\model.py", line 17, in <module>
from .postprocessing import OrientedTextPostProcessing
File "c:\users\karti\research-charnet\charnet\modeling\postprocessing.py", line 13, in <module>
from .rotated_nms import nms, nms_with_char_cls, \
File "c:\users\karti\research-charnet\charnet\modeling\rotated_nms.py", line 10, in <module>
from shapely.geometry import Polygon
File "C:\Users\karti\Miniconda3\envs\charnet\lib\site-packages\shapely\geometry\__init__.py", line 4, in <module>
from .base import CAP_STYLE, JOIN_STYLE
File "C:\Users\karti\Miniconda3\envs\charnet\lib\site-packages\shapely\geometry\base.py", line 18, in <module>
from shapely.coords import CoordinateSequence
File "C:\Users\karti\Miniconda3\envs\charnet\lib\site-packages\shapely\coords.py", line 8, in <module>
from shapely.geos import lgeos
File "C:\Users\karti\Miniconda3\envs\charnet\lib\site-packages\shapely\geos.py", line 145, in <module>
lgeos = CDLL(os.path.join(sys.prefix, 'Library', 'bin', 'geos_c.dll'))
File "C:\Users\karti\Miniconda3\envs\charnet\lib\ctypes\__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found
Please help me with this.
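For what it's worth, WinError 126 here usually means geos_c.dll is not where shapely's loader expects it. A small stdlib-only check like the one below can confirm whether the DLL exists before digging further; the candidate path list is an assumption about typical conda layouts, and reinstalling shapely from conda-forge is a common fix, not a guaranteed one.

```python
import os
import sys

# Candidate locations where a conda environment typically places
# geos_c.dll on Windows (an assumption -- adjust for your own layout).
CANDIDATES = [
    os.path.join(sys.prefix, "Library", "bin", "geos_c.dll"),
    os.path.join(sys.prefix, "DLLs", "geos_c.dll"),
]

def missing(paths):
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]

if __name__ == "__main__":
    gone = missing(CANDIDATES)
    if gone:
        print("Not found:", *gone, sep="\n  ")
        print("Try: conda install -c conda-forge shapely")
    else:
        print("geos_c.dll found; the problem is elsewhere.")
```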
Great job, much appreciated! I'm looking forward to the training code...
Hello,
To the best of my knowledge, the annotation format of the 800k synthetic dataset is as follows:
http://www.robots.ox.ac.uk/~vgg/data/scenetext/readme.txt
It is composed of charBB, wordBB, txt, and imnames. However, the txt and the wordBB are not strictly aligned.
How can I obtain char-level labels with both the bounding box and the character class?
Looking forward to your reply and thanks in advance!
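Per the readme linked above, each txt entry can pack several whitespace-separated words (including newlines), and splitting on whitespace yields words in the same order as the wordBB columns; the non-whitespace characters then align one-to-one with the charBB columns. A sketch of that usual alignment (my own helper, not official tooling):

```python
def words_in_order(txt_entries):
    """Flatten SynthText-style `txt` entries into individual words.

    Each entry may hold several words separated by spaces/newlines;
    splitting on whitespace yields one word per wordBB column, in order.
    """
    words = []
    for entry in txt_entries:
        words.extend(entry.split())
    return words

def chars_in_order(txt_entries):
    """Non-whitespace characters, aligned one-to-one with charBB columns."""
    return [c for w in words_in_order(txt_entries) for c in w]

# Toy entries in the style of gt.mat's txt field:
entries = ["Lines:\nI lost", "Kevin will "]
# words_in_order(entries) -> ['Lines:', 'I', 'lost', 'Kevin', 'will']
```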
Will the training code be provided later?
@mscottml Thank you very much for your great work. Can you tell me what the specific definition of the loss function is? Can you provide it in advance? Thanks.
Hi. Thank you for the interesting work.
I'm confused about your implementation. In the paper, for training with your iterative character detection module, only the gt-word length is used. For the ICDAR2015 task 4.1 text localization, a single "word" gt is provided (an accurate word or ###) in string form. But your implementation uses a lexicon, i.e., candidate words! Is that fair for a "localization (or detection) only" task?
It is obvious that using a lexicon actually enhances the detection result! Or is there anything I missed?
Thanks for reading.
The website for downloading the trained weights has been disabled.
Could anyone please provide the weights?
I have added a training module, and I noticed that forwarding hourglass88 under torch.no_grad works just fine, but without torch.no_grad the GPU (12 GB) runs out of memory on a single forwarded image. I get this message:
RuntimeError: CUDA out of memory. Tried to allocate 46.00 MiB (GPU 0; 10.91 GiB total capacity; 10.21 GiB already allocated; 44.75 MiB free; 92.41 MiB cached).
It seems from the paper that you train with not 1 but 4 images per mini-batch.
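If the paper's mini-batch of 4 doesn't fit on a 12 GB card once gradients are kept, one common workaround (my suggestion, not the authors' recipe) is gradient accumulation: back-propagate one image at a time and only step the optimizer every 4 micro-batches. The sketch below shows just the bookkeeping on plain floats; in PyTorch you would call loss.backward() per micro-batch and optimizer.step() every `accum` iterations.

```python
def accumulate_steps(micro_grads, accum=4):
    """Schematic gradient accumulation: average `accum` micro-batch
    gradients before each optimizer step, emulating a batch of `accum`
    images on a GPU that only fits one at a time. Gradients are plain
    floats here purely for illustration."""
    steps = []
    bucket = []
    for g in micro_grads:
        bucket.append(g)
        if len(bucket) == accum:
            steps.append(sum(bucket) / accum)  # averaged "step" gradient
            bucket = []
    return steps

# Eight micro-batches with accum=4 produce two optimizer steps:
# accumulate_steps([1, 2, 3, 4, 5, 6, 7, 8]) -> [2.5, 6.5]
```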
When I downloaded the trained weights, I encountered "network error" or very slow downloads, so I uploaded the weights to Baidu Net Disk, hoping this helps you.
The Trained Weight: icdar2015_hourglass88.pth
Size: 356.9MB (in Ubuntu)
Link: https://pan.baidu.com/s/1M85Cd_V8NxqZzS41JHJP3A
Extraction code: dmbj
Firstly, thanks for your paper; it is a great job.
In the paper, it is said that there is an instance segmentation sub-branch in the character branch, but how do you obtain the instance segmentation ground-truth masks? Do you obtain them the same way as Mask TextSpotter?
@mscottml
It is creating an empty file in the weights directory.
Hello, I am having some issues trying to recognize images with only one character (or two).
I'm running this model on racing bib numbers, and some runners have 2-digit numbers on their bibs.
Is there any parameter related to the char/text length?
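If such a parameter exists, it would live in the repo's postprocessing; the snippet below is purely hypothetical (function and parameter names are mine, not CharNet's) and only illustrates the kind of minimum-length/score gate worth searching for and relaxing when 1-2 character detections vanish.

```python
def keep_word(text, score, min_len=1, min_score=0.5):
    """Hypothetical post-filter (not CharNet's actual API): drop
    detections shorter than min_len characters or scored below
    min_score. If short bib numbers disappear, look for a gate like
    this in the postprocessing code and loosen min_len."""
    return len(text) >= min_len and score >= min_score
```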
Hello author, after reading the CharNet paper and code, I have some questions:
In 3.2. Character Branch of paper, it said:
This branch contains three sub-branches, for text instance segmentation, character detection and character recognition, respectively.
But in model.py, I didn't find the text instance segmentation sub-branch depicted in Figure 2. In your code, is it replaced by a shrunk char region score prediction branch, just like the EAST model?
Below are some visualization samples using your pretrained model:
(I used cv2.applyColorMap(), cv2.addWeighted() and cv2.polylines() for better visualization)
(the angle output is None???)
So, is CharNet's character branch in fact an EAST-like head (shrunk char score map & geometry map) plus a char recognition head?
I used the pretrained model and the default config file; the result on the IC15 test set is:
precision: 0.966, recall: 0.744, hmean: 0.841
which is far from the paper's report. I noticed that pred_char_orient in the CharDetector class is None. So is this open-sourced code incomplete?
The iterative character detection method is the key to training CharNet on real-world datasets. During each step (2nd~4th), are the parameters of the model A that generates pseudo-gt char bboxes fixed, with A different from the model B being trained? Or is there only one model during the whole training schedule?
Looking forward to your reply, thanks!
Hello,
How can I speed up inference time on CharNet as described in tools/test_net.py?
I tried to apply torch multiprocessing, but it failed due to lambda usage in the model definition.
Some clues on how to batch-predict with CharNet would help a lot here.
Thanks!
I see that the recognition results are in uppercase. How can I distinguish between upper and lower case?
I noticed that the network takes too much time on postprocessing (only 0.1 s to run through the network, but around 2 s for postprocessing). Is there any way to speed up the postprocessing? Thank you!
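Much of that 2 s is likely spent in the pairwise shapely Polygon intersections inside the rotated NMS. One speedup that does not change the results (a sketch of a standard trick, not the repo's code) is a cheap axis-aligned rejection test before each exact polygon check: if the bounding rectangles are disjoint, the polygons cannot intersect, so the expensive call can be skipped.

```python
def aabb(poly):
    """Axis-aligned bounding box (xmin, ymin, xmax, ymax) of a polygon
    given as a list of (x, y) vertices."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    return min(xs), min(ys), max(xs), max(ys)

def boxes_may_overlap(a, b):
    """Cheap rejection test: False means the polygons are certainly
    disjoint, so the exact shapely Polygon.intersection call (and the
    IoU computation) can be skipped entirely for this pair."""
    ax0, ay0, ax1, ay1 = aabb(a)
    bx0, by0, bx1, by1 = aabb(b)
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1
```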
I think there should be a requirements file as well.
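Agreed. Based purely on the imports that fail elsewhere in this thread (editdistance, pyclipper, yacs, shapely) plus the evident torch and cv2 usage, a starting requirements.txt might look like this; it is unpinned and unverified against the actual codebase, so check it against the repo's imports before relying on it:

```text
torch
opencv-python
shapely
editdistance
pyclipper
yacs
```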
Could you describe the loss function and the DataLoader of the paper? We would like to reproduce your paper. Many thanks.
Following the description, I downloaded the trained weights, then tested and evaluated on the detection dataset of IC15. The results seem strange, as follows:
precision: 0.966, recall: 0.744, hmean: 0.841
Is the result correct? Thanks for your reply.
In weights.sh there is :
wget https://cloudstor.aarnet.edu.au/plus/s/c0PaY4pzPUhPmL9/download -O weights/icdar2015_hourglass88.pth
Author, I want to know how to collect the "correct" char-level bounding boxes detected in real-world images in the second step of Section 3.4, Iterative Character Detection. I also want to know how to get the synthetic data, such as Synth800k, with char-level labels. Looking forward to your reply. Thanks!
The paper says there are only 68 classes for character recognition, but for the 2017 MLT dataset, how do you recognize characters outside English?
Thank you.
Are the bounding boxes of curved text (e.g., Total-Text) quadrilateral or polygonal? If quadrilateral, how does the code output a bounding box for curved text?
Firstly, thanks for your paper; it is a great job. Then, I want to ask: when will the training code be released? I am looking forward to it.
Hi, Thanks for the work. When will the training scripts be shared?
As the title says: did you train the model downloaded in download_weights.sh on real-world images with "iterative character detection"?
Simply put, my image only contains numbers.
Are you guys planning to release the training code? If not, can you share the details of the loss functions used for combined learning?
You open-sourced the code but still don't let people train with it. Are you afraid of being proven wrong, or are you just looking for free testers?
Hi, thank you for this excellent work! I would be very grateful if you could include the training file for this model.
Looking forward to hearing from you.
In your code, it saves word boxes. I want to save char boxes; how can I do that?
How do I train the model to detect and recognize line text instead of word text?
Hi! Great project and very interesting paper. I've encountered two issues:
setup.py doesn't include all of the dependencies.
After following the setup instructions in your README on Ubuntu 18.04 with Python 3.7.4, I tried running:
python tools/test_net.py configs/icdar2015_hourglass88.yaml images_dir results_dir
and received the following error:
python tools/test_net.py configs/icdar2015_hourglass88.yaml images_dir results_dir
Traceback (most recent call last):
File "tools/test_net.py", line 9, in <module>
from charnet.modeling.model import CharNet
File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/model.py", line 17, in <module>
from .postprocessing import OrientedTextPostProcessing
File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/postprocessing.py", line 11, in <module>
import editdistance
ModuleNotFoundError: No module named 'editdistance'
As a side note, I'm running Ubuntu on the Windows Subsystem for Linux
After pip installing editdistance, I got another import error:
python tools/test_net.py configs/icdar2015_hourglass88.yaml images_dir results_dir
Traceback (most recent call last):
File "tools/test_net.py", line 9, in <module>
from charnet.modeling.model import CharNet
File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/model.py", line 17, in <module>
from .postprocessing import OrientedTextPostProcessing
File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/postprocessing.py", line 13, in <module>
from .rotated_nms import nms, nms_with_char_cls, \
File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/rotated_nms.py", line 9, in <module>
import pyclipper
ModuleNotFoundError: No module named 'pyclipper'
I pip installed that, only to see that yacs was also required. Interestingly, yacs is in your setup.py file.
pip installing those dependencies got me past the import errors, but then I got the following error:
Traceback (most recent call last):
File "tools/test_net.py", line 69, in <module>
charnet.cuda()
File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 305, in cuda
return self._apply(lambda t: t.cuda(device))
File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
module._apply(fn)
File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
module._apply(fn)
File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
module._apply(fn)
File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 224, in _apply
param_applied = fn(param)
File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 305, in <lambda>
return self._apply(lambda t: t.cuda(device))
File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/cuda/__init__.py", line 192, in _lazy_init
_check_driver()
File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/cuda/__init__.py", line 102, in _check_driver
http://www.nvidia.com/Download/index.aspx""")
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
I suggest updating setup.py to include all of the dependencies, and also updating the README to make it clear that you need a GPU machine in order to run the code.
I want to test the performance of CharNet. How can I use the existing code to evaluate on ICDAR2015 and Total-Text? I am a beginner in deep learning and hope you can help me.
Hi, I have read your CharNet paper and am interested in it. But only a portion of the evaluation code was open-sourced. Can I obtain the full source code as soon as possible (for academic purposes) through other means (such as purchase)?
Thank you for your time and consideration.
While applying the code to pictures like this,
some problems arise:
1. Numbers with 2 digits cannot be detected.
2. Some numbers are recognized as English letters, e.g. 4→A.
Also, while the numbers in one picture can be recognized, the txt file in <results_dir> corresponding to another picture with the same content, but in a slightly different view, is empty.
Could you please tell me how to solve these problems?
Can your code generate bbox?
Thanks a lot!
In your paper, the images you show have rotated character bounding boxes and simple bounding boxes from the word detector. However, in your code, the WordDetector has an orientation prediction head and output, while the CharDetector returns None for the character orientation. Is it possible that the two have been accidentally swapped?
Could you tell us the size of the input images during training?
Is it the same as in the testing phase?