msight-tech / research-charnet
CharNet: Convolutional Character Networks
License: Other
Hello everyone, I am getting this error when running test_net.py on Windows 10:
(charnet) C:\Users\karti\research-charnet>python tools/test_net.py configs/icdar2015_hourglass88.yaml C:\Users\karti\research-charnet\inputimages C:\Users\karti\research-charnet\result
Traceback (most recent call last):
File "tools/test_net.py", line 9, in <module>
from charnet.modeling.model import CharNet
File "c:\users\karti\research-charnet\charnet\modeling\model.py", line 17, in <module>
from .postprocessing import OrientedTextPostProcessing
File "c:\users\karti\research-charnet\charnet\modeling\postprocessing.py", line 13, in <module>
from .rotated_nms import nms, nms_with_char_cls, \
File "c:\users\karti\research-charnet\charnet\modeling\rotated_nms.py", line 10, in <module>
from shapely.geometry import Polygon
File "C:\Users\karti\Miniconda3\envs\charnet\lib\site-packages\shapely\geometry\__init__.py", line 4, in <module>
from .base import CAP_STYLE, JOIN_STYLE
File "C:\Users\karti\Miniconda3\envs\charnet\lib\site-packages\shapely\geometry\base.py", line 18, in <module>
from shapely.coords import CoordinateSequence
File "C:\Users\karti\Miniconda3\envs\charnet\lib\site-packages\shapely\coords.py", line 8, in <module>
from shapely.geos import lgeos
File "C:\Users\karti\Miniconda3\envs\charnet\lib\site-packages\shapely\geos.py", line 145, in <module>
lgeos = CDLL(os.path.join(sys.prefix, 'Library', 'bin', 'geos_c.dll'))
File "C:\Users\karti\Miniconda3\envs\charnet\lib\ctypes\__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found
Please help me with this.
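For what it's worth, WinError 126 here usually means geos_c.dll is not where shapely's loader expects it. A small stdlib-only check like the one below can confirm whether the DLL exists before digging further; the candidate path list is an assumption about typical conda layouts, and reinstalling shapely from conda-forge is a common fix, not a guaranteed one.

```python
import os
import sys

# Candidate locations where a conda environment typically places
# geos_c.dll on Windows (an assumption -- adjust for your own layout).
CANDIDATES = [
    os.path.join(sys.prefix, "Library", "bin", "geos_c.dll"),
    os.path.join(sys.prefix, "DLLs", "geos_c.dll"),
]

def missing(paths):
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]

if __name__ == "__main__":
    gone = missing(CANDIDATES)
    if gone:
        print("Not found:", *gone, sep="\n  ")
        print("Try: conda install -c conda-forge shapely")
    else:
        print("geos_c.dll found; the problem is elsewhere.")
```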
Great job, much appreciated! I'm looking forward to the training code...
Hello,
To the best of my knowledge, the annotation format of the 800k synthetic dataset is as follows:
http://www.robots.ox.ac.uk/~vgg/data/scenetext/readme.txt
It is composed of charBB, wordBB, txt, and imnames. However, the txt and the wordBB are not strictly aligned.
How can I obtain char-level labels with both the bounding box and the character class?
Looking forward to your reply and thanks in advance!
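Per the readme linked above, each txt entry can pack several whitespace-separated words (including newlines), and splitting on whitespace yields words in the same order as the wordBB columns; the non-whitespace characters then align one-to-one with the charBB columns. A sketch of that usual alignment (my own helper, not official tooling):

```python
def words_in_order(txt_entries):
    """Flatten SynthText-style `txt` entries into individual words.

    Each entry may hold several words separated by spaces/newlines;
    splitting on whitespace yields one word per wordBB column, in order.
    """
    words = []
    for entry in txt_entries:
        words.extend(entry.split())
    return words

def chars_in_order(txt_entries):
    """Non-whitespace characters, aligned one-to-one with charBB columns."""
    return [c for w in words_in_order(txt_entries) for c in w]

# Toy entries in the style of gt.mat's txt field:
entries = ["Lines:\nI lost", "Kevin will "]
# words_in_order(entries) -> ['Lines:', 'I', 'lost', 'Kevin', 'will']
```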
Will the training code be provided later?
@mscottml Thank you very much for your great work. Can you tell me what the specific definition of the loss function is? Can you provide it in advance? Thanks.
Hi. Thank you for the interesting work.
I'm confused about your implementation. In the paper, for training with your iterative character detection module, only the gt-word length is used. For the ICDAR2015 task 4.1 text localization, a single "word" gt is provided (an accurate word or ###) in string form. But your implementation uses a lexicon, i.e., candidate words! Is that fair for a "localization (or detection) only" task?
It is obvious that using a lexicon actually enhances the detection result! Or is there anything I missed?
Thanks for reading.
The website for downloading the trained weights has been disabled.
Could anyone please provide the weights?
I have added a training module, and I noticed that forwarding hourglass88 under torch.no_grad works just fine, but without torch.no_grad the GPU (12 GB) runs out of memory on a single forwarded image. I get this message:
RuntimeError: CUDA out of memory. Tried to allocate 46.00 MiB (GPU 0; 10.91 GiB total capacity; 10.21 GiB already allocated; 44.75 MiB free; 92.41 MiB cached).
It seems from the paper that you train with not 1 but 4 images per mini-batch.
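If the paper's mini-batch of 4 doesn't fit on a 12 GB card once gradients are kept, one common workaround (my suggestion, not the authors' recipe) is gradient accumulation: back-propagate one image at a time and only step the optimizer every 4 micro-batches. The sketch below shows just the bookkeeping on plain floats; in PyTorch you would call loss.backward() per micro-batch and optimizer.step() every `accum` iterations.

```python
def accumulate_steps(micro_grads, accum=4):
    """Schematic gradient accumulation: average `accum` micro-batch
    gradients before each optimizer step, emulating a batch of `accum`
    images on a GPU that only fits one at a time. Gradients are plain
    floats here purely for illustration."""
    steps = []
    bucket = []
    for g in micro_grads:
        bucket.append(g)
        if len(bucket) == accum:
            steps.append(sum(bucket) / accum)  # averaged "step" gradient
            bucket = []
    return steps

# Eight micro-batches with accum=4 produce two optimizer steps:
# accumulate_steps([1, 2, 3, 4, 5, 6, 7, 8]) -> [2.5, 6.5]
```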
When I downloaded the trained weights, I encountered "network error" or very slow downloads, so I uploaded the weights to Baidu Net Disk, hoping this helps you.
The Trained Weight: icdar2015_hourglass88.pth
Size: 356.9MB (in Ubuntu)
Link: https://pan.baidu.com/s/1M85Cd_V8NxqZzS41JHJP3A
Extraction code: dmbj
Firstly, thanks for your paper; it is a great job.
In the paper, it is said that there is an instance segmentation sub-branch in the character branch, but how do you obtain the instance segmentation ground-truth masks? Do you obtain them the same way as Mask TextSpotter?
@mscottml
It is creating an empty file in the weights directory.
Hello, I am having some issues trying to recognize images with only one character (or two).
I'm running this model on racing bib numbers, and some runners have 2-digit numbers on their bibs.
Is there any parameter related to the char/text length?
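If such a parameter exists, it would live in the repo's postprocessing; the snippet below is purely hypothetical (function and parameter names are mine, not CharNet's) and only illustrates the kind of minimum-length/score gate worth searching for and relaxing when 1-2 character detections vanish.

```python
def keep_word(text, score, min_len=1, min_score=0.5):
    """Hypothetical post-filter (not CharNet's actual API): drop
    detections shorter than min_len characters or scored below
    min_score. If short bib numbers disappear, look for a gate like
    this in the postprocessing code and loosen min_len."""
    return len(text) >= min_len and score >= min_score
```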
Hello author, after reading the CharNet paper and code, I have some questions:
In 3.2. Character Branch of paper, it said:
This branch contains three sub-branches, for text instance segmentation, character detection and character recognition, respectively.
But in model.py, I didn't find the text instance segmentation sub-branch depicted in Figure 2. In your code, is it replaced by a shrunk char region score prediction branch, just like the EAST model?
Below are some visualization samples using your pretrained model:
(I used cv2.applyColorMap(), cv2.addWeighted() and cv2.polylines() for better visualization)
(the angle output is None???)
So, is CharNet's character branch in fact an EAST-like head (shrunk char score map & geometry map) plus a char recognition head?
I used the pretrained model and the default config file; the result on the IC15 test set is:
precision: 0.966, recall: 0.744, hmean: 0.841
which is far from the paper's report. I noticed that pred_char_orient in the CharDetector class is None. So is this open-sourced code incomplete?
The iterative character detection method is the key to training CharNet on real-world datasets. During each step (2nd~4th), are the parameters of the model A that generates pseudo-gt char bboxes fixed, with A different from the model B being trained? Or is there only one model during the whole training schedule?
Looking forward to your reply, thanks!
Hello,
How can I speed up inference time on CharNet as described in tools/test_net.py?
I tried to apply torch multiprocessing, but it failed due to lambda usage in the model definition.
Some clues on how to batch-predict with CharNet would help a lot here.
Thanks!
I see that the recognition results are in uppercase. How can I distinguish between upper and lower case?
I noticed that the network takes too much time on postprocessing (only 0.1 s to run through the network, but around 2 s for postprocessing). Is there any way to speed up the postprocessing? Thank you!
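Much of that 2 s is likely spent in the pairwise shapely Polygon intersections inside the rotated NMS. One speedup that does not change the results (a sketch of a standard trick, not the repo's code) is a cheap axis-aligned rejection test before each exact polygon check: if the bounding rectangles are disjoint, the polygons cannot intersect, so the expensive call can be skipped.

```python
def aabb(poly):
    """Axis-aligned bounding box (xmin, ymin, xmax, ymax) of a polygon
    given as a list of (x, y) vertices."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    return min(xs), min(ys), max(xs), max(ys)

def boxes_may_overlap(a, b):
    """Cheap rejection test: False means the polygons are certainly
    disjoint, so the exact shapely Polygon.intersection call (and the
    IoU computation) can be skipped entirely for this pair."""
    ax0, ay0, ax1, ay1 = aabb(a)
    bx0, by0, bx1, by1 = aabb(b)
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1
```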
I think there should be a requirements file as well.
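Agreed. Based purely on the imports that fail elsewhere in this thread (editdistance, pyclipper, yacs, shapely) plus the evident torch and cv2 usage, a starting requirements.txt might look like this; it is unpinned and unverified against the actual codebase, so check it against the repo's imports before relying on it:

```text
torch
opencv-python
shapely
editdistance
pyclipper
yacs
```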
Could you describe the loss function and the DataLoader of the paper? We would like to reproduce your paper. Many thanks.
Following the description, I downloaded the trained weights, then tested and evaluated on the detection dataset of IC15. The results seem strange, as follows:
precision: 0.966, recall: 0.744, hmean: 0.841
Is the result correct? Thanks for your reply.
In weights.sh there is :
wget https://cloudstor.aarnet.edu.au/plus/s/c0PaY4pzPUhPmL9/download -O weights/icdar2015_hourglass88.pth
Author, I want to know how to collect the "correct" char-level bounding boxes detected in real-world images in the second step of Section 3.4, Iterative Character Detection. I also want to know how to get the synthetic data, such as Synth800k, with char-level labels. Looking forward to your reply. Thanks!
The paper says there are only 68 classes for character recognition, but for the 2017 MLT dataset, how do you recognize characters outside English?
Thank you.
Are the bounding boxes of curved text (e.g., Total-Text) quadrilateral or polygonal? If quadrilateral, how does the code output a bounding box for curved text?
Firstly, thanks for your paper; it is a great job. Then, I want to ask: when will the training code be released? I am looking forward to it.
Hi, Thanks for the work. When will the training scripts be shared?
As the title says: did you train the model downloaded in download_weights.sh on real-world images with "iterative character detection"?
Simply put, my image only contains numbers.
Are you guys planning to release the training code? If not, can you share the details of the loss functions used for combined learning?
You open-sourced the code but still don't let people train with it. Are you afraid of being proven wrong, or are you just looking for free testers?
Hi, thank you for this excellent work! I would be very grateful if you could include the training file for this model.
Looking forward to hearing from you.
In your code, it saves word boxes. I want to save char boxes; how can I do that?
How do I train the model to detect and recognize line text instead of word text?
Hi! Great project and very interesting paper. I've encountered two issues:
setup.py doesn't include all of the dependencies.
After following the setup instructions in your README on Ubuntu 18.04 with Python 3.7.4, I tried running:
python tools/test_net.py configs/icdar2015_hourglass88.yaml images_dir results_dir
and received the following error:
python tools/test_net.py configs/icdar2015_hourglass88.yaml images_dir results_dir
Traceback (most recent call last):
File "tools/test_net.py", line 9, in <module>
from charnet.modeling.model import CharNet
File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/model.py", line 17, in <module>
from .postprocessing import OrientedTextPostProcessing
File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/postprocessing.py", line 11, in <module>
import editdistance
ModuleNotFoundError: No module named 'editdistance'
As a side note, I'm running Ubuntu on the Windows Subsystem for Linux
After pip installing editdistance, I got another import error:
python tools/test_net.py configs/icdar2015_hourglass88.yaml images_dir results_dir
Traceback (most recent call last):
File "tools/test_net.py", line 9, in <module>
from charnet.modeling.model import CharNet
File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/model.py", line 17, in <module>
from .postprocessing import OrientedTextPostProcessing
File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/postprocessing.py", line 13, in <module>
from .rotated_nms import nms, nms_with_char_cls, \
File "/c/Users/cmcallister/dev/research-charnet/charnet/modeling/rotated_nms.py", line 9, in <module>
import pyclipper
ModuleNotFoundError: No module named 'pyclipper'
I pip installed that, only to see that yacs was also required. Interestingly, yacs is in your setup.py file.
pip installing those dependencies got me past the import errors, but then I got the following error:
Traceback (most recent call last):
File "tools/test_net.py", line 69, in <module>
charnet.cuda()
File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 305, in cuda
return self._apply(lambda t: t.cuda(device))
File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
module._apply(fn)
File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
module._apply(fn)
File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
module._apply(fn)
File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 224, in _apply
param_applied = fn(param)
File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/nn/modules/module.py", line 305, in <lambda>
return self._apply(lambda t: t.cuda(device))
File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/cuda/__init__.py", line 192, in _lazy_init
_check_driver()
File "/home/cmcallister/.pyenv/versions/3.7.4/lib/python3.7/site-packages/torch/cuda/__init__.py", line 102, in _check_driver
http://www.nvidia.com/Download/index.aspx""")
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
I suggest updating setup.py to include all of the dependencies, and also updating the README to make it clear that you need a GPU machine in order to run the code.
I want to test the performance of CharNet. How can I use the existing code to evaluate on ICDAR2015 and Total-Text? I am a beginner in deep learning and hope you can help me.
Hi, I have read your CharNet paper and am interested in it. But only a portion of the evaluation code was open-sourced. Can I obtain the full source code as soon as possible (for academic purposes) through other means (such as purchase)?
Thank you for your time and consideration.
While applying the code to pictures like this,
some problems arise:
1. Numbers with 2 digits cannot be detected.
2. Some numbers are recognized as English letters, e.g. 4→A.
Also, while the numbers in one picture can be recognized, the txt file in <results_dir> corresponding to another picture with the same content, but in a slightly different view, is empty.
Could you please tell me how to solve these problems?
Can your code generate bbox?
Thanks a lot!
In your paper, the images you show have rotated character bounding boxes and simple bounding boxes from the word detector. However, in your code, the WordDetector has an orientation prediction head and output, while the CharDetector returns None for the character orientation. Is it possible that the two have been accidentally swapped?
Could you tell us the size of the input images during training?
Is it the same as in the testing phase?