Comments (7)
@joaossmacedo Could you please share your GitHub repo for the multiple-orientations model? I need it for my project.
from east.
I've found a solution that works but it's sub-optimal.
First, there is a limitation: it only works if the angle is 0°, 90°, 180° or 270°.
Second, it increases processing time, because low-scoring crops go through the recognition model twice.
The idea
- Detect boxes;
- Crop the image according to a box;
- Check if height > width. If it is, rotate 90° to make the text horizontal;
- Run the cropped image through the recognition model;
- If the score is low, rotate the image 180°;
- Run it through the recognition model again;
- Compare the two results and use the better one.
The code
import cv2

# detect, crop_image and recognize are project helpers not shown here.
boxes = detect(detection_model, img, 0.7)
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
for box in boxes:
    cropped_image = crop_image(img, box)
    if cropped_image.size == 0:
        continue
    # Taller than wide: rotate 90° so the text runs horizontally.
    if cropped_image.shape[0] > cropped_image.shape[1]:
        cropped_image = cv2.rotate(cropped_image, cv2.ROTATE_90_CLOCKWISE)
    predict, probability = recognize(recognition_model, cropped_image)
    # Low score: the text may be upside down, so retry after a 180° rotation.
    if probability < 0.8:
        cropped_image = cv2.rotate(cropped_image, cv2.ROTATE_180)
        new_predict, new_probability = recognize(recognition_model, cropped_image)
        if new_probability > probability:
            predict = new_predict
            probability = new_probability
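The snippet calls `crop_image` without showing it. As a rough sketch only, here is one way such a helper could look, assuming each box is four (x, y) corner points and the image is a NumPy array indexed as [row, column]. The body is an illustration, not the original project's code; it crops the axis-aligned rectangle that encloses the box and clamps it to the image bounds.

```python
import numpy as np


def crop_image(img, box):
    """Hypothetical helper: crop the axis-aligned rectangle enclosing `box`.

    Assumes `box` holds four (x, y) corner points and `img` is a NumPy
    array indexed as [row, column].
    """
    pts = np.asarray(box, dtype=float).reshape(-1, 2)
    height, width = img.shape[:2]
    # Clamp to the image bounds so boxes that spill over the edge still work.
    x_min = int(np.clip(np.floor(pts[:, 0].min()), 0, width))
    x_max = int(np.clip(np.ceil(pts[:, 0].max()), 0, width))
    y_min = int(np.clip(np.floor(pts[:, 1].min()), 0, height))
    y_max = int(np.clip(np.ceil(pts[:, 1].max()), 0, height))
    return img[y_min:y_max, x_min:x_max]
```

A box that lies entirely outside the image yields an empty array, which is the case the `cropped_image.size == 0` check skips.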
Alternative idea
One idea that we explored but didn't end up using was to run the image through Tesseract to get its angle. We decided not to use this method because we would need to add Tesseract to our project and, in our case, it was faster to recognize than to detect.
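For anyone who wants to try the Tesseract route anyway, here is a minimal sketch. It assumes the `pytesseract` wrapper and a Tesseract install with orientation and script detection (OSD) data, neither of which was part of our project. `parse_osd_rotation` and `estimate_rotation` are hypothetical names. The parser reads the `Rotate:` field from the plain-text OSD report; check the rotation direction against a known sample before relying on it.

```python
import re


def parse_osd_rotation(osd_text):
    """Return the integer after 'Rotate:' in an OSD report, or None if absent."""
    match = re.search(r"Rotate:\s*(\d+)", osd_text)
    return int(match.group(1)) if match else None


def estimate_rotation(image):
    """Hypothetical wrapper: ask Tesseract for the page rotation of `image`."""
    # Imported lazily so the parser above works without Tesseract installed.
    import pytesseract

    return parse_osd_rotation(pytesseract.image_to_osd(image))
```

OSD tends to need a reasonable amount of text to give a confident answer, so it may behave better on the full image than on small crops.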
@joaossmacedo Does the EAST model detect text regions of any orientation out of the box, or did you have to make changes to do it?
So far I have trained it on synthetic images for about 15k steps, and it doesn't seem to detect text in all orientations. I started from the ResNet checkpoint. I don't think the amount of data is the problem, since the training set has about 800,000 samples. Should I just keep training for more steps?
In the project that I used EAST on, the data was also synthetic but only had text at 0°, 90°, 180° and 270°. It was able to detect the text in all of those orientations.
I didn't use any previous checkpoint so I can't comment on that specifically. However, I believe it should be able to detect text in all orientations as evidenced by the images on the README.
I'm sorry I couldn't be more helpful.
@joaossmacedo Thanks. That makes sense. I'll poke around a bit more.
Here are a couple of results from running the eval.py script on my trained model. It's a very basic run using all the defaults (no changes made).
The bounding boxes look like they are not rotated. My guess is that eval.py does not use the angle information to rotate and plot the bounding boxes?
The pictures are from the ICDAR15 test set.
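One way to check that guess is to compare the plotted boxes with what a rotated rectangle should look like. The sketch below is generic geometry, not eval.py's actual box restoration. It assumes a box described by a center, a width, a height and an angle in radians, and `rotated_box_corners` is a hypothetical name.

```python
import numpy as np


def rotated_box_corners(cx, cy, width, height, angle_rad):
    """Return the four corners of a rectangle rotated about its center.

    Corners start at the unrotated top-left and follow the rectangle's
    perimeter. The result is a (4, 2) array of (x, y) points.
    """
    half_w, half_h = width / 2.0, height / 2.0
    local = np.array([
        [-half_w, -half_h],
        [half_w, -half_h],
        [half_w, half_h],
        [-half_w, half_h],
    ])
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rotation = np.array([[c, -s], [s, c]])
    # Rotate each corner around the origin, then move it to the box center.
    return local @ rotation.T + np.array([cx, cy])
```

If the corners returned for a non-zero angle still form an axis-aligned rectangle when drawn, the angle is being dropped somewhere before plotting.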
Hi, I also checked rotation. It doesn't have rotation correction in text detection. I think that to add this option you would need a box for every character.
Also, it didn't work well for some text on a red background.