Comments (26)
Right, probably the cropped images are the reason why it can improve. Thanks
from crnn-pytorch.
Hi Mariem, thank you for your comment. I followed Holmeyoung's method. In stage 1, I did not change the params and trained until the loss started fluctuating. In stage 2, I changed the learning rate and so on. The loss dropped a little, but the accuracy never exceeded 0.83. The training data is large, and it is a mix of English and Arabic, and of synthetic and cropped images.
from crnn-pytorch.
Actually I have 7M, not 70M; I mistakenly typed 70M :) It took me days because I am new to ML and DL. I reconstructed part of the English Synth 90k dataset and built Arabic synthetic datasets. Also, I cropped about 2k images. I am using a Tensorbook with 32 GB of DDR4 RAM.
from crnn-pytorch.
Hi Mariem,
Sure, you can buy it from Lambda's website; the prices differ depending on the memory size. Please check this website for more details.
https://lambdalabs.com/deep-learning/laptops/tensorbook
Also, this is a forum where you can see users' reviews.
https://deeptalk.lambdalabs.com/t/lambda-tensorbook-specifications/388
If you have more questions, do not hesitate to ask.
from crnn-pytorch.
Good, I think you can get good recognition accuracy. 60M text with different fonts will definitely help.
Good luck.
from crnn-pytorch.
Hello, I guess there's some corruption in the data (for example, the text written on the image is incomplete because of the size of the background image).
E.g., the ground truth is ABCDEFGH but the image shows only ABCDEF (the image is not long enough to fit the whole text). So check your data.
Also, could you tell me the rest of the params? I guess the accuracy will rise further later; it takes time, after all, if the model hasn't finished generalizing over all the characters.
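One rough way to check for this kind of truncation is to flag samples whose label looks too long for the image width. This is only a sketch: the function name, the `(width, label)` input format, and the `min_px_per_char` threshold are all hypothetical and would need tuning to your fonts and image sizes.

```python
def find_suspect_samples(samples, min_px_per_char=8):
    """Return samples whose label looks too long for the image width.

    samples: iterable of (image_width_px, label) pairs.
    min_px_per_char: assumed lower bound on the width of one rendered
        character, in pixels (a hypothetical value, not from this project).
    """
    suspects = []
    for width, label in samples:
        # Upper bound on how many characters could fit in this image.
        max_chars = width // min_px_per_char
        if len(label) > max_chars:
            suspects.append((width, label))
    return suspects


# Three images with the same ground truth but different widths.
samples = [(100, "ABCDEFGH"), (40, "ABCDEFGH"), (64, "ABCDEFGH")]
suspects = find_suspect_samples(samples)
```

With these toy values only the 40-pixel image is flagged, since at most 5 characters of 8 pixels each would fit in it. Flagged samples still need a visual check; the heuristic only narrows down where to look.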
from crnn-pytorch.
I see. I guess that when there are cropped images, the accuracy doesn't add up (that's why it reached 83% and stopped there). This is just my experience, unless Holmeyoung has something else to say on this.
from crnn-pytorch.
You're welcome, and btw can you tell me how long it took to create the 70M images (hours or days)? And if you don't mind: how much RAM do you have? Thanks ^_^
from crnn-pytorch.
I see ~ Thank you very much :D
from crnn-pytorch.
Hello again Niddal-imam :))
Can you tell me how to buy a Tensorbook, and how much it costs?
Thanks.
from crnn-pytorch.
Thanks for your answer.
However, I checked the website, and it says that it costs (max) about $3,355.00. How is that calculated?
from crnn-pytorch.
I bought the premium one and it cost me 3,181 US dollars. However, as it is shipped from the US, you need to consider the duty and tax fees.
from crnn-pytorch.
Yes, but isn't 3,181 US dollars too little?!? O.O" I mean, it says that's the price before shipping, but what is the real price lol (sorry for any inconvenience).
from crnn-pytorch.
:) As I live in the UK, it cost me an extra 600 US dollars for duty and tax. The total price was about 4 thousand US dollars. Actually, the price is not too low when you compare it with other "Machine Learning" laptops, but it is reasonable.
from crnn-pytorch.
Hello again :) I wanted to ask whether your work has reached a good result on real-life images, and also whether the speed of the Tensorbook has given you quick results? (I'm going to try some of the AWS instances, so I wanted to ask you ^_^ ) Thanks.
from crnn-pytorch.
Hi,
Yes, I was able to improve the recognition from 16% to 46% CRW on real-world test images. I trained the model with over 200k synthetic images. The more images I used, the better the recognition I got. Regarding the speed of the Tensorbook, it took me about 3 days to train the model with 200k images for about 10 epochs. Although the Tensorbook is not cheap, it is better than renting a virtual machine and paying per hour. It has saved me a lot of money. I highly recommend it :)
from crnn-pytorch.
I see, thank you very much for the answer. I hope I can afford it. However, my question is: is it able to recognize any image you give it now? And if we compare it to the Google Vision API, do you recommend Google Vision or this project?
from crnn-pytorch.
And by more images, do you mean adding more images of the same kind (same composition/shape/representation) or more varied images (more different from the synthetic images)?
from crnn-pytorch.
Yes, it can recognize any real-world images with 57% accuracy for English and 46% for Arabic. Actually I have not tried Google vision API, but I will compare my results with Google vision API. Thank you for the suggestion; I have been looking for a model to compare my results with.
For training, I first generated 100k synthetic samples, and after training the model with synthetic samples only, the model could not recognize real-world images accurately. Then, I mixed 100k synthetic and 3000 real-world samples for training. The result was better: ~20% CRW. Finally, I used 200k synthetic and 3000 real-world samples and got 46% CRW. The synthetic images are samples with different text fonts, backgrounds, and text sizes.
I hope that answers your questions.
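For illustration, mixing the two sources could look roughly like the sketch below. It is not the exact script used here: `mix_samples`, the file paths, and the labels are made up, and the sample counts are scaled down (200 synthetic and 3 real instead of 200k and 3000) to keep the example small.

```python
import random


def mix_samples(synthetic, real, seed=0):
    """Combine synthetic and real (path, label) pairs into one shuffled list.

    A fixed seed keeps the shuffle reproducible between runs.
    """
    mixed = list(synthetic) + list(real)
    random.Random(seed).shuffle(mixed)
    return mixed


# Hypothetical sample lists, keeping the same 200k : 3000 ratio.
synthetic = [(f"synth/{i}.jpg", f"word{i}") for i in range(200)]
real = [(f"real/{i}.jpg", f"crop{i}") for i in range(3)]
train = mix_samples(synthetic, real)
```

Shuffling the combined list spreads the few real-world samples across the epoch instead of leaving them all at the end.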
from crnn-pytorch.
Okay, thanks a lot. I did the synthetic text-image generation used in this project (with about 5 different fonts and 5 different text polices, but with a very big Japanese text file covering 7,000+ characters). I created 10 million images for training and 1 million for testing. I'm going to do the training, but lol, after what you said I'm wondering whether it's enough (the image composition).
P.S: I'm going to use it not on real world images but on something that is gray and similar to the text-images generated in this project.
What do you think?
from crnn-pytorch.
In my case, when training the model on synthetic samples, the model recognizes synthetic test samples with ~80% accuracy. So, I think your model can achieve even better accuracy, as you are using a big training dataset.
Good luck.
from crnn-pytorch.
Yes, but the composition isn't very varied, so I'm worried that it won't recognize anything I give it later (regardless of the accuracy of the model, I mean).
from crnn-pytorch.
What do you mean by composition? Do you mean words embedded in the generated images?
from crnn-pytorch.
Yes, by composition I mean, for example, creating images with 10 different backgrounds instead of 5, which means the model will learn different background features. That variety in the data is what I mean by composition.
Also different text polices (embedded in the image), different rotations, etc.
from crnn-pytorch.
Right, I think the model does not learn the backgrounds' features; it just learns the embedded text from the labels:
absolute/path/to/image/xxx.jpg label of xxx.jpg
This is my understanding, and Holmeyoung can correct me if I am wrong. So, using 5 or more backgrounds does not help that much, but using more labels (words) helps. In my case, I used different corpora and dictionaries to generate text.
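To make the label format concrete, here is a small sketch of how such a line could be parsed and how the number of distinct labels could be counted. The helper names are mine, not from the project, and it assumes the image path contains no spaces, so everything after the first space is the label.

```python
def parse_label_line(line):
    """Split 'path label text' into (path, label) on the first space only."""
    path, _, label = line.rstrip("\n").partition(" ")
    return path, label


def label_vocabulary(lines):
    """Return the set of distinct labels found in the label lines."""
    return {parse_label_line(line)[1] for line in lines if line.strip()}


# Hypothetical label lines in the format shown above.
lines = [
    "/data/img/0001.jpg hello world\n",
    "/data/img/0002.jpg hello world\n",
    "/data/img/0003.jpg Tokyo\n",
]
vocab = label_vocabulary(lines)
```

Counting distinct labels this way gives a quick idea of how much text variety the training set really has, independent of how many backgrounds were used.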
from crnn-pytorch.
I see. In my case I used a text file containing a huge Japanese and English text corpus (addresses, cities, names, dates, etc.); the file is 60 MB, and I generated images with random text from it (about 10 characters per image). I created 10 million because, according to Holmeyoung, it should be nb_characters * 1000 images = 7 million, but I created 10 million for training and then 1M for testing.
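The nb_characters * 1000 rule of thumb can be worked out from the corpus itself. The sketch below is only an illustration: the function name is made up, and ignoring whitespace when counting distinct characters is my assumption.

```python
def recommended_image_count(corpus_text, images_per_char=1000):
    """Return (number of distinct characters, suggested number of images).

    Applies the rule of thumb quoted above: nb_characters * 1000 images.
    Whitespace is not counted as a character (an assumption).
    """
    alphabet = {ch for ch in corpus_text if not ch.isspace()}
    return len(alphabet), len(alphabet) * images_per_char


# Tiny stand-in corpus; the real one is a 60 MB text file.
n_chars, n_images = recommended_image_count("東京 Tokyo 東京都")
```

For the tiny corpus above there are 7 distinct characters, so the rule suggests 7,000 images; with about 7,000 distinct characters the same rule gives 7,000 * 1000 = 7 million.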
from crnn-pytorch.