Comments (8)

stalagmite7 commented on May 26, 2024

I too am looking to train on my own dataset of 109 ASCII characters, and am trying to implement the CRNN for strings (as in the original paper). The label for each image is the ASCII equivalent of each of the characters in the original string label, so in my HDF5 dataset each image will have an array of ints as its label.
My question now is: where are all the spots in the code where I need to change the alphabet size from 11 (as in your implementation) to 128 (as in mine)?
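For reference, a minimal sketch of the labeling scheme described above; the names are illustrative, not the repo's actual code:

    # Each image's label becomes the array of ASCII codes of its string label.
    import numpy as np

    text = 'Hello!'
    label = np.array([ord(ch) for ch in text], dtype=np.float32)
    print(label)  # [ 72. 101. 108. 108. 111.  33.]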

stalagmite7 commented on May 26, 2024

Also, if I start training after changing alphabet_number in crnn.prototxt to 109 (I've mapped my ASCII characters to 1-109, including symbols, as in the original CRNN implementation), I get this error:

...
I0913 17:55:12.587182 14556 solver.cpp:273] Solving crnn
I0913 17:55:12.587183 14556 solver.cpp:274] Learning Rate Policy: step
I0913 17:55:12.589298 14556 solver.cpp:331] Iteration 0, Testing net (#0)
F0913 17:55:12.701380 14556 ctc_loss_layer.cu:36] Check failed: status == CTC_STATUS_SUCCESS (1 vs. 0) cuda memcpy or memset failed
*** Check failure stack trace: ***
    @     0x7f1b402e4daa  (unknown)
    @     0x7f1b402e4ce4  (unknown)
    @     0x7f1b402e46e6  (unknown)
    @     0x7f1b402e7687  (unknown)
    @     0x7f1b40a71de1  caffe::CtcLossLayer<>::Forward_gpu()
    @     0x7f1b40a197b3  caffe::Net<>::ForwardFromTo()
    @     0x7f1b40a19b77  caffe::Net<>::Forward()
    @     0x7f1b40a320b2  caffe::Solver<>::Test()
    @     0x7f1b40a3283e  caffe::Solver<>::TestAll()
    @     0x7f1b40a34a39  caffe::Solver<>::Step()
    @     0x7f1b40a34c5a  caffe::Solver<>::Solve()
    @           0x408085  train()
    @           0x4059ac  main
    @     0x7f1b3f2def45  (unknown)
    @           0x40620b  (unknown)
    @              (nil)  (unknown)

Is this really a CUDA error (since it says memcpy/memset failed), or is the alphabet size in my labels causing the CTC loss to fail?
@yalecyu, please let me know, thanks!

yalecyu commented on May 26, 2024

Can you send your prototxt to my email?

stalagmite7 commented on May 26, 2024

Thanks for responding, @yalecyu! But I found out where I still needed to change the number of outputs: in one of the InnerProduct layers in crnn.prototxt, I changed the num_output parameter to the size of my new alphabet set, and now I am able to train successfully! :)
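For anyone hitting the same error, here is a minimal sketch (assuming the standard Caffe protobuf format; the file path and alphabet size are placeholders) of how to check that the InnerProduct layer feeding the CTC loss outputs one score per class:

    # Verify that num_output in crnn.prototxt matches the alphabet size.
    from caffe.proto import caffe_pb2
    from google.protobuf import text_format

    ALPHABET_SIZE = 110  # e.g. 109 characters + 1 CTC blank (assumed)

    net = caffe_pb2.NetParameter()
    with open('crnn.prototxt') as f:  # placeholder path
        text_format.Merge(f.read(), net)

    for layer in net.layer:
        if layer.type == 'InnerProduct':
            # The layer feeding the CTC loss should output ALPHABET_SIZE scores.
            print(layer.name, layer.inner_product_param.num_output)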

stalagmite7 commented on May 26, 2024

@samylee, have you been able to figure it out? I can tell you what I did to get training working on my dataset.

You'll need to change the alphabet_size parameter in crnn.prototxt to the size of your dictionary of characters + 1 (for the blank label), and change your blank_label parameter to alphabet_size - 1 (the Caffe classifier needs each class, i.e., each character in this case, to be indexed from 0 to n-1 for n classes).
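In other words (a minimal sketch of the arithmetic, with an example dictionary size):

    n_chars = 109                    # size of your character dictionary (example)
    alphabet_size = n_chars + 1      # +1 for the CTC blank label
    blank_label = alphabet_size - 1  # == n_chars; real classes occupy 0..n_chars-1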

In your generate_dataset.py file, you'll need to make sure your label_seq array is initialized as np.ones multiplied by the blank_label value you set in your prototxt. Also, since you'll be storing the dataset in HDF5 format, you will only be able to store numbers (Caffe throws an error during training for any label data type other than float or double), which means you want to store the ASCII values (or some other numerical encoding) of your letters, not strings. An additional step is then to map these characters to [0, n) to fit Caffe's classification requirements.
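A minimal sketch of that labeling step, assuming h5py for the HDF5 file; the dataset keys, charset, and MAX_LABEL_LEN are placeholders rather than the repo's exact code:

    import h5py
    import numpy as np

    charset = [chr(c) for c in range(32, 127)]  # placeholder dictionary (printable ASCII)
    n_chars = len(charset)                      # 95 here; 109 in the setup above
    blank_label = n_chars                       # blank takes the last index
    char_to_idx = {ch: i for i, ch in enumerate(charset)}
    MAX_LABEL_LEN = 25                          # fixed label length (assumed)

    def encode_label(text):
        # Pad with the blank label, then write the character indices in [0, n).
        label_seq = np.ones(MAX_LABEL_LEN, dtype=np.float32) * blank_label
        for i, ch in enumerate(text[:MAX_LABEL_LEN]):
            label_seq[i] = char_to_idx[ch]
        return label_seq

    with h5py.File('train.h5', 'w') as f:
        images = np.zeros((2, 1, 32, 100), dtype=np.float32)  # placeholder images
        labels = np.stack([encode_label('hello'), encode_label('world')])
        f.create_dataset('data', data=images)   # float data, as Caffe requires
        f.create_dataset('label', data=labels)  # float labels, as Caffe requires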

xuyang1102 commented on May 26, 2024

@stalagmite7 Hi, if I want to train Chinese character recognition, how should I prepare the training dataset?

stalagmite7 commented on May 26, 2024

You'll need word-level labels, and you'll need to change the generate_dataset.py file to reflect the number of characters in your alphabet dictionary and map the characters to [0, num_alpha) as I mentioned in the previous comment. During prediction, you'll have to inverse-map the predicted integer vector back to the Chinese characters you used.
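A minimal sketch of that inverse mapping at prediction time, assuming the usual greedy CTC decoding convention (collapse repeats, drop blanks); the charset here is a placeholder:

    charset = list('中国汉字识别')  # your Chinese character dictionary
    blank_label = len(charset)      # blank takes the last index
    idx_to_char = {i: ch for i, ch in enumerate(charset)}

    def decode(pred_indices):
        # Collapse repeated labels, drop blanks, map indices back to characters.
        out, prev = [], None
        for idx in pred_indices:
            if idx != prev and idx != blank_label:
                out.append(idx_to_char[idx])
            prev = idx
        return ''.join(out)

    print(decode([0, 0, 6, 1, 1, 6, 2]))  # -> '中国汉'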

ImDePanDa commented on May 26, 2024

@stalagmite7 Hi.
I'm having some trouble with my own data. Because my data contains letters like 'a', 'b', 'c', I can't just turn it into an np.array, and I don't know how to change generate_dataset.py. Can you give me a template?
Looking forward to your reply.
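(Not the repo's actual code, but a minimal sketch of the conversion the earlier comments describe; the char_to_idx mapping is a placeholder:)

    import numpy as np

    # Letters can't go into a numeric np.array directly; map them to indices first.
    char_to_idx = {ch: i for i, ch in enumerate('abcdefghijklmnopqrstuvwxyz')}
    label = 'abc'
    label_arr = np.array([char_to_idx[ch] for ch in label], dtype=np.float32)
    print(label_arr)  # [0. 1. 2.]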
