tripERR about facenet (CLOSED, 4 comments)

Fatemeh89 commented on May 1, 2024
tripERR

Comments (4)

Fatemeh89 commented on May 1, 2024

Dear David,
I forgot to ask about my main problem. When I ran your code, the accuracy value changed in each iteration. In the beginning iterations the accuracy is about 50%, after several iterations it increases to about 75%, and then it keeps swinging between 70% and 79%.
Is this normal?

davidsandberg commented on May 1, 2024

Hi,
Did you change alpha to a value larger than 0.2? I must say I'm not too surprised about the variations, but more surprised about the large values for the triplet loss (tripErr) overall. Typically when training with triplet loss (and alpha == 0.2) it starts off at 0.2 and then starts to decrease. But note that there will be variations due to the limited batch size.
Below is a typical sequence for the first ten batches of training. The triplet loss swings between 0.177 and 0.243. If you want to reduce the "swinging", I think the best option would be to increase the batch size, but that is not necessarily an efficient way to train the network.
Epoch: [0][0/1000] Time 1.063 tripErr 0.202
Epoch: [0][1/1000] Time 0.946 tripErr 0.196
Epoch: [0][2/1000] Time 0.521 tripErr 0.177
Epoch: [0][3/1000] Time 0.520 tripErr 0.192
Epoch: [0][4/1000] Time 0.517 tripErr 0.201
Epoch: [0][5/1000] Time 0.523 tripErr 0.193
Epoch: [0][6/1000] Time 0.518 tripErr 0.208
Epoch: [0][7/1000] Time 0.524 tripErr 0.243
Epoch: [0][8/1000] Time 0.520 tripErr 0.182
Epoch: [0][9/1000] Time 0.522 tripErr 0.181
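
As a reference for the numbers above, here is a minimal NumPy sketch of the triplet loss as described in the FaceNet paper (not necessarily the exact code in this repo). With randomly initialized, L2-normalized embeddings, the anchor-positive and anchor-negative distances are roughly equal on average, which is why the batch loss starts out near the margin alpha = 0.2:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, alpha=0.2):
    # anchor, positive, negative: (batch_size, embedding_dim) arrays
    pos_dist = np.sum(np.square(anchor - positive), axis=1)  # squared L2 distance
    neg_dist = np.sum(np.square(anchor - negative), axis=1)
    # Hinge on the margin alpha, averaged over the batch
    return np.mean(np.maximum(pos_dist - neg_dist + alpha, 0.0))
```
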
Another thing to check is that you have the normalization of the embeddings:
norm = tf.nn.l2_normalize(affn1, 1, 1e-10)
Without this line the scaling of the embeddings would be off, and the triplet loss would probably also behave in a strange way.
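
In NumPy terms, the normalization does roughly the following (a sketch of what tf.nn.l2_normalize computes along dimension 1, not the TensorFlow source). With unit-length embeddings the squared distance between any two embeddings is at most 4, so a fixed margin of 0.2 is meaningful; without it the network could scale distances arbitrarily:

```python
import numpy as np

def l2_normalize(x, eps=1e-10):
    # Scale each row (each embedding) to unit L2 norm
    sq_sum = np.sum(np.square(x), axis=1, keepdims=True)
    return x / np.sqrt(np.maximum(sq_sum, eps))
```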

Fatemeh89 commented on May 1, 2024

Dear David,
I ran your code with your default parameters and got these results:

Selecting random triplets for validation
MAX COUNTER
5
Running forward pass on validation set
Epoch: [33] Time 74.384 tripErr 0.090 accuracy 0.761±0.045
Saving checkpoint
end Saving checkpoint
Loading training data
Selecting suitable triplets for training
(nrof_random_negs, nrof_triplets) = (337, 1755): time=120.514 seconds
Epoch: [34][0/200] Time 118.839 tripErr 0.082
Epoch: [34][1/200] Time 30.373 tripErr 0.080
Epoch: [34][2/200] Time 17.797 tripErr 0.040
Epoch: [34][3/200] Time 18.275 tripErr 0.076
Epoch: [34][4/200] Time 18.064 tripErr 0.090
Epoch: [34][5/200] Time 17.141 tripErr 0.079
Epoch: [34][6/200] Time 17.161 tripErr 0.072
Epoch: [34][7/200] Time 16.821 tripErr 0.048
Epoch: [34][8/200] Time 16.905 tripErr 0.052
Epoch: [34][9/200] Time 17.416 tripErr 0.071
Epoch: [34][10/200] Time 16.991 tripErr 0.073
Epoch: [34][11/200] Time 16.828 tripErr 0.066
Epoch: [34][12/200] Time 16.904 tripErr 0.082
Epoch: [34][13/200] Time 16.977 tripErr 0.091
Epoch: [34][14/200] Time 17.991 tripErr 0.068
Epoch: [34][15/200] Time 17.233 tripErr 0.030
Epoch: [34][16/200] Time 17.734 tripErr 0.044
Epoch: [34][17/200] Time 17.406 tripErr 0.072
Epoch: [34][18/200] Time 17.544 tripErr 0.048
Epoch: [34][19/200] Time 17.386 tripErr 0.047
Epoch: [34][20/200] Time 28.742 tripErr 0.023
Epoch: [34][21/200] Time 17.609 tripErr 0.044
Epoch: [34][22/200] Time 16.971 tripErr 0.036
Epoch: [34][23/200] Time 17.371 tripErr 0.065
Epoch: [34][24/200] Time 17.099 tripErr 0.053
Epoch: [34][25/200] Time 17.170 tripErr 0.061
Epoch: [34][26/200] Time 17.261 tripErr 0.056
Epoch: [34][27/200] Time 16.966 tripErr 0.077
Epoch: [34][28/200] Time 16.800 tripErr 0.086
Epoch: [34][29/200] Time 17.183 tripErr 0.063
Epoch: [34][30/200] Time 17.016 tripErr 0.053
Epoch: [34][31/200] Time 16.932 tripErr 0.057
Epoch: [34][32/200] Time 17.082 tripErr 0.020
Epoch: [34][33/200] Time 17.003 tripErr 0.058
Epoch: [34][34/200] Time 16.980 tripErr 0.035
Epoch: [34][35/200] Time 17.157 tripErr 0.027
Epoch: [34][36/200] Time 16.989 tripErr 0.054
Epoch: [34][37/200] Time 17.005 tripErr 0.070
Epoch: [34][38/200] Time 17.274 tripErr 0.071
Epoch: [34][39/200] Time 17.237 tripErr 0.024
Epoch: [34][40/200] Time 25.430 tripErr 0.057
Epoch: [34][41/200] Time 17.460 tripErr 0.055
Epoch: [34][42/200] Time 17.299 tripErr 0.041
Epoch: [34][43/200] Time 17.062 tripErr 0.104
Epoch: [34][44/200] Time 16.842 tripErr 0.071
Epoch: [34][45/200] Time 17.271 tripErr 0.049
Epoch: [34][46/200] Time 17.496 tripErr 0.045
Epoch: [34][47/200] Time 17.343 tripErr 0.013
Epoch: [34][48/200] Time 17.022 tripErr 0.059
Epoch: [34][49/200] Time 17.063 tripErr 0.063
Epoch: [34][50/200] Time 17.383 tripErr 0.016
Epoch: [34][51/200] Time 16.972 tripErr 0.078
Epoch: [34][52/200] Time 17.223 tripErr 0.065
Epoch: [34][53/200] Time 17.054 tripErr 0.054
Epoch: [34][54/200] Time 17.271 tripErr 0.042
Epoch: [34][55/200] Time 17.045 tripErr 0.058
Epoch: [34][56/200] Time 17.142 tripErr 0.042
Epoch: [34][57/200] Time 17.638 tripErr 0.057
Epoch: [34][58/200] Time 18.257 tripErr 0.029

May I ask you a few questions? When you ran your code on the FaceScrub dataset, what accuracy did you get?
Were you able to reach the accuracy reported in the FaceNet paper?
I also need training to be faster. If I use grayscale images instead of color images, will the results change?
What do you suggest if I want to reduce the training time?
Would it be a good idea to use Caffe instead of Python for my implementation?

davidsandberg commented on May 1, 2024

I haven't checked the accuracy of the model trained purely on FaceScrub lately. The ROC curve can be found in data/20160216_roc_lfw_nn4.png. I think the accuracy for this model is a little bit above 80%. If you want to try it, you can download the model using the link in the README (https://drive.google.com/file/d/0B5MzpY9kBtDVUFNFR21kZnYybkE/view?usp=sharing) and run "validate_on_lfw" on that model. To reach the accuracy from the FaceNet paper, I believe a significantly larger training set than FaceScrub would be needed.
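
(Side note on what the accuracy number means: verification accuracy of this kind is typically computed by thresholding the distance between pairs of embeddings and counting correct same/different decisions; the exact evaluation code in this repo may differ, so the following is just a minimal sketch.)

```python
import numpy as np

def pair_accuracy(emb1, emb2, is_same, threshold):
    # emb1, emb2: (n_pairs, embedding_dim) L2-normalized embeddings
    # is_same: boolean array, True when the two images show the same person
    dist = np.sum(np.square(emb1 - emb2), axis=1)
    predicted_same = dist < threshold
    return np.mean(predicted_same == is_same)
```

A sweep over thresholds (or cross-validation over several folds) would then give a best accuracy and a spread like the 0.761±0.045 reported in the log above.
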
Regarding the performance with grayscale images, I think that would be a very interesting investigation. Another way to speed up the training would be to use a smaller model. In the FaceNet paper they use the models NNS1 and NNS2, but I didn't see any description of those architectures.
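
(For anyone who wants to try the grayscale idea quickly, one simple approach, sketched below with a hypothetical helper and an assumed input size of 96x96, is to load each image as grayscale and replicate the single channel three times so the existing 3-channel input pipeline can be reused unchanged.)

```python
import numpy as np
from PIL import Image

def load_grayscale_as_rgb(path, image_size=96):
    # image_size=96 is an assumption; use whatever input size the model expects
    img = Image.open(path).convert('L').resize((image_size, image_size))
    gray = np.asarray(img, dtype=np.float32)
    # Stack the single channel three times to keep the (H, W, 3) input shape
    return np.stack([gray, gray, gray], axis=-1)
```
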
And regarding the swinging, I must say that it sounds a bit strange. I haven't seen that much variation in the accuracy, even though some small variation is of course expected. Attached is the learning curve for the training of the 20160308 model.
[Attached image: 20160308_learning_curve]
