Comments (9)

xqpinitial avatar xqpinitial commented on May 23, 2024

http://blog.csdn.net/yan_joy/article/details/53608519

from largemargin_softmax_loss.

qianxinchun avatar qianxinchun commented on May 23, 2024

I tried setting "clip_gradients" in the solver.prototxt, but the loss still ended up at 87.3365.
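
For context on that specific number: 87.3365 is what a softmax loss reports when the predicted probability of the true class has collapsed to zero and the implementation clamps it to the smallest positive normalized float32 (`FLT_MIN`) before taking the log. The sketch below only reproduces that arithmetic; the clamping behaviour is an assumption about the loss layer, and `clamped_log_loss` is a hypothetical helper, not code from this repository.

```python
import math

# Smallest positive normalized float32 value (FLT_MIN in C's <float.h>), i.e. 2**-126.
FLT_MIN = 2.0 ** -126


def clamped_log_loss(p: float) -> float:
    """Negative log-likelihood with the probability clamped at FLT_MIN.

    Assumption: the loss layer computes -log(max(p, FLT_MIN)) for the
    true-class probability p.
    """
    return -math.log(max(p, FLT_MIN))


# A true-class probability of exactly zero gives the saturated loss value.
saturated_loss = clamped_log_loss(0.0)
print(round(saturated_loss, 4))  # 87.3365
```

Seeing exactly this value therefore suggests the network output has already blown up, rather than that the loss is merely decreasing slowly.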

xqpinitial avatar xqpinitial commented on May 23, 2024

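First, please change the display interval from 200 iterations to 10 to see how the loss changes.
Second, please reduce the learning rate, e.g. base_lr = 0.0001 or lr = 0.000001, and watch the loss.
Third:
1. Check whether the data contains abnormal samples or abnormal labels that cause data-reading errors.
2. Use smaller initial weights so that the features fed into the softmax are as small as possible.
3. Lower the learning rate. This narrows the range over which the weights fluctuate and so makes it less likely that they grow large. This is also the fix most often suggested online.
4. If the network has BN (batch normalization) layers, it is best not to freeze the BN parameters when finetuning; otherwise, when the data distributions differ, the outputs can easily become very large.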
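
Point 1 above can be checked mechanically before training. This is a minimal sketch in plain Python, assuming the features and labels have already been loaded into lists; `find_bad_samples` and the toy data are hypothetical and not part of this repository.

```python
import math


def find_bad_samples(features, labels, num_classes):
    """Return indices of samples with non-finite features or out-of-range labels."""
    bad = []
    for i, (x, y) in enumerate(zip(features, labels)):
        # A valid label is an integer class index in [0, num_classes).
        label_ok = isinstance(y, int) and 0 <= y < num_classes
        # NaN or infinite feature values would propagate into the loss.
        values_ok = all(math.isfinite(v) for v in x)
        if not (label_ok and values_ok):
            bad.append(i)
    return bad


# Toy data: sample 1 has a NaN feature, sample 2 has a label outside [0, 10).
features = [[0.1, 0.2], [float("nan"), 0.5], [0.3, 0.4]]
labels = [3, 1, 10]
print(find_bad_samples(features, labels, num_classes=10))  # [1, 2]
```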

wy1iu avatar wy1iu commented on May 23, 2024

For CIFAR10, it should be easy to train. If the network diverges, consider decreasing lambda more smoothly, or simply lower the difficulty of the loss by setting a smaller m.
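
To illustrate what "decreasing lambda more smoothly" could look like, here is a sketch of a polynomial annealing schedule with a floor. It assumes lambda blends the original logit with the margin logit and decays with the iteration count; the parameter names `base`, `gamma`, and `power` and all numeric values are illustrative assumptions, so check the layer's actual parameters before relying on them.

```python
def annealed_lambda(iteration, base=1000.0, gamma=0.07, power=1.0, lambda_min=0.0):
    """Polynomially decayed lambda, never going below lambda_min (assumed form)."""
    decayed = base * (1.0 + gamma * iteration) ** (-power)
    return max(decayed, lambda_min)


def blended_logit(f_original, f_margin, lam):
    """Mix the plain logit and the margin logit; large lam favours the plain one."""
    return (lam * f_original + f_margin) / (1.0 + lam)


# Compare an aggressive schedule with a smoother one (smaller gamma).
for it in (0, 1000, 10000):
    fast = annealed_lambda(it, gamma=0.07, lambda_min=0.5)
    slow = annealed_lambda(it, gamma=0.001, lambda_min=0.5)
    print(it, fast, slow)
```

With a smaller `gamma` (or `power`), the margin term takes over more gradually, so the network spends more iterations close to the ordinary softmax loss before the harder objective dominates.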

shenmanmiao avatar shenmanmiao commented on May 23, 2024

Same problem as @qianxinchun: the network diverges even when I set lambda_min=0.5 and m=2. @wy1iu, could you please share your training log (m=4)?

wy1iu avatar wy1iu commented on May 23, 2024

I believe you could train it using PReLU. Using ReLU may need more parameter tuning. @shenmanmiao
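
For readers unfamiliar with the two activations, this sketch shows how they differ on negative inputs. The slope of 0.25 is only an illustrative starting value; in PReLU the slope is a learned parameter. Whether this difference is what makes training easier here is not established in this thread.

```python
def relu(x: float) -> float:
    """ReLU: negative inputs are zeroed, so they pass no gradient back."""
    return max(0.0, x)


def prelu(x: float, slope: float = 0.25) -> float:
    """PReLU: negative inputs are scaled by a (normally learned) slope."""
    return x if x > 0 else slope * x


print(relu(-2.0), prelu(-2.0))  # 0.0 -0.5
print(relu(3.0), prelu(3.0))    # 3.0 3.0
```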

shenmanmiao avatar shenmanmiao commented on May 23, 2024

PReLU works well on CIFAR10. Thanks, @wy1iu, for your reply.

billhyde avatar billhyde commented on May 23, 2024

Hi, thank you for sharing. I trained a model on CASIA-WebFace with A-Softmax (from the SphereFace paper). The model converged, and its accuracy on LFW is 97.5%. It is really hard to get the accuracy above 99%, so I would be grateful for any suggestions. My QQ is 729512518.

yfllllll avatar yfllllll commented on May 23, 2024

@shenmanmiao Have you reproduced the result on CIFAR10? Can you share the train_val.prototxt?
