Comments (9)
http://blog.csdn.net/yan_joy/article/details/53608519
from largemargin_softmax_loss.
I tried "clip_gradients" in the solver.prototxt, but the loss still ended up at 87.3365.
from largemargin_softmax_loss.
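As an aside on why the stuck loss is exactly 87.3365: that value is not arbitrary. It equals -ln(FLT_MIN), the cross-entropy Caffe reports when the softmax probability of the true class underflows to the smallest positive normal float32 — i.e., the logits have blown up. A quick check:

```python
import math

# FLT_MIN, the smallest positive normal float32, is 2^-126.
flt_min = 2.0 ** -126

# Cross-entropy loss when the true-class softmax probability
# underflows to FLT_MIN: -log(p) with p = FLT_MIN.
loss = -math.log(flt_min)
print(round(loss, 4))  # → 87.3365
```

So a training run pinned at 87.3365 means the network is assigning (numerically) zero probability to the correct class, which is why the suggestions below all aim at keeping the logits from exploding.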
First, change the display interval from 200 to 10 to see how the loss evolves.
Second, reduce base_lr to 0.0001, or even 0.000001, and watch the loss.
Third:
1. Check whether the data contains abnormal samples or abnormal labels that break data loading.
2. Reduce the initialization weights so that the features fed into softmax are as small as possible.
3. Lower the learning rate, which narrows the range in which the weight parameters fluctuate and so reduces the chance that the weights blow up. This is also the fix suggested most often online.
4. If the network has BN (batch normalization) layers, it is best not to freeze the BN parameters when fine-tuning; otherwise a mismatch in data distribution can easily make the outputs very large.
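A minimal sketch of how the learning-rate, display-interval, and gradient-clipping suggestions from this thread might look in a Caffe solver.prototxt; every value below is an illustrative assumption, not a repo default:

```protobuf
# Illustrative solver fragment (all values are assumptions)
net: "train_val.prototxt"
base_lr: 0.0001        # lowered learning rate, per suggestion 3
lr_policy: "step"
gamma: 0.1
stepsize: 10000
clip_gradients: 10     # cap the gradient L2 norm to limit blow-ups
display: 10            # print the loss every 10 iterations
max_iter: 64000
solver_mode: GPU
```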
from largemargin_softmax_loss.
For CIFAR10, it should be easy to train. If the network diverges, consider decreasing lambda more smoothly, or simply lower the difficulty of the loss, i.e., set a smaller m.
from largemargin_softmax_loss.
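"Decreasing lambda more smoothly" refers to the annealing schedule mentioned elsewhere in this thread, `lambda = max(lambda_min, base*(1+gamma*iteration)^(-power))`. A hedged sketch of that schedule — the parameter values here are made-up illustrations, not the repo's defaults:

```python
def anneal_lambda(iteration, base=1000.0, gamma=0.12, power=1.0, lambda_min=0.0):
    """Decay lambda toward lambda_min as training progresses.

    Large lambda early in training makes the loss behave like plain
    softmax; as lambda shrinks, the large-margin term dominates.
    """
    return max(lambda_min, base * (1.0 + gamma * iteration) ** (-power))

print(anneal_lambda(0))      # → 1000.0 (base value at iteration 0)
print(anneal_lambda(10000))  # far smaller later in training
```

A smaller `gamma` or `power`, or a larger `lambda_min`, makes the decay gentler, which is one way to act on the advice above when training diverges.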
Same problem as @qianxinchun: the network diverges even if I set lambda_min=0.5 and m=2. @wy1iu Could you please share your training log (m=4)?
from largemargin_softmax_loss.
I believe you could train it using PReLU. Using ReLU may need more parameter tuning. @shenmanmiao
from largemargin_softmax_loss.
PReLU works well on Cifar10, thanks @wy1iu for your reply.
from largemargin_softmax_loss.
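For reference, swapping ReLU for PReLU in Caffe is a one-line layer-type change. A sketch with hypothetical layer/blob names (the actual names in the repo's prototxt will differ):

```protobuf
# Hypothetical names; Caffe's PReLU layer learns a negative-slope
# parameter instead of ReLU's fixed zero slope for x < 0.
layer {
  name: "prelu1"
  type: "PReLU"
  bottom: "conv1"
  top: "conv1"
}
```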
Hi, thank you for sharing. I trained a model on CASIA-WebFace with A-Softmax (the SphereFace paper). The model converged, and the accuracy on LFW is 97.5%, but it is really hard to get above 99%. I would be grateful for any suggestions. My QQ is 729512518.
from largemargin_softmax_loss.
@shenmanmiao Have you reproduced the result on CIFAR10? Can you share the train_val.prototxt?
from largemargin_softmax_loss.
Related Issues (20)
- hard to convergence HOT 4
- About A-Softmax HOT 21
- some typos in HOT 2
- Licensing HOT 3
- trian_accuracy decrease? HOT 1
- Computation of k value from eq. (6) HOT 2
- the deploy.prototxt of LargeMargin_Softmax_Loss HOT 5
- Check failed: target_blobs.size() == source_layer.blobs_size() (1 vs. 2) HOT 2
- Activation function problem HOT 1
- Pairs of testing
- Why `lambda = max(lambda_min,base*(1+gamma*iteration)^(-power)`? Any particular reason?
- Can I know which part of the paper do sign_x_ correspond to?
- void LargeMarginInnerProductLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
- train accuracy decrease HOT 1
- train mnist,loss is nan
- evaluate LargeMargin_Softmax_Loss on lfw
- L-softmax + center loss HOT 1
- L
- Angle margin
- Is the CIFAR10 dataset error rate given in the paper the result of a single model?