Comments (9)
http://blog.csdn.net/yan_joy/article/details/53608519
from largemargin_softmax_loss.
I tried "clip_gradients" in the solver.prototxt, but the loss still ended up at 87.3365.
from largemargin_softmax_loss.
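As an aside on why the stuck loss is exactly 87.3365: that value is not arbitrary. It equals -ln(FLT_MIN), the cross-entropy Caffe reports when the softmax probability of the true class underflows to the smallest positive normal float32 — i.e., the logits have blown up. A quick check:

```python
import math

# FLT_MIN, the smallest positive normal float32, is 2^-126.
flt_min = 2.0 ** -126

# Cross-entropy loss when the true-class softmax probability
# underflows to FLT_MIN: -log(p) with p = FLT_MIN.
loss = -math.log(flt_min)
print(round(loss, 4))  # → 87.3365
```

So a training run pinned at 87.3365 means the network is assigning (numerically) zero probability to the correct class, which is why the suggestions below all aim at keeping the logits from exploding.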
First, change the display interval from 200 to 10 to see how the loss evolves.
Second, reduce base_lr to 0.0001, or even 0.000001, and watch the loss.
Third:
1. Check whether the data contains abnormal samples or abnormal labels that break data loading.
2. Reduce the initialization weights so that the features fed into softmax are as small as possible.
3. Lower the learning rate, which narrows the range in which the weight parameters fluctuate and so reduces the chance that the weights blow up. This is also the fix suggested most often online.
4. If the network has BN (batch normalization) layers, it is best not to freeze the BN parameters when fine-tuning; otherwise a mismatch in data distribution can easily make the outputs very large.
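A minimal sketch of how the learning-rate, display-interval, and gradient-clipping suggestions from this thread might look in a Caffe solver.prototxt; every value below is an illustrative assumption, not a repo default:

```protobuf
# Illustrative solver fragment (all values are assumptions)
net: "train_val.prototxt"
base_lr: 0.0001        # lowered learning rate, per suggestion 3
lr_policy: "step"
gamma: 0.1
stepsize: 10000
clip_gradients: 10     # cap the gradient L2 norm to limit blow-ups
display: 10            # print the loss every 10 iterations
max_iter: 64000
solver_mode: GPU
```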
from largemargin_softmax_loss.
For CIFAR10, it should be easy to train. If the network diverges, consider decreasing lambda more smoothly, or simply lower the difficulty of the loss, i.e., set a smaller m.
from largemargin_softmax_loss.
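"Decreasing lambda more smoothly" refers to the annealing schedule mentioned elsewhere in this thread, `lambda = max(lambda_min, base*(1+gamma*iteration)^(-power))`. A hedged sketch of that schedule — the parameter values here are made-up illustrations, not the repo's defaults:

```python
def anneal_lambda(iteration, base=1000.0, gamma=0.12, power=1.0, lambda_min=0.0):
    """Decay lambda toward lambda_min as training progresses.

    Large lambda early in training makes the loss behave like plain
    softmax; as lambda shrinks, the large-margin term dominates.
    """
    return max(lambda_min, base * (1.0 + gamma * iteration) ** (-power))

print(anneal_lambda(0))      # → 1000.0 (base value at iteration 0)
print(anneal_lambda(10000))  # far smaller later in training
```

A smaller `gamma` or `power`, or a larger `lambda_min`, makes the decay gentler, which is one way to act on the advice above when training diverges.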
Same problem as @qianxinchun: the network diverges even if I set lambda_min=0.5 and m=2. @wy1iu Could you please share your training log (m=4)?
from largemargin_softmax_loss.
I believe you could train it using PReLU. Using ReLU may need more parameter tuning. @shenmanmiao
from largemargin_softmax_loss.
PReLU works well on Cifar10, thanks @wy1iu for your reply.
from largemargin_softmax_loss.
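For reference, swapping ReLU for PReLU in Caffe is a one-line layer-type change. A sketch with hypothetical layer/blob names (the actual names in the repo's prototxt will differ):

```protobuf
# Hypothetical names; Caffe's PReLU layer learns a negative-slope
# parameter instead of ReLU's fixed zero slope for x < 0.
layer {
  name: "prelu1"
  type: "PReLU"
  bottom: "conv1"
  top: "conv1"
}
```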
Hi, thank you for sharing. I trained a model on CASIA-WebFace with A-Softmax (the SphereFace paper). The model converged, and the accuracy on LFW is 97.5%, but it is really hard to get above 99%. I would be grateful for any suggestions. My QQ is 729512518.
from largemargin_softmax_loss.
@shenmanmiao Have you reproduced the result on CIFAR10? Can you share the train_val.prototxt?
from largemargin_softmax_loss.
Related Issues (20)
- hard to convergence HOT 4
- About A-Softmax HOT 21
- some typos in HOT 2
- Licensing HOT 3
- trian_accuracy decrease? HOT 1
- Computation of k value from eq. (6) HOT 2
- the deploy.prototxt of LargeMargin_Softmax_Loss HOT 5
- Check failed: target_blobs.size() == source_layer.blobs_size() (1 vs. 2) HOT 2
- Activation function problem HOT 1
- Pairs of testing
- Why `lambda = max(lambda_min,base*(1+gamma*iteration)^(-power)`? Any particular reason?
- Can I know which part of the paper do sign_x_ correspond to?
- void LargeMarginInnerProductLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
- train accuracy decrease HOT 1
- train mnist,loss is nan
- evaluate LargeMargin_Softmax_Loss on lfw
- L-softmax + center loss HOT 1
- L
- Angle margin
- Is the CIFAR10 dataset error rate given in the paper the result of a single model?