Comments (6)
@chcorbi Thanks for your reply.
1. Actually, I re-implemented it by referring to your code: I added another 5 fc layers, froze all layers, and deactivated the dropout layers in the feature extractor. However, I still cannot train a ConfidNet that produces a distribution figure similar to yours shown above.
2. I have run your code many times over the past week, as follows:
- Step 1: `python3 train.py -c confs/exp_cifar10.yaml -f`
  - Instead, I used different parameters to reach a reasonable test accuracy (92.24%): setting `lr` to 0.05, using `random_crop: 32`, and using a `multi_step` lr schedule.
- Step 2: `python3 train.py -c confs/selfconfid_classif.yaml -f`
  - resuming the pretrained model from Step 1

However, I still cannot obtain the same performance as yours, and I am really confused. So far I can reach good test accuracy on the cifar10 test set, but I am still stuck on ConfidNet. Could you give me some suggestions for matching your performance, or for obtaining a histogram similar to yours on the cifar10 test set? I really appreciate your kind help!
3. Moreover, I tried to use your pretrained model on the cifar10 dataset from here to draw the following distribution figures. It seems your pretrained model has the same problem. The left figure is Maximum Class Probability and the right figure corresponds to True Class Probability. Could you please verify it, or share the code you used to draw your histograms? Thanks a lot.
Did you draw the previous TCP figure using the ground truth? Did ConfidNet really learn to predict TCP on the test set? Thanks.
Hi, thank you for the feedback. In the paper, the classification model is selected using validation-set accuracy. If you tested with the model from the last epoch, that may explain the difference.
Regarding the metrics, they are sensitive to the model used and to the error/success partition it creates. A model with a lower reported test accuracy produces more errors in the test set; as such, if an error sample isn't well ranked, its impact on the metric is reduced when there are more errors. That's why, in the paper, I make sure to compare the various confidence measures (MCP, TrustScore, MCDropout, ConfidNet) using the same classification model.
If needed, more details about the implementation and hyper-parameters are provided in the supplementary material:
https://papers.nips.cc/paper/8556-addressing-failure-detection-by-learning-model-confidence
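To make the point about the error/success partition concrete, here is a minimal sketch (toy data, not the repo's code) of how AP over the error class can be computed with scikit-learn, ranking samples by inverse confidence:

```python
import numpy as np
from sklearn.metrics import average_precision_score

# Toy setup: 1 = misclassified sample (error), 0 = correct prediction.
errors = np.array([1, 0, 0, 1, 0, 0, 0, 0])
confidence = np.array([0.30, 0.95, 0.90, 0.93, 0.85, 0.99, 0.97, 0.92])

# AP-Errors treats errors as the positive class and ranks samples by
# *inverse* confidence: low-confidence samples should be errors.
ap_errors = average_precision_score(errors, -confidence)
print(f"AP-Errors: {ap_errors:.2f}")  # 0.70 here
```

With fewer errors in the set, each badly ranked error shifts this number more, which is why comparisons only make sense on the same classifier.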
@chcorbi Thanks for your reply. I have tried to re-implement your method myself. However, after training a classifier with good accuracy on the cifar10 test set, I cannot obtain a good ConfidNet by freezing the feature extractor and only fine-tuning the last fully connected layers in ConfidNet. In my experiment, ConfidNet tends to converge fast (in only a few epochs) and ends up predicting around 0.9.
At test time, the true class probability is also around 0.9 even for misclassified samples. Could you give me some hints to explain this phenomenon? To be specific, I drew the figure below: the left figure is the distribution for the baseline trained without ConfidNet, while the right figure represents the model trained with ConfidNet. It turns out that ConfidNet overfits easily. Please give me some suggestions on how to fine-tune ConfidNet; I really appreciate it. Thanks.
Did you re-implement from scratch? If so, be careful in PyTorch that your feature extractor layers are indeed set to `requires_grad=False` during training, as done by the `freeze_layers()` function in the `SelfConfidLearner` class. Also deactivate dropout layers to avoid unwanted stochastic effects.
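As a minimal PyTorch sketch of those two steps (the toy extractor and function name here are illustrative, not the repo's actual code):

```python
import torch.nn as nn

def freeze_feature_extractor(model: nn.Module) -> None:
    """Freeze all parameters and deactivate dropout layers in-place."""
    for param in model.parameters():
        param.requires_grad = False      # exclude from gradient updates
    for module in model.modules():
        if isinstance(module, nn.Dropout):
            module.eval()                # dropout becomes a no-op

# Illustrative toy extractor (not the repo's actual architecture):
extractor = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Dropout(0.3))
freeze_feature_extractor(extractor)
print(all(not p.requires_grad for p in extractor.parameters()))  # True
```

Note that a later call to `model.train()` on the full model flips dropout back on, so the dropout deactivation has to be re-applied (or `train()` overridden) whenever the model is switched to training mode.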
only finetuning the last fully connection layer in confidnet
In this implementation, ConfidNet is made of 5 fc layers added on top of the penultimate layer of the original model. If you are using only 1 fc layer for ConfidNet, that may explain the drop in confidence estimation.
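A sketch of a 5-fc-layer confidence head of that shape (the layer widths are my own guess, not necessarily the repo's):

```python
import torch
import torch.nn as nn

class ConfidNetHead(nn.Module):
    """Five fc layers mapping penultimate features to a scalar confidence."""
    def __init__(self, feature_dim: int = 512, hidden_dim: int = 400):
        super().__init__()
        layers = []
        in_dim = feature_dim
        for _ in range(4):                    # four hidden fc layers
            layers += [nn.Linear(in_dim, hidden_dim), nn.ReLU()]
            in_dim = hidden_dim
        layers.append(nn.Linear(in_dim, 1))   # fifth fc layer: TCP estimate
        self.net = nn.Sequential(*layers)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.net(feats)

head = ConfidNetHead()
out = head(torch.randn(8, 512))
print(out.shape)  # torch.Size([8, 1])
```

Only this head receives gradients once the feature extractor is frozen, which is why a single fc layer gives the head much less capacity to fit the TCP target.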
During test, the true class probility is also around 0.9 for all samples which are incorrect?
Using True Class Probability (TCP) as the confidence measure, your misclassified samples should instead have low values, as in the figure from the paper:
If you don't get this kind of figure for TCP, you may have a problem in your code. This will certainly affect ConfidNet training, since TCP is the target value during confidence training.
Regarding the ConfidNet figure, it won't be as good as TCP for sure, but it should be somewhere in between the TCP figure and the MCP figure.
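As a sanity check, TCP can be computed directly from the softmax outputs and ground-truth labels; a minimal sketch on synthetic data (since correct predictions have TCP equal to the maximum probability, their TCP should be clearly higher than that of errors):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic softmax outputs for 1000 samples over 10 classes.
logits = rng.normal(size=(1000, 10))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
labels = rng.integers(0, 10, size=1000)

preds = probs.argmax(axis=1)
tcp = probs[np.arange(len(labels)), labels]   # True Class Probability
correct = preds == labels

# Errors should concentrate at low TCP, successes at high TCP.
print(f"mean TCP (successes): {tcp[correct].mean():.3f}")
print(f"mean TCP (errors):    {tcp[~correct].mean():.3f}")
```

If the error histogram of `tcp` is not clearly shifted toward low values on your real model outputs, the indexing of the true class is a likely culprit.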
The distribution plot presented in the paper corresponds to a comparison between MCP and the TCP criterion. ConfidNet is trained to match that TCP criterion on the training dataset. Given the results obtained, when drawing the plot associated with ConfidNet, you will find something between the MCP plot and the TCP plot, actually closer to MCP indeed.
Your plot seems accurate, comparing here MCP and ConfidNet. The error distribution has been slightly shifted to lower values while success predictions are kept at high values. If you measure `AP_errors`, you will find that ConfidNet improves over MCP.
To help visualize, I added a notebook to plot success/error histograms in commit e94bd89.
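For reference, such a success/error histogram can be sketched in a few lines of matplotlib (synthetic confidence values here, not the notebook's actual code):

```python
import matplotlib
matplotlib.use("Agg")                  # headless backend for scripts
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
# Synthetic confidence values: successes skew high, errors lower.
success_conf = rng.beta(8, 2, size=800)
error_conf = rng.beta(2, 4, size=200)

bins = np.linspace(0.0, 1.0, 30)
plt.hist(success_conf, bins=bins, alpha=0.6, density=True, label="successes")
plt.hist(error_conf, bins=bins, alpha=0.6, density=True, label="errors")
plt.xlabel("confidence estimate")
plt.ylabel("density")
plt.legend()
plt.savefig("confidence_hist.png")
```

Plotting the two classes with `density=True` makes the shapes comparable even when errors are far fewer than successes.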
@chcorbi Thanks for your help. Closing this issue since everything is clear now.