valeoai / confidnet

Addressing Failure Prediction by Learning Model Confidence

License: Other

Languages: Python 78.23%, C++ 8.36%, Dockerfile 0.35%, Shell 0.22%, Jupyter Notebook 12.84%

confidnet's People

Contributors: chcorbi

confidnet's Issues

Freeze Miniconda version and dependencies in Dockerfile

Hello, for the Dockerfile to install the packages properly, I suggest explicitly pinning the targeted Miniconda version:

Replace
RUN wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
with
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.12.0-Linux-x86_64.sh -O miniconda.sh

Additionally, some dependencies must be explicitly specified:

RUN pip install tensorflow==1.13.1 torchsummary pyyaml verboselogs coloredlogs click scikit-learn pillow==6.0.0 protobuf==3.20.0

How to choose which epoch to use?

Hi authors,

Great work! I really appreciate that you released the code for your NeurIPS paper.

One question: how do you select which epoch to use (for the next training step or for reporting final results)? I trained a baseline on CIFAR-10 and found that the results vary from epoch to epoch. Do you have any specific criteria?

Thanks!
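For what it's worth, the issues further down this page converge on a simple criterion: keep the classifier checkpoint with the best validation accuracy, and the ConfidNet / fine-tuning checkpoints with the best validation AUPR-error. Below is a minimal sketch of that selection step; it is not the repository's code, and the parsed-metrics structure is an assumption.

```python
# Minimal sketch (not the authors' code) of one reasonable selection criterion,
# assuming you have parsed the per-epoch validation metrics from the training
# logs into dicts. The metric names mirror those in the configs
# ('accuracy', 'ap_errors' = AUPR-error).

def best_epoch(val_scores, metric):
    """Return the epoch whose validation score is highest for the given metric."""
    return max(val_scores, key=lambda s: s[metric])["epoch"]

val_scores = [
    {"epoch": 51, "accuracy": 0.918, "ap_errors": 0.41},  # illustrative numbers
    {"epoch": 52, "accuracy": 0.922, "ap_errors": 0.43},
]
print(best_epoch(val_scores, "accuracy"))   # classifier pre-training -> 52
print(best_epoch(val_scores, "ap_errors"))  # ConfidNet stages -> 52
```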

Kernel Restarting

The kernel for ConfidNet/confidnet/notebook/success_errors_histogram.ipynb appears to have died. It will restart automatically. (Screenshot attached.)

ModuleNotFoundError

An error occurred while running the train.py file: "ModuleNotFoundError: No module named 'structured_map_ranking_loss' "

ConfidNet performs worse than MCP when I reproduce SVHN results

Hi Charles!
Thank you for your nice work on ConfidNet and for providing this framework with your paper. For an upcoming publication, I would like to run ConfidNet as a baseline. However, when I try to reproduce your results on SVHN, ConfidNet performs worse than MCP:

(Screenshot of the results table attached.)

I understand that results are volatile due to the limited number of incorrect predictions, but I tried multiple runs and always got the same performance pattern. Here is exactly what I did:

  • I run the standard exp_svhn.yaml and select the best epoch according to val-accuracy
  • I run ConfidNet training with the following config:

```yaml
# Data parameters
data:
  dataset: svhn
  data_dir: /media/paul/ssd1/datasets/svhn
  input_size: [32,32]
  input_channels: 3
  num_classes: 10
  valid_size: 0.1

# Training parameters
training:
  output_folder: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run_selfconfid
  task: classification
  learner: selfconfid
  nb_epochs: 200
  batch_size: 128
  loss:
    name: selfconfid_mse
    weighting: 1
  optimizer:
    name: adam
    lr: 0.0001
    weight_decay: 0.0001
  lr_schedule:
  ft_on_val: False
  metrics: ['accuracy', 'auc', 'ap_success', 'ap_errors']
  pin_memory: False
  num_workers: 12
  augmentations:
    normalize: [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]

# Model parameters
model:
  name: small_convnet_svhn_selfconfid_classic
  resume: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run/model_epoch_052.ckpt  # best val-acc of previous encoder-classifier training
  feature_dim: 512
  uncertainty:
```

  • I select the best epoch according to val-aupr-err
  • I run fine-tuning with the following config:

```yaml
# Data parameters
data:
  dataset: svhn
  data_dir: /media/paul/ssd1/datasets/svhn
  input_size: [32,32]
  input_channels: 3
  num_classes: 10
  valid_size: 0.1

# Training parameters
training:
  output_folder: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run_finetune
  task: classification
  learner: selfconfid
  nb_epochs: 20
  batch_size: 128
  loss:
    name: selfconfid_mse
    weighting: 1
  optimizer:
    name: adam
    lr: 0.0000001  # 1e-7
  lr_schedule:
  ft_on_val: False
  metrics: ['accuracy', 'auc', 'ap_success', 'ap_errors']
  pin_memory: False
  num_workers: 12
  augmentations:
    normalize: [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]

# Model parameters
model:
  name: small_convnet_svhn_selfconfid_cloning
  resume: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run/model_epoch_052.ckpt  # best val-acc of previous encoder-classifier training
  feature_dim: 512
  uncertainty: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run_selfconfid/model_epoch_111.ckpt  # best AUPR-error of previous ConfidNet training
```

  • Again, I select the epoch with the best val-aupr-err for testing

I would like to make sure I am using your code correctly so that I do not report unfair baseline results. It would be great if you could give me feedback on this; thanks in advance!
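For reference, the "val-aupr-err" criterion used above is the average precision of the confidence score at detecting errors (the 'ap_errors' metric in the configs). Here is a minimal sketch of how it can be computed with scikit-learn; this is not the repository's metrics implementation.

```python
# Minimal sketch (not the repository's metrics code) of AUPR-error, the
# 'ap_errors' metric used above for epoch selection. Positives are the
# misclassified samples, ranked by low confidence.
import numpy as np
from sklearn.metrics import average_precision_score

def aupr_error(correct, confidence):
    """correct: 1 if the prediction was right, 0 otherwise; confidence: model confidence."""
    errors = 1 - np.asarray(correct)                    # errors are the positive class
    return average_precision_score(errors, -np.asarray(confidence))

# Example: three correct predictions and one low-confidence error
print(aupr_error([1, 1, 0, 1], [0.9, 0.8, 0.2, 0.95]))  # -> 1.0 (the error is ranked first)
```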

Pre-trained Cifar10 baseline classification model's accuracy

Hi, thanks for sharing the really nice work and codes!

When running the code on CIFAR-10 with the given pre-trained VGG model, I have come across two problems:

  1. I can get the reported result with the pre-trained original classification model (92.20%). However, I found that the training accuracy of this pre-trained model is 99.9%, which differs slightly from Table 3 (98.69%) in the supplementary material. Since the number of errors in the training set seems important for the subsequent confidence training according to Table 3 in the paper, I wonder whether there is a problem with my computed training accuracy or whether my understanding is incorrect.

  2. I found that the parameter configuration of the given pre-trained classification model is slightly different from the paper's description, e.g., Adam vs. SGD. Is there a recommended parameter setting to obtain a baseline classifier for CIFAR-10 similar to the one in the paper?

CamVid dataset train / val / test split

Hello authors,

After downloading the CamVid dataset from the link in your repo, we got 101 images with labels. How do you split the train / val / test data? Could you please share your split lists so that I can reproduce your results? Thanks in advance for your help!

Best,
Yingda

Main Segnet Model for Camvid

Hi, sorry for disturbing you so long after this repo was released. I find that to use the code to train a side learner on the CamVid dataset, I need a main SegNet model trained on CamVid. Do you still have the main SegNet model you used, or how should I retrain the main SegNet model on CamVid if I want to follow your steps?

Question about the accuracy of training vgg16 on CIFAR10 dataset?

Hi, thanks for your really interesting work. I tried running your code to train VGG-16 on the CIFAR-10 dataset. However, the final accuracy on the test set is not as good as yours: my best classification accuracy on the CIFAR-10 test set is only 89.32% with the same configuration as yours. Could you give me some advice on improving this accuracy?

Moreover, I modified your VGG-16 implementation and tuned the learning rate and weight decay to reach the same test-set accuracy. However, when using this pretrained model to fine-tune ConfidNet, I still cannot obtain FPR-at-95%-TPR, AUPR-Error, AUPR-Success, and AUC values similar to yours. Why is that?

ConfidNet Failure Cases & Generalization

Hi,

I've been admiring the paper a lot, especially its ability to estimate confidence post-training.

I have a few questions:

  1. What were some of your failure cases? Was there a situation where the method didn't work?
  2. What if the network is overfitting, meaning the confidence value is always saturated at 1? How can you make sure that the network has a reasonable confidence level?
  3. Is it safe to say that, if you know the architecture of a model, it is possible to train ConfidNet as a branch of that model? What are the limitations?

Code bug? Why do I get this error when I test ConfidNet?

File "/ConfidNet/confidnet/learners/default_learner.py", line 179, in evaluate
metrics.update(pred, target, confidence)
UnboundLocalError: local variable 'pred' referenced before assignment
(I'm sure no changes were made to the code except the config.yaml)
When I set --mode confidnet I get this error, but when I set --mode trust-score, --mode mc-dropout, or --mode mcp, everything is normal and I can get the result.

At the same time, when I set --mode tcp, the result is weird: the E-AURC is 0 and the AURC is much lower than with the other methods. Any explanation for this?

Hoping to get your response soon. It's good work!

Attempts to pre-train models result in error: UnboundLocalError: local variable 'pred' referenced before assignment

Issuing the recommended training command python3 train.py -c confs/exp_cifar10.yaml (or the same with cifar100) results in the following error:

Traceback (most recent call last):
  File "train.py", line 135, in <module>
    main()
  File "train.py", line 131, in main
    learner.train(epoch)
  File "/ConfidNet/confidnet/learners/default_learner.py", line 80, in train
    val_losses, scores_val = self.evaluate(self.val_loader, self.prod_val_len, split="val")
  File "/ConfidNet/confidnet/learners/default_learner.py", line 179, in evaluate
    metrics.update(pred, target, confidence)
UnboundLocalError: local variable 'pred' referenced before assignment

While debugging, I noticed that the value of mode passed into evaluate(...) takes the default of "normal", which of course is not handled by any of the if-clauses. Any idea what is supposed to happen in "normal" mode?

Should the default value for mode instead be "mcp"? (I am able to proceed with training if I make this modification, but I'm not sure that this is achieving the intended result)
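To make the failure mode concrete, here is a minimal sketch of the pattern described above; it is not the repository's actual evaluate() code, just an illustration of why `pred` ends up unbound when `mode` matches no branch.

```python
# Minimal sketch of the failure pattern (not the repo's evaluate() code).
# If `mode` matches no branch, `pred` and `confidence` are never assigned,
# so the later use raises UnboundLocalError, exactly as in the traceback.
import torch

def evaluate_sketch(probs, mode="normal"):
    if mode == "mcp":
        confidence, pred = torch.max(probs, dim=1)   # maximum class probability
    elif mode == "mc-dropout":
        pass                                          # other handled modes omitted
    # mode == "normal" falls through every branch...
    return pred, confidence                           # UnboundLocalError here

probs = torch.tensor([[0.1, 0.9], [0.8, 0.2]])
print(evaluate_sketch(probs, mode="mcp"))             # works
# evaluate_sketch(probs, mode="normal")               # raises UnboundLocalError
# The workaround suggested above is to default mode to "mcp" during pre-training;
# a cleaner fix would be an explicit `else: raise ValueError(mode)`.
```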

Trying to understand the ConfidNet training process

Hi, I read your corresponding arxiv article and was trying to understand the learning process as described in the paper.

Learning scheme. Our complete confidence model, from input image to confidence score, shares its first encoding part (‘ConvNet’ in Fig.2) with the classification model M. The training of ConfidNet starts by fixing entirely M (freezing w) and learning θ using loss (4).

I understand the main code is implemented in selfconfid_learner.py. The baseline model is frozen and batch norm + dropout are disabled, allowing only ConfidNet to be trained. However, I didn't understand the following statement:

In a next step, we can then fine-tune the ConvNet encoder. However, as model M has to remain fixed to compute similar classification predictions, we have now to decouple the feature encoders used for classification and confidence prediction respectively.

I couldn't find the corresponding code that performs this step. Can you please help me understand what you mean by the above statement?
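As a hedged reading of the quoted passage (not the repository's exact cloning code; class and attribute names below are assumptions): the fine-tuning stage gives the confidence branch its own copy of the encoder, so it can be updated while the classifier M keeps its original, frozen encoder and therefore makes identical predictions. This appears to correspond to the `*_selfconfid_cloning` model name and the `uncertainty:` checkpoint entry in the configs earlier on this page.

```python
# Hedged sketch of the "decoupling" idea, not the repository's implementation.
# Attribute names (encoder, classifier, confidnet, uncertainty_encoder) are assumptions.
import copy

def decouple_encoders(model):
    """model.encoder feeds both the classifier head and the ConfidNet head."""
    # 1. Freeze the whole classification path so M's predictions stay identical.
    for p in list(model.encoder.parameters()) + list(model.classifier.parameters()):
        p.requires_grad = False

    # 2. Clone the encoder; this private copy is used only by the confidence
    #    branch and is the part that gets fine-tuned in the last stage.
    model.uncertainty_encoder = copy.deepcopy(model.encoder)
    for p in model.uncertainty_encoder.parameters():
        p.requires_grad = True

    # 3. Return only ConfidNet + its private encoder parameters to the optimizer
    #    (trained with the very small learning rate, e.g. 1e-7, from the config).
    return list(model.uncertainty_encoder.parameters()) + list(model.confidnet.parameters())
```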
