valeoai / confidnet

Addressing Failure Prediction by Learning Model Confidence

License: Other

Python 78.23% C++ 8.36% Dockerfile 0.35% Shell 0.22% Jupyter Notebook 12.84%

confidnet's Introduction

Addressing Failure Prediction by Learning Model Confidence

Charles Corbière, Nicolas Thome, Avner Bar-Hen, Matthieu Cord, Patrick Pérez
Neural Information Processing Systems (NeurIPS), 2019

If you find this code useful for your research, please cite our paper:

@incollection{NIPS2019_8556,
   title = {Addressing Failure Prediction by Learning Model Confidence},
   author = {Corbi\`{e}re, Charles and THOME, Nicolas and Bar-Hen, Avner and Cord, Matthieu and P\'{e}rez, Patrick},
   booktitle = {Advances in Neural Information Processing Systems 32},
   editor = {H. Wallach and H. Larochelle and A. Beygelzimer and F. d\textquotesingle Alch\'{e}-Buc and E. Fox and R. Garnett},
   pages = {2902--2913},
   year = {2019},
   publisher = {Curran Associates, Inc.},
   url = {http://papers.nips.cc/paper/8556-addressing-failure-prediction-by-learning-model-confidence.pdf}
}

Abstract

Assessing reliably the confidence of a deep neural net and predicting its failures is of primary importance for the practical deployment of these models. In this paper, we propose a new target criterion for model confidence, corresponding to the True Class Probability (TCP). We show how using TCP is better suited than relying on the classic Maximum Class Probability (MCP). We provide in addition theoretical guarantees for TCP in the context of failure prediction. Since the true class is by essence unknown at test time, we propose to learn the TCP criterion on the training set, introducing a specific learning scheme adapted to this context. Extensive experiments are conducted to validate the relevance of the proposed approach. We study various network architectures and small and large scale datasets for image classification and semantic segmentation. We show that our approach consistently outperforms several strong methods, from MCP to Bayesian uncertainty, as well as recent approaches specifically designed for failure prediction.
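As a rough illustration of the two criteria (not code from this repository), MCP is simply the largest softmax probability of a prediction, while TCP is the softmax probability assigned to the ground-truth class:

# Illustrative computation of MCP and TCP from softmax outputs (dummy values).
import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)          # batch of 4 samples, 10 classes
labels = torch.tensor([3, 1, 7, 0])  # ground-truth classes

probs = F.softmax(logits, dim=1)
mcp = probs.max(dim=1).values                           # Maximum Class Probability
tcp = probs.gather(1, labels.unsqueeze(1)).squeeze(1)   # True Class Probability

# A misclassified sample has a low TCP even when its MCP is high.
print(mcp, tcp)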

Installation

  1. Clone the repo:
$ git clone https://github.com/valeoai/ConfidNet
  2. Install this repository and the dependencies using pip:
$ pip install -e ConfidNet

With this, you can edit the ConfidNet code on the fly and import ConfidNet functions and classes in other projects as well (see the import sketch at the end of this section).

  3. (Optional) To uninstall this package, run:
$ pip uninstall ConfidNet

You can take a look at the Dockerfile if you are uncertain about the steps needed to install this project.
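For example, once installed in editable mode you can import ConfidNet from another project. A minimal sketch, where the submodule path is taken from the confidnet/learners/default_learner.py file referenced further down this page:

# Minimal sketch: importing ConfidNet from another project after `pip install -e`.
# The submodule path follows confidnet/learners/default_learner.py mentioned below.
from confidnet.learners import default_learner

print(default_learner.__file__)  # points to your editable ConfidNet checkout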

Datasets

MNIST, SVHN, CIFAR-10 and CIFAR-100 datasets are managed by PyTorch dataloaders. The first time you run a script, the dataloader will download the dataset to confidnet/data/DATASETNAME-data.
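For reference, this is roughly what happens under the hood (a hedged sketch calling torchvision directly, not the repository's exact code; the root path merely mirrors the convention above):

# Hedged sketch: downloading MNIST the way a PyTorch dataloader would.
import torchvision
from torchvision import transforms

train_set = torchvision.datasets.MNIST(
    root="confidnet/data/mnist-data",  # assumed location, per the convention above
    train=True,
    download=True,                     # fetches the data on first use
    transform=transforms.ToTensor(),
)
print(len(train_set))  # 60000 training images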

The CamVid dataset needs to be downloaded beforehand (available here) and must follow the structure below (a quick verification sketch follows the listing):

<data_dir>/train/                       % Train images folder
<data_dir>/trainannot/                  % Train labels folder
<data_dir>/val/                         % Validation images folder
<data_dir>/valannot/                    % Validation labels folder
<data_dir>/test/                        % Test images folder
<data_dir>/testannot/                   % Test labels folder
<data_dir>/train.txt                    % List training samples
<data_dir>/val.txt                      % List validation samples
<data_dir>/test.txt                     % List test samples
...
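A quick way to check that your <data_dir> matches this layout (a small helper sketch, not part of the repository):

# Small helper sketch: verify the expected CamVid folder layout described above.
import os

data_dir = "/path/to/camvid"  # replace with your own <data_dir>
expected = [
    "train", "trainannot", "val", "valannot", "test", "testannot",
    "train.txt", "val.txt", "test.txt",
]
missing = [name for name in expected if not os.path.exists(os.path.join(data_dir, name))]
print("Missing entries:", missing or "none")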

Running the code

Training

First, to train a baseline model, create a config.yaml file adapted to your dataset. You can find examples in confidnet/confs/. Don't forget to set the output_folder entry to a path of your own (N.B.: if the folder doesn't exist yet, the script will create it). Then, simply execute the following command:

$ cd ConfidNet/confidnet
$ python3 train.py -c confs/your_config_file.yaml 

It will create an output folder at the location indicated in your config.yaml. This folder includes the model weights, the train/val split used, a copy of your config file, and TensorBoard logs.
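If you want to double-check a config before launching training, a minimal sketch using PyYAML (one of the listed dependencies) could look like this; the key names follow the config excerpts shown on this page:

# Minimal sketch: inspect a training config before launching train.py.
# Key names (training.output_folder, model.resume) follow the excerpts on this page.
import yaml

with open("confs/your_config_file.yaml") as f:
    config = yaml.safe_load(f)

print(config["training"]["output_folder"])  # where weights, splits and logs will go
print(config["model"].get("resume"))        # baseline checkpoint, if any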

By default, if the output folder already exists, training will resume from the last saved epoch. If you want to force training to restart from scratch, simply add -f as an argument:

$ cd ConfidNet/confidnet
$ python3 train.py -c confs/your_config_file.yaml -f

When training ConfidNet, don't forget to add the path to your baseline model weights in your config.yaml:

...
model:
    name: vgg16_selfconfid_classic
    resume: /path/to/weights_folder/model_epoch_040.ckpt
    uncertainty:

The same applies if you want to fine-tune ConfidNet: fill in the uncertainty entry.

Testing

To test your model, use the following command:

$ cd ConfidNet/confidnet
$ python3 test.py -c path/to/your/experiment/folder/your_config_file.yaml -e NUM_EPOCHS -m METHOD
  • -c: path to the config.yaml copy saved in the output folder
  • -e: epoch of the model weights to evaluate
  • -m: method used to compute uncertainty. Available methods are normal (MCP), mc_dropout, trust_score, confidnet.

Results will be printed at the end of the script.
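To give an idea of what these results measure (a hedged sketch, not the repository's metrics code), failure-prediction scores such as AUPR-Error can be computed from per-sample confidences and correctness flags with scikit-learn, which appears among the listed dependencies:

# Hedged sketch: failure-prediction metrics from confidences and correctness flags.
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

confidence = np.array([0.95, 0.40, 0.80, 0.15, 0.70])  # dummy confidence scores
correct = np.array([1, 0, 1, 0, 1])                    # 1 = prediction was correct

auc = roc_auc_score(correct, confidence)                       # AUC
ap_success = average_precision_score(correct, confidence)      # AUPR-Success
ap_errors = average_precision_score(1 - correct, -confidence)  # AUPR-Error
print(auc, ap_success, ap_errors)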

Pre-trained models

Model weights for the MNIST and CIFAR-10 datasets used in the paper are available along with this release. Each zip file contains the weights of the pre-trained baseline model and the weights of ConfidNet. If you want to use the baseline weights:

  • unzip the files, respecting the folder structure
  • each folder (baseline or ConfidNet) contains at least the weights and a config file
  • fill in your config file with the path to the weights folder
  • train your model as indicated earlier

Acknowledgements

confidnet's Issues

Freeze Miniconda version and dependencies in Dockerfile

Hello, for the Dockerfile to properly install the packages, I suggest explicitly setting the targeted Miniconda version:

Replace
RUN wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
by
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.12.0-Linux-x86_64.sh -O miniconda.sh

Additionally, some dependencies must be explicitly specified:

RUN pip install tensorflow==1.13.1 torchsummary pyyaml verboselogs coloredlogs click scikit-learn pillow==6.0.0 protobuf==3.20.0

Kernel Restarting

The kernel for ConfidNet/confidnet/notebook/success_errors_histogram.ipynb appears to have died. It will restart automatically

ConfidNet Failure Cases & Generalization

Hi,

I've been admiring the paper a lot, especially its ability to estimate confidence post-training.

I have a few questions:

  1. What were some of your failure cases? Was there a situation where the method didn't work?
  2. What if the network is overfitting, meaning the value of the confidence is always saturated to 1? How can you make sure that the network has a reasonable confidence level?
  3. Is it safe to say that, if you know the architecture of a model, it is possible to train ConfidNet as a branch for that model? What are the limitations?

Trying to understand the ConfidNet training process

Hi, I read your corresponding arxiv article and was trying to understand the learning process as described in the paper.

Learning scheme. Our complete confidence model, from input image to confidence score, shares its first encoding part (‘ConvNet’ in Fig.2) with the classification model M. The training of ConfidNet starts by fixing entirely M (freezing w) and learning θ using loss (4).

I understand the main code is implemented in selfconfid_learner.py. The baseline model is frozen and BN/dropout are disabled, allowing only ConfidNet to be trained. However, I didn't understand the statement

In a next step, we can then fine-tune the ConvNet encoder. However, as model M has to remain fixed to compute similar classification predictions, we have now to decouple the feature encoders used for classification and confidence prediction respectively.

I couldn't find the corresponding code that performs the above action. Can you please help me understand what you mean by this statement?
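For readers following along, the first stage quoted above (freeze w, learn θ) corresponds to a generic PyTorch pattern like the one below; this is an illustrative sketch with made-up module names, not the repository's selfconfid_learner.py:

# Illustrative sketch of the first ConfidNet training stage described above:
# freeze the classifier (w) and optimize only the confidence branch (theta).
import torch
import torch.nn as nn

class TinyConfidModel(nn.Module):
    """Toy classifier with a confidence head, for illustration only."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(32, 16)     # stands in for the shared ConvNet
        self.classifier = nn.Linear(16, 10)  # classification head (part of M, weights w)
        self.uncertainty = nn.Linear(16, 1)  # confidence branch (ConfidNet, weights theta)

    def forward(self, x):
        feats = torch.relu(self.encoder(x))
        return self.classifier(feats), torch.sigmoid(self.uncertainty(feats))

model = TinyConfidModel()

# Stage 1: freeze the classification model M and train only the confidence head.
for name, param in model.named_parameters():
    if not name.startswith("uncertainty"):
        param.requires_grad = False
model.eval()  # keeps BN statistics fixed and disables dropout, as mentioned above

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)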

Code bug? Why do I get this error when I test ConfidNet?

File "/ConfidNet/confidnet/learners/default_learner.py", line 179, in evaluate
metrics.update(pred, target, confidence)
UnboundLocalError: local variable 'pred' referenced before assignment
(I'm sure no changes were made to the code except to config.yaml.)
When I set --mode confidnet I get this error, but when I set --mode trust-score, --mode mc-dropout, or --mode mcp everything is normal and I can get the result.

At the same time, when I set --mode tcp, the result is weird: the E-AURC is 0 and the AURC is much lower than for the other methods. Any explanation for this?

Hoping to get your response soon. It's good work!

CamVid dataset train / val / test split

Hello authors,

After downloading the CamVid dataset from the link in your repo, we got 101 images with labels. How do you split the train/val/test data? Could you please share your lists so that I can reproduce your results? Thanks in advance for your help!

Best,
Yingda

ConfidNet performs worse than MCP when I reproduce SVHN results

Hi Charles!
Thank you for your nice work on ConfidNet and for providing this framework with your paper. For an upcoming publication, I would like to run ConfidNet as a baseline. However, when I try to reproduce your results on SVHN, ConfidNet performs worse than MCP:


I understand that results are volatile due to the limited number of incorrect predictions, but I tried multiple runs and always got the same performance pattern. So here is exactly what I did:

  • I run the standard exp_svhn.yaml and select the best epoch according to val-accuracy
  • I run ConfidNet training with the following config:

# Data parameters
data:
    dataset: svhn
    data_dir: /media/paul/ssd1/datasets/svhn
    input_size: [32,32]
    input_channels: 3
    num_classes: 10
    valid_size: 0.1

# Training parameters
training:
    output_folder: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run_selfconfid
    task: classification
    learner: selfconfid
    nb_epochs: 200
    batch_size: 128
    loss:
        name: selfconfid_mse
        weighting: 1
    optimizer:
        name: adam
        lr: 0.0001
        weight_decay: 0.0001
    lr_schedule:
    ft_on_val: False
    metrics: ['accuracy', 'auc', 'ap_success', 'ap_errors']
    pin_memory: False
    num_workers: 12
    augmentations:
        normalize: [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]

# Model parameters
model:
    name: small_convnet_svhn_selfconfid_classic
    resume: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run/model_epoch_052.ckpt # best val-acc of previous encoder-classifier training
    feature_dim: 512
    uncertainty:

  • I select the best epoch according to val-aupr-err
  • I run fine-tuning with the following config:

# Data parameters
data:
    dataset: svhn
    data_dir: /media/paul/ssd1/datasets/svhn
    input_size: [32,32]
    input_channels: 3
    num_classes: 10
    valid_size: 0.1

# Training parameters
training:
    output_folder: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run_finetune
    task: classification
    learner: selfconfid
    nb_epochs: 20
    batch_size: 128
    loss:
        name: selfconfid_mse
        weighting: 1
    optimizer:
        name: adam
        lr: 0.0000001 # 1e-7
    lr_schedule:
    ft_on_val: False
    metrics: ['accuracy', 'auc', 'ap_success', 'ap_errors']
    pin_memory: False
    num_workers: 12
    augmentations:
        normalize: [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]

# Model parameters
model:
    name: small_convnet_svhn_selfconfid_cloning
    resume: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run/model_epoch_052.ckpt # best val-acc of previous encoder-classifier training
    feature_dim: 512
    uncertainty: /mnt/hdd2/checkpoints/confid_test/svhn_smallconv_run_selfconfid/model_epoch_111.ckpt # best AUPR-error of previous confidnet training.

  • Again, I select the epoch with the best val-aupr-err for testing

I would like to make sure I use your code correctly in order to not report unfair baseline results. It would be great if you could give me feedback on this, thanks in advance!

Question about the accuracy of training vgg16 on CIFAR10 dataset?

Hi, thanks for your really interesting work. I tried to run your code to train VGG16 on the CIFAR-10 dataset. However, the final accuracy on the test set is not as good as yours: my best classification accuracy on the CIFAR-10 test set only reaches 89.32% with the same configuration as yours. Could you give me some advice for improving this accuracy?

Moreover, I tried modifying your implementation of VGG16 and tuned the learning rate and weight decay to reach the same test accuracy; however, when using this pretrained model to fine-tune ConfidNet, I still cannot get FPR-95%-TPR, AUPR-Error, AUPR-Success, and AUC values similar to yours. Why?

Attempts to pre-train models results in error: UnboundLocalError: local variable 'pred' referenced before assignment

Issuing the recommended training command python3 train.py -c confs/exp_cifar10.yaml (or the same with cifar100) results in the following error:

Traceback (most recent call last):
  File "train.py", line 135, in <module>
    main()
  File "train.py", line 131, in main
    learner.train(epoch)
  File "/ConfidNet/confidnet/learners/default_learner.py", line 80, in train
    val_losses, scores_val = self.evaluate(self.val_loader, self.prod_val_len, split="val")
  File "/ConfidNet/confidnet/learners/default_learner.py", line 179, in evaluate
    metrics.update(pred, target, confidence)
UnboundLocalError: local variable 'pred' referenced before assignment

While debugging, I noticed that the value of mode passed into evaluate(...) takes the default of "normal", which of course is not handled in any of the if-clauses. Any idea what is expected to happen in "normal" mode?

Should the default value for mode instead be "mcp"? (I am able to proceed with training if I make this modification, but I'm not sure that this is achieving the intended result)
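For context, the pattern behind this error is schematically the following (illustrative code, not the repository's evaluate()): when no branch matches the given mode, pred is never assigned before it is used.

# Schematic of the failure pattern described above (not the repository's code).
import numpy as np

def evaluate(output, mode="normal"):
    if mode == "mcp":
        pred = output.argmax(axis=1)
        confidence = output.max(axis=1)
    # ... branches for mc_dropout, trust_score, confidnet would go here ...
    # With mode="normal" nothing assigns `pred`, so the return below raises
    # UnboundLocalError, matching the traceback above; defaulting to "mcp" (or
    # raising ValueError for unknown modes) would avoid the silent fall-through.
    return pred, confidence

probs = np.random.rand(4, 10)
print(evaluate(probs, mode="mcp"))  # works
# evaluate(probs)                   # raises UnboundLocalError, as in the issue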

ModuleNotFoundError

An error occurred while running the train.py file: "ModuleNotFoundError: No module named 'structured_map_ranking_loss' "

Pre-trained Cifar10 baseline classification model's accuracy

Hi, thanks for sharing this really nice work and code!

When running the code on CIFAR-10 with the given pre-trained VGG model, I came across two problems:

  1. I can get the reported result with the pre-trained original classification model (92.20%). But I found the training accuracy of the pre-trained original classification model is 99.9%, which differs a little from Table 3 (98.69%) in the supplementary. Since the number of errors in the training set seems important for the subsequent confidence training, according to Table 3 in the paper, I wonder if there is a problem with my computed training accuracy or if my understanding is incorrect.

  2. I found that the parameter configuration of the given pre-trained classification model is slightly different from the paper's description, e.g., Adam vs. SGD. Is there a recommended parameter setting to obtain a baseline classifier for CIFAR-10 similar to the one in the paper?

Main Segnet Model for Camvid

Hi, sorry for disturbing you so long after this repo was released. I found that to use the code to train a side learner for the CamVid dataset, I need a main SegNet model trained on CamVid. Could you let me know whether you still have the main SegNet model you used, or how I should retrain the main SegNet model on CamVid if I want to follow your steps?

How to choose which epoch to use?

Hi authors,

Great work! I really appreciate that you released the code for your NeurIPS paper.

One question: how do you select which epoch to use (for the next training step or for reporting final results)? I trained a baseline for CIFAR-10 and found that the results vary from epoch to epoch. Do you have any specific criteria?

Thanks!
