ibm / constrained-fscil

PyTorch Implementation of the CVPR'22 Paper "Constrained Few-shot Class-incremental Learning"

License: Apache License 2.0

Constrained Few-shot Class-incremental Learning

Michael Hersche, Geethan Karunaratne, Giovanni Cherubini, Luca Benini, Abu Sebastian, Abbas Rahimi

CVPR'22

Requirements

Running the code requires conda. Create a new environment with

$ conda create --name cfscil_env python=3.6
$ conda activate cfscil_env

We need PyTorch 1.3 and CUDA.

$ (cfscil_env) conda install pytorch=1.3 torchvision cudatoolkit=10.1 -c pytorch
$ (cfscil_env) pip install -r requirements.txt
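
After installation, a quick check can confirm that the environment sees PyTorch and CUDA. The following is a minimal sketch, not part of the repository; the function name check_environment is hypothetical. It only relies on torch.__version__ and torch.cuda.is_available(), and it reports a missing PyTorch installation instead of raising.

```python
import sys


def check_environment():
    """Collect interpreter, PyTorch, and CUDA information without failing hard."""
    info = {"python": "{}.{}".format(*sys.version_info[:2])}
    try:
        import torch  # expected to come from the conda install command above
    except ImportError:
        # PyTorch is missing: report it rather than crash.
        info["torch"] = None
        info["cuda_available"] = False
        return info
    info["torch"] = torch.__version__
    info["cuda_available"] = torch.cuda.is_available()
    return info


for key, value in check_environment().items():
    print("{}: {}".format(key, value))
```

If torch is reported as None or cuda_available as False, the install step above is worth repeating before starting a long simulation.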

Datasets

We provide the code for running experiments on miniImageNet and CIFAR100. The experiments for Omniglot have been conducted in a different framework, and cannot be released. However, we provide our split under data/index_list/omniglot as well as a dataloader under code/lib/dataloader/FSCIL/omniglot.

We follow the FSCIL setting and use the same data index_list for training. For CIFAR100, the dataset is downloaded automatically. For miniImageNet, download the dataset with the commands below, which place the file under the code/data/ folder and extract it.

$ (cfscil_env) cd code/data/
$ (cfscil_env) gdown 1_x4o0iFetEv-T3PeIxdSbBPUG4hFfT8U
$ (cfscil_env) tar -xvf miniimagenet.tar 

Usage

The whole simulator is runnable from the command line via the code/main.py script which serves as a command parser. Everything should be run from the code directory.

The structure of any command looks like

$ (cfscil_env) python main.py command [subcommand(s)] [-option(s)] [argument(s)]

and help can be found for every command and subcommand by adding a trailing --help. The main.py file also contains all default parameters used for simulations.

Simulation

To run a single simulation of the model (incl. training, validation, testing), use the simulation command. Specify a logging directory if the default path is not wanted. Any simulation parameter that should differ from the default in main.py can be set by chaining -p parameter value pairs.

$ (cfscil_env) python main.py simulation --logdir path/to/logdir -p parameter_1 value_1 -p parameter_2 value_2

All parameters are interpreted as strings and translated by the parser, so no quotation marks are needed. Boolean values can be given as t, true, f, or false.
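
To illustrate the kind of string translation described here, the following is a minimal sketch and not the parser in main.py. The helper names parse_value and parse_pairs are hypothetical, and case-insensitive matching of the boolean spellings is an assumption, made because the example commands in this README pass True and False.

```python
def parse_value(text):
    """Translate one command-line string into bool, int, float, or str."""
    lowered = text.lower()
    if lowered in ("t", "true"):
        return True
    if lowered in ("f", "false"):
        return False
    # Try the numeric types from most to least specific.
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    # Anything else stays a plain string, e.g. "SGD" or "mini_resnet12".
    return text


def parse_pairs(pairs):
    """Turn [(name, raw_value), ...] into a dict of typed values."""
    return {name: parse_value(raw) for name, raw in pairs}


params = parse_pairs([
    ("learning_rate", "0.01"),
    ("batch_size", "128"),
    ("SGDnesterov", "True"),
    ("optimizer", "SGD"),
])
print(params)
```

In this sketch a string parameter whose value happens to be t or f would be read as a boolean; a real parser would more likely look at the type of the default value to decide.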

Run main experiments on CIFAR100

# Pretraining
$ (cfscil_env) python -u main.py simulation -v -ld "log/test_CIFAR100/pretrain_basetrain" -p max_train_iter 120  -p data_folder "data" -p trainstage pretrain_baseFSCIL -p pretrainFC linear -p dataset cifar100 -p random_seed 7 -p learning_rate 0.01 -p batch_size 128 -p optimizer SGD -p SGDnesterov True -p lr_step_size 30 -p representation real -p dim_features 512 -p block_architecture mini_resnet12

# Metatraining
$ (cfscil_env) python -u main.py simulation -v -ld "log/test_CIFAR100/meta_basetrain" -p max_train_iter 70000 -p data_folder "data" -p resume "log/test_CIFAR100/pretrain_basetrain"  -p trainstage metatrain_baseFSCIL -p dataset cifar100 -p average_support_vector_inference True -p random_seed 7 -p learning_rate 0.01 -p batch_size_training 10 -p batch_size_inference 128 -p optimizer SGD -p sharpening_activation softabs -p SGDnesterov True -p lr_step_size 30000  -p  representation tanh -p dim_features 512 -p num_ways 60 -p num_shots 5 -p block_architecture mini_resnet12

# Evaluation Mode 1 (num_shots relates only to number of shots in base session, on novel there are always 5)
$ (cfscil_env) python -u main.py simulation -v -ld "log/test_CIFAR100/eval/mode1"  -p data_folder "data"  -p resume "log/test_CIFAR100/meta_basetrain" -p dim_features 512 -p retrain_iter 0 -p nudging_iter 0 -p bipolarize_prototypes False -p nudging_act_exp 4 -p nudging_act doubleexp -p trainstage train_FSCIL -p dataset cifar100 -p random_seed 7 -p learning_rate 0.01 -p batch_size_training 128 -p batch_size_inference 128 -p num_query_training 0 -p optimizer SGD -p sharpening_activation abs -p SGDnesterov True -p representation tanh -p retrain_act tanh -p num_ways 60 -p num_shots 200 -p block_architecture mini_resnet12

# Evaluation Mode 2
$ (cfscil_env) python -u main.py simulation -v -ld "log/test_CIFAR100/eval/mode2"  -p data_folder "data"  -p resume "log/test_CIFAR100/meta_basetrain" -p dim_features 512 -p retrain_iter 10 -p nudging_iter 0 -p bipolarize_prototypes True -p nudging_act_exp 4 -p nudging_act doubleexp -p trainstage train_FSCIL -p dataset cifar100 -p random_seed 7 -p learning_rate 0.01 -p batch_size_training 128 -p batch_size_inference 128 -p num_query_training 0 -p optimizer SGD -p sharpening_activation abs -p SGDnesterov True -p representation tanh -p retrain_act tanh -p num_ways 60 -p num_shots 200 -p block_architecture mini_resnet12

# Evaluation Mode 3
$ (cfscil_env) python -u main.py simulation -v -ld "log/test_CIFAR100/eval/mode3" -p data_folder "data"  -p resume "log/test_CIFAR100/meta_basetrain" -p dim_features 512 -p retrain_iter 50 -p nudging_iter 100 -p bipolarize_prototypes False -p nudging_act_exp 4 -p nudging_act doubleexp -p trainstage train_FSCIL -p dataset cifar100 -p random_seed 7 -p learning_rate 0.01 -p batch_size_training 128 -p batch_size_inference 128 -p num_query_training 0 -p optimizer SGD -p sharpening_activation abs -p SGDnesterov True -p representation tanh -p retrain_act tanh -p num_ways 60 -p num_shots 200 -p block_architecture mini_resnet12
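
The three evaluation commands above differ only in retrain_iter, nudging_iter, bipolarize_prototypes, and the log directory. As a sketch of how they could be generated from one shared parameter set, the script below rebuilds the argument lists. It is a hypothetical helper, not part of the repository, and it assumes the order of the -p pairs does not matter to main.py.

```python
import shlex

# Values shared by the three CIFAR100 evaluation commands above.
SHARED = {
    "data_folder": "data",
    "resume": "log/test_CIFAR100/meta_basetrain",
    "dim_features": "512",
    "nudging_act_exp": "4",
    "nudging_act": "doubleexp",
    "trainstage": "train_FSCIL",
    "dataset": "cifar100",
    "random_seed": "7",
    "learning_rate": "0.01",
    "batch_size_training": "128",
    "batch_size_inference": "128",
    "num_query_training": "0",
    "optimizer": "SGD",
    "sharpening_activation": "abs",
    "SGDnesterov": "True",
    "representation": "tanh",
    "retrain_act": "tanh",
    "num_ways": "60",
    "num_shots": "200",
    "block_architecture": "mini_resnet12",
}

# The only values that change between the modes.
MODES = {
    "mode1": {"retrain_iter": "0", "nudging_iter": "0", "bipolarize_prototypes": "False"},
    "mode2": {"retrain_iter": "10", "nudging_iter": "0", "bipolarize_prototypes": "True"},
    "mode3": {"retrain_iter": "50", "nudging_iter": "100", "bipolarize_prototypes": "False"},
}


def build_command(mode):
    """Return the argument list for one evaluation mode."""
    args = ["python", "-u", "main.py", "simulation", "-v",
            "-ld", "log/test_CIFAR100/eval/" + mode]
    params = dict(SHARED)
    params.update(MODES[mode])
    for name, value in params.items():
        args += ["-p", name, value]
    return args


for mode in sorted(MODES):
    print(" ".join(shlex.quote(arg) for arg in build_command(mode)))
```

The printed lines can be pasted into a shell from the code directory, or the argument lists can be handed to subprocess.run.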

Run main experiments on miniImageNet

# Pretraining
$ (cfscil_env) python -u main.py simulation -v -ld "log/test_mini_imagenet/pretrain_basetrain" -p max_train_iter 120  -p data_folder "data" -p trainstage pretrain_baseFSCIL -p pretrainFC linear -p dataset mini_imagenet -p random_seed 7 -p learning_rate 0.01 -p batch_size 128 -p optimizer SGD -p SGDnesterov True -p lr_step_size 30 -p representation real -p dim_features 512 -p block_architecture mini_resnet12

# Metatraining
$ (cfscil_env) python -u main.py simulation -v -ld "log/test_mini_imagenet/meta_basetrain" -p max_train_iter 70000 -p data_folder "data" -p resume "log/test_mini_imagenet/pretrain_basetrain"  -p trainstage metatrain_baseFSCIL -p dataset mini_imagenet -p average_support_vector_inference True -p random_seed 7 -p learning_rate 0.01 -p batch_size_training 10 -p batch_size_inference 128 -p optimizer SGD -p sharpening_activation softabs -p SGDnesterov True -p lr_step_size 30000  -p  representation tanh -p dim_features 512 -p num_ways 60 -p num_shots 5 -p block_architecture mini_resnet12

# Evaluation Mode 1 (num_shots relates only to number of shots in base session, on novel there are always 5)
$ (cfscil_env) python -u main.py simulation -v -ld "log/test_mini_imagenet/eval/mode1"  -p data_folder "data"  -p resume "log/test_mini_imagenet/meta_basetrain" -p dim_features 512 -p retrain_iter 0 -p nudging_iter 0 -p bipolarize_prototypes False -p nudging_act_exp 4 -p nudging_act doubleexp -p trainstage train_FSCIL -p dataset mini_imagenet -p random_seed 7 -p learning_rate 0.01 -p batch_size_training 128 -p batch_size_inference 128 -p num_query_training 0 -p optimizer SGD -p sharpening_activation abs -p SGDnesterov True -p representation tanh -p retrain_act tanh -p num_ways 60 -p num_shots 100 -p block_architecture mini_resnet12

# Evaluation Mode 2
$ (cfscil_env) python -u main.py simulation -v -ld "log/test_mini_imagenet/eval/mode2"  -p data_folder "data"  -p resume "log/test_mini_imagenet/meta_basetrain" -p dim_features 512 -p retrain_iter 10 -p nudging_iter 0 -p bipolarize_prototypes True -p nudging_act_exp 4 -p nudging_act doubleexp -p trainstage train_FSCIL -p dataset mini_imagenet -p random_seed 7 -p learning_rate 0.01 -p batch_size_training 128 -p batch_size_inference 128 -p num_query_training 0 -p optimizer SGD -p sharpening_activation abs -p SGDnesterov True -p representation tanh -p retrain_act tanh -p num_ways 60 -p num_shots 100 -p block_architecture mini_resnet12

# Evaluation Mode 3
$ (cfscil_env) python -u main.py simulation -v -ld "log/test_mini_imagenet/eval/mode3" -p data_folder "data"  -p resume "log/test_mini_imagenet/meta_basetrain" -p dim_features 512 -p retrain_iter 50 -p nudging_iter 100 -p bipolarize_prototypes False -p nudging_act_exp 4 -p nudging_act doubleexp -p trainstage train_FSCIL -p dataset mini_imagenet -p random_seed 7 -p learning_rate 0.01 -p batch_size_training 128 -p batch_size_inference 128 -p num_query_training 0 -p optimizer SGD -p sharpening_activation abs -p SGDnesterov True -p representation tanh -p retrain_act tanh -p num_ways 60 -p num_shots 100 -p block_architecture mini_resnet12

Inspection with TensorBoard

For a detailed inspection of a simulation, TensorBoard can be used. During a simulation, data is logged that TensorBoard can visualize in the browser.
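
As a sketch of how the tool is typically started, the snippet below builds a TensorBoard command line for one log directory. It assumes that the directory passed to the simulation via -ld is where the TensorBoard data ends up, which this README does not state explicitly; the helper name tensorboard_command is hypothetical, while --logdir and --port are standard TensorBoard options.

```python
import shlex
import shutil


def tensorboard_command(logdir, port=6006):
    """Build the TensorBoard invocation for a simulation log directory."""
    return ["tensorboard", "--logdir", logdir, "--port", str(port)]


cmd = tensorboard_command("log/test_CIFAR100/meta_basetrain")
if shutil.which("tensorboard") is None:
    # Only a hint: the command is still printed so it can be run later.
    print("tensorboard not found on PATH; check that it is installed in cfscil_env")
print(" ".join(shlex.quote(part) for part in cmd))
```

Running the printed command and opening http://localhost:6006 in a browser shows whatever was logged under that directory.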

Acknowledgment

Our code is based on

Citation

If you use the work released here for your research, please cite this paper:

@inproceedings{hersche2022cfscil,
    Author = {Hersche, Michael and Karunaratne, Geethan and Cherubini, Giovanni and Benini, Luca and Sebastian, Abu and Rahimi, Abbas},
    Booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    Title = {Constrained Few-shot Class-incremental Learning},
    Year = {2022}}

License

Our code is licensed under Apache 2.0. Please refer to the LICENSE file for the licensing of our code.

constrained-fscil's Issues

The effectiveness of the fully connected layer

Hi, thanks for your excellent work.
I'm wondering about the effectiveness of the fully connected layer. In the paper, the fully connected layer produces a support vector for every training input. Since there is no ablation study, I'd like to know what happens if we just use the outputs of the feature extractor to calculate the prototypes.

Checkpoint file of meta-trained model

Hello author,
Thanks for releasing the code of your paper "Constrained Few-Shot Class-Incremental Learning", published at CVPR'22.
After reading your paper, I tried to run your method on miniImageNet, but it failed because of GPU memory limits on my machine.
Could you upload your meta-trained model for miniImageNet?
Thank you :)

Multiply one-hot vector with feature representation of new data

sum_feat_vec = t.matmul(t.transpose(y_onehot,0,1), feat_vec)

Hi, very impressive work!!!

I have a question about the code. I see that when you calculate the feature representation of new data, you multiply feat_vec by the one-hot vector. I don't understand this. If you multiply by a one-hot vector, wouldn't most of the new feature representation become zero? Does this make it more discriminative?

Best regards,
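
On the arithmetic of the quoted line: assuming y_onehot has one row per sample and one column per class, the product of its transpose with feat_vec is a matrix product, not an element-wise one. Row c of the result is the sum of the feature vectors of all samples labeled with class c, so no feature is zeroed out. The following dependency-free sketch uses plain lists in place of tensors; the shapes and values are made up for illustration.

```python
def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Four samples, two classes: samples 0 and 2 are class 0, samples 1 and 3 are class 1.
y_onehot = [[1, 0], [0, 1], [1, 0], [0, 1]]
# One 3-dimensional feature vector per sample (made-up values).
feat_vec = [[1, 2, 3], [10, 20, 30], [4, 5, 6], [40, 50, 60]]

# Same operation as t.matmul(t.transpose(y_onehot, 0, 1), feat_vec).
sum_feat_vec = matmul(transpose(y_onehot), feat_vec)
print(sum_feat_vec)  # [[5, 7, 9], [50, 70, 90]]
```

Dividing each row by the number of samples in that class would turn these per-class sums into per-class means.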
