augself's Introduction

Improving Transferability of Representations via Augmentation-Aware Self-Supervision

Accepted to NeurIPS 2021

TL;DR: Learning augmentation-aware information by predicting the difference between two augmented samples improves the transferability of representations.
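
The idea in the TL;DR can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the repository's code: the `CropParams` class, the normalized (cx, cy, w, h) encoding of a crop, and the mean-squared-error loss are all hypothetical choices made for the example. In practice a prediction head would presumably estimate the difference from the representations of the two views; here the "prediction" is just a plain list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CropParams:
    """Hypothetical normalized crop box: center (cx, cy) and size (w, h) in [0, 1]."""
    cx: float
    cy: float
    w: float
    h: float

    def as_vector(self):
        return [self.cx, self.cy, self.w, self.h]


def augmentation_difference(params_a, params_b):
    """Regression target: element-wise difference of two augmentation parameter vectors."""
    return [a - b for a, b in zip(params_a, params_b)]


def mse(prediction, target):
    """Mean squared error between a predicted and a true difference vector."""
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)


# Two hypothetical random crops of the same image.
view1 = CropParams(cx=0.50, cy=0.50, w=0.50, h=0.50)
view2 = CropParams(cx=0.25, cy=0.75, w=1.00, h=0.25)

# The auxiliary task is to recover this difference from the two views.
target = augmentation_difference(view1.as_vector(), view2.as_vector())

# A predictor that outputs exactly `target` incurs zero loss;
# one that ignores the augmentations (all zeros) does not.
loss_perfect = mse(target, target)
loss_zero_guess = mse([0.0] * 4, target)
```

Other augmentations (e.g., color jittering) would have their own parameter vectors and, in this sketch, their own difference targets. A representation that discards augmentation information cannot solve such a task, which is the intuition behind "augmentation-aware" learning.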

Dependencies

conda create -n AugSelf python=3.8 pytorch=1.7.1 torchvision=0.8.2 cudatoolkit=10.1 ignite -c pytorch
conda activate AugSelf
pip install scipy tensorboard kornia==0.4.1 scikit-learn

Checkpoints

We provide ImageNet100-pretrained models in this Dropbox link.

Pretraining

Here we provide SimSiam+AugSelf pretraining scripts. To train the baseline (i.e., without AugSelf), remove the --ss-crop and --ss-color options. To use another framework such as SimCLR, set the --framework option.

STL-10

CUDA_VISIBLE_DEVICES=0 python pretrain.py \
    --logdir ./logs/stl10/simsiam/aug_self \
    --framework simsiam \
    --dataset stl10 \
    --datadir DATADIR \
    --model resnet18 \
    --batch-size 256 \
    --max-epochs 200 \
    --ss-color 1.0 --ss-crop 1.0

ImageNet100

python pretrain.py \
    --logdir ./logs/imagenet100/simsiam/aug_self \
    --framework simsiam \
    --dataset imagenet100 \
    --datadir DATADIR \
    --batch-size 256 \
    --max-epochs 500 \
    --model resnet50 \
    --base-lr 0.05 --wd 1e-4 \
    --ckpt-freq 50 --eval-freq 50 \
    --ss-crop 0.5 --ss-color 0.5 \
    --num-workers 16 --distributed

Evaluation

Our main evaluation setups are linear evaluation on fine-grained classification datasets (Table 1) and few-shot benchmarks (Table 2).

Linear evaluation

CUDA_VISIBLE_DEVICES=0 python transfer_linear_eval.py \
    --pretrain-data imagenet100 \
    --ckpt CKPT \
    --model resnet50 \
    --dataset cifar10 \
    --datadir DATADIR \
    --metric top1

Few-shot

CUDA_VISIBLE_DEVICES=0 python transfer_few_shot.py \
    --pretrain-data imagenet100 \
    --ckpt CKPT \
    --model resnet50 \
    --dataset cub200 \
    --datadir DATADIR

augself's Issues

about AugSelf

Hello, excuse me. Different data augmentation methods, such as color perturbation and random cropping, have parameters of different forms. Given such different types of augmentation, how can AugSelf learn the difference between them?

What does the ss_only option do?

Linear evaluation of MoCo on CUB

Hi! Thank you for a great paper and for sharing the code!

I'm looking to reproduce your results on the MoCo model, especially for transferring it to the CUB dataset.

The command I'm running is:

CUDA_VISIBLE_DEVICES=0 python transfer_linear_eval.py \
    --pretrain-data imagenet100 \
    --ckpt $CHECKPOINT_PATH \
    --model resnet50 \
    --dataset cub200 \
    --datadir $CUB_DIR \
    --metric top1

However, the model achieves accuracy lower than the one reported in the paper (37.0, as reported in Tab. 3):

For the MoCo baseline (checkpoint shared by you), I got test acc=0.2575.
For MoCo_augself (checkpoint shared by you), I got test acc=0.3224.
For MoCo pretrained by myself, I got test acc=0.3309.

Thus, I think the issue may be in linear evaluation.

Oddly enough, I roughly reproduced your results on the CIFAR-10 and CIFAR-100 datasets (with ~0.5% difference), so maybe the issue is with CUB only.

Could you kindly provide some guidance on whether I got the hyperparameters right? Alternatively, how did you set up the CUB files: did you use the default train/test split?

Best regards!
