facebookresearch / fair_self_supervision_benchmark

Scaling and Benchmarking Self-Supervised Visual Representation Learning

License: Other

Language: Python (100.00%)

fair_self_supervision_benchmark's Introduction

FAIR Self-Supervision Benchmark is deprecated. Please see VISSL, a ground-up rewrite of the benchmark in PyTorch.

FAIR Self-Supervision Benchmark

This code provides various benchmark (and legacy) tasks for evaluating the quality of visual representations learned by various self-supervision approaches. It corresponds to our work on Scaling and Benchmarking Self-Supervised Visual Representation Learning. The code is written in Python and can be used to evaluate both PyTorch and Caffe2 models (see this). We hope that this benchmark release will provide a consistent evaluation strategy that makes it easy to measure progress in self-supervision.

Introduction

The goal of fair_self_supervision_benchmark is to standardize the methodology for evaluating the quality of visual representations learned by various self-supervision approaches. Further, it provides evaluation on a variety of tasks, as follows:

Benchmark tasks: The benchmark tasks are based on the principle that a good representation (1) transfers to many different tasks, and (2) transfers with limited supervision and limited fine-tuning. The tasks are as follows:

  • Image Classification
  • Object Detection
  • Surface Normal Estimation
  • Visual Navigation

These benchmark tasks use the AlexNet and ResNet-50 network architectures.

Legacy tasks: We also classify some commonly used evaluation tasks as legacy tasks, for the reasons discussed in Section 7 of the paper.

License

fair_self_supervision_benchmark is licensed under CC-NC 4.0 International, as found in the LICENSE file.

Citation

If you use fair_self_supervision_benchmark in your research or wish to refer to the baseline results published in the paper, please use the following BibTeX entry.

@article{goyal2019scaling,
  title={Scaling and Benchmarking Self-Supervised Visual Representation Learning},
  author={Goyal, Priya and Mahajan, Dhruv and Gupta, Abhinav and Misra, Ishan},
  journal={arXiv preprint arXiv:1905.01235},
  year={2019}
}

Installation

Please find installation instructions in INSTALL.md.

Getting Started

After installation, please see GETTING_STARTED.md for how to run various benchmark tasks.

Model Zoo

We provide models used in our paper in the MODEL_ZOO.

fair_self_supervision_benchmark's People

Contributors

jasonwei20, prigoyal


fair_self_supervision_benchmark's Issues

Size of extracted features

Could you point me to where the code resizes the extracted features?

For ResNet-50, I got 9216 extracted features after layers 1, 2, and 4, and 8192 features after layers 3 and 5. Where do these numbers come from?

Thank you, and apologies if this is a silly question!
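
For reference, these sizes match the paper's evaluation setup, where each stage's feature map is spatially average-pooled so that the flattened dimension stays near 9,000. A small sketch of the arithmetic (the pooled grid sizes below are assumptions chosen to reproduce the reported numbers, not values read from this repo's configs):

# Flattened sizes after average-pooling ResNet-50 stage outputs to small grids.
# Channels per stage: conv1=64, res2=256, res3=512, res4=1024, res5=2048.
stages = {
    "layer 1 (conv1, 64 ch)":  (64, 12),   # 64 * 12 * 12 = 9216
    "layer 2 (res2, 256 ch)":  (256, 6),   # 256 * 6 * 6  = 9216
    "layer 3 (res3, 512 ch)":  (512, 4),   # 512 * 4 * 4  = 8192
    "layer 4 (res4, 1024 ch)": (1024, 3),  # 1024 * 3 * 3 = 9216
    "layer 5 (res5, 2048 ch)": (2048, 2),  # 2048 * 2 * 2 = 8192
}
for name, (channels, grid) in stages.items():
    print(name, "->", channels * grid * grid)

This reproduces exactly the split reported above: 9216 after layers 1, 2, and 4, and 8192 after layers 3 and 5.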

[bug] Missing make_SN_labels.py?

The script make_SN_labels.py referenced in the Surface Normal Estimation README doesn't seem to exist in either the repository or the data download. I could be mistaken, but I haven't been able to find it after over an hour of searching. Thanks!

[bug] GPU is not extensively used in feature extraction

Hello,

Thank you for sharing this benchmark for evaluating self-supervised learning approaches.

I followed the INSTALL.md file to set up the benchmark tool successfully. Then I followed this README.md file to download the datasets and set up the file hierarchies accordingly.
And finally, I used extra_scripts/README.md to produce image/label lists for each dataset as expected. So far so good.

I simply want to extract COCO2014 features from a pre-trained model, for instance AlexNet-In1K. To do that, I first change NUM_DEVICES to 1 in the caffenet_bvlc_supervised_extract_features.yaml file, since I have only 1 GPU, and then run the following command taken from GETTING_STARTED.md:

python tools/extract_features.py \
    --config_file configs/benchmark_tasks/image_classification/coco2014/caffenet_bvlc_supervised_extract_features.yaml \
    --data_type train \
    --output_file_prefix trainval \
    --output_dir /tmp/ssl-benchmark-output/extract_features/weights_init \
    TEST.PARAMS_FILE https://dl.fbaipublicfiles.com/fair_self_supervision_benchmark/models/caffenet_bvlc_in1k_supervised.npy \
    TRAIN.DATA_FILE /tmp/ssl-benchmark-output/coco/train_images.npy \
    TRAIN.LABELS_FILE /tmp/ssl-benchmark-output/coco/train_labels.npy

While extracting features, GPU utilization is quite low (mostly under 5%), whereas CPU load is at 100% almost all the time. I believe something is wrong with device selection. Could you please help me resolve this issue?
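
One thing worth ruling out first is a CPU-only Caffe2 build. A quick diagnostic sketch using standard Caffe2 workspace calls (not something this repo ships):

# Check whether the installed Caffe2 build was compiled with GPU support.
from caffe2.python import workspace

print("GPU support built in:", workspace.has_gpu_support)
print("CUDA devices visible to Caffe2:", workspace.NumCudaDevices())

If has_gpu_support is False or no CUDA devices are reported, the run will silently fall back to CPU regardless of NUM_DEVICES.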

I see the same behavior on machines with:

  • Ubuntu-16.04, CUDA-9, pytorch-1.0
  • CentOs-7, CUDA-10, pytorch-1.0

Bulent

Pretraining using a custom subset of ImageNet

Thanks for sharing the code and paper!

I am looking to perform pre-training using the Jigsaw (ResNet-50) pretext method, just like what was mentioned in the paper (Table 2). But instead of using ImageNet-1K, I would like to pretrain on an ImageNet subset of 10 classes. I also have only one GPU available.

My question:

  1. Is the code for pre-training using Jigsaw (ResNet-50) available? I ask this because I only see code and commands for the Benchmark Tasks and Legacy Tasks. If pre-training code is available for Jigsaw (ResNet-50), please indicate where I can find it.
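
In the meantime, a minimal sketch of how one might build such a subset from the benchmark-style .npy lists (the file names follow those used elsewhere in this README; keeping classes 0-9 is an arbitrary choice):

# Filter the benchmark-style image/label .npy lists to a 10-class subset.
import numpy as np

images = np.load("train_images.npy", allow_pickle=True)  # array of image paths
labels = np.load("train_labels.npy", allow_pickle=True)  # integer class labels

keep = np.isin(labels, np.arange(10)).reshape(-1)        # keep classes 0-9
np.save("subset_train_images.npy", images[keep])
np.save("subset_train_labels.npy", labels.reshape(-1)[keep])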

Many thanks, and looking forward to your reply!

Preprocessing COCO - 'minival', 'valminusminival'

What are minival and valminusminival for the COCO dataset? My COCO14 annotations folder looks like this:

captions_train2014.json  instances_train2014.json  person_keypoints_train2014.json
captions_val2014.json    instances_val2014.json    person_keypoints_val2014.json

Therefore I get an error when I try to preprocess COCO by running extra_scripts/create_coco_data_files.py:

partitions = ['val', 'train', 'minival', 'valminusminival']
for partition in partitions:

Do I need minival and valminusminival?
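
For context: minival and valminusminival are custom COCO 2014 splits popularized by Facebook's Detectron codebase (minival is a roughly 5k-image subset of val2014 used for quick evaluation; valminusminival is the remaining val2014 images), and their annotation JSONs are distributed separately from the standard COCO annotations. If you only need the standard splits, one workaround (an assumption on my part, not an official fix) is to trim the partition list in the script:

# keep only the standard COCO 2014 splits
partitions = ['val', 'train']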

SVM Training Time

How long does it usually take to train the SVMs for the COCO dataset?

I've run it for 36 hours and it still seems to be training.

Thanks,
Jason
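
For scale: the SVM tooling fits an independent binary classifier per class and per cost value (model files are named like cls0_cost1e-07_..., as seen in the low-shot issue below), repeated across folds, so COCO's 80 classes times a 10-value cost grid amounts to thousands of separate SVM fits on high-dimensional features. Trimming --costs_list is the simplest way to shorten the run.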

Where did you apply Jigsaw when pre-training Faster R-CNN?

Hello,

I'm trying to understand where you applied the Jigsaw augmentation for object detection:

  • To the original image, before the backbone
  • To the regions proposed by the RPN, i.e. on a feature map

Thanks in advance!
And have a nice day
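
For context, in the paper's setup Jigsaw is a pre-training pretext task applied to whole images (the network learns to solve patch-permutation puzzles); the detection benchmark then transfers those backbone weights into Faster R-CNN, so no Jigsaw augmentation is applied during detection training itself.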

Evaluating the benchmark on a simple ImageNet-pretrained model loaded from PyTorch

I'd like to plug in the default PyTorch ResNet model with ImageNet pretraining to see how it does on the benchmark. I'm using the default torchvision.models.resnet.ResNet as my model, but I don't know how to save it in the right format to be read in. In other words, your functions save_model_params(model, params_file, checkpoint_dir, model_iter) and checkpoints.load_model_from_params_file(model, ...) take a model that seems to be a ModelBuilder instance, whereas I'd like to load a model of the class torchvision.models.resnet.ResNet.

Any idea on what the best way to do this is?

Thanks,
Jason
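
One possible route, sketched under assumptions: dump the torchvision weights into a Detectron-style {'blobs': ...} pickle of numpy arrays (the format the benchmark's .pkl files appear to use), keeping in mind that the names below are PyTorch parameter names and would still need mapping to the benchmark's Caffe2 blob names:

# Dump a torchvision ResNet-50's weights to a pickle of numpy arrays.
import pickle
import torchvision

model = torchvision.models.resnet50(pretrained=True)
# state_dict() returns detached tensors, including BN running stats.
blobs = {name: t.cpu().numpy() for name, t in model.state_dict().items()}
with open("resnet50_torchvision.pkl", "wb") as f:
    pickle.dump({"blobs": blobs}, f, protocol=2)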

Can't get low accuracy for randomly extracted features

I am trying to get <10% accuracy on VOC using randomly extracted features, but I can't seem to do so. Running the following commands on a clean version of the repository gives me a Mean AP of 0.514. Am I doing this correctly?

python extra_scripts/create_voc_data_files.py \
    --data_source_dir /home/brenta/scratch/data/VOC2007/ \
    --output_dir voc07
python tools/extract_features.py \
    --config_file configs/benchmark_tasks/image_classification/voc07/caffenet_bvlc_random_extract_features.yaml \
    --data_type train \
    --output_file_prefix trainval \
    --output_dir extract_features/random \
    TRAIN.DATA_FILE voc07/train_images.npy \
    TRAIN.LABELS_FILE voc07/train_labels.npy
python tools/svm/train_svm_kfold.py \
    --data_file extract_features/random/trainval_conv1_s4k19_resize_features.npy \
    --targets_data_file extract_features/random/trainval_conv1_s4k19_resize_targets.npy \
    --costs_list "0.0000001,0.000001,0.00001,0.0001,0.001,0.01,0.1,1.0,10.0,100.0" \
    --output_path voc07_svm/svm_conv1/
python tools/svm/test_svm.py \
    --data_file extract_features/random/trainval_conv1_s4k19_resize_features.npy \
    --targets_data_file extract_features/random/trainval_conv1_s4k19_resize_targets.npy \
    --costs_list "0.0000001,0.000001,0.00001,0.0001,0.001,0.01,0.1,1.0,10.0,100.0" \
    --output_path voc07_svm/svm_conv1/
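
Note that VOC07 classification is evaluated as multi-label mean AP rather than top-1 accuracy, so its chance level is not 1/20; and even a randomly initialized conv1 preserves substantial color and edge information, so a linear SVM on such features can plausibly land well above naive chance.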

[bug] Following few-shot recipe does not seem to work at "testing SVM" stage

Hello - thanks for providing the benchmarking code =)

I've been trying to reproduce the Places205 few-shot results. I've followed the various sections of the READMEs and run into a few inconsistencies in file names, which I've tried to sensibly resolve, and eventually got everything working (I think) up until the Step 4: Testing SVM stage of the GETTING_STARTED.md guide.

Here's everything I've done, with possibly relevant inconsistencies in bold:

  1. Downloaded and renamed places205 in the desired format
  2. Converted places205 to numpy files
python extra_scripts/create_imagenet_data_files.py --data_source_dir places205 --output_dir places205-processed
  3. Generated the 5 samples at different k values
python extra_scripts/create_places_low_shot_samples.py \
    --images_data_file ssl-benchmark-output/places205/train_images.npy \
    --targets_data_file ssl-benchmark-output/places205/train_labels.npy \
    --output_path ssl-benchmark-output/places205/low_shot/ \
    --k_values "1,2,4,8,16,32,64,96" \
    --num_samples 5
  4. Extracted the training and test features with the following two commands:
# TRAIN
python tools/extract_features.py \
    --config_file configs/benchmark_tasks/low_shot_image_classification/places205/resnet50_supervised_low_shot_extract_features.yaml \
    --data_type train \
    --output_file_prefix trainval \
    --output_dir ssl-benchmark-output/extract_features/weights_init \
    TEST.PARAMS_FILE https://dl.fbaipublicfiles.com/fair_self_supervision_benchmark/models/resnet50_in1k_supervised.pkl \
    TRAIN.DATA_FILE ssl-benchmark-output/places205/train_images_sample4_k1.npy \
    TRAIN.LABELS_FILE ssl-benchmark-output/places205/train_labels_sample4_k1.npy
# TEST
python tools/extract_features.py  \
  --config_file configs/benchmark_tasks/low_shot_image_classification/places205/resnet50_supervised_low_shot_extract_features.yaml \
  --data_type test \
  --output_file_prefix test \
  --output_dir ssl-benchmark-output/extract_features/weights_init \
  TEST.PARAMS_FILE https://dl.fbaipublicfiles.com/fair_self_supervision_benchmark/models/resnet50_in1k_supervised.pkl \ 
  TEST.DATA_FILE places205-processed/val_images.npy \
  TEST.LABELS_FILE places205-processed/val_labels.npy
  5. Trained the SVM. I had to change the --data_file and --targets_data_file args to not have the s0 part in the file path:
python tools/svm/train_svm_low_shot.py \
  --data_file ssl-benchmark-output/extract_features/weights_init/trainval_res_conv1_bn_resize_features.npy \
  --targets_data_file ssl-benchmark-output/extract_features/weights_init/trainval_res_conv1_bn_resize_targets.npy \ 
  --costs_list "0.0000001,0.000001,0.00001,0.0001,0.001,0.01,0.1,1.0,10.0,100.0"  \
  --output_path ssl-benchmark-output/p205_svm_low_shot/svm_conv1/
  6. Tested the SVM. I had to change the --data_file and --targets_data_file args to not have the s0 part in the file path, and also set k_values and sample_inds to only use the first set of features trained in the previous command (there doesn't seem to be an option to test them all at once):
python tools/svm/test_svm_low_shot.py \
  --data_file ssl-benchmark-output/extract_features/weights_init/test_res_conv1_bn_resize_features.npy \
  --targets_data_file ssl-benchmark-output/extract_features/weights_init/test_res_conv1_bn_resize_targets.npy \
  --costs_list "0.0000001,0.000001,0.00001,0.0001,0.001,0.01,0.1,1.0,10.0,100.0" \
  --output_path ssl-benchmark-output/p205_svm_low_shot/svm_conv1/ \
  --k_values "4" \
  --sample_inds "0"

The first error was:

[INFO: test_svm_low_shot.py:  188]: Namespace(costs_list='0.0000001,0.000001,0.00001,0.0001,0.001,0.01,0.1,1.0,10.0,100.0', data_file='ssl-benchmark-output/extract_features/weights_init/test_res_conv1_bn_resize_features.npy', dataset='voc', generate_json=0, json_targets=None, k_values='4', output_path='ssl-benchmark-output/p205_svm_low_shot/svm_conv1/', sample_inds='1', targets_data_file='ssl-benchmark-output/extract_features/weights_init/test_res_conv1_bn_resize_targets.npy')
[INFO: test_svm_low_shot.py:   80]: Testing svm for k-values: [4] and sample_inds: [0]
[INFO: svm_helper.py:   58]: loading features and targets...
[INFO: svm_helper.py:   63]: Loaded features: (20500, 9216) and targets: (20500, 1)
[INFO: test_svm_low_shot.py:   98]: Testing SVM for costs: [1e-07, 1e-06, 1e-05, 0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625, 0.00048828125, 0.000244140625, 0.0001220703125, 6.103515625e-05, 3.0517578125e-05, 1.52587890625e-05, 7.62939453125e-06, 3.814697265625e-06, 1.9073486328125e-06]
[INFO: svm_helper.py:  144]: Testing SVM for classes: range(0, 1)
[INFO: svm_helper.py:  145]: Num classes: 1
[INFO: test_svm_low_shot.py:  125]: Test sample/k_value/cost/cls: 2/4/1e-07/0
Traceback (most recent call last):
  File "tools/svm/test_svm_low_shot.py", line 193, in <module>
    main()
  File "tools/svm/test_svm_low_shot.py", line 189, in main
    test_svm_low_shot(opts)
  File "tools/svm/test_svm_low_shot.py", line 128, in test_svm_low_shot
    with open(model_file, 'rb') as fopen:
FileNotFoundError: [Errno 2] No such file or directory: 'ssl-benchmark-output/p205_svm_low_shot/svm_conv1/cls0_cost1e-07_sample1_k4.pickle'

Looking in ssl-benchmark-output/p205_svm_low_shot/svm_conv1/, that file isn't there, but a very similarly named one is, so I renamed it to cls0_cost1e-07_sample1_k4.pickle and reran the script. Output this time:

[INFO: test_svm_low_shot.py:   80]: Testing svm for k-values: [4] and sample_inds: [0]
[INFO: svm_helper.py:   58]: loading features and targets...
[INFO: svm_helper.py:   63]: Loaded features: (20500, 9216) and targets: (20500, 1)
[INFO: test_svm_low_shot.py:   98]: Testing SVM for costs: [1e-07, 1e-06, 1e-05, 0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625, 0.00048828125, 0.000244140625, 0.0001220703125, 6.103515625e-05, 3.0517578125e-05, 1.52587890625e-05, 7.62939453125e-06, 3.814697265625e-06, 1.9073486328125e-06]
[INFO: svm_helper.py:  144]: Testing SVM for classes: range(0, 1)
[INFO: svm_helper.py:  145]: Num classes: 1
[INFO: test_svm_low_shot.py:  125]: Test sample/k_value/cost/cls: 1/4/1e-07/0
Traceback (most recent call last):
  File "tools/svm/test_svm_low_shot.py", line 193, in <module>
    main()
  File "tools/svm/test_svm_low_shot.py", line 189, in main
    test_svm_low_shot(opts)
  File "tools/svm/test_svm_low_shot.py", line 138, in test_svm_low_shot
    eval_cls_labels, eval_preds
  File "/rscratch/cjrd/dul-project/deepul_proj_2/fair_self_supervision_benchmark/tools/svm/svm_helper.py", line 99, in get_precision_recall
    preds[:, np.newaxis].astype(np.float64)
  File "<__array_function__ internals>", line 6, in hstack
  File "/rscratch/cjrd/anaconda3/envs/ssl/lib/python3.7/site-packages/numpy/core/shape_base.py", line 345, in hstack
    return _nx.concatenate(arrs, 1)
  File "<__array_function__ internals>", line 6, in concatenate
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 3 dimension(s)

I tried fiddling with the numpy dimensions at the point of failure, but it just seems to create more problems downstream.
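
For what it's worth, the final ValueError is consistent with preds already being 2-D when svm_helper.py applies np.newaxis (the targets load as (20500, 1)). A tiny repro of the numpy failure mode, independent of this repo:

import numpy as np

labels = np.zeros((5, 1))   # already 2-D, like the (20500, 1) targets
preds = np.zeros((5, 1))
# preds[:, np.newaxis] has shape (5, 1, 1); hstack of 2-D and 3-D arrays fails:
np.hstack((labels, preds[:, np.newaxis].astype(np.float64)))  # ValueError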

Any help would be very appreciated. Thank you!

[feature] Plans to support other pre-trained models?

Are there any plans to support custom pre-trained models, instead of AlexNet/ResNet-50? As background, I am working on a few new models and would like to evaluate them across all the benchmarks you've put together. Thanks!

Is there a simple demo file, containing code to get the visual representation of a single image?

Is there some way we can get a simple demo file that can be used to get the visual representation of a single image? Something like:

python3 demo.py --model=model_name --image=/path/to/image --pretrained=/path/to/pretrained/weights

This would return the visual representation of the image according to the given model. It would be very helpful for beginners like me to just play around with representations.
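
Until such a script exists, here is a self-contained sketch (not part of this repo; assumes torchvision and PyTorch >= 1.1 for nn.Identity) that extracts a single-image feature vector from an ImageNet-supervised ResNet-50:

# Stand-alone demo: get a 2048-d feature vector for one image.
import torch
import torchvision
from torchvision import transforms
from PIL import Image

model = torchvision.models.resnet50(pretrained=True)
model.fc = torch.nn.Identity()   # drop the classifier head
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
image = preprocess(Image.open("/path/to/image").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    features = model(image)      # shape: (1, 2048)
print(features.shape)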

Thanks!
