unlimited_labeled_data_project_strong_augmen's Introduction

Thanh's Notes

This is a fork of the FixMatch-Pytorch repo from https://github.com/kekmodel/FixMatch-pytorch

Preliminary CIFAR-10 Top-1 Accuracy (%)

| #Labels per class | 10 | 40 | 400 |
|---|---|---|---|
| Thanh's trials | 92.37 | 94.03 | 95.09 |

Weights and log files are in vision38:/data/tvu/classes/fixMatch-pytorch/results/

Plan

Realisticity Analysis

  • Hypothesis 1: strongly and weakly augmented images differ in realisticity, and we can measure and distinguish this using out-of-distribution (OOD) detection methods
  • Hypothesis 2: realisticity varies even among strongly augmented images, e.g. we expect posterization to be more unrealistic than flip+translate+crop

Improving SSL

  • Hypothesis 3: we can improve SSL by training with appropriate/various levels of unrealisticity of strong augmentation

TODO

  • Test run this repo with CIFAR-10

Realisticity Analysis

  • Integrate OOD detection for measuring the "realisticity" of augmented images, e.g. https://arxiv.org/abs/1912.03263 and https://arxiv.org/abs/1905.11001
  • Manually check the OOD score and qualitative realisticity of sample augmented images. See the save-aug branch for extracting sample augmented images (before normalization)
  • Generate and save all augmented variations as used in one epoch of FixMatch
  • Analyze the OOD scores of this distribution of augmented images to test hypotheses 1 and 2
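The analysis items above could be prototyped roughly as follows. This is only a sketch under stated assumptions: the score used here is the log-sum-exp of classifier logits (one common energy-style OOD score, not necessarily the method this project will adopt), the logits are made-up placeholders rather than real model outputs, and `energy_score` and `auroc` are hypothetical helper names.

```python
import math


def energy_score(logits):
    """Log-sum-exp of the logits, computed stably.

    Assumption: a higher value is read as "more in-distribution".
    """
    m = max(logits)
    return m + math.log(sum(math.exp(z - m) for z in logits))


def auroc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. 0.5 means the two groups are
    indistinguishable by this score; 1.0 means perfectly separable.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Placeholder logits (hypothetical, not measured): one row per image.
weak_logits = [[9.0, 0.5, 0.1], [8.0, 1.0, 0.2], [7.5, 0.3, 0.4]]
strong_logits = [[2.0, 1.5, 1.0], [1.0, 0.8, 0.9], [8.5, 0.2, 0.1]]

weak_scores = [energy_score(z) for z in weak_logits]
strong_scores = [energy_score(z) for z in strong_logits]

# Hypothesis 1 predicts a separation well above 0.5.
separation = auroc(weak_scores, strong_scores)
print(f"AUROC(weak vs. strong) = {separation:.3f}")
```

For hypothesis 2, the same `auroc` comparison could be run between groups of strongly augmented images, for example posterized images against flip+translate+crop images.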

Improving SSL

  • Train FixMatch with various levels of unrealistic augmentation
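One way to think about "levels" of unrealistic augmentation is a single severity knob per operation. The sketch below uses posterization as the example, assuming 8-bit RGB pixels stored as nested Python lists. The `posterize` and `distinct_levels` helpers and the sample image are illustrative only and are not code from this repo.

```python
def posterize(image, bits):
    """Keep only the `bits` most significant bits of each 8-bit channel.

    bits=8 leaves the image unchanged; bits=1 is the most severe setting.
    """
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    mask = 0xFF & ~((1 << (8 - bits)) - 1)
    return [[tuple(c & mask for c in pixel) for pixel in row] for row in image]


def distinct_levels(image):
    """Count the distinct channel values that remain in the image."""
    return len({c for row in image for pixel in row for c in pixel})


# Hypothetical 2x2 RGB image used only for illustration.
image = [
    [(12, 200, 77), (255, 128, 64)],
    [(33, 90, 181), (5, 250, 19)],
]

# Sweep the severity knob from mild (8 bits) to severe (1 bit).
for bits in (8, 4, 2, 1):
    out = posterize(image, bits)
    print(f"bits={bits}: {distinct_levels(out)} distinct channel values")
```

A training sweep could then fix one severity setting per run and compare the resulting accuracy, which is the comparison hypothesis 3 calls for.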

(below is the README from the original repo)

FixMatch

This is an unofficial PyTorch implementation of FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence. The official TensorFlow implementation is here.

This code implements only the RandAugment variant of FixMatch. Currently, only experiments on CIFAR-10 and CIFAR-100 are available.
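For readers new to the method, here is a minimal pure-Python sketch of the FixMatch unlabeled loss as described in the paper. The weakly augmented view yields a pseudo-label, which is used only when its confidence reaches a threshold, and the strongly augmented view is trained to predict it. The function names and toy logits are illustrative assumptions, and the 0.95 threshold is the paper's default. This is not the code in `train.py`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fixmatch_unlabeled_loss(weak_logits, strong_logits, threshold=0.95):
    """Masked cross-entropy, averaged over the whole unlabeled batch.

    Examples whose weak-view confidence is below `threshold`
    contribute zero loss but still count in the denominator.
    """
    total = 0.0
    for weak, strong in zip(weak_logits, strong_logits):
        probs = softmax(weak)
        confidence = max(probs)
        if confidence < threshold:
            continue  # pseudo-label not confident enough: masked out
        pseudo_label = probs.index(confidence)
        total += -math.log(softmax(strong)[pseudo_label])
    return total / len(weak_logits)


# Toy batch (hypothetical logits): the first example is confident, the second is not.
weak = [[10.0, 0.0, 0.0], [0.1, 0.0, 0.0]]
strong = [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
loss = fixmatch_unlabeled_loss(weak, strong)
print(f"unlabeled loss = {loss:.4f}")
```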

Requirements

  • Python 3.6+
  • PyTorch 1.4
  • torchvision 0.5
  • tensorboard
  • tqdm
  • numpy
  • apex (optional)

Usage

Train

Train the model with 4000 labeled examples from the CIFAR-10 dataset:

python train.py --dataset cifar10 --num-labeled 4000 --arch wideresnet --batch-size 64 --lr 0.03 --seed 5 --out results/[email protected]

Train the model with 10000 labeled examples from the CIFAR-100 dataset using DistributedDataParallel (DDP):

python -m torch.distributed.launch --nproc_per_node 4 ./train.py --dataset cifar100 --num-labeled 10000 --arch wideresnet --batch-size 16 --lr 0.03 --out results/cifar100@10000

* When using DDP, do not use a seed.

Monitoring training progress

tensorboard --logdir=<your out_dir>

Results (Accuracy)

CIFAR10

| #Labels | 40 | 250 | 4000 |
|---|---|---|---|
| Paper (RA) | 86.19 ± 3.37 | 94.93 ± 0.65 | 95.74 ± 0.05 |
| This code | 92.92 | 94.13 | 95.33 |
| Acc. curve | link | link | link |

CIFAR100

| #Labels | 400 | 2500 | 10000 |
|---|---|---|---|
| Paper (RA) | 51.15 ± 1.75 | 71.71 ± 0.11 | 77.40 ± 0.12 |
| This code | - | - | - |
| Acc. curve | - | - | - |

References

@article{sohn2020fixmatch,
    title={FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence},
    author={Kihyuk Sohn and David Berthelot and Chun-Liang Li and Zizhao Zhang and Nicholas Carlini and Ekin D. Cubuk and Alex Kurakin and Han Zhang and Colin Raffel},
    journal={arXiv preprint arXiv:2001.07685},
    year={2020},
}

Contributors

floralzhao, kekmodel, thanhmvu
