
MMA - Combining MixMatch and Active Learning for Better Accuracy with Fewer Labels

Code for the paper: "Combining MixMatch and Active Learning for Better Accuracy with Fewer Labels" by Shuang Song, David Berthelot, and Afshin Rostamizadeh.

This is not an officially supported Google product.

Setup

Install dependencies

sudo apt install python3-dev python3-virtualenv python3-tk imagemagick
virtualenv -p python3 --system-site-packages env3
. env3/bin/activate
pip install -r requirements.txt

Install datasets

export ML_DATA="path to where you want the datasets saved"
# Download datasets (the empty CUDA_VISIBLE_DEVICES hides all GPUs, so the script runs on CPU)
CUDA_VISIBLE_DEVICES= ./scripts/create_datasets.py
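
The dataset script depends on ML_DATA being exported first. As a minimal sketch (not part of the repository), the check below fails early when the variable is missing and creates the target directory otherwise. The helper name `resolve_ml_data` is hypothetical and used only for illustration.

```python
import os
from pathlib import Path


def resolve_ml_data(environ=os.environ) -> Path:
    """Return the dataset directory named by ML_DATA, creating it if needed.

    Hypothetical helper for illustration; the repository does not define it.
    """
    value = environ.get("ML_DATA", "").strip()
    if not value:
        raise RuntimeError("ML_DATA is not set; export it before creating datasets")
    path = Path(value).expanduser()
    # Create the directory (and any missing parents) so downloads have a target.
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Calling `resolve_ml_data()` with no arguments reads the real environment; passing a plain dict makes the check easy to try without touching your shell.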

Running

The parameters used in the paper (the batch for active learning and the number of training iterations between queries) are hard-coded in mixmatch_lineargrow.py. They are documented there and can be changed.

The command below runs the CIFAR-10 experiment that uses diff on two augmentations of each sample as the uncertainty measure, with no diversification method. That is, it trains MixMatch with 32 filters on CIFAR-10 shuffled with seed=1, starting from 250 randomly selected labeled samples and querying 50 samples at a time with diff2.aug-direct until 4000 samples are labeled:

CUDA_VISIBLE_DEVICES=0 python mixmatch_lineargrow.py --filters=32 --w_match=75 --beta=0.75 --dataset=cifar10.1@250_train50000 --grow_size=50 --grow_by=diff2.aug-direct
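
To make the flags above easier to read, here is a small sketch that splits the `--dataset` value into its parts and counts how many query rounds the run implies. The layout `<name>.<seed>@<initial_labels>_train<train_size>` is an assumption inferred from the description above (seed=1, 250 starting samples); in particular, reading `train50000` as a training-set size is a guess. The names `DatasetSpec`, `parse_dataset_spec`, and `query_rounds` are hypothetical and do not come from the repository.

```python
import re
from typing import NamedTuple


class DatasetSpec(NamedTuple):
    name: str
    seed: int
    initial_labels: int
    train_size: int


# Assumed layout: <name>.<seed>@<initial_labels>_train<train_size>
_SPEC_RE = re.compile(
    r"^(?P<name>\w+)\.(?P<seed>\d+)@(?P<labels>\d+)_train(?P<train>\d+)$"
)


def parse_dataset_spec(spec: str) -> DatasetSpec:
    """Split a dataset string such as 'cifar10.1@250_train50000' into fields."""
    match = _SPEC_RE.match(spec)
    if match is None:
        raise ValueError(f"unrecognized dataset spec: {spec!r}")
    return DatasetSpec(
        match["name"], int(match["seed"]), int(match["labels"]), int(match["train"])
    )


def query_rounds(initial_labels: int, target_labels: int, grow_size: int) -> int:
    """Number of queries needed to grow the labeled set to the target (rounded up)."""
    remaining = max(target_labels - initial_labels, 0)
    return -(-remaining // grow_size)  # ceiling division


spec = parse_dataset_spec("cifar10.1@250_train50000")
print(spec.name, spec.seed, spec.initial_labels)
print(query_rounds(spec.initial_labels, 4000, 50))  # (4000 - 250) / 50 = 75
```

Under these assumptions, the example run performs 75 queries of 50 samples each to go from 250 to 4000 labeled samples.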

Monitoring training progress

Point TensorBoard at the training folder (by default, --train_dir=./MMA_exp) to monitor training:

tensorboard.sh --port 6007 --logdir MMA_exp

Citing this work

@misc{song2019combining,
      title={Combining MixMatch and Active Learning for Better Accuracy with Fewer Labels},
      author={Shuang Song and David Berthelot and Afshin Rostamizadeh},
      year={2019},
      eprint={1912.00594},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}


mma's Issues

training procedure is stuck when labeled data grows to 1650

Hi,

While reproducing the example in the README, I ran into a problem: the whole training procedure gets stuck after it finishes training grow_nimg iterations with 1650 labeled samples. Even pressing Ctrl+C does not terminate it.
Have you encountered this problem before?
I'm not sure whether the problem comes from the code itself or is caused by my machine.

Thanks.

Security Policy violation Binary Artifacts

This issue was automatically created by Allstar.

Security Policy Violation
Project is out of compliance with Binary Artifacts policy: binaries present in source code

Rule Description
Binary Artifacts are an increased security risk in your repository. Binary artifacts cannot be reviewed, allowing the introduction of possibly obsolete or maliciously subverted executables. For more information see the Security Scorecards Documentation for Binary Artifacts.

Remediation Steps
To remediate, remove the generated executable artifacts from the repository.

Artifacts Found

  • libml/__pycache__/data.cpython-37.pyc
  • libml/__pycache__/utils.cpython-37.pyc
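
As a rough sketch of how a maintainer might locate such files before removing them, the snippet below walks a checkout and lists every compiled .pyc file it finds. The function name `find_bytecode_artifacts` is hypothetical. Actually removing the files from version control (for example with `git rm --cached`) and adding an ignore rule are left to the maintainer.

```python
from pathlib import Path
from typing import List


def find_bytecode_artifacts(repo_root: str) -> List[Path]:
    """Return all .pyc files under repo_root, relative to it and sorted.

    Hypothetical helper for illustration; it only lists files and deletes nothing.
    """
    root = Path(repo_root)
    found = [p.relative_to(root) for p in root.rglob("*.pyc") if p.is_file()]
    return sorted(found)


if __name__ == "__main__":
    # Scan the current directory and print one artifact per line.
    for artifact in find_bytecode_artifacts("."):
        print(artifact)
```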

Additional Information
This policy is drawn from Security Scorecards, which is a tool that scores a project's adherence to security best practices. You may wish to run a Scorecards scan directly on this repository for more details.


Allstar has been installed on all Google managed GitHub orgs. Policies are gradually being rolled out and enforced by the GOSST and OSPO teams. Learn more at http://go/allstar

This issue will auto resolve when the policy is in compliance.

Issue created by Allstar. See https://github.com/ossf/allstar/ for more information. For questions specific to the repository, please contact the owner or maintainer.
