License: MIT License


DSD2: Can We Dodge Sparse Double Descent and Compress the Neural Network Worry-Free?

arXiv

This repository implements the key experiments of the following paper: DSD2: Can We Dodge Sparse Double Descent and Compress the Neural Network Worry-Free?

Libraries

  • Python = 3.10
  • PyTorch = 1.13
  • Torchvision = 0.14
  • Numpy = 1.23
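Assuming a pip-based environment (the package names are the standard PyPI ones; the exact pins are only what the list above states), the dependencies can be installed as:

```shell
# Hypothetical setup; adjust the pins or CUDA wheels to your platform.
pip install "torch==1.13.*" "torchvision==0.14.*" "numpy==1.23.*"

# Sanity-check the installed versions.
python -c "import torch, torchvision, numpy; print(torch.__version__, torchvision.__version__, numpy.__version__)"
```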

Usage

In practice, you can start from the defaults and override individual hyperparameters as needed. To view the hyperparameters of each subcommand, run:

python main.py [subcommand] [...] --help

Example Runs

To run a ResNet-18 on CIFAR-10 with 10% label noise, a batch size of 128, a learning rate of 0.1, and a weight decay of 1e-4 for 160 epochs:

python main.py --data_path YOUR_PATH_TO_CIFAR --lr 0.1 --batch_size 128 --weight_decay 1e-4 --epochs 160

To run a VGG-like model on CIFAR-100 with 20% label noise, a batch size of 128, a learning rate of 0.1, and a weight decay of 1e-4 for 160 epochs:

python main.py --model VGG-like --dataset CIFAR-100 --data_path YOUR_PATH_TO_CIFAR --lr 0.1 --batch_size 128 --weight_decay 1e-4 --epochs 160 --amount_noise 0.2

To run a VGG-like student distilled from a ResNet-18 teacher on CIFAR-10 with 50% label noise:

python kd.py --teacher_model ResNet-18 --path_to_teacher_model YOUR_PATH_TO_TEACHER_MODEL --student_model VGG-like --dataset CIFAR-10 --data_path YOUR_PATH_TO_CIFAR --lr 0.1 --batch_size 128 --weight_decay 1e-4 --epochs 160 --amount_noise 0.5
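kd.py distills knowledge from the teacher into the student. As a hedged PyTorch sketch, the usual objective for such a setup is the Hinton-style distillation loss below; the temperature T, the weight alpha, and the function name are illustrative assumptions, not the repository's actual code:

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.9):
    """Soft KL term against the teacher plus a hard cross-entropy term."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale gradients to be comparable across temperatures
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1 - alpha) * hard
```

If the student matches the teacher exactly, the soft term vanishes, so the loss reduces to the (down-weighted) cross-entropy on the hard labels.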

To compute the entropy of a pruned ResNet-18 on CIFAR-10:

python entropy.py --model_path YOUR_PATH_TO_PRUNED_MODELS --dataset CIFAR-10 --data_path YOUR_PATH_TO_CIFAR --arch ResNet-18
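As a generic illustration of the kind of quantity entropy.py evaluates (not necessarily the repository's exact definition), the mean Shannon entropy of a model's softmax predictions over a data loader can be computed as:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def mean_prediction_entropy(model, loader, device="cpu"):
    """Average Shannon entropy (in nats) of the model's softmax outputs."""
    model.eval()
    total, count = 0.0, 0
    for x, _ in loader:
        probs = F.softmax(model(x.to(device)), dim=1)
        # Clamp to avoid log(0) for saturated predictions.
        ent = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
        total += ent.sum().item()
        count += ent.numel()
    return total / count
```

A maximally uncertain 10-class model yields ln(10) ≈ 2.303 nats; confident (low-entropy) predictions drive the average toward 0.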

Citation

If you find this work useful for your research, please cite the following paper:

@inproceedings{quetu2024dsd2,
  title={DSD$^2$: Can We Dodge Sparse Double Descent and Compress the Neural Network Worry-Free?},
  author={Qu{\'e}tu, Victor and Tartaglione, Enzo},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={38},
  number={13},
  pages={14749--14757},
  year={2024}
}

