hwidong-na / kd_methods_with_tf

This project forked from sseung0703/kd_methods_with_tf


Knowledge distillation methods implemented with TensorFlow (currently 11 (+1) methods; more will be added).

License: MIT License

Python 91.81% Jupyter Notebook 8.19%

kd_methods_with_tf's Introduction

Knowledge Distillation Methods with Tensorflow

Knowledge distillation is a method for enhancing a student network with knowledge from a teacher network. New knowledge distillation methods are proposed every year, but each paper runs experiments with different networks and compares against different baselines. Moreover, each method is implemented by its own author, so a new researcher who wants to study knowledge distillation has to find or implement all of the methods, which is tough work. To reduce this burden, I publish some code adapted from my research code. I will keep updating the code and adding knowledge distillation algorithms, and everything will be implemented using TensorFlow.

An upgraded version of this repo will be available at this link

Implemented Knowledge Distillation Methods

Please check the details of each category in MHGD, and if you find the above categorization useful, please consider citing the following paper.

@inproceedings{GraphKD,
  title = {Graph-based Knowledge Distillation by Multi-head Attention Network},
  author = {Seunghyun Lee and Byung Cheol Song},
  booktitle = {British Machine Vision Conference (BMVC)},
  year = {2019}
}

Response-based Knowledge

Knowledge defined by the neural response of a hidden layer or the output layer of the network
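The classic response-based method is the Soft-logits approach (the "Soft-logits" row in the results table): the student matches the teacher's temperature-softened output distribution. Below is a minimal NumPy sketch of that loss, not the repo's TensorFlow implementation; the function names and the temperature value `T=4.0` are illustrative choices.

```python
import numpy as np

def softmax(z, T=1.0):
    # temperature-softened softmax, computed stably
    e = np.exp((z - z.max(axis=-1, keepdims=True)) / T)
    return e / e.sum(axis=-1, keepdims=True)

def soft_logits_loss(teacher_logits, student_logits, T=4.0):
    # KL divergence between softened teacher and student distributions,
    # scaled by T^2 so gradients keep a comparable magnitude across temperatures
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return (T ** 2) * np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1))
```

In training, this term is typically mixed with the ordinary cross-entropy on ground-truth labels.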

Multi-connection Knowledge

Knowledge whose quantity is increased by sensing several points of the teacher network
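Attention Transfer (the "AT" row in the table) is a representative multi-connection method: it matches spatial attention maps computed from intermediate feature maps at several matched points of the teacher and student. A NumPy sketch of the idea, assuming NHWC feature maps of matching spatial size (not the repo's code):

```python
import numpy as np

def attention_map(feat):
    # feat: (N, H, W, C); spatial attention = sum of squared activations
    # over channels, flattened and L2-normalized per sample
    a = np.sum(feat ** 2, axis=-1).reshape(feat.shape[0], -1)
    return a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-8)

def at_loss(teacher_feats, student_feats):
    # average squared distance between attention maps over several
    # matched sensing points along the two networks
    return np.mean([
        np.mean(np.sum((attention_map(t) - attention_map(s)) ** 2, axis=1))
        for t, s in zip(teacher_feats, student_feats)
    ])
```

Sensing several points (one pair per residual group, for instance) is what gives this family more knowledge than a single output-layer match.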

Shared-representation Knowledge

Knowledge defined by the shared representation between two feature maps
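FSP (the "FSP" row in the table) is a shared-representation method: the knowledge is the Gram-like matrix between the feature maps of two layers, which the student mimics. A NumPy sketch, assuming the two feature maps share spatial size (in practice pooling is used to match them); this is an illustration, not the repo's implementation:

```python
import numpy as np

def fsp_matrix(f1, f2):
    # f1: (N, H, W, C1), f2: (N, H, W, C2) -> (N, C1, C2)
    # inner product of the two layers' features over spatial positions
    n, h, w, c1 = f1.shape
    c2 = f2.shape[-1]
    a = f1.reshape(n, h * w, c1)
    b = f2.reshape(n, h * w, c2)
    return np.einsum('npc,npd->ncd', a, b) / (h * w)

def fsp_loss(teacher_pairs, student_pairs):
    # mean squared error between teacher and student FSP matrices,
    # averaged over the chosen layer pairs
    return np.mean([
        np.mean((fsp_matrix(*t) - fsp_matrix(*s)) ** 2)
        for t, s in zip(teacher_pairs, student_pairs)
    ])
```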

Relational Knowledge

Knowledge defined by intra-data relations
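RKD (the "RKD" row in the table) illustrates relational knowledge: instead of matching features directly, the student matches the structure of pairwise distances among embeddings within a batch. A NumPy sketch of the distance-wise term (illustrative only; the repo's TensorFlow code may differ in details such as the Huber delta):

```python
import numpy as np

def pdist(emb):
    # pairwise Euclidean distances among embeddings, normalized by their
    # mean so the relation is invariant to the embedding scale
    d = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    mask = ~np.eye(len(emb), dtype=bool)
    return d / (d[mask].mean() + 1e-8)

def huber(x, delta=1.0):
    a = np.abs(x)
    return np.where(a <= delta, 0.5 * x ** 2, delta * (a - 0.5 * delta))

def rkd_distance_loss(teacher_emb, student_emb):
    # Huber loss between the normalized pairwise-distance matrices
    return np.mean(huber(pdist(teacher_emb) - pdist(student_emb)))
```

Because of the mean normalization, a student whose embeddings are a scaled copy of the teacher's incurs (almost) zero loss: only the relational structure matters.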

Experimental Results

The table below shows sample results using ResNet trained on CIFAR-100.

I use the same hyper-parameters for training each network and tune only the hyper-parameters of each distillation algorithm, so the results may not be optimal. All numerical values are averages over five trials.

Network architecture

The teacher network is ResNet32 and the student is ResNet8. The student network is well converged (neither over- nor under-fitted) so that each distillation algorithm's performance can be evaluated precisely. Note that the implemented ResNet has doubled depth.

Training/Validation accuracy

| Methods     | Last Accuracy | Best Accuracy |
|-------------|---------------|---------------|
| Student     | 71.76         | 71.92         |
| Teacher     | 78.96         | 79.08         |
| Soft-logits | 71.79         | 72.08         |
| FitNet      | 72.74         | 72.96         |
| AT          | 72.31         | 72.60         |
| FSP         | 72.65         | 72.91         |
| DML         | 73.27         | 73.47         |
| KD-SVD      | 73.68         | 73.78         |
| KD-EID      | 73.84         | 74.07         |
| FT          | 73.35         | 73.50         |
| AB          | 73.08         | 73.41         |
| RKD         | 73.40         | 73.48         |
| MHGD        | 73.98         | 74.30         |

Plan to do

  • Upgrade this repo to TF 2.0. :)
