
quantization-tutorials's Introduction

Quantization-Tutorials

A bunch of coding tutorials for my Youtube videos on Neural Network Quantization.

Resnet-Eager-Mode-Quant:

How to Quantize a ResNet from Scratch! Full Coding Tutorial (Eager Mode)

This is the first coding tutorial. We take the torchvision ResNet model and quantize it entirely from scratch with the PyTorch quantization library, using Eager Mode Quantization.

We discuss common issues one can run into, as well as some interesting but tricky bugs.
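The eager-mode workflow from the video can be sketched on a tiny stand-in model (a single conv-ReLU block rather than the full ResNet, with illustrative shapes); the QuantStub/DeQuantStub placement, fusing, calibration, and conversion steps are the same ones applied to ResNet:

```python
import torch
import torch.nn as nn
from torch.ao.quantization import (
    DeQuantStub, QuantStub, convert, fuse_modules, get_default_qconfig, prepare,
)

class TinyNet(nn.Module):
    # Toy stand-in for ResNet: one conv-ReLU block wrapped in stubs.
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()      # marks where fp32 inputs become int8
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.relu = nn.ReLU()
        self.dequant = DeQuantStub()  # marks where int8 outputs return to fp32

    def forward(self, x):
        return self.dequant(self.relu(self.conv(self.quant(x))))

model = TinyNet().eval()
model.qconfig = get_default_qconfig("fbgemm")          # x86 server backend
fuse_modules(model, [["conv", "relu"]], inplace=True)  # fuse conv + relu
prepared = prepare(model)                 # inserts observers
prepared(torch.randn(8, 3, 16, 16))       # calibration forward pass
quantized = convert(prepared)             # swap in quantized modules
```

In eager mode all of this is manual: you place the stubs, name the module pairs to fuse, and call calibration yourself, which is exactly where the tutorial's "common issues" tend to show up.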

Resnet-Eager-Mode-Dynamic-Quant:

TODO

In this tutorial, we do dynamic quantization on a ResNet model. We look at how dynamic quantization works, what the default settings are in PyTorch, and discuss how it differs from static quantization.
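The dynamic API itself is a one-liner; a minimal sketch, shown on a toy two-layer MLP rather than ResNet (PyTorch's dynamic quantization targets Linear/LSTM-style layers by default):

```python
import torch
import torch.nn as nn

# Toy float model. Dynamic quantization quantizes weights ahead of time
# and quantizes activations on the fly, per batch, so it needs no
# calibration data and no quant/dequant stubs.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8  # only Linear layers are quantized
)
out = quantized(torch.randn(2, 16))  # fp32 in, fp32 out
```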

How to do FX Graph Mode Quantization (PyTorch ResNet Coding tutorial)

In this tutorial series, we use Torch's FX Graph mode quantization to quantize a ResNet. In the first video, we look at the Directed Acyclic Graph (DAG), and see how the fusing, placement of quantstubs and FloatFunctionals all happen automatically. In the second, we look at some of the intricacies of how quantization interacts with the GraphModule. In the third and final video, we look at some more advanced techniques for manipulating and traversing the graph, and use these to discover an alternative to forward hooks, and for fusing BatchNorm layers into their preceding Convs.

How to do FX Graph Mode Quantization: FX Graph Mode Quantization Coding tutorial - Part 1/3
How does Graph Mode Affect Quantization? FX Graph Mode Quantization Coding tutorial - Part 2/3
Advanced PyTorch Graph Manipulation: FX Graph Mode Quantization Coding tutorial - Part 3/3
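The FX workflow described above can be sketched as follows (a toy conv-ReLU model stands in for ResNet); note that, unlike eager mode, fusing and quant/dequant placement are derived automatically from the traced graph:

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx

# Toy stand-in model: no QuantStubs needed, FX traces the forward pass.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU()).eval()
example = torch.randn(1, 3, 8, 8)

qconfig_mapping = get_default_qconfig_mapping("fbgemm")
prepared = prepare_fx(model, qconfig_mapping, example_inputs=(example,))
prepared(example)                  # calibration forward pass
quantized = convert_fx(prepared)   # a GraphModule with quantized ops

print(quantized.graph)             # inspect the resulting DAG of nodes
```

Printing `quantized.graph` (or `quantized.graph.print_tabular()`) is the entry point for the graph traversal and manipulation techniques covered in part 3.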

Quantization Aware Training

In this tutorial we look at how to do Quantization Aware Training (QAT) on an FX Graph Mode quantized ResNet. We build a small training loop with a mini custom data loader. We also generalise the evaluate function we've been using in these tutorials so that it works on other images. We then go looking for, and find, some of the dangers of overfitting.

Quantization Aware Training (QAT) With a Custom DataLoader: Beginner's Tutorial to Training Loops
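The QAT recipe can be sketched as follows; a toy model and random tensors stand in for ResNet and the custom data loader, but the prepare-train-convert sequence is the same:

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qat_qconfig_mapping
from torch.ao.quantization.quantize_fx import convert_fx, prepare_qat_fx

# Toy stand-in model: conv on 8x8 input -> 6x6 feature maps -> 2 classes.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Flatten(), nn.Linear(8 * 6 * 6, 2)
)
example = torch.randn(1, 3, 8, 8)

model.train()  # QAT prep expects training mode
prepared = prepare_qat_fx(
    model, get_default_qat_qconfig_mapping("fbgemm"), example_inputs=(example,)
)

# Mini training loop on random data, standing in for the custom DataLoader.
opt = torch.optim.SGD(prepared.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for _ in range(3):
    x, y = torch.randn(4, 3, 8, 8), torch.randint(0, 2, (4,))
    opt.zero_grad()
    loss_fn(prepared(x), y).backward()  # gradients flow through fake-quant ops
    opt.step()

prepared.eval()
quantized = convert_fx(prepared)  # final int8 model
```

During the loop, the inserted fake-quantize modules simulate int8 rounding in the forward pass while gradients flow via the straight-through estimator, which is what lets the weights adapt to quantization noise.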

Cross Layer Equalization (CLE)

In this tutorial, we look at Cross-Layer Equalization, a classic data-free method for improving the quantization of one's models. We use a graph-tracing method to find all of the layers we can do CLE on, do CLE, evaluate the results, and then visualize what's happening inside the model.

Cross Layer Equalization: Everything You Need to Know
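The core rescaling step of CLE (from Nagel et al.'s data-free quantization paper) can be sketched in a few lines. This is a simplified stand-in for the graph-traced version in the video, shown on two toy fully-connected layers joined by a ReLU:

```python
import torch

def cross_layer_equalize(w1, b1, w2):
    # Per-channel weight ranges: r1[i] over output channel i of layer 1,
    # r2[i] over input channel i of layer 2 (they meet at the same channel).
    r1 = w1.abs().amax(dim=1)
    r2 = w2.abs().amax(dim=0)
    s = torch.sqrt(r1 / r2)  # s_i = sqrt(r1_i / r2_i)
    # After rescaling, both ranges equal sqrt(r1_i * r2_i). The network
    # function is unchanged because ReLU(x / s) == ReLU(x) / s for s > 0.
    return w1 / s[:, None], b1 / s, w2 * s[None, :], s

# Two toy layers standing in for a conv pair found by graph tracing:
w1, b1, w2 = torch.randn(4, 8), torch.randn(4), torch.randn(3, 4)
w1e, b1e, w2e, s = cross_layer_equalize(w1, b1, w2)
```

Equalizing the per-channel ranges like this is what makes a single per-tensor quantization scale fit all channels reasonably well, which is where the accuracy gain comes from.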

quantization-tutorials's People

Contributors

komment-ai[bot], oscarsavolainendr


quantization-tutorials's Issues

I love your work

I'm a student from Vietnam, and I love this project.

Best regards

Question about PTQ fake quantized FP32 model to INT8

Hi Oscar,

First of all many thanks for your tutorials, they are incredibly useful to learn quantization and get hands-on experience on this!

I have the following situation and perhaps you could enlighten me a bit, since I cannot seem to find a connection between your tutorials and what I want to achieve.

My goal is to be able to quantize a ViT to INT8, any type (for example, DeiT-tiny would be more than enough). To avoid boilerplate and save training time, I have found methods that do PTQ on most ViT versions. An example of these methods is FQ-ViT, which performs PTQ on ViTs and yields a quantized model in INT8 or INT4 (but fake quantized to FP32).

The problem comes when I want to continue the path from the fake quantized model to an actual INT8 model. Since the method already yields a fake quantized model in FP32, do I need to just cast all weights and activation values to INT8? Or do I need to modify the code, add the quantization stubs, re-run the PTQ calibration to build the qconfigs and then save the resulting model?
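Fake quantization simulates the int8 grid while staying in fp32: every value has already been through a quantize-dequantize round trip. A minimal sketch of that round trip (the scale and zero-point here are illustrative values, not FQ-ViT's):

```python
import torch

def fake_quantize(x, scale, zero_point, qmin=-128, qmax=127):
    # Quantize to the int8 grid, then immediately dequantize: the result
    # is still an fp32 tensor, but every value is representable in int8.
    q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale

x = torch.randn(10)
xq = fake_quantize(x, scale=0.05, zero_point=0)  # fp32, on the int8 grid
```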

I would be very happy to hear how you would approach this problem, as there are very few resources on the internet regarding quantizing ViTs.

Best,

Fabrizio
