I'm looking at <a href="https://github.com/Xilinx/brevitas/blob/master/src/brevitas_ex

What is role of scaling_per_output_channel in QuantReLU? (brevitas, closed)

phixerino commented on September 26, 2024

What is the role of `scaling_per_output_channel` in `QuantReLU`?

Comments (1)

Giuseppe5 commented on September 26, 2024

Similar to what happens for weight scaling, you can have one scale factor for the entire tensor being quantized, or one scale factor per channel of that tensor. Other ways of slicing the tensor to compute scale factors are also possible, although arguably less common (e.g., per-row, per-group, etc.).
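To make the distinction concrete, here is a minimal sketch in plain Python (not Brevitas code). It assumes an unsigned 8-bit quantizer and a scale derived from the maximum value, which is only one of several ways a scale can be obtained; the activation values and the `max_scale` helper are made up for illustration.

```python
def max_scale(values, bit_width=8):
    """Scale that maps the largest value onto the top unsigned integer level.

    Assumption: max-based scaling; real quantizers may use other statistics
    or learned parameters instead.
    """
    return max(values) / (2 ** bit_width - 1)


# Hypothetical post-ReLU activations: 3 channels, 4 values each.
activations = [
    [0.0, 0.1, 0.2, 0.4],      # channel 0: small dynamic range
    [0.0, 1.0, 2.0, 5.1],      # channel 1: medium dynamic range
    [0.0, 10.0, 20.0, 51.0],   # channel 2: large dynamic range
]

# Per tensor: a single scale shared by every channel,
# dictated by the largest value anywhere in the tensor.
per_tensor_scale = max_scale([v for ch in activations for v in ch])

# Per channel: one scale for each channel, fitted to that channel's own range.
per_channel_scales = [max_scale(ch) for ch in activations]

print("per-tensor scale:  ", per_tensor_scale)
print("per-channel scales:", per_channel_scales)
```

With per-tensor scaling, the channel with the largest range sets the scale for all channels. With per-channel scaling, each channel gets a scale matched to its own range.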

The choice between per-tensor and per-channel scaling depends on the network topology, the hardware constraints of the device where you plan to execute your network, and other factors.

As a rule of thumb, the finer the granularity of your scale factors, the better the final accuracy of the quantized network. The trade-off is that the computational cost and memory usage of your network also increase, since scale factors are stored in high precision.
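The trade-off can be illustrated with a small plain-Python sketch (again, not Brevitas code). It assumes unsigned 8-bit uniform quantization with max-based scales; the activation values and the helper names `quantize` and `mean_abs_error` are hypothetical. It compares the mean reconstruction error of the two granularities and counts how many high-precision scale factors each one has to store.

```python
def quantize(value, scale, bit_width=8):
    """Unsigned uniform quantization followed by dequantization."""
    levels = 2 ** bit_width - 1
    q = min(max(round(value / scale), 0), levels)  # round, then clamp to [0, levels]
    return q * scale


def mean_abs_error(channels, scales):
    """Mean absolute error between the original and quantized values."""
    errors = [
        abs(v - quantize(v, s))
        for ch, s in zip(channels, scales)
        for v in ch
    ]
    return sum(errors) / len(errors)


# Hypothetical post-ReLU activations with very different ranges per channel.
activations = [
    [0.03, 0.11, 0.27, 0.40],
    [0.5, 1.3, 2.9, 5.1],
    [7.0, 13.0, 29.0, 51.0],
]

levels = 2 ** 8 - 1
per_tensor_scale = max(v for ch in activations for v in ch) / levels
per_channel_scales = [max(ch) / levels for ch in activations]

# Same quantizer, two granularities.
err_tensor = mean_abs_error(activations, [per_tensor_scale] * len(activations))
err_channel = mean_abs_error(activations, per_channel_scales)

# Number of high-precision scale factors that must be stored and applied.
num_scales_tensor = 1
num_scales_channel = len(activations)

print(f"per-tensor:  error={err_tensor:.5f}, scales stored={num_scales_tensor}")
print(f"per-channel: error={err_channel:.5f}, scales stored={num_scales_channel}")
```

On this toy data the per-channel variant reconstructs the small-range channels more faithfully, at the price of storing one scale per channel instead of a single one.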
