Comments (1)
As with weight scaling, you can use a single scale factor for the entire tensor being quantized, or one per channel of that tensor. Other ways of slicing the tensor to compute scale factors are also possible (e.g., per-row, per-group), although arguably less common.
The choice between per-tensor and per-channel scaling depends on the network topology, the hardware constraints of the device where you plan to run the network, and other factors.
As a rule of thumb, the finer the granularity of your scale factors, the better the final accuracy of the quantized network. The trade-off is that computational cost and memory usage also increase, since scale factors are stored in high precision.
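To make the granularity trade-off concrete, here is a minimal sketch in plain Python. It does not use the Brevitas API; the helper names (`absmax_scale`, `quantize`, `mean_abs_error`), the symmetric abs-max scaling rule, and the toy values are all assumptions chosen for illustration.

```python
def absmax_scale(values, bit_width=8):
    """Symmetric scale: the largest magnitude maps to the top signed integer level."""
    qmax = 2 ** (bit_width - 1) - 1
    return max(abs(v) for v in values) / qmax


def quantize(values, scale, bit_width=8):
    """Round to the integer grid, clamp, and map back to floats (fake quantization)."""
    qmax = 2 ** (bit_width - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def mean_abs_error(reference, approx):
    return sum(abs(r - a) for r, a in zip(reference, approx)) / len(reference)


# Toy activation tensor (hypothetical values): two channels with very different ranges.
activations = [
    [0.01, -0.02, 0.03, 0.015],  # channel 0: small magnitudes
    [10.0, -8.0, 6.0, 9.5],      # channel 1: large magnitudes
]

# Per-tensor: one scale shared by every channel.
flat = [v for channel in activations for v in channel]
per_tensor_scale = absmax_scale(flat)

# Per-channel: one scale per channel, so more high-precision values to store.
per_channel_scales = [absmax_scale(channel) for channel in activations]

per_tensor_q = [quantize(channel, per_tensor_scale) for channel in activations]
per_channel_q = [
    quantize(channel, scale)
    for channel, scale in zip(activations, per_channel_scales)
]

# Compare the reconstruction error on the small-range channel.
err_tensor = mean_abs_error(activations[0], per_tensor_q[0])
err_channel = mean_abs_error(activations[0], per_channel_q[0])
print(f"per-tensor error on channel 0:  {err_tensor:.6f}")
print(f"per-channel error on channel 0: {err_channel:.6f}")
```

In this toy case the shared scale is dictated by the large-range channel, so the small-range channel loses most of its resolution, while the per-channel variant keeps it at the cost of storing one scale per channel instead of one overall.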