hshen14 / neural-compressor
This project is forked from intel/neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Home Page: https://intel.github.io/neural-compressor/
License: Apache License 2.0