weiminw / vllm-gptq
This project is forked from chu-tianxiang/vllm-gptq
A high-throughput and memory-efficient inference and serving engine for LLMs
Home Page: https://vllm.readthedocs.io
License: Apache License 2.0