Comments (8)
Hello, it seems that the Cohere model is mainly tailored for RAG. Although the LLM component is expected to be supported, note that the embedding model tends to be relatively small and compute-bound, so it may not benefit much from weight-only quantization.
Please feel free to correct me if I'm wrong. We'll prioritize exploring this if it could bring large benefits to the community.
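For readers unfamiliar with the term, here is a minimal sketch of what weight-only quantization does to a weight vector. It uses plain group-wise round-to-nearest, which is not AutoRound's tuning algorithm; the function names and the 32 bits of per-group metadata are assumptions made for illustration only.

```python
def quantize_rtn(weights, bits=4, group_size=4):
    """Group-wise round-to-nearest: each group stores (scale, minimum, integer codes)."""
    qmax = 2 ** bits - 1
    groups = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / qmax
        if scale == 0:
            scale = 1.0  # constant group: every code is 0 and the minimum restores it
        codes = [round((w - lo) / scale) for w in group]
        groups.append((scale, lo, codes))
    return groups


def dequantize(groups):
    """Rebuild approximate floating-point weights from the stored groups."""
    return [lo + c * scale for scale, lo, codes in groups for c in codes]


def bits_per_weight(bits, group_size, meta_bits=32):
    """Average storage per weight, assuming meta_bits of scale/minimum data per group."""
    return bits + meta_bits / group_size


weights = [0.1, -0.4, 0.25, 0.9, -1.2, 0.05, 0.6, -0.3]
groups = quantize_rtn(weights, bits=4, group_size=4)
restored = dequantize(groups)
```

Only the stored weights shrink here. The weights are dequantized before the matrix multiply, so the arithmetic still runs in higher precision, which is why a small compute-bound model may see little speedup from this kind of scheme.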
from auto-round.
Apologies for any confusion. Just to clarify, are you referring to https://huggingface.co/CohereForAI/c4ai-command-r-v01 instead of https://huggingface.co/Cohere? We will test https://huggingface.co/CohereForAI/c4ai-command-r-v01.
Yes, I mean https://huggingface.co/CohereForAI/c4ai-command-r-v01 and https://huggingface.co/CohereForAI/c4ai-command-r-plus
OK, have you run into any issues yet? If the LLM is "typical", AutoRound should already support it without requiring any additional code or configuration.
Anyway, we will run a test.
Not yet, I will give it a try.
I've conducted a preliminary test on https://huggingface.co/CohereForAI/c4ai-command-r-v01 and it appears that AutoRound works well. However, given our current prioritized tasks, we plan to postpone reporting the recipe until we have available compute resources.
Thanks a lot.
Closing this as there is no issue. Feel free to reopen it.