Important
I am seeking a PhD opportunity; if you are interested, you can email me: [email protected]
🐳 Aurora is a Chinese-oriented MoE model: a follow-up work based on Mixtral-8x7B that activates the model's Chinese open-domain chat capability.
Home Page: https://arxiv.org/abs/2312.14557
License: Apache License 2.0
Inference feels very slow; token generation is much slower than Qwen-14B. Is this caused by QLoRA? Is there a good way to speed up inference?
Hi Rongsheng,
Thanks for your work! I'm wondering which optimization strategy is used (ZeRO-1/2/3)?
The training data is filtered from three sources. Would it be possible to open-source the data, or the filtering pipeline?
Dear @WangRongsheng,
Thank you for your contribution; this paper is amazing. However, I have a question about the instruction fine-tuning datasets mentioned below:
Can you explain why both Alpaca-derived datasets, alpaca_data_zh_51k and alpaca_gpt4_data_zh, were used? The alpaca_gpt4_data_zh dataset appears to be of higher quality, as it contains more natural responses than the original alpaca_data_zh_51k dataset. Is it more beneficial to use both datasets for the instruction fine-tuning step, or would it be preferable to use only alpaca_gpt4_data_zh because of its more natural responses?
Thank you for your clarification.
Best regards,
How can I do full-parameter fine-tuning across multiple machines? Can I use a parallelism framework like DeepSpeed? Thanks.
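For what it's worth, DeepSpeed is a common choice for multi-node full-parameter training. A hedged sketch, assuming a standard DeepSpeed setup (the file name ds_config.json and the launch details are assumptions, not this repo's documented workflow): a minimal ZeRO-3 config passed via --deepspeed might look like:

```json
{
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "bf16": { "enabled": true },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
```

launched across machines with a hostfile listing each node, e.g. `deepspeed --hostfile hostfile <training script> ... --deepspeed ds_config.json`. ZeRO-3 shards parameters, gradients, and optimizer states across all GPUs, which is what makes full-parameter tuning of a model this size feasible.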
In Section 2 (Data) of your paper, you write:
alpaca_data_zh_51k dataset consists of approximately 51000 sentence pairs, each containing a Chinese sentence and its corresponding English translation.
However, looking at the data in https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/data/alpaca_data_zh_51k.json, most entries have a purely Chinese instruction and output, with no corresponding English.
Am I looking in the wrong place? Or did you translate the data at that link before using it as training data?
Thanks for your answer!
I want to run inference with vLLM. How do I merge the LoRA weights with the base model and save the whole model?
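For reference, peft's `PeftModel.from_pretrained(...)` followed by `merge_and_unload()` and `save_pretrained(...)` produces a single merged checkpoint that vLLM can load directly. Mathematically, the merge just folds the scaled low-rank update into the base weight; a minimal NumPy sketch of that step (toy shapes, not Mixtral's):

```python
import numpy as np

# Minimal sketch of what a LoRA merge does: fold the low-rank update
# B @ A (scaled by alpha / r) into the frozen base weight W, so that
# inference needs only one matmul per layer instead of base + adapter.
rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 16
W = rng.standard_normal((d, d))   # frozen base weight
A = rng.standard_normal((r, d))   # LoRA down-projection
B = rng.standard_normal((d, r))   # LoRA up-projection

W_merged = W + (alpha / r) * (B @ A)  # the "merge" step

x = rng.standard_normal(d)
y_adapter = W @ x + (alpha / r) * (B @ (A @ x))  # base + adapter at runtime
y_merged = W_merged @ x                          # merged weight, single matmul
print(np.allclose(y_adapter, y_merged))          # the two paths agree
```

Because the merged weight reproduces the adapter path exactly, the saved model behaves identically while being loadable by engines (like vLLM) that do not apply adapters at runtime.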
Hi, as the title says, I'm interested in the data used for DPO training. Is it possible to get it?
Hello, as someone new to LLMs, I'd like to ask about two test cases:
Model: Mixtral 8x7B + Chinese LoRA, 4-bit quantization
Prompts:
1. you have a question.think step by step.if the question talk about china's topic then output {"topic":"china"} else output {"topic":"other"}.Question:浙江在哪里?
2. The same prompt in Chinese (where the question concerns a China-related topic).
mixtral:
Shows its reasoning process and describes it.
output: The question is about a location in China, so the topic is related to China.\nAnswer: {"topic": "china"}\n\nNote: Zhejiang is a province located in the eastern part of China, near Shanghai and bordering the East China Sea.
chinese lora:
No reasoning or description.
output: {"topic":"other"}.
It looks like the LoRA version has forgotten reasoning-related knowledge, leading to a wrong logical judgment (could it be that the training data lacks Chinese-English alignment, CoT, and STEM data?).
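Incidentally, whether or not the model adds reasoning text, the {"topic": …} object can be recovered from either style of output with a small parser; a minimal sketch (extract_topic is a hypothetical helper, not part of Aurora):

```python
import json
import re

def extract_topic(text):
    """Pull the first {"topic": ...} JSON object out of free-form model output."""
    m = re.search(r'\{\s*"topic"\s*:\s*"[^"]*"\s*\}', text)
    return json.loads(m.group(0))["topic"] if m else None

# Both output styles quoted above parse to the same shape:
verbose = 'The question is about a location in China.\nAnswer: {"topic": "china"}'
terse = '{"topic":"other"}.'
print(extract_topic(verbose), extract_topic(terse))  # → china other
```

This sidesteps the formatting difference, though it of course does not fix the LoRA model's wrong classification itself.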
I really like Mixtral 8x7B: the model is very capable, but its Chinese is relatively weak. I hope your work keeps getting better.
I tried multi-GPU inference with vLLM, but it didn't work...
Asking about it here.
Thanks.
Hi @WangRongsheng, I am trying to run your web_demo.py on Linux using:
CUDA_VISIBLE_DEVICES=0 python src/web_demo.py --model_name_or_path mistralai/Mixtral-8x7B-Instruct-v0.1 --checkpoint_dir /mnt347/ddd/test/Aurora/final-checkpoint --finetuning_type lora --quantization_bit 4 --template mistral
I have also set share=True in:
def main():
    demo = create_web_demo()
    demo.queue()
    demo.launch(server_name="0.0.0.0", server_port=7888, share=True, inbrowser=True)
However, when I run the code, it seems to run only in the back end:
01/16/2024 05:01:00 - INFO - llmtuner.model.adapter - Fine-tuning method: LoRA
01/16/2024 05:01:01 - INFO - llmtuner.model.adapter - Loaded fine-tuned model from checkpoint(s): /mnt347/ddd/test/Aurora/final-checkpoint
01/16/2024 05:01:01 - INFO - llmtuner.model.loader - trainable params: 0 || all params: 46706200576 || trainable%: 0.0000
01/16/2024 05:01:01 - INFO - llmtuner.model.loader - This IS expected that the trainable params is 0 if you are using model for inference only.
01/16/2024 05:01:01 - INFO - llmtuner.data.template - Add pad token:
Running on local URL: http://0.0.0.0:7888
but not in the browser (Chrome), as the web page (at the URL above) keeps loading indefinitely (I guess it will eventually fail). Could you please help with this? Many thanks in advance!
How can Aurora be combined with FlashAttention-2 to further speed up inference?
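A hedged loading-configuration sketch, assuming transformers >= 4.36 with the flash-attn package installed on a supported GPU (not verified against this repo's scripts):

```python
import torch
from transformers import AutoModelForCausalLM

# FlashAttention-2 requires fp16/bf16 weights and a compatible GPU.
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-Instruct-v0.1",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # swap in FA2 attention kernels
    device_map="auto",
)
```

The LoRA adapter can then be attached on top of this model as usual; FA2 only changes the attention kernels, not the weights, so it composes with the fine-tuned checkpoint.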
Running vLLM locally with the same prompt, the generate result differs from the one in the paper.
(Pdb) llm.generate("你是谁?", sampling_params) [RequestOutput(request_id=7, prompt='你是谁?', prompt_token_ids=[1, 28705, 29383, 28971, 235, 179, 132, 29771], prompt_logprobs=None, outputs=[CompletionOutput(index=0, text='\n\n我是一个程序员,我喜欢折腾各种东西。
The generate result in the paper:
User 你是谁? Mixtral-8x7B-Instruct-v0.1 Hello! I’m an assistant designed to help you with a variety of tasks. I strive to provide useful, honest, and respectful responses while ensuring your data is secure. It’s nice to meet you! How can I assist you today? 你好,很高兴认识你! 我可以wie kann ich Ihnen helfen heute?nitschen Sie mir bitte helfen?
As per the title.