Comments (5)
Remove --with_prompt. A model that has not gone through SFT does not need a prompt template.
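A minimal sketch of what this flag changes, assuming it wraps the user input in an instruction template before tokenization. The template text and the helper name build_model_input are hypothetical and are not taken from inference.py; the real template may differ.

```python
# Hypothetical instruction template (an assumption for illustration only);
# the template actually used by inference.py may be different.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_model_input(text: str, with_prompt: bool) -> str:
    """Return the string that would be passed to the tokenizer."""
    if with_prompt:
        # SFT-style models were trained on this wrapped format.
        return PROMPT_TEMPLATE.format(instruction=text)
    # A base model only continues raw text, so pass the input unchanged.
    return text


raw = build_model_input("北京是", with_prompt=False)
wrapped = build_model_input("北京是", with_prompt=True)
print(repr(raw))
print(repr(wrapped))
```

A base model has never seen the wrapped format during pretraining, so the raw form is usually the better fit for it.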
from medicalgpt.
Thanks for the answer, but when I run python inference.py --model_type llama --base_model ../baichuan/model --interactive
the result I get is:
Input:北京是
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Response: 北京是-. GSG pup: t gTs s1g ( C T fren—-. up-c-apm-eviner不需要 followed-..ta---.-p.--hest..-..-t.很快...-号.-il-...... Be........all...........--....-...-..l.8...-.. in. until..............any.........................der............................明alled..................oul..irts.......or...... be.....ore................................... eight.......9.............ber..able..................三分.................................ides..............3......................................
from medicalgpt.
Input:登鹳雀楼->王之涣
You should first understand the difference between a base model and a model after SFT. Test the base model with a few-shot prompt.
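A few-shot prompt for a base model can be sketched as follows: solved examples come first, and the final line is left open for the model to complete. The helper name build_few_shot_prompt and the "->" separator convention are assumptions for illustration and are not part of the repository.

```python
def build_few_shot_prompt(examples, query, sep="->"):
    """Join solved (title, author) pairs and one open query into a prompt.

    Each example becomes one line, "title->author"; the query line ends
    with the separator so the model continues with the missing answer.
    """
    lines = [f"{title}{sep}{author}" for title, author in examples]
    lines.append(f"{query}{sep}")
    # Use real newline characters between lines, not the two-character
    # sequence backslash + n.
    return "\n".join(lines)


prompt = build_few_shot_prompt([("登鹳雀楼", "王之涣")], "夜雨寄北")
print(prompt)
```

Adding more solved pairs generally makes the pattern easier for a base model to follow.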
from medicalgpt.
Yes, I understand that. I just had not pasted that example; the result is the same:
Input:登鹳雀楼->王之涣\n夜雨寄北->
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Response: 登鹳雀楼->王之涣\n夜雨寄北-> ((tbsds andb to,m -d
or-nings fullmm any threeardsa3/ with?ak –italsoss takes ($ushundensstatrd &.sdston;act --aaen。^@woesaredbstand
from medicalgpt.
Update to the latest code. template_name=baichuan-chat is compatible with inference on the original model.
from medicalgpt.