Comments (4)
Ziya-LLaMA-13B-v1 was used. The code has been updated.
from medicalgpt.
Same problem here.
Loading checkpoint shards: 100%|████████████████| 28/28 [00:17<00:00, 1.59it/s]
Vocab of the base model: 39424
Vocab of the tokenizer: 39410
Traceback (most recent call last):
File "scripts/gradio_demo.py", line 190, in <module>
main()
File "scripts/gradio_demo.py", line 77, in main
assert tokenzier_vocab_size > model_vocab_size
AssertionError
I downloaded Ziya-LLaMA-13B-v1 a while ago, and I see there is now also a Ziya-LLaMA-13B-v1.1. Could you confirm which base model version is required?
My launch command is:
python scripts/gradio_demo.py --base_model ~/Ziya-LLaMA-13B-v1/ --lora_model ~/ziya-llama-13b-medical-lora/ --tokenizer_path ~/Ziya-LLaMA-13B-v1
I explicitly pointed it at the base model's tokenizer.
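The log above shows why the assertion fires: the tokenizer reports 39410 tokens while the base model reports 39424, so `tokenzier_vocab_size > model_vocab_size` is false. The following is a minimal sketch of a more tolerant check, assuming the intent is to grow the embeddings only when the tokenizer has more tokens than the model. The helper name `plan_embedding_resize` is hypothetical and is not taken from scripts/gradio_demo.py; whether the updated script does this is not stated in the thread.

```python
from typing import Optional


def plan_embedding_resize(model_vocab_size: int, tokenizer_vocab_size: int) -> Optional[int]:
    """Return the target embedding size, or None if the model already covers the tokenizer."""
    if tokenizer_vocab_size > model_vocab_size:
        # The tokenizer has extra tokens, so the embeddings must grow to cover them.
        return tokenizer_vocab_size
    # The model has at least as many rows as the tokenizer has tokens: nothing to do.
    return None


# Sizes reported in the log above.
target = plan_embedding_resize(model_vocab_size=39424, tokenizer_vocab_size=39410)
if target is None:
    print("no resize needed")
else:
    # With a loaded transformers model this would be:
    # model.resize_token_embeddings(target)
    print(f"resize embeddings to {target}")
```

With the sizes from the log, the helper returns None, so the demo would proceed without resizing instead of aborting.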
from medicalgpt.
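@shibing624 I reran with the updated code on two RTX 4090 cards. Each card used only about 14 GB of GPU memory, but inference still fails with this error: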
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)
/opt/conda/conda-bld/pytorch_1670525552843/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [0,0,0], thread: [64,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
Do you know what the problem is? Thanks.
Also, from transformers import GenerationConfig needs to be added to the imports; otherwise the script raises an error.
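Regarding the device-side assertion in the traceback above: it states the condition every index must satisfy, namely -size <= index < size. The sketch below applies that same condition in plain Python so that a list of IDs can be checked on the CPU before it reaches the GPU. Which tensor is actually out of range in this run is not established by the thread; the helper name `find_out_of_range` and the sample values are hypothetical.

```python
from typing import Iterable, List, Tuple


def find_out_of_range(indices: Iterable[int], size: int) -> List[Tuple[int, int]]:
    """Return (position, index) pairs that violate -size <= index < size."""
    return [(pos, idx) for pos, idx in enumerate(indices) if not (-size <= idx < size)]


# Illustrative IDs checked against a table of 39410 rows (hypothetical input).
bad = find_out_of_range([1, 39409, 39410, -1], size=39410)
print(bad)
```

An empty result means every index fits the table. A non-empty result gives the position and value of each offending index, which is easier to act on than the asynchronous CUDA error.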
from medicalgpt.
The GenerationConfig import has been added. Fixed gradio demo.
from medicalgpt.