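The default tokenizer_path is no longer provided. After pointing it at the directory containing the Ziya-LLaMA-13B-v1 model, a length mismatch error is reported. How can this be resolved?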


Tokenizer length error when running directly about medicalgpt HOT 4 CLOSED

shibing624 commented on July 27, 2024

A tokenizer length error occurs when running the demo directly.

from medicalgpt.

Comments (4)

shibing624 commented on July 27, 2024

I used Ziya-LLaMA-13B-v1. The code has been updated.

from medicalgpt.

charryshi commented on July 27, 2024

Same problem here.
Loading checkpoint shards: 100%|████████████████| 28/28 [00:17<00:00, 1.59it/s]
Vocab of the base model: 39424
Vocab of the tokenizer: 39410
Traceback (most recent call last):
  File "scripts/gradio_demo.py", line 190, in <module>
    main()
  File "scripts/gradio_demo.py", line 77, in main
    assert tokenzier_vocab_size > model_vocab_size
AssertionError

I downloaded Ziya-LLaMA-13B-v1 some time ago, and I see there is now also a Ziya-LLaMA-13B-v1.1. Could you confirm which base model version is required?
My launch command is:
python scripts/gradio_demo.py --base_model ~/Ziya-LLaMA-13B-v1/ --lora_model ~/ziya-llama-13b-medical-lora/ --tokenizer_path ~/Ziya-LLaMA-13B-v1
This makes the demo use the base model's tokenizer.
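
The two sizes in the log above can be compared with a little arithmetic. The sketch below is only an illustration: the helper names (round_up, describe_vocab_gap) and the padding multiple of 128 are assumptions, not code from scripts/gradio_demo.py. It shows that the assertion fails because the tokenizer vocabulary (39410) is smaller than the model vocabulary (39424), and that the gap would be consistent with an embedding matrix padded up to a multiple of 128. The thread does not confirm that this is the cause.

```python
def round_up(n: int, multiple: int) -> int:
    """Round n up to the nearest multiple (ceiling division)."""
    return -(-n // multiple) * multiple


def describe_vocab_gap(model_vocab_size: int, tokenizer_vocab_size: int,
                       multiple: int = 128) -> dict:
    """Summarize how the two vocabulary sizes relate.

    `multiple` is a hypothetical padding granularity chosen for illustration.
    """
    gap = model_vocab_size - tokenizer_vocab_size
    return {
        # Number of embedding rows with no matching tokenizer entry.
        "gap": gap,
        # Mirrors the condition in the traceback:
        # assert tokenzier_vocab_size > model_vocab_size
        "assert_passes": tokenizer_vocab_size > model_vocab_size,
        # True when the model size equals the tokenizer size rounded up.
        "looks_padded": gap >= 0
        and round_up(tokenizer_vocab_size, multiple) == model_vocab_size,
    }


# Values taken from the log in this comment.
report = describe_vocab_gap(model_vocab_size=39424, tokenizer_vocab_size=39410)
print(report)
```

If the tokenizer passed via --tokenizer_path is the base model's own, the tokenizer can never be larger than the model, so a strict "greater than" check would always fail in that configuration. Whether the check is meant for a tokenizer shipped with the LoRA weights is not stated in the thread.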

from medicalgpt.

chelovek21 commented on July 27, 2024

@shibing624 I reran with the updated code on two 4090 GPUs. Each card uses only about 14 GB of memory, but inference still fails with:
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)
/opt/conda/conda-bld/pytorch_1670525552843/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [0,0,0], thread: [64,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
Do you know what the problem is? Thanks.

Also, the line "from transformers import GenerationConfig" needs to be added to the imports; otherwise the script raises an error.
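
The device-side assertion in the log above, index >= -sizes[i] && index < sizes[i], means some index used in a tensor lookup fell outside the tensor's bounds. The thread does not say which tensor was involved. One way to narrow it down is to apply the same condition on the CPU before calling the model, for example to token ids against the embedding size. The sketch below assumes a hypothetical helper name (find_out_of_bounds) and made-up token ids.

```python
from typing import Iterable, List, Tuple


def find_out_of_bounds(indices: Iterable[int], size: int) -> List[Tuple[int, int]]:
    """Return (position, index) pairs that violate -size <= index < size.

    This is the same bound the CUDA kernel asserts, checked in plain Python.
    """
    return [
        (pos, idx)
        for pos, idx in enumerate(indices)
        if not (-size <= idx < size)
    ]


# Hypothetical token ids; 39424 is the base model vocab size from the earlier log.
token_ids = [1, 39409, 39424, 5]
bad = find_out_of_bounds(token_ids, size=39424)
print(bad)
```

Any pair returned identifies an id that would trip the kernel assertion if used as an index into a dimension of that size. Running with the environment variable CUDA_LAUNCH_BLOCKING=1 makes kernel launches synchronous, which usually makes the Python traceback point at the operation that actually failed.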

from medicalgpt.

shibing624 commented on July 27, 2024

The GenerationConfig import has been added. Fixed gradio demo.

from medicalgpt.
