Comments (16)
@AGI-player Thanks for your feedback. This is a known issue and we will fix it soon.
It is likely caused by the difference between GQA (32B) and MHA (the other model sizes).
With this modification, the code runs normally.
from tensorrt-llm.
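The GQA-versus-MHA distinction mentioned above can be checked directly from a model's config.json: under grouped-query attention the number of key/value heads is smaller than the number of query heads, while under multi-head attention the two are equal. A minimal sketch (the head counts below are illustrative assumptions, not read from an actual checkpoint):

```python
def attention_kind(num_attention_heads: int, num_key_value_heads: int) -> str:
    """Classify the attention layout from two config.json fields."""
    if num_key_value_heads == num_attention_heads:
        return "MHA"  # every query head has its own KV head
    return "GQA"      # KV heads are shared across groups of query heads

# Illustrative values: a 32B-class GQA config with 40 query heads
# sharing 8 KV heads, versus a smaller MHA config with equal counts.
print(attention_kind(40, 8))   # GQA
print(attention_kind(32, 32))  # MHA
```

A converter that assumes MHA will mis-shape the fused QKV weights for a GQA checkpoint, which is consistent with the error only appearing for the 32B model.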
ok~
Nice, I'll try it.
I commented out those three lines, but I still get the same error.
Did you set "--qwen_type qwen2"?
I set it, using:
python3 convert_checkpoint.py --model_dir /workspace/model/model/Qwen1.5-32B-Chat/ --output_dir /workspace/model/model/Qwen-32B-trt --dtype float16 --qwen_type qwen2
and it doesn't seem to work.
python3 convert_checkpoint.py \
--model_dir ./Qwen1.5-32B-Chat-GPTQ-Int4/ \
--output_dir ./tllm_checkpoint_1gpu_gptq/ \
--dtype float16 \
--use_weight_only \
--weight_only_precision int4_gptq \
--per_group \
--load_model_on_cpu \
--qwen_type qwen2
This works for me
Thanks, it works for me with Qwen1.5-32B-Chat-GPTQ-Int4 too.
This issue arises because conversion of the non-quantized version of Qwen1.5 is not implemented in "tensorrt_llm/models/qwen/convert.py" or "tensorrt_llm/models/qwen/model.py".
Is this issue fixed now?
No.
@jershi425 is this issue fixed in the latest version?