Loading the large model qwen15-32B-AWQ locally cannot use the GPU. The log output is: INFO | model_worker | Loading the model [

Loading the large model qwen15-32B-AWQ locally cannot use the GPU; how should it be configured? about langchain-chatchat HOT 3 CLOSED

tf18536 commented on June 13, 2024

I am loading the large model qwen15-32B-AWQ locally and it does not use the GPU. How should I configure it?

from langchain-chatchat.

Comments (3)

dosubot commented on June 13, 2024 4

To run the qwen15-32B-AWQ model on the GPU, set the device for your model in the FSCHAT_MODEL_WORKERS dictionary in server_config.py. Here is an example configuration that sets the model to use CUDA:

FSCHAT_MODEL_WORKERS = {
    ...
    "qwen15-32B-AWQ": {
        "device": "cuda",
    },
    ...
}

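Make sure CUDA is installed on your system and that the GPU is correctly set up to work with it. In addition, if you use Docker, you need to install the NVIDIA Docker toolkit and pass the --gpus flag when running the container.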
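Before changing any Langchain-Chatchat setting, it can help to confirm that PyTorch itself sees the GPU. The following is a minimal sketch: the helper name `cuda_report` is made up for illustration, while `torch.cuda.is_available()`, `torch.cuda.device_count()`, and `torch.cuda.get_device_name()` are standard PyTorch calls. The import is guarded so the script also runs in an environment where PyTorch is missing.

```python
def cuda_report():
    """Return a small dict describing whether PyTorch can see a CUDA GPU.

    Hypothetical helper for illustration; not part of Langchain-Chatchat.
    """
    report = {"torch_installed": False, "cuda_available": False, "devices": []}
    try:
        import torch  # imported lazily so the check works without PyTorch
    except ImportError:
        return report

    report["torch_installed"] = True
    report["cuda_available"] = bool(torch.cuda.is_available())
    if report["cuda_available"]:
        # Collect the name of every GPU visible to this process.
        report["devices"] = [
            torch.cuda.get_device_name(i)
            for i in range(torch.cuda.device_count())
        ]
    return report


if __name__ == "__main__":
    print(cuda_report())
```

If `cuda_available` comes back `False` here, the problem is probably in the driver, the CUDA install, or a CPU-only PyTorch build rather than in the model worker configuration.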

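After making these changes, check the LLM_DEVICE setting in the configuration file to verify whether the qwen15-32B-AWQ model is using the GPU. If it is set to "cuda", the model is configured to use the GPU. If it is "auto", the system detects and uses an available GPU automatically, although it may emit a warning about the auto-detection. If the model or the system does not support GPU use, or no GPU is available, it may fall back to the CPU.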
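The behaviour described above can be sketched roughly as follows. This is an illustration, not the project's actual code: `resolve_device` is a hypothetical name, and the sketch assumes that an explicit "cuda" or "cpu" value is honoured as written while any other value (such as "auto") triggers detection with a CPU fallback.

```python
def resolve_device(requested: str, cuda_available: bool) -> str:
    """Map a configured device string to the device that would be used.

    Hypothetical helper. Assumption: explicit values are kept as-is,
    and only non-explicit values such as "auto" are auto-detected.
    """
    if requested in ("cuda", "cpu"):
        return requested
    # "auto" (or anything unrecognised): pick the GPU if there is one,
    # otherwise fall back to the CPU.
    return "cuda" if cuda_available else "cpu"


if __name__ == "__main__":
    for setting in ("cuda", "auto", "cpu"):
        for has_gpu in (True, False):
            print(setting, has_gpu, "->", resolve_device(setting, has_gpu))
```

Under this assumption, "auto" on a machine where GPU detection fails ends up on the CPU without an error, which is one reason to set the device explicitly while debugging.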


ilovesouthpark commented on June 13, 2024 1

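In model_config, set the model to run on the GPU explicitly instead of using auto. If that still does not work, find the model_adapter.py file in the installed fastchat dependency and modify the call around line 1697 (it is line 1697 in my copy):
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    # device_map="cuda:0",  # add this line, then uncomment it and try
    config=config,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
    **from_pretrained_kwargs,
)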
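One caveat with this kind of edit: because `**from_pretrained_kwargs` is expanded in the same call, Python raises a `TypeError` if that dict already carries its own `device_map` key. The sketch below uses a stand-in function, `fake_from_pretrained`, rather than the real transformers API, and the path and kwargs values are placeholders. Whether FastChat actually puts `device_map` into `from_pretrained_kwargs` depends on the version and launch options, so treat that as an assumption to check.

```python
def fake_from_pretrained(model_path, **kwargs):
    """Stand-in for AutoModelForCausalLM.from_pretrained; echoes its kwargs."""
    return {"model_path": model_path, **kwargs}


def pin_device(from_pretrained_kwargs, device="cuda:0"):
    """Return a copy of the kwargs with device_map forced to one device.

    Hypothetical helper for illustration only.
    """
    merged = dict(from_pretrained_kwargs)
    merged["device_map"] = device
    return merged


# Placeholder kwargs that already contain a device_map entry.
kwargs = {"device_map": "auto", "low_cpu_mem_usage": True}

# Passing device_map both explicitly and through ** fails.
try:
    fake_from_pretrained("some/path", device_map="cuda:0", **kwargs)
    duplicate_error = None
except TypeError as exc:
    duplicate_error = str(exc)

# Merging into the dict first avoids the duplicate keyword.
loaded = fake_from_pretrained("some/path", **pin_device(kwargs))

if __name__ == "__main__":
    print("duplicate keyword error:", duplicate_error)
    print("device_map after merge:", loaded["device_map"])
```

If the uncommented line triggers this error in your copy of model_adapter.py, overriding the key inside `from_pretrained_kwargs` before the call is one way around it.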


tf18536 commented on June 13, 2024

The problem is solved, thanks!
