chatchat-space / langchain-chatchat

Langchain-Chatchat (formerly langchain-ChatGLM): a local-knowledge-based LLM (e.g. ChatGLM, Qwen and Llama) RAG and Agent application built with langchain

License: Apache License 2.0


langchain-chatchat's Introduction



🌍 READ THIS IN ENGLISH

📃 LangChain-Chatchat (formerly Langchain-ChatGLM)

An open-source RAG and Agent application project that can be deployed offline, built on large language models such as ChatGLM and application frameworks such as Langchain.


Table of Contents

Overview

🤖️ A question-answering application based on a local knowledge base and implemented with langchain, aiming to provide a knowledge-base Q&A solution that is friendly to Chinese scenarios and open-source models and can run fully offline.

💡 Inspired by GanymedeNil's project document.ai and the ChatGLM-6B Pull Request created by AlexZhangji, this project implements a local knowledge-base Q&A application whose whole pipeline can be built with open-source models. In its latest version, the project can connect to models such as GLM-4-Chat, Qwen2-Instruct and Llama3 through frameworks such as Xinference and Ollama, and, relying on the langchain framework, it can be used through the FastAPI-based API service or the Streamlit-based WebUI.

✅ The project supports the mainstream open-source LLMs, Embedding models and vector databases on the market, allowing a fully offline, private deployment with open-source models only. It also supports calling the OpenAI GPT API, and access to more models and model APIs will continue to be added.

⛓️ The implementation principle of this project is shown in the figure below. The pipeline is: load files -> read text -> split text -> vectorize the text chunks -> vectorize the question -> match the top-k chunks most similar to the question vector -> add the matched text and the question to the prompt as context -> submit it to the LLM to generate an answer.
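The pipeline above can be sketched with plain langchain primitives. The snippet below is only an illustration: the loader, splitter parameters, embedding model name and FAISS store are assumptions for the example and not the project's actual implementation.

```python
# Illustrative sketch of: load file -> read/split text -> vectorize chunks ->
# vectorize question -> match top-k chunks -> build prompt -> ask the LLM.
# Paths, model names and parameters are placeholders, not project defaults.
from langchain.document_loaders import UnstructuredFileLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

docs = UnstructuredFileLoader("knowledge/example.md").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(docs)

embeddings = HuggingFaceEmbeddings(model_name="BAAI/bge-large-zh-v1.5")
store = FAISS.from_documents(chunks, embeddings)

question = "What is Langchain-Chatchat?"
top_k_docs = store.similarity_search(question, k=3)

context = "\n".join(d.page_content for d in top_k_docs)
prompt = f"Answer based on the context below.\n{context}\n\nQuestion: {question}"
# `prompt` is then sent to the LLM served by Xinference / Ollama / ... to generate the answer.
```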

📺 Video introduction to how it works

Implementation diagram

From the document-processing point of view, the workflow is as follows:

Implementation diagram 2

🚩 This project does not involve any fine-tuning or training, but fine-tuning or training can be used to improve its results.

🌐 The code used by the 0.3.0 AutoDL image has been updated to v0.3.0 of this project.

🐳 The Docker image will be updated in the near future.

🧑‍💻 If you would like to contribute to this project, please see the development guide for more information on development and deployment.

Features

Overview of 0.3.x features

| Feature | 0.2.x | 0.3.x |
| --- | --- | --- |
| Model access | Local: fastchat; Online: XXXModelWorker | Local: model_provider, supporting most mainstream model-serving frameworks; Online: oneapi; all model access is compatible with the openai sdk |
| Agent | ❌ unstable | ✅ optimized for ChatGLM3 and Qwen, Agent capability significantly improved |
| LLM chat | | |
| Knowledge-base chat | | |
| Search-engine chat | | |
| File chat | ✅ vector retrieval only | ✅ unified into a File RAG feature, supporting BM25 + KNN and other retrieval methods |
| Database chat | | |
| Multimodal image chat | | ✅ qwen-vl-chat recommended |
| ARXIV paper chat | | |
| Wolfram chat | | |
| Text-to-image | | |
| Local knowledge-base management | | |
| WebUI | | ✅ better multi-session support, customizable system prompts, ... |

The core functionality of 0.3.x is provided by an Agent, but users can also invoke tools manually:

| Operation | What it does | Applicable scenario |
| --- | --- | --- |
| Check "启用Agent" (Enable Agent) and select multiple tools | The LLM decides which tools to call automatically | Models with Agent capability, such as ChatGLM3, Qwen or online APIs |
| Check "启用Agent" and select a single tool | The LLM only parses the tool parameters | The model's Agent capability is limited and it cannot choose tools well; you want to pick the function manually |
| Leave "启用Agent" unchecked and select a single tool | The tool is called manually by filling in its parameters, without using the Agent | The model has no Agent capability |
| Select no tool and upload an image | Image chat | Multimodal models such as qwen-vl-chat |

For more features and updates, please deploy the project and try it out.
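Because every model connection in 0.3.x is exposed through an OpenAI-compatible interface (see the table above), a deployed instance can be called with the standard openai SDK. The snippet below is only a hedged sketch: the base URL, port, API key and model name are assumptions that must be adapted to your own deployment.

```python
# Hedged sketch: talk to an OpenAI-compatible endpoint with the openai SDK.
# base_url, api_key and model are illustrative assumptions, not fixed project values.
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:7861/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="qwen1.5-chat",
    messages=[{"role": "user", "content": "Introduce Langchain-Chatchat in one sentence."}],
)
print(resp.choices[0].message.content)
```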

Supported model deployment frameworks and models

This project already supports recently released open-source large language models such as GLM-4-Chat and Qwen2-Instruct as well as Embedding models. These models have to be served by a model deployment framework started by the user and are then connected to the project by editing its configuration. The locally supported model deployment frameworks are:

| Model deployment framework | Xinference | LocalAI | Ollama | FastChat |
| --- | --- | --- | --- | --- |
| OpenAI API alignment | | | | |
| Accelerated inference engines | GPTQ, GGML, vLLM, TensorRT, mlx | GPTQ, GGML, vLLM, TensorRT | GGUF, GGML | vLLM |
| Supported model types | LLM, Embedding, Rerank, Text-to-Image, Vision, Audio | LLM, Embedding, Rerank, Text-to-Image, Vision, Audio | LLM, Text-to-Image, Vision | LLM, Vision |
| Function Call | | | | / |
| Additional platform support (CPU, Metal) | | | | |
| Heterogeneous | | | / | / |
| Cluster | | | / | / |
| Documentation | Xinference docs | LocalAI docs | Ollama docs | FastChat docs |
| Available models | Xinference supported models | LocalAI supported models | Ollama supported models | FastChat supported models |

In addition to the local model-serving frameworks above, the project also supports the One API framework for connecting to online APIs, including commonly used services such as OpenAI ChatGPT, Azure OpenAI API, Anthropic Claude, Zhipu Qingyan (智谱清言) and Baichuan (百川).

Note

About loading local models with Xinference: Xinference's built-in models are downloaded automatically. To make it load a model you have already downloaded, start the Xinference service, then run streamlit run xinference_manager.py in the project's tools/model_loaders directory and set the local path for the model as prompted on the page.

Quick start

Installation with pip

0. Software and hardware requirements

💡 On the software side, the project supports Python 3.8-3.11 and has been tested on Windows, macOS and Linux.

💻 On the hardware side, since version 0.3.0 connects to different model deployment frameworks, it can run on CPU, GPU, NPU, MPS and other hardware.

1. Install Langchain-Chatchat

Starting with version 0.3.0, Langchain-Chatchat is distributed as a Python library. To install it, run:

pip install langchain-chatchat -U

Important

To make sure the installed Python packages are up to date, it is recommended to use the official PyPI index or the Tsinghua mirror.

Note

Because connecting the Xinference model deployment framework to Langchain-Chatchat requires additional Python dependencies, install the package as follows if you plan to use it with Xinference:

pip install "langchain-chatchat[xinference]" -U

2. Start a model inference framework and load models

Starting with version 0.3.0, Langchain-Chatchat no longer loads models directly from a local model path entered by the user. All model types involved - LLM, Embedding, Reranker and the multimodal models to be supported later - are instead accessed through common model inference frameworks such as Xinference, Ollama, LocalAI, FastChat and One API.

Therefore, before starting the Langchain-Chatchat project, make sure the model inference framework is running and the required models are loaded.

Taking Xinference as an example, please refer to the Xinference documentation for framework deployment and model loading.

Warning

To avoid dependency conflicts, put Langchain-Chatchat and the model deployment framework (such as Xinference) in separate Python virtual environments, for example conda, venv or virtualenv.

3. Initialize project configuration and the data directory

Starting with version 0.3.1, Langchain-Chatchat is configured through local yaml files. You can view and edit their contents directly, and the server picks up changes automatically without a restart.

  1. Set the root directory where Chatchat stores its configuration and data files (optional)
# on linux or macos
export CHATCHAT_ROOT=/path/to/chatchat_data

# on windows
set CHATCHAT_ROOT=/path/to/chatchat_data

If this environment variable is not set, the current directory is used automatically.

  2. Run the initialization
chatchat init

This command does the following:

  • creates all required data directories
  • copies the contents of the samples knowledge base
  • generates the default yaml configuration files

  3. Edit the configuration files
  • Configure models (model_settings.yaml)
    Configure model access according to the model inference framework and the models selected in step 2 (start a model inference framework and load models); see the comments in model_settings.yaml for details. The main items to change are:

    # Name of the default LLM
    DEFAULT_LLM_MODEL: qwen1.5-chat

    # Name of the default Embedding model
    DEFAULT_EMBEDDING_MODEL: bge-large-zh-v1.5

    # Change the `llm_model, action_model` keys in `LLM_MODEL_CONFIG` to the corresponding LLM model
    # Update the corresponding model platform information in `MODEL_PLATFORMS`
  • Configure the knowledge-base path (basic_settings.yaml) (optional)
    The default knowledge base is located at CHATCHAT_ROOT/data/knowledge_base. If you want to keep the knowledge base somewhere else, or connect to an existing knowledge base, change the corresponding directories here.

    # Default storage path of the knowledge base
    KB_ROOT_PATH: D:\chatchat-test\data\knowledge_base

    # Default storage path of the database. If you use sqlite, you can modify DB_ROOT_PATH directly; for any other database, modify SQLALCHEMY_DATABASE_URI instead.
    DB_ROOT_PATH: D:\chatchat-test\data\knowledge_base\info.db

    # Connection URI of the knowledge-base info database
    SQLALCHEMY_DATABASE_URI: sqlite:///D:\chatchat-test\data\knowledge_base\info.db
  • Configure the knowledge base (kb_settings.yaml) (optional)

    FAISS is used as the default knowledge base. If you want to connect to another type of knowledge base, modify DEFAULT_VS_TYPE and kbs_config.

4. Initialize the knowledge base

Warning

Before initializing the knowledge base, make sure the model inference framework and the corresponding embedding model have been started, and that model access has been configured as described in step 3 above.

chatchat kb -r

See chatchat kb --help for more options.

The following log output indicates success:


----------------------------------------------------------------------------------------------------
知识库名称      :samples
知识库类型      :faiss
向量模型:      :bge-large-zh-v1.5
知识库路径      :/root/anaconda3/envs/chatchat/lib/python3.11/site-packages/chatchat/data/knowledge_base/samples
文件总数量      :47
入库文件数      :42
知识条目数      :740
用时            :0:02:29.701002
----------------------------------------------------------------------------------------------------

总计用时        :0:02:33.414425

Note

Common problems during knowledge-base initialization

1. On Windows, rebuilding the knowledge base or adding knowledge files hangs

This problem usually occurs in newly created virtual environments and can be confirmed as follows:

from unstructured.partition.auto import partition

If this statement hangs and never returns, run the following commands:

pip uninstall python-magic-bin
# check the version of the uninstalled package
pip install 'python-magic-bin=={version}'

Then recreate the knowledge base by following the steps in this section.

5. Start the project

chatchat start -a

The following interface indicates a successful start:

WebUI interface

Warning

Because the default listen address DEFAULT_BIND_HOST in the chatchat configuration is 127.0.0.1, the service cannot be reached from other IP addresses.

To access it via the machine's IP address (for example on a Linux server), change the listen address to 0.0.0.0 in basic_settings.yaml.

Other configuration

  1. For database chat configuration, see the database chat configuration guide.

Installation from source / development deployment

For installation from source, please refer to the development guide.

Docker deployment

docker pull chatimage/chatchat:0.3.0-2024-0624

Important

Strongly recommended: deploy with docker-compose; see README_docker for details.

Migrating from older versions

  • The structure of 0.3.x has changed a lot, so we strongly recommend redeploying by following the documentation. The guide below does not guarantee 100% compatibility or success. Remember to back up important data first!
  • First set up the runtime environment and edit the configuration files following the steps in the installation guide above.
  • Copy the knowledge_base directory of the 0.2.x project into the configured DATA directory.

Project milestones

  • April 2023: Langchain-ChatGLM 0.1.0 released, supporting local knowledge-base Q&A based on the ChatGLM-6B model.

  • August 2023: Langchain-ChatGLM was renamed Langchain-Chatchat, and version 0.2.0 was released, using fastchat to load models and supporting more models and databases.

  • October 2023: Langchain-Chatchat 0.2.5 released, introducing Agent features; the open-source project won third prize at the hackathon held by Founder Park, Zhipu AI and Zilliz.

  • December 2023: the Langchain-Chatchat open-source project passed 20K stars.

  • June 2024: Langchain-Chatchat 0.3.0 released, bringing a brand-new project architecture.

  • 🔥 Let's look forward to the future of the Chatchat story together...


License

The code of this project is released under the Apache-2.0 license.

Contact us

Telegram

Telegram

Project chat group

QR code

🎉 WeChat discussion group of the Langchain-Chatchat project. If you are interested in the project, you are welcome to join the group chat and take part in the discussion.

Official WeChat account

QR code

🎉 Official WeChat account of the Langchain-Chatchat project. Scan the QR code to follow it.

Citation

If this project has helped your research, please cite us:

@software{langchain_chatchat,
    title        = {{langchain-chatchat}},
    author       = {Liu, Qian and Song, Jinke and Huang, Zhiguo and Zhang, Yuxuan and glide-the and liunux4odoo},
    year         = 2024,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/chatchat-space/Langchain-Chatchat}}
}


langchain-chatchat's Issues

Runtime environment: how much GPU memory is needed?

According to THUDM/ChatGLM-6B, GPU memory usage should be around 13 GB, but after running the script even 24 GB is not enough.

Is this caused by langchain's RetrievalQA.from_chain_type call, or is the imported HuggingFaceEmbeddings model too large?

It would be great if you could provide the environment configuration used when running the model. Many thanks!

Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████| 8/8 [00:09<00:00, 1.19s/it]
Traceback (most recent call last):
File "/home/node6/liujie/workspace/langchain-ChatGLM/knowledge_based_chatglm.py", line 11, in
from chatglm_llm import ChatGLM
File "/home/node6/liujie/workspace/langchain-ChatGLM/chatglm_llm.py", line 9, in
class ChatGLM(LLM):
File "pydantic/main.py", line 221, in pydantic.main.ModelMetaclass.new
File "pydantic/fields.py", line 506, in pydantic.fields.ModelField.infer
File "pydantic/fields.py", line 436, in pydantic.fields.ModelField.init
File "pydantic/fields.py", line 546, in pydantic.fields.ModelField.prepare
File "pydantic/fields.py", line 570, in pydantic.fields.ModelField._set_default_and_type
File "pydantic/fields.py", line 439, in pydantic.fields.ModelField.get_default
File "pydantic/utils.py", line 693, in pydantic.utils.smart_deepcopy
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 146, in deepcopy
y = copier(x, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 146, in deepcopy
y = copier(x, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 146, in deepcopy
y = copier(x, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 146, in deepcopy
y = copier(x, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 146, in deepcopy
y = copier(x, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 146, in deepcopy
y = copier(x, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/copy.py", line 153, in deepcopy
y = copier(memo)
File "/home/node6/py_env/anaconda3/lib/python3.9/site-packages/torch/nn/parameter.py", line 55, in deepcopy
result = type(self)(self.data.clone(memory_format=torch.preserve_format), self.requires_grad)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 23.53 GiB total capacity; 22.80 GiB already allocated; 8.62 MiB free; 22.80 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

How can the results be improved?

image

As shown in the image, after combining this project's README.md with the project itself, the quality of the answers is not ideal. In which areas can it be improved?

When I try to run `python knowledge_based_chatglm.py`, I get this error on macOS (M1 Max, OS 13.2)

~/Downloads/langchain-ChatGLM-master $ python knowledge_based_chatglm.py
Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.11/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/opt/homebrew/lib/python3.11/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
    conn.connect()
  File "/opt/homebrew/lib/python3.11/site-packages/urllib3/connection.py", line 419, in connect
    self.sock = ssl_wrap_socket(
                ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
               ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/[email protected]/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/ssl.py", line 517, in wrap_socket
    return self.sslsocket_class._create(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/[email protected]/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/ssl.py", line 1075, in _create
    self.do_handshake()
  File "/opt/homebrew/Cellar/[email protected]/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/ssl.py", line 1346, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:992)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.11/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /THUDM/chatglm-6b/resolve/main/tokenizer_config.json (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:992)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/lijin02/Downloads/langchain-ChatGLM-master/knowledge_based_chatglm.py", line 11, in <module>
    from chatglm_llm import ChatGLM
  File "/Users/lijin02/Downloads/langchain-ChatGLM-master/chatglm_llm.py", line 19, in <module>
    tokenizer = AutoTokenizer.from_pretrained(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 640, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 484, in get_tokenizer_config
    resolved_config_file = cached_file(
                           ^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/transformers/utils/hub.py", line 409, in cached_file
    resolved_file = hf_hub_download(
                    ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1166, in hf_hub_download
    metadata = get_hf_file_metadata(
               ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1498, in get_hf_file_metadata
    r = _request_wrapper(
        ^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 407, in _request_wrapper
    response = _request_wrapper(
               ^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 442, in _request_wrapper
    return http_backoff(
           ^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/huggingface_hub/utils/_http.py", line 129, in http_backoff
    response = requests.request(method=method, url=url, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/requests/adapters.py", line 563, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /THUDM/chatglm-6b/resolve/main/tokenizer_config.json (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:992)')))

Generating the answer takes a long time; can the text-vectorization step be done ahead of time and stored?

GPU: 4090 with 24 GB of VRAM
After loading a 5,000-character document and asking a question to be answered from it, each question takes several minutes before an answer appears, and the second question already runs out of memory.

Questions:
(1) Is this level of performance normal?
(2) If it is, can the text-vectorization step be done in advance and stored?

After the document path is entered, the pipeline goes through reading text -> splitting text -> vectorizing text -> vectorizing the question -> matching the top-k text vectors most similar to the question vector -> adding the matched text and the question to the prompt as context -> submitting it to the LLM to generate an answer. Could the text-vectorization part be done ahead of time and persisted?
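Precomputing and persisting the vectors is possible with the langchain FAISS wrapper. The sketch below is only illustrative (the embedding model name and paths are assumptions), not the built-in mechanism of that project version.

```python
# Sketch: build the FAISS index once, save it, and reuse it for later questions.
from langchain.document_loaders import UnstructuredFileLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

embeddings = HuggingFaceEmbeddings(model_name="GanymedeNil/text2vec-large-chinese")

# One-off: read, split and embed the document, then save the index to disk.
docs = UnstructuredFileLoader("my_doc.txt").load()
split_docs = CharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(docs)
store = FAISS.from_documents(split_docs, embeddings)
store.save_local("vector_store/my_doc")

# Later runs: load the prebuilt index instead of re-embedding the document.
store = FAISS.load_local("vector_store/my_doc", embeddings)
hits = store.similarity_search("your question", k=6)
```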

How can the torch.cuda.OutOfMemoryError raised by the program be fixed?

The detailed error message is as follows:

Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Loading checkpoint shards: 100%|██████████| 8/8 [00:04<00:00, 1.83it/s]
Input your local knowledge file path 请输入本地知识文件路径:E:\try0.md
No sentence-transformers model found with name C:\Users\50902/.cache\torch\sentence_transformers\GanymedeNil_text2vec-large-chinese. Creating a new one with MEAN pooling.
Input your question 请输入问题:什么是瓜子栱?
Traceback (most recent call last):
File "E:\01进行项目#0Y0_AI_Arch\02_digital-humanities\ChatGLM-6B-main\langchain-ChatGLM-master\knowledge_based_chatglm.py", line 67, in
resp, history = get_knowledge_based_answer(query=query,
File "E:\01进行项目#0Y0_AI_Arch\02_digital-humanities\ChatGLM-6B-main\langchain-ChatGLM-master\knowledge_based_chatglm.py", line 45, in get_knowledge_based_answer
chatglm = ChatGLM()
File "E:\01进行项目#0Y0_AI_Arch\02_digital-humanities\ChatGLM-6B-main\langchain-ChatGLM-master\chatglm_llm.py", line 28, in init
super().init()
File "pydantic\main.py", line 339, in pydantic.main.BaseModel.init
File "pydantic\main.py", line 1066, in pydantic.main.validate_model
File "pydantic\fields.py", line 439, in pydantic.fields.ModelField.get_default
File "pydantic\utils.py", line 693, in pydantic.utils.smart_deepcopy
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 146, in deepcopy
y = copier(x, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 146, in deepcopy
y = copier(x, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 146, in deepcopy
y = copier(x, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 146, in deepcopy
y = copier(x, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 146, in deepcopy
y = copier(x, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 146, in deepcopy
y = copier(x, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
File "C:\Users\50902\AppData\Local\Programs\Python\Python39\lib\copy.py", line 153, in deepcopy
y = copier(memo)
File "E:\01进行项目#0Y0_AI_Arch\02_digital-humanities\ChatGLM-6B-main\venv\lib\site-packages\torch\nn\parameter.py", line 55, in deepcopy
result = type(self)(self.data.clone(memory_format=torch.preserve_format), self.requires_grad)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 23.99 GiB total capacity; 22.97 GiB already allocated; 0 bytes free; 22.98 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Process finished with exit code 1

Cannot open the gradio page

$ python webui.py
/home/zsd/.local/lib/python3.10/site-packages/gradio/components.py:164: UserWarning: Unknown style parameter: height
warnings.warn(f"Unknown style parameter: {key}")
Running on local URL: http://0.0.0.0:7860

To create a public link, set share=True in launch().

Error: Use `repo_type` argument if needed.

Traceback (most recent call last):
File "/home/zsd/langchain-ChatGLM/knowledge_based_chatglm.py", line 102, in
init_cfg(LLM_MODEL, EMBEDDING_MODEL, LLM_HISTORY_LEN)
File "/home/zsd/langchain-ChatGLM/knowledge_based_chatglm.py", line 46, in init_cfg
chatglm.load_model(model_name_or_path=llm_model_dict[LLM_MODEL])
File "/home/zsd/langchain-ChatGLM/chatglm_llm.py", line 52, in load_model
self.tokenizer = AutoTokenizer.from_pretrained(
File "/home/zsd/.local/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 619, in from_pretrained
tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
File "/home/zsd/.local/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 463, in get_tokenizer_config
resolved_config_file = cached_file(
File "/home/zsd/.local/lib/python3.10/site-packages/transformers/utils/hub.py", line 409, in cached_file
resolved_file = hf_hub_download(
File "/home/zsd/.local/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 112, in _inner_fn
validate_repo_id(arg_value)
File "/home/zsd/.local/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 160, in validate_repo_id
raise HFValidationError(
huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/Users/liuqian/Downloads/ChatGLM-6B/chatglm-6b'. Use repo_type argument if needed.

Error using the new version with langchain

Error with the new changes:

The code is

template = """Question: {question}

Answer: Let's think step by step."""

prompt = PromptTemplate(template=template, input_variables=["question"])


local_llm = ChatGLM()

print(local_llm('What is the capital of France? '))
print(local_llm('Translate to German: How are you?'))
print(local_llm('Translate to Chinese: How are you?'))
llm_chain = LLMChain(prompt=prompt, 
                     llm=local_llm)
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████| 8/8 [00:06<00:00,  1.30it/s]
The dtype of attention mask (torch.int64) is not bool
history:  []
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/ubuntu/langchain_test2/test3_chatglp.py:35 in <module>                        │
│                                                                                                  │
│    32                                                                                            │
│    33 local_llm = ChatGLM()                                                                      │
│    34                                                                                            │
│ ❱  35 print(local_llm('What is the capital of France? '))                                        │
│    36 print(local_llm('Translate to German: How are you?'))                                      │
│    37 print(local_llm('Translate to Chinese: How are you?'))                                     │
│    38 llm_chain = LLMChain(prompt=prompt,                                                        │
│                                                                                                  │
│ /home/ubuntu/langchain_test2/.venv/lib/python3.11/site-packages/langchain/llms/base │
│ .py:246 in __call__                                                                              │
│                                                                                                  │
│   243 │                                                                                          │
│   244 │   def __call__(self, prompt: str, stop: Optional[List[str]] = None) -> str:              │
│   245 │   │   """Check Cache and run the LLM on the given prompt and input."""                   │
│ ❱ 246 │   │   return self.generate([prompt], stop=stop).generations[0][0].text                   │
│   247 │                                                                                          │
│   248 │   @property                                                                              │
│   249 │   def _identifying_params(self) -> Mapping[str, Any]:                                    │
│                                                                                                  │
│ /home/ubuntu/langchain_test2/.venv/lib/python3.11/site-packages/langchain/llms/base │
│ .py:140 in generate                                                                              │
│                                                                                                  │
│   137 │   │   │   │   output = self._generate(prompts, stop=stop)                                │
│   138 │   │   │   except (KeyboardInterrupt, Exception) as e:                                    │
│   139 │   │   │   │   self.callback_manager.on_llm_error(e, verbose=self.verbose)                │
│ ❱ 140 │   │   │   │   raise e                                                                    │
│   141 │   │   │   self.callback_manager.on_llm_end(output, verbose=self.verbose)                 │
│   142 │   │   │   return output                                                                  │
│   143 │   │   params = self.dict()                                                               │
│                                                                                                  │
│ /home/ubuntu/langchain_test2/.venv/lib/python3.11/site-packages/langchain/llms/base │
│ .py:137 in generate                                                                              │
│                                                                                                  │
│   134 │   │   │   │   {"name": self.__class__.__name__}, prompts, verbose=self.verbose           │
│   135 │   │   │   )                                                                              │
│   136 │   │   │   try:                                                                           │
│ ❱ 137 │   │   │   │   output = self._generate(prompts, stop=stop)                                │
│   138 │   │   │   except (KeyboardInterrupt, Exception) as e:                                    │
│   139 │   │   │   │   self.callback_manager.on_llm_error(e, verbose=self.verbose)                │
│   140 │   │   │   │   raise e                                                                    │
│                                                                                                  │
│ /home/ubuntu/langchain_test2/.venv/lib/python3.11/site-packages/langchain/llms/base │
│ .py:325 in _generate                                                                             │
│                                                                                                  │
│   322 │   │   generations = []                                                                   │
│   323 │   │   for prompt in prompts:                                                             │
│   324 │   │   │   text = self._call(prompt, stop=stop)                                           │
│ ❱ 325 │   │   │   generations.append([Generation(text=text)])                                    │
│   326 │   │   return LLMResult(generations=generations)                                          │
│   327 │                                                                                          │
│   328 │   async def _agenerate(                                                                  │
│                                                                                                  │
│ in pydantic.main.BaseModel.__init__:341                                                          │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValidationError: 1 validation error for Generation
text
  none is not an allowed value (type=type_error.none.not_allowed)

The second round of conversation fails with UnicodeDecodeError: 'utf-8' codec can't decode

The first round works fine; after the model returns its output and asks for the next question, entering another question raises the error:
File "/root/--2023/mon_Apr/langchain-ChatGLM/knowledge_based_chatglm.py", line 73, in
query = input("Input your question 请输入问题:")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 3: invalid continuation byte
What could be causing this?

Why are checkpoints loaded on every run?

After downloading the embedding model locally, the program fails to start properly.
With the original code, every run prints the following:
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:26<00:00, 13.11s/it]
/home/rgzn/miniconda3/envs/langchain/lib/python3.9/site-packages/langchain/chains/conversational_retrieval/base.py:192: UserWarning: ChatVectorDBChain is deprecated - please use from langchain.chains import ConversationalRetrievalChain
warnings.warn

A question about the chat_history logic

Thanks for open-sourcing this.
Looking at the chat_history logic, it takes the history of all previous turns and has chatglm condense it into a single question.
In that case:
If the conversation has many turns and the current question is no longer related to the much earlier history, won't the newly generated question become inaccurate?
Is there any way to tell that the current question starts a new topic that is unrelated to the previous ones?
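One way to handle this (not something the project implements; purely an illustrative idea) is to compare the embedding of the new question with the previous question and drop the history when the similarity falls below a threshold. The model name and threshold below are assumptions.

```python
# Hypothetical sketch: only pass chat_history along when the new question
# is semantically related to the previous one. Model and threshold are illustrative.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("GanymedeNil/text2vec-large-chinese")

def relevant_history(new_question, history, threshold=0.55):
    """Return the history only if the new question resembles the last one."""
    if not history:
        return []
    prev_question = history[-1][0]
    emb = encoder.encode([new_question, prev_question], convert_to_tensor=True)
    if util.cos_sim(emb[0], emb[1]).item() >= threshold:
        return history
    return []  # treat it as a fresh topic: don't let old turns distort the condensed question
```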

How can the model be made to answer strictly from the retrieved data and produce fewer made-up answers?

For example:

Question

What models does Li Auto sell?

A retrieved article

On May 10, Li Auto officially released its financial results for the first quarter of 2022. In the first quarter the company delivered 31,716 Li ONE vehicles, up 152.1% year on year, and recorded revenue of 9.56 billion yuan, up 167.5% year on year. Net loss for the quarter was 10.9 million yuan, compared with a net loss of 360 million yuan in the same period last year.

  In the first quarter of 2022, Li Auto's total revenue was 9.56 billion yuan, up 167.5% from 3.58 billion yuan in the first quarter of 2021 and down 10.0% from 10.62 billion yuan in the fourth quarter of 2021. Vehicle sales revenue in the first quarter of 2022 was 9.31 billion yuan; Li Auto said the increase over the first quarter of 2021 was mainly due to more vehicles being delivered in the first quarter of 2022, while the decrease from the fourth quarter of 2021 was mainly due to the seasonal effect of the Chinese New Year holiday, which reduced deliveries in the first quarter of 2022.

chatglm's answer

Li Auto is a new-energy vehicle manufacturer; the models it sells are mainly new-energy vehicles, including the Li ONE, Li P7 and so on.

How can chatglm be made to answer strictly based on the retrieved content and stop making things up?
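A common mitigation is to tighten the prompt so the model must refuse when the context does not contain the answer; the project's system prompt (quoted in the garbled-prompt issue further below) already follows this pattern. The template here is only a hand-written illustration of that idea, not the project's exact prompt.

```python
# Illustrative stricter prompt template; the wording is an example only.
from langchain.prompts import PromptTemplate

strict_prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=(
        "Answer the user's question concisely and professionally, using only the context below. "
        "If the answer cannot be found in the context, reply \"不知道\" or "
        "\"没有足够的相关信息\" and do not make up an answer.\n"
        "----------------\n"
        "{context}\n"
        "----------------\n"
        "Question: {question}"
    ),
)
```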

Why does it still fail in the end... :(

Failed to import transformers.models.t5.configuration_t5 because of the following error (look up to see
its traceback):
Failed to import transformers.onnx.config because of the following error (look up to see its traceback):
DLL load failed while importing _imaging: 找不到指定的模块。

What version are you using?

Hi Panda, I saw the pip install -r requirements command in the README and want to confirm whether you are using Python 2 or Python 3, because my pip and pip3 versions are both 22.3.

Suggestion: add a plugin system

As the title says, make it possible to install plugins the way stable-diffusion-webui does, and open a separate repository where users or plugin developers can store and download plugins.

The run fails: the process is killed before Loading checkpoint reaches 100%. What could be the reason?

The log is as follows:
python knowledge_based_chatglm.py
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Loading checkpoint shards: 38%|██████▊ | 3/8 [00:02<00:03, 1.28it/s]Killed

Using the int4 quantized version, inference reports that it needs to allocate 10 GB of VRAM

File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4-qe/modeling_chatglm.py", line 581, in forward
attention_outputs = self.attention(
File "/root/miniconda3/envs/gptq/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4-qe/modeling_chatglm.py", line 435, in forward
context_layer, present, attention_probs = attention_fn(
File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4-qe/modeling_chatglm.py", line 250, in attention_fn
matmul_result = torch.empty(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 10.36 GiB (GPU 0; 14.76 GiB total capacity; 4.54 GiB already allocated; 9.01 GiB free; 4.93 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Running the webui on CPU, an error occurs at step 3 (asking)

The web UI runs and the file loads fine, but asking a question raises an error

README.txt 已成功加载
Traceback (most recent call last):
File "/home/chwang/.local/lib/python3.8/site-packages/gradio/routes.py", line 394, in run_predict
output = await app.get_blocks().process_api(
File "/home/chwang/.local/lib/python3.8/site-packages/gradio/blocks.py", line 1075, in process_api
result = await self.call_function(
File "/home/chwang/.local/lib/python3.8/site-packages/gradio/blocks.py", line 884, in call_function
prediction = await anyio.to_thread.run_sync(
File "/home/chwang/.local/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/home/chwang/.local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/home/chwang/.local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "webui.py", line 31, in get_answer
resp, history = kb.get_knowledge_based_answer(
File "/repo/chaowang/AI/langchain-ChatGLM/knowledge_based_chatglm.py", line 98, in get_knowledge_based_answer
retriever=vector_store.as_retriever(search_kwargs={"k": VECTOR_SEARCH_TOP_K}),
AttributeError: 'NoneType' object has no attribute 'as_retriever'

Question: can it run on CPU only?

A very cool implementation that really broadened my horizons! It ran smoothly on a GPU machine.
Also: I'd like to deploy it on another CPU-only server (with decent performance) and would like to know in advance:

1. Can it run on CPU only? (chatglm can, but considering the other projects it depends on, I'm not sure.)
2. Installing detectron2 on Windows seems a bit difficult... any tips from experience? (I ran into problems installing it on the GPU machine and lazily switched to Ubuntu to get it done.)

Keep it up! Plus a few suggestions

Keep it up - I think your direction is right.
For the UI, consider borrowing from or integrating with github.com/Akegarasu/ChatGLM-webui

The program hangs after starting

Thanks to the author for the effort, but I ran into a problem when running it and would appreciate some help.
The situation is as follows:

Windows 10, anaconda environment, Python 3.10; dependencies installed according to requirements.txt.
In addition, I downloaded GanymedeNil\text2vec-large-chinese from Hugging Face and placed it in the project root directory.
chatglm_llm.py was also modified to match the path where ChatGLM-6B is placed.

After running knowledge_based_chatglm.py, VRAM usage is normal (this file uses the myml/langchain-ChatGLM fork, which fixes the earlier problem of VRAM usage doubling). Although no exception occurs, after entering the reference file path the program just hangs; one CPU core runs at full load but there is no further output.

The console output is pasted below - please advise what is going on. (The GPU with the warning has no impact; it is a GTX 650, and the model actually runs on a P40.)

Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Loading checkpoint shards: 100%|██████████| 8/8 [00:22<00:00, 2.81s/it]
C:\Users\Admin\anaconda3\envs\langchain-glm\lib\site-packages\torch\cuda_init_.py:132: UserWarning:
Found GPU1 NVIDIA GeForce GTX 650 which is of cuda capability 3.0.
PyTorch no longer supports this GPU because it is too old.
The minimum cuda capability supported by this library is 3.7.

warnings.warn(old_gpu_warn % (d, name, major, minor, min_arch // 10, min_arch % 10))
Input your local knowledge file path 请输入本地知识文件路径:D:\My_Doc\PyTorchProj\ChatGLM\ChatGLM-6B\README.md
No sentence-transformers model found with name GanymedeNil/text2vec-large-chinese. Creating a new one with MEAN pooling.

nltk package unable to either download or load local nltk_data folder

I'm running this project in an offline Windows Server environment, so I downloaded the Punkt and averaged_perceptron_tagger tokenizers into this directory:
'nltk_data/tokenizers/punkt/english.pickle', but I keep receiving this LookupError:

LookupError:
**********************************************************************
  Resource averaged_perceptron_tagger not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('averaged_perceptron_tagger')

  For more information see: https://www.nltk.org/data.html

  Attempted to load taggers/averaged_perceptron_tagger/averaged_perceptron_tagger.pickle

  Searched in:
    - 'C:\\Users\\username/nltk_data'
    - 'C:\\Users\\username\\AppData\\Local\\Programs\\Python\\Python38\\nltk_data'
    - 'C:\\Users\\username\\AppData\\Local\\Programs\\Python\\Python38\\share\\nltk_data'
    - 'C:\\Users\\username\\AppData\\Local\\Programs\\Python\\Python38\\lib\\nltk_data'
    - 'C:\\Users\\username\\AppData\\Roaming\\nltk_data'
    - 'C:\\nltk_data'
    - 'D:\\nltk_data'
    - 'E:\\nltk_data'
**********************************************************************

I put the nltk_data file in almost all of the directories above but this error keeps coming up. How can I solve this on an offline machine?
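On an offline machine, the usual fix is to point NLTK at the unzipped data directory explicitly (or via the NLTK_DATA environment variable) before anything that uses unstructured runs; the path below is an assumption.

```python
# Sketch for offline nltk_data; the directory must contain the *unzipped*
# taggers/averaged_perceptron_tagger and tokenizers/punkt folders, not the .zip files.
import nltk

nltk.data.path.append(r"C:\nltk_data")  # or set the NLTK_DATA environment variable

# Fail fast if the resources still cannot be resolved.
nltk.data.find("taggers/averaged_perceptron_tagger")
nltk.data.find("tokenizers/punkt")
```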

The demo gives no output

Hello, I tested the bundled press-release example and a text file I uploaded myself. They load fine, but no answer is produced. What could be the cause and how can it be fixed? Thanks. PS: 1. I downloaded the complete code just this morning; 2. the server hardware meets the requirements; 3. I followed the instructions exactly.

The FileType.UNK file type is not supported in partition - solution

ValueError: Invalid file /home/yawu/Documents/langchain-ChatGLM-master/data. The FileType.UNK file type is not supported in partition.
This error occurs because the input filepath is incorrect. The demo expects a specific directory plus file name, not just a folder name.

Hope this helps.

24 GB of VRAM still overflows; is running on two GPUs supported?

RuntimeError: CUDA out of memory. Tried to allocate 96.00 MiB (GPU 0; 23.70 GiB total capacity; 22.18 GiB already allocated; 12.75 MiB free; 22.18 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

A question about how langchain coordinates the vector store and ChatGLM

This piece of code creates the Q&A model, connecting ChatGLM and the vector store built from the local corpus. What order does langchain follow when answering - does it search the vector store first and only fall back to chatglm when nothing is found, or is it some other mechanism?
knowledge_chain = ChatVectorDBChain.from_llm(
    llm=chatglm,
    vectorstore=vector_store,
    qa_prompt=prompt,
    condense_question_prompt=new_question_prompt,
)

On a Mac M2 Max, the exception ValueError: 150001 is not in list is raised

I changed the model-loading code in chatglm_llm.py to the following:

model_path = 'chatglm-6b'
max_token: int = 2048
temperature: float = 0.1
top_p = 0.9
history = []
tokenizer = AutoTokenizer.from_pretrained(
    model_path, trust_remote_code=True
)
model = (
    AutoModel.from_pretrained(model_path, trust_remote_code=True)
    .half()
    .to('mps')
)

How can multiple txt documents be read?

As the title says, how can multiple txt documents be read? The example code only shows reading a single document. Even after changing the input to a string, I can only specify one document; wildcards cannot be used to select multiple documents, and a list of file paths cannot be passed in either.
Thanks for any help.
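A simple workaround (illustrative only; the loader class follows the langchain API the project uses, and the paths are assumptions) is to glob the files yourself and merge the loaded documents before building the vector store.

```python
# Sketch: load several txt files and merge them into one document list.
from glob import glob
from langchain.document_loaders import UnstructuredFileLoader

docs = []
for path in glob("/path/to/corpus/*.txt"):
    docs.extend(UnstructuredFileLoader(path).load())

# `docs` can then be split, embedded and indexed exactly like the single-file case.
```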

[Reproducing an issue] The text extracted from the knowledge base is garbled when the prompt is constructed

Hi, I am trying to reproduce the results shown in the README, also using the ChatGLM-6B README as the input text, but the text extracted from the knowledge base is garbled, which makes the constructed prompt unusable. I would like to know how to solve this.

System: 基于以下内容,简洁和专业的来回答用户的问题。
    如果无法从中得到答案,请说 "不知道" 或 "没有足够的相关信息",不要试图编造答案。答案请使用中文。
    ----------------
    # ChatGLM-6B

[GLM-130B@ICLR 23]

[GLM@ACL 22]

Blog ¢ ð

[GitHub]

[GitHub] ¢ ð

HF Repo ¢ ð

Twitter ¢ ð
    ----------------
