🍓 About Me
- 🔭 Main languages: Python, Golang
- 🌱 Favorites (oshi): 聖代橋氷織, 西園寺風莉
- 📫 E-mail: [email protected]
- 🍨 Blog: Akiba's Blog
- 🔏 OpenPGP: D1EF652A3015B1A2
- 👯 About me: A mediocre CTF player, focused on web/misc
- 🌐 Languages: English, Chinese, Japanese
❄️ Skills
🎄 Others
A WebUI for ChatGLM-6B
Where is the model stored after it is downloaded?
Could you add a screenshot of the webui to the README? I want to get an idea of what is included in this web UI and how it compares to demos of ChatGLM on Hugging Face Spaces. Thanks!
Just installed. After entering text and clicking Send, nothing happens. The message shown is: Generation failed: RuntimeError('Library cudart is not initialized')
gfcjnb
H:\软件\百度云网盘\BaiduNetdiskDownload\ChatGLM(1)\py310\lib\site-packages\gradio\deprecation.py:40: UserWarning: `height` is deprecated in `Interface()`, please use it within `launch()` instead.
warnings.warn(value)
H:\软件\百度云网盘\BaiduNetdiskDownload\ChatGLM(1)\py310\lib\site-packages\gradio\deprecation.py:43: UserWarning: You have unused kwarg parameters in Textbox, please remove them: {'container': False}
warnings.warn(
Traceback (most recent call last):
File "H:\软件\百度云网盘\BaiduNetdiskDownload\ChatGLM(1)\webui.py", line 58, in <module>
main()
File "H:\软件\百度云网盘\BaiduNetdiskDownload\ChatGLM(1)\webui.py", line 45, in main
ui.queue(concurrency_count=5, max_size=64).launch(
TypeError: Blocks.launch() got an unexpected keyword argument 'root_path'
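There is a lot I can't figure out on my own, so let's work on this together. I plan to put together a troubleshooting document for deployment problems.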
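For example, in a continuous conversation, some rounds do not actually need to be added to the context.
By default every round could be added to the context, but the user could unmark a given round; in later questions, unmarked rounds would no longer be added to the context.
This separates the conversation history from the context, reduces the load on the GPU, and should allow more rounds of conversation.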
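The request above could be sketched roughly as follows. This is only an illustration: the `Turn` class, its `in_context` flag, and `build_context` are hypothetical names, not part of the webui, and the answers are placeholders.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Turn:
    """One round of conversation kept in the visible history."""
    query: str
    answer: str
    in_context: bool = True  # every round is part of the context by default


def build_context(turns: List[Turn]) -> List[Tuple[str, str]]:
    """Return only the (query, answer) pairs that are still marked as context."""
    return [(t.query, t.answer) for t in turns if t.in_context]


turns = [
    Turn("first question", "first answer"),
    Turn("second question", "second answer"),
    Turn("third question", "third answer"),
]
turns[1].in_context = False  # the user unmarks the second round
history = build_context(turns)  # what would be sent to the model
```

The full list `turns` stays available for display, while only `history` is passed on as context, so the two are kept separate as the request describes.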
After updating to the latest webui, every send fails with an error
Generation failed: AttributeError("'ChatGLMForConditionalGeneration' object has no attribute 'stream_chat'")
Traceback (most recent call last):
File "L:\ChatGLM\py310\lib\site-packages\gradio\routes.py", line 393, in run_predict
output = await app.get_blocks().process_api(
File "L:\ChatGLM\py310\lib\site-packages\gradio\blocks.py", line 1059, in process_api
result = await self.call_function(
File "L:\ChatGLM\py310\lib\site-packages\gradio\blocks.py", line 882, in call_function
prediction = await anyio.to_thread.run_sync(
File "L:\ChatGLM\py310\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "L:\ChatGLM\py310\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "L:\ChatGLM\py310\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "L:\ChatGLM\py310\lib\site-packages\gradio\utils.py", line 549, in async_iteration
return next(iterator)
File "L:\ChatGLM\modules\ui.py", line 30, in predict
ctx.refresh_last()
File "L:\ChatGLM\modules\context.py", line 42, in refresh_last
query, output = self.rh[-1]
IndexError: list index out of range
The directory layout is as follows:
lxy52@YSTYLE-PC MINGW64 /d/Code/Python/ChatGLM-6B (main)
$ tree -d -L 2
.
|-- ChatGLM-webui
| |-- modules
| |-- outputs
| `-- scripts
|-- THUDM
| |-- chatglm-6b
| |-- chatglm-6b-int4
| |-- chatglm-6b-int4-qe
| `-- chatglm-6b-main
|-- examples
|-- limitations
|-- outputs
| |-- markdown
| `-- save
`-- resources
15 directories
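Running python .\webui.py --model-path ..\THUDM\chatglm-6b-int4-qe\ from /d/Code/Python/ChatGLM-6B/ChatGLM-webui
produces the error below. Loading chatglm-6b-int4 fails as well. Can the model directory not be given as a parent-relative path (..), or is something else wrong? A version from last week worked; it stopped working after updating.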
(ChatGLM) PS D:\Code\Python\ChatGLM-6B\ChatGLM-webui> python .\webui.py --model-path ..\THUDM\chatglm-6b-int4-qe\
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
No compiled kernel found.
Compiling kernels : C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.c -shared -o C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.so
Kernels compiled : C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.so
Cannot load cpu kernel, don't use quantized model on cpu.
Using quantization cache
Applying quantization to glm layers
GPU memory: 8.59 GB
No compiled kernel found.
Compiling kernels : C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.c -shared -o C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.so
Kernels compiled : C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.so
Traceback (most recent call last):
File "D:\Code\Python\ChatGLM-6B\ChatGLM-webui\webui.py", line 52, in <module>
init()
File "D:\Code\Python\ChatGLM-6B\ChatGLM-webui\webui.py", line 24, in init
load_model()
File "D:\Code\Python\ChatGLM-6B\ChatGLM-webui\modules\model.py", line 61, in load_model
prepare_model()
File "D:\Code\Python\ChatGLM-6B\ChatGLM-webui\modules\model.py", line 42, in prepare_model
model = model.half().quantize(4).cuda()
File "C:\Users\lxy52/.cache\huggingface\modules\transformers_modules\modeling_chatglm.py", line 1281, in quantize
load_cpu_kernel(**kwargs)
File "C:\Users\lxy52/.cache\huggingface\modules\transformers_modules\quantization.py", line 390, in load_cpu_kernel
cpu_kernels = CPUKernel(**kwargs)
File "C:\Users\lxy52/.cache\huggingface\modules\transformers_modules\quantization.py", line 157, in __init__
kernels = ctypes.cdll.LoadLibrary(kernel_file)
File "D:\Application\Miniconda3\envs\ChatGLM\lib\ctypes\__init__.py", line 452, in LoadLibrary
return self._dlltype(name)
self._handle = _dlopen(self._name, mode)
rnels_parallel.so' (or one of its dependencies). Try using the full path with constructor syntax.
FileNotFoundError: Could not find module 'nvcuda.dll' (or one of its dependencies). Try using the full path with constructor syntax.
How do I fix this, everyone?
Please optimize the Markdown rendering: after a math formula is output, the response speed drops sharply. I hope this can be fixed.
mac urllib3 v2.0 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with LibreSSL 2.8.3.
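The automatic download goes to my C: drive, which is more than it can handle. Can the model be stored in the project directory?
I found that when the webui runs chatglm-6b, the process's memory usage climbs steadily until the process finally exits.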
It would be better if the repo included a setup script that uses venv.
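The default value of model-path is THUDM/chatglm-6b, which differs from the model/chatglm-6b given in the instructions, and this is not noted anywhere, so it is easy to misunderstand. Could the two be unified, or could this be noted in the README?
Also, running without a local model currently downloads the model into the .cache folder of the current system and uses it from there, without following the source file naming. For Windows users this can eat system-drive space (if the webui is not meant to live on the system drive), and there is a risk that routine temp-file cleanup with tools such as Huorong deletes all the model files, forcing a re-download. Wouldn't it be better to git clone https://huggingface.co/THUDM/chatglm-6b directly into model-path?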
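One way to make the behaviour described above explicit is to prefer a local clone when one exists and fall back to the Hub ID only otherwise. The sketch below is an assumption about how that could look, not the webui's actual logic; `resolve_model_path` is a hypothetical helper, and checking for `config.json` is just a simple heuristic for "this directory holds a model".

```python
from pathlib import Path

# Default --model-path reported above; used only as the fallback.
DEFAULT_REPO_ID = "THUDM/chatglm-6b"


def resolve_model_path(local_dir: str, repo_id: str = DEFAULT_REPO_ID) -> str:
    """Prefer a local clone that contains config.json, else return the Hub ID."""
    candidate = Path(local_dir)
    if (candidate / "config.json").is_file():
        return str(candidate)
    # No usable local copy: loading by repo ID downloads into the user cache.
    return repo_id
```

The returned string could then be passed to `AutoModel.from_pretrained(path, trust_remote_code=True)`, so a clone placed in the project directory is used without touching the system-drive cache.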
This is the error when streaming output is used: Generation failed: AttributeError("'ChatGLMForConditionalGeneration' object has no attribute 'stream_chat'")
`
设备和型号 =二氧化碳_bus.get_card_设备和(二氧化碳_car.get_card_型号)
`
This happened while generating code: the code itself was translated as well.
As the title says, I'd like support for a search engine that does not require an API key.
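Hi everyone, I'm the author of RWKV. There are currently Chinese and English chat models and novel-writing models, at 7B and 14B:
https://zhuanlan.zhihu.com/p/618011122
RWKV now has a pip package that can be called directly for inference. It supports INT8 quantization, a streaming mode (which can run with very little VRAM), and splitting across multiple GPUs:
https://pypi.org/project/rwkv/
Could we work together to add RWKV support? If you're interested, you can join the RWKV QQ group. Thanks. Does ChatGLM have a group as well? I'd like to join.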
Launching the webui with streaming output enabled causes a problem. What is going wrong?
Without streaming output the problem does not occur.
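Could a LAN remote-access feature be added? I don't want to sit hugging the server while chatting with the AI 👋. Or is there already a way to enable LAN access?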
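For reference, Gradio's `launch()` accepts `server_name` and `server_port`, and binding to `0.0.0.0` makes the server reachable from other machines on the LAN. The sketch below only builds those keyword arguments; `launch_options` and its `listen` flag are hypothetical names, and whether the webui already exposes such a switch is not stated here.

```python
def launch_options(listen: bool, port: int = 7860) -> dict:
    """Build keyword arguments for Gradio's launch().

    listen=True binds to all interfaces so other machines on the LAN can
    connect; listen=False keeps the server on the loopback address only.
    """
    host = "0.0.0.0" if listen else "127.0.0.1"
    return {"server_name": host, "server_port": port}


opts = launch_options(listen=True, port=17860)
# With Gradio installed, these would be passed as: ui.queue().launch(**opts)
```

Other devices would then connect to the server's LAN IP on the chosen port, provided the host firewall allows it.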
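I only implemented 4-bit; 8-bit works the same way. In model.py, change the corresponding functions to the following. On the first run, firset_run must be set to 1.
A switch could be added to the config, with the saved state detected automatically.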
def prepare_model():
    import pickle
    from transformers import AutoModel
    global model
    if cmd_opts.precision == "int4":
        # Set to 1 on the first run to quantize and save; 0 loads the saved copy.
        firset_run = 0
        if firset_run:
            model = AutoModel.from_pretrained(cmd_opts.model_path, trust_remote_code=True)
            model = model.half().quantize(4)
            print("Quantization finished")
            # Save the quantized model next to the model path (suffix "int4").
            with open(cmd_opts.model_path + "int4", 'wb') as f:
                pickle.dump(model, f)
            print("Quantized model saved")
        else:
            # Reuse the previously saved quantized model.
            with open(cmd_opts.model_path + "int4", 'rb') as f:
                model = pickle.load(f)
        model = model.cuda()
        model = model.eval()
        return
    model = AutoModel.from_pretrained(cmd_opts.model_path, trust_remote_code=True)
    if cmd_opts.cpu:
        model = model.float()
    else:
        if cmd_opts.precision == "fp16":
            model = model.half().cuda()
        elif cmd_opts.precision == "int8":
            model = model.half().quantize(8).cuda()
    model = model.eval()


def load_model():
    if cmd_opts.ui_dev:
        return
    global tokenizer, model
    from transformers import AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(cmd_opts.model_path, trust_remote_code=True)
    prepare_model()
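The manual first-run switch could be replaced by a check for the saved file, which is the "detect the saved state automatically" idea mentioned above. The following is a minimal sketch under that assumption: `load_or_build` is a hypothetical helper, and whether a real quantized model pickles cleanly in every environment is not verified here.

```python
import os
import pickle
from typing import Any, Callable


def load_or_build(cache_file: str, build: Callable[[], Any]) -> Any:
    """Load a pickled object if cache_file exists; otherwise build and save it."""
    if os.path.isfile(cache_file):
        # Unpickling runs arbitrary code, so only load files you created yourself.
        with open(cache_file, "rb") as f:
            return pickle.load(f)
    obj = build()  # the expensive step, e.g. loading and quantizing the model
    with open(cache_file, "wb") as f:
        pickle.dump(obj, f)
    return obj
```

In the function above, `build` would wrap the `from_pretrained(...).half().quantize(4)` step and `cache_file` would be `cmd_opts.model_path + "int4"`, so the first run quantizes and later runs load the saved copy without editing the source.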
In Colab, installing gradio with Python 3.10 produces the following error. Possibly ffmpy only supports up to Python 3.9. I don't know how to solve this.
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting ffmpy
Using cached ffmpy-0.3.0.tar.gz (4.8 kB)
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
Preparing metadata (setup.py) ... error
error: metadata-generation-failed
UnicodeEncodeError: 'gbk' codec can't encode character '\U0001f44b' in position 2: illegal multibyte sequence
Trying to enable streaming output gives an error: Generation failed: AttributeError("'ChatGLMForConditionalGeneration' object has no attribute 'stream_chat'")
Environment: Python 3.10.7
pip list:
Package Version
aiofiles 23.1.0
aiohttp 3.8.4
aiosignal 1.3.1
altair 4.2.2
anyio 3.6.2
async-timeout 4.0.2
attrs 22.2.0
certifi 2022.12.7
charset-normalizer 3.1.0
click 8.1.3
colorama 0.4.6
contourpy 1.0.7
cpm-kernels 1.0.11
cycler 0.11.0
entrypoints 0.4
fastapi 0.95.0
ffmpy 0.3.0
filelock 3.10.3
fonttools 4.39.2
frozenlist 1.3.3
fsspec 2023.3.0
gradio 3.23.0
h11 0.14.0
httpcore 0.16.3
httpx 0.23.3
huggingface-hub 0.13.3
icetk 0.0.4
idna 3.4
Jinja2 3.1.2
jsonschema 4.17.3
kiwisolver 1.4.4
linkify-it-py 2.0.0
markdown-it-py 2.2.0
MarkupSafe 2.1.2
matplotlib 3.7.1
mdit-py-plugins 0.3.3
mdurl 0.1.2
multidict 6.0.4
numpy 1.24.2
orjson 3.8.8
packaging 23.0
pandas 1.5.3
Pillow 9.4.0
pip 22.2.2
protobuf 3.20.0
pydantic 1.10.7
pydub 0.25.1
pyparsing 3.0.9
pyrsistent 0.19.3
python-dateutil 2.8.2
python-multipart 0.0.6
pytz 2022.7.1
PyYAML 6.0
regex 2023.3.23
requests 2.28.2
rfc3986 1.5.0
semantic-version 2.10.0
sentencepiece 0.1.97
setuptools 63.2.0
six 1.16.0
sniffio 1.3.0
starlette 0.26.1
tokenizers 0.13.2
toolz 0.12.0
torch 1.13.1+cu117
torchvision 0.14.1+cu117
tqdm 4.65.0
transformers 4.27.3
typing_extensions 4.5.0
uc-micro-py 1.0.1
urllib3 1.26.15
uvicorn 0.21.1
websockets 10.4
yarl 1.8.2
Hello, can models in .safetensors format be used?
Convert chatglm-6b-int4 with Hugging Face's tools.
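At first I used share=True, but accessing it through the external IP showed "connection refused", and the wait time for replies got longer. I then tried the reverse-proxy feature of BT Panel (宝塔). With caching disabled the page opens, but after typing a message and sending it, nothing changes and the message stays in the input box.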
Thank you for providing the web UI! However, I am currently using a laptop with an RTX3060, which means that I am only able to use the CPU. Unfortunately, the CPU is too slow and the performance is not ideal. On the other hand, I also have a MacBook Air with an M2 processor and 16GB of RAM. Would it be possible for you to create a version of the UI that is compatible with Apple Silicon macs? I would greatly appreciate it if I could use the UI on my MacBook.
The Python console shows:
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 8/8 [00:14<00:00, 1.82s/it]
GPU memory: 12.88 GB
Choosing precision int8 according to your VRAM. If you want to decide precision yourself, please add argument --precision when launching the application.
Running on local URL: http://127.0.0.1:17860
To create a public link, set `share=True` in `launch()`.
When chatting in the web page, sending a message fails and the following message appears in the top-right corner.
Something went wrong
Expecting value: line 1 column 1 (char 0)
Any advice would be greatly appreciated.
No module named 'transformers_modules.THUDM/chatglm-6b'
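On Windows, some requests take more than 7 s to respond, and in those cases the web UI frequently loses answers; the earlier answers only reappear after asking a simple question and getting its reply. After some analysis I found that removing queue() from ui.queue().launch(..) and calling ui.launch() directly solves it.
According to the documentation, queue() is designed for cases where a response takes longer than 60 s; using it here causes interaction problems instead.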
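As the title says.
As the title says. With the current approach, users in mainland China almost always need a proxy to download the full pretrained model, yet the proxy has to be turned off after startup or an error occurs. Even after the model is fully downloaded, startup still performs an online self-check and fails without network access. This is awkward: it cannot start without the proxy, and the proxy has to be turned off before it can be used. I hope the author can remove the redundant requirement that the startup model self-check be done online. Thanks.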
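A possible workaround on the user side, sketched here as an assumption rather than a confirmed fix: Hugging Face libraries read the `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE` environment variables, so setting them before `transformers` is imported asks the libraries to work from local files only. `enable_offline_if_local` is a hypothetical helper, and whether this removes every network check in this webui has not been verified.

```python
import os
from pathlib import Path


def enable_offline_if_local(model_path: str) -> bool:
    """Switch Hugging Face libraries to offline mode when a local model exists.

    Must run before `transformers` is imported. Returns True if offline mode
    was enabled, False if model_path does not look like a local model folder.
    """
    if (Path(model_path) / "config.json").is_file():
        os.environ["HF_HUB_OFFLINE"] = "1"
        os.environ["TRANSFORMERS_OFFLINE"] = "1"
        return True
    return False
```

When the function returns False the variables are left untouched, so a first-time download over the network still works as before.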
The official Hugging Face repository now supports streaming output; the interface to call is model.stream_chat()
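Several reports above hit `AttributeError: ... has no attribute 'stream_chat'`, which is what happens when the locally cached model code predates that method. A defensive caller could fall back to `chat()` in that case. The sketch below assumes `stream_chat(tokenizer, query, history=...)` yields `(response, history)` pairs and `chat(...)` returns one such pair; `generate` and `OldModel` are hypothetical names used only for illustration.

```python
from typing import Any, Iterator, List, Tuple

History = List[Tuple[str, str]]


def generate(model: Any, tokenizer: Any, query: str,
             history: History) -> Iterator[Tuple[str, History]]:
    """Yield (response, history) pairs, streaming when the model supports it."""
    if hasattr(model, "stream_chat"):
        yield from model.stream_chat(tokenizer, query, history=history)
    else:
        # Older model code only exposes chat(), which returns one final result.
        yield model.chat(tokenizer, query, history=history)


class OldModel:
    """Stand-in for model code that has no stream_chat method."""

    def chat(self, tokenizer, query, history=None):
        new_history = (history or []) + [(query, "ok")]
        return "ok", new_history


for response, new_history in generate(OldModel(), None, "hi", []):
    print(response)
```

Updating the cached model files so that `stream_chat` exists is still the real fix; the fallback only keeps the UI usable in the meantime.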
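After entering text and clicking Submit, the interface shows the following error:
Something went wrong
Expecting value: line 1 column 1 (char 0)
I saw someone suggest removing queue, but removing it produces a queue-related error. How can this be solved?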