
soulteary / docker-llama2-chat


Play LLaMA2 (official / Chinese / INT4 / llama2.cpp) Together! ONLY 3 STEPS! (non-GPU / 5GB vRAM / 8~14GB vRAM)

Home Page: https://www.zhihu.com/people/soulteary/posts

License: Apache License 2.0

Languages: Python 96.73%, Roff 0.18%, Shell 3.09%
Topics: llama, llama2, llm, llama2-docker, llama2-playground

docker-llama2-chat's Issues

After deploying the small model locally in a CPU environment, how do I expose it through a RESTful API?

First, thanks to the author for making this so quick to try out.

# Download Chinese-Llama-2-7b-ggml-q4.bin yourself and put it in `pwd`/soulteary, then this just runs
docker run --ulimit memlock=-1 --ulimit stack=67108864 --rm -it -v `pwd`/soulteary:/app/soulteary soulteary/llama2:runtime bash
# Now you can start chatting
./main -m /app/soulteary/Chinese-Llama-2-7b-ggml-q4.bin -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt


Personally, I feel these small "sparrow" models will actually have more opportunities to land in real applications; they are already good enough for many scenarios.
The obvious next step is API access: there needs to be a RESTful API so the model can integrate easily with other systems and applications.

How should I go about this? Any advice would be much appreciated.
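
One low-lift option, sketched under assumptions: the llama.cpp code that the runtime image wraps ships a server example exposing an HTTP completion endpoint. Assuming that binary is present in soulteary/llama2:runtime (worth verifying) and that you publish the port with -p 8080:8080 on the docker run line, something like the following should give you a REST-style API; the endpoint and fields follow upstream llama.cpp and may differ in this image:

# Hedged sketch: assumes the llama.cpp `server` example binary exists in the runtime image
./server -m /app/soulteary/Chinese-Llama-2-7b-ggml-q4.bin --host 0.0.0.0 --port 8080
# from another shell: POST a prompt to the completion endpoint
curl -X POST http://127.0.0.1:8080/completion -H "Content-Type: application/json" -d '{"prompt": "User: Hello\nAssistant:", "n_predict": 128}'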

HeaderTooLarge when testing

I tried deploying it. I had this error:

Traceback (most recent call last):
File "/app/model.py", line 10, in
model = AutoModelForCausalLM.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 493, in from_pretrained
return model_class.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 2903, in from_pretrained
) = cls._load_pretrained_model(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3246, in _load_pretrained_model
state_dict = load_state_dict(shard_file)
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 447, in load_state_dict
with safe_open(checkpoint_file, framework="pt") as f:
safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge

Any idea?
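
HeaderTooLarge almost always means the .safetensors file on disk is not real weights. safetensors reads the first 8 bytes of the file as a little-endian header length, and a git-lfs pointer file (plain text beginning with "version https://git-lfs...") decodes to an absurdly large number. A quick check, with hypothetical paths you should adjust to your setup:

# Hypothetical paths: point these at your actual model directory
ls -lh /app/meta-llama/Llama-2-7b-chat-hf/*.safetensors   # lfs pointer files are only ~135 bytes
head -c 120 /app/meta-llama/Llama-2-7b-chat-hf/model-00001-of-00002.safetensors
# if this prints "version https://git-lfs.github.com/spec/v1", you only have pointer files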

safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge

I am trying to run the 7b-chat, but getting this error

Traceback (most recent call last):
File "/app/app.py", line 6, in
from model import run
File "/app/model.py", line 10, in
model = AutoModelForCausalLM.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 493, in from_pretrained
return model_class.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 2903, in from_pretrained
) = cls._load_pretrained_model(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3246, in _load_pretrained_model
state_dict = load_state_dict(shard_file)
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 447, in load_state_dict
with safe_open(checkpoint_file, framework="pt") as f:
safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge
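
This is the same root cause as the issue above: the model repository was cloned without the large files, so the .safetensors files are git-lfs pointers rather than weights. Re-fetching them from inside the cloned model directory should fix it (assuming the model was obtained via git clone):

# Run inside the cloned model repository
git lfs install
git lfs pull   # downloads the real weight files the pointers refer to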

scripts/run-13b.sh fails with http 401 from huggingface.co URLs

Environment: Google Cloud, Nvidia A100 40 GB, 12vCPU, 100 GB disk
Docker and CUDA 12.1 are installed.

This part is OK:

git clone https://github.com/soulteary/docker-llama2-chat
scripts/make-13b.sh

Access from Google VM to huggingface.co seems to be ok (ping 10-12ms)

This part FAILS.

scripts/run-13b.sh

Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py", line 261, in hf_raise_for_status
response.raise_for_status()
File "/usr/local/lib/python3.10/dist-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/meta-llama/Llama-2-13b-chat-hf/resolve/main/config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py", line 417, in cached_file
resolved_file = hf_hub_download(
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 1195, in hf_hub_download
metadata = get_hf_file_metadata(
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 1541, in get_hf_file_metadata
hf_raise_for_status(r)
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py", line 293, in hf_raise_for_status
raise RepositoryNotFoundError(message, response) from e
huggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-64c0a70b-7e218e8a7f87e86a5fbfb030;382d0c02-cba0-4312-b459-953c3d6951bb)

Repository Not Found for url: https://huggingface.co/meta-llama/Llama-2-13b-chat-hf/resolve/main/config.json.
Please make sure you specified the correct repo_id and repo_type.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/app/app.py", line 6, in
from model import run
File "/app/model.py", line 10, in
config = AutoConfig.from_pretrained(model_id)
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 983, in from_pretrained
config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 617, in get_config_dict
config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 672, in _get_config_dict
resolved_config_file = cached_file(
File "/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py", line 433, in cached_file
raise EnvironmentError(
OSError: meta-llama/Llama-2-13b-chat-hf is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with use_auth_token or log in with huggingface-cli login and pass use_auth_token=True.
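
The meta-llama repositories on Hugging Face are gated: you first have to request access on the model page (and accept Meta's license) under your HF account, then authenticate with a token, otherwise every download returns 401. A sketch follows; how the token gets into the container is an assumption, so adjust it to this repo's scripts:

# On the host: create a read token at https://huggingface.co/settings/tokens
export HUGGING_FACE_HUB_TOKEN=hf_xxx   # placeholder value
# Forward it into the container by adding this to the docker run line in scripts/run-13b.sh:
#   -e HUGGING_FACE_HUB_TOKEN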

How do I change the port the Docker app is shared on from 7860 to another port?

I changed port 7860 to 8000 in the docker script:
docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --rm -it -v `pwd`/LinkSoul:/app/LinkSoul -p 7860:8000 soulteary/llama2:7b-cn

But after starting it, the screen still shows 7860:
Running on local URL: http://0.0.0.0:7860

To create a public link, set share=True in launch().

Can the port mapping for this site be changed?
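
Yes, but the -p flag is ordered host:container. Gradio listens on 7860 inside the container, so -p 7860:8000 maps host port 7860 to container port 8000, where nothing is listening. To serve on host port 8000, map it the other way; the startup log will still print 7860 because that is the port as seen from inside the container:

docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --rm -it -v `pwd`/LinkSoul:/app/LinkSoul -p 8000:7860 soulteary/llama2:7b-cn
# then open http://localhost:8000 on the host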

Error when running bash make-7b-cn.sh

$ bash make-7b-cn.sh
ERROR: could not find docker: CreateFile docker: The system cannot find the file specified.
2023/08/01 17:02:02 http2: server: error reading preface from client //./pipe/docker_engine: file has already been closed
ERROR: could not find docker: CreateFile docker: The system cannot find the file specified.
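This error comes from the Docker CLI on Windows failing to reach the engine's named pipe (//./pipe/docker_engine), which usually means Docker Desktop is not running (or docker is not on PATH). A quick check before re-running the script:

docker info   # should print server details; if it errors, start Docker Desktop first and retry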

Hoping to get in touch

Dear docker-llama2-chat developers, I am Jianmi (尖米), an InternLM community developer and volunteer. Your work has been a great help to me, and I think it could also be put to good use with InternLM. My WeChat is mzm312; I hope we can get in touch.

OSError: You seem to have cloned a repository without having git-lfs installed


I followed the tutorial at:
https://soulteary.com/2023/07/21/use-docker-to-quickly-get-started-with-the-chinese-version-of-llama2-open-source-large-model.html

Running the container with sh scripts/run-7b-cn.sh gives this error:

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

NOTE: CUDA Forward Compatibility mode ENABLED.
Using CUDA 12.1 driver version 530.30.02 with kernel driver version 525.105.17.
See https://docs.nvidia.com/deploy/cuda-compatibility/ for details.

Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 460, in load_state_dict
return torch.load(checkpoint_file, map_location="cpu")
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 883, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1101, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/app/app.py", line 6, in
from model import run
File "/app/model.py", line 10, in
model = AutoModelForCausalLM.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 493, in from_pretrained
return model_class.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 2903, in from_pretrained
) = cls._load_pretrained_model(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3246, in _load_pretrained_model
state_dict = load_state_dict(shard_file)
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 465, in load_state_dict
raise OSError(
OSError: You seem to have cloned a repository without having git-lfs installed. Please install git-lfs and run git lfs install followed by git lfs pull in the folder you cloned.
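
The fix is the one the error message itself suggests: install git-lfs, then pull the real weight files from inside the cloned model directory (the apt commands assume a Debian/Ubuntu host; adjust for your distribution):

sudo apt-get update && sudo apt-get install -y git-lfs   # assumes Debian/Ubuntu
cd Chinese-Llama-2-7b   # the cloned model repository
git lfs install
git lfs pull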

ValueError

Hello! I followed the article "Use Docker containers to quickly get started with Meta AI's LLaMA2 open-source large model".

But I got this error: ValueError: The following model_kwargs are not used by the model: ['token_type_ids'] (note: typos in the generate arguments will also show up in this list). What is going on? I can already access Gradio locally.
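
This is a known quirk of the LLaMA tokenizer in transformers 4.31: the fast tokenizer emits token_type_ids, which LlamaForCausalLM's generate() does not accept. A sketch of the usual workaround (variable names are illustrative, not necessarily the repo's actual model.py); passing return_token_type_ids=False to the tokenizer call works too:

# Illustrative fix: strip token_type_ids before calling generate()
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
inputs.pop("token_type_ids", None)
output_ids = model.generate(**inputs, max_new_tokens=256)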

No such file or directory

The steps were as follows:

  1. docker pull soulteary/llama2:converter
  2. docker run --ulimit memlock=-1 --ulimit stack=67108864 --rm -it -v `pwd`/LinkSoul:/app/LinkSoul -v `pwd`/soulteary:/app/soulteary soulteary/llama2:converter bash
  3. python3 convert.py /app/LinkSoul/Chinese-Llama-2-7b/ --outfile /app/soulteary/Chinese-Llama-2-7b-ggml.bin

Error:
No such file or directory: '/app/LinkSoul/Chinese-Llama-2-7b'
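
The converter only mounts `pwd`/LinkSoul into the container; it does not download anything, so the LinkSoul model has to be on disk before step 2. A sketch of fetching it first (requires git-lfs; LinkSoul/Chinese-Llama-2-7b is the Hugging Face repo id the tutorial uses):

# Run on the host, in the same directory you run docker from, before step 2
git lfs install
git clone https://huggingface.co/LinkSoul/Chinese-Llama-2-7b LinkSoul/Chinese-Llama-2-7b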

soulteary/llama2:base not found

Hi, I got the following error when running the script via bash. I also checked the soulteary images on Docker Hub and there does not seem to be any image tagged base. What is causing this error?

Step 1/3 : FROM soulteary/llama2:base
manifest for soulteary/llama2:base not found: manifest unknown: manifest unknown
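
The base tag is built locally rather than published to Docker Hub, so it has to exist before the app image build runs. As a sketch, assuming the repository keeps a base Dockerfile (the file name below is a guess; check the repo for the actual path or a make-base style script):

# Hypothetical Dockerfile path: adjust to the base Dockerfile actually in the repo
docker build -t soulteary/llama2:base -f docker/Dockerfile.base .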

Error loading the quantized llama2 model

With llama2-7b-chat-hf, I followed the provided quantization steps to get a 4-bit model and filled in the remaining model files. When loading it with AutoModelForCausalLM.from_pretrained, I get NotImplementedError: Cannot copy out of meta tensor; no data!
Environment:
accelerate==0.21.0
bitsandbytes==0.40.2
gradio==3.37.0
protobuf==3.20.3
scipy==1.11.1
sentencepiece==0.1.99
transformers==4.31.0
torch==1.13.0a0+340c412
cuda==11.7
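
"Cannot copy out of meta tensor; no data" usually means some weights never got materialized: either shard files are missing or truncated (see the git-lfs issues above), or .to(device) was called on a model loaded with device_map/low_cpu_mem_usage. Note also that, as far as I know, transformers 4.31 cannot reliably save and then reload a bitsandbytes 4-bit checkpoint, so a sketch of the alternative is to quantize on the fly from the original fp16 weights (model_dir is a placeholder for your checkpoint directory):

# Illustrative sketch: let accelerate place the 4-bit weights; do not call model.to(device) afterwards
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,                      # placeholder: the original fp16 checkpoint directory
    quantization_config=bnb_config,
    device_map="auto",
)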
