
camenduru / text-generation-webui-colab


A Colab Gradio web UI for running Large Language Models

License: The Unlicense

Jupyter Notebook 100.00%
alpaca colab colab-notebook colaboratory gradio koala lama llama llamas llm vicuna

text-generation-webui-colab's Introduction

๐Ÿฃ Please follow me for new updates https://twitter.com/camenduru
๐Ÿ”ฅ Please join our discord server https://discord.gg/k5BwmmvJJU
๐Ÿฅณ Please join my patreon community https://patreon.com/camenduru

🚦 WIP 🚦

🦒 Colab

Colab | Info | Model Page
Open In Colab | vicuna-13b-GPTQ-4bit-128g | https://vicuna.lmsys.org
Open In Colab | vicuna-13B-1.1-GPTQ-4bit-128g | https://vicuna.lmsys.org
Open In Colab | stable-vicuna-13B-GPTQ-4bit-128g | https://huggingface.co/CarperAI/stable-vicuna-13b-delta
Open In Colab | gpt4-x-alpaca-13b-native-4bit-128g | https://huggingface.co/chavinlo/gpt4-x-alpaca
Open In Colab | pyg-7b-GPTQ-4bit-128g | https://huggingface.co/Neko-Institute-of-Science/pygmalion-7b
Open In Colab | koala-13B-GPTQ-4bit-128g | https://bair.berkeley.edu/blog/2023/04/03/koala
Open In Colab | oasst-llama13b-GPTQ-4bit-128g | https://open-assistant.io
Open In Colab | wizard-lm-uncensored-7b-GPTQ-4bit-128g | https://github.com/nlpxucan/WizardLM
Open In Colab | mpt-storywriter-7b-GPTQ-4bit-128g | https://www.mosaicml.com
Open In Colab | wizard-lm-uncensored-13b-GPTQ-4bit-128g | https://github.com/nlpxucan/WizardLM
Open In Colab | pyg-13b-GPTQ-4bit-128g | https://huggingface.co/PygmalionAI/pygmalion-13b
Open In Colab | falcon-7b-instruct-GPTQ-4bit | https://falconllm.tii.ae/
Open In Colab | wizard-lm-13b-1.1-GPTQ-4bit-128g | https://github.com/nlpxucan/WizardLM
Open In Colab | llama-2-7b-chat-GPTQ-4bit (4bit) | https://ai.meta.com/llama/
Open In Colab | llama-2-13b-chat-GPTQ-4bit (4bit) | https://ai.meta.com/llama/

🚦 WIP 🚦 please try llama-2-13b-chat or llama-2-7b-chat or llama-2-7b-chat-GPTQ-4bit

Open In Colab | llama-2-7b-chat (16bit) | https://ai.meta.com/llama/
Open In Colab | llama-2-13b-chat (8bit) | https://ai.meta.com/llama/
Open In Colab | redmond-puffin-13b-GPTQ-4bit (4bit) | https://huggingface.co/NousResearch/Redmond-Puffin-13B
Open In Colab | stable-beluga-7b (16bit) | https://huggingface.co/stabilityai/StableBeluga-7B
Open In Colab | doctor-gpt-7b (16bit) | https://ai.meta.com/llama/ (https://github.com/llSourcell/DoctorGPT)
Open In Colab | code-llama-7b (16bit) | https://github.com/facebookresearch/codellama
Open In Colab | code-llama-instruct-7b (16bit) | https://github.com/facebookresearch/codellama
Open In Colab | code-llama-python-7b (16bit) | https://github.com/facebookresearch/codellama
Open In Colab | mistral-7b-Instruct-v0.1-8bit (8bit) | https://mistral.ai/
Open In Colab | mytho-max-l2-13b-GPTQ (4bit) | https://huggingface.co/Gryphe/MythoMax-L2-13b

🦒 Colab Pro

According to the Facebook Research LLaMA license (a non-commercial bespoke license), maybe we cannot use this model with a Colab Pro account. But Yann LeCun said "GPL v3" (https://twitter.com/ylecun/status/1629189925089296386), so I am a little confused. Is it possible to use this with a non-free (paid) Colab Pro account?

Tutorial

https://www.youtube.com/watch?v=kgA7eKU1XuA

⚠ If you encounter an IndexError: list index out of range error, please set the model's instruction template.


Text Generation Web UI

https://github.com/oobabooga/text-generation-webui (Thanks to @oobabooga ❤)

Models License

Model | License
vicuna-13b-GPTQ-4bit-128g | From https://vicuna.lmsys.org: The online demo is a research preview intended for non-commercial use only, subject to the model License of LLaMA, Terms of Use of the data generated by OpenAI, and Privacy Practices of ShareGPT. Please contact us if you find any potential violation. The code is released under the Apache License 2.0.
gpt4-x-alpaca-13b-native-4bit-128g | https://huggingface.co/chavinlo/alpaca-native -> https://huggingface.co/chavinlo/alpaca-13b -> https://huggingface.co/chavinlo/gpt4-x-alpaca
llama-2 | https://ai.meta.com/llama/ Llama 2 is available for free for research and commercial use. 🥳

Special Thanks

Thanks to facebookresearch ❤ for https://github.com/facebookresearch/llama
Thanks to lmsys ❤ for https://huggingface.co/lmsys/vicuna-13b-delta-v0
Thanks to anon8231489123 ❤ for https://huggingface.co/anon8231489123/vicuna-13b-GPTQ-4bit-128g (GPTQ 4bit quantization of: https://huggingface.co/lmsys/vicuna-13b-delta-v0)
Thanks to tatsu-lab ❤ for https://github.com/tatsu-lab/stanford_alpaca
Thanks to chavinlo ❤ for https://huggingface.co/chavinlo/gpt4-x-alpaca
Thanks to qwopqwop200 ❤ for https://github.com/qwopqwop200/GPTQ-for-LLaMa
Thanks to tsumeone ❤ for https://huggingface.co/tsumeone/gpt4-x-alpaca-13b-native-4bit-128g-cuda (GPTQ 4bit quantization of: https://huggingface.co/chavinlo/gpt4-x-alpaca)
Thanks to transformers ❤ for https://github.com/huggingface/transformers
Thanks to gradio-app ❤ for https://github.com/gradio-app/gradio
Thanks to TheBloke ❤ for https://huggingface.co/TheBloke/stable-vicuna-13B-GPTQ
Thanks to Neko-Institute-of-Science ❤ for https://huggingface.co/Neko-Institute-of-Science/pygmalion-7b
Thanks to gozfarb ❤ for https://huggingface.co/gozfarb/pygmalion-7b-4bit-128g-cuda (GPTQ 4bit quantization of: https://huggingface.co/Neko-Institute-of-Science/pygmalion-7b)
Thanks to young-geng ❤ for https://huggingface.co/young-geng/koala
Thanks to TheBloke ❤ for https://huggingface.co/TheBloke/koala-13B-GPTQ-4bit-128g (GPTQ 4bit quantization of: https://huggingface.co/young-geng/koala)
Thanks to dvruette ❤ for https://huggingface.co/dvruette/oasst-llama-13b-2-epochs
Thanks to gozfarb ❤ for https://huggingface.co/gozfarb/oasst-llama13b-4bit-128g (GPTQ 4bit quantization of: https://huggingface.co/dvruette/oasst-llama-13b-2-epochs)
Thanks to ehartford ❤ for https://huggingface.co/ehartford/WizardLM-7B-Uncensored
Thanks to TheBloke ❤ for https://huggingface.co/TheBloke/WizardLM-7B-uncensored-GPTQ (GPTQ 4bit quantization of: https://huggingface.co/ehartford/WizardLM-7B-Uncensored)
Thanks to mosaicml ❤ for https://huggingface.co/mosaicml/mpt-7b-storywriter
Thanks to OccamRazor ❤ for https://huggingface.co/OccamRazor/mpt-7b-storywriter-4bit-128g (GPTQ 4bit quantization of: https://huggingface.co/mosaicml/mpt-7b-storywriter)
Thanks to ehartford ❤ for https://huggingface.co/ehartford/WizardLM-13B-Uncensored
Thanks to ausboss ❤ for https://huggingface.co/ausboss/WizardLM-13B-Uncensored-4bit-128g (GPTQ 4bit quantization of: https://huggingface.co/ehartford/WizardLM-13B-Uncensored)
Thanks to PygmalionAI ❤ for https://huggingface.co/PygmalionAI/pygmalion-13b
Thanks to notstoic ❤ for https://huggingface.co/notstoic/pygmalion-13b-4bit-128g (GPTQ 4bit quantization of: https://huggingface.co/PygmalionAI/pygmalion-13b)
Thanks to WizardLM ❤ for https://huggingface.co/WizardLM/WizardLM-13B-V1.1
Thanks to TheBloke ❤ for https://huggingface.co/TheBloke/WizardLM-13B-V1.1-GPTQ (GPTQ 4bit quantization of: https://huggingface.co/WizardLM/WizardLM-13B-V1.1)
Thanks to meta-llama ❤ for https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
Thanks to TheBloke ❤ for https://huggingface.co/TheBloke/Llama-2-7b-Chat-GPTQ (GPTQ 4bit quantization of: https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)
Thanks to meta-llama ❤ for https://huggingface.co/meta-llama/Llama-2-13b-chat-hf
Thanks to localmodels ❤ for https://huggingface.co/localmodels/Llama-2-13B-Chat-GPTQ (GPTQ 4bit quantization of: https://huggingface.co/meta-llama/Llama-2-13b-chat-hf)
Thanks to NousResearch ❤ for https://huggingface.co/NousResearch/Redmond-Puffin-13B
Thanks to TheBloke ❤ for https://huggingface.co/TheBloke/Redmond-Puffin-13B-GPTQ (GPTQ 4bit quantization of: https://huggingface.co/NousResearch/Redmond-Puffin-13B)
Thanks to llSourcell ❤ for https://huggingface.co/llSourcell/medllama2_7b
Thanks to MetaAI ❤ for https://ai.meta.com/research/publications/code-llama-open-foundation-models-for-code/
Thanks to TheBloke ❤ for https://huggingface.co/TheBloke/CodeLlama-7B-fp16
Thanks to TheBloke ❤ for https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-fp16
Thanks to TheBloke ❤ for https://huggingface.co/TheBloke/CodeLlama-7B-Python-fp16
Thanks to MistralAI ❤ for https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
Thanks to Gryphe ❤ for https://huggingface.co/Gryphe/MythoMax-L2-13b
Thanks to TheBloke ❤ for https://huggingface.co/TheBloke/MythoMax-L2-13B-GPTQ (GPTQ 4bit quantization of: https://huggingface.co/Gryphe/MythoMax-L2-13b)

Medical Advice Disclaimer

DISCLAIMER: THIS WEBSITE DOES NOT PROVIDE MEDICAL ADVICE. The information, including but not limited to text, graphics, images and other material contained on this website, is for informational purposes only. No material on this site is intended to be a substitute for professional medical advice, diagnosis or treatment. Always seek the advice of your physician or other qualified health care provider with any questions you may have regarding a medical condition or treatment and before undertaking a new health care regimen, and never disregard professional medical advice or delay in seeking it because of something you have read on this website.

text-generation-webui-colab's People

Contributors

camenduru, r3gm


text-generation-webui-colab's Issues

Trying to put an image on the character but it fails

Hello! I'm using the Colab version of this and I'm trying to import an image for the character, but nothing seems to work for some reason. I've been trying to do this for a while and I don't know why it's giving me so many errors! Can someone help me solve this?
(Screenshots attached.)

cannot import name 'is_npu_available' from 'accelerate.utils'

File "/usr/local/lib/python3.10/dist-packages/peft/utils/other.py", line 24
    from accelerate.utils import is_npu_available, is_xpu_available
ImportError: cannot import name 'is_npu_available' from 'accelerate.utils'
(/usr/local/lib/python3.10/dist-packages/accelerate/utils/__init__.py)

Accelerate is actually installed, though.
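Accelerate being installed is not quite enough here: `is_npu_available` only exists in newer accelerate releases, so the installed peft is ahead of the installed accelerate. A minimal workaround sketch for the Colab cell; treating "latest of both" as a matched pair is an assumption:

!pip install -U accelerate peft   # assumption: a matched pair where accelerate.utils exports is_npu_available
# then restart the runtime (or at least the server cell) so the new versions are imported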

GPTQ 4-bit does not produce any output on Colab; error: IndexError: list index out of range

GPTQ 4-bit does not produce any output on Colab.
Error: IndexError: list index out of range

Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/gradio/routes.py", line 427, in run_predict
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1323, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1067, in call_function
prediction = await utils.async_iteration(iterator)
File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 336, in async_iteration
return await iterator.__anext__()
File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 329, in __anext__
return await anyio.to_thread.run_sync(
File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 807, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 312, in run_sync_iterator_async
return next(iterator)
File "/content/text-generation-webui/modules/chat.py", line 305, in generate_chat_reply_wrapper
for i, history in enumerate(generate_chat_reply(text, state, regenerate, _continue, loading_message=True)):
File "/content/text-generation-webui/modules/chat.py", line 290, in generate_chat_reply
for history in chatbot_wrapper(text, state, regenerate=regenerate, _continue=_continue, loading_message=loading_message):
File "/content/text-generation-webui/modules/chat.py", line 194, in chatbot_wrapper
stopping_strings = get_stopping_strings(state)
File "/content/text-generation-webui/modules/chat.py", line 161, in get_stopping_strings
state['turn_template'].split('<|user-message|>')[1].split('<|bot|>')[0] + '<|bot|>',
IndexError: list index out of range
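The failing line splits `state['turn_template']` on `<|user-message|>` and then on `<|bot|>`, so the IndexError appears whenever the selected instruction template's turn_template lacks those placeholders (for example, when no template is set, as the note near the top of this page says). A minimal sketch of writing a Vicuna-style template from the notebook before starting the UI; the folder and the exact values are assumptions:

# hypothetical sketch: write an instruction template whose turn_template
# contains the <|user-message|> and <|bot|> placeholders that modules/chat.py splits on
template = '''user: "USER:"
bot: "ASSISTANT:"
turn_template: "<|user|> <|user-message|>\\n<|bot|> <|bot-message|>\\n"
context: "A chat between a curious user and an artificial intelligence assistant.\\n\\n"
'''
path = "/content/text-generation-webui/characters/instruction-following/Vicuna-v1.1.yaml"  # path is an assumption for this version
with open(path, "w") as f:
    f.write(template)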

New to *free* Google Colab

Being on an NVIDIA T4, is it possible to utilize xformers and use ExLlamaV2 as the loader for a (Mistral flavor of your choice) GPTQ 4-bit 32g model? I have a feeling it would perform blazingly fast with minimal degradation and great context... but you've spent more time on this...

[Feature Request] Support InternLM

Dear text-generation-webui-colab developer,

Greetings! I am vansinhu, a community developer and volunteer at InternLM. Your work has been immensely beneficial to me, and I believe it can be effectively utilized in InternLM as well. Welcome to add Discord https://discord.gg/gF9ezcmtM3 . I hope to get in touch with you.

Best regards,
vansinhu

mpt-7b-chat

Hello, can you make an mpt-7b-chat Google Colab, please?
Or are you waiting on anything?

vicuna-13b-GPTQ-4bit-128g.ipynb seems to have a dependency conflict

Not sure what I'm doing wrong, but it seems transformers might have conflicting version numbers, or PIL.Image.Resampling isn't available for some reason.

Running https://colab.research.google.com/github/camenduru/text-generation-webui-colab/blob/main/vicuna-13b-GPTQ-4bit-128g.ipynb gave me output that ends with:

Status Legend:
(OK):download completed.
/content/text-generation-webui
Traceback (most recent call last):
  File "/content/text-generation-webui/server.py", line 18, in <module>
    from modules import api, chat, shared, training, ui
  File "/content/text-generation-webui/modules/api.py", line 6, in <module>
    from modules.text_generation import generate_reply
  File "/content/text-generation-webui/modules/text_generation.py", line 7, in <module>
    import transformers
ModuleNotFoundError: No module named 'transformers'

which I've traced to the last line !python server.py --share --chat --wbits 4 --groupsize 128

Running the following in a new code block (no version numbers) to address missing deps didn't seem to get me very far either:

!pip install transformers accelerate datasets peft safetensors SentencePiece
!python server.py --share --chat --wbits 4 --groupsize 128

still gave me this error:

AttributeError: module 'PIL.Image' has no attribute 'Resampling'

Other references:

https://github.com/camenduru/text-generation-webui/blob/main/requirements.txt

https://github.com/camenduru/text-generation-webui/blob/main/server.py
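The ModuleNotFoundError suggests the requirements install did not complete (or the runtime was restarted afterwards), and the later Resampling error points at an old Pillow: `Image.Resampling` only exists from Pillow 9.1.0. A minimal sketch of a cell that reinstalls the pinned requirements plus a new enough Pillow before launching; the exact pins are assumptions:

%cd /content/text-generation-webui
!pip install -r requirements.txt
!pip install -U "Pillow>=9.1.0"   # Image.Resampling was added in Pillow 9.1.0
!python server.py --share --chat --wbits 4 --groupsize 128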

Task exception was never retrieved

2023-08-21 06:16:14 ERROR:Task exception was never retrieved
future: <Task finished name='w0d6mwwkndk_173' coro=<Queue.process_events() done, defined at /usr/local/lib/python3.10/dist-packages/gradio/queueing.py:343> exception=1 validation error for PredictBody
event_id
Field required [type=missing, input_value={'fn_index': 173, 'data':...on_hash': 'w0d6mwwkndk'}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.1/v/missing>
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/gradio/queueing.py", line 347, in process_events
client_awake = await self.gather_event_data(event)
File "/usr/local/lib/python3.10/dist-packages/gradio/queueing.py", line 220, in gather_event_data
data, client_awake = await self.get_message(event, timeout=receive_timeout)
File "/usr/local/lib/python3.10/dist-packages/gradio/queueing.py", line 456, in get_message
return PredictBody(**data), True
File "/usr/local/lib/python3.10/dist-packages/pydantic/main.py", line 159, in init
pydantic_self.pydantic_validator.validate_python(data, self_instance=pydantic_self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for PredictBody
event_id
Field required [type=missing, input_value={'fn_index': 173, 'data':...on_hash': 'w0d6mwwkndk'}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.1/v/missing
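The validation error points at pydantic 2.1, while this gradio build still constructs PredictBody the pydantic-1 way, so a version mismatch is the most likely culprit. A minimal workaround sketch for the notebook, assuming pinning pydantic below 2 is acceptable (the pin itself is an assumption):

!pip install "pydantic<2"   # assumption: this gradio release predates pydantic v2 support
# then restart the server cell so gradio picks up the downgraded pydantic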

I use pyg-13b and here is the error.

[Bug]: OS Error: No file named pytorch_model.bin

I'm getting an OS error, "No file named pytorch_model.bin in directory models", when running the text generation web UI with stable-vicuna-13B-GPTQ-4bit-128g.

Notebook
%cd /content
!apt-get -y install -qq aria2

!git clone -b v1.7 https://github.com/camenduru/text-generation-webui
%cd /content/text-generation-webui
!pip install -r requirements.txt
!pip install -U gradio==3.28.3

!mkdir /content/text-generation-webui/repositories
%cd /content/text-generation-webui/repositories
!git clone -b v1.2 https://github.com/camenduru/GPTQ-for-LLaMa.git
%cd GPTQ-for-LLaMa
!python setup_cuda.py install

!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/stable-vicuna-13B-GPTQ/raw/main/config.json -d /content/text-generation-webui/models/stable-vicuna-13B-GPTQ -o config.json
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/stable-vicuna-13B-GPTQ/raw/main/generation_config.json -d /content/text-generation-webui/models/stable-vicuna-13B-GPTQ -o generation_config.json
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/stable-vicuna-13B-GPTQ/raw/main/special_tokens_map.json -d /content/text-generation-webui/models/stable-vicuna-13B-GPTQ -o special_tokens_map.json
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/stable-vicuna-13B-GPTQ/resolve/main/tokenizer.model -d /content/text-generation-webui/models/stable-vicuna-13B-GPTQ -o tokenizer.model
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/stable-vicuna-13B-GPTQ/raw/main/tokenizer_config.json -d /content/text-generation-webui/models/stable-vicuna-13B-GPTQ -o tokenizer_config.json
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/stable-vicuna-13B-GPTQ/resolve/main/stable-vicuna-13B-GPTQ-4bit.compat.no-act-order.safetensors -d /content/text-generation-webui/models/stable-vicuna-13B-GPTQ -o stable-vicuna-13B-GPTQ-4bit.compat.no-act-order.safetensors

%cd /content/text-generation-webui
!python server.py --share --chat --wbits 4 --groupsize 128

Output

/content/text-generation-webui
2023-07-12 06:36:29 INFO:Unwanted HTTP request redirected to localhost :)
2023-07-12 06:36:32 WARNING:The gradio "share link" feature uses a proprietary executable to create a reverse tunnel. Use it with care.
2023-07-12 06:36:35.091770: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
bin /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda118.so
2023-07-12 06:36:38 INFO:Loading stable-vicuna-13B-GPTQ...
Traceback (most recent call last):
  /content/text-generation-webui/server.py:1154 in <module>
  /content/text-generation-webui/modules/models.py:74 in load_model
  /content/text-generation-webui/modules/models.py:144 in huggingface_loader
  /usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py:484 in from_pretrained
  /usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py:2449 in from_pretrained
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory models/stable-vicuna-13B-GPTQ.
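The traceback shows the plain Transformers loader (huggingface_loader) being used, which looks for pytorch_model.bin, while the folder only contains the GPTQ .safetensors file. Making sure the model goes through the GPTQ path, for example by passing the model name together with the wbits/groupsize flags at launch (the flags are the ones this notebook already uses; adding --model here is the assumption), avoids that code path:

%cd /content/text-generation-webui
!python server.py --share --chat --wbits 4 --groupsize 128 --model stable-vicuna-13B-GPTQ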

Kinda slow

I know I don't know anything, but is there a way to make it faster?

API not working

How do I use the API to interact with the web UI from a Python terminal on my local PC?
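The web UI only answers /api/v1/generate when its api extension is running; from these notebooks that usually means launching server.py with the API flags and then pointing requests at the API URL the console prints, not at the Gradio share link. The flags and endpoint below are text-generation-webui's from roughly this period; treat their availability in a given pinned notebook as an assumption. A minimal sketch:

# in the notebook's last cell, launch with the API enabled (flag names are an assumption for this pinned version)
# !python server.py --share --chat --api --public-api --wbits 4 --groupsize 128

import requests

API_URL = "https://your-api-tunnel-url"  # hypothetical: the blocking-API URL printed in the Colab console
payload = {"prompt": "Hello, how are you?", "max_new_tokens": 64}
resp = requests.post(f"{API_URL}/api/v1/generate", json=payload, timeout=120)
print(resp.json()["results"][0]["text"])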

[Bug]: ERROR:Task exception was never retrieved

Why am I getting this error, though?
No model loads when I try to in the web UI.

WARNING:The gradio "share link" feature downloads a proprietary and unaudited blob to create a reverse tunnel. This is potentially dangerous.
bin /opt/conda/envs/textgen/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda110.so
INFO:Loading stable-vicuna-13B-GPTQ...
INFO:Found the following quantized model: models/stable-vicuna-13B-GPTQ/stable-vicuna-13B-GPTQ-4bit.compat.no-act-order.safetensors
INFO:Using the following device map for the quantized model:
INFO:Loaded the model in 55.13 seconds.
INFO:Loading the extension "gallery"...
Running on local URL: http://127.0.0.1:7860
Running on public URL: https://f1bdbcf1f947d12f33.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces
INFO:HTTP Request: POST http://127.0.0.1:7860/reset "HTTP/1.1 200 OK"
ERROR:Task exception was never retrieved
future: <Task finished name='64ppr2666p8_90' coro=<Queue.process_events() done, defined at /opt/conda/envs/textgen/lib/python3.10/site-packages/gradio/queueing.py:343> exception=1 validation error for PredictBody
event_id
Field required [type=missing, input_value={'fn_index': 90, 'data': ...on_hash': '64ppr2666p8'}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.1.2/v/missing>
Traceback (most recent call last):
File "/opt/conda/envs/textgen/lib/python3.10/site-packages/gradio/queueing.py", line 347, in process_events
client_awake = await self.gather_event_data(event)
File "/opt/conda/envs/textgen/lib/python3.10/site-packages/gradio/queueing.py", line 220, in gather_event_data
data, client_awake = await self.get_message(event, timeout=receive_timeout)
File "/opt/conda/envs/textgen/lib/python3.10/site-packages/gradio/queueing.py", line 456, in get_message
return PredictBody(**data), True
File "/opt/conda/envs/textgen/lib/python3.10/site-packages/pydantic/main.py", line 150, in init
pydantic_self.pydantic_validator.validate_python(data, self_instance=pydantic_self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for PredictBody
event_id
Field required [type=missing, input_value={'fn_index': 90, 'data': ...on_hash': '64ppr2666p8'}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.1.2/v/missing

Load new characters?

When I upload new characters, it always gives me an error. How can we fix this? Also, there seems to be no local copy saved in my Google Drive.

Endless queue

After clicking the "Running on public URL" link on Pygmalion 13B and 7B, a queue starts that never ends.
(Screenshots attached.)

Colab generates error

Colab generates error:

ValueError: Loading models/falcon-7b-instruct-GPTQ requires you to execute the
configuration file in that repo on your local machine. Make sure you have read
the code there to avoid malicious use, then set the option
trust_remote_code=True to remove this error.

Details:

2023-08-14 09:42:41 INFO:Unwanted HTTP request redirected to localhost :)
2023-08-14 09:42:44 WARNING:The gradio "share link" feature uses a proprietary executable to create a reverse tunnel. Use it with care.
2023-08-14 09:42:46.457649: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
bin /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda118.so
2023-08-14 09:42:49 INFO:Loading falcon-7b-instruct-GPTQ...
2023-08-14 09:42:49 INFO:The AutoGPTQ params are: {'model_basename': 'gptq_model-4bit-64g', 'device': 'cuda:0', 'use_triton': False, 'inject_fused_attention': True, 'inject_fused_mlp': True, 'use_safetensors': True, 'trust_remote_code': False, 'max_memory': None, 'quantize_config': None, 'use_cuda_fp16': True}
Traceback (most recent call last):
  /content/text-generation-webui/server.py:1154 in <module>
  /content/text-generation-webui/modules/models.py:74 in load_model
  /content/text-generation-webui/modules/models.py:288 in AutoGPTQ_loader
  /content/text-generation-webui/modules/AutoGPTQ_loader.py:56 in load_quantized
  /usr/local/lib/python3.10/dist-packages/auto_gptq/modeling/auto.py:79 in from_quantized
  /usr/local/lib/python3.10/dist-packages/auto_gptq/modeling/_utils.py:123 in check_and_get_model_type
  /usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py:947 in from_pretrained
  /usr/local/lib/python3.10/dist-packages/transformers/dynamic_module_utils.py:553 in resolve_trust_remote_code
ValueError: Loading models/falcon-7b-instruct-GPTQ requires you to execute the configuration file in that repo on your local machine. Make sure you have read the code there to avoid malicious use, then set the option trust_remote_code=True to remove this error.
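Falcon's config points at custom modelling code, so AutoGPTQ refuses to load it unless remote code is trusted. text-generation-webui exposes this as a launch flag; a minimal sketch, with the usual caveat that you should only trust code you have read (the flag spelling in this pinned version is an assumption):

%cd /content/text-generation-webui
!python server.py --share --chat --trust-remote-code --model falcon-7b-instruct-GPTQ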

Unable to use the google_translate_plus extension

Using camenduru/text-generation-webui-colab, I used the following code to download the extension:
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://github.com/Vasyanator/google_translate_plus/blob/main/requirements.txt -d /content/text-generation-webui/extensions/google_translate_plus -o requirements.txt
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://github.com/Vasyanator/google_translate_plus/blob/main/script.py -d /content/text-generation-webui/extensions/google_translate_plus -o script.py
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://github.com/Vasyanator/google_translate_plus/blob/main/settings.json -d /content/text-generation-webui/extensions/google_translate_plus -o settings.json

I used the following flag to load the extension:
--extensions google_translate_plus

When I try to run the model, it fails at this stage:
2023-10-15 14:59:48 INFO:Loading the extension "google_translate_plus"...
2023-10-15 14:59:48 ERROR:Failed to load the extension "google_translate_plus".

Traceback (most recent call last):
File "/content/text-generation-webui/modules/extensions.py", line 35, in load_extensions
exec(f"import extensions.{name}.script")
File "", line 1, in
File "/content/text-generation-webui/extensions/google_translate_plus/script.py", line 1, in
{"payload":{"allShortcutsEnabled":false,"fileTree":{"":{"items":[{"name":"README.md","path":"README.md","contentType":"file"},{"name":"requirements.txt","path":"requirements.txt","contentType":"file"},{"name":"script.py","path":"script.py","contentType":"file"},{"name":"settings.json","path":"settings.json","contentType":"file"}],"totalCount":4}},"fileTreeProcessingTime":1.890233,"foldersToFetch":[],"reducedMotionEnabled":null,"repo":........
NameError: name 'false' is not defined
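The NameError is a giveaway that script.py contains GitHub's HTML/JSON page payload rather than Python: the aria2c commands above download github.com/.../blob/... pages instead of the raw files. A sketch that fetches the raw files instead (same repo and branch as in the URLs above) and installs the extension's requirements:

!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://raw.githubusercontent.com/Vasyanator/google_translate_plus/main/requirements.txt -d /content/text-generation-webui/extensions/google_translate_plus -o requirements.txt
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://raw.githubusercontent.com/Vasyanator/google_translate_plus/main/script.py -d /content/text-generation-webui/extensions/google_translate_plus -o script.py
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://raw.githubusercontent.com/Vasyanator/google_translate_plus/main/settings.json -d /content/text-generation-webui/extensions/google_translate_plus -o settings.json
!pip install -r /content/text-generation-webui/extensions/google_translate_plus/requirements.txt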

Simplify some instructions

Hi,

I've just tried to make it simpler by putting some values in variables:

USER="TheBloke"
MODEL="WizardCoder-Python-13B-V1.0-GPTQ"
FILES=("config.json" "generation_config.json" "special_tokens_map.json" "tokenizer.model" "tokenizer_config.json" "model.safetensors")

then a loop to download the required files.
If you don't want to distinguish between raw and resolve:

%%bash

for FILE in "${FILES[@]}"; do
  !aria2c --console-log-level=error -c -x 16 -s 16 -k 1M "https://huggingface.co/$USER/$MODEL/resolve/main/$FILE" -d "/content/text-generation-webui/models/$MODEL" -o $FILE
done

Use this if you want to distinguish between raw and resolve:

%%bash

for FILE in "${FILES[@]}"; do
  if [[ $FILE == "tokenizer.model" || $FILE == "model.safetensors" ]]; then
    !aria2c --console-log-level=error -c -x 16 -s 16 -k 1M "https://huggingface.co/$USER/$MODEL/resolve/main/$FILE" -d "/content/text-generation-webui/models/$MODEL" -o $FILE
  else
    !aria2c --console-log-level=error -c -x 16 -s 16 -k 1M "https://huggingface.co/$USER/$MODEL/raw/main/$FILE" -d "/content/text-generation-webui/models/$MODEL" -o $FILE
  fi
done

didn't work 😢

import os

user_name = "anon8231489123" #@param {"type": "string"}

model_name = "gpt4-x-alpaca-13b-native-4bit-128g" #@param {"type": "string"}

!apt-get -y install -qq aria2
!git clone -b v1.0 https://github.com/camenduru/text-generation-webui
%cd /content/text-generation-webui
!pip install -r requirements.txt

models_path = "/content/text-generation-webui/models/"
model_path = os.path.join(models_path, model_name)
os.makedirs(model_path, exist_ok=True)

!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/{user_name}/{model_name}/raw/main/config.json -d {model_path} -o config.json
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/{user_name}/{model_name}/raw/main/generation_config.json -d {model_path} -o generation_config.json
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/{user_name}/{model_name}/raw/main/pytorch_model.bin.index.json -d {model_path} -o pytorch_model.bin.index.json
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/{user_name}/{model_name}/raw/main/special_tokens_map.json -d {model_path} -o special_tokens_map.json
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/{user_name}/{model_name}/resolve/main/tokenizer.model -d {model_path} -o tokenizer.model
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/{user_name}/{model_name}/raw/main/tokenizer_config.json -d {model_path} -o tokenizer_config.json
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/{user_name}/{model_name}/resolve/main/gpt-x-alpaca-13b-native-4bit-128g-cuda.pt -d {model_path} -o {model_name}.pt
%cd /content/text-generation-webui
!python server.py --share --chat --wbits 4 --groupsize 128 --model {model_name}

got:

Loading gpt4-x-alpaca-13b-native-4bit-128g...
Loading model ...
^C

I didn't try it on Colab Pro; is there a way to optimize this?

edit: I just found this:

tsumeone/gpt4-x-alpaca-13b-native-4bit-128g-cuda
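One likely reason the %%bash loops above fail is that the `!` prefix is IPython syntax, not bash, and separate %%bash cells do not share the variables defined earlier; a plain-Python loop avoids both problems. A minimal sketch along the lines of the second attempt, reusing the hypothetical user/model/file values from the issue:

import os, subprocess

# !apt-get -y install -qq aria2   # assumption: aria2c is installed, as in the notebooks above

user = "TheBloke"                                   # hypothetical values, as in the issue
model = "WizardCoder-Python-13B-V1.0-GPTQ"
files = ["config.json", "generation_config.json", "special_tokens_map.json",
         "tokenizer.model", "tokenizer_config.json", "model.safetensors"]

model_dir = f"/content/text-generation-webui/models/{model}"
os.makedirs(model_dir, exist_ok=True)

for name in files:
    # binary files need /resolve/ (follows the LFS redirect); small text files can use /raw/
    kind = "resolve" if name in ("tokenizer.model", "model.safetensors") else "raw"
    url = f"https://huggingface.co/{user}/{model}/{kind}/main/{name}"
    subprocess.run(["aria2c", "--console-log-level=error", "-c", "-x", "16", "-s", "16",
                    "-k", "1M", url, "-d", model_dir, "-o", name], check=True)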

falcon-7b-instruct-GPTQ-4bit.ipynb

INFO:Gradio HTTP request redirected to localhost :)
WARNING:trust_remote_code is enabled. This is dangerous.
WARNING:The gradio "share link" feature uses a proprietary executable to create a reverse tunnel. Use it with care.
2023-06-06 21:55:46.220247: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
bin /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda118.so
INFO:Loading falcon-7b-instruct-GPTQ...
INFO:The AutoGPTQ params are: {'model_basename': 'gptq_model-4bit-64g', 'device': 'cuda:0', 'use_triton': False, 'use_safetensors': True, 'trust_remote_code': True, 'max_memory': None, 'quantize_config': None}
WARNING:CUDA extension not installed.
WARNING:The safetensors archive passed at models/falcon-7b-instruct-GPTQ/gptq_model-4bit-64g.safetensors does not contain metadata. Make sure to save your model with the `save_pretrained` method. Defaulting to 'pt' metadata.
WARNING:can't get model's sequence length from model config, will set to 4096.
WARNING:RWGPTQForCausalLM hasn't fused attention module yet, will skip inject fused attention.
WARNING:RWGPTQForCausalLM hasn't fused mlp module yet, will skip inject fused mlp.
INFO:Loaded the model in 36.17 seconds.

INFO:Loading the extension "gallery"...
Running on local URL:  http://127.0.0.1:7860/
Running on public URL: https://ccd3202fc68d7be036.gradio.live/

This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces
ERROR:Task exception was never retrieved
future: <Task finished name='hszag9ma4as_118' coro=<Queue.process_events() done, defined at /usr/local/lib/python3.10/dist-packages/gradio/queueing.py:343> exception=ValidationError(model='PredictBody', errors=[{'loc': ('data',), 'msg': 'field required', 'type': 'value_error.missing'}])>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/gradio/queueing.py", line 347, in process_events
    client_awake = await self.gather_event_data(event)
  File "/usr/local/lib/python3.10/dist-packages/gradio/queueing.py", line 220, in gather_event_data
    data, client_awake = await self.get_message(event, timeout=receive_timeout)
  File "/usr/local/lib/python3.10/dist-packages/gradio/queueing.py", line 456, in get_message
    return PredictBody(**data), True
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for PredictBody
data
  field required (type=value_error.missing)
Output generated in 15.99 seconds (0.94 tokens/s, 15 tokens, context 67, seed 1207267814)

The problem didn't go away even after the fix

I checked Redmond Puffin 13B and Vicuna 13B; the problem remained that messages were repeated. I changed the instruction template to Llama-v2, but nothing helped. During a conversation with a character, after several messages, it just repeats the same messages each time.

SyntaxError: illegal target for annotation

Actually, I am trying to run it on Kaggle, but it's giving me this error. Can somebody help me out?

Download Results:
gid   |stat|avg speed  |path/URI
======+====+===========+=======================================================
cab6f1|OK  |    14KiB/s|/content/text-generation-webui/models/pyg-13b-4bit-128g/tokenizer_config.json

Status Legend:
(OK):download completed.
[#846e7e 6.7GiB/6.9GiB(96%) CN:16 DL:233MiB]
Download Results:
gid   |stat|avg speed  |path/URI
======+====+===========+=======================================================
846e7e|OK  |   226MiB/s|/content/text-generation-webui/models/pyg-13b-4bit-128g/4bit-128g.safetensors

Status Legend:
(OK):download completed.
/content/text-generation-webui
Gradio HTTP request redirected to localhost :)
Traceback (most recent call last):
  File "/content/text-generation-webui/server.py", line 43, in <module>
    import modules.extensions as extensions_module
  File "/content/text-generation-webui/modules/extensions.py", line 6, in <module>
    import extensions
  File "/opt/conda/lib/python3.10/site-packages/extensions/__init__.py", line 7
    "bufferView": 5,
    ^^^^^^^^^^^^
SyntaxError: illegal target for annotation
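The traceback shows Python importing /opt/conda/lib/python3.10/site-packages/extensions/__init__.py, i.e. an unrelated preinstalled package on the Kaggle image named `extensions` that shadows the web UI's local extensions folder. A heavily hedged workaround sketch, assuming that conflicting site-packages directory is not needed by anything else in the notebook:

import shutil
# assumption: this preinstalled 'extensions' package is unrelated to the web UI and safe to remove
shutil.rmtree("/opt/conda/lib/python3.10/site-packages/extensions", ignore_errors=True)
%cd /content/text-generation-webui
!python server.py --share --chat --wbits 4 --groupsize 128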

Message repetition from characters

When talking with the bot, I get repeated messages; nothing helps, not even changing the temperature and other settings. After updating I got this error. I use Redmond Puffin 13B. Any help or advice, please?

pyg-7b and other models stopped working

On 5.5.2023 everything was okay; now I get this error with the pyg-7b model.

Traceback (most recent call last):
  /content/text-generation-webui/server.py:927 in <module>
  /content/text-generation-webui/server.py:514 in create_interface
  /usr/local/lib/python3.10/dist-packages/gradio/blocks.py:1285 in __exit__
  /usr/local/lib/python3.10/dist-packages/gradio/blocks.py:1261 in get_config_file
  /usr/local/lib/python3.10/dist-packages/gradio_client/serializing.py:40 in input_api_info
KeyError: 'serialized_input'

and here is vicuna-13B-GPTQ
Traceback (most recent call last):
  /content/text-generation-webui/server.py:591 in <module>
  /content/text-generation-webui/server.py:320 in create_interface
  /usr/local/lib/python3.10/dist-packages/gradio/blocks.py:1200 in __exit__
  /usr/local/lib/python3.10/dist-packages/gradio/blocks.py:1176 in get_config_file
  /usr/local/lib/python3.10/dist-packages/gradio_client/serializing.py:40 in input_api_info
KeyError: 'serialized_input'
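'serialized_input' belongs to an older gradio_client API, so this KeyError usually means the installed gradio and gradio_client did not ship together (one was upgraded underneath the other). Reinstalling the gradio version these notebooks pin elsewhere, and letting pip resolve a matching gradio_client, is a plausible fix; the exact pin is an assumption:

!pip install gradio==3.28.3   # assumption: the pin used elsewhere in these notebooks; pip should pick a compatible gradio_client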

Problem: the character doesn't say anything

Traceback (most recent call last):
File "/content/text-generation-webui/modules/callbacks.py", line 56, in gentask
ret = self.mfunc(callback=_callback, *args, **self.kwargs)
File "/content/text-generation-webui/modules/text_generation.py", line 311, in generate_with_callback
shared.model.generate(**kwargs)
File "/usr/local/lib/python3.10/dist-packages/auto_gptq/modeling/_base.py", line 443, in generate
return self.model.generate(**kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1642, in generate
return self.sample(
File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2724, in sample
outputs = self(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py", line 809, in forward
outputs = self.model(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py", line 697, in forward
layer_outputs = decoder_layer(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py", line 426, in forward
hidden_states = self.mlp(hidden_states)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py", line 210, in forward
[F.linear(x, gate_proj_slices[i]) for i in range(self.config.pretraining_tp)], dim=-1
File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py", line 210, in
[F.linear(x, gate_proj_slices[i]) for i in range(self.config.pretraining_tp)], dim=-1
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1848x5120 and 13824x640)
Output generated in 4.45 seconds (0.00 tokens/s, 0 tokens, context 1848, seed 1194575447)
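The failing frame is the branch of modeling_llama.py that slices the MLP weights into `pretraining_tp` chunks, which does not line up with the packed GPTQ weight shapes (1848x5120 vs 13824x640). A commonly suggested workaround is to force `pretraining_tp` to 1 in the downloaded config.json before loading; the path below is a placeholder:

import json
from pathlib import Path

cfg = Path("/content/text-generation-webui/models/Redmond-Puffin-13B-GPTQ/config.json")  # hypothetical model folder
data = json.loads(cfg.read_text())
data["pretraining_tp"] = 1   # disable the tensor-parallel slicing branch in modeling_llama.py
cfg.write_text(json.dumps(data, indent=2))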

Unable to run the API extension

After checking the "api" option under the Session tab, I clicked the "Apply flags/extension and Restart" button as shown below:
(Screenshot attached.)

This generated the following logs in the colab console:

> 2023-09-06 16:30:28 WARNING:skip module injection for FusedLlamaMLPForQuantizedModel not support integrate without triton yet.
2023-09-06 16:30:28 INFO:Loaded the model in 51.97 seconds.

2023-09-06 16:30:28 INFO:Loading the extension "gallery"...
Running on local URL:  http://127.0.0.1:7860/
Running on public URL: https://<my_old_live_link>/

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)

---------------------------------<Below is the log after I restarted the the server with api option>---------------------------

ERROR:    Exception in ASGI application

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/websockets/websockets_impl.py", line 247, in run_asgi
    result = await self.app(self.scope, self.asgi_receive, self.asgi_send)
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 276, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 149, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/cors.py", line 75, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/usr/local/lib/python3.10/dist-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 341, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 82, in app
    await func(session)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 289, in app
    await dependant.call(**values)
  File "/usr/local/lib/python3.10/dist-packages/gradio/routes.py", line 536, in join_queue
    session_info = await asyncio.wait_for(
  File "/usr/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
    return fut.result()
  File "/usr/local/lib/python3.10/dist-packages/starlette/websockets.py", line 133, in receive_json
    self._raise_on_disconnect(message)
  File "/usr/local/lib/python3.10/dist-packages/starlette/websockets.py", line 105, in _raise_on_disconnect
    raise WebSocketDisconnect(message["code"])
starlette.websockets.WebSocketDisconnect: 1012
Closing server running on port: 7860
2023-09-06 16:31:32 INFO:Loading the extension "gallery"...
2023-09-06 16:31:32 ERROR:Failed to load the extension "api".
Traceback (most recent call last):
  File "/content/text-generation-webui/modules/extensions.py", line 40, in load_extensions
    extension.setup()
  File "/content/text-generation-webui/extensions/api/script.py", line 10, in setup
    if shared.public_api:
AttributeError: module 'modules.shared' has no attribute 'public_api'
Starting API at http://127.0.0.1:5000/api
Running on local URL:  http://127.0.0.1:7860/
Running on public URL: https://<my_new_live_link>/

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
Output generated in 7.95 seconds (4.90 tokens/s, 39 tokens, context 45, seed 932200172)

I tried the following code to get the response after that. However, I am getting a 404 error.
Could you please tell me how to start the API correctly and get responses?


import requests

# For local streaming, the websockets are hosted without ssl - http://
HOST = '<my_new_live_link>'
URI = f'https://{HOST}/api/v1/generate'


# For reverse-proxied streaming, the remote will likely host with ssl - https://
# URI = 'https://your-uri-here.trycloudflare.com/api/v1/generate'


def run(prompt):
    request = {
        'prompt': prompt,
        'max_new_tokens': 250,
        'auto_max_new_tokens': False,
        'max_tokens_second': 0,

        # Generation params. If 'preset' is set to different than 'None', the values
        # in presets/preset-name.yaml are used instead of the individual numbers.
        'preset': 'None',
        'do_sample': True,
        'temperature': 0.7,
        'top_p': 0.1,
        'typical_p': 1,
        'epsilon_cutoff': 0,  # In units of 1e-4
        'eta_cutoff': 0,  # In units of 1e-4
        'tfs': 1,
        'top_a': 0,
        'repetition_penalty': 1.18,
        'repetition_penalty_range': 0,
        'top_k': 40,
        'min_length': 0,
        'no_repeat_ngram_size': 0,
        'num_beams': 1,
        'penalty_alpha': 0,
        'length_penalty': 1,
        'early_stopping': False,
        'mirostat_mode': 0,
        'mirostat_tau': 5,
        'mirostat_eta': 0.1,
        'guidance_scale': 1,
        'negative_prompt': '',

        'seed': -1,
        'add_bos_token': True,
        'truncation_length': 2048,
        'ban_eos_token': False,
        'skip_special_tokens': True,
        'stopping_strings': []
    }
    print(URI)
    response = requests.post(URI, json=request)
    print(response)

    if response.status_code == 200:
        result = response.json()['results'][0]['text']
        print(prompt + result)


if __name__ == '__main__':
    prompt = "In order to make homemade bread, follow these steps:\n1)"
    run(prompt)
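For what it's worth, the AttributeError above (module 'modules.shared' has no attribute 'public_api') suggests the api extension in this checkout expects a newer shared module than the pinned one, so enabling it from the Session tab never starts cleanly. A hedged alternative is to enable the API at launch time in the notebook cell instead (flag names are an assumption for this pinned version) and to send requests to the separate API URL the console prints rather than the Gradio share link:

%cd /content/text-generation-webui
!python server.py --share --chat --api --public-api --wbits 4 --groupsize 128
# the console then prints a dedicated API URL (blocking API, default port 5000);
# use that URL as HOST/URI in the script above instead of <my_new_live_link>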
  

Will Llama 2 70B be supported in the future?

Hi there, many thanks for this wonderful sharing!
Just wondering, will there be a 70B running on Colab?
I have tried Petals' work, however the chat did not work quite right.
Best,

Something wrong with the Colab

Hi camenduru,
First of all, thanks for your work.
My problem is, every time I want to run any model in Colab I get the same issue.
Vicuna works fine, but pyg-7b and pyg-13b do not, and the WizardLM uncensored is not working either.

W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
bin /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda118.so
Traceback (most recent call last):
  /content/text-generation-webui/server.py:44 in <module>
  /content/text-generation-webui/modules/training.py:13 in <module>
  /usr/local/lib/python3.10/dist-packages/peft/__init__.py:22 in <module>
  /usr/local/lib/python3.10/dist-packages/peft/mapping.py:16 in <module>
  /usr/local/lib/python3.10/dist-packages/peft/peft_model.py:31 in <module>
  /usr/local/lib/python3.10/dist-packages/peft/tuners/__init__.py:21 in <module>
  /usr/local/lib/python3.10/dist-packages/peft/tuners/lora.py:735 in <module>
    class Linear4bit(bnb.nn.Linear4bit, LoraLayer):
AttributeError: module 'bitsandbytes.nn' has no attribute 'Linear4bit'

I don't know if it's something with me or the code.

Thanks for your attention and work.
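`bnb.nn.Linear4bit` only exists in newer bitsandbytes releases, so the peft build pulled in here (0.4.0.dev0 in the traceback) is ahead of the installed bitsandbytes. Upgrading bitsandbytes before launching is the straightforward workaround; the minimum version is an assumption (Linear4bit appeared around 0.39):

!pip install -U "bitsandbytes>=0.39.0"   # assumption: first release that provides nn.Linear4bit
%cd /content/text-generation-webui
!python server.py --share --chat --wbits 4 --groupsize 128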
