josStorer / RWKV-Runner

An RWKV management and startup tool, fully automated, only 8MB. It also provides an interface compatible with the OpenAI API. RWKV is a large language model that is fully open source and available for commercial use.

Home Page: https://www.rwkv.com

License: MIT License

Go 7.25% HTML 0.07% JavaScript 1.31% TypeScript 69.66% Python 18.14% Makefile 0.15% SCSS 0.27% Batchfile 0.49% Shell 0.77% Ruby 1.64% Dockerfile 0.27%
rwkv api api-client chatgpt llm tool wails

rwkv-runner's Introduction

RWKV Runner

This project aims to eliminate the barriers to using large language models by automating everything for you. All you need is a lightweight executable program of just a few megabytes. Additionally, this project provides an interface compatible with the OpenAI API, which means that every ChatGPT client is an RWKV client.


English | 简体中文 | 日本語

Install

Windows MacOS Linux

FAQs | Preview | Download | Simple Deploy Example | Server Deploy Examples | MIDI Hardware Input

Tips

  • You can deploy backend-python on a server and use this program as a client only. Fill in your server address in the API URL field on the Settings page.

  • If you are deploying a public-facing service, please limit the request size through an API gateway to prevent excessive resource usage caused by overly long prompts. Additionally, please restrict the upper limit of requests' max_tokens according to your actual situation: https://github.com/josStorer/RWKV-Runner/blob/master/backend-python/utils/rwkv.py#L567. The default is le=102400, which in extreme cases may allow a single response to consume significant resources.

  • The default configs enable custom CUDA kernel acceleration, which is much faster and consumes much less VRAM. If you run into possible compatibility issues (garbled output), go to the Configs page and turn off Use Custom CUDA kernel to Accelerate, or try upgrading your GPU driver.

  • If Windows Defender claims this is a virus, you can try downloading v1.3.7_win.zip and letting it update automatically to the latest version, or add it to the trusted list (Windows Security -> Virus & threat protection -> Manage settings -> Exclusions -> Add or remove exclusions -> Add an exclusion -> Folder -> RWKV-Runner).

  • For different tasks, adjusting API parameters can achieve better results. For example, for translation tasks, you can try setting Temperature to 1 and Top_P to 0.3.
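For example, to try those translation settings against the local OpenAI-compatible endpoint (a minimal sketch using requests; the response shape assumes the OpenAI-compatible format described below):

import requests

r = requests.post(
    "http://127.0.0.1:8000/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Translate to French: good morning"}],
        "temperature": 1,
        "top_p": 0.3,
    },
)
print(r.json()["choices"][0]["message"]["content"])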

Features

  • RWKV model management and one-click startup.
  • Separated front-end and back-end: if you don't want to use the client, you can also deploy the front-end service, the back-end inference service, or the back-end inference service with a WebUI on its own. Simple Deploy Example | Server Deploy Examples
  • Compatible with the OpenAI API, making every ChatGPT client an RWKV client. After starting the model, open http://127.0.0.1:8000/docs to view more details. A client-library example is sketched after this list.
  • Automatic dependency installation, requiring only a lightweight executable program.
  • Pre-set multi-level VRAM configs that work well on almost all computers. On the Configs page, switch Strategy to WebGPU to also run on AMD, Intel, and other graphics cards.
  • User-friendly chat, completion, and composition interaction interface included. Also supports chat presets, attachment uploads, MIDI hardware input, and track editing. Preview | MIDI Hardware Input
  • Built-in WebUI option: start a web service with one click and share your hardware resources.
  • Easy-to-understand and operate parameter configuration, along with various operation guidance prompts.
  • Built-in model conversion tool.
  • Built-in download management and remote model inspection.
  • Built-in one-click LoRA Finetune. (Windows Only)
  • Can also serve as a client for OpenAI ChatGPT, GPT Playground, Ollama, and more. (Fill in the API URL and API Key on the Settings page.)
  • Multilingual localization.
  • Theme switching.
  • Automatic updates.
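As an example of the OpenAI compatibility mentioned above, here is a sketch that points the official openai Python client (v1+) at the local server; the model name is an assumption, since a local server typically ignores it:

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8000", api_key="sk-")  # any key works for a local server
resp = client.chat.completions.create(
    model="rwkv",  # assumption: the local backend accepts an arbitrary model name
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)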

Simple Deploy Example

git clone https://github.com/josStorer/RWKV-Runner

# Then
cd RWKV-Runner
python ./backend-python/main.py # The backend inference service is now running; call the /switch-model API to load a model (see the API docs at http://127.0.0.1:8000/docs)

# Or
cd RWKV-Runner/frontend
npm ci
npm run build # Compile the frontend
cd ..
python ./backend-python/webui_server.py # Start the frontend service separately
# Or
python ./backend-python/main.py --webui # Start the frontend and backend services at the same time

# Help Info
python ./backend-python/main.py -h
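Once the backend is running, loading a model over HTTP might look like the sketch below. The field names here are assumptions for illustration; check http://127.0.0.1:8000/docs for the real /switch-model schema:

import requests

r = requests.post(
    "http://127.0.0.1:8000/switch-model",
    json={
        # a model file mentioned elsewhere in this document; adjust to your own
        "model": "models/RWKV-4-Raven-7B-v11-Eng49%-Chn49%-Jpn1%-Other1%-20230430-ctx8192.pth",
        "strategy": "cuda fp16",  # assumption: an rwkv-style strategy string
    },
)
print(r.status_code)  # expect 200 once the model has loaded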

API Concurrency Stress Testing

ab -p body.json -T application/json -c 20 -n 100 -l http://127.0.0.1:8000/chat/completions

body.json:

{
  "messages": [
    {
      "role": "user",
      "content": "Hello"
    }
  ]
}
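If ab (ApacheBench) is not available, a rough Python equivalent of the same test (20 concurrent workers, 100 requests; a sketch):

import requests
from concurrent.futures import ThreadPoolExecutor

BODY = {"messages": [{"role": "user", "content": "Hello"}]}

def hit(_):
    # one POST per request, mirroring `ab -c 20 -n 100`
    return requests.post("http://127.0.0.1:8000/chat/completions", json=BODY).status_code

with ThreadPoolExecutor(max_workers=20) as pool:
    print(list(pool.map(hit, range(100))))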

Embeddings API Example

Note: v1.4.0 improved the quality of the embeddings API. The generated results are not compatible with previous versions; if you are using the embeddings API to generate knowledge bases or similar, please regenerate them.

If you are using langchain, just use OpenAIEmbeddings(openai_api_base="http://127.0.0.1:8000", openai_api_key="sk-")
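For instance (a minimal sketch, assuming a langchain version that still exposes OpenAIEmbeddings with these parameters):

from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(
    openai_api_base="http://127.0.0.1:8000",  # the local RWKV-Runner backend
    openai_api_key="sk-",  # placeholder; a local server does not validate it
)
print(embeddings.embed_query("I am a girl")[:8])  # first few dimensions of the vector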

import numpy as np
import requests


def cosine_similarity(a, b):
    # cosine similarity between two embedding vectors
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))


values = [
    "I am a girl",
    "我是个女孩",
    "私は女の子です",
    "广东人爱吃福建人",
    "我是个人类",
    "I am a human",
    "that dog is so cute",
    "私はねこむすめです、にゃん♪",
    "宇宙级特大事件!号外号外!"
]

embeddings = []
for v in values:
    # request an embedding for each string from the local server
    r = requests.post("http://127.0.0.1:8000/embeddings", json={"input": v})
    embedding = r.json()["data"][0]["embedding"]
    embeddings.append(embedding)

compared_embedding = embeddings[0]

embeddings_cos_sim = [cosine_similarity(compared_embedding, e) for e in embeddings]

for i in np.argsort(embeddings_cos_sim)[::-1]:
    print(f"{embeddings_cos_sim[i]:.10f} - {values[i]}")

MIDI Input

Tip: You can download https://github.com/josStorer/sgm_plus and unzip it to the program's assets/sound-font directory to use it as an offline sound source. Please note that if you are compiling the program from source code, do not place it in the source code directory.

If you don't have a MIDI keyboard, you can use virtual MIDI input software like Virtual Midi Controller 3 LE, along with loopMIDI, to use a regular computer keyboard as MIDI input.

USB MIDI Connection

  • USB MIDI devices are plug-and-play; you can select your input device on the Composition page.

Mac MIDI Bluetooth Connection

  • For Mac users who want to use Bluetooth input, please install Bluetooth MIDI Connect, launch it, and click the tray icon to connect. Afterwards, you can select your input device on the Composition page.

Windows MIDI Bluetooth Connection

  • Windows seems to have implemented Bluetooth MIDI support only for UWP (Universal Windows Platform) apps. Therefore, it requires multiple steps to establish a connection. We need to create a local virtual MIDI device and then launch a UWP application. Through this UWP application, we will redirect Bluetooth MIDI input to the virtual MIDI device, and then this software will listen to the input from the virtual MIDI device.
  • So, first, you need to download loopMIDI to create a virtual MIDI device. Click the plus sign in the bottom left corner to create the device.
  • Next, you need to download Bluetooth LE Explorer to discover and connect to Bluetooth MIDI devices. Click "Start" to search for devices, and then click "Pair" to bind the MIDI device.
  • Finally, you need to install MIDIberry. This UWP application can redirect Bluetooth MIDI input to the virtual MIDI device. After launching it, double-click your actual Bluetooth MIDI device name in the input field, and double-click the virtual MIDI device name we created earlier in the output field.
  • Now, you can select the virtual MIDI device as the input on the Composition page. Bluetooth LE Explorer no longer needs to run, and you can also close the loopMIDI window; it will keep running automatically in the background. Just keep MIDIberry open.


Preview

Homepage


Chat


Completion


Composition

Tip: You can download https://github.com/josStorer/sgm_plus and unzip it to the program's assets/sound-font directory to use it as an offline sound source. Please note that if you are compiling the program from source code, do not place it in the source code directory.


Configuration


Model Management


Download Management


LoRA Finetune


Settings


rwkv-runner's People

Contributors

beenotung, cabralski, dependabot[bot], eltociear, github-actions[bot], gmarcusm, josstorer, longhronshen


rwkv-runner's Issues

Request: add Python environment configuration options

Version 1.0.8 has no option for custom Python environment management.
Downloading and configuring a Python 3.10 environment with torch into the run directory for every user is beginner-friendly,
but please open up custom environment configuration; my machine already has 5 Pythons and 4 torches...

Error when creating GPU-4G-7B-CN; hardware is an RTX 2060 6G

The model file is RWKV-4-Raven-7B-v11-Eng49%-Chn49%-Jpn1%-Other1%-20230430-ctx8192.pth
The detailed error message is:
INFO: Started server process [17688]
INFO: Waiting for application startup.
torch found: D:\py-ai\RWVK-Runner\py310\Lib\site-packages\torch\lib
torch set
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:49573 - "GET / HTTP/1.1" 200 OK
INFO: 127.0.0.1:49573 - "GET / HTTP/1.1" 200 OK
INFO: 127.0.0.1:49573 - "GET /status HTTP/1.1" 200 OK
INFO: 127.0.0.1:49573 - "OPTIONS /update-config HTTP/1.1" 200 OK
INFO: 127.0.0.1:49573 - "OPTIONS /switch-model HTTP/1.1" 200 OK
max_tokens=4100 temperature=1.2 top_p=0.5 presence_penalty=0.4 frequency_penalty=0.4
INFO: 127.0.0.1:49573 - "POST /update-config HTTP/1.1" 200 OK
RWKV_JIT_ON 1 RWKV_CUDA_ON 0 RESCALE_LAYER 6

Loading models/RWKV-4-Raven-7B-v11-Eng49%-Chn49%-Jpn1%-Other1%-20230430-ctx8192.pth ...
Strategy: (total 32+1=33 layers)

  • cuda [float16, uint8], store 8 layers, stream 24 layers
  • cpu [float32, float32], store 1 layers
    0-cuda-float16-uint8 1-cuda-float16-uint8 2-cuda-float16-uint8 3-cuda-float16-uint8 4-cuda-float16-uint8 5-cuda-float16-uint8 6-cuda-float16-uint8 7-cuda-float16-uint8 8-cuda-float16-uint8-stream 9-cuda-float16-uint8-stream 10-cuda-float16-uint8-stream 11-cuda-float16-uint8-stream 12-cuda-float16-uint8-stream 13-cuda-float16-uint8-stream 14-cuda-float16-uint8-stream 15-cuda-float16-uint8-stream 16-cuda-float16-uint8-stream 17-cuda-float16-uint8-stream 18-cuda-float16-uint8-stream 19-cuda-float16-uint8-stream 20-cuda-float16-uint8-stream 21-cuda-float16-uint8-stream 22-cuda-float16-uint8-stream 23-cuda-float16-uint8-stream 24-cuda-float16-uint8-stream 25-cuda-float16-uint8-stream 26-cuda-float16-uint8-stream 27-cuda-float16-uint8-stream 28-cuda-float16-uint8-stream 29-cuda-float16-uint8-stream 30-cuda-float16-uint8-stream 31-cuda-float16-uint8-stream 32-cpu-float32-float32
    emb.weight f16 cpu 50277 4096
    1 validation error for RWKV
    root
    Torch not compiled with CUDA enabled (type=assertion_error)
    INFO: 127.0.0.1:49573 - "POST /switch-model HTTP/1.1" 500 Internal Server Error
    ERROR: Exception in ASGI application
    Traceback (most recent call last):
    File "D:\py-ai\RWVK-Runner\backend-python\routes\config.py", line 36, in switch_model
    RWKV(
    File "pydantic\main.py", line 341, in pydantic.main.BaseModel.init
    pydantic.error_wrappers.ValidationError: 1 validation error for RWKV
    root
    Torch not compiled with CUDA enabled (type=assertion_error)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 428, in run_asgi
result = await app(  # type: ignore[func-returns-value]
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\uvicorn\middleware\proxy_headers.py", line 78, in __call__
return await self.app(scope, receive, send)
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\fastapi\applications.py", line 276, in __call__
await super().__call__(scope, receive, send)
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\applications.py", line 122, in __call__
await self.middleware_stack(scope, receive, send)
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\middleware\errors.py", line 184, in __call__
raise exc
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\middleware\errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\middleware\cors.py", line 92, in __call__
await self.simple_response(scope, receive, send, request_headers=headers)
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\middleware\cors.py", line 147, in simple_response
await self.app(scope, receive, send)
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\middleware\exceptions.py", line 79, in __call__
raise exc
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\middleware\exceptions.py", line 68, in __call__
await self.app(scope, receive, sender)
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\fastapi\middleware\asyncexitstack.py", line 21, in __call__
raise e
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\fastapi\middleware\asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\routing.py", line 718, in __call__
await route.handle(scope, receive, send)
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\routing.py", line 276, in handle
await self.app(scope, receive, send)
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\routing.py", line 66, in app
response = await func(request)
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\fastapi\routing.py", line 237, in app
raw_response = await run_endpoint_function(
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\fastapi\routing.py", line 165, in run_endpoint_function
return await run_in_threadpool(dependant.call, **values)
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\concurrency.py", line 41, in run_in_threadpool
return await anyio.to_thread.run_sync(func, *args)
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "D:\py-ai\RWVK-Runner\backend-python\routes\config.py", line 45, in switch_model
raise HTTPException(status.HTTP_500_INTERNAL_SERVER_ERROR, "failed to load")
AttributeError: 'function' object has no attribute 'HTTP_500_INTERNAL_SERVER_ERROR'

[issue] Torch not compiled with CUDA enabled (type=assertion_error)

Laptop with a 4070 8G; hit the following error when starting the model:

INFO: Started server process [9288]
INFO: Waiting for application startup.
torch found: D:\04programs\RWKV_LMM\py310\Lib\site-packages\torch\lib
torch set
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:53777 - "GET / HTTP/1.1" 200 OK
INFO: 127.0.0.1:53777 - "GET / HTTP/1.1" 200 OK
INFO: 127.0.0.1:53777 - "GET /status HTTP/1.1" 200 OK
max_tokens=4100 temperature=1.2 top_p=0.5 presence_penalty=0.4 frequency_penalty=0.4
INFO: 127.0.0.1:53777 - "POST /update-config HTTP/1.1" 200 OK
RWKV_JIT_ON 1 RWKV_CUDA_ON 0 RESCALE_LAYER 6

Loading models/RWKV-4-Raven-3B-v11-Eng49%-Chn49%-Jpn1%-Other1%-20230429-ctx4096.pth ...
Strategy: (total 32+1=33 layers)

  • cuda [float16, uint8], store 33 layers
    0-cuda-float16-uint8 1-cuda-float16-uint8 2-cuda-float16-uint8 3-cuda-float16-uint8 4-cuda-float16-uint8 5-cuda-float16-uint8 6-cuda-float16-uint8 7-cuda-float16-uint8 8-cuda-float16-uint8 9-cuda-float16-uint8 10-cuda-float16-uint8 11-cuda-float16-uint8 12-cuda-float16-uint8 13-cuda-float16-uint8 14-cuda-float16-uint8 15-cuda-float16-uint8 16-cuda-float16-uint8 17-cuda-float16-uint8 18-cuda-float16-uint8 19-cuda-float16-uint8 20-cuda-float16-uint8 21-cuda-float16-uint8 22-cuda-float16-uint8 23-cuda-float16-uint8 24-cuda-float16-uint8 25-cuda-float16-uint8 26-cuda-float16-uint8 27-cuda-float16-uint8 28-cuda-float16-uint8 29-cuda-float16-uint8 30-cuda-float16-uint8 31-cuda-float16-uint8 32-cuda-float16-uint8
    emb.weight f16 cpu 50277 2560
    1 validation error for RWKV
    root
    Torch not compiled with CUDA enabled (type=assertion_error)
    INFO: 127.0.0.1:53775 - "POST /switch-model HTTP/1.1" 500 Internal Server Error
    ERROR: Exception in ASGI application
    Traceback (most recent call last):
    File "D:\04programs\RWKV_LMM\backend-python\routes\config.py", line 36, in switch_model
    RWKV(
    File "pydantic\main.py", line 341, in pydantic.main.BaseModel.init
    pydantic.error_wrappers.ValidationError: 1 validation error for RWKV
    root
    Torch not compiled with CUDA enabled (type=assertion_error)

Failed to switch model

INFO: Started server process [13476]
INFO: Waiting for application startup.
torch found: F:\video\rwkv\py310\Lib\site-packages\torch\lib
torch set
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:6372 - "GET / HTTP/1.1" 200 OK
INFO: 127.0.0.1:6372 - "GET / HTTP/1.1" 200 OK
INFO: 127.0.0.1:6372 - "GET /status HTTP/1.1" 200 OK
max_tokens=4100 temperature=1.2 top_p=0.5 presence_penalty=0.4 frequency_penalty=0.4
INFO: 127.0.0.1:6372 - "POST /update-config HTTP/1.1" 200 OK
RWKV_JIT_ON 1 RWKV_CUDA_ON 0 RESCALE_LAYER 6

Loading models/RWKV-4-Raven-3B-v11-Eng49%-Chn49%-Jpn1%-Other1%-20230429-ctx4096.pth ...
Strategy: (total 32+1=33 layers)

  • cuda [float16, uint8], store 33 layers
    0-cuda-float16-uint8 1-cuda-float16-uint8 2-cuda-float16-uint8 3-cuda-float16-uint8 4-cuda-float16-uint8 5-cuda-float16-uint8 6-cuda-float16-uint8 7-cuda-float16-uint8 8-cuda-float16-uint8 9-cuda-float16-uint8 10-cuda-float16-uint8 11-cuda-float16-uint8 12-cuda-float16-uint8 13-cuda-float16-uint8 14-cuda-float16-uint8 15-cuda-float16-uint8 16-cuda-float16-uint8 17-cuda-float16-uint8 18-cuda-float16-uint8 19-cuda-float16-uint8 20-cuda-float16-uint8 21-cuda-float16-uint8 22-cuda-float16-uint8 23-cuda-float16-uint8 24-cuda-float16-uint8 25-cuda-float16-uint8 26-cuda-float16-uint8 27-cuda-float16-uint8 28-cuda-float16-uint8 29-cuda-float16-uint8 30-cuda-float16-uint8 31-cuda-float16-uint8 32-cuda-float16-uint8
    emb.weight f16 cpu 50277 2560
    1 validation error for RWKV
    root
    Torch not compiled with CUDA enabled (type=assertion_error)
    INFO: 127.0.0.1:6375 - "POST /switch-model HTTP/1.1" 500 Internal Server Error
    ERROR: Exception in ASGI application
    Traceback (most recent call last):
    File "F:\video\rwkv\backend-python\routes\config.py", line 36, in switch_model
    RWKV(
    File "pydantic\main.py", line 341, in pydantic.main.BaseModel.init
    pydantic.error_wrappers.ValidationError: 1 validation error for RWKV
    root
    Torch not compiled with CUDA enabled (type=assertion_error)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "F:\video\rwkv\py310\Lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 428, in run_asgi
result = await app(  # type: ignore[func-returns-value]
File "F:\video\rwkv\py310\Lib\site-packages\uvicorn\middleware\proxy_headers.py", line 78, in __call__
return await self.app(scope, receive, send)
File "F:\video\rwkv\py310\Lib\site-packages\fastapi\applications.py", line 276, in __call__
await super().__call__(scope, receive, send)
File "F:\video\rwkv\py310\Lib\site-packages\starlette\applications.py", line 122, in __call__
await self.middleware_stack(scope, receive, send)
File "F:\video\rwkv\py310\Lib\site-packages\starlette\middleware\errors.py", line 184, in __call__
raise exc
File "F:\video\rwkv\py310\Lib\site-packages\starlette\middleware\errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "F:\video\rwkv\py310\Lib\site-packages\starlette\middleware\cors.py", line 92, in __call__
await self.simple_response(scope, receive, send, request_headers=headers)
File "F:\video\rwkv\py310\Lib\site-packages\starlette\middleware\cors.py", line 147, in simple_response
await self.app(scope, receive, send)
File "F:\video\rwkv\py310\Lib\site-packages\starlette\middleware\exceptions.py", line 79, in __call__
raise exc
File "F:\video\rwkv\py310\Lib\site-packages\starlette\middleware\exceptions.py", line 68, in __call__
await self.app(scope, receive, sender)
File "F:\video\rwkv\py310\Lib\site-packages\fastapi\middleware\asyncexitstack.py", line 21, in __call__
raise e
File "F:\video\rwkv\py310\Lib\site-packages\fastapi\middleware\asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
File "F:\video\rwkv\py310\Lib\site-packages\starlette\routing.py", line 718, in __call__
await route.handle(scope, receive, send)
File "F:\video\rwkv\py310\Lib\site-packages\starlette\routing.py", line 276, in handle
await self.app(scope, receive, send)
File "F:\video\rwkv\py310\Lib\site-packages\starlette\routing.py", line 66, in app
response = await func(request)
File "F:\video\rwkv\py310\Lib\site-packages\fastapi\routing.py", line 237, in app
raw_response = await run_endpoint_function(
File "F:\video\rwkv\py310\Lib\site-packages\fastapi\routing.py", line 165, in run_endpoint_function
return await run_in_threadpool(dependant.call, **values)
File "F:\video\rwkv\py310\Lib\site-packages\starlette\concurrency.py", line 41, in run_in_threadpool
return await anyio.to_thread.run_sync(func, *args)
File "F:\video\rwkv\py310\Lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "F:\video\rwkv\py310\Lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "F:\video\rwkv\py310\Lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "F:\video\rwkv\backend-python\routes\config.py", line 45, in switch_model
raise HTTPException(status.HTTP_500_INTERNAL_SERVER_ERROR, "failed to load")
AttributeError: 'function' object has no attribute 'HTTP_500_INTERNAL_SERVER_ERROR'

Feature Request: Change Chat Prompt (Roles and Prompt)

Currently, the prompt for the Chat mode on python-backend is as follows:

The following is a coherent verbose detailed conversation between a girl named {bot} and her friend {user}.
{bot} is very intelligent, creative and friendly.
{bot} is unlikely to disagree with {user}, and {bot} doesn't like to ask {user} questions.
{bot} likes to tell {user} a lot about herself and her opinions.
{bot} usually gives {user} kind, helpful and informative advices.

Some releases of RWKV, such as RWKV-4-World, state that Alice/Bob should not be used, but rather Human/Bot, User/AI, or Question/Answer; see the model card for the exact statement.

There are plenty of "configuration" endpoints now, such as:

@router.post("/switch-model")
def switch_model(body: SwitchModelBody, response: Response):
    ...

We could make one more endpoint such as:

@router.post("/change-chat-prompt")
def change_chat_prompt(body: ChangeChatPromptBody):
    ...

...that allows changing the prompt, as well as specifying what the stop sequences are going to be (user, assistant). That will also require some UI changes.
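A minimal sketch of what such a request body could look like (all field names here are hypothetical, for illustration only):

from typing import List

from pydantic import BaseModel


class ChangeChatPromptBody(BaseModel):
    # hypothetical fields; the real schema would be up to the maintainers
    system_prompt: str
    user_name: str = "User"
    assistant_name: str = "AI"
    stop: List[str] = []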

RWKV 1.5B is insanely good at its scale

Even though it does not directly point out that AI did not exist in the year 1010, it does reason quite philosophically.


Most LLaMA-tuned models simply spit out, "It's hard to predict what AI will be like, because 1010 is xxx years in the future," LOL

The Python dependencies fail to download; can I download them manually?

wkv_cuda40.pyd
wkv_cuda10_30.pyd
torch.py
config.py
rwkv.py
global_var.py
convert_model.py
wkv_cuda_model.py
ngrok.py
dep_check.py
requirements_versions.txt
main.py
completion.py
requirements.txt
None of the files above can be downloaded. It seems to be a jsdelivr problem; I cannot connect to the CDN from here. Can I download them manually?

Questions about int8 quantization acceleration

I have a few questions about int8 quantization:

  1. How is the int8 quantized model generated? Does it use a method like GPTQ/RPTQ, or is it just per-tensor/per-channel quantization of the weights?
  2. How does the int8 model run inference? Are the weights dequantized back to fp16 inside the kernel and then multiplied with fp16 activations, or are the activations also quantized to int8 for int8 * int8 multiplication?

Thanks for your advice.

Request: add support for rwkv.cpp

As the title says; those of us without money have no GPU ┭┮﹏┭┮ . Also, downloads cannot be managed, and converting a model with int8 selected actually produced a larger (float32) model.

Error on a 3060 with 6G VRAM; the model is RWKV-4-Raven-7B-v11-Eng49%-Chn49%-Jpn1%-Other1%-20230430-ctx8192

The error message is as follows:
INFO: Started server process [22260]
INFO: Waiting for application startup.
torch found: D:\LLM\py310\Lib\site-packages\torch\lib
torch set
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:63563 - "GET / HTTP/1.1" 200 OK
INFO: 127.0.0.1:63563 - "GET / HTTP/1.1" 200 OK
INFO: 127.0.0.1:63563 - "GET /status HTTP/1.1" 200 OK
INFO: 127.0.0.1:63563 - "OPTIONS /update-config HTTP/1.1" 200 OK
INFO: 127.0.0.1:63564 - "OPTIONS /switch-model HTTP/1.1" 200 OK
max_tokens=4100 temperature=1.2 top_p=0.5 presence_penalty=0.4 frequency_penalty=0.4
INFO: 127.0.0.1:63564 - "POST /update-config HTTP/1.1" 200 OK
RWKV_JIT_ON 1 RWKV_CUDA_ON 0 RESCALE_LAYER 6

Loading models/RWKV-4-Raven-7B-v11-Eng49%-Chn49%-Jpn1%-Other1%-20230430-ctx8192.pth ...
Strategy: (total 32+1=33 layers)

  • cuda [float16, uint8], store 8 layers, stream 25 layers
    0-cuda-float16-uint8 1-cuda-float16-uint8 2-cuda-float16-uint8 3-cuda-float16-uint8 4-cuda-float16-uint8 5-cuda-float16-uint8 6-cuda-float16-uint8 7-cuda-float16-uint8 8-cuda-float16-uint8-stream 9-cuda-float16-uint8-stream 10-cuda-float16-uint8-stream 11-cuda-float16-uint8-stream 12-cuda-float16-uint8-stream 13-cuda-float16-uint8-stream 14-cuda-float16-uint8-stream 15-cuda-float16-uint8-stream 16-cuda-float16-uint8-stream 17-cuda-float16-uint8-stream 18-cuda-float16-uint8-stream 19-cuda-float16-uint8-stream 20-cuda-float16-uint8-stream 21-cuda-float16-uint8-stream 22-cuda-float16-uint8-stream 23-cuda-float16-uint8-stream 24-cuda-float16-uint8-stream 25-cuda-float16-uint8-stream 26-cuda-float16-uint8-stream 27-cuda-float16-uint8-stream 28-cuda-float16-uint8-stream 29-cuda-float16-uint8-stream 30-cuda-float16-uint8-stream 31-cuda-float16-uint8-stream 32-cuda-float16-uint8-stream
    emb.weight f16 cpu 50277 4096
    blocks.0.ln1.weight f16 cuda:0 4096
    blocks.0.ln1.bias f16 cuda:0 4096
    blocks.0.ln2.weight f16 cuda:0 4096
    blocks.0.ln2.bias f16 cuda:0 4096
    blocks.0.att.time_decay f32 cuda:0 4096
    blocks.0.att.time_first f32 cuda:0 4096
    blocks.0.att.time_mix_k f16 cuda:0 4096
    blocks.0.att.time_mix_v f16 cuda:0 4096
    blocks.0.att.time_mix_r f16 cuda:0 4096
    blocks.0.att.key.weight i8 cuda:0 4096 4096
    blocks.0.att.value.weight i8 cuda:0 4096 4096
    blocks.0.att.receptance.weight i8 cuda:0 4096 4096
    blocks.0.att.output.weight i8 cuda:0 4096 4096
    blocks.0.ffn.time_mix_k f16 cuda:0 4096
    blocks.0.ffn.time_mix_r f16 cuda:0 4096
    blocks.0.ffn.key.weight i8 cuda:0 4096 16384
    blocks.0.ffn.receptance.weight i8 cuda:0 4096 4096
    blocks.0.ffn.value.weight i8 cuda:0 16384 4096
    ...[enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 268435456 bytes.
    INFO: 127.0.0.1:63563 - "POST /switch-model HTTP/1.1" 500 Internal Server Error
    ERROR: Exception in ASGI application
    Traceback (most recent call last):
    File "D:\LLM\backend-python\routes\config.py", line 36, in switch_model
    RWKV(
    File "pydantic\main.py", line 339, in pydantic.main.BaseModel.init
    File "pydantic\main.py", line 1102, in pydantic.main.validate_model
    File "D:\LLM\py310\Lib\site-packages\langchain\llms\rwkv.py", line 119, in validate_environment
    values["client"] = RWKVMODEL(
    File "D:\LLM\py310\Lib\site-packages\torch\jit_script.py", line 293, in init_then_script
    original_init(self, *args, **kwargs)
    File "D:\LLM\py310\Lib\site-packages\rwkv\model.py", line 308, in init
    w[x] = torch.clip(torch.floor(w[x] * 256), min=0, max=255).to(dtype=torch.uint8)
    RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 268435456 bytes.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:\LLM\py310\Lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 428, in run_asgi
result = await app(  # type: ignore[func-returns-value]
File "D:\LLM\py310\Lib\site-packages\uvicorn\middleware\proxy_headers.py", line 78, in __call__
return await self.app(scope, receive, send)
File "D:\LLM\py310\Lib\site-packages\fastapi\applications.py", line 276, in __call__
await super().__call__(scope, receive, send)
File "D:\LLM\py310\Lib\site-packages\starlette\applications.py", line 122, in __call__
await self.middleware_stack(scope, receive, send)
File "D:\LLM\py310\Lib\site-packages\starlette\middleware\errors.py", line 184, in __call__
raise exc
File "D:\LLM\py310\Lib\site-packages\starlette\middleware\errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "D:\LLM\py310\Lib\site-packages\starlette\middleware\cors.py", line 92, in __call__
await self.simple_response(scope, receive, send, request_headers=headers)
File "D:\LLM\py310\Lib\site-packages\starlette\middleware\cors.py", line 147, in simple_response
await self.app(scope, receive, send)
File "D:\LLM\py310\Lib\site-packages\starlette\middleware\exceptions.py", line 79, in __call__
raise exc
File "D:\LLM\py310\Lib\site-packages\starlette\middleware\exceptions.py", line 68, in __call__
await self.app(scope, receive, sender)
File "D:\LLM\py310\Lib\site-packages\fastapi\middleware\asyncexitstack.py", line 21, in __call__
raise e
File "D:\LLM\py310\Lib\site-packages\fastapi\middleware\asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
File "D:\LLM\py310\Lib\site-packages\starlette\routing.py", line 718, in __call__
await route.handle(scope, receive, send)
File "D:\LLM\py310\Lib\site-packages\starlette\routing.py", line 276, in handle
await self.app(scope, receive, send)
File "D:\LLM\py310\Lib\site-packages\starlette\routing.py", line 66, in app
response = await func(request)
File "D:\LLM\py310\Lib\site-packages\fastapi\routing.py", line 237, in app
raw_response = await run_endpoint_function(
File "D:\LLM\py310\Lib\site-packages\fastapi\routing.py", line 165, in run_endpoint_function
return await run_in_threadpool(dependant.call, **values)
File "D:\LLM\py310\Lib\site-packages\starlette\concurrency.py", line 41, in run_in_threadpool
return await anyio.to_thread.run_sync(func, *args)
File "D:\LLM\py310\Lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "D:\LLM\py310\Lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "D:\LLM\py310\Lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "D:\LLM\backend-python\routes\config.py", line 45, in switch_model
raise HTTPException(status.HTTP_500_INTERNAL_SERVER_ERROR, "failed to load")
AttributeError: 'function' object has no attribute 'HTTP_500_INTERNAL_SERVER_ERROR'

failed to unzip python

failed to unzip python
python-3.10.11-embed-amd64.zip
It downloaded this package and then kept failing to unzip it.

Speed question

I downloaded the 7B-v11 model, used the GPU-6G-7B-CN config, did the conversion, and left the other parameters at their defaults. GPU VRAM usage is 6GB. The speed still falls somewhat short of chatglm-6b-int4, although the results look much better; roughly 2 characters/s. Under which configuration is performance best?
GPU: 2080 Super
CPU: AMD 3060
RAM: 64GB

Error partway through automatic dependency installation with the GitHub exe

How should I resolve these errors?
I'm using the exe from GitHub.
After restarting the exe, the download list is empty. I've tried deleting the pip cache and downloading again, and also running the exe as administrator.
The download and installation does proceed, but it errors out at a certain step.
1. Error screenshots: (screenshots not included)
2. Directory structure:
backend-python
models
py310
cache.json
cmd-helper.bat
config.json
python-3.10.11-embed-amd64.zip
RWKV-Runner_windows_x64.exe
3. System environment:
Windows 10 Home 21H2
GTX 1650 4G (notebook)

Error when running on Windows 10:

INFO: Started server process [11568]
INFO: Waiting for application startup.
torch found: E:\ai\py310\Lib\site-packages\torch\lib
torch set
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:53384 - "GET / HTTP/1.1" 200 OK
INFO: 127.0.0.1:53384 - "GET /status HTTP/1.1" 200 OK
INFO: 127.0.0.1:53384 - "OPTIONS /update-config HTTP/1.1" 200 OK
INFO: 127.0.0.1:53381 - "OPTIONS /switch-model HTTP/1.1" 200 OK
max_tokens=4100 temperature=1.2 top_p=0.5 presence_penalty=0.4 frequency_penalty=0.4
INFO: 127.0.0.1:53384 - "POST /update-config HTTP/1.1" 200 OK
E:\ai\py310\Lib\site-packages\_distutils_hack\__init__.py:18: UserWarning: Distutils was imported before Setuptools, but importing Setuptools also replaces the distutils module in sys.modules. This may lead to undesirable behaviors or errors. To avoid these issues, avoid using distutils directly, ensure that setuptools is installed in the traditional way (e.g. not an editable install), and/or make sure that setuptools is always imported before distutils.
warnings.warn(
E:\ai\py310\Lib\site-packages\_distutils_hack\__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
Using E:\ai\backend-python as PyTorch extensions root...
Ninja is required to load C++ extensions
INFO: 127.0.0.1:53384 - "POST /switch-model HTTP/1.1" 500 Internal Server Error

Failed to start the model

Startup failed

C:\Users\lich\Downloads\py310\Lib\site-packages\_distutils_hack\__init__.py:18: UserWarning: Distutils was imported before Setuptools, but importing Setuptools also replaces the `distutils` module in `sys.modules`. This may lead to undesirable behaviors or errors. To avoid these issues, avoid using distutils directly, ensure that setuptools is installed in the traditional way (e.g. not an editable install), and/or make sure that setuptools is always imported before distutils.
  warnings.warn(
C:\Users\lich\Downloads\py310\Lib\site-packages\_distutils_hack\__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")
No CUDA runtime is found, using CUDA_HOME='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2'
Using C:\Users\lich\Downloads\backend-python as PyTorch extensions root...
Ninja is required to load C++ extensions
INFO:     127.0.0.1:56715 - "POST /switch-model HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "C:\Users\lich\Downloads\backend-python\routes\config.py", line 49, in switch_model
    RWKV(
  File "C:\Users\lich\Downloads\backend-python\utils\rwkv.py", line 20, in __init__
    from rwkv.model import RWKV as Model  # dynamic import to make RWKV_CUDA_ON work
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\rwkv\model.py", line 78, in <module>
    load(
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\torch\utils\cpp_extension.py", line 1284, in load
    return _jit_compile(
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\torch\utils\cpp_extension.py", line 1509, in _jit_compile
    _write_ninja_file_and_build_library(
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\torch\utils\cpp_extension.py", line 1593, in _write_ninja_file_and_build_library
    verify_ninja_availability()
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\torch\utils\cpp_extension.py", line 1649, in verify_ninja_availability
    raise RuntimeError("Ninja is required to load C++ extensions")
RuntimeError: Ninja is required to load C++ extensions

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 428, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\uvicorn\middleware\proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\fastapi\applications.py", line 276, in __call__
    await super().__call__(scope, receive, send)
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\starlette\applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\starlette\middleware\errors.py", line 184, in __call__
    raise exc
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\starlette\middleware\errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\starlette\middleware\cors.py", line 92, in __call__
    await self.simple_response(scope, receive, send, request_headers=headers)
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\starlette\middleware\cors.py", line 147, in simple_response
    await self.app(scope, receive, send)
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\starlette\middleware\exceptions.py", line 79, in __call__
    raise exc
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\starlette\middleware\exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\fastapi\middleware\asyncexitstack.py", line 21, in __call__
    raise e
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\fastapi\middleware\asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\starlette\routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\starlette\routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\starlette\routing.py", line 66, in app
    response = await func(request)
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\fastapi\routing.py", line 237, in app
    raw_response = await run_endpoint_function(
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\fastapi\routing.py", line 165, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\starlette\concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "C:\Users\lich\Downloads\py310\Lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "C:\Users\lich\Downloads\backend-python\routes\config.py", line 58, in switch_model
    raise HTTPException(status.HTTP_500_INTERNAL_SERVER_ERROR, "failed to load")
AttributeError: 'function' object has no attribute 'HTTP_500_INTERNAL_SERVER_ERROR'

1.0.8 won't run.

It prompts: Python dependencies are missing, install them?
But after installing, the same prompt appears again.

Failed to convert model

Conversion failed with exit status 1. I ran it as administrator; please help me figure out what's wrong.

Custom CUDA file not found

After enabling the "use custom CUDA kernel to accelerate" option, it reports that no supported custom CUDA file can be found. How do I fix this?

Error on a 3070 Ti 8G after enabling acceleration with the GPU-6G-7B-CN template

INFO: Started server process [2280]
INFO: Waiting for application startup.
torch found: D:\AIGC\chatrvkw模型\py310\Lib\site-packages\torch\lib
torch set
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:49159 - "GET / HTTP/1.1" 200 OK
INFO: 127.0.0.1:49159 - "GET /status HTTP/1.1" 200 OK
INFO: 127.0.0.1:49159 - "OPTIONS /update-config HTTP/1.1" 200 OK
max_tokens=4100 temperature=1.2 top_p=0.5 presence_penalty=0.4 frequency_penalty=0.4
INFO: 127.0.0.1:49159 - "POST /update-config HTTP/1.1" 200 OK
INFO: 127.0.0.1:49159 - "OPTIONS /switch-model HTTP/1.1" 200 OK
D:\AIGC\chatrvkw模型\py310\Lib\site-packages\_distutils_hack\__init__.py:18: UserWarning: Distutils was imported before Setuptools, but importing Setuptools also replaces the distutils module in sys.modules. This may lead to undesirable behaviors or errors. To avoid these issues, avoid using distutils directly, ensure that setuptools is installed in the traditional way (e.g. not an editable install), and/or make sure that setuptools is always imported before distutils.
warnings.warn(
D:\AIGC\chatrvkw模型\py310\Lib\site-packages\_distutils_hack\__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
Using D:\AIGC\chatrvkw模型\backend-python as PyTorch extensions root...
Ninja is required to load C++ extensions
INFO: 127.0.0.1:49159 - "POST /switch-model HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "D:\AIGC\chatrvkw模型\backend-python\routes\config.py", line 36, in switch_model
RWKV(
File "pydantic\main.py", line 339, in pydantic.main.BaseModel.init
File "pydantic\main.py", line 1102, in pydantic.main.validate_model
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\langchain\llms\rwkv.py", line 111, in validate_environment
from rwkv.model import RWKV as RWKVMODEL
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\rwkv\model.py", line 78, in
load(
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\torch\utils\cpp_extension.py", line 1284, in load
return _jit_compile(
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\torch\utils\cpp_extension.py", line 1508, in _jit_compile
_write_ninja_file_and_build_library(
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\torch\utils\cpp_extension.py", line 1592, in _write_ninja_file_and_build_library
verify_ninja_availability()
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\torch\utils\cpp_extension.py", line 1648, in verify_ninja_availability
raise RuntimeError("Ninja is required to load C++ extensions")
RuntimeError: Ninja is required to load C++ extensions

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 428, in run_asgi
result = await app(  # type: ignore[func-returns-value]
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\uvicorn\middleware\proxy_headers.py", line 78, in __call__
return await self.app(scope, receive, send)
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\fastapi\applications.py", line 276, in __call__
await super().__call__(scope, receive, send)
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\starlette\applications.py", line 122, in __call__
await self.middleware_stack(scope, receive, send)
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\starlette\middleware\errors.py", line 184, in __call__
raise exc
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\starlette\middleware\errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\starlette\middleware\cors.py", line 92, in __call__
await self.simple_response(scope, receive, send, request_headers=headers)
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\starlette\middleware\cors.py", line 147, in simple_response
await self.app(scope, receive, send)
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\starlette\middleware\exceptions.py", line 79, in __call__
raise exc
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\starlette\middleware\exceptions.py", line 68, in __call__
await self.app(scope, receive, sender)
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\fastapi\middleware\asyncexitstack.py", line 21, in __call__
raise e
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\fastapi\middleware\asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\starlette\routing.py", line 718, in __call__
await route.handle(scope, receive, send)
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\starlette\routing.py", line 276, in handle
await self.app(scope, receive, send)
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\starlette\routing.py", line 66, in app
response = await func(request)
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\fastapi\routing.py", line 237, in app
raw_response = await run_endpoint_function(
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\fastapi\routing.py", line 165, in run_endpoint_function
return await run_in_threadpool(dependant.call, **values)
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\starlette\concurrency.py", line 41, in run_in_threadpool
return await anyio.to_thread.run_sync(func, *args)
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "D:\AIGC\chatrvkw模型\py310\Lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "D:\AIGC\chatrvkw模型\backend-python\routes\config.py", line 45, in switch_model
raise HTTPException(status.HTTP_500_INTERNAL_SERVER_ERROR, "failed to load")
AttributeError: 'function' object has no attribute 'HTTP_500_INTERNAL_SERVER_ERROR'

Startup fails on Windows 11 without administrator mode

On Windows 11, clicking Run opens the first command window via Windows Terminal, which then closes automatically. The program treats this as the process having exited and fails to recognize the command window that opens next, so model detection never happens.
If launched as administrator, the Python command window opens directly and everything runs normally.
Please either fix this behavior or prompt the user that running as administrator may be required.

Model conversion failed

I already had one model running and tried to run another, and then hit the following error. My version is 1.1.4; running as administrator didn't help either.

Could not import azure.core python package.
INFO: Started server process [15024]
INFO: Waiting for application startup.
torch found: C:\Users\clank\Downloads\py310\Lib\site-packages\torch\lib
torch set
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:49266 - "GET / HTTP/1.1" 200 OK
INFO: 127.0.0.1:49268 - "GET / HTTP/1.1" 200 OK
INFO: 127.0.0.1:49266 - "GET /status HTTP/1.1" 200 OK
INFO: 127.0.0.1:49266 - "OPTIONS /update-config HTTP/1.1" 200 OK
INFO: 127.0.0.1:49268 - "OPTIONS /switch-model HTTP/1.1" 200 OK
max_tokens=4100 temperature=1.2 top_p=0.5 presence_penalty=0.4 frequency_penalty=0.4
INFO: 127.0.0.1:49268 - "POST /update-config HTTP/1.1" 200 OK
RWKV_JIT_ON 1 RWKV_CUDA_ON 0 RESCALE_LAYER 6

Loading ./models/RWKV-4-Raven-3B-v11-Eng49%-Chn49%-Jpn1%-Other1%-20230429-ctx4096.pth ...
PytorchStreamReader failed reading zip archive: invalid header or archive is corrupted
INFO: 127.0.0.1:49266 - "POST /switch-model HTTP/1.1" 500 Internal Server Error
