matatonic / openedai-vision

An OpenAI API compatible API for chat with image input and questions about the images, aka multimodal.

License: GNU Affero General Public License v3.0


openedai-vision's Introduction

OpenedAI Vision

An OpenAI API compatible vision server. It functions like gpt-4-vision-preview and lets you chat about the contents of an image.

  • Compatible with the OpenAI Vision API (aka "chat with images")
  • Does not connect to the OpenAI API and does not require an OpenAI API Key
  • Not affiliated with OpenAI in any way
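
For a quick sense of the compatibility, here is a minimal client sketch using the official openai Python package. It assumes the server is running locally on the default port 5006, that any placeholder API key is accepted (none is required), and that the model field is effectively ignored because the server serves whichever model it was started with; the image URL is just a placeholder.

from openai import OpenAI

# No real API key is needed; base_url points at the local server instead of api.openai.com.
client = OpenAI(base_url="http://localhost:5006/v1", api_key="sk-none")

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # assumption: the loaded model is used regardless of this field
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe the image."},
            {"type": "image_url", "image_url": {"url": "https://example.com/leaf.jpg"}},
        ],
    }],
    max_tokens=300,
)
print(response.choices[0].message.content)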

Model support

See: OpenVLM Leaderboard

Recent updates

Version 0.28.0

  • new model support: internlm-xcomposer2d5-7b
  • new model support: dolphin-vision-7b (currently KeyError: 'bunny-qwen')
  • Pin glm-v-9B revision until we support transformers 4.42

Version 0.27.1

  • new model support: qnguyen3/nanoLLaVA-1.5
  • Complete support for chat without images (using placeholder images where required, 1x1 clear or 8x8 black as necessary)
  • Require transformers==4.41.2 (4.42 breaks many models)

Version 0.27.0

  • new model support: OpenGVLab/InternVL2 series of models (1B, 2B, 4B, 8B*, 26B*, 40B*, 76B*) - *(current top open source models)

Version 0.26.0

  • new model support: cognitivecomputations/dolphin-vision-72b

Version 0.25.1

  • Fix typo in vision.sample.env

Version 0.25.0

  • New model support: microsoft/Florence family of models. Not a chat model, but simple questions are ok and all commands are functional, e.g. "<MORE_DETAILED_CAPTION>", etc.
  • Improved error handling & logging

Version 0.24.1

  • Compatibility: Support generation without images for most models. (llava based models still require an image)

Version 0.24.0

  • Full streaming support for almost all models.
  • Update vikhyatk/moondream2 to 2024-05-20 + streaming
  • API compatibility improvements, strip extra leading space if present
  • Revert: no more 4bit double quant (slower for insignificant vram savings - protest and it may come back as an option)

Version 0.23.0

  • New model support: Together.ai's Llama-3-8B-Dragonfly-v1, Llama-3-8B-Dragonfly-Med-v1 (medical image model)
  • Compatibility: web.chatboxai.app can now use openedai-vision as an OpenAI API Compatible backend!
  • Initial support for streaming (real streaming for some [dragonfly, internvl-chat-v1-5], fake streaming for the rest). More to come.

Version 0.22.0

  • new model support: THUDM/glm-4v-9b

Version 0.21.0

  • new model support: Salesforce/xgen-mm-phi3-mini-instruct-r-v1
  • Major improvements in quality and compatibility for --load-in-4/8bit for many models (InternVL-Chat-V1-5, cogvlm2, MiniCPM-Llama3-V-2_5, Bunny, Monkey, ...). Layer skip with quantized loading.

Version 0.20.0

Version 0.19.1

  • really Fix <|end|> token for Mini-InternVL-Chat-4B-V1-5, thanks again @Ph0rk0z

Version 0.19.0

  • new model support: tiiuae/falcon-11B-vlm
  • add --max-tiles option for InternVL-Chat-V1-5 and xcomposer2-4khd backends. Tiles use more VRAM for higher resolution; the defaults are 6 and 40 respectively, but both are trained up to 40. Some context length warnings may appear near the limits of the model.
  • Fix <|end|> token for Mini-InternVL-Chat-4B-V1-5, thanks again @Ph0rk0z

Version 0.18.0

  • new model support: OpenGVLab/Mini-InternVL-Chat-4B-V1-5, thanks @Ph0rk0z
  • new model support: failspy/Phi-3-vision-128k-instruct-abliterated-alpha

Version 0.17.0

  • new model support: openbmb/MiniCPM-Llama3-V-2_5

Version 0.16.1

  • Add "start with" parameter to pre-fill assistant response & backend support (doesn't work with all models) - aka 'Sure,' support.

Version 0.16.0

  • new model support: microsoft/Phi-3-vision-128k-instruct

Version 0.15.1

  • new model support: OpenGVLab/Mini-InternVL-Chat-2B-V1-5

Version 0.15.0

  • new model support: cogvlm2-llama3-chinese-chat-19B, cogvlm2-llama3-chat-19B

Version 0.14.1

  • new model support: idefics2-8b-chatty, idefics2-8b-chatty-AWQ (it worked already, no code change)
  • new model support: XComposer2-VL-1.8B (it worked already, no code change)

Version: 0.14.0

  • docker-compose.yml: Assumes the runtime supports the device (i.e. nvidia)
  • new model support: qihoo360/360VL-8B, qihoo360/360VL-70B (70B is untested, too large for me)
  • new model support: BAAI/Emu2-Chat. Can be slow to load; may need the --max-memory option to control loading across multiple GPUs
  • new model support: TIGER-Labs/Mantis: Mantis-8B-siglip-llama3, Mantis-8B-clip-llama3, Mantis-8B-Fuyu

Version: 0.13.0

  • new model support: InternLM-XComposer2-4KHD
  • new model support: BAAI/Bunny-Llama-3-8B-V
  • new model support: qresearch/llama-3-vision-alpha-hf

Version: 0.12.1

  • new model support: HuggingFaceM4/idefics2-8b, HuggingFaceM4/idefics2-8b-AWQ
  • Fix: remove prompt from output of InternVL-Chat-V1-5

Version: 0.11.0

  • new model support: OpenGVLab/InternVL-Chat-V1-5, up to 4k resolution, a top open-source model
  • MiniGemini renamed MGM upstream

API Documentation

Docker support

  1. Edit the vision.env or vision-alt.env file to suit your needs. See: vision.sample.env for an example.
cp vision.sample.env vision.env
# OR for the alt version
cp vision-alt.sample.env vision-alt.env
  2. You can run the server via docker compose like so:
# for OpenedAI Vision Server
docker compose up
# for OpenedAI Vision Server (alternate, for Mini-Gemini > 2B, uses transformers==4.36.2)
docker compose -f docker-compose.alt.yml up

Add the -d flag to daemonize (see the example after these steps). To install as a service, add --restart unless-stopped.

  3. To update your setup (or download the image before running the server), you can pull the latest version of the image with the following command:
# for OpenedAI Vision Server
docker compose pull
# for OpenedAI Vision Server (alternate, for Mini-Gemini > 2B, nanollava, moondream1) which uses transformers==4.36.2
docker compose -f docker-compose.alt.yml pull
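
For example, to daemonize the server and have it restart automatically, something like the following should work (a sketch; it assumes the compose service is named server, as in the container logs shown later on this page):

# start the server in the background
docker compose up -d

# docker-compose.yml excerpt: one way to keep the server restarting across reboots
services:
  server:
    restart: unless-stopped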

Manual Installation instructions

# install the python dependencies
pip install -U -r requirements.txt "transformers==4.41.2" "autoawq>=0.2.5"
# OR install the python dependencies for the alt version
pip install -U -r requirements.txt "transformers==4.36.2"
# run the server with your chosen model
python vision.py --model vikhyatk/moondream2

For MiniGemini support, the docker image is recommended. See prepare_minigemini.sh for manual installation instructions; models for mini_gemini must be downloaded to local directories, not just run from cache.

Usage

usage: vision.py [-h] -m MODEL [-b BACKEND] [-f FORMAT] [-d DEVICE] [--device-map DEVICE_MAP] [--max-memory MAX_MEMORY] [--no-trust-remote-code] [-4]
                 [-8] [-F] [-T MAX_TILES] [-L {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [-P PORT] [-H HOST] [--preload]

OpenedAI Vision API Server

options:
  -h, --help            show this help message and exit
  -m MODEL, --model MODEL
                        The model to use, Ex. llava-hf/llava-v1.6-mistral-7b-hf (default: None)
  -b BACKEND, --backend BACKEND
                        Force the backend to use (moondream1, moondream2, llavanext, llava, qwen-vl) (default: None)
  -f FORMAT, --format FORMAT
                        Force a specific chat format. (vicuna, mistral, chatml, llama2, phi15, gemma) (doesn't work with all models) (default: None)
  -d DEVICE, --device DEVICE
                        Set the torch device for the model. Ex. cpu, cuda:1 (default: auto)
  --device-map DEVICE_MAP
                        Set the default device map policy for the model. (auto, balanced, sequential, balanced_low_0, cuda:1, etc.) (default: auto)
  --max-memory MAX_MEMORY
                        (emu2 only) Set the per cuda device_map max_memory. Ex. 0:22GiB,1:22GiB,cpu:128GiB (default: None)
  --no-trust-remote-code
                        Don't trust remote code (required for many models) (default: False)
  -4, --load-in-4bit    load in 4bit (doesn't work with all models) (default: False)
  -8, --load-in-8bit    load in 8bit (doesn't work with all models) (default: False)
  -F, --use-flash-attn  Use Flash Attention 2 (doesn't work with all models or GPU) (default: False)
  -T MAX_TILES, --max-tiles MAX_TILES
                        Change the maximum number of tiles. [1-55+] (uses more VRAM for higher resolution, doesn't work with all models) (default: None)
  -L {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}
                        Set the log level (default: INFO)
  -P PORT, --port PORT  Server tcp port (default: 5006)
  -H HOST, --host HOST  Host to listen on, Ex. localhost (default: 0.0.0.0)
  --preload             Preload model and exit. (default: False)
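
For example, a typical launch that combines several of these options might look like the line below (a sketch; 4-bit loading and Flash Attention 2 do not work with every model, so check the sample env files for known-good combinations):

# serve the help text's example model in 4-bit with Flash Attention 2 on the default port
python vision.py -m llava-hf/llava-v1.6-mistral-7b-hf --load-in-4bit --use-flash-attn -P 5006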

Sample API Usage

chat_with_image.py has a sample of how to use the API.

Usage

usage: chat_with_image.py [-h] [-s SYSTEM_PROMPT] [-S START_WITH] [-m MAX_TOKENS] [-t TEMPERATURE] [-p TOP_P] [-u] [-1] [--no-stream] image_url [questions ...]

Test vision using OpenAI

positional arguments:
  image_url             URL or image file to be tested
  questions             The question to ask the image (default: None)

options:
  -h, --help            show this help message and exit
  -s SYSTEM_PROMPT, --system-prompt SYSTEM_PROMPT
  -S START_WITH, --start-with START_WITH
                        Start reply with, ex. 'Sure, ' (doesn't work with all models) (default: None)
  -m MAX_TOKENS, --max-tokens MAX_TOKENS
  -t TEMPERATURE, --temperature TEMPERATURE
  -p TOP_P, --top_p TOP_P
  -u, --keep-remote-urls
                        Normally, http urls are converted to data: urls for better latency. (default: False)
  -1, --single          Single turn Q&A, output is only the model response. (default: False)
  --no-stream           Disable streaming response. (default: False)

Example:

$ python chat_with_image.py -1 https://images.freeimages.com/images/large-previews/cd7/gingko-biloba-1058537.jpg "Describe the image."
The image presents a single, large green leaf with a pointed tip and a serrated edge. The leaf is attached to a thin stem, suggesting it's still connected to its plant. The leaf is set against a stark white background, which contrasts with the leaf's vibrant green color. The leaf's position and the absence of other objects in the image give it a sense of isolation. There are no discernible texts or actions associated with the leaf. The relative position of the leaf to the background remains constant as it is the sole object in the image. The image does not provide any information about the leaf's size or the type of plant it belongs to. The leaf's serrated edge and pointed tip might suggest it's from a deciduous tree, but without additional context, this is purely speculative. The image is a simple yet detailed representation of a single leaf.
$ python chat_with_image.py https://images.freeimages.com/images/large-previews/e59/autumn-tree-1408307.jpg
Answer: The image captures a serene autumn scene. The main subject is a deciduous tree, standing alone on the shore of a calm lake. The tree is in the midst of changing colors, with leaves in shades of orange, yellow, and green. The branches of the tree are bare, indicating that the leaves are in the process of falling. The tree is positioned on the left side of the image, with its reflection visible in the still water of the lake.

The background of the image features a mountain range, which is partially obscured by a haze. The mountains are covered in a dense forest, with trees displaying a mix of green and autumnal colors. The sky above is clear and blue, suggesting a calm and sunny day.

The overall composition of the image places the tree as the focal point, with the lake, mountains, and sky serving as a picturesque backdrop. The image does not contain any discernible text or human-made objects, reinforcing the natural beauty of the scene. The relative positions of the objects in the image create a sense of depth and perspective, with the tree in the foreground, the lake in the middle ground, and the mountains and sky in the background. The image is a testament to the tranquil beauty of nature during the autumn season.

Question: What kind of tree is it?
Answer: Based on the image, it is not possible to definitively identify the species of the tree. However, the tree's characteristics, such as its broad leaves and the way they change color in the fall, suggest that it could be a type of deciduous tree commonly found in temperate regions. Without more specific details or a closer view, it is not possible to provide a more precise identification of the tree species.

Question: Is it a larch?
Answer: The tree in the image could potentially be a larch, which is a type of deciduous conifer. Larches are known for their needle-like leaves that turn yellow and orange in the fall before falling off. However, without a closer view or more specific details, it is not possible to confirm whether the tree is indeed a larch. The image does not provide enough detail to make a definitive identification of the tree species.

Question: ^D

Known Problems & Workarounds

  1. Related to CUDA device splitting. If you get:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! (when checking argument for argument tensors in method wrapper_CUDA_cat)

Try specifying a single CUDA device with CUDA_VISIBLE_DEVICES=1 (or the number of your GPU) before running the script, or set the device via --device-map cuda:0 (or --device cuda:0 in the alt image!) on the command line. See the single-GPU example after this list.

  2. 4-bit/8-bit quantization and Flash Attention 2 don't work for all the models. No workaround; see sample.env for known working configurations.

  3. Yi-VL is currently not working.

  4. The default --device-map auto doesn't always work well with these models. If you have issues with multiple GPUs, try using sequential and selecting the order of your CUDA devices, like so:

# Example for reversing the order of 2 devices.
CUDA_VISIBLE_DEVICES=1,0 python vision.py -m llava-hf/llava-v1.6-34b-hf --device-map sequential
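
Similarly, for problem 1, pinning the server to a single GPU might look like this (a sketch; adjust the device index and model to your setup):

# Example for restricting the server to one device.
CUDA_VISIBLE_DEVICES=0 python vision.py -m llava-hf/llava-v1.6-34b-hf --device-map cuda:0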

openedai-vision's People

Contributors

matatonic


openedai-vision's Issues

FlashAttention only supports Ampere GPUs or newer

Dear DevTeam,
thanks so much for this great tool!
During my testing I found a big showstopper: the "FlashAttention" option...
My setup has two Nvidia RTX 8000 boards, which are from the Turing family (TU102GL) and do not support "FlashAttention".
Would it be possible to run the vision models with this library?

I will add some more details:
Used Model: "python vision.py -m OpenGVLab/InternVL2-1B --device-map cuda:0"

Logs:

  File "/usr/local/lib/python3.11/site-packages/torch/autograd/function.py", line 553, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/flash_attn/flash_attn_interface.py", line 290, in forward
    out, q, k, v, out_padded, softmax_lse, S_dmask, rng_state = _flash_attn_varlen_forward(
                                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/flash_attn/flash_attn_interface.py", line 86, in _flash_attn_varlen_forward
    out, q, k, v, out_padded, softmax_lse, S_dmask, rng_state = flash_attn_cuda.varlen_fwd(
                                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: FlashAttention only supports Ampere GPUs or newer.

I did not use the "FlashAttention" flag, but I still receive the error.

Thanks so much

Responding even if there is no image provided

I tried to integrate this with a Chat Interface.
If I provide an image it works; if I don't provide an image, it breaks.

Please make it work for both, so that it can easily be integrated with any application that handles both image and text, text only, or image only.

This is what happened when I tried with text only:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/Ubuntu/miniconda3/envs/openedai/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/Ubuntu/miniconda3/envs/openedai/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/Ubuntu/miniconda3/envs/openedai/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/home/Ubuntu/miniconda3/envs/openedai/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/Ubuntu/miniconda3/envs/openedai/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/home/Ubuntu/miniconda3/envs/openedai/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/home/Ubuntu/miniconda3/envs/openedai/lib/python3.11/site-packages/starlette/middleware/base.py", line 189, in __call__
    with collapse_excgroups():
  File "/home/Ubuntu/miniconda3/envs/openedai/lib/python3.11/contextlib.py", line 158, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/Ubuntu/miniconda3/envs/openedai/lib/python3.11/site-packages/starlette/_utils.py", line 93, in collapse_excgroups
    raise exc
  File "/home/Ubuntu/miniconda3/envs/openedai/lib/python3.11/site-packages/sse_starlette/sse.py", line 273, in wrap
    await func()
  File "/home/Ubuntu/miniconda3/envs/openedai/lib/python3.11/site-packages/sse_starlette/sse.py", line 253, in stream_response
    async for data in self.body_iterator:
  File "/home/Ubuntu/openedai-vision/vision.py", line 59, in streamer
    async for resp in vision_qna.stream_chat_with_images(request):
  File "/home/Ubuntu/openedai-vision/backend/llavanext.py", line 34, in stream_chat_with_images
    inputs = self.processor(prompt, images, return_tensors="pt").to(self.model.device)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/Ubuntu/miniconda3/envs/openedai/lib/python3.11/site-packages/transformers/models/llava_next/processing_llava_next.py", line 110, in __call__
    image_inputs = self.image_processor(images, do_pad=do_pad, return_tensors=return_tensors)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/Ubuntu/miniconda3/envs/openedai/lib/python3.11/site-packages/transformers/image_processing_utils.py", line 41, in __call__
    return self.preprocess(images, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/Ubuntu/miniconda3/envs/openedai/lib/python3.11/site-packages/transformers/models/llava_next/image_processing_llava_next.py", line 678, in preprocess
    images = make_batched_images(images)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/Ubuntu/miniconda3/envs/openedai/lib/python3.11/site-packages/transformers/models/llava_next/image_processing_llava_next.py", line 67, in make_batched_images
    if isinstance(images, (list, tuple)) and isinstance(images[0], (list, tuple)) and is_valid_image(images[0][0]):
       ~~~~~~^^^
IndexError: list index out of range
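
This appears to have been addressed in version 0.27.1 (see the changelog above), which substitutes placeholder images where required. On older versions, a rough client-side workaround along the same lines is to attach a tiny placeholder image to text-only messages before sending them. The helper below is hypothetical, not part of the project, and assumes OpenAI-style list-of-parts message content:

from base64 import b64encode
from io import BytesIO

from PIL import Image  # Pillow

def with_placeholder_image(content: list) -> list:
    """Hypothetical helper: append a 1x1 transparent PNG part if the message has no image."""
    if any(part.get("type") == "image_url" for part in content):
        return content
    buf = BytesIO()
    Image.new("RGBA", (1, 1), (0, 0, 0, 0)).save(buf, format="PNG")
    data_url = "data:image/png;base64," + b64encode(buf.getvalue()).decode()
    return content + [{"type": "image_url", "image_url": {"url": data_url}}]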

simple valid base64 image results in incorrect padding error

The same image works fine when using LM Studio, but results in a 500 error in openedai-vision.

Input

[
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "what do you see in the image?"
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII"
                }
            }
        ]
    }
]

Output from LM Studio
{"id":"chatcmpl-m6h6759cqvl7mby9i1gz","object":"chat.completion","created":1720801907,"model":"moondream/moondream2-gguf/moondream2-text-model-f16.gguf","choices":[{"index":0,"message":{"role":"assistant","content":"In this image, there are four squares of varying sizes, all rendered in shades of gray and white. They appear to be placed side by side with some distance between them, creating a sense of depth and space within the design. The colors used for each square range from pure black to more muted tones, adding contrast and visual interest to the overall composition. This image seems to be a digital illustration or an artistic representation of an abstract pattern that could potentially be used in various mediums such as graphic design or interior decorating."},"finish_reason":"stop"}],"usage":{"prompt_tokens":42,"completion_tokens":105,"total_tokens":147}}

Output from openedai-vision

(venvlcpp) anand@hsti4090:~/openedai-vision$ docker compose up
[+] Running 2/1
 ✔ Network openedai-vision_default     Created                                                                                                                                           0.2s 
 ✔ Container openedai-vision-server-1  Created                                                                                                                                           0.1s 
Attaching to server-1
server-1  | 2024-07-12 16:47:58.053 | INFO     | __main__:<module>:143 - Loading VisionQnA[moondream2] with vikhyatk/moondream2
server-1  | Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
server-1  | You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
server-1  | 2024-07-12 16:48:00.058 | INFO     | vision_qna:loaded_banner:92 - Loaded vikhyatk/moondream2 on device: cuda:0 with dtype: torch.bfloat16
server-1  | INFO:     Started server process [7]
server-1  | INFO:     Waiting for application startup.
server-1  | INFO:     Application startup complete.
server-1  | INFO:     Uvicorn running on http://0.0.0.0:5006 (Press CTRL+C to quit)
server-1  | INFO:     172.21.0.1:39040 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
server-1  | ERROR:    Exception in ASGI application
server-1  |   + Exception Group Traceback (most recent call last):
server-1  |   |   File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 87, in collapse_excgroups
server-1  |   |     yield
server-1  |   |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 190, in __call__
server-1  |   |     async with anyio.create_task_group() as task_group:
server-1  |   |   File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 680, in __aexit__
server-1  |   |     raise BaseExceptionGroup(
server-1  |   | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
server-1  |   +-+---------------- 1 ----------------
server-1  |     | Traceback (most recent call last):
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
server-1  |     |     result = await app(  # type: ignore[func-returns-value]
server-1  |     |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
server-1  |     |     return await self.app(scope, receive, send)
server-1  |     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
server-1  |     |     await super().__call__(scope, receive, send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
server-1  |     |     await self.middleware_stack(scope, receive, send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
server-1  |     |     raise exc
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
server-1  |     |     await self.app(scope, receive, _send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 189, in __call__
server-1  |     |     with collapse_excgroups():
server-1  |     |   File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
server-1  |     |     self.gen.throw(typ, value, traceback)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 93, in collapse_excgroups
server-1  |     |     raise exc
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 191, in __call__
server-1  |     |     response = await self.dispatch_func(request, call_next)
server-1  |     |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/app/openedai.py", line 126, in log_requests
server-1  |     |     response = await call_next(request)
server-1  |     |                ^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 165, in call_next
server-1  |     |     raise app_exc
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 151, in coro
server-1  |     |     await self.app(scope, receive_or_disconnect, send_no_error)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/cors.py", line 85, in __call__
server-1  |     |     await self.app(scope, receive, send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
server-1  |     |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
server-1  |     |     raise exc
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
server-1  |     |     await app(scope, receive, sender)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 756, in __call__
server-1  |     |     await self.middleware_stack(scope, receive, send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 776, in app
server-1  |     |     await route.handle(scope, receive, send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 297, in handle
server-1  |     |     await self.app(scope, receive, send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 77, in app
server-1  |     |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
server-1  |     |     raise exc
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
server-1  |     |     await app(scope, receive, sender)
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 72, in app
server-1  |     |     response = await func(request)
server-1  |     |                ^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 278, in app
server-1  |     |     raw_response = await run_endpoint_function(
server-1  |     |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
server-1  |     |     return await dependant.call(**values)
server-1  |     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/app/vision.py", line 87, in vision_chat_completions
server-1  |     |     text = await vision_qna.chat_with_images(request)
server-1  |     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/app/vision_qna.py", line 116, in chat_with_images
server-1  |     |     return ''.join([r async for r in self.stream_chat_with_images(request)])
server-1  |     |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/app/vision_qna.py", line 116, in <listcomp>
server-1  |     |     return ''.join([r async for r in self.stream_chat_with_images(request)])
server-1  |     |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/app/backend/moondream2.py", line 30, in stream_chat_with_images
server-1  |     |     images, prompt = await prompt_from_messages(request.messages, self.format)
server-1  |     |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/app/vision_qna.py", line 704, in prompt_from_messages
server-1  |     |     return await known_formats[format](messages)
server-1  |     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/app/vision_qna.py", line 229, in phi15_prompt_from_messages
server-1  |     |     img_data = await url_handler(c.image_url.url)
server-1  |     |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     |   File "/app/vision_qna.py", line 178, in url_to_image
server-1  |     |     img_data = DataURI(img_url).data
server-1  |     |                ^^^^^^^^^^^^^^^^
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/datauri/__init__.py", line 85, in __new__
server-1  |     |     uri._parse  # Trigger any ValueErrors on instantiation.
server-1  |     |     ^^^^^^^^^^
server-1  |     |   File "/usr/local/lib/python3.11/site-packages/datauri/__init__.py", line 148, in _parse
server-1  |     |     data = decode64(_data)
server-1  |     |            ^^^^^^^^^^^^^^^
server-1  |     |   File "/usr/local/lib/python3.11/base64.py", line 88, in b64decode
server-1  |     |     return binascii.a2b_base64(s, strict_mode=validate)
server-1  |     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |     | binascii.Error: Incorrect padding
server-1  |     +------------------------------------
server-1  | 
server-1  | During handling of the above exception, another exception occurred:
server-1  | 
server-1  | Traceback (most recent call last):
server-1  |   File "/usr/local/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
server-1  |     result = await app(  # type: ignore[func-returns-value]
server-1  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/usr/local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
server-1  |     return await self.app(scope, receive, send)
server-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/usr/local/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
server-1  |     await super().__call__(scope, receive, send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
server-1  |     await self.middleware_stack(scope, receive, send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
server-1  |     raise exc
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
server-1  |     await self.app(scope, receive, _send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 189, in __call__
server-1  |     with collapse_excgroups():
server-1  |   File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
server-1  |     self.gen.throw(typ, value, traceback)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 93, in collapse_excgroups
server-1  |     raise exc
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 191, in __call__
server-1  |     response = await self.dispatch_func(request, call_next)
server-1  |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/app/openedai.py", line 126, in log_requests
server-1  |     response = await call_next(request)
server-1  |                ^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 165, in call_next
server-1  |     raise app_exc
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 151, in coro
server-1  |     await self.app(scope, receive_or_disconnect, send_no_error)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/cors.py", line 85, in __call__
server-1  |     await self.app(scope, receive, send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
server-1  |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
server-1  |     raise exc
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
server-1  |     await app(scope, receive, sender)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 756, in __call__
server-1  |     await self.middleware_stack(scope, receive, send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 776, in app
server-1  |     await route.handle(scope, receive, send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 297, in handle
server-1  |     await self.app(scope, receive, send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 77, in app
server-1  |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
server-1  |     raise exc
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
server-1  |     await app(scope, receive, sender)
server-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 72, in app
server-1  |     response = await func(request)
server-1  |                ^^^^^^^^^^^^^^^^^^^
server-1  |   File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 278, in app
server-1  |     raw_response = await run_endpoint_function(
server-1  |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
server-1  |     return await dependant.call(**values)
server-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/app/vision.py", line 87, in vision_chat_completions
server-1  |     text = await vision_qna.chat_with_images(request)
server-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/app/vision_qna.py", line 116, in chat_with_images
server-1  |     return ''.join([r async for r in self.stream_chat_with_images(request)])
server-1  |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/app/vision_qna.py", line 116, in <listcomp>
server-1  |     return ''.join([r async for r in self.stream_chat_with_images(request)])
server-1  |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/app/backend/moondream2.py", line 30, in stream_chat_with_images
server-1  |     images, prompt = await prompt_from_messages(request.messages, self.format)
server-1  |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/app/vision_qna.py", line 704, in prompt_from_messages
server-1  |     return await known_formats[format](messages)
server-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/app/vision_qna.py", line 229, in phi15_prompt_from_messages
server-1  |     img_data = await url_handler(c.image_url.url)
server-1  |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  |   File "/app/vision_qna.py", line 178, in url_to_image
server-1  |     img_data = DataURI(img_url).data
server-1  |                ^^^^^^^^^^^^^^^^
server-1  |   File "/usr/local/lib/python3.11/site-packages/datauri/__init__.py", line 85, in __new__
server-1  |     uri._parse  # Trigger any ValueErrors on instantiation.
server-1  |     ^^^^^^^^^^
server-1  |   File "/usr/local/lib/python3.11/site-packages/datauri/__init__.py", line 148, in _parse
server-1  |     data = decode64(_data)
server-1  |            ^^^^^^^^^^^^^^^
server-1  |   File "/usr/local/lib/python3.11/base64.py", line 88, in b64decode
server-1  |     return binascii.a2b_base64(s, strict_mode=validate)
server-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
server-1  | binascii.Error: Incorrect padding
server-1  | INFO:     172.21.0.1:39048 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
server-1  | ERROR:    Exception in ASGI application
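
The 500 here comes from binascii.Error: Incorrect padding: the base64 payload's length is not a multiple of 4 because its trailing '=' characters were stripped, and the server's data: URI parser is strict about that while LM Studio apparently tolerates it. A possible client-side workaround is to re-pad the URL before sending; the helper below is illustrative, not part of the project:

def pad_base64_data_url(url: str) -> str:
    """Restore missing '=' padding so strict base64 decoders accept the payload."""
    header, _, payload = url.partition(",")  # assumes a data:<mime>;base64,<payload> URL
    return header + "," + payload + "=" * (-len(payload) % 4)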

Choosing the model via the POST request when making an API call

Currently, providing a model is a required argument:

python vision.py
usage: vision.py [-h] -m MODEL [-b BACKEND] [-f FORMAT] [-d DEVICE] [--device-map DEVICE_MAP]
                 [--max-memory MAX_MEMORY] [--no-trust-remote-code] [-4] [-8] [-F] [-T MAX_TILES]
                 [-L {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [-P PORT] [-H HOST] [--preload]
vision.py: error: the following arguments are required: -m/--model

Aim: add the ability to choose the model when calling the API. This would be a great option and would give additional flexibility.
