bricks-cloud / bricksllm

🔒 Enterprise-grade API gateway that helps you monitor and impose cost or rate limits per API key. Get fine-grained access control and monitoring per user, application, or environment. Supports OpenAI, Azure OpenAI, Anthropic, vLLM, and open-source LLMs.

Home Page: https://trybricks.ai/

License: MIT License

Languages: Go 99.96%, Shell 0.04%
Topics: golang, llm, openai, ai, anthropic, azure, gpt, postgresql, rest-api, ycombinator

bricksllm's Issues

Authentication for Key Creation in BricksLLM API

Description:
I am using the BricksLLM API to create keys for my application. However, I am concerned about unauthorized key creation. Is there any way to add authentication to the key creation process?

Steps to Reproduce:

  1. Navigate to the key creation section of the BricksLLM API.
  2. Attempt to create a key without authentication.
  3. Observe that the key is created successfully.

Expected Behavior:
Unauthorized key creation should not be possible. There should be some form of authentication required to create a key.

Actual Behavior:
Unauthorized key creation is possible without any authentication.

Additional Information:
I am using the latest version of the BricksLLM API. I have not made any modifications to the API code.
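A minimal sketch of one way such a gate could look, assuming a net/http-style admin server and a shared admin secret supplied at startup (the middleware and header handling here are illustrative, not BricksLLM's actual mechanism):

package middleware

import (
	"crypto/subtle"
	"net/http"
)

// requireAdminKey rejects any request whose X-API-KEY header does not
// match the configured admin secret (hypothetical middleware).
func requireAdminKey(secret string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got := []byte(r.Header.Get("X-API-KEY"))
		if subtle.ConstantTimeCompare(got, []byte(secret)) != 1 {
			http.Error(w, `{"error":"unauthorized"}`, http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

Wrapping the key-management routes in a check like this would make step 2 of the reproduction above fail with a 401 instead of creating the key.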

Certain Bricks API keys can't pass authentication.

When I create Bricks API keys like 5xrdVPkTQMschs5CRLxYZuMu, I always get a 401 not authorized error:

$ curl -X PATCH -H "Content-Type: application/json" -H "X-API-KEY: [redacted]" localhost:8001/api/key-management/keys/b0070528-209a-4ed2-a788-c58aeef523ec -d '{
    "name": "testkey",
    "tags": ["testkey"],
    "key": "5xrdVPkTQMschs5CRLxYZuMu",
    "settingIds": ["a104afa5-4313-48cc-a75f-0fe57cec2877"],
    "costLimitInUsdOverTime": 0.1,
    "costLimitInUsdUnit": "d",
    "rateLimitOverTime": 30,
    "rateLimitUnit": "m",
    "shouldLogRequest": true,
    "shouldLogResponse": true
}'
{"name":"testkey","createdAt":1709286279,"updatedAt":1709377458,"tags":["testkey"],"keyId":"b0070528-209a-4ed2-a788-c58aeef523ec","revoked":false,"key":"ac480863ca46a254b576ac824c1e633fa38ac46582cd300e06b83084acc297ae","revokedReason":"","costLimitInUsd":0,"costLimitInUsdOverTime":0.1,"costLimitInUsdUnit":"d","rateLimitOverTime":30,"rateLimitUnit":"m","ttl":"","settingId":"","allowedPaths":null,"settingIds":["a104afa5-4313-48cc-a75f-0fe57cec2877"],"shouldLogRequest":true,"shouldLogResponse":true,"rotationEnabled":false}

$ curl -X POST http://localhost:8002/api/providers/openai/v1/chat/completions \
   -H "Authorization: Bearer 5xrdVPkTQMschs5CRLxYZuMu" \
   -H "Content-Type: application/json" \
   -d '{
          "model": "gpt-3.5-turbo",
          "messages": [
              {
                  "role": "system",
                  "content": "I'"'"'m testing, hoping to catch some error."
              }
          ]
      }'

{"error":{"code":"401","message":"[BricksLLM] not authorized","type":""}}

After some debugging, I found the sha256 hash of this key doesn't match the hash stored in my database:

$ echo -n 5xrdVPkTQMschs5CRLxYZuMu | sha256sum
106ff82a265e1bae932377d34ba7cc737793eaf98190df2be1637c16143f1816  -

bricksllm=> SELECT key FROM keys WHERE name = 'testkey';
                               key                                
------------------------------------------------------------------
 ac480863ca46a254b576ac824c1e633fa38ac46582cd300e06b83084acc297ae
(1 row)

The hash calculated by this line in the key-creation code is also 106ff...:

hash := encrypter.Encrypt(raw)

When I use keys like WsbjdNiM9CP2wukbZMjF, authentication passes and the hashes match; no problem.
I think there is a problem in the code that stores keys.
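The mismatch is easy to reproduce outside the gateway. A minimal standalone check, assuming (as the debugging above suggests) that the lookup hash is plain SHA-256 of the raw key:

package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

func main() {
	raw := "5xrdVPkTQMschs5CRLxYZuMu"
	sum := sha256.Sum256([]byte(raw))
	// Prints 106ff82a26..., matching the sha256sum output above but not
	// the stored ac480863ca... row, so a different value must have been
	// hashed when the key was written to Postgres.
	fmt.Println(hex.EncodeToString(sum[:]))
}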

Not able to make it work

BricksLLM was not able to detect the OpenAI API key I set with the export command; I had to update the YAML file manually.


After updating, I am still not able to use it. I get the following error:

{'error': {'message': '[BricksLLM] failed to parse openai error response', 'code': 500}}


Add support for more granular rate limit per API route

Is it possible to define rate limits at the granularity of the OpenAI API, i.e., different RPM/TPM for each model? The context is that we want to give 100 students access to the API at the same time through our Tier 3 token, so every key needs 35 RPM / 1600 TPM for gpt-3.5-turbo and 50 RPM / 5000 TPM for text-embedding-ada-002. If I understand the documentation correctly, we can currently only set the minimum of each limit?
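For illustration, a hypothetical shape for per-model limits attached to a single key (these names are sketches, not part of the current key schema):

package limits

// ModelLimit is a sketch of per-model rate limits for one API key;
// the field names here are hypothetical, not BricksLLM's schema.
type ModelLimit struct {
	Model string // e.g. "gpt-3.5-turbo" or "text-embedding-ada-002"
	RPM   int    // requests per minute allowed for this model
	TPM   int    // tokens per minute allowed for this model
}

// KeyLimits would be consulted per request, keyed on the request's
// "model" field, falling back to the key-wide limit when absent.
type KeyLimits struct {
	PerModel []ModelLimit
	Default  ModelLimit
}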

CORS preflight request (HTTP OPTIONS method) not routed

I have BetterChatGPT running on http://localhost:5173/. In this ChatGPT-like web application, the AI API calls are sent from the browser. This works when I use https://api.openai.com/v1/chat/completions as the API endpoint. However, it doesn't work when I use my BricksLLM instance ( https://ai.molgen.mpg.de/api/providers/openai/v1/chat/completions ) as the API endpoint.

The reason, IMO, is that the browser (Firefox 122.0, but surely others as well) doesn't generally allow scripts from one origin ( http://localhost:5173/ ) to access resources from another origin (the API endpoints) because of the same-origin policy, unless the server supports the CORS protocol and sends headers which explicitly allow its resources to be accessed by other origins.

In the case of simple GET, POST, and HEAD requests, these headers are sent during the normal client-server dialog. However, the API calls are not simple in this sense, for example because they include an Authorization header. In these cases, the browser crafts a so-called preflight request on its own and sends it to the server to validate the cross-origin use before sending the real request. The preflight request uses the HTTP OPTIONS method.

Here is a picture of the Firefox network log of a working API request toward the OpenAI API endpoint:

(screenshot: Firefox network log; OPTIONS preflight and POST both succeed)

And here is one towards the non-functional BricksLLM API endpoint:

(screenshot: Firefox network log; OPTIONS preflight fails)

The OPTIONS request gets a "404 Not Found" response, and the browser never even sends the POST request because of its same-origin policy and the failed CORS preflight check.

Please note that this explanation of the same-origin policy and CORS is my current understanding ( source ), but I'm not an expert in that area and may be wrong.

The difference can also be demonstrated with curl:

buczek@theinternet:~$ curl -i -X OPTIONS https://api.openai.com/v1/chat/completions -H "Authorization: Bearer <CENSORED>"
HTTP/2 200 
date: Fri, 09 Feb 2024 10:15:40 GMT
content-length: 0
access-control-allow-headers: 
access-control-allow-methods: GET, OPTIONS, POST
strict-transport-security: max-age=15724800; includeSubDomains
cf-cache-status: DYNAMIC
set-cookie: <CENSORED>; path=/; expires=Fri, 09-Feb-24 10:45:40 GMT; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
set-cookie: <CENSORED>; path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
server: cloudflare
cf-ray: <CENSORED>
alt-svc: h3=":443"; ma=86400

buczek@theinternet:~$ curl -i -X OPTIONS https://ai.molgen.mpg.de/api/providers/openai/v1/chat/completions -H "Authorization: Bearer <CENSORED>"
HTTP/1.1 404 Not Found
Server: nginx/1.25.3
Date: Fri, 09 Feb 2024 10:15:47 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 78
Connection: keep-alive
Strict-Transport-Security: max-age=31536000; includeSubDomains

{"error":{"code":"404","message":"[BricksLLM] route not supported","type":""}}
buczek@theinternet:~$ 

Maybe this can be fixed by just adding OPTIONS to the methods which are proxied to the provider.
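A minimal sketch of the other direction, answering the preflight at the gateway itself rather than proxying it (the header values here are permissive placeholders that would need tightening; this is not the project's actual code):

package middleware

import "net/http"

// corsPreflight answers OPTIONS preflight requests directly and adds the
// CORS headers browsers require before they will send the real request.
func corsPreflight(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h := w.Header()
		h.Set("Access-Control-Allow-Origin", "*") // placeholder: restrict in production
		h.Set("Access-Control-Allow-Methods", "GET, POST, OPTIONS")
		h.Set("Access-Control-Allow-Headers", "Authorization, Content-Type")
		if r.Method == http.MethodOptions {
			w.WriteHeader(http.StatusNoContent) // preflight ends here
			return
		}
		next.ServeHTTP(w, r) // real request continues to the proxy
	})
}

Either answering OPTIONS locally like this or forwarding it to the provider (which, as the curl output above shows, responds with 200) should satisfy the browser's preflight check.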

Rate limit refresh time is not consistent

Currently, rate limits refresh at the end of a time interval that starts at the key's creation time. The correct behavior is the following: if a key with a 1-request-per-minute rate limit is created at 55 seconds past the minute (UTC), its limit should refresh 5 seconds later, at the next minute boundary, instead of after a full 60 seconds.
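A minimal sketch of the boundary-aligned calculation (a hypothetical helper, not the project's current code):

package main

import (
	"fmt"
	"time"
)

// untilNextWindow returns how long to wait until the limit refreshes,
// aligned to fixed UTC window boundaries rather than the creation time.
func untilNextWindow(now time.Time, window time.Duration) time.Duration {
	return now.Truncate(window).Add(window).Sub(now)
}

func main() {
	created := time.Date(2024, 4, 5, 9, 0, 55, 0, time.UTC)
	fmt.Println(untilNextWindow(created, time.Minute)) // 5s, not 1m0s
}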

OAuth authentication

Currently, the BricksLLM proxy only has basic key authentication. OAuth could help improve the security of the proxy endpoints.

Investigate issues related to the streaming mode

I noticed that [streaming mode](https://platform.openai.com/docs/api-reference/streaming) for chat completions is much less fluent through the proxy than with the normal OpenAI API. The normal API delivers many chunks per second (5-10, I'd guess) to the client, while the proxy seems to update the response only once per second, without regard to individual chunks. Would it be complicated to fix that?

(I only took a short look at the [implementation](https://github.com/bricks-cloud/BricksLLM/blob/325f1d88315411e75ac9aadf7c96b468b37eb66e/internal/server/web/proxy.go#L770-L826); maybe the buffer size is too large, or synchronous cost estimation takes too much time? Of course I don't have a deeper knowledge of your codebase, even though it appears nice to read :slight_smile: )
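For comparison, a minimal sketch of per-chunk forwarding, assuming a server-sent-events upstream and an http.Flusher-capable response writer (illustrative only, not the project's actual proxy code):

package proxy

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
)

// streamChunks forwards upstream SSE lines as they arrive, flushing after
// each one so the client sees every chunk instead of batched updates.
func streamChunks(w http.ResponseWriter, upstream io.Reader) {
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}
	scanner := bufio.NewScanner(upstream)
	for scanner.Scan() {
		fmt.Fprintf(w, "%s\n", scanner.Bytes())
		flusher.Flush() // push each chunk to the client immediately
	}
}

If synchronous cost estimation is what blocks the write loop, deferring it until the stream completes would also keep chunk delivery fluent.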

PostgreSQL not ready during Docker-up

The PostgreSQL container is not ready to accept connections when the BricksLLM container tries to connect to it.

BricksLLM-Docker (main)
$ docker-compose up
time="2024-04-05T14:01:36+05:00" level=warning msg="The "OPENAI_API_KEY" variable is not set. Defaulting to a blank string."
[+] Running 22/22
✔ bricksllm 4 layers [⣿⣿⣿⣿] 0B/0B Pulled 12.3s
✔ 3c854c8cbf46 Already exists 0.0s
✔ 2d02c7b1f0cc Pull complete 3.0s
✔ 4f4fb700ef54 Pull complete 3.0s
✔ 677203612e8c Pull complete 8.0s
✔ redis 7 layers [⣿⣿⣿⣿⣿⣿⣿] 0B/0B Pulled 14.0s
✔ 4abcf2066143 Already exists 0.0s
✔ 2c3a1d240687 Pull complete 3.5s
✔ 643f361aa308 Pull complete 4.8s
✔ a693cf0e318c Pull complete 6.0s
✔ ade57efb0b22 Pull complete 9.1s
✔ 2fa2e1566407 Pull complete 8.8s
✔ 4464e0709769 Pull complete 10.0s
✔ postgresql 8 layers [⣿⣿⣿⣿⣿⣿⣿⣿] 0B/0B Pulled 17.7s
✔ 59bf1c3509f3 Pull complete 0.9s
✔ c50e01d57241 Pull complete 0.6s
✔ a0646b0f1ead Pull complete 0.7s
✔ 7433e5151e0c Pull complete 10.5s
✔ 8854018388d9 Pull complete 1.4s
✔ 8de463f7fd19 Pull complete 1.6s
✔ b39ee18abab9 Pull complete 2.2s
✔ 11d7473a0ff9 Pull complete 2.3s
[+] Running 6/6
✔ Network bricksllm-docker_default Created 0.1s
✔ Volume "bricksllm-docker_postgresql" Created 0.0s
✔ Volume "bricksllm-docker_redis" Created 0.0s
✔ Container bricksllm-docker-redis-1 Created 5.4s
✔ Container bricksllm-docker-postgresql-1 Created 5.6s
✔ Container bricksllm-docker-bricksllm-1 Created 0.3s
Attaching to bricksllm-1, postgresql-1, redis-1
redis-1 | 1:C 05 Apr 2024 09:02:00.647 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis-1 | 1:C 05 Apr 2024 09:02:00.647 # Redis version=6.2.14, bits=64, commit=00000000, modified=0, pid=1, just started
redis-1 | 1:C 05 Apr 2024 09:02:00.647 # Configuration loaded
redis-1 | 1:M 05 Apr 2024 09:02:00.648 # Server initialized
postgresql-1 | The files belonging to this database system will be owned by user "postgres".
postgresql-1 | This user must also own the server process.
postgresql-1 |
postgresql-1 | The database cluster will be initialized with locale "en_US.utf8".
postgresql-1 | The default database encoding has accordingly been set to "UTF8".
postgresql-1 | The default text search configuration will be set to "english".
postgresql-1 |
postgresql-1 | Data page checksums are disabled.
postgresql-1 |
postgresql-1 | fixing permissions on existing directory /var/lib/postgresql/data ... ok
postgresql-1 | creating subdirectories ... ok
postgresql-1 | selecting dynamic shared memory implementation ... posix
postgresql-1 | selecting default max_connections ... 100
postgresql-1 | selecting default shared_buffers ... 128MB
postgresql-1 | selecting default time zone ... UTC
postgresql-1 | creating configuration files ... ok
postgresql-1 | running bootstrap script ... ok
bricksllm-1 | [BRICKSLLM] FATAL | 2024-04-05T09:02:01Z | error creating custom providers table: dial tcp 172.x.x.x:5432: connect: connection refused
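One way to make startup robust against this race is to retry the initial connection with backoff instead of failing fatally. A minimal sketch using database/sql and the lib/pq driver (a hypothetical helper; adding a depends_on healthcheck to the compose file would be an alternative fix):

package storage

import (
	"database/sql"
	"time"

	_ "github.com/lib/pq" // Postgres driver
)

// connectWithRetry retries the initial Postgres connection with
// exponential backoff instead of dying on the first refused connection.
func connectWithRetry(dsn string, attempts int) (*sql.DB, error) {
	var err error
	for i := 0; i < attempts; i++ {
		db, openErr := sql.Open("postgres", dsn)
		if openErr != nil {
			err = openErr
		} else if err = db.Ping(); err == nil {
			return db, nil
		} else {
			db.Close()
		}
		time.Sleep(time.Second << i) // 1s, 2s, 4s, ...
	}
	return nil, err
}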

Add OpenAPI/Swagger documentation for the endpoints

Please consider adding OpenAPI documentation for the endpoints and making it available on, e.g., the /swagger path.

I'm happy to see frequent updates to this awesome project, but it can be a struggle to keep documentation and implementation in sync. For example, I can't get GET /api/events to work given the current documentation.
