bricks-cloud / bricksllm

🔒 Enterprise-grade API gateway that helps you monitor and impose cost or rate limits per API key. Get fine-grained access control and monitoring per user, application, or environment. Supports OpenAI, Azure OpenAI, Anthropic, vLLM, and open-source LLMs.

Home Page: https://trybricks.ai/

License: MIT License

Languages: Go 99.96%, Shell 0.04%
Topics: golang, llm, openai, ai, anthropic, azure, gpt, postgresql, rest-api, ycombinator

bricksllm's Issues

Authentication for Key Creation in BricksLLM API

Description:
I am using the BricksLLM API to create keys for my application. However, I am concerned about unauthorized key creation. Is there any way to add authentication to the key creation process?

Steps to Reproduce:

  1. Navigate to the key creation section of the BricksLLM API.
  2. Attempt to create a key without authentication.
  3. Observe that the key is created successfully.

Expected Behavior:
Unauthorized key creation should not be possible. There should be some form of authentication required to create a key.

Actual Behavior:
Unauthorized key creation is possible without any authentication.

Additional Information:
I am using the latest version of the BricksLLM API. I have not made any modifications to the API code.
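A minimal sketch of one way such a gate could look, assuming a net/http-style admin server and a shared admin secret supplied at startup (the middleware and header handling here are illustrative, not BricksLLM's actual mechanism):

package middleware

import (
	"crypto/subtle"
	"net/http"
)

// requireAdminKey rejects any request whose X-API-KEY header does not
// match the configured admin secret (hypothetical middleware).
func requireAdminKey(secret string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got := []byte(r.Header.Get("X-API-KEY"))
		if subtle.ConstantTimeCompare(got, []byte(secret)) != 1 {
			http.Error(w, `{"error":"unauthorized"}`, http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

Wrapping the key-management routes in a check like this would make step 2 of the reproduction above fail with a 401 instead of creating the key.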

Certain Bricks API keys can't pass authentication.

When I create Bricks API keys like 5xrdVPkTQMschs5CRLxYZuMu, I always get a 401 not authorized error:

$ curl -X PATCH -H "Content-Type: application/json" -H "X-API-KEY: [redacted]" localhost:8001/api/key-management/keys/b0070528-209a-4ed2-a788-c58aeef523ec -d '{
    "name": "testkey",
    "tags": ["testkey"],
    "key": "5xrdVPkTQMschs5CRLxYZuMu",
    "settingIds": ["a104afa5-4313-48cc-a75f-0fe57cec2877"],
    "costLimitInUsdOverTime": 0.1,
    "costLimitInUsdUnit": "d",
    "rateLimitOverTime": 30,
    "rateLimitUnit": "m",
    "shouldLogRequest": true,
    "shouldLogResponse": true
}'
{"name":"testkey","createdAt":1709286279,"updatedAt":1709377458,"tags":["testkey"],"keyId":"b0070528-209a-4ed2-a788-c58aeef523ec","revoked":false,"key":"ac480863ca46a254b576ac824c1e633fa38ac46582cd300e06b83084acc297ae","revokedReason":"","costLimitInUsd":0,"costLimitInUsdOverTime":0.1,"costLimitInUsdUnit":"d","rateLimitOverTime":30,"rateLimitUnit":"m","ttl":"","settingId":"","allowedPaths":null,"settingIds":["a104afa5-4313-48cc-a75f-0fe57cec2877"],"shouldLogRequest":true,"shouldLogResponse":true,"rotationEnabled":false}

$ curl -X POST http://localhost:8002/api/providers/openai/v1/chat/completions \
   -H "Authorization: Bearer 5xrdVPkTQMschs5CRLxYZuMu" \
   -H "Content-Type: application/json" \
   -d '{
          "model": "gpt-3.5-turbo",
          "messages": [
              {
                  "role": "system",
                  "content": "I'"'"'m testing, hoping to catch some error."
              }
          ]
      }'

{"error":{"code":"401","message":"[BricksLLM] not authorized","type":""}}

After some debugging, I found the sha256 hash of this key doesn't match the hash stored in my database:

$ echo -n 5xrdVPkTQMschs5CRLxYZuMu | sha256sum
106ff82a265e1bae932377d34ba7cc737793eaf98190df2be1637c16143f1816  -

bricksllm=> SELECT key FROM keys WHERE name = 'testkey';
                               key                                
------------------------------------------------------------------
 ac480863ca46a254b576ac824c1e633fa38ac46582cd300e06b83084acc297ae
(1 row)

The hash calculated by this line in the key-creation code is also 106ff...:

hash := encrypter.Encrypt(raw)

When I use keys like WsbjdNiM9CP2wukbZMjF, authentication passes and the hashes match; no problem.
I think there is a problem in the code that stores keys.
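The mismatch is easy to reproduce outside the gateway. A minimal standalone check, assuming (as the debugging above suggests) that the lookup hash is plain SHA-256 of the raw key:

package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

func main() {
	raw := "5xrdVPkTQMschs5CRLxYZuMu"
	sum := sha256.Sum256([]byte(raw))
	// Prints 106ff82a26..., matching the sha256sum output above but not
	// the stored ac480863ca... row, so a different value must have been
	// hashed when the key was written to Postgres.
	fmt.Println(hex.EncodeToString(sum[:]))
}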

Not able to make it work

BricksLLM was not able to detect the OpenAI API key I set with the export command; I had to update the YAML file manually.


After updating, I am still not able to use it. I get the following error:

{'error': {'message': '[BricksLLM] failed to parse openai error response', 'code': 500}}


Add support for more granular rate limit per API route

Is it possible to define rate limits at the granularity of the OpenAI API, i.e., different RPM/TPM for each model? The context is that we want to give 100 students access to the API at the same time through our Tier 3 token, so every key needs 35 RPM / 1600 TPM for gpt-3.5-turbo and 50 RPM / 5000 TPM for text-embedding-ada-002. If I understand the documentation correctly, we can currently only set the minimum of each limit?
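For illustration, a hypothetical shape for per-model limits attached to a single key (these names are sketches, not part of the current key schema):

package limits

// ModelLimit is a sketch of per-model rate limits for one API key;
// the field names here are hypothetical, not BricksLLM's schema.
type ModelLimit struct {
	Model string // e.g. "gpt-3.5-turbo" or "text-embedding-ada-002"
	RPM   int    // requests per minute allowed for this model
	TPM   int    // tokens per minute allowed for this model
}

// KeyLimits would be consulted per request, keyed on the request's
// "model" field, falling back to the key-wide limit when absent.
type KeyLimits struct {
	PerModel []ModelLimit
	Default  ModelLimit
}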

CORS preflight request (HTTP OPTIONS method) not routed

I have BetterChatGPT running on http://localhost:5173/. In this ChatGPT-like web application, the AI API calls are sent from the browser. This works when I use https://api.openai.com/v1/chat/completions as the API endpoint. However, it doesn't work when I use my BricksLLM instance ( https://ai.molgen.mpg.de/api/providers/openai/v1/chat/completions ) as the API endpoint.

The reason, IMO, is that the browser (Firefox 122.0, but surely others as well) doesn't generally allow scripts from one origin ( http://localhost:5173/ ) to access resources from another origin (the API endpoints) because of the same-origin policy, unless the server supports the CORS protocol and sends headers which explicitly allow its resources to be accessed by other origins.

In the case of simple GET, POST, and HEAD requests, these headers are sent during the normal client-server dialog. However, the API calls are not simple in this sense, for example because they include an Authorization header. In these cases, the browser crafts a so-called preflight request on its own and sends it to the server to validate the cross-origin use before sending the real request. The preflight request uses the HTTP OPTIONS method.

Here is a picture of the Firefox network log of a working API request toward the OpenAI API endpoint:

(screenshot: Firefox network log; OPTIONS preflight and POST both succeed)

And here is one towards the non-functional BricksLLM API endpoint:

(screenshot: Firefox network log; OPTIONS preflight fails)

The OPTIONS request gets a "404 Not Found" response, and the browser never even sends the POST request because of its same-origin policy and the failed CORS preflight check.

Please note that this explanation of the same-origin policy and CORS is my current understanding ( source ), but I'm not an expert in that area and may be wrong.

The difference can also be demonstrated with curl:

buczek@theinternet:~$ curl -i -X OPTIONS https://api.openai.com/v1/chat/completions -H "Authorization: Bearer <CENSORED>"
HTTP/2 200 
date: Fri, 09 Feb 2024 10:15:40 GMT
content-length: 0
access-control-allow-headers: 
access-control-allow-methods: GET, OPTIONS, POST
strict-transport-security: max-age=15724800; includeSubDomains
cf-cache-status: DYNAMIC
set-cookie: <CENSORED>; path=/; expires=Fri, 09-Feb-24 10:45:40 GMT; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
set-cookie: <CENSORED>; path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
server: cloudflare
cf-ray: <CENSORED>
alt-svc: h3=":443"; ma=86400

buczek@theinternet:~$ curl -i -X OPTIONS https://ai.molgen.mpg.de/api/providers/openai/v1/chat/completions -H "Authorization: Bearer <CENSORED>"
HTTP/1.1 404 Not Found
Server: nginx/1.25.3
Date: Fri, 09 Feb 2024 10:15:47 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 78
Connection: keep-alive
Strict-Transport-Security: max-age=31536000; includeSubDomains

{"error":{"code":"404","message":"[BricksLLM] route not supported","type":""}}
buczek@theinternet:~$ 

Maybe this can be fixed by just adding OPTIONS to the methods which are proxied to the provider.
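A minimal sketch of the other direction, answering the preflight at the gateway itself rather than proxying it (the header values here are permissive placeholders that would need tightening; this is not the project's actual code):

package middleware

import "net/http"

// corsPreflight answers OPTIONS preflight requests directly and adds the
// CORS headers browsers require before they will send the real request.
func corsPreflight(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h := w.Header()
		h.Set("Access-Control-Allow-Origin", "*") // placeholder: restrict in production
		h.Set("Access-Control-Allow-Methods", "GET, POST, OPTIONS")
		h.Set("Access-Control-Allow-Headers", "Authorization, Content-Type")
		if r.Method == http.MethodOptions {
			w.WriteHeader(http.StatusNoContent) // preflight ends here
			return
		}
		next.ServeHTTP(w, r) // real request continues to the proxy
	})
}

Either answering OPTIONS locally like this or forwarding it to the provider (which, as the curl output above shows, responds with 200) should satisfy the browser's preflight check.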

Rate limit refresh time is not consistent

Currently, rate limits refresh at the end of a time interval that starts at the key's creation time. The correct behavior is the following: if a key with a 1-request-per-minute rate limit is created at 55 seconds past the minute (UTC), its limit should refresh 5 seconds later, at the next minute boundary, instead of after a full 60 seconds.
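A minimal sketch of the boundary-aligned calculation (a hypothetical helper, not the project's current code):

package main

import (
	"fmt"
	"time"
)

// untilNextWindow returns how long to wait until the limit refreshes,
// aligned to fixed UTC window boundaries rather than the creation time.
func untilNextWindow(now time.Time, window time.Duration) time.Duration {
	return now.Truncate(window).Add(window).Sub(now)
}

func main() {
	created := time.Date(2024, 4, 5, 9, 0, 55, 0, time.UTC)
	fmt.Println(untilNextWindow(created, time.Minute)) // 5s, not 1m0s
}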

OAuth authentication

Currently, the BricksLLM proxy only has basic key authentication. OAuth could help improve the security of the proxy endpoints.

Investigate issues related to the streaming mode

I noticed that [streaming mode](https://platform.openai.com/docs/api-reference/streaming) for chat completions is much less fluent through the proxy than with the normal OpenAI API. The normal API delivers many chunks per second (5-10, I'd guess) to the client, while the proxy seems to update the response only once per second, without regard to individual chunks. Would it be complicated to fix that?

(I only took a short look at the [implementation](https://github.com/bricks-cloud/BricksLLM/blob/325f1d88315411e75ac9aadf7c96b468b37eb66e/internal/server/web/proxy.go#L770-L826); maybe the buffer size is too large, or synchronous cost estimation takes too much time? Of course I don't have a deeper knowledge of your codebase, even though it appears nice to read :slight_smile: )
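For comparison, a minimal sketch of per-chunk forwarding, assuming a server-sent-events upstream and an http.Flusher-capable response writer (illustrative only, not the project's actual proxy code):

package proxy

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
)

// streamChunks forwards upstream SSE lines as they arrive, flushing after
// each one so the client sees every chunk instead of batched updates.
func streamChunks(w http.ResponseWriter, upstream io.Reader) {
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}
	scanner := bufio.NewScanner(upstream)
	for scanner.Scan() {
		fmt.Fprintf(w, "%s\n", scanner.Bytes())
		flusher.Flush() // push each chunk to the client immediately
	}
}

If synchronous cost estimation is what blocks the write loop, deferring it until the stream completes would also keep chunk delivery fluent.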

PostgreSQL not ready during Docker-up

The PostgreSQL container is not ready to accept connections when the BricksLLM container tries to connect to it.

BricksLLM-Docker (main)
$ docker-compose up
time="2024-04-05T14:01:36+05:00" level=warning msg="The "OPENAI_API_KEY" variable is not set. Defaulting to a blank string."
[+] Running 22/22
✔ bricksllm 4 layers [⣿⣿⣿⣿] 0B/0B Pulled 12.3s
✔ 3c854c8cbf46 Already exists 0.0s
✔ 2d02c7b1f0cc Pull complete 3.0s
✔ 4f4fb700ef54 Pull complete 3.0s
✔ 677203612e8c Pull complete 8.0s
✔ redis 7 layers [⣿⣿⣿⣿⣿⣿⣿] 0B/0B Pulled 14.0s
✔ 4abcf2066143 Already exists 0.0s
✔ 2c3a1d240687 Pull complete 3.5s
✔ 643f361aa308 Pull complete 4.8s
✔ a693cf0e318c Pull complete 6.0s
✔ ade57efb0b22 Pull complete 9.1s
✔ 2fa2e1566407 Pull complete 8.8s
✔ 4464e0709769 Pull complete 10.0s
✔ postgresql 8 layers [⣿⣿⣿⣿⣿⣿⣿⣿] 0B/0B Pulled 17.7s
✔ 59bf1c3509f3 Pull complete 0.9s
✔ c50e01d57241 Pull complete 0.6s
✔ a0646b0f1ead Pull complete 0.7s
✔ 7433e5151e0c Pull complete 10.5s
✔ 8854018388d9 Pull complete 1.4s
✔ 8de463f7fd19 Pull complete 1.6s
✔ b39ee18abab9 Pull complete 2.2s
✔ 11d7473a0ff9 Pull complete 2.3s
[+] Running 6/6
✔ Network bricksllm-docker_default Created 0.1s
✔ Volume "bricksllm-docker_postgresql" Created 0.0s
✔ Volume "bricksllm-docker_redis" Created 0.0s
✔ Container bricksllm-docker-redis-1 Created 5.4s
✔ Container bricksllm-docker-postgresql-1 Created 5.6s
✔ Container bricksllm-docker-bricksllm-1 Created 0.3s
Attaching to bricksllm-1, postgresql-1, redis-1
redis-1 | 1:C 05 Apr 2024 09:02:00.647 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis-1 | 1:C 05 Apr 2024 09:02:00.647 # Redis version=6.2.14, bits=64, commit=00000000, modified=0, pid=1, just started
redis-1 | 1:C 05 Apr 2024 09:02:00.647 # Configuration loaded
redis-1 | 1:M 05 Apr 2024 09:02:00.648 # Server initialized
postgresql-1 | The files belonging to this database system will be owned by user "postgres".
postgresql-1 | This user must also own the server process.
postgresql-1 |
postgresql-1 | The database cluster will be initialized with locale "en_US.utf8".
postgresql-1 | The default database encoding has accordingly been set to "UTF8".
postgresql-1 | The default text search configuration will be set to "english".
postgresql-1 |
postgresql-1 | Data page checksums are disabled.
postgresql-1 |
postgresql-1 | fixing permissions on existing directory /var/lib/postgresql/data ... ok
postgresql-1 | creating subdirectories ... ok
postgresql-1 | selecting dynamic shared memory implementation ... posix
postgresql-1 | selecting default max_connections ... 100
postgresql-1 | selecting default shared_buffers ... 128MB
postgresql-1 | selecting default time zone ... UTC
postgresql-1 | creating configuration files ... ok
postgresql-1 | running bootstrap script ... ok
bricksllm-1 | [BRICKSLLM] FATAL | 2024-04-05T09:02:01Z | error creating custom providers table: dial tcp 172.x.x.x:5432: connect: connection refused
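One way to make startup robust against this race is to retry the initial connection with backoff instead of failing fatally. A minimal sketch using database/sql and the lib/pq driver (a hypothetical helper; adding a depends_on healthcheck to the compose file would be an alternative fix):

package storage

import (
	"database/sql"
	"time"

	_ "github.com/lib/pq" // Postgres driver
)

// connectWithRetry retries the initial Postgres connection with
// exponential backoff instead of dying on the first refused connection.
func connectWithRetry(dsn string, attempts int) (*sql.DB, error) {
	var err error
	for i := 0; i < attempts; i++ {
		db, openErr := sql.Open("postgres", dsn)
		if openErr != nil {
			err = openErr
		} else if err = db.Ping(); err == nil {
			return db, nil
		} else {
			db.Close()
		}
		time.Sleep(time.Second << i) // 1s, 2s, 4s, ...
	}
	return nil, err
}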

Add OpenAPI/Swagger documentation for the endpoints

Please consider adding OpenAPI documentation for the endpoints and making it available on, e.g., the /swagger path.

I'm happy to see frequent updates to this awesome project, but it can be a struggle to keep documentation and implementation in sync. For example, I can't get GET /api/events to work given the current documentation.
