Comments (3)
Hi @McPatate, I'm the maintainer of LiteLLM. We let you create a proxy server to call 100+ LLMs, and I think it can solve your problem (I'd love your feedback if it doesn't).
Try it here: https://docs.litellm.ai/docs/proxy_server
Using LiteLLM Proxy Server

```python
import openai

openai.api_base = "http://0.0.0.0:8000/"  # proxy url
print(openai.ChatCompletion.create(model="test", messages=[{"role": "user", "content": "Hey!"}]))
```
Creating a proxy server

Ollama models

```shell
$ litellm --model ollama/llama2 --api_base http://localhost:11434
```

Hugging Face models

```shell
$ export HUGGINGFACE_API_KEY=my-api-key  # [OPTIONAL]
$ litellm --model huggingface/bigcode/starcoder
```

Anthropic

```shell
$ export ANTHROPIC_API_KEY=my-api-key
$ litellm --model claude-instant-1
```

PaLM

```shell
$ export PALM_API_KEY=my-palm-key
$ litellm --model palm/chat-bison
```
from llm-ls.
If it were a rust crate why not, but I'm not adding a proxy to the project. This adds a dependency on Python for users and I don't like the extra process.
I'm not all that familiar with Rust, but when the request is built in `request_completion`, would it be reasonable to use a dynamic property name?
```rust
async fn request_completion(
    http_client: &reqwest::Client,
    ide: Ide,
    model: &str,
    request_params: RequestParams,
    api_token: Option<&String>,
    prompt: String,
    inputs_key: String,
    request_options: HashMap<String, String>,
) -> Result<Vec<Generation>> {
    // Start from the caller-supplied options, then add the prompt
    // under the configurable key, plus the serialized parameters
    let mut body: HashMap<String, serde_json::Value> = request_options
        .into_iter()
        .map(|(k, v)| (k, serde_json::Value::String(v)))
        .collect();
    body.insert(inputs_key, serde_json::Value::String(prompt));
    body.insert("parameters".to_owned(), request_params.into());
    let res = http_client
        .post(build_url(model))
        .json(&body)
        .headers(build_headers(api_token, ide)?)
        .send()
        .await
        .map_err(internal_error)?;
    // ...
}
```
Just an example, but we could add `inputs_key` and `request_options` to `CompletionParams`. To get this working for Ollama, a user could give `inputs_key` as `"prompt"` and `request_options` as `{ "model": "ollama:7b-code" }`.
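To make the idea concrete, here is a minimal, self-contained sketch of just the body-construction step with the Ollama configuration above. `build_body` is a hypothetical helper mirroring the map-building lines in `request_completion`; the prompt value is an arbitrary example.

```rust
use std::collections::HashMap;

// Hypothetical helper mirroring the body construction in the
// `request_completion` sketch: caller-supplied options first,
// then the prompt under the configurable key.
fn build_body(
    inputs_key: String,
    request_options: HashMap<String, String>,
    prompt: String,
) -> HashMap<String, String> {
    let mut body = HashMap::new();
    body.extend(request_options);
    body.insert(inputs_key, prompt);
    body
}

fn main() {
    // Ollama-style configuration from the example above
    let mut opts = HashMap::new();
    opts.insert("model".to_owned(), "ollama:7b-code".to_owned());

    let body = build_body("prompt".to_owned(), opts, "fn main() {".to_owned());
    assert_eq!(body.get("prompt").map(String::as_str), Some("fn main() {"));
    assert_eq!(body.get("model").map(String::as_str), Some("ollama:7b-code"));
}
```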
Also, as an aside, I don't get why we wouldn't pass the params as a whole to the `request_completion` call. Would this be bad practice in Rust?
```rust
// Before
let result = request_completion(
    http_client,
    params.ide,
    params.model,
    params.request_params,
    params.api_token.as_ref(),
    prompt,
)

// After
let result = request_completion(
    http_client,
    params,
    prompt,
)
```
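Passing the whole struct is not bad practice in itself; if the function takes it by value, the struct is moved in and its fields can still be used individually. A minimal sketch with hypothetical types (the real `CompletionParams` has more fields):

```rust
// Hypothetical, simplified stand-in for the real CompletionParams
struct CompletionParams {
    model: String,
    api_token: Option<String>,
}

// Taking the struct by value moves ownership into the function;
// fields can then be borrowed or moved out individually inside.
fn request_completion(params: CompletionParams, prompt: String) -> String {
    format!(
        "{}:{}:{}",
        params.model,
        params.api_token.unwrap_or_default(),
        prompt
    )
}

fn main() {
    let params = CompletionParams {
        model: "ollama:7b-code".to_owned(),
        api_token: None,
    };
    // `params` is consumed here; it cannot be used again afterwards
    let out = request_completion(params, "fn main() {".to_owned());
    assert_eq!(out, "ollama:7b-code::fn main() {");
}
```

The trade-off is that the callee becomes coupled to the whole struct, and the caller gives up ownership of fields it might still need; that is a design choice rather than a language rule.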