
Comments (11)

JonahMMay commented on June 6, 2024

I think this may have something to do with LocalAI's chat and completion templates. I customized the plugin connection to remove functions and set the simple system prompt "You are my smart home assistant." Then I told it "Tell me a joke", to which it replied "Tell me a joke".

But if I build a similar query via curl, I get a proper response:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf",
    "messages": [
      {"role": "system", "content": "You are my smart home assistant."},
      {"role": "user", "content": " Tell me a joke."}
    ],
    "temperature": 0.7
  }'

{"created":1705162992,"object":"chat.completion","id":"99cd5afa-0999-47d5-910c-0473ccd1c0d5","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Sure, I can do that! Here's a joke for you: Why did the scarecrow win an award? Because he was outstanding in his field."}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
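For anyone who wants to reproduce the working request outside of curl, here is a minimal Python sketch using only the standard library; the endpoint and model name are taken from the curl command above, and the helper names are my own:

```python
import json
import urllib.request

def build_payload(model, system_prompt, user_prompt, temperature=0.7):
    """Build the same chat-completion payload as the curl command above."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": temperature,
    }

def send(payload, url="http://localhost:8080/v1/chat/completions"):
    """POST the payload to a LocalAI instance and return the parsed JSON reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_payload(
    "thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf",
    "You are my smart home assistant.",
    "Tell me a joke.",
)
# reply = send(payload)  # uncomment when a LocalAI instance is running on :8080
```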

I started my LocalAI container in debug mode in an SSH window and watched the logs as they scrolled by. The big difference I see is this. For the HASS plugin:
5:01PM DBG Prompt (before templating): You are my smart home assistant.
Tell me a joke.
5:01PM DBG Prompt (after templating): You are my smart home assistant.
Tell me a joke.
5:01PM DBG Grammar: root-0-arguments-list-item ::= "{" space "\"domain\"" space ":" space string "," space "\"service\"" space ":" space string "," space "\"service_data\"" space ":" space root-0-arguments-list-item-service-data "}" space
root-0-arguments-list ::= "[" space (root-0-arguments-list-item ("," space root-0-arguments-list-item)*)? "]" space
root-0-arguments ::= "{" space "\"list\"" space ":" space root-0-arguments-list "}" space
root-0 ::= "{" space "\"arguments\"" space ":" space root-0-arguments "," space "\"function\"" space ":" space root-0-function "}" space
root-1-arguments ::= "{" space "\"message\"" space ":" space string "}" space
space ::= " "?
string ::= "\"" (
[^"\\] |
"\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
)* "\"" space
root-0-arguments-list-item-service-data ::= "{" space "\"entity_id\"" space ":" space string "}" space
root-1-function ::= "\"answer\""
root-0-function ::= "\"execute_services\""
root-1 ::= "{" space "\"arguments\"" space ":" space root-1-arguments "," space "\"function\"" space ":" space root-1-function "}" space
root ::= root-0 | root-1
5:01PM DBG Model already loaded in memory: luna-ai-llama2-uncensored.Q4_K_M.gguf
5:01PM DBG Model 'luna-ai-llama2-uncensored.Q4_K_M.gguf' already loaded
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr slot 0 is processing [task id: 14]
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr slot 0 : kv cache rm - [0, end)
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: prompt eval time = 74.89 ms / 15 tokens ( 4.99 ms per token, 200.30 tokens per second)
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: eval time = 743.44 ms / 30 runs ( 24.78 ms per token, 40.35 tokens per second)
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: total time = 818.33 ms
5:01PM DBG Function return: { "arguments": { "message": "Tell me a joke." },"function": "answer"} map[arguments:map[message:Tell me a joke.] function:answer]
5:01PM DBG nothing to do, computing a reply
5:01PM DBG Reply received from LLM: Tell me a joke.
5:01PM DBG Reply received from LLM(finetuned): Tell me a joke.
5:01PM DBG Response: {"created":1705162992,"object":"chat.completion","id":"99cd5afa-0999-47d5-910c-0473ccd1c0d5","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf","choices":[{"index":0,"message":{"role":"assistant","content":"Tell me a joke."}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

Versus the curl command:
4:55PM DBG Prompt (before templating): You are my smart home assistant.
You are you?
4:55PM DBG Template found, input modified to: Below is an instruction that describes a task. Write a response that appropriately completes the request.

Instruction:

You are my smart home assistant.
You are you?

Response:

4:55PM DBG Prompt (after templating): Below is an instruction that describes a task. Write a response that appropriately completes the request.

Instruction:

You are my smart home assistant.
You are you?

Response:

4:55PM DBG Model already loaded in memory: luna-ai-llama2-uncensored.Q4_K_M.gguf
4:55PM DBG Model 'luna-ai-llama2-uncensored.Q4_K_M.gguf' already loaded
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr slot 0 is processing [task id: 12]
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr slot 0 : kv cache rm - [0, end)
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: prompt eval time = 95.88 ms / 48 tokens ( 2.00 ms per token, 500.62 tokens per second)
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: eval time = 268.57 ms / 14 runs ( 19.18 ms per token, 52.13 tokens per second)
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: total time = 364.45 ms
4:55PM DBG Response: {"created":1705162992,"object":"chat.completion","id":"99cd5afa-0999-47d5-910c-0473ccd1c0d5","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"I am your smart home assistant. How can I assist you today?"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

I also converted the curl request to Invoke-WebRequest and ran it on the local PC, and it still worked fine, just to rule out some sort of issue with remotely accessing the model and files.
{"created":1705162992,"object":"chat.completion","id":"99cd5afa-0999-47d5-910c-0473ccd1c0d5","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Sure, here's a joke for you: Why did the tomato turn red? Because it saw the salad dressing!"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
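For context, LocalAI selects its prompt templates from the model's YAML config in the models directory; when no template is applied (as the "before templating" / "after templating" log lines above show for the plugin request), the raw messages go to the model as-is. A minimal sketch of explicit template wiring, where the file and template names are hypothetical and not taken from this setup:

```yaml
# models/luna.yaml -- hypothetical sketch; adjust names and paths to your setup
name: thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf
parameters:
  model: luna-ai-llama2-uncensored.Q4_K_M.gguf
  temperature: 0.7
template:
  chat: luna-chat              # expects models/luna-chat.tmpl
  completion: luna-completion  # expects models/luna-completion.tmpl
```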

from extended_openai_conversation.

jonny190 commented on June 6, 2024

I'm using Docker Compose for my LocalAI instance:

```yaml
version: '3.6'

services:
  api:
    image: quay.io/go-skynet/local-ai:master-cublas-cuda12
    build:
      context: .
      dockerfile: Dockerfile
    env_file:
      - .env
    volumes:
      - /mnt/nfs/appdata/localai/models:/models:cached
      - /mnt/nfs/appdata/localai/images/:/tmp/generated/images/
    command: ["/usr/bin/local-ai"]
    ports:
      - 8080:8080
```
Not sure how you're trying to run yours, but happy to help you troubleshoot.


JonahMMay commented on June 6, 2024

Awesome! I am headed out of town later today and won't be back until Friday, but I should have remote access to my systems if there's anything you'd like me to test or look at.


jekalmin commented on June 6, 2024

Thanks for reporting an issue.

Currently, there is an issue when using LocalAI (see #17 (comment)).

Let me try this too and see if something can be done to fix it.
(I failed to install LocalAI before, but let me try again!)


JonahMMay commented on June 6, 2024

I'm in a bit over my head here, but it might be related to this?

mudler/LocalAI#1187


jekalmin commented on June 6, 2024

@JonahMMay
Thanks for sharing that information.
Have you tried curl with functions added?

curl --location 'http://localhost:8080/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "model": "thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf",
    "messages": [
        {
            "role": "system",
            "content": "You are my smart home assistant."
        },
        {
            "role": "user",
            "content": "tell me a joke"
        }
    ],
    "functions": [
        {
            "name": "execute_services",
            "description": "Execute service of devices in Home Assistant.",
            "parameters": {
                "type": "object",
                "properties": {
                    "domain": {
                        "description": "The domain of the service.",
                        "type": "string"
                    },
                    "service": {
                        "description": "The service to be called",
                        "type": "string"
                    },
                    "service_data": {
                        "description": "The service data object to indicate what to control.",
                        "type": "object"
                    }
                },
                "required": [
                    "domain",
                    "service",
                    "service_data"
                ]
            }
        }
    ],
    "function_call": "auto",
    "temperature": 0.7
}'

Unfortunately, I still didn't get LocalAI to work :(


JonahMMay commented on June 6, 2024

If I change functions to [], it appears to work fine. Trying the full code in your comment gives:
{"error":{"code":500,"message":"Unrecognized schema: map[description:The service data object to indicate what to control. type:object]","type":""}}
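That 500 suggests LocalAI's JSON-schema-to-grammar conversion can't expand a property declared only as "type": "object" with no nested properties to build rules from. A sketch of the shape it can convert, where the entity_id field is illustrative rather than taken from the failing schema:

```json
"service_data": {
    "type": "object",
    "properties": {
        "entity_id": {
            "type": "string"
        }
    }
}
```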


jekalmin commented on June 6, 2024

Sorry to bother you.
Could you try this again?

curl --location 'http://localhost:8080/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "model": "thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf",
    "messages": [
        {
            "role": "system",
            "content": "You are my smart home assistant."
        },
        {
            "role": "user",
            "content": "turn on livingroom light"
        }
    ],
    "functions": [
        {
            "name": "execute_services",
            "description": "Execute service of devices in Home Assistant.",
            "parameters": {
                "type": "object",
                "properties": {
                    "domain": {
                        "description": "The domain of the service.",
                        "type": "string"
                    },
                    "service": {
                        "description": "The service to be called",
                        "type": "string"
                    },
                    "service_data": {
                        "type": "object",
                        "properties": {
                            "entity_id": {
                                "type": "array",
                                "items": {
                                    "type": "string",
                                    "description": "The entity_id retrieved from available devices. It must start with domain, followed by dot character."
                                }
                            }
                        }
                    }
                },
                "required": [
                    "domain",
                    "service",
                    "service_data"
                ]
            }
        }
    ],
    "function_call": "auto",
    "temperature": 0.7
}'


jekalmin commented on June 6, 2024

Never mind.
I just set up LocalAI successfully.


clambertus commented on June 6, 2024

I also have this problem, with the same results as @JonahMMay. If I remove the functions and function_call blocks from the example above, I get an expected response; otherwise the response is the same as my input.


ThePragmaticArt commented on June 6, 2024

Same issue on my end with OpenAI Extended + LocalAI, with the few models I've tried. 😞

Vanilla API requests are fine.

