Comments (11)
I think this may have something to do with LocalAI's chat and completion templates. I customized the plugin connection to remove functions and used the simple prompt template "You are my smart home assistant." Then I told it "Tell me a joke", to which it replied "Tell me a joke".
But if I build a similar query via curl, I get a proper response:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf",
    "messages": [
      {"role": "system", "content": "You are my smart home assistant."},
      {"role": "user", "content": " Tell me a joke."}
    ],
    "temperature": 0.7
  }'
{"created":1705162992,"object":"chat.completion","id":"99cd5afa-0999-47d5-910c-0473ccd1c0d5","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Sure, I can do that! Here's a joke for you: Why did the scarecrow win an award? Because he was outstanding in his field."}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
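For comparison, the same request can be issued from Python. A minimal sketch using only the standard library (endpoint and model name are taken from the curl above; it assumes the LocalAI server is running on localhost:8080):

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1/chat/completions"
MODEL = "thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf"

def build_payload(system: str, user: str, temperature: float = 0.7) -> dict:
    """Assemble the same chat-completion body the curl command sends."""
    return {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "temperature": temperature,
    }

def chat(system: str, user: str) -> str:
    """POST the payload and return the assistant's reply text."""
    req = urllib.request.Request(
        BASE_URL,
        data=json.dumps(build_payload(system, user)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("You are my smart home assistant.", "Tell me a joke."))
```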
I started my LocalAI container in debug mode in an SSH window and watched the logs as they came on screen. The big difference I see is this. For the HASS plug-in:
5:01PM DBG Prompt (before templating): You are my smart home assistant.
Tell me a joke.
5:01PM DBG Prompt (after templating): You are my smart home assistant.
Tell me a joke.
5:01PM DBG Grammar: root-0-arguments-list-item ::= "{" space "\"domain\"" space ":" space string "," space "\"service\"" space ":" space string "," space "\"service_data\"" space ":" space root-0-arguments-list-item-service-data "}" space
root-0-arguments-list ::= "[" space (root-0-arguments-list-item ("," space root-0-arguments-list-item)*)? "]" space
root-0-arguments ::= "{" space "\"list\"" space ":" space root-0-arguments-list "}" space
root-0 ::= "{" space "\"arguments\"" space ":" space root-0-arguments "," space "\"function\"" space ":" space root-0-function "}" space
root-1-arguments ::= "{" space "\"message\"" space ":" space string "}" space
space ::= " "?
string ::= "\"" (
  [^"\\] |
  "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
)* "\"" space
root-0-arguments-list-item-service-data ::= "{" space "\"entity_id\"" space ":" space string "}" space
root-1-function ::= "\"answer\""
root-0-function ::= "\"execute_services\""
root-1 ::= "{" space "\"arguments\"" space ":" space root-1-arguments "," space "\"function\"" space ":" space root-1-function "}" space
root ::= root-0 | root-1
5:01PM DBG Model already loaded in memory: luna-ai-llama2-uncensored.Q4_K_M.gguf
5:01PM DBG Model 'luna-ai-llama2-uncensored.Q4_K_M.gguf' already loaded
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr slot 0 is processing [task id: 14]
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr slot 0 : kv cache rm - [0, end)
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: prompt eval time = 74.89 ms / 15 tokens ( 4.99 ms per token, 200.30 tokens per second)
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: eval time = 743.44 ms / 30 runs ( 24.78 ms per token, 40.35 tokens per second)
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: total time = 818.33 ms
5:01PM DBG Function return: { "arguments": { "message": "Tell me a joke." },"function": "answer"} map[arguments:map[message:Tell me a joke.] function:answer]
5:01PM DBG nothing to do, computing a reply
5:01PM DBG Reply received from LLM: Tell me a joke.
5:01PM DBG Reply received from LLM(finetuned): Tell me a joke.
5:01PM DBG Response: {"created":1705162992,"object":"chat.completion","id":"99cd5afa-0999-47d5-910c-0473ccd1c0d5","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf","choices":[{"index":0,"message":{"role":"assistant","content":"Tell me a joke."}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
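The grammar in the plugin's request constrains the model to one of two JSON shapes: an `execute_services` call whose `arguments.list` holds service calls, or an `answer` call with `arguments.message`. A rough Python sketch of that constraint (my own illustration of what the grammar admits, not LocalAI's code):

```python
import json

def matches_grammar(text: str) -> bool:
    """Check whether a reply fits one of the two JSON shapes the grammar allows."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError:
        return False
    if not isinstance(obj, dict) or set(obj) != {"arguments", "function"}:
        return False
    args = obj["arguments"]
    if not isinstance(args, dict):
        return False
    if obj["function"] == "answer":
        # root-1: arguments is {"message": <string>}
        return set(args) == {"message"} and isinstance(args["message"], str)
    if obj["function"] == "execute_services":
        # root-0: arguments is {"list": [<domain/service/service_data items>]}
        if set(args) != {"list"} or not isinstance(args["list"], list):
            return False
        return all(
            set(item) == {"domain", "service", "service_data"}
            and set(item["service_data"]) == {"entity_id"}
            for item in args["list"]
        )
    return False

# The "Function return" logged above parses as the root-1 ("answer") shape:
logged_reply = '{ "arguments": { "message": "Tell me a joke." },"function": "answer"}'
```

The logged function return fits the `answer` shape, which is why the final reply is just the echoed message rather than an actual joke.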
Versus the curl command:
4:55PM DBG Prompt (before templating): You are my smart home assistant.
You are you?
4:55PM DBG Template found, input modified to: Below is an instruction that describes a task. Write a response that appropriately completes the request.
Instruction:
You are my smart home assistant.
You are you?
Response:
4:55PM DBG Prompt (after templating): Below is an instruction that describes a task. Write a response that appropriately completes the request.
Instruction:
You are my smart home assistant.
You are you?
Response:
4:55PM DBG Model already loaded in memory: luna-ai-llama2-uncensored.Q4_K_M.gguf
4:55PM DBG Model 'luna-ai-llama2-uncensored.Q4_K_M.gguf' already loaded
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr slot 0 is processing [task id: 12]
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr slot 0 : kv cache rm - [0, end)
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: prompt eval time = 95.88 ms / 48 tokens ( 2.00 ms per token, 500.62 tokens per second)
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: eval time = 268.57 ms / 14 runs ( 19.18 ms per token, 52.13 tokens per second)
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: total time = 364.45 ms
4:55PM DBG Response: {"created":1705162992,"object":"chat.completion","id":"99cd5afa-0999-47d5-910c-0473ccd1c0d5","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"I am your smart home assistant. How can I assist you today?"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
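The "Template found" line is the key difference: in the curl path, LocalAI wraps the prompt in an Alpaca-style instruction template before inference, while the plugin's function-calling path goes in untemplated. A sketch of the transformation the log shows (exact whitespace is my guess from the log output):

```python
# The template LocalAI applied in the curl path, reconstructed from the
# "after templating" log line (whitespace details may differ from the
# actual template file shipped with the model).
ALPACA_STYLE_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "Instruction:\n"
    "{prompt}\n"
    "Response:"
)

def apply_template(prompt: str) -> str:
    """Wrap a raw prompt the way the curl path does before inference."""
    return ALPACA_STYLE_TEMPLATE.format(prompt=prompt)
```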
I also converted the curl request to Invoke-WebRequest and ran it on the local PC, and it still worked fine, just to rule out some sort of issue with remotely accessing the model and files.
{"created":1705162992,"object":"chat.completion","id":"99cd5afa-0999-47d5-910c-0473ccd1c0d5","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Sure, here's a joke for you: Why did the tomato turn red? Because it saw the salad dressing!"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
from extended_openai_conversation.
I'm using docker compose for my LocalAI instance:
```yaml
version: '3.6'
services:
  api:
    image: quay.io/go-skynet/local-ai:master-cublas-cuda12
    build:
      context: .
      dockerfile: Dockerfile
    env_file:
      - .env
    volumes:
      - /mnt/nfs/appdata/localai/models:/models:cached
      - /mnt/nfs/appdata/localai/images/:/tmp/generated/images/
    command: ["/usr/bin/local-ai"]
    ports:
      - 8080:8080
```
Not sure how you're trying to run yours, but I'm happy to help you troubleshoot.
Awesome! I am headed out of town later today and won't be back until Friday, but I should have remote access to my systems if there's anything you'd like me to test or look at.
Thanks for reporting the issue.
Currently, there is a known issue when using LocalAI (see #17 (comment)).
Let me try this too and see if something can be done to fix it.
(I failed to install LocalAI before, but let me try again!)
I'm in a bit over my head here, but it might be related to this?
@JonahMMay
Thanks for sharing that information.
Have you tried curl with functions added?
curl --location 'http://localhost:8080/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf",
  "messages": [
    {
      "role": "system",
      "content": "You are my smart home assistant."
    },
    {
      "role": "user",
      "content": "tell me a joke"
    }
  ],
  "functions": [
    {
      "name": "execute_services",
      "description": "Execute service of devices in Home Assistant.",
      "parameters": {
        "type": "object",
        "properties": {
          "domain": {
            "description": "The domain of the service.",
            "type": "string"
          },
          "service": {
            "description": "The service to be called",
            "type": "string"
          },
          "service_data": {
            "description": "The service data object to indicate what to control.",
            "type": "object"
          }
        },
        "required": [
          "domain",
          "service",
          "service_data"
        ]
      }
    }
  ],
  "function_call": "auto",
  "temperature": 0.7
}'
Unfortunately, I still didn't get LocalAI to work :(
If I change functions to [], it appears to work fine. Trying the full command from your comment gives:
{"error":{"code":500,"message":"Unrecognized schema: map[description:The service data object to indicate what to control. type:object]","type":""}}
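The error seems to come from `service_data` being declared as a bare `"type": "object"` with no `properties`, which a JSON-schema-to-grammar converter has no concrete rule for. A hedged sketch of that kind of check (my own illustration, not LocalAI's actual converter; `find_unconvertible` is a hypothetical helper):

```python
def find_unconvertible(schema: dict, path: str = "$") -> list[str]:
    """Flag object-typed schema nodes that lack 'properties', which a
    schema-to-grammar converter cannot turn into a concrete rule."""
    bad = []
    if schema.get("type") == "object":
        props = schema.get("properties")
        if not props:
            bad.append(path)
        else:
            for name, sub in props.items():
                bad.extend(find_unconvertible(sub, f"{path}.{name}"))
    elif schema.get("type") == "array" and "items" in schema:
        bad.extend(find_unconvertible(schema["items"], f"{path}[]"))
    return bad

# The parameters block from the failing request above:
failing = {
    "type": "object",
    "properties": {
        "domain": {"type": "string"},
        "service": {"type": "string"},
        "service_data": {"type": "object"},  # no properties -> unrecognized
    },
}
```

Adding explicit properties under `service_data` (as in the revised schema with `entity_id`) would make every object node concrete and avoid the 500.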
Sorry to bother you.
Could you try this again?
curl --location 'http://localhost:8080/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf",
  "messages": [
    {
      "role": "system",
      "content": "You are my smart home assistant."
    },
    {
      "role": "user",
      "content": "turn on livingroom light"
    }
  ],
  "functions": [
    {
      "name": "execute_services",
      "description": "Execute service of devices in Home Assistant.",
      "parameters": {
        "type": "object",
        "properties": {
          "domain": {
            "description": "The domain of the service.",
            "type": "string"
          },
          "service": {
            "description": "The service to be called",
            "type": "string"
          },
          "service_data": {
            "type": "object",
            "properties": {
              "entity_id": {
                "type": "array",
                "items": {
                  "type": "string",
                  "description": "The entity_id retrieved from available devices. It must start with domain, followed by dot character."
                }
              }
            }
          }
        },
        "required": [
          "domain",
          "service",
          "service_data"
        ]
      }
    }
  ],
  "function_call": "auto",
  "temperature": 0.7
}'
Never mind.
I just set up LocalAI successfully.
I also have this problem, with the same results as @JonahMMay. If I remove the functions and function_call blocks from the example above, I get the expected response; otherwise the response is just my input echoed back.
Same issue on my end with Extended OpenAI + LocalAI across the few models I’ve tried. 😞
Vanilla API requests are fine.