
rizerphe / local-llm-function-calling

270 stars · 4 watchers · 27 forks · 167 KB

A tool for generating function arguments and choosing what function to call with local LLMs

Home Page: https://local-llm-function-calling.readthedocs.io/

License: MIT License

Languages: Python 100.00%

Topics: chatgpt-functions, huggingface-transformers, json-schema, llm, llm-inference, openai-function-call, openai-functions

local-llm-function-calling's People

Contributors: rizerphe

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

local-llm-function-calling's Issues

Could not install via pip

I tried with Python 3.11.4 and pip claimed it was not between 3.11 and 4.0. I am on Windows.
I then tried running the example locally, started installing the requirements, and hit an import error: NotRequired (I think it was in prompter) could not be imported from typing, though I found it in typing_extensions. Python can be a mess between versions sometimes. It looks like you are in the middle of making changes and that is why it can't be installed right now. Just wanted to give you a heads-up; I am really interested in trying this out.
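For context, NotRequired only landed in the standard-library typing module in Python 3.11; on 3.10 and earlier the usual pattern is a guarded import from typing_extensions. A sketch (not necessarily what prompter.py does):

import sys

# NotRequired was added to stdlib typing in Python 3.11; older
# interpreters need the typing_extensions backport instead.
if sys.version_info >= (3, 11):
    from typing import NotRequired, TypedDict
else:
    from typing_extensions import NotRequired, TypedDict

class WeatherArgs(TypedDict):
    location: str
    unit: NotRequired[str]  # hypothetical optional key, mirroring the weather example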

Function calling Issue

The function calling feature of local-llm-function-calling is currently not working as expected for me: it does not produce the intended results, or it throws an error.

LLM used: codellama-13b-instruct.Q4_K_M.gguf

Code:

function_call = generator.generate("What is the weather like today in Brooklyn?")
print(function_call)

Output:

{'name': 'get_current_weathe', 'parameters': '{\n "location": "Microsoft.Azure.Comm"\n}'}

(attached screenshot: function_calling_error)
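For comparison, here is a minimal end-to-end sketch built from the pieces shown in this page's other issues (the Generator.hf constructor and the weather schema); the top-level import is assumed from the README, and a gguf model like the one above would instead need the llama.cpp backend installed via the llama-cpp extra:

from local_llm_function_calling import Generator

# Weather schema reused from the docs example in a later issue
functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                    "maxLength": 20,
                },
            },
            "required": ["location"],
        },
    }
]

# Generator.hf is the Hugging Face constructor used in a later issue;
# "gpt2" is just a small model that downloads quickly.
generator = Generator.hf(functions, "gpt2")
function_call = generator.generate("What is the weather like today in Brooklyn?")
print(function_call)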

ERROR: No matching distribution found for local-llm-function-calling

Hey, I am getting the following error when installing this library with pip install local-llm-function-calling:

pip install local-llm-function-calling
Defaulting to user installation because normal site-packages is not writeable
ERROR: Ignored the following versions that require a different python version:
    0.1.0 Requires-Python >=3.11,<4.0; 0.1.1 Requires-Python >=3.11,<4.0;
    0.1.2 Requires-Python >=3.11,<4.0; 0.1.3 Requires-Python >=3.11,<4.0;
    0.1.4 Requires-Python >=3.11,<4.0
ERROR: Could not find a version that satisfies the requirement local-llm-function-calling (from versions: none)
ERROR: No matching distribution found for local-llm-function-calling
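As the output shows, every published release pins Requires-Python >=3.11,<4.0, so on an older interpreter pip sees no installable version at all, which produces exactly this message. A quick pre-flight check (a sketch):

import sys

# Every released version of local-llm-function-calling declares
# Requires-Python >=3.11,<4.0; anything older makes pip report
# "No matching distribution found".
if sys.version_info < (3, 11):
    raise SystemExit(
        "Python 3.11+ is required for local-llm-function-calling; "
        f"this interpreter is {sys.version.split()[0]}."
    )
print("Python version OK")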

About the parameter extraction?

Thank you for your open-source code; it's written very concisely and clearly. However, there's something I'm not quite clear about: how do you do parameter extraction? I saw a token loop, but I'm not sure how it works. Could you explain it? Thank you~
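In case it helps other readers: the loop visible in constrainer.py (see the traceback in a later issue on this page) walks the model's tokens from most to least likely and keeps the first one whose continuation can still be completed into schema-valid JSON, so the output is constrained token by token rather than parsed after the fact. A toy sketch of the idea, with a hypothetical digit constraint standing in for JsonSchemaConstraint:

def pick_next_token(sorted_tokens, prefix, constraint):
    """Greedy constrained decoding step: take the most likely token
    whose continuation still satisfies the constraint."""
    for token in sorted_tokens:  # sorted by model probability, descending
        fits, complete = constraint(prefix + token)
        if fits:
            return token, complete
    raise ValueError("no token satisfies the constraint")

# Toy constraint standing in for JsonSchemaConstraint: the text must be
# digits only; it is complete once it reaches exactly 3 digits.
def digit_constraint(text):
    ok = text.isdigit() and len(text) <= 3
    return ok, ok and len(text) == 3

print(pick_next_token(["a", "7", "x"], "4", digit_constraint))  # ('7', False)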

Error while installing

Using pip install local-llm-function-calling[llama-cpp], I got the following error:

ERROR: Failed building wheel for llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
(attached screenshot: function_calling.png)

Docs example doesn't seem correct

Apologies, as I am new to this package, but going through the docs, the examples don't seem correct:

"parameters": "{\n \"location\": \"{{{{{{{{{{{{{{{{{{{{\"\n}"

Note that the schema is:

functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                    "maxLength": 20,
                },
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
        },
    }
]

When I run this on llama-2-7b-chat.Q5_K_M.gguf, I get:

>>> generator.generate("What is the weather like today in Brooklyn?")
{'name': 'get_current_weather', 'parameters': '{\n    "location": "type"\n}'}

which again doesn't answer the question, though it is at least somewhat similar to the docs example.
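Worth noting: the constrained decoder can only guarantee that the output parses and satisfies the schema (the run of braces in the docs output appears to fill the maxLength: 20 bound exactly); it cannot guarantee that the value actually answers the question. A sketch of checking that property with the third-party jsonschema package:

import json
from jsonschema import validate  # pip install jsonschema

schema = {
    "type": "object",
    "properties": {
        "location": {"type": "string", "maxLength": 20},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["location"],
}

# Both reported outputs are schema-valid even though neither mentions Brooklyn:
for raw in ('{\n    "location": "type"\n}',
            '{\n "location": "{{{{{{{{{{{{{{{{{{{{"\n}'):
    args = json.loads(raw)
    validate(args, schema)  # raises jsonschema.ValidationError on mismatch
    print("valid:", args)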

Function Calling with LM Studio Server Model

Is there any way to use this feature with chat completions instead of the generator?

LM Studio offers a server so you can use your local models through a chat-completions API, exactly like OpenAI's.

This is what the server script looks like:

from openai import OpenAI

# Example: reuse your existing OpenAI setup, pointed at the local server
client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

completion = client.chat.completions.create(
  model="local-model", # this field is currently unused
  messages=[
    {"role": "system", "content": "Always answer in rhymes."},
    {"role": "user", "content": "Introduce yourself."}
  ],
  temperature=0.7,
)

print(completion.choices[0].message)

It would be very nice to have plug-and-play function calling with it.
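A caveat worth stating: this library constrains generation token by token, which needs direct access to the model's logits, so it cannot enforce its constraints over a remote chat-completions endpoint. A common prompt-level workaround (a sketch, not this library's API) is to describe the schema in the system message and parse the reply yourself, with no guarantee the output actually matches:

import json
from openai import OpenAI

# LM Studio's OpenAI-compatible local server, as in the snippet above
client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

# Hypothetical schema; nothing here forces the model to honor it.
schema = {
    "type": "object",
    "properties": {"location": {"type": "string"}},
    "required": ["location"],
}

completion = client.chat.completions.create(
    model="local-model",  # this field is currently unused by LM Studio
    messages=[
        {
            "role": "system",
            "content": "Reply only with JSON matching this schema: "
            + json.dumps(schema),
        },
        {"role": "user", "content": "What is the weather like today in Brooklyn?"},
    ],
    temperature=0.0,
)
print(json.loads(completion.choices[0].message.content))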

Problems when trying to use Llama-2 models for function calling

When I run the following command:

generator = Generator.hf(functions, "meta-llama/Llama-2-7b-chat-hf")

I get the following error:


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[11], line 1
----> 1 function_call = generator.generate("What is the weather like today in Delaware?")
      2 print(function_call)

File /local_llm_function_calling/generator.py:189, in Generator.generate(self, prompt, function_call, max_length, max_new_tokens, suffix)
    174 """Generate the function call
    175
    176 Args:
    (...)
    186     FunctionCall: The generated function call
    187 """
    188 function_name = self.choose_function(prompt, function_call, suffix)
--> 189 arguments = self.generate_arguments(
    190     prompt, function_name, max_new_tokens, max_length
    191 )
    192 return {"name": function_name, "parameters": arguments}

File /local_llm_function_calling/generator.py:157, in Generator.generate_arguments(self, prompt, function_call, max_length, max_new_tokens)
    147 prefix = self.prompter.prompt(prompt, self.functions, function_call)
    148 constraint = JsonSchemaConstraint(
    149     [
    150         function
    (...)
    155     ]  # type: ignore
    156 )
--> 157 generated = self.constrainer.generate(
    158     prefix,
    159     constraint,
    160     max_length,
    161     max_new_tokens,
    162 )
    163 validated = constraint.validate(generated)
    164 return generated[: validated.end_index] if validated.end_index else generated

File /local_llm_function_calling/constrainer.py:221, in Constrainer.generate(self, prefix, constraint, max_len, max_new_tokens)
    219 generation = self.model.start_generation(prefix)
    220 for _ in range(max_new_tokens) if max_new_tokens else count():
--> 221     if self.advance_generation(generation, constraint, max_len):
    222         break
    223 return generation.get_generated()

File /local_llm_function_calling/constrainer.py:191, in Constrainer.advance_generation(self, generation, constraint, max_len)
    173 def advance_generation(
    174     self,
    175     generation: Generation,
    176     constraint: Callable[[str], tuple[bool, bool]],
    177     max_len: int | None = None,
    178 ) -> bool:
    179     """Advance the generation by one token
    180
    181     Args:
    (...)
    189         bool: Whether the generation is complete
    190     """
--> 191     done, length = self.gen_next_token(generation, constraint)
    192     if done:
    193         return True

File /local_llm_function_calling/constrainer.py:163, in Constrainer.gen_next_token(self, generation, constraint)
    161 except SequenceTooLongError:
    162     return (True, 0)
--> 163 for token in sorted_tokens:
    164     generated = generation.get_generated(token)
    165     fit = constraint(generated)

File /local_llm_function_calling/model/huggingface.py:63, in HuggingfaceGeneration.get_sorted_tokens(self)
     54 def get_sorted_tokens(self) -> Iterator[int]:
     55     """Get the tokens sorted by probability
     56
     57     Raises:
    (...)
     61         The next of the most likely tokens
     62     """
---> 63     if self.inputs.shape[1] >= self.model.config.n_positions:
     64         raise SequenceTooLongError()
     65     gen_tokens = self.model.generate(
     66         input_ids=self.inputs,
     67         output_scores=True,
    (...)
     70         pad_token_id=self.tokenizer.eos_token_id,
     71     )

File /function_calling_env/lib/python3.11/site-packages/transformers/configuration_utils.py:262, in PretrainedConfig.__getattribute__(self, key)
    260 if key != "attribute_map" and key in super().__getattribute__("attribute_map"):
    261     key = super().__getattribute__("attribute_map")[key]
--> 262 return super().__getattribute__(key)

AttributeError: 'LlamaConfig' object has no attribute 'n_positions'

Has anyone been able to run this successfully with Llama-2 models? If so, did you run into this problem, and how did you fix it?
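The root cause appears to be a config attribute mismatch: GPT-2-style configs expose the context window as n_positions, while LlamaConfig calls it max_position_embeddings, so the check in huggingface.py fails. A sketch showing the difference (reading the value portably, not a patch for the library):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

# GPT-2-style configs name the context window "n_positions"; Llama
# configs name it "max_position_embeddings", hence the AttributeError
# when huggingface.py reads model.config.n_positions.
context_length = getattr(
    config, "n_positions", getattr(config, "max_position_embeddings", None)
)
print(context_length)  # 4096 for Llama-2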

Benchmark data

Are there any plans to provide benchmarks with top OSS models like Mistral 7B using this, as well as benchmarks against fine-tuned models?
