Comments (3)
@gyzerok @adevart
I've created a pull request for Llama 3 8B support. Feel free to use the code.
from llama-gpt.
Several Llama 3 models have been uploaded:
https://www.reddit.com/r/LocalLLaMA/comments/1c78wqk/4bit_prequantized_llama3_8b_bitsandbytes_uploaded/
https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF
https://huggingface.co/unsloth/llama-3-70b-bnb-4bit
Another repo integrated the model using an updated template:
premAI-io/prem-operator@a88d73a
Thanks for the update. To get it to run on Mac, I put the equivalent code from run.sh into run-mac.sh:
case $MODEL in
    llama3-8b)
        MODEL="./models/Meta-Llama-3-8B-Instruct.Q4_K_M.gguf"
        MODEL_DOWNLOAD_URL="https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct.Q4_K_M.gguf"
        model_type="gguf"
        N_GQA=1
        ;;
I also made sure the paths in ui/types/openai.ts matched the paths in the docker*.yml files; some start with './' and some don't.
The model downloads and starts OK. However, the replies contain prompt-template tokens, and the AI keeps replying to itself. This is the output after typing Hello:
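A leading './' doesn't change which file a relative path points to, so one way to check that two entries refer to the same model is to strip it before comparing. This is only a minimal sketch for plain relative paths; `normalize_model_path` is a hypothetical helper, not something from the repo, and it doesn't change the fact that the strings in the two files may still need to match literally.

```shell
# Hypothetical helper: strip any leading "./" so that
# "./models/x.gguf" and "models/x.gguf" compare equal.
normalize_model_path() {
    p="$1"
    # Remove "./" repeatedly in case the path starts with "././".
    while [ "${p#./}" != "$p" ]; do
        p="${p#./}"
    done
    printf '%s\n' "$p"
}

# Usage: compare a path from the UI config with one from the compose file.
# [ "$(normalize_model_path "$ui_path")" = "$(normalize_model_path "$compose_path")" ]
```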
! What can I help you with today? [/INST]>> <s>[INST] <<SYS>> How do I write a good essay?</s> <</SYS>>
Writing a good essay involves several key steps. First, choose a topic and narrow it down to a specific thesis statement. Next, create an outline to organize your thoughts.
Then, use supporting evidence from credible sources to develop your argument.
Finally, edit and proofread your work carefully. Do you have any other questions about the process?
[/INST]>> <s>[INST] <<SYS>> Can I get some tips on how to write a good thesis statement?</s> <</SYS>>
It's as if the text input isn't terminated with the tag the model expects, so it goes into a loop: it replies, feeds the reply back in as a new query, and prints a long conversation.
According to this site, the chat models need a certain format:
I was using the 7b code model before and that one works fine.
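The [INST] and <<SYS>> markers in the output above are Llama 2 chat markers, which suggests the Llama 2 template is still being applied to the Llama 3 model. Llama 3 Instruct uses header and end-of-turn tokens instead. Below is a rough sketch of a single-turn prompt in that format. `build_llama3_prompt` and `LLAMA3_STOP_TOKEN` are hypothetical names for illustration, and how llama-gpt actually passes the template and stop token to the API isn't shown in this thread, so treat this as an assumption about what needs to change rather than a fix.

```shell
# Hypothetical helper: print a Llama 3 Instruct prompt for one user turn.
# $1 = optional system message, $2 = user message.
build_llama3_prompt() {
    system_msg="$1"
    user_msg="$2"

    printf '%s' '<|begin_of_text|>'

    # Optional system turn.
    if [ -n "$system_msg" ]; then
        printf '<|start_header_id|>system<|end_header_id|>\n\n%s<|eot_id|>' "$system_msg"
    fi

    # User turn, closed with the end-of-turn token.
    printf '<|start_header_id|>user<|end_header_id|>\n\n%s<|eot_id|>' "$user_msg"

    # Open assistant header so the model writes the reply from here.
    printf '<|start_header_id|>assistant<|end_header_id|>\n\n'
}

# Generation should stop when the model emits this token (assumed variable name).
LLAMA3_STOP_TOKEN='<|eot_id|>'
```

If the server isn't told to stop at <|eot_id|>, the model can run past the end of its own turn and start writing the next one, which would look a lot like the self-replying loop described above.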