Generate Kubernetes resource YAML manifests from a text prompt
Gener8-Llama2 is a simple Kubernetes resource YAML generator based on Meta's Llama-2 model
Please make sure you have Python 3.8.x or a higher version installed
Request access to the Llama models here
You will receive an email with the URL to download the model, which we will use later.
Make sure you have both repos downloaded: llama and llama.cpp
First, download the llama-2-7b-chat model using the llama repo's download script.
$ cd llama/
$ /bin/bash ./download.sh
Enter the URL from email: https://download.llamameta.net/*?XXXXXXXXXXXXX
Enter the list of models to download without spaces (7B,13B,70B,7B-chat,13B-chat,70B-chat), or press Enter for all: 7B-chat
Now we have to convert the downloaded model to f16 format and quantize it to reduce its size.
- Build the llama.cpp project
$ cd llama.cpp
$ make
- First, activate a virtual environment and install all the requirements
$ python3 -m venv llama2
$ source llama2/bin/activate
$ python3 -m pip install -r requirements.txt
- Then convert the model to f16 format and quantize it
$ python3 convert.py --outfile models/7B-chat/ggml-model-f16.bin --outtype f16 ../../llama2/llama/llama-2-7b-chat --vocab-dir ../../llama2/llama
$ ./quantize ./models/7B-chat/ggml-model-f16.bin ./models/7B-chat/ggml-model-q4_0.bin q4_0
- Make sure you change the vocab_size in llama/llama-2-7b-chat/params.json to 32000
$ cat llama/llama-2-7b-chat/params.json
{"dim": 4096, "multiple_of": 256, "n_heads": 32, "n_layers": 32, "norm_eps": 1e-06, "vocab_size": 32000}
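If you would rather patch the file than edit it by hand, a small Python snippet like the one below does the job. It is only a convenience sketch, not part of the repo; the path assumes the layout used above.

import json

# Hypothetical helper: overwrite vocab_size in the downloaded model's params.json.
path = "llama/llama-2-7b-chat/params.json"
with open(path) as f:
    params = json.load(f)
params["vocab_size"] = 32000  # the conversion step expects the real vocabulary size here
with open(path, "w") as f:
    json.dump(params, f)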
Before proceeding further, please make sure you have set up the Llama-2 model using the steps given in the Prerequisites section
- Run the Python server
$ python app.py
* Serving Flask app 'app'
* Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on http://127.0.0.1:5000
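app.py in the repo is the source of truth, but to give a sense of what the server does, a minimal sketch along these lines would work, assuming the llama-cpp-python bindings, a /generate endpoint, and the quantized model path produced by the steps above (all of which are assumptions, not the repo's exact code):

# Minimal sketch of a Flask server wrapping the quantized Llama-2 model.
# Assumptions: llama-cpp-python is installed, the route is /generate, and the
# model sits at the path created by the quantization step above.
from flask import Flask, request, jsonify
from llama_cpp import Llama

app = Flask(__name__)
llm = Llama(model_path="llama.cpp/models/7B-chat/ggml-model-q4_0.bin")

@app.route("/generate", methods=["POST"])
def generate():
    description = request.json["prompt"]
    prompt = (
        "You are a Kubernetes expert. Generate a YAML manifest for the "
        f"following resource description:\n{description}\nYAML:"
    )
    output = llm(prompt, max_tokens=1024, temperature=0.2)
    return jsonify({"yaml": output["choices"][0]["text"]})

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=5000)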
- Use curl or the web app to send a query to the server (a sample curl query is shown after the web-app instructions below)
To query using the web app, open /PATH/TO/REPO/Gener8-Llama2/frontend/index.html in your browser and enter the description of the K8s resource you want to generate specs for
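To query from the command line, a request along these lines should work; the endpoint name and the "prompt" payload field are assumptions from the sketch above, so check app.py for the exact route and field names.
$ curl -X POST http://127.0.0.1:5000/generate -H "Content-Type: application/json" -d '{"prompt": "nginx deployment with 3 replicas"}'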
We love your input! We want to make contributing to this project as easy and transparent as possible, whether it's:
- Reporting a bug
- Discussing the current state of the code
- Submitting a fix
- Proposing new features