Gener8-Llama2

Generate Kubernetes resource YAML manifests from a text prompt

Gener8-Llama2 is a simple Kubernetes resource YAML generator based on Meta's Llama-2 model.

Architecture

[Architecture diagram]

Prerequisites

Make sure you have Python 3.8 or higher installed.
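The version requirement can be checked from Python itself. The following is a minimal sketch; the helper name `python_is_supported` is hypothetical and not part of this repository.

```python
import sys


def python_is_supported(version_info=sys.version_info, minimum=(3, 8)):
    """Return True when the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = sys.version.split()[0]
    if python_is_supported():
        print("Python version OK:", found)
    else:
        print("Python 3.8 or higher is required, found:", found)
```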

Requesting access to Llama Models

Request access to the Llama models through Meta's request form.

You will receive an email with the URL for downloading the model, which is used in a later step.

Setup Llama2 Model

Make sure you have both repositories cloned: llama and llama.cpp.

First, download the llama-2-7b-chat model using the download script in the llama repository.

$ cd llama/
$ /bin/bash ./download.sh
  Enter the URL from email: https://download.llamameta.net/*?XXXXXXXXXXXXX
  Enter the list of models to download without spaces (7B,13B,70B,7B-chat,13B-chat,70B-chat), or press Enter for all: 7B-chat

Converting and Quantizing Downloaded Model

Next, convert the downloaded model to f16 format and quantize it to reduce its size.
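To get a rough feel for how much quantization saves, the sketch below estimates file sizes from the parameter count. It rests on two assumptions that are not stated in this README: that the 7B model has about 6.74 billion parameters, and that q4_0 costs about 4.5 bits per weight (4-bit values in blocks of 32 that share one 16-bit scale). Real files differ somewhat because not every tensor is stored in the same format.

```python
# Rough size estimate only; both constants below are approximations.
PARAMS_7B = 6.74e9   # assumed parameter count of the 7B model
BITS_F16 = 16        # one 16-bit float per weight
BITS_Q4_0 = 4.5      # (32 weights * 4 bits + 16-bit scale) / 32 weights


def estimated_size_gb(n_params, bits_per_weight):
    """Return the approximate model size in gigabytes (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    print(f"f16 : {estimated_size_gb(PARAMS_7B, BITS_F16):.2f} GB")
    print(f"q4_0: {estimated_size_gb(PARAMS_7B, BITS_Q4_0):.2f} GB")
```

Under these assumptions the f16 file is around 13.5 GB and the q4_0 file around 3.8 GB. The steps are as follows.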

  1. Build the llama.cpp project

    $ cd llama.cpp
    $ make

  2. Activate a virtual environment and install the requirements

    $ python3 -m venv llama2
    $ source llama2/bin/activate
    $ python3 -m pip install -r requirements.txt

  3. Convert the model to f16 format and quantize it

    $ python3 convert.py --outfile models/7B-chat/ggml-model-f16.bin --outtype f16 ../../llama2/llama/llama-2-7b-chat --vocab-dir ../../llama2/llama
    $ ./quantize ./models/7B-chat/ggml-model-f16.bin ./models/7B-chat/ggml-model-q4_0.bin q4_0

  4. Make sure vocab_size in llama/llama-2-7b-chat/params.json is set to 32000

    $ cat llama/llama-2-7b-chat/params.json
    {"dim": 4096, "multiple_of": 256, "n_heads": 32, "n_layers": 32, "norm_eps": 1e-06, "vocab_size": 32000}
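The vocab_size check in step 4 can also be scripted. The sketch below is one possible way to do it, assuming params.json is a flat JSON object like the one shown above; the helper name `set_vocab_size` is hypothetical and not part of llama or llama.cpp.

```python
import json
from pathlib import Path


def set_vocab_size(params_path, vocab_size=32000):
    """Set "vocab_size" in a params.json file.

    Returns True if the file was rewritten, False if it already
    held the requested value.
    """
    path = Path(params_path)
    params = json.loads(path.read_text())
    if params.get("vocab_size") == vocab_size:
        return False
    params["vocab_size"] = vocab_size
    path.write_text(json.dumps(params))
    return True


if __name__ == "__main__":
    # Path taken from step 4; adjust it to where the model was downloaded.
    target = Path("llama/llama-2-7b-chat/params.json")
    if target.exists():
        print("updated" if set_vocab_size(target) else "already 32000")
    else:
        print("params.json not found at", target)
```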

Build

Before proceeding, make sure you have set up the Llama2 model using the steps in the Prerequisites section.

  1. Run the Python server

    $ python app.py
     * Serving Flask app 'app'
     * Debug mode: off
    WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
     * Running on http://127.0.0.1:5000

  2. Use curl or the web app to send a query to the server. To query using the web app, open /PATH/TO/REPO/Gener8-Llama2/frontend/index.html in your browser and enter a description of the Kubernetes resource you want to generate specs for.
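As an alternative to curl, a query could also be sent from Python. The sketch below is illustrative only: this README does not document the server's route or payload, so the `/generate` path and the `prompt` JSON field are hypothetical placeholders that must be replaced with whatever app.py actually defines. Only the host and port come from the server output above.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5000"  # address printed by the Flask server
ENDPOINT = "/generate"              # hypothetical route; check app.py
PROMPT_FIELD = "prompt"             # hypothetical JSON field; check app.py


def build_query_request(prompt, base_url=BASE_URL, endpoint=ENDPOINT):
    """Build (but do not send) a JSON POST request carrying the prompt."""
    body = json.dumps({PROMPT_FIELD: prompt}).encode("utf-8")
    return urllib.request.Request(
        base_url + endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_query(prompt, timeout=300):
    """Send the prompt to the running server and return the response text."""
    request = build_query_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server running, calling `send_query("a Deployment running nginx with 3 replicas")` would return the response body as text, assuming the route and field names match the server.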


Contributing

We love your input! We want to make contributing to this project as easy and transparent as possible, whether it's:

  • Reporting a bug
  • Discussing the current state of the code
  • Submitting a fix
  • Proposing new features

Contributors

prasadg193, rutu-k
