The h2o-llmstudio from ayalaramjee

Welcome to H2O LLM Studio, a framework and no-code GUI designed for
fine-tuning state-of-the-art large language models (LLMs).

With H2O LLM Studio, you can
Quickstart
What's New
Setup
Run H2O LLM Studio GUI
Run H2O LLM Studio GUI using Docker from a nightly build
Run H2O LLM Studio GUI by building your own Docker image
Run H2O LLM Studio with command line interface (CLI)
Data format and example data
Training your model
Example: Run on OASST data via CLI
Model checkpoints
Documentation
Contributing
License

With H2O LLM Studio, you can

easily and effectively fine-tune LLMs without the need for any coding experience.
use a graphic user interface (GUI) specially designed for large language models.
finetune any LLM using a large variety of hyperparameters.
use recent finetuning techniques such as Low-Rank Adaptation (LoRA) and 8-bit model training with a low memory footprint.
use Reinforcement Learning (RL) to finetune your model (experimental)
use advanced evaluation metrics to judge generated answers by the model.
track and compare your model performance visually. In addition, Neptune integration can be used.
chat with your model and get instant feedback on your model performance.
easily export your model to the Hugging Face Hub and share it with the community.

Quickstart

For questions, discussing, or just hanging out, come and join our Discord!

We offer several ways of getting started quickly.

Using CLI for fine-tuning LLMs:

What's New

PR 364 User secrets are now handled more securely and flexible. Support for handling secrets using the 'keyring' library was added. User settings are tried to be migrated automatically.
PR 328 RLHF is now a separate problem type. Note that starting a new RLHF experiment from an old experiment that used RLHF is no longer supported. To continue from a previous experiment, please start a new experiment and enter the settings from the previous experiment manually.
PR 308 Sequence to sequence models have been added as a new problem type.
PR 152 Add RLHF functionality for fine-tuning LLMs.
PR 132 Add 4bit training that allows training of larger LLM backbones with less GPU memory. See here for a comprehensive summary of this method.
PR 40 Added functionality for supporting nested conversations in data. A new parent_id_column can be selected for datasets to support tree-like structures in your conversational data. Additional augmentation settings have been added for this feature.

Please note that due to current rapid development we cannot guarantee full backwards compatibility of new functionality. We thus recommend to pin the version of the framework to the one you used for your experiments. For resetting, please delete/backup your data and output folders.

Setup

H2O LLM Studio requires a machine with Ubuntu 16.04+ and at least one recent Nvidia GPU with Nvidia drivers version >= 470.57.02. For larger models, we recommend at least 24GB of GPU memory.

For more information about installation prerequisites, see the Set up H2O LLM Studio guide in the documentation.

Recommended Install

The recommended way to install H2O LLM Studio is using pipenv with Python 3.10. To install Python 3.10 on Ubuntu 16.04+, execute the following commands:

System installs (Python 3.10)

sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install python3.10
sudo apt-get install python3.10-distutils
curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10

Installing NVIDIA Drivers (if required)

If deploying on a 'bare metal' machine running Ubuntu, one may need to install the required Nvidia drivers and CUDA. The following commands show how to retrieve the latest drivers for a machine running Ubuntu 20.04 as an example. One can update the following based on their OS.

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.4.3/local_installers/cuda-repo-ubuntu2004-11-4-local_11.4.3-470.82.01-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2004-11-4-local_11.4.3-470.82.01-1_amd64.deb
sudo apt-key add /var/cuda-repo-ubuntu2004-11-4-local/7fa2af80.pub
sudo apt-get -y update
sudo apt-get -y install cuda

Create virtual environment (pipenv)

The following command will create a virtual environment using pipenv and will install the dependencies using pipenv:

make setup

Using requirements.txt

If you wish to use conda or another virtual environment, you can also install the dependencies using the requirements.txt file:

pip install -r requirements.txt

Installing custom packages

If you need to install additional Python packages into your environment, you can do so using pip after activating your virtual environment via make shell. For example, to install flash-attention, you would use the following commands:

make shell
pip install flash-attn --no-build-isolation
pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary

Alternatively, you can also directly install via pipenv install package_name.

Run H2O LLM Studio GUI

You can start H2O LLM Studio using the following command:

make llmstudio

This command will start the H2O wave server and app. Navigate to http://localhost:10101/ (we recommend using Chrome) to access H2O LLM Studio and start fine-tuning your models!

If you are running H2O LLM Studio with a custom environment other than Pipenv, you need to start the app as follows:

H2O_WAVE_APP_ADDRESS=http://127.0.0.1:8756 \
H2O_WAVE_MAX_REQUEST_SIZE=25MB \
H2O_WAVE_NO_LOG=true \
H2O_WAVE_PRIVATE_DIR="/download/@output/download" \
wave run app

Run H2O LLM Studio GUI using Docker from a nightly build

Install Docker first by following instructions from NVIDIA Containers. H2O LLM Studio images are stored in the h2oai GCR vorvan container repository.

mkdir -p `pwd`/data
mkdir -p `pwd`/output
docker run \
    --runtime=nvidia \
    --shm-size=64g \
    --init \
    --rm \
    -u `id -u`:`id -g` \
    -p 10101:10101 \
    -v `pwd`/data:/workspace/data \
    -v `pwd`/output:/workspace/output \
    -v ~/.cache:/home/llmstudio/.cache \
    gcr.io/vorvan/h2oai/h2o-llmstudio:nightly

Navigate to http://localhost:10101/ (we recommend using Chrome) to access H2O LLM Studio and start fine-tuning your models!

(Note other helpful docker commands are docker ps and docker kill.)

Run H2O LLM Studio GUI by building your own Docker image

docker build -t h2o-llmstudio .
docker run \
    --runtime=nvidia \
    --shm-size=64g \
    --init \
    --rm \
    -u `id -u`:`id -g` \
    -p 10101:10101 \
    -v `pwd`/data:/workspace/data \
    -v `pwd`/output:/workspace/output \
    -v ~/.cache:/home/llmstudio/.cache \
    h2o-llmstudio

Run H2O LLM Studio with command line interface (CLI)

You can also use H2O LLM Studio with the command line interface (CLI) and specify the configuration file that contains all the experiment parameters. To finetune using H2O LLM Studio with CLI, activate the pipenv environment by running make shell, and then use the following command:

python train.py -C {path_to_config_file}

To run on multiple GPUs in DDP mode, run the following command:

bash distributed_train.sh {NR_OF_GPUS} -C {path_to_config_file}

By default, the framework will run on the first k GPUs. If you want to specify specific GPUs to run on, use the CUDA_VISIBLE_DEVICES environment variable before the command.

To start an interactive chat with your trained model, use the following command:

python prompt.py -e {experiment_name}

where experiment_name is the output folder of the experiment you want to chat with (see configuration). The interactive chat will also work with model that were finetuned using the UI.

To publish the model to Hugging Face, use the following command:

make shell 

python publish_to_hugging_face.py -p {path_to_experiment} -d {device} -a {api_key} -u {user_id} -m {model_name} -s {safe_serialization}

path_to_experiment is the output folder of the experiment. device is the target device for running the model, either 'cpu' or 'cuda:0'. Default is 'cuda:0'. api_key is the Hugging Face API Key. If user logged in, it can be omitted. user_id is the Hugging Face user ID. If user logged in, it can be omitted. model_name is the name of the model to be published on Hugging Face. It can be omitted. safe_serialization is a flag indicating whether safe serialization should be used. Default is True.

Data format and example data

For details on the data format required when importing your data or example data that you can use to try out H2O LLM Studio, see Data format in the H2O LLM Studio documentation.

Training your model

With H2O LLM Studio, training your large language model is easy and intuitive. First, upload your dataset and then start training your model. Start by creating an experiment. You can then monitor and manage your experiment, compare experiments, or push the model to Hugging Face to share it with the community.

Example: Run on OASST data via CLI

As an example, you can run an experiment on the OASST data via CLI. For instructions, see Run an experiment on the OASST data guide in the H2O LLM Studio documentation.

Model checkpoints

All open-source datasets and models are posted on H2O.ai's Hugging Face page and our H2OGPT repository.

Documentation

Detailed documentation and frequently asked questions (FAQs) for H2O LLM Studio can be found at https://docs.h2o.ai/h2o-llmstudio/. If you wish to contribute to the docs, navigate to the /documentation folder of this repo and refer to the README.md for more information.

Contributing

We are happy to accept contributions to the H2O LLM Studio project. Please refer to the CONTRIBUTING.md file for more information.

License

H2O LLM Studio is licensed under the Apache 2.0 license. Please see the LICENSE file for more information.

ayalaramjee / h2o-llmstudio Goto Github PK

h2o-llmstudio's Introduction

Welcome to H2O LLM Studio, a framework and no-code GUI designed for fine-tuning state-of-the-art large language models (LLMs).

Jump to