
zetavg / llama-lora-tuner


UI tool for fine-tuning and testing your own LoRA models based on LLaMA, GPT-J and more. One-click run on Google Colab. + A Gradio ChatGPT-like Chat UI to demonstrate your language models.

Python 92.76% Jupyter Notebook 2.95% JavaScript 1.80% CSS 2.50%
alpaca alpaca-lora llama lora machine-learning google-colab gpt-j gpt ai peft language-model

llama-lora-tuner's Introduction

🦙🎛️ LLaMA-LoRA Tuner

Open In Colab

Making evaluating and fine-tuning LLaMA models with low-rank adaptation (LoRA) easy.

Update:

On the dev branch, there's a new Chat UI and a new Demo Mode config as a simple and easy way to demonstrate new models.

However, the new version does not have the fine-tuning feature yet and is not backward compatible as it uses a new way to define how models are loaded, and also a new format of prompt templates (from LangChain).

For more info, see: #28.

LLM.Tuner.Chat.UI.in.Demo.Mode.mp4

Features

See a demo on Hugging Face. *The demo only serves the UI; to try training or text generation, run it on Colab.

  • 1-click up and running in Google Colab with a standard GPU runtime.
    • Loads and stores data in Google Drive.
  • Evaluate various LLaMA LoRA models stored in your folder or from Hugging Face.
  • Switch between base models such as decapoda-research/llama-7b-hf, nomic-ai/gpt4all-j, databricks/dolly-v2-7b, EleutherAI/gpt-j-6b, or EleutherAI/pythia-6.9b.
  • Fine-tune LLaMA models with different prompt templates and training dataset formats.

How to Start

There are various ways to run this app:

  • Run on Google Colab: The simplest way to get started; all you need is a Google account. The standard (free) GPU runtime is sufficient to run generation and training with a micro batch size of 8. However, text generation and training are much slower than on other cloud services, and Colab might terminate the execution on inactivity while running long tasks.
  • Run on a cloud service via SkyPilot: If you have a cloud service (Lambda Labs, GCP, AWS, or Azure) account, you can use SkyPilot to run the app on a cloud service. A cloud bucket can be mounted to preserve your data.
  • Run locally: Depends on the hardware you have.

Run On Google Colab

See video for step-by-step instructions.

Open this Colab Notebook and select Runtime > Run All (⌘/Ctrl+F9).

You will be prompted to authorize Google Drive access, as Google Drive will be used to store your data. See the "Config"/"Google Drive" section for settings and more info.

After approximately 5 minutes of running, you will see the public URL in the output of the "Launch"/"Start Gradio UI 🚀" section (like Running on public URL: https://xxxx.gradio.live). Open the URL in your browser to use the app.

Run on a cloud service via SkyPilot

After following the installation guide of SkyPilot, create a .yaml file to define a task for running the app:

# llm-tuner.yaml

resources:
  accelerators: A10:1  # 1x NVIDIA A10 GPU, about US$ 0.6 / hr on Lambda Cloud. Run `sky show-gpus` for supported GPU types, and `sky show-gpus [GPU_NAME]` for the detailed information of a GPU type.
  cloud: lambda  # Optional; if left out, SkyPilot will automatically pick the cheapest cloud.

file_mounts:
  # Mount a persisted cloud storage bucket that will be used as the data directory
  # (to store training datasets and trained models).
  # See https://skypilot.readthedocs.io/en/latest/reference/storage.html for details.
  /data:
    name: llm-tuner-data  # Make sure this name is unique or that you own this bucket. If it does not exist, SkyPilot will try to create a bucket with this name.
    store: s3  # Could be either of [s3, gcs]
    mode: MOUNT

# Clone the LLaMA-LoRA Tuner repo and install its dependencies.
setup: |
  conda create -q python=3.8 -n llm-tuner -y
  conda activate llm-tuner

  # Clone the LLaMA-LoRA Tuner repo and install its dependencies
  [ ! -d llm_tuner ] && git clone https://github.com/zetavg/LLaMA-LoRA-Tuner.git llm_tuner
  echo 'Installing dependencies...'
  pip install -r llm_tuner/requirements.lock.txt

  # Optional: install wandb to enable logging to Weights & Biases
  pip install wandb

  # Optional: patch bitsandbytes to workaround error "libbitsandbytes_cpu.so: undefined symbol: cget_col_row_stats"
  BITSANDBYTES_LOCATION="$(pip show bitsandbytes | grep 'Location' | awk '{print $2}')/bitsandbytes"
  [ -f "$BITSANDBYTES_LOCATION/libbitsandbytes_cpu.so" ] && [ ! -f "$BITSANDBYTES_LOCATION/libbitsandbytes_cpu.so.bak" ] && [ -f "$BITSANDBYTES_LOCATION/libbitsandbytes_cuda121.so" ] && echo 'Patching bitsandbytes for GPU support...' && mv "$BITSANDBYTES_LOCATION/libbitsandbytes_cpu.so" "$BITSANDBYTES_LOCATION/libbitsandbytes_cpu.so.bak" && cp "$BITSANDBYTES_LOCATION/libbitsandbytes_cuda121.so" "$BITSANDBYTES_LOCATION/libbitsandbytes_cpu.so"
  conda install -q cudatoolkit -y

  echo 'Dependencies installed.'

  # Optional: Install and setup Cloudflare Tunnel to expose the app to the internet with a custom domain name
  [ -f /data/secrets/cloudflared_tunnel_token.txt ] && echo "Installing Cloudflare" && curl -L --output cloudflared.deb https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64.deb && sudo dpkg -i cloudflared.deb && sudo cloudflared service uninstall || : && sudo cloudflared service install "$(cat /data/secrets/cloudflared_tunnel_token.txt | tr -d '\n')"

  # Optional: pre-download models
  echo "Pre-downloading base models so that you won't have to wait for long once the app is ready..."
  python llm_tuner/download_base_model.py --base_model_names='decapoda-research/llama-7b-hf,nomic-ai/gpt4all-j'

# Start the app. `hf_access_token`, `wandb_api_key` and `wandb_project` are optional.
run: |
  conda activate llm-tuner
  python llm_tuner/app.py \
    --data_dir='/data' \
    --hf_access_token="$([ -f /data/secrets/hf_access_token.txt ] && cat /data/secrets/hf_access_token.txt | tr -d '\n')" \
    --wandb_api_key="$([ -f /data/secrets/wandb_api_key.txt ] && cat /data/secrets/wandb_api_key.txt | tr -d '\n')" \
    --wandb_project='llm-tuner' \
    --timezone='Atlantic/Reykjavik' \
    --base_model='decapoda-research/llama-7b-hf' \
    --base_model_choices='decapoda-research/llama-7b-hf,nomic-ai/gpt4all-j,databricks/dolly-v2-7b' \
    --share

Then launch a cluster to run the task:

sky launch -c llm-tuner llm-tuner.yaml

-c ... is an optional flag to specify a cluster name. If not specified, SkyPilot will automatically generate one.

You will see the public URL of the app in the terminal. Open the URL in your browser to use the app.

Note that exiting sky launch will only exit log streaming and will not stop the task. You can use sky queue --skip-finished to see the status of running or pending tasks, sky logs <cluster_name> <job_id> to connect back to log streaming, and sky cancel <cluster_name> <job_id> to stop a task.

When you are done, run sky stop <cluster_name> to stop the cluster. To terminate a cluster instead, run sky down <cluster_name>.

Remember to stop or shut down the cluster when you are done to avoid incurring unexpected charges. Run sky cost-report to see the cost of your clusters.

Log into the cloud machine or mount the filesystem of the cloud machine on your local computer

To log into the cloud machine, run ssh <cluster_name>, such as ssh llm-tuner.

If you have sshfs installed on your local machine, you can mount the filesystem of the cloud machine on your local computer by running a command like the following:

mkdir -p /tmp/llm_tuner_server && umount /tmp/llm_tuner_server || : && sshfs llm-tuner:/ /tmp/llm_tuner_server

Run locally

Prepare environment with conda
conda create -y python=3.8 -n llm-tuner
conda activate llm-tuner
pip install -r requirements.lock.txt
python app.py --data_dir='./data' --base_model='decapoda-research/llama-7b-hf' --timezone='Atlantic/Reykjavik' --share

You will see the local and public URLs of the app in the terminal. Open the URL in your browser to use the app.

For more options, see python app.py --help.

UI development mode

To test the UI without loading the language model, use the --ui_dev_mode flag:

python app.py --data_dir='./data' --base_model='decapoda-research/llama-7b-hf' --share --ui_dev_mode

To use Gradio Auto-Reloading, a config.yaml file is required, since command-line arguments are not supported. There's a sample file to start with: cp config.yaml.sample config.yaml. Then, just run gradio app.py.

Usage

See video on YouTube.

Acknowledgements

TBC

llama-lora-tuner's People

Contributors

l0rinc, zetavg

llama-lora-tuner's Issues

QLora support

Are there plans to integrate QLoRA into this tuner? Does it require structural changes to support it?
https://github.com/artidoro/qlora

It's already great as is, but the 4-bit quantized models are significantly faster at inference.

Edit: apparently it's the opposite, they seem to be slower!
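
For context, QLoRA support would roughly mean loading the base model 4-bit quantized through bitsandbytes before attaching the LoRA adapters. A minimal sketch using the generic transformers/peft APIs (assuming recent library versions; this is not the tuner's actual code):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

# Load the base model with 4-bit NF4 quantization (QLoRA-style).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
# Prepare the quantized model for training, then attach LoRA adapters as usual.
model = prepare_model_for_kbit_training(model)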

I always run into these error messages when installing dependencies.

Running command git clone --filter=blob:none --quiet https://github.com/huggingface/peft.git /tmp/pip-install-x68u1j5h/peft_61c54bf9fa7f44718ba4c52407a011e8
fatal: unable to access 'https://github.com/huggingface/peft.git/': GnuTLS recv error (-110): The TLS connection was non-properly terminated.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet https://github.com/huggingface/peft.git /tmp/pip-install-x68u1j5h/peft_61c54bf9fa7f44718ba4c52407a011e8 did not run successfully.
│ exit code: 128
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet https://github.com/huggingface/peft.git /tmp/pip-install-x68u1j5h/peft_61c54bf9fa7f44718ba4c52407a011e8 did not run successfully.
│ exit code: 128
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

Running command git fetch -q https://github.com/huggingface/transformers.git 3f96e0b4e483c4c7d4ec9dcdc24b0b0cdf31ea5c
Running command git checkout -q 3f96e0b4e483c4c7d4ec9dcdc24b0b0cdf31ea5c
fatal: unable to access 'https://github.com/huggingface/transformers.git/': GnuTLS recv error (-110): The TLS connection was non-properly terminated.

It looks like a network problem, but I tried different computers and different network environments. It still has the same error messages.

Error when fine tuning: Size Mismatch

Hello, I am trying to fine-tune, starting with Alpaca 7B as the base model. I am getting an error message of "Size Mismatch" though.

Would anyone know where this comes from?

Here is the error:

RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
	size mismatch for base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
	size mismatch for base_model.model.model.layers.0.self_attn.q_proj.lora_B.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 8]).
	size mismatch for base_model.model.model.layers.0.self_attn.v_proj.lora_A.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
	size mismatch for base_model.model.model.layers.0.self_attn.v_proj.lora_B.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 8]).
	[... the same lora_A / lora_B size mismatch (16 vs. 8) is reported for q_proj and v_proj in every remaining layer, 1 through 31; log truncated ...]
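
The 16-vs-8 shapes suggest the LoRA checkpoint being loaded was trained with rank r=16, while the current fine-tuning run is configured with r=8 (lora_A has shape (r, hidden_size) and lora_B has shape (hidden_size, r)). A hedged illustration using PEFT's LoraConfig, where the parameter names are PEFT's rather than the UI's fields:

from peft import LoraConfig

# The rank r determines the adapter weight shapes, so it must match the
# checkpoint you are loading or resuming from.
config = LoraConfig(
    r=16,                                   # the checkpoint above was trained with r=16
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
# With hidden_size=4096 and r=16: lora_A is (16, 4096) and lora_B is (4096, 16),
# which matches the "from checkpoint" shapes reported in the error.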

Can't use secondary Google drive

drive.mount(google_drive_mount_path) gives "Error: credential propagation was unsuccessful" if I select a different account than the one used for Google Colab. The auth is successful; I even get an email saying "Google Drive for desktop was granted access to your Google Account".

I know you may not be able to fix this; I couldn't find a functioning workaround: googlecolab/colabtools#2732 (comment)

Run pretrained model

After successfully training the model, I want to run the trained model locally. How do I do this? I copied the folder with the trained model, but there is no config.json file there.

WandB charts lose data.

Everything was normal last night; then I woke up after it finished and some of the graphs only show 1 dot.

image

offload-between-cpu-and-gpu

image

Question 1: I got this error message when using my own dataset to fine-tune. What is the cause?

Question 2: Would it be an issue for disk/memory if I have too many LoRA models tuned and saved? How should I delete the previously trained models?

Does not work any longer on Google Colab or locally

I managed to run the code and the expected error turned up, but after running it a second time it just kept loading and nothing happened. I have tried now for an hour with the same result.

Regarding running it locally: it's possible to run it, but it simply doesn't work. Whatever the input, there is no output and not even an error message.

Anyone get this to install / run on anywhere but Colab?

Seems like lots of missing dependencies, e.g. "llamaconverter requires the protobuf". Installed it, but no go; lots of other issues.
Got inference and fine-tuning working on Colab with the fix in #29 (comment).

It would be great to run this locally, or in Docker, which would be even better
(I am looking into that with a different Python version (it states python=3.8, but that seems not to work with the Hugging Face stuff) / pip etc. -- so maybe that's the way to go).
This was the only thing with which I was actually able to achieve success fine-tuning an LLM -- oobabooga was always breaking; I never got a clean LoRA training session without it crapping out halfway through. So... this thing works!
But damn, I can't get it loaded on anything other than Google Colab -- what are your experiences?

Use finetuned model for inference programmatically

Hi and kudos for this awesome tool! 💯
I have finetuned a model on my own dataset and I can quantitatively assess its performance via the inference tab.
However, I'd prefer to have a script that allows me to use it locally.

Am I right to assume that the generate method in llama_lora/lib/inference.py can be used to load the model and use it for prediction? A snippet/notebook would be extremely helpful!

Many thanks! ❤️
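
A minimal sketch of what such a script might look like, assuming the tuner saved the adapter in the standard PEFT format (the paths and prompt template below are placeholders):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_name = "decapoda-research/llama-7b-hf"       # the base model used for fine-tuning
adapter_path = "./data/lora_models/my-finetuned-model"  # hypothetical adapter folder

# Load the base model and tokenizer, then attach the LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForCausalLM.from_pretrained(
    base_model_name, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_path)
model.eval()

# Build the prompt with the same template used during training, then generate.
prompt = "### Instruction:\nExplain LoRA in one sentence.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))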

LlamaTokenizer class issue

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. The tokenizer class you load from this checkpoint is 'LLaMATokenizer'. The class this function is called from is 'LlamaTokenizer'.

Hi! I'm running the LLM Tuner UI and ran into this issue, which has been solved in another issue: https://github.com/huggingface/transformers/issues/22222#issuecomment-1477171703. However, whenever I simply change the LlamaTokenizer name in tokenizer_config.json in the Hugging Face cache (~/.cache/huggingface/hub/models--decapoda-research--llama-7b-hf), other issues pop up whenever I run the app.

Loading checkpoint shards: 100%|████████████████████████████████| 33/33 [00:13<00:00,  2.52it/s]
Traceback (most recent call last):
  File "llm_tuner/app.py", line 147, in <module>
    fire.Fire(main)
  File "/opt/conda/envs/llm-tuner/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/opt/conda/envs/llm-tuner/lib/python3.8/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/opt/conda/envs/llm-tuner/lib/python3.8/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "llm_tuner/app.py", line 119, in main
    prepare_base_model(Config.default_base_model_name)
  File "/home/gcpuser/sky_workdir/llm_tuner/llama_lora/models.py", line 262, in prepare_base_model
    Global.new_base_model_that_is_ready_to_be_used = get_new_base_model(
  File "/home/gcpuser/sky_workdir/llm_tuner/llama_lora/models.py", line 80, in get_new_base_model
    tokenizer = get_tokenizer(base_model_name)
  File "/home/gcpuser/sky_workdir/llm_tuner/llama_lora/models.py", line 156, in get_tokenizer
    raise e
  File "/home/gcpuser/sky_workdir/llm_tuner/llama_lora/models.py", line 143, in get_tokenizer
    tokenizer = AutoTokenizer.from_pretrained(
  File "/opt/conda/envs/llm-tuner/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 700, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/opt/conda/envs/llm-tuner/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1811, in from_pretrained
    return cls._from_pretrained(
  File "/opt/conda/envs/llm-tuner/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1965, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/opt/conda/envs/llm-tuner/lib/python3.8/site-packages/transformers/models/llama/tokenization_llama_fast.py", line 89, in __init__
    super().__init__(
  File "/opt/conda/envs/llm-tuner/lib/python3.8/site-packages/transformers/tokenization_utils_fast.py", line 114, in __init__
    fast_tokenizer = convert_slow_tokenizer(slow_tokenizer)
  File "/opt/conda/envs/llm-tuner/lib/python3.8/site-packages/transformers/convert_slow_tokenizer.py", line 1288, in convert_slow_tokenizer
    return converter_class(transformer_tokenizer).converted()
  File "/opt/conda/envs/llm-tuner/lib/python3.8/site-packages/transformers/convert_slow_tokenizer.py", line 445, in __init__
    from .utils import sentencepiece_model_pb2 as model_pb2
  File "/opt/conda/envs/llm-tuner/lib/python3.8/site-packages/transformers/utils/sentencepiece_model_pb2.py", line 91, in <module>
    _descriptor.EnumValueDescriptor(
  File "/opt/conda/envs/llm-tuner/lib/python3.8/site-packages/google/protobuf/descriptor.py", line 796, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.

Any idea on how to tackle this so that the model and tokenizer will match properly? And any insight on if it will affect finetuning results if I didn't match up the classnames earlier?
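
For reference, the trailing TypeError ("Descriptors cannot not be created directly") is the generic incompatibility newer protobuf (4.x) raises against older generated _pb2 modules; independent of this project, the usual workarounds are pinning protobuf to 3.20.x (for example pip install 'protobuf==3.20.3') or setting PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python.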

Continue from checkpoint fails

Whenever I try to continue fine-tuning from a checkpoint, it starts training for a bit, then gets to the validation step (at least I think that's where it's failing) and gives an error like "checkpoint-xxxx not found in list". I'll update with the exact error when I run it again, but it does this consistently and I can't figure out what's going wrong.

TypeError: __init__() got an unexpected keyword argument 'llm_int8_skip_modules'

Hi. I'm trying to train locally with my RTX 3060 on Windows 10. Can somebody help me with this error?

I think I did these steps to get it working with CUDA:

python -m venv lora
.\lora\Scripts\activate

pip install -r requirements.lock.txt

pip install pynvml==11.0.0

pip uninstall bitsandbytes

pip install R:/llama/bitsandbytes-0.41.2.post2-py3-none-win_amd64.whl  (installed from my disk)

pip install torch torchvision torchaudio -f https://download.pytorch.org/whl/cu118/torch_stable.html

pip install --upgrade transformers torch

pip install bitsandbytes --upgrade

And this is the error:

(lora) R:\llama\lora>python app.py --data_dir="./data" --base_model='meta-llama/Llama-2-7b-chat-hf'
fatal: not a git repository (or any of the parent directories): .git
Cannot get git commit hash: Command '['git', 'rev-parse', 'HEAD']' returned non-zero exit status 128.
bin R:\llama\lora\lora\lib\site-packages\bitsandbytes\libbitsandbytes_cuda118.dll

GPU compute capability:  (8, 6)
GPU total number of SMs:  28
GPU total cores:  3584
GPU total memory: 12884901888 bytes (12288.00 MB) (12.00 GB)
CPU available memory: 52328894464 bytes (49904.72 MB) (48.74 GB)
Will keep 2 offloaded models in CPU RAM.

Loading checkpoint shards: 100%|████████████████████████████████| 2/2 [00:07<00:00,  3.58s/it]
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Loading base model meta-llama/Llama-2-7b-chat-hf...
Traceback (most recent call last):
  File "R:\llama\lora\llama_lora\ui\finetune\training.py", line 283, in training
    train_output = Global.finetune_train_fn(
  File "R:\llama\lora\llama_lora\lib\finetune.py", line 203, in train
    model = AutoModelForCausalLM.from_pretrained(
  File "R:\llama\lora\lora\lib\site-packages\transformers\models\auto\auto_factory.py", line 566, in from_pretrained
    return model_class.from_pretrained(
  File "R:\llama\lora\lora\lib\site-packages\transformers\modeling_utils.py", line 3236, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
TypeError: __init__() got an unexpected keyword argument 'llm_int8_skip_modules'

Error When Fine Tuning

Thank you for creating this. Inference works fine for me; however, when attempting to fine-tune using the Colab version, I get this error:

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'LLaMATokenizer'. 
The class this function is called from is 'LlamaTokenizer'.
trainable params: 4194304 || all params: 6742609920 || trainable%: 0.06220594176090199
RuntimeError: module compiled against API version 0x10 but this version of numpy is 0xf
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/transformers/utils/import_utils.py", line 1125, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/usr/local/lib/python3.9/dist-packages/transformers/trainer.py", line 68, in <module>
    from .data.data_collator import DataCollator, DataCollatorWithPadding, default_data_collator
  File "/usr/local/lib/python3.9/dist-packages/transformers/data/__init__.py", line 26, in <module>
    from .metrics import glue_compute_metrics, xnli_compute_metrics
  File "/usr/local/lib/python3.9/dist-packages/transformers/data/metrics/__init__.py", line 18, in <module>
    if is_sklearn_available():
  File "/usr/local/lib/python3.9/dist-packages/transformers/utils/import_utils.py", line 558, in is_sklearn_available
    return is_scipy_available() and importlib.util.find_spec("sklearn.metrics")
  File "/usr/lib/python3.9/importlib/util.py", line 94, in find_spec
    parent = __import__(parent_name, fromlist=['__path__'])
  File "/usr/local/lib/python3.9/dist-packages/sklearn/__init__.py", line 82, in <module>
    from .base import clone
  File "/usr/local/lib/python3.9/dist-packages/sklearn/base.py", line 17, in <module>
    from .utils import _IS_32BIT
  File "/usr/local/lib/python3.9/dist-packages/sklearn/utils/__init__.py", line 25, in <module>
    from .fixes import parse_version, threadpool_info
  File "/usr/local/lib/python3.9/dist-packages/sklearn/utils/fixes.py", line 19, in <module>
    import scipy.stats
  File "/usr/local/lib/python3.9/dist-packages/scipy/stats/__init__.py", line 485, in <module>
    from ._stats_py import *
  File "/usr/local/lib/python3.9/dist-packages/scipy/stats/_stats_py.py", line 37, in <module>
    from numpy.testing import suppress_warnings
  File "/usr/local/lib/python3.9/dist-packages/numpy/testing/__init__.py", line 10, in <module>
    from ._private.utils import *
  File "/usr/local/lib/python3.9/dist-packages/numpy/testing/_private/utils.py", line 23, in <module>
    import numpy.linalg.lapack_lite
ImportError: numpy.core.multiarray failed to import

Syntax error in code from local install

While installing locally on a Windows machine using WSL and conda, after installing the requirements and trying to run python app.py --help, I get a syntax error.

Traceback (most recent call last):
  File "app.py", line 9, in <module>
    from llama_lora.ui.main_page import main_page, get_page_title, main_page_custom_css
  File "/mnt/d/LLaMA-LoRA-Tuner/llama_lora/ui/main_page.py", line 5, in <module>
    from .inference_ui import inference_ui
  File "/mnt/d/LLaMA-LoRA-Tuner/llama_lora/ui/inference_ui.py", line 13, in <module>
    from ..lib.csv_logger import CSVLogger
  File "/mnt/d/LLaMA-LoRA-Tuner/llama_lora/lib/csv_logger.py", line 26
    def setup(
    ^
SyntaxError: duplicate argument 'components' in function definition

Type Issue on Google Colab

This commit appears to have broken Google Colab: db1ee85#diff-bb6ee86b2e23a2f846401f730c9969a0f4d7db42f8066a8be15b7f141eda416b

It appears to rely on Python 3.10 type features:

──────────────────────────────── Traceback (most recent call last) ────────────────────────────────
in <cell line: 2>:2

/content/llama_lora/llama_lora/ui/main_page.py:5 in <module>
    5 from .inference_ui import inference_ui

/content/llama_lora/llama_lora/ui/inference_ui.py:13 in <module>
    13 from ..lib.csv_logger import CSVLogger

/content/llama_lora/llama_lora/lib/csv_logger.py:10 in <module>
    10 class CSVLogger(FlaggingCallback):

/content/llama_lora/llama_lora/lib/csv_logger.py:29 in CSVLogger
    26 def setup(
    27     self,
    28     components: List[Any],
    29     flagging_dir: str | Path,
    30 ):
────────────────────────────────────────────────────────────────────────────────────────────────────
TypeError: unsupported operand type(s) for |: 'type' and 'type'
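
For reference, the X | Y annotation syntax (PEP 604) requires Python 3.10, while the environments used here (Colab and the python=3.8 conda env in this README) are older. A minimal sketch of a backwards-compatible annotation, not necessarily the patch that was actually applied:

from pathlib import Path
from typing import Any, List, Union

class CSVLogger:  # simplified stand-in for the real class
    def setup(
        self,
        components: List[Any],
        flagging_dir: Union[str, Path],  # works on Python 3.8/3.9, unlike "str | Path"
    ):
        self.components = components
        self.flagging_dir = flagging_dir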

Running locally without git and internet

Hi,
I have already downloaded the base model (33 files) and it's all in one folder. Can you please let me know what changes are needed to run this model locally without any internet? Currently it is trying to download the base model from git and failing.

Thanks

Executing tuned Model locally

I was able to tune the model with our data on Google Colab.
We would like to run it locally and build a REST API for other applications to use it. Is it possible to download the trained model, run it locally, and interact with the model through a direct Python call instead of using the UI? Can you point me to any sample code?
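
A rough sketch of one way to do that, assuming the adapter was saved in the standard PEFT format and loading it as in the earlier inference question (Flask, the paths, and the model names below are illustrative placeholders, not part of this repository):

import torch
from flask import Flask, jsonify, request
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "decapoda-research/llama-7b-hf"            # base model used for tuning
ADAPTER_PATH = "./data/lora_models/my-finetuned-model"  # hypothetical adapter folder

app = Flask(__name__)

# Load the base model, tokenizer, and LoRA adapter once at startup.
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, ADAPTER_PATH)
model.eval()

@app.route("/generate", methods=["POST"])
def generate():
    # Expects a JSON body like {"prompt": "..."} and returns the generated text.
    prompt = request.json["prompt"]
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=256)
    return jsonify({"output": tokenizer.decode(output_ids[0], skip_special_tokens=True)})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)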
