coco0201 / llama2-lora-fine-tuning

This project forked from git-cloner/llama2-lora-fine-tuning


LLaMA2 fine-tuning with DeepSpeed and LoRA

Home Page: https://gitclone.com/aiit/chat/

License: MIT License


llama2-lora-fine-tuning's Introduction

Fine-tuning LLaMA2-Chat with LoRA and DeepSpeed

Fine-tunes the Llama-2-7b-chat model on two P100 (16 GB) GPUs.

The data source uses the Alpaca format and consists of two splits: train and validation.
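The core idea of LoRA is to freeze the base weight matrix W and train only a low-rank update, so the effective weight becomes W + (alpha / r) * B @ A. A minimal numeric sketch of that update in plain Python (shapes are illustrative only; in the real model these adapters sit inside the transformer's projection layers):

```python
# Minimal LoRA update sketch: W' = W + (alpha / r) * B @ A
# Shapes here are tiny and illustrative; real LLaMA2 layers are much larger.

def matmul(B, A):
    """Multiply a (d x r) matrix by an (r x k) matrix (lists of lists)."""
    return [[sum(B[i][t] * A[t][j] for t in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def lora_merge(W, A, B, alpha, r):
    """Return W + (alpha / r) * B @ A, the LoRA-adapted weight."""
    scale = alpha / r
    BA = matmul(B, A)
    return [[W[i][j] + scale * BA[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight (2 x 2)
    A = [[1.0, 2.0]]               # trainable, r x k with r = 1
    B = [[3.0], [4.0]]             # trainable, d x r
    print(lora_merge(W, A, B, alpha=2, r=1))  # [[7.0, 12.0], [8.0, 17.0]]
```

Because only A and B (rank r, far fewer parameters than W) receive gradients, LoRA fine-tuning fits on modest GPUs like the P100s used here.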

1. GPU Requirements

16 GB of VRAM or more (P100, T4, or better), one or more cards.
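As a rough back-of-the-envelope check (weights only, ignoring activations, the KV cache, and LoRA/optimizer overhead), the 7B parameters fit comfortably in 16 GB at 8-bit and even more easily at 4-bit:

```python
# Rough VRAM estimate for loading a 7B-parameter model (weights only).
# Ignores activations, KV cache, and LoRA/optimizer overhead, so treat
# these numbers as lower bounds.

def weight_gib(n_params: float, bits: int) -> float:
    """Approximate weight memory in GiB at the given precision."""
    return n_params * bits / 8 / (1024 ** 3)

if __name__ == "__main__":
    n = 7e9
    print(f"8-bit: {weight_gib(n, 8):.1f} GiB")  # ~6.5 GiB
    print(f"4-bit: {weight_gib(n, 4):.1f} GiB")  # ~3.3 GiB
```

This is why the load_in_bits parameter below matters: 8-bit roughly doubles the weight footprint relative to 4-bit, trading memory for precision.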

2. Clone the Source Code

git clone https://github.com/git-cloner/llama2-lora-fine-tuning
cd llama2-lora-fine-tuning

3. Install Dependencies

# Create a virtual environment
conda create -n llama2 python=3.9 -y
conda activate llama2
# Install the dependencies hosted on github.com (these may take several retries to succeed, so install them separately)
export GIT_TRACE=1
export GIT_CURL_VERBOSE=1
pip install git+https://github.com/PanQiWei/AutoGPTQ.git -i https://pypi.mirrors.ustc.edu.cn/simple --trusted-host=pypi.mirrors.ustc.edu.cn
pip install git+https://github.com/huggingface/peft -i https://pypi.mirrors.ustc.edu.cn/simple
pip install git+https://github.com/huggingface/transformers -i https://pypi.mirrors.ustc.edu.cn/simple
# Install the remaining dependencies
pip install -r requirements.txt -i https://pypi.mirrors.ustc.edu.cn/simple
# Verify bitsandbytes
python -m bitsandbytes

4. Download the Base Model

python model_download.py --repo_id daryl149/llama-2-7b-chat-hf

5、扩充中文词表

# 使用了https://github.com/ymcui/Chinese-LLaMA-Alpaca.git的方法扩充中文词表
# 扩充完的词表在merged_tokenizes_sp(全精度)和merged_tokenizer_hf(半精度)
# 在微调时,将使用--tokenizer_name ./merged_tokenizer_hf参数
python merge_tokenizers.py \
  --llama_tokenizer_dir ./models/daryl149/llama-2-7b-chat-hf \
  --chinese_sp_model_file ./chinese_sp.model
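Conceptually, the merge appends any tokens from the Chinese SentencePiece model that the base LLaMA vocabulary does not already contain. A toy sketch of that dedup-and-append logic (plain Python; the actual merge_tokenizers.py operates on SentencePiece model protos, not string lists):

```python
# Toy sketch of vocabulary merging: append tokens from the Chinese
# SentencePiece model that the base LLaMA vocabulary lacks, preserving order.
# The real merge_tokenizers.py works on SentencePiece model protobufs.

def merge_vocabs(base_tokens, new_tokens):
    """Extend base_tokens with unseen entries from new_tokens, in order."""
    seen = set(base_tokens)
    merged = list(base_tokens)
    for tok in new_tokens:
        if tok not in seen:
            seen.add(tok)
            merged.append(tok)
    return merged

if __name__ == "__main__":
    base = ["<s>", "</s>", "hello", "中"]
    chinese = ["中", "文", "词", "表"]
    print(merge_vocabs(base, chinese))
    # ['<s>', '</s>', 'hello', '中', '文', '词', '表']
```

Preserving the original token order matters because the base model's embedding rows are indexed by token id; new tokens get fresh ids appended at the end.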

6. Fine-Tuning Parameters

The following parameters can be adjusted:

| Parameter | Description | Values |
| --- | --- | --- |
| load_in_bits | Model precision | 4 or 8; prefer the higher precision (8) if VRAM does not overflow |
| block_size | Maximum token length | 2048 preferred; if memory overflows, try 1024, 512, etc. |
| per_device_train_batch_size | Training batch size per GPU | As large as memory allows |
| per_device_eval_batch_size | Evaluation batch size per GPU | As large as memory allows |
| include | GPUs to use | e.g. two cards: localhost:1,2 (note: the ordering does not necessarily match what nvidia-smi shows) |
| num_train_epochs | Number of training epochs | At least 3 |
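When tuning per_device_train_batch_size, the quantity that actually governs training dynamics is the effective (global) batch size across all cards, times any gradient accumulation. A quick sketch of the arithmetic (the accumulation value is illustrative, not necessarily the script's setting):

```python
# Effective (global) batch size =
#   per-device batch * number of GPUs * gradient accumulation steps.
# The example values are illustrative, not defaults from finetune-lora.sh.

def effective_batch_size(per_device: int, n_gpus: int, grad_accum: int = 1) -> int:
    """Samples contributing to each optimizer step across all devices."""
    return per_device * n_gpus * grad_accum

if __name__ == "__main__":
    # e.g. batch 4 per card on the two P100s, with 8 accumulation steps
    print(effective_batch_size(4, 2, 8))  # 64
```

If a larger per_device_train_batch_size overflows VRAM, raising gradient accumulation achieves the same effective batch size at the cost of more steps per update.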

7. Fine-Tuning

chmod +x finetune-lora.sh
# Fine-tune
./finetune-lora.sh
# Fine-tune in the background (kill any existing run first)
pkill -9 -f finetune-lora
nohup ./finetune-lora.sh > train.log  2>&1 &
tail -f train.log

8. Testing

CUDA_VISIBLE_DEVICES=0 python generate.py \
    --base_model './models/daryl149/llama-2-7b-chat-hf' \
    --lora_weights 'output/checkpoint-2000' \
    --load_8bit # without this flag, the model is loaded in 4-bit

llama2-lora-fine-tuning's People

Contributors: little51
