fanconic / discopop

This project forked from luchris429/discopop

Code for Discovering Preference Optimization Algorithms with and for Large Language Models

License: MIT License

Discovering Preference Optimization Algorithms with and for Large Language Models

🤗 Model | 📚 Paper | 📝 Blog

Method

This repository contains the code for our paper "Discovering Preference Optimization Algorithms with and for Large Language Models".

The code for training is largely taken and adapted from huggingface/alignment-handbook.

Setup and Evolution

To run the code in this project, first, create a Python virtual environment using e.g. Conda:

conda create -n handbook python=3.10 && conda activate handbook

Next, install PyTorch v2.1.2 - the precise version is important for reproducibility! Since this is hardware-dependent, we direct you to the PyTorch Installation Page.

You can then install the remaining package dependencies as follows:

python -m pip install .

You will also need Flash Attention 2 installed, which can be done by running:

python -m pip install flash-attn==2.5.7 --no-build-isolation

Note: If your machine has less than 96GB of RAM and many CPU cores, reduce the MAX_JOBS environment variable, e.g. MAX_JOBS=4 pip install flash-attn==2.5.7 --no-build-isolation
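Because the exact versions matter for reproducibility, a quick check after installation can catch a mismatch early. The following is a minimal sketch and not part of this repository: the helper name check_pin is hypothetical, and the distribution names "torch" and "flash-attn" are assumed to be the names under which pip registers the two packages.

```python
from importlib import metadata

# Version pins taken from the setup instructions above.
# The distribution names are assumptions about how pip registers the packages.
EXPECTED = {"torch": "2.1.2", "flash-attn": "2.5.7"}


def check_pin(package: str, expected: str) -> str:
    """Return 'ok', 'missing', or 'mismatch: <found>' for one package."""
    try:
        found = metadata.version(package)
    except metadata.PackageNotFoundError:
        return "missing"
    # PyTorch wheels often carry a local suffix such as '+cu121'; ignore it.
    base = found.split("+", 1)[0]
    return "ok" if base == expected else f"mismatch: {found}"


if __name__ == "__main__":
    for name, version in EXPECTED.items():
        print(f"{name}=={version}: {check_pin(name, version)}")
```

Run it inside the activated handbook environment; any line that does not end in "ok" points to a package that should be reinstalled at the pinned version.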

Next, log into your Hugging Face and Wandb accounts as follows:

huggingface-cli login
wandb login

Next, install Git LFS so that you can push models to the Hugging Face Hub:

sudo apt-get install git-lfs

Then, install FastChat for MT-Bench as follows (in the same parent directory into which you cloned this repo, so that FastChat sits next to it):

cd ../
git clone https://github.com/lm-sys/FastChat.git
cd FastChat
pip install -e ".[model_worker,llm_judge]"

Make sure that FastChat loads the correct chat template for Zephyr-Gemma.

See this issue for the template.

To launch the evolution script:

python3 scripts/launch_evo.py --wandb

Evaluations

Chat Evals

For chat evaluations, you also need to install Alpaca Eval 2.0. Annoyingly, alpaca_eval requires openai>1.5.0 while MT-Bench requires openai==0.28, and the two versions are not compatible with each other. We therefore create a second conda environment that is a copy of the first:

conda create --name handbook_alpaca --clone handbook
conda activate handbook_alpaca

Subsequently we install alpaca_eval as follows:

pip install alpaca-eval
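The reason for keeping two environments can be made concrete with a small sketch. It is not part of this repository; the helper names parse_version and suits are hypothetical, and the version rules are simply the two constraints quoted above (openai==0.28 for MT-Bench, openai>1.5.0 for alpaca_eval).

```python
def parse_version(text: str) -> tuple:
    """Turn '1.12.0' into (1, 12, 0), padding missing parts with zeros."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'rc1'
        parts.append(int(piece))
    return tuple((parts + [0, 0, 0])[:3])


def suits(tool: str, openai_version: str) -> bool:
    """Check an installed openai version against the constraint of one tool."""
    version = parse_version(openai_version)
    if tool == "mt-bench":
        return version == (0, 28, 0)  # openai==0.28
    if tool == "alpaca_eval":
        return version > (1, 5, 0)  # openai>1.5.0
    raise ValueError(f"unknown tool: {tool}")


if __name__ == "__main__":
    for candidate in ("0.28", "1.5.0", "1.6.1"):
        print(candidate, suits("mt-bench", candidate), suits("alpaca_eval", candidate))
```

No single version satisfies both checks, which is why MT-Bench runs in handbook and alpaca_eval runs in handbook_alpaca.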

We have also created an extra folder in this repo named alpaca_eval, where we store all the model and API config files.

To run an MT-Bench evaluation of a model, use the following command:

conda activate handbook
python scripts/run_evaluations.py \
    --model-id <name_of_your_model> \
    --model-path <path_to_model_weights_or_HF> \
    --num-generations 1 \
    --mt-bench

To run an alpaca_eval evaluation of a model, use the following command:

conda activate handbook_alpaca
python scripts/run_evaluations.py \
    --model-id <name_of_your_model> \
    --num-generations 1 \
    --alpaca-eval \
    --alpaca-model <path_to_your_model_config>/configs.yaml \
    --alpaca-reference-model <path_to_ref_model_config>/configs.yaml \
    --alpaca-openai-configs <path_to_your_client_config>/openai_configs.yaml
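Since the two evaluations differ only in their environment and flags, they can be assembled programmatically. The sketch below is not part of this repository: the function names are hypothetical, the flags are copied from the two commands above, and using conda run to select the environment is an assumption (activating the environment by hand works equally well).

```python
import shlex

SCRIPT = "scripts/run_evaluations.py"


def mt_bench_command(model_id: str, model_path: str, num_generations: int = 1) -> list[str]:
    """Build the MT-Bench command, run inside the 'handbook' environment."""
    return [
        "conda", "run", "-n", "handbook", "python", SCRIPT,
        "--model-id", model_id,
        "--model-path", model_path,
        "--num-generations", str(num_generations),
        "--mt-bench",
    ]


def alpaca_eval_command(model_id: str, model_config: str, reference_config: str,
                        openai_config: str, num_generations: int = 1) -> list[str]:
    """Build the alpaca_eval command, run inside 'handbook_alpaca'."""
    return [
        "conda", "run", "-n", "handbook_alpaca", "python", SCRIPT,
        "--model-id", model_id,
        "--num-generations", str(num_generations),
        "--alpaca-eval",
        "--alpaca-model", model_config,
        "--alpaca-reference-model", reference_config,
        "--alpaca-openai-configs", openai_config,
    ]


if __name__ == "__main__":
    # Placeholder arguments; print the commands instead of executing them.
    print(shlex.join(mt_bench_command("my-model", "path/to/weights")))
    print(shlex.join(alpaca_eval_command(
        "my-model", "cfg/configs.yaml", "ref/configs.yaml", "cfg/openai_configs.yaml")))
```

The returned lists can be passed to subprocess.run unchanged, which avoids shell quoting issues with paths that contain spaces.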

TL;DR

To run training and evaluation, we have prepared bash scripts:

source scripts/train_tldr.sh
source scripts/eval_tldr.sh

IMDb

source scripts/train_eval_imdb.sh

Citation

@article{lu2024discopop,
  title={Discovering Preference Optimization Algorithms with and for Large Language Models},
  author={Lu, Chris and Holt, Samuel and Fanconi, Claudio and Chan, Alex J and Foerster, Jakob and van der Schaar, Mihaela and Lange, Robert Tjarko},
  journal={arXiv preprint arXiv:2406.08414},
  year={2024}
}

Contributors

luchris429, samholt, fanconic