CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment

arXiv: Abstract / PDF

📣 News

  • [24/Feb/2024] 🎉 Our paper has been accepted to LREC-COLING 2024 (The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation)!

✨ Abstract

Reinforcement learning from human feedback (RLHF) is a crucial technique in aligning large language models (LLMs) with human preferences, ensuring these LLMs behave in beneficial and comprehensible ways to users. However, a longstanding challenge in human alignment techniques based on reinforcement learning lies in their inherent complexity and difficulty in training. To address this challenge, we present a simple yet effective Contrastive Learning Framework for Human Alignment (CLHA) to align LLMs with human preferences directly. CLHA employs a novel rescoring strategy to evaluate the noise within the data by considering its inherent quality and dynamically adjusting the training process. Simultaneously, CLHA utilizes pairwise contrastive loss and adaptive supervised fine-tuning loss to adaptively modify the likelihood of generating responses, ensuring enhanced alignment with human preferences. Using advanced methods, CLHA surpasses other algorithms, showcasing superior performance in terms of reward model scores, automatic evaluations, and human assessments on the widely used “Helpful and Harmless” dataset.
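To give a feel for how a pairwise loss and a supervised fine-tuning term can be combined, here is a minimal sketch of a generic logistic pairwise ranking loss over sequence log-likelihoods. It is not the exact CLHA objective: the function names, the fixed `sft_weight`, and the use of plain floats instead of model outputs are all assumptions made for illustration. See the paper for the actual rescoring and adaptive weighting.

```python
import math


def pairwise_ranking_loss(logp_preferred: float, logp_rejected: float) -> float:
    """Generic logistic pairwise loss (illustrative, not the CLHA formula).

    Small when the preferred response is more likely than the rejected one.
    """
    margin = logp_preferred - logp_rejected
    # softplus(-margin) == -log(sigmoid(margin)), in a numerically stable form
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def sft_loss(logp_preferred: float) -> float:
    """Negative log-likelihood of the preferred response."""
    return -logp_preferred


def combined_loss(logp_preferred: float, logp_rejected: float,
                  sft_weight: float = 1.0) -> float:
    """Sum of both terms; sft_weight is a hypothetical fixed weight."""
    return (pairwise_ranking_loss(logp_preferred, logp_rejected)
            + sft_weight * sft_loss(logp_preferred))
```

With equal log-likelihoods the pairwise term equals log 2, and it shrinks as the preferred response becomes more likely than the rejected one.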

✨ The pipeline of CLHA

💪 Dataset

Data Preparation

We provide preprocessed data for training and testing, which can be obtained with the following steps:

  1. Download data.zip and unzip it.
  2. Place the unzipped data folder in the root directory of the project.

We also provide scripts for preprocessing the raw data. To build the data from scratch, follow these steps:

  1. Create a directory named data in the root directory of this project.
  2. Create a directory named raw_data inside the data directory (data/raw_data).
  3. Download the raw data from HH-RLHF, name the folder hhrlhf, and put it in the data/raw_data directory.
  4. Run the following commands to preprocess the data:
# For HH-RLHF
cd train/hh_preprocess_data
python step_1_process.py
python step_2_get_train_data.py
python step_3_get_test_data.py
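The same steps can also be driven from Python. The sketch below is a convenience wrapper, not part of the repository: `check_layout` and `run_preprocessing` are hypothetical helper names, and the only paths and script names used are the ones listed in the steps above.

```python
import subprocess
import sys
from pathlib import Path

# Script names as listed in the commands above, run in this order.
STEPS = ["step_1_process.py", "step_2_get_train_data.py", "step_3_get_test_data.py"]


def check_layout(project_root: Path) -> Path:
    """Create data/raw_data if needed and confirm the hhrlhf folder is present."""
    raw_dir = project_root / "data" / "raw_data"
    raw_dir.mkdir(parents=True, exist_ok=True)
    hh_dir = raw_dir / "hhrlhf"
    if not hh_dir.is_dir():
        raise FileNotFoundError(f"expected raw HH-RLHF data at {hh_dir}")
    return hh_dir


def run_preprocessing(project_root: Path) -> None:
    """Run the three preprocessing scripts from train/hh_preprocess_data."""
    check_layout(project_root)
    script_dir = project_root / "train" / "hh_preprocess_data"
    for step in STEPS:
        # check=True stops at the first failing step instead of continuing
        subprocess.run([sys.executable, step], cwd=script_dir, check=True)
```

Calling `run_preprocessing(Path("."))` from the project root fails early with a clear error if the hhrlhf folder has not been downloaded yet.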

💪 Usage

Train

We provide scripts for training the model. For example, you can run the following commands:

mkdir checkpoints
mkdir logs
mkdir rm
# Download the reward models into rm/ from:
#   https://huggingface.co/OpenAssistant/oasst-rm-2.1-pythia-1.4b-epoch-2.5
#   https://huggingface.co/OpenAssistant/oasst-rm-2-pythia-6.9b-epoch-1
cd train
# Train LLMs with HH-RLHF
./train_hh.sh [id_of_exp] hh_train_len2 2

The scripts can be easily modified to train LLMs with different datasets.
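The reward-model download mentioned in the training commands can be scripted with the `huggingface_hub` package. This is a sketch under assumptions: the repository IDs come from the URLs above, but the one-folder-per-model layout under rm/ and the helper names are hypothetical, so check train_hh.sh for the paths it actually expects.

```python
from pathlib import Path

# Repository IDs taken from the Hugging Face URLs in the training commands.
REWARD_MODELS = [
    "OpenAssistant/oasst-rm-2.1-pythia-1.4b-epoch-2.5",
    "OpenAssistant/oasst-rm-2-pythia-6.9b-epoch-1",
]


def local_dir_for(repo_id: str, root: str = "rm") -> Path:
    """Map a Hub repo ID to a folder under rm/ (assumed layout)."""
    return Path(root) / repo_id.split("/")[-1]


def download_reward_models(root: str = "rm") -> None:
    """Download both reward models; requires `pip install huggingface_hub`."""
    # Imported lazily so local_dir_for works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    for repo_id in REWARD_MODELS:
        target = local_dir_for(repo_id, root)
        target.mkdir(parents=True, exist_ok=True)
        snapshot_download(repo_id=repo_id, local_dir=str(target))
```

Run `download_reward_models()` from the project root, before `cd train`, so that the rm directory ends up next to checkpoints and logs.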

Test

The following command can be used to test the model:

# Test LLMs with HH-RLHF
cd eval_hh
./run_infer_main_dist.sh

Note: Before running, set id_of_exp and the ranking length used during training in run_infer_main_dist.sh.

🤝 Acknowledgements

This project was inspired by PRO [DAMO-ConvAI]. We appreciate the original work done by its authors.

🔓 Citation

If you find this work helpful, please cite our paper:

@article{fang2024clha,
  title={CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment},
  author={Fang, Feiteng and Zhu, Liang and Yang, Min and Feng, Xi and Hou, Jinchang and Zhao, Qixuan and Li, Chengming and Hu, Xiping and Xu, Ruifeng},
  journal={arXiv preprint arXiv:2403.16649},
  year={2024}
}
