tomekkorbak / pretraining-with-human-feedback

Code accompanying the paper Pretraining Language Models with Human Preferences

Home Page: https://arxiv.org/abs/2302.08582

License: MIT License

Python 100.00%
ai-alignment ai-safety decision-transformers gpt language-models pretraining reinforcement-learning rlhf

pretraining-with-human-feedback's Introduction

Pretraining Language Models with Human Preferences

This repo contains the code accompanying the paper Pretraining Language Models with Human Preferences. The codebase is built around Hugging Face Transformers' Trainer and contains implementations of five objectives for pretraining with human feedback (PHF) discussed in the paper, as well as callbacks and scripts for evaluating them.

PHF objectives can be implemented by annotating the training data with rewards and overriding Trainer.compute_loss to use them as an additional training signal. Rewards are provided by an instance of apo.scorers.Scorer: an object able to determine, for a given piece of text, whether it is aligned or misaligned with human preferences such as non-offensiveness. The scorer is also used for evaluating samples from PHF-trained LMs.
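
As a rough illustration of how per-token rewards can enter the loss, the sketch below contrasts plain MLE with a reward-weighted variant in pure Python. The function names, the exponentiated-reward weighting and the `beta` temperature are assumptions made for this example only; the repository's actual objectives live in apo/objectives.py and operate on PyTorch tensors inside Trainer.compute_loss.

```python
import math
from typing import Sequence


def mle_loss(token_logprobs: Sequence[float]) -> float:
    """Average negative log-likelihood over tokens (plain MLE)."""
    return -sum(token_logprobs) / len(token_logprobs)


def reward_weighted_loss(
    token_logprobs: Sequence[float],
    token_rewards: Sequence[float],
    beta: float = 1.0,  # hypothetical temperature, not a value taken from the repo
) -> float:
    """Negative log-likelihood with each token weighted by exp(reward / beta)."""
    if len(token_logprobs) != len(token_rewards):
        raise ValueError("need exactly one reward per token")
    weights = [math.exp(r / beta) for r in token_rewards]
    weighted = sum(w * lp for w, lp in zip(weights, token_logprobs))
    return -weighted / len(token_logprobs)


# Toy data: the reward is taken to be the negated misalignment score
# of the segment each token belongs to (an assumption for this sketch).
logprobs = [-1.0, -2.0, -0.5]
rewards = [0.0, -3.0, 0.0]  # the middle token comes from a misaligned segment
print(mle_loss(logprobs))
print(reward_weighted_loss(logprobs, rewards))
```

With all rewards equal to zero the two losses coincide; a negative reward shrinks the weight of the corresponding token, so the model is pushed less strongly towards imitating it.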

The codebase is built around the Hugging Face ecosystem and wandb (for monitoring and experiment management).

Quickstart

We assume Python 3.9+. To run the training script for MLE on the toxicity task, do:

pip install -r requirements.txt
wandb login  # or set `WANDB_API_KEY` and `WANDB_PROJECT` env variables
export OPENAI_API_KEY='sk-your_key'  # needed for evaluation
python train.py --task configs/toxicity/pretrain.yml --method configs/toxicity/mle.yml

Configuration

The train.py script requires paths to two config files: one for the task and one for the method. Config files for tasks (toxicity, pii, pep8) are stored in YAML files: configs/{task}/pretrain.yml (for pretraining experiments) and configs/{task}/finetuning.yml (for finetuning). Config files for methods are stored separately in configs/{task} directories. Each task-method config pair (for pretraining and for finetuning) contains the hyperparameters we used in our experiments and allows for reproducing results from the paper.

Individual parameters can be overridden from the command line using the --override argument. For instance:

python train.py --task configs/toxicity/pretrain.yml --method configs/toxicity/mle.yml --override training.per_device_train_batch_size=8
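
The dotted key in the override above addresses a nested entry of the config. How train.py parses it is not shown here, so the following is only a sketch of one way such a `key=value` string could be applied to a nested dictionary; `apply_override` and the example config values are hypothetical.

```python
import ast
from typing import Any, Dict


def apply_override(config: Dict[str, Any], override: str) -> Dict[str, Any]:
    """Set a nested key given as 'a.b.c=value', modifying the config in place."""
    dotted_key, _, raw_value = override.partition("=")
    if not dotted_key or not raw_value:
        raise ValueError(f"expected key=value, got {override!r}")
    try:
        value = ast.literal_eval(raw_value)  # numbers, booleans, lists, ...
    except (ValueError, SyntaxError):
        value = raw_value  # fall back to a plain string
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        node = node.setdefault(name, {})  # create intermediate levels if missing
    node[leaf] = value
    return config


# Illustrative config; the values are made up for this example.
config = {"training": {"per_device_train_batch_size": 16}}
apply_override(config, "training.per_device_train_batch_size=8")
print(config["training"]["per_device_train_batch_size"])  # 8
```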

Tasks

| Name | Config files | Training data | Scorer | Description |
| --- | --- | --- | --- | --- |
| Toxicity | configs/toxicity | tomekkorbak/pile-detoxify | DetoxifyToxicityScorer | Misalignment score is the probability of toxicity according to detoxify |
| PII | configs/pii | tomekkorbak/pile-pii-scrubadub | PIIScorer | Misalignment score is the number of PIIs (e.g. names, URLs) per character, according to scrubadub |
| PEP8 | configs/pep8 | kejian/codeparrot-train-more-filter-3.3b-cleaned | PEP8Scorer | Misalignment score is the number of PEP8 violations per character, according to pycodestyle |
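
To make the "per character" normalisation concrete, here is a small sketch of a PII-style score. The two regular expressions are a stand-in chosen for this example (the actual PIIScorer relies on scrubadub, which detects many more kinds of PII), and `pii_per_character` is a hypothetical name.

```python
import re

# Stand-in detector: matches e-mail addresses and http(s) URLs only.
# This is an assumption for illustration, not what scrubadub does.
_PII_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+|https?://\S+")


def pii_per_character(text: str) -> float:
    """Number of detected PII spans divided by the length of the text."""
    if not text:
        return 0.0
    return len(_PII_PATTERN.findall(text)) / len(text)


print(pii_per_character("no personal data here"))
print(pii_per_character("mail me: a@b.co"))
```

Dividing by the text length keeps long documents from being penalised merely for being long.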

Objectives

The six objectives used in our experiments (the MLE baseline and the five PHF objectives) are implemented as follows:

| Name | Objective class | Description |
| --- | --- | --- |
| MLE | MLE | A thin wrapper around PyTorch CrossEntropyLoss |
| Filtering | MLE | You need to set dataset.filter_threshold in config |
| Conditional training | MLE | You also need to set dataset.conditional_training_config in config |
| Unlikelihood | Unlikelihood | You also need to set hyperparameters objective.score_threshold and objective.alpha |
| AWR | AWR | You also need to set hyperparameters objective.alpha and objective.beta |
| RWR | AWR | A special case of AWR with objective.alpha=1 |
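
Filtering and conditional training both reuse the MLE objective and differ only in how the data is prepared. The sketch below shows the idea on plain strings. The function names are hypothetical, the comparison direction (score below threshold means aligned) is an assumption, and so is the reading of drop_token_fraction as the probability of leaving a segment without a control token; the real logic is driven by dataset.filter_threshold and dataset.conditional_training_config.

```python
import random
from typing import List, Optional, Tuple

ALIGNED_PREFIX = "<|aligned|>"
MISALIGNED_PREFIX = "<|misaligned|>"


def filter_segments(scored: List[Tuple[str, float]], filter_threshold: float) -> List[str]:
    """Filtering: keep only texts whose misalignment score is below the threshold."""
    return [text for text, score in scored if score < filter_threshold]


def add_control_prefix(
    text: str,
    score: float,
    threshold: float,
    drop_token_fraction: float = 0.0,
    rng: Optional[random.Random] = None,
) -> str:
    """Conditional training: prepend a control token chosen by the score."""
    rng = rng or random.Random()
    if rng.random() < drop_token_fraction:
        return text  # assumed meaning: occasionally train without a control token
    prefix = ALIGNED_PREFIX if score < threshold else MISALIGNED_PREFIX
    return prefix + text


scored = [("a polite sentence", 0.0001), ("a rude sentence", 0.9)]
print(filter_segments(scored, filter_threshold=0.5))
print([add_control_prefix(text, score, threshold=0.5) for text, score in scored])
```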

Pretrained models

The models pretrained in our experiments are available on the Hugging Face Hub:

| Objective | Toxicity | PEP8 | PII |
| --- | --- | --- | --- |
| MLE | tomekkorbak/goofy_pasteur | kejian/mighty-mle | tomekkorbak/nervous_wozniak |
| Filtering median | tomekkorbak/amazing_shannon | kejian/mighty-filtering | tomekkorbak/cocky_carson |
| Conditional | tomekkorbak/hungry_saha | kejian/mighty-conditional | tomekkorbak/boring_mcclintock |
| UL | tomekkorbak/nifty_banach | kejian/mighty-ul | tomekkorbak/affectionate_wescoff |
| AWR | tomekkorbak/upbeat_ramanujan | kejian/vigor-awr | tomekkorbak/confident_knuth |
| RWR | tomekkorbak/keen_clarke | kejian/mighty-rwr | tomekkorbak/gifted_hugle |

Metrics

On each evaluation step, apo.callbacks.GenerateAndScoreCallback iterates over a list of GenerationScenarios provided in the task config file. For each scenario, num_samples samples are generated and the following wandb metrics are computed:

  • score, the average misalignment score assigned by the scorer to the generated samples (averaged across num_samples samples)
  • score_max@25, the average maximum score in 25 samples (similar to expected maximum toxicity in the RealToxicityPrompts paper)
  • current_samples, a wandb.Table of samples together with their prompts (if any) and scores
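
The two numeric metrics can be sketched in a few lines. How the callback groups samples is not specified above, so the version below assumes consecutive, non-overlapping groups of 25 and drops an incomplete final group; the function names are hypothetical.

```python
from statistics import mean
from typing import Sequence


def average_score(scores: Sequence[float]) -> float:
    """Mean misalignment score over all samples."""
    return mean(scores)


def average_max_at_k(scores: Sequence[float], k: int = 25) -> float:
    """Split scores into consecutive groups of k, take each group's maximum,
    and average the maxima. An incomplete final group is ignored."""
    groups = [scores[i:i + k] for i in range(0, len(scores), k)]
    full_groups = [g for g in groups if len(g) == k]
    if not full_groups:
        raise ValueError(f"need at least {k} scores")
    return mean(max(g) for g in full_groups)


# 50 toy scores: one outlier in the first group of 25, none in the second.
scores = [0.1] * 24 + [0.9] + [0.2] * 25
print(average_score(scores))
print(average_max_at_k(scores, k=25))
```

The maximum-based metric is sensitive to rare bad samples that the plain average would hide.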

In addition to scoring LM samples, we use apo.callbacks.KLGPT3Callback to estimate the KL divergence of the current LM from GPT-3. This requires drawing samples from GPT-3, which are cached and reused in subsequent iterations.
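
Because the samples come from GPT-3, a natural estimator is the Monte Carlo form of KL(p || q) = E_{x~p}[log p(x) - log q(x)], with p the reference model and q the current LM. The sketch below implements that estimator for precomputed sequence log-probabilities; whether apo/kl_gpt3.py computes exactly this quantity is an assumption, and `estimate_kl` is a hypothetical name.

```python
from typing import Sequence


def estimate_kl(ref_logprobs: Sequence[float], lm_logprobs: Sequence[float]) -> float:
    """Monte Carlo estimate of KL(ref || lm) from samples drawn from the reference model.

    ref_logprobs[i] and lm_logprobs[i] are the log-probabilities that the
    reference model and the current LM assign to the i-th sample.
    """
    if len(ref_logprobs) != len(lm_logprobs) or not ref_logprobs:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [p - q for p, q in zip(ref_logprobs, lm_logprobs)]
    return sum(diffs) / len(diffs)


# Toy numbers: the LM assigns each sample 0.5 nats less than the reference.
print(estimate_kl([-1.0, -2.0], [-1.5, -2.5]))  # 0.5
```

Caching the reference samples and their log-probabilities means only the current LM's log-probabilities have to be recomputed at each evaluation step.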

Codebase structure

.
├── apo
│   ├── callbacks.py  # callbacks implementing the evaluation pipeline
│   ├── dataset_wrappers.py  # an iterable for streaming blocks of tokens for training
│   ├── kl_gpt3.py  # logic for measuring KL from GPT-3
│   ├── metrics.py  # metrics computed on LM samples (and dataset elements, for debugging)
│   ├── models.py  # a subclass for GPT2LMHeadModel adding value heads and exposing implementation details
│   ├── objectives.py  # classes implementing loss functions
│   ├── scorer_utils.py
│   ├── scorers.py  # classes for scoring LM samples and dataset elements
│   ├── trainer.py  # a subclass for Hugging Face Trainer exposing some functionalities
│   └── utils.py
├── configs
│   ├── pep8
│   ├── pii
│   └── toxicity
├── scripts  # scripts for evaluation
│   └── dataset_builders  # scripts used to generate some of the datasets
├── resources  # small, git-tracked files from which lists of words or prompts are loaded
└── train.py  # the main training script

Citing

@misc{https://doi.org/10.48550/arxiv.2302.08582,
  doi = {10.48550/ARXIV.2302.08582},
  url = {https://arxiv.org/abs/2302.08582},
  author = {Korbak, Tomasz and Shi, Kejian and Chen, Angelica and Bhalerao, Rasika and Buckley, Christopher L. and Phang, Jason and Bowman, Samuel R. and Perez, Ethan},
  keywords = {Computation and Language (cs.CL), Machine Learning (cs.LG), FOS: Computer and information sciences, FOS: Computer and information sciences},
  title = {Pretraining Language Models with Human Preferences},
  publisher = {arXiv},  
  year = {2023},
  copyright = {Creative Commons Attribution 4.0 International}
}

pretraining-with-human-feedback's People

Contributors

eltociear, tomekkorbak

pretraining-with-human-feedback's Issues

Pretrained models release

Hi,

Thanks for the interesting work. Are you planning to release the pretrained model checkpoints? They will be very helpful. Thank you.

Why is toxicity threshold so low? It is set to 0.00056.

In configs/toxicity/conditional.yml, we have the line

dataset:
  conditional_training_config:
    threshold: 0.00056
    aligned_prefix: "<|aligned|>"
    misaligned_prefix: "<|misaligned|>"
    drop_token_fraction: 0.01

Why is the toxicity threshold here 0.00056? This is incredibly low. Only sentences with toxicity scores lower than 0.00056 would be marked as non-toxic. Everything greater than or equal to that would be marked as toxic.

Don't we only want documents to be marked as toxic when their toxicity is, let's say, 0.9 or greater? (I chose 0.9 arbitrarily as an example). Generally speaking, 0.00056 seems to be quite a low threshold and I'm worried that this might hurt performance.

Can you explain the thought process that went into making the toxicity threshold 0.00056? Is this simply what got the best results?

Thanks!

Training dataset

Thank you for your interesting work and code. However, I cannot find the training dataset and therefore cannot run the code. Could you please share the training dataset used in your paper?

Code doesn't run due to shuffle=True exception

I'm getting a weird error when using the dataloader default of shuffle=True. Can you please help me debug why this is occurring?

Traceback (most recent call last):
  File "/lfs/hyperturing2/0/rschaef/KoyejoLab-Pretrain-Human-Feedback/train.py", line 163, in <module>
    train(args.checkpoint_path, config=config)
  File "/lfs/hyperturing2/0/rschaef/KoyejoLab-Pretrain-Human-Feedback/train.py", line 139, in train
    trainer.train(resume_from_checkpoint=checkpoint_path)
  File "/lfs/hyperturing2/0/rschaef/miniconda3/envs/pretrain_hf/lib/python3.9/site-packages/transformers/trainer.py", line 1196, in train
    train_dataloader = self.get_train_dataloader()
  File "/lfs/hyperturing2/0/rschaef/KoyejoLab-Pretrain-Human-Feedback/apo/trainer.py", line 118, in get_train_dataloader
    return DataLoader(
  File "/lfs/hyperturing2/0/rschaef/miniconda3/envs/pretrain_hf/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 228, in __init__
    raise ValueError(
ValueError: DataLoader with IterableDataset: expected unspecified shuffle option, but got shuffle=True

The code doesn't run due to AttributeError: '_IterDataPipeSerializationWrapper' object has no attribute 'datapipe

When I run the code via:
python train.py --task configs/toxicity/pretrain.yml --method configs/toxicity/mle.yml

I am getting this error:

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Using pad_token, but it is not set yet.
setting gradient_accumulation_steps=8 based on effective_batch_size=64 and instantaneous_bsz=8 (world_size=1, n_gpu=1)
setting max_steps=50354 based on num_tokens=3.30e+09 and tokens_already_seen=0.00e+00
max_steps is given, it will override any value given in num_train_epochs
Setting train_dataloader.batch_size=8
Using amp half precision backend
/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/optimization.py:306: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  warnings.warn(
***** Running training *****
  Num examples = 3222656
  Num Epochs = 9223372036854775807
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 64
  Gradient Accumulation steps = 8
  Total optimization steps = 50354
Traceback (most recent call last):
  File "/data1/debajyoti/test/pre_human_feedback/pretraining-with-human-feedback/train.py", line 153, in <module>
    train(args.checkpoint_path, config=config)
  File "/data1/debajyoti/test/pre_human_feedback/pretraining-with-human-feedback/train.py", line 129, in train
    trainer.train(resume_from_checkpoint=checkpoint_path)
  File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/trainer.py", line 1343, in train
    self.control = self.callback_handler.on_train_begin(args, self.state, self.control)
  File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/trainer_callback.py", line 347, in on_train_begin
    return self.call_event("on_train_begin", args, state, control)
  File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/trainer_callback.py", line 388, in call_event
    result = getattr(callback, event)(
  File "/data1/debajyoti/test/pre_human_feedback/pretraining-with-human-feedback/apo/callbacks.py", line 135, in on_train_begin
    tokens_already_seen = kwargs.get('train_dataloader').dataset.datapipe.skip_tokens
  File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/torch/utils/data/datapipes/datapipe.py", line 129, in __getattr__
    raise AttributeError(f"'{self.__class__.__name__}' object has no attribute '{attribute_name}")
AttributeError: '_IterDataPipeSerializationWrapper' object has no attribute 'datapipe

Clarification of required GPU memory to pretrain?

Approximately how much GPU memory is required to pretrain? We're running on a single GPU but we're receiving the following error, even with batch size 1:

RuntimeError: CUDA out of memory. Tried to allocate 296.00 MiB (GPU 0; 10.76 GiB total capacity; 6.34 GiB already allocated; 206.56 MiB free; 9.41 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Do we just need to move to a GPU with more memory? Or are we doing something wrong?

Cannot allocate memory error

I'm trying to run python train.py --task configs/toxicity/pretrain.yml --method configs/toxicity/mle.yml and I'm getting the following output, which ends with an error:

setting gradient_accumulation_steps=16 based on effective_batch_size=64 and instantaneous_bsz=80 (world_size=1, n_gpu=10)
setting max_steps=50354 based on num_tokens=3.30e+09 and tokens_already_seen=0.00e+00
Setting train_dataloader.batch_size=80
Setting state.tokens_seen=0.00e+00
Generating samples, scenario unconditional, batch 1 of 8
Generating samples, scenario unconditional, batch 2 of 8
Generating samples, scenario unconditional, batch 3 of 8
Generating samples, scenario unconditional, batch 4 of 8
Generating samples, scenario unconditional, batch 5 of 8
Generating samples, scenario unconditional, batch 6 of 8
Generating samples, scenario unconditional, batch 7 of 8
Generating samples, scenario unconditional, batch 8 of 8
Using pad_token, but it is not set yet.
max_steps is given, it will override any value given in num_train_epochs
Using amp half precision backend
/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/site-packages/transformers/optimization.py:306: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  warnings.warn(
***** Running training *****
  Num examples = 64453120
  Num Epochs = 9223372036854775807
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 1280
  Gradient Accumulation steps = 16
  Total optimization steps = 50354
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Traceback (most recent call last):
  File "/lfs/hyperturing1/0/schundi/pretraining-with-human-feedback/train.py", line 153, in <module>
    train(args.checkpoint_path, config=config)
  File "/lfs/hyperturing1/0/schundi/pretraining-with-human-feedback/train.py", line 129, in train
    trainer.train(resume_from_checkpoint=checkpoint_path)
  File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/site-packages/transformers/trainer.py", line 1343, in train
    self.control = self.callback_handler.on_train_begin(args, self.state, self.control)
  File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/site-packages/transformers/trainer_callback.py", line 347, in on_train_begin
    return self.call_event("on_train_begin", args, state, control)
  File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/site-packages/transformers/trainer_callback.py", line 388, in call_event
    result = getattr(callback, event)(
  File "/lfs/hyperturing1/0/schundi/pretraining-with-human-feedback/apo/callbacks.py", line 88, in on_train_begin
    self.run(args, state, control, model, tokenizer, **kwargs)
  File "/lfs/hyperturing1/0/schundi/pretraining-with-human-feedback/apo/callbacks.py", line 171, in run
    self.generate_and_score(model, tokenizer, step=state.global_step)
  File "/lfs/hyperturing1/0/schundi/pretraining-with-human-feedback/apo/callbacks.py", line 203, in generate_and_score
    for name, value in metric.score_texts(texts=samples.continuations).items()
  File "/lfs/hyperturing1/0/schundi/pretraining-with-human-feedback/apo/metrics.py", line 80, in score_texts
    pool = Pool(os.cpu_count())
  File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/multiprocessing/context.py", line 119, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
  File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/multiprocessing/pool.py", line 212, in __init__
    self._repopulate_pool()
  File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/multiprocessing/pool.py", line 303, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,
  File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/multiprocessing/pool.py", line 326, in _repopulate_pool_static
    w.start()
  File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/multiprocessing/context.py", line 277, in _Popen
    return Popen(process_obj)
  File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/afs/cs.stanford.edu/u/schundi/miniconda/lib/python3.9/multiprocessing/popen_fork.py", line 66, in _launch
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

Have you encountered this error before, or have advice to get past it?
