
tatsu-lab / stanford_alpaca

28.8K stars · 339 watchers · 4.0K forks · 8.45 MB

Code and documentation to train Stanford's Alpaca models, and generate the data.

Home Page: https://crfm.stanford.edu/2023/03/13/alpaca.html

License: Apache License 2.0

Language: Python 100.00%
Topics: deep-learning, instruction-following, language-model

stanford_alpaca's People

Contributors

eltociear, lxuechen, rtaori, tiiiger, yanndubs


stanford_alpaca's Issues

Finetuning using standard hugging face training code

The README says the model is fine-tuned with the standard Hugging Face training setup. I tried it but am getting the following error. Could someone help with loading the LLaMA weights through Hugging Face?

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Bitsy/llama-7b-hfcompatible-clean")

Error :

KeyError                                  Traceback (most recent call last)
<ipython-input> in <module>
      1 from transformers import AutoModelForCausalLM
      2
----> 3 model = AutoModelForCausalLM.from_pretrained("Bitsy/llama-7b-hfcompatible-clean")

2 frames
/usr/local/lib/python3.9/dist-packages/transformers/models/auto/configuration_auto.py in __getitem__(self, key)
    577             return self._extra_content[key]
    578         if key not in self._mapping:
--> 579             raise KeyError(key)
    580         value = self._mapping[key]
    581         module_name = model_type_to_module_name(key)

KeyError: 'llama'
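A likely cause (an assumption on my part, not confirmed in this thread) is a transformers version that predates LLaMA support, so the "llama" model type is not registered with the auto classes. A minimal sketch of a fix, assuming transformers >= 4.28.0 and keeping the repo id from the report above:

    # upgrade to a transformers release that knows the 'llama' model type:
    #   pip install -U "transformers>=4.28.0" sentencepiece
    from transformers import LlamaForCausalLM, LlamaTokenizer

    # repo id taken from the report above; substitute your own converted checkpoint path
    tokenizer = LlamaTokenizer.from_pretrained("Bitsy/llama-7b-hfcompatible-clean")
    model = LlamaForCausalLM.from_pretrained("Bitsy/llama-7b-hfcompatible-clean")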

inference kwargs

Thanks for the great work. I reproduced the training, but at inference time the model tends to generate shorter text. I am using:

generated = model.generate(batch["input_ids"], max_length=512)

Does the interface on the demo web page adjust other kwargs?
Thanks
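For reference, a hedged sketch of generation kwargs that typically yield longer outputs; these are standard transformers generate() parameters, and the values are illustrative rather than the demo's confirmed configuration:

    # max_new_tokens budgets new tokens independently of prompt length, and sampling
    # tends to produce longer, more varied completions than greedy decoding;
    # the values below are illustrative, not the demo's settings
    generated = model.generate(
        batch["input_ids"],
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )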

Do you shift the output label?

From your training code, the labels and the input_ids are the same. Where is the output label shifted? Does this happen automatically inside the trainer?
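For context, Hugging Face causal-LM models shift the labels inside their forward pass when computing the loss, so passing labels identical to input_ids is the expected usage; no manual shift is needed before handing data to the trainer. A simplified sketch of what that internal shift looks like (paraphrased, not copied from transformers):

    import torch
    import torch.nn.functional as F

    def causal_lm_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # logits: (batch, seq_len, vocab); labels: (batch, seq_len), same ids as input_ids.
        # Position t predicts token t+1, so drop the last logit and the first label.
        shift_logits = logits[:, :-1, :].contiguous()
        shift_labels = labels[:, 1:].contiguous()
        return F.cross_entropy(
            shift_logits.view(-1, shift_logits.size(-1)),
            shift_labels.view(-1),
            ignore_index=-100,  # masked positions contribute nothing to the loss
        )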

Confusion about input ids

Hi, thanks for sharing such great work.
I've read your fine-tuning code and I'm a little confused about the inputs of the model.
From the code, the model input is formatted roughly as: ### Instruction: {instruction} ### Input: {input} ### Response: {response}. So input_ids = tokenizer(example), label_ids = tokenizer(example), and label_ids[:source_len] = IGNORE_INDEX.
I would like to ask: why do the input_ids contain the response token ids? Doesn't that leak the target?

I am looking forward to your reply. Thank you very much.
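For context, the target does not leak: although input_ids and labels both cover the full prompt + response sequence, the label positions belonging to the prompt are set to IGNORE_INDEX, so the loss is only computed on response tokens, and the internal label shift (see the previous issue) handles next-token alignment. A minimal sketch of that masking, not a verbatim copy of train.py:

    IGNORE_INDEX = -100  # the label value transformers' cross-entropy ignores

    def build_example(tokenizer, source: str, target: str) -> dict:
        # full sequence = prompt (source) + response (target); labels mirror input_ids,
        # but prompt positions are masked so only the response is supervised
        full_ids = tokenizer(source + target, return_tensors="pt")["input_ids"][0]
        source_len = tokenizer(source, return_tensors="pt")["input_ids"].shape[1]
        labels = full_ids.clone()
        labels[:source_len] = IGNORE_INDEX
        return dict(input_ids=full_ids, labels=labels)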

Public release of model weights

Congratulations on the fine-tune! We have observed some fantastic performance through the provided web interface.

AFAIK the original LLaMA model was released under the GNU GPL, so you should be able to distribute derivative work that respects the original license, correct? (Even if the original model weights have not officially been distributed to the public yet.)

Will you provide some sort of wait-list to notify us when the model weights are made available?

We are interested in as much information as you can share on this. Again, congratulations and thank you for the impressive work!

https://github.com/facebookresearch/llama/blob/main/LICENSE

'type' object is not subscriptable

The exception can be fixed by replacing the built-in dict with typing.Dict (subscripting the built-in dict, as in dict[str, str], only works on Python 3.9+):

from typing import Optional, Sequence, Union
...
def openai_completion(
    prompts: Union[str, Sequence[str], Sequence[dict[str, str]], dict[str, str]],

-->

from typing import Optional, Sequence, Union, Dict
...
def openai_completion(
    prompts: Union[str, Sequence[str], Sequence[Dict[str, str]], Dict[str, str]],

Questions on fine-tuning process

I have three questions regarding the fine-tuning process.

  1. How does the max length hyperparameter work? Does each training sample concatenate multiple examples until it reaches the max length, or does each training sample contain a single example padded to the max length?
  2. Is the cross-entropy loss applied to all tokens (instruction + input + response), only to the response tokens, or to a weighted sum?
  3. How is a user prompt processed at test time? Is it treated as an example with an empty input field? (See the sketch after this list.)

Thank you in advance.
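On the third question, a user prompt at test time is formatted with the "no input" variant of the template rather than with an empty Input field. A hedged sketch of the two formats (paraphrased; see PROMPT_DICT in train.py for the exact strings):

    # paraphrased prompt templates; check PROMPT_DICT in train.py for the exact wording
    PROMPT_WITH_INPUT = (
        "Below is an instruction that describes a task, paired with an input that provides "
        "further context. Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
    )
    PROMPT_NO_INPUT = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Response:"
    )

    def format_prompt(instruction: str, input: str = "") -> str:
        # a bare user prompt maps to the no-input template
        template = PROMPT_WITH_INPUT if input else PROMPT_NO_INPUT
        return template.format(instruction=instruction, input=input)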

No evaluation dataset was given for the trainer

Hi there, I just finished the fine-tuning process as described in train.py. However, I ran into a problem with trainer.evaluate().

{'loss': 0.3974, 'learning_rate': 3.5380966993958655e-11, 'epoch': 3.0}
{'loss': 0.4492, 'learning_rate': 0.0, 'epoch': 3.0}
{'train_runtime': 17758.138, 'train_samples_per_second': 8.785, 'train_steps_per_second': 0.069, 'train_loss': 0.7304400721402787, 'epoch': 3.0}
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 1218/1218 [4:55:48<00:00, 14.57s/it]
Traceback (most recent call last):
  File "/home/codes/finetune_llama/alpaca/train.py", line 233, in <module>
    train()
  File "/home/codes/finetune_llama/alpaca/train.py", line 227, in train
    trainer.evaluate()
  File "/home/anaconda3/envs/hawq/lib/python3.9/site-packages/transformers/trainer.py", line 2920, in evaluate
    eval_dataloader = self.get_eval_dataloader(eval_dataset)
  File "/home/anaconda3/envs/hawq/lib/python3.9/site-packages/transformers/trainer.py", line 934, in get_eval_dataloader
    raise ValueError("Trainer: evaluation requires an eval_dataset.")
ValueError: Trainer: evaluation requires an eval_dataset.

Should I give an eval_dataset here?
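The data module built in train.py does not define an eval_dataset, so trainer.evaluate() has nothing to evaluate; you can either drop the evaluate() call or pass an eval_dataset explicitly. A hedged sketch of the latter (hypothetical, not the released recipe; it assumes make_supervised_data_module returns "train_dataset" and "data_collator" entries, and a proper setup would hold the evaluation examples out of training rather than reuse them):

    from torch.utils.data import Subset
    from transformers import Trainer

    train_dataset = data_module["train_dataset"]
    eval_size = max(1, len(train_dataset) // 100)        # ~1% slice, purely illustrative
    eval_dataset = Subset(train_dataset, range(eval_size))

    trainer = Trainer(
        model=model,
        tokenizer=tokenizer,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        data_collator=data_module["data_collator"],
    )
    trainer.train()
    trainer.evaluate()  # now has an eval_dataset to work with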

Generation problem after / before instruction fine-tuning

Environment: 6xA6000 48GB with Ubuntu 22.04, Pytorch 1.13.0

I ran into a generation problem after following your instructions to convert the LLaMA-7B weights using the attached script.

I simply used the following script to directly test generation after loading the converted LLaMA-7B model:

tokenizer.batch_decode(model.generate(**tokenizer('I want to ', return_tensors="pt")))

The output of the above code is:

'I want to acoérницschutzirectorioieckťDEX threshold släktetolasĭüttpiel'

The problem happens both before and after following your README for instruction fine-tuning. (Note that the loss decreases over time during the fine-tuning stage, which seems OK.)

I have no problem running generation using the original LLaMA code. Could you share your generation script so that I can test what caused the problem? Thanks.

Inquiry: Inference Parameters used for Gradio Demo

As an independent researcher, I'm interested in knowing what generation parameters are used in the Gradio web demo, such as temperature and repetition penalty. If you have used more advanced samplers like Typical Sampling or Tail Free Sampling, I'd be interested to know that as well. From my brief testing it appears that some parameter or setting is hampering creativity; perhaps that is intentional for the demo?
Thanks in advance!

Training code detail.

Thanks for sharing this project. I have been trying to train the larger model for an offline-first, free education assistant for low-income students preparing for competitive exams. Sharing the training code, even in a PR, would be really helpful for fine-tuning such an assistant.

Training recipe??

The blog says the training recipe is also released in the code, but I cannot find it. Can you update the repo with the code used for training the model, along with the required dependencies, a guide, etc., to help us do the same, perhaps with bigger models?
Thanks for this awesome repo.

OOM issue

Can this fine-tuning script fit on an A10, which only has 24 GB of GPU memory? I am trying to fine-tune the model on 4 A10 GPUs using a batch size of 1, but I still get an OOM error.

Not quite understanding the importance of this repo.

Hi, devs at Stanford. Today I tried your project and ran the command to generate the data. After a while, it produced a JSON file, regen.json, like the one shown below. I'm a little confused; forgive my ignorance, but I don't know what to make of this regen.json file. I have a file, but what can I do with it? My guess so far is that people might be able to use it to create something similar to ChatGPT, only weaker. Please enlighten me, thanks.
[screenshot of the generated regen.json]

CUDA out of memory

Hi

Great work! In the README, you mention that 4 A100 80GB GPUs can train this model, but when I try 8 A100 40GB GPUs, I hit a CUDA OOM error.
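For what it's worth, a few memory-saving knobs that are commonly combined with the released command; these are standard transformers TrainingArguments fields, but the specific values below are assumptions rather than the authors' recipe:

    from transformers import TrainingArguments

    # smaller per-device batch, more gradient accumulation, and activation checkpointing;
    # the FSDP wrap class name depends on your transformers fork/version
    training_args = TrainingArguments(
        output_dir="<your_output_dir>",
        per_device_train_batch_size=1,      # down from 4
        gradient_accumulation_steps=32,     # up from 8 to compensate
        gradient_checkpointing=True,        # recompute activations to save memory
        bf16=True,
        fsdp="full_shard auto_wrap",
        fsdp_transformer_layer_cls_to_wrap="LLaMADecoderLayer",
    )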

Example of Instruction-Tuning Training

Hello, thank you for open-sourcing this work. We are now interested in generating our own instructions to fine-tune the LLaMA model based on your documentation and approach. Could you please advise on any resources or references we can use? Also, is this code available on Hugging Face?

How to run inference after fine-tuning?

Thanks for sharing the training code. I've finished a 3-epoch fine-tuning run.
However, I can't find the inference code.
Could you give some advice on it, or share the inference code?
Thanks again!
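The repo did not ship an inference script at the time of this issue. A hedged sketch of one way to generate from the saved output directory, assuming the fine-tuned checkpoint loads with the auto classes, accelerate is installed for device_map="auto", and using a paraphrase of the Alpaca prompt template; the path and instruction are illustrative:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(
        "<your_output_dir>", torch_dtype=torch.float16, device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained("<your_output_dir>")

    prompt = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\nList three tips for staying healthy.\n\n### Response:"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
    print(tokenizer.decode(output[0], skip_special_tokens=True))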

When will airgap installation be supported?

Hi guys,
This one is awesome. When do you plan to support airgap installation? In other words, can the end user run it on their laptop or on any VM in a public cloud?

Inference cost

Hi,

Can a consumer-level GPU run inference with the Alpaca-7B model?
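Roughly speaking, a 7B model needs about 14 GB just for fp16 weights, which is tight for most consumer cards; 8-bit loading can roughly halve that. A hedged sketch, assuming the bitsandbytes and accelerate packages are installed and that the weights are available locally (the path is illustrative):

    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "<path_to_alpaca_7b>",   # illustrative path to converted/fine-tuned weights
        load_in_8bit=True,       # requires bitsandbytes + accelerate
        device_map="auto",
    )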

Resuming from checkpoint

My first run of the trainer could not save the model because the evaluate() call fails. I have removed that method call and now would like to resume from the last checkpoint. However, I cannot seem to get that working. Is there some disparity between the model architecture and the checkpoint architecture? The change I made to accommodate checkpoint resumption and the error I get are shown below.

Change for checkpoint resumption

data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args)
trainer = Trainer(model=model, tokenizer=tokenizer, args=training_args, **data_module)
transformers.logging.set_verbosity_info()
trainer.train()
# trainer.train("output/checkpoint-18000")
# trainer.evaluate()
trainer.save_state()
safe_save_model_for_hf_trainer(trainer=trainer, output_dir=training_args.output_dir)

Error stacktrace

Loading model from output/checkpoint-18000/.
Traceback (most recent call last):
  File "/home/ubuntu/alpaca/stanford_alpaca/train.py", line 246, in <module>
    train()
  File "/home/ubuntu/alpaca/stanford_alpaca/train.py", line 239, in train
    trainer.train("output/checkpoint-18000/")
  File "/home/ubuntu/.local/lib/python3.10/site-packages/transformers/trainer.py", line 1617, in train
    self._load_from_checkpoint(resume_from_checkpoint)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2120, in _load_from_checkpoint
    load_result = load_sharded_checkpoint(model, resume_from_checkpoint, strict=is_sagemaker_mp_enabled())
  File "/home/ubuntu/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 385, in load_sharded_checkpoint
    state_dict = torch.load(os.path.join(folder, shard_file), map_location="cpu")
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/serialization.py", line 809, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/serialization.py", line 1172, in _load
    result = unpickler.load()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/_utils.py", line 169, in _rebuild_tensor_v2
    tensor = _rebuild_tensor(storage, storage_offset, size, stride)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/_utils.py", line 148, in _rebuild_tensor
    return t.set_(storage._untyped_storage, storage_offset, size, stride)
RuntimeError: Trying to resize storage that is not resizable
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 122406 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 122407 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 122409 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 2 (pid: 122408) of binary: /usr/local/bin/python3.10
Traceback (most recent call last):
  File "/home/ubuntu/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 794, in main
    run(args)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 785, in run
    elastic_launch(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

train.py FAILED
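Not a confirmed fix, but for reference the Trainer also accepts a keyword form for resumption; whether loading an FSDP-sharded checkpoint this way works depends on the transformers version, and the "Trying to resize storage that is not resizable" error above suggests the sharded state dict is not being reassembled correctly:

    # keyword form: resume from the latest checkpoint under output_dir,
    # or point at a specific checkpoint directory
    trainer.train(resume_from_checkpoint=True)
    # trainer.train(resume_from_checkpoint="output/checkpoint-18000")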

Question about training precision

In the provided training command:

torchrun --nproc_per_node=4 --master_port=<your_random_port> train.py \
    --model_name_or_path <your_path_to_hf_converted_llama_ckpt_and_tokenizer> \
    --data_path ./alpaca_data.json \
    --bf16 True \
    --output_dir <your_output_dir> \
    --num_train_epochs 3 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 8 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 2000 \
    --save_total_limit 1 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --fsdp "full_shard auto_wrap" \
    --fsdp_transformer_layer_cls_to_wrap 'LLaMADecoderLayer' \
    --tf32 True

Why is --bf16 used, if the model checkpoints were originally fp16? Is it simply overridden by the --tf32 flag later?
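For context (not an official answer): --bf16 selects the training compute/parameter dtype, while --tf32 only enables TensorFloat-32 tensor-core math for any remaining fp32 matmuls on Ampere GPUs, so the two flags are complementary rather than one overriding the other; loading fp16 checkpoints into bf16 training is just a precision cast. Roughly what the --tf32 flag toggles:

    import torch

    # approximately what --tf32 does: allow TF32 kernels for fp32 matmuls and
    # convolutions on Ampere+ GPUs; it does not change parameter or gradient dtype
    torch.backends.cuda.matmul.allow_tf32 = True
    torch.backends.cudnn.allow_tf32 = True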

Reduce the length of your prompt.

prompt_batches: 0%| | 0/1 [00:00<?, ?it/s]WARNING:root:OpenAIError: This model's maximum context length is 4097 tokens, however you requested 4162 tokens (1090 in your prompt; 3072 for the completion). Please reduce your prompt; or completion length..
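The arithmetic behind the warning: the prompt and the requested completion together exceed the model's context window, so the completion budget has to shrink (or the prompt has to be trimmed):

    # 1090 (prompt) + 3072 (requested completion) = 4162 > 4097 (context limit)
    max_completion = 4097 - 1090   # = 3007 tokens left for the completion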

No checkpoint and no eval_dataset

It seems there is no eval_dataset and thus no checkpoint saving?

(For privacy, absolute file paths are replaced with <path>.)

Traceback (most recent call last):
  File "<path>/stanford_alpaca/train.py", line 232, in <module>
    train()
  File "<path>/stanford_alpaca/train.py", line 226, in train
    trainer.evaluate()
  File "<path>/stanford_alpaca/transformers-68d640f7c368bcaaaecfc678f11908ebbd3d6176/src/transformers/trainer.py", line 2920, in evaluate
    eval_dataloader = self.get_eval_dataloader(eval_dataset)
  File "<path>/stanford_alpaca/transformers-68d640f7c368bcaaaecfc678f11908ebbd3d6176/src/transformers/trainer.py", line 934, in get_eval_dataloader
    raise ValueError("Trainer: evaluation requires an eval_dataset.")
ValueError: Trainer: evaluation requires an eval_dataset.

why 52K?

Hello, thank you for open-sourcing your training details! I just tried your demo and found the responses surprisingly fluent.

I'm wondering whether your decision to train on a 52K-instruction dataset was guided by particular criteria. Is there a floor below which you found responses to be qualitatively inferior, or did going beyond 52K not yield better results?

Bigger LLaMA models

Dear Stanford researchers, professors, and students (all geniuses), thank you for your amazing work!
Would the tuning code you released in this repo (and the dataset) be suitable for fine-tuning larger LLaMA models like 13B/30B/65B?

How would the computational effort scale with such models?

Support for gpt-3.5-turbo

gpt-3.5-turbo is cheaper and faster than davinci. I'm not 100% sure whether it will actually work better for Alpaca, but I figure it may be worth a try. Any interest in taking a PR?

Fine-Tuning very slow (6h->24h??)

Hello, first of all thank you for releasing the training code for Alpaca; we really appreciate it.

I am running the fine-tuning script on 4x A100-SXM4-80GB and am currently getting a 24-hour ETA, which doesn't square with the "3 hours on 8 80GB A100s" reported at https://crfm.stanford.edu/2023/03/13/alpaca.html. Shouldn't it be around 6 hours, or even 12 hours given that the script "is not particularly optimized"?

Is anyone else encountering this issue? If this is expected, what methods did you use to optimize the fine-tuning process?

Running on CUDA 12.1, Torch 1.13, and the transformers fork of llama at the commit you mentioned.

Thanks.

Reduce reproduction cost 96%, from $600 to $24, by releasing the instruct dataset only

The blog post says $500 was spent producing the dataset.
The blog post also says $100 was spent on 3xA100 80GB for 3 hours.
The market rate for 4xA100 is around $8 per hour. (See vast.ai for example)

If the dataset is provided for fine-tuning, then Alpaca could be reproduced for about $24, and we would not have to wait for Facebook's response regarding sharing of the pre-trained model.
