
tatsu-lab / stanford_alpaca

28.8K stars · 339 watchers · 4.0K forks · 8.45 MB

Code and documentation to train Stanford's Alpaca models, and generate the data.

Home Page: https://crfm.stanford.edu/2023/03/13/alpaca.html

License: Apache License 2.0

Language: Python 100.00%
Topics: deep-learning, instruction-following, language-model

stanford_alpaca's People

Contributors

eltociear, lxuechen, rtaori, tiiiger, yanndubs


stanford_alpaca's Issues

Finetuning using standard hugging face training code

The README says the model is fine-tuned with the standard Hugging Face training setup. I tried it but am getting the following error. Could someone help with loading the LLaMA weights through Hugging Face?

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Bitsy/llama-7b-hfcompatible-clean")

Error :

KeyError                                  Traceback (most recent call last)
<ipython-input> in <module>
      1 from transformers import AutoModelForCausalLM
      2
----> 3 model = AutoModelForCausalLM.from_pretrained("Bitsy/llama-7b-hfcompatible-clean")

2 frames
/usr/local/lib/python3.9/dist-packages/transformers/models/auto/configuration_auto.py in __getitem__(self, key)
    577             return self._extra_content[key]
    578         if key not in self._mapping:
--> 579             raise KeyError(key)
    580         value = self._mapping[key]
    581         module_name = model_type_to_module_name(key)

KeyError: 'llama'
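A likely cause (an assumption on my part, not confirmed in this thread) is a transformers version that predates LLaMA support, so the "llama" model type is not registered with the auto classes. A minimal sketch of a fix, assuming transformers >= 4.28.0 and keeping the repo id from the report above:

    # upgrade to a transformers release that knows the 'llama' model type:
    #   pip install -U "transformers>=4.28.0" sentencepiece
    from transformers import LlamaForCausalLM, LlamaTokenizer

    # repo id taken from the report above; substitute your own converted checkpoint path
    tokenizer = LlamaTokenizer.from_pretrained("Bitsy/llama-7b-hfcompatible-clean")
    model = LlamaForCausalLM.from_pretrained("Bitsy/llama-7b-hfcompatible-clean")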

inference kwargs

Thanks for the great work. I reproduced the training, but at inference time the model tends to generate shorter text. I am using:

generated = model.generate(batch["input_ids"], max_length=512)

Does the interface on the demo web page adjust other kwargs?
Thanks
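For reference, a hedged sketch of generation kwargs that typically yield longer outputs; these are standard transformers generate() parameters, and the values are illustrative rather than the demo's confirmed configuration:

    # max_new_tokens budgets new tokens independently of prompt length, and sampling
    # tends to produce longer, more varied completions than greedy decoding;
    # the values below are illustrative, not the demo's settings
    generated = model.generate(
        batch["input_ids"],
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )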

Do you shift the output label?

From your training code, the labels and the input_ids are the same. Where is the output label shifted? Does this happen automatically inside the trainer?
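For context, Hugging Face causal-LM models shift the labels inside their forward pass when computing the loss, so passing labels identical to input_ids is the expected usage; no manual shift is needed before handing data to the trainer. A simplified sketch of what that internal shift looks like (paraphrased, not copied from transformers):

    import torch
    import torch.nn.functional as F

    def causal_lm_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # logits: (batch, seq_len, vocab); labels: (batch, seq_len), same ids as input_ids.
        # Position t predicts token t+1, so drop the last logit and the first label.
        shift_logits = logits[:, :-1, :].contiguous()
        shift_labels = labels[:, 1:].contiguous()
        return F.cross_entropy(
            shift_logits.view(-1, shift_logits.size(-1)),
            shift_labels.view(-1),
            ignore_index=-100,  # masked positions contribute nothing to the loss
        )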

Confusion about input ids

Hi, thanks for sharing such great work.
I've read your fine-tuning code and I'm a little confused about the inputs of the model.
From the code, the model input is formatted roughly as: ### Instruction: {instruction} ### Input: {input} ### Response: {response}. So input_ids = tokenizer(example), label_ids = tokenizer(example), and label_ids[:source_len] = IGNORE_INDEX.
I would like to ask: why do the input_ids contain the response token ids? Doesn't that leak the target?

I am looking forward to your reply. Thank you very much.
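For context, the target does not leak: although input_ids and labels both cover the full prompt + response sequence, the label positions belonging to the prompt are set to IGNORE_INDEX, so the loss is only computed on response tokens, and the internal label shift (see the previous issue) handles next-token alignment. A minimal sketch of that masking, not a verbatim copy of train.py:

    IGNORE_INDEX = -100  # the label value transformers' cross-entropy ignores

    def build_example(tokenizer, source: str, target: str) -> dict:
        # full sequence = prompt (source) + response (target); labels mirror input_ids,
        # but prompt positions are masked so only the response is supervised
        full_ids = tokenizer(source + target, return_tensors="pt")["input_ids"][0]
        source_len = tokenizer(source, return_tensors="pt")["input_ids"].shape[1]
        labels = full_ids.clone()
        labels[:source_len] = IGNORE_INDEX
        return dict(input_ids=full_ids, labels=labels)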

Public release of model weights

Congratulations on the fine-tune! We have observed some fantastic performance through the provided web interface.

AFAIK the original LLaMA model was released under the GNU GPL, so you should be able to distribute derivative work that respects the original license, correct? (Even if the original model weights have not officially been distributed to the public yet.)

Will you provide some sort of wait-list to notify us when the model weights are made available?

We are interested in as much information as you can share on this. Again, congratulations and thank you for the impressive work!

https://github.com/facebookresearch/llama/blob/main/LICENSE

'type' object is not subscriptable

The exception can be fixed by replacing the built-in dict with typing.Dict (subscripting the built-in dict, as in dict[str, str], only works on Python 3.9+):

from typing import Optional, Sequence, Union
...
def openai_completion(
    prompts: Union[str, Sequence[str], Sequence[dict[str, str]], dict[str, str]],

-->

from typing import Optional, Sequence, Union, Dict
...
def openai_completion(
    prompts: Union[str, Sequence[str], Sequence[Dict[str, str]], Dict[str, str]],

Questions on fine-tuning process

I have three questions regarding the fine-tuning process.

  1. How does the max length hyperparameter work? Does each training sample concatenate multiple examples until it reaches the max length, or does each training sample contain a single example padded to the max length?
  2. Is the cross-entropy loss applied to all tokens (instruction + input + response), only to the response tokens, or to a weighted sum?
  3. How is a user prompt processed at test time? Is it treated as an example with an empty input field? (See the sketch after this list.)

Thank you in advance.
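On the third question, a user prompt at test time is formatted with the "no input" variant of the template rather than with an empty Input field. A hedged sketch of the two formats (paraphrased; see PROMPT_DICT in train.py for the exact strings):

    # paraphrased prompt templates; check PROMPT_DICT in train.py for the exact wording
    PROMPT_WITH_INPUT = (
        "Below is an instruction that describes a task, paired with an input that provides "
        "further context. Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
    )
    PROMPT_NO_INPUT = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Response:"
    )

    def format_prompt(instruction: str, input: str = "") -> str:
        # a bare user prompt maps to the no-input template
        template = PROMPT_WITH_INPUT if input else PROMPT_NO_INPUT
        return template.format(instruction=instruction, input=input)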

No evaluation dataset was given for the trainer

Hi there, I just finished the fine-tuning process as described in train.py. However, I ran into a problem with trainer.evaluate().

{'loss': 0.3974, 'learning_rate': 3.5380966993958655e-11, 'epoch': 3.0}
{'loss': 0.4492, 'learning_rate': 0.0, 'epoch': 3.0}
{'train_runtime': 17758.138, 'train_samples_per_second': 8.785, 'train_steps_per_second': 0.069, 'train_loss': 0.7304400721402787, 'epoch': 3.0}
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 1218/1218 [4:55:48<00:00, 14.57s/it]
Traceback (most recent call last):
  File "/home/codes/finetune_llama/alpaca/train.py", line 233, in <module>
    train()
  File "/home/codes/finetune_llama/alpaca/train.py", line 227, in train
    trainer.evaluate()
  File "/home/anaconda3/envs/hawq/lib/python3.9/site-packages/transformers/trainer.py", line 2920, in evaluate
    eval_dataloader = self.get_eval_dataloader(eval_dataset)
  File "/home/anaconda3/envs/hawq/lib/python3.9/site-packages/transformers/trainer.py", line 934, in get_eval_dataloader
    raise ValueError("Trainer: evaluation requires an eval_dataset.")
ValueError: Trainer: evaluation requires an eval_dataset.

Should I give an eval_dataset here?
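The data module built in train.py does not define an eval_dataset, so trainer.evaluate() has nothing to evaluate; you can either drop the evaluate() call or pass an eval_dataset explicitly. A hedged sketch of the latter (hypothetical, not the released recipe; it assumes make_supervised_data_module returns "train_dataset" and "data_collator" entries, and a proper setup would hold the evaluation examples out of training rather than reuse them):

    from torch.utils.data import Subset
    from transformers import Trainer

    train_dataset = data_module["train_dataset"]
    eval_size = max(1, len(train_dataset) // 100)        # ~1% slice, purely illustrative
    eval_dataset = Subset(train_dataset, range(eval_size))

    trainer = Trainer(
        model=model,
        tokenizer=tokenizer,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        data_collator=data_module["data_collator"],
    )
    trainer.train()
    trainer.evaluate()  # now has an eval_dataset to work with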

Generation problem after / before instruction fine-tuning

Environment: 6xA6000 48GB with Ubuntu 22.04, Pytorch 1.13.0

I ran into a generation problem after following your instructions to convert the LLaMA-7B weights using the attached script.

I simply used the following script to directly test generation after loading the converted LLaMA-7B model:

tokenizer.batch_decode(model.generate(**tokenizer('I want to ', return_tensors="pt")))

The output of the above code is:

'I want to acoérницschutzirectorioieckťDEX threshold släktetolasĭüttpiel'

The problem happens both before and after following your README for instruction fine-tuning. (Note that the loss decreases over time during the fine-tuning stage, which seems OK.)

I have no problem running generation using the original LLaMA code. Could you share your generation script so that I can test what caused the problem? Thanks.

Inquiry: Inference Parameters used for Gradio Demo

As an independent researcher, I'm interested in knowing what generation parameters are used in the Gradio web demo, such as temperature and repetition penalty. If you have used more advanced samplers like Typical Sampling or Tail Free Sampling, I'd be interested to know that as well. From my brief testing it appears that some parameter or setting is hampering creativity; perhaps that is intentional for the demo?
Thanks in advance!

Training code detail.

Thanks for sharing this project. I have been trying to train the larger model for an offline-first, free education assistant for low-income students preparing for competitive exams. Sharing the training code, even in a PR, would be really helpful for fine-tuning such an assistant.

Training recipe??

The blog says the training recipe is also released in the code, but I cannot find it. Can you update the repo with the code used for training the model, along with the required dependencies, a guide, etc., to help us do the same, perhaps with bigger models?
Thanks for this awesome repo.

OOM issue

Can this fine-tuning script fit on an A10, which only has 24 GB of GPU memory? I am trying to fine-tune the model on 4 A10 GPUs using a batch size of 1, but I still get an OOM error.

Not quite understanding the importance of this repo.

Hi, devs at Stanford. Today I tried your project and ran the command to generate the data. After a while, it produced a JSON file, regen.json, like the one shown below. I'm a little confused; forgive my ignorance, but I don't know what to make of this regen.json file. I have a file, but what can I do with it? My guess so far is that people might be able to use it to create something similar to ChatGPT, only weaker. Please enlighten me, thanks.
[screenshot of the generated regen.json]

CUDA out of memory

Hi

Great work! In the README, you mention that 4 A100 80GB GPUs can train this model, but when I try 8 A100 40GB GPUs, I hit a CUDA OOM error.
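For what it's worth, a few memory-saving knobs that are commonly combined with the released command; these are standard transformers TrainingArguments fields, but the specific values below are assumptions rather than the authors' recipe:

    from transformers import TrainingArguments

    # smaller per-device batch, more gradient accumulation, and activation checkpointing;
    # the FSDP wrap class name depends on your transformers fork/version
    training_args = TrainingArguments(
        output_dir="<your_output_dir>",
        per_device_train_batch_size=1,      # down from 4
        gradient_accumulation_steps=32,     # up from 8 to compensate
        gradient_checkpointing=True,        # recompute activations to save memory
        bf16=True,
        fsdp="full_shard auto_wrap",
        fsdp_transformer_layer_cls_to_wrap="LLaMADecoderLayer",
    )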

Example of Instruction-Tuning Training

Hello, thank you for open-sourcing this work. We are now interested in generating our own instructions to fine-tune the LLaMA model based on your documentation and approach. Could you please advise on any resources or references we can use? Also, is this code available on Hugging Face?

How to run inference after fine-tuning?

Thanks for sharing the training code. I've finished a 3-epoch fine-tuning run.
However, I can't find the inference code.
Could you give some advice on it, or share the inference code?
Thanks again!
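The repo did not ship an inference script at the time of this issue. A hedged sketch of one way to generate from the saved output directory, assuming the fine-tuned checkpoint loads with the auto classes, accelerate is installed for device_map="auto", and using a paraphrase of the Alpaca prompt template; the path and instruction are illustrative:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(
        "<your_output_dir>", torch_dtype=torch.float16, device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained("<your_output_dir>")

    prompt = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\nList three tips for staying healthy.\n\n### Response:"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
    print(tokenizer.decode(output[0], skip_special_tokens=True))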

When will airgap installation be supported?

Hi guys,
This one is awesome. When do you plan to support airgap installation? In other words, can the end user run it on their laptop or on any VM in a public cloud?

Inference cost

Hi,

Can a consumer-level GPU run inference with the Alpaca-7B model?
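Roughly speaking, a 7B model needs about 14 GB just for fp16 weights, which is tight for most consumer cards; 8-bit loading can roughly halve that. A hedged sketch, assuming the bitsandbytes and accelerate packages are installed and that the weights are available locally (the path is illustrative):

    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "<path_to_alpaca_7b>",   # illustrative path to converted/fine-tuned weights
        load_in_8bit=True,       # requires bitsandbytes + accelerate
        device_map="auto",
    )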

Resuming from checkpoint

My first run of the trainer could not save the model because the evaluate() call fails. I have removed that method call and now would like to resume from the last checkpoint. However, I cannot seem to get that working. Is there some disparity between the model architecture and the checkpoint architecture? The change I made to accommodate checkpoint resumption and the error I get are shown below.

Change for checkpoint resumption

data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args)
trainer = Trainer(model=model, tokenizer=tokenizer, args=training_args, **data_module)
transformers.logging.set_verbosity_info()
trainer.train()
# trainer.train("output/checkpoint-18000")
# trainer.evaluate()
trainer.save_state()
safe_save_model_for_hf_trainer(trainer=trainer, output_dir=training_args.output_dir)

Error stacktrace

Loading model from output/checkpoint-18000/.
Traceback (most recent call last):
  File "/home/ubuntu/alpaca/stanford_alpaca/train.py", line 246, in <module>
    train()
  File "/home/ubuntu/alpaca/stanford_alpaca/train.py", line 239, in train
    trainer.train("output/checkpoint-18000/")
  File "/home/ubuntu/.local/lib/python3.10/site-packages/transformers/trainer.py", line 1617, in train
    self._load_from_checkpoint(resume_from_checkpoint)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2120, in _load_from_checkpoint
    load_result = load_sharded_checkpoint(model, resume_from_checkpoint, strict=is_sagemaker_mp_enabled())
  File "/home/ubuntu/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 385, in load_sharded_checkpoint
    state_dict = torch.load(os.path.join(folder, shard_file), map_location="cpu")
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/serialization.py", line 809, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/serialization.py", line 1172, in _load
    result = unpickler.load()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/_utils.py", line 169, in _rebuild_tensor_v2
    tensor = _rebuild_tensor(storage, storage_offset, size, stride)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/_utils.py", line 148, in _rebuild_tensor
    return t.set_(storage._untyped_storage, storage_offset, size, stride)
RuntimeError: Trying to resize storage that is not resizable
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 122406 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 122407 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 122409 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 2 (pid: 122408) of binary: /usr/local/bin/python3.10
Traceback (most recent call last):
  File "/home/ubuntu/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 794, in main
    run(args)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 785, in run
    elastic_launch(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

train.py FAILED
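Not a confirmed fix, but for reference the Trainer also accepts a keyword form for resumption; whether loading an FSDP-sharded checkpoint this way works depends on the transformers version, and the "Trying to resize storage that is not resizable" error above suggests the sharded state dict is not being reassembled correctly:

    # keyword form: resume from the latest checkpoint under output_dir,
    # or point at a specific checkpoint directory
    trainer.train(resume_from_checkpoint=True)
    # trainer.train(resume_from_checkpoint="output/checkpoint-18000")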

Question about training precision

In the provided training command:

torchrun --nproc_per_node=4 --master_port=<your_random_port> train.py \
    --model_name_or_path <your_path_to_hf_converted_llama_ckpt_and_tokenizer> \
    --data_path ./alpaca_data.json \
    --bf16 True \
    --output_dir <your_output_dir> \
    --num_train_epochs 3 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 8 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 2000 \
    --save_total_limit 1 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --fsdp "full_shard auto_wrap" \
    --fsdp_transformer_layer_cls_to_wrap 'LLaMADecoderLayer' \
    --tf32 True

Why is --bf16 used, if the model checkpoints were originally fp16? Is it simply overridden by the --tf32 flag later?
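For context (not an official answer): --bf16 selects the training compute/parameter dtype, while --tf32 only enables TensorFloat-32 tensor-core math for any remaining fp32 matmuls on Ampere GPUs, so the two flags are complementary rather than one overriding the other; loading fp16 checkpoints into bf16 training is just a precision cast. Roughly what the --tf32 flag toggles:

    import torch

    # approximately what --tf32 does: allow TF32 kernels for fp32 matmuls and
    # convolutions on Ampere+ GPUs; it does not change parameter or gradient dtype
    torch.backends.cuda.matmul.allow_tf32 = True
    torch.backends.cudnn.allow_tf32 = True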

Reduce the length of your prompt.

prompt_batches: 0%| | 0/1 [00:00<?, ?it/s]WARNING:root:OpenAIError: This model's maximum context length is 4097 tokens, however you requested 4162 tokens (1090 in your prompt; 3072 for the completion). Please reduce your prompt; or completion length..
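The arithmetic behind the warning: the prompt and the requested completion together exceed the model's context window, so the completion budget has to shrink (or the prompt has to be trimmed):

    # 1090 (prompt) + 3072 (requested completion) = 4162 > 4097 (context limit)
    max_completion = 4097 - 1090   # = 3007 tokens left for the completion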

No checkpoint and no eval_dataset

It seems there is no eval_dataset and thus no checkpoint saving?

(For privacy, absolute file paths are replaced with <path>.)

Traceback (most recent call last):
  File "<path>/stanford_alpaca/train.py", line 232, in <module>
    train()
  File "<path>/stanford_alpaca/train.py", line 226, in train
    trainer.evaluate()
  File "<path>/stanford_alpaca/transformers-68d640f7c368bcaaaecfc678f11908ebbd3d6176/src/transformers/trainer.py", line 2920, in evaluate
    eval_dataloader = self.get_eval_dataloader(eval_dataset)
  File "<path>/stanford_alpaca/transformers-68d640f7c368bcaaaecfc678f11908ebbd3d6176/src/transformers/trainer.py", line 934, in get_eval_dataloader
    raise ValueError("Trainer: evaluation requires an eval_dataset.")
ValueError: Trainer: evaluation requires an eval_dataset.

why 52K?

Hello, thank you for open-sourcing your training details! I just tried your demo and found the responses surprisingly fluent.

I'm wondering whether your decision to train on a 52K-instruction dataset was guided by particular criteria. Is there a floor below which you found responses to be qualitatively inferior, or did going beyond 52K not yield better results?

Bigger LLaMA models

Dear Stanford researchers, professors, and students (all geniuses), thank you for your amazing work!
Would the tuning code you released in this repo (and the dataset) be suitable for fine-tuning larger LLaMA models like 13B/30B/65B?

How would the computational effort scale with such models?

Support for gpt-3.5-turbo

gpt-3.5-turbo is cheaper and faster than davinci. I'm not 100% sure whether it will actually work better for Alpaca, but I figure it may be worth a try. Any interest in taking a PR?

Fine-Tuning very slow (6h->24h??)

Hello, first of all thank you for releasing the training code for Alpaca; we really appreciate it.

I am running the fine-tuning script on 4x A100-SXM4-80GB and am currently getting a 24-hour ETA, which doesn't square with the "3 hours on 8 80GB A100s" reported at https://crfm.stanford.edu/2023/03/13/alpaca.html. Shouldn't it be around 6 hours, or even 12 hours given that the script "is not particularly optimized"?

Is anyone else encountering this issue? If this is expected, what methods did you use to optimize the fine-tuning process?

Running on CUDA 12.1, Torch 1.13, and the transformers fork of llama at the commit you mentioned.

Thanks.

Reduce reproduction cost 96%, from $600 to $24, by releasing the instruct dataset only

The blog post says $500 was spent producing the dataset.
The blog post also says $100 was spent on 3xA100 80GB for 3 hours.
The market rate for 4xA100 is around $8 per hour. (See vast.ai for example)

If the dataset is provided for fine-tuning, then Alpaca could be reproduced for about $24, and we would not have to wait for Facebook's response regarding sharing of the pre-trained model.
