Comments (9)
lr=2e-4
lora_rank=8
lora_alpha=32
lora_trainable="q_proj,v_proj,k_proj,o_proj,gate_proj,down_proj,up_proj"
modules_to_save="embed_tokens,lm_head"
lora_dropout=0.05
pretrained_model=./llama2
chinese_tokenizer_path=./llama2
dataset_dir=./data
data_cache=1
per_device_train_batch_size=16
gradient_accumulation_steps=8
block_size=512
output_dir=output_dir
deepspeed_config_file=ds_zero2_no_offload.json
torchrun --nnodes 1 --nproc_per_node 2 run_clm_pt_with_peft.py \
    --deepspeed ${deepspeed_config_file} \
    --model_name_or_path ${pretrained_model} \
    --tokenizer_name_or_path ${chinese_tokenizer_path} \
    --dataset_dir ${dataset_dir} \
    --data_cache_dir ${data_cache} \
    --validation_split_percentage 0.001 \
    --per_device_train_batch_size ${per_device_train_batch_size} \
    --do_train \
    --seed $RANDOM \
    --fp16 \
    --num_train_epochs 1 \
    --lr_scheduler_type cosine \
    --learning_rate ${lr} \
    --warmup_ratio 0.05 \
    --weight_decay 0.01 \
    --logging_strategy steps \
    --logging_steps 10 \
    --save_strategy epoch \
    --save_total_limit 1 \
    --gradient_accumulation_steps ${gradient_accumulation_steps} \
    --preprocessing_num_workers 8 \
    --block_size ${block_size} \
    --output_dir ${output_dir} \
    --overwrite_output_dir \
    --ddp_timeout 30000 \
    --logging_first_step True \
    --lora_rank ${lora_rank} \
    --lora_alpha ${lora_alpha} \
    --trainable ${lora_trainable} \
    --lora_dropout ${lora_dropout} \
    --modules_to_save ${modules_to_save} \
    --torch_dtype float16 \
    --load_in_kbits 16 \
    --save_safetensors True \
    --gradient_checkpointing \
    --ddp_find_unused_parameters False
from chinese-llama-alpaca-2.
Hello, I did not change anything in the code. Is something wrong somewhere? Why is the LoRA module produced by pre-training only 48 B?
Received, thank you for your message. Best wishes!
Hello, could you tell me what the problem is here?
Are the weights under pt_lora_model normal?
The weights under pt_lora_model look the same.
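One way to check whether a saved adapter actually contains weights is to read the header of the `.safetensors` file and list the tensor names it records. The sketch below uses only the standard library and relies on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The function name `list_safetensors_tensors` is hypothetical and is not part of the training scripts.

```python
import json
import struct


def list_safetensors_tensors(path):
    """Return the tensor names recorded in a .safetensors file header."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the JSON header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return [name for name in header if name != "__metadata__"]
```

Calling it on the adapter file (for example `output_dir/pt_lora_model/adapter_model.safetensors`, a path assumed here and not confirmed in this thread) and getting an empty list would indicate that no LoRA tensors were written, which would be consistent with a file of only a few dozen bytes.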
Regarding the pre-trained LoRA module being only 48 B: the pitfall is here, my friend. Just comment out the following code in the training script and it works normally:
# old_state_dict = model.state_dict
# model.state_dict = (
#     lambda self, *_, **__: get_peft_model_state_dict(self, old_state_dict())
# ).__get__(model, type(model))
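The thread does not explain why this override produces an almost empty file. One plausible explanation, offered as an assumption: the patched `model.state_dict` already returns a LoRA-only dict with the adapter name stripped from the keys, so if the saving code filters that result a second time, no key matches any more. The toy below illustrates the effect; `toy_peft_filter` is a hypothetical stand-in written for this sketch, not the real peft function, and the key names are made up.

```python
ADAPTER_NAME = "default"  # assumed adapter name for this toy model


def toy_peft_filter(state_dict, adapter_name=ADAPTER_NAME):
    """Toy stand-in for a LoRA state-dict filter; NOT the real peft function.

    Keeps only LoRA keys that belong to `adapter_name`, then strips the
    adapter name from each key.
    """
    kept = {
        k: v
        for k, v in state_dict.items()
        if "lora_" in k and adapter_name in k
    }
    return {k.replace(f".{adapter_name}", ""): v for k, v in kept.items()}


# Made-up keys: one base weight and two LoRA weights for the same module.
full = {
    "q_proj.weight": 0,
    "q_proj.lora_A.default.weight": 1,
    "q_proj.lora_B.default.weight": 2,
}

once = toy_peft_filter(full)   # what a patched state_dict() would return
twice = toy_peft_filter(once)  # what a second filtering pass would leave

print(sorted(once))  # ['q_proj.lora_A.weight', 'q_proj.lora_B.weight']
print(twice)         # {}
```

Under that assumption, commenting out the override means the weights are filtered only once at save time, so the LoRA tensors survive.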
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your consideration.
Closing the issue, since no updates observed. Feel free to re-open if you need any further assistance.