cshaitao / chatglm_mutli_gpu_tuning

A simple, efficient way to fine-tune large models on multiple GPUs with DeepSpeed + Trainer

License: MIT License

Python 99.06% Shell 0.94%

chatglm_mutli_gpu_tuning's Introduction

Hi there 👋 This is Haitao Li.

😀 I’m a second-year master's student in the Tsinghua IR Group, supervised by Prof. Yiqun Liu.
🏆 My research lies in Information Retrieval and Legal Case Retrieval. I currently focus on more reliable and interpretable legal case retrieval techniques with large language models. I am also very curious about dense retrieval. My publications are available on my homepage.
📫 Contact me via [email protected]

chatglm_mutli_gpu_tuning's Issues

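Can the three fine-tuning methods (LoRA, P-Tuning v2, Freeze) run on two 16 GB GPUs?

As the title says, I would like to ask about GPU memory usage.

torch version conflict in requirements

Hi, the torch versions in requirements conflict: torch=2.0 and pytorch=1.12.1. Which version should be used?

Dear author, when I run web_demo.py from the original ChatGLM project and try to access it from my local computer, …

Running bash ptuning.sh fails with NameError: name 'Optional' is not defined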

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/mdisk2/tanjunwen/gitprj/mChatGLM_mutli_gpu_tuning/finetune_ptuning.py:34 in │
│ │
│ 31 │ │ return super().forward(x).to(torch.float32) │
│ 32 │
│ 33 │
│ ❱ 34 class ModifiedTrainer(Trainer): │
│ 35 │ def evaluate( │
│ 36 │ │ self, │
│ 37 │ │ eval_dataset: Optional[Dataset] = None, │
│ │
│ /home/mdisk2/tanjunwen/gitprj/mChatGLM_mutli_gpu_tuning/finetune_ptuning.py:37 in │
│ ModifiedTrainer │
│ │
│ 34 class ModifiedTrainer(Trainer): │
│ 35 │ def evaluate( │
│ 36 │ │ self, │
│ ❱ 37 │ │ eval_dataset: Optional[Dataset] = None, │
│ 38 │ │ ignore_keys: Optional[List[str]] = None, │
│ 39 │ │ metric_key_prefix: str = "eval", │
│ 40 │ │ **gen_kwargs │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
NameError: name 'Optional' is not defined
[2023-05-27 11:30:35,479] [INFO] [launch.py:428:sigkill_handler] Killing subprocess 58780
[2023-05-27 11:30:35,505] [INFO] [launch.py:428:sigkill_handler] Killing subprocess 58781
[2023-05-27 11:30:35,505] [ERROR] [launch.py:434:sigkill_handler] ['/home/mdisk2/tanjunwen/anaconda3/bin/python', '-u', 'finetune_ptuning.py', '--local_rank=1', '--train_path', 'junshi/train.json', '--max_len', '128', '--max_input_len', '256', '--model_name_or_path', '/chatGLM-6B', '--tokenizer_name/chatGLM-6B', '--per_device_train_batch_size', '8', '--gradient_accumulation_steps', '4', '--num_train_epochs', '1', '--save_steps', '1000', '--learning_rate', '2e-2', '--fp16', '--logging_steps', '50', '--prefix_projection', 'True', '--pre_seq_len', '128', '--output_dir', 'output', '--deepspeed', 'ds_config.json'] exits with return code = 1
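
The traceback points at the type annotations on `ModifiedTrainer.evaluate`. Python evaluates annotations when the `def` statement runs, so `Optional` and `List` must already be imported from `typing` at that point. A minimal sketch of the likely fix follows, assuming the import is simply missing from `finetune_ptuning.py`. The `Dataset` and `Trainer` classes here are hypothetical stand-ins so the snippet runs without `torch` or `transformers`; in the real script they would come from those libraries.

```python
# Likely fix: import the typing names used in the annotations.
from typing import List, Optional


class Dataset:
    """Hypothetical stand-in for torch.utils.data.Dataset."""


class Trainer:
    """Hypothetical stand-in for transformers.Trainer."""

    def evaluate(self, eval_dataset=None, ignore_keys=None, metric_key_prefix="eval"):
        # Placeholder result so the sketch has something to return.
        return {f"{metric_key_prefix}_samples": 0}


class ModifiedTrainer(Trainer):
    def evaluate(
        self,
        eval_dataset: Optional[Dataset] = None,   # NameError here without the import
        ignore_keys: Optional[List[str]] = None,
        metric_key_prefix: str = "eval",
        **gen_kwargs,
    ):
        # gen_kwargs is accepted but ignored in this sketch.
        return super().evaluate(eval_dataset, ignore_keys, metric_key_prefix)


result = ModifiedTrainer().evaluate()
```

If the class definition now loads without a `NameError`, the same one-line import at the top of `finetune_ptuning.py` should let both DeepSpeed subprocesses get past line 34.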
