chatglm_mutli_gpu_tuning's Introduction

Hi there 👋 This is Haitao Li.

  • 😀 I’m a second-year master's student in the Tsinghua IR Group, supervised by Prof. Yiqun Liu.
  • 🏆 My research lies in Information Retrieval and Legal Case Retrieval. I currently focus on more reliable and interpretable legal case retrieval techniques with large language models. I am also very curious about dense retrieval. My publications are available on my homepage.
  • 📫 Contact me via [email protected]

chatglm_mutli_gpu_tuning's People

Contributors

cshaitao

chatglm_mutli_gpu_tuning's Issues

Running bash ptuning.sh fails with: name 'Optional' is not defined

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/mdisk2/tanjunwen/gitprj/mChatGLM_mutli_gpu_tuning/finetune_ptuning.py:34 in │
│ │
│ 31 │ │ return super().forward(x).to(torch.float32) │
│ 32 │
│ 33 │
│ ❱ 34 class ModifiedTrainer(Trainer): │
│ 35 │ def evaluate( │
│ 36 │ │ self, │
│ 37 │ │ eval_dataset: Optional[Dataset] = None, │
│ │
│ /home/mdisk2/tanjunwen/gitprj/mChatGLM_mutli_gpu_tuning/finetune_ptuning.py:37 in │
│ ModifiedTrainer │
│ │
│ 34 class ModifiedTrainer(Trainer): │
│ 35 │ def evaluate( │
│ 36 │ │ self, │
│ ❱ 37 │ │ eval_dataset: Optional[Dataset] = None, │
│ 38 │ │ ignore_keys: Optional[List[str]] = None, │
│ 39 │ │ metric_key_prefix: str = "eval", │
│ 40 │ │ **gen_kwargs │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
NameError: name 'Optional' is not defined
[2023-05-27 11:30:35,479] [INFO] [launch.py:428:sigkill_handler] Killing subprocess 58780
[2023-05-27 11:30:35,505] [INFO] [launch.py:428:sigkill_handler] Killing subprocess 58781
[2023-05-27 11:30:35,505] [ERROR] [launch.py:434:sigkill_handler] ['/home/mdisk2/tanjunwen/anaconda3/bin/python', '-u', 'finetune_ptuning.py', '--local_rank=1', '--train_path', 'junshi/train.json', '--max_len', '128', '--max_input_len', '256', '--model_name_or_path', '/chatGLM-6B', '--tokenizer_name/chatGLM-6B', '--per_device_train_batch_size', '8', '--gradient_accumulation_steps', '4', '--num_train_epochs', '1', '--save_steps', '1000', '--learning_rate', '2e-2', '--fp16', '--logging_steps', '50', '--prefix_projection', 'True', '--pre_seq_len', '128', '--output_dir', 'output', '--deepspeed', 'ds_config.json'] exits with return code = 1
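
The traceback points at the type annotations on ModifiedTrainer.evaluate: Python evaluates `Optional[...]` and `List[...]` when the class body runs, so both names must be imported first. A minimal sketch of the likely fix follows, assuming the script is only missing the `typing` import. The `Dataset` and `Trainer` classes here are hypothetical stand-ins so the snippet runs on its own; in finetune_ptuning.py they would come from the libraries the script already imports.

```python
from typing import List, Optional  # the import the traceback suggests is missing


class Dataset:
    """Hypothetical stand-in for the dataset class used by the script."""


class Trainer:
    """Hypothetical stand-in for the base trainer class."""

    def evaluate(self, eval_dataset=None, ignore_keys=None, metric_key_prefix="eval"):
        return {f"{metric_key_prefix}_samples": 0}


class ModifiedTrainer(Trainer):
    def evaluate(
        self,
        eval_dataset: Optional[Dataset] = None,
        ignore_keys: Optional[List[str]] = None,
        metric_key_prefix: str = "eval",
        **gen_kwargs,
    ):
        # With Optional and List imported, the class definition no longer raises NameError.
        return super().evaluate(eval_dataset, ignore_keys, metric_key_prefix)


result = ModifiedTrainer().evaluate()
```

In the real script, only the first import line would need to be added near the top of the file; the rest is scaffolding for this example.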

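Inference output does not follow the dataset format

Dear author, I fine-tuned the model again using the dataset format you provided, but at inference time it still responds in chat style, says it cannot understand the request, and does not produce the output format of my downstream task. (My task is keyword expansion: the input is a set of keywords, the output is a paragraph, and the whole training set uses this format.) Is there a way to feed in the input directly and get the output? If not, what might be causing this?
[Screenshot attached to the original issue]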

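P-tuning training fails: "save_prefixencoder" argument is not defined

Hello, when I use your code for P-tuning I get the error TypeError: Trainer.__init__() got an unexpected keyword argument 'save_prefixencoder'. I checked the trainer code and this parameter is indeed not defined there, whereas the original ChatGLM's own trainer does define it. Should finetune_ptuning.py also use that custom trainer?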
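
The error in this report has a general shape: a keyword argument is passed to an `__init__` that does not declare it. The sketch below reproduces that shape and shows one possible workaround, a subclass that consumes the extra keyword before calling the base constructor. `BaseTrainer` and `PrefixTrainer` are hypothetical names used only for illustration, and this is not necessarily how the repository or the original ChatGLM trainer handles it.

```python
class BaseTrainer:
    """Hypothetical stand-in for a trainer whose __init__ lacks save_prefixencoder."""

    def __init__(self, model=None, args=None):
        self.model = model
        self.args = args


class PrefixTrainer(BaseTrainer):
    """Hypothetical subclass that accepts the extra keyword."""

    def __init__(self, *args, save_prefixencoder: bool = False, **kwargs):
        # Consume the extra keyword here so it never reaches the base __init__.
        self.save_prefixencoder = save_prefixencoder
        super().__init__(*args, **kwargs)


# The base class rejects the keyword, which mirrors the reported TypeError.
try:
    BaseTrainer(model="m", save_prefixencoder=True)
    base_error = None
except TypeError as exc:
    base_error = str(exc)

# The subclass accepts it and still initializes the base class normally.
trainer = PrefixTrainer(model="m", save_prefixencoder=True)
```

Storing the flag is only half of the job: a trainer that accepts it would also need to act on it when saving checkpoints, which this sketch does not attempt.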

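Trained for 5000 steps, but predictions are empty

This is how I wrote my inference code. I am not sure whether it is wrong, but every prediction comes out empty. If I point the path at the original model instead of my trained checkpoint, I do get results. Is there a problem with my inference code? I trained with DeepSpeed using the freeze method and ZeRO-2 parallelism.
[Three screenshots attached to the original issue]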
