hscspring / hcgf

Humanable Chat Generative-model Fine-tuning | LLM fine-tuning

License: Apache License 2.0

Makefile 0.23% Python 84.71% Jupyter Notebook 14.87% Shell 0.18%

chatglm chatgpt fine-tuning chatglm2 large-language-models llama llm

hcgf's Issues

Cannot import transformers.generation; how can this be resolved?

import hcgf
Traceback (most recent call last):
File "", line 1, in
File "/home/work/hcgf/hcgf/__init__.py", line 1, in
from .sft import GlmLora
File "/home/work/hcgf/hcgf/sft/__init__.py", line 1, in
from .lora_ft import GlmLora
File "/home/work/hcgf/hcgf/sft/lora_ft.py", line 13, in
from .chatglm import ChatGLMForConditionalGeneration, ChatGLMTokenizer
File "/home/work/hcgf/hcgf/sft/chatglm/__init__.py", line 1, in
from .modeling_chatglm import ChatGLMForConditionalGeneration
File "/home/work/hcgf/hcgf/sft/chatglm/modeling_chatglm.py", line 30, in
from transformers.generation.logits_process import LogitsProcessor
ModuleNotFoundError: No module named 'transformers.generation'

transformers.generation cannot be imported. How can I resolve this? Thanks!
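
One plausible cause, offered as an assumption rather than a confirmed diagnosis, is an older transformers release in which the generation code had not yet been moved into the transformers.generation package. Upgrading transformers would then be the simplest fix. The sketch below shows a generic fallback import; first_importable is a hypothetical helper, and the older module path in the comment is quoted from memory and should be checked against the installed version.

```python
import importlib
from types import ModuleType
from typing import Iterable


def first_importable(candidates: Iterable[str]) -> ModuleType:
    """Return the first module in `candidates` that can be imported."""
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
            errors.append(f"{name}: {exc}")
    raise ImportError("none of the candidates could be imported: " + "; ".join(errors))


# Intended use (assumed module paths, newer layout first, older layout second):
#   module = first_importable([
#       "transformers.generation.logits_process",
#       "transformers.generation_logits_process",
#   ])
#   LogitsProcessor = module.LogitsProcessor
```

If neither path imports, the raised ImportError lists both failures, which usually makes it clear whether transformers is missing entirely or just too old.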

Cannot call eval() on the model directly

gl = hcgf.GlmLora("THUDM/chatglm-6b", device="cuda:0")
gl.load_pretrained("lora.pt")
gl.eval()

Without the line gl.load_pretrained("lora.pt"), the following error occurs:

Loading tokenizer THUDM/chatglm-6b
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Switch to inference mode...
Traceback (most recent call last):
  File "/home/user/git/hcgf/test2/webui.py", line 12, in <module>
    gl.eval()
  File "/home/user/anaconda3/envs/glm-finetune/lib/python3.10/site-packages/hcgf/sft/ft.py", line 279, in eval
    self.model.half()
AttributeError: 'GlmLora' object has no attribute 'model'. Did you mean: 'mode'?

test number must less than total number

What should I do when I hit this error? I modified the code slightly and found that both values are 0.

JSONDecodeError is raised if the dataset contains newline characters

If the dataset contains newline characters, the following error is raised:
Traceback (most recent call last):
File "", line 1, in
File "/mnt/data/dev/hcgf/hcgf/hcgf/sft/lora_ft.py", line 130, in load_data
self.dataloader = GlmDataLoader(data_path, self.tokenizer, max_seq_len)
File "/mnt/data/dev/hcgf/hcgf/hcgf/dataloader/data_loader.py", line 28, in __init__
self.data = self._read_files(data_path)
File "/mnt/data/dev/hcgf/hcgf/hcgf/dataloader/data_loader.py", line 34, in _read_files
js = json.loads(line.text.strip())
File "/mnt/data/dev/anaconda3/envs/pytorch1131/lib/python3.10/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/mnt/data/dev/anaconda3/envs/pytorch1131/lib/python3.10/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/mnt/data/dev/anaconda3/envs/pytorch1131/lib/python3.10/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 1 column 63 (char 62)

Fix: in hcgf/dataloader/data_loader.py, line 34, change js = json.loads(line.text.strip()) to js = json.loads(line.text.strip(), strict=False)
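
The effect of strict=False can be reproduced with the standard library alone. This is a minimal sketch: the field names prompt and completion and the helper parse_line are hypothetical, chosen only for illustration. It also shows that a file written with json.dumps escapes the newline, so the default strict parser can read it.

```python
import json

# A JSON line whose string value contains a raw (unescaped) newline character.
# The field names are hypothetical.
line = '{"prompt": "hello", "completion": "first line\nsecond line"}'


def parse_line(text: str, strict: bool = True) -> dict:
    """Parse one JSON line; strict=False allows control characters inside strings."""
    return json.loads(text.strip(), strict=strict)


try:
    parse_line(line)
except json.JSONDecodeError as exc:
    print("strict parse failed:", exc.msg)

# The lenient parser accepts the raw newline and keeps it in the value.
record = parse_line(line, strict=False)
print(repr(record["completion"]))

# Re-serialising escapes the newline, so the result parses in strict mode too.
escaped = json.dumps(record, ensure_ascii=False)
assert json.loads(escaped) == record
```

Cleaning the data once with json.dumps avoids patching the installed library, at the cost of an extra preprocessing step.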

UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 68: illegal multibyte sequence

import hcgf
gl = hcgf.GlmLora("THUDM/chatglm-6b", device="cuda:0", lora_r=32)
Error: UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 68: illegal multibyte sequence
My environment is Windows.
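
The issue does not include a traceback, so the failing call is unknown. A common cause on Chinese-locale Windows, assumed here, is a file being opened without an explicit encoding, in which case Python falls back to the locale encoding (gbk) while the file is actually UTF-8. A minimal sketch of the explicit-encoding pattern, with read_text_utf8 as a hypothetical helper name:

```python
import tempfile
from pathlib import Path


def read_text_utf8(path: Path) -> str:
    """Read a text file as UTF-8 regardless of the platform's default encoding."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


# Round-trip a small UTF-8 file to show the explicit encoding in action.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("你是谁？", encoding="utf-8")
    content = read_text_utf8(sample)

print(content)
```

When the open() call lives in third-party code that cannot easily be edited, Python's UTF-8 mode (setting the environment variable PYTHONUTF8=1, or running python -X utf8) makes UTF-8 the default text encoding instead.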

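Cannot find chatgpt_finetune_faq.json from the example

I can't find the chatgpt_finetune_faq.json file used in the example. Where can I download it? I'd like to understand the format so I can fine-tune on my own data. Thanks.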

Fine-tuning achieved nothing

I fine-tuned with tests/test_data/test_data.json, and the model still doesn't recognize anyone.

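Out of GPU memory after switching to a larger dataset

As the title says: training with 800 samples used only a little over 10 GB of GPU memory, but after switching to 50,000 samples, 32 GB of GPU memory overflows immediately. What causes this?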

Inference logic I can't understand

hcgf/hcgf/sft/lora_ft.py

Line 152 in 7a2854b

def chat(self, inp: str, history: List[str] = None, max_len: int = 128):

answer = "".join(response)

I don't understand this line. Why is res, which is a list, left unused, while response, which is not a list, is the one being joined?
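
For context on what that line does in plain Python: str.join accepts any iterable of strings, and a str is itself an iterable of its characters, so joining a string with an empty separator returns an equal string. The sketch below assumes response is a plain str, as the issue describes; the sample values are made up, and whether the no-op is intentional in hcgf is not something this example can confirm.

```python
# Assumed: `response` is a plain string, as described in the issue.
response = "我是助手"

# join() iterates over the characters and glues them back together unchanged.
answer = "".join(response)
assert answer == response

# Joining a list of pieces (hypothetical content) is the more usual pattern.
res = ["我是", "助手"]
assert "".join(res) == "我是助手"
```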

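[Notice] v0.2.0 released

This release took a while. I originally planned to add DDP and be done, but then decided to go all the way and implement FSDP as well. Along the way I considered using accelerate or deepspeed, and I have used both frameworks, but in the end I decided to write it myself for better control.

I have never wanted to offer too many configuration options and choices; for industrial use it is best to be straightforward and simply pick the best, or a relatively good, option. As a result this release involved a great many changes, and a design I had settled on one day was often torn down and redone when half written. Even now I am still far from satisfied, and it has not been tested very thoroughly either. Still, I really could not delay any longer.

The main update in this release is distributed fine-tuning (training), which must be run from the command line; see the README for details. Everything else is compatible with the previous version (a few parameters have been adjusted slightly).

A few additional notes on LoRA (and similar fine-tuning methods): unlike ordinary full-parameter tuning, it is essentially a way of "perturbing" an existing model, and some experiments have already shown that it causes forgetting. There is also no authoritative evaluation of how well it works on multi-turn dialogue. That said, we keep exploring new fine-tuning methods (or existing ones that were overlooked earlier because the conditions for them were not yet met).
The point of fine-tuning is to have the model memorize the new domain knowledge it is given without losing capability. This is not really updating existing knowledge but adding new domain knowledge. Of course, the model can also be made to update its existing knowledge based on the new domain knowledge. These are two different approaches, and the industry is still exploring them.
In practice, the more common and practical approach today is embedding retrieval plus LLM-assisted generation, which I mention in the Hugging-LLM tutorial. In short, first be clear about your own requirements.

I have not looked at the recent issues, mainly because Gmail unexpectedly stopped sending notifications (it used to). I check my email every day but saw no notifications, so I did not notice the new replies and issues. Sorry for not replying in time.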
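
As a rough illustration of the embedding retrieval plus LLM generation pattern mentioned above, here is a minimal sketch. Everything in it is an assumption for illustration: embed is a toy character-bigram stand-in for a real embedding model, and retrieve and build_prompt are hypothetical helpers, not part of hcgf or the Hugging-LLM tutorial. The resulting prompt would be passed to whatever LLM is in use.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    """Toy stand-in for an embedding model: character-bigram counts."""
    grams = [text[i:i + 2] for i in range(len(text) - 1)] or [text]
    return dict(Counter(grams))


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query: str, docs: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k documents most similar to the query, best first."""
    q = embed(query)
    scored = sorted(((cosine(q, embed(d)), d) for d in docs), reverse=True)
    return scored[:k]


def build_prompt(query: str, docs: List[str], k: int = 2) -> str:
    """Put the retrieved passages in front of the question for the LLM."""
    context = "\n".join(d for _, d in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
```

The design point is that domain knowledge stays in the document store and can be updated without touching model weights, which sidesteps the forgetting concern raised for LoRA.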

RuntimeError: Internal: [MASK] is already defined.

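This error appears when I load the model again after fine-tuning once. Please take a look, thanks.

How do I fine-tune with multiple GPUs?

As the title says.

Error during inference after fine-tuning in a multi-GPU setup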

gl.load_pretrained("lora-ckpt-last-1110.pt").eval()
Switch to inference mode...
gl.chat("你是谁?")
Traceback (most recent call last):
File "", line 1, in
File "/mnt/data/dev/anaconda3/envs/pytorch1131/lib/python3.10/site-packages/hcgf-0.0.7-py3.10.egg/hcgf/sft/lora_ft.py", line 253, in chat
File "/mnt/data/dev/anaconda3/envs/pytorch1131/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 43, in generator_context
response = gen.send(None)
File "/mnt/data/dev/anaconda3/envs/pytorch1131/lib/python3.10/site-packages/hcgf-0.0.7-py3.10.egg/hcgf/sft/lora_ft.py", line 225, in stream_chat
File "/mnt/data/dev/anaconda3/envs/pytorch1131/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 43, in generator_context
response = gen.send(None)
File "/mnt/data/dev/anaconda3/envs/pytorch1131/lib/python3.10/site-packages/hcgf-0.0.7-py3.10.egg/hcgf/sft/chatglm/modeling_chatglm.py", line 1345, in stream_generate
File "/mnt/data/dev/anaconda3/envs/pytorch1131/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/data/dev/anaconda3/envs/pytorch1131/lib/python3.10/site-packages/hcgf-0.0.7-py3.10.egg/hcgf/sft/chatglm/modeling_chatglm.py", line 1148, in forward
File "/mnt/data/dev/anaconda3/envs/pytorch1131/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/data/dev/anaconda3/envs/pytorch1131/lib/python3.10/site-packages/hcgf-0.0.7-py3.10.egg/hcgf/sft/chatglm/modeling_chatglm.py", line 895, in forward
File "/mnt/data/dev/anaconda3/envs/pytorch1131/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/data/dev/anaconda3/envs/pytorch1131/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 160, in forward
return F.embedding(
File "/mnt/data/dev/anaconda3/envs/pytorch1131/lib/python3.10/site-packages/torch/nn/functional.py", line 2210, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument index in method wrapper__index_select)

Adding the stop parameter to chat raises: slice() cannot be applied to a 0-dim tensor.

File "/home/user/git/hcgf/test2/./webui.py", line 97, in predict
    response, history = gl.chat(inp=input, history=history, max_len=max_length,stop=['<eop>'])
  File "/home/user/anaconda3/envs/glm-finetune/lib/python3.10/site-packages/hcgf/sft/lora_ft.py", line 250, in chat
    custom_stop_tensor_list = create_token_tensor_list(
  File "/home/user/anaconda3/envs/glm-finetune/lib/python3.10/site-packages/hcgf/utils/utils.py", line 60, in create_token_tensor_list
    tids = tokenizer(
IndexError: slice() cannot be applied to a 0-dim tensor.

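Is there any way to fix large repeated passages in the generated output?

As the title says: the generated output sometimes contains large repeated passages. Can this be addressed by adding a loss constraint or by post-processing?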
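
On the post-processing side, one simple option is to drop sentences that have already appeared in the output. The sketch below is an assumption about what such a filter could look like, not something hcgf provides; drop_repeated_sentences is a hypothetical helper, and it splits naively on sentence-ending punctuation, so text such as decimal numbers would also be split at the period.

```python
import re


def drop_repeated_sentences(text: str) -> str:
    """Keep only the first occurrence of each sentence, preserving order."""
    # Split right after Chinese or English sentence-ending punctuation.
    parts = re.split(r"(?<=[。！？!?.])", text)
    seen = set()
    kept = []
    for part in parts:
        key = part.strip()
        if not key or key in seen:
            continue  # skip empty pieces and sentences already emitted
        seen.add(key)
        kept.append(part)
    return "".join(kept)


print(drop_repeated_sentences("我是助手。我是助手。你好！"))
```

At generation time, Hugging Face transformers' generate() also accepts repetition_penalty and no_repeat_ngram_size; whether hcgf's chat forwards such options is not confirmed here.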

Can't have both fine-tuning and inference

# Fine-tuning
import hcgf
gl = hcgf.GlmLora("THUDM/chatglm-6b", device="cuda:0")
gl.load_data("./data/chatgpt_finetune_faq.json").tune()

# Inference
import hcgf
gl = hcgf.GlmLora("THUDM/chatglm-6b", device="cuda:0", infer_mode=True)
gl.load_pretrained("/path/to/lora_pt").eval()
gl.chat("你是谁?")

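The above is the fine-tuning and inference code. Is there a way to merge the two while calling GlmLora only once (or reusing GPU memory as much as possible): fine-tune first, then run inference directly on the fine-tuned result, then continue fine-tuning, without reloading the model over and over?

I tried and it did not work. There seems to be a conflict between half and float, and I don't know how to combine fine-tuning and inference cleanly.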

ModuleNotFoundError: No module named 'hcgf'

import hcgf
gl = hcgf.GlmLora("model", load_in_8bit=True, lora_r=32)

ModuleNotFoundError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_15056\644290863.py in
----> 1 import hcgf
2 gl = hcgf.GlmLora("model", load_in_8bit=True, lora_r=32)

ModuleNotFoundError: No module named 'hcgf'

gl.load_data("./tests/test_data.json").tune()

NameError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_15056\3044540026.py in
----> 1 gl.load_data("./tests/test_data.json").tune()

NameError: name 'gl' is not defined
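
The NameError is a consequence of the first cell failing: gl was never assigned because import hcgf raised. A frequent cause in notebooks, assumed here, is that the package was installed into a different Python environment than the one the kernel runs. The sketch below checks this from inside the notebook; is_installed is a hypothetical helper.

```python
import importlib.util
import sys


def is_installed(package: str) -> bool:
    """True if `package` is importable by the interpreter running this code."""
    return importlib.util.find_spec(package) is not None


# Show which interpreter the kernel uses and whether it can see the package.
print("interpreter:", sys.executable)
print("hcgf importable:", is_installed("hcgf"))
```

If the second line prints False, install the package with the interpreter shown on the first line (for example by running that interpreter with -m pip install), or use the %pip magic in Jupyter, which targets the kernel's environment.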

(glm-finetune) user@calculator:~/git/hcgf/test2$ python3 infer.py
Loading tokenizer and model of THUDM/chatglm-6b
Loading checkpoint shards: 100%|█████████████████████████████| 8/8 [00:05<00:00,  1.60it/s]
Processing peft model
/home/user/anaconda3/envs/glm-finetune/lib/python3.10/site-packages/peft/tuners/lora.py:173: UserWarning: fan_in_fan_out is set to True but the target module is not a Conv1D. Setting fan_in_fan_out to False.
  warnings.warn(
trainable params: 3670016 || all params: 6258876416
trainable%: 0.05863697820615348
Traceback (most recent call last):
  File "/home/user/git/hcgf/test2/infer.py", line 4, in <module>
    gl.load_pretrained("output/ckpt/lora-ckpt-last-52.pt").eval()
  File "/home/user/anaconda3/envs/glm-finetune/lib/python3.10/site-packages/hcgf/sft/lora_ft.py", line 148, in eval
    self.model.to(self.device).eval()
  File "/home/user/anaconda3/envs/glm-finetune/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1930, in eval
    return self.train(False)
  File "/home/user/anaconda3/envs/glm-finetune/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1911, in train
    module.train(mode)
  File "/home/user/anaconda3/envs/glm-finetune/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1911, in train
    module.train(mode)
  File "/home/user/anaconda3/envs/glm-finetune/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1911, in train
    module.train(mode)
  [Previous line repeated 4 more times]
  File "/home/user/anaconda3/envs/glm-finetune/lib/python3.10/site-packages/peft/tuners/lora.py", line 417, in train
    delta_w = F.conv1d(
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [8192, 8, 1, 1], but got 3-dimensional input of size [1, 16, 4096] instead

Inference code

# Inference
import hcgf
gl = hcgf.GlmLora("THUDM/chatglm-6b", device="cuda:0", infer_mode=True)
gl.load_pretrained("output/ckpt/lora-ckpt-last-52.pt").eval()
gl.chat("你是谁?")

# Fine-tuning
import hcgf
gl = hcgf.GlmLora("THUDM/chatglm-6b", device="cuda:0")
gl.load_data("./data/chatgpt_finetune_faq.json").tune()

I don't see how to set the save path...
