Comments (10)
Modify the device_map parameter to specify the device. To use a GPU, change device_map="auto" to device_map="cuda"; to use the CPU, change it to device_map="cpu".
In my case, device_map needs to be set to "cuda:0" instead of "cuda".
from chatglm3-finetune.
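The device choices above can be condensed into a tiny helper (a sketch only; `device_map_for` is a hypothetical name, not part of transformers):

```python
# Hypothetical helper mapping a plain target name to the device_map value
# that would be passed to AutoModel.from_pretrained.
def device_map_for(target: str) -> str:
    if target == "gpu":
        # Some setups need the explicit index "cuda:0" rather than "cuda".
        return "cuda:0"
    if target == "cpu":
        return "cpu"
    # "auto" lets accelerate spread layers across whatever devices exist.
    return "auto"

print(device_map_for("gpu"))  # → cuda:0
```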
Same here, PyTorch reserved too much memory...
Try modifying finetune.py line 38 to set load_in_8bit to True:
model = AutoModel.from_pretrained(
    "{your model path}", load_in_8bit=True, trust_remote_code=True, device_map="auto"
)
(Note: the trailing .cuda() is dropped here; with device_map="auto" the quantized model is already placed on GPU, and calling .cuda() on an 8-bit model raises an error.)
Modify finetune.py line 34 to set load_in_8bit to True and delete the .half() call:
Original:
model = ChatGLMForConditionalGeneration.from_pretrained(
"model", load_in_8bit=False, trust_remote_code=False, device_map="auto"
).half()
Modified:
model = ChatGLMForConditionalGeneration.from_pretrained(
"model", load_in_8bit=True, trust_remote_code=False, device_map="auto"
)
Error Message:
File "/root/miniconda3/envs/chatglm3-finetune/lib/python3.10/site-packages/torch/nn/functional.py", line 2210, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument index in method wrapper__index_select)
Just tested it: it needs 26 GB of GPU memory.
Mon Oct 30 11:36:26 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.105.01 Driver Version: 515.105.01 CUDA Version: 11.7 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA A100-SXM... On | 00000000:A1:00.0 Off | 0 |
| N/A 51C P0 327W / 400W | 26305MiB / 81920MiB | 95% Default |
| | | Disabled |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 713 C python 26303MiB |
+-----------------------------------------------------------------------------+
So it seems this is out of reach for a 24 GB card like the 3090, especially since multi-GPU fine-tuning doesn't appear to be supported either.
24 GB is enough. I'm running on a single 4090; one epoch takes about 10 seconds, which is quite fast.
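The numbers reported in this thread roughly match a back-of-envelope weight-size estimate (a sketch, assuming a ~6B-parameter model; activations, gradients, and optimizer state during training come on top of this):

```python
# Weights-only memory footprint for roughly 6B parameters.
params = 6_000_000_000
fp16_gib = params * 2 / 1024**3   # 2 bytes per weight in half precision
int8_gib = params * 1 / 1024**3   # 1 byte per weight with load_in_8bit
print(round(fp16_gib, 1), round(int8_gib, 1))  # → 11.2 5.6
```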
I also tried it on a 3090 today; it worked fine.