When I ran train.py with a GPU, it seems that RAM ran out. My computer has 46G of memory in total: 16G physical RAM plus 30G virtual memory (swap).
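For scale, here is a quick back-of-envelope I did on the dataset footprint. The 128×128 RGB layout and the dtypes are my assumptions, not taken from train.py:

```python
# Rough estimate of the in-memory dataset size (assuming 128x128 RGB images;
# the actual storage dtype used by train.py is a guess).
n_images = 200_000
h, w, c = 128, 128, 3

bytes_uint8 = n_images * h * w * c   # 1 byte per channel value
bytes_float32 = bytes_uint8 * 4      # 4 bytes per channel value

print(f"uint8:   {bytes_uint8 / 2**30:.1f} GiB")    # ~9.2 GiB
print(f"float32: {bytes_float32 / 2**30:.1f} GiB")  # ~36.6 GiB
```

If the loader converts everything to float32 up front, the images alone would already exceed my 16G of physical RAM.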
$ python3 baseline/train.py --max_step=200 --debug --batch_size=96
mkdir: cannot create directory ‘./model’: File exists
loaded 10000 images
loaded 20000 images
loaded 30000 images
loaded 40000 images
loaded 50000 images
loaded 60000 images
loaded 70000 images
loaded 80000 images
loaded 90000 images
loaded 100000 images
loaded 110000 images
loaded 120000 images
loaded 130000 images
loaded 140000 images
loaded 150000 images
loaded 160000 images
loaded 170000 images
loaded 180000 images
loaded 190000 images
loaded 200000 images
finish loading data, 197999 training images, 2001 testing images
observation_space (96, 128, 128, 7) action_space 13
/home/rody/xu/npaint/LearningToPaint/baseline/DRL/ddpg.py:157: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
s0 =torch.tensor(self.state, device='cpu')
/home/rody/xu/npaint/LearningToPaint/baseline/DRL/ddpg.py:163: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
s1 =torch.tensor(state, device='cpu')
#0: steps:200 interval_time:9.08 train_time:0.00
#1: steps:400 interval_time:22.40 train_time:0.00
#2: steps:600 interval_time:19.66 train_time:6.90
#3: steps:800 interval_time:20.01 train_time:5.28
#4: steps:1000 interval_time:20.89 train_time:6.01
#5: steps:1200 interval_time:20.52 train_time:6.34
#6: steps:1400 interval_time:18.20 train_time:7.01
Killed
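As a side note, the two UserWarnings above come from re-wrapping an existing tensor with torch.tensor(...). A minimal sketch of the form PyTorch recommends instead (the variable name is from the traceback; the shape is made up):

```python
import torch

# Stand-in for self.state, which is already a tensor when the warning fires.
state = torch.zeros(96, 7, 128, 128)

# torch.tensor(state) copy-constructs and triggers the UserWarning;
# the recommended equivalent makes an explicit detached copy:
s0 = state.clone().detach().to('cpu')
```

This silences the warning but should not affect the memory issue.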
              total        used        free      shared  buff/cache   available
Mem:          15892       15627         139          11         125          81
Swap:         30273       30273           0
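One workaround I am considering is memory-mapping the image array from disk instead of loading everything up front, so the OS only pages in what a batch actually touches. This is my own sketch, not code from this repo:

```python
import os
import tempfile
import numpy as np

# One-time conversion: dump the images to a single .npy file on disk.
# (A small zero array stands in for the real 200,000-image dataset here.)
path = os.path.join(tempfile.mkdtemp(), "images.npy")
np.save(path, np.zeros((100, 128, 128, 3), dtype=np.uint8))

# At training time, memory-map the file instead of loading it:
# no bulk copy into RAM, data is paged in on demand.
images = np.load(path, mmap_mode="r")

batch = np.asarray(images[:8])  # only this slice is actually read from disk
print(batch.shape)  # (8, 128, 128, 3)
```

Would something like this be compatible with how train.py feeds batches, or is there a supported flag to limit how many images get loaded?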