Preparing to train...
Loading data...
Tokenizing data...
Creating model...
/home/wangyongbo/anaconda3/envs/py36/lib/python3.6/site-packages/torch/nn/modules/rnn.py:46: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.2 and num_layers=1
"num_layers={}".format(dropout, num_layers))
Starting epoch 0
Traceback (most recent call last):
File "/DATA2/wangyongbo/ms_marco/MSMARCO-Question-Answering/Baseline/experiment/train.py", line 230, in <module>
main()
File "/DATA2/wangyongbo/ms_marco/MSMARCO-Question-Answering/Baseline/experiment/train.py", line 224, in main
checkpoint, args.exp_folder)
File "/DATA2/wangyongbo/ms_marco/MSMARCO-Question-Answering/Baseline/experiment/checkpointing.py", line 61, in checkpoint
save_model(model, dest)
File "/DATA2/wangyongbo/ms_marco/MSMARCO-Question-Answering/Baseline/experiment/checkpointing.py", line 45, in save_model
save_params(destination, 'model/'+name, value)
File "/DATA2/wangyongbo/ms_marco/MSMARCO-Question-Answering/Baseline/experiment/checkpointing.py", line 31, in save_params
destination.create_dataset(path, data=params, compression='gzip')
File "/home/wangyongbo/anaconda3/envs/py36/lib/python3.6/site-packages/h5py/_hl/group.py", line 136, in create_dataset
dsid = dataset.make_new_dset(self, shape, dtype, data, **kwds)
File "/home/wangyongbo/anaconda3/envs/py36/lib/python3.6/site-packages/h5py/_hl/dataset.py", line 83, in make_new_dset
else base.guess_dtype(data)))
File "/home/wangyongbo/anaconda3/envs/py36/lib/python3.6/site-packages/numpy/core/numeric.py", line 538, in asarray
return array(a, dtype, copy=False, order=order)
File "/home/wangyongbo/anaconda3/envs/py36/lib/python3.6/site-packages/torch/tensor.py", line 450, in __array__
return self.numpy()
TypeError: can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
Launch script:
#!/bin/bash
export CUDA_VISIBLE_DEVICES=0,1
export MS_PATH=/DATA2/wangyongbo/ms_marco/MSMARCO-Question-Answering/Baseline/experiment
python3 $MS_PATH/train.py $MS_PATH $MS_PATH/data/train_v2.1.json \
--force_restart \
--cuda=True
Environment: Python 3.6, CUDA 9.0.
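The traceback ends in h5py's create_dataset, which calls numpy.asarray on the value it is given. That fails because the model parameters live on the GPU when --cuda=True. A minimal sketch of a possible workaround follows, assuming save_params in checkpointing.py takes (destination, path, params) as the traceback suggests. The helper name to_host_array is hypothetical and not part of the repository. It relies only on the detach(), cpu() and numpy() methods that torch.Tensor provides, so it leaves non-tensor values untouched.

```python
def to_host_array(params):
    """Return a value that NumPy and h5py can consume.

    Hypothetical helper: tensor-like objects (anything exposing detach, cpu
    and numpy, as torch.Tensor does) are detached from autograd, copied to
    host memory and converted. Any other value is returned unchanged.
    """
    if all(hasattr(params, name) for name in ("detach", "cpu", "numpy")):
        # detach() drops the autograd graph, cpu() copies from GPU to host,
        # numpy() then exposes the host copy as an ndarray.
        return params.detach().cpu().numpy()
    return params


def save_params(destination, path, params):
    # Assumption: destination is an open h5py.File or h5py.Group, and the
    # signature mirrors the call seen in the traceback.
    destination.create_dataset(
        path, data=to_host_array(params), compression="gzip"
    )
```

With this change the checkpoint step should receive host arrays regardless of whether training runs on CPU or GPU. The detach() call is included because parameters that require gradients cannot be converted with numpy() directly.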