Excuse me! When I run bash train.sh, the following error occurs. How can I solve it?


Training problem, about mrnabati/centerfusion

Comments (7)

mrnabati commented on June 9, 2024

Hi. Looks like you are trying to access a GPU device that does not exist. If you only have one GPU, you need to change the following parameter in both train.sh and test.sh scripts:

export CUDA_VISIBLE_DEVICES=0,1

and also the --gpu parameter in those scripts.
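
To see why both settings matter, here is a minimal sketch in Python that simulates the device filtering. It assumes integer device IDs only and relies on the CUDA behaviour that reading of `CUDA_VISIBLE_DEVICES` stops at the first invalid index. The function names and the `physical_count` argument are made up for illustration and are not part of CenterFusion.

```python
def visible_devices(env_value, physical_count):
    """Simulate CUDA_VISIBLE_DEVICES filtering for integer device IDs.

    The list is read left to right and reading stops at the first
    invalid index, so only the entries before it stay visible.
    """
    visible = []
    for token in env_value.split(","):
        token = token.strip()
        if not token.isdigit() or int(token) >= physical_count:
            break
        visible.append(int(token))
    return visible


def invalid_requests(gpus_arg, env_value, physical_count):
    """Return the requested logical GPU indices that do not exist."""
    n_visible = len(visible_devices(env_value, physical_count))
    requested = [int(g) for g in gpus_arg.split(",")]
    return [g for g in requested if g >= n_visible]


if __name__ == "__main__":
    # One physical GPU, both settings left at "0,1".
    print(invalid_requests("0,1", "0,1", physical_count=1))
    # One physical GPU, both settings changed to "0".
    print(invalid_requests("0", "0", physical_count=1))
```

On a one-GPU machine the first call reports logical device 1 as missing, while the second reports nothing, which matches the advice to set both values to 0.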

from centerfusion.

AHappyFlyBird commented on June 9, 2024

Hi. Looks like you are trying to access a GPU device that does not exist. If you only have one GPU, you need to change the following parameter in both train.sh and test.sh scripts:
export CUDA_VISIBLE_DEVICES=0,1
and also the --gpu parameter in those scripts.

I am so happy to see your reply. Thanks for your work. Using your suggestion, I solved this problem, but a new issue appeared; the error is shown below. Looking forward to your reply.

Using tensorboardX
/usr/local/lib/python3.6/dist-packages/sklearn/utils/linear_assignment_.py:21: DeprecationWarning: The linear_assignment_ module is deprecated in 0.21 and will be removed from 0.23. Use scipy.optimize.linear_sum_assignment instead.
DeprecationWarning)
Fix size testing.
training chunk_sizes: [32]
input h w: 448 800
heads {'hm': 10, 'reg': 2, 'wh': 2, 'dep': 1, 'rot': 8, 'dim': 3, 'amodel_offset': 2, 'dep_sec': 1, 'rot_sec': 8, 'nuscenes_att': 8, 'velocity': 3}
weights {'hm': 1, 'reg': 1, 'wh': 0.1, 'dep': 1, 'rot': 1, 'dim': 1, 'amodel_offset': 1, 'dep_sec': 1, 'rot_sec': 1, 'nuscenes_att': 1, 'velocity': 1}
head conv {'hm': [256], 'reg': [256], 'wh': [256], 'dep': [256], 'rot': [256], 'dim': [256], 'amodel_offset': [256], 'dep_sec': [256, 256, 256], 'rot_sec': [256, 256, 256], 'nuscenes_att': [256, 256, 256], 'velocity': [256, 256, 256]}
Namespace(K=100, amodel_offset_weight=1, arch='dla_34', aug_rot=0, backbone='dla34', batch_size=32, chunk_sizes=[32], custom_dataset_ann_path='', custom_dataset_img_path='', custom_head_convs={'dep_sec': 3, 'rot_sec': 3, 'velocity': 3, 'nuscenes_att': 3}, data_dir='/content/drive/MyDrive/CenterFusion/src/lib/../../data', dataset='nuscenes', dataset_version='', debug=0, debug_dir='/content/drive/MyDrive/CenterFusion/src/lib/../../exp/ddd/centerfusion/debug', debugger_theme='white', demo='', dense_reg=1, dep_res_weight=1, dep_weight=1, depth_scale=1, dim_weight=1, disable_frustum=False, dla_node='dcn', down_ratio=4, eval=False, eval_n_plots=0, eval_render_curves=False, exp_dir='/content/drive/MyDrive/CenterFusion/src/lib/../../exp/ddd', exp_id='centerfusion', fix_res=True, fix_short=-1, flip=0.5, flip_test=False, fp_disturb=0, freeze_backbone=False, frustumExpansionRatio=0.0, gpus=[0], gpus_str='0', head_conv={'hm': [256], 'reg': [256], 'wh': [256], 'dep': [256], 'rot': [256], 'dim': [256], 'amodel_offset': [256], 'dep_sec': [256, 256, 256], 'rot_sec': [256, 256, 256], 'nuscenes_att': [256, 256, 256], 'velocity': [256, 256, 256]}, head_kernel=3, heads={'hm': 10, 'reg': 2, 'wh': 2, 'dep': 1, 'rot': 8, 'dim': 3, 'amodel_offset': 2, 'dep_sec': 1, 'rot_sec': 8, 'nuscenes_att': 8, 'velocity': 3}, hm_dist_thresh={0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 1, 7: 1, 8: 0, 9: 0}, hm_disturb=0, hm_hp_weight=1, hm_to_box_ratio=0.3, hm_transparency=0.7, hm_weight=1, hp_weight=1, hungarian=False, ignore_loaded_cats=[], img_format='jpg', input_h=448, input_res=800, input_w=800, iou_thresh=0, keep_res=False, kitti_split='3dop', layers_to_freeze=['base', 'dla_up', 'ida_up'], load_model='../models/centernet_baseline_e170.pth', load_results='', lost_disturb=0, lr=0.00025, lr_step=[50], ltrb=False, ltrb_amodal=False, ltrb_amodal_weight=0.1, ltrb_weight=0.1, master_batch_size=32, max_age=-1, max_frame_dist=3, max_pc=1000, max_pc_dist=60.0, model_output_list=False, msra_outchannel=256, 
neck='dlaup', new_thresh=0.3, nms=False, no_color_aug=False, no_pause=False, no_pre_img=False, non_block_test=False, normalize_depth=True, not_cuda_benchmark=False, not_max_crop=False, not_prefetch_test=False, not_rand_crop=True, not_set_cuda_env=False, not_show_bbox=False, not_show_number=False, num_classes=10, num_epochs=60, num_head_conv=1, num_img_channels=3, num_iters=-1, num_resnet_layers=101, num_stacks=1, num_workers=4, nuscenes_att=True, nuscenes_att_weight=1, off_weight=1, optim='adam', out_thresh=-1, output_h=112, output_res=200, output_w=200, pad=31, pc_atts=['x', 'y', 'z', 'dyn_prop', 'id', 'rcs', 'vx', 'vy', 'vx_comp', 'vy_comp', 'is_quality_valid', 'ambig_state', 'x_rms', 'y_rms', 'invalid_state', 'pdh0', 'vx_rms', 'vy_rms'], pc_feat_channels={'pc_dep': 0, 'pc_vx': 1, 'pc_vz': 2}, pc_feat_lvl=['pc_dep', 'pc_vx', 'pc_vz'], pc_roi_method='pillars', pc_z_offset=0.0, pillar_dims=[1.5, 0.2, 0.2], pointcloud=True, pre_hm=False, pre_img=False, pre_thresh=-1, print_iter=0, prior_bias=-4.6, public_det=False, qualitative=False, r_a=250, r_b=5, radar_sweeps=3, reg_loss='l1', reset_hm=False, resize_video=False, resume=False, reuse_hm=False, root_dir='/content/drive/MyDrive/CenterFusion/src/lib/../..', rot_weight=1, rotate=0, run_dataset_eval=True, same_aug_pre=False, save_all=False, save_dir='/content/drive/MyDrive/CenterFusion/src/lib/../../exp/ddd/centerfusion', save_framerate=30, save_img_suffix='', save_imgs=[], save_point=[20, 40, 50], save_results=False, save_video=False, scale=0, secondary_heads=['velocity', 'nuscenes_att', 'dep_sec', 'rot_sec'], seed=317, shift=0.1, show_track_color=False, show_velocity=False, shuffle_train=True, sigmoid_dep_sec=True, skip_first=-1, sort_det_by_dist=False, tango_color=False, task='ddd', test_dataset='nuscenes', test_focal_length=-1, test_scales=[1.0], track_thresh=0.3, tracking=False, tracking_weight=1, train_split='mini_train', trainval=False, transpose_video=False, use_loaded_results=False, val_intervals=1, 
val_split='mini_val', velocity=True, velocity_weight=1, video_h=512, video_w=512, vis_gt_bev='', vis_thresh=0.3, warm_start_weights=False, weights={'hm': 1, 'reg': 1, 'wh': 0.1, 'dep': 1, 'rot': 1, 'dim': 1, 'amodel_offset': 1, 'dep_sec': 1, 'rot_sec': 1, 'nuscenes_att': 1, 'velocity': 1}, wh_weight=0.1, zero_pre_hm=False, zero_tracking=False)
Creating model...
Using node type: (<class 'model.networks.dla.DeformConv'>, <class 'model.networks.dla.DeformConv'>)
Warning: No ImageNet pretrain!!
loaded ../models/centernet_baseline_e170.pth, epoch 28
Skip loading parameter nuscenes_att.0.weight, required shapetorch.Size([256, 67, 3, 3]), loaded shapetorch.Size([256, 64, 3, 3]).
Skip loading parameter nuscenes_att.2.weight, required shapetorch.Size([256, 256, 1, 1]), loaded shapetorch.Size([8, 256, 1, 1]).
Skip loading parameter nuscenes_att.2.bias, required shapetorch.Size([256]), loaded shapetorch.Size([8]).
Skip loading parameter velocity.0.weight, required shapetorch.Size([256, 67, 3, 3]), loaded shapetorch.Size([256, 64, 3, 3]).
Skip loading parameter velocity.2.weight, required shapetorch.Size([256, 256, 1, 1]), loaded shapetorch.Size([3, 256, 1, 1]).
Skip loading parameter velocity.2.bias, required shapetorch.Size([256]), loaded shapetorch.Size([3]).
No param dep_sec.0.weight.
No param dep_sec.0.bias.
No param dep_sec.2.weight.
No param dep_sec.2.bias.
No param dep_sec.4.weight.
No param dep_sec.4.bias.
No param dep_sec.6.weight.
No param dep_sec.6.bias.
No param rot_sec.0.weight.
No param rot_sec.0.bias.
No param rot_sec.2.weight.
No param rot_sec.2.bias.
No param rot_sec.4.weight.
No param rot_sec.4.bias.
No param rot_sec.6.weight.
No param rot_sec.6.bias.
No param nuscenes_att.4.weight.
No param nuscenes_att.4.bias.
No param nuscenes_att.6.weight.
No param nuscenes_att.6.bias.
No param velocity.4.weight.
No param velocity.4.bias.
No param velocity.6.weight.
No param velocity.6.bias.
Setting up validation data...
Dataset version
==> initializing mini_val data from /content/drive/MyDrive/CenterFusion/src/lib/../../data/nuscenes/annotations_3sweeps/mini_val.json,
images from /content/drive/MyDrive/CenterFusion/src/lib/../../data/nuscenes ...
loading annotations into memory...
Done (t=0.99s)
creating index...
index created!
Loaded mini_val 486 samples
Setting up train data...
Dataset version
==> initializing mini_train data from /content/drive/MyDrive/CenterFusion/src/lib/../../data/nuscenes/annotations_3sweeps/mini_train.json,
images from /content/drive/MyDrive/CenterFusion/src/lib/../../data/nuscenes ...
loading annotations into memory...
Done (t=3.87s)
creating index...
index created!
Loaded mini_train 1938 samples
Starting training...
ddd/centerfusion
Traceback (most recent call last):
  File "main.py", line 140, in <module>
    main(opt)
  File "main.py", line 84, in main
    log_dict_train, _ = trainer.train(epoch, train_loader)
  File "/content/drive/MyDrive/CenterFusion/src/lib/trainer.py", line 406, in train
    return self.run_epoch('train', epoch, data_loader)
  File "/content/drive/MyDrive/CenterFusion/src/lib/trainer.py", line 178, in run_epoch
    output, loss, loss_stats = model_with_loss(batch, phase)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/CenterFusion/src/lib/trainer.py", line 123, in forward
    outputs = self.model(batch['image'], pc_hm=pc_hm, pc_dep=pc_dep, calib=calib)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/CenterFusion/src/lib/model/networks/base_model.py", line 91, in forward
    feats = self.img2feats(x)
  File "/content/drive/MyDrive/CenterFusion/src/lib/model/networks/dla.py", line 622, in img2feats
    x = self.dla_up(x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/CenterFusion/src/lib/model/networks/dla.py", line 572, in forward
    ida(layers, len(layers) -i - 2, len(layers))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/CenterFusion/src/lib/model/networks/dla.py", line 543, in forward
    layers[i] = upsample(project(layers[i]))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/CenterFusion/src/lib/model/networks/dla.py", line 516, in forward
    x = self.conv(x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/CenterFusion/src/lib/model/networks/DCNv2/dcn_v2.py", line 128, in forward
    self.deformable_groups)
  File "/content/drive/MyDrive/CenterFusion/src/lib/model/networks/DCNv2/dcn_v2.py", line 31, in forward
    ctx.deformable_groups)
RuntimeError: Not compiled with GPU support (dcn_v2_forward at /content/drive/My Drive/CenterFusion/src/lib/model/networks/DCNv2/src/dcn_v2.h:35)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x33 (0x7f726a55b273 in /usr/local/lib/python3.6/dist-packages/torch/lib/libc10.so)
frame #1: dcn_v2_forward(at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, int, int, int, int, int, int, int, int, int) + 0x159 (0x7f7249ae4fd9 in /content/drive/MyDrive/CenterFusion/src/lib/model/networks/DCNv2/_ext.cpython-36m-x86_64-linux-gnu.so)
frame #2: + 0x16629 (0x7f7249af2629 in /content/drive/MyDrive/CenterFusion/src/lib/model/networks/DCNv2/_ext.cpython-36m-x86_64-linux-gnu.so)
frame #3: + 0x12a2d (0x7f7249aeea2d in /content/drive/MyDrive/CenterFusion/src/lib/model/networks/DCNv2/_ext.cpython-36m-x86_64-linux-gnu.so)
frame #4: python3() [0x50a4a5]

frame #6: python3() [0x507be4]
frame #7: python3() [0x588c8b]
frame #9: THPFunction_apply(_object, _object) + 0x9df (0x7f72b49f191f in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_python.so)
frame #10: python3() [0x50a12f]
frame #13: python3() [0x594a01]
frame #16: python3() [0x507be4]
frame #18: python3() [0x594a01]
frame #19: python3() [0x54a971]
frame #21: python3() [0x50a433]
frame #24: python3() [0x594a01]
frame #27: python3() [0x507be4]
frame #29: python3() [0x594a01]
frame #30: python3() [0x54a971]
frame #32: python3() [0x50a433]
frame #35: python3() [0x594a01]
frame #38: python3() [0x507be4]
frame #40: python3() [0x594a01]
frame #41: python3() [0x54a971]
frame #43: python3() [0x50a433]
frame #46: python3() [0x594a01]
frame #49: python3() [0x507be4]
frame #51: python3() [0x594a01]
frame #52: python3() [0x54a971]
frame #54: python3() [0x50a433]
frame #56: python3() [0x5095c8]
frame #57: python3() [0x50a2fd]
frame #59: python3() [0x507be4]
frame #61: python3() [0x594a01]


mrnabati commented on June 9, 2024

Make sure you build the DCNv2 library after installing PyTorch and also PyTorch is installed with GPU support. This error usually happens when DCNv2 is built with a PyTorch without GPU support.
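
Before rebuilding, it can help to print what PyTorch itself reports. The following is a minimal sketch, not the DCNv2 build logic: it assumes that a GPU build needs a CUDA-enabled PyTorch, a visible GPU, and a CUDA toolkit directory that PyTorch can locate. `describe_build_environment` and `looks_ready_for_gpu_build` are hypothetical helper names.

```python
def describe_build_environment():
    """Collect the facts that matter before rebuilding a CUDA extension."""
    try:
        import torch
        from torch.utils.cpp_extension import CUDA_HOME
    except ImportError:
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None for CPU-only PyTorch builds.
        "torch_cuda_version": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        # Toolkit directory PyTorch found for compiling extensions, or None.
        "cuda_home": CUDA_HOME,
    }


def looks_ready_for_gpu_build(info):
    """Assumed rule of thumb: all three conditions must hold."""
    return bool(
        info.get("torch_cuda_version")
        and info.get("cuda_available")
        and info.get("cuda_home")
    )


if __name__ == "__main__":
    info = describe_build_environment()
    for key, value in info.items():
        print(key, "=", value)
    print("ready for GPU build:", looks_ready_for_gpu_build(info))
```

If `cuda_home` comes back as `None` while `cuda_available` is `True`, PyTorch can run on the GPU but may not be able to compile GPU extensions, so the toolkit installation is worth checking before running make.sh again.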


fabrizioschiano commented on June 9, 2024

I get the same error when running:

bash experiments/test.sh

The error is:

  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/dla.py", line 516, in forward
    x = self.conv(x)
  File "/home/fabrizioschiano/.virtualenvs/centerfusion/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/DCNv2/dcn_v2.py", line 161, in forward
    return dcn_v2_conv(
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/DCNv2/dcn_v2.py", line 23, in forward
    output = _backend.dcn_v2_forward(
RuntimeError: Not compiled with GPU support

Using tensorboardX
Fix size testing.
training chunk_sizes: [32]
input h w: 448 800
heads {'hm': 10, 'reg': 2, 'wh': 2, 'dep': 1, 'rot': 8, 'dim': 3, 'amodel_offset': 2, 'dep_sec': 1, 'rot_sec': 8, 'nuscenes_att': 8, 'velocity': 3}
weights {'hm': 1, 'reg': 1, 'wh': 0.1, 'dep': 1, 'rot': 1, 'dim': 1, 'amodel_offset': 1, 'dep_sec': 1, 'rot_sec': 1, 'nuscenes_att': 1, 'velocity': 1}
head conv {'hm': [256], 'reg': [256], 'wh': [256], 'dep': [256], 'rot': [256], 'dim': [256], 'amodel_offset': [256], 'dep_sec': [256, 256, 256], 'rot_sec': [256, 256, 256], 'nuscenes_att': [256, 256, 256], 'velocity': [256, 256, 256]}
Namespace(K=100, amodel_offset_weight=1, arch='dla_34', aug_rot=0, backbone='dla34', batch_size=32, chunk_sizes=[32], custom_dataset_ann_path='', custom_dataset_img_path='', custom_head_convs={'dep_sec': 3, 'rot_sec': 3, 'velocity': 3, 'nuscenes_att': 3}, data_dir='/home/fabrizioschiano/repositories/CenterFusion/src/lib/../../data', dataset='nuscenes', dataset_version='', debug=0, debug_dir='/home/fabrizioschiano/repositories/CenterFusion/src/lib/../../exp/ddd/centerfusion/debug', debugger_theme='white', demo='', dense_reg=1, dep_res_weight=1, dep_weight=1, depth_scale=1, dim_weight=1, disable_frustum=False, dla_node='dcn', down_ratio=4, eval=False, eval_n_plots=0, eval_render_curves=False, exp_dir='/home/fabrizioschiano/repositories/CenterFusion/src/lib/../../exp/ddd', exp_id='centerfusion', fix_res=True, fix_short=-1, flip=0.5, flip_test=True, fp_disturb=0, freeze_backbone=False, frustumExpansionRatio=0.0, gpus=[0], gpus_str='0', head_conv={'hm': [256], 'reg': [256], 'wh': [256], 'dep': [256], 'rot': [256], 'dim': [256], 'amodel_offset': [256], 'dep_sec': [256, 256, 256], 'rot_sec': [256, 256, 256], 'nuscenes_att': [256, 256, 256], 'velocity': [256, 256, 256]}, head_kernel=3, heads={'hm': 10, 'reg': 2, 'wh': 2, 'dep': 1, 'rot': 8, 'dim': 3, 'amodel_offset': 2, 'dep_sec': 1, 'rot_sec': 8, 'nuscenes_att': 8, 'velocity': 3}, hm_dist_thresh={0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 1, 7: 1, 8: 0, 9: 0}, hm_disturb=0, hm_hp_weight=1, hm_to_box_ratio=0.3, hm_transparency=0.7, hm_weight=1, hp_weight=1, hungarian=False, ignore_loaded_cats=[], img_format='jpg', input_h=448, input_res=800, input_w=800, iou_thresh=0, keep_res=False, kitti_split='3dop', layers_to_freeze=['base', 'dla_up', 'ida_up'], load_model='../models/centerfusion_e60.pth', load_results='', lost_disturb=0, lr=0.000125, lr_step=[60], ltrb=False, ltrb_amodal=False, ltrb_amodal_weight=0.1, ltrb_weight=0.1, master_batch_size=32, max_age=-1, max_frame_dist=3, max_pc=1000, max_pc_dist=60.0, 
model_output_list=False, msra_outchannel=256, neck='dlaup', new_thresh=0.3, nms=False, no_color_aug=False, no_pause=False, no_pre_img=False, non_block_test=False, normalize_depth=True, not_cuda_benchmark=False, not_max_crop=False, not_prefetch_test=False, not_rand_crop=False, not_set_cuda_env=False, not_show_bbox=False, not_show_number=False, num_classes=10, num_epochs=70, num_head_conv=1, num_img_channels=3, num_iters=-1, num_resnet_layers=101, num_stacks=1, num_workers=4, nuscenes_att=True, nuscenes_att_weight=1, off_weight=1, optim='adam', out_thresh=-1, output_h=112, output_res=200, output_w=200, pad=31, pc_atts=['x', 'y', 'z', 'dyn_prop', 'id', 'rcs', 'vx', 'vy', 'vx_comp', 'vy_comp', 'is_quality_valid', 'ambig_state', 'x_rms', 'y_rms', 'invalid_state', 'pdh0', 'vx_rms', 'vy_rms'], pc_feat_channels={'pc_dep': 0, 'pc_vx': 1, 'pc_vz': 2}, pc_feat_lvl=['pc_dep', 'pc_vx', 'pc_vz'], pc_roi_method='pillars', pc_z_offset=-0.0, pillar_dims=[1.5, 0.2, 0.2], pointcloud=True, pre_hm=False, pre_img=False, pre_thresh=-1, print_iter=0, prior_bias=-4.6, public_det=False, qualitative=False, r_a=250, r_b=5, radar_sweeps=6, reg_loss='l1', reset_hm=False, resize_video=False, resume=False, reuse_hm=False, root_dir='/home/fabrizioschiano/repositories/CenterFusion/src/lib/../..', rot_weight=1, rotate=0, run_dataset_eval=True, same_aug_pre=False, save_all=False, save_dir='/home/fabrizioschiano/repositories/CenterFusion/src/lib/../../exp/ddd/centerfusion', save_framerate=30, save_img_suffix='', save_imgs=[], save_point=[90], save_results=False, save_video=False, scale=0, secondary_heads=['velocity', 'nuscenes_att', 'dep_sec', 'rot_sec'], seed=317, shift=0, show_track_color=False, show_velocity=False, shuffle_train=False, sigmoid_dep_sec=True, skip_first=-1, sort_det_by_dist=False, tango_color=False, task='ddd', test_dataset='nuscenes', test_focal_length=-1, test_scales=[1.0], track_thresh=0.3, tracking=False, tracking_weight=1, train_split='train', trainval=False, 
transpose_video=False, use_loaded_results=False, val_intervals=10, val_split='mini_val', velocity=True, velocity_weight=1, video_h=512, video_w=512, vis_gt_bev='', vis_thresh=0.3, warm_start_weights=False, weights={'hm': 1, 'reg': 1, 'wh': 0.1, 'dep': 1, 'rot': 1, 'dim': 1, 'amodel_offset': 1, 'dep_sec': 1, 'rot_sec': 1, 'nuscenes_att': 1, 'velocity': 1}, wh_weight=0.1, zero_pre_hm=False, zero_tracking=False)
Dataset version 
==> initializing mini_val data from /home/fabrizioschiano/repositories/CenterFusion/src/lib/../../data/nuscenes/annotations_6sweeps/mini_val.json, 
 images from /home/fabrizioschiano/repositories/CenterFusion/src/lib/../../data/nuscenes ...
loading annotations into memory...
Done (t=0.83s)
creating index...
index created!
Loaded mini_val 486 samples
Creating model...
Using node type: (<class 'model.networks.dla.DeformConv'>, <class 'model.networks.dla.DeformConv'>)
Warning: No ImageNet pretrain!!
loaded ../models/centerfusion_e60.pth, epoch 60
Traceback (most recent call last):
  File "test.py", line 215, in <module>
    prefetch_test(opt)
  File "test.py", line 125, in prefetch_test
    ret = detector.run(pre_processed_images)
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/detector.py", line 118, in run
    output, dets, forward_time = self.process(
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/detector.py", line 321, in process
    output = self.model(images, pc_dep=pc_dep, calib=calib)[-1]
  File "/home/fabrizioschiano/.virtualenvs/centerfusion/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/base_model.py", line 91, in forward
    feats = self.img2feats(x)
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/dla.py", line 622, in img2feats
    x = self.dla_up(x)
  File "/home/fabrizioschiano/.virtualenvs/centerfusion/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/dla.py", line 572, in forward
    ida(layers, len(layers) -i - 2, len(layers))
  File "/home/fabrizioschiano/.virtualenvs/centerfusion/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/dla.py", line 543, in forward
    layers[i] = upsample(project(layers[i]))
  File "/home/fabrizioschiano/.virtualenvs/centerfusion/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/dla.py", line 516, in forward
    x = self.conv(x)
  File "/home/fabrizioschiano/.virtualenvs/centerfusion/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/DCNv2/dcn_v2.py", line 161, in forward
    return dcn_v2_conv(
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/DCNv2/dcn_v2.py", line 23, in forward
    output = _backend.dcn_v2_forward(
RuntimeError: Not compiled with GPU support

The DCNv2 library seems to have been built correctly (I ran make.sh without errors).

I also checked my PyTorch installation.

When I print my PyTorch version with:

python -c "import torch; print(torch.__version__)"

I get

1.9.1+cu102

Then, if I do:

python -c "import torch; print(torch.cuda.is_available())"

I get:

True

What am I doing wrong?

I will come back here if I find a solution.


fabrizioschiano commented on June 9, 2024

@AHappyFlyBird, how did you solve your problem?


fabrizioschiano commented on June 9, 2024

After some research I understood that the problem was that I did not actually have CUDA installed.

You can check this by running:
nvcc -V

If nothing is returned (or the command is not found), CUDA is not installed.

I followed this guide:

https://docs.nvidia.com/cuda/cuda-installation-guide-linux/

And I installed CUDA from the following official link:

https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=20.04&target_type=deb_local

Then I installed the NVIDIA CUDA toolkit simply with:
sudo apt install nvidia-cuda-toolkit

Then you can do:

export CUDA_HOME=/usr/local/cuda-11

(Before doing so, check that this is the folder in which CUDA is actually installed on your machine.)
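
The two checks above (is `nvcc` reachable, and does `CUDA_HOME` point at a real toolkit folder) can be sketched in Python with the standard library only. `find_cuda_toolkit` is a hypothetical helper, and the assumption that the compiler sits at `bin/nvcc` inside `CUDA_HOME` reflects the usual Linux layout rather than anything stated in this thread.

```python
import os
import shutil
import subprocess


def find_cuda_toolkit():
    """Report whether nvcc is on PATH and whether CUDA_HOME looks valid."""
    nvcc_path = shutil.which("nvcc")
    cuda_home = os.environ.get("CUDA_HOME")
    report = {
        "nvcc_path": nvcc_path,
        "nvcc_output": None,
        "cuda_home": cuda_home,
        "cuda_home_has_nvcc": False,
    }
    if nvcc_path is not None:
        # Same information as running `nvcc --version` in a shell.
        result = subprocess.run(
            [nvcc_path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        report["nvcc_output"] = result.stdout.strip()
    if cuda_home:
        # Assumed layout: <CUDA_HOME>/bin/nvcc
        candidate = os.path.join(cuda_home, "bin", "nvcc")
        report["cuda_home_has_nvcc"] = os.path.isfile(candidate)
    return report


if __name__ == "__main__":
    for key, value in find_cuda_toolkit().items():
        print(key, "=", value)
```

If `nvcc_path` is `None` or `cuda_home_has_nvcc` is `False`, the export above probably points at the wrong folder or the toolkit is not installed.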

Then, I had another problem:

what(): No CUDA GPUs are available

I found out what to do thanks to this issue.

I had to change the line

export CUDA_VISIBLE_DEVICES=1

to

export CUDA_VISIBLE_DEVICES=0

in the test.sh of this repository.

I hope this helps someone else in the same situation.


fabrizioschiano commented on June 9, 2024

@mrnabati @AHappyFlyBird, do you think it is normal for all of the following to be printed at the beginning of training?

No param dep_sec.0.weight.
No param dep_sec.0.bias.
No param dep_sec.2.weight.
No param dep_sec.2.bias.
No param dep_sec.4.weight.
No param dep_sec.4.bias.
No param dep_sec.6.weight.
No param dep_sec.6.bias.
No param rot_sec.0.weight.
No param rot_sec.0.bias.
No param rot_sec.2.weight.
No param rot_sec.2.bias.
No param rot_sec.4.weight.
No param rot_sec.4.bias.
No param rot_sec.6.weight.
No param rot_sec.6.bias.
No param nuscenes_att.4.weight.
No param nuscenes_att.4.bias.
No param nuscenes_att.6.weight.
No param nuscenes_att.6.bias.
No param velocity.4.weight.
No param velocity.4.bias.
No param velocity.6.weight.
No param velocity.6.bias.
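
For context, in the earlier training log these lines appear right after `centernet_baseline_e170.pth` is loaded, and only for the secondary heads. That is consistent with a loader that reports model parameters the checkpoint does not contain. The sketch below is a hypothetical loader written to show how such messages could arise; it is not the repository's code, and every shape except the `velocity.0.weight` pair (taken from the log) is made up.

```python
def merge_checkpoint(model_shapes, checkpoint_shapes):
    """Compare parameter shapes and report what a checkpoint can fill.

    Both arguments map parameter names to shape tuples.
    Returns (names that can be loaded, list of log messages).
    """
    loaded = []
    messages = []
    for name, required in model_shapes.items():
        if name not in checkpoint_shapes:
            # The checkpoint has no such layer; it keeps its initial values.
            messages.append("No param {}.".format(name))
        elif checkpoint_shapes[name] != required:
            messages.append(
                "Skip loading parameter {}, required shape{}, loaded shape{}.".format(
                    name, required, checkpoint_shapes[name]
                )
            )
        else:
            loaded.append(name)
    return loaded, messages


if __name__ == "__main__":
    model = {
        "hm.0.weight": (256, 64, 3, 3),        # illustrative shape
        "velocity.0.weight": (256, 67, 3, 3),  # shapes from the log
        "dep_sec.0.weight": (256, 67, 3, 3),   # illustrative shape
    }
    checkpoint = {
        "hm.0.weight": (256, 64, 3, 3),
        "velocity.0.weight": (256, 64, 3, 3),
    }
    loaded, messages = merge_checkpoint(model, checkpoint)
    print("loaded:", loaded)
    for line in messages:
        print(line)
```

Under that reading, the messages would be expected when starting from a baseline checkpoint that lacks the secondary heads, but only the maintainers can confirm it.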


Training problem about centerfusion HOT 7 CLOSED
