Training problem about centerfusion (closed)

mrnabati commented on June 9, 2024
Training problem

from centerfusion.

Comments (7)

mrnabati commented on June 9, 2024

Hi. Looks like you are trying to access a GPU device that does not exist. If you only have one GPU, you need to change the following parameter in both train.sh and test.sh scripts:

export CUDA_VISIBLE_DEVICES=0,1

and also the --gpu parameter in those scripts.
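
As a rough illustration of the check behind this advice, the sketch below compares the indices listed in a CUDA_VISIBLE_DEVICES value with the number of GPUs actually present. The helper name `missing_device_ids` is hypothetical and not part of CenterFusion, and it assumes the variable holds plain integer indices (UUID-style entries are ignored here).

```python
def missing_device_ids(visible_devices: str, gpu_count: int) -> list:
    """Return the indices in a CUDA_VISIBLE_DEVICES string that do not exist.

    Assumes plain integer indices such as "0,1"; any other token is skipped.
    """
    requested = [
        int(token)
        for token in visible_devices.split(",")
        if token.strip().isdigit()
    ]
    # GPU indices start at 0, so valid indices are 0 .. gpu_count - 1.
    return [idx for idx in requested if idx >= gpu_count]


# With one physical GPU, index 1 does not exist, so "0,1" should become "0".
print(missing_device_ids("0,1", gpu_count=1))  # [1]
print(missing_device_ids("0", gpu_count=1))    # []
```

An empty list means every requested index refers to an existing device.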

AHappyFlyBird commented on June 9, 2024

I am so happy to see your reply. Thanks for your work. Your suggestion solved this problem, but a new issue appeared; the error is shown below. Looking forward to your reply.

Using tensorboardX
/usr/local/lib/python3.6/dist-packages/sklearn/utils/linear_assignment_.py:21: DeprecationWarning: The linear_assignment_ module is deprecated in 0.21 and will be removed from 0.23. Use scipy.optimize.linear_sum_assignment instead.
DeprecationWarning)
Fix size testing.
training chunk_sizes: [32]
input h w: 448 800
heads {'hm': 10, 'reg': 2, 'wh': 2, 'dep': 1, 'rot': 8, 'dim': 3, 'amodel_offset': 2, 'dep_sec': 1, 'rot_sec': 8, 'nuscenes_att': 8, 'velocity': 3}
weights {'hm': 1, 'reg': 1, 'wh': 0.1, 'dep': 1, 'rot': 1, 'dim': 1, 'amodel_offset': 1, 'dep_sec': 1, 'rot_sec': 1, 'nuscenes_att': 1, 'velocity': 1}
head conv {'hm': [256], 'reg': [256], 'wh': [256], 'dep': [256], 'rot': [256], 'dim': [256], 'amodel_offset': [256], 'dep_sec': [256, 256, 256], 'rot_sec': [256, 256, 256], 'nuscenes_att': [256, 256, 256], 'velocity': [256, 256, 256]}
Namespace(K=100, amodel_offset_weight=1, arch='dla_34', aug_rot=0, backbone='dla34', batch_size=32, chunk_sizes=[32], custom_dataset_ann_path='', custom_dataset_img_path='', custom_head_convs={'dep_sec': 3, 'rot_sec': 3, 'velocity': 3, 'nuscenes_att': 3}, data_dir='/content/drive/MyDrive/CenterFusion/src/lib/../../data', dataset='nuscenes', dataset_version='', debug=0, debug_dir='/content/drive/MyDrive/CenterFusion/src/lib/../../exp/ddd/centerfusion/debug', debugger_theme='white', demo='', dense_reg=1, dep_res_weight=1, dep_weight=1, depth_scale=1, dim_weight=1, disable_frustum=False, dla_node='dcn', down_ratio=4, eval=False, eval_n_plots=0, eval_render_curves=False, exp_dir='/content/drive/MyDrive/CenterFusion/src/lib/../../exp/ddd', exp_id='centerfusion', fix_res=True, fix_short=-1, flip=0.5, flip_test=False, fp_disturb=0, freeze_backbone=False, frustumExpansionRatio=0.0, gpus=[0], gpus_str='0', head_conv={'hm': [256], 'reg': [256], 'wh': [256], 'dep': [256], 'rot': [256], 'dim': [256], 'amodel_offset': [256], 'dep_sec': [256, 256, 256], 'rot_sec': [256, 256, 256], 'nuscenes_att': [256, 256, 256], 'velocity': [256, 256, 256]}, head_kernel=3, heads={'hm': 10, 'reg': 2, 'wh': 2, 'dep': 1, 'rot': 8, 'dim': 3, 'amodel_offset': 2, 'dep_sec': 1, 'rot_sec': 8, 'nuscenes_att': 8, 'velocity': 3}, hm_dist_thresh={0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 1, 7: 1, 8: 0, 9: 0}, hm_disturb=0, hm_hp_weight=1, hm_to_box_ratio=0.3, hm_transparency=0.7, hm_weight=1, hp_weight=1, hungarian=False, ignore_loaded_cats=[], img_format='jpg', input_h=448, input_res=800, input_w=800, iou_thresh=0, keep_res=False, kitti_split='3dop', layers_to_freeze=['base', 'dla_up', 'ida_up'], load_model='../models/centernet_baseline_e170.pth', load_results='', lost_disturb=0, lr=0.00025, lr_step=[50], ltrb=False, ltrb_amodal=False, ltrb_amodal_weight=0.1, ltrb_weight=0.1, master_batch_size=32, max_age=-1, max_frame_dist=3, max_pc=1000, max_pc_dist=60.0, model_output_list=False, msra_outchannel=256, 
neck='dlaup', new_thresh=0.3, nms=False, no_color_aug=False, no_pause=False, no_pre_img=False, non_block_test=False, normalize_depth=True, not_cuda_benchmark=False, not_max_crop=False, not_prefetch_test=False, not_rand_crop=True, not_set_cuda_env=False, not_show_bbox=False, not_show_number=False, num_classes=10, num_epochs=60, num_head_conv=1, num_img_channels=3, num_iters=-1, num_resnet_layers=101, num_stacks=1, num_workers=4, nuscenes_att=True, nuscenes_att_weight=1, off_weight=1, optim='adam', out_thresh=-1, output_h=112, output_res=200, output_w=200, pad=31, pc_atts=['x', 'y', 'z', 'dyn_prop', 'id', 'rcs', 'vx', 'vy', 'vx_comp', 'vy_comp', 'is_quality_valid', 'ambig_state', 'x_rms', 'y_rms', 'invalid_state', 'pdh0', 'vx_rms', 'vy_rms'], pc_feat_channels={'pc_dep': 0, 'pc_vx': 1, 'pc_vz': 2}, pc_feat_lvl=['pc_dep', 'pc_vx', 'pc_vz'], pc_roi_method='pillars', pc_z_offset=0.0, pillar_dims=[1.5, 0.2, 0.2], pointcloud=True, pre_hm=False, pre_img=False, pre_thresh=-1, print_iter=0, prior_bias=-4.6, public_det=False, qualitative=False, r_a=250, r_b=5, radar_sweeps=3, reg_loss='l1', reset_hm=False, resize_video=False, resume=False, reuse_hm=False, root_dir='/content/drive/MyDrive/CenterFusion/src/lib/../..', rot_weight=1, rotate=0, run_dataset_eval=True, same_aug_pre=False, save_all=False, save_dir='/content/drive/MyDrive/CenterFusion/src/lib/../../exp/ddd/centerfusion', save_framerate=30, save_img_suffix='', save_imgs=[], save_point=[20, 40, 50], save_results=False, save_video=False, scale=0, secondary_heads=['velocity', 'nuscenes_att', 'dep_sec', 'rot_sec'], seed=317, shift=0.1, show_track_color=False, show_velocity=False, shuffle_train=True, sigmoid_dep_sec=True, skip_first=-1, sort_det_by_dist=False, tango_color=False, task='ddd', test_dataset='nuscenes', test_focal_length=-1, test_scales=[1.0], track_thresh=0.3, tracking=False, tracking_weight=1, train_split='mini_train', trainval=False, transpose_video=False, use_loaded_results=False, val_intervals=1, 
val_split='mini_val', velocity=True, velocity_weight=1, video_h=512, video_w=512, vis_gt_bev='', vis_thresh=0.3, warm_start_weights=False, weights={'hm': 1, 'reg': 1, 'wh': 0.1, 'dep': 1, 'rot': 1, 'dim': 1, 'amodel_offset': 1, 'dep_sec': 1, 'rot_sec': 1, 'nuscenes_att': 1, 'velocity': 1}, wh_weight=0.1, zero_pre_hm=False, zero_tracking=False)
Creating model...
Using node type: (<class 'model.networks.dla.DeformConv'>, <class 'model.networks.dla.DeformConv'>)
Warning: No ImageNet pretrain!!
loaded ../models/centernet_baseline_e170.pth, epoch 28
Skip loading parameter nuscenes_att.0.weight, required shapetorch.Size([256, 67, 3, 3]), loaded shapetorch.Size([256, 64, 3, 3]).
Skip loading parameter nuscenes_att.2.weight, required shapetorch.Size([256, 256, 1, 1]), loaded shapetorch.Size([8, 256, 1, 1]).
Skip loading parameter nuscenes_att.2.bias, required shapetorch.Size([256]), loaded shapetorch.Size([8]).
Skip loading parameter velocity.0.weight, required shapetorch.Size([256, 67, 3, 3]), loaded shapetorch.Size([256, 64, 3, 3]).
Skip loading parameter velocity.2.weight, required shapetorch.Size([256, 256, 1, 1]), loaded shapetorch.Size([3, 256, 1, 1]).
Skip loading parameter velocity.2.bias, required shapetorch.Size([256]), loaded shapetorch.Size([3]).
No param dep_sec.0.weight.
No param dep_sec.0.bias.
No param dep_sec.2.weight.
No param dep_sec.2.bias.
No param dep_sec.4.weight.
No param dep_sec.4.bias.
No param dep_sec.6.weight.
No param dep_sec.6.bias.
No param rot_sec.0.weight.
No param rot_sec.0.bias.
No param rot_sec.2.weight.
No param rot_sec.2.bias.
No param rot_sec.4.weight.
No param rot_sec.4.bias.
No param rot_sec.6.weight.
No param rot_sec.6.bias.
No param nuscenes_att.4.weight.
No param nuscenes_att.4.bias.
No param nuscenes_att.6.weight.
No param nuscenes_att.6.bias.
No param velocity.4.weight.
No param velocity.4.bias.
No param velocity.6.weight.
No param velocity.6.bias.
Setting up validation data...
Dataset version
==> initializing mini_val data from /content/drive/MyDrive/CenterFusion/src/lib/../../data/nuscenes/annotations_3sweeps/mini_val.json,
images from /content/drive/MyDrive/CenterFusion/src/lib/../../data/nuscenes ...
loading annotations into memory...
Done (t=0.99s)
creating index...
index created!
Loaded mini_val 486 samples
Setting up train data...
Dataset version
==> initializing mini_train data from /content/drive/MyDrive/CenterFusion/src/lib/../../data/nuscenes/annotations_3sweeps/mini_train.json,
images from /content/drive/MyDrive/CenterFusion/src/lib/../../data/nuscenes ...
loading annotations into memory...
Done (t=3.87s)
creating index...
index created!
Loaded mini_train 1938 samples
Starting training...
ddd/centerfusion
Traceback (most recent call last):
  File "main.py", line 140, in <module>
    main(opt)
  File "main.py", line 84, in main
    log_dict_train, _ = trainer.train(epoch, train_loader)
  File "/content/drive/MyDrive/CenterFusion/src/lib/trainer.py", line 406, in train
    return self.run_epoch('train', epoch, data_loader)
  File "/content/drive/MyDrive/CenterFusion/src/lib/trainer.py", line 178, in run_epoch
    output, loss, loss_stats = model_with_loss(batch, phase)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/CenterFusion/src/lib/trainer.py", line 123, in forward
    outputs = self.model(batch['image'], pc_hm=pc_hm, pc_dep=pc_dep, calib=calib)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/CenterFusion/src/lib/model/networks/base_model.py", line 91, in forward
    feats = self.img2feats(x)
  File "/content/drive/MyDrive/CenterFusion/src/lib/model/networks/dla.py", line 622, in img2feats
    x = self.dla_up(x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/CenterFusion/src/lib/model/networks/dla.py", line 572, in forward
    ida(layers, len(layers) -i - 2, len(layers))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/CenterFusion/src/lib/model/networks/dla.py", line 543, in forward
    layers[i] = upsample(project(layers[i]))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/CenterFusion/src/lib/model/networks/dla.py", line 516, in forward
    x = self.conv(x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/CenterFusion/src/lib/model/networks/DCNv2/dcn_v2.py", line 128, in forward
    self.deformable_groups)
  File "/content/drive/MyDrive/CenterFusion/src/lib/model/networks/DCNv2/dcn_v2.py", line 31, in forward
    ctx.deformable_groups)
RuntimeError: Not compiled with GPU support (dcn_v2_forward at /content/drive/My Drive/CenterFusion/src/lib/model/networks/DCNv2/src/dcn_v2.h:35)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x33 (0x7f726a55b273 in /usr/local/lib/python3.6/dist-packages/torch/lib/libc10.so)
frame #1: dcn_v2_forward(at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, int, int, int, int, int, int, int, int, int) + 0x159 (0x7f7249ae4fd9 in /content/drive/MyDrive/CenterFusion/src/lib/model/networks/DCNv2/_ext.cpython-36m-x86_64-linux-gnu.so)
frame #2: + 0x16629 (0x7f7249af2629 in /content/drive/MyDrive/CenterFusion/src/lib/model/networks/DCNv2/_ext.cpython-36m-x86_64-linux-gnu.so)
frame #3: + 0x12a2d (0x7f7249aeea2d in /content/drive/MyDrive/CenterFusion/src/lib/model/networks/DCNv2/_ext.cpython-36m-x86_64-linux-gnu.so)
frame #4: python3() [0x50a4a5]

frame #6: python3() [0x507be4]
frame #7: python3() [0x588c8b]
frame #9: THPFunction_apply(_object*, _object*) + 0x9df (0x7f72b49f191f in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_python.so)
frame #10: python3() [0x50a12f]
frame #13: python3() [0x594a01]
frame #16: python3() [0x507be4]
frame #18: python3() [0x594a01]
frame #19: python3() [0x54a971]
frame #21: python3() [0x50a433]
frame #24: python3() [0x594a01]
frame #27: python3() [0x507be4]
frame #29: python3() [0x594a01]
frame #30: python3() [0x54a971]
frame #32: python3() [0x50a433]
frame #35: python3() [0x594a01]
frame #38: python3() [0x507be4]
frame #40: python3() [0x594a01]
frame #41: python3() [0x54a971]
frame #43: python3() [0x50a433]
frame #46: python3() [0x594a01]
frame #49: python3() [0x507be4]
frame #51: python3() [0x594a01]
frame #52: python3() [0x54a971]
frame #54: python3() [0x50a433]
frame #56: python3() [0x5095c8]
frame #57: python3() [0x50a2fd]
frame #59: python3() [0x507be4]
frame #61: python3() [0x594a01]

mrnabati commented on June 9, 2024

Make sure you build the DCNv2 library after installing PyTorch, and that PyTorch is installed with GPU support. This error usually happens when DCNv2 is built against a PyTorch installation without GPU support.
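
Before rebuilding DCNv2, it may help to print what the current environment offers. The sketch below gathers a few values that a PyTorch C++/CUDA extension build commonly depends on. The function name `dcnv2_build_prereqs` is hypothetical, and it is an assumption that DCNv2's build script looks at exactly these values.

```python
import importlib.util


def dcnv2_build_prereqs() -> dict:
    """Collect facts that commonly decide whether a CUDA extension can be built."""
    report = {
        "torch_installed": False,
        "torch_cuda_build": None,
        "gpu_visible": None,
        "cuda_home": None,
    }
    if importlib.util.find_spec("torch") is None:
        return report  # PyTorch is not installed in this environment

    import torch

    report["torch_installed"] = True
    report["torch_cuda_build"] = torch.version.cuda  # None for CPU-only builds
    report["gpu_visible"] = torch.cuda.is_available()
    try:
        # CUDA_HOME is the toolkit directory PyTorch detected, or None.
        from torch.utils.cpp_extension import CUDA_HOME
        report["cuda_home"] = CUDA_HOME
    except ImportError:
        pass  # extension build helpers unavailable; leave cuda_home as None
    return report


if __name__ == "__main__":
    for key, value in dcnv2_build_prereqs().items():
        print(f"{key}: {value}")
```

If `torch_cuda_build` or `cuda_home` comes back as `None`, a GPU build of the extension is unlikely to succeed until that is fixed and the library is rebuilt.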

fabrizioschiano commented on June 9, 2024

I get the same error when trying to run

bash experiments/test.sh

The error is:

  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/dla.py", line 516, in forward
    x = self.conv(x)
  File "/home/fabrizioschiano/.virtualenvs/centerfusion/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/DCNv2/dcn_v2.py", line 161, in forward
    return dcn_v2_conv(
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/DCNv2/dcn_v2.py", line 23, in forward
    output = _backend.dcn_v2_forward(
RuntimeError: Not compiled with GPU support
Using tensorboardX
Fix size testing.
training chunk_sizes: [32]
input h w: 448 800
heads {'hm': 10, 'reg': 2, 'wh': 2, 'dep': 1, 'rot': 8, 'dim': 3, 'amodel_offset': 2, 'dep_sec': 1, 'rot_sec': 8, 'nuscenes_att': 8, 'velocity': 3}
weights {'hm': 1, 'reg': 1, 'wh': 0.1, 'dep': 1, 'rot': 1, 'dim': 1, 'amodel_offset': 1, 'dep_sec': 1, 'rot_sec': 1, 'nuscenes_att': 1, 'velocity': 1}
head conv {'hm': [256], 'reg': [256], 'wh': [256], 'dep': [256], 'rot': [256], 'dim': [256], 'amodel_offset': [256], 'dep_sec': [256, 256, 256], 'rot_sec': [256, 256, 256], 'nuscenes_att': [256, 256, 256], 'velocity': [256, 256, 256]}
Namespace(K=100, amodel_offset_weight=1, arch='dla_34', aug_rot=0, backbone='dla34', batch_size=32, chunk_sizes=[32], custom_dataset_ann_path='', custom_dataset_img_path='', custom_head_convs={'dep_sec': 3, 'rot_sec': 3, 'velocity': 3, 'nuscenes_att': 3}, data_dir='/home/fabrizioschiano/repositories/CenterFusion/src/lib/../../data', dataset='nuscenes', dataset_version='', debug=0, debug_dir='/home/fabrizioschiano/repositories/CenterFusion/src/lib/../../exp/ddd/centerfusion/debug', debugger_theme='white', demo='', dense_reg=1, dep_res_weight=1, dep_weight=1, depth_scale=1, dim_weight=1, disable_frustum=False, dla_node='dcn', down_ratio=4, eval=False, eval_n_plots=0, eval_render_curves=False, exp_dir='/home/fabrizioschiano/repositories/CenterFusion/src/lib/../../exp/ddd', exp_id='centerfusion', fix_res=True, fix_short=-1, flip=0.5, flip_test=True, fp_disturb=0, freeze_backbone=False, frustumExpansionRatio=0.0, gpus=[0], gpus_str='0', head_conv={'hm': [256], 'reg': [256], 'wh': [256], 'dep': [256], 'rot': [256], 'dim': [256], 'amodel_offset': [256], 'dep_sec': [256, 256, 256], 'rot_sec': [256, 256, 256], 'nuscenes_att': [256, 256, 256], 'velocity': [256, 256, 256]}, head_kernel=3, heads={'hm': 10, 'reg': 2, 'wh': 2, 'dep': 1, 'rot': 8, 'dim': 3, 'amodel_offset': 2, 'dep_sec': 1, 'rot_sec': 8, 'nuscenes_att': 8, 'velocity': 3}, hm_dist_thresh={0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 1, 7: 1, 8: 0, 9: 0}, hm_disturb=0, hm_hp_weight=1, hm_to_box_ratio=0.3, hm_transparency=0.7, hm_weight=1, hp_weight=1, hungarian=False, ignore_loaded_cats=[], img_format='jpg', input_h=448, input_res=800, input_w=800, iou_thresh=0, keep_res=False, kitti_split='3dop', layers_to_freeze=['base', 'dla_up', 'ida_up'], load_model='../models/centerfusion_e60.pth', load_results='', lost_disturb=0, lr=0.000125, lr_step=[60], ltrb=False, ltrb_amodal=False, ltrb_amodal_weight=0.1, ltrb_weight=0.1, master_batch_size=32, max_age=-1, max_frame_dist=3, max_pc=1000, max_pc_dist=60.0, 
model_output_list=False, msra_outchannel=256, neck='dlaup', new_thresh=0.3, nms=False, no_color_aug=False, no_pause=False, no_pre_img=False, non_block_test=False, normalize_depth=True, not_cuda_benchmark=False, not_max_crop=False, not_prefetch_test=False, not_rand_crop=False, not_set_cuda_env=False, not_show_bbox=False, not_show_number=False, num_classes=10, num_epochs=70, num_head_conv=1, num_img_channels=3, num_iters=-1, num_resnet_layers=101, num_stacks=1, num_workers=4, nuscenes_att=True, nuscenes_att_weight=1, off_weight=1, optim='adam', out_thresh=-1, output_h=112, output_res=200, output_w=200, pad=31, pc_atts=['x', 'y', 'z', 'dyn_prop', 'id', 'rcs', 'vx', 'vy', 'vx_comp', 'vy_comp', 'is_quality_valid', 'ambig_state', 'x_rms', 'y_rms', 'invalid_state', 'pdh0', 'vx_rms', 'vy_rms'], pc_feat_channels={'pc_dep': 0, 'pc_vx': 1, 'pc_vz': 2}, pc_feat_lvl=['pc_dep', 'pc_vx', 'pc_vz'], pc_roi_method='pillars', pc_z_offset=-0.0, pillar_dims=[1.5, 0.2, 0.2], pointcloud=True, pre_hm=False, pre_img=False, pre_thresh=-1, print_iter=0, prior_bias=-4.6, public_det=False, qualitative=False, r_a=250, r_b=5, radar_sweeps=6, reg_loss='l1', reset_hm=False, resize_video=False, resume=False, reuse_hm=False, root_dir='/home/fabrizioschiano/repositories/CenterFusion/src/lib/../..', rot_weight=1, rotate=0, run_dataset_eval=True, same_aug_pre=False, save_all=False, save_dir='/home/fabrizioschiano/repositories/CenterFusion/src/lib/../../exp/ddd/centerfusion', save_framerate=30, save_img_suffix='', save_imgs=[], save_point=[90], save_results=False, save_video=False, scale=0, secondary_heads=['velocity', 'nuscenes_att', 'dep_sec', 'rot_sec'], seed=317, shift=0, show_track_color=False, show_velocity=False, shuffle_train=False, sigmoid_dep_sec=True, skip_first=-1, sort_det_by_dist=False, tango_color=False, task='ddd', test_dataset='nuscenes', test_focal_length=-1, test_scales=[1.0], track_thresh=0.3, tracking=False, tracking_weight=1, train_split='train', trainval=False, 
transpose_video=False, use_loaded_results=False, val_intervals=10, val_split='mini_val', velocity=True, velocity_weight=1, video_h=512, video_w=512, vis_gt_bev='', vis_thresh=0.3, warm_start_weights=False, weights={'hm': 1, 'reg': 1, 'wh': 0.1, 'dep': 1, 'rot': 1, 'dim': 1, 'amodel_offset': 1, 'dep_sec': 1, 'rot_sec': 1, 'nuscenes_att': 1, 'velocity': 1}, wh_weight=0.1, zero_pre_hm=False, zero_tracking=False)
Dataset version 
==> initializing mini_val data from /home/fabrizioschiano/repositories/CenterFusion/src/lib/../../data/nuscenes/annotations_6sweeps/mini_val.json, 
 images from /home/fabrizioschiano/repositories/CenterFusion/src/lib/../../data/nuscenes ...
loading annotations into memory...
Done (t=0.83s)
creating index...
index created!
Loaded mini_val 486 samples
Creating model...
Using node type: (<class 'model.networks.dla.DeformConv'>, <class 'model.networks.dla.DeformConv'>)
Warning: No ImageNet pretrain!!
loaded ../models/centerfusion_e60.pth, epoch 60
Traceback (most recent call last):
  File "test.py", line 215, in <module>
    prefetch_test(opt)
  File "test.py", line 125, in prefetch_test
    ret = detector.run(pre_processed_images)
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/detector.py", line 118, in run
    output, dets, forward_time = self.process(
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/detector.py", line 321, in process
    output = self.model(images, pc_dep=pc_dep, calib=calib)[-1]
  File "/home/fabrizioschiano/.virtualenvs/centerfusion/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/base_model.py", line 91, in forward
    feats = self.img2feats(x)
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/dla.py", line 622, in img2feats
    x = self.dla_up(x)
  File "/home/fabrizioschiano/.virtualenvs/centerfusion/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/dla.py", line 572, in forward
    ida(layers, len(layers) -i - 2, len(layers))
  File "/home/fabrizioschiano/.virtualenvs/centerfusion/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/dla.py", line 543, in forward
    layers[i] = upsample(project(layers[i]))
  File "/home/fabrizioschiano/.virtualenvs/centerfusion/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/dla.py", line 516, in forward
    x = self.conv(x)
  File "/home/fabrizioschiano/.virtualenvs/centerfusion/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/DCNv2/dcn_v2.py", line 161, in forward
    return dcn_v2_conv(
  File "/home/fabrizioschiano/repositories/CenterFusion/src/lib/model/networks/DCNv2/dcn_v2.py", line 23, in forward
    output = _backend.dcn_v2_forward(
RuntimeError: Not compiled with GPU support

The DCNv2 library seems to be built correctly (I ran the make.sh file without errors).

I checked my pytorch installation.

When I check my pytorch version with:

python -c "import torch; print(torch.__version__)"

I get

1.9.1+cu102

Then, if I do:

python -c "import torch; print(torch.cuda.is_available())"

I get:

True

What am I doing wrong?

I will come back here if I find a solution.

fabrizioschiano commented on June 9, 2024

@AHappyFlyBird , how did you solve your problem?

fabrizioschiano commented on June 9, 2024

After some research I realized that the problem was that I did not actually have the CUDA toolkit installed.

You can check this by running:
nvcc -V

If the command is not found or prints nothing, CUDA is not installed.
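
The same check can be scripted. This is a minimal sketch using only the Python standard library; the function name `nvcc_version` is hypothetical. It returns `None` when nvcc is not on the PATH, which is the situation described above.

```python
import shutil
import subprocess


def nvcc_version():
    """Return the output of `nvcc --version`, or None if nvcc is not on PATH."""
    nvcc_path = shutil.which("nvcc")
    if nvcc_path is None:
        return None
    result = subprocess.run(
        [nvcc_path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        check=False,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    version = nvcc_version()
    print(version if version is not None else "nvcc not found on PATH")
```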

I followed all this:

https://docs.nvidia.com/cuda/cuda-installation-guide-linux/

And I installed CUDA with the following official link

https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=20.04&target_type=deb_local

Then I installed the NVIDIA CUDA toolkit package with:
sudo apt install nvidia-cuda-toolkit

Then you can do:

export CUDA_HOME=/usr/local/cuda-11

(before doing it you should check that this is the folder in which CUDA has been installed on your machine)
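
One way to find the right folder is to list the directories that actually contain the nvcc compiler. The sketch below assumes CUDA lives under /usr/local in a directory whose name starts with "cuda", which is common but not guaranteed. The function name `find_cuda_homes` is hypothetical.

```python
import glob
import os


def find_cuda_homes(root="/usr/local"):
    """List directories under `root` named cuda* that contain bin/nvcc."""
    candidates = sorted(glob.glob(os.path.join(root, "cuda*")))
    return [
        path
        for path in candidates
        if os.path.isfile(os.path.join(path, "bin", "nvcc"))
    ]


if __name__ == "__main__":
    # Each printed path is a candidate value for CUDA_HOME.
    for path in find_cuda_homes():
        print(path)
```

If nothing is printed, the toolkit may be installed elsewhere (for example by the distribution package), so treat an empty result as a prompt to look further rather than proof that CUDA is missing.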

Then, I had another problem:

what(): No CUDA GPUs are available

I found out what to do thanks to this issue.

I had to change the line

export CUDA_VISIBLE_DEVICES=1

To

export CUDA_VISIBLE_DEVICES=0

In the test.sh of this repository.

I hope this helps someone else in the same situation.

fabrizioschiano commented on June 9, 2024

@mrnabati @AHappyFlyBird, do you think it is normal to get all of the following printed at the beginning of training?

No param dep_sec.0.weight.
No param dep_sec.0.bias.
No param dep_sec.2.weight.
No param dep_sec.2.bias.
No param dep_sec.4.weight.
No param dep_sec.4.bias.
No param dep_sec.6.weight.
No param dep_sec.6.bias.
No param rot_sec.0.weight.
No param rot_sec.0.bias.
No param rot_sec.2.weight.
No param rot_sec.2.bias.
No param rot_sec.4.weight.
No param rot_sec.4.bias.
No param rot_sec.6.weight.
No param rot_sec.6.bias.
No param nuscenes_att.4.weight.
No param nuscenes_att.4.bias.
No param nuscenes_att.6.weight.
No param nuscenes_att.6.bias.
No param velocity.4.weight.
No param velocity.4.bias.
No param velocity.6.weight.
No param velocity.6.bias.
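
For context, messages of this shape are what a checkpoint loader typically prints when it compares the parameters the model expects with those stored in the loaded file. The sketch below reproduces that comparison with plain dictionaries mapping parameter names to shapes. The function name and the example data are hypothetical, and it is an assumption that CenterFusion's loader follows this pattern. Under that reading, loading a checkpoint that lacks the secondary heads (dep_sec, rot_sec, and the deeper velocity and nuscenes_att layers) would produce one "No param" line for each missing entry.

```python
def compare_checkpoint(model_shapes, checkpoint_shapes):
    """Return messages for parameters that are missing or have a different shape."""
    messages = []
    # Parameters present in both, but with mismatched shapes, are skipped.
    for name, shape in checkpoint_shapes.items():
        if name in model_shapes and model_shapes[name] != shape:
            messages.append(
                f"Skip loading parameter {name}, "
                f"required shape {model_shapes[name]}, loaded shape {shape}."
            )
    # Parameters the model needs but the checkpoint does not contain.
    for name in model_shapes:
        if name not in checkpoint_shapes:
            messages.append(f"No param {name}.")
    return messages


# Hypothetical example: one mismatched layer and one layer absent from the file.
model = {
    "velocity.0.weight": (256, 67, 3, 3),
    "velocity.4.weight": (256, 256, 3, 3),  # illustrative shape, not from the log
}
checkpoint = {"velocity.0.weight": (256, 64, 3, 3)}

for line in compare_checkpoint(model, checkpoint):
    print(line)
```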
