lzccccc / smoke

SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation

License: MIT License

Python 68.64% C++ 6.24% Cuda 22.80% C 2.32%
3d-object-detection autonomous-driving

smoke's People

Contributors

lzccccc, tdtce

smoke's Issues

Validation set performance of the pretrained model

I evaluated the performance of your uploaded model on the val split (3769 images) and got the following results:
car_detection_ground AP: 83.374908 82.937012 75.748917
car_detection_3d AP: 78.142258 72.837662 65.392242

which are much higher than reported in the paper.

So I wonder if that model was trained using all the training images (trainval) instead of the train split?

Requirements.txt ?

Hi, thank you for open-sourcing your amazing work. Do you have a requirements.txt file listing each module with its version? I can see that you have provided the requirements in your README.md, but I am still getting "module not found" errors here and there while running your inference script.

Thank you

results compared to centernet

Hi, thank you for your attention. I trained on the 3712 training examples and evaluated on the 3769 validation examples using the parameters given in your paper, and I got results worse than CenterNet. Has anyone else gotten similar results? How can I improve them?
CenterNet:
car_detection AP: 96.429062 87.383652 78.831161
car_orientation AP: 93.708160 83.642677 75.108299
pedestrian_detection AP: 69.460915 60.578434 52.195190
pedestrian_orientation AP: 57.431622 49.492664 42.333351
cyclist_detection AP: 73.471001 48.506691 41.843220
cyclist_orientation AP: 65.426979 43.168201 37.650307
car_detection_ground AP: 31.262831 29.248146 25.693661
pedestrian_detection_ground AP: 21.528582 20.380455 16.530041
cyclist_detection_ground AP: 21.939928 13.853647 13.297730
car_detection_3d AP: 17.370939 17.168695 15.270061
pedestrian_detection_3d AP: 20.790129 19.560652 15.860862
cyclist_detection_3d AP: 21.344902 13.315850 12.881376
SMOKE:
car_detection AP: 86.502937 77.495712 68.934151
car_orientation AP: 86.208595 77.035370 68.207565
pedestrian_detection AP: 62.393944 54.208607 46.607456
pedestrian_orientation AP: 43.747810 37.509506 32.416935
cyclist_detection AP: 41.442078 30.185410 29.691633
cyclist_orientation AP: 23.559668 16.461676 16.320688
car_detection_ground AP: 22.207165 18.516039 16.109665
pedestrian_detection_ground AP: 8.190163 6.635043 6.192138
cyclist_detection_ground AP: 1.415183 0.826446 0.826446
car_detection_3d AP: 16.936316 14.226256 13.635677
pedestrian_detection_3d AP: 6.768989 6.250000 5.621096
cyclist_detection_3d AP: 1.376730 0.826446 0.826446

Questions about horizontal flipping in augmentation

Hello!

First, I want to thank you for releasing the code of the architecture and for its quality.

Over the last few days, I have been going through the code and testing it with my own dataset (I created my own data loader).
However, I have some doubts about the augmentation techniques, specifically the flipping of the images:

First, with a certain probability the images are flipped horizontally. In this process, the following data is modified:

  • The input image is flipped.
  • The intrinsic matrix is modified.
  • The location on the X axis is multiplied by -1.
  • The rotation is multiplied by -1 (this is the subject of my second question).

This modified data is saved as the label in the PyTorch dataset, as can be seen in __getitem__(self, idx), along with a flag indicating whether the image has been flipped.

Then, when the network calculates the loss during training, a method decodes the rotation. This method uses the flipping flag to modify the rotation (inverting what the data loader did) for the samples that were flipped.

In summary, the data loader multiplies the rotation of the unmodified data by -1 and saves it as the label (which I think is correct), but then the decoder applies the same transformation to the prediction, so the prediction is mapped back to the rotation of the unmodified data and is therefore inconsistent with the label.

As I see it, the network should not need to reset the rotation to the original data, since the data loader has already taken care of it. Am I wrong?

Second, regarding the rotation transformation, I believe that multiplying by -1 is not correct for horizontal flipping (it is for vertical flipping).

The reasoning is that, for a car with a rotation of 90 degrees (in KITTI, a vehicle facing forward), flipping the image horizontally does not change the rotation of the car, so it remains 90 degrees. In Python, the formula is:
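The concern in this first question can be illustrated with a minimal sketch. It models only the behaviour described in the post (negate in the data loader, negate again in the decoder when the flip flag is set); the function names are hypothetical and this is not the repository's code.

```python
def loader_flip(rot_y):
    # Data loader side, as described in the post: the rotation of a
    # horizontally flipped sample is multiplied by -1 to form the label.
    return -rot_y


def decode_rotation(pred_rot, flipped):
    # Decoder side, as described in the post: the same negation is applied
    # to the prediction whenever the flip flag is set.
    return -pred_rot if flipped else pred_rot


original = 0.6                      # rotation of the unflipped object (radians)
label = loader_flip(original)       # label stored for the flipped image
perfect_prediction = label          # a network that predicts the label exactly
decoded = decode_rotation(perfect_prediction, flipped=True)

print("label:", label, "decoded prediction:", decoded)
```

Under these assumptions even a perfect prediction is mapped back to the unflipped rotation, so it no longer matches the flipped label it is compared against.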

def flip_rotation(rotation):
	# rotation is given in degrees
	if rotation < 0:
		rotation = -180 - rotation
	else:
		rotation = 180 - rotation
	return rotation
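As a sanity check on the formula above, here is a small sketch that compares it with an explicit geometric mirror: the yaw angle is turned into a heading vector in the ground plane, the lateral (x) component is negated, and the result is converted back to an angle. The function names are illustrative, and angles are assumed to be in degrees within [-180, 180].

```python
import math


def flip_rotation_deg(rotation):
    # Formula from the post: reflect the yaw about the forward axis.
    if rotation < 0:
        return -180.0 - rotation
    return 180.0 - rotation


def flip_by_mirroring_heading(rotation):
    # Build the heading vector, negate its lateral component (what a
    # horizontal image flip does), and convert back to an angle.
    rad = math.radians(rotation)
    lateral, forward = math.cos(rad), math.sin(rad)
    return math.degrees(math.atan2(forward, -lateral))


def angle_difference_deg(a, b):
    # Smallest absolute difference between two angles, in degrees.
    return abs((a - b + 180.0) % 360.0 - 180.0)


for angle in (-170, -90, -30, 0, 30, 90, 170):
    print(angle, flip_rotation_deg(angle), round(flip_by_mirroring_heading(angle), 6))
```

Both functions leave 90 degrees unchanged, which matches the forward-facing car example, whereas a plain multiplication by -1 would turn it into -90 degrees.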

Thanks in advance.

How is DIMENSION_REFERENCE in SMOKE_HEAD calculated?

I want to train SMOKE on different datasets, but after going through the whole code I know that the number of classes is tied to DIMENSION_REFERENCE. For the KITTI dataset, there are 3 values for DIMENSION_REFERENCE in smoke/config/defaults.py:

# Reference car size in (length, height, width)
# for (car, cyclist, pedestrian)
_C.MODEL.SMOKE_HEAD.DIMENSION_REFERENCE = ((3.88, 1.63, 1.53),
                                           (1.78, 1.70, 0.58),
                                           (0.88, 1.73, 0.67))

I would like to know how DIMENSION_REFERENCE in SMOKE_HEAD is calculated.
Thanks.
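The thread does not confirm how these values were obtained. One plausible reading, stated here as an assumption, is that each triple is the per-class mean of the object dimensions in the training labels. A minimal sketch for computing such means from KITTI-format label lines follows; the function name is hypothetical, and it relies on the KITTI label layout in which fields 8 to 10 hold height, width and length.

```python
from collections import defaultdict


def mean_dimensions(label_lines, classes=("Car", "Cyclist", "Pedestrian")):
    """Return {class: (length, height, width)} averaged over the label lines."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    counts = defaultdict(int)
    for line in label_lines:
        fields = line.split()
        # A KITTI label line has 15 fields; skip malformed or unwanted ones.
        if len(fields) < 15 or fields[0] not in classes:
            continue
        height, width, length = (float(v) for v in fields[8:11])
        acc = sums[fields[0]]
        # Accumulate in the (length, height, width) order used by the config.
        acc[0] += length
        acc[1] += height
        acc[2] += width
        counts[fields[0]] += 1
    return {
        cls: tuple(round(total / counts[cls], 2) for total in sums[cls])
        for cls in classes
        if counts[cls]
    }


# Two made-up label lines, for illustration only.
example_lines = [
    "Car 0.00 0 -1.58 587.0 173.3 614.1 200.1 1.60 1.50 4.00 1.0 1.5 40.0 -1.57",
    "Car 0.00 0 -1.20 400.0 170.0 450.0 210.0 1.40 1.70 3.60 -3.0 1.6 25.0 -1.30",
]
print(mean_dimensions(example_lines))
```

For a new dataset, the same statistic would be computed once per class, with one triple per entry in DETECT_CLASSES and in the same order.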

CUDA out of memory

I have an NVIDIA TITAN V with 12 GB of memory. However, when I ran the training, it ran out of memory:

RuntimeError: CUDA out of memory. Tried to allocate 240.00 MiB (GPU 0; 11.78 GiB total capacity; 10.69 GiB already allocated; 29.25 MiB free; 19.64 MiB cached)

Is this possible?

Why is loc_center used to compute proj_point instead of (x, y, z) directly?

    loc_center = np.array([x, y - h / 2, z])  
    proj_point = np.matmul(K, loc_center) ##
    proj_point = proj_point[:2] / proj_point[2]

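Why is loc_center used to compute proj_point instead of (x, y, z) directly? If the object is a person, doesn't this mean that the y coordinate used is that of the top of the head rather than the waist?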
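For context, KITTI labels give (x, y, z) as the bottom-centre of the 3D box in camera coordinates, with the y axis pointing down, so y - h / 2 is the centre of the box rather than its top. The sketch below projects the bottom, centre and top of one box to show where each lands in the image. The intrinsic matrix and the object values are illustrative numbers, not taken from the repository.

```python
def project(K, point):
    """Project a 3D camera-frame point with a 3x3 intrinsic matrix K."""
    x, y, z = point
    u = K[0][0] * x + K[0][1] * y + K[0][2] * z
    v = K[1][0] * x + K[1][1] * y + K[1][2] * z
    w = K[2][0] * x + K[2][1] * y + K[2][2] * z
    return u / w, v / w


# Illustrative intrinsics (assumed values, roughly KITTI-like).
K = [
    [721.5, 0.0, 609.6],
    [0.0, 721.5, 172.9],
    [0.0, 0.0, 1.0],
]

# Illustrative pedestrian: label location is the bottom-centre of the box.
x, y, z, h = 2.0, 1.6, 20.0, 1.7

bottom_uv = project(K, (x, y, z))          # feet (label location)
center_uv = project(K, (x, y - h / 2, z))  # box centre, as in loc_center
top_uv = project(K, (x, y - h, z))         # head

print("bottom:", bottom_uv, "center:", center_uv, "top:", top_uv)
```

Because image rows grow downward, the centre projects between the head and the feet, which is the keypoint a centre-based detector would want.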

RuntimeError: received 0 items of ancdata

Hi,

I am following the instructions in the README to run the training. When I execute python tools/plain_train_net.py --config-file "configs/smoke_gn_vector.yaml", I get the error "RuntimeError: received 0 items of ancdata".

Ubuntu: 16.04
Python: 3.7.7
CUDA: 10.0
Pytorch: 1.4.0

The entire log is below

Command Line Args: Namespace(ckpt=None, config_file='configs/smoke_gn_vector.yaml', dist_url='tcp://127.0.0.1:50153', eval_only=False, machine_rank=0, num_gpus=1, num_machines=1, opts=[])
[2020-06-19 18:10:56,601] smoke INFO: Using 1 GPUs
[2020-06-19 18:10:56,601] smoke INFO: Collecting environment info
[2020-06-19 18:10:57,952] smoke INFO: 
PyTorch version: 1.4.0
Is debug build: No
CUDA used to build PyTorch: 10.0

OS: Ubuntu 16.04.6 LTS
GCC version: (Ubuntu 5.5.0-12ubuntu1~16.04) 5.5.0 20171010
CMake version: version 3.5.1

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.0.130
GPU models and configuration: 
GPU 0: GeForce GTX 1080 Ti
GPU 1: GeForce GTX 1080 Ti
GPU 2: GeForce GTX 1080 Ti
GPU 3: TITAN Xp

Nvidia driver version: 418.87.01
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.2

Versions of relevant libraries:
[pip3] numpy==1.18.1
[pip3] torch==1.4.0
[pip3] torchfile==0.1.0
[pip3] torchvision==0.5.0
[conda] blas                      1.0                         mkl  
[conda] mkl                       2020.1                      217  
[conda] mkl-service               2.3.0            py37he904b0f_0  
[conda] mkl_fft                   1.1.0            py37h23d657b_0  
[conda] mkl_random                1.1.1            py37h0573a6f_0  
[conda] pytorch                   1.4.0           py3.7_cuda10.0.130_cudnn7.6.3_0    pytorch
[conda] torchvision               0.5.0                py37_cu100    pytorch
        Pillow (7.1.2)
[2020-06-19 18:10:57,952] smoke INFO: Namespace(ckpt=None, config_file='configs/smoke_gn_vector.yaml', dist_url='tcp://127.0.0.1:50153', eval_only=False, machine_rank=0, num_gpus=1, num_machines=1, opts=[])
[2020-06-19 18:10:57,952] smoke INFO: Loaded configuration file configs/smoke_gn_vector.yaml
[2020-06-19 18:10:57,953] smoke INFO: 
MODEL:
  WEIGHT: "catalog://ImageNetPretrained/DLA34"
INPUT:
  FLIP_PROB_TRAIN: 0.5
  SHIFT_SCALE_PROB_TRAIN: 0.3
DATASETS:
  DETECT_CLASSES: ("Car", "Cyclist", "Pedestrian")
  TRAIN: ("kitti_train",)
  TEST: ("kitti_test",)
  TRAIN_SPLIT: "trainval"
  TEST_SPLIT: "test"
SOLVER:
  BASE_LR: 2.5e-4
  STEPS: (10000, 18000)
  MAX_ITERATION: 25000
  IMS_PER_BATCH: 32
[2020-06-19 18:10:57,953] smoke INFO: Running with config:
CUDNN_BENCHMARK: True
DATALOADER:
  ASPECT_RATIO_GROUPING: False
  NUM_WORKERS: 4
  SIZE_DIVISIBILITY: 0
DATASETS:
  DETECT_CLASSES: ('Car', 'Cyclist', 'Pedestrian')
  MAX_OBJECTS: 30
  TEST: ('kitti_test',)
  TEST_SPLIT: test
  TRAIN: ('kitti_train',)
  TRAIN_SPLIT: trainval
INPUT:
  FLIP_PROB_TRAIN: 0.5
  HEIGHT_TEST: 384
  HEIGHT_TRAIN: 384
  PIXEL_MEAN: [0.485, 0.456, 0.406]
  PIXEL_STD: [0.229, 0.224, 0.225]
  SHIFT_SCALE_PROB_TRAIN: 0.3
  SHIFT_SCALE_TRAIN: (0.2, 0.4)
  TO_BGR: True
  WIDTH_TEST: 1280
  WIDTH_TRAIN: 1280
MODEL:
  BACKBONE:
    BACKBONE_OUT_CHANNELS: 64
    CONV_BODY: DLA-34-DCN
    DOWN_RATIO: 4
    FREEZE_CONV_BODY_AT: 0
    USE_NORMALIZATION: GN
  DEVICE: cuda
  GROUP_NORM:
    DIM_PER_GP: -1
    EPSILON: 1e-05
    NUM_GROUPS: 32
  SMOKE_HEAD:
    DEPTH_REFERENCE: (28.01, 16.32)
    DIMENSION_REFERENCE: ((3.88, 1.63, 1.53), (1.78, 1.7, 0.58), (0.88, 1.73, 0.67))
    LOSS_ALPHA: 2
    LOSS_BETA: 4
    LOSS_TYPE: ('FocalLoss', 'DisL1')
    LOSS_WEIGHT: (1.0, 10.0)
    NUM_CHANNEL: 256
    PREDICTOR: SMOKEPredictor
    REGRESSION_CHANNEL: (1, 2, 3, 2)
    REGRESSION_HEADS: 8
    USE_NMS: False
    USE_NORMALIZATION: GN
  SMOKE_ON: True
  WEIGHT: catalog://ImageNetPretrained/DLA34
OUTPUT_DIR: ./tools/logs
PATHS_CATALOG: /home/robot1/Shukai/SMOKE/smoke/config/paths_catalog.py
SEED: -1
SOLVER:
  BASE_LR: 0.00025
  BIAS_LR_FACTOR: 2
  CHECKPOINT_PERIOD: 20
  EVALUATE_PERIOD: 20
  IMS_PER_BATCH: 32
  LOAD_OPTIMIZER_SCHEDULER: True
  MASTER_BATCH: -1
  MAX_ITERATION: 25000
  OPTIMIZER: Adam
  STEPS: (10000, 18000)
TEST:
  DETECTIONS_PER_IMG: 50
  DETECTIONS_THRESHOLD: 0.25
  IMS_PER_BATCH: 1
  PRED_2D: True
  SINGLE_GPU_TEST: True
[2020-06-19 18:10:57,954] smoke.utils.envs INFO: Using a generated random seed 58027627
[2020-06-19 18:11:00,129] smoke.utils.check_point INFO: Loading checkpoint from catalog://ImageNetPretrained/DLA34
[2020-06-19 18:11:00,129] smoke.utils.check_point INFO: catalog://ImageNetPretrained/DLA34 points to http://dl.yf.io/dla/models/imagenet/dla34-ba72cf86.pth
[2020-06-19 18:11:00,130] smoke.utils.check_point INFO: url http://dl.yf.io/dla/models/imagenet/dla34-ba72cf86.pth cached in /home/robot1/.torch/models/dla34-ba72cf86.pth
[2020-06-19 18:11:00,163] smoke.utils.model_serialization INFO: backbone.body.base.base_layer.0.weight                                loaded from base_layer.0.weight                 of shape (16, 3, 7, 7)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.base_layer.1.bias                                  loaded from base_layer.1.bias                   of shape (16,)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.base_layer.1.weight                                loaded from base_layer.1.weight                 of shape (16,)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level0.0.weight                                    loaded from level0.0.weight                     of shape (16, 16, 3, 3)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level0.1.bias                                      loaded from level0.1.bias                       of shape (16,)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level0.1.weight                                    loaded from level0.1.weight                     of shape (16,)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level1.0.weight                                    loaded from level1.0.weight                     of shape (32, 16, 3, 3)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level1.1.bias                                      loaded from level1.1.bias                       of shape (32,)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level1.1.weight                                    loaded from level1.1.weight                     of shape (32,)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level2.project.0.weight                            loaded from level2.project.0.weight             of shape (64, 32, 1, 1)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level2.project.1.bias                              loaded from level2.project.1.bias               of shape (64,)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level2.project.1.weight                            loaded from level2.project.1.weight             of shape (64,)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level2.root.conv.weight                            loaded from level2.root.conv.weight             of shape (64, 128, 1, 1)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level2.tree1.conv1.weight                          loaded from level2.tree1.conv1.weight           of shape (64, 32, 3, 3)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level2.tree1.conv2.weight                          loaded from level2.tree1.conv2.weight           of shape (64, 64, 3, 3)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level2.tree2.conv1.weight                          loaded from level2.tree2.conv1.weight           of shape (64, 64, 3, 3)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level2.tree2.conv2.weight                          loaded from level2.tree2.conv2.weight           of shape (64, 64, 3, 3)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level3.project.0.weight                            loaded from level3.project.0.weight             of shape (128, 64, 1, 1)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level3.project.1.bias                              loaded from level3.project.1.bias               of shape (128,)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level3.project.1.weight                            loaded from level3.project.1.weight             of shape (128,)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level3.tree1.project.0.weight                      loaded from level3.tree1.project.0.weight       of shape (128, 64, 1, 1)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level3.tree1.project.1.bias                        loaded from level3.tree1.project.1.bias         of shape (128,)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level3.tree1.project.1.weight                      loaded from level3.tree1.project.1.weight       of shape (128,)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level3.tree1.root.conv.weight                      loaded from level3.tree1.root.conv.weight       of shape (128, 256, 1, 1)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level3.tree1.tree1.conv1.weight                    loaded from level3.tree1.tree1.conv1.weight     of shape (128, 64, 3, 3)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level3.tree1.tree1.conv2.weight                    loaded from level3.tree1.tree1.conv2.weight     of shape (128, 128, 3, 3)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level3.tree1.tree2.conv1.weight                    loaded from level3.tree1.tree2.conv1.weight     of shape (128, 128, 3, 3)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level3.tree1.tree2.conv2.weight                    loaded from level3.tree1.tree2.conv2.weight     of shape (128, 128, 3, 3)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level3.tree2.root.conv.weight                      loaded from level3.tree2.root.conv.weight       of shape (128, 448, 1, 1)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level3.tree2.tree1.conv1.weight                    loaded from level3.tree2.tree1.conv1.weight     of shape (128, 128, 3, 3)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level3.tree2.tree1.conv2.weight                    loaded from level3.tree2.tree1.conv2.weight     of shape (128, 128, 3, 3)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level3.tree2.tree2.conv1.weight                    loaded from level3.tree2.tree2.conv1.weight     of shape (128, 128, 3, 3)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level3.tree2.tree2.conv2.weight                    loaded from level3.tree2.tree2.conv2.weight     of shape (128, 128, 3, 3)
[2020-06-19 18:11:00,164] smoke.utils.model_serialization INFO: backbone.body.base.level4.project.0.weight                            loaded from level4.project.0.weight             of shape (256, 128, 1, 1)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level4.project.1.bias                              loaded from level4.project.1.bias               of shape (256,)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level4.project.1.weight                            loaded from level4.project.1.weight             of shape (256,)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level4.tree1.project.0.weight                      loaded from level4.tree1.project.0.weight       of shape (256, 128, 1, 1)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level4.tree1.project.1.bias                        loaded from level4.tree1.project.1.bias         of shape (256,)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level4.tree1.project.1.weight                      loaded from level4.tree1.project.1.weight       of shape (256,)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level4.tree1.root.conv.weight                      loaded from level4.tree1.root.conv.weight       of shape (256, 512, 1, 1)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level4.tree1.tree1.conv1.weight                    loaded from level4.tree1.tree1.conv1.weight     of shape (256, 128, 3, 3)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level4.tree1.tree1.conv2.weight                    loaded from level4.tree1.tree1.conv2.weight     of shape (256, 256, 3, 3)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level4.tree1.tree2.conv1.weight                    loaded from level4.tree1.tree2.conv1.weight     of shape (256, 256, 3, 3)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level4.tree1.tree2.conv2.weight                    loaded from level4.tree1.tree2.conv2.weight     of shape (256, 256, 3, 3)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level4.tree2.root.conv.weight                      loaded from level4.tree2.root.conv.weight       of shape (256, 896, 1, 1)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level4.tree2.tree1.conv1.weight                    loaded from level4.tree2.tree1.conv1.weight     of shape (256, 256, 3, 3)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level4.tree2.tree1.conv2.weight                    loaded from level4.tree2.tree1.conv2.weight     of shape (256, 256, 3, 3)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level4.tree2.tree2.conv1.weight                    loaded from level4.tree2.tree2.conv1.weight     of shape (256, 256, 3, 3)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level4.tree2.tree2.conv2.weight                    loaded from level4.tree2.tree2.conv2.weight     of shape (256, 256, 3, 3)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level5.project.0.weight                            loaded from level5.project.0.weight             of shape (512, 256, 1, 1)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level5.project.1.bias                              loaded from level5.project.1.bias               of shape (512,)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level5.project.1.weight                            loaded from level5.project.1.weight             of shape (512,)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level5.root.conv.weight                            loaded from level5.root.conv.weight             of shape (512, 1280, 1, 1)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level5.tree1.conv1.weight                          loaded from level5.tree1.conv1.weight           of shape (512, 256, 3, 3)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level5.tree1.conv2.weight                          loaded from level5.tree1.conv2.weight           of shape (512, 512, 3, 3)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level5.tree2.conv1.weight                          loaded from level5.tree2.conv1.weight           of shape (512, 512, 3, 3)
[2020-06-19 18:11:00,165] smoke.utils.model_serialization INFO: backbone.body.base.level5.tree2.conv2.weight                          loaded from level5.tree2.conv2.weight           of shape (512, 512, 3, 3)
[2020-06-19 18:11:00,191] smoke.data.datasets.kitti INFO: Initializing KITTI trainval set with 7481 files loaded
[2020-06-19 18:11:00,191] smoke.trainer INFO: Start training
Traceback (most recent call last):
  File "tools/plain_train_net.py", line 100, in <module>
    args=(args,),
  File "/home/robot1/Shukai/SMOKE/smoke/engine/launch.py", line 56, in launch
    main_func(*args)
  File "tools/plain_train_net.py", line 88, in main
    train(cfg, model, device, distributed)
  File "tools/plain_train_net.py", line 53, in train
    arguments
  File "/home/robot1/Shukai/SMOKE/smoke/engine/trainer.py", line 58, in do_train
    for data, iteration in zip(data_loader, range(start_iter, max_iter)):
  File "/home/robot1/anaconda3/envs/SMOKE/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/robot1/anaconda3/envs/SMOKE/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 841, in _next_data
    idx, data = self._get_data()
  File "/home/robot1/anaconda3/envs/SMOKE/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 808, in _get_data
    success, data = self._try_get_data()
  File "/home/robot1/anaconda3/envs/SMOKE/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 761, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/home/robot1/anaconda3/envs/SMOKE/lib/python3.7/multiprocessing/queues.py", line 113, in get
    return _ForkingPickler.loads(res)
  File "/home/robot1/anaconda3/envs/SMOKE/lib/python3.7/site-packages/torch/multiprocessing/reductions.py", line 294, in rebuild_storage_fd
    fd = df.detach()
  File "/home/robot1/anaconda3/envs/SMOKE/lib/python3.7/multiprocessing/resource_sharer.py", line 58, in detach
    return reduction.recv_handle(conn)
  File "/home/robot1/anaconda3/envs/SMOKE/lib/python3.7/multiprocessing/reduction.py", line 185, in recv_handle
    return recvfds(s, 1)[0]
  File "/home/robot1/anaconda3/envs/SMOKE/lib/python3.7/multiprocessing/reduction.py", line 161, in recvfds
    len(ancdata))
RuntimeError: received 0 items of ancdata

Any help is appreciated!

Shukai
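This error is commonly reported when DataLoader worker processes exhaust the per-process limit on open file descriptors while passing tensors between processes. Whether that is the cause here is an assumption; the thread does not confirm it. A minimal sketch (Linux or macOS, standard library only) that inspects the limit and raises the soft limit before training starts could look like this. The function name and the target value are illustrative.

```python
import resource


def raise_open_file_limit(target=4096):
    """Raise the soft open-file limit towards `target`; return (old, new)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Nothing to do if the soft limit is already unlimited or high enough.
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft, soft
    # An unprivileged process may raise the soft limit only up to the hard limit.
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return soft, new_soft


old_limit, new_limit = raise_open_file_limit()
print("open-file soft limit:", old_limit, "->", new_limit)
```

Other workarounds that are often suggested for this message are calling torch.multiprocessing.set_sharing_strategy("file_system") at the top of the training script, or lowering DATALOADER.NUM_WORKERS (4 in the log above).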

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

Hi, I am interested in your work.
When I run python tools/plain_train_net.py --config-file "configs/smoke_gn_vector.yaml",
I get this error:

Traceback (most recent call last):
  File "tools/plain_train_net.py", line 100, in <module>
    args=(args,),
  File "/home/kaixin/jupyter_projects/Project/SMOKE/smoke/engine/launch.py", line 56, in launch
    main_func(*args)
  File "tools/plain_train_net.py", line 88, in main
    train(cfg, model, device, distributed)
  File "tools/plain_train_net.py", line 53, in train
    arguments
  File "/home/kaixin/jupyter_projects/Project/SMOKE/smoke/engine/trainer.py", line 76, in do_train
    losses.backward()
  File "/home/kaixin/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 102, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/kaixin/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

Could you help me with this? I am using PyTorch 1.0.0.

training SMOKE using own dataset

Hi,
Thanks for sharing this great work. When I train the model on my own car dataset, it cannot detect any cars, not even in my training images.
Below are a training image and its label.
I hope someone can give me some guidance. Thank you in advance.
000000.txt

000000

000000

TypeError: zip argument #1 must support iteration

I ran into this problem and really don't know how to solve it. Can someone help me?

Traceback (most recent call last):
  File "tools/plain_train_net.py", line 102, in <module>
    args=(args,),
  File "/home/zhifanyong/deep_learning/SMOKE/smoke/engine/launch.py", line 53, in launch
    daemon=False,
  File "/home/zhifanyong/venv/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 171, in spawn
    while not spawn_context.join():
  File "/home/zhifanyong/venv/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 118, in join
    raise Exception(msg)
Exception: 

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/zhifanyong/venv/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/home/zhifanyong/deep_learning/SMOKE/smoke/engine/launch.py", line 88, in _distributed_worker
    main_func(*args)
  File "/home/zhifanyong/deep_learning/SMOKE/tools/plain_train_net.py", line 90, in main
    train(cfg, model, device, distributed)
  File "/home/zhifanyong/deep_learning/SMOKE/tools/plain_train_net.py", line 53, in train
    arguments
  File "/home/zhifanyong/deep_learning/SMOKE/smoke/engine/trainer.py", line 58, in do_train
    for data, iteration in zip(data_loader, range(start_iter, max_iter)):
TypeError: zip argument #1 must support iteration
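The message means that the first argument to zip(), here data_loader, is not iterable. One way this can happen, offered as an assumption since the thread gives no further detail, is that the function building the loader returned None. A small guard with a hypothetical name makes that failure explicit before the training loop starts:

```python
def checked_batches(data_loader, start_iter, max_iter):
    """Pair batches with iteration numbers, failing early on a bad loader."""
    try:
        iter(data_loader)
    except TypeError:
        raise TypeError(
            "data_loader is not iterable (got %s); check how the loader "
            "was built" % type(data_loader).__name__
        ) from None
    return zip(data_loader, range(start_iter, max_iter))


# A list stands in for a real DataLoader in this illustration.
for batch, iteration in checked_batches(["batch0", "batch1", "batch2"], 5, 7):
    print(iteration, batch)
```

With such a check, a loader that was never built is reported by type rather than surfacing as a zip() error inside do_train.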

Training problem

Hi,
Thanks for sharing this great work. When I train on the train split (3712) and evaluate on the val split (3769), I only get a very low AP. Could you share your training log file so that I can trace the bug?

how to run smoke on RTX3070 GPU?

As far as I know, SMOKE's requirements are:
Ubuntu 16.04
Python 3.7
Pytorch 1.3.1
CUDA 10.0

But my PC has an RTX 3070 GPU, which only supports CUDA 11.1 and later, so I tried to configure the environment as follows:

(base) zxl@R9000P:~/mywork/MANA-AI/DCNv2_latest$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0

Ubuntu 18.04
Python 3.8
PyTorch 1.8.1 and 1.9.0
CUDA 11.1

Running the following in the PyTorch 1.8.1 and 1.9.0 environments:

python setup.py build develop

gives the errors below:

/home/zxl/mywork/MANA-AI/smoke_mono_3d/smoke/csrc/cuda/dcn_v2_cuda.cu(127): error: identifier "THCudaBlas_SgemmBatched" is undefined
/home/zxl/mywork/MANA-AI/smoke_mono_3d/smoke/csrc/cuda/dcn_v2_cuda.cu(275): error: identifier "THCudaBlas_Sgemm" is undefined
/home/zxl/mywork/MANA-AI/smoke_mono_3d/smoke/csrc/cuda/dcn_v2_cuda.cu(329): error: identifier "THCudaBlas_Sgemv" is undefined
3 errors detected in the compilation of "/home/zxl/mywork/MANA-AI/smoke_mono_3d/smoke/csrc/cuda/dcn_v2_cuda.cu".
error: command '/usr/local/cuda-11.1/bin/nvcc' failed with exit status 1

I found the following approach online:

git clone https://github.com/jinfagang/DCNv2_latest.git
cd DCNv2_latest
python setup.py build develop

With CUDA 11.1 and PyTorch 1.8.1 or 1.9.0 the build succeeds, but running the test program

python testcuda.py

fails with the error below:

raise RuntimeError(msg)
RuntimeError: Jacobian mismatch for output 0 with respect to input 0,
numerical:tensor([[ 0.4043,  0.0048, -0.0100,  ...,  0.0000,  0.0000,  0.0000],
       [ 0.1935,  0.0695,  0.0132,  ...,  0.0000,  0.0000,  0.0000],
       [-0.0009,  0.0000,  0.3827,  ...,  0.0000,  0.0000,  0.0000],
       ...,
       [ 0.0000,  0.0000,  0.0000,  ...,  0.0000, -0.0237,  0.0000],
       [ 0.0000,  0.0000,  0.0000,  ..., -0.4143, -0.8342,  0.0000],
       [ 0.0000,  0.0000,  0.0000,  ..., -0.2155, -0.1278, -0.1084]],
      device='cuda:0')
analytical:tensor([[ 0.4043,  0.0049, -0.0100,  ...,  0.0000,  0.0000,  0.0000],
       [ 0.1934,  0.0695,  0.0133,  ...,  0.0000,  0.0000,  0.0000],
       [-0.0011,  0.0000,  0.3829,  ...,  0.0000,  0.0000,  0.0000],
       ...,
       [ 0.0000,  0.0000,  0.0000,  ...,  0.0000, -0.0237,  0.0000],
       [ 0.0000,  0.0000,  0.0000,  ..., -0.4146, -0.8340,  0.0000],
       [ 0.0000,  0.0000,  0.0000,  ..., -0.2157, -0.1280, -0.1084]],
      device='cuda:0')
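The two printed tensors differ only in the fourth decimal place. The sketch below takes the entries visible in the output above and checks them against an allclose-style criterion, first with the documented defaults of torch.autograd.gradcheck (atol=1e-5, rtol=1e-3) and then with a looser absolute tolerance. Whether testcuda.py uses those defaults, and whether it runs in single precision, are assumptions; the script itself is not shown here.

```python
# Entries visible in the printed tensors above, in reading order.
numerical = [0.4043, 0.0048, -0.0100, 0.1935, 0.0695, 0.0132,
             -0.0009, 0.0000, 0.3827, -0.0237, -0.4143, -0.8342,
             -0.2155, -0.1278, -0.1084]
analytical = [0.4043, 0.0049, -0.0100, 0.1934, 0.0695, 0.0133,
              -0.0011, 0.0000, 0.3829, -0.0237, -0.4146, -0.8340,
              -0.2157, -0.1280, -0.1084]


def max_abs_diff(a, b):
    # Largest element-wise absolute difference.
    return max(abs(x - y) for x, y in zip(a, b))


def allclose(a, n, rtol, atol):
    # Same form as the usual allclose test: |a - n| <= atol + rtol * |n|.
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, n))


print("max abs diff:", max_abs_diff(analytical, numerical))
print("passes with atol=1e-5, rtol=1e-3:", allclose(analytical, numerical, 1e-3, 1e-5))
print("passes with atol=1e-3, rtol=1e-3:", allclose(analytical, numerical, 1e-3, 1e-3))
```

Differences of this size are typical of finite-difference checks run in float32, which is why gradient checks are usually run in double precision or with relaxed tolerances. On its own, this mismatch does not show that the built extension computes wrong gradients.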

What puzzled me most was that the following commands ran successfully in all three environments:

(py38torch171-smoke) zxl@R9000P:~/mywork/MANA-AI/DCNv2_latest$ python
Python 3.8.10 (default, Jun  4 2021, 15:09:15) 
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.7.1'
>>> torch.cuda.is_available()
True
>>>
(py38torch181-smoke) zxl@R9000P:~/mywork/MANA-AI/DCNv2_latest$ python
Python 3.8.10 (default, Jun  4 2021, 15:09:15) 
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.8.1+cu111'
>>> torch.cuda.is_available()
True
>>>
(py38torch190-smoke) zxl@R9000P:~/mywork/MANA-AI/DCNv2_latest$ python
Python 3.8.10 (default, Jun  4 2021, 15:09:15) 
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.9.0+cu111'
>>> torch.cuda.is_available()
True
>>>

I've been researching this all day without success. Can you configure the environment on an RTX 30-series graphics card with CUDA 11.1? Thanks a lot!

How to change backbone?

Thank you for sharing this great code base. I'd like to try different backbones, e.g. changing the DLA-34 backbone to ResNet-50. It seems I'd have to change code in a few places, such as backbone.py and registry.py, and swap dla.py for the new backbone. Am I missing anything else? Is there a guide you would recommend for this?
Thanks!

Problems Training SMOKE on custom dataset

I am training SMOKE on my own dataset with image size (1920, 1208), and I get errors in the backbone. Note that I have changed the image size in the config. Please help.
Screenshot 2021-06-22 at 1 21 21 PM

projection of 3D center not on image plane

Hello @lzccccc. Thank you for making this amazing work open source. I tried your code and found that some failure cases may be due to 3D centers that project outside the image, like the leftmost and rightmost cars in the image below. So I was wondering whether you include objects of this kind during training. If so, how do you assign keypoint labels to them?

failed_case
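
To make the question concrete, here is a minimal sketch of how one might test whether a 3D center projects inside the image. It assumes a pinhole camera with a 3x3 intrinsic matrix K and a point given in camera coordinates (x right, y down, z forward). The function names and the numeric values of K are illustrative assumptions, not taken from the SMOKE code or from a real calibration file.

```python
def project_point(K, point):
    """Project a 3D point in camera coordinates to pixel coordinates (u, v)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point is behind the camera")
    # Homogeneous projection with a 3x3 intrinsic matrix, then divide by depth.
    u = (K[0][0] * x + K[0][1] * y + K[0][2] * z) / z
    v = (K[1][0] * x + K[1][1] * y + K[1][2] * z) / z
    return u, v


def center_in_image(K, center, width, height):
    """Return True if the projected 3D center falls inside the image bounds."""
    u, v = project_point(K, center)
    return 0 <= u < width and 0 <= v < height


# Illustrative intrinsics and image size (assumed values, roughly KITTI-like).
K = [[721.5, 0.0, 609.6],
     [0.0, 721.5, 172.9],
     [0.0, 0.0, 1.0]]

print(center_in_image(K, (0.0, 1.0, 20.0), 1242, 375))   # car straight ahead
print(center_in_image(K, (-12.0, 1.0, 8.0), 1242, 375))  # car far to the left, close by
```

A check like this could be used to count how many ground-truth objects have out-of-image centers; how SMOKE itself treats them is exactly what this issue asks.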

No module named 'maskrcnn_benchmark'

I finished the setup following your instructions, but when I try to get started I get the following error:
Traceback (most recent call last):
File "tools/plain_train_net.py", line 9, in <module>
from smoke.utils.check_point import DetectronCheckpointer
File "/MONO_3D/SMOKE/smoke/utils/check_point.py", line 8, in <module>
from smoke.utils.model_zoo import cache_url
File "/MONO_3D/SMOKE/smoke/utils/model_zoo.py", line 9, in <module>
from maskrcnn_benchmark.utils.comm import is_main_process
ModuleNotFoundError: No module named 'maskrcnn_benchmark'

Do I need to install maskrcnn_benchmark?

multi-gpu training?

Does the code support multi-GPU training? I get the following error:
TypeError: zip argument #1 must support iteration
What does this error mean, and how should I solve it?

Error in Regression Loss Normalization

Hi!

First of all, I want to thank you for releasing readable, high-quality code. Unfortunately that is not common practice, and it's really appreciated. Congratulations also on the great performance this model achieves.

I am trying to integrate your model into my data pipeline, and I have found a difference between how the regression loss is normalized in the paper and in the code.

In lines 129 to 143 in the loss.py file we have:

reg_loss_ori = F.l1_loss(
    predict_boxes3d["ori"] * reg_mask,
    targets_regression * reg_mask,
    reduction="sum") / (self.loss_weight[1] * self.max_objs)

reg_loss_dim = F.l1_loss(
    predict_boxes3d["dim"] * reg_mask,
    targets_regression * reg_mask,
    reduction="sum") / (self.loss_weight[1] * self.max_objs)

reg_loss_loc = F.l1_loss(
    predict_boxes3d["loc"] * reg_mask,
    targets_regression * reg_mask,
    reduction="sum") / (self.loss_weight[1] * self.max_objs)

Here self.max_objs is 30, the maximum number of objects accepted per training sample (basically used for padding, I think). The regression losses are therefore normalized by this number. However, equation 8 in the paper indicates that the regression loss should be divided by the number of keypoints (that is, the number of ground-truth objects).

Am I right, or have I misunderstood the paper?

Thanks in advance!
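
The difference between the two normalizations can be illustrated with a small, framework-free sketch. It is not the code from loss.py: it works on plain Python lists, uses one scalar per object slot, and leaves out the loss_weight factor. The function names are hypothetical.

```python
def masked_l1_sum(pred, target, mask):
    """Sum of absolute errors over the slots where mask is 1 (padding slots are 0)."""
    return sum(m * abs(p - t) for p, t, m in zip(pred, target, mask))


def normalize_by_max_objs(total, max_objs):
    """Normalization as in the quoted code: divide by the padded slot count."""
    return total / max_objs


def normalize_by_num_objects(total, mask):
    """Normalization as the issue reads equation 8: divide by the real object count."""
    return total / max(sum(mask), 1)  # guard against samples with no objects


max_objs = 30
mask = [1, 1, 1] + [0] * (max_objs - 3)   # 3 real objects, 27 padding slots
pred = [1.0] * max_objs
target = [0.0] * max_objs

total = masked_l1_sum(pred, target, mask)
print(normalize_by_max_objs(total, max_objs))   # scales with how full the padding is
print(normalize_by_num_objects(total, mask))    # mean error per real object
```

With a fixed divisor the loss magnitude depends on how many objects a sample contains, whereas dividing by the object count gives a per-object average; which of the two the authors intended is the open question here.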

Different imagesize

Hi,

I have a dataset with a different image size. The images are larger, so I am resizing them to a smaller size. What changes need to be made for the code to work correctly? I have adjusted the intrinsic matrix to match the resized images, but the results are bad. What other changes are needed?
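
For reference, the intrinsic adjustment mentioned above is usually done by scaling the first row of K by the horizontal resize factor and the second row by the vertical one. The sketch below assumes a plain 3x3 pinhole matrix and ignores half-pixel offset conventions; the function name and the example matrix are illustrative, and it does not cover any other settings in the repository that may depend on image size.

```python
def scale_intrinsics(K, orig_size, new_size):
    """Rescale a 3x3 intrinsic matrix for an image resized from orig_size to new_size.

    Sizes are (width, height) tuples.
    """
    orig_w, orig_h = orig_size
    new_w, new_h = new_size
    sx = new_w / orig_w   # horizontal scale factor
    sy = new_h / orig_h   # vertical scale factor
    return [
        [K[0][0] * sx, K[0][1] * sx, K[0][2] * sx],  # fx, skew, cx
        [K[1][0] * sy, K[1][1] * sy, K[1][2] * sy],  # 0, fy, cy
        [K[2][0], K[2][1], K[2][2]],                 # last row is unchanged
    ]


# Illustrative matrix (assumed values), halving a 1920x1208 image.
K = [[1000.0, 0.0, 960.0],
     [0.0, 1000.0, 604.0],
     [0.0, 0.0, 1.0]]
print(scale_intrinsics(K, (1920, 1208), (960, 604)))
```

If the aspect ratio changes during resizing, sx and sy differ and the focal lengths fx and fy are no longer equal, which is worth checking when results degrade.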

Training on Google Colab

I'm sorry for breaking in with an absolute noob question.

I am trying to run your (great!) work on Google Colab. When I try to train, it seems to start training, but it never shows any information. Is that expected?

It has been running for hours in the background now and nothing has changed: it runs silently, with no output.

Thanks for any help!

Sigmoid activation on wrong regression features

Hi
Thank you for the fantastic project!
Looking at the forward function in SMOKE/blob/master/smoke/modeling/heads/smoke_head/smoke_predictor.py,
I was wondering if the sigmoid() function should be applied to features 1 and 2 (the offsets), whereas if I understand
correctly, it is currently applied to features 3, 4 and 5 (the dims).
Thanks!
Tsubu

How MAX_ITERATION is calculated in smoke_gn_vector.yaml?

Hello!

First, thanks so much for releasing the code! It has very high quality and is very tidy and concise.

I wonder how MAX_ITERATION is calculated. Is it based on the number of images in 'trainval'? In other words, 'trainval' has 7481 images, and you trained with batch_size=32 for 60 epochs. By my calculation, the total would be 7481*60/32 ≈ 14026 iterations. However, in your config MAX_ITERATION is 25000, about twice my result. Am I missing anything?

Thanks in advance!
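
The arithmetic in this question can be reproduced with a short sketch. The helper names are hypothetical, and the epoch count and batch size are the values assumed in the post, not values confirmed from the config.

```python
def iterations_for(num_images, epochs, batch_size):
    """Number of optimizer steps needed to see num_images for the given epochs."""
    return num_images * epochs / batch_size


def epochs_for(max_iterations, num_images, batch_size):
    """Number of passes over the data implied by a given iteration budget."""
    return max_iterations * batch_size / num_images


print(iterations_for(7481, 60, 32))   # 14026.875, the figure computed in the post
print(epochs_for(25000, 7481, 32))    # roughly 107 epochs over 7481 images
```

Read the other way round, 25000 iterations at batch size 32 correspond to about 107 passes over 7481 images, so either the epoch count or the image count assumed in the post would have to differ from what was actually used.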

Inference fails while using pretrained model

I followed the guidelines in README.md and downloaded the dataset, organizing it exactly as requested.
I downloaded the pretrained model and updated the file
SMOKE/configs/smoke_gn_vector.yaml so that it points to the correct path of the weights file (model_final.pth).

After that I ran:
python tools/plain_train_net.py --eval-only --config-file "configs/smoke_gn_vector.yaml"

but I keep getting:

[2020-12-13 11:01:51,428] smoke.utils.model_serialization INFO: heads.predictor.regression_head.1.weight                              loaded from heads.predictor.regression_head.1.weight                              of shape (256,)
[2020-12-13 11:01:51,429] smoke.utils.model_serialization INFO: heads.predictor.regression_head.3.bias                                loaded from heads.predictor.regression_head.3.bias                                of shape (8,)
[2020-12-13 11:01:51,429] smoke.utils.model_serialization INFO: heads.predictor.regression_head.3.weight                              loaded from heads.predictor.regression_head.3.weight                              of shape (8, 256, 1, 1)
Traceback (most recent call last):
  File "tools/plain_train_net.py", line 100, in <module>
    args=(args,),
  File "/content/drive/My Drive/Colab_Notebooks/SMOKE/smoke/engine/launch.py", line 56, in launch
    main_func(*args)
  File "tools/plain_train_net.py", line 78, in main
    _ = checkpointer.load(ckpt, use_latest=args.ckpt is None)
  File "/content/drive/My Drive/Colab_Notebooks/SMOKE/smoke/utils/check_point.py", line 60, in load
    self._load_model(checkpoint)
  File "/content/drive/My Drive/Colab_Notebooks/SMOKE/smoke/utils/check_point.py", line 96, in _load_model
    load_state_dict(self.model, checkpoint.pop("model"))
  File "/content/drive/My Drive/Colab_Notebooks/SMOKE/smoke/utils/model_serialization.py", line 78, in load_state_dict
    model.load_state_dict(model_state_dict)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 839, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for KeypointDetector:
	size mismatch for heads.predictor.class_head.3.weight: copying a param with shape torch.Size([3, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 256, 1, 1]).
	size mismatch for heads.predictor.class_head.3.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([1]).

Can someone tell me what I am doing wrong?

Thank you!

apply for random photos

Is it possible to apply this solution to random photos of cars from the Internet, even with some loss of quality?

Inference Pre-trained Model

Hello, first of all thank you for sharing this amazing work and releasing the code; it really helps me a lot.
But I have several questions I would like to ask.

  1. I tried to run the pre-trained model you provided, model_final.pth, but I got a strange result like this:

Cyclist 0 0 0.29319998621940613 1101.7052001953125 157.85940551757812 1171.0078125 199.82049560546875 1.5821000337600708 0.6988000273704529 2.05430006980896 20.817399978637695 0.7282000184059143 27.693300247192383 0.9377999901771545 0.2808000147342682
Pedestrian 0 0 -0.156700000166893 0.0 4.8867998123168945 32.05400085449219 47.18880081176758 1.6751999855041504 0.8062000274658203 1.1485999822616577 -27.59119987487793 -6.371799945831299 33.09489822387695 -0.8515999913215637 0.28049999475479126
Cyclist 0 0 0.40369999408721924 928.3699951171875 211.3343048095703 984.5086059570312 262.87060546875 1.868499994277954 0.5382999777793884 1.90339994430542 13.737299919128418 3.1115000247955322 27.609399795532227 0.8654000163078308 0.28029999136924744
Pedestrian 0 0 -0.15150000154972076 1194.6259765625 286.85760498046875 1224.0 332.6874084472656 1.631100058555603 0.6679999828338623 0.9711999893188477 24.250900268554688 5.956200122833252 28.213600158691406 0.5586000084877014 0.2802000045776367
Cyclist 0 0 1.6592999696731567 1142.3118896484375 55.76430130004883 1169.814453125 108.3541030883789 1.6448999643325806 0.6118000149726868 1.9859000444412231 20.006200790405273 -2.709199905395508 25.61370086669922 2.3224000930786133 0.2797999978065491
Cyclist 0 0 0.09459999948740005 1166.0030517578125 62.07600021362305 1224.0 112.0342025756836 1.7129000425338745 0.58160001039505 1.8250000476837158 23.007600784301758 -2.739799976348877 27.437599182128906 0.7924000024795532 0.27970001101493835
Pedestrian 0 0 -0.09070000052452087 1170.5811767578125 236.30099487304688 1204.6690673828125 286.3161926269531 1.7871999740600586 0.7777000069618225 0.9431999921798706 22.403799057006836 3.9769999980926514 27.162599563598633 0.5989999771118164 0.2784000039100647
Cyclist 0 0 -0.021299999207258224 492.5469970703125 0.0 539.0078125 40.38359832763672 1.700700044631958 0.6797000169754028 2.018399953842163 -3.909600019454956 -6.295499801635742 31.283700942993164 -0.14560000598430634 0.27799999713897705

The image is from the KITTI dataset. Do you think my output is correct? When I look at the image there are cars in it, but none of them are detected as cars.
Thank you
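
Lines like the ones above follow the KITTI object result layout: type, truncated, occluded, alpha, 2D box (left, top, right, bottom), dimensions (height, width, length), location (x, y, z), rotation_y, and score. A minimal parsing sketch, with a hypothetical function name, makes the individual fields easier to inspect:

```python
def parse_kitti_result_line(line):
    """Split one KITTI-format result line into named fields."""
    fields = line.split()
    if len(fields) != 16:
        raise ValueError(f"expected 16 fields, got {len(fields)}")
    values = [float(x) for x in fields[1:]]
    return {
        "type": fields[0],
        "truncated": values[0],
        "occluded": values[1],
        "alpha": values[2],
        "bbox": tuple(values[3:7]),         # left, top, right, bottom in pixels
        "dimensions": tuple(values[7:10]),  # height, width, length in meters
        "location": tuple(values[10:13]),   # x, y, z in camera coordinates
        "rotation_y": values[13],
        "score": values[14],
    }


# First line of the listing above.
line = ("Cyclist 0 0 0.29319998621940613 1101.7052001953125 157.85940551757812 "
        "1171.0078125 199.82049560546875 1.5821000337600708 0.6988000273704529 "
        "2.05430006980896 20.817399978637695 0.7282000184059143 "
        "27.693300247192383 0.9377999901771545 0.2808000147342682")
det = parse_kitti_result_line(line)
print(det["type"], det["score"])
```

Every line in the listing has a score of about 0.28 in its last field, so it may be worth checking how these detections relate to whatever score threshold is configured.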

How much memory is needed when training with a single GPU?

I trained the model on a single GPU with a batch size of 2 and got an out-of-memory error. I also tested with a very simple backbone containing only 4 downsampling layers, but got the same error.
My total GPU memory is 8 GB.

Visualization problem

Can the model's output coordinates be visualized directly, or is a transformation needed to display them on the original image?
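
As a rough illustration of the kind of transformation involved, the sketch below assumes the predictions are in KITTI format: dimensions (h, w, l), a location at the bottom center of the box in camera coordinates, and a rotation about the camera Y axis. It builds the eight box corners and projects them with a 3x3 intrinsic matrix, assuming zero skew and ignoring the translation column of a full KITTI projection matrix. The function names and numeric values are hypothetical and not taken from the SMOKE code.

```python
import math


def box3d_corners(dimensions, location, rotation_y):
    """Return the 8 corners of a 3D box in camera coordinates (KITTI convention)."""
    h, w, l = dimensions
    x0, y0, z0 = location
    c, s = math.cos(rotation_y), math.sin(rotation_y)
    corners = []
    for dy in (0.0, -h):  # bottom face first, then top face (y points down)
        for dx, dz in ((l / 2, w / 2), (l / 2, -w / 2),
                       (-l / 2, -w / 2), (-l / 2, w / 2)):
            # Rotate the offset about the Y axis, then translate to the box location.
            corners.append((x0 + c * dx + s * dz, y0 + dy, z0 - s * dx + c * dz))
    return corners


def project_to_image(K, corners):
    """Project 3D corners to pixel coordinates with a zero-skew pinhole model."""
    pixels = []
    for x, y, z in corners:
        if z <= 0:
            raise ValueError("corner is behind the camera")
        pixels.append((K[0][0] * x / z + K[0][2], K[1][1] * y / z + K[1][2]))
    return pixels


# Illustrative values only.
K = [[700.0, 0.0, 600.0], [0.0, 700.0, 180.0], [0.0, 0.0, 1.0]]
corners = box3d_corners((1.5, 1.6, 4.0), (0.0, 1.5, 10.0), 0.0)
for u, v in project_to_image(K, corners):
    print(round(u, 1), round(v, 1))
```

The resulting pixel coordinates refer to the image the intrinsics belong to, so if the image was resized or cropped before inference, the same change has to be reflected in K before drawing.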

Error in reading ground-truth labels while evaluating pre-trained model

Thanks a lot for making your code open source, and for fostering further research in this field!

I downloaded the pre-trained model that is linked here, and I also used the ImageSets as described in issue #3, placing them in both the training and testing data folders.
When running evaluation only on the pre-trained model, using the kitti eval folder provided here as referenced in issue #4, I got this error:

ERROR: Couldn't read: 004627.txt of ground truth. Please write me an email! An error occurred while processing your results

Could you please tell me how this error came about? And how I can fix it?
