maudzung / rtm3d

Unofficial PyTorch implementation of "RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving" (ECCV 2020)

Home Page: https://arxiv.org/pdf/2001.03343.pdf

License: MIT License

Python 99.79% Shell 0.21%

rtm3d real-time monocular-images centernet pytorch-implementation 3d-object-detection autonomous-vehicles self-driving-car autonomous-driving pytorch

rtm3d's Issues

Data organization

Hello, the data organization isn't described in README.md.
Could you please add that section?

PRE-TRAINED WEIGHTS

Hey, can you share your pre-trained weights?
My email: [email protected]

pretrained weights

Hey,
why can't you provide pretrained weights?

About config.subdivisions

I wonder why config.subdivisions needs to be added to the training process. Could you please explain how it works?

Implementation of the 3D Bounding Box Estimation (3.2 part in the paper)

It should be argmin instead of argmax in formula (7).

This part can be solved by using the g2opy library.
Contributions to the implementation of this part are welcome!
Thanks!

when could you provide a trained model

CUDA runtime error when changing gpu_idx from 0 to another number

If I change gpu_idx from 0 to another number (for example, gpu_idx 5) in train.sh, I get the error shown below. If I set gpu_idx to 0, the script runs normally. I am confused by this bug.

THCudaCheck FAIL file=/pytorch/torch/csrc/cuda/Module.cpp line=59 error=101 : invalid device ordinal
Traceback (most recent call last):
torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (101) : invalid device ordinal at /pytorch/torch/csrc/cuda/Module.cpp:59
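
CUDA error 101 (invalid device ordinal) is raised when the requested index is not smaller than the number of devices visible to the process, for example when the machine has fewer GPUs than expected or when CUDA_VISIBLE_DEVICES hides some of them (the visible devices are then renumbered from 0). A minimal sketch of an early check, assuming the script can obtain the visible device count (in PyTorch, typically from torch.cuda.device_count()); validate_gpu_idx is a hypothetical helper, not part of the repo:

```python
def validate_gpu_idx(gpu_idx: int, visible_device_count: int) -> int:
    """Return gpu_idx if it addresses a visible CUDA device, else raise.

    visible_device_count is an assumption of this sketch: in a PyTorch
    script it would usually be torch.cuda.device_count().
    """
    if not 0 <= gpu_idx < visible_device_count:
        raise ValueError(
            f"gpu_idx={gpu_idx} is invalid: only {visible_device_count} "
            "CUDA device(s) are visible to this process"
        )
    return gpu_idx
```

Calling such a check before selecting the device turns the low-level runtime error into a message that names the actual device count.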

Only 2D Bounding Box when test

Hi ,
Is it that the repo hasn't finished the 3D bounding box part? I found that there are only 2D bounding boxes in the test phase.

Thank you!

Pretrained Weights

Hi,

Can you please share the pretrained weights?

Thanks,
Harel

How to train on a custom dataset

Nice job, can you tell me how to train on a custom dataset with RTM3D? Thanks!

About soft weight

Sorry, I have a question about kfpn(F.softmax(outs, dim=-1)): is the softmax not applied along the channel dimension?

need your files

Hello author, can you provide your pre-trained weights and the evaluate.py file?
This is my email: [email protected]. Thank you very much!

Loss nan

Thanks for the great reproduction! I got a NaN loss at about half of the total number of epochs. Have you run into this before?

Epoch: [127/300]
2020-08-17 21:48:47,912: logger.py - info(), at Line 49:INFO:
Train - Epoch: [127/300][ 35/464] Time 0.738 ( 0.738) Data 0.000 ( 0.014) Loss 2.1274e+00 (1.5406e+00)
2020-08-17 21:49:24,172: logger.py - info(), at Line 49:INFO:
Train - Epoch: [127/300][ 85/464] Time 0.730 ( 0.730) Data 0.000 ( 0.006) Loss 1.2996e+00 (1.5485e+00)
2020-08-17 21:50:00,407: logger.py - info(), at Line 49:INFO:
Train - Epoch: [127/300][135/464] Time 0.750 ( 0.728) Data 0.000 ( 0.004) Loss 2.1689e+00 (1.5462e+00)
2020-08-17 21:50:36,613: logger.py - info(), at Line 49:INFO:
Train - Epoch: [127/300][185/464] Time 0.727 ( 0.727) Data 0.000 ( 0.003) Loss 1.5142e+00 (1.5394e+00)
2020-08-17 21:51:12,840: logger.py - info(), at Line 49:INFO:
Train - Epoch: [127/300][235/464] Time 0.732 ( 0.727) Data 0.000 ( 0.002) Loss 1.5060e+00 (1.5328e+00)
2020-08-17 21:51:49,090: logger.py - info(), at Line 49:INFO:
Train - Epoch: [127/300][285/464] Time 0.723 ( 0.726) Data 0.000 ( 0.002) Loss 1.4645e+00 (1.5327e+00)
2020-08-17 21:52:25,279: logger.py - info(), at Line 49:INFO:
Train - Epoch: [127/300][335/464] Time 0.731 ( 0.726) Data 0.000 ( 0.002) Loss 1.5955e+00 (1.5390e+00)
2020-08-17 21:53:01,426: logger.py - info(), at Line 49:INFO:
Train - Epoch: [127/300][385/464] Time 0.732 ( 0.726) Data 0.000 ( 0.002) Loss 1.5102e+00 (1.5416e+00)
2020-08-17 21:53:37,567: logger.py - info(), at Line 49:INFO:
Train - Epoch: [127/300][435/464] Time 0.725 ( 0.725) Data 0.000 ( 0.001) Loss 1.6077e+00 (1.5410e+00)
2020-08-17 21:53:57,834: logger.py - info(), at Line 49:INFO:
----------------------------------------
2020-08-17 21:53:57,834: logger.py - info(), at Line 49:INFO:
=================================== 128/300 ===================================
2020-08-17 21:53:57,834: logger.py - info(), at Line 49:INFO:
----------------------------------------
2020-08-17 21:53:57,834: logger.py - info(), at Line 49:INFO:
Epoch: [128/300]
2020-08-17 21:54:14,399: logger.py - info(), at Line 49:INFO:
Train - Epoch: [128/300][ 21/464] Time 0.730 ( 0.753) Data 0.000 ( 0.031) Loss 1.5503e+00 (1.5762e+00)
2020-08-17 21:54:50,617: logger.py - info(), at Line 49:INFO:
Train - Epoch: [128/300][ 71/464] Time 0.734 ( 0.733) Data 0.000 ( 0.010) Loss 1.6927e+00 (1.5937e+00)
2020-08-17 21:55:26,919: logger.py - info(), at Line 49:INFO:
Train - Epoch: [128/300][121/464] Time 0.737 ( 0.730) Data 0.000 ( 0.006) Loss 1.4969e+00 (1.5841e+00)
2020-08-17 21:56:03,156: logger.py - info(), at Line 49:INFO:
Train - Epoch: [128/300][171/464] Time 0.736 ( 0.729) Data 0.000 ( 0.004) Loss 1.6823e+00 (1.5828e+00)
2020-08-17 21:56:39,368: logger.py - info(), at Line 49:INFO:
Train - Epoch: [128/300][221/464] Time 0.728 ( 0.728) Data 0.000 ( 0.003) Loss 1.4616e+00 (1.5912e+00)
2020-08-17 21:57:15,445: logger.py - info(), at Line 49:INFO:
Train - Epoch: [128/300][271/464] Time 0.725 ( 0.726) Data 0.000 ( 0.003) Loss nan (nan)
2020-08-17 21:57:51,119: logger.py - info(), at Line 49:INFO:
Train - Epoch: [128/300][321/464] Time 0.711 ( 0.724) Data 0.000 ( 0.002) Loss nan (nan)
2020-08-17 21:58:26,710: logger.py - info(), at Line 49:INFO:
Train - Epoch: [128/300][371/464] Time 0.725 ( 0.723) Data 0.000 ( 0.002) Loss nan (nan)
2020-08-17 21:59:02,274: logger.py - info(), at Line 49:INFO:
Train - Epoch: [128/300][421/464] Time 0.711 ( 0.721) Data 0.000 ( 0.002) Loss nan (nan)
2020-08-17 21:59:32,180: logger.py - info(), at Line 49:INFO:
----------------------------------------
2020-08-17 21:59:32,180: logger.py - info(), at Line 49:INFO:
=================================== 129/300 ===================================
2020-08-17 21:59:32,180: logger.py - info(), at Line 49:INFO:
----------------------------------------
2020-08-17 21:59:32,180: logger.py - info(), at Line 49:INFO:
Epoch: [129/300]
2020-08-17 21:59:38,533: logger.py - info(), at Line 49:INFO:
Train - Epoch: [129/300][ 7/464] Time 0.731 ( 0.794) Data 0.000 ( 0.088) Loss nan (nan)
2020-08-17 22:00:14,077: logger.py - info(), at Line 49:INFO:
Train - Epoch: [129/300][ 57/464] Time 0.715 ( 0.722) Data 0.000 ( 0.012) Loss nan (nan)
2020-08-17 22:00:49,770: logger.py - info(), at Line 49:INFO:
Train - Epoch: [129/300][107/464] Time 0.706 ( 0.718) Data 0.000 ( 0.007) Loss nan (nan)
2020-08-17 22:01:25,436: logger.py - info(), at Line 49:INFO:
Train - Epoch: [129/300][157/464] Time 0.726 ( 0.717) Data 0.000 ( 0.005) Loss nan (nan)
2020-08-17 22:02:01,175: logger.py - info(), at Line 49:INFO:
Train - Epoch: [129/300][207/464] Time 0.726 ( 0.716) Data 0.000 ( 0.004) Loss nan (nan)
2020-08-17 22:02:36,896: logger.py - info(), at Line 49:INFO:
Train - Epoch: [129/300][257/464] Time 0.716 ( 0.716) Data 0.000 ( 0.003) Loss nan (nan)
2020-08-17 22:03:12,634: logger.py - info(), at Line 49:INFO:
Train - Epoch: [129/300][307/464] Time 0.717 ( 0.716) Data 0.000 ( 0.003) Loss nan (nan)
2020-08-17 22:03:48,380: logger.py - info(), at Line 49:INFO:
Train - Epoch: [129/300][357/464] Time 0.725 ( 0.716) Data 0.000 ( 0.002) Loss nan (nan)
2020-08-17 22:04:24,092: logger.py - info(), at Line 49:INFO:
Train - Epoch: [129/300][407/464] Time 0.719 ( 0.715) Data 0.000 ( 0.002) Loss nan (nan)
2020-08-17 22:04:59,779: logger.py - info(), at Line 49:INFO:
Train - Epoch: [129/300][457/464] Time 0.691 ( 0.715) Data 0.000 ( 0.002) Loss nan (nan)
2020-08-17 22:05:04,096: logger.py - info(), at Line 49:INFO:
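
To narrow down where the loss diverges, the log itself can be scanned for the first non-finite value. A minimal sketch, assuming the line format shown in the log above; first_nonfinite_step and LINE_RE are hypothetical names introduced for illustration:

```python
import math
import re

# Matches lines such as:
# Train - Epoch: [128/300][271/464] Time ... Loss nan (nan)
LINE_RE = re.compile(
    r"Train - Epoch: \[\s*(\d+)/\d+\]\[\s*(\d+)/\d+\].*Loss\s+(\S+)\s+\("
)


def first_nonfinite_step(log_lines):
    """Return (epoch, step) of the first logged NaN/inf loss, or None."""
    for line in log_lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # timestamps, separators, and other non-training lines
        loss = float(match.group(3))  # float() parses "nan" and "1.46e+00"
        if not math.isfinite(loss):
            return int(match.group(1)), int(match.group(2))
    return None
```

In the log above, step 221 of epoch 128 is still finite and step 271 is NaN, so the divergence happened somewhere in between; the log only records every 50th step.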

Error(s) in loading state_dict for PoseResNet:

When I test the model:
Error(s) in loading state_dict for PoseResNet:
Missing key(s) ... ...
Unexpected key(s) ... ...
What should I do to solve this problem?
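
The key names are elided in the report, so the cause cannot be determined from it. One frequent cause of simultaneous missing and unexpected keys in PyTorch is a checkpoint saved from a model wrapped in nn.DataParallel, whose parameter names carry a "module." prefix; an architecture mismatch (for example a different number of layers or heads) is another. A minimal sketch for the first case only, assuming that is what happened here; strip_key_prefix is a hypothetical helper and the key names in the usage note are made up for illustration:

```python
def strip_key_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with `prefix` removed from matching keys.

    Assumption: the checkpoint was saved from an nn.DataParallel-wrapped
    model, so every key looks like "module.<original name>".
    """
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }
```

The cleaned dictionary would then be passed to the model's load_state_dict, e.g. a key such as "module.conv1.weight" becomes "conv1.weight". If the keys differ in some other way, comparing the two key sets directly is the quickest way to see the mismatch.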

pre-trained model

Hi guys,
thanks for your great work. I want to run a video test with the pre-trained model, so when will you release it, or do you have any other plans? Thanks a lot.

evaluate.py

I cannot find this file.

AP on KITTI

Hi, could you provide the accuracy on the KITTI val set?

About the robustness and portability of monocular 3D models

Monocular 3D detection depends on the camera parameters. If you switch to a different camera or mounting setup, a model trained on the original dataset will no longer work. How can this difference be handled?
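
The dependency can be seen in the pinhole projection itself: the same 3D point lands on different pixels when the intrinsics change, so keypoint-to-3D relations learned for one camera do not carry over unchanged. A minimal sketch, assuming an ideal pinhole camera without distortion; the intrinsic values below are hypothetical and not those of any particular dataset:

```python
def project_point(point_xyz, fx, fy, cx, cy):
    """Project a camera-frame 3D point (metres) to pixel coordinates."""
    x, y, z = point_xyz
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return fx * x / z + cx, fy * y / z + cy


corner = (2.0, 1.0, 20.0)  # hypothetical 3D box corner, 20 m ahead

# Same point, two hypothetical cameras that differ only in focal length.
pixel_a = project_point(corner, fx=700.0, fy=700.0, cx=600.0, cy=180.0)
pixel_b = project_point(corner, fx=1400.0, fy=1400.0, cx=600.0, cy=180.0)
print(pixel_a, pixel_b)  # (670.0, 215.0) (740.0, 250.0)
```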

3D visualization

Can you tell me how you visualize live video?

No FPN in backbone network?

Is it correct that no FPN structure is used in this network model?
How does the model handle occlusion?

Thanks a lot.

Suggest to loosen the dependency on albumentations

Hi, your project RTM3D (commit id: 0bd3868) requires "albumentations==0.4.5" as a dependency. After analyzing the source code, we found that albumentations 0.4.0, 0.4.1, 0.4.2, 0.4.3, and 0.4.4 are also suitable: all functions from the package that you use directly (3 APIs: albumentations.augmentations.transforms.RandomBrightnessContrast.__init__, albumentations.augmentations.transforms.GaussNoise.__init__, albumentations.core.composition.Compose.__init__) or indirectly (propagating to 12 of albumentations' internal APIs and 0 outside APIs) are unchanged in these versions, so your usage is not affected.

Therefore, we believe that it is quite safe to loosen your dependency on albumentations from "albumentations==0.4.5" to "albumentations>=0.4.0,<=0.4.5". This will improve the applicability of RTM3D and reduce the chance of dependency conflicts with other projects.
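
As an illustration of what the proposed specifier accepts, here is a minimal sketch of the range check, assuming purely numeric dotted version strings; parse_version and in_proposed_range are hypothetical helpers, not part of pip or albumentations:

```python
def parse_version(version):
    """Turn "0.4.3" into (0, 4, 3); only numeric dotted versions are handled."""
    return tuple(int(part) for part in version.split("."))


def in_proposed_range(version, low="0.4.0", high="0.4.5"):
    """True if version satisfies >=low,<=high (the range proposed above)."""
    return parse_version(low) <= parse_version(version) <= parse_version(high)
```

With these bounds, 0.4.0 through 0.4.5 are accepted, while 0.3.x and 0.5.x releases are rejected.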

May I open a pull request to loosen the dependency on albumentations?

By the way, could you please tell us whether such an automatic dependency-analysis tool would be helpful for making dependency maintenance easier during your development?

What do "Model_rtm3d_epoch_120.pth" and "Utils_rtm3d_epoch_120.pth" mean?

Hello, I trained the network successfully, and "Model_rtm3d_epoch_120.pth" and "Utils_rtm3d_epoch_120.pth" appeared in "RTM3D/checkpoints/rtm3d". What do they represent?

Why not use center point of 3d bbox?

Hi, @maudzung
In the RTM3D paper, the authors use the center point of the 3D bbox.
However, this repo seems to compute only the center-offset loss. Why not add a loss for the center point?
Leaving this loss out may hurt performance.
Looking forward to your reply.

Some problems when showing the 3D box

I have trained this model and tried to show the detection results on an image. The 2D bounding boxes look great, but the 3D bounding boxes look bad. I simply draw the 8 points of each 3D bounding box and find that some objects share the same point. Have you run into the same problem?

About a fix in "compute_radius" (kitti_data_utils.py)

There is an old mis-implementation problem in CornerNet, which "has been fixed" in CenterNet: Duankaiwen/CenterNet#47.

By default, this repo builds the ground-truth heatmap following CenterNet, which in turn follows CornerNet.

The intuition behind CornerNet's setting is: "We determine the radius by the size of an object by ensuring that a pair of points within the radius would generate a bounding box with at least t IoU with the ground-truth annotation" (t=0.3 for CornerNet).
The original implementation is odd given this idea. Empirically and theoretically, the only case that gives the smallest IoU with the original bbox is the shrunk box of area (w - 2r)(h - 2r). The corresponding equation is (w - 2r)(h - 2r) / (w h) = min_iou, which yields the coefficients of r2 in "compute_radius". No "min(r1, r2, r3)" is actually needed. The fix minimizes the code changes and does not try to fix this part further.
The quadratic equation has two solutions: one with (w - 2r) < 0 && (h - 2r) < 0, the other with (w - 2r) > 0 && (h - 2r) > 0. The latter is the one we need. It always corresponds to the solution (b2 - sq2) / (2 * a2), while the first solution is the r with larger magnitude, (b2 + sq2) / (2 * a2). This has been fixed.
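
Written out, the argument is a short derivation. This is a sketch that uses t for min_iou and takes b as the positive quantity 2(h + w), matching b2 in "compute_radius":

```latex
\frac{(w-2r)(h-2r)}{wh} = t
\;\Longrightarrow\;
4r^{2} - 2(h+w)\,r + (1-t)\,wh = 0,
\qquad
r_{\pm} = \frac{b \pm \sqrt{b^{2}-4ac}}{2a},
\quad a = 4,\; b = 2(h+w),\; c = (1-t)\,wh .

% The discriminant is non-negative:
b^{2}-4ac = 4\left[(h-w)^{2} + 4t\,wh\right] \ge 4\,(h-w)^{2},
\qquad\text{so}\qquad
2r_{-} \le \min(w,h), \quad 2r_{+} \ge \max(w,h).
```

Hence the smaller root keeps both (w - 2r) and (h - 2r) non-negative, while the larger root makes both non-positive.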

I have not thoroughly tested the performance because of hardware limitations, but a possible fix could be:

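import numpy as np

def compute_radius(det_size, min_overlap=0.7):
    """Radius r such that the box shrunk by r on every side keeps min_overlap IoU."""
    height, width = det_size

    # Coefficients of 4*r^2 - 2*(h + w)*r + (1 - min_overlap)*w*h = 0.
    # b2 stores the negated linear coefficient, so the roots are (b2 +/- sq2) / (2 * a2).
    a2 = 4
    b2 = 2 * (height + width)
    c2 = (1 - min_overlap) * width * height
    sq2 = np.sqrt(b2 ** 2 - 4 * a2 * c2)
    # Smaller root: the one with (w - 2r) > 0 and (h - 2r) > 0.
    r2 = (b2 - sq2) / (2 * a2)

    return r2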

I would also point out that this will not necessarily give better performance, because the original idea from CornerNet does not necessarily transfer well to RTM3D.
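
The proposed radius can be checked numerically: shrinking the box by r on every side should leave exactly min_overlap of the original area. A minimal sketch that restates the proposed function with math.sqrt instead of np.sqrt so it runs without NumPy; shrunk_box_iou is a hypothetical helper and the 40 x 60 box is an arbitrary example:

```python
import math


def compute_radius(det_size, min_overlap=0.7):
    """Same formula as the proposed fix, using math.sqrt (an assumption)."""
    height, width = det_size
    a2 = 4
    b2 = 2 * (height + width)
    c2 = (1 - min_overlap) * width * height
    sq2 = math.sqrt(b2 ** 2 - 4 * a2 * c2)
    return (b2 - sq2) / (2 * a2)


def shrunk_box_iou(det_size, radius):
    """IoU between a box and the same box shrunk by `radius` on every side."""
    height, width = det_size
    return (width - 2 * radius) * (height - 2 * radius) / (width * height)


box = (40.0, 60.0)  # (height, width), arbitrary example
radius = compute_radius(box, min_overlap=0.7)

# The shrunk box keeps 70% of the area, and the radius stays inside the box.
assert math.isclose(shrunk_box_iou(box, radius), 0.7)
assert 0 < 2 * radius < min(box)
```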
