halfsummer11 / captra

[ICCV 2021, Oral] Official PyTorch implementation of CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds

Home Page: https://yijiaweng.github.io/CAPTRA

Python 88.10% Shell 4.04% C++ 2.34% Cuda 4.92% C 0.60%

captra's People

Contributors: halfsummer11, supern1ck

captra's Issues

Camera Parameters

Hi, thanks for your work. I am wondering how to obtain the camera parameters of the BMVC dataset, or how to get the segmentation masks for it.
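If the question is ultimately about recovering a point cloud from the BMVC depth images, the standard pinhole back-projection may help. This is a generic sketch; the function and parameter names are illustrative and not from the CAPTRA codebase:

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (in meters) into a camera-frame point cloud.

    fx, fy, cx, cy are pinhole intrinsics; invalid pixels are assumed to
    carry zero depth and are dropped.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]  # keep only pixels with valid depth

# toy example: a 2x2 depth map with one invalid pixel
depth = np.array([[1.0, 1.0], [0.0, 2.0]])
pts = depth_to_points(depth, fx=500.0, fy=500.0, cx=1.0, cy=1.0)
print(pts.shape)  # (3, 3): three valid pixels
```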

Question about the experimental results

Hello! I noticed that some of the results in Table 1 seem to differ from those reported in the original papers.
For example, the NOCS results match the numbers reported in the 6-PACK paper but differ from those given in the NOCS paper itself; the 6-PACK results also differ from their original paper, as do the CASS results.
Were the results in Table 1 recomputed by you? If so, how does the computation differ from the original NOCS evaluation? Or did you use the adjusted ground-truth poses provided by 6-PACK?

Looking forward to your reply.

A typo in func eval_single_part_iou()?

In pose_utils/bbox_utils.py, line 175 (link): shouldn't "pred_pose['rotation']" be "gt_pose['rotation']" here?

cur_pose['rotation'] = torch.matmul(pred_pose['rotation'], y_rotation_matrix(2 * np.pi * i / float(n)).reshape(1, 1, 3, 3))
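For context, this line sweeps candidate rotations around the symmetry axis so that the bbox IoU for rotation-symmetric categories (e.g. bottles) can be taken as the best over all candidates. A minimal NumPy sketch of the same idea, with an illustrative re-implementation of `y_rotation_matrix` (not the repo's own helper):

```python
import numpy as np

def y_rotation_matrix(theta):
    """Rotation by theta (radians) about the y axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def symmetry_candidates(rotation, n=12):
    """Sweep n rotations about the symmetry (y) axis and return all
    candidate rotations; a symmetry-aware IoU then takes the max over
    these candidates."""
    return [rotation @ y_rotation_matrix(2 * np.pi * i / n) for i in range(n)]

cands = symmetry_candidates(np.eye(3), n=4)
# the i=1 candidate is a 90-degree rotation about y
print(np.round(cands[1], 3))
```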

Question about the Performance on NOCS-REAL275 (Tab 1 & Tab 8)

Thanks for sharing this great work.

I have some questions related to the performance on NOCS-REAL275 (Tab 1 & Tab 8)

Q1. Can I ask why the reported CASS performance at 5°, 5cm differs from the original CASS paper (23.5 vs. 29.44)? Can you explain how you obtained these results, including the mIoU, R_err, and T_err metrics? Also, the reported 6-PACK performance at 5°, 5cm is worse than in the original paper (33.3 vs. 28.92).

Q2. Can you explain the Oracle ICP baseline, and how it differs from the ICP used in 6-PACK? The 6-PACK paper also reported ICP results, and there ICP performs close to NOCS on the 5°, 5cm metric. Can you explain why your Oracle ICP is worse than NOCS?

Q3. As I understand it, your method applies the same random noise (scale = 0.02, rot = 5°, trans = 3cm) to the ground-truth initial pose during both training and inference. Have you tried different parameter settings (e.g., rot = 5°, trans = 5cm)? Also, can you point to where the rotation noise is implemented? I only found the translation and scale noise, in the code below.

if perturb_cfg is not None:
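For reference, a full pose perturbation of the kind described (rotation, translation, and scale noise applied to a ground-truth initial pose) could be sketched as follows. This is an illustrative NumPy re-implementation under the stated noise magnitudes, not the repo's actual code:

```python
import numpy as np

def random_rotation(max_deg=5.0, rng=np.random):
    """Rodrigues rotation about a random axis, with angle <= max_deg."""
    axis = rng.normal(size=3)
    axis /= np.linalg.norm(axis)
    angle = np.deg2rad(rng.uniform(-max_deg, max_deg))
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])  # skew-symmetric cross-product matrix
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

def perturb_pose(R, t, s, rot_deg=5.0, trans=0.03, scale=0.02, rng=np.random):
    """Jitter an (R, t, s) pose: compose a small random rotation,
    add uniform translation noise, and scale by a factor near 1."""
    R_new = random_rotation(rot_deg, rng) @ R
    t_new = t + rng.uniform(-trans, trans, size=3)
    s_new = s * (1.0 + rng.uniform(-scale, scale))
    return R_new, t_new, s_new

rng = np.random.default_rng(0)
R, t, s = perturb_pose(np.eye(3), np.zeros(3), 1.0, rng=rng)
print(np.allclose(R @ R.T, np.eye(3)))  # True: still a valid rotation
```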

Normalized Corner

Hello. How is this normalized bounding box calculated? Is the normalization scale computed per instance, or across all objects in the dataset?
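For what it's worth, one common convention (an assumption here, not necessarily what CAPTRA uses) normalizes per instance, dividing the tight bounding box by its diagonal so that the model fits inside a unit sphere:

```python
import numpy as np

def normalized_corner(points):
    """Per-instance normalized bbox corner: half the tight bbox extent
    divided by the bbox diagonal, so the normalized model fits in a
    unit sphere centered at the origin."""
    extent = points.max(axis=0) - points.min(axis=0)  # per-axis size
    diag = np.linalg.norm(extent)                     # normalization scale
    return 0.5 * extent / diag

# a 2 x 1 x 1 box has diagonal sqrt(6)
pts = np.array([[0, 0, 0], [2, 1, 1]], dtype=float)
print(normalized_corner(pts))  # ~ [0.408 0.204 0.204]
```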

Question about training model

Hi, I am sorry to bother you again. There is still a problem when I use my own trained model for testing:
(error screenshot)
So I added "sys.setrecursionlimit(1000000)" in "test.py", then a new error occurred:
(error screenshot)
But when I use your pre-trained model for testing and evaluation, it works fine. I found that your pre-trained models are about 8.9 MB and 8.0 MB for 1_bottle_rot and 1_bottle_coordnet respectively, while my trained models are about 26.5 MB and 21.2 MB. How can I obtain a model like your pre-trained 'model_0000.pt'? (I did not change any code except 'epoch'.)
Looking forward to your reply!
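One common cause of such a size gap is that the training loop saves the optimizer state alongside the weights, while the released checkpoints contain weights only. That is just a guess, illustrated here with a generic PyTorch sketch (the toy net stands in for CoordinateNet/RotationNet):

```python
import io
import torch
import torch.nn as nn

# Toy net and one Adam step, to populate the optimizer's moment buffers.
net = nn.Linear(128, 128)
opt = torch.optim.Adam(net.parameters())
opt.zero_grad()
net(torch.randn(4, 128)).sum().backward()
opt.step()

def size_of(obj):
    """Serialized size in bytes of an arbitrary object via torch.save."""
    buf = io.BytesIO()
    torch.save(obj, buf)
    return buf.tell()

weights_only = size_of(net.state_dict())
full_ckpt = size_of({'model': net.state_dict(), 'optimizer': opt.state_dict()})
# Adam keeps two extra tensors per parameter, so the full checkpoint is
# roughly 3x the bare weights -- one plausible cause of 26.5 MB vs 8.9 MB.
print(full_ckpt > 2 * weights_only)  # True
```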

Train on more GPUs

Hi authors! What wonderful work!
I've found that training takes quite a long time, so I'm wondering whether I can use something like 'torch.nn.DataParallel' to train on multiple GPUs?
Looking forward to your reply!
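For reference, the generic way to try this is to wrap the model before moving it to the device; this is standard PyTorch usage, not a tested CAPTRA configuration:

```python
import torch
import torch.nn as nn

# DataParallel splits each input batch across all visible GPUs and
# gathers the outputs back on the primary device.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 3))
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # replicas on cuda:0, cuda:1, ...
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = model.to(device)

out = model(torch.randn(8, 64, device=device))
print(out.shape)  # torch.Size([8, 3])
```

Note that checkpoints saved from a `DataParallel` model carry a `module.` prefix on every state-dict key, which matters when reloading into an unwrapped model.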

GPU setup

Hello, first of all thank you for open-sourcing your code!
I wonder if it is possible to use more GPUs to reduce training time.
Thank you!

Visualization results

Does the repository currently include code for visualizing the final predictions on images and video streams?
(I ask because I noticed the vis_utils.py file.)

Question about testing frames in NOCS-REAL275

Hello, authors!
Thank you for your great work!
In your paper, you mention that the testing split of NOCS-REAL275 contains 3,200 frames in total. However, the published dataset I downloaded contains only 2,745 frames. Could you please tell me where the extra data comes from?
Looking forward to your reply.

GPU setup and training time

Hello, thank you for releasing such a wonderful work!
Recently I have been reading the paper and studying the code, but I have several questions about the training setup.

  1. What GPU setup did you use, and how long does it take to train on the NOCS dataset and the SAPIEN dataset respectively? I tried training RotationNet on a single 3090 GPU, and it takes ~2 hours per category on the NOCS dataset, which seems very time-consuming.
  2. How many epochs did you train CoordinateNet and RotationNet for on each of the two datasets?

Alignment of GT points from SAPIEN

Hi, thanks for sharing the work!

I'm developing a new model based on your code. I'm trying to transform the NPCS point cloud into the camera point cloud using the ground-truth poses, but I found a slight misalignment in the dataset. Could you help me check whether my transformation is correct?

_input = self.feed_dict[i]["points"].squeeze(0).detach().cpu().numpy().transpose()
_cam_pts_center = self.feed_dict[i]['points_mean'].squeeze(0).detach().cpu().numpy().transpose()
_cam_pts = _input + _cam_pts_center
np.savetxt("../debug/input_pts.txt", _cam_pts)

# transform gt NPCS coordinates into the camera frame: cam = s * R @ npcs + t
_R = gt_part["rotation"].squeeze(0).detach().cpu().numpy()  # K,3,3
_t = gt_part["translation"].squeeze(0).detach().cpu().numpy()  # K,3,1
_s = gt_part["scale"].squeeze(0).detach().cpu().numpy()  # K

K = _R.shape[0]
for k in range(K):
    _m = _gt_labels == k
    _part_gt_npcs = _gt_npcs[_m].T
    _part_npcs_in_cam = ((_s[k] * _R[k] @ _part_gt_npcs) + _t[k]).T
    np.savetxt(f"../debug/part_{k}_cam.txt", _part_npcs_in_cam)

Then I get a point cloud like this:

(two screenshots of the resulting point clouds)

The green points are the camera point cloud, and the other colors are the points transformed from the NPCS point cloud using the ground-truth poses.

Thanks
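As a sanity check on the convention used in the snippet above (camera = s · R · npcs + t), a quick round trip in NumPy confirms that the forward and inverse transforms agree; the pose here is random and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
npcs = rng.uniform(-0.5, 0.5, size=(3, 100))  # 3 x N normalized coordinates

# an arbitrary valid pose: orthogonal rotation via QR, translation, scale
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))
t = rng.normal(size=(3, 1))
s = 1.7

cam = s * (R @ npcs) + t            # forward: NPCS -> camera frame
npcs_back = R.T @ (cam - t) / s     # inverse: camera frame -> NPCS
print(np.allclose(npcs, npcs_back))  # True
```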

Question about the training process!

Hi, I have trained RotationNet for 30 epochs and CoordinateNet for 20 epochs on the bottle category of the NOCS dataset. However, during training, many losses came out as "nan", as follows:
RotationNet:
(training log screenshot)
CoordNet:
(training log screenshot)
What causes this? Will it affect the trained model?
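Independent of the root cause, it can help to catch non-finite losses before they reach `backward()`, since a single nan gradient step can corrupt all subsequent weights. A generic sketch, not taken from the repo:

```python
import math
import torch

def check_losses(loss_dict):
    """Return the subset of losses that are nan or inf, so the training
    loop can skip the step (or crash loudly) instead of backpropagating."""
    return {k: v for k, v in loss_dict.items()
            if not math.isfinite(float(v))}

losses = {'rot_loss': torch.tensor(0.12),
          'coord_loss': torch.tensor(float('nan'))}
bad = check_losses(losses)
print(sorted(bad))  # ['coord_loss']
```

`torch.autograd.set_detect_anomaly(True)` and gradient clipping are also common ways to localize and mitigate nan losses.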
