halfsummer11 / captra

[ICCV 2021, Oral] Official PyTorch implementation of CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds

Home Page: https://yijiaweng.github.io/CAPTRA

Python 88.10% Shell 4.04% C++ 2.34% Cuda 4.92% C 0.60%

captra's People

Contributors: halfsummer11, supern1ck

captra's Issues

Camera Parameters

Hi, thanks for your work. I am wondering how to obtain the camera parameters of the BMVC dataset, or how to get the segmentation masks for it.
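If the question is ultimately about recovering a point cloud from the BMVC depth images, the standard pinhole back-projection may help. This is a generic sketch; the function and parameter names are illustrative and not from the CAPTRA codebase:

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (in meters) into a camera-frame point cloud.

    fx, fy, cx, cy are pinhole intrinsics; invalid pixels are assumed to
    carry zero depth and are dropped.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]  # keep only pixels with valid depth

# toy example: a 2x2 depth map with one invalid pixel
depth = np.array([[1.0, 1.0], [0.0, 2.0]])
pts = depth_to_points(depth, fx=500.0, fy=500.0, cx=1.0, cy=1.0)
print(pts.shape)  # (3, 3): three valid pixels
```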

Question about the experimental results

Hello! I noticed that some of the results in Table 1 seem to differ from those reported in the original papers.
For example, the NOCS results match the numbers reported in the 6-PACK paper but differ from those given in the NOCS paper itself; the 6-PACK results also differ from their original paper, as do the CASS results.
Were the results in Table 1 recomputed by you? If so, how does the computation differ from the original NOCS evaluation? Or did you use the adjusted ground-truth poses provided by 6-PACK?

Looking forward to your reply.

A typo in func eval_single_part_iou()?

In pose_utils/bbox_utils.py, line 175 (link): shouldn't "pred_pose['rotation']" be "gt_pose['rotation']" here?

cur_pose['rotation'] = torch.matmul(pred_pose['rotation'], y_rotation_matrix(2 * np.pi * i / float(n)).reshape(1, 1, 3, 3))
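For context, this line sweeps candidate rotations around the symmetry axis so that the bbox IoU for rotation-symmetric categories (e.g. bottles) can be taken as the best over all candidates. A minimal NumPy sketch of the same idea, with an illustrative re-implementation of `y_rotation_matrix` (not the repo's own helper):

```python
import numpy as np

def y_rotation_matrix(theta):
    """Rotation by theta (radians) about the y axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def symmetry_candidates(rotation, n=12):
    """Sweep n rotations about the symmetry (y) axis and return all
    candidate rotations; a symmetry-aware IoU then takes the max over
    these candidates."""
    return [rotation @ y_rotation_matrix(2 * np.pi * i / n) for i in range(n)]

cands = symmetry_candidates(np.eye(3), n=4)
# the i=1 candidate is a 90-degree rotation about y
print(np.round(cands[1], 3))
```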

Question about the Performance on NOCS-REAL275 (Tab 1 & Tab 8)

Thanks for sharing this great work.

I have some questions related to the performance on NOCS-REAL275 (Tab 1 & Tab 8)

Q1. Can I ask why the reported CASS performance at 5°, 5cm differs from the original CASS paper (23.5 vs. 29.44)? Can you explain how you obtained these results, including the mIoU, R_err, and T_err metrics? Also, the reported 6-PACK performance at 5°, 5cm is worse than in the original paper (33.3 vs. 28.92).

Q2. Can you explain the Oracle ICP baseline, and how it differs from the ICP used in 6-PACK? The 6-PACK paper also reported ICP results, and there ICP performs close to NOCS on the 5°, 5cm metric. Can you explain why your Oracle ICP is worse than NOCS?

Q3. As I understand it, your method applies the same random noise (scale = 0.02, rot = 5°, trans = 3cm) to the ground-truth initial pose during both training and inference. Have you tried different parameter settings (e.g., rot = 5°, trans = 5cm)? Also, can you point to where the rotation noise is implemented? I only found the translation and scale noise, in the code below.

if perturb_cfg is not None:
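For reference, a full pose perturbation of the kind described (rotation, translation, and scale noise applied to a ground-truth initial pose) could be sketched as follows. This is an illustrative NumPy re-implementation under the stated noise magnitudes, not the repo's actual code:

```python
import numpy as np

def random_rotation(max_deg=5.0, rng=np.random):
    """Rodrigues rotation about a random axis, with angle <= max_deg."""
    axis = rng.normal(size=3)
    axis /= np.linalg.norm(axis)
    angle = np.deg2rad(rng.uniform(-max_deg, max_deg))
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])  # skew-symmetric cross-product matrix
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

def perturb_pose(R, t, s, rot_deg=5.0, trans=0.03, scale=0.02, rng=np.random):
    """Jitter an (R, t, s) pose: compose a small random rotation,
    add uniform translation noise, and scale by a factor near 1."""
    R_new = random_rotation(rot_deg, rng) @ R
    t_new = t + rng.uniform(-trans, trans, size=3)
    s_new = s * (1.0 + rng.uniform(-scale, scale))
    return R_new, t_new, s_new

rng = np.random.default_rng(0)
R, t, s = perturb_pose(np.eye(3), np.zeros(3), 1.0, rng=rng)
print(np.allclose(R @ R.T, np.eye(3)))  # True: still a valid rotation
```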

Normalized Corner

Hello. How is this normalized bounding box calculated? Is the normalization scale computed per instance, or across all objects in the dataset?
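For what it's worth, one common convention (an assumption here, not necessarily what CAPTRA uses) normalizes per instance, dividing the tight bounding box by its diagonal so that the model fits inside a unit sphere:

```python
import numpy as np

def normalized_corner(points):
    """Per-instance normalized bbox corner: half the tight bbox extent
    divided by the bbox diagonal, so the normalized model fits in a
    unit sphere centered at the origin."""
    extent = points.max(axis=0) - points.min(axis=0)  # per-axis size
    diag = np.linalg.norm(extent)                     # normalization scale
    return 0.5 * extent / diag

# a 2 x 1 x 1 box has diagonal sqrt(6)
pts = np.array([[0, 0, 0], [2, 1, 1]], dtype=float)
print(normalized_corner(pts))  # ~ [0.408 0.204 0.204]
```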

Question about training model

Hi, I am sorry to bother you again. There is still a problem when I use my own trained model for testing:
(error screenshot)
So I added "sys.setrecursionlimit(1000000)" in "test.py", then a new error occurred:
(error screenshot)
But when I use your pre-trained model for testing and evaluation, it works fine. I found that your pre-trained models are about 8.9 MB and 8.0 MB for 1_bottle_rot and 1_bottle_coordnet respectively, while my trained models are about 26.5 MB and 21.2 MB. How can I obtain a model like your pre-trained 'model_0000.pt'? (I did not change any code except 'epoch'.)
Looking forward to your reply!
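One common cause of such a size gap is that the training loop saves the optimizer state alongside the weights, while the released checkpoints contain weights only. That is just a guess, illustrated here with a generic PyTorch sketch (the toy net stands in for CoordinateNet/RotationNet):

```python
import io
import torch
import torch.nn as nn

# Toy net and one Adam step, to populate the optimizer's moment buffers.
net = nn.Linear(128, 128)
opt = torch.optim.Adam(net.parameters())
opt.zero_grad()
net(torch.randn(4, 128)).sum().backward()
opt.step()

def size_of(obj):
    """Serialized size in bytes of an arbitrary object via torch.save."""
    buf = io.BytesIO()
    torch.save(obj, buf)
    return buf.tell()

weights_only = size_of(net.state_dict())
full_ckpt = size_of({'model': net.state_dict(), 'optimizer': opt.state_dict()})
# Adam keeps two extra tensors per parameter, so the full checkpoint is
# roughly 3x the bare weights -- one plausible cause of 26.5 MB vs 8.9 MB.
print(full_ckpt > 2 * weights_only)  # True
```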

Train on more GPUs

Hi authors! What wonderful work!
I've found that training takes quite a long time, so I'm wondering whether I can use something like 'torch.nn.DataParallel' to train on multiple GPUs?
Looking forward to your reply!
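For reference, the generic way to try this is to wrap the model before moving it to the device; this is standard PyTorch usage, not a tested CAPTRA configuration:

```python
import torch
import torch.nn as nn

# DataParallel splits each input batch across all visible GPUs and
# gathers the outputs back on the primary device.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 3))
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # replicas on cuda:0, cuda:1, ...
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = model.to(device)

out = model(torch.randn(8, 64, device=device))
print(out.shape)  # torch.Size([8, 3])
```

Note that checkpoints saved from a `DataParallel` model carry a `module.` prefix on every state-dict key, which matters when reloading into an unwrapped model.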

GPU setup

Hello, first of all thank you for open-sourcing your code!
I wonder if it is possible to use more GPUs to reduce training time.
Thank you!

Visualization results

Does the repository currently include code for visualizing the final predictions on images and video streams?
(I ask because I noticed the vis_utils.py file.)

Question about testing frames in NOCS-REAL275

Hello, authors!
Thank you for your great work!
In your paper, you mention that the testing split of NOCS-REAL275 contains 3,200 frames in total. However, the published dataset I downloaded contains only 2,745 frames. Could you please tell me where the extra data comes from?
Looking forward to your reply.

GPU setup and training time

Hello, thank you for releasing such a wonderful work!
Recently I have been reading the paper and studying the code, but I have several questions about the training setup.

  1. What GPU setup did you use, and how long does it take to train on the NOCS dataset and the SAPIEN dataset respectively? I tried training RotationNet on a single 3090 GPU, and it takes ~2 hours per category on the NOCS dataset, which seems very time-consuming.
  2. How many epochs did you train CoordinateNet and RotationNet for on each of the two datasets?

Alignment of GT points from SAPIEN

Hi, thanks for sharing the work!

I'm developing a new model based on your code. I'm trying to transform the NPCS point cloud into the camera point cloud using the ground-truth poses, but I found a slight misalignment in the dataset. Could you help me check whether my transformation is correct?

_input = self.feed_dict[i]["points"].squeeze(0).detach().cpu().numpy().transpose()
_cam_pts_center = self.feed_dict[i]['points_mean'].squeeze(0).detach().cpu().numpy().transpose()
_cam_pts = _input + _cam_pts_center
np.savetxt("../debug/input_pts.txt", _cam_pts)

# transform gt NPCS coordinates into the camera frame: cam = s * R @ npcs + t
_R = gt_part["rotation"].squeeze(0).detach().cpu().numpy()  # K,3,3
_t = gt_part["translation"].squeeze(0).detach().cpu().numpy()  # K,3,1
_s = gt_part["scale"].squeeze(0).detach().cpu().numpy()  # K

K = _R.shape[0]
for k in range(K):
    _m = _gt_labels == k
    _part_gt_npcs = _gt_npcs[_m].T
    _part_npcs_in_cam = ((_s[k] * _R[k] @ _part_gt_npcs) + _t[k]).T
    np.savetxt(f"../debug/part_{k}_cam.txt", _part_npcs_in_cam)

Then I get a point cloud like this:

(two screenshots of the resulting point clouds)

The green points are the camera point cloud, and the other colors are the points transformed from the NPCS point cloud using the ground-truth poses.

Thanks
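As a sanity check on the convention used in the snippet above (camera = s · R · npcs + t), a quick round trip in NumPy confirms that the forward and inverse transforms agree; the pose here is random and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
npcs = rng.uniform(-0.5, 0.5, size=(3, 100))  # 3 x N normalized coordinates

# an arbitrary valid pose: orthogonal rotation via QR, translation, scale
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))
t = rng.normal(size=(3, 1))
s = 1.7

cam = s * (R @ npcs) + t            # forward: NPCS -> camera frame
npcs_back = R.T @ (cam - t) / s     # inverse: camera frame -> NPCS
print(np.allclose(npcs, npcs_back))  # True
```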

Question about the training process!

Hi, I have trained RotationNet for 30 epochs and CoordinateNet for 20 epochs on the bottle category of the NOCS dataset. However, during training, many losses came out as "nan", as follows:
RotationNet:
(training log screenshot)
CoordNet:
(training log screenshot)
What causes this? Will it affect the trained model?
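Independent of the root cause, it can help to catch non-finite losses before they reach `backward()`, since a single nan gradient step can corrupt all subsequent weights. A generic sketch, not taken from the repo:

```python
import math
import torch

def check_losses(loss_dict):
    """Return the subset of losses that are nan or inf, so the training
    loop can skip the step (or crash loudly) instead of backpropagating."""
    return {k: v for k, v in loss_dict.items()
            if not math.isfinite(float(v))}

losses = {'rot_loss': torch.tensor(0.12),
          'coord_loss': torch.tensor(float('nan'))}
bad = check_losses(losses)
print(sorted(bad))  # ['coord_loss']
```

`torch.autograd.set_detect_anomaly(True)` and gradient clipping are also common ways to localize and mitigate nan losses.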
