
wenbowen123 / iros20-6d-pose-tracking

361 stars · 15 watchers · 65 forks · 86.8 MB

[IROS 2020] se(3)-TrackNet: Data-driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic Domains

License: Other

Python 99.47% Shell 0.53%
Topics: robotics, tracking, computer-vision, pose-estimation, 3d, manipulation, robot, robots, domain-adaptation, dataset


iros20-6d-pose-tracking's Issues

Combined with ROS

Hello author, I want to use this method with ROS, but I don't know how. I really hope you can release a ROS version soon. Thank you very much, best wishes to you!

confusion about camera_extrinsic_parameters_calibration

Hi Bowen, I have downloaded your YCBInEOAT dataset and want to predict on my own RGB-D data via BundleTrack.
But when I checked cam_K.txt (camera_extrinsic_parameters_calibration), I found that the data in the file is a 3x3 matrix.
As far as I know, an extrinsic calibration matrix is an affine 4x4 matrix, while a 3x3 matrix is an intrinsic matrix, which is exactly what the provided calibration file contains. Could you please explain that? Thanks!
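For reference, and not from the repository itself: a camera intrinsic matrix is always 3x3, while an extrinsic calibration is a 4x4 rigid transform. A minimal sketch (the numbers are the YCB-Video intrinsics quoted elsewhere in these issues, used here only as an illustration):

    import numpy as np

    # Intrinsics: 3x3, maps camera coordinates to pixels (this is what cam_K.txt holds).
    fx, fy, cx, cy = 1066.778, 1067.487, 312.987, 241.311
    K = np.array([[fx, 0.0, cx],
                  [0.0, fy, cy],
                  [0.0, 0.0, 1.0]])

    # An extrinsic calibration would instead be a 4x4 transform [[R, t], [0, 1]],
    # with R a 3x3 rotation and t a 3-vector.
    R, t = np.eye(3), np.zeros(3)
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t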

about predict.sh

I updated the paths in predict.sh, but it doesn't work.

ckpt_dir: None
dataset_info_path None/../dataset_info.yml
Traceback (most recent call last):
File "predict.py", line 646, in
with open(dataset_info_path,'r') as ff:
FileNotFoundError: [Errno 2] No such file or directory: 'None/../dataset_info.yml'

here is my predict.sh:
CUDA_VISIBLE_DEVICES=1,2,3,4
python /home/zkyd/iros20-6d-pose-tracking-master/predict.py
--train_data_path /home/zkyd/iros20-6d-pose-tracking-master/data/YCB_Video_Dataset/data_synthetic/banana/train_data_blender_DR
--ckpt_dir /home/zkyd/iros20-6d-pose-tracking-master/data/YCB_weights/banana/model_best_val.pth.tar
--mean_std_path /home/zkyd/iros20-6d-pose-tracking-master/data/YCB_weights/banana
--class_id 3
why?
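One likely explanation, offered here as a guess rather than a confirmed answer: a multi-line shell command needs a trailing backslash on every line, otherwise only the first line is executed, none of the --flags reach predict.py, and every argparse option falls back to its default. That reproduces exactly the ckpt_dir: None and 'None/../dataset_info.yml' output above. A minimal sketch:

    import argparse

    # Hypothetical mirror of one of the script's options, not the repository's exact code.
    parser = argparse.ArgumentParser()
    parser.add_argument('--train_data_path', default=None)
    args = parser.parse_args([])  # no flags passed, as happens when line continuations are missing

    dataset_info_path = '{}/../dataset_info.yml'.format(args.train_data_path)
    print(dataset_info_path)      # -> None/../dataset_info.yml, matching the traceback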

about predict.py and evl_yab.py

  1. Hello, do both predict.py and evl_yab.py need to be run for testing?
    If I want to run the evaluation, do I need to run both .py files?

Results on YCBInEOAT dataset.

Hi Bowen, sorry to bother you again. When I run predict.py to evaluate the nine video sequences of the YCBInEOAT dataset, all sequences except 'bleach_hard_00_03_chaitanya' perform well. I just use your provided pre-trained weights, and the translation and rotation normalizers are set to 0.03 m and 30 degrees respectively. The following are several visualization results:
(attached visualization frames: 0000001, 0000050, 0000100, 0000150, 0000323)
Hope for your assistance.

About trans_normalizer and rot_normalizer in datasets.py

Hi Wenbo,
In the file datasets.py, how are the values of trans_normalizer and rot_normalizer set? Is there any reference?

class TrackDataset(Dataset):
	def __init__(self, root,mode,images_mean, images_std, pretransforms=None, augmentations=None, posttransforms=None, dataset_info=None, trans_normalizer=0.03, rot_normalizer=5*np.pi/180):

Because when I modify max_rotation: 15 and max_translation: 0.2 in dataset_info.yml, after generating the paired data and calling the training function to train the model, processData() in datasets.py throws an exception:

    if self.mode=='train':
        assert (trans_label<=1).all() and (trans_label>=-1).all()
        assert (rot_label >= -1).all() and (rot_label <= 1).all(),'root:\n{}\nrot_label\n{}\n A2B_in_cam_rot{}\n'.format(self.root,rot_label,A2B_in_cam_rot)

Looking forward to your reply.
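A plausible reading of the assertion, stated as an assumption rather than the repository's documented behaviour: the regression labels appear to be the relative pose divided by the normalizers, so they only stay inside [-1, 1] if the perturbations sampled via dataset_info.yml do not exceed trans_normalizer and rot_normalizer. A minimal sketch of why max_rotation: 15 with the default rot_normalizer trips the assert:

    import numpy as np

    max_rotation_deg = 15.0                 # value set in dataset_info.yml
    rot_normalizer = 5 * np.pi / 180        # default in TrackDataset

    worst_rot_label = np.deg2rad(max_rotation_deg) / rot_normalizer
    print(worst_rot_label)                  # 3.0 > 1, so the assertion in processData() can fire

If that reading is correct, the normalizers would have to be at least as large as the maximum perturbations used when generating the paired data.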

No depth for training and inference

Sorry for bothering you. In your paper's ablation study: "For No depth, the depth modality is removed in both training and inference stage to study its importance."

How can I remove the depth modality?

  1. I found that in train.py:
    augmentations = Compose([
    HSVJitter(hsv_noise[0],hsv_noise[1],hsv_noise[2]),
    GaussianNoise(config['data_augmentation']['gaussian_noise']['rgb'], config['data_augmentation']['gaussian_noise']['depth']),
    GaussianBlur(config['data_augmentation']['gaussian_blur_kernel']),
    BlackCover(prob=0.2),
    # DepthMissing(prob=0.5,missing_percent=config['data_augmentation']['depth_missing_percent']),
    ])
    so should I remove the "#", use DepthMissing(prob=0.5, missing_percent=config['data_augmentation']['depth_missing_percent']), and set depth_missing_percent to 1 in config.yml?

  2. In predict.py, many places involve "depth". How should I modify the code to achieve "No depth" for training and inference?
    Could you give me some advice? Thank you.
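A hedged sketch of one way to emulate the "No depth" ablation, assuming the network consumes RGB-D tensors with depth as the last channel (this is an assumption about the data layout, not the authors' exact recipe): zero the depth channel in both the training and inference pipelines so the depth branch receives no information.

    import numpy as np

    def drop_depth(rgbd):
        """rgbd: H x W x 4 array, RGB in channels 0-2 and depth in channel 3 (assumed layout)."""
        out = rgbd.copy()
        out[..., 3] = 0.0   # remove the depth modality entirely
        return out

Applied consistently at train and test time, this should approximate the "No depth" row of the ablation more faithfully than DepthMissing with missing_percent=1, which only acts as a training-time augmentation.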

about pyrenderer

sorry to bother you.

  1. once I use pyrender,
    self.renderer = Renderer([obj_path],self.cam_K,dataset_info['camera']['height'],dataset_info['camera']['width'])

this error occurs:

Traceback (most recent call last):
File "/home/tiger/anaconda3/envs/se3/lib/python3.7/site-packages/pyrender/platforms/pyglet_platform.py", line 39, in init_context
width=1, height=1)
File "/home/tiger/anaconda3/envs/se3/lib/python3.7/site-packages/pyglet/window/xlib/init.py", line 165, in init
super(XlibWindow, self).init(*args, **kwargs)
File "/home/tiger/anaconda3/envs/se3/lib/python3.7/site-packages/pyglet/window/init.py", line 588, in init
config = screen.get_best_config(config)
File "/home/tiger/anaconda3/envs/se3/lib/python3.7/site-packages/pyglet/canvas/base.py", line 194, in get_best_config
raise window.NoSuchConfigException()
pyglet.window.NoSuchConfigException

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "produce_train_pair_data.py", line 269, in
completeBlenderYcbDR()
File "produce_train_pair_data.py", line 220, in completeBlenderYcbDR
producer = ProducerPurturb(dataset_info)
File "produce_train_pair_data.py", line 86, in init
self.renderer = Renderer([obj_path],self.cam_K,dataset_info['camera']['height'],dataset_info['camera']['width'])
File "/opt/tiger/zyz_6d_pose/iros20-6d-pose-tracking/offscreen_renderer.py", line 69, in init
self.r = pyrender.OffscreenRenderer(self.W, self.H)
File "/home/tiger/anaconda3/envs/se3/lib/python3.7/site-packages/pyrender/offscreen.py", line 31, in init
self._create()
File "/home/tiger/anaconda3/envs/se3/lib/python3.7/site-packages/pyrender/offscreen.py", line 149, in _create
self._platform.init_context()
File "/home/tiger/anaconda3/envs/se3/lib/python3.7/site-packages/pyrender/platforms/pyglet_platform.py", line 45, in init_context
'internal error message was "{}"'.format(e)
ValueError: Failed to initialize Pyglet window with an OpenGL >= 3+ context. If you're logged in via SSH, ensure that you're running your script with vglrun (i.e. VirtualGL). The internal error message was ""

could you give me some advice?
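A commonly suggested workaround for this pyrender/pyglet failure over SSH, offered as general advice rather than the authors' intended setup: switch pyrender to its EGL headless platform so that no X display or VirtualGL is needed. This must be done before pyrender is imported:

    import os
    os.environ['PYOPENGL_PLATFORM'] = 'egl'   # must be set before "import pyrender"

    import pyrender
    r = pyrender.OffscreenRenderer(viewport_width=640, viewport_height=480)

This requires a GPU driver with EGL support on the headless machine.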

about produce_train_pair_data.py

Thank you for sharing this code. I have some questions below:

  1. I use my own model and rgb_files generated by Blender to generate image pairs; how should I set class_id?
  2. And I didn't get any generated rgbA images.
    /opt/tiger/zyz_6d_pose/iros20-6d-pose tracking/media/YCB_traindata/ggl/blender_syn_sequence/mydataset_DR/0000499rgb.png
    [[-1.34358856e-07 -1.00000012e+00 -1.87169619e-23 5.90973608e-02]
    [ 1.00000012e+00 1.34358856e-07 -1.57009270e-16 2.34215427e-02]
    [ 1.57009270e-16 -1.87169619e-23 1.00000024e+00 5.00000238e-01]
    [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00]]
    [[ 0.05048425 -0.99854613 -0.01889992 0.16632858]
    [ 0.99428245 0.0520339 -0.09324765 -0.33160405]
    [ 0.09409552 -0.01408432 0.99546378 -2.42301163]
    [ 0. 0. 0. 1. ]]
    moving to val: 0/500
    Traceback (most recent call last):
    File "produce_train_pair_data.py", line 263, in
    completeBlenderYcbDR()
    File "produce_train_pair_data.py", line 253, in completeBlenderYcbDR
    shutil.move(rgbA_files[i],out_val_path+'%07drgbA.png'%(i))
    IndexError: list index out of range

thank you so much.

:question: Training without depth

Hello, I am trying to reproduce your work on a custom dataset; however, I do not have access to depth data. What would be a minimal modification to the code to reproduce the "No depth" situation from your ablation study?

Obtained results different from Ours_YCB_results

Thanks for sharing the great work! I'm trying to reproduce the results on YCB. I downloaded the pretrained model weights for power_drill and the synthetic training dataset, and used predict.py to get the pose txt files for the sequence "data_organized/0054". The visualization result looks OK, but when I checked the pose for a certain frame, the results are slightly different from the results in Ours_YCB_results. I've tried both pyrenderer and vispy; both results differ from "Ours_YCB_results/Ours/035_power_drill/ycb_results_model_epoch215/seq54". Do you have any clue where the difference comes from?

I also noticed that the default values of the translation and rotation normalizers are 0.03 m and 5 degrees in predict.py, but in the paper you mention translations and rotations of 0.02 m and 15 degrees. I've tried changing the translation and rotation normalizers to 0.02 m and 15 degrees, but that gives a visually less stable result, and the difference from your result is even larger. I guess the translation and rotation normalizers used in predict.py should be the same as the ones used while generating the synthetic dataset. Is it possible for you to provide the normalizer values used for generating your synthetic data?

Looking forward to your reply! Thanks in advance.

ImportError: Library "GLU" not found

First of all, thank you very much for sharing the code.
I want to use your code to reproduce the test results in the paper, but I encountered some problems when configuring the environment:

import pyrender
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/pyglet/init.py", line 334, in getattr
return getattr(self._module, name)
AttributeError: 'NoneType' object has no attribute 'Window'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/pyglet/init.py", line 334, in getattr
return getattr(self._module, name)
AttributeError: 'NoneType' object has no attribute '_create_shadow_window'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "", line 1, in
File "/opt/conda/lib/python3.7/site-packages/pyrender/init.py", line 12, in
from .viewer import Viewer
File "/opt/conda/lib/python3.7/site-packages/pyrender/viewer.py", line 36, in
class Viewer(pyglet.window.Window):
File "/opt/conda/lib/python3.7/site-packages/pyglet/init.py", line 340, in getattr
import(import_name)
File "/opt/conda/lib/python3.7/site-packages/pyglet/window/init.py", line 1897, in
gl._create_shadow_window()
File "/opt/conda/lib/python3.7/site-packages/pyglet/init.py", line 340, in getattr
import(import_name)
File "/opt/conda/lib/python3.7/site-packages/pyglet/gl/init.py", line 95, in
from pyglet.gl.lib import GLException
File "/opt/conda/lib/python3.7/site-packages/pyglet/gl/lib.py", line 149, in
from pyglet.gl.lib_glx import link_GL, link_GLU, link_GLX
File "/opt/conda/lib/python3.7/site-packages/pyglet/gl/lib_glx.py", line 46, in
glu_lib = pyglet.lib.load_library('GLU')
File "/opt/conda/lib/python3.7/site-packages/pyglet/lib.py", line 164, in load_library
raise ImportError('Library "%s" not found.' % names[0])
ImportError: Library "GLU" not found.

I tried some suggested fixes such as apt-get install freeglut3-dev, but it didn't work.

How can I solve it?
thank you if you can answer my question.
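On Debian/Ubuntu systems, the GLU library that pyglet tries to load usually comes from the libglu1-mesa package rather than freeglut3-dev; this is general Linux advice, not something specific to this repository. A quick check of whether Python can see the library at all:

    from ctypes.util import find_library

    print(find_library('GLU'))   # None means no libGLU is visible to the loader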

about the results

Your reply was: "Not exactly. The final result is the AUC by plotting all data in a curve. For more details on this, you can refer to PoseCNN. We followed the same evaluation."

I know the "AUC of ADD-S" metric.
Maybe I didn't express my question clearly in the last issue, so let me state it clearly:

  1. The reported result (97.25) for "bleach cleanser" is the AUC obtained by plotting all data in one curve; I understand that.
  2. But "bleach cleanser" appears in data_organized/0051, 0054, 0055, 0057, that is, 4 test sets.
  3. I mean, if:
    a. the AUC of ADD-S in 0051 is 97;
    b. the AUC of ADD-S in 0054 is 96.5;
    c. the AUC of ADD-S in 0055 is 97.5;
    d. the AUC of ADD-S in 0057 is 96.5;
  4. then is the reported result (97.25) for "bleach cleanser" the average of those four AUC of ADD-S results?

hoping your reply~
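For what it is worth, a minimal sketch of the distinction being asked about, assuming a PoseCNN-style AUC of ADD-S (accuracy versus a distance threshold swept from 0 to 0.1 m): pooling the frames of all sequences containing the object into one curve generally gives a different number than averaging four per-sequence AUCs.

    import numpy as np

    def adds_auc(dists, max_thresh=0.1, steps=1000):
        """Area under the accuracy-vs-threshold curve, thresholds from 0 to max_thresh meters."""
        dists = np.asarray(dists)
        thresholds = np.linspace(0.0, max_thresh, steps)
        accuracy = [(dists < t).mean() for t in thresholds]
        return np.trapz(accuracy, thresholds) / max_thresh

    # Placeholder per-frame ADD-S distances for two of the sequences, for illustration only.
    seqs = {'0051': np.random.rand(100) * 0.05,
            '0054': np.random.rand(100) * 0.05}
    pooled   = adds_auc(np.concatenate(list(seqs.values())))   # one curve over all frames
    averaged = np.mean([adds_auc(d) for d in seqs.values()])   # mean of per-sequence AUCs
    print(pooled, averaged)                                     # close, but not identical in general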

error about train.py

root@dev-john-yjaf-7595585b4-zqjtr:/data/new_build/code/Pose/se3_TrackNet/TrackNet# sh train.sh
output_path /data/new_build/code/Pose/se3_TrackNet/tmp/train_output
loaded dataset info from: /data/YCB/YCB_traindata/master_chef_can/train_data_blender_DR/../dataset_info.yml
self.cam_K:
[[1.066778e+03 0.000000e+00 3.129869e+02]
[0.000000e+00 1.067487e+03 2.413109e+02]
[0.000000e+00 0.000000e+00 1.000000e+00]]
making dataset... for train
#dataset: 199587
self.trans_normalizer=0.03, self.rot_normalizer=0.08726646259971647
len(train_dataset)= 199587
Computing mean std for n=10000
<torch.utils.data.dataloader.DataLoader object at 0x7fb3d51caed0>
Traceback (most recent call last):
File "train.py", line 109, in
for i, (data, target, A_in_cams, B_in_cams, rgbA, rgbB, maskA, maskB) in enumerate(train_loader):
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in next
data = self._next_data()
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
return self._process_data(data)
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
data.reraise()
File "/opt/conda/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/data/new_build/code/Pose/se3_TrackNet/TrackNet/datasets.py", line 95, in getitem
if rgbB.shape[0]!=self.dataset_info['resolution']:
IndexError: tuple index out of range

my config.yml
    data_augmentation:
      hsv_noise: [15,15,15]
      gaussian_noise:
        rgb: 2
        depth: 5
      gaussian_blur_kernel: 6
      depth_missing_percent: 0.4

    learning_rate: 0.001
    weight_decay: 0.000001
    epochs: 300
    loss_weights:
      trans: 1
      rot: 1

    data_path: /data/YCB/YCB_traindata/master_chef_can/train_data_blender_DR
    validation_path: /data/YCB/YCB_traindata/master_chef_can/validation_data_blender_DR
    output_path: /data/new_build/code/Pose/se3_TrackNet/tmp/train_output
    batch_size: 200
    n_workers: 20

/data/YCB/YCB_traindata/master_chef_can/dataset_info.yml
    boundingbox: 10
    camera:
      centerX: 312.9869
      centerY: 241.3109
      focalX: 1066.778
      focalY: 1067.487
      height: 480
      width: 640
    distribution: gauss
    models:
    - 0: null
      model_path: /data/YCB/YCB_traindata/master_chef_can/textured.ply
      object_width: 189.09586716088756
    resolution: 176
    train_samples: 200000
    val_samples: 2000

I checked the data and did not find any error. Is there something I left unmodified that caused this error?
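A debugging sketch, not part of the repository: "IndexError: tuple index out of range" on rgbB.shape[0] means the loaded array has fewer dimensions than expected, which typically happens when an image file fails to load or is not an H x W x 3 array. Checking every pair image up front (assuming the rgbA/rgbB naming that appears elsewhere in these issues) narrows it down to a data problem versus a code problem:

    import glob
    import cv2

    data_path = '/data/YCB/YCB_traindata/master_chef_can/train_data_blender_DR'  # path from config.yml above
    for f in sorted(glob.glob(data_path + '/*rgbB.png')):
        img = cv2.imread(f)
        if img is None or img.ndim != 3:
            print('bad file:', f, None if img is None else img.shape)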

Setup requirements and instructions

Dear @wenbowen123,

first of all I would like to thank you for the great work!

I'm interested in trying the provided code and I was wondering whether you are planning to release more detailed instructions about installation requirements and steps to setup the environment.

Thank you

about predict.py error

here is my path:
--YCBInEOAT_dir', default='/home/zkyd/iros20-6d-pose-tracking-master/data/YCBInEOAT_Dataset/YCBInEOAT/bleach0'
--train_data_path', default="/home/zkyd/iros20-6d-pose-tracking-master/data/YCBInEOAT_Dataset/YCBInEOAT_synthetic/bleach_cleanser/train_data_blender_DR
--ckpt_dir', default="/home/zkyd/iros20-6d-pose-tracking-master/data/YCBInEOAT_weights/bleach_cleanser/model_best_val.pth.tar
--mean_std_path', default="/home/zkyd/iros20-6d-pose-tracking-master/data/YCBInEOAT_weights/bleach_cleanser
I don't find a bleach_cleanser folder in YCBInEOAT; I think YCBInEOAT_dir stands for the test path, right or wrong?
I run predict.py, but always encounter the error: "File "predict.py", line 218, in on_track
rgbB, depthB = crop_bbox(current_rgb, current_depth, bb, self.image_size)
File "/home/zkyd/iros20-6d-pose-tracking-master/Utils.py", line 340, in crop_bbox
color_crop[top_offset:bottom_offset, left_offset:right_offset, :] = color[top:bottom, left:right, :]
ValueError: could not broadcast input array from shape (71,633,3) into shape (71,0,3)
"
How can I solve it?
thank you if you can answer my question.
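The broadcast failure itself is ordinary numpy behaviour, shown below only to illustrate the symptom rather than to fix it: a destination slice of width zero (left_offset == right_offset) cannot receive a (71, 633, 3) block, which suggests the bounding box computed in crop_bbox has collapsed horizontally, for example because the initial pose projects the object outside the image.

    import numpy as np

    crop  = np.zeros((100, 100, 3))
    color = np.zeros((480, 640, 3))
    # Raises: could not broadcast input array from shape (71,633,3) into shape (71,0,3)
    crop[0:71, 50:50, :] = color[0:71, 0:633, :]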

About posecnn initialization pose

Hello, I used PoseCNN to initialize the pose of master_chef_can. I used the pre-trained weights you provided to evaluate, but failed to get the metric values described in the paper. I carefully checked the metric of each sequence and found that in sequence 55 of YCB-Video the PoseCNN initialization is very bad. What do I need to do to get the ADD metric reported in your paper?

Multi-object tracking

Nice work!
I have two questions:

  1. I wonder how you do the multi-object tracking shown in your video.
  2. Why is the Lie algebra better than other alternatives for regressing the rotation, such as quaternions or the representation in this paper [1]?

[1]Zhou, Y., Barnes, C., Lu, J., Yang, J., Li, H.: On the continuity of rotation representations in neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. (2019) 5745–5753

On the se(3) representation

Hi, it's me again :) I'm really interested in your work and I've recently been studying the se(3) representation used in the paper. I'm a bit confused because it seems to differ from the standard se(3) on the translation part.
In section III.A the exponential map from se(3) to SE(3) is given by

$$[t, w] \mapsto \begin{bmatrix} \exp([w]_\times) & t \\ 0 & 1 \end{bmatrix}$$

Here the translation part of the rigid transformation is directly taken from the 6D vector [t, w] in se(3).

However, the exponential map I see elsewhere looks like this:

$$[t, w] \mapsto \begin{bmatrix} \exp([w]_\times) & V t \\ 0 & 1 \end{bmatrix}, \qquad V = I + \frac{1-\cos\theta}{\theta^2}[w]_\times + \frac{\theta-\sin\theta}{\theta^3}[w]_\times^2, \quad \theta = \lVert w \rVert$$

Here the final translation is Vt.
This is in accordance with Fig. 3 in the paper.

I wonder if there is a typo in III.A or the 6D transformation is actually parameterized by the so(3) representation w for rotations and a plain translation t, which also makes sense, especially when the predictions of t and w are disentangled into two branches in the network.
Thanks a lot!

Upd: After a bit more research I found that the exponential map in III.A is defined as "pseudo-exponential map" in Blanco10 and is said to yield Jacobians that are more efficient to evaluate. Is this what the paper intended to use?
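A numeric sketch of the difference discussed above; this reflects the standard formulas, not a statement from the authors. The "pseudo-exponential" places the 6D vector's translation t directly into the transform, while the standard se(3) exponential uses V t:

    import numpy as np
    from scipy.spatial.transform import Rotation

    w = np.array([0.3, -0.2, 0.5])        # so(3) rotation vector
    t = np.array([0.05, 0.01, -0.02])     # translation part of the 6D vector

    theta = np.linalg.norm(w)
    wx = np.array([[0, -w[2], w[1]],
                   [w[2], 0, -w[0]],
                   [-w[1], w[0], 0]])
    R = Rotation.from_rotvec(w).as_matrix()
    V = np.eye(3) + (1 - np.cos(theta)) / theta**2 * wx \
        + (theta - np.sin(theta)) / theta**3 * wx @ wx

    T_pseudo = np.eye(4); T_pseudo[:3, :3] = R; T_pseudo[:3, 3] = t      # translation used directly
    T_exp = np.eye(4); T_exp[:3, :3] = R; T_exp[:3, 3] = V @ t           # standard exponential map
    print(T_pseudo[:3, 3], T_exp[:3, 3])   # rotations agree, translations differ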

Bounding box format

At line 138 in predict.py, I found:

with_add = bounding_box / 100 * object_max_width
self.object_width = object_max_width + with_add

To me, it looks like object_width is supposed to be a single float. However, the bounding box is a two-dimensional array (a list of the 3D coordinates of the bounding box vertices), so the division makes no sense. Needless to say, with my current bounding box format, the code above won't run. What am I doing wrong?

Guide on how to use pre-trained weights

Hello, I got inside the docker container and was able to check that predict_ros.py runs with my ROS master.

However, I would need a guide on how to set it up. My idea would be to start by using the pre-trained weights.
Could you please provide a step-by-step guide on:

  • where is the docker mounted on my machine?
  • where do I have to put the pretrained weights and what do I need to edit?
  • Any further steps before running predict_ros.py with pre-trained weights?

Many thanks for your help and support!!

Questions about running speed(90 Hz) and overfitting

Hi, thank you for your sharing and help! I have two questions and look forward to your reply.

  1. I want to know how your running speed reaches 90 Hz, and why, when I run predict.py (using the function getResultsYcb()) on an RTX 3090, the speed is much lower than 90 Hz (the output time interval for every 100 frames is greater than 10 seconds).

  2. I want to know whether you fully completed the 300 epochs of training on the YCB-Video and YCBInEOAT datasets, and whether 300 epochs may lead to overfitting (I think 300 epochs may be too many). For example, does "ycb_results_model_epoch180" under the path "Ours/002_master_chef_can/ycb_results_model_epoch180" mean that epoch 180 was the best after training for 300 epochs?

about eval_YCBInEOAT.py

Hello author, thank you for your sharing and help. I tried to train on the YCBInEOAT datasets in this project and generated models, but when I ran eval_YCBInEOAT.py
(screenshot attached)
to evaluate the model score, it always reports the error "Traceback (most recent call last):"

File "/home/mimashi1/iros20-6d-pose-tracking-master/eval_ycbineoat.py", line 126, in

eval_all(args)

File "/home/mimashi1/iros20-6d-pose-tracking-master/eval_ycbineoat.py", line 89, in eval_all

assert len(pred_files)==len(gt_files),'#pred_files:{}, #gt_files:{}'.format(len(pred_files),len(gt_files))

UnboundLocalError: local variable 'gt_files' referenced before assignment"

My path configuration is like this:

"parser.add_argument('--YCBInEOAT_dir', default='/home/mimashi1/iros20-6d-pose-tracking-master/data/bleach_cleanser')

parser.add_argument('--ycb_dir', default='/media/bowen/e25c9489-2f57-42dd-b076-021c59369fec/DATASET/Tracking/YCB_Video_Dataset')

parser.add_argument('--class_id',type=int,default=1)

parser.add_argument('--res_dir',type=str,default='/home/mimashi1/iros20-6d-pose-tracking-master/debug')"

I hope you can give me some suggestions and look forward to receiving your reply. Thank you.

Missing dataset_info.yml (For YCB-Video)

Hi!
When running predict.py, I get the following error if I don't specify a path for train_data_path:

Traceback (most recent call last):
  File "predict_original.py", line 632, in <module>
    with open(dataset_info_path,'r') as ff:
FileNotFoundError: [Errno 2] No such file or directory: 'None/../dataset_info.yml'

This makes sense, given that None as a path yields nothing.
Setting train_data_path to where I have set up my dataset, however, gives me the same error, as dataset_info.yml cannot be found in my YCB-Video dataset folder.

I can confirm that I have downloaded the dataset correctly since my teammate and myself have the same files and folders within the YCB_Video_Dataset folder.

Should I have a dataset_info.yml file somewhere? If not, how should it be set out? Is there a file that I can download? What should my YCB_Video_Dataset folder look like?
Here is my current directory within YCB_Video_Dataset:

cameras  data_organized  image_sets  LICENSE  pairs  README
data     data_syn        keyframes   models   poses

I should also mention that my dataset folder is separate from this repository's folder due to me needing it to test other 6D pose estimator networks.
Looking at the code in more detail suggests that dataset_info.yml is required for predictions as it contains camera parameters, image resolutions, bounding box info, etc.

Cheers,
Lachlan
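For reference: judging from the traceback, predict.py looks for dataset_info.yml one level above train_data_path, i.e. next to the synthetic training data rather than inside YCB_Video_Dataset. A hedged sketch of writing such a file, with hypothetical values mirroring the dataset_info.yml quoted in the "error about train.py" issue above; the exact keys the repository expects should be checked against the released synthetic training data:

    import yaml

    dataset_info = {
        'boundingbox': 10,
        'camera': {'focalX': 1066.778, 'focalY': 1067.487,
                   'centerX': 312.9869, 'centerY': 241.3109,
                   'height': 480, 'width': 640},
        'models': [{'model_path': '/path/to/textured.ply',   # hypothetical path
                    'object_width': 189.0}],                 # hypothetical value
        'resolution': 176,
        'train_samples': 200000,
        'val_samples': 2000,
    }
    with open('dataset_info.yml', 'w') as f:
        yaml.safe_dump(dataset_info, f)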

About synthetic data

First of all thank you very much for sharing. I saw that you shared the synthetic data set used for training. I have some doubts and need your help:

  1. Is the synthetic data generated in Blender using the parameters you gave in the paper?
  2. The paper mentions: "For both training and inference, rendering of I_{t-1} is implemented in C++ OpenGL." But in the code, OpenGL rendering is only used in predict.py (via vispy_renderer.py). The input A in training is an already generated synthetic image, so are A in training and I_{t-1} in inference generated by the same tool?

Looking forward to your answer, thank you!

about predict.py path

Hello author, thank you first. I want to run predict.py to test on the YCBInEOAT dataset, for example banana, but I don't know how to set ycb_dir, YCBInEOAT_dir, train_data_path, ckpt_dir, mean_std_path, and class_id.

Call for {model}.ply files.

Thanks for your public code. When I run predictions on the YCB and YCBInEOAT datasets, the .ply file of the target model is needed. However, these files seem to be in the training directories, which are about 15 GB. I do not plan to train se(3)-TrackNet networks, so could you provide a convenient link to only these .ply files for all 21 objects?

Arguments in predict_ros.py

Could you please give more details about the usage of predict_ros.py?

Specifically, what are these:

  • artifact_id
  • pose_init_file
  • artifacts_folder
  • camera_frame_name
  • object_frame_name

Point cloud format for eval

Hello,
I recently trained se(3)-tracknet on a custom dataset and when I tried the evaluation script, I got this error:

Traceback (most recent call last):
  File "predict.py", line 419, in <module>
    predictSequenceMyData()
  File "predict.py", line 321, in predictSequenceMyData
    rot_normalizer=30 * np.pi / 180)
  File "predict.py", line 162, in __init__
    self.renderer = VispyRenderer(ply_file, self.K, H=height, W=width)
  File "/home/anael/se3-tracknet-repo/vispy_renderer.py", line 127, in __init__
    face_indices = np.stack(face_indices, axis=0)
  File "<__array_function__ internals>", line 6, in stack
  File "/opt/anaconda/envs/tracknet/lib/python3.7/site-packages/numpy/core/shape_base.py", line 427, in stack
    raise ValueError('all input arrays must have the same shape')
ValueError: all input arrays must have the same shape

I simply generated my .ply point cloud file using Blender, and it seems my point cloud doesn't have the right structure, but I am unsure how I could fix it. Most elements in face_indices have length 4, but some have length 3 and others have length 5. Is there a modification I could apply to my mesh or point cloud so that all face_indices elements have the same length?
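One hedged suggestion, not verified against this repository: since vispy_renderer stacks the per-face index lists, every face must have the same vertex count. Re-exporting the mesh through trimesh, which stores faces as triangles, is one way to obtain a uniformly triangulated .ply:

    import trimesh

    mesh = trimesh.load('model.ply', force='mesh')   # polygon faces are converted to triangles
    mesh.export('model_triangulated.ply')

Blender's Triangulate modifier (or Ctrl+T in edit mode) before export should achieve the same thing.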

about B_in_camera

Hello, I am sorry to bother you again; I have two questions that need your correction.

  1. My syn_data is generated by Blender, and the pose for *B is saved as a pkl (e.g., 0.pkl), so I directly extract the pose from the .pkl as *B's pose (B_in_cam); I don't apply the transformation from your code (blendercam_in_world, poses_in_world, ...).
    Do you think this is right?
    meta = np.load(rgb_file.replace('rgb.png','poses_in_world.npz'))
    class_ids = meta['class_ids']
    poses_in_world = meta['poses_in_world']
    blendercam_in_world = meta['blendercam_in_world']
    pos = np.where(class_ids==class_id)
    B_in_cam = np.linalg.inv(cvcam_in_blendercam).dot(np.linalg.inv(blendercam_in_world).dot(poses_in_world[pos,:,:].reshape(4,4)))

that's my modified code:
B_in_cam = pickle.load(open('.../{:07}RT.pkl'))

  2. After generating image pairs for training, I ran train.py and got an error about generating labels for training. Could you give me some advice?

Traceback (most recent call last):
File "train.py", line 110, in
for i, (data, target, A_in_cams, B_in_cams, rgbA, rgbB, maskA, maskB) in enumerate(train_loader):
File "/home/tiger/anaconda3/envs/se3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 582, in next
return self._process_next_batch(batch)
File "/home/tiger/anaconda3/envs/se3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
AssertionError: Traceback (most recent call last):
File "/home/tiger/anaconda3/envs/se3/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/tiger/anaconda3/envs/se3/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 99, in
samples = collate_fn([dataset[i] for i in batch_indices])
File "/opt/tiger/zyz_6d_pose/iros20-6d-pose-tracking/datasets.py", line 107, in getitem
data, target, rgbA, rgbB, maskA, maskB = self.processData(rgbA,depthA,A_in_cam,rgbB,depthB,B_in_cam,maskB)
File "/opt/tiger/zyz_6d_pose/iros20-6d-pose-tracking/datasets.py", line 154, in processData
assert (rot_label>=-1).all() and (rot_label<=1).all(),'root:\n{}\nrot_label\n{}\n A2B_in_cam_rot{}\n'.format(self.root,rot_label,A2B_in_cam_rot)
AssertionError: root:
/opt/tiger/zyz_6d_pose/iros20-6d-pose-tracking/media/YCB_traindata/ggl/train_data_blender_DR
rot_label
[ 0.5192515 -1.72661945 -0.50831636]
A2B_in_cam_rot[[ 0.98769196 0.04075547 -0.15100854]
[-0.04756784 0.99799397 -0.04177721]
[ 0.14900296 0.04844617 0.98764927]]

could you give me some advice? thanks in advance.
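A hedged guess at the cause, based only on the snippet quoted above: the repository converts Blender poses through cvcam_in_blendercam because Blender's camera looks down -Z with +Y up (OpenGL convention) while an OpenCV-style camera looks down +Z with +Y down. If the pose stored in the .pkl is expressed in Blender's camera frame, using it directly skips that 180-degree flip about the X axis, and the resulting relative rotations can easily exceed the 5-degree normalizer, which is exactly the assertion that fires.

    import numpy as np

    # Common convention flip between an OpenCV camera frame and a Blender/OpenGL camera frame.
    # This matrix is an assumption for illustration, not taken from the repository.
    cvcam_in_blendercam = np.diag([1.0, -1.0, -1.0, 1.0])

    # If B_in_blendercam came from the .pkl, the OpenCV-frame pose would be:
    # B_in_cam = np.linalg.inv(cvcam_in_blendercam) @ B_in_blendercam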

About the rgb map and depth map alignment

Hello author, thank you very much for sharing and answering. I have some problems in the process of making the data, and I hope to get your reply

  1. I am also using Azure Kinect sensor to get the rgb map and depth map, but the size and scale of the two images are different. How can I manipulate the Azure Kinect sensor to get the correct rgb map and depth map and align them?
  2. I noticed that the test data has "depth" and "depth_filled", is there any difference between the two files and how do I get them.

Translated with www.DeepL.com/Translator (free version)
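Regarding question 1, a hedged sketch assuming the pyk4a Python bindings for the Azure Kinect: the SDK can warp the depth image into the color camera's geometry, which yields color and depth images of the same resolution, aligned pixel to pixel. ("depth_filled" presumably refers to a hole-filled version of the aligned depth, but that is a guess.)

    from pyk4a import PyK4A

    k4a = PyK4A()
    k4a.start()
    capture = k4a.get_capture()

    rgb = capture.color                        # color image (typically BGRA)
    depth_aligned = capture.transformed_depth  # depth warped into the color camera frame
    k4a.stop()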

Some details about the generation of the training samples.

In the "Experiments" section of the paper, you said "the camera's pose is randomly sampled from a sphere of radius between 0.6 to 1.3m, followed by an additional rotation along camera z-axis sampled between 0 to 360 degree.", but it seems there is no code about the camera's pose sampling in the "blender_dataset_generator.py" file. We need your help and thank you firstly.

Two questions about paper

Hello Bowen, I have two questions about your paper:

  1. How did you make the YCBInEOAT dataset?
    In the paper, you said that the dataset is "accurately annotated manually"; can you provide the detailed tools or method for that?

  2. Can one network track all objects?
    In 2D visual tracking, with KCF or some deep learning methods, the pipeline just draws a 2D bbox and the network then tracks that 2D bbox; all kinds of objects can be handled by the same network.
    Have you ever tried to do so in 3D tracking? Maybe given an arbitrary object CAD model and its initial pose, then tracking it in the camera, rather than tracking one object per set of network parameters?

Thank you~
Best

questions regard setup

Hi, thanks for your public code. Errors occurred when launching docker container:

Error: No such container: se3_tracknet
access control disabled, clients can connect from any host
run_container.sh: line 3: nvidia-docker: command not found

What's more, I'm trying to run your code on a remote Linux server that has no screen to display rendered results. So could you tell me how to run your code predict.py/predictSequenceYcb() without the assistance of Docker?


CuDNN error while running predict.sh

I am running it on an NVIDIA RTX 3080 GPU.

Traceback (most recent call last):
File "/home/se3_tracknet/predict.py", line 640, in <module>
predictSequenceMyData()
File "/home/se3_tracknet/predict.py", line 591, in predictSequenceMyData
cur_pose = tracker.on_track(A_in_cam, rgb, depth, gt_A_in_cam=np.eye(4),gt_B_in_cam=np.eye(4), debug=debug,samples=samples)
File "/home/se3_tracknet/predict.py", line 252, in on_track
prediction = self.model(dataA,dataB)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/se3_tracknet/se3_tracknet.py", line 84, in forward
a = self.convA1(A)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py", line 92, in forward
input = module(input)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py", line 343, in forward
return self.conv2d_forward(input, self.weight)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py", line 340, in conv2d_forward
self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
 

The following is the output of "nvidia-smi":

    Wed Oct 26 23:54:26 2022
    NVIDIA-SMI 510.73.05    Driver Version: 510.73.05    CUDA Version: 11.6
    GPU 0: NVIDIA GeForce ...    Persistence-M: Off    Bus-Id: 00000000:01:00.0
    Fan 43%  Temp 49C  Perf P2  Pwr: 100W / 370W  Memory: 1624MiB / 10240MiB  GPU-Util 0%  Compute M. Default
    No running processes listed.
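A quick sanity check, independent of the repository: an RTX 3080 is compute capability 8.6 and needs a PyTorch build compiled against CUDA 11 or newer; a wheel built for an older CUDA tends to fail at runtime with cuDNN/CUDA execution errors like the one above.

    import torch

    print(torch.__version__, torch.version.cuda)     # CUDA version the wheel was built against
    print(torch.cuda.get_device_capability(0))       # (8, 6) for an RTX 3080
    print(torch.cuda.get_arch_list())                # newer PyTorch only; should include 'sm_86'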

about data

Hello, author. Thank you for sharing. I have some problems with the datasets. You mentioned in the readme that five datasets are needed, but these add up to 350 GB of disk space, and my computer only has 200 GB. How can I solve this problem?

Blender file 1.blend does not exist

Hello bowen,
I found that the Blender file 1.blend does not exist in the repository when running your code. This means blender_main.py cannot run correctly. Please check and upload the file.

Thank you~
Best

about the network

Hello, sorry to bother you. There are two "self.convAB2" assignments in the code; I wonder if the second one should be "self.convAB3"?
According to the network figure in the paper, there are two ResnetBasicBlocks, but forward() only uses self.convAB2() once.
Hoping for your reply~
self.convAB1 = ConvBNReLU(128, 256, kernel_size=3, stride=2)
self.convAB2 = ResnetBasicBlock(256, 256, bias=True)
self.convAB2 = ResnetBasicBlock(256, 256, bias=True)

    ab = torch.cat((a, b), 1).contiguous() 
    ab = self.convAB1(ab)  
    ab = self.convAB2(ab)
    output['feature'] = ab
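A minimal, self-contained illustration of the point raised above: assigning to the same attribute name twice keeps only the second module, so the network effectively contains a single ResnetBasicBlock at that stage instead of the two drawn in the paper figure. Renaming the second assignment (e.g. to convAB3) and calling it in forward() would match the figure, but whether that was the intent has to be confirmed by the authors.

    import torch.nn as nn

    class Demo(nn.Module):
        def __init__(self):
            super().__init__()
            self.convAB2 = nn.Conv2d(256, 256, 3)
            self.convAB2 = nn.Conv2d(256, 256, 3)   # overwrites the previous line

    print(len(list(Demo().children())))   # 1, not 2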

Conda Environment environment.yml issues

Hi!
Pardon if this is explained in detail somewhere else, however I keep running into conflicting packages when creating the conda environment using the environment.yml file. More specifically, Conda continuously attempts to solve the environment (for about half an hour at a time), giving me the following terminal output:

mu00185683@mu00185683-GL65-Leopard-10SFK:~/iros20-6d-pose-tracking$ conda env create -f environment.yml 
Collecting package metadata (repodata.json): done
Solving environment: | 
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.

It will then examine the conflicts for various packages present in the environment file and ultimately fail.

Using the default file straight from the repository, my terminal takes a while to attempt to solve the conflicts, where it will eventually end in a ResolvePackageNotFound error. It does not get any further than this and installing the listed packages manually does not work.

Hence, I have made changes to the file, which involve moving the following 3 packages:
gcc_impl_linux-64=7.5.0=hd420e75_6
gxx_impl_linux-64=7.5.0=hdf63c60_6
qt=4.8.7=2

from under the "dependencies" tag to under the "pip" tag to avoid these errors. This is as suggested by Aki1987 in this thread: datitran/object_detector_app#41 . These three packages are the ones that appear in the ResolvePackageNotFound error.
This then brings me to the point where I am getting "Solving environment: failed" errors and whatnot. The process takes about half an hour before it spits out a very long error (far longer than my terminal's viewing window), meaning that I cannot figure out what is going on.

My setup:
Miniconda, installing the 64-bit Ubuntu 18.04 version following this link: https://docs.conda.io/en/latest/miniconda.html#linux-installers
Python 3.8.5
pip 20.2.4 from /home/mu00185683/miniconda3/lib/python3.8/site-packages/pip (python 3.8)
I'm happy to provide more info if needed.

I'm very lost in this, and all I want to do is to be able to run the network and determine if it's the right fit for my application! If this environment setup is not needed, then please let me know because the instructions on actually running the network are very vague (But that's for another issue thread!)

I'll update this thread if I happen to figure it out.

Any help will be greatly appreciated!
Cheers,
Lachlan

cannot find some functions and how to generate dataset_info.yml?

Dear Bowen
We are trying to re-run your work, but it is not possible with the current repo.
1. The repo does not include the completeBlenderYcbDR(), compute_2Dboundingbox, and normalize_scale functions, so the code cannot be run successfully.
2. How can we generate the dataset_info.yml files? The raw YCB-Video dataset does not include such files.

Thanks in advance.

About training time

Hello, Dr. Wen. How long does it take you to train one object? I am training with a 3090 and batch_size=200, and I found that 200 epochs take roughly 10 days. How long does training take on your side, and is there something wrong with my setup?

About the annotated pose of the first frame

Hello, building on your shared code, I made new prediction data containing an RGB map, a depth map, and the camera intrinsics. But it lacks the annotated pose of the first frame; how can I get the pose of the first frame (a 4x4 matrix)?

When will the code be released

Hi, thanks for the great work! I'm very interested in your project and I really look forward to the code release. Is there any plan on the release date? Thanks :))

about multi-GPU training

I modified CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 in train.sh, but it always trains on only one GPU. How can I train on multiple GPUs?
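A hedged note: CUDA_VISIBLE_DEVICES only controls which GPUs are visible to the process; the training script itself also has to spread the model across them, for example with nn.DataParallel (or, preferably, DistributedDataParallel). Whether train.py already does this should be checked in the code; a generic sketch:

    import torch
    import torch.nn as nn

    model = nn.Linear(10, 10)                 # stand-in for the actual network
    if torch.cuda.device_count() > 1:
        model = nn.DataParallel(model)        # replicates the model over all visible GPUs
    model = model.cuda()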
