
kirumang / pix2pose

178 stars · 34 forks · 998 KB

Original implementation of the paper "Pix2Pose: Pixel-Wise Coordinate Regression of Objects for 6D Pose Estimation", in ICCV 2019, https://arxiv.org/abs/1908.07433

License: MIT License

Dockerfile 0.15% Python 98.66% Shell 1.19%

pix2pose's Introduction

I'm

  • an Applied Scientist at Robotics AI, Amazon
  • working on object perception and recognition for robot manipulation tasks (e.g., segmentation, pose estimation, grasp generation/planning, general 3D vision stuff)

Main interests

  • Computer Vision and Robot vision: pose estimation of objects, object recognition, object modeling/learning, manipulation
  • Robot programming: ROS

Languages, Libraries, and Tools

  • Python, C++, C#
  • Tensorflow, Keras, Pytorch, PCL, OpenCV, Scipy, Pybullet
  • Blender, Catia

pix2pose's People

Contributors

kirumang


pix2pose's Issues

Shape mismatch error in running inference code

Hi,
I am trying to run your code and got the following error:

Traceback (most recent call last):
  File "tools/5_evaluation_bop_basic.py", line 310, in <module>
    obj_pix2pose[obj_order_id].est_pose(image_t,roi.astype(np.int))            
  File "/host_home/repos/Pix2Pose/pix2pose_model/recognition.py", line 89, in est_pose
    base_image[vv1:vv2,uu1:uu2] = image_no_mask_zero
ValueError: could not broadcast input array from shape (462,56,3) into shape (0,56,3)

I have tracked the code: the error happens when some of the bounding box indices are negative, so the shifting does not always work. Here are the values when the error occurs:
In the pix2pose.get_boxes() method, the bbox input is: [-65 128 -27 163]
v_max and u_max are 480 and 640, respectively,
and the output for vv1, vv2, uu1, uu2 is 71 52 0 52.
The vv1, vv2 pair is clearly wrong. Could you please help?
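For reference, here is a minimal sketch of the kind of boundary clamping I would expect before the crop, assuming the box arrives as [v1, v2, u1, u2]; the helper is hypothetical, not the repo's actual code:

import numpy as np

def clamp_box(bbox, v_max, u_max):
    """Clamp a [v1, v2, u1, u2] box to the image bounds (hypothetical helper)."""
    v1, v2, u1, u2 = bbox
    v1, v2 = max(0, v1), min(v_max, v2)
    u1, u2 = max(0, u1), min(u_max, u2)
    if v2 <= v1 or u2 <= u1:
        return None  # box lies entirely outside the image
    return np.array([v1, v2, u1, u2], int)

print(clamp_box([-65, 128, -27, 163], 480, 640))  # -> [  0 128   0 163]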

Below is my config file:

{
  "backbone":"resnet50",
  "dataset_dir": "/host_home/repos/Pix2Pose/datasets",
  "dataset_names": ["lmo"],
  "detection_pipeline": "rcnn",
  "path_to_detection_pipeline": "/home/kiru/common_ws/Mask_RCNN_Mod" ,  
  "path_to_output": "./bop_result_ali" ,
  "outlier_th":[0.15,0.25,0.35],
  "inlier_th":0.15,
  "norm_factor_fn":"norm_factor.json",
  "background_imgs_for_training":"/home/kiru/media/hdd/datasets/coco2017/train2017/",
  "score_type":1,
  "task_type":2,
  "cand_factor":2,
  "test_target":"test_targets_bop19"  
}

Visualization Problem

I have generated the CSV output for the T-LESS dataset and tried to visualize it using the bop_toolkit.
I changed the shading to 'flat' since I got this error:

  File "scripts/vis_est_poses.py", line 129, in <module>
    ren.add_object(obj_id, model_path, surf_color=model_color)
  File "/Users/negar/Documents/CS701/Pix2Pose/bop_toolkit/bop_toolkit_lib/renderer_py.py", line 376, in add_object
    vertices = np.array(list(zip(model['pts'], model['normals'],
KeyError: 'normals'

However, now I get this error : 

  File "bop_toolkit/scripts/vis_est_poses.py", line 246, in <module>
    vis_rgb_resolve_visib=p['vis_rgb_resolve_visib'])
  File "/Users/negar/Documents/CS701/Pix2Pose/bop_toolkit/bop_toolkit_lib/visualization.py", line 150, in vis_object_poses
    pose['obj_id'], pose['R'], pose['t'], fx, fy, cx, cy)
  File "/Users/negar/Documents/CS701/Pix2Pose/bop_toolkit/bop_toolkit_lib/renderer_py.py", line 470, in render_object
    app.run(framecount=0)
  File "/Applications/anaconda3/envs/Pix2Pose_env/lib/python3.6/site-packages/glumpy/app/__init__.py", line 317, in run
    clock = __init__(clock=clock, framerate=framerate, backend=__backend__)
  File "/Applications/anaconda3/envs/Pix2Pose_env/lib/python3.6/site-packages/glumpy/app/__init__.py", line 277, in __init__
    window.dispatch_event('on_resize', window._width, window._height)
  File "/Applications/anaconda3/envs/Pix2Pose_env/lib/python3.6/site-packages/glumpy/app/window/event.py", line 396, in dispatch_event
    if getattr(self, event_type)(*args):
  File "/Applications/anaconda3/envs/Pix2Pose_env/lib/python3.6/site-packages/glumpy/app/window/window.py", line 221, in on_resize
    self.dispatch_event('on_draw', 0.0)
  File "/Applications/anaconda3/envs/Pix2Pose_env/lib/python3.6/site-packages/glumpy/app/window/event.py", line 386, in dispatch_event
    if handler(*args):
  File "/Users/negar/Documents/CS701/Pix2Pose/bop_toolkit/bop_toolkit_lib/renderer_py.py", line 462, in on_draw
    curr_obj_id, mat_model, mat_view, mat_proj)
  File "/Users/negar/Documents/CS701/Pix2Pose/bop_toolkit/bop_toolkit_lib/renderer_py.py", line 508, in _draw_rgb
    program.draw(gl.GL_TRIANGLES, self.index_buffers[obj_id])
  File "/Applications/anaconda3/envs/Pix2Pose_env/lib/python3.6/site-packages/glumpy/gloo/program.py", line 603, in draw
    self.activate()
  File "/Applications/anaconda3/envs/Pix2Pose_env/lib/python3.6/site-packages/glumpy/gloo/globject.py", line 95, in activate
    self._activate()
  File "/Applications/anaconda3/envs/Pix2Pose_env/lib/python3.6/site-packages/glumpy/gloo/program.py", line 393, in _activate
    attribute.activate()
  File "/Applications/anaconda3/envs/Pix2Pose_env/lib/python3.6/site-packages/glumpy/gloo/globject.py", line 102, in activate
    self._update()
  File "/Applications/anaconda3/envs/Pix2Pose_env/lib/python3.6/site-packages/glumpy/gloo/variable.py", line 418, in _update
    stride = self.data.stride
AttributeError: 'NoneType' object has no attribute 'stride'

I've attached my CSV file as well.

pix2pose-iccv19_tless-test-primesense.csv.zip
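For what it's worth, the first error just means the PLY files carry no per-vertex normals. A rough numpy sketch of how they could be computed before add_object is called, assuming model['pts'] is an (N, 3) float array and model['faces'] an (M, 3) index array as in the bop_toolkit loaders:

import numpy as np

def estimate_vertex_normals(pts, faces):
    """Area-weighted per-vertex normals from a triangle mesh (sketch)."""
    normals = np.zeros(pts.shape, dtype=np.float64)
    tris = pts[faces]  # (M, 3, 3): the three corner vertices of each face
    face_n = np.cross(tris[:, 1] - tris[:, 0], tris[:, 2] - tris[:, 0])
    for i in range(3):  # accumulate each face normal onto its three vertices
        np.add.at(normals, faces[:, i], face_n)
    norms = np.linalg.norm(normals, axis=1, keepdims=True)
    return normals / np.maximum(norms, 1e-12)

# model['normals'] = estimate_vertex_normals(model['pts'], model['faces'])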

Explainability of the autoencoder

Hi Kiru,

is there a way to explain what the autoencoder is actually learning, i.e., which visual features are important for the reconstruction of the input? An example of what I am looking for, in the case of classification, is the Grad-CAM approach, where you can see which regions of the image are important.

What would be a good approach to do this for autoencoders?

Inference time

Hello,

I am trying to reach the inference time from the paper with RCNN (6-7 fps), but there seems to be a bottleneck in pose estimation in the est_pose(...) method in recognition.py.

I measured the time it takes to execute certain code segments during the evaluation. The bottleneck seems to lie in the RANSAC computation in the following lines:

for cand_id in range(len(input_refined)):
    v1_ori,v2_ori,u1_ori,u2_ori,v1,v2,u1,u2,vv1,vv2,uu1,uu2 = box_refined[cand_id]
    img_prob_ori = resize(prob[cand_id,:,:,0],(v2_ori-v1_ori,u2_ori-u1_ori),order=1,mode='constant',cval=1)
    img_prob_ori = img_prob_ori[vv1:vv2,uu1:uu2]
    gray = np.linalg.norm(decode[cand_id],axis=2)<0.3
    non_gray = np.invert(gray)
    decode[cand_id,gray,:]=0 #difference..
    img_pred = (decode[cand_id]+1)/2
    img_pred[img_pred > 1] = 1
    img_pred[img_pred < 0] = 0
    img_pred_ori = resize(img_pred,(v2_ori-v1_ori,u2_ori-u1_ori),order=1,mode='constant',cval=0.5)*255
    non_gray = resize(non_gray.astype(float),(v2_ori-v1_ori,u2_ori-u1_ori),order=1,mode='constant',cval=0)>0.9
    non_gray = non_gray[vv1:vv2,uu1:uu2]
    n_non_gray = np.sum(non_gray)
    if n_non_gray<10:
        continue
    img_pred_ori = img_pred_ori[vv1:vv2,uu1:uu2]
    rgb_aug_test2 = np.zeros((rgb.shape[0],rgb.shape[1],3),np.uint8)
    rgb_aug_test2[v1:v2,u1:u2]=[128,128,128]
    rgb_aug_test2[v1:v2,u1:u2]=img_pred_ori
    rot_pred_cand,tra_pred_cand,valid_mask,n_inliers = self.pnp_ransac(rgb_aug_test2,img_prob_ori,non_gray,v1,v2,u1,u2)
    #print("n_inliers:",n_inliers,rot_pred_cand,tra_pred_cand)
    n_inliers = n_inliers / n_non_gray
    if(n_inliers>max_inlier):
        valid_mask_full = np.zeros((rgb.shape[0],rgb.shape[1]),bool)
        valid_mask_full[v1:v2,u1:u2]=valid_mask
        rot_pred = rot_pred_cand
        tra_pred = tra_pred_cand
        img_pred_f = img_pred_ori
        max_inlier = n_inliers
#frac of max_inlier
if(max_inlier==-1):
    #print("not valid max_inlier at the second stage")
    return img_pred,-1,-1,-1,-1,np.array([v1,v2,u1,u2],np.int)
else:
    return img_pred_f.astype(np.uint8),valid_mask_full,rot_pred,tra_pred,max_inlier/n_init_mask,np.array([v1,v2,u1,u2],np.int)

I used timeit to measure this patch of code, and the mean time is around 1 second per image.

I am using 640x400 RGB images and the basic_evaluation script. My GPU is an NVIDIA RTX 2070, so I guess the hardware might not be the issue.

What could be the cause of the bottleneck in this case? I achieve the same detection results with Faster R-CNN (130 ms) as the paper, but the pose estimation is significantly slower.
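For anyone profiling the same spot: assuming pnp_ransac ultimately wraps OpenCV's solvePnPRansac (I have not verified this), the iteration count and reprojection threshold dominate its runtime, and a standalone timing sketch like the one below makes the trade-off easy to measure. The correspondences here are random stand-ins (the predicted XYZ coordinates and valid pixel locations in the real pipeline), so the solver may simply report failure:

import time
import numpy as np
import cv2

# Hypothetical correspondences; K is the T-LESS primesense intrinsics printed
# by the evaluation script.
object_pts = np.random.rand(2000, 3).astype(np.float32)
image_pts = (np.random.rand(2000, 2) * [640, 400]).astype(np.float32)
K = np.array([[1075.65, 0, 641.07],
              [0, 1073.90, 507.72],
              [0, 0, 1]], np.float64)

t0 = time.perf_counter()
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    object_pts, image_pts, K, None,
    iterationsCount=300,      # lowering this directly cuts runtime
    reprojectionError=5.0,
    flags=cv2.SOLVEPNP_EPNP)
n_in = 0 if inliers is None else len(inliers)
print("RANSAC took %.3fs, inliers: %d" % (time.perf_counter() - t0, n_in))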

Permission denied: '/home/kiru'

Hello, can you tell me where to download the weight files "inference.hdf5" and "weight_detection"? I only found the "Base archive", "Object models", "Synt. training images", and "All test images" from BOP: Benchmark for 6D Object Pose Estimation, and those only contain "models" and "model_eval" folders, with no "models_xyz" folder.
And when I execute the script python3 tools/5_evaluation_bop_basic.py <gpu_id> <cfg_path> <dataset_name>
I get this error:

Traceback (most recent call last):
  File "/home/Workspace/Project/Pix2Pose/tools/5_evaluation_bop_basic.py", line 119, in <module>
    os.makedirs(output_img)
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/os.py", line 210, in makedirs
    makedirs(head, mode, exist_ok)
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/os.py", line 210, in makedirs
    makedirs(head, mode, exist_ok)
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/os.py", line 210, in makedirs
    makedirs(head, mode, exist_ok)
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/os.py", line 220, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/home/kiru'

There is no "kiru" in os.py, and I replaced all the paths containing "kiru" in the "tools" folder, but it still doesn't work.
Can you tell me how to solve it? Thank you!
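In the meantime, a small generic helper (not part of the repo) can list every file that still mentions the hard-coded /home/kiru path:

import os

# Walk the folders most likely to contain hard-coded paths and report hits.
for root in ("tools", "cfg", "pix2pose_model"):
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, errors="ignore") as f:
                    for lineno, line in enumerate(f, 1):
                        if "/home/kiru" in line:
                            print("%s:%d: %s" % (path, lineno, line.strip()))
            except OSError:
                pass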

One object with multiple models

Hi,
I have a question about generating the training data. For one object, such as a cup, I used a depth sensor to reconstruct many 3D models. How can I create the training data pairs in a consistent style? Each model of the cup is at a different location in world coordinates. Can I directly use the scripts in tools, or do I have to transform the models to the same position first and then run the scripts?
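To make the question concrete, here is a rough numpy sketch of the alignment step I have in mind. T_model_to_canonical would be whatever 4x4 transform (e.g., estimated via ICP) maps one reconstruction into a shared object frame; the helper is hypothetical:

import numpy as np

def transform_vertices(vertices, T):
    """Apply a 4x4 rigid transform T to an (N, 3) vertex array."""
    homo = np.hstack([vertices, np.ones((len(vertices), 1))])  # (N, 4)
    return (homo @ T.T)[:, :3]

# vertices_shared = transform_vertices(model_vertices, T_model_to_canonical)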

Error while reproducing BOP results on YCB-V using resnet50

Hello,

Thank you for your wonderful work! I am trying to reproduce the BOP challenge results on the YCB-V dataset using both versions submitted to the challenge. I am able to reproduce the results using 'paper' as the backbone, but with 'resnet50' as the backbone I get the following error.

load pix2pose weight for obj_1 from /home/akm7rng/Desktop/Dataset/BOP/Dataset/ycbv/pix2pose_weights/01/inference_resnet_model.hdf5
XXX lineno: 188, opcode: 0
Traceback (most recent call last):
  File "/home/akm7rng/envs/pix2pose/lib/python3.6/site-packages/keras/engine/saving.py", line 260, in load_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "/home/akm7rng/envs/pix2pose/lib/python3.6/site-packages/keras/engine/saving.py", line 334, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "/home/akm7rng/envs/pix2pose/lib/python3.6/site-packages/keras/layers/__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "/home/akm7rng/envs/pix2pose/lib/python3.6/site-packages/keras/utils/generic_utils.py", line 145, in deserialize_keras_object
    list(custom_objects.items())))
  File "/home/akm7rng/envs/pix2pose/lib/python3.6/site-packages/keras/engine/network.py", line 1027, in from_config
    process_node(layer, node_data)
  File "/home/akm7rng/envs/pix2pose/lib/python3.6/site-packages/keras/engine/network.py", line 986, in process_node
    layer(unpack_singleton(input_tensors), **kwargs)
  File "/home/akm7rng/envs/pix2pose/lib/python3.6/site-packages/keras/engine/base_layer.py", line 457, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/akm7rng/envs/pix2pose/lib/python3.6/site-packages/keras/layers/core.py", line 682, in call
    return self.function(inputs, **arguments)
  File "/home/kiru/common_ws/Pix2Pose/pix2pose_model/ae_model.py", line 188, in <lambda>
SystemError: unknown opcode

I am running the code with Python 3.6.3 and Keras 2.2.1.
Please let me know if you need any information from my side.
Thank you for your assistance!

pyglet.window.NoSuchConfigException

Hello Kiru,
I have two problems.
First, when I used cfg_bop2019.json to run 5_evaluation_bop_basic.py, the following problem occurred:

Warning.. have to adjust the scale
Traceback (most recent call last):
  File "/home/Pix2Pose/tools/5_evaluation_bop_basic.py", line 297, in <module>
    img_pred,mask_pred,rot_pred,tra_pred,frac_inlier,bbox_t = obj_pix2pose[obj_order_id].est_pose(image_t,roi.astype(np.int))
ValueError: not enough values to unpack (expected 6, got 5)

I changed line 297 to unpack five values instead of six, and then got the following:
ValueError: too many values to unpack (expected 5)

Second, when I run 5_evaluation_bop_icp3d.py, the following problem occurred:

if models are not fully listed above, please make sure there are ply files available
Traceback (most recent call last):
  File "/home/Pix2Pose/tools/5_evaluation_bop_icp3d.py", line 193, in <module>
    ren = Renderer((im_width,im_height),cam_K)
  File "/home/Pix2Pose/rendering/renderer_xyz.py", line 96, in get_instance
    instances[cls] = cls(size, cam)
  File "/home/Pix2Pose/rendering/renderer_xyz.py", line 106, in __init__
    app.Canvas.__init__(self, show=False, size=size)
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/vispy/app/canvas.py", line 205, in __init__
    self.create_native()
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/vispy/app/canvas.py", line 222, in create_native
    self._app.backend_module.CanvasBackend(self, **self._backend_kwargs)
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/vispy/app/backends/_pyglet.py", line 211, in __init__
    screen=self._vispy_screen)
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/pyglet/window/xlib/__init__.py", line 171, in __init__
    super(XlibWindow, self).__init__(*args, **kwargs)
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/pyglet/window/__init__.py", line 593, in __init__
    config = screen.get_best_config(config)
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/pyglet/canvas/base.py", line 197, in get_best_config
    raise window.NoSuchConfigException()
pyglet.window.NoSuchConfigException

Can you help me to solve it?
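For the second problem: pyglet.window.NoSuchConfigException usually means no matching OpenGL config was found on the current display (common over SSH or on headless machines). One possible workaround, assuming the backend is installed, is to switch vispy away from pyglet before the Renderer is constructed:

import vispy
# Try a non-pyglet backend for headless use, e.g. EGL or glfw (must be installed).
vispy.use(app='egl')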

Script gets killed on execution of 5_evaluation_bop_basic.py

Hey, while trying to reproduce the BOP results on the T-LESS dataset I ran into the following error. The only changes I made in the cfg file were to customize the paths for "dataset_dir", "path_to_detection_pipeline" and "path_to_output". While running, the 5_evaluation_bop_basic.py script was stopped without any further error description. Is the issue related to insufficient memory, and if so, is there a way to work around the shortage?
Thanks in advance, and props to your well-documented and easy-to-read code!

$ python ./tools/5_evaluation_bop_basic.py gpu_id=0 ./cfg/cfg_bop2020_custom.json tless
Using TensorFlow backend.
2021-01-01 12:04:17.586929: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2021-01-01 12:04:17.613975: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3000000000 Hz
2021-01-01 12:04:17.614244: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55aa550 executing computations on platform Host. Devices:
2021-01-01 12:04:17.614257: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Host, Default Version
1 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000001.ply
2 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000002.ply
3 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000003.ply
4 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000004.ply
5 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000005.ply
6 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000006.ply
7 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000007.ply
8 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000008.ply
9 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000009.ply
10 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000010.ply
11 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000011.ply
12 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000012.ply
13 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000013.ply
14 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000014.ply
15 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000015.ply
16 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000016.ply
17 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000017.ply
18 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000018.ply
19 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000019.ply
20 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000020.ply
21 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000021.ply
22 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000022.ply
23 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000023.ply
24 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000024.ply
25 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000025.ply
26 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000026.ply
27 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000027.ply
28 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000028.ply
29 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000029.ply
30 /home/users/rgg_js/pix2pose/bop/tless/models_reconst/obj_000030.ply
if models are not fully listed above, please make sure there are ply files available
Camera info-----------------
1280 1024
[[1.07565092e+03 0.00000000e+00 6.41068883e+02]
 [0.00000000e+00 1.07390348e+03 5.07721598e+02]
 [0.00000000e+00 0.00000000e+00 1.00000000e+00]]
-----------------

Configurations:
BACKBONE                       resnet101
BACKBONE_STRIDES               [4, 8, 16, 32, 64]
BATCH_SIZE                     1
BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE         None
DETECTION_MAX_INSTANCES        200
DETECTION_MIN_CONFIDENCE       0.001
DETECTION_NMS_THRESHOLD        0.7
FPN_CLASSIF_FC_LAYERS_SIZE     1024
GPU_COUNT                      1
GRADIENT_CLIP_NORM             5.0
IMAGES_PER_GPU                 1
IMAGE_CHANNEL_COUNT            3
IMAGE_MAX_DIM                  1280
IMAGE_META_SIZE                43
IMAGE_MIN_DIM                  1024
IMAGE_MIN_SCALE                0
IMAGE_RESIZE_MODE              square
IMAGE_SHAPE                    [1280 1280    3]
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.001
LOSS_WEIGHTS                   {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE                 14
MASK_SHAPE                     [28, 28]
MAX_GT_INSTANCES               100
MEAN_PIXEL                     [123.7 116.8 103.9]
MINI_MASK_SHAPE                (56, 56)
NAME                           tless
NUM_CLASSES                    31
POOL_SIZE                      7
POST_NMS_ROIS_INFERENCE        2000
POST_NMS_ROIS_TRAINING         2000
PRE_NMS_LIMIT                  6000
ROI_POSITIVE_RATIO             0.33
RPN_ANCHOR_RATIOS              [0.5, 1, 2]
RPN_ANCHOR_SCALES              (16, 32, 64, 128, 256)
RPN_ANCHOR_STRIDE              1
RPN_BBOX_STD_DEV               [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD              0.9
RPN_TRAIN_ANCHORS_PER_IMAGE    256
STEPS_PER_EPOCH                1000
TOP_DOWN_PYRAMID_SIZE          256
TRAIN_BN                       False
TRAIN_ROIS_PER_IMAGE           1
USE_MINI_MASK                  True
USE_RPN_ROIS                   True
VALIDATION_STEPS               5
WEIGHT_DECAY                   0.0001


WARNING:tensorflow:From /home/users/rgg_js/.venvs/pix2pose/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
2021-01-01 12:04:25.392865: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 51380224 exceeds 10% of system memory.
2021-01-01 12:04:25.562079: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 51380224 exceeds 10% of system memory.
2021-01-01 12:04:25.597218: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 51380224 exceeds 10% of system memory.
load pix2pose weight for obj_1 from /home/users/rgg_js/pix2pose/bop/tless/pix2pose_weights/01/inference_resnet_model.hdf5
2021-01-01 12:04:35.454465: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 16777216 exceeds 10% of system memory.
2021-01-01 12:04:35.454786: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 33554432 exceeds 10% of system memory.
load pix2pose weight for obj_2 from /home/users/rgg_js/pix2pose/bop/tless/pix2pose_weights/02/inference_resnet_model.hdf5
Killed

The performance of mask-rcnn on tless dataset

Hi,

I used the provided weights mask_rcnn_tless_0005.h5 to test images from test_targets_bop2019.json of the T-LESS dataset. However, the 2D detection is not very accurate, especially in complex scenes such as 19 and 20. Is this normal?

Examples are as follows:

(two screenshots attached)

visualization of 3D bounding box

Hello, I want to know how to visualize the estimated results and display the 3D bounding box as in the paper. Without using the bop_toolkit, I use the rendering pipeline to render the pose of the object; in that function, the rendering pipeline creates a training image from the 3D model, and its return value is a 2D bounding box. How can I draw a 3D bounding box on a test image? Hope to get your answer, thank you.
(image attached)
I made some attempts based on the answers to previous questions and got a rendering of a single object.
(image attached)
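In case it helps: given an estimated rotation R, translation t, and the camera matrix K, the usual approach is to project the eight corners of the model's axis-aligned bounding box and connect the twelve edges. A hedged OpenCV sketch (the corner extents come straight from the model vertices; nothing here is a repo helper):

import numpy as np
import cv2

def draw_3d_bbox(img, model_pts, R, t, K, color=(0, 255, 0)):
    """Project the 8 bounding-box corners with the estimated pose and draw edges."""
    mins, maxs = model_pts.min(axis=0), model_pts.max(axis=0)
    corners = np.array([[x, y, z]
                        for x in (mins[0], maxs[0])
                        for y in (mins[1], maxs[1])
                        for z in (mins[2], maxs[2])], dtype=np.float64)
    rvec, _ = cv2.Rodrigues(np.asarray(R, np.float64))
    proj, _ = cv2.projectPoints(corners, rvec, np.asarray(t, np.float64),
                                np.asarray(K, np.float64), None)
    proj = proj.reshape(-1, 2).astype(int)
    edges = [(0, 1), (2, 3), (4, 5), (6, 7),   # edges along z
             (0, 2), (1, 3), (4, 6), (5, 7),   # edges along y
             (0, 4), (1, 5), (2, 6), (3, 7)]   # edges along x
    for i, j in edges:
        cv2.line(img, tuple(map(int, proj[i])), tuple(map(int, proj[j])), color, 2)
    return img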

Layer #2 (named "conv1") expects 2 weight(s), but the saved weights have 1 element(s).

I apologize for the newbie question, but I've been trying to get this code up and running on a dedicated machine and I can't seem to get past one error.

When running the command:
python3 tools/5_evaluation_bop_basic.py 0 /home/taylor/Pix2Pose/cfg/cfg_bop2020.json tless

I get the error:
Traceback (most recent call last):
  File "tools/5_evaluation_bop_basic.py", line 191, in <module>
    model.load_weights(last_path, by_name=True)
  File "/home/taylor/tf_env/lib/python3.7/site-packages/mask_rcnn-2.1-py3.7.egg/mrcnn/model.py", line 2130, in load_weights
  File "/home/taylor/tf_env/lib/python3.7/site-packages/keras/engine/saving.py", line 1328, in load_weights_from_hdf5_group_by_name
    str(weight_values[i].shape) + '.')
ValueError: Layer #391 (named "mrcnn_bbox_fc"), weight <tf.Variable 'mrcnn_bbox_fc/kernel:0' shape=(1024, 116) dtype=float32> has shape (1024, 116), but the saved weight has shape (1024, 36).

I believe the folder structures are correct, and I downloaded the T-LESS 2D Mask R-CNN Detection + Pix2Pose weights from the link provided and placed them accordingly.

As far as I understand, the error may be caused by a difference in how Keras saves a model vs. how TensorFlow does: TensorFlow saves batch-norm weights as [beta, gamma, running_mean, running_variance], while Keras saves them as [gamma, beta, running_mean, running_variance].

All that info to ask: could anyone verify that this is the issue I'm running into, and whether there is a solution?
Thank you for taking the time to read this and great work on this project!
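Side note in case it helps: 116 = 4 x 29 and 36 = 4 x 9, and mrcnn_bbox_fc has 4 outputs per class, so this particular mismatch looks more like a NUM_CLASSES disagreement between the config and the checkpoint than a batch-norm ordering problem. A small generic h5py sketch (the path is an example) to dump what the saved file actually contains:

import h5py

path = "pix2pose_weights/01/inference_resnet_model.hdf5"  # example path
with h5py.File(path, "r") as f:
    # Full-model saves keep weights under "model_weights"; weights-only saves
    # keep layer_names at the root.
    g = f["model_weights"] if "model_weights" in f else f
    for layer in g.attrs.get("layer_names", []):
        layer = layer.decode() if isinstance(layer, bytes) else layer
        grp = g[layer]
        for wname in grp.attrs.get("weight_names", []):
            wname = wname.decode() if isinstance(wname, bytes) else wname
            print(layer, wname, grp[wname].shape)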

Problem of LineMOD prediction result

Hello, I tried to run the model to predict on the LineMOD dataset, and the img_pred and mask_pred results do not make any sense, as in the picture attached below

(from lines 303-304: img_pred,mask_pred,rot_pred,tra_pred,frac_inlier,bbox_t = obj_pix2pose[obj_order_id].est_pose(image_t,roi.astype(np.int)))

(image attached)

I noticed that before loading the weights there is a line that says "Re-starting from epoch 5". Does that mean only the weights generated at the 5th epoch are loaded here?

ValueError: Could not load "" Reason: "image file is truncated"

Hello, I ran into trouble when running the 5_evaluation_bop_basic.py file. The error is:

Recognizing scene_id:9, im_id:242
Recognizing scene_id:9, im_id:243
Traceback (most recent call last):
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/PIL/ImageFile.py", line 235, in load
    s = read(self.decodermaxblock)
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/PIL/PngImagePlugin.py", line 659, in load_read
    cid, pos, length = self.png.read()
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/PIL/PngImagePlugin.py", line 122, in read
    length = i32(s)
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/PIL/_binary.py", line 82, in i32be
    return unpack_from(">I", c, o)[0]
struct.error: unpack_from requires a buffer of at least 4 bytes

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/imageio/plugins/pillow.py", line 671, in pil_try_read
    im.getdata()[0]
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/PIL/Image.py", line 1304, in getdata
    self.load()
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/PIL/ImageFile.py", line 241, in load
    raise IOError("image file is truncated")
OSError: image file is truncated

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/Pix2Pose/tools/5_evaluation_bop_basic.py", line 261, in <module>
    image_t = inout.load_im(rgb_path)            
  File "./bop_toolkit/bop_toolkit_lib/inout.py", line 22, in load_im
    im = imageio.imread(path)
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/imageio/core/functions.py", line 264, in imread
    reader = read(uri, format, "i", **kwargs)
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/imageio/core/functions.py", line 186, in get_reader
    return format.get_reader(request)
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/imageio/core/format.py", line 164, in get_reader
    return self.Reader(self, request)
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/imageio/core/format.py", line 214, in __init__
    self._open(**self.request.kwargs.copy())
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/imageio/plugins/pillow.py", line 300, in _open
    return PillowFormat.Reader._open(self, pilmode=pilmode, as_gray=as_gray)
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/imageio/plugins/pillow.py", line 137, in _open
    pil_try_read(self._im)
  File "/home/anaconda3/envs/pix2pose/lib/python3.6/site-packages/imageio/plugins/pillow.py", line 682, in pil_try_read
    raise ValueError(error_message)
ValueError: Could not load "" 
Reason: "image file is truncated"
Please see documentation at: http://pillow.readthedocs.io/en/latest/installation.html#external-libraries

I downloaded the T-LESS dataset from BOP, which lacks the images 08-109.png, 08-110.png, 09-244.png, 14-367.png, and 18-497.png. I downloaded the corresponding files from the official website, but after placing them it still errors at 09-244.png.

Can you tell me how to solve it?
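In case someone else hits this, a small generic PIL script (the glob pattern is an example; adjust it to your layout) can scan the dataset and confirm exactly which files are truncated before re-downloading:

import glob
from PIL import Image

for path in sorted(glob.glob("tless/test_primesense/*/rgb/*.png")):
    try:
        with Image.open(path) as im:
            im.load()  # force a full decode; raises on truncated files
    except Exception as e:
        print("BAD %s: %s" % (path, e))

As a last resort, setting PIL.ImageFile.LOAD_TRUNCATED_IMAGES = True makes PIL tolerate truncated files, though the affected images will decode only partially.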

Please provide the RetinaNet training dataset for T-LESS generated by 1_1_scene_gen_for_detection.py.

Hello!
Thank you for proposing this wonderful method and releasing the code.
I'm trying to run your public script 1_1_scene_gen_for_detection.py; however, we were unable to run it in our execution environment.

This is probably due to a problem with our runtime environment, but I don't see a solution. If possible, could you please provide the RetinaNet training dataset generated by 1_1_scene_gen_for_detection.py for the T-LESS dataset?

I’m sorry for the trouble. Thank you so much.

BrokenPipeError when training

Hi Kiru,

first of all, great work on this project! I have been using your repo for a while now and have trained my own models before, but the new training code seems to throw a BrokenPipeError with the following stack trace:

  File "/home/marzdr/anaconda3/envs/pix2pose_clone/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/marzdr/anaconda3/envs/pix2pose_clone/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/marzdr/anaconda3/envs/pix2pose_clone/lib/python3.6/multiprocessing/pool.py", line 130, in worker
    put((job, i, (False, wrapped)))
  File "/home/marzdr/anaconda3/envs/pix2pose_clone/lib/python3.6/multiprocessing/queues.py", line 347, in put
    self._writer.send_bytes(obj)
  File "/home/marzdr/anaconda3/envs/pix2pose_clone/lib/python3.6/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/marzdr/anaconda3/envs/pix2pose_clone/lib/python3.6/multiprocessing/connection.py", line 404, in _send_bytes
    self._send(header + buf)
  File "/home/marzdr/anaconda3/envs/pix2pose_clone/lib/python3.6/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe

I could not fix the problem itself, so if anyone faces the same issue: I just use the old training script from commit 9751daa, which seems to work fine with the up-to-date repo as it is.

What could be the problem? I am using an RTX 2070 with 16 GB of RAM, so it shouldn't be a memory capacity problem.

Training works with the old script, so it's not a major problem, but I wanted to give a heads-up to anyone who might come across this issue.
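For anyone else hitting this: a BrokenPipeError out of multiprocessing.pool usually means a worker process died mid-transfer (often from an exception in the generator or an over-large pickled batch). Assuming the new training code feeds batches through Keras' fit_generator (I have not verified which path it takes), a first thing to try is forcing single-process data loading; this is a minimal, self-contained sketch of the flags, with stand-in model and generator:

import numpy as np
import keras

# Stand-ins just to demonstrate the flags; the real model and generator
# come from the Pix2Pose training script.
model = keras.models.Sequential([keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="adam", loss="mse")

def batches():
    while True:
        yield np.random.rand(8, 4), np.random.rand(8, 1)

model.fit_generator(batches(), steps_per_epoch=5, epochs=1,
                    workers=0,                  # run the generator in-process
                    use_multiprocessing=False)  # bypass the multiprocessing pipe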

Could you release some of the datasets you generated?

Hi, could you release some of the data you used in Pix2Pose? I want to test some ideas, but the data generation is complicated; just a few objects would be fine. The data is a pair of an RGB image and an XYZ map.
Thank you!

Best way to create custom dataset with real data

Hi Kiru,

My setup

I am using Pix2Pose for a project where I am estimating the pose of a moving car from a static camera. The scene is very simple: the camera is static and the car is moving only as a forward and backward translation (no turns).

I am using the official CAD model for the car for the pose estimation.

My question to you

I want to use real training data to avoid the "reality gap" from synthetic data. I am using markers attached to the windshield of the car to estimate the ground truth pose with respect to them. However this has two issues:

  • The pose of the markers is slightly noisy, so we can expect the Pix2Pose GAN to inherit this noise as well
  • The pose is with respect to the marker, so I have to manually adjust the CAD model's coordinate system to be in sync with the marker pose, which will never be perfect

What is a good general way to generate ground-truth 6D pose data? Is it possible to achieve good offline pose accuracy with just one camera, or should I include more to reduce the variance? Or, if I want good poses, should I turn to specialized equipment such as LiDAR scanners? I would like to achieve a GT accuracy of under 1.5 cm.

Please let me know if you know any good techniques to achieve my task or refer me to some literature. In the end, the Pix2Pose model cannot be better than the training data you give it, right? :)
