
thodan / epos


Code for "EPOS: Estimating 6D Pose of Objects with Symmetries", CVPR 2020.

Home Page: http://cmp.felk.cvut.cz/epos/

License: MIT License

Python 99.43% Shell 0.57%
6d-pose-estimation 6dof-pose object-pose-estimation object-detection deep-learning encoder-decoder robust-estimation computer-vision

epos's Issues

Issue at training time

When I train, I get NaN for the loss:

python train.py --model=lmo

step: 0 total_loss: 9.5576973 obj_cls: 2.77258897 frag_cls: 4.15888262 frag_loc: 2.37503433
step: 100 total_loss: nan obj_cls: nan frag_cls: nan frag_loc: nan
INFO:tensorflow:global_step/sec: 2.1272
step: 200 total_loss: nan obj_cls: nan frag_cls: nan frag_loc: nan
INFO:tensorflow:global_step/sec: 2.22972
step: 300 total_loss: nan obj_cls: nan frag_cls: nan frag_loc: nan
INFO:tensorflow:global_step/sec: 2.22798
step: 400 total_loss: nan obj_cls: nan frag_cls: nan frag_loc: 2.52651024
INFO:tensorflow:global_step/sec: 2.22882
step: 500 total_loss: nan obj_cls: nan frag_cls: nan frag_loc: nan
INFO:tensorflow:global_step/sec: 2.23132
step: 600 total_loss: nan obj_cls: nan frag_cls: nan frag_loc: 2.38968158
INFO:tensorflow:global_step/sec: 2.22965
step: 700 total_loss: nan obj_cls: nan frag_cls: nan frag_loc: nan
INFO:tensorflow:global_step/sec: 2.23278
step: 800 total_loss: nan obj_cls: nan frag_cls: nan frag_loc: nan
INFO:tensorflow:global_step/sec: 2.22892
step: 900 total_loss: nan obj_cls: nan frag_cls: nan frag_loc: 2.19273663
INFO:tensorflow:global_step/sec: 2.22798
step: 1000 total_loss: nan obj_cls: nan frag_cls: nan frag_loc: nan
INFO:tensorflow:global_step/sec: 2.22868
step: 1100 total_loss: nan obj_cls: nan frag_cls: nan frag_loc: nan
INFO:tensorflow:global_step/sec: 2.22729

I think this is what triggers the following error:

Caused by op 'logits/pred_frag_conf/weights_1', defined at:
File "train.py", line 559, in <module>
tf.app.run()
File "/home/default/anaconda3/envs/epos/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "train.py", line 485, in main
freeze_regex_list=FLAGS.freeze_regex_list)
File "train.py", line 355, in _train_epos_model
reuse_variable=(i != 0))
File "train.py", line 267, in _tower_loss
outputs_to_num_channels)
File "train.py", line 239, in _build_epos_model
tf.summary.histogram(model_var.op.name, model_var)
File "/home/default/anaconda3/envs/epos/lib/python3.6/site-packages/tensorflow/python/summary/summary.py", line 187, in histogram
tag=tag, values=values, name=scope)
File "/home/default/anaconda3/envs/epos/lib/python3.6/site-packages/tensorflow/python/ops/gen_logging_ops.py", line 284, in histogram_summary
"HistogramSummary", tag=tag, values=values, name=name)
File "/home/default/anaconda3/envs/epos/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/default/anaconda3/envs/epos/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/home/default/anaconda3/envs/epos/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/home/default/anaconda3/envs/epos/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Nan in summary histogram for: logits/pred_frag_conf/weights_1
[[node logits/pred_frag_conf/weights_1 (defined at train.py:239) = HistogramSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](logits/pred_frag_conf/weights_1/tag, logits/pred_frag_conf/weights/read/_9035)]]
[[{{node xception_65/middle_flow/block1/unit_3/xception_module/separable_conv2_depthwise/BatchNorm/moving_mean/read/_9950}} = _SendT=DT_FLOAT, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1856_..._mean/read", _device="/job:localhost/replica:0/task:0/device:GPU:0"]]

What can I do to get training to work?
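One way to narrow this down is to find the first logged step at which any loss term turns NaN, and then inspect the batches and learning-rate settings just before it. The following is a minimal sketch, assuming log lines in the "step: N total_loss: ..." format shown above; parse_loss_line and first_nan_step are hypothetical helpers, not part of the EPOS repository.

```python
import math
import re

# Matches lines such as:
#   step: 100 total_loss: nan obj_cls: nan frag_cls: nan frag_loc: nan
STEP_RE = re.compile(r"^step:\s*(\d+)\s+(.*)$")
PAIR_RE = re.compile(r"(\w+):\s*(\S+)")


def parse_loss_line(line):
    """Return (step, {name: value}) for a loss line, or None for other lines."""
    match = STEP_RE.match(line.strip())
    if match is None:
        return None
    step = int(match.group(1))
    losses = {name: float(value) for name, value in PAIR_RE.findall(match.group(2))}
    return step, losses


def first_nan_step(lines):
    """Return (step, names of NaN terms) for the first line with a NaN, else None."""
    for line in lines:
        parsed = parse_loss_line(line)
        if parsed is None:
            continue
        step, losses = parsed
        bad = sorted(name for name, value in losses.items() if math.isnan(value))
        if bad:
            return step, bad
    return None


log = [
    "step: 0 total_loss: 9.5576973 obj_cls: 2.77258897 frag_cls: 4.15888262 frag_loc: 2.37503433",
    "step: 100 total_loss: nan obj_cls: nan frag_cls: nan frag_loc: nan",
    "INFO:tensorflow:global_step/sec: 2.1272",
]
result = first_nan_step(log)
```

In the log above the first NaN already appears at step 100, so the divergence happens somewhere in the first hundred steps. The step-0 values of obj_cls and frag_cls are close to ln 16 and ln 64, which is what uniform predictions over 16 and 64 classes would give, so the network appears to start from a sane state and diverge afterwards.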

wrong link

Hi, thodan. First, thanks for sharing. When compiling OSMesa, I found that the link "ftp://ftp.freedesktop.org/pub/mesa/older-versions/17.x/mesa-${mesaversion}.tar.gz" doesn't work. The file I download always fails to unpack. I suspect the download is defective because it is only 324 bytes. Could you share a new link or the file itself?
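A 324-byte file is far too small to be a Mesa source archive, so the download is more likely an error page or a stub than a truncated tarball. A quick check before unpacking is to test the file size and the gzip magic bytes. This is a minimal sketch; looks_like_gzip and the 1024-byte threshold are assumptions chosen for illustration, and the demo files are created locally rather than downloaded.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzip(path, min_size=1024):
    """Return True if the file has at least min_size bytes and starts with the gzip magic."""
    if os.path.getsize(path) < min_size:
        return False
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.tar.gz")
    with gzip.open(good, "wb") as f:
        f.write(os.urandom(4096))  # incompressible payload, so the file stays large
    bad = os.path.join(tmp, "bad.tar.gz")
    with open(bad, "wb") as f:
        f.write(b"x" * 324)  # same size as the broken download in the report
    good_ok = looks_like_gzip(good)
    bad_ok = looks_like_gzip(bad)
```

If the check fails, opening the downloaded file in a text editor usually shows what the server actually returned.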

How to get the experimental comparison results

Hello, your paper reports good results compared to other state-of-the-art methods, but the evaluation metrics you use are not reported in the other papers. How did you obtain the AR scores of the other methods that appear in your paper? Thank you for your reply!

Rank error in function misc.resolve_shape when trying to use flag upsample_logits

I am trying to use the repository to do pose estimation on my own dataset.

First of all, everything seems to work fine in check_train_input.py, train.py, eval.py, and infer.py with the following parameters in params.yml:

#Dataset.
dataset: "sphere"

#Model.
model_variant: "xception_65"
atrous_rates: [12, 24, 36]
encoder_output_stride: 8
decoder_output_stride: [4]
upsample_logits: false
frag_seg_agnostic: false
frag_loc_agnostic: false
num_frags: 64

#Establishing correspondences.
corr_min_obj_conf: 0.1
corr_min_frag_rel_conf: 0.5
corr_project_to_model: false

#Training.
train_tfrecord_names: ["sphere_train-blender"]
train_max_height_before_crop: 128
train_crop_size: "128,128"
optimizer: "AdamOptimizer"
save_interval_steps: 10000
initialize_last_layer: false
fine_tune_batch_norm: false
train_steps: 4500000
train_batch_size: 4
base_learning_rate: 0.0001
obj_cls_loss_weight: 1.0
frag_cls_loss_weight: 1.0
frag_loc_loss_weight: 100.0
train_knn_frags: 1
data_augmentations:
  random_adjust_brightness:
    min_delta: -0.15
    max_delta: 0.15
  random_adjust_contrast:
    min_delta: 0.85
    max_delta: 1.15
  random_adjust_saturation:
    min_delta: 0.85
    max_delta: 1.15
  random_adjust_hue:
    max_delta: 1.0
  random_blur:
    max_sigma: 1.5
  random_gaussian_noise:
    max_sigma: 0.03
  jpeg_artifacts:
    min_quality: 85

However, when I enable the upsample_logits flag, I get the following error:

Traceback (most recent call last):
  File "/home/user/miniconda/envs/eposaidev/lib/python3.8/site-packages/tensorflow_core/python/framework/tensor_shape.py", line 928, in merge_with
    self.assert_same_rank(other)
  File "/home/user/miniconda/envs/eposaidev/lib/python3.8/site-packages/tensorflow_core/python/framework/tensor_shape.py", line 982, in assert_same_rank
    raise ValueError("Shapes %s and %s must have the same rank" %
ValueError: Shapes (?, 128, 128) and (?, ?, ?, ?) must have the same rank

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/miniconda/envs/eposaidev/lib/python3.8/site-packages/tensorflow_core/python/framework/tensor_shape.py", line 1013, in with_rank
    return self.merge_with(unknown_shape(rank=rank))
  File "/home/user/miniconda/envs/eposaidev/lib/python3.8/site-packages/tensorflow_core/python/framework/tensor_shape.py", line 934, in merge_with
    raise ValueError("Shapes %s and %s are not compatible" % (self, other))
ValueError: Shapes (?, 128, 128) and (?, ?, ?, ?) are not compatible

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 584, in <module>
    tf.app.run()
  File "/home/user/miniconda/envs/eposaidev/lib/python3.8/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/user/miniconda/envs/eposaidev/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/user/miniconda/envs/eposaidev/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "train.py", line 505, in main
    train_tensor, summary_op = _train_epos_model(
  File "train.py", line 374, in _train_epos_model
    loss = _tower_loss(
  File "train.py", line 285, in _tower_loss
    _build_epos_model(
  File "train.py", line 202, in _build_epos_model
    loss.add_obj_cls_loss(
  File "/home/user/phd/epos/epos_lib/loss.py", line 131, in add_obj_cls_loss
    targets_shape = misc.resolve_shape(targets, 4)[1:3]
  File "/home/user/phd/epos/epos_lib/misc.py", line 44, in resolve_shape
    shape = tensor.get_shape().with_rank(rank).as_list()
  File "/home/user/miniconda/envs/eposaidev/lib/python3.8/site-packages/tensorflow_core/python/framework/tensor_shape.py", line 1015, in with_rank
    raise ValueError("Shape %s must have rank %d" % (self, rank))
ValueError: Shape (?, 128, 128) must have rank 4

I tried multiple sources of data, including the tfrecord file ycbv_test_targets-bop19.tfrecord provided by the authors, so I at least have some confidence that the data is not the source of the issue. However, I am not an in-depth expert on this repository and have not yet traced the entire path of the data through the code up until this point of failure.

Any clues or insights as to what the shapes should be like at this point of failure? Appreciate the help.
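The last traceback shows that misc.resolve_shape(targets, 4) received a tensor whose static shape is (?, 128, 128), i.e. rank 3 (batch, height, width), while the call requires rank 4. The sketch below reproduces that check on plain shape tuples and shows one possible remedy, appending a trailing channel axis. Whether that is the right fix inside the repository is an assumption; require_rank and add_channel_axis are hypothetical helpers, and None stands in for the unknown dimension '?'.

```python
def require_rank(shape, rank):
    """Raise ValueError unless the shape tuple has the requested rank."""
    if len(shape) != rank:
        raise ValueError("Shape %s must have rank %d" % (shape, rank))
    return list(shape)


def add_channel_axis(shape):
    """Turn (batch, height, width) into (batch, height, width, 1)."""
    return tuple(shape) + (1,)


targets_shape = (None, 128, 128)  # None plays the role of '?'

# The rank-3 shape fails the rank-4 check, as in the traceback.
try:
    require_rank(targets_shape, 4)
    failed = False
except ValueError:
    failed = True

# With an explicit channel axis, the [1:3] slice used in loss.py yields the spatial size.
spatial = require_rank(add_channel_axis(targets_shape), 4)[1:3]
```

In TensorFlow, the corresponding operation on the tensor itself would be tf.expand_dims(targets, -1), applied before the call that fails.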

Depth information in training

Are depth images really necessary for training? I am confused because your paper says you use only RGB images, but when I try to train on my own dataset, depth images are required (when running calc_gt_info).

nan values in training

Hello, thodan.
First of all, thanks for sharing.
I'm trying to train a model with 'python train.py --model='ycbv_custom'', but I get NaN values in the losses, like:
step: 0 total_loss: 9.91135 obj_cls: 3.09097505 frag_cls: 4.15891361 frag_loc: 2.40971112
step: 100 total_loss: nan obj_cls: nan frag_cls: nan frag_loc: nan
INFO:tensorflow:global_step/sec: 1.57567
step: 200 total_loss: nan obj_cls: nan frag_cls: nan frag_loc: nan
INFO:tensorflow:global_step/sec: 1.66842

I checked the input data with the visualization and it looks fine. I have been trying to fix this for days, but I don't know where the error is.
Can you give me some advice?

The visualization result is bad

Hi, thodan. First, thanks for sharing; your README file is very detailed. However, after I configured the environment successfully and ran infer.py with your provided pre-trained models on the tless and ycbv datasets, the visualization results are bad on nearly all test images. I look forward to your help.

about the checkpoint

Can you tell me how to get the pre-trained models for the ResNet backbone? I only found the model for xception-65.

Looking forward to your reply!

Error installing Blender: GLU error

I get an error like this, even though I already ran 'export LD_LIBRARY_PATH=$REPO_PATH/external/llvm/lib:$LD_LIBRARY_PATH' in the previous section.
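A first thing to verify is whether the export is actually visible to the process that fails, and whether a GLU library can be located at all. The sketch below does both from Python. It is only a diagnostic aid under stated assumptions: dir_on_library_path is a hypothetical helper, the /opt/repo paths are made up for illustration, and the actual cause of the error may lie elsewhere.

```python
import ctypes.util
import os


def dir_on_library_path(directory, env=None):
    """Return True if directory is listed in LD_LIBRARY_PATH of the given environment."""
    env = os.environ if env is None else env
    entries = env.get("LD_LIBRARY_PATH", "").split(":")
    target = os.path.normpath(directory)
    return any(os.path.normpath(entry) == target for entry in entries if entry)


# Hypothetical paths, for illustration only.
example_env = {"LD_LIBRARY_PATH": "/opt/repo/external/llvm/lib:/usr/lib"}
llvm_listed = dir_on_library_path("/opt/repo/external/llvm/lib/", example_env)
osmesa_listed = dir_on_library_path("/opt/repo/external/osmesa/lib", example_env)

# find_library returns a library name such as 'libGLU.so.1', or None if GLU is not found.
glu = ctypes.util.find_library("GLU")
```

Calling dir_on_library_path with no env argument checks the environment of the current shell session, which is useful because an export made in one terminal is not inherited by another.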

the links in the project were invalid

Hello, first of all, thank you for sharing. I found that all the links in the project are invalid. Could you please tell me how to get the files now?
Also, I could not compile OSMesa and Progressive-X successfully. Will this affect my training?
I now have my own dataset in BOP format.

Looking forward to your reply!
thank you!

How to label ground truth of b(u)

Hi,
Thanks for your wonderful work.
After reading your paper, I have a question whose answer I could not find in it.
Your paper says, 'if the ground-truth one-hot distribution b̄_i(u) indicates a different fragment at pixels with similar appearance, the network is expected to learn at such pixels the same probability b_ij(u) for all the indicated fragments.'

Your paper also says, 'Vectors ā(u), b̄_i(u), and r̄_ij(u) are obtained by rendering the 3D object models in the ground-truth poses with a custom OpenGL shader.' But I cannot find how to get the ground truth of b̄_i(u), especially for partial symmetries. Could you please give me some suggestions?
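To make the question concrete, here is a minimal sketch of how a one-hot fragment label at a pixel could be computed, assuming each model point is assigned to the nearest fragment center. That assignment rule, the centers, and the point are assumptions made for illustration; the sketch is not the authors' OpenGL shader.

```python
import math


def nearest_fragment(point, centers):
    """Index of the fragment center closest to a 3D point."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def one_hot(index, size):
    """One-hot list of length size with a 1.0 at index."""
    return [1.0 if j == index else 0.0 for j in range(size)]


# Hypothetical fragment centers and a model point seen at some pixel u.
centers = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
point_at_u = (8.0, 1.0, 0.0)

frag = nearest_fragment(point_at_u, centers)
label = one_hot(frag, len(centers))
```

Under this reading the label itself carries no symmetry handling: it marks whichever fragment is visible in the annotated ground-truth pose, and, per the sentence quoted above, the network is expected to learn equal probabilities for all fragments indicated at pixels with similar appearance.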
