
Comments (9)

fmrico commented on June 23, 2024

Hi @dnlwbr

Maybe @fgonzalezr1998 could help you here.

We are also moving our efforts to yolact_ros_3d. I can show you some experiments this week with this:
[Screenshot, 2020-11-13: yolact running in an experiment]

Best!!

from gb_visual_detection_3d.

fgonzalezr1998 commented on June 23, 2024

Hi @dnlwbr The working frame that you specify in the config file has to follow these coordinates: the x-axis points to the front, the y-axis points to the left, and the z-axis points to the top. So, the frame you are using is not correct. Also, take care to use a point cloud that is depth-registered.
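As a side note, the convention above differs from the camera optical frame most RGB-D drivers publish (x right, y down, z forward). A minimal sketch of the conversion, assuming points arrive as an N×3 NumPy array; the function name is illustrative, not part of the package:

```python
import numpy as np

def optical_to_working(points_opt):
    """Convert points from a camera optical frame (x right, y down,
    z forward) to the working-frame convention described above
    (x forward, y left, z up)."""
    x, y, z = points_opt[:, 0], points_opt[:, 1], points_opt[:, 2]
    return np.stack([z, -x, -y], axis=1)

# A point 2 m in front of the camera, slightly right of and below
# center, becomes 2 m forward with negative y (right) and z (down).
p = np.array([[0.1, 0.2, 2.0]])
working = optical_to_working(p)
```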

  • About the second question: it starts from the 2D bounding boxes provided by YOLO. Then, the point-cloud points that correspond to pixels inside the bounding box are counted and the 3D coordinates are updated (not recursive; it is iterative). Finally, visual markers are composed and published.

As @fmrico said, yolact_ros_3d is its successor. It uses YOLACT instead of YOLO and applies a fast thinning algorithm to compute the 3D bounding boxes optimally.
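The iterative procedure in the bullet above can be sketched as follows. This is an illustrative NumPy reimplementation, not the package's actual code; the function name and the (H, W, 3) cloud layout are assumptions:

```python
import numpy as np

def bbox3d_from_cloud(cloud, bbox2d):
    """cloud: organized point cloud as an (H, W, 3) array of XYZ values,
    with NaN where depth is missing. bbox2d: (xmin, ymin, xmax, ymax) in
    pixel coordinates. Every point whose pixel falls inside the 2D box
    is visited, and the per-axis min/max define the 3D bounding box."""
    xmin, ymin, xmax, ymax = bbox2d
    roi = cloud[ymin:ymax, xmin:xmax].reshape(-1, 3)
    roi = roi[~np.isnan(roi).any(axis=1)]   # skip pixels with no depth
    return roi.min(axis=0), roi.max(axis=0)

# Toy 4x4 cloud with two valid points inside the "detection":
cloud = np.full((4, 4, 3), np.nan)
cloud[1, 1] = [1.0, 0.2, 0.5]
cloud[2, 2] = [1.2, -0.1, 0.7]
lo, hi = bbox3d_from_cloud(cloud, (1, 1, 3, 3))
```

The min/max corners then define the axis-aligned 3D box published as a marker.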


dnlwbr commented on June 23, 2024

Thanks for your quick response. I will try the coordinate frame you described.

About yolact_ros_3d: I would like to try it, but unfortunately I don't use ROS 2 yet. Is there a ROS 1 version of it?

Do you have any experience with how accurate the 3D bounding boxes calculated by darknet_ros_3d and yolact_ros_3d are compared to those from a 3D object detection network?


fgonzalezr1998 commented on June 23, 2024

@dnlwbr The yolact_ros_3d bounding boxes are more accurate than those of darknet_ros_3d because it uses YOLACT instead of YOLO.

We have not compared them with a 3D object detection network. However, yolact_ros_3d and darknet_ros_3d provide an advantage: you can get 3D bounding boxes without depending on a particular 3D object detection network.

Unfortunately, yolact_ros_3d is not available for ROS 1 at the moment. When the package has been finalized, I will develop a version for Melodic/Noetic.


germal commented on June 23, 2024

Hello @fgonzalezr1998
This project is simply amazing. Any chance you could add an object path-tracking capability to it?
Thanks a lot


fgonzalezr1998 commented on June 23, 2024

There is a repo (already in development) that calculates an octomap of each detected object and puts it on a 2D map. In this way you can do semantic mapping and object tracking.

This package is being developed on ROS 2, but when it is finished, I will migrate it to ROS 1. If you are interested, you can take a look here @germal


germal commented on June 23, 2024

Hello @fgonzalezr1998 Thank you for your reply!
I see that in darknet_ros the reference frame for the detection is the camera, and the detected objects can be visualized on the map using the markers as a trick.
Is the tracking system of yolact_ros_3d different?
Thanks a lot!
germal


fgonzalezr1998 commented on June 23, 2024

Hi @germal yolact_ros_3d publishes 3D bounding boxes and markers, but it also publishes an octomap that can be overlaid on a map (in the map frame). In this way, you can do semantic mapping, so you will be able to say something like "come to the table", "go near the refrigerator", etc. Here you have a simple demo of how octomaps work.

A great part of this work has been done by @fmrico

However, I insist, this work is still in progress. Thank you for your interest!


dnlwbr commented on June 23, 2024

Hi @fgonzalezr1998 what exactly do you mean by depth_registered? Do you mean the point cloud has to be organized with the same height/width as the RGB image, so that every pixel in the image has a corresponding point in the point cloud at the same index?
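For reference, that pixel-to-point correspondence is the usual meaning of an organized, depth-registered cloud: when the points are published as a flat width×height array, pixel (u, v) maps to flat index v * width + u. A sketch with a hypothetical helper:

```python
import numpy as np

def point_at_pixel(flat_cloud, width, u, v):
    """In a depth-registered, organized cloud stored as a flat array of
    width*height XYZ points, the point for image pixel (u, v) sits at
    flat index v * width + u."""
    return flat_cloud[v * width + u]

# Toy VGA-sized cloud with one known point at pixel (u=50, v=100):
width, height = 640, 480
flat_cloud = np.zeros((width * height, 3))
flat_cloud[100 * width + 50] = [1.0, 0.1, 0.2]
```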

Also, does the working frame have to be the local frame of the camera, or is it possible to use a global frame like odom or map if a transformation is provided via /tf?
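Whether the package accepts a global working frame aside, one common way to express a camera-frame box in another frame (using a rigid transform such as one looked up from /tf) is to transform all eight corners and re-take the min/max; a hedged sketch with NumPy, with an illustrative function name:

```python
import numpy as np

def bbox_to_frame(lo, hi, R, t):
    """Move an axis-aligned box (min/max corners lo, hi in the source
    frame) into a target frame via rotation R and translation t, then
    re-take min/max so the result is axis-aligned in the target frame."""
    corners = np.array([[x, y, z] for x in (lo[0], hi[0])
                                  for y in (lo[1], hi[1])
                                  for z in (lo[2], hi[2])])
    moved = corners @ R.T + t       # rigid transform of all 8 corners
    return moved.min(axis=0), moved.max(axis=0)

# 90-degree rotation about z, no translation:
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
lo2, hi2 = bbox_to_frame(np.zeros(3), np.array([1.0, 2.0, 3.0]), R, np.zeros(3))
```

Note the re-taken min/max can grow the box: an axis-aligned box in one frame is generally not axis-aligned in another.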

Thank you!

