LabelFusion

[animation: docs/fusion_short.gif]

This repo holds the code for LabelFusion, a pipeline for generating ground truth labels for real RGBD data of cluttered scenes (home page: http://labelfusion.csail.mit.edu).

Getting Started

The first step is to download and set up our Docker image.

Alternatively, if you'd like to install natively, this document may be helpful: https://github.com/RobotLocomotion/LabelFusion/blob/master/docs/setup.rst

Inspecting Data from LabelFusion

If you've downloaded some of the LabelFusion data and would like to inspect it, we recommend the following:

  1. Run our Docker image (instructions here: https://hub.docker.com/r/robotlocomotion/labelfusion/)
  2. Inside the container, navigate to a log directory and run the alignment tool. Even though the data has already been labeled, you can inspect the results:
cd ~/labelfusion/data/logs_test/2017-06-16-20
run_alignment_tool

You should see a GUI like the following:

[screenshot: docs/labelfusion_screenshot.png]

  3. Inspect the labeled images (cd path-to-labelfusion-data/logs_test/2017-06-16-20/images and browse the images)
  4. Run a script to print the overall status of the dataset (note this may take ~10-20 seconds for the full dataset): dataset_update_status -o

Training on Object Detection, Segmentation, and/or Pose data

[animation: docs/color_masks.gif]

[animation: docs/bbox_trim.gif]

LabelFusion provides training data suitable for training a variety of perception systems. This includes:

  • semantic segmentation (pixelwise classification)
  • 2D object detection (bounding box + classification) -- note that we provide the segmentation masks, not the bounding boxes, but the bounding boxes can be computed from the masks (see the sketch after this list)
  • 6 DoF object poses
  • 3D object detection (bounding box + classification) -- the 3D bounding box can be computed from the 6 DoF object pose together with the object's mesh (also sketched below)
  • 6 DoF camera poses -- these come for free, without any manual labeling, from the dense SLAM method we use, ElasticFusion
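
To make the two derived box types above concrete, here is a minimal numpy sketch that computes 2D boxes from a label mask and 3D box corners from a pose plus mesh vertices. The file path, the background id of 0, the single-channel mask format, and the pose/vertex placeholders are illustrative assumptions; see docs/data_organization.rst for the actual layout:

  import numpy as np
  from PIL import Image

  # 2D boxes from a label mask (hypothetical filename; check your images/ dir).
  # Assumes a single-channel image whose pixel values are object indices.
  mask = np.array(Image.open("images/0000000001_labels.png"))
  for obj_id in np.unique(mask):
      if obj_id == 0:  # assume 0 = background
          continue
      ys, xs = np.nonzero(mask == obj_id)
      print(f"object {obj_id}: bbox = ({xs.min()}, {ys.min()}, {xs.max()}, {ys.max()})")

  # 3D box corners: transform the mesh's local bounds by the 6 DoF object pose.
  # T_world_object and verts are placeholders; load the real pose and mesh yourself.
  T_world_object = np.eye(4)
  verts = np.random.rand(100, 3)  # stand-in for (N, 3) mesh vertices in the object frame
  lo, hi = verts.min(axis=0), verts.max(axis=0)
  corners = np.array([[x, y, z, 1.0]
                      for x in (lo[0], hi[0])
                      for y in (lo[1], hi[1])
                      for z in (lo[2], hi[2])])
  corners_world = (T_world_object @ corners.T).T[:, :3]  # 8 box corners in world frame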

Please see this document to better understand how the data is structured: https://github.com/RobotLocomotion/LabelFusion/blob/master/docs/data_organization.rst

At the time of publication for LabelFusion, we used this repo to train segmentation networks: https://github.com/DrSleep/tensorflow-deeplab-resnet

Quick Pipeline Instructions for Making New Labeled Data with LabelFusion

This is the quick version. If you'd prefer to go step-by-step manually, see Pipeline_Instructions.

Collect raw data from Xtion

First, run cdlf && cd data/logs, then make a new directory for your data. In one terminal, run:

openni2-camera-lcm

In another, run:

lcm-logger

Your data will be saved in the current directory as lcmlog-*.
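
If you'd like to sanity-check a recorded log before processing it, the LCM Python bindings can read it directly. A minimal sketch (the log filename is a placeholder):

  import lcm

  # Count messages per channel in a recorded log.
  counts = {}
  for event in lcm.EventLog("lcmlog-2017-06-16.00", "r"):
      counts[event.channel] = counts.get(event.channel, 0) + 1
  for channel, n in sorted(counts.items()):
      print(f"{channel}: {n} messages")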

Process into labeled training data

First we will launch a log player with a slider and a viewer. The terminal will prompt for a start and end time to trim the log, then save the outputs:

run_trim

Next, we prepare for object pose fitting by running ElasticFusion and formatting the output:

run_prep

Next, launch the object alignment tool and follow the three steps:

run_alignment_tool
First, check the available object types:
  • In your data directory, open object_data.yaml, review the available objects, and add any objects / meshes that you need.
    • If you need multiple instances of the same object, you will need to create separate copies of the object with unique names (e.g. drill-1, drill-2, ...). For networks that do object detection, make sure to remove this distinction from your labels / classes.
  1. Align the reconstructed point cloud:

    • Open the Measurement Panel (View -> Measurement Panel), then check Enabled in the panel

    • Use shift + click to pick two points: first on the surface of the table, then on a point above the table

    • Open the Director terminal with F8 and run the following (a geometric sketch of what this rotation does appears after these steps):

      gr.rotateReconstructionToStandardOrientation()
      
    • Close the run_alignment_tool application (ctrl + c) and rerun.

  2. Segment the point cloud above the table

    • As above, use shift + click to pick two points: first on the surface of the table, then on a point above the table

    • Open the Director terminal with F8 and run:

    gr.segmentTable()
    gr.saveAboveTablePolyData()

    • Close the run_alignment_tool application (ctrl + c) and rerun.
  3. Align each object and crop point clouds.

    • Assign the current object you're aligning, e.g.:

      objectName = "drill"
      
    • Launch point cloud alignment:

      gr.launchObjectAlignment(objectName)
      

      This launches a new window. Use shift + click to pick the same three points on the model and on the point cloud. After you do this, the affordance should appear in the main window, using the transform that was just computed.

      • If the results are inaccurate, you can rerun the above command, or you can double-click on each affordance and move it with an interactive marker: left-click to translate along an axis, right-click to rotate about an axis.
    • When you are done with an object's registration (or just wish to save intermediate poses), run:

      gr.saveRegistrationResults()
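
For intuition about step 1 above: the two clicked points define an approximate table normal, and the reconstruction is rotated so that this normal lines up with the world z-axis. Below is a minimal numpy sketch of that idea, for illustration only -- it is not the actual implementation of gr.rotateReconstructionToStandardOrientation:

  import numpy as np

  def rotation_aligning(a, b):
      """Return the rotation matrix taking unit(a) onto unit(b) (Rodrigues' formula)."""
      a = a / np.linalg.norm(a)
      b = b / np.linalg.norm(b)
      v, c = np.cross(a, b), np.dot(a, b)
      if np.isclose(c, -1.0):  # opposite vectors: rotate 180 degrees about an orthogonal axis
          axis = np.eye(3)[np.argmin(np.abs(a))]
          v = np.cross(a, axis)
          v /= np.linalg.norm(v)
          K = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
          return np.eye(3) + 2.0 * K @ K
      K = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
      return np.eye(3) + K + K @ K / (1.0 + c)

  p_table = np.array([0.5, 0.2, 0.10])  # clicked point on the table surface
  p_above = np.array([0.5, 0.2, 0.40])  # clicked point above the table
  R = rotation_aligning(p_above - p_table, np.array([0.0, 0.0, 1.0]))
  # Applying R to the reconstruction makes the table normal point along +z.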
      

After the alignment outputs have been saved, we can create the labeled data:

run_create_data

By default, only RGB images and labels will be saved. If you'd also like to save depth images, use the -d flag:

run_create_data -d
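
As a quick sanity check on the generated data, you can load a few of the label images and list the object ids present in each frame. A minimal sketch, assuming single-channel label images whose pixel values are object indices (the *_labels.png filename pattern is a guess; verify it against docs/data_organization.rst):

  import glob

  import numpy as np
  from PIL import Image

  # Print the object ids present in the first few generated label images.
  for path in sorted(glob.glob("images/*_labels.png"))[:5]:
      labels = np.array(Image.open(path))
      print(path, "object ids:", np.unique(labels))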

Citing LabelFusion

If you find LabelFusion useful in your work, please consider citing:

@inproceedings{marion2018label,
  title={Label Fusion: A Pipeline for Generating Ground Truth Labels for Real RGBD Data of Cluttered Scenes},
  author={Marion, Pat and Florence, Peter R and Manuelli, Lucas and Tedrake, Russ},
  booktitle={2018 IEEE International Conference on Robotics and Automation (ICRA)},
  pages={3235--3242},
  year={2018},
  organization={IEEE}
}
