LabelFusion

LabelFusion is a pipeline for generating ground truth labels for real RGBD data of cluttered scenes (home page: http://labelfusion.csail.mit.edu). This readme documents how to create your own data with LabelFusion.

If you're looking to download the example LabelFusion dataset, go here: http://labelfusion.csail.mit.edu/#data

Setup

The recommended setup is via our Docker image.

If you'd prefer a native install instead, see "Setup Instructions".

Quick Pipeline Instructions

This is the quick version. If you'd prefer to go step-by-step manually, see Pipeline_Instructions.

Camera intrinsic calibration

For ElasticFusion calibration, create a camera.cfg file in your lcm-log folder. camera.cfg contains fx fy px py on a single line.
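
For example, using the Asus intrinsics shown in the next snippet (fx = fy = 617.0, principal point at 320.0, 240.0), camera.cfg would contain the single line:

617.0 617.0 320.0 240.0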

To render training images, edit the setCameraInstrinsicsAsus function in LabelFusion/modules/labelfusion/rendertrainingimages.py:

def setCameraInstrinsicsAsus(view):
        principalX = 320.0
        principalY = 240.0
        focalLength = 617.0 # fx = fy = focalLength
        setCameraIntrinsics(view, principalX, principalY, focalLength)
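
If you don't know your camera's intrinsics, one way to obtain them (assuming a ROS camera driver such as the RealSense setup below is publishing a camera_info topic) is to read the K matrix from that topic. This is only a convenience sketch, not part of LabelFusion; adjust the topic name for your camera:

# Sketch: print "fx fy px py" (the camera.cfg order) from one CameraInfo message.
# Assumes the RealSense driver's /camera/color/camera_info topic; not part of LabelFusion.
import rospy
from sensor_msgs.msg import CameraInfo

def callback(msg):
    # K is the row-major 3x3 intrinsic matrix: [fx, 0, px, 0, fy, py, 0, 0, 1]
    fx, fy, px, py = msg.K[0], msg.K[4], msg.K[2], msg.K[5]
    print('%.1f %.1f %.1f %.1f' % (fx, fy, px, py))
    rospy.signal_shutdown('got intrinsics')

rospy.init_node('print_camera_intrinsics')
rospy.Subscriber('/camera/color/camera_info', CameraInfo, callback)
rospy.spin()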

Collect raw data from Xtion

First, cdlf && cd data/logs, then make a new directory for your data. In one terminal, run:

openni2-camera-lcm

In another, run:

lcm-logger

Your data will be saved in the current directory as lcmlog-*.

Collect raw data from Realsense

First, install librealsense, the Intel RealSense ROS driver (realsense-ros), and rgbd_ros_to_lcm.

Second, cdlf && cd data/logs, then make a new directory for your data. In one terminal, run:

roscore

In another terminal, run:

roslaunch realsense2_camera rs_rgbd.launch

Next, modify the rgbd_ros_to_lcm topics: edit ~/catkin_ws/src/rgbd_ros_to_lcm/launch/lcm_republisher.launch so that it reads:

<?xml version="1.0"?>
<launch>
  <node name="lcm_republisher" pkg="rgbd_ros_to_lcm" type="lcm_republisher" output="screen" respawn="false" >
    <rosparam subst_value="true">
      # input parameters
      subscribe_point_cloud: false
      rgb_topic: /camera/color/image_raw
      depth_topic: /camera/aligned_depth_to_color/image_raw
      cloud_topic: /camera/depth_registered/points

      # output parameters
      output_lcm_channel: "OPENNI_FRAME"
      compress_rgb: true
      compress_depth: true

      debug_print_statements: true
    </rosparam>
  </node>
</launch>

and run:

roslaunch rgbd_ros_to_lcm lcm_republisher.launch

In another, run:

lcm-logger

Your data will be saved in the current directory as lcmlog-*.

Process into labeled training data

First we will launch a log player with a slider, and a viewer. The terminal will prompt for a start and end time to trim the log, then save the outputs:

run_trim

Next, we prepare for object pose fitting, by running ElasticFusion and formatting the output:

run_prep

Next, launch the object alignment tool and follow the four steps:

run_alignment_tool
  1. Check available object types:

    • In your data directory, open object_data.yaml, review the available objects, and add any objects / meshes that you need (an example sketch follows these steps).

      • If you need multiple instances of the same object, you will need to create separate copies of the object with unique names (e.g. drill-1, drill-2, ...). For networks that do object detection, ensure that you remove this distinction from your labels / classes.
  2. Align the reconstructed point cloud:

    • Open the measurement panel (View -> Measurement Panel), then check Enabled in the measurement panel

    • Use shift + click and click two points: first on the surface of the table, then on a point above the table

    • Open Director terminal with F8 and run:

      gr.rotateReconstructionToStandardOrientation()
      
    • Close the run_alignment_tool application (ctrl + c) and rerun.

  3. Segment the point cloud above the table:

    • Same as above, use shift + click and click two points: first on the surface of the table, then on a point above the table.

    • Open the Director terminal with F8 and run:

      gr.segmentTable()
      gr.saveAboveTablePolyData()
    • Close the run_alignment_tool application (ctrl + c) and rerun.
  4. Align each object and crop point clouds:

    • Assign the current object you're aligning, e.g.:

      objectName = "drill"
      
    • Launch point cloud alignment:

      gr.launchObjectAlignment(objectName)
      

      This launches a new window. Click the same three points on the model and on the point cloud, using shift + click. After you do this, the affordance should appear in the main window using the transform that was just computed.

      • If the results are inaccurate, you can rerun the above command, or you can double-click on each affordance and move it with an interactive marker: left-click to translate along an axis, right-click to rotate along an axis.
    • When you are done with an object's registration (or just wish to save intermediate poses), run:

      gr.saveRegistrationResults()
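
As mentioned in step 1, object_data.yaml maps each object name to its mesh. The exact schema depends on your copy of the repo, so the entry below is only a hypothetical sketch of what adding an object (and a second instance such as drill-2) might look like:

# Hypothetical object_data.yaml entries -- check the existing file for the exact schema and mesh paths.
drill:
  mesh: object-meshes/drill_mesh.vtp
drill-2:
  mesh: object-meshes/drill_mesh.vtp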
      

After the alignment outputs have been saved, we can create the labeled data:

run_create_data

By default, only RGB images and labels will be saved. If you'd also like to save depth images, use the -d flag:

run_create_data -d

Train SegNet on labeled data

Navigate to /SegNet/MovingCamera/

Copy all the data you want to use (created by run_create_data from different datasets) into ./train

Use a different subdirectory inside /train/ for each log, e.g.:

/train/log-1
/train/log-2

Then resize all of the training images to a better size for training:

python resize_all_images.py

Finally, create the description of image-label pairs needed as SegNet input:

python create_traiing_set_list.py
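
The list this script produces is what SegNet's dense image-data layer reads. In the standard SegNet setup, each line pairs an RGB image path with its label image path, separated by a space; the file names below are hypothetical:

/SegNet/MovingCamera/train/log-1/0000001_rgb.png /SegNet/MovingCamera/train/log-1/0000001_labels.png
/SegNet/MovingCamera/train/log-1/0000002_rgb.png /SegNet/MovingCamera/train/log-1/0000002_labels.png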

To train SegNet:

cd /
./SegNet/caffe-segnet/build/tools/caffe train -gpu 0 -solver /SegNet/Models/moving_camera_solver.prototxt
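
If you need to adjust the training schedule, edit the solver prototxt passed with -solver. The fields below are only an illustration of a typical SegNet-style Caffe solver, not the contents of the moving_camera_solver.prototxt that ships with the repo; the net and snapshot paths are hypothetical:

# Illustrative Caffe solver fields only -- not the shipped moving_camera_solver.prototxt.
net: "/SegNet/Models/moving_camera_train.prototxt"
base_lr: 0.001
lr_policy: "step"
gamma: 1.0
stepsize: 10000000
momentum: 0.9
weight_decay: 0.0005
max_iter: 40000
display: 20
snapshot: 1000
snapshot_prefix: "/SegNet/Models/Training/moving_camera"
solver_mode: GPU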
