The comp341-a1 from kanelindsay

Add Classification to Regression Model

Add extra output neurons that predict object classes.
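One possible reading of this issue is a single output vector that is split into a grasp part and a class part, with the two losses summed. The following is a minimal NumPy sketch under stated assumptions: six grasp values per prediction, a hypothetical `NUM_CLASSES`, and illustrative function names and loss weighting that do not come from the repository.

```python
import numpy as np

NUM_GRASP = 6      # grasp regression outputs (assumed)
NUM_CLASSES = 4    # hypothetical number of object classes


def split_outputs(raw):
    """Split a raw output vector into grasp values and class logits."""
    raw = np.asarray(raw, dtype=float)
    return raw[:NUM_GRASP], raw[NUM_GRASP:NUM_GRASP + NUM_CLASSES]


def softmax(logits):
    """Numerically stable softmax over class logits."""
    exp = np.exp(logits - np.max(logits))
    return exp / exp.sum()


def combined_loss(raw, grasp_target, class_index, class_weight=1.0):
    """Mean squared error on the grasp plus weighted cross-entropy on the class."""
    grasp, logits = split_outputs(raw)
    mse = np.mean((grasp - np.asarray(grasp_target, dtype=float)) ** 2)
    cross_entropy = -np.log(softmax(logits)[class_index])
    return float(mse + class_weight * cross_entropy)
```

The `class_weight` parameter is there because the two loss terms can differ a lot in scale; its value would need tuning.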

Compile Some External Data for Testing

We need a couple of examples of unseen data from outside Anh's dataset for the assignment. Please find new RGB-D images of classes included in the network.

Re-Do the Maths for Layer Sizes

Re-check the layer-size calculations and tidy them up so every layer's dimensions are correct.
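The layer sizes can be checked with the standard output-size formula for convolution and pooling layers, floor((size + 2 * padding - kernel) / stride) + 1. The sketch below applies it layer by layer; the example stack and the 227-pixel input are hypothetical and only illustrate the arithmetic, they are not the network in this repository.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_sizes(input_size, layers):
    """Return the spatial size after each (kernel, stride, padding) layer."""
    sizes = [input_size]
    for kernel, stride, padding in layers:
        sizes.append(conv_output_size(sizes[-1], kernel, stride, padding))
    return sizes


# Hypothetical stack, for illustration only:
# 11x11 conv stride 4, 3x3 pool stride 2, 5x5 conv padding 2, 3x3 pool stride 2.
example_layers = [(11, 4, 0), (3, 2, 0), (5, 1, 2), (3, 2, 0)]
print(trace_sizes(227, example_layers))  # [227, 55, 27, 27, 13]
```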

Rectangle Metric

Implement a function using the 'Rectangle Metric' for evaluation.
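As commonly defined for the Cornell grasp dataset, the rectangle metric counts a predicted grasp as correct when its angle is within 30 degrees of a ground-truth grasp and the Jaccard index (intersection over union) of the two rectangles is greater than 25 percent. The sketch below is one pure-Python way to compute it. It assumes each rectangle is given as four (x, y) corners in order and that the first edge defines the grasp orientation; the function names are illustrative.

```python
import math


def _edges(points):
    """Pair each vertex with the next one, wrapping around at the end."""
    points = list(points)
    return zip(points, points[1:] + points[:1])


def _signed_area(points):
    """Shoelace formula; positive for counter-clockwise vertex order."""
    return sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in _edges(points)) / 2.0


def _ccw(points):
    """Return the vertices in counter-clockwise order."""
    points = [tuple(p) for p in points]
    return points if _signed_area(points) >= 0 else points[::-1]


def _side(a, b, p):
    """Positive when p lies to the left of the directed edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def _intersection(subject, clip):
    """Sutherland-Hodgman clipping of a convex polygon by a convex CCW polygon."""
    output = list(subject)
    for a, b in _edges(clip):
        current, output = output, []
        for p, q in _edges(current):
            sp, sq = _side(a, b, p), _side(a, b, q)
            if sp >= 0:
                output.append(p)
            if sp * sq < 0:  # the edge p -> q crosses the clip line
                t = sp / (sp - sq)
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output


def jaccard(rect_a, rect_b):
    """Intersection over union of two convex quadrilaterals."""
    a, b = _ccw(rect_a), _ccw(rect_b)
    inter = _intersection(a, b)
    inter_area = abs(_signed_area(inter)) if len(inter) >= 3 else 0.0
    union = abs(_signed_area(a)) + abs(_signed_area(b)) - inter_area
    return inter_area / union if union > 0 else 0.0


def grasp_angle(rect):
    """Orientation of the first edge in degrees (assumed grasp axis)."""
    (x1, y1), (x2, y2) = rect[0], rect[1]
    return math.degrees(math.atan2(y2 - y1, x2 - x1))


def angle_difference(a, b):
    """Smallest difference between two orientations, treating 180 degrees as a full turn."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


def rectangle_metric(pred, truth, max_angle=30.0, min_iou=0.25):
    """True when the prediction passes both the angle and the overlap test."""
    if angle_difference(grasp_angle(pred), grasp_angle(truth)) > max_angle:
        return False
    return jaccard(pred, truth) > min_iou
```

When an image has several ground-truth grasps, a prediction would typically be scored against each of them and counted as correct if any one passes.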

Include Depth Map as Input

Include the depth map somewhere in the model as an input.
Options: include it as an extra channel of the image at the start of the network, or add it at a later layer.
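The first option, depth as an extra input channel, could look roughly like the sketch below. It assumes an 8-bit RGB image of shape HxWx3 and a depth map of shape HxW with no missing values; the function name `make_rgbd` and the min-max scaling are illustrative choices, not something specified in this issue.

```python
import numpy as np


def make_rgbd(rgb, depth):
    """Stack an HxWx3 RGB image and an HxW depth map into one HxWx4 array."""
    rgb = np.asarray(rgb, dtype=np.float32) / 255.0  # assumes 8-bit colour
    depth = np.asarray(depth, dtype=np.float32)
    if rgb.shape[:2] != depth.shape:
        raise ValueError("RGB image and depth map must share height and width")
    # Scale depth to [0, 1] so it sits on a range comparable to the colour channels.
    span = depth.max() - depth.min()
    depth = (depth - depth.min()) / span if span > 0 else np.zeros_like(depth)
    return np.concatenate([rgb, depth[..., None]], axis=-1)
```

With this approach the first convolution layer has to accept four input channels instead of three, which matters if its weights come from a network pre-trained on RGB images.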

Grasp Visualisation

Need a matplotlib visualiser to draw grasp(s) over sample images.
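A minimal sketch of such a visualiser follows. It assumes each grasp is given as (x, y, theta, width, height) with theta in radians, which is a hypothetical parameterisation; the function names are also illustrative. It uses only `imshow` and `plot` from matplotlib.

```python
import math


def grasp_corners(x, y, theta, width, height):
    """Corners of a grasp rectangle centred at (x, y), rotated by theta radians."""
    dx, dy = math.cos(theta), math.sin(theta)
    half_w, half_h = width / 2.0, height / 2.0
    corners = []
    for sw, sh in ((-1, -1), (1, -1), (1, 1), (-1, 1)):
        corners.append((x + sw * half_w * dx - sh * half_h * dy,
                        y + sw * half_w * dy + sh * half_h * dx))
    return corners


def draw_grasps(image, grasps, ax=None):
    """Draw each (x, y, theta, width, height) grasp as a closed outline over the image."""
    # Imported here so grasp_corners stays usable without matplotlib installed.
    import matplotlib.pyplot as plt

    if ax is None:
        _, ax = plt.subplots()
    ax.imshow(image)
    for grasp in grasps:
        corners = grasp_corners(*grasp)
        xs, ys = zip(*(corners + corners[:1]))  # repeat the first corner to close the box
        ax.plot(xs, ys, linewidth=2)
    ax.set_axis_off()
    return ax
```

Calling `draw_grasps(image, [(120, 80, 0.3, 60, 20)])` and then `plt.show()` would overlay one rectangle on the image; passing several grasps draws each in the next colour of the default cycle.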

Multi-Grasp Detection

Improve the network to suggest more than one grasping location, and use more than one grasp as ground truth.

Resource:

Multi-Grasp Detection

The preceding models assume that there is only a single correct grasp per image and try to predict that grasp. MultiGrasp divides the image into an NxN grid and assumes that there is at most one grasp per grid cell. It predicts one grasp per cell and also the likelihood that the predicted grasp would be feasible on the object. For a cell to predict a grasp, the center of that grasp must fall within the cell.

The output of this model is an NxNx7 prediction. The first channel is a heatmap of how likely a region is to contain a correct grasp. The other six channels contain the predicted grasp coordinates for that region. For experiments on the Cornell dataset we used a 7x7 grid, making the actual output layer 7x7x7 or 343 neurons. Our first model can be seen as a specific case of this model with a grid size of 1x1 where the probability of the grasp existing in the single cell is implicitly one.
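To make the output layout concrete, the sketch below picks the most confident cell from an NxNx7 prediction. It assumes the prediction is stored channel-last with the heatmap at index 0 and the six grasp values at indices 1 to 6; the function name `best_grasp` is illustrative.

```python
import numpy as np


def best_grasp(prediction):
    """Return (row, col, confidence, coords) for the most confident grid cell."""
    prediction = np.asarray(prediction)
    heatmap = prediction[..., 0]
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return int(row), int(col), float(heatmap[row, col]), prediction[row, col, 1:]


# A 7x7 grid with 7 values per cell gives 7 * 7 * 7 = 343 outputs.
example = np.zeros((7, 7, 7))
example[2, 5, 0] = 0.9            # heatmap value for cell (2, 5)
example[2, 5, 1:] = np.arange(6)  # placeholder grasp values
print(example.size, best_grasp(example)[:3])
```

To suggest several grasps instead of one, the same heatmap could be sorted or thresholded rather than reduced with a single argmax.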

Training MultiGrasp requires some special considerations.
Every time MultiGrasp sees an image it randomly picks up to five grasps to treat as ground truth. It constructs a heatmap with up to five cells marked with ones and the rest filled with zeros. It also calculates which cells those grasps fall into and fills in the appropriate columns of the ground truth with the grasp coordinates. During training we do not backpropagate error for the entire 7x7x7 grid because many of the column entries are blank (if there is no grasp in that cell). Instead we backpropagate error for the entire heatmap channel and also for the specific cells that contain ground truth grasps.
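The training procedure described above can be sketched as follows. The grid size of 7 and the limit of five grasps come from the text; everything else is an assumption made for illustration: each grasp is six values whose first two are the centre (x, y) in pixels, the image is square, the loss is a plain sum of squared errors, and the names `build_target` and `masked_loss` are hypothetical.

```python
import random

import numpy as np

GRID = 7        # grid size used for the Cornell experiments
MAX_GRASPS = 5  # grasps sampled per image


def build_target(grasps, image_size, rng=random):
    """Build the GRID x GRID x 7 target and the mask of cells that hold a grasp."""
    target = np.zeros((GRID, GRID, 7), dtype=np.float32)
    mask = np.zeros((GRID, GRID), dtype=bool)
    chosen = rng.sample(list(grasps), min(MAX_GRASPS, len(grasps)))
    cell = image_size / GRID
    for grasp in chosen:
        # The cell is the one containing the grasp centre (assumed to be grasp[0:2]).
        col = min(int(grasp[0] // cell), GRID - 1)
        row = min(int(grasp[1] // cell), GRID - 1)
        target[row, col, 0] = 1.0    # heatmap channel
        target[row, col, 1:] = grasp  # a later grasp in the same cell overwrites an earlier one
        mask[row, col] = True
    return target, mask


def masked_loss(prediction, target, mask):
    """Squared error on the whole heatmap plus coordinate error in grasp cells only."""
    heat = np.sum((prediction[..., 0] - target[..., 0]) ** 2)
    coords = np.sum(((prediction[..., 1:] - target[..., 1:]) ** 2)[mask])
    return float(heat + coords)
```

Because the coordinate error is summed only over masked cells, whatever the network outputs for the coordinates of empty cells contributes nothing to the loss, and therefore nothing to the gradient.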

Pre-Train Images

Edit code to pre-train the network to classify images well. Then, train it on boxes.
