Boulderdash

Chris Kinsey

USU CS5600 Project 2

Objective

The purpose of this project is to perform image segmentation on photos of indoor rock climbing walls. Indoor climbing walls can get very cluttered, and the eventual goal is to dynamically draw highlights over images of climbing walls so that all the rocks in a route can be seen at a glance. The first step toward that goal is to apply TensorFlow to highlight all rocks on a wall, regardless of which route they are part of.

Environment

Windows 10
Python 3.6.3
OpenCV 3.4.4
TensorFlow 1.2.1

Running the Application

Simply run main.py to train a new neural network on the provided images.

To run the unit tests, run unit_tests.py. These should both work out of the box if run in the above environment.

Data

I could not find any existing datasets of segmented rock wall images, so I had to collect them myself. All images were taken on a Samsung Galaxy S5 and split in half vertically. I segmented the images by hand in GIMP 2, tracing every rock in an image to create a mask. The segmentation maps are the same size as the images. Images are stored in ./img and segmentation maps are stored in ./seg_map.
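A minimal sketch of how images and segmentation maps might be paired before training. The naming rule here (an image and its map share the same file stem) is an assumption for illustration, and pair_images_with_masks is a hypothetical helper, not a function from this repository.

```python
from pathlib import PurePath


def pair_images_with_masks(image_names, mask_names):
    """Match each image to the mask with the same file stem.

    The shared-stem rule is an assumed convention, not one confirmed
    by the project. Returns (pairs, images_without_a_mask).
    """
    masks_by_stem = {PurePath(name).stem: name for name in mask_names}
    pairs, missing = [], []
    for name in sorted(image_names):
        stem = PurePath(name).stem
        if stem in masks_by_stem:
            pairs.append((name, masks_by_stem[stem]))
        else:
            missing.append(name)
    return pairs, missing
```

In practice the two name lists could come from os.listdir("./img") and os.listdir("./seg_map"). Reporting unmatched images up front avoids silently training on a shorter dataset than expected.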

Roadblocks

As I anticipated in my proposal, my hardware was not powerful enough to handle a fully convolutional network for large images. I downsized the images at import time by a factor of 10 in each dimension. This meant a lot of lost detail, but not so much that the images were unrecognizable.
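The downsizing step can be sketched as follows. This is an illustration using NumPy only, not the resize call used in main.py: the image is shrunk by averaging each 10x10 block, while the mask is subsampled so its values stay exact labels instead of blended grays. The function names are hypothetical.

```python
import numpy as np


def downsample_image(image, factor=10):
    """Shrink an image by averaging non-overlapping factor x factor blocks.

    Accepts (H, W) or (H, W, C) arrays and always returns (H', W', C),
    with C = 1 for grayscale input. Edge pixels that do not fill a whole
    block are cropped away.
    """
    h, w = image.shape[:2]
    h_crop, w_crop = h - h % factor, w - w % factor
    cropped = image[:h_crop, :w_crop].astype(np.float32)
    blocks = cropped.reshape(h_crop // factor, factor, w_crop // factor, factor, -1)
    return blocks.mean(axis=(1, 3))


def downsample_mask(mask, factor=10):
    """Shrink a label mask by keeping the top-left pixel of each block."""
    h, w = mask.shape[:2]
    h_crop, w_crop = h - h % factor, w - w % factor
    return mask[:h_crop:factor, :w_crop:factor]
```

Both functions crop to the same size, so a downsized image and its downsized mask stay aligned pixel for pixel.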

Fully convolutional networks are a fairly advanced topic. The most common way to build one is to take the weights from a pre-trained convolutional network, such as VGG Net, and replace the fully connected layers at the end with transpose convolutional layers and unpooling layers that upscale the representations back to image size.
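To make the "shrink, then upscale back to image size" idea concrete, here is a small sketch of the standard output-size arithmetic for convolution, pooling, and transpose convolution layers. The layer settings (3x3 convolutions with padding 1, 2x2 pooling, 2x2 transpose convolutions with stride 2) are assumptions chosen for illustration, not the configuration used in this project.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output length of a convolution or pooling layer along one dimension."""
    return (size + 2 * pad - kernel) // stride + 1


def transpose_conv_out(size, kernel, stride=1, pad=0, output_pad=0):
    """Output length of a transpose convolution along one dimension."""
    return (size - 1) * stride - 2 * pad + kernel + output_pad


def trace_sizes(size, n_down=2):
    """Follow one spatial dimension through n_down shrink and grow steps."""
    sizes = [size]
    for _ in range(n_down):
        size = conv_out(size, kernel=3, stride=1, pad=1)  # size-preserving conv
        size = conv_out(size, kernel=2, stride=2)         # 2x2 pooling halves it
        sizes.append(size)
    for _ in range(n_down):
        size = transpose_conv_out(size, kernel=2, stride=2)  # doubles it
        sizes.append(size)
    return sizes
```

With these settings trace_sizes(64) gives [64, 32, 16, 32, 64], so the output matches the input. An input of 65 comes back as 64, which is why input sizes are usually padded or cropped to a multiple of the total downscaling factor.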

Pre-trained convnets are highly specialized, and one that had not been trained on images similar to mine is unlikely to perform well. This meant that I had to train my networks from scratch, which is even more difficult. TFLearn does not seem to have the right tools for this project. Other people have been able to do this using the base TensorFlow API, and some have contributed parts of a segmentation net architecture to TFLearn, such as a 2D segmentation cross entropy loss function. Though some of these tools exist, they are almost entirely undocumented, and it is very possible that I am the first person to try to build an FCN for image segmentation in pure TFLearn. As an example, the aforementioned 2D segmentation loss function does not work in TFLearn out of the box; it has to have its own wrapper.
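For reference, the idea behind a 2D segmentation cross entropy loss can be sketched in plain NumPy: a softmax is applied over the class axis at every pixel, and the negative log probability of the true class is averaged over the image. This is an illustration of the general technique, not TFLearn's implementation or the wrapper used in this project, and the function name is hypothetical.

```python
import numpy as np


def pixelwise_cross_entropy(logits, labels):
    """Mean per-pixel softmax cross entropy.

    logits: (H, W, C) float array of raw class scores.
    labels: (H, W) integer array of class ids in [0, C).
    """
    # Subtract the per-pixel max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Pick the log probability of the true class at every pixel.
    rows, cols = np.indices(labels.shape)
    picked = log_probs[rows, cols, labels]
    return float(-picked.mean())
```

For a rock/background problem C would be 2. A network that outputs equal scores for both classes everywhere gets a loss of ln 2, which is a useful baseline when judging whether training is making progress.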

Ultimately, the generated segmentation maps came out as grids. The accuracy reported by TFLearn moved up from about 0.1 to about 0.2, but the output is nothing like what I expected. TFLearn runs without errors but produces invalid output. This makes me think that the functions I am using are not valid for the kind of operation I am trying to do.

Deliverables

Since I was limited by training time and development roadblocks, I did not manage to get all the deliverables I had planned on. I did include plenty of source images and ground truth segmentation maps. I have the trained network, the code to run it, and a sample output segmentation map. Since the output maps were invalid, I did not bother making overlaid images.

Prospective Future Architecture

A simple architecture would be repeated convolutional layers that first shrink the representation to a small size with many filters, process it further with 1x1 convolutions, and then undo each shrinking layer step by step. Since this will ultimately have to be done in plain TensorFlow, I think it would be worthwhile to use the inception layers that made GoogLeNet so successful. These use 1x1, 3x3, 5x5, and pooling filters at every single layer. The output of each operation is concatenated with the rest, and this is all treated as one layer. This allows many different kinds of features to be collected at every step.
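The inception idea described above can be sketched in NumPy as four parallel branches whose outputs are concatenated along the channel axis. This is a simplified illustration, not a planned implementation: it omits the activation functions and the 1x1 reduction convolutions that GoogLeNet places before the larger filters and after the pooling branch, and all function names are hypothetical.

```python
import numpy as np


def conv2d_same(x, weights):
    """Naive stride-1 convolution with zero padding that keeps H and W.

    x: (H, W, Cin); weights: (k, k, Cin, Cout) with odd k.
    """
    k = weights.shape[0]
    pad = k // 2
    h, w, _ = x.shape
    padded = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    out = np.zeros((h, w, weights.shape[3]))
    for dy in range(k):
        for dx in range(k):
            window = padded[dy:dy + h, dx:dx + w, :]
            out += window @ weights[dy, dx]  # (H, W, Cin) @ (Cin, Cout)
    return out


def max_pool_same(x, k=3):
    """Stride-1 k x k max pooling that keeps H and W."""
    pad = k // 2
    h, w, _ = x.shape
    padded = np.pad(x.astype(float), ((pad, pad), (pad, pad), (0, 0)),
                    constant_values=-np.inf)
    out = np.full(x.shape, -np.inf)
    for dy in range(k):
        for dx in range(k):
            out = np.maximum(out, padded[dy:dy + h, dx:dx + w, :])
    return out


def inception_block(x, w1, w3, w5):
    """Run 1x1, 3x3, 5x5 and pooling branches, then join them channel-wise."""
    branches = [
        conv2d_same(x, w1),
        conv2d_same(x, w3),
        conv2d_same(x, w5),
        max_pool_same(x),
    ]
    return np.concatenate(branches, axis=-1)
```

Because every branch preserves height and width, the output has the same spatial size as the input, and its channel count is the sum of the three filter counts plus the input channels carried through by the pooling branch.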
