pondjames007 / lostintranslation

NYU ITP 2019 Thesis. An interactive experience that shows how a machine interprets one thing differently from a human.

Python 37.85% JavaScript 31.13% HTML 31.01%

Lost in Translation

NYU ITP 2019 Thesis
An interactive experience that shows how a machine interprets one thing differently from a human.

Presentation Video in ITP Thesis Week 2019

Introduction

The project is a recursive process in which a human and a machine interpret each other's results. In each round, the human writes a sentence describing an image generated by the machine, and the machine then runs several machine learning translations: from the human's description to a sketch, and from the sketch to a new image.

Inspiration

Telephone Game

An example of multiple translations
Drawception - Picture Telephone Drawing Game

Closed Loop

A project that uses machine learning to run a feedback loop between images and text.
Jake Elwes - Closed Loop

Implementation

Python server with Flask
JavaScript client
Generate a sentence from an image with im2txt
Tag words and extract nouns with spaCy
Compute word vector similarity with spaCy
Draw doodles with SketchRNN
Generate new images with AttnGAN
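
The components listed above can be chained into one round of the loop. The following is a minimal sketch, not the code in app.py: the function name `run_round`, the callable parameters, and the exact ordering of the steps (in particular, captioning the doodle with im2txt before passing the caption to AttnGAN) are assumptions made for illustration. Each model is passed in as a plain callable so the flow can be read without any model installed.

```python
from typing import Callable, Dict, List


def run_round(
    description: str,
    extract_nouns: Callable[[str], List[str]],
    nearest_category: Callable[[str], str],
    draw_doodle: Callable[[List[str]], str],
    caption_image: Callable[[str], str],
    generate_image: Callable[[str], str],
) -> Dict[str, object]:
    """Run one hypothetical human-to-machine round and return every intermediate result."""
    # 1. Keep only the nouns of the human's sentence (spaCy in the project).
    nouns = extract_nouns(description)
    # 2. Map each noun to the closest sketch category (word vector similarity).
    categories = [nearest_category(noun) for noun in nouns]
    # 3. Draw a doodle for those categories (SketchRNN).
    sketch = draw_doodle(categories)
    # 4. Describe the doodle with a sentence (im2txt). Assumed step order.
    caption = caption_image(sketch)
    # 5. Turn that sentence into a new image (AttnGAN).
    image = generate_image(caption)
    return {
        "nouns": nouns,
        "categories": categories,
        "sketch": sketch,
        "caption": caption,
        "image": image,
    }


# Usage with stand-in functions instead of real models.
result = run_round(
    "a kitten on a bike",
    extract_nouns=lambda s: [w for w in s.split() if w in {"kitten", "bike"}],
    nearest_category=lambda n: {"kitten": "cat", "bike": "bicycle"}[n],
    draw_doodle=lambda cats: "doodle(" + "+".join(cats) + ")",
    caption_image=lambda sketch: "a drawing of " + sketch,
    generate_image=lambda caption: "image<" + caption + ">",
)
print(result["image"])
```

The image returned at the end of a round is what the human would describe in the next round.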

app.py

Server code.
Coordinates and processes most of the data.
Uses HTTP connections to communicate with Runway and the client.
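
As an illustration of the HTTP communication with Runway, here is a minimal sketch that uses only the Python standard library. It is not taken from app.py. The URL `http://localhost:8000/query` and the input key `caption` are assumptions; the actual address and field names depend on how each model is configured in Runway.

```python
import json
import urllib.request


def build_runway_request(url: str, inputs: dict) -> urllib.request.Request:
    """Build a JSON POST request for a model served over HTTP (URL is an assumption)."""
    body = json.dumps(inputs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def query_runway(url: str, inputs: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON reply. Needs a model listening at url."""
    with urllib.request.urlopen(build_runway_request(url, inputs), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not touch the network, so it can be inspected offline.
request = build_runway_request("http://localhost:8000/query", {"caption": "a red bird"})
print(request.get_method(), request.full_url)
```

Separating request construction from sending keeps the payload easy to check before a model is running.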

static/client.js

Client code.
Presents the results and collects user input.

categories.json

A JSON file that stores all sketch categories.
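
To show how a category list and word vector similarity could work together, here is a minimal sketch. It assumes that categories.json is a flat JSON array of category names, which the repository does not confirm, and it uses small hand-made vectors with a plain cosine similarity in place of spaCy's word vectors. The names `cosine` and `nearest_category` are hypothetical.

```python
import json
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_category(word_vec: Sequence[float], category_vecs: Dict[str, Sequence[float]]) -> str:
    """Return the category whose vector is most similar to word_vec."""
    return max(category_vecs, key=lambda name: cosine(word_vec, category_vecs[name]))


# Assumed file layout: a JSON array of category names.
categories: List[str] = json.loads('["cat", "bicycle", "tree"]')

# Toy 3-dimensional vectors standing in for real word embeddings.
toy_vectors = {
    "cat": [1.0, 0.1, 0.0],
    "bicycle": [0.0, 1.0, 0.2],
    "tree": [0.1, 0.0, 1.0],
}
category_vecs = {name: toy_vectors[name] for name in categories}

kitten = [0.9, 0.2, 0.1]  # toy vector for a noun that is not itself a category
print(nearest_category(kitten, category_vecs))
```

With real embeddings, a noun from the human's sentence that has no matching doodle category is mapped to the closest available one in the same way.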

draw_strokes.py

Functions to draw a sketch.
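
For context, Magenta's SketchRNN describes a doodle in the stroke-3 format: each row is `(dx, dy, pen_lifted)`, where the offsets are relative to the previous point and `pen_lifted` is 1 when the pen leaves the paper after that point. The sketch below is not the code in draw_strokes.py. It assumes that format and shows how such rows could be converted into absolute polylines ready for drawing; the function name `strokes_to_polylines` is hypothetical.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def strokes_to_polylines(strokes: Iterable[Tuple[float, float, int]]) -> List[List[Point]]:
    """Convert stroke-3 rows (dx, dy, pen_lifted) into lists of absolute points."""
    x = y = 0.0
    polylines: List[List[Point]] = []
    pen_up = True  # the first offset moves the pen without drawing
    for dx, dy, lifted in strokes:
        x += dx
        y += dy
        if pen_up:
            # The pen was in the air, so this point starts a new line.
            polylines.append([(x, y)])
        else:
            # The pen was down, so this point extends the current line.
            polylines[-1].append((x, y))
        pen_up = bool(lifted)
    return polylines


# Two separate lines: an L shape, then a short horizontal segment.
doodle = [(0, 0, 0), (10, 0, 0), (0, 10, 1), (5, 5, 0), (5, 0, 1)]
for line in strokes_to_polylines(doodle):
    print(line)
```

Each resulting polyline can then be handed to any drawing backend, for example as the points of an SVG path.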

drawSketch.py

A test function to draw a sketch.

im2txt

A machine learning model that can generate a sentence based on an image.
The model originates from models/research/im2txt. A pre-trained model is provided in Runway.

SketchRNN

A machine learning model that can generate doodles in specific categories.
The doodle data comes from Quick, Draw! The Data, and the model details come from Magenta - SketchRNN.
The model is downloaded from Google Cloud Platform.

AttnGAN

A machine learning model that can generate an image from a sentence.
The model is from GitHub - taoxugit/AttnGAN.
A pre-trained model is provided in Runway.
