
Texture De-lighting

Project Managers: Zhongxia Yan, Michael Zhang

Team Members: Quinn Tran, Gefen Kohavi, Murtaza Dalal, Tracy Lou, Varsha Ramakrishnan

Machine Learning at Berkeley partnered with Unity Technologies to apply ML methods to de-lighting surface textures.

Background

Realistic in-game objects are often captured from the real world through a process called photogrammetry: images of an object (e.g. a rock) are taken from all angles, and the object is then reconstructed from those images with 3D reconstruction techniques. De-lighting is necessary to remove the effects of non-uniform real-world lighting and shadow baked into the captured object, so that the object can be re-lit by the lighting within the game environment. Currently, de-lighting is done manually by artists.

Data

We aimed to build models that operate on surface texture maps (i.e. the unwrapped surface of a 3D object) instead of on meshes directly. Our model takes in a lit texture map and seeks to generate a de-lit texture map. Unity Technologies already has a number of de-lit texture maps (produced by artists), which serve as our desired outputs. To generate the lit inputs to our models, Unity Technologies placed the de-lit meshes under various lighting conditions and rendered the lit texture maps. Our dataset consists entirely of rock textures.

Below: left is the texture de-lit by an artist (our ground truth); right is the lit texture.

Model

Our core model consists of a 4-layer encoder followed by a 4-layer decoder, with residual connections between corresponding layers. The model is fully convolutional: during training it takes in randomly cropped, rotated, and flipped 32x32 patches (which speeds up training dramatically and produces better output), while at test time it takes in the entire texture map.
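The README doesn't name the framework or exact layer configuration, so here is a minimal PyTorch sketch of such a fully convolutional encoder-decoder with skip connections; the channel widths, kernel sizes, and activations are assumptions, not the project's exact values.

```python
import torch
import torch.nn as nn

class DelightingNet(nn.Module):
    """Fully convolutional 4-layer encoder / 4-layer decoder with skip connections."""

    def __init__(self, channels=(4, 32, 64, 128, 256)):  # RGBA in/out; widths are guesses
        super().__init__()
        # Encoder: each stride-2 convolution halves the spatial resolution.
        self.enc = nn.ModuleList(
            nn.Conv2d(channels[i], channels[i + 1], 3, stride=2, padding=1)
            for i in range(4))
        # Decoder: transposed convolutions mirror the encoder and restore resolution.
        self.dec = nn.ModuleList(
            nn.ConvTranspose2d(channels[i + 1], channels[i], 4, stride=2, padding=1)
            for i in reversed(range(4)))

    def forward(self, x):
        # Spatial dimensions should be divisible by 16 (four stride-2 layers).
        feats = []
        for conv in self.enc:
            x = torch.relu(conv(x))
            feats.append(x)
        feats.pop()  # the bottleneck feeds the decoder directly; it is not a skip
        for deconv in self.dec:
            x = deconv(x)
            if feats:  # all but the final layer: add the encoder feature at this depth
                x = torch.relu(x) + feats.pop()
        return x  # predicted de-lit texture map

# Because the network is fully convolutional, the same weights run on 32x32
# training patches and on full texture maps at test time.
```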

Loss Function

We experimented with several loss functions on top of our core model.

L2

This is the pixelwise L2 loss between the predicted texture map and the desired texture map.
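As a concrete sketch, in the same assumed PyTorch style as above:

```python
import torch

def l2_loss(pred, target):
    # Mean squared error over every pixel and channel.
    return ((pred - target) ** 2).mean()
```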

Gradient Difference

Let I(0, 0) be an image, let I(1, 0) be the image shifted one pixel to the right, and let I(0, 1) be the image shifted one pixel up. We call I(0, 0) - I(1, 0) the horizontal gradient and I(0, 0) - I(0, 1) the vertical gradient. We compute the horizontal and vertical gradients of both the model output and the desired output, then take the L2 loss between the two sets of gradients as our gradient difference loss.

This particular loss penalizes differences in the relative change from pixel to pixel, which allows the output to keep more of the fine details from the input.
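A sketch of this loss under the same PyTorch assumptions as above; the shifted-image differences reduce to differences between adjacent pixels:

```python
import torch

def gradient_difference_loss(pred, target):
    def grads(img):
        # Horizontal gradient I(0, 0) - I(1, 0): horizontally adjacent pixels.
        gh = img[..., :, 1:] - img[..., :, :-1]
        # Vertical gradient I(0, 0) - I(0, 1): vertically adjacent pixels.
        gv = img[..., 1:, :] - img[..., :-1, :]
        return gh, gv

    pred_h, pred_v = grads(pred)
    tgt_h, tgt_v = grads(target)
    # L2 loss between the gradients of the model output and the desired output.
    return ((pred_h - tgt_h) ** 2).mean() + ((pred_v - tgt_v) ** 2).mean()
```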

Adversarial

We train a convolutional discriminator to predict whether a texture map is a de-lit ground truth or an output of our generator.
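The README doesn't give the discriminator's architecture or the exact GAN objective; a minimal sketch, assuming a small convolutional discriminator trained with the standard binary cross-entropy GAN loss:

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Small convolutional discriminator over RGBA texture patches."""

    def __init__(self, in_channels=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=2, padding=1))  # real/fake logits per patch

    def forward(self, x):
        return self.net(x)

bce = nn.BCEWithLogitsLoss()

def discriminator_loss(disc, real, fake):
    # Artist de-lit maps are labeled real; generator outputs are labeled fake.
    real_logits = disc(real)
    fake_logits = disc(fake.detach())  # detach: don't backprop into the generator here
    return (bce(real_logits, torch.ones_like(real_logits))
            + bce(fake_logits, torch.zeros_like(fake_logits)))

def generator_adversarial_loss(disc, fake):
    # The generator is rewarded when its output is classified as ground truth.
    fake_logits = disc(fake)
    return bce(fake_logits, torch.ones_like(fake_logits))
```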

Alpha Mask

Since our inputs have regions where alpha = 0 (see the examples above), we tried applying a mask so that only the alpha > 0 regions are taken into account when computing the losses above.
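A sketch of the masked variant of the L2 loss, assuming RGBA tensors of shape (batch, 4, height, width) with alpha as the last channel:

```python
import torch

def masked_l2_loss(pred, target):
    mask = (target[:, 3:4] > 0).float()     # 1 where alpha > 0, else 0
    sq_err = ((pred - target) ** 2) * mask  # transparent pixels contribute nothing
    # Average over the unmasked pixels only (across all 4 channels).
    return sq_err.sum() / (mask.sum() * pred.shape[1] + 1e-8)
```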

Results

We show results on the test set (the model never saw these ground truths during training). Each result series shows, from left to right, the lit texture (model input), the model output, and the ground truth. Each heading names the loss configuration we used.

L2 + Gradient Difference + Adversarial + Scaled Input + Alpha Mask (Best)

This combines the three losses in the title, and we directly add a scaled copy of the input (with a trainable factor) to the output. We hoped this would transfer the fine details directly to the output, but it transferred too much of the lighting as well. In addition, we use a mask to ignore loss contributions from regions where alpha = 0.
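A sketch of the scaled-input connection, wrapping any base network; the wrapper class and the initial value of the trainable factor are illustrative assumptions:

```python
import torch
import torch.nn as nn

class ScaledInput(nn.Module):
    """Adds a scaled copy of the lit input to the network output."""

    def __init__(self, net):
        super().__init__()
        self.net = net
        self.scale = nn.Parameter(torch.tensor(0.1))  # trainable factor; init is a guess

    def forward(self, x):
        return self.net(x) + self.scale * x

# Usage: model = ScaledInput(DelightingNet())
```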

The lighting and shadow effects are removed, and the resolution is better than in all the other models we tried. Some mid-level details (e.g. dark regions ~3-5 pixels in radius) are lost.

L2 with Full Image Input (instead of 32x32 patches)

The output is much blurrier than our best model's, training took significantly longer, not all lighting effects are removed, and the color of the red rim is off.

L2 (32x32 Patches)

Output resolution is better and the color is closer (especially around the rim), but still not all lighting effects are removed and the color of the red rim remains slightly off.

L2 + Gradient Difference

The gradient difference loss greatly improves how much fine detail is kept.

L2 + Gradient Difference + Adversarial

The adversarial loss didn't seem to help much beyond L2 + gradient difference, but we didn't have time for more tuning.


The adversarial component of the generator loss never plateaued, so we should probably spend more time tuning hyperparameters.

L2 + Gradient Difference + Adversarial + Scaled Input

This has the same losses as above, but we directly add a scaled copy of the input (with a trainable factor) to the output. We hoped this would transfer the fine details directly to the output, but it transferred too much of the lighting as well.
