footkol / how-diffusion-models-work

This project forked from kohinoor23/how-diffusion-models-work

Notes from How Diffusion Models Work by DeepLearning.ai

How-Diffusion-Models-Work

Contents

Intuition

Sampling

  • With Extra Noise

Training

Context Embedding

Faster Sampling


Notes

Taught by Sharon Zhou

Noted by Atul


  • Example used throughout the course: generating 16×16 sprites for video games.

Intuition

  • Goal: Given a lot of sprite images, generate even more sprite images


  • What does the network learn?

    • Fine details
    • General outline
    • Everything in between
  • Noising process (analogy: Bob the sprite as an ink drop that gradually diffuses)

  • Denoising Process (what should the NN think?)

    • If it's Bob the sprite, keep it as it is
    • If it's likely to be Bob, suggest more details to be filled in
    • If it's just an outline of a sprite, suggest general details for likely sprites (Bob, Fred, ...)
    • If it's nothing, suggest the outline of a sprite
  • Give the NN input noise whose pixels are sampled from a normal distribution, and get a completely new sprite!
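
The noising process above can be sketched in NumPy. This is a minimal sketch under assumptions that are not in these notes: the linear schedule, the values `beta1=1e-4`, `beta2=0.02`, `timesteps=500`, and the helper names `make_schedule` and `add_noise` are illustrative choices, and the mixing rule is the standard DDPM closed form.

```python
import numpy as np

def make_schedule(timesteps=500, beta1=1e-4, beta2=0.02):
    """Linear noise schedule (assumed values); arrays have length timesteps + 1."""
    beta = np.linspace(beta1, beta2, timesteps + 1)
    alpha = 1.0 - beta
    alpha_bar = np.cumprod(alpha)  # fraction of the original signal left at step t
    return beta, alpha, alpha_bar

def add_noise(x0, t, alpha_bar, rng):
    """Jump straight to noise level t: mix the clean image with Gaussian noise."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return x_t, noise

rng = np.random.default_rng(0)
beta, alpha, alpha_bar = make_schedule()
sprite = rng.uniform(-1.0, 1.0, size=(3, 16, 16))  # stand-in for a 16x16 RGB sprite
slightly_noisy, _ = add_noise(sprite, 10, alpha_bar, rng)
almost_pure_noise, _ = add_noise(sprite, 500, alpha_bar, rng)
```

At small `t` the sprite is still clearly visible; near the last timestep almost none of the original signal remains, which is the "ink drop fully diffused" end of the analogy.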

Sampling

  • Assume you have a trained NN
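  • At each denoising step, it predicts the noise in the current sample and subtracts it to get a better image
  • NOTE: After each denoising step, some random noise is added back. The NN expects a normally distributed noisy sample as input, and without this extra noise the samples collapse ("mode collapse")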
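
The sampling loop described above might look like the following sketch. It assumes the standard DDPM update rule and a linear schedule; `predict_noise` is a hypothetical stand-in for the trained NN, and the schedule values are illustrative rather than taken from these notes.

```python
import numpy as np

def ddpm_sample(predict_noise, shape, timesteps=500, beta1=1e-4, beta2=0.02, seed=0):
    """DDPM-style sampling; predict_noise(x, t) stands in for the trained network."""
    rng = np.random.default_rng(seed)
    beta = np.linspace(beta1, beta2, timesteps + 1)
    alpha = 1.0 - beta
    alpha_bar = np.cumprod(alpha)

    x = rng.standard_normal(shape)  # start from pure Gaussian noise
    for t in range(timesteps, 0, -1):
        eps = predict_noise(x, t)  # network's estimate of the noise in x
        # Subtract the (scaled) predicted noise to get a cleaner image.
        mean = (x - eps * beta[t] / np.sqrt(1.0 - alpha_bar[t])) / np.sqrt(alpha[t])
        # Add fresh noise back in, except on the final step.
        z = rng.standard_normal(shape) if t > 1 else np.zeros(shape)
        x = mean + np.sqrt(beta[t]) * z
    return x

# Usage with a dummy "network" that always predicts zero noise (illustration only).
sample = ddpm_sample(lambda x, t: np.zeros_like(x), shape=(3, 16, 16), timesteps=50)
```

Setting `z` to zeros on every step reproduces the "no extra noise" variant, which is a simple way to see why the extra noise matters once a real model is plugged in.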

Neural Network

  • UNet Architecture
    • Input and output of same size
    • First used for image segmentation


  • Takes a noisy image, embeds it into a smaller space by downsampling, then upsamples to predict the noise

  • Can take in more information in the form of embeddings

    • Time: tells the network the timestep, and therefore the noise level
    • Context: guides the generation process
  • Check out forward() in the sampling notebook
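
As a shape-level illustration only, the data flow can be mimicked in NumPy without any learned convolutions. Everything here is an assumption made for the sketch: `toy_forward`, `w_time` and `w_ctx` are hypothetical names, pooling and nearest-neighbour upsampling replace the real convolutional blocks, and scaling by the context embedding while adding the time embedding is just one possible way of injecting them. It is not the notebook's forward().

```python
import numpy as np

def downsample(x):
    """2x2 average pooling: (C, H, W) -> (C, H/2, W/2)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def upsample(x):
    """Nearest-neighbour upsampling: (C, H, W) -> (C, 2H, 2W)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def toy_forward(x, t_frac, context, w_time, w_ctx):
    """x: (C, H, W) noisy image; t_frac: timestep / total timesteps;
    context: one-hot vector; w_time: (C,) and w_ctx: (C, n_classes) toy weights."""
    skip = x                                   # skip connection, as in a UNet
    h = downsample(x)                          # compress into a smaller space
    temb = (w_time * t_frac)[:, None, None]    # time embedding, one value per channel
    cemb = (w_ctx @ context)[:, None, None]    # context embedding, one value per channel
    h = cemb * h + temb                        # inject both embeddings
    return upsample(h) + skip                  # same size as the input

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 16, 16))
context = np.array([1.0, 0.0, 0.0])
out = toy_forward(x, 0.5, context, rng.standard_normal(3), rng.standard_normal((3, 3)))
```

The point of the sketch is the shapes: the output has the same size as the input, and both embeddings enter at the compressed stage.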


Training

Learns the distribution of what is "not noise"

  • Randomly sample a training image, a timestep t, and noise
    • The timestep controls the level of noise
    • Randomising the timestep (rather than stepping through timesteps in order) makes training more stable
  • Add the noise to the image
  • Feed the noisy image into the NN, which predicts the noise
  • Compute the loss between the actual and predicted noise
  • Backprop and learn
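
The steps above can be sketched end to end in NumPy. To keep the example free of a deep learning framework, the "network" is a hypothetical one-weight linear model (`eps_hat = w * x_t`) with a hand-written gradient instead of a UNet with backprop. The schedule values, learning rate and function name are assumptions for illustration.

```python
import numpy as np

def train_toy_denoiser(images, timesteps=500, steps=2000, lr=0.05, seed=0):
    """Train a one-weight noise predictor with the diffusion training recipe."""
    rng = np.random.default_rng(seed)
    beta = np.linspace(1e-4, 0.02, timesteps + 1)
    alpha_bar = np.cumprod(1.0 - beta)
    w = 0.0  # the single learnable parameter of the toy model
    losses = []
    for _ in range(steps):
        x0 = images[rng.integers(len(images))]   # random training image
        t = rng.integers(1, timesteps + 1)       # random timestep
        noise = rng.standard_normal(x0.shape)    # random noise
        # Add noise to the image at level t.
        x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
        pred = w * x_t                           # toy "network" predicts the noise
        err = pred - noise
        losses.append(np.mean(err ** 2))         # MSE between predicted and actual noise
        grad = np.mean(2.0 * err * x_t)          # d(loss)/dw, derived by hand
        w -= lr * grad                           # gradient step
    return w, losses

rng = np.random.default_rng(1)
sprites = rng.uniform(-1.0, 1.0, size=(8, 3, 16, 16))  # stand-in training set
w, losses = train_toy_denoiser(sprites)
```

A real model replaces `w * x_t` with the UNet (also fed the timestep and context) and the hand-written gradient with automatic differentiation, but the sampling of image, timestep and noise and the noise-prediction loss keep the same structure.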


Control

  • Embeddings are vectors, for instance strings represented as vectors of numbers
  • Given as input to the NN along with the training image
  • Each embedding is associated with a training example and its properties
  • Uses: generate funky mixtures by combining embeddings
  • Context formats
    • Text
    • Categories, one-hot encoded (e.g. hero, non-hero, spells ...)
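
A small sketch of category contexts and mixtures, assuming only the three categories named above (the notes elide the rest, so the list here is deliberately incomplete). The helpers `one_hot` and `mix` are hypothetical names for illustration.

```python
import numpy as np

# Only the categories named in the notes; the full list is not given there.
CATEGORIES = ["hero", "non-hero", "spells"]

def one_hot(name, categories=CATEGORIES):
    """Return a vector with 1.0 at the category's index and 0.0 elsewhere."""
    vec = np.zeros(len(categories))
    vec[categories.index(name)] = 1.0
    return vec

def mix(weights, categories=CATEGORIES):
    """Blend several categories into one context vector, e.g. half hero, half spells."""
    ctx = np.zeros(len(categories))
    for name, weight in weights.items():
        ctx += weight * one_hot(name, categories)
    return ctx

hero_ctx = one_hot("hero")
funky_ctx = mix({"hero": 0.5, "spells": 0.5})
```

A mixed vector such as `funky_ctx` is what would be passed as the context during sampling to ask for a blend of categories.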


Fast Sampling: DDIM

  • DDPM is slow!
    • Many timesteps, and each step depends on the previous one (Markov chain)
  • DDIM skips timesteps and makes the sampling process deterministic
  • Sample quality is typically lower than DDPM's
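
A minimal sketch of DDIM-style sampling, assuming the deterministic update (no noise injected between steps) and the same illustrative linear schedule as a DDPM trained on 500 timesteps. `predict_noise` is again a hypothetical stand-in for the trained NN, and `n_steps` is an assumed parameter name.

```python
import numpy as np

def ddim_sample(predict_noise, shape, timesteps=500, n_steps=20,
                beta1=1e-4, beta2=0.02, seed=0):
    """Deterministic DDIM-style sampling that visits only a subset of the timesteps."""
    rng = np.random.default_rng(seed)
    beta = np.linspace(beta1, beta2, timesteps + 1)
    alpha_bar = np.cumprod(1.0 - beta)
    alpha_bar[0] = 1.0  # treat t = 0 as the clean image

    step = timesteps // n_steps
    x = rng.standard_normal(shape)  # the only randomness: the starting noise
    for t in range(timesteps, 0, -step):
        t_prev = max(t - step, 0)
        eps = predict_noise(x, t)
        # Estimate the clean image implied by the predicted noise...
        x0_pred = (x - np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alpha_bar[t])
        # ...then jump directly to the earlier timestep without adding new noise.
        x = np.sqrt(alpha_bar[t_prev]) * x0_pred + np.sqrt(1.0 - alpha_bar[t_prev]) * eps
    return x

# Usage with a dummy "network" (illustration only): 20 network calls instead of 500.
sample = ddim_sample(lambda x, t: np.zeros_like(x), shape=(3, 16, 16))
```

Because nothing random happens inside the loop, the same starting noise always yields the same output, and the cost scales with `n_steps` rather than with the number of training timesteps.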

Summary

Other applications: music, inpainting, textual inversion
