artem-gorodetskii / wikiart-latent-diffusion

Conditional denoising diffusion probabilistic model trained in latent space.



WikiArt-Latent-Diffusion

Conditional denoising diffusion probabilistic model trained in latent space to generate paintings by famous artists. See the animation of the latent diffusion process in the figure below.

Fig. 1. The animation of the latent diffusion process.

Generalization to Different Sizes

The model is able to generalize to different image sizes. See generated examples below.

Fig. 2. Generated painting in the style of Ivan Aivazovsky.

Fig. 3. Generated painting in the style of Ivan Aivazovsky.

Fig. 4. Generated painting in the style of Ivan Aivazovsky.

Fig. 5. Generated painting in the style of Martiros Saryan.

Fig. 6. Generated painting in the style of Camille Pissarro.

Fig. 7. Generated painting in the style of Pyotr Konchalovsky.

Fig. 8. Generated painting in the style of Pierre Auguste Renoir.

Repository structure:

Dataset

We used the WikiArt dataset, which contains 81444 pieces of visual art by various artists. All images were cropped and resized to 512x512 resolution. To convert images into latent representations, we apply the pretrained VQ-VAE from the Stable Diffusion model implemented by StabilityAI.
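The crop step can be illustrated with a small helper. This is only a sketch: the text above does not say which crop was used, so a center crop to the largest square is assumed here, and the function name `center_crop_box` is hypothetical.

```python
def center_crop_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Return the (left, top, right, bottom) box of the largest centered square.

    Assumption: a center crop. The original preprocessing may have cropped differently.
    """
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


# Example: a landscape 800x600 image loses 100 pixels on each side.
box = center_crop_box(800, 600)
print(box)
```

With Pillow, such a box could be passed to `Image.crop(box)` and the result resized with `Image.resize((512, 512))` to reach the 512x512 resolution mentioned above.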

Diffusion Model

We adapted the 2D UNet model from the Hugging Face diffusers package by adding three embedding layers that control painting style: artist name, genre name and style name. Before adding the style embeddings to the time embedding, we pass each type of style embedding through its own PreNet module.
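A minimal PyTorch sketch of this conditioning path follows. Only the idea comes from the description above: each style embedding passes through its own PreNet and is then added to the time embedding. Everything else is an assumption for illustration and is not taken from the repository: the `PreNet` layout (two linear layers with a SiLU activation), the class name `StyleConditioning`, the vocabulary sizes and the embedding width.

```python
import torch
from torch import nn


class PreNet(nn.Module):
    """Small MLP applied to one kind of style embedding (hypothetical layout)."""

    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class StyleConditioning(nn.Module):
    """Adds artist, genre and style embeddings to a time embedding."""

    def __init__(self, num_artists: int, num_genres: int, num_styles: int, dim: int):
        super().__init__()
        # One embedding table and one PreNet per conditioning type.
        self.embeddings = nn.ModuleList(
            [nn.Embedding(n, dim) for n in (num_artists, num_genres, num_styles)]
        )
        self.prenets = nn.ModuleList([PreNet(dim) for _ in range(3)])

    def forward(self, time_emb, artist_ids, genre_ids, style_ids):
        cond = time_emb
        for emb, prenet, ids in zip(
            self.embeddings, self.prenets, (artist_ids, genre_ids, style_ids)
        ):
            cond = cond + prenet(emb(ids))  # same shape as time_emb: (batch, dim)
        return cond


# Usage with made-up sizes: batch of 4, embedding width 32.
conditioning = StyleConditioning(num_artists=10, num_genres=5, num_styles=7, dim=32)
ids = torch.zeros(4, dtype=torch.long)
combined = conditioning(torch.randn(4, 32), ids, ids, ids)
```

The combined embedding has the same shape as the time embedding, so it can be consumed wherever the UNet blocks already expect the time embedding.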

The network is trained to predict the unscaled noise component using the Huber loss function, which produces better results on this dataset than L2 loss. During evaluation, the generated latent representations are decoded into images using the pretrained VQ-VAE.
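The noise-prediction objective described above can be sketched as follows. This is an illustration under assumptions, not the repository's training code: the function name `noise_prediction_loss`, the linear beta schedule, the value of `delta` and the stand-in model are all hypothetical. The forward-diffusion formula is the standard DDPM one.

```python
import torch
import torch.nn.functional as F


def noise_prediction_loss(model, latents, alphas_cumprod, delta=1.0):
    """Huber loss between the predicted and the true (unscaled) noise."""
    batch = latents.shape[0]
    # Sample one random timestep per example.
    t = torch.randint(0, alphas_cumprod.shape[0], (batch,), device=latents.device)
    noise = torch.randn_like(latents)
    a = alphas_cumprod[t].view(batch, 1, 1, 1)
    # Standard DDPM forward process: x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps.
    noisy = a.sqrt() * latents + (1.0 - a).sqrt() * noise
    pred = model(noisy, t)
    return F.huber_loss(pred, noise, delta=delta)


# Usage with a stand-in model and an assumed linear beta schedule.
betas = torch.linspace(1e-4, 0.02, 1000)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)
dummy_model = lambda x, t: torch.zeros_like(x)  # placeholder for the UNet
loss = noise_prediction_loss(dummy_model, torch.randn(2, 4, 8, 8), alphas_cumprod)
```

Huber loss is quadratic for small residuals and linear for large ones, so outlier residuals contribute less to the gradient than under L2 loss.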

wikiart-latent-diffusion's Issues

Using HugGAN Dataset

This is not an issue, more of a question: how did you use the HugGAN dataset? It is all labeled train from 0 to 71. Did you concatenate the files, or decompress them?
