wuerstchen's Introduction

Würstchen

What is this?

Würstchen is a new framework for training text-conditional models that moves the computationally expensive text-conditional stage into a highly compressed latent space. Common approaches use a single compression stage, whereas Würstchen adds a second stage that compresses even further. In total, Stages A and B are responsible for compressing images, and Stage C learns the text-conditional part in the resulting low-dimensional latent space. This gives Würstchen a 42x compression factor while still reconstructing images faithfully, which makes training Stage C fast and computationally cheap. We refer to the paper for details.

Use Würstchen

You can use the model through the provided notebooks. The Stage B notebook is only for reconstruction, and the Stage C notebook is for text-conditional generation. You can also try text-to-image generation on Google Colab.

Train your own Würstchen

Training Würstchen is considerably faster and cheaper than training other text-to-image models, because it trains in a much smaller latent space of 12x12. We provide training scripts for both Stage B and Stage C.
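
To make the numbers concrete, here is a minimal sketch of how a 12x12 latent relates to the 42x compression factor mentioned earlier. It assumes 512x512 input images, which this README does not state. The constant IMAGE_SIDE and the helper spatial_compression are illustrative names, not part of the Würstchen codebase.

```python
# Assumption (not from this README): images are 512x512 pixels.
IMAGE_SIDE = 512
# From the text: Stage C works in a 12x12 latent space.
LATENT_SIDE = 12


def spatial_compression(image_side: int, latent_side: int) -> float:
    """Per-side downscaling factor between pixel space and latent space."""
    return image_side / latent_side


factor = spatial_compression(IMAGE_SIDE, LATENT_SIDE)
pixel_positions = IMAGE_SIDE ** 2
latent_positions = LATENT_SIDE ** 2

print(f"per-side compression: about {factor:.1f}x")
print(f"spatial positions: {pixel_positions} -> {latent_positions}")
```

Under that assumption the per-side factor comes out near 42, and Stage C only has to model 144 spatial positions instead of 262144 pixels. That gap is why training in the latent space is cheap.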

Download Models

Model	Download	Parameters	Conditioning
Würstchen v1	Huggingface	1B (Stage C) + 600M (Stage B) + 19M (Stage A)	CLIP-H-Text

Acknowledgment

Special thanks to Stability AI for providing compute for our research.
