yinwin / diffsynth-studio

This project forked from modelscope/diffsynth-studio

Enjoy the magic of Diffusion models!

License: Apache License 2.0

Python 100.00%

DiffSynth Studio

Introduction

DiffSynth Studio is a diffusion engine. We have restructured the architectures of components such as the text encoder, UNet, and VAE, maintaining compatibility with models from the open-source community while improving computational performance. We provide many interesting features. Enjoy the magic of Diffusion models!

Roadmap

  • Aug 29, 2023. We propose DiffSynth, a video synthesis framework.
  • Oct 1, 2023. We release an early version of this project, namely FastSDXL, a first attempt at building a diffusion engine.
    • The source code is released on GitHub.
    • FastSDXL includes a trainable OLSS scheduler for efficiency improvement.
      • The original repo of OLSS is here.
      • The technical report (CIKM 2023) is released on arXiv.
      • A demo video is shown on Bilibili.
      • Since OLSS requires additional training, we don't implement it in this project.
  • Nov 15, 2023. We propose FastBlend, a powerful video deflickering algorithm.
  • Dec 8, 2023. We decide to develop a new project, aiming to unlock the potential of diffusion models, especially in video synthesis. Development of this project begins.
  • Jan 29, 2024. We propose Diffutoon, a fantastic solution for toon shading.
    • Project Page.
    • The source code is released in this project.
    • The technical report (IJCAI 2024) is released on arXiv.
  • June 13, 2024. DiffSynth Studio is transferred to ModelScope. The developers have transitioned from "I" to "we". Of course, I will still participate in development and maintenance.
  • June 21, 2024. We propose ExVideo, a post-tuning technique aimed at enhancing the capability of video generation models. We have extended Stable Video Diffusion to achieve the generation of long videos up to 128 frames.
  • To date, DiffSynth Studio supports the following models:

Installation

Create Python environment:

conda env create -f environment.yml

Conda sometimes fails to install cupy correctly; if that happens, please install it manually. See this document for more details.

Enter the Python environment:

conda activate DiffSynthStudio
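
As a quick sanity check after activating the environment, the following minimal sketch reports which packages cannot be located. The package list (cupy and streamlit, both named in this README) is an assumption for illustration, and the helper name missing_packages is hypothetical; environment.yml remains the authoritative list of dependencies.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be located.

    This only checks that each package can be found on the import path;
    it does not prove that the package loads or works correctly.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list for illustration; environment.yml is the authoritative source.
CHECKED_PACKAGES = ["cupy", "streamlit"]

missing = missing_packages(CHECKED_PACKAGES)
if missing:
    print("Not found in this environment:", ", ".join(missing))
else:
    print("All checked packages were found.")
```

If cupy is reported as not found, install it manually as described above.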

Usage (in Python code)

The Python examples are in examples. We provide an overview here.
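
To see which example scripts are available locally, a small sketch like the one below may help. It assumes that it is run from the repository root and that each example lives in a sub-directory of examples (such as examples/ExVideo or examples/Diffutoon); the function name list_example_scripts is hypothetical and not part of DiffSynth Studio.

```python
from pathlib import Path


def list_example_scripts(root="examples"):
    """Map each sub-directory of `root` to the sorted names of its .py files."""
    root_path = Path(root)
    if not root_path.is_dir():
        # Not run from the repository root, or the directory is missing.
        return {}
    return {
        sub.name: sorted(p.name for p in sub.glob("*.py"))
        for sub in sorted(root_path.iterdir())
        if sub.is_dir()
    }


for name, scripts in list_example_scripts().items():
    print(f"{name}: {', '.join(scripts) or '(no .py files)'}")
```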

Long Video Synthesis

We trained an extended video synthesis model, which can generate 128 frames. See examples/ExVideo.

github_title.mp4

Image Synthesis

Generate high-resolution images by breaking the resolution limit of diffusion models! See examples/image_synthesis.

Sample images at 512*512, 1024*1024, 2048*2048, and 4096*4096 (images not shown).
Sample images at 1024*1024 and 2048*2048 (images not shown).

Toon Shading

Render realistic videos in a flat style and enable video editing features. See examples/Diffutoon.

Diffutoon.mp4
Diffutoon_edit.mp4

Video Stylization

Video stylization without video models. See examples/diffsynth.

winter_stone.mp4

Chinese Models

Use Hunyuan-DiT to generate images with Chinese prompts. We also support LoRA fine-tuning of this model. See examples/hunyuan_dit.

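Prompt: 少女手捧鲜花,坐在公园的长椅上,夕阳的余晖洒在少女的脸庞,整个画面充满诗意的美感 (English: A girl holding flowers sits on a park bench, the glow of the setting sun falling on her face; the whole scene is full of poetic beauty.)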

Sample images: 1024x1024 and 2048x2048 (highres-fix).

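Prompt: 一只小狗蹦蹦跳跳,周围是姹紫嫣红的鲜花,远处是山脉 (English: A puppy bounces around, surrounded by brightly colored flowers, with mountains in the distance.)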

Sample images: without LoRA and with LoRA.

Usage (in WebUI)

python -m streamlit run DiffSynth_Studio.py
sdxl_turbo_ui.mp4
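
If the WebUI needs to be started from another Python script, the command above can be wrapped as in the following sketch. The helper names build_webui_command and launch_webui are hypothetical, and the optional port argument relies on Streamlit's standard --server.port option; nothing here is part of DiffSynth Studio itself.

```python
import subprocess
import sys


def build_webui_command(script="DiffSynth_Studio.py", port=None):
    """Return the argument list equivalent to `python -m streamlit run <script>`."""
    # sys.executable keeps the call inside the currently active environment.
    cmd = [sys.executable, "-m", "streamlit", "run", script]
    if port is not None:
        # Optional: ask Streamlit to serve on a specific port.
        cmd += ["--server.port", str(port)]
    return cmd


def launch_webui(script="DiffSynth_Studio.py", port=None):
    """Start the WebUI and block until the Streamlit process exits."""
    return subprocess.run(build_webui_command(script, port), check=True)


# Example call (not executed here): launch_webui(port=8501)
```

Building the command as a list rather than a single string avoids shell quoting issues if the script path contains spaces.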

Contributors

artiprocher, linhqyy
