ljzycmd / t2i-adapter-w-masactrl

This project is forked from tencentarc/t2i-adapter.

MasaCtrl with T2I-Adapter for controllable consistent image synthesis and editing

Home Page: https://ljzycmd.github.io/projects/MasaCtrl/

License: Apache License 2.0

Python 100.00%

MasaCtrl with T2I-Adapter

This repo contains the implementation of MasaCtrl integrated into the controllable diffusion model T2I-Adapter.


Introduction

We propose MasaCtrl, a tuning-free method for non-rigid consistent image synthesis and editing. The key idea is to combine the contents of the source image with the layout synthesized from the text prompt and additional controls to produce the desired synthesized or edited image, using Mutual Self-Attention Control.
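To make the idea concrete, here is a minimal NumPy sketch of one common reading of mutual self-attention: the target branch keeps its own queries (which carry the new layout) but reads keys and values from the source branch (which carry the source contents). This is an illustration under that assumption, not the code in this repo; the function names, token count, and feature size are all hypothetical, and the random arrays stand in for real U-Net features.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(q, k, v):
    # Standard scaled dot-product attention over a single head.
    scale = 1.0 / np.sqrt(q.shape[-1])
    weights = softmax((q @ k.T) * scale)
    return weights @ v


def mutual_self_attention(q_tgt, k_src, v_src):
    # Assumption: target queries attend to source keys and values,
    # so the output is assembled from source features.
    return self_attention(q_tgt, k_src, v_src)


# Hypothetical toy features: 16 spatial tokens with 8 channels each.
rng = np.random.default_rng(0)
n_tokens, dim = 16, 8
q_src, k_src, v_src = (rng.standard_normal((n_tokens, dim)) for _ in range(3))
q_tgt, k_tgt, v_tgt = (rng.standard_normal((n_tokens, dim)) for _ in range(3))

out_plain = self_attention(q_tgt, k_tgt, v_tgt)          # ordinary self-attention
out_mutual = mutual_self_attention(q_tgt, k_src, v_src)  # contents taken from source
```

Each row of `out_mutual` is a weighted average of rows of `v_src`, which is why the target image can reuse the source's appearance while its own queries decide where that appearance goes.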

Main Features

1 Controllable Consistent Image Synthesis and Editing

Directly modifying the text prompt often fails to produce the target layout of the desired image, so we further integrate our method into existing controllable diffusion pipelines (such as T2I-Adapter and ControlNet) to obtain stable synthesis and editing results.

The target layout is controlled by additional guidance.

Synthesis (left part) and editing (right part) results with T2I-Adapter

2 Consistent Video Synthesis

With dense, consistent guidance, MasaCtrl enables video synthesis.

Video Synthesis Results (with keypose and canny guidance)

Usage

Install

Please refer to the usage guide of T2I-Adapter here (or in the official repo) and download the pretrained guidance models.

Checkpoints

Stable Diffusion: You can download these checkpoints from their official repository and Hugging Face.

Personalized Models: You can download personalized models from CIVITAI or train your own customized models.

Start

For controllable synthesis:

python masactrl_w_adapter.py \
--which_cond sketch \
--cond_path_src SOURCE_CONDITION_PATH \
--cond_path CONDITION_PATH \
--cond_inp_type sketch \
--prompt_src "A bear walking in the forest" \
--prompt "A bear standing in the forest" \
--sd_ckpt models/sd-v1-4.ckpt \
--resize_short_edge 512 \
--cond_tau 1.0 \
--cond_weight 1.0 \
--n_samples 1 \
--adapter_ckpt models/t2iadapter_sketch_sd14v1.pth

NOTE: You can download the sketch examples here.
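If you want to launch the synthesis command from a Python script (for example, to sweep over several prompts), the flags can be assembled programmatically. The sketch below is only a convenience wrapper: `build_command` is a hypothetical helper, not part of this repo, and the option names and values are copied from the command above. The placeholder paths must be replaced before running.

```python
import shlex


def build_command(script, options):
    """Turn a dict of option names and values into an argv list."""
    argv = ["python", script]
    for name, value in options.items():
        argv += [f"--{name}", str(value)]
    return argv


# Same options as the controllable synthesis command above.
synthesis_options = {
    "which_cond": "sketch",
    "cond_path_src": "SOURCE_CONDITION_PATH",  # placeholder, replace with a real path
    "cond_path": "CONDITION_PATH",             # placeholder, replace with a real path
    "cond_inp_type": "sketch",
    "prompt_src": "A bear walking in the forest",
    "prompt": "A bear standing in the forest",
    "sd_ckpt": "models/sd-v1-4.ckpt",
    "resize_short_edge": 512,
    "cond_tau": 1.0,
    "cond_weight": 1.0,
    "n_samples": 1,
    "adapter_ckpt": "models/t2iadapter_sketch_sd14v1.pth",
}

argv = build_command("masactrl_w_adapter.py", synthesis_options)

# Print a shell-quoted version for inspection.
print(shlex.join(argv))

# To actually launch it from the repo root:
#   import subprocess
#   subprocess.run(argv, check=True)
```

Passing the arguments as a list avoids shell-quoting problems with prompts that contain spaces or commas.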

For real image editing:

python masactrl_w_adapter.py \
--src_img_path SOURCE_IMAGE_PATH \
--cond_path CONDITION_PATH \
--cond_inp_type image \
--prompt_src "" \
--prompt "a photo of a man wearing black t-shirt, giving a thumbs up" \
--sd_ckpt models/sd-v1-4.ckpt \
--resize_short_edge 512 \
--cond_tau 1.0 \
--cond_weight 1.0 \
--n_samples 1 \
--which_cond sketch \
--adapter_ckpt models/t2iadapter_sketch_sd14v1.pth \
--outdir ./workdir/masactrl_w_adapter_inversion/black-shirt

NOTE: You can download the real image editing example here.

Acknowledgements

We thank the authors of the awesome research works Prompt-to-Prompt and T2I-Adapter.

Contact

If you have any comments or questions, please open a new issue or feel free to contact Mingdeng Cao and Xintao Wang.
