GithubHelp home page GithubHelp logo

byteshow1234 / anydoor Goto Github PK

View Code? Open in Web Editor NEW

This project forked from ali-vilab/anydoor

0.0 0.0 0.0 29.29 MB

Official implementations for paper: Anydoor: zero-shot object-level image customization

Home Page: https://damo-vilab.github.io/AnyDoor-Page/

License: MIT License

Shell 0.06% Python 99.94%

anydoor's Introduction

AnyDoor: Zero-shot Object-level Image Customization

Xi Chen · Lianghua Huang · Yu Liu · Yujun Shen · Deli Zhao · Hengshuang Zhao

Paper PDF Project Page
The University of Hong Kong   |   Alibaba Group |   Ant Group

News

  • [2023.12.17] Release train & inference & demo code, and pretrained checkpoint.
  • [Soon] Release the new version paper.
  • [Soon] Support online demo.
  • [On-going] Scale-up the training data and release stronger models as the foundaition model for downstream region-to-region generation tasks.
  • [On-going] Release specific-designed models for downstream tasks like virtual tryon, face swap, text and logo transfer, etc.

Installation

Install with conda:

conda env create -f environment.yaml
conda activate anydoor

or pip:

pip install -r requirements.txt

Additionally, for training, you need to install panopticapi, pycocotools, and lvis-api.

pip install git+https://github.com/cocodataset/panopticapi.git

pip install pycocotools -i https://pypi.douban.com/simple

pip install lvis

Download Checkpoints

Download AnyDoor checkpoint:

Download DINOv2 checkpoint and revise /configs/anydoor.yaml for the path (line 83)

Download Stable Diffusion V2.1 if you want to train from scratch.

Inference

We provide inference code in run_inference.py (from Line 222 - ) for both inference single image and inference a dataset (VITON-HD Test). You should modify the data path and run the following code. The generated results are provided in examples/TestDreamBooth/GEN for single image, and VITONGEN for VITON-HD Test.

python run_inference.py

The inferenced results on VITON-Test would be like [garment, ground truth, generation].

Noticing that AnyDoor does not contain any specific design/tuning for tryon, we think it would be helpful to add skeleton infos or warped garment, and tune on tryon data to make it better :)

Our evaluation data for DreamBooth an COCOEE coud be downloaded at Google Drive:

  • URL: [to be released]

Gradio demo

Currently, we suport local gradio demo. To launch it, you should firstly modify /configs/demo.yaml for the path to the pretrained model, and /configs/anydoor.yaml for the path to DINOv2(line 83).

Afterwards, run the script:

python run_gradio_demo.py

The gradio demo would look like the UI shown below:

Train

Prepare datasets

  • Download the datasets that present in /configs/datasets.yaml and modify the corresponding paths.
  • You could prepare you own datasets according to the formates of files in ./datasets.
  • If you use UVO dataset, you need to process the json following ./datasets/Preprocess/uvo_process.py
  • You could refer to run_dataset_debug.py to verify you data is correct.

Prepare initial weight

  • If your would like to train from scratch, convert the downloaded SD weights to control copy by running:
sh ./scripts/convert_weight.sh  

Start training

  • Modify the training hyper-parameters in run_train_anydoor.py Line 26-34 according to your training resources. We verify that using 2-A100 GPUs with batch accumulation=1 could get satisfactory results after 300,000 iterations.

  • Start training by executing:

sh ./scripts/train.sh  

Acknowledgements

This project is developped on the codebase of ControlNet. We appreciate this great work!

Citation

If you find this codebase useful for your research, please use the following entry.

@article{chen2023anydoor,
  title={Anydoor: Zero-shot object-level image customization},
  author={Chen, Xi and Huang, Lianghua and Liu, Yu and Shen, Yujun and Zhao, Deli and Zhao, Hengshuang},
  journal={arXiv preprint arXiv:2307.09481},
  year={2023}
}

anydoor's People

Contributors

xavierchen34 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.