👗 DM-VTON: Distilled Mobile Real-time Virtual Try-On

Home Page: https://sites.google.com/view/ltnghia/research/DMVTON

License: Other


[Paper] [Colab Notebook] [Web Demo]


This is the official PyTorch implementation of DM-VTON: Distilled Mobile Real-time Virtual Try-On. DM-VTON is designed to be fast and lightweight while maintaining the quality of the try-on image. It achieves 40 frames per second on a single Nvidia Tesla T4 GPU and takes up only 37 MB of memory.

๐Ÿ“ Documentation

Installation

This source code has been developed and tested with python==3.10, pytorch==1.13.1, and torchvision==0.14.1. We recommend using the conda package manager for installation.

  1. Clone this repo.
git clone https://github.com/KiseKloset/DM-VTON.git
  2. Install dependencies with conda (we provide the script scripts/install.sh).
conda create -n dm-vton python=3.10
conda activate dm-vton
bash scripts/install.sh
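
After installation, it can be useful to confirm that the environment matches the versions listed above. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `check_versions` are hypothetical, and it assumes PyTorch is registered under the distribution name `torch`, which is the usual case but may differ for some conda builds.

```python
from importlib import metadata

# Versions the README says the code was developed and tested with.
EXPECTED = {"torch": "1.13.1", "torchvision": "0.14.1"}


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected, lookup=installed_version):
    """Return {package: (wanted, found)} for every mismatch or missing package."""
    problems = {}
    for package, wanted in expected.items():
        found = lookup(package)
        # Builds such as "1.13.1+cu117" carry a local suffix; compare the base part.
        base = found.split("+")[0] if found else None
        if base != wanted:
            problems[package] = (wanted, found)
    return problems


if __name__ == "__main__":
    for package, (wanted, found) in check_versions(EXPECTED).items():
        print(f"{package}: expected {wanted}, found {found}")
```

Running the script inside the `dm-vton` environment prints one line per mismatched or missing package and nothing when the versions agree.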

Data Preparation

VITON

Because of copyright issues with the original VITON dataset, we use a resized version provided by CP-VTON. We followed the work of Han et al. to filter out duplicates and ensure that no data leakage occurs (VITON-Clean). You can download the VITON-Clean dataset here.

|                | VITON | VITON-Clean |
|----------------|-------|-------------|
| Training pairs | 14221 | 6824        |
| Testing pairs  | 2032  | 416         |

Dataset folder structure:

├── VTON-Clean
│   ├── VITON_test
│   │   ├── test_pairs.txt
│   │   ├── test_img
│   │   ├── test_color
│   │   ├── test_edge
│   ├── VITON_traindata
│   │   ├── train_pairs.txt
│   │   ├── train_img
│   │   │   ├── [000003_0.jpg | ...]  # Person
│   │   ├── train_color
│   │   │   ├── [000003_1.jpg | ...]  # Garment
│   │   ├── train_edge
│   │   │   ├── [000003_1.jpg | ...]  # Garment mask
│   │   ├── train_label
│   │   │   ├── [000003_0.jpg | ...]  # Parsing map
│   │   ├── train_densepose
│   │   │   ├── [000003_0.npy | ...]  # Densepose
│   │   ├── train_pose
│   │   │   ├── [000003_0.json | ...] # Openpose
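
Before training or testing, the layout above can be checked with a few lines of Python. This is a minimal sketch and not part of the repository: `EXPECTED_ENTRIES` and `missing_entries` are hypothetical names, and the list only mirrors the entries shown in the folder structure. The dataset root is passed in as an argument, so no particular folder name is assumed.

```python
from pathlib import Path

# Sub-paths listed in the folder structure above, relative to the dataset root.
EXPECTED_ENTRIES = [
    "VITON_test/test_pairs.txt",
    "VITON_test/test_img",
    "VITON_test/test_color",
    "VITON_test/test_edge",
    "VITON_traindata/train_pairs.txt",
    "VITON_traindata/train_img",
    "VITON_traindata/train_color",
    "VITON_traindata/train_edge",
    "VITON_traindata/train_label",
    "VITON_traindata/train_densepose",
    "VITON_traindata/train_pose",
]


def missing_entries(root):
    """Return the expected files/folders that do not exist under root."""
    root = Path(root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]
```

Calling `missing_entries("path/to/dataset")` returns an empty list when every listed file and folder is present, and otherwise the relative paths that still need to be downloaded or generated.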

Inference

test.py runs inference on image folders, then evaluates FID, LPIPS, and runtime, and saves the results to runs/TEST_DIR. See the sample script scripts/test.sh for how to run it. You can download the pretrained checkpoints here.

Note: to run and save separate results for each pair [person, garment], set batch_size=1.

Training

For each dataset, you need to train a Teacher network first to guide the Student network. DM-VTON uses FS-VTON as the Teacher. Each model is trained in two stages: the first stage trains only the warping module, and the second stage trains the entire model (warping module + generator). Check the sample scripts for training both the Teacher network (scripts/train_pb_warp + scripts/train_pb_e2e) and the Student network (scripts/train_pf_warp + scripts/train_pf_e2e). We also provide a Colab notebook as a quick tutorial.

Training Settings

A full list of training settings can be found in opt/train_opt.py. Below are some important settings.

  • device: Device (gpu) for performing training (e.g. 0,1,2). DM-VTON needs a GPU to run with cupy.
  • batch_size: Customize batch_size for each stage to optimize for your hardware.
  • lr: Learning rate.
  • Epochs = niter + niter_decay
    • niter: Number of epochs using the starting learning rate.
    • niter_decay: Number of epochs over which the learning rate decays linearly to zero.
  • save_period: Checkpoints are saved every save_period epochs.
  • resume: Use if you want to continue training from a previous run.
  • project and name: The results (checkpoints, logs, images, etc.) will be saved in the project/name folder. Note that if the folder already exists, the code will create a new folder (e.g. project/name-1, project/name-2).
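
Two of the settings above describe behaviour that is easier to see in code: the learning-rate schedule defined by niter and niter_decay, and the naming of the output folder. The following is a minimal sketch under stated assumptions, not the repository's implementation. The function names `learning_rate` and `next_run_dir` are hypothetical, epochs are counted from zero, and the exact epoch at which the decay starts and the exact suffix rule may differ from the actual code.

```python
from pathlib import Path


def learning_rate(epoch, base_lr, niter, niter_decay):
    """Learning rate for a 0-based epoch index (assumed schedule).

    Constant at base_lr for the first niter epochs, then a straight line
    that reaches zero once niter + niter_decay epochs have elapsed.
    """
    if epoch < niter:
        return base_lr
    if niter_decay <= 0:
        return 0.0
    remaining = max(niter + niter_decay - epoch, 0)
    return base_lr * remaining / niter_decay


def next_run_dir(project, name):
    """Return project/name, or the first free project/name-N if it exists."""
    candidate = Path(project) / name
    suffix = 0
    while candidate.exists():
        suffix += 1
        candidate = Path(project) / f"{name}-{suffix}"
    return candidate
```

For example, with niter=50 and niter_decay=50 the rate stays at its starting value through the first half of training and is halved at epoch 75. A second run with the same project and name would be written to project/name-1 rather than overwriting the first.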

📈 Result

Results on VITON

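| Methods          | FID $\downarrow$ | Runtime (ms) $\downarrow$ | Memory (MB) $\downarrow$ |
|------------------|------------------|---------------------------|--------------------------|
| ACGPN (CVPR20)   | 33.3             | 153.6                     | 565.9                    |
| PF-AFN (CVPR21)  | 27.3             | 35.8                      | 293.3                    |
| C-VTON (WACV22)  | 37.1             | 66.9                      | 168.6                    |
| SDAFN (ECCV22)   | 30.2             | 83.4                      | 150.9                    |
| FS-VTON (CVPR22) | 26.5             | 37.5                      | 309.3                    |
| OURS             | 28.2             | 23.3                      | 37.8                     |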
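
The runtime column can be related to the frame rate quoted in the introduction with a short calculation. The sketch below copies the runtime and memory figures from the table and converts milliseconds to frames per second, assuming that the reported runtime is the time to process a single image. The function name `frames_per_second` is introduced here only for illustration.

```python
# (runtime in ms, memory in MB) copied from the results table above.
RESULTS = {
    "ACGPN": (153.6, 565.9),
    "PF-AFN": (35.8, 293.3),
    "C-VTON": (66.9, 168.6),
    "SDAFN": (83.4, 150.9),
    "FS-VTON": (37.5, 309.3),
    "DM-VTON": (23.3, 37.8),
}


def frames_per_second(runtime_ms):
    """Convert a per-image runtime in milliseconds to frames per second."""
    return 1000.0 / runtime_ms


for method, (runtime_ms, memory_mb) in RESULTS.items():
    fps = frames_per_second(runtime_ms)
    print(f"{method:8s} {fps:6.1f} FPS {memory_mb:7.1f} MB")
```

Under that assumption, 23.3 ms per image corresponds to roughly 43 frames per second, which is consistent with the 40 frames per second stated in the introduction. Compared with the FS-VTON Teacher, the table implies about 1.6 times lower runtime and about 8 times lower memory use.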

😎 Supported Models

We also support some parser-free models that can be used as Teacher and/or Student. The methods all have a 2-stage architecture (warping module and generator). For more details, see here.

| Methods        | Source                                                              | Teacher | Student |
|----------------|---------------------------------------------------------------------|---------|---------|
| PF-AFN         | Parser-Free Virtual Try-on via Distilling Appearance Flows          | ✅      | ✅      |
| FS-VTON        | Style-Based Global Appearance Flow for Virtual Try-On               | ✅      | ✅      |
| RMGN           | RMGN: A Regional Mask Guided Network for Parser-free Virtual Try-on | ❌      | ✅      |
| DM-VTON (Ours) | DM-VTON: Distilled Mobile Real-time Virtual Try-On                  | ✅      | ✅      |

ℹ Citation

If our code or paper is helpful to your work, please consider citing:

@inproceedings{nguyen2023dm,
  title        = {DM-VTON: Distilled Mobile Real-time Virtual Try-On},
  author       = {Nguyen-Ngoc, Khoi-Nguyen and Phan-Nguyen, Thanh-Tung and Le, Khanh-Duy and Nguyen, Tam V and Tran, Minh-Triet and Le, Trung-Nghia},
  year         = 2023,
  booktitle    = {IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct)},
}

๐Ÿ™ Acknowledgments

This code is based on PF-AFN.

📄 License

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. The use of this code is for academic purposes only.

