GithubHelp home page GithubHelp logo

dm-gan-mdd's Introduction

Modality Disentangled Discriminator for Text-to-Image Synthesis

Introduction

This project page provides pytorch code that implements the paper: "Modality Disentangled Discriminator for Text-to-Image Synthesis".

How to use

Python

  • Python2.7
  • Pytorch1.0+
  • tensorflow (pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.12.0-cp27-none-linux_x86_64.whl)
  • pip install easydict pathlib
  • conda install requests nltk pandas scikit-image pyyaml

Data

  1. Download metadata for birds coco and save them to data/

    • python google_drive.py 1O_LtUP9sch09QH3s_EBAgLEctBQ5JBSJ ./data/bird.zip
    • python google_drive.py 1rSnbIGNDGZeHlsUlLdahj0RJ9oo6lgH9 ./data/coco.zip
  2. Download the birds image data. Extract them to data/birds/

    • cd data/birds
    • wget http://www.vision.caltech.edu/visipedia-data/CUB-200-2011/CUB_200_2011.tgz
    • tar -xvzf CUB_200_2011.tgz
  3. Download coco dataset and extract the images to data/coco/

    • cd data/coco
    • wget http://images.cocodataset.org/zips/train2014.zip
    • wget http://images.cocodataset.org/zips/val2014.zip
    • unzip train2014.zip
    • unzip val2014.zip
    • mv train2014 images
    • cp val2014/* images

Pretrained Models

  • DAMSM for bird. Download and save it to DAMSMencoders/
    • python google_drive.py 1GNUKjVeyWYBJ8hEU-yrfYQpDOkxEyP3V DAMSMencoders/bird.zip
  • DAMSM for coco. Download and save it to DAMSMencoders/
    • python google_drive.py 1zIrXCE9F6yfbEJIbNP5-YrEe2pZcPSGJ DAMSMencoders/coco.zip
  • DM-GAN-MDD for bird. Download and save it to models
  • DM-GAN-MDD for coco. Download and save it to models
  • IS for bird
    • python google_drive.py 0B3y_msrWZaXLMzNMNWhWdW0zVWs eval/IS/inception_finetuned_models.zip
  • FID for bird
    • python google_drive.py 1747il5vnY2zNkmQ1x_8hySx537ZAJEtj eval/FID/bird_val.npz
  • FID for coco
    • python google_drive.py 10NYi4XU3_bLjPEAg5KQal-l8A_d8lnL5 eval/FID/coco_val.npz

Training

  • go into code/ folder
  • bird: python main.py --cfg cfg/bird_DMGANMDD.yml --gpu 0
  • coco: python main.py --cfg cfg/coco_DMGANMDD.yml --gpu 0

Validation

  1. Images generation:
    • go into code/ folder
    • python main.py --cfg cfg/eval_bird_DMGANMDD.yml --gpu 0
    • python main.py --cfg cfg/eval_coco_DMGANMDD.yml --gpu 0
  2. Inception score (IS for bird, IS for coco):
    • cd DM-GAN-MDD/eval/IS/bird && CUDA_VISIBLE_DEVICES=0 python inception_score_bird.py --image_folder ../../../models/netG_DMGANMDD_bird
    • cd DM-GAN-MDD/eval/IS/coco && CUDA_VISIBLE_DEVICES=0 python inception_score_coco.py ../../../models/netG_DMGANMDD_coco
  3. FID:
    • cd DM-GAN-MDD/eval/FID && python fid_score.py --gpu 0 --path1 bird_val.npz --path2 ../../models/netG_DMGANMDD_bird
    • cd DM-GAN-MDD/eval/FID && python fid_score.py --gpu 0 --path1 coco_val.npz --path2 ../../models/netG_DMGANMDD_coco

Performance

As DM-GAN, we use the Pytorch implementation to measure FID score.

Model R-precision↑ IS↑ Pytorch FID
bird_AttnGAN (paper) 67.82% ± 4.43% 4.36 ± 0.03 23.98
bird_DMGAN (paper) 72.31% ± 0.91% 4.75 ± 0.07 16.09
bird_DMGAN_MDD 79.73% ± 0.68% 4.86 ± 0.06 15.76
coco_AttnGAN (paper) 85.47% ± 3.69% 25.89 ± 0.47 35.49
coco_DMGAN (paper) 88.56% ± 0.28% 30.49 ± 0.57 32.64
coco_DMGAN_MDD 94.37% ± 0.36% 34.46 ± 0.72 24.30

License

This code is released under the MIT License (refer to the LICENSE file for details).

dm-gan-mdd's People

Contributors

fangxiangfeng avatar

Stargazers

Fan avatar  avatar  avatar  avatar Zhuowei Chen avatar  avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.