text2image

This repository contains the implementation of Text to Image Generation with Semantic-Spatial Aware GAN.

Network Structure

[Figure: network_structure]

The structure of the semantic-spatial aware (SSA) block is shown below:

[Figure: ssacn]

Main Requirements

  • python 3.6+
  • pytorch 1.0+
  • numpy
  • matplotlib
  • opencv
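A quick way to confirm that the requirements above are installed is a small check script. This is a minimal sketch and not part of the repository: the function name `check_environment` is hypothetical, and the import names `torch` and `cv2` are the usual ones for the pytorch and opencv packages, assumed here. It checks only the Python version and whether each module can be found; it does not verify the pytorch 1.0+ version.

```python
import importlib.util
import sys

# Assumption: "pytorch" and "opencv" are imported as `torch` and `cv2`.
REQUIRED_MODULES = ["torch", "numpy", "matplotlib", "cv2"]


def check_environment(modules=REQUIRED_MODULES, min_python=(3, 6)):
    """Return (python_ok, {module_name: is_installed}) without importing the modules."""
    python_ok = sys.version_info >= min_python
    installed = {name: importlib.util.find_spec(name) is not None for name in modules}
    return python_ok, installed


if __name__ == "__main__":
    python_ok, installed = check_environment()
    print("python 3.6+:", "ok" if python_ok else "too old")
    for name, found in installed.items():
        print(f"{name}:", "found" if found else "missing")
```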

Prepare data

  1. Download the preprocessed metadata for birds and coco and save them to data/
  2. Download birds dataset and extract the images to data/birds/
  3. Download coco dataset and extract the images to data/coco/

Pre-trained DAMSM model

  1. Download the pre-trained DAMSM for CUB and save it to DAMSMencoders/
  2. Download the pre-trained DAMSM for coco and save it to DAMSMencoders/
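Before training, it can help to confirm that the directories named in the steps above exist. The following is a minimal sketch, not a script shipped with this repository; the helper name `missing_dirs` is hypothetical, and it checks only that the directories are present, not their contents.

```python
from pathlib import Path

# Directories named in the preparation steps, relative to the repository root.
EXPECTED_DIRS = ["data", "data/birds", "data/coco", "DAMSMencoders"]


def missing_dirs(root=".", expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in expected if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected directories are present.")
```

Run it from the repository root. If you train on only one dataset, the directories for the other one will be reported as missing and can be ignored.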

Trained model

You can download our trained models from our OneDrive repo.

Start training

Run main.py. Adjust the arguments in the file as needed.

Evaluation

Run IS.py and test_lpips.py (remember to change the image path) to evaluate the IS and diversity scores, respectively.

To evaluate the FID score, use this repo: https://github.com/bioinf-jku/TTUR.
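For orientation, the Inception Score (IS) is commonly defined as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is the class distribution a pre-trained Inception network predicts for a generated image x and p(y) is the average of those distributions. The sketch below computes that formula from class probabilities that are already available. It is an illustration only and not the code in IS.py; the function name `inception_score` and the list-of-rows input format are assumptions, and it omits the averaging over several splits that published scores usually use.

```python
import math


def inception_score(probs):
    """Compute exp(mean_x KL(p(y|x) || p(y))) from per-image class probabilities.

    `probs` is a list of rows, one per image; each row is a probability
    distribution over classes (non-negative, sums to 1).
    """
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal class distribution p(y): the average of all rows.
    marginal = [sum(row[k] for row in probs) / n for k in range(num_classes)]
    total_kl = 0.0
    for row in probs:
        # KL(p(y|x) || p(y)); terms with p = 0 contribute 0 by convention.
        total_kl += sum(p * math.log(p / m) for p, m in zip(row, marginal) if p > 0)
    return math.exp(total_kl / n)
```

When every image receives the same prediction, the score is 1, its minimum. When each image is assigned confidently to a different one of N classes, the score approaches N.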

Performance

After training under xe loss for xxxxx epochs, you should get scores close to the following:

[Figure: results]

Qualitative Results

Qualitative results from different methods on the coco and birds datasets are shown below:

[Figure: qualitative_results]

The predicted mask maps at different stages are shown below:

[Figure: mask]

Reference

If you find this repo helpful in your research, please consider citing our paper:

@article{liao2021text,
  title={Text to Image Generation with Semantic-Spatial Aware GAN},
  author={Liao, Wentong and Hu, Kai and Yang, Michael Ying and Rosenhahn, Bodo},
  journal={arXiv preprint arXiv:2104.00567},
  year={2021}
}

The code is released for academic research use only. For commercial use, please contact Wentong Liao.

Acknowledgements

This implementation borrows part of the code from DF-GAN.
