
ICCV17-fashionGAN


Shizhan Zhu

Released on Oct 11, 2017

Updates

The complete demo has now been updated. Please see here for details.

To facilitate future research, we provide the indexing of our selected subset of the DeepFashion dataset (attribute prediction task). It is a .mat file containing a 78979-dimensional index vector that points into the full set (values are between 1 and 289222). We also provide the nameList of the selected subset. Download the indexing here.
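
For illustration, here is a minimal Python sketch of how such an index vector could be applied to the full DeepFashion name list. The file name ind.mat, the variable lookup, and full_name_list are assumptions for illustration, not the repository's actual names; check the downloaded files for the real ones.

# Minimal sketch (assumed file/variable names) of selecting the 78,979-image
# subset from the full 289,222-image DeepFashion list using the released index.
import scipy.io as sio

mat = sio.loadmat('ind.mat')  # hypothetical filename of the downloaded .mat
# Pick the first non-metadata array stored in the file.
idx = next(v for k, v in mat.items() if not k.startswith('__')).ravel()

assert idx.size == 78979
assert 1 <= idx.min() and idx.max() <= 289222

# MATLAB indices are 1-based, so subtract 1 when indexing a Python list
# of the full DeepFashion image names (e.g. loaded from the nameList file).
# subset_names = [full_name_list[i - 1] for i in idx]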

Description

This is the implementation of Shizhan Zhu et al.'s ICCV-17 work Be Your Own Prada: Fashion Synthesis with Structural Coherence. It is open source under the BSD-3 license (see the LICENSE file). The code can be used freely for academic purposes only. If you want to apply it to industrial products, please first send an email to Shizhan Zhu at [email protected].

Acknowledgement

The motivation for this work, as well as the training data used, comes from the DeepFashion dataset. Please cite the following papers if you use the code or data of this work:

@inproceedings{liuLQWTcvpr16DeepFashion,
  author = {Ziwei Liu and Ping Luo and Shi Qiu and Xiaogang Wang and Xiaoou Tang},
  title = {DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2016}
}
@inproceedings{zhu2017be,
  title = {Be Your Own Prada: Fashion Synthesis with Structural Coherence},
  author = {Zhu, Shizhan and Fidler, Sanja and Urtasun, Raquel and Lin, Dahua and Chen, Change Loy},
  booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
  year = {2017}
}

Qualitative Results

Matrix visualization: The samples in the same row are generated from the same original person, while the samples in the same column are generated from the same text description.

Walking the latent space: In each row, the first and last images are the two samples between which we interpolate, gradually changing the input starting from the left image. In the first row, we interpolate only the input to the first stage, so the generated results change only in shape. In the second row, we interpolate only the input to the second stage, so the results change only in texture. The last row interpolates the inputs to both stages, so the generated results transfer smoothly from left to right.
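
As a rough illustration of the interpolation itself (a Python/NumPy sketch of the general idea, not the repository's actual Torch code):

# Sketch of linear latent-space interpolation; illustration only.
import numpy as np

def interpolate(z_left, z_right, n_steps=8):
    """Return n_steps inputs blending z_left into z_right."""
    alphas = np.linspace(0.0, 1.0, n_steps)
    return [(1.0 - a) * z_left + a * z_right for a in alphas]

# Interpolating only the first-stage input changes the shape, interpolating
# only the second-stage input changes the texture, and interpolating both
# gives the smooth left-to-right transition shown in the last row.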

Dependency

The implementation is based on Torch. CuDNN is required.

Getting data

  1. Run the following command to obtain part of the training data and the off-the-shelf pre-trained models. Folders for the models are also created in this step.
sh download.sh

This part of the data contains all the new annotations (language descriptions and segmentation maps) on the subset of the DeepFashion dataset, as well as the benchmarking info (the train-test split and the image-language pairs of the test set). Compared to the full data, it does not contain G2.h5, which you need to obtain according to Step 2 below.

  2. You can obtain G2.h5 in the same way as the DeepFashion dataset. Please refer to this page for detailed instructions (e.g. signing an agreement). After obtaining G2.h5, put it in the directory ./data_release/supervision_signals/ before using the code.

Formatting of the data stored in the .h5 files:

b_: The segmentation label for each image, e.g. 0 represents the background.
ih: The 128x128 images.
ih_mean: The mean image.
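
A quick way to sanity-check the file from Python is sketched below; the dataset key names come from the list above, but the exact shapes and layout are assumptions and should be verified against the released file.

# Inspect G2.h5 with h5py (key names from the list above; shapes assumed).
import h5py

with h5py.File('data_release/supervision_signals/G2.h5', 'r') as f:
    print(list(f.keys()))       # expect something like ['b_', 'ih', 'ih_mean']
    print(f['ih'].shape)        # the 128x128 images
    print(f['b_'].shape)        # per-image segmentation labels (0 = background)
    print(f['ih_mean'].shape)   # the mean image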

For any questions regarding obtaining the data (e.g. cannot obtain through the Dropbox via the link) please send an email to [email protected].

Testing

All the testing code is in the demo_release folder. Our implementation provides three options for the second-stage GAN:

  1. Run demo_full.lua with this line uncommented. The network structure is our originally submitted version.
  2. Run demo_full.lua as it is. It adds the skip-connection technique proposed in Hourglass and pix2pix.
  3. Run demo_p2p.lua. The network structure completely follows pix2pix. The texture quality is good but cannot be controlled.

You can modify this block to switch between different types of visualization.

Training

  1. To train the first-stage GAN, enter the sr1 folder and run the train.lua file.
  2. To train the second-stage GAN, enter the relevant folder and run the train.lua file. Folder ih1 refers to our original submission. Folder ih1_skip refers to the second-stage network coupled with skip connections. Folder ih1_p2p uses pix2pix as our second stage.

Complete demo

With the complete demo, you can use your own image and text description as inputs. Your input image is not required to be 128x128, but the output is always 128x128. Your input sentence is assumed not to contain words that our model does not know.

To set up, get to the root directory of the repo and run the following commands:

sh download.sh
cd complete_demo
sh setup.sh

In addition, the OpenPose library is needed to detect the bounding box of the person in the image. Please follow its instructions to install OpenPose appropriately.

Please make sure that Torch, PyTorch and MATLAB are available on your system.

The complete demo can be run with the following command (replace the path with the location of your OpenPose installation):

OPENPOSE_DIR=/path/to/your/installed/openpose sh demo.sh

The input folder should contain at least two samples like the provided ones (apologies; this is required because of MATLAB's automatic squeezing of singleton dimensions). After running the demo, the results will be stored in the output folder.

The complete demo uses the Dense CRF and OpenPose libraries, as well as the ATR and LIP datasets. Please cite these works if you use our complete demo.

Please report to us if you find that your MATLAB does not support the function cp2tform. Thanks!

Language encoding

Please refer to the language folder for training and testing the initial language encoding model.

Feedback

Suggestions and opinions on this work (both positive and negative) are greatly welcome. Please contact the author by sending an email to [email protected].

License

BSD-3, see LICENSE file for details.


iccv17-fashiongan's Issues

th demo_full.lua error

torch/install/bin/luajit: ...h/install/share/lua/5.1/cudnn/SpatialFullConvolution.lua:31: attempt to perform arithmetic on field 'groups' (a nil value)
stack traceback:
...h/install/share/lua/5.1/cudnn/SpatialFullConvolution.lua:31: in function 'resetWeightDescriptors'
...h/install/share/lua/5.1/cudnn/SpatialFullConvolution.lua:105: in function 'func'
.../yicheng/torch/install/share/lua/5.1/nngraph/gmodule.lua:345: in function 'neteval'
.../yicheng/torch/install/share/lua/5.1/nngraph/gmodule.lua:380: in function 'forward'
demo_full.lua:107: in main chunk
[C]: in function 'dofile'
...heng/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

I got this error when running th demo_full.lua in the demo_release folder. Can anybody kindly help me?

Caption Generation in Quantitative Experiments

In the appendix you reference [30], but your paper has at most 19 citations, so which article is [30], and which method does it refer to?

Also, what does this mean: "d_4 denotes the four-time replicated version of d in the two spatial dimensions"?

My Memory is not enough

I just ran train.lua in sr1 with this spec:

i7-7700
GTX 1080 8GB
16 GB DDR4 RAM
200 GB storage

My RAM and swap usage hit 100% in the system monitor. Do you have any idea how to split the dataset through code? Has anyone had the same experience?

Cannot download the DeepFashion data

Hi @zhusz ,

Thanks for your interesting work. I want to download the data, but I can only download the Category and Attribute Prediction Benchmark; I cannot download the rest. I see there are four datasets. Which one do you use for your algorithm? I tried both Google Drive and Dropbox.

Thanks,
Hai

code is unavailable.

I want to use the code and data from your FashionGAN paper. Will it be available soon?

Torch out of memory

Can anyone tell me why I'm getting an out-of-memory error while running demo_full.lua?
System config -
CPU - i7
GPU - Nvidia Titan V RTX
System RAM - 16 GB

How to extract DeepFashion

I want to know how to extract the images from the DeepFashion dataset. Can you provide the index table? Thank you.

sh demo.sh error

After running sh demo.sh, I get the following errors:

Traceback (most recent call last):
  File "test_lang_initial.py", line 41, in <module>
    model.load_state_dict(torch.load('rnn_latest.pth'))
  File "/usr/local/lib/python2.7/dist-packages/torch/serialization.py", line 267, in load
    return _load(f, map_location, pickle_module)
  File "/usr/local/lib/python2.7/dist-packages/torch/serialization.py", line 410, in _load
    magic_number = pickle_module.load(f)
cPickle.UnpicklingError: bad pickle data

I think the rnn_latest.pth file is corrupt.
Can someone solve my problem?

openpose version

Which OpenPose version did you use for the complete demo?
OpenPose's top of tree does not seem to be compatible.

cannot run complete demo

Hi
Something is going wrong and I cannot run the demo, neither lua demo_full.lua nor sh demo.sh.
I haven't found exactly what the problem is; here is the error message from the terminal:

Error using parseYML (line 8)
Assertion failed.

I know there should be a file named download1_pose.yml in /cache, but I have nothing there.
Maybe the problem to tackle happens in the OpenPose part?
Sorry, I am just a beginner and not very good at debugging and environment setup.

Also, this shows up before the code starts running:

*** Check failure stack trace: ***
    @     0x7f7c2c8f35cd  google::LogMessage::Fail()
    @     0x7f7c2c8f5433  google::LogMessage::SendToLog()
    @     0x7f7c2c8f315b  google::LogMessage::Flush()
    @     0x7f7c2c8f5e1e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f7c2cd3b738  caffe::CuDNNConvolutionLayer<>::Reshape()
    @     0x7f7c2cca2d3b  caffe::Net<>::Init()
    @     0x7f7c2cca5e00  caffe::Net<>::Net()
    @     0x7f7c2e88deab  op::NetCaffe::initializationOnThread()
    @     0x7f7c2e850e0a  op::addCaffeNetOnThread()
    @     0x7f7c2e85172a  op::PoseExtractorCaffe::netInitializationOnThread()
    @     0x7f7c2e84ce70  op::PoseExtractor::initializationOnThread()
    @     0x7f7c2e849914  op::WPoseExtractor<>::initializationOnThread()
    @     0x7f7c2e93ed60  op::SubThread<>::initializationOnThread()
    @     0x7f7c2e940f6f  op::Thread<>::initializationOnThread()
    @     0x7f7c2e9411c1  op::Thread<>::threadFunction()
    @     0x7f7c2e95b15f  std::_Mem_fn_base<>::operator()<>()
    @     0x7f7c2e95b0f3  _ZNSt12_Bind_simpleIFSt7_Mem_fnIMN2op6ThreadISt10shared_ptrISt6vectorINS1_5DatumESaIS5_EEES3_INS1_6WorkerIS8_EEEEEFvvEEPSC_EE9_M_invokeIJLm0EEEEvSt12_Index_tupleIJXspT_EEE
    @     0x7f7c2e95aea4  std::_Bind_simple<>::operator()()
    @     0x7f7c2e95ac74  std::thread::_Impl<>::_M_run()
    @     0x7f7c2df5ac80  (unknown)
    @     0x7f7c2d6ac6ba  start_thread
    @     0x7f7c2d9c941d  clone
    @              (nil)  (unknown)
Aborted (core dumped)
