sakshamgupta006 / video-to-video-synthesis

A PyTorch implementation of Video-to-Video Synthesis by NVIDIA

Home Page: http://vid-2-vid.herokuapp.com/index.html

License: Other

Languages: Python 83.52%, Cuda 11.47%, Shell 2.72%, C++ 2.30%
Topics: deep-learning, computer-vision, vid-2-vid, pytorch

vid2vid

This project implements NVIDIA's Video-to-Video Synthesis research paper. The goal of that research is to learn a mapping function from an input source video (a sequence of semantic segmentation masks) to a photorealistic output video that precisely depicts the content of the source video.

Prerequisites

  • Linux
  • Python 3
  • NVIDIA GPU + CUDA (v9.0) and cuDNN (v7.0)
  • PyTorch 1.0

Getting Started

Installing required libraries and software

  • Download and install Anaconda.
  • Create an environment:
    conda create --name env_name
  • Install PyTorch:
    conda install pytorch torchvision cudatoolkit=9.2 -c pytorch
  • Install the Python libraries dominate, requests, and dlib:
    pip install dominate requests dlib
  • Install TensorBoard:
    pip install tensorboard
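
After installation, it can help to confirm that the packages are importable and that PyTorch can see the GPU. The following is a minimal sketch, not part of this repository: the function names `check_environment` and `describe_cuda` are hypothetical, and the package list simply mirrors the install commands above.

```python
import importlib.util

# Packages named in the installation steps above.
REQUIRED = ["torch", "torchvision", "dominate", "requests", "dlib", "tensorboard"]


def check_environment(packages=REQUIRED):
    """Return a dict mapping each package name to True if it can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


def describe_cuda():
    """Report PyTorch version and CUDA visibility, or None if torch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    try:
        import torch
    except Exception as exc:  # a broken install can fail at import time
        return {"import_error": repr(exc)}
    return {
        "torch_version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
        "cuda_version": torch.version.cuda,  # CUDA version PyTorch was built against
    }


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
    print("CUDA:", describe_cuda())
```

Run it inside the activated conda environment. A package reported as MISSING, or `cuda_available` being False, suggests the environment differs from the prerequisites listed earlier.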

Dataset

  • Cityscapes
    • To download the Cityscapes dataset, use the following link: Cityscapes download.
    • To run the model, copy the downloaded images to the 'datasets' folder.

Downloading Datasets Using Scripts

  • To download the dummy dataset, run python scripts/download_datasets.py.
  • To download FlowNet2, run python scripts/download_flownet2.py.
  • Our pre-trained model for the Cityscapes dataset can be downloaded from the following link.
  • Copy the downloaded models to checkpoints/label2city_256_g1/.
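
The copy step can also be scripted. This is a minimal sketch under stated assumptions: the helper `copy_checkpoints` is hypothetical, the source directory is wherever you saved the download, and the `*.pth` pattern is a guess based on the usual PyTorch checkpoint extension rather than something this repository documents.

```python
import shutil
from pathlib import Path


def copy_checkpoints(src_dir, dest_dir="checkpoints/label2city_256_g1", pattern="*.pth"):
    """Copy files matching `pattern` from src_dir into dest_dir.

    The destination is created if it does not exist. Returns the list of
    copied file paths. The "*.pth" default is an assumption; adjust it to
    match the files you actually downloaded.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.glob(pattern)):
        if path.is_file():
            shutil.copy2(path, dest / path.name)  # copy2 preserves timestamps
            copied.append(dest / path.name)
    return copied


# Example (hypothetical download location):
# copy_checkpoints(Path.home() / "Downloads" / "label2city_256_g1")
```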

Training Configuration

  • We used the following platform and hardware to train our model and evaluate results:
    • Ubuntu 18.04
    • CUDA 9 with cuDNN 7
    • Intel Core i7-8700K (6 cores)
    • 16 GB HyperX RAM
    • NVIDIA RTX 2080 graphics card
  • Training time: ~50 hours.

Training

  • First, download the FlowNet2 checkpoint file by running python scripts/download_models_flownet2.py.
    • We trained our models on a single RTX 2080 GPU. For convenience, we provide some sample training scripts for single-GPU users. Performance is not guaranteed when using these scripts.
    • For example, to train on 256 x 128 video with a single GPU:
    python train.py --name label2city_256_g1 --label_nc 35 --loadSize 256 --use_instance --fg --n_downsample_G 2 --num_D 1 --max_frames_per_gpu 6 --n_frames_total 6
  • To run TensorBoard:
    tensorboard --logdir=runs
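
When running several experiments, it may be convenient to keep the training flags in one place and generate the command line from them. The sketch below is an illustration only: `TRAIN_OPTIONS`, `build_command`, and `format_command` are hypothetical names, and the option values are copied from the single-GPU example above without any claim about what each flag does internally.

```python
import shlex
import subprocess
import sys

# Flags copied from the single-GPU training example above.
# True marks a switch that takes no value (e.g. --use_instance).
TRAIN_OPTIONS = {
    "name": "label2city_256_g1",
    "label_nc": 35,
    "loadSize": 256,
    "use_instance": True,
    "fg": True,
    "n_downsample_G": 2,
    "num_D": 1,
    "max_frames_per_gpu": 6,
    "n_frames_total": 6,
}


def build_command(script, options):
    """Turn an options dict into an argument list for the given script."""
    cmd = [sys.executable, script]
    for key, value in options.items():
        if value is True:
            cmd.append(f"--{key}")  # bare switch
        elif value is False or value is None:
            continue  # switch turned off: omit it
        else:
            cmd.extend([f"--{key}", str(value)])
    return cmd


def format_command(cmd):
    """Render the argument list as a copy-pasteable shell string."""
    return " ".join(shlex.quote(part) for part in cmd)


if __name__ == "__main__":
    command = build_command("train.py", TRAIN_OPTIONS)
    print(format_command(command))
    # To launch training from Python instead of the shell, uncomment:
    # subprocess.run(command, check=True)
```

Changing a value in the dict (for example `name`) and re-running prints the updated command, which keeps experiment settings easy to compare.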

Testing

  • To test the model:
    python test.py --name label2city_256_g1 --label_nc 35 --loadSize 256 --n_scales_spatial 3 --use_instance --fg --use_single_G
    The test results will be saved in ./results/label2city_256_g1/test_latest/.
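
To check quickly that a test run produced output, the results directory can be scanned for frames. This is a minimal sketch: `list_result_frames` is a hypothetical helper, and it assumes the generated frames are written as ordinary image files (JPEG or PNG), which this README does not state explicitly.

```python
from pathlib import Path

# Assumed output formats; adjust if your run writes something else.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_result_frames(results_dir):
    """Return a sorted list of image files found anywhere under results_dir.

    Returns an empty list if the directory does not exist.
    """
    root = Path(results_dir)
    if not root.is_dir():
        return []
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


# Example: pass the results directory reported by your test run.
# frames = list_result_frames("path/to/results")
# print(len(frames), "frames found")
```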

Citation

@inproceedings{wang2018vid2vid,
   author    = {Ting-Chun Wang and Ming-Yu Liu and Jun-Yan Zhu and Guilin Liu
                and Andrew Tao and Jan Kautz and Bryan Catanzaro},
   title     = {Video-to-Video Synthesis},
   booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},   
   year      = {2018},
}

Contributors

sakshamgupta006, voraparth1337
