harskish / tlgan

Time-Lapse Disentanglement With Conditional GANs [SIGGRAPH 2022]

License: Other

Languages: Python 82.60%, C++ 4.20%, CUDA 13.19%
Topics: disentanglement, generative-adversarial-network, time-lapse, timelapse

tlgan's Introduction

Disentangling Random and Cyclic Effects in Time-Lapse Sequences

[Teaser image: valley time-lapse]

Erik Härkönen1, Miika Aittala2, Tuomas Kynkäänniemi1, Samuli Laine2, Timo Aila2, Jaakko Lehtinen1,2
1Aalto University, 2NVIDIA
https://doi.org/10.1145/3528223.3530170

Abstract: Time-lapse image sequences offer visually compelling insights into dynamic processes that are too slow to observe in real time. However, playing a long time-lapse sequence back as a video often results in distracting flicker due to random effects, such as weather, as well as cyclic effects, such as the day-night cycle. We introduce the problem of disentangling time-lapse sequences in a way that allows separate, after-the-fact control of overall trends, cyclic effects, and random effects in the images, and describe a technique based on data-driven generative models that achieves this goal. This enables us to "re-render" the sequences in ways that would not be possible with the input images alone. For example, we can stabilize a long sequence to focus on plant growth over many months, under selectable, consistent weather.
Our approach is based on Generative Adversarial Networks (GANs) that are conditioned with the time coordinate of the time-lapse sequence. Our architecture and training procedure are designed so that the networks learn to model random variations, such as weather, using the GAN's latent space, and to disentangle overall trends and cyclic variations by feeding the conditioning time label to the model using Fourier features with specific frequencies.
We show that our models are robust to defects in the training data, enabling us to amend some of the practical difficulties in capturing long time-lapse sequences, such as temporary occlusions, uneven frame spacing, and missing frames.
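The time-conditioning idea above can be sketched in a few lines: the scalar time coordinate is mapped through sine/cosine pairs at fixed frequencies, so that a frequency matching a cyclic effect (e.g. one cycle per day) produces features that repeat exactly once per cycle. The function name, frequency values, and feature layout below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def fourier_features(t, freqs):
    """Map scalar time coordinates t to sin/cos Fourier features.

    freqs are cycles per unit time -- hypothetical examples: 1.0 for a
    yearly trend when t is in years, 365.0 for the day-night cycle.
    Features at frequency f repeat exactly every 1/f time units, which
    is what lets the model tie cyclic variation to these inputs.
    """
    t = np.asarray(t, dtype=np.float64)
    angles = 2.0 * np.pi * np.outer(t, freqs)          # shape (N, F)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

# Two times exactly one cycle apart yield identical features:
f = fourier_features([0.0, 1.0], freqs=[1.0])
```

A conditional generator would concatenate such features with (or map them into) its other conditioning inputs; frequencies not present in `freqs` cannot be expressed through the time label, which is one intuition for why trends and cycles separate from the random latent.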

Video: https://youtu.be/UrQ3tOfpjuA

Setup

See the setup instructions.

Dataset preparation

See the dataset preprocessing instructions.

Usage

Training a model
First, go through the dataset preparation instructions above to produce a dataset zip.

# Print available options
python train.py --help

# Train TLGAN on Valley using 4 GPUs.
python train.py --outdir=~/training-runs --data=~/datasets/valley_1024x1024_2225hz.zip --gpus=4 --batch=32

# Train an unconditional StyleGAN2 on Teton using 2 GPUs.
python train.py --outdir=~/training-runs --data=~/datasets/teton_512x512_2225hz.zip --gpus=2 --batch=32 --cond=none --metrics=fid50k_full

Model visualizer
The interactive model visualizer can be used to explore the effects of the conditioning inputs and the latent space.

# Visualize a trained model
python visualize.py path/to/model.pkl

The UI can be scaled with the button in the top-right corner and made fullscreen by pressing F11.

Grid visualizer
The input grid visualizer can be used to create 2D image grids, time-lapse images (stacked strips), and videos. All exported files (JPG, PNG, MP4) contain embedded metadata recording the state of every UI element, which allows previously exported data to be loaded back into the UI via drag-and-drop.

# Open trained model pickle in grid visualizer
python grid_viz.py /path/to/model.pkl

# Reopen UI and load state from previously exported image
python grid_viz.py /path/to/image.png

Dataset visualization
Both visualizers can display dataset frames that most closely match the current conditioning variables. Set environment variable TLGAN_DATASET_ROOT or pass argument --dataset_root to specify the directory in which datasets are stored.
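The two configuration paths above (environment variable vs. command-line argument) imply a small resolution order. A minimal sketch, assuming the CLI argument takes precedence over the environment variable and that the fallback default shown here is hypothetical:

```python
import os

def resolve_dataset_root(cli_arg=None):
    """Pick the dataset directory: an explicit --dataset_root argument wins,
    then the TLGAN_DATASET_ROOT environment variable, then a guessed default.
    The precedence order and default path are assumptions for illustration."""
    if cli_arg:
        return cli_arg
    return os.environ.get('TLGAN_DATASET_ROOT', os.path.expanduser('~/datasets'))
```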

Downloads

Known issues

  • NVJPEG does not work correctly with CUDA 11.0–11.5. CPU decoding will be used instead, leading to reduced performance. Affects preproc/process_sequence.py, grid_viz.py, and visualize.py.
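The fallback such a workaround implies can be sketched as a simple version gate; the function name and the (major, minor) tuple format are assumptions for illustration, not the repository's actual check.

```python
def nvjpeg_usable(cuda_version):
    """Return False for CUDA 11.0-11.5, where NVJPEG decoding misbehaves,
    so callers can fall back to (slower) CPU JPEG decoding.

    cuda_version is a (major, minor) tuple, e.g. (11, 3).
    """
    major, minor = cuda_version
    return not (major == 11 and 0 <= minor <= 5)
```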

Citation

@article{harkonen2022tlgan,
  author    = {Erik Härkönen and Miika Aittala and Tuomas Kynkäänniemi and Samuli Laine and Timo Aila and Jaakko Lehtinen},
  title     = {Disentangling Random and Cyclic Effects in Time-Lapse Sequences},
  journal   = {{ACM} Trans. Graph.},
  volume    = {41},
  number    = {4},
  year      = {2022},
}

License

The code of this repository is based on StyleGAN3, which is released under the NVIDIA License.
All modified source files are marked separately and released under the CC BY-NC-SA 4.0 license.
The files in ./ext are provided under the MIT license.
The file mutable_zipfile.py is released under the Python License.
The included Roboto Mono font is licensed under the Apache 2.0 license.

tlgan's People

Contributors

harskish


tlgan's Issues

Which VS version is needed to compile? An error occurs when running conda env with --prune on the corresponding cuxx.yml file

cmake C:\Users\Administrator\AppData\Local\Temp\pip-install-qv_cg26i\simpleimageio_1f2a96f5db3b4f179142cbfeba783f7d -DPYLIB=C:\Users\Administrator\AppData\Local\Temp\pip-install-qv_cg26i\simpleimageio_1f2a96f5db3b4f179142cbfeba783f7d\build\lib.win-amd64-cpython-39\simpleimageio -DCMAKE_BUILD_TYPE=Release
-- Building for: Visual Studio 17 2022
-- The C compiler identification is MSVC 19.34.31942.0
-- The CXX compiler identification is MSVC 19.34.31942.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - failed
-- Check for working C compiler: D:/VS2022/VC/Tools/MSVC/14.34.31933/bin/Hostx64/x64/cl.exe
-- Check for working C compiler: D:/VS2022/VC/Tools/MSVC/14.34.31933/bin/Hostx64/x64/cl.exe - broken
CMake Error at C:/Users/Administrator/AppData/Local/Temp/pip-build-env-_9fg2wxv/overlay/Lib/site-packages/cmake/data/share/cmake-3.25/Modules/CMakeTestCCompiler.cmake:70 (message):
The C compiler

      "D:/VS2022/VC/Tools/MSVC/14.34.31933/bin/Hostx64/x64/cl.exe"

    is not able to compile a simple test program.

add model to Hugging Face Hub

Hi!

Congrats on the work being accepted at SIGGRAPH 2022! Would you be interested in sharing your models on the Hugging Face Hub? The Hub offers free hosting and would make your work more accessible and visible to the rest of the ML community.

Some of the benefits of sharing your models through the Hub would be:

  • versioning, commit history, and diffs
  • repos provide useful metadata about their tasks, languages, metrics, etc., making them discoverable
  • multiple features such as TensorBoard visualizations, PapersWithCode integration, and more
  • wider reach of your work across the ecosystem

Creating the repos and adding new models should be a relatively straightforward process if you've used Git before. This is a step-by-step guide explaining the process in case you're interested. Please let us know if you would be interested and if you have any questions.

Also, this month we have an event for SIGGRAPH: https://huggingface.co/SIGGRAPH2022

It would be great if you could join, add the models there as well, and create a Gradio demo for the model.

Here is an example Gradio demo: https://huggingface.co/spaces/SIGGRAPH2022/StyleGAN-XL

Happy to hear your thoughts,
Ahsen and the Hugging Face team
