GithubHelp home page GithubHelp logo

ballgan's Introduction

BallGAN: 3D-aware Image Synthesis with a Spherical Background
Official PyTorch implementation

Teaser image

3D-aware Blending with Generative NeRFs
Minjung Shin1,2, Yunji Seo1, Jeongmin Bae1, Youngsun Choi1, Hyunsu Kim2, Hyeran Byun1
, Youngjung Uh1
1Yonsei University, 2NAVER AI Lab

Project page | Arxiv | ICCV 2023 Video

Abstract: 3D-aware GANs aim to synthesize realistic 3D scenes that can be rendered in arbitrary camera viewpoints, generating high-quality images with well-defined geometry. As 3D content creation becomes more popular, the ability to generate foreground objects separately from the background has become a crucial property. Existing methods have been developed regarding overall image quality, but they can not generate foreground objects only and often show degraded 3D geometry. In this work, we propose to represent the background as a spherical surface for multiple reasons inspired by computer graphics. Our method naturally provides foreground-only 3D synthesis facilitating easier 3D content creation. Furthermore, it improves the foreground geometry of 3D-aware GANs and the training stability on datasets with complex backgrounds.

The code heavily relays on EG3D.

Requirements

  • We recommend Linux for performance and compatibility reasons.
  • 1–8 high-end NVIDIA GPUs. We have done all testing and development using V100, RTX3090.
  • 64-bit Python 3.8 and PyTorch 1.11.0 (or later). See https://pytorch.org for PyTorch install instructions.
  • CUDA toolkit 11.3 or later. (Why is a separate CUDA toolkit installation required? We use the custom CUDA extensions from the StyleGAN3 repo. Please see Troubleshooting).
  • Python libraries: see environment.yml for exact library dependencies. You can use the following commands with Miniconda3 to create and activate your Python environment:
    • cd ballgan
    • conda env create -f environment.yml
    • conda activate ballgan

Getting started

You can download the pretrained models for both FFHQ and AFHQ.

Pre-trained networks are stored as *.pkl files that can be referenced using local filenames.

Generating media

# Generate rgb images, and rgba images and forground shapes (as .mrc files) using pre-trained model

python gen_samples.py --outdir=out --trunc=0.7 --shapes=true --seeds=0-3 \
    --network=networks/network_snapshot.pkl

We visualize our .mrc shape files with UCSF Chimerax.

To visualize a shape in ChimeraX do the following:

  1. Import the .mrc file with File > Open
  2. Find the selected shape in the Volume Viewer tool
    1. The Volume Viewer tool is located under Tools > Volume Data > Volume Viewer
  3. Change volume type to "Surface"
  4. Change step size to 1
  5. Change level set to 10
    1. Note that the optimal level can vary by each object, but is usually between 2 and 20. Individual adjustment may make certain shapes slightly sharper
  6. In the Lighting menu in the top bar, change lighting to "Full"

Dataset

We follow the same dataset format as EG3D supported on FFHQ and AFHQv2_cats, which can be either an image folder, or a zipped file.

Train a new BallGAN model

You can train new networks using train.py. For example:

# Train with FFHQ from scratch with raw neural rendering resolution=64, using 8 GPUs.
python train.py --outdir=~/training-runs --cfg=ffhq --data=~/datasets/FFHQ_512.zip \
  --gpus=8 --batch=32 --gamma=1 --gen_pose_cond=True

# Second stage finetuning of FFHQ to 128 neural rendering resolution (optional).
python train.py --outdir=~/training-runs --cfg=ffhq --data=~/datasets/FFHQ_512.zip \
  --resume=~/training-runs/ffhq_experiment_dir/network-snapshot-025000.pkl \
  --gpus=8 --batch=32 --gamma=1 --gen_pose_cond=True --neural_rendering_resolution_final=128

# Train with AFHQ, finetuning from FFHQ with ADA, using 8 GPUs.
python train.py --outdir=~/training-runs --cfg=afhq --data=~/datasets/afhq.zip \
  --gpus=8 --batch=32 --gamma=5 --aug=ada --neural_rendering_resolution_final=128 --gen_pose_cond=True --gpc_reg_prob=0.8

In the above examples, there are no regularizations applied.

If you want to use regularization, set the desired values for fg_density_reg, fg_density_spatial_reg,density_reg, and bg_transmittance_reg.

Metrics

python calc_metrics.py --metrics=fid50k_full --data=~/datasets/ffhq_512.zip \
    --network=ffhq-128.pkl

Citation

@InProceedings{Shin_2023_ICCV,
    author    = {Shin, Minjung and Seo, Yunji and Bae, Jeongmin and Choi, Young Sun and Kim, Hyunsu and Byun, Hyeran and Uh, Youngjung},
    title     = {BallGAN: 3D-aware Image Synthesis with a Spherical Background},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {7268-7279}
}

ballgan's People

Stargazers

Yeongmin Ko avatar Brad avatar  avatar  avatar  avatar Ying Xiang avatar なるみ avatar Mr.wen avatar TaekyungKi avatar Liam Schoneveld avatar Sejong Yang avatar Jiho Choi avatar JennyJeoni avatar Brian Pugh avatar Yinghao Xu avatar Ceyuan Yang avatar Taehoon Kim avatar Hyunsu Kim avatar  avatar Okrin avatar SunwooKim avatar Junyoung Seo avatar minjung shin avatar  avatar Sandalots avatar 爱可可-爱生活 avatar 김진원 avatar Joseph Cheng avatar  avatar Azlen Elza avatar Yoon, Seungje avatar Junho Kim avatar Soohyun avatar Kyusun Cho avatar SANGHYEON KIM avatar Pilhyeon Lee (이필현) avatar jinho park avatar

Watchers

Taehoon Kim avatar simongao avatar minjung shin avatar  avatar Kyusun Cho avatar

ballgan's Issues

about checkpoint release

Amazing work!
I am wondering whether it is possible to share your checkpoints. If it is possible, I would really appreciate it!

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.