Finding Directions in GAN's Latent Space for Neural Face Reenactment

Authors' official PyTorch implementation of Finding Directions in GAN's Latent Space for Neural Face Reenactment. The paper has been accepted as an oral presentation at the British Machine Vision Conference (BMVC) 2022. If you use this code for your research, please cite our paper.

Finding Directions in GAN's Latent Space for Neural Face Reenactment
Stella Bounareli, Vasileios Argyriou, Georgios Tzimiropoulos

Abstract: This paper is on face/head reenactment where the goal is to transfer the facial pose (3D head orientation and expression) of a target face to a source face. Previous methods focus on learning embedding networks for identity and pose disentanglement which proves to be a rather hard task, degrading the quality of the generated images. We take a different approach, bypassing the training of such networks, by using (fine-tuned) pre-trained GANs which have been shown capable of producing high-quality facial images. Because GANs are characterized by weak controllability, the core of our approach is a method to discover which directions in latent GAN space are responsible for controlling facial pose and expression variations. We present a simple pipeline to learn such directions with the aid of a 3D shape model which, by construction, already captures disentangled directions for facial pose, identity and expression. Moreover, we show that by embedding real images in the GAN latent space, our method can be successfully used for the reenactment of real-world faces. Our method features several favorable properties including using a single source image (one-shot) and enabling cross-person reenactment. Our qualitative and quantitative results show that our approach often produces reenacted faces of significantly higher quality than those produced by state-of-the-art methods for the standard benchmarks of VoxCeleb1 & 2.
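To make the idea concrete, here is a minimal, illustrative sketch (not the authors' code) of what "directions in latent space" means in practice: a matrix A, whose rows are learned directions, maps a change in 3D shape-model pose/expression parameters to a shift of the StyleGAN2 latent code. All sizes and names below are assumptions for illustration only.

import torch

# Hypothetical sizes: a handful of pose/expression parameters, a 512-dim latent space.
num_params, latent_dim = 12, 512
A = torch.randn(num_params, latent_dim)   # learned directions (random here, purely for illustration)

w_source = torch.randn(1, latent_dim)     # latent code of the source face
delta_p = torch.zeros(1, num_params)      # target-minus-source pose/expression parameters
delta_p[0, 0] = 0.5                       # e.g. a change in head yaw

w_reenacted = w_source + delta_p @ A      # shifted latent code, fed to the StyleGAN2 generator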

Face Reenactment Results on VoxCeleb1 dataset

Real image editing of head pose and expression

Self and Cross-subject Reenactment

Installation

  • Python 3.5+
  • Linux
  • NVIDIA GPU + CUDA CuDNN
  • Pytorch (>=1.5)
  • Pytorch3d
  • DECA

We recommend running this repository using Anaconda.

conda create -n python38 python=3.8
conda activate python38
conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=11.0 -c pytorch
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install pytorch3d -c pytorch3d 
pip install -r requirements.txt
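As a quick sanity check (not part of the repository), you can verify that the CUDA-enabled PyTorch build and PyTorch3D imported correctly:

import torch
import pytorch3d

# Both imports should succeed, and CUDA should be available on the GPU machine.
print("torch:", torch.__version__, "| cuda available:", torch.cuda.is_available())
print("pytorch3d:", pytorch3d.__version__)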

Pretrained Models

We provide a StyleGAN2 model trained using StyleGAN2-ada-pytorch and an e4e inversion model, both trained on the VoxCeleb1 dataset.

  • StyleGAN2-VoxCeleb1: StyleGAN2 trained on the VoxCeleb1 dataset.
  • e4e-VoxCeleb1: e4e trained on the VoxCeleb1 dataset.

Auxiliary Models

We provide additional auxiliary models needed during training.

  • face-detector: Pretrained face detector taken from face-alignment.
  • IR-SE50 Model: Pretrained IR-SE50 model taken from InsightFace_Pytorch, used in our identity loss.
  • DECA model: Pretrained model taken from DECA. Extract data.tar.gz under ./libs/DECA/.

By default, we assume that all pretrained models are downloaded and saved to the directory ./pretrained_models.

Preparing your Data

  1. Download and preprocess the VoxCeleb dataset using VoxCeleb_preprocessing.

  2. Invert real images into the latent space of the pretrained StyleGAN2 using the Encoder4Editing method.

python invert_images.py --input_path path/to/voxdataset

After preprocessing and inversion, the dataset is saved with the following structure:

path/to/voxdataset
|-- id10271                           # identity index
|   |-- 37nktPRUJ58                   # video index
|   |   |-- frames_cropped            # preprocessed frames
|   |   |   |-- 00_000025.png
|   |   |   |-- ...
|   |   |-- inversion
|   |   |    |-- frames               # inverted frames
|   |   |    |   |-- 00_000025.png
|   |   |    |   |-- ..
|   |   |    |-- latent_codes         # inverted latent_codes
|   |   |    |   |-- 00_000025.npy
|   |   |    |   |-- ..
|   |-- Zjc7Xy7aT8c
|   |   | ...
|-- id10273
|   | ...

Correct preprocessing of the dataset is important for reenacting the images; a different preprocessing pipeline will lead to poor performance.
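The short, illustrative snippet below (not part of the repository) walks the layout shown above and reports any cropped frame that is missing its inverted latent code:

from pathlib import Path

# Assumes the directory layout shown above; adjust the root path accordingly.
root = Path("path/to/voxdataset")
for frame in root.glob("*/*/frames_cropped/*.png"):
    video_dir = frame.parents[1]
    latent = video_dir / "inversion" / "latent_codes" / (frame.stem + ".npy")
    if not latent.exists():
        print(f"Missing latent code for {frame}")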

Training

To train our model, make sure the required models are downloaded and saved under the ./pretrained_models directory and that the training and test data are organized as described above. Please check run_trainer.py and ./libs/configs/config_arguments.py for the available training arguments.

Example of training using paired data:

python run_trainer.py \
--experiment_path ./training_attempts/exp_v00 \
--train_dataset_path path_to_training_dataset \
--test_dataset_path path_to_test_dataset \
--training_method paired
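For intuition only (this is not the repository's data loader), a "paired" sample can be thought of as two frames drawn from the same video, so the source and target share identity and differ only in pose and expression. The paths below reuse the example layout shown earlier and are purely illustrative:

import random
from pathlib import Path

# Hypothetical pairing: pick two frames from the same video folder (same identity).
video_dir = Path("path/to/voxdataset/id10271/37nktPRUJ58")
frames = sorted((video_dir / "frames_cropped").glob("*.png"))
source_frame, target_frame = random.sample(frames, 2)  # same identity, different pose/expression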

Inference

Download our pretrained A-matrix model and save it under the ./pretrained_models directory.

Facial image editing:

Given an input image or latent code, edit a single facial attribute corresponding to one of our learned directions.

python run_facial_editing.py \
  --source_path ./inference_examples/0002775.png \
  --output_path ./results/facial_editing \
  --directions 0 1 2 3 4 \
  --save_gif \
  --optimize_generator

Face reenactment (self or cross):

Given a source identity and a target video, reenact the source face. The source and target faces may share the same identity (self-reenactment) or belong to different identities (cross-subject reenactment).

python run_inference.py \
  --source_path ./inference_examples/0002775.png \
  --target_path ./inference_examples/lWOTF8SdzJw#2614-2801.mp4 \
  --output_path ./results/ \
  --save_video

Citation

[1] Stella Bounareli, Vasileios Argyriou, and Georgios Tzimiropoulos. Finding Directions in GAN's Latent Space for Neural Face Reenactment. In British Machine Vision Conference (BMVC), 2022.

Bibtex entry:

@article{bounareli2022finding,
  title={Finding Directions in GAN's Latent Space for Neural Face Reenactment},
  author={Bounareli, Stella and Argyriou, Vasileios and Tzimiropoulos, Georgios},
  journal={British Machine Vision Conference (BMVC)},
  year={2022}
}
