Finding Directions in GAN's Latent Space for Neural Face Reenactment

Authors' official PyTorch implementation of Finding Directions in GAN's Latent Space for Neural Face Reenactment. The paper has been accepted as an oral presentation at the British Machine Vision Conference (BMVC) 2022. If you use this code for your research, please cite our paper.

Finding Directions in GAN's Latent Space for Neural Face Reenactment
Stella Bounareli, Vasileios Argyriou, Georgios Tzimiropoulos

Abstract: This paper is on face/head reenactment where the goal is to transfer the facial pose (3D head orientation and expression) of a target face to a source face. Previous methods focus on learning embedding networks for identity and pose disentanglement which proves to be a rather hard task, degrading the quality of the generated images. We take a different approach, bypassing the training of such networks, by using (fine-tuned) pre-trained GANs which have been shown capable of producing high-quality facial images. Because GANs are characterized by weak controllability, the core of our approach is a method to discover which directions in latent GAN space are responsible for controlling facial pose and expression variations. We present a simple pipeline to learn such directions with the aid of a 3D shape model which, by construction, already captures disentangled directions for facial pose, identity and expression. Moreover, we show that by embedding real images in the GAN latent space, our method can be successfully used for the reenactment of real-world faces. Our method features several favorable properties including using a single source image (one-shot) and enabling cross-person reenactment. Our qualitative and quantitative results show that our approach often produces reenacted faces of significantly higher quality than those produced by state-of-the-art methods for the standard benchmarks of VoxCeleb1 & 2.
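To make the idea concrete, here is a minimal, illustrative sketch (not the authors' code) of what "directions in latent space" means in practice: a matrix A, whose rows are learned directions, maps a change in 3D shape-model pose/expression parameters to a shift of the StyleGAN2 latent code. All sizes and names below are assumptions for illustration only.

import torch

# Hypothetical sizes: a handful of pose/expression parameters, a 512-dim latent space.
num_params, latent_dim = 12, 512
A = torch.randn(num_params, latent_dim)   # learned directions (random here, purely for illustration)

w_source = torch.randn(1, latent_dim)     # latent code of the source face
delta_p = torch.zeros(1, num_params)      # target-minus-source pose/expression parameters
delta_p[0, 0] = 0.5                       # e.g. a change in head yaw

w_reenacted = w_source + delta_p @ A      # shifted latent code, fed to the StyleGAN2 generator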

Face Reenactment Results on VoxCeleb1 dataset

Real image editing of head pose and expression

Self and Cross-subject Reenactment

Installation

  • Python 3.5+
  • Linux
  • NVIDIA GPU + CUDA CuDNN
  • Pytorch (>=1.5)
  • Pytorch3d
  • DECA

We recommend running this repository using Anaconda.

conda create -n python38 python=3.8
conda activate python38
conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=11.0 -c pytorch
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install pytorch3d -c pytorch3d 
pip install -r requirements.txt
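As a quick sanity check (not part of the repository), you can verify that the CUDA-enabled PyTorch build and PyTorch3D imported correctly:

import torch
import pytorch3d

# Both imports should succeed, and CUDA should be available on the GPU machine.
print("torch:", torch.__version__, "| cuda available:", torch.cuda.is_available())
print("pytorch3d:", pytorch3d.__version__)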

Pretrained Models

We provide a StyleGAN2 model trained using StyleGAN2-ada-pytorch and an e4e inversion model, both trained on the VoxCeleb1 dataset.

  • StyleGAN2-VoxCeleb1: StyleGAN2 trained on the VoxCeleb1 dataset.
  • e4e-VoxCeleb1: e4e trained on the VoxCeleb1 dataset.

Auxiliary Models

We provide additional auxiliary models needed during training.

  • face-detector: Pretrained face detector taken from face-alignment.
  • IR-SE50 Model: Pretrained IR-SE50 model taken from InsightFace_Pytorch, used in our identity loss.
  • DECA model: Pretrained model taken from DECA. Extract data.tar.gz under ./libs/DECA/.

By default, we assume that all pretrained models are downloaded and saved to the directory ./pretrained_models.

Preparing your Data

  1. Download and preprocess the VoxCeleb dataset using VoxCeleb_preprocessing.

  2. Invert real images into the latent space of the pretrained StyleGAN2 using the Encoder4Editing method.

python invert_images.py --input_path path/to/voxdataset

After preprocessing and inversion, the dataset is saved with the following structure:

path/to/voxdataset
|-- id10271                           # identity index
|   |-- 37nktPRUJ58                   # video index
|   |   |-- frames_cropped            # preprocessed frames
|   |   |   |-- 00_000025.png
|   |   |   |-- ...
|   |   |-- inversion
|   |   |    |-- frames               # inverted frames
|   |   |    |   |-- 00_000025.png
|   |   |    |   |-- ..
|   |   |    |-- latent_codes         # inverted latent_codes
|   |   |    |   |-- 00_000025.npy
|   |   |    |   |-- ..
|   |-- Zjc7Xy7aT8c
|   |   | ...
|-- id10273
|   | ...

Correct preprocessing of the dataset is important for reenacting the images; a different preprocessing pipeline will lead to poor performance.
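The short, illustrative snippet below (not part of the repository) walks the layout shown above and reports any cropped frame that is missing its inverted latent code:

from pathlib import Path

# Assumes the directory layout shown above; adjust the root path accordingly.
root = Path("path/to/voxdataset")
for frame in root.glob("*/*/frames_cropped/*.png"):
    video_dir = frame.parents[1]
    latent = video_dir / "inversion" / "latent_codes" / (frame.stem + ".npy")
    if not latent.exists():
        print(f"Missing latent code for {frame}")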

Training

To train our model, make sure the required models are downloaded and saved under the ./pretrained_models directory and that the training and test data are organized as described above. Please check run_trainer.py and ./libs/configs/config_arguments.py for the available training arguments.

Example of training using paired data:

python run_trainer.py \
--experiment_path ./training_attempts/exp_v00 \
--train_dataset_path path_to_training_dataset \
--test_dataset_path path_to_test_dataset \
--training_method paired
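For intuition only (this is not the repository's data loader), a "paired" sample can be thought of as two frames drawn from the same video, so the source and target share identity and differ only in pose and expression. The paths below reuse the example layout shown earlier and are purely illustrative:

import random
from pathlib import Path

# Hypothetical pairing: pick two frames from the same video folder (same identity).
video_dir = Path("path/to/voxdataset/id10271/37nktPRUJ58")
frames = sorted((video_dir / "frames_cropped").glob("*.png"))
source_frame, target_frame = random.sample(frames, 2)  # same identity, different pose/expression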

Inference

Download our pretrained A-matrix model and save it under the ./pretrained_models directory.

Facial image editing:

Given an input image or latent code, edit a single facial attribute corresponding to one of our learned directions.

python run_facial_editing.py \
  --source_path ./inference_examples/0002775.png \
  --output_path ./results/facial_editing \
  --directions 0 1 2 3 4 \
  --save_gif \
  --optimize_generator

Face reenactment (self or cross):

Given a source identity and a target video, reenact the source face. The source and target faces may share the same identity (self-reenactment) or belong to different identities (cross-subject reenactment).

python run_inference.py \
  --source_path ./inference_examples/0002775.png \
  --target_path ./inference_examples/lWOTF8SdzJw#2614-2801.mp4 \
  --output_path ./results/ \
  --save_video

Citation

[1] Stella Bounareli, Vasileios Argyriou, and Georgios Tzimiropoulos. Finding Directions in GAN's Latent Space for Neural Face Reenactment. In British Machine Vision Conference (BMVC), 2022.

Bibtex entry:

@article{bounareli2022finding,
  title={Finding Directions in GAN's Latent Space for Neural Face Reenactment},
  author={Bounareli, Stella and Argyriou, Vasileios and Tzimiropoulos, Georgios},
  journal={British Machine Vision Conference (BMVC)},
  year={2022}
}
