stylemask's Introduction

StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment

Authors' official PyTorch implementation of StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment. This paper has been accepted for publication at the IEEE Conference on Automatic Face and Gesture Recognition (FG), 2023.

StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment
Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, Georgios Tzimiropoulos

Abstract: In this paper we address the problem of neural face reenactment, where, given a pair of a source and a target facial image, we need to transfer the target's pose (defined as the head pose and its facial expressions) to the source image, by preserving at the same time the source's identity characteristics (e.g., facial shape, hair style, etc), even in the challenging case where the source and the target faces belong to different identities. In doing so, we address some of the limitations of the state-of-the-art works, namely, a) that they depend on paired training data (i.e., source and target faces have the same identity), b) that they rely on labeled data during inference, and c) that they do not preserve identity in large head pose changes. More specifically, we propose a framework that, using unpaired randomly generated facial images, learns to disentangle the identity characteristics of the face from its pose by incorporating the recently introduced style space $\mathcal{S}$ of StyleGAN2, a latent representation space that exhibits remarkable disentanglement properties. By capitalizing on this, we learn to successfully mix a pair of source and target style codes using supervision from a 3D model. The resulting latent code, that is subsequently used for reenactment, consists of latent units corresponding to the facial pose of the target only and of units corresponding to the identity of the source only, leading to notable improvement in the reenactment performance compared to recent state-of-the-art methods. In comparison to state of the art, we quantitatively and qualitatively show that the proposed method produces higher quality results even on extreme pose variations. Finally, we report results on real images by first embedding them on the latent space of the pretrained generator.

Face Reenactment Results


Installation

  • Python 3.5+
  • Linux
  • NVIDIA GPU + CUDA cuDNN
  • PyTorch (>= 1.5)
  • PyTorch3D
  • DECA

We recommend running this repository using Anaconda.

conda create -n python38 python=3.8
conda activate python38
conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=11.0 -c pytorch
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install pytorch3d -c pytorch3d 
pip install -r requirements.txt
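
After installation, a quick sanity check (a minimal sketch, assuming the conda environment created above) verifies that PyTorch sees the GPU and that PyTorch3D imports cleanly:

import torch
import pytorch3d

# both imports should succeed, and CUDA must be available for inference/training
print(torch.__version__, torch.cuda.is_available())
print(pytorch3d.__version__)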

Pretrained Models

To use our method, make sure to download the required models and save them under the ./pretrained_models directory.

  • StyleGAN2-FFHQ-1024: Official StyleGAN2 model trained on FFHQ at 1024x1024 output resolution, converted to PyTorch using rosinality's implementation.
  • e4e-FFHQ-1024: Official e4e inversion model trained on the FFHQ dataset, taken from e4e. When using real images, use this model to invert them into the latent space of StyleGAN2.
  • stylemask-model: Our pretrained StyleMask model, trained on FFHQ at 1024x1024 output resolution.

Inference

Given a pair of source and target images or latent codes, transfer the target's facial pose onto the source face.
The source and target paths can be None (random latent code pairs are generated), latent code files, image files, or directories containing images or latent codes. If the input paths are real images, the script uses the e4e inversion model to obtain the inverted latent codes.

python run_inference.py --output_path ./results --save_grid
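
For real input images, the same script accepts explicit source and target paths via the --source_path and --target_path flags (also mentioned in the issues below); the directory names here are placeholders:

python run_inference.py --source_path ./source_images --target_path ./target_images --output_path ./results --save_grid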

Training

We provide additional models needed during training.

  • IR-SE50 Model: Pretrained IR-SE50 model, taken from InsightFace_Pytorch, used in our identity loss.
  • DECA models: Pretrained models taken from DECA. Extract data.tar.gz under ./libs/DECA/ (see the example command below).
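
For example, assuming data.tar.gz has been downloaded into the repository root and ./libs/DECA/ exists, it can be extracted with:

tar -xzf data.tar.gz -C ./libs/DECA/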

By default, we assume that all pretrained models are downloaded and saved to the directory ./pretrained_models.

python run_trainer.py --experiment_path ./training_attempts/exp_v00 

Citation

[1] Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, Georgios Tzimiropoulos. StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment. IEEE Conference on Automatic Face and Gesture Recognition (FG), 2023.

Bibtex entry:

@inproceedings{bounareli2022StyleMask,
  author = {Bounareli, Stella and Tzelepis, Christos and Argyriou, Vasileios and Patras, Ioannis and Tzimiropoulos, Georgios},
  title = {StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment},
  booktitle = {IEEE Conference on Automatic Face and Gesture Recognition (FG)},
  year = {2023},
}

Acknowledgment

This work was supported by the EU's Horizon 2020 programme under the H2020-951911 AI4Media project.


stylemask's Issues

Inaccurate image reconstruction results outside of the ffhq dataset

I attempted reconstruction using real photos but found that some of the results were inaccurate. I am planning to pre-train on my own dataset, and I noticed that you are using images generated from latent codes. However, StyleGAN2 cannot accurately produce style-space codes for images outside of FFHQ, so I am wondering how to solve this issue.

ValueError: stat: embedded null character in path

----- Load generator from ./pretrained_models/stylegan2-ffhq-config-f_1024.pt -----
----- Load mask network from ./pretrained_models/mask_network_1024.pt -----
----- Load e4e encoder from ./pretrained_models/e4e_ffhq_encode_1024.pt -----
Reenact 1 pairs
  0% 0/1 [00:06<?, ?it/s]
Traceback (most recent call last):
  File "run_inference.py", line 305, in <module>
    main()
  File "run_inference.py", line 299, in main
    inf.run()
  File "run_inference.py", line 220, in run
    source_img = image_to_tensor(cropped_image).unsqueeze(0).cuda()
  File "/content/StyleMask/libs/utilities/image_utils.py", line 18, in image_to_tensor
    if os.path.isfile(image_file):
  File "/usr/local/lib/python3.7/genericpath.py", line 30, in isfile
    st = os.stat(path)
ValueError: stat: embedded null character in path

How can we solve this problem?

Where to use the DECA model

Hello, thank you for your excellent work. I have a question: where is the DECA model used? The expression transfer in StyleMask does not seem to use the DECA method.

Target and Source image weighting.

Hello!
I'm using reenactment to frontalize faces, and it gives very good results. However, it changes the face a lot. I used face-similarity calculations to pick the most similar target image, but that requires a pool with many target images. So I wonder: is there a way to weight the source and target images in the reenactment process? With that, I could easily prevent this issue.
Best regards.

Can we get a new w+ vector from the three parameters style_space, latent, and G?

Hello, your work is excellent!
I have a question I'd like to ask you.
As you know, we can obtain the edited image from three parameters: the updated style vector S (the StyleGAN2 style-space vector), the e4e latent code of the original image (a w+ vector), and the StyleGAN2 generator.
So can we recover a new w+ vector from these three parameters?
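
One possible avenue, offered only as an assumption: each style-space code is produced from w+ by a per-layer learned affine transform, so an approximate w+ could be recovered by inverting each affine layer in the least-squares sense. A minimal sketch, where W_i and b_i are stand-ins for one layer's modulation weight and bias:

import torch

# hypothetical per-layer inversion: StyleGAN2 computes s_i = W_i @ w_i + b_i,
# so least squares on the affine map recovers an approximate w_i
W_i = torch.randn(512, 512)   # stand-in for the layer's learned weight
b_i = torch.randn(512)        # stand-in for the layer's learned bias
s_i = torch.randn(512)        # the (edited) style code for this layer

w_i_approx = torch.linalg.lstsq(W_i, (s_i - b_i).unsqueeze(-1)).solution.squeeze(-1)

Repeating this per layer yields an approximate w+; how faithfully it reproduces the edit depends on each layer's weight being well-conditioned.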

[Error] embedded null character in path

I tried some real images with the additional params --source_path and --target_path, then got the following error message:

----- Load generator from ./pretrained_models/stylegan2-ffhq-config-f_1024.pt -----
----- Load mask network from ./pretrained_models/mask_network_1024.pt -----
----- Load e4e encoder from ./pretrained_models/e4e_ffhq_encode_1024.pt -----
Reenact 1 pairs
  0% 0/1 [00:06<?, ?it/s]
Traceback (most recent call last):
  File "run_inference.py", line 305, in <module>
    main()
  File "run_inference.py", line 299, in main
    inf.run()
  File "run_inference.py", line 220, in run
    source_img = image_to_tensor(cropped_image).unsqueeze(0).cuda()
  File "/content/StyleMask/libs/utilities/image_utils.py", line 18, in image_to_tensor
    if os.path.isfile(image_file):
  File "/usr/local/lib/python3.7/genericpath.py", line 30, in isfile
    st = os.stat(path)
ValueError: stat: embedded null character in path

Issues with multi-gpus

Thanks for the wonderful work!

I wonder if your project supports multi-gpu settings.

When I tried to run inference on a batch of images across multiple GPUs, some exceptions popped up in StyleGAN2-related classes.

Initially, it raised a tensor device mismatch for the following code.

# truncation trick: interpolate each style toward the mean (truncation) latent
if truncation < 1:
    style_t = []
    for style in styles:
        style_t.append(
            truncation_latent + truncation * (style - truncation_latent)
        )
    styles = style_t

and it points at style - truncation_latent, seeming to claim that they are on different devices. I modified it to style.to(truncation_latent.device) - truncation_latent (see the patched loop below) and successfully skipped this error. However, I am confused about why this happened. Is it because of the for loop? Or is there a setting for it somewhere?
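
For reference, the patched loop with the device alignment just described looks like this (a workaround, not a root-cause fix):

# workaround: move each style onto truncation_latent's device before subtracting
if truncation < 1:
    style_t = []
    for style in styles:
        style = style.to(truncation_latent.device)
        style_t.append(
            truncation_latent + truncation * (style - truncation_latent)
        )
    styles = style_t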

After this, I encountered several similar cases and fixed them by calling to(some_tensor.device). Then, the following occurs.

File "/projects/stylemask/libs/utilities/utils_inference.py", line 210, in invert_image
    inverted_images, _ = generator([latent_codes], input_is_latent=True, return_latents = False, truncation= truncation, truncation_latent=trunc)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/pytorch-2.1.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/pytorch-2.1.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/projects/stylemask/libs/models/StyleGAN2/model.py", line 519, in forward
    out = self.conv1(out, latent[:, 0], noise=noise[0])
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/pytorch-2.1.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/pytorch-2.1.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/projects/stylemask/libs/models/StyleGAN2/model.py", line 332, in forward
    out = self.conv(input, style)
          ^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/pytorch-2.1.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/pytorch-2.1.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/projects/stylemask/libs/models/StyleGAN2/model.py", line 234, in forward
    style = self.modulation(style).view(batch, 1, in_channel, 1, 1)
            ^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/pytorch-2.1.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/pytorch-2.1.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/projects/stylemask/libs/models/StyleGAN2/model.py", line 154, in forward
    out = F.linear(input, self.weight * self.scale, bias=self.bias * self.lr_mul)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)

I figured that input and self.weight are on different devices. I have never encountered this before when using nn.DataParallel(), and I am not sure why self.weight, which is an nn.Parameter(), ends up on a different device from the input.

I tried to fix these by continuing to add calls like self.weight.to(input.device). Finally, it reaches

      mask_idx = self.mask_net[network_name_str](styles)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/pytorch-2.1.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/pytorch-2.1.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/projects/stylemask/libs/models/mask_predictor.py", line 23, in forward
    out = self.masknet(input)
          ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/pytorch-2.1.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/pytorch-2.1.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/pytorch-2.1.2/lib/python3.11/site-packages/torch/nn/modules/container.py", line 215, in forward
    input = module(input)
            ^^^^^^^^^^^^^
  File "/root/miniconda3/envs/pytorch-2.1.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/pytorch-2.1.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/pytorch-2.1.2/lib/python3.11/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)

It turns out that the error happens in the mask_predictor, and in that class there is just a sequential network

self.masknet = nn.Sequential(
    nn.Linear(input_dim, inner_dim, bias=True),
    nn.ReLU(),
    nn.Linear(inner_dim, output_dim, bias=True),
)

I don't know why out = self.masknet(input) would raise the device mismatch error.
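
One plausible cause, offered only as an assumption based on the self.mask_net[network_name_str] indexing above: if the per-name mask networks are stored in a plain Python dict, nn.DataParallel cannot see them and never replicates their parameters to the other GPUs. Registering them in an nn.ModuleDict avoids that, as in this hypothetical sketch:

import torch.nn as nn

class MaskPredictorBank(nn.Module):
    # hypothetical container: nn.ModuleDict registers every sub-network,
    # so DataParallel replicates their parameters onto each device
    def __init__(self, names, input_dim, inner_dim, output_dim):
        super().__init__()
        self.mask_net = nn.ModuleDict({
            name: nn.Sequential(
                nn.Linear(input_dim, inner_dim, bias=True),
                nn.ReLU(),
                nn.Linear(inner_dim, output_dim, bias=True),
            )
            for name in names
        })

    def forward(self, name, styles):
        return self.mask_net[name](styles)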

I had to omit some messages since I was running StyleMask inference inside another project. I hope the messages above explain my problem, and I look forward to your reply.

Thanks!

Colab

Will there be a Colab notebook?

DECA models broken link

I can't download the models because the link is broken. I tried to make the estimated cycles from the configs but I encountered some problems. Any chance to fix it?

code release

Dear authors, thanks for your amazing work! I really like your idea and wonder if you are planning to release the training and inference code soon.

Best

Akın

from libs.models.StyleGAN2-error

Error at line 19 of run_inference.py:
from libs.models.StyleGAN2.model import Generator as StyleGAN2Generator
I repeated the installation, confirmed the installation steps, and carefully checked that there were no installation errors.
I copied all model files to ./pretrained_models, namely:
e4e_ffhq_encode_1024.pt
mask_network_1024.pt
stylegan2-ffhq-config-f_1024.pt

I don't know where I made a mistake. Asking for help. Thank you!

Eye Blinking

I saw in a closed issue comment that video works but is limited to the range of expressions in the dataset. I wish to keep a neutral expression and change only the yaw, pitch, and roll, with no mouth movements, so this could be a good tool for me. Does it support blinking, or at the very least closed eyelids (I could use an interpolation method to close the gap between open and closed)?

Thank you for your time & repo.

AttributeError: _2D

Traceback (most recent call last):
  File "run_inference.py", line 309, in <module>
    main()
  File "run_inference.py", line 303, in main
    inf.run()
  File "run_inference.py", line 212, in run
    self.load_models(inversion)
  File "run_inference.py", line 100, in load_models
    self.fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, device='cuda')
  File "/root/anaconda3/envs/StyleMask/lib/python3.8/enum.py", line 384, in __getattr__
    raise AttributeError(name) from None
AttributeError: _2D
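
A likely cause, offered as an assumption rather than a confirmed answer: newer releases of the face_alignment package renamed LandmarksType._2D to LandmarksType.TWO_D, so the old attribute lookup fails. Pinning an older face-alignment release should also work, but a version-agnostic sketch looks like this:

import face_alignment

# resolve the 2D landmarks enum under either the old (_2D) or new (TWO_D) name
lm_2d = getattr(face_alignment.LandmarksType, '_2D', None) \
        or getattr(face_alignment.LandmarksType, 'TWO_D')
fa = face_alignment.FaceAlignment(lm_2d, device='cuda')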
