
[ICCV 2023] StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces

License: Other

Python 8.57% Jupyter Notebook 90.87% C++ 0.07% Cuda 0.50%
face face-editing face-manipulation stylegan2

styleganex's Introduction

StyleGANEX - Official PyTorch Implementation

teaser2.mp4

This repository provides the official PyTorch implementation for the following paper:

StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces
Shuai Yang, Liming Jiang, Ziwei Liu and Chen Change Loy
In ICCV 2023.
Project Page | Paper | Supplementary Video


Abstract: Recent advances in face manipulation using StyleGAN have produced impressive results. However, StyleGAN is inherently limited to cropped aligned faces at a fixed image resolution it is pre-trained on. In this paper, we propose a simple and effective solution to this limitation by using dilated convolutions to rescale the receptive fields of shallow layers in StyleGAN, without altering any model parameters. This allows fixed-size small features at shallow layers to be extended into larger ones that can accommodate variable resolutions, making them more robust in characterizing unaligned faces. To enable real face inversion and manipulation, we introduce a corresponding encoder that provides the first-layer feature of the extended StyleGAN in addition to the latent style code. We validate the effectiveness of our method using unaligned face inputs of various resolutions in a diverse set of face manipulation tasks, including facial attribute editing, super-resolution, sketch/mask-to-face translation, and face toonification.

Features:

  • Support for Unaligned Faces: StyleGANEX can manipulate normal field-of-view face images and videos.
  • Compatibility: StyleGANEX can directly load pre-trained StyleGAN parameters without retraining.
  • Flexible Manipulation: StyleGANEX retains the style representation and editing ability of StyleGAN.

overview

Updates

  • [07/2023] Training code is released.
  • [07/2023] The paper is accepted to ICCV 2023!
  • [03/2023] Integrated to 🤗 Hugging Face. Enjoy the web demo!
  • [03/2023] Inference code is released.
  • [03/2023] This website is created.

Installation

Clone this repo:

git clone https://github.com/williamyang1991/StyleGANEX.git
cd StyleGANEX

Dependencies:

We have tested on the following (an example install command is given after the list):

  • CUDA 10.1
  • PyTorch 1.7.1
  • Pillow 8.3.1; Matplotlib 3.4.2; opencv-python 4.5.3; tqdm 4.61.2; Ninja 1.10.2; dlib 19.24.0; gradio 3.4
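
To reproduce this environment, a possible starting point is the commands below (a sketch; pin the remaining packages to the tested versions listed above if you hit compatibility issues, and pick the torch build that matches your CUDA version):

pip install torch==1.7.1
pip install pillow matplotlib opencv-python tqdm ninja dlib gradio==3.4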

(1) Inference

Inference Notebook


To help users get started, we provide a Jupyter notebook found in ./inference_playground.ipynb that allows one to visualize the performance of StyleGANEX. The notebook will download the necessary pretrained models and run inference on the images found in ./data/.

Gradio demo

We also provide a UI for testing StyleGANEX that is built with gradio. Running the following command in a terminal will launch the demo:

python app_gradio.py

This demo is also hosted on Hugging Face.

Pre-trained Models

Pre-trained models can be downloaded from Google Drive, Baidu Cloud (access code: luck) or Hugging Face:

| Task | Model | Description |
| --- | --- | --- |
| Inversion | styleganex_inversion.pt | pre-trained model for StyleGANEX inversion |
| Image translation | styleganex_sr32.pt | pre-trained model specially for 32x face super resolution |
| | styleganex_sr.pt | pre-trained model for 4x-48x face super resolution |
| | styleganex_sketch2face.pt | pre-trained model for sketch-to-face translation |
| | styleganex_mask2face.pt | pre-trained model for parsing map-to-face translation |
| Video editing | styleganex_edit_hair.pt | pre-trained model for hair color editing on videos |
| | styleganex_edit_age.pt | pre-trained model for age editing on videos |
| | styleganex_toonify_cartoon.pt | pre-trained Cartoon model for video face toonification |
| | styleganex_toonify_arcane.pt | pre-trained Arcane model for video face toonification |
| | styleganex_toonify_pixar.pt | pre-trained Pixar model for video face toonification |
| Supporting model | faceparsing.pth | BiSeNet for face parsing from face-parsing.PyTorch |

We suggest placing the downloaded models in ./pretrained_models/.
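
Assuming all models are downloaded, the folder would roughly look like this (file names taken from the table above; you only need the ones for the tasks you plan to run):

pretrained_models/
├── styleganex_inversion.pt
├── styleganex_sr32.pt
├── styleganex_sr.pt
├── styleganex_sketch2face.pt
├── styleganex_mask2face.pt
├── styleganex_edit_hair.pt
├── styleganex_edit_age.pt
├── styleganex_toonify_cartoon.pt
├── styleganex_toonify_arcane.pt
├── styleganex_toonify_pixar.pt
└── faceparsing.pth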

StyleGANEX Inversion

We can embed a face image into the latent space of StyleGANEX to obtain its w+ latent code and the first-layer feature f with inversion.py.

python inversion.py --ckpt STYLEGANEX_MODEL_PATH --data_path FACE_IMAGE_PATH

The results are saved in the folder ./output/, containing a reconstructed image FILE_NAME_inversion.jpg and a FILE_NAME_inversion.pt file. You can obtain the w+ latent code and the first-layer feature f with:

import torch

device = 'cuda'  # or 'cpu'
latents = torch.load('./output/FILE_NAME_inversion.pt')
wplus_hat = latents['wplus'].to(device)  # w+ latent code
f_hat = [latents['f'][0].to(device)]     # first-layer feature f

The ./inference_playground.ipynb provides some face editing examples based on wplus_hat and f_hat.

Image Translation

image_translation.py supports face super-resolution, sketch-to-face translation and parsing map-to-face translation.

python image_translation.py --ckpt STYLEGANEX_MODEL_PATH --data_path FACE_INPUT_PATH

The results are saved in the folder ./output/.

Additional notes to consider (an example invocation follows this list):

  • --parsing_model_ckpt (default: pretrained_models/faceparsing.pth): path to the pre-trained parsing model
  • --resize_factor (default: 32): super resolution resize factor
  • --number (default: 4): output number of multi-modal translation (for sketch/mask-to-face translation task)
  • --use_raw_data (default: False):
    • if not specified, apply possible pre-processing to the input data
      • For styleganex_sr/sr32.pt, the input face image, e.g., ./data/ILip77SbmOE.png, will be downsampled based on --resize_factor. The downsampled image will also be saved in ./output/.
      • For styleganex_sketch2face.pt, no pre-processing will be applied.
      • For styleganex_mask2face.pt, the input face image, e.g., ./data/ILip77SbmOE.png, will be transformed into a parsing map. The parsing map and its visualization will also be saved in ./output/.
    • if specified, directly load input data without pre-processing
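
For example, a 32x face super-resolution run could look like the following (a sketch; the checkpoint and input paths are assumptions based on the suggested layout above):

python image_translation.py \
--ckpt pretrained_models/styleganex_sr32.pt \
--data_path ./data/ILip77SbmOE.png \
--resize_factor 32

For sketch/mask-to-face translation, swap in the corresponding checkpoint and input, and use --number to control how many outputs are sampled.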

Video Editing

video_editing.py supports video facial attribute editing and video face toonification.

python video_editing.py --ckpt STYLEGANEX_MODEL_PATH --data_path FACE_INPUT_PATH

The results are saved in the folder ./output/.

Additional notes to consider (an example invocation follows this list):

  • --data_path: the input can be either an image or a video.
  • --scale_factor: for the attribute editing task (styleganex_edit_hair/age), controls the editing degree.
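
For instance, a hair color editing run on a video might look like this (a sketch; the checkpoint path is assumed as above, FACE_VIDEO_PATH is a placeholder, and the scale_factor value is only illustrative):

python video_editing.py \
--ckpt pretrained_models/styleganex_edit_hair.pt \
--data_path FACE_VIDEO_PATH \
--scale_factor 1.5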

(2) Training

Preparing your Data

  • As with pSp, we provide support for numerous datasets and experiments (encoding, translation, etc.).
    • Refer to configs/paths_config.py to define the necessary data paths and model paths for training and evaluation.
    • Refer to configs/transforms_config.py for the transforms defined for each dataset/experiment.
    • Finally, refer to configs/data_configs.py for the source/target data paths for the train and test sets as well as the transforms.
  • If you wish to experiment with your own dataset, you can simply make the necessary adjustments (a sketch follows the FFHQ example below) in
    • data_configs.py to define your data paths.
    • transforms_config.py to define your own data transforms.

As an example, assume we wish to run encoding using ffhq (dataset_type=ffhq_encode). We first go to configs/paths_config.py and define:

dataset_paths = {
    'ffhq': '/path/to/ffhq/realign320x320',
    'ffhq_test': '/path/to/ffhq/realign320x320_test'
}

The transforms for the experiment are defined in the class EncodeTransforms in configs/transforms_config.py.
Finally, in configs/data_configs.py, we define:

DATASETS = {
    'ffhq_encode': {
        'transforms': transforms_config.EncodeTransforms,
        'train_source_root': dataset_paths['ffhq'],
        'train_target_root': dataset_paths['ffhq'],
        'test_source_root': dataset_paths['ffhq_test'],
        'test_target_root': dataset_paths['ffhq_test'],
    },
}

When defining our datasets, we will take the values in the above dictionary.
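
As a sketch of the same pattern for your own data (the my_data keys, paths, and the my_data_encode experiment name below are hypothetical placeholders):

# configs/paths_config.py  (hypothetical keys and paths)
dataset_paths = {
    # ... existing entries ...
    'my_data': '/path/to/my_data/train',
    'my_data_test': '/path/to/my_data/test',
}

# configs/data_configs.py  (hypothetical experiment name)
DATASETS = {
    # ... existing entries ...
    'my_data_encode': {
        'transforms': transforms_config.EncodeTransforms,  # or your own class from transforms_config.py
        'train_source_root': dataset_paths['my_data'],
        'train_target_root': dataset_paths['my_data'],
        'test_source_root': dataset_paths['my_data_test'],
        'test_target_root': dataset_paths['my_data_test'],
    },
}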

The 1280x1280 FFHQ images can be obtained with the modified version of the official FFHQ download script:

  • Download the in-the-wild images with python script/download_ffhq1280.py --wilds
  • Reproduce the aligned 1280×1280 images with python script/download_ffhq1280.py --align
  • 320x320 FFHQ images can be obtained by setting output_size=320, transform_size=1280 in Line 272 of download_ffhq1280.py (see the sketch below)
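
A hedged illustration of that change, assuming download_ffhq1280.py keeps the recreate_aligned_images interface of the official FFHQ download script (the function name, argument names, and output directory below are assumptions, not the verified contents of Line 272):

# download_ffhq1280.py, around Line 272 (assumed to mirror the official FFHQ script)
recreate_aligned_images(json_data,
                        dst_dir='realign320x320',  # assumed output directory for the 320x320 crops
                        output_size=320,           # aligned crop size
                        transform_size=1280)       # intermediate transform resolution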

Downloading supporting models

Please download the following pre-trained models to support the training of StyleGANEX:

| Path | Description |
| --- | --- |
| original_stylegan | StyleGAN trained with the FFHQ dataset |
| toonify_model | StyleGAN finetuned on cartoon datasets for image toonification (cartoon, pixar, arcane) |
| original_psp_encoder | pSp trained with the FFHQ dataset for StyleGAN inversion |
| pretrained_encoder | StyleGANEX encoder pretrained with synthetic data for StyleGAN inversion |
| styleganex_encoder | StyleGANEX encoder trained with the FFHQ dataset for StyleGANEX inversion |
| editing_vector | Editing vectors for editing face attributes (age, hair color) |
| augmentation_vector | Editing vectors for data augmentation |

The main training script can be found in scripts/train.py.
Intermediate training results are saved to opts.exp_dir. This includes checkpoints, train outputs, and test outputs.

Training styleganex

Note: Our default code is a CPU-compatible version. You can switch to a more efficient version that uses the C++/CUDA extension. To do so, change models.stylegan2.op to models.stylegan2.op_old wherever the following import appears:

from models.stylegan2.op import FusedLeakyReLU, fused_leaky_relu, upfirdn2d
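
After the swap, the import would read as follows (the only change is the module name, per the note above):

from models.stylegan2.op_old import FusedLeakyReLU, fused_leaky_relu, upfirdn2d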

Training the styleganex encoder

First, pretrain the encoder on synthetic 1024x1024 images. You can download our pretrained encoder here.

python scripts/pretrain.py \
--exp_dir=/path/to/experiment \
--ckpt=/path/to/original_psp_encoder \
--max_steps=2000

Then, finetune the encoder on real 1280x1280 FFHQ images, starting from the pretrained encoder:

python scripts/train.py \
--dataset_type=ffhq_encode \
--exp_dir=/path/to/experiment \
--checkpoint_path=/path/to/pretrained_encoder \
--max_steps=100000 \
--workers=8 \
--batch_size=8 \
--val_interval=2500 \
--save_interval=50000 \
--start_from_latent_avg \
--id_lambda=0.1 \
--w_norm_lambda=0.001 \
--affine_augment \
--random_crop \
--crop_face

Sketch to Face

python scripts/train.py \
--dataset_type=ffhq_sketch_to_face \
--exp_dir=/path/to/experiment \
--stylegan_weights=/path/to/original_stylegan \
--max_steps=100000 \
--workers=8 \
--batch_size=8 \
--val_interval=2500 \
--save_interval=10000 \
--start_from_latent_avg \
--w_norm_lambda=0.005 \
--affine_augment \
--random_crop \
--crop_face \
--use_skip \
--skip_max_layer=1 \
--label_nc=1 \
--input_nc=1 \
--use_latent_mask

Segmentation Map to Face

python scripts/train.py \
--dataset_type=ffhq_seg_to_face \
--exp_dir=/path/to/experiment \
--stylegan_weights=/path/to/original_stylegan \
--max_steps=100000 \
--workers=8 \
--batch_size=8 \
--val_interval=2500 \
--save_interval=10000 \
--start_from_latent_avg \
--w_norm_lambda=0.005 \
--affine_augment \
--random_crop \
--crop_face \
--use_skip \
--skip_max_layer=2 \
--label_nc=19 \
--input_nc=19 \
--use_latent_mask 

Super Resolution

python scripts/train.py \
--dataset_type=ffhq_super_resolution \
--exp_dir=/path/to/experiment \
--checkpoint_path=/path/to/styleganex_encoder \
--max_steps=100000 \
--workers=4 \
--batch_size=4 \
--val_interval=2500 \
--save_interval=10000 \
--start_from_latent_avg \
--adv_lambda=0.1 \
--affine_augment \
--random_crop \
--crop_face \
--use_skip \
--skip_max_layer=4 \
--resize_factors=8

To train one model supporting multiple resize factors, set --skip_max_layer=2 and --resize_factors=1,2,4,8,16.

Video Editing

python scripts/train.py \
--dataset_type=ffhq_edit \
--exp_dir=/path/to/experiment \
--checkpoint_path=/path/to/styleganex_encoder \
--max_steps=100000 \
--workers=2 \
--batch_size=2 \
--val_interval=2500 \
--save_interval=10000 \
--start_from_latent_avg \
--adv_lambda=0.1 \
--tmp_lambda=30 \
--affine_augment \
--crop_face \
--use_skip \
--skip_max_layer=7 \
--editing_w_path=/path/to/editing_vector \
--direction_path=/path/to/augmentation_vector \
--use_att=1 \
--generate_training_data

Video Toonification

python scripts/train.py \
--dataset_type=toonify \
--exp_dir=/path/to/experiment \
--checkpoint_path=/path/to/styleganex_encoder \
--max_steps=55000 \
--workers=2 \
--batch_size=2 \
--val_interval=2500 \
--save_interval=10000 \
--start_from_latent_avg \
--adv_lambda=0.1 \
--tmp_lambda=30 \
--affine_augment \
--crop_face \
--use_skip \
--skip_max_layer=7 \
--toonify_weights=/path/to/toonify_model

Additional Notes

  • See options/train_options.py for all training-specific flags.
  • If you wish to generate images from segmentation maps, please specify --label_nc=N and --input_nc=N where N is the number of semantic categories.
  • Similarly, for generating images from sketches, please specify --label_nc=1 and --input_nc=1.
  • Specifying --label_nc=0 (the default value) will directly use the RGB colors as input.

(3) Results

Overview of StyleGANEX inversion and facial attribute/style editing on unaligned faces:

result

Video facial attribute editing:

part2.mp4

Video face toonification:

part3.mp4

Citation

If you find this work useful for your research, please consider citing our paper:

@inproceedings{yang2023styleganex,
  title = {StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces},
  author = {Yang, Shuai and Jiang, Liming and Liu, Ziwei and Loy, Chen Change},
  booktitle = {ICCV},
  year = {2023},
}

Acknowledgments

The code is mainly developed based on stylegan2-pytorch, pixel2style2pixel and VToonify.

styleganex's People

Contributors

endlesssora, williamyang1991


styleganex's Issues

inversion

Hi, I'm interested in your work. Here are my inversion results; the inversion is a bit fuzzy (blocky pixels). Is this normal?
(attached images: 00000 and its inversion, 00001 and its inversion)

Edit Vector

Thanks for the awesome work!
How can I obtain an attribute editing vector (e.g., smile, glasses, ...)?

Video output size

I generated an output video using video_editing.py, but the output size is different from the input resolution (the bottom part is cut off).
The input is a portrait video, and the head is not located at the center. Something like this:
(example portrait image attached)

Is there any way to get the same resolution as the original input? It would be great if you could point out the lines that need to be changed...

Always thank you,

Training code

Thanks for your awesome work! Could you let us know when the training code will be released?

Problem with launching Gradio on Windows

I installed everything correctly and the other scripts work, but Gradio doesn't. I'm on Windows.
Traceback (most recent call last):
  File "app_gradio.py", line 97, in <module>
    main()
  File "app_gradio.py", line 75, in main
    create_demo_inversion(model.process_inversion, allow_optimization=True)
  File "X:\StyleGANEX\webUI\app_task.py", line 313, in create_demo_inversion
    api_name='inversion')
  File "X:\StyleGANEX\venv\lib\site-packages\gradio\events.py", line 157, in __call__
    trigger_only_on_success=self.trigger_only_on_success,
  File "X:\StyleGANEX\venv\lib\site-packages\gradio\blocks.py", line 225, in set_event_trigger
    check_function_inputs_match(fn, inputs, inputs_as_dict)
  File "X:\StyleGANEX\venv\lib\site-packages\gradio\utils.py", line 749, in check_function_inputs_match
    parameter_types = get_type_hints(fn)
  File "X:\StyleGANEX\venv\lib\site-packages\gradio\utils.py", line 704, in get_type_hints
    return typing.get_type_hints(fn)
  File "X:\StyleGANEX\venv\lib\typing.py", line 1013, in get_type_hints
    value = _eval_type(value, globalns, localns)
  File "X:\StyleGANEX\venv\lib\typing.py", line 263, in _eval_type
    return t._evaluate(globalns, localns)
  File "X:\StyleGANEX\venv\lib\typing.py", line 467, in _evaluate
    eval(self.__forward_code__, globalns, localns),
  File "<string>", line 1, in <module>
NameError: name '__file__' is not defined

If it helps, some files are re-downloaded whenever I try to launch Gradio:
100%|██████████| 17.5k/17.5k [00:00<00:00, 1.05MB/s]
100%|██████████| 4.01M/4.01M [00:02<00:00, 1.58MB/s]
100%|██████████| 174k/174k [00:00<00:00, 628kB/s]
100%|██████████| 4.11k/4.11k [00:00<00:00, 1.41MB/s]
100%|██████████| 42.9k/42.9k [00:00<00:00, 350kB/s]
100%|██████████| 74.1k/74.1k [00:00<00:00, 489kB/s]
100%|██████████| 1.04M/1.04M [00:00<00:00, 1.26MB/s]
100%|██████████| 901k/901k [00:00<00:00, 1.26MB/s]

These are the redownloads

Video face editing

Did you train video face editing (e.g., black hair) on aligned data?
According to the paper, it uses a synthetic dataset from StyleGAN2, or did I miss something?
(we can simply generate x and y from a random latent code w+ with StyleGAN G_0)

Video editing output size

Hi, fantastic job! I don't understand the output resolution in video editing; it looks like it tracks a single face and zooms in. What would be the best way to return to the original size in a video editing app? Could I just zoom out 2x and shift along x or y?

For my example, the original size is 1920x1080 and the output is 1920x1632.

about scale_factor

Dear Williamyang1991:

Thanks for your nice work.

I want to generate a video with a younger face; is the following command correct?

python video_editing.py --ckpt pretrained_models\styleganex_edit_age.pt --data_path data\2.mp4 --scale_factor 2

What is the real meaning of scale_factor?

Thanks.

Download failed - no file

Hi. I can't seem to download a video after it has been made in the web UI. I get "Download failed - no file" whether I try to save it as an html or an mp4 file.

Question about video editing training.

Thanks for your great work! I'm now trying to train a new editing direction: slender. I'm a little confused about the data preparation. In data_configs it shows realign320 for training, but the paper says training uses generated data. Also, are all 70,000 in-the-wild images used for training and testing?

Adjust Age

Firstly, thank you for the wonderful project.

But I am stuck when running age editing: I can't find a parameter to change the age.
Can you explain which parameter will make the subject in the image older?
Thank you.
(screenshot attached)

SyntaxError: 'return' outside function

When running:

python video_editing.py --ckpt STYLEGANEX_MODEL_PATH --data_path FACE_INPUT_PATH

I get:

File "/content/StyleGANEX/video_editing.py", line 81 return ^ SyntaxError: 'return' outside function

Unexpected end of JSON input

I am getting an "Unexpected end of JSON input" error in Google Colab when "Number of frames to toonify" is set to 1000. Also, I can't toonify more than 3 seconds of video.

Why does the 7th layer of StyleGAN2 have a resolution of 32x32?

In the paper, the authors write:
"In Fig. 2(e), the first-layer feature fails to provide enough spatial information for a valid rotation. In comparison, the 7-th layer has a higher resolution (32 × 32), making it better suited for capturing spatial information."
As far as I know, a resolution of 32x32 corresponds to the 4th layer, and 256x256 to the 7th layer.

How did you find the editing_w styles for style transfer?

I tried to apply my styles, found through StyleCLIP with shape [18, 512], to the codes variable in the pSp forward function, but they don't seem to work in the hair/age or inversion (after optimization) networks, even though the generator is a standard StyleGAN. It seems like first_layer_feats from the encoder suppresses my StyleCLIP edit. However, I see that random styles obtained through the mapping network from 512-dimensional random vectors work in your example. Can I use StyleCLIP or somehow obtain my own styles?

Traceback error

I installed all the requirements, and when I run python app_gradio.py

I get this error:

Traceback (most recent call last):
  File "app_gradio.py", line 9, in <module>
    from webUI.styleganex_model import Model
  File "E:\Ai__Project\StyleGANEX\webUI\styleganex_model.py", line 9, in <module>
    import dlib
  File "C:\Users\ammar\anaconda3\envs\styleganex\lib\site-packages\dlib\__init__.py", line 19, in <module>
    from _dlib_pybind11 import *
ImportError: DLL load failed while importing _dlib_pybind11: The specified module could not be found.

Face attribute editing

Awesome work. VToonify was amazing, but StyleGANEX is even better.

Screenshot 2023-08-10 at 1 52 51 PM

Especially, I am interested in face attribute editing on video, but I can only test age and hair editing. As shown in the example (open mouth, smile, gender swap, etc.), how can I edit the other face attributes?

Is training StyleGANEX necessary?
It would be great if you could explain, step by step, how to obtain different face attribute edits.

Thank you! :)

How to convert the model to onnx?

from argparse import Namespace
from models.psp import pSp
import torch.nn as nn
import torch
import onnx

#Function to Convert to ONNX 
def Convert_ONNX(): 
    device = "cuda" if torch.cuda.is_available() else "cpu"
    ckpt_path = 'pretrained_models/styleganex_toonify_pixar.pt'
    ckpt = torch.load(ckpt_path, map_location='cpu')
    opts = ckpt['opts']
    opts['checkpoint_path'] = ckpt_path
    opts['device'] =  device
    opts = Namespace(**opts)
    torch_model = pSp(opts)
    torch_model.cpu()

    output_onnx = str("styleganex_toonify_pixar.onnx")

    # set the model to inference mode 
    torch_model.eval() 

    # The exported model will accept inputs of size [batch_size, 3, 224, 224]; batch_size, height and width are dynamic (see dynamic_axes below).
    batch_size = 1 
    # Let's create a dummy input tensor
    channel = 3
    height = 224
    width = 224
    torch_input = torch.randn(batch_size, channel, height, width, requires_grad=True)

    dynamic_axes= {
        'input0': {0: 'batch', 2: 'height', 3: 'width'},
        'output0': {0: 'batch', 2: 'height', 3: 'width'}
    }

    # Export the model
    # """ 
    torch.onnx.export(
         torch_model,         # model being run 
         torch_input,       # model input (or a tuple for multiple inputs) 
         output_onnx,       # where to save the model  
         export_params=True,  # store the trained parameter weights inside the model file 
         opset_version=15,    # the ONNX version to export the model to 
         # WARNING: DNN inference with torch>=1.12 may require do_constant_folding=False
         do_constant_folding=True,  # whether to execute constant folding for optimization
         input_names = ['input0'],   # the model's input names 
         output_names = ['output0'], # the model's output names 
         dynamic_axes = dynamic_axes)
    # """
    
    print(" ") 
    print('Model has been converted to ONNX')

    # Checks
    onnx_model = onnx.load(output_onnx)  # load onnx model
    onnx.checker.check_model(onnx_model)  # check onnx model

    print('ONNX export success, saved as %s' % output_onnx)


def main():
    Convert_ONNX()

if __name__ == "__main__":
    main()

When I run this code, it shows the error below:

torch.onnx.errors.SymbolicValueError: Unsupported: ONNX export of convolution for kernel of unknown shape. [Caused by the value '1865 defined in (%1865 : Float(*, *, *, *, strides=[401408, 784, 28, 1], requires_grad=1, device=cpu) = onnx::Reshape[allowzero=0](%1803, %1864), scope: models.psp.pSp::/models.stylegan2.model.Generator::decoder/models.stylegan2.model.StyledConv::conv1/models.stylegan2.model.ModulatedConv2d::conv # /home/yxy/github/StyleGANEX/models/stylegan2/model.py:297:0
)' (type 'Tensor') in the TorchScript graph. The containing node has kind 'onnx::Reshape'.]

I have searched some related docs; they say that we cannot use dynamic shapes when converting to ONNX, but the PyTorch docs don't mention this.

Some confusion

What is the function of editing_w? Is it trained from data of different ages?

Or is styleganex_edit_age.pt trained from data of different ages?

How is the age change learned? What is the principle? If I want to train a model with my own data of different ages, what should I do?

Video toonify training

Thanks for your great work. What is the data organization for video toonification (e.g., what goes in 'toonify_in': 'data/train/pixar/trainA/'), and for the other styles like arcane, comic, cartoon, ukiyo-e, etc.? The toonify dataset in my download was not organized. Thank you for your reply.

Image edit

Great job! I have some questions about the code. If I edit images, do I need to do the following? Will removing it have an impact on the results?
(WeChat screenshot attached)
