styleanime's Introduction

Parsing-Conditioned Anime Translation: A New Dataset and Method (ACM TOG)

paper

Anime is an abstract art form that is substantially different from the human portrait, leading to a challenging misaligned image translation problem that is beyond the capability of existing methods. This can be boiled down to a highly ambiguous unconstrained translation between two domains. To this end, we design a new anime translation framework by deriving the prior knowledge of a pre-trained StyleGAN model. We introduce disentangled encoders to separately embed structure and appearance information into the same latent code, governed by four tailored losses. Moreover, we develop a FaceBank aggregation method that leverages the generated data of the StyleGAN, anchoring the prediction to produce in-domain animes. To empower our model and promote the research of anime translation, we propose the first anime portrait parsing dataset, Danbooru-Parsing, containing 4,921 densely labeled images across 17 classes. This dataset connects the face semantics with appearances, enabling our new constrained translation setting. We further show the editability of our results, and extend our method to manga images, by generating the first manga parsing pseudo data. Extensive experiments demonstrate the value of our new dataset and method, resulting in the first feasible solution on anime translation.

Description

This is the official implementation of our paper "Parsing-Conditioned Anime Translation: A New Dataset and Method" (ACM TOG).

Danbooru-Parsing Dataset

We train the anime parsing model with face-parsing; the trained model can be downloaded here.

Label List
0:'background' 1:'skin' 2:'l_brow' 3:'r_brow'
4:'l_eye' 5:'r_eye' 6:'eye_g' 7:'l_ear'
8:'r_ear' 9:'ear_r' 10:'nose' 11:'mouth'
12:'u_lip' 13:'l_lip' 14:'neck' 15:'neck_l'
16:'cloth' 17:'hair' 18:'hat'
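
The label list above has 19 entries (IDs 0 to 18), which matches the `--label_nc=19` and `--input_nc=19` values in the training command. As a minimal sketch, assuming a parsing mask is stored as a 2D grid of integer label IDs and that the encoder consumes one channel per label (neither is stated in this README), the mapping and a one-hot expansion could look like this. The helper name `one_hot_mask` is hypothetical and not taken from the repository.

```python
# Label IDs exactly as listed above.
LABELS = {
    0: "background", 1: "skin", 2: "l_brow", 3: "r_brow",
    4: "l_eye", 5: "r_eye", 6: "eye_g", 7: "l_ear",
    8: "r_ear", 9: "ear_r", 10: "nose", 11: "mouth",
    12: "u_lip", 13: "l_lip", 14: "neck", 15: "neck_l",
    16: "cloth", 17: "hair", 18: "hat",
}


def one_hot_mask(mask, num_classes=len(LABELS)):
    """Expand an H x W grid of label IDs into num_classes binary H x W planes.

    Hypothetical helper for illustration only.
    """
    for row in mask:
        for value in row:
            if not 0 <= value < num_classes:
                raise ValueError(f"label id {value} is out of range")
    return [
        [[1.0 if value == c else 0.0 for value in row] for row in mask]
        for c in range(num_classes)
    ]


# Toy 2x2 mask: background, skin / hair, hat.
toy_mask = [[0, 1], [17, 18]]
planes = one_hot_mask(toy_mask)
print(len(planes), "planes;", LABELS[17], "plane:", planes[17])
```

In practice a tensor library would do this expansion in one call; plain lists are used here only to keep the sketch free of dependencies.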

Download labeled anime dataset: Danbooru-Parsing Dataset

Pretrained Models

Please download the pre-trained models from the following links.

Path | Description
IR-SE50 Model | Pretrained IR-SE50 model taken from TreB1eN for use in our ID loss during pSp training.
MTCNN | Weights for the MTCNN model taken from TreB1eN for use in ID similarity metric computation. (Unpack the tar.gz to extract the 3 model weights.)
CurricularFace Backbone | Pretrained CurricularFace model taken from HuangYG123 for use in ID similarity metric computation.
Anime StyleGAN2 Model | StyleGAN2 model fine-tuned on our anime dataset, using the StyleGAN2 code from rosinality.
Average Latent | Average latent of the StyleGAN2 model pretrained on anime.
Bank List | Latent list of the anime StyleGAN2 model, used for FaceBank aggregation.
StyleAnime | Our pretrained StyleAnime model (portrait2anime).

The pretrained models should be saved to the directory pretrained_models.

Preparing Data

Please first download the training datasets:
anime
celeba

Then go to configs/paths_config.py and define:

dataset_paths = {
	'anime_train_segmentation': 'path/anime/anime_seg_train',
	'anime_test_segmentation': 'path/anime/anime_seg_test_68',
	'anime_train': 'path/anime/anime_face_train',
	'anime_test': 'path/anime_face_test_68',
    
	'face_train_segmentation': 'path/celeba/celeba_seg_train',
	'face_test_segmentation': 'path/celeba/celeba_seg_test_68',
	'face_train': 'path/celeba/celeba_face_train',
	'face_test': 'path/celeba/celeba_face_test_68',
}
model_paths = {
	'anime_ffhq': 'pretrained_models/stylegan2_anime_pretrained.pt',
	'ir_se50': 'pretrained_models/model_ir_se50.pth',
	'circular_face': 'pretrained_models/CurricularFace_Backbone.pth',
	'mtcnn_pnet': 'pretrained_models/mtcnn/pnet.npy',
	'mtcnn_rnet': 'pretrained_models/mtcnn/rnet.npy',
	'mtcnn_onet': 'pretrained_models/mtcnn/onet.npy',
}
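
Before launching training, it can help to confirm that every configured path actually exists. The following is a minimal sketch, not part of the repository: `missing_paths` is a hypothetical helper, and the example dictionary copies only two entries from `model_paths` above. In a real checkout you would pass in the `dataset_paths` and `model_paths` dictionaries from configs/paths_config.py instead.

```python
import os


def missing_paths(paths):
    """Return the sorted keys whose configured path does not exist on disk."""
    return sorted(key for key, path in paths.items() if not os.path.exists(path))


# Two entries copied from model_paths above, for illustration.
example_paths = {
    "ir_se50": "pretrained_models/model_ir_se50.pth",
    "mtcnn_pnet": "pretrained_models/mtcnn/pnet.npy",
}

for key in missing_paths(example_paths):
    print(f"missing: {key} -> {example_paths[key]}")
```

Relative paths are resolved against the current working directory, so run the check from the repository root.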

Training

The main training script can be found in scripts/train.py.

python scripts/train.py \
--exp_dir=/path/output \
--batch_size=1 \
--val_interval=2500 \
--save_interval=5000 \
--encoder_type=GradualStyleEncoder \
--start_from_latent_avg \
--learning_rate=0.0001 \
--lpips_lambda=2 --l2_lambda=2.5 \
--hm_lambda=0.1 \
--w_norm_lambda=0.005 \
--w_norm_lambda_1=0.005 \
--loss_adv_weight=0.1 \
--loss_adv_weight_latent=0.1 \
--label_nc=19 \
--input_nc=19 \
--test_batch_size=1
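
In pSp-style training code, flags ending in `_lambda` usually scale individual loss terms before the terms are summed into one objective. The sketch below assumes that convention applies here; the README does not show the loss code, so the term names (`lpips`, `l2`, `hm`, `w_norm`) are hypothetical labels derived from the flag names, and the per-term values are made up purely to show the arithmetic.

```python
# Weights copied from the training command above. The keys are hypothetical
# labels for illustration, not identifiers from the repository.
LAMBDAS = {"lpips": 2.0, "l2": 2.5, "hm": 0.1, "w_norm": 0.005}


def weighted_total(terms, lambdas=LAMBDAS):
    """Sum each available loss term scaled by its weight.

    Terms with a non-positive weight, or with no value supplied, are skipped.
    """
    total = 0.0
    for name, weight in lambdas.items():
        if weight > 0 and name in terms:
            total += weight * terms[name]
    return total


# Made-up per-term loss values, only to demonstrate the weighting.
example_terms = {"lpips": 0.5, "l2": 0.2, "hm": 1.0, "w_norm": 10.0}
print(weighted_total(example_terms))
```

Under this reading, raising a weight such as `--l2_lambda` increases the influence of that term relative to the others, while setting it to 0 disables the term.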

Testing

python scripts/inference_latent.py --exp_dir=/path/portrait2anime_results --checkpoint_path=./pretrained_models/best_model.pt --test_batch_size=1

We assume that all pretrained models are downloaded and saved to the directory pretrained_models.

Acknowledgments

This code borrows heavily from pixel2style2pixel.

To Do List

  • Release the Anime2Portrait code

Citation

If you use this code for your research, please cite our paper, Parsing-Conditioned Anime Translation: A New Dataset and Method:

@article{li2023parsing,
  title={Parsing-Conditioned Anime Translation: A New Dataset and Method},
  author={Li, Zhansheng and Xu, Yangyang and Zhao, Nanxuan and Zhou, Yang and Liu, Yongtuo and Lin, Dahua and He, Shengfeng},
  journal={ACM Transactions on Graphics},
  year={2023},
  publisher={ACM New York, NY}
}
