
image-inpainting-for-anime-digital-painting's Introduction

Image Inpainting for Anime Digital Painting (Under development)

This is a personal project inspired by anime-related projects and DeepCreamPy. It implements the zoom-to-inpaint model in PyTorch.

Disclaimer

We tried to train this model on rented cloud GPUs (Vast.ai) using 4 Nvidia RTX A5000 GPUs. However, training is very expensive: the model has 4.5 million parameters and needs a large number of images (10k-100k) and a large number of training steps (100k+). We decided to pause the project at the training stage, so training and everything after it is unfinished. If you choose to use this model, use it at your own risk. You can also train the model on regular images instead of anime images.

Requirements

CUDA GPU and CUDA Software

This project is written to train with CUDA, so a CUDA-capable GPU and the CUDA Toolkit are required. cuDNN is optional but can be installed to speed up training. Google Colab users should change the runtime type to GPU.
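Before starting a training run, it can help to confirm that PyTorch actually sees a CUDA device. The following is a minimal sketch, not part of the project's scripts; the function name `describe_cuda` is made up for illustration, and using the reported GPU count as a value for `--world_size` is an assumption.

```python
def describe_cuda():
    """Return a small report on whether PyTorch can see a CUDA device."""
    try:
        import torch  # imported lazily so the script still runs without PyTorch
    except ImportError:
        return {
            "torch_installed": False,
            "cuda_available": False,
            "gpu_count": 0,
            "cudnn_available": False,
        }

    cuda_ok = torch.cuda.is_available()
    return {
        "torch_installed": True,
        "cuda_available": cuda_ok,
        # Assumption: this count is a reasonable upper bound for --world_size.
        "gpu_count": torch.cuda.device_count() if cuda_ok else 0,
        "cudnn_available": torch.backends.cudnn.is_available(),
    }


report = describe_cuda()
for name, value in report.items():
    print(f"{name}: {value}")
```

If `cuda_available` is reported as False, the training scripts described below will most likely not run on that machine.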

Python Environments

  • Python >= 3.9
  • PyTorch >= 1.13.1 with CUDA support
  • torchvision >= 0.14.1 with CUDA support
  • wandb >= 0.13.7
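The minimum versions listed above can be checked with a short script. This is a rough sketch that uses only the standard library; it assumes the packages are installed under the distribution names `torch`, `torchvision` and `wandb`, and the helper names `version_tuple` and `check` are made up for illustration.

```python
import re
import sys
from importlib import metadata

# Minimum versions copied from the requirements list.
MINIMUMS = {"torch": "1.13.1", "torchvision": "0.14.1", "wandb": "0.13.7"}


def version_tuple(text):
    """Turn '1.13.1+cu117' into (1, 13, 1), ignoring local build suffixes."""
    numbers = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def check(package, minimum):
    """Return a one-line status string for one package."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package}: not installed (need >= {minimum})"
    ok = version_tuple(installed) >= version_tuple(minimum)
    status = "ok" if ok else f"need >= {minimum}"
    return f"{package}: {installed} ({status})"


python_ok = sys.version_info >= (3, 9)
print(f"python: {sys.version.split()[0]} ({'ok' if python_ok else 'need >= 3.9'})")
for package, minimum in MINIMUMS.items():
    print(check(package, minimum))
```

The check only compares version numbers; it does not verify that the installed PyTorch build has CUDA support.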

For Anaconda users, use these commands to set up the Python environment:

> conda create -n anime-inpainting
> conda activate anime-inpainting
> conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
> conda install -c conda-forge wandb

WandB setup

WandB is used to monitor the loss during training. Sign up at WandB to receive an API key. Then, on the command line, type

> wandb login

and paste the API key you received.

If you are using a Python notebook, you can use this code instead:

> import wandb
> wandb.login(key='YOUR API KEY HERE')
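To avoid pasting the key into a notebook cell, the key can be read from the `WANDB_API_KEY` environment variable, which WandB itself recognizes. The sketch below is one possible way to do this; the helper names `read_api_key` and `login_from_env` are hypothetical and not part of this project.

```python
import os


def read_api_key(env=os.environ):
    """Return the WandB API key from the environment, or None if it is unset."""
    key = env.get("WANDB_API_KEY", "").strip()
    return key or None


def login_from_env():
    """Log in to WandB with the key from WANDB_API_KEY (hypothetical helper)."""
    import wandb  # imported lazily so read_api_key works without wandb installed

    key = read_api_key()
    if key is None:
        raise RuntimeError("Set WANDB_API_KEY before calling login_from_env()")
    return wandb.login(key=key)


# Usage, once WANDB_API_KEY is set in the shell or notebook environment:
#     login_from_env()
```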

Google Colab notebook

If you're working in a Google Colab notebook, you can easily follow the example in this notebook. For additional guidance on using the command line, refer to the section below.

Training

This model consists of 3 networks: the Coarse Network, the Super Resolution Network and the Refinement Network. First, each network is pre-trained individually. After that, all networks are trained together with small masks. Lastly, all networks are trained together with larger masks.
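The schedule described above can be written out as a sequence of five commands. The sketch below only builds and prints the command lines, using the flags documented in the Pretrain and Combined train sections. The folder names, the reuse of one folder for all three pretrained models, and the choice to pass each stage's output to the next through `--load_inpaint` are assumptions made for illustration, not confirmed behavior of the scripts.

```python
import shlex

MODELS = ["coarse", "super_resolution", "refinement"]


def build_schedule(train_path="./trainset", root="./save_model"):
    """Return the five training commands as argument lists (nothing is run)."""
    # Hypothetical folder layout, one folder per stage.
    pretrain_dir = f"{root}/pretrain"
    stage_dirs = {1: f"{root}/small_mask", 2: f"{root}/large_mask"}

    commands = []
    # Stage 1: pre-train each network on its own.
    for model in MODELS:
        commands.append([
            "python", "pretrain.py",
            f"--model={model}",
            f"--train_path={train_path}",
            f"--save_model={pretrain_dir}",
        ])

    # Stages 2 and 3: combined training, small masks first, then larger masks.
    previous_dir = pretrain_dir
    for mask_type in (1, 2):
        commands.append([
            "python", "train.py",
            f"--mask_type={mask_type}",
            f"--load_inpaint={previous_dir}",  # assumption: resume from prior stage
            f"--train_path={train_path}",
            f"--save_model={stage_dirs[mask_type]}",
        ])
        previous_dir = stage_dirs[mask_type]
    return commands


for command in build_schedule():
    print(shlex.join(command))
```

Printing the commands rather than running them makes it easy to review the plan first; each list could be passed to `subprocess.run` once the paths are confirmed.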

Pretrain

Pretrain command line usage example:

> python pretrain.py --model='coarse' --train_path='./trainset' --save_model='./save_model'

Pretrain command line full usage:

> python pretrain.py --batch_size=int --epochs=int --learning_rate=float [--load_model=str]
                     --model=str [--save_model=str] --train_path=str --world_size=int

Required arguments:
  --batch_size          Number of images passed to the model at once (default: 8)
  --epochs              Number of training epochs (default: 10)
  --learning_rate       Step size for weight updates during optimization (default: 1e-5)
  --model               Model to train, one of: 'coarse', 'super_resolution' or 'refinement'
  --train_path          Folder containing training images (images must be 512x512 pixels or larger)
  --world_size          Number of GPUs for multi-GPU training; omit this argument when using a single GPU (default: 1)

Optional arguments:
  --load_model          Folder containing a saved model (the model must match the --model name, and the folder should contain only one model file)
  --save_model          Folder in which to save the model
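Because `--train_path` expects images of 512x512 pixels or larger, it may be worth scanning the folder before training. The following is a sketch under a few assumptions: it uses Pillow to read image sizes, it treats the requirement as "both sides at least 512", and the list of file extensions is a guess, since the source does not say which formats the loader accepts.

```python
from pathlib import Path

MIN_SIDE = 512
# Assumption: the accepted formats are not documented; adjust as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def is_large_enough(width, height, min_side=MIN_SIDE):
    """True when both sides meet the minimum size."""
    return width >= min_side and height >= min_side


def find_small_images(folder, min_side=MIN_SIDE):
    """Return (file name, width, height) for every image that is too small."""
    from PIL import Image  # Pillow, imported lazily

    too_small = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        with Image.open(path) as image:
            width, height = image.size
        if not is_large_enough(width, height, min_side):
            too_small.append((path.name, width, height))
    return too_small


# Usage:
#     for name, width, height in find_small_images("./trainset"):
#         print(f"{name}: {width}x{height} is below {MIN_SIDE}x{MIN_SIDE}")
```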

You can monitor the pretraining loss in your WandB project.

Combined train

Combined train command line usage example:

> python train.py --mask_type=1 --train_path='./trainset' --save_model='./save_model'

Combined train command line full usage:

> python train.py --batch_size=int --epochs=int --learning_rate=float [--load_discriminator=str] [--load_inpaint=str]
                  --mask_type=int [--save_model=str] --train_path=str [--val_path=str] --world_size=int

Required arguments:
  --batch_size          Number of images passed to the model at once (default: 1)
  --epochs              Number of training epochs (default: 10)
  --learning_rate       Step size for weight updates during optimization (default: 1e-5)
  --mask_type           One of these numbers: 1 for small masks or 2 for larger masks (default: 1)
  --train_path          Folder containing training images (images must be 512x512 pixels or larger)
  --world_size          Number of GPUs for multi-GPU training; omit this argument when using a single GPU (default: 1)

Optional arguments:
  --load_discriminator  Folder containing the discriminator model (the folder should contain only one discriminator model file)
  --load_inpaint        Folder containing the coarse, super_resolution and refinement models (the folder should contain one file for each model)
  --save_model          Folder in which to save the model
  --val_path            Folder containing validation images (images must be 512x512 pixels or larger)

During combined training, WandB shows 4 losses: coarse_loss, super_resolution_loss, refinement_loss and discriminator_loss.

Discriminator loss should converge to 2.
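One reading that is consistent with a target value of 2 is a hinge-style discriminator loss, although the source does not state which loss the discriminator uses, so treat this as an assumption. Under hinge loss, a discriminator that can no longer tell real from generated images outputs scores near 0 for both, and each of the two terms then contributes 1. The function below is an illustrative, plain-Python version of that loss, not the project's code.

```python
def hinge_discriminator_loss(real_scores, fake_scores):
    """Hinge loss for a GAN discriminator, averaged over each batch.

    Assumption: this loss form is not confirmed by the project.
    """
    real_term = sum(max(0.0, 1.0 - s) for s in real_scores) / len(real_scores)
    fake_term = sum(max(0.0, 1.0 + s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term


# Undecided discriminator: scores of 0 for real and fake images alike.
print(hinge_discriminator_loss([0.0, 0.0], [0.0, 0.0]))  # 2.0

# Confident discriminator: real scores above 1, fake scores below -1.
print(hinge_discriminator_loss([1.5, 2.0], [-1.2, -3.0]))  # 0.0
```

Under this assumption, a discriminator loss that falls towards 0 would suggest the discriminator is overpowering the generator, while a value near 2 would suggest the two are balanced.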

Evaluation

Evaluate command line usage example:

> python test.py --image_path='./testset' --load_model='./save_model' --output_path='./result' --model='inpaint'

Evaluate command line full usage:

> python test.py --image_path=str --load_model=str --output_path=str --model=str

Required arguments:
  --image_path          Folder containing test images (images must be 512x512 pixels or larger)
  --load_model          Folder containing the saved model (the folder should contain only one model file or, when testing all models combined, one file for each model)
  --output_path         Folder in which to save the test results
  --model               Model to test, one of: 'coarse', 'super_resolution', 'refinement' or 'inpaint' (all models combined)

Contributors

nack424
