longlongaaago / grig_few_shot_inpainting

License: MIT License

GRIG: Data-efficient generative residual image inpainting

Official PyTorch implementation of GRIG.

[Homepage] [paper] [demo_youtube] [demo_bilibili]

We present a novel data-efficient generative residual image inpainting method that produces high-quality inpainting results.

Performance

Visual results of our GRIG models trained in various few-shot settings. “All” means the whole training set.

Requirements

cd <GRIG project directory>
pip install -r grig_requirements.txt
  • Note that other versions of PyTorch (e.g., newer than 1.7) also work well, but you must install the matching CUDA version.
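To check which PyTorch and CUDA versions are actually installed before training, a small script along these lines may help. It is only a sketch and not part of the GRIG code: the function name `describe_torch_environment` is hypothetical, and it relies only on the standard `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()` attributes.

```python
import importlib


def describe_torch_environment():
    """Report the installed PyTorch version and its CUDA support.

    Hypothetical helper (not part of GRIG). All values stay at their
    defaults when PyTorch is not installed.
    """
    info = {"torch": None, "cuda_build": None, "cuda_available": False}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return info
    info["torch"] = torch.__version__
    # CUDA version the PyTorch build was compiled against (None for CPU-only builds).
    info["cuda_build"] = torch.version.cuda
    # True only if a usable GPU and driver are visible at runtime.
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_torch_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is False while `cuda_build` is set, the installed driver or CUDA runtime probably does not match the PyTorch build.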

What we have released
  • Training and testing codes
  • Pre-trained models
    • Models on few-shot datasets (we encourage users to train these models themselves, because training does not take long.)
    • Models on large-scale datasets (download large-scale-pre-trained models)

Project benefits
  • Ours is the first deep-learning-based few-shot image inpainting method.
  • Our code can train with a batch size of 8 on a GPU with less than 12 GB of memory.
  • Our model converges very fast, especially on few-shot datasets.

Training

  • Prepare your small-scale datasets (download 10-few-shot_datasets)
    • Although the paper reports training each dataset for 400,000 iterations, the model actually converges much faster.
    • For most datasets, about 20,000 or 50,000 iterations are enough to train a good model.
    • We still encourage users to try different iteration counts and see how GRIG behaves.
  • Prepare your large-scale datasets (download FFHQ, CelebA-HQ, Paris Street View, and Places365)
    • We recommend training for 1,000,000 iterations on the FFHQ, CelebA-HQ, and Paris Street View datasets, and for 2,000,000 iterations on the Places365 dataset.
  • The folder structure of training and testing data is shown below:
root/
    test/
        xxx.png
        ...
        xxz.png
    train/
        xxx.png
        ...
        xxz.png
  • Prepare pre-trained checkpoints: efficient_net (put models in ./pre_train)

  • Training

python train.py --path /root/train --test_path /root/test --im_size 256 --eval_interval 200 --iter 400000 --efficient_net ./pre_train/tf_efficientnet_lite0-0aa007d2.pth

- path: training path
- test_path: testing data path
- im_size: image size for training and testing
- eval_interval: the frequency for testing 
- iter: total training iterations
- efficient_net: ckpt path of the efficient_net
  • During training, checkpoints (ckpts) and intermediate result images are saved in the ./train_results/test1 folder.
  • The evaluation results can be found in the ./eval_ folder (you can change this with --eval_dir).
  • For more options, please see the code.
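If your images sit in a single flat folder, the root/train and root/test layout described above can be produced with a short script. The following is a minimal sketch, not part of the GRIG code: `split_dataset`, `test_fraction`, and `seed` are hypothetical names, the 10% test share is an arbitrary choice, and only .png files are picked up because that is the only format shown in the layout.

```python
import random
import shutil
from pathlib import Path

# Assumption: only .png appears in the README layout; extend this set
# only after checking that the data loader accepts other formats.
IMAGE_SUFFIXES = {".png"}


def split_dataset(source_dir, root_dir, test_fraction=0.1, seed=0):
    """Copy images from a flat folder into root_dir/train and root_dir/test."""
    source = Path(source_dir)
    images = sorted(
        p for p in source.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(images)
    n_test = max(1, round(len(images) * test_fraction)) if images else 0
    splits = {"test": images[:n_test], "train": images[n_test:]}
    for name, files in splits.items():
        target = Path(root_dir) / name
        target.mkdir(parents=True, exist_ok=True)
        for image in files:
            shutil.copy2(image, target / image.name)
    return {name: len(files) for name, files in splits.items()}
```

For example, `split_dataset("./raw_images", "/root")` would fill /root/train and /root/test, which can then be passed to --path and --test_path.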

Testing

  • Irregular masks (optional, if you would like to test on irregular masks, download Testing Set masks)

python test.py --test_path /root/test --ckpt_path ./checkpoint/... --mask_root ./dataset/mask/testing_mask_dataset --mask_file_root ./dataset/mask --mask_type test_6.txt

- ckpt_path: path to the pre-trained model checkpoint
- mask_root: root folder of the irregular masks
- mask_file_root: folder containing the mask file-name list files
- mask_type: one of ["Center", "test_2.txt", "test_3.txt", "test_4.txt", "test_5.txt", "test_6.txt", "all"]
  • If you don't have irregular masks, testing with center masks only is also fine.

python test.py --test_path /root/test --ckpt_path ./checkpoint/... --mask_type Center

  • After testing finishes, you can find the output images in the ./eval and ./view folders.
  • For more options, please see the code.
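To test one checkpoint against several mask types in turn, the commands above can be generated programmatically. This is only a sketch: `build_test_command` is a hypothetical helper, it uses only the flags shown in this README, and "all" is left out because its exact behavior is not described here.

```python
import shlex

# Mask types listed in this README ("all" is omitted, see the note above).
MASK_TYPES = ["Center", "test_2.txt", "test_3.txt",
              "test_4.txt", "test_5.txt", "test_6.txt"]


def build_test_command(test_path, ckpt_path, mask_type,
                       mask_root="./dataset/mask/testing_mask_dataset",
                       mask_file_root="./dataset/mask"):
    """Return the test.py argument list for one mask type (hypothetical helper)."""
    command = ["python", "test.py",
               "--test_path", test_path,
               "--ckpt_path", ckpt_path,
               "--mask_type", mask_type]
    if mask_type != "Center":
        # Irregular masks additionally need the mask folders,
        # as in the irregular-mask command shown in this README.
        command += ["--mask_root", mask_root,
                    "--mask_file_root", mask_file_root]
    return command


if __name__ == "__main__":
    for mask_type in MASK_TYPES:
        # "./checkpoint/..." is the elided placeholder from the README;
        # replace it with the path to your own checkpoint.
        print(shlex.join(build_test_command("/root/test", "./checkpoint/...", mask_type)))
```

Each printed line can be run as is, or the list can be passed to `subprocess.run`. It is worth checking whether successive runs overwrite the ./eval and ./view folders before running them back to back.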

BibTeX

  • If you find our code useful, please cite our paper:
    @misc{lu2023grig,
        title={GRIG: Few-Shot Generative Residual Image Inpainting}, 
        author={Wanglong Lu and Xianta Jiang and Xiaogang Jin and Yong-Liang Yang and Minglun Gong and Tao Wang and Kaijie Shi and Hanli Zhao},
        year={2023},
        eprint={2304.12035},
        archivePrefix={arXiv},
        primaryClass={cs.CV}
      }
    

Acknowledgements

Closely related projects: FastGAN, ProjectedGAN, and Restormer.

The code for Learned Perceptual Image Patch Similarity (LPIPS) comes from https://github.com/richzhang/PerceptualSimilarity

To match FID scores more closely to the official TensorFlow implementation, we use the FID Inception V3 implementation from https://github.com/mseitzer/pytorch-fid

More results

[Figure: additional result examples; columns show Ground-truth, Masked image, and Inpainted.]
