
AdaInt: Learning Adaptive Intervals for 3D Lookup Tables on Real-time Image Enhancement

Introduction

The codebase provides the official PyTorch implementation and some model checkpoints for the paper "AdaInt: Learning Adaptive Intervals for 3D Lookup Tables on Real-time Image Enhancement" (accepted by CVPR 2022).

AdaInt (Adaptive Interval) is an effective and efficient mechanism for improving learnable 3D lookup tables (LUTs) in real-time image enhancement, and it can be implemented as a plug-and-play neural network module. The central idea is to introduce image-adaptive sampling intervals for learning a non-uniform 3D LUT layout. To enable learning such non-uniform sampling intervals in the 3D color space, a differentiable AiLUT-Transform (Adaptive Interval LUT Transform) operator is proposed to provide gradients to the sampling intervals. Experiments demonstrate that methods equipped with AdaInt achieve state-of-the-art performance on two public benchmark datasets with negligible extra overhead.
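For intuition, here is a minimal 1D PyTorch sketch of the idea behind the AiLUT-Transform (our own illustration, not the repository's CUDA kernel): learned intervals are accumulated into non-uniform vertex coordinates, and the lookup interpolates linearly inside each bin, so gradients flow to both the LUT values and the sampling intervals.

import torch
import torch.nn.functional as F

def ailut_transform_1d(x, lut, intervals):
    """1D illustration of a non-uniform LUT lookup.

    x:         inputs in [0, 1], shape (B,)
    lut:       values at the vertices, shape (N,)
    intervals: positive sampling intervals, shape (N-1,), summing to 1
    """
    # Accumulate intervals into increasing vertex coordinates in [0, 1].
    vertices = F.pad(intervals.cumsum(-1), (1, 0), 'constant', 0)
    # Locate the bin (index of the right vertex) of each input.
    idx = torch.searchsorted(vertices, x.contiguous()).clamp(1, len(vertices) - 1)
    left, right = vertices[idx - 1], vertices[idx]
    # Linear interpolation inside the bin; differentiable w.r.t. both
    # `lut` and `intervals` (through `vertices`).
    w = (x - left) / (right - left)
    return (1 - w) * lut[idx - 1] + w * lut[idx]

# Toy usage: a network would predict the interval logits per image.
logits = torch.zeros(8, requires_grad=True)
lut = torch.linspace(0, 1, 9, requires_grad=True)
out = ailut_transform_1d(torch.rand(16), lut, torch.softmax(logits, -1))
out.sum().backward()  # gradients reach both the LUT and the interval logits

The repository implements the 3D version of this lookup as a CUDA extension for speed (see Installation below).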

The codebase is based on the popular MMEditing toolbox (v0.11.0). Please refer to ori_README.md for the original README.

Code Structure

  • mmedit/: the original MMEditing toolbox (without any modification).
  • adaint/: the core implementation of the paper, including:
    • annfiles/: the annotation files for the FiveK and PPR10K datasets.
    • dataset.py: the dataset class for image enhancement (FiveK and PPR10K).
    • transforms.py: some augmentations not provided by the MMEditing toolbox.
    • ailut_transform/: the Python interface, the C++/CUDA implementation, and the wheel package of the proposed AiLUT-Transform.
    • model.py: the implementation of the AiLUT model (3D-LUT + AdaInt).
    • configs/: the configurations used to conduct experiments.
    • metrics/: MATLAB scripts to calculate the metrics reported in the paper.
    • demo.py: a Python script to run a demo.
  • pretrained/: the pretrained models.

Prerequisites

Hardware

  • CPU: Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz
  • GPU: NVIDIA Tesla V100 SXM2 32G

Dependencies

  • Ubuntu 18.04.5 LTS
  • Python 3.7.10
  • PyTorch 1.8.1
  • GCC/G++ 7.5
  • CUDA 10.2
  • MMCV 1.3.17
  • MMEditing 0.11.0

Installation

You can set up the MMEditing toolbox with conda and pip as follows:

conda install -c pytorch pytorch=1.8.1 torchvision cudatoolkit=10.2 -y
pip install -r requirements.txt
pip install -v -e .

The proposed AiLUT-Transform is implemented as a PyTorch CUDA extension. You can install the extension in either of the following two ways:

  • Compile and install the extension manually.
python adaint/ailut_transform/setup.py install
  • Use the pre-compiled python wheel package in ./adaint/ailut_transform.
pip install adaint/ailut_transform/ailut-1.5.0-cp37-cp37m-linux_x86_64.whl

Note that the CUDA extension should be compiled and packaged using Python 3.7.10, PyTorch 1.8.1, GCC/G++ 7.5, and CUDA 10.2. If you fail to install the extension or encounter any issue afterward, please first carefully check your environment accordingly.
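As a quick sanity check before compiling, you can print the relevant versions (a generic snippet of our own, not part of the codebase):

import sys
import torch

# The pre-built wheel targets Python 3.7 / PyTorch 1.8.1 / CUDA 10.2.
print('Python :', sys.version.split()[0])    # expect 3.7.x
print('PyTorch:', torch.__version__)         # expect 1.8.1
print('CUDA   :', torch.version.cuda)        # expect 10.2
print('GPU OK :', torch.cuda.is_available())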

In case you would like to remove the installed AiLUT-Transform extension, please execute the following command:

pip uninstall ailut

Demo

We provide a quick demo script in adaint/demo.py. You can execute it in the following way:

python adaint/demo.py [CONFIG_FILE] [MODEL_CHECKPOINT] [INPUT_IMAGE_PATH] [OUTPUT_IMAGE_PATH]

For quick testing, we provide a pretrained model in ./pretrained/AiLUT-FiveK-sRGB.pth and an input image from the FiveK dataset in 8-bit sRGB format (./resources/a4739.jpg). You can enhance it using the command below:

python adaint/demo.py adaint/configs/fivekrgb.py pretrained/AiLUT-FiveK-sRGB.pth resources/a4739.jpg resources/a4739_enhanced.png

The enhanced result can be found in resources/a4739_enhanced.png.

Datasets

The paper uses the FiveK and PPR10K datasets for its experiments. It is recommended to consult the two datasets' homepages first.

Download

  • FiveK

You can download the original FiveK dataset from the dataset homepage and then preprocess the dataset using Adobe Lightroom following the instructions in Prepare_FiveK.md.

For a quicker setup, you can also download just the 480p version preprocessed by Zeng ([GoogleDrive],[onedrive],[baiduyun:5fyk]), which includes 8-bit sRGB and 16-bit XYZ input images as well as 8-bit sRGB ground-truth images.

After downloading the dataset, please unzip the images into the ./data/FiveK directory. Please also copy the annotation files from ./adaint/annfiles/FiveK into the same directory. The final directory structure is as follows.

./data/FiveK
    input/
        JPG/480p/                # 8-bit sRGB inputs
        PNG/480p_16bits_XYZ_WB/  # 16-bit XYZ inputs
    expertC/JPG/480p/            # 8-bit sRGB groundtruths
    train.txt
    test.txt
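
A few lines of Python (our own sketch; adjust ROOT if your data lives elsewhere) can verify the layout before training:

import os

ROOT = './data/FiveK'
for rel in ['input/JPG/480p', 'input/PNG/480p_16bits_XYZ_WB',
            'expertC/JPG/480p', 'train.txt', 'test.txt']:
    path = os.path.join(ROOT, rel)
    print('[ok]  ' if os.path.exists(path) else '[miss]', path)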
  • PPR10K

We download the 360p dataset (train_val_images_tif_360p and masks_360p) from PPR10K to conduct our experiments.

After downloading the dataset, please unzip the images into the ./data/PPR10K directory. Please also copy the annotation files from ./adaint/annfiles/PPR10K into the same directory. The expected directory structure is as follows.

data/PPR10K
    source/       # 16-bit sRGB inputs
    source_aug_6/ # 16-bit sRGB inputs plus 5 augmented versions
    masks/        # human-region masks
    target_a/     # 8-bit sRGB groundtruths retouched by expert a
    target_b/     # 8-bit sRGB groundtruths retouched by expert b
    target_c/     # 8-bit sRGB groundtruths retouched by expert c
    train.txt
    train_aug.txt
    test.txt

Usage

General Instruction

  • You can configure experiments by modifying the configuration files in adaint/configs/. Here we briefly describe some critical hyper-parameters (see the illustrative config snippet after this list):

    • model.n_ranks: (int) The number of ranks in the mapping h (denoted as M in the paper).
    • model.n_vertices: (int) The number of sampling points along each lattice dimension (denoted as N in the paper).
    • model.en_adaint: (bool) Whether to use AdaInt. If False, the model degenerates to TPAMI 3D-LUT.
    • model.en_adaint_share: (bool) Whether to share AdaInt among color channels (see Share-AdaInt in the ablation studies).
    • model.backbone: (str) The architecture of the backbone (the mapping f in the paper). Can be either 'tpami' or 'res18'.
  • Execute a command in the following format to train a model (all experiments can be conducted on a single GPU).

python tools/train.py [PATH/TO/CONFIG]
  • Execute a command in the following format to run inference with a pretrained model.
python tools/test.py [PATH/TO/CONFIG] [PATH/TO/MODEL/CHECKPOINT] --save-path [PATH/TO/SAVE/RESULTS]
  • Use MATLAB to calculate the metrics reported in the paper.
cd ./adaint/metrics
(matlab) >> fivek_calculate_metrics([PATH/TO/SAVE/RESULTS], [PATH/TO/GT/IMAGES])
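
To make the hyper-parameters above concrete, here is an illustrative fragment in the MMEditing config style; the type name and all values are placeholders of ours, so consult the actual files in adaint/configs/ for the real settings.

# Hypothetical config excerpt; field names follow the list above,
# the type name and the values are illustrative only.
model = dict(
    type='AiLUT',           # assumed registry name
    n_ranks=3,              # M: number of ranks in the mapping h
    n_vertices=33,          # N: sampling points per lattice dimension
    en_adaint=True,         # False degenerates to TPAMI 3D-LUT
    en_adaint_share=False,  # Share-AdaInt variant from the ablations
    backbone='tpami',       # or 'res18'
)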

Training

  • On FiveK-sRGB (for photo retouching)
python tools/train.py adaint/configs/fivekrgb.py
  • On FiveK-XYZ (for tone mapping)
python tools/train.py adaint/configs/fivekxyz.py
  • On PPR10K (for photo retouching)
python tools/train.py adaint/configs/ppr10k.py

Testing

We provide some pretrained models in ./pretrained/. To conduct testing, please use the following commands:

  • On FiveK-sRGB (for photo retouching)
python tools/test.py adaint/configs/fivekrgb.py pretrained/AiLUT-FiveK-sRGB.pth --save-path [PATH/TO/SAVE/RESULTS]
  • On FiveK-XYZ (for tone mapping)
python tools/test.py adaint/configs/fivekxyz.py pretrained/AiLUT-FiveK-XYZ.pth --save-path [PATH/TO/SAVE/RESULTS]
  • On PPR10K (for photo retouching)
python tools/test.py adaint/configs/ppr10k.py pretrained/AiLUT-PPR10KA-sRGB.pth --save-path [PATH/TO/SAVE/RESULTS]
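
The official metrics come from the MATLAB scripts in adaint/metrics/; for a quick sanity check of saved results you can compute PSNR in Python (our own sketch, so expect small numerical differences from the MATLAB implementation; the paths are illustrative):

import numpy as np
from PIL import Image

def psnr(result_path, gt_path):
    # Both images must have the same resolution (e.g., 480p results vs. 480p GT).
    a = np.asarray(Image.open(result_path), dtype=np.float64)
    b = np.asarray(Image.open(gt_path), dtype=np.float64)
    return 10 * np.log10(255.0 ** 2 / np.mean((a - b) ** 2))

print(psnr('[PATH/TO/SAVE/RESULTS]/a4739.png', 'data/FiveK/expertC/JPG/480p/a4739.jpg'))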

License

This codebase is released under the Apache 2.0 license.

Citation

If you find this repository useful, please consider citing the following paper:

@InProceedings{yang2022adaint,
  title={AdaInt: Learning Adaptive Intervals for 3D Lookup Tables on Real-time Image Enhancement},
  author={Yang, Canqian and Jin, Meiguang and Jia, Xu and Xu, Yi and Chen, Ying},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2022}
}

Acknowledgements

This codebase is built upon the MMEditing toolbox and other open-source projects. We thank their authors for making the source code publicly available.


adaint's Issues

ModuleNotFoundError: No module named 'ailut'

Hi, I get the error in the title when running demo.py.

mmcv 1.5.0
mmedit 0.14.0

How can I solve this?

Installing the wheel with pip install ailut-1.5.0-cp37-cp37m-linux_x86_64.whl also fails, with:
ERROR: ailut-1.5.0-cp37-cp37m-linux_x86_64.whl is not a supported wheel on this platform
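
The "not a supported wheel" error means the interpreter's tags do not match the wheel's cp37-cp37m-linux_x86_64 tag. You can inspect your interpreter with a generic snippet like this (pip debug --verbose also lists the tags your pip accepts):

import platform
import sys

# The wheel only installs on CPython 3.7 under 64-bit Linux;
# any other combination must compile the extension from source.
print('implementation:', platform.python_implementation())          # expect CPython
print('version       :', '.'.join(map(str, sys.version_info[:3])))  # expect 3.7.x
print('platform      :', platform.system(), platform.machine())     # expect Linux x86_64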

Hi, a question about training

Hi, I have a question about training. Since installing mmcv and its dependencies is inconvenient, I added the AdaInt functionality on top of the PPR10K codebase and trained there. The first problem I hit was that the training PSNR gradually dropped from -0.x to -30.x; the paper says to freeze the AdaInt module's gradients first and release them after a few epochs, and after adopting this the PSNR indeed stayed mostly positive. But then other problems appeared: for example, at around epoch 93 the training PSNR gradually fell from 15.6x down to 8.x, which seems very strange. What could be causing this, and how can it be solved?
Another question: at inference time, edges with large color differences still appear easily, producing color banding. Is there any way to address this?
Also, is the PSNR reported in the paper the training PSNR or the test PSNR? When my training PSNR is around 14-something, the average PSNR on the test set is about 23. What is the reason for this?

About Pillow and OpenCV

Hi! Thank you for sharing this amazing work. I'm wondering why you decided to use Pillow for Adobe5K and OpenCV for PPR10K instead of using the same for both datasets. Is it something related with the usage of the pretrained ResNet when training on PPR10K?

How to solve an environment problem when using an NVIDIA 3060?

The first time, I did exactly what the README says and got a compute-capability mismatch error.
The second time, I reinstalled pytorch==1.13 and mmcv==1.7.0 and got an error saying I need to install mmcv>=(1, 3, 0, 0, 0, 0), <=(1, 5, 0, 0, 0, 0).

Welcome update to OpenMMLab 2.0

I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/amFNsyUBvm or add me on WeChat (van-sin) and I will invite you to the OpenMMLab WeChat group.

Here are the OpenMMLab 2.0 repo branches:

                      OpenMMLab 1.0 branch    OpenMMLab 2.0 branch
    MMEngine          -                       0.x
    MMCV              1.x                     2.x
    MMDetection       0.x, 1.x, 2.x           3.x
    MMAction2         0.x                     1.x
    MMClassification  0.x                     1.x
    MMSegmentation    0.x                     1.x
    MMDetection3D     0.x                     1.x
    MMEditing         0.x                     1.x
    MMPose            0.x                     1.x
    MMDeploy          0.x                     1.x
    MMTracking        0.x                     1.x
    MMOCR             0.x                     1.x
    MMRazor           0.x                     1.x
    MMSelfSup         0.x                     1.x
    MMRotate          1.x                     1.x
    MMYOLO            -                       0.x

Attention: please create a new virtual environment for OpenMMLab 2.0.

About the differences from 3D LUT

After reading the code, I noticed an issue:
In the 3D LUT code, a set of basis 3D LUTs is learned during training; at test time, the needed 3D LUT is obtained by a weighted combination of these basis LUTs, which greatly saves inference time because new basis 3D LUTs do not have to be produced for every image.
In your code, however, a set of basis 3D LUTs is computed for every image in both training and testing. This differs from the 3D LUT paper; it will obviously improve the model's results, but it also increases the computational cost. Is this intentional, or an incorrect reproduction?
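
For reference, the per-image fusion both codebases build on can be sketched as follows (our own illustration): the basis LUTs are fixed after training, and only their weighted combination is computed per image.

import torch

M, N = 3, 33                             # M basis LUTs, N vertices per dimension
basis_luts = torch.randn(M, 3, N, N, N)  # learned once, shared across images
weights = torch.rand(M)                  # predicted per image by the backbone

# The per-image LUT is a weighted sum of the basis LUTs; only this small
# fusion is recomputed for each image, not the basis LUTs themselves.
image_lut = (weights.view(M, 1, 1, 1, 1) * basis_luts).sum(dim=0)
print(image_lut.shape)  # torch.Size([3, 33, 33, 33])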

demo running error

Hi, this is great work and I want to test it on my own task, but I failed to run the demo. Could you help me?
my cmd: PYTHONPATH=./ python adaint/demo.py adaint/configs/fivekrgb.py pretrained/AiLUT-FiveK-sRGB.pth resources/a4739.jpg resources/a4739_enhanced.png

error info:
/opt/conda/lib/python3.7/site-packages/mmcv/utils/registry.py:252: UserWarning: The old API of register_module(module, force=False) is deprecated and will be removed, please use the new API register_module(name=None, force=False, module=None) instead.
'The old API of register_module(module, force=False) '
load checkpoint from local path: pretrained/AiLUT-FiveK-sRGB.pth
The model and loaded state dict do not match exactly

missing keys in source state_dict: cnt_iters

Retraining does not reach the pretrained model's PSNR

Hi, I ran

python3 tools/train.py adaint/configs/fivekrgb.py

to retrain the model several times, but the final accuracy falls short of the pretrained model; the PSNR basically hovers around 24.x. Is something going wrong somewhere?


How can I train my own dataset?

Hi, thanks for this great work from DaTaoBao tech!

What I want to know is how to train on my own private dataset. It seems that ground-truth images must be provided in this project? However, my dataset comes without ground-truth images, so how can I work around this?

Thanks!

About Non-uniform 3D LUT Rendering

Hi,
What exactly does the "Non-uniform 3D LUT Rendering" block in the diagram refer to? How does the learned uniform 3D LUT become the non-uniform Sampled 3D LUT? Or should the learned 3D LUT be understood as already non-uniform, i.e., as corresponding to the Non-uniform Lattice? I could not find the corresponding place in the code. I would appreciate an explanation, thanks!

A question about the approach

Dear author, is the core difference from SepLUT reflected in vertices = F.pad(intervals.cumsum(-1), (1, 0), 'constant', 0) and in the later 3D LUT interpolation part, i.e., that the sampling range is non-uniform so the lattice is used more efficiently?

Many thanks; I'm not sure how to understand this.
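
To make the quoted line concrete, a toy example (our own): cumsum turns per-bin interval lengths into increasing coordinates, and the pad prepends the 0 endpoint, producing a non-uniform sampling grid.

import torch
import torch.nn.functional as F

intervals = torch.tensor([0.1, 0.4, 0.3, 0.2])  # predicted lengths, sum to 1
vertices = F.pad(intervals.cumsum(-1), (1, 0), 'constant', 0)
print(vertices)  # tensor([0.0000, 0.1000, 0.5000, 0.8000, 1.0000])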

About PPR10K dataset

Wonderful work! I notice that the authors of PPR10K provide an augmented version of the 8,875 training images, containing 53K images in total. I am wondering whether you trained with the augmented images. Thanks.

About the error maps in the paper

Wonderful work! I see many images in the paper that look like thermal maps, which you call "error maps". I want to know how to draw pictures like these; could you please share the code you used? Thanks.
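
The repository does not include the plotting code here, but a common way to draw such maps (our own sketch; the ground-truth path is hypothetical) is to colorize the per-pixel absolute difference with a thermal colormap:

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

pred = np.asarray(Image.open('resources/a4739_enhanced.png'), np.float32) / 255.0
gt = np.asarray(Image.open('path/to/groundtruth.png'), np.float32) / 255.0  # hypothetical

error = np.abs(pred - gt).mean(axis=-1)  # per-pixel mean absolute error
plt.imshow(error, cmap='jet')            # thermal-style colormap
plt.colorbar()
plt.axis('off')
plt.savefig('error_map.png', bbox_inches='tight')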
