
AdaInt: Learning Adaptive Intervals for 3D Lookup Tables on Real-time Image Enhancement

Introduction

The codebase provides the official PyTorch implementation and some model checkpoints for the paper "AdaInt: Learning Adaptive Intervals for 3D Lookup Tables on Real-time Image Enhancement" (accepted by CVPR 2022).

AdaInt (Adaptive Interval) is an effective and efficient mechanism for improving learnable 3D lookup tables (LUTs) in real-time image enhancement, and it can be implemented as a plug-and-play neural network module. The central idea is to introduce image-adaptive sampling intervals for learning a non-uniform 3D LUT layout. To enable learning such non-uniform sampling intervals in the 3D color space, a differentiable AiLUT-Transform (Adaptive Interval LUT Transform) operator is proposed to provide gradients to the sampling intervals. Experiments demonstrate that methods equipped with AdaInt achieve state-of-the-art performance on two public benchmark datasets with negligible extra overhead.
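For intuition, here is a minimal 1D PyTorch sketch of the idea behind the AiLUT-Transform (our own illustration, not the repository's CUDA kernel): learned intervals are accumulated into non-uniform vertex coordinates, and the lookup interpolates linearly inside each bin, so gradients flow to both the LUT values and the sampling intervals.

import torch
import torch.nn.functional as F

def ailut_transform_1d(x, lut, intervals):
    """1D illustration of a non-uniform LUT lookup.

    x:         inputs in [0, 1], shape (B,)
    lut:       values at the vertices, shape (N,)
    intervals: positive sampling intervals, shape (N-1,), summing to 1
    """
    # Accumulate intervals into increasing vertex coordinates in [0, 1].
    vertices = F.pad(intervals.cumsum(-1), (1, 0), 'constant', 0)
    # Locate the bin (index of the right vertex) of each input.
    idx = torch.searchsorted(vertices, x.contiguous()).clamp(1, len(vertices) - 1)
    left, right = vertices[idx - 1], vertices[idx]
    # Linear interpolation inside the bin; differentiable w.r.t. both
    # `lut` and `intervals` (through `vertices`).
    w = (x - left) / (right - left)
    return (1 - w) * lut[idx - 1] + w * lut[idx]

# Toy usage: a network would predict the interval logits per image.
logits = torch.zeros(8, requires_grad=True)
lut = torch.linspace(0, 1, 9, requires_grad=True)
out = ailut_transform_1d(torch.rand(16), lut, torch.softmax(logits, -1))
out.sum().backward()  # gradients reach both the LUT and the interval logits

The repository implements the 3D version of this lookup as a CUDA extension for speed (see Installation below).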

The codebase is based on the popular MMEditing toolbox (v0.11.0). Please refer to ori_README.md for the original README.

Code Structure

  • mmedit/: the original MMEditing toolbox (without any modification).
  • adaint/: the core implementation of the paper, including:
    • annfiles/: the annotation files for the FiveK and PPR10K datasets.
    • dataset.py: the dataset class for image enhancement (FiveK and PPR10K).
    • transforms.py: some augmentations not provided by the MMEditing toolbox.
    • ailut_transform/: the Python interface, the C++/CUDA implementation, and the wheel package of the proposed AiLUT-Transform.
    • model.py: the implementation of the AiLUT model (3D-LUT + AdaInt).
    • configs/: the configurations used to conduct experiments.
    • metrics/: MATLAB scripts to calculate the metrics reported in the paper.
    • demo.py: a Python script to run a demo.
  • pretrained/: the pretrained models.

Prerequisites

Hardware

  • CPU: Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz
  • GPU: NVIDIA Tesla V100 SXM2 32G

Dependencies

  • Ubuntu 18.04.5 LTS
  • Python 3.7.10
  • PyTorch 1.8.1
  • GCC/G++ 7.5
  • CUDA 10.2
  • MMCV 1.3.17
  • MMEditing 0.11.0

Installation

You can set up the MMEditing toolbox with conda and pip as follows:

conda install -c pytorch pytorch=1.8.1 torchvision cudatoolkit=10.2 -y
pip install -r requirements.txt
pip install -v -e .

The proposed AiLUT-Transform is implemented as a PyTorch CUDA extension. You can install the extension in either of the following two ways:

  • Compile and install the extension manually.
python adaint/ailut_transform/setup.py install
  • Use the pre-compiled python wheel package in ./adaint/ailut_transform.
pip install adaint/ailut_transform/ailut-1.5.0-cp37-cp37m-linux_x86_64.whl

Note that the CUDA extension should be compiled and packaged using Python 3.7.10, PyTorch 1.8.1, GCC/G++ 7.5, and CUDA 10.2. If you fail to install the extension or encounter any issue afterward, please first carefully check your environment accordingly.
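As a quick sanity check before compiling, you can print the relevant versions (a generic snippet of our own, not part of the codebase):

import sys
import torch

# The pre-built wheel targets Python 3.7 / PyTorch 1.8.1 / CUDA 10.2.
print('Python :', sys.version.split()[0])    # expect 3.7.x
print('PyTorch:', torch.__version__)         # expect 1.8.1
print('CUDA   :', torch.version.cuda)        # expect 10.2
print('GPU OK :', torch.cuda.is_available())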

In case you would like to remove the installed AiLUT-Transform extension, please execute the following command:

pip uninstall ailut

Demo

We provide a quick demo script in adaint/demo.py. You can execute it in the following way:

python adaint/demo.py [CONFIG_FILE] [MODEL_CHECKPOINT] [INPUT_IMAGE_PATH] [OUTPUT_IMAGE_PATH]

For quick testing, we provide a pretrained model in ./pretrained/AiLUT-FiveK-sRGB.pth and an input image from the FiveK dataset in 8-bit sRGB format (./resources/a4739.jpg). You can enhance it using the command below:

python adaint/demo.py adaint/configs/fivekrgb.py pretrained/AiLUT-FiveK-sRGB.pth resources/a4739.jpg resources/a4739_enhanced.png

The enhanced result can be found in resources/a4739_enhanced.png.

Datasets

The paper uses the FiveK and PPR10K datasets for its experiments. It is recommended to consult the two datasets' homepages first.

Download

  • FiveK

You can download the original FiveK dataset from the dataset homepage and then preprocess the dataset using Adobe Lightroom following the instructions in Prepare_FiveK.md.

For a quicker setup, you can also download just the 480p version preprocessed by Zeng ([GoogleDrive],[onedrive],[baiduyun:5fyk]), which includes 8-bit sRGB and 16-bit XYZ input images as well as 8-bit sRGB ground-truth images.

After downloading the dataset, please unzip the images into the ./data/FiveK directory. Please also copy the annotation files from ./adaint/annfiles/FiveK into the same directory. The final directory structure is as follows.

./data/FiveK
    input/
        JPG/480p/                # 8-bit sRGB inputs
        PNG/480p_16bits_XYZ_WB/  # 16-bit XYZ inputs
    expertC/JPG/480p/            # 8-bit sRGB groundtruths
    train.txt
    test.txt
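
A few lines of Python (our own sketch; adjust ROOT if your data lives elsewhere) can verify the layout before training:

import os

ROOT = './data/FiveK'
for rel in ['input/JPG/480p', 'input/PNG/480p_16bits_XYZ_WB',
            'expertC/JPG/480p', 'train.txt', 'test.txt']:
    path = os.path.join(ROOT, rel)
    print('[ok]  ' if os.path.exists(path) else '[miss]', path)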
  • PPR10K

We download the 360p dataset (train_val_images_tif_360p and masks_360p) from PPR10K to conduct our experiments.

After downloading the dataset, please unzip the images into the ./data/PPR10K directory. Please also copy the annotation files from ./adaint/annfiles/PPR10K into the same directory. The expected directory structure is as follows.

data/PPR10K
    source/       # 16-bit sRGB inputs
    source_aug_6/ # 16-bit sRGB inputs plus 5 augmented versions
    masks/        # human-region masks
    target_a/     # 8-bit sRGB groundtruths retouched by expert a
    target_b/     # 8-bit sRGB groundtruths retouched by expert b
    target_c/     # 8-bit sRGB groundtruths retouched by expert c
    train.txt
    train_aug.txt
    test.txt

Usage

General Instruction

  • You can configure experiments by modifying the configuration files in adaint/configs/. Here we briefly describe some critical hyper-parameters (see the illustrative config snippet after this list):

    • model.n_ranks: (int) The number of ranks in the mapping h (denoted as M in the paper).
    • model.n_vertices: (int) The number of sampling points along each lattice dimension (denoted as N in the paper).
    • model.en_adaint: (bool) Whether to use AdaInt. If False, the model degenerates to TPAMI 3D-LUT.
    • model.en_adaint_share: (bool) Whether to share AdaInt among color channels (see Share-AdaInt in the ablation studies).
    • model.backbone: (str) The architecture of the backbone (the mapping f in the paper). Can be either 'tpami' or 'res18'.
  • Execute a command in the following format to train a model (all experiments can be conducted on a single GPU).

python tools/train.py [PATH/TO/CONFIG]
  • Execute a command in the following format to run inference with a pretrained model.
python tools/test.py [PATH/TO/CONFIG] [PATH/TO/MODEL/CHECKPOINT] --save-path [PATH/TO/SAVE/RESULTS]
  • Use MATLAB to calculate the metrics reported in the paper.
cd ./adaint/metrics
(matlab) >> fivek_calculate_metrics([PATH/TO/SAVE/RESULTS], [PATH/TO/GT/IMAGES])
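
To make the hyper-parameters above concrete, here is an illustrative fragment in the MMEditing config style; the type name and all values are placeholders of ours, so consult the actual files in adaint/configs/ for the real settings.

# Hypothetical config excerpt; field names follow the list above,
# the type name and the values are illustrative only.
model = dict(
    type='AiLUT',           # assumed registry name
    n_ranks=3,              # M: number of ranks in the mapping h
    n_vertices=33,          # N: sampling points per lattice dimension
    en_adaint=True,         # False degenerates to TPAMI 3D-LUT
    en_adaint_share=False,  # Share-AdaInt variant from the ablations
    backbone='tpami',       # or 'res18'
)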

Training

  • On FiveK-sRGB (for photo retouching)
python tools/train.py adaint/configs/fivekrgb.py
  • On FiveK-XYZ (for tone mapping)
python tools/train.py adaint/configs/fivekxyz.py
  • On PPR10K (for photo retouching)
python tools/train.py adaint/configs/ppr10k.py

Testing

We provide some pretrained models in ./pretrained/. To conduct testing, please use the following commands:

  • On FiveK-sRGB (for photo retouching)
python tools/test.py adaint/configs/fivekrgb.py pretrained/AiLUT-FiveK-sRGB.pth --save-path [PATH/TO/SAVE/RESULTS]
  • On FiveK-XYZ (for tone mapping)
python tools/test.py adaint/configs/fivekxyz.py pretrained/AiLUT-FiveK-XYZ.pth --save-path [PATH/TO/SAVE/RESULTS]
  • On PPR10K (for photo retouching)
python tools/test.py adaint/configs/ppr10k.py pretrained/AiLUT-PPR10KA-sRGB.pth --save-path [PATH/TO/SAVE/RESULTS]
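
The official metrics come from the MATLAB scripts in adaint/metrics/; for a quick sanity check of saved results you can compute PSNR in Python (our own sketch, so expect small numerical differences from the MATLAB implementation; the paths are illustrative):

import numpy as np
from PIL import Image

def psnr(result_path, gt_path):
    # Both images must have the same resolution (e.g., 480p results vs. 480p GT).
    a = np.asarray(Image.open(result_path), dtype=np.float64)
    b = np.asarray(Image.open(gt_path), dtype=np.float64)
    return 10 * np.log10(255.0 ** 2 / np.mean((a - b) ** 2))

print(psnr('[PATH/TO/SAVE/RESULTS]/a4739.png', 'data/FiveK/expertC/JPG/480p/a4739.jpg'))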

License

This codebase is released under the Apache 2.0 license.

Citation

If you find this repository useful, please consider citing the following paper:

@InProceedings{yang2022adaint,
  title={AdaInt: Learning Adaptive Intervals for 3D Lookup Tables on Real-time Image Enhancement},
  author={Yang, Canqian and Jin, Meiguang and Jia, Xu and Xu, Yi and Chen, Ying},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2022}
}

Acknowledgements

This codebase is built upon the MMEditing toolbox and other open-source projects. We thank their authors for making the source code publicly available.


adaint's Issues

ModuleNotFoundError: No module named 'ailut'

Hi, I get the error in the title when running demo.py.

mmcv 1.5.0
mmedit 0.14.0

How can I solve this?

Installing the wheel with pip install ailut-1.5.0-cp37-cp37m-linux_x86_64.whl also fails, with:
ERROR: ailut-1.5.0-cp37-cp37m-linux_x86_64.whl is not a supported wheel on this platform
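
The "not a supported wheel" error means the interpreter's tags do not match the wheel's cp37-cp37m-linux_x86_64 tag. You can inspect your interpreter with a generic snippet like this (pip debug --verbose also lists the tags your pip accepts):

import platform
import sys

# The wheel only installs on CPython 3.7 under 64-bit Linux;
# any other combination must compile the extension from source.
print('implementation:', platform.python_implementation())          # expect CPython
print('version       :', '.'.join(map(str, sys.version_info[:3])))  # expect 3.7.x
print('platform      :', platform.system(), platform.machine())     # expect Linux x86_64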

Hi, a question about training

Hi, I have a question about training. Since installing mmcv and its dependencies is inconvenient, I added the AdaInt functionality on top of the PPR10K codebase and trained there. The first problem I hit was that the training PSNR gradually dropped from -0.x to -30.x; the paper says to freeze the AdaInt module's gradients first and release them after a few epochs, and after adopting this the PSNR indeed stayed mostly positive. But then other problems appeared: for example, at around epoch 93 the training PSNR gradually fell from 15.6x down to 8.x, which seems very strange. What could be causing this, and how can it be solved?
Another question: at inference time, edges with large color differences still appear easily, producing color banding. Is there any way to address this?
Also, is the PSNR reported in the paper the training PSNR or the test PSNR? When my training PSNR is around 14-something, the average PSNR on the test set is about 23. What is the reason for this?

About Pillow and OpenCV

Hi! Thank you for sharing this amazing work. I'm wondering why you decided to use Pillow for Adobe5K and OpenCV for PPR10K instead of using the same for both datasets. Is it something related with the usage of the pretrained ResNet when training on PPR10K?

How to solve an environment problem when using an NVIDIA 3060?

The first time, I did exactly what the README says and got a compute-capability mismatch error.
The second time, I reinstalled pytorch==1.13 and mmcv==1.7.0 and got an error saying I need to install mmcv>=(1, 3, 0, 0, 0, 0), <=(1, 5, 0, 0, 0, 0).

Welcome update to OpenMMLab 2.0

I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/amFNsyUBvm or add me on WeChat (van-sin) and I will invite you to the OpenMMLab WeChat group.

Here are the OpenMMLab 2.0 repo branches:

                      OpenMMLab 1.0 branch    OpenMMLab 2.0 branch
    MMEngine          -                       0.x
    MMCV              1.x                     2.x
    MMDetection       0.x, 1.x, 2.x           3.x
    MMAction2         0.x                     1.x
    MMClassification  0.x                     1.x
    MMSegmentation    0.x                     1.x
    MMDetection3D     0.x                     1.x
    MMEditing         0.x                     1.x
    MMPose            0.x                     1.x
    MMDeploy          0.x                     1.x
    MMTracking        0.x                     1.x
    MMOCR             0.x                     1.x
    MMRazor           0.x                     1.x
    MMSelfSup         0.x                     1.x
    MMRotate          1.x                     1.x
    MMYOLO            -                       0.x

Attention: please create a new virtual environment for OpenMMLab 2.0.

About the differences from 3D LUT

After reading the code, I noticed an issue:
In the 3D LUT code, a set of basis 3D LUTs is learned during training; at test time, the needed 3D LUT is obtained by a weighted combination of these basis LUTs, which greatly saves inference time because new basis 3D LUTs do not have to be produced for every image.
In your code, however, a set of basis 3D LUTs is computed for every image in both training and testing. This differs from the 3D LUT paper; it will obviously improve the model's results, but it also increases the computational cost. Is this intentional, or an incorrect reproduction?
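
For reference, the per-image fusion both codebases build on can be sketched as follows (our own illustration): the basis LUTs are fixed after training, and only their weighted combination is computed per image.

import torch

M, N = 3, 33                             # M basis LUTs, N vertices per dimension
basis_luts = torch.randn(M, 3, N, N, N)  # learned once, shared across images
weights = torch.rand(M)                  # predicted per image by the backbone

# The per-image LUT is a weighted sum of the basis LUTs; only this small
# fusion is recomputed for each image, not the basis LUTs themselves.
image_lut = (weights.view(M, 1, 1, 1, 1) * basis_luts).sum(dim=0)
print(image_lut.shape)  # torch.Size([3, 33, 33, 33])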

demo running error

Hi, this is great work and I want to test it on my own task, but I failed to run the demo. Could you help me?
my cmd: PYTHONPATH=./ python adaint/demo.py adaint/configs/fivekrgb.py pretrained/AiLUT-FiveK-sRGB.pth resources/a4739.jpg resources/a4739_enhanced.png

error info:
/opt/conda/lib/python3.7/site-packages/mmcv/utils/registry.py:252: UserWarning: The old API of register_module(module, force=False) is deprecated and will be removed, please use the new API register_module(name=None, force=False, module=None) instead.
'The old API of register_module(module, force=False) '
load checkpoint from local path: pretrained/AiLUT-FiveK-sRGB.pth
The model and loaded state dict do not match exactly

missing keys in source state_dict: cnt_iters

Retraining does not reach the pretrained model's PSNR

Hi, I ran

python3 tools/train.py adaint/configs/fivekrgb.py

to retrain the model several times, but the final accuracy falls short of the pretrained model; the PSNR basically hovers around 24.x. Is something going wrong somewhere?


How can I train my own dataset?

Hi, thanks for this great work from DaTaoBao tech!

What I want to know is how to train on my own private dataset. It seems that ground-truth images must be provided in this project? However, my dataset comes without ground-truth images, so how can I work around this?

Thanks!

About Non-uniform 3D LUT Rendering

Hi,
What exactly does the "Non-uniform 3D LUT Rendering" block in the diagram refer to? How does the learned uniform 3D LUT become the non-uniform Sampled 3D LUT? Or should the learned 3D LUT be understood as already non-uniform, i.e., as corresponding to the Non-uniform Lattice? I could not find the corresponding place in the code. I would appreciate an explanation, thanks!

A question about the approach

Dear author, is the core difference from SepLUT reflected in vertices = F.pad(intervals.cumsum(-1), (1, 0), 'constant', 0) and in the later 3D LUT interpolation part, i.e., that the sampling range is non-uniform so the lattice is used more efficiently?

Many thanks; I'm not sure how to understand this.
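
To make the quoted line concrete, a toy example (our own): cumsum turns per-bin interval lengths into increasing coordinates, and the pad prepends the 0 endpoint, producing a non-uniform sampling grid.

import torch
import torch.nn.functional as F

intervals = torch.tensor([0.1, 0.4, 0.3, 0.2])  # predicted lengths, sum to 1
vertices = F.pad(intervals.cumsum(-1), (1, 0), 'constant', 0)
print(vertices)  # tensor([0.0000, 0.1000, 0.5000, 0.8000, 1.0000])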

About PPR10K dataset

Wonderful work! I notice that the authors of PPR10K provide an augmented version of the 8,875 training images, containing 53K images in total. I am wondering whether you trained with the augmented images. Thanks.

About the error maps in the paper

Wonderful work! I see many images in the paper that look like thermal maps, which you call "error maps". I want to know how to draw pictures like these; could you please share the code you used? Thanks.
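
The repository does not include the plotting code here, but a common way to draw such maps (our own sketch; the ground-truth path is hypothetical) is to colorize the per-pixel absolute difference with a thermal colormap:

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

pred = np.asarray(Image.open('resources/a4739_enhanced.png'), np.float32) / 255.0
gt = np.asarray(Image.open('path/to/groundtruth.png'), np.float32) / 255.0  # hypothetical

error = np.abs(pred - gt).mean(axis=-1)  # per-pixel mean absolute error
plt.imshow(error, cmap='jet')            # thermal-style colormap
plt.colorbar()
plt.axis('off')
plt.savefig('error_map.png', bbox_inches='tight')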
