megvii-research / nafnet

The state-of-the-art image restoration model without nonlinear activation functions.

License: Other

Python 100.00%

deblur denoise stereo-super-resolution eccv2022 image-deblurring image-denoising image-restoration low-level-vision pytorch

nafnet's Introduction

NAFNet: Nonlinear Activation Free Network for Image Restoration

The official PyTorch implementation of the paper Simple Baselines for Image Restoration (ECCV2022)

Liangyu Chen, Xiaojie Chu, Xiangyu Zhang, Jian Sun

Although there have been significant advances in the field of image restoration recently, the system complexity of the state-of-the-art (SOTA) methods is increasing as well, which may hinder the convenient analysis and comparison of methods. In this paper, we propose a simple baseline that exceeds the SOTA methods and is computationally efficient. To further simplify the baseline, we reveal that the nonlinear activation functions, e.g. Sigmoid, ReLU, GELU, Softmax, etc. are not necessary: they could be replaced by multiplication or removed. Thus, we derive a Nonlinear Activation Free Network, namely NAFNet, from the baseline. SOTA results are achieved on various challenging benchmarks, e.g. 33.69 dB PSNR on GoPro (for image deblurring), exceeding the previous SOTA 0.38 dB with only 8.4% of its computational costs; 40.30 dB PSNR on SIDD (for image denoising), exceeding the previous SOTA 0.28 dB with less than half of its computational costs.
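
As a rough illustration of "replaced by multiplication": the paper names this operation SimpleGate, which splits a feature map into two halves along the channel dimension and multiplies them element-wise. The sketch below uses plain Python lists; the function name `simple_gate` and the list layout are illustrative assumptions, not the repository's PyTorch module.

```python
def simple_gate(channels):
    """Split a list of channels in half and multiply the halves element-wise.

    `channels` is a list of C channels, each a flat list of pixel values.
    Plain-Python illustration only (hypothetical helper, not the repo's code).
    """
    if len(channels) % 2 != 0:
        raise ValueError("SimpleGate needs an even number of channels")
    half = len(channels) // 2
    first, second = channels[:half], channels[half:]
    return [
        [a * b for a, b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(first, second)
    ]


# Four channels with two pixels each become two channels.
features = [[1.0, 2.0], [3.0, 4.0], [0.5, 0.5], [2.0, -1.0]]
gated = simple_gate(features)
```

The output has half as many channels as the input, and no activation function is applied: the only nonlinearity is the product of the two halves.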


Denoise	Deblur	StereoSR(NAFSSR)

News

2022.08.02 The Baseline, including the pretrained models and train/test configs, is available now.

2022.07.03 Related work, Improving Image Restoration by Revisiting Global Information Aggregation (TLC, a.k.a. TLSC in our paper), is accepted by ECCV2022 🎉 . Code is available at https://github.com/megvii-research/TLC.

2022.07.03 Our paper is accepted by ECCV2022 🎉

2022.06.19 NAFSSR (as a challenge winner) is selected for an ORAL presentation at CVPR 2022, NTIRE workshop 🎉 Presentation video, slides and poster are available now.

2022.04.15 The NAFNet-based Stereo Image Super-Resolution solution (NAFSSR) won 1st place in the NTIRE 2022 Stereo Image Super-resolution Challenge! For training/evaluation instructions, see here.

Installation

This implementation is based on BasicSR, an open-source toolbox for image/video restoration tasks, and on HINet.

python 3.9.5
pytorch 1.11.0
cuda 11.3

git clone https://github.com/megvii-research/NAFNet
cd NAFNet
pip install -r requirements.txt
python setup.py develop --no_cuda_ext

Quick Start

Image Denoise Colab Demo:
Image Deblur Colab Demo:
Stereo Image Super-Resolution Colab Demo:
Single Image Inference Demo:
- Image Denoise:
```
python basicsr/demo.py -opt options/test/SIDD/NAFNet-width64.yml --input_path ./demo/noisy.png --output_path ./demo/denoise_img.png
```
- Image Deblur:
```
python basicsr/demo.py -opt options/test/REDS/NAFNet-width64.yml --input_path ./demo/blurry.jpg --output_path ./demo/deblur_img.png
```
- --input_path: the path of the degraded image
- --output_path: the path to save the predicted image
- pretrained models should be downloaded.
- Integrated into Huggingface Spaces 🤗 using Gradio. Try out the Web Demo for single image restoration
Stereo Image Inference Demo:
- Stereo Image Super-resolution:
```
python basicsr/demo_ssr.py -opt options/test/NAFSSR/NAFSSR-L_4x.yml \
--input_l_path ./demo/lr_img_l.png --input_r_path ./demo/lr_img_r.png \
--output_l_path ./demo/sr_img_l.png --output_r_path ./demo/sr_img_r.png
```
- --input_l_path: the path of the degraded left image
- --input_r_path: the path of the degraded right image
- --output_l_path: the path to save the predicted left image
- --output_r_path: the path to save the predicted right image
- pretrained models should be downloaded.
- Integrated into Huggingface Spaces 🤗 using Gradio. Try out the Web Demo for stereo image super-resolution
Try the web demo with all three tasks here:

Results and Pre-trained Models

| name | Dataset | PSNR | SSIM | pretrained models | configs |
|---|---|---|---|---|---|
| NAFNet-GoPro-width32 | GoPro | 32.8705 | 0.9606 | gdrive \| Baidu Netdisk | train \| test |
| NAFNet-GoPro-width64 | GoPro | 33.7103 | 0.9668 | gdrive \| Baidu Netdisk | train \| test |
| NAFNet-SIDD-width32 | SIDD | 39.9672 | 0.9599 | gdrive \| Baidu Netdisk | train \| test |
| NAFNet-SIDD-width64 | SIDD | 40.3045 | 0.9614 | gdrive \| Baidu Netdisk | train \| test |
| NAFNet-REDS-width64 | REDS | 29.0903 | 0.8671 | gdrive \| Baidu Netdisk | train \| test |
| NAFSSR-L_4x | Flickr1024 | 24.17 | 0.7589 | gdrive \| Baidu Netdisk | train \| test |
| NAFSSR-L_2x | Flickr1024 | 29.68 | 0.9221 | gdrive \| Baidu Netdisk | train \| test |
| Baseline-GoPro-width32 | GoPro | 32.4799 | 0.9575 | gdrive \| Baidu Netdisk | train \| test |
| Baseline-GoPro-width64 | GoPro | 33.3960 | 0.9649 | gdrive \| Baidu Netdisk | train \| test |
| Baseline-SIDD-width32 | SIDD | 39.8857 | 0.9596 | gdrive \| Baidu Netdisk | train \| test |
| Baseline-SIDD-width64 | SIDD | 40.2970 | 0.9617 | gdrive \| Baidu Netdisk | train \| test |

Image Restoration Tasks

| Task | Dataset | Train/Test Instructions | Visualization Results |
|---|---|---|---|
| Image Deblurring | GoPro | link | gdrive \| Baidu Netdisk |
| Image Denoising | SIDD | link | gdrive \| Baidu Netdisk |
| Image Deblurring with JPEG artifacts | REDS | link | gdrive \| Baidu Netdisk |
| Stereo Image Super-Resolution | Flickr1024+Middlebury | link | gdrive \| Baidu Netdisk |

Citations

If NAFNet helps your research or work, please consider citing NAFNet.

@article{chen2022simple,
  title={Simple Baselines for Image Restoration},
  author={Chen, Liangyu and Chu, Xiaojie and Zhang, Xiangyu and Sun, Jian},
  journal={arXiv preprint arXiv:2204.04676},
  year={2022}
}

If NAFSSR helps your research or work, please consider citing NAFSSR.

@InProceedings{chu2022nafssr,
    author    = {Chu, Xiaojie and Chen, Liangyu and Yu, Wenqing},
    title     = {NAFSSR: Stereo Image Super-Resolution Using NAFNet},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2022},
    pages     = {1239-1248}
}

Contact

If you have any questions, please contact [email protected] or [email protected]

nafnet's Issues

Performance on other datasets

Hi,

Thanks a lot for your awesome work! It is a great help to my research.

May I ask about the performance on other low-level tasks, such as deraining and dehazing?

Thanks!

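Question about the LayerNorm dimensions

Hello, and thank you for open-sourcing this work. I noticed that the LayerNorm (LN) module in your NAFBlock computes the mean and variance over the channel (C) dimension, whereas conventional LN seems to compute them over the C, H, and W dimensions. Is there a significant performance difference between the two?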
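
To make the two normalization choices in this question concrete, here is a minimal plain-Python sketch. It contrasts per-pixel normalization across channels with per-sample normalization across C, H, and W together. The helper names (`normalize`, `channel_wise_ln`, `chw_ln`) and the `x[channel][pixel]` layout are hypothetical, and the learnable scale and bias are left out; this is not the repository's implementation and it says nothing about which variant performs better.

```python
import math


def normalize(values, eps=1e-6):
    """Return values shifted to zero mean and scaled to unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def channel_wise_ln(x):
    """Normalize each pixel across channels only. x[c][p]: channel c, pixel p."""
    num_c, num_p = len(x), len(x[0])
    out = [[0.0] * num_p for _ in range(num_c)]
    for p in range(num_p):
        column = normalize([x[c][p] for c in range(num_c)])
        for c in range(num_c):
            out[c][p] = column[c]
    return out


def chw_ln(x):
    """Normalize one sample across all channels and pixels together."""
    num_p = len(x[0])
    flat = normalize([v for channel in x for v in channel])
    return [flat[c * num_p:(c + 1) * num_p] for c in range(len(x))]


# Two channels, three pixels; the third pixel is much brighter than the others.
x = [[1.0, 2.0, 100.0],
     [3.0, 4.0, 300.0]]
per_pixel = channel_wise_ln(x)
per_sample = chw_ln(x)
```

In the per-pixel variant every pixel ends up with zero mean across its channels, so the bright third pixel does not influence the other two. In the per-sample variant the bright pixel dominates the shared statistics.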

about the MACs of the model

Hi,
Thanks for the excellent work!
Could you share the script for counting the model's MACs?

Ablation Study: number of NAFBlocks

In your ablation study, you measured performance for different numbers of NAFBlocks.
I would like to know whether this "number" refers to the NAFBlocks in each encoder/decoder layer or only to those in certain layers.

LayerNorm's position in NAFBlock

Hi,

In "Attention Is All You Need", LN is placed at the tail of the attention and feed-forward blocks,
but in NAFBlock you put LN at the head of each of them.
Is there a theory or design idea behind this, or was it chosen based on experimental results?

Why doesn't the README provide sufficient info?

I ran into many questions when running this repo's code, for example when running inference with basicsr/demo.py to check the models' results. To my surprise, readme.md does not provide useful information about the code, not even the simplest demo.
The Colab notebooks and .pth files are all hosted on Google services, but in China, as you know, access to these websites is difficult.
So would you please consider Chinese users? Could you store the files on Chinese drives such as Baidu Netdisk, or provide more details in readme.md? The code and tools are meant to be used by users, but what you have provided is disappointing.

Img size

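Thank you very much for your work. Does the training image size have to be 512? Do different input sizes affect the training results? At test time, stitching the tiles of a large image back together produces blocky artifacts. Could you share any good ideas for removing them? Thanks.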
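
One common way to soften tile seams, offered here as a generic technique and not as the authors' recommendation, is to run the model on overlapping tiles and average the predictions wherever tiles overlap. The 1D sketch below shows the bookkeeping; `tile_starts`, `stitch`, and the stand-in `model` callable are hypothetical names, and a real image would need the same logic applied along both height and width.

```python
def tile_starts(length, tile, stride):
    """Start offsets of tiles of size `tile` that together cover [0, length)."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile is aligned to the end
    return starts


def stitch(signal, model, tile=8, overlap=4):
    """Run `model` on overlapping tiles and average overlapping outputs."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    stride = tile - overlap
    total = [0.0] * len(signal)
    count = [0] * len(signal)
    for start in tile_starts(len(signal), tile, stride):
        out = model(signal[start:start + tile])
        for i, value in enumerate(out):
            total[start + i] += value
            count[start + i] += 1
    return [t / c for t, c in zip(total, count)]


# With an identity "model", stitching must reproduce the input exactly.
signal = [float(i) for i in range(21)]
restored = stitch(signal, model=lambda patch: list(patch))
```

Uniform averaging is the simplest choice; weighting each tile's center more heavily than its border is a frequent refinement when seams remain visible.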

Dear authors, a question about the training loss being negative

About Paper Results

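Hello! I have a question about the results comparison in your paper, specifically Table 6.
Why does NAFNet's final test result on GoPro change from 32.85 to 33.69? I believe I have read the whole paper carefully. Is it because the experiments use the same TLSC method as MPRNet-Local?

Looking forward to your reply!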

About the training speed

Thank you for your wonderful work. I use four 3090 GPUs with a batch size of 4 per GPU. Training NAFNet-width32 on SIDD took about 15 hours, and training NAFNet-width64 took about 3 days. Is this speed normal?

Block numbers in different layers

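Hello, and thank you very much for your work. How were the numbers of NAFBlocks in the different layers of NAFNet chosen? The experiment settings given in the project's options files differ between datasets: for example, SIDD uses {[2, 2, 4, 8], 12, [2, 2, 2, 2]}, while GoPro uses {[1, 1, 1, 28], 1, [1, 1, 1, 1]}. Is there any design rationale behind this that you could share? Thanks!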

GMAC results

We used NAFNet-main/basicsr/models/archs/NAFNet_arch.py to calculate GMACs under this configuration:
```
width = 64
enc_blks = [2, 2, 4, 8]
middle_blk_num = 12
dec_blks = [2, 2, 2, 2]
```
The results:
```
enc blks [2, 2, 4, 8] middle blk num 12 dec blks [2, 2, 2, 2] width 64
start . 1038.19140625
network .. 1485.99609375
end .. 12193.1953125
254.37 115.9
total .. 13120.3953125
```
When we set width to 32:
```
width = 32
enc_blks = [2, 2, 4, 8]
middle_blk_num = 12
dec_blks = [2, 2, 2, 2]
```
The results:
```
enc blks [2, 2, 4, 8] middle blk num 12 dec blks [2, 2, 2, 2] width 32
start . 1037.89453125
network .. 1154.34375
end .. 6870.11328125
64.9 29.1
total .. 7102.91328125
```
In my understanding, the SIDD result reported in the paper (PSNR: 40.30, SSIM: 0.962, GMACs: 65) uses width=64. Is this difference caused by my misunderstanding or by a settings problem?
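
One possible source of the gap, stated as an assumption because the post does not say which input size was used for counting: for a fully convolutional network, MACs grow linearly with the number of input pixels, so a count taken at 512×512 is four times the count at 256×256. The helper `rescale_gmacs` below is a hypothetical illustration of that arithmetic, not part of the repository.

```python
def rescale_gmacs(gmacs, measured_size, target_size):
    """Rescale a MACs count between square input sizes.

    Assumes a fully convolutional network, whose MACs grow linearly
    with the number of input pixels.
    """
    return gmacs * (target_size ** 2) / (measured_size ** 2)


# Hypothetical assumption: the numbers in the post were measured at 512x512.
width64_at_256 = rescale_gmacs(254.37, measured_size=512, target_size=256)
width32_at_256 = rescale_gmacs(64.9, measured_size=512, target_size=256)
print(round(width64_at_256, 2), round(width32_at_256, 1))
```

Under that assumption the width-64 figure lands close to the 65 GMACs quoted in the post, but the input size actually used for the measurement would need to be confirmed before drawing a conclusion.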

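Converting the model to ONNX

The ONNX conversion problem has been solved and the code has been updated. It is shared here for everyone's reference; if it infringes on anything, please contact me and I will remove it.
https://blog.csdn.net/TF666666/article/details/125678629?spm=1001.2014.3001.5502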

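About the SIDD test results

I read the paper and trained the width-32 network myself, and my results match the paper. However, these appear to be numbers on the SIDD validation set. Are they not the benchmark results?

Looking forward to your reply!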

About droppath in NAFSSR

Hi, thank you for your great work. I have a question about the implementation of droppath in NAFSSR.

Your droppath is quite different from the timm/tfa implementations. Those two implementations drop individual samples within a batch, but I think your implementation drops the whole batch when training the network. I wonder whether there is a reason for this.
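
The difference described in this post can be sketched in plain Python. Both functions below are hypothetical illustrations of the two behaviors (an independent decision per sample versus one decision for the whole batch); neither is copied from timm, tfa, or this repository, and rescaling kept outputs by `1 / keep_prob` in both variants is an assumption made here for symmetry.

```python
import random


def drop_path_per_sample(batch, drop_prob, rng):
    """Drop each sample's residual branch independently."""
    keep_prob = 1.0 - drop_prob
    out = []
    for sample in batch:
        keep = rng.random() < keep_prob
        # Kept samples are rescaled so the expected value is unchanged.
        scale = 1.0 / keep_prob if keep else 0.0
        out.append([v * scale for v in sample])
    return out


def drop_path_whole_batch(batch, drop_prob, rng):
    """Make one keep/drop decision and apply it to every sample in the batch."""
    keep_prob = 1.0 - drop_prob
    keep = rng.random() < keep_prob
    scale = 1.0 / keep_prob if keep else 0.0
    return [[v * scale for v in sample] for sample in batch]


rng = random.Random(0)
batch = [[1.0, 1.0] for _ in range(8)]
per_sample = drop_path_per_sample(batch, drop_prob=0.5, rng=rng)
whole_batch = drop_path_whole_batch(batch, drop_prob=0.5, rng=rng)
```

Both variants keep the branch output unchanged in expectation. They differ in variance within a training step: the per-sample version mixes kept and dropped samples in one batch, while the whole-batch version makes every sample in a step share the same decision.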

add web demo/models to Huggingface

Hi, would you be interested in adding NAFNet to Hugging Face? The Hub offers free hosting, and it would make your work more accessible and visible to the rest of the ML community. Models/datasets/spaces(web demos) can be added to a user account or organization similar to github.

Example from other organizations:
Keras: https://huggingface.co/keras-io
Microsoft: https://huggingface.co/microsoft
Facebook: https://huggingface.co/facebook

Example spaces with repos:
github: https://github.com/salesforce/BLIP
Spaces: https://huggingface.co/spaces/salesforce/BLIP

github: https://github.com/facebookresearch/omnivore
Spaces: https://huggingface.co/spaces/akhaliq/omnivore

and here are guides for adding spaces/models/datasets to your org

How to add a Space: https://huggingface.co/blog/gradio-spaces
how to add models: https://huggingface.co/docs/hub/adding-a-model
uploading a dataset: https://huggingface.co/docs/datasets/upload_dataset.html

Please let us know if you would be interested and if you have any questions, we can also help with the technical implementation.

Exporting the model to ONNX

Can NAFNet be exported as an ONNX model?

Custom implementation of AvgPool2d not used

Hello,

Thanks for your work!

I don't really understand how TLSC works in your project and how it is enabled only during test time.

When instantiating an AvgPool2d layer:

kernel_size is always set to None ;
train_size is always set to (N, C, H, W) ;
base_size is always set to (H * 1.5, W * 1.5).

Then, in the forward pass, kernel_size is set to (x.shape[2] * 1.5, x.shape[3] * 1.5) (as base_size // train_size == 1.5).

Thus, the condition self.kernel_size[0] >= x.size(-2) and self.kernel_size[1] >= x.size(-1) (code) always holds true.

Therefore, AvgPool2d is always returning F.adaptive_avg_pool2d(x, 1) which is the default PyTorch implementation.

Am I missing something here?

Thanks.
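
One reading that would reconcile the observation above, offered as an assumption and not as a statement about the repository's code: if the kernel size is computed only once, on the first forward pass (for example with a dummy input of the training size), and cached afterwards, then larger test-time inputs would no longer satisfy the "kernel covers the whole input" condition. The class below is a hypothetical stand-in that only reports which branch would run; it performs no pooling.

```python
class LocalAvgPoolSketch:
    """Tracks only which pooling branch would run; no real pooling is done."""

    def __init__(self, train_size, base_size):
        self.train_size = train_size  # (N, C, H, W)
        self.base_size = base_size    # (H * 1.5, W * 1.5) in the post
        self.kernel_size = None

    def branch(self, height, width):
        if self.kernel_size is None:
            # Assumption: computed once from the first input, then cached.
            self.kernel_size = (
                height * self.base_size[0] // self.train_size[-2],
                width * self.base_size[1] // self.train_size[-1],
            )
        if self.kernel_size[0] >= height and self.kernel_size[1] >= width:
            return "global"
        return "local"


pool = LocalAvgPoolSketch(train_size=(1, 3, 256, 256), base_size=(384, 384))
first = pool.branch(256, 256)    # dummy pass at the training size
later = pool.branch(720, 1280)   # larger test-time input
```

With a cached kernel of 384×384, the training-size input takes the global branch while the 720×1280 input takes the local one. Whether the repository actually caches the kernel this way should be checked against its source.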

Denoise, testing

pretrained plainnet and baselinenet

Hello and congratulations on your ECCV22 acceptance!
I wonder if it's convenient for you to share the pretrained checkpoints of PlainNet and BaselineNet in your paper, many thanks :)

Inference/Quick start for stereo image super_resolution

Kindly add an inference script for the stereo image super-resolution task as well.
Currently, there are scripts only for image denoising and image deblurring.

Question About LayerNorm Implementation

Hi, your work is really inspiring and interesting!
I am wondering why you re-implement LayerNorm (2D), instead of using PyTorch modules like nn.GroupNorm(1, channels)?
Are there some differences between them (e.g. final performance or function differences)?

Results on the DND dataset

Have the authors submitted and published results on noise.visinf.tu-darmstadt.de, or do they plan to?

Has NAFNet been accepted?

About JPEG artifact dataset (REDS)

I would like to ask about your REDS dataset with JPEG artifacts:
what JPEG compression rate (or quality setting) was used for the preprocessing?

Need help with custom datasets

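Thank you for your outstanding work! I would like to use deblurring to turn the image above into the image below.


The training set consists of many images of this type, 1024*1024 in size. Which parts of the network should I modify?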

CUDA out of memory

I got the following error when running the Replicate demo (Deblurring):

CUDA out of memory. Tried to allocate 5.41 GiB (GPU 0; 14.76 GiB total capacity; 9.14 GiB already allocated; 4.45 GiB free; 9.34 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

BTW, Colab gives the same error.

ask for advice

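Your work is beautiful!
I would like to ask how you made the GIFs on your home page. Is there a tutorial or something similar? I would like to learn to make one.
Thanks, it is really cool.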

Hi

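Where can I find the paper "NAFSSR: Stereo Image Super-Resolution Using NAFNet"?

Testing large images with Denoise

Hello, how does NAFNet test large images in denoising? Do they need to be cut into multiple small images before testing?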

LayerNorm dimension question

Dear author, I would like to know why weight decay is not used to train this model. Wouldn't that lead to overfitting?

NAFNet-error

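Hello, this is great work!
I ran into a problem when retraining the network: when generating the GoPro crops, the generated folder contains 0 cropped images. What could be the reason?
Thank you for your help!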

Random image as result of deblurring

Running the following command python3 basicsr/demo.py -opt options/test/GoPro/NAFNet-width64.yml --input_path ./demo/noisy.png --output_path ./demo/denoise_img2.png

Results in the following image

Using CPU instead of GPU on Macbook Pro, downloaded NAFNet-GoPro-width64.pth from baidupan, md5 is 7b7f519ce4203d701dc026ff0c3fd6e0

Testing Picture in MC-Blur-Dataset has obvious artifact

Hi, I used your pretrained models (NAFNet-GoPro-width32/64) with the code and environment settings provided on your GitHub to process pictures from the MC-Blur-Dataset, and many of the results show obvious artifacts. Could you check whether this matches what you see in your own experiments?

About the pix_loss

Hi, thanks for your great work! I am training on the DPDD dataset to restore defocused images, and I get the following pix_loss:
2022-06-03 03:01:12,848 INFO: [NAFNe..][epoch:200, iter: 70,400, lr:(7.242e-04,)] [eta: 15:02:14, time (data): 0.399 (0.008)] l_pix: -2.8745e+01
2022-06-03 03:02:35,619 INFO: [NAFNe..][epoch:201, iter: 70,600, lr:(7.228e-04,)] [eta: 15:00:49, time (data): 0.403 (0.009)] l_pix: -2.9588e+01
2022-06-03 03:03:57,277 INFO: [NAFNe..][epoch:202, iter: 70,800, lr:(7.214e-04,)] [eta: 14:59:20, time (data): 0.403 (0.009)] l_pix: -2.8007e+01
2022-06-03 03:05:18,126 INFO: [NAFNe..][epoch:202, iter: 71,000, lr:(7.200e-04,)] [eta: 14:57:50, time (data): 0.402 (0.008)] l_pix: -2.7306e+01
I wonder whether this value (l_pix) is normal. Could you please share your training log with me? Many thanks!

For StereoSR Task, How do I prepare the training dataset?

Hi, thanks for your wonderful work. May I know how to prepare the dataset for training on the SSR task? It seems that there are 298143 images in patches_x2, but I don't know how to generate them.

Thanks again! Looking forward to your help.

enc_blk_nums middle_blk_num dec_blk_nums

Do different tasks require different settings?

About Raw Image Denoising

Hi, thanks for your excellent work! I have recently become interested in raw image denoising, but the code you provided does not include anything for it. Can you provide the code for raw image denoising? I would like to know how you obtain the training data, how you add the noise to the clean data, and the details of your training, since you get a better result than PMRID, which only provides test code.

train on our dataset

Can you provide complete code for training the network on our own dataset?

Releasing training code

Hi,
Thank you for your impressive work. Are you going to publish your training code? If yes, when will they be available?

SimpleGate and SCA

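Hello, I am glad to see MEGVII's new work on low-level tasks. After running the demo, I tried networks built from NAFBlocks on the ImageNet classification task, but the results were all rather poor. Have you tried these network improvements on high-level tasks, or will related work be shared later? Thank you very much.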

Error occured during inference using HuggingFace Demo

The testcase used for inference: ![out-of-focus](https://user-images.githubusercontent.com/30183023/178232495-17ac4deb-4ad2-4c4a-bbe2-b4a830c37afe.png)

PSNR loss and L1 Loss

I want to know the quantitative comparisons between PSNR loss and L1 loss.

Since PSNR loss is adopted in NAFNet and HINet, I guess the PSNR loss achieves the better performance.

By the way, why is PSNR loss better? Is there any gradient or optimization analysis?

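About deblurring with NAFNet-REDS-width64

Hello authors,
When testing the demo, some blurry photos are processed very well, but other photos show colorful mosaic artifacts like the one below. Do you know what the problem is?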

About the folder ./datasets/GoPro/train/blur_crops

This is a problem when I try to run gopro.py. The program tells me that this folder already exists, but in fact it doesn't. I don't know if this is os.path.exists() or a cache problem. Can someone answer it? Thank you!!

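Questions about the loss

Question 1: On line 225 of image_restoration_model.py there is l_total = l_total + 0. * sum(p.sum() for p in self.net_g.parameters()). Zero times anything is 0, so what is this operation for?
Question 2: About PSNRLoss: self.scale * torch.log(((pred - target) ** 2).mean(dim=(1, 2, 3)) + 1e-8).mean(), i.e. 10/ln(10) * ln(MSE). MaxValue is 1 here, right? In the PSNR formula, MSE is in the denominator, so why is it in the numerator here? If it is in the numerator, the loss is negative. How does the gradient update work then?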
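
The relationship asked about in Question 2 can be checked numerically. With a maximum value of 1, PSNR = 10·log10(1 / MSE) = −10·log10(MSE), and 10/ln(10) · ln(MSE) is the same quantity as 10·log10(MSE), so the quoted expression equals −PSNR (up to the small epsilon). The sketch below uses plain Python on flat lists; the function names are hypothetical, and a loss weight of 1 is assumed.

```python
import math


def psnr_loss(pred, target, eps=1e-8):
    """10 / ln(10) * ln(MSE + eps) for one image given as flat lists."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    return 10.0 / math.log(10.0) * math.log(mse + eps)


def psnr(pred, target, max_value=1.0):
    """Standard PSNR in dB."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    return 10.0 * math.log10(max_value ** 2 / mse)


target = [0.2, 0.4, 0.6, 0.8]
pred = [0.21, 0.39, 0.61, 0.79]   # every pixel is off by 0.01
loss = psnr_loss(pred, target)
quality = psnr(pred, target)
```

Here `loss` is negative and `loss + quality` is close to zero. A negative loss is therefore expected whenever MSE is below 1; gradient descent only needs the loss to decrease, so minimizing this expression is the same as maximizing PSNR.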

cannot import name 'create_dataloader'

Hi, thanks for your excellent work.
I have successfully installed basicsr following your instruction. However, when I run train.py, the following error occurs:
Traceback (most recent call last):
  File "basicsr/train.py", line 17, in <module>
    from basicsr.data import create_dataloader, create_dataset
ImportError: cannot import name 'create_dataloader'
Do you have any advice?

Metrics reproduced and those reported in the paper do not match

Hi,

Thanks for your awesome work! When I train the model with the official config GoPro/NAFNet-width32.yml, I get a PSNR of 38.6+ and an SSIM of 0.96+. Why are the results in the paper only 32+/33+? What leads to this difference?

Best.