icandle / camixersr

CAMixerSR: Only Details Need More “Attention” (CVPR 2024)

Home Page: https://arxiv.org/abs/2402.19289

License: Apache License 2.0

Python 100.00%

camixersr's Issues

FLOPs

Why does the model require far more computation than reported in the paper when I change the input shape from [1, 3, 32, 32] to [1, 3, 320, 180] in the original code?
[attached screenshot: computation]
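
For reference, a minimal sketch of how the cost at that input size could be measured, assuming the thop counter and the CAMixerSR constructor shown in the ONNX-export issue further down (n_feats and scale are assumptions; adjust to the model actually measured). Different counters also handle custom or mask-dependent operations differently, which can itself explain gaps versus the paper's numbers.

import torch
from thop import profile  # pip install thop; any FLOPs counter can be substituted

import archs.CAMixerSR_arch as arch

# Constructor arguments mirror the ONNX-export snippet below and are assumptions.
model = arch.CAMixerSR(n_feats=60, scale=4).eval()

x = torch.randn(1, 3, 320, 180)  # the LR input size in question; pad to a multiple
                                 # of the window size if the forward pass requires it
with torch.no_grad():
    macs, params = profile(model, inputs=(x,))
print(f'MACs: {macs / 1e9:.2f} G, Params: {params / 1e6:.3f} M')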

Which module does the core work?

Could you tell me which module in this project is the one that mainly makes it work? My need may be different from yours: your method can separate blank regions from regions that need attention, but in my case that distinction cannot be made, and content may need to be restored even in the blank regions. How can I solve this? Are there any papers you would recommend?

Question about the Gumbel Softmax code.

Hi, authors.

Sorry to bother you. I have tried your code and found that the one-hot vector with Gumbel softmax is generated via linear → softmax → F.gumbel_softmax. However, in the code implementation of DynamicViT, the one-hot vector is generated via linear → log_softmax → F.gumbel_softmax. Is there a difference between the two, and can it affect performance?

Thx.
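
For context, a small self-contained comparison of the two variants (not taken from either repo). As far as I understand, F.gumbel_softmax treats its input as unnormalized log-probabilities, so feeding it log_softmax(raw) samples from exactly softmax(raw), whereas feeding it softmax(raw) (values in [0, 1]) effectively flattens the distribution, so the two pipelines are not equivalent in general.

import torch
import torch.nn.functional as F

torch.manual_seed(0)
raw = torch.randn(1, 5) * 3.0  # toy routing scores from a linear layer

# Variant A (as described above for this repo): softmax -> gumbel_softmax
probs = F.softmax(raw, dim=-1)
onehot_a = F.gumbel_softmax(probs, tau=1.0, hard=True, dim=-1)

# Variant B (DynamicViT): log_softmax -> gumbel_softmax
log_probs = F.log_softmax(raw, dim=-1)
onehot_b = F.gumbel_softmax(log_probs, tau=1.0, hard=True, dim=-1)

# Variant B draws from softmax(raw); variant A effectively draws from
# softmax(softmax(raw)), a much flatter distribution over the 5 classes.
print(probs, F.softmax(probs, dim=-1))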

Cannot find basicsr.utils

I tried to run the files following the README and got an error.
The codes/basicsr folder does not contain utils.

Traceback (most recent call last):
  File "/mnt/CAMixerSR/codes/basicsr/test.py", line 4, in <module>
    import archs
  File "/mnt/CAMixerSR/codes/basicsr/archs/__init__.py", line 5, in <module>
    from basicsr.utils import get_root_logger, scandir
ModuleNotFoundError: No module named 'basicsr'
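
A guess rather than an official fix: the repo appears to rely on the pip-installed BasicSR package for basicsr.utils, basicsr.archs.arch_util and so on (the training issue below lists BasicSR 1.4.2 in its environment), so installing it with pip install basicsr may resolve the import. A quick way to check whether the package is visible:

import importlib.util

spec = importlib.util.find_spec('basicsr')
print('basicsr found at:', spec.origin if spec else 'not installed')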

Can't find the file

The flow_warp function imported from basicsr.archs.arch_util does not exist.

AssertionError: 41033_HR_x4.png is not in lq_paths.

Hello, when I run the training code I get the bug shown in the title. Shouldn't the HR image be among the HR paths? Why is it checking whether this HR image is in lq_paths? How can I solve this problem? Thanks.

Question About Training Details

Hello author,

I want to reproduce the result of the lightweight SR x4 task, and I try to follow the training process using 2 GPUs with a batch size of 16 on each. The window_size is 16x16 and feature_dim is 60.

  • My environment version information is:
BasicSR: 1.4.2
PyTorch: 2.1.1+cu121
TorchVision: 0.16.1+cu121
  • Here is my training config file: train_example.txt.txt

  • Because of a "cannot find the 'CAMixerSR' architecture" error, I modified train.py as follows:

import os.path as osp
import archs   # imported so the 'CAMixerSR' architecture can be found
import models  # imported so the custom model can be found
from basicsr.train import train_pipeline

if __name__ == '__main__':
    root_path = osp.abspath(osp.join(__file__, osp.pardir))
    train_pipeline(root_path)
  • The training script is:
CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 --master_port 29500 basicsr/train.py \
 -opt /home/jiachen/CAMixerSR/codes/options/train/train_example.yml \
 --launcher pytorch

I trained for 500k iterations and the best result was obtained at 440k iterations:
[attached screenshot: result]

The performance is not as good as the pretrained model. Is there anything that I missed? Any kind of help is appreciated. I'm looking forward to your reply!

Thanks,
jiachen

How do the offsets work?

Hello authors, I think your work is excellent! However, I am confused about how the offset maps (offsets) in the paper work: why does warping the input with the offsets produced by a few convolution layers in the Predictor yield windows that contain more information? k = x + flow_warp(x, offsets.permute(0, 2, 3, 1), interp_mode='bilinear', padding_mode='border'). Also, the implementation of flow_warp() mentions optical flow. How should the optical flow be understood here? As far as I know, computing optical flow requires adjacent frames, so how can the offsets be interpreted as optical flow? Looking forward to your answer!
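
For readers who cannot locate the function (see the "Can't find the file" issue above), here is a minimal re-implementation in the style of BasicSR's arch_util.flow_warp, written from memory as a sketch rather than copied from the library. It shows why the name "optical flow" appears: the offsets are simply a per-pixel displacement field used to resample x with grid_sample, exactly like warping a frame by a flow field, even though no second frame is involved here.

import torch
import torch.nn.functional as F

def flow_warp_sketch(x, flow, interp_mode='bilinear', padding_mode='border', align_corners=True):
    """Warp x (B, C, H, W) by a per-pixel offset field flow (B, H, W, 2) given in pixels."""
    _, _, h, w = x.size()
    # base sampling grid: the identity mapping (each output pixel samples itself)
    grid_y, grid_x = torch.meshgrid(torch.arange(h, dtype=x.dtype, device=x.device),
                                    torch.arange(w, dtype=x.dtype, device=x.device),
                                    indexing='ij')
    grid = torch.stack((grid_x, grid_y), dim=2)           # (H, W, 2), (x, y) order
    vgrid = grid.unsqueeze(0) + flow                       # shift every pixel by its offset
    # normalize coordinates to [-1, 1] as required by grid_sample
    vgrid_x = 2.0 * vgrid[..., 0] / max(w - 1, 1) - 1.0
    vgrid_y = 2.0 * vgrid[..., 1] / max(h - 1, 1) - 1.0
    vgrid_scaled = torch.stack((vgrid_x, vgrid_y), dim=3)  # (B, H, W, 2)
    return F.grid_sample(x, vgrid_scaled, mode=interp_mode,
                         padding_mode=padding_mode, align_corners=align_corners)

# k = x + flow_warp(x, offsets.permute(0, 2, 3, 1), ...) therefore mixes each pixel with a
# learned, content-dependent neighbour, so the warped keys can pull in extra context.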

How can I use pretrained models?

Excuse me, how can I use the .pth files in the 'pretrained_models' folder, and how can I use your model to generate a high-resolution image?
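
Not an official answer, but a minimal inference sketch of how such a checkpoint is typically used with this architecture. The checkpoint path, n_feats, scale, and the RGB channel order are assumptions; the 'params_ema' key and the CAMixerSR constructor follow the ONNX-export snippet later in this list.

import cv2
import numpy as np
import torch

import archs.CAMixerSR_arch as arch  # the repo's architecture module

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = arch.CAMixerSR(n_feats=60, scale=4).eval().to(device)  # assumed config; match it to the .pth

state = torch.load('pretrained_models/LightSR/CAMixerSR_x4.pth', map_location=device)  # hypothetical path
model.load_state_dict(state.get('params_ema', state), strict=True)

# read LR image, convert BGR -> RGB (assumed training channel order), scale to [0, 1]
img = cv2.cvtColor(cv2.imread('input_lr.png', cv2.IMREAD_COLOR), cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
lr = torch.from_numpy(img.transpose(2, 0, 1)).unsqueeze(0).to(device)  # 1x3xHxW
with torch.no_grad():
    sr = model(lr).clamp_(0, 1)
out = (sr[0].cpu().numpy().transpose(1, 2, 0) * 255.0).round().astype(np.uint8)
cv2.imwrite('output_sr.png', cv2.cvtColor(out, cv2.COLOR_RGB2BGR))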

Figure 4 Predicted Mask m Visualization

Hello author, I would like to ask how the visualization of the predicted mask m in Figure 4 of the paper was produced. I looked at issue 17, but it did not resolve my question. Also, why is the mask used differently during training and testing? In issue 17 you said that directly multiplying the mask with the image is not aligned with the preset block partitioning, and that at test time one can only use batch_index_select to pick out, according to the mask, the patches that use convolutional attention and the patches that use self-attention, and then use batch_index_fill to merge the two kinds of patches back into one image. Looking forward to your reply.

  if self.training or train_mode:
      # training: keep all tokens and apply the (differentiable) mask by element-wise multiplication
      N_ = v.shape[1]
      v1, v2 = v * mask, vs * (1 - mask)
      qk1 = qk * mask
  else:
      # testing: mask is a pair of index sets; gather only the tokens routed to each branch
      idx1, idx2 = mask
      _, N_ = idx1.shape
      v1, v2 = batch_index_select(v, idx1), batch_index_select(vs, idx2)
      qk1 = batch_index_select(qk, idx1)
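
For readers unfamiliar with the two helpers, a minimal gather-based sketch of what a batch_index_select-style function does (an illustration, not copied from the repo): for every sample in the batch it keeps only the tokens whose indices are listed in idx, which is why the test-time branch needs explicit index sets instead of a multiplicative mask.

import torch

def batch_index_select_sketch(x, idx):
    # x: (B, N, C) token features; idx: (B, K) indices of the tokens to keep per sample
    B, N, C = x.shape
    idx_exp = idx.unsqueeze(-1).expand(B, idx.shape[1], C)  # (B, K, C)
    return torch.gather(x, dim=1, index=idx_exp)            # (B, K, C)

tokens = torch.arange(2 * 4 * 3, dtype=torch.float32).reshape(2, 4, 3)
keep = torch.tensor([[0, 2], [1, 3]])
print(batch_index_select_sketch(tokens, keep))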

How to infer on real-world images?

Dear developer,

It seems that you only release the code for inference on the test set.

How to infer on real-world images?

Best wishes.

File missing

In the code, there are many files that cannot be found, such as arch_util, registry, sr_model, etc.
[screenshots attached]

The mask of CAMixer, Visualization.

The module that mainly does the work is CAMixer, but fundamentally this work gains its improvement by applying computation of different complexity to complex and simple regions. If the regions cannot be separated, you may need another criterion: CAMixer classifies regions by their image complexity, and this criterion can be replaced with something else. Without the mask it is essentially an ordinary combination of window self-attention (WSA) and convolution, with similar results but a much higher computational cost.

I am trying to reproduce the effect of the mask described in the paper. Figure 4 shows the probabilities corresponding to image regions of different complexity, but when I reproduce it, the visualized mask looks quite random and does not reflect the complexity of the image; the pred_score produced one step before the mask also does not behave as I expect. My question is: which values did you use in Figure 4 to visualize the probabilities and the mask in relation to image complexity, and from which predictor in the SR model should these values be taken? Looking forward to your reply, thanks.

Mask Visualization

I am trying to reproduce the effect of the mask described in the paper. Figure 4 shows the probabilities corresponding to image regions of different complexity, but when I reproduce it, the visualized mask looks quite random and does not reflect the complexity of the image; the pred_score produced one step before the mask also does not behave as I expect. My question is: which values did you use in Figure 4 to visualize the probabilities and the mask in relation to image complexity, and from which predictor in the SR model should these values be taken? Looking forward to your reply, thanks.
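
Not the authors' visualization code, but a minimal sketch of one way such a figure can be produced, assuming you have already pulled a per-window decision mask out of one predictor (e.g. by returning it from the forward pass in train_mode); the shapes and the 16x16 window size below are assumptions.

import numpy as np
import matplotlib.pyplot as plt

window = 16
mask = (np.random.rand(10, 12) > 0.5)               # placeholder (H/16, W/16) decisions; substitute the real mask
mask_full = np.kron(mask.astype(np.float32),         # repeat each decision over its 16x16 window
                    np.ones((window, window), dtype=np.float32))

plt.imshow(mask_full, cmap='gray', vmin=0.0, vmax=1.0)
plt.title('Predicted mask m')
plt.axis('off')
plt.savefig('mask_vis.png', bbox_inches='tight')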

The output of the network is wrong

Thanks for the shared code.
I tested the pretrained models on the Test2K dataset, but the output is wrong, as in the picture below. I can't find the reason; do you know what happened?

[attached output image: 1279_CAMixerSR_0 50_x2_DF2K]

About the Missing CAMixer-CAMixerSR

I appreciate your work. However, I checked the code and found that the code for CAMixer-CAMixerSR is missing. Could you provide the corresponding code and pre-trained model?

ONNX export

Hi, when I export the model to ONNX, it takes a very long time.

When visualizing it with netron, it also shows:
"This large graph layout might take a very long time to complete."
It looks like the exported graph is very large.
The code I used is as follows:

import argparse
import os 
import torch
import archs.CAMixerSR_arch as arch

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--model_path',
        type=str,
        default='../../pretrained_models/LargeSR/CAMixerSR_S.pth'  # noqa: E501
    )
    parser.add_argument('--output', type=str, default=None, help='output ONNX model file')
    args = parser.parse_args()

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    # set up model
    model = arch.CAMixerSR(n_feats=36, scale=4) # model-S:36 / model-M:48 / model-L:60
    model.load_state_dict(torch.load(args.model_path)['params_ema'], strict=True)
    model.eval()
    model = model.to(device)


    if args.output is None:
        output_file = os.path.splitext(args.model_path)[0] + '.onnx'
    else:
        output_file = args.output

    dummy_input = torch.randn(1, 3, 64, 64).to(device)  # Adjust the size as needed

    # Export the model to ONNX
    torch.onnx.export(
        model,                       # model being run
        dummy_input,                 # model input (or a tuple for multiple inputs)
        output_file,                 # where to save the model (can be a file or file-like object)
        export_params=True,          # store the trained parameter weights inside the model file
        opset_version=16,            # the ONNX version to export the model to
        do_constant_folding=True,    # whether to execute constant folding for optimization
        input_names=['input'],       # the model's input names
        output_names=['output'],     # the model's output names
        # dynamic_axes={
        #     'input': {0: 'batch_size', 2: 'height', 3: 'width'},
        #     'output': {0: 'batch_size', 2: 'height', 3: 'width'}
        #     }  # variable length axes
    )
    print(f"Model has been converted to ONNX and saved to {output_file}")

if __name__ == '__main__':
    main()
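
As a quick follow-up check (not from the repo), the exported file can be validated and its node count inspected with the onnx package; a very deep but valid graph will still be slow to lay out in netron.

import onnx

m = onnx.load('CAMixerSR_S.onnx')  # path assumed: whatever output_file the script above produced
onnx.checker.check_model(m)
print(len(m.graph.node), 'nodes in the exported graph')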
