icandle / camixersr
CAMixerSR: Only Details Need More “Attention” (CVPR 2024)
Home Page: https://arxiv.org/abs/2402.19289
License: Apache License 2.0
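I cannot find test_pipeline in the code.
May I ask which module in this project is the main one that actually works? My requirements may differ from yours: your method can separate blank regions from regions that need attention, but in my case I cannot make that distinction, and content may need to be restored even in the blank regions. How can this be solved? Are there any papers you would recommend?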
Hi, authors.
Sorry to bother you. I tried your code and found that the one-hot vector with Gumbel softmax is generated as some-linear --> softmax --> F.gumbel_softmax. However, in the DynamicViT implementation, the one-hot vector is generated as some-linear --> Log-softmax --> F.gumbel_softmax. Is there any difference between the two, and can it affect performance?
Thx.
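A note on the question above: PyTorch documents the input of F.gumbel_softmax as unnormalized log-probabilities, and the hard (argmax) sample drawn from input plus Gumbel noise follows softmax(input). Feeding log-softmax output therefore samples from the predicted probabilities, while feeding softmax output samples from a softmax of those probabilities, which is flatter. The sketch below only computes these two implied distributions in plain Python; the values in `linear_out` and the helper name `hard_sample_distribution` are hypothetical, and it does not settle whether the difference matters for CAMixerSR's accuracy.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(values):
    """Log of softmax, computed with the log-sum-exp trick."""
    peak = max(values)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in values))
    return [v - log_norm for v in values]


def hard_sample_distribution(gumbel_input):
    """Class distribution of argmax(gumbel_input + Gumbel noise).

    By the Gumbel-max trick this equals softmax(gumbel_input).
    """
    return softmax(gumbel_input)


linear_out = [2.0, 0.0]  # hypothetical output of the linear layer
probs = softmax(linear_out)

# Pipeline A: linear -> log-softmax -> Gumbel sampling
via_log_softmax = hard_sample_distribution(log_softmax(linear_out))
# Pipeline B: linear -> softmax -> Gumbel sampling
via_softmax = hard_sample_distribution(probs)

print("softmax(linear_out):     ", [round(p, 4) for p in probs])
print("log-softmax then Gumbel: ", [round(p, 4) for p in via_log_softmax])
print("softmax then Gumbel:     ", [round(p, 4) for p in via_softmax])
```

Pipeline A reproduces `probs`, whereas pipeline B yields a distribution closer to uniform, so the sampled masks would be noisier for the same predictor output.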
I tried to run the code with the files given in the README and got an error.
There is no utils under the codes/basicsr folder.
Traceback (most recent call last):
  File "/mnt/CAMixerSR/codes/basicsr/test.py", line 4, in <module>
    import archs
  File "/mnt/CAMixerSR/codes/basicsr/archs/__init__.py", line 5, in <module>
    from basicsr.utils import get_root_logger, scandir
ModuleNotFoundError: No module named 'basicsr'
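One possible reading of this traceback is that `basicsr.utils` is expected to come from a separately installed BasicSR package rather than from the codes/basicsr folder (another report in this list mentions BasicSR 1.4.2). A minimal sketch, assuming that reading, for checking whether a top-level package is importable in the current environment; the helper name `is_importable` is hypothetical:

```python
import importlib.util


def is_importable(package_name):
    """Return True if the top-level `package_name` can be found on sys.path."""
    return importlib.util.find_spec(package_name) is not None


# Report which of the packages used by the scripts are visible to this interpreter.
for name in ("basicsr", "torch"):
    status = "found" if is_importable(name) else "missing"
    print(f"{name}: {status}")
```

If `basicsr` is reported as missing, installing the BasicSR package into the same environment (for example with pip) is one thing to try before running test.py again.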
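flow_warp cannot be found in basicsr.archs.arch_util.
May I ask the authors whether you have tried further adversarial training with a GAN on the results, similar to Real-ESRNet & Real-ESRGAN?
Hello, when I run the training code, the bug shown in the title occurs. Shouldn't the HR image be in the HR folder? Why does the code check whether this HR image is in lq_paths? How can I solve this problem? Thanks.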
How do I go through the training process?
Hello author,
I want to reproduce the result of the lightweight SR x4 task, and I tried to follow the training process using 2 GPUs with a batch size of 16 on each. The window_size is 16x16 and feature_dim is 60.
BasicSR: 1.4.2
PyTorch: 2.1.1+cu121
TorchVision: 0.16.1+cu121
Here is my training config file: train_example.txt.txt
Because the "CAMixerSR" architecture could not be found, I modified train.py as follows:
import os.path as osp

import archs  # imported so that the project's architectures are available
import models  # imported so that the project's models are available
from basicsr.train import train_pipeline

if __name__ == '__main__':
    root_path = osp.abspath(osp.join(__file__, osp.pardir))
    train_pipeline(root_path)
CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 --master_port 29500 basicsr/train.py \
    -opt /home/jiachen/CAMixerSR/codes/options/train/train_example.yml \
    --launcher pytorch
I trained for 500k iterations, and the best result was obtained at 440k iterations.
The performance is not as good as that of the pretrained model. Is there anything that I missed? Any kind of help is appreciated. I'm looking forward to your reply!
Thanks,
jiachen
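Thanks for sharing. Could you please explain the differences between the models in the pretrained_models folder? Thanks.
Hello authors, I think your work is outstanding! However, I do not understand the principle behind the offset maps (offsets) in the paper. Why does warping the original image with the offsets, which the Predictor produces after a few convolution layers, yield windows that carry more information? k = x + flow_warp(x, offsets.permute(0, 2, 3, 1), interp_mode='bilinear', padding_mode='border')
In addition, the implementation of flow_warp() mentions optical flow. How should optical flow be understood here? As far as I know, obtaining optical flow requires image information from adjacent frames, so how should treating the offsets as optical flow be interpreted? Looking forward to your answer!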
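On the optical-flow part of this question: in a flow-warp operation, the "flow" is simply a per-pixel displacement field (dx, dy), and the warp resamples the input at the displaced positions. Nothing in the warp itself requires the field to be estimated from two adjacent frames; here the offsets are predicted from a single image. The pure-Python sketch below illustrates bilinear warping with border clamping. It is not the repository's implementation (which I assume relies on PyTorch's grid_sample); the function names and the (dx, dy) channel order are assumptions made for illustration.

```python
def bilinear_sample(img, x, y):
    """Sample a 2D list `img` at real-valued (x, y), clamping to the border."""
    h, w = len(img), len(img[0])
    # Clamp coordinates to the valid range (mirrors padding_mode='border').
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp(img, offsets):
    """Output pixel (row i, col j) reads the input at (j + dx, i + dy)."""
    h, w = len(img), len(img[0])
    return [
        [bilinear_sample(img, j + offsets[i][j][0], i + offsets[i][j][1])
         for j in range(w)]
        for i in range(h)
    ]


img = [[0.0, 1.0, 2.0],
       [3.0, 4.0, 5.0],
       [6.0, 7.0, 8.0]]
# Hypothetical offsets: every pixel looks half a pixel to the right.
offsets = [[(0.5, 0.0) for _ in range(3)] for _ in range(3)]
warped = warp(img, offsets)
for row in warped:
    print(row)
```

Each output pixel becomes a blend of its neighbours in the direction of its offset, which is how a learned offset field can pull information from nearby positions into a window.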
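Excuse me, how can I use the .pth files in the 'pretrained_models' folder, and how can I use your model to generate a high-resolution picture?
Hello authors, how was the visualization of the Predicted Mask m in Figure 4 of the paper produced? I read issue 17, but it did not resolve my question. Also, why is the mask used differently in training and in testing? In issue 17 you said that the mask is multiplied directly with the image, and that with this approach the preset block partitioning is not aligned. Testing can only use batch_index_select to pick, according to the mask, the image blocks that use convolutional attention and those that use self-attention, and then use batch_index_fill to stitch the two kinds of blocks back into one image. Looking forward to your reply.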
if self.training or train_mode:
    N_ = v.shape[1]
    v1, v2 = v * mask, vs * (1 - mask)
    qk1 = qk * mask
else:
    idx1, idx2 = mask
    _, N_ = idx1.shape
    v1, v2 = batch_index_select(v, idx1), batch_index_select(vs, idx2)
    qk1 = batch_index_select(qk, idx1)
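The two branches above can be contrasted with a small sketch. In the training branch the mask multiplies every token, so all positions stay in place and unselected ones are zeroed; in the inference branch only the selected tokens are gathered by index and later scattered back. The plain-Python helpers below (`masked_multiply`, `index_select`, `index_fill`) are hypothetical stand-ins for the tensor operations, not the repository's batch_index_select or batch_index_fill, and the binary mask is an assumption made for illustration.

```python
def masked_multiply(tokens, mask):
    """Training-style path: keep every position, zero out unselected tokens."""
    return [[value * m for value in token] for token, m in zip(tokens, mask)]


def index_select(tokens, indices):
    """Inference-style path: gather only the selected tokens."""
    return [tokens[i] for i in indices]


def index_fill(length, dim, indices, selected):
    """Scatter the gathered tokens back into a zero-filled sequence."""
    out = [[0.0] * dim for _ in range(length)]
    for i, token in zip(indices, selected):
        out[i] = list(token)
    return out


tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
mask = [1, 0, 1, 0]                       # hypothetical binary mask
idx1 = [i for i, m in enumerate(mask) if m == 1]

dense = masked_multiply(tokens, mask)     # 4 tokens, 2 of them zeroed
sparse = index_select(tokens, idx1)       # only 2 tokens to process
restored = index_fill(len(tokens), 2, idx1, sparse)

print(len(dense), len(sparse), restored == dense)
```

Under these assumptions both paths carry the same information, but the gathered version processes fewer tokens, which is presumably where the inference-time savings come from, while the multiplied version keeps fixed shapes that are convenient for training.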
Dear developer,
It seems that you only release the code for inference on the test set.
How can I run inference on real-world images?
Best wishes.
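Hello, how can I solve this problem? Other blog posts say they fixed it by running setup.py under codes, but the code I pulled does not contain this file. Thank you.
The main working component is CAMixer, but essentially this work relies on applying computation of different complexity to complex and simple regions to improve performance. If the regions cannot be told apart, another criterion may be needed. CAMixer classifies image regions by their complexity, and this criterion can be replaced with others. Without the mask, it amounts to a plain combination of WSA and convolution, which performs similarly but costs more computation.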
I tried to reproduce the effect of the mask described in the paper. Figure 4 shows the probabilities corresponding to images of different complexity, but in my reproduction the visualized mask looks quite random and does not reflect image complexity; pred_score, the step before the mask, does not behave as I expected either. Which values did the authors use to visualize the correspondence between the probabilities and mask and the image complexity in Figure 4, and from which predictor in the overall SR model should these values be taken? Looking forward to your reply, thanks.
Hello! I am very interested in your work. Why do you use Gumbel softmax during training, and how exactly is it implemented?
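As general background rather than the authors' answer: Gumbel softmax is commonly used to make a discrete choice trainable. Adding Gumbel noise to the logits and taking the argmax draws a sample from the categorical distribution, while a temperature-scaled softmax over the same noisy logits gives a smooth relaxation; PyTorch's F.gumbel_softmax with hard=True returns the one-hot sample in the forward pass and uses the soft values for gradients. The plain-Python sketch below shows only the sampling side; the logits, temperature, and function names are hypothetical.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one Gumbel(0, 1) sample as -log(-log(U)), U ~ Uniform(0, 1)."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # avoid log(0)
    return -math.log(-math.log(u))


def gumbel_softmax(logits, tau, rng):
    """Return (soft, hard): a relaxed sample and its one-hot argmax."""
    noisy = [(logit + sample_gumbel(rng)) / tau for logit in logits]
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    soft = [e / total for e in exps]
    winner = max(range(len(soft)), key=soft.__getitem__)
    hard = [1.0 if i == winner else 0.0 for i in range(len(soft))]
    return soft, hard


rng = random.Random(0)
logits = [2.0, 0.0]   # hypothetical predictor logits for two choices
draws = 20000
count0 = sum(gumbel_softmax(logits, 1.0, rng)[1][0] for _ in range(draws))
freq0 = count0 / draws

# The empirical frequency of class 0 should be close to softmax(logits)[0].
target0 = math.exp(2.0) / (math.exp(2.0) + math.exp(0.0))
print(f"empirical: {freq0:.3f}, softmax: {target0:.3f}")
```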
I appreciate your work. However, I checked the code and found that the relevant code of CAMixer-CAMixerSR is missing. Can you provide the appropriate code and pre-trained model?
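Hello, when exporting to ONNX I found that it takes a very long time, and when visualizing the result with Netron I get the message "This large graph layout might take a very long time to complete." It seems to be a very large graph. The code I used is as follows: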
import argparse
import os

import torch

import archs.CAMixerSR_arch as arch


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--model_path',
        type=str,
        default='../../pretrained_models/LargeSR/CAMixerSR_S.pth'  # noqa: E501
    )
    parser.add_argument('--output', type=str, default=None, help='output ONNX model file')
    args = parser.parse_args()

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    # set up model
    model = arch.CAMixerSR(n_feats=36, scale=4)  # model-S:36 / model-M:48 / model-L:60
    # map_location lets a checkpoint saved on GPU load on a CPU-only machine
    checkpoint = torch.load(args.model_path, map_location=device)
    model.load_state_dict(checkpoint['params_ema'], strict=True)
    model.eval()
    model = model.to(device)

    if args.output is None:
        output_file = os.path.splitext(args.model_path)[0] + '.onnx'
    else:
        output_file = args.output

    dummy_input = torch.randn(1, 3, 64, 64).to(device)  # Adjust the size as needed

    # Export the model to ONNX
    torch.onnx.export(
        model,  # model being run
        dummy_input,  # model input (or a tuple for multiple inputs)
        output_file,  # where to save the model (can be a file or file-like object)
        export_params=True,  # store the trained parameter weights inside the model file
        opset_version=16,  # the ONNX opset version to export the model to
        do_constant_folding=True,  # whether to execute constant folding for optimization
        input_names=['input'],  # the model's input names
        output_names=['output'],  # the model's output names
        # dynamic_axes={
        #     'input': {0: 'batch_size', 2: 'height', 3: 'width'},
        #     'output': {0: 'batch_size', 2: 'height', 3: 'width'}
        # }  # variable length axes
    )
    print(f"Model has been converted to ONNX and saved to {output_file}")


if __name__ == '__main__':
    main()