mjq11302010044 / tpgsr

Code for Text Prior Guided Scene Text Image Super-Resolution (TIP 2023)

License: MIT License

Python 99.80% Shell 0.20%

tpgsr's Introduction

Jianqi Ma

tpgsr's Issues

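Request for the BTS: Bilingual Text Segmentation dataset

Hello. Sorry to bother you. Could I request a copy of the BTS: Bilingual Text Segmentation dataset? Best wishes.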

RuntimeError: Given groups=1, weight of size [64, 4, 9, 9], expected input[1, 5, 32, 256] to have 4 channels

Hello, thank you for your excellent work. I have trained a model and want to process several images with blurred text. The command I use is as follows:
python main.py --arch="tsrn_tl_cascade" --test_model="CRNN" --batch_size=4 --STN --mask --sr_share --gradient --demo --stu_iter=1 --vis_dir='default' --resume=ckpt/vis_TPGSR-TSRN/model_best_0.pth --demo_dir demo

The blurred images are in the demo folder (four JPG images). Running the command produces this error:
RuntimeError: Given groups=1, weight of size [64, 4, 9, 9], expected input[1, 5, 32, 256] to have 4 channels, but got 5 channels instead
Why?

loading pre-trained model from ckpt/vis_TPGSR-TSRN/model_best_0.pth
File "D:\anaconda-install\envs\envpython38\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "D:\anaconda-install\envs\envpython38\lib\site-packages\torch\nn\modules\container.py", line 204, in forward
input = module(input)
File "D:\anaconda-install\envs\envpython38\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "D:\anaconda-install\envs\envpython38\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "D:\anaconda-install\envs\envpython38\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [64, 4, 9, 9], expected input[1, 5, 32, 256] to have 4 channels, but got 5 channels instead
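
The message says the failing 9x9 convolution was built for 4 input channels while the demo tensor carries 5. The rule being enforced can be sketched in plain Python as below. The helper names `expected_in_channels` and `check_conv_input` are hypothetical and not part of the repository, and the sketch does not identify where the extra channel is added in the demo path.

```python
def expected_in_channels(weight_shape, groups=1):
    """Input channels a 2-D convolution weight expects.

    The weight is laid out as
    [out_channels, in_channels // groups, kernel_h, kernel_w].
    """
    return weight_shape[1] * groups


def check_conv_input(weight_shape, input_shape, groups=1):
    """Return None if the shapes agree, else a short diagnostic string.

    input_shape is [batch, channels, height, width].
    """
    want = expected_in_channels(weight_shape, groups)
    got = input_shape[1]
    if want == got:
        return None
    return f"layer expects {want} input channels, tensor has {got} ({got - want:+d})"


# Shapes copied from the error message in this issue.
print(check_conv_input([64, 4, 9, 9], [1, 5, 32, 256]))
```

Since the checkpoint weights have 4 input channels, a reasonable next step is to print the input tensor's shape just before the failing call and compare the demo preprocessing with the preprocessing used in training.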

Training time

Hello, I have read your paper and code. The paper states: "The batch size is set to 48 and the model is trained for 500 epochs with one NVIDIA RTX 2080Ti GPU". Roughly how long does it take to run 500 epochs?
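
Until a measured figure is available, the total can be estimated from a short timing run: iterations per epoch times epochs times seconds per iteration. A minimal sketch follows. The function name `estimate_training_hours` is hypothetical, and the dataset size and per-iteration time are placeholders rather than measurements. Only the batch size (48) and epoch count (500) come from the paper quote.

```python
import math


def estimate_training_hours(num_samples, batch_size, epochs, seconds_per_iter):
    """Rough wall-clock estimate: iterations per epoch x epochs x time per iteration."""
    iters_per_epoch = math.ceil(num_samples / batch_size)
    total_seconds = iters_per_epoch * epochs * seconds_per_iter
    return total_seconds / 3600.0


# batch_size and epochs are from the paper quote; the other two values are
# placeholders, not measurements. Substitute your own dataset size and timing.
hours = estimate_training_hours(
    num_samples=4800,       # placeholder
    batch_size=48,
    epochs=500,
    seconds_per_iter=0.5,   # placeholder
)
print(f"{hours:.1f} hours")
```

Timing a few hundred iterations on the target GPU and plugging in the averaged value gives a usable first estimate. Evaluation passes and data loading stalls would add to it.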

Could you release the pretrained weights?

Add new TP Generator Model

Hi, how can I add a new TP generator and train with it? Which lines do I need to change to swap the TP generator model, and what changes are required? Can I obtain label_vecs_final from other text recognition models and feed it to the model?

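Hello, I read your paper carefully and think your approach is excellent; many parts were eye-opening for me. I would like to study your source code properly, so may I ask which .py files form the core of the project? Reading the code for the first time is rather overwhelming.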

Where is the final model?

Hi there,

I'd like to reproduce your amazing work but I can only find the pretrained models and not the final fine-tuned model. Am I correct?

Could you please upload the final model?

Thank you.

Error when running the demo?

I am using a GPU for inference, but I don't know why this error occurs. Which model is running on the CPU?

loading pretrained crnn model from crnn.pth
0%| | 0/4 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/thorpham/Documents/challenge/super-resolution/TPGSR/main.py", line 76, in
main(config, args, opt_TPG=opt)
File "/home/thorpham/Documents/challenge/super-resolution/TPGSR/main.py", line 16, in main
Mission.demo()
File "/home/thorpham/Documents/challenge/super-resolution/TPGSR/interfaces/super_resolution.py", line 1480, in demo
images_sr = model(images_lr)
File "/home/thorpham/anaconda3/envs/torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/thorpham/Documents/challenge/super-resolution/TPGSR/model/tsrn.py", line 195, in forward
spatial_t_emb = self.infoGen(text_emb)
File "/home/thorpham/anaconda3/envs/torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/thorpham/Documents/challenge/super-resolution/TPGSR/model/tsrn.py", line 103, in forward
x = F.relu(self.bn1(self.tconv1(t_embedding)))
File "/home/thorpham/anaconda3/envs/torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/thorpham/anaconda3/envs/torch/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 916, in forward
return F.conv_transpose2d(
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking arugment for argument weight in method wrapper_slow_conv_transpose2d)
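
The traceback shows the failure surfacing at `self.infoGen` / `tconv1`, where the layer weight and its input sit on different devices. One way to find out which parts of a model were left behind is to group parameter names by device. The sketch below assumes only that each object exposes a `.device` attribute, so it runs without PyTorch. The helper names `group_by_device` and `stragglers` and the stand-in tensor names are hypothetical.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_device(named_tensors):
    """Map device string -> names of tensors on that device.

    Accepts any iterable of (name, obj) pairs where obj has a .device
    attribute, e.g. the output of model.named_parameters().
    """
    groups = defaultdict(list)
    for name, tensor in named_tensors:
        groups[str(tensor.device)].append(name)
    return dict(groups)


def stragglers(named_tensors, target_device):
    """Names of tensors that are not on target_device."""
    return [name for name, t in named_tensors if str(t.device) != target_device]


# Stand-in objects so the sketch runs without PyTorch; the names are made up
# to mirror the traceback (infoGen.tconv1 is where the failure surfaces).
fake = [
    ("block1.weight", SimpleNamespace(device="cuda:0")),
    ("infoGen.tconv1.weight", SimpleNamespace(device="cpu")),
    ("infoGen.bn1.weight", SimpleNamespace(device="cpu")),
]
print(stragglers(fake, "cuda:0"))
```

With a real model, `list(model.named_parameters()) + list(model.named_buffers())` can be passed in place of `fake`. Anything reported can then be moved with `model.to(device)` before the forward call. Whether the model or the input is on the wrong device in this repository is not established by the traceback alone.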

Why doesn't the loss converge when I train?

I trained for more than 400 epochs and the loss is still 1.x

about arch

Amazing work! Hello, what is the difference between 'sem_tsrn', 'tsrn_c2f', 'tsrn_tl', 'tsrn_tl_cascade', and 'tsrn_tl_wmask'?
I want to reproduce your work; which one should I select? Thanks!

How can I run inference on a low-resolution image?

I have finished the training process. How can I use the trained model to get a high-resolution text image?

RuntimeError: Tensor for argument #1 'input' is on CPU, Tensor for argument #2 'output' is on CPU, but expected them to be on GPU (while checking arguments for slow_conv_transpose2d_out_cuda)

Hi, when I try to test your model with this command:
python main.py --arch="tsrn_tl_cascade" --test_model="CRNN" --test_data_dir=../TPGSR-main/dataset/TextZoom/test/hard --batch_size=48 --STN --mask --sr_share --gradient --test --stu_iter=1 --vis_dir='hard'
at this line (TPGSR-main\interfaces\super_resolution.py, line 1382, in test):
images_sr = model(images_lr)
I receive this error:
RuntimeError: Tensor for argument #1 'input' is on CPU, Tensor for argument #2 'output' is on CPU, but expected them to be on GPU (while checking arguments for slow_conv_transpose2d_out_cuda)
What should I do?

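Shared SR and non-shared TP

For multi-stage training, the paper proposes a shared SR and a non-shared TP, but the code implements a non-shared SR and a shared TP.

According to the training command you provided: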
python3 -u main.py --arch="tsrn_tl_cascade" --batch_size=48 --STN --mask --use_distill --gradient --sr_share --stu_iter=3 --vis_dir='vis_TPGSR-TSRN'
--sr_share defaults to False, but it is True during training.
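
The default-versus-training behaviour described here is what an argparse switch produces. The sketch below assumes `--sr_share` is declared with `action="store_true"`. That is an assumption, because the actual definition in main.py is not quoted in this issue.

```python
import argparse

# Assumed definition: a boolean switch declared with action="store_true".
parser = argparse.ArgumentParser()
parser.add_argument("--sr_share", action="store_true", default=False)

without_flag = parser.parse_args([])
with_flag = parser.parse_args(["--sr_share"])

print(without_flag.sr_share)  # False
print(with_flag.sr_share)     # True
```

Under that assumption, the training command above runs with sr_share set to True. How the code then interprets the flag (shared or non-shared SR) has to be checked where the flag is consumed.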

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0!

Hi @mjq11302010044,

I was able to train the model successfully using the code in the repository. But when I run the Test.sh script, I get the following error:
"RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument weight in method wrapper_slow_conv_transpose2d)".

I spent more than two days debugging it but cannot get past this error. Can you please help me resolve the issue if you have a solution?

Regards,
Nakul

Issues about TSRN derived structures!

Hi Ma, thanks for your nice work! I have run into some issues and am hoping for your early reply.

There are several TSRN-derived structures mentioned in the code, such as 'sem_tsrn', 'tsrn_c2f', 'tsrn_tl', 'tsrn_tl_cascade', and 'tsrn_tl_wmask'. So far I have only reproduced the 'tsrn_tl_cascade' arch successfully. The 'sem_tsrn' arch should be the core arch, shouldn't it? But why is 'sem_tsrn' missing from the 'args.arch' choices? I still failed to reproduce it after adding 'sem_tsrn' to the choices of args.arch and setting args.arch='sem_tsrn'. I suspect something is wrong in the released code.
Can you explain the differences between these derived structures, such as 'tsrn_c2f', 'tsrn_tl_cascade', and 'tsrn_tl_wmask', other than the 'data difference' between archs? Or could you please give some detailed instructions in README.md? It is a bit hard to understand the purpose of these structures from reading the code.

Thx again!
