mjq11302010044 / tatt

A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution (CVPR2022)

License: MIT License

Python 99.92% Shell 0.08%

tatt's Introduction

Jianqi Ma

tatt's Issues

How to create my own LMDB datasets to train your model

How can I create my own LMDB datasets, like TextZoom, to train the model, and how are the dataset files used? If anyone knows, please help.
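
A minimal sketch of one way to pack HR/LR image pairs and labels into an LMDB file. The key names (image_hr-%09d, image_lr-%09d, label-%09d, num-samples) are an assumption based on the layout commonly used by TextZoom-style loaders, so check them against the keys that dataset/dataset.py actually reads. build_records and write_lmdb are hypothetical helper names, not functions from this repository.

```python
def build_records(samples):
    """Map (hr_bytes, lr_bytes, label) triples to LMDB key/value pairs.

    Indices start at 1. The key names are assumed, not taken from the repo.
    """
    records = {}
    for i, (hr_bytes, lr_bytes, label) in enumerate(samples, start=1):
        records[b"image_hr-%09d" % i] = hr_bytes   # encoded HR image (e.g. PNG bytes)
        records[b"image_lr-%09d" % i] = lr_bytes   # encoded LR image
        records[b"label-%09d" % i] = label.encode("utf-8")
    records[b"num-samples"] = str(len(samples)).encode("utf-8")
    return records


def write_lmdb(records, out_dir, map_size=1 << 30):
    """Write all records in one transaction (requires the `lmdb` package)."""
    import lmdb  # imported lazily so build_records works without it

    env = lmdb.open(out_dir, map_size=map_size)
    with env.begin(write=True) as txn:
        for key, value in records.items():
            txn.put(key, value)
    env.close()
```

Each image would be passed in as already-encoded bytes (for example, the raw contents of a PNG file), and map_size should be raised for larger datasets.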

Where is RPE used?

I've read the supplement describing the details of the recurrent positional encoding (RPE).

However, I cannot seem to find the code where RPE is implemented and used.

Would the authors kindly point out where the implementation of RPE is in the released codebase?

How to get the output images to visualize results during training, evaluation, and inference

I have noticed that during training there is an output message, "save display images", but I can't find where the display images are saved. Please help, thank you very much!

RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

@mjq11302010044 'RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED' occurred after several training epochs. I do not know why this happened.

Pre-trained model

Would it be possible to make a pre-trained model available?

Where to put TextZoom dataset?

In which directory should I put the TextZoom dataset?

IndexError: too many indices for tensor of dimension 4

Thanks for sharing! When I test with vis = true in the code, this error comes up: too many indices for tensor of dimension 4. I checked the size of the input; it is (16, 1, 32, 100). What should I do next?

Are there pre-trained weights for text-recognition super-resolution that support Chinese?

A request for help about the training process, log file, and code arguments

Thanks for your work. I thoroughly enjoyed reading the paper, but I am confused about the training process. I notice that you provide two training commands, and the second seems to fine-tune the result of the first. Is that a necessary step for the whole training? Could you give me some specifics about this?
In addition, after training, I only found weight files with the '.th' suffix under the folder ckpt/TATT or ckpt/TATT_ft; there seems to be no log file. Where can I find it?
About the parser.add_argument() calls in your code: I noticed that the help parameter is left blank for several arguments. Could you please provide a brief description for these arguments? This would be very helpful for users who are new to your program, as it would give them a better understanding of the purpose of each argument and how to use it correctly.
I understand that you may be busy, but if you could spare a moment to update the help messages, it would be much appreciated.
I'm just a beginner, so please forgive my ignorance. Again, thank you so much!
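
As an illustration of the request above, this is roughly what filled-in help strings look like with the standard argparse module. The argument names and descriptions here (--batch_size, --resume, --test) are hypothetical examples, not the repository's actual options or their documented meaning.

```python
import argparse


def build_parser():
    """Build a parser whose arguments all carry a help description."""
    parser = argparse.ArgumentParser(description="Example training options")
    # All names below are illustrative assumptions, not the repo's real flags.
    parser.add_argument("--batch_size", type=int, default=16,
                        help="Number of image pairs per training batch")
    parser.add_argument("--resume", type=str, default="",
                        help="Checkpoint path to load before training; "
                             "empty means train from scratch")
    parser.add_argument("--test", action="store_true",
                        help="Run evaluation only instead of training")
    return parser
```

With help strings in place, running the script with -h prints a description for every option.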

Pretrained model

The paper was accepted at CVPR 2022 a long time ago, and there is still no pretrained model for TextZoom.
The authors mentioned this in closed issue #1 and said they would release a pre-trained model in a later version, but they still have not uploaded a checkpoint for measuring performance.

images_sr = model(images_hr)

At line 1734 in interfaces/super_resolution.py,

images_sr = model(images_hr)

It seems that HR images are input into the model. Is this the intended behavior?

Could you release the links to the ICDAR15, CUTE80, and SVTP datasets used in your paper?

Location of the code for your text prior architecture

Could you please point out where the text prior architecture is located in your code?
I really want to know how your architecture uses the output of CRNN as the text prior.
It is hard to find.

Another question: does your code not use the text prior in testing?

How to set up training on other datasets?

Thanks for sharing! If I want to use the TATT model proposed in this paper to train on non-LMDB dataset files (such as datasets packaged in a traditional image format), where should I modify the code?
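
One possible starting point, sketched under assumptions: collect HR/LR file pairs and labels from plain image folders, then feed them to a custom dataset class in place of the LMDB reader. The folder layout (root/hr, root/lr, root/labels.txt with tab-separated "filename, label" lines) and the helper name list_pairs are hypothetical; they are not part of the repository.

```python
import os


def list_pairs(root):
    """Return sorted (hr_path, lr_path, label) tuples.

    Assumed layout (hypothetical): root/hr/<name>, root/lr/<name>, and
    root/labels.txt with one "<name><TAB><label>" entry per line.
    Entries whose HR or LR file is missing are skipped.
    """
    pairs = []
    with open(os.path.join(root, "labels.txt"), encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            name, label = line.split("\t", 1)
            hr_path = os.path.join(root, "hr", name)
            lr_path = os.path.join(root, "lr", name)
            if os.path.isfile(hr_path) and os.path.isfile(lr_path):
                pairs.append((hr_path, lr_path, label))
    return sorted(pairs)
```

A dataset class could then open each pair in its item accessor and return the same tuple structure that the existing LMDB dataset returns, so the rest of the training code stays unchanged.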

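Question about the accuracy metric in Table 5

For the accuracy metric in Table 5, the code only counts lowercase letters and digits. Are uppercase letters and punctuation not considered? Is this the usual evaluation convention for TextZoom?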
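
To make the convention in question concrete, here is a small sketch of case-insensitive, alphanumeric-only word accuracy. It illustrates the behavior the question describes and is not a copy of the repository's evaluation code; filter_label and word_accuracy are hypothetical names.

```python
import string

# Only lowercase letters and digits are kept when comparing strings.
ALLOWED = set(string.ascii_lowercase + string.digits)


def filter_label(word):
    """Lowercase the word and drop every character outside a-z and 0-9."""
    return "".join(ch for ch in word.lower() if ch in ALLOWED)


def word_accuracy(predictions, labels):
    """Fraction of predictions that match their label after filtering."""
    if not labels:
        return 0.0
    matches = sum(filter_label(p) == filter_label(g)
                  for p, g in zip(predictions, labels))
    return matches / len(labels)
```

Under this scheme "Hello!" and "hello" count as a match, which is why uppercase letters and punctuation do not affect the reported number.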

ValueError: too many values to unpack (expected 3)

Hi, when I run the test code with test/easy/ as the test dataset, I get an error like this in super_resolution.py:

How can I solve this? Please help me.

If the word length is greater than 4, the third letter seems to be replaced with "e".

At lines 1918-1921 in dataset/dataset.py,

if len(word) > 4:
  word = [ch for ch in word]
  word[2] = "e"
  word = "".join(word)

the third character of the gt label seems to be replaced with the letter "e". What is the intention behind this process?
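
For reference, the quoted snippet can be wrapped in a function to see its effect in isolation. The function name replace_third_char is hypothetical; the body follows the quoted lines.

```python
def replace_third_char(word):
    """Mirror the quoted logic: overwrite index 2 with "e" for words longer than 4."""
    if len(word) > 4:
        chars = list(word)
        chars[2] = "e"   # overwrites the third character; the length is unchanged
        word = "".join(chars)
    return word


print(replace_third_char("hello"))  # heelo
print(replace_third_char("text"))   # text (4 characters, so left untouched)
```

The character is overwritten rather than inserted, so the label keeps its length but no longer matches the text in the image.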

Are the OCR evaluation model (ster\crnn) and TATT end-to-end?

Are the OCR evaluation model (ster\crnn) and TATT end-to-end? Or is the SR model first used to produce results, which are then fed into the OCR model?
For example, in the code below:
def __getitem__(self, index):
    ...
    ...
    label_str = str_filt(word, self.voc_type)
    return img_HR, img_lr, img_HRy, img_lry, label_str

Does "label_str" participate in the training of the whole model?

Could you please release the fully trained parameters that produce the results reported in the paper?

I ran into a problem in training: "No such file or directory: 'ckpt/TATT/model_best_acc_0.pth'". How can I solve it?

Thanks for your work. I ran into a problem in training. How can I solve it?
First, I got this error:

No such file or directory: 'ckpt/TATT/

Then I created a directory named 'TATT' in 'ckpt', but I got another error:

No such file or directory: 'ckpt/TATT/model_best_acc_0.pth'

I'm working on my graduation project and need to reproduce your results, so this is important to me. Thanks for your work; I look forward to your reply!
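
One possible reading of the two errors is that the checkpoint directory does not exist yet, and that the code then tries to load a checkpoint file that was never written. The sketch below shows a defensive pattern for both cases using only the standard library; prepare_checkpoint is a hypothetical helper, and whether the repository's loader can be guarded this way is an assumption.

```python
import os


def prepare_checkpoint(ckpt_dir, resume_name=None):
    """Create ckpt_dir if needed; return a checkpoint path only if the file exists.

    Returns None when there is nothing to resume from, so the caller
    can fall back to training from scratch.
    """
    os.makedirs(ckpt_dir, exist_ok=True)
    if not resume_name:
        return None
    path = os.path.join(ckpt_dir, resume_name)
    return path if os.path.isfile(path) else None
```

For example, prepare_checkpoint("ckpt/TATT", "model_best_acc_0.pth") would create ckpt/TATT and return None on a first run, rather than failing on a missing file.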

Any plans for Chinese text enhancement?

Great job! Thank you for sharing the code.
Do you have any plans for Chinese text enhancement?
