sakurariven / east
PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector
License: MIT License
I tried several datasets, but this repo only performed well on ICDAR2015. I think ICDAR2013 is much easier than ICDAR2015, but I don't know why this happens. @SakuraRiven
Does anyone know why, or has anyone got a good score on any other dataset? We can discuss it here, thanks! Chinese is preferred :)
Epoch is [6/600], mini-batch is [67/135], time consumption is 1.44429207, batch_loss is 1.57383704
classify loss is 1.00000000, angle loss is 0.07884802, iou loss is 0.30837956
Epoch is [6/600], mini-batch is [68/135], time consumption is 1.36282444, batch_loss is 2.09685969
classify loss is 1.00000000, angle loss is 0.09357643, iou loss is 0.32338580
Epoch is [6/600], mini-batch is [69/135], time consumption is 1.56707478, batch_loss is 2.25915003
classify loss is 1.00000000, angle loss is 0.06175874, iou loss is 0.30353320
Epoch is [6/600], mini-batch is [70/135], time consumption is 1.43909860, batch_loss is 1.92112064
classify loss is 1.00000000, angle loss is 0.00818367, iou loss is 0.23648684
Epoch is [6/600], mini-batch is [71/135], time consumption is 1.47601652, batch_loss is 1.31832349
classify loss is 1.00000000, angle loss is 0.01053840, iou loss is 0.24094862
Epoch is [6/600], mini-batch is [72/135], time consumption is 1.44849062, batch_loss is 1.34633255
classify loss is 1.00000000, angle loss is 0.01020748, iou loss is 0.27271590
Epoch is [6/600], mini-batch is [73/135], time consumption is 1.44790339, batch_loss is 1.37479067
classify loss is 1.00000000, angle loss is 0.01101409, iou loss is 0.30937693
Epoch is [6/600], mini-batch is [74/135], time consumption is 1.66667509, batch_loss is 1.41951787
classify loss is 1.00000000, angle loss is 0.04861949, iou loss is 0.28227383
Epoch is [6/600], mini-batch is [75/135], time consumption is 1.53493667, batch_loss is 1.76846874
classify loss is 1.00000000, angle loss is 0.06746940, iou loss is 0.33515832
Epoch is [6/600], mini-batch is [76/135], time consumption is 1.46233368, batch_loss is 2.00985241
classify loss is 1.00000000, angle loss is 0.07263301, iou loss is 0.29534331
Epoch is [6/600], mini-batch is [77/135], time consumption is 1.59857559, batch_loss is 2.02167320
classify loss is 1.00000000, angle loss is 0.16110733, iou loss is 0.31785601
Epoch is [6/600], mini-batch is [78/135], time consumption is 1.46606827, batch_loss is 2.92892933
classify loss is 1.00000000, angle loss is 0.00571813, iou loss is 0.25007588
Epoch is [6/600], mini-batch is [79/135], time consumption is 1.57749414, batch_loss is 1.30725718
classify loss is 1.00000000, angle loss is 0.07323492, iou loss is 0.30151081
Epoch is [6/600], mini-batch is [80/135], time consumption is 1.47476912, batch_loss is 2.03385997
classify loss is 1.00000000, angle loss is 0.00549032, iou loss is 0.23784761
Epoch is [6/600], mini-batch is [81/135], time consumption is 1.49541569, batch_loss is 1.29275084
classify loss is 1.00000000, angle loss is 0.00634125, iou loss is 0.28576058
Epoch is [6/600], mini-batch is [82/135], time consumption is 1.37827063, batch_loss is 1.34917307
The training log is as above.
When I use the saved weights to detect text in a test image, nothing is detected. Even on a training image nothing is detected. Is there a problem with how I am using the code?
Traceback (most recent call last):
  File "C:/Users/gu/Desktop/代码/EAST/EAST-PYTORCH/EAST-master/train.py", line 66, in <module>
    train(train_img_path, train_gt_path, pths_path, batch_size, lr, num_workers, epoch_iter, save_interval)
  File "C:/Users/gu/Desktop/代码/EAST/EAST-PYTORCH/EAST-master/train.py", line 35, in train
    for i, (img, gt_score, gt_geo, ignored_map) in enumerate(train_loader):
  File "E:\Anaconda\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __next__
    return self._process_data(data)
  File "E:\Anaconda\lib\site-packages\torch\utils\data\dataloader.py", line 846, in _process_data
    data.reraise()
  File "E:\Anaconda\lib\site-packages\torch\_utils.py", line 369, in reraise
    raise self.exc_type(msg)
TypeError: function takes exactly 5 arguments (1 given)
I get this error at runtime. Is it a data-loading problem?
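One plausible cause, offered as an assumption rather than a confirmed diagnosis: `UnicodeDecodeError` is a built-in exception whose constructor needs five arguments, so if a DataLoader worker hits one while reading a ground-truth file, the `raise self.exc_type(msg)` step in the traceback fails with exactly this `TypeError`. The sketch below reproduces the effect without PyTorch and reads a BOM-prefixed file with an explicit encoding. Both `reraise_like_dataloader` and `read_gt_lines` are hypothetical helpers, not functions from the repository.

```python
import os
import tempfile


def reraise_like_dataloader(exc_type, msg):
    """Mimic `raise self.exc_type(msg)` and report which exception surfaces."""
    try:
        raise exc_type(msg)
    except Exception as err:
        return type(err).__name__


def read_gt_lines(path):
    """Read a ground-truth file as UTF-8, dropping a leading BOM if present."""
    with open(path, "r", encoding="utf-8-sig") as f:
        return [line.strip() for line in f if line.strip()]


# Building UnicodeDecodeError from a single message fails with TypeError,
# which hides the real decoding problem; ValueError re-raises normally.
print(reraise_like_dataloader(UnicodeDecodeError, "boom"))
print(reraise_like_dataloader(ValueError, "boom"))

# Write a small UTF-8 file that starts with a BOM, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "gt_img_1.txt")
    with open(path, "wb") as f:
        f.write("\ufeff377,117,463,117,465,130,378,130,Genaxis Theatre\n".encode("utf-8"))
    lines = read_gt_lines(path)

print(lines[0].split(",")[0])
```

If this is the cause, setting `num_workers` to 0 should surface the original exception directly, since nothing has to be re-raised across processes.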
C:\Users\Adnan\AppData\Local\Programs\Python\Python35\python.exe C:/Users/Adnan/EAST/train.py
C:\Users\Adnan\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\optim\lr_scheduler.py:82: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  "https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
Traceback (most recent call last):
  File "C:/Users/Adnan/EAST/train.py", line 66, in <module>
    train(train_img_path, train_gt_path, pths_path, batch_size, lr, num_workers, epoch_iter, save_interval)
  File "C:/Users/Adnan/EAST/train.py", line 35, in train
    for i, (img, gt_score, gt_geo, ignored_map) in enumerate(train_loader):
  File "C:\Users\Adnan\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __next__
    return self._process_data(data)
  File "C:\Users\Adnan\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\utils\data\dataloader.py", line 846, in _process_data
    data.reraise()
  File "C:\Users\Adnan\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\_utils.py", line 369, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\Users\Adnan\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\utils\data\_utils\worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Users\Adnan\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\Adnan\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\Adnan\EAST\dataset.py", line 385, in __getitem__
    vertices, labels = extract_vertices(lines)
  File "C:\Users\Adnan\EAST\dataset.py", line 365, in extract_vertices
    vertices.append(list(map(int, line.rstrip('\n').lstrip('\ufeff').split(',')[:8])))
ValueError: invalid literal for int() with base 10: '555'
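The value in the message looks like a valid integer, which suggests (as an assumption, since the raw file is not shown) that invisible characters are attached to it, for example a UTF-8 BOM that was decoded with a Windows code page and therefore survives `lstrip('\ufeff')`. A more tolerant parser could look like the sketch below; `parse_vertices` is a hypothetical replacement for the failing expression, not the repository's code.

```python
def parse_vertices(line):
    """Return the first eight comma-separated fields of a ground-truth line as ints."""
    # Remove surrounding whitespace, a correctly decoded BOM (U+FEFF), and a BOM
    # whose three bytes were mis-decoded as cp1252/latin-1 characters.
    cleaned = line.strip().lstrip("\ufeff").lstrip("\u00ef\u00bb\u00bf").strip()
    fields = cleaned.split(",")[:8]
    if len(fields) < 8:
        raise ValueError("expected at least 8 coordinates, got %d: %r" % (len(fields), line))
    # int(float(...)) also accepts coordinates written as '555.0'.
    return [int(float(value)) for value in fields]


print(parse_vertices("\ufeff555,10,600,10,600,40,555,40,text\n"))
print(parse_vertices("\u00ef\u00bb\u00bf555,10,600,10,600,40,555,40,###"))
```

Opening the file with `encoding='utf-8-sig'` avoids the problem at the source, so the stripping here is only a fallback.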
RuntimeError: [enforce fail at ..\c10\core\CPUAllocator.cpp:62] data. DefaultCPUAllocator: not enough memory: you tried to allocate %dGB. Buy new RAM!27
Well, when I try to train it on PyTorch 1.3, I get a core dump.
What should I do?
Hi SakuraRiven,
I cloned your repository and tried to run eval.py. It outputs precision 0.014, recall 0.8574 and hmean 0.028. This seems strange to me. Do you have any comments?
During training, batches are built from images of the same size, 512 by default in the code. Because the geometry output passes through a sigmoid, the predicted distance from an activated pixel to the rotated box boundaries is limited to the range [0, 512]. During testing, however, the model input is not restricted to the training size; it is only resized so that each side is divisible by 32. I'm wondering what effect this has when the test image dimensions are very different from the training image dimensions. Do you think it would be better to squash test images to 512 during testing? @SakuraRiven
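The test-time resizing described in this question can be sketched as follows. Rounding each side down to a multiple of 32 is an assumption about how the resize is done, and `resize_to_multiple_of_32` is a hypothetical name; the returned ratios are what one would use to map predicted boxes back to the original image.

```python
def resize_to_multiple_of_32(width, height):
    """Return (new_w, new_h, ratio_w, ratio_h) with both sides divisible by 32."""
    # Round down, but never below one 32-pixel cell.
    new_w = max(32, (width // 32) * 32)
    new_h = max(32, (height // 32) * 32)
    return new_w, new_h, new_w / width, new_h / height


new_w, new_h, ratio_w, ratio_h = resize_to_multiple_of_32(1280, 720)
print(new_w, new_h)

# Compare the largest distance a box edge could have in this image with the
# upper bound imposed by sigmoid * 512.
scope = 512
print("longest side exceeds scope:", max(new_w, new_h) > scope)
```

With this kind of resizing, any text box whose edge lies more than 512 pixels from a pixel inside it cannot be regressed exactly, which is the concern raised above.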
I trained the network starting from vgg16_bn-6c64b313.pth and evaluated save/model_epoch_600.pth, which gave:
Calculated!{"precision": 0.7974987974987975, "recall": 0.79826673086182, "hmean": 0.7978825794032723, "AP": 0}
BTW, when I evaluate east_vgg16.pth I get your original score:
Calculated!{"precision": 0.8435782108945528, "recall": 0.8127106403466539, "hmean": 0.8278567925453654, "AP": 0}
So I am using the right code and model.
What is wrong with my training?
Should I use an earlier epoch?
Yours,
Neo
@SakuraRiven
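Hello, I cannot reproduce the reported performance even when I use the pretrained EAST VGG model directly. What could be the problem?
My LANMS module was compiled directly from the original author's project, so it should have no effect, right? https://github.com/argman/EAST/tree/master/lanms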
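Hello, and thank you for your work.
I have a question about the network output:
loc = self.sigmoid2(self.conv2(x)) * self.scope
1. Here the value of self.scope is the input image size, 512. Is that right?
2. If self.scope is the input image size, then the values of self.sigmoid2(self.conv2(x)) should be very small, for example all between 0 and 0.1 or even smaller. Does this have a large effect on training the regression?
Looking forward to your reply, thanks!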
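To make the second point concrete, the sketch below computes which sigmoid output and which pre-activation value are needed to express a given pixel distance when the output is scaled by 512. It is plain arithmetic, not the repository's code; `required_preactivation` is a hypothetical helper and the sample distances are arbitrary.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def required_preactivation(distance, scope=512):
    """Return (sigmoid output, conv output) needed so that sigmoid(x) * scope == distance."""
    p = distance / scope
    return p, math.log(p / (1.0 - p))


for distance in (5, 20, 100, 256):
    p, x = required_preactivation(distance)
    # Each row: target distance in pixels, sigmoid output, conv output before the sigmoid.
    print(distance, round(p, 4), round(x, 2))
```

Small distances do map to small sigmoid outputs, which means the convolution has to produce clearly negative values for short text boxes; the scaling by 512 then brings the prediction back to pixel units.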
Thanks for your clean code.
However, I suspect there may be something wrong in the crop_img() function:
After the while() loop, if cnt == 1000 (flag == True), some vertices are still outside the cropped image. Shouldn't these be removed from 'new_vertices'?
Another question:
Suppose that after the while() loop cnt < 1000 (flag == False), which means that none of the checked boxes crosses the crop boundary. That check does not cover vertices whose label == 0, because you only verify vertices whose label == 1. So if you train with ignored regions, the valid vertices may not match the cropped image.
Waiting for your comment.
Excuse me, have you ever tested on other datasets, such as MSRA-TD500, MLT or ICDAR2013, and what were the results?
I would like to compare against the results with a cross-entropy loss. Could anyone who has implemented it share the code?
Thanks for open-sourcing this!
I am curious how long it takes to train the model until it converges.
I am using the MSRA dataset, format {x1,y1,x2,y2,x3,y3,x4,y4,classname}, but when I load the data to train with your code I get the following error: IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed. The failing line is the crop-box check: flag = is_cross_text([start_w,start_h],length,new_vertices[labels==1,:])
How many epochs do you need to achieve convergence during training?
Hi, I have run into a problem. The images in my own dataset have different sizes, and I get "ValueError: image has wrong mode" from resize. Does that mean that all images must have the same size?
Hi,
Could you please release a license for this re-implementation?
Best,
Amrith
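About the dice loss implementation: other code I have seen computes the dice loss for each sample in the mini-batch and then takes the mean, whereas here the dice loss is computed over all samples together. Which of the two implementations is more appropriate?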
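The difference between the two variants can be seen on a toy example. The sketch below uses plain Python lists in place of score maps; the function names and the `eps` value are assumptions for illustration, not the repository's implementation.

```python
def dice_loss(pred, target, eps=1e-5):
    """Dice loss over one flat list of predicted and ground-truth scores."""
    inter = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target) + eps
    return 1.0 - 2.0 * inter / union


def dice_over_batch(preds, targets):
    """Flatten the whole mini-batch first, then compute a single dice loss."""
    flat_pred = [v for sample in preds for v in sample]
    flat_target = [v for sample in targets for v in sample]
    return dice_loss(flat_pred, flat_target)


def dice_per_sample(preds, targets):
    """Compute dice loss per sample, then average."""
    losses = [dice_loss(p, t) for p, t in zip(preds, targets)]
    return sum(losses) / len(losses)


# Sample 0: large text region, predicted perfectly.
# Sample 1: a single text pixel, missed entirely.
preds = [[1, 1, 1, 1], [0, 0, 0, 0]]
targets = [[1, 1, 1, 1], [1, 0, 0, 0]]

print(round(dice_over_batch(preds, targets), 3))
print(round(dice_per_sample(preds, targets), 3))
```

In this toy case the batch-level loss is dominated by the sample with more text pixels, while the per-sample mean gives the image with little text the same weight as the other one. Which behaviour is preferable depends on the dataset.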
Hello, after running for a while the loss becomes NaN. Can this be fixed?
the function of get_score_geo
is very slow, about 1s/img.
Epoch is [1/600], mini-batch is [1/33], time consumption is 0.12597346, batch_loss is 0.00000000
Epoch is [1/600], mini-batch is [2/33], time consumption is 0.07961965, batch_loss is 0.00000000
Epoch is [1/600], mini-batch is [3/33], time consumption is 0.07560468, batch_loss is 0.00000000
How should the geometry map generation be modified for the QUAD variant described in the paper? What is meant by the statement "For the QUAD ground truth, the value of each pixel with positive score in the 8-channel geometry map is its coordinate shift from the 4 vertices of the quadrangle"? I would like to know all the modifications needed in the code to implement QUAD. @SakuraRiven
(mmdetection) wuliang@gpu5:~/cvprojects/EAST-master$ python ./evaluate/script.py -g=./evaluate/gt.zip -s=./submit.zip
Error!
ZIP entry not valid: 595.txt
I want to evaluate my model on my own dataset, but I don't know why it doesn't work.
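The message suggests that the evaluation script rejects archive entries whose names do not match the pattern it expects. As far as I recall, the ICDAR 2015 script looks for names like `res_img_<n>.txt` in the submission and `gt_img_<n>.txt` in the ground truth, but treat both patterns as assumptions and check the parameters in your copy of script.py. The sketch below lists entries that would fail such a check; `invalid_entries` is a hypothetical helper.

```python
import io
import re
import zipfile


def invalid_entries(zip_bytes, pattern=r"res_img_([0-9]+)\.txt"):
    """Return the names in a ZIP archive that do not fully match `pattern`."""
    regex = re.compile(pattern)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [name for name in archive.namelist() if regex.fullmatch(name) is None]


# Build a tiny in-memory archive with one well-named and one badly named entry.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("res_img_1.txt", "0,0,10,0,10,10,0,10\n")
    archive.writestr("595.txt", "0,0,10,0,10,10,0,10\n")

print(invalid_entries(buffer.getvalue()))
```

For a custom dataset, renaming the result and ground-truth files to the expected scheme before zipping is usually simpler than editing the script.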
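I can't reach the accuracy reported by the author. Are there any other tricks in the training process?
I can't seem to open the pths files without a VPN. Could someone who has already downloaded them share them?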
I'm not sure whether it's a bug or just a difference in variable naming.
In loss.py, you define the width and height of the union box as:
w_union = torch.min(d3_gt, d3_pred) + torch.min(d4_gt, d4_pred)
h_union = torch.min(d1_gt, d1_pred) + torch.min(d2_gt, d2_pred)
while in original paper it is:
wi= min(d2_gt,d2_pred) + min(d4_gt,d4_pred)
hi= min(d1_gt,d1_pred) + min(d3_gt,d3_pred)
Could you please clarify?
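A small sketch may help separate the two possibilities. The intersection only requires that opposite sides be paired (left with right, top with bottom), so the index numbers matter only through the channel order. Below, the paper's order is taken as (top, right, bottom, left), and the repository's order is assumed to be (top, bottom, left, right); that second order is an assumption that should be checked against dataset.py. All function names are hypothetical.

```python
def aabb_iou(gt, pred):
    """IoU of two axis-aligned boxes given per-pixel distances to each box edge."""
    # Despite the name `w_union` in the quoted code, these are the width and
    # height of the intersection rectangle.
    w_inter = min(gt["left"], pred["left"]) + min(gt["right"], pred["right"])
    h_inter = min(gt["top"], pred["top"]) + min(gt["bottom"], pred["bottom"])
    area_inter = w_inter * h_inter
    area_gt = (gt["left"] + gt["right"]) * (gt["top"] + gt["bottom"])
    area_pred = (pred["left"] + pred["right"]) * (pred["top"] + pred["bottom"])
    return area_inter / (area_gt + area_pred - area_inter)


def from_paper_order(d):
    """d = (d1, d2, d3, d4) as (top, right, bottom, left)."""
    return {"top": d[0], "right": d[1], "bottom": d[2], "left": d[3]}


def from_assumed_repo_order(d):
    """d = (d1, d2, d3, d4) as (top, bottom, left, right); an assumption."""
    return {"top": d[0], "bottom": d[1], "left": d[2], "right": d[3]}


# The same two boxes written in both channel orders.
iou_paper = aabb_iou(from_paper_order((10, 20, 10, 20)), from_paper_order((8, 25, 12, 20)))
iou_repo = aabb_iou(from_assumed_repo_order((10, 10, 20, 20)), from_assumed_repo_order((8, 12, 20, 25)))
print(iou_paper == iou_repo)
```

If the channel order in the repository matches the assumption, the two formulas describe the same quantity and the difference is only in naming.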
Hi @SakuraRiven, many thanks for your practical re-implementation. It's much clearer than the other version.
I'd like to know why you call find_min_rect_angle. Are you finding the best-matching AABB (axis-aligned bounding box) by rotating each box?
Thanks in advance!
How much GPU memory does your environment have? When I test with detect, it reports that there is not enough GPU memory.
I am getting the error below while running eval.py.
RuntimeError: Error(s) in loading state_dict for EAST:
Unexpected key(s) in state_dict: "extractor.features.1.num_batches_tracked", "extractor.features.4.num_batches_tracked", "extractor.features.8.num_batches_tracked", "extractor.features.11.num_batches_tracked", "extractor.features.15.num_batches_tracked", "extractor.features.18.num_batches_tracked", "extractor.features.21.num_batches_tracked", "extractor.features.25.num_batches_tracked", "extractor.features.28.num_batches_tracked", "extractor.features.31.num_batches_tracked", "extractor.features.35.num_batches_tracked", "extractor.features.38.num_batches_tracked", "extractor.features.41.num_batches_tracked", "merge.bn1.num_batches_tracked", "merge.bn2.num_batches_tracked", "merge.bn3.num_batches_tracked", "merge.bn4.num_batches_tracked", "merge.bn5.num_batches_tracked", "merge.bn6.num_batches_tracked", "merge.bn7.num_batches_tracked".
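To my understanding, `num_batches_tracked` buffers were added to BatchNorm layers in PyTorch 0.4.1, so this error typically appears when a checkpoint saved with a newer PyTorch is loaded into an older one. Upgrading PyTorch or passing `strict=False` to `load_state_dict` are the usual ways around it. Another option is to drop the extra keys before loading, as in this sketch. It uses a plain dict so that it runs without PyTorch, and `drop_keys_with_suffix` is a hypothetical helper.

```python
def drop_keys_with_suffix(state_dict, suffix="num_batches_tracked"):
    """Return a copy of `state_dict` without entries whose key ends with `suffix`."""
    return {key: value for key, value in state_dict.items() if not key.endswith(suffix)}


# Stand-in for the mapping returned by torch.load(...); the values are placeholders.
checkpoint = {
    "extractor.features.1.weight": "tensor",
    "extractor.features.1.num_batches_tracked": "tensor",
    "merge.bn1.running_mean": "tensor",
    "merge.bn1.num_batches_tracked": "tensor",
}

filtered = drop_keys_with_suffix(checkpoint)
print(sorted(filtered))
```

The filtered mapping would then be passed to `model.load_state_dict(...)` in place of the original one.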
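On the ICDAR2015 dataset, after training for 50 epochs, the eval result is still around 0.05.
The classification loss also stays at about 0.99, the same as at the start of training (if I remember correctly).
Is the problem in the dataloader or in the loss function?
Thanks!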
How much RAM and GPU memory do you have? When I run your train.py, I always run out of memory. I am using a P100 with 16 GB.
Do you mean that your improvement mainly comes from your customized loss function?
Is there any blog post or paper that explains this?
Hello, thanks for sharing!
I have two questions and need your help.
When I trained AdvancedEAST with Keras, training was fast and the results were accurate. With your project, however, training is slow and the results are not accurate.
With about 3,500 training images, two 1080 Ti cards take 24 hours, whereas AdvancedEAST takes 1 hour.
Hi,
Do you know how to install lanms on macOS?
I'm trying to deploy part of your code with OpenCV, based on this project: https://www.pyimagesearch.com/2018/08/20/opencv-text-detection-east-text-detector/
because I want to draw nice boxes in every orientation, and I think following your code is the solution.
I wrote a tools2.py that follows your code step by step; check it if you want: https://drive.google.com/open?id=1SKKhZquilY-YlfWgqccF5zLewpunxXIf (I only modified the geo and score structures in the way you describe in your scripts)
Then I wrote the following script to test the idea and replicate your results with my webcam: https://drive.google.com/open?id=1j1EKQNalEsyAj2Uqwh3QfucHnMNSGWIs
I uploaded the weights to (if you want to reproduce my results)
But I can't get any result yet! If you can help me: I noticed that in tools2.py the function is_valid_poly(res, score_shape, scale) always returns False, but I can't understand why.
Any help or recommendations?
I changed the loss a little and trained for 200 epochs, and at evaluation I got "Points are not clockwise". When I did the same thing with the original EAST model, this did not happen. I wonder what may cause "Points are not clockwise". Does it just mean that my results are bad, or something else?
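The message comes from a check on the vertex order of each predicted quadrilateral, so it points at the geometry of the output boxes rather than at the score alone. A sketch of such a check using the shoelace formula is shown below. It illustrates the idea and is not the evaluation script's own code; the function names are hypothetical.

```python
def signed_area(points):
    """Shoelace area of a polygon given as [(x, y), ...].

    In image coordinates (y axis pointing down) the result is positive when the
    vertices are listed clockwise as seen on screen.
    """
    total = 0.0
    count = len(points)
    for i in range(count):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % count]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_clockwise(points):
    return signed_area(points) > 0


def make_clockwise(points):
    """Reverse the vertex order if the polygon is listed counter-clockwise."""
    return list(points) if is_clockwise(points) else list(points)[::-1]


# Top-left, top-right, bottom-right, bottom-left: clockwise on screen.
box = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(is_clockwise(box), is_clockwise(box[::-1]))
print(make_clockwise(box[::-1]) == box)
```

Reversing the order only helps for simple polygons. A self-intersecting or degenerate quadrilateral, which a poorly trained geometry branch can produce, would still need to be filtered out before submission.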
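Hello, may I ask whether you have tried ResNet50 or ResNeXt as the backbone? They should work better than VGG16, right?
Also, does your EAST re-implementation use any tricks? Its performance is somewhat better than @argman's https://github.com/argman/EAST.
Thank you very much!
Hello, my loss won't go down and fluctuates a lot. It starts above 20 and is still above 10 after a dozen or so batches. Is this normal?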
Traceback (most recent call last):
  File "train.py", line 66, in <module>
    train(train_img_path, train_gt_path, pths_path, batch_size, lr, num_workers, epoch_iter, save_interval)
  File "train.py", line 35, in train
    for i, (img, gt_score, gt_geo, ignored_map) in enumerate(train_loader):
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 637, in __next__
    return self._process_next_batch(batch)
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
TypeError: function takes exactly 5 arguments (1 given)
How can I fix it?