
mjq11302010044 / rrpn_plusplus


RRPN++: Guidance Towards More Accurate Scene Text Detection

Languages: Python 49.34%, Jupyter Notebook 34.52%, Cuda 10.68%, C++ 4.47%, C 0.47%, Cython 0.38%, Dockerfile 0.14%

rrpn_plusplus's People

Contributors

mjq11302010044


rrpn_plusplus's Issues

Pretrained model

Hello, thanks for your paper and code. Could you upload some pretrained models?

code

Hello author,
I have a few questions about the code.
First, in the example you gave, the config file is configs/arpn/e2e_rrpn_R_50_C4_1x_train_AFPN_RT_LERB.yaml and its META_ARCHITECTURE is ARPN. Why not use RRPN?
Second, the results I got did not look very good (see the picture below). Could you tell me why?
Third, after I tried to train RRPN with configs/rrpn/e2e_rrpn_R_50_C4_1x_ICDAR13_15_trial.yaml, the loss became NaN. I reduced the learning rate, but it did not help.
Finally, I look forward to hearing from you. Thanks.
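
For the NaN-loss case, a minimal debugging sketch (generic PyTorch, not code from this repo) is to clip gradients and skip any iteration whose total loss is not finite, in addition to lowering the learning rate:

    import torch

    def safe_step(model, optimizer, loss_dict, max_norm=10.0):
        # loss_dict: the per-loss dictionary the trainer logs (loss_classifier, loss_box_reg, ...)
        losses = sum(loss_dict.values())
        if not torch.isfinite(losses):
            # Skip the update so a single bad batch cannot poison the weights.
            print("non-finite loss, skipping step:",
                  {k: float(v) for k, v in loss_dict.items()})
            return
        optimizer.zero_grad()
        losses.backward()
        # Gradient clipping often prevents the early blow-ups that end in NaN.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=max_norm)
        optimizer.step()

If the loss is already NaN within the first few iterations, it is also worth checking the annotations for degenerate (zero-width or zero-height) boxes.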

Alphabet FileNotFoundError

Hi
I am unable to find a source for './data_cache/alphabet_IC13_IC15_Syn800K_pro.txt'.
I understand that training the network generates the file in the ./data_cache folder, as mentioned in #3, but what should I do if I only want to test the model on my own images? Is it possible to provide a link to this file?
Below is the error encountered when trying to test on my images:

  File "rrpn_e2e_infer.py", line 99, in <module>
    alphabet = open(cfg.MODEL.ROI_REC_HEAD.ALPHABET).readlines()[0] + '-'
FileNotFoundError: [Errno 2] No such file or directory: './data_cache/alphabet_IC13_IC15_Syn800K_pro.txt'
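
A hedged workaround sketch, not an official file from the authors: the traceback shows the alphabet is read with readlines()[0], i.e. a single line of characters, so you can write your own file. The character set and ordering below are assumptions; for a pre-trained recognition head they must match whatever the model was trained with, otherwise the decoded text will be wrong.

    import os
    import string

    os.makedirs("./data_cache", exist_ok=True)

    # Assumed character set; the original alphabet_IC13_IC15_Syn800K_pro.txt
    # ordering is unknown and must match the checkpoint you are testing with.
    alphabet = string.digits + string.ascii_letters + string.punctuation

    with open("./data_cache/alphabet_IC13_IC15_Syn800K_pro.txt", "w") as f:
        f.write(alphabet + "\n")

This only removes the FileNotFoundError; it does not guarantee correct recognition output with someone else's checkpoint.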

how to calculate mAP with the RotationDataset

I wonder how to calculate mAP with the RotationDataset. Your code only supports coco_evaluation and voc_evaluation. Could you provide the code for "rotate_evaluation"? Thanks a lot.
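
Until an official rotate_evaluation exists, a minimal sketch of the key ingredient (rotated-box IoU via polygons, using shapely; not necessarily how the repo computes it) can be plugged into a standard VOC-style AP loop. The (cx, cy, w, h, angle) convention and the angle sign below are assumptions and may differ from RRPN's internal convention.

    import numpy as np
    from shapely.geometry import Polygon

    def rbox_to_polygon(cx, cy, w, h, angle_deg):
        # Convert a rotated box (center, size, angle in degrees) into a polygon.
        theta = np.deg2rad(angle_deg)
        cos_t, sin_t = np.cos(theta), np.sin(theta)
        dx, dy = w / 2.0, h / 2.0
        corners = [(-dx, -dy), (dx, -dy), (dx, dy), (-dx, dy)]
        return Polygon([(cx + x * cos_t - y * sin_t,
                         cy + x * sin_t + y * cos_t) for x, y in corners])

    def rotated_iou(box_a, box_b):
        # IoU between two rotated boxes given as (cx, cy, w, h, angle).
        pa, pb = rbox_to_polygon(*box_a), rbox_to_polygon(*box_b)
        inter = pa.intersection(pb).area
        union = pa.area + pb.area - inter
        return inter / union if union > 0 else 0.0

With this IoU, mAP can be computed exactly as in voc_evaluation, replacing the axis-aligned IoU with rotated_iou.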

RuntimeError: CUDA error: device-side assert triggered

My computer configuration:
PyTorch version: 1.0.1
Is debug build: No
CUDA used to build PyTorch: 10.0.130

OS: Ubuntu 18.04.5 LTS
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CMake version: version 3.10.2

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 10.0.130
GPU models and configuration:
GPU 0: GeForce RTX 2080 Ti
GPU 1: GeForce RTX 2080 Ti
GPU 2: GeForce RTX 2080 Ti

Nvidia driver version: 450.66
cuDNN version: Could not collect

Versions of relevant libraries:
[pip] Could not collect
[conda] Could not collect
Pillow (7.1.2)
2020-11-12 20:16:26,367 maskrcnn_benchmark INFO: Loaded configuration file configs/arpn_E2E/e2e_rrpn_R_50_C4_1x_train_AFPN_RT_LERB_Spotter.yaml.

error:
Database: ['IC15'] 861
2020-11-12 20:13:11,100 maskrcnn_benchmark.trainer INFO: Start training
2020-11-12 20:13:12,625 maskrcnn_benchmark.trainer INFO: eta: 4:13:50 iter: 10 loss: 0.8722 (nan) loss_classifier: 0.7357 (nan) loss_box_reg: 0.0000 (nan) loss_rec: 0.0047 (nan) loss_objectness: 0.0988 (0.0985) loss_rpn_box_reg: 0.0330 (0.0599) time: 0.0985 (0.1523) data: 0.0009 (0.0536) lr: 0.000007 max mem: 1632
WARNING:root:NaN or Inf found in input tensor.
WARNING:root:NaN or Inf found in input tensor.
WARNING:root:NaN or Inf found in input tensor.
2020-11-12 20:13:13,538 maskrcnn_benchmark.trainer INFO: eta: 3:23:03 iter: 20 loss: 0.8722 (nan) loss_classifier: 0.7357 (nan) loss_box_reg: 0.0000 (nan) loss_rec: 0.0047 (nan) loss_objectness: 0.0988 (0.0985) loss_rpn_box_reg: 0.0374 (0.0529) time: 0.0908 (0.1219) data: 0.0007 (0.0296) lr: 0.000007 max mem: 1632
INFO:maskrcnn_benchmark.trainer:eta: 3:23:03 iter: 20 loss: 0.8722 (nan) loss_classifier: 0.7357 (nan) loss_box_reg: 0.0000 (nan) loss_rec: 0.0047 (nan) loss_objectness: 0.0988 (0.0985) loss_rpn_box_reg: 0.0374 (0.0529) time: 0.0908 (0.1219) data: 0.0007 (0.0296) lr: 0.000007 max mem: 1632
WARNING:root:NaN or Inf found in input tensor.
WARNING:root:NaN or Inf found in input tensor.
WARNING:root:NaN or Inf found in input tensor.
Traceback (most recent call last):
  File "tools/train_net.py", line 202, in <module>
    main()
  File "tools/train_net.py", line 195, in main
    model = train(cfg, args.local_rank, args.distributed, args.resume, args.config_file)
  File "tools/train_net.py", line 94, in train
    config_file=config_file
  File "/media/tongji/data/fsy_scenetext/RRPN_plusplus/maskrcnn_benchmark/engine/trainer.py", line 84, in do_train
    optimizer.step()
  File "/home/tongji/anaconda3/envs/rrpn_pytorch/lib/python3.6/site-packages/torch/optim/sgd.py", line 101, in step
    buf.mul_(momentum).add_(1 - dampening, d_p)
RuntimeError: CUDA error: device-side assert triggered
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [62,0,0], thread: [96,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [62,0,0], thread: [97,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [62,0,0], thread: [98,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
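
Two generic debugging steps, not specific to this repo, usually localize this kind of error: device-side asserts are reported asynchronously, so the optimizer.step() frame in the traceback is rarely where the out-of-bounds index actually happened. Forcing synchronous kernel launches and range-checking the indices used for gathering (for example character labels against the alphabet size) gives a readable Python error instead:

    import os
    # Must be set before CUDA is initialized, e.g. at the very top of train_net.py.
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

    import torch

    def check_index_range(indices, size, name="indices"):
        # Raise a readable host-side error instead of a device-side assert.
        lo, hi = int(indices.min()), int(indices.max())
        if lo < 0 or hi >= size:
            raise ValueError(f"{name} out of range [{lo}, {hi}] for size {size}")

Given that the losses are already NaN before the crash, checking the annotations and label indices first is probably worthwhile.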

nms

Hello, is NMS used in the test phase?

How could I get the alphabet file?

Sorry! I found that the file "./data_cache/alphabet_IC13_IC15_Syn800K_pro.txt" isn't there when I try to train end-to-end. How can I get it? (I'm not familiar with end-to-end tasks, please help me. Thanks.)

error in demo

I tried to run "demo/rrpn_e2e_infer.py", but it shows an error. Does anyone know the reason?

I just want to test the result using a pre-trained model; did I do anything wrong? Thanks.

warnings.warn(WRONG_COMPILER_WARNING.format(
Traceback (most recent call last):
  File "/root/rrpn/RRPN_plusplus/demo/rrpn_e2e_infer.py", line 11, in <module>
    from demo.predictor import ICDARDemo, RRPNDemo
  File "/root/anaconda3/envs/rrpn_pytorch/lib/python3.9/site-packages/RRPN-0.0.0-py3.9-linux-x86_64.egg/demo/predictor.py", line 6, in <module>
    from maskrcnn_benchmark.modeling.detector import build_detection_model
  File "/root/anaconda3/envs/rrpn_pytorch/lib/python3.9/site-packages/RRPN-0.0.0-py3.9-linux-x86_64.egg/maskrcnn_benchmark/modeling/detector/__init__.py", line 2, in <module>
    from .detectors import build_detection_model
  File "/root/anaconda3/envs/rrpn_pytorch/lib/python3.9/site-packages/RRPN-0.0.0-py3.9-linux-x86_64.egg/maskrcnn_benchmark/modeling/detector/detectors.py", line 2, in <module>
    from .generalized_rcnn import GeneralizedRCNN
  File "/root/anaconda3/envs/rrpn_pytorch/lib/python3.9/site-packages/RRPN-0.0.0-py3.9-linux-x86_64.egg/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 11, in <module>
    from ..backbone import build_backbone
  File "/root/anaconda3/envs/rrpn_pytorch/lib/python3.9/site-packages/RRPN-0.0.0-py3.9-linux-x86_64.egg/maskrcnn_benchmark/modeling/backbone/__init__.py", line 2, in <module>
    from .backbone import build_backbone
  File "/root/anaconda3/envs/rrpn_pytorch/lib/python3.9/site-packages/RRPN-0.0.0-py3.9-linux-x86_64.egg/maskrcnn_benchmark/modeling/backbone/backbone.py", line 7, in <module>
    from maskrcnn_benchmark.modeling.make_layers import conv_with_kaiming_uniform
  File "/root/anaconda3/envs/rrpn_pytorch/lib/python3.9/site-packages/RRPN-0.0.0-py3.9-linux-x86_64.egg/maskrcnn_benchmark/modeling/make_layers.py", line 10, in <module>
    from maskrcnn_benchmark.layers import Conv2d
  File "/root/anaconda3/envs/rrpn_pytorch/lib/python3.9/site-packages/RRPN-0.0.0-py3.9-linux-x86_64.egg/maskrcnn_benchmark/layers/__init__.py", line 8, in <module>
    from .roi_align import ROIAlign
  File "/root/anaconda3/envs/rrpn_pytorch/lib/python3.9/site-packages/RRPN-0.0.0-py3.9-linux-x86_64.egg/maskrcnn_benchmark/layers/roi_align.py", line 9, in <module>
    from ._utils import _C
  File "/root/anaconda3/envs/rrpn_pytorch/lib/python3.9/site-packages/RRPN-0.0.0-py3.9-linux-x86_64.egg/maskrcnn_benchmark/layers/_utils.py", line 39, in <module>
    _C = _load_C_extensions()
  File "/root/anaconda3/envs/rrpn_pytorch/lib/python3.9/site-packages/RRPN-0.0.0-py3.9-linux-x86_64.egg/maskrcnn_benchmark/layers/_utils.py", line 31, in _load_C_extensions
    return load_ext(
  File "/root/anaconda3/envs/rrpn_pytorch/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1124, in load
    return _jit_compile(
  File "/root/anaconda3/envs/rrpn_pytorch/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1337, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/root/anaconda3/envs/rrpn_pytorch/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1436, in _write_ninja_file_and_build_library
    _write_ninja_file_to_build_library(
  File "/root/anaconda3/envs/rrpn_pytorch/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1875, in _write_ninja_file_to_build_library
    _write_ninja_file(
  File "/root/anaconda3/envs/rrpn_pytorch/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1926, in _write_ninja_file
    assert len(sources) > 0
AssertionError

No checkpoint found

Hi
I found a problem when testing the model with python ./tools/test_net.py --config-file=configs/arpn/e2e_rrpn_R_50_C4_1x_test_AFPN.yaml:
2020-12-31 13:35:14,374 maskrcnn_benchmark.utils.checkpoint INFO: No checkpoint found. Initializing model from scratch
2020-12-31 13:35:14,374 maskrcnn_benchmark.data.build WARNING: When using more than one image per GPU you may encounter an out-of-memory (OOM) error if your GPU does not have sufficient memory. If this happens, you can reduce SOLVER.IMS_PER_BATCH (for training) or TEST.IMS_PER_BATCH (for inference). For training, you must also adjust the learning rate and schedule length according to the linear scaling rule. See for example: https://github.com/facebookresearch/Detectron/blob/master/configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml#L14
My GPU is a TITAN RTX with 24 GB of memory. I would like to know what causes this; please let me know.
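
"No checkpoint found. Initializing model from scratch" means the checkpointer did not find trained weights, so the test would run with random weights regardless of GPU memory; the OOM warning below it is a generic data-loader warning and unrelated. A hedged sketch of pointing the test config at a downloaded checkpoint (the MODEL.WEIGHT key is assumed from upstream maskrcnn_benchmark; the path is a placeholder):

    from maskrcnn_benchmark.config import cfg

    cfg.merge_from_file("configs/arpn/e2e_rrpn_R_50_C4_1x_test_AFPN.yaml")
    # Placeholder path: point this at the .pth checkpoint you downloaded.
    cfg.merge_from_list(["MODEL.WEIGHT", "path/to/model_IC15_89.pth"])

Equivalently, the same path can be set directly in the yaml config.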

Help with testing using own images

Hi

The section titled "Testing" provides some information on testing the implementation.
I have tried to follow the steps mentioned there, but I am confused and unable to work out how to test the model on my own images.
What I am looking for is to use the pre-trained models (pre-trained on any dataset) on my own images. Let's say I have a directory with images, /user/sp/test_images, that I would like to test; what modifications do I need to make to the files in this repository? I don't see any step that mentions where to provide your own test data's directory path.
For example, the step "Choose the dataset you want to evaluate on":

TEST:
  DATASET_NAME: "IC15" # Choice can be "IC15", "LSVT" and so on
  MODE: "DET" # DET for detection evaluation or E2E for recognition results in the spotter

doesn't tell you anything about your own test directory path; it only lets you choose from a fixed set of datasets.
I am also unable to work out which fields to change in $RRPN_ROOT/configs/arpn_E2E/e2e_rrpn_R_50_C4_1x_test_AFPN_RT_LERB_Spotter.yaml.
If anyone else has been able to test this implementation on their own data, please share the requisite steps.
Also, I couldn't find the file "$RRPN_ROOT/demo/rrpn_e2e_series.py".
Any help would be appreciated.
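
For reference, a rough sketch of testing on a custom folder. This is not the repo's documented path: RRPNDemo is the class imported in demo/rrpn_e2e_infer.py, but the constructor arguments and the run_on_opencv_image method name below are assumed from maskrcnn_benchmark's demo predictor and may differ in this repo.

    import glob
    import cv2
    from maskrcnn_benchmark.config import cfg
    from demo.predictor import RRPNDemo

    cfg.merge_from_file("configs/arpn_E2E/e2e_rrpn_R_50_C4_1x_test_AFPN_RT_LERB_Spotter.yaml")

    demo = RRPNDemo(cfg)  # constructor signature assumed
    for path in sorted(glob.glob("/user/sp/test_images/*.jpg")):
        image = cv2.imread(path)
        result = demo.run_on_opencv_image(image)  # method name assumed, see above
        print(path, result)

The TEST.DATASET_NAME field appears to matter for the evaluation scripts rather than for an ad-hoc inference loop like this.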

How do you implement the compute_target_rbox method in geo_target.py? Any source for the mathematical computation?

I would like to know the source of the mathematical calculations behind the implementation of compute_target_rbox in geo_target.py. I have gone through various papers such as RRPN++, RRPN, EAST, FOTS, and Pixel-Anchor, but I didn't find anything about it.

I would like to understand dis_numerator and pj_dis in depth, how the targets are created by multiplying pj_dis_coll with the corresponding heatmap, and what the significance of np.mgrid is in creating the rbox targets.

Also, what is the significance of angle_reg_map and A_reg_map in creating the angle targets?

Please provide the source of these computations; this would really help me in my research.

I will also provide my email address for sharing the resources.
My email: [email protected]

Thanks in advance.
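
For what it's worth, the names suggest EAST-style per-pixel geometry targets. A hedged sketch of that general idea, not the author's exact compute_target_rbox: for every pixel inside a rotated box, regress the perpendicular distance to each of the four edges plus the box angle, and zero everything outside the text mask (which is what multiplying by the heatmap achieves); np.mgrid simply builds the per-pixel coordinate grids.

    import numpy as np

    def point_to_edge_distance(px, py, x1, y1, x2, y2):
        # Perpendicular distance from every pixel (px, py) to the line through
        # (x1, y1)-(x2, y2): |cross product| / edge length, i.e. the
        # "dis_numerator / denominator" pattern the question refers to.
        num = np.abs((y2 - y1) * px - (x2 - x1) * py + x2 * y1 - y2 * x1)
        den = np.hypot(x2 - x1, y2 - y1) + 1e-6
        return num / den

    def rbox_targets(h, w, corners, angle, mask):
        # corners: (4, 2) array of box corners in order; mask: (h, w) binary map
        # of the pixels inside the box (the "heatmap" in the question).
        ys, xs = np.mgrid[0:h, 0:w]                # per-pixel coordinate grids
        dist_maps = []
        for i in range(4):
            x1, y1 = corners[i]
            x2, y2 = corners[(i + 1) % 4]
            dist_maps.append(point_to_edge_distance(xs, ys, x1, y1, x2, y2) * mask)
        angle_map = np.full((h, w), angle, dtype=np.float32) * mask  # per-pixel angle target
        return np.stack(dist_maps + [angle_map], axis=0)

Whether angle_reg_map and A_reg_map split the angle into separate regression components is something only the author can confirm.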

Run testing on machine without GPU

Hi

I was following the instructions in INSTALL.md and I am encountering an error at the line:

python rotation_setup.py install

The error is:

Traceback (most recent call last):
  File "rotation_setup.py", line 58, in <module>
    CUDA = locate_cuda()
  File "rotation_setup.py", line 46, in locate_cuda
    raise EnvironmentError('The nvcc binary could not be '
OSError: The nvcc binary could not be located in your $PATH. Either add it to your path, or set $CUDAHOME

This is expected, as I am trying to set up on a machine without a GPU. Could you please suggest how to solve this, i.e. how to set up on a machine without a GPU and run the testing on the CPU?
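
Going only by the error message itself: locate_cuda() looks for nvcc on $PATH or under $CUDAHOME, so if the CUDA toolkit is installed (the toolkit can be installed even on a machine without a GPU), pointing CUDAHOME at it lets this step complete. A hedged sketch; /usr/local/cuda is an assumption about where the toolkit lives:

    import os
    import subprocess

    # Assumed toolkit location; adjust to wherever the CUDA toolkit is installed.
    env = dict(os.environ, CUDAHOME="/usr/local/cuda")
    subprocess.run(["python", "rotation_setup.py", "install"], env=env, check=True)

Whether the rest of the pipeline can then run purely on the CPU is a separate question that the repo's code would have to support.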

Character Content not recognized by ICDAR 2015 pre-trained model

I am using the pre-trained model for testing on my own images. I downloaded model_IC15_89.pth from https://drive.google.com/file/d/1nv-ZjbYBj8ePZRa_fAhbHvzm7HqSxPWK/view?usp=sharing and changed the path to the location of this checkpoint in configs/arpn_E2E/e2e_rrpn_R_50_C4_1x_test_AFPN_RT_LERB_Spotter.yaml.
The output generated by the pre-trained model looks like this:

270,558,331,487,347,500,287,572
236,534,286,479,300,492,251,547
285,570,343,499,358,511,300,583
398,471,441,420,456,433,413,483
394,320,448,256,461,268,407,331
...

I am assuming the detection head of the model is working, since it is able to detect some of the words in the image and output their coordinates, but the recognition head is not able to recognize the character content of the boxes found by the detection head.

@mjq11302010044 Could you please help me understand why the implementation is not able to recognise any characters?
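
Going by the config comment quoted in the "Help with testing using own images" issue above (MODE: "DET" gives detection-only output, "E2E" gives recognition results in the spotter), one thing worth checking is whether the test config is still in DET mode, since the output shown here is exactly box coordinates without transcriptions. A hedged sketch of overriding it (the TEST.MODE key is taken from that quoted snippet; whether the demo script reads it is an assumption):

    from maskrcnn_benchmark.config import cfg

    cfg.merge_from_file("configs/arpn_E2E/e2e_rrpn_R_50_C4_1x_test_AFPN_RT_LERB_Spotter.yaml")
    # "DET" writes only box coordinates; "E2E" should also emit the recognized text.
    cfg.merge_from_list(["TEST.MODE", "E2E"])

If the mode is already E2E, the alphabet file discussed in the other issues is the next thing to verify, since the recognition head decodes characters through it.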
