mhliao / masktextspotter
A PyTorch implementation of Mask TextSpotter
Home Page: https://github.com/MhLiao/MaskTextSpotter
A core dump occurred when I ran test.sh with the downloaded model.
Environment: PyTorch 1.2, GCC 4.8, Python 3.6.
What is wrong? Is it the GCC version or the PyTorch version?
Hi @MhLiao,
Thanks for your amazing work.
When I try to load your pretrained model to run the test,
I get the error below:
File "tools/test_net.py", line 95, in <module>
main()
File "tools/test_net.py", line 64, in main
_ = checkpointer.load(cfg.MODEL.WEIGHT)
File "/home/pc/MaskTextSpotter/maskrcnn_benchmark/utils/checkpoint.py", line 62, in load
self._load_model(checkpoint)
File "/home/pc/MaskTextSpotter/maskrcnn_benchmark/utils/checkpoint.py", line 98, in _load_model
load_state_dict(self.model, checkpoint.pop("model"))
File "/home/pc/MaskTextSpotter/maskrcnn_benchmark/utils/model_serialization.py", line 80, in load_state_dict
model.load_state_dict(model_state_dict)
File "/home/pc/.conda/envs/mts/lib/python3.7/site-packages/torch/nn/modules/module.py", line 777, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for GeneralizedRCNN:
size mismatch for roi_heads.box.feature_extractor.fc6.weight: copying a param with shape torch.Size([1024, 12544]) from checkpoint, the shape in current model is torch.Size([1024, 50176]).
I don't remember what I changed apart from replacing torch.bool with torch.uint8, because I use PyTorch 1.1,
and in pretrain.yaml I only changed the SOLVER part.
What could cause this problem?
Thanks!
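The two shapes in the error can be related with a little arithmetic. This is only a sketch, assuming that the input width of fc6 is the flattened pooled feature map (OUT_CHANNELS times the square of the box head's pooler resolution), and that OUT_CHANNELS is 256. The helper name fc6_in_features is made up for illustration.

```python
def fc6_in_features(out_channels: int, pooler_resolution: int) -> int:
    """Input width of an FC layer fed by a flattened C x R x R pooled feature map.

    Assumption: the box head flattens its pooled features before fc6.
    """
    return out_channels * pooler_resolution ** 2


# 256 channels pooled to 7 x 7 matches the checkpoint's fc6 input width.
checkpoint_in = fc6_in_features(256, 7)
# 256 channels pooled to 14 x 14 matches the width reported for the current model.
current_in = fc6_in_features(256, 14)

print(checkpoint_in, current_in)
```

Under that assumption, the mismatch would correspond to a box-head pooler resolution of 14 in the running config versus 7 when the checkpoint was trained, so the ROI_BOX_HEAD pooler setting may be worth comparing against the released config.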
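For datasets like MLT that have no character-level annotations, doesn't inaccurate detection early in training cause recognition problems, as it does in FOTS? Is training on MLT done by fine-tuning from the pretrained model?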
Can this run on a single GPU?
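Error: FileNotFoundError: [Errno 2] No such file or directory: 'datasets/icdar2013/test_gts/gt_img_1.txt'
Hello, I got the following error while running sh test.sh. The icdar2013 dataset was downloaded from the network-drive link you provided and extracted into the datasets folder.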
Traceback (most recent call last):
File "tools/test_net.py", line 95, in <module>
main()
File "tools/test_net.py", line 89, in main
cfg=cfg,
File "/home/luoyijie/MaskTextSpotter/maskrcnn_benchmark/engine/text_inference.py", line 380, in inference
predictions = compute_on_dataset(model, data_loader, device)
File "/home/luoyijie/MaskTextSpotter/maskrcnn_benchmark/engine/text_inference.py", line 55, in compute_on_dataset
for i, batch in tqdm(enumerate(data_loader)):
File "/home/luoyijie/anaconda3/envs/masktextspotter/lib/python3.7/site-packages/tqdm/std.py", line 1102, in __iter__
for obj in iterable:
File "/home/luoyijie/anaconda3/envs/masktextspotter/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
return self._process_data(data)
File "/home/luoyijie/anaconda3/envs/masktextspotter/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
data.reraise()
File "/home/luoyijie/anaconda3/envs/masktextspotter/lib/python3.7/site-packages/torch/_utils.py", line 369, in reraise
raise self.exc_type(msg)
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/luoyijie/anaconda3/envs/masktextspotter/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/home/luoyijie/anaconda3/envs/masktextspotter/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/luoyijie/anaconda3/envs/masktextspotter/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/luoyijie/MaskTextSpotter/maskrcnn_benchmark/data/datasets/icdar.py", line 32, in __getitem__
words,boxes,charsbbs,segmentations=self.load_gt_from_txt(gt_path,height,width)
File "/home/luoyijie/MaskTextSpotter/maskrcnn_benchmark/data/datasets/icdar.py", line 89, in load_gt_from_txt
lines = open(gt_path).readlines()
FileNotFoundError: [Errno 2] No such file or directory: 'datasets/icdar2013/test_gts/gt_img_224.txt'
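One way to narrow this down is to list which ground-truth files are missing before running the test. The sketch below is not part of the repository. It assumes the naming seen in the traceback (gt_<image stem>.txt); the function name find_missing_gts and the image directory name in the usage comment are hypothetical.

```python
import os


def find_missing_gts(image_dir, gts_dir, image_exts=(".jpg", ".png")):
    """Return the ground-truth paths that are expected but not present on disk.

    Assumption: each image <stem>.<ext> has a ground truth named gt_<stem>.txt,
    as suggested by the path in the traceback.
    """
    missing = []
    for name in sorted(os.listdir(image_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in image_exts:
            continue  # skip anything that is not an image
        gt_path = os.path.join(gts_dir, "gt_" + stem + ".txt")
        if not os.path.isfile(gt_path):
            missing.append(gt_path)
    return missing


# Hypothetical usage; adjust both paths to your own layout:
# print(find_missing_gts("datasets/icdar2013/test_images", "datasets/icdar2013/test_gts"))
```

If every file is reported missing, the test_gts folder itself is probably absent or extracted to a different location.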
What do the labels for total_text and SCUT look like?
Is the total_text label format the same as CTW1500's?
And does SCUT use character labels, with 8 values per box?
Thanks a lot.
Add code to paths_catalog.py
to check whether the test gts dir is available, and skip it if it is not, using:
else: gts_dir = None
Add code to test_net.py
to delete the ./outputs/*/inference
folder before running the actual inference test.
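The two suggestions above could look roughly like this. It is a standalone sketch, not the repository's code: the function names resolve_gts_dir and clear_inference_outputs are invented here, and how they would be wired into paths_catalog.py and test_net.py is left open.

```python
import glob
import os
import shutil


def resolve_gts_dir(gts_dir):
    """Return gts_dir if it exists, else None so the caller can skip ground truth."""
    return gts_dir if os.path.isdir(gts_dir) else None


def clear_inference_outputs(outputs_root="./outputs"):
    """Delete every <outputs_root>/*/inference folder and return the removed paths."""
    removed = []
    for path in glob.glob(os.path.join(outputs_root, "*", "inference")):
        if os.path.isdir(path):
            shutil.rmtree(path)  # removes stale results from a previous run
            removed.append(path)
    return removed
```

Returning None for a missing directory keeps the decision to skip with the caller, and returning the removed paths makes it easy to log what was deleted.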
Thank you for your excellent work!
My training gets stuck for unknown reasons.
2019-12-05 16:06:33,983 maskrcnn_benchmark.trainer INFO: Start training tensor(0, device='cuda:0') chars_boxes.shape: 0
Volatile GPU-Util is 0%.
Please tell me what is going wrong. Thank you very much.
Will this project work with vertical text if given a vertical-text dataset, or would more changes be needed?
Thanks for your great work. Maybe we need a demo that runs inference on a single image.
The pretrain code loads a trained model to initialize the network. How can I train from scratch without the downloaded model?
The loss seems to be very small.
I have 8 GPUs in my machine, but it seems only one is used when testing.
TEST:
CHAR_THRESH: 192
EXPECTED_RESULTS: []
EXPECTED_RESULTS_SIGMA_TOL: 4
IMS_PER_BATCH: 1
VIS: True
2020-03-13 22:28:33,880 maskrcnn_benchmark INFO: Collecting env info (might take some time)
2020-03-13 22:28:38,474 maskrcnn_benchmark INFO:
PyTorch version: 1.4.0
Is debug build: No
CUDA used to build PyTorch: 10.1
OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CMake version: version 3.10.2
Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: GeForce GTX 1080 Ti
GPU 1: GeForce GTX 1080 Ti
GPU 2: GeForce GTX 1080 Ti
GPU 3: GeForce GTX 1080 Ti
GPU 4: GeForce GTX 1080 Ti
GPU 5: GeForce GTX 1080 Ti
GPU 6: GeForce GTX 1080 Ti
GPU 7: GeForce GTX 1080 Ti
Nvidia driver version: 440.33.01
cuDNN version: Could not collect
Versions of relevant libraries:
[pip3] numpy==1.16.4
[pip3] torch==1.2.0
[pip3] torchvision==0.4.0
[conda] torch 1.4.0 pypi_0 pypi
[conda] torchvision 0.5.0 pypi_0 pypi
Pillow (7.0.0)
Test result info:
2020-03-13 22:28:43,048 maskrcnn_benchmark.inference INFO: Start evaluation on 233 images
233it [01:19, 2.95it/s]
2020-03-13 22:30:02,229 maskrcnn_benchmark.inference INFO: Total inference time: 0:01:19.180692 (0.339831295954823 s / img per device, on 1 devices)
Is only one device used?
Hi,
When I pretrain on ICDAR 2013, the best model is not saved during training.
Instead, Pretrain uses the default CHECKPOINT_PERIOD,
which is 2500.
Pretrain should continually save the best model.
Can you provide the "scut-eng-char_train" dataset? We don't know how to select the 1162 images. Thanks.
Thanks to the author for sharing the code. If I want to add, for example, 50 Chinese characters, what should I modify?
I ran test.sh and the speed was 1.47s/it...
Thank you for your work.
Could you please release the SynthText pre-processing code?
I want to test on Total-Text. I would be very grateful if you released the Total-Text model.
When will the evaluation code be released?
Line 234 in afe2279
Hi,
The following line
parts = line.strip().split(',', 8)
should be changed to
parts = line.strip().split(',')
if one wants to train the network on icdar2013. Obviously this will break training on icdar2015.
I will update this once I figure out a workaround. Meanwhile, I am reporting the issue.
Training gets stuck because of this, as the segmentation mask will have zero boxes.
Thanks.
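To make the difference between the two calls concrete, here is a small standalone illustration. Both sample lines are made up for this example and are not taken from either dataset; the only point is how the maxsplit argument of str.split behaves.

```python
# Hypothetical line: 8 coordinates followed by a transcription that contains a comma.
line_with_comma_text = "377,117,463,117,465,130,378,130,Hello, World"

# Hypothetical line: 8 coordinates, a word, then additional numeric fields.
line_with_extra_fields = "1,2,3,4,5,6,7,8,word,9,10,11,12"

# maxsplit=8 stops after 8 splits, so everything after the 8th comma stays in one field.
limited_a = line_with_comma_text.strip().split(",", 8)
limited_b = line_with_extra_fields.strip().split(",", 8)

# Without maxsplit, every comma separates a field.
full_a = line_with_comma_text.strip().split(",")
full_b = line_with_extra_fields.strip().split(",")

print(limited_a[8])  # transcription kept whole, comma included
print(limited_b[8])  # trailing fields are lumped into the 9th element
print(len(full_a), len(full_b))
```

So the limited split protects transcriptions that contain commas, while the unlimited split exposes every trailing field. A loader that has to handle both layouts would probably need to pick the call per dataset rather than use one of them everywhere.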
@MhLiao Thank you for sharing the code. When I train the model on one K80 (11 GB), I need to set IMS_PER_BATCH: 1, otherwise I get a CUDA out-of-memory error. Which parameters in finetune.yaml should I modify so that the batch size can be larger without degrading performance?
I look forward to your reply.
Hi all,
I'm running text spotting on batches of images, with TEST.IMS_PER_BATCH = 16.
An error is raised in the function process_char_mask in text_inference.py: the length of boxes doesn't match char_masks.shape[0].
MaskTextSpotter/maskrcnn_benchmark/engine/text_inference.py", line 232, in process_char_mask
box = list(boxes[index])
IndexError: index 3 is out of bounds for axis 0 with size 3
def process_char_mask(char_masks, boxes, threshold=192):
    texts, rec_scores, rec_char_scores, char_polygons = [], [], [], []
    for index in range(char_masks.shape[0]):
        box = list(boxes[index])  # <-- IndexError is raised here
I tried to trace it back, and found that it only picks out the first element in the batch, as shown below in text_inference.py:
def compute_on_dataset(model, data_loader, device):
    model.eval()
    results_dict = {}
    cpu_device = torch.device("cpu")
    for i, batch in tqdm(enumerate(data_loader)):
        images, targets, image_paths = batch
        images = images.to(device)
        with torch.no_grad():
            predictions = model(images)
            if predictions is not None:
                global_predictions = predictions[0]
                char_predictions = predictions[1]
                char_mask = char_predictions['char_mask']
                boxes = char_predictions['boxes']
                seq_words = char_predictions['seq_outputs']
                seq_scores = char_predictions['seq_scores']
                detailed_seq_scores = char_predictions['detailed_seq_scores']
                global_predictions = [o.to(cpu_device) for o in global_predictions]
                results_dict.update(
                    # <-- only image_paths[0] and global_predictions[0] are kept
                    {image_paths[0]: [global_predictions[0], char_mask, boxes, seq_words, seq_scores, detailed_seq_scores]}
                )
    return results_dict
Is it possible to get all results from the model's predictions? And how can we distinguish the char_mask, boxes, and words for each image in the batch?
Thanks!
Hi, I'm confused about where the lexicon search code is. The paper describes an improved edit distance. Thanks.
@MhLiao @lvpengyuan Detectron2 was just released.
Do you have plans to upgrade MaskTextSpotter to Detectron2?
Is this a problem? (Would the mask head be too heavy and hard to converge?)
@MhLiao
What configuration do you recommend to better detect text-lines?
Thanks for your excellent work!
Any plans to release the Total-Text evaluation code?
@MhLiao Thank you for your hard work.
When I try to train using the Pretrain
command, the error invalid device ordinal
is shown, and then it gets stuck after loading the config. The log is attached here: log.txt
MaskTextSpotter is installed correctly; sh test.sh
runs properly without any issue.
My system specification: Ubuntu 18.04 / 16 GB RAM / GTX 1070 Ti 8 GB / GTX 960 4 GB / Ryzen 2600
The training log:
(mask) home@home-desktop:~/p5/MaskTextSpotter$ python3 -m torch.distributed.launch --nproc_per_node=8 tools/train_net.py --config-file configs/pretrain.yaml
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1570710853631/work/torch/csrc/cuda/Module.cpp line=37 error=10 : invalid device ordinal
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1570710853631/work/torch/csrc/cuda/Module.cpp line=37 error=10 : invalid device ordinal
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1570710853631/work/torch/csrc/cuda/Module.cpp line=37 error=10 : invalid device ordinal
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1570710853631/work/torch/csrc/cuda/Module.cpp line=37 error=10 : invalid device ordinal
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1570710853631/work/torch/csrc/cuda/Module.cpp line=37 error=10 : invalid device ordinal
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1570710853631/work/torch/csrc/cuda/Module.cpp line=37 error=10 : invalid device ordinal
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
File "tools/train_net.py", line 173, in <module>
File "tools/train_net.py", line 173, in <module>
File "tools/train_net.py", line 173, in <module>
Traceback (most recent call last):
File "tools/train_net.py", line 173, in <module>
main()
File "tools/train_net.py", line 140, in main
main()
File "tools/train_net.py", line 140, in main
main()
torch.cuda.set_device(args.local_rank)
File "tools/train_net.py", line 140, in main
Traceback (most recent call last):
File "/home/home/anaconda3/envs/mask/lib/python3.6/site-packages/torch/cuda/__init__.py", line 300, in set_device
File "tools/train_net.py", line 173, in <module>
main()
File "tools/train_net.py", line 140, in main
torch._C._cuda_setDevice(device)
torch.cuda.set_device(args.local_rank)
Traceback (most recent call last):
File "/home/home/anaconda3/envs/mask/lib/python3.6/site-packages/torch/cuda/__init__.py", line 300, in set_device
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1570710853631/work/torch/csrc/cuda/Module.cpp:37
File "tools/train_net.py", line 173, in <module>
torch.cuda.set_device(args.local_rank)
torch.cuda.set_device(args.local_rank)
File "/home/home/anaconda3/envs/mask/lib/python3.6/site-packages/torch/cuda/__init__.py", line 300, in set_device
File "/home/home/anaconda3/envs/mask/lib/python3.6/site-packages/torch/cuda/__init__.py", line 300, in set_device
main()
File "tools/train_net.py", line 140, in main
torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1570710853631/work/torch/csrc/cuda/Module.cpp:37
main()
File "tools/train_net.py", line 140, in main
torch._C._cuda_setDevice(device)
torch.cuda.set_device(args.local_rank)
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1570710853631/work/torch/csrc/cuda/Module.cpp:37
torch._C._cuda_setDevice(device)
File "/home/home/anaconda3/envs/mask/lib/python3.6/site-packages/torch/cuda/__init__.py", line 300, in set_device
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1570710853631/work/torch/csrc/cuda/Module.cpp:37
torch.cuda.set_device(args.local_rank)
File "/home/home/anaconda3/envs/mask/lib/python3.6/site-packages/torch/cuda/__init__.py", line 300, in set_device
torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1570710853631/work/torch/csrc/cuda/Module.cpp:37
torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1570710853631/work/torch/csrc/cuda/Module.cpp:37
2019-10-20 23:27:37,465 maskrcnn_benchmark INFO: Using 8 GPUs
2019-10-20 23:27:37,465 maskrcnn_benchmark INFO: Namespace(config_file='configs/pretrain.yaml', distributed=True, local_rank=0, opts=[], skip_test=False)
2019-10-20 23:27:37,466 maskrcnn_benchmark INFO: Collecting env info (might take some time)
2019-10-20 23:27:38,570 maskrcnn_benchmark INFO:
PyTorch version: 1.3.0
Is debug build: No
CUDA used to build PyTorch: 10.0.130
OS: Linux Mint 19.2 Tina
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CMake version: version 3.10.2
Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 10.0.130
GPU models and configuration:
GPU 0: GeForce GTX 1070 Ti
GPU 1: GeForce GTX 960
Nvidia driver version: 418.74
cuDNN version: Could not collect
Versions of relevant libraries:
[pip] numpy==1.17.3
[pip] torch==1.3.0
[pip] torchvision==0.4.1a0+d94043a
[conda] blas 1.0 mkl
[conda] mkl 2019.4 243
[conda] mkl-service 2.3.0 py36he904b0f_0
[conda] mkl_fft 1.0.14 py36ha843d7b_0
[conda] mkl_random 1.1.0 py36hd6b4f25_0
[conda] pytorch 1.3.0 py3.6_cuda10.0.130_cudnn7.6.3_0 pytorch
[conda] torchvision 0.4.1 py36_cu100 pytorch
Pillow (6.2.0)
2019-10-20 23:27:38,570 maskrcnn_benchmark INFO: Loaded configuration file configs/pretrain.yaml
2019-10-20 23:27:38,570 maskrcnn_benchmark INFO:
MODEL:
META_ARCHITECTURE: "GeneralizedRCNN"
WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
# WEIGHT: "./outputs/synth_pretrain_shrink++/model_0270000.pth"
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
PRE_NMS_TOP_N_TRAIN: 2000
PRE_NMS_TOP_N_TEST: 1000
POST_NMS_TOP_N_TEST: 1000
FPN_POST_NMS_TOP_N_TEST: 1000
ROI_HEADS:
USE_FPN: True
BATCH_SIZE_PER_IMAGE: 512
ROI_BOX_HEAD:
POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
POOLER_SAMPLING_RATIO: 2
FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
PREDICTOR: "FPNPredictor"
NUM_CLASSES: 2
ROI_MASK_HEAD:
POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
FEATURE_EXTRACTOR: "MaskRCNNFPNFeatureExtractor"
PREDICTOR: "SeqCharMaskRCNNC4Predictor"
POOLER_RESOLUTION_H: 16
POOLER_RESOLUTION_W: 64
POOLER_SAMPLING_RATIO: 2
RESOLUTION: 28
RESOLUTION_H: 32
RESOLUTION_W: 128
SHARE_BOX_FEATURE_EXTRACTOR: False
CHAR_NUM_CLASSES: 37
USE_WEIGHTED_CHAR_MASK: True
MASK_BATCH_SIZE_PER_IM: 64
MASK_ON: True
CHAR_MASK_ON: True
SEQUENCE:
SEQ_ON: False
NUM_CHAR: 38
BOS_TOKEN: 0
MAX_LENGTH: 32
TEACHER_FORCE_RATIO: 1.0
TWO_CONV: True
DATASETS:
TRAIN: ("icdar_2015_train",)
TEST: ("icdar_2015_test",)
DATALOADER:
SIZE_DIVISIBILITY: 32
NUM_WORKERS: 4
ASPECT_RATIO_GROUPING: False
SOLVER:
BASE_LR: 0.01 #0.02
WARMUP_FACTOR: 0.1
WEIGHT_DECAY: 0.0001
STEPS: (100000, 160000)
MAX_ITER: 300000
IMS_PER_BATCH: 4
OUTPUT_DIR: "./outputs/pretrain"
TEST:
VIS: False
CHAR_THRESH: 192
IMS_PER_BATCH: 1
INPUT:
MIN_SIZE_TRAIN: (600, 800)
MAX_SIZE_TRAIN: 2333
MIN_SIZE_TEST: 800
MAX_SIZE_TEST: 1333
2019-10-20 23:27:38,571 maskrcnn_benchmark INFO: Running with config:
DATALOADER:
ASPECT_RATIO_GROUPING: False
NUM_WORKERS: 4
SIZE_DIVISIBILITY: 32
DATASETS:
AUG: False
RANDOM_CROP_PROB: 0.0
RATIOS: []
TEST: ('icdar_2015_test',)
TRAIN: ('icdar_2015_train',)
INPUT:
MAX_SIZE_TEST: 1333
MAX_SIZE_TRAIN: 2333
MIN_SIZE_TEST: 800
MIN_SIZE_TRAIN: (600, 800)
PIXEL_MEAN: [102.9801, 115.9465, 122.7717]
PIXEL_STD: [1.0, 1.0, 1.0]
TO_BGR255: True
MODEL:
BACKBONE:
CONV_BODY: R-50-FPN
FREEZE_CONV_BODY_AT: 2
OUT_CHANNELS: 256
CHAR_MASK_ON: True
DEVICE: cuda
MASK_ON: True
META_ARCHITECTURE: GeneralizedRCNN
RESNETS:
NUM_GROUPS: 1
RES2_OUT_CHANNELS: 256
RES5_DILATION: 1
STEM_FUNC: StemWithFixedBatchNorm
STEM_OUT_CHANNELS: 64
STRIDE_IN_1X1: True
TRANS_FUNC: BottleneckWithFixedBatchNorm
WIDTH_PER_GROUP: 64
ROI_BOX_HEAD:
FEATURE_EXTRACTOR: FPN2MLPFeatureExtractor
MLP_HEAD_DIM: 1024
NUM_CLASSES: 2
POOLER_RESOLUTION: 14
POOLER_SAMPLING_RATIO: 2
POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
PREDICTOR: FPNPredictor
ROI_HEADS:
BATCH_SIZE_PER_IMAGE: 512
BBOX_REG_WEIGHTS: (10.0, 10.0, 5.0, 5.0)
BG_IOU_THRESHOLD: 0.5
DETECTIONS_PER_IMG: 100
FG_IOU_THRESHOLD: 0.5
NMS: 0.5
POSITIVE_FRACTION: 0.25
SCORE_THRESH: 0.05
USE_FPN: True
ROI_MASK_HEAD:
CHAR_NUM_CLASSES: 37
CONV_LAYERS: (256, 256, 256, 256)
FEATURE_EXTRACTOR: MaskRCNNFPNFeatureExtractor
MASK_BATCH_SIZE_PER_IM: 64
MLP_HEAD_DIM: 1024
POOLER_RESOLUTION: 14
POOLER_RESOLUTION_H: 16
POOLER_RESOLUTION_W: 64
POOLER_SAMPLING_RATIO: 2
POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
PREDICTOR: SeqCharMaskRCNNC4Predictor
RESOLUTION: 28
RESOLUTION_H: 32
RESOLUTION_W: 128
SHARE_BOX_FEATURE_EXTRACTOR: False
USE_WEIGHTED_CHAR_MASK: True
RPN:
ANCHOR_SIZES: (32, 64, 128, 256, 512)
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
ASPECT_RATIOS: (0.5, 1.0, 2.0)
BATCH_SIZE_PER_IMAGE: 256
BG_IOU_THRESHOLD: 0.3
FG_IOU_THRESHOLD: 0.7
FPN_POST_NMS_TOP_N_TEST: 1000
FPN_POST_NMS_TOP_N_TRAIN: 2000
MIN_SIZE: 0
NMS_THRESH: 0.7
POSITIVE_FRACTION: 0.5
POST_NMS_TOP_N_TEST: 1000
POST_NMS_TOP_N_TRAIN: 2000
PRE_NMS_TOP_N_TEST: 1000
PRE_NMS_TOP_N_TRAIN: 2000
STRADDLE_THRESH: 0
USE_FPN: True
RPN_ONLY: False
WEIGHT: catalog://ImageNetPretrained/MSRA/R-50
OUTPUT_DIR: ./outputs/pretrain
PATHS_CATALOG: /home/home/p5/MaskTextSpotter/maskrcnn_benchmark/config/paths_catalog.py
SEQUENCE:
BOS_TOKEN: 0
MAX_LENGTH: 32
MEAN_SCORE: False
NUM_CHAR: 38
SEQ_ON: False
TEACHER_FORCE_RATIO: 1.0
TWO_CONV: True
SOLVER:
BASE_LR: 0.01
BIAS_LR_FACTOR: 2
CHECKPOINT_PERIOD: 2500
GAMMA: 0.1
IMS_PER_BATCH: 4
MAX_ITER: 300000
MOMENTUM: 0.9
RESUME: True
STEPS: (100000, 160000)
USE_ADAM: False
WARMUP_FACTOR: 0.1
WARMUP_ITERS: 500
WARMUP_METHOD: linear
WEIGHT_DECAY: 0.0001
WEIGHT_DECAY_BIAS: 0
TEST:
CHAR_THRESH: 192
EXPECTED_RESULTS: []
EXPECTED_RESULTS_SIGMA_TOL: 4
IMS_PER_BATCH: 1
VIS: False
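A note on the numbers in this log: the launch command uses --nproc_per_node=8, while the environment dump lists only GPU 0 and GPU 1. The launcher gives each process a local rank from 0 to nproc_per_node - 1, and the traceback shows train_net.py calling torch.cuda.set_device(args.local_rank). The sketch below only does the counting; invalid_local_ranks is a made-up helper, and in practice the GPU count would come from torch.cuda.device_count().

```python
def invalid_local_ranks(nproc_per_node, visible_gpus):
    """Return the local ranks that have no matching CUDA device index.

    Assumption: each launched process selects the device whose index equals
    its local rank, as the traceback suggests.
    """
    return [rank for rank in range(nproc_per_node) if rank >= visible_gpus]


# Values taken from the log: 8 processes requested, 2 GPUs listed.
bad_ranks = invalid_local_ranks(nproc_per_node=8, visible_gpus=2)
print(bad_ranks)
```

Six ranks have no device, which lines up with the six "invalid device ordinal" failures in the log. Launching with --nproc_per_node no larger than the number of visible GPUs would be the first thing to try.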
How can I use only the detection part? Thank you.
When trying to Finetune on icdar2015, I get a "returned non-zero exit status 1" error.
The command:
python3 -m torch.distributed.launch --nproc_per_node=1 tools/train_net.py --config-file configs/finetune.yaml
The terminal log:
(mask) home@home-desktop:~/p5/MaskTextSpotter$ python3 -m torch.distributed.launch --nproc_per_node=1 tools/train_net.py --config-file configs/finetune.yaml
Traceback (most recent call last):
File "tools/train_net.py", line 173, in <module>
main()
File "tools/train_net.py", line 145, in main
cfg.merge_from_file(args.config_file)
File "/home/home/anaconda3/envs/mask/lib/python3.6/site-packages/yacs/config.py", line 213, in merge_from_file
self.merge_from_other_cfg(cfg)
File "/home/home/anaconda3/envs/mask/lib/python3.6/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
_merge_a_into_b(cfg_other, self, self, [])
File "/home/home/anaconda3/envs/mask/lib/python3.6/site-packages/yacs/config.py", line 460, in _merge_a_into_b
_merge_a_into_b(v, b[k], root, key_list + [k])
File "/home/home/anaconda3/envs/mask/lib/python3.6/site-packages/yacs/config.py", line 456, in _merge_a_into_b
v = _check_and_coerce_cfg_value_type(v, b[k], k, full_key)
File "/home/home/anaconda3/envs/mask/lib/python3.6/site-packages/yacs/config.py", line 513, in _check_and_coerce_cfg_value_type
original_type, replacement_type, original, replacement, full_key
ValueError: Type mismatch (<class 'tuple'> vs. <class 'str'>) with values (() vs. icdar_2015_train) for config key: DATASETS.TRAIN
Traceback (most recent call last):
File "/home/home/anaconda3/envs/mask/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/home/anaconda3/envs/mask/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/home/anaconda3/envs/mask/lib/python3.6/site-packages/torch/distributed/launch.py", line 253, in <module>
main()
File "/home/home/anaconda3/envs/mask/lib/python3.6/site-packages/torch/distributed/launch.py", line 249, in main
cmd=cmd)
subprocess.CalledProcessError: Command '['/home/home/anaconda3/envs/mask/bin/python3', '-u', 'tools/train_net.py', '--local_rank=0', '--config-file', 'configs/finetune.yaml']' returned non-zero exit status 1.
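The ValueError says DATASETS.TRAIN was read as a str where a tuple is expected. One common way to end up there is a missing trailing comma in the YAML value. The sketch below assumes the config value is decoded as a Python literal, which is how I understand yacs to treat string values; whether the finetune.yaml in question actually lacks the comma is not shown in the log.

```python
import ast

# Parentheses alone do not make a tuple: this evaluates to a plain string.
without_comma = ast.literal_eval('("icdar_2015_train")')

# The trailing comma is what makes it a one-element tuple.
with_comma = ast.literal_eval('("icdar_2015_train",)')

print(type(without_comma).__name__, type(with_comma).__name__)
```

For comparison, the pretrain.yaml printed in an earlier log on this page writes the value as ("icdar_2015_train",), with the comma.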
When trying to train on icdar2015 using the Pretrain
command:
python3 -m torch.distributed.launch --nproc_per_node=1 ./tools/train_net.py --config-file ./configs/pretrain.yaml
I get the error:
returned non-zero exit status 1.
My Terminal log:
terminal.txt
The pretrain log is attached:
log.txt
RuntimeError: shape mismatch: value tensor of shape [52, 256, 16, 64] cannot be broadcast to indexing result of shape [52, 256, 16, 16]
Hi @MhLiao,
Thanks so much for providing this excellent code base. My team at the US Geological Survey is using it to recognize characters on historical topographic maps. When training, the ICDAR datasets throw an index out of range error in segmentation_mask. If we don't use those datasets (by setting their ratio to zero in the config files), training proceeds normally. We downloaded those datasets from the place you indicate, and have them in the right spot with a train_list.txt file as directed. Please advise us if possible.
Thanks so much,
Sam
@MhLiao If I want to use it to detect only digits and '.', what should I do? Thanks.
toolz 0.10.0
torch 1.2.0
torchvision 0.4.0a0
SOLVER:
IMS_PER_BATCH: 1
python -m torch.distributed.launch --nproc_per_node=1 tools/train_net.py --config-file configs/finetune.yaml
The SCUT dataset, which has 1162 images, was used in the training process. Which datasets does it include? Could you provide a download link? Thanks.
@MhLiao Hi, could you tell me the format of the training datasets, such as ICDAR13, ICDAR15, and syn, used when fine-tuning the model? Does training need ground truth for every character in each word? Thank you!
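I trained a 4000-class Chinese model with Mask TextSpotter (using 40,000 lines of private curved Chinese text data). Sequence recognition works reasonably well, but character segmentation is poor. Is this because the shrink operation on Chinese characters hurts character segmentation, or because the large number of classes makes the task harder? Is this behavior normal?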
@MhLiao Thank you for your hard work.
My config file is attached: pretrain.yaml.zip
When trying to train ICDAR2015 using Pretrain, I keep getting NaN
errors.
My training command:
python3 -m torch.distributed.launch --nproc_per_node=1 tools/train_net.py --config-file configs/pretrain.yaml
INFO:maskrcnn_benchmark.trainer:
loss: nan (nan) loss_classifier: nan (nan) loss_box_reg: nan (nan) loss_mask: nan (nan) loss_char_mask: 0.0000 (0.0000) loss_seq: nan (nan) loss_objectness: nan (nan) loss_rpn_box_reg: nan (nan)
The errors that I keep getting:
WARNING:root:NaN or Inf found in input tensor.
After I finish training the model, how can I use only the detection part? Thanks.