
alibabaresearch / AdvancedLiterateMachinery

Stars: 1.2K · Watchers: 28 · Forks: 147 · Size: 103.51 MB

A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.

License: Apache License 2.0

Python 29.89% Shell 0.11% C++ 59.44% C 7.40% Cuda 2.65% Objective-C 0.35% Objective-C++ 0.16%
artificial-intelligence documentai multimodal multimodal-deep-learning ocr computer-vision vision-language-transformer end-to-end-ocr scene-text-detection scene-text-detection-recognition

advancedliteratemachinery's People

Contributors: alibaba-oss, wangsherpa, yashsandansing

advancedliteratemachinery's Issues

about execution of MGP-STR code

I encountered this error while executing MGP-STR code:

ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.

May I ask how to solve it? Thank you!

Loss convergence question

When training LORE on my own dataset, roughly what values should hm_loss, wh_loss, ax_loss, and sax_loss converge to for good parsing results? Also, for datasets not mentioned in the README, besides max_objs, max_pairs, and max_cors, are there any other parameters that need to be changed?
Looking forward to your reply!

train.py error

I cannot run train.py:

```
python train.py --config=configs/finetune_funsd.yaml

Traceback (most recent call last):
  File "train.py", line 6, in <module>
    from lightning_modules.data_modules.vie_data_module import VIEDataModule
  File "/home/me/projects/UzPostOCR/AdvancedLiterateMachinery/DocumentUnderstanding/GeoLayoutLM/lightning_modules/data_modules/vie_data_module.py", line 8, in <module>
    import cv2
  File "/home/me/miniconda3/envs/geo/lib/python3.8/site-packages/cv2/__init__.py", line 181, in <module>
    bootstrap()
  File "/home/me/miniconda3/envs/geo/lib/python3.8/site-packages/cv2/__init__.py", line 175, in bootstrap
    if __load_extra_py_code_for_module("cv2", submodule, DEBUG):
  File "/home/me/miniconda3/envs/geo/lib/python3.8/site-packages/cv2/__init__.py", line 28, in __load_extra_py_code_for_module
    py_module = importlib.import_module(module_name)
  File "/home/me/miniconda3/envs/geo/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/me/miniconda3/envs/geo/lib/python3.8/site-packages/cv2/gapi/__init__.py", line 301, in <module>
    cv.gapi.wip.GStreamerPipeline = cv.gapi_wip_gst_GStreamerPipeline
AttributeError: partially initialized module 'cv2' has no attribute 'gapi_wip_gst_GStreamerPipeline' (most likely due to a circular import)
```
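For reference, this "partially initialized module 'cv2'" error is often caused by two conflicting OpenCV packages installed in the same environment rather than by the repo itself (an assumption based on the error shape, not a confirmed diagnosis). A minimal check:

```python
# List all installed OpenCV distributions in the active environment; more than
# one entry (e.g. opencv-python plus opencv-python-headless) often breaks the
# cv2 bootstrap. Keeping a single, freshly reinstalled package usually helps.
import importlib.metadata as md

opencv_pkgs = sorted(
    d.metadata["Name"] for d in md.distributions()
    if d.metadata["Name"] and d.metadata["Name"].lower().startswith("opencv")
)
print(opencv_pkgs)
```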

How to pre-train?

Hi. Thanks for the great research.

I would like to train GeoLayoutLM on a dataset in other languages. Would it be possible to provide the pre-training code? Or could you give me a tip: for pre-training, I think I need the loss configurations for these three objectives: GeoPair, GeoMPair, and GeoTriplet.
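For what it's worth, if each of the three objectives yields a scalar loss, the combination could be sketched as below; the function name, weights, and signature are assumptions for illustration, not the released configuration:

```python
import torch

# Minimal sketch of combining GeoLayoutLM's three geometric pre-training
# objectives; the loss tensors and equal weights are placeholders, not the
# authors' code.
def geo_pretrain_loss(loss_geopair: torch.Tensor,
                      loss_geompair: torch.Tensor,
                      loss_geotriplet: torch.Tensor,
                      weights=(1.0, 1.0, 1.0)) -> torch.Tensor:
    w1, w2, w3 = weights
    return w1 * loss_geopair + w2 * loss_geompair + w3 * loss_geotriplet
```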

Running the LORE model's demo_wireless.sh

When running the LORE model's demo_wireless.sh, the results are wrong.
demo_wireless.sh:

```
python demo.py ctdet_mid \
    --dataset table_mid \
    --demo ../input_images/wireless \
    --demo_name demo_wireless \
    --debug 1 \
    --arch resfpnhalf_18 \
    --K 3000 \
    --MK 5000 \
    --upper_left \
    --tsfm_layers 4 \
    --stacking_layers 4 \
    --gpus 0 \
    --wiz_2dpe \
    --wiz_detect \
    --wiz_stacking \
    --convert_onnx 0 \
    --vis_thresh_corner 0.3 \
    --vis_thresh 0.2 \
    --scores_thresh 0.2 \
    --nms \
    --demo_dir ../output/test_wireless/ \
    --load_model ../dir_of_ckpt/ckpt_wireless/model_best.pth \
    --load_processor ../dir_of_ckpt/ckpt_wireless/processor_best.pth
```

The pretrained model loads, but there are problems:
```
Fix size testing.
training chunk_sizes: [1]
The output will be saved to /root/data1/bwq/LORE_new/AdvancedLiterateMachinery/DocumentUnderstanding/LORE-TSR/src/lib/../../exp/ctdet_mid/default
heads {'hm': 2, 'st': 8, 'wh': 8, 'ax': 256, 'cr': 256, 'reg': 2}
loaded ../dir_of_ckpt/ckpt_wireless/model_best.pth, epoch 100
Drop parameter base.base_layer.0.weight.
Drop parameter base.base_layer.1.weight.
[... several hundred similar "Drop parameter base.*", "Drop parameter dla_up.*" and "Drop parameter ida_up.*" lines ...]
Skip loading parameter hm.0.weight, required shape torch.Size([64, 256, 3, 3]), loaded shape torch.Size([256, 64, 3, 3]).
Skip loading parameter hm.0.bias, required shape torch.Size([64]), loaded shape torch.Size([256]).
[... similar "Skip loading parameter" lines for the st, wh, ax, cr and reg heads ...]
No param conv1.weight.
No param bn1.weight.
[... several hundred similar "No param layer*", "No param adaption*" and "No param deconv_layers*" lines ...]
loaded ../dir_of_ckpt/ckpt_wireless/processor_best.pth, epoch 100
100%|██████████| 2/2 [00:00<00:00, 2.21it/s]
```
Looking at the saved files, they are still the original images; the table structure is not recognized at all.
Where could the problem be? The pretrained models were downloaded from the provided links.
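Judging from the log pattern (every "base.*" checkpoint parameter dropped, every "conv1"/"layer*" model parameter missing), the checkpoint appears to be for a different backbone than the one --arch resfpnhalf_18 builds. This is an inference from the log, not a confirmed diagnosis; a quick sanity check:

```python
import torch

# Inspect the checkpoint's parameter names: "base.*" keys indicate a DLA
# backbone, while resfpnhalf expects ResNet-FPN-style "conv1"/"layer1" names.
# A mismatched --arch flag would explain all the dropped/missing parameters.
ckpt = torch.load("../dir_of_ckpt/ckpt_wireless/model_best.pth", map_location="cpu")
state = ckpt.get("state_dict", ckpt)
print(sorted(state)[:5])
```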

LORE finetuning gives weird bounding boxes.

Hello, thank you for the great work you did with LORE.

I am trying to fine-tune the small version of LORE on a custom dataset, but when I do, the model's results get worse.
I tried some zero-shot inference with the ckpt_ptn model weights.
For fine-tuning I used:

```
!python main.py ctdet_small \
    --dataset table_small \
    --exp_id CTU_fold_0 \
    --dataset_name ctu_fold_0 \
    --image_dir /datadrive/images/CTU \
    --wiz_2dpe \
    --wiz_stacking \
    --tsfm_layers 3 \
    --stacking_layers 3 \
    --batch_size 16 \
    --master_batch 12 \
    --arch dla_34 \
    --lr 1e-4 \
    --K 500 \
    --MK 1000 \
    --num_epochs 10 \
    --lr_step '100, 160' \
    --gpus 0 \
    --num_workers 16 \
    --val_intervals 1 \
    --save_all \
    --load_model /datadrive/LORE_experiments/pretrined_checkpoints/small_model_best.pth \
    --load_processor /datadrive/LORE_experiments/pretrined_checkpoints/small_processor_best.pth
```

where small_model_best.pth and small_processor_best.pth are the names I used to save the ckpt_ptn model weights.

I also produced annotations similar to the ones you provided in the subset of PubTabNet. I can provide an example if necessary.

Am I doing something wrong in the script?
Thanks in advance!

LISTER

Will the "LISTER: Neighbor Decoding for Length-Insensitive Scene Text Recognition" be introduced?

Paper & code of GeoLayoutLM?

Hi authors,
Thanks for the great repo!
Could you provide the paper link for GeoLayoutLM?
Furthermore, I noticed the code in the current release is the same as the code in the main branch. Could you update it?

Thank you so much!

MagickWand error; following the hint, the package cannot be downloaded

libraries = load_library() fails:

```
ImportError: MagickWand shared library not found.
You probably had not installed ImageMagick library.

Try to install:
  apt-get install libmagickwand-dev
```

Has anyone run into this problem? I followed the instructions and tried various package sources, but the download keeps failing.
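If the apt sources are unreachable, installing ImageMagick through conda-forge (e.g. `conda install -c conda-forge imagemagick`) is a commonly used alternative; this is a suggestion, not the repo's documented setup. Afterwards, a quick check that the Wand binding can locate the shared library:

```python
# Sanity check: creating a 1x1 image exercises the library-loading path that
# raised the ImportError above.
from wand.image import Image

with Image(width=1, height=1) as img:
    print("ImageMagick OK, image size:", img.size)
```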

Regarding the inference relation head in the GeoLayoutLM and LayoutLMv3 models

I would like to express my gratitude for your hard work.
Upon reviewing the benchmark table, I have observed the impressive accuracy of the relation head in GeoLayoutLM. Would it be possible for you to share the code to reproduce the LayoutLMv3 result, or provide guidance on how to integrate it with the current source code?

demo_imgs.py not working

Thank you for sharing your incredible work. I was trying to run the LevOCR pretrained model on a couple of demo images.
The demo_imgs script wasn't running; I resolved it by making a few changes in demo_imgs.py.

Following are the changes I made:

At Line 141 model.module.load_state_dict(torch.load(opt.saved_model, map_location=device),strict=False)
At Line 150 AlignCollate_demo = AlignCollateTest(imgH=opt.imgH, imgW=opt.imgW)
At Line 171 vision_final_pred, _ = converter.encode_levt(vision_preds_str, src_dict, device=device, batch_max_length=pred_vision.size(1))

So, when I ran the script, this was the output:

[output screenshot]

Can you help me interpret the output, or point me to the right way to run the demo script?

LORE pretrained models

When will the pretrained models for LORE be released, e.g. the model_best.pth and processor_best.pth used during training?

model_ckpt issue

Hi! Thank you very much for open-sourcing the project code!
But I encountered a problem when running train.py: FileNotFoundError: [Errno 2] No such file or directory: 'path/to/geolayoutlm_large_pretrain.pt'. I traced it to configs/GeoLayoutLM/finetune_funsd.yaml, which contains model_ckpt: path/to/geolayoutlm_large_pretrain.pt.
Is this model simply not open-sourced in the project, or is it available and I failed to find it? Hopefully this can be resolved quickly! Thanks a lot!
The full error message is attached below:
```
2023-09-12 09:40:39,088 geolayoutlm_vie.py:60 [INFO] init weight from path/to/geolayoutlm_large_pretrain.pt
Traceback (most recent call last):
  File "train.py", line 55, in <module>
    main()
  File "train.py", line 44, in main
    pl_module = GeoLayoutLMVIEModule(cfg)
  File "/geoLayoutlm/lightning_modules/geolayoutlm_vie_module.py", line 20, in __init__
    super().__init__(cfg)
  File "/geoLayoutlm/lightning_modules/bros_module.py", line 24, in __init__
    self.net = get_model(self.cfg)
  File "/geoLayoutlm/model/__init__.py", line 5, in get_model
    model = GeoLayoutLMVIEModel(cfg=cfg)
  File "/geoLayoutlm/model/geolayoutlm_vie.py", line 56, in __init__
    self._init_weight()
  File "/geoLayoutlm/model/geolayoutlm_vie.py", line 61, in _init_weight
    state_dict = torch.load(model_path, map_location='cpu')
  File "/opt/conda/lib/python3.7/site-packages/torch/serialization.py", line 579, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/opt/conda/lib/python3.7/site-packages/torch/serialization.py", line 230, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/opt/conda/lib/python3.7/site-packages/torch/serialization.py", line 211, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'path/to/geolayoutlm_large_pretrain.pt'
```
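Note that path/to/geolayoutlm_large_pretrain.pt is just the placeholder in the YAML; the error means the config was never pointed at real weights. A minimal sketch of the fix, assuming the config is OmegaConf-compatible and the checkpoint has been obtained separately (the local path below is hypothetical):

```python
# Rewrite model_ckpt in the fine-tuning config to point at downloaded weights.
from omegaconf import OmegaConf

cfg = OmegaConf.load("configs/finetune_funsd.yaml")
cfg.model_ckpt = "/data/ckpts/geolayoutlm_large_pretrain.pt"  # hypothetical path
OmegaConf.save(cfg, "configs/finetune_funsd.yaml")
```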

Geolayoutlm train.py issue

Hello team!

Thanks for such a great contribution to the open-source community.

But sadly I am stuck on one thing: at the end of train.py, it is throwing AttributeError: 'VIEDataModule' object has no attribute '_has_setup_TrainerFn.FITTING'. How can I resolve this issue? Kindly help me.

Code Snippet

```python
if cfg.model.head in ["vie"]:
    pl_module = GeoLayoutLMVIEModule(cfg)
else:
    raise ValueError(f"Not supported head {cfg.model.head}")

# import ipdb; ipdb.set_trace()
data_module = VIEDataModule(cfg, pl_module.net.tokenizer)

trainer.fit(model=pl_module, datamodule=data_module)
```

Error Snippet

```
Exception has occurred: AttributeError
'VIEDataModule' object has no attribute '_has_setup_TrainerFn.FITTING'
  File "D:\TEST\GeoLayoutLM\train.py", line 52, in main
    trainer.fit(model=pl_module, datamodule=data_module)
  File "D:\TEST\GeoLayoutLM\train.py", line 56, in <module>
    main()
AttributeError: 'VIEDataModule' object has no attribute '_has_setup_TrainerFn.FITTING'
```
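The attribute name '_has_setup_TrainerFn.FITTING' comes from PyTorch Lightning internals that changed across releases, so this usually indicates the installed pytorch-lightning version differs from the one the repo was written against (an assumption based on the error shape, not a confirmed diagnosis). Checking and pinning the version is the first thing to try:

```python
# Print the installed Lightning version; if it differs from the pin in the
# repo's requirements file, recreate the environment with that exact version,
# e.g.  pip install "pytorch_lightning==<version from requirements.txt>"
import pytorch_lightning as pl

print(pl.__version__)
```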

About NMS merging the output to avoid overlapping cells?

Hi, the LORE model is great. I have a question: I saw that the inference scripts pass the --nms argument, but in opts.py this argument has action 'store_false'. That means passing --nms actually disables the NMS function. May I ask why? When I run some sample images I get lots of overlapping cells, and I think using NMS could reduce this problem.
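The observation about store_false is worth spelling out, since the argparse semantics here are counter-intuitive: with action='store_false' the destination defaults to True and the flag turns it off. A self-contained illustration:

```python
import argparse

# With action="store_false", the value is True by default and becomes False
# only when the flag is supplied on the command line.
parser = argparse.ArgumentParser()
parser.add_argument("--nms", action="store_false")

print(parser.parse_args([]).nms)         # True  -> NMS enabled by default
print(parser.parse_args(["--nms"]).nms)  # False -> passing --nms disables it
```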

LevOCR character-set question

Hello, if in addition to upper/lowercase letters and digits I also need to detect some special symbols such as -, |, and ~, do I need to retrain the whole model? If so, roughly how long would training on the ST+MJ datasets take on 8 A100 GPUs?

LORE's loss terms

Hello, sorry to bother you; I have two questions.
1. I cannot find the inner and inter loss functions mentioned in the paper anywhere in the LORE code. Did I miss them, or are they not released?
2. The labels in the WTW dataset you released do not fully match the images in the downloaded WTW; could you provide the version of the dataset used in your experiments? Many thanks.

About deploying GeoLayoutLM

Thank you very much for your work. We tried it on our own dataset and the results are good, and we are now preparing for deployment. Following the usual route we tried to convert the model to ONNX, but ran into problems and could not complete the export.
Do you have any suggestions for deploying this model?
Many thanks!
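As a general pointer (a sketch under assumed input names and shapes, not the repo's supported export path), multi-input transformer models like GeoLayoutLM are usually exported through a thin wrapper that pins down a positional forward signature:

```python
import torch

# Hypothetical export wrapper: fix a positional signature so torch.onnx.export
# can trace the model. Input names and shapes are assumptions; the real model
# also takes block boxes and attention masks.
class ExportWrapper(torch.nn.Module):
    def __init__(self, net: torch.nn.Module):
        super().__init__()
        self.net = net

    def forward(self, input_ids, bbox, image):
        return self.net(input_ids=input_ids, bbox=bbox, image=image)

# torch.onnx.export(
#     ExportWrapper(model), (input_ids, bbox, image), "geolayoutlm.onnx",
#     input_names=["input_ids", "bbox", "image"],
#     dynamic_axes={"input_ids": {0: "batch", 1: "seq"},
#                   "bbox": {0: "batch", 1: "seq"}},
#     opset_version=14,
# )
```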

Entity Grouping function for GeoLayoutLM

Hello team!

Thanks for such a great contribution to the open-source community.

Can you please explain the logic behind the entity grouping done to identify entities before linking key-value pairs? It would be great if you could also point me to the respective function in the repo.

Thanks!

LORE's labels

Could you please tell me the meaning of LORE's labels, especially the segmentation, logic_axis, area, iscrowd, and ignore fields in the annotations?
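For readers hitting the same question, here is one hedged reading of a single annotation entry, inferred from the COCO convention and the LORE paper rather than confirmed by the authors:

```python
# Hypothetical example of one LORE annotation (COCO-style); the field
# meanings in the comments are inferences, not official documentation.
ann = {
    "segmentation": [[10, 10, 110, 10, 110, 40, 10, 40]],  # cell polygon (4 corners)
    "logic_axis": [[0, 0, 1, 2]],  # [row_start, row_end, col_start, col_end]
    "area": 3000.0,                # polygon area, a standard COCO field
    "iscrowd": 0,                  # standard COCO flag, normally 0 here
    "ignore": 0,                   # whether the cell is excluded from evaluation
}
print(ann["logic_axis"])
```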

The Chinese GeoLayoutLM model on ModelScope

Hello, and thanks for releasing such great work. The relation heads contribute a lot to the accuracy gains, but the multilingual model on ModelScope does not include the relation-head parameters. Without loading the relation heads, training directly on FUNSD only reaches about 70%+. Could you provide the relation-head parameters?

Inference GeoLayoutLM

Hello! Thanks a lot for the great work!
I have a question regarding model inference. As far as I can see, the model's forward pass requires block boxes as input, but to obtain the block boxes you would first need another model, for example LayoutLM.
Is it possible to run inference with only an image, word boxes, and their corresponding text as input? Could you provide the inference code?

pairing of key value

I want to know how to do only the key-value pairing, because I already have the labels.

LORE-TSR's TEDS metrics?

How does LORE-TSR compute the TEDS metric on the PubTabNet dataset?
How are blank cells handled?
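For context, TEDS (Tree-Edit-Distance-based Similarity, defined in the PubTabNet paper) compares the HTML trees of the predicted and ground-truth tables:

$$\mathrm{TEDS}(T_a, T_b) = 1 - \frac{\mathrm{EditDist}(T_a, T_b)}{\max(|T_a|, |T_b|)}$$

where $T_a$ and $T_b$ are the tables represented as HTML trees and $\mathrm{EditDist}$ is the tree edit distance. How LORE-TSR builds those trees, and whether blank cells get their own nodes, is exactly what this question asks.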

Getting logical coordinates from PubTabNet, and questions about wireless tables

Hi, I have two questions:
1. How do you obtain logical coordinates from PubTabNet?
2. For wireless (borderless) tables, is it impossible to predict the logical coordinates of empty cells? In post-processing, is it then impossible to tell whether two or more adjacent empty cells are actually one cell or several?
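On question 1, a common way to derive logical coordinates from PubTabNet's HTML structure is to walk the rows while tracking rowspan/colspan occupancy; a simplified sketch (my own, not the authors' pipeline, and it assumes the HTML token stream has already been parsed into per-cell spans):

```python
# Derive [row_start, row_end, col_start, col_end] for each cell from
# per-row (rowspan, colspan) pairs, skipping grid positions already
# occupied by cells spanning down from earlier rows.
def logical_coords(rows):
    """rows: list of rows, each a list of (rowspan, colspan) per cell."""
    occupied, out = {}, []
    for r, row in enumerate(rows):
        c = 0
        for rs, cs in row:
            while (r, c) in occupied:   # slide past cells spanning from above
                c += 1
            out.append([r, r + rs - 1, c, c + cs - 1])
            for dr in range(rs):
                for dc in range(cs):
                    occupied[(r + dr, c + dc)] = True
            c += cs
    return out

# A 2x3 table whose first cell spans two columns and second spans two rows:
print(logical_coords([[(1, 2), (2, 1)], [(1, 1), (1, 1)]]))
# -> [[0, 0, 0, 1], [0, 1, 2, 2], [1, 1, 0, 0], [1, 1, 1, 1]]
```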

Visualize the GeoLayoutLM results?

Here I used the pre-trained model from ModelScope, but I don't know how to get the head outputs and visualize the text and image embeddings for a given image. I attached the ModelScope code below for reference.

```python
from modelscope.models import Model
from modelscope.pipelines import pipeline

model = Model.from_pretrained('damo/multi-modal_convnext-roberta-base_vldoc-embedding')
doc_VL_emb_pipeline = pipeline(task='document-vl-embedding', model=model)

inp = {
    'images': ['data/demo.png'],
    'ocr_info_paths': ['data/demo.json']
}
result = doc_VL_emb_pipeline(inp)

print('Results of VLDoc: ')
for k, v in result.items():
    print(f'{k}: {v.size()}')
# Expected output:
# img_embedding: torch.Size([1, 151, 768]), 151 = 1 global img feature + 150 segment features
# text_embedding: torch.Size([1, 512, 768])
```

Thanks in advance!

Change BERT

Hello!

I want to train using a Korean dataset. Can I change the BERT model? If so, which part should I modify?

D4LA Dataset

Hello, when will the D4LA Dataset be released? This is a great dataset and I can't wait to try it out.

Annotation format used by LORE-TSR

Hello, I want to evaluate LORE-TSR on the PubTabNet, SciTSR, and WTW datasets, but these datasets ship with different annotation formats: PubTabNet uses JSONL, SciTSR uses chunk/structure files, and WTW uses XML. Do I need to convert them to a unified annotation format, and if so, what should that format be?
