yolat-vectorgraphicsrecognition's Introduction

🔥 [NIPS2021, TPAMI2024] YOLaT & YOLaT++: Powerful and Efficient Graph Models for Vector Graphics Recognition

📜 Introduction

This repository is the official PyTorch implementation of our two powerful vector graphics recognition models.

NeurIPS-2021 paper: Recognizing Vector Graphics without Rasterization.

TPAMI-2024 paper: Hierarchical Recognizing Vector Graphics and A New Chart-based Vector Graphics Dataset

Rendering vector graphics into pixel arrays can incur significant memory costs or lose information, as illustrated in Figure 1. Rasterization also discards the high-level structural information within the primitives, which is critical for recognition tasks such as identifying corners and contours. To address these issues, we propose the You Only Look at Text series (YOLaT and YOLaT++), which takes the textual document of a vector graphic as input instead of a rendered image.
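To make the point about structural information concrete, here is a minimal sketch that reads primitives straight from SVG text using only the Python standard library. The SVG snippet and the helper name list_primitives are made up for illustration; this is not the parser used in this repository.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

# A tiny hypothetical SVG document, used only for illustration.
SVG_TEXT = """<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
  <line x1="0" y1="0" x2="100" y2="0"/>
  <rect x="10" y="10" width="30" height="20"/>
  <path d="M 10 80 C 40 10, 65 10, 95 80"/>
</svg>"""


def list_primitives(svg_text):
    """Return (tag, attributes) pairs for every element below the <svg> root."""
    root = ET.fromstring(svg_text)
    primitives = []
    for elem in root.iter():
        tag = elem.tag.replace(SVG_NS, "")  # strip the XML namespace prefix
        if tag != "svg":
            primitives.append((tag, dict(elem.attrib)))
    return primitives


primitives = list_primitives(SVG_TEXT)
for tag, attrs in primitives:
    print(tag, attrs)
```

Each primitive keeps its exact coordinates and type, which is the information a rasterized image would have to recover from pixels.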

Environments

conda create -n your_env_name python=3.8
conda activate your_env_name
sh deepgcn_env_install.sh

YOLaT

1. Data Preparation

Floorplans

a) Download and unzip the Floorplans dataset to the dataset folder: data/FloorPlansGraph5_iter

b) Run the following script to prepare the dataset for training/inference.

cd utils
python svg_utils/build_graph_bbox.py

Diagrams

a) Download and unzip the Diagrams dataset to the dataset folder: data/diagrams

b) Run the following script to prepare the dataset for training/inference.

cd utils
python svg_utils/build_graph_bbox_diagram.py
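Before running the preparation scripts, it can help to confirm that the datasets were unzipped where the steps above expect them. The following is a small sketch under the assumption that the datasets contain .svg files somewhere below the dataset folder; the helper name count_svg_files is hypothetical and not part of this repository.

```python
from pathlib import Path


def count_svg_files(dataset_dir):
    """Count .svg files below dataset_dir; return None if the folder is missing."""
    root = Path(dataset_dir)
    if not root.is_dir():
        return None
    return sum(1 for _ in root.rglob("*.svg"))


# Folder names taken from the data preparation steps above.
for folder in ("data/FloorPlansGraph5_iter", "data/diagrams"):
    n = count_svg_files(folder)
    if n is None:
        print(f"{folder}: folder not found")
    else:
        print(f"{folder}: {n} .svg files")
```

Run it from the repository root so that the relative data/ paths resolve.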

2. Training & Inference

Floorplans

cd cad_recognition
CUDA_VISIBLE_DEVICES=0 python -u train.py --batch_size 4 --data_dir data/FloorPlansGraph5_iter --phase train --lr 2.5e-4 --lr_adjust_freq 9999999999999999999999999999999999999 --in_channels 5 --n_blocks 2 --n_blocks_out 2 --arch centernet3cc_rpn_gp_iter2  --graph bezier_cc_bb_iter --data_aug true  --weight_decay 1e-5 --postname run182_2 --dropout 0.0 --do_mixup 0 --bbox_sampling_step 10

Diagrams

cd cad_recognition
CUDA_VISIBLE_DEVICES=0 python -u train.py --batch_size 4 --data_dir data/diagrams --phase train --lr 2.5e-4 --lr_adjust_freq 9999999999999999999999999999999999999 --in_channels 5 --n_blocks 2 --n_blocks_out 2 --arch centernet3cc_rpn_gp_iter2  --graph bezier_cc_bb_iter --data_aug true  --weight_decay 1e-5 --postname run182_2 --dropout 0.0 --do_mixup 0 --bbox_sampling_step 5
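The two training commands differ only in --data_dir and --bbox_sampling_step. As a convenience, the sketch below assembles either command from a shared list of flags. All flag names and values are copied from the commands above; the helper build_train_command and the DATASETS table are hypothetical and not part of this repository.

```python
import shlex

# Flags shared by both training commands above.
COMMON_ARGS = [
    "--batch_size", "4",
    "--phase", "train",
    "--lr", "2.5e-4",
    "--lr_adjust_freq", "9999999999999999999999999999999999999",
    "--in_channels", "5",
    "--n_blocks", "2",
    "--n_blocks_out", "2",
    "--arch", "centernet3cc_rpn_gp_iter2",
    "--graph", "bezier_cc_bb_iter",
    "--data_aug", "true",
    "--weight_decay", "1e-5",
    "--postname", "run182_2",
    "--dropout", "0.0",
    "--do_mixup", "0",
]

# Only these two values differ between the datasets.
DATASETS = {
    "floorplans": {"data_dir": "data/FloorPlansGraph5_iter", "bbox_sampling_step": "10"},
    "diagrams": {"data_dir": "data/diagrams", "bbox_sampling_step": "5"},
}


def build_train_command(dataset):
    """Return the train.py argument list for 'floorplans' or 'diagrams'."""
    cfg = DATASETS[dataset]
    return [
        "python", "-u", "train.py",
        "--data_dir", cfg["data_dir"],
        "--bbox_sampling_step", cfg["bbox_sampling_step"],
    ] + COMMON_ARGS


print(shlex.join(build_train_command("diagrams")))
```

The resulting list can be passed to subprocess.run with cwd set to cad_recognition and CUDA_VISIBLE_DEVICES set in the env mapping, which mirrors the shell commands above.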

YOLaT++

YOLaT++ adds a hierarchical structure designed for vector graphics (VGs) that spans three levels: Primitive, Curve, and Point. It also employs a position-aware enhancement strategy to effectively differentiate similar primitives.
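The three levels can be pictured as nested containers: a primitive is made of curves, and a curve is made of points. The sketch below only illustrates that nesting. The class and field names are assumptions chosen for this example and do not reflect how YOLaT++ stores its graph internally.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]  # Point level: a single (x, y) coordinate


@dataclass
class Curve:
    """Curve level: one segment described by its points."""
    points: List[Point]


@dataclass
class Primitive:
    """Primitive level: a shape made of one or more curves."""
    kind: str
    curves: List[Curve] = field(default_factory=list)

    def all_points(self) -> List[Point]:
        """Flatten the hierarchy down to the point level."""
        return [p for curve in self.curves for p in curve.points]


# A rectangle expressed as four straight segments (hypothetical example data).
corners = [(10.0, 10.0), (40.0, 10.0), (40.0, 30.0), (10.0, 30.0)]
rect = Primitive(
    kind="rect",
    curves=[Curve([corners[i], corners[(i + 1) % 4]]) for i in range(4)],
)
print(rect.kind, len(rect.curves), len(rect.all_points()))
```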

Citation

BibTeX:

@inproceedings{jiang2021recognizing,
title={{Recognizing Vector Graphics without Rasterization}},
author={Jiang, Xinyang and Liu, Lu and Shan, Caihua and Shen, Yifei and Dong, Xuanyi and Li, Dongsheng},
booktitle={Proceedings of Advances in Neural Information Processing Systems (NIPS)},
volume={34},
number={},
pages={},
year={2021}}

@article{yolat24,
title={{Hierarchical Recognizing Vector Graphics and A New Chart-based Vector Graphics Dataset}},
author={Dou, Shuguang and Jiang, Xinyang and Liu, Lu and Ying, Lu and Shan, Caihua and Shen, Yifei and Dong, Xuanyi and Wang, Yun and Li, Dongsheng and Zhao, Cairong},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
volume={},
number={},
pages={},
year={2024}}

If you find this repository helpful, please consider giving the project a 🌟 star and sharing it with your community!

yolat-vectorgraphicsrecognition's Issues

How to use the training result to predict on a new SVG file

Dr. Jiang,
Sorry to bother you. I am a novice and not very familiar with how to use the trained model for new predictions. Could you please explain how to do that? I followed the steps in the README to train the model and obtained the file "run182_2_best.pth". How can I use this training result to predict on new files? Thank you.

About test.py

Dr. Jiang,
Sorry to bother you.
I ran the command "CUDA_VISIBLE_DEVICES=0 python -u cad_recognition/test.py --data_dir data/FloorPlansGraph5_iter --pretrained_model log/run182_2_best.pth" with the code concerning "opt.arch" and "opt.graph" commented out.
However, both before and after that change, I still got these errors:
"size mismatch for cls_net.fusion_block.0.weight: copying a param with shape torch.Size([1024, 128]) from checkpoint, the shape in current model is torch.Size([1024, 448]).
size mismatch for cls_net.fusion_block_super.0.weight: copying a param with shape torch.Size([1024, 128]) from checkpoint, the shape in current model is torch.Size([1024, 448]).
size mismatch for prediction_cls.0.0.weight: copying a param with shape torch.Size([512, 2304]) from checkpoint, the shape in current model is torch.Size([512, 2944])."
This is really confusing, since the model was saved by "def save_checkpoint()" but its shapes do not match when the model is loaded.
Could you help resolve this issue?
Thanks a lot. I look forward to your response.

Best regards,
VivianBB.
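A note on diagnosing this kind of error: a size mismatch means the model being built at test time has different layer shapes than the one that was saved. One common cause, offered here only as a possibility, is that the test command omits model options that were passed at training time. The sketch below lists every parameter whose shape differs between two name-to-shape mappings. The function name find_shape_mismatches is hypothetical. With PyTorch, such mappings could be built as {k: tuple(v.shape) for k, v in state_dict.items()}; how this repository's checkpoint file is laid out is not stated here, so that step is left out.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for parameters that
    exist in both mappings but have different shapes."""
    return {
        name: (tuple(shape), tuple(model_shapes[name]))
        for name, shape in checkpoint_shapes.items()
        if name in model_shapes and tuple(shape) != tuple(model_shapes[name])
    }


# Shapes copied from the error message in the issue above.
checkpoint_shapes = {
    "cls_net.fusion_block.0.weight": (1024, 128),
    "cls_net.fusion_block_super.0.weight": (1024, 128),
    "prediction_cls.0.0.weight": (512, 2304),
}
model_shapes = {
    "cls_net.fusion_block.0.weight": (1024, 448),
    "cls_net.fusion_block_super.0.weight": (1024, 448),
    "prediction_cls.0.0.weight": (512, 2944),
}

mismatches = find_shape_mismatches(checkpoint_shapes, model_shapes)
for name, (ckpt_shape, model_shape) in sorted(mismatches.items()):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

Comparing the full list of mismatches against the options used for training can show which setting changed between the two runs.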
