
microsoft / yolat-vectorgraphicsrecognition


Source Code of NeurIPS21 and T-PAMI24 paper: Recognizing Vector Graphics without Rasterization

License: MIT License

Python 99.56% Shell 0.44%
gnn-model vector-database vector-graphics

yolat-vectorgraphicsrecognition's Introduction

🔥 [NIPS2021, TPAMI2024] YOLaT & YOLaT++: Powerful and Efficient Graph Models for Vector Graphics Recognition

📜 Introduction


This repository is the official PyTorch implementation of our two powerful vector graphics recognition models.

NeurIPS-2021 paper: Recognizing Vector Graphics without Rasterization.

TPAMI-2024 paper: Hierarchical Recognizing Vector Graphics and A New Chart-based Vector Graphics Dataset


Rendering vector graphics into pixel arrays can incur significant memory costs or lose information, as shown in Figure 1. Rendering also discards the high-level structural information within the primitives, which is critical for recognition tasks such as identifying corners and contours. To address these issues with raster graphics, we propose the You Only Look at Text series (YOLaT and YOLaT++), which takes the textual documents of vector graphics as input.
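To make the contrast concrete, here is a minimal sketch that reads primitives directly from SVG text. It is not the repository's pipeline: the toy SVG, the helper names `read_lines` and `shared_corners`, and the 1024x1024 raster size are all assumptions chosen for illustration. It shows that two line primitives need only eight coordinates, while a single-channel raster of the same canvas needs over a million values, and that a corner can be recovered exactly from shared endpoints.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy SVG with two straight line primitives (not from the datasets).
SVG_TEXT = """<svg xmlns="http://www.w3.org/2000/svg" width="1024" height="1024">
  <line x1="0" y1="0" x2="1024" y2="0"/>
  <line x1="1024" y1="0" x2="1024" y2="1024"/>
</svg>"""

SVG_NS = "{http://www.w3.org/2000/svg}"


def read_lines(svg_text):
    """Return each <line> primitive as ((x1, y1), (x2, y2))."""
    root = ET.fromstring(svg_text)
    lines = []
    for elem in root.iter(SVG_NS + "line"):
        start = (float(elem.get("x1")), float(elem.get("y1")))
        end = (float(elem.get("x2")), float(elem.get("y2")))
        lines.append((start, end))
    return lines


def shared_corners(lines):
    """Return endpoints that belong to more than one primitive."""
    seen = {}
    for start, end in lines:
        for point in (start, end):
            seen[point] = seen.get(point, 0) + 1
    return [point for point, count in seen.items() if count > 1]


lines = read_lines(SVG_TEXT)
vector_values = 4 * len(lines)   # four coordinates per line primitive
raster_values = 1024 * 1024      # one value per pixel, single channel
corners = shared_corners(lines)

print(vector_values, raster_values, corners)
```

The corner is read off the text exactly, with no dependence on rendering resolution.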

Environments

conda create -n your_env_name python=3.8
conda activate your_env_name
sh deepgcn_env_install.sh 

YOLaT

1. Data Preparation

Floorplans

a) Download and unzip the Floorplans dataset to the dataset folder: data/FloorPlansGraph5_iter

b) Run the following scripts to prepare the dataset for training/inference.

cd utils
python svg_utils/build_graph_bbox.py

Diagrams

a) Download and unzip the Diagrams dataset to the dataset folder: data/diagrams

b) Run the following scripts to prepare the dataset for training/inference.

cd utils
python svg_utils/build_graph_bbox_diagram.py

2. Training & Inference

Floorplans

cd cad_recognition
CUDA_VISIBLE_DEVICES=0 python -u train.py --batch_size 4 --data_dir data/FloorPlansGraph5_iter --phase train --lr 2.5e-4 --lr_adjust_freq 9999999999999999999999999999999999999 --in_channels 5 --n_blocks 2 --n_blocks_out 2 --arch centernet3cc_rpn_gp_iter2  --graph bezier_cc_bb_iter --data_aug true  --weight_decay 1e-5 --postname run182_2 --dropout 0.0 --do_mixup 0 --bbox_sampling_step 10

Diagrams

cd cad_recognition
CUDA_VISIBLE_DEVICES=0 python -u train.py --batch_size 4 --data_dir data/diagrams --phase train --lr 2.5e-4 --lr_adjust_freq 9999999999999999999999999999999999999 --in_channels 5 --n_blocks 2 --n_blocks_out 2 --arch centernet3cc_rpn_gp_iter2  --graph bezier_cc_bb_iter --data_aug true  --weight_decay 1e-5 --postname run182_2 --dropout 0.0 --do_mixup 0 --bbox_sampling_step 5
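The two training commands differ only in `--data_dir` and `--bbox_sampling_step`. The sketch below makes that explicit by building both argument lists from one shared table. The helper `build_train_command` and the dictionary names are hypothetical, and it assumes `train.py` accepts these flags in any order, which the source does not state. Setting `CUDA_VISIBLE_DEVICES` and changing into `cad_recognition` are still left to the caller.

```python
import shlex

# Flags shared by both training commands, copied from the commands above.
COMMON_ARGS = {
    "--batch_size": "4",
    "--phase": "train",
    "--lr": "2.5e-4",
    "--lr_adjust_freq": "9999999999999999999999999999999999999",
    "--in_channels": "5",
    "--n_blocks": "2",
    "--n_blocks_out": "2",
    "--arch": "centernet3cc_rpn_gp_iter2",
    "--graph": "bezier_cc_bb_iter",
    "--data_aug": "true",
    "--weight_decay": "1e-5",
    "--postname": "run182_2",
    "--dropout": "0.0",
    "--do_mixup": "0",
}

# The only per-dataset differences between the two commands.
DATASET_ARGS = {
    "floorplans": {
        "--data_dir": "data/FloorPlansGraph5_iter",
        "--bbox_sampling_step": "10",
    },
    "diagrams": {
        "--data_dir": "data/diagrams",
        "--bbox_sampling_step": "5",
    },
}


def build_train_command(dataset):
    """Return the train.py argument list for a dataset name (hypothetical helper)."""
    args = {**COMMON_ARGS, **DATASET_ARGS[dataset]}
    command = ["python", "-u", "train.py"]
    for flag, value in args.items():
        command += [flag, value]
    return command


floorplans_cmd = build_train_command("floorplans")
diagrams_cmd = build_train_command("diagrams")

# Print a shell-ready line; it could also be passed to subprocess.run().
print(shlex.join(diagrams_cmd))
```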

YOLaT++


YOLaT++ uses a hierarchical structure designed for vector graphics (VGs) that spans three levels: Primitive, Curve, and Point. It also employs a position-aware enhancement strategy to differentiate similar primitives.
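One way to picture the three levels is as nested containers. The sketch below assumes that a primitive is composed of curves and that each curve is an ordered list of points. The class names, the `rectangle` helper, and the `bounding_box` method are hypothetical and are not taken from the YOLaT++ code.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Point:
    x: float
    y: float


@dataclass
class Curve:
    # Assumption: a curve is an ordered list of its points.
    points: List[Point] = field(default_factory=list)


@dataclass
class Primitive:
    # Assumption: a primitive (e.g. a rectangle) is a list of curves.
    kind: str
    curves: List[Curve] = field(default_factory=list)

    def bounding_box(self) -> Tuple[float, float, float, float]:
        """Axis-aligned box (min_x, min_y, max_x, max_y) over all points."""
        xs = [p.x for c in self.curves for p in c.points]
        ys = [p.y for c in self.curves for p in c.points]
        return (min(xs), min(ys), max(xs), max(ys))


def rectangle(x: float, y: float, w: float, h: float) -> Primitive:
    """Build a rectangle primitive from four straight edges."""
    corners = [Point(x, y), Point(x + w, y), Point(x + w, y + h), Point(x, y + h)]
    curves = [Curve([corners[i], corners[(i + 1) % 4]]) for i in range(4)]
    return Primitive("rect", curves)


rect = rectangle(10.0, 20.0, 30.0, 40.0)
print(rect.kind, len(rect.curves), rect.bounding_box())
```

Two rectangles of identical shape would differ here only in their coordinates, which gives a rough intuition for why position information can help separate similar primitives.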

Citation

BibTeX:

@inproceedings{jiang2021recognizing,
title={{Recognizing Vector Graphics without Rasterization}},
author={Jiang, Xinyang and Liu, Lu and Shan, Caihua and Shen, Yifei and Dong, Xuanyi and Li, Dongsheng},
booktitle={Proceedings of Advances in Neural Information Processing Systems (NIPS)},
volume={34},
number={},
pages={},
year={2021}}

@article{yolat24,
title={{Hierarchical Recognizing Vector Graphics and A New Chart-based Vector Graphics Dataset}},
author={Dou, Shuguang and Jiang, Xinyang and Liu, Lu and Ying, Lu and Shan, Caihua and Shen, Yifei and Dong, Xuanyi and Wang, Yun and Li, Dongsheng and Zhao, Cairong},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
volume={},
number={},
pages={},
year={2024}}

If you find this repository helpful, please consider giving the project a 🌟 star and sharing it with your community!

Related Dataset

Benchmark for VG-based Detection and Chart Understanding

yolat-vectorgraphicsrecognition's People

Contributors

dianezzy, microsoft-github-operations[bot], microsoftopensource, shuguang-52, xinyangj


yolat-vectorgraphicsrecognition's Issues

About how to use the training result to predict on a new SVG file

Dr. Jiang,
Sorry to bother you. I am a novice and am not very familiar with how to use the trained model for new predictions. Could you please explain how to do that? I followed the steps on GitHub to train the model and obtained the file "run182_2_best.pth". How can I use this training result to predict on new files? Thank you.

About test.py

Dr. Jiang,
Sorry to bother you.
I ran the command "CUDA_VISIBLE_DEVICES=0 python -u cad_recognition/test.py --data_dir data/FloorPlansGraph5_iter --pretrained_model log/run182_2_best.pth" with the code related to "opt.arch" and "opt.graph" commented out.
Both before and after that change, I still got these errors:
"size mismatch for cls_net.fusion_block.0.weight: copying a param with shape torch.Size([1024, 128]) from checkpoint, the shape in current model is torch.Size([1024, 448]).
size mismatch for cls_net.fusion_block_super.0.weight: copying a param with shape torch.Size([1024, 128]) from checkpoint, the shape in current model is torch.Size([1024, 448]).
size mismatch for prediction_cls.0.0.weight: copying a param with shape torch.Size([512, 2304]) from checkpoint, the shape in current model is torch.Size([512, 2944])."
This is really confusing, since the model was saved by "def save_checkpoint()" but does not match when it is loaded.
Could you help resolve this issue?
Thanks a lot. I look forward to your response.

Best regards,
VivianBB.
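A generic way to narrow down an error like this is to compare parameter shapes in the checkpoint with those in the freshly built model. The sketch below does that on plain dictionaries filled with the shapes quoted in the error message. The helper `shape_mismatches` is hypothetical. With PyTorch, such dictionaries can be built from a state dict as `{k: tuple(v.shape) for k, v in state_dict.items()}`. In all three reported parameters the input width differs, which may indicate that the model was constructed at test time with different architecture options than were used for training. The source does not confirm the cause.

```python
def shape_mismatches(checkpoint_shapes, model_shapes):
    """List (name, checkpoint_shape, model_shape) for parameters whose shapes differ."""
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches.append((name, tuple(ckpt_shape), tuple(model_shape)))
    return mismatches


# Shapes copied from the error message above.
checkpoint_shapes = {
    "cls_net.fusion_block.0.weight": (1024, 128),
    "cls_net.fusion_block_super.0.weight": (1024, 128),
    "prediction_cls.0.0.weight": (512, 2304),
}
model_shapes = {
    "cls_net.fusion_block.0.weight": (1024, 448),
    "cls_net.fusion_block_super.0.weight": (1024, 448),
    "prediction_cls.0.0.weight": (512, 2944),
}

found = shape_mismatches(checkpoint_shapes, model_shapes)
for name, ckpt, model in found:
    # The second dimension is the layer's input width.
    print(f"{name}: checkpoint {ckpt}, model {model}, width gap {model[1] - ckpt[1]}")
```

A first thing to check would be whether the test run received the same architecture flags as the training command, such as `--arch`, `--graph`, `--n_blocks`, and `--n_blocks_out`.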
