This project forked from visinf/veto

Vision Relation Transformer for Unbiased Scene Graph Generation (ICCV 2023)

License: Apache License 2.0

This is the official repository for the paper "Vision Relation Transformer for Unbiased Scene Graph Generation".

Installation

Check INSTALL.md for installation instructions.

Dataset

Check DATASET.md for instructions on dataset preprocessing.

Pretrained Models

For the VG dataset, we use the pretrained object detector provided by Scene-Graph-Benchmark; you can download it from this link. For the GQA dataset, we use the pretrained object detector provided by SHA-GCL-for-SGG, which can be downloaded from this link.

Set the parameter MODEL.PRETRAINED_DETECTOR_CKPT in configs/VETO_final.yaml to the path of the corresponding pretrained R-CNN weights so that the detector weights are loaded correctly.
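
A wrong checkpoint path is easy to miss until training starts, so it can help to validate it first. The following is a minimal sketch, not part of the repository: the helper name `detector_ckpt_override` is hypothetical, and it assumes that MODEL.PRETRAINED_DETECTOR_CKPT can be overridden on the command line in the same KEY VALUE style as the other options in the training commands of this README. If that assumption does not hold, edit the yaml file directly.

```python
from pathlib import Path


def detector_ckpt_override(ckpt_path):
    """Return KEY VALUE tokens that point the detector checkpoint at ckpt_path.

    Hypothetical helper: raises FileNotFoundError early so that a wrong
    path is caught before a training run is launched.
    """
    path = Path(ckpt_path).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"Detector checkpoint not found: {path}")
    return ["MODEL.PRETRAINED_DETECTOR_CKPT", str(path)]
```

The returned tokens can be appended to a training command, or the check can be used on its own to confirm the path before editing configs/VETO_final.yaml.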

Scene Graph Generation Model

Follow the instructions below to train your own models; each SGG model is trained on 1 GPU. The results should be very close to those reported in the paper.

The following script trains vanilla VETO for PredCls. For SGCls, set MODEL.ROI_RELATION_HEAD.USE_GT_BOX True and MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL False; for SGDet, set MODEL.ROI_RELATION_HEAD.USE_GT_BOX False and MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL False.

       python ./tools/relation_train_net.py \
       --config-file "configs/VETO_final.yaml" \
       MODEL.ROI_RELATION_HEAD.PREDICTOR VETOPredictor \
       GLOBAL_SETTING.DATASET_CHOICE 'VG' \
       MODEL.ROI_RELATION_HEAD.USE_GT_BOX True \
       MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL True \
       GLOBAL_SETTING.BETA_LOSS False \
       SOLVER.IMS_PER_BATCH 12 TEST.IMS_PER_BATCH 1 \
       SOLVER.MAX_ITER 125000 SOLVER.VAL_PERIOD 5000 \
       SOLVER.CHECKPOINT_PERIOD 5000 DEBUG False \
       SOLVER.PRE_VAL False ENSEMBLE_LEARNING.ENABLED False \
       EXPERIMENT_NAME "VG_VETO_vanilla"
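
The three task settings described above differ only in two boolean flags. As a rough illustration, the mapping can be written down as a small lookup table; the names `TASK_FLAGS` and `task_overrides` are hypothetical and not part of the repository.

```python
# (USE_GT_BOX, USE_GT_OBJECT_LABEL) for each task, as listed in the text above.
TASK_FLAGS = {
    "predcls": (True, True),
    "sgcls": (True, False),
    "sgdet": (False, False),
}


def task_overrides(task):
    """Return the KEY VALUE override tokens for a task name (case-insensitive)."""
    use_gt_box, use_gt_label = TASK_FLAGS[task.lower()]
    return [
        "MODEL.ROI_RELATION_HEAD.USE_GT_BOX", str(use_gt_box),
        "MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL", str(use_gt_label),
    ]


if __name__ == "__main__":
    # Print the two overrides to substitute into the training command per task.
    for task in ("PredCls", "SGCls", "SGDet"):
        print(task, " ".join(task_overrides(task)))
```

Substituting the printed pair for the two USE_GT_* options in a training command switches the task while leaving all other options unchanged.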

The following script trains VETO + Rwt for PredCls:

       python ./tools/relation_train_net.py \
       --config-file "configs/VETO_final.yaml" \
       MODEL.ROI_RELATION_HEAD.PREDICTOR VETOPredictor \
       GLOBAL_SETTING.DATASET_CHOICE 'VG' \
       MODEL.ROI_RELATION_HEAD.USE_GT_BOX True \
       MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL True \
       GLOBAL_SETTING.BETA_LOSS True \
       SOLVER.IMS_PER_BATCH 12 TEST.IMS_PER_BATCH 1 \
       SOLVER.MAX_ITER 125000 SOLVER.VAL_PERIOD 5000 \
       SOLVER.CHECKPOINT_PERIOD 5000 DEBUG False \
       SOLVER.PRE_VAL False ENSEMBLE_LEARNING.ENABLED False \
       EXPERIMENT_NAME "VG_VETO_beta"

The following script trains VETO + MEET for PredCls:

       python ./tools/relation_train_net.py \
       --config-file "configs/VETO_final.yaml" \
       MODEL.ROI_RELATION_HEAD.PREDICTOR VETOPredictor_MEET \
       GLOBAL_SETTING.DATASET_CHOICE 'VG' \
       MODEL.ROI_RELATION_HEAD.USE_GT_BOX True \
       MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL True \
       GLOBAL_SETTING.BETA_LOSS True \
       SOLVER.IMS_PER_BATCH 12 TEST.IMS_PER_BATCH 1 \
       SOLVER.MAX_ITER 125000 SOLVER.VAL_PERIOD 5000 \
       SOLVER.CHECKPOINT_PERIOD 5000 DEBUG False \
       SOLVER.PRE_VAL False ENSEMBLE_LEARNING.ENABLED True \
       EXPERIMENT_NAME "VG_VETO_MEET"
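
The three training commands above share most of their options and differ only in the predictor, GLOBAL_SETTING.BETA_LOSS, ENSEMBLE_LEARNING.ENABLED, and the experiment name. The sketch below makes that difference explicit by assembling each command from a shared part and a per-variant part. The names `COMMON`, `VARIANTS`, and `build_command` are hypothetical, and the sketch assumes that the order of the KEY VALUE overrides does not matter.

```python
import shlex

# Options that are identical in all three training commands above.
COMMON = [
    "python", "./tools/relation_train_net.py",
    "--config-file", "configs/VETO_final.yaml",
    "GLOBAL_SETTING.DATASET_CHOICE", "VG",
    "MODEL.ROI_RELATION_HEAD.USE_GT_BOX", "True",
    "MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL", "True",
    "SOLVER.IMS_PER_BATCH", "12", "TEST.IMS_PER_BATCH", "1",
    "SOLVER.MAX_ITER", "125000", "SOLVER.VAL_PERIOD", "5000",
    "SOLVER.CHECKPOINT_PERIOD", "5000", "DEBUG", "False",
    "SOLVER.PRE_VAL", "False",
]

# Options that change between the variants, copied from the commands above.
VARIANTS = {
    "vanilla": {
        "MODEL.ROI_RELATION_HEAD.PREDICTOR": "VETOPredictor",
        "GLOBAL_SETTING.BETA_LOSS": "False",
        "ENSEMBLE_LEARNING.ENABLED": "False",
        "EXPERIMENT_NAME": "VG_VETO_vanilla",
    },
    "rwt": {
        "MODEL.ROI_RELATION_HEAD.PREDICTOR": "VETOPredictor",
        "GLOBAL_SETTING.BETA_LOSS": "True",
        "ENSEMBLE_LEARNING.ENABLED": "False",
        "EXPERIMENT_NAME": "VG_VETO_beta",
    },
    "meet": {
        "MODEL.ROI_RELATION_HEAD.PREDICTOR": "VETOPredictor_MEET",
        "GLOBAL_SETTING.BETA_LOSS": "True",
        "ENSEMBLE_LEARNING.ENABLED": "True",
        "EXPERIMENT_NAME": "VG_VETO_MEET",
    },
}


def build_command(variant):
    """Return the full PredCls training command for a variant as a token list."""
    tokens = list(COMMON)
    for key, value in VARIANTS[variant].items():
        tokens += [key, value]
    return tokens


if __name__ == "__main__":
    # Print one shell-quoted command line per variant for inspection.
    for name in VARIANTS:
        print(shlex.join(build_command(name)))
```

Printing the commands rather than executing them keeps the sketch free of side effects; a printed line can be pasted into a shell once it has been checked against the commands above.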

Test

To evaluate a trained model on the validation or test set, set MODEL.WEIGHT to the path of the trained model weights and DATASETS.TEST to the name of the selected dataset.
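
As a minimal sketch, the two evaluation settings can be expressed as KEY VALUE overrides in the same style as the training commands. The helper name `eval_overrides` is hypothetical, and so are the example path and dataset placeholder in the usage note. This README does not specify the exact value format that DATASETS.TEST expects, so the value is passed through unchanged.

```python
def eval_overrides(weight_path, test_dataset):
    """Return KEY VALUE override tokens for evaluating a trained model.

    Hypothetical helper: weight_path is the trained checkpoint and
    test_dataset is the dataset name, both passed through unchanged.
    """
    return [
        "MODEL.WEIGHT", str(weight_path),
        "DATASETS.TEST", str(test_dataset),
    ]
```

For example, `eval_overrides("path/to/trained_model.pth", "<test_dataset_name>")` yields the four tokens to append to an evaluation command, with both arguments replaced by real values.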

Cite

If you find our work useful in your research, please consider citing:

       @InProceedings{Sudhakaran_2023_ICCV,
           author    = {Sudhakaran, Gopika and Dhami, Devendra Singh and Kersting, Kristian and Roth, Stefan},
           title     = {Vision Relation Transformer for Unbiased Scene Graph Generation},
           booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
           month     = {October},
           year      = {2023},
           pages     = {21882-21893}
       }

Acknowledgment

This repository is developed on top of the following code bases:

  1. Scene graph benchmarking framework developed by KaihuaTang
  2. A Toolkit for Scene Graph Benchmark in Pytorch by Rongjie Li
  3. Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation in Pytorch by Xingning Dong
