tchuanm / sftranst

The implementation of "Learning Spatial-Frequency Transformer for Visual Object Tracking"

License: GNU General Public License v3.0

Python 85.84% Shell 0.08% C++ 0.80% Cuda 8.01% Cython 1.16% C 4.11%

sftranst's Introduction

SFTransT: Learning Spatial-Frequency Transformer for Visual Object Tracking

The official implementation of SFTransT (arXiv, IEEE T-CSVT).

Framework

TL;DR

SFTransT follows the Siamese matching framework and takes a template frame and a search frame as input. Swin-Tiny serves as the backbone, and its cross-scale features are fused into the embedded features. A Multi-Head Cross-Attention (MHCA) module then strengthens the interaction between the template and search features. Its output is fed into the core component, the Spatial-Frequency Transformer, which models the Gaussian spatial prior and the low- and high-frequency feature information simultaneously. In more detail, the GGN predicts a Gaussian spatial attention map that is added to the self-attention matrix, and the GPHA then decomposes the result into low-pass and high-pass branches to achieve all-pass information propagation. Finally, the enhanced features are fed into the classification and regression heads for target object tracking.
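The attention step described above can be illustrated with a minimal single-head sketch in plain Python. This is not the repository's code. The fixed Gaussian (`mu`, `sigma`), the single prior shared by all queries, the mean/residual split, and the mixing weights `alpha` and `beta` are all assumptions made for illustration; in SFTransT the prior is predicted by the GGN, and the exact GPHA formulation should be taken from the paper.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def gaussian_prior(n, mu, sigma):
    """Gaussian bump over n token positions (fixed here; an assumption)."""
    return [math.exp(-((j - mu) ** 2) / (2.0 * sigma ** 2)) for j in range(n)]

def attention_with_prior(q, k, prior):
    """Scaled dot-product scores plus an additive prior, then row-wise softmax."""
    scale = math.sqrt(len(q[0]))
    rows = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / scale for kj in k]
        rows.append(softmax([s + p for s, p in zip(scores, prior)]))
    return rows

def split_low_high(attn):
    """Assumed decomposition: low-pass = per-row mean, high-pass = residual."""
    low = [[sum(row) / len(row)] * len(row) for row in attn]
    high = [[a - l for a, l in zip(row, lrow)] for row, lrow in zip(attn, low)]
    return low, high

def recombine(low, high, alpha=1.0, beta=1.0):
    """Weighted sum of both branches; alpha = beta = 1 restores the input."""
    return [[alpha * l + beta * h for l, h in zip(lrow, hrow)]
            for lrow, hrow in zip(low, high)]

# Toy example: three tokens with 2-D features, prior centred on token 1.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attn = attention_with_prior(tokens, tokens, gaussian_prior(3, mu=1.0, sigma=1.0))
low, high = split_low_high(attn)
emphasised = recombine(low, high, alpha=1.0, beta=2.0)
```

In this sketch, choosing `beta` larger than `alpha` amplifies the high-frequency residual while every attention row still sums to one, and equal weights reproduce the original attention matrix.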

Tracker    GOT-10K (AO)  LaSOT (AUC)  TrackingNet (AUC)  UAV123 (AUC)  LaSOT-ext (AUC)  TNL2k (AUC)  WebUAV-3M
SFTransT   72.7          69.0         82.9               71.3          46.4             54.6         58.2
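For readers unfamiliar with the metrics in the table header, the following sketch shows how overlap-based scores are commonly computed: AO as the mean IoU over frames, and success AUC as the mean success rate over overlap thresholds from 0 to 1. It is an illustration only. The function names are hypothetical, and the official benchmark toolkits differ in details such as per-sequence averaging and the handling of frames where the target is absent.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def average_overlap(pred, gt):
    """AO: mean IoU between predicted and ground-truth boxes."""
    overlaps = [iou(p, g) for p, g in zip(pred, gt)]
    return sum(overlaps) / len(overlaps)

def success_auc(pred, gt, num_thresholds=21):
    """Mean success rate over evenly spaced overlap thresholds in [0, 1]."""
    overlaps = [iou(p, g) for p, g in zip(pred, gt)]
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)

# Toy sequence of two frames: one perfect prediction, one shifted by half a box.
gt_boxes = [(0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0)]
pred_boxes = [(0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 2.0, 2.0)]
ao = average_overlap(pred_boxes, gt_boxes)
auc = success_auc(pred_boxes, gt_boxes)
```

The table reports these scores as percentages, so a value computed this way would be multiplied by 100.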

Installation

  1. Create and activate a conda environment.
conda create -n sftranst python=3.7
conda activate sftranst
  2. Install the necessary packages. Install them line by line to make sure each step succeeds.
conda install -c pytorch pytorch=1.5 torchvision=0.6.1 cudatoolkit=10.2
conda install matplotlib pandas tqdm
pip install opencv-python tb-nightly visdom scikit-image tikzplotlib gdown
conda install cython scipy
sudo apt-get install libturbojpeg
pip install pycocotools jpeg4py
pip install wget yacs
pip install shapely==1.6.4.post2
pip install timm
pip install einops
  3. Add soft links to the datasets under './dataset/'.
     |--dataset
        |--got10k
        |--lasot
        |--trackingnet
        |--.......
  4. Set up the environment.
# Environment settings for ltr are saved at ltr/admin/local.py
cd SFTransT
python -c "from ltr.admin.environment import create_default_local_file; create_default_local_file()"
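The dataset soft links from the step above can also be created from Python. The helper below is a sketch under assumptions: `link_datasets` is a hypothetical name, and the source paths in the commented example are placeholders, since the README does not say where the datasets are stored.

```python
import os
from pathlib import Path

def link_datasets(sources, dataset_dir="./dataset"):
    """Create dataset_dir/<name> symlinks pointing at existing source folders.

    sources maps a link name (e.g. "got10k") to the real dataset location.
    Missing sources and already existing links are skipped, not overwritten.
    """
    root = Path(dataset_dir)
    root.mkdir(parents=True, exist_ok=True)
    created = []
    for name, src in sources.items():
        src = Path(src).expanduser().resolve()
        link = root / name
        if not src.is_dir():
            print(f"skip {name}: {src} is not a directory")
            continue
        if link.exists() or link.is_symlink():
            print(f"skip {name}: {link} already exists")
            continue
        os.symlink(src, link, target_is_directory=True)
        created.append(link)
    return created

# Example call with placeholder paths (replace with your own locations):
# link_datasets({"got10k": "/data/GOT-10k", "lasot": "/data/LaSOT"})
```

Skipping existing entries keeps the helper safe to run repeatedly, for example after adding another dataset.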

Training

  1. Download the pretrained Swin-Tiny model and put it into ltr/models/backbone/.

  2. Run the training command:

cd SFTransT/ltr
conda activate sftranst
python run_training.py --train_module sftranst --train_name sftranst_cfa_gpha_mlp
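Before launching a long training run, it can help to confirm that a backbone checkpoint is actually in place. The sketch below is an optional pre-flight check, not part of the repository. The helper names are hypothetical, and the `*.pth` and `*.pth.tar` patterns are an assumption, because the README does not name the checkpoint file. Only the two flags shown in the command above are used.

```python
import sys
from pathlib import Path

def find_checkpoints(backbone_dir, patterns=("*.pth", "*.pth.tar")):
    """Return checkpoint-like files in backbone_dir (extensions are assumed)."""
    folder = Path(backbone_dir)
    if not folder.is_dir():
        return []
    return sorted(p for pattern in patterns for p in folder.glob(pattern))

def training_command(train_module="sftranst", train_name="sftranst_cfa_gpha_mlp"):
    """Build the argument list for run_training.py with the README's flags."""
    return [sys.executable, "run_training.py",
            "--train_module", train_module,
            "--train_name", train_name]

# Example, run from the directory that contains SFTransT:
# import subprocess
# if find_checkpoints("SFTransT/ltr/models/backbone"):
#     subprocess.run(training_command(), cwd="SFTransT/ltr", check=True)
# else:
#     print("No pretrained backbone found in ltr/models/backbone/")
```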

Test and Eval

  1. For UAV, OTB, and GOT10k:
cd SFTransT/pysot_toolkit
conda activate sftranst
python eval_global.py --cuda 0 --begin 99 --end 100 --interval 1 --folds sftranst_cfa_gpha_mlp --subset test
  2. For other datasets, such as LaSOT:
python test_global.py --dataset LaSOT --cuda 5 --epoch 300 --win 0.50
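The README does not document the `--win` flag. In TransT-style and other Siamese trackers, a parameter of this kind usually weights a cosine (Hanning) window that is blended into the score map to penalise large jumps away from the previous target position. Assuming that is what `--win` controls here, which has not been verified against this repository, the effect can be sketched as follows. All function names are hypothetical.

```python
import math

def hanning(n):
    """1-D Hanning window with n samples (0 at the ends, 1 at the centre)."""
    if n == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def hanning_2d(size):
    """Outer product of two 1-D windows, peaking at the map centre."""
    w = hanning(size)
    return [[a * b for b in w] for a in w]

def apply_window(scores, win):
    """Blend a square score map with the window: (1 - win) * score + win * window."""
    window = hanning_2d(len(scores))
    return [[(1.0 - win) * s + win * h for s, h in zip(srow, hrow)]
            for srow, hrow in zip(scores, window)]

def argmax_2d(grid):
    """Row and column of the largest entry."""
    return max(((r, c) for r in range(len(grid)) for c in range(len(grid[0]))),
               key=lambda rc: grid[rc[0]][rc[1]])

# Toy 5x5 score map: a distractor in the corner scores slightly higher
# than the true target at the centre.
scores = [[0.0] * 5 for _ in range(5)]
scores[0][0] = 0.9
scores[2][2] = 0.8
raw_peak = argmax_2d(apply_window(scores, win=0.0))
windowed_peak = argmax_2d(apply_window(scores, win=0.5))
```

Under this assumption, a larger weight favours locations near the centre of the search region, so the best value may differ between datasets with slow and fast target motion.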

Acknowledgement

This code combines the Python tracking framework PyTracking with PySOT-Toolkit.
Thanks to TransT, which first introduced the Transformer into visual tracking.

Citation

@ARTICLE{tang2022learning,
  author={Tang, Chuanming and Wang, Xiao and Bai, Yuanchao and Wu, Zhe and Zhang, Jianlin and Huang, Yongmei},
  journal={IEEE Transactions on Circuits and Systems for Video Technology},
  title={Learning Spatial-Frequency Transformer for Visual Object Tracking},
  year={2023},
  doi={10.1109/TCSVT.2023.3249468}
}

or

@article{tang2022learning,
  title={Learning Spatial-Frequency Transformer for Visual Object Tracking},
  author={Tang, Chuanming and Wang, Xiao and Bai, Yuanchao and Wu, Zhe and Zhang, Jianlin and Huang, Yongmei},
  journal={arXiv preprint arXiv:2208.08829},
  year={2022}
}

sftranst's Issues

Visualization

Hello, how was Figure 8 visualized?

About your baseline

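Hello, is the baseline from the ablation study in Table 8 of your paper included in the code? It looks quite concise. I am a beginner and would like a simple baseline to learn from. Would you also be willing to share the approximate training time, the training settings, and the hardware used for the baseline? Thank you very much. I look forward to your reply!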

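Comparison experiment table

image
Hello, how do you make this kind of vertical line in the table stop short of the horizontal rules above and below it? I have tried many approaches, but I only found ways to break the horizontal lines; none of them work for vertical lines.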

swin transformer pretrain model

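Hello, I would like to ask you a few questions.
1. I noticed that you skip the att_mask parameters when loading the pretrained weights. Does this give better results, or is there another reason?
2. I could not find a 256 pretrained model for the v1 version of Swin Transformer on the official site. Did you train the one you provide yourself?
Thank you!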

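Test parameters

Hello, I want to test your code and found two problems:
1. pysot_toolkit does not seem to contain the file eval_got10k_global.py.
2. I am not clear about the parameters in test_global.py, such as win. Are they the same for every dataset, or where can I look them up?
Thank you very much for your reply!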
