tchuanm / avitmp

The official implementation of "Exploiting Image-Related Inductive Biases in Single-Branch Visual Tracking".

License: GNU General Public License v3.0

Python 99.89% MATLAB 0.11%

AViTMP (IEEE Transactions on Intelligent Vehicles)

TL;DR

To tackle the inferior effectiveness of the vanilla ViT in tracking, we propose an Adaptive ViT Model Prediction tracker (AViTMP) to bridge the gap between single-branch networks and discriminative models. Specifically, in the proposed encoder AViT-Enc, we introduce an adaptor module and a joint target-state embedding to enrich the ViT-based dense-embedding paradigm. We then combine AViT-Enc with a dense-fusion decoder and a discriminative target model to predict accurate target locations. Further, to mitigate the limitations of conventional inference practice, we present a novel inference pipeline called CycleTrack, which bolsters tracking robustness in the presence of distractors via bidirectional cycle-tracking verification. Lastly, we propose a dual-frame update inference strategy that adaptively handles significant challenges in long-term scenarios.
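The bidirectional verification behind CycleTrack can be sketched as follows. This is a minimal illustration only: `track_fn`, the candidate list, and the IoU threshold `tau` are assumptions for the sketch, not the repository's actual API.

```python
# Sketch of bidirectional cycle-tracking verification: a forward prediction
# is accepted only if tracking *backward* from it lands near the previous
# frame's known target box; otherwise the next candidate is tried.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def cycle_track(track_fn, prev_frame, prev_box, cur_frame, candidates, tau=0.5):
    """Return the first candidate whose backward track agrees with prev_box.

    track_fn(frame_from, box_from, frame_to) -> predicted box in frame_to.
    candidates: forward predictions, ordered by confidence.
    """
    for cand in candidates:
        back = track_fn(cur_frame, cand, prev_frame)
        if iou(back, prev_box) >= tau:  # bidirectional consistency check
            return cand
    return candidates[0]  # no candidate verified: fall back to top prediction
```

A distractor that looks like the target in the current frame typically tracks backward to the distractor's own location, fails the consistency check, and is rejected.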

Framework

| Dataset | LaSOTExt | LaSOT | AVisT | VOT2020 | UAV123 | TNL2k | TrackingNet | OTB100 | NFS30 |
| ------- | -------- | ----- | ----- | ------- | ------ | ----- | ----------- | ------ | ----- |
| AUC     | 50.2     | 70.7  | 54.9  | 31.4    | 70.1   | 54.5  | 82.8        | 70.3   | 66.3  |

Setup Environment

conda create --name pytracking --file requirements.txt
source activate pytracking

Or refer to the pytracking installation guide.

python -c "from pytracking.evaluation.environment import create_default_local_file; create_default_local_file()"
python -c "from ltr.admin.environment import create_default_local_file; create_default_local_file()"
cd ltr

Data Preparation

  1. Softlink the datasets into './data', e.g.:
ln -s .../lasot  .../AViTMP/data/
# datasets folder 
     |--data
        |--avist
        |--coco
        |--got10k
        |--lasot
        |--uav
        |--lasot
        |--trackingnet
        |--.......
  2. Download the pretrained ViT-B model into './pretrained_model'.
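The softlink step above can also be done from Python, which is convenient when preparing several dataset roots at once. This helper is illustrative and not part of the repository; the paths are assumptions.

```python
# Create data/<name> -> <dataset root> symlinks, mirroring the `ln -s` step.
from pathlib import Path

def link_datasets(dataset_roots, data_dir="data"):
    """Symlink each dataset root into data_dir, skipping existing links."""
    data = Path(data_dir)
    data.mkdir(exist_ok=True)
    for root in map(Path, dataset_roots):
        link = data / root.name          # e.g. data/lasot
        if not link.exists():
            link.symlink_to(root.resolve())
        yield link
```

Running it once per machine keeps the repository checkout free of dataset copies while the training code still sees everything under './data'.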

Training

cd ltr
CUDA_VISIBLE_DEVICES=5,6,7,4  python run_training.py
# To save the training logs to a txt file:
nohup python -u run_training.py  >trainlog.txt 2>&1 &

Inference

  1. Copy a specific epoch checkpoint and cut unnecessary parameters into './pytracking/networks/':
python script_cut_pth.py --epoch 300
  2. Evaluate on one-pass evaluation (OPE) datasets. All results are saved in './tracking_results':
# lasot, lasot_extension_subset, avist, uav, trackingnet, nfs, otb, tnl2k.
python run_tracker.py --dataset_name uav   --threads 4 --num_gpus 2
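The checkpoint-cutting step above usually amounts to keeping only the fields needed at inference time and dropping optimizer/training state so the deployed file is small. A conceptual sketch follows; the key names are assumptions for illustration, not the actual layout used by script_cut_pth.py.

```python
# Keep only inference-time fields of a checkpoint dict; drop training state.
def cut_checkpoint(ckpt: dict) -> dict:
    keep = {"net", "net_type", "epoch"}  # assumed inference-time fields
    return {k: v for k, v in ckpt.items() if k in keep}
```

The reduced dict would then be re-serialized in place of the full training checkpoint.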

3. Evaluate on VOT datasets

(Configuration may be a little tricky; follow the VOT-toolkit guidance and the VOT Google group.)

  1. VOT2020 bbox
# 1. Environment setup for VOT2020:
vot-toolkit==0.5.3
vot-trax==3.0.2

# 2. Evaluation commands:
vot initialize  AViTMP  -workspace vot2020
vot evaluate --workspace vot2020 AViTMP
vot analysis --workspace vot2020 AViTMP 

Notice: If you want to evaluate on VOT2020_mask or VOTS2023 (multi-object tracking & segmentation), you should:

  1. Install the segmentation model SAM-HQ in the same "pytracking" environment.
  2. Combine VOT with the segmentation method to build a two-stage tracking & segmentation method. Set args.mask to True at line 23 of './vot2020/run_vot.py'.
# Note: environment setup for VOTS2023
vot-toolkit==0.6.4
vot-trax==4.0.1

# Run multiple times to generate multi-target masks.
vot initialize  AViTMP  -workspace vots2023
vot evaluate --workspace vots2023 AViTMP
vot analysis --workspace vots2023 AViTMP 

Acknowledgement

Thanks to the tracking frameworks PyTracking and OSTrack, which helped us quickly implement our ideas.

Citation

If our work is useful for your research, please consider citing:

@article{tang2024,
  author={Tang, Chuanming and Wang, Kai and Weijer, Joost van de and Zhang, Jianlin and Huang, Yongmei},
  journal={IEEE Transactions on Intelligent Vehicles}, 
  title={AViTMP: A Tracking-Specific Transformer for Single-Branch Visual Tracking}, 
  year={2024},
  pages={1-14},
  doi={10.1109/TIV.2024.3422806}}
