
This project forked from realgump/mvmhat


MvMHAT: Self-supervised Multi-view Multi-Human Association and Tracking (ACM MM 2021 Oral; improved version submitted to TPAMI)


MvMHAT: Multi-view Multi-Human Association and Tracking

[New 2024] We have extended this work into a journal version (submitted to TPAMI) in the following aspects.

New journal paper (MvMHAT++):

Unveiling the Power of Self-supervision for Multi-view Multi-human Association and Tracking

  • First, we add a new spatial-temporal assignment matrix learning module, which shares the self-consistency rationale with the appearance feature learning module (from the previous conference paper); together they form a fully self-supervised end-to-end framework.
  • Second, we introduce a new pseudo-label generation strategy with dummy nodes, which handles more general MvMHAT cases.
  • Third, we include a new dataset MMP-MvMHAT and significantly extend the experimental comparisons and analyses.
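This README does not describe the pseudo-label strategy itself. Purely to illustrate the general idea of a dummy node in an assignment problem, the sketch below lets a person in one view be matched to "nobody" when no candidate in the other view is similar enough. The function name `pseudo_labels_with_dummy`, the `dummy_score` threshold, and the greedy matching are all assumptions made for this example, not the method used in the paper.

```python
def pseudo_labels_with_dummy(sim, dummy_score=0.5):
    """Greedy one-to-one matching where a row may fall back to a dummy node.

    sim[i][j] is a similarity between person i in view A and person j in view B.
    Returns a list whose entry i is the matched column index, or None when
    row i is assigned to the dummy node (hypothetical rule: no pair above
    `dummy_score` is left for that row).
    """
    # All candidate pairs, best first (a greedy stand-in for optimal matching).
    pairs = sorted(
        ((sim[i][j], i, j) for i in range(len(sim)) for j in range(len(sim[i]))),
        reverse=True,
    )
    labels = [None] * len(sim)
    used_rows, used_cols = set(), set()
    for score, i, j in pairs:
        if score <= dummy_score:
            break  # every remaining pair is no better than the dummy node
        if i in used_rows or j in used_cols:
            continue  # keep the assignment one-to-one
        labels[i] = j
        used_rows.add(i)
        used_cols.add(j)
    return labels


# Three people in view A, two in view B: the third has no counterpart.
example_sim = [[0.9, 0.1], [0.2, 0.8], [0.3, 0.4]]
example_labels = pseudo_labels_with_dummy(example_sim)
```

Here the third row keeps the label `None`, which plays the role of the dummy node; without it, that person would be forced onto an already weak match.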

Previous conference paper (MvMHAT):

Self-supervised Multi-view Multi-Human Association and Tracking (ACM MM 2021),
Yiyang Gan, Ruize Han, Liqiang Yin, Wei Feng, Song Wang

  • A self-supervised learning framework for MvMHAT.
  • A new benchmark for training and testing MvMHAT.

Abstract

Multi-view multi-human association and tracking (MvMHAT) is a relatively new but important problem for video surveillance of multi-person scenes. It aims to track a group of people over time in each view and, at the same time, to identify the same person across different views, which differs from previous MOT and multi-camera MOT tasks that consider only over-time human tracking. As a result, videos for MvMHAT require more complex annotations while containing more information for self-learning. In this work, we tackle this problem with a self-supervised, end-to-end network. Specifically, we take advantage of the spatial-temporal self-consistency rationale by considering three properties: reflexivity, symmetry, and transitivity. Besides the reflexivity property, which holds naturally, we design self-supervised learning losses based on symmetry and transitivity, for both appearance feature learning and assignment matrix optimization, to associate multiple humans over time and across views. Furthermore, to promote research on MvMHAT, we build two new large-scale benchmarks for training and testing different algorithms. Extensive experiments on the proposed benchmarks verify the effectiveness of our method.
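The symmetry and transitivity properties can be made concrete with a small sketch. Assuming each view yields one feature vector per person, a soft assignment matrix between two views can be obtained by a row-wise softmax over feature similarities; symmetry then asks that going A→B→A returns to the identity, and transitivity asks the same of A→B→C→A. The helper names, the temperature value, and the squared-error penalty are assumptions for this illustration and are not taken from the released code.

```python
import math


def softmax_rows(matrix, temperature=0.1):
    """Row-wise softmax that turns similarities into soft assignments."""
    result = []
    for row in matrix:
        peak = max(row)
        exps = [math.exp((v - peak) / temperature) for v in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def similarity(feats_a, feats_b):
    """Dot-product similarity between two lists of feature vectors."""
    return [[sum(x * y for x, y in zip(fa, fb)) for fb in feats_b] for fa in feats_a]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def identity_gap(matrix):
    """Mean squared deviation of a square matrix from the identity."""
    n = len(matrix)
    return sum(
        (matrix[i][j] - (1.0 if i == j else 0.0)) ** 2
        for i in range(n) for j in range(n)
    ) / (n * n)


def symmetry_loss(feats_a, feats_b):
    """Penalize A -> B -> A not returning each person to themselves."""
    s_ab = softmax_rows(similarity(feats_a, feats_b))
    s_ba = softmax_rows(similarity(feats_b, feats_a))
    return identity_gap(matmul(s_ab, s_ba))


def transitivity_loss(feats_a, feats_b, feats_c):
    """Penalize A -> B -> C -> A not returning each person to themselves."""
    s_ab = softmax_rows(similarity(feats_a, feats_b))
    s_bc = softmax_rows(similarity(feats_b, feats_c))
    s_ca = softmax_rows(similarity(feats_c, feats_a))
    return identity_gap(matmul(matmul(s_ab, s_bc), s_ca))


# Toy features: the same three people seen in three views, in different orders.
view_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
view_b = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
view_c = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(symmetry_loss(view_a, view_b), transitivity_loss(view_a, view_b, view_c))
```

With well-separated features both losses are close to zero. If the features of one view are indistinguishable, the soft assignments become uniform and both losses grow, which is the signal a self-supervised objective of this kind relies on. No identity labels are used at any point.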

Dataset (MvMHAT)

Baidu Drive

Part 1 (from Self-collected)
Link: https://pan.baidu.com/s/1gsYTHffmfRq84Hn-8XtzDQ
Password: 2cfh

Part 2 (from Campus)
Link: https://pan.baidu.com/s/1Ts6xnESH-9UV8goiTrSuwQ 
Password: 8sg9

Part 3 (from EPFL) 
Link: https://pan.baidu.com/s/1G84npt61rYDUEPqnaHJUlg 
Password: jjaw 

OneDrive

Complete Dataset
Link: https://tjueducn-my.sharepoint.com/:f:/g/personal/han_ruize_tju_edu_cn/EuYKZsvYBvFBvewQPdjvRIoB20iQfMNr_c7_fMDXFRZ7uw?e=19rwJF
Password: MvMHAT

Dataset (MMP-MvMHAT)

Evaluation

The evaluation code and the raw results of the proposed method are provided in 'Eval_MvMHAT_public.zip'.

Install (to be completed)

The code was tested on Ubuntu 16.04 with Anaconda Python 3.6 and PyTorch v1.7.1. NVIDIA GPUs are needed for both training and testing. After installing Anaconda:

  1. [Optional but recommended] Create a new conda environment:
   conda create -n MVMHAT python=3.6

And activate the environment:

   conda activate MVMHAT
  2. Install PyTorch:
   conda install pytorch=1.7.1 torchvision -c pytorch
  3. Clone the repository:
   MVMHAT_ROOT=/path/to/clone/MVMHAT
   git clone https://github.com/realgump/MvMHAT.git $MVMHAT_ROOT
  4. Install the requirements:
   cd $MVMHAT_ROOT
   pip install -r requirements.txt
  5. Download the pretrained ResNet-50 model to promote convergence:
   cd $MVMHAT_ROOT/models
   wget https://download.pytorch.org/models/resnet50-19c8e357.pth -O pretrained.pth
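After these steps, a quick sanity check can confirm that the environment matches what the code expects. The sketch below is not part of the repository: the function name `check_environment` is hypothetical, and it assumes the weights were saved as `models/pretrained.pth` under `$MVMHAT_ROOT`, as in the last step above.

```python
import importlib.util
import os
import sys


def check_environment(root=None):
    """Report Python/PyTorch/CUDA status and whether the pretrained weights exist."""
    # Fall back to the MVMHAT_ROOT variable, then to the current directory.
    root = root or os.environ.get("MVMHAT_ROOT", ".")
    report = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "cuda_available": False,
        "pretrained_found": os.path.isfile(
            os.path.join(root, "models", "pretrained.pth")
        ),
    }
    if report["torch_installed"]:
        import torch  # imported lazily so the check also runs without PyTorch

        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print("{}: {}".format(key, value))
```

Since NVIDIA GPUs are needed for both training and testing, a `cuda_available` value of `False` is worth resolving before going further.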

[Notes] The public code of the conference paper (ACM MM 21) can be found at https://github.com/realgump/MvMHAT.

Citation

If you find this project useful for your research, please use the following BibTeX entries.

@inproceedings{gan2021mvmhat,
  title={Self-supervised Multi-view Multi-Human Association and Tracking},
  author={Yiyang Gan and Ruize Han and Liqiang Yin and Wei Feng and Song Wang},
  booktitle={ACM MM},
  year={2021}
}

@inproceedings{MvMHAT++,
  title={Unveiling the Power of Self-supervision for Multi-view Multi-human Association and Tracking},
  author={Wei Feng and Feifan Wang and Ruize Han and Zekun Qian and Song Wang},
  booktitle={arXiv},
  year={2023}
}

References

Portions of the code are borrowed from Deep SORT; we thank the authors for their great work.

More information is coming soon ...

Contact: [email protected] (Ruize Han), [email protected] (Yiyang Gan). Questions and discussions are welcome!
