shengcailiao / qaconv

[ECCV 2020] QAConv: Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting, and [CVPR 2022] GS: Graph Sampling Based Deep Metric Learning

Home Page: https://arxiv.org/abs/1904.10424

License: MIT License

Python 96.74% MATLAB 3.26%
person-reidentification person-re-identification person-reid person-retrieval person-search person-recognition re-identification re-id temporal-models adaptive-convolution

qaconv's Introduction

QAConv

Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting

This is the official PyTorch code for the QAConv method proposed in our paper [1] and for QAConv-GS with Graph Sampling proposed in our paper [2]. A Chinese blog is available: 再见,迁移学习?可解释和泛化的行人再辨识 ("Goodbye, transfer learning? Interpretable and generalizable person re-identification").

Updates

  • 1/3/2023: Improved testing efficiency by using half precision, and optimized memory usage in testing. Included ClonedPerson, and removed DukeMTMC-reID.
  • 3/3/2022: The Graph Sampling work (QAConv-GS / QAConv 2.1) has been accepted by CVPR 2022.
  • 9/29/2021: TransMatcher has been accepted by NeurIPS 2021.
  • 9/19/2021: Included TransMatcher, a transformer-based deep image matching method built on QAConv 2.0.
  • 9/16/2021: QAConv 2.1: simplified graph sampling, implemented QAConv with the Einstein summation (see the sketch after this list), adopted the batch-hard triplet loss, designed an adaptive epoch and learning rate scheduling method, and applied automatic mixed precision training.
  • 4/1/2021: QAConv 2.0 [2]: included a new sampler called Graph Sampler (GS), and removed the class memory. This version is much more efficient in learning. See the updated results.
  • 3/31/2021: QAConv 1.2: included some popular data augmentation methods, and changed the ranking.py implementation to the original open-reid version, so that it is more consistent with most other implementations (e.g. open-reid, torch-reid, fast-reid).
  • 2/7/2021: QAConv 1.1: an important update, which includes a pre-training function for better initialization, so that the results are now more stable.
  • 11/26/2020: Included the IBN-Net backbone and the RandPerson dataset.
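
To make the Einstein summation formulation concrete, here is a minimal sketch of a QAConv-style matching score between two feature maps; the shapes and the best-match scoring below are illustrative assumptions, not the repository's exact implementation:

```python
import torch
import torch.nn.functional as F

def qaconv_score(query_feat, gallery_feat):
    # query_feat, gallery_feat: (c, h, w) feature map of one image each
    c, h, w = query_feat.shape
    q = F.normalize(query_feat.reshape(c, h * w), dim=0)  # unit norm per location
    g = F.normalize(gallery_feat.reshape(c, h * w), dim=0)
    # Cosine similarity between every query location and every gallery location,
    # computed with a single Einstein summation
    sim = torch.einsum('ci,cj->ij', q, g)                 # (h*w, h*w)
    # Keep the best match per location, in both directions, and average
    return 0.5 * (sim.max(dim=1).values.mean() + sim.max(dim=0).values.mean())
```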

Illustrations

Fig. 1. Illustration of the proposed query-adaptive convolution (QAConv).

Fig. 2. Examples of local correspondences obtained by QAConv.

Fig. 3. QAConv network architecture in training.

Fig. 4. Illustration of the proposed temporal lifting (TLift).

Requirements

  • PyTorch (>1.0)
  • sklearn
  • scipy

Usage

Download some public datasets (e.g. Market-1501, CUHK03-NP, MSMT, RandPerson, ClonedPerson) on your own, extract them into some folder, and then run the following commands.

Training and test

python main.py --dataset market --testset cuhk03_np_detected[,msmt] [--data-dir ./data] [--exp-dir ./Exp]

For more options, run "python main.py --help". For example, to use ResNet-152 as the backbone, specify "-a resnet152". To train on the whole dataset (as done in our paper for MSMT17), specify "--combine_all".

The main file implements the QAConv 2.1 version, i.e. the CVPR 2022 version with the Graph Sampler and the triplet loss alone. For earlier versions, please check the Releases.

Test only

python main.py --dataset market --testset cuhk03_np_detected[,market,msmt] [--data-dir ./data] [--exp-dir ./Exp] --evaluate

Performance

Performance (%) of QAConv (QAConv 1.0) and QAConv-GS (QAConv 2.1) under direct cross-dataset evaluation without transfer learning or domain adaptation:

| Training Data | Version | Training Hours | CUHK03-NP (Rank-1 / mAP) | Market-1501 (Rank-1 / mAP) | MSMT17 (Rank-1 / mAP) |
| --- | --- | --- | --- | --- | --- |
| Market | QAConv 1.0 | 1.33 | 9.9 / 8.6 | - | 22.6 / 7.0 |
| Market | QAConv 2.1 | 0.25 | 19.1 / 18.1 | - | 45.9 / 17.2 |
| MSMT | QAConv 2.1 | 0.73 | 20.9 / 20.6 | 79.1 / 49.5 | - |
| MSMT (all) | QAConv 1.0 | 26.90 | 25.3 / 22.6 | 72.6 / 43.1 | - |
| MSMT (all) | QAConv 2.1 | 3.42 | 27.6 / 28.0 | 82.4 / 56.9 | - |
| RandPerson | QAConv 2.1 | 2.0 | 18.4 / 16.1 | 76.7 / 46.7 | 45.1 / 15.5 |

Contacts

Shengcai Liao
Inception Institute of Artificial Intelligence (IIAI)
[email protected]

Citation

[1] Shengcai Liao and Ling Shao, "Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting." In the 16th European Conference on Computer Vision (ECCV), 23-28 August, 2020.

[2] Shengcai Liao and Ling Shao, "Graph Sampling Based Deep Metric Learning for Generalizable Person Re-Identification." In CVF/IEEE Conference on Computer Vision and Pattern Recognition, 2022.

@inproceedings{Liao-ECCV2020-QAConv,  
  title={{Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting}},  
  author={Shengcai Liao and Ling Shao},  
  booktitle={European Conference on Computer Vision (ECCV)},  
  year={2020}  
}

@inproceedings{Liao-CVPR2022-GraphSampling,
  author    = {Shengcai Liao and Ling Shao},
  title     = {{Graph Sampling Based Deep Metric Learning for Generalizable Person Re-Identification}},
  booktitle = {CVF/IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2022}
}


qaconv's Issues

Training result is 0

The training result of python main_gs.py --dataset market --testset cuhk03_np_detected is rank1 = 0:

QAConv50_IBNb_GS/res50-layer3-f64_pre1_clip512_lr3_bs64-k4_ep15s10:
cuhk03_np_detected: rank1=0.0, mAP=0.2.

I also have two questions. First, why does final_layer use layer3 and drop the last layer4? Second, why use a memory mechanism to cache the features of each class?

On the first point, my thinking is as follows: the paper assumes that the convolutional features preserve detail features spatially consistent with the original image, so spatial point-to-point matching can alleviate the alignment problem to some extent, while learning a matching function (the final fully connected layer) achieves the goal of being 'query adaptive'; the features from layer4, however, may have lost some local detail and even the spatial consistency of the features.
On the second point, a tentative idea of mine: during training, match all pairs within a mini-batch and train with classification or metric learning. Training and testing would then be fully consistent, no memory would be needed to store each class's feature maps, and the method would be more flexible and better able to scale to industrial-scale datasets.

Graph Sampler

Thank you for your work! I have two questions about Graph Sampling:

  1. Intuitively, it should also work on the normal ReID task.
  2. The whole process seems to be: before training each epoch, the proposed sampler randomly selects one image per class, then computes a distance matrix over these representatives, which represents the distances between classes. So we can mine the hardest samples over the entire dataset, not just within a batch (one reading of this is sketched in code below). But I didn't get how "Graph" connects to the process above.
    Looking forward to your help
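
For reference, here is a minimal sketch of this reading of the sampler; the helper name, shapes, top-k batching, and the use of Euclidean distances (standing in for the pairwise QAConv distances the paper actually uses) are illustrative assumptions, not the repository's GraphSampler class:

```python
import random
from collections import defaultdict
import torch

def graph_sampling_batches(features, labels, batch_classes=16, k=4):
    # features: (N, d), one feature vector per image; labels: (N,) class ids
    by_class = defaultdict(list)
    for idx, y in enumerate(labels.tolist()):
        by_class[y].append(idx)
    classes = sorted(by_class)

    # One randomly chosen representative image per class
    reps = torch.stack([features[random.choice(by_class[y])] for y in classes])

    # Class-to-class distance matrix; its nearest-neighbor structure is the "graph"
    dist = torch.cdist(reps, reps)
    dist.fill_diagonal_(float('inf'))

    batches = []
    for i, y in enumerate(classes):
        # Group each anchor class with its nearest (hardest) neighbor classes,
        # so every batch is full of similar, hard-to-separate identities
        nn_idx = dist[i].topk(batch_classes - 1, largest=False).indices.tolist()
        batch = []
        for cls in [y] + [classes[j] for j in nn_idx]:
            batch += random.sample(by_class[cls], min(k, len(by_class[cls])))
        batches.append(batch)
    return batches
```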

where is the "main_gs.py"

I'm sorry to bother you that I can't find "main_gs.py". I'm very interested in your method. Thank you

Local correspondence visualization

Hello! How can I visualize the local correspondence matching results the way they are shown in your paper?

Unstable results

Hi,
Thanks for sharing your code.
However, I ran your code twice and got quite different results, maybe due to the random seed?
Did you set a fixed random seed when training the model?
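
For reference, one common way to reduce run-to-run variance in PyTorch experiments is to fix all random seeds; this is a general recipe, not code taken from this repository:

```python
import random
import numpy as np
import torch

def set_seed(seed=0):
    random.seed(seed)                 # Python RNG
    np.random.seed(seed)              # NumPy RNG
    torch.manual_seed(seed)           # CPU RNG
    torch.cuda.manual_seed_all(seed)  # all GPU RNGs
    # Full determinism may also require deterministic cuDNN kernels,
    # usually at some speed cost:
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```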

Question about backbone

Hi Mr. Liao, I greatly appreciate your novel idea and your code. I notice that you choose ResNet as the backbone. ResNet-152 has shown great results in the paper and in my own experiments, but it seems to take quite some time to train, even when we only use layer 3 of the model. Have you tried a lightweight backbone such as MobileNet? Is there any specific reason for choosing ResNet as the feature extractor?
Thanks in advance.

Issues about evaluators.py

I use Market as the training dataset and Duke as the test dataset. When I use --do_tlift, it reports that the tensor sizes do not match.

In evaluators.py, at line 212, the original dist size is 2228×17661 on the Duke test set, while the size of dist_rerank is 2228×253, because num_gal is not the same: num_gal is defined at line 189 as the number of gallery images, but is redefined at line 204 as the size of the gallery features.

Why is the output channel of the convolution kernel h*w?

Hello, I have a small question about the output channel of the convolution kernel:
As I understand it, the output channel should be (h//s)*(w//s)? Is it a typo in the paper, or have I misunderstood?
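
For intuition, here is a minimal sketch of where the channel count comes from: with an s×s kernel extracted at each anchor position of the query feature map and applied with stride s, there are (h//s)*(w//s) anchors, hence that many output channels, which is exactly h*w when s=1. Shapes below are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

c, h, w = 64, 24, 8
query = torch.randn(c, h, w)          # query feature map
gallery = torch.randn(1, c, h, w)     # gallery feature map (batch of 1)

# s=1 case: every query location becomes one 1x1 kernel, so h*w kernels
kernels = query.reshape(c, h * w).t().reshape(h * w, c, 1, 1)
response = F.conv2d(gallery, kernels)  # (1, h*w, h, w)
print(response.shape)                  # torch.Size([1, 192, 24, 8])
```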

Out of memory, even with --test_fea_batch, --test_gal_batch, and --test_prob_batch all set to 128

main.py --dataset market --testset msmt --data-dir ./reid/datasets/ --exp-dir ./Exp

fpaths:./reid/datasets/market/bounding_box_train/1500_c6s3_086567_01.jpg
fpaths:./reid/datasets/market/bounding_box_test/1501_c6s4_001902_01.jpg
fpaths:./reid/datasets/market/query/1501_c6s4_001877_00.jpg
Market dataset loaded
subset | # ids | # images

train | 751 | 12935
query | 750 | 3367
gallery | 751 | 15912

  • Finished epoch 1 at lr=[0.0005, 0.005, 0.005]. Loss: 14.812. Acc: 54.97%. Training time: 174 seconds.

  • Finished epoch 2 at lr=[0.0005, 0.005, 0.005]. Loss: 13.333. Acc: 61.35%. Training time: 344 seconds.

  • Finished epoch 3 at lr=[0.0005, 0.005, 0.005]. Loss: 11.447. Acc: 68.55%. Training time: 514 seconds.

  • Finished epoch 4 at lr=[0.0005, 0.005, 0.005]. Loss: 10.338. Acc: 72.09%. Training time: 684 seconds.

  • Finished epoch 5 at lr=[0.0005, 0.005, 0.005]. Loss: 9.319. Acc: 75.31%. Training time: 855 seconds.

Decay the learning rate by a factor of 0.1. Final epochs: 7.

  • Finished epoch 6 at lr=[5e-05, 0.0005, 0.0005]. Loss: 8.566. Acc: 77.75%. Training time: 1025 seconds.

  • Finished epoch 7 at lr=[5e-05, 0.0005, 0.0005]. Loss: 7.732. Acc: 80.22%. Training time: 1195 seconds.

The learning converges at epoch 7.

Evaluate the learned model:
test_names: ['msmt']
MSMT dataset loaded
subset | # ids | # images

train | 1041 | 32621
query | 3060 | 11659
gallery | 3060 | 82161
/home/luotao/anaconda3/envs/QAConv/lib/python3.6/site-packages/torchvision/transforms/transforms.py:288: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
"Argument interpolation should be of type InterpolationMode instead of int. "
Time: 2690.337 seconds. / 1284. similarity 1 / 1284.
Killed

Run: python main.py --dataset market --testset msmt --data-dir ./reid/datasets/ --exp-dir ./Exp
With --test_fea_batch, --test_gal_batch, and --test_prob_batch all set to 128: Time: xx seconds, /xx similarity xx/xx. Killed.
With those three parameters set to 64, the error is the same: Time: 2690.337 seconds. / 1284. similarity 1 / 1284. Killed.

Multi-source training / target-domain testing

Dear Prof. Liao,
Does the QAConv series only fit the cross-domain setting, i.e. a single source domain and a single target domain, or can it also be used with multiple source domains and multiple target domains?

Thank you!

A small question about reproduction

Hello, when you train and test on different datasets, do you use exactly the same hyperparameters throughout? With the default configuration of your code, some datasets come out 1-2 points worse.

Can't find qaconv_loss

Hello,

First of all, thanks so much for your good work!

Here is a question: inside test_matching.py you import from reid.loss.qaconv_loss import QAConvLoss; however, it seems that qaconv_loss no longer exists, so I changed to other loss functions. Will this influence the performance?

Thanks!

Questions about the Graph Sampling work

Dear Prof. Liao, I have read your recent Graph Sampling paper, and there are two points I am not clear about and would like to ask you for guidance on:

  1. When building the graph at each epoch, could randomly sampling one image per class introduce a large bias?
  2. When K=2 is used to deal with the problem of gradients being too small, could some hard cases never be sampled at all? (I have run into this before on business datasets far larger than academic ones, where hard cases were never sampled.)

self.model.eval()

Recently, I have read your code for QAConv, and now I have a question to consult you about. The train() method in trainer.py contains the following lines:

class BaseTrainer(object):
    ...
    for i, inputs in enumerate(data_loader):
        self.model.eval()
        self.criterion.train()

Why don't you set the model to train mode with self.model.train(), instead of using model.eval()? In the whole project, I also found no other place that uses model.train().

some questions

(1) A conventionally trained CNN uses one fixed set of parameters for all queries, whereas QAConv uses different parameters for each query. Is this understanding correct? If so, how are the QAConv parameters for a new query obtained at test time? If not, how should the 'adaptive' in query-adaptive be understood?
(2) For these adaptive convolution parameters, does every position of the query correspond to its own specific kernel, or do all positions of the query share one kernel?
(3) How does QAConv learn image matching? The only supervision is identity information, i.e. the feedback is only whether two images show the same person; how does it learn that this local part matches that one?
(4) The QAConv kernels used for image matching are also trained from the dataset, so could they respond only to patterns already present in the training data? In other words, for new classes, how does QAConv guarantee a response to new patterns?

Looking forward to your patient reply

A question about graph sampling

Dear Prof. Liao, I would like to understand why graph sampling improves domain-generalized re-id so much. Previous domain-generalized re-id methods usually rely on domain-invariant learning, style normalization, and the like, whereas graph sampling seems to take a different route, improving domain generalization by strengthening hard mining. I don't quite understand this point; looking forward to your reply, thank you!

Memory and batchsize

When I tried to run this code, I encountered an out-of-memory error. Does reducing the batch size have a big impact on the accuracy of the experiment? And how much GPU memory does this need to run, at minimum?

Training is very slow!

Hello, I am training on 300K images with two 2080 Ti GPUs, a batch size of 64, and fp16 to speed up training, but one epoch has taken more than half an hour to reach iteration 511. Is this normal?
Epoch: [1][511/4714] Time 2.620 (2.646) Data 0.001 (0.002) Loss 456.984 (520.544) Prec 0.00% (0.00%)

Unable to use ClassMemoryLoss to train the model

In the QAConv code, I tried to use ClassMemoryLoss as the criterion, but the accuracy is nearly zero. Is ClassMemoryLoss still usable? Are ClassMemoryLoss and the focal loss in the paper the same thing? The code is shown below.

criterion = ClassMemoryLoss(matcher, num_classes, num_features, hei, wid).cuda()

About s=1

Dear Prof. Liao, I have a question about the choice of s. Your paper mentions that s=1 was chosen for efficiency. My understanding is: without the class memory, using pairwise matching, one QAConv pass has time complexity O(B^2 * (HW)^2 * s). Going by this complexity alone, a slightly larger or smaller s should not matter much. However, when s=1 the matching can be done directly with matrix multiplication, and since matrix multiplication is heavily optimized, the actual running time is greatly reduced; hence s=1. Is my understanding correct? I would appreciate your advice!
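
To illustrate the point about matrix multiplication, here is a sketch of how, with s=1, all B×B pairwise QAConv matchings in a batch collapse into a single batched matrix product over flattened, L2-normalized feature maps (shapes are assumed for illustration):

```python
import torch
import torch.nn.functional as F

B, c, h, w = 8, 64, 24, 8
feats = F.normalize(torch.randn(B, c, h * w), dim=1)  # unit norm per location
# (B, B, h*w, h*w): location-to-location similarities for every image pair;
# the O(B^2 * (HW)^2 * c) work runs as highly optimized GEMM kernels
sim = torch.einsum('aci,bcj->abij', feats, feats)
print(sim.shape)  # torch.Size([8, 8, 192, 192])
```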

Error about main.py

I tried python main.py --dataset market --testset cuhk03_np_detected --data-dir /home/lzj/QAConv-master/data --exp-dir home/lzj/QAConv-master/data, and it shows an error.
I'm not sure whether it is because I passed a wrong command, though I checked python main.py -h. The path /home/lzj/QAConv-master/data is where Market-1501 lies.
Thanks so much in advance!
