[CVPR2022] SVIP: Sequence VerIfication for Procedures in Videos

License: MIT License

SVIP: Sequence VerIfication for Procedures in Videos

This repo is the official implementation of our CVPR 2022 paper: SVIP: Sequence VerIfication for Procedures in Videos.


Getting Started

Prerequisites

  • python 3.6
  • pytorch 1.7.1
  • cuda 10.2

Installation

  1. Clone the repo and install dependencies.

    git clone https://github.com/svip-lab/SVIP-Sequence-VerIfication-for-Procedures-in-Videos.git
    cd SVIP-Sequence-VerIfication-for-Procedures-in-Videos
    pip install -r requirements.txt
  2. Download the Kinetics-400 pretrained model.

    Link: here

    Extraction code: bs6b


Datasets

Please refer to here for detailed instructions.


Training and Evaluation

We provide default configuration files for reproducing our results. Use the following commands to train and evaluate the models.

  • For training:
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --config configs/train_resnet_config.yml
  • For evaluation:
    CUDA_VISIBLE_DEVICES=0 python eval.py --config configs/eval_resnet_config.yml --root_path [model&log folder] --dist [L2/NormL2] --log_name [xxx]
    Note: we use the L2 distance (--dist L2) when evaluating on COIN-SV, and NormL2 (--dist NormL2) on the other datasets.
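To illustrate the difference between the two `--dist` options, here is a minimal sketch in plain Python. It assumes that L2 means the squared Euclidean distance between two video embeddings and that NormL2 means the same distance computed after L2-normalizing each embedding. Both definitions, and the helper names `l2_distance`, `norm_l2_distance`, and `pair_distance`, are assumptions for illustration and may differ from what `eval.py` actually does.

```python
import math


def l2_distance(a, b):
    """Squared Euclidean distance between two embedding vectors (assumed definition)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def normalize(v, eps=1e-12):
    """Scale a vector to unit L2 norm; eps guards against division by zero."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, eps) for x in v]


def norm_l2_distance(a, b):
    """Squared Euclidean distance after L2-normalizing both vectors (assumed definition)."""
    return l2_distance(normalize(a), normalize(b))


def pair_distance(a, b, dist="NormL2"):
    """Dispatch on the distance name, mirroring the --dist flag values."""
    if dist == "L2":
        return l2_distance(a, b)
    if dist == "NormL2":
        return norm_l2_distance(a, b)
    raise ValueError(f"unknown distance: {dist!r}")


if __name__ == "__main__":
    # Two embeddings that point in the same direction but differ in magnitude.
    e1, e2 = [3.0, 4.0], [6.0, 8.0]
    print(pair_distance(e1, e2, "L2"))
    print(pair_distance(e1, e2, "NormL2"))
```

Under these assumptions, NormL2 ignores the magnitude of the embeddings and depends only on their direction, whereas L2 is sensitive to both.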

Trained Models

We provide checkpoints for each dataset trained with this re-organized codebase.

Notice: The reproduced performance is occasionally higher or lower (within a reasonable range) than the results reported in the paper.

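| Dataset     | Split | Paper         | Reproduce     | ckpt |
|-------------|-------|---------------|---------------|------|
| COIN-SV     | val   | 56.81, 0.4005 | 58.27, 0.4667 | here |
|             | test  | 51.13, 0.4098 | 51.55, 0.4658 |      |
| Diving48-SV | val   | 91.91, 1.0642 | 91.69, 1.0928 | here |
|             | test  | 83.11, 0.6009 | 84.28, 0.6193 |      |
| CSV         | test  | 83.02, 0.4193 | 82.88, 0.4474 | here |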

Citation

If you find this repo helpful, please cite our paper:

@inproceedings{qian2022svip,
  title={SVIP: Sequence VerIfication for Procedures in Videos},
  author={Qian, Yicheng and Luo, Weixin and Lian, Dongze and Tang, Xu and Zhao, Peilin and Gao, Shenghua},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={19890--19902},
  year={2022}
}

svip-sequence-verification-for-procedures-in-videos's Issues

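Question about dataset preprocessing

Hello authors, I recently set up the datasets following the paper and tested them with the checkpoints provided in the GitHub repository, but our results do not reach the AUC reported in the README; the gap can be more than 2 points (for example, our AUC on CSV is 81.45). I suspect there is a problem with how I downloaded and preprocessed the datasets.
I extract the video frames with ffmpeg -i xxx.mp4 -s 320x180 -y .../%06d.jpg (I also tried the highest compression quality) and resize them to 180x320. What preprocessing method did you use, and does it differ from mine? If preprocessing is not the cause, what else could explain the performance gap? It would be ideal if you could provide an official download for the preprocessed datasets.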
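For reference, the frame-extraction command quoted in this issue could be scripted roughly as follows. This is a sketch of the reporter's command only, not the authors' confirmed preprocessing pipeline; the helper names `build_ffmpeg_cmd` and `extract_frames` and the output directory layout are hypothetical.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(video_path, out_dir, width=320, height=180):
    """Build the ffmpeg argument list used in the issue: resize and dump JPEG frames."""
    pattern = str(Path(out_dir) / "%06d.jpg")  # frames named 000001.jpg, 000002.jpg, ...
    return [
        "ffmpeg",
        "-i", str(video_path),       # input video
        "-s", f"{width}x{height}",   # output frame size as WIDTHxHEIGHT
        "-y",                        # overwrite existing files without asking
        pattern,
    ]


def extract_frames(video_path, out_dir, width=320, height=180):
    """Create the output directory and run ffmpeg (requires ffmpeg on PATH)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_cmd(video_path, out_dir, width, height), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without ffmpeg.
    print(" ".join(build_ffmpeg_cmd("xxx.mp4", "frames/xxx")))
```

Note that ffmpeg's `-s` option takes the size as width x height, so `320x180` yields frames that are 320 pixels wide and 180 pixels high.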

Regarding variation of performance metric

According to the paper, one should get WDR = 0.4403, but in my case I am getting very high values such as 3.039, and sometimes even infinity when I increase the input size to more than 300 videos. Could you please tell me what the reason might be? The same goes for the AUC: I am getting 0.9703.
