pwzy / video_features

This project forked from v-iashin/video_features


Extract video features from raw videos using multiple GPUs. We support RAFT and PWC flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, ResNet features.

Home Page: https://iashin.ai/video_features

License: GNU General Public License v3.0



Video Features

video_features allows you to extract features from video clips. It supports a variety of extractors and modalities, i.e. visual appearance, optical flow, and audio. See more details in Documentation.

Supported Models

Action Recognition

Sound Recognition

Optical Flow

Frame-wise Features

Quick Start

Open in Colab, or run locally with conda:

# clone the repo and change the working directory
git clone https://github.com/v-iashin/video_features.git
cd video_features

# install environment
conda env create -f conda_env_torch_zoo.yml

# load the environment
conda activate torch_zoo

# extract r(2+1)d features for the sample videos
python main.py \
    feature_type=r21d \
    device="cuda:0" \
    video_paths="[./sample/v_ZNVhz7ctTq0.mp4, ./sample/v_GGSY1Qvo990.mp4]"

# if you have many GPUs, just run this command from another terminal with another device
# device can also be "cpu"

If you are more comfortable with Docker, there is a Docker image with a pre-installed environment that supports all models. Check out the Docker support documentation page.

Multi-GPU and Multi-Node Setups

With video_features, it is easy to parallelize feature extraction among many GPUs. It is enough to start the script in another terminal with another GPU (or even the same one) pointing to the same output folder and input video paths. The script will check if the features already exist and skip them. It will also try to load the feature file to check if it is corrupted (i.e. not openable). This approach allows you to continue feature extraction if the previous script failed for some reason.

If you have access to a GPU cluster with shared disk space, you can scale extraction across as many GPUs as you have by creating several single-GPU jobs with the same command.

Because the list of input files is shuffled on every run, workers are unlikely to process the same video at the same time. In the rare case of a collision, the script overwrites the previously extracted features.
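The skip-and-shuffle behaviour described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function name `pending_videos`, the `<video stem>.npy` naming scheme, and the `is_loadable` hook are all assumptions made for the example.

```python
import random
from pathlib import Path


def pending_videos(video_paths, output_dir, is_loadable, rng=None):
    """Return the videos that still need features, in shuffled order.

    `is_loadable` is a caller-supplied check (hypothetical hook) that
    returns True when an existing feature file opens without error.
    """
    rng = rng or random.Random()
    paths = list(video_paths)
    rng.shuffle(paths)  # each worker walks the list in its own order
    todo = []
    for video in paths:
        # Hypothetical naming scheme: <output_dir>/<video stem>.npy
        feature_file = Path(output_dir) / (Path(video).stem + ".npy")
        if feature_file.exists() and is_loadable(feature_file):
            continue  # already extracted and readable: skip it
        todo.append(video)  # missing or corrupted: (re)extract
    return todo
```

Because every worker applies the same check against the shared output folder, restarting a failed job simply resumes with whatever is still missing or unreadable.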

Used in

Please let me know if you find this repo useful for your projects or papers.

Acknowledgements

  • @Kamino666: added the CLIP model as well as Windows and CPU support (and many other small things).
  • @borijang: fixed bugs with file names, improved I3D checkpoint loading, and cleaned up the code style.
  • @ohjho: added support for 37-layer R(2+1)D flavors.

Extract UCF-Crime Features

Collect the video paths:

rm sample/sample_video_paths.txt
find /home/jing/project/dataset/UCF-Crime-unzip  -name "*mp4" > sample/sample_video_paths.txt 
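If you prefer to build the path list from Python, a rough equivalent of the two commands above might look like this. The function name `write_video_list` is hypothetical; the `*mp4` pattern mirrors the `find` call.

```python
from pathlib import Path


def write_video_list(root, list_file, pattern="*mp4"):
    """Recursively collect files matching `pattern` and write one path per line."""
    paths = sorted(str(p) for p in Path(root).rglob(pattern) if p.is_file())
    list_path = Path(list_file)
    list_path.parent.mkdir(parents=True, exist_ok=True)
    # Overwriting the file replaces the separate `rm` step.
    list_path.write_text("".join(f"{p}\n" for p in paths))
    return len(paths)
```

The returned count is a quick sanity check that the dataset directory was found and is not empty.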

Run feature extraction:

python main.py \
    feature_type=i3d \
    device="cuda:0" \
    stack_size=16 \
    step_size=16 \
    file_with_video_paths=./sample/sample_video_paths.txt \
    on_extraction=save_numpy \
    output_path="./output"

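Notes:

  1. Do not specify extraction_fps: re-encoding a video at a new frame rate may add extra frames. For example, a 14-frame video became 21 frames after re-encoding.
  2. I3D extraction needs at least 17 frames to produce a feature (one extra frame is needed to compute optical flow).
  3. The path list can be split into chunks for extraction; the resulting files are named split_file_aa, split_file_ab, ...
     split -l 1000 example.txt split_file_
  4. The ShanghaiTech dataset has a frame rate of 24 fps, but do not change the frame rate parameter during feature extraction, as re-encoding may increase the total number of frames in a video.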
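To see why the frame count matters, here is a small sketch of how many I3D feature vectors a video would yield. It assumes that each stack consumes stack_size + 1 frames (the extra frame being for optical flow) and that the window then advances by step_size. The function name `num_i3d_stacks` and the windowing formula are illustrative assumptions; only the 17-frame minimum comes from the notes above.

```python
def num_i3d_stacks(num_frames, stack_size=16, step_size=16):
    """Estimate how many feature stacks a video of `num_frames` frames yields.

    Assumes each stack needs `stack_size + 1` frames and that consecutive
    stacks start `step_size` frames apart (an assumption of this sketch).
    """
    needed = stack_size + 1
    if num_frames < needed:
        return 0  # too short: no feature can be produced
    return (num_frames - needed) // step_size + 1


for frames in (14, 17, 21, 33):
    print(frames, num_i3d_stacks(frames))
```

Under these assumptions a 14-frame video yields no feature, while the same video re-encoded to 21 frames yields one, so a changed frame count changes the extracted output.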
