mini-net's Introduction

This repo contains the source code for our ECCV 2020 work MINI-Net: Multiple Instance Ranking Network for Video Highlight Detection. Our model is implemented in PyTorch.

Prerequisites

  1. PyTorch 1.4+
  2. numpy
  3. tqdm
  4. moviepy
  5. cv2 (OpenCV)

Getting Started

Change directory to src and run the following command:

CUDA_VISIBLE_DEVICES=1 python trainMIL.py --dataset youtube --domain gymnastics \
	--train_path /home/share/Highlight/proDataset/TrainingSet/ \
	--test_path /home/share/Highlight/proDataset/DomainSpecific \
	--topk_mAP 1 --FNet MILModel10 --AM AttentionModule \
	--DS MILDataset --AHLoss AdaptiveHingerLoss \
	--short_lower 10 --short_upper 40 --long_lower 60 --long_upper 60000 --bagsize 60 

Parameters:

  • CUDA_VISIBLE_DEVICES: ID of the GPU to use for training
  • dataset: dataset to use; one of youtube, tvsum, cosum
  • domain: target domain within the chosen dataset, e.g., gymnastics for the youtube dataset
  • train_path: location of the extracted features for training
  • test_path: location of the extracted features for testing
  • topk_mAP: test metric; 1 or 5 in our paper
  • FNet: model that predicts a highlight score for each segment of a video; in our paper: MILModel10
  • AM: model that fuses the visual and audio features; in our paper: AttentionModule_1
  • DS: dataset class; in our paper: MILDataset
  • AHLoss: hinge loss used in our paper (AdaptiveHingerLoss in the example command)
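
To give a feel for what a ranking-style hinge loss over bags of segments looks like, here is a minimal pure-Python sketch. It is not the repository's AdaptiveHingerLoss: the max-pooling aggregation, the fixed margin, the "positive bag"/"negative bag" naming, and the example scores are all assumptions made for illustration.

```python
def bag_score(segment_scores):
    """Aggregate per-segment scores into one bag score.

    Max pooling is an assumption here, not necessarily what MINI-Net uses.
    """
    return max(segment_scores)


def hinge_ranking_loss(pos_bag, neg_bag, margin=1.0):
    """Plain (non-adaptive) hinge ranking loss between two bags.

    The loss is zero once the positive bag outscores the negative bag
    by at least `margin`, and grows linearly otherwise.
    """
    return max(0.0, margin - (bag_score(pos_bag) - bag_score(neg_bag)))


# Hypothetical per-segment highlight scores for two bags.
pos_bag = [0.2, 0.9, 0.4]
neg_bag = [0.1, 0.3, 0.2]
loss = hinge_ranking_loss(pos_bag, neg_bag)
print(f"loss = {loss:.2f}")
```

The loss only compares the best-scoring segment of each bag, so no per-segment labels are needed in this sketch.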

See visual-audio fusion/opts.py for details of the data selection hyper-parameters.

The extracted features of the three test datasets are available here.
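
The flags from the example command could be declared with argparse roughly as follows. This is a sketch, not the contents of opts.py: the argument types, the `choices` restriction on `--dataset`, and the decision to mark some flags as required are assumptions, and `build_parser` is a hypothetical helper name.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in the example command."""
    parser = argparse.ArgumentParser(description="Sketch of the trainMIL.py options")
    parser.add_argument("--dataset", choices=["youtube", "tvsum", "cosum"], required=True)
    parser.add_argument("--domain", required=True)
    parser.add_argument("--train_path", required=True)
    parser.add_argument("--test_path", required=True)
    parser.add_argument("--topk_mAP", type=int)
    # Class names are passed as strings and resolved by the training script.
    for name in ("FNet", "AM", "DS", "AHLoss"):
        parser.add_argument("--" + name)
    # Data selection hyper-parameters; integer type is an assumption.
    for name in ("short_lower", "short_upper", "long_lower", "long_upper", "bagsize"):
        parser.add_argument("--" + name, type=int)
    return parser


# Parse the same values used in the example command.
args = build_parser().parse_args([
    "--dataset", "youtube", "--domain", "gymnastics",
    "--train_path", "/home/share/Highlight/proDataset/TrainingSet/",
    "--test_path", "/home/share/Highlight/proDataset/DomainSpecific",
    "--topk_mAP", "1", "--FNet", "MILModel10", "--AM", "AttentionModule",
    "--DS", "MILDataset", "--AHLoss", "AdaptiveHingerLoss",
    "--short_lower", "10", "--short_upper", "40",
    "--long_lower", "60", "--long_upper", "60000", "--bagsize", "60",
])
print(args.dataset, args.domain, args.bagsize)
```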

Main Results

  • YouTube:

    Topic        mAP
    dog          0.5816
    gymnastics   0.6165
    parkour      0.7020
    skating      0.7217
    skiing       0.5866
    surfing      0.6514

  • TVSum:

    Topic   top-5 mAP
    VT      0.8062
    VU      0.6832
    GA      0.7821
    MS      0.8183
    PK      0.7807
    PR      0.6584
    FM      0.5780
    BK      0.7502
    BT      0.8019
    DS      0.6551

  • CoSum:

    Topic   top-5 mAP
    BJ      0.8450
    BP      0.9887
    ET      0.9156
    ERC     1
    KP      0.9611
    MLB     0.9353
    NFL     1
    NDC     0.9536
    SL      0.8896
    SF      0.7897
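
For a quick per-dataset summary, the per-topic scores above can be averaged. The short script below copies the values from the tables; the averages it prints are computed here and are not figures reported in this README. Note that the YouTube column is labelled mAP while the other two are top-5 mAP, so the three averages are not directly comparable.

```python
# Per-topic scores copied from the tables above.
results = {
    "YouTube": {"dog": 0.5816, "gymnastics": 0.6165, "parkour": 0.7020,
                "skating": 0.7217, "skiing": 0.5866, "surfing": 0.6514},
    "TVSum": {"VT": 0.8062, "VU": 0.6832, "GA": 0.7821, "MS": 0.8183,
              "PK": 0.7807, "PR": 0.6584, "FM": 0.5780, "BK": 0.7502,
              "BT": 0.8019, "DS": 0.6551},
    "CoSum": {"BJ": 0.8450, "BP": 0.9887, "ET": 0.9156, "ERC": 1.0,
              "KP": 0.9611, "MLB": 0.9353, "NFL": 1.0, "NDC": 0.9536,
              "SL": 0.8896, "SF": 0.7897},
}


def mean_score(scores):
    """Unweighted mean over the topics of one dataset."""
    return sum(scores.values()) / len(scores)


averages = {name: mean_score(scores) for name, scores in results.items()}
for name, avg in averages.items():
    print(f"{name}: {avg:.4f}")
```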

Reference

If you find our work helpful in your research, please cite our paper via:

Bib:
@inproceedings{hong2020mini,
title={MINI-Net: Multiple Instance Ranking Network for Video Highlight Detection},
author={Hong, Fa-Ting and Huang, Xuanteng and Li, Wei-Hong and Zheng, Wei-Shi},
booktitle={European Conference on Computer Vision},
year={2020}
}

More information about our work is available at https://harlanhong.github.io.

mini-net's People

Contributors

harlanhong, huangxt57

mini-net's Issues

About the training set

@Huangxt57 Would you mind sharing the extracted features for the training set with me, so that I can train the model myself?

Thank you.

the problem about the results in TV sum dataset

I ran many experiments on the TVSum dataset and found that I cannot get stable results for each domain. Taking the VT domain as an example, top-5 mAP varies from 45% to 79% in the w/o audio setting, while the setting that includes audio is stable.

FileNotFoundError: [Errno 2] No such file or directory: 'files/train_features/gymnastics_duration.npy'

I downloaded the dataset provided here in a directory and ran the command like this:

("files" is where I have stored the files)

!python trainMIL.py --dataset youtube --domain gymnastics \
	--train_path "/files/train_features" \
	--test_path "/files/DomainSpecific"  \
	--topk_mAP 1 --FNet MILModel10 --AM AttentionModule \
	--DS MILDataset --AHLoss AdaptiveHingerLoss \
	--short_lower 10 --short_upper 40 --long_lower 60 --long_upper 60000 --bagsize 60 

I am getting this error:

FileNotFoundError: [Errno 2] No such file or directory: 'files/train_features/gymnastics_duration.npy'

Are any files missing from 'train_features'?
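
One way to narrow this down is to check which expected files are absent before starting training. The sketch below assumes only what the error message shows, namely that the script looks for `<domain>_duration.npy` inside the training path; `missing_feature_files` is a hypothetical helper, and any other file suffixes the loader may need are unknown and would have to be added by hand.

```python
import os


def missing_feature_files(train_path, domain, suffixes=("_duration.npy",)):
    """Return the expected feature files that do not exist under train_path.

    The "_duration.npy" suffix is taken from the error message above;
    other suffixes required by the training script are not known here.
    """
    expected = [os.path.join(train_path, domain + suffix) for suffix in suffixes]
    return [path for path in expected if not os.path.isfile(path)]


if __name__ == "__main__":
    train_path = "files/train_features"
    # Printing the absolute path shows where a relative path really resolves.
    print("Looking in:", os.path.abspath(train_path))
    for path in missing_feature_files(train_path, "gymnastics"):
        print("Missing:", path)
```

If the printed absolute path is not the directory that holds the downloaded features, the problem is the path rather than the download.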

about pretrained model

Hi, can you provide the pretrained C3D model and the model used to extract the audio features? I want to run experiments on other video datasets.
