
This project forked from kezhang-cs/video-summarization-with-lstm


Implementation of our ECCV 2016 Paper (Video Summarization with Long Short-term Memory)

License: Other

MATLAB 51.75% Python 35.28% HTML 12.97%


Video Summarization with LSTM

This repository provides the data and implementation for video summarization with LSTM, i.e. vsLSTM and dppLSTM in our paper:

Video Summarization with Long Short-term Memory
Ke Zhang*, Wei-Lun Chao*, Fei Sha, and Kristen Grauman.
In Proceedings of the European Conference on Computer Vision (ECCV), 2016, Amsterdam, The Netherlands. (*Equal contribution) [pdf] [supp]

If you find the code or other resources in this repository useful, please cite the following paper:

@inproceedings{zhang2016video,
  title={Video summarization with long short-term memory},
  author={Zhang, Ke and Chao, Wei-Lun and Sha, Fei and Grauman, Kristen},
  booktitle={ECCV},
  year={2016},
  organization={Springer}
}

Environment

  • Mac OS X or Linux
  • NVIDIA GPU with compute capability 3.5+
  • Python 2.7+
  • Theano 0.7+
  • MATLAB

Data

Download the data and unzip to ./data/

Note that we down-sampled the original videos to 2 fps.

  1. File name: files follow the format 'Data_$Dataset$_google_p5.h5'. For example, Data_SumMe_google_p5.h5 holds the frame-level features of the SumMe dataset.
  2. Video indices: the indices of the videos are stored as 'idx' in the file. In most cases they run from 1 to n, where n is the number of videos in the dataset (except for the YouTube dataset).
  3. Features and ground truth: for the i-th video in the dataset, the features are stored as 'fea_i', the importance scores as 'gt_1_i' (real numbers, from the original dataset), and the keyframes we used as 'gt_2_i' (binary values converted from the original dataset).
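
As a rough illustration of the layout described above, the following Python sketch builds the file and entry names and reads the entries of one video. It assumes the h5py package is installed and that the file has been unzipped under ./data/. The helper names (dataset_filename, video_keys, load_video) are hypothetical and are not part of this repository.

```python
import os


def dataset_filename(dataset):
    """File name for a dataset, following 'Data_$Dataset$_google_p5.h5'."""
    return "Data_{}_google_p5.h5".format(dataset)


def video_keys(i):
    """Names of the feature, importance, and keyframe entries for video i."""
    return {
        "feature": "fea_{}".format(i),
        "importance": "gt_1_{}".format(i),
        "keyframes": "gt_2_{}".format(i),
    }


def load_video(data_dir, dataset, i):
    """Read the three arrays for video i (assumes h5py is available)."""
    import h5py  # imported lazily so the naming helpers work without it

    path = os.path.join(data_dir, dataset_filename(dataset))
    with h5py.File(path, "r") as f:
        # f[key][()] reads the whole dataset into memory as a NumPy array
        return {name: f[key][()] for name, key in video_keys(i).items()}
```

For example, load_video('./data/', 'SumMe', 1) would return the arrays stored under fea_1, gt_1_1, and gt_2_1, provided the file uses exactly these entry names.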

Original videos and annotations for each dataset are also available from the authors' project page.

Codes

dppLSTM for video summarization

Download the pre-trained models, unzip them to ./models, and run the following commands:

cd ./codes
THEANO_FLAGS=device=gpu0,floatX=float32 python dppLSTM_main.py 

This automatically runs summarization on the video data using the pre-trained model and saves the results in ./res_LSTM/ as dppLSTM_$DATASET$_2_inference.h5.
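
To check what a run produced, a short sketch like the one below can build the expected results path and list what the file contains. It assumes h5py is installed and that ./res_LSTM/ is resolved relative to wherever the script wrote its output. The helper names are hypothetical, and since the entry names inside the results file are not documented here, the code only lists them.

```python
import os


def result_path(dataset, res_dir="./res_LSTM/"):
    """Path of the inference output, following dppLSTM_$DATASET$_2_inference.h5."""
    return os.path.join(res_dir, "dppLSTM_{}_2_inference.h5".format(dataset))


def list_entries(path):
    """Return the sorted top-level entry names of an HDF5 file (assumes h5py)."""
    import h5py  # imported lazily; only needed when a file is actually opened

    with h5py.File(path, "r") as f:
        return sorted(f.keys())
```

Calling list_entries(result_path('SumMe')) after a run on SumMe would show which entries were saved.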

To train the model on your own data, uncomment Line 85 in dppLSTM_main.py:

train(model_idx = model_idx, train_set = train_set, val_set = val_set, model_saved = model_file)

Evaluation

For both the SumMe and TVSum datasets, evaluation code is provided by the authors of the original datasets.

We also provide evaluation code with wrappers that adapt it to the datasets above.

To evaluate the predicted summaries, start MATLAB and run the following commands:

cd ./codes
dppLSTM_eval('../data/', '$DATASET$', '/dppLSTM_$DATASET$_2_inference.h5')
