imatge-upc / unsupervised-2017-cvprw
Disentangling Motion, Foreground and Background Features in Videos

Home Page: https://imatge-upc.github.io/unsupervised-2017-cvprw/

License: MIT License

Python 100.00%
deep-learning unsupervised-machine-learning video-processing

unsupervised-2017-cvprw's Introduction

Disentangling Foreground, Background and Motion Features in Videos

This repo contains the source code for the work named in the title. Please refer to our project webpage or the original paper for more details.

Dataset

This project requires the UCF-101 dataset and its localization annotations (bounding boxes for action regions). Please note that the annotations only cover 24 of the 101 classes; we use only these 24 classes in our experiments.
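The filtering step above can be sketched in a few lines. This is a minimal illustration, not the repo's actual code: the class names shown are placeholders for the real 24-class list that ships with the annotation archive.

```python
# Sketch: keep only videos whose class has localization annotations.
# The class names below are examples; the real set comes from the
# annotation archive (24 of the 101 UCF-101 classes).
ANNOTATED_CLASSES = {"Basketball", "Biking", "Diving"}  # ... 24 in total

def filter_annotated(video_paths):
    """Keep paths like 'Basketball/v_Basketball_g25_c05.avi' whose
    leading directory names an annotated class."""
    return [p for p in video_paths if p.split("/")[0] in ANNOTATED_CLASSES]

videos = [
    "Basketball/v_Basketball_g25_c05.avi",
    "ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01.avi",
]
print(filter_annotated(videos))  # only the annotated Basketball clip remains
```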

Download link

UCF-101: http://crcv.ucf.edu/data/UCF101/UCF101.rar
Annotations (version 1): http://crcv.ucf.edu/ICCV13-Action-Workshop/index.files/UCF101_24Action_Detection_Annotations.zip 

Dataset split

We split the dataset into a training set, a validation set, and a test set. The split lists for each set can be found under the dataset folder.

Generate TF-Records

As we are dealing with videos, using TF-Records in TensorFlow helps reduce I/O overhead. (Please refer to the official documentation if you're not familiar with TF-Records.) Each SequenceExample in our TF-Records includes 32 video frames, the corresponding masks, and so on.

A brief description of how we generate TF-Records from videos and annotations: each video is split into chunks of 32 consecutive frames, and each chunk is saved as one example. The corresponding masks are derived from the localization annotations.
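The chunking step can be sketched as below. Whether the real script drops or pads a trailing remainder shorter than 32 frames is an assumption here; the sketch simply drops it.

```python
# Sketch of the chunking step: split a decoded video (a list of frames)
# into consecutive 32-frame chunks; each chunk becomes one example.
NUM_FRAMES = 32  # frames per SequenceExample, as described above

def split_into_chunks(frames, chunk_len=NUM_FRAMES):
    """Drop the trailing remainder so every chunk has exactly chunk_len frames."""
    n_chunks = len(frames) // chunk_len
    return [frames[i * chunk_len:(i + 1) * chunk_len] for i in range(n_chunks)]

chunks = split_into_chunks(list(range(100)))
print(len(chunks), len(chunks[0]))  # 3 32  (the last 4 frames are dropped)
```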

To generate the TF-Records used by this project, you need to modify certain paths in tools/generate_tfrecords.py, including videos_directory, annotation_directory, and so on. As we use FFmpeg to decode videos, you may want to install it with the command below if you are using Anaconda:

conda install ffmpeg

After installing FFmpeg, you need to specify the path to the FFmpeg executable in tools/ffmpeg_reader.py (usually it's just ~/anaconda/bin/ffmpeg if you are using Anaconda). After specifying the path, you are good to go! Run the script below to generate the TF-Records:
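For intuition, a fixed-rate frame decode with FFmpeg might be assembled like this. This is a hedged sketch: the flags, the output pattern, and the frame rate are assumptions, not the actual arguments used in tools/ffmpeg_reader.py.

```python
import subprocess

# Sketch only: FFMPEG_BIN mirrors the path you set in tools/ffmpeg_reader.py;
# the fps value and output pattern are placeholder choices.
FFMPEG_BIN = "~/anaconda/bin/ffmpeg"

def decode_command(video_path, out_pattern, fps=25):
    """Return an argv list that extracts frames at `fps` frames per second."""
    return [FFMPEG_BIN, "-i", video_path, "-vf", f"fps={fps}", out_pattern]

cmd = decode_command("v_Basketball_g25_c05.avi", "frames/%05d.png")
# subprocess.run(cmd, check=True)  # uncomment once FFMPEG_BIN points at a real binary
```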

python tools/generate_tfrecords.py

Training & Testing

Our training and testing code is organized as follows: the scripts under models/ construct the TensorFlow graph for each model, while the scripts at the top level, named ***_[train|val|test].py, drive the training, validation, and testing of each model.

unsupervised-2017-cvprw's People

Contributors

allenovo, xavigiro


unsupervised-2017-cvprw's Issues

the pretrained model

Your paper is very helpful to me. Could you share the "ckpt" folder containing the pre-trained weights? Thank you very much!

the ffmpeg_reader FLAGS

In ffmpeg_reader.py, the function _load_video_ffmpeg contains the code:
n_frames = FLAGS.num_frames
random_chunk = FLAGS.random_chunks
target_fps = FLAGS.fps
But I cannot find where FLAGS is defined.

Cannot find train files in train_list.txt

After I ran generate_tfrecords.py I got file names like "UCF-24-train-00002-of-00025", but when I run the training script, it looks for the file names listed in train_list.txt (e.g. Basketball/v_Basketball_g25_c05.avi 1), and I get the error below. How should we load the train files instead?

NotFoundError (see above for traceback): ../data/UCF-101-tf-records/Basketball/v_Basketball_g25_c05.avi 1
[[Node: ReaderReadV2_1 = ReaderReadV2[_device="/job:localhost/replica:0/task:0/cpu:0"](TFRecordReaderV2_1, input_producer)]]

About the datasets

It is wonderful work and wonderful code! Could you release the ground-truth masks from the manual annotations? They would be very helpful for me. Thank you very much!
