mvs-using-cnn-and-lstm's Introduction

MVS-using-CNN-and-LSTM

Cloud-Assisted Multi-View Video Summarization using CNN and Bi-Directional LSTM

Paper

https://ieeexplore.ieee.org/abstract/document/8765236

The package contains the following three files.

1. FullDatasetFeatures.py

It contains the code to extract features from the overall dataset.

2. LSTM.py

It contains code to train the extracted features using Bi-directional LSTM. Splitting the code into train, test and validation is already included in this code.

3. OneFileOnlineTest.py

This is to test one video file on the trained model and write its output video.

Notice

We are really sorry to have the raw code, but to ensure the availability of an easy and understandable code we need some time. Feel free to contact me at [email protected] if you have any queries or you will be okay with raw format code. Thanks

Demo Results

Dummy link:

https://www.youtube.com/watch?v=aHvTtb8MbnQ

Contact Me

If you feel any difficulty or errors, please refer to tanveerkhattak37973[at][gmail]

Citation


@article{hussain2019cloud,
  title={Cloud-assisted multiview video summarization using CNN and bidirectional LSTM},
  author={Hussain, Tanveer and Muhammad, Khan and Ullah, Amin and Cao, Zehong and Baik, Sung Wook and de Albuquerque, Victor Hugo C},
  journal={IEEE Transactions on Industrial Informatics},
  volume={16},
  number={1},
  pages={77--86},
  year={2019},
  publisher={IEEE}
}

If you are interested in Multi-view Video Summarization domain you may want to read our recent survey:


@article{hussain2020comprehensive,
  title={A comprehensive survey of multi-view video summarization},
  author={Hussain, Tanveer and Muhammad, Khan and Ding, Weiping and Lloret, Jaime and Baik, Sung Wook and de Albuquerque, Victor Hugo C},
  journal={Pattern Recognition},
  volume={109},
  pages={107567},
  year={2020},
  publisher={Elsevier}
}

My further research [sorted year-wise] on Video Summarization (single- and multi-view video summarization) domain is as follows:

Multi-view Video Summarization


@ARTICLE{9208765,
  author={T. {Hussain} and K. {Muhammad} and A. {Ullah} and J. {Del Ser} and A. H. {Gandomi} and M. {Sajjad} and S. W. {Baik} and V. H. C. {de Albuquerque}},
  journal={IEEE Internet of Things Journal}, 
  title={Multi-View Summarization and Activity Recognition Meet Edge Computing in IoT Environments}, 
  year={2020},
  volume={},
  number={},
  pages={1-1},
  doi={10.1109/JIOT.2020.3027483}}
@article{hussain2019intelligent,
  title={Intelligent Embedded Vision for Summarization of Multiview Videos in IIoT},
  author={Hussain, Tanveer and Muhammad, Khan and Del Ser, Javier and Baik, Sung Wook and de Albuquerque, Victor Hugo C},
  journal={IEEE Transactions on Industrial Informatics},
  volume={16},
  number={4},
  pages={2592--2602},
  year={2019},
  publisher={IEEE}
}
@article{hussain2019cloud,
  title={Cloud-assisted multiview video summarization using CNN and bidirectional LSTM},
  author={Hussain, Tanveer and Muhammad, Khan and Ullah, Amin and Cao, Zehong and Baik, Sung Wook and de Albuquerque, Victor Hugo C},
  journal={IEEE Transactions on Industrial Informatics},
  volume={16},
  number={1},
  pages={77--86},
  year={2019},
  publisher={IEEE}
}
@article{후세인2018구조적인,
  title={구조적인 유사성에 기반한 다중 뷰 비디오의 효율적인 키프레임 추출},
  author={후세인 and 탄베르 and 칸살만 and 이미영 and 백성욱 and others},
  journal={한국차세대컴퓨팅학회 논문지},
  volume={14},
  number={6},
  pages={7--14},
  year={2018}
}

Single-view Video Summarization


@article{muhammad2020efficient,
  title={Efficient and Privacy Preserving Video Transmission in 5G-Enabled IoT Surveillance Networks: Current Challenges and Future Directions},
  author={Muhammad, Khan and Hussain, Tanveer and Rodrigues, Joel JPC and Bellavista, Paolo and de Macedo, Antonio Roberto L and de Albuquerque, Victor Hugo C},
  journal={IEEE Network},
  year={2020},
  publisher={IEEE}
}
@article{muhammad2020efficient,
  title={Efficient CNN based summarization of surveillance videos for resource-constrained devices},
  author={Muhammad, Khan and Hussain, Tanveer and Baik, Sung Wook},
  journal={Pattern Recognition Letters},
  volume={130},
  pages={370--375},
  year={2020},
  publisher={Elsevier}
}
@article{muhammad2019deepres,
  title={DeepReS: A Deep Learning-Based Video Summarization Strategy for Resource-Constrained Industrial Surveillance Scenarios},
  author={Muhammad, Khan and Hussain, Tanveer and Del Ser, Javier and Palade, Vasile and De Albuquerque, Victor Hugo C},
  journal={IEEE Transactions on Industrial Informatics},
  volume={16},
  number={9},
  pages={5938--5947},
  year={2019},
  publisher={IEEE}
}
@article{muhammad2019cost,
  title={Cost-effective video summarization using deep CNN with hierarchical weighted fusion for IoT surveillance networks},
  author={Muhammad, Khan and Hussain, Tanveer and Tanveer, Muhammad and Sannino, Giovanna and de Albuquerque, Victor Hugo C},
  journal={IEEE Internet of Things Journal},
  volume={7},
  number={5},
  pages={4455--4463},
  year={2019},
  publisher={IEEE}
}

mvs-using-cnn-and-lstm's People

Contributors

Stargazers

mvs-using-cnn-and-lstm's Issues

Check failed: fd != -1 (-1 vs. -1) File not found: imagenet_mean.binaryproto

I use Ubuntu 20.04, Caffe version = 1.0.0-rc3 and OpenCV version = 4.4.0

When running FullDatasetFeatures.py file, I get the following error:

/usr/bin/python3.8 /home/sumaya/Downloads/Models/MVS-using-CNN-and-LSTM-master/FullDatasetFeatures.py
/usr/local/lib/python3.8/dist-packages/skimage/io/manage_plugins.py:23: UserWarning: Your installed pillow version is < 7.1.0. Several security issues (CVE-2020-11538, CVE-2020-10379, CVE-2020-10994, CVE-2020-10177) have been fixed in pillow 7.1.0 or higher. We recommend to upgrade this library.
from .collection import imread_collection_wrapper
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0416 09:02:30.208113 14341 _caffe.cpp:122] DEPRECATION WARNING - deprecated use of Python interface
W0416 09:02:30.208173 14341 _caffe.cpp:123] Use this instead (with the named "weights" parameter):
W0416 09:02:30.208176 14341 _caffe.cpp:125] Net('Models/alexnet.prototxt', 1, weights='Models/bvlc_alexnet.caffemodel')
I0416 09:02:30.208559 14341 net.cpp:322] The NetState phase (1) differed from the phase (0) specified by a rule in layer data
I0416 09:02:30.208578 14341 net.cpp:58] Initializing net from parameters:
name: "AlexNet"
state {
phase: TEST
level: 0
}
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mirror: false
crop_size: 227
mean_file: "/root/jinhou/DeepLearning/CAFFE/caffe/examples/imagenet/data/ilsvrc12/imagenet_mean.binaryproto"
}
data_param {
source: "/root/jinhou/DeepLearning/data/caffe/lmdb/ilsvrc12_val_lmdb"
batch_size: 50
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "norm2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "pool2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc8"
type: "InnerProduct"
bottom: "fc7"
top: "fc8"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 1000
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "fc8"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc8"
bottom: "label"
top: "loss"
}
I0416 09:02:30.208882 14341 layer_factory.hpp:77] Creating layer data
I0416 09:02:30.209121 14341 net.cpp:100] Creating Layer data
I0416 09:02:30.209131 14341 net.cpp:408] data -> data
I0416 09:02:30.209147 14341 net.cpp:408] data -> label
I0416 09:02:30.209158 14341 data_transformer.cpp:27] Loading mean file from: /root/jinhou/DeepLearning/CAFFE/caffe/examples/imagenet/data/ilsvrc12/imagenet_mean.binaryproto
F0416 09:02:30.209172 14341 io.cpp:63] Check failed: fd != -1 (-1 vs. -1) File not found: /root/jinhou/DeepLearning/CAFFE/caffe/examples/imagenet/data/ilsvrc12/imagenet_mean.binaryproto
*** Check failure stack trace: ***

Process finished with exit code 134 (interrupted by signal 6: SIGABRT)

What is the problem?

tanveer-hussain / mvs-using-cnn-and-lstm Goto Github PK