TOFlow: Video Enhancement with Task-Oriented Flow

License: MIT License

This repository is based on our IJCV publication TOFlow: Video Enhancement with Task-Oriented Flow (PDF). It contains pre-trained models and demo code. It also includes the description and download scripts for the Vimeo-90K dataset we collected. If you use this code or dataset in your work, please cite:

@article{xue2019video,
  title={Video Enhancement with Task-Oriented Flow},
  author={Xue, Tianfan and Chen, Baian and Wu, Jiajun and Wei, Donglai and Freeman, William T},
  journal={International Journal of Computer Vision (IJCV)},
  volume={127},
  number={8},
  pages={1106--1125},
  year={2019},
  publisher={Springer}
}

Video Demo

If you cannot access YouTube, please download 1080p video from here.

Prerequisites

Torch

Our implementation is based on Torch 7 (http://torch.ch).

CUDA [optional]

CUDA is suggested (https://developer.nvidia.com/cuda-toolkit) for fast inference. The demo code is still runnable without CUDA, but much slower.

Matlab [optional]

Generating the video denoising/super-resolution datasets and running the quantitative evaluation require a Matlab installation (https://www.mathworks.com/products/matlab.html). Matlab is not necessary for the demo code.

FFmpeg [optional]

We use FFmpeg (http://ffmpeg.org) to generate the video deblocking dataset. It is not necessary for the demo code.

Installation

Our current release has been tested on Ubuntu 14.04.

Clone the repository

git clone https://github.com/anchen1011/toflow.git

Install dependency

cd toflow/src/stnbhwd
luarocks make

This will install the 'stn' package for Lua. It provides the following components:

require 'stn'
nn.AffineGridGeneratorBHWD(height, width)
-- takes B x 2 x 3 affine transform matrices as input, 
-- outputs a height x width grid in normalized [-1,1] coordinates
-- output layout is B,H,W,2 where the first coordinate in the 4th dimension is y, and the second is x
nn.BilinearSamplerBHWD()
-- takes a table {inputImages, grids} as inputs
-- outputs the interpolated images according to the grids
-- inputImages is a batch of samples in BHWD layout
-- grids is a batch of grids (output of AffineGridGeneratorBHWD)
-- output is also BHWD
nn.AffineTransformMatrixGenerator(useRotation, useScale, useTranslation)
-- takes a B x nbParams tensor as inputs
-- nbParams depends on the constrained transformation
-- The parameters for the selected transformation(s) should be supplied in the
-- following order: rotationAngle, scaleFactor, translationX, translationY
-- If no transformation is specified, it generates a generic affine transformation (nbParams = 6)
-- outputs B x 2 x 3 affine transform matrices
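
To make the grid and sampler conventions above concrete, here is a minimal Python sketch (not part of this repository, and not the 'stn' implementation). It builds an H x W grid of (y, x) points in [-1, 1] from one 2 x 3 matrix and samples a single-channel image bilinearly. It assumes that each matrix row acts on the vector (y, x, 1), that the corner pixels map exactly to -1 and 1, and that samples falling outside the image contribute zero; none of these details are confirmed by the text above. The function names are hypothetical.

```python
import math


def affine_grid(theta, height, width):
    """Return an H x W grid of (y, x) points in [-1, 1] for one 2x3 matrix.

    Assumption: each matrix row acts on the column vector (y, x, 1),
    matching the y-first output layout described in the README.
    """
    grid = []
    for i in range(height):
        # Map the row index to [-1, 1]; a single row sits at the centre.
        y = -1.0 + 2.0 * i / (height - 1) if height > 1 else 0.0
        row = []
        for j in range(width):
            x = -1.0 + 2.0 * j / (width - 1) if width > 1 else 0.0
            out_y = theta[0][0] * y + theta[0][1] * x + theta[0][2]
            out_x = theta[1][0] * y + theta[1][1] * x + theta[1][2]
            row.append((out_y, out_x))
        grid.append(row)
    return grid


def bilinear_sample(image, grid):
    """Sample a single-channel image (list of rows) at the grid points."""
    h, w = len(image), len(image[0])
    out = []
    for grid_row in grid:
        out_row = []
        for gy, gx in grid_row:
            # Convert normalized coordinates back to pixel coordinates.
            py = (gy + 1.0) * (h - 1) / 2.0
            px = (gx + 1.0) * (w - 1) / 2.0
            y0, x0 = int(math.floor(py)), int(math.floor(px))
            wy, wx = py - y0, px - x0
            corners = (
                (0, 0, (1 - wy) * (1 - wx)),
                (0, 1, (1 - wy) * wx),
                (1, 0, wy * (1 - wx)),
                (1, 1, wy * wx),
            )
            value = 0.0
            for dy, dx, weight in corners:
                yy, xx = y0 + dy, x0 + dx
                # Assumption: out-of-range neighbours contribute zero.
                if 0 <= yy < h and 0 <= xx < w:
                    value += weight * image[yy][xx]
            out_row.append(value)
        out.append(out_row)
    return out


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
image = [[1, 2], [3, 4]]
resampled = bilinear_sample(image, affine_grid(identity, 3, 3))
```

With the identity matrix, a 3 x 3 grid over a 2 x 2 image places the centre sample midway between all four pixels, so it is their average.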

Download pretrained models (104MB)

cd ../../
./download_models.sh

Run Demo Code

cd src
th demo.lua -mode interp -inpath ../data/example/low_frame_rate
th demo.lua -mode denoise -inpath ../data/example/noisy
th demo.lua -mode deblock -inpath ../data/example/block
th demo.lua -mode sr -inpath ../data/example/blur

There are a few options in demo.lua:

nocuda: Set this option when CUDA is not available.

gpuId: GPU device ID.

mode: There are four options:

'interp': temporal frame interpolation
'denoise': video denoising
'deblock': video deblocking
'sr': video super-resolution

inpath: The path to the input sequence.

outpath: The path where the results are stored (default is ../demo_output).
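
For scripted runs, the options above can be assembled into a command line. The Python sketch below is only an illustration and is not part of this repository. The -mode, -inpath and -outpath spellings appear in commands on this page; the -gpuId and -nocuda spellings are assumptions derived from the option names, and treating nocuda as a bare switch is also an assumption. The helper name build_demo_command is hypothetical.

```python
import shlex

VALID_MODES = ("interp", "denoise", "deblock", "sr")


def build_demo_command(mode, inpath, outpath=None, gpu_id=None, nocuda=False):
    """Build a `th demo.lua ...` command string from the documented options."""
    if mode not in VALID_MODES:
        raise ValueError("mode must be one of %s" % (VALID_MODES,))
    args = ["th", "demo.lua", "-mode", mode, "-inpath", inpath]
    if outpath is not None:
        args += ["-outpath", outpath]
    if gpu_id is not None:
        # Assumed flag spelling, derived from the 'gpuId' option name.
        args += ["-gpuId", str(gpu_id)]
    if nocuda:
        # Assumed to be a bare switch without a value.
        args.append("-nocuda")
    # Quote each argument so paths with spaces survive a shell.
    return " ".join(shlex.quote(arg) for arg in args)


command = build_demo_command("denoise", "../data/example/noisy")
```

The resulting string can be printed for inspection before it is run from the src directory.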

Vimeo-90K Dataset

We also build a large-scale, high-quality video dataset, Vimeo-90K, designed for the following four video processing tasks: temporal frame interpolation, video denoising, video deblocking, and video super-resolution.

Vimeo-90K is built upon 5,846 selected videos downloaded from vimeo.com, which cover a large variety of scenes and actions. This video set is a subset of the AoT dataset, and all video links are here.

We further cut these videos into 89,800 video clips and build two datasets from these clips:

Triplet dataset for temporal frame interpolation

The triplet dataset consists of 73171 3-frame sequences with a fixed resolution of 448 x 256, extracted from 15k selected video clips from Vimeo-90K. This dataset is designed for temporal frame interpolation. Download links are:

Test set only: zip (1.7GB).

Both training and test set: zip (33GB).

Septuplet dataset for video denoising, super-resolution, and deblocking

The septuplet dataset consists of 91701 7-frame sequences with a fixed resolution of 448 x 256, extracted from 39k selected video clips from Vimeo-90K. This dataset is designed for video denoising, deblocking, and super-resolution.

The test set for video denoising: zip (16GB).

The test set for video deblocking: zip (11GB).

The test set for video super-resolution: zip (6GB).

The original test set (not downsampled or downgraded by noise): zip (15GB).

The original training + test set (consists of 91701 sequences, which are not downsampled or downgraded by noise): zip (82GB).

Generate Testing Sequences

See src/generate_testing_sample for the functions to generate noisy/low-resolution sequences.

To generate noisy sequences with Matlab under src/generate_testing_sample, run

add_noise_to_input(data_path, output_path);

and the results will be stored under output_path

To generate blur sequences with Matlab, run

blur_input(data_path, output_path);

and the results will be stored under output_path

Blocky sequences are compressed by FFmpeg. Our test set is generated with the following configuration:

ffmpeg -i *.png -q 20 -vcodec jpeg2000 -format j2k name.mov

Run Quantitative Evaluation

Download all four Vimeo testsets (52G)

./download_testset.sh

Run inference on Vimeo testsets

cd src
th demo_vimeo90k.lua -mode interp
th demo_vimeo90k.lua -mode denoise
th demo_vimeo90k.lua -mode deblock
th demo_vimeo90k.lua -mode sr

Evaluation

We use three metrics to evaluate the performance of our algorithm: PSNR, SSIM, and Abs. To run the evaluation, execute the following commands in Matlab:

cd src/evaluation
evaluate(output_dir, target_dir);

For example, to evaluate results generated in the previous step, run

cd src/evaluation
evaluate('../../output/interp', '../../data/vimeo_interp_test/target', 'interp')
evaluate('../../output/denoise', '../../data/vimeo_test_clean/sequences', 'denoise')
evaluate('../../output/deblock', '../../data/vimeo_test_clean/sequences', 'deblock')
evaluate('../../output/sr', '../../data/vimeo_test_clean/sequences', 'sr')

It is assumed that our datasets are unzipped under data/ and not renamed. It is also assumed that results are put under [output_root]/[task_name] e.g. output/sr output/interp output/denoise output/deblock, with exactly the same subfolder structure as our datasets.
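
For readers who want to sanity-check numbers outside Matlab, here is a minimal Python sketch of two of the metrics named above. It is not the code in src/evaluation. It assumes 8-bit images flattened to lists of pixel values, a peak value of 255 for PSNR, and that "Abs" means the mean absolute difference; the source does not spell out either definition. SSIM is omitted because it needs windowed statistics.

```python
import math


def _check(a, b):
    """Both inputs must be non-empty and of equal length."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and the same length")


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel lists."""
    _check(a, b)
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(peak ** 2 / mse)


def mean_abs_error(a, b):
    """Mean absolute difference (assumed meaning of the 'Abs' metric)."""
    _check(a, b)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


output = [10, 20, 30, 40]
target = [13, 16, 30, 40]
print(psnr(output, target), mean_abs_error(output, target))
```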

References

Our warping code is based on qassemoquab/stnbhwd.
Our flow utilities and transformation utilities are based on anuragranj/spynet.
There is an unofficial PyTorch implementation by coldog2333/pytoflow.

toflow's Issues

structure about the SPN net

@anchen1011 Hi, I am confused about the details of the SPN network. I referred to the spatial transformer network and tried to find the localisation net in your project but failed. Can you share your SPN structures? Thanks!

STN net

Running toflow on consecutive frames

I would like to run your system inference on consecutive frames coming directly from the network, rather than on a stored video file on the disk. Is it possible? How would you suggest to do it?
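
One generic way to approach this, independent of TOFlow itself, is to buffer incoming frames into fixed-length windows and hand each window to the model. The Python sketch below only shows the buffering; it does not call demo.lua, which reads its input from the path given by -inpath. The default window length of 7 is taken from the septuplet dataset description, and whether a given model expects that length is an assumption.

```python
from collections import deque


def sliding_windows(frames, size=7):
    """Yield overlapping windows of `size` consecutive frames from a stream."""
    if size < 1:
        raise ValueError("size must be positive")
    window = deque(maxlen=size)  # oldest frame is dropped automatically
    for frame in frames:
        window.append(frame)
        if len(window) == size:
            yield list(window)


# Frames are stand-in integers here; in practice they would be images.
windows = list(sliding_windows(range(5), size=3))
```

Each yielded window could then be written to a temporary folder and processed, or passed to a model loaded in the same process.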

Output at higher resolution

Hi, I have several high resolution footages (4240x2382) that would like to be tested on this algorithm.

What are the options to get the same resolution as output?

What's dsr_imresize?

Hi,
In src/generate_testing_sample/blur_input.m, you use a custom function dsr_imresize.p (which is a binary file) to downsample and upsample images. But you mention that you use MATLAB imresize function to generate LR images in your paper Section 5. I check the output of MATLAB imresize and your dsr_imresize. There are some small differences between the LR images (from the same HR image).

What's the difference between dsr_imresize and MATLAB imresize? Why not using MATLAB built-in imresize function?

Issue in SSIM implementation

Hey, you have amazing work. I am facing an issue with the SSIM score you calculate: reshaping the image before passing it to the SSIM function is not consistent with the original implementation by the authors of SSIM. Please see the code below to reproduce.

% LR image available here 'https://raw.githubusercontent.com/mugheesahmad/Fun_testing/master/LR0000001.jpg' 
% HR image available here 'https://raw.githubusercontent.com/mugheesahmad/Fun_testing/master/HR0000001.jpg' 
lr = imread('LR0000001.jpg');
hr = imread('HR0000001.jpg');
ssim(hr, lr) %colored image
% ans = 0.8433
ssim(rgb2gray(hr), rgb2gray(lr)) %builtin MATLAB function
% ans =    0.7570
original_ssim(rgb2gray(hr), rgb2gray(lr)) %author implementation available here https://ece.uwaterloo.ca/~z70wang/research/ssim/
% original implementation doesnot accept the RGB image
% ans =    0.7574 
lru = reshape(lr, [380*672,3]);  %your way of doing
hru = reshape(hr, [380*672,3]);
ssim(lru, hru)
%ans = 86.70

As per the Matlab SSIM docs, only grayscale images can be passed. Your way of using it is not consistent with the original implementation, nor with the SSIM implementations in skimage and PyTorch. See this and this Colab file.

Loading the model:

To reproduce the results of the paper on video denoising, I want to load the pre-trained model to evaluate on the test set of Vimeo-90 dataset. But when I reload the model using torch7, I have encountered the following problem:

Warning: Failed to load function from bytecode: binary string: not a precompiled chunk
Warning: Failed to load function from bytecode: [string ""]:1: unexpected symbol near char(5)
/home/mars/torch/install/bin/lua: /home/mars/torch/install/share/lua/5.2/torch/File.lua:375: unknown object
stack traceback:
[C]: in function 'error'
/home/mars/torch/install/share/lua/5.2/torch/File.lua:375: in function 'readObject'
/home/mars/torch/install/share/lua/5.2/torch/File.lua:307: in function 'readObject'
/home/mars/torch/install/share/lua/5.2/torch/File.lua:369: in function 'readObject'
/home/mars/torch/install/share/lua/5.2/nn/Module.lua:192: in function 'read'
...rs/torch/install/share/lua/5.2/nn/BatchNormalization.lua:185: in function 'read'
/home/mars/torch/install/share/lua/5.2/torch/File.lua:351: in function 'readObject'
/home/mars/torch/install/share/lua/5.2/torch/File.lua:369: in function 'readObject'
/home/mars/torch/install/share/lua/5.2/torch/File.lua:369: in function 'readObject'
/home/mars/torch/install/share/lua/5.2/nn/Module.lua:192: in function 'read'
...
/home/mars/torch/install/share/lua/5.2/nn/Module.lua:192: in function 'read'
/home/mars/torch/install/share/lua/5.2/torch/File.lua:351: in function 'readObject'
/home/mars/torch/install/share/lua/5.2/torch/File.lua:369: in function 'readObject'
/home/mars/torch/install/share/lua/5.2/torch/File.lua:369: in function 'readObject'
/home/mars/torch/install/share/lua/5.2/nn/Module.lua:192: in function 'read'
/home/mars/torch/install/share/lua/5.2/torch/File.lua:351: in function 'readObject'
/home/mars/torch/install/share/lua/5.2/torch/File.lua:409: in function 'load'
demo.lua:65: in main chunk
[C]: in function 'dofile'
...mars/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: in ?
Could you help me check if the pre-trained model has any problem? @anchen1011

Confusion on Image Transformation Module

Hi, according to Section 4.3 of the paper, an STN layer is used for transformation estimation. It seems the implementation used in this repo only supports affine transformations. Does that mean the "warped input" in Fig. 2 is just an affine transformation of the "Input frames"?

Using ffmpeg to generate image sequence?

Hello there,

In the readme.md, the line ffmpeg -i *.png -q 20 -vcodec jpeg2000 -format j2k name.mov can generate a video with extension MOV.

I just wonder whether it is possible to directly generate a sequence of images.

Thank you!

What is the number of parameters for this algorithm?

Hi, @anchen1011

What is the number of parameters for the interpolation model? We would like to compare your model with ours.

Thanks.

runtime crash

I'm trying to run this using this command:

th demo.lua -mode denoise -inpath ../data/example/noisy -outpath ./out

And I'm getting this runtime error:

/torch/install/bin/luajit: ./main/loader.lua:27: argument 1 expected a 'string', got a 'nil'
stack traceback:
	./main/loader.lua:27: in function 'gen_path'
	demo.lua:47: in main chunk

divide vimeo90K dataset into training set and test set

Thank you very much for your work. It is extremely good work.
I have a question about how to divide the Vimeo-90K dataset into a training set and LR images.

I have seen that there is a get_path_sep.m in src/evaluation and a sep_trainlist.txt in data.
How should I call them to generate the corresponding training set?

Thanks a lot.

module 'libcustn' not found:No LuaRocks module found for libcustn

Hi, I am running your demo code but I meet this error.
module 'libcustn' not found:No LuaRocks module found for libcustn
I tried to use luarocks install libcustn. It doesn't work. What is libcustn and how do I install it? Please help me.

Resolution mismatching for interpolation

Thank you so much for sharing your code.
I use TOFLow to interpolate 2k images, their resolution is 2560x1440.
16 can divide 2560 and 16 can also divide 1440.
However, after doing interpolation, the resolution of my result is 2096x1184. How could I resolve this problem?
Thank you.

resolution mismatch for interp

thank you for sharing your code as well as your dataset.

i would like to evaluate your interp implementation. however, given an input of size 960x540 the output will be of size 960x528 instead. my intermediate solution is to rescale the output using bicubic interpolation such that it matches the input again. what is your suggestion in this regard, such that the evaluation is being done fairly? thanks!
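
The 960x540 to 960x528 report is consistent with the height being cropped down to a multiple of 16 (540 becomes 528), although the source does not confirm that this is what the code does, and the 2560x1440 to 2096x1184 report above is not explained by it. Under that assumption, a common workaround is to pad the input up to the next multiple of 16 and crop the output back, instead of rescaling. The Python sketch below only computes the sizes involved; the function names and the default multiple of 16 are assumptions.

```python
def floor_to_multiple(n, m=16):
    """Largest multiple of m that is <= n."""
    return (n // m) * m


def ceil_to_multiple(n, m=16):
    """Smallest multiple of m that is >= n."""
    return -(-n // m) * m


def padding_for(height, width, m=16):
    """Return (pad_bottom, pad_right) needed to reach multiples of m."""
    return ceil_to_multiple(height, m) - height, ceil_to_multiple(width, m) - width


# 540 rows would be cropped to 528, or padded by 4 rows to 544.
print(floor_to_multiple(540), padding_for(540, 960))
```

After inference on the padded frames, removing the same number of rows and columns from the bottom and right of the output restores the original size without resampling.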

Pre-training the flow estimation network

Hi, @anchen1011. I pre-trained the flownet on the Sintel dataset, but it does not converge. The batch size is 16 and the learning rate is 0.0001; the loss is the L1 difference between the last sub-net's output and the ground truth. Can you share the details about pre-training the flownet?

the noise level of the pretrained denoise model

Hi @anchen1011 thanks for your sharing the pretrained model, the denoising effect is very impressive!

I want to train the denoise model based on my own dataset.
I noticed that there is a file add_noise_to_input to generate the "test" sequences. May I ask whether the same file (with the same noise variance) was used to generate the "train" sequences?

Thanks!

Not all septuplets listed in "sep_testlist.txt" and "sep_trainlist.txt".

Hello Tianfan,
Thank you for sharing the dataset and code. I downloaded the full septuplet dataset (training + test data) and found that "sep_testlist.txt" and "sep_trainlist.txt" contain only 72,436 septuplets instead of 91,701. However, the folder with septuplets contains all 91,701 septuplets. Why do not all septuplets appear in "sep_testlist.txt" and "sep_trainlist.txt"?
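
A mismatch like this can be checked by comparing the two list files with the folders on disk. The Python sketch below is an illustration only. It assumes that both list files sit next to a sequences folder, that each list line has the form video/clip (for example 00001/0001), and that clips are stored as sequences/video/clip; the list file names come from this issue, but the entry format and layout are assumptions.

```python
from pathlib import Path


def read_list(path):
    """Read one list file into a set of non-empty, stripped entries."""
    lines = Path(path).read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}


def unlisted_sequences(root, list_names=("sep_trainlist.txt", "sep_testlist.txt")):
    """Return clips found under root/sequences that no list file mentions."""
    root = Path(root)
    listed = set()
    for name in list_names:
        listed |= read_list(root / name)
    on_disk = {
        "%s/%s" % (clip.parent.name, clip.name)
        for clip in (root / "sequences").glob("*/*")
        if clip.is_dir()
    }
    return sorted(on_disk - listed)
```

Calling unlisted_sequences on the unzipped dataset folder would list the clips that appear in neither file.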

About Vimeo Dataset.

What is the frame rate for the videos in the dataset? Also, what is the sampling frequency for the triplet and septuplet datasets?

2x model

Hi,

Thank you for the great code! In the code, the model is actually a 4x super-resolution model. Is it possible to use the proposed algorithm for 2x? Thank you!

Best,
Yongcheng

Code cleaning

Just a test of how to use issues in github

License of Vimeo-90K?

Hi, thank you for your great work!

I would like to know the license of Vimeo-90K.
I know the license for the code is MIT, but I wonder if same license is applied to Vimeo-90K dataset.

Thank you!

About the box down-sample kernel

Can the box down-sample kernel mentioned in your paper be implemented with the function imresize(img_hr, 1/up_scale, 'box')? I'm not sure whether they are the same.
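
For reference, a box kernel in its simplest form averages each non-overlapping scale x scale block of pixels. The Python sketch below shows that definition on a single-channel image stored as a list of rows. Whether it matches the kernel used in the paper, or the output of imresize with the 'box' option, is exactly the open question here and is not confirmed by the source; the function name is hypothetical.

```python
def box_downsample(image, scale):
    """Average non-overlapping scale x scale blocks of a single-channel image."""
    h, w = len(image), len(image[0])
    if scale < 1 or h % scale or w % scale:
        raise ValueError("image size must be a multiple of scale")
    out = []
    for i in range(0, h, scale):
        row = []
        for j in range(0, w, scale):
            block = [
                image[y][x]
                for y in range(i, i + scale)
                for x in range(j, j + scale)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


small = box_downsample([[1, 3, 0, 0], [5, 7, 0, 8]], 2)
```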

thanks~

demo.lua fails to run: cuDNN not found

demo.lua does not run after installing CUDA and cuDNN. I also ran 'luarocks install cudnn'.
$ th demo.lua -mode denoise -inpath ../data/example/noisy
==> initializing...
/home/xhuv/torch/install/bin/luajit: /home/xhuv/torch/install/share/lua/5.1/trepl/init.lua:389: /home/xhuv/torch/install/share/lua/5.1/trepl/init.lua:389: /home/xhuv/torch/install/share/lua/5.1/trepl/init.lua:389: /home/xhuv/torch/install/share/lua/5.1/cudnn/ffi.lua:1603: 'libcudnn (R5) not found in library path.
Please install CuDNN from https://developer.nvidia.com/cuDNN
Then make sure files named as libcudnn.so.5 or libcudnn.5.dylib are placed in
your library load path (for example /usr/local/lib , or manually add a path to LD_LIBRARY_PATH)

Alternatively, set the path to libcudnn.so.5 or libcudnn.5.dylib
to the environment variable CUDNN_PATH and rerun torch.
For example: export CUDNN_PATH="/usr/local/cuda/lib64/libcudnn.so.5"

stack traceback:
[C]: in function 'error'
/home/xhuv/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
demo.lua:20: in main chunk
[C]: in function 'dofile'
...xhuv/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
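
The error message names two places to look: the CUDNN_PATH variable and the directories in LD_LIBRARY_PATH. The Python sketch below checks only those two environment variables for a given file name. It is a rough diagnostic under that assumption and does not reproduce how the cudnn Lua package or the system loader actually resolves libraries (system default locations are not searched). The function name is hypothetical.

```python
import os


def find_in_library_path(filename, env=None):
    """Return the full path of `filename` found via CUDNN_PATH or LD_LIBRARY_PATH.

    Returns None when neither variable leads to an existing file.
    """
    env = os.environ if env is None else env
    explicit = env.get("CUDNN_PATH")
    if explicit and os.path.basename(explicit) == filename and os.path.isfile(explicit):
        return explicit
    for directory in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, filename)
        # Skip empty entries produced by leading or trailing separators.
        if directory and os.path.isfile(candidate):
            return candidate
    return None


print(find_in_library_path("libcudnn.so.5"))
```

If this prints None, the file is not reachable through either variable in the current shell.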

septuplet dataset 502

Hi, the 82G http://data.csail.mit.edu/tofu/dataset/vimeo_septuplet.zip got 502

Output of SR is much smaller than the input (not the 8x problem)

Thank you for sharing your code as well as your dataset.

I would like to run the SR implementation. Given an input of size 1920x1072, the output is 1920x1072. However, when the input is 3840x2144 (which satisfies the multiple-of-8 requirement), the output is 2112x1168 instead, and the W/H ratio also changes.

What are the options to get the same resolution as output? thanks!

vimeo-90k

Hi @anchen1011 ,

Thank you for your work.
I wonder, can I get original video clips from your Vimeo-90k dataset?
Or is it possible for you to create longer clips? for example 30-frame sequences

add noise to input seems to have something wrong

Hi, I found that in line 5 of your add_noise.m file the noise is uint8(random('norm',0,25.5,h,w,3)) instead of random('norm',0,25.5,h,w,3). Why is that? Is there anything wrong?
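
The concern can be illustrated numerically. In MATLAB, converting to uint8 rounds to the nearest integer and saturates to [0, 255], so negative noise samples become 0 when the noise is cast before it is added. The Python sketch below mimics that behaviour and compares the average brightness shift of casting the noise first against casting only the final sum. It assumes the expression is as quoted in this issue and uses a mid-gray pixel of 128; it does not reproduce the repository's add_noise.m, and the function names are hypothetical.

```python
import math
import random


def to_uint8(value):
    """Mimic MATLAB's uint8(): round to nearest and saturate to [0, 255]."""
    return max(0, min(255, int(math.floor(value + 0.5))))


def mean_bias(cast_noise_first, pixel=128, sigma=25.5, samples=20000, seed=0):
    """Average change in pixel value after adding Gaussian noise."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        noise = rng.gauss(0.0, sigma)
        if cast_noise_first:
            # Negative noise is clamped to 0 before the addition.
            noisy = to_uint8(pixel + to_uint8(noise))
        else:
            # Noise keeps its sign; only the final sum is cast.
            noisy = to_uint8(pixel + noise)
        total += noisy - pixel
    return total / samples


print(mean_bias(True), mean_bias(False))
```

Casting the noise first yields a clearly positive average shift (the noise can only brighten the pixel), while casting the sum keeps the average shift close to zero.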

Looking forward to the training code.

Great work. Do you have any plans to release the training code?

the results of sr model is not correct

I have tested the sr model with the command line th demo.lua -mode sr -inpath ../data/example/low_resolution/ but it does not produce correct results; the output looks like a blurred version of the input frame. When I test with th demo.lua, denoising gives correct results. Could you help me fix the bug in the sr model?
