danier97 / st-mfnet

[IEEE/CVF CVPR'2022] "ST-MFNet: A Spatio-Temporal Multi-Flow Network for Frame Interpolation", Duolikun Danier, Fan Zhang, David Bull

License: MIT License

Python 100.00%
deep-learning pytorch video-frame-interpolation

st-mfnet's People

Contributors

danielism97, danier97, zsxkib

st-mfnet's Issues

CUDA out of memory

Hi @danielism97, I have an RTX 2060 with 6 GB of VRAM, but inference needs more memory than that. How can I run inference on shorter sequences so that it fits on a lower-memory GPU?
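
One general way to lower peak memory, independent of sequence length, is to process each frame in overlapping spatial blocks. A log in another report on this page mentions block-wise evaluation with a block size, overlap and batch size. The following is only a minimal sketch of how block origins could be computed along one axis; the helper tile_starts is hypothetical and is not the repository's implementation.

```python
def tile_starts(length, block, overlap):
    """Return start offsets of blocks of size `block` that cover [0, length).

    Neighbouring blocks share at least `overlap` pixels; the last block is
    shifted back so that it ends exactly at `length`.
    """
    if block >= length:
        return [0]
    if not 0 <= overlap < block:
        raise ValueError("overlap must satisfy 0 <= overlap < block")
    step = block - overlap
    starts = list(range(0, length - block, step))
    starts.append(length - block)  # final block sits flush with the border
    return starts


# Hypothetical example: a 1920x1080 frame, 256-pixel blocks, 4-pixel overlap.
xs = tile_starts(1920, 256, 4)
ys = tile_starts(1080, 256, 4)
print(len(xs) * len(ys), "blocks per frame")
```

Smaller blocks and a smaller batch of blocks reduce peak memory at the cost of more forward passes and possible seams at block borders.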

Perceptual Loss

Thanks for sharing your amazing work.

In the paper you mention that "adversarial loss for the generator is combined with the Laplacian pyramid loss to form the perceptual loss for ST-MFNet fine-tuning". Could you please tell us the value of lambda that was used, and after how many epochs of training with only the Laplacian pyramid loss this perceptual loss term was introduced?

Thanks

4x or 8x interpolation?

Thanks for sharing this great work!

Does interpolate_yuv.py only allow for a 2x interpolation? How can I do a 4x or 8x interpolation?
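
A common way to reach 4x or 8x with a model that only produces the middle frame is to apply 2x interpolation repeatedly. The sketch below assumes a callable interpolate(f0, f1, f2, f3) that returns the frame midway between f1 and f2, mirroring the model(frame0, frame1, frame2, frame3) call visible in a traceback elsewhere on this page. The helper names and the edge handling (repeating the first and last frames) are assumptions, not confirmed behaviour of interpolate_yuv.py.

```python
def double_rate(frames, interpolate):
    """Insert one interpolated frame between every pair of neighbours."""
    if len(frames) < 2:
        return list(frames)
    # Repeat the border frames so every pair has a neighbour on each side.
    padded = [frames[0]] + list(frames) + [frames[-1]]
    out = []
    for i in range(1, len(padded) - 2):
        f0, f1, f2, f3 = padded[i - 1:i + 3]
        out.append(f1)
        out.append(interpolate(f0, f1, f2, f3))
    out.append(frames[-1])
    return out


def upsample(frames, factor, interpolate):
    """Apply 2x interpolation repeatedly; `factor` must be a power of two."""
    if factor < 1 or factor & (factor - 1):
        raise ValueError("factor must be a power of two")
    while factor > 1:
        frames = double_rate(frames, interpolate)
        factor //= 2
    return frames


# Stand-in for the network: the average of the two middle "frames",
# which are plain numbers here so the example runs without a model.
def fake_interpolate(f0, f1, f2, f3):
    return (f1 + f2) / 2


result = upsample([0.0, 4.0, 8.0], 4, fake_interpolate)
print(result)
```

With this scheme, frames produced in the first pass become inputs to the second pass, so interpolation errors can accumulate at higher factors.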

vfitex dataset is not complete

Hello, thank you very much for sharing the code and dataset. After downloading the VFITex dataset, I found that the sakura_4K_pexels folder inside vfitex contains only 87 images, while each of the other folders contains 100. As a result, only 934 quintuplets can be read when running the code, instead of the 940 mentioned in the paper.
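
The gap of 6 is consistent with a sliding-window count. As an illustration only, assume 20 sequences, a window spanning 7 consecutive frames per quintuplet, and a stride of 2 frames between windows. None of these values is stated in the report above, so treat them as hypothetical.

```python
def count_windows(n_frames, span, stride):
    """Number of windows of `span` consecutive frames, one every `stride` frames."""
    if n_frames < span:
        return 0
    return (n_frames - span) // stride + 1


# Assumed values, chosen for illustration; not taken from the report above.
SPAN, STRIDE, N_SEQUENCES = 7, 2, 20

full = count_windows(100, SPAN, STRIDE)   # a sequence with 100 frames
short = count_windows(87, SPAN, STRIDE)   # the sequence with only 87 frames

total_expected = full * N_SEQUENCES
total_observed = full * (N_SEQUENCES - 1) + short
print(total_expected, total_observed)
```

Under these assumptions the two totals are 940 and 934, which matches the numbers in the report.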

Error during inference on custom video

====Note: Using block-wise evaluation with block size=None, overlap=None, batch size=None
====This may generate unwanted block artefacts.
Loading the model...
Using model STMFNet to upsample file birthday_girl_720x480p.yuv
0%| | 0/641 [00:00<?, ?it/s]
/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/cupy/cuda/compiler.py:461: UserWarning: cupy.cuda.compile_with_cache has been deprecated in CuPy v10, and will be removed in the future. Use cupy.RawModule or cupy.RawKernel instead.
  warnings.warn(
0%| | 0/641 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "interpolate_yuv.py", line 136, in <module>
    main()
  File "interpolate_yuv.py", line 118, in main
    out = model(frame0, frame1, frame2, frame3)
  File "/home/azureuser/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/deepakpanda7/code/Users/deepakpanda/slow-motion/ST-MFNet/models/stmfnet.py", line 287, in forward
    I1_us = self.upsampler(I1)
  File "/home/azureuser/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/deepakpanda7/code/Users/deepakpanda/slow-motion/ST-MFNet/models/misc/__init__.py", line 31, in forward
    im_up_row = F.conv2d(F.pad(im, pad=(p,p+1,0,0), mode='reflect'), self.filter, groups=3)
RuntimeError: Expected object of scalar type Float but got scalar type Long for argument #2 'weight' in call to _thnn_conv_depthwise2d_forward

VFITex dataset is not available.

Hi,

First of all, thank you so much for your well-developed research on VFI.
I'm raising this issue because I can't access your VFITex dataset.
When I try to download it, a message pops up saying "the link has expired".
Could you make VFITex accessible again?

Thank you!

Code available?

Hello, are you still planning to publish the inference/training code of your paper? Thanks!

Question about dataset sampling

Hi, I came across your work and thought it was a very interesting concept. I see you randomly sample from the Vimeo90k and BVI sets. When you were doing your ablations, how did you obtain the list of randomly sampled frames and load them for each experiment?
The reason I'm asking is that I am currently looking at different training parameters to see if there is a quicker way for the model to converge (for reference, training currently takes about 16 days on one GPU if I use the full Vimeo90k). If the dataset is randomly re-sampled for each run, the comparison would not account for the variance in the training data.
Thanks
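
One way to keep the training subset identical across experiments is to draw it once with a fixed seed and store the resulting list. The sketch below uses only the Python standard library; the function names, the file name and the example sequence identifiers are hypothetical and are not taken from the ST-MFNet code.

```python
import json
import os
import random
import tempfile


def sample_fixed_subset(items, k, seed):
    """Draw k items reproducibly: the same seed always yields the same subset."""
    rng = random.Random(seed)             # private generator, global state untouched
    return rng.sample(sorted(items), k)   # sort first so input order does not matter


def save_subset(subset, path):
    with open(path, "w") as f:
        json.dump(subset, f, indent=2)


def load_subset(path):
    with open(path) as f:
        return json.load(f)


# Hypothetical sequence identifiers standing in for dataset entries.
all_ids = [f"seq_{i:05d}" for i in range(1000)]
subset = sample_fixed_subset(all_ids, k=100, seed=0)

# Store the list once, then reload it in every experiment.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "train_subset.json")
    save_subset(subset, path)
    restored = load_subset(path)

print(restored == subset)
```

Every run that loads the stored list sees the same samples, so differences between runs come from the training parameters and not from the draw.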

BVI-DVC

Hi, I was going through your work recently and found that the link to the BVI-DVC quintuplets has expired.
I would like to access the data, but the link seems to be broken.

Could you kindly check it and make it available again?

Help for experiments trained only on Vimeo

Hi, thanks for your code!
We are trying to build a unified benchmark of most SOTA VFI methods. For 2x VFI with four input frames, we see that FLAVR (https://github.com/tarun005/FLAVR) and VFIT (https://github.com/zhshi0816/Video-Frame-Interpolation-Transformer) are trained on Vimeo alone (septuplet set), while ST-MFNet is trained on a mixture of Vimeo (septuplet set) and BVI. So we retrained ST-MFNet on Vimeo alone (septuplet set). Unfortunately, the best PSNR we got on the Vimeo (septuplet set) validation set is only 35.29, considerably worse than the released model (36.49 on the same validation set). We think there must be an error in our parameter settings, but we have tried for a few weeks without any improvement.
So we are here to ask for help in completing our experiments.
Could you train your model on Vimeo alone (septuplet set) and release the model, or the evaluation results on Vimeo, UCF101, DAVIS and SNU-FILM (4 subsets)?
Thanks!
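
For reference, PSNR for 8-bit images is 10 * log10(255^2 / MSE), so a fixed gap in dB corresponds to a fixed ratio of mean squared errors. The sketch below, with hypothetical helper names, converts the reported gap between 36.49 dB and 35.29 dB into that ratio. It treats each figure as if it came from a single MSE value; a PSNR averaged over many images only approximately follows this relation.

```python
import math


def psnr(mse, peak=255.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio(psnr_low, psnr_high):
    """How many times larger the MSE is at psnr_low than at psnr_high."""
    return 10.0 ** ((psnr_high - psnr_low) / 10.0)


ratio = mse_ratio(35.29, 36.49)
print(f"MSE is about {ratio:.2f}x larger at 35.29 dB than at 36.49 dB")
```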

BVI-DVC quintuplets

Hi, I was going through your work recently and found that the link to the BVI-DVC quintuplets has expired.
I would like to access the data, but the link seems to be broken.

Could you kindly check it and make it available again?

Right YUV format of the input video?

Hello,
I would like to ask how exactly we should encode the input video.
I have a directory of PNG frames. I simply ran:

ffmpeg -i myDir/%02d.png -pix_fmt yuv444p test444.yuv
python interpolate_yuv.py --net STMFNet --checkpoint stmfnet.pth --yuv_path test444.yuv --size 3840x2160 --out_fps 30 --out_dir test --batch_size 4 --patch_size 256 --overlap 4
ffmpeg -i test/test.yuv_3840x2160_30fps_STMFNet.mp4 -video_size 3840x2160 -framerate 30 test/%04d.png

Everything runs, but the results look strange, with distorted colors. I believe there is a mismatch between the YUV formats used. Would you please share your commands for the correct processing?
Thanks!
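
Raw .yuv files carry no header, so the program reading them must know the pixel format in advance. As a sketch of why a mismatch distorts the picture: an 8-bit yuv444p frame occupies 3 bytes per pixel, while an 8-bit yuv420p frame occupies 1.5. Whether interpolate_yuv.py expects yuv420p is an assumption here, not something confirmed in the report, and the helper name is hypothetical.

```python
# Bytes per pixel for 8-bit planar formats: one luma plane plus two chroma planes.
BYTES_PER_PIXEL = {
    "yuv420p": 1.5,   # chroma subsampled 2x horizontally and vertically
    "yuv422p": 2.0,   # chroma subsampled 2x horizontally
    "yuv444p": 3.0,   # chroma at full resolution
}


def frame_bytes(width, height, pix_fmt):
    """Size in bytes of one raw 8-bit frame (width and height assumed even)."""
    return int(width * height * BYTES_PER_PIXEL[pix_fmt])


w, h = 3840, 2160
written = frame_bytes(w, h, "yuv444p")   # what `-pix_fmt yuv444p` produces
assumed = frame_bytes(w, h, "yuv420p")   # what a 4:2:0 reader would consume

# A 4:2:0 reader would find twice as many "frames" as were written,
# with luma and chroma planes read from the wrong offsets.
print(written // assumed)
```

If that assumption holds, writing the raw file with -pix_fmt yuv420p in the first ffmpeg command would be the first thing to try.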

Retrain using existing Trained model

Hello,

I'm trying to retrain the model on my own dataset, and I want to use the existing model as the base checkpoint to start training.

I downloaded the pretrained model, stmfnet.pth, and ran:
python train.py --net STMFNet --data_dir G:\WorkSpace\Harshitha\ST-MFNet-main --out_dir ./train_resultswithO --epochs 100 --batch_size 4 --loss 1*Lap --patch_size 256 --lr 0.001 --decay_type plateau --gamma 0.5 --patience 5 --optimizer ADAMax --load stmfnet.pth

In train.py, line 79 reads start_epoch = checkpoint['epoch'], and I'm getting this error:

KeyError: 'epoch'

Could you please help me out? The .pth file does not contain an 'epoch' entry.
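
The error suggests that the released stmfnet.pth holds only weights, without training metadata. A minimal sketch of a tolerant loader follows. The helper unpack_checkpoint and the 'state_dict' key are hypothetical, plain dicts stand in for what torch.load would return, and this is not the repository's train.py.

```python
def unpack_checkpoint(checkpoint, weights_key="state_dict"):
    """Split a loaded checkpoint into (weights, start_epoch).

    Accepts either a full training checkpoint, e.g.
    {"epoch": 12, "state_dict": {...}}, or a bare mapping of weights.
    """
    if weights_key in checkpoint:
        weights = checkpoint[weights_key]
    else:
        weights = checkpoint                   # bare weights, no metadata
    start_epoch = checkpoint.get("epoch", 0)   # fall back to 0 when missing
    return weights, start_epoch


# Plain dicts standing in for loaded checkpoints.
full = {"epoch": 12, "state_dict": {"conv.weight": [0.1, 0.2]}}
bare = {"conv.weight": [0.1, 0.2]}

print(unpack_checkpoint(full)[1])
print(unpack_checkpoint(bare)[1])
```

Starting from epoch 0 when the entry is missing treats the pretrained weights as an initialisation for a fresh training run, not as a resumed one.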
