danier97 / st-mfnet
[IEEE/CVF CVPR'2022] "ST-MFNet: A Spatio-Temporal Multi-Flow Network for Frame Interpolation", Duolikun Danier, Fan Zhang, David Bull
License: MIT License
self.feature_extractor = getattr(feature, args.featnet)(args.featc, norm_layer=args.featnorm)
Hi @danielism97, I have an RTX 2060 with 6 GB of VRAM, but the model needs more VRAM to run. How can I run inference on shorter sequences or smaller chunks so it fits on a lower-memory GPU?
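One memory-reduction approach, sketched below under assumptions (the real `interpolate_yuv.py` exposes `--patch_size`/`--overlap`/`--batch_size` options; this toy `tiled_forward` only illustrates the general tiling idea, not the repo's actual implementation): split each frame into overlapping patches, run the model per patch, and stitch the valid centres back together so peak VRAM is bounded by the patch size rather than the frame size.

```python
import torch

# Illustrative tile-based inference sketch (not the repo's code): process the
# frame in overlapping patches and keep only each patch's non-padded centre.
def tiled_forward(model, frame, patch=128, overlap=8):
    _, _, h, w = frame.shape
    out = torch.zeros_like(frame)
    step = patch - 2 * overlap  # stride between tile centres
    for y in range(0, h, step):
        for x in range(0, w, step):
            # expand each tile by `overlap` pixels to avoid seam artefacts
            y0, x0 = max(y - overlap, 0), max(x - overlap, 0)
            y1, x1 = min(y + step + overlap, h), min(x + step + overlap, w)
            tile = model(frame[:, :, y0:y1, x0:x1])
            # copy only the valid centre region back into the output
            oy, ox = y - y0, x - x0
            out[:, :, y:min(y + step, h), x:min(x + step, w)] = \
                tile[:, :, oy:oy + min(step, h - y), ox:ox + min(step, w - x)]
    return out

identity = lambda t: t  # stand-in for the interpolation model
img = torch.rand(1, 3, 256, 256)
assert torch.equal(tiled_forward(identity, img), img)
```

With a real model, smaller `patch` and batch sizes trade speed for lower peak memory; the overlap hides boundary effects at tile seams.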
Thanks for sharing your amazing work.
In the paper you mention that the "adversarial loss for the generator is combined with the Laplacian pyramid loss to form the perceptual loss for ST-MFNet fine-tuning". Could you let us know the value of lambda used, and after how many epochs of training with just the Laplacian pyramid loss the perceptual loss term was introduced?
Thanks
Hi @danielism97, could you please upload an inference script for testing on real videos?
Thanks for sharing this great work!
Does interpolate_yuv.py only allow for a 2x interpolation? How can I do a 4x or 8x interpolation?
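One way to get 4x or 8x from a 2x interpolator is to apply it recursively, doubling the frame rate on each pass. The sketch below is a simplification under stated assumptions: `interp` is a stand-in midpoint function taking two frames, whereas ST-MFNet actually takes four input frames, so boundary handling in the real pipeline would differ.

```python
# Hypothetical recursive-doubling sketch: each pass inserts one midpoint
# between every pair of consecutive frames, so factor must be a power of two.
def upsample(frames, interp, factor):
    while factor > 1:
        out = []
        for a, b in zip(frames, frames[1:]):
            out += [a, interp(a, b)]  # keep the left frame, insert a midpoint
        out.append(frames[-1])        # keep the final frame
        frames = out
        factor //= 2
    return frames

midpoint = lambda a, b: (a + b) / 2  # toy stand-in for the interpolation model
print(upsample([0.0, 8.0], midpoint, 4))  # [0.0, 2.0, 4.0, 6.0, 8.0]
```

Running `interpolate_yuv.py` twice, feeding its 2x output back in as input, would follow the same principle.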
Hello, thank you very much for sharing the code and dataset. After downloading the VFITex dataset, I found that the vfitex/sakura_4K_pexels folder contains only 87 frames, while the other folders contain 100. As a result, when running the code, only 934 quintuplets can be read instead of the 940 mentioned in the paper.
====Note: Using block-wise evaluation with block size=None, overlap=None, batch size=None
====This may generate unwanted block artefacts.
Loading the model...
Using model STMFNet to upsample file birthday_girl_720x480p.yuv
0%| | 0/641 [00:00<?, ?it/s]/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/cupy/cuda/compiler.py:461: UserWarning: cupy.cuda.compile_with_cache has been deprecated in CuPy v10, and will be removed in the future. Use cupy.RawModule or cupy.RawKernel instead.
warnings.warn(
0%| | 0/641 [00:00<?, ?it/s]
Traceback (most recent call last):
File "interpolate_yuv.py", line 136, in <module>
main()
File "interpolate_yuv.py", line 118, in main
out = model(frame0, frame1, frame2, frame3)
File "/home/azureuser/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/deepakpanda7/code/Users/deepakpanda/slow-motion/ST-MFNet/models/stmfnet.py", line 287, in forward
I1_us = self.upsampler(I1)
File "/home/azureuser/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/deepakpanda7/code/Users/deepakpanda/slow-motion/ST-MFNet/models/misc/__init__.py", line 31, in forward
im_up_row = F.conv2d(F.pad(im, pad=(p,p+1,0,0), mode='reflect'), self.filter, groups=3)
RuntimeError: Expected object of scalar type Float but got scalar type Long for argument #2 'weight' in call to _thnn_conv_depthwise2d_forward
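The error says the convolution weight has dtype Long (int64) while the input is Float, which typically happens when a filter is built from an integer tensor without casting. The sketch below is a hypothetical reconstruction of the pattern, not the repo's actual upsampler code: casting the kernel to float before `F.conv2d` avoids the mismatch.

```python
import torch
import torch.nn.functional as F

# Hypothetical reconstruction of the failure: an integer-valued separable
# filter passed to F.conv2d raises the Float/Long dtype mismatch above.
binomial = torch.tensor([1, 4, 6, 4, 1])            # dtype torch.int64 (Long)
kernel_1d = binomial.float() / binomial.sum()        # cast to Float before use
kernel = kernel_1d.view(1, 1, 1, -1).repeat(3, 1, 1, 1)  # one filter per channel

im = torch.rand(1, 3, 8, 8)                          # dummy Float image
p = kernel.shape[-1] // 2
# depthwise (groups=3) horizontal convolution with reflect padding
out = F.conv2d(F.pad(im, pad=(p, p, 0, 0), mode='reflect'), kernel, groups=3)
print(out.shape)  # torch.Size([1, 3, 8, 8])
```

If the filter is stored as a module buffer, registering it already cast to float (or calling `.float()` on it in `forward`) should resolve this specific RuntimeError.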
Hi,
First of all, thank you so much for your well-developed research of VFI.
I'm raising an issue because I can't access your VFITex dataset.
When I try to download it, the message pops up and says "the link has expired".
Can I access your VFITex?
Thank you!
Hello, are you still planning to publish the inference/training code of your paper? Thanks!
Hi, I came across your work and thought it was a very interesting concept. I see you randomly sample from the vimeo90k and bvi sets. When you were doing your ablations, how did you manage to get the list of randomly sampled frames and load them for each experiment?
The reason I'm asking is that I am currently looking at different training parameters to see whether there is a quicker way for the model to converge (training currently takes about 16 days on one GPU when using the full Vimeo-90K, for reference), but if the dataset is randomly re-sampled for each run, the comparison would not account for the variance in the training data.
Thanks
Hi, I was going through your work recently and noticed that the link to the BVI-DVC quintuplets has expired.
I wish to have access to it, but the link seems to be broken.
Could you kindly check and make it available again?
Hi, thanks for your code!
We are trying to build a unified benchmark covering most SOTA VFI methods. For 2x VFI with four input frames, the training set of FLAVR (https://github.com/tarun005/FLAVR) and VFIT (https://github.com/zhshi0816/Video-Frame-Interpolation-Transformer) is the pure Vimeo septuplet set, whereas ST-MFNet uses a mixture of the Vimeo septuplet set and BVI. So we retrained ST-MFNet on the pure Vimeo septuplet set, but unfortunately the best PSNR we obtained on the Vimeo septuplet validation set is only 35.29, which is noticeably worse than the released model (36.49 on the same validation set). We suspect there is some error in our parameter settings, but after a few weeks of trying we have seen no improvement.
So we are here to seek help in completing our experiments.
Could you train your model on the pure Vimeo septuplet set and release the model, or the evaluation results on Vimeo, UCF101, DAVIS and SNU-FILM (4 subsets)?
Thanks!
Hello,
I would like to ask how exactly should we encode the input video.
I have a directory with PNG frames. I did simply:
ffmpeg -i myDir/%02d.png -pix_fmt yuv444p test444.yuv
python interpolate_yuv.py --net STMFNet --checkpoint stmfnet.pth --yuv_path test444.yuv --size 3840x2160 --out_fps 30 --out_dir test --batch_size 4 --patch_size 256 --overlap 4
ffmpeg -i test/test.yuv_3840x2160_30fps_STMFNet.mp4 -video_size 3840x2160 -framerate 30 test/%04d.png
Everything works, but the results show distorted colours. I believe there is a mismatch between the YUV formats used. Could you please share your commands for correct processing?
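A plausible cause, assuming (this is an assumption, not confirmed by the repo) that `interpolate_yuv.py` expects 8-bit planar YUV 4:2:0 (ffmpeg's `yuv420p`): a `yuv444p` file has a different per-frame byte size, so every subsequent frame read is misaligned and the colours come out distorted. The sketch below just computes the two frame sizes to show the mismatch.

```python
# Raw planar YUV frame sizes (8-bit). In 4:2:0 the U and V planes are
# quarter-resolution; in 4:4:4 all three planes are full-resolution, so a
# reader expecting 4:2:0 would misalign every frame of a 4:4:4 file.
def frame_bytes(w, h, fmt):
    if fmt == 'yuv420p':
        return w * h + 2 * (w // 2) * (h // 2)  # Y full-res, U/V quarter-res
    if fmt == 'yuv444p':
        return 3 * w * h                        # Y, U, V all full-res
    raise ValueError(fmt)

print(frame_bytes(3840, 2160, 'yuv420p'))  # 12441600
print(frame_bytes(3840, 2160, 'yuv444p'))  # 24883200
```

If the 4:2:0 assumption holds, re-encoding with `-pix_fmt yuv420p` instead of `yuv444p` (and matching the pixel format in the final mp4-to-PNG step) should fix the colour distortion.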
Thanks!
Hello,
I'm trying to retrain the model on my own dataset, and I want to use the existing model as the base checkpoint to start training.
I downloaded the pretrained model, stmfnet.pth.
When I ran:
python train.py --net STMFNet --data_dir G:\WorkSpace\Harshitha\ST-MFNet-main --out_dir ./train_resultswithO --epochs 100 --batch_size 4 --loss 1*Lap --patch_size 256 --lr 0.001 --decay_type plateau --gamma 0.5 --patience 5 --optimizer ADAMax --load stmfnet.pth
train.py has start_epoch = checkpoint['epoch'] on line 79, and I get an error:
KeyError: 'epoch'
Could you please help me out? The .pth file does not have an 'epoch' key.
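A likely explanation is that the released stmfnet.pth contains only model weights, not a full training checkpoint with optimiser state and epoch counter. A minimal workaround sketch, assuming that structure (the `state_dict` key and stand-in dict below are illustrative, not the repo's exact checkpoint format), is to guard the lookup instead of indexing `'epoch'` directly:

```python
import torch

# Stand-in for torch.load('stmfnet.pth'): a weights-only file with no 'epoch'.
checkpoint = {'state_dict': {'w': torch.zeros(1)}}

if 'epoch' in checkpoint:
    start_epoch = checkpoint['epoch']   # full training checkpoint: resume
    state_dict = checkpoint['state_dict']
else:
    start_epoch = 0                     # weights-only file: start counting fresh
    # fall back to treating the whole dict as the state_dict if needed
    state_dict = checkpoint.get('state_dict', checkpoint)

print(start_epoch)  # 0
```

Replacing the direct `checkpoint['epoch']` access in train.py with a guard like this lets the same `--load` path accept both full checkpoints and weights-only .pth files.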