danier97 / st-mfnet
[IEEE/CVF CVPR'2022] "ST-MFNet: A Spatio-Temporal Multi-Flow Network for Frame Interpolation", Duolikun Danier, Fan Zhang, David Bull
License: MIT License
self.feature_extractor = getattr(feature, args.featnet)(args.featc, norm_layer=args.featnorm)
Hi @danielism97, I have an RTX 2060 with 6 GB of VRAM, but the model needs more VRAM to run. How can I run inference on shorter sequences or smaller chunks so it fits on a lower-memory GPU?
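One memory-reduction approach, sketched below under assumptions (the real `interpolate_yuv.py` exposes `--patch_size`/`--overlap`/`--batch_size` options; this toy `tiled_forward` only illustrates the general tiling idea, not the repo's actual implementation): split each frame into overlapping patches, run the model per patch, and stitch the valid centres back together so peak VRAM is bounded by the patch size rather than the frame size.

```python
import torch

# Illustrative tile-based inference sketch (not the repo's code): process the
# frame in overlapping patches and keep only each patch's non-padded centre.
def tiled_forward(model, frame, patch=128, overlap=8):
    _, _, h, w = frame.shape
    out = torch.zeros_like(frame)
    step = patch - 2 * overlap  # stride between tile centres
    for y in range(0, h, step):
        for x in range(0, w, step):
            # expand each tile by `overlap` pixels to avoid seam artefacts
            y0, x0 = max(y - overlap, 0), max(x - overlap, 0)
            y1, x1 = min(y + step + overlap, h), min(x + step + overlap, w)
            tile = model(frame[:, :, y0:y1, x0:x1])
            # copy only the valid centre region back into the output
            oy, ox = y - y0, x - x0
            out[:, :, y:min(y + step, h), x:min(x + step, w)] = \
                tile[:, :, oy:oy + min(step, h - y), ox:ox + min(step, w - x)]
    return out

identity = lambda t: t  # stand-in for the interpolation model
img = torch.rand(1, 3, 256, 256)
assert torch.equal(tiled_forward(identity, img), img)
```

With a real model, smaller `patch` and batch sizes trade speed for lower peak memory; the overlap hides boundary effects at tile seams.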
Thanks for sharing your amazing work.
In the paper you mention that the "adversarial loss for the generator is combined with the Laplacian pyramid loss to form the perceptual loss for ST-MFNet fine-tuning". Could you let us know the value of lambda used, and after how many epochs of training with just the Laplacian pyramid loss the perceptual loss term was introduced?
Thanks
Hi @danielism97, could you please upload an inference script for testing on real videos?
Thanks for sharing this great work!
Does interpolate_yuv.py only allow for a 2x interpolation? How can I do a 4x or 8x interpolation?
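One way to get 4x or 8x from a 2x interpolator is to apply it recursively, doubling the frame rate on each pass. The sketch below is a simplification under stated assumptions: `interp` is a stand-in midpoint function taking two frames, whereas ST-MFNet actually takes four input frames, so boundary handling in the real pipeline would differ.

```python
# Hypothetical recursive-doubling sketch: each pass inserts one midpoint
# between every pair of consecutive frames, so factor must be a power of two.
def upsample(frames, interp, factor):
    while factor > 1:
        out = []
        for a, b in zip(frames, frames[1:]):
            out += [a, interp(a, b)]  # keep the left frame, insert a midpoint
        out.append(frames[-1])        # keep the final frame
        frames = out
        factor //= 2
    return frames

midpoint = lambda a, b: (a + b) / 2  # toy stand-in for the interpolation model
print(upsample([0.0, 8.0], midpoint, 4))  # [0.0, 2.0, 4.0, 6.0, 8.0]
```

Running `interpolate_yuv.py` twice, feeding its 2x output back in as input, would follow the same principle.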
Hello, thank you very much for sharing the code and dataset. After downloading the VFITex dataset, I found that the vfitex/sakura_4K_pexels folder contains only 87 frames, while the other folders contain 100. As a result, when running the code, only 934 quintuplets can be read instead of the 940 mentioned in the paper.
====Note: Using block-wise evaluation with block size=None, overlap=None, batch size=None
====This may generate unwanted block artefacts.
Loading the model...
Using model STMFNet to upsample file birthday_girl_720x480p.yuv
0%| | 0/641 [00:00<?, ?it/s]/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/cupy/cuda/compiler.py:461: UserWarning: cupy.cuda.compile_with_cache has been deprecated in CuPy v10, and will be removed in the future. Use cupy.RawModule or cupy.RawKernel instead.
warnings.warn(
0%| | 0/641 [00:00<?, ?it/s]
Traceback (most recent call last):
File "interpolate_yuv.py", line 136, in <module>
main()
File "interpolate_yuv.py", line 118, in main
out = model(frame0, frame1, frame2, frame3)
File "/home/azureuser/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/deepakpanda7/code/Users/deepakpanda/slow-motion/ST-MFNet/models/stmfnet.py", line 287, in forward
I1_us = self.upsampler(I1)
File "/home/azureuser/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/deepakpanda7/code/Users/deepakpanda/slow-motion/ST-MFNet/models/misc/__init__.py", line 31, in forward
im_up_row = F.conv2d(F.pad(im, pad=(p,p+1,0,0), mode='reflect'), self.filter, groups=3)
RuntimeError: Expected object of scalar type Float but got scalar type Long for argument #2 'weight' in call to _thnn_conv_depthwise2d_forward
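The error says the convolution weight has dtype Long (int64) while the input is Float, which typically happens when a filter is built from an integer tensor without casting. The sketch below is a hypothetical reconstruction of the pattern, not the repo's actual upsampler code: casting the kernel to float before `F.conv2d` avoids the mismatch.

```python
import torch
import torch.nn.functional as F

# Hypothetical reconstruction of the failure: an integer-valued separable
# filter passed to F.conv2d raises the Float/Long dtype mismatch above.
binomial = torch.tensor([1, 4, 6, 4, 1])            # dtype torch.int64 (Long)
kernel_1d = binomial.float() / binomial.sum()        # cast to Float before use
kernel = kernel_1d.view(1, 1, 1, -1).repeat(3, 1, 1, 1)  # one filter per channel

im = torch.rand(1, 3, 8, 8)                          # dummy Float image
p = kernel.shape[-1] // 2
# depthwise (groups=3) horizontal convolution with reflect padding
out = F.conv2d(F.pad(im, pad=(p, p, 0, 0), mode='reflect'), kernel, groups=3)
print(out.shape)  # torch.Size([1, 3, 8, 8])
```

If the filter is stored as a module buffer, registering it already cast to float (or calling `.float()` on it in `forward`) should resolve this specific RuntimeError.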
Hi,
First of all, thank you so much for your well-developed research of VFI.
I'm raising an issue because I can't access your VFITex dataset.
When I try to download it, the message pops up and says "the link has expired".
Can I access your VFITex?
Thank you!
Hello, are you still planning to publish the inference/training code of your paper? Thanks!
Hi, I came across your work and thought it was a very interesting concept. I see you randomly sample from the vimeo90k and bvi sets. When you were doing your ablations, how did you manage to get the list of randomly sampled frames and load them for each experiment?
The reason I'm asking is that I am currently looking at different training parameters to see whether there is a quicker way for the model to converge (training currently takes about 16 days on one GPU when using the full Vimeo-90K, for reference), but if the dataset is randomly re-sampled for each run, the comparison would not account for the variance in the training data.
Thanks
Hi, I was going through your work recently and noticed that the link to the BVI-DVC quintuplets has expired.
I wish to have access to it, but the link seems to be broken.
Could you kindly check and make it available again?
Hi, thanks for your code!
We are trying to build a unified benchmark covering most SOTA VFI methods. For 2x VFI with four input frames, the training set of FLAVR (https://github.com/tarun005/FLAVR) and VFIT (https://github.com/zhshi0816/Video-Frame-Interpolation-Transformer) is the pure Vimeo septuplet set, whereas ST-MFNet uses a mixture of the Vimeo septuplet set and BVI. So we retrained ST-MFNet on the pure Vimeo septuplet set, but unfortunately the best PSNR we obtained on the Vimeo septuplet validation set is only 35.29, which is noticeably worse than the released model (36.49 on the same validation set). We suspect there is some error in our parameter settings, but after a few weeks of trying we have seen no improvement.
So we are here to seek help in completing our experiments.
Could you train your model on the pure Vimeo septuplet set and release the model, or the evaluation results on Vimeo, UCF101, DAVIS and SNU-FILM (4 subsets)?
Thanks!
Hello,
I would like to ask how exactly should we encode the input video.
I have a directory with PNG frames. I did simply:
ffmpeg -i myDir/%02d.png -pix_fmt yuv444p test444.yuv
python interpolate_yuv.py --net STMFNet --checkpoint stmfnet.pth --yuv_path test444.yuv --size 3840x2160 --out_fps 30 --out_dir test --batch_size 4 --patch_size 256 --overlap 4
ffmpeg -i test/test.yuv_3840x2160_30fps_STMFNet.mp4 -video_size 3840x2160 -framerate 30 test/%04d.png
Everything works, but the results show distorted colours. I believe there is a mismatch between the YUV formats used. Could you please share your commands for correct processing?
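A plausible cause, assuming (this is an assumption, not confirmed by the repo) that `interpolate_yuv.py` expects 8-bit planar YUV 4:2:0 (ffmpeg's `yuv420p`): a `yuv444p` file has a different per-frame byte size, so every subsequent frame read is misaligned and the colours come out distorted. The sketch below just computes the two frame sizes to show the mismatch.

```python
# Raw planar YUV frame sizes (8-bit). In 4:2:0 the U and V planes are
# quarter-resolution; in 4:4:4 all three planes are full-resolution, so a
# reader expecting 4:2:0 would misalign every frame of a 4:4:4 file.
def frame_bytes(w, h, fmt):
    if fmt == 'yuv420p':
        return w * h + 2 * (w // 2) * (h // 2)  # Y full-res, U/V quarter-res
    if fmt == 'yuv444p':
        return 3 * w * h                        # Y, U, V all full-res
    raise ValueError(fmt)

print(frame_bytes(3840, 2160, 'yuv420p'))  # 12441600
print(frame_bytes(3840, 2160, 'yuv444p'))  # 24883200
```

If the 4:2:0 assumption holds, re-encoding with `-pix_fmt yuv420p` instead of `yuv444p` (and matching the pixel format in the final mp4-to-PNG step) should fix the colour distortion.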
Thanks!
Hello,
I'm trying to retrain the model on my own dataset, and I want to use the existing model as the base checkpoint to start training.
I downloaded the pretrained model, stmfnet.pth.
When I ran:
python train.py --net STMFNet --data_dir G:\WorkSpace\Harshitha\ST-MFNet-main --out_dir ./train_resultswithO --epochs 100 --batch_size 4 --loss 1*Lap --patch_size 256 --lr 0.001 --decay_type plateau --gamma 0.5 --patience 5 --optimizer ADAMax --load stmfnet.pth
train.py has start_epoch = checkpoint['epoch'] on line 79, and I get an error:
KeyError: 'epoch'
Could you please help me out? The .pth file does not have an 'epoch' key.
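A likely explanation is that the released stmfnet.pth contains only model weights, not a full training checkpoint with optimiser state and epoch counter. A minimal workaround sketch, assuming that structure (the `state_dict` key and stand-in dict below are illustrative, not the repo's exact checkpoint format), is to guard the lookup instead of indexing `'epoch'` directly:

```python
import torch

# Stand-in for torch.load('stmfnet.pth'): a weights-only file with no 'epoch'.
checkpoint = {'state_dict': {'w': torch.zeros(1)}}

if 'epoch' in checkpoint:
    start_epoch = checkpoint['epoch']   # full training checkpoint: resume
    state_dict = checkpoint['state_dict']
else:
    start_epoch = 0                     # weights-only file: start counting fresh
    # fall back to treating the whole dict as the state_dict if needed
    state_dict = checkpoint.get('state_dict', checkpoint)

print(start_epoch)  # 0
```

Replacing the direct `checkpoint['epoch']` access in train.py with a guard like this lets the same `--load` path accept both full checkpoints and weights-only .pth files.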