ruiliu-ai / dstt

Python 100.00%

dstt's Issues

How to align the consecutive frames or patches?

Nice work! I would like to know whether you apply any alignment, such as optical flow or an affine transformation, to these image patches.

Is the frame the ground truth? The paper says there is no ground truth for training

DSTT/core/trainer.py

Line 252 in 0b16ff1

hole_loss = self.l1_loss(pred_img*masks, frames*masks)
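
One possible reading of the line above, offered as an assumption rather than the author's explanation: if the training masks are synthetic and are applied to complete videos, the unmasked frames themselves can serve as the target, so no separately annotated ground truth is needed. A minimal pure-Python sketch of that idea follows; the names `mask_out` and `masked_l1` are hypothetical, and flat lists of floats stand in for tensors.

```python
def mask_out(frame, mask):
    """Zero out the pixels where mask == 1 to build the network input."""
    return [p * (1 - m) for p, m in zip(frame, mask)]


def masked_l1(pred, target, mask):
    """Mean absolute error over all pixels, where only the masked region
    can contribute, in the spirit of l1_loss(pred_img * masks, frames * masks)."""
    diffs = [abs(p * m - t * m) for p, t, m in zip(pred, target, mask)]
    return sum(diffs) / len(diffs)


frame = [0.2, 0.4, 0.6, 0.8]   # a complete frame (flattened), reused as the target
mask = [0, 1, 1, 0]            # synthetic hole (assumption): 1 marks pixels to inpaint
masked_input = mask_out(frame, mask)

perfect = masked_l1(frame, frame, mask)        # prediction equals the frame
blank = masked_l1(masked_input, frame, mask)   # prediction left the hole empty
```

In this toy setup a prediction that reproduces the original frame gets zero hole loss, while one that leaves the hole empty is penalised only on the masked pixels.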

FLOPs calculation

@ruiliu-ai Could you please release your FLOPs calculation code? Thanks in advance.
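
For a rough estimate by hand, the sketch below may help. It is not the author's FLOPs code; it uses the common convention of counting multiply-accumulates (MACs) for a 2D convolution, and the layer sizes in the example are made up for illustration. Note that some papers report MACs under the name FLOPs, so numbers can differ by a factor of two.

```python
def conv2d_macs(c_in, c_out, kernel, h_out, w_out, groups=1):
    """Multiply-accumulates of one Conv2d layer (bias ignored).

    Each output element needs (c_in / groups) * kernel * kernel MACs,
    and there are c_out * h_out * w_out output elements.
    """
    per_output = (c_in // groups) * kernel * kernel
    return per_output * c_out * h_out * w_out


# Hypothetical layer: 3x3 conv, 64 -> 128 channels, 60x108 output map.
macs = conv2d_macs(c_in=64, c_out=128, kernel=3, h_out=60, w_out=108)
flops = 2 * macs  # one multiply plus one add per MAC
```

Summing this quantity over all layers (and adding the matrix multiplications of the attention blocks) gives a per-frame estimate that can be compared against the figure in the paper.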

How does this network realize self-training?

The paper devotes most of its space to the generator network. But since there is no ground truth, do we need self-training to handle object movement? I wonder how self-training is realized, because as far as I know a GAN is trained in a supervised way.
Thanks for your answer!

Can you please point out the download links of the dataset you use?

Hi author,
There seems to be no YouTube-VOS download link at https://competitions.codalab.org/competitions/19544, and there are too many DAVIS download links at https://davischallenge.org/davis2017/code.html
Could you please give us a concrete hint about which ones to use?

About GPU issues

Hi, thanks for sharing your work. I have the following problem:

We have six 2080 Ti graphics cards, but the following error is reported when running:
RuntimeError: CUDA error: out of memory

Algorithm output format (mp4, other, etc)

Hi, thanks for the code. Can I change the output format directly, or should I convert the output after running the algorithm (from mp4 to png, for example)?
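
If the script only writes an mp4, one post-processing option is to split the file into frames with ffmpeg. A small sketch, assuming ffmpeg is installed and on the PATH; the helper names and the file paths are placeholders, not part of the repository.

```python
import subprocess
from pathlib import Path


def mp4_to_png_command(video_path, out_dir):
    """Build the ffmpeg command that writes every frame as a numbered PNG."""
    pattern = str(Path(out_dir) / "%05d.png")
    return ["ffmpeg", "-i", str(video_path), pattern]


def convert(video_path, out_dir):
    """Create the output folder and run ffmpeg (raises if ffmpeg fails)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(mp4_to_png_command(video_path, out_dir), check=True)


# Placeholder paths: inspect the command before actually running convert().
cmd = mp4_to_png_command("result.mp4", "frames")
```

Because mp4 encoding is lossy, frames recovered this way will not be bit-identical to the network output; saving the frames directly inside the test script would avoid that.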

Question about the inference speed.

Hi, friend.
When I tested the inference speed of the models, I got results that differ from Figure 1 in your paper.
For STTN, I got about 11 FPS.
Could you tell me how you tested it?
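
Measured speed depends heavily on how the timing is done (warm-up, what is included in the timed region, GPU synchronisation). The sketch below is a generic way to measure frames per second, not the procedure used in the paper; `run_model` is a hypothetical callable that processes one batch of `frames_per_call` frames.

```python
import time


def measure_fps(run_model, frames_per_call, warmup=3, repeats=10):
    """Return the average frames per second of run_model over `repeats` calls."""
    for _ in range(warmup):        # exclude one-off start-up costs
        run_model()
    start = time.perf_counter()
    for _ in range(repeats):
        run_model()
    elapsed = time.perf_counter() - start
    return frames_per_call * repeats / elapsed


# Stand-in workload; with a real GPU model, synchronise the device inside
# run_model (e.g. torch.cuda.synchronize()) so queued kernels are counted.
fps = measure_fps(lambda: sum(range(10000)), frames_per_call=5)
```

Whether data loading, mask preparation, and video encoding are inside `run_model` can easily change the result by a large factor, so it helps to state that when comparing numbers.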

Request for the pretrained discriminator

Could you also release the pretrained discriminator? I wonder whether it can be used to measure the quality of the inpainted results.

Some questions about the inference speed.

I have only recently started working on video inpainting. The DAVIS and YouTube-VOS test sets provide only one mask per video. How did you use these datasets for testing?

Some questions about the paper!

In Section 3.2 of your paper, F is split into s^2 zones, and the total number is then given as t * s^2 * n. Why does this number need the factor n? Shouldn't it be t * s^2?
Looking forward to your reply.

Questions about pos_embedding

Hello,

This is great work, and thanks for sharing the code!
I have a question: why is there no pos_embedding in the code, given that a transformer is not aware of the temporal order of its inputs? Any hints would be appreciated, thanks!

Best,
Kejie

When will you release your code?

Input resolutions other than 432x240

I am trying to test your work with input images of resolution 640x448 but I keep getting the following error:

File "/home/cosmos/AI/DSTT/model/DSTT.py", line 241, in forward
key = key.view(b, t, 2, self.h//2, 2, self.w//2, self.head, c_h)
RuntimeError: shape '[1, 11, 2, 10, 2, 18, 4, 128]' is invalid for input of size 11556864

Is it only possible to use 432x240 input images with the pre-trained model?
If so, would I need to train a new model specifically for the 640x448 resolution?
I also tried to change the resolution in youtube-vos.json and run the train.py script but I get a similar error there.

Can you explain how to make it work for resolutions other than 432x240?

Thank you!
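
The error can be read off with simple arithmetic: `view` only succeeds when the product of the requested dimensions equals the number of elements in the tensor. The check below uses only the numbers from the traceback; the interpretation that the last two dimensions (`head` and `c_h`) together form the channels of one token is an assumption.

```python
from math import prod

requested = [1, 11, 2, 10, 2, 18, 4, 128]   # shape from the error message
actual_size = 11556864                       # "input of size" from the error

expected_size = prod(requested)
channels = requested[-2] * requested[-1]     # head * c_h (assumed per-token width)
frames = requested[0] * requested[1]         # b * t

tokens_expected = expected_size // (frames * channels)
tokens_actual = actual_size // (frames * channels)

print(expected_size == actual_size)          # the view can only work if True
print(tokens_expected, tokens_actual)
```

Under that assumption, the requested shape describes 2 x 10 x 2 x 18 = 720 token positions per frame (from `self.h // 2` and `self.w // 2`), while the tensor produced from the 640x448 input holds 2052 per frame, so the grid stored in the model no longer matches the input.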

About HierarchyEncoder

Hi, the paper states that the interaction between feature maps of different scales is isolated by group convolution in order to preserve the spatial structure. In theory, x0 and out0 should then be concatenated directly, without grouping. In the code, however, the Fj and F1 feature maps are split into groups before they are concatenated along the channel dimension. Does this operation lead to information interaction between feature maps of different scales?

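def forward(self, x):
    bt, c, h, w = x.size()
    out = x
    for i, layer in enumerate(self.layers):
        # At even layer indices (2, 4, ...), re-inject the input features.
        if i % 2 == 0 and i != 0:
            g = self.group[i//2]
            # Split both tensors into g channel groups ...
            x0 = x.view(bt, g, -1, h, w)
            out0 = out.view(bt, g, -1, h, w)
            # ... and concatenate within each group before flattening back.
            out = torch.cat([x0, out0], 2).view(bt, -1, h, w)
        out = layer(out)
    return out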
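
To see what the grouped reshape does to the channel order, the toy below mimics `view(bt, g, -1, h, w)` followed by `torch.cat(..., 2)` and the final `view`, using channel labels only. It is an illustration in plain Python (the function name is hypothetical), not the repository code.

```python
def grouped_concat(x_channels, out_channels, g):
    """Split both channel lists into g groups and join them group by group."""
    nx = len(x_channels) // g
    no = len(out_channels) // g
    merged = []
    for k in range(g):
        merged += x_channels[k * nx:(k + 1) * nx]      # group k of x
        merged += out_channels[k * no:(k + 1) * no]    # group k of out
    return merged


x = ["x0", "x1", "x2", "x3"]
out = ["o0", "o1", "o2", "o3"]
layout = grouped_concat(x, out, g=2)
# layout == ['x0', 'x1', 'o0', 'o1', 'x2', 'x3', 'o2', 'o3']
```

With this layout, a following convolution with `groups=g` would see group k of `x` together with group k of `out`, and no channels from any other group. A plain `torch.cat([x, out], 1)` would instead hand all of `x` to the first groups and all of `out` to the last ones. Whether the layers in the repository use matching `groups` values is not visible in the snippet above.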

License

Hey, what is the license of the repo?

Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed

When I run test.py, it fails with an error like this:
python3 test.py -c *** -v *** -m ***
The error output is:
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda ->auto::operator()(int)->auto: block: [222,0,0], thread: [95,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
Traceback (most recent call last):
File "test.py", line 162, in <module>
main_worker()
File "test.py", line 135, in main_worker
pred_img = model(masked_imgs)
File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/ProjectRoot/test/Video_inpainting/DSTT-master/model/DSTT.py", line 144, in forward
enc_feat = self.encoder(masked_frames.view(bt, c, h, w))
File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
input = module(input)
File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 345, in forward
return self.conv2d_forward(input, self.weight)
File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 342, in conv2d_forward
self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
