The image-play from ywchao

image-play's Issues

I don't get how ConvLSTM is done by this code.

Hi Yuwei,

I found something confusing in the code for 2D pose forecasting.

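-- seqLength, hiddenSize and numLayers are upvalues defined elsewhere in the model file.
-- Shape comments assume inp is (B, inputSize, res, res).
local function clstm(inp, inputSize, res)
  -- Replicate encoder output along a new time dimension: (B, seqLength, inputSize, res, res)
  local rep = nn.Replicate(seqLength,2)(inp)

  -- Merge into one mini-batch: (B, res, res, inputSize) -> (B*res*res, inputSize)
  local x1_ = nn.Transpose({2,3},{3,4})(inp)
  local x1 = nn.View(-1,inputSize)(x1_)

  -- LSTM: each pixel becomes one sequence, padded to seqLength steps
  local x2 = nn.View(-1,1,inputSize)(x1)
  local x3 = nn.Padding(1,seqLength-1,1)(x2)
  local hid = cudnn.LSTM(inputSize,hiddenSize,numLayers,true,0)(x3)
  local h1 = nn.Contiguous()(hid)

  -- Split from one mini-batch: back to (B, seqLength, hiddenSize, res, res)
  local h2_ = nn.View(-1,res,res,seqLength,hiddenSize)(h1)
  local h2 = nn.Transpose({3,4},{2,3},{4,5},{3,4})(h2_)

  -- Add residual to encoder output
  local add = nn.CAddTable()({rep,h2})

  -- Merge output in batch dimension: (B*seqLength, hiddenSize, res, res)
  return nn.View(-1,hiddenSize,res,res)(add)
end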

This is the part of the model from your paper that handles the LSTM. The paper mentions that you use the ConvLSTM trick. But done this way, when the resolution is 64*64, the input to cudnn.LSTM is (4096, 16, 256), where 4096 = 64*64 is the number of spatial positions. To me this looks like the LSTM is getting the spatial information rather than the temporal information.

Thanks for your time and patience.
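
For reference, the tensor shapes in the function above can be traced without Torch. The sketch below is a shape-only walk-through in plain Lua under assumed sizes (batch 1, inputSize = hiddenSize = 256, res = 64, seqLength = 16). The helpers `transpose` and `view` are hypothetical stand-ins for nn.Transpose and nn.View, and the padding and LSTM steps are written by hand on the assumption that the fourth argument of cudnn.LSTM selects a batch-first layout.

```lua
-- Shape-only trace of clstm in plain Lua; no Torch needed.
-- Assumed sizes (not taken from the repo config): B=1, C=256, res=64, T=16.

-- Hypothetical stand-in for nn.Transpose: apply {i, j} axis swaps in order.
local function transpose(shape, ...)
  local out = {}
  for i, d in ipairs(shape) do out[i] = d end
  for _, p in ipairs({...}) do
    out[p[1]], out[p[2]] = out[p[2]], out[p[1]]
  end
  return out
end

-- Hypothetical stand-in for nn.View: infer a single -1 entry from the element count.
local function view(shape, ...)
  local total, known, free = 1, 1, nil
  for _, d in ipairs(shape) do total = total * d end
  local dims = {...}
  for i, d in ipairs(dims) do
    if d == -1 then free = i else known = known * d end
  end
  if free then dims[free] = math.floor(total / known) end
  return dims
end

local B, C, res, T = 1, 256, 64, 16
local inp = {B, C, res, res}
local x1  = view(transpose(inp, {2, 3}, {3, 4}), -1, C)  -- one row per pixel
local x2  = view(x1, -1, 1, C)
local x3  = {x2[1], T, x2[3]}  -- nn.Padding extends the time axis to T
local hid = {x3[1], x3[2], C}  -- assumed batch-first LSTM output, hiddenSize = C
local h2  = transpose(view(hid, -1, res, res, T, C),
                      {3, 4}, {2, 3}, {4, 5}, {3, 4})
local out = view(h2, -1, C, res, res)

print("LSTM input: " .. table.concat(x3, " x "))
print("h2:         " .. table.concat(h2, " x "))
print("output:     " .. table.concat(out, " x "))
```

Read this way, 4096 would be the number of independent sequences (one per pixel) and 16 the number of time steps, so the recurrence would run over time with the same weights at every position. Whether that matches the ConvLSTM described in the paper is still a question for the author.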

Why do you replicate inp for sequence length times at dim 2?

Hi Yuwei,

I am currently re-implementing your 2D pose forecasting work, but I am confused by one part of lib/models/hg-256-res-clstm.lua. Why do you replicate inp seqLength times along dim 2? If the goal is to add a residual to the encoder output, would using a plain copy of the input work the same way?
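
One possible reading, sketched below in plain Lua for a single pixel: the LSTM produces one output per time step, while the encoder produces one feature vector, so the encoder output has to be repeated along a time axis before an element-wise sum (nn.CAddTable expects tensors of the same size). The functions `replicate` and `add_steps` and the numbers are hypothetical toy stand-ins, not code from the repository.

```lua
-- Toy stand-in for nn.Replicate(seqLength, 2): repeat one feature vector n times.
-- The same table is reused for every step, mirroring a replicate that does not copy.
local function replicate(vec, n)
  local out = {}
  for t = 1, n do out[t] = vec end
  return out
end

-- Toy stand-in for nn.CAddTable on [T][C] tables: sizes must match exactly.
local function add_steps(rep, hid)
  assert(#rep == #hid, "time dimensions must match")
  local out = {}
  for t = 1, #hid do
    out[t] = {}
    for c = 1, #hid[t] do
      out[t][c] = rep[t][c] + hid[t][c]
    end
  end
  return out
end

local enc = {1.0, 2.0}                              -- encoder features, C = 2
local hid = {{0.1, 0.2}, {0.3, 0.4}, {0.5, 0.6}}    -- LSTM outputs, T = 3
local out = add_steps(replicate(enc, #hid), hid)

for t, v in ipairs(out) do
  print(t, v[1], v[2])
end
```

In this sketch every forecast step is the same encoder output plus a step-specific residual; passing `enc` directly to `add_steps` would fail the size check.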
