The book gives an example similar to https://tensorlayer.readthedocs.io/en/latest/modules/iterate.html?highlight=ptb_iterator#tensorlayer.iterate.ptb_iterator .
With batch_size=2 and num_steps=3, the output is:
print(x, y)
... [[ 0 1 2] <---x 1st subset/ iteration
... [10 11 12]]
... [[ 1 2 3] <---y
... [11 12 13]]
...
... [[ 3 4 5] <--- 1st batch input 2nd subset/ iteration
... [13 14 15]] <--- 2nd batch input
... [[ 4 5 6] <--- 1st batch target
... [14 15 16]] <--- 2nd batch target
...
... [[ 6 7 8] 3rd subset/ iteration
... [16 17 18]]
... [[ 7 8 9]
... [17 18 19]]
Why do the (x, y) pairs jump straight from the 1st subset ([0, 1, 2], [1, 2, 3]) to the 2nd subset ([3, 4, 5], [4, 5, 6])? What happened to the intermediate pairs ([1, 2, 3], [2, 3, 4]) and ([2, 3, 4], [3, 4, 5])?
I tried different values of batch_size and num_steps, but I still could not get the skipped subsets.
For example, with batch_size=1 and num_steps=3, the result is:
x: [[0 1 2]]
y: [[1 2 3]]
x: [[3 4 5]]
y: [[4 5 6]]
x: [[6 7 8]]
y: [[7 8 9]]
x: [[ 9 10 11]]
y: [[10 11 12]]
x: [[12 13 14]]
y: [[13 14 15]]
x: [[15 16 17]]
y: [[16 17 18]]
The book says that with batch_size set to 1, the samples are fed into the network one by one. But even with batch_size=1, some data is still skipped and lost (e.g. x: [1, 2, 3], y: [2, 3, 4]).
The problem seems to be that num_steps controls both the window size and the stride, whereas in something like a CNN kernel these are two separate parameters (kernel size and stride).
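If overlapping windows were wanted, decoupling the two roles of num_steps would recover the skipped pairs. A hypothetical variant (not part of TensorLayer's API) with a separate stride parameter:

```python
import numpy as np

def strided_iterator(data, batch_size, num_steps, stride):
    """Like a PTB iterator, but the window start advances by `stride`
    instead of by num_steps, so windows may overlap when stride < num_steps."""
    data = np.array(data, dtype=np.int32)
    batch_len = len(data) // batch_size
    batched = data[:batch_size * batch_len].reshape(batch_size, batch_len)
    # Stop while a full x window plus the shifted y window still fits.
    for start in range(0, batch_len - num_steps, stride):
        x = batched[:, start:start + num_steps]
        y = batched[:, start + 1:start + num_steps + 1]
        yield x, y

# With stride=1 the "missing" pairs appear:
for x, y in strided_iterator(list(range(20)), batch_size=1, num_steps=3, stride=1):
    print(x, y)
# Second iteration: x = [[1 2 3]], y = [[2 3 4]]
```

Setting stride=num_steps reproduces the original non-overlapping behavior, which is the usual choice for RNN training because it lets the hidden state be carried across consecutive windows without reprocessing tokens.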