zepingyu0512 / srnn
sliced-rnn
Even in your paper, "casual" should be "causal".
Hi:
The Yelp 2013 dataset downloads very slowly for me, and I'd like to understand the code first. Could you send me a small sample of the training data? Thanks.
It's nice work. However, I am a little confused about the usage of SRNN. Is it currently more suitable for classification tasks? How about other sequential prediction tasks?
Train on 25 samples, validate on 3 samples
Epoch 1/10
25/25 [==============================] - 4s 156ms/step - loss: 2.3046 - acc: 0.2000 - val_loss: 2.1626 - val_acc: 0.0000e+00
Epoch 00001: val_acc improved from -inf to 0.00000, saving model to F:\SRNN(8,2)_yelp20131.h5
Epoch 2/10
25/25 [==============================] - 0s 17ms/step - loss: 1.8452 - acc: 0.2000 - val_loss: 1.6156 - val_acc: 0.0000e+00
Epoch 00002: val_acc did not improve from 0.00000
Epoch 3/10
25/25 [==============================] - 0s 17ms/step - loss: 1.5321 - acc: 0.3200 - val_loss: 1.2719 - val_acc: 0.6667
Epoch 00003: val_acc improved from 0.00000 to 0.66667, saving model to F:\SRNN(8,2)_yelp20131.h5
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
runfile('C:/Users/ycl/Desktop/SRNN8-2 - -attention.py', wdir='C:/Users/杨长利/Desktop')
File "E:\anaconda\envs\tensorflow\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 668, in runfile
execfile(filename, namespace)
File "E:\anaconda\envs\tensorflow\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 108, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/ycl/Desktop/SRNN8-2 - -attention.py", line 239, in <module>
verbose = 1)
File "E:\anaconda\envs\tensorflow\lib\site-packages\keras\engine\training.py", line 1039, in fit
validation_steps=validation_steps)
File "E:\anaconda\envs\tensorflow\lib\site-packages\keras\engine\training_arrays.py", line 217, in fit_loop
callbacks.on_epoch_end(epoch, epoch_logs)
File "E:\anaconda\envs\tensorflow\lib\site-packages\keras\callbacks.py", line 79, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "E:\anaconda\envs\tensorflow\lib\site-packages\keras\callbacks.py", line 446, in on_epoch_end
self.model.save(filepath, overwrite=True)
File "E:\anaconda\envs\tensorflow\lib\site-packages\keras\engine\network.py", line 1090, in save
save_model(self, filepath, overwrite, include_optimizer)
File "E:\anaconda\envs\tensorflow\lib\site-packages\keras\engine\saving.py", line 382, in save_model
_serialize_model(model, f, include_optimizer)
File "E:\anaconda\envs\tensorflow\lib\site-packages\keras\engine\saving.py", line 78, in _serialize_model
f['keras_version'] = str(keras_version).encode('utf8')
File "E:\anaconda\envs\tensorflow\lib\site-packages\keras\utils\io_utils.py", line 214, in __setitem__
'Group with name "{}" exists.'.format(attr))
KeyError: 'Cannot set attribute. Group with name "keras_version" exists.'
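A hedged note on this KeyError (an assumption about the cause, not a confirmed diagnosis): in older Keras versions this error can appear when `save_model` tries to rewrite HDF5 attributes in a checkpoint file left over in a partial or stale state. Two common workarounds are deleting the stale checkpoint before training, or checkpointing weights only, which skips the `_serialize_model` attribute-writing path shown in the traceback:

```python
import os

# Checkpoint path taken from the log above; adjust to your setup.
ckpt_path = r"F:\SRNN(8,2)_yelp20131.h5"

# Workaround 1: remove any stale/partial checkpoint file before training,
# so save_model creates a fresh HDF5 file rather than reopening a broken one.
if os.path.exists(ckpt_path):
    os.remove(ckpt_path)

# Workaround 2 (commented out, requires Keras): save only the weights,
# avoiding the model-attribute serialization that raised the KeyError:
# from keras.callbacks import ModelCheckpoint
# checkpoint = ModelCheckpoint(ckpt_path, monitor='val_acc',
#                              save_best_only=True, save_weights_only=True)

print("stale checkpoint cleared" if not os.path.exists(ckpt_path) else "file still present")
```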
Hi, I'm a newcomer to deep learning, so please forgive me if this question is naive. Looking at the model's architecture diagram, I don't understand its output. A traditional RNN can produce m-to-m or m-to-1 outputs. Can this SRNN only produce m-to-n (m > n) and m-to-1 outputs? In the figure, the first layer has 8 inputs, the second layer takes only h2, h4, h6, h8, and the top layer produces a single output. Does that mean it can only be used for classification, or for tasks where a long question yields a very short answer? Is it unusable when the answer needs to be many tokens long? I'd be grateful if you could explain. Thanks in advance.
Recently, when I read the code again, I found that the TimeDistributed wrapper in Keras is actually not computed in parallel: its implementation uses a for loop, so the computation over the subsequences is not parallelized. This means SRNN gets its 10+ times speed advantage only by saving back-propagation time, not forward-propagation time. It would be even faster if we changed the for-loop code; I will try to do this later.
The code of the timedistributed layer is here: https://github.com/tensorflow/tensorflow/blob/r1.11/tensorflow/python/keras/layers/wrappers.py
They use an RNN-based implementation on line 245 (not a real RNN, just a trick; reading the code carefully makes this clear):
    _, outputs, _ = K.rnn(
        step,
        inputs,
        initial_states=[],
        input_length=input_shape[1],
        unroll=False)
    y = outputs
The code of K.rnn is here: https://github.com/tensorflow/tensorflow/blob/r1.11/tensorflow/python/keras/backend.py
On line 3135 they use a for loop to compute each subsequence.
I think there may be some ways to improve on the for loop:
- change the CUDA code to do matrix-matrix computation over the subsequences
- put the subsequences into batches
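One way to sketch the "put them into batches" idea (a minimal numpy illustration of the reshape trick, not the repository's code; `toy_rnn` is a stand-in for a real recurrent layer): folding the subsequence axis into the batch axis lets a single RNN call process all slices at once, with no Python loop over subsequences.

```python
import numpy as np

def toy_rnn(x):
    # Stand-in for an RNN layer returning a final "state":
    # here just a sum over the time axis. (batch, time, dim) -> (batch, dim)
    return x.sum(axis=1)

def run_sliced(x, n_sub):
    # x: (batch, time, dim); slice the time axis into n_sub subsequences.
    batch, time, dim = x.shape
    sub_len = time // n_sub
    sliced = x.reshape(batch, n_sub, sub_len, dim)
    # Fold the subsequence axis into the batch axis so ONE call to the
    # encoder covers every slice -- no Python for-loop over subsequences.
    folded = sliced.reshape(batch * n_sub, sub_len, dim)
    states = toy_rnn(folded)                  # (batch * n_sub, dim)
    return states.reshape(batch, n_sub, dim)  # per-subsequence states

x = np.arange(2 * 8 * 3, dtype=float).reshape(2, 8, 3)
out = run_sliced(x, n_sub=4)
print(out.shape)  # (2, 4, 3)
```

The per-subsequence states can then be fed to the next-level RNN exactly as the TimeDistributed outputs are now.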
I have read your paper, and you mention that SRNN can be made equal to a standard RNN with a particular choice of initial parameters. However, in your code you did not use any weight-sharing tricks, right? Then why did you consider using activation=None in the GRU unit? Would it be better to simply use the default activation function? Thanks.
I've read your paper and am confused about the differences between the architectures of SRNN and WaveNet (or Fast WaveNet). Could you explain those differences, or post the code for the DCCNN described in your paper?
Is there a PyTorch version? Or how should this be implemented in PyTorch? Thanks!
Thanks to the author for open-sourcing this work.
In my post, I have updated the code for newer APIs: Python 3.7, Keras 2.2.4, and TensorFlow 2.0.0.
In this paper, you state that the key improvement of SRNN over a plain RNN is that SRNN can be parallelized, so multiple subsequences can be trained at the same time. But I don't see that improvement manifested in this code. How can we parallelize this code? Thank you.
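A back-of-envelope illustration of the intended speedup (my own numbers, assuming the two-level slicing shown in the paper's figure): the sequential dependency chain shrinks from T steps in a plain RNN to roughly T/n steps at level 1 (all n slices are independent, hence parallelizable) plus n steps at level 2.

```python
# Hypothetical sizes, for illustration only.
T, n = 512, 8                 # sequence length, number of slices

plain_chain = T               # plain RNN: T strictly sequential steps
srnn_chain = T // n + n       # SRNN: T/n steps per slice (run in parallel)
                              # plus n steps in the top-level RNN

print(plain_chain, srnn_chain)  # 512 72
```

Whether the wall-clock speedup materializes depends on the implementation actually running the slices concurrently, which is exactly the for-loop issue the author discusses above.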
Hi, I tried to run your code, and at the line
embed2 = TimeDistributed(Encoder1)(input2)
I got TypeError: Value passed to parameter 'shape' has DataType float32 not in list of allowed values: int32, int64. I have no idea what is causing it. Any guidance would be appreciated, thanks.
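A guess at one common cause of this TypeError (an assumption, not a confirmed diagnosis of this repo): a shape or length computed with Python 3's `/` operator becomes a float, and TensorFlow rejects float values in shape parameters. Using integer division keeps the value an int:

```python
MAX_LEN = 512     # hypothetical sequence length
NUM_SLICES = 8    # hypothetical number of slices

bad_len = MAX_LEN / NUM_SLICES    # 64.0 -- a float; passing this as part of
                                  # a shape triggers "DataType float32 not
                                  # in list of allowed values: int32, int64"
good_len = MAX_LEN // NUM_SLICES  # 64 -- an int, safe to pass as a shape

print(type(bad_len).__name__, type(good_len).__name__)  # float int
```

Checking every value fed into Input(shape=...) or a reshape for accidental floats is a cheap first step before digging into the TimeDistributed call itself.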