Comments (5)
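After reading The Illustrated Transformer, I understand now.
First, Keras's batch_dot only ever performs a rank-2 dot, that is, a matrix product. This holds even when the arrays have more than three dimensions: the leading (n-2) dimensions are all treated as "batch". The Keras documentation does not explain this well.
Second, the Transformer really does need a matrix product, not an inner product. Each query vector takes a dot product (multiply, then sum) with every key vector in the same attention head. I could not see this until I read The Illustrated Transformer.
So the code is fine and the issue can be closed. Sorry for the noise.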
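The per-head matrix product described above can be sketched in plain NumPy. This is only an illustration, not the repository's code: the sizes and the names `Q`, `K`, `scores`, and `manual` are hypothetical, and the division by sqrt(d_k) is the usual scaled dot-product attention convention rather than something stated in this thread.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
batch, heads, seq_len, d_k = 1, 2, 3, 4
rng = np.random.default_rng(0)
Q = rng.normal(size=(batch, heads, seq_len, d_k))  # query vectors
K = rng.normal(size=(batch, heads, seq_len, d_k))  # key vectors

# np.matmul treats every leading dimension (batch, heads) as "batch" and
# multiplies only the last two axes, so heads never mix.
scores = np.matmul(Q, np.swapaxes(K, -1, -2)) / np.sqrt(d_k)

# The same computation spelled out: one dot product per (query, key) pair,
# always within a single head.
manual = np.empty((batch, heads, seq_len, seq_len))
for b in range(batch):
    for h in range(heads):
        for i in range(seq_len):
            for j in range(seq_len):
                manual[b, h, i, j] = np.dot(Q[b, h, i], K[b, h, j]) / np.sqrt(d_k)

print(scores.shape)
print(np.allclose(scores, manual))
```

Each `scores[b, h]` is a seq_len x seq_len matrix, which is why a matrix product per head is needed rather than a single inner product.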
from rasa_chatbot_cn.
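After experimenting a bit more: the code runs fine under TensorFlow, but Keras's batch_dot behaves differently under TensorFlow and Theano. Under Theano only the first dimension is treated as the batch, which produces the abnormal result I described at the start: matrix products are also computed across different attention heads.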
Example:
import numpy as np
import keras.backend as K

A = np.int_([[[[1, 2], [5, 6]], [[3, 4], [7, 8]]]])
B = np.int_([[[[7, 8], [11, 13]], [[5, 3], [4, 1]]]])
varA = K.variable(value=A)
varB = K.variable(value=B)
print("A batch. B\n%s" % K.eval(K.batch_dot(varA, varB, axes=[3, 3])))
Result under TensorFlow:
Using TensorFlow backend.
A batch. B
[[[[ 23.  37.]
   [ 83. 133.]]

  [[ 27.  16.]
   [ 59.  36.]]]]
Result under Theano:
Using Theano backend.
A batch. B
[[[[[ 23.  37.]
    [ 11.   6.]]

   [[ 83. 133.]
    [ 43.  26.]]]


  [[[ 53.  85.]
    [ 27.  16.]]

   [[113. 181.]
    [ 59.  36.]]]]]
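The two results can be reproduced without either backend. The following is a minimal NumPy sketch, assuming `np.einsum` as a stand-in for what each backend appears to compute; it is not how Keras implements batch_dot, and the names `per_head` and `cross_head` are made up for illustration.

```python
import numpy as np

A = np.array([[[[1, 2], [5, 6]], [[3, 4], [7, 8]]]], dtype=float)
B = np.array([[[[7, 8], [11, 13]], [[5, 3], [4, 1]]]], dtype=float)

# Both leading axes (b = batch, h = head) are shared between A and B,
# so each head is only multiplied with itself.
per_head = np.einsum("bhik,bhjk->bhij", A, B)

# Only the first axis b is shared; the head axis of B (g) is independent,
# so every head of A is multiplied with every head of B.
cross_head = np.einsum("bhik,bgjk->bhigj", A, B)

print(per_head.shape)    # (1, 2, 2, 2)
print(cross_head.shape)  # (1, 2, 2, 2, 2)
```

The per-head result is the "diagonal" of the cross-head one: `cross_head[:, h, :, h, :]` equals `per_head[:, h]` for each head `h`, and the off-diagonal blocks are the unwanted products between different heads.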
batch_dot should now behave the same under TensorFlow and Theano.
Previously, with the TF backend, the output shape of the following code was (9, 8, 7, 4, 5):
from keras import backend as K
a = K.ones((9, 8, 7, 4, 2))
b = K.ones((9, 8, 7, 2, 5))
c = K.batch_dot(a, b)
print(c.shape)
Now it is (9, 8, 7, 4, 8, 7, 5) there as well.
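Both shapes can be checked with NumPy alone. This is a sketch under the assumption that `np.matmul` mirrors the older TF-backend behavior (all leading axes are batch) and that an `np.einsum` sharing only the first axis mirrors the newer one; the variable names are hypothetical.

```python
import numpy as np

a = np.ones((9, 8, 7, 4, 2))
b = np.ones((9, 8, 7, 2, 5))

# All leading axes (9, 8, 7) treated as batch: a stack of (4, 2) x (2, 5) products.
all_leading_batch = np.matmul(a, b)

# Only the first axis (9) treated as batch: the remaining axes of b
# (8, 7, 5) are appended after the remaining axes of a (8, 7, 4).
first_axis_batch = np.einsum("bxyik,bpqkj->bxyipqj", a, b)

print(all_leading_batch.shape)  # (9, 8, 7, 4, 5)
print(first_axis_batch.shape)   # (9, 8, 7, 4, 8, 7, 5)
```

If the per-head behavior is what the model needs, a call that treats all leading axes as batch (for example a matmul-style op) would be the safer choice, though whether that fits this repository's code is untested here.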
I haven't worked on this part yet, so could you submit a PR?