
Is the get_model method for getting the BERT model correct? (keras-bert, closed)

lwyeah commented on June 10, 2024

Is the get_model method for getting the BERT model correct?

from keras-bert.

Comments (4)

CyberZHG commented on June 10, 2024

The attention module uses both the history_only parameter and the given mask.

https://github.com/CyberZHG/keras-self-attention/blob/f3341547271068243866b1c0ff2512baddf92068/keras_self_attention/scaled_dot_attention.py#L69-L70
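To illustrate how a causal (history_only) restriction and a padding mask can both apply to the same attention weights, here is a minimal pure-Python sketch. It is not the code from keras_self_attention. The function name masked_attention_weights, the key_mask argument, and the small epsilon are assumptions made for this example.

```python
import math


def masked_attention_weights(scores, key_mask=None, history_only=False):
    """Turn raw scores (one row per query, one column per key) into weights.

    key_mask: optional list of bools; False marks a padded key position.
    history_only: if True, query i may only attend to keys j <= i.
    """
    weights = []
    for i, row in enumerate(scores):
        row_max = max(row)
        # Numerically stable exponentials of the raw scores.
        exps = [math.exp(s - row_max) for s in row]
        for j in range(len(exps)):
            if history_only and j > i:
                exps[j] = 0.0  # hide future positions
            if key_mask is not None and not key_mask[j]:
                exps[j] = 0.0  # hide padded positions
        total = sum(exps) + 1e-9  # epsilon avoids division by zero
        weights.append([e / total for e in exps])
    return weights


# Three positions with equal scores; the last position is padding.
scores = [[0.0, 0.0, 0.0] for _ in range(3)]
key_mask = [True, True, False]
weights = masked_attention_weights(scores, key_mask, history_only=True)
for row in weights:
    print([round(w, 3) for w in row])
```

In this sketch both conditions zero out entries before normalization, so a key is visible only if it is neither in the future nor padded.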


lwyeah commented on June 10, 2024

Yes, the attention module in keras_self_attention has a mask parameter. But the _attention_builder function returned by attention_builder receives only one argument, x (i.e. the inputs parameter of ScaledDotProductAttention), so the mask parameter is None. As a result, only the history_only parameter controls the self-attention mask in the transformer encoder.

Below is the code in keras-transformer/keras_transformer/transformer.py:

def attention_builder(name,
                      head_num,
                      activation,
                      history_only,
                      trainable=True):
    """Get multi-head self-attention builder.

    :param name: Prefix of names for internal layers.
    :param head_num: Number of heads in multi-head self-attention.
    :param activation: Activation for multi-head self-attention.
    :param history_only: Only use history data.
    :param trainable: Whether the layer is trainable.
    :return:
    """
    def _attention_builder(x):
        return MultiHeadAttention(
            head_num=head_num,
            activation=activation,
            history_only=history_only,
            trainable=trainable,
            name=name,
        )(x)
    return _attention_builder


CyberZHG commented on June 10, 2024

Note that the second () in MultiHeadAttention(...)(x) invokes __call__(...), not call(...) directly.
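The distinction matters because in Keras the layer's __call__ wrapper collects any mask attached to the input tensor (the _keras_mask attribute) and passes it to call as the mask argument. The following is a toy model of that dispatch in plain Python, not Keras's actual implementation. ToyTensor, ToyLayer and ToyAttention are hypothetical names used only for illustration.

```python
class ToyTensor:
    """Stand-in for a tensor that may carry a mask as metadata."""

    def __init__(self, values, mask=None):
        self.values = values
        self._keras_mask = mask  # attribute name borrowed from Keras


class ToyLayer:
    """Toy layer base class: __call__ wraps call and supplies the mask."""

    def __call__(self, inputs):
        # The wrapper, not the caller, looks up the mask on the input.
        mask = getattr(inputs, "_keras_mask", None)
        outputs = self.call(inputs, mask=mask)
        # Re-attach the mask so downstream layers can see it too.
        outputs._keras_mask = mask
        return outputs

    def call(self, inputs, mask=None):
        raise NotImplementedError


class ToyAttention(ToyLayer):
    def call(self, inputs, mask=None):
        # Record what arrived, to show that the mask was not None.
        self.seen_mask = mask
        return ToyTensor(inputs.values)


x = ToyTensor([5, 7, 0], mask=[True, True, False])
layer = ToyAttention()
y = layer(x)  # only x is passed, yet call() receives the mask
print(layer.seen_mask)
```

So even though _attention_builder passes only x, the mask still reaches call as long as x carries one.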


lwyeah commented on June 10, 2024

Oh, god, thank you very much. I have read the "Understanding masking & padding" guide in the Keras docs, and now I understand _keras_mask. Thank you again for taking the time to answer my questions.
