Will you share pre-training code of StructBERT?

alibaba commented on April 27, 2024

Will you share the pre-training code of StructBERT?

from alicemind.

Comments (5)

wangwei7175878 commented on April 27, 2024

We currently have no plans to open-source the pre-training code. Please refer to the paper for more details.


kaansonmezoz commented on April 27, 2024

Thanks for the clarification, @wangwei7175878.

I've tried to implement my own version of the Word Structural Objective (WSO).
Some parts of the paper were unclear to me, so I had to come up with my own solutions.

Could you please answer the following questions?

1. Is it possible for a sentence to have more than one shuffled trigram?
2. Since WSO is a single-sentence task, did you calculate the WSO loss from a real single sentence or from a sequence of sentences? (i.e., did you predict shuffled tokens from [CLS] SentA [SEP] or from [CLS] SentA. SentB. SentC [SEP]?)
3. Did you use the same head for MLM and WSO (e.g., Huggingface's BertOnlyMLMHead)? MLM and WSO both predict token ids and differ only in their labels: MLM takes labels for the masked tokens, and WSO takes labels for the shuffled tokens. So it seems the same head could be used for both.
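To make the third question concrete, here is a minimal sketch of how both objectives could feed one token-prediction head: merge the per-position MLM and WSO labels into a single label list and let the loss skip everything else. The function name `merge_labels` is hypothetical, the ignore value -100 is assumed to match the default `ignore_index` of PyTorch's `CrossEntropyLoss`, and the rule that a position cannot be both masked and shuffled is my own assumption, not something stated in the paper.

```python
IGNORE_INDEX = -100  # assumption: positions with this label are skipped by the loss


def merge_labels(mlm_labels, wso_labels, ignore_index=IGNORE_INDEX):
    """Combine MLM and WSO labels so a single head can be trained on both.

    Each input holds the original token id at supervised positions and
    ignore_index everywhere else.
    """
    if len(mlm_labels) != len(wso_labels):
        raise ValueError("label lists must have the same length")
    merged = []
    for pos, (mlm, wso) in enumerate(zip(mlm_labels, wso_labels)):
        if mlm != ignore_index and wso != ignore_index:
            # Assumption: a token is either masked or shuffled, never both.
            raise ValueError(f"position {pos} is both masked and shuffled")
        merged.append(mlm if mlm != ignore_index else wso)
    return merged
```

With merged labels like these, the head's logits could go through one cross-entropy loss, which is what makes sharing the head straightforward.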


wangwei7175878 commented on April 27, 2024

1. Yes, there can be more than one trigram. Section 2.4: "5% of trigrams are selected for random shuffling"
2. We calculate the WSO loss from a sequence of sentences.
3. Yes, we use the same head for MLM and WSO.
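For anyone else implementing this, the answers above could be sketched roughly as follows. This is not the official pre-training code, only an illustration under stated assumptions: `shuffle_trigrams` is a hypothetical helper, trigrams are chosen by a left-to-right scan with a 5% chance per window, selected trigrams never overlap, windows containing special tokens such as [CLS] or [SEP] are skipped, and -100 marks positions the loss should ignore. The input is the whole token sequence, so several trigrams across several sentences can be shuffled.

```python
import random

IGNORE_INDEX = -100  # assumption: label value that the loss function skips


def shuffle_trigrams(token_ids, special_ids, select_prob=0.05, rng=None):
    """Return (shuffled_ids, labels) for a WSO-style objective.

    labels hold the original token id at every position inside a selected
    trigram and IGNORE_INDEX everywhere else.
    """
    rng = rng or random.Random()
    token_ids = list(token_ids)
    shuffled = list(token_ids)
    labels = [IGNORE_INDEX] * len(token_ids)
    i = 0
    while i + 3 <= len(token_ids):
        window = token_ids[i:i + 3]
        has_special = any(tok in special_ids for tok in window)
        if not has_special and rng.random() < select_prob:
            permuted = list(window)
            rng.shuffle(permuted)  # may occasionally leave the order unchanged
            shuffled[i:i + 3] = permuted
            labels[i:i + 3] = window  # targets are the tokens in correct order
            i += 3  # jump past this trigram so selections never overlap
        else:
            i += 1
    return shuffled, labels


# Usage: the ids below are placeholders, not real vocabulary ids.
ids = [101] + list(range(1000, 1030)) + [102]
shuffled_ids, wso_labels = shuffle_trigrams(ids, special_ids={101, 102})
```

How shuffling interacts with masked positions is not covered in this thread, so the sketch leaves masking out entirely.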


kaansonmezoz commented on April 27, 2024

Thanks for the explanation, @wangwei7175878.
