RuntimeError: Error(s) in loading state_dict for FastTransformer:

Question

size mismatch for encoder.out.weight: copying a param with shape torch.Size([36377, 27
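A size mismatch on an output projection such as `encoder.out.weight` often means the vocabulary the model was built with differs from the one used at training time, although the thread does not confirm this as the cause here. The sketch below is one way to list which parameters disagree before calling `load_state_dict`. It works on plain name-to-shape mappings; the helper name `find_shape_mismatches`, the second parameter name, and all shapes other than the 36377 rows reported in the error are hypothetical placeholders.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for parameters
    present in both mappings whose shapes differ."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# With PyTorch, a shape mapping can be built from a state dict as
#   {k: tuple(v.shape) for k, v in state_dict.items()}
# The values below are placeholders: only the 36377 rows come from the error
# message; the hidden size (4) and the model-side vocabulary (30000) are made up.
checkpoint_shapes = {
    "encoder.out.weight": (36377, 4),
    "encoder.layers.0.weight": (4, 4),  # hypothetical parameter name
}
model_shapes = {
    "encoder.out.weight": (30000, 4),
    "encoder.layers.0.weight": (4, 4),
}

for name, (ckpt, model) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```

If the only mismatched entries have the vocabulary size as a dimension, checking that decoding uses the same vocabulary file as training is a reasonable first step.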

mohengzxr · Answer

Decoding from NA models, why? --load_vocab: why does a size mismatch appear for enc

mansimov · Answer

Hm it is a bit hard for me to figure out why this happens.
Haven't used this codeb

mohengzxr · Answer

How do --lr_schedule anneal and transformer differ?

mansimov · Answer

--lr_schedule anneal just anneals the learning rate starting from a certain value
--
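As a rough illustration of the difference between the two options, here is a minimal sketch. It assumes `anneal` means a linear decay from a starting learning rate to a final one, and that `transformer` means the warmup-then-inverse-square-root schedule from "Attention Is All You Need". The function names and every default value are assumptions for illustration, not taken from this codebase.

```python
def anneal_lr(step, lr_start=3e-4, lr_end=1e-5, anneal_steps=250_000):
    """Linear anneal (assumed form): decay from lr_start to lr_end over
    anneal_steps updates, then hold at lr_end."""
    frac = min(step, anneal_steps) / anneal_steps
    return lr_start + (lr_end - lr_start) * frac


def transformer_lr(step, d_model=512, warmup=4000):
    """Transformer schedule: linear warmup for `warmup` steps, then decay
    proportional to the inverse square root of the step number."""
    step = max(step, 1)  # avoid 0 ** -0.5 at the first update
    return d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)


for step in (1, 2_000, 4_000, 50_000, 250_000):
    print(f"step {step:>7}: anneal={anneal_lr(step):.2e}  transformer={transformer_lr(step):.2e}")
```

Under these assumptions the two schedules differ most early in training: the anneal schedule starts at its maximum, while the transformer schedule starts near zero and peaks at the end of warmup.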

mohengzxr · Answer

Does choosing anneal versus transformer affect BLEU?

mohengzxr · Answer

Why does the BLEU score differ between anneal and transformer?
What is the

mansimov · Answer

Anneal works well for IWSLT; transformer works well for WMT.
In general I advi

mohengzxr · Answer

In the experiment, we chose the Uyghur-Chinese corpus.
When using anneal and trans

mohengzxr · Answer

I am very interested in this paper and would like to do some research that builds on your work.

mohengzxr · Answer

I really hope to get your help.

mansimov · Answer

I am glad that you are interested :)

It is very hard to say why resu

mohengzxr · Answer

@jasonleeinf

mansimov · Answer

Hi

RuntimeError: Error(s) in loading state_dict for FastTransformer: about dl4mt-nonauto HOT 16 CLOSED