GithubHelp home page GithubHelp logo

dialogpt2-interact's People

Contributors

andreamad8 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

dialogpt2-interact's Issues

While running with medium model, missing some keys in state_dict

`psoni@blr2-lnxwk-071:~/DialoGPT$ sudo python3 interact.py --model_name_or_path ./models/medium --load_checkpoint ./models/medium/medium_ft.pkl --top_k 0
Found existing ./models folder, skip creating a new one!
03/06/2020 19:51:21 - INFO - main - Downloading models...
03/06/2020 19:51:21 - INFO - demo_utils - ./models/medium/config.json exists, return!
03/06/2020 19:51:21 - INFO - demo_utils - ./models/medium/vocab.json exists, return!
03/06/2020 19:51:21 - INFO - demo_utils - ./models/medium/merges.txt exists, return!
03/06/2020 19:51:21 - INFO - demo_utils - ./models/medium/pytorch_model.bin exists, return!
03/06/2020 19:51:21 - INFO - demo_utils - ./models/medium/medium_ft.pkl exists, return!
03/06/2020 19:51:21 - INFO - main - Done!

03/06/2020 19:51:21 - INFO - pytorch_pretrained_bert.tokenization_gpt2 - loading vocabulary file ./models/medium/vocab.json
03/06/2020 19:51:21 - INFO - pytorch_pretrained_bert.tokenization_gpt2 - loading merges file ./models/medium/merges.txt
03/06/2020 19:51:22 - INFO - gpt2_training.train_utils - loading finetuned model from ./models/medium/medium_ft.pkl
Traceback (most recent call last):
File "interact.py", line 203, in
run_model()
File "interact.py", line 147, in run_model
model = load_model(GPT2LMHeadModel(config), args.load_checkpoint, args, verbose=True)
File "/home/psoni/DialoGPT/gpt2_training/train_utils.py", line 39, in load_model
start_model.load_state_dict(model_state_dict)
File "/home/psoni/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 830, in load_state_dict
self.class.name, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for GPT2LMHeadModel:
Missing key(s) in state_dict: "transformer.h.12.ln_1.weight", "transformer.h.12.ln_1.bias", "transformer.h.12.attn.bias", "transformer.h.12.attn.c_attn.weight", "transformer.h.12.attn.c_attn.bias", "transformer.h.12.attn.c_proj.weight", "transformer.h.12.attn.c_proj.bias", "transformer.h.12.ln_2.weight", "transformer.h.12.ln_2.bias", "transformer.h.12.mlp.c_fc.weight", "transformer.h.12.mlp.c_fc.bias", "transformer.h.12.mlp.c_proj.weight", "transformer.h.12.mlp.c_proj.bias", "transformer.h.13.ln_1.weight", "transformer.h.13.ln_1.bias", "transformer.h.13.attn.bias", "transformer.h.13.attn.c_attn.weight", "transformer.h.13.attn.c_attn.bias", "transformer.h.13.attn.c_proj.weight", "transformer.h.13.attn.c_proj.bias", "transformer.h.13.ln_2.weight", "transformer.h.13.ln_2.bias", "transformer.h.13.mlp.c_fc.weight", "transformer.h.13.mlp.c_fc.bias", "transformer.h.13.mlp.c_proj.weight", "transformer.h.13.mlp.c_proj.bias", "transformer.h.14.ln_1.weight", "transformer.h.14.ln_1.bias", "transformer.h.14.attn.bias", "transformer.h.14.attn.c_attn.weight", "transformer.h.14.attn.c_attn.bias", "transformer.h.14.attn.c_proj.weight", "transformer.h.14.attn.c_proj.bias", "transformer.h.14.ln_2.weight", "transformer.h.14.ln_2.bias", "transformer.h.14.mlp.c_fc.weight", "transformer.h.14.mlp.c_fc.bias", "transformer.h.14.mlp.c_proj.weight", "transformer.h.14.mlp.c_proj.bias", "transformer.h.15.ln_1.weight", "transformer.h.15.ln_1.bias", "transformer.h.15.attn.bias", "transformer.h.15.attn.c_attn.weight", "transformer.h.15.attn.c_attn.bias", "transformer.h.15.attn.c_proj.weight", "transformer.h.15.attn.c_proj.bias", "transformer.h.15.ln_2.weight", "transformer.h.15.ln_2.bias", "transformer.h.15.mlp.c_fc.weight", "transformer.h.15.mlp.c_fc.bias", "transformer.h.15.mlp.c_proj.weight", "transformer.h.15.mlp.c_proj.bias", "transformer.h.16.ln_1.weight", "transformer.h.16.ln_1.bias", "transformer.h.16.attn.bias", "transformer.h.16.attn.c_attn.weight", "transformer.h.16.attn.c_attn.bias", "transformer.h.16.attn.c_proj.weight", "transformer.h.16.attn.c_proj.bias", "transformer.h.16.ln_2.weight", "transformer.h.16.ln_2.bias", "transformer.h.16.mlp.c_fc.weight", "transformer.h.16.mlp.c_fc.bias", "transformer.h.16.mlp.c_proj.weight", "transformer.h.16.mlp.c_proj.bias", "transformer.h.17.ln_1.weight", "transformer.h.17.ln_1.bias", "transformer.h.17.attn.bias", "transformer.h.17.attn.c_attn.weight", "transformer.h.17.attn.c_attn.bias", "transformer.h.17.attn.c_proj.weight", "transformer.h.17.attn.c_proj.bias", "transformer.h.17.ln_2.weight", "transformer.h.17.ln_2.bias", "transformer.h.17.mlp.c_fc.weight", "transformer.h.17.mlp.c_fc.bias", "transformer.h.17.mlp.c_proj.weight", "transformer.h.17.mlp.c_proj.bias", "transformer.h.18.ln_1.weight", "transformer.h.18.ln_1.bias", "transformer.h.18.attn.bias", "transformer.h.18.attn.c_attn.weight", "transformer.h.18.attn.c_attn.bias", "transformer.h.18.attn.c_proj.weight", "transformer.h.18.attn.c_proj.bias", "transformer.h.18.ln_2.weight", "transformer.h.18.ln_2.bias", "transformer.h.18.mlp.c_fc.weight", "transformer.h.18.mlp.c_fc.bias", "transformer.h.18.mlp.c_proj.weight", "transformer.h.18.mlp.c_proj.bias", "transformer.h.19.ln_1.weight", "transformer.h.19.ln_1.bias", "transformer.h.19.attn.bias", "transformer.h.19.attn.c_attn.weight", "transformer.h.19.attn.c_attn.bias", "transformer.h.19.attn.c_proj.weight", "transformer.h.19.attn.c_proj.bias", "transformer.h.19.ln_2.weight", "transformer.h.19.ln_2.bias", "transformer.h.19.mlp.c_fc.weight", "transformer.h.19.mlp.c_fc.bias", "transformer.h.19.mlp.c_proj.weight", "transformer.h.19.mlp.c_proj.bias", "transformer.h.20.ln_1.weight", "transformer.h.20.ln_1.bias", "transformer.h.20.attn.bias", "transformer.h.20.attn.c_attn.weight", "transformer.h.20.attn.c_attn.bias", "transformer.h.20.attn.c_proj.weight", "transformer.h.20.attn.c_proj.bias", "transformer.h.20.ln_2.weight", "transformer.h.20.ln_2.bias", "transformer.h.20.mlp.c_fc.weight", "transformer.h.20.mlp.c_fc.bias", "transformer.h.20.mlp.c_proj.weight", "transformer.h.20.mlp.c_proj.bias", "transformer.h.21.ln_1.weight", "transformer.h.21.ln_1.bias", "transformer.h.21.attn.bias", "transformer.h.21.attn.c_attn.weight", "transformer.h.21.attn.c_attn.bias", "transformer.h.21.attn.c_proj.weight", "transformer.h.21.attn.c_proj.bias", "transformer.h.21.ln_2.weight", "transformer.h.21.ln_2.bias", "transformer.h.21.mlp.c_fc.weight", "transformer.h.21.mlp.c_fc.bias", "transformer.h.21.mlp.c_proj.weight", "transformer.h.21.mlp.c_proj.bias", "transformer.h.22.ln_1.weight", "transformer.h.22.ln_1.bias", "transformer.h.22.attn.bias", "transformer.h.22.attn.c_attn.weight", "transformer.h.22.attn.c_attn.bias", "transformer.h.22.attn.c_proj.weight", "transformer.h.22.attn.c_proj.bias", "transformer.h.22.ln_2.weight", "transformer.h.22.ln_2.bias", "transformer.h.22.mlp.c_fc.weight", "transformer.h.22.mlp.c_fc.bias", "transformer.h.22.mlp.c_proj.weight", "transformer.h.22.mlp.c_proj.bias", "transformer.h.23.ln_1.weight", "transformer.h.23.ln_1.bias", "transformer.h.23.attn.bias", "transformer.h.23.attn.c_attn.weight", "transformer.h.23.attn.c_attn.bias", "transformer.h.23.attn.c_proj.weight", "transformer.h.23.attn.c_proj.bias", "transformer.h.23.ln_2.weight", "transformer.h.23.ln_2.bias", "transformer.h.23.mlp.c_fc.weight", "transformer.h.23.mlp.c_fc.bias", "transformer.h.23.mlp.c_proj.weight", "transformer.h.23.mlp.c_proj.bias".
size mismatch for transformer.wte.weight: copying a param with shape torch.Size([50257, 768]) from checkpoint, the shape in current model is torch.Size([50257, 1024]).
size mismatch for transformer.wpe.weight: copying a param with shape torch.Size([1024, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for transformer.h.0.ln_1.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.0.ln_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.0.attn.c_attn.weight: copying a param with shape torch.Size([768, 2304]) from checkpoint, the shape in current model is torch.Size([1024, 3072]).
size mismatch for transformer.h.0.attn.c_attn.bias: copying a param with shape torch.Size([2304]) from checkpoint, the shape in current model is torch.Size([3072]).
size mismatch for transformer.h.0.attn.c_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for transformer.h.0.attn.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.0.ln_2.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.0.ln_2.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.0.mlp.c_fc.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for transformer.h.0.mlp.c_fc.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for transformer.h.0.mlp.c_proj.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for transformer.h.0.mlp.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.1.ln_1.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.1.ln_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.1.attn.c_attn.weight: copying a param with shape torch.Size([768, 2304]) from checkpoint, the shape in current model is torch.Size([1024, 3072]).
size mismatch for transformer.h.1.attn.c_attn.bias: copying a param with shape torch.Size([2304]) from checkpoint, the shape in current model is torch.Size([3072]).
size mismatch for transformer.h.1.attn.c_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for transformer.h.1.attn.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.1.ln_2.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.1.ln_2.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.1.mlp.c_fc.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for transformer.h.1.mlp.c_fc.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for transformer.h.1.mlp.c_proj.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for transformer.h.1.mlp.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.2.ln_1.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.2.ln_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.2.attn.c_attn.weight: copying a param with shape torch.Size([768, 2304]) from checkpoint, the shape in current model is torch.Size([1024, 3072]).
size mismatch for transformer.h.2.attn.c_attn.bias: copying a param with shape torch.Size([2304]) from checkpoint, the shape in current model is torch.Size([3072]).
size mismatch for transformer.h.2.attn.c_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for transformer.h.2.attn.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.2.ln_2.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.2.ln_2.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.2.mlp.c_fc.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for transformer.h.2.mlp.c_fc.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for transformer.h.2.mlp.c_proj.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for transformer.h.2.mlp.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.3.ln_1.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.3.ln_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.3.attn.c_attn.weight: copying a param with shape torch.Size([768, 2304]) from checkpoint, the shape in current model is torch.Size([1024, 3072]).
size mismatch for transformer.h.3.attn.c_attn.bias: copying a param with shape torch.Size([2304]) from checkpoint, the shape in current model is torch.Size([3072]).
size mismatch for transformer.h.3.attn.c_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for transformer.h.3.attn.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.3.ln_2.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.3.ln_2.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.3.mlp.c_fc.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for transformer.h.3.mlp.c_fc.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for transformer.h.3.mlp.c_proj.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for transformer.h.3.mlp.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.4.ln_1.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.4.ln_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.4.attn.c_attn.weight: copying a param with shape torch.Size([768, 2304]) from checkpoint, the shape in current model is torch.Size([1024, 3072]).
size mismatch for transformer.h.4.attn.c_attn.bias: copying a param with shape torch.Size([2304]) from checkpoint, the shape in current model is torch.Size([3072]).
size mismatch for transformer.h.4.attn.c_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for transformer.h.4.attn.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.4.ln_2.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.4.ln_2.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.4.mlp.c_fc.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for transformer.h.4.mlp.c_fc.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for transformer.h.4.mlp.c_proj.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for transformer.h.4.mlp.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.5.ln_1.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.5.ln_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.5.attn.c_attn.weight: copying a param with shape torch.Size([768, 2304]) from checkpoint, the shape in current model is torch.Size([1024, 3072]).
size mismatch for transformer.h.5.attn.c_attn.bias: copying a param with shape torch.Size([2304]) from checkpoint, the shape in current model is torch.Size([3072]).
size mismatch for transformer.h.5.attn.c_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for transformer.h.5.attn.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.5.ln_2.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.5.ln_2.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.5.mlp.c_fc.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for transformer.h.5.mlp.c_fc.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for transformer.h.5.mlp.c_proj.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for transformer.h.5.mlp.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.6.ln_1.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.6.ln_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.6.attn.c_attn.weight: copying a param with shape torch.Size([768, 2304]) from checkpoint, the shape in current model is torch.Size([1024, 3072]).
size mismatch for transformer.h.6.attn.c_attn.bias: copying a param with shape torch.Size([2304]) from checkpoint, the shape in current model is torch.Size([3072]).
size mismatch for transformer.h.6.attn.c_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for transformer.h.6.attn.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.6.ln_2.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.6.ln_2.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.6.mlp.c_fc.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for transformer.h.6.mlp.c_fc.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for transformer.h.6.mlp.c_proj.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for transformer.h.6.mlp.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.7.ln_1.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.7.ln_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.7.attn.c_attn.weight: copying a param with shape torch.Size([768, 2304]) from checkpoint, the shape in current model is torch.Size([1024, 3072]).
size mismatch for transformer.h.7.attn.c_attn.bias: copying a param with shape torch.Size([2304]) from checkpoint, the shape in current model is torch.Size([3072]).
size mismatch for transformer.h.7.attn.c_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for transformer.h.7.attn.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.7.ln_2.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.7.ln_2.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.7.mlp.c_fc.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for transformer.h.7.mlp.c_fc.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for transformer.h.7.mlp.c_proj.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for transformer.h.7.mlp.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.8.ln_1.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.8.ln_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.8.attn.c_attn.weight: copying a param with shape torch.Size([768, 2304]) from checkpoint, the shape in current model is torch.Size([1024, 3072]).
size mismatch for transformer.h.8.attn.c_attn.bias: copying a param with shape torch.Size([2304]) from checkpoint, the shape in current model is torch.Size([3072]).
size mismatch for transformer.h.8.attn.c_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for transformer.h.8.attn.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.8.ln_2.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.8.ln_2.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.8.mlp.c_fc.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for transformer.h.8.mlp.c_fc.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for transformer.h.8.mlp.c_proj.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for transformer.h.8.mlp.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.9.ln_1.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.9.ln_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.9.attn.c_attn.weight: copying a param with shape torch.Size([768, 2304]) from checkpoint, the shape in current model is torch.Size([1024, 3072]).
size mismatch for transformer.h.9.attn.c_attn.bias: copying a param with shape torch.Size([2304]) from checkpoint, the shape in current model is torch.Size([3072]).
size mismatch for transformer.h.9.attn.c_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for transformer.h.9.attn.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.9.ln_2.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.9.ln_2.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.9.mlp.c_fc.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for transformer.h.9.mlp.c_fc.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for transformer.h.9.mlp.c_proj.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for transformer.h.9.mlp.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.10.ln_1.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.10.ln_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.10.attn.c_attn.weight: copying a param with shape torch.Size([768, 2304]) from checkpoint, the shape in current model is torch.Size([1024, 3072]).
size mismatch for transformer.h.10.attn.c_attn.bias: copying a param with shape torch.Size([2304]) from checkpoint, the shape in current model is torch.Size([3072]).
size mismatch for transformer.h.10.attn.c_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for transformer.h.10.attn.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.10.ln_2.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.10.ln_2.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.10.mlp.c_fc.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for transformer.h.10.mlp.c_fc.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for transformer.h.10.mlp.c_proj.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for transformer.h.10.mlp.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.11.ln_1.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.11.ln_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.11.attn.c_attn.weight: copying a param with shape torch.Size([768, 2304]) from checkpoint, the shape in current model is torch.Size([1024, 3072]).
size mismatch for transformer.h.11.attn.c_attn.bias: copying a param with shape torch.Size([2304]) from checkpoint, the shape in current model is torch.Size([3072]).
size mismatch for transformer.h.11.attn.c_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for transformer.h.11.attn.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.11.ln_2.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.11.ln_2.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.h.11.mlp.c_fc.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for transformer.h.11.mlp.c_fc.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for transformer.h.11.mlp.c_proj.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for transformer.h.11.mlp.c_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.ln_f.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for transformer.ln_f.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for lm_head.decoder.weight: copying a param with shape torch.Size([50257, 768]) from checkpoint, the shape in current model is torch.Size([50257, 1024]).
psoni@blr2-lnxwk-071:~/DialoGPT$
`

Add License

I know this is mostly on top of MIT licensed software, but still.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.