
yunxinli / ndcr

A Neural Divide-and-Conquer Reasoning Framework for Multimodal Reasoning on Linguistically Complex Text and Similar Images

License: Apache License 2.0

Python 100.00%

ndcr's Issues

"decoder_input_ids=None" causes an error

When training the model, I encountered the error below:

Exception has occurred: AttributeError
'NoneType' object has no attribute 'shape'
File "/home/NDCR/OFA/transformers/src/transformers/models/ofa/modeling_ofa.py", line 1901, in forward
~encoder_outputs.padding_mask, encoder_hidden_states.dtype, decoder_input_ids.shape[-1]
File "/home/NDCR/OFA_encoder_Divide_and_Conquer.py", line 204, in forward
gen = self.OFA(input_ids_context, patch_images=global_image, decoder_input_ids=None)
File "/home/NDCR/OFA_encoder_Divide_and_Conquer.py", line 742, in
contextual_clip(images, text, pos_mask, None, str(img_dir), text_=None, input_ids=input_ids)
AttributeError: 'NoneType' object has no attribute 'shape'

RuntimeError: Error(s) in loading state_dict for ContextualCLIP: size mismatch XXXXXXX

Hello author,
I encountered the following error when I tried to reproduce the results of the paper. It seems that the parameter shapes in the checkpoint (pretrain_BART_generator_coldstart_OFA) you provided on Hugging Face don't match the "current model": the checkpoint tensors use a hidden size of 768, while the current model expects 1024.

Traceback (most recent call last):
File "/datasata0/cloud-wuzhengyuan/lxj/NDCR/OFA_encoder_Divide_and_Conquer.py", line 426, in
contextual_clip.load_state_dict(checkpoint['model_state_dict'], False)
File "/root/miniconda3/envs/blip/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2152, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ContextualCLIP:
size mismatch for text_encoder.model.shared.weight: copying a param with shape torch.Size([50265, 768]) from checkpoint, the shape in current model is torch.Size([50265, 1024]).
size mismatch for text_encoder.model.encoder.embed_tokens.weight: copying a param with shape torch.Size([50265, 768]) from checkpoint, the shape in current model is torch.Size([50265, 1024]).
size mismatch for text_encoder.model.encoder.embed_positions.weight: copying a param with shape torch.Size([1026, 768]) from checkpoint, the shape in current model is torch.Size([1026, 1024]).
size mismatch for text_encoder.model.encoder.layers.0.self_attn.k_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for text_encoder.model.encoder.layers.0.self_attn.k_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for text_encoder.model.encoder.layers.0.self_attn.v_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for text_encoder.model.encoder.layers.0.self_attn.v_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for text_encoder.model.encoder.layers.0.self_attn.q_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for text_encoder.model.encoder.layers.0.self_attn.q_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for text_encoder.model.encoder.layers.0.self_attn.out_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for text_encoder.model.encoder.layers.0.self_attn.out_proj.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for text_encoder.model.encoder.layers.0.self_attn_layer_norm.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for text_encoder.model.encoder.layers.0.self_attn_layer_norm.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
......
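
To narrow this down, it can help to list every parameter whose shape differs between the checkpoint and the model before calling load_state_dict. The snippet below is a minimal sketch, not code from the repository: find_shape_mismatches is a hypothetical helper, and it works on plain dictionaries of shapes so it runs without PyTorch. The two example entries are copied from the traceback above.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for shared parameters whose shapes differ.

    Hypothetical helper for illustration; names present in only one mapping are ignored.
    """
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and model_shape != ckpt_shape:
            mismatches.append((name, ckpt_shape, model_shape))
    return mismatches


# Shapes copied from the first and fifth "size mismatch" lines of the traceback above.
checkpoint_shapes = {
    "text_encoder.model.shared.weight": (50265, 768),
    "text_encoder.model.encoder.layers.0.self_attn.k_proj.bias": (768,),
}
model_shapes = {
    "text_encoder.model.shared.weight": (50265, 1024),
    "text_encoder.model.encoder.layers.0.self_attn.k_proj.bias": (1024,),
}

for name, ckpt_shape, model_shape in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

With PyTorch, the two mappings could be built as {k: tuple(v.shape) for k, v in checkpoint['model_state_dict'].items()} and the same comprehension over contextual_clip.state_dict(). As the traceback shows, passing False for strict does not skip size mismatches; it only relaxes the check for missing or unexpected keys. A consistent 768 versus 1024 gap across all layers would suggest, as an assumption to verify against the repository's configuration, that the checkpoint was trained with a base-sized text encoder while the script instantiates a large-sized one.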
