Comments (6)

bloodhunt3r commented on May 27, 2024

Replace the following code:

https://github.com/xinyu1205/Tag2Text/blob/9f6866e115ed3026d748bc67de67a9d428df1016/models/bert.py#L224-L229

with the following code, which trims the key, value, and attention mask to the query's batch size:

        # Workaround: if key/value carry a larger batch than the query,
        # keep only the first query_layer.shape[0] entries so the shapes match.
        if key_layer.shape[0] > query_layer.shape[0]:
            key_layer = key_layer[:query_layer.shape[0], :, :, :]
            attention_mask = attention_mask[:query_layer.shape[0], :, :]
            value_layer = value_layer[:query_layer.shape[0], :, :, :]
        attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
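
For anyone who wants to see the effect in isolation, here is a minimal standalone sketch of the same trimming. The helper name `align_batch`, the tensor shapes, and the `None` check on the mask are assumptions for illustration and are not taken from `bert.py`.

```python
import torch


def align_batch(query_layer, key_layer, value_layer, attention_mask=None):
    """Trim key/value (and the mask, if given) to the query's batch size.

    Hypothetical helper that mirrors the workaround above; not part of bert.py.
    """
    n = query_layer.shape[0]
    if key_layer.shape[0] > n:
        key_layer = key_layer[:n]
        value_layer = value_layer[:n]
        # The None guard is an added assumption; the original snippet
        # slices the mask unconditionally.
        if attention_mask is not None:
            attention_mask = attention_mask[:n]
    return key_layer, value_layer, attention_mask


# Assumed shapes: (batch, heads, seq_len, head_dim).
query = torch.randn(3, 12, 5, 64)
key = torch.randn(9, 12, 7, 64)
value = torch.randn(9, 12, 7, 64)
mask = torch.zeros(9, 1, 1, 7)

key, value, mask = align_batch(query, key, value, mask)

# With matching batch sizes the matmul broadcasts cleanly.
scores = torch.matmul(query, key.transpose(-1, -2))
print(scores.shape)  # torch.Size([3, 12, 5, 7])
```

Note that the slice simply keeps the first rows along the batch dimension. Whether those rows belong to the right samples depends on how the key/value batch was expanded upstream, so it may be worth checking the outputs when running with a batch size larger than 1.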

from recognize-anything.

xinyu1205 commented on May 27, 2024

Thank you for your very valuable feedback. I will look into this issue and respond within the next few days.

deyiluobo commented on May 27, 2024

I had the same problem. Downgrading transformers did not help.

xinyu1205 commented on May 27, 2024

The issue arises because our code is adapted from BLIP. For a quick fix, see the simple solution in this GitHub comment: salesforce/BLIP#142 (comment).

Further modifications are required in the Tag2Text/models/bert.py file to align it with the newer version of transformers. I have added this to my task list, but given my current workload I cannot promise when it will be done. I hope you understand.

Thank you once again for bringing this issue to my attention.

xinyu1205 commented on May 27, 2024

Thank you very much for sharing! If you have already tested this against the corresponding version, you are very welcome to contribute to the project by opening a Pull Request (please add comments in the relevant section of the code). We appreciate your willingness to share and look forward to your contributions.

Qiliqing commented on May 27, 2024

replace the following code:

https://github.com/xinyu1205/Tag2Text/blob/9f6866e115ed3026d748bc67de67a9d428df1016/models/bert.py#L224-L229

with the following code to do alignment:

        if key_layer.shape[0] > query_layer.shape[0]:
            key_layer = key_layer[:query_layer.shape[0], :, :, :]
            attention_mask = attention_mask[:query_layer.shape[0], :, :]
            value_layer = value_layer[:query_layer.shape[0], :, :, :]
        attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))

It works, thanks!
