
Beam search error about meshed-memory-transformer (open)

Faiail commented on August 15, 2024
Beam search error

from meshed-memory-transformer.

Comments (3)

Faiail commented on August 15, 2024

Actually, I checked, and during beam search the shape of the previously generated token changes from bs_ x 1 to 40 x 1, and I truly don't know why. This happens right after the first token is generated: the first input is a batch of start tokens with the correct batch size, but after the first iteration the input tensor (it) has a different shape. This is related to the fact that the previously generated token is fed back as the next input. How can I fix that?
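A shape of 40 x 1 is what beam search usually produces and may not be a bug in itself: after the first step each sample keeps beam_size hypotheses, and implementations commonly flatten them into the batch dimension, so the decoder input becomes (b_s * beam_size) x 1. The sketch below is a minimal illustration of that bookkeeping, not the repository's code. The sizes (b_s = 8 and beam_size = 5, chosen only so that the product is 40), the random scores, and every name other than it are assumptions.

```python
import torch

# Illustrative sizes only (assumed, not taken from the thread): 8 * 5 = 40.
b_s, beam_size, vocab_size, d_model = 8, 5, 100, 16

# Step 0: one start token per sample, shape (b_s, 1).
it = torch.zeros(b_s, 1, dtype=torch.long)

# Stand-in for the decoder's log-probabilities at step 0, shape (b_s, vocab_size).
log_probs = torch.log_softmax(torch.randn(b_s, vocab_size), dim=-1)

# Keep the beam_size best first words for every sample: shape (b_s, beam_size).
_, selected_words = torch.topk(log_probs, beam_size, dim=-1)

# Each surviving hypothesis becomes its own row, so the next input is
# (b_s * beam_size, 1) instead of (b_s, 1).
it_next = selected_words.reshape(-1, 1)

# Per-sample tensors must be repeated the same way to stay aligned with it_next.
enc_output = torch.randn(b_s, 10, d_model)  # hypothetical encoder output
enc_expanded = enc_output.repeat_interleave(beam_size, dim=0)

print(tuple(it.shape), tuple(it_next.shape), tuple(enc_expanded.shape))
# (8, 1) (40, 1) (40, 10, 16)
```

If this is what happens in your run, anything the decoder consumes per sample (encoder output, masks) would need the same expansion before the second step, as in the last lines of the sketch.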


KillerGary commented on August 15, 2024

RuntimeError: gather(): Expected dtype int64 for index


Kuo2023 commented on August 15, 2024

RuntimeError: gather(): Expected dtype int64 for index
I get this too. I had to add .to(torch.int64) to the index argument of every gather call in the code, e.g.
s = torch.gather(s.view([self.b_s, cur_beam_size] + shape[1:]), 1,
                 beam.expand([self.b_s, self.beam_size] + shape[1:]).to(torch.int64))
After that I ran into a different problem:
RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.cuda.FloatTensor instead (while checking arguments for embedding)
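Both errors in this comment point at the same thing: an index tensor that should be integer has become floating point somewhere upstream. One common way this happens in beam search code (an assumption here, not verified against this repository) is recovering the beam index from a flat top-k index with /, which on recent PyTorch versions performs true division and returns a float tensor. The sketch below uses made-up sizes and names to show the effect and one possible fix; torch.div with rounding_mode="floor" needs PyTorch 1.8 or later.

```python
import torch

# Illustrative sizes and names (assumptions, not the repository's values).
b_s, beam_size, vocab_size, d_model = 2, 3, 7, 5

# Scores for every (beam, word) pair, flattened to (b_s, beam_size * vocab_size).
scores = torch.randn(b_s, beam_size * vocab_size)
_, selected_idx = torch.topk(scores, beam_size, dim=-1)  # int64 flat indices

# True division silently produces a float tensor, which gather() rejects as an index.
as_float = selected_idx / vocab_size

# Floor division keeps the indices int64.
selected_beam = torch.div(selected_idx, vocab_size, rounding_mode="floor")
selected_words = selected_idx - selected_beam * vocab_size

# Reorder a per-beam state with the integer beam indices.
state = torch.randn(b_s, beam_size, d_model)
index = selected_beam.unsqueeze(-1).expand(b_s, beam_size, d_model)
reordered = torch.gather(state, 1, index)

# Integer word indices are also what an embedding layer expects.
embedding = torch.nn.Embedding(vocab_size, d_model)
embedded = embedding(selected_words)

print(as_float.dtype, selected_beam.dtype, selected_words.dtype)
```

Fixing the dtype where the indices are computed, rather than casting at every gather call, would also keep the word indices that later reach the embedding layer as integers.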
