
Comments (9)

ValkyriaLenneth commented on July 20, 2024

That's right. Even though Longformer is "long", it only extends the input length to around 1K or 4K tokens.

from longformer_zh.

Finnyhudson commented on July 20, 2024

If I use a two-tower (dual-encoder) model, can I compute similarity directly from the [CLS] vectors of the two texts?


ValkyriaLenneth commented on July 20, 2024

I haven't used a two-tower model myself, but since [CLS] is the vector that represents the whole text, it can be used for similarity computation.
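
As a minimal sketch of the idea above (not code from longformer_zh): assuming each tower has already produced a [CLS] vector for its text, cosine similarity is one common way to compare the two. The function name `cosine_similarity` and the toy vectors are illustrative only; real [CLS] vectors would come from the encoder output at the first position.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy stand-ins for the [CLS] vectors of two texts (hypothetical values).
cls_a = [0.2, -0.1, 0.7, 0.4]
cls_b = [0.1, -0.2, 0.6, 0.5]
score = cosine_similarity(cls_a, cls_b)
```

Whether raw [CLS] vectors give useful similarity scores without task-specific fine-tuning depends on the model and data, so it is worth checking on a small labeled set first.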


Finnyhudson commented on July 20, 2024

OK, thanks.


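Finnyhudson commented on July 20, 2024

One more question: what is your code environment (torch + CUDA versions and the transformers version)? With the original allenai implementation, a transformers version that was too high caused errors in forward, and I had to downgrade to 3.x.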


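ValkyriaLenneth commented on July 20, 2024

As far as I remember, transformers was 3.2, torch 1.1, and CUDA 11. But as long as the model itself is fine, loading the parameters directly should work? If forward raises an error, the API may have changed.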
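
When comparing environments like this, it can help to print the installed versions rather than rely on memory. The following is a small sketch using only the standard library; `installed_versions` is a hypothetical helper name, and the package list is an assumption based on the libraries mentioned in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("torch", "transformers")):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

This reports package versions only. When torch is installed, the CUDA version it was built against is available separately as `torch.version.cuda`.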


Finnyhudson commented on July 20, 2024

OK, thanks again.


Finnyhudson commented on July 20, 2024

Sorry to bother you again. If I want to run with a length of only 1024, should I fix the dataset so that every sample has a length of 1024, or should I change this in the model's config?
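
To make the first option concrete, here is a sketch of fixing every sample to 1024 tokens on the dataset side by truncating or padding the token ids and building a matching attention mask. `MAX_LEN`, `PAD_ID`, and `pad_or_truncate` are hypothetical names, and a pad id of 0 is an assumption; use the pad id of the tokenizer you actually load. It does not settle whether a config change is also needed for this model.

```python
from typing import List, Tuple

MAX_LEN = 1024  # target sequence length from the question above
PAD_ID = 0      # assumed pad token id; check the tokenizer actually used


def pad_or_truncate(
    input_ids: List[int], max_len: int = MAX_LEN, pad_id: int = PAD_ID
) -> Tuple[List[int], List[int]]:
    """Return (ids, attention_mask), both exactly max_len long."""
    ids = list(input_ids[:max_len])   # drop tokens beyond max_len
    mask = [1] * len(ids)             # 1 marks real tokens
    padding = max_len - len(ids)
    ids.extend([pad_id] * padding)    # right-pad short samples
    mask.extend([0] * padding)        # 0 marks padding positions
    return ids, mask
```

Plain truncation like this also cuts off any trailing special token on long samples. With a Hugging Face tokenizer, passing `padding="max_length"`, `truncation=True`, and `max_length=1024` to the tokenizer call is the usual way to get fixed-length inputs while keeping special tokens in place.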


18410080631 commented on July 20, 2024

Which tokenizer should be used with this model?

