Comments (10)

brightmart commented on May 13, 2024

I'm aiming to release it within this week.

from roberta_zh.

Jethu1 commented on May 13, 2024

Can it be released this week? I'm eager to try out how well this small model performs.

brightmart commented on May 13, 2024

It has been delayed.

Shuryne commented on May 13, 2024

Is there a confirmed release date yet for the 6-layer model and the training corpus? I'd like to try out the small model. Thanks 🙏

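brightmart commented on May 13, 2024

Yes. The 6-layer model is training now and will be released within the next couple of days.
We also plan to release a Chinese version of ALBERT soon; it is a recently published, small, high-performance model from Google.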

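Shuryne commented on May 13, 2024

That's great! I only came across ALBERT yesterday; it seems to have been trained on a large number of TPUs. It would be wonderful if you could release it. Thank you very much~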

brightmart commented on May 13, 2024

@csy1998 @Jethu1 A very small model is now ready to try: its parameter count and model size are one tenth of BERT's, and training is twice as fast.
https://github.com/brightmart/albert_zh

KunWangR commented on May 13, 2024

Hi, may I ask roughly when the 6-layer RoBERTa model will be released?

brightmart commented on May 13, 2024

@Jethu1 @csy1998 @KunWangR The 6-layer RoBERTa model (trial version) has been released, so you can try it out. Could you report how it performs on your tasks?

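KunWangR commented on May 13, 2024

On my text similarity matching dataset, the 6-layer RoBERTa model improves accuracy by 2% and the 6-layer ALBERT model by 1%, while the 4-layer ALBERT model not only fails to help but lowers accuracy by about 0.5%. In terms of CPU inference speed, however, only BERT models with 4 layers or fewer meet our CPU response-time requirements. Because I'm worried that distillation would hurt model quality, I hope the author and everyone here can explore together how 3- to 4-layer RoBERTa models do in both accuracy and speed.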
