Comments (10)
Aiming for within this week.
from roberta_zh.
Can it be released this week? I'm quite eager to try out this small model.
from roberta_zh.
It has been delayed.
from roberta_zh.
Is there a confirmed release date yet for the 6-layer model and the training corpus? I'd like to try out the small model. Thanks 🙏
from roberta_zh.
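Yes. The 6-layer model is training now and will be released within the next couple of days.
We also plan to release a Chinese version of ALBERT soon. ALBERT is a recently published Google model that is small yet high-performing.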
from roberta_zh.
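Great! I only came across ALBERT yesterday, and it seems to have been trained on a large number of TPUs. It would be wonderful if you could release it. Thank you very much~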
from roberta_zh.
@csy1998 @Jethu1 An ultra-small model: its parameter count and model size are one tenth of BERT's, and training is twice as fast. It is ready to try:
https://github.com/brightmart/albert_zh
from roberta_zh.
Hi, may I ask the maintainer roughly when the 6-layer RoBERTa model will be released?
from roberta_zh.
@Jethu1 @csy1998 @KunWangR The 6-layer RoBERTa model (preview version) has been released, so please give it a try. Could you report how it performs on your tasks?
from roberta_zh.
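On my text similarity matching dataset, the 6-layer RoBERTa model improves accuracy by 2% and the 6-layer ALBERT model by 1%, while the 4-layer ALBERT model not only fails to help but lowers accuracy by about 0.5%. Judging by CPU inference speed, however, only BERT models with 4 layers or fewer meet the CPU response-time requirement. Because I am worried that distillation would cost model quality, I hope the author and everyone here can explore together how 3-4 layer RoBERTa models do in both accuracy and speed.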
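For anyone who wants to compare layer counts against a CPU response-time budget, here is a minimal timing sketch. It is only an illustration, not code from this repository: `measure_latency_ms` and `dummy_predict` are hypothetical names, and `dummy_predict` is a stand-in that should be replaced with a call into whichever loaded model (3, 4, or 6 layers) is being tested.

```python
import statistics
import time
from typing import Callable, Dict, List, Tuple


def measure_latency_ms(predict: Callable[[str, str], float],
                       pairs: List[Tuple[str, str]],
                       warmup: int = 3) -> Dict[str, float]:
    """Time single-request predictions and summarize latency in milliseconds."""
    # Warm-up calls so one-time setup costs do not skew the timings.
    for a, b in pairs[:warmup]:
        predict(a, b)

    timings = []
    for a, b in pairs:
        start = time.perf_counter()
        predict(a, b)
        timings.append((time.perf_counter() - start) * 1000.0)

    timings.sort()
    # Nearest-rank style index for the 95th percentile, clamped to the list.
    p95_index = min(len(timings) - 1, int(0.95 * len(timings)))
    return {
        "n": float(len(timings)),
        "mean_ms": statistics.mean(timings),
        "p95_ms": timings[p95_index],
    }


def dummy_predict(a: str, b: str) -> float:
    """Hypothetical stand-in for a model call: character-overlap similarity."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / max(1, len(union))


if __name__ == "__main__":
    sample_pairs = [("how do I reset my password", "password reset steps")] * 50
    print(measure_latency_ms(dummy_predict, sample_pairs))
```

Running the same sentence pairs through each candidate model on the target CPU and comparing `p95_ms` against the response-time requirement gives a like-for-like view of the speed side of the trade-off.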
from roberta_zh.