Comments (8)
https://arxiv.org/abs/1806.00920
Referring to the official DRCD paper, the reported BERT baseline is an F1 score of 89.59% and an Exact Match score of 82.34%.
from chinese-bert-wwm.
Could you share the experiment code for the reading-comprehension tasks? With the official BERT code, a single GPU cannot fit a batch size of 64.
See: https://github.com/ymcui/CMRC2018-DRCD-BERT/
On top of that code, only the learning-rate schedule was tuned to improve the baseline results.
python run_cmrc2018_drcd_baseline.py \
  --vocab_file=${PATH_TO_BERT}/multi_cased_L-12_H-768_A-12/vocab.txt \
  --bert_config_file=${PATH_TO_BERT}/multi_cased_L-12_H-768_A-12/bert_config.json \
  --init_checkpoint=${PATH_TO_BERT}/multi_cased_L-12_H-768_A-12/bert_model.ckpt \
  --do_train=True \
  --train_file=${DATA_DIR}/cmrc2018_train.json \
  --do_predict=True \
  --predict_file=${DATA_DIR}/cmrc2018_dev.json \
  --train_batch_size=32 \
  --num_train_epochs=2 \
  --max_seq_length=512 \
  --doc_stride=128 \
  --learning_rate=3e-5 \
  --save_checkpoints_steps=1000 \
  --output_dir=${MODEL_DIR} \
  --do_lower_case=False \
  --use_tpu=False
Were the results in the paper obtained with this configuration?
Please use the hyperparameters from the technical report (batch size 64, learning rate 3e-5, 2 epochs, max length 512); anything not mentioned uses the default configuration.
Also, the command you posted is for multilingual BERT, not Chinese BERT. do_lower_case needs to be set to True, since the Chinese BERT is an uncased model.
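As background on that flag: with do_lower_case=True, BERT's basic tokenizer lowercases the input and strips accent marks before WordPiece; Chinese characters pass through unchanged. A minimal pure-Python sketch of that preprocessing step (an illustration, not the official implementation):

```python
import unicodedata

def bert_lowercase(text: str) -> str:
    # Mimic BERT's do_lower_case=True preprocessing: lowercase the text,
    # then drop combining accent marks after NFD decomposition.
    text = text.lower()
    text = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")

print(bert_lowercase("BERT基线 Résumé"))  # → "bert基线 resume"
```

Chinese characters have no case and no combining marks, so only the embedded Latin text is affected.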
With the code at https://github.com/ymcui/CMRC2018-DRCD-BERT/, a single 32 GB V100 cannot fit a batch size of 64. What GPUs were the experiments in the paper run on? Was the batch size scaled up with synchronized multi-GPU training?
TPU v2
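For anyone hitting the single-GPU memory limit above, a common workaround (not what was done for the paper, which ran on a TPU v2) is gradient accumulation: run several micro-batches, scale each loss, and step once, so the gradient averages over the effective batch. A minimal PyTorch sketch with a hypothetical toy model:

```python
import torch

# Simulate an effective batch of 64 on hardware that only fits
# micro-batches of 16, using gradient accumulation.
torch.manual_seed(0)
model = torch.nn.Linear(8, 1)  # hypothetical stand-in for BERT
optimizer = torch.optim.SGD(model.parameters(), lr=3e-5)

accum_steps = 4   # 4 micro-batches of 16 -> effective batch of 64
micro_batch = 16
data = torch.randn(accum_steps * micro_batch, 8)
target = torch.randn(accum_steps * micro_batch, 1)

optimizer.zero_grad()
for i in range(accum_steps):
    x = data[i * micro_batch:(i + 1) * micro_batch]
    y = target[i * micro_batch:(i + 1) * micro_batch]
    loss = torch.nn.functional.mse_loss(model(x), y)
    # Scale by accum_steps so the summed gradients equal the
    # mean-loss gradient over all 64 samples.
    (loss / accum_steps).backward()
optimizer.step()  # one update for the whole effective batch
```

Note this matches a large batch exactly only for losses that average over samples; layers with batch statistics (e.g. BatchNorm) still see micro-batches.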
reopen if necessary
Related Issues (20)
- Loss issue during pre-training HOT 2
- How many TPUs were used for training? HOT 2
- Does pre-training tokenization use the "##"-prefixed tokens in the Chinese vocab? If so, can a whole-word-masking pre-trained model be used in downstream tasks without word segmentation? HOT 4
- A bit confused: what does "the open-source release does not include the MLM task weights" mean? HOT 3
- RoBERTa-wwm-ext-large does not converge when applied to a brand-new domain HOT 4
- How to download chinese-roberta-wwm-ext.pt ? HOT 1
- Is there a pre-trained model for CJRC? HOT 1
- How the WWM strategy works in code HOT 2
- Help: broken download link HOT 2
- Loss explodes when fine-tuning RoBERTa-wwm-ext-large HOT 2
- Continued pre-training HOT 2
- How to extract word embeddings from a specific layer? HOT 2
- Pre-training data HOT 2
- Can the wwm models be used for word-level fill-mask prediction? HOT 2
- NER issue HOT 2
- Is there any sharing about phoneme-BERT pretrained? HOT 2
- Is an ONNX model available? HOT 2
- How much GPU memory does each model need for inference? HOT 1
- Where can the PyTorch version of chinese-roberta-wwm-ext-large be downloaded reliably and quickly? Hugging Face is too slow and keeps disconnecting. HOT 1
- After switching from BERT to this model, training fails with missing parameters; how should this be fixed? HOT 1