bert-sentence-similarity-pytorch's Introduction

BERT sentence similarity in PyTorch

This repo contains a PyTorch implementation of a pretrained BERT model for the sentence similarity task.

Structure of the code

At the root of the project, you will see:

├── pybert
|  └── callback
|  |  └── lrscheduler.py
|  |  └── trainingmonitor.py
|  |  └── ...
|  └── config
|  |  └── basic_config.py # configuration file for storing model parameters
|  └── dataset
|  └── io
|  |  └── dataset.py
|  |  └── data_transformer.py
|  └── model
|  |  └── nn
|  |  └── pretrain
|  └── output # saved model output
|  └── preprocessing # text preprocessing
|  └── train # used for training a model
|  |  └── trainer.py
|  |  └── ...
|  └── utils # a set of utility functions
├── convert_tf_checkpoint_to_pytorch.py
├── train_bert_atec_nlp.py
├── data_join.py

Dependencies

csv
tqdm
numpy
pickle
scikit-learn
PyTorch 1.0
matplotlib
pandas
pytorch_pretrained_bert (used to load the BERT model)

How to use the code

You need to download the pretrained Chinese BERT model (chinese_L-12_H-768_A-12.zip).

Download the pretrained BERT model from Google and place it in the /pybert/model/pretrain directory.
Install pytorch-pretrained-bert from GitHub using pip.
Run python convert_tf_checkpoint_to_pytorch.py to convert the pretrained model from its TensorFlow format to PyTorch format.
Prepare the ATEC NLP data. You can modify pybert/io/data_transformer.py to adapt it to your own data.
Modify the configuration in pybert/config/basic_config.py (the data paths, ...).
Run python data_join.py.
Run python train_bert_atec_nlp.py.
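
Before running the conversion step, it can help to check that the unpacked model files are where the script will look for them. The sketch below is not part of this repo: the helper name `checkpoint_prefix` is hypothetical, and the file names are what the Google archive is assumed to contain. It returns the checkpoint prefix (bert_model.ckpt), which is the value the conversion step is expected to receive.

```python
from pathlib import Path

# Files the unpacked Google archive is assumed to contain
# (an assumption; this README does not list them).
EXPECTED_FILES = (
    "bert_config.json",
    "vocab.txt",
    "bert_model.ckpt.index",
)


def checkpoint_prefix(pretrain_dir):
    """Return the checkpoint prefix path, or raise if expected files are missing."""
    root = Path(pretrain_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {root}: {', '.join(missing)}")
    # The prefix is not itself a file on disk; the .index and .data-* files
    # are resolved from it when the checkpoint is read.
    return root / "bert_model.ckpt"
```

For example, `checkpoint_prefix("pybert/model/pretrain")` either returns the prefix path or reports which files are missing.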

Tips

When converting the TensorFlow checkpoint to PyTorch, choose "bert_model.ckpt" as the input file, not "bert_model.ckpt.index". Otherwise the model learns nothing and gives almost the same random output for any input, which means the real checkpoint was never loaded.
When using multiple GPUs, non-tensor calculations such as accuracy and f1_score are not supported by the DataParallel instance.
As recommended by Jacob Devlin et al. in the BERT paper (https://arxiv.org/pdf/1810.04805.pdf), the hyperparameters for fine-tuning tasks should be set as follows: Batch_size: 16 or 32; learning_rate: 5e-5, 3e-5, or 2e-5; num_train_epoch: 3 or 4.
The pretrained model limits the input length to at most 512 tokens, the size of the max position embedding. Data flows into the model as: Raw_data -> WordPieces -> Model. Because the WordPiece sequence is generally longer than the raw data, a safe maximum length for the raw data is roughly 128 to 256.
In our tests, fine-tuning all layers gave much better results than fine-tuning only the last classifier layer. The latter is effectively a feature-based approach.
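
The length limit in the tips above applies to WordPieces after the special tokens are added, so a sentence pair has to be trimmed before it reaches the model. The following is a minimal sketch of one common heuristic (dropping tokens from the longer sentence first); whether this repo trims pairs this way is an assumption. The names `truncate_pair`, `build_input`, and `max_seq_len` are hypothetical, and the inputs are assumed to be already tokenized.

```python
MAX_POSITION_EMBEDDINGS = 512  # upper bound stated in the tips above


def truncate_pair(tokens_a, tokens_b, max_seq_len):
    """Trim two WordPiece lists in place to leave room for [CLS] and two [SEP]."""
    if not 3 <= max_seq_len <= MAX_POSITION_EMBEDDINGS:
        raise ValueError("max_seq_len must be between 3 and 512")
    budget = max_seq_len - 3  # reserve [CLS], [SEP], [SEP]
    while len(tokens_a) + len(tokens_b) > budget:
        # Drop from the longer side so both sentences keep some content.
        longer = tokens_a if len(tokens_a) > len(tokens_b) else tokens_b
        longer.pop()


def build_input(tokens_a, tokens_b, max_seq_len=128):
    """Return (tokens, segment_ids) for a sentence pair, trimmed to max_seq_len."""
    a, b = list(tokens_a), list(tokens_b)  # copy so the caller's lists are untouched
    truncate_pair(a, b, max_seq_len)
    tokens = ["[CLS]"] + a + ["[SEP]"] + b + ["[SEP]"]
    # Segment 0 covers [CLS], sentence A and its [SEP]; segment 1 covers the rest.
    segment_ids = [0] * (len(a) + 2) + [1] * (len(b) + 1)
    return tokens, segment_ids
```

Trimming the longer side first keeps short sentences intact, which matters for similarity tasks where one sentence of the pair is often much shorter than the other.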

bert-sentence-similarity-pytorch's Issues

score

What's your score of this model? Thank you!

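Why do I get an F1 score of 0?

I'm running the ATEC data, but the loss and accuracy hover around 0.5 and 80%: the loss does not decrease, and the accuracy stays within [75% - 90+%].
During validation, the F1 score is simply 0.
I get the same result with the Quora Question Pairs data: the loss never decreases.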
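
This thread records no resolution. One pattern consistent with the reported numbers, offered as an assumption and not a confirmed diagnosis, is a model that predicts the negative class for every pair: if about 80% of the pairs are negative, accuracy sits near 80% while the F1 score for the positive class is 0. The sketch below uses made-up toy labels to show the arithmetic; `accuracy_and_f1` is a hypothetical helper, not a function from this repo.

```python
def accuracy_and_f1(y_true, y_pred):
    """Compute accuracy and positive-class F1 for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1


# Toy data (made up): 2 positive pairs, 8 negative pairs.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# A degenerate model that always predicts "not similar".
y_pred = [0] * 10

accuracy, f1 = accuracy_and_f1(y_true, y_pred)
print(accuracy, f1)  # 0.8 0.0
```

Checking the distribution of predicted labels on the validation set is a quick way to tell whether this is what is happening.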

ImportError: cannot import name 'PreTrainedBertModel'

When I run python train_bert_atec_nlp.py, I get the error:

Traceback (most recent call last):
  File "train_bert_atec_nlp.py", line 15, in <module>
    from pybert.model.nn.bert_fine import BertFine
  File "~/bert-sentence-similarity-pytorch/pybert/model/nn/bert_fine.py", line 3, in <module>
    from pytorch_pretrained_bert.modeling import PreTrainedBertModel, BertModel
ImportError: cannot import name 'PreTrainedBertModel'

Thank you very much!
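
No answer is recorded in this thread. A likely cause, stated here as an assumption, is that the installed release of pytorch_pretrained_bert exposes the class under the name BertPreTrainedModel instead. The sketch below shows a hypothetical workaround: a small helper, `first_available`, that returns whichever of several candidate names a module actually defines.

```python
import importlib


def first_available(module_name, candidate_names):
    """Return the first attribute in candidate_names that the module defines."""
    module = importlib.import_module(module_name)
    for name in candidate_names:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"none of {list(candidate_names)} found in {module_name}")


# Possible use in pybert/model/nn/bert_fine.py (requires pytorch_pretrained_bert;
# the second class name is an assumption about newer releases):
#
# PreTrainedBertModel = first_available(
#     "pytorch_pretrained_bert.modeling",
#     ["PreTrainedBertModel", "BertPreTrainedModel"],
# )
```

Pinning pytorch_pretrained_bert to the release the repo was written against would be an alternative, but the README does not say which release that is.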
