
pytorch-soft-masked-bert's People

Contributors

whgaara


pytorch-soft-masked-bert's Issues

Running

Hello, can this project be run directly as is?

pretrained_bert model

Hello, can this project load a pretrained BERT model, for example chinese_L-12_H-768_A-12?

requirements

Hello, which torch version does this project use?

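Hello, thank you for sharing.

In this project, is the MLM model trained from scratch, with no pretrained BERT model required? Is it possible to continue training from an already pretrained BERT model?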

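GPU memory usage keeps climbing

The batch size is 64 and 23 GB of GPU memory is available. Memory usage is modest at the start of the run, but at about 4% progress it runs out. Does anyone know what the problem might be?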
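
One common cause of GPU memory that grows steadily during training is keeping loss tensors (and the autograd graph attached to them) alive across iterations. Whether this repository does so is not confirmed here; the sketch below only illustrates the pattern on a toy model. The model, optimizer, and variable names are hypothetical, and the example runs on CPU so it can be tried anywhere.

```python
import torch
import torch.nn as nn

# Toy stand-ins (hypothetical, not the repository's model or data).
model = nn.Linear(8, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

leaky_total = 0.0  # becomes a tensor that keeps a reference to autograd history
safe_total = 0.0   # stays a plain Python float

for _ in range(3):
    x = torch.randn(4, 8)
    y = torch.randn(4, 1)
    loss = nn.functional.mse_loss(model(x), y)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Risky pattern: adding the tensor itself chains every step's graph nodes.
    leaky_total = leaky_total + loss
    # Safer pattern: .item() copies the scalar out and drops the graph reference.
    safe_total += loss.item()

print(type(leaky_total).__name__, leaky_total.requires_grad)
print(type(safe_total).__name__)
```

If a training loop logs or sums losses with the first pattern, switching to `loss.item()` (or `loss.detach()`) is a cheap thing to try before reducing the batch size.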

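About the BiGRU input format

Hello, I am confused about the data format in the BiGRU part. According to the documentation, the input and output of a GRU in torch should have the shape (seq_len, batch, input_size), but the code feeds the data directly as (batch, seq_len, input_size). Won't this make the network structure wrong?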
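
For reference, `torch.nn.GRU` accepts either layout depending on its `batch_first` argument. The sketch below shows both on random data; the sizes are arbitrary, and whether the repository's code sets `batch_first=True` is not verified here.

```python
import torch
import torch.nn as nn

batch, seq_len, input_size, hidden_size = 4, 7, 16, 8
x = torch.randn(batch, seq_len, input_size)  # (batch, seq_len, input_size)

# With batch_first=True the GRU consumes (batch, seq_len, input_size) directly.
gru_batch_first = nn.GRU(input_size, hidden_size,
                         bidirectional=True, batch_first=True)
out_bf, h_bf = gru_batch_first(x)

# With the default batch_first=False it expects (seq_len, batch, input_size),
# so the same data has to be transposed first.
gru_time_major = nn.GRU(input_size, hidden_size, bidirectional=True)
out_tm, h_tm = gru_time_major(x.transpose(0, 1))

print(out_bf.shape)  # (batch, seq_len, 2 * hidden_size)
print(out_tm.shape)  # (seq_len, batch, 2 * hidden_size)
print(h_bf.shape)    # (2, batch, hidden_size), independent of batch_first
```

Note that a default (time-major) GRU given a (batch, seq_len, input_size) tensor raises no error as long as the last dimension matches; it simply runs the recurrence along the batch axis. So the thing to check in the code is how the GRU is constructed, not only how it is called.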

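Runtime environment

Is the runtime environment documented anywhere? Is a specific Python version required? And what is the error-correction rate?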

Training data

Why does the training data contain only correct sentences? Shouldn't it include erroneous sentences as well?

Prediction

Hello, is it the case that this model cannot run prediction on one's own text?

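GPU memory problem

How much GPU memory do you have? I am using an 11 GB card but got the following error, even though that should in principle be enough: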
RuntimeError: CUDA out of memory. Tried to allocate 5.49 GiB (GPU 0; 10.76 GiB total capacity; 5.60 GiB already allocated; 4.27 GiB free; 5.64 GiB reserved in total by PyTorch)

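Error during prediction

Hello, I get an error at prediction time, that is, when running step3 with the provided sample text. (The error was attached as an image.)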

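Training is unstable and sometimes raises an error. What is going on?

When I run the code on Ubuntu it seems to work normally. On Windows, however, it works at first but raises this error after running for a while: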
RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)
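Following what I found on Baidu, I also set torch.cuda.set_device(0), but the error persists. (I am on a single machine with a single GPU.)
Has the author run into this situation? Any pointers would be appreciated.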

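A few questions about the code

I would like to ask about a few points in the code:

  1. In random_wrong, the function that randomly corrupts the validation set during dataset preparation, there is number = random.randint(672, 7992). What do 672 and 7992 mean? The training set seems to be handled differently when sentences with random labels are generated. Why?
  2. In the ids_to_mask method of DataFactory in smbert_dataset.py, there is tmp_ids = [101] + tmp_ids + [102]. What do 101 and 102 mean? Do they correspond to the [CLS] and [SEP] entries on lines 101 and 102 of vocab.txt? And does the 0 in tmp_masks = [0] + tmp_masks + [0] correspond to [PAD]? The ids_all_mask method does something similar.
  3. If I use my own training set and the characters in vocab do not cover what I need, I will have to replace some of them. Can I replace the [unused**] symbols on lines 2 through 100, as well as symbols on other lines, as long as the positions of [PAD], [UNK], [CLS], [SEP], and [MASK] stay unchanged?
  4. Why is char_meta.txt not used during training but only at prediction time? I don't quite follow the reasoning and would appreciate an explanation.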
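
As background for questions 1 to 3, here is a minimal sketch of how token ids relate to lines of vocab.txt. It assumes the standard Google BERT vocabulary layout ([PAD] first, then [unused1] to [unused99], then [UNK], [CLS], [SEP], [MASK]); that this repository's vocab.txt follows it is an assumption, as is the reading of the mask values as plain flags rather than token ids. The helper names are hypothetical, and the 672 and 7992 defaults are copied from the question without any claim about what they denote.

```python
import random

def load_vocab(lines):
    """Map each token to its id, which is its 0-based line index."""
    return {token.rstrip("\n"): index for index, token in enumerate(lines)}

# Synthetic head of a vocab file in the assumed standard layout.
vocab_lines = (["[PAD]"]
               + [f"[unused{i}]" for i in range(1, 100)]
               + ["[UNK]", "[CLS]", "[SEP]", "[MASK]"])
vocab = load_vocab(vocab_lines)
CLS_ID, SEP_ID = vocab["[CLS]"], vocab["[SEP]"]

def wrap_with_special_tokens(ids, masks):
    """Add [CLS]/[SEP] around ids; pad masks with 0 flags at both ends."""
    return [CLS_ID] + ids + [SEP_ID], [0] + masks + [0]

def random_token_id(low=672, high=7992, rng=random):
    """Draw a replacement id; random.randint includes both endpoints."""
    return rng.randint(low, high)

wrapped_ids, wrapped_masks = wrap_with_special_tokens([5, 6], [1, 0])
print(vocab["[PAD]"], vocab["[UNK]"], CLS_ID, SEP_ID, vocab["[MASK]"])
print(wrapped_ids, wrapped_masks)
```

Under this layout, ids are 0-based while text editors number lines from 1, so id 101 sits on the 102nd line of the file. Hard-coded ids such as 101 and 102 stay valid only while those special tokens keep their line positions, which is why edits to the [unused] slots leave them unaffected.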
