
flat-lattice-transformer's People

Contributors

leesureman

flat-lattice-transformer's Issues

Compatibility with BERT

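In the final experiments that combine the model with BERT, the BERT tokenizer splits the sequence into characters. How are the words matched from the lexicon fed into BERT?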

Error when running the code

File "../V0/modules.py", line 93, in forward
pe_ss = self.pe_ss[(pos_ss).view(-1)+self.max_seq_len].view(size=[batch,max_seq_len,max_seq_len,-1])
IndexError: index 363 is out of bounds for dimension 0 with size 351
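The numbers in this traceback are consistent with a lookup table that has one row per signed offset. The following is a minimal sketch, assuming the table has 2 * max_seq_len + 1 rows (351 = 2 * 175 + 1) and that offsets are shifted by max_seq_len before indexing, as the `pos_ss + self.max_seq_len` expression suggests. The helper names are hypothetical and are not part of the repository.

```python
def pe_table_size(max_seq_len: int) -> int:
    """Rows needed to cover the signed offsets -max_seq_len .. +max_seq_len."""
    return 2 * max_seq_len + 1


def pe_index(rel_pos: int, max_seq_len: int) -> int:
    """Shift a signed offset to a non-negative row index."""
    return rel_pos + max_seq_len


def out_of_range_offsets(offsets, max_seq_len):
    """Return the offsets whose shifted index falls outside the table."""
    size = pe_table_size(max_seq_len)
    return [d for d in offsets if not 0 <= pe_index(d, max_seq_len) < size]


# Assumed value: a table of size 351 corresponds to max_seq_len = 175.
max_seq_len = 175
table_size = pe_table_size(max_seq_len)          # 351
failing_index = pe_index(188, max_seq_len)       # 363, the index in the error
bad = out_of_range_offsets([-175, 0, 175, 188], max_seq_len)  # [188]
print(table_size, failing_index, bad)
```

Under these assumptions, index 363 means that some pair of positions is 188 apart, which is more than the 175 the table was built for. One possible cause is an input (characters plus matched words) that is longer than the length used to build the table.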

The Weibo dataset has no files with the _deseg suffix

The original dataset only contains .train/.dev/.test files?

Traceback (most recent call last):
  File "flat_main.py", line 254, in <module>
    only_train_min_freq=args.only_train_min_freq,
  File "/Users/user/.pyenv/versions/env-mkwPXnF--py3.7/lib/python3.7/site-packages/fastNLP/core/utils.py", line 344, in wrapper
    results = func(*args, **kwargs)
  File "../load_data.py", line 646, in load_weibo_ner
    bundle = loader.load(v)
  File "/Users/user/.pyenv/versions/env-mkwPXnF--py3.7/lib/python3.7/site-packages/fastNLP/io/loader/loader.py", line 68, in load
    paths = check_loader_paths(paths)
  File "/Users/user/.pyenv/versions/env-mkwPXnF--py3.7/lib/python3.7/site-packages/fastNLP/io/utils.py", line 63, in check_loader_paths
    raise FileNotFoundError(f"{paths} is not a valid file path.")
FileNotFoundError: /Users/user/Downloads/Flat-Lattice-Transformer/V0/WeiboNER/weiboNER_2nd_conll.train_deseg is not a valid file path.
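The thread does not say how the _deseg files are produced. The following is only a sketch under two assumptions: that each line of the original Weibo files holds a character followed by a one-digit segmentation index and then a tag (for example `科0` and `O`), and that the _deseg file is the same data with that digit removed. Both the format and the helper name `strip_seg_suffix` are assumptions, not something confirmed by the repository.

```python
import re


def strip_seg_suffix(line: str) -> str:
    """Drop a trailing segmentation digit from the token column.

    Assumed input format: '<char><digit><whitespace><tag>', e.g. '科0\tO'.
    Blank lines (sentence boundaries) are returned as empty strings.
    """
    parts = line.split()
    if len(parts) < 2:
        return line.strip()
    token, tag = parts[0], parts[-1]
    if len(token) > 1:
        # Remove exactly one trailing digit; a lone digit token is left alone.
        token = re.sub(r"\d$", "", token)
    return f"{token}\t{tag}"


def deseg_lines(lines):
    """Apply strip_seg_suffix to every line of a CoNLL-style file."""
    return [strip_seg_suffix(line) for line in lines]


sample = ["科0\tO", "技1\tO", "", "10 B-PER.NAM"]
print(deseg_lines(sample))
```

If the loader expects a different column separator, the tab in the return value would need to change accordingly.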

Out of GPU memory

Does it really use this much GPU memory? With 80,000 training examples it cannot run even on 16 GB.

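How should the code be modified to run on multiple GPUs?

I keep getting RuntimeError: The size of tensor a (94) must match the size of tensor b (214) at non-singleton dimension 1
It turns out that multi-GPU execution splits the tensors along a dimension, so the sizes no longer match. How do I set up multi-GPU training correctly?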

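Question about relative position encoding

This part is not very clear to me, so I would like to ask about it. It seems that neither the code nor my own dataset actually uses the relative position feature?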
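For orientation, the four relative distances between spans can be sketched in a few lines. This assumes each token (character or matched word) carries a head and tail index, and that the names `ss`, `se`, `es`, `ee` used in the code correspond to start-start, start-end, end-start, and end-end differences. The mapping of names and the 0-based indexing are assumptions made for illustration.

```python
def span_distances(heads, tails):
    """Return four n x n matrices of signed distances between spans i and j."""
    n = len(heads)
    d_ss = [[heads[i] - heads[j] for j in range(n)] for i in range(n)]
    d_se = [[heads[i] - tails[j] for j in range(n)] for i in range(n)]
    d_es = [[tails[i] - heads[j] for j in range(n)] for i in range(n)]
    d_ee = [[tails[i] - tails[j] for j in range(n)] for i in range(n)]
    return d_ss, d_se, d_es, d_ee


# Flat lattice for 重庆人和药店: six characters, then the words
# 重庆 (0..1), 人和药店 (2..5), 药店 (4..5).
heads = [0, 1, 2, 3, 4, 5, 0, 2, 4]
tails = [0, 1, 2, 3, 4, 5, 1, 5, 5]
d_ss, d_se, d_es, d_ee = span_distances(heads, tails)

i, j = 6, 7  # 重庆 versus 人和药店
print(d_ss[i][j], d_se[i][j], d_es[i][j], d_ee[i][j])
```

For single characters the head equals the tail, so all four distances coincide. The four values only differ when at least one of the two tokens is a multi-character word.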

Error when running flat_main.py

Running the code as described in the README produces the following error. What causes it?
Traceback (most recent call last):
File "flat_main.py", line 306, in <module>
only_train_min_freq=args.only_train_min_freq)
File "E:\Program\Anaconda\Install\envs\pytorch-1.2.0\lib\site-packages\fastNLP\core\utils.py", line 160, in wrapper
with open(cache_filepath, 'wb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'cache\weibo_lattice_only_train:False_trainClip:True_norm_num:0char_min_freq1bigram_min_freq1word_min_freq1only_train_min_freqTruenumber_norm0lexicon_yjload_dataset_seed100'
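An Errno 2 raised by `open(..., 'wb')` usually means that a parent directory of the target file does not exist. The following sketch assumes that the `cache` directory is missing and creates it before writing. The helper name `ensure_parent_dir` is hypothetical. On Windows the `:` characters in the file name are a separate concern that this sketch does not address.

```python
import os
import tempfile


def ensure_parent_dir(cache_filepath: str) -> str:
    """Create the parent directory of cache_filepath if it is missing."""
    parent = os.path.dirname(cache_filepath)
    if parent:
        os.makedirs(parent, exist_ok=True)
    return cache_filepath


# Demonstration inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as root:
    path = ensure_parent_dir(os.path.join(root, "cache", "weibo_lattice"))
    with open(path, "wb") as f:
        f.write(b"ok")
    created = os.path.isdir(os.path.join(root, "cache"))

print(created)
```

Creating the `cache` directory by hand in the working directory before running the script would have the same effect.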

Error running the code

File "../V0/modules.py", line 93, in forward
pe_ss = self.pe_ss[(pos_ss).view(-1)+self.max_seq_len].view(size=[batch,max_seq_len,max_seq_len,-1])
IndexError: index 363 is out of bounds for dimension 0 with size 351

About the Weibo dataset

Hello, which dataset did you use in the paper: weiboNER.conll.train or weiboNER_2nd_conll.train?

Cannot download embeddings from Google Drive

There are no Google Drive links for gigaword_chn.all.a2b.bi.ite50.vec and sgns.merge.word.bz2.

The Baidu Pan links return a 404 error, so I cannot download the files from there either.

I would very much appreciate it if you could upload the word embeddings.

Thank you very much.

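About Lattice LSTM

The Lattice LSTM structure shown in the paper differs slightly from what I saw in "Chinese NER Using Lattice LSTM", so I have a question. Following the construction method in "Chinese NER Using Lattice LSTM", 重庆人和药店 should yield four words: [重庆, 重庆人, 人和药店, 药店]. How was the word 重庆人 removed? The paper only says "Some words in lattice may be important for NER." Could you explain how these important words are selected?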
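To make the question concrete, here is a minimal sketch of exhaustive lexicon matching, where every substring of two or more characters that appears in the lexicon is kept together with its head and tail index. The toy lexicon and the function names are assumptions for illustration. This is not the authors' matching code, and it does not answer how any word might be filtered out.

```python
def match_lexicon(sentence, lexicon):
    """Return (word, head, tail) for every multi-character lexicon hit."""
    max_len = max((len(w) for w in lexicon), default=0)
    spans = []
    for head in range(len(sentence)):
        # tail is inclusive; candidate words have length 2 .. max_len
        for tail in range(head + 1, min(len(sentence), head + max_len)):
            word = sentence[head:tail + 1]
            if word in lexicon:
                spans.append((word, head, tail))
    return spans


def flat_lattice(sentence, lexicon):
    """Characters first (head == tail), then the matched words."""
    chars = [(ch, i, i) for i, ch in enumerate(sentence)]
    return chars + match_lexicon(sentence, lexicon)


lexicon = {"重庆", "重庆人", "人和药店", "药店"}  # toy lexicon, assumed
spans = match_lexicon("重庆人和药店", lexicon)
print(spans)
```

With this kind of matching, whether 重庆人 shows up depends only on whether it is in the lexicon. Any further selection of "important" words would have to be an additional step.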

Runtime error

File "../V0/modules.py", line 93, in forward
pe_ss = self.pe_ss[(pos_ss).view(-1)+self.max_seq_len].view(size=[batch,max_seq_len,max_seq_len,-1])
IndexError: index 363 is out of bounds for dimension 0 with size 351

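Is formula (11) written correctly?

While working through the derivations in the paper, I began to wonder whether formula (11) is written correctly.
Take this term as an example (shown as an image in the original issue): it is a product of shapes (d_head, d_model) * (d_model, 1) * (1, d_model) * (d_model, d_head), so the result is a (d_head, d_head) matrix rather than a scalar. If A_ij is a matrix, then A* cannot replace A in the attention formula.

Perhaps I have misunderstood something. I hope the author can clarify.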
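The shape bookkeeping behind this question can be written out explicitly. The following is a sketch of one consistent reading, not the authors' reply. It assumes column-vector embeddings $E_{x_i} \in \mathbb{R}^{d_{model}}$ and projections $W_q, W_k \in \mathbb{R}^{d_{model} \times d_{head}}$; the symbols $q_i$ and $k_j$ are introduced here for illustration.

```latex
\begin{aligned}
q_i &= W_q^{\top} E_{x_i} \in \mathbb{R}^{d_{head}}, \qquad
k_j = W_k^{\top} E_{x_j} \in \mathbb{R}^{d_{head}}, \\
q_i^{\top} k_j &= E_{x_i}^{\top} W_q W_k^{\top} E_{x_j}
  \quad\text{with shapes}\quad
  (1 \times d_{model})(d_{model} \times d_{head})(d_{head} \times d_{model})(d_{model} \times 1) = 1 \times 1, \\
q_i k_j^{\top} &= W_q^{\top} E_{x_i} E_{x_j}^{\top} W_k \in \mathbb{R}^{d_{head} \times d_{head}}, \qquad
\operatorname{tr}\!\left(q_i k_j^{\top}\right) = q_i^{\top} k_j .
\end{aligned}
```

The ordering described in the question is the outer product in the last line, which is indeed a $d_{head} \times d_{head}$ matrix. The scalar attention score is the inner product in the second line, which equals the trace of that matrix.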

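Model inference speed

Hello, have you measured the model's prediction speed? I am using a single Tesla P100, and prediction takes about 5 s per sample. Is this normal?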

A length error is raised even when the input length does not exceed 512

Traceback (most recent call last):
File "flat_main.py", line 787, in <module>
trainer.train()
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/trainer.py", line 613, in train
self.callback_manager.on_exception(e)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/callback.py", line 309, in wrapper
returns.append(getattr(callback, func.__name__)(*arg))
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/callback.py", line 505, in on_exception
raise exception # raise the unfamiliar Error
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/trainer.py", line 609, in train
self._train()
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/trainer.py", line 664, in _train
prediction = self._data_forward(self.model, batch_x)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/trainer.py", line 752, in _data_forward
y = network(**x)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "../V1/models.py", line 440, in forward
bert_embed = self.bert_embedding(char_for_bert)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "../fastNLP_module.py", line 389, in forward
outputs = self.model(words)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/embeddings/bert_embedding.py", line 339, in forward
"After split words into word pieces, the lengths of word pieces are longer than the "
RuntimeError: After split words into word pieces, the lengths of word pieces are longer than the maximum allowed sequence length:512 of bert. You can set auto_truncate=True for BertEmbedding to automatically truncate overlong input.
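An input shorter than 512 characters can still exceed the limit once it is turned into word pieces, because a single token may be split into several pieces and BERT adds the special tokens [CLS] and [SEP]. The sketch below only illustrates that arithmetic. The per-token piece counts are made-up inputs, and the helper names are hypothetical; the error message itself points to `auto_truncate=True` on `BertEmbedding` as the built-in remedy.

```python
MAX_BERT_LEN = 512
NUM_SPECIAL = 2  # [CLS] and [SEP]


def fits_in_bert(pieces_per_token, max_len=MAX_BERT_LEN):
    """True if all word pieces plus the special tokens fit in max_len."""
    return sum(pieces_per_token) + NUM_SPECIAL <= max_len


def tokens_that_fit(pieces_per_token, max_len=MAX_BERT_LEN):
    """Number of leading tokens that fit once special tokens are counted."""
    budget = max_len - NUM_SPECIAL
    used = 0
    for n, pieces in enumerate(pieces_per_token):
        if used + pieces > budget:
            return n
        used += pieces
    return len(pieces_per_token)


# 511 tokens of one piece each: below 512, yet 511 + 2 = 513 pieces in total.
one_piece_each = [1] * 511
print(fits_in_bert(one_piece_each), tokens_that_fit(one_piece_each))

# 300 tokens that each split into two pieces: 600 pieces, far over budget.
two_pieces_each = [2] * 300
print(fits_in_bert(two_pieces_each), tokens_that_fit(two_pieces_each))
```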

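Where do the datasets come from?

I cannot find the OntoNotes, MSRA, Weibo, or Resume datasets. Could you add download links to the README?
Also, load_data.py and preprocess.py both look incomplete. Are they still unfinished?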

About the datasets

Could you provide a link to the datasets? The ones I found myself do not seem to be in the right format.

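Question about the residual structure. On the Resume dataset I only reach 95.15%, not yet 95.5%.

1. I suspect the residual structure in the author's transformer_encoder is written incorrectly, but after I changed it the results got worse, and I do not know why.

    import torch.nn as nn

    class Layer_Process(nn.Module):
        def __init__(self, process_sequence, hidden_size, dropout=0,
                     use_pytorch_dropout=True):
            ...  # constructor body not included in the issue

        def forward(self, inp):
            output = inp
            for op in self.process_sequence:  # process_sequence = 'an'
                if op == 'a':
                    output = output + inp  # Isn't this equivalent to inp * 2?
                elif op == 'd':
                    output = self.dropout(output)
                elif op == 'n':
                    output = self.layer_norm(output)
            return output

2. I changed a few hyperparameters: batch is set to 5, and k_proj is set to True (the author sets it to False). When fusing the position embeddings I used all of ss, se, es, and ee, whereas the author's hyperparameters use only ss and ee. To enlarge each attention head I slightly increased the hidden size. Setting k_proj to True alone reaches 95.0%; fusing all four relative positions did not seem to help; increasing the hidden size raised the score to 95.15%. My laptop has limited resources, so I did not increase the hidden size further.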

A simple bug

Traceback (most recent call last):
File "E:/yuan/Flat-Lattice-Transformer-master/V0/flat_main.py", line 306, in <module>
only_train_min_freq=args.only_train_min_freq)
File "C:\ProgramData\Anaconda3\envs\flat\lib\site-packages\fastNLP\core\utils.py", line 160, in wrapper
with open(cache_filepath, 'wb') as f:
OSError: [Errno 22] Invalid argument: 'cache\resume_lattice_only_train:False_trainClip:True_norm_num:0char_min_freq1bigram_min_freq1word_min_freq1only_train_min_freqTruenumber_norm0lexicon_yjload_dataset_seed100'
How can this be fixed?

Import issue

Hello, after I run flat_main.py, fastNLP_module reports "can not import name 'get_file_name_base_on_postfix'" from the fastNLP.modules.utils package. What is going on? I do not really understand the cause.

Error: OSError: [Errno 22] Invalid argument

Do you know how to fix this error? It occurs with both the resume and weibo datasets.

Traceback (most recent call last):
File "flat_main.py", line 313, in <module>
only_train_min_freq=args.only_train_min_freq)
File "D:\develop\python\Anaconda3\lib\site-packages\fastNLP\core\utils.py", line 160, in wrapper
with open(cache_filepath, 'wb') as f:
OSError: [Errno 22] Invalid argument: 'cache\weibo_lattice_only_train:False_trainClip:True_norm_num:0char_min_freq1bigram_min_freq1word_min_freq1only_train_min_freqTruenumber_norm0lexicon_yjload_dataset_seed100'

Relevant code

datasets, vocabs, embeddings = equip_chinese_ner_with_lexicon(
    datasets, vocabs, embeddings,
    w_list, yangjie_rich_pretrain_word_path,
    _refresh=refresh_data, _cache_fp=cache_name,
    only_lexicon_in_train=args.only_lexicon_in_train,
    word_char_mix_embedding_path=yangjie_rich_pretrain_char_and_word_path,
    number_normalized=args.number_normalized,
    lattice_min_freq=args.lattice_min_freq,
    only_train_min_freq=args.only_train_min_freq)

The error is probably caused by this decorator.

@cache_results(_cache_fp='need_to_defined_fp',_refresh=True)
def equip_chinese_ner_with_lexicon(datasets,vocabs,embeddings,w_list,word_embedding_path=None,
                                   only_lexicon_in_train=False,word_char_mix_embedding_path=None,
                                   number_normalized=False,
                                   lattice_min_freq=1,only_train_min_freq=0):
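Windows does not allow the characters `< > : " / \ | ? *` in file names, and the cache name in the traceback contains `:`. A possible workaround, sketched here under the assumption that the name passed as `_cache_fp` can be rewritten before the call, is to replace those characters. The helper `safe_cache_name` is hypothetical and should be applied to the file name only, not to the directory part of the path.

```python
import re

# Characters that Windows rejects in file names.
_ILLEGAL_CHARS = re.compile(r'[<>:"/\\|?*]')


def safe_cache_name(name: str, replacement: str = "_") -> str:
    """Replace characters that are invalid in Windows file names."""
    return _ILLEGAL_CHARS.sub(replacement, name)


raw = "weibo_lattice_only_train:False_trainClip:True_norm_num:0"
cleaned = safe_cache_name(raw)
print(cleaned)
```

Caches written under the old name would not be found under the cleaned name, so the first run after such a change would rebuild the cache.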

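Question about add and norm

Hello, while reading the code I noticed that when class Layer_Process performs Add+Norm, the Add does not seem to be a residual connection. What is the reason?
The code is as follows:

    def forward(self, inp):
        output = inp
        for op in self.process_sequence:
            if op == 'a':
                output = output + inp
            elif op == 'd':
                output = self.dropout(output)
            elif op == 'n':
                output = self.layer_norm(output)

        return output

Here the add amounts to input + input rather than fun(input) + input. I checked TENER, and it also uses a residual connection. Have I misunderstood something?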
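One property is worth keeping in mind when reading the quoted snippet: layer normalization is invariant to positive rescaling of its input (exactly so when eps is 0), so doubling a vector and then normalizing gives the same result as normalizing it directly. The sketch below shows this with a plain-Python layer norm that has no learnable affine parameters, and contrasts it with a true residual. It illustrates the arithmetic only and makes no claim about how the repository calls this module.

```python
import math


def layer_norm(x, eps=0.0):
    """Normalize a vector to zero mean and unit variance (no affine part)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def add_then_norm_as_quoted(inp):
    """'an' as in the quoted forward: output + inp equals 2 * inp."""
    output = [o + i for o, i in zip(inp, inp)]
    return layer_norm(output)


def residual_then_norm(sublayer_out, inp):
    """A conventional residual: norm(f(x) + x)."""
    return layer_norm([s + i for s, i in zip(sublayer_out, inp)])


x = [1.0, 2.0, 4.0]
fx = [0.5, -1.0, 3.0]  # stand-in for a sub-layer output, made up

doubled = add_then_norm_as_quoted(x)
plain = layer_norm(x)
residual = residual_then_norm(fx, x)

same_as_plain = all(math.isclose(a, b) for a, b in zip(doubled, plain))
differs_from_residual = any(
    not math.isclose(a, b) for a, b in zip(doubled, residual)
)
print(same_as_plain, differs_from_residual)
```

So with the sequence 'an' and the forward shown above, the add step has almost no effect on the normalized output, which is a different thing from adding the sub-layer input back to the sub-layer output.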

Concatenation

Hello, what is the final shape of Rij?

Error when running on the Weibo dataset

Running on the Weibo dataset fails.
Command: python flat_main.py --dataset weibo
OSError: [Errno 22] Invalid argument: 'cache\weibo_lattice_only_train:False_trainClip:True_norm_num:0char_min_freq1bigram_min_freq1word_min_freq1only_train_min_freqTruenumber_norm0lexicon_yjload_dataset_seed100'

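Are the current default arguments in flat_main.py the model structure described in the paper?

With the current default arguments in flat_main.py, is the model structure the one described in Section "3.2 Relative Position Encoding of Spans" of the paper?
flat_main.py, line 116

parser.add_argument('--four_pos_fusion',default='ff_two',choices=['ff','attn','gate','ff_two','ff_linear'],

modules.py, line 110

if self.four_pos_fusion == 'ff_two':
    pe_2 = torch.cat([pe_ss,pe_ee],dim=-1)

My understanding is that only two ([pe_ss, pe_ee]) of the four relative position features ([pe_ss, pe_se, pe_es, pe_ee]) are used here, whereas Section "3.2 Relative Position Encoding of Spans" of the paper uses all four. Have I misread the code?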

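About path.py

Hello, when the README says to modify path.py for the downloaded embeddings, does that mean changing the paths?
Which one exactly should be changed? Sorry, I am a beginner.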

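BERT implementation

Hello, how does the paper combine Flat-Lattice with BERT? (1) By modifying BERT's Transformer structure? (2) Or does BERT only provide character-level embeddings, with the FLAT operations applied after BERT outputs its hidden representations?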

Reproducing the code

What results did you get in the end when reproducing the model?
