
zhixiuye / hscrf-pytorch

304 stars · 68 forks · 1 MB

ACL 2018: Hybrid semi-Markov CRF for Neural Sequence Labeling (http://aclweb.org/anthology/P18-2038)

Python 100.00%
crf ner nlp pytorch sequence-labeling

hscrf-pytorch's People

Contributors

zhixiuye


hscrf-pytorch's Issues

About model 9

Hello, when implementing model 9 from Table 1, how should the parameters be set?

Question about scrf_to_crf in utils.py

Hi. I read your code and I have a question about the function scrf_to_crf in utils.py, namely this part:

    # convert SCRF span predictions to CRF label sequences,
    # starting each sequence with <start> and stopping at <pad>
    for i_l in decoded_scrf:
        sent_labels = [l_map['<start>']]
        for label in i_l:
            if label != l_map['<pad>']:
                sent_labels.append(label)
            else:
                break
        crf_labels.append(sent_labels)
    crfdata = []
    masks = []
    maxl_1 = max([len(i) for i in crf_labels])
    for i_l in crf_labels:
        cur_len_1 = len(i_l)
        cur_len = cur_len_1 - 1
        # pack each label bigram (prev, curr) into the single index
        # prev * label_size + curr, then pad to the batch maximum
        i_l_pad = [i_l[ind] * label_size + i_l[ind + 1] for ind in range(0, cur_len)] \
                  + [i_l[cur_len] * label_size + pad_label] \
                  + [pad_label * label_size + pad_label] * (maxl_1 - cur_len_1)
        mask = [1] * cur_len_1 + [0] * (maxl_1 - cur_len_1)
        crfdata.append(i_l_pad)
        masks.append(mask)

Why does it break when label == l_map['<pad>']? The break makes the length of sent_labels differ from that of decoded_scrf[0], resulting in a mismatch between sent_labels and the sentence in terms of length.
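For reference, the packing in the list comprehension above maps a label bigram (prev, curr) to the single index prev * label_size + curr, which can then index a score table flattened over (prev, curr). A minimal sketch with hypothetical label ids (not from the repo):

    # Minimal sketch with hypothetical label ids (not from the repo):
    label_size = 5                     # tagset size, including <start>/<pad>
    prev_label, curr_label = 1, 3

    pair_index = prev_label * label_size + curr_label   # -> 8

    # divmod inverts the packing, recovering both labels:
    assert divmod(pair_index, label_size) == (prev_label, curr_label)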

Loss function for NER task

Can you please point me to a reference you used to implement the loss function for the NER task?

I think your paper only discusses the loss function for word-level segmentation.

Thanks

On Torch 0.4

Traceback (most recent call last):
  File "train.py", line 205, in <module>
    evaluator.calc_score(model, dev_dataset_loader)
  File "/u/suhubdyd/projects/HSCRF-pytorch/model/evaluator.py", line 172, in calc_score
    decoded_crf, crf_result_scored_by_crf = utils.decode_with_crf(ner_model.crf, word_representations, mask_v,self.l_map)
  File "/u/suhubdyd/projects/HSCRF-pytorch/model/utils.py", line 763, in decode_with_crf
    decoded_crf = crf.decode(word_reps, mask_v)
  File "/u/suhubdyd/projects/HSCRF-pytorch/model/crf_layer.py", line 131, in decode
    decode_idx[idx] = pointer
RuntimeError: expand(torch.cuda.LongTensor{[50, 1]}, size=[50]): the number of sizes provided (1) must be greater or equal to the number of dimensions in the tensor (2)
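This is the decode_idx[idx] = pointer assignment that PyTorch 0.2 tolerated but 0.4 rejects, because pointer keeps a trailing singleton dimension. A toy reproduction and a hedged fix (shapes assumed, not taken from crf_layer.py):

    import torch

    # Toy reproduction and hedged fix (names mirror crf_layer.py, but
    # shapes here are assumed): assigning a [batch, 1] tensor into a
    # [batch] slot triggers the expand() error above under >= 0.4;
    # dropping the trailing singleton dim avoids it.
    seq_len, bat_size = 4, 50
    decode_idx = torch.zeros(seq_len - 1, bat_size, dtype=torch.long)
    pointer = torch.zeros(bat_size, 1, dtype=torch.long)   # shape [50, 1]

    decode_idx[0] = pointer.squeeze(-1)   # works: [50, 1] -> [50]
    # decode_idx[0] = pointer            # raises the RuntimeError above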

Wrong requirements

Your requirements.txt contains the wrong PyTorch requirement: either your code base is Python 3.x, or your PyTorch wheel should correspond to Python 2.7. Please fix it, and also update the references.

logalpha initialize

Hi, what really confuses me is why logalpha is initialized to -10000 rather than zero or other random values. Is there any special meaning?
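One way to read the choice: the forward recursion runs in log space, so "impossible" states should start at log(0) = -inf, and -10000 is a finite stand-in that behaves like -inf under log-sum-exp without risking NaN-producing -inf arithmetic. An illustrative sketch, not the repo's actual initialization code:

    import torch

    # Illustrative only: in log space, "probability zero" is log(0) = -inf;
    # -10000 acts like -inf under log-sum-exp while staying finite.
    logalpha = torch.full((1, 4), -10000.0)  # all states start "impossible"
    logalpha[0, 0] = 0.0                     # except <start>: log(1) = 0

    # exp(-10000) underflows to 0, so only the <start> state contributes:
    print(torch.logsumexp(logalpha, dim=1))  # tensor([0.])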

out of memory

Have you ever had this problem:
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1511304568725/work/torch/lib/THC/generic/THCStorage.cu:66

I used a Chinese corpus with 300-dimensional vectors. I tried using a small dataset and setting batch_size = 5, but it failed again.
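For a rough sense of scale, the semi-Markov scores grow with batch × sequence length × allowspan × tags², so long Chinese sentences inflate memory quickly even at small batch sizes; lowering the --allowspan argument is one lever. A back-of-envelope estimate, with a hypothetical tensor layout that may differ from hscrf_layer.py:

    # Back-of-envelope memory estimate; the [batch, seq_len, allowspan,
    # tags, tags] float32 layout here is an assumption, not the repo's
    # verified internal shape.
    batch, seq_len, allowspan, tags = 5, 120, 6, 20
    floats = batch * seq_len * allowspan * tags * tags
    print(floats * 4 / 1e6, "MB")   # ~5.8 MB per such tensor, before autograd buffers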

Can you offer me the checkpoint files?

When I try to run 'eval.py', I realize that './checkpoint/6365035.json' and './checkpoint/6365035.model' are missing from this repository. Does this project need these two files? I would be grateful if you could offer me more information.

Not able to achieve accuracy shown in paper/repo

I tried running the program in an environment with pytorch=0.2.0 and python=2.7 multiple times using the command : CUDA_VISIBLE_DEVICES=0 python train.py --char_lstm --high_way

The highest accuracy that I have gotten is as follows:
(screenshot omitted)

Could you tell me how you were able to achieve 91.xx?

asking about a CUDA OOM error

I ran the code on Chinese NER training data (around 70 thousand sentences), and I got an OOM error:

Tot it 541 (epoch 0): 0it [00:00, ?it/s]THCudaCheck FAIL file=/pytorch/torch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
Traceback (most recent call last):
  File "train.py", line 179, in <module>
    loss.backward()
  File "/usr/lib64/python2.7/site-packages/torch/autograd/variable.py", line 156, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/usr/lib64/python2.7/site-packages/torch/autograd/__init__.py", line 98, in backward
    variables, grad_variables, retain_graph)
  File "/usr/lib64/python2.7/site-packages/torch/autograd/function.py", line 91, in apply
    return self._forward_cls.backward(self, *args)
  File "/usr/lib64/python2.7/site-packages/torch/autograd/function.py", line 194, in wrapper
    outputs = fn(ctx, *tensor_args)
  File "/usr/lib64/python2.7/site-packages/torch/nn/_functions/thnn/sparse.py", line 80, in backward
    grad_weight = grad_output.new(ctx.weight_size).zero_()
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/torch/lib/THC/generic/THCStorage.cu:66

I have set batch_size to 10 and to 128, and both result in the error shown above. Could anyone give me some advice on how to solve it?

Goldfactors format

Hi, I have been trying to use your code as a reference to implement a similar SCRF variant, and I am a little confused by the get_logloss_numerator function. What exactly are the goldfactors you use for the correct path scores?
I am speaking about this line: https://github.com/ZhixiuYe/HSCRF-pytorch/blob/master/model/hscrf_layer.py#L146

Could you give me an example how I can generate my own factors?

EDIT: I think I understand the format: each item is a tensorized list of (from_id, to_id, prev_tag, curr_tag). If that is correct, the comment that the size is (batch_size, tag_len, 4) is confusing, since tag_len actually means the maximum number of tag sequences within the items in the batch. I had assumed that tag_len meant tagset_size.
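Assuming that reading is right, a hedged sketch of how such factors could be assembled from one gold segmentation (the helper name and span encoding here are hypothetical, not from the repo):

    import torch

    # Hypothetical helper, not from the repo: build gold factors as
    # (from_id, to_id, prev_tag, curr_tag) rows for one gold segmentation.
    def build_goldfactors(segments, start_tag):
        # segments: list of (from_id, to_id, tag) spans covering the sentence
        factors, prev_tag = [], start_tag
        for frm, to, tag in segments:
            factors.append([frm, to, prev_tag, tag])
            prev_tag = tag
        return torch.tensor(factors, dtype=torch.long)  # (num_segments, 4)

    # e.g. spans [0..1] tagged 0 and [2..2] tagged 1, with <start> id 3:
    print(build_goldfactors([(0, 1, 0), (2, 2, 1)], start_tag=3))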

tried to construct a tensor

Traceback (most recent call last):
  File "/home/04-HSCRF-pytorch-master/train.py", line 210, in <module>
    evaluator.calc_score(model, dev_dataset_loader)
  File "/home/04-HSCRF-pytorch-master/evaluator.py", line 172, in calc_score
    decoded_crf, crf_result_scored_by_crf = utils.decode_with_crf(ner_model.crf, word_representations, mask_v, self.l_map)
  File "/home/04-HSCRF-pytorch-master/utils.py", line 777, in decode_with_crf
    bi_crf = torch.cuda.LongTensor(bi_crf).transpose(0,1).unsqueeze(2)
RuntimeError: tried to construct a tensor from a nested int sequence, but found an item of type numpy.int64 at index (0, 0)
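Older PyTorch constructors reject numpy.int64 values nested inside plain Python lists. A toy reproduction and a workaround that routes the data through a single numpy array first (a sketch, not a patch to utils.py):

    import numpy as np
    import torch

    # Toy reproduction: a nested Python list holding numpy.int64 values
    # trips old constructors; converting to one ndarray first avoids it.
    bi_crf = [[np.int64(1), np.int64(2)], [np.int64(3), np.int64(4)]]
    t = torch.from_numpy(np.asarray(bi_crf, dtype=np.int64))
    t = t.transpose(0, 1).unsqueeze(2)   # same reshaping as decode_with_crf
    print(t.shape)                       # torch.Size([2, 2, 1])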

version of PyTorch

Is it possible to use PyTorch 0.4.0?

It seems I'm running into some issues, but I'm not sure if they're due to the version.

allan@statnlp0:~/tmp/HSCRF-pytorch$ CUDA_VISIBLE_DEVICES=0 python train.py --char_lstm --high_way
seed: 5703958
setting:
Namespace(allowspan=6, batch_size=10, char_embedding_dim=30, char_lstm=True, char_lstm_hidden_dim=300, char_lstm_layers=1, checkpoint='./checkpoint/', clip_grad=5.0, cnn_filter_num=30, dev_file='./data/eng.testa', dropout_ratio=0.55, early_stop=10, emb_file='./data/glove.6B.100d.txt', epoch=150, grconv=False, high_way=True, highway_layers=1, index_embeds_dim=10, least_epoch=75, load_check_point='', load_opt=False, lr=0.015, lr_decay=0.05, mini_count=5, model_name='HSCRF', momentum=0.9, scrf_dense_dim=100, shrink_embedding=False, start_epoch=0, test_file='./data/eng.testb', train_file='./data/eng.train', unk='unk', word_embedding_dim=100, word_hidden_dim=300, word_lstm_layers=1)
loading corpus
constructing coding table
/home/allan/tmp/HSCRF-pytorch/model/utils.py:642: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(input_embedding, -bias, bias)
constructing dataset
building model
/usr/local/lib/python2.7/dist-packages/torch/nn/modules/rnn.py:38: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.55 and num_layers=1
"num_layers={}".format(dropout, num_layers))
/home/allan/tmp/HSCRF-pytorch/model/utils.py:649: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(input_linear.weight, -bias, bias)
/home/allan/tmp/HSCRF-pytorch/model/hscrf_layer.py:60: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(input_embedding, -bias, bias)
/home/allan/tmp/HSCRF-pytorch/model/hscrf_layer.py:68: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(input_linear.weight, -bias, bias)
/home/allan/tmp/HSCRF-pytorch/model/utils.py:660: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(weight, -bias, bias)
/home/allan/tmp/HSCRF-pytorch/model/utils.py:663: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(weight, -bias, bias)

Tot it 1404 (epoch 0): 0it [00:00, ?it/s]train.py:180: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.
  nn.utils.clip_grad_norm(model.parameters(), args.clip_grad)
Tot it 1404 (epoch 0): 0it [00:00, ?it/s]train.py:195: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.
  nn.utils.clip_grad_norm(model.parameters(), args.clip_grad)
epoch_loss: 22.3492750524
/home/allan/tmp/HSCRF-pytorch/model/data_packer.py:49: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  f_f = Variable(f_f[:, 0:mlen[0]].transpose(0, 1), volatile=True).cuda()
/home/allan/tmp/HSCRF-pytorch/model/data_packer.py:50: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  f_p = Variable(f_p[:, 0:mlen[1]].transpose(0, 1), volatile=True).cuda()
/home/allan/tmp/HSCRF-pytorch/model/data_packer.py:51: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  b_f = Variable(b_f[:, -mlen[0]:].transpose(0, 1), volatile=True).cuda()
/home/allan/tmp/HSCRF-pytorch/model/data_packer.py:52: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  b_p = Variable((b_p[:, 0:mlen[1]] - ocl + mlen[0]).transpose(0, 1), volatile=True).cuda()
/home/allan/tmp/HSCRF-pytorch/model/data_packer.py:53: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  w_f = Variable(w_f[:, 0:mlen[1]].transpose(0, 1), volatile=True).cuda()
/home/allan/tmp/HSCRF-pytorch/model/data_packer.py:54: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  tg_v = Variable(target[:, 0:mlen[1]].transpose(0, 1), volatile=True).unsqueeze(2).cuda()
/home/allan/tmp/HSCRF-pytorch/model/data_packer.py:55: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  mask_v = Variable(mask[:, 0:mlen[1]].transpose(0, 1), volatile=True).cuda()
/home/allan/tmp/HSCRF-pytorch/model/data_packer.py:56: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  SCRF_labels = Variable(SCRF_labels[:, 0:mlen[2]], volatile=True).cuda()
/home/allan/tmp/HSCRF-pytorch/model/data_packer.py:57: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  mask_SCRF_laebls = Variable(mask_SCRF_laebls[:, 0:mlen[2]], volatile=True).cuda()
/home/allan/tmp/HSCRF-pytorch/model/data_packer.py:58: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  cnn_features = Variable(cnn_features[:, 0:mlen[1], 0:mlen[3]].transpose(0, 1), volatile=True).cuda().contiguous()
/home/allan/tmp/HSCRF-pytorch/model/crf_layer.py:113: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  mask = Variable(1 - mask.data, volatile=True)
/home/allan/tmp/HSCRF-pytorch/model/crf_layer.py:114: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  decode_idx = Variable(torch.cuda.LongTensor(seq_len-1, bat_size), volatile=True)
Traceback (most recent call last):
  File "train.py", line 205, in <module>
    evaluator.calc_score(model, dev_dataset_loader)
  File "/home/allan/tmp/HSCRF-pytorch/model/evaluator.py", line 172, in calc_score
    decoded_crf, crf_result_scored_by_crf = utils.decode_with_crf(ner_model.crf, word_representations, mask_v, self.l_map)
  File "/home/allan/tmp/HSCRF-pytorch/model/utils.py", line 763, in decode_with_crf
    decoded_crf = crf.decode(word_reps, mask_v)
  File "/home/allan/tmp/HSCRF-pytorch/model/crf_layer.py", line 131, in decode
    decode_idx[idx] = pointer
RuntimeError: expand(torch.cuda.LongTensor{[50, 1]}, size=[50]): the number of sizes provided (1) must be greater or equal to the number of dimensions in the tensor (2)

hard-coded dimensions of view cnn_lstm

In the cnn_lstm function in word_rep_layer.py, you've hard-coded the dimensions in the view of d_char_out; that is, you are using (-1, self.batch_size, 30). It should be (word_seq.size(0), self.batch_size, -1) to make it general.

I think this won't work if you change the embedding dimension or the number of characters in the alphabet.
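A toy illustration of the suggested change (variable names mirror the issue; the real shapes in word_rep_layer.py are assumed):

    import torch

    # Variable names mirror the issue; shapes are assumed, not taken from
    # word_rep_layer.py. seq_len stands in for word_seq.size(0).
    batch_size, seq_len, char_hidden = 2, 7, 30
    d_char_out = torch.randn(seq_len * batch_size, char_hidden)

    hard_coded = d_char_out.view(-1, batch_size, 30)    # breaks if char_hidden != 30
    general = d_char_out.view(seq_len, batch_size, -1)  # works for any char_hidden
    assert torch.equal(hard_coded, general)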

What is the format of the dataset?

The dataset you provide in the data folder has sentences such as:

AUGUST RB I-NP O
1996 CD I-NP O
CDU NNP I-NP I-ORG
/ SYM O I-ORG
CSU NNP I-NP I-ORG
SPD NNP I-NP B-ORG
FDP NNP I-NP B-ORG
Greens NNP I-NP B-ORG
PDS NNP I-NP B-ORG

This is not BIO, as B follows I; it should have been B-ORG followed by multiple I-ORG. The same happens with B-MISC and B-PER, and B-LOC doesn't even exist. Please clarify.
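For context, the sample is consistent with the original CoNLL-2003 IOB1 scheme, in which a chunk opens with I-X unless it immediately follows another chunk of the same type (only then B-X). A hedged sketch converting such tags to the more common BIO2:

    def iob1_to_bio2(tags):
        """Convert IOB1 tags (chunks open with I- unless adjacent to a
        same-type chunk) into BIO2 (every chunk opens with B-)."""
        out = []
        for i, tag in enumerate(tags):
            if tag.startswith('I-'):
                prev = tags[i - 1] if i > 0 else 'O'
                # a chunk starts here unless the previous tag continues the same type
                if prev == 'O' or prev[2:] != tag[2:]:
                    tag = 'B-' + tag[2:]
            out.append(tag)
        return out

    print(iob1_to_bio2(['I-ORG', 'I-ORG', 'B-ORG', 'O', 'I-PER']))
    # ['B-ORG', 'I-ORG', 'B-ORG', 'O', 'B-PER']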

sequence length mismatch

Traceback (most recent call last):
  File "F:/Python_Projects/HSCRF_NER_pytorch/train.py", line 203, in <module>
    evaluator.calc_score(model, dev_dataset_loader)
  File "F:\Python_Projects\HSCRF_NER_pytorch\model\evaluator.py", line 181, in calc_score
    scrf_result_scored_by_crf = utils.rescored_with_crf(decoded_scrf, self.l_map, ner_model.crf.crf_scores)
  File "F:\Python_Projects\HSCRF_NER_pytorch\model\utils.py", line 871, in rescored_with_crf
    tg_energy = torch.gather(scores.view(seq_len, bat_size, -1), 2, scrfdata).view(seq_len, bat_size)
RuntimeError: invalid argument 2: Input tensor must have same size as output tensor apart from the specified dimension at c:\users\administrator\downloads\new-builder\win-wheel\pytorch\aten\src\thc\generic/THCTensorScatterGather.cu:29

Process finished with exit code 1

On using the CRF model to score the prediction of the SCRF: in utils.py, in the function scrf_to_crf(), after the prediction of the SCRF is converted to CRF data, its sequence length is not consistent with the sequence length of the CRF model scores; the former = length of the sentence + 1, but the latter = thresholds[idx].

How to use loss from HSCRF?

Hey,

When I run a forward pass with your HSCRF module, I get a loss formatted like the dump below.
In your training code, you use it like this: epoch_loss += utils.to_scalar(loss) (here: https://github.com/ZhixiuYe/HSCRF-pytorch/blob/master/train.py#L177).

What exactly does this do? Why is the loss not directly a scalar?
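For what it's worth, helpers named to_scalar typically just pull a Python number out of a one-element Variable for logging. A sketch of such a helper (assumed from context, not verbatim from utils.py), plus the reduction a vector-valued loss like the dump below would need before backward():

    import torch

    # Hypothetical to_scalar, assumed rather than copied from utils.py:
    # extract a plain Python float so epoch_loss accumulation does not
    # keep the autograd graph alive.
    def to_scalar(var):
        return var.view(-1).data.tolist()[0]

    loss = torch.tensor([-40.1046])   # a one-element loss
    epoch_loss = 0.0
    epoch_loss += to_scalar(loss)     # -40.1046 as a Python float

    # A long per-element vector like the dump below would first need an
    # explicit reduction before backward(), e.g.:
    # loss = loss.sum()   # or loss.mean()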

Loss

tensor([  -40.1046, -1039.7493, -2039.6393, -1039.3293, -1038.7677,
        -2038.6620, -2038.5282, -2038.3925, -2038.2518, -2038.1088,
        -2037.9637, -2037.8209, -2037.6803, -2037.5381, -2037.3993,
        -2037.2581, -2037.1168, -2036.9700, -2036.8280, -2036.6908,
        -2036.5507, -2036.4150, -2036.2721, -2036.1278, -2035.9827,
        -2035.8431, -2035.7041, -2035.5673, -2035.4296, -2035.2881,
        -2035.1510, -2035.0101, -2034.8658, -2034.7272, -2034.5862,
        -2034.4449, -2034.3057, -2034.1682, -2034.0237, -1033.9022,
        -1040.0959, -2040.0959, -2039.9283, -2039.7893, -2039.6503,
        -2039.5123, -2039.3701, -2039.2305, -2039.0912, -2038.9553,
        -2038.8187, -2038.6780, -2038.5389, -2038.3978,   -36.7899,
          -40.0815, -1040.0814, -2039.9296, -2039.7889, -2039.6522,
        -2039.5165, -2039.3783, -2039.2343, -2039.0922, -2038.9550,
        -2038.8119, -1038.5255, -1038.3424,   -37.4315, -1040.0959,
        -2040.0959, -2039.9336, -2039.7943, -1039.5092, -1039.1903,
        -2039.0892, -2038.9465, -2038.8047, -2038.6637, -2038.5248,
        -2038.3876, -2038.2494, -1037.9631, -1037.5199, -2037.4154,
        -2037.2767, -2037.1420, -2037.0059, -2036.8678, -2036.7299,
        -2036.5885, -2036.4471, -2036.3021, -2036.1615, -2036.0237,
        -2035.8873, -2035.7494, -2035.6093, -2035.4706, -1035.3413,
          -40.0959, -1038.7875, -2038.6840, -2038.5472, -2038.4045,
        -2038.2649, -2038.1256, -2037.9882, -2037.8506, -2037.7131,
        -2037.5725, -2037.4362, -2037.2994, -2037.1644, -2037.0319,
        -2036.8983, -2036.7588, -2036.6194, -2036.4772, -2036.3358,
        -1036.2131,   -40.0959, -1039.0562, -2038.9530, -2038.8147,
        -2038.6696, -2038.5270, -2038.3877, -2038.2422, -2038.1002,
        -2037.9564, -2037.8147, -1037.6818,   -40.1046, -1039.7607,
        -2039.6508, -2039.5139, -2039.3763, -2039.2395, -2039.1012,
        -2038.9645, -2038.8303, -2038.6946, -2038.5599, -2038.4235,
        -1038.1422, -1037.9800,   -37.7956,   -40.1046, -1039.7583,
        -1039.5063, -1039.3461, -2039.2405, -2039.1073, -2038.9686,
        -2038.8303, -2038.6901, -2038.5526, -2038.4150, -2038.2737,
        -2038.1295, -2037.9882, -1037.8634, -1040.0959, -2040.0959,
        -2039.9303,   -40.0959, -1038.6449, -2038.5428, -2038.4020,
        -2038.2581, -2038.1158, -2037.9739, -2037.8346, -2037.6959,
        -2037.5570, -2037.4202, -2037.2764, -2037.1334, -2036.9911,
        -2036.8450, -1036.7113,   -40.1044, -1039.9014, -2039.7880,
          -40.0959, -1038.3672, -2038.2637, -2038.1252, -2037.9850,
        -2037.8483, -2037.7079, -2037.5686, -2037.4323, -2037.2991,
        -2037.1674, -2037.0237, -2036.8802, -2036.7416, -2036.6041,
        -2036.4674, -2036.3307, -2036.1898, -2036.0443, -2035.9014,
        -2035.7561, -2035.6185, -2035.4778, -1035.3492], device='cuda:0')
