zhixiuye / hscrf-pytorch
ACL 2018: Hybrid semi-Markov CRF for Neural Sequence Labeling (http://aclweb.org/anthology/P18-2038)
Hi, when implementing model 9 from Table 1, how should the parameters be set?
Hi. I read your code and I have a question about the function scrf_to_crf in utils.py, i.e.:
# Truncate each decoded SCRF sequence at the first <pad> label,
# prepending the <start> label.
for i_l in decoded_scrf:
    sent_labels = [l_map['<start>']]
    for label in i_l:
        if label != l_map['<pad>']:
            sent_labels.append(label)
        else:
            break
    crf_labels.append(sent_labels)

# Re-encode each label sequence as bigram indices
# (prev_label * label_size + curr_label) and build padding masks.
crfdata = []
masks = []
maxl_1 = max([len(i) for i in crf_labels])
for i_l in crf_labels:
    cur_len_1 = len(i_l)
    cur_len = cur_len_1 - 1
    i_l_pad = [i_l[ind] * label_size + i_l[ind + 1] for ind in range(0, cur_len)] \
        + [i_l[cur_len] * label_size + pad_label] \
        + [pad_label * label_size + pad_label] * (maxl_1 - cur_len_1)
    mask = [1] * cur_len_1 + [0] * (maxl_1 - cur_len_1)
    crfdata.append(i_l_pad)
    masks.append(mask)
Why does it break if label == l_map['<pad>']? The break makes the length of sent_labels differ from that of decoded_scrf[0], resulting in a mismatch between sent_labels and the sentence in terms of length.
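For what it's worth, here is a tiny standalone sketch of the bigram encoding that scrf_to_crf appears to use; label_size, the label ids, and pad_label below are made-up values, not the repo's:

# Hypothetical toy illustrating the prev * label_size + curr encoding:
# each position packs a label bigram into one index so a flattened
# (label_size * label_size) transition score table can be gathered from.
label_size = 5                 # assumed: 5 labels including <start>/<pad>
pad_label = 4                  # assumed id of <pad>
sent_labels = [0, 2, 3]        # assumed: <start>, then two real labels

bigrams = [sent_labels[i] * label_size + sent_labels[i + 1]
           for i in range(len(sent_labels) - 1)]
bigrams.append(sent_labels[-1] * label_size + pad_label)  # last label -> <pad>
print(bigrams)  # [2, 13, 19]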
How can I plug the HSCRF model into my own model as a classifier? What exactly should I do?
I'm a complete beginner; thanks!
Can you please point to the reference you used to implement the loss function for the NER task?
I think your paper only discusses the loss function for word-level segmentation.
Thanks
Traceback (most recent call last):
  File "train.py", line 205, in <module>
    evaluator.calc_score(model, dev_dataset_loader)
  File "/u/suhubdyd/projects/HSCRF-pytorch/model/evaluator.py", line 172, in calc_score
    decoded_crf, crf_result_scored_by_crf = utils.decode_with_crf(ner_model.crf, word_representations, mask_v,self.l_map)
  File "/u/suhubdyd/projects/HSCRF-pytorch/model/utils.py", line 763, in decode_with_crf
    decoded_crf = crf.decode(word_reps, mask_v)
  File "/u/suhubdyd/projects/HSCRF-pytorch/model/crf_layer.py", line 131, in decode
    decode_idx[idx] = pointer
RuntimeError: expand(torch.cuda.LongTensor{[50, 1]}, size=[50]): the number of sizes provided (1) must be greater or equal to the number of dimensions in the tensor (2)
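Not the author's fix, but one likely cause with a minimal sketch, assuming pointer comes out of the backtracking step with shape [50, 1] while decode_idx[idx] expects a flat [50] row:

import torch

decode_idx = torch.zeros(3, 50, dtype=torch.long)
pointer = torch.zeros(50, 1, dtype=torch.long)    # shape [50, 1], as in the error
# decode_idx[0] = pointer           # fails: cannot expand [50, 1] into [50]
decode_idx[0] = pointer.squeeze(1)  # dropping the trailing dim makes shapes match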
Have you tried this method on a Chinese dataset?
Your requirements.txt contains the wrong PyTorch requirement: either your code base is Python 3.x, or your PyTorch entry should point to a wheel built for Python 2.7. Please fix it, and also update the references.
Hi, what really confuses me is why logalpha is initialized to -10000 rather than zero or random values. Is there any special meaning?
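A guess at the rationale, with a minimal sketch (shapes and values are made up): in log space a probability of 0 is -inf, and -10000 is a finite stand-in for it, whereas 0 would mean probability 1.

import torch

# exp(-10000) underflows to 0.0, so "impossible" states contribute nothing
# to logsumexp; initializing them to 0 would wrongly treat them as prob 1.
logalpha = torch.full((4,), -10000.0)  # all states impossible...
logalpha[0] = 0.0                      # ...except <start>, with log(1) = 0
print(torch.logsumexp(logalpha, dim=0))  # ~0.0: dead states are ignored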
Have you ever had this problem:
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1511304568725/work/torch/lib/THC/generic/THCStorage.cu:66
I used a Chinese corpus with embedding dimension 300. I tried a smaller dataset and set batch_size = 5, but it failed again.
I wonder what the data in the ./data folder is.
When I try to run eval.py, I notice that the './checkpoint/6365035.json' and './checkpoint/6365035.model' files are missing from this code source. Does this project need these two files? I would be grateful if you could offer me more information.
Could you explain in detail how the score is computed, especially for different span_len values?
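Not the repo's exact computation, but a minimal sketch of the general semi-CRF idea it builds on (shapes, names, and the scoring form are assumptions; the paper's hybrid scoring adds more than this): a candidate span [i, j] with tag t aggregates the word-level scores inside it plus a transition term, and spans of every allowed length up to allowspan compete during the dynamic program.

import torch

word_scores = torch.randn(7, 4)   # assumed (seq_len, num_tags) word-level scores
transitions = torch.randn(4, 4)   # assumed (num_tags, num_tags) transition table

def span_score(i, j, tag, prev_tag):
    # sum of word-level scores inside the span + transition from the previous tag
    return word_scores[i:j + 1, tag].sum() + transitions[prev_tag, tag]

print(span_score(2, 4, tag=1, prev_tag=0))  # a length-3 span scored as one unit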
I ran the code on Chinese NER training data (around 70 thousand sentences), and I got an OOM error:
Hi, I have been trying to use your code as a reference to implement a similar SCRF variant, and I am a little confused by the get_logloss_numerator function. What exactly are the goldfactors you use for the correct path scores?
I am speaking about this line: https://github.com/ZhixiuYe/HSCRF-pytorch/blob/master/model/hscrf_layer.py#L146
Could you give me an example of how I can generate my own factors?
EDIT: I think I understand the format: each item is a tensorized list of (from_id, to_id, prev_tag, curr_tag). If that is correct, the comment saying the size is (batch_size, tag_len, 4) is confusing, since tag_len then means the maximum number of unique tag sequences among items in the batch; I had assumed tag_len meant tagset_size.
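If the (from_id, to_id, prev_tag, curr_tag) reading is right, generating factors for one sentence might look like this sketch (the tag ids and the gold segmentation are invented for illustration):

import torch

# Hypothetical: gold segmentation [0..1]=PER, [2..2]=O, with assumed tag ids
# START=0, PER=1, O=2.
segments = [(0, 1, 0, 1),   # span 0-1 tagged PER, preceded by START
            (2, 2, 1, 2)]   # span 2-2 tagged O, preceded by PER
goldfactors = torch.tensor([segments])   # (batch_size=1, num_segments, 4)
print(goldfactors.size())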
Traceback (most recent call last):
  File "/home/04-HSCRF-pytorch-master/train.py", line 210, in <module>
    evaluator.calc_score(model, dev_dataset_loader)
  File "/home/04-HSCRF-pytorch-master/evaluator.py", line 172, in calc_score
    decoded_crf, crf_result_scored_by_crf = utils.decode_with_crf(ner_model.crf, word_representations, mask_v,self.l_map)
  File "/home/04-HSCRF-pytorch-master/utils.py", line 777, in decode_with_crf
    bi_crf = torch.cuda.LongTensor(bi_crf).transpose(0,1).unsqueeze(2)
RuntimeError: tried to construct a tensor from a nested int sequence, but found an item of type numpy.int64 at index (0, 0)
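Not the author's fix, but a workaround sketch for this class of error, assuming bi_crf is a nested list of numpy.int64 scalars (older PyTorch refuses those where plain ints work):

import numpy as np
import torch

bi_crf = [[np.int64(3), np.int64(7)], [np.int64(1), np.int64(2)]]  # made-up data
# Going through numpy first (or casting each element to int) sidesteps it:
t = torch.from_numpy(np.asarray(bi_crf, dtype=np.int64))  # then .cuda() if needed
# or: torch.LongTensor([[int(x) for x in row] for row in bi_crf])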
Hey, I'm running the code for binary labeling (I only have the MISC tag).
I'm working with 300-dimensional Hindi Word2Vec embeddings.
Here's a pastebin of the error : https://pastebin.com/fDF2aTRD
Any inputs on how to resolve this?
Is it possible to use PyTorch 0.4.0?
I seem to be running into some issues, but I'm not sure whether they are due to the version.
allan@statnlp0:~/tmp/HSCRF-pytorch$ CUDA_VISIBLE_DEVICES=0 python train.py --char_lstm --high_way
seed: 5703958
setting:
Namespace(allowspan=6, batch_size=10, char_embedding_dim=30, char_lstm=True, char_lstm_hidden_dim=300, char_lstm_layers=1, checkpoint='./checkpoint/', clip_grad=5.0, cnn_filter_num=30, dev_file='./data/eng.testa', dropout_ratio=0.55, early_stop=10, emb_file='./data/glove.6B.100d.txt', epoch=150, grconv=False, high_way=True, highway_layers=1, index_embeds_dim=10, least_epoch=75, load_check_point='', load_opt=False, lr=0.015, lr_decay=0.05, mini_count=5, model_name='HSCRF', momentum=0.9, scrf_dense_dim=100, shrink_embedding=False, start_epoch=0, test_file='./data/eng.testb', train_file='./data/eng.train', unk='unk', word_embedding_dim=100, word_hidden_dim=300, word_lstm_layers=1)
loading corpus
constructing coding table
/home/allan/tmp/HSCRF-pytorch/model/utils.py:642: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(input_embedding, -bias, bias)
constructing dataset
building model
/usr/local/lib/python2.7/dist-packages/torch/nn/modules/rnn.py:38: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.55 and num_layers=1
"num_layers={}".format(dropout, num_layers))
/home/allan/tmp/HSCRF-pytorch/model/utils.py:649: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(input_linear.weight, -bias, bias)
/home/allan/tmp/HSCRF-pytorch/model/hscrf_layer.py:60: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(input_embedding, -bias, bias)
/home/allan/tmp/HSCRF-pytorch/model/hscrf_layer.py:68: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(input_linear.weight, -bias, bias)
/home/allan/tmp/HSCRF-pytorch/model/utils.py:660: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(weight, -bias, bias)
/home/allan/tmp/HSCRF-pytorch/model/utils.py:663: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(weight, -bias, bias)
UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
(this warning is repeated many times)
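For context, that warning comes from pre-0.4 code that wrapped eval-time tensors in Variable(..., volatile=True); under 0.4+ the equivalent is an explicit no-grad context, roughly:

import torch

x = torch.randn(2, 3)
with torch.no_grad():      # replaces volatile=True at evaluation time
    y = x * 2              # no autograd graph is recorded here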
In the cnn_lstm function in word_rep_layer.py, you've hard-coded the dimensions in the view of d_char_out: you are using (-1, self.batch_size, 30), but it should be (word_seq.size(0), self.batch_size, -1) to make it general. I think this won't work if you change the embedding dimension or the number of characters in the alphabet. See the sketch below.
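A small sketch of the suggested change (the shapes are invented): letting view() infer the character-feature dimension keeps the reshape valid when the embedding size changes.

import torch

word_seq_len, batch_size, char_dim = 12, 10, 50   # char_dim no longer fixed at 30
d_char_out = torch.randn(word_seq_len * batch_size, char_dim)
general = d_char_out.view(word_seq_len, batch_size, -1)  # adapts to any char_dim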
The dataset you provide in the data folder has sentences such as:
AUGUST RB I-NP O
1996 CD I-NP O
CDU NNP I-NP I-ORG
/ SYM O I-ORG
CSU NNP I-NP I-ORG
SPD NNP I-NP B-ORG
FDP NNP I-NP B-ORG
Greens NNP I-NP B-ORG
PDS NNP I-NP B-ORG
This is not BIO, since B follows I; it should have been B-ORG followed by multiple I-ORG tags. The same happens with B-MISC and B-PER, and B-LOC doesn't even exist. Please clarify.
Hi,
Thank you for your excellent work.
As described in the paper, you adopted the BIOES tagging scheme in the experiments. However, the CoNLL 2003 NER dataset appears to be annotated in BIO, which confuses me.
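For reference, converting the IOB1-style CoNLL annotations to BIOES is a purely mechanical step; a rough sketch (not the repo's own converter, and it assumes well-formed IOB1 input) could look like:

def iob1_to_bioes(tags):
    # Re-emit every chunk as B/I/E (multi-token) or S (single-token).
    out = list(tags)
    n = len(tags)
    for i, tag in enumerate(tags):
        if tag == 'O':
            continue
        ttype = tag.split('-', 1)[1]
        # In IOB1, B- marks a new chunk only right after a same-type chunk.
        prev_same = (i > 0 and tags[i - 1].endswith('-' + ttype)
                     and not tag.startswith('B-'))
        next_same = i + 1 < n and tags[i + 1] == 'I-' + ttype
        if not prev_same and not next_same:
            out[i] = 'S-' + ttype
        elif not prev_same:
            out[i] = 'B-' + ttype
        elif not next_same:
            out[i] = 'E-' + ttype
    return out

print(iob1_to_bioes(['I-ORG', 'I-ORG', 'B-ORG']))  # ['B-ORG', 'E-ORG', 'S-ORG']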
Traceback (most recent call last):
  File "F:/Python_Projects/HSCRF_NER_pytorch/train.py", line 203, in <module>
    evaluator.calc_score(model, dev_dataset_loader)
  File "F:\Python_Projects\HSCRF_NER_pytorch\model\evaluator.py", line 181, in calc_score
    scrf_result_scored_by_crf = utils.rescored_with_crf(decoded_scrf, self.l_map, ner_model.crf.crf_scores)
  File "F:\Python_Projects\HSCRF_NER_pytorch\model\utils.py", line 871, in rescored_with_crf
    tg_energy = torch.gather(scores.view(seq_len, bat_size, -1), 2, scrfdata).view(seq_len, bat_size)
RuntimeError: invalid argument 2: Input tensor must have same size as output tensor apart from the specified dimension at c:\users\administrator\downloads\new-builder\win-wheel\pytorch\aten\src\thc\generic/THCTensorScatterGather.cu:29
Process finished with exit code 1
On using the CRF model to score the predictions of the SCRF:
In utils.py, in the function scrf_to_crf(): after the SCRF prediction is converted to CRF data, its sequence length is not consistent with the sequence length of the CRF model scores. The former equals the length of the sentence + 1, but the latter equals thresholds[idx].
HSCRF-pytorch/model/hscrf_layer.py, line 186 (commit 86bb648)
Hey,
When I run a forward pass with your HSCRF module, I get a loss formatted like the one below.
In your training code, you use it like this: epoch_loss += utils.to_scalar(loss)
(here: https://github.com/ZhixiuYe/HSCRF-pytorch/blob/master/train.py#L177).
What exactly does this do? Why is the loss not directly a scalar?
Loss
tensor([ -40.1046, -1039.7493, -2039.6393, -1039.3293, -1038.7677,
-2038.6620, -2038.5282, -2038.3925, -2038.2518, -2038.1088,
-2037.9637, -2037.8209, -2037.6803, -2037.5381, -2037.3993,
-2037.2581, -2037.1168, -2036.9700, -2036.8280, -2036.6908,
-2036.5507, -2036.4150, -2036.2721, -2036.1278, -2035.9827,
-2035.8431, -2035.7041, -2035.5673, -2035.4296, -2035.2881,
-2035.1510, -2035.0101, -2034.8658, -2034.7272, -2034.5862,
-2034.4449, -2034.3057, -2034.1682, -2034.0237, -1033.9022,
-1040.0959, -2040.0959, -2039.9283, -2039.7893, -2039.6503,
-2039.5123, -2039.3701, -2039.2305, -2039.0912, -2038.9553,
-2038.8187, -2038.6780, -2038.5389, -2038.3978, -36.7899,
-40.0815, -1040.0814, -2039.9296, -2039.7889, -2039.6522,
-2039.5165, -2039.3783, -2039.2343, -2039.0922, -2038.9550,
-2038.8119, -1038.5255, -1038.3424, -37.4315, -1040.0959,
-2040.0959, -2039.9336, -2039.7943, -1039.5092, -1039.1903,
-2039.0892, -2038.9465, -2038.8047, -2038.6637, -2038.5248,
-2038.3876, -2038.2494, -1037.9631, -1037.5199, -2037.4154,
-2037.2767, -2037.1420, -2037.0059, -2036.8678, -2036.7299,
-2036.5885, -2036.4471, -2036.3021, -2036.1615, -2036.0237,
-2035.8873, -2035.7494, -2035.6093, -2035.4706, -1035.3413,
-40.0959, -1038.7875, -2038.6840, -2038.5472, -2038.4045,
-2038.2649, -2038.1256, -2037.9882, -2037.8506, -2037.7131,
-2037.5725, -2037.4362, -2037.2994, -2037.1644, -2037.0319,
-2036.8983, -2036.7588, -2036.6194, -2036.4772, -2036.3358,
-1036.2131, -40.0959, -1039.0562, -2038.9530, -2038.8147,
-2038.6696, -2038.5270, -2038.3877, -2038.2422, -2038.1002,
-2037.9564, -2037.8147, -1037.6818, -40.1046, -1039.7607,
-2039.6508, -2039.5139, -2039.3763, -2039.2395, -2039.1012,
-2038.9645, -2038.8303, -2038.6946, -2038.5599, -2038.4235,
-1038.1422, -1037.9800, -37.7956, -40.1046, -1039.7583,
-1039.5063, -1039.3461, -2039.2405, -2039.1073, -2038.9686,
-2038.8303, -2038.6901, -2038.5526, -2038.4150, -2038.2737,
-2038.1295, -2037.9882, -1037.8634, -1040.0959, -2040.0959,
-2039.9303, -40.0959, -1038.6449, -2038.5428, -2038.4020,
-2038.2581, -2038.1158, -2037.9739, -2037.8346, -2037.6959,
-2037.5570, -2037.4202, -2037.2764, -2037.1334, -2036.9911,
-2036.8450, -1036.7113, -40.1044, -1039.9014, -2039.7880,
-40.0959, -1038.3672, -2038.2637, -2038.1252, -2037.9850,
-2037.8483, -2037.7079, -2037.5686, -2037.4323, -2037.2991,
-2037.1674, -2037.0237, -2036.8802, -2036.7416, -2036.6041,
-2036.4674, -2036.3307, -2036.1898, -2036.0443, -2035.9014,
-2035.7561, -2035.6185, -2035.4778, -1035.3492], device='cuda:0')
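On the to_scalar question above: helpers with that name in pre-0.4 PyTorch codebases are usually a one-liner like the sketch below (a guess, not copied from this repo), which presupposes a one-element tensor; a vector like the one printed would need a reduction such as .sum() or .mean() first.

def to_scalar(var):
    # Assumed pre-0.4 idiom: pull the single Python number out of a
    # one-element tensor/Variable.
    return var.view(-1).data.tolist()[0]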