qipeng / gcn-over-pruned-trees
Graph Convolution over Pruned Dependency Trees Improves Relation Extraction (authors' PyTorch implementation)
License: Other
When I try to reproduce the results, every predicted value is "no_relation": precision is 100%, recall is 0, and dev_f1 is also 0.
Evaluating on dev set...
['no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation']
Precision (micro): 100.000%
Recall (micro): 0.000%
F1 (micro): 0.000%
epoch 19: train_loss = 0.852363, dev_loss = 5.416558, dev_f1 = 0.0000
model saved to ./saved_models/00/checkpoint_epoch_19.pt
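For readers puzzled by the 100% precision: here is a minimal sketch of a micro-averaged scorer. It assumes the scorer treats no_relation as the negative class and defaults precision to 1.0 when no positive prediction is made, which is consistent with the log above. The function name micro_prf is hypothetical and is not taken from the repository.

```python
def micro_prf(gold, pred, negative="no_relation"):
    """Micro-averaged precision/recall/F1 that ignores the negative class."""
    # A prediction counts as correct only if it is a non-negative label that matches gold.
    correct = sum(1 for g, p in zip(gold, pred) if p != negative and g == p)
    guessed = sum(1 for p in pred if p != negative)
    actual = sum(1 for g in gold if g != negative)
    # Assumption: precision defaults to 1.0 when nothing is guessed.
    precision = correct / guessed if guessed > 0 else 1.0
    recall = correct / actual if actual > 0 else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


gold = ["per:title", "no_relation", "per:age", "org:top_members/employees"]
pred = ["no_relation"] * len(gold)
print(micro_prf(gold, pred))  # (1.0, 0.0, 0.0)
```

Under these assumptions, P=100%, R=0, F1=0 is what a model that has collapsed to predicting only no_relation would produce, so the scores alone do not point to a scoring bug.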
Hi Yuhao,
Congratulations on the good work. In the original paper, you ran experiments on the SemEval 2010 Task 8 dataset, but I don't see that experiment in this code. Could you share your experiment code for this dataset? Thank you!
Dear @yuhaozhang ,
In Appendix A.1 of your paper, you say that 2 GCN layers give the best performance. But when I ran your source code and tuned the number of GCN layers, I found that 1 layer works best, matching your reported performance (P=0.695 R=0.636 F=0.664). With 2 GCN layers, I got (P=0.68 R=0.62 F=0.644). Is there a typo in your Appendix? Or what is the number of GCN layers for the best performance? Thanks!
I am getting strange results like precision=100%, R=0, F=0.
You can see
Per-relation statistics:
org:top_members/employees P: 100.00% R: 0.00% F1: 0.00% #: 1
per:age P: 100.00% R: 0.00% F1: 0.00% #: 1
per:origin P: 100.00% R: 0.00% F1: 0.00% #: 1
per:title P: 100.00% R: 0.00% F1: 0.00% #: 1
Final Score:
Precision (micro): 100.000%
Recall (micro): 0.000%
F1 (micro): 0.000%
test set evaluate result: 1.00 0.00 0.00
Evaluation ended.
Hi, thanks for sharing your code. I noticed a bug that would affect the experimental results.
The line of code below constructs subj_mask and obj_mask according to whether subj_pos or obj_pos is 0. But in DataLoader, shorter sequences are also padded with 0 for their subj_pos and obj_pos values, so subj_mask and obj_mask don't mask the padding tokens.
gcn-over-pruned-trees/model/gcn.py
Line 88 in db7c128
This affects the subsequent subject and object pooling operations, because the representation vectors for padding tokens are not 0 (for example, a linear transformation adds a bias term to these vectors).
Changing it to the following would fix the problem:
subj_mask, obj_mask = subj_pos.eq(0).eq(0), obj_pos.eq(0).eq(0) # invert mask
subj_mask = (subj_mask | masks).unsqueeze(2) # logical or with word masks
obj_mask = (obj_mask | masks).unsqueeze(2)
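To make the padding problem concrete, here is a plain-Python sketch of the same mask logic, with lists standing in for tensors. It assumes relative positions are 0 inside the entity span, negative before it and positive after it, and that batches are padded with 0. The helper names are hypothetical.

```python
def get_positions(start, end, length):
    """Relative positions: 0 inside [start, end], negative before, positive after."""
    return list(range(-start, 0)) + [0] * (end - start + 1) + list(range(1, length - end))


def pad(seq, size, pad_id=0):
    """Right-pad a sequence with pad_id up to the batch length."""
    return seq + [pad_id] * (size - len(seq))


def original_mask(pos):
    """True means 'mask out'; plays the role of pos.eq(0).eq(0)."""
    return [p != 0 for p in pos]


def fixed_mask(pos, pad_mask):
    """Logical OR with the padding mask, as in the proposed fix."""
    return [m or is_pad for m, is_pad in zip(original_mask(pos), pad_mask)]


length, max_len = 4, 6                                 # a 4-token sentence in a batch padded to 6
subj_pos = pad(get_positions(1, 1, length), max_len)   # [-1, 0, 1, 2, 0, 0]
pad_mask = [i >= length for i in range(max_len)]       # True at padding positions

kept_original = [i for i, m in enumerate(original_mask(subj_pos)) if not m]
kept_fixed = [i for i, m in enumerate(fixed_mask(subj_pos, pad_mask)) if not m]
print(kept_original)  # [1, 4, 5]
print(kept_fixed)     # [1]
```

With the original mask, the two padding positions survive alongside the real subject token at index 1, so they would take part in pooling. With the padding mask OR-ed in, only the subject token remains.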
I see that "head" in tree.py is used to construct the dependency parse tree. It corresponds to "stanford_head" in the given dataset. "stanford_deprel" is also in the dataset but is not used. I don't understand what "stanford_head" represents in the dependency parse. Could you please explain it, or point me to any information about it? Thank you!
In the code:
gcn-over-pruned-trees/model/tree.py
Line 80 in db7c128
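As a hedged illustration of how such a head array is usually read: the sketch below assumes the common CoNLL-style convention, where each entry is the 1-based index of that token's syntactic head and 0 marks the root. The tokens and heads are the first four entries of a sample record quoted elsewhere on this page, and heads_to_children is a hypothetical helper, not a function from the repository.

```python
def heads_to_children(heads):
    """Turn a 1-based head array (0 = root) into a root index and child lists (0-based)."""
    children = {i: [] for i in range(len(heads))}
    root = None
    for i, h in enumerate(heads):
        if h == 0:
            root = i                    # this token has no head, so it is the root
        else:
            children[h - 1].append(i)   # convert the 1-based head to a 0-based index
    return root, children


tokens = ["Soak", "the", "PVDF", "membrane"]
heads = [0, 4, 4, 1]

root, children = heads_to_children(heads)
print(tokens[root])                          # Soak
print([tokens[c] for c in children[3]])      # ['the', 'PVDF']
```

Under this reading, "membrane" attaches to "Soak", and "the" and "PVDF" attach to "membrane". stanford_deprel would then hold the label of each of those edges, which an unlabeled tree does not need.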
I am sorry to disturb you. I assume that your model fine-tunes all embeddings during the training phase. However, I wonder whether the fine-tuned word embedding layer is used in the test phase.
On line 36 of eval.py, trainer = GCNTrainer(opt), you initialize GCNTrainer without providing the emb_matrix argument, i.e. emb_matrix=None. Therefore, your model initializes gcn_model.emb.weight.data uniformly in the test phase. What about using the gcn_model.emb.weight.data learned during the training phase?
Thank you very much for your time!
Dear authors,
Did you try _not_ finetuning word embeddings and what's the influence? Thanks.
When I run eval.py, an error is raised.
Loading model from saved_models/00/best_model.pt
[ Fail: model loading failed. ]
Traceback (most recent call last):
File "eval.py", line 35, in <module>
opt = torch_utils.load_config(model_file)
File "/home/penzm/gcn-over-pruned-trees/utils/torch_utils.py", line 161, in load_config
return dump['config']
UnboundLocalError: local variable 'dump' referenced before assignment
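The traceback is consistent with a loader that catches the loading failure, prints a message, and then falls through to a return statement that uses a variable which was never assigned. A minimal sketch of that pattern and one possible fix follows. It uses pickle as a stand-in for torch.load so that it runs without PyTorch, and both function bodies are assumptions for illustration, not the repository's code.

```python
import pickle


def load_config_buggy(filename):
    """Swallows the real error, then fails with UnboundLocalError."""
    try:
        with open(filename, "rb") as f:
            dump = pickle.load(f)
    except Exception:
        print("[ Fail: model loading failed. ]")
    return dump["config"]  # 'dump' was never assigned if loading failed


def load_config(filename):
    """Re-raises with context, so the underlying cause stays visible."""
    try:
        with open(filename, "rb") as f:
            dump = pickle.load(f)
    except Exception as exc:
        raise RuntimeError(f"could not load model file {filename!r}") from exc
    return dump["config"]
```

In other words, the UnboundLocalError is a secondary symptom. The real failure is whatever made loading saved_models/00/best_model.pt fail, for example a missing file or a checkpoint saved with an incompatible version.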
In two places that access tokens from the TACRED dataset (here and here), the wrong key has been used. The data in this repository uses tokens instead of token as the key for the list of tokens. This looks like a typo that has since been corrected, as a previous project had what would have worked with this code, namely token, as the key.
I've fixed the code for myself but you may decide whether to fix the code or the data.
P.S. thanks for your code, which is otherwise quite nice and tidy!
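For anyone who would rather not edit either side, a small accessor that accepts both spellings is one workaround. This is a sketch only: get_tokens is a hypothetical helper, and the two key names are the ones mentioned above.

```python
def get_tokens(example):
    """Return the token list from a record that uses either 'token' or 'tokens'."""
    for key in ("token", "tokens"):
        if key in example:
            return list(example[key])
    raise KeyError("record has neither a 'token' nor a 'tokens' field")


print(get_tokens({"token": ["Soak", "the", "membrane"]}))   # ['Soak', 'the', 'membrane']
print(get_tokens({"tokens": ["Soak", "the", "membrane"]}))  # ['Soak', 'the', 'membrane']
```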
I am using Jupyter Notebook on a server.
I have preprocessed a new dataset into the required format (i.e. the TACRED dataset format), but I am getting a MemoryError while training:
bash train_gcn.sh 0
Here are some examples from my dataset:
[{"id":109627235236557200,"docid":"WLP-Dataset-master_updated/train/protocol_205","relation":"Acts-on","token":["Soak","the","PVDF","membrane","in","100","%","methanol","for","1","-","2","minutes","then","rinse","2","-","3","times","with","deionized","water","."],"subj_start":0,"subj_end":0,"obj_start":2,"obj_end":3,"subj_type":"Action","obj_type":"Reagent","stanford_pos":["VB","DT","NN","NN","IN","CD","NN","NN","IN","CD","SYM","CD","NNS","RB","VB","CD","SYM","CD","NNS","IN","VBN","NN","."],"stanford_ner":["Action","O","Reagent","Reagent","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O"],"stanford_head":[0,4,4,1,8,7,8,1,13,13,12,10,1,15,1,19,18,16,15,22,22,15,1],"stanford_deprel":["root","det","compound","obj","case","nummod","compound","obl","case","nummod","case","nmod","obl","advmod","conj","nummod","case","nmod","obl:tmod","case","amod","obl","punct"]},{"id":109627235236557200,"docid":"WLP-Dataset-master_updated/train/protocol_205","relation":"Site","token":["Soak","the","PVDF","membrane","in","100","%","methanol","for","1","-","2","minutes","then","rinse","2","-","3","times","with","deionized","water","."],"subj_start":0,"subj_end":0,"obj_start":7,"obj_end":7,"subj_type":"Action","obj_type":"Reagent","stanford_pos":["VB","DT","NN","NN","IN","CD","NN","NN","IN","CD","SYM","CD","NNS","RB","VB","CD","SYM","CD","NNS","IN","VBN","NN","."],"stanford_ner":["Action","O","O","O","O","O","O","Reagent","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O"],"stanford_head":[0,4,4,1,8,7,8,1,13,13,12,10,1,15,1,19,18,16,15,22,22,15,1],"stanford_deprel":["root","det","compound","obj","case","nummod","compound","obl","case","nummod","case","nmod","obl","advmod","conj","nummod","case","nmod","obl:tmod","case","amod","obl","punct"]},{"id":109627235236557200,"docid":"WLP-Dataset-master_updated/train/protocol_205","relation":"Count","token":["Soak","the","PVDF","membrane","in","100","%","methanol","for","1","-","2","minutes","then","rinse","2","-","
3","times","with","deionized","water","."],"subj_start":14,"subj_end":14,"obj_start":15,"obj_end":18,"subj_type":"Action","obj_type":"Numerical","stanford_pos":["VB","DT","NN","NN","IN","CD","NN","NN","IN","CD","SYM","CD","NNS","RB","VB","CD","SYM","CD","NNS","IN","VBN","NN","."],"stanford_ner":["O","O","O","O","O","O","O","O","O","O","O","O","O","O","Action","Numerical","Numerical","Numerical","Numerical","O","O","O","O"],"stanford_head":[0,4,4,1,8,7,8,1,13,13,12,10,1,15,1,19,18,16,15,22,22,15,1],"stanford_deprel":["root","det","compound","obj","case","nummod","compound","obl","case","nummod","case","nmod","obl","advmod","conj","nummod","case","nmod","obl:tmod","case","amod","obl","punct"]},{"id":109627235236557200,"docid":"WLP-Dataset-master_updated/train/protocol_205","relation":"Using","token":["Soak","the","PVDF","membrane","in","100","%","methanol","for","1","-","2","minutes","then","rinse","2","-","3","times","with","deionized","water","."],"subj_start":14,"subj_end":14,"obj_start":20,"obj_end":21,"subj_type":"Action","obj_type":"Reagent","stanford_pos":["VB","DT","NN","NN","IN","CD","NN","NN","IN","CD","SYM","CD","NNS","RB","VB","CD","SYM","CD","NNS","IN","VBN","NN","."],"stanford_ner":["O","O","O","O","O","O","O","O","O","O","O","O","O","O","Action","O","O","O","O","O","Reagent","Reagent","O"],"stanford_head":[0,4,4,1,8,7,8,1,13,13,12,10,1,15,1,19,18,16,15,22,22,15,1],"stanford_deprel":["root","det","compound","obj","case","nummod","compound","obl","case","nummod","case","nmod","obl","advmod","conj","nummod","case","nmod","obl:tmod","case","amod","obl","punct"]},{"id":109627235236557200,"docid":"WLP-Dataset-master_updated/train/protocol_205","relation":"Acts-on","token":["Soak","the","PVDF","membrane","in","100","%","methanol","for","1","-","2","minutes","then","rinse","2","-","3","times","with","deionized","water","."],"subj_start":14,"subj_end":14,"obj_start":2,"obj_end":3,"subj_type":"Action","obj_type":"Reagent","stanford_pos":["VB","DT","NN","NN","IN",
"CD","NN","NN","IN","CD","SYM","CD","NNS","RB","VB","CD","SYM","CD","NNS","IN","VBN","NN","."],"stanford_ner":["O","O","Reagent","Reagent","O","O","O","O","O","O","O","O","O","O","Action","O","O","O","O","O","O","O","O"],"stanford_head":[0,4,4,1,8,7,8,1,13,13,12,10,1,15,1,19,18,16,15,22,22,15,1],"stanford_deprel":["root","det","compound","obj","case","nummod","compound","obl","case","nummod","case","nmod","obl","advmod","conj","nummod","case","nmod","obl:tmod","case","amod","obl","punct"]},{"id":109627235236557200,"docid":"WLP-Dataset-master_updated/train/protocol_205","relation":"Measure","token":["Soak","the","PVDF","membrane","in","100","%","methanol","for","1","-","2","minutes","then","rinse","2","-","3","times","with","deionized","water","."],"subj_start":7,"subj_end":7,"obj_start":5,"obj_end":6,"subj_type":"Reagent","obj_type":"Concentration","stanford_pos":["VB","DT","NN","NN","IN","CD","NN","NN","IN","CD","SYM","CD","NNS","RB","VB","CD","SYM","CD","NNS","IN","VBN","NN","."],"stanford_ner":["O","O","O","O","O","Concentration","Concentration","Reagent","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O"],"stanford_head":[0,4,4,1,8,7,8,1,13,13,12,10,1,15,1,19,18,16,15,22,22,15,1],"stanford_deprel":["root","det","compound","obj","case","nummod","compound","obl","case","nummod","case","nmod","obl","advmod","conj","nummod","case","nmod","obl:tmod","case","amod","obl","punct"]}]
Here is the error I got:
Finetune all embeddings.
Traceback (most recent call last):
File "train.py", line 142, in <module>
loss = trainer.update(batch)
File "/home/hz071/gcn-over-pruned-trees/model/trainer.py", line 80, in update
logits, pooling_output = self.model(inputs)
File "/home/hz071/.conda/envs/AG-GCNN/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/hz071/gcn-over-pruned-trees/model/gcn.py", line 27, in forward
outputs, pooling_output = self.gcn_model(inputs)
File "/home/hz071/.conda/envs/AG-GCNN/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/hz071/gcn-over-pruned-trees/model/gcn.py", line 84, in forward
adj = inputs_to_tree_reps(head.data, words.data, l, self.opt['prune_k'], subj_pos.data, obj_pos.data)
File "/home/hz071/gcn-over-pruned-trees/model/gcn.py", line 78, in inputs_to_tree_reps
trees = [head_to_tree(head[i], words[i], l[i], prune, subj_pos[i], obj_pos[i]) for i in range(len(l))]
File "/home/hz071/gcn-over-pruned-trees/model/gcn.py", line 78, in <listcomp>
trees = [head_to_tree(head[i], words[i], l[i], prune, subj_pos[i], obj_pos[i]) for i in range(len(l))]
File "/home/hz071/gcn-over-pruned-trees/model/tree.py", line 81, in head_to_tree
tmp += [h-1]
MemoryError
After that, the run stopped.
Can you give me any idea what went wrong? I used the stanza pipeline (with the spaCy tokenizer) for preprocessing.
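A MemoryError on a line that keeps appending while walking up the head chain suggests the walk never reaches the root for some record, for instance because of a cycle in stanford_head or head indices that restart per sentence in a multi-sentence input. That is a guess, not a confirmed diagnosis. The sketch below is a hypothetical validator (check_heads is not part of the repository) that assumes heads are 1-based with 0 for the root and could be run over a dataset before training.

```python
def check_heads(heads):
    """Return a list of problems found in a 1-based head array (0 = root)."""
    n = len(heads)
    problems = []
    roots = [i for i, h in enumerate(heads) if h == 0]
    if len(roots) != 1:
        problems.append(f"expected exactly one root, found {len(roots)}")
    out_of_range = [i for i, h in enumerate(heads) if not 0 <= h <= n]
    for i in out_of_range:
        problems.append(f"token {i}: head {heads[i]} is out of range")
    if out_of_range:
        return problems  # walking the chain is unsafe with invalid indices
    for i in range(n):
        seen = set()
        j = i
        while heads[j] != 0:            # climb towards the root
            if j in seen:
                problems.append(f"token {i}: cycle while walking to the root")
                break
            seen.add(j)
            j = heads[j] - 1
    return problems


sample = [0, 4, 4, 1, 8, 7, 8, 1, 13, 13, 12, 10, 1, 15, 1, 19, 18, 16, 15, 22, 22, 15, 1]
print(check_heads(sample))   # []
print(check_heads([2, 1]))   # reports a missing root and a cycle for each token
```

The head array quoted above passes this check, so if the guess is right the offending record is elsewhere in the dataset. Running the check over every record should locate it.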
Hello, you mention in the paper that the model also performs very well on the SemEval dataset. Could you please share the code that tests this network on SemEval 2010 Task 8?
May I ask for your hyperparameter settings on the SemEval2010_task8 dataset?
I can't reproduce the best performance on SemEval2010_task8 reported in the paper.
Thank you so much!
Hi Yuhao,
The code looks very well written, but it's kind of slow. I found that you are doing all tree manipulations on GPU. Just add
head, words, subj_pos, obj_pos = head.cpu().numpy(), words.cpu().numpy(), subj_pos.cpu().numpy(), obj_pos.cpu().numpy()
at line 77 in gcn.py; the time per batch then drops from 0.12 sec/batch to 0.016 sec/batch on my machine.
The model is designed for named entity recognition with a sequence labeling approach.
The named entities correspond to relations.
I would like to use this model for sentence classification, which needs a sentence representation token such as [CLS] at some place.
How can I adjust the model?
Dear authors:
I have a question about the implementation of the self loop.
In tree.py, lines 173-175, we set adj[i,i]=1:
if self_loop:
    for i in idx:
        ret[i, i] = 1
I think the information of the node itself is already included by the bmm operation on adj and gcn_inputs, as in gcn.py, line 166:
Ax = adj.bmm(gcn_inputs)
So why do we still use gcn.py, line 168, to add a self loop?
AxW = AxW + self.W[l](gcn_inputs)
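The underlying algebra can be checked on a toy graph. The sketch below uses plain Python lists, a hypothetical 3-node chain, and no weight matrix or bias. It shows that putting ones on the diagonal of adj and adding the input explicitly are the same operation, so doing both would count each node twice. Whether adj is actually built with self_loop=True at the call site is not stated in this thread and is worth checking in gcn.py.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]   # undirected chain 0 - 1 - 2, no self loops
X = [[1, 2], [3, 4], [5, 6]]            # one feature row per node

with_diag = matmul(add(A, identity(3)), X)   # self loop via the adjacency diagonal
explicit = add(matmul(A, X), X)              # self loop via an explicit extra term
both = add(with_diag, X)                     # both at once: the node is counted twice

print(with_diag == explicit)  # True
print(both)                   # [[5, 8], [12, 16], [13, 16]]
```

So if the adjacency matrix passed to the GCN layer has a zero diagonal, the extra term on line 168 is the self loop rather than a duplicate of it.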