qipeng / gcn-over-pruned-trees

Graph Convolution over Pruned Dependency Trees Improves Relation Extraction (authors' PyTorch implementation)

License: Other

Python 98.98% Shell 1.02%
information-extraction relation-extraction nlp natural-language-processing dependency-parsing dependency-parse-trees

gcn-over-pruned-trees's People

Contributors

pencoa, wzhouad, yuhaozhang

gcn-over-pruned-trees's Issues

All predicted values are “no_relation”: precision is 100%, recall is 0, and dev_f1 is also 0

When I try to reproduce the results, every predicted value is “no_relation”. Precision is 100%, recall is 0, and dev_f1 is also 0.
Evaluating on dev set...
['no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation', 'no_relation']
Precision (micro): 100.000%
Recall (micro): 0.000%
F1 (micro): 0.000%
epoch 19: train_loss = 0.852363, dev_loss = 5.416558, dev_f1 = 0.0000
model saved to ./saved_models/00/checkpoint_epoch_19.pt
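
The combination of 100% precision and 0% recall is what a micro-averaged scorer typically reports when the model never predicts a positive relation. The following is a minimal sketch, not the repository's scorer: the function name micro_scores is hypothetical, and the convention of reporting precision as 1.0 when there are no positive guesses is an assumption that matches the log above.

```python
NO_RELATION = "no_relation"

def micro_scores(gold, pred):
    """Micro-averaged precision, recall and F1 over labels other than no_relation."""
    correct = guessed = actual = 0
    for g, p in zip(gold, pred):
        if p != NO_RELATION:
            guessed += 1          # model predicted some relation
        if g != NO_RELATION:
            actual += 1           # gold label is some relation
        if g == p and g != NO_RELATION:
            correct += 1
    # Assumed convention: with zero positive guesses, precision defaults to 1.0.
    precision = correct / guessed if guessed else 1.0
    recall = correct / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

gold = ["per:title", "no_relation", "per:age", "no_relation"]
pred = ["no_relation"] * 4
print(micro_scores(gold, pred))  # (1.0, 0.0, 0.0)
```

Under this convention the precision figure carries no information; the recall of 0 is the signal that the model has collapsed to the majority class.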

SemEval 2010 Task 8 Dataset is not used

Hi Yuhao,
Congratulations on the good work. In the original paper, you report experiments on the SemEval 2010 Task 8 dataset, but I don't see this experiment in the code. Could you share your experiment code for this dataset? Thank you!

Which number of GCN layers gives the best performance?

Dear @yuhaozhang,

In Appendix A.1 of your paper, you say that 2 GCN layers give the best performance. But when I ran your source code and tuned the number of GCN layers, I found that 1 layer is best, matching your reported performance (P=0.695 R=0.636 F=0.664). With 2 GCN layers, I got P=0.68 R=0.62 F=0.644. Is there a typo in the Appendix, or which number of GCN layers gives the best performance? Thanks!

Do I need both CPU and GPU to run this implementation?

I am getting strange results such as precision=100%, R=0, F=0.
You can see:
Per-relation statistics:
org:top_members/employees P: 100.00% R: 0.00% F1: 0.00% #: 1
per:age P: 100.00% R: 0.00% F1: 0.00% #: 1
per:origin P: 100.00% R: 0.00% F1: 0.00% #: 1
per:title P: 100.00% R: 0.00% F1: 0.00% #: 1

Final Score:
Precision (micro): 100.000%
Recall (micro): 0.000%
F1 (micro): 0.000%
test set evaluate result: 1.00 0.00 0.00
Evaluation ended.

A bug: subj_mask and obj_mask don't mask the padding tokens

Hi, thanks for sharing your code. I noticed a bug that would affect the experimental results.

The line of code below constructs subj_mask and obj_mask according to whether subj_pos or obj_pos is 0. But in the DataLoader, the subj_pos and obj_pos of shorter sequences are also padded with 0. So subj_mask and obj_mask don't mask the padding tokens.

subj_mask, obj_mask = subj_pos.eq(0).eq(0).unsqueeze(2), obj_pos.eq(0).eq(0).unsqueeze(2) # invert mask

This affects the subsequent subject and object pooling operations, because the representation vectors for padding tokens are not 0 (for example, a linear transformation adds a bias term to these vectors).

Changing it to the following would fix the problem:

subj_mask, obj_mask = subj_pos.eq(0).eq(0), obj_pos.eq(0).eq(0) # invert mask
subj_mask = (subj_mask | masks).unsqueeze(2)  # logical or with word masks
obj_mask = (obj_mask | masks).unsqueeze(2)
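
The effect described in this report can be reproduced without PyTorch. The sketch below uses plain lists; the helpers relative_positions, pad and kept are hypothetical stand-ins, and the padding value 0 is taken from the description above. A mask entry of True means the token is excluded from pooling.

```python
PAD_ID = 0  # assumed padding value, as described in the report

def relative_positions(start, end, length):
    """Position of each token relative to the span [start, end]; 0 inside the span."""
    return (list(range(-start, 0))
            + [0] * (end - start + 1)
            + list(range(1, length - end)))

def pad(seq, size, value=PAD_ID):
    """Right-pad seq to the batch length."""
    return seq + [value] * (size - len(seq))

def kept(mask):
    """Indices that survive the mask (True means masked out)."""
    return [i for i, masked in enumerate(mask) if not masked]

sent_len, batch_len = 3, 5
subj_pos = pad(relative_positions(1, 1, sent_len), batch_len)  # [-1, 0, 1, 0, 0]
pad_mask = [i >= sent_len for i in range(batch_len)]           # True marks padding

buggy_mask = [p != 0 for p in subj_pos]                        # mirrors eq(0).eq(0)
fixed_mask = [m or is_pad for m, is_pad in zip(buggy_mask, pad_mask)]

print(kept(buggy_mask))  # [1, 3, 4]
print(kept(fixed_mask))  # [1]
```

With the original mask, the two padding slots (indices 3 and 4) are pooled together with the real subject token at index 1; OR-ing in the padding mask leaves only the subject.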

Meaning of "stanford_head" in given dataset

I see that "head" in tree.py is used to construct the dependency parse tree, and that it corresponds to "stanford_head" in the given dataset. "stanford_deprel" is also in the dataset but is not used. I don't understand what "stanford_head" represents in the dependency parse. Could you please explain it, or point me to any information about it? Thank you!
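
As a rough illustration, and assuming the common CoNLL-style convention in which each entry is the 1-based index of the token's syntactic head and 0 marks the root (which is consistent with the h-1 indexing visible in the tree.py traceback elsewhere on this page), a head list can be read as follows. The function head_pairs is hypothetical; the tokens and heads are the first four entries of the sample record quoted in the "Memory issue" report.

```python
tokens = ["Soak", "the", "PVDF", "membrane"]
heads = [0, 4, 4, 1]  # assumed: 1-based index of each token's head, 0 = root

def head_pairs(tokens, heads):
    """Pair every token with the token it attaches to in the dependency tree."""
    pairs = []
    for tok, h in zip(tokens, heads):
        parent = "ROOT" if h == 0 else tokens[h - 1]  # shift to 0-based indexing
        pairs.append((tok, parent))
    return pairs

for child, parent in head_pairs(tokens, heads):
    print(f"{child} <- {parent}")
```

Under that reading, "stanford_deprel" holds the label of each of these arcs (for example det for "the" attaching to "membrane"), while "stanford_head" holds the arc itself.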

Higher Loss Value

2021-03-29
I am training the model with prune_k=-1, but I am getting very high loss values. What could the problem be?
Could you kindly give some insight into it?

Using fine-tune word embedding in the evaluation phase

Sorry to disturb you. I assume that your model fine-tunes all embeddings in the training phase. However, I have a question about how the fine-tuned word embedding layer is used in the test phase.

In eval.py, line 36, trainer = GCNTrainer(opt), you initialize GCNTrainer without providing the argument emb_matrix, i.e. emb_matrix=None. Therefore, your model initializes gcn_model.emb.weight.data uniformly in the test phase. What happens to the gcn_model.emb.weight.data learned in the training phase?

Thank you very much for your time!

eval.py error

When I run eval.py, an error is raised.

Loading model from saved_models/00/best_model.pt
[ Fail: model loading failed. ]
Traceback (most recent call last):
  File "eval.py", line 35, in <module>
    opt = torch_utils.load_config(model_file)
  File "/home/penzm/gcn-over-pruned-trees/utils/torch_utils.py", line 161, in load_config
    return dump['config']
UnboundLocalError: local variable 'dump' referenced before assignment
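
The traceback suggests a pattern in which a failed load is caught and reported, but execution then continues to a return statement that uses a variable which was never assigned. The sketch below reproduces that pattern under stated assumptions: json.load stands in for the real checkpoint loader, and both function names are hypothetical rather than the repository's code.

```python
import json

def load_config_fragile(path):
    """Swallows the load error, then fails later with UnboundLocalError."""
    try:
        with open(path) as f:
            dump = json.load(f)
    except Exception:
        print("[ Fail: model loading failed. ]")
    return dump["config"]  # 'dump' is unbound if the load above failed

def load_config_strict(path):
    """Surfaces the real cause instead of masking it."""
    try:
        with open(path) as f:
            dump = json.load(f)
    except (OSError, ValueError) as exc:
        raise RuntimeError(f"could not load model file {path!r}") from exc
    return dump["config"]
```

In practice the UnboundLocalError is a secondary symptom; the underlying failure is whatever made loading fail, such as a wrong path or a checkpoint saved on a different device.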

Inconsistency between the dataset and the loading method

In two places that access tokens from the TACRED dataset (here and here), the wrong key is used. The data in this repository uses tokens instead of token as the key for the list of tokens. This looks like a typo that has since been corrected, as a previous project used the key that would have worked with this code, namely token.

I've fixed the code for myself but you may decide whether to fix the code or the data.

P.S. thanks for your code, which is otherwise quite nice and tidy! ☺️
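
One way to sidestep the mismatch without touching the data is a small accessor that accepts either key. This is only a sketch; get_tokens is a hypothetical helper, not part of the repository.

```python
def get_tokens(example):
    """Return the token list of a record stored under either 'token' or 'tokens'."""
    for key in ("token", "tokens"):
        if key in example:
            return example[key]
    raise KeyError("record has neither a 'token' nor a 'tokens' field")

print(get_tokens({"token": ["Soak", "the", "PVDF", "membrane"]}))
print(get_tokens({"tokens": ["Soak", "the", "PVDF", "membrane"]}))
```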

Memory issue

I am using Jupyter Notebook on a server.
I have preprocessed a new dataset into the required format (i.e. that of the TACRED dataset), but I get a MemoryError while training with:
bash train_gcn.sh 0
Here are some examples from my dataset:

[{"id":109627235236557200,"docid":"WLP-Dataset-master_updated/train/protocol_205","relation":"Acts-on","token":["Soak","the","PVDF","membrane","in","100","%","methanol","for","1","-","2","minutes","then","rinse","2","-","3","times","with","deionized","water","."],"subj_start":0,"subj_end":0,"obj_start":2,"obj_end":3,"subj_type":"Action","obj_type":"Reagent","stanford_pos":["VB","DT","NN","NN","IN","CD","NN","NN","IN","CD","SYM","CD","NNS","RB","VB","CD","SYM","CD","NNS","IN","VBN","NN","."],"stanford_ner":["Action","O","Reagent","Reagent","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O"],"stanford_head":[0,4,4,1,8,7,8,1,13,13,12,10,1,15,1,19,18,16,15,22,22,15,1],"stanford_deprel":["root","det","compound","obj","case","nummod","compound","obl","case","nummod","case","nmod","obl","advmod","conj","nummod","case","nmod","obl:tmod","case","amod","obl","punct"]},{"id":109627235236557200,"docid":"WLP-Dataset-master_updated/train/protocol_205","relation":"Site","token":["Soak","the","PVDF","membrane","in","100","%","methanol","for","1","-","2","minutes","then","rinse","2","-","3","times","with","deionized","water","."],"subj_start":0,"subj_end":0,"obj_start":7,"obj_end":7,"subj_type":"Action","obj_type":"Reagent","stanford_pos":["VB","DT","NN","NN","IN","CD","NN","NN","IN","CD","SYM","CD","NNS","RB","VB","CD","SYM","CD","NNS","IN","VBN","NN","."],"stanford_ner":["Action","O","O","O","O","O","O","Reagent","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O"],"stanford_head":[0,4,4,1,8,7,8,1,13,13,12,10,1,15,1,19,18,16,15,22,22,15,1],"stanford_deprel":["root","det","compound","obj","case","nummod","compound","obl","case","nummod","case","nmod","obl","advmod","conj","nummod","case","nmod","obl:tmod","case","amod","obl","punct"]},{"id":109627235236557200,"docid":"WLP-Dataset-master_updated/train/protocol_205","relation":"Count","token":["Soak","the","PVDF","membrane","in","100","%","methanol","for","1","-","2","minutes","then","rinse","2","-","
3","times","with","deionized","water","."],"subj_start":14,"subj_end":14,"obj_start":15,"obj_end":18,"subj_type":"Action","obj_type":"Numerical","stanford_pos":["VB","DT","NN","NN","IN","CD","NN","NN","IN","CD","SYM","CD","NNS","RB","VB","CD","SYM","CD","NNS","IN","VBN","NN","."],"stanford_ner":["O","O","O","O","O","O","O","O","O","O","O","O","O","O","Action","Numerical","Numerical","Numerical","Numerical","O","O","O","O"],"stanford_head":[0,4,4,1,8,7,8,1,13,13,12,10,1,15,1,19,18,16,15,22,22,15,1],"stanford_deprel":["root","det","compound","obj","case","nummod","compound","obl","case","nummod","case","nmod","obl","advmod","conj","nummod","case","nmod","obl:tmod","case","amod","obl","punct"]},{"id":109627235236557200,"docid":"WLP-Dataset-master_updated/train/protocol_205","relation":"Using","token":["Soak","the","PVDF","membrane","in","100","%","methanol","for","1","-","2","minutes","then","rinse","2","-","3","times","with","deionized","water","."],"subj_start":14,"subj_end":14,"obj_start":20,"obj_end":21,"subj_type":"Action","obj_type":"Reagent","stanford_pos":["VB","DT","NN","NN","IN","CD","NN","NN","IN","CD","SYM","CD","NNS","RB","VB","CD","SYM","CD","NNS","IN","VBN","NN","."],"stanford_ner":["O","O","O","O","O","O","O","O","O","O","O","O","O","O","Action","O","O","O","O","O","Reagent","Reagent","O"],"stanford_head":[0,4,4,1,8,7,8,1,13,13,12,10,1,15,1,19,18,16,15,22,22,15,1],"stanford_deprel":["root","det","compound","obj","case","nummod","compound","obl","case","nummod","case","nmod","obl","advmod","conj","nummod","case","nmod","obl:tmod","case","amod","obl","punct"]},{"id":109627235236557200,"docid":"WLP-Dataset-master_updated/train/protocol_205","relation":"Acts-on","token":["Soak","the","PVDF","membrane","in","100","%","methanol","for","1","-","2","minutes","then","rinse","2","-","3","times","with","deionized","water","."],"subj_start":14,"subj_end":14,"obj_start":2,"obj_end":3,"subj_type":"Action","obj_type":"Reagent","stanford_pos":["VB","DT","NN","NN","IN",
"CD","NN","NN","IN","CD","SYM","CD","NNS","RB","VB","CD","SYM","CD","NNS","IN","VBN","NN","."],"stanford_ner":["O","O","Reagent","Reagent","O","O","O","O","O","O","O","O","O","O","Action","O","O","O","O","O","O","O","O"],"stanford_head":[0,4,4,1,8,7,8,1,13,13,12,10,1,15,1,19,18,16,15,22,22,15,1],"stanford_deprel":["root","det","compound","obj","case","nummod","compound","obl","case","nummod","case","nmod","obl","advmod","conj","nummod","case","nmod","obl:tmod","case","amod","obl","punct"]},{"id":109627235236557200,"docid":"WLP-Dataset-master_updated/train/protocol_205","relation":"Measure","token":["Soak","the","PVDF","membrane","in","100","%","methanol","for","1","-","2","minutes","then","rinse","2","-","3","times","with","deionized","water","."],"subj_start":7,"subj_end":7,"obj_start":5,"obj_end":6,"subj_type":"Reagent","obj_type":"Concentration","stanford_pos":["VB","DT","NN","NN","IN","CD","NN","NN","IN","CD","SYM","CD","NNS","RB","VB","CD","SYM","CD","NNS","IN","VBN","NN","."],"stanford_ner":["O","O","O","O","O","Concentration","Concentration","Reagent","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O"],"stanford_head":[0,4,4,1,8,7,8,1,13,13,12,10,1,15,1,19,18,16,15,22,22,15,1],"stanford_deprel":["root","det","compound","obj","case","nummod","compound","obl","case","nummod","case","nmod","obl","advmod","conj","nummod","case","nmod","obl:tmod","case","amod","obl","punct"]}]

Here is the error I got:
Finetune all embeddings.
Traceback (most recent call last):
  File "train.py", line 142, in <module>
    loss = trainer.update(batch)
  File "/home/hz071/gcn-over-pruned-trees/model/trainer.py", line 80, in update
    logits, pooling_output = self.model(inputs)
  File "/home/hz071/.conda/envs/AG-GCNN/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/hz071/gcn-over-pruned-trees/model/gcn.py", line 27, in forward
    outputs, pooling_output = self.gcn_model(inputs)
  File "/home/hz071/.conda/envs/AG-GCNN/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/hz071/gcn-over-pruned-trees/model/gcn.py", line 84, in forward
    adj = inputs_to_tree_reps(head.data, words.data, l, self.opt['prune_k'], subj_pos.data, obj_pos.data)
  File "/home/hz071/gcn-over-pruned-trees/model/gcn.py", line 78, in inputs_to_tree_reps
    trees = [head_to_tree(head[i], words[i], l[i], prune, subj_pos[i], obj_pos[i]) for i in range(len(l))]
  File "/home/hz071/gcn-over-pruned-trees/model/gcn.py", line 78, in <listcomp>
    trees = [head_to_tree(head[i], words[i], l[i], prune, subj_pos[i], obj_pos[i]) for i in range(len(l))]
  File "/home/hz071/gcn-over-pruned-trees/model/tree.py", line 81, in head_to_tree
    tmp += [h-1]
MemoryError

After that, the run stopped.
Can you give me any idea what went wrong? I used the stanza pipeline (with the spaCy tokenizer) for pre-processing.
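
A list that keeps growing at tmp += [h-1] until memory runs out would be consistent with a head chain that never reaches the root, for example because two tokens point at each other. That is only one possible cause, and the loop structure in tree.py is assumed here rather than quoted. The sketch below is a hypothetical pre-flight check (check_heads is not part of the repository) that walks every token's head chain and reports chains that loop or leave the sentence, assuming 1-based heads with 0 as the root.

```python
def check_heads(heads):
    """Return a list of problems found in a 1-based head list (0 = root)."""
    problems = []
    n = len(heads)
    roots = heads.count(0)
    if roots != 1:
        problems.append(f"expected exactly one root, found {roots}")
    for i in range(1, n + 1):
        seen = {i}
        h = heads[i - 1]
        while h != 0:                      # walk up until the root is reached
            if not 1 <= h <= n:
                problems.append(f"token {i}: head {h} is out of range")
                break
            if h in seen:
                problems.append(f"token {i}: head chain loops at {h}")
                break
            seen.add(h)
            h = heads[h - 1]
    return problems

# Head list copied from the sample records quoted above.
sample = [0, 4, 4, 1, 8, 7, 8, 1, 13, 13, 12, 10, 1, 15, 1,
          19, 18, 16, 15, 22, 22, 15, 1]
print(check_heads(sample))   # []
print(check_heads([2, 1]))   # reports a missing root and two looping chains
```

The quoted sample passes the check, so if a malformed tree is the cause, it would have to be in a record not shown here; running the check over every record in the training file would locate it.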

About dataset of SemEval 2010 task8

Hello, you mention in the paper that the model also performs very well on the SemEval dataset. Could you please share the code that tests this network on SemEval 2010 Task 8?

Question on SemEval2010_task8 dataset

May I ask for your hyperparameter settings on the SemEval2010_task8 dataset?
I can't reproduce the best performance reported in the paper on SemEval2010_task8.
Thank you so much!

Speedup training 10 times by one line of code

Hi Yuhao,
The code looks very well written, but it's kind of slow. I found that you are doing all tree manipulations on GPU. Just add

head, words, subj_pos, obj_pos = head.cpu().numpy(), words.cpu().numpy(), subj_pos.cpu().numpy(), obj_pos.cpu().numpy()

at line 77 in gcn.py, then the speed increases from 0.12 sec/batch to 0.016 sec/batch on my machine.

Implementation of self loop

Dear authors,
I have a question about the implementation of the self loop.
In tree.py, lines 173-175, adj[i, i] is set to 1:

if self_loop:
    for i in idx:
        ret[i, i] = 1

So I think each node's own information is already included by the bmm of adj and gcn_inputs, as in gcn.py, line 166:

Ax = adj.bmm(gcn_inputs)

Why, then, does gcn.py, line 168 still add a self loop?

AxW = AxW + self.W[l](gcn_inputs)
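
Whether the extra term double counts depends on whether the adjacency matrix passed to the layer was built with self_loop enabled, which the excerpts above do not show. The sketch below, using plain lists and scalar node features (the weight matrix is omitted because a linear map distributes over the sum), illustrates the two cases: adding x to A x is equivalent to using A + I, while adding x on top of (A + I) x counts each node twice. All names here are illustrative.

```python
def matvec(A, x):
    """Multiply an adjacency matrix (list of rows) by a feature vector."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

# Chain graph 0 - 1 - 2 without self loops.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
x = [1.0, 2.0, 3.0]

# Same graph with ones on the diagonal (self loops built into the matrix).
A_loop = [[A[i][j] + (1 if i == j else 0) for j in range(3)] for i in range(3)]

explicit = [ax + xi for ax, xi in zip(matvec(A, x), x)]      # A x + x
implicit = matvec(A_loop, x)                                  # (A + I) x
doubled = [ax + xi for ax, xi in zip(matvec(A_loop, x), x)]   # (A + I) x + x

print(explicit)  # [3.0, 6.0, 5.0]
print(implicit)  # [3.0, 6.0, 5.0]
print(doubled)   # [4.0, 8.0, 8.0]
```

So if the adjacency is constructed with self_loop=False, the explicit term at line 168 is the self loop rather than a second copy of it.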
