jd-ai-research-silicon-valley / sacn

End-to-end Structure-Aware Convolutional Networks for Knowledge Base Completion

License: MIT License

Python 99.69% Shell 0.31%

knowledge-graph-completion knowledge-graph pytorch graph-convolutional-networks sacn

sacn's Introduction

SACN

Paper: "End-to-end Structure-Aware Convolutional Networks for Knowledge Base Completion"

Published in the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19).

--- PyTorch Version ---

Overview

The end-to-end Structure-Aware Convolutional Network (SACN) model combines the benefits of GCN and ConvE for knowledge base completion. SACN consists of an encoder, a weighted graph convolutional network (WGCN), and a decoder, a convolutional network called Conv-TransE. WGCN utilizes knowledge graph node structure, node attributes and edge relation types. The decoder Conv-TransE enables the state-of-the-art ConvE to be translational between entities and relations while keeping the same link prediction performance as ConvE.
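
To make the "translational" idea concrete, here is a minimal sketch of the convolution step only, assuming a single 2 x K filter applied to the subject and relation embeddings stacked as two rows without reshaping. The function name and the zero-padding scheme are illustrative assumptions, not the repository's code, and the rest of the decoder (non-linearity, projection, inner product with the object embedding) is not shown.

```python
def conv_transe_feature_map(e_s, e_r, w_s, w_r):
    """Apply one 2 x K filter to the stacked (subject, relation) embeddings.

    w_s and w_r are the two rows of the filter (hypothetical names).
    Zero padding keeps the output the same length as the embeddings;
    an odd K gives a centred window.
    """
    assert len(e_s) == len(e_r) and len(w_s) == len(w_r)
    k = len(w_s)
    pad = k // 2
    # Zero-pad both rows so every output position sees a full window.
    padded_s = [0.0] * pad + list(e_s) + [0.0] * pad
    padded_r = [0.0] * pad + list(e_r) + [0.0] * pad
    out = []
    for n in range(len(e_s)):
        total = 0.0
        for t in range(k):
            # Each output entry is a weighted sum of aligned entries of e_s and e_r.
            total += w_s[t] * padded_s[n + t] + w_r[t] * padded_r[n + t]
        out.append(total)
    return out


# With K = 1 and both weights equal to 1, the filter output is exactly e_s + e_r.
translated = conv_transe_feature_map([1.0, 2.0, 3.0], [10.0, 20.0, 30.0], [1.0], [1.0])
```

Because the two rows stay aligned dimension by dimension, each feature map is a weighted sum of subject and relation entries, which is the sense in which the filter generalizes the TransE-style sum.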

Installation

This repo supports Linux and Python installation via Anaconda.

Install PyTorch 1.0 from the official website or via Anaconda.
Install the requirements: pip install -r requirements.txt
Download the default English model used by spaCy (spaCy itself is installed in the previous step): python -m spacy download en

Data Preprocessing

Run the preprocessing script for FB15k-237, WN18RR, FB15k-237-attr and kinship: sh preprocess.sh.

Run a model

To run a model, you first need to preprocess the data. This can be done by specifying the process parameter.

For the ConvTransE model, run:

CUDA_VISIBLE_DEVICES=0 python main.py model ConvTransE init_emb_size 100 dropout_rate 0.4 channels 50 lr 0.001 kernel_size 3 dataset FB15k-237 process True

For the SACN model, run:

CUDA_VISIBLE_DEVICES=0 python main.py model SACN dataset FB15k-237 process True

You can modify the hyper-parameters in "src.spodernet.spodernet.utils.global_config.py" or specify them on the command line. The parameters need to be tuned separately for each dataset.
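
The commands above pass hyper-parameters as alternating key and value tokens rather than as `--flags`. As a rough illustration of that convention, the sketch below turns such a token list into a dictionary of overrides. The helpers `parse_key_value_args` and `coerce` are hypothetical and written for this example only; they are not the parser used by the repository.

```python
def coerce(raw):
    """Convert a command-line token to bool, int or float when possible."""
    if raw in ("True", "False"):
        return raw == "True"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw  # leave everything else (e.g. dataset names) as a string


def parse_key_value_args(argv):
    """Turn ['model', 'SACN', 'dataset', 'FB15k-237'] into a dict of overrides."""
    if len(argv) % 2 != 0:
        raise ValueError("arguments must come in key/value pairs")
    return {key: coerce(raw) for key, raw in zip(argv[0::2], argv[1::2])}


overrides = parse_key_value_args(
    "model ConvTransE init_emb_size 100 dropout_rate 0.4 dataset FB15k-237 process True".split()
)
```

A dictionary like this could then be applied on top of the defaults, so that anything not given on the command line keeps its configured value.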

This is a test version. If you find any problems, please feel free to email me. We will keep updating the code.

Acknowledgements

Code is inspired by ConvE.

sacn's Issues

unable to reproduce results for SACN model

I can reproduce the FB15k-237 results with the default parameters you set, but I could not reproduce the WN18RR results. Can you tell me the parameter settings for WN18RR? Thanks!

questions about paper please

The degrees of nodes differ, so GCNs usually need normalization (using the degree matrix). I can't find this normalization in Equations 3, 4 and 5.
In Table 3, why does DistMult perform better than R-GCN on FB15K-237? This contradicts the result in the R-GCN paper.
There is an attribute triple (entity, relation, attribute) example: (s = Tom, r = people.person.gender, a = male). The paper says "Note that each type of attribute corresponds to a node." I understand this as taking the attributes "male" and "female" as two nodes. However, the paper also says "For instance, in our example, gender is represented by a single node rather than two nodes for “male” and “female”." I understand this as taking the relation "people.person.gender" as a node. I am confused.

bashmagic problem

Hello, we have a problem when installing the requirements: bashmagic fails to install. The library no longer seems to be available on GitHub. Is there any way to bypass this and make the code run?

Fix This Please

File: SACN/main.py, lines 209 and 210.
From:
ranking_and_hits(model, test_rank_batcher, vocab, 'test_evaluation')
ranking_and_hits(model, dev_rank_batcher, vocab, 'dev_evaluation')
change it to:
ranking_and_hits(model, test_rank_batcher, vocab, 'test_evaluation', X, adjacencies)
ranking_and_hits(model, dev_rank_batcher, vocab, 'dev_evaluation', X, adjacencies)

This line is also annoying:
line 135:
print("batch number:", i)
Change it to:
if i % 10 == 0: print("batch number:", i)

Also, I have no idea how to change the batch size and similar settings, because SACN can't fit into Google Colab (SACN needs 14 GB of RAM and Colab gives us 12.7 GB).

Are there some parameters not included?

When I run the model using
CUDA_VISIBLE_DEVICES=0 python main.py model SACN dataset FB15k-237 process True
I get an error:

Traceback (most recent call last):
  File "main.py", line 249, in <module>
    main()
  File "main.py", line 169, in main
    model = SACN(vocab['e1'].num_token, vocab['rel'].num_token)
  File "my_path/models.py", line 243, in __init__
    self.hidden_drop = torch.nn.Dropout(Config.dropout_rate)
AttributeError: type object 'Config' has no attribute 'dropout_rate'

I looked in the file src.spodernet.spodernet.utils.global_config.py, but there is no parameter named 'dropout_rate', and I can't find other parameters such as 'init_emb_size' and 'channels' either.

Am I missing something?

Hyperparameters for both Conv-TransE and SACN models

Can you please share the hyperparameters you used to reproduce the results mentioned in the paper?

error code of SACN model

Sorry to bother you. When I run your code for the SACN model, it raises the error below. I tried to solve it myself, but it still fails. I hope you can help me.
File "main.py", line 250, in
main()
File "main.py", line 229, in main
pred = model.forward(e1, rel, X.cuda(), adjacencies)
File "/home/hg-1/Gosline_data/GCN/SACN-master/models.py", line 269, in forward
x = self.gc1(emb_initial, A)
File "/home/hg-1/anaconda3/envs/liudipytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/hg-1/Gosline_data/GCN/SACN-master/models.py", line 167, in forward
A = torch.sparse_coo_tensor(adj[0], alp, torch.Size([adj[2], adj[2]]), requires_grad=True)
RuntimeError: Expected object of type torch.cuda.LongTensor but found type torch.LongTensor for argument #1 'indices'
a beginner of AI learning

A question about the WGCN in your paper

Hi,
thanks for your code, I have a question about the WGCN in your paper.

The WGCN implemented in your code seems different from the traditional GCN. Traditional GCN uses the Symmetric normalized Laplacian Matrix, but in your code, the matrix is simplified. I guess the reason might be that the weights in the adjacency matrix are trainable, so it doesn't matter whether we initialize them with the Laplacian Matrix?

I will be grateful if you could reply to this issue.
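
For readers following this question, the two constructions being compared can be sketched as follows. The first function is the standard symmetric normalization of an adjacency matrix with added self-loops. The second is only an illustration of an unnormalized adjacency in which each edge carries a per-relation weight, as the question describes. The function names, the fixed self-loop weight of 1.0 and the example weights are assumptions made for this sketch, not the repository's implementation.

```python
import math


def symmetric_normalize(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense adjacency given as nested lists."""
    n = len(adj)
    # Add self-loops before computing degrees.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[inv_sqrt[i] * a_hat[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]


def relation_weighted(edges, alpha, n):
    """Unnormalized adjacency: edge (i, j, rel) gets weight alpha[rel]; self-loops get 1.0."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j, rel in edges:
        a[i][j] = alpha[rel]
    return a


normalized = symmetric_normalize([[0.0, 1.0], [1.0, 0.0]])
weighted = relation_weighted([(0, 1, 0), (1, 0, 1)], alpha=[0.3, 0.7], n=2)
```

In the first case the entries are fixed by node degrees. In the second they depend only on the relation type, so any rescaling has to come from the weights themselves.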

Question about whether 'rel_reverse' is included in the training process

Hi, I like your paper, so I read your released code. Then the following confusion came to me. The adjacencies are fed into the training process to represent the graph structure, and the v in adjacencies is obtained solely from train_batcher['rel']. It seems that 'rel_reverse' is excluded from the training process. But in evaluation, (e2, rel_reverse, adjacencies) is fed into the model. I am confused: if the model doesn't see 'rel_reverse' in training, how can it handle 'rel_reverse' in evaluation? Looking forward to hearing from you soon. Best wishes.

Questions about Hits@10 left and right

Hi, could you tell me what Hits@10 left and Hits@10 right mean? Thanks.
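
As background for this question, Hits@k and MRR are computed from the 1-based rank of the correct entity among all candidates. The sketch below assumes that the "left" and "right" numbers are simply the same metric computed separately for the two query directions used in evaluation, one over (e1, rel) queries and one over (e2, rel_reverse) queries. That reading is an assumption here, not something the authors confirm in this thread, and the rank lists are made-up illustrative values.

```python
def hits_at_k(ranks, k):
    """Fraction of queries whose correct entity is ranked within the top k (ranks are 1-based)."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1 / rank over all queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical ranks for the two query directions.
ranks_left = [1, 3, 12, 50]
ranks_right = [2, 5, 4, 100]

hits10_left = hits_at_k(ranks_left, 10)
hits10_right = hits_at_k(ranks_right, 10)
# The overall figure pools both directions.
hits10_all = hits_at_k(ranks_left + ranks_right, 10)
```

Under this reading, the overall Hits@10 is the metric over the pooled ranks from both directions.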

Theoretical question - How is translation property for the embeddings maintained?

First of all, thanks for sharing your great work!

I am reading through your paper, and I am finding it difficult to understand how the translation property for the embeddings is maintained. I do see that when you remove the reshape operation, every 2 x K convolutional filter becomes a dimension-wise weighted sum of subject and relationship embeddings for each fact triple.

However, since this is followed by A) vectorizing many channels, B) a non-linearity after a matrix multiplication with weight W, and C) inner product with object embeddings, it seems that the final embeddings are no longer translational.

How can I use your architecture but derive embeddings that are translational (i.e. head + rel ~ tail) ? One of my use-cases is highly dependent on the translational property.

Thanks in advance!
Kiran

Source code request

Hi, I am very interested in your research! May I ask when the source code will be public?

Why is the adjacency matrix constructed from the training set different for each run?

In main.py, lines 134-148,

    for i, str2var in enumerate(train_batcher):
        print("batch number:", i)
        for j in range(str2var['e1'].shape[0]):
            for k in range(str2var['e2_multi1'][j].shape[0]):
                if str2var['e2_multi1'][j][k] != 0:
                    a = str2var['rel'][j].cpu()
                    data.append(str2var['rel'][j].cpu())
                    rows.append(str2var['e1'][j].cpu().tolist()[0])
                    columns.append(str2var['e2_multi1'][j][k].cpu())
                else:
                    break

    rows = rows  + [i for i in range(num_entities)]
    columns = columns + [i for i in range(num_entities)]
    data = data + [num_relations for i in range(num_entities)]

The shape of rows and columns changes when we run the process twice. Does this mean that the training set is mutable?
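
One way to tell whether two runs differ only in edge order or in the edges themselves is to build the same three lists from a plain list of triples and compare them as sets. The sketch below mirrors the structure of the quoted snippet (one entry per training edge, then one self-loop per entity labelled with the extra relation id `num_relations`), but it is a simplified stand-in: `build_coo` and the toy triples are hypothetical and do not use the repository's batcher.

```python
def build_coo(triples, num_entities, num_relations):
    """Build COO-style (rows, columns, data) lists from (head, rel, tail) triples."""
    rows, columns, data = [], [], []
    for head, rel, tail in triples:
        rows.append(head)
        columns.append(tail)
        data.append(rel)
    # Self-loops for every entity, labelled with an extra relation id.
    rows += list(range(num_entities))
    columns += list(range(num_entities))
    data += [num_relations] * num_entities
    return rows, columns, data


def edge_set(coo):
    """Order-independent view of the edges, for comparing two runs."""
    return set(zip(*coo))


triples = [(0, 0, 1), (1, 1, 2)]
run_a = build_coo(triples, num_entities=3, num_relations=2)
run_b = build_coo(list(reversed(triples)), num_entities=3, num_relations=2)
same_edges = edge_set(run_a) == edge_set(run_b)
```

If the edge sets match and only the list order differs, the cause is likely shuffling. If the lengths differ, the edges collected in each run are not the same.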

Code much slower than statement in paper

The paper states that "On NVIDIA Tesla P40 GPU, for FB15k-237, computation time for SACN for each epoch is about 1 minute", but with the default settings of your code on a Tesla P100 GPU, one epoch takes about 20 minutes. Did I set something wrong? How many epochs does it take to converge?

Is there any result about using GCN as encoder?

I think the result is necessary to prove that SACN not only works, but also works better than the un-structure-aware GCN.

Unable to reproduce results for SACN model

Hi,
Using the default hyperparameters, I am unable to reproduce MRR score .33 for ConvTransE model from the given code. Currently, I am getting around .295 on the test set. Can you share the hyperparameters you used for training ConvTransE?

Thanks in advance

How to combine SACN and ConvTransE

Hi, it's great work! But while reading the source code, I couldn't find how the output of SACN is fed into ConvTransE. I'd like to ask how you combined these two models to create an end-to-end model. Thanks!

questions about paper

Hi, I am very interested in your paper, especially the part about the node indegree study. I have checked the FB15k-237 dataset, but I found that the indegree of almost all nodes is less than 100. Could you please tell me how to get the right test dataset to study node indegree? Thanks a lot.

Why can't I find the path file for the dataset

I could not create the data set path file requested in the code.

How to compute the results in a specific degree scope?

Hi, it's great work! I have two questions: 1. What is the role of the attribute node? 2. How do you calculate the results for different indegree scopes, given that a triple usually contains entities whose degrees differ? For example, are the results for [0, 100] computed using all triples containing an entity whose degree is smaller than 100? Thanks!
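
One possible way to set up such an analysis is sketched below. It assumes that indegree is counted over training triples and that each test triple is assigned to a bucket by the indegree of its tail entity. Both choices are assumptions made for illustration, since the exact protocol used in the paper is not stated in this thread, and the helper names and toy triples are hypothetical.

```python
from collections import Counter


def indegree(train_triples):
    """Count incoming edges per entity from (head, rel, tail) training triples."""
    return Counter(tail for _, _, tail in train_triples)


def bucket_by_indegree(test_triples, indeg, bounds):
    """Group test triples by the indegree of their tail entity.

    bounds is a list of (low, high) ranges, inclusive of low and exclusive of high.
    Entities never seen as a tail in training count as indegree 0.
    """
    buckets = {b: [] for b in bounds}
    for triple in test_triples:
        d = indeg.get(triple[2], 0)
        for low, high in bounds:
            if low <= d < high:
                buckets[(low, high)].append(triple)
                break
    return buckets


train = [(0, 0, 2), (1, 0, 2), (3, 1, 4)]
test = [(5, 0, 2), (6, 1, 4), (7, 0, 9)]
buckets = bucket_by_indegree(test, indegree(train), bounds=[(0, 2), (2, 100)])
```

Metrics such as Hits@10 can then be computed per bucket over the triples it contains.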