
snap-stanford / kgreasoning

Multi-Hop Logical Reasoning in Knowledge Graphs

License: MIT

Languages: Python 96.95%, Shell 3.05%
Topics: knowledge-graph, knowledge-base, embedding, reasoning

kgreasoning's Introduction

KGReasoning

This repo contains several algorithms for multi-hop reasoning on knowledge graphs, including the official PyTorch implementation of Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge Graphs.

Models

The repo implements three query-embedding models: GQE (Embedding Logical Queries on Knowledge Graphs), Query2box (Query2box: Reasoning over Knowledge Graphs in Vector Space using Box Embeddings), and BetaE (Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge Graphs).

KG Data

The KG data (FB15k, FB15k-237, NELL995) used in the BetaE paper and the Query2box paper can be downloaded here. Note that the two papers use the same training queries; the difference is that the valid/test queries in the BetaE paper are capped at a maximum number of answers, which makes them more realistic.

Each folder in the data directory represents a KG and contains the following files (a loading sketch follows the list).

  • train.txt/valid.txt/test.txt: KG edges
  • id2rel/rel2id/ent2id/id2ent.pkl: dicts mapping entities and relations to integer ids and back
  • train-queries/valid-queries/test-queries.pkl: defaultdict(set), each key represents a query structure, and the value represents the instantiated queries
  • train-answers.pkl: defaultdict(set), each key represents a query, and the value represents the answers obtained in the training graph (edges in train.txt)
  • valid-easy-answers/test-easy-answers.pkl: defaultdict(set), each key represents a query, and the value represents the answers obtained in the training graph (edges in train.txt) / valid graph (edges in train.txt+valid.txt)
  • valid-hard-answers/test-hard-answers.pkl: defaultdict(set), each key represents a query, and the value represents the additional answers obtained in the validation graph (edges in train.txt+valid.txt) / test graph (edges in train.txt+valid.txt+test.txt)
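
A minimal loading sketch, assuming the file layout described above (the data_path value is a hypothetical local path):

```python
import pickle

data_path = "data/FB15k-237-betae"  # hypothetical local path

def load_pkl(name):
    with open(f"{data_path}/{name}", "rb") as f:
        return pickle.load(f)

ent2id = load_pkl("ent2id.pkl")                # entity name -> integer id
test_queries = load_pkl("test-queries.pkl")    # query structure -> set of queries
test_easy = load_pkl("test-easy-answers.pkl")  # query -> answers reachable on train+valid edges
test_hard = load_pkl("test-hard-answers.pkl")  # query -> additional answers requiring test edges

# Per the description above, easy and hard answer sets are disjoint:
# a hard answer needs at least one test edge to be reached.
```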

We represent each query structure as a nested tuple, in case we run out of names :) (credits to @michiyasunaga). For example, 1p queries: (e, (r,)); 2i queries: ((e, (r,)), (e, (r,))). Check the code for more details, and see the sketch below.
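
A small sketch of this tuple convention, using the two structures above (the slot names follow the examples here; the concrete ids are hypothetical):

```python
# Structure templates: 'e' marks an anchor-entity slot, 'r' a relation slot.
ONE_P = ('e', ('r',))                    # 1p: one hop from one anchor entity
TWO_I = (('e', ('r',)), ('e', ('r',)))   # 2i: intersection of two 1p branches

# Instantiated queries fill the slots with integer ids, e.g. a concrete
# 1p query could look like (4123, (17,)) -- hypothetical ids.
query_1p = (4123, (17,))
anchor, (relation,) = query_1p
print(f"anchor entity id: {anchor}, relation id: {relation}")
```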

Examples

Please refer to examples.sh for the training scripts of all 3 models on all 3 datasets; a representative command is sketched below.
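
For reference, a BetaE run in examples.sh looks roughly like the following. This is an illustrative sketch: the flags are believed to match the repo's main.py, but the exact hyperparameters differ per model and dataset, so consult examples.sh for the authoritative values.

```shell
CUDA_VISIBLE_DEVICES=0 python main.py --cuda --do_train --do_valid --do_test \
  --data_path data/FB15k-betae -n 128 -b 512 -d 400 -g 60 \
  -lr 0.0001 --max_steps 450001 --cpu_num 1 --geo beta --valid_steps 15000 \
  -betam "(1600,2)"
```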

Citations

If you use this repo, please cite the following paper.

@inproceedings{ren2020beta,
  title={Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge Graphs},
  author={Hongyu Ren and Jure Leskovec},
  booktitle={Advances in Neural Information Processing Systems},
  year={2020}
}

kgreasoning's People

Contributors

hyren, roks

kgreasoning's Issues

create_queries.py

Thanks for sharing your code!
How do I run the create_queries.py file, and how do I set its parameters?
Also, in the index_dataset function, only train_indexified.txt was generated. Is the program only supposed to generate the train_indexified.txt file?

How are the easy and hard answers generated?

Hi, as I'm generating queries for a new dataset using create_queries.py, I am wondering how exactly the easy and hard answers are generated. Running create_queries.py produces three answer files (fp-answers.pkl, fn-answers.pkl, and tp-answers.pkl), but how do I build the easy and hard answers on top of these three?

data

Excuse me, how is the query data built? For example, the file train-queries.pkl.

KL Divergence

Why not use Jensen-Shannon divergence instead of KL divergence?

Natural Queries Extraction

Thanks for sharing the repo.
I'm trying to find or extract the natural-language queries corresponding to the entity- and relation-encoded queries. Could you share whether they are available or can be extracted? Any reference would be helpful.

Code for generating queries

Hi, thanks for sharing the code for this nice work!

Could you please release the code for generating queries? I want to generate queries with other structures but don't know how to do it.

"&" operator instead of "weighted product of the PDFs"

Hi! Thanks for sharing the source code.
In the paper, you use a "weighted product of the PDFs" to implement the intersection operator. Why not use the "&" operator instead?
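
For context, a simplified sketch of the weighted-product intersection (hedged: in the repo the attention weights come from an MLP over the input embeddings; here they are passed in directly). A weighted product of Beta PDFs with weights summing to 1 is again a Beta PDF whose parameters are the weighted sums of the inputs' parameters:

```python
import torch

def beta_intersection(alphas: torch.Tensor, betas: torch.Tensor,
                      logits: torch.Tensor):
    """Intersect a stack of Beta embeddings.

    alphas, betas: (num_inputs, dim) Beta parameters of the input sets.
    logits: (num_inputs, dim) unnormalized attention scores (hypothetical
    inputs; the repo derives these from a learned MLP).
    """
    w = torch.softmax(logits, dim=0)  # weights sum to 1 over the inputs
    # prod_i Beta(x; a_i, b_i)^{w_i} is proportional to
    # Beta(x; sum_i w_i a_i, sum_i w_i b_i) when sum_i w_i = 1.
    return (w * alphas).sum(dim=0), (w * betas).sum(dim=0)
```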

CUDA error: device-side assert triggered

Hello, I am currently trying to use your code to generate queries and answers on my own dataset. I succeeded in generating the query and answer .pkl files, but when I run training, it yields an error like this:

C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/Indexing.cu:658: block: [4,0,0], thread: [31,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm 2020.3.2\plugins\python\helpers\pydev\pydevd.py", line 1477, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm 2020.3.2\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/ERCLab/Desktop/RESEARCH/Natural language processing/KGReasoning/main.py", line 449, in <module>
    main(parse_args())
  File "C:/Users/ERCLab/Desktop/RESEARCH/Natural language processing/KGReasoning/main.py", line 385, in main
    log = model.train_step(model, optimizer, train_path_iterator, args, step)
  File "C:\Users\ERCLab\Desktop\RESEARCH\Natural language processing\KGReasoning\models.py", line 591, in train_step
    positive_logit, negative_logit, subsampling_weight, _ = model(positive_sample, negative_sample, subsampling_weight, batch_queries_dict, batch_idxs_dict)
  File "C:\Users\ERCLab\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\ERCLab\Desktop\RESEARCH\Natural language processing\KGReasoning\models.py", line 196, in forward
    return self.forward_box(positive_sample, negative_sample, subsampling_weight, batch_queries_dict, batch_idxs_dict)
  File "C:\Users\ERCLab\Desktop\RESEARCH\Natural language processing\KGReasoning\models.py", line 434, in forward_box
    center_embedding, offset_embedding, _ = self.embed_query_box(batch_queries_dict[query_structure], 
  File "C:\Users\ERCLab\Desktop\RESEARCH\Natural language processing\KGReasoning\models.py", line 216, in embed_query_box
    offset_embedding = torch.zeros_like(embedding).cuda()
RuntimeError: CUDA error: device-side assert triggered

I already tried the FB15k dataset and it didn't produce this error, so I cannot figure out what is going wrong.
Could you give me a clue for solving this?
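
For what it's worth, this particular assert (srcIndex < srcSelectDimSize) generally means an embedding lookup received an index outside the table, so one plausible (unconfirmed) cause is an entity or relation id in the generated queries exceeding the entity/relation counts passed to the model. A quick sanity check along these lines (all names hypothetical):

```python
def check_ids(ent2id, rel2id, nentity, nrelation):
    # ent2id/rel2id are the dataset's mapping dicts; nentity/nrelation are
    # the counts the model builds its embedding tables from.
    assert max(ent2id.values()) < nentity, "entity id out of range"
    assert max(rel2id.values()) < nrelation, "relation id out of range"
```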

The link of KG data

Thanks for sharing your code!
The link to the KG data seems to be missing; can you update it?
Thanks a lot!

What's the difference between 'easy' and 'hard' answers?

Hi,

Thanks for releasing this excellent work! The code is very clean and easy to use. One thing that is unclear to me is the dataset: there are easy answers and hard answers for the valid and test queries, and they seem to have no intersection. What is the principle for distinguishing their difficulty, and how do you use both in evaluation (MRR and Hits@1/3/10)?
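
For readers with the same question: the data description in the README above implies the split (easy answers are reachable without test edges; hard answers require them). A common evaluation protocol, sketched here under that assumption rather than as the repo's exact code, ranks only the hard answers while filtering out all other known answers:

```python
def filtered_rank(scores, easy_answers, hard_answers, answer):
    """Rank one hard `answer` against all entities, ignoring other known answers.

    scores: dict of entity id -> model score, higher is better (hypothetical
    interface; the repo scores whole batches of entities at once).
    """
    known = (easy_answers | hard_answers) - {answer}
    better = sum(1 for e, s in scores.items()
                 if e not in known and s > scores[answer])
    return better + 1  # rank 1 is best; feed ranks into MRR / Hits@k
```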
