
Python library for Representation Learning on Knowledge Graphs https://docs.ampligraph.org

License: Apache License 2.0

machine-learning knowledge-graph relational-learning representation-learning graph-representation-learning graph-embeddings knowledge-graph-embeddings

ampligraph's Introduction

AmpliGraph

DOI · Documentation Status · CircleCI · Join the conversation on Slack

Open source library based on TensorFlow that predicts links between concepts in a knowledge graph.

AmpliGraph is a suite of neural machine learning models for relational learning, a branch of machine learning that deals with supervised learning on knowledge graphs.

Use AmpliGraph if you need to:

  • Discover new knowledge from an existing knowledge graph.
  • Complete large knowledge graphs with missing statements.
  • Generate stand-alone knowledge graph embeddings.
  • Develop and evaluate a new relational model.

AmpliGraph's machine learning models generate knowledge graph embeddings, vector representations of concepts in a metric space.

These embeddings are then combined with model-specific scoring functions to predict unseen and novel links.

AmpliGraph 2.0.0 is now available!

The new version features a TensorFlow 2 back-end and Keras-style APIs that make the library faster, easier to use, and easier to extend with new features. Further, the data input/output pipeline has changed, and support for some obsolete models was discontinued.
See the Changelog for a more thorough list of changes.

Key Features

  • Intuitive APIs: AmpliGraph APIs are designed to reduce the amount of code required to learn models that predict links in knowledge graphs. The AmpliGraph 2 APIs follow the Keras style, making the user experience even smoother.
  • GPU-Ready: AmpliGraph 2 is based on TensorFlow 2 and is designed to run seamlessly on CPU and GPU devices to speed up training.
  • Extensible: Roll your own knowledge graph embeddings model by extending AmpliGraph base estimators.

Modules

AmpliGraph includes the following submodules:

  • Datasets: helper functions to load datasets (knowledge graphs).
  • Models: knowledge graph embedding models. AmpliGraph 2 contains TransE, DistMult, ComplEx, HolE and RotatE (more to come!).
  • Evaluation: metrics and evaluation protocols to assess the predictive power of the models.
  • Discovery: High-level convenience APIs for knowledge discovery (discover new facts, cluster entities, predict near duplicates).
  • Compat: submodule that extends the compatibility of AmpliGraph 2 APIs to those of AmpliGraph 1.x, for users already familiar with the latter.
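
As an illustration of how these submodules fit together, here is a minimal end-to-end sketch. It assumes the AmpliGraph 2 Keras-style API (ScoringBasedEmbeddingModel) and the FB15k-237 loader; the hyperparameters are placeholders, not tuned values.

from ampligraph.datasets import load_fb15k_237
from ampligraph.latent_features import ScoringBasedEmbeddingModel
from ampligraph.evaluation import mrr_score, hits_at_n_score

# Load a benchmark knowledge graph as (subject, predicate, object) triples.
X = load_fb15k_237()

# ComplEx embeddings trained with a Keras-style compile/fit workflow.
model = ScoringBasedEmbeddingModel(k=150, eta=10, scoring_type='ComplEx')
model.compile(optimizer='adam', loss='multiclass_nll')
model.fit(X['train'], batch_size=10000, epochs=20)

# Rank test triples against corruptions and report filtered metrics.
ranks = model.evaluate(X['test'],
                       use_filter={'train': X['train'],
                                   'valid': X['valid'],
                                   'test': X['test']})
print('MRR: %.2f, Hits@10: %.2f' % (mrr_score(ranks), hits_at_n_score(ranks, n=10)))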

Installation

Prerequisites

  • Linux, macOS, Windows
  • Python ≥ 3.8

Provision a Virtual Environment

To provision a virtual environment for installing AmpliGraph, any option will work; here we provide instructions for venv and Conda.

venv

The first step is to create and activate the virtual environment.

python3.8 -m venv PATH/TO/NEW/VIRTUAL_ENVIRONMENT
source PATH/TO/NEW/VIRTUAL_ENVIRONMENT/bin/activate

Once this is done, we can proceed with the installation of TensorFlow 2:

pip install "tensorflow==2.9.0"

If you are installing TensorFlow on macOS, use the following instead:

pip install "tensorflow-macos==2.9.0"

IMPORTANT: installing TensorFlow on macOS with an Apple silicon chip can be tricky. Although venv can provide a smooth experience, if issues persist we invite you to refer to the dedicated section below and consider using conda, in line with the TensorFlow Plugin page on the Apple developer site.

Conda

The first step is to create and activate the virtual environment.

conda create --name ampligraph python=3.8
source activate ampligraph

Once this is done, we can proceed with the installation of TensorFlow 2, which can be done through pip or conda.

pip install "tensorflow==2.9.0"

or 

conda install "tensorflow==2.9.0"

Install TensorFlow 2 for Mac OS M1 chip

When installing TensorFlow 2 on macOS with an Apple silicon chip, we recommend using a conda environment.

conda create --name ampligraph python=3.8
source activate ampligraph

After having created and activated the virtual environment, run the following to install Tensorflow.

conda install -c apple tensorflow-deps
pip install --user tensorflow-macos==2.9.0
pip install --user tensorflow-metal==0.6

In case of problems with the installation, or for further details, refer to the TensorFlow Plugin page on the official Apple developer website.

Install AmpliGraph

Once the installation of Tensorflow is complete, we can proceed with the installation of AmpliGraph.

To install the latest stable release from pip:

pip install ampligraph

To sanity check the installation, run the following:

>>> import ampligraph
>>> ampligraph.__version__
'2.1.0'

If instead you want the most recent development version, clone the repository from GitHub, check out the develop branch, and install AmpliGraph from source. In this way, your local working copy will be on the latest commit of the develop branch.

git clone https://github.com/Accenture/AmpliGraph.git
cd AmpliGraph
git checkout develop
pip install -e .

Notice that the code snippet above installs the library in editable mode (-e).

To sanity check the installation run the following:

>>> import ampligraph
>>> ampligraph.__version__
'2.1-dev'

Predictive Power Evaluation (MRR Filtered)

AmpliGraph includes implementations of TransE, DistMult, ComplEx, HolE and RotatE. Versions <2.0 also include ConvE and ConvKB. Their predictive power is reported below and compared against the state-of-the-art results in the literature. More details available here.

                                FB15K-237   WN18RR   YAGO3-10   FB15k    WN18
Literature Best                 0.35*       0.48*    0.49*      0.84**   0.95*
TransE                          0.31        0.22     0.50       0.62     0.66
DistMult                        0.30        0.47     0.48       0.71     0.82
ComplEx                         0.31        0.51     0.49       0.73     0.94
HolE                            0.30        0.47     0.47       0.73     0.94
RotatE                          0.31        0.51     0.43       0.70     0.95
ConvE (AmpliGraph v1.4)         0.26        0.45     0.30       0.50     0.93
ConvE (1-N, AmpliGraph v1.4)    0.32        0.48     0.40       0.80     0.95
ConvKB (AmpliGraph v1.4)        0.23        0.39     0.30       0.65     0.80
* Timothee Lacroix, Nicolas Usunier, and Guillaume Obozinski. Canonical tensor decomposition for knowledge base completion. In International Conference on Machine Learning, 2869–2878. 2018.
** Kadlec, Rudolf, Ondrej Bajgar, and Jan Kleindienst. "Knowledge base completion: Baselines strike back." arXiv preprint arXiv:1705.10744 (2017).
Results above are computed assigning the worst rank to a positive in case of ties. Although this is the most conservative approach, some published literature may adopt an evaluation protocol that assigns the best rank instead.

Documentation

Documentation available here

The project documentation can be built from your local working copy with:

cd docs
make clean autogen html

How to contribute

See guidelines from AmpliGraph documentation.

How to Cite

If you like AmpliGraph and you use it in your project, why not star the project on GitHub!

If you instead use AmpliGraph in an academic publication, cite as:

@misc{ampligraph,
 author= {Luca Costabello and
          Alberto Bernardi and
          Adrianna Janik and
          Aldan Creo and
          Sumit Pai and
          Chan Le Van and
          Rory McGrath and
          Nicholas McCarthy and
          Pedro Tabacof},
 title = {{AmpliGraph: a Library for Representation Learning on Knowledge Graphs}},
 month = mar,
 year  = 2019,
 doi   = {10.5281/zenodo.2595043},
 url   = {https://doi.org/10.5281/zenodo.2595043}
}

License

AmpliGraph is licensed under the Apache 2.0 License.

ampligraph's People

Contributors

acmcmc, adrijanik, albernar, cclauss, chanlevan, dogatekin, iamaziz, idigitopia, lukostaz, nicholasmccarthy, pyvandenbussche, sumitpai, tabacof

ampligraph's Issues

Replace generic exceptions with specific exceptions and helpful messages.

In many places in the code, a specific exception is caught and then a generic Exception is raised with a generic error message.

except KeyError:
    raise Exception('Some of the hyperparams for regularizer were not passed.')

Instead, we should raise a KeyError and list the specific keys that were not passed.
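
A minimal sketch of the requested pattern follows; the expected key names and the hyperparam_dict argument are illustrative, not the library's actual signature.

def get_regularizer_hyperparams(hyperparam_dict):
    # Hypothetical helper: fail loudly and name exactly which keys are missing.
    expected = {'lambda', 'p'}  # illustrative key names
    missing = expected - set(hyperparam_dict)
    if missing:
        raise KeyError('Missing regularizer hyperparameters: {}'.format(sorted(missing)))
    return {key: hyperparam_dict[key] for key in expected}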

Implement code to download datasets.

Background and Context
The current examples require publicly available datasets in order to run. Obtaining these datasets manually may be tedious. This process should be automated.

Description
Create a database class in the module that will download and store the required datasets.
Datasets have been saved to google drive: datasets

Add MD5 checksum for datasets

Description
Each dataset loader should have a check_MD5 argument (set to False by default) that performs an MD5 checksum of the downloaded dataset.
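
A minimal sketch of the kind of check a check_MD5 flag could perform (the function name and chunk size are illustrative):

import hashlib

def md5_checksum(file_path, chunk_size=1024 * 1024):
    # Compute the MD5 digest of a downloaded file, reading it in chunks.
    md5 = hashlib.md5()
    with open(file_path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            md5.update(chunk)
    return md5.hexdigest()

The loader would then compare the digest against a known reference value and raise an error on mismatch.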

Automate datasets download

Background and Context
Downloading and uncompressing each dataset manually contributes to user friction during installation.

Description
Add a script to automatically download and decompress datasets in the desired folder (which must match AMPLIGRAPH_DATA_HOME).
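
A sketch of what such a script could do, assuming one zip archive per dataset and the AMPLIGRAPH_DATA_HOME environment variable (the URL handling and fallback folder are illustrative):

import os
import urllib.request
import zipfile

def download_dataset(url, name):
    # Download a dataset archive and decompress it into AMPLIGRAPH_DATA_HOME.
    data_home = os.environ.get('AMPLIGRAPH_DATA_HOME',
                               os.path.join(os.path.expanduser('~'), 'ampligraph_datasets'))
    os.makedirs(data_home, exist_ok=True)
    archive_path = os.path.join(data_home, name + '.zip')
    urllib.request.urlretrieve(url, archive_path)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(data_home)
    return os.path.join(data_home, name)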

Improve datasets loaders documentation

Description
We must add in docstrings:

  • stats on how many triples are in each split, and how many distinct entities and relations
  • warning boxes for datasets with missing entities (FB15k-237, WN18RR)
  • references to where the dataset was first proposed, and from where we downloaded it.

A summary table in the "datasets" section would also help. Something like this:
[example summary table screenshot]

Add single experiment reproduce results

Description
We need to publish a single script that reproduces our best results shown in #17.
e.g.:
$ ./predictive_performance.py -i fb15k_237 -m complex

The script may take up to two arguments from the command line:

  • -m: the model (complex, transe, distmult)
  • -i: the dataset (fb15k, wn18, etc)

If no arguments are passed, all experiments are carried out.

Best hyperparams are hardcoded.

Verbosity should be strictly limited to:

  • output the best hyperparams,
  • overall progress bar on the outermost loop of experiments
  • final results, output as a table (you can use beautifultable)

This script will replace what's in the experiments folder.

Note everything should be kept as simple and user-friendly as possible.
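
A skeleton of what such a script could look like; the model registry, dataset names and the body of run_experiment are placeholders, not the published best configurations.

#!/usr/bin/env python
import argparse

MODELS = ['complex', 'transe', 'distmult']
DATASETS = ['fb15k', 'fb15k_237', 'wn18', 'wn18rr', 'yago3_10']

def run_experiment(dataset, model):
    # Placeholder: train with the hardcoded best hyperparams and report filtered MRR / Hits@N.
    print('Running {} on {}...'.format(model, dataset))

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Reproduce predictive performance experiments.')
    parser.add_argument('-m', choices=MODELS, help='model to train')
    parser.add_argument('-i', choices=DATASETS, help='dataset to use')
    args = parser.parse_args()
    # If no arguments are passed, all experiments are carried out.
    for dataset in ([args.i] if args.i else DATASETS):
        for model in ([args.m] if args.m else MODELS):
            run_experiment(dataset, model)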

Handling datasets with unseen entities

Description

Validation and test sets can contain entities that are unseen at training time. This causes the library to crash.

Actual Behavior

The library crashes when evaluating performance on validation/test sets that contain unseen entities.

Expected Behavior

The library should at least show a friendly error message. Ideally, there should be a configurable strict mode in the performance_evaluation method that lets the user choose whether execution keeps running or stops.

Steps to Reproduce

  1. Use load_wn18rr or load_fb15k_237 to load the data.
  2. Fit the model on the train+valid dataset.
  3. Predict on the test dataset. The library crashes at this step.
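
A sketch of the kind of guard the evaluation routine could apply; the helper name and the strict flag are illustrative, not the library's actual API.

import numpy as np

def filter_unseen_entities(X_test, train_entities, strict=True):
    # Keep only test triples whose subject and object were seen during training.
    seen = np.isin(X_test[:, 0], train_entities) & np.isin(X_test[:, 2], train_entities)
    if strict and not seen.all():
        raise ValueError('{} test triples contain entities unseen at training time.'
                         .format(int((~seen).sum())))
    return X_test[seen]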

Implement multi-class log-loss

Description

Implement multi-class log-loss as presented by Lacroix2018.

In section 6.2 (see screenshot below) they claim multi-class log-loss can be responsible for better results compared to our current binomial nll loss:

[screenshots from Lacroix et al. 2018, Section 6.2]

This is similar to Kadlec2017, where they use a sampled multi-class log loss that seems to perform better than ComplEx's current binomial nll loss (see also #22).

From Kadlec2017:
[screenshot from Kadlec et al. 2017]

And indeed Kadlec2017 uses the loss defined by Toutanova2015 (Section 3.3):

[loss definition screenshots from Toutanova and Chen 2015]
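
For reference, the multi-class log-loss treats each positive triple as a softmax classification over all candidate entities on the corrupted side. A hedged TensorFlow sketch (not the implementation that was eventually merged):

import tensorflow as tf

def multiclass_nll(scores_pos, scores_all_candidates):
    # scores_pos: [batch] scores of the true triples.
    # scores_all_candidates: [batch, n_entities] scores with one side replaced
    # by every candidate entity (the true entity included).
    # Loss per triple: -f(s, p, o) + log(sum_o' exp(f(s, p, o'))).
    return tf.reduce_mean(tf.reduce_logsumexp(scores_all_candidates, axis=1) - scores_pos)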

Add examples to savemodel, restoreModel

Background and Context
All functions' and classes' documentation must include working code examples.

Description
Add working examples of savemodel, restoreModel functions in their docstrings.
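
The kind of docstring example being requested might look like the sketch below. It assumes the ampligraph.utils save_model/restore_model helpers, an already fitted model, and a test array X_test; the file name is arbitrary.

from ampligraph.utils import save_model, restore_model

# Persist a trained model to disk, then load it back for inference.
save_model(model, model_name_path='complex_fb15k_237.pkl')
restored_model = restore_model(model_name_path='complex_fb15k_237.pkl')
y_pred = restored_model.predict(X_test)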

Assess docstring examples quality

Background and Context
AmpliGraph documentation includes examples for each public API.
Some examples may not be up to date with the latest code changes, and some new APIs may miss an example.

Description

  • Make sure all public APIs in the HTML documentation have an example
  • check if each example runs properly. Fix any broken code.
  • make sure HTML rendering of the example is correct
  • Also check the "Examples" section in examples.md.

Revamp documentation

Background and Context
AmpliGraph documentation is currently insufficient.

Description
Improve the API documentation project-wide.

Find a location to host the example datasets.

Background and Context
The datasets will have to be hosted somewhere the automated downloader can access them. Google Drive and Dropbox were investigated, but both require additional dependencies, which cannot be added at this time due to legal constraints around open-sourcing the project.

Description
To allow modules to download datasets automatically, the datasets need to be hosted in a location that does not require additional project dependencies.
Downloading the files from Google Drive, for example, would require additional dependencies:

pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib

from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request

Dropbox has similar dependencies.

pip install dropbox

import dropbox

A hosting location will need to be identified and the datasets will need to be moved there.

Optimise code

Replace numpy objects with tensorflow objects where applicable.

Use self-explanatory, meaningful argument names for loss functions, regularizers and models

Background and Context
The signatures of all functions and classes of the library should be easily understandable. Argument names should be intuitive, and default values should lead to viable results, even without user-defined preferences.

Description
Currently, classes in loss_functions.py and regularizers.py such as PairwiseLoss take a hyperparam_dict argument in their __init__() method. This argument should be replaced with Python's kwargs approach and unpacked into meaningful argument names (e.g. margin, gamma).
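
A hedged before/after sketch of the requested change (class and argument names are illustrative, not the final signatures):

# Before: hyperparameters hidden inside an opaque dict.
class PairwiseLoss:
    def __init__(self, eta, hyperparam_dict):
        self.margin = hyperparam_dict['margin']

# After: explicit, self-documenting keyword arguments with viable defaults.
class PairwiseLoss:
    def __init__(self, eta, margin=1.0):
        self.margin = margin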

Implement alternative ranking evaluation protocols

Background

Literature adopts different interpretations of the ranking evaluation protocol. We must implement alternative protocols, to be enabled with a flag in the evaluation function, to get fair comparisons of results.

Description

Protocols to implement:

  • Trouillon16 (ComplEx paper): ranks each positive against corruptions of head and tail together. (currently implemented version)
  • ConvE, RotatE: corrupt head and tail separately

Implement the evaluation strategy described by RotatE and ConvE.
This may also require revisiting the metric implementations (MRR, MR, Hits@N).

From the ConvE paper:

[evaluation protocol excerpt from the ConvE paper]

See also RotatE reference implementation for clues.
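
A hedged sketch of the head/tail-separate protocol (ConvE/RotatE style). It assumes a model.predict-like call that scores an array of triples; bookkeeping such as filtering known positives is omitted.

import numpy as np

def rank_sides_separately(model, triple, entities):
    # Rank one positive against subject-only and object-only corruptions.
    s, p, o = triple
    true_score = model.predict(np.array([triple]))[0]
    obj_corruptions = np.array([(s, p, e) for e in entities if e != o])
    subj_corruptions = np.array([(e, p, o) for e in entities if e != s])
    # Worst rank in case of ties (>=), consistent with the rest of the library.
    rank_o = 1 + int(np.sum(model.predict(obj_corruptions) >= true_score))
    rank_s = 1 + int(np.sum(model.predict(subj_corruptions) >= true_score))
    return rank_s, rank_o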

Experiments fail due to prime_number_list.txt file

Description

When running the experiments, an error is thrown after training has completed.
The error shows that the prime_number_list.txt file cannot be found.

FileNotFoundError: [Errno 2] No such file or directory: 'anaconda3/envs/ampligraph/lib/python3.6/site-packages/ampligraph/latent_features/prime_number_list.txt'

Actual Behavior

An error is thrown after training and no results are generated.

Expected Behavior

The model should continue to evaluation and display results.

Resolve Sphinx warnings when generating documentation.

Background and Context
The Sphinx documentation generation process currently produces a lot of warnings. These warnings should be resolved or handled.

Description
Resolve the warnings generated by Sphinx when the documentation is built. These warnings include missing citations, unexpected unindents, and missing blank lines after block quotes.

YAGO3-10 experiments

Description
Run a batch of experiments on the YAGO3-10 dataset.

From [Lacroix2018]:

[YAGO3-10 results table from Lacroix et al. 2018]

Sphinx documentation fails to build due to logger configuration file.

Sphinx logs of the first reported failure (March 8th)

[rtd-command-info] start-time: 2019-03-08T15:18:41.597007Z, end-time: 2019-03-08T15:18:42.128740Z, duration: 0, exit-code: 2
python /home/docs/checkouts/readthedocs.org/user_builds/accenture-labs-ampligraph/envs/latest/bin/sphinx-build -T -E -b readthedocs -d _build/doctrees-readthedocs -D language=en . _build/html
Running Sphinx v1.7.9

Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/accenture-labs-ampligraph/envs/latest/lib/python3.7/site-packages/sphinx/config.py", line 161, in __init__
    execfile_(filename, config)
  File "/home/docs/checkouts/readthedocs.org/user_builds/accenture-labs-ampligraph/envs/latest/lib/python3.7/site-packages/sphinx/util/pycompat.py", line 150, in execfile_
    exec_(code, _globals)
  File "conf.py", line 26, in <module>
    import ampligraph
  File "/home/docs/checkouts/readthedocs.org/user_builds/accenture-labs-ampligraph/envs/latest/lib/python3.7/site-packages/ampligraph/__init__.py", line 8, in <module>
    logging.config.fileConfig(fname=os.path.join(os.path.abspath(os.path.dirname(__file__)),'logger.conf'), disable_existing_loggers=False)
  File "/home/docs/.pyenv/versions/3.7.1/lib/python3.7/logging/config.py", line 71, in fileConfig
    formatters = _create_formatters(cp)
  File "/home/docs/.pyenv/versions/3.7.1/lib/python3.7/logging/config.py", line 104, in _create_formatters
    flist = cp["formatters"]["keys"]
  File "/home/docs/.pyenv/versions/3.7.1/lib/python3.7/configparser.py", line 958, in __getitem__
    raise KeyError(key)
KeyError: 'formatters'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/accenture-labs-ampligraph/envs/latest/lib/python3.7/site-packages/sphinx/cmdline.py", line 303, in main
    args.warningiserror, args.tags, args.verbosity, args.jobs)
  File "/home/docs/checkouts/readthedocs.org/user_builds/accenture-labs-ampligraph/envs/latest/lib/python3.7/site-packages/sphinx/application.py", line 163, in __init__
    confoverrides or {}, self.tags)
  File "/home/docs/checkouts/readthedocs.org/user_builds/accenture-labs-ampligraph/envs/latest/lib/python3.7/site-packages/sphinx/config.py", line 167, in __init__
    raise ConfigError(CONFIG_ERROR % traceback.format_exc())
sphinx.errors.ConfigError: There is a programable error in your configuration file:

Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/accenture-labs-ampligraph/envs/latest/lib/python3.7/site-packages/sphinx/config.py", line 161, in __init__
    execfile_(filename, config)
  File "/home/docs/checkouts/readthedocs.org/user_builds/accenture-labs-ampligraph/envs/latest/lib/python3.7/site-packages/sphinx/util/pycompat.py", line 150, in execfile_
    exec_(code, _globals)
  File "conf.py", line 26, in <module>
    import ampligraph
  File "/home/docs/checkouts/readthedocs.org/user_builds/accenture-labs-ampligraph/envs/latest/lib/python3.7/site-packages/ampligraph/__init__.py", line 8, in <module>
    logging.config.fileConfig(fname=os.path.join(os.path.abspath(os.path.dirname(__file__)),'logger.conf'), disable_existing_loggers=False)
  File "/home/docs/.pyenv/versions/3.7.1/lib/python3.7/logging/config.py", line 71, in fileConfig
    formatters = _create_formatters(cp)
  File "/home/docs/.pyenv/versions/3.7.1/lib/python3.7/logging/config.py", line 104, in _create_formatters
    flist = cp["formatters"]["keys"]
  File "/home/docs/.pyenv/versions/3.7.1/lib/python3.7/configparser.py", line 958, in __getitem__
    raise KeyError(key)
KeyError: 'formatters'

Configuration error:
There is a programable error in your configuration file:

Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/accenture-labs-ampligraph/envs/latest/lib/python3.7/site-packages/sphinx/config.py", line 161, in __init__
    execfile_(filename, config)
  File "/home/docs/checkouts/readthedocs.org/user_builds/accenture-labs-ampligraph/envs/latest/lib/python3.7/site-packages/sphinx/util/pycompat.py", line 150, in execfile_
    exec_(code, _globals)
  File "conf.py", line 26, in <module>
    import ampligraph
  File "/home/docs/checkouts/readthedocs.org/user_builds/accenture-labs-ampligraph/envs/latest/lib/python3.7/site-packages/ampligraph/__init__.py", line 8, in <module>
    logging.config.fileConfig(fname=os.path.join(os.path.abspath(os.path.dirname(__file__)),'logger.conf'), disable_existing_loggers=False)
  File "/home/docs/.pyenv/versions/3.7.1/lib/python3.7/logging/config.py", line 71, in fileConfig
    formatters = _create_formatters(cp)
  File "/home/docs/.pyenv/versions/3.7.1/lib/python3.7/logging/config.py", line 104, in _create_formatters
    flist = cp["formatters"]["keys"]
  File "/home/docs/.pyenv/versions/3.7.1/lib/python3.7/configparser.py", line 958, in __getitem__
    raise KeyError(key)
KeyError: 'formatters'

Implement HolE

Description

Implement HolE.

Nickel, Maximilian, Lorenzo Rosasco, and Tomaso A. Poggio. "Holographic Embeddings of Knowledge Graphs." AAAI. Vol. 2. No. 1. 2016.
http://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/download/12484/11828

Note that the HolE scoring function can be implemented as in the orange square below (table from the RotatE paper):

[scoring function table from the RotatE paper]

It is also worth taking this paper into account:
Hayashi, Katsuhiko, and Masashi Shimbo. "On the equivalence of holographic and complex embeddings for link prediction." arXiv preprint arXiv:1702.05563 (2017).
https://arxiv.org/pdf/1702.05563.pdf

Hayashi claims that the HolE scoring function is simply:

[HolE/ComplEx equivalence formula from Hayashi and Shimbo 2017]

So it is very easy to implement, but the predictive power and validity of this approach must be carefully tested with experiments.
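
For reference, a NumPy sketch of the circular-correlation form of the HolE scoring function (illustration only, not the implementation that shipped in the library):

import numpy as np

def hole_score(e_s, r_p, e_o):
    # HolE: f(s, p, o) = r_p . (e_s star e_o), where star is circular correlation,
    # computed efficiently in the Fourier domain.
    corr = np.fft.ifft(np.conj(np.fft.fft(e_s)) * np.fft.fft(e_o)).real
    return float(np.dot(r_p, corr))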

Resolve warnings in code to future proof module.

Background and Context
Currently, when running the tests, 156 warnings are generated.
These warnings are due to the use of deprecated methods or future warnings about methods that will be removed in the next iteration of pandas.
To help support future versions we should resolve these issues.

Description
Warnings need to be resolved or handled to avoid future updates of dependencies.

Remove HDT support or make it optional

Description

Currently the documentation suggests that installing with HDT support is optional.
However, unit tests and dataset code will fail to run unless it is installed.

This is due to the dataset.py script importing hdt:

from hdt import HDTDocument

Actual Behavior

Currently hdt support is not optional.

Expected Behavior

hdt support should be optional.
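
A common way to make such a dependency optional is a guarded import, sketched below (illustrative only; not necessarily how the fix was implemented):

# datasets.py: defer the failure until HDT functionality is actually used.
try:
    from hdt import HDTDocument
except ImportError:
    HDTDocument = None

def load_from_hdt(path):
    if HDTDocument is None:
        raise ImportError('HDT support requires the optional hdt package to be installed.')
    return HDTDocument(path)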

Implement tf.data api for loading data

Use tf.data to efficiently extract and preprocess the data and apply transformations like batching, shuffling and mapping functions. This should remove the bottleneck of performing these operations on the CPU.
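
A minimal tf.data sketch of the intended pipeline; the toy triples, batch size and the identity map (a placeholder for corruption generation or other preprocessing) are illustrative.

import tensorflow as tf

# Toy integer-indexed triples standing in for a real training set.
triples = tf.constant([[0, 0, 1], [1, 1, 2], [2, 0, 3]], dtype=tf.int32)

dataset = (tf.data.Dataset.from_tensor_slices(triples)
           .shuffle(buffer_size=10000)
           .map(lambda t: t, num_parallel_calls=tf.data.AUTOTUNE)  # preprocessing goes here
           .batch(2)
           .prefetch(tf.data.AUTOTUNE))

for batch in dataset:
    pass  # feed each batch to the training step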

Run time performance analysis

Description

Can you please report

  • how many milliseconds it takes to train a batch for each model on FB15k-237, when k=200, eta=2, batches_count=100, loss=nll

Tensorboard Visualizations

Background and Context

Embeddings can be visualized nicely in TensorBoard, but require the appropriate checkpoint and metadata files to be written.

Description

  • Implement function to write embeddings and labels to disk as files read by tensorboard.
  • (Optional): include functionality that can give groups/labels for each embedding, as Tensorboard can colour each group in a different colour.
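
A hedged sketch of what such a function could write: the TensorBoard Embedding Projector needs a checkpointed variable plus a tab-separated metadata file (the file and variable names below are illustrative).

import os
import tensorflow as tf

def export_embeddings_for_tensorboard(embeddings, labels, log_dir):
    # Write an embedding checkpoint and a metadata.tsv so TensorBoard's
    # Embedding Projector can display (and colour-code) the entities.
    os.makedirs(log_dir, exist_ok=True)
    with open(os.path.join(log_dir, 'metadata.tsv'), 'w') as f:
        for label in labels:
            f.write('{}\n'.format(label))
    checkpoint = tf.train.Checkpoint(embedding=tf.Variable(embeddings))
    checkpoint.save(os.path.join(log_dir, 'embedding.ckpt'))

A projector_config.pbtxt linking the checkpoint tensor to the metadata file is also needed; the tensorboard.plugins.projector helper can generate it.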

Remove misc.py

The code in this file is for some previous work on explainability, and should be removed from this release.
