juliagast / tgb2

This project forked from shenyanghuang/tgb


Temporal Graph Benchmark project repo


tgb2's Introduction

TGB logo

Temporal Graph Benchmark for Machine Learning on Temporal Graphs (NeurIPS 2023 Datasets and Benchmarks Track)

TGB 2.0: A Benchmark for Learning on Temporal Knowledge Graphs and Heterogeneous Graphs (preprint)

Overview of the Temporal Graph Benchmark (TGB) pipeline:

  • TGB includes large-scale and realistic datasets from five different domains with both dynamic link prediction and node property prediction tasks.
  • TGB automatically downloads datasets and processes them into numpy, PyTorch and PyG compatible TemporalData formats.
  • Novel TG models can be easily evaluated on TGB datasets via reproducible and realistic evaluation protocols.
  • TGB provides public and online leaderboards to track recent developments in temporal graph learning domain.
  • TGB now supports temporal homogeneous graph, temporal knowledge graph, and temporal heterogeneous graph datasets.

TGB dataloading and evaluation pipeline

To submit to the TGB leaderboard, please fill in this Google form

See all version differences and update notes here

Announcements

Excited to announce TGB 2.0, expanding TGB to Temporal Knowledge Graphs and Temporal Heterogeneous Graphs

See our preprint here for details. Please install locally first. We welcome your feedback and suggestions.

Excited to announce TGX, a companion package for analyzing temporal graphs in WSDM 2024 Demo Track

TGX supports all TGB datasets and provides numerous temporal graph visualization plots and statistics out of the box. See our paper: Temporal Graph Analysis with TGX and TGX website.

Excited to announce that TGB has been accepted to NeurIPS 2023 Datasets and Benchmarks Track

Thanks to everyone for your help in improving TGB! We will continue to improve TGB based on your feedback and suggestions.

Please update to version 0.9.2

version 0.9.2

This update includes the fix for tgbl-flight: the Unix timestamps are now provided directly in the dataset. If you had issues with tgbl-flight, please remove TGB/tgb/datasets/tgbl_flight and redownload the dataset for a clean install.

Pip Install

You can install TGB via pip. Requires Python >= 3.9.

pip install py-tgb

Links and Datasets

The project website can be found here.

The API documentation can be found here.

All dataset download links can be found in info.py.

The TGB dataloader automatically downloads each dataset, as well as the negative samples for the link property prediction datasets.

If the website is inaccessible, please use this link instead.

Running Example Methods

  • For the dynamic link property prediction task, see the examples/linkproppred folder for example scripts to run TGN, DyRep and EdgeBank on TGB datasets.
  • For the dynamic node property prediction task, see the examples/nodeproppred folder for example scripts to run TGN, DyRep and EdgeBank on TGB datasets.
  • For all other baselines, please see the TGB_Baselines repo.
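EdgeBank, used as a baseline above, simply memorizes previously seen edges. A minimal sketch of its unlimited-memory variant (an illustration, not the repo's implementation):

```python
import numpy as np

class EdgeBank:
    """Minimal sketch of EdgeBank (unlimited memory): score 1.0 for any
    previously observed (source, destination) pair, else 0.0."""

    def __init__(self):
        self.memory = set()

    def update(self, sources, destinations):
        # Remember every observed edge
        self.memory.update(zip(sources.tolist(), destinations.tolist()))

    def predict(self, sources, destinations):
        # Score queried edges by membership in memory
        return np.array([
            float((s, d) in self.memory)
            for s, d in zip(sources.tolist(), destinations.tolist())
        ])

bank = EdgeBank()
bank.update(np.array([0, 1]), np.array([1, 2]))
scores = bank.predict(np.array([0, 5]), np.array([1, 6]))  # [1.0, 0.0]
```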

Acknowledgments

We thank the OGB team for their support throughout this project and sharing their website code for the construction of TGB website.

Citation

If code or data from this repo is useful for your project, please consider citing our paper:

@article{huang2023temporal,
  title={Temporal graph benchmark for machine learning on temporal graphs},
  author={Huang, Shenyang and Poursafaei, Farimah and Danovitch, Jacob and Fey, Matthias and Hu, Weihua and Rossi, Emanuele and Leskovec, Jure and Bronstein, Michael and Rabusseau, Guillaume and Rabbany, Reihaneh},
  journal={Advances in Neural Information Processing Systems},
  year={2023}
}

tgb2's People

Contributors: shenyanghuang, juliagast, pitmonticone, fpour, alip67, erfanloghmani, emalgorithm


tgb2's Issues

rename modules

Rename modules and adapt all imports to avoid a clash with the ray dependency.

Negative Sampling

  • Discuss what Negative Sampling approach to use
  • Integrate Negative Sampling (preliminary first version)

Can we have dataset.num_nodes and dataset.num_rels

It would be great to have these two properties for each dataset.

  • num_nodes: number of distinct node ids in the train+valid+test set
  • num_rels: number of distinct edge types in the train+valid+test set

Otherwise I have to compute them for every method (also fine):

num_rels = len(set(dataset.edge_type))

num_nodes = len(set(np.concatenate((dataset.full_data['sources'], dataset.full_data['destinations']))))
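For illustration, a self-contained sketch of the two computations on toy data (full_data here is a hypothetical stand-in for the dataset's dict; note that numpy arrays must be joined with np.concatenate, since + would add them element-wise):

```python
import numpy as np

# Hypothetical stand-in for dataset.full_data on a toy graph
full_data = {
    "sources": np.array([0, 1, 2, 0]),
    "destinations": np.array([1, 2, 3, 2]),
    "edge_type": np.array([0, 1, 0, 2]),
}

num_rels = len(set(full_data["edge_type"]))  # 3 distinct edge types
# Concatenate (not '+', which adds numpy arrays element-wise) before deduplicating
num_nodes = len(set(np.concatenate((full_data["sources"],
                                    full_data["destinations"]))))  # 4 distinct nodes
```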

needed: 1-vs-all time-aware filtered for thgl-github

I think the negative sampling for thgl-github is not great.
Reason: the recurrency baseline should not perform well on this dataset. It has a rather low recurrency degree, and applying the relaxed recurrency baseline does not seem to lead to good results either.
Still, it achieves a rather decent MRR of 0.17.
In the examples I checked, a given node-relation combination had 2000-6000 different objects, so 1000 negatives is a small subsample of the possible nodes.
Given that this is our smallest dataset, I strongly suggest that we do not do any negative sampling here.
We have a very small recurrency degree, so I expect this dataset will be very hard.
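For reference, a toy sketch of how MRR is computed from a positive score and its sampled negative scores (assuming higher scores are better; this is an illustration, not the repo's evaluator):

```python
import numpy as np

def mrr(pos_scores, neg_scores):
    """Mean reciprocal rank: each positive is ranked against its own negatives;
    rank = 1 + number of negatives scoring strictly higher."""
    ranks = 1 + (neg_scores > pos_scores[:, None]).sum(axis=1)
    return float((1.0 / ranks).mean())

pos = np.array([0.9, 0.2])                # scores of the true objects
neg = np.array([[0.1, 0.5], [0.4, 0.1]])  # sampled negatives per query
# ranks are [1, 2], so MRR = (1 + 0.5) / 2 = 0.75
```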

dataset.py inverse triples

  • create a method that creates an inverse triple for each triple in triples. The inverse triple swaps subject and object, and increases the relation id by num_rels

Statistics to report for each dataset and plots

Statistics to report, brainstorming

  • surprise index/recurrency index
  • relation density / density?
  • measure to compute how long a fact stays true (to differentiate between fact based and event based)

Figures, brainstorming

  • distribution of repeating vs non repeating triples?
  • average duration of facts?
  • number of different nodes, relations, triples
  • number of triples over time?
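The recurrency index mentioned above could be sketched as the fraction of quadruples whose (subject, relation, object) triple already appeared at an earlier point in the stream; a toy example on synthetic data (an illustrative sketch, not the repo's statistic code):

```python
import numpy as np

# Toy quadruples (subject, relation, object, timestamp)
quads = np.array([
    [0, 0, 1, 0],
    [1, 0, 2, 0],
    [0, 0, 1, 1],  # repeats the (0, 0, 1) triple from t=0
    [2, 1, 3, 1],
])

seen = set()
repeats = 0
# Process quadruples in temporal order and count re-occurring triples
for s, r, o, t in quads[np.argsort(quads[:, 3], kind="stable")]:
    if (s, r, o) in seen:
        repeats += 1
    seen.add((s, r, o))

recurrency_index = repeats / len(quads)  # 1 of 4 facts was seen before -> 0.25
```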

Integration of CEN model

CEN

ongoing

Tasks:

  • restructure code s.t. it fits tgb
  • implement data loading and make sure it is in correct format for CEN
  • integrate test()
  • store and load models at correct location
  • compute scores for negative samples only
  • compute and log mrr
  • compute and log runtime
  • hyperparameter tuning

Some more utility functions for dataset.py?

I need the following methods for recurrencybaseline.py; they will also be needed by other methods later. Does it make sense to move them somewhere else, e.g. to utils?
(you can also find them here: https://github.com/JuliaGast/TGB2/blob/julia_new/examples/linkproppred/tkgl-polecat/recurrencybaseline.py)

from itertools import groupby
from operator import itemgetter

import numpy as np

def group_by(data: np.ndarray, key_idx: int) -> dict:
    """
    Group rows of a numpy array into a dict keyed by the value at key_idx
    (e.g. group elements of the array by relation).
    :param data: [np.ndarray] data to be grouped
    :param key_idx: [int] column index to group by
    returns data_dict: dict mapping each value at key_idx to all rows with that value
    """
    data_dict = {}
    data_sorted = sorted(data, key=itemgetter(key_idx))
    for key, group in groupby(data_sorted, key=itemgetter(key_idx)):
        data_dict[key] = np.array(list(group))
    return data_dict
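A quick usage sketch of the same grouping logic on toy quadruples (grouping by the relation column, index 1):

```python
import numpy as np
from itertools import groupby
from operator import itemgetter

quads = np.array([[0, 1, 2, 0], [3, 1, 4, 0], [0, 0, 5, 1]])

# Same logic as group_by above: sort by the key column, then group
data_sorted = sorted(quads, key=itemgetter(1))
by_rel = {key: np.array(list(group))
          for key, group in groupby(data_sorted, key=itemgetter(1))}
# by_rel maps relation id -> all quadruples with that relation
```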

def add_inverse_quadruples(triples: np.ndarray, num_rels: int) -> np.ndarray:
    """
    Creates an inverse triple for each triple in triples. The inverse triple swaps
    subject and object, and increases the relation id by num_rels.
    :param triples: [np.ndarray] dataset triples
    :param num_rels: [int] number of relations that we have originally
    returns all_triples: [np.ndarray] triples including inverse triples
    """
    inverse_triples = triples[:, [2, 1, 0, 3]]
    inverse_triples[:, 1] = inverse_triples[:, 1] + num_rels  # shift relation ids for inverses
    all_triples = np.concatenate((triples[:, 0:4], inverse_triples))

    return all_triples
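A usage sketch of the inverse-quadruple construction on a single toy quadruple:

```python
import numpy as np

num_rels = 2
triples = np.array([[0, 1, 5, 10]])  # one (subject, relation, object, timestamp) row

# Swap subject and object, shift the relation id by num_rels
inverse = triples[:, [2, 1, 0, 3]]   # fancy indexing returns a copy
inverse[:, 1] += num_rels
all_triples = np.concatenate((triples, inverse))
# all_triples is [[0, 1, 5, 10], [5, 3, 0, 10]]
```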

def reformat_ts(timestamps):
    """Reformat timestamps so that they start at 0 and have step size 1.
    Assumes the unique timestamps are (roughly) evenly spaced.
    :param timestamps: np.ndarray of timestamps
    returns: np.ndarray of remapped timestamps
    """
    all_ts = sorted(set(timestamps))
    ts_min = all_ts[0]
    ts_dist = all_ts[1] - all_ts[0]  # spacing between the first two unique timestamps
    return ((np.asarray(timestamps) - ts_min) // ts_dist).astype(int)
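A quick check of the remapping on toy data (assuming uniform 15-unit spacing between unique timestamps):

```python
import numpy as np

timestamps = np.array([100, 130, 115, 100])  # raw timestamps, step size 15

# Same remapping as reformat_ts above: shift to 0, divide by the step size
all_ts = sorted(set(timestamps))
ts_min = all_ts[0]
ts_dist = all_ts[1] - all_ts[0]  # 15 here
remapped = ((timestamps - ts_min) // ts_dist).astype(int)
# remapped is [0, 2, 1, 0]
```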

Add Timetraveler

Timetraveler

ongoing

Tasks:

  • restructure code s.t. it fits tgb
  • implement data loading and make sure it is in correct format for Timetraveler
  • add preprocessing
  • add dirichlet distribution step
  • check if the relation ids still make sense
  • integrate test()
  • store and load models at correct location
  • compute scores for negative samples only
  • compute and log mrr
  • compute and log runtime

Integration of TLogic Model

TLogic

  • data loading
  • set random seed
  • learn rules
  • apply rules
  • evaluate
  • what to do with hyperparameters?
  • log results
  • fix the window bug from the original TLogic by switching to the apples-oranges version
  • provide file relation2id.txt/.json?
  • compute valid mrr @shenyangHuang do we want to report validation mrr?

Integration of RE-GCN model

Zixuan Li, Xiaolong Jin, Wei Li, Saiping Guan, Jiafeng Guo, Huawei Shen, Yuanzhuo Wang and Xueqi Cheng. Temporal Knowledge Graph Reasoning Based on Evolutional Representation Learning. SIGIR 2021.

RE-GCN

Tasks:

  • restructure code s.t. it fits tgb
  • implement data loading and make sure it is in the correct format for RE-GCN
  • integrate test()
  • store and load models at correct location
  • compute scores for negative samples only
  • compute and log mrr
  • compute and log runtime
  • hyperparameter tuning
  • check if I can reproduce the RE-GCN results from the apples-oranges paper

Dataset split in YAGO

The split needs to be modified for the automatically downloaded dataset:

dataset.py

        if ("yago" in self.name):

            _train_mask, _val_mask, _test_mask = self.generate_splits(full_data, val_ratio=0.1, test_ratio=0.10) 
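For illustration, a sketch of what a ratio-based chronological split could look like (hypothetical; the actual generate_splits in dataset.py may differ):

```python
import numpy as np

def generate_splits_sketch(timestamps, val_ratio=0.1, test_ratio=0.1):
    """Chronological split sketch: earliest edges -> train, then val, then test.
    Illustrative only; not the repo's implementation."""
    val_time, test_time = np.quantile(
        timestamps, [1.0 - val_ratio - test_ratio, 1.0 - test_ratio]
    )
    train_mask = timestamps <= val_time
    val_mask = (timestamps > val_time) & (timestamps <= test_time)
    test_mask = timestamps > test_time
    return train_mask, val_mask, test_mask

ts = np.arange(100)
train_mask, val_mask, test_mask = generate_splits_sketch(ts)
# 80 / 10 / 10 edges land in train / val / test
```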

Bug in Neg Sampler for THG

I think it should be
neg_d_arr = filtered_dst[(pos_t, pos_s, e_type)] in l. 123 of thg_negative_sampler.py

instead of
neg_d_arr = filtered_dst

right?

With the current implementation I get incorrect negative destinations (see the screenshots attached to the original issue).
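The proposed fix can be illustrated with a toy dict; here filtered_dst is a hypothetical stand-in keyed by (timestamp, source, edge_type) tuples:

```python
import numpy as np

# Hypothetical filtered-destination dict keyed by (timestamp, source, edge_type)
filtered_dst = {
    (5, 0, 1): np.array([2, 3]),
    (5, 1, 0): np.array([4]),
}

pos_t, pos_s, e_type = 5, 0, 1
# Per-query lookup (the proposed fix), rather than using the whole dict
neg_d_arr = filtered_dst[(pos_t, pos_s, e_type)]
```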
