davidmr001 / graphvite

This project forked from deepgraphlearning/graphvite

A general and high-performance graph embedding system for various applications

Home Page: https://graphvite.io

License: Apache License 2.0

CMake 3.64% Shell 0.05% Cuda 47.28% C++ 27.33% Python 21.71%

GraphVite - graph embedding at high speed and large scale

GraphVite is a general graph embedding engine, dedicated to high-speed and large-scale embedding learning in various applications.

GraphVite provides complete training and evaluation pipelines for 3 applications: node embedding, knowledge graph embedding, and graph & high-dimensional data visualization. It also includes 9 popular models, along with their benchmarks on several standard datasets.

[Figures: Node Embedding; Knowledge Graph Embedding; Graph & High-dimensional Data Visualization]

Here is a summary of the training time of GraphVite compared with the best open-source implementations on the 3 applications. All times are measured on a server with 24 CPU threads and 4 V100 GPUs.

Node embedding on Youtube dataset.

Model      Existing Implementation    GraphVite    Speedup
DeepWalk   1.64 hrs (CPU parallel)    1.19 mins    82.9x
LINE       1.39 hrs (CPU parallel)    1.17 mins    71.4x
node2vec   24.4 hrs (CPU parallel)    4.39 mins    334x

Knowledge graph embedding on FB15k dataset.

Model      Existing Implementation    GraphVite    Speedup
TransE     1.31 hrs (1 GPU)           14.8 mins    5.30x
RotatE     3.69 hrs (1 GPU)           27.0 mins    8.22x

High-dimensional data visualization on MNIST dataset.

Model      Existing Implementation    GraphVite    Speedup
LargeVis   15.3 mins (CPU parallel)   15.1 s       60.8x
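
The speedup column can be recomputed from the two timing columns. The following is a minimal sketch with the timings copied from the tables above; `speedup` and `TIMINGS` are illustrative names, not part of GraphVite. Small differences from the listed figures are expected, since the reported times are rounded.

```python
# Recompute the speedup column from the reported timings.
# `speedup` and `TIMINGS` are illustrative names, not part of GraphVite.

SECONDS = {"s": 1.0, "mins": 60.0, "hrs": 3600.0}


def speedup(existing, graphvite):
    """Return existing / graphvite, where each time is a (value, unit) pair."""
    existing_s = existing[0] * SECONDS[existing[1]]
    graphvite_s = graphvite[0] * SECONDS[graphvite[1]]
    return existing_s / graphvite_s


# (existing implementation, GraphVite), as listed in the tables above.
TIMINGS = {
    "DeepWalk": ((1.64, "hrs"), (1.19, "mins")),
    "LINE": ((1.39, "hrs"), (1.17, "mins")),
    "node2vec": ((24.4, "hrs"), (4.39, "mins")),
    "TransE": ((1.31, "hrs"), (14.8, "mins")),
    "RotatE": ((3.69, "hrs"), (27.0, "mins")),
    "LargeVis": ((15.3, "mins"), (15.1, "s")),
}

for model, (existing, graphvite) in TIMINGS.items():
    print("%-9s %.1fx" % (model, speedup(existing, graphvite)))
```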

Requirements

Generally, GraphVite works on any Linux distribution with CUDA >= 9.2.

The library is compatible with Python 2.7 and 3.5/3.6/3.7.
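
As a rough pre-install check, the interpreter and operating system can be compared against the requirements stated above. This is only a sketch: the helper names are hypothetical, the version list is copied from this section, and the CUDA version is not checked.

```python
# Check the running interpreter and OS against the stated requirements.
# Helper names are hypothetical; the CUDA version is NOT checked here.
import sys

# Python versions listed as compatible in this README.
SUPPORTED_PYTHONS = {(2, 7), (3, 5), (3, 6), (3, 7)}


def python_supported(version_info=sys.version_info):
    """True if the (major, minor) version is one of the listed versions."""
    return tuple(version_info[:2]) in SUPPORTED_PYTHONS


def on_linux(platform=sys.platform):
    """True if the platform string identifies a Linux system."""
    return platform.startswith("linux")


print("Python %d.%d listed as compatible: %s"
      % (sys.version_info[0], sys.version_info[1], python_supported()))
print("Running on Linux: %s" % on_linux())
```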

Installation

From Conda

GraphVite can be installed through conda with a single command.

conda install -c milagraph graphvite

If you only need embedding training without evaluation, you can use the following alternative with minimal dependencies.

conda install -c milagraph graphvite-mini

From Source

Before installation, make sure you have conda installed.

git clone https://github.com/DeepGraphLearning/graphvite
cd graphvite
conda install -y --file conda/requirements.txt
mkdir build
cd build && cmake .. && make && cd -
cd python && python setup.py install && cd -

Quick Start

Here is a quick-start example of the node embedding application.

graphvite baseline quick start

Typically, the example takes no more than 1 minute. You will see output like the following:

Batch id: 6000
loss = 0.371641

macro-F1@20%: 0.236794
micro-F1@20%: 0.388110
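
If you want to collect these numbers programmatically, the lines can be parsed with a short script. The sketch below assumes the output follows exactly the `name: value` and `name = value` shapes shown in the sample above; `parse_metrics` is a hypothetical helper, and real output may contain other lines that it simply skips.

```python
# Parse "name: value" / "name = value" lines from the quick-start output.
# `parse_metrics` is a hypothetical helper; the line format is assumed
# from the sample output above.
import re

SAMPLE = """Batch id: 6000
loss = 0.371641

macro-F1@20%: 0.236794
micro-F1@20%: 0.388110
"""

LINE_PATTERN = re.compile(
    r"\s*([^:=]+?)\s*[:=]\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*$"
)


def parse_metrics(text):
    """Return a dict mapping each recognized name to its float value."""
    metrics = {}
    for line in text.splitlines():
        match = LINE_PATTERN.match(line)
        if match:
            metrics[match.group(1)] = float(match.group(2))
    return metrics


for name, value in sorted(parse_metrics(SAMPLE).items()):
    print("%s -> %s" % (name, value))
```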

Baseline Benchmark

To reproduce a baseline benchmark, you only need to specify the keywords of the experiment, e.g. the model and the dataset.

graphvite baseline [keyword ...] [--no-eval] [--gpu n] [--cpu m]

You may also set the number of GPUs and the number of CPUs per GPU.

Use graphvite list to get a list of available baselines.
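
When launching several baselines from a script, the command line above can be assembled programmatically. The following Python 3 sketch only uses the flags shown in the usage line; `baseline_command` and `run_baseline` are hypothetical helpers, and the keywords `deepwalk` and `youtube` are assumed spellings (check `graphvite list` for the actual ones).

```python
# Build (and optionally run) a `graphvite baseline` command line.
# Helper names are hypothetical; only flags from the usage line are used.
import shutil
import subprocess


def baseline_command(keywords, no_eval=False, gpu=None, cpu=None):
    """Return the argument list for `graphvite baseline`."""
    cmd = ["graphvite", "baseline"] + list(keywords)
    if no_eval:
        cmd.append("--no-eval")
    if gpu is not None:
        cmd += ["--gpu", str(gpu)]  # number of GPUs
    if cpu is not None:
        cmd += ["--cpu", str(cpu)]  # number of CPUs per GPU
    return cmd


def run_baseline(keywords, **options):
    """Run the baseline and return its exit code."""
    if shutil.which("graphvite") is None:
        raise RuntimeError("graphvite executable not found on PATH")
    return subprocess.call(baseline_command(keywords, **options))


# Keywords are assumptions; 4 GPUs x 6 CPUs matches a 24-thread server.
print(" ".join(baseline_command(["deepwalk", "youtube"], gpu=4, cpu=6)))
```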

High-dimensional Data Visualization

You can visualize your high-dimensional vectors with a single GraphVite command.

graphvite visualize [file] [--label label_file] [--save save_file] [--perplexity n] [--3d]

The input file can be either a numpy dump or a text file. For the save file, we recommend PNG format; PDF is also supported.
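
As an illustration, an input file can be prepared from Python before calling the command. This sketch assumes that "numpy dump" means a `.npy` file written by `np.save`; the random vectors are placeholder data, the perplexity value is arbitrary, and `visualize_command` is a hypothetical helper that only uses the flags shown in the usage line.

```python
# Write placeholder vectors to a .npy file and build the visualize command.
# Assumption: "numpy dump" refers to a .npy file written by np.save.
import os
import tempfile

import numpy as np


def visualize_command(vector_file, label_file=None, save_file=None,
                      perplexity=None, three_d=False):
    """Return the argument list for `graphvite visualize`."""
    cmd = ["graphvite", "visualize", vector_file]
    if label_file is not None:
        cmd += ["--label", label_file]
    if save_file is not None:
        cmd += ["--save", save_file]
    if perplexity is not None:
        cmd += ["--perplexity", str(perplexity)]
    if three_d:
        cmd.append("--3d")
    return cmd


# 100 random 64-dimensional vectors stand in for real embeddings.
rng = np.random.RandomState(0)
vectors = rng.rand(100, 64).astype(np.float32)
vector_file = os.path.join(tempfile.mkdtemp(), "vectors.npy")
np.save(vector_file, vectors)

print(" ".join(visualize_command(vector_file, save_file="vectors.png",
                                 perplexity=30)))
```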

Contributing

We welcome all contributions, from bug fixes to new features. Please let us know if you have any suggestions for our library.

Development Team

GraphVite is developed by MilaGraph, led by Prof. Jian Tang.

Authors of this project are Zhaocheng Zhu, Shizhen Xu, Meng Qu and Jian Tang. Contributors include Kunpeng Wang and Zhijian Duan.

Citation

If you find GraphVite useful for your research or development, please cite the following paper.

@inproceedings{zhu2019graphvite,
    title={GraphVite: A High-Performance CPU-GPU Hybrid System for Node Embedding},
    author={Zhu, Zhaocheng and Xu, Shizhen and Qu, Meng and Tang, Jian},
    booktitle={The World Wide Web Conference},
    pages={2494--2504},
    year={2019},
    organization={ACM}
}

Acknowledgements

We would like to thank Compute Canada for supporting GPU servers. We especially thank Wenbin Hou for useful discussions on C++ and GPU programming techniques.
