isabella232 / trillion-graph

This project forked from neo4j/trillion-graph

A scale demo of Neo4j Fabric spanning up to 1129 machines/shards running a 100TB (LDBC) dataset with 1.2tn nodes and relationships.

License: Apache License 2.0

HTML 0.24% JavaScript 2.58% Shell 0.12% Svelte 28.68% TypeScript 7.93% Java 60.45%

trillion-graph's Introduction

Demo application instructions

Overview

This repository contains the code necessary to reproduce the results for the Trillion Entity demonstration that was part of the NODES 2021 Keynote presentation. It contains the store generation code we used, the orchestration scripts for the AWS instances that are needed to run the setup, the queries we executed, and the client that performs the latency measurements. Please read this README in its entirety before proceeding, to make sure you have an understanding of the necessary steps.

More Information

Blog post with more behind-the-scenes information: Behind the Scenes of Creating the World’s Biggest Graph Database.

The NODES 2021 Keynote recording showing the Trillion Graph Demo live:

A Twitter thread summary of the demo:

How To

What you'll need:

  1. An AWS account with sufficient capacity for the number and type of EC2 instances you'll create, including access to S3. AWS is the default provider this application uses; it should be possible to modify it to use the cloud provider of your choice.
  2. Access to Neo4j Enterprise. Fabric is a Neo4j Enterprise feature, which is distributed under a different license. It must be installed in your local Maven repository; detailed instructions are available in the Neo4j documentation.

The directory structure is as follows:

  1. cypher contains the individual Cypher queries that were used in the demo
  2. server contains the data generation code and the instance orchestration
  3. client contains the client for the latency measurements
  4. guide contains a Neo4j Browser guide which explains the LDBC schema and queries

Outline

Here we'll describe the basic steps you'll need to take. Detailed instructions are provided further down.

Familiarize yourself with the code.

The code provided should be straightforward to understand. Take some time to familiarize yourself with it, since you'll need to provide information specific to your environment. The two main files to look at are the FabricDataGenerator and AmazonController, which you can find under the server directory. The first creates the stores, both locally and remotely, and the second orchestrates the AWS Neo4j instances. They are structured as scripts, so you can modify them as you like. You will need to edit the code to execute the various steps and to configure the setup to your requirements.

Create the stores

You should first create the Person and Template databases. The former is the full Person shard and the latter is the basis for the Forum shards. Typically, you will create these two locally, upload them to S3, and then orchestrate EC2 instances with the AmazonController to generate the Forum shards en masse. Of course, with minimal changes, you can do everything locally, in one step, and then move the databases to the Fabric shards however you prefer.

Instantiate the Shards

The AmazonController class can be used to install and configure Neo4j and the shards. You will need to modify the code to execute the appropriate commands for your setup, but the basic AWS orchestration steps will be the same as for the store generation.
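
As a rough illustration of the configuration side of this step, the sketch below renders Fabric settings for one Person shard and a number of Forum shards. It is not taken from AmazonController: the class FabricConfigSketch, the host names, the graph names, and the database names are all hypothetical, and the setting keys (fabric.database.name, fabric.graph.<ID>.uri, fabric.graph.<ID>.name, fabric.graph.<ID>.database) follow the Neo4j 4.x Fabric configuration as we understand it, so check them against the documentation for your Neo4j version.

```java
import java.util.ArrayList;
import java.util.List;

public class FabricConfigSketch {

    /** One remote shard: its Fabric graph name, the URI of its host, and the database it serves. */
    public static final class Shard {
        final String name;
        final String uri;
        final String database;

        Shard(String name, String uri, String database) {
            this.name = name;
            this.uri = uri;
            this.database = database;
        }
    }

    /** One Person shard followed by forumShardCount Forum shards. All names and hosts are hypothetical. */
    public static List<Shard> planShards(int forumShardCount) {
        List<Shard> shards = new ArrayList<>();
        shards.add(new Shard("persons", "neo4j://person-host:7687", "persons"));
        for (int i = 0; i < forumShardCount; i++) {
            shards.add(new Shard("forums" + i, "neo4j://forum-host-" + i + ":7687", "forums"));
        }
        return shards;
    }

    /** Renders neo4j.conf-style lines; the graph ID is the position of the shard in the list. */
    public static String render(String fabricDatabaseName, List<Shard> shards) {
        StringBuilder sb = new StringBuilder();
        sb.append("fabric.database.name=").append(fabricDatabaseName).append('\n');
        for (int id = 0; id < shards.size(); id++) {
            Shard s = shards.get(id);
            String prefix = "fabric.graph." + id + ".";
            sb.append(prefix).append("uri=").append(s.uri).append('\n');
            sb.append(prefix).append("name=").append(s.name).append('\n');
            sb.append(prefix).append("database=").append(s.database).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints the settings for one Person shard and two Forum shards.
        System.out.print(render("fabric", planShards(2)));
    }
}
```

Generating these lines from a list keeps the graph IDs contiguous, which matters once the shard count reaches the hundreds and hand-editing becomes error-prone.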

Build, install and run the application

The last step is to locally build and run the UI for the demo. With that, you'll be able to take latency measurements and explore the schema you built.
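
To give a feel for what a latency measurement involves, here is a minimal sketch that times a task repeatedly and reports percentiles. It is not the code from the client directory: the class LatencySketch and its methods are hypothetical, the task is an empty stand-in for running a query against Fabric, and the nearest-rank percentile is one common definition chosen here for simplicity.

```java
import java.util.Arrays;

public class LatencySketch {

    /** Runs the task the given number of times and returns each wall-clock duration in nanoseconds. */
    public static long[] measure(Runnable task, int iterations) {
        long[] samples = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        return samples;
    }

    /** Nearest-rank percentile for p in (0, 100]; the input array is left unmodified. */
    public static long percentile(long[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Stand-in task; a real client would execute a query here.
        long[] samples = measure(() -> { }, 1000);
        System.out.printf("p50=%d ns, p99=%d ns%n",
                percentile(samples, 50), percentile(samples, 99));
    }
}
```

Reporting percentiles rather than a mean keeps a handful of slow outliers from hiding the typical response time, and a real measurement would usually discard some warm-up iterations first.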
