GithubHelp home page GithubHelp logo

samuelbalogh / people-api Goto Github PK

View Code? Open in Web Editor NEW
0.0 2.0 0.0 43 KB

Social graph in Postgres

Python 79.93% Dockerfile 1.53% Makefile 0.72% Shell 1.59% PLpgSQL 16.24%
graph postgres recursive-ctes hobbies

people-api's Introduction

People API

Backend for the Hackathon project written on the 9th January 2021.

Concept

The idea was to write an app to list friends and note some details about them, like when we talked last time, and how their kids are called (I always forget their names), and who their acquaintances are - the last requirement basically makes this a super-simple social network.

Why Postgres?

I wanted to use Postgres as a graph database because I wanted to see how terrible it could be to shoehorn a graph use-case into a relational database.
I think overall for a simple use-case like mine, it works surprisingly well.

โ— I would not use this in a production setting though (I felt like this had to be mentioned, lest this project is taken seriously by mistake).

The data structure is based on two tables: nodes and edges (it's how Martin Kleppman described this approach in the DDIA book).

Nodes are currently only people, but they could be anything, like hobbies or places. For simplicity's sake, I'm storing places and hobbies as properties, not as nodes, so they are not first-class citizens in this model.
Edges are relationships between people. They point from node A to node B and they have a label, like friend.

Storing the data is straightforward. To query the data, I'm using some CTE's to get some basic info about a person's first level of connections (their names). I don't want to answer arbitrary questions like how can I get to person X from person A, so I'm not walking the graph with Postgres beyond a single level. Even if going a few levels was a requirement, it could be solved by recursive CTEs.

The schema can be seen below:

people-api=# \d nodes
                     Table "public.nodes"
   Column   | Type | Collation | Nullable |      Default       
------------+------+-----------+----------+--------------------
 id         | uuid |           | not null | uuid_generate_v4()
 properties | json |           |          | 
Indexes:
    "nodes_pkey" PRIMARY KEY, btree (id)
Referenced by:
    TABLE "edges" CONSTRAINT "edges_head_node_fkey" FOREIGN KEY (head_node) REFERENCES nodes(id) ON DELETE CASCADE
    TABLE "edges" CONSTRAINT "edges_tail_node_fkey" FOREIGN KEY (tail_node) REFERENCES nodes(id) ON DELETE CASCADE
people-api=# \d edges
                     Table "public.edges"
   Column   | Type | Collation | Nullable |      Default       
------------+------+-----------+----------+--------------------
 id         | uuid |           | not null | uuid_generate_v4()
 tail_node  | uuid |           | not null | 
 head_node  | uuid |           | not null | 
 label      | text |           |          | 
 properties | json |           |          | 
Indexes:
    "edges_pkey" PRIMARY KEY, btree (id)
    "edges_heads" btree (head_node)
    "edges_tails" btree (tail_node)
Foreign-key constraints:
    "edges_head_node_fkey" FOREIGN KEY (head_node) REFERENCES nodes(id) ON DELETE CASCADE
    "edges_tail_node_fkey" FOREIGN KEY (tail_node) REFERENCES nodes(id) ON DELETE CASCADE

Limitations

There are many.

It's hard (but possible) to answer arbitrary graph-traversal questions with SQL - but as I mentioned earlier, it would be doable with recursive CTEs.

The performance is very good at the moment but the size of the dataset is trivial currently. It would be nice to do some load testing to figure out at what point this approach becomes untenable.

I have shifted some logic the application layer, like sorting results, so that people are surrounded by people with whom they are connected with - this is done via a recursive depth-first traversal (I'm not sure if that would have been feasible in Postgres).

What's next

  • load testing locally so I can see how it performs with a few hundred thousand rows
  • look into graph DBs more deeply
  • look into Neo4j docs
  • look into FaunaDB docs

people-api's People

Contributors

samuelbalogh avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.