Comments (3)
The comment below is from the related Trello card.
We had a meeting today around the real-time dashboards use case (May 29, 2015).
One of the takeaways from that meeting was to provide a toolkit and/or instructions to handle table roll-ups in PostgreSQL. We discussed the following:
(1) For roll-up specifics, we discussed two alternatives:
(1a) Lazy materialized views with expiration times: http://hashrocket.com/blog/posts/materialized-view-strategies-using-postgresql
Andres noted that we can improve on this approach by blocking on the caller's side with a PG_TRY block.
(1b) Andres then noted that we could also create an extension, where we have a background worker process that does the "roll-up coordination".
We need to evaluate these two approaches and pick the one we like better. Some points we chatted about in the meeting were: Amazon RDS not being extension friendly, and whether these approaches would be compatible with the approach we pick in CitusDB.
(2) Keep an invalidation queue or not
Kafka has become a standard for queueing. At the same time, it's an external dependency. How much effort is it to keep an invalidation queue table in Postgres?
For the motivation for this use-case, I'm copy+pasting from the email with the subject line "Real-time dashboards use case" dated May 21, 2015.
We had 5 customer interactions this week that targeted the same use-case: raw events data comes into the system, gets rolled up into cubes, and gets displayed through a NoSQL database.
As we chatted, if we can cover how users can address this use-case with CitusDB or PostgreSQL, that's a major win.
The alternative pipelines we hear about today are: (a) Redshift / Hadoop to store data, (b) Spark / MapReduce to transform the data, and (c) a NoSQL database to serve the rolled-up data.
For half of these conversations, the current INSERT rates were around 100/s. If we could document and provide a toolbox for this use case on PostgreSQL, their only objection will be that Postgres doesn't scale. We could then mitigate that by pointing them to the new CitusDB documents.
from citus.
We explored several methods to create roll-up tables and also presented a tutorial on it. We could evaluate our learnings from this tutorial and pick and implement an approach: https://www.youtube.com/watch?v=0ybz6zuXCPo
from citus.
There are now many companies that have successfully implemented a real-time analytics pipeline on Citus by creating rollup tables and using INSERT...SELECT.
Blog post: https://www.citusdata.com/blog/2018/06/14/scalable-incremental-data-aggregation/
Documentation: https://docs.citusdata.com/en/v7.4/use_cases/realtime_analytics.html
from citus.