Comments (2)
Hi,
I'm very familiar with Kafka Streams from a Java perspective, but I haven't used this JavaScript library yet. However, I can explain the functionality behind interactive queries:
Usually when creating a Kafka Streams topology you do the following:
- Consume data from one or more Kafka topics
- Transform the data using the Kafka Streams topology (map, filter, join, ...)
- Produce the transformed data into one or more Kafka topics
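The three steps above can be sketched without a broker. This is only an illustration under stated assumptions: plain arrays stand in for Kafka topics, and `runTopology`, `inputTopic` and `outputTopic` are hypothetical names, not part of any Kafka library.

```javascript
// Plain arrays stand in for Kafka topics (an assumption for illustration only).
const inputTopic = [
  { key: "sensor-1", value: "21.5" },
  { key: "sensor-2", value: "not-a-number" },
  { key: "sensor-1", value: "22.0" },
];
const outputTopic = [];

// Hypothetical helper: consume -> transform -> produce.
function runTopology(source, sink) {
  source
    // Transform: parse the raw string value into a number.
    .map((msg) => ({ key: msg.key, value: Number.parseFloat(msg.value) }))
    // Transform: drop messages whose value could not be parsed.
    .filter((msg) => !Number.isNaN(msg.value))
    // Produce: append each transformed message to the output "topic".
    .forEach((msg) => sink.push(msg));
}

runTopology(inputTopic, outputTopic);
console.log(outputTopic);
```

In a real topology the source and sink would be topics on a broker and the processing would run continuously rather than once over a fixed array.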
You can, however, choose a different target for your streaming topology. Instead of producing data to another Kafka topic, you can also write it into a so-called "State Store". A State Store is basically a RocksDB key-value database that resides in the local file system of your application (usually in the OS temp folder). This state store can be accessed via a Kafka API and lets you get the latest message value for a certain key. You could wrap this functionality behind a REST interface and make the content of a Kafka topic available to a web frontend. This is basically what they call interactive queries.
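To make the "latest value per key" idea concrete, here is a minimal sketch. It assumes an in-memory `Map` in place of RocksDB, and `InMemoryStateStore` and `queryLatest` are hypothetical names invented for this example; it does not reproduce the Java API.

```javascript
// Hypothetical in-memory stand-in for a state store; a Map replaces RocksDB here.
class InMemoryStateStore {
  constructor() {
    this.entries = new Map();
  }

  // Called for every incoming message: a later value overwrites an earlier one.
  put(key, value) {
    this.entries.set(key, value);
  }

  // Called by the query side; returns undefined for unknown keys.
  get(key) {
    return this.entries.get(key);
  }
}

const store = new InMemoryStateStore();

// Messages as they might arrive from a topic, in order.
const messages = [
  { key: "user-1", value: "logged-in" },
  { key: "user-2", value: "logged-in" },
  { key: "user-1", value: "logged-out" },
];
for (const { key, value } of messages) {
  store.put(key, value);
}

// Hypothetical query function that a REST handler could delegate to.
function queryLatest(key) {
  const value = store.get(key);
  return value === undefined
    ? { status: 404 }
    : { status: 200, body: { key, value } };
}

console.log(queryLatest("user-1"));
```

A web framework route could call `queryLatest` with the key taken from the request path; that wiring is left out to keep the sketch dependency-free.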
However, the above description is a little bit simplified. What I've left out is that there are two different kinds of state stores: regular state stores and global state stores. A global state store will always have all the data from a topic. When you start 3 instances of your application, the global state store will be created 3 times, each with the full topic content. When you use a regular state store and you start 3 instances of your application, Kafka will assign roughly a third of the topic partitions to every instance, so the whole topic content is spread over 3 state stores.
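The difference between the two kinds of store can be simulated in a few lines. This is a sketch under loud assumptions: the hash in `partitionFor` is a toy and not Kafka's actual partitioner, `instanceFor` uses a simple modulo rather than the real consumer group assignment, and all names are made up for the example.

```javascript
const PARTITIONS = 6;
const INSTANCES = 3;

// Toy deterministic hash (assumption: NOT the partitioner Kafka really uses).
function partitionFor(key) {
  let hash = 0;
  for (const ch of key) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  }
  return hash % PARTITIONS;
}

// Toy assignment of partitions to instances (assumption: simple modulo).
function instanceFor(partition) {
  return partition % INSTANCES;
}

// Twelve messages with distinct keys stand in for the topic content.
const topic = Array.from({ length: 12 }, (_, i) => ({ key: `key-${i}`, value: i }));

const regularStores = Array.from({ length: INSTANCES }, () => new Map());
const globalStores = Array.from({ length: INSTANCES }, () => new Map());

for (const { key, value } of topic) {
  // Regular store: only the instance that owns the key's partition sees it.
  regularStores[instanceFor(partitionFor(key))].set(key, value);
  // Global store: every instance sees every message.
  for (const globalStore of globalStores) {
    globalStore.set(key, value);
  }
}

console.log("regular store sizes:", regularStores.map((s) => s.size));
console.log("global store sizes:", globalStores.map((s) => s.size));
```

Each key ends up in exactly one regular store, while every global store holds all twelve keys.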
If you use such a regular state store and want to provide a REST service to query the state store content, you now face the problem that the data is spread over 3 different server instances. Therefore, you have to find out which instance of your application holds which partition of the Kafka topic. There is also an API available for that (at least in the Java version). So that is the second big and important part of the term interactive queries.
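The lookup-and-forward step might look roughly like the sketch below. Everything here is an assumption for illustration: the host names, the toy `hashKey` function, and the helpers `ownerOf`, `ingest` and `query` are hypothetical and do not mirror the Java API.

```javascript
// Hypothetical cluster of three application instances, each with a local store.
const instances = [
  { host: "app-0:8080", store: new Map() },
  { host: "app-1:8080", store: new Map() },
  { host: "app-2:8080", store: new Map() },
];

// Toy deterministic hash (assumption: not Kafka's real partitioning).
function hashKey(key) {
  let hash = 0;
  for (const ch of key) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  }
  return hash;
}

// Metadata lookup: which instance owns the data for this key?
function ownerOf(key) {
  return instances[hashKey(key) % instances.length];
}

// Write path: a message lands only in the owning instance's store.
function ingest(key, value) {
  ownerOf(key).store.set(key, value);
}

// Query path: any instance may receive the request. It answers from its own
// store if it owns the key; otherwise it forwards to the owner. In a real
// deployment the forward would be an HTTP call to owner.host.
function query(receivingInstance, key) {
  const owner = ownerOf(key);
  return {
    value: owner.store.get(key),
    servedBy: owner.host,
    servedLocally: owner === receivingInstance,
  };
}

ingest("order-42", "shipped");
console.log(query(instances[0], "order-42"));
```

Whichever instance receives the request, the caller gets the same answer; only the owning instance can answer without a network hop.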
You will find a great description of the whole concept here: https://docs.confluent.io/current/streams/developer-guide/interactive-queries.html
Hope this helps.
from kafka-streams.
Hi @Protoss78,
Your explanation is remarkably useful. You’re spot on in your second paragraph detailing what Kafka (at least from the Java perspective) provides in the form of interactive queries. It dispels the magic that Kafka packages up, makes us aware of the layers of abstraction, and in the end helps us understand that the core of what Kafka provides is publishing and subscribing to topics.
I’m starting to understand that this kafka-streams library sits above the pub/sub core of Kafka and augments it with a Most.js syntax for stream processing of topic data. I should also note that this stream processing (map, filter, etc.) occurs locally in each application instance as data comes in (e.g. a reduce operation does not combine results across multiple application instances).
The library does not come bundled with a state store or a global state store that mimics what the Kafka API in the Java world has. That does not stop you from building a state store for every application instance yourself, though. For example, I’ve been trying to build a Redis-backed state store that does a subset of what the Java Kafka API does.
I also quickly realised that some non-trivial operations, especially aggregations (e.g. calculating the mean of a stream of values in a distributed manner across multiple application instances), require much more planning and thought. This library simply does not give you the required options out of the box, though do correct me if I’m wrong.
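As a small worked example of why a distributed mean needs some thought: averaging the per-instance means gives the wrong answer when instances see different numbers of values, whereas combining per-instance sums and counts does not. The sketch below is one possible approach, not something this library provides; `partialAggregate`, `combine` and `mean` are hypothetical helpers.

```javascript
// Each instance keeps a partial aggregate (sum and count) of the values it saw.
function partialAggregate(values) {
  return values.reduce(
    (acc, v) => ({ sum: acc.sum + v, count: acc.count + 1 }),
    { sum: 0, count: 0 }
  );
}

// Partial aggregates from several instances can be merged by adding fields.
function combine(partials) {
  return partials.reduce(
    (acc, p) => ({ sum: acc.sum + p.sum, count: acc.count + p.count }),
    { sum: 0, count: 0 }
  );
}

function mean(aggregate) {
  return aggregate.count === 0 ? NaN : aggregate.sum / aggregate.count;
}

// Values seen by three hypothetical application instances.
const perInstanceValues = [[1, 2, 3], [10], [20, 30]];
const partials = perInstanceValues.map(partialAggregate);

// Correct: merge sums and counts first, then divide once.
const globalMean = mean(combine(partials));

// Naive: average the per-instance means (wrong when counts differ).
const meanOfMeans =
  partials.map(mean).reduce((a, b) => a + b, 0) / partials.length;

console.log({ globalMean, meanOfMeans });
```

The partial aggregates still have to be collected somewhere, for instance in a shared store or an extra topic, which is where the additional planning comes in.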
Thank you once again for your time and your splendid explanation.