nodefluent / kafka-streams

equivalent to kafka-streams :octopus: for nodejs :sparkles::turtle::rocket::sparkles:

Home Page: https://nodefluent.github.io/kafka-streams/

License: MIT License

JavaScript 3.76% Shell 2.02% CSS 1.21% TypeScript 93.01%
kafka streams stream-processing kafka-streams node nodejs big-data

kafka-streams's People

Contributors

dependabot[bot], gilesbradshaw, greenkeeper[bot], johnhalbert, jrgranell, krystianity, patricknazar, rob3000, rsilvestre, skellla, slcraciun, solaris007, wtrocki, yacut, zakjholt, zzswang


kafka-streams's Issues

Intermittent stringification issue

I have a Kafka producer and consumer.

The producer does this:

const returnMessage = {
    prop1: 'some string',
    prop2: 'another string',
    prop3: nestedObject
};
console.log(JSON.stringify(returnMessage));
await stream.writeToStream(JSON.stringify(returnMessage));

The consumer does this:

incomingStream.forEach(
    message => {
        console.log(message.value);
        let messageObject = message.value;
        // ...other stuff...
    }
);

Now, on the producer side, the return message is always logged as a proper string; everything is good. But on the consumer side, message.value is at first a proper string from which it's possible to parse JSON, but on subsequent requests it comes across as '[object Object]'.

I feel like I'm missing something crucial here...please help.
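For reference, a minimal sketch of one way to rule out double-handling of the payload, assuming the consumer should only ever see JSON strings: stringify explicitly on the producer and parse defensively on the consumer (stream, incomingStream and nestedObject are the names from the snippets above):

// Producer side: always hand writeToStream a string, never a raw object.
const payload = JSON.stringify({
    prop1: 'some string',
    prop2: 'another string',
    prop3: nestedObject
});
await stream.writeToStream(payload);

// Consumer side: parse defensively so a non-string value shows up loudly
// instead of silently becoming '[object Object]' somewhere downstream.
incomingStream.forEach(message => {
    const raw = message.value;
    if (typeof raw !== 'string') {
        console.warn('message.value is not a string:', raw);
    }
    const messageObject = typeof raw === 'string' ? JSON.parse(raw) : raw;
    // ...other stuff...
});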

Problem with chaining .merge()

I've run into a really strange problem when merging streams: merging requires a KafkaStreams reference on the left-hand merger. According to the documentation it should work. I've tried other methods as well; joining streams gives either the same error or a "not implemented" error.

Here's my code part:

const {KafkaStreams} = require("kafka-streams");
const kafkaStreams = new KafkaStreams(config.kafkas);

const stream_1 = kafkaStreams.getKStream(config.kafka.topics[0]);
const stream_2 = kafkaStreams.getKStream(config.kafka.topics[1]);
const stream_3 = kafkaStreams.getKStream(config.kafka.topics[2]);
const stream_4 = kafkaStreams.getKStream(config.kafka.topics[3]);
var mergedStream = stream_1
  .merge(stream_2)
  .merge(stream_3) // Error arises at this line
  .merge(stream_4);

According to the documentation, .merge(stream) should return an object that can be treated as either a stream or a table. Also, .merge(stream) should work fine on a table as well.

I use [email protected], [email protected] and [email protected]

Update: creating a new KafkaStreams instance for each stream and giving them unique groupIds didn't solve the problem.

Allow injecting custom Kafka client

I'm one of the developers of kafkajs. I've been wanting to provide a Kafka Streams-like API on top of it for some time now. Looking at this library, the interface between it and the underlying Kafka client library is quite small, so it should be perfectly possible to let users inject their own client, provided it implements the expected interface.

Would you be open to accepting a PR that adds a config.clientFactory option that would accept a function returning something extending KafkaClient, which would be used instead of instantiating JSKafkaClient/NativeKafkaClient when provided?
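A rough sketch of how the proposed option could look from the caller's side; config.clientFactory, KafkajsClient and the factory signature are all hypothetical and only illustrate the idea:

const { KafkaStreams } = require("kafka-streams");
// Hypothetical adapter that implements the expected KafkaClient interface
// on top of kafkajs.
const { KafkajsClient } = require("./kafkajs-client");

const kafkaStreams = new KafkaStreams({
    // ...usual config...
    // Proposed: if present, used instead of JSKafkaClient / NativeKafkaClient.
    clientFactory: (topic, config) => new KafkajsClient(topic, config)
});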

KTable State Persistence

I am having a problem using KTables.

I have a KTable reading from a topic, and I need to compare all incoming key-value messages with the latest stored key-value message with the same key. I need the state, which maintains the latest stored message for every key in the KTable, to persist when the topic changes, or when the application stops and is restarted again.

What I am trying to do is possible and a common use case of the Java Kafka Streams library. I notice that when trying to use KTables in node kafka-streams, no new Kafka internal topics are created in my cluster to maintain the KTable internal state (Kafka should create internal topics for KTable state).

How can I get Kafka to create the internal topics? Is there something I need to change about my Kafka config when creating the KTables, e.g. to ensure a consistent application ID?

Do I need to pass a KStorage argument to the KTable constructor? How would it be persisted in between restarting my application?

Is the LastState class relevant here?

I would appreciate any guidance in solving this and answering some of my questions.

Thanks.
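For what it's worth, a minimal sketch of the direction I was imagining a custom storage could take, assuming KStorage is exported by the package, keeps its key/value map on this.state, returns Promises from set(), and can be wired in via a storageClass config option (all of these are assumptions, not confirmed API):

const fs = require("fs");
const { KafkaStreams, KStorage } = require("kafka-streams"); // assumes KStorage is exported

// Hypothetical storage that snapshots the table state to a local file on every
// set() and reloads it on construction, so the latest value per key survives a
// restart. A real implementation would use a proper store and handle errors.
class FileBackedStorage extends KStorage {
    constructor(options) {
        super(options);
        this.file = "./ktable-state.json";
        if (fs.existsSync(this.file)) {
            this.state = JSON.parse(fs.readFileSync(this.file, "utf8"));
        }
    }
    set(key, value) {
        return super.set(key, value).then(result => {
            fs.writeFileSync(this.file, JSON.stringify(this.state));
            return result;
        });
    }
}

const kafkaStreams = new KafkaStreams({
    // ...client config...
    storageClass: FileBackedStorage // assumption: config-level storage override
});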

Is it possible to branch a stream?

Is it possible to branch a stream, like the Java counterpart?

KStream<String, Long> stream = ...;
KStream<String, Long>[] branches = stream.branch(
    (key, value) -> key.startsWith("A"), /* first predicate  */
    (key, value) -> key.startsWith("B"), /* second predicate */
    (key, value) -> true                 /* third predicate  */
);

// KStream branches[0] contains all records whose keys start with "A"
// KStream branches[1] contains all records whose keys start with "B"
// KStream branches[2] contains all other records

// Java 7 example: cf. filter for how to create Predicate instances
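For reference, a later issue in this list ("How do I produce to multiple topics?") calls a .branch() method on KStream with an array of predicates. Assuming that API (and assuming each predicate receives the message, which is not shown there), a rough JavaScript equivalent of the Java example would be:

const stream = kafkaStreams.getKStream("input-topic");

// Assumption: .branch() returns one child stream per predicate, in order.
const [startsWithA, startsWithB, everythingElse] = stream.branch([
    message => String(message.key).startsWith("A"),
    message => String(message.key).startsWith("B"),
    () => true
]);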

Error importing kafka-streams

When importing the library with:

const { KafkaStreams } = require('kafka-streams')

I get the following error:

node_modules/sinek/lib/librdkafka/NConsumer.js:921
  async getLagStatus(noCache = false){
        ^^^^^^^^^^^^

SyntaxError: Unexpected identifier

This is due to the use of async in the Sinek Library.

I've tried with Node 8, 9 and 10, recompiling my node_modules each time.
I've even tried with babel-register (and babel-polyfill), but making it also compile this library:

require('babel-register')({
  ignore: filename => !(!/\/node_modules\//.test(filename) || /\/node_modules\/sinek\//.test(filename)),
})
require('babel-polyfill') // both with and without this
const { KafkaStreams } = require('kafka-streams')

But it errors elsewhere:

node_modules/sinek/lib/librdkafka/Health.js:139
        return super.createCheck(STATES.UNCONNECTED, MESSAGES.UNCONNECTED);
               ^^^^^

SyntaxError: 'super' keyword unexpected here

Obviously this prevents me (and possibly other users) from even starting to use this library!
Any help would be greatly appreciated.

An in-range update of bluebird is breaking the build 🚨

The dependency bluebird was updated from 3.5.4 to 3.5.5.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

bluebird is a direct dependency of this project, and it is very likely causing it to break. If other packages depend on yours, this update is probably also breaking those in turn.

Status Details
  • ❌ continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error (Details).

Release Notes for v3.5.5

Features:

  • Added Symbol.toStringTag support to Promise (#1421)

Bugfixes:

  • Fix error in IE9 (#1591, #1592)
  • Fix error with undefined stack trace (#1537)
  • Fix #catch throwing an error later rather than immediately when passed non-function handler (#1517)
Commits

The new version differs by 7 commits.

See the full diff

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

An in-range update of debug is breaking the build 🚨

Version 3.2.3 of debug was just published.

Branch Build failing 🚨
Dependency debug
Current Version 3.2.2
Type dependency

This version is covered by your current version range and after updating it in your project the build failed.

debug is a direct dependency of this project, and it is very likely causing it to break. If other packages depend on yours, this update is probably also breaking those in turn.

Status Details
  • ❌ continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error (Details).

Commits

The new version differs by 1 commit.

  • 700a010 re-introduce node.js (root file) (ref #603)

See the full diff

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

Produce messages as Objects

Hi there,
I am trying to produce some messages and consume them later. When the message is a string everything is OK, but when I try to send an object the message doesn't get sent and I don't even get an error of any kind...
Here is my code (I am using the same config as the one in the tests folder)

'use strict';

const {KafkaStreams} = require('kafka-streams');
const {nativeConfig: config} = require('./config');

const kafkaStreams = new KafkaStreams(config);
//creating a stream without topic is possible
//no consumer will be created during stream.start()
const stream = kafkaStreams.getKStream(null);

//define a topic to stream messages to
stream.to('my-output-topic');


stream.start().then(() => {
  stream.writeToStream({
    x: 'x',
    y: 'y',
    z: 'z'
  });
});

Can you please help on this? Am I doing anything wrong?
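One thing worth trying, given that plain strings go through fine: serialize the object yourself before handing it to writeToStream (a small sketch, on the assumption that the default produce path expects a string or Buffer value):

stream.start().then(() => {
    stream.writeToStream(JSON.stringify({
        x: 'x',
        y: 'y',
        z: 'z'
    }));
});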

KTable

Hey Team

I am new to Kafka and Kafka Streams.
I want to create a KTable from my topic and fetch all the data present in the topic (the messages can be a week old), then filter the messages based on their payload. Can you please help me understand how to achieve that? I would also like to know how long a KTable can store data.

Any reply and help would be appreciated

Thanks

An in-range update of jsdoc is breaking the build 🚨

The devDependency jsdoc was updated from 3.6.1 to 3.6.2.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

jsdoc is a devDependency of this project. It might not break your production code or affect downstream projects, but probably breaks your build or test tools, which may prevent deploying or publishing.

Status Details
  • ❌ continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error (Details).

Commits

The new version differs by 4 commits.

  • 0e468af 3.6.2
  • d5e0eb0 Add 3.6.2 changelog.
  • 61ae11c Ensure that ES 2015 classes appear in the generated docs when they're supposed to. (#1644)
  • 03b8abd Add 3.6.1 changelog.

See the full diff

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

An in-range update of bluebird is breaking the build 🚨

The dependency bluebird was updated from 3.5.3 to 3.5.4.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

bluebird is a direct dependency of this project, and it is very likely causing it to break. If other packages depend on yours, this update is probably also breaking those in turn.

Status Details
  • ❌ continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error (Details).

Release Notes for v3.5.4
  • Proper version check supporting VSCode(#1576)
Commits

The new version differs by 6 commits.

  • e0222e3 Release v3.5.4
  • 4b9fa33 missing --expose-gc flag (#1586)
  • 63b15da docs: improve and compare Promise.each and Promise.mapSeries (#1565)
  • 9dcefe2 .md syntax fix for coming-from-other-languages.md (#1584)
  • b97c0d2 added proper version check supporting VSCode (#1576)
  • 499cf8e Update jsdelivr url in docs (#1571)

See the full diff

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

Request for releasing global KTables in the next version

Hi,
The current KTable in the module is quite impressive and we are using it at large scale in our microservices. But since the KTable concept itself is not persistent, and the "local" KTable instance of each application instance is populated with data from only one of the topic's x partitions, I am not able to load-balance my applications.

A global KTable would solve this problem by making the data persistent and available to all application instances.

Please make this feature available in the next version.

Mutating message value with version property when using 'buffer' as produceType and native client

Hi,

When producing a JSON object using the 'buffer' produceType and the native client, the message value may be mutated when a version property is not supplied. AFAIK a version property on a message value is not required as part of the Kafka protocol (please correct me if I'm mistaken), therefore setting version = 1 (https://github.com/nodefluent/kafka-streams/blob/master/lib/messageProduceHandle.js#L39) has an unintentional side effect when the message is produced by the native client (https://github.com/nodefluent/node-sinek/blob/master/lib/librdkafka/NProducer.js#L375).

In our implementation we produce and then transform a message a number of times, and we validate the integrity of the message using a hash of the original message; we do not use the version property on the message. This integrity check fails because kafka-streams / node-sinek adds the version property.

E.g.

const message = {
    key: '1',
    value: {
        name: 'test'
    }
}

const hash = myHashingMethod(message);

....

consumerStream
    .from('my_topic')
    .forEach(msg => {
        // Fails
        const newHash = myHashingMethod(msg);
        expect(newHash).to.eql(hash);
    });

producerStream
    .to({ topic: 'my_topic', producerType: 'buffer' })
    .writeToStream(message);

We are currently migrating from Kafka 0.9 to 1.1 and wanted to ask: does the Kafka protocol, when using JSON data, require a version property on the message value?

Thanks,

Question

If I'm not mistaken, a stream ending in .forEach() returns a Promise that resolves when all events have been consumed, whereas a stream ending in .to() resolves its Promise right in the middle of the stream being initialised.

For my use cases I would much prefer .to() to also resolve its Promise only after the events have been consumed. Any advice on this?

Example code snippet with Promise resolving too early:

const someStream = kstream.from("input-topic")
    .take(1)
    .to("output-topic")

someStream.then(() => console.log("finished"))

kstream.start();

TypeError: createSubject is not a function @ examples/window.js

TypeError: createSubject is not a function
    at new Window (/root/kafkastream/node_modules/kafka-streams/lib/actions/Window.js:13:21)
    at KStream.window (/root/kafkastream/node_modules/kafka-streams/lib/dsl/KStream.js:286:24)
    at Object.<anonymous> (/root/kafkastream/window.js:65:41)
    at Module._compile (internal/modules/cjs/loader.js:702:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:713:10)
    at Module.load (internal/modules/cjs/loader.js:612:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:551:12)
    at Function.Module._load (internal/modules/cjs/loader.js:543:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:744:10)
    at startup (internal/bootstrap/node.js:238:19)
    at bootstrapNodeJSCore (internal/bootstrap/node.js:572:3)

In 'kafka-streams\lib\actions\Window.js':

const {async: createSubject} = require("most-subject");

I searched most-subject; there is no async export there.

Can I access the high level producer through this library?

I have an issue: when I write to a topic, I would like messages to be automatically distributed across its partitions, however many there may be. At the moment, I have to configure the number of partitions manually with stream.to([stream name], [number of partitions]). I would like to use the HighlevelProducer available in Kafka for this. Is this possible?

Questions about KTables

Hi,

I'm very new to Kafka and streaming platforms so I apologize if my understanding is not correct. One use-case I see for using KTables is to handle external queries (e.g. from a GUI) without having to store the state of my application in a separated database (thus, Kafka is the single source of truth).

The thing I don't understand is the purpose of methods such as consumeUntilCount() or consumeUntilMs(). If I want the table to represent the current state of my application, I have to read all the messages from the topic, from the oldest to the newest message, don't I? And as consumeUntilLastOffset() is not implemented yet, how am I supposed to do this?

[q] How to add messages to stream

Hello. Brief problem description: at some point I receive a message. It is then processed and, as a result, I need to push two messages into the output topic. What is the best way to do that?

eventStream
        .from('intopic')
        .map(message => {
           // do something
           // send two messages into the same topic
        })
        .to('outtopic')

I've tried to use writeMessage, but it adds the message to the start of the stream.
Is there perhaps a way to call the producer's send method directly?
Or is it better to create a standalone producer (or stream), write the additional messages into it, and after .map merge/concat it with the existing stream?
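A minimal sketch of the standalone-producer variant, reusing the topic-less stream pattern that appears in other issues here (getKStream(null), .to(), writeToStream); whether this is the intended way to fan out extra messages is an assumption:

// Separate producer-only stream; no consumer is created for a null topic.
const producerStream = kafkaStreams.getKStream(null);
producerStream.to('outtopic');

eventStream
    .from('intopic')
    .map(message => {
        // do something with the original message, then emit two extra messages
        producerStream.writeToStream(JSON.stringify({ derived: 1, source: message }));
        producerStream.writeToStream(JSON.stringify({ derived: 2, source: message }));
        return message;
    })
    .to('outtopic');

Promise.all([producerStream.start(), eventStream.start()]);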

Commit offset only after processing and producing to a different topic was successful

Hi,

I have the following Stream processing use case:

  1. Consume a message
  2. Do some processing on that message
  3. Produce the result to a different topic

After step 3 is successful (i.e. producer message got successfully written to Kafka and acknowledged), I'd like to move the offset for the consumer but not before.

If I use the .map/.to parts of this library, does this library automatically (or through configuration) update the offset only AFTER produce was successful? Or do I have to manually call the Native consumer.commitMessage() within node-rdkafka which would require listening on producer's delivery-reports event too?

Thanks!

KTable methods

I'd like to periodically generate a KTable from a topic, then go through it and aggregate the rows.

My thinking on this was that I would do a setInterval(), and in the callback, do table.consumeUntilLatestOffset(), with another callback inside to loop through the rows with a forEach().

However, table.consumeUntilLatestOffset() throws an error: "not implemented yet".

What can I do instead?
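Until consumeUntilLatestOffset() exists, one stopgap is to consume for a fixed time window with consumeUntilMs() (mentioned in the KTable questions above) and then aggregate whatever was consumed. A sketch, assuming getKTable(topic, mapper) is the constructor signature and that the forEach() Promise resolves once the window ends the stream:

const table = kafkaStreams.getKTable('my-topic', message => ({
    key: message.key.toString(),
    value: JSON.parse(message.value)
}));

// Assumption: stops the underlying consumer after 30 seconds.
table.consumeUntilMs(30 * 1000);

const rows = [];
const done = table.forEach(row => rows.push(row));

table.start();

done.then(() => {
    // aggregate `rows` here, once the stream has ended
});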

Async map support and awaitPromises.

Hi.
In most.js there's a method called .awaitPromises() that allows map() for example to return promises.
The result I want to achieve is:

stream
  .from('input')
  .map(async () => {
    return 'value'
  })
  .awaitPromises() 
  .to('output')

However, in the kafka-streams module .awaitPromises() does not work and throws TypeError: stream.map(...).awaitPromises is not a function.

So is there any way to make promises work?

I've tried to use asyncMap instead of map, but without any success.

Update:
Adding a new method to StreamDSL seems to solve the problem, and the code above works as expected.

awaitPromises(etl) {
    this.stream$ = this.stream$.awaitPromises(etl);
    return this;
}

This will also work with filters and other methods that return promises.

Buffers are converted to strings

Hi

I'm trying to

let outStream = kafkaStreams.getKStream(null);
await outStream.to({ topic: 'output-topic' });

outStream.writeToStream(Buffer.from([ 1, 2, 3, 4, 5 ]));

However, when I am reading the data back in using another stream I end up with a "string" representation of the value.

{ topic: 'output-topic',
  value: '\u0001\u0002\u0003\u0004\u0005',
  offset: 8,
  partition: 0,
  highWaterOffset: 9,
  key: null }

Am I doing something incorrectly to cause this to happen? I would like to work with the raw buffer so that I can utilize a custom encoding mechanism.
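Other issues in this list pass 'buffer' as a produce type to .to(), either as a third positional argument (.to('out', 1, 'buffer')) or as a field on an options object. Assuming that option controls how the value is encoded on the way out (which is not confirmed here), it may be worth trying:

let outStream = kafkaStreams.getKStream(null);
// Assumption: the third argument selects the produce type, mirroring the
// .to('out', 1, 'buffer') usage in the multi-topic issue further down.
await outStream.to('output-topic', 1, 'buffer');

outStream.writeToStream(Buffer.from([1, 2, 3, 4, 5]));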

Calling .consume() is not required in streaming mode.

Any ideas?

Calling .consume() is not required in streaming mode. Error: Calling .consume() is not required in streaming mode.
at NConsumer.consume (/Users/giles/Projects/msd-2-0/src/apps/api/node_modules/sinek/lib/librdkafka/NConsumer.js:496:29)
at consumer.connect.then (/Users/giles/Projects/msd-2-0/src/apps/api/node_modules/kafka-streams/lib/client/NativeKafkaClient.js:112:35)
at processImmediate (timers.js:632:19)
From previous event:
at NativeKafkaClient.once (/Users/giles/Projects/msd-2-0/src/apps/api/node_modules/kafka-streams/lib/client/NativeKafkaClient.js:103:69)
at Object.onceWrapper (events.js:276:13)
at NativeKafkaClient.emit (events.js:188:13)
at NativeKafkaClient.start (/Users/giles/Projects/msd-2-0/src/apps/api/node_modules/kafka-streams/lib/client/NativeKafkaClient.js:121:19)
at KStream._start (/Users/giles/Projects/msd-2-0/src/apps/api/node_modules/kafka-streams/lib/dsl/KStream.js:106:20)
at KStream.start (/Users/giles/Projects/msd-2-0/src/apps/api/node_modules/kafka-streams/lib/dsl/KStream.js:61:21)

Can't get examples to work

Hi,

I've been trying to get the examples provided in the repository to work on my local machine without any luck. Whenever I try and start any of them I'm presented with the following error:

INFO @ 2017-10-26T08:14:36.640Z : [log4bro] Logger is: in-prod=false, in-docker:false, level=INFO, skipDebug=false
Unhandled rejection Error: One of the following: zkConStr or kafkaHost must be defined.

It's using the test-config.js provided by the repository, which has kafkaHost defined & uncommented by default. Even commenting out Kafka and using ZK has the same issue.

Any advice?

Thanks.

TypeError: Cannot read property 'run' of undefined at examples/window.js

TypeError: Cannot read property 'run' of undefined
    at Merge.run (/home/root/projects/dev/kafka-streams/node_modules/most/lib/combinator/merge.js:95:40)
    at Slice.run (/home/root/projects/dev/kafka-streams/node_modules/most/lib/combinator/slice.js:94:40)
    at Tap.run (/home/root/projects/dev/kafka-streams/node_modules/most/lib/combinator/transform.js:67:22)
    at runSource (/home/root/projects/dev/kafka-streams/node_modules/most/lib/runSource.js:39:35)
    at /home/root/projects/dev/kafka-streams/node_modules/most/lib/runSource.js:31:5
    at new Promise (<anonymous>)
    at withScheduler (/home/root/projects/dev/kafka-streams/node_modules/most/lib/runSource.js:30:10)
    at withDefaultScheduler (/home/root/projects/dev/kafka-streams/node_modules/most/lib/runSource.js:26:10)
    at drain (/home/root/projects/dev/kafka-streams/node_modules/most/lib/combinator/observe.js:36:46)
    at observe (/home/root/projects/dev/kafka-streams/node_modules/most/lib/combinator/observe.js:26:10)
    at Stream._Stream2.default.observe._Stream2.default.forEach (/home/root/projects/dev/kafka-streams/node_modules/most/lib/index.js:209:31)
    at KStream.forEach (/home/root/projects/dev/kafka-streams/lib/dsl/StreamDSL.js:271:29)
    at Object.<anonymous> (/home/root/projects/dev/kafka-streams/examples/window.js:20:6)
    at Module._compile (module.js:652:30)
    at Object.Module._extensions..js (module.js:663:10)
    at Module.load (module.js:565:32)

(node:9247) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 3)
(node:9247) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
(node:9247) UnhandledPromiseRejectionWarning: TypeError: window.getStream(...).error is not a function
    at stream$.skipWhile.takeWhile.until.tap.drain.then.catch.e (/home/root/projects/dev/kafka-streams/lib/dsl/KStream.js:321:44)
    at <anonymous>
    at runMicrotasksCallback (internal/process/next_tick.js:121:5)
    at _combinedTickCallback (internal/process/next_tick.js:131:7)
    at process._tickCallback (internal/process/next_tick.js:180:9)
    at Function.Module.runMain (module.js:695:11)
    at startup (bootstrap_node.js:188:16)
    at bootstrap_node.js:609:3

How to specify a key to stream's messages from .map method

Hello.

In the example here, while processing a message in the .map() method, the key and value are both added only to the value:
https://github.com/nodefluent/kafka-streams/blob/master/examples/fieldSum.js#L12

Also, I didn't find a place in the source code where the key for a message can be specified.
For example:
https://github.com/nodefluent/kafka-streams/blob/master/lib/StreamDSL.js#L104
https://github.com/nodefluent/kafka-streams/blob/master/lib/KStream.js#L95
There is only a 'value' parameter for sending messages.

But the key is present in the received message object.

Is there any way to specify a key for a sent message?

Thank you.

Accessing StateStore

I'm new to Kafka and I'm trying to understand how this article https://www.confluent.io/blog/building-a-microservices-ecosystem-with-kafka-streams-and-ksql/ relates to kafka-streams.
The article talks about being able to interactively query the state store (one of the ways is to use KSQL).
I’ve been reading tons of resources about Kafka and I still can’t put the pieces together on how to use kafka-streams to implement the concepts that I’ve learnt.
For a start, how do you actually use kafka-streams to query the state store? Any help in any direction is greatly appreciated.

Can the map function do async work?

Hi,

Is it possible to do a DB lookup in the function that is passed to map? DB lookup will mean async calls to the DB. This is different in JS than in Java, hence the question.

-Yash
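An earlier issue in this list ("Async map support and awaitPromises.") patched an awaitPromises() passthrough onto StreamDSL. Assuming that addition (it is not part of the released API) and a hypothetical async db.findById() call, an async lookup inside map could look like this:

stream
    .from('input')
    .map(async message => {
        const record = await db.findById(message.value); // hypothetical async DB call
        return JSON.stringify({ original: message.value, record });
    })
    .awaitPromises() // relies on the StreamDSL patch from the issue above
    .to('output');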

An in-range update of sinek is breaking the build 🚨

The dependency sinek was updated from 6.22.0 to 6.22.1.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

sinek is a direct dependency of this project, and it is very likely causing it to break. If other packages depend on yours, this update is probably also breaking those in turn.

Status Details
  • ❌ continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error (Details).

Commits

The new version differs by 5 commits.

  • 6c803a5 6.22.1
  • 8262b76 Merge pull request #73 from nodefluent/add-auto-option-to-type-defintion
  • 6276c7f Merge pull request #74 from nodefluent/optional-identifier-in-nproducer-type-definition
  • fff9268 The identifier parameter in NProducer is optional
  • c3338a2 Add "auto" as value for defaultPartitionCount in NProducer

See the full diff

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

Question: How do I consume from multiple topics?

The changelog for v4 lists the ability to provide the getKStream() method with multiple topics, but this doesn't seem to work:

import { KafkaStreams } from 'kafka-streams';

const kafkaStreams = new KafkaStreams({...});

const inputTopics = ['t1', 't2'];
const inputStream = kafkaStreams.getKStream(inputTopics);

inputStream.forEach(message => console.log(message));

inputStream.start();

I've ensured both topics (t1 and t2) have messages to read from, but the output always contains messages from only one of them.

Connecting to multiple brokers

Hi I have

const kafkaStreams = new KafkaStreams(config)

with

config = {
  noptions: {
    'metadata.broker.list': 'broker:9092, broker1:9093, broker2:9094',
    'group.id': 'api',
    'client.id': 'api',
    event_cb: true,
    'api.version.request': true,
    'compression.codec': 'snappy',

    'socket.keepalive.enable': true,
    'socket.blocking.max.ms': 100,

    'enable.auto.commit': true,
    'auto.commit.interval.ms': 1000,

    'heartbeat.interval.ms': 1000,
    'retry.backoff.ms': 250,

    'fetch.min.bytes': 100,
    'fetch.message.max.bytes': 2 * 1024 * 1024,
    'queued.min.messages': 100,

    'fetch.error.backoff.ms': 100,
    'queued.max.messages.kbytes': 50,

    'fetch.wait.max.ms': 1000,
    'queue.buffering.max.ms': 1000,

    'batch.num.messages': 10000,
  },
  tconf: {
    // 'auto.offset.reset': 'earliest',
    'request.required.acks': 1,
  },
}

However, if any one of those brokers is down I get an error from

stream.start(
        () => {
          console.log(`${topic} stream started, as kafka consumer is ready.`)
        }, (error) => {
          console.log(error.message, error.stack)

of

Error: broker transport failure
at NConsumer.stream.start.error (/Users/giles/Projects/msd-2-0/src/apps/api/build/index.js:1112:14)
at NConsumer.emit (events.js:193:15)
at KafkaConsumer.consumer.on.error (/Users/giles/Projects/msd-2-0/src/apps/api/node_modules/sinek/lib/librdkafka/NConsumer.js:364:15)
at KafkaConsumer.emit (events.js:193:15)
at eventHandler (/Users/giles/Projects/msd-2-0/src/apps/api/node_modules/node-rdkafka/lib/client.js:69:16)

Should it not work OK if not all brokers are up?

thanks

Inner Join on KStreams

Hello, I am trying to do an Inner Join (without windowing) in KStream but I can't make it work. Could you provide some example on how to use them properly? This is the example code I have tested:

function keyValueMapperEtl(message){
    const val = JSON.parse(message.value);
    return {
        key: message.key.toString(),
        value: val.number,
    };
}

const st1 = kafkaStreams
    .getKStream("testA_out")
    .map(keyValueMapperEtl);

const st2 = kafkaStreams
    .getKStream("testB_out")
    .map(keyValueMapperEtl);

const st3 = st1.innerJoin(st2);
st3.forEach((v) => console.log(v));  // key mismatch

st1.start();
st2.start();

Node version requirements in docs seem to be incorrect

The docs say the requirement is:

nodejs should be version >= 6.10, suggested: >= 8.6.x

But the dependency on sinek appears to require async/await, which requires at least Node 7.6. Is there a workaround that I'm missing?

TypeScript

This seems to be the kind of project that would benefit very much from TypeScript.

I'm saying this based on the fact that this is already sort of a port of Java code, it's class- and interface-based, it's a library, and it should be well documented and easy to use.

What are your feelings about the subject?

Partitioner?

@krystianity I'm using the native producer, and have set up my options like this -

"options": {
      "noptions": {
        "metadata.broker.list": "...",
		"...": "..."
      },
      "tconf": {
        "partitioner": "murmur2"
      }
}

The murmur2 partitioner doesn't seem to work, and all my messages (with unique-enough keys) end up on partition 0. Is this the correct way of doing this?

An in-range update of sinek is breaking the build 🚨

The dependency sinek was updated from 6.23.3 to 6.23.4.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

sinek is a direct dependency of this project, and it is very likely causing it to break. If other packages depend on yours, this update is probably also breaking those in turn.

Status Details
  • ❌ continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error (Details).

Commits

The new version differs by 4 commits.

  • bbfeb21 fixed missing return type, vb
  • f082261 Merge pull request #80 from nodefluent/holgeradam-patch-1
  • 1120ff9 Fixes missing return time for tombstone function
  • 79cb47a Update index.d.ts

See the full diff

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

Why doesn't the consumer get a message that was just produced?

Hi, I'm just starting to use kafka-streams. I'm trying to create a basic pub/sub over one topic.
My strategy is:

  • send message X to topic "kafka-test" (producer)
  • receive message X from topic "kafka-test" (consumer)

Following the examples in this repo, I've created a basic test app:

const {KafkaStreams} = require('kafka-streams')
const kafkaStreams = new KafkaStreams({
    kafkaHost: 'localhost:9092',
    logger: {
        debug: msg => console.log('logger', msg),
        info : msg => console.log('logger', msg),
        warn : msg => console.log('logger', msg),
        error: msg => console.log('logger', msg)
    },
    groupId: 'kafka-streams-test' + Math.random(),
    clientName: 'kafka-streams-test-name' + Math.random(),
    workerPerPartition: 1,
    options: {
        sessionTimeout: 8000,
        protocol: ['roundrobin'],
        fromOffset: 'latest',//'earliest', //latest
        fetchMaxBytes: 1024 * 100,
        fetchMinBytes: 1,
        fetchMaxWaitMs: 10,
        heartbeatInterval: 250,
        retryMinTimeout: 250,
        autoCommit: false,
        autoCommitIntervalMs: 1000,
        requireAcks: 1,
        ackTimeoutMs: 100,
        partitionerType: 3
    }
})


const topic = 'kafka-test'
const c = kafkaStreams.getKStream(topic)

c.forEach(m => console.log('consumer message', m))

c.start().then(
    () => console.log('consumer started'),
    e  => console.log('consumer error', e)
)

setTimeout(() => {
    const p = kafkaStreams.getKStream(null)
    p.to(topic)
    p.start().then(
        () => {
            console.log('producer started')
            p.writeToStream(Math.random() + 'ppppp')
        },
        e => console.log('producer error', e)
    )
}, 3000)

STDOUT:

logger starting ConsumerGroup for topic: ["kafka-test"]
logger [Drainer] started drain process.
(node:6659) DeprecationWarning: Kafka is deprecated, please use 'NConsumer' if possible.
(node:6659) DeprecationWarning: Drainer is deprecated, please use 'NConsumer' if possible.
logger consumer is connected / ready.
consumer started
logger starting Producer.
logger [Publisher] buffer disabled.
(node:6659) DeprecationWarning: Publisher is deprecated, please use 'NProducer' if possible.
logger producer is connected.
logger producer ready fired.
logger producer is ready.
logger meta-data refreshed.
producer started
logger producer is connected.

An in-range update of debug is breaking the build 🚨

The dependency debug was updated from 4.1.0 to 4.1.1.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

debug is a direct dependency of this project, and it is very likely causing it to break. If other packages depend on yours, this update is probably also breaking those in turn.

Status Details
  • ❌ continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error (Details).

Commits

The new version differs by 4 commits.

  • 68b4dc8 4.1.1
  • 7571608 remove .coveralls.yaml
  • 57ef085 copy custom logger to namespace extension (fixes #646)
  • d0e498f test: only run coveralls on travis

See the full diff

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

An in-range update of bluebird is breaking the build 🚨

The dependency bluebird was updated from 3.5.2 to 3.5.3.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

bluebird is a direct dependency of this project, and it is very likely causing it to break. If other packages depend on yours, this update is probably also breaking those in turn.

Status Details
  • ❌ continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error (Details).

Release Notes for v3.5.3

Bugfixes:

  • Update acorn dependency
Commits

The new version differs by 7 commits.

  • a5a5b57 Release v3.5.3
  • c8a7714 update packagelock
  • 8a765fd Update getting-started.md (#1561)
  • f541801 deps: update acorn and add acorn-walk (#1560)
  • 247e512 Update promise.each.md (#1555)
  • e2756e5 fixed browser cdn links (#1554)
  • 7cfa9f7 Changed expected behaviour when promisifying (#1545)

See the full diff

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

An in-range update of global is breaking the build 🚨

The dependency global was updated from 4.3.2 to 4.4.0.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

global is a direct dependency of this project, and it is very likely causing it to break. If other packages depend on yours, this update is probably also breaking those in turn.

Status Details
  • ❌ continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error (Details).

Commits

The new version differs by 3 commits.

  • 91c4362 4.4.0
  • fbcec09 Merge pull request #9 from shinnn/dep
  • e458e7f Update process from v0.5.x to v0.11.x

See the full diff

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

Creating new topic with multiple partitions

I am trying to produce messages to a new topic using the "to" method:

stream.to({"topic":options.topic, "outputPartitionsCount":5});

This works fine, except the new topic is always created with 1 partition.

Is there a way to create a topic with more than one partition?

Thank you!

How do I produce to multiple topics?

I want to consume from a single topic and produce to multiple topics.

const { KafkaStreams } = require('kafka-streams');

const kafkaStreams = new KafkaStreams({
    kafkaHost: "127.0.0.1:32771",
    logger: {
      debug: msg => console.log(msg),
      info: msg => console.log(msg),
      warn: msg => console.log(msg),
      error: msg => console.error(msg)
    },
    groupId: "kafka-streams-test",
    clientName: "kafka-streams-test-name",
    workerPerPartition: 1,
    options: {
        sessionTimeout: 8000,
        protocol: ["roundrobin"],
        fromOffset: "earliest",
        fetchMaxBytes: 1024 * 100,
        fetchMinBytes: 1,
        fetchMaxWaitMs: 10,
        heartbeatInterval: 250,
        retryMinTimeout: 250,
        autoCommit: true,
        autoCommitIntervalMs: 1000,
        requireAcks: 1,
        ackTimeoutMs: 100,
        partitionerType: 3
    }
});

const stream = kafkaStreams.getKStream('in');

const x = stream.branch([()=>{ return true; }, ()=>{ return true; }]);

x[0]
  .mapJSONConvenience()
  .mapWrapKafkaValue()
  .tap(console.log)
  .wrapAsKafkaValue()
  .to('out', 1, 'buffer');

x[1]
  .mapJSONConvenience()
  .mapWrapKafkaValue()
  .tap(console.log)
  .wrapAsKafkaValue()
  .to('test', 2, 'buffer');

stream.start().then(() => {
    console.log("stream started, as kafka consumer is ready.");
}, error => {
    console.log("streamed failed to start: " + error);
});

I'm using ./kafka-console-consumer --bootstrap-server 127.0.0.1:32771 --topic out and ./kafka-console-consumer --bootstrap-server 127.0.0.1:32771 --topic test to listen to my two outgoing streams. I can consume data from my out topic, but I'm only receiving "[object Object]" from my test topic.

node-rdkafka

Hi,

I am using node-rdkafka with SASL Kerberos. I would like to use Avro schemas with it, but it doesn't seem to support them. Does kafka-streams support Avro, and is it production-ready?

Looking forward to hearing from you. Thank you in advance.

Best regards,
D Tanna
