Comments (14)
Thanks for reporting. I'll try to reproduce this.
from amazon-kinesis-client.
Hi eyesoftime,
I'm not able to reproduce this. Can you provide the steps which you took to produce the problem?
What language are you using to process the records? Are you using one of the official multilang KCLs?
Thanks
I was loading records into the stream, each one about 130KB of arbitrary data with indexes applied for tracking. Initially I started out with one shard. After about 2500 records I split the shard into two, and after another 2500 records I merged the new shards. So there are 4 shards altogether. There were no consumers running at the time (but that doesn't really change the outcome).
So then I started one daemon with the Python sample application (with additional logging added, again for tracking). While it was consuming records from the first shard, I started the second daemon, which then sat idle because the first shard hadn't been fully consumed yet and the second and third are child shards of the first. When the consumer of the first shard reached the end, the first daemon died with the NPE. It happened repeatedly, both when running the daemons on the same EC2 instance and when running one of them in parallel on my local machine. The same thing happened when the test was done 2-1-2 with shards, i.e. merging and then splitting. In that case it also died when going from one shard to two.
Hope it helps you.
Thanks for the information. I did not do the shard merge in my own test, so that might be the problem. I will do so to see if that reproduces the problem.
I have reproduced the problem. The problem isn't with MultiLangRecordProcessor per se, but rather with the Worker implementation.
A Worker will sometimes call shutdown on an IRecordProcessor even if initialize has not been called on the same instance. Since MultiLangRecordProcessor uses its initialize method to construct certain fields, and its shutdown method assumes that those fields have been initialized, an NPE occurs.
Once again thank you for reporting the problem. It will be fixed in a future release.
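For readers following along, here is a rough Python analogue of the failure mode described above. It is only a sketch: SketchProcessor, its writer field, and the "ZOMBIE" reason string are hypothetical names made up for illustration, not the actual KCL classes. In Python the equivalent of the Java NPE shows up as an AttributeError on None.

```python
class SketchProcessor:
    """Hypothetical stand-in for a record processor; not the real KCL class."""

    def __init__(self):
        # Mirrors a field that is only constructed inside initialize().
        self.writer = None

    def initialize(self, shard_id):
        # Placeholder for building a real resource tied to the shard.
        self.writer = []

    def shutdown(self, reason):
        # Assumes initialize() already ran; breaks if self.writer is still None.
        self.writer.append(reason)


processor = SketchProcessor()
try:
    # The caller invokes shutdown without ever having called initialize.
    processor.shutdown("ZOMBIE")
    failed = False
except AttributeError:
    failed = True
```

Calling initialize first makes the same shutdown call succeed, which is why the bug only appears on the code path where the Worker skips initialization.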
I'm using the Python wrapper for this package, and I seem to be running into the same issue. Is there a workaround you can recommend to guarantee that initialize always gets called?
If your code doesn't need the shard id, you might be able to place it in the constructor of the class instead. Do you absolutely need initialize to be called? If it's just to ensure proper functioning of the shutdown method, adding a flag to check whether initialization has happened might be sufficient.
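A minimal sketch of the flag approach suggested above, assuming a processor class of your own. GuardedProcessor, the _initialized flag, and the events list are hypothetical names used only for illustration; the method signatures are simplified and do not match the Python wrapper's exact interface.

```python
class GuardedProcessor:
    """Hypothetical processor that tolerates shutdown before initialize."""

    def __init__(self):
        # State that does not need the shard id can be set up here.
        self._initialized = False
        self.shard_id = None
        self.events = []

    def initialize(self, shard_id):
        self.shard_id = shard_id
        self._initialized = True

    def shutdown(self, reason):
        # Skip cleanup when initialize never ran, instead of failing.
        if not self._initialized:
            return False
        self.events.append(("shutdown", self.shard_id, reason))
        return True


never_started = GuardedProcessor()
skipped = never_started.shutdown("ZOMBIE")  # returns False, no exception

started = GuardedProcessor()
started.initialize("shard-0")
cleaned_up = started.shutdown("TERMINATE")  # returns True
```

The guard does not make initialize get called; it only keeps shutdown from assuming that it was.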
Was this fixed in a recent version?
This remains unfixed. The MultiLangRecordProcessor has not been changed since Oct 2014 https://github.com/awslabs/amazon-kinesis-client/blob/73ac2c0e25a25776cbc88f2c685223fb049e6757/src/main/java/com/amazonaws/services/kinesis/multilang/MultiLangRecordProcessor.java
I was able to reproduce this issue on 1.6.1 (the current latest version).
@kevincdeng ETA here?
@kevincdeng @findchris FWIW I've had success by setting the failoverTimeMillis property in the .properties file to a high value (e.g. 100s).
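As a sketch of that workaround, the line in the daemon's .properties file might look like the following. The value is in milliseconds, so 100000 corresponds to the 100s mentioned above; the exact number is an assumption and should be tuned for your own setup.

```properties
# Time in milliseconds before a worker that has not renewed its lease
# is considered failed and its shards are handed to other workers.
failoverTimeMillis = 100000
```

Note that a larger value also means a genuinely dead worker takes longer to be replaced.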
This apparently shipped in https://github.com/awslabs/amazon-kinesis-client#release-162-march-23-2016 @manango can you close or merge this PR please? It's confusing to leave it open.
The issue has been resolved in the 1.6.2 release. Closing the issue.
I am facing a similar issue in https://github.com/awslabs/amazon-kinesis-client-net. Can someone please help?