Comments (5)
Nothing obvious comes to mind, unfortunately.
from broadway_kafka.
Thank you for having a look @josevalim, really appreciated!
After debugging for a while longer, we have found one possible cause of this (I say one because we've also seen offsets accumulating without this error message): the batcher
dies due to an unknown timer.
"GenServer MyService.Broadway.Broadway.Batcher_ignore terminating
** (RuntimeError) unknown timer #Reference<0.1181658914.3223060481.95258>
(broadway 1.0.3) lib/broadway/topology/batcher_stage.ex:207: Broadway.Topology.BatcherStage.cancel_batch_timeout/1
(broadway 1.0.3) lib/broadway/topology/batcher_stage.ex:148: Broadway.Topology.BatcherStage.deliver_batch/6
(broadway 1.0.3) lib/broadway/topology/batcher_stage.ex:118: Broadway.Topology.BatcherStage.handle_events_per_batch_key/3
(broadway 1.0.3) lib/broadway/topology/batcher_stage.ex:64: anonymous fn/2 in Broadway.Topology.BatcherStage.handle_events/3
(telemetry 1.1.0) /build/deps/telemetry/src/telemetry.erl:320: :telemetry.span/3
(broadway 1.0.3) lib/broadway/topology/batcher_stage.ex:54: Broadway.Topology.BatcherStage.handle_events/3
(gen_stage 1.1.2) lib/gen_stage.ex:2471: GenStage.consumer_dispatch/6
(gen_stage 1.1.2) lib/gen_stage.ex:2660: GenStage.take_pc_events/3
It seems that Broadway already accounts for the case where the timeout message has already been received by the time the cancel timer call returns false (https://github.com/dashbitco/broadway/blob/main/lib/broadway/topology/batcher_stage.ex#L201-L213), but there seems to be an edge case that is not covered?
My guess is that crashing the batcher is fine for other producers where the ACK is not sequential, but for BroadwayKafka it seems to break offset acknowledgement.
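For anyone following along, here is a minimal sketch of the general cancel-then-flush pattern being discussed. It is not Broadway's actual code: `TimerSketch` and `cancel_and_flush/1` are hypothetical names, and the timer is assumed to have been created with `:erlang.start_timer/3`, which delivers `{:timeout, ref, msg}` to the target process.

```elixir
defmodule TimerSketch do
  # Hypothetical helper for illustration only (not part of Broadway).
  # Cancels a timer and, if it had already fired, tries to remove the
  # pending timeout message from the caller's mailbox.
  def cancel_and_flush(timer) do
    case Process.cancel_timer(timer) do
      false ->
        # The timer was not found: it already expired (or never existed).
        # The timeout message may or may not be in the mailbox yet.
        receive do
          {:timeout, ^timer, _msg} -> :flushed
        after
          0 -> :not_in_mailbox
        end

      remaining_ms when is_integer(remaining_ms) ->
        # The timer was still pending and is now cancelled.
        :cancelled
    end
  end
end

# A timer that is still pending when we cancel it.
pending = :erlang.start_timer(60_000, self(), :flush_batch)
IO.inspect(TimerSketch.cancel_and_flush(pending))

# A timer that has already fired before we cancel it.
fired = :erlang.start_timer(10, self(), :flush_batch)
Process.sleep(50)
IO.inspect(TimerSketch.cancel_and_flush(fired))
```

The interesting branch is `:not_in_mailbox`: code that raises there is assuming an expired timer always leaves its message in the mailbox at that moment, which would match the "unknown timer" error in the trace above.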
I am looking at the code and I cannot see a code path that would make the error message above happen. Every time we cancel the timer, we delete the batch, which means it is impossible to recover the timer again.
This has been fixed in Broadway. There was an assumption that the timer message would be delivered automatically but that was not always the case.
This is great news! Thank you very much for all your work @josevalim ❤️