Comments (7)
@piodul are you familiar with how expensive gossiping is? If we go with the simpler solution here, we'll call `add_local_application_state` each second, which may be very expensive if there are periods when we're not gossiping at all, or free if we're gossiping each second anyway.
There's also a problem with the second solution: the backlog in a response is only propagated to one node. If we also update the last sent backlog in gossip with backlogs sent in responses, we may think we have already propagated the backlog when it has actually reached only one node.
Perhaps we should go with a third approach: when sending replies, only note that the backlog has changed since the last gossip round, and still keep the last sent gossip backlog in `view_update_backlog_broker`. This should avoid the second issue, and gossip will keep sending updates each second only as long as we're performing requests with view updates.
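As a rough illustration of that third approach, here is a minimal standalone C++ sketch. It is not ScyllaDB code: `backlog_publisher`, its members, and `example()` are hypothetical names made up for this comment, and the real broker would publish through gossip rather than return a value.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the broker's bookkeeping (assumption, not the
// real view_update_backlog_broker interface).
struct backlog_publisher {
    uint64_t last_gossiped = 0;        // value most recently published via gossip
    bool changed_since_gossip = false; // set when a reply carried a different value

    // Called when a backlog is piggybacked on a response. Only one node sees
    // that value, so last_gossiped is deliberately left untouched.
    void on_backlog_sent_in_reply(uint64_t backlog) {
        if (backlog != last_gossiped) {
            changed_since_gossip = true;
        }
    }

    // Called once per gossip round. Returns the value to publish, or nothing
    // if this round can be skipped.
    std::optional<uint64_t> on_gossip_round(uint64_t current_backlog) {
        if (!changed_since_gossip && current_backlog == last_gossiped) {
            return std::nullopt;
        }
        last_gossiped = current_backlog;
        changed_since_gossip = false;
        return current_backlog;
    }
};

// Usage sketch: a reply with a new value forces one publish, then rounds go quiet.
inline void example() {
    backlog_publisher p;
    p.on_backlog_sent_in_reply(10);
    assert(p.on_gossip_round(10).has_value());
    assert(!p.on_gossip_round(10).has_value());
}
```

The point of the sketch is that replies only mark the state as dirty, while `last_gossiped` changes only when a value has really gone out through gossip.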
from scylladb.
@piodul are you familiar with how expensive gossiping is?
In general, AFAIK we try to avoid gossiping data unnecessarily. It probably depends mostly on the size of the data.
The most reasonable solution for me would be to have something like this:
- Have a flag `bool need_publishing;`
- When the local view update backlog changes, set `need_publishing = true`
- Run a fiber which, every second:
  - If `need_publishing`, then update the application state in gossip and set `need_publishing = false`
This is of course a naive model because it only assumes one shard. In reality, calculating the backlog is done with atomics (see `node_update_backlog::add_fetch`), so this becomes more complicated.
Perhaps we could have a per-shard, non-atomic `need_publishing` variable; the fiber would use `invoke_on_all`, do the check I mentioned on each local shard, and update the application state if the flag was true on any of the shards. This approach would avoid all the concerns related to the ordering of atomics (each shard's backlog is only written by that shard, so we properly serialize with any potential updates of the shard-local backlog).
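To make the per-shard idea a bit more concrete, here is a minimal sketch in plain C++ without Seastar. Everything in it is an assumption for illustration: `shard_state`, `publish_tracker`, and `collect_and_reset` are hypothetical names, and the loop over a vector only stands in for what `invoke_on_all` would do across real shards.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-shard state. In a real sharded service each instance would
// be touched only by its own shard, which is why a plain bool is enough.
struct shard_state {
    bool need_publishing = false;
};

class publish_tracker {
    std::vector<shard_state> _shards;
public:
    explicit publish_tracker(std::size_t shard_count) : _shards(shard_count) {}

    // Called by a shard whenever its local view update backlog changes.
    void on_local_backlog_change(std::size_t shard) {
        assert(shard < _shards.size());
        _shards[shard].need_publishing = true;
    }

    // One iteration of the periodic fiber. Every shard's flag is read and
    // cleared, and the results are OR-ed together. Returns true if the gossip
    // application state should be updated in this round.
    bool collect_and_reset() {
        bool any = false;
        for (auto& s : _shards) {
            any = any || s.need_publishing;
            s.need_publishing = false;
        }
        return any;
    }
};
```

A caller would run `collect_and_reset()` once per second and update the application state only when it returns true.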
from scylladb.
After discussing this with @piodul @kostja and @gleb-cloudius, there are a few things worth noting. There are gossip services similar to the `view_update_backlog_broker` that we're using, particularly the load gossiper and the cache-hit-rate gossiper. The load gossiper broadcasts the load every 60s and the cache-hit-rate gossiper sends its updates every 2s. The gossiped values are used mainly when the node is starting; later they are always obsolete anyway.
The main difference in the `view_update_backlog_broker` is that it may be completely unused, in contrast to load and cache-hit-rate, which are useful in practically every workload. It may also be used periodically, and between the periods, updates from gossip may be needed.
With that in mind, we found a few approaches we can try here:
- The approach mentioned by @piodul in #18461 (comment), which has the benefit of relatively low complexity and performance cost.
- Simply send the view update backlog in each iteration of the gossiping loop. This would have the biggest performance cost but the lowest complexity.
- Time out view update backlog values received in responses after some time. In this case we would assume that if we didn't get an update from gossip, the backlog dropped to 0 (or to the last gossiped value). This approach would be relatively simple and inexpensive, but would allow a higher temporary discrepancy.
- Implement another way of propagating view update backlog sizes. Currently the propagation works well as long as there are frequent updates from the same coordinator to the same node; the values propagated using gossip are quite outdated in comparison.
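The time-out option from the list above could look roughly like the following standalone C++ sketch. It is only an illustration under stated assumptions: `remote_backlog`, `backlog_clock`, the TTL parameter, and the choice to let a newer gossip update supersede a response value are all hypothetical and not taken from the ScyllaDB code.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

using backlog_clock = std::chrono::steady_clock;

// Hypothetical record kept per remote node (assumption for illustration).
struct remote_backlog {
    uint64_t gossiped = 0;                      // last value learned through gossip
    uint64_t from_response = 0;                 // last value piggybacked on a response
    backlog_clock::time_point response_time{};  // when from_response was received
    bool has_response = false;

    // A gossip update is treated as newer than any stored response value.
    void on_gossip(uint64_t v) {
        gossiped = v;
        has_response = false;
    }

    void on_response(uint64_t v, backlog_clock::time_point now) {
        from_response = v;
        response_time = now;
        has_response = true;
    }

    // The response value is trusted only while it is younger than ttl;
    // afterwards we fall back to the last gossiped value.
    uint64_t effective(backlog_clock::time_point now,
                       backlog_clock::duration ttl) const {
        assert(ttl > backlog_clock::duration::zero());
        if (has_response && now - response_time < ttl) {
            return from_response;
        }
        return gossiped;
    }
};
```

With this shape, a stale non-zero value from an old response cannot stick forever even if gossip never resends an unchanged backlog.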
from scylladb.
If it's gossiped every 60 seconds, is an optimization worthwhile?
from scylladb.
@avikivity the view update backlog is gossiped every second.
from scylladb.
@wmitros how does this issue relate to #18462? It seems your original problem statement refers to the case where a zero backlog estimate is not gossiped, so some non-zero estimate sent in some previous request gets kept forever. If this is the problem, then this is exactly issue #18462 and there is no need for both issues.
from scylladb.
@wmitros how does this issue relate to #18462? It seems your original problem statement refers to the case where a zero backlog estimate is not gossiped, so some non-zero estimate sent in some previous request gets kept forever. If this is the problem, then this is exactly issue #18462 and there is no need for both issues.
These issues have similar symptoms, but they are separate. #18462 refers only to receiving "empty" backlogs from gossip, while this issue is about sending repeated backlogs (which are most likely to be 0 as well, but don't have to be).
from scylladb.