Comments (11)
Observability: we should be able to track why a certain client was throttled, i.e. which specific metric it was throttled on.
from vitess.
Throttler check requests (mostly via throttler clients) should be able to specify the list of metrics on which they wish to throttle (e.g. "I care about replication lag, but fine to ignore load average")
- The set of metrics specified by the client will `AND` with each other, i.e. if the client chooses to throttle based on `lag,loadavg` then both `lag` and `loadavg` need to individually pass for the overall check to pass. I don't think it makes sense to `OR` or to have any other combination.
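A minimal sketch of the `AND` semantics described above, which also covers the observability point by reporting which metrics failed. The type and function names (`MetricResult`, `CheckAll`) and the example values are hypothetical and are not taken from the Vitess code base.

```go
package main

import "fmt"

// MetricResult is a hypothetical per-metric outcome (illustrative only).
type MetricResult struct {
	Name      string
	Value     float64
	Threshold float64
}

// OK reports whether this single metric is within its threshold.
func (r MetricResult) OK() bool { return r.Value <= r.Threshold }

// CheckAll ANDs the requested metrics: the overall check passes only if
// every metric passes individually. It also returns the names of the
// failing metrics, so the caller can tell why it was throttled.
func CheckAll(results []MetricResult) (ok bool, failed []string) {
	ok = true
	for _, r := range results {
		if !r.OK() {
			ok = false
			failed = append(failed, r.Name)
		}
	}
	return ok, failed
}

func main() {
	ok, failed := CheckAll([]MetricResult{
		{Name: "lag", Value: 0.8, Threshold: 5},
		{Name: "loadavg", Value: 3.2, Threshold: 1},
	})
	fmt.Println(ok, failed) // prints: false [loadavg]
}
```

Here `lag` passes but `loadavg` does not, so the overall check fails and the client can see that it was throttled on `loadavg`.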
As mentioned above, we want to be able to change the list of considered metrics while an operation (Online DDL, as an example) is running. For example, we may want Online DDL to start out throttling based on both lag and load average, and later on to stop throttling based on load average and remain with just lag.
IMO the way to do that is to associate metrics with an app name. All Online DDL operations use the app name "online-ddl". So the way would be to associate `"online-ddl": "lag,loadavg"`.
That association will then either:
- make its way to the throttler client, which then provides the throttler with the list of metrics it's interested in, or
- keeping the throttler client ignorant, be computed on behalf of the client by the throttler.
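The two options above can be sketched as a single resolution step on the throttler side. This is an illustration under assumptions: `CheckRequest`, `Throttler`, `MetricsFor`, and the fallback to a default metric list are hypothetical names and behavior, not the actual Vitess API.

```go
package main

import "fmt"

// CheckRequest is a hypothetical check request; Metrics is optional.
type CheckRequest struct {
	App     string
	Metrics []string
}

// Throttler holds a hypothetical app-to-metrics association.
type Throttler struct {
	AppMetrics     map[string][]string
	DefaultMetrics []string // assumed fallback when nothing is associated
}

// MetricsFor decides which metrics a check should consider.
func (t *Throttler) MetricsFor(req CheckRequest) []string {
	if len(req.Metrics) > 0 {
		return req.Metrics // option 1: the client states its metrics
	}
	if m, ok := t.AppMetrics[req.App]; ok && len(m) > 0 {
		return m // option 2: the throttler resolves them for the client
	}
	return t.DefaultMetrics
}

func main() {
	t := &Throttler{
		AppMetrics:     map[string][]string{"online-ddl": {"lag", "loadavg"}},
		DefaultMetrics: []string{"lag"},
	}
	fmt.Println(t.MetricsFor(CheckRequest{App: "online-ddl"})) // prints: [lag loadavg]

	// Changing the association while a migration is running affects
	// subsequent checks without the client having to know about it.
	t.AppMetrics["online-ddl"] = []string{"lag"}
	fmt.Println(t.MetricsFor(CheckRequest{App: "online-ddl"})) // prints: [lag]
}
```

With the second option the client keeps sending only its app name, which is what makes it possible to change the metrics mid-operation.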
Metrics can be collected from the single tablet being probed, or from the collective shard.
- Replication lag is normally something you wish to collect from the entire shard (including primary), because you want to know about the replicas' lag. There is a strong reason to check on all shard servers.
- What about load average? Are you concerned with the load average on the `PRIMARY` or are you concerned about the metric on replicas? There is no clear answer and you probably want to check on `PRIMARY` only.
To that effect:
- A metric is associated with a scope (`self`/`shard`). Each metric has a default scope. `lag` uses `shard`, others use `self`.
- A normal check will use the default scopes (per metric).
- But the user may also indicate "I wish to check the entire `shard` for all metrics" or "I wish to check `self` scope for all metrics". In which case we override the metrics' defaults.
Moreover, consider the discussion in the previous comment re: associating metrics with apps. It will be possible to fine-grain the checks even further by associating `"online-ddl": "lag,shard/loadavg"`. Note:
- the scope is not mandatory (nothing declared for `lag`, and so the scope for `lag` is the default one for this metric, which happens to be `shard`).
- per-metric scopes are ignored by the self-checks, which are the mechanism by which the tablets collect their own metrics and by which the `PRIMARY` tablet collects metrics from the replicas.
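To make the `[scope/]metric` notation concrete, here is a rough sketch of how a list such as `lag,shard/loadavg` could be parsed, with default scopes as described above. The names (`Scope`, `ScopedMetric`, `ParseAppMetrics`) are hypothetical, and the error handling is an assumption rather than the proposed behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// Scope is a hypothetical type for the two scopes discussed above.
type Scope string

const (
	ScopeSelf  Scope = "self"
	ScopeShard Scope = "shard"
)

// defaultScope mirrors the rule in the text: lag defaults to shard,
// all other metrics default to self.
func defaultScope(metric string) Scope {
	if metric == "lag" {
		return ScopeShard
	}
	return ScopeSelf
}

// ScopedMetric pairs a metric name with the scope it is checked in.
type ScopedMetric struct {
	Name  string
	Scope Scope
}

// ParseAppMetrics parses a list such as "lag,shard/loadavg".
func ParseAppMetrics(spec string) ([]ScopedMetric, error) {
	var out []ScopedMetric
	for _, item := range strings.Split(spec, ",") {
		item = strings.TrimSpace(item)
		if item == "" {
			continue
		}
		scope, name, found := strings.Cut(item, "/")
		if !found {
			// No explicit scope: fall back to the metric's default.
			out = append(out, ScopedMetric{Name: item, Scope: defaultScope(item)})
			continue
		}
		if s := Scope(scope); s != ScopeSelf && s != ScopeShard {
			return nil, fmt.Errorf("unknown scope %q in %q", scope, item)
		}
		if name == "" {
			return nil, fmt.Errorf("missing metric name in %q", item)
		}
		out = append(out, ScopedMetric{Name: name, Scope: Scope(scope)})
	}
	return out, nil
}

func main() {
	metrics, err := ParseAppMetrics("lag,shard/loadavg")
	if err != nil {
		panic(err)
	}
	fmt.Println(metrics) // prints: [{lag shard} {loadavg shard}]
}
```

In this example `lag` gets `shard` from its default, while `loadavg` gets `shard` only because it was declared explicitly; without the prefix it would fall back to `self`.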
- Adding support for an `all` app, which is a catch-all for anything that doesn't have any specific rules. With `all`, it is possible to do inverted rules, such as "everything is rejected, except this app which is allowed". Or, "everything throttles at 0.7 ratio for the next 2 hours, except these two apps, one of which is exempted in the next 5 hours, the other throttled at 0.2 ratio for the next 30min". Or also "everything is exempted, but this app needs to go through normal throttling".
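One way to picture the catch-all is a rule lookup where an app-specific rule, while active, takes precedence over the `all` rule. The sketch below encodes the second example from the text. `AppRule`, `RuleFor`, `exampleRules`, and the app names `app-a` and `app-b` are hypothetical, and the assumption that an expired app rule falls back to `all` is mine, not something stated in the proposal.

```go
package main

import (
	"fmt"
	"time"
)

// AppRule is a hypothetical throttling rule for one app.
type AppRule struct {
	Ratio    float64 // fraction of checks to reject: 0 = none, 1 = all
	Exempt   bool    // if true, the app bypasses throttling entirely
	ExpireAt time.Time
}

// active reports whether the rule has not yet expired.
func (r AppRule) active(now time.Time) bool { return now.Before(r.ExpireAt) }

// RuleFor returns the rule that applies to app at time now. An active
// app-specific rule wins; otherwise the "all" rule applies, if active.
func RuleFor(rules map[string]AppRule, app string, now time.Time) (AppRule, bool) {
	if r, ok := rules[app]; ok && r.active(now) {
		return r, true
	}
	if r, ok := rules["all"]; ok && r.active(now) {
		return r, true
	}
	return AppRule{}, false
}

// exampleRules encodes: everything throttles at 0.7 for 2 hours, except
// app-a (exempted for 5 hours) and app-b (throttled at 0.2 for 30 minutes).
func exampleRules(now time.Time) map[string]AppRule {
	return map[string]AppRule{
		"all":   {Ratio: 0.7, ExpireAt: now.Add(2 * time.Hour)},
		"app-a": {Exempt: true, ExpireAt: now.Add(5 * time.Hour)},
		"app-b": {Ratio: 0.2, ExpireAt: now.Add(30 * time.Minute)},
	}
}

func main() {
	now := time.Now()
	rules := exampleRules(now)
	for _, app := range []string{"app-a", "app-b", "anything-else"} {
		r, ok := RuleFor(rules, app, now)
		fmt.Println(app, ok, r.Ratio, r.Exempt)
	}
}
```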
- Adding `vtctldclient CheckThrottler` command, which returns a detailed `CheckThrottlerResponse`. The command takes a tablet name as argument (potentially it could also take a shard name, much like `Backup` and `BackupShard`). It takes `--app-name` and `--scope` optional arguments as well as some extra flags.
Required additions to `vtctldclient UpdateThrottlerConfig`:
- Updating the threshold for a given metric name. Setting threshold to `0` will remove the entry.
  We can use the existing `--threshold` flag, and add a `--metric-name=...` flag. If the latter exists, then `--threshold` must be specified. If it does not exist, then we assume the "default" metric.
- Setting the per app metrics. Something like `--app-name=online-ddl --app-metrics=lag,shard/loadavg`. The two flags must come together - either both exist, or none exists. It's OK to provide an empty `--app-metrics`, in which case the throttler uses the default metrics for the given app. `--app-name` must not be empty. It can be `"all"`.
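The flag constraints above boil down to a small validation step. The following is only a sketch of those rules: `UpdateFlags` and `Validate` are hypothetical, and the `*Set` fields stand in for however the real command would detect that a flag was provided.

```go
package main

import (
	"errors"
	"fmt"
)

// UpdateFlags is a hypothetical holder for the proposed flags. The *Set
// fields record whether a flag was present on the command line.
type UpdateFlags struct {
	Threshold     float64
	ThresholdSet  bool
	MetricName    string
	MetricNameSet bool
	AppName       string
	AppNameSet    bool
	AppMetrics    string
	AppMetricsSet bool
}

// Validate enforces the flag combinations described in the text.
func Validate(f UpdateFlags) error {
	if f.MetricNameSet && !f.ThresholdSet {
		return errors.New("--metric-name requires --threshold")
	}
	if f.AppNameSet != f.AppMetricsSet {
		return errors.New("--app-name and --app-metrics must be provided together")
	}
	if f.AppNameSet && f.AppName == "" {
		return errors.New("--app-name must not be empty")
	}
	// An empty --app-metrics is allowed: it means "use the default metrics".
	return nil
}

func main() {
	// --metric-name without --threshold is rejected.
	fmt.Println(Validate(UpdateFlags{MetricName: "loadavg", MetricNameSet: true}))
	// --app-name=all with an empty --app-metrics is accepted.
	fmt.Println(Validate(UpdateFlags{AppName: "all", AppNameSet: true, AppMetricsSet: true}))
}
```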
Eventually (`v21/v22/v23`, depending), we will deprecate these flags in `vtctldclient UpdateThrottlerConfig`:
- `--check-as-check-self`
- `--check-as-check-shard`
We will also clean up these fields from `UpdateThrottlerConfigRequest`:
- `CheckAsCheckSelf`
- `CheckAsCheckShard`
- Assigning metrics to the `all` app should apply to all apps which do not already have any explicit metrics assigned:
$ vtctldclient UpdateThrottlerConfig --app-name "all" --app-metrics "lag,loadavg" commerce
Addressed by #15988
Base branch PR for changes: `planetscale:throttler-multi-metrics-incremental` #16012, onto which we will merge multiple incremental PRs.