What feature do you want? We want to be able to set min-scale wh

Allow scale to zero to work when min-scale is greater than 0

daraghlowe commented on June 24, 2024

Allow scale to zero to work when min-scale is greater than 0

Comments (3)

skonto commented on June 24, 2024

Hi @daraghlowe, I will take a look at what you reported about activation-scale and get back to you.

skonto commented on June 24, 2024

Hi @daraghlowe

We did some testing with activation-scale, but it doesn't solve the problem: the service can scale down to 1 replica while it is active if it doesn't get enough request concurrency after initially activating and scaling up to 2. The description of the merged PR seems to indicate that it should work the way we want, but it doesn't.

According to the docs the behavior is:

This value controls the minimum number of replicas that will be created when the Revision scales up from zero. After the Revision has reached this scale one time, this value is ignored. This means that the Revision will scale down after the activation scale is reached if the actual traffic received needs a smaller scale.

Also in the PR: "This annotation will not impact initial-scale values, as it will only apply on subsequent scales from zero."
Now in the code we have:

if a.deciderSpec.ActivationScale > 1 {
  logger.Debug("Considering Activation Scale")
  if dspc > 0 && a.deciderSpec.ActivationScale > desiredStablePodCount {
  ...

If dspc is > 0 because traffic is coming in and the revision is active, I am wondering why you don't see two pods.
I tried it and did not see fewer pods, using:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: autoscale-go
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # Target 10 in-flight-requests per pod.
        autoscaling.knative.dev/target: "10"
        autoscaling.knative.dev/activation-scale: "2"
        autoscaling.knative.dev/target-burst-capacity: "10"
    spec:
      containers:
      - image: ghcr.io/knative/autoscale-go:latest

Could you enable debug logging for the autoscaler and paste the output? Please also provide more details, such as the ksvc you used.

cc @psschwei @dprotaso if they have more ideas.

daraghlowe commented on June 24, 2024

Hi @skonto

I did some additional testing and confirmed that the activation-scale annotation does work as you mentioned. As long as the revision is receiving traffic (I sent one request every 5s), it stays scaled up to the number of replicas set in the activation-scale annotation.

If the revision doesn't receive traffic then it will scale down to 1 replica and it will scale back up to 2 (activation-scale) as soon as you send 1 request.

We were testing in our dev environment, where the revision wasn't receiving any traffic. When we saw the replicas scale down from 2 to 1, we misunderstood how it was working: we expected that it would always stay at a minimum of 2 replicas and that, when it scaled down, it would go directly to zero rather than to 1 replica first.

Thanks for investigating this and clarifying how it works!
