Comments (2)

cristim commented on August 25, 2024

Thanks for reporting this.

At the moment I don't have any time to look into this, but by all means please try to test it and report back your findings, preferably in a pull request to update the documentation.

If you don't like the way AutoSpotting handles this, pull requests to change it for the better are always welcome 😁

from autospotting.

cristim commented on August 25, 2024

The main difference seems to be that AutoSpotting launches OnDemand instances and then tries to replace them with spot instances, while Capacity Rebalancing seems to only attempt to launch spot instances. In theory, it's possible that AutoSpotting can do a better job at launching an OnDemand instance than Capacity Rebalancing can do in finding a spot instance, but it seems like AWS's service should be pretty good at finding spare capacity (feel free to chime in if anyone has empirical data on this).

The OnDemand instances are not launched by AutoSpotting, but by the ASG itself. When the event comes (regardless of whether it's a termination or rebalancing event, as they're handled the same way), AutoSpotting will currently either:

  1. proactively detach the terminating Spot instance from the ASG and leave it running outside the ASG for up to 14 minutes (we have a 15min Lambda timeout), then terminate it if it wasn't already terminated by EC2 Spot. Spot will terminate the instance after 2 minutes if it was a termination notification, but rebalancing events may not always result in terminations, and that's why we terminate it ourselves.

Then the ASG will notice it runs with reduced capacity, and will attempt to launch an OnDemand instance to recover the desired capacity. Within seconds after launch, this new OnDemand instance will be replaced by a new Spot instance and terminated, so the new Spot instance is booting up inside the ASG.

or...

  2. terminate the instance while it's still in the ASG, telling the ASG to replace it immediately with a new OnDemand instance, which AutoSpotting will then replace with Spot in the same way as described at the end of option 1.
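As a rough illustration of the detach-and-wait flow in option 1, here is a minimal Python sketch. It is not AutoSpotting's actual code (the project itself is written in Go); the function name and the `detach`, `is_running` and `terminate` callables are hypothetical stand-ins for the corresponding AWS API calls, and the polling interval is an assumption. Only the 14 minute grace period under the 15 minute Lambda timeout comes from the description above.

```python
import time
from typing import Callable

# 14 minutes, chosen to stay under the 15-minute Lambda timeout.
DETACH_GRACE_SECONDS = 14 * 60


def detach_and_wait(
    instance_id: str,
    detach: Callable[[str], None],        # hypothetical: detach from the ASG
    is_running: Callable[[str], bool],    # hypothetical: is it still running?
    terminate: Callable[[str], None],     # hypothetical: terminate the instance
    grace_seconds: float = DETACH_GRACE_SECONDS,
    poll_seconds: float = 15.0,           # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Detach the instance, wait for Spot to reclaim it, else terminate it."""
    detach(instance_id)
    deadline = clock() + grace_seconds
    while clock() < deadline:
        if not is_running(instance_id):
            # EC2 Spot already terminated it (e.g. a real termination notice).
            return "terminated_by_spot"
        sleep(min(poll_seconds, max(0.0, deadline - clock())))
    if is_running(instance_id):
        # Rebalancing events may never lead to a termination, so do it here.
        terminate(instance_id)
        return "terminated_by_us"
    return "terminated_by_spot"
```

The clock and sleep functions are injected only so the timing logic can be exercised without real waiting.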

The default behavior depends on whether the ASG has Lifecycle Hooks configured:

  • if there are no termination lifecycle hooks configured, the instance will be detached and terminated after the 14-minute timeout (option 1)
  • otherwise AutoSpotting will terminate the instance within the ASG, in order to have the termination lifecycle hooks triggered (option 2).

There is also a configuration flag that can enforce either of the above behaviors regardless of whether the ASG has Lifecycle Hooks, as you can see in the CloudFormation stack parameters:

  TerminationNotificationAction:
    AllowedValues:
      - "auto"
      - "detach"
      - "terminate"
    Default: "auto"
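
To make the interaction between this flag and the lifecycle hooks concrete, here is a small Python sketch of the decision described above. The names `Action` and `resolve_action` are made up for illustration and are not taken from the AutoSpotting code base; only the three allowed values and the "auto" rule come from this comment.

```python
from enum import Enum


class Action(Enum):
    DETACH = "detach"        # option 1: detach, wait, then terminate
    TERMINATE = "terminate"  # option 2: terminate inside the ASG


def resolve_action(configured: str, has_termination_hooks: bool) -> Action:
    """Map the TerminationNotificationAction value to a concrete action."""
    if configured == "detach":
        return Action.DETACH
    if configured == "terminate":
        return Action.TERMINATE
    if configured == "auto":
        # With termination lifecycle hooks, terminate in the ASG so the
        # hooks are triggered; without them, detach instead.
        return Action.TERMINATE if has_termination_hooks else Action.DETACH
    raise ValueError(f"unsupported TerminationNotificationAction: {configured!r}")
```

For example, `resolve_action("auto", has_termination_hooks=True)` gives `Action.TERMINATE`, while an explicit "detach" wins even when hooks are present.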

Are there any other differences between AutoSpotting and native autoscaling that should be documented?

Yes, the ASG won't run any temporary OnDemand capacity. It will first attempt to launch the replacement Spot instance, and only terminate the instance that received the rebalancing event after the new Spot instance is ready and passes the EC2/ELB health checks.

I've been working on a similar implementation in #475, but it's not ready yet. In addition, it will also fall back to OnDemand capacity, with fallback across instance types, if we fail to launch Spot across all the suitable Spot instance types in the AZ.

I'm looking for people who can help me test/refine #475 to get it merged.
