
awslabs / fleetiq-adapter-for-agones


For running containerized game servers on Spot instances reliably and safely

License: Apache License 2.0

Dockerfile 2.14% Python 46.88% Go 4.51% Shell 43.11% Mustache 3.35%
agones kubernetes gamelift

fleetiq-adapter-for-agones's Introduction

Introduction

This project allows you to run containerized game servers on Spot instances while decreasing the likelihood of Spot interruptions. Interruptions are minimized by using GameLift FleetIQ, which periodically adjusts the instance types used by an AWS Auto Scaling group (ASG) using an algorithm that assesses an instance's viability. Instances with claimed game servers are temporarily protected from termination.

Components

Agones

Agones provides lifecycle management operations for running containerized game servers on Kubernetes. This project was specifically designed to work with Agones running on Amazon EKS or a self-managed Kubernetes cluster running in the AWS Cloud.

The daemonset

The daemonset is an "agent" that runs on worker nodes that have been designated to run containerized game servers, i.e. instances with the role=game-servers label. On EKS, labels can be added to instances automatically by modifying the kubelet parameters in the instance's user data or by modifying the launch template referenced by the ASG for the game server node group.

When the daemonset starts, it immediately registers the instance with GameLift FleetIQ, runs ClaimGameServer, and then calls UpdateGameServer once per minute to maintain the instance's health. It also starts polling a Redis channel for the instance's viability. When an instance's status changes from ACTIVE to DRAINING, the daemon cordons the node to prevent new game servers from being scheduled onto it. It then adds a toleration to all allocated game servers. Afterwards, it taints the node, which forces pods that do not have a toleration for the taint, i.e. unallocated game servers, to be evicted. When the last allocated game server is shut down, the daemon calls DeregisterGameServer, which deregisters the instance from FleetIQ, and waits for the instance to be terminated.
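
The drain sequence described above can be sketched in plain Python. This is a minimal illustration, not the daemon's actual code: the taint key and value, the helper names, and the simplified toleration matching are all assumptions made for the example. It only builds the objects involved (the cordon patch, the taint, and the matching toleration) and shows which pods would be evicted once a NoExecute taint is applied.

```python
from typing import Dict, List, Optional

# Hypothetical taint used for illustration; the real key/value may differ.
DRAIN_TAINT = {"key": "example.com/draining", "value": "true", "effect": "NoExecute"}


def cordon_patch() -> Dict:
    """Patch body that marks a node unschedulable (the field `kubectl cordon` sets)."""
    return {"spec": {"unschedulable": True}}


def toleration_for(taint: Dict) -> Dict:
    """Build a toleration that matches the given taint exactly."""
    return {
        "key": taint["key"],
        "operator": "Equal",
        "value": taint["value"],
        "effect": taint["effect"],
    }


def tolerates(pod_tolerations: Optional[List[Dict]], taint: Dict) -> bool:
    """Simplified matching: same key and effect, and either Exists or an equal value."""
    for tol in pod_tolerations or []:
        if tol.get("key") != taint["key"] or tol.get("effect") != taint["effect"]:
            continue
        if tol.get("operator") == "Exists" or tol.get("value") == taint["value"]:
            return True
    return False


def evicted_pods(pods: Dict[str, List[Dict]], taint: Dict) -> List[str]:
    """Names of pods that lack a toleration for the taint and would be evicted."""
    return sorted(name for name, tols in pods.items() if not tolerates(tols, taint))


# Step 1: cordon the node. Step 2: add the toleration to allocated game servers.
# Step 3: taint the node, which evicts everything that does not tolerate it.
pods = {
    "gs-allocated": [toleration_for(DRAIN_TAINT)],  # allocated, so it got the toleration
    "gs-ready": [],                                 # unallocated, no toleration
}
print(cordon_patch())                   # {'spec': {'unschedulable': True}}
print(evicted_pods(pods, DRAIN_TAINT))  # ['gs-ready']
```

The ordering matters: if the node were tainted before the allocated game servers received their toleration, those pods would be evicted along with the unallocated ones.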

The pubsub application

The pubsub application runs a loop that calls DescribeGameServerInstances, parses the results, and publishes the status of each instance to a Redis channel for that instance. Although we could have built the daemon to call DescribeGameServerInstances directly, we chose a pub/sub model to avoid exceeding the rate limit for the GameLift APIs.
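
One iteration of that loop might look roughly like the sketch below. It assumes the response has the shape returned by the GameLift DescribeGameServerInstances API (a GameServerInstances list whose entries carry InstanceId and InstanceStatus) and that each channel is named after the instance ID; the channel naming and the function names are assumptions, not taken from the repository. The publish step is passed in as a callable so the example runs without AWS or Redis.

```python
from typing import Callable, Dict, List, Tuple


def statuses_from_response(response: Dict) -> List[Tuple[str, str]]:
    """Extract (instance_id, status) pairs from a DescribeGameServerInstances response."""
    return [
        (item["InstanceId"], item["InstanceStatus"])
        for item in response.get("GameServerInstances", [])
    ]


def publish_statuses(response: Dict, publish: Callable[[str, str], object]) -> int:
    """Publish each instance's status to its own channel; return how many were sent."""
    count = 0
    for instance_id, status in statuses_from_response(response):
        # Assumption: the channel name is simply the instance ID.
        publish(instance_id, status)
        count += 1
    return count


# Sample data with made-up instance IDs, standing in for a real API response.
sample = {
    "GameServerInstances": [
        {"GameServerGroupName": "agones-game-servers",
         "InstanceId": "i-0123456789abcdef0", "InstanceStatus": "ACTIVE"},
        {"GameServerGroupName": "agones-game-servers",
         "InstanceId": "i-0fedcba9876543210", "InstanceStatus": "DRAINING"},
    ]
}
sent: List[Tuple[str, str]] = []
publish_statuses(sample, lambda channel, message: sent.append((channel, message)))
print(sent)
# [('i-0123456789abcdef0', 'ACTIVE'), ('i-0fedcba9876543210', 'DRAINING')]
```

In a real deployment the response would come from a boto3 GameLift client's describe_game_server_instances call, and publish could be the publish method of a redis-py client. Because only this one process calls the API, the number of GameLift requests stays constant no matter how many nodes are listening.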

The pubsub application supports n game server groups. On startup, the application reads the list of game server groups from the fleetiqconfig ConfigMap.

kind: ConfigMap
apiVersion: v1
metadata:
  name: fleetiqconfig
  namespace: default
data:
  fleetiq.conf: '{"GameServerGroups": [ "agones-game-servers" ]}'
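
As a rough illustration of how the fleetiq.conf value above could be read, the sketch below parses the JSON string and returns the list of game server groups. The function name and the validation are assumptions; the repository's own parsing may differ.

```python
import json
from typing import List


def game_server_groups(conf_text: str) -> List[str]:
    """Return the game server group names listed in a fleetiq.conf JSON string."""
    conf = json.loads(conf_text)
    groups = conf.get("GameServerGroups", [])
    # Reject anything that is not a list of strings so bad config fails early.
    if not isinstance(groups, list) or not all(isinstance(g, str) for g in groups):
        raise ValueError("GameServerGroups must be a list of strings")
    return groups


print(game_server_groups('{"GameServerGroups": [ "agones-game-servers" ]}'))
# ['agones-game-servers']
```

Supporting additional groups would then only require adding their names to the GameServerGroups array in the ConfigMap.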

The instructions for installing the pubsub application, along with Redis, can be found here.

The pubsub application and Redis should be installed before the GameLift daemonset.

Redis

Redis is used to publish InstanceStatus to a channel for each instance. We elected to use Redis instead of SNS to avoid taking a dependency on another AWS service. That said, you can use Amazon ElastiCache for Redis as your Redis endpoint, or you can run Redis locally in your Kubernetes cluster. The Redis endpoint is configured through the REDIS_URL environment variable of the pubsub application and the daemonset.

Installation

Please follow the instructions in the FleetIQ EKS Agones Integration Guide to install the solution.

We recommend that you build the images for the daemonset and the pubsub application from the Dockerfiles in this repository. Be aware that you will need to update the daemonset and deployment manifests with the appropriate image URIs if you do. Both charts allow you to override the defaults for image and tag with your own values.

Issues

If you have an issue with the Guide or with any of the solution's components, please file an issue.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

fleetiq-adapter-for-agones's People

Contributors

dependabot[bot], jicomusic, jicowan, syvanen, tgreaves, trevorrobertsjr


fleetiq-adapter-for-agones's Issues

Following start guide - how to view created resources in AWS console?

Apologies for the newbie question. I am learning about Agones and GameLift FleetIQ, and I plan to use AWS FlexMatch to create a game service for CSGO (Source engine). I plan to use a webhook autoscaler with AWS and write a wrapper for the game server.

I followed the guide to create a workspace and set up all the AWS resources. All good: I managed to get to the end of the guide, scale up the replicas, and see the nodes. However, it never scaled up to 100 nodes, stopping at 54 with 1 group and 60 with 2 groups. I'd like to understand more about this adapter and Agones, but I can't see any of these resources in my AWS account except for the GameLift groups. The EKS cluster does show up but doesn't display any nodes.

This isn't really an issue with the repo. I'd just like to learn more, but I can't find a forum, Slack channel, or similar for this. I'd like to be able to work on the components of the solution architecture and visualise where I need to focus, if you can give any recommendations.

Thanks.

EKS Launch Template now has multiple mentions of "NODE_TAINT"

The awk on line 161 of quickinstall.sh fails because there are now multiple lines with NODE_TAINTS= in the launch template.

Adding -m1 limits the match to the first hit, which makes the awk replacement work again. Without this, the daemon fails to add/update the taint at main.py line 206 because no taints are defined on the node.

Error on logs:

Registering game server
The game server is already registered
The instance is HEALTHY
Updating game server health
Claiming game server
The instance has already been claimed
Changing status to utilized
Traceback (most recent call last):
  File "main.py", line 347, in <module>
    main()
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "main.py", line 322, in main
    initialize_game_server(GameServerGroupName=game_server_group_name, GameServerId=game_server_id, InstanceId=instance_id)
  File "main.py", line 207, in initialize_game_server
    taints.append(taint)
AttributeError: 'NoneType' object has no attribute 'append'
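
The traceback suggests that the node's taint list was None when the daemon tried to append to it. Independent of the quickinstall.sh fix, a defensive way to handle this could look like the sketch below. It is not the code in main.py: the helper name and the taint key are hypothetical, and taints are shown as plain dictionaries for simplicity.

```python
from typing import Dict, List, Optional


def with_taint(existing: Optional[List[Dict]], taint: Dict) -> List[Dict]:
    """Return a new taint list that includes `taint`, tolerating a missing list."""
    # A node without taints may report None rather than an empty list.
    taints = list(existing or [])
    # Replace any taint with the same key and effect instead of duplicating it.
    taints = [
        t for t in taints
        if not (t.get("key") == taint["key"] and t.get("effect") == taint["effect"])
    ]
    taints.append(taint)
    return taints


# Hypothetical taint, used only to exercise the helper.
taint = {"key": "example.com/status", "value": "ACTIVE", "effect": "NoExecute"}
print(with_taint(None, taint))
# [{'key': 'example.com/status', 'value': 'ACTIVE', 'effect': 'NoExecute'}]
```

With this approach a node that starts without any taints no longer causes an AttributeError, and an existing taint with the same key and effect is updated rather than appended twice.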
