
ram-shanmugam / aws-glue-quota-audit-framework


This project forked from aws-samples/aws-glue-quota-audit-framework


This AWS sample code base implements Lambda logic that schedules only as many Glue jobs as the current resource constraints allow.

License: MIT No Attribution

Python 100.00%


AWS Glue audit framework to efficiently control resource quota.

Introduction

Creating and running Glue jobs requires additional auditing and logging to track job status and resource quotas efficiently. The Glue console dashboard is the first place to review these metrics, but as the number of jobs grows and jobs run as multiple concurrent instances, users risk hitting the maximum Glue quotas.

There are a few quotas to monitor when running Glue jobs, or multiple instances of the same job, concurrently.

  • Max concurrent job runs per account
  • Max concurrent job runs per job
  • Max connections per account
  • Max jobs per account
  • Max task DPUs per account
  • Number of IP addresses in your subnet

The first limit you are likely to hit is max concurrent job runs per account. It is set to 50 by default, but it can be increased through a service request.

For the complete list of AWS Glue quotas, see https://docs.aws.amazon.com/general/latest/gr/glue.html
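Of the items listed above, the number of free IP addresses in a subnet can be read at run time. The following is a minimal sketch, not code from this repository: the helper name `available_ip_count` is hypothetical, and it assumes the caller passes in a boto3 EC2 client. It relies on the `AvailableIpAddressCount` field that EC2's `describe_subnets` call returns for each subnet.

```python
def available_ip_count(ec2_client, subnet_id: str) -> int:
    """Return the number of unused IP addresses EC2 reports for one subnet.

    ec2_client is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    response = ec2_client.describe_subnets(SubnetIds=[subnet_id])
    subnets = response["Subnets"]
    if not subnets:
        # Defensive check in case the response contains no subnet entries.
        raise LookupError(f"subnet {subnet_id} not found")
    return subnets[0]["AvailableIpAddressCount"]
```

Passing the client in as an argument keeps the helper easy to exercise with a stub, without AWS credentials.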

As Glue adoption grows, Glue jobs need to be monitored and scheduled dynamically, based on the IP addresses available in your private subnet and the max DPU setting, combined with account-level concurrency limits.
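One way to combine these three constraints is to compute how many additional runs each one allows and take the smallest. The sketch below is illustrative only and is not the repository's Lambda code. The `Capacity` fields and the `runs_that_fit` function are hypothetical names, and it assumes every run needs the same fixed number of DPUs and IP addresses.

```python
from dataclasses import dataclass


@dataclass
class Capacity:
    """Snapshot of the limits the scheduler checks (all values are illustrative)."""
    available_ips: int        # free IP addresses in the private subnet
    account_dpu_limit: int    # max task DPUs per account
    dpus_in_use: int          # DPUs consumed by runs that are already active
    concurrency_limit: int    # max concurrent job runs per account
    active_runs: int          # job runs currently active


def runs_that_fit(cap: Capacity, dpus_per_run: int, ips_per_run: int) -> int:
    """Return how many additional runs can start without exceeding any limit."""
    if dpus_per_run <= 0 or ips_per_run <= 0:
        raise ValueError("dpus_per_run and ips_per_run must be positive")
    by_ip = cap.available_ips // ips_per_run
    by_dpu = (cap.account_dpu_limit - cap.dpus_in_use) // dpus_per_run
    by_concurrency = cap.concurrency_limit - cap.active_runs
    # The tightest constraint wins; never return a negative count.
    return max(0, min(by_ip, by_dpu, by_concurrency))


cap = Capacity(available_ips=120, account_dpu_limit=300, dpus_in_use=200,
               concurrency_limit=50, active_runs=30)
print(runs_that_fit(cap, dpus_per_run=10, ips_per_run=10))  # prints 10
```

In this example the DPU headroom (100 DPUs, or 10 runs) is tighter than the IP headroom (12 runs) and the concurrency headroom (20 runs), so 10 runs can start.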

In this repository, we walk through a solution that creates a Glue audit framework to manage resource quotas efficiently and schedule AWS Glue jobs using Step Functions. The solution is useful for large enterprises that regularly run more than 100 instances of Glue jobs within a private subnet, where the availability of IP addresses and DPU capacity is the main constraint on job scheduling.

Glue quota auditing framework

This is an overview of the architecture described above:

[Figure: GlueAuditFramework architecture diagram]

Data

The data used in this demo solution comes from a public Glue dataset, available at the following Amazon S3 location:

s3://awsglue-datasets/examples/medicare/Medicare_Hospital_Provider.csv

Solution Steps

  1. An event rule triggers the Step Functions state machine every 5 minutes.

  2. The state machine invokes a Lambda function that validates Glue capacity and quota metrics.

  3. The Lambda function queries DynamoDB for the list of currently active jobs.

  4. A set of Glue jobs is triggered based on the Lambda function's calculations.

  5. The Glue jobs run in parallel and write their output to S3.

  6. Each Glue job updates the DynamoDB table twice: once at the start of the job, setting its status to "Active", and again on completion or failure.
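The two DynamoDB writes in step 6 could look roughly like the sketch below. It is an assumption-laden illustration, not the repository's code: the function names, the attribute names (`job_name`, `run_id`, `status`, `started_at`, `finished_at`), the key schema, and the "Completed" and "Failed" status values are all hypothetical. Only the "Active" status comes from the steps above. The `table` argument is assumed to be a boto3 DynamoDB `Table` resource.

```python
from datetime import datetime, timezone


def mark_active(table, job_name: str, run_id: str) -> None:
    """Record a run as Active when the Glue job starts (first write in step 6)."""
    table.put_item(Item={
        "job_name": job_name,   # hypothetical partition key
        "run_id": run_id,       # hypothetical sort key
        "status": "Active",
        "started_at": datetime.now(timezone.utc).isoformat(),
    })


def mark_finished(table, job_name: str, run_id: str, succeeded: bool) -> None:
    """Overwrite the status when the run completes or fails (second write)."""
    table.update_item(
        Key={"job_name": job_name, "run_id": run_id},
        # "status" is a DynamoDB reserved word, so it is aliased as #s.
        UpdateExpression="SET #s = :s, finished_at = :t",
        ExpressionAttributeNames={"#s": "status"},
        ExpressionAttributeValues={
            ":s": "Completed" if succeeded else "Failed",
            ":t": datetime.now(timezone.utc).isoformat(),
        },
    )
```

With records kept this way, the Lambda function in step 3 can count the items whose status is still "Active" to work out how much capacity is in use.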

Useful links for getting familiar with the AWS resources used in this solution.

License Summary

This sample code is made available under the MIT-0 license. See the LICENSE file.
