GithubHelp home page GithubHelp logo

aws-samples / aws-usage-queries Goto Github PK

View Code? Open in Web Editor NEW
43.0 7.0 7.0 1.51 MB

This application bootstraps everything needed to query the AWS Cost and Usage reports through Amazon Athena. It also includes reference data and preconfigured SQL queries.

License: MIT No Attribution

JavaScript 1.22% TypeScript 84.11% Python 12.48% Shell 2.19%
amazon-athena cost cur-data cdk

aws-usage-queries's Introduction

AWS usage queries

If you want to identify the top AWS usage in your AWS accounts the AWS Cost and Usage reports (CUR) have much of the data required. You can query the CUR data through Amazon Athena.

This repository is an AWS Cloud Development Kit (CDK) template that bootstraps:

  • AWS Glue catalog metadata in your account to query the CUR
  • additional reference data you can use to enrich your queries - such as instance type information
  • common queries as AWS Glue views, e.g. EC2 instance vCPU hours by account and month

You can modify the template to add your own data and queries.

Amazon Athena view in the AWS Management Console

Architectural Overview

Architectural Overview

Get Started

This application is available in the AWS Serverless Application Repository. You can deploy it to your account from there:

cloudformation-launch-button

Compile the CDK template

The CDK template is written in TypeScript. TypeScript sources must be compiled to JavaScript initially and after each modification. Open a new terminal and keep this terminal open in the background if you like to change the source files. Change the directory to the one where cdk.json is and execute:

npm install
npm run watch

Read the CDK developer guide for more information.

Deployment using CDK

Create or update the application with cdk deploy. You must provide parameters used when creating cost and usage reports. The CUR data must be stored in Apache Parquet format.

cdk deploy \
    --parameters CurBucketName=<bucket name> \
    --parameters ReportPathPrefix=<path without leading or trailing slash> \
    --parameters ReportName=<report name> \
    --databaseName=<optional databasename override, default: aws_usage_queries_database>

Access the queries

You can find the queries deployed via the Amazon Athena console. You can choose Preview next to each view or run a query like:

SELECT * FROM <see databaseName parameter above>."monthly_s3_storage_by_account" limit 10;

Cleaning Up

To avoid incurring charges, delete the AWS CloudFormation stack when you are finished experimenting via cdk destroy in the directory where cdk.json is:

cdk destroy

Or delete the AWS CloudFormation stack manually:

  • Sign in to the AWS CloudFormation console and choose your stack.
  • Choose Delete to delete all resources, including the bucket that stores the reference data.

FAQs

Q: Can I use CUR information stored in an other AWS account?

Yes. You can use a bucket policy for cross-account access in Athena to Amazon S3 buckets or you setup cross-account replication of the report bucket.

Q: Why is my CUR data not being delivered to the report bucket?

Make sure your Amazon S3 bucket is granting the permissions for Cost and Usage Reports to PUT the CUR data into the bucket.

Q: How much do resources in this template cost?

Standard AWS charges apply to the resources you deploy with this template. For the query cost read the Amazon Athena pricing and AWS Glue pricing for details. For the storage and requests to CUR data in your Amazon S3 bucket read the Amazon S3 Pricing. When creating the CUR report select Enable report data integration for Athena. That way the report data is written with the Apache Parquet columnar format which usually results in better performance and cost optimization for both Amazon S3 and Amazon Athena.

Q: How can I refresh the instance types

The instance types come from the EC2 API. Query and write it to a file:

echo "instance_family,instance_type,clock_speed_in_ghz,vcpu_count" \
    > referenceData/instanceTypes/data.csv
aws ec2 describe-instance-types \
    --output text \
    --query 'InstanceTypes[*].[InstanceType, ProcessorInfo.SustainedClockSpeedInGhz, VCpuInfo.DefaultVCpus]' \
    | sed -E 's/[[:blank:]]+/,/g' \
    | sed -E 's/^(([^.]+).*)/\2,\1/' \
    | sort >> referenceData/instanceTypes/data.csv

Q: How can I add a new question to this list?

If you found yourself wishing this set of frequently asked questions had an answer for a particular problem, please submit a pull request. The chances are good that others will also benefit from having the answer listed here.

Q: How can I contribute?

See the Contributing Guidelines for details.

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.

aws-usage-queries's People

Contributors

amazon-auto avatar dependabot[bot] avatar steffeng avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.