pbudzon / aws-maintenance

Collection of scripts and Lambda functions used for maintaining AWS resources

Home Page: https://mysteriouscode.io/blog/

License: MIT License

Python 99.65% Makefile 0.35%
aws-lambda rds-snapshots cloudformation-templates snapshot aws-cloudformation ebs-snapshots

aws-maintenance

Collection of scripts and Lambda functions used for maintaining various AWS resources.

Cross-region RDS backups (backup-rds.py)

Lambda function used to copy RDS snapshots from one region to another, so that the database can be restored if a region fails. One copy (the latest) is kept in the target region for each RDS instance. The provided CloudFormation template creates a subscription from RDS to Lambda: whenever an automated RDS snapshot of any database in that AWS region is made, the snapshot is copied to the target region and all older snapshots for that database are removed.

Regions

You will be asked to specify the target region (where to copy your snapshots) to use by Lambda when creating the CloudFormation stack. The stack itself needs to be created in the same region where the RDS databases that you want to use it for are located.

Limit to specific RDS instances

You can also limit the function to only act for specific databases - specify the list of names in the "Databases to use for" parameter when creating the CloudFormation stack. If you leave it empty, Lambda will trigger for all RDS instances within the source region.
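As a minimal sketch of how such a filter could behave (the function names `parse_name_list` and `should_process` are hypothetical and not taken from backup-rds.py), an empty parameter value matches every instance:

```python
def parse_name_list(raw):
    """Split a comma-delimited parameter value into a list of names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def should_process(instance_name, allowed_names):
    """An empty allow-list means every instance is processed."""
    return not allowed_names or instance_name in allowed_names


# "orders-db" and "users-db" are placeholder instance names.
allowed = parse_name_list("orders-db, users-db")
print(should_process("orders-db", allowed))
print(should_process("reports-db", allowed))
print(should_process("reports-db", parse_name_list("")))
```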

Encryption

If your RDS instances are encrypted, you need to provide a KMS key ARN in the target region when creating the stack.

Since KMS keys are region-specific, a snapshot copied into another region needs to be re-encrypted using a key located in that region. Create a KMS key in the target region, copy its ARN and paste that value into the KMS Key in target region parameter when creating the CloudFormation stack. If you do not provide that value, the copy operation for encrypted snapshots will fail.

You can also provide that value if your RDS instances are not encrypted - the copied snapshots will be encrypted using that key.

If you don't use encryption and don't want your snapshots to be encrypted, leave the KMS Key in target region parameter empty.
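The rules above can be sketched as a small helper that only adds a key when one was provided. This is an illustration, not the code from backup-rds.py: the function name `build_copy_arguments` and the sample ARNs are assumptions, while the dictionary keys are parameter names of the RDS `CopyDBSnapshot` API as exposed by boto3.

```python
def build_copy_arguments(source_arn, target_id, source_region, kms_key_arn=""):
    """Build keyword arguments for a cross-region RDS snapshot copy.

    KmsKeyId is only included when a key ARN was provided, so unencrypted
    snapshots stay unencrypted when the parameter is left empty.
    """
    arguments = {
        "SourceDBSnapshotIdentifier": source_arn,
        "TargetDBSnapshotIdentifier": target_id,
        "SourceRegion": source_region,
        "CopyTags": True,
    }
    if kms_key_arn:
        arguments["KmsKeyId"] = kms_key_arn
    return arguments


# Placeholder values for illustration only.
arguments = build_copy_arguments(
    "arn:aws:rds:eu-west-1:123456789012:snapshot:rds:mydb-2020-04-06-02-04",
    "mydb-copy-2020-04-06",
    "eu-west-1",
    kms_key_arn="arn:aws:kms:eu-central-1:123456789012:key/EXAMPLE",
)
print(sorted(arguments))
```

With boto3, the result could be passed as `boto3.client("rds", region_name=target_region).copy_db_snapshot(**arguments)`, where `target_region` is a placeholder for the target region id.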

Aurora clusters

Since Aurora clusters do not offer an event notification for their automated backups, a daily schedule is used to copy the latest snapshot to the target region. If you're using clusters, set Use for Aurora clusters to 'Yes' when creating the CloudFormation stack. You can limit which clusters' snapshots are copied by specifying a comma-delimited list in the Aurora clusters to use for parameter. The snapshots are copied once a day, at a time chosen by AWS (the schedule is a CloudWatch Event with rate(1 day)).
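Because the schedule is not tied to a specific backup, the function has to find the latest snapshot itself. A rough sketch of that selection follows, assuming snapshot records shaped like the entries boto3 returns from `describe_db_cluster_snapshots` (`Status`, `SnapshotCreateTime`); the helper name `latest_available_snapshot` is hypothetical and the actual script may select snapshots differently.

```python
from datetime import datetime, timezone


def latest_available_snapshot(snapshots):
    """Return the newest snapshot whose Status is 'available', or None."""
    available = [
        snapshot for snapshot in snapshots
        if snapshot.get("Status") == "available" and "SnapshotCreateTime" in snapshot
    ]
    if not available:
        return None
    return max(available, key=lambda snapshot: snapshot["SnapshotCreateTime"])


# Placeholder records for illustration only.
snapshots = [
    {"DBClusterSnapshotIdentifier": "rds:cluster-2020-04-05", "Status": "available",
     "SnapshotCreateTime": datetime(2020, 4, 5, 2, 4, tzinfo=timezone.utc)},
    {"DBClusterSnapshotIdentifier": "rds:cluster-2020-04-06", "Status": "available",
     "SnapshotCreateTime": datetime(2020, 4, 6, 2, 4, tzinfo=timezone.utc)},
    {"DBClusterSnapshotIdentifier": "rds:cluster-2020-04-07", "Status": "creating"},
]
print(latest_available_snapshot(snapshots)["DBClusterSnapshotIdentifier"])
```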

Guide

How to use for the first time

  1. Download the backup-rds.py file from this repository and zip it into a file called backup-rds.zip (for example: zip backup-rds.zip backup-rds.py).
  2. Upload the ZIP file to an S3 bucket on your AWS account in the same region where your RDS instances live.
  3. Create a new CloudFormation stack using the template: infrastructure/templates/rds-cross-region-backup.json.
  4. CloudFormation will ask you for the following parameters:
    • Required: Target region - provide the id of the AWS region where the copied snapshots should be stored, like 'eu-central-1'. Those are listed in AWS documentation.
    • Required: Name of S3 bucket - name of the S3 bucket where you uploaded the ZIP in earlier step.
    • Required: Name of ZIP file - name of the ZIP file in S3 bucket you uploaded. If you uploaded it into a directory, provide a path to the file in S3 (for example lambda_code/backup-rds.zip)
    • Required/Optional: KMS Key in target region - if your RDS instances are encrypted, provide an ARN of a KMS key in the target region. See Encryption section above.
    • Optional: Databases to use for - if you want to limit the functionality to only specific RDS instances, provide a comma-delimited list of their names.
    • Optional: Use for Aurora clusters - select "Yes" if you have any Aurora Clusters that you want this code to work with.
    • Optional: Aurora clusters to use for (applies only if you select "Yes" above) - if you want to limit the functionality to only specific Aurora Clusters, provide a comma-delimited list of cluster names.

How to update to the latest version

Follow the steps above, but give the ZIP file a different name than before - for example, if you uploaded backup-rds.zip, upload the new file as backup-rds-1.zip. Update your CloudFormation stack with the latest template from this repo, and provide that new ZIP file name in the Name of ZIP file parameter.

How to test

Once all resources are created, you can test your Lambda from the Console, by using the following test event:

{
  "Records": [
    {
      "EventVersion": "1.0",
      "EventSubscriptionArn": "arn:aws:sns:EXAMPLE",
      "EventSource": "aws:sns",
      "Sns": {
        "Type": "Notification",
        "MessageId": "abcd",
        "TopicArn": "arn:aws:sns:eu-west-1:123456789012:topic_name",
        "Subject": "RDS Notification Message",
        "Message": "{\"Event Source\":\"db-instance\",\"Event Time\":\"2017-12-26 22:34:07.882\",\"Identifier Link\":\"https://console.aws.amazon.com/rds/home?region=eu-west-1#dbinstance:id=database_name\",\"Source ID\":\"PUT_YOUR_RDS_NAME_HERE\",\"Event ID\":\"http://docs.amazonwebservices.com/AmazonRDS/latest/UserGuide/USER_Events.html#RDS-EVENT-0002\",\"Event Message\":\"Finished DB Instance backup\"}",
        "Timestamp": "2017-12-26T22:35:19.946Z",
        "SignatureVersion": "1",
        "Signature": "xxx",
        "SigningCertURL": "xxx",
        "UnsubscribeURL": "xxx"
      }
    }
  ]
}

Replace the PUT_YOUR_RDS_NAME_HERE in the JSON string with a name of any of your RDS instances.
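Note that the Message field is itself a JSON document stored as a string, so it has to be decoded a second time before the database name can be read. A minimal sketch of that step (the function name `extract_source_ids` is hypothetical; the "Source ID" key comes from the test event above):

```python
import json


def extract_source_ids(event):
    """Collect the RDS 'Source ID' from every SNS record in a Lambda event."""
    source_ids = []
    for record in event.get("Records", []):
        # The SNS message body is a JSON string nested inside the event.
        message = json.loads(record["Sns"]["Message"])
        source_ids.append(message["Source ID"])
    return source_ids


# Trimmed-down version of the test event, with a placeholder database name.
sample_event = {
    "Records": [
        {
            "Sns": {
                "Message": json.dumps({
                    "Event Source": "db-instance",
                    "Source ID": "my-database",
                    "Event Message": "Finished DB Instance backup",
                })
            }
        }
    ]
}
print(extract_source_ids(sample_event))  # ['my-database']
```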

For Aurora Clusters, use the below event (no need to change anything):

{
  "version": "0",
  "id": "eb6d8ba9-c5c2-3269-3ac4-9918a9df74d9",
  "detail-type": "Scheduled Event",
  "source": "aws.events",
  "account": "123456789012",
  "time": "2018-01-30T21:11:00Z",
  "region": "eu-west-1",
  "resources": [
    "arn:aws:events:eu-west-1:123456789012:rule/eventName"
  ],
  "detail": {}
}

The code will go through all clusters (or those listed in Aurora clusters to use for parameter).
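The two test events above have different shapes: the RDS notification arrives wrapped in SNS "Records", while the Aurora schedule is a plain CloudWatch "Scheduled Event". A handler could tell them apart roughly as follows; this is a sketch with a hypothetical `classify_event` helper, not necessarily how backup-rds.py branches.

```python
def classify_event(event):
    """Return 'sns' for RDS notifications, 'scheduled' for the daily rule."""
    if "Records" in event:
        return "sns"
    if event.get("detail-type") == "Scheduled Event":
        return "scheduled"
    return "unknown"


print(classify_event({"Records": [{"EventSource": "aws:sns"}]}))
print(classify_event({"detail-type": "Scheduled Event", "source": "aws.events"}))
print(classify_event({}))
```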

Automated EC2 storage backups and retention management (ebs-snapshots.py)

Lambda function which automatically creates daily snapshots of instances tagged with the "Backup" tag (the tag name can be customized). The tag value should be the number of days the snapshot should be retained for - once that period has passed, the snapshot is deleted the next time this Lambda is executed.

Notes

  • Encrypted volumes' snapshots will retain the encryption and use the same encryption key.
  • Unencrypted volumes' snapshots will remain unencrypted.
  • Default retention period is 7 days (can be changed in Lambda code, see below).
  • Lambda can be run multiple times a day if needed; it will NOT create duplicate snapshots on the same day.
  • Tags from EC2 instance will be copied to the snapshot (except "Backup" tag), and a new tag "CreatedBy" will be added with this Lambda's name.
  • If you have a lot of instances to snapshot, you may need to extend the Lambda execution time (or schedule it to be executed multiple times a day).

Guide

How to use for the first time

  1. Download the ebs-snapshots.py file from this repository and zip it into a file called ebs-snapshots.zip (for example: zip ebs-snapshots.zip ebs-snapshots.py).
  2. Upload the ZIP file to an S3 bucket on your AWS account.
  3. Create a new CloudFormation stack using the template: infrastructure/templates/create-ebs-snapshots.json.
  4. CloudFormation will ask you for the following parameters:
    • Required: Name of S3 bucket - name of the S3 bucket where you uploaded the ZIP in earlier step.
    • Required: Name of ZIP file - name of the ZIP file in S3 bucket you uploaded. If you uploaded it into a directory, provide a path to the file in S3 (for example lambda_code/ebs-snapshots.zip)
  5. Create the stack.
  6. Add a tag called "Backup" to some instances. Set the tag's value to the number of days you want to retain their snapshots for, or to 0 to use the default retention period.
  7. That's it! CloudWatch Event Rule will be created that will trigger the Lambda once a day. You can also trigger it manually from Lambda console.

How to update to the latest version

Follow the steps above, but give the ZIP file a different name than before - for example, if you uploaded ebs-snapshots.zip, upload the new file as ebs-snapshots-1.zip. Update your CloudFormation stack with the latest template from this repo, and provide that new ZIP file name in the Name of ZIP file parameter.

How to test

Trigger the Lambda from the console. Any input (even empty) will do, because it is ignored. Output from the Lambda will list the tagged EC2 instances found and the EBS snapshots that were created.

How to modify names of tags used by code or default retention period

In the ebs-snapshots.py file, the first few lines define the following variables, which you can change as needed:

  • DEFAULT_RETENTION - number of days the snapshots are retained for if the "Backup" tag value is zero (default: 7).
  • BACKUP_TAG - name of the tag on EC2 instances the code will look for (default: "Backup").
  • DELETE_ON_TAG - name of the tag with deletion date that will be added to snapshots (default: "DeleteOn"). Important: If you change this AFTER some snapshots were already created with previous name, those snapshots will not be deleted when their date is reached. Either update the tag name assigned to them, or delete them manually.

After changing those values, follow the update guide above to deploy your new code.
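To illustrate how these variables interact, here is a rough sketch of computing the deletion tag from a "Backup" tag value. `DEFAULT_RETENTION` and `DELETE_ON_TAG` are the names listed above; the helper functions, the ISO date format and the fallback for non-numeric values are assumptions and may differ from ebs-snapshots.py.

```python
from datetime import date, timedelta

DEFAULT_RETENTION = 7       # days, used when the tag value is zero
DELETE_ON_TAG = "DeleteOn"  # tag written on each snapshot


def retention_days(tag_value):
    """Translate the 'Backup' tag value into a number of days."""
    try:
        days = int(tag_value)
    except (TypeError, ValueError):
        # Assumption: unparsable values fall back to the default.
        return DEFAULT_RETENTION
    return days if days > 0 else DEFAULT_RETENTION


def delete_on_tag(tag_value, today):
    """Build the snapshot tag holding the date after which it may be deleted."""
    delete_on = today + timedelta(days=retention_days(tag_value))
    return {"Key": DELETE_ON_TAG, "Value": delete_on.isoformat()}


print(delete_on_tag("0", date(2024, 1, 1)))   # default retention of 7 days
print(delete_on_tag("30", date(2024, 1, 1)))  # explicit retention of 30 days
```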

Monitor CloudTrail events (cloudtrail-monitor.py)

Lambda function which monitors CloudTrail logs and sends SNS notification on LaunchInstances event. This can be modified to look for and respond to any AWS API calls as needed.

Use infrastructure/templates/cloudtrail-notifications.json CloudFormation template to create the Lambda, CloudTrail and SNS topics. In the Outputs of the CloudFormation stack, you'll find the SNS topic to which you can subscribe to receive the notifications.

Other Lambdas

The Lambdas below can be created by using infrastructure/templates/maintenance-lambdas.json CloudFormation template.

You should probably review (and adjust) them to your needs as necessary. They are provided as examples.

clean-base-images.py and clean-release-images.py

Remove AMIs from eu-west-1 (Ireland) and eu-central-1 (Frankfurt) based on different tags.

Meant to be used as a part of immutable infrastructure, where each project has a base AMI (tagged with Type=BaseImage) and each release is contained within a new AMI based on it (tagged with Type=ReleaseImage).

Assumptions:

  1. Base images are stored in Ireland. Release images are stored in Ireland and Frankfurt (as backups).
  2. Apart from Type tag, each AMI has a Project tag, which can contain any value.

These scripts make sure that only a certain number of recent images is kept for each project, to limit costs.
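The idea of keeping only the most recent images per project can be sketched as below. This is an illustration under simplifying assumptions, not the scripts' actual code: each image is a flat dictionary with `ImageId`, `Project` and an ISO-formatted `CreationDate`, and the function name `images_to_remove` and the `keep` default are hypothetical.

```python
from collections import defaultdict


def images_to_remove(images, keep=3):
    """Return ids of images beyond the `keep` newest ones in each project."""
    by_project = defaultdict(list)
    for image in images:
        by_project[image["Project"]].append(image)

    removable = []
    for project_images in by_project.values():
        # ISO timestamps sort chronologically as plain strings.
        newest_first = sorted(
            project_images, key=lambda image: image["CreationDate"], reverse=True
        )
        removable.extend(image["ImageId"] for image in newest_first[keep:])
    return removable


# Placeholder data for illustration only.
images = [
    {"ImageId": "ami-1", "Project": "shop", "CreationDate": "2018-01-01T00:00:00.000Z"},
    {"ImageId": "ami-2", "Project": "shop", "CreationDate": "2018-02-01T00:00:00.000Z"},
    {"ImageId": "ami-3", "Project": "blog", "CreationDate": "2018-01-15T00:00:00.000Z"},
]
print(images_to_remove(images, keep=1))  # ['ami-1']
```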

clean-es-indices.py

Removes old CloudWatch indices from AWS Elasticsearch Service. Useful when streaming CloudWatch logs into Elasticsearch.

Configure the list of accounts, the Elasticsearch endpoint and the number of most recent indices to keep inside the code.

aws-maintenance's Issues

KMSKeyNotAccessibleFault

After running the stack and executing a Lambda test I get the following: "errorMessage": "An error occurred (KMSKeyNotAccessibleFault) when calling the CopyDBSnapshot operation: The source snapshot KMS key does not exist, is not enabled or you do not have permissions to access it"

Is there something I need to add to the IAM role to get this working?

RDS-SnapshotQuotaExceeded

Can we copy 4 snapshots at a time to another region?

An error occurred (SnapshotQuotaExceeded) when calling the CopyDBSnapshot operation: Cannot copy more than 5 snapshots across regions: SnapshotQuotaExceededFault

AWS only allows 5 snapshots copy across regions. So, 'IN PROGRESS' status can't exceed more than 5.

File: backup-rds.py

Error In Cross Copy

When I am running the test function for aurora I am getting below error. Until now it was working fine but now i am seeing this error
An error occurred (SnapshotQuotaExceeded) when calling the CopyDBClusterSnapshot operation: Cannot create more than 100 manual snapshots: SnapshotQuotaExceededFault
Traceback (most recent call last):
  File "/var/task/backup-rds.py", line 259, in lambda_handler
    copy_latest_snapshot(account_id, cluster, True)
  File "/var/task/backup-rds.py", line 172, in copy_latest_snapshot
    SourceRegion=SOURCE_REGION
  File "/var/runtime/botocore/client.py", line 314, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/var/runtime/botocore/client.py", line 612, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.SnapshotQuotaExceededFault: An error occurred (SnapshotQuotaExceeded) when calling the CopyDBClusterSnapshot operation: Cannot create more than 100 manual snapshots

Issues with snapshots deleting

Thank you for this Paulina, I have implemented via the template and it is working. However, the snapshots in the target region do not seem to be deleting and nothing in the cloudwatch logs suggest that the deletion is happening either. I ran into an AWS limit of not being able to have over 100 manual snapshots so am having to manually delete them. Have you seen this issue or maybe know a possible cause? Thank you @pbudzon

Cross Region RDS Backup Copy CFN Not working

I cloned your git repository to my local machine, uploaded the zipped python for RDS cross region backup copy to an s3 bucket. However when I upload the CFN template into the console it fails. Are you sure the provided template is valid?

CF creation fails with new instructions

Hi @pbudzon

I followed your new instructions to enable the cross-region copy of encrypted RDS snapshots: I've created a new key in the destination region, downloaded the lambda python file, added it to a S3 bucket in the source region, downloaded the JSON file and changed the S3 bucket name in it, but when I try to create a new CF stack it fails.

Many thanks!

Aurora support

I have multiple instances in RDS, one with AuroraMySQL engine & one with MySQL engine. In my source region snapshot has been created for both but in the target region only the snapshot related to MySQL engine has been created not for the AuroraMySQL engine.
As mentioned in the readme file i have left the DatabasesToUse as empty because i wanted the snapshots of all the RDS instances to be copied in the destination region.
Do i need to do something else in the code? Kindly suggest

Retention Policy for the Snapshots

@pbudzon : The cross region backup for aurora instances removes snapshots which are one day older. If we want to implement some retention policy like keeping all the snapshots for last 30 days & 1 snapshot for day 1 of each of the previous months upto six months? I can keep the backup for 30 days by changing snapshots_to_remove = [i[0] for i in sorted_snapshots[29:]]. But for keeping the snapshot for each month of the first day (for last six months), do i need to change the name of the snapshot? Could you please suggest

Cross-region RDS backup copy fails with encrypted snapshots

Hi @pbudzon

I just tested the updated Lambda function with the provided example event and it fails because no KMS key is provided for the cross-region RDS snapshot copy.

{
  "errorMessage": "An error occurred (InvalidParameterValue) when calling the CopyDBSnapshot operation: Must specify new KMS key for cross region encrypted snapshot copy.",
  "errorType": "ClientError",
  "stackTrace": [
    [
      "/var/task/index.py",
      103,
      "lambda_handler",
      "copy_latest_snapshot(account_id, message['Source ID'])"
    ],
    [
      "/var/task/index.py",
      48,
      "copy_latest_snapshot",
      "CopyTags=True"
    ],
    [
      "/var/runtime/botocore/client.py",
      317,
      "_api_call",
      "return self._make_api_call(operation_name, kwargs)"
    ],
    [
      "/var/runtime/botocore/client.py",
      615,
      "_make_api_call",
      "raise error_class(parsed_response, operation_name)"
    ]
  ]
}

Many thanks again.

CloudFormation stack fails if created in non-source region

Hi @pbudzon - when trying to deploy the CloudFormation stack for cross-region RDS snapshot copy to the destination region the stack creation fails. Our snapshots are encrypted.

The evidence is in the third post - the previous image on this post was from a different Lambda function.

Many thanks!

RDS-snapshot: Trigger with Event Bridge + cron failed

Hello! I was wondering if it is possible to adapt the template so that, in addition to calling the lambda function with sns, it is possible to include or change the trigger for an event with Event Bridge that allows the lambda to be executed at certain times on certain days.

I have tried to do it "raw" by manually adding an EventBridge trigger after the stack's creation, but when it is executed it does not seem to pass the parameters of the databases whose snapshots I want to copy, and it fails.

Resource import error

Hello,

When I try to create the CloudFormation stack from the json template I got this error:

"The following resource types are not supported for resource import: AWS::Lambda::Permission,AWS::RDS::EventSubscription,AWS::Lambda::Permission"

Can you advise?

Thanks,
Carlos

AccessDenied

Hello,

I'm getting the error below. Role has attached FullRDS rights.

Any idea?

13:08:02
START RequestId: a41934f6-9264-41f4-bd0c-cc2990538f3c Version: $LATEST

13:08:02
Latest snapshot found: 'rds:swat-rds-prd-2020-04-06-02-04' from 2020-04-06 02:04:32.297000+00:00

13:08:02
Checking if 'swat-rds-prd-None-rds-swat-rds-prd-2020-04-06-02-04' exists in target region

13:08:02
[ERROR] ClientError: An error occurred (AccessDenied) when calling the DescribeDBClusterSnapshots operation: Unknown Traceback (most recent call last): File "/var/task/lambda_function.py", line 237, in lambda_handler copy_latest_snapshot(account_id, cluster, True) File "/var/task/lambda_function.py", line 140, in copy_latest_snapshot print_encryption_info(source_snapshot_arn, is_aurora

13:08:02
END RequestId: a41934f6-9264-41f4-bd0c-cc2990538f3c

Lambda function is triggered but fails

Hi @pbudzon - and apologies if this is not the right place to discuss this error.

I've used your amazing code to deploy a CloudFormation stack to copy RDS snapshots from one region to another, following the readme instructions. First I tried to deploy it on the destination region and it failed, but then it was successfully deployed in the snapshot source region - the one that has the RDS instance that is generating the snapshots.

But the Lambda Function is failing to copy the snapshot. I don't know if it is related to the fact that our snapshots are encrypted or something else, but the log suggests that it is failing to identify that there's no copy of the snapshot in the destination region, like if the snapshot was already copied, and then the function quits without running the copy itself.

Can you advise on this please? Or is there a restriction on encrypted snapshots?

Many thanks.
