awslabs / aurora-snapshot-tool

The Snapshot Tool for Amazon Aurora automates the task of creating manual snapshots, copying them into a different account and a different region, and deleting them after a specified number of days

License: Apache License 2.0

Python 95.03% Makefile 4.97%

aurora-snapshot-tool's People

Contributors

dougneal, jinnko, larslevie, lefthand, mrcoronel, ninetyninenines, polaskj, rts-rob

aurora-snapshot-tool's Issues

Missing rds:AddTagsToResource

The problem:

is not authorized to perform: rds:AddTagsToResource on resource: <ResourceArn>

Appeared on:

  • dest-lambda Copy Snapshots Aurora
  • source-lambda Backup Snapshots Aurora

I solved it by adding the permission to the template under iamroleSnapshotsAurora, and it worked fine afterwards.
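
The workaround above amounts to one extra IAM statement on the role. A minimal sketch of such a statement, built in Python so it can be dumped to JSON and pasted into a policy document. The Sid and the wildcard Resource are assumptions for illustration, not values taken from the project's template:

```python
import json


def add_tags_statement(resource="*"):
    """Return an IAM policy statement that allows rds:AddTagsToResource.

    The wildcard resource is an assumption; narrowing it to the relevant
    snapshot ARNs would be closer to least privilege.
    """
    return {
        "Sid": "AllowTaggingSnapshots",  # hypothetical Sid
        "Effect": "Allow",
        "Action": ["rds:AddTagsToResource"],
        "Resource": resource,
    }


if __name__ == "__main__":
    print(json.dumps(add_tags_statement(), indent=2))
```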

Support for Aurora PostgreSQL?

The README.md says it's Aurora MySQL only. But then there is also #3, which is closed and says that PostgreSQL appears to be functional as well.

Can you tell me whether this tool supports Aurora PostgreSQL? If so, the README.md probably needs updating.

Thanks!

KMS ID or KMS ARN

Hi!

I'm having a hard time getting this tool to work: it keeps giving errors about KMS keys not being found, and I am not sure how to debug such errors.

What values are required for setting up the KMS key configuration? Should I use the KMS Key ID or the KMS Key ARN?

Public bucket access

Is there a way to create the S3 bucket so that it doesn't open up public access and still allows the Makefile, specifically this line, to work? I can only figure out how to make it work with a publicly accessible bucket.

Related question: why is that line necessary? I tried commenting it out, and everything seemed to work, at least with my setup. I acknowledge that it's not a security vulnerability, in that only open-source code is stored in the bucket, but it doesn't seem to follow the principle of least privilege.

Thanks in advance!

Snapshot clusters by tag

CloudFormation does not currently support setting the cluster identifier. This makes snapshotting based on this name less useful. Would it be possible to snapshot based on a tag?

Including automated snapshots increases run time and cost.

Problem

Automated snapshots are parsed by the tool, even though only manual snapshots can be shared and copied. This results in unnecessary pagination and additional runtime, which leads to additional cost.

Root Cause

Calls to describe_db_cluster_snapshots are not filtered by snapshot type.

Proposed Solution

Modify snapshot pagination calls to only apply to manually-created snapshots via the SnapshotType parameter.
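
A minimal sketch of the proposed filter, assuming `rds_client` is a boto3 RDS client (for example `boto3.client("rds")`). The function name is hypothetical, and this is not the tool's actual code:

```python
def list_manual_cluster_snapshots(rds_client):
    """Yield only manually created Aurora cluster snapshots.

    `rds_client` is expected to behave like a boto3 RDS client.
    """
    paginator = rds_client.get_paginator("describe_db_cluster_snapshots")
    # SnapshotType="manual" asks the API to leave automated snapshots out,
    # so fewer pages come back and fewer records are parsed.
    for page in paginator.paginate(SnapshotType="manual"):
        for snapshot in page["DBClusterSnapshots"]:
            yield snapshot
```

Filtering on the server side keeps the automated snapshots out of the response entirely, rather than fetching them and discarding them in the Lambda.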

Retention days parameter in the dest account not working as expected

Hello,
I have set up the snapshot tool for both Aurora PostgreSQL and Aurora MySQL, and for both of them I set RetentionDays to 3 in the destination account. The source account has a retention of 1 day.

I only see one snapshot in the destination account/region. I am not sure why it is deleting the other snapshots when we specified a 3-day retention in the destination. Can you please let me know if something else needs to be done?
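
For reference, the behaviour expected here is a purely age-based rule: a snapshot is deleted only once it is older than the retention period. The sketch below illustrates that expectation. It is not the tool's deletion code, and the `id` and `created` keys are hypothetical:

```python
from datetime import datetime, timedelta, timezone


def snapshots_to_delete(snapshots, retention_days, now=None):
    """Return the ids of snapshots older than `retention_days`.

    `snapshots` is a list of dicts with hypothetical keys "id" and
    "created" (a timezone-aware datetime).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [s["id"] for s in snapshots if s["created"] < cutoff]
```

Under this rule, daily snapshots with a retention of 3 days should leave about three copies in the destination rather than one.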

Snapshot issue

Every time the Step Function runs to take a snapshot of one instance, it throws "errorMessage": "Could not back up every cluster. Backups pending: 1". If I run it on multiple instances, it throws "errorMessage": "Could not back up every cluster. Backups pending: 3". Any idea how to solve it?

Failed to create stack

Hi Expert,

My DR plan is:

  1. Take snapshots in the us-west-1 region (the source).
  2. Copy the snapshots to the us-west-2 region (the destination).

Because this is a DR plan, I would like to host the snapshot tool in a region other than us-west-1. I assume we cannot use us-west-1 any more once something has happened, which means I cannot use S3 in us-west-1.

So I created an S3 bucket in us-west-2 for the us-west-1 stack, but I failed to create the stack in us-west-1.

Here is my mapping:

"Mappings": {
    "Buckets": {
        "us-east-1": {
            "Bucket": "snapshots-tool-aurora-us-east-1"
        },
        "us-west-1": {
            "Bucket": "xxxx-db-dr-tools"
        },
        "us-west-2": {
            "Bucket": "snapshots-tool-aurora-us-west-2"
        },
        "us-east-2": {
            "Bucket": "snapshots-tool-aurora-us-east-2"
        },
        "ap-southeast-2": {
            "Bucket": "snapshots-tool-aurora-ap-southeast-2"
        },
        "ap-northeast-1": {
            "Bucket": "snapshots-tool-aurora-ap-northeast-1"
        },
        "eu-west-1": {
            "Bucket": "snapshots-tool-aurora-eu-west-1"
        },
        "eu-west-2": {
            "Bucket": "snapshots-tool-aurora-eu-west-2-real"
        },
        "eu-central-1": {
            "Bucket": "snapshots-tool-aurora-eu-central-1"
        }
    }
},

Here is the error information:

10:28:07 UTC-0700 ROLLBACK_IN_PROGRESS AWS::CloudFormation::Stack snapshot-stack-ben The following resource(s) failed to create: [lambdaTakeSnapshotsAurora, lambdaDeleteOldSnapshotsAurora]. . Rollback requested by user.
10:28:06 UTC-0700 CREATE_FAILED AWS::Lambda::Function lambdaTakeSnapshotsAurora Resource creation cancelled
10:28:06 UTC-0700 CREATE_FAILED AWS::Lambda::Function lambdaDeleteOldSnapshotsAurora Error occurred while GetObject. S3 Error Code: PermanentRedirect. S3 Error Message: The bucket is in this region: us-west-2. Please use this region to retry the request (Service: AWSLambda; Status Code: 400; Error Code: InvalidParameterValueException; Request ID: dc5c1ed1-c343-11e8-a7e0-adf5d5e6a444)

Ben
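
The PermanentRedirect error in the log says the bucket mapped for us-west-1 actually lives in us-west-2. Lambda expects the code bucket to be in the same region as the function being created. A minimal sketch of a pre-flight check on such a mapping, where `bucket_locations` is a hypothetical dict of bucket name to actual region (in practice it could be filled from S3's GetBucketLocation call):

```python
def mismatched_buckets(mappings, bucket_locations):
    """Return (stack_region, bucket, actual_region) for every bucket that
    is mapped to a region other than the one it really lives in.

    Buckets missing from `bucket_locations` are skipped, not reported.
    """
    problems = []
    for region, entry in mappings.items():
        bucket = entry["Bucket"]
        actual = bucket_locations.get(bucket)
        if actual is not None and actual != region:
            problems.append((region, bucket, actual))
    return problems
```

Run against the mapping above, with xxxx-db-dr-tools recorded as living in us-west-2, the check would flag the us-west-1 entry.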

Same-Account snapshot copies

It doesn't look like same-account snapshot copying is available. Am I missing something? Is there a way to do this? Thanks!

Patterns match against snapshot instead of cluster

Pattern matching in the share function matches against the DBClusterSnapshotIdentifier instead of the DBClusterIdentifier. This results in orphaned snapshots which are not shared, and therefore not copied to the destination account.

An example:

The pattern .+(-production)$ matches all clusters that end in -production. Given an Aurora cluster django-production and an Aurora cluster django-production-reporting, we want to copy the first but not the second.

The pattern matches correctly when taking snapshots, but appends YYYY-MM-DD-hh-mm to the DBClusterSnapshotIdentifier. When the share function executes, the DBClusterSnapshotIdentifier does not match the pattern, so the snapshot is not shared.
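
The mismatch can be reproduced with the pattern from the example. This is a sketch, not the tool's code: `should_share` is a hypothetical helper, and the snapshot dict only mimics the two identifier fields returned by the RDS API:

```python
import re

PATTERN = re.compile(r".+(-production)$")


def matches(identifier):
    """True if the identifier ends in -production, per the example pattern."""
    return PATTERN.search(identifier) is not None


def should_share(snapshot, pattern=PATTERN):
    """Decide on the owning cluster's identifier, not the snapshot's."""
    return pattern.search(snapshot["DBClusterIdentifier"]) is not None


snapshot = {
    "DBClusterIdentifier": "django-production",
    # The take-snapshot step appends a YYYY-MM-DD-hh-mm suffix (date made up here)
    "DBClusterSnapshotIdentifier": "django-production-2019-01-01-00-00",
}

print(matches(snapshot["DBClusterSnapshotIdentifier"]))  # the behaviour reported above
print(should_share(snapshot))                            # matching on the cluster instead
```

Because the pattern is anchored with `$`, the timestamp suffix prevents a match on the snapshot identifier, while the cluster identifier still matches.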

Need overly permissive policy to avoid KMSKeyNotAccessibleFault

Hi!

If I add the permissions to the KMS Key as described in https://docs.aws.amazon.com/kms/latest/developerguide/key-policy-modifying-external-accounts.html

{
    "Sid": "Allow an external account to use this CMK",
    "Effect": "Allow",
    "Principal": {
        "AWS": [
            "arn:aws:iam::444455556666:root"
        ]
    },
    "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:DescribeKey"
    ],
    "Resource": "*"
}

I get the following error when the lambda for local copy runs:
[ERROR] 2020-03-31T09:00:41.188Z An error occurred (KMSKeyNotAccessibleFault) when calling the CopyDBClusterSnapshot operation: The source snapshot KMS key [arn:aws:kms:eu-west-1:accountnumber:key/keynumber] does not exist, is not enabled or you do not have permissions to access it.

However, if I do the same thing, but change the "Action" to '*', like this:

{
    "Sid": "Allow an external account to use this CMK",
    "Effect": "Allow",
    "Principal": {
        "AWS": [
            "arn:aws:iam::444455556666:root"
        ]
    },
    "Action": "*",
    "Resource": "*"
}

It works without issues.

What permission might be missing? I am not comfortable with this overly permissive policy...

Thanks in advance!

How to check destination settings?

Hello,

I have set up one stack in the us-west-1 region for the source snapshots and one stack in the us-west-2 region for the destination snapshots. Two snapshots were created correctly in us-west-1, but I did not find any snapshots copied to us-west-2.

Here are destination settings:
Key | Value | Resolved Value
CodeBucket | DEFAULT_BUCKET |  
CrossAccountCopy | FALSE |  
DeleteOldSnapshots | TRUE |  
DestinationRegion | us-west-2 |  
KmsKeyDestination | None |  
KmsKeySource | None |  
LogLevel | ERROR |  
RetentionDays | 7 |  
SnapshotPattern | snapshot-db-dr |  
SNSTopic |   |  
SourceRegionOverride | NO

Here are source settings:
Key | Value | Resolved Value
BackupInterval | 2 |  
BackupSchedule | 0 19,21,23 * * ? * |  
ClusterNamePattern | snapshot-db-dr |  
CodeBucket | DEFAULT_BUCKET |  
DeleteOldSnapshots | TRUE |  
DestinationAccount | 000000000000 |  
LogLevel | ERROR |  
RetentionDays | 7 |  
ShareSnapshots | FALSE |  
SNSTopic |   |  
SourceRegionOverride | NO

Thanks,
Ben

Add option to use a different KMS key when sharing

This project gave me a great starting point, so first of all, thank you for that!
Just as reported in the sister project (e.g. awslabs/rds-snapshot-tool#60), my use case was to share a snapshot with a different account. Since some of the clusters use the default KMS key, this doesn't really work: the destination account can never access the needed KMS key and therefore can't make a local copy.

To fix this I implemented an extra copy step in the source account (after the take snapshot) to bring all the snapshots to use the same KMS key and then share them. It's potentially less efficient as it generates an extra snapshot copy, but it makes the process generic after that point.

Maybe it's something that can be added to the project, but it would make the generic solution more complex.

Unable to copy AWS managed KMS encrypted snapshots between accounts

I have an Aurora cluster, encrypted with the AWS managed KMS key. The snapshot tool successfully creates a snapshot in the source account, and shares it with the backup account, however the copy into the backup account fails with:

The source snapshot KMS key [arn:aws:kms:(source account arn)] does not exist, is not enabled or you do not have permissions to access it.

Being an AWS managed key, I'm (as far as I can tell) unable to grant the backup account permissions to use it.

I'd much rather not recreate the cluster with a CMK. Is there a way of working around this?

Same account, different region

Hello, I'm trying to use this project to copy Aurora snapshots from a source region (us-east-1) to a destination region (us-west-2) within the same AWS account.

Source snapshot is being created properly every day, but I see nothing in the dest region.

I'm wondering if I need a DB created in that region, because currently it is empty. I just wanted to have the snapshots replicated in another region.

Any hint?

Thanks.

Incomplete pagination prevents sharing

Problem

When an account has more than 25 Aurora cluster snapshots of any type, snapshots are no longer copied over to the target account.

Root Cause

snapshots_tool_utils.py uses custom pagination code. Somehow this pagination is limited to 25 rows. Once an account has more than 25 manual backups, snapshots are created but not shared, and thus are not copied over.

Proposed Solution

Modify snapshots_tool_utils.py to use the built-in Boto paginator. A PR will be submitted against this issue.
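
The custom pagination code is not shown in this report, so the following only sketches what complete pagination has to do: keep following the `Marker` token until the API stops returning one. The function name is hypothetical, and `rds_client` is assumed to behave like a boto3 RDS client. The built-in paginator (`get_paginator("describe_db_cluster_snapshots")`) does this bookkeeping itself, which is why the proposal prefers it:

```python
def describe_all_cluster_snapshots(rds_client, **filters):
    """Collect every page of describe_db_cluster_snapshots results."""
    snapshots = []
    kwargs = dict(filters)
    while True:
        response = rds_client.describe_db_cluster_snapshots(**kwargs)
        snapshots.extend(response["DBClusterSnapshots"])
        marker = response.get("Marker")
        if not marker:
            # No Marker means this was the last page
            return snapshots
        # Ask for the page that follows the one just received
        kwargs["Marker"] = marker
```

Stopping after the first response, or dropping the marker between calls, would produce the truncated listing described above.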

Parameters: [CodeBucket] must have values

There is an error when applying the template:

Parameters: [CodeBucket] must have values

Documentation for that parameter says

Name of the bucket that contains the lambda functions to deploy. Leave the default value to download the code from the AWS Managed buckets

Occasional rate limiting errors

We are running this tool with quite a lot of snapshots being generated (once an hour, kept for 7 days for multiple clusters).

In this configuration, we are seeing occasional errors due to rate limiting:

[ERROR]	2018-10-08T02:30:33.680Z	6153e97f-733f-4764-899c-a89c052312c8	Exception sharing dev-aurora-cluster20181001042229986400000001-2018-10-04-23-00: An error occurred (Throttling) when calling the ModifyDBClusterSnapshotAttribute operation (reached max retries: 4): Rate exceeded

or

An error occurred (Throttling) when calling the ListTagsForResource operation (reached max retries: 4): Rate exceeded: ClientError
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 49, in lambda_handler
    ResourceName=snapshot_arn)
  File "/var/runtime/botocore/client.py", line 314, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/var/runtime/botocore/client.py", line 612, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (Throttling) when calling the ListTagsForResource operation (reached max retries: 4): Rate exceeded

This happens often enough that it not only fails the Lambda but also the Step Function after retries.

Is there a way to reduce the number of calls the tool makes or are we simply going beyond the limit of what's possible with the current implementation?
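
Whether the tool can make fewer calls is a question for the maintainers. On the caller's side, one common mitigation is to retry throttled calls with exponential backoff and jitter. A minimal sketch, assuming the raised exception carries a botocore-style `response` dict. `call_with_backoff` and its parameters are hypothetical names and not part of the tool:

```python
import random
import time


def call_with_backoff(func, *args, max_attempts=8, base_delay=0.5,
                      max_delay=20.0, sleep=time.sleep, **kwargs):
    """Call func, retrying with jittered exponential backoff on Throttling.

    Any other error, or a Throttling error on the last attempt, is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            # botocore's ClientError exposes the error code in exc.response
            code = getattr(exc, "response", {}).get("Error", {}).get("Code")
            if code != "Throttling" or attempt == max_attempts:
                raise
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: wait a random time up to the current ceiling
            sleep(random.uniform(0, delay))
```

A call such as `call_with_backoff(client.list_tags_for_resource, ResourceName=snapshot_arn)` would then wait and retry instead of failing the Lambda after botocore's own retries run out. botocore's retry count can also be raised through its `Config(retries={"max_attempts": ...})` option, which may be enough on its own.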
