irods_client_aws_lambda_s3

This AWS Lambda function updates an iRODS Catalog with events that occur in one or more S3 buckets.

Files created, renamed, or deleted in S3 appear quickly in iRODS.

The following AWS configurations are supported at this time:

  • S3 -> Lambda -> iRODS
  • S3 -> SNS -> Lambda -> iRODS
  • S3 -> SQS -> Lambda -> iRODS

iRODS is assumed to have its associated S3 Storage Resource(s) configured with HOST_MODE=cacheless_attached.

If SQS is involved, it is assumed to be configured with batch_size = 1.
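The three delivery paths above wrap the S3 notification differently: a direct invocation carries the S3 records themselves, while SNS and SQS deliver the S3 event as a JSON string inside their own envelope. The following is a minimal sketch of how a handler might normalize all three; the function name extract_s3_records is hypothetical and is not taken from this project's source.

```python
import json


def extract_s3_records(event):
    """Return the S3 event records from a direct, SNS-wrapped, or SQS-wrapped event."""
    s3_records = []
    for record in event.get("Records", []):
        if "s3" in record:
            # S3 -> Lambda: the record is already an S3 event record.
            s3_records.append(record)
        elif "Sns" in record:
            # S3 -> SNS -> Lambda: the S3 event is a JSON string in Sns.Message.
            inner = json.loads(record["Sns"]["Message"])
            s3_records.extend(inner.get("Records", []))
        elif "body" in record:
            # S3 -> SQS -> Lambda: the S3 event is a JSON string in the message body.
            inner = json.loads(record["body"])
            s3_records.extend(inner.get("Records", []))
    return s3_records
```

Messages without a Records list, such as the test message S3 sends when a notification is first configured, yield no records in this sketch.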

Configuration

Function

Handler: irods_client_aws_lambda_s3.lambda_handler

Runtime: Python 3.7

Environment Variables:

IRODS_COLLECTION_PREFIX : /tempZone/home/rods/lambda

IRODS_ENVIRONMENT_SSM_PARAMETER_NAME : irods_default_environment

IRODS_MULTIBUCKET_SUFFIX : _s3
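As an illustration of how IRODS_COLLECTION_PREFIX might be used, the sketch below builds an iRODS logical path from a bucket name and an object key. The layout prefix/bucket/key and the function name logical_path_for are assumptions made for this example; the actual function may arrange paths differently.

```python
import os
import posixpath
from urllib.parse import unquote_plus


def logical_path_for(bucket, key, environ=os.environ):
    """Build a logical path as <prefix>/<bucket>/<key> (assumed layout)."""
    prefix = environ["IRODS_COLLECTION_PREFIX"]
    # S3 event notifications URL-encode object keys (a space arrives as '+').
    decoded_key = unquote_plus(key)
    # Drop leading slashes so posixpath.join does not discard the prefix.
    return posixpath.join(prefix, bucket, decoded_key.lstrip("/"))
```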

Triggers

You must configure the Lambda function to trigger on all ObjectCreated and ObjectRemoved events for each connected S3 bucket.
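Each S3 event record carries an eventName such as ObjectCreated:Put or ObjectRemoved:Delete. A handler could dispatch on that name roughly as follows; the action labels "register" and "unregister" and the function name are hypothetical and only illustrate the idea.

```python
def action_for_event(event_name):
    """Map an S3 eventName to a catalog action label (labels are illustrative)."""
    # Trigger configuration uses an "s3:" prefix; tolerate it here as well.
    if event_name.startswith("s3:"):
        event_name = event_name[len("s3:"):]
    if event_name.startswith("ObjectCreated:"):
        return "register"
    if event_name.startswith("ObjectRemoved:"):
        return "unregister"
    # Any other event type is ignored in this sketch.
    return None
```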

iRODS Connection Environment

The connection information is stored in the AWS Systems Manager > Parameter Store as a JSON object string.

https://console.aws.amazon.com/systems-manager/parameters

Create a parameter with:

Name (must match IRODS_ENVIRONMENT_SSM_PARAMETER_NAME above):

irods_default_environment

Type:

SecureString

Value:

{
    "irods_default_resource": "s3Resc",
    "irods_host": "irods.example.org",
    "irods_password": "rods",
    "irods_port": 1247,
    "irods_user_name": "rods",
    "irods_zone_name": "tempZone"
}
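A minimal sketch of reading that parameter at runtime with boto3 is shown below. The function name load_irods_environment and the optional ssm_client argument (useful for testing without AWS access) are assumptions for this example, not part of the project.

```python
import json
import os


def load_irods_environment(ssm_client=None, environ=os.environ):
    """Fetch and parse the iRODS connection JSON from the Parameter Store."""
    parameter_name = environ["IRODS_ENVIRONMENT_SSM_PARAMETER_NAME"]
    if ssm_client is None:
        # Imported lazily so the function can be exercised with a stub client.
        import boto3
        ssm_client = boto3.client("ssm")
    # WithDecryption=True is needed because the parameter is a SecureString.
    response = ssm_client.get_parameter(Name=parameter_name, WithDecryption=True)
    return json.loads(response["Parameter"]["Value"])
```

The Lambda execution role needs permission to read the parameter (and to use the KMS key that encrypts it) for this call to succeed.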

Configuration Options

SSL Support

To connect to an SSL-enabled iRODS Zone, include the following additional keys in the environment JSON stored in the Parameter Store:

    "irods_client_server_negotiation": "request_server_negotiation",
    "irods_client_server_policy": "CS_NEG_REQUIRE",
    "irods_encryption_algorithm": "AES-256-CBC",
    "irods_encryption_key_size": 32,
    "irods_encryption_num_hash_rounds": 16,
    "irods_encryption_salt_size": 8,
    "irods_ssl_verify_server": "cert",
    "irods_ssl_ca_certificate_file": "irods.crt"

Note that irods_ssl_ca_certificate_file is a path to a certificate file (or certificate chain file), relative to the root of the Lambda package.
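One way the relative certificate path could be resolved and turned into an SSL context is sketched below. The helper names are hypothetical, and package_dir is assumed to be the directory where the Lambda package is unpacked (AWS exposes this as the LAMBDA_TASK_ROOT environment variable).

```python
import os
import ssl


def resolve_ca_certificate(irods_env, package_dir):
    """Return the full path of the CA certificate, or None if not configured."""
    relative_path = irods_env.get("irods_ssl_ca_certificate_file")
    if relative_path is None:
        return None
    return os.path.join(package_dir, relative_path)


def build_ssl_context(irods_env, package_dir):
    """Create a server-authenticating SSL context from the configured CA file."""
    cafile = resolve_ca_certificate(irods_env, package_dir)
    # With cafile=None, Python falls back to the system's default CA store.
    return ssl.create_default_context(purpose=ssl.Purpose.SERVER_AUTH, cafile=cafile)
```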

Multi-Bucket Support

This Lambda function can be configured to receive events from multiple sources at the same time.

If the irods_default_resource is NOT defined in the environment in the Parameter Store, then the Lambda function will derive the name of a target iRODS Resource.

By default, the Lambda function will append _s3 to the incoming bucket name.

For example, if the incoming event comes from bucket example_bucket, then the iRODS resource that would be targeted would be example_bucket_s3.

However, if IRODS_MULTIBUCKET_SUFFIX is defined as -S3Resc, then the iRODS resource that would be targeted would be example_bucket-S3Resc.
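The naming rule described above can be sketched as a small function. The name target_resource_name is hypothetical; the logic follows the description in this section rather than the project's actual code.

```python
import os


def target_resource_name(irods_env, bucket_name, environ=os.environ):
    """Pick the iRODS resource for an event coming from bucket_name."""
    # An explicit irods_default_resource in the Parameter Store JSON wins.
    if "irods_default_resource" in irods_env:
        return irods_env["irods_default_resource"]
    # Otherwise derive the name from the bucket, using _s3 as the default suffix.
    suffix = environ.get("IRODS_MULTIBUCKET_SUFFIX", "_s3")
    return bucket_name + suffix
```

For example, target_resource_name({}, "example_bucket", {}) returns example_bucket_s3, and passing {"IRODS_MULTIBUCKET_SUFFIX": "-S3Resc"} as the last argument returns example_bucket-S3Resc.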

irods_client_aws_lambda_s3's Issues

investigate intermittent failure of session.collections.create()

Sometimes the recursive create call, which ensures that the parent collection of the about-to-be-registered S3 data object exists, raises CollectionDoesNotExist in its get() call.

Have not been able to replicate this behavior outside the Lambda environment.

try:
    session.collections.create(irods_collection_name, recurse=True)
except Exception as e:
    print(e)
    print('session.collections.create returned CollectionDoesNotExist on get()... TODO: investigate...')

ssl cert string fails when >4k in length

Amazon Parameter Store secrets are limited to 4k in length.

A sufficiently long certificate chain file can be longer than 4k and will not fit.

Need to find a workaround - easiest will be to load a relative path of a cert located in the Lambda package.
