This project demonstrates how Amazon DynamoDB can be used together with AWS Lambda to perform real-time and batch analysis of domain-specific data. Real-time analysis uses DynamoDB Streams as the event source of a Lambda function. Batch processing uses the parallel scan feature of DynamoDB to distribute work to Lambda functions. Although this is a Maven project, the AWS Lambda functions cannot be deployed with Maven. Use Eclipse to deploy the Lambda functions and to run the sample code.
- Install Eclipse on your computer
- Install the AWS Toolkit for Eclipse
- Install the Eclipse Maven plugin
- Run
git clone https://github.com/awslabs/reinvent2015-practicaldynamodb.git .
to clone the repository into the current directory on your local computer, then import it into Eclipse as a Maven project.
- Set up credentials
  - Inside the datasetinit package, change Constants.LOCAL_CRED_PROFILE_NAME to the desired profile name. For more information, see Providing AWS Credentials in the AWS SDK for Java.
- Create DynamoDB tables
  - In the datasetinit package, run the CreateFunctionTrackerTable, CreateHighScoresByDateTable, CreatePlayerStatsTable and CreateScoresTable classes.
- Create a Lambda function that uses DynamoDB Streams as its event source
  - Upload the streamhandling.ScoresTableTrigger Lambda function to the us-west-2 region (use any function name).
    - Right click the handleRequest method -> AWS Lambda -> Upload Function to AWS Lambda...
  - In the console, add a new event source for this Lambda function
    - DynamoDB table: Scores
    - Batch size: 100 (default)
    - Starting position: trim horizon
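To make the event source settings above more concrete, here is a minimal sketch in plain Java of what batch size and starting position mean. It does not use the AWS SDK and is not code from this project: the class `StreamBatchingSketch`, the enum `StartingPosition`, and the method `batches` are hypothetical names chosen for illustration. The sketch assumes that trim horizon means "start from the oldest record still retained in the stream" and that batch size is an upper bound on the records passed to one invocation; real batches can be smaller and are delivered per shard.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustration only: mimics how an event source with a given starting position
 * and batch size groups stream records into Lambda invocations.
 * All names here are hypothetical and not part of the AWS SDK or this project.
 */
final class StreamBatchingSketch {

    /** Hypothetical stand-in for the "Starting position" setting. */
    enum StartingPosition { TRIM_HORIZON, LATEST }

    /**
     * @param retained records already in the stream when the event source is added (oldest first)
     * @param arriving records written after the event source is added (oldest first)
     * @return the record batches, one inner list per simulated invocation
     */
    static <T> List<List<T>> batches(List<T> retained, List<T> arriving,
                                     StartingPosition start, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> toDeliver = new ArrayList<>();
        if (start == StartingPosition.TRIM_HORIZON) {
            // Trim horizon: begin with the oldest retained record.
            toDeliver.addAll(retained);
        }
        // With LATEST, only records written after setup are delivered.
        toDeliver.addAll(arriving);

        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < toDeliver.size(); i += batchSize) {
            int end = Math.min(i + batchSize, toDeliver.size());
            result.add(new ArrayList<>(toDeliver.subList(i, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> retained = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            retained.add(i);
        }
        List<Integer> arriving = List.of(250, 251);

        // 252 records in batches of at most 100: sizes 100, 100, 52.
        System.out.println(batches(retained, arriving, StartingPosition.TRIM_HORIZON, 100).size());
        // Only the 2 new records are delivered, in a single batch.
        System.out.println(batches(retained, arriving, StartingPosition.LATEST, 100).size());
    }
}
```

With trim horizon, records that were already in the Scores stream before the event source was added are processed as well, which is why it is a convenient choice for a demo.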
- Create the batch processing Lambda function
  - Upload the parallelscan.SegmentScannerFunctionHandler Lambda function to the us-west-2 region.
    - Function name: TableSegmentScannerFunction
    - Timeout: 300s (max)
- To test the stream trigger, right click the handleRequest method of streamhandling.ScoresTableTrigger -> AWS Lambda -> Run Function on AWS Lambda...
  - Use the dynamodb-event.insert.json file as input to simulate an insertion event.
  - In the AWS console, check that a new record has been inserted into the PlayerStats table.
- Run datasetinit.GenerateScores to simulate inserting records into the Scores table.
  - In the console, you will see records in the PlayerStats table being updated by the stream-triggered Lambda function set up earlier.
- Run the parallelscan.FunctionInvoker class.
  - After it finishes, check the HighScoresByDate table.
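The batch step relies on DynamoDB's parallel scan, in which each worker issues a Scan with its own Segment number (0 to TotalSegments - 1) and the same TotalSegments value, so that each worker reads a disjoint part of the table. The sketch below illustrates that idea in plain Java without the AWS SDK. It is not the project's FunctionInvoker: the class `ParallelScanSketch` and its methods are hypothetical, an in-memory list stands in for the table, local threads stand in for Lambda invocations, and a hash-modulo rule is only an assumed substitute for how DynamoDB assigns items to segments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustration only: one worker per segment, results merged at the end. */
final class ParallelScanSketch {

    /** Stand-in for one scan worker: visits only the items in its own segment. */
    static List<String> scanSegment(List<String> table, int segment, int totalSegments) {
        List<String> seen = new ArrayList<>();
        for (String key : table) {
            // Assumption: hash-modulo replaces DynamoDB's internal segment assignment.
            if (Math.floorMod(key.hashCode(), totalSegments) == segment) {
                seen.add(key);
            }
        }
        return seen;
    }

    /** Starts one worker per segment (0 .. totalSegments - 1) and merges their results. */
    static List<String> parallelScan(List<String> table, int totalSegments) {
        ExecutorService pool = Executors.newFixedThreadPool(totalSegments);
        try {
            List<Future<List<String>>> pending = new ArrayList<>();
            for (int segment = 0; segment < totalSegments; segment++) {
                final int s = segment;
                pending.add(pool.submit(() -> scanSegment(table, s, totalSegments)));
            }
            List<String> merged = new ArrayList<>();
            for (Future<List<String>> result : pending) {
                merged.addAll(result.get());
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a segment", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("segment scan failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> table = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            table.add("player-" + i);
        }
        // Every item belongs to exactly one segment, so all 1000 are visited once.
        System.out.println(parallelScan(table, 4).size());
    }
}
```

Because the segments do not overlap, the workers need no coordination while scanning; only the final merge combines their results.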
For more information, refer to the AWS re:Invent 2015 demo video on YouTube.