GithubHelp home page GithubHelp logo

aws-samples / amazon-msk-java-app-cdk Goto Github PK

View Code? Open in Web Editor NEW
38.0 7.0 20.0 339 KB

This project provides and example of end to end data processing application created using the combination of Amazon Managed Streaming for Apache Kafka (Amazon MSK), AWS Fargate, AWS Lambda and Amazon DynamoDB. Business logic is implemented in Java and Typescript. The build and deployment of the application is fully automated using AWS CDK.

License: MIT No Attribution

JavaScript 3.92% TypeScript 39.42% Dockerfile 0.93% Java 50.37% Shell 5.36%
aws-cdk aws-msk aws-fargate aws-lambda cdk apache-kafka kafka

amazon-msk-java-app-cdk's Introduction

Building Apache Kafka data processing Java application using AWS CDK

Table of contents

  1. Introduction
  2. Architecture
  3. Project structure
  4. Prerequisites
  5. Tools and services
  6. Usage
  7. Clean up

Introduction

This project provides an example of Apache Kafka data processing application. The build and deployment of the application if fully automated using AWS CDK.

Project consists of three main parts:

  • AWS infrastructure and deployment definition - AWS CDK scripts written in Typescript
  • AWS Lambda function - sends messages to Apache Kafka topic using KafkaJS library. It is implemented in Typescript.
  • Consumer application - Spring Boot Java application containing main business logic of the data processing pipeline. It consumes messages from Apache Kafka topic, performs simple validation and processing and stores results in Amazon DynamoDB table.

Provided example show cases two ways of packaging and deploying business logic using high-level AWS CDK constructs. One way using Dockerfile and AWS CDK ECS ContainerImage to deploy Java application to AWS Fargate and the other way is using AWS CDK NodejsFunction construct to deploy Typescript AWS Lambda code

Architecture

architecture

Triggering the TransactionHandler Lambda function publishes messages to an Apache Kafka topic. The application is packaged in a container and deployed to ECS Fargate, consumes messages from the Kafka topic, processes them, and stores the results in an Amazon DynamoDB table. The KafkaTopicHandler Lambda function is called once during deployment to create Kafka topic. Both the Lambda function and the consumer application publish logs to Amazon CloudWatch.

Project structure

Prerequisites

Tools and services

  • AWS CDK โ€“ The AWS Cloud Development Kit (AWS CDK) is a software development framework for defining your cloud infrastructure and resources by using programming languages such as TypeScript, JavaScript, Python, Java, and C#/.Net.
  • Amazon MSK - Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed service that makes it easy for you to build and run applications that use Apache Kafka to process streaming data.
  • AWS Fargate โ€“ AWS Fargate is a serverless compute engine for containers. Fargate removes the need to provision and manage servers, and lets you focus on developing your applications.
  • AWS Lambda - AWS Lambda is a serverless compute service that lets you run code without provisioning or managing servers, creating workload-aware cluster scaling logic, maintaining event integrations, or managing runtimes.
  • Amazon DynamoDB - Amazon DynamoDB is a key-value and document database that delivers single-digit millisecond performance at any scale.

Usage

  1. Start the Docker daemon on your local system. The AWS CDK uses Docker to build the image that is used in the AWS Fargate task. You must run Docker before you proceed to the next step.
  2. export AWS_PROFILE=<REPLACE WITH YOUR AWS PROFILE NAME> or alternatively follow instructions in the AWS CDK documentation
  3. (First-time AWS CDK users only) Follow instructions on AWS CDK documentation page to bootstrap AWS CDK
  4. cd scripts
  5. Run deploy.sh script
  6. Trigger lambda function to send message to Kafka queue. You can use below command line or alternatively you can trigger lambda function from AWS console. Fell free to change values of accountId and value fields in the payload JSON and send multiple messages with different payloads to experiment with the application.
    aws lambda invoke --cli-binary-format raw-in-base64-out --function-name TransactionHandler --log-type Tail --payload '{ "accountId": "account_123", "value": 456}' /dev/stdout --query 'LogResult' --output text | base64 -d
  7. To view results in DynamoDB table you can run below command. Alternatively you can navigate to Amazon DynamoDB AWS console and select Accounts table.
    aws dynamodb scan --table-name Accounts --query "Items[*].[id.S,Balance.N]" --output text
  8. You can also view CloudWatch logs in AWS console or by running aws logs tail command with specified CloudWatch Logs group

Clean up

Follow AWS CDK instructions to remove AWS CDK stacks from your account. You can also use scripts/destroy.sh. Be sure to also manually remove Amazon DynamoDB table, clean up Amazon CloudWatch logs and remove Amazon ECR images to avoid incurring additional AWS infrastructure costs.

amazon-msk-java-app-cdk's People

Contributors

amazon-auto avatar pchot avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

amazon-msk-java-app-cdk's Issues

how to modify or adapt this class

Hello ,
We are trying to use public KafkaConsumer(DynamoDBService dynamoDBService, AdminClient adminClient, KafkaConsumerProperties properties) so as to consume the topic messages . However we are needing to use DynamoDBService so can this method signature be changed to
public KafkaConsumer(AdminClient adminClient, KafkaConsumerProperties properties) and will that work

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.