This article will introduce you to creating serverless PubSub microservices by building a simple Slack-based word counting service.
These microservices are based on AWS Lambda, a service that runs your code without requiring you to manage servers. The high-level overview is that you define events (called triggers) that cause a package of your code (called a function) to be invoked. Inside your package (a.k.a. function), a specific function within a file (called a handler) will be called.
If you're feeling a bit confused by the overloaded terminology, you are not alone. For now, here's the short list:
| Lambda term | Common name | Description |
|---|---|---|
| Trigger | AWS service | Component that invokes the Lambda |
| Function | Software package | Group of files needed to run code (includes libraries) |
| Handler | file.function in your package | The filename/function name to execute |
There are many different types of triggers (S3, API Gateway, Kinesis streams, and more! See this page for a complete list). Lambdas run in the context of a specific IAM Role. This means that, in addition to the features provided by your language of choice (Python, Node.js, Java, Scala), your Lambda can call out to other AWS services (like DynamoDB).
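To make the trigger/function/handler relationship concrete, here's a minimal sketch. The file name `app.py` and the return shape are illustrative; the dict-with-statusCode response is what an API Gateway trigger expects back:

```python
# If this file is saved as app.py and the Lambda's Handler field is
# set to "app.handler", Lambda calls this function for every event.
def handler(event, context):
    # event: the trigger-specific payload (a dict)
    # context: runtime metadata (request id, remaining time, etc.)
    return {"statusCode": 200, "body": "hello from Lambda"}
```

So "Handler: app.handler" literally means "the function named handler, inside the file app.py, inside the uploaded package".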
These microservices, once built, will count words typed into Slack. The services are:

1. The first service splits the user input into individual words and:
   - increments the counter for each word
   - supplies a response to the user showing the current count of any seen words
   - triggers services 2 and 3, which execute concurrently
2. The second service also splits the user input into individual words and:
   - adds a count of 10 to each of those words
3. The third service logs the input it receives.
While you might not have a specific need for a word counter, the concepts demonstrated here could be applied elsewhere. For example, you may have a project where you need to run several things in series, or perhaps you have a single event that needs to trigger concurrent workflows.
For example:
- Concurrent workflows triggered by a single event:
- New user joins org, and needs accounts created in several systems
- Website user is interested in a specific topic, and you want to curate additional content to present to the user
- There is a software outage, and you need to update several systems ( statuspage, nagios, etc ) at the same time
- Website clicks need to be tracked in a system used by Operations, and a different system used by Analytics
- Serial workflows triggered by a single event:
- New user needs a Google account created, then that Google account needs to be given permission to access another system integrated with Google auth.
- A new version of software needs to be packaged, then deployed, then activated
- A cup is inserted into a coffee machine, then the coffee machine dispenses coffee into the cup
- The API Gateway (trigger) will call a Lambda function that will:
  - split whatever text it is given into individual words
  - upsert a key in a DynamoDB table with the number 1
  - drop a message on an SNS topic
- The SNS topic (trigger) will have two Lambda functions attached to it that will:
  - upsert the same keys in DynamoDB with the number 10
  - log a message to CloudWatch Logs
It looks like this picture here
Example code for AWS Advent near-code-free PubSub is available in the accompanying repo.

Technologies you'll use:
- Slack ( outgoing webhooks )
- API Gateway
- IAM
- SNS
- Lambda
- DynamoDB
I came into the world of computing by way of The Operations Path. The Publish-Subscribe Pattern has always been near and dear to my ❤️.
There are a few things about PubSub that I really appreciate as an "infrastructure person".
- Scalability. In terms of the transport layer (usually a message bus of some kind), the ability to scale is separate from the publishers and the consumers. In this wonderful thing which is AWS, we as infrastructure admins can get out of this aspect of the business of running PubSub entirely.
- Loose coupling. In the happy path, publishers don't know anything about what subscribers are doing with the messages they publish. There's admittedly a little hand-waving here, and folks new to PubSub (and sometimes those who are experienced) get rude surprises as messages mutate over time.
- Asynchronicity. This is not necessarily inherent in the PubSub pattern, but it's the most common implementation I've seen. Quite a lot of pressure can be absent from Dev Teams, Operations Teams, or DevOps Teams when the business has no expectation that systems will retain single-millisecond response times.
- New cloud ways. Once upon a time, we needed to queue messages in PubSub systems (and you might still have a need for that feature), but with Lambda we can also invoke consumers on demand as messages pass through our system. We don't necessarily have to keep things in the queue at all. A message appears, processing code runs, everybody's happy.
One of the biggest benefits that we can enjoy from being hosted with AWS is not having to manage stuff. Running your own message bus might be something that separates your business from your competition, but it might also be undifferentiated heavy lifting.
IMO, if AWS can and will handle scaling issues for you ( to say nothing of only paying for the transactions that you use ), then it might be the right choice for you to let them take care of that for you.
I would also like to point out that running these things without servers isn't quite the same thing as running them in a traditional setup. I ended up redoing this implementation a few times as I kept finding the rough edges of running things serverless. All were ultimately addressable, but I wanted to keep the complexity of this down somewhat.
TL;DR
GIMMIE SOME EXAMPLES
CloudFormation is pretty well covered by AWS Advent, so we'll configure this little diddy via the AWS console.
👇 You can follow the steps below, or view this video 👉
- Console
- DynamoDB
- Create Table
- Table Name: table
- Primary Key: word
- Create
This Lambda accepts the input from a Slack outgoing webhook, splits the input into separate words, and adds a count of one to each word. It also returns a JSON response body to the outgoing webhook, which displays a message in Slack.
If the Lambda is triggered with the input awsadvent some words, it will create the following three keys in DynamoDB and give each the value of one:
- awsadvent = 1
- some = 1
- words = 1
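The splitting-and-incrementing logic can be sketched like this. The word splitting is pure Python; `increment_in_dynamo` is a hypothetical helper (not the repo's actual app.py) showing a DynamoDB atomic-counter upsert via boto3's ADD update expression:

```python
def split_words(text):
    # "awsadvent some words" -> ["awsadvent", "some", "words"]
    return text.split()

def increment_in_dynamo(word, amount=1, table_name="table"):
    # Hypothetical helper: atomically upserts the counter for one word.
    # boto3 is imported lazily so the pure logic above stays importable
    # without AWS credentials.
    import boto3
    client = boto3.client("dynamodb")
    client.update_item(
        TableName=table_name,
        Key={"word": {"S": word}},
        # ADD creates the attribute if it's missing, so this works as an
        # upsert: a brand-new word starts at `amount`
        UpdateExpression="ADD wordcount :inc",
        ExpressionAttributeValues={":inc": {"N": str(amount)}},
    )
```

Because ADD is atomic on the DynamoDB side, concurrent Lambda invocations incrementing the same word won't clobber each other.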
👇 You can follow the steps below, or view this video 👉
- Make the first Lambda, which accepts Slack outgoing webhook input and saves it in DynamoDB
- Console
- Lambda
- Get Started Now
- Select Blueprint
- Blank Function
- Configure Triggers
- Click in the empty box
- Choose API Gateway
- API Name
aws_advent
( This will be the /PATH of your API Call )
- Security
- Open
- Name
aws_advent
- Runtime
- Python 2.7
- Code Entry Type
- Inline
- It's included as app.py in this repo. There are more Lambda Packaging Examples here
- Environment Variables
- DYNAMO_TABLE = table
- Handler
app.handler
- Role
- Create new role from template(s)
- Name
aws_advent_lambda_dynamo
- Policy Templates
- Simple Microservice permissions
- Triggers
- API Gateway
- save the URL
👇 You can follow the steps below, or view this video 👉
- Set up an outgoing webhook in your favorite Slack team.
- Manage
- Search
- outgoing webhooks
- Channel ( optional )
- Trigger words
- awsadvent
- URLs
- Your API Gateway Endpoint on the Lambda from above
- Customize Name
- awsadvent-bot
- Go to Slack
We're using an SNS topic as a broker. The producer (the aws_advent Lambda) publishes messages to the SNS topic. Two other Lambdas will be consumers of the SNS topic, and they'll be triggered as new messages come into the topic.
👇 You can follow the steps below, or view this video 👉
- Console
- SNS
- New Topic
- Name
awsadvent
- Note the topic ARN
This permission will allow the first Lambda to talk to the SNS topic. You also need to set an environment variable on the aws_advent Lambda so it can talk to the SNS topic.
👇 You can follow the steps below, or view this video 👉
- Give additional IAM permissions on the role for the first lambda
- Console
- IAM
- Roles
aws_advent_lambda_dynamo
- Permissions
- Inline Policies
- click here
- Policy Name
aws_advent_lambda_dynamo_snspublish
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": "sns:Publish",
          "Resource": "arn:aws:sns:*:*:awsadvent"
        }
      ]
    }
👇 You can follow the steps below, or view this video 👉
There's a conditional in the aws_advent Lambda that will publish to an SNS topic if the SNS_TOPIC_ARN environment variable is set. Set it, and watch more PubSub magic happen.

- Add the SNS_TOPIC_ARN environment variable to the aws_advent Lambda
- Console
- Lambda
- aws_advent
- Scroll down
- SNS_TOPIC_ARN = the SNS topic ARN from above
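That conditional might look something like this sketch (not the repo's exact code). The `publish` callable is injected so the AWS call can be stubbed; the real Lambda would pass boto3's `sns_client.publish`:

```python
import json
import os

def maybe_publish(words, publish):
    """Publish the word list to SNS only when SNS_TOPIC_ARN is set."""
    topic_arn = os.environ.get("SNS_TOPIC_ARN")
    if not topic_arn:
        # Env var absent: skip PubSub entirely; the first Lambda
        # still counts words and answers Slack on its own
        return False
    publish(TopicArn=topic_arn, Message=json.dumps(words))
    return True
```

This is why the variable is optional: everything up to this point works without SNS, and setting the ARN is what switches the fan-out on.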
This microservice increments the values collected by the aws_advent Lambda. In a real-world application, I would probably not have a second Lambda function update values in a database that were originally input by another Lambda function. It's useful here to show how work can be done outside of the Request->Response flow for a request. A less contrived example might be a Lambda that checks for words with high counts, to build a leaderboard of words.
This Lambda function will subscribe to the SNS Topic, and it is triggered when a message is delivered to the SNS Topic. In the real world, this Lambda might do something like copy data to a secondary database that internal users can query without impacting the user experience.
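When SNS invokes a Lambda, the published payload arrives wrapped in a standard event envelope. A minimal sketch of unwrapping it (this assumes the producer published a JSON-encoded word list, as in the sketch earlier):

```python
import json

def extract_messages(event):
    # SNS delivers each published payload as a string under
    # Records[n].Sns.Message; decode every record in the event
    return [json.loads(r["Sns"]["Message"]) for r in event["Records"]]
```

A consumer handler would iterate over `extract_messages(event)` and do its work (add 10 per word, or just log) for each payload.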
👇 You can follow the steps below, or view this video 👉
- Console
- lambda
- Create a Lambda function
- Select Blueprint
  1. search sns
  2. sns-message (python2.7 runtime)
- Configure Triggers
- SNS topic: awsadvent
- click enable trigger
- Name
sns_multiplier
- Runtime
- Python 2.7
- Code Entry Type
- Inline
- It's included as sns_multiplier.py in this repo.
- Handler
- sns_multiplier.handler
- Role
- Create new role from template(s)
- Policy Templates
- Simple Microservice permissions
- Next
- Create Function
Now that you have the most interesting parts hooked up together, test it out!
What we'd expect to happen is pictured here 👉
👇 Writeup is below, or view this video 👉
- The first time we sent a message, the count of the number of times the words are seen is one. This is provided by our first Lambda.
- The second time we sent a message, the count of the number of times the words are seen is twelve. This is a combination of our first and second Lambdas working together.
  - The first invocation set the count to current(0) + 1, and passed the words off to the SNS topic. The value of each word in the database was set to 1.
  - After SNS received the message, it ran the sns_multiplier Lambda, which added ten to the value of each word: current(1) + 10. The value of each word in the database was set to 11.
  - The second invocation set the count of each word to current(11) + 1. The value of each word in the database was set to 12.
The output of this Lambda will be viewable in the CloudWatch Logs console; it's only there to show that we could do something else (anything else, even) with this microservice implementation.
- Console
- lambda
- Create a Lambda function
- Select Blueprint
  1. search sns
  2. sns-message (python2.7 runtime)
- Configure Triggers
- SNS topic: awsadvent
- click enable trigger
- Name
sns_logger
- Runtime
- Python 2.7
- Code Entry Type
- Inline
- It's included as sns_logger.py in this repo.
- Handler
- sns_logger.handler
- Role
- Create new role from template(s)
- Policy Templates
- Simple Microservice permissions
- Next
- Create Function
PubSub is an awesome model for some types of work, and in AWS with Lambda we can work inside this model relatively simply. Plenty of real-world work depends on the PubSub model.
You might translate this project to things that you do need to do like software deployment, user account management, building leaderboards, etc.
It's ok to lean on AWS for the heavy lifting. As our word counter becomes more popular, we probably won't have to do anything at all to scale with traffic. Having our code execute on a request-driven basis is a big win from my point of view. "Serverless" computing is a very interesting development in cloud computing. Look for ways to experiment with it, there are plenty of benefits to it ( other than novelty ).
Some benefits you can enjoy via serverless PubSub in AWS:
- Scaling the publishers.
Since this used API Gateway to terminate user requests to a Lambda function:
- You don't have idle resources burning money, waiting for traffic
- You don't have to scale because traffic has increased or decreased
- Scaling the bus / interconnection.
SNS did the following for you:
- Scaled to accommodate the volume of traffic we send to it
- Provided HA for the bus
- Pay-per-transaction. You don't have to pay for idle resources!
- Scaling the consumers.
Having Lambda functions trigger on messages delivered to SNS:
- Scaled the Lambda invocations to the volume of traffic
- Provided some measure of HA
Lambda is a new technology. If you use it, you will find some rough edges.
The API Gateway is a new technology. If you use it, you will find some rough edges.
Don't let that dissuade you from trying them out!
I'm open to further discussion on these topics. Find me on Twitter: @edyesed