dorametrix
Know if you are running high-performing software development teams by making it clear how individual products (i.e. services, APIs, systems...) line up with the DORA metrics.
dorametrix is a Node.js-based service that helps you calculate your DORA metrics by inferring them from events you can create manually or with webhooks. It has pre-baked support for push and incident events from GitHub Actions, Bitbucket and Jira (only incidents!), and can be run "as is" as a web service, too.
It's super easy to get started with, comes pre-packaged, and needs just a tiny bit of fiddling with webhooks and settings on your end. Simply put, it's the easiest way you can start using DORA metrics in a scalable way today!
Credit where credit is due: This project is strongly influenced by Google Cloud's Four Keys project. The Four Keys project is great but is, for obvious reasons, Google Cloud-oriented. It also uses a SQL database and ETL pattern, which is less than ideal from a serverless perspective. In general, the approach, structuring, and nomenclature are the same here. The most notable difference is that dorametrix is better decoupled from the specifics of any one CI tool, and uses DynamoDB (NoSQL) instead of BigQuery.
How dorametrix works
At its heart, dorametrix is a serverless web service that collects and represents specific delivery-related events that you send to it, which are then stored in a database. As a user you can request the DORA metrics, which are calculated from those stored events.
For more exact information, see the section "What are the DORA metrics and how does dorametrix calculate them?" further down.
The events
The most basic data type we have is the Event. These are internally created from inputs to the service. For example, when you push a commit, an Event is added to the database. The event will contain information such as the commit SHA and the time of the commit. We keep the events indefinitely so we can have a complete record of all dorametrix events that have occurred.
dorametrix will also (on its own) add individual domain-specific records for (respectively) a Change, Deployment, or Incident, based on the incoming data. This is so we can easily follow up on those typologies and make the desired calculations.
- Changes essentially correspond to push-type lifecycle events, since these represent commits.
- Deployments are added by you manually, as a separate activity at the end of a deployment pipeline.
- Incidents are special, and more complex, since they have to be both raised and (later) resolved. This can be done manually by calling dorametrix, but it's highly recommended (and much more practical) to let your issue tracker send such events by webhook based on actual user interactions with issues or tickets.
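To make the relationship between events and the domain-specific records a bit more concrete, here is a minimal TypeScript sketch. All type, field, and function names (DoraEvent, toRecordType, and so on) are hypothetical and chosen for illustration; they are not taken from the dorametrix source.

```typescript
// Hypothetical shapes for illustration only; not the actual dorametrix data model.
type EventType = 'change' | 'deployment' | 'incident';
type RecordType = 'Change' | 'Deployment' | 'Incident';

interface DoraEvent {
  eventType: EventType;
  product: string;
  id: string; // e.g. a commit SHA for a change
  timeCreated: number; // Unix timestamp
}

// Each event type maps to exactly one domain-specific record type.
const recordTypeFor: Record<EventType, RecordType> = {
  change: 'Change',
  deployment: 'Deployment',
  incident: 'Incident'
};

// Returns the kind of record that would be stored alongside the raw event.
function toRecordType(event: DoraEvent): RecordType {
  return recordTypeFor[event.eventType];
}
```

The point of the sketch is only that the raw event is kept as-is, while a second, typed record makes the later calculations easier.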
Diagrams
Solution diagram
As it stands currently, dorametrix is implemented in an AWS-oriented manner. This should be fairly easy to modify so it works with other cloud platforms and with other persistence technologies. If there is sufficient demand, I might add extended support. Or you do it! Just make a PR and I'll see how we can proceed.
Code flow diagram
Please see the generated documentation site for more detailed information.
Prerequisites
- Recent Node.js (ideally 16+) installed.
- Amazon Web Services (AWS) account with sufficient permissions so that you can deploy infrastructure. A naive but simple policy would be full rights for CloudWatch, Lambda, API Gateway, DynamoDB, X-Ray, and S3.
- Ideally some experience with Serverless Framework as that's what we will use to deploy the service and infrastructure.
- You will need to deploy the stack prior to working with it locally as it uses actual infrastructure even in local mode.
Installation
Clone, fork, or download the repo as you normally would. Run npm install.
Commands
The commands below are the most important ones. See package.json for more commands!
- npm start: Run Serverless Framework in offline mode
- npm test: Test code
- npm run deploy: Deploy with Serverless Framework
- npm run build: Package and build the code with Serverless Framework
Configuration
Application settings
You can set certain values in serverless.yml.
Required
- API_KEY: The API key you want to use to (somewhat) secure your service. You will use this when calling the service later.
Optional
- DEPLOYMENT_PERIOD_DAYS: Number of days to include in the deployment window when calculating the "deployment frequency" metric (default: 7)
- REGION: The AWS region you want to use (default: eu-north-1)
- TABLE_NAME: The DynamoDB table name you want to use (default: dorametrix)
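As a rough illustration of how the required and optional settings interact, the following TypeScript sketch resolves them from a key-value map and falls back to the documented defaults. The resolveSettings function and the Settings interface are hypothetical, and passing the values in as a plain map is an assumption; how dorametrix itself reads these values from serverless.yml is not shown here.

```typescript
// Hypothetical helper for illustration; not part of dorametrix.
interface Settings {
  apiKey: string;
  deploymentPeriodDays: number;
  region: string;
  tableName: string;
}

// Resolves settings from a key-value map, using the documented defaults.
function resolveSettings(env: Record<string, string | undefined>): Settings {
  const apiKey = env['API_KEY'];
  if (!apiKey) throw new Error('API_KEY is required');

  const days = Number(env['DEPLOYMENT_PERIOD_DAYS'] || '7');
  if (isNaN(days) || days <= 0 || Math.floor(days) !== days) {
    throw new Error('DEPLOYMENT_PERIOD_DAYS must be a positive whole number');
  }

  return {
    apiKey,
    deploymentPeriodDays: days,
    region: env['REGION'] || 'eu-north-1',
    tableName: env['TABLE_NAME'] || 'dorametrix'
  };
}
```

Calling resolveSettings({ API_KEY: 'my-key' }) would yield the defaults for everything except the API key.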
Optional: Set up your incident handling webhook
This is highly recommended but not strictly necessary, though it will become quite a hassle if you do not automate incident handling.
GitHub
Create a webhook; see this guide if you need instructions.
Add your dorametrix endpoint URL, set the content type to application/json and select the event types Issues and Push.
Bitbucket
Create a webhook; see this guide if you need instructions.
Add your dorametrix endpoint URL and select the event types Repository:Push, Issue:Created and Issue:Updated.
Jira
Create a webhook; see this guide if you need instructions.
Add your dorametrix endpoint URL and select the event types Repository:Push, Issue:created, Issue:updated and Issue:deleted.
Note on security with webhook secrets
The current version of dorametrix does not have built-in support for GitHub webhook secrets, but if there is sufficient demand I might add such support.
Note that Bitbucket Cloud and Jira do not have support for webhook secrets: https://jira.atlassian.com/browse/BCLOUD-14683.
Running locally
Run npm start.
Note that it will attempt to connect to a database, so deploy the application and infrastructure before any local development.
Deployment
Run npm run deploy.
Raising dorametrix events
Creating deployments
You can create deployments either manually, with the provided script, or use ready-made actions or pipes to abstract that part for you.
Manually
Download the deployment.sh script in this repository. In your CI script, call the script at the end of your deployment, for example:
bash deployment.sh "$ENDPOINT" "$API_KEY" "$PRODUCT"
As seen above, the required inputs are:
- The endpoint
- The API key
- The product name
GitHub Actions action
An example using two user-provided secrets for endpoint and API key:
steps:
  - name: Checkout
    uses: actions/checkout@v2
    with:
      fetch-depth: 0
  - name: Dorametrix
    uses: mikaelvesavuori/[email protected]
    with:
      endpoint: ${{ secrets.DORAMETRIX_ENDPOINT }}
      api-key: ${{ secrets.DORAMETRIX_API_KEY }}
A full example is available at https://github.com/mikaelvesavuori/demo-dorametrix-action.
The specific action, mikaelvesavuori/[email protected], is available for use.
Bitbucket Pipelines pipe
An example using two user-provided secrets and setting the product with a known variable representing the repo name:
- step:
    name: Dorametrix
    script:
      - pipe: docker://mikaelvesavuori/dorametrix-pipe:1.0.0
        variables:
          ENDPOINT: '$ENDPOINT'
          API_KEY: '$API_KEY'
          PRODUCT: '$BITBUCKET_REPO_SLUG'
A full example is available at https://github.com/mikaelvesavuori/demo-dorametrix-pipe.
The specific pipe, docker://mikaelvesavuori/dorametrix-pipe:1.0.0, is available for use, but pulls of it are highly limited since it is hosted on a free plan. It's therefore highly recommended that you host or push your own image if you are within the Bitbucket + Docker Hub infrastructure!
Creating incidents
See below for the tool-specific conventions.
GitHub
- Open incident: Create a GitHub Issue with an incident label.
- Close incident: Close the Issue or remove the incident label.
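The GitHub convention above can be sketched as a small classifier. This is only an illustration: the IssueEvent shape is a simplified, hypothetical view of an issue webhook payload (not GitHub's actual field layout), classifyIssueEvent is a made-up name, and treating a later "labeled" action as opening an incident is an assumption rather than documented dorametrix behavior.

```typescript
// Simplified, hypothetical view of an issue webhook event; not GitHub's real payload shape.
interface IssueEvent {
  action: string; // e.g. 'opened', 'closed', 'labeled', 'unlabeled'
  issueLabels: string[]; // names of the labels currently on the issue
  changedLabel?: string; // the label that was added or removed, if any
}

type IncidentChange = 'opened' | 'closed' | 'ignored';

const INCIDENT_LABEL = 'incident';

// Maps an issue event to an incident state change, following the label convention.
function classifyIssueEvent(event: IssueEvent): IncidentChange {
  const hasLabel = event.issueLabels.indexOf(INCIDENT_LABEL) !== -1;

  if (event.action === 'opened' && hasLabel) return 'opened';
  // Assumption: adding the label to an existing issue also opens an incident.
  if (event.action === 'labeled' && event.changedLabel === INCIDENT_LABEL) return 'opened';
  if (event.action === 'closed' && hasLabel) return 'closed';
  if (event.action === 'unlabeled' && event.changedLabel === INCIDENT_LABEL) return 'closed';

  return 'ignored';
}
```

Issues without the incident label fall through to 'ignored', so ordinary issue traffic does not affect the metrics in this sketch.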
Bitbucket
- Open incident: Create a Bitbucket Issue with a bug label.
- Close incident: Close the Issue or remove the bug label.
Jira
- Open incident: Create a Jira Issue with an incident label.
- Close incident: Close the Issue or remove the incident label.
What are the DORA metrics and how does dorametrix calculate them?
Quotes from a blog post on Google Cloud.
The period that is taken into account is 30 days in most cases, or 7 days for deployment frequency. These values are configurable.
Deployment frequency
How often an organization successfully releases to production.
Data collection
- Send a CHANGE event when starting up the CI build. You can do this with either a direct call to the dorametrix API endpoint, or by using a webhook ("push" event or similar) in GitHub, Bitbucket, or Jira.
- Send a DEPLOYMENT event after pushing code to your production environment. You can do this with either a direct call to the dorametrix API endpoint, or by using a convenience GitHub action or Bitbucket pipe for your CI. See the GitHub action demo or Bitbucket pipe demo for more information.
Calculation
{number of deployments} / 7 is the standard, i.e. the number of deployments in the last week divided by the 7 days in that period.
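The formula above could be sketched like this in TypeScript. The deploymentFrequency function is hypothetical, and passing deployment times as millisecond timestamps is an assumption made for the example; the 7-day default mirrors DEPLOYMENT_PERIOD_DAYS.

```typescript
// Hypothetical sketch of {number of deployments} / {period in days}.
function deploymentFrequency(
  deploymentTimestampsMs: number[],
  nowMs: number,
  periodDays: number = 7
): number {
  const msPerDay = 24 * 60 * 60 * 1000;
  const windowStart = nowMs - periodDays * msPerDay;

  // Only deployments inside the window count towards the metric.
  const inWindow = deploymentTimestampsMs.filter(
    (timestamp) => timestamp > windowStart && timestamp <= nowMs
  );

  return inWindow.length / periodDays;
}
```

With two deployments in the last week this gives 2 / 7, or roughly 0.29 deployments per day.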
Lead Time for Changes
The amount of time it takes a commit to get into production.
Data collection
Same as deployment frequency (see above).
Calculation
The solution used in dorametrix is based on calculating the difference between the earliest commit timestamp in a change batch (that led to a deployment) and the timestamp of the actual deployment. These differences are then averaged over all deployments:
{accumulated time of every first commit for each deployment} / {number of deployments}
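Here is a minimal TypeScript sketch of that calculation. The Deployment shape and the leadTimeForChanges function are hypothetical; timestamps are plain numbers in seconds here (the API examples send them as strings), and skipping deployments without any changes is an assumption.

```typescript
// Hypothetical shapes for illustration; timestamps are in seconds.
interface ChangeRef {
  id: string;
  timeCreated: number;
}

interface Deployment {
  timeCreated: number;
  changes: ChangeRef[];
}

// Average time from the earliest commit in each batch to its deployment.
function leadTimeForChanges(deployments: Deployment[]): number {
  // Assumption: deployments without changes cannot contribute a lead time.
  const withChanges = deployments.filter((d) => d.changes.length > 0);
  if (withChanges.length === 0) return 0;

  const accumulated = withChanges.reduce((sum, deployment) => {
    const earliestCommit = Math.min(...deployment.changes.map((c) => c.timeCreated));
    return sum + (deployment.timeCreated - earliestCommit);
  }, 0);

  return accumulated / withChanges.length;
}
```

The result is in the same unit as the input timestamps, so seconds in this example.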
Change Failure Rate
The percentage of deployments causing a failure in production.
Data collection
Send an INCIDENT event. You can do this with either a direct call to the dorametrix API endpoint, or by using a webhook ("opened"/"closed"/"labeled"/"unlabeled" event or similar) in GitHub, Bitbucket, or Jira. The conventions are listed above, in the "Creating incidents" section.
You could certainly look into other types of integrations and automations, for example by connecting a failing function with calling dorametrix (or connecting CloudWatch to it).
Calculation
Some thinkers in the DORA metrics space will say that we need to understand whether a deployment was successful or failed. In dorametrix this is seen as unimportant and an over-complication of matters. Instead, all deployments are simply... deployments.
{incident count} / {deployment count}
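As a small illustration of the ratio above, here is a hedged TypeScript sketch. The changeFailureRate and formatRate functions are hypothetical; returning 0 when there are no deployments is an assumption, and the two-decimal formatting simply mirrors the "0.00" style seen in the example responses further down.

```typescript
// Hypothetical sketch of {incident count} / {deployment count}.
function changeFailureRate(incidentCount: number, deploymentCount: number): number {
  // Assumption: with no deployments, report 0 rather than dividing by zero.
  if (deploymentCount === 0) return 0;
  return incidentCount / deploymentCount;
}

// Formats the rate with two decimals, like the example API responses.
function formatRate(rate: number): string {
  return rate.toFixed(2);
}
```

For instance, one incident across four deployments gives a rate of 0.25.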
Time to Restore Services
How long it takes an organization to recover from a failure in production.
Data collection
Depends on the above collection of incidents (change failure rate).
Calculation
This is very straightforward: add up the total time of all incidents (from start to being resolved) and divide by the number of incidents. Unresolved incidents will continue to add to the total.
{accumulated time of all incidents} / {actual incident count}
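A minimal TypeScript sketch of this average follows. The Incident shape and the timeToRestoreServices function are hypothetical; measuring unresolved incidents up to the current time is how the sketch models "continuing to add up", which is an interpretation of the text above.

```typescript
// Hypothetical shape for illustration; timestamps share one unit (e.g. seconds).
interface Incident {
  timeCreated: number;
  timeResolved?: number; // undefined while the incident is still open
}

// Average duration of all incidents; open incidents are measured up to "now".
function timeToRestoreServices(incidents: Incident[], now: number): number {
  if (incidents.length === 0) return 0;

  const accumulated = incidents.reduce((sum, incident) => {
    const end = incident.timeResolved !== undefined ? incident.timeResolved : now;
    return sum + (end - incident.timeCreated);
  }, 0);

  return accumulated / incidents.length;
}
```

Because open incidents are measured against the current time, the metric keeps growing until they are resolved.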
Example API calls
All of the examples below demonstrate calling the API directly. Since webhook events from GitHub, Bitbucket and Jira have other (varying) shapes, they are out of scope for the example calls.
Create change
Request
POST {{BASE_URL}}/event
{
  "eventType": "change",
  "product": "demo"
}
Successful response
204 No Content
Create deployment
Request
POST {{BASE_URL}}/event
{
  "eventType": "deployment",
  "product": "demo",
  "changes": [
    {
      "id": "356a192b7913b04c54574d18c28d46e6395428ab",
      "timeCreated": "1642879177"
    },
    {
      "id": "da4b9237bacccdf19c0760cab7aec4a8359010b0",
      "timeCreated": "1642874964"
    },
    {
      "id": "77de68daecd823babbb58edb1c8e14d7106e83bb",
      "timeCreated": "1642873353"
    }
  ]
}
Successful response
204 No Content
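If you would rather build this request in code than by hand, a sketch along these lines may help. The buildDeploymentRequest function and the EventRequest shape are hypothetical, and sending the API key in an Authorization header is an assumption, since this document does not state how the key is transmitted; check the deployment.sh script for the actual mechanism.

```typescript
// Hypothetical request builder for illustration; not part of dorametrix.
interface DeploymentChange {
  id: string; // commit SHA
  timeCreated: string; // Unix timestamp as a string, as in the example above
}

interface EventRequest {
  method: 'POST';
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Builds the POST /event request for a deployment.
function buildDeploymentRequest(
  baseUrl: string,
  apiKey: string,
  product: string,
  changes: DeploymentChange[]
): EventRequest {
  return {
    method: 'POST',
    url: baseUrl + '/event',
    headers: {
      'Content-Type': 'application/json',
      // Assumption: the header used for the API key is not documented here.
      Authorization: apiKey
    },
    body: JSON.stringify({ eventType: 'deployment', product, changes })
  };
}
```

The returned object can be handed to whatever HTTP client you use; a successful call should answer with 204 No Content as shown above.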
Create incident
Request
POST {{BASE_URL}}/event
{
  "eventType": "incident",
  "product": "demo"
}
Successful response
204 No Content
Get all metrics
Request
GET {{BASE_URL}}/metrics?product=myproduct
Example response
{
  "changeFailureRate": "0.00",
  "deploymentFrequency": "0.00",
  "leadTimeForChanges": "00:00:00:00",
  "timeToRestoreServices": "00:00:23:11"
}
Get specific metric (changeFailureRate)
Request
GET {{BASE_URL}}/metrics?changeFailureRate&product=myproduct
Example response
{
  "changeFailureRate": "0.00"
}
Get specific metric (deploymentFrequency)
Request
GET {{BASE_URL}}/metrics?deploymentFrequency&product=myproduct
Example response
{
  "deploymentFrequency": "0.00"
}
Get specific metric (leadTimeForChanges)
Request
GET {{BASE_URL}}/metrics?leadTimeForChanges&product=myproduct
Example response
{
  "leadTimeForChanges": "00:00:00:00"
}
Get specific metric (timeToRestoreServices)
Request
GET {{BASE_URL}}/metrics?timeToRestoreServices&product=myproduct
Example response
{
  "timeToRestoreServices": "00:00:23:11"
}
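The time-based metrics come back as four colon-separated fields. If you need them as a plain number, something like the following TypeScript sketch could be used. The durationToSeconds function is hypothetical, and reading the fields as days:hours:minutes:seconds is an assumption based on the shape of the example values, not a documented format.

```typescript
// Assumption: the four fields are days:hours:minutes:seconds.
function durationToSeconds(value: string): number {
  const parts = value.split(':').map(Number);
  if (parts.length !== 4 || parts.some((part) => isNaN(part))) {
    throw new Error('Unexpected duration format: ' + value);
  }

  const days = parts[0];
  const hours = parts[1];
  const minutes = parts[2];
  const seconds = parts[3];

  return ((days * 24 + hours) * 60 + minutes) * 60 + seconds;
}
```

Under that assumption, the example value "00:00:23:11" corresponds to 23 minutes and 11 seconds.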
Get last deployment
Request
GET {{BASE_URL}}/lastdeployment?product=myproduct
Response
{
  "id": "de9e97a5f7e60230c440c627b0779629fa2c796b",
  "timeCreated": "1644259334000"
}
References
Articles
- Are you an Elite DevOps performer? Find out with the Four Keys Project
- DevOps culture: How to transform
- Implementing DORA key software delivery metrics