A basic example of an AWS lakehouse architecture based on Glue, Athena, and S3.
Created for the following blog series:
- https://www.xerris.com/insights/building-modern-data-warehouses-with-s3-glue-and-athena-part-1/
- https://www.xerris.com/insights/building-modern-data-warehouses-with-s3-glue-and-athena-part-2/
- https://www.xerris.com/insights/building-modern-data-warehouses-with-s3-glue-and-athena-part-3/
- Configure AWS credentials
- Update `main.tf` with an existing S3 bucket to use as a Terraform backend
- Run `terraform plan` to see what will be created
- Run `terraform apply` to deploy the resources
- Ensure the datalake bucket you specified as a variable exists
- Upload the example CSV files under the prefix you specified as a variable
- Run the Glue crawler that was created
- Once finished, a new table will exist in your Glue database
- Go to Athena, switch to your workgroup, and use the "Preview table" option to generate an example query
- To delete the resources created, run `terraform destroy`
- If desired, manually delete the S3 buckets used for the Terraform backend and the datalake
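
The deployment and crawl steps in the list above could be scripted roughly as follows. This is only a sketch, not part of the repository: the bucket name, prefix, crawler name, and local CSV folder are hypothetical placeholders that should be replaced with the values from your own Terraform variables, and the `run` helper and `DRY_RUN` switch are illustrative additions. `terraform init` is included because Terraform needs it once to initialise the backend before `plan` or `apply`. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the deploy-and-crawl steps. All values below are hypothetical
# placeholders; replace them with the values from your Terraform variables.
set -eu

DATALAKE_BUCKET="${DATALAKE_BUCKET:-my-datalake-bucket}"  # assumed bucket name
DATA_PREFIX="${DATA_PREFIX:-raw/example}"                 # assumed S3 prefix
CRAWLER_NAME="${CRAWLER_NAME:-my-glue-crawler}"           # assumed crawler name
CSV_DIR="${CSV_DIR:-./data}"                              # assumed local CSV folder
DRY_RUN="${DRY_RUN:-1}"                                   # 1 = only print commands

# Print the command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

deploy_and_crawl() {
  run terraform init    # initialises the S3 backend configured in main.tf
  run terraform plan
  run terraform apply
  # Fails if the datalake bucket does not exist or is not accessible.
  run aws s3api head-bucket --bucket "$DATALAKE_BUCKET"
  # Upload only the CSV files under the configured prefix.
  run aws s3 cp "$CSV_DIR" "s3://$DATALAKE_BUCKET/$DATA_PREFIX/" \
    --recursive --exclude "*" --include "*.csv"
  run aws glue start-crawler --name "$CRAWLER_NAME"
  # Poll until the crawler returns to READY (skipped in dry-run mode).
  if [ "$DRY_RUN" = "0" ]; then
    while [ "$(aws glue get-crawler --name "$CRAWLER_NAME" \
        --query 'Crawler.State' --output text)" != "READY" ]; do
      sleep 15
    done
  fi
}

deploy_and_crawl
```

To execute the commands for real, set `DRY_RUN=0` and override the placeholders through the environment, for example `DRY_RUN=0 DATALAKE_BUCKET=... CRAWLER_NAME=... sh deploy.sh` (the script filename is also an assumption). Querying the new table in Athena is left as a manual step, as in the list above.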