Really enjoyed your article on Towards Data Science! It's a great explainer on the architecture and shows the scope of all the different moving parts quite well.
One thing that might be helpful for other peeps coming to this repo would be separate branches (or a separate repo, what have you) with the Spark and AWS setups respectively. At a glance, it's kind of confusing to tell which elements belong to which config.
That said, this is a pretty sweet project and a good reference point for peeps trying to dig into these tools!