zahidul-islam / 60-days-of-data-engineering

The log of my 60 days of Data Engineering challenge - to keep myself accountable

60daysofdataengineering kubeflow airflow apache-spark kafka tensorflow

60 Days of Data Engineering

Inspired by #100DaysOfCode, I've decided to challenge myself to become a Data Engineer by studying and building Data/ML pipelines for 10-12 hours every day for the next 60 days. The challenge starts today, the 3rd of September, and should be finished by the 4th of November, 2019. My focus will be on ML/DL pipelines and the Data Engineering tools around them, such as Kubeflow, Apache Airflow, Apache Spark, Apache Kafka, and TensorFlow. I will document my progress on GitHub and post daily logs on LinkedIn.

Day 5: September 7, 2019

Today's Progress: Today was not a productive day. I only finished the Week 1 content of the Natural Language Processing with TensorFlow course.

Thoughts: I am excited and looking forward to starting the Insight Data Engineering Fellows Program on September 9th, 2019.

Day 4: September 6, 2019

Today's Progress: Today I finished the Introduction to TensorFlow for Artificial Intelligence, Machine Learning, and Deep Learning course on Coursera.

Thoughts: It was not a difficult course. However, it gave me a solid understanding of the TensorFlow 2.0 API and Convolutional Neural Networks (ConvNets). Building some simple image classifiers was fun.
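To make the ConvNet idea concrete, here is a minimal sketch of the sliding-window operation that a convolutional layer applies, written in plain Python rather than with the TensorFlow API. The `convolve2d` helper, the toy image, and the vertical-edge kernel are all made up for illustration. Like most deep learning frameworks, the sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for row in range(out_h):
        out_row = []
        for col in range(out_w):
            total = 0
            # Multiply the kernel with the image patch under it, element by element.
            for i in range(kh):
                for j in range(kw):
                    total += image[row + i][col + j] * kernel[i][j]
            out_row.append(total)
        output.append(out_row)
    return output


# Hypothetical 4x5 grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
]

# Hypothetical 3x3 filter that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge)

if __name__ == "__main__":
    for feature_row in feature_map:
        print(feature_row)
```

The flat dark region produces 0 and the windows that straddle the edge produce 27, which is the sense in which a filter "detects" a feature. In a ConvNet the kernel values are learned during training instead of being written by hand.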

Useful Links:

👉 Introduction to TensorFlow for Artificial Intelligence, Machine Learning, and Deep Learning https://www.coursera.org/learn/introduction-tensorflow

👉 Fashion MNIST with Keras and TPUs https://research.google.com/seedbank/seed/fashion_mnist_with_keras_and_tpus

👉 Understanding Convolutions https://colah.github.io/posts/2014-07-Understanding-Convolutions/

Day 3: September 5, 2019

Today's Progress: Today I started the TensorFlow in Practice Specialization from deeplearning.ai. I am in Week 4 of the Introduction to TensorFlow for Artificial Intelligence course.

Thoughts: I like the way the AI Advocate (instructor) introduced convolutional neural networks by building a simple classifier using the Fashion MNIST dataset and TensorFlow. The TensorFlow in Practice Specialization is hands-on. I am looking forward to learning more about TensorFlow.

Useful Links:

👉 TensorFlow in Practice Specialization https://www.coursera.org/specializations/tensorflow-in-practice

👉 Different Convolution Filters https://lodev.org/cgtutor/filtering.html

👉 Machine Learning Fairness https://developers.google.com/machine-learning/fairness-overview/

👉 Collection of Interactive Machine Learning Examples https://research.google.com/seedbank/

👉 Step-by-step Guide to Install TensorFlow 2 https://medium.com/@cran2367/install-and-setup-tensorflow-2-0-2c4914b9a265

Day 2: September 4, 2019

Today's Progress: I wrote a blog post on LinkedIn where I explained Apache Airflow core concepts.

Thoughts: There are so many interesting concepts in Airflow. It is an excellent tool for workflow orchestration. I want to spend more time on building custom Operators, Hooks, and data pipelines.

Link to work: Apache Airflow Core Concepts

Here are some useful links:

👉 A Definitive Compilation of Apache Airflow Resources - Aakash Pydi https://towardsdatascience.com/a-definitive-compilation-of-apache-airflow-resources-82bc4980c154

👉 DAG Writing Best Practices in Apache Airflow https://www.astronomer.io/guides/dag-best-practices/

👉 Automate AWS Tasks Thanks to Airflow Hooks - Arnaud https://blog.sicara.com/automate-aws-tasks-boto3-airflow-hooks-593c3120e8fc

👉 Getting started with Apache Airflow - Adnan Siddiqi https://towardsdatascience.com/getting-started-with-apache-airflow-df1aa77d7b1b

👉 Orchestration and DAG Design in Apache Airflow: Two Approaches https://medium.com/hashmapinc/orchestration-and-dag-design-in-apache-airflow-two-approaches-35edd3eaf7c0

👉 Apache Airflow Core Concepts - Zahidul Islam https://www.linkedin.com/pulse/apache-airflow-core-concepts-zahidul-islam/?trackingId=X3YNEn0IQHehblxk9G0Z7Q%3D%3D

Day 1: September 3, 2019

Today's Progress: Spent time learning about Apache Airflow. Airflow is a platform to programmatically author, schedule and monitor workflows. Link: https://airflow.apache.org/index.html

Thoughts: I am very happy with my progress, and excited to start building a DynamoDB-to-BigQuery ETL pipeline using Airflow tomorrow.
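The core idea behind authoring a workflow programmatically is that tasks and their dependencies form a directed acyclic graph (DAG), and each task runs only after its upstream tasks have finished. The sketch below illustrates that idea with Python's standard-library `graphlib` (Python 3.9+). It is not Airflow's API, and the task names, the `run_order` and `run` helpers, and the toy task bodies are all hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks for an extract-transform-load workflow.
# Each key maps a task to the set of tasks that must finish before it.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}


def run_order(deps):
    """Return one valid execution order for the tasks in `deps`."""
    return list(TopologicalSorter(deps).static_order())


def run(deps, actions):
    """Call each task's action in dependency order and collect the results."""
    results = {}
    for task in run_order(deps):
        # Every action receives the results of the tasks that ran before it.
        results[task] = actions[task](results)
    return results


# Toy task bodies; a real pipeline would read from and write to external systems.
actions = {
    "extract": lambda done: [1, 2, 3],
    "transform": lambda done: [value * 10 for value in done["extract"]],
    "load": lambda done: len(done["transform"]),
    "notify": lambda done: f"loaded {done['load']} rows",
}

if __name__ == "__main__":
    print(run_order(dependencies))
    print(run(dependencies, actions)["notify"])
```

An orchestrator such as Airflow adds scheduling, retries, and monitoring on top of this ordering, which is why the DAG definition is kept separate from the work each task performs.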

There are so many excellent blogs on Airflow. Today I want to share some beginner-friendly resources:

👉 Airflow official documentation https://airflow.apache.org/index.html

👉 Apache Airflow for the confused - Jonathan Pichot https://medium.com/nyc-planning-digital/apache-airflow-for-the-confused-b588935669df

👉 Apache Airflow: Tutorial and Beginners Guide https://www.polidea.com/blog/apache-airflow-tutorial-and-beginners-guide/

👉 Apache Airflow on Docker for Complete Beginners https://medium.com/@itunpredictable/apache-airflow-on-docker-for-complete-beginners-cf76cf7b2c9a

👉 Understanding Apache Airflow's key concepts https://medium.com/@dustinstansbury/understanding-apache-airflows-key-concepts-a96efed52b1a

👉 How to start automating your data pipelines with Airflow - Sriram Baskaran https://blog.insightdatascience.com/airflow-101-start-automating-your-batch-workflows-with-ease-8e7d35387f94
