anastasia-mikheeva / dssg-data-engineering-workshop

This project forked from rogall-e/dssg-data-engineering-workshop

Home Page: https://rogall-e.github.io/DSSG-Data-Engineering-Workshop/

License: MIT License

Python 5.28% Jupyter Notebook 94.72%

DSSG - Data Engineering Workshop 🪠❤️

Welcome to the Data Engineering Workshop organized by the Data Science for Social Good (DSSG) community. This workshop is designed to provide a hands-on introduction to the data engineering workflow and tools. The intended audience is data scientists and analysts who are interested in learning how to build simple data pipelines and data warehouses.

Workshop Setup

For this workshop we prepared a documentation page that will guide us through the different modules and exercises. You can find it here.

Timetable

| Session | Time | Description | Tool |
| --- | --- | --- | --- |
| Giving Context and Getting to know each other | 9:00 - 9:30 | Introduce the workshop objectives and participants share their backgrounds and expectations | |
| Storing Data | 9:30 - 10:15 | Persisting data in a secure and queryable location for analytics purposes | DuckDB |
| Extracting and Loading | 10:30 - 11:15 | Transferring data from different systems to a centralized repository | Airbyte |
| Transforming | 11:30 - 12:30 | Shaping raw data from various sources into a unified view that can be interpreted by stakeholders | dbt |
| Making data accessible | 12:45 - 13:15 | Providing interpretation and data access to the rest of the organization | Metabase |
| Follow Up Questions and next steps | 13:15 | Participants can ask questions and receive guidance on recommended next steps and resources for further learning in data engineering | |

We will learn to:

  • Set up a basic Analytical Database using DuckDB
  • Read some data from various sources into our database with Airbyte
  • Transform the data into a unified view with dbt
  • Attach a visualization tool to the database using Metabase
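
To make the load-then-transform idea concrete before the sessions, here is a minimal sketch in Python. It is not the workshop's own code: it uses Python's built-in sqlite3 module as a stand-in for DuckDB so that it runs without extra installs, and it replaces Airbyte and dbt with two small functions. The sample data, the table name raw_donations, and the view name donations_by_donor are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw export standing in for a source system.
RAW_CSV = """donor,amount,city
Ada,50,Berlin
Grace,20,Hamburg
Ada,30,Berlin
"""


def load_raw(con, csv_text):
    """Extract and load: copy the CSV rows unchanged into a raw table."""
    con.execute("CREATE TABLE raw_donations (donor TEXT, amount TEXT, city TEXT)")
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["donor"], r["amount"], r["city"]) for r in reader]
    con.executemany("INSERT INTO raw_donations VALUES (?, ?, ?)", rows)
    return len(rows)


def build_view(con):
    """Transform: cast types and aggregate the raw table into a reporting view."""
    con.execute(
        """
        CREATE VIEW donations_by_donor AS
        SELECT donor,
               SUM(CAST(amount AS INTEGER)) AS total_amount,
               COUNT(*) AS n_donations
        FROM raw_donations
        GROUP BY donor
        """
    )


def run_pipeline(csv_text=RAW_CSV):
    """Run load and transform against an in-memory database and return the view rows."""
    con = sqlite3.connect(":memory:")
    load_raw(con, csv_text)
    build_view(con)
    query = (
        "SELECT donor, total_amount, n_donations "
        "FROM donations_by_donor ORDER BY donor"
    )
    return con.execute(query).fetchall()


if __name__ == "__main__":
    for row in run_pipeline():
        print(row)
```

The raw table keeps the data exactly as it arrived, and the view holds the cleaned, aggregated shape that a visualization tool would query. Keeping those two layers separate is the pattern the Storing, Extracting and Loading, and Transforming sessions build on with the real tools.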

We are not touching:

  • Building a production-ready data pipeline
  • Setting up cloud infrastructure
  • Orchestrating complex data pipelines with lots of dependencies
  • Interacting with all the bells and whistles of the tools we will use

Contributors

janbutof, pwaldi, rogall-e
