travaux-tech-challenge

Technical challenge for travaux.com.

This work is based on:

  • embulk for the Extract/Load steps: the configuration is in the /embulk/ directory
  • postgres for storage and as the SQL engine
  • dbt for the Transformation step: the project is under dbt/travaux/

Docker usage

Building the tech challenge environment

The repository's Dockerfile builds the tech challenge environment:

docker build -t travaux-tech-challenge .

Please be patient, the build can take a while.

Running the tech challenge environment

Launch the built environment with the following command, which publishes ports 5432 (postgres), 8080 (dbt documentation server) and 3000 (Metabase):

docker run -d -p 5432:5432 -p 8080:8080 -p 3000:3000 travaux-tech-challenge

Then get the ID of the running container:

docker ps

and open a terminal session in the environment, replacing CONTAINER-ID with that ID:

sudo docker exec -it CONTAINER-ID bash
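
As an alternative to copying the ID by hand, the lookup could be scripted. The sketch below assumes the image tag travaux-tech-challenge from the build step and at least one running container created from it; container_id_for_image is a hypothetical helper name, not something the repository provides.

```shell
# Hypothetical helper (not part of the repository): print the ID of one
# running container created from the image given as first argument.
container_id_for_image() {
  # --quiet prints only container IDs; the ancestor filter keeps
  # containers that were created from the given image.
  docker ps --quiet --filter "ancestor=$1" | head -n 1
}
```

With this helper defined, the session could be opened with: docker exec -it "$(container_id_for_image travaux-tech-challenge)" bash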

The remaining commands are run in this terminal session.

Extract/Load the event_log data with embulk

This extracts the data from the event_log.csv file and loads it into the postgres instance:

cd /embulk
java -jar embulk.jar run event_log.yml
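
Embulk also has a preview subcommand that parses the input and prints sample rows without loading anything. A minimal sketch, assuming it is called from /embulk next to embulk.jar; embulk_preview_then_run is a hypothetical wrapper name, not part of the repository:

```shell
# Hypothetical wrapper (not part of the repository): preview the parsed rows
# first, then run the actual load only if the preview succeeded.
embulk_preview_then_run() {
  config="$1"
  java -jar embulk.jar preview "$config" && java -jar embulk.jar run "$config"
}
```

For this challenge the call would be: embulk_preview_then_run event_log.yml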

Transform data with dbt

Now use dbt to transform the data, from the event_log data up to the end user's datasets:

cd /dbt/travaux
dbt run --profiles-dir .

Test transformation steps with dbt

The following command validates all the transformation steps:

dbt test --profiles-dir .
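
The run and test steps can be chained so that the tests only start when the models were built successfully. This is a small sketch, assuming the current directory is /dbt/travaux; dbt_run_and_test is a hypothetical function name:

```shell
# Hypothetical helper (not part of the repository): build the models, then
# test them. The && stops the chain as soon as one command fails.
dbt_run_and_test() {
  dbt run --profiles-dir . && dbt test --profiles-dir .
}
```

The function returns a non-zero status when either step fails, which makes it usable in a script or CI job.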

Navigate the transformations' documentation

To learn how the data is transformed, follow these steps:

  1. Generate the documentation: dbt docs generate --profiles-dir .
  2. Start the embedded dbt web server: dbt docs serve --profiles-dir . &
  3. Open a web browser at http://localhost:8080

Explore the end user's datasets

Finally, explore the data further in a data visualization tool (Metabase):

  1. cd /metabase
  2. java -jar metabase.jar &
  3. Open a web browser page at http://localhost:3000
  4. Authenticate with the following credentials: [email protected] / travaux1
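
Both web servers are started in the background, and the page may not answer immediately. The sketch below polls a URL until it responds. It assumes curl is available where it is run (for example on the host, since both ports are published); wait_for_url is a hypothetical helper name.

```shell
# Hypothetical helper (not part of the repository): poll a URL once per
# second until it answers successfully or the attempts are used up.
# Usage: wait_for_url URL [ATTEMPTS]   (ATTEMPTS defaults to 30)
wait_for_url() {
  url="$1"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # --fail makes curl return non-zero on HTTP errors as well.
    if curl --silent --fail --output /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example, wait_for_url http://localhost:3000 60 waits up to about one minute for the visualization tool, and wait_for_url http://localhost:8080 does the same for the dbt documentation server.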

Stop the environment

docker stop CONTAINER-ID

Thank you!

This Dockerfile is the first one I have ever built, so it probably does not follow all the best practices.

Contributors

  • fabrice-etanchaud
