micgonzalez / luigi-data-pipeline-with-python

License: GNU General Public License v3.0

Python 100.00%
python3 luigi-pipeline luigi-tasks spotify beautifulsoup4 requests

Luigi Data Pipeline with Python

Introduction

This repository scrapes data from a static website using a Luigi pipeline. I created it to demonstrate my ability to use Luigi to automate data collection and related tasks.

Abstract

Have you ever asked yourself whether it is possible to automate tasks? Could word counts help in finding insights? What could you do with the extra time gained from automation? These were a few of the questions that came to mind while working on this project.

Summary of Skills

I worked in a Python environment within PyCharm and used the Luigi Scheduler to complete this repository. I also used the Luigi, Beautiful Soup 4 (beautifulsoup4), and Requests packages, along with Counter (from the collections module) and pickle from Python's standard library.
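As a rough illustration of how Counter and pickle can work together for a word-count summary, here is a minimal sketch. The function names (`count_words`, `save_counts`, `load_counts`) and the word-splitting rule are assumptions made for this example, not code taken from the repository.

```python
import pickle
import re
from collections import Counter


def count_words(text):
    """Return a Counter of lowercase words found in text.

    Assumption: a "word" is a run of letters or apostrophes.
    """
    return Counter(re.findall(r"[a-z']+", text.lower()))


def save_counts(counts, path):
    """Serialize the Counter to disk with pickle."""
    with open(path, "wb") as f:
        pickle.dump(counts, f)


def load_counts(path):
    """Read a pickled Counter back from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    counts = count_words("The cat saw the hat")
    # Print the words from most to least frequent.
    for word, n in counts.most_common():
        print(word, n)
```

Pickle keeps the Counter as a Python object, so a later step can load it and call `most_common()` directly without re-parsing text.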

Preview

Preview of the Luigi Scheduler from this project.

This screenshot shows the Luigi Scheduler's visual view of the tasks in this project.

Preview of the PyCharm terminal from this project.

This screenshot shows the commands that were run in PyCharm's terminal window.

Findings

I was asked to automate scraping data from a static webpage and to create a summary of the resulting word counts. I used the Luigi Scheduler and PyCharm's terminal to perform these tasks. Using the Luigi Scheduler can reduce the time needed to perform them. Luigi is a powerful tool, but when it is run only from the terminal it gives no visual display of the tasks being performed. The Luigi Scheduler fills that gap with a visual view of the tasks.

Challenges

I ran into one challenge on this project that I did not foresee: navigating the Luigi Scheduler. I had previously interacted with Luigi through PyCharm's terminal and the Terminal app on a Mac. In my previous class, we never worked with a visual display of Luigi. The Luigi Scheduler has an interesting layout, and it took some time to get used to it.

Conclusion

Thinking back to my previous experience with Luigi, I did not know that there was a visual way to see how the tasks are performed. Using Luigi with the Luigi Scheduler helps in automating tasks like web scraping and summarizing word counts. This not only cuts down the time spent on these tasks but also lets you focus on more complex work. Luigi is great for straightforward projects.
