muskanmahajan37 / gbfs_kafka

This project forked from heavyai/gbfs_kafka

A Python and Kafka dataflow for the General Bikeshare Feed Specification (GBFS)

License: Apache License 2.0

Python 89.71% PLpgSQL 10.29%

gbfs_kafka's Introduction

A Python and StreamSets dataflow for the General Bikeshare Feed Specification (GBFS)

This repository has three parts: Python code to access each station_status and free_bike_status endpoint provided by GBFS, a StreamSets dataflow to insert the data into an OmniSci database, and the OmniSci table definition DDLs.

Python

There are 4 Python scripts in this repository:

  • build_gbfs_endpoints.py - reads GBFS endpoints from the GBFS system.csv file, then accesses each working endpoint
  • build_slow_changing_tables.py - creates CSVs from endpoints that change slowly enough that they don't need to be rebuilt every minute
  • free_bike_status.py - accesses each endpoint from the working free bike status endpoints identified by build_gbfs_endpoints.py, writing the results to Apache Kafka
  • station_status.py - accesses each endpoint from the working station status endpoints identified by build_gbfs_endpoints.py, writing the results to Apache Kafka
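
The endpoint-discovery step can be pictured roughly as follows. This is a minimal sketch, not the code in build_gbfs_endpoints.py: it assumes the catalog CSV has an `Auto-Discovery URL` column and that each auto-discovery document follows the GBFS 1.x/2.x layout (`data`, then a language code, then `feeds`). The function names and sample values are hypothetical.

```python
import csv
import io


def discovery_urls(catalog_csv, column="Auto-Discovery URL"):
    """Return the non-empty auto-discovery URLs listed in a catalog CSV.

    The column name is an assumption about the catalog layout.
    """
    reader = csv.DictReader(io.StringIO(catalog_csv))
    return [
        (row.get(column) or "").strip()
        for row in reader
        if (row.get(column) or "").strip()
    ]


def feed_urls(discovery, wanted=("station_status", "free_bike_status")):
    """Pick the wanted feed URLs out of one parsed gbfs.json document.

    Assumes the GBFS 1.x/2.x shape: data -> <language> -> feeds -> [{name, url}].
    The first URL seen for each feed name wins.
    """
    found = {}
    for language in discovery.get("data", {}).values():
        for feed in language.get("feeds", []):
            name = feed.get("name")
            if name in wanted and name not in found:
                found[name] = feed["url"]
    return found
```

A system whose auto-discovery document lacks one of the two feeds simply yields a smaller dictionary, which is one way to separate working endpoints from missing ones.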

The two main scripts are free_bike_status.py and station_status.py, which are intended to be run once per minute (in our case, via cron). For our use case, it is sufficient to get the results once per minute, as the data are intended to be a demonstration rather than a precise to-the-moment status of any given bike station.
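
A once-per-minute run of either script might look something like the sketch below. It is an illustration under stated assumptions rather than the repository's implementation: the message envelope (`source_url`, `fetched_at`, `payload`) is invented here, and the Kafka producer is passed in as a plain `send(topic, value)` callable so that the example runs without a broker.

```python
import json
import time
import urllib.request


def fetch_json(url, timeout=10.0):
    """Download and parse one GBFS feed."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def publish_feeds(urls, topic, send, fetch=fetch_json, now=time.time):
    """Fetch each URL once and hand the result to send(topic, bytes).

    Returns the URLs that failed, so one bad endpoint does not stop the run.
    The envelope fields are hypothetical, not the repository's message format.
    """
    failed = []
    for url in urls:
        try:
            payload = fetch(url)
        except Exception:  # network errors, timeouts, malformed JSON
            failed.append(url)
            continue
        message = {"source_url": url, "fetched_at": int(now()), "payload": payload}
        send(topic, json.dumps(message).encode("utf-8"))
    return failed
```

With the kafka-python package, `send` could be a thin wrapper such as `lambda topic, value: producer.send(topic, value=value)` around a `KafkaProducer`, followed by `producer.flush()` before the script exits. A crontab schedule of `* * * * *` would then trigger the script every minute.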

StreamSets

The second part of this repository is the StreamSets dataflow. This dataflow reads the data from Apache Kafka, processes the JSON provided by the GBFS endpoints, and inserts the data into an OmniSci database via the OmniSci JDBC driver.
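
The JSON-processing step happens inside StreamSets, not in Python, but the transformation it performs can be illustrated with a short sketch. The function below flattens one station_status payload into one row per station. The input field names follow the GBFS specification; the output column names and the `system_id` argument are assumptions made for this example and may not match the OmniSci DDLs.

```python
def station_rows(feed, system_id):
    """Turn one station_status payload into one flat dict per station.

    Output keys are illustrative column names, not the repository's schema.
    """
    last_updated = feed.get("last_updated")
    rows = []
    for station in feed.get("data", {}).get("stations", []):
        rows.append({
            "system_id": system_id,
            "station_id": station.get("station_id"),
            "num_bikes_available": station.get("num_bikes_available"),
            "num_docks_available": station.get("num_docks_available"),
            # GBFS 1.x uses 1/0 here, 2.x uses true/false; bool() covers both.
            # A missing value is treated as False.
            "is_renting": bool(station.get("is_renting")),
            "last_reported": station.get("last_reported"),
            "feed_last_updated": last_updated,
        })
    return rows
```

Each resulting dictionary corresponds to one row that a JDBC insert could write to a station status table.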

OmniSci Table Definitions

The last part of this repository is the set of DDLs for OmniSci. These define the target tables into which the StreamSets pipeline inserts data.

Goals

The intent of this repository is to demonstrate how to create a fully open-source streaming data example using OmniSci. The repo will be updated over time to reflect the current status of our demo, and can be used as a starting point for your own purposes, but it's not intended to be a full-blown collaboration to meet the needs of all users. Rather, these data are collected by OmniSci to demo the OmniSci platform and related open-source tools for data analysis.

If you do have ideas about how to make this demo more impressive or there is a bug that stops you from replicating the example, please feel free to open an issue.

gbfs_kafka's People

Contributors

randyzwitch
