Jet Trains

This project is a demo of Hazelcast Jet, a data streaming engine based on Hazelcast IMDG.

It displays the positions of public transport vehicles in the Bay Area in near real time.

Note
It used to showcase Switzerland’s public transport. Unfortunately, the Swiss data provider no longer provides the GTFS-RT feed.

Figure 1. Demo screenshot

The technology stack consists of:

Overall structure

The project contains several modules with dedicated responsibilities:

Name            Description
common          Code shared across modules
infrastructure  Contains the static data files, as well as configuration files for Docker Compose and Kubernetes
local-jet       As an alternative to the infrastructure module, starts a local Jet instance so that you can debug inside the IDE
load-static     Loads GTFS static data from files into memory. Those files contain reference data used later to enrich the data in the pipeline
stream-dynamic  Calls an OpenData endpoint to get dynamic data, transforms it, enriches it, and stores it in an IMDG map
web             Subscribes to the aforementioned IMDG map and publishes changes to a WebSocket endpoint. The UI subscribes to the endpoint and displays each data point on an OpenStreetMap map

(Architecture diagram)

Reference documentation

The data provider releases data compliant with the General Transit Feed Specification (by Google).

Two types of data are available:

  1. Static files that contain reference data that rarely changes, e.g. schedules and stops

  2. A REST endpoint that serves dynamic data, e.g. vehicle positions
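
To make the relationship between the two data types concrete, here is a minimal Java sketch of the enrichment idea: reference data is indexed in memory, then used to decorate each dynamic data point. It uses plain collections rather than the project's actual Jet pipeline. The class name, the method names, the map-based representation of a position, and the sample values are all hypothetical. The route_id and route_long_name columns are the ones the GTFS specification defines for routes.txt. The CSV parsing is deliberately naive and ignores quoted fields.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the project's code: index static data, then enrich dynamic data.
final class EnrichSketch {

    /** Indexes the lines of a routes.txt file by route_id. Naive split: no quoted commas. */
    static Map<String, String> routeNamesById(List<String> lines) {
        String[] header = lines.get(0).split(",", -1);
        int idCol = indexOf(header, "route_id");
        int nameCol = indexOf(header, "route_long_name");
        Map<String, String> names = new HashMap<>();
        for (String line : lines.subList(1, lines.size())) {
            String[] cells = line.split(",", -1);
            names.put(cells[idCol], cells[nameCol]);
        }
        return names;
    }

    /** Returns a copy of the position with a route_name entry looked up from the static index. */
    static Map<String, String> enrich(Map<String, String> position, Map<String, String> routeNames) {
        Map<String, String> enriched = new HashMap<>(position);
        enriched.put("route_name", routeNames.getOrDefault(position.get("route_id"), "unknown"));
        return enriched;
    }

    private static int indexOf(String[] header, String column) {
        for (int i = 0; i < header.length; i++) {
            if (header[i].trim().equals(column)) {
                return i;
            }
        }
        throw new IllegalArgumentException("Missing column: " + column);
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for routes.txt and for one vehicle position
        List<String> routes = List.of(
                "route_id,route_short_name,route_long_name",
                "R1,1,Sample Line");
        Map<String, String> position = Map.of(
                "vehicle_id", "V42", "route_id", "R1", "lat", "37.77", "lon", "-122.42");
        System.out.println(enrich(position, routeNamesById(routes)).get("route_name"));
    }
}
```

In the demo itself, the index lives in IMDG maps and the lookup happens inside the Jet job; the sketch only shows why the static files must be loaded before the dynamic stream is useful.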

Running the demo: data

The demo is based on data provided by 511 SF Bay’s Open Data Portal.

Data update

Every day, new reference data (e.g. expected stop times) is published. Hence, the infrastructure module, which contains this data, needs to be updated with the new files. Note that only four files are required for the demo: agency.txt, routes.txt, stops.txt, and trips.txt.
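
Assuming the reference data is obtained as a zip archive, the four required files can be picked out and everything else skipped. The following is a minimal sketch using only the JDK (Java 11+); the class name and the command-line arguments are hypothetical and not part of the project.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: keeps only the files the demo needs from a GTFS archive.
final class RequiredGtfsFiles {

    static final Set<String> REQUIRED =
            Set.of("agency.txt", "routes.txt", "stops.txt", "trips.txt");

    /** Copies the required entries into targetDir and returns their names in archive order. */
    static List<String> extract(InputStream archive, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        List<String> extracted = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(archive)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                // Only exact, known file names are written, so entry paths cannot escape targetDir
                if (!entry.isDirectory() && REQUIRED.contains(name)) {
                    Files.copy(zip, targetDir.resolve(name), StandardCopyOption.REPLACE_EXISTING);
                    extracted.add(name);
                }
            }
        }
        return extracted;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: java RequiredGtfsFiles.java <gtfs.zip> <target directory>");
            return;
        }
        try (InputStream in = Files.newInputStream(Path.of(args[0]))) {
            System.out.println(extract(in, Path.of(args[1])));
        }
    }
}
```

The target directory would be the data folder of the infrastructure module; its exact location depends on your checkout.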

The GTFS Feed Download endpoint lets you download a zip file containing the GTFS dataset for the specified operator/agency. The archive also contains additional files, called the GTFS+ files, which provide information not contained in the GTFS files, such as direction names and fare zone names.

Allowable parameters: api_key (mandatory), operator_id (mandatory), and historic (optional)
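
As an illustration of how these parameters fit together, the sketch below assembles a download URL with the two mandatory parameters and the optional one. The base URL is an assumption and should be checked against the 511 documentation, the class name is hypothetical, and "BA" is only a placeholder operator id. The expected format of the historic value is not described here, so the code passes it through unchanged.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: builds the query string for the GTFS Feed Download call.
final class FeedDownloadUrl {

    // Assumption: verify the actual endpoint in the 511 Open Data documentation
    static final String BASE_URL = "https://api.511.org/transit/datafeeds";

    /** Builds the URL; apiKey and operatorId are mandatory, historic may be null. */
    static String build(String apiKey, String operatorId, String historic) {
        if (apiKey == null || apiKey.isBlank()) {
            throw new IllegalArgumentException("api_key is mandatory");
        }
        if (operatorId == null || operatorId.isBlank()) {
            throw new IllegalArgumentException("operator_id is mandatory");
        }
        // LinkedHashMap keeps the parameters in insertion order
        Map<String, String> params = new LinkedHashMap<>();
        params.put("api_key", apiKey);
        params.put("operator_id", operatorId);
        if (historic != null) {
            params.put("historic", historic);
        }
        String query = params.entrySet().stream()
                .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
                .collect(Collectors.joining("&"));
        return BASE_URL + "?" + query;
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder values; substitute your own key and operator id
        System.out.println(build("YOUR_API_KEY", "BA", null));
    }
}
```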

API Key

Calling the endpoint requires an API key.

  1. First, register

  2. You’ll receive a confirmation email

  3. When you’ve confirmed the email, you’ll receive a new email with the token

  4. The token should be used as an argument when launching the com.hazelcast.jettrain.data.MainKt class from the stream-dynamic module:

    java com.hazelcast.jettrain.data.MainKt $TOKEN

Note
There’s a rate limiter on the server side: the endpoint returns a 429 status if it’s queried more than 60 times per hour. To avoid reaching this limit too soon, the Jet job is configured to run only once every 31 seconds.
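
Taking the figures in the note at face value, the request budget can be worked out with a few lines of arithmetic. The sketch below is illustrative only: the class and method names are hypothetical, and it assumes that the first request is sent at t = 0 and that the server counts all requests within a single one-hour window. How the server actually measures the hour is not documented here.

```java
// Hypothetical helper: arithmetic on the request quota described in the note.
final class RateLimitBudget {

    static final long SECONDS_PER_HOUR = 3600;

    /** Whole polling intervals that fit into one hour. */
    static long requestsPerHour(long intervalSeconds) {
        return SECONDS_PER_HOUR / intervalSeconds;
    }

    /** Offset in seconds of the first request beyond the quota, with the first request at t = 0. */
    static long secondsUntilQuotaSpent(long quota, long intervalSeconds) {
        return quota * intervalSeconds;
    }

    /** Smallest whole-second interval that never exceeds the hourly quota. */
    static long minIntervalSeconds(long quotaPerHour) {
        return (SECONDS_PER_HOUR + quotaPerHour - 1) / quotaPerHour; // ceiling division
    }

    public static void main(String[] args) {
        System.out.println(requestsPerHour(31));            // 116
        System.out.println(secondsUntilQuotaSpent(60, 31)); // 1860
        System.out.println(minIntervalSeconds(60));         // 60
    }
}
```

Under these assumptions, a 31-second interval sends about 116 requests per hour, so a 60-request quota is used up after roughly 31 minutes (1860 seconds). Polling once every 60 seconds would stay within the quota indefinitely, at the cost of less frequent position updates.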

Running the demo: developer setup

If you’re a Java developer, this approach will be the fastest, as you probably have all the tools ready.

Requirements

  • Git (with LFS extension installed - on Ubuntu it’s not installed by default)

  • A Java IDE e.g. IntelliJ IDEA, Eclipse, etc.

Steps

  1. Clone the repo

  2. Import the code into your IDE

  3. In the local-jet module, run the com.hazelcast.jettrain.LocalJet.kt class inside the IDE with the following parameters:

    -Xmx8g \                                                             (1)
    -XX:+UseStringDeduplication \                                        (2)
    --add-modules java.se \                                              (3)
    --add-exports java.base/jdk.internal.ref=ALL-UNNAMED \               (3)
    --add-opens java.base/java.lang=ALL-UNNAMED \                        (3)
    --add-opens java.base/java.nio=ALL-UNNAMED \                         (3)
    --add-opens java.base/sun.nio.ch=ALL-UNNAMED \                       (3)
    --add-opens java.management/sun.management=ALL-UNNAMED \             (3)
    --add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED   (3)

    (1) Allow the JVM to use up to 8 GB of heap memory
    (2) Improve memory efficiency when storing strings
    (3) Necessary when working with Java 9+

  4. To import static data files, run the MainKt class from inside the load-static module:

    java -Ddata.path=/path/to/local/infrastructure/data com.hazelcast.jettrain.refs.MainKt

  5. To query dynamic data, run the MainKt class from inside the stream-dynamic module:

    java -Dtoken=$YOUR_511_TOKEN com.hazelcast.jettrain.data.MainKt

  6. To start the web application, run the JetDemoKt class from inside the web module:

    java com.hazelcast.jettrain.JetDemoKt

    The webapp is available at http://localhost:8080.

Running the demo: Docker-Compose

With this setup, you’ll build the demo from source.

Requirements

  • Docker Compose

  • Hazelcast Jet distribution

Steps

  1. Start Docker

  2. Get the webapp image:

    docker pull nfrankel/jettrain:latest

  3. Adapt the docker-compose.yml file to your file hierarchy. I found no way to use relative file paths in Docker Compose (hints/PRs welcome), so you need to update the file to use the correct paths. Look for paths starting with /Users/nico/projects/hazelcast/ and update them accordingly.

  4. Start the containers from the infrastructure/compose folder:

    docker-compose up

  5. Get the latest "static" JAR

  6. Set up the client configuration file.

    Its content depends on the topology. Here’s a sample:

    $JET_DISTRIBUTION/config/jettrain.yml
    hazelcast-client:
      cluster-name: jet
      network:
        cluster-members:
          - localhost:31781
        smart-routing: false
      connection-strategy:
        connection-retry:
          cluster-connect-timeout-millis: 1000

  7. To load static data:

    In the Hazelcast Jet distribution folder, run the following commands:

    ./jet --config ../config/jettrain.yml submit -v -c com.hazelcast.jettrain.refs.Agencies $PROJECT_ROOT/load-static/target/load-static-1.0-SNAPSHOT.jar
    ./jet --config ../config/jettrain.yml submit -v -c com.hazelcast.jettrain.refs.Stops $PROJECT_ROOT/load-static/target/load-static-1.0-SNAPSHOT.jar
    ./jet --config ../config/jettrain.yml submit -v -c com.hazelcast.jettrain.refs.Routes $PROJECT_ROOT/load-static/target/load-static-1.0-SNAPSHOT.jar
    ./jet --config ../config/jettrain.yml submit -v -c com.hazelcast.jettrain.refs.Trips $PROJECT_ROOT/load-static/target/load-static-1.0-SNAPSHOT.jar
    ./jet --config ../config/jettrain.yml submit -v -c com.hazelcast.jettrain.refs.StopTimes $PROJECT_ROOT/load-static/target/load-static-1.0-SNAPSHOT.jar

  8. Get the latest "dynamic" JAR

  9. To query dynamic data:

    In the Hazelcast Jet distribution folder, run the following command:
