Complex JSON structures coming from different sources? We want to solve your problem. Take a look to get inspired!
You have small but complex data, and you want to get it ready for any analytics tool with as little effort and expense as possible. In the end we tried the BSON library together with the Spring Cloud Dataflow framework to run a 'continuous streaming'...
In the real world we should provide at least an architecture similar to this:
As a PoC we start with a single Maven project; all we can provide is a small subset of the minimum viable architecture, as shown here:
In the MongoDB instance you can find the flow-centric database.
After a data streaming cycle we have two MongoDB collections per category.
In the data collections you can find some ready-made indexes:
Here is how the MongoDB collections appear.
And in any of the data collections we have elements that track:
- metadata partition
- metadata document id
- index
- model name
All as shown in the following images:
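To make the tracked fields concrete, here is a minimal Java sketch of how one such element could be modeled (class and field names are assumptions for illustration, not the project's actual code):

```java
// Hypothetical sketch of one element stored in a data collection.
// Field names mirror the four tracked fields listed above; the real
// project classes may differ.
class FlowElement {
    private final String metadataPartition;   // partition of the originating metadata
    private final String metadataDocumentId;  // id of the source metadata document
    private final long index;                 // position of the element in the stream
    private final String modelName;           // name of the recognized model/category

    FlowElement(String metadataPartition, String metadataDocumentId,
                long index, String modelName) {
        this.metadataPartition = metadataPartition;
        this.metadataDocumentId = metadataDocumentId;
        this.index = index;
        this.modelName = modelName;
    }

    String getMetadataPartition() { return metadataPartition; }
    String getMetadataDocumentId() { return metadataDocumentId; }
    long getIndex() { return index; }
    String getModelName() { return modelName; }
}
```

Each element carries enough information to join data back to its metadata document and partition, which is what the ready-made indexes shown above are for.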
The Spring Cloud Config server takes its configuration from a specific repository, as follows:
It provides some profiles:
- dev (source_dev, process_dev, sink_dev)
- compose (source_compose, process_compose, sink_compose)
- kubernetes (source_kubernetes, process_kubernetes, sink_kubernetes)
- local (not ready)
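As an illustration, selecting the `dev` profile for the source microservice might look like the following bootstrap configuration (file layout, application name, and config server URI are assumptions, not taken from the project):

```yaml
# bootstrap.yml of the source microservice (hypothetical sketch)
spring:
  application:
    name: source        # would resolve to source_dev in the config repository
  profiles:
    active: dev         # one of: dev, compose, kubernetes, local
  cloud:
    config:
      uri: http://localhost:8888   # assumed Spring Cloud Config server address
```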
We provide a Docker Compose file to simulate the base environment.
Please visit the scripts folder.
Information about Docker Compose is here, and the command-line reference is here.
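As a sketch of what such a base environment could contain (service names, images, and ports are assumptions; see the actual compose file in the scripts folder):

```yaml
# docker-compose.yml sketch of the base environment (hypothetical)
version: "3.8"
services:
  mongodb:
    image: mongo:4.4          # hosts the flow-centric database
    ports:
      - "27017:27017"
  config-server:
    image: config-server:latest   # assumed local image for the Spring Cloud Config server
    ports:
      - "8888:8888"
    depends_on:
      - mongodb
```

Bringing the environment up would then be a matter of running `docker compose up -d` from the folder containing the file.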
Enjoy your journey in the Spring Cloud Dataflow framework.
Here is the Docker image sources repository:
An upcoming branch will bring a compose whose main additions are the Spring Cloud Dataflow Server and the Spring Cloud Skipper Server. You will be able to scale as far as your system resources allow, and to scale the three microservices (source, process, and sink), registered into the server via the catalogue, individually. We will also provide the three analytics microservices, with some automation around the definition and recognition of model types, indexes, and some new spatial concepts. Indexes required by the analytics nodes will be placed automatically via the model database (missing in this release). Another streaming engine will push data into the metadata sourcing microservice channels, and it will be used to push back responses from the analytics sink microservice after the analytics group computation. So let's get ready for a more intensive experience in the dataflow universe ...
The library is licensed under CC0 v1.0, with prior authorization from the author required before any production or commercial use. Unauthorized use of this library or any extension of it is prohibited, due to the high risk of damage from improper use. No warranty is provided for improper or unauthorized use of this library or any implementation of it.
Any request can be sent to the author, Fabrizio Torelli, at the following email address: