Complex JSON structures coming from different sources? We want to solve your problem. Take a look to get inspired!
You have small but complex data, and you want to get it ready for any analytics tool with as little effort and expense as possible. In the end we tried the BSON library together with the Spring Cloud Dataflow framework to run a 'continuous streaming'...
In the real world we should provide at least an architecture similar to this:
As a PoC we start with a single Maven project; all we can provide is a small subset of the minimum viable architecture, as shown here:
In the MongoDB instance you can find the flow-centric database.
After a data streaming cycle we have two MongoDB collections per category.
In the data collections you can find some ready-made indexes:
Here is how the MongoDB collections appear.
And in any of the data collections we have elements that track:
- metadata partition
- metadata document id
- index
- model name
All as shown in the following images:
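To make the tracked fields concrete, here is a minimal Java sketch of how one such element could be modeled (class and field names are assumptions for illustration, not the project's actual code):

```java
// Hypothetical sketch of one element stored in a data collection.
// Field names mirror the four tracked fields listed above; the real
// project classes may differ.
class FlowElement {
    private final String metadataPartition;   // partition of the originating metadata
    private final String metadataDocumentId;  // id of the source metadata document
    private final long index;                 // position of the element in the stream
    private final String modelName;           // name of the recognized model/category

    FlowElement(String metadataPartition, String metadataDocumentId,
                long index, String modelName) {
        this.metadataPartition = metadataPartition;
        this.metadataDocumentId = metadataDocumentId;
        this.index = index;
        this.modelName = modelName;
    }

    String getMetadataPartition() { return metadataPartition; }
    String getMetadataDocumentId() { return metadataDocumentId; }
    long getIndex() { return index; }
    String getModelName() { return modelName; }
}
```

Each element carries enough information to join data back to its metadata document and partition, which is what the ready-made indexes shown above are for.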
The Spring Cloud Config server takes its configuration from a specific repository, as follows:
It provides some profiles:
- dev (source_dev, process_dev, sink_dev)
- compose (source_compose, process_compose, sink_compose)
- kubernetes (source_kubernetes, process_kubernetes, sink_kubernetes)
- local (not ready)
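As an illustration, selecting the `dev` profile for the source microservice might look like the following bootstrap configuration (file layout, application name, and config server URI are assumptions, not taken from the project):

```yaml
# bootstrap.yml of the source microservice (hypothetical sketch)
spring:
  application:
    name: source        # would resolve to source_dev in the config repository
  profiles:
    active: dev         # one of: dev, compose, kubernetes, local
  cloud:
    config:
      uri: http://localhost:8888   # assumed Spring Cloud Config server address
```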
We provide a Docker Compose file to simulate the base environment.
Please visit the scripts folder.
Information about Docker Compose is here, and the command-line reference is here.
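As a sketch of what such a base environment could contain (service names, images, and ports are assumptions; see the actual compose file in the scripts folder):

```yaml
# docker-compose.yml sketch of the base environment (hypothetical)
version: "3.8"
services:
  mongodb:
    image: mongo:4.4          # hosts the flow-centric database
    ports:
      - "27017:27017"
  config-server:
    image: config-server:latest   # assumed local image for the Spring Cloud Config server
    ports:
      - "8888:8888"
    depends_on:
      - mongodb
```

Bringing the environment up would then be a matter of running `docker compose up -d` from the folder containing the file.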
Enjoy your journey in the Spring Cloud Dataflow framework.
Here is the Docker image sources repository:
An upcoming branch will bring a compose whose main additions are the Spring Cloud Dataflow Server and the Spring Cloud Skipper Server. You will be able to scale as far as your system resources allow, and to scale the three microservices (source, process, and sink), registered into the server via the catalogue, individually. We will also provide the three analytics microservices, with some automation around the definition and recognition of model types, indexes, and some new spatial concepts. Indexes required by the analytics nodes will be placed automatically via the model database (missing in this release). Another streaming engine will push data into the metadata sourcing microservice channels, and it will be used to push back responses from the analytics sink microservice after the analytics group computation. So let's get ready for a more intensive experience in the dataflow universe ...
The library is licensed under CC0 v1.0, with prior authorization from the author required before any production or commercial use. Unauthorized use of this library or any extension of it is prohibited, due to the high risk of damage from improper use. No warranty is provided for improper or unauthorized use of this library or any implementation of it.
Any request can be sent to the author, Fabrizio Torelli, at the following email address: