airavata-courses / garuda

A distributed weather data visualization platform

Home Page: https://github.com/airavata-courses/garuda#garuda

License: Apache License 2.0

Java 0.67% Python 1.59% JavaScript 47.76% Dockerfile 0.06% Shell 0.19% HTML 42.69% CSS 4.33% Less 2.71%

nexrad node java javascript docker docker-compose python flask aws maven

garuda's Introduction

Garuda

Spring 2022 Project

An application that visualizes user-requested NEXRAD data.

Run on UNIX based systems

Dependency/Prerequisite

Software/prerequisites needed to run Garuda: Docker

Note: You'll need the latest version of Docker Engine.

Start Application

Export ENV variables

export PROJECT=PROJECT3

Pull the latest images from DockerHub

docker-compose pull

Start application services

docker-compose up

Run the above command in your terminal from the root of the project folder to create all the resources needed to run the project.

Adding hostnames in /etc/hosts

sudo sh scripts/host.sh

Note: docker-compose up creates 6 containers for running the application.

Note: The services run in non-detached mode. When you exit the process from the terminal, all the containers stop.

Note: docker-compose up might take some time to run because it spins up all the containers required to run the project. Once all the resources have loaded, logs stop printing on the terminal and you can use the application.

Access Web-Application

URL for the web-application: http://garuda.org:3000

Stop Application

Press CTRL + C to exit

Clean Created Resources

Done playing around? Run this command to remove all the created resources.

docker-compose down

Build

Rebuild the resources if needed

docker-compose build

Note: Before building, make sure you have these environment variables exported in your terminal:

NASA_USERNAME - NASA MERRA2 dashboard username

NASA_PASSWORD - NASA MERRA2 dashboard password

AWS_ACCESS_KEY_ID - JetStream Object Store access key ID

AWS_SECRET_ACCESS_KEY - JetStream Object Store access secret

PROJECT - Version of project you want to build
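
A build started with one of these variables unset may fail or produce images without credentials. The following is a minimal sketch of a pre-build check in Python; the helper name `missing_build_vars` is hypothetical and not part of the repository, and only the variable names come from the list above.

```python
import os

# Variable names taken from the list above.
REQUIRED_BUILD_VARS = [
    "NASA_USERNAME",
    "NASA_PASSWORD",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "PROJECT",
]


def missing_build_vars(env=None):
    """Return the required variables that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_BUILD_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_build_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All build variables are set.")
```

Running the script before docker-compose build lists any variable that still needs to be exported.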

Run on Windows based systems

Dependency/Prerequisite

Software/prerequisites needed to run Garuda: Docker

Note: You'll need the latest version of Docker for Windows.

Start Application

Export ENV variables

export PROJECT=PROJECT3

Pull the latest images from DockerHub

docker-compose pull

Start application services

docker-compose up

Run the above command in cmd from the root of the project folder to create all the resources needed to run the project.

Adding hostnames in /etc/hosts

sudo sh scripts/host.sh

Note: docker-compose up creates 6 containers for running the application.

Note: The services run in non-detached mode. When you exit the process from the terminal, all the containers stop.

Note: docker-compose up might take some time to run because it spins up all the containers required to run the project. Once all the resources have loaded, logs stop printing on the terminal and you can use the application.

Access Web-Application

URL for the web-application: http://garuda.org:3000

Stop Application

Press CTRL + C to exit

Clean Created Resources

Done playing around? Run this command to remove all the created resources.

docker compose down

Build

Rebuild the resources if needed

docker compose build

Note: Before building, make sure you have these environment variables exported in your terminal:

NASA_USERNAME - NASA MERRA2 dashboard username

NASA_PASSWORD - NASA MERRA2 dashboard password

AWS_ACCESS_KEY_ID - JetStream Object Store access key ID

AWS_SECRET_ACCESS_KEY - JetStream Object Store access secret

PROJECT - Version of project you want to build

Access JetStream production deployment

Add entries to /etc/hosts for the production URL

sudo sh scripts/prod_host.sh

Install a CORS plugin in your browser to enable CORS headers, since the application uses the JetStream object store: sample cors plugin

Access the application at http://garuda.org

Modules

Data Extractor: Apache Maven project that builds a utility JAR file which extracts requested NEXRAD data from S3.
Queue Worker: Apache Maven project that builds a JAR file which runs a consumer on a RabbitMQ queue. It processes each request using the data_extractor utility JAR and publishes the data to an API endpoint.
DB_Middleware: Microservice that interacts with the database. It provides APIs to perform reads and writes to the database. Reads are performed by the API_Gateway module, and writes are performed by the Queue_Worker and API_Gateway modules. It also dumps the dataset of each request to the object store (AWS S3 bucket) and saves the object URL in the database.
API_Gateway: The API_Gateway module provides a middleware layer for all the back-end services. The front-end application communicates with the API_Gateway module to interact with all other microservices.
Web_App: The Web Application module is the application with which end users interact. It communicates with the API_Gateway module to maintain user data and fetch NEXRAD data.
Queue Worker Nasa: Python application which runs a consumer on a RabbitMQ queue. It processes each request using the extractor utility and publishes the data in a converted format to an API endpoint.
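
The consume, extract and publish flow shared by the two queue workers can be sketched as below. This is an illustration only: an in-memory `queue.Queue` stands in for RabbitMQ, and `run_worker`, `extract` and `publish` are hypothetical names rather than the project's actual code. The `requestID`, `start_time` and `end_time` fields mirror the parameters mentioned in the project's issues.

```python
import queue


def run_worker(requests, extract, publish):
    """Drain the queue: extract data for each request, then publish the result.

    `requests` is a queue.Queue standing in for a RabbitMQ queue; `extract`
    and `publish` are caller-supplied callables (both hypothetical).
    Returns the number of requests handled.
    """
    handled = 0
    while True:
        try:
            request = requests.get_nowait()
        except queue.Empty:
            return handled
        data = extract(request)
        publish({"requestID": request["requestID"], "data": data})
        handled += 1


if __name__ == "__main__":
    pending = queue.Queue()
    pending.put({"requestID": "r1", "start_time": "00:00", "end_time": "01:00"})
    published = []
    # A fake extractor that just echoes the requested time window.
    run_worker(pending, lambda r: [r["start_time"], r["end_time"]], published.append)
    print(published)
```

In the real services the publish step would be an HTTP call to the API endpoint; passing it in as a callable keeps the loop easy to test without a broker.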

Optimization

In the Project 3 milestone, brainstorming revealed room for improvement that significantly reduced the load on the backend. The improvement was to store each request's dataset in the object store (AWS S3 bucket); the web app then retrieves the data from the object store whenever a user requests a map plot. To compare the system's performance with and without the object store, we benchmarked it with JMeter using 100 concurrent requests. The average response time was 14194 ms without the object store and 399 ms with it, a reduction of about 97% (roughly 35 times faster). The details of the reports are available here
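
The size of the improvement follows directly from the two averages quoted above. A small sketch of the arithmetic, with illustrative variable names:

```python
# Average response times from the JMeter benchmark (100 concurrent requests).
before_ms = 14194  # without the object store
after_ms = 399     # with the object store

# Relative reduction in response time, as a percentage of the original.
reduction_pct = (before_ms - after_ms) / before_ms * 100

# Speedup factor: how many times faster the new path is.
speedup = before_ms / after_ms

print(f"reduction: {reduction_pct:.1f}%")  # reduction: 97.2%
print(f"speedup: {speedup:.1f}x")          # speedup: 35.6x
```

A reduction can never exceed 100% of the original value, so the speedup factor is the better way to express a gain this large.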

Architecture

Napkin Diagram

CI/CD

CI: A GitHub Actions workflow is used for CI. Any pull request or commit to the main branch triggers the CI workflow. Garuda_CI
CD:
1. GitHub Pages is used to deploy data_extractor's javadocs.
2. GitHub Pages also hosts static assets. The docs/ folder is hosted via GitHub Pages.
3. CD is triggered on each push to the master branch.
4. CD logs into the JetStream2 remote server, builds all the Docker images, pushes them to DockerHub, and replaces old deployments with the latest deployments on the remote Kubernetes cluster.

Packages / Distribution builds

Garuda's Data Extractor Maven Package

Developers

Pranav Palani Acharya
Rishabh Deepak Jain
Tanmay Dilipkumar Sawaji

garuda's Issues

UI - Dashboard (user requests )

Show all user requests in a list

Remove last plotted map

removed last plotted map from view

Docker setup with env variables

Containerization with dynamic hostnames

Deploy data-serve using docker

Fetch station data to serve UI request

More informative map plots.

I would like to point out that the weather data on the map can be made more informative by providing some visual aids like a heat map or reflectivity index.

Login module GoogleAuth

Support document of size greater than 16mb

db_middleware feature to add support for 16mb document

Project 1 Peer Review - Team Epsilon

Hi Team,

I tried to set up your project on my system, a Mac running macOS Monterey (v12.1), where port 5000 is already used by the OS. I saw that your API gateway runs on port 5000, so I tried to change it to an open port on my system (port 6000), but it still didn't work. I then found that the port is written in files other than the configuration files, and even after changing those it didn't work.

I would suggest not writing port details directly in the code, as was done in garuda.apigateway/server.py. You could also look into handling such port issues with fewer changes, or with a change in a single place; for example, changing the port in docker-compose.yml could change the port dependency everywhere.
You can have a look at the "environment" key for docker-compose.
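
One way to follow this suggestion is to read the port from an environment variable, which docker-compose can set in a single place. A minimal Python sketch follows; the variable name `API_GATEWAY_PORT` and the helper `resolve_port` are assumptions for illustration and do not come from the repository.

```python
import os

DEFAULT_PORT = 5000  # the port the review says the API gateway currently uses


def resolve_port(env=None, name="API_GATEWAY_PORT", default=DEFAULT_PORT):
    """Return the port from the environment, falling back to the default.

    `name` is a hypothetical variable name; pass a dict as `env` for testing.
    """
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None or raw == "":
        return default
    port = int(raw)  # raises ValueError for non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"{name} must be between 1 and 65535, got {port}")
    return port


if __name__ == "__main__":
    print(f"API gateway would listen on port {resolve_port()}")
```

With this in place, every service that needs the gateway's port reads the same variable, so changing it once in docker-compose.yml is enough.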

Kubernetes integration with scaling

create deployments and statefulset with auto scaling for all micro services

UI Datepicker issue

Date is not changing on click

Push data request to RabbitMq

Additional key in the response object to Frontend

add response from db_middleware in 'data_dump' key

Reflectivity not populated in DB

Populate reflectivity in DB

Restructure directory for data-serve

extract_data from NEXRAD database and fetch data

Test Case for db_middleware

added test for db_middleware ping path

Centralised CI using GitHub Actions

added CI for queue_worker

Moved data_extractor CI to garuda__github_actions_CI

react - refactor code

data conversion in data extraction module

modify data extraction module to change data format, ie. azimuth distance to lat,lng
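
Radar data is naturally polar: each sample sits at an azimuth and a distance from the station. A rough Python sketch of the conversion to latitude and longitude is given below. It uses the standard great-circle destination formula on a spherical Earth and ignores beam elevation; it is not the data_extractor's actual implementation, and the function name `polar_to_latlng` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def polar_to_latlng(station_lat, station_lng, azimuth_deg, range_km):
    """Convert (azimuth, distance) from a station into (lat, lng) in degrees.

    Azimuth is measured clockwise from true north.
    """
    lat1 = math.radians(station_lat)
    lng1 = math.radians(station_lng)
    bearing = math.radians(azimuth_deg)
    angular = range_km / EARTH_RADIUS_KM  # angular distance in radians

    lat2 = math.asin(
        math.sin(lat1) * math.cos(angular)
        + math.cos(lat1) * math.sin(angular) * math.cos(bearing)
    )
    lng2 = lng1 + math.atan2(
        math.sin(bearing) * math.sin(angular) * math.cos(lat1),
        math.cos(angular) - math.sin(lat1) * math.sin(lat2),
    )
    # Normalise longitude to the range [-180, 180).
    lng_deg = (math.degrees(lng2) + 540.0) % 360.0 - 180.0
    return math.degrees(lat2), lng_deg
```

For short radar ranges a flat-Earth approximation would also work, but the spherical formula costs little and stays accurate farther from the station.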

Unit testing of /getAllInfo

render map and plot data in react UI

UI changes in login screen and dashboard

Feedback related to Project 1 (Team Neo)

Congratulations on coming up with a smooth UI and data flow.

UI seems very minimal and intuitive.
Below is the resultant data:-

There are some points which we want to bring to your attention:-

You did well to persist the data to the DB for every search before returning it to the user. But it seems some time ranges have so much data that it is not retrieved properly because of MongoDB limitations. Maybe you can compress the lat-long data before storing it in the DB, so that merged documents won't reach the limit.
Docker images can be reduced in size using Alpine OS images of Node, Python and Java (not sure about this).
Build time can be removed by pushing images to Docker Hub.

Review done by Team Neo (https://github.com/airavata-courses/neo)

Test Cases for data_extractor project

Unit test for data_extractor project

Readme docs for db_middleware

Add docs

Project 1 Feedback (Scapsulators)

Just ran your project, and I have a couple of comments and suggestions -

The build took around 12-14 minutes and was easy to follow from the readme files.

Implementation of CircleCI for continuous builds on commits
Using docker-hub instead of individually building each microservice in your main repo
Time slot selection doesn't seem intuitive at all
Could've explored more properties instead of just reflectivity
Also not sure how you're handling the error when there's no data for the query

Rest great work! Loved your project as such

Reviewed by Team Scapsulators (https://github.com/airavata-courses/scapsulators)

Page refresh on click

Refresh page when a new request is submitted

integration of data serve, db_middleware and queue worker

Handle dataset larger than 16MB while retrieving from database

Possible solutions:

Sampling the dataset
Compressing the dataset while storing and retrieving data.
Retrieve data by parts
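
The three options above can be sketched in Python using only the standard library. The function names are hypothetical and the dataset is assumed to be a JSON-serialisable list of points; the 16 MB figure is MongoDB's maximum BSON document size.

```python
import json
import zlib

MAX_DOC_BYTES = 16 * 1024 * 1024  # MongoDB's BSON document size limit


def sample_every(points, step):
    """Option 1: keep every `step`-th point to shrink the dataset."""
    return points[::step]


def compress_dataset(points):
    """Option 2: serialise compactly to JSON and compress with zlib."""
    raw = json.dumps(points, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw, 9)


def decompress_dataset(blob):
    """Inverse of compress_dataset."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


def split_into_parts(blob, part_size=MAX_DOC_BYTES):
    """Option 3: split a byte string into parts no larger than part_size."""
    return [blob[i:i + part_size] for i in range(0, len(blob), part_size)]


if __name__ == "__main__":
    points = [[i * 0.1, i * 0.2, 5] for i in range(1000)]
    blob = compress_dataset(points)
    parts = split_into_parts(blob, 1024)
    restored = decompress_dataset(b"".join(parts))
    print(len(blob), len(parts), restored == points)
```

Sampling loses information, while compression and splitting are lossless, so the latter two can be combined when the full dataset must be preserved.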

UI - Login Page

React webapp setup and create login page

Development of Db_Middleware module to communicate with the database for QUEUE WORKER module

data_extractor multithread downloading

data_extractor multi thread downloading
added feature to not download if existing

queue_worker to send start_time, end_time, requestID and other params to db_middleware

Need to add few more parameter from queue_worker (start_time, end_time, requestID)
Tested rest of the parameters

Originally posted by @pranavacharya in #33 (comment)

Docker hub remote repository setup

Add remote repository setup for docker images

Docker Containerisation whole project

Docker compose for whole project

UI - Dashboard

Add exception handling to data-serve microservice

Project 1 Feedback [Team Zilean]

Likes:

System architecture looks very similar to the way we thought.

Observations:

Each microservice needs to make an API call to the DB middleware microservice to make DB operations. Here, it increases one network hop which might hamper performance.

Piyush Nalawade
Team Zilean 🦄

Queue worker to consume offload request from queue and extract data using data_extractor

consume offload request from rabbitmq queue
extract data using data_extractor
publish message / call db writer service
