tubbz-alt / kmeans-service

This project forked from mayhem-lab/kmeans-service

Web service for K-Means clustering algorithm with Mahalanobis distance and Bayesian Information Criterion.

License: Other

Centaurus: K-Means as a Service

Centaurus is a scalable, easy-to-use cloud service for k-means clustering. It automatically deploys and executes multiple k-means variants concurrently, then scores the results with the Bayesian Information Criterion (BIC) to determine the best model fit and provide a clustering recommendation. Visualization and diagnostic tools are available to help users interpret clustering results.
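
The scoring step can be illustrated with a minimal sketch. It assumes the common "lower is better" form of BIC, `n_params * ln(n_samples) - 2 * log_likelihood`; the sign convention and the parameter counts used by Centaurus itself may differ. The function names, dictionary keys, and numbers below are hypothetical and serve only as illustration, not as results from the service.

```python
import math


def bic(log_likelihood, n_params, n_samples):
    """BIC in its common form; lower values indicate a better fit/complexity trade-off."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def recommend(results, n_samples):
    """Score each candidate fit and return the one with the lowest BIC.

    `results` is a list of dicts with the hypothetical keys
    'k', 'log_likelihood' and 'n_params'.
    """
    scored = [
        dict(r, bic=bic(r["log_likelihood"], r["n_params"], n_samples))
        for r in results
    ]
    return min(scored, key=lambda r: r["bic"])


# Made-up log-likelihoods for three candidate values of k (illustration only).
candidates = [
    {"k": 2, "log_likelihood": -520.0, "n_params": 5},
    {"k": 3, "log_likelihood": -480.0, "n_params": 8},
    {"k": 4, "log_likelihood": -478.0, "n_params": 11},
]
best = recommend(candidates, n_samples=200)
print(best["k"])
```

With these numbers, moving from k=3 to k=4 improves the likelihood only slightly, so the penalty for the extra parameters outweighs the gain and k=3 is recommended.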

Authors: Angad Gill, Nevena Golubovic

Architecture

The system consists of a total of five services:

  • Frontend: The frontend is provided by a Python Flask server (site/frontend.py) paired with Gunicorn and NGINX.

  • Backend: The backend consists of four services:

    1. Worker: Python Celery to perform all analysis tasks asynchronously (site/worker.py).
    2. Queue: RabbitMQ as a message broker between the Frontend and Workers.
    3. Database: Centaurus can be used with either NoSQL (MongoDB) or SQL (Postgres) to store all parameters for analysis and results of all tasks associated with each analysis.
    4. Storage: Amazon S3 to store the data files uploaded by users.

    [Figure: Centaurus Architecture]

Purpose

The purpose of the Frontend is to do the following:

  1. Provide an interface for users to upload their data files to the Backend Storage.
  2. Provide an interface for users to view the status and results of the analysis.
  3. Generate all the tasks (individual k-means fit runs) needed to complete a job.
  4. Generate the plots and tables needed for items 1 and 2.
  5. Allow users to rerun tasks that failed.
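
As a rough sketch of item 3, a job can be expanded into one task per combination of parameters. The parameter names (`k_values`, `covariance_types`, `n_runs`) and the task fields are assumptions made for illustration; the actual job parameters are defined in site/frontend.py and may differ.

```python
from itertools import product


def generate_tasks(job_id, k_values, covariance_types, n_runs):
    """Expand one job into individual k-means fit tasks.

    Each task is a plain dict; all field names here are hypothetical.
    """
    tasks = []
    combos = product(k_values, covariance_types, range(n_runs))
    for task_id, (k, cov, run) in enumerate(combos):
        tasks.append({
            "job_id": job_id,
            "task_id": task_id,
            "k": k,
            "covariance_type": cov,
            "run": run,
            "status": "pending",
        })
    return tasks


# 3 values of k x 2 covariance types x 2 repeated runs = 12 tasks
tasks = generate_tasks(
    "job-1",
    k_values=range(1, 4),
    covariance_types=["full", "diag"],
    n_runs=2,
)
print(len(tasks))
```

Keeping each task as an independent unit is what lets the workers execute the variants concurrently and lets a single failed task be rerun without repeating the whole job.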

The purpose of the Backend Worker is to do the following:

  1. Run the analysis based on the data and parameters provided in the Backend Queue.
  2. When done, update the Backend Database with the analysis results.
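
The worker's two responsibilities can be sketched with standard-library stand-ins: `queue.Queue` takes the place of RabbitMQ and a dict takes the place of the database. This is not how site/worker.py is written (the real worker uses Celery); the function names, the record fields, and `demo_fit` are hypothetical. The last helper shows one way the rerun step listed under the Frontend could put failed tasks back on the queue.

```python
import queue


def run_task(task, fit):
    """Run one task and return the record to store, including failures."""
    try:
        return dict(task, status="done", result=fit(task))
    except Exception as exc:
        return dict(task, status="failed", error=str(exc))


def worker_loop(task_queue, database, fit):
    """Drain the queue, storing one record per task in the database stand-in."""
    while True:
        try:
            task = task_queue.get_nowait()
        except queue.Empty:
            break
        database[task["task_id"]] = run_task(task, fit)


def requeue_failed(task_queue, database):
    """Put failed tasks back on the queue; returns how many were requeued."""
    count = 0
    for record in database.values():
        if record["status"] == "failed":
            task = {key: value for key, value in record.items()
                    if key not in ("status", "error", "result")}
            task_queue.put(task)
            count += 1
    return count


def demo_fit(task):
    """Placeholder for the real k-means fit; rejects k < 1."""
    if task["k"] < 1:
        raise ValueError("k must be positive")
    return {"k": task["k"]}


q = queue.Queue()
for i, k in enumerate([2, 0, 3]):
    q.put({"task_id": i, "k": k})

db = {}
worker_loop(q, db, demo_fit)
statuses = {task_id: rec["status"] for task_id, rec in db.items()}
print(statuses)
```

Recording failures as ordinary records, rather than letting the exception end the loop, keeps the status view complete and makes the failed tasks easy to find later.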

Installation

See site/README.md.

Publications:

N. Golubovic, A. Gill, C. Krintz, R. Wolski, "CENTAURUS: A Cloud Service for K-means Clustering", 2017 IEEE 15th Intl Conf on Dependable, Autonomic and Secure Computing, 15th Intl Conf on Pervasive Intelligence and Computing, 3rd Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress
