This project forked from guglielmosanchini/clustviz

Visualization of many Clustering Algorithms, via Notebook or GUI

License: MIT License

Python 26.33% Jupyter Notebook 73.67%

ClustViz

2D Clustering Algorithms Visualization

Check out ClustVizGUI, too

ClustViz aims to visualize every step of each clustering algorithm for 2D input data.

The following algorithms have been examined:

  • OPTICS

  • DBSCAN

  • HDBSCAN

  • SPECTRAL CLUSTERING

  • HIERARCHICAL AGGLOMERATIVE CLUSTERING

    • single linkage
    • complete linkage
    • average linkage
    • Ward's method
  • CURE

  • BIRCH

  • PAM

  • CLARA

  • CLARANS

  • CHAMELEON

  • CHAMELEON2

  • DENCLUE

Instructions

Documentation: click here

Install with

pip install clustviz

To run the BIRCH algorithm, the open-source visualization software Graphviz is required. Install Graphviz from the official webpage (https://graphviz.gitlab.io/download/) or with Homebrew, then add its location to the PATH variable as follows (replace the string with the path where you installed Graphviz):

import os
# on Windows usually
os.environ["PATH"] += os.pathsep + 'C:/Program Files (x86)/Graphviz2.38/bin'
# on MacOS usually
os.environ["PATH"] += os.pathsep + '/usr/local/bin'
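
Before running BIRCH, it can help to confirm that Graphviz is actually reachable from Python. The following is a minimal sketch, assuming that the Graphviz `dot` executable is what needs to be found on the PATH; the helper name `graphviz_available` is hypothetical and not part of clustviz:

```python
import shutil


def graphviz_available(executable="dot"):
    """Return True if the given Graphviz executable can be found on the PATH."""
    # shutil.which searches the directories listed in os.environ["PATH"]
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if graphviz_available():
        print("Graphviz found at", shutil.which("dot"))
    else:
        print("Graphviz not found; check the PATH modification above.")
```

If the check fails after modifying PATH, the string appended to PATH most likely does not match the folder where Graphviz was installed.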

To run the CHAMELEON and CHAMELEON2 algorithms, the METIS library is required. To install it on macOS, execute the following commands (partially taken from here):

# download the file using wget (do it from the website if you prefer)
wget http://glaros.dtc.umn.edu/gkhome/fetch/sw/metis/metis-5.1.0.tar.gz
# uncompress it
gunzip metis-5.1.0.tar.gz
# untar it
tar -xvf metis-5.1.0.tar
# remove the tar
rm metis-5.1.0.tar
# go inside the folder
cd metis-5.1.0
# install it using make
make config shared=1
make install
# export the dll
export METIS_DLL=/usr/local/lib/libmetis.dylib
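
To check that the exported variable is visible from Python, something like the sketch below may be useful. It only assumes the `METIS_DLL` variable name used above; the helper `metis_dll_status` is hypothetical and not part of clustviz or METIS:

```python
import os


def metis_dll_status(env=None):
    """Report whether METIS_DLL is set and points to an existing file.

    Returns "unset", "missing" (set, but no file at that path) or "ok".
    """
    env = os.environ if env is None else env
    path = env.get("METIS_DLL")
    if not path:
        return "unset"
    return "ok" if os.path.isfile(path) else "missing"


print("METIS_DLL status:", metis_dll_status())
```

Since `export` only affects the current shell session, Python has to be started from that same shell, or the export line added to the shell profile, for the status to be "ok".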

To install METIS on Windows, go to conda-metis and follow the instructions.

Usage

Let's see a basic example using OPTICS:

from clustviz.optics import OPTICS, plot_clust
from sklearn.datasets import make_blobs

# create a random dataset
X, y = make_blobs(n_samples=30, centers=4, n_features=2, cluster_std=1.8, random_state=42)

# perform OPTICS algorithm, with plotting enabled
ClustDist, CoreDist = OPTICS(X, eps=2, minPTS=3, plot=True, plot_reach=True)

# plot the final clusters
plot_clust(X, ClustDist, CoreDist, eps=2, eps_db=1.9)
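
For intuition about the role of `eps` and `minPTS`, here is a minimal sketch of the textbook OPTICS core distance. It is not clustviz's implementation, and it does not reproduce the structure of the `CoreDist` value returned above; the function name `core_distance` and the convention that a point counts as its own neighbour are assumptions made for illustration:

```python
import math


def core_distance(points, index, eps, min_pts):
    """Distance from points[index] to its min_pts-th nearest neighbour,
    or None if fewer than min_pts points lie within eps.

    Assumption: the point itself counts as a neighbour (distance 0).
    """
    dists = sorted(math.dist(points[index], q) for q in points)
    within = [d for d in dists if d <= eps]
    if len(within) < min_pts:
        return None  # not a core point for this eps / min_pts
    return within[min_pts - 1]


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
print(core_distance(points, 0, eps=2, min_pts=3))  # 1.0
print(core_distance(points, 3, eps=2, min_pts=3))  # None
```

The first point has two other points at distance 1, so it is a core point; the isolated point at (5, 5) has no neighbours within `eps` and therefore no core distance.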

For many other examples, take a look at the detailed clustviz_example notebook.

Repository structure

  1. The folder data/DOCUMENTS contains all the official papers, PowerPoint presentations and other PDFs regarding all the algorithms involved and clustering in general.
  2. The folder clustviz contains the scripts necessary to run the clustering algorithms.
  3. The notebook data/clustviz_example.ipynb lets the user run every algorithm on 2D datasets; it contains a subsection for every algorithm, with the necessary modules and functions imported and some commented lines of code which can be uncommented to run the algorithms.
  4. The folder docs contains the necessary files to build the documentation using Sphinx and ReadTheDocs.
  5. The folder tests contains pytest tests.

Credits for some algorithms

I did not write every script from scratch: in some cases I modified existing Python libraries, and in others I adapted scripts from publicly available GitHub repositories. The following list provides all the sources used when I did not write all the code myself:

The other algorithms have been implemented from scratch, following the respective papers. Thanks to Darius (https://github.com/dariomonici), the GUI Meister, for the help with PyQt5, used for ClustVizGUI.

Possible improvements

  • add more clustering algorithms
  • comment every code block and improve code quality
  • pymetis doesn't work on Windows, but could be an option for macOS
  • add highlights to docstrings using ``
  • show type hint aliases using Sphinx (open issue)

TravisCI path

  • if Travis CI doesn't trigger, it is probably because .travis.yml isn't properly formatted; use yamllint to find and correct the problem
  • add package update
  • for the deployment phase: brew install ruby, brew install travis
  • added empty conftest.py in clustviz folder for tests in Windows version

clustviz's People

Contributors

guglielmosanchini
