
gabrielegilardi / clustering


Implementation of K-means and fuzzy C-means clustering using the naive algorithm and particle swarm optimization.

License: MIT License



K-means and Fuzzy C-means Clustering Using a Naive Algorithm and Particle Swarm Optimization

Features

  • The code has been written and tested in Python 3.8.8.

  • Two clustering methods (K-means and fuzzy C-means) and two solvers (naive algorithm and PSO).

  • For the K-means clustering method:

    • the distance of each sample from its cluster center is taken as the clustering error;
    • the objective function being minimized is the sum of squared errors;
    • the silhouette coefficient and Davies–Bouldin index are available metrics;
    • the function assign_data can be used to classify new data.
  • For the fuzzy C-means clustering method:

    • the weighted distance of each sample from the cluster centers is taken as the clustering error;
    • the objective function being minimized is the sum of weighted squared errors;
    • Dunn's and Kaufman's fuzzy partition coefficients are available metrics;
    • the function calc_U can be used to classify new data.
  • Usage: python test.py example, where example is the name of the example to run (g2, dim2, unbalance, or s3).
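The K-means items in the list above can be sketched in a few lines. What follows is a minimal, standard-library illustration of the naive (Lloyd-style) iteration with re-starts, assuming Euclidean distance and initial centers drawn from the samples. The helper names `sq_dist`, `assign`, `sse`, `kmeans` and `kmeans_restarts` are hypothetical and are not the repository's API; in particular, the repository's `assign_data` may have a different signature.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Index of the closest center for each point (also usable on new data)."""
    return [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points]


def sse(points, centers, labels):
    """Sum of squared errors: squared distance of each point to its center."""
    return sum(sq_dist(p, centers[k]) for p, k in zip(points, labels))


def kmeans(points, K, max_iter=100, seed=0):
    """One run of the naive algorithm: alternate assignment and center update."""
    rng = random.Random(seed)
    centers = rng.sample(points, K)      # assumption: start from K distinct samples
    labels = assign(points, centers)
    for _ in range(max_iter):
        for k in range(K):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:                  # an empty cluster keeps its old center
                centers[k] = tuple(sum(c) / len(members) for c in zip(*members))
        new_labels = assign(points, centers)
        if new_labels == labels:         # assignments are stable: converged
            break
        labels = new_labels
    return centers, labels, sse(points, centers, labels)


def kmeans_restarts(points, K, n_rep=5, max_iter=100):
    """Repeat the naive algorithm n_rep times and keep the lowest-SSE run."""
    runs = (kmeans(points, K, max_iter, seed=r) for r in range(n_rep))
    return min(runs, key=lambda run: run[2])


# Toy data (made up for illustration): two well-separated groups.
data = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.9, 5.1)]
centers, labels, err = kmeans_restarts(data, K=2)
new_labels = assign([(0.1, 0.1), (5.1, 5.0)], centers)   # classify new samples
```

Re-starting from different initial centers matters because a single run only reaches a local minimum of the sum of squared errors.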

Main Parameters

example: Name of the example to run (g2, dim2, unbalance, s3).

nPop, epochs: Number of agents (population) and number of iterations.

K, K_list: Number of clusters.

n_rep: Number of repetitions (re-starts) in the naive algorithm.

max_iter: Maximum number of iterations in the naive algorithm.

func: Name of the interface function for the PSO.

m: Fuzziness coefficient in the fuzzy C-means method.

tol: Convergence tolerance in the fuzzy C-means method.

The other PSO parameters are used with their default values (see pso.py).
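To make the roles of `func`, `nPop` and `epochs` more concrete, here is a hedged sketch of a generic global-best PSO that minimizes the K-means sum of squared errors. It is not the repository's `pso.py`: the inertia and acceleration coefficients (`w`, `c1`, `c2`), the box bounds (`lower`, `upper`), and the function names `kmeans_cost` and `pso` are assumptions made for illustration. The interface function decodes a flat position vector into K cluster centers and returns the cost.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_cost(position, points, K):
    """Interface function: decode a flat position into K centers, return the SSE."""
    dim = len(points[0])
    centers = [position[k * dim:(k + 1) * dim] for k in range(K)]
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def pso(func, lower, upper, nPop=20, epochs=100, w=0.7, c1=1.5, c2=1.5,
        seed=0, args=()):
    """Minimize func over the box [lower, upper] with a global-best PSO."""
    rng = random.Random(seed)
    n = len(lower)
    pos = [[rng.uniform(lower[j], upper[j]) for j in range(n)]
           for _ in range(nPop)]
    vel = [[0.0] * n for _ in range(nPop)]
    best_pos = [p[:] for p in pos]                  # personal best positions
    best_cost = [func(p, *args) for p in pos]       # personal best costs
    g = min(range(nPop), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]    # global best so far
    for _ in range(epochs):
        for i in range(nPop):
            for j in range(n):
                vel[i][j] = (w * vel[i][j]
                             + c1 * rng.random() * (best_pos[i][j] - pos[i][j])
                             + c2 * rng.random() * (g_pos[j] - pos[i][j]))
                # Move the agent and clip it to the search box.
                pos[i][j] = min(max(pos[i][j] + vel[i][j], lower[j]), upper[j])
            cost = func(pos[i], *args)
            if cost < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], cost
                if cost < g_cost:
                    g_pos, g_cost = pos[i][:], cost
    return g_pos, g_cost


# Toy data (made up for illustration): 2 clusters, 2 features.
data = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.9, 5.1)]
lower, upper = [0.0] * 4, [5.2] * 4                 # K * features = 4 unknowns
g_pos, g_cost = pso(kmeans_cost, lower, upper, nPop=20, epochs=100,
                    args=(data, 2))
```

Because every agent encodes all K centers at once, the search space has K times the number of features as its dimension, which is why the population size and the number of epochs are the main tuning knobs.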

Examples

Example 1: g2

K-means using PSO, 2 clusters, 8 features, 2048 samples.

# Cluster centers:
# [[600, 600, 600, 600, 600, 600, 600, 600],
#  [500, 500, 500, 500, 500, 500, 500, 500]]

# Found solution:
# [[599.06 598.27 599.21 600.61 600.05 598.84 600.48 599.4 ]
#  [499.76 499.45 499.9  500.92 497.64 498.66 499.48 499.39]]

# Max. error [%]: 0.473
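As a quick check of the figures above, the maximum error can be recomputed from the printed centers. This sketch assumes the error is the largest relative deviation of any coordinate from the true center; that definition is a guess and not taken from the repository. With the rounded values shown it gives about 0.472, which agrees with the reported 0.473 up to the rounding of the printed centers.

```python
# True cluster centers and the solution found, copied from the listing above.
true_centers = [[600.0] * 8, [500.0] * 8]
found = [
    [599.06, 598.27, 599.21, 600.61, 600.05, 598.84, 600.48, 599.40],
    [499.76, 499.45, 499.90, 500.92, 497.64, 498.66, 499.48, 499.39],
]

# Relative error (in percent) of every coordinate of every center.
errors = [abs(f - t) / t * 100.0
          for row_f, row_t in zip(found, true_centers)
          for f, t in zip(row_f, row_t)]

max_error = max(errors)
print(f"Max. error [%]: {max_error:.3f}")   # prints 0.472 for the rounded values
```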

Example 2: dim2

K-means using naive algorithm, 2 to 15 clusters, 2 features, 1351 samples, silhouette coefficient and Davies–Bouldin index as metrics.

[Figure: example_2]
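The two metrics used in this example can be sketched as follows. This is a plain standard-library illustration of the usual textbook definitions with Euclidean distance, not the repository's code; the function names `group`, `silhouette` and `davies_bouldin` are hypothetical. Both functions need at least two clusters.

```python
import math


def euclid(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def group(points, labels):
    """Map each label to the list of points carrying it."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters


def silhouette(points, labels):
    """Mean silhouette coefficient, between -1 and 1 (higher is better)."""
    clusters = group(points, labels)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue                     # convention: s = 0 for a singleton cluster
        # a: mean distance to the other members of the same cluster
        a = sum(euclid(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(euclid(p, q) for q in other) / len(other)
                for k, other in clusters.items() if k != lab)
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)


def davies_bouldin(points, labels):
    """Davies-Bouldin index, non-negative (lower is better)."""
    clusters = group(points, labels)
    centers = {k: [sum(c) / len(m) for c in zip(*m)]
               for k, m in clusters.items()}
    # Scatter: mean distance of the members of a cluster to its center.
    scatter = {k: sum(euclid(p, centers[k]) for p in m) / len(m)
               for k, m in clusters.items()}
    # For each cluster, the worst (largest) similarity to any other cluster.
    worst = [max((scatter[i] + scatter[j]) / euclid(centers[i], centers[j])
                 for j in clusters if j != i)
             for i in clusters]
    return sum(worst) / len(worst)


# Toy data (made up for illustration): two compact, well-separated clusters.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
lab = [0, 0, 1, 1]
sil = silhouette(pts, lab)
db = davies_bouldin(pts, lab)
```

When scanning a range of cluster counts, a common reading is to prefer the count with a high silhouette coefficient and a low Davies–Bouldin index.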

Example 3: unbalance

Fuzzy C-means using PSO, 8 clusters (unbalanced), 2 features, 6500 samples.

[Figure: example_3]

Example 4: s3

Fuzzy C-means using naive algorithm, 2 to 20 clusters, 2 features, 5000 samples, Dunn's and Kaufman's fuzzy partition coefficients as metrics.

[Figure: example_4]
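For reference, the naive fuzzy C-means iteration and Dunn's partition coefficient can be sketched as below. This is a standard-library illustration of the textbook update rules, assuming Euclidean distance, a fuzziness coefficient m greater than 1, and convergence measured as the largest change in any membership value; the stopping rule and the names `memberships`, `fuzzy_cmeans` and `dunn_coefficient` are assumptions, and the repository's `calc_U` may be defined differently. Kaufman's coefficient is not shown.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def memberships(points, centers, m=2.0):
    """Membership matrix U, where U[i][k] is the degree of point i in cluster k."""
    U = []
    for p in points:
        d2 = [sq_dist(p, c) for c in centers]
        if 0.0 in d2:                    # the point sits exactly on a center
            hits = d2.count(0.0)
            U.append([1.0 / hits if d == 0.0 else 0.0 for d in d2])
        else:
            # u_k is proportional to (1 / d_k^2) ** (1 / (m - 1))
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(inv)
            U.append([v / total for v in inv])
    return U


def fuzzy_cmeans(points, K, m=2.0, tol=1e-6, max_iter=200, seed=0):
    """Alternate center and membership updates until U changes by less than tol."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, K)]
    U = memberships(points, centers, m)
    for _ in range(max_iter):
        for k in range(K):
            # Each center is the mean of all points weighted by u ** m.
            w = [row[k] ** m for row in U]
            centers[k] = [sum(wi * p[j] for wi, p in zip(w, points)) / sum(w)
                          for j in range(len(points[0]))]
        new_U = memberships(points, centers, m)
        change = max(abs(a - b)
                     for r_new, r_old in zip(new_U, U)
                     for a, b in zip(r_new, r_old))
        U = new_U
        if change < tol:
            break
    return centers, U


def dunn_coefficient(U):
    """Dunn's partition coefficient (1/K to 1) and its normalized form (0 to 1)."""
    N, K = len(U), len(U[0])
    pc = sum(u * u for row in U for u in row) / N
    return pc, (K * pc - 1.0) / (K - 1.0)


# Toy data (made up for illustration).
data = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.9, 5.1)]
centers, U = fuzzy_cmeans(data, K=2)
pc, pc_norm = dunn_coefficient(U)
new_U = memberships([(0.1, 0.1)], centers)   # memberships of a new sample
```

A normalized coefficient close to 1 indicates an almost crisp partition, while a value close to 0 means the memberships are nearly uniform across clusters.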

