zivzakalik / agglomerative-clustering

This repository contains an Agglomerative Clustering algorithm for classifying cancer types using biomedical data. It utilizes Euclidean distance for cluster merging, evaluates performance with purity metrics, and supports both Single Link and Complete Link methods.

Python 100.00%

Agglomerative Clustering

Distance Measurement

We use the Euclidean distance between samples to decide which clusters to merge: at each step, the two clusters that are closest under the chosen linkage method are combined.
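
As a minimal sketch of the distance step, assuming each sample is a tuple of numeric features, the pairwise Euclidean distances can be collected into a symmetric matrix. The function names `euclidean` and `distance_matrix` are illustrative and are not taken from the repository.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(samples):
    """Symmetric matrix of pairwise distances between samples."""
    n = len(samples)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Fill both halves so lookups work in either order.
            dist[i][j] = dist[j][i] = euclidean(samples[i], samples[j])
    return dist


# Hypothetical toy data, not from the repository's dataset.
samples = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
dist = distance_matrix(samples)
print(dist[0][1])  # 5.0
```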

Algorithm Quality

The performance of our clustering algorithm is evaluated by its purity. Purity is commonly defined as the fraction of samples whose true label matches the majority label of the cluster they were assigned to, so a value of 1 means every cluster contains a single class.
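
The sketch below computes purity under the common definition (majority-label count summed over clusters, divided by the total number of samples). The repository's exact formula is not shown here, so treat this as an assumption; the function name `purity` and the toy labels are illustrative.

```python
from collections import Counter


def purity(clusters):
    """Purity of a clustering.

    clusters: list of lists, each inner list holding the true labels
    of the samples assigned to one cluster.
    """
    total = sum(len(c) for c in clusters)
    # For each non-empty cluster, count the samples carrying its majority label.
    majority = sum(Counter(c).most_common(1)[0][1] for c in clusters if c)
    return majority / total


# Hypothetical labels: majority counts are 2, 2 and 3 out of 9 samples.
clusters = [["A", "A", "B"], ["B", "B"], ["C", "A", "C", "C"]]
print(purity(clusters))  # 7 / 9
```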

Algorithm Overview

Purpose

The algorithm aims to cluster samples based on their features to identify specific types of cancer accurately.

Stopping Criterion

The clustering process halts once the number of clusters reaches seven, so every run yields the same number of groups to evaluate.

Methods Employed

We utilize two variants of agglomerative clustering:

  • Single Link: The distance between two clusters is the minimum distance between any pair of samples, one taken from each cluster.
  • Complete Link: The distance between two clusters is the maximum distance between any pair of samples, one taken from each cluster.
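
The two criteria differ only in how they reduce the pairwise sample distances. A minimal sketch, assuming clusters are plain lists of feature tuples and using `math.dist` (Python 3.8+); the function names are illustrative, not the repository's.

```python
import math


def single_link(cluster_a, cluster_b):
    """Smallest distance between a sample in cluster_a and one in cluster_b."""
    return min(math.dist(p, q) for p in cluster_a for q in cluster_b)


def complete_link(cluster_a, cluster_b):
    """Largest distance between a sample in cluster_a and one in cluster_b."""
    return max(math.dist(p, q) for p in cluster_a for q in cluster_b)


# Hypothetical one-dimensional clusters.
a = [(0.0,), (1.0,)]
b = [(4.0,), (6.0,)]
print(single_link(a, b))    # 3.0 (between 1.0 and 4.0)
print(complete_link(a, b))  # 6.0 (between 0.0 and 6.0)
```

In either case, the pair of clusters with the smallest linkage value is the one merged next.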

Implementation Classes

  1. Cluster: Represents a single cluster containing grouped samples.
  2. Data: This class is responsible for organizing the data and computing the initial distance matrix, essential for the clustering process.
  3. Link: The linkage criteria, implemented as three classes (a base class and its two subclasses):
    • Link: The base class for linkage criteria.
    • SingleLink: Implements the single link clustering method.
    • CompleteLink: Implements the complete link clustering method.
  4. AgglomerativeClustering: This class updates the clusters and oversees the execution of the algorithm, ensuring that the clustering process adheres to the specified methods and stopping criteria.
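
The class layout above could be sketched roughly as follows. Only the class names come from this README; the method names (`merge`, `compute`, `run`), the constructor arguments, and the `max_clusters` parameter are assumptions made for illustration. The Data class is omitted, and distances are recomputed on every pass instead of being read from a precomputed matrix, which keeps the sketch short at the cost of speed.

```python
import math
from itertools import combinations


class Cluster:
    """A group of samples, each sample being a tuple of features."""

    def __init__(self, samples):
        self.samples = list(samples)

    def merge(self, other):
        """Return a new cluster holding the samples of both clusters."""
        return Cluster(self.samples + other.samples)


class Link:
    """Base class for linkage criteria."""

    def compute(self, a, b):
        raise NotImplementedError


class SingleLink(Link):
    def compute(self, a, b):
        # Minimum pairwise distance between the two clusters.
        return min(math.dist(p, q) for p in a.samples for q in b.samples)


class CompleteLink(Link):
    def compute(self, a, b):
        # Maximum pairwise distance between the two clusters.
        return max(math.dist(p, q) for p in a.samples for q in b.samples)


class AgglomerativeClustering:
    def __init__(self, link, samples, max_clusters=7):
        self.link = link
        # Start with one singleton cluster per sample.
        self.clusters = [Cluster([s]) for s in samples]
        self.max_clusters = max_clusters

    def run(self):
        """Merge the closest pair repeatedly until max_clusters remain."""
        while len(self.clusters) > self.max_clusters:
            a, b = min(
                combinations(self.clusters, 2),
                key=lambda pair: self.link.compute(*pair),
            )
            self.clusters = [c for c in self.clusters if c is not a and c is not b]
            self.clusters.append(a.merge(b))
        return self.clusters


# Hypothetical data: ten one-dimensional samples reduced to seven clusters.
points = [(float(i * i),) for i in range(10)]
result = AgglomerativeClustering(SingleLink(), points).run()
print(len(result))  # 7
```

Swapping `SingleLink()` for `CompleteLink()` changes only the merge criterion; the loop and the stopping rule stay the same.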
