GithubHelp home page GithubHelp logo

guntas-13 / cs328-sparsecommunitydetection Goto Github PK

View Code? Open in Web Editor NEW
0.0 1.0 0.0 179.41 MB

CS328 Introduction to Data Science - Prof. Anirban Dasgupta - Project: Sparsifying Networks while Preserving Properties

Jupyter Notebook 91.88% HTML 8.02% JavaScript 0.01% Python 0.09%
community-detection-algorithms graph-sparsification sparsification

cs328-sparsecommunitydetection's Introduction

SparseCommunityDetection

Final Report Final Presentation

Modularity

$$ Q(\mathcal{C}) = \frac{1}{2m} \sum_{C \in \mathcal{C}} \sum_{u \in C, v \in C} \left( A_{u, v} - \frac{d_u d_v}{2m} \right) $$

Betweenness Centrality

$$ \text{Betweenness}(e) = \sum_{s, t \in V} \frac{\sigma(s, t | e)}{\sigma(s, t)} $$

Jaccard Similarity

$$ \texttt{An edge (i, j) is likely to lie within a cluster if the vertices i and j have adjacency lists with high overlap} $$

$$ \text{J}(i, j) = \frac{|\text{Adj}(i) \cap \text{Adj}(j)|}{|\text{Adj}(i) \cup \text{Adj}(j)|} $$

RESULTS (As of Now)

Louvain Algorithm

Label Propogation Algorithm

God Funtion

def plot_metrics_sparse(G, ground_truth, sparseFunctions, k_values, AlgoFunction, flag, networkName = None, AlgoName = None):

    """_summary_

    Args:
        G (nx.Graph): Original Graph
        ground_truth (dictionary): dictionary mapping nodes to communities
        sparseFunctions (list): list of tuples containing the name and the function to generate the sparse graph
        k_values (list): list of values for the percentage of edges to retain
        AlgoFunction (function): Community Detection Algorithm
        flag (int): 0 if the output of the algorithm is a dictionary, 1 if the output is a list of communities
        networkName (str, optional): Name of Network to go in the Title of the Plots. Defaults to None.
        AlgoName (str, optional): Algorithm Name to go in the Title of the Plots. Defaults to None.

    Returns:
        list: List of Sparse Graphs (nx.Graph)
    """

    ari_values = [[0] * len(k_values) for _ in sparseFunctions]
    modularity_values = [[0] * len(k_values) for _ in sparseFunctions]
    nmi_values = [[0] * len(k_values) for _ in sparseFunctions]
    names = [name for name, _ in sparseFunctions]
    SparseGraphs = [[0] * len(k_values) for _ in sparseFunctions]

    for idx, (_, function) in enumerate(sparseFunctions):
        for i, k in enumerate(k_values):
            H = function(G, k)
            predicted = AlgoFunction(H)
            if (flag == 1):
                predicted = get_community_dict(predicted)
            ari, nmi = metrics(ground_truth, predicted)
            modularity_values[idx][i] = modularity(H, get_communities(predicted))
            ari_values[idx][i] = ari
            nmi_values[idx][i] = nmi
            SparseGraphs[idx][i] = H

    plt.figure(figsize = (12, 8))
    plt.xticks(range(len(k_values)), [f"{100 * k}%" for k in k_values])
    for idx in range(len(sparseFunctions)):
        plt.plot(ari_values[idx], label = names[idx], marker = "o")
        for i, txt in enumerate(ari_values[idx]):
            plt.annotate(f"{txt:.4f}", (i, ari_values[idx][i]))
    plt.xlabel("Percentage Retention of Edges")
    plt.ylabel("ARI")
    plt.title(f"ARI for {AlgoName} vs k for {networkName}")
    plt.legend()
    plt.grid()
    plt.show()

    plt.figure(figsize = (12, 8))
    plt.xticks(range(len(k_values)), [f"{100 * k}%" for k in k_values])
    for idx in range(len(sparseFunctions)):
        plt.plot(nmi_values[idx], label = names[idx], marker = "o")
        for i, txt in enumerate(nmi_values[idx]):
            plt.annotate(f"{txt:.4f}", (i, nmi_values[idx][i]))
    plt.xlabel("Percentage Retention of Edges")
    plt.ylabel("NMI")
    plt.title(f"NMI for {AlgoName} vs k for {networkName}")
    plt.legend()
    plt.grid()
    plt.show()

    plt.figure(figsize = (12, 8))
    plt.xticks(range(len(k_values)), [f"{100 * k}%" for k in k_values])
    for idx in range(len(sparseFunctions)):
        plt.plot(modularity_values[idx], label = names[idx], marker = "o")
        for i, txt in enumerate(modularity_values[idx]):
            plt.annotate(f"{txt:.4f}", (i, modularity_values[idx][i]))

    plt.axhline(y = modularity(G, get_communities(ground_truth)), color = "black", linestyle = "--", label = "Original Graph")
    plt.xlabel("Percentage Retention of Edges")
    plt.ylabel("Modularity")
    plt.title(f"Modularity for {AlgoName} vs k for {networkName}")
    plt.legend()
    plt.grid()
    plt.show()

    return SparseGraphs

Some Additional Sources:

Zachary's Karate Club Dataset Analysis
Modularity - Justin Ruth
Community Detection - IITM NPTEL
Although not needed - Introduction to GNN

Handy Libraries

Networkx
igraph
Liitle Ball of Fur

cs328-sparsecommunitydetection's People

Contributors

guntas-13 avatar robohrriday avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.