MICO: Mutual Information and Conic Optimization for feature selection

Home Page: https://jupiters1117.github.io/mico/

License: Other



MICO is a Python package that implements a conic-optimization-based feature selection method using mutual information (MI) measures [1]. The approach uses MI to quantify each feature's relevance and the redundancy between features, then formulates feature selection as a pure-binary quadratic optimization problem. That problem is solved heuristically by an efficient randomized algorithm based on semidefinite programming [2]. The optimization software Colin [3] is used to solve the underlying conic optimization problems.
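
The randomization step can be pictured with random-hyperplane rounding in the style of Goemans and Williamson. The sketch below is only an illustration, not MICO's implementation: the function names are hypothetical, the unit vectors in `V` are written by hand rather than produced by a semidefinite solver, `Q` is a made-up objective matrix, and the variables take values in {-1, +1} rather than the 0/1 selection variables a feature selector would use.

```python
import random


def quadratic_form(Q, x):
    """Return x^T Q x for a square matrix Q given as nested lists."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def hyperplane_rounding(V, Q, trials=100, seed=0):
    """Round unit vectors V (one row per variable) to a +/-1 assignment.

    Each trial draws a random Gaussian direction g and sets x_i to the
    sign of <v_i, g>. The assignment with the largest x^T Q x is kept.
    """
    rng = random.Random(seed)
    dim = len(V[0])
    best_x, best_value = None, float("-inf")
    for _ in range(trials):
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x = [1 if sum(a * b for a, b in zip(v, g)) >= 0 else -1 for v in V]
        value = quadratic_form(Q, x)
        if value > best_value:
            best_x, best_value = x, value
    return best_x, best_value


# Hypothetical inputs: three unit vectors standing in for an SDP solution,
# and a symmetric objective that rewards x0 == x1 and x2 == -x0.
V = [[1.0, 0.0], [0.6, 0.8], [-1.0, 0.0]]
Q = [[0, 1, -1], [1, 0, -1], [-1, -1, 0]]

x, value = hyperplane_rounding(V, Q)
print(x, value)
```

Keeping the best of many cheap random roundings is what makes this kind of heuristic practical: the expensive work is solving the relaxation once, and each rounding trial costs only a few dot products.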

This package

  • implements three methods for feature selection:
    • MICO : Conic Optimization approach
    • MIFS : Forward Selection approach
    • MIBS : Backward Selection approach
  • supports three different MI measures:
    • JMI : Joint Mutual Information [4]
    • JMIM : Joint Mutual Information Maximisation [5]
    • MRMR : Max-Relevance Min-Redundancy [6]
  • generates feature importance scores for all selected features.
  • provides scikit-learn compatible APIs.
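
To make the relevance and redundancy idea concrete, here is a minimal sketch of greedy forward selection with a max-relevance min-redundancy score on discrete data. It is a plain-Python illustration under stated assumptions, not the package's MIFS code: `mutual_information` and `mrmr_forward` are hypothetical names, MI is estimated from simple frequency counts, and the toy data are invented.

```python
from collections import Counter
from math import log


def mutual_information(a, b):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        c / n * log(c * n / (count_a[u] * count_b[v]))
        for (u, v), c in count_ab.items()
    )


def mrmr_forward(columns, y, k):
    """Greedily pick k column indices by relevance minus mean redundancy."""
    selected = []
    remaining = list(range(len(columns)))

    def score(j):
        relevance = mutual_information(columns[j], y)
        if not selected:
            return relevance
        redundancy = sum(
            mutual_information(columns[j], columns[s]) for s in selected
        ) / len(selected)
        return relevance - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Invented toy data: column 1 duplicates column 0, and y = f0 OR f2.
f0 = [0, 0, 0, 0, 1, 1, 1, 1]
f1 = list(f0)
f2 = [0, 0, 1, 1, 0, 0, 1, 1]
y = [0, 0, 1, 1, 1, 1, 1, 1]

selected = mrmr_forward([f0, f1, f2], y, k=2)
print(sorted(selected))
```

On this data the duplicated column is as relevant as the original but fully redundant with it, so the greedy score passes over it in favour of the independent column.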

Installation

  1. Download the Colin distribution from http://www.colinopt.org/downloads.php and unpack it into a directory of your choice (<CLNHOME>). Then install the Colin Python package:
cd <CLNHOME>/python
pip install -r requirements.txt
python setup.py install
  2. Install the MICO package, either from the root of a source checkout:
pip install -r requirements.txt
python setup.py install

or from PyPI:

pip install colin-mico

To install the development version, use:

pip install --upgrade git+https://github.com/jupiters1117/mico

Usage

This package provides scikit-learn compatible APIs:

  • fit(X, y)
  • transform(X)
  • fit_transform(X, y)
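
The contract behind these three methods can be shown with a toy selector. This is a sketch of the calling convention only: `TopVarianceSelector` is a hypothetical class that ranks columns by variance rather than by mutual information, works on plain lists, and has no connection to MICO's internals.

```python
from statistics import pvariance


class TopVarianceSelector:
    """Toy selector that keeps the k columns with the largest variance."""

    def __init__(self, k):
        self.k = k

    def fit(self, X, y=None):
        # Learn which columns to keep. y is accepted to mirror the
        # fit(X, y) signature but is unused by this toy criterion.
        columns = list(zip(*X))
        ranked = sorted(
            range(len(columns)),
            key=lambda j: pvariance(columns[j]),
            reverse=True,
        )
        self.selected_ = sorted(ranked[: self.k])
        return self

    def transform(self, X):
        # Apply the learned column subset to any data with the same layout.
        return [[row[j] for j in self.selected_] for row in X]

    def fit_transform(self, X, y=None):
        # Shorthand for fit followed by transform on the same data.
        return self.fit(X, y).transform(X)


X_toy = [[1, 10, 0], [1, 20, 1], [1, 30, 0]]
selector = TopVarianceSelector(k=2)
X_reduced = selector.fit_transform(X_toy)
print(selector.selected_, X_reduced)
```

Separating `fit` from `transform` lets the column subset learned on training data be reapplied unchanged to validation or test data.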

Examples

The following example illustrates the use of the package:

import pandas as pd
from sklearn.datasets import load_breast_cancer

# Assumption: the class is exported by the top-level mico package.
from mico import MutualInformationConicOptimization

# Prepare data.
data = load_breast_cancer()
y = data.target
X = pd.DataFrame(data.data, columns=data.feature_names)

# Perform feature selection.
mico = MutualInformationConicOptimization(verbose=1, categorical=True)
mico.fit(X, y)

# Print the selected features.
print("Selected features: {}".format(mico.get_support()))

# Print the feature importance scores.
print("Feature importance scores: {}".format(mico.feature_importances_))

# Reduce X to the selected features.
X_transformed = mico.transform(X)
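
If `get_support()` follows the scikit-learn convention of returning a boolean mask over the input columns (an assumption here, not something this README states), the mask can be turned into feature names as sketched below. The mask is a hypothetical placeholder, not real MICO output.

```python
# First four column names of the breast cancer dataset.
feature_names = ["mean radius", "mean texture", "mean perimeter", "mean area"]

# Hypothetical mask standing in for the result of get_support().
support_mask = [True, False, False, True]

# Keep the names whose mask entry is True.
selected_names = [name for name, keep in zip(feature_names, support_mask) if keep]
print(selected_names)
```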

Documentation

The user guide, examples, and API reference are available on the project home page: https://jupiters1117.github.io/mico/

References

  1. T. Naghibi, S. Hoffmann, and B. Pfister, "A semidefinite programming based search strategy for feature selection with mutual information measure", IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(8), pp. 1529--1541, 2015.

  2. M. Goemans and D. Williamson, "Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming", J. ACM, 42(6), pp. 1115--1145, 1995.

  3. Colin: Conic-form Linear Optimizer (www.colinopt.org).

  4. H. Yang and J. Moody, "Data Visualization and Feature Selection: New Algorithms for Nongaussian Data", NIPS 1999.

  5. M. Bennasar, Y. Hicks, and R. Setchi, "Feature selection using Joint Mutual Information Maximisation", Expert Systems with Applications, 42(22), pp. 8520--8532, 2015.

  6. H. Peng, F. Long, and C. Ding, "Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy", IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(8), pp. 1226--1238, 2005.

Credits

  • KuoLing Huang, 2019-present

Licensing

MICO is 3-clause BSD licensed.

Note

MICO is heavily inspired by MIFS, the Parallelized Mutual Information based Feature Selection module by Daniel Homola.
