
jejjohnson / hsic_alignment


In this repo, I will be looking at how to choose the best parameters for HSIC alignment.

Home Page: https://jejjohnson.github.io/hsic_alignment

License: MIT License

Python 0.32% Jupyter Notebook 99.66% Shell 0.01% Makefile 0.02%
hsic kernels dependence similarity parameters rbf

hsic_alignment's Introduction

Kernel Alignment: An Empirical Study of $\gamma$ for the RBF Kernel


Summary

Kernel methods are a class of machine learning algorithms that model complex nonlinear functions by transforming the data through flexible, expressive representations defined by kernel functions. For problems such as regression and classification, there is an objective that provides a criterion for selecting the kernel parameters. However, in unsupervised settings such as dependence estimation, where we compare two separate variables, there is no objective function to minimize or maximize other than the measure itself. Alternatively, one can choose the kernel parameters with a heuristic, but it is unclear which heuristic is appropriate for which application.
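
As an illustration of what such a heuristic can look like, here is a minimal sketch of the widely used median heuristic for the RBF kernel $k(x, x') = \exp(-\gamma \|x - x'\|^2)$. It takes $\sigma^2$ to be the median squared pairwise distance and sets $\gamma = 1 / (2\sigma^2)$. The function name `median_heuristic_gamma` and the toy data are assumptions made for this example; they are not taken from this repo, and this is not necessarily the initialization studied in the experiments.

```python
import numpy as np


def median_heuristic_gamma(X):
    """Hypothetical helper: RBF gamma from the median squared pairwise distance."""
    X = np.asarray(X, dtype=float)
    # Squared Euclidean distances between all pairs of rows.
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    # Use each distinct pair once (strict upper triangle, no diagonal).
    upper = np.triu_indices(X.shape[0], k=1)
    sigma_sq = np.median(sq_dists[upper])
    return 1.0 / (2.0 * sigma_sq)


# Toy data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))

gamma = median_heuristic_gamma(X)
gamma_scaled = median_heuristic_gamma(10.0 * X)
```

Because the heuristic follows the scale of the data, multiplying `X` by 10 divides the resulting `gamma` by 100, so the kernel matrix itself stays the same.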

The Hilbert-Schmidt Independence Criterion (HSIC) is one of the most widely used kernel methods for estimating dependence, but it is not invariant to isotropic scaling for some kernels and it is difficult to interpret because it has no upper bound. Other variations include Kernel Alignment (KA), which is normalized but not centered, and centered Kernel Alignment (CKA), which is the normalized version of HSIC. It is rare to see empirical comparisons of these methods when estimating the similarity between two unsupervised data sources. In this work we demonstrate how the kernel parameter for HSIC measures changes depending on the toy dataset and the amount of noise present. We also demonstrate how the methods compare when evaluated on known distributions where the analytical mutual information is available for large scale, multi-dimensional datasets.
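
To make the three measures concrete, the following is a minimal NumPy sketch, not the implementation used in this repo. It assumes the biased empirical HSIC estimator $\mathrm{tr}(KHLH)/n^2$ with centering matrix $H = I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top$ (some references normalize by $(n-1)^2$ instead), and defines KA and CKA as the normalized Frobenius inner product of the raw and centered kernel matrices respectively. All function names, the toy data, and the fixed $\gamma = 1$ are assumptions for illustration.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """RBF kernel matrix exp(-gamma * ||x - x'||^2) for the rows of X."""
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def center(K):
    """Double-center a kernel matrix: H K H with H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def hsic(K, L):
    """Biased empirical HSIC, tr(K H L H) / n^2 (normalization is a choice)."""
    n = K.shape[0]
    return np.sum(center(K) * L) / n**2


def kernel_alignment(K, L):
    """KA: normalized Frobenius inner product, no centering."""
    return np.sum(K * L) / (np.linalg.norm(K) * np.linalg.norm(L))


def centered_kernel_alignment(K, L):
    """CKA: the same normalization applied to the centered matrices."""
    return kernel_alignment(center(K), center(L))


# Toy dependent data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 1))
Y = X + 0.5 * rng.normal(size=(300, 1))

gamma = 1.0  # fixed kernel parameter, deliberately not adapted to the data
K, L = rbf_kernel(X, gamma), rbf_kernel(Y, gamma)

hsic_value = hsic(K, L)
ka_value = kernel_alignment(K, L)
cka_value = centered_kernel_alignment(K, L)

# Same data, isotropically rescaled, same gamma: the HSIC value changes.
hsic_scaled = hsic(rbf_kernel(10.0 * X, gamma), rbf_kernel(10.0 * Y, gamma))
```

With a fixed `gamma`, `hsic_scaled` differs from `hsic_value` even though the dependence between the variables is unchanged, which is the scaling sensitivity described above. KA and CKA stay within [0, 1] for these positive semidefinite kernel matrices, which makes them easier to interpret than the unbounded HSIC value.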


Example Experiments

  • RBF Gamma Initialization
  • RBF Gamma Parameter space
  • Scaling versus
  • Real Distributions (Large N, Large D)

Installation Instructions

  1. First, clone the RBIG repo and either install it or add it to your PYTHONPATH. See External Toolboxes below for more information.

    git clone https://github.com/jejjohnson/rbig
  2. Second, create the conda environment from the environment.yml file found in the main repo.

    conda env create -f environment.yml -n myenv
    source activate myenv

External Toolboxes

RBIG (Rotation-Based Iterative Gaussianization)

This is a package I created to implement the RBIG algorithm, a multivariate Gaussianization method that allows one to calculate information-theoretic measures such as entropy, total correlation and mutual information. For this project, I used it to calculate the mutual information. More information can be found in the repository esdc_tools.

