davpinto / master-thesis

My Master's thesis on Bayesian Classification with Regularized Gaussian Models

Bayesian Classification with Regularized Gaussian Models

Bayesian classifiers with regularized estimators for the class priors, vector of means and covariance matrix

This work presents a novel approach to reducing the effects of violations of the attribute independence assumption on which the Gaussian naive Bayes classifier is based. It introduces a Regularized Gaussian Bayes (RGB) algorithm that takes the correlation structure among variables into account when learning the class posterior probabilities. The RGB classifier avoids overfitting by replacing the sample covariance estimate with well-conditioned regularized estimates. RGB thus aims to find the best trade-off between non-naivety and prediction accuracy.
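To make the idea concrete, here is a minimal sketch of a Gaussian Bayes classifier with a shrunk covariance estimate, written in plain Python. It is an illustration under stated assumptions, not the thesis implementation: the function names (`shrunk_covariance`, `fit_rgb`, `predict_proba`) are hypothetical, the shrinkage target is the diagonal of the sample covariance, and the shrinkage intensity `lam` is fixed by hand rather than estimated from the data.

```python
import math

def shrunk_covariance(rows, lam):
    """Return the mean and (1 - lam) * S + lam * diag(S), S being the sample covariance."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    S = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
          for j in range(d)] for i in range(d)]
    # Shrinking toward diag(S) keeps the variances and scales the covariances.
    cov = [[S[i][j] if i == j else (1.0 - lam) * S[i][j] for j in range(d)]
           for i in range(d)]
    return mean, cov

def cholesky(A):
    """Lower-triangular factor L with A = L L^T (A must be positive definite)."""
    d = len(A)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def gaussian_log_density(x, mean, cov):
    """Log of the multivariate normal density at x."""
    d = len(x)
    L = cholesky(cov)
    z = []  # forward substitution: solve L z = x - mean
    for i in range(d):
        z.append((x[i] - mean[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(d))
    return -0.5 * (d * math.log(2.0 * math.pi) + log_det + sum(v * v for v in z))

def fit_rgb(X, y, lam):
    """Per class: log prior, mean vector, and shrunk covariance matrix."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        mean, cov = shrunk_covariance(rows, lam)
        model[label] = (math.log(len(rows) / len(X)), mean, cov)
    return model

def predict_proba(model, x):
    """Class posteriors via Bayes' rule, normalized in log space for stability."""
    scores = {c: lp + gaussian_log_density(x, m, S) for c, (lp, m, S) in model.items()}
    top = max(scores.values())
    weights = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

# Toy data (made up for illustration): two classes with strongly correlated features.
X = [[0.0, 0.1], [1.0, 0.9], [2.0, 2.2], [3.0, 2.8],
     [0.0, 2.1], [1.0, 3.0], [2.0, 3.9], [3.0, 5.1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_rgb(X, y, lam=0.2)
print(predict_proba(model, [1.5, 1.5]))
```

In this sketch, `lam = 1.0` discards all covariances and recovers Gaussian naive Bayes, while `lam = 0.0` uses the unregularized sample covariance, which can be ill-conditioned when there are few samples per class. Intermediate values trade off between the two.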

Moreover, Adaptive Boosting (AdaBoost) improves the accuracy and stability of RGB. In short, the proposed Boosted RGB (BRGB) classifier builds a sequence of weighted RGB base classifiers and combines them into a robust classifier. In classification experiments, BRGB achieved prediction performance comparable to the best off-the-shelf ensemble-based architectures, such as Random Forests, Extremely Randomized Trees (ExtraTrees), and Gradient Boosting Machines (GBMs), using only a few (10 to 20) base classifiers.
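The boosting loop can be sketched as follows. This is a simplified illustration, not the thesis code: it uses the textbook binary (discrete) AdaBoost with labels -1 and +1, which may differ from the variant used in the thesis, and the base learner `fit_weighted_gaussian` is a hypothetical stand-in (a weighted Gaussian classifier with diagonal covariance) rather than a full RGB classifier. All names and the toy data are assumptions made for the example.

```python
import math

def fit_weighted_gaussian(X, y, w):
    """Stand-in base learner: weighted Gaussian classifier with diagonal covariance."""
    d = len(X[0])
    params = {}
    for label in (-1, 1):
        idx = [i for i, t in enumerate(y) if t == label]
        mass = sum(w[i] for i in idx)  # weighted class prior
        mean = [sum(w[i] * X[i][j] for i in idx) / mass for j in range(d)]
        # Small constant added to each variance to keep it strictly positive.
        var = [sum(w[i] * (X[i][j] - mean[j]) ** 2 for i in idx) / mass + 1e-3
               for j in range(d)]
        params[label] = (math.log(mass), mean, var)

    def score(label, x):
        log_prior, mean, var = params[label]
        return log_prior - 0.5 * sum(
            math.log(2.0 * math.pi * v) + (xj - m) ** 2 / v
            for xj, m, v in zip(x, mean, var))

    return lambda x: 1 if score(1, x) > score(-1, x) else -1

def adaboost(X, y, fit, rounds):
    """Discrete AdaBoost: returns a list of (alpha, classifier) pairs."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        h = fit(X, y, w)
        preds = [h(x) for x in X]
        err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)
        if err >= 0.5:  # no better than chance on the weighted sample: stop
            break
        alpha = 0.5 * math.log((1.0 - max(err, 1e-10)) / max(err, 1e-10))
        ensemble.append((alpha, h))
        if err <= 1e-10:  # perfect base classifier: nothing left to reweight
            break
        # Increase the weight of misclassified points, decrease the rest.
        w = [wi * math.exp(-alpha * t * p) for wi, p, t in zip(w, preds, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of the base classifiers."""
    return 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1

# Toy data (made up for illustration): a diagonal-covariance model cannot
# separate these classes perfectly, so the first round makes mistakes.
X = [[0, 0], [1, 1], [2, 2], [3, 3], [0, 2], [1, 3], [2, 4], [3, 5]]
y = [1, 1, 1, 1, -1, -1, -1, -1]

ensemble = adaboost(X, y, fit_weighted_gaussian, rounds=10)
print(len(ensemble), [predict(ensemble, x) for x in X])
```

Each round refits the base learner on reweighted data, so later classifiers concentrate on the points earlier ones got wrong. The vote weight `alpha` grows as the weighted error of a base classifier shrinks.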

BRGB decision boundary as boosting iterations proceed:

[Figure: Boosted Regularized Gaussian Bayes Classifier]
