guidodc97 / dm

Repository for "Data Mining" project

License: GNU General Public License v3.0

Python 1.31% Jupyter Notebook 98.69%
machine-learning data-mining classification jupyterlab sklearn pandas python

dm's Introduction

Data Mining

Repository for the Data Mining project.

Individual project consisting of building a classifier for Alzheimer's disease detection.

AY 2020/2021

Solution

The project was developed entirely in Python, using popular data analysis and machine learning libraries such as pandas and scikit-learn.

The search for the best machine learning algorithm to adopt for the specific classification task consists of the following steps:

  • Data preparation, which consists of:
    • Data cleaning: correcting inconsistent data in the dataset where possible.
    • Data imputation: replacing missing values with reasonable values.
  • Model selection: the aim is to find the most promising algorithm for the problem under study. This step is accomplished by performing a repeated stratified K-fold cross-validation with an inner grid search that finds the best hyperparameters for each algorithm.
  • Hyperparameter tuning: once model selection is complete, the hyperparameters of the selected model are tuned with a K-fold cross-validation and an inner grid search.
  • Model evaluation: the model is trained with the hyperparameters found in the previous step and its performance is evaluated. The evaluation phase again uses K-fold cross-validation.
  • Prediction: finally, the model is retrained on the whole dataset with the hyperparameters found during the tuning phase, and new instances can be predicted.
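
The data imputation and model selection steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the synthetic dataset, the two candidate algorithms, their grids, the median imputation strategy and the scaling step are all assumptions made for the example. The outer loop is a repeated stratified K-fold cross-validation, and each outer training fold runs its own grid search.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.model_selection import (
    GridSearchCV,
    RepeatedStratifiedKFold,
    StratifiedKFold,
    cross_val_score,
)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real dataset (assumption: the project data is not shown here).
X, y = make_classification(n_samples=200, n_features=8, n_informative=4, random_state=0)

# Blank out roughly 5% of the values to mimic missing entries.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Hypothetical candidates and grids, chosen only for illustration.
candidates = {
    "svc": (SVC(), {"clf__C": [0.1, 1, 10], "clf__kernel": ["rbf", "poly"]}),
    "tree": (DecisionTreeClassifier(random_state=0), {"clf__max_depth": [3, 5, None]}),
}

outer_cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=0)
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

mean_scores = {}
for name, (estimator, grid) in candidates.items():
    # Imputation lives inside the pipeline so it is fitted on training folds only.
    pipeline = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
        ("clf", estimator),
    ])
    # Inner grid search: picks hyperparameters within each outer training fold.
    search = GridSearchCV(pipeline, grid, cv=inner_cv, scoring="accuracy")
    # Outer loop: estimates how well "algorithm + tuning" generalises.
    scores = cross_val_score(search, X, y, cv=outer_cv, scoring="accuracy")
    mean_scores[name] = scores.mean()

best_name = max(mean_scores, key=mean_scores.get)
print(mean_scores, "-> most promising:", best_name)
```

Keeping the imputer inside the pipeline is a design choice of this sketch: it prevents statistics computed on validation folds from leaking into training.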

The best solution found is an SVC (support vector classifier) with a polynomial kernel, which achieved an accuracy of 72.4% on the test set.
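
As a rough sketch of the tuning, evaluation and prediction phases for a polynomial-kernel support vector classifier: the synthetic data, the parameter grid, the scaling step and the held-out "new instances" below are assumptions for illustration only, and the code is not expected to reproduce the accuracy reported by the project.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.model_selection import (
    GridSearchCV,
    StratifiedKFold,
    cross_val_score,
    train_test_split,
)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption); a few rows are set aside as "new instances".
X_all, y_all = make_classification(n_samples=220, n_features=8, n_informative=4, random_state=1)
X, X_new, y, _ = train_test_split(X_all, y_all, test_size=20, stratify=y_all, random_state=1)

model = make_pipeline(StandardScaler(), SVC(kernel="poly"))
param_grid = {"svc__C": [0.1, 1, 10], "svc__degree": [2, 3]}  # hypothetical grid

# Hyperparameter tuning: K-fold cross-validation with a grid search.
tuning_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
search = GridSearchCV(model, param_grid, cv=tuning_cv, scoring="accuracy")
search.fit(X, y)
best_params = search.best_params_

# Model evaluation: a fresh K-fold cross-validation with the tuned hyperparameters.
final_model = clone(model).set_params(**best_params)
eval_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=2)
scores = cross_val_score(final_model, X, y, cv=eval_cv, scoring="accuracy")
print("tuned parameters:", best_params)
print("cross-validated accuracy:", scores.mean())

# Prediction: retrain on the whole dataset, then classify new instances.
final_model.fit(X, y)
predictions = final_model.predict(X_new)
```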

Documentation & Code

A complete description of the assignment can be found here.

The project's source code can be found here.
