
ihdavjar / csl2050_major_project


Course Major Project for Pattern Recognition and Machine Learning (CSL2050)

Jupyter Notebook 100.00%
deep-neural-networks machine-learning parkinsons-disease skewed-data

csl2050_major_project's Introduction

Abstract

This study builds a supervised learning model that classifies subjects into two groups according to their Parkinson's disease status. The dataset consists of audio features extracted from voice recordings of the subjects. It is skewed: 23 of the 31 subjects recorded are positive. We therefore report both accuracy and the F1 score as evaluation metrics. We applied dimensionality reduction and feature selection techniques and then trained multiple models on the resulting data.
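To illustrate why accuracy alone is not enough on skewed data, here is a minimal sketch that uses only the class balance stated above (23 positive, 8 healthy). The all-positive baseline and the helper names are hypothetical and are not taken from the project notebook. The abstract does not say how the F1 score was averaged, so the sketch reports it per class.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, label):
    """F1 score obtained when `label` is treated as the positive class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives, so F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Class balance from the abstract: 23 positive (1) and 8 healthy (0) subjects.
y_true = [1] * 23 + [0] * 8
# Hypothetical baseline that labels every subject as positive.
y_pred = [1] * 31

acc = accuracy(y_true, y_pred)
f1_pos = f1_for_class(y_true, y_pred, label=1)
f1_neg = f1_for_class(y_true, y_pred, label=0)
print(f"accuracy={acc:.3f}  F1(positive)={f1_pos:.3f}  F1(healthy)={f1_neg:.3f}")
# prints: accuracy=0.742  F1(positive)=0.852  F1(healthy)=0.000
```

A model that never identifies a healthy subject still reaches about 74% accuracy, and only the F1 score of the minority class exposes the failure.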

Introduction

In this investigation, we classified subjects as either healthy or affected by Parkinson's disease using a variety of supervised learning algorithms. We first applied linear discriminant analysis (LDA) to check whether the data are linearly separable. We then combined principal component analysis (PCA) with naive Bayes classification to assess how well that approach performs.
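The naive Bayes stage of that pipeline can be sketched in plain Python as below. This is an illustrative Gaussian naive Bayes, not the notebook's code: the PCA projection is omitted, the toy rows stand in for already projected components, and all function names and data values are assumptions.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)
    model = {}
    for label, rows in rows_by_class.items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in c) / len(c) + 1e-9
                     for c, m in zip(cols, means)]
        model[label] = (math.log(len(rows) / len(X)), means, variances)
    return model


def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior for the sample x."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
        return log_prior + log_likelihood
    return max(model, key=log_posterior)


# Toy two-component data (hypothetical values, for illustration only).
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [3.0, 0.5], [3.2, 0.7], [2.8, 0.3]]
y = [0, 0, 0, 1, 1, 1]

nb_model = fit_gaussian_nb(X, y)
pred_a = predict_gaussian_nb(nb_model, [1.1, 2.0])
pred_b = predict_gaussian_nb(nb_model, [3.1, 0.4])
print(pred_a, pred_b)
```

Naive Bayes treats features as conditionally independent given the class. That assumption sits more comfortably with PCA components, which are uncorrelated by construction, than with raw correlated features.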

Next, we ran the sequential forward feature selection algorithm twice to identify the most useful features: once with the Naive Bayes classifier as the base model and once with the Decision Tree classifier as the base model. We then evaluated the performance of various models on the resulting datasets.
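Sequential forward feature selection grows the feature subset greedily, one feature at a time. The sketch below shows that loop under stated assumptions: `score_fn` is a hypothetical callback that in the project would be the validation score of the base model (Naive Bayes or Decision Tree), and the toy scorer and its numbers are invented purely for illustration.

```python
def sequential_forward_selection(n_features, score_fn, k):
    """Greedily build a subset of k feature indices.

    score_fn(subset) must return a number where higher is better,
    for example the validation accuracy of a base model on those columns.
    """
    selected = []
    remaining = list(range(n_features))
    while len(selected) < k and remaining:
        # Try each remaining feature and keep the one that scores best.
        best = max(remaining, key=lambda f: score_fn(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy scorer (hypothetical): each feature has a usefulness value, and
# features 1 and 2 are redundant, so choosing both incurs a penalty.
USEFULNESS = {0: 2, 1: 5, 2: 4, 3: 1}


def toy_score(subset):
    score = sum(USEFULNESS[f] for f in subset)
    if 1 in subset and 2 in subset:
        score -= 3  # redundancy penalty
    return score


chosen = sequential_forward_selection(n_features=4, score_fn=toy_score, k=2)
print(chosen)  # prints: [1, 0]
```

Feature 2 is the second most useful on its own, but it adds little once feature 1 is chosen, so the greedy step picks feature 0 instead. A wrapper method like this one accounts for such redundancy, which ranking features individually does not.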

The models used in this project are:

  • Gaussian Naive Bayes
  • Decision Tree Classifier
  • Bagging with the Decision Tree Classifier as the base estimator
  • AdaBoost with the Decision Tree Classifier as the base estimator
  • XGBoost Classifier
  • Neural Network
  • Support Vector Machine
  • K-Nearest Neighbours (KNN) Classifier

Result and Discussion

Of all the models implemented in this project, KNN trained on standardised data performs best: both its F1 score and its accuracy are the highest in that configuration.

[Figure: Accuracy of the KNN classifier]

[Figure: F1 score of the KNN classifier]
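One plausible reason standardisation helps KNN is that Euclidean distance is dominated by whichever feature has the largest numeric range. The following minimal sketch shows the effect on invented toy data. The function names, the data and the choice of k = 3 are assumptions, not values from the notebook.

```python
import math
from collections import Counter


def standardise(train, test):
    """Scale columns to zero mean and unit variance using training statistics only."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

    return scale(train), scale(test)


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points nearest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy data (hypothetical): feature 0 separates the classes on a small scale,
# feature 1 is uninformative noise on a much larger scale.
train_X = [[0.10, 100], [0.20, 900], [0.15, 500],
           [0.90, 120], [0.80, 880], [0.85, 520]]
train_y = [0, 0, 0, 1, 1, 1]
query = [0.88, 110]  # feature 0 clearly indicates class 1

raw_pred = knn_predict(train_X, train_y, query)

scaled_train, scaled_query = standardise(train_X, [query])
scaled_pred = knn_predict(scaled_train, train_y, scaled_query[0])

print(f"raw features: {raw_pred}, standardised features: {scaled_pred}")
# prints: raw features: 0, standardised features: 1
```

The scaling statistics are computed on the training rows only and then applied to the query, which avoids leaking test information into the model.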

→ Report.pdf contains a detailed explanation of this project along with various visualisations.

→ major_project.ipynb contains the implementation of the models discussed above.

Contributors

ihdavjar, kalbhavi-vadhiraj-infrrd
