javakanaya / framingham-cvd

Machine learning project predicting cardiovascular disease using the Framingham Heart Study dataset. Explores various models, preprocessing techniques, and hyperparameter tuning to optimize accuracy.

Language: Jupyter Notebook 100.00%
Topics: classification, cvd, framingham-heart-study, machine-learning

framingham-cvd's Introduction

Framingham Heart Study - Cardiovascular Disease Prediction

This repository contains a Jupyter Notebook for our final project in a machine learning course. The project aims to predict cardiovascular disease using the Framingham Heart Study dataset. We explored various machine learning algorithms and preprocessing techniques to find the most effective method for accurate predictions.

Dataset

We used the Framingham Heart Study dataset available on Kaggle. It contains health-related features that serve as inputs for predicting the risk of cardiovascular disease.

Project Overview

Our project focuses on comparing the performance of different machine learning models under various data splitting, imbalance handling, and hyperparameter tuning scenarios. The scenarios include:

  • Data Splitting Ratios: test-to-training proportions of 1:9, 2:8, and 3:7 (10%, 20%, and 30% of the data held out for testing).
  • Imbalance Handling Techniques:
    • Imbalanced Data (original dataset)
    • Undersampling
    • Oversampling using SMOTE
  • Hyperparameter Tuning: Testing models with and without hyperparameter tuning.
  • Machine Learning Algorithms:
    • Decision Tree
    • Random Forest
    • k-Nearest Neighbor (k-NN)
    • Extreme Gradient Boosting (XGBoost)
    • Support Vector Machines (SVM)
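
The scenario grid above can be enumerated programmatically. The following is a minimal sketch, not the notebook's actual code: the names `TEST_FRACTIONS`, `SAMPLING`, `TUNING`, `MODELS`, and `build_scenarios` are hypothetical, and it assumes every combination of the four factors is evaluated, which the list suggests but does not state outright.

```python
from itertools import product

# Factor levels taken from the list above; the identifiers are illustrative.
TEST_FRACTIONS = (0.1, 0.2, 0.3)  # 1:9, 2:8, and 3:7 test:train splits
SAMPLING = ("imbalanced", "undersampling", "smote")
TUNING = (False, True)  # without / with hyperparameter tuning
MODELS = ("decision_tree", "random_forest", "knn", "xgboost", "svm")


def build_scenarios():
    """Return one dict per combination of split, sampling, tuning, and model."""
    return [
        {"test_fraction": t, "sampling": s, "tuned": h, "model": m}
        for t, s, h, m in product(TEST_FRACTIONS, SAMPLING, TUNING, MODELS)
    ]


scenarios = build_scenarios()
print(len(scenarios))  # 3 * 3 * 2 * 5 = 90 combinations
```

Under that assumption, each of the five algorithms is evaluated in 18 settings, so any "best model" claim is a comparison across the full grid rather than a single run.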

Results

The main findings from our experiments are as follows:

  • Optimal Preprocessing: The best preprocessing technique for this dataset was oversampling using SMOTE, which effectively handled class imbalance.
  • Best Model: The Random Forest algorithm provided the highest accuracy for cardiovascular disease prediction when using SMOTE, a 9:1 training/testing data split, and hyperparameter tuning.
  • Impact of Hyperparameter Tuning: Hyperparameter tuning significantly improved model performance in most cases, but its effect was not uniform: depending on the scenario, performance improved, declined, or did not change.
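
To make the SMOTE finding more concrete, here is a minimal, standard-library sketch of the idea behind SMOTE: each synthetic minority sample is placed at a random point on the line segment between a real minority sample and one of its nearest minority-class neighbors. This is an illustration only, not the notebook's code and not a library implementation. The function name `smote_sketch`, its parameters, and the toy data are hypothetical, and it assumes at least two minority samples with numeric features.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbors of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Toy minority class with two made-up features (not Framingham data).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(minority, n_new=6, k=2)
print(len(new_points))  # 6
```

Oversampling of this kind is usually applied to the training split only, so that synthetic points do not leak into the test set. Whether the notebook follows that order is not stated here.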

Conclusion

Our study demonstrates that careful preprocessing and hyperparameter tuning are crucial for optimizing machine learning models in predicting cardiovascular disease. The Random Forest algorithm, in particular, showed superior performance under the tested conditions.

framingham-cvd's People

Contributors

javakanaya, afiqhaidar
