sleep-data-analysis's Introduction

Welcome to the sleep-data-analysis repository

This is our Mini-Project for SC1015 (Introduction to Data Science and Artificial Intelligence).

Contributors

  • shinghao (Soh Shing Hao) - Data Preparation & Cleaning, Exploratory Analysis, Presentation
  • leechunyang98 (Lee Chun Yang) - Data Resampling, Machine Learning Models
  • czhi-heng (Cheung Zhi Heng) - Research, Data Analysis, Video Recording and Editing, Presentation

Practical Motivation

We are often told that we need at least 7 hours of sleep to be well-rested. However, we often still feel tired and insufficiently rested even after sleeping for at least 7 hours. Are variables other than the duration of our sleep affecting our sleep quality?

Problem Definition

Can we predict a person's sleep quality using information on his sleep cycle, the time he goes to sleep, and his activities throughout the day?

We will only use data where the person has slept for at least 7 hours.
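
A minimal sketch of this filter, assuming the data is loaded into a pandas DataFrame. The column names ("Time in bed", "Sleep quality"), the "H:MM" duration format, and the sample rows are illustrative assumptions, and treating time in bed as sleep duration is also an assumption.

```python
import pandas as pd

# Hypothetical rows; column names and the "H:MM" format are assumptions.
df = pd.DataFrame({
    "Time in bed": ["8:32", "6:15", "7:00", "5:48"],
    "Sleep quality": [90, 62, 75, 55],
})


def hours_in_bed(text: str) -> float:
    """Convert an "H:MM" duration string to a number of hours."""
    hours, minutes = text.split(":")
    return int(hours) + int(minutes) / 60


# Keep only the nights with at least 7 hours.
df["hours"] = df["Time in bed"].map(hours_in_bed)
long_nights = df[df["hours"] >= 7].reset_index(drop=True)
print(long_nights)
```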

Dataset Source

The dataset we will be using is the Sleep Data dataset created and shared by Dana Diotte on Kaggle. The dataset consists only of Dana Diotte's own sleep information, which he acquired between 2014 and 2018 through the Sleep Cycle app from Northcube on iOS. The dataset can be found here: https://www.kaggle.com/danagerous/sleep-data

Machine Learning Models Used

  1. Linear Regression
  2. Random Forest
  3. Polynomial Regression
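
As a rough illustration of how these three models can be fitted and compared, here is a sketch using scikit-learn. It is not the project's actual code: the data is synthetic, the hyperparameters are arbitrary, and using the regressor variant of Random Forest and a degree-2 polynomial are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the predictors and the sleep quality response.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 3))
y = 60 + 20 * X[:, 0] + rng.normal(0, 5, size=200)

X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
    # Polynomial regression = polynomial feature expansion + linear regression.
    "Polynomial Regression": make_pipeline(
        PolynomialFeatures(degree=2), LinearRegression()
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # R^2 on the held-out rows
    print(f"{name}: R^2 = {scores[name]:.3f}")
```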

Sampling Methods

  1. Random Sampling
  2. K-Folds
  3. Repeated K-Folds
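
The three sampling methods can be sketched with scikit-learn's model_selection utilities. The synthetic data, the 25% test size, and the 5 folds with 3 repeats are illustrative assumptions, not the settings used in the project.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import (
    KFold,
    RepeatedKFold,
    cross_val_score,
    train_test_split,
)

# Synthetic stand-in data.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(100, 3))
y = 60 + 20 * X[:, 0] + rng.normal(0, 5, size=100)

# 1. Random sampling: one random train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
holdout_score = LinearRegression().fit(X_train, y_train).score(X_test, y_test)

# 2. K-Folds: every row is used for testing exactly once.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
kfold_scores = cross_val_score(LinearRegression(), X, y, cv=kfold)

# 3. Repeated K-Folds: K-Folds repeated with different shuffles.
repeated = RepeatedKFold(n_splits=5, n_repeats=3, random_state=0)
repeated_scores = cross_val_score(LinearRegression(), X, y, cv=repeated)

print(f"Hold-out R^2:            {holdout_score:.3f}")
print(f"K-Folds mean R^2:        {kfold_scores.mean():.3f}")
print(f"Repeated K-Folds mean R^2: {repeated_scores.mean():.3f}")
```

Averaging over folds and repeats gives a more stable estimate than a single random split, which matters most when the dataset is small.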

Conclusion

  • More sleep cycles and having a stressful day are associated with better sleep quality
  • The remaining variables show little correlation with sleep quality
  • Out of the 3 models we used, Linear Regression produced the best results
  • However, since all 3 models produced low accuracy, we conclude that sleep quality cannot be accurately predicted from sleep cycles, time of going to sleep and lifestyle alone. To predict sleep quality accurately, other variables or models have to be explored.

Recommendations

  • Create a more balanced response variable through methods such as resampling. In our data, the response variable, Sleep quality, is concentrated towards the higher values (representing higher sleep quality).
  • Collect more data, as the dataset may be too small to measure the actual accuracy of the models reliably.
  • Since this data is about only one person, it may be biased, which makes it hard to draw accurate conclusions. It would be better to use sleep information from a specific range or group of people. Continuing from the previous point, data from a research centre or sleep centre would likely generate more interesting insights.
  • Since the variables we used show little correlation with sleep quality, we recommend considering other variables that could also affect sleep quality
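
One possible way to approach the resampling recommendation is to oversample the under-represented low-quality nights. The sketch below uses sklearn.utils.resample. The threshold of 70, the column names, and the sample rows are all hypothetical, and this is only one of several resampling strategies.

```python
import pandas as pd
from sklearn.utils import resample

# Hypothetical rows; most nights have high sleep quality.
df = pd.DataFrame({
    "Sleep quality": [92, 88, 85, 90, 95, 81, 60, 55],
    "Sleep cycles": [6, 5, 5, 6, 6, 5, 4, 3],
})

THRESHOLD = 70  # assumed cut-off between "low" and "high" quality
high = df[df["Sleep quality"] >= THRESHOLD]
low = df[df["Sleep quality"] < THRESHOLD]

# Draw low-quality rows with replacement until both groups are the same size.
low_upsampled = resample(low, replace=True, n_samples=len(high), random_state=0)
balanced = pd.concat([high, low_upsampled]).reset_index(drop=True)

print((balanced["Sleep quality"] >= THRESHOLD).value_counts())
```

Oversampling is usually applied to the training split only, so that duplicated rows do not leak into the test data.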

What Did We Learn?

  • Random Forest Model
  • Polynomial Regression
  • Encoding Categorical data with Label Encoding
  • Sampling data with K-Folds and repeated K-Folds
  • Representing data in time-series
  • Experimented with other machine learning models - Gradient Boosting Decision Tree, Histogram-Based Gradient Boosting, AdaBoost, K-Nearest Neighbour
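
As an illustration of the label encoding mentioned above, here is a small sketch with scikit-learn's LabelEncoder. The category values are hypothetical and do not come from the dataset.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical categorical values.
moods = ["good", "ok", "bad", "good"]

encoder = LabelEncoder()
codes = encoder.fit_transform(moods)

# Classes are sorted alphabetically, then numbered from 0.
print(list(encoder.classes_))  # ['bad', 'good', 'ok']
print(list(codes))             # [1, 2, 0, 1]

# The mapping can be reversed to recover the original labels.
print(list(encoder.inverse_transform(codes)))
```

The integer codes follow alphabetical order, not any natural order of the categories ("bad" < "ok" < "good"), so a model may read a ranking into them that is not intended.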
