akshatashanmugam / humanactivityrecognitionmodels


A comparison of five models (Decision Tree, Random Forest, Logistic Regression, XGBoost, and LightGBM) to find the best performer for Human Activity Recognition.

Multi-Class Classification Models - Human Activity Recognition Model

This README gives an overview of the code and usage instructions for a Python script that demonstrates multi-class classification with several machine learning models in Google Colab. The script covers the following models:

  • Logistic Regression
  • LightGBM (Light Gradient Boosting Machine)
  • XGBoost
  • Decision Tree
  • Random Forest

WISDM Dataset

The WISDM dataset contains data collected under controlled laboratory conditions. The dataset statistics are as follows:

  • Raw Time Series Data

    • Number of examples: 1,098,207
    • Number of attributes: 6
    • Missing attribute values: None
  • Class Distribution

    • Walking: 424,400 (38.6%)
    • Jogging: 342,177 (31.2%)
    • Upstairs: 122,869 (11.2%)
    • Downstairs: 100,427 (9.1%)
    • Sitting: 59,939 (5.5%)
    • Standing: 48,395 (4.4%)
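The percentages above follow directly from the class counts. As a minimal sketch (the helper name `class_shares` is illustrative and not part of the repository), the shares and the degree of class imbalance can be recomputed like this:

```python
# Class counts as listed in the dataset statistics above.
counts = {
    "Walking": 424_400,
    "Jogging": 342_177,
    "Upstairs": 122_869,
    "Downstairs": 100_427,
    "Sitting": 59_939,
    "Standing": 48_395,
}


def class_shares(class_counts):
    """Return each class's share of all examples, in percent (1 decimal)."""
    total = sum(class_counts.values())
    return {label: round(100 * n / total, 1) for label, n in class_counts.items()}


total = sum(counts.values())
shares = class_shares(counts)

# Ratio between the most and least frequent class: a quick imbalance check.
imbalance_ratio = max(counts.values()) / min(counts.values())

for label, pct in shares.items():
    print(f"{label}: {counts[label]:,} ({pct}%)")
print(f"Total examples: {total:,}")
```

Because Walking and Jogging together make up roughly 70% of the examples, overall accuracy alone can hide weak performance on the rarer classes, which is one reason per-class metrics are worth checking.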

Dataset Preprocessing

Before the dataset was used in the multi-class classification models, the WISDM text file WISDM_ar_v1.1_raw_about.txt (read as data.txt in the code below) was converted to CSV format using the following code:

import pandas as pd

# Read the data from the text file
with open("data.txt", "r") as txt_file:
    data_lines = txt_file.readlines()

# Process the data and convert it to a list of dictionaries
data_list = []
for line in data_lines:
    parts = line.strip().split(',')

    try:
        user = int(parts[0])
        activity = parts[1]
        timestamp = int(parts[2])
        x_acceleration = float(parts[3])
        y_acceleration = float(parts[4])
        z_acceleration = float(parts[5].rstrip(';'))  # Remove semicolon

        data_list.append({
            'user': user,
            'activity': activity,
            'timestamp': timestamp,
            'x-acceleration': x_acceleration,
            'y-acceleration': y_acceleration,
            'z-acceleration': z_acceleration
        })
    except (ValueError, IndexError):
        # Skip blank, truncated, or otherwise malformed lines
        print(f"Skipping line: {line.strip()}")

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data_list)

# Save the DataFrame to a CSV file
df.to_csv("data.csv", index=False)

The resulting data.csv file is used by the multi-class classification models in this repository. It is included in the repository as a zip archive; unzip it before use.
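Unzipping can also be done from a notebook cell. The sketch below uses only the Python standard library; the archive name `data.zip` in the usage note and the helper name `extract_dataset` are assumptions made for illustration, since the README does not state the archive's actual file name.

```python
import zipfile
from pathlib import Path


def extract_dataset(archive_path, member="data.csv", dest_dir="."):
    """Extract `member` from a zip archive and return the extracted file's path.

    `archive_path` is hypothetical here: pass whatever name the zipped
    dataset actually has in your environment.
    """
    with zipfile.ZipFile(archive_path) as archive:
        if member not in archive.namelist():
            raise FileNotFoundError(f"{member!r} not found in {archive_path}")
        extracted = archive.extract(member, path=dest_dir)
    return Path(extracted)
```

A call such as `extract_dataset("data.zip")` would then leave data.csv in the working directory, ready for `pd.read_csv("data.csv")`.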

Prerequisites:

Before running the script, make sure you have access to a Google Colab environment and that the dataset file data.csv is uploaded to it. You can upload the dataset from your local machine or from a cloud storage service.

Usage:

  • Open a new or existing Google Colab notebook.
  • Upload the script to your Colab environment.
  • Make sure the required Python packages are installed in your Colab environment. You can install them by running the following command in a Colab cell:
!pip install numpy pandas scikit-learn matplotlib xgboost lightgbm
  • Upload the data.csv dataset to your Colab environment.

Output:

The script provides output for each machine learning model, including accuracy, confusion matrices, classification reports, and cross-validation scores. Additionally, ROC curves are visualized for each class for each model, highlighting the model's performance for multi-class classification.
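The outputs listed above can be produced with standard scikit-learn calls. The following is a minimal sketch rather than the repository's actual code: the `evaluate` helper, the synthetic data from `make_classification`, and the choice of Logistic Regression as the example model are all assumptions made to keep the example self-contained. Per-class ROC curves are computed one-vs-rest by binarizing the labels.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    auc,
    classification_report,
    confusion_matrix,
    roc_curve,
)
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.preprocessing import label_binarize


def evaluate(model, X_train, X_test, y_train, y_test, cv=5):
    """Fit `model` and collect the metrics described in the Output section.

    Assumes three or more classes, so that label_binarize returns one
    column per class.
    """
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    classes = model.classes_

    # One-vs-rest ROC: compare each class's predicted probability
    # against a 0/1 indicator for that class.
    y_score = model.predict_proba(X_test)
    y_true_bin = label_binarize(y_test, classes=classes)
    roc_auc = {}
    for i, label in enumerate(classes):
        fpr, tpr, _ = roc_curve(y_true_bin[:, i], y_score[:, i])
        roc_auc[label] = auc(fpr, tpr)  # fpr/tpr could also be plotted

    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred, labels=classes),
        "report": classification_report(y_test, y_pred),
        "cv_scores": cross_val_score(model, X_train, y_train, cv=cv),
        "roc_auc": roc_auc,
    }


# Synthetic stand-in for the real features and activity labels.
X, y = make_classification(
    n_samples=600, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

results = evaluate(LogisticRegression(max_iter=1000), X_train, X_test, y_train, y_test)
print(results["report"])
```

The same helper could be called with any classifier that exposes `predict_proba`, which makes it straightforward to compare several models on identical splits.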

Accuracy Analysis:

  • Logistic Regression: Achieved an accuracy of 0.49, indicating limited performance.
  • LightGBM: Accuracy of 0.94.
  • XGBoost: Performed well with an accuracy of 0.95.
  • Decision Tree: Accuracy of 0.99.
  • Random Forest: Achieved an accuracy of 0.99, with the best model's parameters provided.

Note:

  • LightGBM can be sensitive to the choice of parameters. You can experiment with different hyperparameters as needed.
  • For the Random Forest model, hyperparameter tuning is demonstrated using GridSearchCV. The best parameters and the corresponding accuracy are displayed.
  • Make sure the data.csv dataset is uploaded to your Colab environment. It is included in the repository as a zip archive; unzip it before use.
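For readers unfamiliar with GridSearchCV, the Random Forest tuning mentioned in the notes can be sketched roughly as follows. The parameter grid, the number of folds, and the synthetic data are illustrative assumptions; the README does not list the grid actually searched in the repository.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the real features and activity labels.
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical grid, kept small so the search finishes quickly.
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [None, 5],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,                # 3-fold cross-validation for each combination
    scoring="accuracy",
)
search.fit(X_train, y_train)

best_params = search.best_params_
# score() evaluates the refitted best estimator with the chosen scoring.
test_accuracy = search.score(X_test, y_test)

print("Best parameters:", best_params)
print("Test accuracy:", round(test_accuracy, 3))
```

Scoring the held-out test split after the search, rather than reporting only the cross-validated score, gives an estimate that was not used to pick the hyperparameters.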

Thank you for using this multi-class classification demonstration script in Google Colab. If you have any questions or need assistance, please feel free to reach out.
