valinsogna / ethicalai-stackanalysis

An ML algorithm capable of conducting an in-depth analysis of students' responses to STACK questions

License: MIT License

Jupyter Notebook 100.00%
correspondence-analysis kmeans-clustering machine-learning stack

STACK Student Response Analysis for Sage Foundation Ethical AI Hackathon

An ML algorithm capable of conducting an in-depth analysis of students' responses to STACK questions, developed for the Ethical AI Hackathon promoted by Sage Foundation.

STACK is the world-leading open-source online assessment system for mathematics and STEM. It is available for Moodle, ILIAS and as an integration through LTI.

The algorithms used were Correspondence Analysis (CA), to discover lexical similarity between students' incorrect answers, and K-means, to cluster them.

Table of Contents

  1. Team Introduction & Understanding the Problem
  2. Data Cleaning
  3. Python Scripts & Visualisation
  4. Machine Learning Analysis Summary
  5. Presentation
  6. Future Improvements
  7. Team & Researchers

1. Team Introduction & Understanding the Problem

Review of the sample data

Hackathon Challenge

Our challenge in this hackathon is to develop a machine learning algorithm to analyze students' responses to STACK questions. The aim is to classify correct vs. incorrect responses, further delve into the types of incorrect responses, group similar incorrect responses, and identify any outlier responses.

Aim

To devise an algorithm that effectively provides an in-depth analysis of students' answers to STACK questions.

Specific Objectives

  1. Classification of Correct vs. Incorrect Responses
  2. Multilevel Classification of Incorrect Responses (Predicted vs. unpredicted responses using PRT paths)
  3. Cluster Analysis - Grouping Similar Incorrect Responses
  4. Anomaly Detection Based on Question Text

2. Data Cleaning

For the purposes of our analysis, only the finished attempts are considered.
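
A minimal sketch of this filtering step follows. The column names (`attempt_id`, `state`, `response`) and the value `Finished` are assumptions about the export format, not taken from the project's data, so they are exposed as parameters.

```python
import csv
import io

# Tiny made-up sample standing in for an exported attempts file.
SAMPLE = """attempt_id,state,response
1,Finished,x^2+1
2,In progress,
3,Finished,2*x
4,Never submitted,
"""


def finished_attempts(rows, state_column="state", finished_value="Finished"):
    """Keep only the rows whose state marks the attempt as finished."""
    wanted = finished_value.lower()
    return [row for row in rows if row[state_column].strip().lower() == wanted]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kept = finished_attempts(rows)
print([row["attempt_id"] for row in kept])  # ['1', '3']
```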

3. Python Scripts & Visualisation

Each objective was approached with a dedicated Python script, followed by visualization to represent the analysis results.

Writing Python Scripts

  1. Script for Objective 1-2: Link to the Code
  2. Script for Objective 3-4: Link to the Code

4. Machine Learning Analysis Summary

  • Contingency tables: for each type of question, a contingency table of students' answers was built, using the characters present in each response as the vocabulary.
  • Correspondence Analysis (CA): for each type of question, a 2D CA was performed on the predicted and unpredicted incorrect answers to analyze the lexical (dis)similarities between them.
  • K-means: used to cluster the answers and identify common errors for each type of question, taking the results of each CA as input.
  • Data Saving and Retrieval: the analyzed data are saved for future use or further analysis.
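
The first three steps can be sketched end to end as below. This is an illustrative NumPy-only version and not the notebook's code, which may rely on dedicated libraries. The function names, the sample answers, and the deterministic farthest-point initialisation of K-means are all choices made for this example. CA is computed in the standard way, from the singular value decomposition of the standardized residuals of the contingency table, and assumes every answer is non-empty.

```python
from collections import Counter

import numpy as np


def contingency_table(answers):
    """Rows are answers, columns are characters, cells are character counts."""
    vocab = sorted({ch for ans in answers for ch in ans if not ch.isspace()})
    counts = [Counter(ans) for ans in answers]
    table = np.array([[c[ch] for ch in vocab] for c in counts], dtype=float)
    return table, vocab


def correspondence_analysis(table, n_components=2):
    """Row principal coordinates from the SVD of standardized residuals."""
    P = table / table.sum()   # correspondence matrix
    r = P.sum(axis=1)         # row masses (nonzero for non-empty answers)
    c = P.sum(axis=0)         # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sing, _ = np.linalg.svd(S, full_matrices=False)
    return (U[:, :n_components] * sing[:n_components]) / np.sqrt(r)[:, None]


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm with a deterministic farthest-point start."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Made-up incorrect answers to a single hypothetical question.
answers = ["x^2+1", "x^2-1", "x^2+2", "sin(x)", "cos(x)", "sin(2x)"]
table, vocab = contingency_table(answers)
coords = correspondence_analysis(table, n_components=2)
labels, _ = kmeans(coords, k=2)
for ans, lab in zip(answers, labels):
    print(f"{ans!r}: cluster {lab}")
```

Because the vocabulary is made of single characters, two answers that use the same characters in a different order (for example `x+1` and `1+x`) receive identical CA coordinates and always fall in the same cluster.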

5. Presentation

The findings, algorithm, and insights were compiled and documented for presentation to the Hackathon judges.

6. Future Improvements

  • After individual testing, all code blocks should be integrated into a single program.
  • Increase the number of dimensions used for Correspondence Analysis (e.g. 3D).
  • Add mathematical functions and symbols to the vocabulary used to create the contingency tables.
  • Choose an effective number of clusters based on the better view of the data offered by 3D CA.
  • Create an API that fetches the clustering data and feeds it as input to the STACK system.
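
As an illustration of the vocabulary improvement listed above, the sketch below tokenizes an answer so that function names count as single vocabulary entries instead of separate characters. The list of function names is a hypothetical starting point, not something defined by the project.

```python
import re
from collections import Counter

# Hypothetical list of function names to treat as single tokens.
# Longer names must come before any name they contain as a prefix.
FUNCTIONS = ("sqrt", "sin", "cos", "tan", "log", "exp")

# Alternation order matters: function names are tried before the
# single-character fallback, so "sin" is not split into s, i, n.
TOKEN_RE = re.compile("|".join(map(re.escape, FUNCTIONS)) + r"|\S")


def tokenize(answer):
    """Split an answer into function-name tokens and single characters."""
    return TOKEN_RE.findall(answer)


def token_counts(answer):
    """Token frequencies, usable as one row of a contingency table."""
    return Counter(tokenize(answer))


print(tokenize("sin(x)^2 + sqrt(2)"))
# ['sin', '(', 'x', ')', '^', '2', '+', 'sqrt', '(', '2', ')']
```

With this vocabulary, `sin(x)` and `sin(2x)` share the token `sin`, whereas a character-level vocabulary only records that both contain `s`, `i` and `n`.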

7. Team & Researchers

Below are the contributors to this project:

Team Members

Lead Researchers

References
