
ange3 / deepcode


Deep learning using Recurrent Neural Networks on student code submissions; focusing on LSTMs to predict student success

Python 33.22% Shell 0.02% C 1.23% C++ 0.02% Fortran 0.01% HTML 4.71% CSS 0.01% Jupyter Notebook 60.79%

deepcode's Introduction

What Will You Code Next? Deep Knowledge Tracing on Non-Binary Data

Predicting Student Performance on Online Coding Exercises


In our research, we use deep learning to understand a student’s learning trajectory as they solve open-ended problems. With a robust understanding of student learning, we can ultimately provide personalized automated feedback to students at scale.

We perform two tasks using RNNs to predict a student's future performance on given questions:

  • Binary: using a student's past accuracy, predict whether the student will get the next question right or wrong.
  • Non-binary: using a student's past submissions, predict the next step in the student's problem-solving path. For the Code.org data, this means predicting the next program that the student will write.
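Both tasks can be framed as next-step prediction over a per-student sequence. The sketch below is illustrative only and is not taken from the repository: the function name `next_step_pairs` and the sample values are assumptions. It shows how one sequence can be split into model inputs and targets shifted by one step.

```python
def next_step_pairs(sequence):
    """Split one student's sequence into (inputs, targets) shifted by one step.

    inputs[t] is what the model sees at step t; targets[t] is what it should
    predict, namely the student's outcome at step t + 1.
    """
    if len(sequence) < 2:
        raise ValueError("need at least two steps to form a prediction pair")
    return sequence[:-1], sequence[1:]


# Binary task: 1 = correct, 0 = wrong (made-up values for illustration).
correctness = [0, 0, 1, 0, 1, 1]
binary_inputs, binary_targets = next_step_pairs(correctness)

# Non-binary task: each step is the ID of the AST the student submitted
# (hypothetical IDs).
ast_ids = [7, 7, 12, 3, 0]
ast_inputs, ast_targets = next_step_pairs(ast_ids)

print(binary_inputs, binary_targets)
print(ast_inputs, ast_targets)
```

The only difference between the two tasks in this framing is the size of the label space: two classes for the binary task, and one class per distinct AST for the non-binary task.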

Stanford Computer Science Senior Project

  • Students: Angela, Lisa, Larry
  • Advisor: Chris

Folders

A description of our folders and a few of the main files in each folder.

code

Contains code to run our RNN models and other helper files

  • baselines contains the baseline models that serve as accuracy benchmarks for our two tasks
  • constants.py
  • IPython notebooks run the Recurrent Neural Networks. We have two flavors of RNN: one that predicts binary correct/wrong for each student solution (milestone_1_binary) and one that predicts the next AST in a student's problem-solving path (lasagne_rnn_predict_next_ast)
  • model* python files build and compile the Lasagne models and functions
  • visualize.py creates loss and accuracy plots for our results
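The contents of the baselines folder are not described here. As a rough illustration of what an accuracy benchmark for the binary task could look like, the sketch below implements a majority-class baseline, a common simple choice. The function name and the sample labels are hypothetical and may not match what the repository uses.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Always predict the most frequent training label; return it and its test accuracy."""
    majority_label = Counter(train_labels).most_common(1)[0][0]
    num_correct = sum(1 for label in test_labels if label == majority_label)
    return majority_label, num_correct / len(test_labels)


# Made-up correctness labels: 1 = correct, 0 = wrong.
train = [1, 1, 0, 1, 0, 1]
test = [1, 0, 1, 1]
label, accuracy = majority_baseline_accuracy(train, test)
print(label, accuracy)
```

An RNN is only worth its training cost on this task if it beats a benchmark of this kind.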

data-extraction-utils

Contains the Python files that pre-process the CSV files of Code.org Hour of Code data into the numpy matrices used by the RNNs.

  • extract_from_activities_csv extracts the number of attempted and correct student solutions from the activities.csv database dump for HOC 1-9
  • extract_asts_for_all_trajectories.py extracts AST IDs from trajectories to create matrices of shape (num_trajectories, num_timesteps, num_ast). It can also be used to clip trajectories below a certain frequency. Note: input data files follow the naming pattern 'data/trajectory_ast_csv_files/Trajectory_ASTs_1.csv'.
  • extract_blocks_for_all_asts.py extracts code statements from ASTs to create matrices of shape (num_trajectories, num_timesteps, num_code_blocks). Note: input data files follow the naming pattern 'data/ast_blocks_files/AST_to_blocks_1.csv'.
  • info files (printed output from running extract_* files)
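The (num_trajectories, num_timesteps, num_ast) layout above can be illustrated with a small one-hot encoding sketch. It is not the repository's extraction code. It uses nested lists instead of numpy arrays to stay dependency-free. It interprets "clip trajectories below a certain frequency" as dropping trajectories that occur fewer than `min_count` times, and it pads short trajectories with all-zero rows. All three choices, and both function names, are assumptions.

```python
from collections import Counter


def clip_rare_trajectories(trajectories, min_count):
    """Keep only trajectories that occur at least min_count times (assumed meaning of clipping)."""
    counts = Counter(tuple(traj) for traj in trajectories)
    return [traj for traj in trajectories if counts[tuple(traj)] >= min_count]


def one_hot_trajectories(trajectories, num_timesteps, num_ast):
    """Encode AST-ID trajectories as nested lists of shape
    (num_trajectories, num_timesteps, num_ast).

    Steps past the end of a trajectory stay all-zero (padding); steps beyond
    num_timesteps are dropped.
    """
    tensor = []
    for traj in trajectories:
        rows = []
        for t in range(num_timesteps):
            row = [0] * num_ast
            if t < len(traj):
                row[traj[t]] = 1  # mark the AST submitted at timestep t
            rows.append(row)
        tensor.append(rows)
    return tensor


# Hypothetical trajectories of AST IDs, one list per student.
raw = [[0, 2, 1], [0, 2, 1], [3, 1], [0, 1]]
kept = clip_rare_trajectories(raw, min_count=2)
encoded = one_hot_trajectories(kept, num_timesteps=4, num_ast=4)
print(len(encoded), len(encoded[0]), len(encoded[0][0]))
```

The same layout applies to the code-block matrices, with num_code_blocks in place of num_ast and possibly several entries set per timestep, since one AST can contain more than one block.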

loss_plots

Images showing the loss and accuracy values of our RNNs

syntheticDetailed

Synthetically generated data of students answering a series of questions

Contributors

  • ange3
  • hrlarry
  • wur911
