nihalnayak / slam18

Code for Context based Approach for Second Language Acquisition

License: MIT License

Python 100.00%
natural-language-processing machine-learning machine-learning-models language-learning nlp duolingo cognitive-science

slam18's Introduction

Context based Approach for Second Language Acquisition

This project is the implementation of the system submitted to the SLAM 2018 (Second Language Acquisition Modeling 2018) shared task.

This page gives instructions for reproducing the results of our system.

Installation

Our project is written in Python and is compatible with both Python 2 and Python 3. This section describes the installation procedure for Ubuntu 16.04.

git clone https://github.com/iampuntre/slam18.git
cd slam18
virtualenv env
source env/bin/activate
pip install -r requirements.txt
mkdir data

Note: Follow the equivalent instructions for your operating system.

Downloading Data

Our experiments use the SLAM 2018 dataset, which you can download from https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/8SWHNO.

After downloading the data, unzip it in the data directory.

Parameters for the Experiment

Before training the model, configure the parameters.ini file, located at src/parameters.ini.

The file has three sections: model, options, and context_features. The model section points to the train, dev, and test files. The options section sets the hyperparameters used while training the model. The context_features section activates or deactivates the individual context-based features.

Set the paths of the train, dev, and test files to match your setup. The hyperparameters are preset to the values we used in our experiments, and all context features are activated by default.
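
To illustrate the three-section layout described above, here is a minimal sketch that parses an INI file with Python's standard configparser module (Python 3 naming). Only the section names (model, options, context_features) and the train/dev/test entries come from this README. Every other key, every value, and the helper functions load_parameters and active_features are hypothetical, so check src/parameters.ini for the real names.

```python
import configparser

# Hypothetical contents: the section names come from the README, but the
# keys under options/context_features and all values are assumptions.
EXAMPLE_INI = """
[model]
train = data/example.train
dev = data/example.dev
test = data/example.test

[options]
epochs = 10

[context_features]
previous_token = true
next_token = false
"""


def load_parameters(text):
    """Parse INI text and return a dict of {section: {key: value}}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


def active_features(params):
    """Return the sorted names of context features whose value reads as true."""
    truthy = {"1", "true", "yes", "on"}
    features = params.get("context_features", {})
    return sorted(
        name for name, value in features.items()
        if value.strip().lower() in truthy
    )


params = load_parameters(EXAMPLE_INI)
print(sorted(params))           # section names found in the file
print(active_features(params))  # features switched on in this example
```

Reading the file this way before a long run is a cheap way to confirm that the paths and feature switches are what you intended.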

Prepare Data

After you have configured the parameters.ini file, prepare the data for training. This intermediate step extracts the tokens and parts of speech from the surrounding context. For more details, read our paper.

To prepare the data, run the following command:

python src/prepare_data.py --params_file src/parameters.ini

You should be able to see three .json files in your data directory.
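
As a quick sanity check for this step, the sketch below counts the non-empty .json files in the data directory. The helper names list_prepared_files and check_prepared_files are hypothetical, and the check assumes only what is stated above: that three .json files should appear in data. It does not inspect their contents, since the README does not describe their format.

```python
import glob
import os


def list_prepared_files(data_dir="data"):
    """Return the sorted paths of all .json files directly inside data_dir."""
    return sorted(glob.glob(os.path.join(data_dir, "*.json")))


def check_prepared_files(data_dir="data", expected=3):
    """Return (ok, paths).

    ok is True when exactly `expected` .json files exist and none is empty.
    """
    paths = list_prepared_files(data_dir)
    if len(paths) != expected:
        return False, paths
    all_non_empty = all(os.path.getsize(path) > 0 for path in paths)
    return all_non_empty, paths


if __name__ == "__main__":
    ok, paths = check_prepared_files()
    print("found:", paths)
    print("looks complete" if ok else "expected three non-empty .json files")
```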

Train your model

To train the model, run the following command in your terminal:

python src/train_model.py --params_file src/parameters.ini

Note: Run this step only if you have sufficient memory (at least 16 GB).

Test your predictions

To evaluate your predictions, run the following command:

python src/eval.py --params_file src/parameters.ini
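
This README does not say which metrics the evaluation reports. To our understanding, the SLAM 2018 shared task used AUROC as its primary metric, so the sketch below shows one way to compute AUROC from binary labels and predicted probabilities in plain Python. It is an illustrative stand-in, not the code in src/eval.py, and the function name auroc is our own.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formulation.

    labels: sequence of 0/1 ground-truth values.
    scores: sequence of predicted probabilities for label 1.
    Tied scores receive their average rank.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the end of the group of items sharing the same score.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        # Items i..j-1 hold 1-based ranks i+1..j; their mean is (i + 1 + j) / 2.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    # Subtract the smallest possible rank sum for the positives, then normalize.
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


if __name__ == "__main__":
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The value is the probability that a randomly chosen positive example is scored above a randomly chosen negative one, with ties counted as one half.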

Citation

If you use this work in your paper, please cite our paper:

@InProceedings{nayak-rao:2018:W18-05,
  author    = {Nayak, Nihal V.  and  Rao, Arjun R.},
  title     = {Context Based Approach for Second Language Acquisition},
  booktitle = {Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications},
  month     = {June},
  year      = {2018},
  address   = {New Orleans, Louisiana},
  publisher = {Association for Computational Linguistics},
  pages     = {212--216},
  url       = {http://www.aclweb.org/anthology/W18-0524}
}


slam18's Issues

Add: Evaluation script

The project does not have the evaluation script that was initially present in the Duolingo baseline code.
