GithubHelp home page GithubHelp logo

erayyildiz / neural-morphological-disambiguation-for-turkish-deprecated Goto Github PK

View Code? Open in Web Editor NEW
11.0 3.0 0.0 8.18 MB

Neural morphological disambiguation for Turkish. Implemented in DyNet

Python 100.00%
turkish morphological-analysis morphological-disambiguator dynet

neural-morphological-disambiguation-for-turkish-deprecated's Introduction

Deprecated: See the current version which is based on PyTorch.

Neural-Morphological-Disambiguation-for-Turkish

This project is dynet implementation of morphological disambiguation method proposed in [1]

Full context model is one of the methods proposed in the study and perform best for Turkish morphological disambiguation among four proposed methods. All the methods are based on character level bidirectional LSTM networks and Full Context Model represent the context applying LSTMs on surface words around the target word needs to be disambiguated. Here, we provide an implementation of this method written in dynet. We also provide train and test sets for Turkish morphological disambiguation. The disambiguation model can be trained using this dynet script. Model save and load functions are also provided so that users are able to use this tool as a Turkish morphological disambiguator.

This script is developed in a project about joint dependecy parsing and morphological disambiguation and willbe further developed. A better organized morphological disambiguation tool with much more functionality will be provided soon.

Just run them Neural_morph_disambiguator.py to start training. Values of default parameters are same as proposed values in the paper.

For the details about the method, please see the Shen's study.

accuracy on test set: 0.966707477868 ambiguous accuracy on test: 0.93011416992

[1] Shen,et. al. 2017. The Role of Context in Neural Morphological Disambiguation COLING 2016

Installation

To install the progrm you need a machine with a Python 3 or higher. To install python3, run the following program if your machine has linux based OS.

sudo apt update
sudo apt install software-properties-common
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install python3.7

Then you should install python package manager pip:

sudo apt install python3-pip

After pip is installed you should install python packages which the project requires:

pip install -f REQUIREMENTS.txt

Then create a folder named models in project root directory so that the trained model can be saved into this directory.

mkdir models

Now installation is completed, you can run the program.

How to run

For training you can run the following python script:

from neural_morph_disambiguation_dynet import MorphologicalDisambiguator
disambiguator = MorphologicalDisambiguator(train_from_scratch=True, char_representation_len=100, word_lstm_rep_len=200, 
                                           train_data_path="data/data.train.txt",
                                           test_data_paths=["data/data.test.txt","data/test.merge"],
                                           model_file_name='my_model')

These codes will start training on given datasets and you can track the training status from console. When the training is completed, the model will be saved into models path with name 'my_model'

To load and use a trained model use the codes below:

from neural_morph_disambiguation_dynet import MorphologicalDisambiguator
disambiguator = MorphologicalDisambiguator.create_from_existed_model("model-22.10.2017")

neural-morphological-disambiguation-for-turkish-deprecated's People

Contributors

erayyildiz avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.