This project forked from qcri/dialectal_arabic_segmenter


Arabic Dialects Segmenter Using Keras/BiLSTM/ChainCRF

License: GNU Lesser General Public License v3.0



Dialectal Arabic Segmenter

Dialectal Arabic Segmenter is a freeware module developed by the ALT team at Qatar Computing Research Institute (QCRI) to process Dialectal Arabic. The segmenter is built using a collection of tweets from four regions: Egypt, the Gulf, the Maghreb, and the Levant.

Arabic Dialects Segmenter implemented using Keras/BiLSTM/ChainCRF.

Requirements

This segmenter requires the following packages:

  • Python version 2.7
    • Python 3.4 should also work, with some minor changes
  • tensorflow version 0.9 or later: https://www.tensorflow.org
  • keras version 1.2.2 or later: http://keras.io
  • nltk version 3.0 or later
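Before running the segmenter, it can help to confirm that the installed packages meet the minimum versions listed above. The following is a minimal sketch, not part of the project: the helper names (`parse_version`, `meets_minimum`, `check_package`, `report`) are hypothetical, and it assumes each package exposes a `__version__` attribute.

```python
import importlib
import re

# Minimum versions as listed in the requirements above.
MINIMUM_VERSIONS = {"tensorflow": "0.9", "keras": "1.2.2", "nltk": "3.0"}


def parse_version(text, width=3):
    """Turn '1.2.2' into (1, 2, 2); suffixes such as 'rc1' are ignored."""
    numbers = []
    for piece in text.split(".")[:width]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    # Pad with zeros so that '3' and '3.0' compare as equal.
    numbers.extend([0] * (width - len(numbers)))
    return tuple(numbers)


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


def check_package(name, minimum):
    """Return 'missing', 'unknown', 'too old' or 'ok' for one package."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return "missing"
    # Assumption: the package reports its version via __version__.
    version = getattr(module, "__version__", None)
    if version is None:
        return "unknown"
    return "ok" if meets_minimum(version, minimum) else "too old"


def report():
    """Print one status line per required package."""
    for name in sorted(MINIMUM_VERSIONS):
        print("%-12s %s" % (name, check_package(name, MINIMUM_VERSIONS[name])))
```

Calling `report()` prints one line per package. The check only compares version numbers; it does not verify that a much newer release is still compatible with the segmenter code.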

Installation

You can install the Dialectal Arabic Segmenter by cloning the repo from GitHub with the following command:

git clone https://github.com/qcri/dialectal_segmenter.git

Alternatively, download the compressed archive of the project and extract it.

Getting started

Dialectal Arabic Segmenter reads Arabic text from an input file or from stdin and produces the segmentation line by line. The input must be encoded in UTF-8:

python code/dialects_segmenter.py -i [in-file] -o [out-file] 

For more details see:

python code/dialects_segmenter.py -h
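When the segmenter is called from another Python program, the command line above can be wrapped in a few helper functions. This is a minimal sketch under stated assumptions: the helper names (`write_utf8_lines`, `build_command`, `segment_file`) are hypothetical and not part of the project, only the `-i` and `-o` options shown above are used, and the script is assumed to be run from the repository root.

```python
import io
import os
import subprocess
import sys

# Path of the script as given in this README, relative to the repository root.
SEGMENTER = os.path.join("code", "dialects_segmenter.py")


def write_utf8_lines(path, lines):
    """Write one sentence per line, encoded in UTF-8 as the segmenter expects."""
    with io.open(path, "w", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line.strip() + u"\n")


def build_command(in_file, out_file, python=sys.executable):
    """Return the argument list for the -i/-o invocation shown above."""
    return [python, SEGMENTER, "-i", in_file, "-o", out_file]


def segment_file(in_file, out_file):
    """Run the segmenter; raises CalledProcessError on a non-zero exit code."""
    subprocess.check_call(build_command(in_file, out_file))
```

A typical call sequence would be `write_utf8_lines("in.txt", sentences)` followed by `segment_file("in.txt", "out.txt")`. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.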

Publications

Younes Samih, Mohamed Eldesouki, Mohammed Attia, Kareem Darwish, Ahmed Abdelali, Hamdy Mubarak, Laura Kallmeyer (2017). Learning from Relatives: Unified Dialectal Arabic Segmentation. Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), pages 432-441.

Mohamed Eldesouki, Younes Samih, Ahmed Abdelali, Mohammed Attia, Hamdy Mubarak, Kareem Darwish, Laura Kallmeyer (2017). Arabic Multi-Dialect Segmentation: bi-LSTM-CRF vs. SVM. arXiv preprint arXiv:1708.05891.

Younes Samih, Mohammed Attia, Mohamed Eldesouki, Ahmed Abdelali, Hamdy Mubarak, Laura Kallmeyer, Kareem Darwish (2017). A Neural Architecture for Dialectal Arabic Segmentation. Proceedings of the Third Arabic Natural Language Processing Workshop, pages 46-54.

Support

You can ask questions and join the development discussion:

You can also post bug reports and feature requests (and only those) in GitHub issues. Make sure to read our guidelines first.

License

Dialectal Arabic Segmenter is covered by the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

