
This project forked from mardan/uyghur-tokenizer


A simple word-level tokenizing library and tool for the Uyghur language.

Home Page: https://github.com/mardan/Uyghur-Tokenizer

License: Other

C# 100.00%


Uyghur Tokenizer

A simple word-level tokenizer for Uyghur (Arabic-based alphabet), used to extract Uyghur words from text files. This project includes:

  • A tokenizer library: a DLL you can use in your own programs (requires .NET Framework 2.0 or later).
  • A demo tool that automatically extracts words from the text files in a specified directory on your PC.

Using Library

Add UyghurTokenizer.dll as a reference in your .NET project, then use either of the methods below.

        string textToTokenize = "Uyghur content";
        UyghurTokenizer tokenizer = new UyghurTokenizer();
        string[] tokens = tokenizer.GetTokens(textToTokenize);

or

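        // Assumes GetTokenIterator is an instance method, like GetTokens above.
        UyghurTokenizer tokenizer = new UyghurTokenizer();
        List<string> tokens = new List<string>();
        IEnumerator<string> iter = tokenizer.GetTokenIterator(inputText);
        while (iter.MoveNext())
            tokens.Add(iter.Current); // collect each token as it is produced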

Using Demo Tool

Follow the instructions shown in the tool's window.

Online demo: http://lab.uyghurdev.net/Uyghur-Tokenizer/Default.aspx
