aalok-sathe / sanskrit_ipa

Implementation of the LREC-CCURL'18 paper: a rule-based system for transcribing Sanskrit text written in Devanagari into the International Phonetic Alphabet (IPA). It uses the closest known approximate pronunciations of sounds, together with prosodic and metrical rules: syllabification follows the WWG algorithm for Sinhalese, adapted to Sanskrit, and stress is assigned according to syllable weight.

License: GNU General Public License v3.0

Python 23.67% TeX 76.33%
ipa-symbols linguistics sanskrit syllabification transcription sanskrit-ipa nlp

sanskrit_ipa's Introduction

sanskrit_IPA

Usage: To start the application's command prompt, run: python3 init.py

Commands:

  1. transcribe [text]
     1.1: Transcribes the provided Sanskrit [text], written in the Devanagari script, into IPA and prints the result inline.
     1.2: If an input file has been set, transcribes the text from that file.

  2. setInPath
     2.1: Sets the input path, for when the user wishes to transcribe text from a particular input file.
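
The two commands above can be pictured as a small dispatcher. The following is only a rough sketch, not the project's actual code: the class Prompt, the function placeholder_transcribe, the messages it returns, and the assumption that setInPath takes the path as its argument are all hypothetical and chosen for illustration. The real transcription rules are not reproduced here; the placeholder simply echoes its input.

```python
from pathlib import Path
from typing import Callable, Optional


def placeholder_transcribe(text: str) -> str:
    """Hypothetical stand-in for the real transcriber: returns the input unchanged."""
    return text


class Prompt:
    """Hypothetical dispatcher for the `transcribe` and `setInPath` commands."""

    def __init__(self, transcribe_fn: Callable[[str], str] = placeholder_transcribe) -> None:
        self.transcribe_fn = transcribe_fn
        self.in_path: Optional[Path] = None  # no input file until setInPath is used

    def handle(self, line: str) -> str:
        # Split "command rest of line" into the command word and its argument.
        command, _, argument = line.strip().partition(" ")
        argument = argument.strip()

        if command == "setInPath":
            # Assumption: the path is given as the command's argument.
            if not argument:
                return "usage: setInPath [path]"
            self.in_path = Path(argument)
            return f"input path set to {self.in_path}"

        if command == "transcribe":
            if argument:
                # Case 1.1: inline text was provided.
                return self.transcribe_fn(argument)
            if self.in_path is not None:
                # Case 1.2: fall back to the previously set input file.
                return self.transcribe_fn(self.in_path.read_text(encoding="utf-8"))
            return "nothing to transcribe: give text or set an input path"

        return f"unknown command: {command}"
```

With this sketch, Prompt().handle("transcribe निश्चित") passes the inline text to the transcriber, while a bare "transcribe" after a "setInPath" call reads the text from the stored file instead.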

sanskrit_ipa's People

Contributors

aalok-sathe

Forkers

mediabuff

sanskrit_ipa's Issues

Executable init.py

Could you make init.py executable (that is, run chmod +x init.py and push the change)? It already has the appropriate shebang at the top, and typing ./init.py is preferable to python3 init.py. Also, the name init.py is a bit misleading, because the script does not simply initialize something but actually runs it too. How about main.py?

Syllabification issues for words with श्च

These words don't get syllabified:

> transcribe निश्चित
niɕt͡ɕit̪ə
> transcribe निश्चय
niɕt͡ɕəjə
> transcribe नरश्च गजश्च
nəɹəɕt͡ɕə gəd͡ʑəɕt͡ɕə

But this one does:

> transcribe पुनःश्च
pu.ˈnəhɕ.t͡ɕə

In my opinion, the code should reach this condition but doesn't; that might be a starting point for debugging.
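
To reproduce the report mechanically, one could flag output words that carry no syllable boundary. This is a minimal sketch and not part of the project: the helper name unsyllabified_words is hypothetical, and it assumes that the "." seen in the syllabified output above is the boundary marker. A genuinely monosyllabic word would also be flagged, so the check only suits words known to have several syllables, such as the ones in this issue.

```python
from typing import List

SYLLABLE_BREAK = "."  # assumption: the boundary marker seen in the syllabified output


def unsyllabified_words(ipa: str) -> List[str]:
    """Return the whitespace-separated words of `ipa` that contain no syllable break."""
    return [word for word in ipa.split() if SYLLABLE_BREAK not in word]
```

Applied to the outputs quoted above, the three श्च examples come back in full, while pu.ˈnəhɕ.t͡ɕə yields an empty list.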

(Note: at first I thought this was an issue relating to zero-width non-joiners, but it wasn't. Nevertheless, the presence of ZWNJs in the input triggers a warning in the script. I believe Sanskrit text does not use ZWNJs, but it might be good to explicitly ignore them anyway.)
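
Ignoring ZWNJs could be as simple as dropping the character before transcription. The following is a minimal sketch under that assumption; the function name strip_zwnj is hypothetical, and where such a step would hook into the script is not specified in this issue.

```python
ZWNJ = "\u200c"  # U+200C ZERO WIDTH NON-JOINER


def strip_zwnj(text: str) -> str:
    """Return `text` with every zero-width non-joiner removed."""
    return text.replace(ZWNJ, "")
```

Calling strip_zwnj on the input before it reaches the transcriber would leave text without ZWNJs unchanged and silently drop any that are present.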
