Comments (4)

aarondandy commented on May 28, 2024

Interesting, I can't seem to find any Hunspell-compatible dictionary files out there. This project is just a port of the original Hunspell, which can be found at hunspell/hunspell. That said, if my assumptions are correct and you have a dotnet background, my port is going to be a lot easier to follow along with. I'll be honest, I don't completely understand how it all works, but let me see if I can dig up some clues for you.

So first up, I am totally ignorant of the language, but it appears to be written right to left, which may need some special treatment for complex affixes. Within Hunspell this seems to be referred to as a "Complex Affix" language, and it will set a ton of string reversals in motion. Another thing to consider: be sure to encode any files you make as UTF-8. It just makes it all so much simpler!
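To make the UTF-8 advice concrete, here is a minimal Python sketch that writes an `.aff`/`.dic` pair with an explicit encoding. The function name, file name, and sample words are hypothetical placeholders. `SET UTF-8` and the leading entry count in the `.dic` file follow the Hunspell file format. The optional `COMPLEXPREFIXES` line is only an assumption about which setting the "Complex Affix" remark refers to.

```python
from pathlib import Path


def write_dictionary(directory, name, words, complex_prefixes=False):
    """Write name.aff and name.dic into directory, both encoded as UTF-8.

    Hypothetical helper for illustration; not part of Hunspell or the port.
    """
    directory = Path(directory)
    aff_path = directory / f"{name}.aff"
    dic_path = directory / f"{name}.dic"

    # SET tells Hunspell which character encoding both files use.
    aff_lines = ["SET UTF-8"]
    if complex_prefixes:
        # Assumption: this is the option behind the "Complex Affix" remark.
        aff_lines.append("COMPLEXPREFIXES")
    aff_path.write_text("\n".join(aff_lines) + "\n", encoding="utf-8")

    # The first line of a .dic file is the (approximate) number of entries,
    # followed by one word per line.
    dic_lines = [str(len(words)), *words]
    dic_path.write_text("\n".join(dic_lines) + "\n", encoding="utf-8")
    return aff_path, dic_path
```

Passing `encoding="utf-8"` explicitly matters because the platform default encoding may differ, and the bytes on disk would then disagree with the `SET` line.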

Regarding the Levenshtein distance, I don't know if exactly that is implemented for suggestions, but there is a whole lot of code that runs as part of the suggestions and is at least very similar in how it operates. It's not pretty, but it all starts around here: https://github.com/aarondandy/WeCantSpell.Hunspell/blob/master/src/WeCantSpell.Hunspell/WordList.QuerySuggest.cs#L504 . If you have a test runner that includes test coverage, such as NCrunch or the new Visual Studio test runner, you can use it to find tests that cover interesting areas and step through them. The test coverage is pretty decent and can be a huge aid in understanding how it all works.
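For reference while reading the suggestion code, here is a textbook Levenshtein distance in Python with a small ranking helper. This is only a sketch of the general technique. It is not the algorithm in `WordList.QuerySuggest.cs`, and the names `levenshtein` and `rank_suggestions` are made up for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the rows as short as possible
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def rank_suggestions(word, candidates, limit=3):
    """Return up to `limit` candidates, closest first, ties broken alphabetically."""
    return sorted(candidates, key=lambda c: (levenshtein(word, c), c))[:limit]
```

For example, `levenshtein("kitten", "sitting")` is 3, and `rank_suggestions("helo", ["help", "hello", "world"], limit=2)` returns `["hello", "help"]`, since both are one edit away.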

Hope that helps. Getting a new language into Hunspell would be pretty cool.

from wecantspell.hunspell.

aarondandy commented on May 28, 2024

Another thought: again, I'm no linguist and have no idea what I am talking about, but the German language may have some similarities in the way it forms what Hunspell refers to as "Compound Words".

mhmd-azeez commented on May 28, 2024

@aarondandy thank you very much for your reply. Your port is definitely a huge help for me. The problem is that there is not much documentation about Hunspell or about creating dictionaries for it. I'll take a look at the German dictionary to see what I can understand.