kmoroz / wanakanashaapu

🐊 C# port of WanaKana JS 🦀

C# 100.00%
japanese wanakana wanikani japanese-characters japanese-kana japanese-language japanese-study hiragana kana kanji

wanakanashaapu's Introduction

ワナカナ ↔️ WanaKana ↔️ わなかな

🐊🦀 Basic Overview

WanaKanaShaapu is a utility library for converting between Japanese kana and Latin characters (romaji) and for detecting Japanese text in a given input. The C# version is ported from WanaKana JS v5.0.0.

🐊🦀 Original Documentation

WanaKana API Documentation

🐊🦀 Differences in Implementation

| Method | WanaKana JS | WanaKanaShaapu |
|---|---|---|
| IsMixed | isMixed(input, { passKanji: true }) | IsMixed(input, passKanji) |
| StripOkurigana | stripOkurigana(input, { leading: false, matchKanji: '' }) | StripOkurigana(input, leading, matchKanji) |
| Tokenize | 1. tokenize(input, { detailed: true, compact: false }) 2. Returns either a string array or an object | 1. Tokenize(input, compact) 2. Returns a Tokenization object |

wanakanashaapu's Issues

"chuu" is translated into "ちゅ" instead of "ちゅう"

Hi,

When converting "chuu" to hiragana, I get "ちゅ" instead of the expected "ちゅう".

This seems to happen when the "ya" parts are added after consonants by calling this method in TreeBuilder.cs (the call on line 70):

private static void AddNode(Node node, string romaji, string kana)
{
    if (romaji.Length == 1 && !node.Children.ContainsKey(romaji))
        node.Children.Add(romaji, new Node(kana));
    else if (romaji.Length == 1 && node.Children.ContainsKey(romaji))
        node.Children[romaji].Data = kana;
    else
    {
        foreach (char c in romaji)
        {
            if (!node.Children.ContainsKey(c.ToString()))
                node.Children.Add(c.ToString(), new Node(string.Empty));
            node = node.Children[c.ToString()];
            AddNode(node, romaji[1..], kana);
        }
    }
}

When adding "ya" under the k node, we reach the foreach loop:

Current node is `k`, romaji = "ya", kana = "きゃ", current character is "y"
`k -> y` does not exist
    we create `k -> y`
node = `k -> y`
    ->  Add node recursively (node: `k -> y`, romaji: "a", kana: "きゃ")
        romaji.Length is 1 and `k -> y` does not have any child yet
            `k -> y -> a` is created with value "きゃ"
    <-

// next loop, node `k -> y`, romaji = "ya", kana = "きゃ", current character is "a"
`k -> y -> a` exists
node = `k -> y -> a`
    -> Add node recursively (node: `k -> y -> a`, romaji: "a", kana: "きゃ")
        romaji.Length is 1 and `k -> y -> a` does not have any child yet
            `k -> y -> a -> a` is created with value "きゃ"

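The method mixes recursion and iteration: the recursive call already handles the rest of the string, but the loop then moves on to the next character and recurses again with the same romaji[1..], which creates the extra `k -> y -> a -> a` node.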

I can see two solutions to that problem.

By fixing the iterations:

private static void AddNode(Node node, string romaji, string kana)
{
    if (romaji.Length == 1 && !node.Children.ContainsKey(romaji))
        node.Children.Add(romaji, new Node(kana));
    else if (romaji.Length == 1 && node.Children.ContainsKey(romaji))
        node.Children[romaji].Data = kana;
    else
    {
        var remaining = romaji;
        foreach (char c in romaji)
        {
            if (!node.Children.ContainsKey(c.ToString()))
                node.Children.Add(c.ToString(), new Node(string.Empty));
            node = node.Children[c.ToString()];
            AddNode(node, remaining[1..], kana);
            remaining = remaining[1..];
        }
    }
}

By fixing the recursion (this seems more elegant to me):

private static void AddNode(Node node, string romaji, string kana)
{
    // Nothing left to insert.
    if (romaji.Length == 0)
        return;
    // Last character: create the leaf, or overwrite the value of an existing one.
    else if (romaji.Length == 1 && !node.Children.ContainsKey(romaji))
        node.Children.Add(romaji, new Node(kana));
    else if (romaji.Length == 1 && node.Children.ContainsKey(romaji))
        node.Children[romaji].Data = kana;
    else
    {
        // Descend one level (creating the intermediate node if needed),
        // then recurse on the rest of the string.
        var firstChar = romaji[0].ToString();
        if (!node.Children.ContainsKey(firstChar))
            node.Children.Add(firstChar, new Node(string.Empty));
        node = node.Children[firstChar];
        AddNode(node, romaji[1..], kana);
    }
}
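
To check the recursive approach in isolation, here is a minimal, self-contained sketch. The `Node` class below is a hypothetical stand-in that only assumes the two members used in the snippets above (`Children` and `Data`); the real class in the library may look different. `TrieSketch` and the slightly condensed `AddNode` are likewise illustrative, not the library's code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the library's Node type: only the two members
// that the snippets in this issue rely on.
public sealed class Node
{
    public Dictionary<string, Node> Children { get; } = new();
    public string Data { get; set; }
    public Node(string data) => Data = data;
}

public static class TrieSketch
{
    // Recursive insertion that consumes exactly one character per call,
    // so no level of the tree is visited twice.
    public static void AddNode(Node node, string romaji, string kana)
    {
        if (romaji.Length == 0)
            return;

        var key = romaji[0].ToString();
        if (!node.Children.TryGetValue(key, out var child))
        {
            child = new Node(string.Empty);
            node.Children.Add(key, child);
        }

        if (romaji.Length == 1)
            child.Data = kana;                 // last character: store the kana
        else
            AddNode(child, romaji[1..], kana); // otherwise descend one level
    }

    public static void Main()
    {
        // Same scenario as the trace above: add "ya" under the k node.
        var k = new Node(string.Empty);
        AddNode(k, "ya", "きゃ");

        var leaf = k.Children["y"].Children["a"];
        Console.WriteLine(leaf.Data);            // きゃ
        Console.WriteLine(leaf.Children.Count);  // 0: no spurious k -> y -> a -> a
    }
}
```

With this shape, `k -> y -> a` ends up as a leaf, so a following vowel is no longer swallowed by the match.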
