Build an executable of ddskk/skk (ddskk, CLOSED)

skk-dev commented on July 24, 2024
Build an executable of ddskk/skk

from ddskk.

Comments (7)

tkita commented on July 24, 2024

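Hello, @loretoparisi.
I think I mostly understand what you want to do.
However, it is not simply "the reverse of kakasi".

Converting kana and kanji to romaji gives an almost unique result.
日本が好きです => にっぽん が すき です => nippon ga suki desu

Converting romaji to kana and kanji, on the other hand, is not necessarily unique.
Even the single kana "ga" has many candidates, such as 「が」「画」「賀」「我」「蛾」, and finding the best candidate for a given context takes algorithms and ingenuity.
You also need morphological analysis to split "nippongasukidesu" into meaningful units such as "nippon ga suki desu".

ddskk is a kana-kanji conversion system built on the premise that the human user performs both the "selection of the best candidate" and the "morphological analysis" described above. To put it bluntly, it is probably the program furthest from what you are looking for.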
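
The two difficulties described above (several candidates per reading, and no word boundaries in the input) can be illustrated with a small sketch. The table below is a made-up toy vocabulary, not an SKK dictionary, and the helper names `segmentations` and `count_conversions` are hypothetical; the extra entries for "su" and "ki" are added only to make the segmentation ambiguous.

```python
# Toy reading -> candidates table (illustrative only; a real system would
# load a full dictionary instead of these few hand-picked entries).
CANDIDATES = {
    "nippon": ["日本"],
    "ga": ["が", "画", "賀", "我", "蛾"],
    "suki": ["好き"],
    "su": ["酢", "巣"],
    "ki": ["木", "気"],
    "desu": ["です"],
}


def segmentations(romaji, vocab):
    """Return every way to split `romaji` into words that all appear in `vocab`."""
    if not romaji:
        return [[]]
    results = []
    for end in range(1, len(romaji) + 1):
        head = romaji[:end]
        if head in vocab:
            # Keep this prefix and recursively split the remainder.
            for tail in segmentations(romaji[end:], vocab):
                results.append([head] + tail)
    return results


def count_conversions(words, vocab):
    """Number of distinct kana/kanji strings one segmentation can produce."""
    total = 1
    for word in words:
        total *= len(vocab[word])
    return total


if __name__ == "__main__":
    for words in segmentations("nippongasukidesu", CANDIDATES):
        print(words, count_conversions(words, CANDIDATES))
```

Even with this tiny table the input splits in more than one way, and each split multiplies out into several possible conversions. Choosing among them is the part that ddskk leaves to the user.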

I hope someone will translate it into English.

tkita commented on July 24, 2024

skktools is a management tool for merging, sorting, and converting dictionary files.
Are you looking for kana-to-kanji conversion in JavaScript?

loretoparisi commented on July 24, 2024

@tkita Thank you; so that was the wrong package. I was looking for the sources of the main skk library. So not JavaScript: I would prefer a C/C++ version that I can build as an executable or a library. I could then either use node-gyp to wrap the headers for Node.js, or fork a process to run a compiled binary, the same way I do with kakasi.js. I'm aware of some very old tools like skkfep, but I'm not sure how they work or whether I can use them as a standalone executable.

tkita commented on July 24, 2024

Have a look at Anthy: https://ja.osdn.net/projects/anthy/releases/37536

By the way, would a web API not do? http://www.google.com/transliterate
See the function skk-google-cgi-api-for-japanese-input.

(skk-google-cgi-api-for-japanese-input "かんじ")
=> ("感じ" "漢字" "幹事" "カンジ" "監事")

loretoparisi commented on July 24, 2024

@tkita Thank you. In my case I need this to work offline. My aim is to convert romaji back to kanji, which is why I was thinking of skk: basically the opposite of kakasi. That should be possible, right?
With kakasi I'm doing it like this:

echo "日本が好きです。" | kakasi -i euc -Ha -Ka -Ja -Ea -ka -s -iutf8 -outf8
nippon ga suki desu .
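
The "fork a process and pipe text through a compiled binary" approach mentioned in this thread could look roughly like the sketch below. It is written in Python for brevity rather than Node.js; `pipe_through` is a hypothetical helper name, and the flag list is copied verbatim from the command above without being checked against any particular kakasi build.

```python
import subprocess

# Flags copied as-is from the shell command above (assumes kakasi is on PATH).
KAKASI_CMD = ["kakasi", "-i", "euc", "-Ha", "-Ka", "-Ja", "-Ea",
              "-ka", "-s", "-iutf8", "-outf8"]


def pipe_through(cmd, text, encoding="utf-8"):
    """Send `text` to the stdin of `cmd` and return its stdout as a string."""
    result = subprocess.run(
        cmd,
        input=text.encode(encoding),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout.decode(encoding).strip()


# Example call (only works where kakasi is installed):
#   pipe_through(KAKASI_CMD, "日本が好きです。\n")
```

The same helper would work for any stdin/stdout filter, so a romaji-to-kanji binary could be swapped in by changing only the command list.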

loretoparisi commented on July 24, 2024

@tkita Thank you so much for the clarification. Assuming Google Translate did a good job, my reply is below 👍

I'm aware that romaji -> kanji transliteration is not unique, so you need some intelligence to pick the best of the k candidates. For Indian languages I'm using a SHMM model plus a neural network over the weights, and Viterbi decoding then selects the best of the K candidates.
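
The decoding step mentioned here can be sketched in a few lines. This is a generic Viterbi search over a lattice of candidates, not the model described in this comment; the candidate lists, the bigram table, and all scores are made-up numbers chosen only for illustration.

```python
import math


def viterbi(lattice, emit, trans):
    """Pick one candidate per position, maximizing the summed log scores.

    lattice: list of candidate lists, one list per input token
    emit(i, cand): log score of choosing `cand` at position i
    trans(prev, cand): log score of `cand` directly following `prev`
    """
    # best[c] = (score of the best path ending in c, that path)
    best = {c: (emit(0, c), [c]) for c in lattice[0]}
    for i in range(1, len(lattice)):
        new_best = {}
        for c in lattice[i]:
            # Extend the best-scoring predecessor path with candidate c.
            score, path = max(
                ((s + trans(p, c), pth) for p, (s, pth) in best.items()),
                key=lambda item: item[0],
            )
            new_best[c] = (score + emit(i, c), path + [c])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


# Toy lattice for "nippon ga suki desu" (illustrative candidates only).
LATTICE = [["日本"], ["が", "画", "蛾"], ["好き"], ["です"]]

# Made-up bigram probabilities; unseen pairs fall back to a small default.
BIGRAMS = {("日本", "が"): 0.9, ("が", "好き"): 0.9, ("好き", "です"): 0.9}


def emit(i, cand):
    return 0.0  # uniform emission score in this toy example


def trans(prev, cand):
    return math.log(BIGRAMS.get((prev, cand), 0.01))


if __name__ == "__main__":
    print(viterbi(LATTICE, emit, trans))
```

In a real system the emission and transition scores would come from a trained model instead of a hand-written table.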

Now I understand why Anthy comes with a morphological analyzer (anthy-morphological-analyzer) and anthy-dic-tool, thank you.

So we can say that ddskk can be used as an intermediate step between kanji and romaji, starting from kana, but not as it is.

What I'm working on right now is a TensorFlow sequence-to-sequence (seq2seq) neural network that converts romaji to kanji, using a parallel corpus of sentences.

Thanks again for your help, closing then!

tkita commented on July 24, 2024

Sorry for my weird English.
I wish for the success of your project.
