Automatic quiz generation for Japanese language teachers
# get python3 tools if you don't already have them
sudo apt install python3-pip
sudo apt install python3-setuptools
sudo apt install python3-dev
mkdir data
cd data
# download Japanese model
wget https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.ja.300.bin.gz
gunzip cc.ja.300.bin.gz
# download fastText library
git clone https://github.com/facebookresearch/fastText.git
cd fastText
sudo -H pip3 install .
This requires the following files to be downloaded to a directory called data:
- https://lars.yencken.org/datasets/phd/jyouyou__strokeEditDistance.csv
- https://lars.yencken.org/datasets/phd/jyouyou__yehAndLiRadical.csv
- https://raw.githubusercontent.com/scriptin/kanji-frequency/master/data/aozora.json
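Before running the script, it can help to confirm that the three files are in place. The snippet below is a minimal sketch, not part of the project: `missingDataFiles` is a hypothetical helper, and the file names are assumed to match the last path segment of each URL above.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// File names taken from the download URLs listed above (assumption: the
// files are saved under their original names).
const REQUIRED_FILES = [
  "jyouyou__strokeEditDistance.csv",
  "jyouyou__yehAndLiRadical.csv",
  "aozora.json",
];

// Hypothetical helper: return the required files that are absent from dataDir.
function missingDataFiles(dataDir) {
  return REQUIRED_FILES.filter(
    (name) => !fs.existsSync(path.join(dataDir, name))
  );
}

const missing = missingDataFiles("data");
if (missing.length > 0) {
  console.log(`Missing from data/: ${missing.join(", ")}`);
} else {
  console.log("All required data files are present.");
}
```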
The following parameters in parse_similarity.js are adjustable:
- strokeWeight: weight of the "strokeAndEditDistance" dataset
- strokeWeight: weight of the "yehAndLiRadical" dataset
- freqWeight: weight of the frequency of the kanji
Then, similarKanji.json can be regenerated by running node parse_similarity.js.
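To give a feel for what adjusting the weights does, here is a minimal sketch of one way such weights could combine into a single ranking score. It is an assumption, not the actual logic of parse_similarity.js: the linear formula, the `combinedScore` and `rankCandidates` helpers, the name `radicalWeight` for the "yehAndLiRadical" weight, the default values, and the sample scores are all made up for illustration.

```javascript
// Hypothetical weights; the real names and defaults live in parse_similarity.js.
const weights = { strokeWeight: 1.0, radicalWeight: 1.0, freqWeight: 0.5 };

// Combine two similarity scores and a normalized frequency (all assumed to
// lie in [0, 1]) into one score using a plain weighted sum.
function combinedScore(strokeSim, radicalSim, frequency, w = weights) {
  return (
    w.strokeWeight * strokeSim +
    w.radicalWeight * radicalSim +
    w.freqWeight * frequency
  );
}

// Return a new array of { kanji, score } sorted by descending combined score.
function rankCandidates(candidates, w = weights) {
  return candidates
    .map((c) => ({
      kanji: c.kanji,
      score: combinedScore(c.strokeSim, c.radicalSim, c.frequency, w),
    }))
    .sort((a, b) => b.score - a.score);
}

// Illustrative values only, not taken from the datasets.
const ranked = rankCandidates([
  { kanji: "木", strokeSim: 0.5, radicalSim: 0.5, frequency: 0.5 },
  { kanji: "未", strokeSim: 0.9, radicalSim: 0.8, frequency: 0.2 },
]);
console.log(ranked.map((c) => c.kanji));
```

Under this sketch, raising freqWeight relative to the other two weights favors common kanji as quiz distractors, while lowering it favors visually similar ones.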