This project forked from roife/emt

Emacs macOS Tokenizer, tokenizing CJK words with macOS's built-in NLP tokenizer.

License: GNU General Public License v3.0

Emacs Lisp 80.49% Swift 19.51%

emt.el

Introduction

EMT stands for Emacs macOS Tokenizer.

This package uses macOS’s built-in NLP tokenizer to tokenize and operate on CJK words in Emacs.

Installation

Requirements

  • macOS 10.15 or later
  • Emacs 26.1 or later, built with dynamic module support (use --with-modules during compilation)

Build dynamic module

Pre-built (recommended)

If you enable emt-mode and the module cannot be found, it asks whether to download the module automatically from GitHub. Alternatively, retrieve the pre-built module manually from the releases section and place the dylib file at emt-lib-path (by default modules/libEMT.dylib within your personal configuration folder, normally ~/.emacs.d/modules/libEMT.dylib).

The current version of the dynamic module is v2.0.0; make sure you have updated to the latest module.

Manual build

  • Install Xcode.
  • Build the module using emt-compile-module, which compiles and copies the module to emt-lib-path.

If you encounter the following error:

No such module “PackageDescription”

run the following command and try again:

sudo xcode-select --switch /Applications/Xcode.app/Contents/Developer

Install package

Install with straight and use-package:

(use-package emt
  :straight (:host github :repo "roife/emt"
                   :files ("*.el" "module/*" "module"))
  :hook (after-init . emt-mode))

Customization

emt-use-cache

Caches for results of tokenization if non-nil. Default is t.

emt-cache-lru-size

The size of LRU cache. Default is 50.

emt-lib-path

The path to the dynamic library for emt. Default is ~/.emacs.d/modules/libEMT.dylib.
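
The three options above can be set in your init file before emt-mode is enabled. The following is a minimal sketch; the values are illustrative rather than recommended, and it assumes plain setq is sufficient for these options.

```elisp
;; Illustrative values only; adjust to taste.
(setq emt-use-cache t            ; cache tokenization results (the documented default)
      emt-cache-lru-size 100     ; enlarge the LRU cache from the default of 50
      ;; Same location as the documented default, spelled out explicitly.
      emt-lib-path (expand-file-name "modules/libEMT.dylib"
                                     user-emacs-directory))
```

If you install with use-package, the same settings could go in its :custom section instead.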

Usage

keymap: emt-mode-map

It remaps forward-word, backward-word, kill-word and backward-kill-word to their emt versions.
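
mark-word is not among the remapped commands listed above. If you also want it to be CJK-aware, one option is to add a remapping to emt-mode-map yourself. This is a sketch that assumes the package provides the feature emt (matching the file name emt.el) and that emt-mode-map is defined once it is loaded.

```elisp
;; Assumption: loading emt.el provides the feature `emt'.
(with-eval-after-load 'emt
  ;; While emt-mode is active, route mark-word to emt's CJK-aware version.
  (define-key emt-mode-map [remap mark-word] #'emt-mark-word))
```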

Minor mode

It calls emt-ensure, which loads the dynamic module, and enables emt-mode-map.

Functions

emt-word-at-point-or-forward

Return the word at point. If point is at a word boundary, return the word after it.

emt-word-at-point-or-backward

Return the word at point. If point is at a word boundary, return the word before it.
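
To see what these two functions return in a given buffer, a small inspection command can help. The sketch below uses a hypothetical command name, my/emt-show-word-at-point, and assumes that emt-mode is already enabled and that emt-word-at-point-or-forward can be called with no arguments.

```elisp
;; Hypothetical helper, not part of emt.
(defun my/emt-show-word-at-point ()
  "Echo whatever `emt-word-at-point-or-forward' returns at point."
  (interactive)
  ;; %S prints the raw Lisp value, so this makes no assumption
  ;; about the exact shape of the return value.
  (message "%S" (emt-word-at-point-or-forward)))
```

Run it with M-x my/emt-show-word-at-point while point is inside or next to a CJK word.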

emt-compile-module

Compile and copy the module to emt-lib-path.

It takes an optional argument path, the destination path of the dynamic library. By default, path is set to emt-lib-path.

emt-download-module

Download dynamic module from https://github.com/roife/emt/releases/download/<VERSION>/libEMT.dylib.

If PATH is non-nil, download the module to PATH.
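
If you would rather fetch the module up front than answer the prompt from emt-mode, something along these lines may work. It is a sketch that assumes the feature is named emt and that emt-download-module can be called without arguments, in which case it should download to the default location.

```elisp
;; Assumption: loading emt.el provides the feature `emt'.
(with-eval-after-load 'emt
  ;; Only download when no library exists at the configured path.
  (unless (file-exists-p emt-lib-path)
    (emt-download-module)))
```

To build locally instead, emt-compile-module could take the place of emt-download-module here, provided Xcode is installed.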

emt-ensure

Load the dynamic module.

emt-split

Split a string into words.

Return a list of word bounds, each a cons of the beginning position and the ending position of a word.

emt-forward-word

CJK compatible version of forward-word.

emt-backward-word

CJK compatible version of backward-word.

emt-kill-word

CJK compatible version of kill-word.

emt-backward-kill-word

CJK compatible version of backward-kill-word.

emt-mark-word

CJK compatible version of mark-word.

Acknowledgements

This package is inspired by jieba.el, a Chinese tokenizer for Emacs built on jieba.

The dynamic module uses emacs-swift-module, which provides an interface for writing Emacs dynamic modules in Swift.
