GithubHelp home page GithubHelp logo

spellcheck's Introduction

spellcheck

A different type of spellchecker, written in Haskell.

Inspired by Norvig's spelling corrector. I read this and thought it was cool, but was irritated by what I thought was a glaring omission in the spell checker's design: it didn't take into account key locations.

For example, according to Norvig's corrector, the misspelling vate could be either gate or late with about equal likelihood (since they're both just one edit away from vate). In fact, it would guess 'late', since that's a slightly more common word. However, looking down at our (U.S. layout) keyboard, we immediately see that of course the likelihood isn't equal! The letter v on the keyboard is just one key away from g, whereas it's six letters away from l, so we can clearly see that the person probably meant to type gate but missed the g key with their finger: but v is nowhere near l.

Part of the charm of the Norvig corrector is how short the program is, so I tried to write an equally small program which implemented the idea of key distance to rank the quality of the edits (instead of word frequency, as in the Norvig one).

NOTE, this spellchecker might perform equally terribly as the Norvig corrector, just in very different ways. For example, on input sfore, it ranks afore and score as equally likely corrections, but normal people don't use the word "afore" very much -- they probably wanted "score". I believe one could make a truly great spellchecker by taking both English-language word frequency and key distance into account when suggesting edits, and this is just a demonstration of the latter.

Example

Brandon:spellcheck brandon$ cd src/
Brandon:src brandon$ ghc -O2 main.hs 
[1 of 5] Compiling Keyboard         ( Keyboard.hs, Keyboard.o )
[2 of 5] Compiling Insert           ( Insert.hs, Insert.o )
[3 of 5] Compiling Delete           ( Delete.hs, Delete.o )
[4 of 5] Compiling Replace          ( Replace.hs, Replace.o )
[5 of 5] Compiling Main             ( main.hs, main.o )
Linking main ...
Brandon:src brandon$ ./main sfore
[("afore",1),("score",1),("adore",2),("afire",2),("scored",2)...("pore",16)]

Instructions

No libraries are used other than a modern ghc. Only good for US-layout keyboards!

Compile: Clone the repo and then ghc -O2 main.hs

Run: ./main speling, will return an ugly list of its best guesses for a correction of speling

spellcheck's People

Contributors

pickledish avatar

Stargazers

 avatar  avatar

Watchers

 avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.