spell-check's Introduction

Spell-Check

Spell Checker in Python

Use

Cloning and Running Program

git clone https://github.com/jacks205/Spell-Check.git
cd Spell-Check
make 0 or make 1

Removing .pyc files if needed

make realclean

Note: When using generated misspellings, recurring words or letters may appear. This happens because random numbers are not always completely random when they are generated repeatedly.

Algorithm

This spell checker uses an algorithm originally summarized by Dr. Peter Norvig. Source: How to Write a Spelling Corrector. Additional source: Google Algorithm Paper.

The algorithm used has 3 parts:

  • The probability that the word typed by the user is already spelled correctly
  • The error probability that the user typed word x but meant word y
  • Iteration over all possible candidates, choosing the word with the highest probability
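The three parts above can be sketched in the style of Norvig's essay. This is a minimal illustration and not the code in this repository: the toy corpus, the names `WORD_COUNTS`, `probability`, `edits1`, and `correct`, and the uniform treatment of every one-edit candidate as equally likely are all assumptions made for the example.

```python
from collections import Counter

# Toy corpus (assumption); a real run would count words from a large text file.
WORD_COUNTS = Counter("the sheep and the people like the job".split())
TOTAL = sum(WORD_COUNTS.values())


def probability(word):
    """Part 1: how likely the word is on its own (relative frequency)."""
    return WORD_COUNTS[word] / TOTAL


def edits1(word):
    """Part 2 (simplified error model): every string one edit away."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct(word):
    """Part 3: enumerate candidates and keep the most probable one."""
    word = word.lower()
    if word in WORD_COUNTS:
        return word
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    if not candidates:
        return "NO SUGGESTION"
    return max(candidates, key=probability)
```

With this toy corpus, `correct("teh")` picks "the" through a transposition. Note that the essay's edit-distance model is broader than the three error classes in the challenge below.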

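My altered algorithm avoids scanning the whole word list by shortening the list of possible words based on the first letter. With a dictionary keyed by first letter, a lookup examines roughly n/26 words, where n is the number of words and 26 is the size of the alphabet, assuming words are spread evenly across letters. Strictly speaking, O(n/26) is still O(n): the gain is a constant factor of about 26 rather than a lower complexity class. If n is a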
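A first-letter index of the kind described here might look like the following sketch. The function names `build_letter_index` and `bucket_for` are hypothetical, and the repository's actual data structure may differ.

```python
from collections import defaultdict


def build_letter_index(words):
    """Group lowercase words by their first letter."""
    index = defaultdict(list)
    for raw in words:
        word = raw.strip().lower()
        if word:
            index[word[0]].append(word)
    return index


def bucket_for(word, index):
    """Return only the words that share the input's first letter."""
    # .get() avoids creating empty buckets for letters that were never seen.
    return index.get(word[:1].lower(), [])
```

One limitation of this design: it only helps when the first letter of the input is correct, so a word that starts with a wrong vowel would land in the wrong bucket.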

Main Challenge

Write a program that reads a large list of English words (e.g. from /usr/share/dict/words on a unix system) into memory, and then reads words from stdin, and prints either the best spelling suggestion, or "NO SUGGESTION" if no suggestion can be found. The program should print ">" as a prompt before reading each word, and should loop until killed.
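The prompt loop described above could be sketched as follows. The name `run_prompt` and the `suggest` callback are hypothetical. The sketch also stops at end of input, which is an assumption added so that it terminates cleanly when fed from a pipe; the challenge itself only says to loop until killed.

```python
import sys


def run_prompt(suggest, stdin=sys.stdin, stdout=sys.stdout):
    """Print '> ', read one word, print the suggestion, and repeat."""
    while True:
        stdout.write("> ")
        stdout.flush()
        line = stdin.readline()
        if not line:          # end of input (assumption: stop instead of spinning)
            break
        word = line.strip()
        if word:              # ignore blank lines
            stdout.write(suggest(word) + "\n")
```

Here `suggest` is any function that takes a word and returns either a correction or "NO SUGGESTION".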

Your solution should be faster than O(n) per word checked, where n is the length of the dictionary. That is to say, you can't scan the dictionary every time you want to spellcheck a word.

For example:

> sheeeeep

sheep

> peepple

people

> sheeple

NO SUGGESTION

The class of spelling mistakes to be corrected is as follows:

  • Case (upper/lower) errors: "inSIDE" => "inside"
  • Repeated letters: "jjoobbb" => "job"
  • Incorrect vowels: "weke" => "wake"

Any combination of the above types of error in a single word should be corrected (e.g. "CUNsperrICY" => "conspiracy").

If there are many possible corrections of an input word, your program can choose one in any way you like. It just has to be an English word that is a spelling correction of the input by the above rules.
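One way to meet the faster-than-O(n) requirement for these three error classes is to reduce every word to a normalized key and look the key up in a hash table. The sketch below is an alternative technique and not the repository's first-letter approach; `normalize`, `build_key_index`, and `suggest` are hypothetical names, and treating only a, e, i, o, u as vowels is an assumption.

```python
import re
from collections import defaultdict


def normalize(word):
    """Map a word to a key shared by all of its allowed misspellings."""
    word = word.lower()                       # case errors
    word = re.sub(r"[aeiou]", "*", word)      # vowel errors: any vowel -> '*'
    return re.sub(r"(.)\1+", r"\1", word)     # repeated letters: collapse runs


def build_key_index(words):
    """Group dictionary words by normalized key (built once, up front)."""
    index = defaultdict(list)
    for raw in words:
        word = raw.strip().lower()
        if word:
            index[normalize(word)].append(word)
    return index


def suggest(word, index):
    """Look up the key; cost depends on word length, not dictionary size."""
    matches = index.get(normalize(word))
    if not matches:
        return "NO SUGGESTION"
    lowered = word.lower()
    # Prefer an exact match; otherwise any word with the same key will do.
    return lowered if lowered in matches else matches[0]
```

The key is deliberately loose. For instance, "shep" has the same key as "sheep", even though dropping a letter is not one of the listed error types, so a stricter implementation would verify each match against the rules.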

Final step: Write a second program that generates words with spelling mistakes of the above form, starting with correctly spelled English words. Pipe its output into the first program and verify that there are no occurrences of "NO SUGGESTION" in the output.
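A generator for the second program might be sketched like this. The names `misspell` and `generate`, the 0.3 error rate, and the choice of two or three repeats are all assumptions made for illustration.

```python
import random

VOWELS = "aeiou"


def misspell(word, rng, rate=0.3):
    """Return `word` with random vowel swaps, case flips, and repeated letters."""
    pieces = []
    for ch in word.lower():
        if ch in VOWELS and rng.random() < rate:
            ch = rng.choice(VOWELS)            # incorrect vowel
        if rng.random() < rate:
            ch = ch.upper()                    # case error
        count = rng.randint(2, 3) if rng.random() < rate else 1
        pieces.append(ch * count)              # repeated letter
    return "".join(pieces)


def generate(words, seed=None):
    """Misspell each word using one random generator for the whole run."""
    rng = random.Random(seed)
    return [misspell(w, rng) for w in words]
```

Passing a fixed `seed` makes a run reproducible, which helps when tracking down an input that produced "NO SUGGESTION".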

Algorithm Source

Peter Norvig - How to Write a Spelling Corrector

spell-check's People

Contributors

jacks205

spell-check's Issues

how to run this code?

When I run the init.py file, the following error is generated:

Traceback (most recent call last):
  File "C:\Python27_init_.py", line 11, in <module>
    main()
  File "C:\Python27_init_.py", line 6, in main
    spellchk.run(sys.argv[1])
IndexError: list index out of range
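The traceback shows `sys.argv[1]` being read when no command-line argument was passed, so the list has only the script name in it. The `make 0` and `make 1` targets in the Use section suggest the program expects an argument; that it should be 0 or 1 is an assumption, since the Makefile is not shown here. A guard like the one below, with a hypothetical `main(argv)` signature, would turn the crash into a usage message:

```python
import sys


def main(argv):
    # argv[0] is the script name; argv[1] exists only if an argument was passed.
    if len(argv) < 2:
        print("usage: python <script> <argument>", file=sys.stderr)
        return 2
    argument = argv[1]
    # The original program calls spellchk.run(argument) at this point;
    # that module is not reproduced in this sketch.
    return 0
```

Running the script through the make targets, or passing the argument by hand on the command line, should avoid the IndexError.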
