ml-ai-nlp-ir / vocabulary

This project forked from tasdikrahman/vocabulary

Python Module to get Meanings, Synonyms and what not for a given word

Home Page: https://pypi.python.org/pypi/vocabulary

License: MIT License

Vocabulary

Join the chat at https://gitter.im/prodicus/vocabulary

A dictionary magician in the form of a module!

What is it

For a given word, Vocabulary can give you its

  • Meaning
  • Synonyms
  • Antonyms
  • Part of speech : whether the word is a noun, interjection, adverb, etc.
  • Usage example : a quick example of how to use the word in a sentence
  • Pronunciation
  • Hyphenation : shows the particular stress points (if any)

Features

Why should I use Vocabulary

WordNet is a great resource, no doubt about it. So why should you use Vocabulary when WordNet is already out there?

My 2 cents

WordNet Comparison

Let's say you want to find the synonyms for the word car.

  • Using WordNet
>>> from nltk.corpus import wordnet
>>> syns = wordnet.synsets('car')
>>> syns[0].lemmas()[0].name()   # lemmas() and name() are methods in NLTK 3
'car'
>>> [s.lemmas()[0].name() for s in syns]
['car', 'car', 'car', 'car', 'cable_car']

>>> [l.name() for s in syns for l in s.lemmas()]
['car', 'auto', 'automobile', 'machine', 'motorcar', 'car', 'railcar', 'railway_car', 'railroad_car', 'car', 'gondola', 'car', 'elevator_car', 'cable_car', 'car']
  • Doing the same using Vocabulary
>>> import json
>>> from vocabulary import Vocabulary as vb
>>> vb.synonym("car")
'[{"seq": 0, "text": "automotive"}, {"seq": 1, "text": "motor"}, {"seq": 2, "text": "wagon"}, {"seq": 3, "text": "cart"}, {"seq": 4, "text": "automobile"}]'
>>> ## load the json data
>>> car_synonyms = json.loads(vb.synonym("car"))
>>> type(car_synonyms)
<class 'list'>
>>> 

So there you go. You get the data in an easy JSON format.

You can compare the other methods in the same way.
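Once parsed, the response is a plain list of dictionaries. As a minimal sketch, the words can be pulled out in "seq" order like this. The sample string is copied from the output above so the snippet runs offline, and synonym_words is a hypothetical helper, not part of Vocabulary.

```python
import json

# Sample response copied from the vb.synonym("car") output shown above;
# a live call would need network access.
raw = '[{"seq": 0, "text": "automotive"}, {"seq": 1, "text": "motor"}, {"seq": 2, "text": "wagon"}, {"seq": 3, "text": "cart"}, {"seq": 4, "text": "automobile"}]'


def synonym_words(json_text):
    """Hypothetical helper: return the "text" fields ordered by "seq"."""
    entries = json.loads(json_text)
    return [entry["text"] for entry in sorted(entries, key=lambda e: e["seq"])]


print(synonym_words(raw))
```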

Installation

Option 1: Installing through pip (suggested way)

pypi package link

$ pip install vocabulary

If you are behind a proxy

$ pip --proxy [username:password@]domain_name:port install vocabulary

Note: If you get "command not found", running $ sudo apt-get install python-pip should fix that on Debian/Ubuntu systems.

Option 2: Installing from source

$ git clone https://github.com/prodicus/vocabulary.git
$ cd vocabulary/
$ pip install -r requirements.txt
$ python setup.py install

Uninstalling

$ pip uninstall vocabulary

Demo

Demo link

Usage

A simple demonstration of the module

## Importing the module
>>> from vocabulary import Vocabulary as vb

## Extracting "Meaning"
>>> vb.meaning("hillbilly")
'[{"text": "Someone who is from the hills; especially from a rural area, with a connotation of a lack of refinement or sophistication.", "seq": 0}, {"text": "someone who is from the hills", "seq": 1}, {"text": "A white person from the rural southern part of the United States.", "seq": 2}]'
>>> 

## "Synonym"
>>> vb.synonym("hurricane")
'[{"text": "storm", "seq": 0}, {"text": "tropical cyclone", "seq": 1}, {"text": "typhoon", "seq": 2}, {"text": "gale", "seq": 3}]'
>>> 

## "Antonym"
>>> vb.antonym("respect")
'{"text": ["disesteem", "disrespect"]}'
>>> vb.antonym("insane")
'{"text": ["sane"]}'

## "Part of Speech"
>>> vb.part_of_speech("hello")
'[{"text": "interjection", "example:": "Used to greet someone, answer the telephone, or express surprise.", "seq": 0}]'
>>>

## "Usage Examples"
>>> vb.usage_example("chicanery")
'[{"text": "The Bush Administration is now the commander-in-theif (lower-case intentional) thanks to their chicanery.", "seq": 0}]'
>>>

## "Pronunciation"
>>> vb.pronunciation("hippopotamus")
[{'raw': '(hĭpˌə-pŏtˈə-məs)', 'rawType': 'ahd-legacy', 'seq': 0}, {'raw': 'HH IH2 P AH0 P AA1 T AH0 M AH0 S', 'rawType': 'arpabet', 'seq': 0}]
>>>

## "Hyphenation"
>>> vb.hyphenation("hippopotamus")
'[{"text": "hip", "type": "secondary stress", "seq": 0}, {"text": "po", "seq": 1}, {"text": "pot", "type": "stress", "seq": 2}, {"text": "a", "seq": 3}, {"text": "mus", "seq": 4}]'
>>> vb.hyphenation("amazing")
'[{"text": "a", "seq": 0}, {"text": "maz", "type": "stress", "seq": 1}, {"text": "ing", "seq": 2}]'
>>>
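The hyphenation output above can be turned back into a readable word. The following is only a sketch: render_hyphenation is a hypothetical helper (not part of Vocabulary), and upper-casing the primary stress is an arbitrary display choice. The sample string is copied from the output above.

```python
import json

# Sample copied from the vb.hyphenation("hippopotamus") output above.
raw = '[{"text": "hip", "type": "secondary stress", "seq": 0}, {"text": "po", "seq": 1}, {"text": "pot", "type": "stress", "seq": 2}, {"text": "a", "seq": 3}, {"text": "mus", "seq": 4}]'


def render_hyphenation(json_text):
    """Hypothetical helper: join syllables with hyphens, upper-casing the primary stress."""
    parts = []
    for syllable in sorted(json.loads(json_text), key=lambda s: s["seq"]):
        text = syllable["text"]
        # "type" is only present on stressed syllables, so use .get().
        if syllable.get("type") == "stress":
            text = text.upper()
        parts.append(text)
    return "-".join(parts)


print(render_hyphenation(raw))  # hip-po-POT-a-mus
```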

Help

To see the usage for any of the methods, call help() on it:

>>> from vocabulary import Vocabulary as vb
>>> help(vb.meaning)
Help on function meaning in module vocabulary.vocabulary:

meaning(phrase, source_lang='en', dest_lang='en')
    make calls to the
    - glosbe API(default choice)
    - Wordnik API 

    Wordnik's API gives less results so not Using it here for getting the meanings

    params: 
    =======
    source_lang, dest_lang (both default to "en" if nothing is specified)

    Usage: 
    ======
    >>> from vocabulary import Vocabulary as vb
    >>> vb.meaning("levitate")
    '[{"text": "(intransitive) Be suspended in the air, as if in defiance of gravity.", "seq": 0}, {"text": "(transitive) To cause to rise in the air and float, as if in defiance of gravity.", "seq": 1}]'
    >>>
(END)

and so on for other functions

How does it work

Under the hood, it uses four APIs to give you consistent results:

  • Wordnik
  • Glosbe
  • BighugeLabs
  • Urbandict

Contributing

  1. Fork it.
  2. Clone it and set up a virtualenv:
$ git clone https://github.com/prodicus/vocabulary.git
$ cd vocabulary/
$ virtualenv develop              # Create virtual environment
$ source develop/bin/activate     # Change default python to virtual one
(develop)$ pip install -r requirements.txt  # Install requirements for 'Vocabulary' in virtual environment

Or, if virtualenv is not installed on your system:

$ git clone https://github.com/prodicus/vocabulary.git
$ cd vocabulary/
$ wget https://raw.github.com/pypa/virtualenv/master/virtualenv.py
$ python virtualenv.py develop    # Create virtual environment
$ source develop/bin/activate     # Change default python to virtual one
(develop)$ pip install -r requirements.txt  # Install requirements for 'Vocabulary' in virtual environment
  3. Create your feature branch ($ git checkout -b my-new-awesome-feature)
  4. Commit your changes ($ git commit -am 'Added <xyz> feature')
  5. Run tests
(develop) $ ./tests.py -v

If everything is running fine, integrate your feature

  6. Push to the branch ($ git push origin my-new-awesome-feature)
  7. Create new Pull Request

Hack away!

To do

  • Add translate module
  • Add an option like json=False or json=True where the former returns a list object
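Until such an option exists, the idea can be approximated from the outside. The sketch below is an assumption about how a json=True/json=False switch might behave, not the planned implementation. with_json_flag and fake_synonym are hypothetical names, and fake_synonym returns a canned string so the snippet runs offline.

```python
import functools
from json import loads  # imported by name so the `json` keyword below does not shadow the module


def with_json_flag(func):
    """Hypothetical decorator: add a json=True/False keyword to a function returning a JSON string."""
    @functools.wraps(func)
    def wrapper(*args, json=True, **kwargs):
        result = func(*args, **kwargs)
        # json=True keeps the raw string; json=False returns the parsed object.
        return result if json else loads(result)
    return wrapper


# Stand-in for a Vocabulary method, returning a canned response.
@with_json_flag
def fake_synonym(word):
    return '[{"text": "storm", "seq": 0}]'


print(type(fake_synonym("hurricane")))              # str
print(type(fake_synonym("hurricane", json=False)))  # list
```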

Tests

Vocabulary uses Python's unittest module for its tests.

Running the test cases

$ ./tests.py -v
test_antonym_1 (__main__.TestModule) ... ok
test_antonym_2 (__main__.TestModule) ... ok
test_hyphenation (__main__.TestModule) ... ok
test_meaning (__main__.TestModule) ... ok
test_partOfSpeech_1 (__main__.TestModule) ... ok
test_partOfSpeech_2 (__main__.TestModule) ... ok
test_pronunciation (__main__.TestModule) ... ok
test_synonym (__main__.TestModule) ... ok
test_usageExamples (__main__.TestModule) ... ok

----------------------------------------------------------------------
Ran 9 tests in 7.742s

OK
(testvocab)
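For readers who want to add a test, here is a rough sketch of a unittest case in the same spirit. It is hypothetical and is not taken from tests.py. It checks the shape of a canned response (copied from the vb.synonym("hurricane") output in the Usage section) rather than calling the live API.

```python
import json
import unittest

# Canned response copied from the vb.synonym("hurricane") output in the Usage section.
CANNED = '[{"text": "storm", "seq": 0}, {"text": "tropical cyclone", "seq": 1}, {"text": "typhoon", "seq": 2}, {"text": "gale", "seq": 3}]'


class TestSynonymShape(unittest.TestCase):
    """Hypothetical test case; the real tests live in tests.py."""

    def test_entries_have_text_and_seq(self):
        entries = json.loads(CANNED)
        self.assertIsInstance(entries, list)
        for position, entry in enumerate(entries):
            self.assertEqual(entry["seq"], position)
            self.assertIsInstance(entry["text"], str)


# Run the case programmatically so the snippet also works when pasted into a REPL.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSynonymShape)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```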

Known Issues

  • When using the method Vocabulary.pronunciation()
>>> import json
>>> vb.pronunciation("hippopotamus")
[{'raw': '(hĭpˌə-pŏtˈə-məs)', 'rawType': 'ahd-legacy', 'seq': 0}, {'raw': 'HH IH2 P AH0 P AA1 T AH0 M AH0 S', 'rawType': 'arpabet', 'seq': 0}]
>>> type(vb.pronunciation("hippopotamus"))
<class 'list'>
>>> json.dumps(vb.pronunciation("hippopotamus"))
'[{"raw": "(h\\u012dp\\u02cc\\u0259-p\\u014ft\\u02c8\\u0259-m\\u0259s)", "rawType": "ahd-legacy", "seq": 0}, {"raw": "HH IH2 P AH0 P AA1 T AH0 M AH0 S", "rawType": "arpabet", "seq": 0}]'
>>>

Here you get back a Python list instead of a JSON string, unlike the other methods. Returning a JSON string currently runs into Unicode issues (note the \uXXXX escapes in the json.dumps() output above). A fix for this will be released soon.
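In the meantime, the escaping seen above is the default behaviour of the standard library's json.dumps(), and passing ensure_ascii=False keeps the characters as they are. This is only a workaround sketch on the caller's side, not the fix the project has planned. The sample data is copied from the output above.

```python
import json

# Sample copied from the vb.pronunciation("hippopotamus") output above.
pronunciation = [
    {"raw": "(hĭpˌə-pŏtˈə-məs)", "rawType": "ahd-legacy", "seq": 0},
    {"raw": "HH IH2 P AH0 P AA1 T AH0 M AH0 S", "rawType": "arpabet", "seq": 0},
]

escaped = json.dumps(pronunciation)                       # default: non-ASCII becomes \uXXXX
readable = json.dumps(pronunciation, ensure_ascii=False)  # keeps the original characters

print("\\u012d" in escaped)  # True: the escape sequence is in the default output
print("ĭ" in readable)       # True: the character itself is preserved

# Both strings decode back to the same data.
print(json.loads(escaped) == json.loads(readable))
```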

Discuss

Join us on our Gitter channel if you want to chat or if you have any questions.

Changelog

0.0.4

  • JSON inconsistency fixed for the methods
    • Vocabulary.hyphenation()
    • Vocabulary.part_of_speech()
    • Vocabulary.meaning()

Bugs

Please report the bugs at the issue tracker

License

MIT License © Tasdik Rahman

You can find a copy of the License at http://prodicus.mit-license.org/
