japerk / nltk3-cookbook
Code for NLTK3 Cookbook
Hi. I am getting an error with Python 3.4.1 when trying to complete the example from Chapter 4 called Training a Brill Tagger. Could the error below be related to the book's use of Python 3.3.5? Thanks in advance for any assistance.
I did not submit this via Packt's errata form because it may just be a Python version issue, and on the errata form at https://www.packtpub.com/books/content/errata only the NLTK 2.0 book seems to be available in the dropdown:
$ uname -a
Darwin MacBook-Pro.local 13.4.0 Darwin Kernel Version 13.4.0: Sun Aug 17 19:50:11 PDT 2014; root:xnu-2422.115.4~1/RELEASE_X86_64 x86_64
$ git pull origin master
From https://github.com/japerk/nltk3-cookbook
* branch master -> FETCH_HEAD
Already up-to-date.
$ python3
Python 3.4.1 (default, Aug 24 2014, 21:32:40)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> nltk.__version__
'3.0.0'
>>> import enchant
>>> enchant.__version__
'1.6.6'
>>> import numpy
>>> numpy.__version__
'1.9.0'
>>> import scipy
>>> scipy.__version__
'0.14.0'
>>> import sklearn
>>> sklearn.__version__
'0.15.2'
>>> import execnet
>>> execnet.__version__
'1.2.0'
>>> import pymongo
>>> pymongo.version
'2.7.2'
>>> import redis
>>> redis.__version__
'2.10.3'
>>> from lxml import etree
>>> etree.LXML_VERSION
(3, 4, 0, 0)
>>> import bs4
>>> bs4.__version__
'4.3.2'
>>> import dateutil
>>> dateutil.__version__
'2.2'
>>> import charade
>>> charade.__version__
'1.0.3'
>>> from nltk.corpus import treebank
>>> from nltk.tag import DefaultTagger, UnigramTagger, BigramTagger, TrigramTagger
>>> from tag_util import backoff_tagger, train_brill_tagger
>>> test_sents = treebank.tagged_sents()[3000:]
>>> train_sents = treebank.tagged_sents()[:3000]
>>> default_tagger = DefaultTagger('NN')
>>> initial_tagger = backoff_tagger(train_sents, [UnigramTagger, BigramTagger, TrigramTagger], backoff=default_tagger)
>>> initial_tagger.evaluate(test_sents)
0.8808115691776387
>>> brill_tagger = train_brill_tagger(initial_tagger, train_sents)
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/site-packages/nltk/tbl/rule.py", line 191, in __hash__
    return self.__hash
AttributeError: 'Rule' object has no attribute '_Rule__hash'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.4/site-packages/nltk/tbl/rule.py", line 200, in __repr__
    return self.__repr
AttributeError: 'Rule' object has no attribute '_Rule__repr'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/petethomas/git-projects/nltk3-cookbook/tag_util.py", line 48, in train_brill_tagger
    return trainer.train(train_sents, **kwargs)
  File "/usr/local/lib/python3.4/site-packages/nltk/tag/brill_trainer.py", line 288, in train
    self._init_mappings(test_sents, train_sents)
  File "/usr/local/lib/python3.4/site-packages/nltk/tag/brill_trainer.py", line 359, in _init_mappings
    train_sents)
  File "/usr/local/lib/python3.4/site-packages/nltk/tag/brill_trainer.py", line 387, in _update_rule_applies
    if pos in self._positions_by_rule[rule]:
  File "/usr/local/lib/python3.4/site-packages/nltk/tbl/rule.py", line 193, in __hash__
    self.__hash = hash(repr(self))
  File "/usr/local/lib/python3.4/site-packages/nltk/tbl/rule.py", line 210, in __repr__
    ", ".join("({0:s},{1:s})".format(f,unicode_repr(v)) for (f,v) in self._conditions)))
  File "/usr/local/lib/python3.4/site-packages/nltk/tbl/rule.py", line 210, in <genexpr>
    ", ".join("({0:s},{1:s})".format(f,unicode_repr(v)) for (f,v) in self._conditions)))
TypeError: non-empty format string passed to object.__format__
>>>
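The final TypeError in the traceback can be reproduced outside NLTK, which may help narrow down whether this is a Python version issue. The sketch below assumes that the `f` in the failing `"({0:s},{1:s})".format(f, ...)` call is an object whose class does not define its own `__format__`. `FeatureLike` is a hypothetical stand-in, not an NLTK class. On Python 3.4 and later, `object.__format__` raises a TypeError for any non-empty format spec such as `s`, while the `!s` conversion calls `str()` on the object first and avoids the error.

```python
class FeatureLike:
    """Hypothetical stand-in for an object with __repr__ but no __format__."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "FeatureLike({!r})".format(self.name)


def format_with_spec(obj):
    # "{0:s}" hands the format spec "s" to obj.__format__.
    # The default object.__format__ rejects any non-empty spec on Python 3.4+.
    return "({0:s})".format(obj)


def format_with_conversion(obj):
    # "{0!s}" converts obj with str() first, then formats the resulting string.
    return "({0!s})".format(obj)


f = FeatureLike("Pos")

try:
    format_with_spec(f)
except TypeError as exc:
    # The exact message text differs between Python versions.
    print("TypeError:", exc)

print(format_with_conversion(f))
```

If that assumption holds, the failure is in how the installed NLTK builds the rule's `repr` string rather than in the cookbook's `tag_util.py`, so trying a different NLTK release is a reasonable next step.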
I can download it, but there is no wheel, egg, or other installable package in it, so it cannot be imported the way everything in nltk can be. Please fill me in. It is odd that the book constantly reminds readers to unzip things in nltk but does not say how to set this up for easy access, although it implies (at the beginning of "Replacing Words Matching Regular Expressions") that this should be possible.
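One possible approach, assuming "it" refers to the code in this repository and that the modules are plain `.py` files in the top-level directory (as `tag_util.py` appears to be in the traceback above), is to put the checkout directory on `sys.path` before importing. The helper name `make_importable` and the example path are illustrative assumptions, not something the book or the repository provides.

```python
import sys
from pathlib import Path


def make_importable(directory):
    """Prepend a directory to sys.path so the .py files in it can be imported."""
    path = str(Path(directory).expanduser().resolve())
    if path not in sys.path:
        sys.path.insert(0, path)
    return path


# Hypothetical checkout location; adjust to wherever the repository was cloned.
# make_importable("~/git-projects/nltk3-cookbook")
# from tag_util import backoff_tagger, train_brill_tagger
```

Setting the `PYTHONPATH` environment variable to the same directory, or starting Python from inside the checkout, should have the same effect without any code.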
Is there a way for users to provide, or help you provide, new languages to the NLTK Mashape API? Thanks!