louismartin / capfalcnlp
NLP tools for the Cap'FALC project
In my environment, when I run the example:
python cli.py --input-file example_text.txt
I get the following error:
Traceback (most recent call last):
  File "cli.py", line 14, in <module>
    detections = get_detections(text)
  File "/home/codatalab/capfalcnlp/capfalcnlp/features.py", line 197, in get_detections
    for sentence in get_long_sentences(text):
  File "/home/codatalab/capfalcnlp/capfalcnlp/features.py", line 184, in get_long_sentences
    sentences = split_in_sentences(text)
  File "/home/codatalab/capfalcnlp/capfalcnlp/processing.py", line 74, in split_in_sentences
    return _split_in_sentences_nltk(text, **kwargs)
  File "/home/codatalab/capfalcnlp/capfalcnlp/processing.py", line 69, in _split_in_sentences_nltk
    return get_nltk_sentence_tokenizer(**kwargs).tokenize(text)
  File "/home/codatalab/capfalcnlp/capfalcnlp/processing.py", line 58, in get_nltk_sentence_tokenizer
    return nltk.data.load(f'tokenizers/punkt/{language}.pickle')
  File "/home/codatalab/.local/lib/python3.7/site-packages/nltk/data.py", line 750, in load
    opened_resource = _open(resource_url)
  File "/home/codatalab/.local/lib/python3.7/site-packages/nltk/data.py", line 875, in _open
    return find(path_, path + [""]).open()
  File "/home/codatalab/.local/lib/python3.7/site-packages/nltk/data.py", line 583, in find
    raise LookupError(resource_not_found)
LookupError:
**********************************************************************
  Resource punkt not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('punkt')

  For more information see: https://www.nltk.org/data.html

  Attempted to load tokenizers/punkt/PY3/french.pickle

  Searched in:
    - '/home/codatalab/nltk_data'
    - '/opt/conda/envs/capfalcnlp/nltk_data'
    - '/opt/conda/envs/capfalcnlp/share/nltk_data'
    - '/opt/conda/envs/capfalcnlp/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - ''
**********************************************************************
Adding nltk.download('punkt') on line 50 of processing.py solves the issue.
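An unconditional nltk.download('punkt') contacts the download server on every run. A possible variant is to check for the resource first and download only when it is missing. The sketch below is only an illustration: the helper name ensure_nltk_resource and its find/download parameters are hypothetical and not part of capfalcnlp, and calling it just before nltk.data.load in get_nltk_sentence_tokenizer is an assumption about where it would fit.

```python
def ensure_nltk_resource(resource_path, package_name, find=None, download=None):
    """Download an NLTK package only if its resource is not installed yet.

    Hypothetical helper, not part of capfalcnlp. `find` and `download`
    default to nltk.data.find and nltk.download; they are parameters only
    so the logic can be exercised without network access.
    Returns True if a download was triggered, False otherwise.
    """
    if find is None or download is None:
        import nltk  # imported lazily, only when the real functions are needed
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        # nltk.data.find raises LookupError when the resource is absent
        find(resource_path)
        return False
    except LookupError:
        download(package_name, quiet=True)
        return True


# Assumed usage, before the tokenizer is loaded:
# ensure_nltk_resource('tokenizers/punkt', 'punkt')
```

With this approach the first run downloads punkt into one of the searched nltk_data directories, and later runs skip the download because the lookup succeeds.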