Class to process strings. The following processing functions are available:
tokenize
: returns the tokens of the string as a list.

lemmatize
: returns the lemmas of the string as a list.
Both functions take a boolean argument called filter. When it is set to True, which is the default, stopwords and punctuation marks are removed from the result.
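
To make the effect of the filter argument concrete, here is a minimal sketch of the interface described above. It is not the actual class: the name StringProcessor, the tiny STOPWORDS set, the LEMMAS lookup table, and the lowercasing step are all assumptions made for illustration, and the real implementation presumably relies on a proper NLP library.

```python
import re
import string

# Hypothetical stand-ins: the real stopword list and lemmatizer are not
# shown in this document, so tiny hand-written tables are used instead.
STOPWORDS = {"the", "a", "an", "is", "are", "on", "of", "and"}
LEMMAS = {"cats": "cat", "sitting": "sit", "mats": "mat"}
PUNCTUATION = set(string.punctuation)


class StringProcessor:
    """Toy illustration of the tokenize/lemmatize interface (assumed name)."""

    def __init__(self, text):
        self.text = text

    def tokenize(self, filter=True):
        # The argument name `filter` follows the description above; it
        # shadows the Python builtin only inside this method.
        # Lowercasing is an assumption so that stopword matching is simple.
        tokens = re.findall(r"\w+|[^\w\s]", self.text.lower())
        if filter:
            # Drop stopwords and punctuation signs.
            tokens = [t for t in tokens
                      if t not in STOPWORDS and t not in PUNCTUATION]
        return tokens

    def lemmatize(self, filter=True):
        # Map each token to its lemma; unknown tokens pass through unchanged.
        return [LEMMAS.get(t, t) for t in self.tokenize(filter=filter)]


processor = StringProcessor("The cats are sitting on the mats.")
print(processor.tokenize())              # ['cats', 'sitting', 'mats']
print(processor.lemmatize())             # ['cat', 'sit', 'mat']
print(processor.tokenize(filter=False))  # keeps 'the', 'are', 'on' and '.'
```

With filter=False the list keeps every token, including stopwords and the final period, which can be useful when the downstream step needs the original sentence structure.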
The file example.py contains an example of how to run it. Follow the installation instructions in the docstring.