Comments (3)
Thanks, @facundoolano, for reaching out and using retext-keywords.
Thing is, retext-keywords is based on parts of speech, not on stopwords.
The POS tags default to NN (noun), which makes unknown words eligible for inclusion in the results. That is why this occurs.
Solving word detection
One fix would be to wait for dariusk/pos-js#21 to be fixed by a PR (I believe this is an issue there).
Another would be to create a retext plug-in which expands those elision examples (my preference), or alternatively preprocess the given input and expand them by regexes (easiest):
// Strings are immutable, so assign the result; the `g` flag replaces every occurrence.
input = input.replace(/you[’']ve/g, 'you have')/*...*/;
retext.process(input/*, ...*/);
The last method should fix this issue, as then the words will be recognised correctly.
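The regex approach above can be generalised into a small lookup-based helper. This is only a sketch under assumptions: `expandElisions` and the `expansions` table are hypothetical names, not part of retext or retext-keywords, and the table covers just a few forms for illustration.

```javascript
// Hypothetical helper, not part of retext or retext-keywords.
// Maps a lowercase elided form (straight apostrophe) to its expansion.
const expansions = new Map([
  ["you've", 'you have'],
  ["i've", 'I have'],
  ["we've", 'we have'],
  ["they've", 'they have']
])

// Matches letters, an apostrophe (typographic or straight), then more letters.
const elision = /\b[a-z]+[’'][a-z]+\b/gi

function expandElisions(input) {
  return input.replace(elision, (match) => {
    // Normalise case and the apostrophe before looking the form up.
    const key = match.toLowerCase().replace('’', "'")
    const expanded = expansions.get(key)
    // Leave forms that are not in the table (such as possessives) untouched.
    if (expanded === undefined) return match
    // Preserve a leading capital, so "You've" becomes "You have".
    return match[0] === match[0].toUpperCase()
      ? expanded[0].toUpperCase() + expanded.slice(1)
      : expanded
  })
}
```

The returned string can then be passed to `retext.process` in place of the raw input. A table keeps each expansion explicit, which avoids guessing at ambiguous forms.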
Solving stop-words
Or, this project could start supporting configurable stop-words. Would you be interested in working on a PR for that? I could help guide you through the project!
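Until such an option exists, a similar effect could be approximated by filtering the results after extraction. A minimal sketch, assuming the keywords have already been reduced to plain strings; `removeStopWords` and the stop-word list are hypothetical and not part of retext-keywords:

```javascript
// Hypothetical helper: drops every keyword found in a caller-supplied stop-word list.
function removeStopWords(keywords, stopWords) {
  // Compare case-insensitively by lowercasing both sides.
  const blocked = new Set(stopWords.map((word) => word.toLowerCase()))
  return keywords.filter((keyword) => !blocked.has(keyword.toLowerCase()))
}
```

Note that this only hides unwanted results; it does not change how the words are tagged in the first place.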
from retext-keywords.
Thanks for answering. From your explanation, it sounds like a plugin that handles the expansions before processing would work better than introducing stopwords. I'll consider working on such a plugin in the future, but for now I guess I'll keep replacing by hand before processing, so I can move forward with my current project.
It’s quite some work, and I’m not very interested in working on it currently.
If someone wants this: create a PR on dariusk/pos-js!