Comments (18)
We need a search engine with fuzzy search support that works well for the Swedish mapping between pronunciation and spelling. We have several options:
- Switch from MySQL to Postgres as the general DB. The latter has better support for this.
- Use a separate free-text search engine such as Elastic or Sphinx with support for fuzzy search.
Let’s discuss the level of ambition for this.
Related to finding employees based on free text info #13.
from intranet-dashboard.
Should we have an AFK discussion regarding this and #13?
from intranet-dashboard.
yes.
from intranet-dashboard.
- Evaluate if Elasticsearch is the right search server to use.
- Investigate if we
  - need to set it up on a new virtual instance, or
  - should move MySQL to a separate virtual instance, or
  - should add more resources to the existing server, or
  - should lower the level of aggression on the news update queue worker
- Integrate the employee directory with the text search engine.
- Fine-tune weighting of fields in the search.
- Develop a UI that does not pollute the simple name searches we have today.
The above will be time-boxed.
from intranet-dashboard.
Fuzzy search for autocomplete is ready to set up and deploy in test as soon as there is 2 GB more RAM on the instance. We may not be able to run Elasticsearch for both test and prod on the same instance, but for the first deployment we can fine-tune things in test first and then turn it off before deploying to prod.
The Levenshtein edit distance is set to 2, meaning that you can have two spelling errors in the string and still get a match. Correct matches are scored higher. The setting can be changed to 1 or to a percentage of similarity with the indexed terms.
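To make the edit distance setting concrete, here is a minimal Python sketch of plain Levenshtein distance and a distance-2 match check. It only illustrates the idea and is not how Elasticsearch implements it internally; the function names `levenshtein` and `fuzzy_match` are made up for this example, and it compares whole terms rather than autocomplete prefixes.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character inserts, deletes or substitutions
    needed to turn a into b (classic dynamic programming, two rows)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def fuzzy_match(query: str, term: str, max_distance: int = 2) -> bool:
    """Hypothetical helper: True if query is within max_distance edits of term."""
    return levenshtein(query.lower(), term.lower()) <= max_distance


if __name__ == "__main__":
    print(levenshtein("jangmar", "jangmark"))  # one missing letter
    print(fuzzy_match("hanson", "Hansson"))
```

With `max_distance=1` the same check would tolerate only one spelling error per term.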
The current settings for the index and search analyzers have a few experimental features that can be evaluated and either expanded or abandoned. The following synonyms are used:
- "carlsson, carlson => karlsson"
- "karlson => karlsson"
- "hanson => hansson"
- "carl => karl"
And these character mappings are added:
- "û => y"
- "ph => f"
This means that the search strings "hanson" and "hansson" are the exact same search, as are "bylund" and "bûlund". This is both good and bad: good if the user is wrong about the spelling, bad if s/he is right.
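The effect of the synonyms and character mappings above can be sketched in a few lines of Python. This is only a simulation of the idea, assuming the character mappings run before the text is split into terms and the synonyms run afterwards; the names `CHAR_MAPPINGS`, `SYNONYMS` and `normalize` are made up for the example and are not taken from the actual analyzer configuration.

```python
# Character mappings and synonyms as listed in this comment.
CHAR_MAPPINGS = {"û": "y", "ph": "f"}
SYNONYMS = {
    "carlsson": "karlsson",
    "carlson": "karlsson",
    "karlson": "karlsson",
    "hanson": "hansson",
    "carl": "karl",
}


def normalize(text: str) -> list:
    """Lowercase, apply the character mappings, split on whitespace,
    then replace each term that has a synonym rule."""
    text = text.lower()
    for source, target in CHAR_MAPPINGS.items():
        text = text.replace(source, target)
    return [SYNONYMS.get(term, term) for term in text.split()]


if __name__ == "__main__":
    # Both spellings end up as the same search terms.
    print(normalize("hanson") == normalize("hansson"))
    print(normalize("bûlund") == normalize("bylund"))
```

Because both the indexed names and the search string go through the same steps, a user who types the correct spelling "Hanson" can no longer tell it apart from "Hansson", which is the trade-off described above.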
The search also matches phone and cell phone numbers. This does not add any noise to the search. Unlike names, which are matched from the front of the indexed term, phone numbers are matched from the back, since the numbers in the directory are cut off at the front inconsistently.
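Matching phone numbers from the back can be pictured as indexing every trailing part of the number. The Python sketch below assumes that approach; the helper names, the minimum length of 3 digits and the phone numbers are all invented for the example and do not come from the real configuration or directory.

```python
def digits_only(value: str) -> str:
    """Strip spaces, dashes and other formatting from a phone number."""
    return "".join(ch for ch in value if ch.isdigit())


def back_edge_ngrams(number: str, min_len: int = 3) -> set:
    """All trailing digit sequences of the number, from min_len digits up."""
    digits = digits_only(number)
    return {digits[-n:] for n in range(min_len, len(digits) + 1)}


def phone_matches(query: str, indexed_number: str, min_len: int = 3) -> bool:
    """True if the query digits equal a trailing part of the indexed number."""
    return digits_only(query) in back_edge_ngrams(indexed_number, min_len)


if __name__ == "__main__":
    # The same query matches whether or not the area code was stored.
    print(phone_matches("555 01 23", "040-555 01 23"))
    print(phone_matches("555 01 23", "5550123"))
```

Matching from the back means a number stored without its area code is still found by its last digits, while a query for only the leading digits does not match.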
from intranet-dashboard.
The server is configured and a test version is deployed with the settings mentioned in the previous comment.
I will tweak the settings for Elasticsearch on Wednesday. No results in the autocomplete means I'm working on something, sort of. I will also split the prod and test indices and search clients.
In the scope of this release, fuzzy search is enabled in the autocomplete in the masthead and on the search page, but the full search results are not fuzzy. This means that if you type "svennevall" you will get two items in the autocomplete, but if you execute a search by hitting enter, pressing search or selecting "View all matches" you will not get any matches. We can hide the "View all" to get around this strangeness until we deploy a fuzzy search for the full search as well.
from intranet-dashboard.
Spelling error at the bottom of the suggest list: "Visa all alla träffar"
from intranet-dashboard.
Testing the fuzziness. "anna-karin jangmar" does not give an auto suggest for Anna-Karin Jangmark. An example of too much fuzziness?
from intranet-dashboard.
"jan-inge ahlfridh" also gives no autosuggest results. I think there is a problem with how Elastic handles hyphens.
from intranet-dashboard.
Typo fixed in malmostad/intranet-assets@ea4c279
from intranet-dashboard.
Yes, there is a problem with the hyphen. Not in Elastic, but in my Elastic text analyzer 🉑
from intranet-dashboard.
Fixed the hyphen bug and the typos above. Also boosted exact matches in a better way.
New version deployed in test.
from intranet-dashboard.
Ok. Deploy in prod!
from intranet-dashboard.
Autocomplete out in production.
Next step is to jack in Elastic in the full search.
from intranet-dashboard.
Full search in test is now using Elastic with the same text analyzers as the autocomplete.
from intranet-dashboard.
The "123 employees matched your query" message is a little bit weird: it doesn't say anything about how many Svens we have, only how fuzzy we are.
from intranet-dashboard.
Good! Deploy in prod!
from intranet-dashboard.
Out in production.
from intranet-dashboard.