
Comments (5)

etianen commented on August 20, 2024

Hi,

I'm struggling a bit with your index notation. What database backend are you using?

On 28 Jul 2013, at 10:25, Michal Novotný wrote:

There is an interesting difference in produced indexes in a situation where I would expect them to be the same.

The __unicode__ method of my Crag class returns the name attribute.

def __unicode__(self):
    return self.name

watson.register(Crag) produces index 'ceus':3 'céüse':1A,2

Now, if I register Crag with just the name field, I would expect the index to be the same, but it is not.

watson.register(Crag, ('name',)) produces index 'céüse':1A,2

I'd like the first variant to always apply because it provides case-insensitive search.


from django-watson.

clime commented on August 20, 2024

It is PostgreSQL.


etianen commented on August 20, 2024

Hmm. That's really odd.

So, in your case, each watson search entry has three indexed fields: "title", "description" and "content". These are indexed with the following weightings:

title (defaults to unicode(obj)) - A
description (defaults to "") - B
content (defaults to " ".join(fields)) - C

If you register that model with no extra fields, then the index should contain:

"céüse" - A
"" - B
"" - C

If you register that model with a name field, then the index should contain:

"céüse" - A
"" - B
"céüse" - C
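The two cases above can be mimicked with a small standalone sketch. This is not django-watson's code; the function name build_search_entry and its parameters are hypothetical, and it only models the defaults as described in this comment (title at weight A, description at B, space-joined fields at C).

```python
def build_search_entry(title, field_values=(), description=""):
    """Return (text, weight) pairs following the defaults described above.

    Hypothetical helper for illustration only, not part of django-watson.
    """
    # "content" is modelled as the space-joined values of the registered fields.
    content = " ".join(field_values)
    return [(title, "A"), (description, "B"), (content, "C")]


# Registered with no extra fields: only the title carries text.
no_fields = build_search_entry("céüse")

# Registered with the name field: the same word also appears at weight C.
with_name = build_search_entry("céüse", ["céüse"])
```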

So I really have no idea how you're getting the index results you are. Raw postgres index notations don't mean a lot to me, I'm afraid.


clime commented on August 20, 2024

I am sorry for the confusion. To explain:

There is also a normalized_name field on my Crag model, which contains the unaccented, lower-cased name. That is where 'ceus' comes from in 'ceus':3 'céüse':1A,2. For some reason the final "e" is being cut off; I'm not sure why, but that is probably not important. The numbers 1, 2, 3 are probably the word order and A is a weight, I guess. So it makes sense. I didn't realize that watson.register(Crag) also includes the normalized_name column on its own.

So I guess the way to make search accent-insensitive is to register these columns with already unaccented text. Fair enough.
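For illustration, one way such unaccented, lower-cased text could be produced is with Python's standard unicodedata module. This is a minimal sketch; the function name normalize_name is hypothetical and is not necessarily how normalized_name is filled in on the Crag model.

```python
import unicodedata


def normalize_name(name):
    """Return a lower-cased copy of name with accents removed (illustrative helper)."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


print(normalize_name("Céüse"))  # prints "ceuse"
```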


etianen commented on August 20, 2024

Aha, yes! django-watson's default behaviour is to index all text and char fields.

Mystery solved. :D
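As a rough standalone model of that default (not django-watson's actual implementation), the field selection can be sketched as below. The function name fields_to_index, the TEXT_FIELD_TYPES set and the field types assumed for Crag are all hypothetical.

```python
# Field types treated as indexable text in this sketch (an assumption).
TEXT_FIELD_TYPES = {"CharField", "TextField"}


def fields_to_index(model_fields, explicit_fields=None):
    """Pick the field names to index (illustrative model only).

    model_fields maps field name -> field type name.
    """
    # An explicit field list wins over the default.
    if explicit_fields is not None:
        return list(explicit_fields)
    # Default: every char or text field on the model.
    return [name for name, kind in model_fields.items() if kind in TEXT_FIELD_TYPES]


# Assumed shape of the Crag model from this thread.
crag_fields = {
    "id": "AutoField",
    "name": "CharField",
    "normalized_name": "CharField",
}

default_fields = fields_to_index(crag_fields)            # like watson.register(Crag)
named_fields = fields_to_index(crag_fields, ("name",))   # like watson.register(Crag, ('name',))
```

Under these assumptions the default picks up normalized_name as well as name, which matches the extra 'ceus' entry seen in the first index.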

