
elasticsearch-analysis-combo's People

Contributors

jprante, ofavre


elasticsearch-analysis-combo's Issues

Compatibility with Elasticsearch 2.0

This plugin doesn't install with Elasticsearch 2.0. I get the following error: ERROR: Could not find plugin descriptor 'plugin-descriptor.properties'

Plugin is no longer downloadable through elasticsearch plugin mechanism

The install command from the readme doesn't work any longer as the plugin is assumed to be a site plugin:
bin/plugin -install yakaz/elasticsearch-analysis-combo/1.2.0

The more explicit variant doesn't work either (it is also assumed to be a site plugin):
bin/plugin -url https://github.com/yakaz/elasticsearch-analysis-combo/zipball/v1.2.0 -install elasticsearch-analysis-combo

The full response for both calls is:
Plugin installation assumed to be site plugin, but contains source code, aborting installation...

Compatibility to Elasticsearch 1.0.0.RC1

I encountered a compatibility issue with Elasticsearch 1.0.0.RC1.
The class

ElasticSearchIllegalArgumentException

is now called

ElasticsearchIllegalArgumentException

Problem with 1.3.0 Combo version:

Hi

I've tried the 1.3.0 version of the combo plugin. I have the following configuration:

        "analysis":{
            "analyzer":{
                "ngram":{
                    "tokenizer":"whitespace",
                    "filter":[
                        "standard",
                        "lowercase",
                        "ngram",
                        "catenate_words"
                    ]
                },
                "combo":{
                    "type":"combo",
                    "sub_analyzers":[
                        "standard",
                        "ngram"
                    ]
                }
            },
            "filter":{
                "catenate_words":{
                    "type":"word_delimiter",
                    "catenate_all":true,
                    "generate_word_parts":true,
                    "generate_number_parts":false,
                    "split_on_case_change":false,
                    "split_on_numerics":false,
                    "preserve_original":true,
                    "stem_english_possessive":false
                },
                "ngram":{
                    "type":"nGram",
                    "min_gram":3,
                    "max_gram":50
                },

If I create this index and run the analyze API like this I get an error:

http://localhost:9200/indexname/_analyze?analyzer=combo&text=hello%20world&pretty=true

ES returns:

{
"error" : "UnsupportedOperationException[null]",
"status" : 500
}

The space ("%20") seems to cause the problem here, and after running the above query we keep getting the 500 status error on all requests.

I get this error trace as well:

[13-03-15 11:30:40:040] DEBUG: [Supernalia] failed to execute [org.elasticsearch.action.admin.indices.analyze.AnalyzeRequest@159f36a4]
java.lang.UnsupportedOperationException
at org.apache.lucene.analysis.ReusableTokenStreamComponents.setReader(ReusableTokenStreamComponents.java:20)
at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:137)
at org.elasticsearch.action.admin.indices.analyze.TransportAnalyzeAction.shardOperation(TransportAnalyzeAction.java:201)
at org.elasticsearch.action.admin.indices.analyze.TransportAnalyzeAction.shardOperation(TransportAnalyzeAction.java:57)
at org.elasticsearch.action.support.single.custom.TransportSingleCustomOperationAction$AsyncSingleAction$2.run(TransportSingleCustomOperationAction.java:175)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
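
To reproduce the request above without hand-encoding the space, the URL can be assembled programmatically. This is a minimal sketch in Python; the helper name `analyze_url` is hypothetical, and the host, index name and analyzer name are simply the ones from the report.

```python
from urllib.parse import quote, urlencode


def analyze_url(base: str, index: str, analyzer: str, text: str) -> str:
    """Build an _analyze URL like the one in the report (hypothetical helper)."""
    params = {"analyzer": analyzer, "text": text, "pretty": "true"}
    # quote_via=quote encodes the space as %20 instead of the default "+".
    query = urlencode(params, quote_via=quote)
    return f"{base}/{quote(index, safe='')}/_analyze?{query}"


url = analyze_url("http://localhost:9200", "indexname", "combo", "hello world")
print(url)
```

Sending this URL with any HTTP client should hit the same code path as the browser request in the report.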

Strange results when combining German2 snowball with ngram and highlighting matches.

When using the combo analyzer with the sub-analyzers German2 snowball and ngram, I get wrong highlights at positions where the queried word doesn't occur in the input text. See gist 5411318. Changing the order of the sub_analyzers doesn't change anything. default_index and default_search are different because ngram shouldn't be used for search analysis. I don't have much experience in the field of full-text search, but not using the combo analyzer and using a filter array

[ "ngram_filter", "germansnow" ]

where appropriate yields correct results for me. As far as I understand it, though, that is not the same as using the combo analyzer.

As far as my requirements go, I started here: http://jprante.github.io/lessons/2012/05/16/multilingual-analysis-for-title-search.html because we have multilingual titles as well.

Problem using combo analyzer with the 0.90.0_beta1 version of ElasticSearch

Hi

We just upgraded to the latest ES and got the following stack trace when trying to index using the combo analyzer. Any ideas?

org.elasticsearch.indices.IndexCreationException: [essearchtest] failed to create index
at org.elasticsearch.indices.InternalIndicesService.createIndex(InternalIndicesService.java:299)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyNewIndices(IndicesClusterStateService.java:302)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:168)
at org.elasticsearch.cluster.service.InternalClusterService$2.run(InternalClusterService.java:321)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:95)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.VerifyError: class org.elasticsearch.index.analysis.ComboAnalyzer overrides final method tokenStream.(Ljava/lang/String;Ljava/io/Reader;)Lorg/apache/lucene/analysis/TokenStream;
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.getDeclaredConstructors0(Native Method)
at java.lang.Class.privateGetDeclaredConstructors(Class.java:2389)
at java.lang.Class.getDeclaredConstructors(Class.java:1836)
at org.elasticsearch.common.inject.assistedinject.FactoryProvider.createMethodMapping(FactoryProvider.java:214)
at org.elasticsearch.common.inject.assistedinject.FactoryProvider.newFactory(FactoryProvider.java:151)
at org.elasticsearch.common.inject.assistedinject.FactoryProvider.newFactory(FactoryProvider.java:146)
at org.elasticsearch.index.analysis.AnalysisModule.configure(AnalysisModule.java:386)
at org.elasticsearch.common.inject.AbstractModule.configure(AbstractModule.java:60)
at org.elasticsearch.common.inject.spi.Elements$RecordingBinder.install(Elements.java:201)
at org.elasticsearch.common.inject.spi.Elements.getElements(Elements.java:82)
at org.elasticsearch.common.inject.InjectorShell$Builder.build(InjectorShell.java:130)
at org.elasticsearch.common.inject.InjectorBuilder.build(InjectorBuilder.java:99)
at org.elasticsearch.common.inject.InjectorImpl.createChildInjector(InjectorImpl.java:129)
at org.elasticsearch.common.inject.ModulesBuilder.createChildInjector(ModulesBuilder.java:66)
at org.elasticsearch.indices.InternalIndicesService.createIndex(InternalIndicesService.java:295)

Add plugin to public maven repository

Hello,

It would be nice to make the plugin available to a public maven repository.

The current installation command:
bin/plugin -install yakaz/elasticsearch-analysis-combo/1.2.0
seems to work through the GitHub account, according to the ES plugins documentation.
I didn't find the plugin in any public maven repository.

The "github plugin install" works fine, but having the plugin in a real maven repository would help, particularly for integration tests with an embedded ES server, where the plugin dependency should be added to the classpath with a test scope.

I had to clone the tagged branch locally and do a mvn clean install, and had to explain the procedure to other developers so that they are not stuck with an unresolvable maven dependency (we don't have a Nexus for now).

Publishing it on Sonatype, or at least on a custom built repository would be nice.
For example, you can do something similar to what I've done here:
https://github.com/slorber/gcm-server-repository

By the way, good job. The plugin works fine for me, and my integration tests passed after a small effort to switch from a multifield to a combo analyzer.

Add possibility to boost per analyzer

Hello,

It would be nice to be able to give a boost per analyzer.

I mean, if I index the word "description" with edgengrams(3,7) + stemming + default,

I would like to be able to say:

  • If a match is found thanks to edgengrams, then boost of 0.2
  • If a match is found thanks to stemming, then boost of 0.7
  • If a match is found thanks to the default analyzer, then boost of 1

Because matches with "des" may be less relevant than matches with "descript", which in turn may be less relevant than matches with "description", matches with "description" should come first.

I don't know if it is possible to do, just a suggestion :)

Also, it would be nice to have some information about the effects of using a combo analyzer on scoring. The first question that came to mind was, for example, "is the order of sub analyzers important?". I think it isn't, since you mention something about duplicate tokens.
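
Nothing in this thread says the combo analyzer supports per-analyzer boosts. As a hedged illustration of the ranking being asked for, the sketch below builds a bool/should query with one boosted match clause per field, assuming the text is instead indexed into separate sub-fields, one per analyzer. The function name and the field names `description.edge`, `description.stemmed` and `description` are hypothetical; the boost values are the ones from the request.

```python
import json


def boosted_query(text: str, boosts: dict) -> dict:
    """Build a bool/should query with one match clause per field (hypothetical helper).

    `boosts` maps a field name to the boost applied when that field matches.
    """
    should = [
        {"match": {field: {"query": text, "boost": boost}}}
        for field, boost in boosts.items()
    ]
    return {"query": {"bool": {"should": should}}}


# Hypothetical sub-fields, one per analyzer, with the boosts from the request.
query = boosted_query(
    "description",
    {"description.edge": 0.2, "description.stemmed": 0.7, "description": 1.0},
)
print(json.dumps(query, indent=2))
```

With a layout like this, a document matching only through the edge n-gram field would score lower than one matching the full term, which is the ordering described above. It gives up the single-field convenience of the combo analyzer.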

dynamic template multi-field w/ _all combo analyzer issue?

FWIW, I posted this on the Elasticsearch forum but got no response thus far, so I am posting it here too, since it definitely is analysis combo related.

I'm seeing duplicate concatenated values when using the combo analyzer for _all with a multi-field defined in a dynamic template.

For example, instead of seeing "Foo Bar" when listing the _all terms aggregation, I'm seeing "Foo Bar Foo Bar" for the token because my multi-field defines 2 sub-fields. If the multi-field is defined with 4 sub-fields, then "Foo Bar" is concatenated 4 times.

My set up is below.

Elasticsearch 1.0.0 on CentOS 6.4 with Java 1.7.0_51.

$ES_HOME/config/default-mapping.json:
{
  "default": {
    "_all": {
      "enabled": true,
      "analyzer": "combo",
      "store": false
    },
    "dynamic_templates": {
      "string_multifield_template": {
        "match": "*",
        "match_mapping_type": "string",
        "mapping": {
          "include_in_all": false,
          "fields": {
            "{name}": {
              "index": "not_analyzed",
              "store": true,
              "type": "string"
            },
            "lowercase": {
              "analyzer": "lowercase",
              "index": "analyzed",
              "store": false,
              "type": "string"
            }
          }
        }
      }
    }
  }
}

$ES_HOME/config/elasticsearch.yml:
...
index.analysis.analyzer.lowercase.type: custom
index.analysis.analyzer.lowercase.tokenizer: keyword
index.analysis.analyzer.lowercase.filter: [ lowercase ]

index.analysis.analyzer.combo.type: custom
index.analysis.analyzer.combo.sub_analyzers: [ keyword, lowercase ]
index.analysis.analyzer.combo.deduplication: true
index.analysis.analyzer.combo.tokenstream_reuse: false
...

The aggregation query I use is the following:
{
  "aggs": {
    "_all": {
      "terms": {
        "field": "_all"
      }
    }
  }
}

ES 1.2.1 — Unsupported major.minor version 51.0

I have been using ES 1.1.0 with the combo analyser, and it has been lovely. Thanks!

I just downloaded ES 1.2.1 and tried to install the combo analyser, and got the output below. What should I do?

vagrant@precise32:~/elasticsearch-1.2.1$ bin/plugin -install com.yakaz.elasticsearch.plugins/elasticsearch-analysis-combo/1.5.1
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/elasticsearch/plugins/PluginManager : Unsupported major.minor version 51.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:643)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
Could not find the main class: org.elasticsearch.plugins.PluginManager. Program will exit.
vagrant@precise32:~/elasticsearch-1.2.1$
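
As background for reading this error: "Unsupported major.minor version 51.0" is the JVM reporting that a class file uses a newer class-file format than the running Java supports, and major version 51 corresponds to Java 7. That suggests the `bin/plugin` command here was run on an older JVM than the one ES 1.2.1's classes were compiled for. The sketch below maps a class-file major version to the Java release that introduced it; the function name is hypothetical.

```python
def java_release_for_major(major: int) -> str:
    """Map a class-file major version to the Java release that introduced it."""
    # Early releases used "1.x" naming; from major 49 on the release is major - 44.
    legacy = {45: "1.1", 46: "1.2", 47: "1.3", 48: "1.4"}
    if major in legacy:
        return legacy[major]
    if major >= 49:
        return str(major - 44)
    raise ValueError(f"unknown class-file major version: {major}")


# The version from the error message above.
print(java_release_for_major(51))
```

Running `java -version` in the same shell shows which JVM the plugin script actually picks up.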

Highlighting is not working. How can I highlight with the combo analyzer type? Please help me!

Question:
How can I get highlighting to work with the combo analyzer type?
Please help me!

  1. create index with combo analyzer
    {
    "index.analysis.filter.thai_stop_custom.type":"stop",
    "index.analysis.analyzer.custom_whitespace_synonym_analyzer.filter.0":"thai_stop_custom",
    "index.analysis.analyzer.custom_whitespace_synonym_analyzer.filter.1":"english_stop_custom",
    "index.analysis.analyzer.custom_whitespace_synonym_analyzer.filter.2":"synonym",
    "index.analysis.analyzer.custom_whitespace_synonym_analyzer.filter.3":"unique_token_filter",
    "index.analysis.filter.compound_word.max_subword_size":"25",
    "index.analysis.analyzer.custom_whitespace_synonym_analyzer.type":"custom",
    "index.analysis.analyzer.custom_foreign_languages_synonym_analyzer.type":"custom",
    "index.analysis.analyzer.custom_whitespace_synonym_analyzer.tokenizer":"whitespace",
    "index.analysis.analyzer.custom_foreign_languages_synonym_analyzer.filter.0":"compound_word",
    "index.analysis.analyzer.custom_synonym_analyzer.type":"custom",
    "index.analysis.analyzer.custom_synonym_analyzer.filter.2":"synonym",
    "index.analysis.analyzer.custom_synonym_analyzer.filter.3":"unique_token_filter",
    "index.analysis.analyzer.custom_synonym_analyzer.filter.0":"thai_stop_custom",
    "index.analysis.analyzer.custom_foreign_languages_synonym_analyzer.filter.4":"synonym",
    "index.analysis.analyzer.custom_synonym_analyzer.filter.1":"english_stop_custom",
    "index.analysis.analyzer.custom_foreign_languages_synonym_analyzer.filter.3":"english_stop_custom",
    "index.analysis.analyzer.custom_foreign_languages_synonym_analyzer.filter.2":"thai_stop_custom",
    "index.analysis.analyzer.custom_foreign_languages_synonym_analyzer.filter.1":"keep_word",
    "index.analysis.filter.compound_word.min_subword_size":"2",
    "index.analysis.analyzer.custom_foreign_languages_synonym_analyzer.filter.5":"unique_token_filter",
    "index.analysis.analyzer.combo_thai_analyzer.deduplication":"true",
    "index.analysis.filter.unique_token_filter.type":"unique",
    "index.analysis.filter.synonym.synonyms_path":"synonyms.txt",
    "index.analysis.filter.keep_word.type":"keep",
    "index.analysis.filter.compound_word.word_list_path":"protwords.txt",
    "index.analysis.filter.english_stop_custom.type":"stop",
    "index.analysis.analyzer.custom_foreign_languages_analyzer.type":"custom",
    "index.analysis.analyzer.combo_thai_analyzer.type":"combo",
    "index.analysis.filter.thai_stop_custom.stopwords_path":"stopwords.txt",
    "index.analysis.filter.compound_word.type":"dictionary_decompounder",
    "index.analysis.analyzer.custom_foreign_languages_synonym_analyzer.tokenizer":"keyword",
    "index.analysis.analyzer.custom_icu_analyzer.tokenizer":"icu_tokenizer",
    "index.analysis.analyzer.custom_foreign_languages_analyzer.tokenizer":"keyword",
    "index.analysis.filter.synonym.type":"synonym",
    "index.analysis.analyzer.combo_thai_analyzer.sub_analyzers.3":"custom_foreign_languages_synonym_analyzer",
    "index.analysis.analyzer.combo_thai_analyzer.sub_analyzers.2":"custom_synonym_analyzer",
    "index.analysis.analyzer.combo_thai_analyzer.sub_analyzers.1":"custom_foreign_languages_analyzer",
    "index.analysis.analyzer.combo_thai_analyzer.sub_analyzers.0":"custom_icu_analyzer",
    "index.analysis.filter.keep_word.keep_words_path":"protwords.txt",
    "index.analysis.filter.unique_token_filter.only_on_same_position":"false",
    "index.analysis.analyzer.combo_thai_analyzer.sub_analyzers.4":"custom_whitespace_synonym_analyzer",
    "index.analysis.analyzer.custom_icu_analyzer.filter.0":"thai_stop_custom",
    "index.analysis.analyzer.custom_icu_analyzer.filter.1":"english_stop_custom",
    "index.analysis.analyzer.custom_foreign_languages_analyzer.filter.0":"compound_word",
    "index.analysis.analyzer.custom_synonym_analyzer.tokenizer":"icu_tokenizer",
    "index.analysis.analyzer.custom_icu_analyzer.filter.2":"icu_normalizer",
    "index.analysis.analyzer.custom_icu_analyzer.filter.3":"unique_token_filter",
    "index.analysis.analyzer.custom_foreign_languages_analyzer.filter.2":"thai_stop_custom",
    "index.analysis.analyzer.custom_foreign_languages_analyzer.filter.1":"keep_word",
    "index.analysis.filter.compound_word.min_word_size":"2",
    "index.analysis.analyzer.custom_foreign_languages_analyzer.filter.4":"unique_token_filter",
    "index.analysis.analyzer.custom_foreign_languages_analyzer.filter.3":"english_stop_custom"
    }
  2. create mapping with combo analyzer
    {
    "properties": {
    "id": {
    "type": "long",
    "index": "not_analyzed"
    },
    "title": {
    "type": "string",
    "index": "analyzed",
    "store": true
    },
    "indexTitle": {
    "type": "string",
    "index": "analyzed",
    "analyzer": "combo_thai_analyzer",
    "term_vector": "with_positions_offsets",
    "store": true
    },
    "searchKeywords": {
    "type": "string",
    "index": "no",
    "store": true
    },
    "keywordGroup": {
    "type": "string",
    "index": "no",
    "store": true
    }
    }
    }
  3. save docs
    {
    "index":{
    "_index":"testindex",
    "_type":"ITEM",
    "_id":"1"
    }
    }{
    "id":1,
    "title":"เดรสผ่าแขนหน้า เรียบหรูดูดีสุดๆค่ะ ผ้าฮานาโกะคัตติ้งเนี้ยบ งานดีงานสลวยสวยเก๋ ใส่แล้วเป็นคุนนายขึ้นมาทันทีค่ะ ขาว ดำ แดง น้ำเงิน ครีม ชมพู Freesize เอว 24-28 490฿",
    "indexTitle":"เดรสผ่าแขนหน้า เรียบหรูดูดีสุดๆค่ะ ผ้าฮานาโกะคัตติ้งเนี้ยบ งานดีงานสลวยสวยเก๋ ใส่แล้วเป็นคุนนายขึ้นมาทันทีค่ะ ขาว ดำ แดง น้ำเงิน ครีม ชมพู Freesize เอว 24-28 490฿",
    "searchKeywords":"เดรส",
    "keywordGroup":"A"
    }{
    "index":{
    "_index":"testindex",
    "_type":"ITEM",
    "_id":"2"
    }
    }{
    "id":2,
    "title":"เดรสปาดไหล่แขนยาว ทรงนี้เป็นทรงที่ใครใส่ก็สวยค่ะ เดรสเกาะอกแต่งแขนยาวปาดไหล่ เปรี้ยวแซ่บ ตัวนี้ใช้ผ้าเบาสบายเนื้อดี ใส่สวยพริ้วๆ ไม่ร้อนแน่นอนค่ะ ด้านหลังแต่งโบว์เพิ่มความน่ารัก ทำให้ชุดดูมีลูกเล่น สาวๆพลาดไม่ได้เลยนะคะ สวยจริงๆค่ะ ^^ ฟรีไซส์ : อก : ยางยืดได้ถึง38" ,
    รอบแขน13" ยาว21" มี5สี : แดงเลือดหมู ,
    ดำ ,
    ชมพู ,
    เหลือง ,
    ครีม",
    "indexTitle":"เดรสปาดไหล่แขนยาว ทรงนี้เป็นทรงที่ใครใส่ก็สวยค่ะ เดรสเกาะอกแต่งแขนยาวปาดไหล่ เปรี้ยวแซ่บ ตัวนี้ใช้ผ้าเบาสบายเนื้อดี ใส่สวยพริ้วๆ ไม่ร้อนแน่นอนค่ะ ด้านหลังแต่งโบว์เพิ่มความน่ารัก ทำให้ชุดดูมีลูกเล่น สาวๆพลาดไม่ได้เลยนะคะ สวยจริงๆค่ะ ^^ ฟรีไซส์ : อก : ยางยืดได้ถึง38" ,
    รอบแขน13" ยาว21" มี5สี : แดงเลือดหมู ,
    ดำ ,
    ชมพู ,
    เหลือง ,
    ครีม",
    "searchKeywords":"เดรส",
    "keywordGroup":"A"
    }{
    "index":{
    "_index":"testindex",
    "_type":"ITEM",
    "_id":"3"
    }
    }{
    "id":3,
    "title":"เดรสเข้ารูปตัดต่อผ้ามุ้งช่วงบนหน้าอก ดูเซ็กซี่เล็กๆน่าค้นหาชวนมอง ผ้าเรย่อนเนื้อดียืดหยุ่นดีเกรดเริ่ดไม่ย้วยไม่บาง ขอบแขนระบายเล็กๆ Freesize อกยืดไม่เกิน38" สะโพกไม่เกิน38" ยาว30" งานมี4สี เทา/ขาว/ดำ/แดง",
    "indexTitle":"เดรสเข้ารูปตัดต่อผ้ามุ้งช่วงบนหน้าอก ดูเซ็กซี่เล็กๆน่าค้นหาชวนมอง ผ้าเรย่อนเนื้อดียืดหยุ่นดีเกรดเริ่ดไม่ย้วยไม่บาง ขอบแขนระบายเล็กๆ Freesize อกยืดไม่เกิน38" สะโพกไม่เกิน38" ยาว30" งานมี4สี เทา/ขาว/ดำ/แดง",
    "searchKeywords":"เดรส",
    "keywordGroup":"A"
    }{
    "index":{
    "_index":"testindex",
    "_type":"ITEM",
    "_id":"4"
    }
    }{
    "id":4,
    "title":"เดรสผ้ายืดตัวยาว ผ้ายืดค้อตตอนใส่สบาย พิมพ์ลายสัปปะรสที่กำลังอินที่สุดในตอนนี้ ตัวยาวไม่เข้ารูปใส่ปล่อย ผ่าด้านข้าง2ข้าง สาวๆห้ามพลาดน้า Freesize อกยืดถึง43" ยาว56" มีสีเดียวตามรูป",
    "indexTitle":"เดรสผ้ายืดตัวยาว ผ้ายืดค้อตตอนใส่สบาย พิมพ์ลายสัปปะรสที่กำลังอินที่สุดในตอนนี้ ตัวยาวไม่เข้ารูปใส่ปล่อย ผ่าด้านข้าง2ข้าง สาวๆห้ามพลาดน้า Freesize อกยืดถึง43" ยาว56" มีสีเดียวตามรูป",
    "searchKeywords":"เดรส",
    "keywordGroup":"A"
    }{
    "index":{
    "_index":"testindex",
    "_type":"ITEM",
    "_id":"5"
    }
    }{
    "id":5,
    "title":"เดรสทรงตรงใส่สะบายๆ ดีไซส์ระบายผ้าชีฟองช่วงอก ตัวเดรสด้านหน้าปริ้นลายกราฟฟิกสถานที่ ลายสวยคมชัด สามารถใส่ทำงานหรือออกงานได้ตามโอกาส Cutting /Pattern สวย เป๊ะเหมือนแบบเลยจร้า Color : ชมพู ราคา 690 S - อก 34"/ เอว 32" / สะโพก 36" / วงแขน 18" / ค.ยาว 33" M - อก 36" / เอว 34" / สะโพก 38" / วงแขน 18" / ค.ยาว 33" L - อก 37" / เอว 32" / สะโพก 40" / วงแขน 18" / ค.ยาว 33"",
    "indexTitle":"เดรสทรงตรงใส่สะบายๆ ดีไซส์ระบายผ้าชีฟองช่วงอก ตัวเดรสด้านหน้าปริ้นลายกราฟฟิกสถานที่ ลายสวยคมชัด สามารถใส่ทำงานหรือออกงานได้ตามโอกาส Cutting /Pattern สวย เป๊ะเหมือนแบบเลยจร้า Color : ชมพู ราคา 690 S - อก 34"/ เอว 32" / สะโพก 36" / วงแขน 18" / ค.ยาว 33" M - อก 36" / เอว 34" / สะโพก 38" / วงแขน 18" / ค.ยาว 33" L - อก 37" / เอว 32" / สะโพก 40" / วงแขน 18" / ค.ยาว 33"",
    "searchKeywords":"เดรส",
    "keywordGroup":"A"
    }
  4. search docs with combo_thai_analyzer
    {
    "from": 0,
    "size": 100,
    "query": {
    "query_string": {
    "default_field": "indexTitle",
    "query": "เดรส",
    "analyzer": "combo_thai_analyzer"
    }
    },
    "sort": [
    {
    "_score": {
    "order": "desc"
    }
    }
    ],
    "highlight": {
    "fields": {
    "indexTitle": {
    "pre_tags": [
    ""
    ],
    "post_tags": [
    ""
    ],
    "index_options": "offsets"
    }
    }
    }
    }

4-1 result: the highlight is wrong; the entire sentence is highlighted
{
_index: testindex
_type: ITEM
_id: 1
_score: 1.0320579
_source: {
id: 1
title: เดรสผ่าแขนหน้า เรียบหรูดูดีสุดๆค่ะ ผ้าฮานาโกะคัตติ้งเนี้ยบ งานดีงานสลวยสวยเก๋ ใส่แล้วเป็นคุนนายขึ้นมาทันทีค่ะ ขาว ดำ แดง น้ำเงิน ครีม ชมพู Freesize เอว 24-28 490฿
indexTitle: เดรสผ่าแขนหน้า เรียบหรูดูดีสุดๆค่ะ ผ้าฮานาโกะคัตติ้งเนี้ยบ งานดีงานสลวยสวยเก๋ ใส่แล้วเป็นคุนนายขึ้นมาทันทีค่ะ ขาว ดำ แดง น้ำเงิน ครีม ชมพู Freesize เอว 24-28 490฿
searchKeywords: เดรส
keywordGroup: A
}
highlight: {
indexTitle: [
เดรสผ่าแขนหน้า เรียบหรูดูดีสุดๆค่ะ ผ้าฮานาโกะคัตติ้งเนี้ยบ งานดีงานสลวยสวยเก๋ ใส่แล้วเป็นคุนนายขึ้นมาทันทีค่ะ ขาว ดำ แดง น้ำเงิน ครีม ชมพู Freesize เอว 24-28 490฿
]
}
}

  5. search without the combo analyzer
    {
    "from": 0,
    "size": 100,
    "query": {
    "query_string": {
    "default_field": "indexTitle",
    "query": "เดรส",
    "analyzer": "custom_icu_analyzer"
    }
    },
    "sort": [
    {
    "_score": {
    "order": "desc"
    }
    }
    ],
    "highlight": {
    "fields": {
    "indexTitle": {
    "pre_tags": [
    ""
    ],
    "post_tags": [
    ""
    ],
    "index_options": "offsets"
    }
    }
    }
    }

5-1 result: highlighting works, but the search doesn't see the results of the other analyzers, for example the synonym and foreign-language analyzers
{
_index: testindex
_type: ITEM
_id: 1
_score: 0.72977513
_source: {
id: 1
title: เดรสผ่าแขนหน้า เรียบหรูดูดีสุดๆค่ะ ผ้าฮานาโกะคัตติ้งเนี้ยบ งานดีงานสลวยสวยเก๋ ใส่แล้วเป็นคุนนายขึ้นมาทันทีค่ะ ขาว ดำ แดง น้ำเงิน ครีม ชมพู Freesize เอว 24-28 490฿
indexTitle: เดรสผ่าแขนหน้า เรียบหรูดูดีสุดๆค่ะ ผ้าฮานาโกะคัตติ้งเนี้ยบ งานดีงานสลวยสวยเก๋ ใส่แล้วเป็นคุนนายขึ้นมาทันทีค่ะ ขาว ดำ แดง น้ำเงิน ครีม ชมพู Freesize เอว 24-28 490฿
searchKeywords: เดรส
keywordGroup: A
}
highlight: {
indexTitle: [
เดรสผ่าแขนหน้า เรียบหรูดูดีสุดๆค่ะ ผ้าฮานาโกะคัตติ้งเนี้ยบ งานดีงานสลวยสวยเก๋ ใส่แล้วเป็นคุนนายขึ้นมาทันทีค่ะ
]
}
}

Question:
How can I get highlighting to work with the combo analyzer type?
Please help me!

How-To? I'm stuck

I'm having some trouble with the combo plugin; I can't get it working the way I'd like.

I'm following your example:

curl -XDELETE localhost:9200/twitter
curl -XPUT localhost:9200/twitter -d '{
    "index" : {
        "analysis" : {
            "analyzer" : {
                "default" : {
                    "type" : "custom",
                    "tokenizer" : "icu_tokenizer",
                    "filter" : [ "snowball", "icu_folding" ]
                },
                "combo" : {
                    "type" : "combo",
                    "sub_analyzers" : [ "standard", "default" ]
                }
            },
            "filter" : {
                "snowball" : {
                    "type" : "snowball",
                    "language" : "German2"
                }
            }
        }
    }
}'

But the result of calling curl -XGET 'localhost:9200/test/_analyze?analyzer=combo&pretty=true' -d 'Ein schöner Tag in Köln im Café' is not, as described in the readme, that ö, ä, ü are transformed to oe, ae, ue; they are simply transformed to o, a, u.

What I am trying to do is get all three kinds of transformation, so that the string Ein schöner Tag in Köln im Café can be found as is, as Ein schoner Tag in Koln im Cafe, and as Ein schoener Tag in Köln im Cafe.

Is this possible? What am I doing wrong? (Using ES 0.19.11)
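
To make the three target forms concrete, here is a small sketch in plain Python, outside Elasticsearch. It is not how the plugin or the ICU filters work internally; it only shows the difference between diacritic folding (ö to o) and German transliteration (ö to oe). The function names and the mapping table are illustrative assumptions.

```python
import unicodedata

# German-specific expansions; everything else falls through to plain folding.
GERMAN_TRANSLIT = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}


def fold(text: str) -> str:
    """Strip combining marks, so ö becomes o and é becomes e."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def transliterate(text: str) -> str:
    """Expand German umlauts (ö to oe), then fold whatever accents remain."""
    composed = unicodedata.normalize("NFC", text)
    expanded = "".join(GERMAN_TRANSLIT.get(ch, ch) for ch in composed)
    return fold(expanded)


def variants(text: str) -> list:
    """Return the original, the folded and the transliterated form."""
    return [text, fold(text), transliterate(text)]


for form in variants("Ein schöner Tag in Köln im Café"):
    print(form)
```

In index terms, each of the three forms would need its own sub-analyzer for a combo analyzer to emit all of them for the same input.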
