elasticsearch-timefacets-plugin's Introduction

Elasticsearch Plugins

Note

This project is no longer maintained and has been superseded by Crate; see https://github.com/crate/crate

Distinct Date Histogram Facet

This facet counts the distinct values of a string or numeric field within each time interval.

Example:

{
    "query" : {
        "match_all" : {}
    },
    "facets" : {
        "distinct" : {
            "distinct_date_histogram" : {
                "field" : "field_name",
                "value_field" : "value_field_name",
                "interval" : "day"
            }
        }
    }
}

Result:

"distinct":{
    "_type":"distinct_date_histogram",
    "entries":[
        {"time":950400000,"count":2},
        {"time":1555200000,"count":3}
    ],
    "count":4
}

The "count" in each entry is the number of distinct values in that time period. The outer "count" is the total number of distinct values across all periods.
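The relationship between the per-entry counts and the outer count can be illustrated with a small Python sketch. It is not the plugin's implementation; the function name, the `(timestamp, value)` input shape, and the sample values are assumptions made for illustration. Note that the outer count is not the sum of the entry counts, because a value seen in several periods is counted only once overall.

```python
from collections import defaultdict

DAY_MS = 24 * 60 * 60 * 1000  # length of the "day" interval in milliseconds


def distinct_date_histogram(docs, interval_ms=DAY_MS):
    """Count distinct values per time bucket and overall.

    `docs` is an iterable of (timestamp_ms, value) pairs, standing in for
    the facet's "field" and "value_field" (hypothetical input shape).
    """
    buckets = defaultdict(set)
    for timestamp_ms, value in docs:
        # Round the timestamp down to the start of its interval.
        bucket_start = (timestamp_ms // interval_ms) * interval_ms
        buckets[bucket_start].add(value)

    # A value that occurs in several buckets is counted once overall.
    all_values = set().union(*buckets.values())
    return {
        "entries": [
            {"time": start, "count": len(values)}
            for start, values in sorted(buckets.items())
        ],
        "count": len(all_values),
    }


# Made-up sample data chosen to reproduce the counts shown above.
docs = [
    (950400000, "alice"),
    (950400000 + 1000, "bob"),
    (950400000 + 2000, "alice"),  # repeated value, not counted again
    (1555200000, "bob"),          # already seen in the first bucket
    (1555200000 + 1000, "carol"),
    (1555200000 + 2000, "dave"),
]
result = distinct_date_histogram(docs)
```

Here the two buckets contain 2 and 3 distinct values, while the outer count is 4 because "bob" appears in both buckets.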

Works like the "date_histogram" with these exceptions:

  • value_field is mandatory
  • value_field must be of type String or Numeric
  • no value_script

"Latest" Facet

This facet collapses matching documents by key_field, keeping only the document with the highest ts_field value for each key. The result is always sorted by value_field in descending order.

Example:

{
    "query" : {
        "match_all" : {}
    },
    "facets" : {
        "l" : {
            "latest" : {
                "size" : 100,
                "start" : 50,
                "key_field" : "mykey",
                "value_field" : "num_comments",
                "ts_field" : "created_at"
            }
        }
    }
}

Result:

"facets" : {
  "l" : {
    "_type" : "latest",
    "total": 25,
    "entries" : [ {
      "value" : 52127,
      "key" : 5758683603492929880,
      "ts" : 1325577893000
    }, {
      "value" : 14980,
      "key" : 5758683371564695759,
      "ts" : 1325447138000
    }, {
      "value" : 10392,
      "key" : 5758683603492929669,
      "ts" : 1325577885000
    } ]
  }
}
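
The collapse-then-sort behaviour of this facet can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the plugin's code: the function name and the default values for `start` and `size` are hypothetical, and `start` and `size` are assumed to page through the sorted entries. The "total" field of the response is left out because its exact meaning is not described here.

```python
def latest_facet(docs, key_field, value_field, ts_field, start=0, size=10):
    """Keep the newest document per key, then sort by value, descending.

    `docs` is a list of dicts. The defaults for `start` and `size` are
    arbitrary choices for this sketch, not the plugin's defaults.
    """
    latest = {}
    for doc in docs:
        key = doc[key_field]
        current = latest.get(key)
        # Only the document with the highest timestamp survives per key.
        if current is None or doc[ts_field] > current[ts_field]:
            latest[key] = doc

    entries = sorted(latest.values(), key=lambda d: d[value_field], reverse=True)
    page = entries[start:start + size]  # assumed paging semantics
    return [
        {"value": d[value_field], "key": d[key_field], "ts": d[ts_field]}
        for d in page
    ]


# Made-up documents using the field names from the example query.
docs = [
    {"mykey": 1, "num_comments": 5, "created_at": 100},
    {"mykey": 1, "num_comments": 9, "created_at": 200},  # newest for key 1
    {"mykey": 2, "num_comments": 7, "created_at": 150},
    {"mykey": 3, "num_comments": 12, "created_at": 50},
    {"mykey": 3, "num_comments": 1, "created_at": 60},   # newest for key 3
]
entries = latest_facet(docs, "mykey", "num_comments", "created_at")
```

Key 3 ends up last even though one of its documents has the highest value, because only its newest document (value 1) is considered.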

Restrictions of the "Latest" facet

Documents must be routed so that all documents with the same key_field value end up on the same shard. This can be done by setting the _routing attribute when indexing. The requirement exists for performance reasons: it allows the fields to be collapsed per shard.
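
Why the routing matters can be shown with a small simulation. The sketch below assumes that each shard collapses its own documents and that the per-shard results are then combined without a second collapse; that is a guess at the merge step, which is not described in detail here. The modulo "routing" is a simplified stand-in for Elasticsearch's real routing hash, and all names and data are hypothetical.

```python
def collapse_latest(docs):
    """Return {key: (ts, value)} with the newest entry per key."""
    latest = {}
    for key, ts, value in docs:
        if key not in latest or ts > latest[key][0]:
            latest[key] = (ts, value)
    return latest


def latest_across_shards(docs, num_shards, route):
    """Collapse per shard, then concatenate and sort by value, descending.

    `route` maps a document to an integer; the shard is that integer
    modulo `num_shards` (a simplification of real shard routing).
    """
    shards = [[] for _ in range(num_shards)]
    for doc in docs:
        shards[route(doc) % num_shards].append(doc)

    merged = []
    for shard_docs in shards:
        # Each shard only sees its own documents when collapsing.
        for key, (ts, value) in collapse_latest(shard_docs).items():
            merged.append((key, ts, value))
    return sorted(merged, key=lambda entry: entry[2], reverse=True)


# Documents as (key, ts, value) tuples; made-up data.
docs = [(1, 100, 5), (1, 200, 9), (2, 150, 7)]

# Routing on the key keeps both documents of key 1 on one shard.
routed_by_key = latest_across_shards(docs, 2, route=lambda d: d[0])

# Routing on something else can split a key across shards.
routed_by_ts = latest_across_shards(docs, 2, route=lambda d: d[1] // 100)
```

With key-based routing the result contains one entry per key. With the other routing, key 1 appears twice, because the outdated document (ts 100) sits on a different shard than the newest one and is never collapsed away.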

Currently key_field and ts_field must be of type long, while value_field must be numeric.

Installation

  • Clone this repo with git clone git@github.com:crate/elasticsearch-timefacets-plugin.git
  • Check out the tag you want to build (list the tags with git tag); master may not match your Elasticsearch version
  • Run: mvn clean package -DskipTests=true. This skips the unit tests, which take some time. To run them, use mvn clean package instead
  • Install the plugin: /path/to/elasticsearch/bin/plugin -install elasticsearch-timefacets-plugin -url file:///$PWD/target/releases/elasticsearch-timefacets-plugin-$version.jar

Maven

To use this project with maven follow the steps described at https://github.com/lovelysystems/maven

Deployment

The distributionManagement section in the pom contains the actual repository URLs on GitHub. Deploying directly to those URLs fails, because they are not Maven API endpoints to which Maven could upload artifacts.

To deploy to the Lovely Systems Maven repository, first clone https://github.com/lovelysystems/maven to your local machine, then set the deployment target location on the command line like this:

mvn -DaltDeploymentRepository=repo::default::file:../maven/releases clean deploy

After deployment, commit the changes in the maven repository project and push.

This approach was taken from the very useful blog entry at http://cemerick.com/2010/08/24/hosting-maven-repos-on-github/

elasticsearch-timefacets-plugin's People

Contributors

dobe, jukart, lars-grote, mfussenegger, quodt

elasticsearch-timefacets-plugin's Issues

Timefacets returns incorrect count when timeZone is non-zero

Facet:

"facets": {
    "distinct": {
        "distinct_date_histogram": {
            "key_field": "time",
            "interval": "day",
            "timeZone": -7.0,
            "value_field": "userName"
        }
    }
}

Result:

"facets": {
    "distinct": {
        "_type": "distinct_date_histogram",
        "entries": [
            {
                "time": 1392879600000,
                "count": 2
            },
            {
                "time": 1392966000000,
                "count": 2
            },
            {
                "time": 1395212400000,
                "count": 1
            },
            {
                "time": 1395817200000,
                "count": 2
            }
        ],
        "count": 2
    }
}

Expecting:

{
    "time": 1395817200000,
    "count": 3
}

There are three userNames for this day: [diranl, dcervelli, admin]. I don't understand why the timefacets only returns two.
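
For reference, the bucket keys in the result above all fall on 07:00 UTC, which is local midnight at UTC-7. A minimal Python sketch of day bucketing with a fixed hour offset that produces such keys is shown below. It is an illustration of the arithmetic only, not the plugin's code; the function name and the sample timestamps are assumptions, and it does not explain the reported miscount.

```python
HOUR_MS = 60 * 60 * 1000
DAY_MS = 24 * HOUR_MS


def day_bucket(timestamp_ms, tz_offset_hours=0.0):
    """Return the UTC timestamp of local midnight for the given instant.

    `tz_offset_hours` is a fixed offset such as -7.0 (no DST handling).
    """
    offset_ms = int(tz_offset_hours * HOUR_MS)
    local_ms = timestamp_ms + offset_ms               # shift into local time
    local_midnight = (local_ms // DAY_MS) * DAY_MS    # round down to the day
    return local_midnight - offset_ms                 # shift back to UTC


# Hypothetical events on the local day starting at 1395817200000.
events = [
    (1395817200000, "diranl"),
    (1395817200000 + 1 * HOUR_MS, "dcervelli"),
    (1395817200000 + 2 * HOUR_MS, "admin"),
]
users_in_bucket = {
    user for ts, user in events
    if day_bucket(ts, -7.0) == 1395817200000
}
```

With this bucketing all three hypothetical events land in the bucket keyed 1395817200000, so a distinct count of 3 would be expected for that entry.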

0.90.5 Compatibility

The CacheRecycler interface changed. It looks like you'd access the underlying data map directly

cacheRecycler.longObjectMap.v();

and there's a new release() method on the cache.

ES newer version compatibility

Wonder if this plugin can be updated for the newer versions of ES (0.90.6 onwards), especially with trove being replaced by hppc in Elasticsearch's own implementations?

NPE when query made via ES Java client

The elasticsearch-timefacets-plugin works fine via REST but throws a NullPointerException when the query is made with the Java client (Netty serialization).

I have just experienced this, and it was also reported by a user on the elasticsearch mailing list 8 months ago.

https://groups.google.com/forum/#!msg/elasticsearch/kuqXGlqgTwc/O2Ac4o--fDoJ

A workaround/fix is to return the built-in InternalCountDateHistogramFacet instead of the InternalDistinctDateHistogramFacet.

I will create a pull-request to further discuss this.

Integrate elasticsearch 0.90.3

Version 0.90.3 is ready, so why not upgrade ;-)

We use elasticsearch 0.90.3 and want to use this plugin. I created a diff file which makes it ready for 0.90.3.
