
elasticsearch-dsl-ruby's Introduction

Elasticsearch::DSL

The elasticsearch-dsl library provides a Ruby API for the Elasticsearch Query DSL.

Installation

Install the package from Rubygems:

gem install elasticsearch-dsl

To use an unreleased version, either add it to your Gemfile for Bundler:

gem 'elasticsearch-dsl', git: 'https://github.com/elasticsearch/elasticsearch-dsl-ruby.git'

or install it from a source code checkout:

git clone https://github.com/elasticsearch/elasticsearch-dsl-ruby.git
cd elasticsearch-dsl-ruby
bundle install
rake install

Usage

The library is designed as a group of standalone Ruby modules, classes and DSL methods, which provide an idiomatic way to build complex search definitions.

Let's have a simple example using the declarative variant:

require 'elasticsearch/dsl'
include Elasticsearch::DSL

definition = search do
  query do
    match title: 'test'
  end
end

definition.to_hash
# => { query: { match: { title: "test"} } }

require 'elasticsearch'
client = Elasticsearch::Client.new trace: true

client.search index: 'test', body: definition
# curl -X GET 'http://localhost:9200/test/_search?pretty' -d '{
#   "query":{
#     "match":{
#       "title":"test"
#     }
#   }
# }'
# ...
# => {"took"=>10, "hits"=> {"total"=>42, "hits"=> [...] } }

Let's build the same definition in a more imperative fashion:

require 'elasticsearch/dsl'
include Elasticsearch::DSL

definition = Search::Search.new
definition.query = Search::Queries::Match.new title: 'test'

definition.to_hash
# => { query: { match: { title: "test"} } }

The library doesn't depend on an Elasticsearch client -- its sole purpose is to facilitate building search definitions in Ruby. This makes it possible to use it with any Elasticsearch client:

require 'elasticsearch/dsl'
include Elasticsearch::DSL

definition = search { query { match title: 'test' } }

require 'json'
require 'faraday'
client   = Faraday.new(url: 'http://localhost:9200')
response = JSON.parse(
              client.post(
                '/_search',
                JSON.dump(definition.to_hash),
                { 'Accept' => 'application/json', 'Content-Type' => 'application/json' }
              ).body
            )
# => {"took"=>10, "hits"=> {"total"=>42, "hits"=> [...] } }

Features Overview

The library lets you programmatically build complex search definitions for Elasticsearch in Ruby. These are translated to Hashes and, ultimately, to JSON, the language of Elasticsearch.

All Elasticsearch DSL features are supported.

Please see the extensive RDoc examples in the source code and the integration tests.

Accessing methods outside DSL blocks' scopes

Methods defined outside a DSL block can be called from within it. This works for methods that return plain values such as a Hash, String or Array. For example:

def match_criteria
  { title: 'test' }
end

s = search do
  query do
    match match_criteria
  end
end

s.to_hash
# => { query: { match: { title: 'test' } } }

To define subqueries in other methods, self must be passed to the method and the subquery must be defined in a block passed to instance_eval called on the object argument. Otherwise, the subquery does not have access to the scope of the block from which the method was called. For example:

def not_clause(obj)
  obj.instance_eval do
    _not do
      term color: 'red'
    end
  end
end

s = search do
  query do
    filtered do
      filter do
        not_clause(self)
      end
    end
  end
end

s.to_hash
# => { query: { filtered: { filter: { not: { term: { color: 'red' } } } } } }
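The role of instance_eval here can be illustrated without the gem. The sketch below uses a toy class, ToyFilter, and a helper module, Helpers, both invented for this example; it is not the library's implementation, only a minimal model of a DSL node that evaluates its block with instance_eval.

```ruby
# Toy stand-in for a DSL node; NOT part of elasticsearch-dsl.
class ToyFilter
  attr_reader :calls

  def initialize(&block)
    @calls = []
    # Evaluate the block with this object as the implicit receiver.
    instance_eval(&block) if block
  end

  # A DSL-style method that records what it was called with.
  def term(args)
    @calls << { term: args }
  end
end

# Hypothetical helper living outside the DSL block.
module Helpers
  def self.red_clause(node)
    # Without instance_eval, `term` would be looked up on Helpers and fail.
    # With it, `node` becomes the implicit receiver inside the block.
    node.instance_eval do
      term color: 'red'
    end
  end
end

filter = ToyFilter.new { Helpers.red_clause(self) }
p filter.calls
```

Passing self hands the helper the object that owns the DSL methods, and instance_eval is what lets the helper's block call them without an explicit receiver.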

Development

See CONTRIBUTING.


elasticsearch-dsl-ruby's Issues

Nested bool filters not supported?

Example input and output below. Am I missing something here?

search do
  query do
    bool do
      filter do
        bool do
          should do
            match something: "something"
          end
          should do
            match something_else: "something"
          end
          minimum_should_match 1
        end
      end
    end
  end
end.to_hash
{:query=>
  {:bool=>
    {:filter=>
      [
       {:bool=>{:should=>[{}]}}]
      }}}

Support for calendar_interval

interval on date_histogram is being deprecated

Elasticsearch-7.7.1-ad56dce891c901a492bb1ee393f12dfff473a423 "[interval] on [date_histogram] is deprecated, use [fixed_interval] or [calendar_interval] in the future."
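For illustration, the request body that the deprecation message asks for might look like the sketch below, written as a plain Ruby Hash and not with the DSL. The aggregation name per_month and the field published_at are made up for the example.

```ruby
require 'json'

# Plain-Hash sketch; `per_month` and `published_at` are hypothetical names.
body = {
  aggregations: {
    per_month: {
      date_histogram: {
        field: 'published_at',
        calendar_interval: 'month' # used in place of the deprecated `interval` key
      }
    }
  }
}

puts JSON.pretty_generate(body)
```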

[DSL] Support for non-search DSLs?

Thanks for the work on the Elasticsearch::DSL! We're using the update_by_query API, which largely overlaps with the Query DSL, but with the addition of a "script" section. For example:

POST /twitter/_update_by_query
{
  "script": {
    "inline": "ctx._source.likes++"
  },
  "query": {
    "term": {
      "user": "kimchy"
    }
  }
}

Unfortunately the script isn't supported when building a query using the search method:

NoMethodError: undefined method `script' for #<Elasticsearch::DSL::Search::Search:0x00000004ef2500>

Do you have any suggestions for building the body of update_by_query requests using the DSL? Is there a workaround to allow using the DSL to build just the query section, or would we need to wait for other non-search DSLs to be fully implemented?
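One possible workaround, sketched under the assumption that to_hash returns an ordinary Ruby Hash as the README shows: build only the query section with the DSL and merge the script section in by hand. To keep the snippet runnable without the gem, query_part below is a hand-written stand-in for the Hash that a DSL definition with a single term query would be expected to produce.

```ruby
require 'json'

# Stand-in for the Hash the DSL is expected to produce for the query section.
query_part = { query: { term: { user: 'kimchy' } } }

# The script section is not part of the search DSL, so add it as a plain Hash.
body = query_part.merge(script: { inline: 'ctx._source.likes++' })

puts JSON.generate(body)
```

The merged Hash could then be passed as the request body to whichever client is in use.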

[DSL] handle multiple values (array)

Is my understanding correct that if I want to produce

{
    "query": {
        "bool": {
            "must": [
                {
                    "term": {"shape": "round"}
                },
                {
                    "bool": {
                        "should": [
                            {"term": {"color": "red"}},
                            {"term": {"color": "blue"}}
                        ]
                    }
                }
            ]
        }
    }
}

I would have to write:

query do
  bool do
    must do
      term shape: 'round'
    end
    must do
      bool do
        should do
          term color: 'red'
        end
        should do
          term color: 'blue'
        end
      end
    end
  end
end

(notice the multiple must/should blocks)

I didn't receive any error message when putting multiple terms into one must block; it just swallowed the later ones.

Did I do something wrong? Am I misunderstanding something? Is there an "array-like" syntax that I am unaware of?

If this is by design, are there any other differences from the JSON syntax I should be aware of?

Not able to specify missing_value in aggregation

aggregations: { "aggregation_name": { "terms": { "field": "field_name", "size": number, "missing": "field value" } } }

Elasticsearch allows assigning a value to records where the field is missing, but it is currently not possible to generate this query with the DSL.

Inconsistent getters and setters

Throughout the DSL there are many different ways of setting data on a search.

Sometimes it's through a setter:

search = Search.new
search.filter = Filter.new

Sometimes that setter doesn't exist and setting is based on arguments ONLY

search = Search.new
search.sort   # => nil # Only returns the value since no arguments
search.sort {} # Empty block creates the sort object
search.sort # => Sort

Sometimes the setter returns some random internal value (if it's a

search = Search.new
search.filter = Filter.new
#
# This will append the filter but then returns the array inside the filter
search.filter.range(:name) # => [Range()]

And yet other times it's even more confusing:
https://github.com/elastic/elasticsearch-ruby/blob/master/elasticsearch-dsl/lib/elasticsearch/dsl/search/base_component.rb#L56

Basically, the library becomes very hard to use as either a DSL or as an imperative library.

It's hard to use as a DSL because it changes the implicit receiver. That makes using it inside another class very hard, since almost every call to my own methods ends up in method_missing and raises an error.

Using it imperatively is very hard and requires the source code handy because every method call has a totally different, undocumented, argument list and/or a different return value.

How to combine bool query must with should

Something like this:

post_filter do
  bool do
    must do
      terms "status_investigate?": [true]
      bool do
        should { terms "materials.id": opts[:materials].map { |x| x == 'Not specified' ? '' : x } }
        should { terms "styles.id": opts[:styles].map { |x| x == 'Not specified' ? '' : x } }
        should { terms "tags.id": opts[:tags].map { |x| x == 'Not specified' ? '' : x } }
      end
    end
  end
end

Global Aggregations

The following code raises a SystemStackError:

search do
  aggregation :all do
    global
    aggregation :foo do
      terms field: "bar"
    end
  end
end

What I would like to achieve is the following:

"aggregations": {
   "all": {
       "global": {},
       "aggregations": {
           "foo": {
               "terms": {
                   "field": "bar"
               }
           }
       }
   }
}

The exception results from nesting the aggregations. Am I using it wrong or is this just not implemented yet?

I looked further into this: Since the global aggregation is structurally similar to a nested aggregation I thought it should be used like one:

search do
  aggregation :all do
    global do
      aggregation :foo do
        terms field: "bar"
      end
    end
  end
end

But this leads to another error: aggregation is not defined. By comparing the Global class with the Nested class I noticed that Global includes BaseComponent while Nested includes BaseAggregationComponent. Currently I am working around this by monkeypatching the Global class and including BaseAggregationComponent.

I would love to create a pull request if this is the intended way of using the global aggregation.

DSL constantizes & instantiates a class on method_missing

Any time an attr_reader name happens to match a defined class (even a custom user class), the DSL swallows the method_missing call and attempts to construct that class. While it might be difficult to avoid collisions with built-in filter/query class names, it should be possible to avoid this for unsuspecting classes.

Not familiar with this codebase yet, but it seems like this const_defined? might be the issue:
https://github.com/elastic/elasticsearch-ruby/blob/master/elasticsearch-dsl/lib/elasticsearch/dsl/search/query.rb#L28

There may be a better solution, but a class method (e.g. is_elasticsearch_dsl_class?) would do the trick.

require 'elasticsearch/dsl'

class User
  def initialize(*args, **kwargs)
    puts "Initialized User with args=#{args}, kwargs=#{kwargs}"
  end
end

class MyClass
  include Elasticsearch::DSL

  attr_reader :user

  def initialize
    @user = "user_string"
  end

  def build_query_hash
    query = search do
      query do
        user_substr = user[0..5]
        match user: user_substr
      end
    end
    query.to_hash
  end
end

MyClass.new.build_query_hash

yields

Initialized User with args=[], kwargs={}
Traceback (most recent call last):
       10: from /Users/kevinmcdonough/.rvm/rubies/ruby-2.6.5/bin/irb:23:in `<main>'
        9: from /Users/kevinmcdonough/.rvm/rubies/ruby-2.6.5/bin/irb:23:in `load'
        8: from /Users/kevinmcdonough/.rvm/rubies/ruby-2.6.5/lib/ruby/gems/2.6.0/gems/irb-1.0.0/exe/irb:11:in `<top (required)>'
        7: from (irb):59
        6: from (irb):55:in `build_query_hash'
        5: from /Users/kevinmcdonough/.rvm/gems/ruby-2.6.5/gems/elasticsearch-dsl-0.1.9/lib/elasticsearch/dsl/search.rb:267:in `to_hash'
        4: from /Users/kevinmcdonough/.rvm/gems/ruby-2.6.5/gems/elasticsearch-dsl-0.1.9/lib/elasticsearch/dsl/search/query.rb:51:in `to_hash'
        3: from /Users/kevinmcdonough/.rvm/gems/ruby-2.6.5/gems/elasticsearch-dsl-0.1.9/lib/elasticsearch/dsl/search/query.rb:42:in `call'
        2: from /Users/kevinmcdonough/.rvm/gems/ruby-2.6.5/gems/elasticsearch-dsl-0.1.9/lib/elasticsearch/dsl/search/query.rb:42:in `instance_eval'
        1: from (irb):51:in `block (2 levels) in build_query_hash'
NoMethodError (undefined method `[]' for #<User:0x00007ff983854308>)

EDIT: cleaned up language to be clearer.
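The lookup behaviour suspected above can be reproduced in plain Ruby, independently of the gem. In this sketch ToyQueries is a made-up namespace standing in for the DSL's query classes; only Module#const_defined? and Module#const_get from core Ruby are used.

```ruby
# Hypothetical namespace standing in for the DSL's query classes.
module ToyQueries
  class Match; end
end

# An unrelated application class defined at the top level.
class User; end

# Match is defined directly inside the namespace.
p ToyQueries.const_defined?(:Match)

# User is not defined inside ToyQueries, yet the default lookup
# also searches Object when the receiver is a module.
p ToyQueries.const_defined?(:User)

# const_get follows the same rule and resolves to the top-level ::User.
p ToyQueries.const_get(:User).equal?(User)
```

Any top-level class whose name matches a camelized method name can therefore be picked up by a lookup of this kind.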

[elasticsearch-dsl] BaseAggregationComponent#to_hash pollutes @hash

bin/rails c
Loading development environment (Rails 5.2.4.3)
[1] pry(main)> include Elasticsearch::DSL
=> Object
[2] pry(main)> query = search { aggregation(:_filter) { filter(query { match_all }) } }
=> #<Elasticsearch::DSL::Search::Search:0x00007fe7e7ab5b90
 @aggregations={:_filter=>#<Elasticsearch::DSL::Search::Aggregation:0x00007fe7e7ab58c0 @block=#<Proc:0x00007fe7e7ab5938@(pry):2>>},
 @block=#<Proc:0x00007fe7e7ab5a50@(pry):2>,
 @options=#<Elasticsearch::DSL::Search::Options:0x00007fe7e7ab5ac8 @hash={}>>
[3] pry(main)> query.as_json
=> {"aggregations"=>{"_filter"=>{"filter"=>{"match_all"=>{}}}}}
[4] pry(main)> query.as_json
NameError: wrong constant name Empty?
from /Users/alechoey/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/bundler/gems/elasticsearch-ruby-1b3924ff3812/elasticsearch-dsl/lib/elasticsearch/dsl/search/query.rb:41:in `const_defined?'

Also happens with #to_hash.

Running on elasticsearch-dsl 0.1.9

Elasticsearch::DSL filters aggregation does not support the other_bucket or other_bucket_key options

Documented here

[1] pry(main)> include Elasticsearch::DSL
=> Object
[2] pry(main)> search do
[2] pry(main)*   aggregation(:filters) do
[2] pry(main)*     filters do
[2] pry(main)*       filters(issues: { term: { type: 'issue' } }, pull_requests: { term: { type: 'pr' } })
[2] pry(main)*       other_bucket(true)
[2] pry(main)*     end
[2] pry(main)*   end
[2] pry(main)* end
=> #<Elasticsearch::DSL::Search::Search:0x00007ff9b6ff9c60
 @aggregations={:filters=>#<Elasticsearch::DSL::Search::Aggregation:0x00007ff9b6ff9a58 @block=#<Proc:0x00007ff9b6ff9a80@(pry):11>>},
 @block=#<Proc:0x00007ff9b6ff9b70@(pry):10>,
 @options=#<Elasticsearch::DSL::Search::Options:0x00007ff9b6ff9c10 @hash={}>>
[4] pry(main)> _.as_json
NoMethodError: undefined method `other_bucket' for {:filters=>nil}:Hash
from (pry):14:in `block (3 levels) in <main>'

How to use DSL

All of the examples show mixing it into the global scope, which in practice no one would ever do. How do I use the DSL with Rails models, which is probably the typical use case? All of the Rails examples show passing a hash into SomeModel.search({}). In all my searching, I failed to find a concise example that used both elasticsearch-model/rails and elasticsearch-dsl.

DSL AST initialisation recursively searches up namespaces for consts

When the DSL AST initialises nodes, it performs a .const_get on the relevant module. This is a problem, because the lookup walks up the entire module tree and instantiates any constant with a matching name.

For example, in method_missing on Elasticsearch::DSL::Search::Query line 39, it calls const_defined? and const_get on Queries. If you happen to have say, a class defined in the root namespace called Report and you call report, it'll find this class, make a new instance and give it to you. This creates some strangeness with the other conditional, where it defers to binding to allow you to access methods in the outer context.

I am using the DSL to dynamically create queries, often using instances of this model, which I have sensibly named report in the context I am using the DSL. However, when the arity == 0, and I try to access this report, it hits the const_defined? and initializes a new instance of ::Report.

class Report
end

class Example
  include Elasticsearch::DSL
  
  attr_accessor :report
  
  def initialize(report)
    @report = report
  end
  
  def generate_query
    search do
      query do
        # Now anywhere in here report will be a Report.new everytime, instead of deferring to the outer caller's report.
      end
      aggregation :wicked_metric do
        # Similarly here, the same issue arises.
      end
    end
  end
end

I don't believe this is the intended behavior, or if it is intended, perhaps ensuring the found class is within the Elasticsearch::DSL::Search namespace would be sufficient safety.

Queries.const_defined? klass, false

I can just use the arity to prevent the issue, but that's pretty ugly on larger queries and creates a lot of clutter.
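A minimal sketch of the proposed guard, using toy classes only: ToyDSL, its Queries namespace and the Example class below are invented for this illustration and do not mirror the gem's actual code. The second argument false restricts const_defined? and const_get to the namespace itself, and anything not found there is forwarded to the object that created the block.

```ruby
module ToyDSL
  module Queries
    class Match
      attr_reader :args

      def initialize(args = {})
        @args = args
      end
    end
  end

  class Query
    attr_reader :value

    def initialize(&block)
      @block = block
      instance_eval(&block)
    end

    def method_missing(name, *args, &block)
      klass = name.to_s.split('_').map(&:capitalize).join
      if Queries.const_defined?(klass, false) # strict: no lookup in Object
        @value = Queries.const_get(klass, false).new(*args)
      else
        # Defer to the object in whose context the block was written.
        @block.binding.receiver.__send__(name, *args, &block)
      end
    end
  end
end

class Report; end

class Example
  attr_reader :report

  def initialize(report)
    @report = report
  end

  def build
    seen = nil
    query = ToyDSL::Query.new do
      seen = report # reaches Example#report, not ::Report.new
      match title: 'test'
    end
    [query, seen]
  end
end

original = Report.new
query, seen = Example.new(original).build
p seen.equal?(original)
p query.value.class
```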

Initialize an instance of Search from_hash

Hi,

I'm trying to figure out how to initialize an instance of Elasticsearch::DSL::Search::Search from a hash, pretty much a complete reverse of to_hash. The idea is to take the json query provided by the user, convert it to hash, instantiate a Search, and enrich it with some additional clauses and options before issuing a request to Elasticsearch. Of course, I could update the user's hash, but being able to use the DSL seems so much nicer and a lot less clunky - if only it worked. It looks like currently this use case is unsupported - the purpose of this library is to go from Ruby DSL to Elasticsearch hashes, not backwards. However, I can't help wondering if perhaps other people ran into the same problem. Can you offer any help / advice?

Thank you.

Other bucket aggregation support

Hey guys, I'm trying to find a way to use the other bucket with this DSL, and I wonder whether it's possible at all.

other bucket docs

I tried it this way:

aggregation "entity_tag_city" do
  filters do
    terms field: "other_entity_tag_city"
    filters terms: { directory_id: directory_ids } do
      # blah-bla-bla
    end
  end
end

[400] {"error":{"root_cause":[{"type":"parsing_exception","reason":"[directory_id] query malformed, no start_object after query name","line":1,"col":1123}],"type":"parsing_exception","reason":"[directory_id] query malformed, no start_object after query name","line":1,"col":1123},"status":400} (Elasticsearch::Transport::Transport::Errors::BadRequest)

or this way:

aggregation "entity_tag_city" do
  filters "other_entity_tag" => "other_entity_tag",
            terms: { directory_id: directory_ids } do
    # blah-bla-bla
  end
end

[400] {"error":{"root_cause":[{"type":"parsing_exception","reason":"Unknown key for a VALUE_STRING in [entity_tag_city]: [other_entity_tag_city].","line":1,"col":1112}],"type":"parsing_exception","reason":"Unknown key for a VALUE_STRING in [entity_tag_city]: [other_entity_tag_city].","line":1,"col":1112},"status":400} (Elasticsearch::Transport::Transport::Errors::BadRequest)

I even tried to modify the query hash by hand and pass it to the Elasticsearch::Model.search method, like this (only part of the query hash is shown):

"aggregation" => {
  "entity_tag_city" => {
    "filters" => {
      "filters" => {
        "terms" => { "directory_id"=>[6641, 6642, 6643] } 
      },
      "other_entity_tag_city" => "other_entity_tag_city"
    }
  }
}

which looks pretty similar to the example in the Elasticsearch docs. But nothing helps:

Elasticsearch::Transport::Transport::Errors::BadRequest Exception: [400] {"error":{"root_cause":[{"type":"parsing_exception","reason":"[directory_id] query malformed, no start_object after query name","line":1,"col":1166}],"type":"parsing_exception","reason":"[directory_id] query malformed, no start_object after query name","line":1,"col":1166},"status":400}

So I'm wondering whether I can use other buckets with this gem and, if so, what the proper query syntax is. Thanks!
