reyjrar / es-utils

ElasticSearch Utilities

Perl 100.00%

elasticsearch perl

es-utils's Issues

Uses File::Slurp, known to be buggy and vulnerable

e.g. look at https://rt.cpan.org/Ticket/Display.html?id=83126 and be dismayed

File::Slurp::Tiny and Path::Tiny are both excellent alternatives. See also http://shadow.cat/blog/matt-s-trout/mstpan-5/
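For comparison, reading a whole file needs no extra module at all. The sketch below is a minimal core-Perl slurp with an explicit encoding layer and error handling. `slurp_utf8` here is a hypothetical helper name chosen for illustration, not something es-utils provides (Path::Tiny offers a method of the same name).

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read an entire file as UTF-8 text using only core Perl.
sub slurp_utf8 {
    my ($file) = @_;
    open(my $fh, '<:encoding(UTF-8)', $file)
        or die "Cannot open $file: $!";
    local $/;    # undefine the record separator so one read returns everything
    my $content = <$fh>;
    close($fh) or die "Cannot close $file: $!";
    return $content;
}

# Demo: write a temporary file, then read it back in one call.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "cluster: test\nnodes: 3\n";
close($tmp_fh) or die "Cannot close $tmp_name: $!";

my $text = slurp_utf8($tmp_name);
print $text;
```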

scripts/es-copy-index.pl gone

The POD still refers to the copy script. Please clarify why it has gone missing.

No feature for name [_status]

Hi,

The index _status API has been replaced with the index _stats API. I just changed the endpoint in the "es-daily-index-maintenance.pl" script and it works again:

- my $status = es_request('_status',{index=>$index});
+ my $status = es_request('_stats',{index=>$index});

Why Elastijk and not the official API?

Hi,

I have a question about the Elasticsearch API. Why did you choose "Elastijk" and not "Search::Elasticsearch", the official Elasticsearch client?

Best regards,

Current --tail logic is unreliable for multi node clusters or multiple data sources with possible delays

The current --tail logic uses the simple condition: range => {'@timestamp' => {gte => $last_hit_ts}}

Let's assume that there is a cluster of 3 ES nodes, but only one source of data, and that data is presented in @timestamp order. Elasticsearch sends the docs to be stored to different nodes. Those nodes will update the segments with the new docs within refresh_interval, but the update is not synchronized across nodes. So the order in which docs 'become searchable' may not be @timestamp order, and the gte => $last_hit_ts condition is not sufficiently safe. Older docs may be missed because they became 'searchable' after other docs that have a later @timestamp.

A fix for this might be something like:

add the concept of a 'time window', e.g. range => {'@timestamp' => {gte => $last_hit_ts - $time_window}}
record the document ids that were seen in the final time window of the last query
exclude those document ids so they're not shown twice (either by including the ids in an extra NOT condition in the query, or else by checking and discarding duplicate ids in the client)

(The scope of the problem described above is bounded by the value of refresh_interval, but other related situations aren't. Consider the case of multiple sources of data where some might be delayed. For example, we have many machines feeding logs to several logstash servers which feed an ES cluster. Logs are often delayed for at least a few seconds and sometimes for many minutes. Widening the 'time window' described above doesn't scale well to larger time periods or high volumes of log messages. For this case the best approach would be to enable the _timestamp field and use that to drive the tailing logic.)
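The time-window idea proposed above, with duplicates discarded in the client, could be sketched roughly as follows. This is not the script's actual code: `tail_range` and `filter_new_hits` are hypothetical names, and the sketch assumes timestamps are epoch seconds and that each hit is a hashref with `_id` and `ts` keys (real hits carry `@timestamp` strings that would need parsing first).

```perl
use strict;
use warnings;

# Build the range clause, looking back $time_window seconds before the last hit.
sub tail_range {
    my ($last_hit_ts, $time_window) = @_;
    return { range => { '@timestamp' => { gte => $last_hit_ts - $time_window } } };
}

# Drop hits that were already shown, then remember only ids still inside the window.
sub filter_new_hits {
    my ($hits, $seen, $time_window) = @_;
    my @new = grep { !exists $seen->{ $_->{_id} } } @$hits;
    $seen->{ $_->{_id} } = $_->{ts} for @new;

    # The newest timestamp seen so far becomes the next $last_hit_ts.
    my ($last_ts) = sort { $b <=> $a } values %$seen;
    if (defined $last_ts) {
        # Forget ids that the next windowed query can no longer return.
        my @expired = grep { $seen->{$_} < $last_ts - $time_window } keys %$seen;
        delete @{$seen}{@expired};
    }
    return (\@new, $last_ts);
}

my %seen;
my $window = 5;

# First poll: docs a (ts 100) and c (ts 102) are searchable, b (ts 101) is not yet.
my ($new1, $last1) = filter_new_hits(
    [ { _id => 'a', ts => 100 }, { _id => 'c', ts => 102 } ],
    \%seen, $window,
);

# Second poll queries from $last1 - $window, so the late doc b is picked up
# and the repeated docs a and c are discarded.
my ($new2, $last2) = filter_new_hits(
    [ { _id => 'a', ts => 100 }, { _id => 'b', ts => 101 }, { _id => 'c', ts => 102 } ],
    \%seen, $window,
);
print join(',', map { $_->{_id} } @$new2), "\n";
```

The memory cost is bounded by the number of documents inside one window, which is why this approach suits short delays better than the multi-minute delays described in the parenthetical above.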

Logic around already-closed indexes doesn't log cleanly

This applies to version 2.9, which is the last available on RHEL 6.

The following conditional only checks whether it got a status object back, then tries to evaluate the content of it even when it might not be meaningful.

https://github.com/reyjrar/es-utils/blob/release-2.9/scripts/es-daily-index-maintenance.pl#L174

The effect during a normal run is like so:

[root@production-elasticsearch-1 ~]# /usr/local/bin/es-daily-index-maintenance.pl --all --replicas-min 1 --local --pattern logstash-*
Use of uninitialized value in numeric gt (>) at /usr/local/bin/es-daily-index-maintenance.pl line 174.
Use of uninitialized value in numeric gt (>) at /usr/local/bin/es-daily-index-maintenance.pl line 174.
Use of uninitialized value in numeric gt (>) at /usr/local/bin/es-daily-index-maintenance.pl line 174.

Every closed index emits a warning line. Script runs fine, but stderr is really noisy. What it actually gets back from ES during that _status request is:

{"error":"IndexClosedException[[logstash-2014.11.07] closed]","status":403}

Which of course doesn't have a shards element. Probably another defined() call would clean it up.
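A guard along those lines might look like the sketch below. The real check at line 174 is not reproduced here, so `usable_status` is a hypothetical helper and the `shards` key is an assumption taken from the description above. The error hashref mirrors the 403 response quoted in this report.

```perl
use strict;
use warnings;

# Hypothetical helper: true only when the response can safely be compared numerically.
# Assumption: a usable response is a hashref with a defined 'shards' element.
sub usable_status {
    my ($status) = @_;
    return 0 unless defined $status && ref $status eq 'HASH';
    return 0 if exists $status->{error};         # e.g. IndexClosedException
    return defined $status->{shards} ? 1 : 0;    # only then is '>' free of warnings
}

my $closed = { error => 'IndexClosedException[[logstash-2014.11.07] closed]', status => 403 };
my $open   = { shards => 5 };

for my $status ($closed, $open) {
    if (usable_status($status) && $status->{shards} > 0) {
        print "index has shards\n";
    }
    else {
        print "skipping index without shard data\n";
    }
}
```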

es-daily-index-maintenance.pl should also delete closed indexes

Message::Passing::Output::ElasticSearch closes indexes older than 7 days.
Because index_stats doesn't return closed indexes they aren't deleted.
GET /_cluster/state also includes the closed indexes, so you might want to use the cluster_state method instead.
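As a rough illustration of that suggestion, the sketch below picks the closed indexes out of an already-decoded cluster state. `closed_indices` is a hypothetical helper, and it assumes each entry under `metadata.indices` carries a `state` of "open" or "close"; fetching and decoding the response is left out.

```perl
use strict;
use warnings;

# Assumption: $state is the decoded JSON body of GET /_cluster/state.
sub closed_indices {
    my ($state) = @_;
    my $indices = $state->{metadata}{indices} || {};
    my @closed  = grep { ($indices->{$_}{state} // '') eq 'close' } keys %$indices;
    return sort @closed;
}

# Sample data shaped like the relevant part of a cluster state response.
my $state = {
    metadata => {
        indices => {
            'logstash-2014.11.07' => { state => 'close' },
            'logstash-2014.11.14' => { state => 'open' },
        },
    },
};

my @closed = closed_indices($state);
print "$_\n" for @closed;
```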

Apply settings after a few days

Hi,

It would be nice to be able to use the es-apply-settings.pl script to apply parameters after a few days, and not just on the last days.

Best regards,

No handler found for uri [/] and method [PUT]

Hi,

I'm testing "es-copy-index.pl" and I have an issue with Elasticsearch 5.3.

es_request(//) failed[400]: Bad Request
es_request(//) returned HTTP Status Bad Request
Undefined subroutine &main::is_hashref called at ./es-copy-index.pl line 136.

$res returns "No handler found for uri [/] and method [PUT]"

It seems that the problem comes from:

    $res = es_request('/',
        {
            method => 'PUT',
            index => $INDEX{to},
        },
        {
            settings => $to_settings,
            mappings => $mappings,
        }
    );

Do you have this issue?
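The error message says the PUT reached the root URI, whereas Elasticsearch creates an index with PUT /<index>. The sketch below only illustrates that request shape and is independent of es_request(), whose handling of the index option is not shown in this report. `create_index_request` and the localhost URL are hypothetical.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: describe a create-index request with the index name as the path.
sub create_index_request {
    my ($base_url, $index, $settings, $mappings) = @_;
    die "index name required\n" unless defined $index && length $index;
    (my $base = $base_url) =~ s{/+$}{};    # avoid a double slash in the URL
    return {
        method  => 'PUT',
        url     => "$base/$index",
        content => JSON::PP->new->canonical->encode(
            { settings => $settings, mappings => $mappings }
        ),
    };
}

my $req = create_index_request(
    'http://localhost:9200/', 'logstash-2015.10.23',
    { number_of_replicas => 1 }, {},
);
print "$req->{method} $req->{url}\n";
```

The resulting hashref could be handed to any HTTP client. Refusing an empty index name up front avoids silently sending the request to "/" again.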

Fails with 500 and not much else...

[root@salttestvm70 ~]# /usr/local/bin/es-copy-index.pl --debug --from other-host --to localhost logstash-2015.10.23
Failed to create index in localhost (http status = 500): [
   "500",
   {
      "error" : "NullPointerException[null]",
      "status" : 500
   }
]
 at /usr/local/bin/es-copy-index.pl line 75.

That's all the output I get, even though I turned on debugging.

Error using es-daily-index-maintenance.pl

I'm getting the following error when trying to use one of the scripts:

Attempt to reload JSON/XS.pm aborted.
Compilation failed in require at /usr/bin/es-daily-index-maintenance.pl line 14.
BEGIN failed--compilation aborted at /usr/bin/es-daily-index-maintenance.pl line 14

JSON::XS is installed (v2.3.4).

perl -v
This is perl, v5.10.0 built for x86_64-linux-thread-multi
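Perl prints "Attempt to reload ... aborted" when an earlier attempt to load the same module already failed in the same process, so the first failure is usually the informative one. A minimal sketch for surfacing that first error is shown below. `load_error` is a hypothetical helper, and the module list is only an example.

```perl
use strict;
use warnings;

# Hypothetical helper: try to load a module once and return the first
# error message, or an empty string on success.
sub load_error {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # JSON::XS becomes JSON/XS.pm
    return eval { require $file; 1 } ? '' : ($@ || 'unknown error');
}

for my $module (qw(JSON::PP JSON::XS)) {
    chomp(my $err = load_error($module));
    print "$module: ", ($err eq '' ? 'loaded' : "FAILED: $err"), "\n";
}
```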

App::ElasticSearch::Utilities::HTTPRequest to "HTTP::Request::es"
App::ElasticSearch::Utilities::Query to "es::Query"
App::ElasticSearch::Utilities::QueryString to "es::QueryString" (bundled with es::Query)
App::ElasticSearch::Utilities::Connection to "LWP::UserAgent::es"

I'm selecting a lowercase 'es' intentionally. These are not the official elastic modules. These are generic, minimal modules designed to make working with Elasticsearch less of a hassle.
