Comments (4)

grownuphacker commented on July 1, 2024

+1
Just came to add my interest in this. So far, the only approach I have gotten to work is hammering my REST source with the exact same request over and over.

from logstash-filter-http.

vjt commented on July 1, 2024

-1

I don't think that Logstash should have a caching layer: external software (nginx, memcached) already does that well, and it's easy to integrate with Logstash.

I have two use cases for which I am using external caches:

  • Querying an internal API service over HTTP to enrich logs coming from different sources. The information on the API service seldom changes, and Logstash processes hundreds of events per second. I set up an nginx instance listening on localhost that proxies my API service, configured its disk cache, and pointed Logstash to it (see [1] below).
  • Keeping a mapping from client IP to user name. If an incoming log event has both a clientip and a user field, I store the mapping in memcached. If it has a clientip but no user field, I query memcached to enrich the log event (see [2] below).

That said, I see the following advantages in keeping the caching layer external:

  • being able to tune, change the behaviour of, or replace the caching layer altogether without involving Logstash or having to reconfigure it
  • being sure not to lose the cache contents if I need to restart Logstash, and conversely being able to flush the cache without involving Logstash
  • being able to scale out the cache separately from Logstash

Sorry for the verbosity; I hope this is also useful for your use cases.

[1] local caching proxy

proxy_cache_path /srv/cache/foobar levels=1:2 keys_zone=foobar:40m inactive=24h max_size=1g;

server {
  listen localhost:8084;

  access_log off;

  location / {
    proxy_pass            https://foobar;

    proxy_ignore_headers  Cache-Control;

    proxy_set_header      Host foobar.example.org;
    proxy_buffering       on;
    proxy_cache           foobar;
    proxy_cache_key       $uri$is_args$args;
    proxy_cache_valid     200 404 1h;
    proxy_cache_valid     any 5m;
    proxy_cache_lock      on;
    proxy_cache_use_stale error timeout invalid_header updating http_500 http_502 http_503 http_504;

    add_header X-Cache-Status $upstream_cache_status;
  }
}

upstream foobar {
  server foobar.example.org:443;
}

[2] memcached enrichment

# We have a mapping from the event, store it in the cache for usage by other future events.
#
if [clientip] and [user] and [user] !~ '(?:^(?:unauthenticated|_?system|anonymous|\[?unknown\]?)$)' {
  memcached {
    hosts => ["cache-01"]
    namespace => "logstash-ip"
    set => { "[user]" => "%{clientip}" }
    ttl => 86400 # Avoid stale lookups
  }
}

# We don't have a mapping from the event, try to look it up from the cache.
#
if [clientip] and ! [user] {
  # Check the cache
  #
  memcached {
    hosts => ["cache-01"]
    namespace => "logstash-ip"
    get => { "%{clientip}" => "[user]" }
    add_tag => ["user_from_cache"]
  }
}

yaauie commented on July 1, 2024

I envision a two-part solution:

  1. Support for proxies (including https) would be trivial to add, and would allow users to configure a local caching proxy (e.g., Squid Cache) that obeyed all of the semantics and standards of the web and kept that complexity out of our maintenance domain.
  2. A naïve LRU in-memory cache (perhaps around LogStash::Filters::Http#request_http(verb, url, options)) is also possible, if a little more complex. It would spare users of this plugin the overhead of configuring and running the above-mentioned caching proxy, at the cost of breaking some of the semantics (e.g., no upstream cache invalidation) and making the plugin's memory consumption somewhat unpredictable.
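
To make the second option more concrete, here is a rough sketch of what a naïve LRU cache could look like in plain Ruby. It is only an illustration under assumptions, not the plugin's code: NaiveLruCache, max_size and fetch are hypothetical names, and a lambda stands in for request_http so that nothing touches the network. It relies on Ruby hashes keeping insertion order, so the first key is always the least recently used one.

```ruby
# Hypothetical sketch of a naive LRU cache; none of these names exist in
# logstash-filter-http.
class NaiveLruCache
  def initialize(max_size)
    raise ArgumentError, "max_size must be positive" unless max_size.positive?
    @max_size = max_size
    @store = {}        # insertion-ordered: first key = least recently used
    @lock = Mutex.new  # a filter may be called from several worker threads
  end

  # Returns the cached value for key. On a miss, runs the block,
  # caches its result and returns it.
  def fetch(key)
    @lock.synchronize do
      if @store.key?(key)
        value = @store.delete(key) # delete and re-insert to mark as most recent
        @store[key] = value
        return value
      end
    end

    value = yield # do the (slow) real request outside the lock

    @lock.synchronize do
      @store.delete(key)
      @store[key] = value
      @store.shift while @store.size > @max_size # evict least recently used
    end
    value
  end

  def size
    @lock.synchronize { @store.size }
  end
end

# Usage: fake_request is a stand-in for request_http(verb, url, options).
calls = 0
fake_request = lambda do |verb, url|
  calls += 1
  "response for #{verb} #{url}"
end

cache = NaiveLruCache.new(2)
lookup = ->(verb, url) { cache.fetch([verb, url]) { fake_request.call(verb, url) } }

lookup.call(:get, "http://api.example/a")
lookup.call(:get, "http://api.example/a") # served from the cache
lookup.call(:get, "http://api.example/b")
lookup.call(:get, "http://api.example/c") # evicts the entry for /a
puts "upstream calls: #{calls}, cached entries: #{cache.size}"
```

In a real plugin the cache key would have to cover everything that influences the response (verb, URL, and whatever in options matters), and entries would probably need a TTL as well, since, as noted above, nothing here is ever invalidated by the upstream. Two threads missing on the same key at the same time will both hit the upstream; this sketch accepts that rather than holding the lock during the request.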

telune commented on July 1, 2024

I'm adding my vote to this one; it would be ideal for our data enrichment use case. We are currently using the jdbc_streaming filter, but it's a less-than-ideal choice. The perfect choice would be the http filter with caching capabilities, just like the aforementioned jdbc_streaming, only making HTTP calls instead of SQL queries.
