Comments (4)
+1
Just came to add my interest in this. The only approach I have gotten to work is hammering my REST source with the exact same request.
from logstash-filter-http.
-1
I don't think that LogStash should have a caching layer, as there is already external software (nginx, memcached) that does that well and it's easy to integrate them with LogStash.
I have two use cases for which I am using external caches:
- Querying an internal API service over HTTP to enrich logs coming from different sources. The information on the API service changes seldom, and logstash processes hundreds of events per second. I set up an nginx listening on localhost that proxies my API service, configured its disk cache and pointed logstash to it (see [1] below)
- Keeping a mapping of client IP - user name. If an incoming log event has both clientip and user fields, I store it in memcached. If I have a clientip and not a user field, I query memcached to enrich the log event (see [2] below).
That said, I find the following pluses in having the caching layer external:
- being able to tune, change the behaviour or replace altogether the caching layers without involving LogStash or having to reconfigure it
- being sure of not losing the cache contents if I need to restart LogStash; otherwise being able to flush the cache without involving LogStash
- being able to scale out the cache separately from LogStash
Sorry for the verbosity, I hope this is useful also for your use cases.
[1] local caching proxy
    proxy_cache_path /srv/cache/foobar levels=1:2 keys_zone=foobar:40m inactive=24h max_size=1g;
    server {
        listen localhost:8084;
        access_log off;
        location / {
            proxy_pass https://foobar;
            proxy_ignore_headers Cache-Control;
            proxy_set_header Host foobar.example.org;
            proxy_buffering on;
            proxy_cache foobar;
            proxy_cache_key $uri$is_args$args;
            proxy_cache_valid 200 404 1h;
            proxy_cache_valid any 5m;
            proxy_cache_lock on;
            proxy_cache_use_stale error timeout invalid_header updating http_500 http_502 http_503 http_504;
            add_header X-Cache-Status $upstream_cache_status;
        }
    }
    upstream foobar {
        server foobar.example.org:443;
    }
[2] memcached enrichment
    # We have a mapping from the event, store it in the cache for usage by other future events.
    #
    if [clientip] and [user] and [user] !~ '(?:^(?:unauthenticated|_?system|anonymous|\[?unknown\]?)$)' {
      memcached {
        hosts => ["cache-01"]
        namespace => "logstash-ip"
        set => { "[user]" => "%{clientip}" }
        ttl => 86400 # Avoid stale lookups
      }
    }
    # We don't have a mapping from the event, try to look it up from the cache.
    #
    if [clientip] and ! [user] {
      # Check the cache
      #
      memcached {
        hosts => ["cache-01"]
        namespace => "logstash-ip"
        get => { "%{clientip}" => "[user]" }
        add_tag => ["user_from_cache"]
      }
    }
I envision a two-part solution:
- Support for proxies (including https) would be trivial to add, and would allow users to configure a local caching proxy (e.g., Squid Cache) that obeyed all of the semantics and standards of the web and kept that complexity out of our maintenance domain.
- A naïve LRU in-memory cache (perhaps around LogStash::Filters::Http#request_http(verb, url, options)) is also possible if a little more complex, and would reduce the overhead of a user of this plugin configuring and running the above-mentioned caching proxy, at the cost of breaking some of the semantics (e.g., no upstream cache invalidation) and some unpredictability in the plugin's memory consumption.
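To make the second option concrete, here is a minimal sketch of what a naïve LRU could look like in plain Ruby. NaiveLruCache and its fetch method are hypothetical names invented for illustration; nothing here is part of logstash-filter-http, and how it would actually be wrapped around request_http is left open. It relies only on the fact that Ruby hashes preserve insertion order, and it deliberately has no TTL and no upstream invalidation, which is exactly the trade-off described above.

```ruby
# Hypothetical helper for illustration only; not part of logstash-filter-http.
class NaiveLruCache
  def initialize(max_size)
    raise ArgumentError, "max_size must be positive" unless max_size.positive?
    @max_size = max_size
    @store = {}        # Ruby hashes keep insertion order: oldest entry first
    @lock = Mutex.new  # assumption: the filter may run on several pipeline workers
  end

  # Returns the cached value for key; on a miss, runs the block and caches its result.
  def fetch(key)
    @lock.synchronize do
      if @store.key?(key)
        value = @store.delete(key)  # delete and re-insert marks the key as most recently used
        @store[key] = value
        return value
      end
    end

    value = yield  # the expensive call runs outside the lock

    @lock.synchronize do
      @store.delete(key)
      @store[key] = value
      @store.shift while @store.size > @max_size  # evict least recently used entries
    end
    value
  end

  def size
    @lock.synchronize { @store.size }
  end
end

# Usage sketch: the block stands in for the real HTTP request.
response_cache = NaiveLruCache.new(1000)
request_key = ["get", "http://localhost:8084/example"]  # made-up URL
2.times do
  response_cache.fetch(request_key) { "response body" }  # the block runs only on the first pass
end
```

Bounding the cache by entry count keeps the code short but says nothing about the size of each cached response, so memory use still depends on what the upstream returns.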
I add my vote on this one; it would be ideal for our data enrichment use case. We are now using the jdbc_streaming filter, but it's a less-than-ideal choice. The perfect choice would be the http filter with caching capabilities, just like the aforementioned jdbc_streaming, only making HTTP calls instead of SQL queries.