Comments (5)
Elasticsearch and Logstash treat names with dots in them differently. For data indexed in Elasticsearch 5.0 and newer, the JSON objects {"a": {"b": "c"}} and {"a.b": "c"} are equivalent. In Logstash, [a][b] and a.b are two different fields, which is what you're hitting here.
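To make the distinction concrete, here is a small sketch of how the two shapes would be referenced in a Logstash conditional. It assumes the bracketed form [a.b] addresses a single top-level key literally named "a.b"; the tag names are made up for illustration.

```
filter {
  # Matches only when the event holds a nested object: {"a": {"b": "c"}}
  if [a][b] == "c" {
    mutate { add_tag => [ "nested_a_b" ] }     # hypothetical tag name
  }

  # Matches only when the event has one top-level key named "a.b": {"a.b": "c"}
  if [a.b] == "c" {
    mutate { add_tag => [ "flat_a_dot_b" ] }   # hypothetical tag name
  }
}
```

An event produced by a grok capture named a.b would only be expected to hit the second branch.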
It is preferable to index events in the form {"a": {"b": "c"}} (so [a][b] in Logstash), mainly because that form is much easier to work with in other languages. For example, in Go you can use nested structs to represent different ECS field groups.
To work around the problem with the grok filter, you can use a.b or a_b as the capture name in the regex and then use mutate to rename the field to [a][b].
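A minimal sketch of that workaround, using the sample message from this thread. The intermediate field name event_action is an arbitrary choice for illustration; the underscore variant is used here because it avoids any ambiguity about how a dotted name is resolved.

```
filter {
  grok {
    # Capture into a flat field first; the inline named capture uses a
    # plain name instead of the [event][action] bracket notation.
    match => { "message" => "(?<event_action>Request) violations: %{GREEDYDATA:[f5][dcc][violations][blocked]}" }
  }
  mutate {
    # Move the flat field into the nested object form.
    rename => { "event_action" => "[event][action]" }
  }
}
```

After the rename, a conditional such as if [event][action] == "Request" should see the nested field.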
from ecs.
I ran Logstash with this configuration file:
input {
  generator {
    lines => [ "Request violations: blocked" ]
    count => 1
  }
}
filter {
  grok {
    match => { "message" => "((?<event.action>Request) violations: %{GREEDYDATA:[f5][dcc][violations][blocked]})?" }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
and with --log.level=debug to see the expanded regular expression:
[2018-07-09T18:26:05,887][DEBUG][logstash.filters.grok ] Grok compiled OK {:pattern=>"((?<event.action>Request) violations: %{GREEDYDATA:[f5][dcc][violations][blocked]})?", :expanded_pattern=>"((?<event.action>Request) violations: (?<GREEDYDATA:[f5][dcc][violations][blocked]>.*))?"}
[2018-07-09T18:26:07,171][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
{
    "host" => "ad8a32273295",
    "@version" => "1",
    "message" => "Request violations: blocked",
    "f5" => {
        "dcc" => {
            "violations" => {
                "blocked" => "blocked"
            }
        }
    },
    "event.action" => "Request",
    "sequence" => 0,
    "@timestamp" => 2018-07-09T18:26:06.128Z
}
The expanded GREEDYDATA grok pattern contains its pattern name as a prefix of the field name. I modified the regex part of your pattern to include a pattern name prefix in the same way:
filter {
  grok {
    match => { "message" => "((?<REQUEST:[event][action]>Request) violations: %{GREEDYDATA:[f5][dcc][violations][blocked]})?" }
  }
}
The expanded regular expression looks good and the [event][action] field is set properly:
[2018-07-09T18:32:56,692][DEBUG][logstash.filters.grok ] Grok compiled OK {:pattern=>"((?<REQUEST:[event][action]>Request) violations: %{GREEDYDATA:[f5][dcc][violations][blocked]})?", :expanded_pattern=>"((?<REQUEST:[event][action]>Request) violations: (?<GREEDYDATA:[f5][dcc][violations][blocked]>.*))?"}
[2018-07-09T18:32:57,903][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
{
    "message" => "Request violations: blocked",
    "@version" => "1",
    "sequence" => 0,
    "@timestamp" => 2018-07-09T18:18:56.896Z,
    "host" => "ad8a32273295",
    "event" => {
        "action" => "Request"
    },
    "f5" => {
        "dcc" => {
            "violations" => {
                "blocked" => "blocked"
            }
        }
    }
}
I hope it helps you.
This confusion comes from differences between the platforms grok runs on: Logstash uses square brackets for field references and ingest node uses dots, so Kibana's Grok Debugger uses dots as well.
The solution that creates the least friction is for one of the two (or both) to support both notations.
On the Logstash side this has been proposed already and could likely be done without any breaking changes.
It seems Logstash treats fields differently depending on whether they are defined with [][] or . notation.
For example, take this grok pattern for an optional list of request violations:
((?<event.action>Request) violations: %{GREEDYDATA:f5.dcc.violations.blocked}. )?
Because using [][] notation in the regex capture makes Logstash fail, I have to use . notation (see event.action).
But when I later create conditionals such as these:
if [event][action] == "Request" {
  mutate {
    replace => { "[event][action]" => "Request passed obj" }
    add_tag => "grok_f5_dcc_test2"
  }
}
if "%{[event][action]}" == "Request" {
  mutate {
    replace => { "[event][action]" => "Request passed var obj" }
    add_tag => "grok_f5_dcc_test3"
  }
}
if "%{event.action}" == "Request" {
  mutate {
    replace => { "[event][action]" => "Request passed var dot" }
    add_tag => "grok_f5_dcc_test4"
  }
}
All the above conditionals fail somehow... So how should I do a named regex capture to an object if I can't use [][]? For the record, the field event.action is indexed correctly to Elasticsearch.
Also tried with:
((?<[event][action]>Request) violations: %{GREEDYDATA:f5.dcc.violations.request}. )?
((?<\[event]\[action]>Request) violations: %{GREEDYDATA:f5.dcc.violations.request}. )?
((?<\\[event]\\[action]>Request) violations: %{GREEDYDATA:f5.dcc.violations.request}. )?
But those make my Logstash fail to execute the action. I found this issue: logstash-plugins/logstash-filter-grok#66, which describes exactly my problem. I'm not sure how I can continue migrating my F5 grok patterns to objects with this issue open. Does anyone know a workaround?
Does everyone here agree that this issue in ECS can be closed, in favour of the one in the Logstash repo? :-)