elastic / ecs

Elastic Common Schema

Home Page: https://www.elastic.co/what-is/ecs

License: Apache License 2.0

Makefile 1.41% Python 92.73% HTML 1.95% Jinja 3.91%

ecs's Introduction

Elastic Common Schema (ECS)

The Elastic Common Schema (ECS) defines a common set of fields for ingesting data into Elasticsearch. A common schema helps you correlate data from sources like logs and metrics or IT operations analytics and security analytics.
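
For illustration only, here is a minimal sketch of what a single event might look like once its fields are mapped to ECS names. The values are made up, and the exact set of fields always depends on the source:

  {
    "@timestamp": "2018-08-07T22:51:09.000Z",
    "message": "connection rejected",
    "event": { "category": "network", "action": "reject" },
    "source": { "ip": "10.12.149.61", "port": 51779 },
    "destination": { "ip": "10.12.1.138", "port": 443 },
    "host": { "name": "fw01", "os": { "family": "linux" } }
  }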

ECS Donation to OpenTelemetry

In April 2023, OpenTelemetry and Elastic made an important joint announcement. In this announcement, we shared our intention to achieve convergence of ECS and OTel Semantic Conventions into a single standard maintained by OpenTelemetry.

Special guidance is provided during the donation period. Please review the contribution guide.

Documentation

The ECS reference is published on the main Elastic documentation website.

Visit the official ECS Reference Documentation.

Getting Started

Please review the tooling usage guide to get started using the tools provided in this repo.

Contributing

If you're looking to contribute to ECS, you're invited to look at our contribution guide. Substantial changes to ECS are completed through our RFC process.

Generated artifacts

Various kinds of files or programs can be generated based on ECS. You can learn more in generated/README.md.

Releases of ECS

The main branch of this repository should never be considered an official release of ECS. You can browse official releases of ECS here.

The ECS team publishes improvements to the schema following Semantic Versioning. Generally, major ECS releases are planned to align with major Elastic Stack releases.

License

This software is licensed under the Apache License, version 2 ("ALv2"), quoted below.

Copyright 2018-2021 Elasticsearch https://www.elastic.co

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

ecs's People

Contributors

adriansr, ajosh0504, alexanderwert, andrewkroh, andrewstucki, bhapas, chrisberkhout, dainperkins, dependabot[bot], djptek, ebeahan, felixbarny, ferozsalam, graphaelli, jonathan-buttner, kaiyan-sheng, karenzone, kgeller, marc-gr, marshallmain, mikepaquette, mitodrummer, peasead, ruflin, rw-access, rylnd, tacklebox, taylor-swanson, trinity2019, webmat

ecs's Issues

Mapping issue with source field when used with Filebeat custom input

Hello,

Similar issue to the one I had with the Logstash udp input adding a host field. Filebeat seems to add a source field by default when using a custom prospector / input.

For example:

filebeat.inputs:
- paths: ["/usr/local/nagiosxi/var/components/auditlog.log"]
  ignore_older: 3h
  fields_under_root: true
  fields.dig.app.name: "Nagios XI"
  fields.dig.app.type: "Audit Logs"
  pipeline: "filebeat-nagios-audit"
  processors:
    - drop_event:
        when.not.regexp:
          message: '^\d{4}.*'

Results in:

org.elasticsearch.index.mapper.MapperParsingException: object mapping for [source] tried to parse field [source] as object, but found a concrete value

when using source.ip in the used ingest pipeline. Is this already a known issue?
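
One workaround, sketched here rather than an official recommendation: rename the conflicting concrete source value (the log file path that Filebeat adds) at the start of the ingest pipeline, before the pipeline writes source.ip. The rename processor below moves it to log.file.path, the name ECS later settled on for the originating file; any other non-conflicting field would work too:

    {
      "rename": {
        "field": "source",
        "target_field": "log.file.path",
        "ignore_missing": true
      }
    }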

New field "time"

For at least IIS, Apache2, and IIS DHCP logs I need a time field. Just event.duration is not enough, because it does not describe what has taken what time.

I propose to create "time" (or similarly named) fields like this:

  • time.taken_seconds
  • time.taken_milliseconds
  • time.duration_seconds
  • time.duration_milliseconds

So for IIS and Apache2 logs I could use http.response.time.taken_milliseconds, for DHCP I could use dhcp.lease.time.duration_seconds, ...
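
For reference, event.duration in ECS is defined in nanoseconds, so a request that took 2.5 seconds would be expressed as below. This is a sketch with made-up values; it does not address the naming concern above, only how the existing field is populated:

  {
    "event": { "duration": 2500000000 },
    "http": { "response": { "status_code": 200 } }
  }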

Session object

So I need quite a few session-related fields for my F5 project. Do I place them in my new f5 object, or does session deserve its own tlf (top level field ;) )?

For example:

        "f5": {
            "session": {
              "properties": {
                "id": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "key": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "value": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "bytes_in": {
                  "type": "long"
                },
                "bytes_out": {
                  "type": "long"
                },
                "client_ip": {
                  "type": "ip"
                },
                "deleted_reason": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "listener": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "location": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "vip_ip": {
                  "type": "ip"
                }
              }
            }
        }

Discussion Topic: Verbs

We're attempting to adapt our feeds to ECS to try it out. So far, so good when it comes to identifying the Nouns in the events, but we're struggling with the Verbs.

If a firewall (covered) blocks traffic from some source IP (covered) to some dest IP (covered) on such and such ports (all covered) at some time (covered), I get to what happened: REJECT

Where does that go...

I see the http object now and I'm thinking, maybe we could abstract that and use it for other services? Maybe service is that abstraction?

{ 
  "service": {
    "name": "httpd",
    "verb":  404,
   ...
}

{ 
  "service": {
    "name": "firewall",
    "verb": "reject",
   ...
}

That doesn't feel exactly right... but I feel like we need something like that. Anyone else thinking about that?

Whether it was an authentication failure (or success) for ssh, or a firewall blocking something, or whatever... something is happening in those logged messages. I want to capture that part.

I feel like making tags for every service is the wrong path, I'm not sure what to do. :-\

Thanks for listening anyway :D
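
For reference, the direction ECS ultimately took for the "verb" is the event fields rather than a per-service abstraction: event.action carries the action captured by the event, and event.category classifies it. A firewall reject could be sketched roughly as follows (illustrative values only):

  {
    "event": { "category": "network", "action": "reject" },
    "source": { "ip": "1.2.3.4" },
    "destination": { "ip": "6.7.8.9", "port": 443 }
  }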

Correct place to put syslog program

Hello,

Where exactly should a syslog program field be put? In the past I put it in syslog_program. Should it be:

  • event.category
  • event.module
  • event.type

For example, F5 has following syslog programs:

  • apd
  • dcc
  • tmm
  • http
  • ssh

Or would an additional field be appropriate for syslog programs? Or should I put it in the f5.* object I'm creating?

Grtz

Willem

Log content-related fields should not be at the first level

I am currently trying to use some ECS fields for messages sent by Filebeat and parsed by Logstash into several fields. It looks very messy to mix device or metadata fields like "host", "device", ... with log content fields like "url.". I propose that all log content (the parsed log line/message) should be nested the same way. So instead of having "url." or something like "iis.url.", there should be something like "logdata.url", or something like "meta.log.offset", ... and "data.log.url.path", ... or just "log." and rename the current "log" field, because it is also a little bit confusing with event.

So instead of having:
source.ip
client.ip
...

It would be nice to have something like:
source.ip, or event.source.ip or ...
log.client.ip

So for everybody it is clear that client.ip is related to the log content, not to the originator of the log or ...

Rename geoip to location

The current geoip fields (https://github.com/elastic/ecs#geoip) are inspired by the geoip ingest processor from Elasticsearch (https://www.elastic.co/guide/en/elasticsearch/plugins/6.2/using-ingest-geoip.html). In recent discussions around geo fields it became clear that not all geo information necessarily comes from an IP address, and there is also geo / location information which cannot necessarily be extracted from an IP.

I would like to suggest renaming the geoip fields to location. I was also thinking of renaming it to geo, but I think location is more specific.

Geoip for both source and destination

I do geoip fields for both my source and destination IPs. Would it be useful to include a separation between these two within ECS? I found it hard to differentiate which geoip is internal and which one is external without it. Or should this just be a specific schema for my particular use case?

Thanks!

ecs lowercase normalizer for host.name

For a year or so I've been using a lowercase normalizer on some fields, which saves me several headaches. The root cause is that Windows tends to use uppercase hostnames while Linux uses lowercase hostnames. It's impossible to know before you search whether you need to search lower- or uppercase.

Would it be an idea to add a default lowercase normalizer on certain ecs fields such as host.name (and maybe some other fields)? This way the original hostname is saved, but at least people can search for them in lowercase.

Example of my override of the default packetbeat template:

PUT _template/packetbeat-overrides
{
  "order": 2147483646,
  "index_patterns": "packetbeat-*",
  "settings": {
    "analysis": {
      "normalizer": {
        "normalizer_lowercase": {
          "type": "custom",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "beat": {
          "properties": {
            "name": {
              "type": "keyword",
              "ignore_above": 1024,
              "normalizer": "normalizer_lowercase"
            },
            "hostname": {
              "type": "keyword",
              "ignore_above": 1024,
              "normalizer": "normalizer_lowercase"
            }
          }
        }
      }
    }
  },
  "aliases" : {
    "packetbeat" : {},
    "beats-packetbeat" : {}
  }

}

Proposal - host.os.kernel

I propose adding a field for reporting the OS kernel version under host.os.kernel.

Example values:

  • linux: 4.4.0-112-generic
  • windows: 6.3.9600.19000 (winblue_ltsb.180410-0600) (taken from the FileVersion value on ntoskrnl.exe)
  • darwin: 16.7.0
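
A minimal mapping sketch for the proposed field; keyword seems the natural type, since kernel versions are not purely numeric:

        "host": {
          "properties": {
            "os": {
              "properties": {
                "kernel": {
                  "ignore_above": 1024,
                  "type": "keyword"
                }
              }
            }
          }
        }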

Making all objects reusable

I have seen a lot of use cases for using host and other objects inside other ones (also mentioned in a couple of issues here). I think all or almost all objects should be designed to be reusable. Just now I found another field name which could cause a collision in the future:

user_agent.device, .. vs. the device object. In user_agent the device is a keyword, but it should be renamed to user_agent.device.name or user_agent.device_name, so everyone could put device fields inside user_agent. Maybe not the best example with user_agent, but if anybody sees device, it should be related to the device.* object. Like host.name vs hostname ...

If that is not possible in all cases, the documentation should clearly mention it. For each field group there should be info on whether it is meant to be only top level or reusable at any level.

Allow the use of underscores in place of periods in the field names for Elastic Common Schema

I am currently using Elastic Search without Logstash. The data parsing is being done through Apache Beam and the data is being written to several other data sources including Elastic and an SQL database.

However, the issue I face is that the "." is a protected keyword in any SQL database. And since Beam is already writing data in JSON format, it is too expensive to output two different formats of the same data set. Since this data is being forwarded to multiple sources, is it possible to conform to a more common schema naming convention?

That being said, is it possible for the Elastic Common Schema to use "_" and "." interchangeably, if not replace "." with underscore completely?

Enhanced TLS certificate metadata

The current specification only declares a handful of TLS certificate fields. Additionally, these fields are centered around an active connection, not so much around TLS metadata. For instance, tls.servername specifies that it be the server name requested by the client. If the cert is a wildcard cert, there is no place in the schema for that.

There is a certificates field, but that is somewhat under-defined. Should the certs be x.509 PEM or DER encoded? I think that description needs to be tightened up as well.

I propose that we add the full list of x.509 fields to ECS, and make it clear that tls.* should be exclusively what's in the certificate(s), not the context around the given connection.

Furthermore, I think we should remove the tls.servername field, destination.hostname should suffice for use in those situations.

Connection specific info field

Hello,

I often need to know (not only) the source IP address of the agent sending the forwarded logs to Logstash (the agent is Beats or some syslog sender in some device or ...). Usually the logs contain information about some connections (web proxy logs, firewall logs), so the "source" and "destination" fields should contain info about the logged connections inside the logs themselves. The field "host" can identify the originator of the logs (for example a switch device). But I also need to know the IP of the log forwarder, so in what field should I put it? Maybe there should be another field named something like "conn" for connection-related info.

Consider following scenario: ORIGINATOR DEVICE -> SYSLOG FORWARDER -> LOGSTASH
I think in "host.ip" I should put the ORIGINATOR DEVICE IP, in "source." and "destination." I should put some part of the log itself, and in "conn" I should put the IP of the SYSLOG FORWARDER, i.e. the source IP of the connection made from the SYSLOG FORWARDER to LOGSTASH.
The same applies for forwarded windows events logs and more ...

Inconsistent usage of .raw, and discussion about .raw vs .keyword naming

This issue is about two distinct problems that I think are closely related, hence the creation of a single issue.

Inconsistent usage of .raw

We currently have 7 fields ending in .raw in ECS. Here they are, listed with their type, and the "meaning" of .raw in their context:

Field Name            Type      Meaning of .raw
event.raw             keyword   Original value, prior to parsing
file.path.raw         keyword   Nested field of a multi-field
file.target_path.raw  keyword   Nested field of a multi-field
url.href.raw          keyword   Nested field of a multi-field
url.path.raw          keyword   Nested field of a multi-field
url.query.raw         keyword   Nested field of a multi-field
user_agent.raw        text      Original value, prior to parsing

In this list, the middle 5 are following one of the conventions for multi-field: text indexing for the top level field (e.g. file.path) and keyword indexing for the nested field (e.g. file.path.raw).

The other two are not following that convention:

  • user_agent.raw is the one breaking from the convention the most. It's not part of a multi-field, and user_agent.raw is actually of type text, not type keyword. Meaning a user could not use this field for aggregations, as opposed to what the .raw convention establishes. It's named .raw because it's the full user agent string, prior to breaking it up into name, OS, version fields & so on.
  • event.raw happens to be of type keyword which is good. But it's not actually part of a multi-field. It just means the original message is stored there.

Naming of the nested field for multi-field

I'm not 100% clear on the exact timeline. But my experience with Elasticsearch in monitoring pipelines from time immemorial has been using the .raw nomenclature for the name of a sub-field of type keyword. Since about Stack v5 (or perhaps ES 2.x and Kibana 4.x), I've seen the naming shift to using .keyword for the nested field, instead of .raw.

Given the inconsistency I've outlined in the first part of this issue, I wonder if we shouldn't move to the new naming convention of using .keyword for multi-field, and having the freedom to use .raw for fields where we actually mean an original value.

If we decide to stick with the .raw naming convention for multi-field, I think we should address the event.raw and user_agent.raw inconsistencies, however. Perhaps rename them event.original and user_agent.original or something to that effect.

I don't have a strong preference for .keyword or .raw, but I think we need to address the inconsistency. I was curious what folks think about this.
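
For reference, the .keyword multi-field convention discussed above looks like this in a mapping (a sketch; url.path is just an example field):

        "url": {
          "properties": {
            "path": {
              "type": "text",
              "norms": false,
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 1024
                }
              }
            }
          }
        }

With this layout, url.path is analyzed for full-text search and url.path.keyword is used for aggregations and exact matching.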

Expected locations for url fields in http requests/responses

The docs say URL can be reused under other prefixes. According to #83 it is actually a top level item. As url can be part of both the http request and response (in case of redirects), shouldn't the docs mention this use case if the url is expected to be used this way? If not, what are the correct locations for url objects?

event.risk_score

Hello,

Could someone please elaborate on the event.risk_score field?

Risk score value of the event. float

In what format should the values be? What range? A score from 1 to 10? 1 to 100? With decimals?

I will need to assign a risk score to F5 related security events.

Grtz

Willem

How to share schemas used with fields not in ECS

Most implementations which use ECS have ECS as the basic fields but have their own fields on top. As inspiration for which fields could be added to ECS, and as inspiration for other users, it would be interesting if, in the context of ECS, people could share their used schemas, for example with F5.

A current example we did with Auditbeat data and the hash prefix can be found here:

#- name: hash
# group: 3
# description: >
# Hash fields used in Auditbeat.
#
# The hash field contains cryptographic hashes of data associated with the event
# (such as a file). The keys are names of cryptographic algorithms. The values
# are encoded as hexidecimal (lower-case).
#
# All fields in user can have one or multiple entries.
# fields:

These fields are currently listed in use cases but commented out. A better solution is needed. One idea would be to have these use cases with the complete set of fields so a user can contribute them, but with all fields which are not part of ECS listed separately. The two things I worry about here are that creating the fields.yml is sometimes too much overhead and sharing just a JSON would be easier, and that people might get confused about what is part of ECS and what is not.

Any ideas are more than welcome.

Tracking Index Time vs. Event Time

Hello,

I love this initiative and I wanted to start a dialog about time.

We have some business processes that hinge on the time elasticsearch indexes an event, and others that hinge on when the event occurred, so we want to track both. The following examples all have the same flavor, but:

  • Index time facilitates alerting by allowing you to do a query every 1m for events that were indexed now-1m. We have some sources that take over an hour to get to us. It's very hard to know how far back you have to look if you're using the event time. You have to do now-2h and then keep track of whether you've already alerted on something.

  • The time gap can be variable, too. If you want to know if one of your twenty pipelines has died, tracking index time is the best way. If you're using event time and there's a twenty to forty minute gap, it's hard to know when it stops working. Maybe there's another way to do that per feed, but tracking index time makes it trivial.

  • The delta between occurrence and indexing time gives you a nice metric for how smoothly your ingest pipeline is running. You can watch the gap in timelion, set up alerts, etc.

Anyway, assuming I've convinced you it's valuable to know index time, what can we do? Most people map the time the event happened into the @timestamp, as I think the description for ECS @timestamp field says. (You have "generated" vs. "read" but don't say who's doing the reading.)

I can map timestamps from our data sources into something like event.timestamp to track when they occurred. I could see adding something to a pipeline that adds an index.timestamp or something just before indexing to track index time. It wouldn't be totally accurate but close enough. Leaving @timestamp blank gives you an indexed time correctly and automatically, including handling timeouts and errors automatically (if it fails to index, there's no @timestamp) but the name of the field is kind of generic.

I would love to leave @timestamp blank, but it kind of goes against common practice. Any thoughts?

Thanks,
Dave Jaccard
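
One way to capture index time without touching @timestamp is a final-stage ingest processor that stamps the ingest node's clock into a dedicated field. A sketch follows; the field name event.ingested is one option (ECS later added a field with exactly that name):

PUT _ingest/pipeline/add-index-time
{
  "description": "Record the time the event was processed at ingest",
  "processors": [
    {
      "set": {
        "field": "event.ingested",
        "value": "{{_ingest.timestamp}}"
      }
    }
  ]
}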

Missing host fields

host.containerized and host.os.codename seem to be missing from the host object, while they are added by the new:

- add_host_metadata:
    netinfo.enabled: true

processor.

Device or host address / location info

Hello,

We need to add location info, such as street, number, code and city, in separate fields. I'm not sure where this info should fit: under host, device, or maybe geoip?

Grtz

Willem

Expose keyword length in the CSV and doc

When adding new fields to the schema we can impose length limits on the string, but the value of the limit is only exposed in the YAML and the generated templates.

As a producer of events, I think it would be appropriate to expose that value in the documentation and the CSV for easy reference and also reduce the risk of sending data that will be truncated.

Clarify use of hostname, subdomain, domain in source/destination

It's not clear to me how to populate the hostname, subdomain, and domain fields of source / destination. More detailed descriptions of each field are needed with examples.

It would probably be helpful to establish some terminology that could be used in clarifying the descriptions.

Terms:

  • FQDN: fully qualified domain name
  • TLD: top level domain (e.g. .com, .net, .bmw, .us) [list of TLDs]
  • eTLD: effective top level domain (e.g. .com, .co.uk and pvt.k12.wy.us) [these get determined with the help of the public suffix list]
  • eTLD+1: effective top level domain plus one level (e.g. example.com, example.co.uk)
  • SLD: second level domain (e.g. co is the SLD of www.example.co.uk)

Examples showing the mappings of these FQDNs to ECS would probably be sufficient to clarify the topic for me.

  • example.com
  • www.example.com
  • www.example.co.uk

Logstash has a TLD filter that uses similar field names, possibly(?) with different meanings.

Propose new top-level prefix 'related'

What I really like about ECS is the well-defined semantics of the data model. I understand we're still evolving what that means, of course. One downside to the way that Kibana is structured to analyze event-based, time-based data is that it's impossible to pivot across fields that may have the same content, but different field names. Of course, this is one item that ECS aims to unify, but semantics have to come first.

I'd like to propose a new top-level object called related. This object will hold fields that are lists of necessary types, which allows a single-click pivot to related records where the data may exist in different fields. Some examples:

  • related.ip: List of related IPs (IPv4 or IPv6)
  • related.hostname: List of related DNS hostname
  • related.id: List of related event IDs
  • related.hash: List of all hashes listed in the event

We do this today in RockNSM, but under the field names like @meta.related_id, @meta.related_ip, etc. I'm in the process of migrating this over to ECS and I think it would be especially useful to have this across other datasets.

A tangible example: Bro files log.

The bro files log has a unique identifier for a given analyzed file. Usually, a file was analyzed as part of one or more network data streams. An extreme, but not uncommon example, is a file transferred over the bittorrent protocol. In this case, bro tracks a single file that was transferred from and to many, many hosts. In this case, the following transformation captures the relevant pivotable data:

  • fuid: copy to event.id and related.id
  • tx_hosts: copy to related.ip
  • rx_hosts: copy to related.ip
  • conn_uids: copy to related.id
  • md5: copy to related.hash
  • sha1: copy to related.hash
  • sha256: copy to related.hash
  • sha512: copy to related.hash

Of course, data needs to be moved around to make the rest of it conformant, but now we can pivot to the related connection-oriented logs and events from other data sources, that perhaps match hashes.
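
A sketch of how the copy could look in an ingest pipeline, using the append processor so that multiple values accumulate in one array. The left-hand fields are the Bro originals and are assumed to be present and scalar; array-valued Bro fields such as tx_hosts would need a small script processor instead:

    {
      "append": {
        "field": "related.ip",
        "value": ["{{tx_hosts}}", "{{rx_hosts}}"]
      }
    },
    {
      "append": {
        "field": "related.hash",
        "value": ["{{md5}}", "{{sha1}}", "{{sha256}}"]
      }
    }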

host object considerations

Hello,

Some fields I'd like to see added for the host object:

host.description
host.cluster
host.contact.domain (group of persons / users)
host.contact.person (or user)
host.customer (customer for this host)
host.domain (domain this servers has joined, eg ad domain)
host.engineer (engineer responsible for this host)
host.environment (QA/DV/PR/DR/ST)
host.installation_date
host.os.build_nr (eg 10.0.14393) (different from host.os.version, which would be Server 2016 in this case)
host.os.service_pack
host.site.primary

Those are just some fields I will be using. I'll leave it up to Elastic to decide which ones they think could be useful to add to ECS.

Grtz

Willem

Making clear about to what device is host object related to

A lot of issues here (including mine) need to somehow implement the following:

  • What is the network peer host name I received logs from?
  • What device originated the event?
  • What device transmitted it?

I propose to make some unification about it by prefixing the host field with "device.SOURCENAME", where SOURCENAME is:

  • originator - the generator of the event (Filebeat)
  • collector - who received it (Logstash)
  • relay - who relayed it to another relay or collector ... so maybe relay0, relay1, ...

Also there should be a peer object, so device.collector.peer.host.ip means the IP of the TCP session peer from which, for example, Logstash received the event (it can be different from device.relay.host.ip or even device.originator.host.ip).

url method

Is there already a place for GET, POST, HEAD, PUT? I was looking for something like url.method. I checked the http object and the url object, but didn't find anything suitable at first sight.

I hope it's not meant to be in event.action, because for me event.action is already occupied by the action done on the event (request blocked or passed).

For example

<155>Jul 10 12:32:17 slot1/acff5 err dcc[13819]: 01310039:3: [SECEV] Request violations: Attack signature detected. HTTP protocol compliance sub violations: N/A. Evasion techniques sub violations: N/A. Web services security sub violations: N/A. Virus name: N/A. Support id: 629814446755358496, source ip: 10.12.149.61, xff ip: 10.12.149.61, source port: 51779, destination ip: 10.12.1.138, destination port: 443, route_domain: 0, HTTP classifier: /Common/F5_External_Policy, scheme HTTPS, geographic location: <N/A>, request: <POST /Sched.aspx?viewGuid=cbeb7-fdc0-4687-a961-d28 HTTP/1.1\r\nHost: vlootbeheer.osjoemboera.be\r\nConnection: keep-a>, username: <N/A>, session_id: <f725ad3311df19ad>

So I get this:

POST /Sched.aspx?viewGuid=cbeb7-fdc0-4687-a961-d28 HTTP/1.1\r\nHost: vlootbeheer.osjoemboera.be\r\nConnection: keep-a

But in what field should I put POST (or GET or PUT)? Where should I put the http version?
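
For reference, ECS later added http.request.method and http.version for exactly this, so the parsed request line above could be sketched as follows (values taken from the example; layout illustrative):

  {
    "http": {
      "request": { "method": "POST" },
      "version": "1.1"
    },
    "url": {
      "domain": "vlootbeheer.osjoemboera.be",
      "path": "/Sched.aspx",
      "query": "viewGuid=cbeb7-fdc0-4687-a961-d28"
    }
  }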

Improve documentation around how to use ECS with additional fields

Additional documentation and recommendations should be created around ECS on how users should use ECS with their existing data structure, and how to prevent conflicts with their own fields.

Add information about what is in the scope of ECS and what is not.

Also more info should be added about the terminology/style and if possible a grammar guide.

Doing geoip on more than one address...

Here's another topic for discussion!

We have events with more than one IP and want to do geoip on all of them. Instead of being a "top level" field, what about source.geoip and such? Attach the geoip block to the thing it belongs to.
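
The geoip ingest processor already supports nesting its result under an arbitrary field via target_field, so a sketch of enriching both addresses could look like the following. The target names source.geo / destination.geo are what ECS later settled on; the source.geoip naming proposed above would work the same way:

    {
      "geoip": {
        "field": "source.ip",
        "target_field": "source.geo",
        "ignore_missing": true
      }
    },
    {
      "geoip": {
        "field": "destination.ip",
        "target_field": "destination.geo",
        "ignore_missing": true
      }
    }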

Compose ECS objects vs reuse objects

In several open issues / PRs (for example #51) the host objects are reused in different places. To reuse objects we can either repeat the name of the object or we could use composition. Examples:

Composition: Field c is composed out of host and geo, meaning it contains all the fields in host and geo. Example:

c.ip
c.mac
c.location

Reusing the object:

c.host.ip
c.host.mac
c.geo.location

Composition has the advantage that the names are shorter, but it is not directly known what c is composed of, meaning which fields it contains. Reusing the objects reduces the number of objects in ECS that look different. c could have a field d that is specific to c and still have all host and geo fields. As soon as geo or host are used in any place, it is clear which fields they will contain. This should simplify the mental model around ECS. To handle the longer field names with reused objects, it would be nice if Kibana would "understand" ECS and could potentially add some magic shortening here.

Most important is that we are consistent across ECS and use one or the other and don't mix them as we do at the moment.

Is the intention to have auditbeat (and others) migrate to the ECS

Hi there,

Is the plan to have auditbeat, metricbeat, packetbeat (etc) use the ECS mappings at some point? Will all the fields they produce be reflected in the ECS? If so, are you able to indicate if this migration work is already underway?

I'm keen to store all the events from the various beats in a single set of time based indices (e.g logstash-*) to ease analysis. Further to this, I'm keen to have Kibana know the field mappings up front, rather than having to continually refresh the mappings cache (at least until elastic/kibana#6498 has been implemented).

Many thanks,
Nick

beat.* vs agent.*

Hello,

I was wondering if in the long term, the agent.* object will replace the beat.* object that is used now? (and host.name for beat.hostname). Or will Beat agents keep their own object?

Grtz

Willem

Add network.name

Some operators take care to identify their networks in their monitoring pipelines. I think network.name should be the right place to put this.

I'd like to submit my first PR to ECS for this. I've already identified a few places where I should add the field:

  • README.md
  • schemas/network.yml
  • schema.csv

Questions

  • Do you agree we should add network.name?
  • Where else do I need to add the field? Any gotchas?
  • In schema.csv, what's the meaning of the "Phase" column? What value would you suggest be put there? I suspect network.name will not be the most used field.

Add network.address

I often need to somehow store the network name and network address. I propose to add a network.address field for values like 192.168.0.0/24 ...
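
A mapping sketch for the proposed field; the ip_range type accepts CIDR notation such as 192.168.0.0/24 and allows range-aware queries, while a plain keyword would also work if only exact matching is needed:

        "network": {
          "properties": {
            "name": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "address": {
              "type": "ip_range"
            }
          }
        }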

New use case: DNS

A very useful log source is DNS query logging, either from DNS resolver logging or network packet analysis. In my opinion, this is currently not easy to map to the current set of ECS fields.

Packetbeat has a list of DNS fields which could be used as a starting point.

Missing thread, process + thread id naming DEC+HEX

For process information there are process fields, but for threads there are none. I propose adding thread fields and reflecting that sometimes the application logs the process ID and thread ID in HEX, sometimes in DEC. So maybe there should be process.pid_hex and thread.tid_hex like fields for this case. Even if it is possible to convert to DEC in Logstash, usually we need to know the pid in HEX format and at the same time the pid in DEC (as listed in the process list in the OS).
Also for process.pid, shouldn't it be process.id instead (as pid is a shortcut for process id itself)? For process.ppid there could be process.parent.id.

Recommended way to map custom date formats to, for example, event.created with an ingest pipeline

Hello,

I was wondering what's the recommended way to map a field to an ecs date field. Let's say I try to assign a timestamp to event.created.

So I took the official ecs template which has:

        "event": {
          "properties": {
            "created": {
              "type": "date"
            }

The ES cluster logs give me an org.elasticsearch.index.mapper.MapperParsingException: failed to parse [event.created] because Invalid format: "2018-08-07 22:51:09" is malformed at " 22:51:09"

My pipeline looks like:

PUT _ingest/pipeline/filebeat-nagios-audit
{
  "description": "Filebeat Nagios Audit Pipeline",
  "processors": [
    {
      "set": {
        "field": "pipeline",
        "value": "filebeat-nagios-audit"
      }
    },
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{TIMESTAMP_ISO8601:event.created} - %{GREEDYDATA:event.category} \\[%{NUMBER:event.type_id}] %{GREEDYDATA:user.name}:%{IPORHOST:source.ip} - %{GREEDYDATA:log.message}"
        ]
      }
    },
    {
      "date": {
        "field": "event.created",
        "target_field": "@timestamp",
        "formats": [
          "yyyy-MM-dd HH:mm:ss"
        ],
        "timezone": "Europe/Brussels"
      }
    },
    {
      "gsub": {
        "field": "source.ip",
        "pattern": "localhost",
        "replacement": "127.0.0.1"
      }
    }
  ]
}

So should I update the date type format of event.created or should I map to a temporary field and then use the date processor to event.created?
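
One way to resolve this, sketched under the assumption that the grok pattern is changed to write the raw string into a temporary field (the name _tmp.timestamp below is made up) instead of directly into event.created: run the date processor twice, once targeting @timestamp and once targeting event.created, so the mapped date field only ever receives a properly formatted value, then drop the temporary field.

    {
      "date": {
        "field": "_tmp.timestamp",
        "target_field": "@timestamp",
        "formats": ["yyyy-MM-dd HH:mm:ss"],
        "timezone": "Europe/Brussels"
      }
    },
    {
      "date": {
        "field": "_tmp.timestamp",
        "target_field": "event.created",
        "formats": ["yyyy-MM-dd HH:mm:ss"],
        "timezone": "Europe/Brussels"
      }
    },
    {
      "remove": {
        "field": "_tmp.timestamp"
      }
    }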

Top level: "client" and "server"

Hey! We do a lot of network flow work. We have a sort of issue using "source" and "destination" because flow data comes in both directions and we get records for each. The data for a single session might look like:

source.ip   source.port   destination.ip   destination.port
1.2.3.4     54321         6.7.8.9          443
6.7.8.9     443           1.2.3.4          54321

So that's a problem for us. The concepts of source and destination really only apply on a packet scale anyway. We'd like to normalize both of the records into:

client.ip   client.port   server.ip   server.port
1.2.3.4     54321         6.7.8.9     443

This would also sort through things like DNS requests and other services that open a port.

Thoughts about that?
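
For reference, ECS later added client.* and server.* field sets alongside source.* and destination.* for exactly this kind of normalization, so the single-session record could be sketched as follows (values from the example above; the network field is illustrative):

  {
    "client": { "ip": "1.2.3.4", "port": 54321 },
    "server": { "ip": "6.7.8.9", "port": 443 },
    "network": { "transport": "tcp" }
  }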

Guidance on anonymization/pseudonymization

I'd like to propose that ECS adds guidance for anonymization and pseudonymization. Some thoughts:

Definitions

  • anonymization: Irreversible data obfuscation.
  • pseudonymization: Reversible data obfuscation.

PII model
The NIST 800-122 publication on PII identifies levels of personal identifiable information:

  • High (4): publication has severe/catastrophic effects
  • Medium (3): publication has serious adverse effects
  • Low (2): publication has limited adverse effects
  • Public (1): not part of PII, but describes non-personal data

Typically if one is allowed to see PII level X, one can also see PII levels < X (Air Force One uses the same method: walk freely towards the rear, but never walk forward of your own seat). We could also imagine putting pii_<level> as a pre- or postfix in field names to easily manage Field Level Security (because it supports access based on wildcards (*)).

Varying levels of obfuscation
We should also recognize that various versions of the same field can (and should) exist in harmony. Perhaps the Dutch postal code system is a good example:

  • postalcode: 1234AB

The system is set up so that each character to the right adds more precision to the location.

Perhaps in Elasticsearch this becomes:

  • customer.postalcode.raw: 1234AB
  • customer.postalcode.city: 12
  • customer.postalcode.obfuscated: E32DB25A9BAAA6AF655FE65A861C9BD35AF1868229E0E9D738236B4500626AFB

Or, implementing PII:

  • customer.postalcode.pii4: 1234AB <-- perhaps enough to identify the customer
  • customer.postalcode.pii2: 12 <-- not enough to identify the customer
  • customer.postalcode.pii1: E32DB25A9BAAA6AF655FE65A861C9BD35AF1868229E0E9D738236B4500626AFB <-- not enough to identify the customer, but based on PII 4 data hence we can bucket customers of the same street without knowing which street it is.

The above would allow various users to access the postal code at an appropriate level for their usage (in case Business Analytics, for example, uses non-PII 3 or 4 data only due to laws on personal data like GDPR).
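
If the obfuscated variant is produced at ingest time, newer Elasticsearch versions ship a fingerprint ingest processor that can hash one or more fields; a sketch follows (the salt value is a placeholder, and an unsalted or weakly salted hash is only pseudonymization, not anonymization):

    {
      "fingerprint": {
        "fields": ["customer.postalcode.raw"],
        "target_field": "customer.postalcode.obfuscated",
        "method": "SHA-256",
        "salt": "replace-with-a-secret-salt"
      }
    }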

object mapping for [host] tried to parse field [host] as object, but found a concrete value

Hello,

When I try to use this f5ecs template where I integrated the ecs fields I think I will need:

PUT _template/f5ecs
{
  "order": 0,
  "index_patterns": "f5-002-*",
  "settings": {
    "index": {
      "mapping": {
        "total_fields": {
          "limit": "10000"
        }
      },
      "refresh_interval": "5s",
      "number_of_shards": "3",
      "number_of_replicas": "1"
    }
  },
  "mappings": {
    "doc": {
      "_meta": {
        "version": "2.0.2"
      },
      "date_detection": false,
      "dynamic": "false",
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "@version": {
          "type": "integer"
        },
        "dig": {
          "properties": {
            "source": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "app": {
              "properties": {
                "name": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "type": {
                  "ignore_above": 1024,
                  "type": "keyword"
                }
              }
            },
            "correlation_id": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "message_id": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "destination": {
          "properties": {
            "domain": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "hostname": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "ip": {
              "type": "ip"
            },
            "mac": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "port": {
              "type": "long"
            },
            "subdomain": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "event": {
          "properties": {
            "action": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "category": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "created": {
              "type": "date"
            },
            "dataset": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "duration": {
              "type": "long"
            },
            "hash": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "id": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "module": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "raw": {
              "doc_values": false,
              "ignore_above": 1024,
              "index": false,
              "type": "keyword"
            },
            "risk_score": {
              "type": "float"
            },
            "severity": {
              "type": "long"
            },
            "type": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "version": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "geoip": {
          "properties": {
            "city_name": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "continent_name": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "country_iso_code": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "location": {
              "type": "geo_point"
            },
            "region_name": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "host": {
          "properties": {
            "architecture": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "id": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "ip": {
              "type": "ip"
            },
            "mac": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "name": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "os": {
              "properties": {
                "family": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "name": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "platform": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "version": {
                  "ignore_above": 1024,
                  "type": "keyword"
                }
              }
            },
            "timezone": {
              "properties": {
                "offset": {
                  "properties": {
                    "sec": {
                      "type": "long"
                    }
                  }
                }
              }
            },
            "type": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "http": {
          "properties": {
            "response": {
              "properties": {
                "body": {
                  "norms": false,
                  "type": "text"
                },
                "status_code": {
                  "type": "long"
                }
              }
            }
          }
        },
        "log": {
          "properties": {
            "level": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "line": {
              "type": "long"
            },
            "message": {
              "doc_values": false,
              "ignore_above": 1024,
              "index": false,
              "type": "keyword"
            },
            "offset": {
              "type": "long"
            }
          }
        },
        "message": {
          "norms": false,
          "type": "text"
        },
        "network": {
          "properties": {
            "direction": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "forwarded_ip": {
              "type": "ip"
            },
            "inbound": {
              "properties": {
                "bytes": {
                  "type": "long"
                },
                "packets": {
                  "type": "long"
                }
              }
            },
            "outbound": {
              "properties": {
                "bytes": {
                  "type": "long"
                },
                "packets": {
                  "type": "long"
                }
              }
            },
            "protocol": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "organization": {
          "properties": {
            "id": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "name": {
              "norms": false,
              "type": "text"
            }
          }
        },
        "os": {
          "properties": {
            "family": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "name": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "platform": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "version": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "process": {
          "properties": {
            "args": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "name": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "pid": {
              "type": "long"
            },
            "ppid": {
              "type": "long"
            },
            "title": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "service": {
          "properties": {
            "ephemeral_id": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "id": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "name": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "state": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "type": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "version": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "source": {
          "properties": {
            "domain": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "hostname": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "ip": {
              "type": "ip"
            },
            "mac": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "port": {
              "type": "long"
            },
            "subdomain": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "tags": {
          "ignore_above": 1024,
          "type": "keyword"
        },
        "tls": {
          "properties": {
            "certificates": {
              "doc_values": false,
              "type": "keyword"
            },
            "ciphersuite": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "servername": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "version": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "url": {
          "properties": {
            "fragment": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "host": {
              "properties": {
                "name": {
                  "ignore_above": 1024,
                  "type": "keyword"
                }
              }
            },
            "href": {
              "fields": {
                "raw": {
                  "ignore_above": 1024,
                  "type": "keyword"
                }
              },
              "norms": false,
              "type": "text"
            },
            "password": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "path": {
              "fields": {
                "raw": {
                  "ignore_above": 1024,
                  "type": "keyword"
                }
              },
              "norms": false,
              "type": "text"
            },
            "port": {
              "type": "long"
            },
            "query": {
              "fields": {
                "raw": {
                  "ignore_above": 1024,
                  "type": "keyword"
                }
              },
              "norms": false,
              "type": "text"
            },
            "scheme": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "username": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "user": {
          "properties": {
            "email": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "hash": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "id": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "name": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "user_agent": {
          "properties": {
            "device": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "major": {
              "type": "long"
            },
            "minor": {
              "type": "long"
            },
            "name": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "os": {
              "properties": {
                "major": {
                  "type": "long"
                },
                "minor": {
                  "type": "long"
                },
                "name": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "version": {
                  "ignore_above": 1024,
                  "type": "keyword"
                }
              }
            },
            "patch": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "raw": {
              "norms": false,
              "type": "text"
            },
            "version": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "f5": {
          "properties": {
            "apd": {
              "properties": {
                "function": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "message": {
                  "type": "text",
                  "fields": {
                    "keyword": {
                    "ignore_above": 1024,
                    "type": "keyword"
                    }
                  }
                },
                "session": {
                  "properties": {
                    "key": {
                      "ignore_above": 1024,
                      "type": "keyword"
                    },
                    "value": {
                      "ignore_above": 1024,
                      "type": "keyword"
                    }
                  }
                },
                "processor": {
                  "properties": {
                    "name": {
                      "ignore_above": 1024,
                      "type": "keyword"
                    },
                    "line_number": {
                      "type": "long"
                    },
                    "message": {
                      "type": "text",
                      "fields": {
                        "keyword": {
                        "ignore_above": 1024,
                        "type": "keyword"
                        }
                      }
                    }
                  }
                }
              }
            },
            "dcc": {
              "properties": {
                "name": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "type": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "transaction": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "drop_counter": {
                  "type": "long"
                },
                "evasion_violation": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "event": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "http_violation": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "http_classifier": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "injection_ratio": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "injection_threshold": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "legit_sessions": {
                  "type": "long"
                },
                "new_transactions": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "operation_mode": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "request": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "rest": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "route_domain": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "scheme": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "scraping_status": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "scraping_type": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "session_id": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "support_id": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "violation": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "violation_counter": {
                  "type": "long"
                },
                "virus_name": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "web_violation": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "xff_ip": {
                  "ignore_above": 1024,
                  "type": "keyword"
                }
              }
            },
            "correlation_id": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "message_id": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "f5_httpd_message": {
          "type": "text",
            "fields": {
              "keyword": {
                "ignore_above": 1024,
                "type": "keyword"
              }
          }
        },
        "f5_httpd_user_name": {
          "ignore_above": 256,
          "type": "keyword"
        },
        "f5_message_id": {
          "ignore_above": 16,
          "type": "keyword"
        },
        "f5_session_id": {
          "ignore_above": 16,
          "type": "keyword"
        },
        "f5_ssh_message": {
          "type": "text",
            "fields": {
              "keyword": {
                "ignore_above": 256,
                "type": "keyword"
              }
          }
        },
        "f5_ssh_port": {
          "type": "keyword"
        },
        "f5_ssh_source_ip": {
          "type": "ip"
        },
        "f5_ssh_sourceip": {
          "type": "ip"
        },
        "f5_ssh_source_port": {
          "type": "keyword"
        },
        "f5_ssh_username": {
          "type": "keyword"
        },
        "f5_tmm_auth_id": {
          "type": "keyword"
        },
        "f5_tmm_auth_ip": {
          "type": "ip"
        },
        "f5_tmm_auth_message": {
          "type": "text",
            "fields": {
              "keyword": {
              "ignore_above": 256,
              "type": "keyword"
              }
          }
        },
        "f5_tmm_auth_port": {
          "type": "keyword"
        },
        "f5_tmm_auth_type": {
          "type": "keyword"
        },
        "f5_tmm_auth_version": {
          "type": "keyword"
        },
        "f5_tmm_client_activex": {
          "type": "integer"
        },
        "f5_tmm_client_browser": {
          "type": "keyword"
        },
        "f5_tmm_client_browser_version": {
          "type": "keyword"
        },
        "f5_tmm_client_cpu": {
          "type": "keyword"
        },
        "f5_tmm_client_ip": {
          "type": "ip"
        },
        "f5_tmm_client_javascript": {
          "type": "integer"
        },
        "f5_tmm_client_platform": {
          "type": "keyword"
        },
        "f5_tmm_client_plugin": {
          "type": "integer"
        },
        "f5_tmm_client_port": {
          "type": "keyword"
        },
        "f5_tmm_client_ui_mode": {
          "type": "keyword"
        },
        "f5_tmm_event": {
          "type": "keyword"
        },
        "f5_tmm_message": {
          "type": "text",
            "fields": {
              "keyword": {
              "ignore_above": 256,
              "type": "keyword"
              }
          }
        },
        "f5_tmm_reputation": {
          "type": "keyword"
        },
        "f5_tmm_rest": {
          "type": "text",
            "fields": {
              "keyword": {
              "ignore_above": 256,
              "type": "keyword"
              }
          }
        },
        "f5_tmm_rule": {
          "type": "keyword"
        },
        "f5_tmm_rule_message": {
          "type": "text",
            "fields": {
              "keyword": {
              "ignore_above": 256,
              "type": "keyword"
              }
          }
        },
        "f5_tmm_sequence_id": {
          "type": "keyword"
        },
        "f5_tmm_server_ip": {
          "type": "ip"
        },
        "f5_tmm_server_port": {
          "type": "integer"
        },
        "f5_tmm_session_bytes_in": {
          "type": "long"
        },
        "f5_tmm_session_bytes_out": {
          "type": "long"
        },
        "f5_tmm_session_client_ip": {
          "type": "ip"
        },
        "f5_tmm_session_deleted_reason": {
          "type": "keyword"
        },
        "f5_tmm_session_listener": {
          "type": "keyword"
        },
        "f5_tmm_session_location": {
          "type": "keyword"
        },
        "f5_tmm_session_vip_ip": {
          "type": "ip"
        },
        "f5_tmm_type": {
          "type": "keyword"
        }
      }
    }
  },
  "aliases": {
    "f5": {}
  }
}

I get Logstash errors like:

[2018-07-06T15:46:38,453][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"f5-002-2018.07.06", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x2373a721>], :response=>{"index"=>{"_index"=>"f5-002-2018.07.06", "_type"=>"doc", "_id"=>"hGDYb2QBpfUnuaeQN_7m", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"object mapping for [host] tried to parse field [host] as object, but found a concrete value"}}}}

GET /_cat/templates/f5*?v&s=name:asc
name  index_patterns order version
f5    [f5-001-*]     0     
f5ecs [f5-002-*]     0     

And my pipeline:

input {
    udp {
        type => 'syslog-f5'
        port => 5548
        id => 'input-syslog-f5'
    }
}
filter {
    grok {
        patterns_dir => "/etc/logstash/patterns"
        match => [ "message", "\A<%{POSINT:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} (slot1\/)?%{HOSTNAMEUND:host.name} %{LOGLEVEL:event.severity} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}\Z" ]
        add_tag => "grok_f5"
        id => 'grok-syslog-f5'
    } 
    translate {
        dictionary_path => [ "/etc/logstash/dictionaries/syslogpri.yml" ]
        field => "syslog_pri"
        destination => "log.level"
        id => 'translate-log-level'
    }
}
output {
    elasticsearch {
        index => "f5-002-%{+YYYY.MM.dd}"
        hosts => ["https://srvlogstashqa01.gentgrp.gent.be:9200"]
        manage_template => false
        user => "logstash_internal"
        password => "${LOGSTASH_INTERNAL}"
        document_type => "doc"
    }
}

The old f5 template for f5-001-* still has a 'host' field, but that shouldn't interfere, since my new f5ecs template applies to a different index pattern, right? I'm not sure what's going wrong here and will have to investigate further, but I thought I'd throw it in here; it might be related to the way I refer to the host object in my pipeline.
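
One thing I'll try first (a rough, untested sketch; the target field name is just a placeholder): the udp input adds its own top-level host field as a plain string (the sender address), which can't coexist with the host object the new template implies, so renaming it out of the way before the grok runs might avoid the conflict:

filter {
    mutate {
        # the udp input sets "host" to the sender address as a plain string,
        # which clashes with the "host" object implied by host.name in the mapping;
        # "syslog_sender" is only a placeholder field name
        rename => { "host" => "syslog_sender" }
        id => 'mutate-rename-host'
    }
}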

Nested Object Arrays and Kibana

For an Elasticsearch document that stores nested object arrays, where we want to preserve some of the relationship information, Kibana does not support the nested data type.

For example, a document which has more than just a source and destination IP, with a descriptive category associated with each nested object.

Elasticsearch is happy with the structure and searching works fine, but I cannot build a dashboard from it.

For example, if I have a field mapped as:

"ip_rec_obj": {
  "type": "nested",
  "properties": {
    "value": { "type": "ip" },
    "type": { "type": "text" },
    "geoloc": { "type": "geo_point" }
  }
}

And a document which contains 5 of those records, where the "type" field takes one of 5 different values. I want to build a dashboard of counts by type across a time period.
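
For what it's worth, I can get the counts-by-type straight out of Elasticsearch with a nested terms aggregation; it's only the Kibana visualization that's the blocker. A minimal sketch, assuming the index is called ip-records, the documents carry a @timestamp field, and ip_rec_obj.type is mapped as keyword rather than text:

GET ip-records/_search
{
  "size": 0,
  "query": {
    "range": { "@timestamp": { "gte": "now-7d" } }
  },
  "aggs": {
    "ip_recs": {
      "nested": { "path": "ip_rec_obj" },
      "aggs": {
        "by_type": {
          "terms": { "field": "ip_rec_obj.type" }
        }
      }
    }
  }
}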

Question about template usage for ecs

Hello,

As the creator of the following project => https://github.com/OutsideIT/logstash_filter_f5

I'd prefer to follow the ECS guidelines going forward and, where appropriate, switch my current fields to ECS fields with dot notation. Until now I've always used underscores for all my fields, which makes this kind of new to me.

If I were to create a template for some fields, e.g.

        "f5_tmm_session_bytes_in": {
          "type": "long"
        },
        "f5_tmm_session_bytes_out": {
          "type": "long"
        },

And switch those to the dot-notated fields network.inbound.bytes and network.outbound.bytes, would this be the template that I should ideally use for those fields?

        "network": {
          "properties": {
            "inbound": {
              "properties": {
                "bytes": {
                  "type": "long"
                }
              }
            },
            "outbound": {
              "properties": {
                "bytes": {
                  "type": "long"
                }
              }
            }
          }
        }

I saw some examples which also have "type": "object" in the template, but I didn't see that everywhere (not in the beat.* object template, for example).
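
As far as I can tell (worth double-checking in the Elasticsearch docs), a field that declares "properties" defaults to type object anyway, so the two forms below should behave the same:

        "network": {
          "properties": {
            "inbound": { "properties": { "bytes": { "type": "long" } } }
          }
        }

        "network": {
          "type": "object",
          "properties": {
            "inbound": { "properties": { "bytes": { "type": "long" } } }
          }
        }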

Thanks for confirming the correct or incorrect use of my f5 template.

Another small question: I tend to use the ignore_above parameter a lot, but I don't see it anywhere in the ECS common field types. Are we 'allowed' to use ignore_above on ECS fields, and can we set it as we like, or would this cause mapping conflicts when mixed with data from other indices that use a different ignore_above value (or none at all) for the same field?
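
For what it's worth, the Elasticsearch templates generated from this repo seem to set ignore_above: 1024 on keyword fields already, so I'd probably just mirror that value rather than pick my own, e.g. for user.name:

        "user": {
          "properties": {
            "name": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        }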

Grtz

Willem

user agent?

Looking at, for example, this log:

[log screenshot omitted]

Would I ideally place this under the user_agent.* object or directly under the host.* object?

And what about the following (a sketch follows this list):

  • activex_support
  • plugin_support
  • ui_mode
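
For the browser details, here's a sketch of how I imagine they would map onto the ECS user_agent.* fields (field names taken from the ECS docs, so please correct me if they're off; activex_support, plugin_support and ui_mode don't seem to have ECS equivalents, so I'd keep those as custom f5_* fields for now):

        "user_agent": {
          "properties": {
            "original": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "name": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "version": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "os": {
              "properties": {
                "name": {
                  "ignore_above": 1024,
                  "type": "keyword"
                }
              }
            }
          }
        }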
