This project forked from elastic/ecs-mapper

Translate an ECS mapping CSV to starter pipelines for Beats, Elasticsearch or Logstash

License: Apache License 2.0

Ruby 100.00%

ECS Mapper

Warning

This tool is currently experimental and not yet supported. Our goal is to solicit feedback on the process of converting data to ECS, to make it even easier in the stack. Please feel welcome to open issues and PRs; we will address them on a best-effort basis.

Synopsis

This tool turns a field mapping from a CSV to an equivalent pipeline for:

  • Beats
  • Elasticsearch
  • Logstash

This tool generates starter pipelines for each solution above to help you get started quickly in mapping new data sources to ECS.

A mapping CSV is what you get when you start planning how to map a new data source to ECS in a spreadsheet.

Colleagues may collaborate on a spreadsheet that looks like this:

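| source_field | destination_field | notes                                 |
|--------------|-------------------|---------------------------------------|
| duration     | event.duration    | ECS supports nanoseconds precision    |
| remoteip     | source.ip         | Hey @Jane do you agree with this one? |
| message      |                   | No need to change this field          |
| ...          |                   |                                       |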

You can export your spreadsheet to CSV, run it through the ECS mapper, and generate your starter pipelines.

Note that this tool generates starter pipelines. They only do field rename and copy operations as well as some field format adjustments. It's up to you to integrate them in a complete pipeline that ingests and outputs the data however you need.
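To make the idea of a rename/copy-only starter pipeline concrete, here is a minimal Ruby sketch of how mapping rows could translate into Elasticsearch ingest processors. It is not ecs-mapper's own code: `processors_for` and `build_pipeline` are hypothetical names, and the use of the `set` processor's `copy_from` option for copies is an assumption about your Elasticsearch version. The files the tool actually generates may look different.

```ruby
require 'json'

# Hypothetical helper (not ecs-mapper's code): turn one mapping row into
# Elasticsearch ingest processor definitions.
def processors_for(row, default_action: 'copy')
  source = row['source_field'].to_s
  dest   = row['destination_field'].to_s
  # Skip rows with no source, and rows with nothing to rename or copy.
  return [] if source.empty? || dest.empty?

  # An empty copy_action falls back to the default action.
  action = row['copy_action'].to_s
  action = default_action if action.empty?

  if action == 'rename'
    [{ 'rename' => { 'field' => source, 'target_field' => dest } }]
  else
    # Assumption: the set processor's "copy_from" option exists on your cluster.
    [{ 'set' => { 'field' => dest, 'copy_from' => source } }]
  end
end

# Hypothetical helper: assemble the processors for all rows into one pipeline.
def build_pipeline(rows, default_action: 'copy')
  { 'processors' => rows.flat_map { |row| processors_for(row, default_action: default_action) } }
end

rows = [
  { 'source_field' => 'duration', 'destination_field' => 'event.duration' },
  { 'source_field' => 'remoteip', 'destination_field' => 'source.ip', 'copy_action' => 'rename' },
  { 'source_field' => 'message',  'destination_field' => '' }
]

pipeline = build_pipeline(rows)
puts JSON.pretty_generate(pipeline)
```

The `message` row produces no processor because it has no destination, which matches the "No need to change this field" note in the spreadsheet example.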

Scroll down to the Examples section below to get right to a concrete example you can play with.

Maturity

This code is a proof of concept and is not officially supported. The pipelines generated by this tool are likely not complete and probably need more testing and validation before they are ready for production. They are simply meant to give you a head start in mapping various sources to ECS.

CSV Format

Here are more details on the CSV format supported by this tool. Since mapping spreadsheets are used by humans, it's totally fine to have as many columns as you need in your spreadsheets/CSV. Only the following columns will be considered:

| column name       | required | allowed values | notes |
|-------------------|----------|----------------|-------|
| source_field      | required |                | A dotted Elasticsearch field name. Dots represent JSON nesting. Lines with empty "source_field" are skipped. |
| destination_field | required |                | A dotted Elasticsearch field name. Dots represent JSON nesting. Can be left empty if there's no copy action (just a type conversion). |
| format_action     | optional | to_float, to_integer, to_string, to_boolean, to_array, parse_timestamp, uppercase, lowercase, (empty) | Simple conversion to apply to the field value. |
| timestamp_format  | optional | Only UNIX and UNIX_MS formats are supported across all three tools. | You may also specify other formats, like ISO8601, TAI64N, or a Java time pattern, but we will not validate whether the format is supported by the tool. |
| copy_action       | optional | rename, copy, (empty) | What to do with the field. If left empty, default action is based on the --copy-action flag. |
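The column rules above can be sketched in a few lines of Ruby using only the standard `csv` library. This is an illustration rather than the tool's actual parser: `read_mapping` and `RECOGNIZED_COLUMNS` are hypothetical names. The sketch keeps only the recognized columns, ignores extras such as `notes`, and skips rows whose `source_field` is empty.

```ruby
require 'csv'

# Columns the README says are considered; any other column is ignored.
RECOGNIZED_COLUMNS = %w[
  source_field destination_field format_action timestamp_format copy_action
].freeze

# Hypothetical helper (not ecs-mapper's parser): parse mapping CSV text into
# an array of hashes restricted to the recognized columns.
def read_mapping(csv_text)
  CSV.parse(csv_text, headers: true).filter_map do |row|
    source = row['source_field'].to_s.strip
    next if source.empty? # lines with an empty source_field are skipped

    # Missing or empty cells become empty strings.
    RECOGNIZED_COLUMNS.to_h { |col| [col, row[col].to_s.strip] }
  end
end

csv_text = <<~MAPPING
  source_field,destination_field,notes
  duration,event.duration,ECS supports nanoseconds precision
  remoteip,source.ip,Hey @Jane do you agree with this one?
  message,,No need to change this field
  ,orphan.field,This row has no source_field and is skipped
MAPPING

mapping = read_mapping(csv_text)
mapping.each { |entry| puts entry.inspect }
```

Returning plain hashes with every recognized key present keeps later steps simple, since they never have to distinguish a missing column from an empty cell.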

You can start from this spreadsheet template. Make a copy of it in your Google Docs account, or download it as an Excel file.

When the destination field is @timestamp, we always enforce an explicit format_action of parse_timestamp to avoid conversion problems downstream. If no timestamp_format is provided, UNIX_MS is used. Note that the timestamp layouts used by the Filebeat processor for converting timestamps differ from the formats supported by the date processors in Logstash and Elasticsearch Ingest Node.
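The @timestamp rule can be expressed as a small normalization step over a mapping row. The following Ruby sketch assumes rows are hashes keyed by column name; `normalize_timestamp_row` is a hypothetical name and not a function from the tool itself.

```ruby
# Hypothetical helper (not ecs-mapper's code): apply the @timestamp rule to
# one mapping row and return the (possibly adjusted) row.
def normalize_timestamp_row(row)
  # Rows that do not target @timestamp are returned untouched.
  return row unless row['destination_field'] == '@timestamp'

  # Always force an explicit parse_timestamp format action.
  normalized = row.merge('format_action' => 'parse_timestamp')

  # Fall back to UNIX_MS when no timestamp_format was given.
  if normalized['timestamp_format'].to_s.empty?
    normalized['timestamp_format'] = 'UNIX_MS'
  end
  normalized
end

implicit = normalize_timestamp_row(
  'source_field' => 'ts', 'destination_field' => '@timestamp',
  'format_action' => '', 'timestamp_format' => ''
)
explicit = normalize_timestamp_row(
  'source_field' => 'ts', 'destination_field' => '@timestamp',
  'format_action' => 'parse_timestamp', 'timestamp_format' => 'UNIX'
)
other = normalize_timestamp_row(
  'source_field' => 'remoteip', 'destination_field' => 'source.ip',
  'format_action' => '', 'timestamp_format' => ''
)

puts implicit.inspect
puts explicit.inspect
puts other.inspect
```

Because `merge` returns a new hash, the input row is left unmodified, which makes the step safe to apply to rows that are reused elsewhere.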

Usage and Dependencies

This is a simple Ruby program with no external dependencies, other than development dependencies.

Any modern version of Ruby should be sufficient. If you don't intend to run the tests or the rake tasks, you can skip right to usage tips.

Ruby Setup

If you want to tweak the code of this script, run the tests or use the rake tasks, you'll need to install the development dependencies.

Once you have Ruby installed for your platform, installing the dependencies is simply:

gem install bundler
bundle install

Run the tests:

rake test

Using the ECS Mapper

Help.

./ecs-mapper --help
Reads a CSV mapping of source field names to destination field names, and generates
Elastic pipelines to help perform the conversion.

You can have as many columns as you want in your CSV.
Only the following columns will be used by this tool:
source_field, destination_field, format_action, copy_action

Options:
    -f, --file FILE                  Input CSV file.
    -o, --output DIR                 Output directory. Defaults to parent dir of --file.
        --copy-action COPY_ACTION
                                     Default action for field renames. Acceptable values are: copy, rename. Default is copy.
        --debug                      Shorthand for --log-level=debug
    -h, --help                       Display help

Process my.csv and output pipelines in the same directory as the CSV.

./ecs-mapper --file my.csv

Process my.csv and output pipelines elsewhere.

./ecs-mapper --file my.csv --output pipelines/mine/

Process my.csv so that fields with an empty value in the "copy_action" column are renamed instead of copied (the default).

./ecs-mapper --file my.csv --copy-action rename

Examples

Look at the example CSV mapping and the pipelines generated from it in the example/ directory.

You can try each pipeline easily by following the instructions in example/README.md.

Caveats

  • At this time, the Beats pipelines don't perform the "to_array", "uppercase" or "lowercase" transformations. They could be implemented via the "script" processor.
  • Only UNIX and UNIX_MS timestamp formats are supported across Beats, Elasticsearch, and Logstash. For other timestamp formats, please modify the starter pipeline or add the appropriate date processor to the generated pipeline by hand. Refer to the documentation for Beats, Elasticsearch, and Logstash.
  • This tool does not currently support additional processors, like setting static field values or dropping events based on a condition.

Contributors

tonymeehan, woodywalton
