
logstash-output-csv's People

Contributors

andsel, colinsurprenant, electrical, jakelandis, jordansissel, jsvd, karenzone, kares, kurtado, nachonam, ph, robbavey, suyograo, talevy, yaauie, ycombinator

logstash-output-csv's Issues

Broken CSV output

It seems that 5.2 broke the CSV output plugin.

The configuration below always generates a "CSV" containing all of the message fields, separated by spaces rather than commas, even though "blah" does not exist as a field. This works correctly in 2.3 and is broken in 5.2.

To recreate, pass in a file containing this line:
00:00:00.0 COMM_TURNED_ON YODA

Use this grok pattern:
EVENT_COMM_TURNED_ON %{TIME:event_time}%{SPACE}%{NOTSPACE:event_type}%{SPACE}%{NOTSPACE:name}

input { stdin { } }

filter {
  grok {
    patterns_dir => ["C:/src/elk/broken"]
    match => ["message", "%{EVENT_COMM_TURNED_ON}"]
  }
}

output {
  if "_grokparsefailure" not in [tags] {
    elasticsearch {
      index => "raw-data-%{+YYYY.MM.dd}"
    }
    if "COMM_TURNED_ON" in [message] {
      csv {
        fields => ["blah"]
        csv_options => { "col_sep" => "," "row_sep" => "\r\n" }
        path => "C:/src/elk/comm_turned_on.csv"
      }
    }
  }
}

Files remain open when writing

I'm using the JDBC input in combination with the CSV output. When I fetch data from the table and write the data to an output CSV, it appears that the file remains open. This is a shame, because I need to move and delete that file when done. Any suggestions?

Add support for CSV header row

There doesn't appear to be a way to set a header row in a CSV output file. Please add an option to set a header row, ideally using the field names.
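Until such an option exists, one workaround is to prepend the header after the pipeline has finished. A minimal sketch in Ruby — the path and column names here are illustrative, not plugin options:

```ruby
# Sketch: prepend a header row to a CSV that the plugin wrote without one.
path = "comm_turned_on.csv"
File.write(path, "00:00:00.0,COMM_TURNED_ON,YODA\n")   # stand-in for plugin output

header = "event_time,event_type,name\n"
body   = File.read(path)
# Idempotent: only prepend when the header is not already present.
File.write(path, header + body) unless body.start_with?(header)

puts File.read(path)
```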

Output to stdout bug

Logstash 2.1 introduced a bug that prevents me from printing the CSV document to stdout.

Main parts of the config:

input {
        stdin {}
}

filter {
...
}

output {
        csv {
                path => "/dev/stdout"
                fields => [ "path", "resource" ]
                csv_options => {
                        col_sep => ";"
                        force_quotes => true
                }
        }
}

This worked in 2.0 and earlier versions.

The error message is:

Exception while flushing and closing files. {:exception=>#<IOError: Illegal seek>, :level=>:error}

UPDATE

If I use the file input with a static path, I get this error:

IOError: Illegal seek
                flush at org/jruby/RubyIO.java:2207
                flush at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-file-2.2.3/lib/logstash/outputs/file.rb:284
  flush_pending_files at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-file-2.2.3/lib/logstash/outputs/file.rb:200
                 each at org/jruby/RubyHash.java:1342
  flush_pending_files at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-file-2.2.3/lib/logstash/outputs/file.rb:198
                flush at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-file-2.2.3/lib/logstash/outputs/file.rb:187
              receive at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-csv-2.0.2/lib/logstash/outputs/csv.rb:40
               handle at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.1.1-java/lib/logstash/outputs/base.rb:81
          output_func at (eval):163
         outputworker at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.1.1-java/lib/logstash/pipeline.rb:277
        start_outputs at /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.1.1-java/lib/logstash/pipeline.rb:194

clarify file_mode and dir_mode

Transcribing a user's investigation:

The CSV output documentation describes the file_mode option:
https://www.elastic.co/guide/en/logstash/current/plugins-outputs-csv.html#plugins-outputs-csv-file_mode
It contains this sentence:
File access mode to use. Note that due to the bug in jruby system umask is ignored on linux: jruby/jruby#3426 Setting it to -1 uses default OS value. Example: "file_mode" => 0640

In my opinion this documentation is not accurate, for file_mode and dir_mode alike. I ran some tests with different file_mode values:
/usr/share/logstash/bin/logstash --config.string 'input { stdin { } } output { csv { path => "/usr/share/logstash/00359402.csv" fields => "message" file_mode => 0000 } }' --log.level=debug

The resulting permissions for each file_mode value were:

file_mode   resulting permissions
0000        ----------
0001        ---------x
0002        ----------
0003        ---------x
0004        -------r--
0005        -------r-x
0006        -------r--
0007        -------r-x

Under RHEL7 the umask for root is 0022.

I found this documentation for Ruby's File constructor:
http://www.ruby-doc.org/core-2.1.2/File.html#method-c-new
It says the mode is passed to the underlying open system call, and the OS applies the process umask on top of it, so any permission bit that is set in the umask is always masked out of the created file.
https://www.linuxnix.com/umask-define-linuxunix/
Umask values are subtracted from the default permissions, so a umask of 0222 would make every new file read-only.

So with root's default umask of 0022, group and other can never receive write permission, which is exactly what the test results above show.

So I tested with the umask set to 0000:

   umask 0000 ; /usr/share/logstash/bin/logstash --config.string 'input { stdin { } } output {   csv { path => "/usr/share/logstash/00359402.csv" fields => "message" file_mode => 0222 }   }' --log.level=debug ; umask 0022

The result was a file with the expected permissions:

--w--w--w-. 1 root root 52 Jul  4 16:27 /usr/share/logstash/00359402.csv
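The same interaction can be demonstrated in plain Ruby (MRI on Linux here; JRuby's behavior is what the linked bug disputes):

```ruby
require "tmpdir"

# Demonstrates that the OS applies the process umask on top of the mode
# requested at file creation, mirroring the file_mode tests above.
result = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, "umask_demo.csv")
  old_umask = File.umask(0022)            # the RHEL7 root default from above
  begin
    File.open(path, "w", 0666) { |f| f.write("a,b\n") }
    result = File.stat(path).mode & 0777  # 0666 & ~0022 == 0644
  ensure
    File.umask(old_umask)                 # restore the previous umask
  end
end
printf("requested 0666 with umask 0022 -> %04o\n", result)
```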

So one solution can be, that the documentation page documents that.

Another solution can be, that Systemd could set umask for logstash, but this is not a good solution.
vim /etc/systemd/system/multi-user.target.wants/logstash.service

      [Service]
      UMask=0000

3rd solution could be, that another Ruby solution will be established for that.

I could not reproduce jruby/jruby#3426 under RHEL7:

/usr/share/logstash/bin/logstash --interactive irb
as user root:      puts File.umask   prints 18 (= 0022 octal), returns nil
as user logstash:  puts File.umask   prints 2  (= 0002 octal), returns nil

Properly escape CSV values

CSV output needs to escape characters in values which cannot be rendered in spreadsheet applications.
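One common reading of this request is spreadsheet formula injection: values beginning with =, +, - or @ are interpreted as formulas by Excel and Google Sheets. A sketch of neutralizing them — this interpretation, and the helper below, are my own, not plugin behavior; prefixing with a single quote is the usual mitigation:

```ruby
require "csv"

# Sketch: neutralize values that spreadsheets would interpret as formulas.
# The leading single quote forces Excel/Sheets to treat the cell as text.
def sanitize_cell(value)
  s = value.to_s
  s.start_with?("=", "+", "-", "@") ? "'" + s : s
end

row = ["=2+5", "safe"]
puts CSV.generate_line(row.map { |v| sanitize_cell(v) })
# '=2+5,safe
```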

Logstash 5.4.0 packages 3.0.2 instead of 3.0.3 version of the logstash-output-csv plugin

Just filing for those who may run into this.

LS 5.4.0 incorrectly packages version 3.0.2 of the logstash-output-csv plugin, which means that users on 5.4.0 can run into this known bug. Note that LS 5.3.0 already shipped with version 3.0.3 of the plugin. I can confirm that LS 5.4.1 packages version 3.0.3 again, so this has been addressed.

For users on LS 5.4.0, either upgrade to LS 5.4.1, or manually install version 3.0.3 of the plugin (./logstash-plugin install --version "3.0.3" logstash-output-csv), which contains the fix for #14.

Config not parsed?

  • Version: logstash-output-csv-3.0.2; logstash-5.1.1
  • Operating System: Gentoo Linux
  • Config File:
input {
    file {
        path => [ "/tmp/logstash-test.txt" ]
        type => "test"
    }
}

filter {}

output {
    if [type] == "test" {
        file {
            path => "/tmp/logstash-test.json"
        }

        csv {
            path => "/tmp/logstash-test.csv"
            fields => [ "host", "path" ]
            csv_options => { "col_sep" => ":" "row_sep" => "\n"}
        }
    }
}
  • Sample Data:
/tmp/logstash-test.txt
091502 001 002 003
091517 001 002 003
/tmp/logstash-test.json
{"path":"/tmp/logstash-test.txt","@timestamp":"2017-02-03T09:15:03.119Z","@version":"1","host":"localhost","message":"091502 001 002 003","type":"test","tags":[]}
{"path":"/tmp/logstash-test.txt","@timestamp":"2017-02-03T09:15:18.130Z","@version":"1","host":"localhost","message":"091517 001 002 003","type":"test","tags":[]}
/tmp/logstash-test.csv
2017-02-03T09:15:03.119Z localhost 091502 001 002 0032017-02-03T09:15:18.130Z localhost 091517 001 002 003
  • Steps to Reproduce:
echo "$(date -u +%H%M%S) 001 002 003" >> /tmp/logstash-test.txt
  • Expected Output (missing):
    The fields "host" and "path" separated by ":" and followed by a newline :-(
localhost:/tmp/logstash-test.txt
localhost:/tmp/logstash-test.txt

csv separator character should not be escaped.

From elastic/logstash#3001

I followed the guide in http://logstash.net/docs/1.4.2/outputs/csv#csv_options, but the separator was escaped. For example:

a\t2015-04-14 06:24:07 UTC\t1\nb\t2015-04-14 06:24:08 UTC\t1\nc\t2015-04-14 06:24:09 UTC\t1\n""\t2015-04-14 06:24:09 UTC\t1\nd\t2015-04-14 06:24:10 UTC\t1\n

Configuration:
input {
  stdin {}
}

output {
  csv {
    fields => [ "message", "@timestamp", "@Version" ]
    csv_options => { "col_sep" => "\t" "row_sep" => "\n" }
    path => "/tmp/test.csv"
  }
}

logstash version: 1.4.2
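The likely cause is that Logstash config strings do not interpret escape sequences, so a "\t" in csv_options reaches the plugin as the two characters backslash and t, which Ruby's CSV then uses literally. A sketch of normalizing such values before handing them to CSV — the helper is my own, not plugin code:

```ruby
require "csv"

# Sketch: translate the literal backslash escapes that arrive from a
# Logstash config string into the control characters CSV expects.
ESCAPES = { "\\t" => "\t", "\\n" => "\n", "\\r" => "\r" }.freeze

def unescape_sep(sep)
  ESCAPES.fetch(sep, sep)
end

# "\\t" (backslash + t) is what the plugin receives when the config says "\t".
line = CSV.generate_line(["a", "2015-04-14 06:24:07 UTC", "1"],
                         col_sep: unescape_sep("\\t"))
puts line.inspect   # "a\t2015-04-14 06:24:07 UTC\t1\n"
```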

header row

Please post all product and debugging questions on our forum. Your questions will reach our wider community members there, and if we confirm that there is a bug, then we can open a new issue here.

For all general issues, please provide the following details for fast resolution:

  • Version:
  • Operating System:
  • Config File (if you have sensitive info, please remove it):
  • Sample Data:
  • Steps to Reproduce:

ES 5.4 error

Hi,

With ES 5.4, this config file produces the following errors:

input {
  elasticsearch {
	hosts => "http://localhost:9200"
	index => "sirene"
	query => '{"query": {"query_string" : {"query": "((DEPET:54 AND provider:sp_mairie) OR (DEPET:55 AND provider:sp_mairie) OR (DEPET:60 AND provider:sp_mairie))"}}}'
  }
}
output {
  csv {
	fields => ["nom", "street", "cp","nomcommune","nom_maire","siteweb","email"]
	path => "/home/data-prospection/public_html/jsondata/57cae745456ab85d4ff76a83ef3f1f0e/dt_0d83a4d79454b856ab249ca6496b17b1.csv"
  }
}
19:49:32.109 [[main]<elasticsearch] ERROR logstash.pipeline - A plugin had an unrecoverable error. Will restart this plugin.
  Plugin: <LogStash::Inputs::Elasticsearch hosts=>["http://localhost:9200"], index=>"sirene", query=>"{\"query\": {\"query_string\" : {\"query\": \"((DEPET:54 AND provider:sp_mairie) OR (DEPET:55 AND provider:sp_mairie) OR (DEPET:60 AND provider:sp_mairie))\"}}}", id=>"5c936185cc6a7e0f4f295c6b5c5a95250ce5f9b6-1", enable_metric=>true, codec=><LogStash::Codecs::JSON id=>"json_633de6cc-69e1-4a02-bd93-63fb1c7b92a6", enable_metric=>true, charset=>"UTF-8">, size=>1000, scroll=>"1m", docinfo=>false, docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"], ssl=>false>
  Error: [400] {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Failed to parse request body"}],"type":"illegal_argument_exception","reason":"Failed to parse request body","caused_by":{"type":"json_parse_exception","reason":"Unrecognized token 'DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAACsFm84Mm84SFpzU3VHSzdWMHdWQ3N3NGcAAAAAAAAArRZvODJvOEhac1N1R0s3VjB3VkNzdzRnAAAAAAAAAKsWbzgybzhIWnNTdUdLN1Ywd1ZDc3c0ZwAAAAAAAACuFm84Mm84SFpzU3VHSzdWMHdWQ3N3NGcAAAAAAAAArxZvODJvOEhac1N1R0s3VjB3VkNzdzRn': was expecting ('true', 'false' or 'null')\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@6037430d; line: 1, column: 457]"}},"status":400}

Thanks for your feedback (it works well with ES 5.3).

scientific number for float

I output a CSV file with the csv plugin; a field containing a float value is written in scientific notation (with an E).

Would it not be more logical to use decimal notation instead? (better for latitude and longitude)
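Ruby's default Float#to_s switches to scientific notation for magnitudes below 1e-4, which is likely what the plugin emits. A sketch of forcing plain decimal output — the helper and its precision are my own choices, not a plugin option:

```ruby
lat = 0.0000123

puts lat.to_s             # scientific notation, as reported
puts format("%.7f", lat)  # fixed decimal, 7 places: 0.0000123

# Helper: fixed-point with trailing zeros trimmed (keeps one place for x.0).
def plain_decimal(x, places = 7)
  format("%.#{places}f", x).sub(/0+\z/, "").sub(/\.\z/, ".0")
end

puts plain_decimal(48.8566)    # 48.8566
puts plain_decimal(0.0000123)  # 0.0000123
```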

method receive not called in logstash 5.x in favor of the inherited multi_receive_encoded
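The title describes a dispatch quirk: in Logstash 5.x the output base class delivers event batches through multi_receive_encoded, so a receive method defined in the plugin (visible in the 2.x stack trace above) is bypassed. A minimal stand-in sketch of that shadowing — the class names here are illustrative, not the real plugin classes:

```ruby
# Stand-in for the file-output base class: the 5.x batch entry point
# handles every event itself, never dispatching to per-event receive.
class FileOutput
  def multi_receive_encoded(events_and_encoded)
    events_and_encoded.each { |_event, encoded| write(encoded) }
  end

  def write(data)
    (@lines ||= []) << data
  end

  def lines
    @lines || []
  end
end

# Stand-in for the csv output: its receive override is dead code in 5.x.
class CsvOutput < FileOutput
  def receive(_event)
    raise "unreachable: the 5.x batch path never calls receive"
  end
end

out = CsvOutput.new
out.multi_receive_encoded([["event", "a,b\n"]])
puts out.lines.inspect   # ["a,b\n"] - receive was bypassed, no error raised
```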
