fstab / grok_exporter
Export Prometheus metrics from arbitrary unstructured log data.
License: Apache License 2.0
Hi
When using summary, what is the sliding window used for the percentiles?
Hi there.
Sorry if this is the wrong place to ask but has anyone built support for IIS logs? Cheers
Pete
Hi,
thanks for grok_exporter, looks really useful! I was wondering if there's support for reading multiple files within the same config/server ?
Release: v0.2.5
I have the following log line I am trying to grok pattern match:
{"blah": "foo", "somethin": "baz", "status": 200}
My grok pattern (tested on https://regex101.com, where it captures the value 200):
STATUS "status":\s+([^",}]*)
My Grok Metric (Config v2):
match: '%{STATUS:status}'
labels:
status: '{{.status}}'
I want .status to be just 200, but instead I am getting "status": 200.
The grok pattern in match seems to be using the whole match sequence from the pattern definition rather than the capture group. Is this a bug or a feature?
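A sketch of one workaround, assuming the usual grok semantics where %{STATUS:status} captures everything the STATUS pattern matches: move the literal "status": prefix out of the pattern definition and into the match line, so only the value itself is captured (pattern and metric names here are illustrative):

```yaml
# patterns file would contain only the value part:
#   STATUS [^",}]*
metrics:
  - type: counter
    name: status_example
    help: Illustrative counter labeled by the captured status value.
    match: '"status":\s+%{STATUS:status}'
    labels:
      status: '{{.status}}'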
Hi,
Is it possible to have "conditional" labels ? For example, consider the following patterns :
FOO foo
BAR bar
FOOBAR %{FOO:foo}%{BAR:bar}?
If I match on %{FOOBAR}
in my metrics, I can use {{.foo}}
in the labels, because it will always be present. However, as soon as I want to use {{.bar}}
, I run into problems because a line can match only %{FOO}
and not %{BAR}
, so sometimes bar
isn't in the template context, and grok_exporter becomes unhappy: Warning: Skipping log line: foobar: unexpected result when calling onig_name_to_group_numbers()
I tried to work around it by using {{if .bar}}{{.bar}}{{end}}
to no avail.
How could I achieve what I want there ? Is there a way to check that a template variable exists without making grok_exporter bail on onig_name_to_group_numbers()
?
Thanks
I'm running grok_exporter (version 0.2.6, build date 2018-10-08, branch master, revision 81c0afe, go version go1.11.1, platform linux-amd64). It starts successfully and parses log files, but my log files are rotated after reaching a certain size. When these log files get rotated, I see an error message and the grok_exporter process gets killed. Am I doing something wrong?
error message:
Starting server on http://0.0.0.0:9142/metrics
error reading log lines: failed to watch /var/prod/logs/xxxx/haproxy/http_hap.log: open /var/prod/logs/xxxx/haproxy/http_hap.log: permission denied
grok log files:
-rwxr-xr-x. 1 xxxx yyyy 206714514 May 27 03:15 http_hap.log-20190527.gz
-rwxr-xr-x. 1 xxxx yyyy 215198130 May 28 03:46 http_hap.log-20190528.gz
-rwxr-xr-x. 1 xxxx yyyy 291469041 May 29 03:19 http_hap.log-20190529.gz
-rwxr-xr-x. 1 xxxx yyyy 694520310 May 29 10:03 http_hap.log
My logs from many hosts are collected into Kafka. How can grok_exporter read its input from Kafka and export the metrics to Prometheus?
As suggested in #46 the time between two loglines is quite often a useful statistic.
In one of the applications running on our cluster, we get loglines like the following :
[2018-11-26 11:12:26 +0000] INFO com.somejavaclass.redacted.processingnode.verticle.ProcessingJobVerticle Received processing job with jobId: 9bfb55a0-ae31-42c6-a2df-00160e65986c
[2018-11-26 11:12:26 +0000] INFO com.somejavaclass.redacted.processingnode.verticle.ProcessingJobVerticle Created path for source file: /data/input/0b487e4a-4d65-49b8-a321-4b8a1f83d6b8
[2018-11-26 11:12:26 +0000] FINE com.somejavaclass.redacted.http.DownloadVerticle Received download message : {"com.somejavaclass.redacted.http.DownloadVerticle.save.path.key":"/data/input/0b487e4a-4d65-49b8-a321-4b8a1f83d6b8","com.somejavaclass.redacted.http.DownloadVerticle.download.url.key":"https://redacted.host.name/source/2F7dMYtNsRkMJGexWfLz-y/Eexz2sMfTiE89Gr536fSMc"}
[2018-11-26 11:12:26 +0000] FINE com.somejavaclass.redacted.http.ZookeeperVerticle Received resolve url message.
[2018-11-26 11:12:26 +0000] FINE com.somejavaclass.redacted.http.ZookeeperVerticle Url 'https://redacted.host.name/source/2F7dMYtNsRkMJGexWfLz-y/Eexz2sMfTiE89Gr536fSMc' is not an internal url, using it unmodified.
[2018-11-26 11:12:26 +0000] FINE com.somejavaclass.redacted.http.DownloadVerticle Found https url, configuring all trusting ssl client options.
[2018-11-26 11:12:26 +0000] INFO com.somejavaclass.redacted.http.DownloadVerticle Downloading file from 'https://redacted.host.name:-1/source/2F7dMYtNsRkMJGexWfLz-y/Eexz2sMfTiE89Gr536fSMc' and saving to: /data/input/0b487e4a-4d65-49b8-a321-4b8a1f83d6b8
[2018-11-26 11:12:26 +0000] WARNING io.netty.util.internal.logging.Slf4JLogger warn Failed to find a usable hardware address from the network interfaces; using random bytes: d2:41:fa:d0:d3:8c:b6:b9
[2018-11-26 11:12:26 +0000] INFO com.somejavaclass.redacted.http.DownloadVerticle Beginning download...
[2018-11-26 11:12:29 +0000] FINE com.somejavaclass.redacted.http.DownloadVerticle Closed file /data/input/0b487e4a-4d65-49b8-a321-4b8a1f83d6b8.
[2018-11-26 11:12:29 +0000] INFO com.somejavaclass.redacted.processingnode.verticle.ProcessingJobVerticle Download finished from: https://redacted.host.name/source/2F7dMYtNsRkMJGexWfLz-y/Eexz2sMfTiE89Gr536fSMc. File saved at : /data/input/0b487e4a-4d65-49b8-a321-4b8a1f83d6b8
The time between Download started (11:12:26) and Download finished (11:12:29) for a job is actually really interesting to us.
For multiple events, my suggestion for the flow (least confusion and easiest to code) :
11:00 [event 2 fires] (without event 1), do nothing because we have nothing to time against
11:01 [event 1 fires] log time.
11:02 [event 2 fires] check for a tracked event 1 and store the time between now and then and output the metric
11:03 [event 1 fires] log time
11:04 [event 1 fires] log time, overwrite slot
11:05 [event 2 fires] check for a tracked event 1 and store the time between now and then and output the metric (1 minute)
11:06 [event 2 fires] check for a tracked event 1 and store the time between now and then and output the metric (2 minutes because 11:06-11:04)
Whilst this doesn't give the ability to correlate between events, it's the least surprising and easiest to code. If there is some kind of correlation ID and events might nest or fire asynchronously, a regex for correlation id might be a nice to have, but that requires more storage, and lots more metric buckets.
@fstab thoughts?
Would be great if we could run this on a Raspberry Pi or on Scaleway: https://www.scaleway.com/armv8-cloud-servers/
https://prometheus.io/docs/instrumenting/writing_clientlibs/#summary
A summary MUST allow not having quantiles, as just _count/_sum is quite useful and this MUST be the default.
We have configured Grok exporter to monitor errors from various system logs. But it seems changes are reflected once we restart the respective grok instance.
Please see the config.yml below:
global:
  config_version: 2
input:
  type: file
  path: /ZAMBAS/logs/Healthcheck/EFT/eftcl.log
  readall: true
  poll_interval_seconds: 5
grok:
  patterns_dir: ./patterns
metrics:
  - type: gauge
    name: EFTFileTransfers
    help: Counter metric example with labels.
    match: '%{WORD:Status}\s%{GREEDYDATA:FileTransferTime};\s\\%{WORD:Customer}\\%{WORD:OutboundSystem}\\%{GREEDYDATA:File};\s%{WORD:Operation};\s%{NUMBER:Code}'
    value: '{{.Code}}'
    cumulative: false
    labels:
      Customer: '{{.Customer}}'
      OutboundSystem: '{{.OutboundSystem}}'
      File: '{{.File}}'
      Status: '{{.Status}}'
      Operation: '{{.Operation}}'
      FileTransferTime: '{{.FileTransferTime}}'
  - type: gauge
    name: EFTFileSuccessfullTransfers
    help: Counter metric example with labels.
    match: 'Success\s%{GREEDYDATA:Time};\s\\%{WORD:Customer}\\%{WORD:OutboundSystem}\\%{GREEDYDATA:File};\s%{WORD:Operation};\s%{NUMBER:Code}'
    value: '{{.Code}}'
    cumulative: false
  - type: gauge
    name: EFTFileFailedTransfers
    help: Counter metric example with labels.
    match: 'Failed\s%{GREEDYDATA:Time};\s\\%{WORD:Customer}\\%{WORD:OutboundSystem}\\%{GREEDYDATA:File};\s%{WORD:Operation};\s%{NUMBER:Code}'
    value: '{{.Code}}'
    cumulative: false
server:
  port: 9845
Without a restart it doesn't reflect the correct matching patterns. Once I restart the grok_exporter instance, it reflects them perfectly.
I have used a parameter suggested in some other issue, "poll_interval_seconds: 5", but this doesn't help me either.
Is there some parameter I am missing here?
Thanks Priyotosh
Can i start a single grok-exporter with multiple config files?
Eg.
./grok_exporter -config file1 file2 file3 file4
or
./grok_exporter -config file1 -config file2 -config file3 -config file4
The reason i ask is i have 4 static files in which i want to monitor the output of a string.
2 files i expect same string to exist and 2 files with 2 other strings
these are the strings
*=INFO<string>=DETAIL:<string>=DETAIL
*=info
*=INFO
I created 4 files to monitor each looking pretty much like
global:
  config_version: 2
input:
  type: file
  path: <path>/server.xml
  readall: true # Read from the beginning of the file? False means we start at the end of the file and read only new lines.
grok:
  patterns_dir: ./patterns
  additional_patterns:
    - 'EXIM_MESSAGE [a-zA-Z ]*'
metrics:
  - type: counter
    name: uview_server_xml_trace_level
    help: traceSpecification in the uview server.xml
    match: 'traceSpecification=%{QUOTEDSTRING:LOGLEVEL}'
    labels:
      LOGLEVEL: '{{.LOGLEVEL}}'
Multifile seems only to read the first file.
grok_exporter_build_info{branch="master",builddate="2019-04-08",goversion="go1.12.2",platform="linux-amd64",revision="e2ba841",version="0.2.7"} 1
Could you add a feature to run grok_exporter as a Windows service? I have tried to register it with
sc.exe create grok_exporter binPath= "c:\grok_exporter\grok_exporter.exe -config c:\grok_exporter\config.yml"
but the service failed to start with the following error:
The grok_exporter service failed to start due to the following error:
The service did not respond to the start or control request in a timely fashion.
Hi,
My case is: I have an nginx access log file. I want the exporter to tail the file and count every new line that matches the pattern. I also want an expiration time (e.g. 5 minutes) for the current counter.
It looks like this exporter can do neither?
Bests,
Autumn Wang
Hello
I'm using grok exporter and here is what I want to achieve: I have a Java application whose log entry is in below format:
{"@Version":1,"source_host":"fstest-stage-bm-62","message":"Known host file not configured, using user known host file: /home/.ssh/known_hosts","thread_name":"Camel (camel-1) thread #4 - aws-s3://fstest-stage-bm-62","@timestamp":"2019-08-28T07:52:12.526+00:00","level":"INFO","logger_name":"org.apache.cam.file.remote.oerations"}
global:
  config_version: 2
input:
  type: file
  path: ./example/test.log
  readall: true # Read from the beginning of the file? False means we start at the end of the file and read only new lines.
grok:
  patterns_dir: ./patterns
metrics:
  - type: counter
    name: error_test
    help: Counter metric example
    match: '%{NUMBER} %{JAVACLASS} %{JAVALOGMESSAGE} %{JAVATHREAD} %{TOMCAT_DATESTAMP} %{LOGLEVEL:severity} %{JAVAMETHOD}'
    labels:
      grok_field_name: severity
      prometheus_label: severity
server:
  host: 0.0.0.0
  port: 9144
============================
The test log file has 4 log lines, with one log line having log level ERROR. I did try accessing http://IP:9144/metrics and I see the below, but there is no metric created on Prometheus (grok_exporter is installed on the Prometheus host itself).
grok_exporter_line_processing_errors_total{metric="error_test"} 0
grok_exporter_lines_matching_total{metric="error_test"} 0
grok_exporter_lines_processing_time_microseconds_total{metric="error_test"} 0
grok_exporter_lines_total{status="ignored"} 4
grok_exporter_lines_total{status="matched"} 0
I do see the metric on Prometheus but it doesn't yield any value. Can someone please help me with a regex expression for my JSON log format, as I couldn't work out the correct matching pattern?
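For what it's worth, a much simpler match that keys only on the level field of the JSON line might look like this (an untested sketch; metric name and label are illustrative):

```yaml
metrics:
  - type: counter
    name: log_lines_by_level
    help: Counts log lines by their "level" field.
    match: '"level":"%{LOGLEVEL:severity}"'
    labels:
      severity: '{{.severity}}'
```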
Thanks
Hi,
Seems like the Tailer is not pushing lines over the channel ( Lines() ) when the given filepath is a symbolic link.
I am instantiating the tailer this way:
logtailer := tailer.RunFseventFileTailer(filepath, false, true, logger)
the filepath is pointing to the file named access.log that is a symbolic link
-rw-r--r-- 1 dbenque dbenque 521 Feb 20 23:38 access0
lrwxrwxrwx 1 dbenque dbenque 7 Feb 20 23:38 access.log -> access0
At the same time I have a process writing to the file access0. The lines finally arrive in one bulk when the process writing to access0 terminates.
Would it be possible to support Symbolic Link for the tailer?
Note:
> uname -a
Linux ncelrnd0228 4.15.0-45-generic #48-Ubuntu SMP Tue Jan 29 16:28:13 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Thanks,
David
I have tried to access metrics from the Apache logs using grok_exporter in a Windows 10 environment. I am getting the following issues/errors:
1. "Error reading log lines: failed to watch C:\Apache24\logs\access_log: read error: read C:\Apache24\logs\access_log: The process cannot access the file because another process has locked a portion of the file" after some time.
2. When hitting the Apache web server with some load, "http://localhost:9144/metrics" shows old metrics, not updated metrics (I tried both true and false for readall).
3. I took the exe file from this link.
4. When I run "tail -f /var/log/access_log" in one terminal and grok_exporter in another, real-time metrics appear in the browser, but after some time it ends with the error mentioned in (1).
I'm a newbie with Prometheus and exporters. Thanks for your great work. I believe this exporter is fairly useful for monitoring folks.
I have a question about usage of this exporter.
Situation: I'm trying to use this exporter to retrieve CPU, memory and disk usage metrics from the log messages of a certain OSS**, but it seems grok_exporter returns the same values repeatedly even after incoming logs have completely stopped. After stopping the target application (and its log shipping), the exporter continues to show me "CPU usage: x %" forever, as if the application were still running.
**Should I use some kind of API that returns metrics in real time? Yes, I should. Unfortunately, I cannot get nice metrics from its API, so I need to use logs to monitor them. :-(
Problem: I cannot distinguish whether a metric is dead or really keeping the same value. This also makes Prometheus queries difficult: because stale values remain, avg() and similar functions return wrong results.
Question: Is there any way to make the exporter forget very old values after an interval we choose? Or is there some other good practice to resolve this situation?
I tried:
cpu_percentage{app_id="<UUID>",app_name="<NAME>", app_container_index="<NUM>",instance="grok_exporter:9144",job="grok"}[1m]
5.00000000 @1493027556.79
5.00000000 @1493027571.79
5.00000000 @1493027586.79
5.00000000 @1493027601.79
...repeats forever
I don't know very nice solution but guessing:
cpu_percentage{app_id="<UUID>",app_name="<NAME>", app_container_index="<NUM>",instance="grok_exporter:9144",job="grok"}[1m]
5.00000000 @1493027556.79
5.00000000 @1493027571.79
(Any more metric values are returned after passing expire interval, until receiving new logs matching same grok expression.)
...
X.XXXXXXXX @1493030000.79 (if new matched log message appears.)
Note: Using cumulative:true and idelta() in a Prometheus query might be an imperfect workaround. idelta() returns 0 in the previous situation; however, I couldn't tell whether the application [container] is dead or really keeping a 0 value. For example, a query like avg(idelta(cpu_percentage_cumulative{app_name="<NAME>", app_container_index=".+"}[60s]))
returns wrongly low values after we scale in (shrink) this application, because the removed containers' values stay at 0.
Hello,
I'm trying to capture multiline pattern from the following text:
2018-10-08 06:55:35.156330 0x00007f3e3569c700: <info> (health::main.cpp@169) peer-node-0
ACNTST C : 20'032 | ACNTST C HVA : 22 | ACTIVE PINGS : 0 | B WRITERS : 0 |
BLK ELEM ACT : 0 | BLK ELEM TOT : 2'217 | BLKDIF C : 665 | HASH C : 13'013 |
HASHLOCK C : 0 | MEM CUR RSS : 65 | MEM CUR VIRT : 774 | MEM MAX RSS : 65 |
MEM SHR RSS : 27 | MOSAIC C : 1 | MOSAIC C DS : 1 | NS C : 1 |
NS C AS : 1 | NS C DS : 1 | RB COMMIT ALL : 0 | RB COMMIT RCT : 0 |
RB IGNORE ALL : 0 | RB IGNORE RCT : 0 | READERS : 3 | SECRETLOCK C : 0 |
SUCCESS PINGS : 0 | TASKS : 11 | TOTAL PINGS : 0 | TS NODE AGE : 13 |
TS OFFSET ABS : 0 | TS OFFSET DIR : 0 | TS TOTAL REQ : 0 | TX ELEM ACT : 0 |
TX ELEM TOT : 3'081 | UNLKED ACCTS : 1 | UT CACHE : 0 | WRITERS : 0 |
2018-10-08 06:55:35.156661 0x00007f3e3569c700: <info> (health::main.cpp@169) peer-node-1
ACNTST C : 20'032 | ACNTST C HVA : 22 | ACTIVE PINGS : 0 | B WRITERS : 0 |
BLK ELEM ACT : 0 | BLK ELEM TOT : 2'236 | BLKDIF C : 665 | HASH C : 13'013 |
HASHLOCK C : 0 | MEM CUR RSS : 65 | MEM CUR VIRT : 770 | MEM MAX RSS : 66 |
MEM SHR RSS : 27 | MOSAIC C : 1 | MOSAIC C DS : 1 | NS C : 1 |
NS C AS : 1 | NS C DS : 1 | RB COMMIT ALL : 0 | RB COMMIT RCT : 0 |
RB IGNORE ALL : 0 | RB IGNORE RCT : 0 | READERS : 2 | SECRETLOCK C : 0 |
SUCCESS PINGS : 0 | TASKS : 10 | TOTAL PINGS : 0 | TS NODE AGE : 13 |
TS OFFSET ABS : 0 | TS OFFSET DIR : 0 | TS TOTAL REQ : 0 | TX ELEM ACT : 0 |
TX ELEM TOT : 2'063 | UT CACHE : 0 | WRITERS : 1 |
I want:
Capture MEM CUR VIRT that is correlated to peer-node-0 (that would be the first occurrence of MEM CUR VIRT)
I wrote the following pattern in config.yml:
global:
  config_version: 2
input:
  type: file
  path: ./example/my_file.log
  readall: true # Read from the beginning of the file? False means we start at the end of the file and read only new lines.
grok:
  patterns_dir: ./patterns
metrics:
  - type: gauge
    name: peer_node_0_MEM_CUR_VIRT
    help: peer_node_0_MEM_CUR_VIRT
    match: '(?m)%{GREEDYDATA}peer-node-0%{GREEDYDATA}MEM%{SPACE}CUR%{SPACE}VIRT%{SPACE}:%{SPACE}%{INT:data}%{GREEDYDATA}peer-node-1%{GREEDYDATA}'
    value: '{{.data}}'
server:
  port: 9144
What I get: this doesn't work in grok_exporter, even though it works on http://grokdebug.herokuapp.com/.
If I write following match:
%{GREEDYDATA}MEM%{SPACE}CUR%{SPACE}VIRT%{SPACE}:%{SPACE}%{INT:data}%{GREEDYDATA}
it will work in grok_exporter, but I will not have the correlation to a specific peer-node (the last occurrence will be taken).
Is there any other way to capture multi-line pattern in grok exporter?
Thanks for the library and your passion. Is there a way to skip matching some records by overriding a match rule? I keep getting a lot of 404s from /phpMyAdmin/ and other bot attacks, and I want to avoid creating time series for them by ignoring them in grok_exporter.
Is there any way to achieve this with the current grok_exporter?
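One possible approach, assuming Oniguruma's negative lookahead syntax is accepted inside a match expression (an untested sketch; the metric name and the access-log fragment are illustrative):

```yaml
metrics:
  - type: counter
    name: http_requests_total
    help: Requests, excluding known bot paths such as /phpMyAdmin/.
    # (?!...) is a negative lookahead: the line only matches when the
    # request path does not start with /phpMyAdmin
    match: '"(?:GET|POST) (?!/phpMyAdmin)%{URIPATHPARAM:uri}'
```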
I use authguard to set up SSL and authentication to prevent the misuse of sensitive data from our Apache logs. Please implement an option for running the exporter on localhost only, like many other exporters do.
Grok Exporter does not support the web.telemetry-path option to set the path under which to expose metrics.
This is available in node_exporter as well as consul exporter, can this option be provided?
eg. Line 74 here: https://github.com/prometheus/node_exporter/blob/master/node_exporter.go
I really like the grok_exporter - it works like a charm with little effort and I can reuse my grok patterns.
Is there any way to use multiple fields for counting? For example:
timestamp, field1, field2, field3 ...
I would like to be able to count (and group by) "field1.field3", instead of counting each field separately.
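Since label values are Go templates, one workaround (a sketch under that assumption; field names and the CSV-style match are illustrative) is to concatenate several fields into a single label, so the counter is grouped by the combination:

```yaml
metrics:
  - type: counter
    name: events_by_field1_field3
    help: Counts lines grouped by the combination of field1 and field3.
    match: '%{TIMESTAMP_ISO8601:timestamp}, %{DATA:field1}, %{DATA:field2}, %{DATA:field3}'
    labels:
      field1_field3: '{{.field1}}.{{.field3}}'
```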
I have a fairly frequently updated log file on which I am using grok_exporter to count specific log entries. I see that the exporter's resident memory usage is quite high and growing. This happens about 3/4 of the time in my environment.
The config is:
global:
  config_version: 2
grok:
  additional_patterns:
    - BASE10NUM (?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+)))
    - DATA .*?
    - NUMBER (?:%{BASE10NUM})
    - DAEMON [a-zA-Z_\-]+
    - NAELOGLEVEL LOG_[\S+]
    - MESSAGE [\S\s\S]?*
    - YEAR (?>\d\d){1,2}
    - HOUR (?:2[0123]|[01]?[0-9])
    - MINUTE (?:[0-5][0-9])
    - SECOND (?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)
    - TIME (?!<[0-9])%{HOUR}:%{MINUTE}(?::%{SECOND})(?![0-9])
    - MONTHNUM (?:0?[1-9]|1[0-2])
    - MONTHDAY (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])
    - ISO8601_TIMEZONE (?:Z|[+-]%{HOUR}(?::?%{MINUTE}))
    - TIMESTAMP_ISO8601 %{YEAR}-%{MONTHNUM}-%{MONTHDAY}[T ]%{HOUR}:?%{MINUTE}(?::?%{SECOND})?%{ISO8601_TIMEZONE}?
    - LEVEL (LOG_)?([Aa]lert|ALERT|[Tt]race|TRACE|[Dd]ebug|DEBUG|DBG|[Nn]otice|NOTICE|[Ii]nfo|INFO|[Ww]arn?(?:ing)?|WARN?(?:ING)?|[Ee]rr|ERR?(?:OR)?|[Cc](rit)+(?:ical)?|CRIT?(?:ICAL)?|[Ff]atal|FATAL|[Ss]evere|SEVERE|EMERG(?:ENCY)?|[Ee]merg(?:ency)?)
input:
  path: /var/log/messages
  poll_interval_seconds: 5
  type: file
metrics:
  - help: all log entries seperated by daemon
    labels:
      daemon: '{{.daemon}}'
    match: '%{TIMESTAMP_ISO8601:time} %{DAEMON:daemon}.*%{LEVEL:level}.*'
    name: benchmark_all_daemon_log_entry_counts
    type: counter
  - help: count of panic entries in the log by go daemons
    labels:
      daemon: '{{.daemon}}'
    match: '%{TIMESTAMP_ISO8601:time} %{DAEMON:daemon}.*panic.*'
    name: benchmark_go_daemon_log_panic_counts
    type: counter
  [other confidential counters]
[other confidential counters]
Hello, I am a packager for Gentoo linux.
I have been asked to look into packaging this and making it available to our users.
I see from your README.md that you require oniguruma 5.9.6; however, that version is not available in Gentoo. We have versions 6.8.2, 6.9.0 and 6.9.1.
What are your plans for updating this? Do you have any idea when you will support a newer version?
Thanks much.
Some related links are below. This is most useful when running in webhook mode.
https://microservices.io/patterns/observability/health-check-api.html
Hi,
Can we have inotify support, so that when a file is rotated we pick up the new one? Thanks
I want to add just a static label like x="y", with no connection to the grok fields. May I?
Thanks
Our log files are written in ASCII and can contain special characters like äöü in German.
I searched for a possibility to set the codec in config.yml, like in a logstash config file:
codec => json { charset => "ASCII-8BIT" }
Is this possible to add to the config.yml?
Unable to find the dist/ dir after running the release script. Logs below
some_dir/gowork/src/github.com/grok_exporter > git log | head
commit 27d47f313b43a9f0981c83a9a742fe716b6f84ee
Author: Fabian Stäber <[email protected]>
Date: Sun May 5 22:05:48 2019 +0200
#5 add file tailer test for multiple log files
commit f6cf48430d31d72d88526efd4339200567c5b348
Author: Fabian Stäber <[email protected]>
Date: Wed May 1 22:45:58 2019 +0200
some_dir/gowork/src/github.com/grok_exporter > ./release.sh linux-amd64
? github.com/fstab/grok_exporter [no test files]
ok github.com/fstab/grok_exporter/config (cached)
ok github.com/fstab/grok_exporter/config/v1 (cached)
ok github.com/fstab/grok_exporter/config/v2 (cached)
ok github.com/fstab/grok_exporter/exporter (cached)
ok github.com/fstab/grok_exporter/oniguruma (cached)
ok github.com/fstab/grok_exporter/tailer (cached)
? github.com/fstab/grok_exporter/tailer/fswatcher [no test files]
ok github.com/fstab/grok_exporter/tailer/glob (cached)
ok github.com/fstab/grok_exporter/template (cached)
Building dist/grok_exporter-0.2.8-SNAPSHOT.linux-amd64.zip
go version: go version go1.12.2 linux/amd64
some_dir/gowork/src/github.com/grok_exporter > ll
total 288K
-rw-r--r-- 1 raokru warp 32 May 8 14:49 AUTHORS
-rw-r--r-- 1 raokru warp 3.5K May 8 14:49 BUILTIN.md
drwxr-xr-x 4 raokru warp 4.0K May 8 14:49 config/
-rw-r--r-- 1 raokru warp 22K May 8 14:49 CONFIG.md
-rw-r--r-- 1 raokru warp 14K May 8 14:49 CONFIG_v1.md
drwxr-xr-x 2 raokru warp 4.0K May 8 14:49 example/
drwxr-xr-x 2 raokru warp 4.0K May 8 14:49 exporter/
-rw-r--r-- 1 raokru warp 550 May 8 14:49 go.mod
-rw-r--r-- 1 raokru warp 11K May 8 14:49 grok_exporter.go
-rw-r--r-- 1 raokru warp 2.1K May 8 14:49 HOWTO_VERIFY_RELEASES.md
-rwxr-xr-x 1 raokru warp 12K May 8 14:49 integration-test.sh*
-rw-r--r-- 1 raokru warp 10K May 8 14:49 LICENSE
drwxr-xr-x 2 raokru warp 4.0K May 8 14:49 logstash-patterns-core/
-rw-r--r-- 1 raokru warp 673 May 8 14:49 NOTICE
drwxr-xr-x 2 raokru warp 4.0K May 8 14:49 oniguruma/
-rw-r--r-- 1 raokru warp 8.1K May 8 14:49 README.md
-rwxr-xr-x 1 raokru warp 7.4K May 8 14:49 release.sh*
-rw-r--r-- 1 raokru warp 141K May 8 14:49 screenshot.png
drwxr-xr-x 4 raokru warp 4.0K May 8 14:49 tailer/
drwxr-xr-x 2 raokru warp 4.0K May 8 14:49 template/
some_dir/gowork/src/github.com/grok_exporter >
consider the following code inside a file named t.go
package main

import (
	"fmt"

	"github.com/fstab/grok_exporter/tailer"
)

func main() {
	t := tailer.RunFileTailer("hello", false, nil)
	for line := range t.Lines() {
		fmt.Println(line)
	}
}
The file hello contains the following:
line1
Now run t.go. It will wait for new lines to be written to hello; it won't read anything since we passed false to RunFileTailer.
Now add line2 to hello:
$ echo "line2" >> hello
our program will output the following:
line2
Great, that's exactly what we're looking for. Now let's add line3:
$ echo "line3" >> hello
our program will output the following:
line1
line2
line3
Not exactly what we wanted: instead of getting only the newest line, we got the whole file.
Go 1.8
macOS 10.12.3
go test
passes
Would it be possible to configure the grok section to only create metrics for the last matched line? I am trying to grok CSV files but each line already contains the aggregate result.
Hey,
Here is my Grok query but for some reason it cannot find a match when I have the brackets in the Referrer and user agent name.
%{TIMESTAMP_ISO8601:logtime} %{WORD:s-sitename} %{WORD:s-computername} %{IPORHOST:s-ip} %{WORD:cs-method} %{NOTSPACE:cs-uri-stem} %{NOTSPACE:cs-uri-query} %{NUMBER:s-port} %{NOTSPACE:cs-username} %{IPORHOST:c-ip} %{NOTSPACE:cs-version} %{NOTSPACE:cs(User-Agent)} %{NOTSPACE:cs(Referer)} %{IPORHOST:cs-host} %{NUMBER:sc-status} %{NUMBER:sc-substatus} %{NUMBER:c-win32-status} %{NUMBER:sc-bytes} %{NUMBER:cs-bytes} %{NUMBER:time-taken}
Example log item:
2018-02-02 00:01:32 W3SVC1 UKAPPSVR 172.18.131.173 GET /123/I/Home/PLMonstants - 80 Joe+Bloggs 172.18.17.185 HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+Trident/7.0;+rv:11.0)+like+Gecko https://blahblah.co.uk/theappname/live/app/thingy localhost 200 0 0 3393 2644 90
was using http://grokconstructor.appspot.com/do/match to validate?
Any ideas what I could be doing wrong, or if there is something I can change in the query string to work around the bracket issue?
Unfortunately I cannot change the name of the field as we push into splunk as well.
Thanks.
Pete
We are running grok_exporter in a docker container. The logstash collects and writes to the logfile that grok_exporter is picking up. Also logstash writes to another file that Splunk is picking up.
Splunk reports a higher transaction count, which we believe to be the correct one; grok_exporter seems unable to keep up and only reports about 1/3 of the TPS. grok_exporter seems to catch up only during the early-morning low-traffic hours.
There is no out-of-memory or high-CPU issue.
We have 1 counter, 2 gauges, 3 histograms, each with 11 labels. In the /metrics endpoint, it produces about 70k different time series.
How do you recommend to go about debugging this issue?
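One place to start, sketched with the exporter's built-in metrics that appear elsewhere in these issues: compare how fast lines arrive with how long each line takes to process, to see whether matching itself is the bottleneck.

```promql
# average processing cost per line, in microseconds (illustrative query)
sum(rate(grok_exporter_lines_processing_time_microseconds_total[5m]))
  / sum(rate(grok_exporter_lines_total[5m]))
```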
Build Info:
branch="master",builddate="2018-05-31",goversion="go1.10.2",platform="linux-amd64",revision="62d82f8",version="0.2.5"
Config Info:
global:
  config_version: 2
input:
  type: file
  path: /usr/share/logstash/output/log/apg_stats.log
  fail_on_missing_logfile: false
  readall: false
  poll_interval_seconds: 5
When a grok field is referenced in the following way:
{{if eq .field "value"}}text{{end}}
The field is not detected by referencedGrokFields
(in metrics.go).
Example config:
global:
  config_version: 2
input:
  type: file
  path: ./input
  readall: true
grok:
  patterns_dir: ./patterns
metrics:
  - type: gauge
    name: example
    help: not empty
    match: '%{NOTSPACE:val1} %{NOTSPACE:val2}'
    value: '{{.val1}}'
    labels:
      my_label: '{{if eq .val2 "test"}}yes{{else}}no{{end}}'
With input file:
1 test
2 nomatch
This results in the following error:
WARNING: Skipping log line: unexpected error while evaluating my_label template: template: my_label:1:5: executing "my_label" at <eq .val2 "test">: error calling eq: invalid type for comparison
If we change the definition of my_label
to:
my_label: '{{.val2}} {{if eq .val2 "test"}}yes{{else}}no{{end}}'
The metric works as expected:
# HELP example not empty
# TYPE example gauge
example{my_label="nomatch no"} 2
example{my_label="test yes"} 1
Hello All,
I want to hide default labels in Grok Exporter output.
My Output is:
WLSRESTARTFLAG{exported_instance="https://171.17.22.7:2002/console/login/LoginForm.jsp",instance="muclhp522:9968",job="grok"} 0
WLSRESTARTFLAG{exported_instance="https://171.17.22.6:3011/weblogicStatus/index.html",instance="muclhp522:9968",job="grok"} 1
WLSRESTARTFLAG{exported_instance="https://171.17.22.5:3011/weblogicStatus/index.html",instance="muclhp522:9968",job="grok"} 0
WLSRESTARTFLAG{exported_instance="https://171.17.22.4:3012/weblogicStatus/index.html",instance="muclhp522:9968",job="grok"} 0
WLSRESTARTFLAG{exported_instance="https://171.17.22.3:3011/weblogicStatus/index.html",instance="muclhp522:9968",job="grok"} 0
WLSRESTARTFLAG{exported_instance="https://171.17.22.2:3012/weblogicStatus/index.html",instance="muclhp522:9968",job="grok"} 0
WLSRESTARTFLAG{exported_instance="https://171.17.22.1:3011/weblogicStatus/index.html",instance="muclhp522:9968",job="grok"} 0
I don't want the labels instance and job in the Prometheus output, as they are causing difficulties in setting up alerts in Alertmanager.
How can I hide them? I don't want them to be printed.
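Worth noting: the instance and job labels are attached by Prometheus at scrape time, not by grok_exporter, so they are controlled from the Prometheus side. A sketch of a scrape config (target address taken from the output above): honor_labels: true makes Prometheus keep the exporter's own instance label instead of renaming it to exported_instance.

```yaml
scrape_configs:
  - job_name: grok
    honor_labels: true   # keep the exporter's labels on conflict
    static_configs:
      - targets: ['muclhp522:9968']
```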
Thanks
Priyotosh
Hello,
I've seen that support for multiple files is one of the most requested features, but would it be possible to use a file that has a datestamp in the filename? I'm monitoring an application that appends the current date and time to its log files. I know that wildcards that are being considered as part of multiple file support would 'solve' this issue, but it would potentially mean reading through dozens of log files when all I'm really interested in is the most recent one.
Thanks.
In my grok configuration file, I enabled retention setting for gauge metrics.
retention: 5m
I use curl command to get the metrics.
Even after 10 minutes of the metrics being generated, I can still see my self-defined metrics at http://localhost:9144/metrics.
Would someone help me understand how the retention setting works, and why the metrics still exist at http://localhost:9144/metrics?
The command I use:
curl http://localhost:9144/metrics | grep my_metrics | wc -l
...
546
Should I use some other parameter setting to get the correct result? I expect expired metrics not to be present in /metrics.
Hello All,
We have configured Grok exporter to monitor errors from the web service logs. We see that even when there are NO errors it still prints the past count of errors.
We have used "gauge" as the metric type and polling the log file every 5 secs.
Please see the config.yml below:
global:
  config_version: 2
input:
  type: file
  path: /ZAMBAS/logs/Healthcheck/AI/ai_17_grafana.log
  readall: true
  poll_interval_seconds: 5
grok:
  patterns_dir: ./patterns
metrics:
  - type: counter
    name: OutOfThreads
    help: Counter metric example with labels.
    match: '%{GREEDYDATA} WARN!! OUT OF THREADS: %{GREEDYDATA}'
  - type: counter
    name: OutOfMemory
    help: Counter metric example with labels.
    match: '%{GREEDYDATA}: Java heap space'
  - type: gauge
    name: NoMoreEndpointPrefix
    help: Counter metric example with labels.
    match: '%{GREEDYDATA}: APPL%{NUMBER:val1}: IO Exception: Connection refused %{GREEDYDATA}'
    value: '{{.val1}}'
    cumulative: false
  - type: gauge
    name: IOExceptionConnectionReset
    help: Counter metric example with labels.
    match: ' <faultstring>APPL%{NUMBER:val3}: IO Exception: Connection reset'
    value: '{{.val3}}'
    cumulative: false
  - type: gauge
    name: IOExceptionReadTimedOut
    help: Counter metric example with labels.
    match: ' <faultstring>APPL%{NUMBER:val4}: IO Exception: Read timed out'
    value: '{{.val4}}'
    cumulative: false
  - type: gauge
    name: FailedToConnectTo
    help: Counter metric example with labels.
    match: " <faultstring>RUNTIME0013: Failed to connect to '%{URI:val5}"
    value: '{{.val5}}'
    cumulative: false
server:
  port: 9244
Result:
grok_exporter_lines_matching_total{metric="FailedToConnectTo"} 0
grok_exporter_lines_matching_total{metric="IOExceptionConnectionReset"} 0
grok_exporter_lines_matching_total{metric="IOExceptionReadTimedOut"} 3
grok_exporter_lines_matching_total{metric="NoMoreEndpointPrefix"} 0
grok_exporter_lines_matching_total{metric="OutOfMemory"} 0
grok_exporter_lines_matching_total{metric="OutOfThreads"} 0
Say for 1 hour there were no errors: it still shows '3' errors, and when an error does occur it keeps adding up, so the total becomes 4, and so on.
I want grok_exporter to show only the present data without adding previous values.
Please help us understand what we are doing wrong.
Thanks
Priyotosh
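One idea worth trying (a sketch, not a confirmed fix): the gauges keep their last value between matches, so attaching the retention setting used in other configs in this tracker would let a stale value expire instead of lingering. Retention reportedly only takes effect for labeled metrics, so a hypothetical label is added here:

```yaml
- type: gauge
  name: IOExceptionReadTimedOut
  help: Gauge that expires when no line has matched recently.
  match: ' <faultstring>APPL%{NUMBER:val4}: IO Exception: Read timed out'
  value: '{{.val4}}'
  labels:
      code: '{{.val4}}'        # hypothetical label so that retention applies
  cumulative: false
  retention: 5m                # value dropped ~5m after the last match
```

Note also that a gauge reports the last matched value; for a running error count that resets each interval, this expiry behaviour may be closer to what you want than a counter.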
TL;DR: Create new fields from other fields (or replace values of existing ones) via regex or other built-in functions, just like logstash's mutate plugin.
Context: I've just discovered this tool (and mtail) after trying to perform tail+parse+count data processing in an existing PHP application (disclaimer: it failed).
grok_exporter seems great, but there is one feature I would miss: data mutations.
The ability to alter fields before exporting to Prometheus (just like logstash's mutate plugin) would be awesome.
In my use case I am reading an Apache access.log file, and I want to export an HTTP request count with the following dimensions/labels:
- the base URL (e.g. http://example.com/foo.asp?id=42 => http://example.com): some regex would do (or I could adapt the line matching regex)
- selected query parameters: from http://example.com/foo.asp?id=42&source=github&foo=bar I want the following fields (thus labels): id and foo. I get that dropping labels is already something grok_exporter can do, so having a mutation that creates a label for every found query parameter is fine.
Other use cases (not mine): normalizing values, e.g. labeling requests by connected_user and replacing the - value by guest.
The following config will result in an error.
labels:
    label: '{{gsub .label "[[:lower:]]" "\\U\\0"}}'
Failed to load /etc/grok_exporter/grok.yml: invalid configuration: failed to read metric label test_metric error parsing label template: syntax error in gsub call: '\U\0' is not a valid replacement: invalid escape sequence: don't forget to put a . (dot) in front of grok fields, otherwise it will be interpreted as a function.
Issue observed with 0.2.6, 0.2.7, and a source build from master on April 25.
I have configured 3 instances of grok_exporter on CentOS. Two instances run without issue, but one exports metrics for only a few minutes before entering a high-CPU state. In the high-CPU state, the exporter still responds to scrape requests, but its backlog grows at 3-4 rows per second. While operating normally, the "bad" instance's grok_exporter_lines_processing_time_microseconds_total is < 500 µs; this metric drops to 0 once the instance enters the high-CPU state.
What debug information can I collect to help investigate this issue?
It would be great if we could use retention for metric values without labels.
Currently, retention is only supported for metrics with labels.
Our issue: while converting units of values in the file (e.g. from KiB/s to MiB/s), I observe two different values in the metrics, which eventually display as two different plots/lines in a Grafana dashboard.
example,
metrics:
    - type: gauge
      name: bandwidth_randwrite
      help: FIO Bandwidth Random Write Gauge Metrics
      match: ' write: IOPS=%{GREEDYDATA}, BW=%{NUMBER:val2}%{GREEDYDATA:kibs} %{GREEDYDATA}'
      value: '{{if eq .kibs "KiB/s"}}{{divide .val2 1024}}{{else}}{{.val2}}{{end}}'
      labels:
          bw_unit: '{{.kibs}}'
      cumulative: false
      retention: 1m
Result Metrics:
bandwidth_randwrite{bw_unit="KiB/s"} 2.209
bandwidth_randwrite{bw_unit="MiB/s"} 3.205
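One possible workaround (a sketch, not a confirmed fix): the two series differ only in the bw_unit label, so emitting a constant unit label after converting the value should collapse them into a single series:

```yaml
- type: gauge
  name: bandwidth_randwrite
  help: FIO Bandwidth Random Write Gauge Metrics
  match: ' write: IOPS=%{GREEDYDATA}, BW=%{NUMBER:val2}%{GREEDYDATA:kibs} %{GREEDYDATA}'
  value: '{{if eq .kibs "KiB/s"}}{{divide .val2 1024}}{{else}}{{.val2}}{{end}}'
  labels:
      bw_unit: 'MiB/s'   # constant label: the value is always in MiB/s after conversion
  cumulative: false
  retention: 1m
```

Since the label no longer varies with the input line, Grafana should then show one line for bandwidth_randwrite instead of two.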
Whenever I run the command "./grok_exporter -config ./example/config.yml", I get the following error.
SIGILL: illegal instruction
PC=0x438d4fa m=8 sigcode=1
signal arrived during cgo execution
goroutine 1 [syscall, locked to thread]:
runtime.cgocall(0x4383910, 0xc420051600, 0x4485c19)
/usr/local/Cellar/go/1.9.2/libexec/src/runtime/cgocall.go:132 +0xe4 fp=0xc4200515c0 sp=0xc420051580 pc=0x4003614
github.com/fstab/grok_exporter/exporter._Cfunc_onig_new(0xc4204d45e0, 0x6000000, 0x600076e, 0x0, 0x46ed318, 0x46ed1b8, 0xc4204d6cc0, 0x0)
github.com/fstab/grok_exporter/exporter/_obj/_cgo_gotypes.go:294 +0x4d fp=0xc420051600 sp=0xc4200515c0 pc=0x437bbcd
github.com/fstab/grok_exporter/exporter.(*OnigurumaLib).Compile.func1(0xc4204d45e0, 0x6000000, 0x600076e, 0x0, 0x46ed318, 0x46ed1b8, 0xc4204d6cc0, 0x53)
/Users/fabian/go/src/github.com/fstab/grok_exporter/exporter/oniguruma.go:77 +0x19f fp=0xc420051660 sp=0xc420051600 pc=0x437e1ef
github.com/fstab/grok_exporter/exporter.(*OnigurumaLib).Compile(0xc4204fa0a8, 0xc42017f800, 0x76e, 0x0, 0x0, 0x0)
/Users/fabian/go/src/github.com/fstab/grok_exporter/exporter/oniguruma.go:77 +0x179 fp=0xc4200516f0 sp=0xc420051660 pc=0x437c4a9
github.com/fstab/grok_exporter/exporter.Compile(0xc420168000, 0x6d, 0xc42011a050, 0xc4204fa0a8, 0x0, 0x1, 0x4042fb4)
/Users/fabian/go/src/github.com/fstab/grok_exporter/exporter/grok.go:31 +0x9c fp=0xc4200517a8 sp=0xc4200516f0 pc=0x4372bfc
main.createMetrics(0xc420158000, 0xc42011a050, 0xc4204fa0a8, 0x0, 0x0, 0x0, 0x0, 0xc420064970)
/Users/fabian/go/src/github.com/fstab/grok_exporter/grok_exporter.go:178 +0x15c fp=0xc420051b10 sp=0xc4200517a8 pc=0x43819ec
main.main()
/Users/fabian/go/src/github.com/fstab/grok_exporter/grok_exporter.go:62 +0x14d fp=0xc420051f80 sp=0xc420051b10 pc=0x438000d
runtime.main()
/usr/local/Cellar/go/1.9.2/libexec/src/runtime/proc.go:195 +0x226 fp=0xc420051fe0 sp=0xc420051f80 pc=0x402da76
runtime.goexit()
/usr/local/Cellar/go/1.9.2/libexec/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc420051fe8 sp=0xc420051fe0 pc=0x4059551
rax 0x0
rbx 0x1
rcx 0x5
rdx 0x700000392be4
rdi 0x46ed318
rsi 0x1
rbp 0x700000392d90
rsp 0x700000392aa0
r8 0x0
r9 0x0
r10 0x467bfb0
r11 0x6ffffbd165fc
r12 0x700000392bb0
r13 0x1
r14 0xc4204d6cc0
r15 0x4d00020
rip 0x438d4fa
rflags 0x10297
cs 0x2b
fs 0x0
gs 0x0
Kindly look into this issue.
Hi,
Firstly, thank you for the wonderful work with this; it works really well.
I just wanted to check whether there is a way for subtext within a capture group to be replaced with something else.
For example, I have a lot of URLs being captured, like:
/abc/def/ghi/number/123123/jkl/mno
/abc/def/ghi/number/123124/jkl/mno
/abc/def/ghi/number/987654/jkl/mno
/abc/def/ghi/number/654763/jkl/mno
Currently I do not need the digits in the URL, and they are creating a lot of unique metrics. Is there a way in which, after capturing, the URL can be rewritten to something like this:
/abc/def/ghi/number/x/jkl/mno
This would also mean that instead of 5 separate metrics with count 1, I would have 1 metric with count 5.
Thanks,
Varunn
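If grok_exporter itself cannot rewrite captured text, one workaround is to normalize the lines before they reach the exporter (e.g. in a small preprocessor that feeds the stdin input). A sketch in Go of the substitution described above; the function name is my own illustration, not a grok_exporter feature:

```go
package main

import (
	"fmt"
	"regexp"
)

// numericSegment matches the numeric path segment after /number/.
var numericSegment = regexp.MustCompile(`/number/[0-9]+`)

// normalizePath collapses the numeric ID so all such URLs map to a single
// value, turning many one-count series into one series with a higher count.
func normalizePath(p string) string {
	return numericSegment.ReplaceAllString(p, "/number/x")
}

func main() {
	fmt.Println(normalizePath("/abc/def/ghi/number/123123/jkl/mno"))
	// prints: /abc/def/ghi/number/x/jkl/mno
}
```

The same pattern generalizes to any high-cardinality path segment: replace the variable part with a fixed placeholder before the line is matched, and the label space stays bounded.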
I see grok_exporter outputs some go_mem* metrics by default, but all of these metrics are already available from node_exporter.
One important point is that some metrics have different help strings between grok_exporter and node_exporter, which causes Prometheus to complain about the difference.
e.g.:
node_exporter: # HELP go_memstats_sys_bytes Number of bytes obtained from system.
grok_exporter: # HELP go_memstats_sys_bytes Number of bytes obtained by system. Sum of all system allocations.
Would you please remove the metrics duplicated with node_exporter, or update the help text strings?
Hi, I'm trying to get grok_exporter to parse a log file created by a node.js app and then send that data to Prometheus. I'm most probably "holding it wrong", because the matched values aren't updated when new lines are written to the log file. When I restart grok_exporter, it sends the last read match to Prometheus and doesn't update with new values.
Example log line:
2017-09-14 14:52 +00:00: The proxy currently has 9 miners connected at 5638 h/s with an average diff of 18800
I've got something along these lines in the config file (so far):
global:
    config_version: 2
input:
    type: stdin
grok:
    patterns_dir: ./patterns
metrics:
    - type: gauge
      name: proxy_miners_count
      help: Number of miners connected to proxy.
      match: '%{TIMESTAMP_ISO8601:date} %{ISO8601_TIMEZONE}: The proxy currently has %{NUMBER:miners:int} miners connected at %{NUMBER:hash:int} h/s with an average diff of %{NUMBER:diff:int}'
      value: '{{.miners}}'
    - type: gauge
      name: proxy_hash_count
      help: Total H/s of miners connected to proxy.
      match: '%{TIMESTAMP_ISO8601:date} %{ISO8601_TIMEZONE}: The proxy currently has %{NUMBER:miners:int} miners connected at %{NUMBER:hash:int} h/s with an average diff of %{NUMBER:diff:int}'
      value: '{{.hash}}'
Is this an error on my part, or is this something not intended for grok_exporter? The log file gets updated every 1-2 seconds, so I'd expect to see this data reflected in Prometheus.
Cheers and thanks for any help
Pedro
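One thing worth checking (a sketch, under the assumption that the log is a growing file on disk): with type: stdin the exporter only sees whatever is piped into it, so it stops receiving data once the pipe goes quiet. Tailing the file directly via the file input may pick up new lines as they are written. The path below is hypothetical:

```yaml
input:
    type: file
    path: /var/log/proxy.log   # hypothetical path to the node.js app's log file
    readall: false             # start at the end of the file and follow new lines
```

With readall set to false, the exporter behaves like tail -f and should update the gauges every time the app writes a new status line.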
Please add the appropriate LICENSE.md file to the root of your project.
Great work.