Comments (37)
I should also mention that restarting syslog-ng does not help. The drift is the same.
from syslog-ng.
And running version 3.5.4.1 on Debian testing with kernel 3.14.1.
from syslog-ng.
Are you using /dev/kmsg as the kernel source?
from syslog-ng.
How would I determine that? There is no mention of kmsg in syslog-ng.conf. The log source is from system() and internal() then filters KERN to kern.log. The Debian change log does mention kmsg:
- Read /proc/kmsg directly again. It's eliminate all the problem around klogd.
But that is dated from 2002.
from syslog-ng.
You are most probably using /dev/kmsg then. The system driver uses that if the requisite support is present in the kernel.
Hmmm.. The timestamp supplied by kmsg is probably CLOCK_MONOTONIC (boot time plus offset). Is your system clock continuously changing (for instance by NTP)?
I'm not sure whether rsyslog uses this timestamp by default.
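To make the "boot time plus offset" idea concrete, here is a minimal sketch of how a wall-clock stamp could be rebuilt from a /dev/kmsg record. It assumes the record header has the form priority,sequence,microseconds-since-boot,flags followed by ";" and the message. The function name, the boot time, the sample record and the skew value are hypothetical, and this is not syslog-ng's actual code.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def parse_kmsg_record(record: str, boot_time: datetime) -> Tuple[datetime, str]:
    """Turn one /dev/kmsg-style record into (reconstructed stamp, message text)."""
    # Assumed header layout: "priority,sequence,usec_since_boot,flags"; text follows ';'.
    header, _, message = record.partition(";")
    usec_since_boot = int(header.split(",")[2])
    # The stamp is rebuilt as boot time plus the kernel's offset since boot.
    return boot_time + timedelta(microseconds=usec_since_boot), message


# Hypothetical boot time and record, for illustration only.
boot = datetime(2015, 4, 18, 2, 0, 0, tzinfo=timezone.utc)
record = "6,1234,1838000000,-;EXT4-fs (sda3): re-mounted"
stamp, text = parse_kmsg_record(record, boot)

# Any amount by which the wall clock has moved relative to the kernel's counter
# (hypothetical 90 s here) shows up one-to-one as an error in the rebuilt stamp.
skew = timedelta(seconds=90)
real_time = stamp + skew
print(stamp.isoformat(), "reconstructed;", real_time.isoformat(), "actual")
```

The point of the sketch is only that the rebuilt stamp can never be better than the agreement between the two clocks.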
from syslog-ng.
I check/update the clock once per hour with ntpdate from a cron job. I do have a VM running, but it's idle almost all of the time and without any load.
from syslog-ng.
Maybe it's the load on the host machine. I've rebooted recently:
uptime; adjtimex -p; adjtimexconfig
21:35:45 up 10:52, 5 users, load average: 0.40, 0.55, 0.50
mode: 0
offset: 0
frequency: 1895862
maxerror: 16000000
esterror: 16000000
status: 64
time_constant: 2
precision: 1
tolerance: 32768000
tick: 9762
raw time: 1398648945s 994651us = 1398648945.994651
return value = 5
Comparing clocks (this will take 70 sec)...done.
Adjusting system time by -2055.21 sec/day to agree with CMOS clock...done.
This was after doing a rebuild of a project that uses 100% CPU for about 2.5-3 hours, although I didn't run the above commands before the build.
from syslog-ng.
If the host clock is drifting relative to CLOCK_MONOTONIC, that explains the drift.
I'm not yet sure how to solve it, though. One way would be to ignore the kernel's timestamp.
Or you could use the R_-prefixed time macros (time of reception) in this case.
from syslog-ng.
Maybe adjust the timestamp for a kernel message on the local machine to the time of reception, but leave it alone for remote connections? Or an option for it.
from syslog-ng.
Or an option to use /proc/kmsg rather than /dev/kmsg if both are available. /proc/kmsg doesn't include CLOCK_MONOTONIC timestamps, so that would have the same effect as a local time-of-reception timestamp.
I suppose I could add a filter to suppress output of /dev/kmsg then add a filter for /proc/kmsg.
from syslog-ng.
The easiest way to use /proc/kmsg would be to not use system(), in my opinion, and just hard-code what it expands to (see the --preprocess-into= command-line option), replacing the /dev/kmsg line with an appropriate /proc/kmsg one.
from syslog-ng.
This issue seems to me very similar (I bet it's the same):
https://bugs.gentoo.org/show_bug.cgi?id=533328
Reverting to syslog-ng-3.4.8 solved it for the Gentoo user.
from syslog-ng.
I have the same issue, also on a Gentoo system. My observations are:
- the problems started when I upgraded from syslog-ng v3.4.8 to v3.6.2
- the 'wrong' timestamp only occurs in lines with "kernel:", not in any other (bluetoothd, su, cron, etc.)
- the problem shows in /var/log/messages and other logs showing the kernel messages
- the time offset is reset to a few seconds after a reboot
- the time offset increases in jumps whenever my laptop sleeps; the jump size is equal to the sleep time
- in other words, after waking up, most system messages have a label that is <sleep time> later, while the "kernel:" lines continue a few seconds "later"
- it seems that the kernel lines print a timestamp based on the 'waking time' since boot
- since my laptop usually sleeps at night and isn't rebooted very often, the current timestamp difference between "kernel:" and other lines amounts to about 14 days
- upgrading the kernel (from 3.16.5 to 3.18.9) didn't change anything
- after downgrading to syslog-ng v3.4.8 and restarting it, the problem went away
- also, /var/log/messages was flushed with ~3000 kernel-message lines at that moment
- after upgrading again to syslog-ng v3.6.2 and restarting it, the problem immediately came back, with the same offset of ~14 days
- "date" and "adjtimex -p" print the same time (no offset)
All this suggests that the issue is caused by a change between syslog-ng v3.4.8 and v3.6.2, perhaps in the way the timestamp is constructed for the kernel messages. If there are any more tests that might be useful, I'd be glad to help if I can.
from syslog-ng.
The problem went away for me. I think it had something to do with adjtimex needing to reset or recalibrate /etc/adjtime; I don't remember exactly. But it has been fixed for a while. If I can remember what changed, I'll let you all know.
from syslog-ng.
I'm running 2 Gentoo boxes (1 server, 1 desktop) and suffer from this issue on my desktop.
The issue started with 3.6.x; 3.4.x was fine. I do not hibernate my desktop, but I do s2ram often.
Here's an example:
May 1 11:20:01 t44 CROND[12761]: (root) CMD ([ ! -x /etc/cron.hourly/0anacron ] && { test -x /usr/sbin/run-crons && /usr/sbin/run-crons ; })
Apr 30 10:15:19 t44 kernel: wlp3s0: AP 08:96:d7:05:f9:2a changed bandwidth, new config is 2462 MHz, width 2 (2452/0 MHz)
Apr 30 10:15:20 t44 kernel: wlp3s0: AP 08:96:d7:05:f9:2a changed bandwidth, new config is 2462 MHz, width 1 (2462/0 MHz)
Apr 30 10:15:22 t44 kernel: wlp3s0: AP 08:96:d7:05:f9:2a changed bandwidth, new config is 2462 MHz, width 2 (2452/0 MHz)
Apr 30 10:20:20 t44 kernel: wlp3s0: AP 08:96:d7:05:f9:2a changed bandwidth, new config is 2462 MHz, width 1 (2462/0 MHz)
May 1 11:30:01 t44 CROND[14380]: (root) CMD ([ ! -x /etc/cron.hourly/0anacron ] && { test -x /usr/sbin/run-crons && /usr/sbin/run-crons ; })
from syslog-ng.
Debian has an /etc/default/adjtimex file to set values to correct drift. My values for this file are:
TICK=10000
FREQ=2712500
which is set during startup with /etc/init.d/adjtimex that does:
/sbin/adjtimex -tick "$TICK" -frequency "$FREQ"
Debian ships adjtimexconfig to help set those two variables, which are used together with /etc/adjtime. I also have a cron job that calls ntpdate hourly. The drift is no more than 1/10 of a second. Hope this helps.
from syslog-ng.
Here's an example of 5 consecutive lines from my /var/log/messages, written on May first:
Apr 18 02:30:38 think kernel: e1000e 0000:00:19.0 eth0: Error reading PHY register
May 1 09:32:45 think dbus[3960]: [system] Activating service name=(...)
Apr 18 02:30:39 think kernel: EXT4-fs (sda3): re-mounted. Opts: data=ordered,commit=600
May 1 09:32:45 think init[1]: Switching to runlevel: 5
Apr 18 02:30:39 think kernel: e1000e 0000:00:19.0 eth0: Error reading PHY register
All "kernel:" lines give the wrong timestamp. You can see that the time jumps by about 13 days and 7 hours, then -13d7h, +13d7h and -13d7h in the course of ~1-2s, when using syslog-ng v3.6.2. This is not a drift, as seems to be the case for @benkibbey. 13d7h is consistent with the amount of time my laptop spent 'sleeping' (s2ram) since the last boot. For v3.4.8 (keeping the rest of my system the same), such behaviour doesn't happen, which suggests timestamps for kernel messages are determined differently between the two versions. As mentioned, adjtimex and date agree on the time on my system.
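The size of the jump can be checked directly from the first two log lines quoted above. The sketch below assumes the year 2015 (syslog lines carry no year) and uses a hypothetical helper name.

```python
from datetime import datetime


def parse_syslog_stamp(line: str, year: int) -> datetime:
    """Parse the leading 'Mon D HH:MM:SS' of a syslog line; the year is supplied by the caller."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


kernel_line = "Apr 18 02:30:38 think kernel: e1000e 0000:00:19.0 eth0: Error reading PHY register"
other_line = "May 1 09:32:45 think dbus[3960]: [system] Activating service name=(...)"

# Offset between a non-kernel line and the kernel line written at about the same moment.
offset = parse_syslog_stamp(other_line, 2015) - parse_syslog_stamp(kernel_line, 2015)
print(offset)  # prints: 13 days, 7:02:07
```

That is the "about 13 days and 7 hours" mentioned above, and it stays the same from one kernel line to the next, which is what distinguishes it from a gradual drift.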
from syslog-ng.
FWIW, I see the following on my laptop:
$ dmesg -T | tail
[Wed May 6 05:48:21 2015] thermal LNXTHERM:00: registered as thermal_zone0
[Wed May 6 05:48:21 2015] ACPI: Thermal Zone [TZ00] (28 C)
[Wed May 6 05:48:21 2015] thermal LNXTHERM:01: registered as thermal_zone1
[Wed May 6 05:48:21 2015] ACPI: Thermal Zone [TZ01] (30 C)
But the date is Thu May 7 12:32:32 CEST 2015. Non-kernel logs are fine, and in /var/log/messages, the above messages appear with the correct date:
May 7 12:32:28 eowyn kernel: thermal LNXTHERM:00: registered as thermal_zone0
May 7 12:32:28 eowyn kernel: ACPI: Thermal Zone [TZ00] (28 C)
May 7 12:32:28 eowyn kernel: thermal LNXTHERM:01: registered as thermal_zone1
May 7 12:32:28 eowyn kernel: ACPI: Thermal Zone [TZ01] (30 C)
This is using 3.6.1, but I don't see anything obvious in the 3.6.2 changes that would change the behaviour.
@AstroFloyd, @benkibbey does your dmesg -T agree with what's written to /var/log/messages? Also, are you using system()? Do you get your kernel logs from /proc/kmsg? If using system() and a recent kernel, kernel logs will be read from /dev/kmsg, and the time parser there may be wrong.
If using system(), does changing it to use /proc/kmsg instead of /dev/kmsg solve the issue? (To figure out what system() expands to, you can use the system-expand tool in modules/system-source/.)
What I suspect is that we're reading /dev/kmsg, which has a timestamp, and that is not updated after SUSPEND/RESUME. With /proc/kmsg, we don't have a timestamp at all, as far as I remember, so the receive time will be used, which is closer to being correct than the /dev/kmsg timestamp after resuming. Perhaps we need a way to ignore /dev/kmsg timestamps, and do so by default?
from syslog-ng.
I've checked the difference between 3.4.8 and 3.6.2. In 3.4.8 we use only /proc/kmsg. In 3.6.2 we first try to open /dev/kmsg and, if that fails, we fall back to /proc/kmsg. This happens on Linux machines.
Could somebody test a patch if I make one?
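The difference between the two versions described above can be sketched as a simple fallback over candidate paths. This is an illustration only: the table, the function names and the readability check are assumptions, not the logic in system-source.c.

```python
import os
from typing import Callable, Optional

# Candidate order per version, as described in the comment above.
CANDIDATES = {
    "3.4.8": ("/proc/kmsg",),
    "3.6.2": ("/dev/kmsg", "/proc/kmsg"),
}


def readable(path: str) -> bool:
    """Default check: can the current process read this path?"""
    return os.access(path, os.R_OK)


def pick_kernel_source(version: str,
                       can_open: Callable[[str], bool] = readable) -> Optional[str]:
    """Return the first candidate that can be opened, or None if none can."""
    for path in CANDIDATES[version]:
        if can_open(path):
            return path
    return None


# Simulated systems: one where both devices exist, one with only /proc/kmsg.
both = lambda path: True
only_proc = lambda path: path == "/proc/kmsg"

print(pick_kernel_source("3.4.8", both))       # /proc/kmsg
print(pick_kernel_source("3.6.2", both))       # /dev/kmsg
print(pick_kernel_source("3.6.2", only_proc))  # /proc/kmsg
```

On a kernel that provides /dev/kmsg, the same configuration therefore switches to the timestamped source simply by upgrading.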
from syslog-ng.
Why not - my Gentoo Linux here makes it easy to apply patches.
(FWIW, in the meantime I switched my Gentoo Linux desktop back to 3.4.8 because 3.6.2 was no longer usable, but I would give the patch a try.)
from syslog-ng.
@toralf: Before we go for that (/proc/kmsg can be read safely only by one consumer) could you try something?
Now, we read messages from /dev/kmsg. It's also used by dmesg, which says in its manual (Ubuntu Trusty):
The time source used for the logs is not updated after system
SUSPEND/RESUME.
The time drift is also affected by BIOS time. You may check your BIOS time with the following command (as root):
hwclock --show
To print out the system time, execute date.
The BIOS time can be synchronized with the system time with the following command (as root):
hwclock --systohc --utc
Could you synchronize your BIOS time to your system time with the above command? It should mitigate the problem. Please also paste the output of hwclock --show and date into a comment.
from syslog-ng.
FWIW, see my analysis above. The straightforward solution is to not use the date parts of /dev/kmsg, but add our own.
Keep in mind that restarting syslog-ng will re-read /dev/kmsg, as far as I remember, because we don't empty the buffer, nor do we track where we left off. This may have been fixed without me noticing, but something to keep in mind.
from syslog-ng.
@algernon I overlooked that, thanks.
from syslog-ng.
@toralf Could you test the fix? You can find the commit below this comment.
from syslog-ng.
I'll try to test it.
FWIW:
$ sudo hwclock --show; date
Wed 27 May 2015 02:49:58 PM CEST -0.734753 seconds
Wed May 27 14:49:58 CEST 2015
BTW, I didn't find an easy way to download the patch directly from the given link. Did I overlook the download button?
from syslog-ng.
Just a moment, I'll paste it here.
from syslog-ng.
From 75dc0c6d4fffeac7a7b0461a6cdc6edd7a62c22c Mon Sep 17 00:00:00 2001
From: Benke Tibor <[email protected]>
Date: Wed, 27 May 2015 13:18:53 +0200
Subject: [PATCH] system: use keep-timestamp(no) in case of Linux kernel log
 messages

Kernel messages read from /dev/kmsg may not have accurate timestamps.
The time source used for the logs is not updated after system
SUSPEND/RESUME. With this patch we ignore the original (and possibly
inaccurate) timestamp and use the time of reception.

Fixes #121

Signed-off-by: Tibor Benke <[email protected]>
---
 modules/system-source/system-source.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/modules/system-source/system-source.c b/modules/system-source/system-source.c
index 7228dd4..9e3de12 100644
--- a/modules/system-source/system-source.c
+++ b/modules/system-source/system-source.c
@@ -76,7 +76,8 @@ system_sysblock_add_unix_dgram(GString *sysblock, const gchar *path,
 static void
 system_sysblock_add_file(GString *sysblock, const gchar *path,
                          gint follow_freq, const gchar *prg_override,
-                         const gchar *flags, const gchar *format)
+                         const gchar *flags, const gchar *format,
+                         gboolean ignore_timestamp)
 {
   g_string_append_printf(sysblock, "file(\"%s\"", path);
   if (follow_freq >= 0)
@@ -87,6 +88,8 @@ system_sysblock_add_file(GString *sysblock, const gchar *path,
     g_string_append_printf(sysblock, " flags(%s)", flags);
   if (format)
     g_string_append_printf(sysblock, " format(%s)", format);
+  if (ignore_timestamp)
+    g_string_append_printf(sysblock, " keep-timestamp(no)");
   g_string_append(sysblock, ");\n");
 }
 
@@ -145,9 +148,9 @@ system_sysblock_add_freebsd_klog(GString *sysblock, const gchar *release)
   if (strncmp(release, "7.", 2) == 0 ||
       strncmp(release, "8.", 2) == 0 ||
       strncmp(release, "9.0", 3) == 0)
-    system_sysblock_add_file(sysblock, "/dev/klog", 1, "kernel", "no-parse", NULL);
+    system_sysblock_add_file(sysblock, "/dev/klog", 1, "kernel", "no-parse", NULL, FALSE);
   else
-    system_sysblock_add_file(sysblock, "/dev/klog", 0, "kernel", "no-parse", NULL);
+    system_sysblock_add_file(sysblock, "/dev/klog", 0, "kernel", "no-parse", NULL, FALSE);
 }
 
 static gboolean
@@ -200,7 +203,7 @@ system_sysblock_add_linux_kmsg(GString *sysblock)
     }
   else
     system_sysblock_add_file(sysblock, kmsg, -1,
-                             "kernel", "kernel", format);
+                             "kernel", "kernel", format, TRUE);
 }
 
 static gboolean
--
1.9.1
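For readers who do not want to trace the C, here is a rough Python rendering of what the patched helper appends, limited to the options that are visible in the diff (flags, format, keep-timestamp). The follow-freq and program-override handling is not shown in the hunks, so it is left out here. The function name and the format value passed in the example are assumptions.

```python
from typing import Optional


def sysblock_file_source(path: str,
                         flags: Optional[str] = None,
                         fmt: Optional[str] = None,
                         ignore_timestamp: bool = False) -> str:
    """Build a file() source line using only the options visible in the diff."""
    parts = [f'file("{path}"']
    if flags is not None:
        parts.append(f" flags({flags})")
    if fmt is not None:
        parts.append(f" format({fmt})")
    # The new parameter: drop the timestamp found in the message and
    # stamp the message with the time of reception instead.
    if ignore_timestamp:
        parts.append(" keep-timestamp(no)")
    parts.append(");")
    return "".join(parts)


# The format value below is an assumed placeholder, not taken from the patch.
print(sysblock_file_source("/dev/kmsg", flags="kernel", fmt="linux-kmsg",
                           ignore_timestamp=True))
# The FreeBSD call sites pass FALSE, so their output is unchanged.
print(sysblock_file_source("/dev/klog", flags="no-parse"))
```

Only the Linux kmsg call site passes TRUE, so the change is scoped to the source that carries the suspect timestamps.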
from syslog-ng.
@toralf: Append .patch to the URL, and it becomes easily downloadable.
from syslog-ng.
@ihrwein: at first glance that seems to work; I will observe it over the next few days.
@algernon: cool hint. Somebody should tell the GitHub devs to code that into a nifty button.
from syslog-ng.
@toralf Did it work in the end? If it works, please close this issue.
from syslog-ng.
It has worked fine here since I applied the patch, but I cannot close this issue: I'm not the reporter, I'm just the Gentoo Linux guy who cried loudly about this.
from syslog-ng.
@toralf You were the tester, thanks for that :)
@benkibbey If the patch works for you, please close the issue.
from syslog-ng.
Still (or again) a problem with 3.7.2: http://www.zwiebeltoralf.de/pub/syslog-ng-3.7.2-mess.txt
On this stable hardened 64-bit Gentoo, 3.6.4 was fine AFAICT.
from syslog-ng.
@toralf The upcoming 3.7.3 release will contain the patch. Thanks for reporting! (3.8 will also have it).
Update: a workaround is to manually expand your system() source and use keep-timestamp(no) for /dev/kmsg.
from syslog-ng.
Yep, with the configuration option "keep-timestamp(no)", version 3.7.2 seems to run flawlessly.
Will that be needed for 3.7.3 too?
from syslog-ng.
@toralf 3.7.3's system() source will use keep-timestamp(no), so you won't need any manual tweaking.
from syslog-ng.