GithubHelp home page GithubHelp logo

Comments (37)

benkibbey avatar benkibbey commented on July 20, 2024

I should also mention that restarting syslog-ng does not help. The drift is the same.

from syslog-ng.

benkibbey avatar benkibbey commented on July 20, 2024

And running version 3.5.4.1 on Debian testing with kernel 3.14.1.

from syslog-ng.

bazsi avatar bazsi commented on July 20, 2024

Are you using /dev/kmsg as the kernel source?

from syslog-ng.

benkibbey avatar benkibbey commented on July 20, 2024

How would I determine that? There is no mention of kmsg in syslog-ng.conf. The log source is from system() and internal() then filters KERN to kern.log. The Debian change log does mention kmsg:

  • Read /proc/kmsg directly again. It's eliminate all the problem around klogd.

But that is dated from 2002.

from syslog-ng.

bazsi avatar bazsi commented on July 20, 2024

You are most probably using /dev/kmsg then. The system driver uses that if the requisite support is present in the kernel.

Hmmm.. The timestamp supplied by kmsg is probably CLOCK_MONOTONIC (boot time plus offset). Is your system clock continuously changing (for instance by NTP)?

I'm not sure rsyslog uses this timestamp by default or not.

from syslog-ng.

bazsi avatar bazsi commented on July 20, 2024

You are most probably using /dev/kmsg then. The system driver uses that if the requisite support is present in the kernel.

Hmmm.. The timestamp supplied by kmsg is probably CLOCK_MONOTONIC (boot time plus offset). Is your system clock continuously changing (for instance by NTP)?

I'm not sure rsyslog uses this timestamp by default or not.

from syslog-ng.

benkibbey avatar benkibbey commented on July 20, 2024

Check/update the clock via a cron job once per hour via ntpdate. I do have a VM running but its idle most all of the time and without any load.

from syslog-ng.

benkibbey avatar benkibbey commented on July 20, 2024

Maybe it's the load on the host machine. I've rebooted recently:

uptime; adjtimex -p; adjtimexconfig

21:35:45 up 10:52, 5 users, load average: 0.40, 0.55, 0.50
mode: 0
offset: 0
frequency: 1895862
maxerror: 16000000
esterror: 16000000
status: 64
time_constant: 2
precision: 1
tolerance: 32768000
tick: 9762
raw time: 1398648945s 994651us = 1398648945.994651
return value = 5
Comparing clocks (this will take 70 sec)...done.
Adjusting system time by -2055.21 sec/day to agree with CMOS clock...done.

This was after doing a rebuild of a project that uses 100% CPU for about 2.5-3 hours, although I didn't run the above commands before the build.

from syslog-ng.

bazsi avatar bazsi commented on July 20, 2024

If the host clock is drifting from clock monotonic, that explains the reason for drifting.

I'm not yet sure how to solve it though. One way would be to ignore the timestamp of the kernel.

Or if you could use the R_ prefixed time macro in this case.

from syslog-ng.

benkibbey avatar benkibbey commented on July 20, 2024

Maybe adjust the timestamp for a kernel message on the local machine to the time of reception, but leave it alone for remote connections? Or an option for it.

from syslog-ng.

benkibbey avatar benkibbey commented on July 20, 2024

Or an option to use /proc/kmsg rather than /dev/kmsg if both are available. /proc/kmsg doesn't include CLOCK_MONOTONIC timestamps so that would have the same effect as a local time of receptiion timestamp.
I suppose I could add a filter to suppress output of /dev/kmsg then add a filter for /proc/kmsg.

from syslog-ng.

algernon avatar algernon commented on July 20, 2024

The easiest way to use /proc/kmsg would be to not use system(), in my opinion, and just hard-code what it expands to (see the --preprocess-into= command-line option), replacing the /dev/kmsg line with an appropriate /proc/kmsg one.

from syslog-ng.

ihrwein avatar ihrwein commented on July 20, 2024

This issue seems to me very similar (I bet it's the same):

https://bugs.gentoo.org/show_bug.cgi?id=533328

Reverting to syslog-ng-3.4.8 solved it for the gentoo user.

from syslog-ng.

AstroFloyd avatar AstroFloyd commented on July 20, 2024

I have the same issue, also on a Gentoo system. My observations are:

  • the problems started when I upgraded from syslog-ng v3.4.8 to v3.6.2
  • the 'wrong' timestamp only occurs in lines with "kernel:", not in any other (bluetoothd, su, cron, etc.)
  • the problem shows in /var/log/messages and other logs showing the kernel messages
  • the time offset is reset to a few seconds after a reboot
  • the time offset increases in jumps whenever my laptop sleeps; the jump size is equal to the sleep time
    • in other words, after waking up, most system messages have a label that is <sleep time> later, while the "kernel:" lines continue a few seconds "later"
    • it seems that the kernel lines print a timestamp based on the 'waking time' since boot
    • since my laptop usually sleeps at night and isn't rebooted very often, the current timestamp difference between "kernel:" and other lines amounts to about 14 days
  • upgrading the kernel (from 3.16.5 to 3.18.9) didn't change anything
  • after downgrading to syslog-ng v3.4.8 and restarting it, the problem went away
    • also, /var/log/messages was flushed with ~3000 kernel-message lines at that moment
  • after upgrading again to syslog-ng v3.6.2 and restarting it, the problem immediately came back, with the same offset of ~14 days
  • "date" and "adjtimex -p" print the same time (no offset)

All this suggests that the issue is caused by a change between syslog-ng v3.4.8 and v3.6.2, perhaps in the way the timestamp is constructed for the kernel messages. If there are any more tests that might be useful, I'd be glad to help if I can.

from syslog-ng.

benkibbey avatar benkibbey commented on July 20, 2024

The problem went away for me. I think it had something to do with adjtimex needing to reset or recalibrate /etc/adjtime, I dont remember exactly. But it has fixed itself for a while. If I can remember what changed I'll let you all know.

from syslog-ng.

toralf avatar toralf commented on July 20, 2024

I' running 2 Gentoo boxes (1 server, 1 desktop) and do suffer from this issue on my desktop.

The issue started with 3.6.x, 3.4.x was fine. I do not hibernate my desktop, but I do s2ram often.

Here's an example :
May 1 11:20:01 t44 CROND[12761]: (root) CMD ([ ! -x /etc/cron.hourly/0anacron ] && { test -x /usr/sbin/run-crons && /usr/sbin/run-crons ; })
Apr 30 10:15:19 t44 kernel: wlp3s0: AP 08:96:d7:05:f9:2a changed bandwidth, new config is 2462 MHz, width 2 (2452/0 MHz)
Apr 30 10:15:20 t44 kernel: wlp3s0: AP 08:96:d7:05:f9:2a changed bandwidth, new config is 2462 MHz, width 1 (2462/0 MHz)
Apr 30 10:15:22 t44 kernel: wlp3s0: AP 08:96:d7:05:f9:2a changed bandwidth, new config is 2462 MHz, width 2 (2452/0 MHz)
Apr 30 10:20:20 t44 kernel: wlp3s0: AP 08:96:d7:05:f9:2a changed bandwidth, new config is 2462 MHz, width 1 (2462/0 MHz)
May 1 11:30:01 t44 CROND[14380]: (root) CMD ([ ! -x /etc/cron.hourly/0anacron ] && { test -x /usr/sbin/run-crons && /usr/sbin/run-crons ; })

from syslog-ng.

benkibbey avatar benkibbey commented on July 20, 2024

Debian has an /etc/default/adjtimex file to set values to correct drift. My values for this file are:

TICK=10000
FREQ=2712500

which is set during startup with /etc/init.d/adjtimex that does:

/sbin/adjtimex -tick "$TICK" -frequency "$FREQ"

Debian ships with adjtimexconfig to help set those two variables which is used with /etc/adjtime. Then I have a cronjob to call ntpdate hourly. The drift is not more than 1/10 of a second. Hope this helps.

from syslog-ng.

AstroFloyd avatar AstroFloyd commented on July 20, 2024

Here's an example of 5 consecutive lines from my /var/log/messages, written on May first:

Apr 18 02:30:38 think kernel: e1000e 0000:00:19.0 eth0: Error reading PHY register
May 1 09:32:45 think dbus[3960]: [system] Activating service name=(...)
Apr 18 02:30:39 think kernel: EXT4-fs (sda3): re-mounted. Opts: data=ordered,commit=600
May 1 09:32:45 think init[1]: Switching to runlevel: 5
Apr 18 02:30:39 think kernel: e1000e 0000:00:19.0 eth0: Error reading PHY register

All "kernel:" lines give the wrong timestamp. You can see that the time jumps by about 13 days and 7 hours, then -13d7h, +13d7h and -13d7h in the course of ~1-2s, when using syslog-ng v3.6.2. This is not a drift, as seems to be the case for @benkibbey. 13d7h is consistent with the amount of time my laptop spent 'sleeping' (s2ram) since the last boot. For v3.4.8 (keeping the rest of my system the same), such behaviour doesn't happen, which suggests timestamps for kernel messages are determined differently between the two versions. As mentioned, adjtimex and date agree on the time on my system.

from syslog-ng.

algernon avatar algernon commented on July 20, 2024

FWIW, I see the following on my laptop:

$ dmesg -T | tail
[Wed May  6 05:48:21 2015] thermal LNXTHERM:00: registered as thermal_zone0
[Wed May  6 05:48:21 2015] ACPI: Thermal Zone [TZ00] (28 C)
[Wed May  6 05:48:21 2015] thermal LNXTHERM:01: registered as thermal_zone1
[Wed May  6 05:48:21 2015] ACPI: Thermal Zone [TZ01] (30 C)

But the date is Thu May 7 12:32:32 CEST 2015. Non-kernel logs are fine, and in /var/log/messages, the above messages appear with correct date:

May  7 12:32:28 eowyn kernel: thermal LNXTHERM:00: registered as thermal_zone0
May  7 12:32:28 eowyn kernel: ACPI: Thermal Zone [TZ00] (28 C)
May  7 12:32:28 eowyn kernel: thermal LNXTHERM:01: registered as thermal_zone1
May  7 12:32:28 eowyn kernel: ACPI: Thermal Zone [TZ01] (30 C)

This is using 3.6.1, but I don't see anything obvious in the 3.6.2 changes that would change the behaviour.

@AstroFloyd, @benkibbey does your dmesg -T agree with what's written to /var/log/messages? Also, are you using system()? Do you get your kernel logs from /proc/kmsg? If using system() and a recent kernel, kernel logs will be read from /dev/log, and the time parser there may be wrong.

If using system(), does changing it to to use /proc/kmsg instead of /dev/kmsg solve the issue? (To figure out what system() expands to, you can use the system-expand tool in modules/system-source/)

What I suspect is that we're reading /dev/kmsg, which has a timestamp, and that is not updated after SUSPEND/RESUME. With /proc/kmsg, we don't have a timestamp at all, as far as I remember, so the receive time will be used, which is closer to being correct than the /dev/kmsg timestamp after resuming. Perhaps we need a way to ignore /dev/kmsg timestamps, and do so by default?

from syslog-ng.

ihrwein avatar ihrwein commented on July 20, 2024

I've checked the difference between 3.4.8 and 3.6.2. In 3.4.8 we use only /proc/kmsg. In 3.6.2 we first try to open /dev/kmsg and if that fails we move to use /proc/kmsg. This happens on Linux machines.

Could somebody test a patch if I make one?

from syslog-ng.

toralf avatar toralf commented on July 20, 2024

Why not - my Gentoo Linux here makes it easy to apply patches.
(just FWIW I switched back in the mean while my Gentoo Linux desktop to 3.4.8 b/c 3.6.2 was no longer useful, but would give the patch a try)

from syslog-ng.

ihrwein avatar ihrwein commented on July 20, 2024

@toralf: Before we go for that (/proc/kmsg can be read safely only by one consumer) could you try something?

Now, we read messages from /dev/kmsg. It's also used by dmesg which says in its manual (Ubuntu Trusty):

The  time  source  used for the logs is not updated after system
              SUSPEND/RESUME.

The time drift is also affected by BIOS time. You may check your BIOS time with the following command (as root):

hwclock --show

To print out the system time, execute date.

The BIOS time can be synchronized with the system time with the following command (as root):

hwclock --systohc --utc

Could you synchronize your BIOS time to your system time with the above commands? It should mitigate the problem. Please, also paste the output dates into a comment.

from syslog-ng.

algernon avatar algernon commented on July 20, 2024

FWIW, see my analysis above. The straightforward solution is to not use the date parts of /dev/kmsg, but add our own.

Keep in mind that restarting syslog-ng will re-read /dev/kmsg, as far as I remember, because we don't empty the buffer, nor do we track where we left of. This may have been fixed without me noticing, but something to keep in mind.

from syslog-ng.

ihrwein avatar ihrwein commented on July 20, 2024

@algernon I overlooked that, thanks.

from syslog-ng.

ihrwein avatar ihrwein commented on July 20, 2024

@toralf Could you test the fix? You can find the commit below this comment.

from syslog-ng.

toralf avatar toralf commented on July 20, 2024

I''' try to test it.
FWIW :

$ sudo hwclock --show; date
Wed 27 May 2015 02:49:58 PM CEST -0.734753 seconds
Wed May 27 14:49:58 CEST 2015

BTW - I didn't found an easy way to download the patch directly from the given link - do I overlooked the download button ?

from syslog-ng.

ihrwein avatar ihrwein commented on July 20, 2024

Just a moment, I'll paste it here.

from syslog-ng.

ihrwein avatar ihrwein commented on July 20, 2024
From 75dc0c6d4fffeac7a7b0461a6cdc6edd7a62c22c Mon Sep 17 00:00:00 2001
From: Benke Tibor <[email protected]>
Date: Wed, 27 May 2015 13:18:53 +0200
Subject: [PATCH] system: use keep-timestamp(no) in case of Linux kernel log
 messages

Kernel messages read from /dev/kmsg may not have accurate timestamps.
The  time  source  used for the logs is not updated after system
 SUSPEND/RESUME. With this patch we ignore the original (and possibly
 inaccurate) timestamp and use the time of reception.

Fixes #121

Signed-off-by: Tibor Benke <[email protected]>
---
 modules/system-source/system-source.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/modules/system-source/system-source.c b/modules/system-source/system-source.c
index 7228dd4..9e3de12 100644
--- a/modules/system-source/system-source.c
+++ b/modules/system-source/system-source.c
@@ -76,7 +76,8 @@ system_sysblock_add_unix_dgram(GString *sysblock, const gchar *path,
 static void
 system_sysblock_add_file(GString *sysblock, const gchar *path,
                          gint follow_freq, const gchar *prg_override,
-                         const gchar *flags, const gchar *format)
+                         const gchar *flags, const gchar *format,
+                         gboolean ignore_timestamp)
 {
   g_string_append_printf(sysblock, "file(\"%s\"", path);
   if (follow_freq >= 0)
@@ -87,6 +88,8 @@ system_sysblock_add_file(GString *sysblock, const gchar *path,
     g_string_append_printf(sysblock, " flags(%s)", flags);
   if (format)
     g_string_append_printf(sysblock, " format(%s)", format);
+  if (ignore_timestamp)
+    g_string_append_printf(sysblock, " keep-timestamp(no)");
   g_string_append(sysblock, ");\n");
 }

@@ -145,9 +148,9 @@ system_sysblock_add_freebsd_klog(GString *sysblock, const gchar *release)
   if (strncmp(release, "7.", 2) == 0 ||
       strncmp(release, "8.", 2) == 0 ||
       strncmp(release, "9.0", 3) == 0)
-    system_sysblock_add_file(sysblock, "/dev/klog", 1, "kernel", "no-parse", NULL);
+    system_sysblock_add_file(sysblock, "/dev/klog", 1, "kernel", "no-parse", NULL, FALSE);
   else
-    system_sysblock_add_file(sysblock, "/dev/klog", 0, "kernel", "no-parse", NULL);
+    system_sysblock_add_file(sysblock, "/dev/klog", 0, "kernel", "no-parse", NULL, FALSE);
 }

 static gboolean
@@ -200,7 +203,7 @@ system_sysblock_add_linux_kmsg(GString *sysblock)
     }
   else
     system_sysblock_add_file(sysblock, kmsg, -1,
-                             "kernel", "kernel", format);
+                             "kernel", "kernel", format, TRUE);
 }

 static gboolean
-- 
1.9.1

from syslog-ng.

algernon avatar algernon commented on July 20, 2024

@toralf: Append .patch to the URL, and it becomes easily downloadable.

from syslog-ng.

toralf avatar toralf commented on July 20, 2024

@ihrwein : at a first glance that seems to work, will observe it over the next few days

@algernon: cool hint - somebody should tell github devs to code that into a nifty bottun.

from syslog-ng.

ihrwein avatar ihrwein commented on July 20, 2024

@toralf Did it work finally? If it works, please close this issue.

from syslog-ng.

toralf avatar toralf commented on July 20, 2024

It works here fine since I applied the patch - but I cannot close this issue, I'm not the issuer, I'm just the Gentoo Linux guy, who cried loud about this.

from syslog-ng.

ihrwein avatar ihrwein commented on July 20, 2024

@toralf You were the tester, thanks for that :)
@benkibbey If the patch works for you, please close the issue.

from syslog-ng.

toralf avatar toralf commented on July 20, 2024

still - or again - a problem with 3.7.2 : http://www.zwiebeltoralf.de/pub/syslog-ng-3.7.2-mess.txt
at this stable hardened 64 Bit Gentoo, 3.6.4 was fine AFAICT

from syslog-ng.

ihrwein avatar ihrwein commented on July 20, 2024

@toralf The upcoming 3.7.3 release will contain the patch. Thanks for reporting! (3.8 will also have it).

update: a workaround is to expand manually your system() source and use keep-timestamp(no) in case of /dev/kmsg.

from syslog-ng.

toralf avatar toralf commented on July 20, 2024

yep - with confiure option "keep-timestamp(no)" the version 3.7.2 seems to run flawlessly.
Would that for 3.7.3 be needed too ?

from syslog-ng.

ihrwein avatar ihrwein commented on July 20, 2024

@toralf 3.7.3's system() source will use the keep-timestamp(no) so you won't need any manual tweaking.

from syslog-ng.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.