Sometimes it looks like the tail end of the output from a container is being truncated

stdout sometimes seems to be truncated about agent HOT 18 CLOSED

kata-containers commented on June 10, 2024

stdout sometimes seems to be truncated

from agent.

Comments (18)

grahamwhaley commented on June 10, 2024

/cc @sboeuf @chavafg

sboeuf commented on June 10, 2024

@grahamwhaley I'll take a look.

grahamwhaley commented on June 10, 2024

Hi @sboeuf - I see kata-containers/shim#42, and I think that fixes the ordering issue (but I've not scripted/tested it thoroughly...). But, with that in place, I believe I still see truncation. I 'bashed' (pun intended) this up to check it:

#!/bin/bash

set -x
set -e

#phrase="Linux"
phrase="Freeing unused kernel memory"
#RUNTIME=cc-runtime
RUNTIME=kata-runtime

for i in $(seq 1 20); do
        # Check to see we get the full log
        # Fail due to the shell -e if we do not find the expected line
        echo "Check $i"
        docker run --rm -ti --runtime="$RUNTIME" ubuntu dmesg | grep -F "${phrase}"
done

echo "Done OK"

And that works for cc-runtime, and works with the phrase Linux for kata, but not with the phrase Freeing unused kernel memory, I believe because that phrase occurs much later in the dmesg.
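
To see roughly where the output stops, one option is to test several marker phrases against a saved capture instead of a single phrase. This is only a sketch and not part of the original test: `report_markers` is a hypothetical helper name, and the capture file is assumed to have been saved beforehand.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): report which marker phrases
# appear in a saved capture of the container output.
# Usage: report_markers FILE PHRASE...
# Returns non-zero if any phrase is missing.
report_markers() {
    local file="$1"
    shift
    local missing=0
    local phrase
    for phrase in "$@"; do
        # -F matches the phrase literally, -q suppresses the matching line
        if grep -qF -- "$phrase" "$file"; then
            echo "found: $phrase"
        else
            echo "missing: $phrase"
            missing=1
        fi
    done
    return "$missing"
}
```

Assuming a capture saved as `capture.txt` (an illustrative file name), `report_markers capture.txt "Linux" "Freeing unused kernel memory"` would show an early marker as found and a late one as missing when the tail has been cut off.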

sboeuf commented on June 10, 2024

@grahamwhaley interesting, so this means we get the truncation because of a buffer overflow somewhere.

sboeuf commented on June 10, 2024

@grahamwhaley I have spent some time trying to reproduce the truncation on my local machine, using your script and some manual commands. Unfortunately, I cannot trigger it...
Please make sure you have everything up to date with all the recent patches. Honestly, it's very hard for me to debug this issue if I cannot reproduce it.

grahamwhaley commented on June 10, 2024

@sboeuf - sure, np. I've updated my kata components, but I still see the issue. I'll dump my component details first, and then a further example test. The proxy/shim/image (agent) are from the kata repos. The runtime is from the cc/runtime repo with an 'install-kata-system'. All from HEAD. Oh, note, I do not have debug enabled in the kata config, as I was doing metrics tests and wanted to remove any extra latency. Maybe that makes a difference? From kata-runtime kata-env, I have:

$ kata-runtime kata-env
[Meta]
  Version = "1.0.6"

[Runtime]
  Debug = false
  [Runtime.Version]
    Semver = "3.0.15"
    Commit = "d948ce756754f726ab18ae40a7ad18c1ee815830"
    OCI = "1.0.1"
  [Runtime.Config]
    Path = "/usr/share/defaults/kata-containers/configuration.toml"

[Hypervisor]
  MachineType = "pc"
  Version = "QEMU emulator version 2.7.0, Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers"
  Path = "/usr/bin/qemu-lite-system-x86_64"
  Debug = false

[Image]
  Path = "/usr/share/kata-containers/kata-containers-2018-01-31-10:54:27.844271593+0000-1c504c7"

[Kernel]
  Path = "/usr/share/clear-containers/vmlinuz-4.9.60-82.container"
  Parameters = ""

[Proxy]
  Type = "kataProxy"
  Version = "kata-proxy version 0.0.1-8a92752c1338a42c043bd6bab496be01ae6140fd"
  Path = "/usr/libexec/kata-containers/kata-proxy"
  Debug = false

[Shim]
  Type = "kataShim"
  Version = "kata-shim version 0.0.1-8908929827acba53dd2ceb4b220f418b73ce3dee"
  Path = "/usr/libexec/kata-containers/kata-shim"
  Debug = false

[Agent]
  Type = "kata"
  Version = "<<unknown>>"

[Host]
  Kernel = "4.4.0-104-generic"
  CCCapable = true
  [Host.Distro]
    Name = "Ubuntu"
    Version = "16.04"
  [Host.CPU]
    Vendor = "GenuineIntel"
    Model = "Intel(R) Core(TM) i5-6260U CPU @ 1.80GHz"
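
When comparing setups across machines, it can help to reduce output like the above to one line per component. A minimal sketch, assuming the `kata-env` layout shown here (bracketed section headers, `Version` and `Semver` keys); `component_versions` is a hypothetical helper name, not a kata tool.

```shell
#!/bin/bash
# Hypothetical helper (not a kata tool): list "section: version" pairs
# from `kata-runtime kata-env` style output read on stdin.
component_versions() {
    awk '
        # Section headers such as [Proxy] or an indented [Runtime.Version]
        /^[ \t]*\[[^]]+\][ \t]*$/ {
            section = $0
            sub(/^[ \t]*\[/, "", section)
            sub(/\][ \t]*$/, "", section)
            next
        }
        # Keys that carry a version string in the layout assumed here
        $1 == "Version" || $1 == "Semver" {
            value = $0
            sub(/^[^=]*=[ \t]*/, "", value)   # drop everything up to "= "
            gsub(/"/, "", value)              # drop the surrounding quotes
            print section ": " value
        }
    '
}
```

Usage would be along the lines of `kata-runtime kata-env | component_versions`, with the results from two machines then compared side by side.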

Let's simplify the test slightly. If I use:

#!/bin/bash

set -e

#RUNTIME=cc-runtime
RUNTIME=kata-runtime

for i in $(seq 1 20); do
        echo -n "Check $i :"
        docker run --rm -ti --runtime="$RUNTIME" ubuntu dmesg | wc
done

then for cc-runtime I get:

Check 2 :   1433   11093  109530
Check 3 :   1433   11104  109619
Check 4 :   1433   11091  109518
Check 5 :   1433   11097  109582
Check 6 :   1433   11101  109596
Check 7 :   1433   11104  109621
Check 8 :   1433   11105  109625
Check 9 :   1433   11094  109543
Check 10 :   1433   11099  109579
Check 11 :   1433   11098  109579
Check 12 :   1433   11094  109540
Check 13 :   1433   11099  109589
Check 14 :   1433   11090  109514
Check 15 :   1433   11092  109519
Check 16 :   1433   11102  109603
Check 17 :   1433   11101  109593
Check 18 :   1433   11098  109579
Check 19 :   1433   11092  109533
Check 20 :   1433   11101  109591

We can just focus on the 'line count' - I expect the word and char count to vary due to timestamps and other small boot differences.

With kata-runtime I get:

Check 2 :   1434   11102  109602
Check 3 :   1435   11114  109723
Check 4 :   1415   10932  107916
Check 5 :   1400   10786  106470
Check 6 :   1434   11101  109600
Check 7 :   1434   11109  109674
Check 8 :   1391   10715  105750
Check 9 :   1408   10866  107262
Check 10 :   1434   11098  109577
Check 11 :   1434   11108  109658
Check 12 :   1434   11102  109592
Check 13 :   1405   10830  106956
Check 14 :   1350   10410  102578
Check 15 :   1434   11110  109682
Check 16 :   1434   11098  109588
Check 17 :   1434   11106  109620
Check 18 :   1434   11107  109651
Check 19 :   1359   10478  103261
Check 20 :   1358   10469  103179

The line count looks pretty wobbly...
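
The wobble can also be summarized mechanically rather than by eye. A minimal sketch, assuming the loop output has been saved in the `Check N : lines words chars` format shown above; `summarize_wobble` is a hypothetical name, and taking the most frequent line count as the "full" output is an assumption, not something established in this thread.

```shell
#!/bin/bash
# Hypothetical helper: summarize line-count variation across runs.
# Reads "Check N : LINES WORDS CHARS" records on stdin.
summarize_wobble() {
    awk '
        /^Check/ {
            lines = $(NF - 2) + 0      # wc prints: lines words chars
            if (n == 0 || lines < min) min = lines
            if (n == 0 || lines > max) max = lines
            runs[n++] = lines
            count[lines]++
        }
        END {
            if (n == 0) { print "no runs found"; exit 1 }
            # Assumption: the most frequent line count is the reference;
            # ties are broken in favour of the larger count.
            for (c in count)
                if (count[c] > best || (count[c] == best && c + 0 > mode)) {
                    best = count[c]
                    mode = c + 0
                }
            for (i = 0; i < n; i++)
                if (runs[i] < mode) short++
            printf "runs=%d mode=%d min=%d max=%d short=%d\n", n, mode, min, max, short
        }
    '
}
```

Fed the 19 kata-runtime results above, this reports `runs=19 mode=1434 min=1350 max=1435 short=8`, that is, 8 of the 19 runs came back shorter than the most common line count.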

Can somebody else give this a spin then - @jodh-intel , would you be able to give the above simple script a quick spin with your kata setup?

Maybe something else is the variable here: the docker, qemu or go version, for instance? My qemu looks to be 2.7 - could that be an issue?

grahamwhaley commented on June 10, 2024

I enabled the debugs in my .toml config - I still see the 'wobble' for kata.

jodh-intel commented on June 10, 2024

My results...

With cc-runtime (master + latest OBS packages for everything else):

Check 1 :   1436   11142  110421
Check 2 :   1436   11142  110403
Check 3 :   1437   11154  110540
Check 4 :   1436   11142  110404
Check 5 :   1436   11142  110430
Check 6 :   1436   11142  110420
Check 7 :   1436   11142  110401
Check 8 :   1436   11142  110410
Check 9 :   1436   11142  110411
Check 10 :   1436   11142  110414
Check 11 :   1436   11142  110410
Check 12 :   1436   11142  110395
Check 13 :   1436   11142  110415
Check 14 :   1436   11142  110415
Check 15 :   1436   11142  110408
Check 16 :   1436   11142  110419
Check 17 :   1437   11154  110552
Check 18 :   1446   11242  111345
Check 19 :   1436   11142  110405
Check 20 :   1436   11142  110405

With kata-runtime (from https://github.com/clearcontainers/runtime + latest kata agent in osbuilder image):

Check 1 :   1441   11198  111125
Check 2 :   1439   11171  110866
Check 3 :   1403   10825  107130
Check 4 :   1440   11181  110941
Check 5 :   1439   11171  110851
Check 6 :   1439   11171  110860
Check 7 :   1439   11171  110854
Check 8 :   1360   10503  103730
Check 9 :   1439   11171  110855
Check 10 :   1439   11171  110841
Check 11 :   1439   11171  110895
Check 12 :   1292    9998   98858
Check 13 :   1257    9729   96087
Check 14 :   1439   11171  110921
Check 15 :   1439   11171  110877
Check 16 :   1439   11171  110881
Check 17 :   1439   11171  110929
Check 18 :   1439   11171  110874
Check 19 :   1441   11198  111143
Check 20 :   1441   11198  111186

With kata-runtime (from https://github.com/kata-containers/runtime + latest kata agent in osbuilder image):

Check 1 :   1439   11171  110857
Check 2 :   1441   11198  111129
Check 3 :   1449   11271  111799
Check 4 :   1439   11171  110857
Check 5 :   1439   11171  110865
Check 6 :   1439   11171  110861
Check 7 :   1439   11171  110870
Check 8 :   1441   11198  111135
Check 9 :   1439   11171  110857
Check 10 :   1439   11171  110845
Check 11 :   1433   11120  110360
Check 12 :   1439   11171  110856
Check 13 :   1439   11171  110872
Check 14 :   1439   11171  110863
Check 15 :   1359   10498  103680
Check 16 :   1441   11198  111139
Check 17 :   1439   11171  110859
Check 18 :   1439   11171  110864
Check 19 :   1439   11171  110871
Check 20 :   1439   11171  110857

sboeuf commented on June 10, 2024

This issue might be related to kata-containers/runtime#35 and #145
Please read the issue for the explanation of the root cause.

egernst commented on June 10, 2024

@sboeuf @grahamwhaley -- AFAIU we've fixed this already - can this be closed now?

grahamwhaley commented on June 10, 2024

Not until somebody physically confirms it. afaik, we had 3 issues, 2 of which were fixed - this one has not been checked. There were a couple of fixes in this area, but they were not mentioned directly in relation to this - so, pls do not close yet. Let's re-test (which I won't get to this week, and was holding off a re-run of the metrics on kata until the arch discussions had settled down).

sboeuf commented on June 10, 2024

Yes I agree with @grahamwhaley, let's keep this open as this might not be fixed yet.

egernst commented on June 10, 2024

If I manage to get latest kata up later today, I'll make a note to run the "wobble script" and report what I find...

egernst commented on June 10, 2024

Looking pretty good, using latest agent, runtime, shim and proxy:

Check 1 :   1546   12290  121117
Check 2 :   1548   12317  121397
Check 3 :   1546   12290  121109
Check 4 :   1546   12290  121115
Check 5 :   1546   12290  121105
Check 6 :   1546   12290  121123
Check 7 :   1546   12290  121132
Check 8 :   1546   12290  121114
Check 9 :   1546   12290  121113
Check 10 :   1546   12290  121122
Check 11 :   1546   12290  121132
Check 12 :   1546   12290  121111
Check 13 :   1546   12290  121105
Check 14 :   1546   12290  121109
Check 15 :   1546   12290  121109
Check 16 :   1546   12290  121114
Check 17 :   1546   12290  121105
Check 18 :   1547   12302  121263
Check 19 :   1546   12290  121114
Check 20 :   1546   12290  121128

egernst commented on June 10, 2024

I just ran ~15 kata containers, checking the dmesg output of each:

sudo docker run --runtime=kata-runtime -it alpine sh -c dmesg

In each case the output looks appropriate, and I don't see any truncation or mangling.

Closing this issue. If you see this again, please reopen and I owe you a drink of choice.

sboeuf commented on June 10, 2024

@egernst Fair enough, but the issue was hard to reproduce. Only @grahamwhaley was able to see it on his machine.

egernst commented on June 10, 2024

I guess we'll see then -- looked like jodh had some variation in his as well. Either way, good chance for @grahamwhaley to prove me wrong! :)

grahamwhaley commented on June 10, 2024

Whilst I had a fresh kata set up I re-checked (just for you @egernst ;-). Looking good - so, I think one of our other stdout/err/buffer fixes has fixed this:

# For kata
Check 1 :   1690   13162  130371
Check 2 :   1690   13159  130319
Check 3 :   1690   13159  130311
Check 4 :   1690   13158  130325
Check 5 :   1690   13165  130354
Check 6 :   1690   13157  130302
Check 7 :   1690   13164  130348
Check 8 :   1690   13166  130362
Check 9 :   1690   13160  130350
Check 10 :   1690   13161  130340
Check 11 :   1690   13160  130324
Check 12 :   1690   13164  130351
Check 13 :   1690   13165  130360
Check 14 :   1690   13162  130341
Check 15 :   1690   13170  130406
Check 16 :   1690   13163  130343
Check 17 :   1690   13160  130331
Check 18 :   1690   13166  130368
Check 19 :   1690   13161  130332
Check 20 :   1690   13156  130307

# For cc-runtime
Check 1 :   1690   13129  129992
Check 2 :   1690   13126  129955
Check 3 :   1690   13129  129986
Check 4 :   1690   13129  129987
Check 5 :   1690   13127  129977
Check 6 :   1690   13132  130003
Check 7 :   1690   13126  129964
Check 8 :   1690   13134  130029
Check 9 :   1690   13123  129944
Check 10 :   1690   13127  129976
Check 11 :   1690   13132  130000
Check 12 :   1690   13129  129987
Check 13 :   1690   13126  129965
Check 14 :   1690   13124  129965
Check 15 :   1690   13124  129951
Check 16 :   1690   13126  129964
Check 17 :   1690   13132  130018
Check 18 :   1690   13129  129998
Check 19 :   1690   13126  129970
Check 20 :   1690   13126  129973

I'm happy the line counts are the same. If somebody were being really picky they could check why the char counts are different (that is, always a little less on kata). It is probably for some sane reason - but given the images and kernels are pretty identical, it might be nice to know.
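
For anyone who does want to chase the char count difference, one approach is to diff a capture from each runtime with the timestamps stripped, so only genuine content differences remain. A minimal sketch, assuming both captures were saved to files and that lines carry the usual `[    0.000000] ` dmesg timestamp prefix; `strip_timestamps` and `diff_dmesg` are hypothetical helper names.

```shell
#!/bin/bash
# Hypothetical helpers: compare two saved dmesg captures, ignoring timestamps.
strip_timestamps() {
    # Remove a leading "[    0.123456] " style prefix from each line
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] ?//' "$1"
}

diff_dmesg() {
    local a b status
    a=$(mktemp) && b=$(mktemp) || return 2
    strip_timestamps "$1" > "$a"
    strip_timestamps "$2" > "$b"
    status=0
    # diff exits 0 when identical, 1 when the files differ
    diff "$a" "$b" || status=$?
    rm -f "$a" "$b"
    return "$status"
}
```

With captures saved as, say, `kata.txt` and `cc.txt` (illustrative names), `diff_dmesg kata.txt cc.txt` prints only the lines whose text differs between the two runtimes.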
