GithubHelp home page GithubHelp logo

Comments (8)

rn avatar rn commented on May 17, 2024

The error message might be a red herring. Here is the output from the daemon logs:

DEBU[2017-09-13T15:52:09.458936500+01:00] HCSShim::Container::Modify succeeded id=76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6_svm
DEBU[2017-09-13T15:52:09.458936500+01:00] opengcs: HotAddVhd: C:\lcow\lcow\scratch\76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6.vhdx added successfully
DEBU[2017-09-13T15:52:09.459935500+01:00] opengcs: CreateExt4Vhdx: C:\lcow\lcow\scratch\76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6.vhdx querying mapped virtual disks
DEBU[2017-09-13T15:52:09.459935500+01:00] HCSShim::Container::MappedVirtualDiskList id=76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6_svm
DEBU[2017-09-13T15:52:09.460955300+01:00] HCSShim::Container::MappedVirtualDiskList succeeded id=76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6_svm
DEBU[2017-09-13T15:52:09.460955300+01:00] opengcs: CreateExt4Vhdx: C:\lcow\lcow\scratch\76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6.vhdx at C=0 L=2
DEBU[2017-09-13T15:52:09.460955300+01:00] opengcs: RunProcess: test -d /sys/bus/scsi/devices/0:0:0:2
DEBU[2017-09-13T15:52:09.460955300+01:00] opengcs: createUtilsProcess
DEBU[2017-09-13T15:52:09.460955300+01:00] HCSShim::Container::CreateProcess id=76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6_svm config={"CommandLine":"test -d /sys/bus/scsi/devices/0:0:0:2","WorkingDirectory":"/bin","Environment":{"PATH":"/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:"},"CreateStdInPipe":true,"CreateStdOutPipe":true,"CreateStdErrPipe":true,"ConsoleSize":[0,0],"CreateInUtilityVm":true}
DEBU[2017-09-13T15:52:09.471936000+01:00] HCSShim::Container::CreateProcess succeeded id=76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6_svm processid=408
DEBU[2017-09-13T15:52:09.471936000+01:00] HCSShim::Process::Stdio processid=408
DEBU[2017-09-13T15:52:09.472939100+01:00] HCSShim::Process::Stdio succeeded processid=408
DEBU[2017-09-13T15:52:09.472939100+01:00] opengcs: createUtilsProcess success: pid 408
DEBU[2017-09-13T15:52:09.472939100+01:00] opengcs: runProcess success: test -d /sys/bus/scsi/devices/0:0:0:2
DEBU[2017-09-13T15:52:09.472939100+01:00] HCSShim::Process::WaitTimeout processid=408
DEBU[2017-09-13T15:52:09.526251200+01:00] HCSShim::Process::WaitTimeout succeeded processid=408
DEBU[2017-09-13T15:52:09.527248700+01:00] HCSShim::Process::ExitCode processid=408
DEBU[2017-09-13T15:52:09.527248700+01:00] HCSShim::Process::properties processid=408
DEBU[2017-09-13T15:52:09.528258300+01:00] HCSShim::Process::properties succeeded processid=408, properties={"ProcessId":408,"Exited":true,"ExitCode":0,"LastWaitResult":0}
DEBU[2017-09-13T15:52:09.528258300+01:00] HCSShim::Process::ExitCode succeeded processid=408 exitCode=0
DEBU[2017-09-13T15:52:09.528258300+01:00] opengcs: RunProcess: ls /sys/bus/scsi/devices/0:0:0:2/block
DEBU[2017-09-13T15:52:09.529341700+01:00] opengcs: createUtilsProcess
DEBU[2017-09-13T15:52:09.529341700+01:00] HCSShim::Container::CreateProcess id=76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6_svm config={"CommandLine":"ls /sys/bus/scsi/devices/0:0:0:2/block","WorkingDirectory":"/bin","Environment":{"PATH":"/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:"},"CreateStdInPipe":true,"CreateStdOutPipe":true,"CreateStdErrPipe":true,"ConsoleSize":[0,0],"CreateInUtilityVm":true}
DEBU[2017-09-13T15:52:11.609893700+01:00] Result: {"Error":-2147467259,"ErrorMessage":"Unspecified error","ErrorEvents":[{"Message":"failed creating stdin Connection: failed connecting the VsockConnection: failed connect() to 00000002.40000080: connection timed out","StackTrace":"\ngithub.com/Microsoft/opengcs/service/gcs/transport.(*VsockTransport).Dial\n\t/go/src/github.com/Microsoft/opengcs/service/gcs/transport/vsock.go:27\ngithub.com/Microsoft/opengcs/service/gcs/bridge.connectStdio\n\t/go/src/github.com/Microsoft/opengcs/service/gcs/bridge/connection.go:21\ngithub.com/Microsoft/opengcs/service/gcs/bridge.(*Bridge).execProcess\n\t/go/src/github.com/Microsoft/opengcs/service/gcs/bridge/bridge.go:382\ngithub.com/Microsoft/opengcs/service/gcs/bridge.(*Bridge).(github.com/Microsoft/opengcs/service/gcs/bridge.execProcess)-fm\n\t/go/src/github.com/Microsoft/opengcs/service/gcs/bridge/bridge.go:199\ngithub.com/Microsoft/opengcs/service/gcs/bridge.HandlerFunc.ServeMsg\n\t/go/src/github.com/Microsoft/opengcs/service/gcs/bridge/bridge.go:45\ngithub.com/Microsoft/opengcs/service/gcs/bridge.(*Mux).ServeMsg\n\t/go/src/github.com/Microsoft/opengcs/service/gcs/bridge/bridge.go:107\ngithub.com/Microsoft/opengcs/service/gcs/bridge.(*Bridge).ListenAndServe.func2.1\n\t/go/src/github.com/Microsoft/opengcs/service/gcs/bridge/bridge.go:262\nruntime.goexit\n\t/usr/lib/go/src/runtime/asm_amd64.s:2197","Provider":"00000000-0000-0000-0000-000000000000","EventId":0}]}
DEBU[2017-09-13T15:52:11.610930700+01:00] opengcs: CreateExt4Vhdx: hot-remove of C:\lcow\lcow\scratch\76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6.vhdx
DEBU[2017-09-13T15:52:11.615887400+01:00] opengcs: HotRemoveVhd: C:\lcow\lcow\scratch\76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6.vhdx
DEBU[2017-09-13T15:52:11.615887400+01:00] HCSShim::Container::Modify id=76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6_svm request={"ResourceType":"MappedVirtualDisk","Settings":{"HostPath":"C:\\lcow\\lcow\\scratch\\76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6.vhdx","ContainerPath":"","CreateInUtilityVM":true,"AttachOnly":false},"RequestType":"Remove"}
DEBU[2017-09-13T15:52:11.656885000+01:00] HCSShim::Container::Modify succeeded id=76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6_svm
DEBU[2017-09-13T15:52:11.656885000+01:00] opengcs: HotRemoveVhd: C:\lcow\lcow\scratch\76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6.vhdx removed successfully
DEBU[2017-09-13T15:52:11.657879800+01:00] HCSShim::Process::Close processid=408
DEBU[2017-09-13T15:52:11.657879800+01:00] HCSShim::Process::Close succeeded processid=408
DEBU[2017-09-13T15:52:11.657879800+01:00] lcowdriver: startservicevmifnotrunning 76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6: (applydiff 76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6): releasing serviceVM on error path from CreateExt4Vhdx: failed to `ls /sys/bus/scsi/devices/0:0:0:2/block` following hot-add C:\lcow\lcow\scratch\76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6.vhdx to utility VM: failed to create process (&{Options:{KirdPath:C:\Program Files\Linux Containers KernelFile:bootx64.efi InitrdFile:initrd.img Vhdx:C:\Program Files\Linux Containers\uvm.vhdx TimeoutSeconds:0 BootParameters:} Name:76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6_svm RequestedMode:3 ActualMode:2 UvmTimeoutSeconds:300 Uvm:0xc042942fc0 MappedVirtualDisks:[]}) in utility VM: container 76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6_svm encountered an error during CreateProcess: failure in a Windows system call: Unspecified error (0x80004005) extra info: {"CommandLine":"ls /sys/bus/scsi/devices/0:0:0:2/block","WorkingDirectory":"/bin","Environment":{"PATH":"/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:"},"CreateStdInPipe":true,"CreateStdOutPipe":true,"CreateStdErrPipe":true,"ConsoleSize":[0,0],"CreateInUtilityVm":true}
DEBU[2017-09-13T15:52:11.658893100+01:00] lcowdriver: startservicevmifnotrunning 76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6: (applydiff 76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6): releasing cachedScratchMutex on error path

in there is this:

DEBU[2017-09-13T15:52:09.528258300+01:00] opengcs: RunProcess: ls /sys/bus/scsi/devices/0:0:0:2/block
DEBU[2017-09-13T15:52:09.529341700+01:00] opengcs: createUtilsProcess
DEBU[2017-09-13T15:52:09.529341700+01:00] HCSShim::Container::CreateProcess id=76c1196c65017f90705ed697f01646b81e132e7319edbacf52768b1bb996e0b6_svm config={"CommandLine":"ls /sys/bus/scsi/devices/0:0:0:2/block","WorkingDirectory":"/bin","Environment":{"PATH":"/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:"},"CreateStdInPipe":true,"CreateStdOutPipe":true,"CreateStdErrPipe":true,"ConsoleSize":[0,0],"CreateInUtilityVm":true}
DEBU[2017-09-13T15:52:11.609893700+01:00] Result: {"Error":-2147467259,"ErrorMessage":"Unspecified error","ErrorEvents":[{"Message":"failed creating stdin Connection: failed connecting the VsockConnection: failed connect() to 00000002.40000080: connection timed out","StackTrace":"\ngithub.com/Microsoft/opengcs/service/gcs/transport.(*VsockTransport).Dial\n\t/go/src/github.com/Microsoft/opengcs/service/gcs/transport/vsock.go:27\ngithub.com/Microsoft/opengcs/service/gcs/bridge.connectStdio\n\t/go/src/github.com/Microsoft/opengcs/service/gcs/bridge/connection.go:21\ngithub.com/Microsoft/opengcs/service/gcs/bridge.(*Bridge).execProcess\n\t/go/src/github.com/Microsoft/opengcs/service/gcs/bridge/bridge.go:382\ngithub.com/Microsoft/opengcs/service/gcs/bridge.(*Bridge).(github.com/Microsoft/opengcs/service/gcs/bridge.execProcess)-fm\n\t/go/src/github.com/Microsoft/opengcs/service/gcs/bridge/bridge.go:199\ngithub.com/Microsoft/opengcs/service/gcs/bridge.HandlerFunc.ServeMsg\n\t/go/src/github.com/Microsoft/opengcs/service/gcs/bridge/bridge.go:45\ngithub.com/Microsoft/opengcs/service/gcs/bridge.(*Mux).ServeMsg\n\t/go/src/github.com/Microsoft/opengcs/service/gcs/bridge/bridge.go:107\ngithub.com/Microsoft/opengcs/service/gcs/bridge.(*Bridge).ListenAndServe.func2.1\n\t/go/src/github.com/Microsoft/opengcs/service/gcs/bridge/bridge.go:262\nruntime.goexit\n\t/usr/lib/go/src/runtime/asm_amd64.s:2197","Provider":"00000000-0000-0000-0000-000000000000","EventId":0}]}

which indicates a time out error on a VsockConnection: failed creating stdin Connection: failed connecting the VsockConnection: failed connect() to 00000002.40000080: connection timed out

So the real reason why the ls of the block device fails may have nothing to do with the command itself but with the plumbing of stdin/stdout for the process.

This could potentially be a dup of #124

from opengcs.

gupta-ak avatar gupta-ak commented on May 17, 2024

It looks like the same error as #124. Can you repro it if you cherry-pick the top commit from the retry-connect branch? (PR is #133)

from opengcs.

rn avatar rn commented on May 17, 2024

I've tried something slightly different https://github.com/rn/opengcs/tree/dial-timeout-hack
basically do a errors.Wrap() in virtsock and then look at the error returned and retry on ETIMEDOUT and with that I can't repro. I've only tried a few times and weirdly I haven't seen the log messages I added...I also don't see the logrus.Info() messages were already in the code (ie https://github.com/Microsoft/opengcs/blob/master/service/gcs/transport/vsock.go#L23). Any idea why they are not visible?

I'll try to debug more tomorrow (I'm in the UK and it's evening now)

from opengcs.

gupta-ak avatar gupta-ak commented on May 17, 2024

I can try out your branch and see why the logs aren't showing up. At a quick glance, it looks like you're missing the commit where the gcs logs are sent to the HCS, so maybe that's the reason if you're trying to find them from docker.

from opengcs.

gupta-ak avatar gupta-ak commented on May 17, 2024

Using your branch, I definitely see the logs. Here is a truncated snippet from /tmp/gcs.log:

vsock Dial port (1073741955)
external process 380 exited
bridge: response sent
opengcs: re-dial 1073741955
vsock Connect port (107374195
vsock Dial port (1073741956)
vsock Connect port (107374195
vsock Dial port (1073741957)
vsock Connect port (107374195

from opengcs.

rn avatar rn commented on May 17, 2024

Cool, thanks for checking. So this is definitely the same as #124 then.

Do you want me to tidy this up (and roll in the virtsock changes)? I think it's cleaner to just check for ETIMEDOUT as it doesn't mask other potential errors.

from opengcs.

gupta-ak avatar gupta-ak commented on May 17, 2024

Yeah, I agree. I originally had it check etimedout, but I realised it was getting masked. With the virtsock update, I think your change of only retrying on etimedout is right.

from opengcs.

rn avatar rn commented on May 17, 2024

yes, only retrying on ETIMEDOUT, other errors error out properly

from opengcs.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.