
firecracker-microvm / firecracker-containerd


firecracker-containerd enables containerd to manage containers as Firecracker microVMs

License: Apache License 2.0

Go 92.41% Makefile 5.53% Shell 2.06%
aws containerd containers firecracker firecracker-containerd firecracker-microvms oci virtualization

firecracker-containerd's People

Contributors

aaithal, alakesh, austinvazquez, chinchaun, dependabot[bot], fangn2, fntlnz, ginglis13, henry118, ircody, jingkaihe, kern--, kontotto, kzys, mdlayher, mehrdadrad, mikebrow, mxpv, plamenmpetrov, praveensastry, roycedavison, samuelkarp, serhiikozachenko, sipsma, swagatbora90, ustiugov, utam0k, vvejell1, xibz, zyqsempai

firecracker-containerd's Issues

Proxy stdio streams

We need a way to take the container's stdio streams inside the VM and proxy them over vsock to the FIFOs that containerd provides outside the VM.
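The proxying described above could look roughly like the following sketch. It is not the project's implementation: proxyStdio and demo are hypothetical names, net.Pipe stands in for the vsock connection, and an in-memory reader and buffer stand in for the FIFOs that containerd provides.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net"
	"strings"
	"sync"
)

// proxyStdio shuttles bytes between conn and a pair of host-side streams.
// conn stands in for the vsock connection to the guest; stdin and stdout
// stand in for the FIFOs that containerd creates on the host.
func proxyStdio(conn net.Conn, stdin io.Reader, stdout io.Writer) error {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		// Host stdin -> guest. An error here usually means the guest side
		// went away; the output copy below reports the overall outcome.
		_, _ = io.Copy(conn, stdin)
	}()

	// Guest output -> host stdout, until the guest closes the connection.
	_, err := io.Copy(stdout, conn)
	wg.Wait()
	return err
}

// demo wires proxyStdio to an in-memory pipe and a fake guest that reads
// five bytes of input and answers with a single line of output.
func demo() (string, error) {
	host, guest := net.Pipe()
	go func() {
		defer guest.Close()
		buf := make([]byte, 5)
		if _, err := io.ReadFull(guest, buf); err != nil {
			return
		}
		_, _ = guest.Write([]byte("echo: " + string(buf)))
	}()

	var out bytes.Buffer
	err := proxyStdio(host, strings.NewReader("hello"), &out)
	return out.String(), err
}

func main() {
	out, err := demo()
	fmt.Println(out, err)
}
```

A real version would also need a way to unblock the stdin copy when the guest exits while the host FIFO is still open; this sketch only returns once stdin has reached EOF.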

Cleaner method of shutdown

Currently the runtime calls stopVM(), which forcefully shuts down the VM. Instead, we could signal the VM to exit by calling /sbin/reboot, which is a more graceful shutdown.

This will need to be coordinated with how the runc shim handles shutdown, and will probably require some reworking of how the agent calls into it.

Right now, when running from ctr, the following error is printed every time ctr finishes:

ctr: rpc error: code = Unknown desc = ttrpc: client shutting down: read vsock:host(2):1222: transport endpoint is not connected: unknown

We should close the connection gracefully if possible to avoid this error.

Figure out why cri plugin must be disabled

If the cri plugin is enabled, our runtime causes errors upon exiting:

INFO[2018-12-03T11:02:09.030879139-08:00] TaskExit event &TaskExit{ContainerID:,ID:,Pid:224,ExitStatus:0,ExitedAt:2018-12-03 19:01:50.016792753 +0000 UTC,}
ERRO[2018-12-03T11:02:09.030966305-08:00] Failed to handle backOff event &TaskExit{ContainerID:,ID:,Pid:224,ExitStatus:0,ExitedAt:2018-12-03 19:01:50.016792753 +0000 UTC,} for   error="can't find container for TaskExit event: Prefix can't be empty"

The error "Prefix can't be empty" appears to come from a package in moby that is used by the cri plugin.

It's unclear why this happens, because AFAIK we are not invoking any cri functionality, but having the plugin enabled seems to change the exit code path in some way.

killing task doesn't work properly

Killing a task doesn't work properly and leaves a zombie shim process behind.
Please let me know if you need any extra info.

ctr c create --snapshotter firecracker-dm-snapshotter --runtime aws.firecracker docker.io/library/redis:latest myredis1
ctr tasks start myredis1
1:C 06 Feb 2019 19:11:45.423 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 06 Feb 2019 19:11:45.423 # Redis version=5.0.3, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 06 Feb 2019 19:11:45.423 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
1:M 06 Feb 2019 19:11:45.425 # You requested maxclients of 10000 requiring at least 10032 max file descriptors.
1:M 06 Feb 2019 19:11:45.426 # Server can't set maximum open files to 10032 because of OS error: Operation not permitted.
1:M 06 Feb 2019 19:11:45.426 # Current maximum open files is 1024. maxclients has been reduced to 992 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
1:M 06 Feb 2019 19:11:45.428 * Running mode=standalone, port=6379.
1:M 06 Feb 2019 19:11:45.428 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 06 Feb 2019 19:11:45.428 # Server initialized
1:M 06 Feb 2019 19:11:45.428 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:M 06 Feb 2019 19:11:45.429 * Ready to accept connections
ctr tasks list
TASK        PID    STATUS    
myredis1    996    RUNNING
ctr tasks kill myredis1
1:signal-handler (1549480354) Received SIGTERM scheduling shutdown...
ctr: rpc error: code = Unknown desc = ttrpc: client shutting down: read vsock:host(2):1028: connection reset by peer: unknown
ps axu | grep shim
root      4016  0.4  0.0 784440 13444 pts/0    Sl   19:11   0:01 /usr/local/bin/containerd-shim-aws-firecracker -namespace default -address /run/containerd/containerd.sock -publish-binary /usr/local/bin/containerd
ctr tasks ls
ERRO[2019-02-06T19:17:25.961348016Z] converting task to protobuf                   error="ttrpc: client shutting down: read vsock:host(2):1028: connection reset by peer: unknown" id=myredis1
TASK    PID    STATUS    

Runtime should limit privileges

  • runtime should support that ReadOnlyRootfs is false
  • runtime should support Privileged is true
  • runtime should support RunAsUser
  • should return error if RunAsGroup is set without RunAsUser
  • runtime should support RunAsUserName
  • runtime should support SupplementalGroups
  • runtime should support setting Capability
  • runtime should support Privileged is false
  • runtime should support RunAsGroup
  • runtime should support that ReadOnlyRootfs is true

Runtime should support namespace options

  • runtime should support HostNetwork is false
  • runtime should support HostIpc is true
  • runtime should support ContainerPID
  • runtime should support PodPID
  • runtime should support HostNetwork is true
  • runtime should support HostIpc is false
  • runtime should support HostPID

time out issue

I've followed the script and the getting started guide.

I used the script (building local versions of containerd, runc, firecracker and firecracker-containerd).

I'm trying this on:
Linux ub18 4.15.0-43-generic #46-Ubuntu SMP Thu Dec 6 14:45:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

I'm running sudo PATH=$PATH /usr/local/bin/containerd in one window and sudo /usr/local/bin/naive_snapshotter -address /var/run/firecracker-containerd/naive-snapshotter.sock -path /var/lib/firecracker-containerd/naive -debug in another.

When I run sudo ctr run --snapshotter firecracker-naive --runtime aws.firecracker --tty docker.io/library/busybox:latest busybox-test, I get:

ctr: Firecracker did not create API socket ./firecracker.sock: context deadline exceeded: unknown

Any ideas why?

Document the microvm architecture

Our architecture documentation is missing information on the kernel, microvm-guest image, and how the components inside the microvm work.

snapshotter: devmapper: Add an automated thinpool creation mode

The device mapper implementation of the snapshotter does not currently create the thinpool by default, and it returns an error if the thinpool has not been created beforehand. For users who know exactly what they want to achieve with device mapper, this approach is completely valid.
But users who just want to try out a block-device-based snapshotter do not want to run into pre-configuration steps such as creating thinpools on their own.
That's why I think it would be very useful for the snapshotter configuration to offer an option that makes the snapshotter take care of this step for you.
We don't need to advertise this as the most optimized, "production ready" use case, but as a nice way to get started.

Issue opened based on the discussion started here.

Runtime SeccompProfilePath

  • runtime should not block setting host name with unconfined seccomp and SYS_ADMIN
  • should support seccomp unconfined on the container
  • should support seccomp default which is unconfined on the container
  • runtime should support setting hostname with docker/default seccomp profile and SYS_ADMIN
  • runtime should support an seccomp profile that blocks setting hostname with SYS_ADMIN
  • runtime should block sethostname with docker/default seccomp profile and no extra caps
  • should support seccomp localhost/profile on the container
  • runtime should not support a custom seccomp profile without using localhost/ as a prefix
  • runtime should ignore a seccomp profile that blocks setting hostname when privileged
  • should support seccomp docker/default on the container

Should we persist with parsing JSON config file for Firecracker VM opts?

PRs #109 and #105 provide clients the ability to pass Firecracker VM options using the FirecrackerConfig protobuf message. As per #109 (comment):

Originally JSON was added as temp solution to simplify running. Here we have to manage two entities for configuration and keep overrideVMConfigFromTaskOpts to overwrite config params. This makes it more complicated than it should be and I see no reasons to keep old config.

Let's discuss the tradeoffs of continuing to parse default values from the JSON config. Option A is to continue doing what's done today (read JSON + build FC config + override with protobuf). Option B would be defaults + protobuf.
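To make option B concrete, here is a minimal sketch of "defaults + per-request overrides". The vmOptions struct, its fields and the default values are hypothetical placeholders; they are not the FirecrackerConfig message or the values the runtime uses.

```go
package main

import "fmt"

// vmOptions is a hypothetical stand-in for the options carried by the
// protobuf message. A zero value means "not set by the client".
type vmOptions struct {
	KernelArgs string
	CPUCount   uint32
	MemSizeMiB uint32
}

// builtinDefaults returns placeholder defaults compiled into the runtime.
func builtinDefaults() vmOptions {
	return vmOptions{KernelArgs: "console=ttyS0", CPUCount: 1, MemSizeMiB: 128}
}

// applyOverrides copies every field the request actually set onto base and
// leaves the remaining defaults untouched.
func applyOverrides(base, req vmOptions) vmOptions {
	if req.KernelArgs != "" {
		base.KernelArgs = req.KernelArgs
	}
	if req.CPUCount != 0 {
		base.CPUCount = req.CPUCount
	}
	if req.MemSizeMiB != 0 {
		base.MemSizeMiB = req.MemSizeMiB
	}
	return base
}

func main() {
	req := vmOptions{CPUCount: 2} // the client only asks for two vCPUs
	fmt.Printf("%+v\n", applyOverrides(builtinDefaults(), req))
}
```

Treating the zero value as "unset" keeps the merge simple, but it means a client cannot explicitly request a zero; pointer or wrapper fields would lift that limitation at the cost of more verbose code.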

runtime: Implement CID management

At the moment, we're hardcoding a vsock CID in the shim. This limits us to running a single instance per host. We should instead ensure that each VM has a unique CID.
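A minimal sketch of per-VM CID allocation follows, assuming a single process hands out the CIDs. The cidAllocator type and its methods are hypothetical. Guest CIDs start at 3 because 0 through 2 are reserved by the vsock address family (2 addresses the host, as in the vsock:host(2) errors quoted elsewhere on this page).

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// firstGuestCID is the lowest context ID usable by a guest; 0-2 are reserved.
const firstGuestCID uint32 = 3

// cidAllocator hands out CIDs in [firstGuestCID, limit) and tracks which
// ones are currently assigned to a running VM.
type cidAllocator struct {
	mu    sync.Mutex
	limit uint32
	inUse map[uint32]bool
}

func newCIDAllocator(limit uint32) *cidAllocator {
	return &cidAllocator{limit: limit, inUse: make(map[uint32]bool)}
}

// Acquire returns the lowest CID that is not in use.
func (a *cidAllocator) Acquire() (uint32, error) {
	a.mu.Lock()
	defer a.mu.Unlock()
	for cid := firstGuestCID; cid < a.limit; cid++ {
		if !a.inUse[cid] {
			a.inUse[cid] = true
			return cid, nil
		}
	}
	return 0, errors.New("no free vsock CID")
}

// Release makes a CID available again once its VM has been torn down.
func (a *cidAllocator) Release(cid uint32) {
	a.mu.Lock()
	defer a.mu.Unlock()
	delete(a.inUse, cid)
}

func main() {
	alloc := newCIDAllocator(1024)
	first, _ := alloc.Acquire()
	second, _ := alloc.Acquire()
	alloc.Release(first)
	reused, _ := alloc.Acquire()
	fmt.Println(first, second, reused)
}
```

If each VM gets its own shim process, an in-memory map like this is not enough on its own; the bookkeeping would have to live somewhere shared across processes on the host.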

Runtime should support mount propagation

  • with 'rslave' should support propagation from host to container
  • mount with 'rshared' should support propagation from host to container and vice versa
  • mount with 'rprivate' should not support propagation

Allow stdio redirection inside VM

Some use-cases are better served by having stdio contained entirely within the microVM, instead of crossing the VM<->host boundary through vsock.

Runtime should support streaming interfaces

  • runtime should support exec with tty=false and stdin=false
  • runtime should support portforward
  • runtime should support portforward in host network
  • runtime should support attach
  • runtime should support exec with tty=true and stdin=true

Remove directories for removed containers on the host

[ec2-user@ip-172-31-30-18 ~]$ sudo /usr/local/bin/ctr run --snapshotter firecracker-naive --runtime aws.firecracker --tty docker.io/library/busybox:latest test
/ # echo hello
hello
/ # exit
[ec2-user@ip-172-31-30-18 ~]$ sudo /usr/local/bin/ctr c rm test
[ec2-user@ip-172-31-30-18 ~]$ sudo /usr/local/bin/ctr run --snapshotter firecracker-naive --runtime aws.firecracker --tty docker.io/library/busybox:latest test
ctr: mkdir /run/containerd/io.containerd.runtime.v2.task/default/test: file exists: unknown

Even though I've deleted the container (and its snapshot) on the host, the microVM rootfs on the host retains the bundle directory that was created for the container. We should either clean up the bundle directory when the container is removed (or stopped?), or ensure that the microVM rootfs is read-only (or separate per microVM).
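The first option (cleaning up on removal) can be sketched as below. createBundle, removeBundle and demo are hypothetical helpers, and a temporary directory stands in for the bundle root; the sketch only reproduces the "file exists" collision and shows that removing the directory clears it.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// createBundle makes the per-container bundle directory and fails if a
// directory with the same ID is still present.
func createBundle(root, id string) (string, error) {
	dir := filepath.Join(root, id)
	if err := os.Mkdir(dir, 0o700); err != nil {
		return "", err
	}
	return dir, nil
}

// removeBundle deletes the bundle directory and everything under it.
// os.RemoveAll returns nil when the directory is already gone.
func removeBundle(root, id string) error {
	return os.RemoveAll(filepath.Join(root, id))
}

// demo reuses a container ID twice. It reports whether the second create
// collided with the stale directory, and the result of creating the bundle
// again after cleanup.
func demo() (collided bool, err error) {
	root, err := os.MkdirTemp("", "bundles")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(root)

	if _, err = createBundle(root, "test"); err != nil {
		return false, err
	}
	_, err = createBundle(root, "test")
	collided = errors.Is(err, fs.ErrExist)

	if err = removeBundle(root, "test"); err != nil {
		return collided, err
	}
	_, err = createBundle(root, "test")
	return collided, err
}

func main() {
	fmt.Println(demo())
}
```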

snapshotter: boltdb needs to rollback when mkfs or create thin pool fails

Currently, when mkfs or thin pool creation encounters an error, the snapshotter does not roll back the changes to the DB and immediately returns an error. Subsequent calls then fail with an odd "object already exists" error, because the entry already exists in the DB even though the thin pool may not have been created yet. The proposed change is to roll back the changes that occurred during storage.CreateSnapshot.

TestSnapshotterSuite/Chown failure in CI

TestSnapshotterSuite/Chown failed on an apparently-unrelated push: https://buildkite.com/firecracker-microvm/firecracker-containerd/builds/18. The test then succeeded on retry.

    --- FAIL: TestSnapshotterSuite/Chown (8.55s)
        issues.go:111: Check snapshots failed: invalid argument
            failed to mount
            github.com/containerd/containerd/snapshots/testsuite.applyToMounts
            	/root/go/pkg/mod/github.com/containerd/[email protected]/snapshots/testsuite/helpers.go:40
            github.com/containerd/containerd/snapshots/testsuite.createSnapshot
            	/root/go/pkg/mod/github.com/containerd/[email protected]/snapshots/testsuite/helpers.go:62
            github.com/containerd/containerd/snapshots/testsuite.checkSnapshots
            	/root/go/pkg/mod/github.com/containerd/[email protected]/snapshots/testsuite/helpers.go:123
            github.com/containerd/containerd/snapshots/testsuite.checkChown
            	/root/go/pkg/mod/github.com/containerd/[email protected]/snapshots/testsuite/issues.go:110
            github.com/containerd/containerd/snapshots/testsuite.makeTest.func1
            	/root/go/pkg/mod/github.com/containerd/[email protected]/snapshots/testsuite/testsuite.go:111
            testing.tRunner
            	/usr/lib/go-1.11/src/testing/testing.go:827
            runtime.goexit
            	/usr/lib/go-1.11/src/runtime/asm_amd64.s:1333
            failed to apply
            github.com/containerd/containerd/snapshots/testsuite.createSnapshot
            	/root/go/pkg/mod/github.com/containerd/[email protected]/snapshots/testsuite/helpers.go:63
            github.com/containerd/containerd/snapshots/testsuite.checkSnapshots
            	/root/go/pkg/mod/github.com/containerd/[email protected]/snapshots/testsuite/helpers.go:123
            github.com/containerd/containerd/snapshots/testsuite.checkChown
            	/root/go/pkg/mod/github.com/containerd/[email protected]/snapshots/testsuite/issues.go:110
            github.com/containerd/containerd/snapshots/testsuite.makeTest.func1
            	/root/go/pkg/mod/github.com/containerd/[email protected]/snapshots/testsuite/testsuite.go:111
            testing.tRunner
            	/usr/lib/go-1.11/src/testing/testing.go:827
            runtime.goexit
            	/usr/lib/go-1.11/src/runtime/asm_amd64.s:1333
            failed to create snapshot 1
            github.com/containerd/containerd/snapshots/testsuite.checkSnapshots
            	/root/go/pkg/mod/github.com/containerd/[email protected]/snapshots/testsuite/helpers.go:125
            github.com/containerd/containerd/snapshots/testsuite.checkChown
            	/root/go/pkg/mod/github.com/containerd/[email protected]/snapshots/testsuite/issues.go:110
            github.com/containerd/containerd/snapshots/testsuite.makeTest.func1
            	/root/go/pkg/mod/github.com/containerd/[email protected]/snapshots/testsuite/testsuite.go:111
            testing.tRunner
            	/usr/lib/go-1.11/src/testing/testing.go:827
            runtime.goexit
            	/usr/lib/go-1.11/src/runtime/asm_amd64.s:1333
        helpers.go:67: drwx------       4096 /tmp/snapshot-suite-Snapshotter-905427901
        helpers.go:67: drwxr-xr-x       4096 /tmp/snapshot-suite-Snapshotter-905427901/root
        helpers.go:67: drwxr-x---       4096 /tmp/snapshot-suite-Snapshotter-905427901/root/images
        helpers.go:65: -rw-r--r-- 1073741824 /tmp/snapshot-suite-Snapshotter-905427901/root/images/1 [ "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" ...]
        helpers.go:65: -rw-------      65536 /tmp/snapshot-suite-Snapshotter-905427901/root/metadata.db [ "\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\xed\xda\f\xed\x02\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\a\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\b\x00\x00\x00\x00\x00\x00\x00\t\x00\x00\x00\x00\x00\x00\x00" ...]
        helpers.go:67: drwxr-xr-x       4096 /tmp/snapshot-suite-Snapshotter-905427901/work

Implement networking

Integrate CNI into the appropriate components such that MicroVM-enclosed containers have network access.

Because Firecracker requires the use of Linux tap devices, many existing CNI plugins will not work. We will need to design one or more plugins to facilitate various network configurations.

Checksum mismatch for two packages

I pulled from master but I'm getting a checksum mismatch on two packages in our go.mod. I regenerated go.sum from scratch and see this diff:

--- a/go.sum
+++ b/go.sum
@@ -93,7 +93,7 @@ github.com/mdlayher/vsock v0.0.0-20181130155850-676f733b747c h1:iyuTD7VKmLNdKySm
 github.com/mdlayher/vsock v0.0.0-20181130155850-676f733b747c/go.mod h1:gLmzC7yBmdxKztR5gDQz8FyFUMHvOK5H0gQEbKQGIMA=
 github.com/mitchellh/mapstructure v1.1.2 h1:fmNYVwqnSfB9mZU6OS2O6GsXM+wcskZDuKQzvN1EDeE=
 github.com/mitchellh/mapstructure v1.1.2/go.mod h1:FVVH3fgwuzCH5S8UJGiWEs2h04kUh9fWfEaFds41c1Y=
-github.com/moby/moby v0.7.3-0.20181205005855-1895e082b613 h1:6Xs44N81W9BuXM9C/cSlFZV/O6m1Mn0FZXCkyEJ18ek=
+github.com/moby/moby v0.7.3-0.20181205005855-1895e082b613 h1:pUghvT3QQWEKVFwH3VdAKbJSPNOcxS+ix9kguP4tEFM=
 github.com/moby/moby v0.7.3-0.20181205005855-1895e082b613/go.mod h1:fDXVQ6+S340veQPv35CzDahGBmHsiclFwfEygB/TWMc=
 github.com/opencontainers/go-digest v1.0.0-rc1 h1:WzifXhOVOEOuFYOJAW6aQqW0TooG2iki3E3Ii+WN7gQ=
 github.com/opencontainers/go-digest v1.0.0-rc1/go.mod h1:cMLVZDEM3+U2I4VmLI6N8jQYUd2OVphdqWwCJHrFt2s=
@@ -144,6 +144,6 @@ gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+
 gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
 gopkg.in/yaml.v2 v2.2.1 h1:mUhvW9EsL+naU5Q3cakzfE91YhliOondGd6ZrsDBHQE=
 gopkg.in/yaml.v2 v2.2.1/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
-gotest.tools v2.2.0+incompatible h1:VsBPFP1AI068pPrMxtb/S8Zkgf9xEmTLJjfM+P5UIEo=
+gotest.tools v2.2.0+incompatible h1:y0IMTfclpMdsdIbr6uwmJn5/WZ7vFuObxDMdrylFM3A=
 gotest.tools v2.2.0+incompatible/go.mod h1:DsYFclhRJ6vuDpmuTbkuFWG+y2sxOXAzmJt81HFBacw=
 honnef.co/go/tools v0.0.0-20180728063816-88497007e858/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4=

I have no idea why this is different.

Don't share a mutable rootfs with all microVMs

We currently use the same rootfs image for all microVMs, and it is mounted read/write. We need to prevent multiple microVMs from mutating it, or present a separate copy of the image to each microVM.

Runtime should support networking

  • runtime should support DNS config
  • runtime should support port mapping with host port and container port
  • runtime should support port mapping with only container port

Simplify the runtime configuration

The runtime configuration is fairly complicated and has a number of fields that need reasonable defaults.

  • firecracker_binary_path - the relative path default is unreasonable, since the working directory changes every time containerd executes a new container (working directory is inside the container bundle)
  • socket_path - needs a reasonable default, probably relative inside the working directory/bundle
  • kernel_image_path - could use a default lookup path
  • kernel_args - needs a reasonable default
  • root_drive - could use a default lookup path
  • cpu_count - might want to pick this dynamically from the OCI spec, with perhaps a reasonable default
  • console - should it default to "stdio"?

Related to #57
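One common way to give every field a default is to pre-fill the struct and let the JSON file override only the keys it contains. The sketch below uses the field names from the list above; the runtimeConfig type, loadConfig and every default value are placeholders chosen for illustration, not the runtime's actual defaults.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// runtimeConfig mirrors the configuration keys listed above.
type runtimeConfig struct {
	FirecrackerBinaryPath string `json:"firecracker_binary_path"`
	SocketPath            string `json:"socket_path"`
	KernelImagePath       string `json:"kernel_image_path"`
	KernelArgs            string `json:"kernel_args"`
	RootDrive             string `json:"root_drive"`
	CPUCount              int    `json:"cpu_count"`
	Console               string `json:"console"`
}

// loadConfig starts from placeholder defaults and overlays the JSON document.
// json.Unmarshal leaves fields untouched when their key is absent.
func loadConfig(data []byte) (runtimeConfig, error) {
	cfg := runtimeConfig{
		FirecrackerBinaryPath: "/usr/local/bin/firecracker", // assumed location
		SocketPath:            "firecracker.sock",           // relative to the bundle
		KernelImagePath:       "/var/lib/firecracker-containerd/vmlinux", // assumed
		KernelArgs:            "console=ttyS0 reboot=k panic=1",          // assumed
		RootDrive:             "/var/lib/firecracker-containerd/rootfs.img", // assumed
		CPUCount:              1,
		Console:               "stdio",
	}
	if err := json.Unmarshal(data, &cfg); err != nil {
		return runtimeConfig{}, fmt.Errorf("parse runtime config: %w", err)
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig([]byte(`{"cpu_count": 2}`))
	fmt.Printf("%+v %v\n", cfg, err)
}
```

With this shape, a configuration file only needs to mention the values that differ from the defaults, and an absolute default for firecracker_binary_path avoids depending on the working directory.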

Allow specification of VM parameters

Many of the configuration parameters we support today should be settable at the VM level instead of using the same for all VMs.

  • CPU count
  • CPU template
  • Memory
  • Additional storage devices
  • Network devices
  • Kernel

Snapshotter improvements

The existing snapshotter is, by design, simple and broadly compatible but not very efficient. We should implement an alternative snapshotter following a model similar to Docker's device-mapper storage driver.

Write a "Getting Started" guide

We need a guide to help users and contributors get started with firecracker-containerd. #16 is part of the process, but we also need to cover installation and configuration of all required components.

Support exec

We need to be able to support task.Exec for executing additional processes inside the containers.

Runtime should support basic operations on container

  • runtime should support starting container
  • runtime should support execSync
  • runtime should support execSync with timeout
  • runtime should support stopping container
  • runtime should support creating container
  • runtime should support removing container

Determine how agent and runc should be embedded into the microvm

firecracker-containerd depends on three components inside the microVM: the agent (communicating across vsock for control and stdio), containerd's runc shim (supervising runc), and runc (running the requested container). We need to determine how we want those components to be made available inside the microVM: embedded in the microvm-guest image, added as a separate device, or something else.

Use alternative file systems in snapshotters

We should allow the user to specify an alternative filesystem (e.g. xfs) via configuration file and/or flags.

This includes:

  • Extending mkfs func
  • Adding configuration flag to switch between fs
  • Support dm.fs flag
  • Unit tests
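As a rough idea of what extending the mkfs func could involve, the sketch below maps a configured filesystem type to the command line to run. mkfsCommand is a hypothetical helper, the device path is a placeholder, and treating ext4 as the default when nothing is configured is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// mkfsCommand returns the command line that formats device with the given
// filesystem type. An empty type falls back to ext4 (assumed default).
func mkfsCommand(fsType, device string) ([]string, error) {
	switch fsType {
	case "", "ext4":
		// -F forces creation even if the target does not look like a
		// regular block device.
		return []string{"mkfs.ext4", "-F", device}, nil
	case "xfs":
		// -f overwrites any filesystem signature already on the device.
		return []string{"mkfs.xfs", "-f", device}, nil
	default:
		return nil, fmt.Errorf("unsupported filesystem type %q", fsType)
	}
}

func main() {
	for _, fsType := range []string{"", "xfs", "btrfs"} {
		args, err := mkfsCommand(fsType, "/dev/mapper/example-snap-1")
		fmt.Println(strings.Join(args, " "), err)
	}
}
```

Returning the argument list instead of running the command keeps the selection logic easy to cover with unit tests, which is one of the items on the list above.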
