GithubHelp home page GithubHelp logo

CRIU support about crun HOT 35 CLOSED

containers avatar containers commented on May 12, 2024 3
CRIU support

from crun.

Comments (35)

giuseppe avatar giuseppe commented on May 12, 2024 1

closed by #480 🎉 🎉 🎉 🎉 🎉

from crun.

giuseppe avatar giuseppe commented on May 12, 2024

@adrianreber I've not even looked into it so I've no idea at all. How much work would it be to add CRIU support to crun?

from crun.

adrianreber avatar adrianreber commented on May 12, 2024

Reading about crun I also already thought about it. Hard to tell how much work it would actually be, definitely doable

The bigger questions is how to communicate between crun and criu. runc starts criu in a special RPC mode and then transfers protobuf messages to criu. For a C based program there are two approaches, either the same as runc, but that requires protobuf-c to translate protobuf to C code and some code around starting criu and sending the messages. All this is already implemented in libcriu, crun could just directly link against libcriu.

How do you feel about adding a library dependency for criu support to crun? Or would you prefer to include the criu wrapper code directly into the crun code base (like one header and one (or two) C file) without adding an additional library?

from crun.

giuseppe avatar giuseppe commented on May 12, 2024

I think the libcriu approach is to prefer as I see it offers a nicer C integration.

I see no problems with the new dependency as long as we have a way to disable it at ./configure time as we can do with systemd. The dependency and the CRIU support can be enabled by default.

from crun.

rhatdan avatar rhatdan commented on May 12, 2024

Since we plan on defaulting to crun in Fedora 31, I think we need to get going on this. or CRIU will not work with it correct?

from crun.

rhatdan avatar rhatdan commented on May 12, 2024

@adrianreber @giuseppe Any movement on this, or will CRIU be broken in F31?

from crun.

adrianreber avatar adrianreber commented on May 12, 2024

I did not had a chance to work on this. So currently there is no CRIU support in crun from my side.

from crun.

rhatdan avatar rhatdan commented on May 12, 2024

Well then it will be broken in F31.

from crun.

crosbymichael avatar crosbymichael commented on May 12, 2024

Thinking about this for a while, it would be nice to move this functionality out of "runc/crun". I think it can be done. Most of the runc specific code is around state management of the container. It would be worth it to factor this out into higher layers and provide the correct hooks in crun to do this while being controlled from above.

from crun.

adrianreber avatar adrianreber commented on May 12, 2024

It would be worth it to factor this out into higher layers and provide the correct hooks in crun to do this while being controlled from above.

@crosbymichael This sounds interesting. Could you describe your idea a bit more in detail? I like the idea but I do not yet understand how this would look like. What would be in crun/runc, what would be in the upper layers? How would those higher layers interact with crun/runc?

from crun.

adrianreber avatar adrianreber commented on May 12, 2024

@giuseppe I started to integrate CRIU support into crun and I am hitting a problem that CRIU cannot checkpoint a process with the following error message:

(00.042961) sid=0 pgid=0 pid=1
(00.042964) Error (criu/cr-dump.c:1327): A session leader of 155391(1) is outside of its pid namespace

Looking closer at this I see a difference between a process running in crun (compared to runc). The following is from a Fedora 31 system running a RHEL7 based container:

[root@fedora01 rhel7-httpd]# runc run rhel7-httpd -d -b /runc/containers/rhel7-httpd/
[root@fedora01 rhel7-httpd]# runc exec rhel7-httpd cat /proc/1/status | grep NSsid
NSsid:	1
[root@fedora01 rhel7-httpd]# runc delete -f rhel7-httpd
[root@fedora01 rhel7-httpd]# crun run -d -b /runc/containers/rhel7-httpd/  rhel7-httpd 
[root@fedora01 rhel7-httpd]# crun exec rhel7-httpd cat /proc/1/status | grep NSsid
NSsid:	0

CRIU's error message show the same, that sid=0. CRIU does a getsid() from within the process to be checkpointed to figure out the SID.

Do you know if there a specific reason crun behaves differently than runc for this?

Checkpointing a runc container CRIU gives me following information about PID 1 in the container:

(00.064662) sid=1 pgid=1 pid=1

So sid and pgid is different when running crun.

CRIU has the following check in its code:

        if (item->sid == 0) {
                pr_err("A session leader of %d(%d) is outside of its pid namespace\n",
                        item->pid->real, vpid(item));
                goto err_cure;
        }

from crun.

giuseppe avatar giuseppe commented on May 12, 2024

@adrianreber could it be related to the setsid() call crun does in src/libcrun/linux.c?

from crun.

giuseppe avatar giuseppe commented on May 12, 2024

I'll take a look to understand what is going on

from crun.

giuseppe avatar giuseppe commented on May 12, 2024

I think the setsid should be inconditional, does this patch work for you?

diff --git a/src/libcrun/container.c b/src/libcrun/container.c
index a71bfdd..76645b4 100644
--- a/src/libcrun/container.c
+++ b/src/libcrun/container.c
@@ -676,14 +676,14 @@ container_init_setup (void *args, const char *notify_socket,
         }
     }
 
+  ret = setsid ();
+  if (UNLIKELY (ret < 0))
+    return crun_make_error (err, errno, "setsid");
+
   if (has_terminal)
     {
       cleanup_close int terminal_fd = -1;
 
-      ret = setsid ();
-      if (UNLIKELY (ret < 0))
-        return crun_make_error (err, errno, "setsid");
-
       fflush (stderr);
 
       terminal_fd = libcrun_set_terminal (container, err);
diff --git a/src/libcrun/linux.c b/src/libcrun/linux.c
index 145a77a..ea493c5 100644
--- a/src/libcrun/linux.c
+++ b/src/libcrun/linux.c
@@ -2838,12 +2838,6 @@ libcrun_join_process (libcrun_container_t *container, pid_t pid_to_join, libcrun
   for (i = 0; all_namespaces[i]; i++)
     close_and_reset (&fds[i]);
 
-  if (detach && setsid () < 0)
-    {
-      crun_make_error (err, errno, "setsid");
-      goto exit;
-    }
-
   /* We need to fork once again to join the PID namespace.  */
   pid = fork ();
   if (UNLIKELY (pid < 0))
@@ -2881,9 +2875,6 @@ libcrun_join_process (libcrun_container_t *container, pid_t pid_to_join, libcrun
         {
           cleanup_close int master_fd = -1;
 
-          if (setsid () < 0)
-            libcrun_fail_with_error (errno, "setsid");
-
           master_fd = open_terminal (container, &slave, err);
           if (UNLIKELY (master_fd < 0))
             {
@@ -2899,6 +2890,9 @@ libcrun_join_process (libcrun_container_t *container, pid_t pid_to_join, libcrun
             }
         }
 
+      if (setsid () < 0)
+        libcrun_fail_with_error (errno, "setsid");
+
       if (r != 0)
         _exit (EXIT_FAILURE);

from crun.

adrianreber avatar adrianreber commented on May 12, 2024

@giuseppe Thanks. With that patch I do not see the session leader error any more.

from crun.

adrianreber avatar adrianreber commented on May 12, 2024

@giuseppe Another thing I am having a problem is, is /dev/null. If stdin, stdout, stderr point to/dev/null runc reopens /dev/null inside of the container (func reOpenDevNull()). Does crun also do that? I haven't seen it so far.

It seems like /dev/null still points to the outside of the container.

from crun.

giuseppe avatar giuseppe commented on May 12, 2024

@adrianreber no, it is not currently done. I will add that

from crun.

giuseppe avatar giuseppe commented on May 12, 2024

@adrianreber is that currently blocking you?

from crun.

adrianreber avatar adrianreber commented on May 12, 2024

@adrianreber is that currently blocking you?

Yes, but knowing that it is missing I can look into it myself. If you have a patch I am happy to use that, but I can try to provide a patch myself.

from crun.

giuseppe avatar giuseppe commented on May 12, 2024

Yes, but knowing that it is missing I can look into it myself. If you have a patch I am happy to use that, but I can try to provide a patch myself.

ok thanks. CRIU support is very important, so if there is anything I can do to help with it, just let me know.

Is the mnt_id the only visible effect of re-opening /dev/null inside of the container?

from crun.

rst0git avatar rst0git commented on May 12, 2024

@adrianreber I think the error occurs because in crun /dev/null is mounted on masked paths:

ret = do_mount (container, "/dev/null", pathfd, path, NULL, MS_BIND | MS_UNBINDABLE | MS_REC, NULL, 0, err);

The same is done in runc, but it also adds /dev/null as an external mount map for CRIU: https://github.com/opencontainers/runc/blob/f6fb7a0338c3ea8488bd9bd7cc7667b113aff8d8/libcontainer/container_linux.go#L870L872

from crun.

adrianreber avatar adrianreber commented on May 12, 2024

@rst0git I thought I am already correctly handling the masked paths. Let me have a closer look at this, but I can definitely see that runc uses /dev/null from the inside of the container and crun /dev/null from the outside.

from crun.

adrianreber avatar adrianreber commented on May 12, 2024

If I am pointing stdin, stdout, stderr to the /dev/null in the container if the point to /dev/null on the outside then I successfully checkpoint the container.

Basic checkpointing works for me now with @giuseppe setsid patch from above. I will open a PR for the /dev/null handling.

from crun.

giuseppe avatar giuseppe commented on May 12, 2024

Basic checkpointing works for me now with @giuseppe setsid patch from above.

that is great news! Do you want me to prepare a PR with the setsid patch above?

from crun.

adrianreber avatar adrianreber commented on May 12, 2024

that is great news! Do you want me to prepare a PR with the setsid patch above?

Yes, please.

from crun.

giuseppe avatar giuseppe commented on May 12, 2024

@adrianreber opened here: #267

from crun.

adrianreber avatar adrianreber commented on May 12, 2024

Question how to continue here. My main 'problem' is Podman. Podman currently figures out if the OCI runtime supports checkpointing by running the subcommand 'checkpoint' and if it is successful it assume this OCI runtime has all the features Podman needs. This basically means that I cannot add my checkpoint/restore patches to crun before if is ready for Podman to actually being used.

On the other hand this means that it will be one huge patch once I am ready. I think smaller patches/pull requests would make more sense as they are easier to review. Is there a way to introduce an experimental feature into crun? One idea I had was not mentioning it in crun --help and instead of calling the subcommands checkpoint and restore maybe prefix if with an underscore (_checkpoint, _restore). This way it could be tested by anyone interested but it would not be immediately obvious that it exists and Podman would also not detect it if it is prefixed with an underscore. Once everything is ready I would remove the prefix.

Would this be a workable approach for you? Any other ideas how to handle this best?

from crun.

giuseppe avatar giuseppe commented on May 12, 2024

yes I agree smaller patches are better, so we can already start playing with it.

I am fine with _checkpoint and _restore as well as using them something different like checkpoint-experimental

from crun.

adrianreber avatar adrianreber commented on May 12, 2024

@giuseppe If I want to pass multiple checkpoint/restore options from the subcommand parser down to libcrun, should I create a new structure in src/libcrun/container.h or make the new checkpoint/restore specific members part of the context structure?

from crun.

giuseppe avatar giuseppe commented on May 12, 2024

both are fine, as long as the new features are guarded by macros so it is possible to disable CRIU att build time if needed

from crun.

adrianreber avatar adrianreber commented on May 12, 2024

With the merge of #342 this is now almost done. Once CRIU 3.14 is released I will bump crun's CRIU minimum version to 3.14 and enable the checkpoint/restore subcommands.

from crun.

giuseppe avatar giuseppe commented on May 12, 2024

With the merge of #342 this is now almost done. Once CRIU 3.14 is released I will bump crun's CRIU minimum version to 3.14 and enable the checkpoint/restore subcommands.

I think we can already enable the subcommands and fix the dependency in the .spec file once I cut a new release. Do you think the dependency should be at compile time?

from crun.

adrianreber avatar adrianreber commented on May 12, 2024

I am just fixing one more thing. So unfortunately not ready to be enabled. I tried to run Podman's checkpoint/restore tests and almost all tests are failing because the tests are using -t and the crun with CRIU cannot not yet correctly connect the console-socket back. This needs one more fix.

from crun.

adrianreber avatar adrianreber commented on May 12, 2024

This change in CRIU (checkpoint-restore/criu#1063) to be able to run Podman's checkpoint/restore tests with crun. Once these CRIU changes are merged I can open the corresponding crun PR.

from crun.

rhatdan avatar rhatdan commented on May 12, 2024

@adrianreber Any update on this issue?

from crun.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.