Comments (4)
Commit 4497ac8 originally intended to skip a tracefs mount that sits on top of a debugfs mount: on restore that tracefs is mounted automatically, so if CRIU also mounts it explicitly, one extra tracefs mount appears after each c/r.
On my Fedora system, though, I have both a "nested" tracefs and a separately mounted tracefs:
cat /proc/self/mountinfo | grep "tracefs\|debugfs"
37 24 0:7 / /sys/kernel/debug rw,nosuid,nodev,noexec,relatime shared:18 - debugfs debugfs rw
38 24 0:12 / /sys/kernel/tracing rw,nosuid,nodev,noexec,relatime shared:19 - tracefs tracefs rw
802 37 0:12 / /sys/kernel/debug/tracing rw,nosuid,nodev,noexec,relatime shared:610 - tracefs tracefs rw
The code does not differentiate between those two; that is the first problem with it.
The second problem is that skipping the mount leaves tracefs invisible in CRIU's mount tree, so files opened on it cannot be handled and cause an error. The proper solution is probably to stop skipping this mount on dump and instead skip restoring it explicitly when it sits on top of debugfs.
The third problem I see is that neither tracefs nor debugfs seems to be virtualized (correct me if I'm wrong); they belong to the host. Thus, if CRIU migrates an open file on tracefs/debugfs to another host, the file may become meaningless because of a different tracefs setup, or even lead to something completely unexpected.
So I would rather eliminate debugfs and tracefs from the container you are migrating, and also not migrate apps that use tracefs and debugfs, because this can lead to inconsistent setups.
Hi @Snorch,
First of all, thank you very much for taking the time to write such a detailed response.
What I am trying to do is dump a Podman container and restore it at a later moment, on the same machine. This means there is no risk that the tracing or debug filesystems are missing at restore time. The host runs Debian and the Podman image is Debian as well. On the host, tracefs is also mounted twice:
34 24 0:11 / /sys/kernel/tracing rw,nosuid,nodev,noexec,relatime shared:14 - tracefs tracefs rw
35 24 0:7 / /sys/kernel/debug rw,relatime shared:15 - debugfs none rw
288 35 0:11 / /sys/kernel/debug/tracing rw,relatime shared:162 - tracefs tracefs rw
For my application, the content and state of the open tracefs and debugfs files are not important after the container is restored.
I wrote a small test application to check what happens on dump/restore with different types of files open. Here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Print the name of the failed FILE* variable plus the errno text, then exit. */
#define EXIT_IF_NOT_OPEN(file)              \
    do                                      \
    {                                       \
        if (NULL == (file))                 \
        {                                   \
            perror("Error opening " #file); \
            exit(EXIT_FAILURE);             \
        }                                   \
    } while (0)

#define PID_FILE_PATH "/tmp/file-opener.pid"
#define NORMAL_FILE_PATH "/tmp/normalFile"
#define DEBUG_FS_FILE_PATH "/sys/kernel/debug/memblock/memory"
#define TRACE_FS_FILE_PATH "/sys/kernel/tracing/enabled_functions"

int main(void)
{
    const int pid = (int)getpid();

    /* Record our PID so the process can be found from outside. */
    FILE *pidFile = fopen(PID_FILE_PATH, "w");
    EXIT_IF_NOT_OPEN(pidFile);
    fprintf(pidFile, "%d\n", pid);
    fclose(pidFile);

    /* Keep one regular, one debugfs and one tracefs file open across c/r. */
    FILE *normalFile = fopen(NORMAL_FILE_PATH, "w");
    EXIT_IF_NOT_OPEN(normalFile);
    FILE *debugFsFile = fopen(DEBUG_FS_FILE_PATH, "r");
    EXIT_IF_NOT_OPEN(debugFsFile);
    FILE *traceFsFile = fopen(TRACE_FS_FILE_PATH, "r");
    EXIT_IF_NOT_OPEN(traceFsFile);

    /* Count forever so progress is visible before dump and after restore. */
    for (int i = 0;; ++i)
    {
        printf("PID: %d, count:%d\n", pid, i);
        fflush(stdout);
        sleep(1);
    }
}
If I start a container running that application:
sudo podman run \
  --detach \
  --network=host \
  --mount "type=bind,source=/tmp/file-opener,target=/home/root" \
  --mount "type=bind,source=/sys/kernel/tracing,target=/sys/kernel/tracing" \
  --mount "type=bind,source=/sys/kernel/debug,target=/sys/kernel/debug" \
  --name tc \
  docker.io/arm64v8/debian:latest \
  /home/root/file-opener
Dump and restore work perfectly if I modify tracefs_parse (criu/filesystems.c:576) to always return 0.
sudo podman container checkpoint -l -k
e8a9e19d9c21a9c17a04752d3b95751f1b925c7055a3e4
sudo podman container restore -l -k
e8a9e19d9c21a9c17a04752d3b95751f1b925c7055a3e4
restore.log says:
(00.004303) mnt: Read 488 mp @ /sys/kernel/tracing
(00.004322) mnt: Will mount 487 from /
(00.004340) mnt: Will mount 487 @ /tmp/.criu.mntns.gSkbyZ/mnt-0000000487 /sys/kernel/debug/tracing
(00.004357) mnt: Read 487 mp @ /sys/kernel/debug/tracing
(00.004378) mnt: Will mount 486 from /sys/kernel/debug (E)
(00.004396) mnt: Will mount 486 @ /tmp/.criu.mntns.gSkbyZ/mnt-0000000486 /sys/kernel/debug
(00.004411) mnt: Read 486 mp @ /sys/kernel/debug
(00.004433) mnt: Will mount 485 from /var/run/containers/storage/overlay-containers/e8a9e19d9c21a9c17a04752d3b95751f1b925c7055a3e4
Your change is basically the diff below, and it breaks the mnt_tracefs zdtm test:
[root@turmoil criu]# git diff
diff --git a/criu/filesystems.c b/criu/filesystems.c
index 093e1c492..433394b72 100644
--- a/criu/filesystems.c
+++ b/criu/filesystems.c
@@ -572,11 +572,6 @@ static int debugfs_parse(struct mount_info *pm)
return 0;
}
-static int tracefs_parse(struct mount_info *pm)
-{
- return 1;
-}
-
static bool cgroup_sb_equal(struct mount_info *a, struct mount_info *b)
{
if (a->private && b->private && strcmp(a->private, b->private))
@@ -744,7 +739,6 @@ static struct fstype fstypes[] = {
{
.name = "tracefs",
.code = FSTYPE__TRACEFS,
- .parse = tracefs_parse,
},
{
.name = "cgroup",
[root@turmoil criu]# test/zdtm.py run -t zdtm/static/mnt_tracefs
userns is supported
Checking feature mnt_id
mnt_id is supported
=== Run 1/1 ================ zdtm/static/mnt_tracefs
====================== Run zdtm/static/mnt_tracefs in uns ======================
Start test
Running zdtm/static/mnt_tracefs.hook(--post-start)
./mnt_tracefs --pidfile=mnt_tracefs.pid --outfile=mnt_tracefs.out --dirname=mnt_tracefs.test
Running zdtm/static/mnt_tracefs.hook(--pre-dump)
Run criu dump
Running zdtm/static/mnt_tracefs.hook(--pre-restore)
Run criu restore
=[log]=> dump/zdtm/static/mnt_tracefs/64/1/restore.log
------------------------ grep Error ------------------------
b'(00.004337) 1: No ipcns-sem-11.img image'
b'(00.005344) 1: net: Try to restore a link 10:1:lo'
b'(00.005359) 1: net: Restoring link lo type 1'
b'(00.005846) 1: net: \tRunning ip addr restore'
b'Error: ipv4: Address already assigned.'
b'Error: ipv6: address already assigned.'
b'(00.028274) 1: mnt: \tBind /sys/kernel/debug/ to /tmp/.criu.mntns.xt9rIN/14-0000000000/zdtm/static/mnt_tracefs.test'
b'(00.028294) 1: mnt: 1491:/tmp/.criu.mntns.xt9rIN/14-0000000000/zdtm/static/mnt_tracefs.test private 0 shared 0 slave 1'
b'(00.028301) 1: mnt: \tMounting tracefs 1492@/tmp/.criu.mntns.xt9rIN/14-0000000000/zdtm/static/mnt_tracefs.test/tracing (0)'
b'(00.028303) 1: mnt: \tBind /sys/kernel/debug/tracing/ to /tmp/.criu.mntns.xt9rIN/14-0000000000/zdtm/static/mnt_tracefs.test/tracing'
b"(00.028316) 1: Error (criu/mount.c:2507): mnt: Can't bind-mount at /tmp/.criu.mntns.xt9rIN/14-0000000000/zdtm/static/mnt_tracefs.test/tracing: Permission denied"
b'(00.029233) uns: calling exit_usernsd (-1, 1)'
b'(00.029410) uns: daemon calls 0x478080 (89, -1, 1)'
b'(00.029420) uns: `- daemon exits w/ 0'
b'(00.029959) uns: daemon stopped'
b'(00.029972) Error (criu/cr-restore.c:2571): Restoring FAILED.'
------------------------ ERROR OVER ------------------------
############## Test zdtm/static/mnt_tracefs FAIL at CRIU restore ###############
Test output: ================================
<<< ================================
Running zdtm/static/mnt_tracefs.hook(--clean)
##################################### FAIL #####################################
In your case this change helps, but with tracefs on an external master mount it breaks things. I don't see a general solution.
Because of problem (3) from my previous message, I believe it is best to avoid having tracefs and debugfs in the container.
Hi, thanks for the info. Then I will close this issue.