Comments (18)
> One question comes to mind: since we use the OCI worker by default, what is this containerd mount?
OCI mode still consumes containerd as a library.
from buildkit.
As mentioned by @bcressey, Bottlerocket mounts its local storage with the “nosuid” and “nodev” flags as a hardening step, and those flags are among those that have to be passed in subsequent bind mounts.
Here is a workaround that uses a persistent volume (EBS CSI driver in EKS) instead of an emptyDir, which is backed by Bottlerocket's local storage.
Pod: set fsGroup to 1000 so that the volume is mounted in the pod with access for user 1000 and group 1000.
Pod YAML - https://github.com/vtgspk/buildkit-rootless/blob/main/pod.yml
Persistent Volume Claim - https://github.com/vtgspk/buildkit-rootless/blob/main/persistent-claim.yml
Storage class - https://github.com/vtgspk/buildkit-rootless/blob/main/storage-class.yml
This way, I am able to get the buildkitd pod up and running and build images successfully inside it, using the EBS mount instead of Bottlerocket's local storage.
from buildkit.
Does it work if you specify securityContext.privileged?
from buildkit.
Does https://raw.githubusercontent.com/moby/buildkit/master/examples/kubernetes/job.rootless.yaml work?
from buildkit.
We do not want to run it in privileged mode.
from buildkit.
> We do not want to run it in privileged mode.
I am asking for diagnostic purposes.
from buildkit.
It actually worked, but the goal is still to run it without privileged mode.
from buildkit.
One question comes to mind: since we use the OCI worker by default, what is this containerd mount?
from buildkit.
Are there any logs or commands that I can run to help investigate further?
from buildkit.
> Are there any logs or commands that I can run to help investigate further?
Run cat /proc/mounts in the buildkitd container and compare the result with Ubuntu nodes, etc.
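To make the comparison between nodes easier, the output of /proc/mounts can be filtered down to the mounts that carry the hardening flags. The following is only a minimal sketch in Go; the helper names (parseMounts, hasOption, hardenedTargets) are hypothetical and are not part of buildkit or containerd.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// mountEntry holds the /proc/mounts fields used in this sketch.
type mountEntry struct {
	Source  string
	Target  string
	FSType  string
	Options []string
}

// parseMounts parses text in /proc/mounts format and skips malformed lines.
// Splitting options on "," also splits quoted values such as an SELinux
// context="...", which is harmless when only looking for plain flags.
func parseMounts(text string) []mountEntry {
	var entries []mountEntry
	for _, line := range strings.Split(text, "\n") {
		f := strings.Fields(line)
		if len(f) < 4 {
			continue
		}
		entries = append(entries, mountEntry{
			Source:  f[0],
			Target:  f[1],
			FSType:  f[2],
			Options: strings.Split(f[3], ","),
		})
	}
	return entries
}

// hasOption reports whether the entry carries the given mount option.
func hasOption(e mountEntry, opt string) bool {
	for _, o := range e.Options {
		if o == opt {
			return true
		}
	}
	return false
}

// hardenedTargets returns the mount points that carry both nosuid and nodev.
func hardenedTargets(entries []mountEntry) []string {
	var out []string
	for _, e := range entries {
		if hasOption(e, "nosuid") && hasOption(e, "nodev") {
			out = append(out, e.Target)
		}
	}
	return out
}

func main() {
	data, err := os.ReadFile("/proc/mounts")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read /proc/mounts:", err)
		os.Exit(1)
	}
	for _, t := range hardenedTargets(parseMounts(string(data))) {
		fmt.Println(t)
	}
}
```

Running this in the buildkitd container on each node type and diffing the results should show whether the buildkit data directory is mounted with nosuid and nodev on one node but not on the other.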
from buildkit.
It worked as expected on both Amazon Linux 2 and the Ubuntu EKS-optimized images based on 20.04.
The output of /proc/mounts is:
overlay / overlay rw,context="system_u:object_r:data_t:s0:c208,c287",relatime,lowerdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/71/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/59/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/55/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/50/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/45/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/40/fs,upperdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/425/fs,workdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/425/work 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev tmpfs rw,context="system_u:object_r:data_t:s0:c208,c287",nosuid,size=65536k,mode=755 0 0
devpts /dev/pts devpts rw,context="system_u:object_r:data_t:s0:c208,c287",nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666 0 0
mqueue /dev/mqueue mqueue rw,seclabel,nosuid,nodev,noexec,relatime 0 0
sysfs /sys sysfs ro,seclabel,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup cgroup2 ro,seclabel,nosuid,nodev,noexec,relatime 0 0
/dev/nvme1n1p1 /etc/hosts xfs rw,seclabel,nosuid,nodev,noatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/nvme1n1p1 /dev/termination-log xfs rw,seclabel,nosuid,nodev,noatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/nvme1n1p1 /etc/hostname xfs rw,seclabel,nosuid,nodev,noatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/nvme1n1p1 /etc/resolv.conf xfs rw,seclabel,nosuid,nodev,noatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
shm /dev/shm tmpfs rw,seclabel,nosuid,nodev,noexec,relatime,size=65536k 0 0
/dev/nvme1n1p1 /home/user/.local/share/buildkit xfs rw,seclabel,nosuid,nodev,noatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
tmpfs /run/secrets/kubernetes.io/serviceaccount tmpfs ro,seclabel,relatime,size=6931992k 0 0
proc /proc/bus proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/fs proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/irq proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/sys proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/sysrq-trigger proc ro,nosuid,nodev,noexec,relatime 0 0
tmpfs /proc/acpi tmpfs ro,context="system_u:object_r:data_t:s0:c208,c287",relatime 0 0
tmpfs /proc/kcore tmpfs rw,context="system_u:object_r:data_t:s0:c208,c287",nosuid,size=65536k,mode=755 0 0
tmpfs /proc/keys tmpfs rw,context="system_u:object_r:data_t:s0:c208,c287",nosuid,size=65536k,mode=755 0 0
tmpfs /proc/latency_stats tmpfs rw,context="system_u:object_r:data_t:s0:c208,c287",nosuid,size=65536k,mode=755 0 0
tmpfs /proc/timer_list tmpfs rw,context="system_u:object_r:data_t:s0:c208,c287",nosuid,size=65536k,mode=755 0 0
tmpfs /proc/scsi tmpfs ro,context="system_u:object_r:data_t:s0:c208,c287",relatime 0 0
tmpfs /sys/firmware tmpfs ro,context="system_u:object_r:data_t:s0:c208,c287",relatime 0 0
Can you please take a look and provide feedback so I can open a ticket with the Bottlerocket team with the details?
Thanks
from buildkit.
Seems relevant to SELinux? Does this work?
securityContext:
  seLinuxOptions:
    level: s0
    type: spc_t
from buildkit.
Unfortunately, it did not work.
I got the same error.
from buildkit.
As far as I can tell this is the same error that was fixed in #3697, but at a different stage in the process.
Running mountsnoop from bcc, I can see that the initial set of bind mounts goes OK:
buildkitd 210370 210738 4026533418 mount("/home/user/.local/share/buildkit/runc-overlayfs/snapshots/snapshots/2/fs", "/home/user/.local/tmp/buildkit-mount276192057", "bind", MS_RDONLY|MS_NOSUID|MS_NODEV|MS_NOATIME|MS_BIND|MS_REC, "") = 0
buildkitd 210370 210738 4026533418 mount("", "/home/user/.local/tmp/buildkit-mount276192057", "", MS_RDONLY|MS_NOSUID|MS_NODEV|MS_REMOUNT|MS_NOATIME|MS_BIND|MS_REC, "") = 0
...
However, the operation ultimately fails in the call to overlay.WriteUpperdir:
2024-05-06T01:07:57.845201317Z stderr F time="2024-05-06T01:07:57Z" level=warning msg="failed to compute blob by overlay differ (ok=false): failed to write compressed diff: failed to mount /home/user/.local/tmp/containerd-mount1074778686: operation not permitted" span="export layers" spanID=0f5a00d506b35262 traceID=32ade31627d6b338d5e3051b59dea3e2
From the related mountsnoop output, we can see that the nosuid and nodev flags were not passed:
buildkitd 210370 210739 4026533418 mount("/home/user/.local/share/buildkit/runc-overlayfs/snapshots/snapshots/4/fs", "/home/user/.local/tmp/containerd-mount1074778686", "bind", MS_RDONLY|MS_BIND|MS_REC, "") = 0
buildkitd 210370 210739 4026533418 mount("", "/home/user/.local/tmp/containerd-mount1074778686", "", MS_RDONLY|MS_REMOUNT|MS_BIND|MS_REC, "") = -EPERM
overlay.WriteUpperdir calls into mount.WithTempMount, which uses the containerd mount library. It looks like we end up here and then the remount fails because it doesn't have the equivalent of the UnprivilegedMountFlags logic.
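The difference between the two mountsnoop traces can be sketched as a small flag computation: before remounting a bind mount read-only, carry over the restrictive flags that are already set on the source. This is only an illustration under stated assumptions. The names remountFlags and preserveMask are hypothetical and are not the containerd or buildkit API, and the constants are the Linux MS_* values copied locally so the sketch compiles on any platform.

```go
package main

import "fmt"

// Linux mount flag values (MS_* from <sys/mount.h>), copied here so that the
// sketch does not depend on a Linux-only package.
const (
	msRdonly     uintptr = 0x1
	msNosuid     uintptr = 0x2
	msNodev      uintptr = 0x4
	msNoexec     uintptr = 0x8
	msRemount    uintptr = 0x20
	msNoatime    uintptr = 0x400
	msNodiratime uintptr = 0x800
	msBind       uintptr = 0x1000
	msRec        uintptr = 0x4000
	msRelatime   uintptr = 0x200000
)

// preserveMask lists the flags that this sketch carries over from the source
// mount. Assumption: this is the set an unprivileged remount may not clear.
const preserveMask = msRdonly | msNosuid | msNodev | msNoexec |
	msNoatime | msNodiratime | msRelatime

// remountFlags returns the flags for the follow-up remount of a bind mount:
// the requested flags, plus MS_REMOUNT, plus whatever restrictive flags the
// source mount already has.
func remountFlags(requested, source uintptr) uintptr {
	return requested | msRemount | (source & preserveMask)
}

func main() {
	requested := msRdonly | msBind | msRec

	// Flags on Bottlerocket's local storage, as seen in the thread.
	source := msNosuid | msNodev | msNoatime

	fmt.Printf("without preserving: %#x\n", requested|msRemount)
	fmt.Printf("with preserving:    %#x\n", remountFlags(requested, source))
}
```

With the source flags from the thread, the second value corresponds to MS_RDONLY|MS_NOSUID|MS_NODEV|MS_REMOUNT|MS_NOATIME|MS_BIND|MS_REC, the combination used by the remount that succeeded, while the first corresponds to the call that returned -EPERM.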
from buildkit.
> overlay.WriteUpperdir calls into mount.WithTempMount, which uses the containerd mount library. It looks like we end up here and then the remount fails because it doesn't have the equivalent of the UnprivilegedMountFlags logic.
@bcressey Thanks for the analysis. Would you be interested in submitting a PR?
from buildkit.
@bcressey if okay, I can work on a fix for this.
from buildkit.
@swagatbora90 that'd be great! Let me know if I can help advise on setting up a test environment, or testing out a change when ready.
from buildkit.
@bcressey @AkihiroSuda I added a PR to check for and preserve unprivileged flags before we remount a bind mount as read-only. However, that change alone was not sufficient; I also had to update the above pod spec to mount the /tmp directory from the host.
pod.spec
apiVersion: v1
kind: Pod
metadata:
  name: buildkitd
spec:
  containers:
    - name: buildkitd
      image: public.ecr.aws/e5v3s6y4/buildkit-rootless:rootless
      args:
        - --addr
        - tcp://0.0.0.0:1234
        - --oci-worker-no-process-sandbox
        - --debug
      securityContext:
        seccompProfile:
          type: Unconfined
        runAsUser: 1000
        runAsGroup: 1000
      volumeMounts:
        # The first mount is not needed, but makes it explicit that there
        # is a VOLUME here which shows up as a separate mount, which is why
        # buildkit is able to find the unprivileged mount flags it needs to
        # preserve.
        - mountPath: /home/user/.local/share/buildkit
          name: buildkitd-1
        # The second mount is needed, because otherwise there's no explicit
        # mount to inspect for mount options, and the underlying filesystem's
        # mount flags are obscured by the overlayfs used for the container's
        # rootfs.
        - mountPath: /home/user/.local/tmp
          name: buildkitd-2
      env:
        # This is required to align the temporary directory created by buildkit
        # with the volume mount for that directory.
        - name: XDG_RUNTIME_DIR
          value: /home/user/.local/tmp
    - name: runner
      image: moby/buildkit:rootless
      command: [ "/bin/sh", "-c", "--" ]
      args: [ "while true; do sleep 30; done;" ]
      env:
        - name: BUILDKIT_HOST
          value: tcp://localhost:1234
  volumes:
    - name: buildkitd-1
      emptyDir: {}
    - name: buildkitd-2
      emptyDir: {}
Exposing the tmp dir as a bind mount in the container is required; otherwise the directory lives in the container's root filesystem and its actual mount flags are obscured by overlayfs, so the check for unprivileged flags no longer works. To make this work we need both: 1) an update to the containerd mount library to preserve the nosuid and nodev flags, and 2) a pod spec update to bind mount the /tmp dir.
Let me know if this makes sense. I am also wondering if we no longer need #3697 since we are already checking for the flags downstream in containerd. I will test this out next.
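The point about overlayfs hiding the flags can be illustrated by asking which mount entry actually governs a given path. The sketch below does a longest-prefix match over /proc/mounts-style text. It is an illustration only: the real check may use a different mechanism (for example statfs), the helper names are hypothetical, and the sample is trimmed from the /proc/mounts output earlier in this thread (the overlay options are shortened).

```go
package main

import (
	"fmt"
	"strings"
)

// mount holds the /proc/mounts fields used in this sketch.
type mount struct {
	Target  string
	FSType  string
	Options string
}

// sample is trimmed from the /proc/mounts output posted in this thread.
// There is no entry for /home/user/.local/tmp, as in the original pod spec.
const sample = `overlay / overlay rw,relatime 0 0
/dev/nvme1n1p1 /home/user/.local/share/buildkit xfs rw,seclabel,nosuid,nodev,noatime 0 0
`

// parse reads /proc/mounts-style text and skips malformed lines.
func parse(text string) []mount {
	var out []mount
	for _, line := range strings.Split(text, "\n") {
		f := strings.Fields(line)
		if len(f) < 4 {
			continue
		}
		out = append(out, mount{Target: f[1], FSType: f[2], Options: f[3]})
	}
	return out
}

// governing returns the mount whose mount point is the longest path prefix
// of the absolute path p. Later entries win ties, since they are stacked on
// top of earlier ones.
func governing(mounts []mount, p string) (mount, bool) {
	var best mount
	found := false
	for _, m := range mounts {
		covers := m.Target == "/" || p == m.Target ||
			strings.HasPrefix(p, m.Target+"/")
		if covers && (!found || len(m.Target) >= len(best.Target)) {
			best, found = m, true
		}
	}
	return best, found
}

func main() {
	mounts := parse(sample)
	paths := []string{
		"/home/user/.local/share/buildkit/runc-overlayfs",
		"/home/user/.local/tmp/containerd-mount1074778686",
	}
	for _, p := range paths {
		if m, ok := governing(mounts, p); ok {
			fmt.Printf("%s\n  governed by %s (%s): %s\n", p, m.Target, m.FSType, m.Options)
		}
	}
}
```

Without a dedicated volume for the tmp directory, the temporary mount point resolves to the overlay rootfs entry, which carries neither nosuid nor nodev, so a flag check on that path has nothing to preserve. Adding the second emptyDir gives the path its own entry with the underlying filesystem's flags.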
from buildkit.