Comments (7)
This issue is currently awaiting triage.
If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/sig node
/cc @haircommander
> However, those log messages are completely outside the control of the container itself, as it has no influence over their retention / lifecycle after they leave the container premises
I personally disagree with this, as it's the container that is doing the logging in the first place. I understand the lifecycle point about the kubelet being the entity that chooses when to clean up logs, but allowing a container to define its own limits functionally opens the node up for DoS attacks from a container spamming the node while defining its own arbitrarily high limit. The kubelet is responsible for protecting the other workloads on the node, and if a container is constrained by the admin-defined limits, it should log less, IMO.
> allowing a container to define its own limits functionally opens the node up for DoS attacks from a container spamming the node and defining its own arbitrarily high limit.
I'm not sure we're in sync regarding the context: I'm not asking/requesting a feature that lets the container define its own log retention; on the contrary, I think the current design (where kubelet / CRI handles the lifecycle of container-produced logs) is sufficient to cover that.
The kubelet already provides configuration options to cap the maximum amount of logs each container can consume on the worker node, so I don't see a way to attack the host by producing a large volume of logs.
By default, the configuration allows 50 MiB of (uncompressed) logs to be stored per container (https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#options):
--container-log-max-files int32 Default: 5
<Warning: Beta feature> Set the maximum number of container log files that can be present for a container. The number must be >= 2. (DEPRECATED: This parameter should be set via the config file specified by the kubelet's --config flag. See kubelet-config-file for more information.)
--container-log-max-size string Default: 10Mi
<Warning: Beta feature> Set the maximum size (e.g. 10Mi) of container log file before it is rotated. (DEPRECATED: This parameter should be set via the config file specified by the kubelet's --config flag. See kubelet-config-file for more information.)
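As the deprecation notices say, these settings belong in the kubelet configuration file nowadays. A minimal sketch of the equivalent KubeletConfiguration fragment, showing the defaults quoted above:

```yaml
# KubeletConfiguration fragment: the file-based equivalents of the
# deprecated --container-log-max-files / --container-log-max-size flags.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerLogMaxFiles: 5      # default: at most 5 log files per container
containerLogMaxSize: "10Mi"  # default: rotate each file at 10 MiB
# 5 files x 10Mi = up to ~50 MiB of (uncompressed) logs retained per container
```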
Needless to say, cluster admin(s) might choose to configure these values differently (allowing more or fewer logs to be stored on a worker node), so the maximum amount of logs allowed on each worker node is entirely up to the cluster admin to decide.
No, what I'm highlighting in this issue is kind of the opposite: if the container's ephemeral-storage limit is stricter than the amount of logs kubelet maintains on the worker node, the container can be evicted due to the way ephemeral-storage usage is calculated.
One might argue that containers should not produce that many logs to begin with, but that just can't be guaranteed over the lifetime of the container:
- Even if the container produces only a fraction of the allowed logs, over a long period of time those small amounts of logs do add up
- Some containers are just "chatty" by default (even on INFO loglevel)
- Other containers might need to run in debug/trace mode, where they can produce a huge volume of logs over a relatively short time
Examples:
- ingress-nginx PODs are chatty by design, and as a result they require frequent rotation (for that container, the 50 MiB default is only enough to hold a few hours' worth of logs):
master-2:~ # crictl ps |grep ingress
49692998d8d53 5aa0bf4798fa2 2 months ago Running controller 0 73af5d1f72713 ingress-controller-nginx-ingress-controller-q9zbb
master-2:~ # crictl inspect 49692998d8d53 |grep logPath
"logPath": "/var/log/pods/ingress-nginx_ingress-controller-nginx-ingress-controller-q9zbb_9d51e25e-22f2-441b-82d2-e4dc6256a70b/controller/0.log",
master-2:~ # ls -l /var/log/pods/ingress-nginx_ingress-controller-nginx-ingress-controller-q9zbb_9d51e25e-22f2-441b-82d2-e4dc6256a70b/controller/
total 18840
-rw-r----- 1 root root 4710921 Apr 25 22:32 0.log
-rw-r--r-- 1 root root 1373597 Apr 25 20:50 0.log.20240425-202651.gz
-rw-r--r-- 1 root root 1372657 Apr 25 21:28 0.log.20240425-205044.gz
-rw-r--r-- 1 root root 1252749 Apr 25 22:12 0.log.20240425-212809.gz
-rw-r----- 1 root root 10560742 Apr 25 22:12 0.log.20240425-221224
- CoreDNS can also produce plenty of logs if log mode is enabled (https://coredns.io/plugins/log/)
// NOTE: The default ingress-nginx / coredns deployments don't come with any ephemeral-storage limit definition, which is why they're not impacted by this issue (see the "if podStats.EphemeralStorage != nil" condition in podEphemeralStorageLimitEviction())
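To make the accounting disconnect concrete, here is a simplified, hypothetical sketch of the check described above. The real logic lives in kubelet's eviction manager (podEphemeralStorageLimitEviction); the type and function names below are illustrative, not the actual implementation:

```go
package main

import "fmt"

// podStats is an illustrative stand-in for the usage figures kubelet
// aggregates per pod when enforcing ephemeral-storage limits.
type podStats struct {
	writableLayerBytes int64 // container writable layers
	logBytes           int64 // current + rotated files under /var/log/pods/...
	localVolumeBytes   int64 // local ephemeral volumes (emptyDir, etc.)
}

// exceedsEphemeralLimit captures the point of the issue: log files that
// kubelet itself retains on the node are charged against the pod's
// ephemeral-storage limit, even though the pod cannot control retention.
func exceedsEphemeralLimit(s podStats, limitBytes int64) bool {
	used := s.writableLayerBytes + s.logBytes + s.localVolumeBytes
	return used > limitBytes
}

func main() {
	// 50 MiB of retained logs (the kubelet default: 5 files x 10Mi) plus a
	// tiny 1 MiB writable layer exceeds a 40 MiB ephemeral-storage limit,
	// so the pod gets evicted despite writing almost nothing itself.
	s := podStats{writableLayerBytes: 1 << 20, logBytes: 50 << 20}
	fmt.Println(exceedsEphemeralLimit(s, 40<<20)) // prints: true
}
```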
Unfortunately the documentation doesn't do a good job of explaining how 'ephemeral-storage' limits are used either.
My original interpretation was that it limits the maximum ephemeral storage a container can use from the ephemeral volume (say, a 500 MiB limit on a 2 GiB volume), but that turned out to be incorrect.
Instead, ephemeral-storage usage is calculated from factors the container has no way of knowing about, so a suitable limit can't even be generalized in the manifest.
Because of this, it's probably safer to follow the example of ingress-nginx / coredns and not set any ephemeral-storage limit, to avoid "mysterious" POD evictions; but doesn't that make the feature pointless?
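For illustration, a hypothetical manifest fragment that triggers exactly this failure mode (the name and image are made up): the ephemeral-storage limit sits below the node's default 50 MiB per-container log retention, so accumulated logs alone can get the pod evicted.

```yaml
# Hypothetical pod: a 30Mi ephemeral-storage limit is below the kubelet
# default log retention (5 files x 10Mi = ~50 MiB per container), so the
# pod can be evicted for storage it never directly wrote.
apiVersion: v1
kind: Pod
metadata:
  name: chatty-app                    # illustrative name
spec:
  containers:
  - name: app
    image: example.com/chatty:latest  # illustrative image
    resources:
      limits:
        ephemeral-storage: "30Mi"
```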
Marking this as a feature since the cluster-admin will have to account for storage used by container logs separately while planning.
/kind feature
/remove-kind bug
> Marking this as a feature since the cluster-admin will have to account for storage used by container logs separately while planning.
I would argue that this is already the case: when cluster admins design / dimension a cluster node, they already have to account for the number of PODs the node will have to support and the maximum amount of logs those PODs (and their containers) could generate, and configure --container-log-max-files and --container-log-max-size accordingly, depending on the requirements.
Again, the issue I'm trying to highlight here is the disconnect between the worker node [1] and the container [2] configuration.
[1] kubelet - controls the amount of logs a container could have on a given worker node
[2] manifest - defines the container requirements according to 'Container v1 core', including the resource requests and limits
These two sets of settings are completely disconnected from each other, yet the POD eviction logic connects them indirectly (by counting logs that kubelet maintains on the worker node against the container's limit).
Because of this I don't see the feature flag as justifiable (it feels more like a design bug).