Comments (2)
Hi @FelipeGXavier, also, if you set the CPU limit "cpu": 262 at the Docker (container) level, the Amazon ECS agent sets "CpuShares": 262 instead of "CpuShares": 2 (based on docker inspect), which looks ridiculous at first glance. According to the Docker documentation, there is no hard limit.
I don't understand how CPU limiting works at the ECS level. Maybe we need to try, stay on, or switch to cgroup v1 like other people do? Can someone explain it?
Environment Details
$ mount | grep group
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,seclabel,nsdelegate,memory_recursiveprot)
OS: Amazon Linux 2023.3.20240304
Amazon ECS Agent: v1.82.0 (bc3cb997)
Docker: Version: 20.10.25, API version: 1.41 (client and server)
from amazon-ecs-agent.
Hello @FelipeGXavier, @3guboff
Thanks for reaching out to ECS. Please find my response as follows.
- The "Observed Behavior" on this post is expected for both cgroup v1 and v2.
- cgroup v2 is used by default on Fedora (since 31), Debian GNU/Linux (since 11), Ubuntu (since 21.10), Amazon Linux 2023, and the ECS-optimized AL2023 AMI.
- For cgroup v1, ECS Agent creates a root cgroup /ecs at start-up. When an ECS task is placed on the container instance, ECS Agent creates a sub-cgroup under /ecs named with $task_id to apply task-level resource limits.
- When creating a Docker container, the /ecs/$task_id cgroup is passed to Docker as the cgroup parent. Docker further creates one sub-cgroup for each container, named with $container_id, under the parent cgroup /ecs/$task_id, and applies container-level resource limits. Because limits placed on a cgroup at a higher level in the hierarchy cannot be exceeded by descendant cgroups, each container is covered by the hard limit set on the parent cgroup /ecs/$task_id.
- To review resource limits, users can go to the cgroup parent of the container: /sys/fs/cgroup/cpu/ecs/$task_id for cgroup v1, and /sys/fs/cgroup/ecstasks.slice/ecstasks-$task_id.slice for cgroup v2.
Example:
$ docker inspect <container_id>
[
    {
        "CpuShares": 2,
        "Memory": 0,
        "NanoCpus": 0,
        "CgroupParent": "/ecs/bf1ddbfbbc8f45c9b1ac4f779368741e",
        "CpuPeriod": 0,
        "CpuQuota": 0,
        "CpuRealtimePeriod": 0,
        "CpuRealtimeRuntime": 0,
        "CpusetCpus": "",
        "CpusetMems": "",
        "Ulimits": [
            {
                "Name": "nofile",
                "Hard": 65536,
                "Soft": 32768
            }
        ],
        "CpuCount": 0,
        "CpuPercent": 0,
        ...
    }
]
[root@ip-xxx cpu]# pwd
/sys/fs/cgroup/cpu
[root@ip-xxx cpu]# ls
cgroup.clone_children cpu.cfs_period_us cpu.rt_runtime_us cpuacct.stat cpuacct.usage_percpu cpuacct.usage_sys ecs system.slice
cgroup.procs cpu.cfs_quota_us cpu.shares cpuacct.usage cpuacct.usage_percpu_sys cpuacct.usage_user notify_on_release tasks
cgroup.sane_behavior cpu.rt_period_us cpu.stat cpuacct.usage_all cpuacct.usage_percpu_user docker release_agent user.slice
[root@ip-xxx ecs]# pwd
/sys/fs/cgroup/cpu/ecs
[root@ip-xxx ecs]# ls
${task_id} cpu.cfs_period_us cpu.rt_runtime_us cpuacct.stat cpuacct.usage_percpu cpuacct.usage_sys tasks
cgroup.clone_children cpu.cfs_quota_us cpu.shares cpuacct.usage cpuacct.usage_percpu_sys cpuacct.usage_user
cgroup.procs cpu.rt_period_us cpu.stat cpuacct.usage_all cpuacct.usage_percpu_user notify_on_release
[root@ip-xxx ${task_id}]# pwd
/sys/fs/cgroup/cpu/ecs/${task_id}
[root@ip-xxx ${task_id}]# ls
${container_id} cpu.cfs_period_us cpu.rt_runtime_us cpuacct.stat cpuacct.usage_percpu cpuacct.usage_sys tasks
cgroup.clone_children cpu.cfs_quota_us cpu.shares cpuacct.usage cpuacct.usage_percpu_sys cpuacct.usage_user
cgroup.procs cpu.rt_period_us cpu.stat cpuacct.usage_all cpuacct.usage_percpu_user notify_on_release
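The directory walk above can also be scripted. The following is a minimal sketch, not part of ECS Agent: the helper names `task_cgroup_dir` and `read_cpu_limits` are hypothetical, the two paths are the ones quoted in the reply, and it assumes cgroup v2 can be detected by the presence of `cgroup.controllers` at the cgroup root. The v1 file names come from the `ls` output above; `cpu.max` and `cpu.weight` are the standard cgroup v2 CPU control files.

```python
from pathlib import Path

CGROUP_ROOT = Path("/sys/fs/cgroup")


def task_cgroup_dir(task_id: str, root: Path = CGROUP_ROOT) -> Path:
    """Return the task-level cgroup directory for the given ECS task ID."""
    # Assumption: the unified (v2) hierarchy exposes cgroup.controllers at its root.
    if (root / "cgroup.controllers").exists():
        return root / "ecstasks.slice" / f"ecstasks-{task_id}.slice"
    # Otherwise fall back to the cgroup v1 cpu controller path.
    return root / "cpu" / "ecs" / task_id


def read_cpu_limits(cgroup_dir: Path) -> dict:
    """Read whichever CPU control files exist in the given cgroup directory."""
    names = (
        "cpu.max", "cpu.weight",                              # cgroup v2
        "cpu.cfs_quota_us", "cpu.cfs_period_us", "cpu.shares",  # cgroup v1
    )
    limits = {}
    for name in names:
        control_file = cgroup_dir / name
        if control_file.is_file():
            limits[name] = control_file.read_text().strip()
    return limits


# Usage on a container instance (replace the placeholder with a real task ID):
#   task_dir = task_cgroup_dir("<task_id>")
#   print(task_dir, read_cpu_limits(task_dir))
```

Passing `root` explicitly makes it possible to try the helpers against a copied or mocked directory tree instead of the live `/sys/fs/cgroup`.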
- The CPU value set at the container level is the number of CPU units that ECS Agent reserves for the container. On Linux, this parameter maps to CpuShares. In addition, for ECS Agent versions >= 1.2.0, null, zero, and CPU values of one are passed to Docker as 2 CPU shares.
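The mapping described in this bullet can be written out as a small sketch. This is an illustration of the stated rule only, not ECS Agent's actual code: `container_cpu_shares` and `contended_fraction` are hypothetical names, and the second helper assumes the usual behavior of CPU shares as relative weights that only matter when sibling containers compete for CPU.

```python
from typing import Optional, Sequence

MIN_CPU_SHARES = 2  # value passed to Docker for null, 0, and 1 (agent >= 1.2.0)


def container_cpu_shares(cpu: Optional[int]) -> int:
    """Map a container-level `cpu` value to the CpuShares passed to Docker."""
    if cpu is None or cpu in (0, 1):
        return MIN_CPU_SHARES
    return cpu


def contended_fraction(shares: int, sibling_shares: Sequence[int]) -> float:
    """Approximate CPU fraction a container gets when all siblings are CPU-bound."""
    # Shares are weights, not a cap: with idle siblings a container can use more.
    return shares / (shares + sum(sibling_shares))


print(container_cpu_shares(None))  # 2
print(container_cpu_shares(262))   # 262
```

This is consistent with the two `docker inspect` results in this thread: "cpu": 262 shows up as "CpuShares": 262, while an unset container CPU shows up as "CpuShares": 2. Neither value is a hard limit by itself; the hard limit comes from the task-level parent cgroup.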
Reference: