blsched's Issues

RT and deadline tasks are running on a little CPU after the minimum time staying on the big CPU

RT and deadline tasks also get their se->avg data structure initialized when they are created:

$ sudo chrt -f 20 sysbench --test=cpu --num-threads=1 --max-requests=1000000 run

$ cat /proc/18025/task/18025/sched
sysbench (18025, #threads: 2)

...
se.avg.load_avg : 0
se.avg.util_avg : 0
se.avg.last_update_time : 32666222637295
policy : 1
prio : 79

but their 'se.avg.load_avg' value drops to 0 pretty quickly, and their 'se.avg.last_update_time' is not updated as it is for CFS tasks, since the rt/deadline scheduling classes do not use PELT.
So bLsched puts them on a little CPU after 3 seconds.
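
One hedged way around this would be for bLsched to skip threads whose scheduling policy is not a CFS one, so that it never classifies them by PELT data at all. A minimal sketch, assuming the daemon already knows the thread id; task_uses_pelt is an illustrative helper, not existing bLsched code:

#define _GNU_SOURCE
#include <sched.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical helper: returns true if the thread runs under a CFS
 * policy, i.e. its PELT data (se.avg.*) is actually maintained. RT and
 * deadline threads are skipped so that their stale load_avg never gets
 * them demoted to a little CPU. */
static bool task_uses_pelt(pid_t tid)
{
        int policy = sched_getscheduler(tid);

        if (policy < 0)
                return false; /* thread exited, or no permission */

        return policy == SCHED_OTHER || policy == SCHED_BATCH ||
               policy == SCHED_IDLE;
}

Alternatively, the 'policy' field already present in /proc/<pid>/task/<tid>/sched could be parsed instead (1 in the output above, i.e. SCHED_FIFO).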

n big cpus - # big tasks > n

I see a potential issue with 'n big cpus - # big tasks > n' workloads.

When I run 5 big tasks on a system with 4 big CPUs, all big tasks run affined to the big CPUs.

sysbench --test=cpu --num-threads=5 --max-time=1000000 run

bLsched -b 0 -b 1 -b 2 -b 3 -l -vv

13058: sysbench added
13059: sysbench added
13060: sysbench added
13061: sysbench added
13062: sysbench added
13059 sysbench: load_avg 1015
13059 sysbench: move to big
13062 sysbench: load_avg 1005
13062 sysbench: move to big
13060 sysbench: load_avg 1018
13060 sysbench: move to big
13058 sysbench: load_avg 1012
13058 sysbench: move to big
13061 sysbench: load_avg 1013
13061 sysbench: move to big
13059 sysbench: load_avg 1005
13059 sysbench: in big, left 3000ms
13062 sysbench: load_avg 1003
13062 sysbench: in big, left 3000ms
13060 sysbench: load_avg 1008
13060 sysbench: in big, left 3000ms
13058 sysbench: load_avg 1002
13058 sysbench: in big, left 3000ms
13061 sysbench: load_avg 1004
13061 sysbench: in big, left 3000ms
13059 sysbench: removed
13061 sysbench: removed
13060 sysbench: removed
13062 sysbench: removed
13058 sysbench: removed

This causes preemption latency among the big tasks that you wouldn't normally see, since the kernel scheduler would otherwise have the little CPUs available for at least one of the big tasks.
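
One possible mitigation, sketched below under the assumption that bLsched keeps (or could keep) a count of the tasks it has affined to the big cluster, is to cap that count at the number of big CPUs and leave any further big task with its default affinity; move_to_big, nr_big_cpus and nr_tasks_on_big are illustrative names, not existing bLsched symbols:

#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>

/* Mitigation sketch: pin at most nr_big_cpus tasks to the big cluster;
 * any extra big task keeps its default affinity, so the kernel can
 * still run it on a little CPU. Returns 1 if pinned, 0 if skipped,
 * -1 on error. */
static int move_to_big(pid_t tid, const cpu_set_t *big_mask,
                       int nr_big_cpus, int *nr_tasks_on_big)
{
        if (*nr_tasks_on_big >= nr_big_cpus)
                return 0;       /* big side full: don't pin this task */

        if (sched_setaffinity(tid, sizeof(cpu_set_t), big_mask) < 0)
                return -1;

        (*nr_tasks_on_big)++;
        return 1;
}

With the 5-task/4-CPU example above, this would pin four sysbench threads and leave the fifth unconstrained, which is closer to what the kernel scheduler would do on its own.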

Put floating point heavy tasks on the big CPUs.

The big CPU runs floating point code 2-3 times faster than the LITTLE CPU, so it would be nice to identify floating point heavy tasks and bind them to the big CPU.
I tried this by enabling trapping of FP-from-user-space exceptions, but it didn't work because all user space processes seem to use FP (the strXXX functions use NEON).
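
A hedged alternative to the trap-on-any-use experiment would be to count SIMD/FP instructions per thread with a PMU counter and apply a rate threshold, so that the occasional NEON use by string routines stays below it. The sketch below assumes an ARMv8 PMU where raw event 0x74 (ASE_SPEC, "Advanced SIMD instruction speculatively executed") is available; the event number is platform specific and an assumption here:

#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a per-thread counter for SIMD instructions; read(2) on the
 * returned fd yields the u64 count, which can then be compared against
 * a rate threshold. */
static int open_simd_counter(pid_t tid)
{
        struct perf_event_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.size = sizeof(attr);
        attr.type = PERF_TYPE_RAW;
        attr.config = 0x74;             /* ASE_SPEC on ARMv8, assumed */
        attr.exclude_kernel = 1;

        /* per-thread counter, on any CPU */
        return syscall(SYS_perf_event_open, &attr, tid, -1, -1, 0);
}

Whether a rate threshold actually separates FP-heavy workloads from tasks that merely touch NEON through libc would need experiments on the target hardware.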

Usage of load instead of utilization can lead to saturation on big CPUs?

I've noticed that you use load (se.avg.load_avg) instead of utilization (se.avg.util_avg) to define the task size. As you correctly point out here:

if (info->load_avg > 1024) /* may happen for high priority tasks */

load is scaled by the task priority, thus resulting in values which can (quite easily) be bigger than the CPU capacity (i.e. 1024) and thus subject to the above capping; for example, a nice -10 task has a load weight of 9548, so it can exceed 1024 while being runnable only ~15% of the time. Moreover, from a theoretical standpoint, load is not in the same scale/unit as capacity.

Another couple of big differences between load and utilization are that load tracks RUNNABLE time instead of RUNNING time, and that it is not "CPU scaled". Both have the side effect of making a relatively small task look bigger when it just happens to be co-scheduled with other tasks (and is thus subject to some wakeup/run latencies) or is running on a LITTLE CPU.

While this can have the interesting side effect of up-migrating some tasks more easily, which is maybe not a big issue for the specific deployment you are targeting, it is likely to affect the energy efficiency of the final solution. The main risks are:

  • small tasks being up-migrated when their actual bandwidth demand does not require it
  • more inertia in down-migrating tasks, especially when the big side happens to be almost saturated

Considering these two effects, you can end up with scenarios where a bunch of tasks are up-migrated just because they appear to be big (from a load standpoint) and then keep staying on the big side just because they are still co-scheduled there.
For example, if you run an app which has the same number of tasks as there are LITTLE+big CPUs, you can probably end up pinning all these tasks on the big side while the LITTLEs sit idle, with consequences both for performance (since the LITTLE cores are not used) and power (since you run longer on a highly clocked big side).

All that considered, I was wondering if there are any specific reasons why this tool chooses to use load instead of utilization, and whether you ever considered/tried the other way.

Perhaps it could be interesting to support both metrics, maybe with some kind of global start-up parameter or per-app configuration setting that allows choosing one metric over the other. In that case I would say it could be interesting to experiment with using load for "performance sensitive" tasks (e.g. those directly impacting the user experience) while falling back to (or using by default) utilization for all the other, mainly "energy sensitive" tasks (e.g. background tasks).
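
As a rough illustration of what supporting both metrics could look like, here is a sketch that parses both se.avg.load_avg and se.avg.util_avg from /proc/<pid>/task/<tid>/sched and picks one based on a hypothetical global flag; use_util_avg and task_size are illustrative names, not existing bLsched code:

#include <stdio.h>
#include <sys/types.h>

static int use_util_avg;        /* assumed switch: 0 = load, 1 = util */

/* Parse both metrics from the kernel's sched debug output (field names
 * as shown in the issue above) and return the selected one; error
 * handling is minimal. */
static long task_size(pid_t pid, pid_t tid)
{
        char path[64], line[256];
        long load = -1, util = -1;
        FILE *f;

        snprintf(path, sizeof(path), "/proc/%d/task/%d/sched",
                 (int)pid, (int)tid);
        f = fopen(path, "r");
        if (!f)
                return -1;

        while (fgets(line, sizeof(line), f)) {
                sscanf(line, " se.avg.load_avg : %ld", &load);
                sscanf(line, " se.avg.util_avg : %ld", &util);
        }
        fclose(f);

        return use_util_avg ? util : load;
}

A per-app setting could then override the global default for the performance-sensitive tasks mentioned above.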
