Comments (14)

randomascii commented on July 17, 2024 2

I suspect that the VS team could easily fix this issue by using a global semaphore. All they would have to do is have the first compiler that starts create a global semaphore with a count of n-processors. Then, the compilers get invoked as normal but before they start working they have to acquire a semaphore, and when they stop working they release it. Thus, while there could still end up being hundreds of compiler processes in existence, at most num-procs of them would be doing any work.

This would be a simple and reliable fix for excessive compilation steps. The linker could also use the same semaphore which would then cover ~90% of the problem.
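
As a rough illustration of the acquire-before-work, release-when-done idea, here is a minimal sketch in Python. It is only a simulation: threads stand in for compiler processes, and `run_jobs` and `compiler_job` are made-up names. A real fix would need a named, cross-process semaphore (on Windows, something like a named semaphore from `CreateSemaphore`), which is an assumption on my part and not something the compilers do today.

```python
import threading
import time


def run_jobs(num_jobs, max_active):
    """Start num_jobs workers at once, but let at most max_active work at a time.

    Returns the highest number of workers that were ever working simultaneously.
    """
    gate = threading.BoundedSemaphore(max_active)  # stands in for the global semaphore
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def compiler_job():
        with gate:  # acquire before working, release when finished
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.01)  # placeholder for real compile work
            with lock:
                state["active"] -= 1

    workers = [threading.Thread(target=compiler_job) for _ in range(num_jobs)]
    for worker in workers:
        worker.start()  # all workers exist at once, like the hundreds of cl.exe processes
    for worker in workers:
        worker.join()
    return state["peak"]


if __name__ == "__main__":
    print("peak concurrent workers:", run_jobs(num_jobs=64, max_active=4))
```

All 64 workers exist at the same time, but the peak number doing work never exceeds the semaphore count, which is the property described above.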

But, don't hold your breath.

from msbuild.

randomascii commented on July 17, 2024 1

The lack of global scheduling is particularly bad on the machines used by many Chromium developers. With 40-48 threads (20-24 cores in two sockets) it would be easy for msbuild to try doing 1,600+ way parallelism.

If you need global scheduling now then I recommend looking at the ninja build system. Chromium uses this and it creates almost perfect parallelism - for ~95% of the length of a full Chromium build there are exactly the requested number of parallel processes. The parallelism only drops below the requested level when there are unavoidable dependencies or during linking, when parallelism is intentionally reduced in order to avoid excessive memory pressure. Open source. Recommended.

powercode commented on July 17, 2024 1

Isn't it time to schedule this task?

Trass3r commented on July 17, 2024 1

https://devblogs.microsoft.com/cppblog/improved-parallelism-in-msbuild/

AndyGerlicher commented on July 17, 2024

Team triage: More effective scheduling is on our long-term roadmap, but not something we're looking at right now.

ariccio commented on July 17, 2024

Since you're linking to his post (see also: https://randomascii.wordpress.com/2014/03/31/you-got-your-web-browser-in-my-compiler/), you might as well also @randomascii.

Trass3r commented on July 17, 2024

@randomascii yeah, but how do you leverage ninja if you only have a VS solution (not a generated one)?

randomascii commented on July 17, 2024

You have to create .ninja files if you want to use ninja. You can use GYP, gn, CMake, and probably other tools - GYP is handy because it can also generate .vcxproj files. It's probably possible to write a simple tool that translates a .vcxproj file to a .ninja file, within limits, but I'm not aware of anyone actually doing that.

So yeah, it's a hassle. Which is why it would be nice if msbuild actually worked sanely. Doing so would significantly reduce the impact of bugs like this one, which is exacerbated by having O(n^2) compiles in parallel:
https://randomascii.wordpress.com/2014/03/31/you-got-your-web-browser-in-my-compiler/

danmoseley commented on July 17, 2024

This has been known for a long time and we talked to the CL team way back about it. I recently hit it in the CoreCLR repo, where many tens of CL processes can be spawned at once, which is grossly inefficient. The only option right now is to manually adjust the MSBuild parallelism and/or the CL parallelism (in the .vcxproj). Aside from being time consuming, that requires making a tradeoff between full stack build (more MSBuild parallelism) vs. dev cycle (more CL parallelism), and the numbers also need to be parameterized by the number of CPUs on the particular box.

It is certainly worth improving. It seems to me there are two approaches:

  1. some kind of communication between tools -- task can say to the engine, "OK, I can parallelize this job N ways, let me know when/how many are good to spawn" -- this would not help cases where the task is Exec or the tool does not expose a parallelization flag as CL does. It falls into the old trap of assuming every tool has a nice task around it and real builds are more diverse than that and do not center around MSBuild's needs. It is probably the right approach if only CL and one or two other tasks we own are important. For typical VS users, CL may be the most problematic tool and this may be the way to go.
  2. heuristic based on performance counters. MSBuild would govern its own parallelism based on how busy the disk and CPU seems to be at any point in time and perhaps offer facility for enlightened tasks to make similar decisions. That would have some benefit for everybody. Note that 1 thread per CPU is not always the best choice - linking tends to be disk bound and fewer threads can be faster then. This is akin to dynamic choices that threadpools tend to make, although they tend to only be concerned with CPU. If I was writing an arbitrary build tool, I would make it work this way. I have no idea how well several independent heuristics would work together in practice though - one could imagine "beating" behavior as they take more resources then back off in response.

It should not be hard to at least make some improvement over the current pathological behavior but it is necessary to choose the approach.
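
To make approach 1 a bit more concrete, here is a minimal sketch in Python of the "task asks, engine grants" conversation. `CoreBroker`, `request` and `release` are hypothetical names used only for illustration; this is not an existing MSBuild API, and a real version would have to work across processes.

```python
class CoreBroker:
    """Hypothetical engine-side ledger of execution slots shared by all tasks."""

    def __init__(self, total_slots):
        self.total_slots = total_slots
        self.in_use = 0

    def request(self, wanted):
        """Grant as many of the wanted slots as are currently free (possibly zero)."""
        free = self.total_slots - self.in_use
        granted = max(0, min(wanted, free))
        self.in_use += granted
        return granted

    def release(self, count):
        """Return slots to the engine once the task's sub-processes have finished."""
        if count < 0 or count > self.in_use:
            raise ValueError("releasing more slots than are in use")
        self.in_use -= count


if __name__ == "__main__":
    broker = CoreBroker(total_slots=8)
    first = broker.request(6)   # a CL-like task that could go 6 ways
    second = broker.request(6)  # a second task gets only what is left
    print(first, second, broker.in_use)
```

A task that is granted fewer slots than it asked for would simply spawn fewer sub-processes. As noted above, this only helps tools that are wrapped in a cooperating task.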

danmoseley commented on July 17, 2024

One relatively simple, localized change that could be made would be a change in the MSBuild scheduler itself to govern its parallelism using the system CPU load. Basically it would begin to treat the number of CPU parameter as merely a hint to be increased or decreased as new work comes in to schedule during the build. CL would continue to do its own thing but the system would no longer descend into such serious thrashing. It seems to me this change would be localized and not break any external invariants - at least reducing parallelism wouldn't.
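
A minimal sketch of what treating the CPU count as a hint could look like, in Python. The function name, the thresholds and the `ceiling` parameter are all assumptions picked for illustration, and the CPU load is passed in by the caller so that no particular performance-counter API is implied.

```python
def next_parallelism(current, cpu_load, ceiling, low=0.70, high=0.95):
    """Return the node count to use for the next scheduling decision.

    current  -- nodes the scheduler is using now (starts at the CPU-count hint)
    cpu_load -- system CPU utilization sample in the range 0.0 to 1.0
    ceiling  -- hard upper bound on nodes, so growth cannot run away
    """
    if cpu_load > high:
        return max(1, current - 1)        # overloaded: back off, never below one node
    if cpu_load < low:
        return min(ceiling, current + 1)  # idle capacity: allow one more node
    return current                        # inside the dead band: leave it alone


if __name__ == "__main__":
    nodes = 8
    for sample in (0.99, 0.99, 0.80, 0.40):
        nodes = next_parallelism(nodes, sample, ceiling=16)
        print(sample, nodes)
```

The dead band between `low` and `high` is there to damp oscillation; whether it would be enough when several tools apply their own heuristics at once is an open question.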

randomascii commented on July 17, 2024

Unfortunately the MSBuild scheduler is not in control. It spawns (say) num-cores compilers, and all is well. Unless, that is, those num-cores compilers each have a few hundred source files to work on - not an uncommon situation. In which case those num-cores instances of cl.exe will each spawn num-cores copies of themselves, and a quadratic number of processes is achieved.
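
The arithmetic behind "quadratic" is simple enough to spell out. The sketch below is in Python; `worst_case_compilers` is a made-up helper, and it assumes the worst case where every MSBuild node's cl.exe fans out to a full num-cores set of children.

```python
def worst_case_compilers(msbuild_nodes, cl_per_node):
    """Total compiler processes when every MSBuild node's cl.exe fans out fully."""
    return msbuild_nodes * cl_per_node


if __name__ == "__main__":
    for hw_threads in (40, 48):
        total = worst_case_compilers(hw_threads, hw_threads)
        # total // hw_threads is how many runnable processes compete per hardware thread
        print(hw_threads, total, total // hw_threads)
```

For the 40-48 thread machines mentioned earlier in this thread that works out to 1,600 and 2,304 processes, i.e. 40 to 48 runnable compilers per hardware thread.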

At the time that MSBuild spawns those compilers it lacks the foresight to realize how many sub-processes they will spawn, and it lacks any evidence of CPU overloading until it is too late.

Then again, while this means that MSBuild cannot do a perfect job, it still might be able to improve the situation in some cases. However, what it really needs to look at is the number of ready threads - to see how overloaded the CPUs are - and I'm not sure that that information is available to non-administrator processes.

Anyway, it seems like a kludgy fix, and it would be better to do a proper fix. Ninja does this quite elegantly, although it does have a few hacks to restrict the number of links more heavily than compiles, due to the greater memory requirements. Ninja can run my Chromium build at 100% CPU for 40 minutes with no over-committing and only the tiniest of breaks.
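
For anyone curious what restricting links more heavily than compiles can look like, here is a minimal sketch: a small Python function that writes out a toy build.ninja using ninja's `pool` feature. The file names, the `link_pool` name, the depth, and the bare `cl`/`link` command lines are illustrative assumptions, not Chromium's actual configuration.

```python
def make_ninja(sources, link_depth=2, target="app.exe"):
    """Return the text of a toy build.ninja that caps concurrent links with a pool."""
    lines = [
        "pool link_pool",
        f"  depth = {link_depth}",  # at most this many link steps run at once
        "",
        "rule cc",
        "  command = cl /nologo /c $in /Fo$out",
        "",
        "rule link",
        "  command = link /nologo /OUT:$out $in",
        "  pool = link_pool",       # compiles stay in the default, unrestricted pool
        "",
    ]
    objects = []
    for src in sources:
        obj = src.rsplit(".", 1)[0] + ".obj"  # note: paths with spaces would need escaping
        objects.append(obj)
        lines.append(f"build {obj}: cc {src}")
    lines.append(f"build {target}: link {' '.join(objects)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_ninja(["main.cpp", "util.cpp"], link_depth=1))
```

Compile steps are limited only by ninja's overall job count, while link steps additionally have to fit inside the pool's depth.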

Trass3r commented on July 17, 2024

A related issue is that CustomBuild steps, as used by Qt's VS add-in to call its moc "preprocessor", can't be parallelized at all. This means that part takes longer than the actual compilation.

danmoseley commented on July 17, 2024

@randomascii right, in the cases that it's CPU contention. Linking often is disk bound.

danmoseley commented on July 17, 2024

By the way, the change could be wholly within MSBuild, as it's MSBuild that owns the parallelism in VS builds. I don't speak for the MSBuild team, but possibly you could offer a PR that optionally enables such a semaphore, and patch your copy of VS and try it out.
