Comments (4)
The PR linked above adds an if statement to avoid the kernel launch. I have not made any changes to the GPU diagnostics yet.
from chapel.
A question that we should answer is whether GpuDiagnostics should register a loop like that as a kernel launch, or not. My initial reflex was that "Yes, it should."
My initial reflex would be the opposite. If we're not launching a kernel then why would we count it as a kernel launch?
I agree with Andy. I think if we squash the launch into (effectively) a no-op, it should not be counted, since counting it would suggest an overhead or cost that isn't actually being incurred and can't be optimized away any further.
I agree with Andy and Brad -- it should not count as a launch.
My point in the OP:
Imagine a hypothetical scenario where we may choose between GPU/CPU execution based on some loop characteristic either statically or dynamically potentially as an optimization. In such a scenario, I would want GpuDiagnostics to reflect the actual behavior.
That point has resonated more and more with me over time. My slightly more practical thought experiment for this case was: imagine we have an optimization where we don't fire kernels for loops that trip fewer than 10 times. What would you expect GpuDiagnostics to do in that case? My answer is very clear: it should not count that as a launch.
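To make the thought experiment concrete, here is a toy model in Python rather than Chapel. It is only a sketch of the counting semantics being argued for: `ToyGpuDiagnostics`, `run_loop`, and `MIN_TRIPS_FOR_GPU` are hypothetical names made up for illustration, not part of Chapel's GpuDiagnostics module or of the PR, and the threshold of 10 is just the number from the example above.

```python
from dataclasses import dataclass

# Hypothetical threshold, taken from the thought experiment (not a real Chapel setting).
MIN_TRIPS_FOR_GPU = 10


@dataclass
class ToyGpuDiagnostics:
    """Toy stand-in for a launch counter; not Chapel's GpuDiagnostics module."""
    kernel_launch: int = 0


def run_loop(trip_count, body, diags):
    """Run body(i) for each i in range(trip_count) and report where it ran."""
    if trip_count >= MIN_TRIPS_FOR_GPU:
        # A "kernel" really fires here, so this is the only place we count one.
        diags.kernel_launch += 1
        where = "gpu"
    else:
        # The launch is skipped, so the diagnostics record nothing.
        where = "cpu"
    for i in range(trip_count):
        body(i)
    return where


diags = ToyGpuDiagnostics()
out = []
placements = [run_loop(n, out.append, diags) for n in (0, 5, 10, 1000)]
print(placements, diags.kernel_launch)
```

The design choice is that the counter is bumped on the same branch that performs the launch, so the diagnostics cannot drift from the actual behavior, whichever condition ends up guarding the launch.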