Comments (2)
@llvm/issue-subscribers-mlir-sparse
Author: Giorgis Georgakoudis (ggeorgakoudis)
Then use the dumped pipeline directly in mlir-opt:
mlir-opt -pass-pipeline="builtin.module(func.func(linalg-generalize-named-ops),func.func(linalg-fuse-elementwise-ops),sparsification-and-bufferization,sparse-storage-specifier-to-llvm,func.func(canonicalize{ max-iterations=10 max-num-rewrites=-1 region-simplify=true test-convergence=false top-down=true}),func.func(finalizing-bufferize),sparse-gpu-codegen{enable-runtime-library=true num-threads=1024},gpu.module(strip-debuginfo),gpu.module(convert-scf-to-cf),gpu.module(convert-gpu-to-nvvm{has-redux=false index-bitwidth=0 use-bare-ptr-memref-call-conv=false}),func.func(convert-linalg-to-loops),func.func(convert-vector-to-scf{full-unroll=false lower-tensors=false target-rank=1}),func.func(expand-realloc{emit-deallocs=true}),func.func(convert-scf-to-cf),expand-strided-metadata,lower-affine,convert-vector-to-llvm{enable-amx=false enable-arm-neon=false enable-arm-sve=false enable-x86vector=false force-32bit-vector-indices=true reassociate-fp-reductions=false},finalize-memref-to-llvm{index-bitwidth=0 use-aligned-alloc=false use-generic-functions=false},func.func(convert-complex-to-standard),func.func(arith-expand{include-bf16=false}),func.func(convert-math-to-llvm{approximate-log1p=true}),convert-math-to-libm,convert-complex-to-libm,convert-vector-to-llvm{enable-amx=false enable-arm-neon=false enable-arm-sve=false enable-x86vector=false force-32bit-vector-indices=true reassociate-fp-reductions=false},convert-complex-to-llvm,convert-vector-to-llvm{enable-amx=false enable-arm-neon=false enable-arm-sve=false enable-x86vector=false force-32bit-vector-indices=true reassociate-fp-reductions=false},convert-func-to-llvm{index-bitwidth=0 use-bare-ptr-memref-call-conv=false},nvvm-attach-target{O=2 chip=sm_80 fast=false features=+ptx71 ftz=false module= triple=nvptx64-nvidia-cuda},gpu-to-llvm{gpu-binary-annotation=gpu.binary use-bare-pointers-for-host=false use-bare-pointers-for-kernels=false},gpu-module-to-binary{format=llvm opts= toolkit=},reconcile-unrealized-casts)"
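For context, a pipeline string like the one above is typically obtained by running the sparsifier pipeline with mlir-opt's --dump-pass-pipeline flag. A minimal sketch of that step (the sparsifier option names here are assumptions based on the nvvm-attach-target settings visible in the dump, not copied from the report):

```shell
# Hypothetical reconstruction of how the pipeline was dumped;
# input.mlir and the sparsifier option values are placeholders.
mlir-opt input.mlir \
  --sparsifier="enable-runtime-library=true gpu-triple=nvptx64-nvidia-cuda gpu-chip=sm_80 gpu-features=+ptx71" \
  --dump-pass-pipeline
```

The reported problem is that pasting the dumped string back via -pass-pipeline does not reproduce the same result as running the registered sparsifier pipeline directly.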
Reproducer: https://godbolt.org/z/1zz64j895
Expected to see the same GPU codegen as when calling the sparsifier directly, but the output does not contain GPU code.
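For reference, when the GPU path does take effect, the lowered output is expected to contain a gpu.binary op where the gpu.module was serialized by gpu-module-to-binary. A rough illustration of that shape (the symbol name and object payload below are placeholders, not taken from the reproducer):

```mlir
// Illustrative only: after gpu-module-to-binary{format=llvm}, the
// gpu.module is replaced by a gpu.binary holding a #gpu.object
// for the #nvvm.target attached by nvvm-attach-target.
module {
  gpu.binary @sparse_kernels [#gpu.object<#nvvm.target<chip = "sm_80">, "...">]
}
```

The reported output contains no such op, i.e. the sparse-gpu-codegen step appears not to have outlined any kernels.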
from llvm-project.
@llvm/issue-subscribers-mlir-gpu