Comments (2)
@llvm/issue-subscribers-mlir-sparse
Author: Giorgis Georgakoudis (ggeorgakoudis)
Then use the dumped pipeline directly in mlir-opt:
mlir-opt -pass-pipeline="builtin.module(func.func(linalg-generalize-named-ops),func.func(linalg-fuse-elementwise-ops),sparsification-and-bufferization,sparse-storage-specifier-to-llvm,func.func(canonicalize{ max-iterations=10 max-num-rewrites=-1 region-simplify=true test-convergence=false top-down=true}),func.func(finalizing-bufferize),sparse-gpu-codegen{enable-runtime-library=true num-threads=1024},gpu.module(strip-debuginfo),gpu.module(convert-scf-to-cf),gpu.module(convert-gpu-to-nvvm{has-redux=false index-bitwidth=0 use-bare-ptr-memref-call-conv=false}),func.func(convert-linalg-to-loops),func.func(convert-vector-to-scf{full-unroll=false lower-tensors=false target-rank=1}),func.func(expand-realloc{emit-deallocs=true}),func.func(convert-scf-to-cf),expand-strided-metadata,lower-affine,convert-vector-to-llvm{enable-amx=false enable-arm-neon=false enable-arm-sve=false enable-x86vector=false force-32bit-vector-indices=true reassociate-fp-reductions=false},finalize-memref-to-llvm{index-bitwidth=0 use-aligned-alloc=false use-generic-functions=false},func.func(convert-complex-to-standard),func.func(arith-expand{include-bf16=false}),func.func(convert-math-to-llvm{approximate-log1p=true}),convert-math-to-libm,convert-complex-to-libm,convert-vector-to-llvm{enable-amx=false enable-arm-neon=false enable-arm-sve=false enable-x86vector=false force-32bit-vector-indices=true reassociate-fp-reductions=false},convert-complex-to-llvm,convert-vector-to-llvm{enable-amx=false enable-arm-neon=false enable-arm-sve=false enable-x86vector=false force-32bit-vector-indices=true reassociate-fp-reductions=false},convert-func-to-llvm{index-bitwidth=0 use-bare-ptr-memref-call-conv=false},nvvm-attach-target{O=2 chip=sm_80 fast=false features=+ptx71 ftz=false module= triple=nvptx64-nvidia-cuda},gpu-to-llvm{gpu-binary-annotation=gpu.binary use-bare-pointers-for-host=false use-bare-pointers-for-kernels=false},gpu-module-to-binary{format=llvm opts= toolkit=},reconcile-unrealized-casts)"
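For context, a pipeline string like the one above is typically obtained by running the sparsifier pipeline with mlir-opt's --dump-pass-pipeline flag. A minimal sketch of that step (the sparsifier option names here are assumptions based on the nvvm-attach-target settings visible in the dump, not copied from the report):

```shell
# Hypothetical reconstruction of how the pipeline was dumped;
# input.mlir and the sparsifier option values are placeholders.
mlir-opt input.mlir \
  --sparsifier="enable-runtime-library=true gpu-triple=nvptx64-nvidia-cuda gpu-chip=sm_80 gpu-features=+ptx71" \
  --dump-pass-pipeline
```

The reported problem is that pasting the dumped string back via -pass-pipeline does not reproduce the same result as running the registered sparsifier pipeline directly.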
Reproducer: https://godbolt.org/z/1zz64j895
Expected to see the same GPU codegen as when calling the sparsifier directly, but the output does not contain GPU code.
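For reference, when the GPU path does take effect, the lowered output is expected to contain a gpu.binary op where the gpu.module was serialized by gpu-module-to-binary. A rough illustration of that shape (the symbol name and object payload below are placeholders, not taken from the reproducer):

```mlir
// Illustrative only: after gpu-module-to-binary{format=llvm}, the
// gpu.module is replaced by a gpu.binary holding a #gpu.object
// for the #nvvm.target attached by nvvm-attach-target.
module {
  gpu.binary @sparse_kernels [#gpu.object<#nvvm.target<chip = "sm_80">, "...">]
}
```

The reported output contains no such op, i.e. the sparse-gpu-codegen step appears not to have outlined any kernels.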
from llvm-project.
@llvm/issue-subscribers-mlir-gpu