riscv-gnu-toolchain/llvm.git

author	Charitha Saumya <136391709+charithaintc@users.noreply.github.com>	2025-11-04 13:15:32 -0800
committer	GitHub <noreply@github.com>	2025-11-04 13:15:32 -0800
commit	9703bda95b088bb6a455ef9faffdb41c537aff2f (patch)
tree	fa30be9f0439ed7d1efc519c13f31b0260226a68 /llvm/lib/Target/WebAssembly/WebAssemblyFixFunctionBitcasts.cpp
parent	2141edf506baab7e526f3a305bcdb6d6f2c772bc (diff)
download	llvm-9703bda95b088bb6a455ef9faffdb41c537aff2f.zip llvm-9703bda95b088bb6a455ef9faffdb41c537aff2f.tar.gz llvm-9703bda95b088bb6a455ef9faffdb41c537aff2f.tar.bz2

[mlir][xegpu] Add OptimizeBlockLoads pass. (#165483)

This pass rewrites certain xegpu `CreateNd` and `LoadNd` operations that feed into `vector.transpose` into a more optimal form to improve performance. Specifically, low-precision (bitwidth < 32) `LoadNd` ops that feed into transpose ops are rewritten to i32 loads with a valid transpose layout, so that later passes can use the load-with-transpose HW feature to accelerate such loads.

**Update:** The pass is renamed to `OptimizeBlockLoads` because we later plan to add the array length optimization to this pass as well. That optimization will break down a larger load (like `32x32xf16`) into more DPAS-favorable array length loads (`32x16xf16` with array length = 2). Both of these optimizations require rewriting `CreateNd` and `LoadNd`, so it makes sense to have a common pass for both.
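
The shape arithmetic behind the two rewrites can be sketched as follows. This is only an illustration in Python, not code from the pass: the function names are hypothetical, and packing along the innermost dimension is an assumption made here for the example.

```python
from typing import Tuple


def packing_factor(elem_bits: int) -> int:
    """How many low-precision elements fit in one 32-bit word."""
    if elem_bits <= 0 or elem_bits >= 32 or 32 % elem_bits != 0:
        raise ValueError("expected a bitwidth below 32 that divides 32")
    return 32 // elem_bits


def i32_view_shape(rows: int, cols: int, elem_bits: int) -> Tuple[int, int]:
    """Shape of the same block when reinterpreted as i32 elements.

    Assumption (not taken from the commit): elements are packed along
    the innermost (column) dimension.
    """
    factor = packing_factor(elem_bits)
    if cols % factor != 0:
        raise ValueError("column count must be a multiple of the packing factor")
    return rows, cols // factor


def split_by_array_length(rows: int, cols: int,
                          block_cols: int) -> Tuple[Tuple[int, int], int]:
    """Split a wide load into equal column blocks.

    Returns the per-block shape and the resulting array length.
    """
    if block_cols <= 0 or cols % block_cols != 0:
        raise ValueError("block width must evenly divide the column count")
    return (rows, block_cols), cols // block_cols


if __name__ == "__main__":
    # f16 is 16 bits wide, so two elements share one i32 word.
    print(packing_factor(16))
    # The commit message's example: 32x32xf16 as 32x16xf16 blocks.
    print(split_by_array_length(32, 32, 16))
```

With the example from the message, `split_by_array_length(32, 32, 16)` describes a `32x32xf16` load as `32x16xf16` blocks with array length 2.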

Diffstat (limited to 'llvm/lib/Target/WebAssembly/WebAssemblyFixFunctionBitcasts.cpp')

0 files changed, 0 insertions, 0 deletions
