path: root/llvm/lib/Target/WebAssembly/WebAssemblyFixFunctionBitcasts.cpp
author Charitha Saumya <136391709+charithaintc@users.noreply.github.com> 2025-11-04 13:15:32 -0800
committer GitHub <noreply@github.com> 2025-11-04 13:15:32 -0800
commit 9703bda95b088bb6a455ef9faffdb41c537aff2f
tree fa30be9f0439ed7d1efc519c13f31b0260226a68 /llvm/lib/Target/WebAssembly/WebAssemblyFixFunctionBitcasts.cpp
parent 2141edf506baab7e526f3a305bcdb6d6f2c772bc
[mlir][xegpu] Add OptimizeBlockLoads pass. (#165483)
This pass rewrites certain xegpu `CreateNd` and `LoadNd` operations that feed into `vector.transpose` into a more optimal form to improve performance. Specifically, low-precision (bitwidth < 32) `LoadNd` ops that feed into transpose ops are rewritten to i32 loads with a valid transpose layout, so that later passes can use the hardware load-with-transpose feature to accelerate such loads.

**Update:** The pass is renamed to `OptimizeBlockLoads` because we later plan to add the array length optimization to this pass as well. That optimization will break down a larger load (like `32x32xf16`) into more DPAS-favorable array-length loads (`32x16xf16` with array length = 2). Both optimizations require rewriting `CreateNd` and `LoadNd`, so it makes sense to have a common pass for both.
Diffstat (limited to 'llvm/lib/Target/WebAssembly/WebAssemblyFixFunctionBitcasts.cpp')
0 files changed, 0 insertions, 0 deletions