| Field | Value | Date |
|---|---|---|
| author | Arun Thangamani <arun.thangamani@intel.com> | 2025-08-29 17:11:05 +0530 |
| committer | GitHub <noreply@github.com> | 2025-08-29 13:41:05 +0200 |
| commit | 448811dfc2ab93ec8d01c3afa93c98e83ac6376b (patch) | |
| tree | ea0c32944cb270217f2b1d4d1d17c1fdfac7f61c /llvm/tools/llvm-cov/SourceCoverageView.cpp | |
| parent | 9155f511bf5038f04525703b467ed148978f944d (diff) | |
[mlir][amx] Add write side effect to AMX tile creation ops (#155403)
Adds the `MemWrite` side effect to the `amx.tile_zero` and `amx.tile_load` ops.
The memory write models the hardware populating AMX tiles with the specified
values through the tile zero and tile load ops.
Making the side effect explicit allows multiple op instances to be used as a
compilation hint to assign different AMX tile registers. This can prevent less
efficient lowerings that copy tiles through store-load round trips instead of
populating each tile directly with its values.
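As a concrete example, the IR below is a minimal sketch (hypothetical function
and buffer names, arbitrary 16x64xi8 tile shape; op syntax follows the MLIR AMX
dialect documentation) that creates two zeroed tiles differing only in the
buffer each is stored to:
```
// Two independent amx.tile_zero results stored to separate buffers.
// With the MemWrite side effect, CSE keeps both ops, so each result
// can be assigned its own tile register.
func.func @zero_two_tiles(%buf0: memref<16x64xi8>, %buf1: memref<16x64xi8>) {
  %c0 = arith.constant 0 : index
  %t0 = amx.tile_zero : !amx.tile<16x64xi8>
  %t1 = amx.tile_zero : !amx.tile<16x64xi8>
  amx.tile_store %buf0[%c0, %c0], %t0 : memref<16x64xi8>, !amx.tile<16x64xi8>
  amx.tile_store %buf1[%c0, %c0], %t1 : memref<16x64xi8>, !amx.tile<16x64xi8>
  return
}
```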
To illustrate the trade-off: without explicit side effects, `CSE` folds the two
`amx.tile_zero` ops into a single op, which lowers to a copy for the second
tile:
```
tilezero %tmm0
tilestored %tmm0, -2032(%rbp,%rbx) # 1024-byte Folded Spill
tileloadd -2032(%rbp,%rbx), %tmm1 # 1024-byte Folded Reload
```
Keeping the two `amx.tile_zero` ops, and thus lowering them to two separate
intrinsic invocations, zeroes both tile registers directly without the
additional round trip through memory:
```
tilezero %tmm0
tilezero %tmm1
```
The same principle applies to `amx.tile_load` ops.
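Analogously, a sketch like the one below (again with illustrative names and
shapes) loads the same source twice into two distinct tiles; keeping both
`amx.tile_load` ops lets each one lower to its own `tileloadd` instead of a
store-load copy of a single tile:
```
// Two amx.tile_load ops reading the same location. With the MemWrite
// side effect, CSE keeps both loads, so each can lower to a separate
// tileloadd into its own tile register.
func.func @load_two_tiles(%src: memref<16x64xi8>,
                          %dst0: memref<16x64xi8>, %dst1: memref<16x64xi8>) {
  %c0 = arith.constant 0 : index
  %t0 = amx.tile_load %src[%c0, %c0] : memref<16x64xi8> into !amx.tile<16x64xi8>
  %t1 = amx.tile_load %src[%c0, %c0] : memref<16x64xi8> into !amx.tile<16x64xi8>
  amx.tile_store %dst0[%c0, %c0], %t0 : memref<16x64xi8>, !amx.tile<16x64xi8>
  amx.tile_store %dst1[%c0, %c0], %t1 : memref<16x64xi8>, !amx.tile<16x64xi8>
  return
}
```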