author | Benjamin Maxwell <benjamin.maxwell@arm.com> | 2024-07-25 18:15:14 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-07-25 18:15:14 +0100 |
commit | c194bc77a21d68dd321588c9d726ef2d6c556a30 | |
tree | d3da61e6c00346c3f9dbe7ca65ab4b0fc56a8542 /lldb/source/Plugins/ScriptInterpreter/Python/lldb-python.h | |
parent | 99bb9a719cec9513e72ad275c1c0302b76b6c408 | |
[mlir][ArmSME] Add rewrite to handle unsupported SVE transposes via SME/ZA (#98620)
This adds a workaround rewrite that allows stores of unsupported SVE
transposes such as:
```mlir
%tr = vector.transpose %vec, [1, 0]
: vector<2x[4]xf32> to vector<[4]x2xf32>
vector.transfer_write %tr, %dest[%i, %j] {in_bounds = [true, true]}
: vector<[4]x2xf32>, memref<?x?xf32>
```
to be rewritten to use SME tiles, which can be lowered when SME is available:
```mlir
// Insert vector<2x[4]xf32> into an SME tile:
%0 = arm_sme.get_tile : vector<[4]x[4]xf32>
%1 = vector.extract %vec[0] : vector<[4]xf32> from vector<2x[4]xf32>
%2 = vector.insert %1, %0 [0] : vector<[4]xf32> into vector<[4]x[4]xf32>
%3 = vector.extract %vec[1] : vector<[4]xf32> from vector<2x[4]xf32>
%4 = vector.insert %3, %2 [1] : vector<[4]xf32> into vector<[4]x[4]xf32>
// Store the tile with a transpose + mask:
%c4_vscale = arith.muli %vscale, %c4 : index
%mask = vector.create_mask %c4_vscale, %c2 : vector<[4]x[4]xi1>
vector.transfer_write %4, %dest[%i, %j], %mask
{permutation_map = affine_map<(d0, d1) -> (d1, d0)>}
: vector<[4]x[4]xf32>, memref<?x?xf32>
```
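As a rough illustration of why the tile-based form stores the same data as the original transpose, here is a small Python model of the two snippets above. It is only a sketch and not the MLIR pattern itself: the function names are hypothetical, `vscale` is passed in as a plain integer, undefined tile contents are modelled as `None`, and the mask is assumed to be indexed in destination (post-permutation) layout.

```python
def store_transposed_via_tile(vec, dest, i, j, vscale=1):
    """Model the rewrite: copy the rows of `vec` into a square tile, then
    store the tile transposed under a mask (hypothetical helper)."""
    n = 4 * vscale            # runtime size of a scalable [4] dimension
    rows = len(vec)           # fixed dimension (2 in the example above)
    assert all(len(row) == n for row in vec)

    # arm_sme.get_tile: an n x n tile whose contents start out undefined.
    tile = [[None] * n for _ in range(n)]
    # vector.extract + vector.insert: copy each input row into the tile.
    for r in range(rows):
        tile[r] = list(vec[r])

    # vector.create_mask %c4_vscale, %c2: assumed to be in destination
    # layout, so only n rows by `rows` columns of memory are written.
    mask = [[p < n and q < rows for q in range(n)] for p in range(n)]

    # transfer_write with permutation_map (d0, d1) -> (d1, d0):
    # tile element [q][p] lands at dest[i + p][j + q].
    for p in range(n):
        for q in range(n):
            if mask[p][q]:
                dest[i + p][j + q] = tile[q][p]
    return dest


def store_transposed_directly(vec, dest, i, j):
    """Reference: the original vector.transpose followed by a plain store."""
    for r, row in enumerate(vec):
        for c, value in enumerate(row):
            dest[i + c][j + r] = value
    return dest


vec = [[1, 2, 3, 4], [5, 6, 7, 8]]
via_tile = store_transposed_via_tile(vec, [[0] * 4 for _ in range(6)], 1, 1)
direct = store_transposed_directly(vec, [[0] * 4 for _ in range(6)], 1, 1)
```

In this model the mask is what keeps the undefined tile rows (those never written by an insert) from reaching memory, so both functions leave `dest` in the same state.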