| author | Philip Ginsbach-Chen <philip.ginsbach@cantab.net> | 2025-12-15 05:34:30 +0000 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2025-12-15 05:34:30 +0000 |
| commit | 1d821b0c6b71b124a5eacc63deb0a7b81d2fb973 (patch) | |
| tree | fdd804c92006e0dce386f589a9f07ff4d46b0a83 /offload/unittests/OffloadAPI/device_code/multiargs.cpp | |
| parent | 8f51da369e6e7f13bea941e61b4b2c5fa81216f5 (diff) | |
[AArch64] use `isTRNMask` to calculate shuffle costs (#171524)
This builds on #169858 to fix the divergence in codegen
(https://godbolt.org/z/a9az3h6oq) between two very similar
functions initially observed in #137447 (represented in the diff by test
cases `@transpose_splat_constants` and `@transpose_constants_splat`):
```
int8x16_t f(int8_t x)
{
    return (int8x16_t) { x, 0, x, 1, x, 2, x, 3,
                         x, 4, x, 5, x, 6, x, 7 };
}
int8x16_t g(int8_t x)
{
    return (int8x16_t) { 0, x, 1, x, 2, x, 3, x,
                         4, x, 5, x, 6, x, 7, x };
}
```
The PR uses an additional `isTRNMask` call in
`AArch64TTIImpl::getShuffleCost` to ensure that we treat shuffle masks
as transpose masks even if `isTransposeMask` fails to recognise them
(meaning that `Kind == TTI::SK_Transpose` cannot be relied upon).
Follow-up work could consider modifying `isTransposeMask`, but that
would also affect backends other than AArch64.
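
To illustrate the kind of check involved, here is a minimal standalone sketch of a TRN-style mask test. It is not the LLVM implementation of `isTRNMask`: the names `matchTrnMask` and `TrnMatch` are hypothetical, and the sketch assumes the usual shuffle-mask numbering (indices `0..N-1` select from the first operand, `N..2N-1` from the second, negative entries are undefined). It accepts both operand orders, which is the difference between `f` and `g` above.

```cpp
#include <cstddef>
#include <initializer_list>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical result type: which TRN variant matched and in which operand order.
struct TrnMatch {
  unsigned WhichResult;  // 0 for a TRN1-style mask, 1 for a TRN2-style mask
  bool OperandsSwapped;  // true if the second operand supplies the even lanes
};

// Illustrative only; the real isTRNMask may differ in signature and behaviour.
// Returns a match if Mask interleaves lane (I + Which) of one operand with
// lane (I + Which) of the other, for every even position I.
std::optional<TrnMatch> matchTrnMask(const std::vector<int> &Mask) {
  const std::size_t N = Mask.size();
  if (N == 0 || N % 2 != 0)
    return std::nullopt;

  for (bool Swapped : {false, true}) {
    for (unsigned Which : {0u, 1u}) {
      bool Ok = true;
      for (std::size_t I = 0; I < N && Ok; I += 2) {
        int Even = static_cast<int>(I + Which);      // lane from first operand
        int Odd = static_cast<int>(I + Which + N);   // lane from second operand
        if (Swapped)
          std::swap(Even, Odd);
        // Negative mask entries are undefined and match anything.
        if (Mask[I] >= 0 && Mask[I] != Even)
          Ok = false;
        if (Mask[I + 1] >= 0 && Mask[I + 1] != Odd)
          Ok = false;
      }
      if (Ok)
        return TrnMatch{Which, Swapped};
    }
  }
  return std::nullopt;
}
```

Under these assumptions, a four-lane mask `{0, 4, 2, 6}` matches as TRN1 with the operands in order, while `{4, 0, 6, 2}` matches as TRN1 with the operands swapped. A cost model that only accepts the first form would price the two functions above differently.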
