riscv-gnu-toolchain/llvm.git

author	Craig Topper <craig.topper@sifive.com>	2022-01-20 14:16:37 -0800
committer	Craig Topper <craig.topper@sifive.com>	2022-01-20 14:44:47 -0800
commit	fa8bb224661dfb38cb2a246f7d98dc61fd45602e (patch)
tree	113143c262b0c698c201a964a58666d7eb3a7414 /lldb/source/Plugins/ScriptInterpreter/Python
parent	cd2d7369639e70df17e7977ac3a4a5b7854043fa (diff)

[RISCV] Optimize vector_shuffles that are interleaving the lowest elements of two vectors.

RISCV only has a unary shuffle that requires placing indices in a register. For interleaving two vectors this means we need at least two vrgathers and a vmerge to do a shuffle of two vectors.

This patch teaches shuffle lowering to use a widening add (vwaddu) followed by a widening multiply-accumulate (vwmaccu) to implement the interleave. First we extract the low half of both V1 and V2. Then we implement (zext(V1) + zext(V2)) + (zext(V2) * zext(2^eltbits - 1)), which simplifies to (zext(V1) + zext(V2) * 2^eltbits). This further simplifies to (zext(V1) + (zext(V2) << eltbits)). Then we bitcast the result back to the original type, splitting the wide elements in half.

We can only do this if we have a type with wider elements available. Because we're using extends, we also have to be careful with fractional LMULs. Floating point types are supported by bitcasting to/from integer.

The tests cover a varied combination of LMULs split across VLEN>=128 and VLEN>=512 configurations. There are a few tests with shuffle indices commuted as well as tests for undef indices. There's one test for a vXi64/vXf64 vector, which we can't optimize; it verifies that we don't crash.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D117743
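The arithmetic behind the interleave can be checked with a small scalar model. The sketch below is not the patch's code; it emulates the widening add and widening multiply-accumulate on Python lists. The function name `interleave_low_halves` and the `eltbits` parameter are hypothetical, and it assumes the little-endian element order in which the low half of each wide element comes first.

```python
def interleave_low_halves(v1, v2, eltbits=8):
    """Scalar model of the widening add + multiply-accumulate interleave.

    Hypothetical helper for illustration only; v1 and v2 are equal-length
    lists of unsigned integers that each fit in `eltbits` bits.
    """
    assert len(v1) == len(v2)
    half = len(v1) // 2
    mask = (1 << eltbits) - 1              # 2^eltbits - 1
    wide_mask = (1 << (2 * eltbits)) - 1   # range of the double-width element

    # Step 1: extract the low half of both inputs.
    lo1, lo2 = v1[:half], v2[:half]

    # Step 2: widening unsigned add, zext(V1) + zext(V2).
    wide = [(a + b) & wide_mask for a, b in zip(lo1, lo2)]

    # Step 3: widening unsigned multiply-accumulate with the scalar
    # 2^eltbits - 1, giving zext(V1) + zext(V2) * 2^eltbits in total.
    wide = [(w + b * mask) & wide_mask for w, b in zip(wide, lo2)]

    # Step 4: "bitcast" back to narrow elements by splitting each wide
    # element into its low half (from V1) and high half (from V2).
    out = []
    for w in wide:
        out.append(w & mask)
        out.append(w >> eltbits)
    return out
```

With `v1 = [1, 2, 3, 4]` and `v2 = [10, 20, 30, 40]`, each wide element holds a V1 element in its low byte and the matching V2 element in its high byte, so the split yields the two low halves interleaved. The sum never overflows the wide element, since the largest value is (2^eltbits - 1) + (2^eltbits - 1) * 2^eltbits = 2^(2*eltbits) - 1.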

Diffstat (limited to 'lldb/source/Plugins/ScriptInterpreter/Python')

0 files changed, 0 insertions, 0 deletions

