author | Xi Ruoyao <xry111@xry111.site> | 2023-11-19 06:12:22 +0800 |
---|---|---|
committer | Xi Ruoyao <xry111@xry111.site> | 2023-11-22 17:06:06 +0800 |
commit | fce367810149580da1bb0cb0c3cd4fb00b968f1c (patch) | |
tree | fffa1a55cac6ab33726ff62d809fccdb1c665c6f /gcc/expr.cc | |
parent | bd17d00a4bdee34876cc97bdf9a1f2316e0a6790 (diff) | |
LoongArch: Optimize LSX vector shuffle on floating-point vector
The vec_perm expander was wrongly defined. The GCC Internals manual says:

    Operand 3 is the “selector”. It is an integral mode vector of the same
    width and number of elements as mode M.

But we made operand 3 the same mode as the shuffled vectors, so it
would be an FP mode vector if the shuffled vectors are in an FP mode.
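
As an illustration of that rule (a minimal sketch, not the test added by this commit), the selector passed to __builtin_shuffle for two float vectors is an integer vector with the same width and lane count. The helper name shuffle_v4sf is hypothetical, and the scalar fallback exists only so the sketch also builds with compilers that lack __builtin_shuffle:

```c
#include <stdint.h>

/* GCC vector extension types: four 32-bit lanes in 16 bytes each.  */
typedef float   v4sf __attribute__ ((vector_size (16)));
typedef int32_t v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: shuffle two float vectors with an integer
   selector.  Each selector lane, taken modulo 8, indexes the
   concatenation {a, b}: 0..3 pick from a, 4..7 pick from b.  */
static inline v4sf
shuffle_v4sf (v4sf a, v4sf b, v4si sel)
{
#if defined (__GNUC__) && !defined (__clang__)
  return __builtin_shuffle (a, b, sel);
#else
  /* Scalar fallback for compilers without __builtin_shuffle.  */
  v4sf r = { 0 };
  for (int i = 0; i < 4; i++)
    {
      int k = sel[i] & 7;
      r[i] = k < 4 ? a[k] : b[k - 4];
    }
  return r;
#endif
}
```

Here a and b correspond to V4SF operands and sel to a V4SI selector, which is the combination the expander has to accept.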
Because of this mistake, the generic code has to work around the
mismatch, and it ends up generating some very nasty code for a simple
__builtin_shuffle (a, b, c) where a and b are V4SF and c is V4SI:

    la.local  $r12,.LANCHOR0
    la.local  $r13,.LANCHOR1
    vld       $vr1,$r12,48
    vslli.w   $vr1,$vr1,2
    vld       $vr2,$r12,16
    vld       $vr0,$r13,0
    vld       $vr3,$r13,16
    vshuf.b   $vr0,$vr1,$vr1,$vr0
    vld       $vr1,$r12,32
    vadd.b    $vr0,$vr0,$vr3
    vandi.b   $vr0,$vr0,31
    vshuf.b   $vr0,$vr1,$vr2,$vr0
    vst       $vr0,$r12,0
    jr        $r1

This is obviously stupid. Fix the expander definition and adjust
loongarch_expand_vec_perm to handle it correctly.
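
The adjustment can be pictured with a scalar model (a sketch only, not the RTL that loongarch_expand_vec_perm emits): the float lanes are reinterpreted as same-sized integer lanes, the selector is truncated in its own integer type, and the shuffle moves raw bits. The function name shuffle_fp_via_int is hypothetical, and truncating to the range 0..7 is an assumption based on the modulo-2N rule that __builtin_shuffle documents for two V4 operands:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model: shuffle four float lanes by treating
   them as 32-bit integer lanes, loosely mirroring "operate on
   subregs in the selector mode".  */
void
shuffle_fp_via_int (float out[4], const float a[4], const float b[4],
                    const int32_t sel[4])
{
  uint32_t bits[8];
  uint32_t res[4];

  /* Reinterpret the float lanes as integers; no value conversion.  */
  memcpy (bits, a, 4 * sizeof (float));
  memcpy (bits + 4, b, 4 * sizeof (float));

  for (int i = 0; i < 4; i++)
    /* Truncate the selector lane to 0..7 (assumed modulo 2N).  */
    res[i] = bits[sel[i] & 7];

  /* Reinterpret the selected bits as floats again.  */
  memcpy (out, res, sizeof res);
}
```

Because only bit patterns are moved, the result does not depend on the lanes being floating-point values, which is why an integer selector mode is sufficient.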

gcc/ChangeLog:

	* config/loongarch/lsx.md (vec_perm<mode:LSX>): Make the
	selector VIMODE.
	* config/loongarch/loongarch.cc (loongarch_expand_vec_perm):
	Use the mode of the selector (instead of the shuffled vector)
	for truncating it.  Operate on subregs in the selector mode if
	the shuffled vector has a different mode (i.e. it's a
	floating-point vector).

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/vect-shuf-fp.c: New test.