author: Kyrylo Tkachov <kyrylo.tkachov@arm.com> 2023-06-06 09:54:41 +0100
committer: Kyrylo Tkachov <kyrylo.tkachov@arm.com> 2023-06-06 09:54:41 +0100
commit: b327cbe8f4eefc91ee2bea49a1da7128adf30281
tree: 02d04fdafaa0319b916927bdc61e7ce47e3e8f01 /gcc/expr.cc
parent: 84eec2916fa68cd2e2b3a2cf764f2ba595cce843
aarch64: Improve representation of ADDLV instructions

We've received requests to optimise the attached intrinsics testcase.
We currently generate:

foo_1:
        uaddlp  v0.4s, v0.8h
        uaddlv  d31, v0.4s
        fmov    x0, d31
        ret
foo_2:
        uaddlp  v0.4s, v0.8h
        addv    s31, v0.4s
        fmov    w0, s31
        ret
foo_3:
        saddlp  v0.4s, v0.8h
        addv    s31, v0.4s
        fmov    w0, s31
        ret

The widening pair-wise addition addlp instructions can be omitted if we're
just doing an ADDV afterwards.
Making this optimisation would be quite simple if we had a standard RTL PLUS
vector reduction code. As we don't, we can use UNSPEC_ADDV as a stand-in.
This patch expresses the SADDLV and UADDLV instructions as an UNSPEC_ADDV over
a widened input, thus removing the need for separate UNSPEC_SADDLV and
UNSPEC_UADDLV codes.
To optimise the testcases involved we add two splitters that match a vector
addition where all participating elements are taken and widened from the same
vector and then fed into an UNSPEC_ADDV. In that case we can remove the vector
PLUS and emit the simple RTL for SADDLV/UADDLV.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/ChangeLog:

        * config/aarch64/aarch64-protos.h (aarch64_parallel_select_half_p):
        Define prototype.
        (aarch64_pars_overlap_p): Likewise.
        * config/aarch64/aarch64-simd.md (aarch64_<su>addlv<mode>):
        Express in terms of UNSPEC_ADDV.
        (*aarch64_<su>addlv<VDQV_L:mode>_ze<GPI:mode>): Likewise.
        (*aarch64_<su>addlv<mode>_reduction): Define.
        (*aarch64_uaddlv<mode>_reduction_2): Likewise.
        * config/aarch64/aarch64.cc (aarch64_parallel_select_half_p): Define.
        (aarch64_pars_overlap_p): Likewise.
        * config/aarch64/iterators.md (UNSPEC_SADDLV, UNSPEC_UADDLV): Delete.
        (VQUADW): New mode attribute.
        (VWIDE2X_S): Likewise.
        (USADDLV): Delete.
        (su): Delete handling of UNSPEC_SADDLV, UNSPEC_UADDLV.
        * config/aarch64/predicates.md (vect_par_cnst_select_half): Define.

gcc/testsuite/ChangeLog:

        * gcc.target/aarch64/simd/addlv_1.c: New test.
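
As an illustration of why the addlp step is redundant before an across-lanes add, here is a minimal scalar sketch in C. It is not the testcase from this commit and does not use the real intrinsics: the `model_*` and `reduce_two_step` names and the fixed eight-lane layout are assumptions made only for this example. It models the unsigned 16-bit to 32-bit case, where summing widened adjacent pairs and then adding across lanes gives the same result as one widening add across all lanes.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8 /* assumption: models a v0.8h input, eight 16-bit lanes */

/* Scalar model of "uaddlp v.4s, v.8h": add adjacent pairs, widening
   each lane to 32 bits before the addition.  */
static void model_uaddlp(const uint16_t in[LANES], uint32_t out[LANES / 2])
{
  for (int i = 0; i < LANES / 2; i++)
    out[i] = (uint32_t) in[2 * i] + (uint32_t) in[2 * i + 1];
}

/* Scalar model of "addv s, v.4s": add across the four 32-bit lanes.  */
static uint32_t model_addv(const uint32_t in[LANES / 2])
{
  uint32_t sum = 0;
  for (int i = 0; i < LANES / 2; i++)
    sum += in[i];
  return sum;
}

/* Scalar model of a widening add across all eight 16-bit lanes,
   producing a 32-bit result.  */
static uint32_t model_uaddlv(const uint16_t in[LANES])
{
  uint32_t sum = 0;
  for (int i = 0; i < LANES; i++)
    sum += (uint32_t) in[i];
  return sum;
}

/* The two-instruction sequence: pairwise widening add, then reduce.  */
static uint32_t reduce_two_step(const uint16_t in[LANES])
{
  uint32_t pairs[LANES / 2];
  model_uaddlp(in, pairs);
  return model_addv(pairs);
}

/* Hypothetical self-check: both forms agree on a sample input.  */
static void check_equivalence(void)
{
  const uint16_t v[LANES] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  assert(reduce_two_step(v) == model_uaddlv(v));
}
```

The sum of eight 16-bit values always fits in 32 bits, so neither form can wrap in this model. The signed case has the same shape with sign extension in place of zero extension.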
Diffstat (limited to 'gcc/expr.cc')
0 files changed, 0 insertions, 0 deletions