aboutsummaryrefslogtreecommitdiff
path: root/gcc/gcc-urlifier.cc
diff options
context:
space:
mode:
authorJennifer Schmitz <jschmitz@nvidia.com>2024-10-01 08:01:13 -0700
committerJennifer Schmitz <jschmitz@nvidia.com>2024-10-24 09:06:20 +0200
commitfc40202c1ac5d585bb236cdaf3a3968927e970a0 (patch)
treed4f4797ee2b488065aafabe21df16efdabcad912 /gcc/gcc-urlifier.cc
parent90e38c4ffad086a82635e8ea9bf0e7e9e02f1ff7 (diff)
downloadgcc-fc40202c1ac5d585bb236cdaf3a3968927e970a0.zip
gcc-fc40202c1ac5d585bb236cdaf3a3968927e970a0.tar.gz
gcc-fc40202c1ac5d585bb236cdaf3a3968927e970a0.tar.bz2
SVE intrinsics: Fold division and multiplication by -1 to neg
Because a neg instruction has lower latency and higher throughput than sdiv and mul, svdiv and svmul by -1 can be folded to svneg. For svdiv, this is already implemented on the RTL level; for svmul, the optimization was still missing. This patch implements folding to svneg for both operations using the gimple_folder. For svdiv, the transform is applied if the divisor is -1. Svmul is folded if either of the operands is -1. A case distinction of the predication is made to account for the fact that svneg_m has 3 arguments (argument 0 holds the values for the inactive lanes), while svneg_x and svneg_z have only 2 arguments. Tests were added or adjusted to check the produced assembly and runtime tests were added to check correctness. The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. OK for mainline? Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> gcc/ * config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold): Fold division by -1 to svneg. (svmul_impl::fold): Fold multiplication by -1 to svneg. gcc/testsuite/ * gcc.target/aarch64/sve/acle/asm/div_s32.c: New test. * gcc.target/aarch64/sve/acle/asm/mul_s16.c: Adjust expected outcome. * gcc.target/aarch64/sve/acle/asm/mul_s32.c: New test. * gcc.target/aarch64/sve/acle/asm/mul_s64.c: Adjust expected outcome. * gcc.target/aarch64/sve/acle/asm/mul_s8.c: Likewise. * gcc.target/aarch64/sve/div_const_run.c: New test. * gcc.target/aarch64/sve/mul_const_run.c: Likewise.
Diffstat (limited to 'gcc/gcc-urlifier.cc')
0 files changed, 0 insertions, 0 deletions