author | Wilco Dijkstra <wdijkstr@arm.com> | 2018-10-12 10:49:27 +0000 |
---|---|---|
committer | Wilco Dijkstra <wilco@gcc.gnu.org> | 2018-10-12 10:49:27 +0000 |
commit | 0cfc095c8d150af9f0f68a3238abe76b43e73bec (patch) | |
tree | f746fd533cf5f3d089d33b8c9c44c23d14eebf91 /gcc/opt-problem.h | |
parent | 4dc003fffabd35361fb77a40f077805d21184f9c (diff) | |
[AArch64] Support zero-extended move to FP register

The popcount expansion uses SIMD instructions acting on 64-bit values.
As a result a popcount of a 32-bit integer requires zero-extension before
moving the zero-extended value into an FP register. This patch adds
support for zero-extended int->FP moves to avoid the redundant uxtw.
Similarly, add support for 32-bit zero-extending load->FP register
and 32-bit zero-extending FP->FP and FP->int moves.
Add a missing 'fp' arch attribute to the related 8/16-bit pattern and
fix an incorrect type attribute.

To complete zero-extended load support, add a new alternative to
load_pair_zero_extendsidi2_aarch64 to support LDP into FP registers too.

int f (int a)
{
  return __builtin_popcount (a);
}

Before:
    uxtw    x0, w0
    fmov    d0, x0
    cnt     v0.8b, v0.8b
    addv    b0, v0.8b
    fmov    w0, s0
    ret

After:
    fmov    s0, w0
    cnt     v0.8b, v0.8b
    addv    b0, v0.8b
    fmov    w0, s0
    ret

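
The reason the upper 32 bits must be zero can be sketched in portable C. This is only an illustration and not part of the patch: the helper names (popcount64, popcount32_zero_extended, popcount32_sign_extended, self_check) are hypothetical, and __builtin_popcountll stands in for the 64-bit cnt/addv sequence. On AArch64, a write to an S register such as "fmov s0, w0" clears the upper bits of the vector register, which is why the separate uxtw can be dropped.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the SIMD sequence, which counts bits across all 64 bits
   of the D register (hypothetical helper, not from the patch).  */
static int
popcount64 (uint64_t x)
{
  return __builtin_popcountll ((unsigned long long) x);
}

/* Correct widening: the upper 32 bits are zero, so the 64-bit count
   equals the 32-bit count.  */
static int
popcount32_zero_extended (uint32_t a)
{
  uint64_t widened = (uint64_t) a;	/* zero-extension */
  return popcount64 (widened);
}

/* Incorrect widening, shown for contrast: sign-extension copies the
   sign bit into the upper half, which inflates the count for negative
   inputs.  */
static int
popcount32_sign_extended (int32_t a)
{
  uint64_t widened = (uint64_t) (int64_t) a;	/* sign-extension */
  return popcount64 (widened);
}

/* Small self-check comparing the two widenings.  */
static void
self_check (void)
{
  assert (popcount32_zero_extended (0xffffffffu) == 32);
  assert (popcount32_sign_extended (-1) == 64);
}
```

Any garbage left in bits 32-63 would be counted by the 64-bit operation, so the value has to be zero-extended either by an explicit instruction or, as after this patch, implicitly by the move itself.
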
Passes regress & bootstrap on AArch64.

gcc/
    * config/aarch64/aarch64.md (zero_extendsidi2_aarch64): Add alternatives
    to zero-extend between int and floating-point registers.
    (load_pair_zero_extendsidi2_aarch64): Add alternative for zero-extended
    ldp into floating-point registers.  Add type and arch attributes.
    (zero_extend<SHORT:mode><GPI:mode>2_aarch64): Add arch attribute.
    Use f_loads for type attribute.

testsuite/
    * gcc.target/aarch64/popcnt.c: Test zero-extended popcount.
    * gcc.target/aarch64/vec_zeroextend.c: Test zero-extended vectors.

From-SVN: r265079