path: root/gcc/fold-const.c
author: Wilco Dijkstra <wdijkstr@arm.com> 2020-02-12 18:19:25 +0000
committer: Wilco Dijkstra <wdijkstr@arm.com> 2020-02-12 18:19:25 +0000
commit: 9921bbf9b2e27568d952fe6ee5bc083c93bbf7fd
tree: c475f27f25b512752a4c5904a3aaf3c584047a2f /gcc/fold-const.c
parent: e5cc04a73a3e212114ca9725911eaaa66d32303c
[AArch64] Improve popcount expansion

The popcount expansion uses umov to extend the result and move it back
to the integer register file. If we model ADDV as a zero-extending
operation, fmov can be used to move back to the integer side. This
results in a ~0.5% speedup on deepsjeng on Cortex-A57.

A typical __builtin_popcount expansion is now:

    fmov    s0, w0
    cnt     v0.8b, v0.8b
    addv    b0, v0.8b
    fmov    w0, s0

gcc/
    * config/aarch64/aarch64-simd.md
    (aarch64_zero_extend<GPI:mode>_reduc_plus_<VDQV_E:mode>): New pattern.
    * config/aarch64/aarch64.md (popcount<mode>2): Use it instead of
    generating separate ADDV and zero_extend patterns.
    * config/aarch64/iterators.md (VDQV_E): New iterator.

testsuite/
    * gcc.target/aarch64/popcnt2.c: New test.
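
As a rough illustration of the kind of source that exercises this expansion, here is a minimal sketch in C. It is not the contents of popcnt2.c. The names popcount_builtin, popcount_reference and check_popcount are hypothetical and chosen only for this example. The reference loop is there only to cross-check the builtin's results.

```c
#include <assert.h>

/* Calls the GCC builtin that the commit message refers to. On an
   AArch64 target with optimization enabled, GCC is expected to expand
   it inline rather than call a library routine. */
int popcount_builtin(unsigned int x)
{
    return __builtin_popcount(x);
}

/* Portable reference: clear the lowest set bit until none remain.
   Used here only to cross-check the builtin. */
int popcount_reference(unsigned int x)
{
    int n = 0;
    while (x) {
        x &= x - 1;   /* drops the least significant set bit */
        n++;
    }
    return n;
}

/* Hypothetical self-check; not part of the commit's testsuite. */
void check_popcount(void)
{
    unsigned int samples[] = { 0u, 1u, 0x80u, 0xF0F0u, 0x12345678u };
    for (unsigned int i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(popcount_builtin(samples[i]) == popcount_reference(samples[i]));
}
```

Compiling popcount_builtin with an AArch64 GCC at -O2 -S and reading the assembly should show whether the final move back to the integer side is an fmov or a umov. The exact sequence may differ with compiler version and tuning options.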
Diffstat (limited to 'gcc/fold-const.c')
0 files changed, 0 insertions, 0 deletions