riscv-gnu-toolchain/gcc.git

author	Jonathan Wright <jonathan.wright@arm.com>	2021-02-17 13:13:52 +0000
committer	Jonathan Wright <jonathan.wright@arm.com>	2021-04-30 18:41:25 +0100
commit	d388179a798c6528563873cbabd80a0e7272c013 (patch)
tree	ccff0a4779c0fbe66220f178c0750ba506560556 /gcc/value-range.h
parent	1baf4ed878639536c50a7aab9e7be64da43356fd (diff)

aarch64: Use RTL builtins for FP ml[as][q]_laneq intrinsics

Rewrite floating-point vml[as][q]_laneq Neon intrinsics to use RTL
builtins rather than relying on the GCC vector extensions. Using RTL
builtins allows control over the emission of fmla/fmls instructions
(which we don't want here.)

With this commit, the code generated by these intrinsics changes from
a fused multiply-add/subtract instruction to an fmul followed by an
fadd/fsub instruction. If the programmer really wants fmla/fmls
instructions, they can use the vfm[as] intrinsics.

gcc/ChangeLog:

2021-02-17  Jonathan Wright  <jonathan.wright@arm.com>

	* config/aarch64/aarch64-simd-builtins.def: Add
	float_ml[as][q]_laneq builtin generator macros.
	* config/aarch64/aarch64-simd.md (mul_laneq<mode>3): Define.
	(aarch64_float_mla_laneq<mode>): Define.
	(aarch64_float_mls_laneq<mode>): Define.
	* config/aarch64/arm_neon.h (vmla_laneq_f32): Use RTL builtin
	instead of GCC vector extensions.
	(vmlaq_laneq_f32): Likewise.
	(vmls_laneq_f32): Likewise.
	(vmlsq_laneq_f32): Likewise.
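The numerical effect of going from one fused instruction to fmul plus fadd can be illustrated with a portable scalar model. This is only a sketch and not code from the patch: the types f32x2 and f32x4 and the functions model_mla_laneq and model_fma_laneq are hypothetical stand-ins for the Neon vector types and for vmla_laneq_f32 / vfma_laneq_f32. The fused path is approximated by doing the arithmetic in double precision (the product of two floats is exact in double, but the final sum is rounded twice, so it is not a true fma in every case).

```c
#include <assert.h>

/* Hypothetical scalar stand-ins for 2-lane and 4-lane float vectors. */
typedef struct { float v[2]; } f32x2;
typedef struct { float v[4]; } f32x4;

/* Model of the unfused form (fmul followed by fadd): the product is
   rounded to float before the addition, so there are two roundings. */
static f32x2 model_mla_laneq(f32x2 acc, f32x2 b, f32x4 c, int lane)
{
    assert(lane >= 0 && lane < 4);
    f32x2 r;
    for (int i = 0; i < 2; i++) {
        /* volatile keeps the compiler from contracting this into an fma. */
        volatile float prod = b.v[i] * c.v[lane];
        r.v[i] = acc.v[i] + prod;
    }
    return r;
}

/* Approximate model of the fused form (fmla): the product is kept exact
   in double and only the final result is rounded to float. */
static f32x2 model_fma_laneq(f32x2 acc, f32x2 b, f32x4 c, int lane)
{
    assert(lane >= 0 && lane < 4);
    f32x2 r;
    for (int i = 0; i < 2; i++) {
        double wide = (double)b.v[i] * (double)c.v[lane] + (double)acc.v[i];
        r.v[i] = (float)wide;
    }
    return r;
}
```

With acc = -(1 + 2^-11) and b = c[lane] = 1 + 2^-12, the exact product 1 + 2^-11 + 2^-24 rounds to 1 + 2^-11 in float, so the unfused model returns 0 while the fused model keeps the 2^-24 residue. The subtract variants (vmls) would follow the same pattern with acc - b * c[lane].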

Diffstat (limited to 'gcc/value-range.h')

0 files changed, 0 insertions, 0 deletions
