field | value | date |
---|---|---|
author | Jonathan Wright <jonathan.wright@arm.com> | 2021-02-16 23:59:22 +0000 |
committer | Jonathan Wright <jonathan.wright@arm.com> | 2021-04-30 18:41:11 +0100 |
commit | 1baf4ed878639536c50a7aab9e7be64da43356fd | |
tree | 1599683c4163adb622fc03dbc9345966ba3e9a03 /gcc/cp/class.c | |
parent | b0d9aac8992c1f8c3198d9528a9867c653623dfb | |
aarch64: Use RTL builtins for FP ml[as][q]_lane intrinsics

Rewrite floating-point vml[as][q]_lane Neon intrinsics to use RTL
builtins rather than relying on the GCC vector extensions. Using RTL
builtins allows control over the emission of fmla/fmls instructions
(which we don't want here).

With this commit, the code generated by these intrinsics changes from
a fused multiply-add/subtract instruction to an fmul followed by an
fadd/fsub instruction. If the programmer really wants fmla/fmls
instructions, they can use the vfm[as] intrinsics.

gcc/ChangeLog:

2021-02-16  Jonathan Wright  <jonathan.wright@arm.com>

	* config/aarch64/aarch64-simd-builtins.def: Add
	float_ml[as]_lane builtin generator macros.
	* config/aarch64/aarch64-simd.md (*aarch64_mul3_elt<mode>):
	Rename to...
	(mul_lane<mode>3): This, and re-order arguments.
	(aarch64_float_mla_lane<mode>): Define.
	(aarch64_float_mls_lane<mode>): Define.
	* config/aarch64/arm_neon.h (vmla_lane_f32): Use RTL builtin
	instead of GCC vector extensions.
	(vmlaq_lane_f32): Likewise.
	(vmls_lane_f32): Likewise.
	(vmlsq_lane_f32): Likewise.