diff options
author | Richard Sandiford <richard.sandiford@arm.com> | 2019-08-15 08:18:03 +0000 |
---|---|---|
committer | Richard Sandiford <rsandifo@gcc.gnu.org> | 2019-08-15 08:18:03 +0000 |
commit | a19ba9e1b15d248e5a13ee773f4acd4ae29fdeaa (patch) | |
tree | 585d320966057caeb76464388dcb7d0d8a8c20e4 | |
parent | bf30864e4c241e50585745af504b09db55f7f08b (diff) | |
download | gcc-a19ba9e1b15d248e5a13ee773f4acd4ae29fdeaa.zip gcc-a19ba9e1b15d248e5a13ee773f4acd4ae29fdeaa.tar.gz gcc-a19ba9e1b15d248e5a13ee773f4acd4ae29fdeaa.tar.bz2 |
[AArch64] Use SVE binary immediate instructions for conditional arithmetic
This patch lets us use the immediate forms of FADD, FSUB, FSUBR,
FMUL, FMAXNM and FMINNM for conditional arithmetic. (We already
use them for normal unconditional arithmetic.)
2019-08-15 Richard Sandiford <richard.sandiford@arm.com>
Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
gcc/
* config/aarch64/aarch64.c (aarch64_print_vector_float_operand):
Print 2.0 naturally.
(aarch64_sve_float_mul_immediate_p): Return true for 2.0.
* config/aarch64/predicates.md
(aarch64_sve_float_negated_arith_immediate): New predicate,
renamed from aarch64_sve_float_arith_with_sub_immediate.
(aarch64_sve_float_arith_with_sub_immediate): Test for both
positive and negative constants.
(aarch64_sve_float_arith_with_sub_operand): Redefine as a register
or an aarch64_sve_float_arith_with_sub_immediate.
* config/aarch64/constraints.md (vsN): Use
aarch64_sve_float_negated_arith_immediate.
* config/aarch64/iterators.md (SVE_COND_FP_BINARY_I1): New int
iterator.
(sve_pred_fp_rhs2_immediate): New int attribute.
* config/aarch64/aarch64-sve.md
(cond_<SVE_COND_FP_BINARY:optab><SVE_F:mode>): Use
sve_pred_fp_rhs1_operand and sve_pred_fp_rhs2_operand.
(*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_2_const)
(*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_any_const)
(*cond_add<SVE_F:mode>_2_const, *cond_add<SVE_F:mode>_any_const)
(*cond_sub<mode>_3_const, *cond_sub<mode>_any_const): New patterns.
gcc/testsuite/
* gcc.target/aarch64/sve/cond_fadd_1.c: New test.
* gcc.target/aarch64/sve/cond_fadd_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_2.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_2_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_3.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_3_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_4.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_4_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_1.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_2.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_2_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_3.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_3_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_4.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_4_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_1.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_2.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_2_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_3.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_3_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_4.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_4_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_1.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_2.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_2_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_3.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_3_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_4.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_4_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_1.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_2.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_2_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_3.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_3_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_4.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_4_run.c: Likewise.
Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
From-SVN: r274508
47 files changed, 1772 insertions, 18 deletions
diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 593e8fb..52ab8e5 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,6 +1,32 @@ 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> + * config/aarch64/aarch64.c (aarch64_print_vector_float_operand): + Print 2.0 naturally. + (aarch64_sve_float_mul_immediate_p): Return true for 2.0. + * config/aarch64/predicates.md + (aarch64_sve_float_negated_arith_immediate): New predicate, + renamed from aarch64_sve_float_arith_with_sub_immediate. + (aarch64_sve_float_arith_with_sub_immediate): Test for both + positive and negative constants. + (aarch64_sve_float_arith_with_sub_operand): Redefine as a register + or an aarch64_sve_float_arith_with_sub_immediate. + * config/aarch64/constraints.md (vsN): Use + aarch64_sve_float_negated_arith_immediate. + * config/aarch64/iterators.md (SVE_COND_FP_BINARY_I1): New int + iterator. + (sve_pred_fp_rhs2_immediate): New int attribute. + * config/aarch64/aarch64-sve.md + (cond_<SVE_COND_FP_BINARY:optab><SVE_F:mode>): Use + sve_pred_fp_rhs1_operand and sve_pred_fp_rhs2_operand. + (*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_2_const) + (*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_any_const) + (*cond_add<SVE_F:mode>_2_const, *cond_add<SVE_F:mode>_any_const) + (*cond_sub<mode>_3_const, *cond_sub<mode>_any_const): New patterns. + +2019-08-15 Richard Sandiford <richard.sandiford@arm.com> + Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> + * config/aarch64/aarch64-sve.md (*aarch64_cond_abd<SVE_F:mode>_2) (*aarch64_cond_abd<SVE_F:mode>_3) (*aarch64_cond_abd<SVE_F:mode>_any): New patterns. diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md index b1e2f24..d43ce52 100644 --- a/gcc/config/aarch64/aarch64-sve.md +++ b/gcc/config/aarch64/aarch64-sve.md @@ -2553,14 +2553,14 @@ ;; ---- [FP] General binary arithmetic corresponding to unspecs ;; ------------------------------------------------------------------------- ;; Includes merging forms of: -;; - FADD +;; - FADD (constant forms handled in the "Addition" section) ;; - FDIV ;; - FDIVR -;; - FMAXNM -;; - FMINNM -;; - FMUL -;; - FSUB -;; - FSUBR +;; - FMAXNM (including #0.0 and #1.0) +;; - FMINNM (including #0.0 and #1.0) +;; - FMUL (including #0.5 and #2.0) +;; - FSUB (constant forms handled in the "Addition" section) +;; - FSUBR (constant forms handled in the "Subtraction" section) ;; ------------------------------------------------------------------------- ;; Unpredicated floating-point binary operations. @@ -2603,8 +2603,8 @@ (unspec:SVE_F [(match_dup 1) (const_int SVE_STRICT_GP) - (match_operand:SVE_F 2 "register_operand") - (match_operand:SVE_F 3 "register_operand")] + (match_operand:SVE_F 2 "<sve_pred_fp_rhs1_operand>") + (match_operand:SVE_F 3 "<sve_pred_fp_rhs2_operand>")] SVE_COND_FP_BINARY) (match_operand:SVE_F 4 "aarch64_simd_reg_or_zero")] UNSPEC_SEL))] @@ -2635,6 +2635,30 @@ [(set_attr "movprfx" "*,yes")] ) +;; Same for operations that take a 1-bit constant. +(define_insn_and_rewrite "*cond_<optab><mode>_2_const" + [(set (match_operand:SVE_F 0 "register_operand" "=w, ?w") + (unspec:SVE_F + [(match_operand:<VPRED> 1 "register_operand" "Upl, Upl") + (unspec:SVE_F + [(match_operand 4) + (match_operand:SI 5 "aarch64_sve_gp_strictness") + (match_operand:SVE_F 2 "register_operand" "0, w") + (match_operand:SVE_F 3 "<sve_pred_fp_rhs2_immediate>")] + SVE_COND_FP_BINARY_I1) + (match_dup 2)] + UNSPEC_SEL))] + "TARGET_SVE && aarch64_sve_pred_dominates_p (&operands[4], operands[1])" + "@ + <sve_fp_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3 + movprfx\t%0, %2\;<sve_fp_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3" + "&& !rtx_equal_p (operands[1], operands[4])" + { + operands[4] = copy_rtx (operands[1]); + } + [(set_attr "movprfx" "*,yes")] +) + ;; Predicated floating-point operations, merging with the second input. (define_insn_and_rewrite "*cond_<optab><mode>_3" [(set (match_operand:SVE_F 0 "register_operand" "=w, ?&w") @@ -2700,6 +2724,44 @@ [(set_attr "movprfx" "yes")] ) +;; Same for operations that take a 1-bit constant. +(define_insn_and_rewrite "*cond_<optab><mode>_any_const" + [(set (match_operand:SVE_F 0 "register_operand" "=w, w, ?w") + (unspec:SVE_F + [(match_operand:<VPRED> 1 "register_operand" "Upl, Upl, Upl") + (unspec:SVE_F + [(match_operand 5) + (match_operand:SI 6 "aarch64_sve_gp_strictness") + (match_operand:SVE_F 2 "register_operand" "w, w, w") + (match_operand:SVE_F 3 "<sve_pred_fp_rhs2_immediate>")] + SVE_COND_FP_BINARY_I1) + (match_operand:SVE_F 4 "aarch64_simd_reg_or_zero" "Dz, 0, w")] + UNSPEC_SEL))] + "TARGET_SVE + && !rtx_equal_p (operands[2], operands[4]) + && aarch64_sve_pred_dominates_p (&operands[5], operands[1])" + "@ + movprfx\t%0.<Vetype>, %1/z, %2.<Vetype>\;<sve_fp_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3 + movprfx\t%0.<Vetype>, %1/m, %2.<Vetype>\;<sve_fp_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3 + #" + "&& 1" + { + if (reload_completed + && register_operand (operands[4], <MODE>mode) + && !rtx_equal_p (operands[0], operands[4])) + { + emit_insn (gen_vcond_mask_<mode><vpred> (operands[0], operands[2], + operands[4], operands[1])); + operands[4] = operands[2] = operands[0]; + } + else if (!rtx_equal_p (operands[1], operands[5])) + operands[5] = copy_rtx (operands[1]); + else + FAIL; + } + [(set_attr "movprfx" "yes")] +) + ;; ------------------------------------------------------------------------- ;; ---- [FP] Addition ;; ------------------------------------------------------------------------- @@ -2729,7 +2791,76 @@ [(set (match_dup 0) (plus:SVE_F (match_dup 2) (match_dup 3)))] ) -;; Merging forms are handled through SVE_COND_FP_BINARY. +;; Predicated floating-point addition of a constant, merging with the +;; first input. +(define_insn_and_rewrite "*cond_add<mode>_2_const" + [(set (match_operand:SVE_F 0 "register_operand" "=w, w, ?w, ?w") + (unspec:SVE_F + [(match_operand:<VPRED> 1 "register_operand" "Upl, Upl, Upl, Upl") + (unspec:SVE_F + [(match_operand 4) + (match_operand:SI 5 "aarch64_sve_gp_strictness") + (match_operand:SVE_F 2 "register_operand" "0, 0, w, w") + (match_operand:SVE_F 3 "aarch64_sve_float_arith_with_sub_immediate" "vsA, vsN, vsA, vsN")] + UNSPEC_COND_FADD) + (match_dup 2)] + UNSPEC_SEL))] + "TARGET_SVE && aarch64_sve_pred_dominates_p (&operands[4], operands[1])" + "@ + fadd\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3 + fsub\t%0.<Vetype>, %1/m, %0.<Vetype>, #%N3 + movprfx\t%0, %2\;fadd\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3 + movprfx\t%0, %2\;fsub\t%0.<Vetype>, %1/m, %0.<Vetype>, #%N3" + "&& !rtx_equal_p (operands[1], operands[4])" + { + operands[4] = copy_rtx (operands[1]); + } + [(set_attr "movprfx" "*,*,yes,yes")] +) + +;; Predicated floating-point addition of a constant, merging with an +;; independent value. +(define_insn_and_rewrite "*cond_add<mode>_any_const" + [(set (match_operand:SVE_F 0 "register_operand" "=w, w, w, w, ?w, ?w") + (unspec:SVE_F + [(match_operand:<VPRED> 1 "register_operand" "Upl, Upl, Upl, Upl, Upl, Upl") + (unspec:SVE_F + [(match_operand 5) + (match_operand:SI 6 "aarch64_sve_gp_strictness") + (match_operand:SVE_F 2 "register_operand" "w, w, w, w, w, w") + (match_operand:SVE_F 3 "aarch64_sve_float_arith_with_sub_immediate" "vsA, vsN, vsA, vsN, vsA, vsN")] + UNSPEC_COND_FADD) + (match_operand:SVE_F 4 "aarch64_simd_reg_or_zero" "Dz, Dz, 0, 0, w, w")] + UNSPEC_SEL))] + "TARGET_SVE + && !rtx_equal_p (operands[2], operands[4]) + && aarch64_sve_pred_dominates_p (&operands[5], operands[1])" + "@ + movprfx\t%0.<Vetype>, %1/z, %2.<Vetype>\;fadd\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3 + movprfx\t%0.<Vetype>, %1/z, %2.<Vetype>\;fsub\t%0.<Vetype>, %1/m, %0.<Vetype>, #%N3 + movprfx\t%0.<Vetype>, %1/m, %2.<Vetype>\;fadd\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3 + movprfx\t%0.<Vetype>, %1/m, %2.<Vetype>\;fsub\t%0.<Vetype>, %1/m, %0.<Vetype>, #%N3 + # + #" + "&& 1" + { + if (reload_completed + && register_operand (operands[4], <MODE>mode) + && !rtx_equal_p (operands[0], operands[4])) + { + emit_insn (gen_vcond_mask_<mode><vpred> (operands[0], operands[2], + operands[4], operands[1])); + operands[4] = operands[2] = operands[0]; + } + else if (!rtx_equal_p (operands[1], operands[5])) + operands[5] = copy_rtx (operands[1]); + else + FAIL; + } + [(set_attr "movprfx" "yes")] +) + +;; Register merging forms are handled through SVE_COND_FP_BINARY. ;; ------------------------------------------------------------------------- ;; ---- [FP] Subtraction @@ -2765,7 +2896,71 @@ [(set (match_dup 0) (minus:SVE_F (match_dup 2) (match_dup 3)))] ) -;; Merging forms are handled through SVE_COND_FP_BINARY. +;; Predicated floating-point subtraction from a constant, merging with the +;; second input. +(define_insn_and_rewrite "*cond_sub<mode>_3_const" + [(set (match_operand:SVE_F 0 "register_operand" "=w, ?w") + (unspec:SVE_F + [(match_operand:<VPRED> 1 "register_operand" "Upl, Upl") + (unspec:SVE_F + [(match_operand 4) + (match_operand:SI 5 "aarch64_sve_gp_strictness") + (match_operand:SVE_F 2 "aarch64_sve_float_arith_immediate") + (match_operand:SVE_F 3 "register_operand" "0, w")] + UNSPEC_COND_FSUB) + (match_dup 3)] + UNSPEC_SEL))] + "TARGET_SVE && aarch64_sve_pred_dominates_p (&operands[4], operands[1])" + "@ + fsubr\t%0.<Vetype>, %1/m, %0.<Vetype>, #%2 + movprfx\t%0, %3\;fsubr\t%0.<Vetype>, %1/m, %0.<Vetype>, #%2" + "&& !rtx_equal_p (operands[1], operands[4])" + { + operands[4] = copy_rtx (operands[1]); + } + [(set_attr "movprfx" "*,yes")] +) + +;; Predicated floating-point subtraction from a constant, merging with an +;; independent value. +(define_insn_and_rewrite "*cond_sub<mode>_any_const" + [(set (match_operand:SVE_F 0 "register_operand" "=w, w, ?w") + (unspec:SVE_F + [(match_operand:<VPRED> 1 "register_operand" "Upl, Upl, Upl") + (unspec:SVE_F + [(match_operand 5) + (match_operand:SI 6 "aarch64_sve_gp_strictness") + (match_operand:SVE_F 2 "aarch64_sve_float_arith_immediate") + (match_operand:SVE_F 3 "register_operand" "w, w, w")] + UNSPEC_COND_FSUB) + (match_operand:SVE_F 4 "aarch64_simd_reg_or_zero" "Dz, 0, w")] + UNSPEC_SEL))] + "TARGET_SVE + && !rtx_equal_p (operands[3], operands[4]) + && aarch64_sve_pred_dominates_p (&operands[5], operands[1])" + "@ + movprfx\t%0.<Vetype>, %1/z, %3.<Vetype>\;fsubr\t%0.<Vetype>, %1/m, %0.<Vetype>, #%2 + movprfx\t%0.<Vetype>, %1/m, %3.<Vetype>\;fsubr\t%0.<Vetype>, %1/m, %0.<Vetype>, #%2 + #" + "&& 1" + { + if (reload_completed + && register_operand (operands[4], <MODE>mode) + && !rtx_equal_p (operands[0], operands[4])) + { + emit_insn (gen_vcond_mask_<mode><vpred> (operands[0], operands[3], + operands[4], operands[1])); + operands[4] = operands[3] = operands[0]; + } + else if (!rtx_equal_p (operands[1], operands[5])) + operands[5] = copy_rtx (operands[1]); + else + FAIL; + } + [(set_attr "movprfx" "yes")] +) + +;; Register merging forms are handled through SVE_COND_FP_BINARY. ;; ------------------------------------------------------------------------- ;; ---- [FP] Absolute difference @@ -2939,7 +3134,8 @@ [(set (match_dup 0) (mult:SVE_F (match_dup 2) (match_dup 3)))] ) -;; Merging forms are handled through SVE_COND_FP_BINARY. +;; Merging forms are handled through SVE_COND_FP_BINARY and +;; SVE_COND_FP_BINARY_I1. ;; ------------------------------------------------------------------------- ;; ---- [FP] Binary logical operations @@ -3064,7 +3260,8 @@ [(set_attr "movprfx" "*,*,yes,yes")] ) -;; Merging forms are handled through SVE_COND_FP_BINARY. +;; Merging forms are handled through SVE_COND_FP_BINARY and +;; SVE_COND_FP_BINARY_I1. ;; ------------------------------------------------------------------------- ;; ---- [PRED] Binary logical operations diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 8e39225..81a267b 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -8289,6 +8289,8 @@ aarch64_print_vector_float_operand (FILE *f, rtx x, bool negate) fixed form in the assembly syntax. */ if (real_equal (&r, &dconst0)) asm_fprintf (f, "0.0"); + else if (real_equal (&r, &dconst2)) + asm_fprintf (f, "2.0"); else if (real_equal (&r, &dconst1)) asm_fprintf (f, "1.0"); else if (real_equal (&r, &dconsthalf)) @@ -15205,11 +15207,10 @@ aarch64_sve_float_mul_immediate_p (rtx x) { rtx elt; - /* GCC will never generate a multiply with an immediate of 2, so there is no - point testing for it (even though it is a valid constant). */ return (const_vec_duplicate_p (x, &elt) && GET_CODE (elt) == CONST_DOUBLE - && real_equal (CONST_DOUBLE_REAL_VALUE (elt), &dconsthalf)); + && (real_equal (CONST_DOUBLE_REAL_VALUE (elt), &dconsthalf) + || real_equal (CONST_DOUBLE_REAL_VALUE (elt), &dconst2))); } /* Return true if replicating VAL32 is a valid 2-byte or 4-byte immediate diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md index 28734b4..bb2fe8e 100644 --- a/gcc/config/aarch64/constraints.md +++ b/gcc/config/aarch64/constraints.md @@ -458,4 +458,4 @@ (define_constraint "vsN" "@internal A constraint that matches the negative of vsA" - (match_operand 0 "aarch64_sve_float_arith_with_sub_immediate")) + (match_operand 0 "aarch64_sve_float_negated_arith_immediate")) diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index 81dddb7..3187843 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -1713,6 +1713,10 @@ UNSPEC_COND_FMUL UNSPEC_COND_FSUB]) +(define_int_iterator SVE_COND_FP_BINARY_I1 [UNSPEC_COND_FMAXNM + UNSPEC_COND_FMINNM + UNSPEC_COND_FMUL]) + (define_int_iterator SVE_COND_FP_BINARY_REG [UNSPEC_COND_FDIV]) ;; Floating-point max/min operations that correspond to optabs, @@ -2108,3 +2112,9 @@ (UNSPEC_COND_FMINNM "aarch64_sve_float_maxmin_operand") (UNSPEC_COND_FMUL "aarch64_sve_float_mul_operand") (UNSPEC_COND_FSUB "register_operand")]) + +;; Likewise for immediates only. +(define_int_attr sve_pred_fp_rhs2_immediate + [(UNSPEC_COND_FMAXNM "aarch64_sve_float_maxmin_immediate") + (UNSPEC_COND_FMINNM "aarch64_sve_float_maxmin_immediate") + (UNSPEC_COND_FMUL "aarch64_sve_float_mul_immediate")]) diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md index 1a47708..98f28f5 100644 --- a/gcc/config/aarch64/predicates.md +++ b/gcc/config/aarch64/predicates.md @@ -663,10 +663,14 @@ (and (match_code "const,const_vector") (match_test "aarch64_sve_float_arith_immediate_p (op, false)"))) -(define_predicate "aarch64_sve_float_arith_with_sub_immediate" +(define_predicate "aarch64_sve_float_negated_arith_immediate" (and (match_code "const,const_vector") (match_test "aarch64_sve_float_arith_immediate_p (op, true)"))) +(define_predicate "aarch64_sve_float_arith_with_sub_immediate" + (ior (match_operand 0 "aarch64_sve_float_arith_immediate") + (match_operand 0 "aarch64_sve_float_negated_arith_immediate"))) + (define_predicate "aarch64_sve_float_mul_immediate" (and (match_code "const,const_vector") (match_test "aarch64_sve_float_mul_immediate_p (op)"))) @@ -730,7 +734,7 @@ (match_operand 0 "aarch64_sve_float_arith_immediate"))) (define_predicate "aarch64_sve_float_arith_with_sub_operand" - (ior (match_operand 0 "aarch64_sve_float_arith_operand") + (ior (match_operand 0 "register_operand") (match_operand 0 "aarch64_sve_float_arith_with_sub_immediate"))) (define_predicate "aarch64_sve_float_mul_operand" diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 62d5369..aad8e04 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,6 +1,50 @@ 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> + * gcc.target/aarch64/sve/cond_fadd_1.c: New test. + * gcc.target/aarch64/sve/cond_fadd_1_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fadd_2.c: Likewise. + * gcc.target/aarch64/sve/cond_fadd_2_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fadd_3.c: Likewise. + * gcc.target/aarch64/sve/cond_fadd_3_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fadd_4.c: Likewise. + * gcc.target/aarch64/sve/cond_fadd_4_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fsubr_1.c: Likewise. + * gcc.target/aarch64/sve/cond_fsubr_1_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fsubr_2.c: Likewise. + * gcc.target/aarch64/sve/cond_fsubr_2_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fsubr_3.c: Likewise. + * gcc.target/aarch64/sve/cond_fsubr_3_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fsubr_4.c: Likewise. + * gcc.target/aarch64/sve/cond_fsubr_4_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fmaxnm_1.c: Likewise. + * gcc.target/aarch64/sve/cond_fmaxnm_1_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fmaxnm_2.c: Likewise. + * gcc.target/aarch64/sve/cond_fmaxnm_2_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fmaxnm_3.c: Likewise. + * gcc.target/aarch64/sve/cond_fmaxnm_3_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fmaxnm_4.c: Likewise. + * gcc.target/aarch64/sve/cond_fmaxnm_4_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fminnm_1.c: Likewise. + * gcc.target/aarch64/sve/cond_fminnm_1_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fminnm_2.c: Likewise. + * gcc.target/aarch64/sve/cond_fminnm_2_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fminnm_3.c: Likewise. + * gcc.target/aarch64/sve/cond_fminnm_3_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fminnm_4.c: Likewise. + * gcc.target/aarch64/sve/cond_fminnm_4_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fmul_1.c: Likewise. + * gcc.target/aarch64/sve/cond_fmul_1_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fmul_2.c: Likewise. + * gcc.target/aarch64/sve/cond_fmul_2_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fmul_3.c: Likewise. + * gcc.target/aarch64/sve/cond_fmul_3_run.c: Likewise. + * gcc.target/aarch64/sve/cond_fmul_4.c: Likewise. + * gcc.target/aarch64/sve/cond_fmul_4_run.c: Likewise. + +2019-08-15 Richard Sandiford <richard.sandiford@arm.com> + Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> + * gcc.target/aarch64/sve/cond_fabd_1.c: New test. * gcc.target/aarch64/sve/cond_fabd_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_fabd_2.c: Likewise. diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_1.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_1.c new file mode 100644 index 0000000..d103e1f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_1.c @@ -0,0 +1,62 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include <stdint.h> + +#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict x, \ + TYPE *__restrict y, \ + PRED_TYPE *__restrict pred, \ + int n) \ + { \ + for (int i = 0; i < n; ++i) \ + x[i] = pred[i] != 1 ? y[i] + (TYPE) CONST : y[i]; \ + } + +#define TEST_TYPE(T, TYPE, PRED_TYPE) \ + T (TYPE, PRED_TYPE, half, 0.5) \ + T (TYPE, PRED_TYPE, one, 1.0) \ + T (TYPE, PRED_TYPE, two, 2.0) \ + T (TYPE, PRED_TYPE, minus_half, -0.5) \ + T (TYPE, PRED_TYPE, minus_one, -1.0) \ + T (TYPE, PRED_TYPE, minus_two, -2.0) + +#define TEST_ALL(T) \ + TEST_TYPE (T, _Float16, int16_t) \ + TEST_TYPE (T, float, int32_t) \ + TEST_TYPE (T, double, int64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #-2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #-2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #-2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_1_run.c new file mode 100644 index 0000000..956ae14 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_1_run.c @@ -0,0 +1,32 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_fadd_1.c" + +#define N 99 + +#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + { \ + TYPE x[N], y[N]; \ + PRED_TYPE pred[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + y[i] = i * i; \ + pred[i] = i % 3; \ + } \ + test_##TYPE##_##NAME (x, y, pred, N); \ + for (int i = 0; i < N; ++i) \ + { \ + TYPE expected = i % 3 != 1 ? y[i] + (TYPE) CONST : y[i]; \ + if (x[i] != expected) \ + __builtin_abort (); \ + asm volatile ("" ::: "memory"); \ + } \ + } + +int +main (void) +{ + TEST_ALL (TEST_LOOP) + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_2.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_2.c new file mode 100644 index 0000000..b7d02f4 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_2.c @@ -0,0 +1,56 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include <stdint.h> + +#define DEF_LOOP(TYPE, NAME, CONST) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict x, \ + TYPE *__restrict y, \ + TYPE *__restrict z, \ + int n) \ + { \ + for (int i = 0; i < n; ++i) \ + x[i] = y[i] < 8 ? z[i] + (TYPE) CONST : y[i]; \ + } + +#define TEST_TYPE(T, TYPE) \ + T (TYPE, half, 0.5) \ + T (TYPE, one, 1.0) \ + T (TYPE, two, 2.0) \ + T (TYPE, minus_half, -0.5) \ + T (TYPE, minus_one, -1.0) \ + T (TYPE, minus_two, -2.0) + +#define TEST_ALL(T) \ + TEST_TYPE (T, float) \ + TEST_TYPE (T, double) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #-2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #-2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */ + +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 6 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 6 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_2_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_2_run.c new file mode 100644 index 0000000..debf395 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_2_run.c @@ -0,0 +1,31 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_fadd_2.c" + +#define N 99 + +#define TEST_LOOP(TYPE, NAME, CONST) \ + { \ + TYPE x[N], y[N], z[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + y[i] = i % 13; \ + z[i] = i * i; \ + } \ + test_##TYPE##_##NAME (x, y, z, N); \ + for (int i = 0; i < N; ++i) \ + { \ + TYPE expected = y[i] < 8 ? z[i] + (TYPE) CONST : y[i]; \ + if (x[i] != expected) \ + __builtin_abort (); \ + asm volatile ("" ::: "memory"); \ + } \ + } + +int +main (void) +{ + TEST_ALL (TEST_LOOP) + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_3.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_3.c new file mode 100644 index 0000000..aec0e5a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_3.c @@ -0,0 +1,65 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include <stdint.h> + +#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict x, \ + TYPE *__restrict y, \ + PRED_TYPE *__restrict pred, \ + int n) \ + { \ + for (int i = 0; i < n; ++i) \ + x[i] = pred[i] != 1 ? y[i] + (TYPE) CONST : 4; \ + } + +#define TEST_TYPE(T, TYPE, PRED_TYPE) \ + T (TYPE, PRED_TYPE, half, 0.5) \ + T (TYPE, PRED_TYPE, one, 1.0) \ + T (TYPE, PRED_TYPE, two, 2.0) \ + T (TYPE, PRED_TYPE, minus_half, -0.5) \ + T (TYPE, PRED_TYPE, minus_one, -1.0) \ + T (TYPE, PRED_TYPE, minus_two, -2.0) + +#define TEST_ALL(T) \ + TEST_TYPE (T, _Float16, int16_t) \ + TEST_TYPE (T, float, int32_t) \ + TEST_TYPE (T, double, int64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #-2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #-2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #-2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */ + +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 6 } } */ +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 6 } } */ +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 6 } } */ + +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_3_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_3_run.c new file mode 100644 index 0000000..d5268c5 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_3_run.c @@ -0,0 +1,32 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_fadd_3.c" + +#define N 99 + +#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + { \ + TYPE x[N], y[N]; \ + PRED_TYPE pred[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + y[i] = i * i; \ + pred[i] = i % 3; \ + } \ + test_##TYPE##_##NAME (x, y, pred, N); \ + for (int i = 0; i < N; ++i) \ + { \ + TYPE expected = i % 3 != 1 ? y[i] + (TYPE) CONST : 4; \ + if (x[i] != expected) \ + __builtin_abort (); \ + asm volatile ("" ::: "memory"); \ + } \ + } + +int +main (void) +{ + TEST_ALL (TEST_LOOP) + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_4.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_4.c new file mode 100644 index 0000000..bb276c1 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_4.c @@ -0,0 +1,64 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include <stdint.h> + +#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict x, \ + TYPE *__restrict y, \ + PRED_TYPE *__restrict pred, \ + int n) \ + { \ + for (int i = 0; i < n; ++i) \ + x[i] = pred[i] != 1 ? y[i] + (TYPE) CONST : 0; \ + } + +#define TEST_TYPE(T, TYPE, PRED_TYPE) \ + T (TYPE, PRED_TYPE, half, 0.5) \ + T (TYPE, PRED_TYPE, one, 1.0) \ + T (TYPE, PRED_TYPE, two, 2.0) \ + T (TYPE, PRED_TYPE, minus_half, -0.5) \ + T (TYPE, PRED_TYPE, minus_one, -1.0) \ + T (TYPE, PRED_TYPE, minus_two, -2.0) + +#define TEST_ALL(T) \ + TEST_TYPE (T, _Float16, int16_t) \ + TEST_TYPE (T, float, int32_t) \ + TEST_TYPE (T, double, int64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #-2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #-2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #-2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */ + +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 6 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 6 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_4_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_4_run.c new file mode 100644 index 0000000..4ea8be6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_4_run.c @@ -0,0 +1,32 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_fadd_4.c" + +#define N 99 + +#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + { \ + TYPE x[N], y[N]; \ + PRED_TYPE pred[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + y[i] = i * i; \ + pred[i] = i % 3; \ + } \ + test_##TYPE##_##NAME (x, y, pred, N); \ + for (int i = 0; i < N; ++i) \ + { \ + TYPE expected = i % 3 != 1 ? y[i] + (TYPE) CONST : 0; \ + if (x[i] != expected) \ + __builtin_abort (); \ + asm volatile ("" ::: "memory"); \ + } \ + } + +int +main (void) +{ + TEST_ALL (TEST_LOOP) + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_1.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_1.c new file mode 100644 index 0000000..d0db090 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_1.c @@ -0,0 +1,55 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ + +#include <stdint.h> + +#ifndef FN +#define FN(X) __builtin_fmax##X +#endif + +#define DEF_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict x, \ + TYPE *__restrict y, \ + PRED_TYPE *__restrict pred, \ + int n) \ + { \ + for (int i = 0; i < n; ++i) \ + x[i] = pred[i] != 1 ? FN (y[i], CONST) : y[i]; \ + } + +#define TEST_TYPE(T, FN, TYPE, PRED_TYPE) \ + T (FN, TYPE, PRED_TYPE, zero, 0) \ + T (FN, TYPE, PRED_TYPE, one, 1) \ + T (FN, TYPE, PRED_TYPE, two, 2) + +#define TEST_ALL(T) \ + TEST_TYPE (T, FN (f16), _Float16, int16_t) \ + TEST_TYPE (T, FN (f32), float, int32_t) \ + TEST_TYPE (T, FN (f64), double, int64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_1_run.c new file mode 100644 index 0000000..00a3c41 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_1_run.c @@ -0,0 +1,32 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ + +#include "cond_fmaxnm_1.c" + +#define N 99 + +#define TEST_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \ + { \ + TYPE x[N], y[N]; \ + PRED_TYPE pred[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + y[i] = i * i; \ + pred[i] = i % 3; \ + } \ + test_##TYPE##_##NAME (x, y, pred, N); \ + for (int i = 0; i < N; ++i) \ + { \ + TYPE expected = i % 3 != 1 ? FN (y[i], CONST) : y[i]; \ + if (x[i] != expected) \ + __builtin_abort (); \ + asm volatile ("" ::: "memory"); \ + } \ + } + +int +main (void) +{ + TEST_ALL (TEST_LOOP) + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_2.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_2.c new file mode 100644 index 0000000..0b535d1 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_2.c @@ -0,0 +1,48 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ + +#include <stdint.h> + +#ifndef FN +#define FN(X) __builtin_fmax##X +#endif + +#define DEF_LOOP(FN, TYPE, NAME, CONST) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict x, \ + TYPE *__restrict y, \ + TYPE *__restrict z, \ + int n) \ + { \ + for (int i = 0; i < n; ++i) \ + x[i] = y[i] < 8 ? FN (z[i], CONST) : y[i]; \ + } + +#define TEST_TYPE(T, FN, TYPE) \ + T (FN, TYPE, zero, 0) \ + T (FN, TYPE, one, 1) \ + T (FN, TYPE, two, 2) + +#define TEST_ALL(T) \ + TEST_TYPE (T, FN (f32), float) \ + TEST_TYPE (T, FN (f64), double) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 3 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_2_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_2_run.c new file mode 100644 index 0000000..9eb4d80 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_2_run.c @@ -0,0 +1,31 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ + +#include "cond_fmaxnm_2.c" + +#define N 99 + +#define TEST_LOOP(FN, TYPE, NAME, CONST) \ + { \ + TYPE x[N], y[N], z[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + y[i] = i % 13; \ + z[i] = i * i; \ + } \ + test_##TYPE##_##NAME (x, y, z, N); \ + for (int i = 0; i < N; ++i) \ + { \ + TYPE expected = y[i] < 8 ? FN (z[i], CONST) : y[i]; \ + if (x[i] != expected) \ + __builtin_abort (); \ + asm volatile ("" ::: "memory"); \ + } \ + } + +int +main (void) +{ + TEST_ALL (TEST_LOOP) + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_3.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_3.c new file mode 100644 index 0000000..741f8f6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_3.c @@ -0,0 +1,54 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ + +#include <stdint.h> + +#ifndef FN +#define FN(X) __builtin_fmax##X +#endif + +#define DEF_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict x, \ + TYPE *__restrict y, \ + PRED_TYPE *__restrict pred, \ + int n) \ + { \ + for (int i = 0; i < n; ++i) \ + x[i] = pred[i] != 1 ? FN (y[i], CONST) : 4; \ + } + +#define TEST_TYPE(T, FN, TYPE, PRED_TYPE) \ + T (FN, TYPE, PRED_TYPE, zero, 0) \ + T (FN, TYPE, PRED_TYPE, one, 1) \ + T (FN, TYPE, PRED_TYPE, two, 2) + +#define TEST_ALL(T) \ + TEST_TYPE (T, FN (f16), _Float16, int16_t) \ + TEST_TYPE (T, FN (f32), float, int32_t) \ + TEST_TYPE (T, FN (f64), double, int64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */ + +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_3_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_3_run.c new file mode 100644 index 0000000..4aac75f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_3_run.c @@ -0,0 +1,32 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ + +#include "cond_fmaxnm_3.c" + +#define N 99 + +#define TEST_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \ + { \ + TYPE x[N], y[N]; \ + PRED_TYPE pred[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + y[i] = i * i; \ + pred[i] = i % 3; \ + } \ + test_##TYPE##_##NAME (x, y, pred, N); \ + for (int i = 0; i < N; ++i) \ + { \ + TYPE expected = i % 3 != 1 ? FN (y[i], CONST) : 4; \ + if (x[i] != expected) \ + __builtin_abort (); \ + asm volatile ("" ::: "memory"); \ + } \ + } + +int +main (void) +{ + TEST_ALL (TEST_LOOP) + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_4.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_4.c new file mode 100644 index 0000000..83a53c7 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_4.c @@ -0,0 +1,53 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ + +#include <stdint.h> + +#ifndef FN +#define FN(X) __builtin_fmax##X +#endif + +#define DEF_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict x, \ + TYPE *__restrict y, \ + PRED_TYPE *__restrict pred, \ + int n) \ + { \ + for (int i = 0; i < n; ++i) \ + x[i] = pred[i] != 1 ? FN (y[i], CONST) : 0; \ + } + +#define TEST_TYPE(T, FN, TYPE, PRED_TYPE) \ + T (FN, TYPE, PRED_TYPE, zero, 0) \ + T (FN, TYPE, PRED_TYPE, one, 1) \ + T (FN, TYPE, PRED_TYPE, two, 2) + +#define TEST_ALL(T) \ + TEST_TYPE (T, FN (f16), _Float16, int16_t) \ + TEST_TYPE (T, FN (f32), float, int32_t) \ + TEST_TYPE (T, FN (f64), double, int64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 3 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_4_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_4_run.c new file mode 100644 index 0000000..e1d9043 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_4_run.c @@ -0,0 +1,32 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ + +#include "cond_fmaxnm_4.c" + +#define N 99 + +#define TEST_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \ + { \ + TYPE x[N], y[N]; \ + PRED_TYPE pred[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + y[i] = i * i; \ + pred[i] = i % 3; \ + } \ + test_##TYPE##_##NAME (x, y, pred, N); \ + for (int i = 0; i < N; ++i) \ + { \ + TYPE expected = i % 3 != 1 ? FN (y[i], CONST) : 0; \ + if (x[i] != expected) \ + __builtin_abort (); \ + asm volatile ("" ::: "memory"); \ + } \ + } + +int +main (void) +{ + TEST_ALL (TEST_LOOP) + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_1.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_1.c new file mode 100644 index 0000000..d667b20 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_1.c @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ + +#define FN(X) __builtin_fmin##X +#include "cond_fmaxnm_1.c" + +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_1_run.c new file mode 100644 index 0000000..5df2ff8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_1_run.c @@ -0,0 +1,5 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ + +#define FN(X) __builtin_fmin##X +#include "cond_fmaxnm_1_run.c" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_2.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_2.c new file mode 100644 index 0000000..d66a84b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_2.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ + +#define FN(X) __builtin_fmin##X +#include "cond_fmaxnm_2.c" + +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 3 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_2_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_2_run.c new file mode 100644 index 0000000..79a98bb --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_2_run.c @@ -0,0 +1,5 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ + +#define FN(X) __builtin_fmin##X +#include "cond_fmaxnm_2_run.c" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_3.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_3.c new file mode 100644 index 0000000..d39dd18 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_3.c @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ + +#define FN(X) __builtin_fmin##X +#include "cond_fmaxnm_3.c" + +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */ + +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_3_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_3_run.c new file mode 100644 index 0000000..ca1a047 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_3_run.c @@ -0,0 +1,5 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ + +#define FN(X) __builtin_fmin##X +#include "cond_fmaxnm_3_run.c" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_4.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_4.c new file mode 100644 index 0000000..fff6fdd --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_4.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ + +#define FN(X) __builtin_fmin##X +#include "cond_fmaxnm_4.c" + +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 3 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_4_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_4_run.c new file mode 100644 index 0000000..b945d04 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_4_run.c @@ -0,0 +1,5 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ + +#define FN(X) __builtin_fmin##X +#include "cond_fmaxnm_4_run.c" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_1.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_1.c new file mode 100644 index 0000000..ce417ed --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_1.c @@ -0,0 +1,47 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include <stdint.h> + +#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict x, \ + TYPE *__restrict y, \ + PRED_TYPE *__restrict pred, \ + int n) \ + { \ + for (int i = 0; i < n; ++i) \ + x[i] = pred[i] != 1 ? y[i] * (TYPE) CONST : y[i]; \ + } + +#define TEST_TYPE(T, TYPE, PRED_TYPE) \ + T (TYPE, PRED_TYPE, half, 0.5) \ + T (TYPE, PRED_TYPE, two, 2.0) \ + T (TYPE, PRED_TYPE, four, 4.0) + +#define TEST_ALL(T) \ + TEST_TYPE (T, _Float16, int16_t) \ + TEST_TYPE (T, float, int32_t) \ + TEST_TYPE (T, double, int64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #2\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #2\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #2\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #4\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #4\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #4\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_1_run.c new file mode 100644 index 0000000..9ca5b50 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_1_run.c @@ -0,0 +1,32 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_fmul_1.c" + +#define N 99 + +#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + { \ + TYPE x[N], y[N]; \ + PRED_TYPE pred[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + y[i] = i * i; \ + pred[i] = i % 3; \ + } \ + test_##TYPE##_##NAME (x, y, pred, N); \ + for (int i = 0; i < N; ++i) \ + { \ + TYPE expected = i % 3 != 1 ? y[i] * (TYPE) CONST : y[i]; \ + if (x[i] != expected) \ + __builtin_abort (); \ + asm volatile ("" ::: "memory"); \ + } \ + } + +int +main (void) +{ + TEST_ALL (TEST_LOOP) + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_2.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_2.c new file mode 100644 index 0000000..cbf9d13 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_2.c @@ -0,0 +1,44 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include <stdint.h> + +#define DEF_LOOP(TYPE, NAME, CONST) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict x, \ + TYPE *__restrict y, \ + TYPE *__restrict z, \ + int n) \ + { \ + for (int i = 0; i < n; ++i) \ + x[i] = y[i] < 8 ? z[i] * (TYPE) CONST : y[i]; \ + } + +#define TEST_TYPE(T, TYPE) \ + T (TYPE, half, 0.5) \ + T (TYPE, two, 2.0) \ + T (TYPE, four, 4.0) + +#define TEST_ALL(T) \ + TEST_TYPE (T, float) \ + TEST_TYPE (T, double) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #2\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #2\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #4\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #4\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 3 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_2_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_2_run.c new file mode 100644 index 0000000..44b283b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_2_run.c @@ -0,0 +1,31 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_fmul_2.c" + +#define N 99 + +#define TEST_LOOP(TYPE, NAME, CONST) \ + { \ + TYPE x[N], y[N], z[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + y[i] = i % 13; \ + z[i] = i * i; \ + } \ + test_##TYPE##_##NAME (x, y, z, N); \ + for (int i = 0; i < N; ++i) \ + { \ + TYPE expected = y[i] < 8 ? z[i] * (TYPE) CONST : y[i]; \ + if (x[i] != expected) \ + __builtin_abort (); \ + asm volatile ("" ::: "memory"); \ + } \ + } + +int +main (void) +{ + TEST_ALL (TEST_LOOP) + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_3.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_3.c new file mode 100644 index 0000000..4da147e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_3.c @@ -0,0 +1,50 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include <stdint.h> + +#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict x, \ + TYPE *__restrict y, \ + PRED_TYPE *__restrict pred, \ + int n) \ + { \ + for (int i = 0; i < n; ++i) \ + x[i] = pred[i] != 1 ? y[i] * (TYPE) CONST : 8; \ + } + +#define TEST_TYPE(T, TYPE, PRED_TYPE) \ + T (TYPE, PRED_TYPE, half, 0.5) \ + T (TYPE, PRED_TYPE, two, 2.0) \ + T (TYPE, PRED_TYPE, four, 4.0) + +#define TEST_ALL(T) \ + TEST_TYPE (T, _Float16, int16_t) \ + TEST_TYPE (T, float, int32_t) \ + TEST_TYPE (T, double, int64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #2\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #2\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #2\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #4\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #4\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #4\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */ + +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_3_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_3_run.c new file mode 100644 index 0000000..9b81d43 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_3_run.c @@ -0,0 +1,32 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_fmul_3.c" + +#define N 99 + +#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + { \ + TYPE x[N], y[N]; \ + PRED_TYPE pred[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + y[i] = i * i; \ + pred[i] = i % 3; \ + } \ + test_##TYPE##_##NAME (x, y, pred, N); \ + for (int i = 0; i < N; ++i) \ + { \ + TYPE expected = i % 3 != 1 ? y[i] * (TYPE) CONST : 8; \ + if (x[i] != expected) \ + __builtin_abort (); \ + asm volatile ("" ::: "memory"); \ + } \ + } + +int +main (void) +{ + TEST_ALL (TEST_LOOP) + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_4.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_4.c new file mode 100644 index 0000000..c4fdb2b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_4.c @@ -0,0 +1,49 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include <stdint.h> + +#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict x, \ + TYPE *__restrict y, \ + PRED_TYPE *__restrict pred, \ + int n) \ + { \ + for (int i = 0; i < n; ++i) \ + x[i] = pred[i] != 1 ? y[i] * (TYPE) CONST : 0; \ + } + +#define TEST_TYPE(T, TYPE, PRED_TYPE) \ + T (TYPE, PRED_TYPE, half, 0.5) \ + T (TYPE, PRED_TYPE, two, 2.0) \ + T (TYPE, PRED_TYPE, four, 4.0) + +#define TEST_ALL(T) \ + TEST_TYPE (T, _Float16, int16_t) \ + TEST_TYPE (T, float, int32_t) \ + TEST_TYPE (T, double, int64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #2\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #2\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #2\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #4\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #4\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #4\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 3 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_4_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_4_run.c new file mode 100644 index 0000000..b93e031 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_4_run.c @@ -0,0 +1,32 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_fmul_4.c" + +#define N 99 + +#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + { \ + TYPE x[N], y[N]; \ + PRED_TYPE pred[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + y[i] = i * i; \ + pred[i] = i % 3; \ + } \ + test_##TYPE##_##NAME (x, y, pred, N); \ + for (int i = 0; i < N; ++i) \ + { \ + TYPE expected = i % 3 != 1 ? y[i] * (TYPE) CONST : 0; \ + if (x[i] != expected) \ + __builtin_abort (); \ + asm volatile ("" ::: "memory"); \ + } \ + } + +int +main (void) +{ + TEST_ALL (TEST_LOOP) + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_1.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_1.c new file mode 100644 index 0000000..8e7172a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_1.c @@ -0,0 +1,47 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include <stdint.h> + +#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict x, \ + TYPE *__restrict y, \ + PRED_TYPE *__restrict pred, \ + int n) \ + { \ + for (int i = 0; i < n; ++i) \ + x[i] = pred[i] != 1 ? (TYPE) CONST - y[i] : y[i]; \ + } + +#define TEST_TYPE(T, TYPE, PRED_TYPE) \ + T (TYPE, PRED_TYPE, half, 0.5) \ + T (TYPE, PRED_TYPE, one, 1.0) \ + T (TYPE, PRED_TYPE, two, 2.0) + +#define TEST_ALL(T) \ + TEST_TYPE (T, _Float16, int16_t) \ + TEST_TYPE (T, float, int32_t) \ + TEST_TYPE (T, double, int64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_1_run.c new file mode 100644 index 0000000..61ffac4 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_1_run.c @@ -0,0 +1,32 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_fsubr_1.c" + +#define N 99 + +#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + { \ + TYPE x[N], y[N]; \ + PRED_TYPE pred[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + y[i] = i * i; \ + pred[i] = i % 3; \ + } \ + test_##TYPE##_##NAME (x, y, pred, N); \ + for (int i = 0; i < N; ++i) \ + { \ + TYPE expected = i % 3 != 1 ? (TYPE) CONST - y[i] : y[i]; \ + if (x[i] != expected) \ + __builtin_abort (); \ + asm volatile ("" ::: "memory"); \ + } \ + } + +int +main (void) +{ + TEST_ALL (TEST_LOOP) + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_2.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_2.c new file mode 100644 index 0000000..6d2efde --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_2.c @@ -0,0 +1,44 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include <stdint.h> + +#define DEF_LOOP(TYPE, NAME, CONST) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict x, \ + TYPE *__restrict y, \ + TYPE *__restrict z, \ + int n) \ + { \ + for (int i = 0; i < n; ++i) \ + x[i] = y[i] < 8 ? (TYPE) CONST - z[i] : y[i]; \ + } + +#define TEST_TYPE(T, TYPE) \ + T (TYPE, half, 0.5) \ + T (TYPE, one, 1.0) \ + T (TYPE, two, 2.0) + +#define TEST_ALL(T) \ + TEST_TYPE (T, float) \ + TEST_TYPE (T, double) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 3 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_2_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_2_run.c new file mode 100644 index 0000000..1b25392 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_2_run.c @@ -0,0 +1,31 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_fsubr_2.c" + +#define N 99 + +#define TEST_LOOP(TYPE, NAME, CONST) \ + { \ + TYPE x[N], y[N], z[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + y[i] = i % 13; \ + z[i] = i * i; \ + } \ + test_##TYPE##_##NAME (x, y, z, N); \ + for (int i = 0; i < N; ++i) \ + { \ + TYPE expected = y[i] < 8 ? (TYPE) CONST - z[i] : y[i]; \ + if (x[i] != expected) \ + __builtin_abort (); \ + asm volatile ("" ::: "memory"); \ + } \ + } + +int +main (void) +{ + TEST_ALL (TEST_LOOP) + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_3.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_3.c new file mode 100644 index 0000000..328af57 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_3.c @@ -0,0 +1,50 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include <stdint.h> + +#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict x, \ + TYPE *__restrict y, \ + PRED_TYPE *__restrict pred, \ + int n) \ + { \ + for (int i = 0; i < n; ++i) \ + x[i] = pred[i] != 1 ? (TYPE) CONST - y[i] : 4; \ + } + +#define TEST_TYPE(T, TYPE, PRED_TYPE) \ + T (TYPE, PRED_TYPE, half, 0.5) \ + T (TYPE, PRED_TYPE, one, 1.0) \ + T (TYPE, PRED_TYPE, two, 2.0) + +#define TEST_ALL(T) \ + TEST_TYPE (T, _Float16, int16_t) \ + TEST_TYPE (T, float, int32_t) \ + TEST_TYPE (T, double, int64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */ + +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_3_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_3_run.c new file mode 100644 index 0000000..8978287 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_3_run.c @@ -0,0 +1,32 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_fsubr_3.c" + +#define N 99 + +#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + { \ + TYPE x[N], y[N]; \ + PRED_TYPE pred[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + y[i] = i * i; \ + pred[i] = i % 3; \ + } \ + test_##TYPE##_##NAME (x, y, pred, N); \ + for (int i = 0; i < N; ++i) \ + { \ + TYPE expected = i % 3 != 1 ? (TYPE) CONST - y[i] : 4; \ + if (x[i] != expected) \ + __builtin_abort (); \ + asm volatile ("" ::: "memory"); \ + } \ + } + +int +main (void) +{ + TEST_ALL (TEST_LOOP) + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_4.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_4.c new file mode 100644 index 0000000..1d420b1 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_4.c @@ -0,0 +1,49 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include <stdint.h> + +#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##NAME (TYPE *__restrict x, \ + TYPE *__restrict y, \ + PRED_TYPE *__restrict pred, \ + int n) \ + { \ + for (int i = 0; i < n; ++i) \ + x[i] = pred[i] != 1 ? (TYPE) CONST - y[i] : 0; \ + } + +#define TEST_TYPE(T, TYPE, PRED_TYPE) \ + T (TYPE, PRED_TYPE, half, 0.5) \ + T (TYPE, PRED_TYPE, one, 1.0) \ + T (TYPE, PRED_TYPE, two, 2.0) + +#define TEST_ALL(T) \ + TEST_TYPE (T, _Float16, int16_t) \ + TEST_TYPE (T, float, int32_t) \ + TEST_TYPE (T, double, int64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */ + +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 3 } } */ +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 3 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ +/* { dg-final { scan-assembler-not {\tsel\t} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_4_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_4_run.c new file mode 100644 index 0000000..2cb3409 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_4_run.c @@ -0,0 +1,32 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "cond_fsubr_4.c" + +#define N 99 + +#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \ + { \ + TYPE x[N], y[N]; \ + PRED_TYPE pred[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + y[i] = i * i; \ + pred[i] = i % 3; \ + } \ + test_##TYPE##_##NAME (x, y, pred, N); \ + for (int i = 0; i < N; ++i) \ + { \ + TYPE expected = i % 3 != 1 ? (TYPE) CONST - y[i] : 0; \ + if (x[i] != expected) \ + __builtin_abort (); \ + asm volatile ("" ::: "memory"); \ + } \ + } + +int +main (void) +{ + TEST_ALL (TEST_LOOP) + return 0; +} |