author     Jennifer Schmitz <jschmitz@nvidia.com>  2024-11-15 07:45:59 -0800
committer  Jennifer Schmitz <jschmitz@nvidia.com>  2024-12-06 08:35:13 +0100
commit     5289540ed58e42ae66255e31f22afe4ca0a6e15e (patch)
tree       87c134a2f286dca051d422bf011d7569c74c8524 /gcc
parent     8772f37e45e9401c9a361548e00c9691424e75e0 (diff)
SVE intrinsics: Fold calls with pfalse predicate.
If an SVE intrinsic has a pfalse predicate, we can fold the call to
a simplified assignment statement: for _m predication, the LHS can be assigned
the operand for inactive values; for _z, we can assign a zero vector; and
for _x, the returned values are arbitrary, so, as suggested by
Richard Sandiford, we also fold to a zero vector.
For example,
svint32_t foo (svint32_t op1, svint32_t op2)
{
  return svadd_s32_m (svpfalse_b (), op1, op2);
}
can be folded to lhs = op1, so that foo compiles to just a RET.
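The _z and _x forms fold analogously to a zero vector. For illustration (not taken
verbatim from the patch, but showing the behavior it implements):

svint32_t foo_z (svint32_t op1, svint32_t op2)
{
  return svadd_s32_z (svpfalse_b (), op1, op2);
}

is folded to lhs = {0, ...}, so foo_z compiles to a single zeroing move of the
return register followed by a RET.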
For implicit predication, a case distinction is necessary:
Intrinsics that read from memory can be folded to a zero vector.
Intrinsics that write to memory or prefetch can be folded to a no-op.
Other intrinsics need a case-by-case implementation, which we added in
the corresponding svxxx_impl::fold.
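For illustration, two implicitly predicated calls and their folds (hypothetical
function names; the same cases are exercised by the new pfalse-load.c and
pfalse-prefetch.c tests):

svint32_t load_f (const int32_t *base)
{
  /* Reads from memory: folded to lhs = {0, ...}.  */
  return svldnt1_s32 (svpfalse_b (), base);
}

void prefetch_f (const void *base)
{
  /* Prefetches: folded to a no-op, leaving an empty function.  */
  svprfb (svpfalse_b (), base, SV_PLDL1KEEP);
}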
We implemented this optimization during gimple folding by calling a new method
gimple_folder::fold_pfalse from gimple_folder::fold, which covers the generic
cases described above.
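In outline, the generic folding proceeds as follows (condensed from the new
gimple_folder::fold_pfalse in aarch64-sve-builtins.cc; see the diff below for the
exact code and comments):

gimple *
gimple_folder::fold_pfalse ()
{
  if (pred == PRED_none)
    return nullptr;
  tree arg0 = gimple_call_arg (call, 0);
  if (pred == PRED_m)
    {
      /* Unary _m shapes pass the inactive vector first; other shapes fold
         to op1.  */
      tree arg1 = gimple_call_arg (call, 1);
      if (is_pfalse (arg1))
        return fold_call_to (arg0);
      if (is_pfalse (arg0))
        return fold_call_to (arg1);
      return nullptr;
    }
  if ((pred == PRED_x || pred == PRED_z) && is_pfalse (arg0))
    return fold_call_to (build_zero_cst (TREE_TYPE (lhs)));
  if (pred == PRED_implicit && is_pfalse (arg0))
    {
      unsigned int flags = call_properties ();
      /* lhs = {0, ...} is not appropriate for aggregate return types.  */
      if ((flags & CP_READ_MEMORY) && !AGGREGATE_TYPE_P (TREE_TYPE (lhs)))
        return fold_call_to (build_zero_cst (TREE_TYPE (lhs)));
      if (flags & (CP_WRITE_MEMORY | CP_PREFETCH_MEMORY))
        return fold_to_stmt_vops (gimple_build_nop ());
    }
  return nullptr;
}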
We tested the new behavior for each intrinsic with all supported predications
and data types and checked the produced assembly. There is a test file
for each shape subclass with scan-assembler-times tests that look for
the simplified instruction sequences, such as individual RET instructions
or zeroing moves. There is an additional directive counting the total number of
functions in the test, which must be the sum of counts of all other
directives. This is to check that all tested intrinsics were optimized.
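For illustration, a minimal file following this scheme (it reuses the regex patterns
of the new tests but is not itself part of the patch):

/* { dg-do compile } */
/* { dg-require-effective-target elf } */
/* { dg-options "-O2" } */

#include <arm_sve.h>

svint32_t add_m (svint32_t op1, svint32_t op2)
{
  return svadd_s32_m (svpfalse_b (), op1, op2);	/* folds to lhs = op1, i.e. RET only  */
}

svint32_t add_z (svint32_t op1, svint32_t op2)
{
  return svadd_s32_z (svpfalse_b (), op1, op2);	/* folds to lhs = {0, ...}  */
}

/* One directive per expected simplified sequence ...  */
/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 1 } } */
/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 1 } } */
/* ... and one counting all functions, which must equal the sum of the others.  */
/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 2 } } */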
A few intrinsics were not covered by this patch:
- svlasta and svlastb already have an implementation to cover a pfalse
predicate. No changes were made to them.
- svld1/2/3/4 return aggregate types and were excluded from the case
that folds calls with implicit predication to lhs = {0, ...}.
- svst1/2/3/4 already have an implementation in svstx_impl that precedes
our optimization, so the new folding is not triggered for them.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/ChangeLog:
PR target/106329
* config/aarch64/aarch64-sve-builtins-base.cc
(svac_impl::fold): Add folding for pfalse predicates.
(svadda_impl::fold): Likewise.
(class svaddv_impl): Likewise.
(class svandv_impl): Likewise.
(svclast_impl::fold): Likewise.
(svcmp_impl::fold): Likewise.
(svcmp_wide_impl::fold): Likewise.
(svcmpuo_impl::fold): Likewise.
(svcntp_impl::fold): Likewise.
(class svcompact_impl): Likewise.
(class svcvtnt_impl): Likewise.
(class sveorv_impl): Likewise.
(class svminv_impl): Likewise.
(class svmaxnmv_impl): Likewise.
(class svmaxv_impl): Likewise.
(class svminnmv_impl): Likewise.
(class svorv_impl): Likewise.
(svpfirst_svpnext_impl::fold): Likewise.
(svptest_impl::fold): Likewise.
(class svsplice_impl): Likewise.
* config/aarch64/aarch64-sve-builtins-sve2.cc
(class svcvtxnt_impl): Likewise.
(svmatch_svnmatch_impl::fold): Likewise.
* config/aarch64/aarch64-sve-builtins.cc
(is_pfalse): Return true if tree is pfalse.
(gimple_folder::fold_pfalse): Fold calls with pfalse predicate.
(gimple_folder::fold_call_to): Fold call to lhs = t for given tree t.
(gimple_folder::fold_to_stmt_vops): Helper function that folds the
call to given stmt and adjusts virtual operands.
(gimple_folder::fold): Call fold_pfalse.
* config/aarch64/aarch64-sve-builtins.h (is_pfalse): Declare is_pfalse.
gcc/testsuite/ChangeLog:
PR target/106329
* gcc.target/aarch64/pfalse-binary_0.h: New test.
* gcc.target/aarch64/pfalse-unary_0.h: New test.
* gcc.target/aarch64/sve/pfalse-binary.c: New test.
* gcc.target/aarch64/sve/pfalse-binary_int_opt_n.c: New test.
* gcc.target/aarch64/sve/pfalse-binary_opt_n.c: New test.
* gcc.target/aarch64/sve/pfalse-binary_opt_single_n.c: New test.
* gcc.target/aarch64/sve/pfalse-binary_rotate.c: New test.
* gcc.target/aarch64/sve/pfalse-binary_uint64_opt_n.c: New test.
* gcc.target/aarch64/sve/pfalse-binary_uint_opt_n.c: New test.
* gcc.target/aarch64/sve/pfalse-binaryxn.c: New test.
* gcc.target/aarch64/sve/pfalse-clast.c: New test.
* gcc.target/aarch64/sve/pfalse-compare_opt_n.c: New test.
* gcc.target/aarch64/sve/pfalse-compare_wide_opt_n.c: New test.
* gcc.target/aarch64/sve/pfalse-count_pred.c: New test.
* gcc.target/aarch64/sve/pfalse-fold_left.c: New test.
* gcc.target/aarch64/sve/pfalse-load.c: New test.
* gcc.target/aarch64/sve/pfalse-load_ext.c: New test.
* gcc.target/aarch64/sve/pfalse-load_ext_gather_index.c: New test.
* gcc.target/aarch64/sve/pfalse-load_ext_gather_offset.c: New test.
* gcc.target/aarch64/sve/pfalse-load_gather_sv.c: New test.
* gcc.target/aarch64/sve/pfalse-load_gather_vs.c: New test.
* gcc.target/aarch64/sve/pfalse-load_replicate.c: New test.
* gcc.target/aarch64/sve/pfalse-prefetch.c: New test.
* gcc.target/aarch64/sve/pfalse-prefetch_gather_index.c: New test.
* gcc.target/aarch64/sve/pfalse-prefetch_gather_offset.c: New test.
* gcc.target/aarch64/sve/pfalse-ptest.c: New test.
* gcc.target/aarch64/sve/pfalse-rdffr.c: New test.
* gcc.target/aarch64/sve/pfalse-reduction.c: New test.
* gcc.target/aarch64/sve/pfalse-reduction_wide.c: New test.
* gcc.target/aarch64/sve/pfalse-shift_right_imm.c: New test.
* gcc.target/aarch64/sve/pfalse-store.c: New test.
* gcc.target/aarch64/sve/pfalse-store_scatter_index.c: New test.
* gcc.target/aarch64/sve/pfalse-store_scatter_offset.c: New test.
* gcc.target/aarch64/sve/pfalse-storexn.c: New test.
* gcc.target/aarch64/sve/pfalse-ternary_opt_n.c: New test.
* gcc.target/aarch64/sve/pfalse-ternary_rotate.c: New test.
* gcc.target/aarch64/sve/pfalse-unary.c: New test.
* gcc.target/aarch64/sve/pfalse-unary_convert_narrowt.c: New test.
* gcc.target/aarch64/sve/pfalse-unary_convertxn.c: New test.
* gcc.target/aarch64/sve/pfalse-unary_n.c: New test.
* gcc.target/aarch64/sve/pfalse-unary_pred.c: New test.
* gcc.target/aarch64/sve/pfalse-unary_to_uint.c: New test.
* gcc.target/aarch64/sve/pfalse-unaryxn.c: New test.
* gcc.target/aarch64/sve2/pfalse-binary.c: New test.
* gcc.target/aarch64/sve2/pfalse-binary_int_opt_n.c: New test.
* gcc.target/aarch64/sve2/pfalse-binary_int_opt_single_n.c: New test.
* gcc.target/aarch64/sve2/pfalse-binary_opt_n.c: New test.
* gcc.target/aarch64/sve2/pfalse-binary_opt_single_n.c: New test.
* gcc.target/aarch64/sve2/pfalse-binary_to_uint.c: New test.
* gcc.target/aarch64/sve2/pfalse-binary_uint_opt_n.c: New test.
* gcc.target/aarch64/sve2/pfalse-binary_wide.c: New test.
* gcc.target/aarch64/sve2/pfalse-compare.c: New test.
* gcc.target/aarch64/sve2/pfalse-load_ext_gather_index_restricted.c:
New test.
* gcc.target/aarch64/sve2/pfalse-load_ext_gather_offset_restricted.c:
New test.
* gcc.target/aarch64/sve2/pfalse-load_gather_sv_restricted.c: New test.
* gcc.target/aarch64/sve2/pfalse-load_gather_vs.c: New test.
* gcc.target/aarch64/sve2/pfalse-shift_left_imm_to_uint.c: New test.
* gcc.target/aarch64/sve2/pfalse-shift_right_imm.c: New test.
* gcc.target/aarch64/sve2/pfalse-store_scatter_index_restricted.c:
New test.
* gcc.target/aarch64/sve2/pfalse-store_scatter_offset_restricted.c:
New test.
* gcc.target/aarch64/sve2/pfalse-unary.c: New test.
* gcc.target/aarch64/sve2/pfalse-unary_convert.c: New test.
* gcc.target/aarch64/sve2/pfalse-unary_convert_narrowt.c: New test.
* gcc.target/aarch64/sve2/pfalse-unary_to_int.c: New test.
Diffstat (limited to 'gcc'): 68 files changed, 2472 insertions, 14 deletions
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc index 13e020b..927c5bb 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc @@ -201,6 +201,15 @@ class svac_impl : public function_base public: CONSTEXPR svac_impl (int unspec) : m_unspec (unspec) {} + gimple * + fold (gimple_folder &f) const override + { + tree pg = gimple_call_arg (f.call, 0); + if (is_pfalse (pg)) + return f.fold_call_to (pg); + return NULL; + } + rtx expand (function_expander &e) const override { @@ -216,6 +225,14 @@ public: class svadda_impl : public function_base { public: + gimple * + fold (gimple_folder &f) const override + { + if (is_pfalse (gimple_call_arg (f.call, 0))) + return f.fold_call_to (gimple_call_arg (f.call, 1)); + return NULL; + } + rtx expand (function_expander &e) const override { @@ -227,6 +244,21 @@ public: } }; +class svaddv_impl : public reduction +{ +public: + CONSTEXPR svaddv_impl () + : reduction (UNSPEC_SADDV, UNSPEC_UADDV, UNSPEC_FADDV) {} + + gimple * + fold (gimple_folder &f) const override + { + if (is_pfalse (gimple_call_arg (f.call, 0))) + return f.fold_call_to (build_zero_cst (TREE_TYPE (f.lhs))); + return NULL; + } +}; + /* Implements svadr[bhwd]. */ class svadr_bhwd_impl : public function_base { @@ -245,11 +277,25 @@ public: e.args.quick_push (expand_vector_broadcast (mode, shift)); return e.use_exact_insn (code_for_aarch64_adr_shift (mode)); } - /* How many bits left to shift the vector displacement. */ unsigned int m_shift; }; + +class svandv_impl : public reduction +{ +public: + CONSTEXPR svandv_impl () : reduction (UNSPEC_ANDV) {} + + gimple * + fold (gimple_folder &f) const override + { + if (is_pfalse (gimple_call_arg (f.call, 0))) + return f.fold_call_to (build_all_ones_cst (TREE_TYPE (f.lhs))); + return NULL; + } +}; + class svbic_impl : public function_base { public: @@ -333,6 +379,14 @@ class svclast_impl : public quiet<function_base> public: CONSTEXPR svclast_impl (int unspec) : m_unspec (unspec) {} + gimple * + fold (gimple_folder &f) const override + { + if (is_pfalse (gimple_call_arg (f.call, 0))) + return f.fold_call_to (gimple_call_arg (f.call, 1)); + return NULL; + } + rtx expand (function_expander &e) const override { @@ -425,6 +479,8 @@ public: return gimple_build_assign (f.lhs, m_code, rhs1, rhs2); } + if (is_pfalse (pg)) + return f.fold_call_to (pg); return NULL; } @@ -464,6 +520,15 @@ public: : m_code (code), m_unspec_for_sint (unspec_for_sint), m_unspec_for_uint (unspec_for_uint) {} + gimple * + fold (gimple_folder &f) const override + { + tree pg = gimple_call_arg (f.call, 0); + if (is_pfalse (pg)) + return f.fold_call_to (pg); + return NULL; + } + rtx expand (function_expander &e) const override { @@ -502,6 +567,16 @@ public: class svcmpuo_impl : public quiet<function_base> { public: + + gimple * + fold (gimple_folder &f) const override + { + tree pg = gimple_call_arg (f.call, 0); + if (is_pfalse (pg)) + return f.fold_call_to (pg); + return NULL; + } + rtx expand (function_expander &e) const override { @@ -598,6 +673,16 @@ public: class svcntp_impl : public function_base { public: + + gimple * + fold (gimple_folder &f) const override + { + tree pg = gimple_call_arg (f.call, 0); + if (is_pfalse (pg)) + return f.fold_call_to (build_zero_cst (TREE_TYPE (f.lhs))); + return NULL; + } + rtx expand (function_expander &e) const override { @@ -613,6 +698,19 @@ public: } }; +class svcompact_impl + : public QUIET_CODE_FOR_MODE0 (aarch64_sve_compact) 
+{ +public: + gimple * + fold (gimple_folder &f) const override + { + if (is_pfalse (gimple_call_arg (f.call, 0))) + return f.fold_call_to (build_zero_cst (TREE_TYPE (f.lhs))); + return NULL; + } +}; + /* Implements svcreate2, svcreate3 and svcreate4. */ class svcreate_impl : public quiet<multi_vector_function> { @@ -749,6 +847,18 @@ public: } }; +class svcvtnt_impl : public CODE_FOR_MODE0 (aarch64_sve_cvtnt) +{ +public: + gimple * + fold (gimple_folder &f) const override + { + if (f.pred == PRED_x && is_pfalse (gimple_call_arg (f.call, 1))) + f.fold_call_to (build_zero_cst (TREE_TYPE (f.lhs))); + return NULL; + } +}; + class svdiv_impl : public rtx_code_function { public: @@ -1155,6 +1265,20 @@ public: } }; +class sveorv_impl : public reduction +{ +public: + CONSTEXPR sveorv_impl () : reduction (UNSPEC_XORV) {} + + gimple * + fold (gimple_folder &f) const override + { + if (is_pfalse (gimple_call_arg (f.call, 0))) + return f.fold_call_to (build_zero_cst (TREE_TYPE (f.lhs))); + return NULL; + } +}; + /* Implements svextb, svexth and svextw. */ class svext_bhw_impl : public function_base { @@ -1397,7 +1521,8 @@ public: BIT_FIELD_REF lowers to Advanced SIMD element extract, so we have to ensure the index of the element being accessed is in the range of a Advanced SIMD vector width. */ - gimple *fold (gimple_folder & f) const override + gimple * + fold (gimple_folder & f) const override { tree pred = gimple_call_arg (f.call, 0); tree val = gimple_call_arg (f.call, 1); @@ -1973,6 +2098,80 @@ public: } }; +class svminv_impl : public reduction +{ +public: + CONSTEXPR svminv_impl () + : reduction (UNSPEC_SMINV, UNSPEC_UMINV, UNSPEC_FMINV) {} + + gimple * + fold (gimple_folder &f) const override + { + if (is_pfalse (gimple_call_arg (f.call, 0))) + { + tree rhs = f.type_suffix (0).integer_p + ? TYPE_MAX_VALUE (TREE_TYPE (f.lhs)) + : build_real (TREE_TYPE (f.lhs), dconstinf); + return f.fold_call_to (rhs); + } + return NULL; + } +}; + +class svmaxnmv_impl : public reduction +{ +public: + CONSTEXPR svmaxnmv_impl () : reduction (UNSPEC_FMAXNMV) {} + gimple * + fold (gimple_folder &f) const override + { + if (is_pfalse (gimple_call_arg (f.call, 0))) + { + REAL_VALUE_TYPE rnan = dconst0; + rnan.cl = rvc_nan; + return f.fold_call_to (build_real (TREE_TYPE (f.lhs), rnan)); + } + return NULL; + } +}; + +class svmaxv_impl : public reduction +{ +public: + CONSTEXPR svmaxv_impl () + : reduction (UNSPEC_SMAXV, UNSPEC_UMAXV, UNSPEC_FMAXV) {} + + gimple * + fold (gimple_folder &f) const override + { + if (is_pfalse (gimple_call_arg (f.call, 0))) + { + tree rhs = f.type_suffix (0).integer_p + ? 
TYPE_MIN_VALUE (TREE_TYPE (f.lhs)) + : build_real (TREE_TYPE (f.lhs), dconstninf); + return f.fold_call_to (rhs); + } + return NULL; + } +}; + +class svminnmv_impl : public reduction +{ +public: + CONSTEXPR svminnmv_impl () : reduction (UNSPEC_FMINNMV) {} + gimple * + fold (gimple_folder &f) const override + { + if (is_pfalse (gimple_call_arg (f.call, 0))) + { + REAL_VALUE_TYPE rnan = dconst0; + rnan.cl = rvc_nan; + return f.fold_call_to (build_real (TREE_TYPE (f.lhs), rnan)); + } + return NULL; + } +}; + class svmla_impl : public function_base { public: @@ -2222,6 +2421,20 @@ public: } }; +class svorv_impl : public reduction +{ +public: + CONSTEXPR svorv_impl () : reduction (UNSPEC_IORV) {} + + gimple * + fold (gimple_folder &f) const override + { + if (is_pfalse (gimple_call_arg (f.call, 0))) + return f.fold_call_to (build_zero_cst (TREE_TYPE (f.lhs))); + return NULL; + } +}; + class svpfalse_impl : public function_base { public: @@ -2246,6 +2459,16 @@ class svpfirst_svpnext_impl : public function_base { public: CONSTEXPR svpfirst_svpnext_impl (int unspec) : m_unspec (unspec) {} + gimple * + fold (gimple_folder &f) const override + { + tree pg = gimple_call_arg (f.call, 0); + if (is_pfalse (pg)) + return f.fold_call_to (m_unspec == UNSPEC_PFIRST + ? gimple_call_arg (f.call, 1) + : pg); + return NULL; + } rtx expand (function_expander &e) const override @@ -2326,6 +2549,13 @@ class svptest_impl : public function_base { public: CONSTEXPR svptest_impl (rtx_code compare) : m_compare (compare) {} + gimple * + fold (gimple_folder &f) const override + { + if (is_pfalse (gimple_call_arg (f.call, 0))) + return f.fold_call_to (boolean_false_node); + return NULL; + } rtx expand (function_expander &e) const override @@ -2741,6 +2971,18 @@ public: } }; +class svsplice_impl : public QUIET_CODE_FOR_MODE0 (aarch64_sve_splice) +{ +public: + gimple * + fold (gimple_folder &f) const override + { + if (is_pfalse (gimple_call_arg (f.call, 0))) + return f.fold_call_to (gimple_call_arg (f.call, 2)); + return NULL; + } +}; + class svst1_impl : public full_width_access { public: @@ -3194,13 +3436,13 @@ FUNCTION (svacle, svac_impl, (UNSPEC_COND_FCMLE)) FUNCTION (svaclt, svac_impl, (UNSPEC_COND_FCMLT)) FUNCTION (svadd, rtx_code_function, (PLUS, PLUS, UNSPEC_COND_FADD)) FUNCTION (svadda, svadda_impl,) -FUNCTION (svaddv, reduction, (UNSPEC_SADDV, UNSPEC_UADDV, UNSPEC_FADDV)) +FUNCTION (svaddv, svaddv_impl,) FUNCTION (svadrb, svadr_bhwd_impl, (0)) FUNCTION (svadrd, svadr_bhwd_impl, (3)) FUNCTION (svadrh, svadr_bhwd_impl, (1)) FUNCTION (svadrw, svadr_bhwd_impl, (2)) FUNCTION (svand, rtx_code_function, (AND, AND)) -FUNCTION (svandv, reduction, (UNSPEC_ANDV)) +FUNCTION (svandv, svandv_impl,) FUNCTION (svasr, rtx_code_function, (ASHIFTRT, ASHIFTRT)) FUNCTION (svasr_wide, shift_wide, (ASHIFTRT, UNSPEC_ASHIFTRT_WIDE)) FUNCTION (svasrd, unspec_based_function, (UNSPEC_ASRD, -1, -1)) @@ -3257,12 +3499,12 @@ FUNCTION (svcnth_pat, svcnt_bhwd_pat_impl, (VNx8HImode)) FUNCTION (svcntp, svcntp_impl,) FUNCTION (svcntw, svcnt_bhwd_impl, (VNx4SImode)) FUNCTION (svcntw_pat, svcnt_bhwd_pat_impl, (VNx4SImode)) -FUNCTION (svcompact, QUIET_CODE_FOR_MODE0 (aarch64_sve_compact),) +FUNCTION (svcompact, svcompact_impl,) FUNCTION (svcreate2, svcreate_impl, (2)) FUNCTION (svcreate3, svcreate_impl, (3)) FUNCTION (svcreate4, svcreate_impl, (4)) FUNCTION (svcvt, svcvt_impl,) -FUNCTION (svcvtnt, CODE_FOR_MODE0 (aarch64_sve_cvtnt),) +FUNCTION (svcvtnt, svcvtnt_impl,) FUNCTION (svdiv, svdiv_impl,) FUNCTION (svdivr, rtx_code_function_rotated, (DIV, 
UDIV, UNSPEC_COND_FDIV)) FUNCTION (svdot, svdot_impl,) @@ -3273,7 +3515,7 @@ FUNCTION (svdup_lane, svdup_lane_impl,) FUNCTION (svdupq, svdupq_impl,) FUNCTION (svdupq_lane, svdupq_lane_impl,) FUNCTION (sveor, rtx_code_function, (XOR, XOR, -1)) -FUNCTION (sveorv, reduction, (UNSPEC_XORV)) +FUNCTION (sveorv, sveorv_impl,) FUNCTION (svexpa, unspec_based_function, (-1, -1, UNSPEC_FEXPA)) FUNCTION (svext, QUIET_CODE_FOR_MODE0 (aarch64_sve_ext),) FUNCTION (svextb, svext_bhw_impl, (QImode)) @@ -3337,14 +3579,14 @@ FUNCTION (svmax, rtx_code_function, (SMAX, UMAX, UNSPEC_COND_FMAX, UNSPEC_FMAX)) FUNCTION (svmaxnm, cond_or_uncond_unspec_function, (UNSPEC_COND_FMAXNM, UNSPEC_FMAXNM)) -FUNCTION (svmaxnmv, reduction, (UNSPEC_FMAXNMV)) -FUNCTION (svmaxv, reduction, (UNSPEC_SMAXV, UNSPEC_UMAXV, UNSPEC_FMAXV)) +FUNCTION (svmaxnmv, svmaxnmv_impl,) +FUNCTION (svmaxv, svmaxv_impl,) FUNCTION (svmin, rtx_code_function, (SMIN, UMIN, UNSPEC_COND_FMIN, UNSPEC_FMIN)) FUNCTION (svminnm, cond_or_uncond_unspec_function, (UNSPEC_COND_FMINNM, UNSPEC_FMINNM)) -FUNCTION (svminnmv, reduction, (UNSPEC_FMINNMV)) -FUNCTION (svminv, reduction, (UNSPEC_SMINV, UNSPEC_UMINV, UNSPEC_FMINV)) +FUNCTION (svminnmv, svminnmv_impl,) +FUNCTION (svminv, svminv_impl,) FUNCTION (svmla, svmla_impl,) FUNCTION (svmla_lane, svmla_lane_impl,) FUNCTION (svmls, svmls_impl,) @@ -3367,7 +3609,7 @@ FUNCTION (svnor, svnor_impl,) FUNCTION (svnot, svnot_impl,) FUNCTION (svorn, svorn_impl,) FUNCTION (svorr, rtx_code_function, (IOR, IOR)) -FUNCTION (svorv, reduction, (UNSPEC_IORV)) +FUNCTION (svorv, svorv_impl,) FUNCTION (svpfalse, svpfalse_impl,) FUNCTION (svpfirst, svpfirst_svpnext_impl, (UNSPEC_PFIRST)) FUNCTION (svpnext, svpfirst_svpnext_impl, (UNSPEC_PNEXT)) @@ -3429,7 +3671,7 @@ FUNCTION (svset2, svset_impl, (2)) FUNCTION (svset3, svset_impl, (3)) FUNCTION (svset4, svset_impl, (4)) FUNCTION (svsetffr, svsetffr_impl,) -FUNCTION (svsplice, QUIET_CODE_FOR_MODE0 (aarch64_sve_splice),) +FUNCTION (svsplice, svsplice_impl,) FUNCTION (svsqrt, rtx_code_function, (SQRT, SQRT, UNSPEC_COND_FSQRT)) FUNCTION (svst1, svst1_impl,) FUNCTION (svst1_scatter, svst1_scatter_impl,) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc index 0eda53d..cb9a77d 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc @@ -252,6 +252,18 @@ public: } }; +class svcvtxnt_impl : public CODE_FOR_MODE1 (aarch64_sve2_cvtxnt) +{ +public: + gimple * + fold (gimple_folder &f) const override + { + if (f.pred == PRED_x && is_pfalse (gimple_call_arg (f.call, 1))) + return f.fold_call_to (build_zero_cst (TREE_TYPE (f.lhs))); + return NULL; + } +}; + class svdup_laneq_impl : public function_base { public: @@ -389,6 +401,14 @@ class svmatch_svnmatch_impl : public function_base { public: CONSTEXPR svmatch_svnmatch_impl (int unspec) : m_unspec (unspec) {} + gimple * + fold (gimple_folder &f) const override + { + tree pg = gimple_call_arg (f.call, 0); + if (is_pfalse (pg)) + return f.fold_call_to (pg); + return NULL; + } rtx expand (function_expander &e) const override @@ -952,7 +972,7 @@ FUNCTION (svcvtlt, unspec_based_function, (-1, -1, UNSPEC_COND_FCVTLT)) FUNCTION (svcvtn, svcvtn_impl,) FUNCTION (svcvtnb, fixed_insn_function, (CODE_FOR_aarch64_sve2_fp8_cvtnbvnx16qi)) FUNCTION (svcvtx, unspec_based_function, (-1, -1, UNSPEC_COND_FCVTX)) -FUNCTION (svcvtxnt, CODE_FOR_MODE1 (aarch64_sve2_cvtxnt),) +FUNCTION (svcvtxnt, svcvtxnt_impl,) FUNCTION (svdup_laneq, svdup_laneq_impl,) 
FUNCTION (sveor3, CODE_FOR_MODE0 (aarch64_sve2_eor3),) FUNCTION (sveorbt, unspec_based_function, (UNSPEC_EORBT, UNSPEC_EORBT, -1)) diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index 8e94a2d..8714acc 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -3518,6 +3518,15 @@ is_ptrue (tree v, unsigned int step) && vector_cst_all_same (v, step)); } +/* Return true if V is a constant predicate that acts as a pfalse. */ +bool +is_pfalse (tree v) +{ + return (TREE_CODE (v) == VECTOR_CST + && TYPE_MODE (TREE_TYPE (v)) == VNx16BImode + && integer_zerop (v)); +} + gimple_folder::gimple_folder (const function_instance &instance, tree fndecl, gimple_stmt_iterator *gsi_in, gcall *call_in) : function_call_info (gimple_location (call_in), instance, fndecl), @@ -3623,6 +3632,46 @@ gimple_folder::redirect_pred_x () return redirect_call (instance); } +/* Fold calls with predicate pfalse: + _m predication: lhs = op1. + _x or _z: lhs = {0, ...}. + Implicit predication that reads from memory: lhs = {0, ...}. + Implicit predication that writes to memory or prefetches: no-op. + Return the new gimple statement on success, else NULL. */ +gimple * +gimple_folder::fold_pfalse () +{ + if (pred == PRED_none) + return nullptr; + tree arg0 = gimple_call_arg (call, 0); + if (pred == PRED_m) + { + /* Unary function shapes with _m predication are folded to the + inactive vector (arg0), while other function shapes are folded + to op1 (arg1). */ + tree arg1 = gimple_call_arg (call, 1); + if (is_pfalse (arg1)) + return fold_call_to (arg0); + if (is_pfalse (arg0)) + return fold_call_to (arg1); + return nullptr; + } + if ((pred == PRED_x || pred == PRED_z) && is_pfalse (arg0)) + return fold_call_to (build_zero_cst (TREE_TYPE (lhs))); + if (pred == PRED_implicit && is_pfalse (arg0)) + { + unsigned int flags = call_properties (); + /* Folding to lhs = {0, ...} is not appropriate for intrinsics with + AGGREGATE types as lhs. */ + if ((flags & CP_READ_MEMORY) + && !AGGREGATE_TYPE_P (TREE_TYPE (lhs))) + return fold_call_to (build_zero_cst (TREE_TYPE (lhs))); + if (flags & (CP_WRITE_MEMORY | CP_PREFETCH_MEMORY)) + return fold_to_stmt_vops (gimple_build_nop ()); + } + return nullptr; +} + /* Fold the call to constant VAL. */ gimple * gimple_folder::fold_to_cstu (poly_uint64 val) @@ -3725,6 +3774,27 @@ gimple_folder::fold_active_lanes_to (tree x) return gimple_build_assign (lhs, VEC_COND_EXPR, pred, x, vec_inactive); } +/* Fold call to assignment statement lhs = t. */ +gimple * +gimple_folder::fold_call_to (tree t) +{ + if (types_compatible_p (TREE_TYPE (lhs), TREE_TYPE (t))) + return fold_to_stmt_vops (gimple_build_assign (lhs, t)); + + tree rhs = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), t); + return fold_to_stmt_vops (gimple_build_assign (lhs, VIEW_CONVERT_EXPR, rhs)); +} + +/* Fold call to G, incl. adjustments to the virtual operands. */ +gimple * +gimple_folder::fold_to_stmt_vops (gimple *g) +{ + gimple_seq stmts = NULL; + gimple_seq_add_stmt_without_update (&stmts, g); + gsi_replace_with_seq_vops (gsi, stmts); + return g; +} + /* Try to fold the call. Return the new statement on success and null on failure. */ gimple * @@ -3744,6 +3814,8 @@ gimple_folder::fold () /* First try some simplifications that are common to many functions. 
*/ if (auto *call = redirect_pred_x ()) return call; + if (auto *call = fold_pfalse ()) + return call; return base->fold (*this); } diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h index 1d0ca39..6f22868 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.h +++ b/gcc/config/aarch64/aarch64-sve-builtins.h @@ -648,6 +648,7 @@ public: gcall *redirect_call (const function_instance &); gimple *redirect_pred_x (); + gimple *fold_pfalse (); gimple *fold_to_cstu (poly_uint64); gimple *fold_to_pfalse (); @@ -655,6 +656,8 @@ public: gimple *fold_to_vl_pred (unsigned int); gimple *fold_const_binary (enum tree_code); gimple *fold_active_lanes_to (tree); + gimple *fold_call_to (tree); + gimple *fold_to_stmt_vops (gimple *); gimple *fold (); @@ -848,6 +851,7 @@ extern tree acle_svprfop; bool vector_cst_all_same (tree, unsigned int); bool is_ptrue (tree, unsigned int); +bool is_pfalse (tree); const function_instance *lookup_fndecl (tree); /* Try to find a mode with the given mode_suffix_info fields. Return the diff --git a/gcc/testsuite/gcc.target/aarch64/pfalse-binary_0.h b/gcc/testsuite/gcc.target/aarch64/pfalse-binary_0.h new file mode 100644 index 0000000..72fb5b6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pfalse-binary_0.h @@ -0,0 +1,240 @@ +#include <arm_sve.h> + +#define MXZ(F, RTY, TY1, TY2) \ + RTY F##_f (TY1 op1, TY2 op2) \ + { \ + return sv##F (svpfalse_b (), op1, op2); \ + } + +#define PRED_MXv(F, RTY, TYPE1, TYPE2, TY) \ + MXZ (F##_##TY##_m, RTY, TYPE1, sv##TYPE2) \ + MXZ (F##_##TY##_x, RTY, TYPE1, sv##TYPE2) + +#define PRED_Zv(F, RTY, TYPE1, TYPE2, TY) \ + MXZ (F##_##TY##_z, RTY, TYPE1, sv##TYPE2) + +#define PRED_MXZv(F, RTY, TYPE1, TYPE2, TY) \ + PRED_MXv (F, RTY, TYPE1, TYPE2, TY) \ + PRED_Zv (F, RTY, TYPE1, TYPE2, TY) + +#define PRED_Z(F, RTY, TYPE1, TYPE2, TY) \ + PRED_Zv (F, RTY, TYPE1, TYPE2, TY) \ + MXZ (F##_n_##TY##_z, RTY, TYPE1, TYPE2) + +#define PRED_MXZ(F, RTY, TYPE1, TYPE2, TY) \ + PRED_MXv (F, RTY, TYPE1, TYPE2, TY) \ + MXZ (F##_n_##TY##_m, RTY, TYPE1, TYPE2) \ + MXZ (F##_n_##TY##_x, RTY, TYPE1, TYPE2) \ + PRED_Z (F, RTY, TYPE1, TYPE2, TY) + +#define PRED_IMPLICITv(F, RTY, TYPE1, TYPE2, TY) \ + MXZ (F##_##TY, RTY, TYPE1, sv##TYPE2) + +#define PRED_IMPLICIT(F, RTY, TYPE1, TYPE2, TY) \ + PRED_IMPLICITv (F, RTY, TYPE1, TYPE2, TY) \ + MXZ (F##_n_##TY, RTY, TYPE1, TYPE2) + +#define ALL_Q_INTEGER(F, P) \ + PRED_##P (F, svuint8_t, svuint8_t, uint8_t, u8) \ + PRED_##P (F, svint8_t, svint8_t, int8_t, s8) + +#define ALL_Q_INTEGER_UINT(F, P) \ + PRED_##P (F, svuint8_t, svuint8_t, uint8_t, u8) \ + PRED_##P (F, svint8_t, svint8_t, uint8_t, s8) + +#define ALL_Q_INTEGER_INT(F, P) \ + PRED_##P (F, svuint8_t, svuint8_t, int8_t, u8) \ + PRED_##P (F, svint8_t, svint8_t, int8_t, s8) + +#define ALL_Q_INTEGER_BOOL(F, P) \ + PRED_##P (F, svbool_t, svuint8_t, uint8_t, u8) \ + PRED_##P (F, svbool_t, svint8_t, int8_t, s8) + +#define ALL_H_INTEGER(F, P) \ + PRED_##P (F, svuint16_t, svuint16_t, uint16_t, u16) \ + PRED_##P (F, svint16_t, svint16_t, int16_t, s16) + +#define ALL_H_INTEGER_UINT(F, P) \ + PRED_##P (F, svuint16_t, svuint16_t, uint16_t, u16) \ + PRED_##P (F, svint16_t, svint16_t, uint16_t, s16) + +#define ALL_H_INTEGER_INT(F, P) \ + PRED_##P (F, svuint16_t, svuint16_t, int16_t, u16) \ + PRED_##P (F, svint16_t, svint16_t, int16_t, s16) + +#define ALL_H_INTEGER_WIDE(F, P) \ + PRED_##P (F, svuint16_t, svuint16_t, uint8_t, u16) \ + PRED_##P (F, svint16_t, svint16_t, int8_t, s16) + +#define ALL_H_INTEGER_BOOL(F, P) \ + PRED_##P (F, 
svbool_t, svuint16_t, uint16_t, u16) \ + PRED_##P (F, svbool_t, svint16_t, int16_t, s16) + +#define ALL_S_INTEGER(F, P) \ + PRED_##P (F, svuint32_t, svuint32_t, uint32_t, u32) \ + PRED_##P (F, svint32_t, svint32_t, int32_t, s32) + +#define ALL_S_INTEGER_UINT(F, P) \ + PRED_##P (F, svuint32_t, svuint32_t, uint32_t, u32) \ + PRED_##P (F, svint32_t, svint32_t, uint32_t, s32) + +#define ALL_S_INTEGER_INT(F, P) \ + PRED_##P (F, svuint32_t, svuint32_t, int32_t, u32) \ + PRED_##P (F, svint32_t, svint32_t, int32_t, s32) + +#define ALL_S_INTEGER_WIDE(F, P) \ + PRED_##P (F, svuint32_t, svuint32_t, uint16_t, u32) \ + PRED_##P (F, svint32_t, svint32_t, int16_t, s32) + +#define ALL_S_INTEGER_BOOL(F, P) \ + PRED_##P (F, svbool_t, svuint32_t, uint32_t, u32) \ + PRED_##P (F, svbool_t, svint32_t, int32_t, s32) + +#define ALL_D_INTEGER(F, P) \ + PRED_##P (F, svuint64_t, svuint64_t, uint64_t, u64) \ + PRED_##P (F, svint64_t, svint64_t, int64_t, s64) + +#define ALL_D_INTEGER_UINT(F, P) \ + PRED_##P (F, svuint64_t, svuint64_t, uint64_t, u64) \ + PRED_##P (F, svint64_t, svint64_t, uint64_t, s64) + +#define ALL_D_INTEGER_INT(F, P) \ + PRED_##P (F, svuint64_t, svuint64_t, int64_t, u64) \ + PRED_##P (F, svint64_t, svint64_t, int64_t, s64) + +#define ALL_D_INTEGER_WIDE(F, P) \ + PRED_##P (F, svuint64_t, svuint64_t, uint32_t, u64) \ + PRED_##P (F, svint64_t, svint64_t, int32_t, s64) + +#define ALL_D_INTEGER_BOOL(F, P) \ + PRED_##P (F, svbool_t, svuint64_t, uint64_t, u64) \ + PRED_##P (F, svbool_t, svint64_t, int64_t, s64) + +#define SD_INTEGER_TO_UINT(F, P) \ + PRED_##P (F, svuint32_t, svuint32_t, uint32_t, u32) \ + PRED_##P (F, svuint64_t, svuint64_t, uint64_t, u64) \ + PRED_##P (F, svuint32_t, svint32_t, int32_t, s32) \ + PRED_##P (F, svuint64_t, svint64_t, int64_t, s64) + +#define BH_INTEGER_BOOL(F, P) \ + ALL_Q_INTEGER_BOOL (F, P) \ + ALL_H_INTEGER_BOOL (F, P) + +#define BHS_UNSIGNED_UINT64(F, P) \ + PRED_##P (F, svuint8_t, svuint8_t, uint64_t, u8) \ + PRED_##P (F, svuint16_t, svuint16_t, uint64_t, u16) \ + PRED_##P (F, svuint32_t, svuint32_t, uint64_t, u32) + +#define BHS_UNSIGNED_WIDE_BOOL(F, P) \ + PRED_##P (F, svbool_t, svuint8_t, uint64_t, u8) \ + PRED_##P (F, svbool_t, svuint16_t, uint64_t, u16) \ + PRED_##P (F, svbool_t, svuint32_t, uint64_t, u32) + +#define BHS_SIGNED_UINT64(F, P) \ + PRED_##P (F, svint8_t, svint8_t, uint64_t, s8) \ + PRED_##P (F, svint16_t, svint16_t, uint64_t, s16) \ + PRED_##P (F, svint32_t, svint32_t, uint64_t, s32) + +#define BHS_SIGNED_WIDE_BOOL(F, P) \ + PRED_##P (F, svbool_t, svint8_t, int64_t, s8) \ + PRED_##P (F, svbool_t, svint16_t, int64_t, s16) \ + PRED_##P (F, svbool_t, svint32_t, int64_t, s32) + +#define ALL_UNSIGNED_UINT(F, P) \ + PRED_##P (F, svuint8_t, svuint8_t, uint8_t, u8) \ + PRED_##P (F, svuint16_t, svuint16_t, uint16_t, u16) \ + PRED_##P (F, svuint32_t, svuint32_t, uint32_t, u32) \ + PRED_##P (F, svuint64_t, svuint64_t, uint64_t, u64) + +#define ALL_UNSIGNED_INT(F, P) \ + PRED_##P (F, svuint8_t, svuint8_t, int8_t, u8) \ + PRED_##P (F, svuint16_t, svuint16_t, int16_t, u16) \ + PRED_##P (F, svuint32_t, svuint32_t, int32_t, u32) \ + PRED_##P (F, svuint64_t, svuint64_t, int64_t, u64) + +#define ALL_SIGNED_UINT(F, P) \ + PRED_##P (F, svint8_t, svint8_t, uint8_t, s8) \ + PRED_##P (F, svint16_t, svint16_t, uint16_t, s16) \ + PRED_##P (F, svint32_t, svint32_t, uint32_t, s32) \ + PRED_##P (F, svint64_t, svint64_t, uint64_t, s64) + +#define ALL_FLOAT(F, P) \ + PRED_##P (F, svfloat16_t, svfloat16_t, float16_t, f16) \ + PRED_##P (F, svfloat32_t, svfloat32_t, float32_t, f32) \ 
+ PRED_##P (F, svfloat64_t, svfloat64_t, float64_t, f64) + +#define ALL_FLOAT_INT(F, P) \ + PRED_##P (F, svfloat16_t, svfloat16_t, int16_t, f16) \ + PRED_##P (F, svfloat32_t, svfloat32_t, int32_t, f32) \ + PRED_##P (F, svfloat64_t, svfloat64_t, int64_t, f64) + +#define ALL_FLOAT_BOOL(F, P) \ + PRED_##P (F, svbool_t, svfloat16_t, float16_t, f16) \ + PRED_##P (F, svbool_t, svfloat32_t, float32_t, f32) \ + PRED_##P (F, svbool_t, svfloat64_t, float64_t, f64) + +#define ALL_FLOAT_SCALAR(F, P) \ + PRED_##P (F, float16_t, float16_t, float16_t, f16) \ + PRED_##P (F, float32_t, float32_t, float32_t, f32) \ + PRED_##P (F, float64_t, float64_t, float64_t, f64) + +#define B(F, P) \ + PRED_##P (F, svbool_t, svbool_t, bool_t, b) + +#define ALL_SD_INTEGER(F, P) \ + ALL_S_INTEGER (F, P) \ + ALL_D_INTEGER (F, P) + +#define HSD_INTEGER_WIDE(F, P) \ + ALL_H_INTEGER_WIDE (F, P) \ + ALL_S_INTEGER_WIDE (F, P) \ + ALL_D_INTEGER_WIDE (F, P) + +#define BHS_INTEGER_UINT64(F, P) \ + BHS_UNSIGNED_UINT64 (F, P) \ + BHS_SIGNED_UINT64 (F, P) + +#define BHS_INTEGER_WIDE_BOOL(F, P) \ + BHS_UNSIGNED_WIDE_BOOL (F, P) \ + BHS_SIGNED_WIDE_BOOL (F, P) + +#define ALL_INTEGER(F, P) \ + ALL_Q_INTEGER (F, P) \ + ALL_H_INTEGER (F, P) \ + ALL_S_INTEGER (F, P) \ + ALL_D_INTEGER (F, P) + +#define ALL_INTEGER_UINT(F, P) \ + ALL_Q_INTEGER_UINT (F, P) \ + ALL_H_INTEGER_UINT (F, P) \ + ALL_S_INTEGER_UINT (F, P) \ + ALL_D_INTEGER_UINT (F, P) + +#define ALL_INTEGER_INT(F, P) \ + ALL_Q_INTEGER_INT (F, P) \ + ALL_H_INTEGER_INT (F, P) \ + ALL_S_INTEGER_INT (F, P) \ + ALL_D_INTEGER_INT (F, P) + +#define ALL_INTEGER_BOOL(F, P) \ + ALL_Q_INTEGER_BOOL (F, P) \ + ALL_H_INTEGER_BOOL (F, P) \ + ALL_S_INTEGER_BOOL (F, P) \ + ALL_D_INTEGER_BOOL (F, P) + +#define ALL_FLOAT_AND_SD_INTEGER(F, P) \ + ALL_SD_INTEGER (F, P) \ + ALL_FLOAT (F, P) + +#define ALL_ARITH(F, P) \ + ALL_INTEGER (F, P) \ + ALL_FLOAT (F, P) + +#define ALL_ARITH_BOOL(F, P) \ + ALL_INTEGER_BOOL (F, P) \ + ALL_FLOAT_BOOL (F, P) + +#define ALL_DATA(F, P) \ + ALL_ARITH (F, P) \ + PRED_##P (F, svbfloat16_t, svbfloat16_t, bfloat16_t, bf16) + diff --git a/gcc/testsuite/gcc.target/aarch64/pfalse-unary_0.h b/gcc/testsuite/gcc.target/aarch64/pfalse-unary_0.h new file mode 100644 index 0000000..f6183a1 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pfalse-unary_0.h @@ -0,0 +1,195 @@ +#include <arm_sve.h> +#include <stdbool.h> + +#define M(F, RTY, TY) \ + RTY F##_f (RTY inactive, TY op) \ + { \ + return sv##F (inactive, svpfalse_b (), op); \ + } + +#define XZI(F, RTY, TY) \ + RTY F##_f (TY op) \ + { \ + return sv##F (svpfalse_b (), op); \ + } + +#define PRED_Z(F, RTY, TYPE, TY) \ + XZI (F##_##TY##_z, RTY, sv##TYPE) \ + +#define PRED_MZ(F, RTY, TYPE, TY) \ + M (F##_##TY##_m, RTY, sv##TYPE) \ + PRED_Z (F, RTY, TYPE, TY) + +#define PRED_MXZ(F, RTY, TYPE, TY) \ + XZI (F##_##TY##_x, RTY, sv##TYPE) \ + PRED_MZ (F, RTY, TYPE, TY) + +#define PRED_IMPLICIT(F, RTY, TYPE, TY) \ + XZI (F##_##TY, RTY, sv##TYPE) + +#define PRED_IMPLICITn(F, RTY, TYPE) \ + XZI (F, RTY, sv##TYPE) + +#define Q_INTEGER(F, P) \ + PRED_##P (F, svuint8_t, uint8_t, u8) \ + PRED_##P (F, svint8_t, int8_t, s8) + +#define Q_INTEGER_SCALAR(F, P) \ + PRED_##P (F, uint8_t, uint8_t, u8) \ + PRED_##P (F, int8_t, int8_t, s8) + +#define Q_INTEGER_SCALAR_WIDE(F, P) \ + PRED_##P (F, uint64_t, uint8_t, u8) \ + PRED_##P (F, int64_t, int8_t, s8) + +#define H_INTEGER(F, P) \ + PRED_##P (F, svuint16_t, uint16_t, u16) \ + PRED_##P (F, svint16_t, int16_t, s16) + +#define H_INTEGER_SCALAR(F, P) \ + PRED_##P (F, uint16_t, uint16_t, u16) \ + PRED_##P 
(F, int16_t, int16_t, s16) + +#define H_INTEGER_SCALAR_WIDE(F, P) \ + PRED_##P (F, uint64_t, uint16_t, u16) \ + PRED_##P (F, int64_t, int16_t, s16) + +#define S_INTEGER(F, P) \ + PRED_##P (F, svuint32_t, uint32_t, u32) \ + PRED_##P (F, svint32_t, int32_t, s32) + +#define S_INTEGER_SCALAR(F, P) \ + PRED_##P (F, uint32_t, uint32_t, u32) \ + PRED_##P (F, int32_t, int32_t, s32) + +#define S_INTEGER_SCALAR_WIDE(F, P) \ + PRED_##P (F, uint64_t, uint32_t, u32) \ + PRED_##P (F, int64_t, int32_t, s32) + +#define S_UNSIGNED(F, P) \ + PRED_##P (F, svuint32_t, uint32_t, u32) + +#define D_INTEGER(F, P) \ + PRED_##P (F, svuint64_t, uint64_t, u64) \ + PRED_##P (F, svint64_t, int64_t, s64) + +#define D_INTEGER_SCALAR(F, P) \ + PRED_##P (F, uint64_t, uint64_t, u64) \ + PRED_##P (F, int64_t, int64_t, s64) + +#define SD_INTEGER(F, P) \ + S_INTEGER (F, P) \ + D_INTEGER (F, P) + +#define SD_DATA(F, P) \ + PRED_##P (F, svfloat32_t, float32_t, f32) \ + PRED_##P (F, svfloat64_t, float64_t, f64) \ + S_INTEGER (F, P) \ + D_INTEGER (F, P) + +#define ALL_SIGNED(F, P) \ + PRED_##P (F, svint8_t, int8_t, s8) \ + PRED_##P (F, svint16_t, int16_t, s16) \ + PRED_##P (F, svint32_t, int32_t, s32) \ + PRED_##P (F, svint64_t, int64_t, s64) + +#define ALL_SIGNED_UINT(F, P) \ + PRED_##P (F, svuint8_t, int8_t, s8) \ + PRED_##P (F, svuint16_t, int16_t, s16) \ + PRED_##P (F, svuint32_t, int32_t, s32) \ + PRED_##P (F, svuint64_t, int64_t, s64) + +#define ALL_UNSIGNED_UINT(F, P) \ + PRED_##P (F, svuint8_t, uint8_t, u8) \ + PRED_##P (F, svuint16_t, uint16_t, u16) \ + PRED_##P (F, svuint32_t, uint32_t, u32) \ + PRED_##P (F, svuint64_t, uint64_t, u64) + +#define HSD_INTEGER(F, P) \ + H_INTEGER (F, P) \ + S_INTEGER (F, P) \ + D_INTEGER (F, P) + +#define ALL_INTEGER(F, P) \ + Q_INTEGER (F, P) \ + HSD_INTEGER (F, P) + +#define ALL_INTEGER_SCALAR(F, P) \ + Q_INTEGER_SCALAR (F, P) \ + H_INTEGER_SCALAR (F, P) \ + S_INTEGER_SCALAR (F, P) \ + D_INTEGER_SCALAR (F, P) + +#define ALL_INTEGER_SCALAR_WIDE(F, P) \ + Q_INTEGER_SCALAR_WIDE (F, P) \ + H_INTEGER_SCALAR_WIDE (F, P) \ + S_INTEGER_SCALAR_WIDE (F, P) \ + D_INTEGER_SCALAR (F, P) + +#define ALL_INTEGER_UINT(F, P) \ + ALL_SIGNED_UINT (F, P) \ + ALL_UNSIGNED_UINT (F, P) + +#define ALL_FLOAT(F, P) \ + PRED_##P (F, svfloat16_t, float16_t, f16) \ + PRED_##P (F, svfloat32_t, float32_t, f32) \ + PRED_##P (F, svfloat64_t, float64_t, f64) + +#define ALL_FLOAT_SCALAR(F, P) \ + PRED_##P (F, float16_t, float16_t, f16) \ + PRED_##P (F, float32_t, float32_t, f32) \ + PRED_##P (F, float64_t, float64_t, f64) + +#define ALL_FLOAT_INT(F, P) \ + PRED_##P (F, svint16_t, float16_t, f16) \ + PRED_##P (F, svint32_t, float32_t, f32) \ + PRED_##P (F, svint64_t, float64_t, f64) + +#define ALL_FLOAT_UINT(F, P) \ + PRED_##P (F, svuint16_t, float16_t, f16) \ + PRED_##P (F, svuint32_t, float32_t, f32) \ + PRED_##P (F, svuint64_t, float64_t, f64) + +#define ALL_FLOAT_AND_SIGNED(F, P) \ + ALL_SIGNED (F, P) \ + ALL_FLOAT (F, P) + +#define ALL_ARITH_SCALAR(F, P) \ + ALL_INTEGER_SCALAR (F, P) \ + ALL_FLOAT_SCALAR (F, P) + +#define ALL_ARITH_SCALAR_WIDE(F, P) \ + ALL_INTEGER_SCALAR_WIDE (F, P) \ + ALL_FLOAT_SCALAR (F, P) + +#define ALL_DATA(F, P) \ + ALL_INTEGER (F, P) \ + ALL_FLOAT (F, P) \ + PRED_##P (F, svbfloat16_t, bfloat16_t, bf16) + +#define ALL_DATA_SCALAR(F, P) \ + ALL_ARITH_SCALAR (F, P) \ + PRED_##P (F, bfloat16_t, bfloat16_t, bf16) + +#define ALL_DATA_UINT(F, P) \ + ALL_INTEGER_UINT (F, P) \ + ALL_FLOAT_UINT (F, P) \ + PRED_##P (F, svuint16_t, bfloat16_t, bf16) + +#define B(F, P) \ + PRED_##P (F, svbool_t, bool_t, b) 
+ +#define BN(F, P) \ + PRED_##P (F, svbool_t, bool_t, b8) \ + PRED_##P (F, svbool_t, bool_t, b16) \ + PRED_##P (F, svbool_t, bool_t, b32) \ + PRED_##P (F, svbool_t, bool_t, b64) + +#define BOOL(F, P) \ + PRED_##P (F, bool, bool_t) + +#define ALL_PRED_UINT64(F, P) \ + PRED_##P (F, uint64_t, bool_t, b8) \ + PRED_##P (F, uint64_t, bool_t, b16) \ + PRED_##P (F, uint64_t, bool_t, b32) \ + PRED_##P (F, uint64_t, bool_t, b64) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary.c new file mode 100644 index 0000000..a8fd4c8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +B (brkn, Zv) +B (brkpa, Zv) +B (brkpb, Zv) +ALL_DATA (splice, IMPLICITv) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tpfalse\tp0\.b\n\tret\n} 3 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmov\tz0\.d, z1\.d\n\tret\n} 12 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 15 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_int_opt_n.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_int_opt_n.c new file mode 100644 index 0000000..08cd6a0 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_int_opt_n.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +ALL_FLOAT_INT (scale, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 6 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 12 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 18 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_opt_n.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_opt_n.c new file mode 100644 index 0000000..f5c9cbf --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_opt_n.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +ALL_ARITH (abd, MXZ) +ALL_ARITH (add, MXZ) +ALL_INTEGER (and, MXZ) +B (and, Zv) +ALL_INTEGER (bic, MXZ) +B (bic, Zv) +ALL_FLOAT_AND_SD_INTEGER (div, MXZ) +ALL_FLOAT_AND_SD_INTEGER (divr, MXZ) +ALL_INTEGER (eor, MXZ) +B (eor, Zv) +ALL_ARITH (mul, MXZ) +ALL_INTEGER (mulh, MXZ) +ALL_FLOAT (mulx, MXZ) +B (nand, Zv) +B (nor, Zv) +B (orn, Zv) +ALL_INTEGER (orr, MXZ) +B (orr, Zv) +ALL_ARITH (sub, MXZ) +ALL_ARITH (subr, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 224 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 448 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tpfalse\tp0\.b\n\tret\n} 7 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 679 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_opt_single_n.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_opt_single_n.c new file mode 100644 index 0000000..91ae3c8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_opt_single_n.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +ALL_ARITH (max, MXZ) +ALL_ARITH (min, MXZ) +ALL_FLOAT (maxnm, MXZ) 
+ALL_FLOAT (minnm, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 56 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 112 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 168 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_rotate.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_rotate.c new file mode 100644 index 0000000..12368ce --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_rotate.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define MXZ4(F, TYPE) \ + TYPE F##_f (TYPE op1, TYPE op2) \ + { \ + return sv##F (svpfalse_b (), op1, op2, 90); \ + } + +#define PRED_MXZ(F, TYPE, TY) \ + MXZ4 (F##_##TY##_m, TYPE) \ + MXZ4 (F##_##TY##_x, TYPE) \ + MXZ4 (F##_##TY##_z, TYPE) + +#define ALL_FLOAT(F, P) \ + PRED_##P (F, svfloat16_t, f16) \ + PRED_##P (F, svfloat32_t, f32) \ + PRED_##P (F, svfloat64_t, f64) + +ALL_FLOAT (cadd, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 3 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 6 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 9 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_uint64_opt_n.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_uint64_opt_n.c new file mode 100644 index 0000000..dd52a58 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_uint64_opt_n.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +BHS_SIGNED_UINT64 (asr_wide, MXZ) +BHS_INTEGER_UINT64 (lsl_wide, MXZ) +BHS_UNSIGNED_UINT64 (lsr_wide, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 24 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 48 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 72 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_uint_opt_n.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_uint_opt_n.c new file mode 100644 index 0000000..e55ddfb --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_uint_opt_n.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +ALL_SIGNED_UINT (asr, MXZ) +ALL_INTEGER_UINT (lsl, MXZ) +ALL_UNSIGNED_UINT (lsr, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 32 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 64 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 96 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binaryxn.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binaryxn.c new file mode 100644 index 0000000..6796229 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binaryxn.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T(F, TY) \ + sv##TY F##_f (sv##TY op1, sv##TY op2) \ + { \ + return sv##F (svpfalse_b (), op1, op2); \ + } + +#define ALL_DATA(F) \ + T (F##_bf16, bfloat16_t) \ + T (F##_f16, float16_t) \ + T (F##_f32, 
float32_t) \ + T (F##_f64, float64_t) \ + T (F##_s8, int8_t) \ + T (F##_s16, int16_t) \ + T (F##_s32, int32_t) \ + T (F##_s64, int64_t) \ + T (F##_u8, uint8_t) \ + T (F##_u16, uint16_t) \ + T (F##_u32, uint32_t) \ + T (F##_u64, uint64_t) \ + T (F##_b, bool_t) + +ALL_DATA (sel) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmov\t[zp]0\.[db], [zp]1\.[db]\n\tret\n} 13 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 13 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-clast.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-clast.c new file mode 100644 index 0000000..7f2ec4a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-clast.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +ALL_DATA (clasta, IMPLICITv) +ALL_DATA (clastb, IMPLICITv) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 24 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 24 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-compare_opt_n.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-compare_opt_n.c new file mode 100644 index 0000000..d18427b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-compare_opt_n.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +ALL_FLOAT_BOOL (acge, IMPLICIT) +ALL_FLOAT_BOOL (acgt, IMPLICIT) +ALL_FLOAT_BOOL (acle, IMPLICIT) +ALL_FLOAT_BOOL (aclt, IMPLICIT) +ALL_ARITH_BOOL (cmpeq, IMPLICIT) +ALL_ARITH_BOOL (cmpge, IMPLICIT) +ALL_ARITH_BOOL (cmpgt, IMPLICIT) +ALL_ARITH_BOOL (cmple, IMPLICIT) +ALL_ARITH_BOOL (cmplt, IMPLICIT) +ALL_ARITH_BOOL (cmpne, IMPLICIT) +ALL_FLOAT_BOOL (cmpuo, IMPLICIT) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tpfalse\tp0\.b\n\tret\n} 162 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 162 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-compare_wide_opt_n.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-compare_wide_opt_n.c new file mode 100644 index 0000000..983ab5c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-compare_wide_opt_n.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +BHS_SIGNED_WIDE_BOOL (cmpeq_wide, IMPLICIT) +BHS_INTEGER_WIDE_BOOL (cmpge_wide, IMPLICIT) +BHS_INTEGER_WIDE_BOOL (cmpgt_wide, IMPLICIT) +BHS_INTEGER_WIDE_BOOL (cmple_wide, IMPLICIT) +BHS_INTEGER_WIDE_BOOL (cmplt_wide, IMPLICIT) +BHS_SIGNED_WIDE_BOOL (cmpne_wide, IMPLICIT) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tpfalse\tp0\.b\n\tret\n} 60 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 60 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-count_pred.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-count_pred.c new file mode 100644 index 0000000..de36b66 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-count_pred.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-unary_0.h" + +ALL_PRED_UINT64 (cntp, IMPLICIT) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmov\tx0, 0\n\tret\n} 4 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 4 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-fold_left.c 
b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-fold_left.c new file mode 100644 index 0000000..333140d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-fold_left.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +ALL_FLOAT_SCALAR (adda, IMPLICITv) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 3 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 3 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load.c new file mode 100644 index 0000000..93d6693 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load.c @@ -0,0 +1,32 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T(F, TY) \ + sv##TY F##_f (const TY *base) \ + { \ + return sv##F (svpfalse_b (), base); \ + } + +#define ALL_DATA(F) \ + T (F##_bf16, bfloat16_t) \ + T (F##_f16, float16_t) \ + T (F##_f32, float32_t) \ + T (F##_f64, float64_t) \ + T (F##_s8, int8_t) \ + T (F##_s16, int16_t) \ + T (F##_s32, int32_t) \ + T (F##_s64, int64_t) \ + T (F##_u8, uint8_t) \ + T (F##_u16, uint16_t) \ + T (F##_u32, uint32_t) \ + T (F##_u64, uint64_t) \ + +ALL_DATA (ldff1) +ALL_DATA (ldnf1) +ALL_DATA (ldnt1) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 36 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 36 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_ext.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_ext.c new file mode 100644 index 0000000..c88686a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_ext.c @@ -0,0 +1,40 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T(F, RTY, TY) \ + RTY F##_f (const TY *base) \ + { \ + return sv##F (svpfalse_b (), base); \ + } + +#define D_INTEGER(F, TY) \ + T (F##_s64, svint64_t, TY) \ + T (F##_u64, svuint64_t, TY) + +#define SD_INTEGER(F, TY) \ + D_INTEGER (F, TY) \ + T (F##_s32, svint32_t, TY) \ + T (F##_u32, svuint32_t, TY) + +#define HSD_INTEGER(F, TY) \ + SD_INTEGER (F, TY) \ + T (F##_s16, svint16_t, TY) \ + T (F##_u16, svuint16_t, TY) + +#define TEST(F) \ + HSD_INTEGER (F##sb, int8_t) \ + SD_INTEGER (F##sh, int16_t) \ + D_INTEGER (F##sw, int32_t) \ + HSD_INTEGER (F##ub, uint8_t) \ + SD_INTEGER (F##uh, uint16_t) \ + D_INTEGER (F##uw, uint32_t) \ + +TEST (ld1) +TEST (ldff1) +TEST (ldnf1) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 72 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 72 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_ext_gather_index.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_ext_gather_index.c new file mode 100644 index 0000000..5f4b562fc --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_ext_gather_index.c @@ -0,0 +1,51 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T1(F, RTY, TY1, TY2) \ + RTY F##_f (const TY1 *base, TY2 indices) \ + { \ + return sv##F (svpfalse_b (), base, indices); \ + } + +#define T2(F, RTY, TY1) \ + RTY F##_f (TY1 bases, int64_t index) \ + { \ + return sv##F (svpfalse_b (), bases, index); \ + } + +#define T3(F, B, RTY, TY, 
TYPE) \ + T1 (F##_gather_s##B##index_##TY, RTY, TYPE, svint##B##_t) \ + T1 (F##_gather_u##B##index_##TY, RTY, TYPE, svuint##B##_t) + +#define T4(F, B, RTY, TY) \ + T2 (F##_gather_##TY##base_index_s##B, svint##B##_t, RTY) \ + T2 (F##_gather_##TY##base_index_u##B, svuint##B##_t, RTY) + +#define TEST(F) \ + T3 (F##sh, 32, svint32_t, s32, int16_t) \ + T3 (F##sh, 32, svuint32_t, u32, int16_t) \ + T3 (F##sh, 64, svint64_t, s64, int16_t) \ + T3 (F##sh, 64, svuint64_t, u64, int16_t) \ + T4 (F##sh, 32, svuint32_t, u32) \ + T4 (F##sh, 64, svuint64_t, u64) \ + T3 (F##sw, 64, svint64_t, s64, int32_t) \ + T3 (F##sw, 64, svuint64_t, u64, int32_t) \ + T4 (F##sw, 64, svuint64_t, u64) \ + T3 (F##uh, 32, svint32_t, s32, uint16_t) \ + T3 (F##uh, 32, svuint32_t, u32, uint16_t) \ + T3 (F##uh, 64, svint64_t, s64, uint16_t) \ + T3 (F##uh, 64, svuint64_t, u64, uint16_t) \ + T4 (F##uh, 32, svuint32_t, u32) \ + T4 (F##uh, 64, svuint64_t, u64) \ + T3 (F##uw, 64, svint64_t, s64, uint32_t) \ + T3 (F##uw, 64, svuint64_t, u64, uint32_t) \ + T4 (F##uw, 64, svuint64_t, u64) \ + +TEST (ld1) +TEST (ldff1) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 72 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 72 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_ext_gather_offset.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_ext_gather_offset.c new file mode 100644 index 0000000..0fe8ab3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_ext_gather_offset.c @@ -0,0 +1,71 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T1(F, RTY, TY1, TY2) \ + RTY F##_f (const TY1 *base, TY2 offsets) \ + { \ + return sv##F (svpfalse_b (), base, offsets); \ + } + +#define T2(F, RTY, TY1) \ + RTY F##_f (TY1 bases, int64_t offset) \ + { \ + return sv##F (svpfalse_b (), bases, offset); \ + } + +#define T5(F, RTY, TY1) \ + RTY F##_f (TY1 bases) \ + { \ + return sv##F (svpfalse_b (), bases); \ + } + +#define T3(F, B, RTY, TY, TYPE) \ + T1 (F##_gather_s##B##offset_##TY, RTY, TYPE, svint##B##_t) \ + T1 (F##_gather_u##B##offset_##TY, RTY, TYPE, svuint##B##_t) + +#define T4(F, B, RTY, TY) \ + T2 (F##_gather_##TY##base_offset_s##B, svint##B##_t, RTY) \ + T2 (F##_gather_##TY##base_offset_u##B, svuint##B##_t, RTY) \ + T5 (F##_gather_##TY##base_s##B, svint##B##_t, RTY) \ + T5 (F##_gather_##TY##base_u##B, svuint##B##_t, RTY) + +#define TEST(F) \ + T3 (F##sb, 32, svint32_t, s32, int8_t) \ + T3 (F##sb, 32, svuint32_t, u32, int8_t) \ + T3 (F##sb, 64, svint64_t, s64, int8_t) \ + T3 (F##sb, 64, svuint64_t, u64, int8_t) \ + T4 (F##sb, 32, svuint32_t, u32) \ + T4 (F##sb, 64, svuint64_t, u64) \ + T3 (F##sh, 32, svint32_t, s32, int16_t) \ + T3 (F##sh, 32, svuint32_t, u32, int16_t) \ + T3 (F##sh, 64, svint64_t, s64, int16_t) \ + T3 (F##sh, 64, svuint64_t, u64, int16_t) \ + T4 (F##sh, 32, svuint32_t, u32) \ + T4 (F##sh, 64, svuint64_t, u64) \ + T3 (F##sw, 64, svint64_t, s64, int32_t) \ + T3 (F##sw, 64, svuint64_t, u64, int32_t) \ + T4 (F##sw, 64, svuint64_t, u64) \ + T3 (F##ub, 32, svint32_t, s32, uint8_t) \ + T3 (F##ub, 32, svuint32_t, u32, uint8_t) \ + T3 (F##ub, 64, svint64_t, s64, uint8_t) \ + T3 (F##ub, 64, svuint64_t, u64, uint8_t) \ + T4 (F##ub, 32, svuint32_t, u32) \ + T4 (F##ub, 64, svuint64_t, u64) \ + T3 (F##uh, 32, svint32_t, s32, uint16_t) \ + T3 (F##uh, 32, svuint32_t, u32, uint16_t) \ + T3 (F##uh, 64, svint64_t, s64, uint16_t) \ + T3 (F##uh, 64, 
svuint64_t, u64, uint16_t) \ + T4 (F##uh, 32, svuint32_t, u32) \ + T4 (F##uh, 64, svuint64_t, u64) \ + T3 (F##uw, 64, svint64_t, s64, uint32_t) \ + T3 (F##uw, 64, svuint64_t, u64, uint32_t) \ + T4 (F##uw, 64, svuint64_t, u64) \ + +TEST (ld1) +TEST (ldff1) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 160 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 160 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_gather_sv.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_gather_sv.c new file mode 100644 index 0000000..758f00f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_gather_sv.c @@ -0,0 +1,32 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T1(F, RTY, TY1, TY2) \ + RTY F##_f (TY1 *base, TY2 values) \ + { \ + return sv##F (svpfalse_b (), base, values); \ + } + +#define T3(F, TY, B) \ + T1 (F##_f##B, svfloat##B##_t, float##B##_t, TY) \ + T1 (F##_s##B, svint##B##_t, int##B##_t, TY) \ + T1 (F##_u##B, svuint##B##_t, uint##B##_t, TY) \ + +#define T2(F, B) \ + T3 (F##_gather_u##B##offset, svuint##B##_t, B) \ + T3 (F##_gather_u##B##index, svuint##B##_t, B) \ + T3 (F##_gather_s##B##offset, svint##B##_t, B) \ + T3 (F##_gather_s##B##index, svint##B##_t, B) + +#define SD_DATA(F) \ + T2 (F, 32) \ + T2 (F, 64) + +SD_DATA (ld1) +SD_DATA (ldff1) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 48 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 48 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_gather_vs.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_gather_vs.c new file mode 100644 index 0000000..f82471f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_gather_vs.c @@ -0,0 +1,37 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T1(F, RTY, TY) \ + RTY F##_f (TY bases) \ + { \ + return sv##F (svpfalse_b (), bases); \ + } + +#define T2(F, RTY, TY) \ + RTY F##_f (TY bases, int64_t value) \ + { \ + return sv##F (svpfalse_b (), bases, value); \ + } + +#define T4(F, TY, TEST, B) \ + TEST (F##_f##B, svfloat##B##_t, TY) \ + TEST (F##_s##B, svint##B##_t, TY) \ + TEST (F##_u##B, svuint##B##_t, TY) \ + +#define T3(F, B) \ + T4 (F##_gather_u##B##base, svuint##B##_t, T1, B) \ + T4 (F##_gather_u##B##base_offset, svuint##B##_t, T2, B) \ + T4 (F##_gather_u##B##base_index, svuint##B##_t, T2, B) + +#define SD_DATA(F) \ + T3 (F, 32) \ + T3 (F, 64) + +SD_DATA (ld1) +SD_DATA (ldff1) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 36 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 36 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_replicate.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_replicate.c new file mode 100644 index 0000000..ba500b6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-load_replicate.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2 -march=armv8.2-a+sve+f64mm" } */ + +#include <arm_sve.h> + +#define T(F, TY) \ + sv##TY F##_f (const TY *base) \ + { \ + return sv##F (svpfalse_b (), base); \ + } + +#define ALL_DATA(F) \ + T (F##_bf16, bfloat16_t) \ + T (F##_f16, float16_t) \ + T (F##_f32, float32_t) \ + 
T (F##_f64, float64_t) \ + T (F##_s8, int8_t) \ + T (F##_s16, int16_t) \ + T (F##_s32, int32_t) \ + T (F##_s64, int64_t) \ + T (F##_u8, uint8_t) \ + T (F##_u16, uint16_t) \ + T (F##_u32, uint32_t) \ + T (F##_u64, uint64_t) \ + +ALL_DATA (ld1rq) +ALL_DATA (ld1ro) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 24 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 24 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-prefetch.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-prefetch.c new file mode 100644 index 0000000..71894c4 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-prefetch.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T(F) \ + void F##_f (const void *base) \ + { \ + return sv##F (svpfalse_b (), base, 0); \ + } + +#define ALL_PREFETCH \ + T (prfb) \ + T (prfh) \ + T (prfw) \ + T (prfd) + +ALL_PREFETCH + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 4 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 4 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-prefetch_gather_index.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-prefetch_gather_index.c new file mode 100644 index 0000000..1b7cc42 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-prefetch_gather_index.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T(F, TYPE, TY) \ + void F##_##TY##_f (const void *base, sv##TYPE indices) \ + { \ + return sv##F##_##TY##index (svpfalse_b (), base, indices, 0);\ + } + +#define T1(F) \ + T (F, uint32_t, u32) \ + T (F, uint64_t, u64) \ + T (F, int32_t, s32) \ + T (F, int64_t, s64) + +#define ALL_PREFETCH_GATHER_INDEX \ + T1 (prfh_gather) \ + T1 (prfw_gather) \ + T1 (prfd_gather) + +ALL_PREFETCH_GATHER_INDEX + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 12 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 12 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-prefetch_gather_offset.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-prefetch_gather_offset.c new file mode 100644 index 0000000..7f4ff2d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-prefetch_gather_offset.c @@ -0,0 +1,25 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T(F, TYPE, TY) \ + void F##_##TY##_f (const void *base, sv##TYPE offsets) \ + { \ + return sv##F##_##TY##offset (svpfalse_b (), base, offsets, 0); \ + } + +#define T1(F) \ + T (F, uint32_t, u32) \ + T (F, uint64_t, u64) \ + T (F, int32_t, s32) \ + T (F, int64_t, s64) + +#define ALL_PREFETCH_GATHER_OFFSET \ + T1 (prfb_gather) \ + +ALL_PREFETCH_GATHER_OFFSET + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 4 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 4 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-ptest.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-ptest.c new file mode 100644 index 0000000..0a587fc --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-ptest.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-unary_0.h" + +BOOL (ptest_any, IMPLICITn) +BOOL (ptest_first, IMPLICITn) +BOOL 
(ptest_last, IMPLICITn) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmov\tw0, 0\n\tret\n} 3 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 3 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-rdffr.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-rdffr.c new file mode 100644 index 0000000..d795f8e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-rdffr.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +svbool_t rdffr_f () +{ + return svrdffr_z (svpfalse_b ()); +} + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tpfalse\tp0\.b\n\tret\n} 1 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 1 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-reduction.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-reduction.c new file mode 100644 index 0000000..42b37ae --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-reduction.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +#include "../pfalse-unary_0.h" + +ALL_INTEGER_SCALAR (andv, IMPLICIT) +ALL_INTEGER_SCALAR (eorv, IMPLICIT) +ALL_ARITH_SCALAR (maxv, IMPLICIT) +ALL_ARITH_SCALAR (minv, IMPLICIT) +ALL_INTEGER_SCALAR (orv, IMPLICIT) +ALL_FLOAT_SCALAR (maxnmv, IMPLICIT) +ALL_FLOAT_SCALAR (minnmv, IMPLICIT) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmov\t[wx]0, 0\n\tret\n} 20 } } */ +/* { dg-final { scan-tree-dump-times "return Nan" 6 "optimized" } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmov\t[wx]0, -1\n\tret\n} 12 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmov\t[wxz]0(?:\.[sd])?, #?-?[1-9]+[0-9]*\n\tret\n} 27 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\t(movi|mvni)\tv0\.(2s|4h), 0x[0-9a-f]+, [a-z]+ [0-9]+\n\tret\n} 5 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 52 } } */ + +/* The sum of tested cases is 52 + 12, because mov [wx]0, -1 is tested in two + patterns. 
*/ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-reduction_wide.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-reduction_wide.c new file mode 100644 index 0000000..bd9a980 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-reduction_wide.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-unary_0.h" + +ALL_ARITH_SCALAR_WIDE (addv, IMPLICIT) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[dwvx]0(?:\.(2s|4h))?, #?0\n\tret\n} 11 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 11 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-shift_right_imm.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-shift_right_imm.c new file mode 100644 index 0000000..62a0755 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-shift_right_imm.c @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define MXZ(F, TY) \ + TY F##_f (TY op1) \ + { \ + return sv##F (svpfalse_b (), op1, 2); \ + } + +#define PRED_MXZn(F, TYPE, TY) \ + MXZ (F##_n_##TY##_m, TYPE) \ + MXZ (F##_n_##TY##_x, TYPE) \ + MXZ (F##_n_##TY##_z, TYPE) + +#define ALL_SIGNED_IMM(F, P) \ + PRED_##P (F, svint8_t, s8) \ + PRED_##P (F, svint16_t, s16) \ + PRED_##P (F, svint32_t, s32) \ + PRED_##P (F, svint64_t, s64) + +ALL_SIGNED_IMM (asrd, MXZn) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 4 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 8 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 12 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-store.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-store.c new file mode 100644 index 0000000..751e60e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-store.c @@ -0,0 +1,53 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T(F, TY1, TY2) \ + void F##_f (TY1 *base, sv##TY2 data) \ + { \ + return sv##F (svpfalse_b (), base, data); \ + } + +#define D_INTEGER(F, TY) \ + T (F##_s64, TY, int64_t) \ + T (F##_u64, TY, uint64_t) + +#define SD_INTEGER(F, TY) \ + D_INTEGER (F, TY) \ + T (F##_s32, TY, int32_t) \ + T (F##_u32, TY, uint32_t) + +#define HSD_INTEGER(F, TY) \ + SD_INTEGER (F, TY) \ + T (F##_s16, TY, int16_t) \ + T (F##_u16, TY, uint16_t) + +#define ALL_DATA(F, A) \ + T (F##_bf16, bfloat16_t, bfloat16##A) \ + T (F##_f16, float16_t, float16##A) \ + T (F##_f32, float32_t, float32##A) \ + T (F##_f64, float64_t, float64##A) \ + T (F##_s8, int8_t, int8##A) \ + T (F##_s16, int16_t, int16##A) \ + T (F##_s32, int32_t, int32##A) \ + T (F##_s64, int64_t, int64##A) \ + T (F##_u8, uint8_t, uint8##A) \ + T (F##_u16, uint16_t, uint16##A) \ + T (F##_u32, uint32_t, uint32##A) \ + T (F##_u64, uint64_t, uint64##A) \ + +HSD_INTEGER (st1b, int8_t) +SD_INTEGER (st1h, int16_t) +D_INTEGER (st1w, int32_t) +ALL_DATA (st1, _t) +ALL_DATA (st2, x2_t) +ALL_DATA (st3, x3_t) +ALL_DATA (st4, x4_t) + +/* FIXME: Currently, st1/2/3/4 are not folded with a pfalse + predicate, which is the reason for the 48 missing cases below. Once + folding is implemented for these intrinsics, the sum should be 60. 
*/ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 12 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 60 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-store_scatter_index.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-store_scatter_index.c new file mode 100644 index 0000000..44792d3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-store_scatter_index.c @@ -0,0 +1,40 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T1(F, TY1, TY2, TY3) \ + void F##_f (TY1 *base, TY2 indices, TY3 data) \ + { \ + sv##F (svpfalse_b (), base, indices, data); \ + } + +#define T2(F, TY1, TY3) \ + void F##_f (TY1 bases, int64_t index, TY3 data) \ + { \ + sv##F (svpfalse_b (), bases, index, data); \ + } + +#define T3(F, B, TYPE1, TY, TYPE2) \ + T1 (F##_scatter_s##B##index_##TY, TYPE2, svint##B##_t, TYPE1) \ + T1 (F##_scatter_u##B##index_##TY, TYPE2, svuint##B##_t, TYPE1) + +#define T4(F, B, TYPE1, TY) \ + T2 (F##_scatter_u##B##base_index_##TY, svuint##B##_t, TYPE1) + +#define TEST(F) \ + T3 (F##h, 32, svint32_t, s32, int16_t) \ + T3 (F##h, 32, svuint32_t, u32, int16_t) \ + T3 (F##h, 64, svint64_t, s64, int16_t) \ + T3 (F##h, 64, svuint64_t, u64, int16_t) \ + T4 (F##h, 32, svuint32_t, u32) \ + T4 (F##h, 64, svuint64_t, u64) \ + T3 (F##w, 64, svint64_t, s64, int32_t) \ + T3 (F##w, 64, svuint64_t, u64, int32_t) \ + T4 (F##w, 64, svuint64_t, u64) \ + +TEST (st1) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 15 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 15 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-store_scatter_offset.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-store_scatter_offset.c new file mode 100644 index 0000000..f3820e0 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-store_scatter_offset.c @@ -0,0 +1,68 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T1(F, TY1, TY2, TY3) \ + void F##_f (TY1 *base, TY2 offsets, TY3 data) \ + { \ + sv##F (svpfalse_b (), base, offsets, data); \ + } + +#define T2(F, TY1, TY3) \ + void F##_f (TY1 bases, int64_t offset, TY3 data) \ + { \ + sv##F (svpfalse_b (), bases, offset, data); \ + } + +#define T5(F, TY1, TY3) \ + void F##_f (TY1 bases, TY3 data) \ + { \ + sv##F (svpfalse_b (), bases, data); \ + } + + +#define T3(F, B, TYPE1, TY, TYPE2) \ + T1 (F##_scatter_s##B##offset_##TY, TYPE2, svint##B##_t, TYPE1)\ + T1 (F##_scatter_u##B##offset_##TY, TYPE2, svuint##B##_t, TYPE1) + +#define T4(F, B, TYPE1, TY) \ + T2 (F##_scatter_u##B##base_offset_##TY, svuint##B##_t, TYPE1) \ + T5 (F##_scatter_u##B##base_##TY, svuint##B##_t, TYPE1) + +#define D_INTEGER(F, BHW) \ + T3 (F, 64, svint64_t, s64, int##BHW##_t) \ + T3 (F, 64, svuint64_t, u64, int##BHW##_t) \ + T4 (F, 64, svint64_t, s64) \ + T4 (F, 64, svuint64_t, u64) + +#define SD_INTEGER(F, BHW) \ + D_INTEGER (F, BHW) \ + T3 (F, 32, svint32_t, s32, int##BHW##_t) \ + T3 (F, 32, svuint32_t, u32, int##BHW##_t) \ + T4 (F, 32, svint32_t, s32) \ + T4 (F, 32, svuint32_t, u32) + +#define SD_DATA(F) \ + T3 (F, 32, svint32_t, s32, int32_t) \ + T3 (F, 64, svint64_t, s64, int64_t) \ + T4 (F, 32, svint32_t, s32) \ + T4 (F, 64, svint64_t, s64) \ + T3 (F, 32, svuint32_t, u32, uint32_t) \ + T3 (F, 64, svuint64_t, u64, uint64_t) \ + T4 (F, 32, svuint32_t, u32) \ + T4 (F, 64, svuint64_t, u64) \ + T3 (F, 32, svfloat32_t, f32, float32_t) \ + T3 
(F, 64, svfloat64_t, f64, float64_t) \ + T4 (F, 32, svfloat32_t, f32) \ + T4 (F, 64, svfloat64_t, f64) + + +SD_DATA (st1) +SD_INTEGER (st1b, 8) +SD_INTEGER (st1h, 16) +D_INTEGER (st1w, 32) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 64 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 64 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-storexn.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-storexn.c new file mode 100644 index 0000000..e49266d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-storexn.c @@ -0,0 +1,30 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T(F, TY) \ + void F##_f (TY *base, sv##TY data) \ + { \ + return sv##F (svpfalse_b (), base, data); \ + } + +#define ALL_DATA(F) \ + T (F##_bf16, bfloat16_t) \ + T (F##_f16, float16_t) \ + T (F##_f32, float32_t) \ + T (F##_f64, float64_t) \ + T (F##_s8, int8_t) \ + T (F##_s16, int16_t) \ + T (F##_s32, int32_t) \ + T (F##_s64, int64_t) \ + T (F##_u8, uint8_t) \ + T (F##_u16, uint16_t) \ + T (F##_u32, uint32_t) \ + T (F##_u64, uint64_t) + +ALL_DATA (stnt1) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 12 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 12 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-ternary_opt_n.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-ternary_opt_n.c new file mode 100644 index 0000000..acdd141 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-ternary_opt_n.c @@ -0,0 +1,51 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define MXZ(F, TY12, TY3) \ + TY12 F##_f (TY12 op1, TY12 op2, TY3 op3) \ + { \ + return sv##F (svpfalse_b (), op1, op2, op3); \ + } + +#define PRED_MXZ(F, TYPE, TY) \ + MXZ (F##_##TY##_m, sv##TYPE, sv##TYPE) \ + MXZ (F##_n_##TY##_m, sv##TYPE, TYPE) \ + MXZ (F##_##TY##_x, sv##TYPE, sv##TYPE) \ + MXZ (F##_n_##TY##_x, sv##TYPE, TYPE) \ + MXZ (F##_##TY##_z, sv##TYPE, sv##TYPE) \ + MXZ (F##_n_##TY##_z, sv##TYPE, TYPE) + +#define ALL_FLOAT(F, P) \ + PRED_##P (F, float16_t, f16) \ + PRED_##P (F, float32_t, f32) \ + PRED_##P (F, float64_t, f64) + +#define ALL_INTEGER(F, P) \ + PRED_##P (F, uint8_t, u8) \ + PRED_##P (F, uint16_t, u16) \ + PRED_##P (F, uint32_t, u32) \ + PRED_##P (F, uint64_t, u64) \ + PRED_##P (F, int8_t, s8) \ + PRED_##P (F, int16_t, s16) \ + PRED_##P (F, int32_t, s32) \ + PRED_##P (F, int64_t, s64) \ + +#define ALL_ARITH(F, P) \ + ALL_INTEGER (F, P) \ + ALL_FLOAT (F, P) + +ALL_ARITH (mad, MXZ) +ALL_ARITH (mla, MXZ) +ALL_ARITH (mls, MXZ) +ALL_ARITH (msb, MXZ) +ALL_FLOAT (nmad, MXZ) +ALL_FLOAT (nmla, MXZ) +ALL_FLOAT (nmls, MXZ) +ALL_FLOAT (nmsb, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 112 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 224 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 336 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-ternary_rotate.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-ternary_rotate.c new file mode 100644 index 0000000..7698045 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-ternary_rotate.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define MXZ(F, TY) \ + TY F##_f (TY op1, TY op2, TY op3) \ + { \ + 
return sv##F (svpfalse_b (), op1, op2, op3, 90); \ + } + +#define PRED_MXZ(F, TYPE, TY) \ + MXZ (F##_##TY##_m, sv##TYPE) \ + MXZ (F##_##TY##_x, sv##TYPE) \ + MXZ (F##_##TY##_z, sv##TYPE) + +#define ALL_FLOAT(F, P) \ + PRED_##P (F, float16_t, f16) \ + PRED_##P (F, float32_t, f32) \ + PRED_##P (F, float64_t, f64) + +ALL_FLOAT (cmla, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 3 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 6 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 9 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary.c new file mode 100644 index 0000000..037376b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary.c @@ -0,0 +1,33 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-unary_0.h" + +ALL_FLOAT_AND_SIGNED (abs, MXZ) +B (brka, MZ) +B (brkb, MZ) +ALL_INTEGER (cnot, MXZ) +HSD_INTEGER (extb, MXZ) +SD_INTEGER (exth, MXZ) +D_INTEGER (extw, MXZ) +B (mov, Z) +ALL_FLOAT_AND_SIGNED (neg, MXZ) +ALL_INTEGER (not, MXZ) +B (not, Z) +B (pfirst, IMPLICIT) +ALL_INTEGER (rbit, MXZ) +ALL_FLOAT (recpx, MXZ) +HSD_INTEGER (revb, MXZ) +SD_INTEGER (revh, MXZ) +D_INTEGER (revw, MXZ) +ALL_FLOAT (rinti, MXZ) +ALL_FLOAT (rintx, MXZ) +ALL_FLOAT (rintz, MXZ) +ALL_FLOAT (sqrt, MXZ) +SD_DATA (compact, IMPLICIT) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 80 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 160 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tpfalse\tp0\.b\n\tret\n} 4 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 244 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary_convert_narrowt.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary_convert_narrowt.c new file mode 100644 index 0000000..1287a70 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary_convert_narrowt.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2 -march=armv8.2-a+sve+bf16" } */ + +#include <arm_sve.h> + +#define T(F, TYPE1, TY1, TYPE2, TY2) \ + TYPE1 F##_##TY1##_##TY2##_x_f (TYPE1 even, TYPE2 op) \ + { \ + return sv##F##_##TY1##_##TY2##_x (even, svpfalse_b (), op); \ + } \ + TYPE1 F##_##TY1##_##TY2##_m_f (TYPE1 even, TYPE2 op) \ + { \ + return sv##F##_##TY1##_##TY2##_m (even, svpfalse_b (), op); \ + } + +T (cvtnt, svbfloat16_t, bf16, svfloat32_t, f32) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 1 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?: [0-9]*[bhsd])?, #?0\n\tret\n} 1 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 2 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary_convertxn.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary_convertxn.c new file mode 100644 index 0000000..f519266 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary_convertxn.c @@ -0,0 +1,50 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2 -march=armv8.2-a+sve+bf16" } */ + +#include <arm_sve.h> + +#define T(TYPE1, TY1, TYPE2, TY2) \ + TYPE1 cvt_##TY1##_##TY2##_x_f (TYPE2 op) \ + { \ + return svcvt_##TY1##_##TY2##_x (svpfalse_b (), op); \ + } \ + TYPE1 cvt_##TY1##_##TY2##_z_f (TYPE2 
op) \ + { \ + return svcvt_##TY1##_##TY2##_z (svpfalse_b (), op); \ + } \ + TYPE1 cvt_##TY1##_##TY2##_m_f (TYPE1 inactive, TYPE2 op) \ + { \ + return svcvt_##TY1##_##TY2##_m (inactive, svpfalse_b (), op); \ + } + +#define SWAP(TYPE1, TY1, TYPE2, TY2) \ + T (TYPE1, TY1, TYPE2, TY2) \ + T (TYPE2, TY2, TYPE1, TY1) + +#define TEST_ALL \ + T (svbfloat16_t, bf16, svfloat32_t, f32) \ + SWAP (svfloat16_t, f16, svfloat32_t, f32) \ + SWAP (svfloat16_t, f16, svfloat64_t, f64) \ + SWAP (svfloat32_t, f32, svfloat64_t, f64) \ + SWAP (svint16_t, s16, svfloat16_t, f16) \ + SWAP (svint32_t, s32, svfloat16_t, f16) \ + SWAP (svint32_t, s32, svfloat32_t, f32) \ + SWAP (svint32_t, s32, svfloat64_t, f64) \ + SWAP (svint64_t, s64, svfloat16_t, f16) \ + SWAP (svint64_t, s64, svfloat32_t, f32) \ + SWAP (svint64_t, s64, svfloat64_t, f64) \ + SWAP (svuint16_t, u16, svfloat16_t, f16) \ + SWAP (svuint32_t, u32, svfloat16_t, f16) \ + SWAP (svuint32_t, u32, svfloat32_t, f32) \ + SWAP (svuint32_t, u32, svfloat64_t, f64) \ + SWAP (svuint64_t, u64, svfloat16_t, f16) \ + SWAP (svuint64_t, u64, svfloat32_t, f32) \ + SWAP (svuint64_t, u64, svfloat64_t, f64) + +TEST_ALL + + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 35 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 70 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 105 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary_n.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary_n.c new file mode 100644 index 0000000..fabde3e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary_n.c @@ -0,0 +1,43 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define M(F, TY) \ + sv##TY F##_f (sv##TY inactive, TY op) \ + { \ + return sv##F (inactive, svpfalse_b (), op); \ + } + +#define XZ(F, TY) \ + sv##TY F##_f (TY op) \ + { \ + return sv##F (svpfalse_b (), op); \ + } + +#define PRED_MXZ(F, TYPE, TY) \ + M (F##_##TY##_m, TYPE) \ + XZ (F##_##TY##_x, TYPE) \ + XZ (F##_##TY##_z, TYPE) + +#define ALL_DATA(F, P) \ + PRED_##P (F, uint8_t, u8) \ + PRED_##P (F, uint16_t, u16) \ + PRED_##P (F, uint32_t, u32) \ + PRED_##P (F, uint64_t, u64) \ + PRED_##P (F, int8_t, s8) \ + PRED_##P (F, int16_t, s16) \ + PRED_##P (F, int32_t, s32) \ + PRED_##P (F, int64_t, s64) \ + PRED_##P (F, float16_t, f16) \ + PRED_##P (F, float32_t, f32) \ + PRED_##P (F, float64_t, f64) \ + PRED_##P (F, bfloat16_t, bf16) \ + +ALL_DATA (dup, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 12 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 12 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmov\tz0(?:\.[bhsd])?, [wxhsd]0\n\tret\n} 12 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 36 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary_pred.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary_pred.c new file mode 100644 index 0000000..46c9592 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary_pred.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-unary_0.h" + +BN (pnext, IMPLICIT) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tpfalse\tp0\.b\n\tret\n} 4 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 4 } } */ diff --git 
a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary_to_uint.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary_to_uint.c new file mode 100644 index 0000000..b820bde --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unary_to_uint.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-unary_0.h" + +ALL_SIGNED_UINT (cls, MXZ) +ALL_INTEGER_UINT (clz, MXZ) +ALL_DATA_UINT (cnt, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 24 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 48 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 72 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unaryxn.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unaryxn.c new file mode 100644 index 0000000..1e99b7f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-unaryxn.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-unary_0.h" + +ALL_FLOAT (rinta, MXZ) +ALL_FLOAT (rintm, MXZ) +ALL_FLOAT (rintn, MXZ) +ALL_FLOAT (rintp, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 12} } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 24 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 36 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary.c new file mode 100644 index 0000000..94470a5 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +ALL_ARITH (addp, MXv) +ALL_ARITH (maxp, MXv) +ALL_FLOAT (maxnmp, MXv) +ALL_ARITH (minp, MXv) +ALL_FLOAT (minnmp, MXv) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 39 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 39 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 78 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_int_opt_n.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_int_opt_n.c new file mode 100644 index 0000000..b8747b8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_int_opt_n.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +ALL_INTEGER_INT (qshl, MXZ) +ALL_INTEGER_INT (qrshl, MXZ) +ALL_UNSIGNED_INT (sqadd, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 40 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 80 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 120 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_int_opt_single_n.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_int_opt_single_n.c new file mode 100644 index 0000000..7cb7ee5 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_int_opt_single_n.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +ALL_INTEGER_INT (rshl, MXZ) + +/* { 
dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 16 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 32 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 48 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_opt_n.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_opt_n.c new file mode 100644 index 0000000..787126f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_opt_n.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +ALL_INTEGER (hadd, MXZ) +ALL_INTEGER (hsub, MXZ) +ALL_INTEGER (hsubr, MXZ) +ALL_INTEGER (qadd, MXZ) +ALL_INTEGER (qsub, MXZ) +ALL_INTEGER (qsubr, MXZ) +ALL_INTEGER (rhadd, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 112 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 224 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 336 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_opt_single_n.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_opt_single_n.c new file mode 100644 index 0000000..6b2b0a42 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_opt_single_n.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2 -march=armv8.2-a+sve2+faminmax" } */ + +#include "../pfalse-binary_0.h" + +ALL_FLOAT (amax, MXZ) +ALL_FLOAT (amin, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 12 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 24 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 36 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_to_uint.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_to_uint.c new file mode 100644 index 0000000..a0a7f80 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_to_uint.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +SD_INTEGER_TO_UINT (histcnt, Zv) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 4 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 4 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_uint_opt_n.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_uint_opt_n.c new file mode 100644 index 0000000..c13db48 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_uint_opt_n.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +ALL_SIGNED_UINT (uqadd, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 8 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 16 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 24 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_wide.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_wide.c new file mode 100644 index 0000000..145b077 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_wide.c @@ -0,0 +1,11 @@ +/* { 
dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +HSD_INTEGER_WIDE (adalp, MXZv) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 6 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 12 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 18 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-compare.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-compare.c new file mode 100644 index 0000000..da175db --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-compare.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.h" + +BH_INTEGER_BOOL (match, IMPLICITv) +BH_INTEGER_BOOL (nmatch, IMPLICITv) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tpfalse\tp0\.b\n\tret\n} 8 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 8 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-load_ext_gather_index_restricted.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-load_ext_gather_index_restricted.c new file mode 100644 index 0000000..c0476ce --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-load_ext_gather_index_restricted.c @@ -0,0 +1,46 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T1(F, RTY, TY1, TY2) \ + RTY F##_f (const TY1 *base, TY2 indices) \ + { \ + return sv##F (svpfalse_b (), base, indices); \ + } + +#define T2(F, RTY, TY1) \ + RTY F##_f (TY1 bases, int64_t index) \ + { \ + return sv##F (svpfalse_b (), bases, index); \ + } + +#define T3(F, B, RTY, TY, TYPE) \ + T1 (F##_gather_s##B##index_##TY, RTY, TYPE, svint##B##_t) \ + T1 (F##_gather_u##B##index_##TY, RTY, TYPE, svuint##B##_t) + +#define T4(F, B, RTY, TY) \ + T2 (F##_gather_##TY##base_index_s##B, svint##B##_t, RTY) \ + T2 (F##_gather_##TY##base_index_u##B, svuint##B##_t, RTY) + +#define TEST(F) \ + T3 (F##sh, 64, svint64_t, s64, int16_t) \ + T3 (F##sh, 64, svuint64_t, u64, int16_t) \ + T4 (F##sh, 32, svuint32_t, u32) \ + T4 (F##sh, 64, svuint64_t, u64) \ + T3 (F##sw, 64, svint64_t, s64, int32_t) \ + T3 (F##sw, 64, svuint64_t, u64, int32_t) \ + T4 (F##sw, 64, svuint64_t, u64) \ + T3 (F##uh, 64, svint64_t, s64, uint16_t) \ + T3 (F##uh, 64, svuint64_t, u64, uint16_t) \ + T4 (F##uh, 32, svuint32_t, u32) \ + T4 (F##uh, 64, svuint64_t, u64) \ + T3 (F##uw, 64, svint64_t, s64, uint32_t) \ + T3 (F##uw, 64, svuint64_t, u64, uint32_t) \ + T4 (F##uw, 64, svuint64_t, u64) \ + +TEST (ldnt1) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 28 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 28 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-load_ext_gather_offset_restricted.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-load_ext_gather_offset_restricted.c new file mode 100644 index 0000000..f644024 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-load_ext_gather_offset_restricted.c @@ -0,0 +1,65 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T1(F, RTY, TY1, TY2) \ + RTY F##_f (const TY1 *base, TY2 offsets) \ + { \ + return sv##F (svpfalse_b (), base, offsets); \ + } + +#define T2(F, RTY, TY1) \ + RTY F##_f (TY1 bases, 
int64_t offset) \ + { \ + return sv##F (svpfalse_b (), bases, offset); \ + } + +#define T5(F, RTY, TY1) \ + RTY F##_f (TY1 bases) \ + { \ + return sv##F (svpfalse_b (), bases); \ + } + +#define T3(F, B, RTY, TY, TYPE) \ + T1 (F##_gather_u##B##offset_##TY, RTY, TYPE, svuint##B##_t) + +#define T4(F, B, RTY, TY) \ + T2 (F##_gather_##TY##base_offset_s##B, svint##B##_t, RTY) \ + T2 (F##_gather_##TY##base_offset_u##B, svuint##B##_t, RTY) \ + T5 (F##_gather_##TY##base_s##B, svint##B##_t, RTY) \ + T5 (F##_gather_##TY##base_u##B, svuint##B##_t, RTY) + +#define TEST(F) \ + T3 (F##sb, 32, svuint32_t, u32, int8_t) \ + T3 (F##sb, 64, svint64_t, s64, int8_t) \ + T3 (F##sb, 64, svuint64_t, u64, int8_t) \ + T4 (F##sb, 32, svuint32_t, u32) \ + T4 (F##sb, 64, svuint64_t, u64) \ + T3 (F##sh, 32, svuint32_t, u32, int16_t) \ + T3 (F##sh, 64, svint64_t, s64, int16_t) \ + T3 (F##sh, 64, svuint64_t, u64, int16_t) \ + T4 (F##sh, 32, svuint32_t, u32) \ + T4 (F##sh, 64, svuint64_t, u64) \ + T3 (F##sw, 64, svint64_t, s64, int32_t) \ + T3 (F##sw, 64, svuint64_t, u64, int32_t) \ + T4 (F##sw, 64, svuint64_t, u64) \ + T3 (F##ub, 32, svuint32_t, u32, uint8_t) \ + T3 (F##ub, 64, svint64_t, s64, uint8_t) \ + T3 (F##ub, 64, svuint64_t, u64, uint8_t) \ + T4 (F##ub, 32, svuint32_t, u32) \ + T4 (F##ub, 64, svuint64_t, u64) \ + T3 (F##uh, 32, svuint32_t, u32, uint16_t) \ + T3 (F##uh, 64, svint64_t, s64, uint16_t) \ + T3 (F##uh, 64, svuint64_t, u64, uint16_t) \ + T4 (F##uh, 32, svuint32_t, u32) \ + T4 (F##uh, 64, svuint64_t, u64) \ + T3 (F##uw, 64, svint64_t, s64, uint32_t) \ + T3 (F##uw, 64, svuint64_t, u64, uint32_t) \ + T4 (F##uw, 64, svuint64_t, u64) \ + +TEST (ldnt1) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 56 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 56 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-load_gather_sv_restricted.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-load_gather_sv_restricted.c new file mode 100644 index 0000000..a48a8a9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-load_gather_sv_restricted.c @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T2(F, RTY, TY1, TY2) \ + RTY F##_f (TY1 *base, TY2 values) \ + { \ + return sv##F (svpfalse_b (), base, values); \ + } + +#define T4(F, TY, B) \ + T2 (F##_f##B, svfloat##B##_t, float##B##_t, TY) \ + T2 (F##_s##B, svint##B##_t, int##B##_t, TY) \ + T2 (F##_u##B, svuint##B##_t, uint##B##_t, TY) \ + +#define SD_DATA(F) \ + T4 (F##_gather_s64index, svint64_t, 64) \ + T4 (F##_gather_u64index, svuint64_t, 64) \ + T4 (F##_gather_u32offset, svuint32_t, 32) \ + T4 (F##_gather_u64offset, svuint64_t, 64) \ + T4 (F##_gather_s64offset, svint64_t, 64) \ + +SD_DATA (ldnt1) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 15 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 15 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-load_gather_vs.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-load_gather_vs.c new file mode 100644 index 0000000..1fc08a3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-load_gather_vs.c @@ -0,0 +1,36 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T1(F, RTY, TY) \ + RTY F##_f (TY bases) \ + { \ + return sv##F (svpfalse_b (), 
bases); \ + } + +#define T2(F, RTY, TY) \ + RTY F##_f (TY bases, int64_t value) \ + { \ + return sv##F (svpfalse_b (), bases, value); \ + } + +#define T4(F, TY, TEST, B) \ + TEST (F##_f##B, svfloat##B##_t, TY) \ + TEST (F##_s##B, svint##B##_t, TY) \ + TEST (F##_u##B, svuint##B##_t, TY) \ + +#define T3(F, B) \ + T4 (F##_gather_u##B##base, svuint##B##_t, T1, B) \ + T4 (F##_gather_u##B##base_offset, svuint##B##_t, T2, B) \ + T4 (F##_gather_u##B##base_index, svuint##B##_t, T2, B) + +#define SD_DATA(F) \ + T3 (F, 32) \ + T3 (F, 64) + +SD_DATA (ldnt1) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 18 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 18 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-shift_left_imm_to_uint.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-shift_left_imm_to_uint.c new file mode 100644 index 0000000..bd2c937 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-shift_left_imm_to_uint.c @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define MXZ(F, RTY, TY) \ + RTY F##_f (TY op1) \ + { \ + return sv##F (svpfalse_b (), op1, 2); \ + } + +#define PRED_MXZn(F, RTY, TYPE, TY) \ + MXZ (F##_n_##TY##_m, RTY, TYPE) \ + MXZ (F##_n_##TY##_x, RTY, TYPE) \ + MXZ (F##_n_##TY##_z, RTY, TYPE) + +#define ALL_SIGNED_IMM(F, P) \ + PRED_##P (F, svuint8_t, svint8_t, s8) \ + PRED_##P (F, svuint16_t, svint16_t, s16) \ + PRED_##P (F, svuint32_t, svint32_t, s32) \ + PRED_##P (F, svuint64_t, svint64_t, s64) + +ALL_SIGNED_IMM (qshlu, MXZn) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 4 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 8 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 12 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-shift_right_imm.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-shift_right_imm.c new file mode 100644 index 0000000..f4994de --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-shift_right_imm.c @@ -0,0 +1,32 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define MXZ(F, TY) \ + TY F##_f (TY op1) \ + { \ + return sv##F (svpfalse_b (), op1, 2); \ + } + +#define PRED_MXZn(F, TYPE, TY) \ + MXZ (F##_n_##TY##_m, TYPE) \ + MXZ (F##_n_##TY##_x, TYPE) \ + MXZ (F##_n_##TY##_z, TYPE) + +#define ALL_INTEGER_IMM(F, P) \ + PRED_##P (F, svuint8_t, u8) \ + PRED_##P (F, svuint16_t, u16) \ + PRED_##P (F, svuint32_t, u32) \ + PRED_##P (F, svuint64_t, u64) \ + PRED_##P (F, svint8_t, s8) \ + PRED_##P (F, svint16_t, s16) \ + PRED_##P (F, svint32_t, s32) \ + PRED_##P (F, svint64_t, s64) + +ALL_INTEGER_IMM (rshr, MXZn) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 8 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 16 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 24 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-store_scatter_index_restricted.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-store_scatter_index_restricted.c new file mode 100644 index 0000000..6bec3b3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-store_scatter_index_restricted.c @@ -0,0 +1,50 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target 
elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T1(F, TY1, TY2, TY3) \ + void F##_f (TY1 *base, TY2 indices, TY3 data) \ + { \ + sv##F (svpfalse_b (), base, indices, data); \ + } + +#define T2(F, TY1, TY3) \ + void F##_f (TY1 bases, int64_t index, TY3 data) \ + { \ + sv##F (svpfalse_b (), bases, index, data); \ + } + +#define T3(F, B, TYPE1, TY, TYPE2) \ + T1 (F##_scatter_s##B##index_##TY, TYPE2, svint##B##_t, TYPE1) \ + T1 (F##_scatter_u##B##index_##TY, TYPE2, svuint##B##_t, TYPE1) + +#define T4(F, B, TYPE1, TY) \ + T2 (F##_scatter_u##B##base_index_##TY, svuint##B##_t, TYPE1) + +#define TEST(F) \ + T3 (F##h, 64, svint64_t, s64, int16_t) \ + T3 (F##h, 64, svuint64_t, u64, int16_t) \ + T4 (F##h, 32, svuint32_t, u32) \ + T4 (F##h, 64, svuint64_t, u64) \ + T3 (F##w, 64, svint64_t, s64, int32_t) \ + T3 (F##w, 64, svuint64_t, u64, int32_t) \ + T4 (F##w, 64, svuint64_t, u64) \ + +#define SD_DATA(F) \ + T3 (F, 64, svfloat64_t, f64, float64_t) \ + T3 (F, 64, svint64_t, s64, int64_t) \ + T3 (F, 64, svuint64_t, u64, uint64_t) \ + T4 (F, 32, svfloat32_t, f32) \ + T4 (F, 32, svint32_t, s32) \ + T4 (F, 32, svuint32_t, u32) \ + T4 (F, 64, svfloat64_t, f64) \ + T4 (F, 64, svint64_t, s64) \ + T4 (F, 64, svuint64_t, u64) \ + +TEST (stnt1) +SD_DATA (stnt1) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 23 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 23 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-store_scatter_offset_restricted.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-store_scatter_offset_restricted.c new file mode 100644 index 0000000..bcb4a14 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-store_scatter_offset_restricted.c @@ -0,0 +1,65 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T1(F, TY1, TY2, TY3) \ + void F##_f (TY1 *base, TY2 offsets, TY3 data) \ + { \ + sv##F (svpfalse_b (), base, offsets, data); \ + } + +#define T2(F, TY1, TY3) \ + void F##_f (TY1 bases, int64_t offset, TY3 data) \ + { \ + sv##F (svpfalse_b (), bases, offset, data); \ + } + +#define T5(F, TY1, TY3) \ + void F##_f (TY1 bases, TY3 data) \ + { \ + sv##F (svpfalse_b (), bases, data); \ + } + + +#define T3(F, B, TYPE1, TY, TYPE2) \ + T1 (F##_scatter_u##B##offset_##TY, TYPE2, svuint##B##_t, TYPE1) + +#define T4(F, B, TYPE1, TY) \ + T2 (F##_scatter_u##B##base_offset_##TY, svuint##B##_t, TYPE1) \ + T5 (F##_scatter_u##B##base_##TY, svuint##B##_t, TYPE1) + +#define D_INTEGER(F, BHW) \ + T3 (F, 64, svint64_t, s64, int##BHW##_t) \ + T3 (F, 64, svuint64_t, u64, int##BHW##_t) \ + T4 (F, 64, svint64_t, s64) \ + T4 (F, 64, svuint64_t, u64) + +#define SD_INTEGER(F, BHW) \ + D_INTEGER (F, BHW) \ + T3 (F, 32, svuint32_t, u32, int##BHW##_t) \ + T4 (F, 32, svint32_t, s32) \ + T4 (F, 32, svuint32_t, u32) + +#define SD_DATA(F) \ + T3 (F, 64, svint64_t, s64, int64_t) \ + T4 (F, 32, svint32_t, s32) \ + T4 (F, 64, svint64_t, s64) \ + T3 (F, 32, svuint32_t, u32, uint32_t) \ + T3 (F, 64, svuint64_t, u64, uint64_t) \ + T4 (F, 32, svuint32_t, u32) \ + T4 (F, 64, svuint64_t, u64) \ + T3 (F, 32, svfloat32_t, f32, float32_t) \ + T3 (F, 64, svfloat64_t, f64, float64_t) \ + T4 (F, 32, svfloat32_t, f32) \ + T4 (F, 64, svfloat64_t, f64) + + +SD_DATA (stnt1) +SD_INTEGER (stnt1b, 8) +SD_INTEGER (stnt1h, 16) +D_INTEGER (stnt1w, 32) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 45 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 45 } } 
*/ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-unary.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-unary.c new file mode 100644 index 0000000..ba7e931 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-unary.c @@ -0,0 +1,32 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2 -march=armv9.2-a+sve+sme" } */ + +#include "../pfalse-unary_0.h" + +ALL_SIGNED (qabs, MXZ) +ALL_SIGNED (qneg, MXZ) +S_UNSIGNED (recpe, MXZ) +S_UNSIGNED (rsqrte, MXZ) + +#undef M +#define M(F, RTY, TY) \ + __arm_streaming \ + RTY F##_f (RTY inactive, TY op) \ + { \ + return sv##F (inactive, svpfalse_b (), op); \ + } + +#undef XZI +#define XZI(F, RTY, TY) \ + __arm_streaming \ + RTY F##_f (TY op) \ + { \ + return sv##F (svpfalse_b (), op); \ + } + +ALL_DATA (revd, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 22 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 44 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 66 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-unary_convert.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-unary_convert.c new file mode 100644 index 0000000..7aa59ff --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-unary_convert.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define XZ(F, TYPE1, TY1, TYPE2, TY2, P) \ + TYPE1 F##_##TY1##_##TY2##_##P##_f (TYPE2 op) \ + { \ + return sv##F##_##TY1##_##TY2##_##P (svpfalse_b (), op); \ + } \ + +#define M(F, TYPE1, TY1, TYPE2, TY2) \ + TYPE1 F##_##TY1##_##TY2##_m_f (TYPE1 inactive, TYPE2 op) \ + { \ + return sv##F##_##TY1##_##TY2##_m (inactive, svpfalse_b (), op); \ + } + +M (cvtx, svfloat32_t, f32, svfloat64_t, f64) +XZ (cvtx, svfloat32_t, f32, svfloat64_t, f64, x) +XZ (cvtx, svfloat32_t, f32, svfloat64_t, f64, z) +M (cvtlt, svfloat32_t, f32, svfloat16_t, f16) +XZ (cvtlt, svfloat32_t, f32, svfloat16_t, f16, x) +M (cvtlt, svfloat64_t, f64, svfloat32_t, f32) +XZ (cvtlt, svfloat64_t, f64, svfloat32_t, f32, x) + + + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 3 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 4 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 7 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-unary_convert_narrowt.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-unary_convert_narrowt.c new file mode 100644 index 0000000..1a4525c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-unary_convert_narrowt.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> + +#define T(F, TYPE1, TY1, TYPE2, TY2) \ + TYPE1 F##_##TY1##_##TY2##_x_f (TYPE1 even, TYPE2 op) \ + { \ + return sv##F##_##TY1##_##TY2##_x (even, svpfalse_b (), op); \ + } \ + TYPE1 F##_##TY1##_##TY2##_m_f (TYPE1 even, TYPE2 op) \ + { \ + return sv##F##_##TY1##_##TY2##_m (even, svpfalse_b (), op); \ + } + +T (cvtnt, svfloat16_t, f16, svfloat32_t, f32) +T (cvtnt, svfloat32_t, f32, svfloat64_t, f64) +T (cvtxnt, svfloat32_t, f32, svfloat64_t, f64) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 3 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?: [0-9]*[bhsd])?, #?0\n\tret\n} 3 } } */ +/* { dg-final { 
scan-assembler-times {\t.cfi_startproc\n} 6 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-unary_to_int.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-unary_to_int.c new file mode 100644 index 0000000..b64bfc3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-unary_to_int.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target elf } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-unary_0.h" + +ALL_FLOAT_INT (logb, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 3} } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 6 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 9 } } */ |
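As a quick illustration of what the new load and prefetch tests above expect (this snippet is not part of the patch; the function names are invented for the sketch, and SV_PLDL1KEEP merely stands in for the literal prefetch operation 0 used in the tests), source of the following shape, built at -O2 with SVE enabled, should match the scan-assembler patterns: a zeroing move plus RET for the load, and a bare RET for the prefetch.

#include <arm_sve.h>

/* Load under an all-false governing predicate: per the directives in the
   pfalse-load*.c tests, this should assemble to a single zeroing move
   (one of the movi/mov ... #0 forms scanned for above) followed by RET.  */
svint32_t ld1_pfalse (const int32_t *base)
{
  return svld1_s32 (svpfalse_b (), base);
}

/* Prefetch under an all-false governing predicate: per the directives in
   pfalse-prefetch.c, this should assemble to just RET.  */
void prfb_pfalse (const void *base)
{
  svprfb (svpfalse_b (), base, SV_PLDL1KEEP);
}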