aarch64: Add support for SME2 intrinsics

This patch adds support for the SME2 <arm_sme.h> intrinsics. The convention I've used is to put stuff in aarch64-sve-builtins-sme.* if it relates to ZA, ZT0, the streaming vector length, or other such SME state. Things that operate purely on predicates and vectors go in aarch64-sve-builtins-sve2.* instead. Some of these will later be picked up for SVE2p1. We previously used Uph internally as a constraint for 16-bit immediates to atomic instructions. However, we need a user-facing constraint for the upper predicate registers (already available as PR_HI_REGS), and Uph makes a natural pair with the existing Upl. gcc/ * config/aarch64/aarch64.h (TARGET_STREAMING_SME2): New macro. (P_ALIASES): Likewise. (REGISTER_NAMES): Add pn aliases of the predicate registers. (W8_W11_REGNUM_P): New macro. (W8_W11_REGS): New register class. (REG_CLASS_NAMES, REG_CLASS_CONTENTS): Update accordingly. * config/aarch64/aarch64.cc (aarch64_print_operand): Add support for %K, which prints a predicate as a counter. Handle tuples of predicates. (aarch64_regno_regclass): Handle W8_W11_REGS. (aarch64_class_max_nregs): Likewise. * config/aarch64/constraints.md (Uci, Uw2, Uw4): New constraints. (x, y): Move further up file. (Uph): Redefine as the high predicate registers, renaming the old constraint to... (Uih): ...this. * config/aarch64/predicates.md (const_0_to_7_operand): New predicate. (const_0_to_4_step_4_operand, const_0_to_6_step_2_operand): Likewise. (const_0_to_12_step_4_operand, const_0_to_14_step_2_operand): Likewise. (aarch64_simd_shift_imm_qi): Use const_0_to_7_operand. * config/aarch64/iterators.md (VNx16SI_ONLY, VNx8SI_ONLY) (VNx8DI_ONLY, SVE_FULL_BHSIx2, SVE_FULL_HF, SVE_FULL_SIx2_SDIx4) (SVE_FULL_BHS, SVE_FULLx24, SVE_DIx24, SVE_BHSx24, SVE_Ix24) (SVE_Fx24, SVE_SFx24, SME_ZA_BIx24, SME_ZA_BHIx124, SME_ZA_BHIx24) (SME_ZA_HFx124, SME_ZA_HFx24, SME_ZA_HIx124, SME_ZA_HIx24) (SME_ZA_SDIx24, SME_ZA_SDFx24): New mode iterators. (UNSPEC_REVD, UNSPEC_CNTP_C, UNSPEC_PEXT, UNSPEC_PEXTx2): New unspecs. (UNSPEC_PSEL, UNSPEC_PTRUE_C, UNSPEC_SQRSHR, UNSPEC_SQRSHRN) (UNSPEC_SQRSHRU, UNSPEC_SQRSHRUN, UNSPEC_UQRSHR, UNSPEC_UQRSHRN) (UNSPEC_UZP, UNSPEC_UZPQ, UNSPEC_ZIP, UNSPEC_ZIPQ, UNSPEC_BFMLSLB) (UNSPEC_BFMLSLT, UNSPEC_FCVTN, UNSPEC_FDOT, UNSPEC_SQCVT): Likewise. (UNSPEC_SQCVTN, UNSPEC_SQCVTU, UNSPEC_SQCVTUN, UNSPEC_UQCVT): Likewise. (UNSPEC_SME_ADD, UNSPEC_SME_ADD_WRITE, UNSPEC_SME_BMOPA): Likewise. (UNSPEC_SME_BMOPS, UNSPEC_SME_FADD, UNSPEC_SME_FDOT, UNSPEC_SME_FVDOT) (UNSPEC_SME_FMLA, UNSPEC_SME_FMLS, UNSPEC_SME_FSUB, UNSPEC_SME_READ) (UNSPEC_SME_SDOT, UNSPEC_SME_SVDOT, UNSPEC_SME_SMLA, UNSPEC_SME_SMLS) (UNSPEC_SME_SUB, UNSPEC_SME_SUB_WRITE, UNSPEC_SME_SUDOT): Likewise. (UNSPEC_SME_SUVDOT, UNSPEC_SME_UDOT, UNSPEC_SME_UVDOT): Likewise. (UNSPEC_SME_UMLA, UNSPEC_SME_UMLS, UNSPEC_SME_USDOT): Likewise. (UNSPEC_SME_USVDOT, UNSPEC_SME_WRITE): Likewise. (Vetype, VNARROW, V2XWIDE, Ventype, V_INT_EQUIV, v_int_equiv) (VSINGLE, vsingle, b): Add tuple modes. (v2xwide, za32_offset_range, za64_offset_range, za32_long) (za32_last_offset, vg_modifier, z_suffix, aligned_operand) (aligned_fpr): New mode attributes. (SVE_INT_BINARY_MULTI, SVE_INT_BINARY_SINGLE, SVE_INT_BINARY_MULTI) (SVE_FP_BINARY_MULTI): New int iterators. (SVE_BFLOAT_TERNARY_LONG): Add UNSPEC_BFMLSLB and UNSPEC_BFMLSLT. (SVE_BFLOAT_TERNARY_LONG_LANE): Likewise. (SVE_WHILE_ORDER, SVE2_INT_SHIFT_IMM_NARROWxN, SVE_QCVTxN) (SVE2_SFx24_UNARY, SVE2_x24_PERMUTE, SVE2_x24_PERMUTEQ) (UNSPEC_REVD_ONLY, SME2_INT_MOP, SME2_BMOP, SME_BINARY_SLICE_SDI) (SME_BINARY_SLICE_SDF, SME_BINARY_WRITE_SLICE_SDI, SME_INT_DOTPROD) (SME_INT_DOTPROD_LANE, SME_FP_DOTPROD, SME_FP_DOTPROD_LANE) (SME_INT_TERNARY_SLICE, SME_FP_TERNARY_SLICE, BHSD_BITS) (LUTI_BITS): New int iterators. (optab, sve_int_op): Handle the new unspecs. (sme_int_op, has_16bit_form): New int attributes. (bits_etype): Handle 64. * config/aarch64/aarch64.md (UNSPEC_LD1_SVE_COUNT): New unspec. (UNSPEC_ST1_SVE_COUNT, UNSPEC_LDNT1_SVE_COUNT): Likewise. (UNSPEC_STNT1_SVE_COUNT): Likewise. * config/aarch64/atomics.md (cas_short_expected_imm): Use Uhi rather than Uph for HImode immediates. * config/aarch64/aarch64-sve.md (@aarch64_ld1<SVE_FULLx24:mode>) (@aarch64_ldnt1<SVE_FULLx24:mode>, @aarch64_st1<SVE_FULLx24:mode>) (@aarch64_stnt1<SVE_FULLx24:mode>): New patterns. (@aarch64_<sur>dot_prod_lane<vsi2qi>): Extend to... (@aarch64_<sur>dot_prod_lane<SVE_FULL_SDI:mode><SVE_FULL_BHI:mode>) (@aarch64_<sur>dot_prod_lane<VNx4SI_ONLY:mode><VNx16QI_ONLY:mode>): ...these new patterns. (SVE_WHILE_B, SVE_WHILE_B_X2, SVE_WHILE_C): New constants. Add SVE_WHILE_B to existing while patterns. * config/aarch64/aarch64-sve2.md (@aarch64_sve_ptrue_c<BHSD_BITS>) (@aarch64_sve_pext<BHSD_BITS>, @aarch64_sve_pext<BHSD_BITS>x2) (@aarch64_sve_psel<BHSD_BITS>, *aarch64_sve_psel<BHSD_BITS>_plus) (@aarch64_sve_cntp_c<BHSD_BITS>, <frint_pattern><mode>2) (<optab><mode>3, *<optab><mode>3, @aarch64_sve_single_<optab><mode>) (@aarch64_sve_<sve_int_op><mode>): New patterns. (@aarch64_sve_single_<sve_int_op><mode>, @aarch64_sve_<su>clamp<mode>) (*aarch64_sve_<su>clamp<mode>_x, @aarch64_sve_<su>clamp_single<mode>) (@aarch64_sve_fclamp<mode>, *aarch64_sve_fclamp<mode>_x) (@aarch64_sve_fclamp_single<mode>, <optab><mode><v2xwide>2) (@aarch64_sve_<sur>dotvnx4sivnx8hi): New patterns. (@aarch64_sve_<maxmin_uns_op><mode>): Likewise. (*aarch64_sve_<maxmin_uns_op><mode>): Likewise. (@aarch64_sve_single_<maxmin_uns_op><mode>): Likewise. (aarch64_sve_fdotvnx4sfvnx8hf): Likewise. (aarch64_fdot_prod_lanevnx4sfvnx8hf): Likewise. (@aarch64_sve_<optab><VNx16QI_ONLY:mode><VNx16SI_ONLY:mode>): Likewise. (@aarch64_sve_<optab><VNx8HI_ONLY:mode><VNx8SI_ONLY:mode>): Likewise. (@aarch64_sve_<optab><VNx8HI_ONLY:mode><VNx8DI_ONLY:mode>): Likewise. (truncvnx8sf<mode>2, @aarch64_sve_cvtn<mode>): Likewise. (<optab><v_int_equiv><mode>2, <optab><mode><v_int_equiv>2): Likewise. (@aarch64_sve_sel<mode>): Likewise. (@aarch64_sve_while<while_optab_cmp>_b<BHSD_BITS>_x2): Likewise. (@aarch64_sve_while<while_optab_cmp>_c<BHSD_BITS>): Likewise. (@aarch64_pred_<optab><mode>, @cond_<optab><mode>): Likewise. (@aarch64_sve_<optab><mode>): Likewise. * config/aarch64/aarch64-sme.md (@aarch64_sme_<optab><mode><mode>) (*aarch64_sme_<optab><mode><mode>_plus, @aarch64_sme_read<mode>) (*aarch64_sme_read<mode>_plus, @aarch64_sme_write<mode>): New patterns. (*aarch64_sme_write<mode>_plus aarch64_sme_zero_zt0): Likewise. (@aarch64_sme_<optab><mode>, *aarch64_sme_<optab><mode>_plus) (@aarch64_sme_single_<optab><mode>): Likewise. (*aarch64_sme_single_<optab><mode>_plus): Likewise. (@aarch64_sme_<optab><SME_ZA_SDI:mode><SME_ZA_BHIx24:mode>) (*aarch64_sme_<optab><SME_ZA_SDI:mode><SME_ZA_BHIx24:mode>_plus) (@aarch64_sme_single_<optab><SME_ZA_SDI:mode><SME_ZA_BHIx24:mode>) (*aarch64_sme_single_<optab><SME_ZA_SDI:mode><SME_ZA_BHIx24:mode>_plus) (@aarch64_sme_single_sudot<VNx4SI_ONLY:mode><SME_ZA_BIx24:mode>) (*aarch64_sme_single_sudot<VNx4SI_ONLY:mode><SME_ZA_BIx24:mode>_plus) (@aarch64_sme_lane_<optab><SME_ZA_SDI:mode><SME_ZA_BHIx24:mode>) (*aarch64_sme_lane_<optab><SME_ZA_SDI:mode><SME_ZA_BHIx24:mode>_plus) (@aarch64_sme_<optab><VNx4SI_ONLY:mode><SVE_FULL_BHI:mode>) (*aarch64_sme_<optab><VNx4SI_ONLY:mode><SVE_FULL_BHI:mode>_plus) (@aarch64_sme_<optab><VNx4SI_ONLY:mode><SME_ZA_BHIx24:mode>) (*aarch64_sme_<optab><VNx4SI_ONLY:mode><SME_ZA_BHIx24:mode>_plus) (@aarch64_sme_single_<optab><VNx4SI_ONLY:mode><SME_ZA_BHIx24:mode>) (*aarch64_sme_single_<optab><VNx4SI_ONLY:mode><SME_ZA_BHIx24:mode>_plus) (@aarch64_sme_lane_<optab><VNx4SI_ONLY:mode><SME_ZA_BHIx124:mode>) (*aarch64_sme_lane_<optab><VNx4SI_ONLY:mode><SME_ZA_BHIx124:mode>) (@aarch64_sme_<optab><VNx2DI_ONLY:mode><VNx8HI_ONLY:mode>) (*aarch64_sme_<optab><VNx2DI_ONLY:mode><VNx8HI_ONLY:mode>_plus) (@aarch64_sme_<optab><VNx2DI_ONLY:mode><SME_ZA_HIx24:mode>) (*aarch64_sme_<optab><VNx2DI_ONLY:mode><SME_ZA_HIx24:mode>_plus) (@aarch64_sme_single_<optab><VNx2DI_ONLY:mode><SME_ZA_HIx24:mode>) (*aarch64_sme_single_<optab><VNx2DI_ONLY:mode><SME_ZA_HIx24:mode>_plus) (@aarch64_sme_lane_<optab><VNx2DI_ONLY:mode><SME_ZA_HIx124:mode>) (*aarch64_sme_lane_<optab><VNx2DI_ONLY:mode><SME_ZA_HIx124:mode>) (@aarch64_sme_<optab><VNx4SI_ONLY:mode><VNx8HI_ONLY:mode>) (@aarch64_sme_<optab><VNx4SI_ONLY:mode><VNx4SI_ONLY:mode>) (@aarch64_sme_<optab><VNx4SI_ONLY:mode><SME_ZA_HFx24:mode>) (*aarch64_sme_<optab><VNx4SI_ONLY:mode><SME_ZA_HFx24:mode>_plus) (@aarch64_sme_single_<optab><VNx4SI_ONLY:mode><SME_ZA_HFx24:mode>) (*aarch64_sme_single_<optab><VNx4SI_ONLY:mode><SME_ZA_HFx24:mode>_plus) (@aarch64_sme_lane_<optab><VNx4SI_ONLY:mode><SME_ZA_HFx24:mode>) (*aarch64_sme_lane_<optab><VNx4SI_ONLY:mode><SME_ZA_HFx24:mode>_plus) (@aarch64_sme_<optab><SME_ZA_SDF_I:mode><SME_ZA_SDFx24:mode>) (*aarch64_sme_<optab><SME_ZA_SDF_I:mode><SME_ZA_SDFx24:mode>_plus) (@aarch64_sme_single_<optab><SME_ZA_SDF_I:mode><SME_ZA_SDFx24:mode>) (*aarch64_sme_single_<optab><SME_ZA_SDF_I:mode><SME_ZA_SDFx24:mode>_plus) (@aarch64_sme_lane_<optab><SME_ZA_SDF_I:mode><SME_ZA_SDFx24:mode>) (*aarch64_sme_lane_<optab><SME_ZA_SDF_I:mode><SME_ZA_SDFx24:mode>) (@aarch64_sme_<optab><VNx4SI_ONLY:mode><SVE_FULL_HF:mode>) (*aarch64_sme_<optab><VNx4SI_ONLY:mode><SVE_FULL_HF:mode>_plus) (@aarch64_sme_lane_<optab><VNx4SI_ONLY:mode><SME_ZA_HFx124:mode>) (*aarch64_sme_lane_<optab><VNx4SI_ONLY:mode><SME_ZA_HFx124:mode>) (@aarch64_sme_lut<LUTI_BITS><mode>): Likewise. (UNSPEC_SME_LUTI): New unspec. * config/aarch64/aarch64-sve-builtins.def (single): New mode suffix. (c8, c16, c32, c64): New type suffixes. (vg1x2, vg1x4, vg2, vg2x1, vg2x2, vg2x4, vg4, vg4x1, vg4x2) (vg4x4): New group suffixes. * config/aarch64/aarch64-sve-builtins.h (CP_READ_ZT0) (CP_WRITE_ZT0): New constants. (get_svbool_t): Delete. (function_resolver::report_mismatched_num_vectors): New member function. (function_resolver::resolve_conversion): Likewise. (function_resolver::infer_predicate_type): Likewise. (function_resolver::infer_64bit_scalar_integer_pair): Likewise. (function_resolver::require_matching_predicate_type): Likewise. (function_resolver::require_nonscalar_type): Likewise. (function_resolver::finish_opt_single_resolution): Likewise. (function_resolver::require_derived_vector_type): Add an expected_num_vectors parameter. (function_expander::map_to_rtx_codes): Add an extra parameter for unconditional FP unspecs. (function_instance::gp_type_index): New member function. (function_instance::gp_type): Likewise. (function_instance::gp_mode): Handle multi-vector operations. * config/aarch64/aarch64-sve-builtins.cc (TYPES_all_count) (TYPES_all_pred_count, TYPES_c, TYPES_bhs_data, TYPES_bhs_widen) (TYPES_hs_data, TYPES_cvt_h_s_float, TYPES_cvt_s_s, TYPES_qcvt_x2) (TYPES_qcvt_x4, TYPES_qrshr_x2, TYPES_qrshru_x2, TYPES_qrshr_x4) (TYPES_qrshru_x4, TYPES_while_x, TYPES_while_x_c, TYPES_s_narrow_fsu) (TYPES_za_s_b_signed, TYPES_za_s_b_unsigned, TYPES_za_s_b_integer) (TYPES_za_s_h_integer, TYPES_za_s_h_data, TYPES_za_s_unsigned) (TYPES_za_s_float, TYPES_za_s_data, TYPES_za_d_h_integer): New type macros. (groups_x2, groups_x12, groups_x4, groups_x24, groups_x124) (groups_vg1x2, groups_vg1x4, groups_vg1x24, groups_vg2, groups_vg4) (groups_vg24): New group arrays. (function_instance::reads_global_state_p): Handle CP_READ_ZT0. (function_instance::modifies_global_state_p): Handle CP_WRITE_ZT0. (add_shared_state_attribute): Handle zt0 state. (function_builder::add_overloaded_functions): Skip MODE_single for non-tuple groups. (function_resolver::report_mismatched_num_vectors): New function. (function_resolver::resolve_to): Add a fallback error message for the general two-type case. (function_resolver::resolve_conversion): New function. (function_resolver::infer_predicate_type): Likewise. (function_resolver::infer_64bit_scalar_integer_pair): Likewise. (function_resolver::require_matching_predicate_type): Likewise. (function_resolver::require_matching_vector_type): Specifically diagnose mismatched vector counts. (function_resolver::require_derived_vector_type): Add an expected_num_vectors parameter. Extend to handle cases where tuples are expected. (function_resolver::require_nonscalar_type): New function. (function_resolver::check_gp_argument): Use gp_type_index rather than hard-coding VECTOR_TYPE_svbool_t. (function_resolver::finish_opt_single_resolution): New function. (function_checker::require_immediate_either_or): Remove hard-coded constants. (function_expander::direct_optab_handler): New function. (function_expander::use_pred_x_insn): Only add a strictness flag is the insn has an operand for it. (function_expander::map_to_rtx_codes): Take an unconditional FP unspec as an extra parameter. Handle tuples and MODE_single. (function_expander::map_to_unspecs): Handle tuples and MODE_single. * config/aarch64/aarch64-sve-builtins-functions.h (read_zt0) (write_zt0): New typedefs. (full_width_access::memory_vector): Use the function's vectors_per_tuple. (rtx_code_function_base): Add an optional unconditional FP unspec. (rtx_code_function::expand): Update accordingly. (rtx_code_function_rotated::expand): Likewise. (unspec_based_function_exact_insn::expand): Use tuple_mode instead of vector_mode. (unspec_based_uncond_function): New typedef. (cond_or_uncond_unspec_function): New class. (sme_1mode_function::expand): Handle single forms. (sme_2mode_function_t): Likewise, adding a template parameter for them. (sme_2mode_function): Update accordingly. (sme_2mode_lane_function): New typedef. (multireg_permute): New class. (class integer_conversion): Likewise. (while_comparison::expand): Handle svcount_t and svboolx2_t results. * config/aarch64/aarch64-sve-builtins-shapes.h (binary_int_opt_single_n, binary_opt_single_n, binary_single) (binary_za_slice_lane, binary_za_slice_int_opt_single) (binary_za_slice_opt_single, binary_za_slice_uint_opt_single) (binaryx, clamp, compare_scalar_count, count_pred_c) (dot_za_slice_int_lane, dot_za_slice_lane, dot_za_slice_uint_lane) (extract_pred, inherent_zt, ldr_zt, read_za, read_za_slice) (select_pred, shift_right_imm_narrowxn, storexn, str_zt) (unary_convertxn, unary_za_slice, unaryxn, write_za) (write_za_slice): Declare. * config/aarch64/aarch64-sve-builtins-shapes.cc (za_group_is_pure_overload): New function. (apply_predication): Use the function's gp_type for the predicate, instead of hard-coding the use of svbool_t. (parse_element_type): Add support for "c" (svcount_t). (parse_type): Add support for "c0" and "c1" (conversion destination and source types). (binary_za_slice_lane_base): New class. (binary_za_slice_opt_single_base): Likewise. (load_contiguous_base::resolve): Pass the group suffix to r.resolve. (luti_lane_zt_base): New class. (binary_int_opt_single_n, binary_opt_single_n, binary_single) (binary_za_slice_lane, binary_za_slice_int_opt_single) (binary_za_slice_opt_single, binary_za_slice_uint_opt_single) (binaryx, clamp): New shapes. (compare_scalar_def::build): Allow the return type to be a tuple. (compare_scalar_def::expand): Pass the group suffix to r.resolve. (compare_scalar_count, count_pred_c, dot_za_slice_int_lane) (dot_za_slice_lane, dot_za_slice_uint_lane, extract_pred, inherent_zt) (ldr_zt, read_za, read_za_slice, select_pred, shift_right_imm_narrowxn) (storexn, str_zt): New shapes. (ternary_qq_lane_def, ternary_qq_opt_n_def): Replace with... (ternary_qq_or_011_lane_def, ternary_qq_opt_n_or_011_def): ...these new classes. Allow a second suffix that specifies the type of the second vector argument, and that is used to derive the third. (unary_def::build): Extend to handle tuple types. (unary_convert_def::build): Use the new c0 and c1 format specifiers. (unary_convertxn, unary_za_slice, unaryxn, write_za): New shapes. (write_za_slice): Likewise. * config/aarch64/aarch64-sve-builtins-base.cc (svbic_impl::expand) (svext_bhw_impl::expand): Update call to map_to_rtx_costs. (svcntp_impl::expand): Handle svcount_t variants. (svcvt_impl::expand): Handle unpredicated conversions separately, dealing with tuples. (svdot_impl::expand): Handle 2-way dot products. (svdotprod_lane_impl::expand): Likewise. (svld1_impl::fold): Punt on tuple loads. (svld1_impl::expand): Handle tuple loads. (svldnt1_impl::expand): Likewise. (svpfalse_impl::fold): Punt on svcount_t forms. (svptrue_impl::fold): Likewise. (svptrue_impl::expand): Handle svcount_t forms. (svrint_impl): New class. (svsel_impl::fold): Punt on tuple forms. (svsel_impl::expand): Handle tuple forms. (svst1_impl::fold): Punt on tuple loads. (svst1_impl::expand): Handle tuple loads. (svstnt1_impl::expand): Likewise. (svwhilelx_impl::fold): Punt on tuple forms. (svdot_lane): Use UNSPEC_FDOT. (svmax, svmaxnm, svmin, svminmm): Add unconditional FP unspecs. (rinta, rinti, rintm, rintn, rintp, rintx, rintz): Use svrint_impl. * config/aarch64/aarch64-sve-builtins-base.def (svcreate2, svget2) (svset2, svundef2): Add _b variants. (svcvt): Use unary_convertxn. (svdot): Use ternary_qq_opt_n_or_011. (svdot_lane): Use ternary_qq_or_011_lane. (svmax, svmaxnm, svmin, svminnm): Use binary_opt_single_n. (svpfalse): Add a form that returns svcount_t results. (svrinta, svrintm, svrintn, svrintp): Use unaryxn. (svsel): Use binaryxn. (svst1, svstnt1): Use storexn. * config/aarch64/aarch64-sve-builtins-sme.h (svadd_za, svadd_write_za, svbmopa_za, svbmops_za, svdot_za) (svdot_lane_za, svldr_zt, svluti2_lane_zt, svluti4_lane_zt) (svmla_za, svmla_lane_za, svmls_za, svmls_lane_za, svread_za) (svstr_zt, svsub_za, svsub_write_za, svsudot_za, svsudot_lane_za) (svsuvdot_lane_za, svusdot_za, svusdot_lane_za, svusvdot_lane_za) (svvdot_lane_za, svwrite_za, svzero_zt): Declare. * config/aarch64/aarch64-sve-builtins-sme.cc (load_store_za_base): Rename to... (load_store_za_zt0_base): ...this and extend to tuples. (load_za_base, store_za_base): Update accordingly. (expand_ldr_str_zt0): New function. (svldr_zt_impl, svluti_lane_zt_impl, svread_za_impl, svstr_zt_impl) (svsudot_za_impl, svwrite_za_impl, svzero_zt_impl): New classes. (svadd_za, svadd_write_za, svbmopa_za, svbmops_za, svdot_za) (svdot_lane_za, svldr_zt, svluti2_lane_zt, svluti4_lane_zt) (svmla_za, svmla_lane_za, svmls_za, svmls_lane_za, svread_za) (svstr_zt, svsub_za, svsub_write_za, svsudot_za, svsudot_lane_za) (svsuvdot_lane_za, svusdot_za, svusdot_lane_za, svusvdot_lane_za) (svvdot_lane_za, svwrite_za, svzero_zt): New functions. * config/aarch64/aarch64-sve-builtins-sme.def: Add SME2 intrinsics. * config/aarch64/aarch64-sve-builtins-sve2.h (svbfmlslb, svbfmlslb_lane, svbfmlslt, svbfmlslt_lane, svclamp) (svcvtn, svpext, svpsel, svqcvt, svqcvtn, svqrshr, svqrshrn) (svqrshru, svqrshrun, svrevd, svunpk, svuzp, svuzpq, svzip) (svzipq): Declare. * config/aarch64/aarch64-sve-builtins-sve2.cc (svclamp_impl) (svcvtn_impl, svpext_impl, svpsel_impl): New classes. (svqrshl_impl::fold): Update for change to svrshl shape. (svrshl_impl::fold): Punt on tuple forms. (svsqadd_impl::expand): Update call to map_to_rtx_codes. (svunpk_impl): New class. (svbfmlslb, svbfmlslb_lane, svbfmlslt, svbfmlslt_lane, svclamp) (svcvtn, svpext, svpsel, svqcvt, svqcvtn, svqrshr, svqrshrn) (svqrshru, svqrshrun, svrevd, svunpk, svuzp, svuzpq, svzip) (svzipq): New functions. * config/aarch64/aarch64-sve-builtins-sve2.def: Add SME2 intrinsics. * config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Define or undefine __ARM_FEATURE_SME2. gcc/testsuite/ * gcc.target/aarch64/sve/acle/asm/test_sve_acle.h: Provide a way for test functions to share ZT0. (ATTR): Update accordingly. (TEST_LOAD_COUNT, TEST_STORE_COUNT, TEST_PN, TEST_COUNT_PN) (TEST_EXTRACT_PN, TEST_SELECT_P, TEST_COMPARE_S_X2, TEST_COMPARE_S_C) (TEST_CREATE_B, TEST_GET_B, TEST_SET_B, TEST_XN, TEST_XN_SINGLE) (TEST_XN_SINGLE_Z15, TEST_XN_SINGLE_AWKWARD, TEST_X2_NARROW) (TEST_X4_NARROW): New macros. * gcc.target/aarch64/sve/acle/asm/create2_1.c: Add _b tests. * gcc.target/aarch64/sve/acle/general-c/binary_za_m_1.c: Remove test for svmopa that becomes valid with SME2. * gcc.target/aarch64/sve/acle/general-c/create_1.c: Adjust for existence of svboolx2_t version of svcreate2. * gcc.target/aarch64/sve/acle/general-c/store_1.c: Adjust error messages to account for svcount_t predication. * gcc.target/aarch64/sve/acle/general-c/store_2.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/ternary_qq_lane_1.c: Adjust error messages to account for new SME2 variants. * gcc.target/aarch64/sve/acle/general-c/ternary_qq_opt_n_2.c: Likewise.
author: Richard Sandiford <richard.sandiford@arm.com> 2023-12-05 10:24:02 +0000
committer: Richard Sandiford <richard.sandiford@arm.com> 2023-12-05 10:24:02 +0000
commit: c1c267dfcdccd53af0d2b2c945794ab3b0d8683f (patch)
tree: 011229f73c4414a17739e7d16a4ec78c53fa5f3a /gcc/config/aarch64/atomics.md
parent: 8d29b7aca151339b07a3bf46cf0378c7f469ce6d (diff)
download: gcc-c1c267dfcdccd53af0d2b2c945794ab3b0d8683f.zip
gcc-c1c267dfcdccd53af0d2b2c945794ab3b0d8683f.tar.gz
gcc-c1c267dfcdccd53af0d2b2c945794ab3b0d8683f.tar.bz2
1 files changed, 1 insertions, 1 deletions
diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 055a873..3ca7f23 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -39,7 +39,7 @@
 (define_mode_attr cas_short_expected_pred
   [(QI "aarch64_reg_or_imm") (HI "aarch64_plushi_operand")])
 (define_mode_attr cas_short_expected_imm
-  [(QI "n") (HI "Uph")])
+  [(QI "n") (HI "Uih")])
 
 (define_insn_and_split "@aarch64_compare_and_swap<mode>"
   [(set (reg:CC CC_REGNUM)					;; bool out
author	Richard Sandiford <richard.sandiford@arm.com>	2023-12-05 10:24:02 +0000
committer	Richard Sandiford <richard.sandiford@arm.com>	2023-12-05 10:24:02 +0000
commit	c1c267dfcdccd53af0d2b2c945794ab3b0d8683f (patch)
tree	011229f73c4414a17739e7d16a4ec78c53fa5f3a /gcc/config/aarch64/atomics.md
parent	8d29b7aca151339b07a3bf46cf0378c7f469ce6d (diff)
download	gcc-c1c267dfcdccd53af0d2b2c945794ab3b0d8683f.zip gcc-c1c267dfcdccd53af0d2b2c945794ab3b0d8683f.tar.gz gcc-c1c267dfcdccd53af0d2b2c945794ab3b0d8683f.tar.bz2