aboutsummaryrefslogtreecommitdiff
path: root/gcc/config/aarch64/atomics.md
AgeCommit message (Collapse)AuthorFilesLines
2024-01-03Update copyright years.Jakub Jelinek1-1/+1
2023-12-05aarch64: Add support for SME2 intrinsicsRichard Sandiford1-1/+1
This patch adds support for the SME2 <arm_sme.h> intrinsics. The convention I've used is to put stuff in aarch64-sve-builtins-sme.* if it relates to ZA, ZT0, the streaming vector length, or other such SME state. Things that operate purely on predicates and vectors go in aarch64-sve-builtins-sve2.* instead. Some of these will later be picked up for SVE2p1. We previously used Uph internally as a constraint for 16-bit immediates to atomic instructions. However, we need a user-facing constraint for the upper predicate registers (already available as PR_HI_REGS), and Uph makes a natural pair with the existing Upl. gcc/ * config/aarch64/aarch64.h (TARGET_STREAMING_SME2): New macro. (P_ALIASES): Likewise. (REGISTER_NAMES): Add pn aliases of the predicate registers. (W8_W11_REGNUM_P): New macro. (W8_W11_REGS): New register class. (REG_CLASS_NAMES, REG_CLASS_CONTENTS): Update accordingly. * config/aarch64/aarch64.cc (aarch64_print_operand): Add support for %K, which prints a predicate as a counter. Handle tuples of predicates. (aarch64_regno_regclass): Handle W8_W11_REGS. (aarch64_class_max_nregs): Likewise. * config/aarch64/constraints.md (Uci, Uw2, Uw4): New constraints. (x, y): Move further up file. (Uph): Redefine as the high predicate registers, renaming the old constraint to... (Uih): ...this. * config/aarch64/predicates.md (const_0_to_7_operand): New predicate. (const_0_to_4_step_4_operand, const_0_to_6_step_2_operand): Likewise. (const_0_to_12_step_4_operand, const_0_to_14_step_2_operand): Likewise. (aarch64_simd_shift_imm_qi): Use const_0_to_7_operand. * config/aarch64/iterators.md (VNx16SI_ONLY, VNx8SI_ONLY) (VNx8DI_ONLY, SVE_FULL_BHSIx2, SVE_FULL_HF, SVE_FULL_SIx2_SDIx4) (SVE_FULL_BHS, SVE_FULLx24, SVE_DIx24, SVE_BHSx24, SVE_Ix24) (SVE_Fx24, SVE_SFx24, SME_ZA_BIx24, SME_ZA_BHIx124, SME_ZA_BHIx24) (SME_ZA_HFx124, SME_ZA_HFx24, SME_ZA_HIx124, SME_ZA_HIx24) (SME_ZA_SDIx24, SME_ZA_SDFx24): New mode iterators. (UNSPEC_REVD, UNSPEC_CNTP_C, UNSPEC_PEXT, UNSPEC_PEXTx2): New unspecs. (UNSPEC_PSEL, UNSPEC_PTRUE_C, UNSPEC_SQRSHR, UNSPEC_SQRSHRN) (UNSPEC_SQRSHRU, UNSPEC_SQRSHRUN, UNSPEC_UQRSHR, UNSPEC_UQRSHRN) (UNSPEC_UZP, UNSPEC_UZPQ, UNSPEC_ZIP, UNSPEC_ZIPQ, UNSPEC_BFMLSLB) (UNSPEC_BFMLSLT, UNSPEC_FCVTN, UNSPEC_FDOT, UNSPEC_SQCVT): Likewise. (UNSPEC_SQCVTN, UNSPEC_SQCVTU, UNSPEC_SQCVTUN, UNSPEC_UQCVT): Likewise. (UNSPEC_SME_ADD, UNSPEC_SME_ADD_WRITE, UNSPEC_SME_BMOPA): Likewise. (UNSPEC_SME_BMOPS, UNSPEC_SME_FADD, UNSPEC_SME_FDOT, UNSPEC_SME_FVDOT) (UNSPEC_SME_FMLA, UNSPEC_SME_FMLS, UNSPEC_SME_FSUB, UNSPEC_SME_READ) (UNSPEC_SME_SDOT, UNSPEC_SME_SVDOT, UNSPEC_SME_SMLA, UNSPEC_SME_SMLS) (UNSPEC_SME_SUB, UNSPEC_SME_SUB_WRITE, UNSPEC_SME_SUDOT): Likewise. (UNSPEC_SME_SUVDOT, UNSPEC_SME_UDOT, UNSPEC_SME_UVDOT): Likewise. (UNSPEC_SME_UMLA, UNSPEC_SME_UMLS, UNSPEC_SME_USDOT): Likewise. (UNSPEC_SME_USVDOT, UNSPEC_SME_WRITE): Likewise. (Vetype, VNARROW, V2XWIDE, Ventype, V_INT_EQUIV, v_int_equiv) (VSINGLE, vsingle, b): Add tuple modes. (v2xwide, za32_offset_range, za64_offset_range, za32_long) (za32_last_offset, vg_modifier, z_suffix, aligned_operand) (aligned_fpr): New mode attributes. (SVE_INT_BINARY_MULTI, SVE_INT_BINARY_SINGLE, SVE_INT_BINARY_MULTI) (SVE_FP_BINARY_MULTI): New int iterators. (SVE_BFLOAT_TERNARY_LONG): Add UNSPEC_BFMLSLB and UNSPEC_BFMLSLT. (SVE_BFLOAT_TERNARY_LONG_LANE): Likewise. (SVE_WHILE_ORDER, SVE2_INT_SHIFT_IMM_NARROWxN, SVE_QCVTxN) (SVE2_SFx24_UNARY, SVE2_x24_PERMUTE, SVE2_x24_PERMUTEQ) (UNSPEC_REVD_ONLY, SME2_INT_MOP, SME2_BMOP, SME_BINARY_SLICE_SDI) (SME_BINARY_SLICE_SDF, SME_BINARY_WRITE_SLICE_SDI, SME_INT_DOTPROD) (SME_INT_DOTPROD_LANE, SME_FP_DOTPROD, SME_FP_DOTPROD_LANE) (SME_INT_TERNARY_SLICE, SME_FP_TERNARY_SLICE, BHSD_BITS) (LUTI_BITS): New int iterators. (optab, sve_int_op): Handle the new unspecs. (sme_int_op, has_16bit_form): New int attributes. (bits_etype): Handle 64. * config/aarch64/aarch64.md (UNSPEC_LD1_SVE_COUNT): New unspec. (UNSPEC_ST1_SVE_COUNT, UNSPEC_LDNT1_SVE_COUNT): Likewise. (UNSPEC_STNT1_SVE_COUNT): Likewise. * config/aarch64/atomics.md (cas_short_expected_imm): Use Uhi rather than Uph for HImode immediates. * config/aarch64/aarch64-sve.md (@aarch64_ld1<SVE_FULLx24:mode>) (@aarch64_ldnt1<SVE_FULLx24:mode>, @aarch64_st1<SVE_FULLx24:mode>) (@aarch64_stnt1<SVE_FULLx24:mode>): New patterns. (@aarch64_<sur>dot_prod_lane<vsi2qi>): Extend to... (@aarch64_<sur>dot_prod_lane<SVE_FULL_SDI:mode><SVE_FULL_BHI:mode>) (@aarch64_<sur>dot_prod_lane<VNx4SI_ONLY:mode><VNx16QI_ONLY:mode>): ...these new patterns. (SVE_WHILE_B, SVE_WHILE_B_X2, SVE_WHILE_C): New constants. Add SVE_WHILE_B to existing while patterns. * config/aarch64/aarch64-sve2.md (@aarch64_sve_ptrue_c<BHSD_BITS>) (@aarch64_sve_pext<BHSD_BITS>, @aarch64_sve_pext<BHSD_BITS>x2) (@aarch64_sve_psel<BHSD_BITS>, *aarch64_sve_psel<BHSD_BITS>_plus) (@aarch64_sve_cntp_c<BHSD_BITS>, <frint_pattern><mode>2) (<optab><mode>3, *<optab><mode>3, @aarch64_sve_single_<optab><mode>) (@aarch64_sve_<sve_int_op><mode>): New patterns. (@aarch64_sve_single_<sve_int_op><mode>, @aarch64_sve_<su>clamp<mode>) (*aarch64_sve_<su>clamp<mode>_x, @aarch64_sve_<su>clamp_single<mode>) (@aarch64_sve_fclamp<mode>, *aarch64_sve_fclamp<mode>_x) (@aarch64_sve_fclamp_single<mode>, <optab><mode><v2xwide>2) (@aarch64_sve_<sur>dotvnx4sivnx8hi): New patterns. (@aarch64_sve_<maxmin_uns_op><mode>): Likewise. (*aarch64_sve_<maxmin_uns_op><mode>): Likewise. (@aarch64_sve_single_<maxmin_uns_op><mode>): Likewise. (aarch64_sve_fdotvnx4sfvnx8hf): Likewise. (aarch64_fdot_prod_lanevnx4sfvnx8hf): Likewise. (@aarch64_sve_<optab><VNx16QI_ONLY:mode><VNx16SI_ONLY:mode>): Likewise. (@aarch64_sve_<optab><VNx8HI_ONLY:mode><VNx8SI_ONLY:mode>): Likewise. (@aarch64_sve_<optab><VNx8HI_ONLY:mode><VNx8DI_ONLY:mode>): Likewise. (truncvnx8sf<mode>2, @aarch64_sve_cvtn<mode>): Likewise. (<optab><v_int_equiv><mode>2, <optab><mode><v_int_equiv>2): Likewise. (@aarch64_sve_sel<mode>): Likewise. (@aarch64_sve_while<while_optab_cmp>_b<BHSD_BITS>_x2): Likewise. (@aarch64_sve_while<while_optab_cmp>_c<BHSD_BITS>): Likewise. (@aarch64_pred_<optab><mode>, @cond_<optab><mode>): Likewise. (@aarch64_sve_<optab><mode>): Likewise. * config/aarch64/aarch64-sme.md (@aarch64_sme_<optab><mode><mode>) (*aarch64_sme_<optab><mode><mode>_plus, @aarch64_sme_read<mode>) (*aarch64_sme_read<mode>_plus, @aarch64_sme_write<mode>): New patterns. (*aarch64_sme_write<mode>_plus aarch64_sme_zero_zt0): Likewise. (@aarch64_sme_<optab><mode>, *aarch64_sme_<optab><mode>_plus) (@aarch64_sme_single_<optab><mode>): Likewise. (*aarch64_sme_single_<optab><mode>_plus): Likewise. (@aarch64_sme_<optab><SME_ZA_SDI:mode><SME_ZA_BHIx24:mode>) (*aarch64_sme_<optab><SME_ZA_SDI:mode><SME_ZA_BHIx24:mode>_plus) (@aarch64_sme_single_<optab><SME_ZA_SDI:mode><SME_ZA_BHIx24:mode>) (*aarch64_sme_single_<optab><SME_ZA_SDI:mode><SME_ZA_BHIx24:mode>_plus) (@aarch64_sme_single_sudot<VNx4SI_ONLY:mode><SME_ZA_BIx24:mode>) (*aarch64_sme_single_sudot<VNx4SI_ONLY:mode><SME_ZA_BIx24:mode>_plus) (@aarch64_sme_lane_<optab><SME_ZA_SDI:mode><SME_ZA_BHIx24:mode>) (*aarch64_sme_lane_<optab><SME_ZA_SDI:mode><SME_ZA_BHIx24:mode>_plus) (@aarch64_sme_<optab><VNx4SI_ONLY:mode><SVE_FULL_BHI:mode>) (*aarch64_sme_<optab><VNx4SI_ONLY:mode><SVE_FULL_BHI:mode>_plus) (@aarch64_sme_<optab><VNx4SI_ONLY:mode><SME_ZA_BHIx24:mode>) (*aarch64_sme_<optab><VNx4SI_ONLY:mode><SME_ZA_BHIx24:mode>_plus) (@aarch64_sme_single_<optab><VNx4SI_ONLY:mode><SME_ZA_BHIx24:mode>) (*aarch64_sme_single_<optab><VNx4SI_ONLY:mode><SME_ZA_BHIx24:mode>_plus) (@aarch64_sme_lane_<optab><VNx4SI_ONLY:mode><SME_ZA_BHIx124:mode>) (*aarch64_sme_lane_<optab><VNx4SI_ONLY:mode><SME_ZA_BHIx124:mode>) (@aarch64_sme_<optab><VNx2DI_ONLY:mode><VNx8HI_ONLY:mode>) (*aarch64_sme_<optab><VNx2DI_ONLY:mode><VNx8HI_ONLY:mode>_plus) (@aarch64_sme_<optab><VNx2DI_ONLY:mode><SME_ZA_HIx24:mode>) (*aarch64_sme_<optab><VNx2DI_ONLY:mode><SME_ZA_HIx24:mode>_plus) (@aarch64_sme_single_<optab><VNx2DI_ONLY:mode><SME_ZA_HIx24:mode>) (*aarch64_sme_single_<optab><VNx2DI_ONLY:mode><SME_ZA_HIx24:mode>_plus) (@aarch64_sme_lane_<optab><VNx2DI_ONLY:mode><SME_ZA_HIx124:mode>) (*aarch64_sme_lane_<optab><VNx2DI_ONLY:mode><SME_ZA_HIx124:mode>) (@aarch64_sme_<optab><VNx4SI_ONLY:mode><VNx8HI_ONLY:mode>) (@aarch64_sme_<optab><VNx4SI_ONLY:mode><VNx4SI_ONLY:mode>) (@aarch64_sme_<optab><VNx4SI_ONLY:mode><SME_ZA_HFx24:mode>) (*aarch64_sme_<optab><VNx4SI_ONLY:mode><SME_ZA_HFx24:mode>_plus) (@aarch64_sme_single_<optab><VNx4SI_ONLY:mode><SME_ZA_HFx24:mode>) (*aarch64_sme_single_<optab><VNx4SI_ONLY:mode><SME_ZA_HFx24:mode>_plus) (@aarch64_sme_lane_<optab><VNx4SI_ONLY:mode><SME_ZA_HFx24:mode>) (*aarch64_sme_lane_<optab><VNx4SI_ONLY:mode><SME_ZA_HFx24:mode>_plus) (@aarch64_sme_<optab><SME_ZA_SDF_I:mode><SME_ZA_SDFx24:mode>) (*aarch64_sme_<optab><SME_ZA_SDF_I:mode><SME_ZA_SDFx24:mode>_plus) (@aarch64_sme_single_<optab><SME_ZA_SDF_I:mode><SME_ZA_SDFx24:mode>) (*aarch64_sme_single_<optab><SME_ZA_SDF_I:mode><SME_ZA_SDFx24:mode>_plus) (@aarch64_sme_lane_<optab><SME_ZA_SDF_I:mode><SME_ZA_SDFx24:mode>) (*aarch64_sme_lane_<optab><SME_ZA_SDF_I:mode><SME_ZA_SDFx24:mode>) (@aarch64_sme_<optab><VNx4SI_ONLY:mode><SVE_FULL_HF:mode>) (*aarch64_sme_<optab><VNx4SI_ONLY:mode><SVE_FULL_HF:mode>_plus) (@aarch64_sme_lane_<optab><VNx4SI_ONLY:mode><SME_ZA_HFx124:mode>) (*aarch64_sme_lane_<optab><VNx4SI_ONLY:mode><SME_ZA_HFx124:mode>) (@aarch64_sme_lut<LUTI_BITS><mode>): Likewise. (UNSPEC_SME_LUTI): New unspec. * config/aarch64/aarch64-sve-builtins.def (single): New mode suffix. (c8, c16, c32, c64): New type suffixes. (vg1x2, vg1x4, vg2, vg2x1, vg2x2, vg2x4, vg4, vg4x1, vg4x2) (vg4x4): New group suffixes. * config/aarch64/aarch64-sve-builtins.h (CP_READ_ZT0) (CP_WRITE_ZT0): New constants. (get_svbool_t): Delete. (function_resolver::report_mismatched_num_vectors): New member function. (function_resolver::resolve_conversion): Likewise. (function_resolver::infer_predicate_type): Likewise. (function_resolver::infer_64bit_scalar_integer_pair): Likewise. (function_resolver::require_matching_predicate_type): Likewise. (function_resolver::require_nonscalar_type): Likewise. (function_resolver::finish_opt_single_resolution): Likewise. (function_resolver::require_derived_vector_type): Add an expected_num_vectors parameter. (function_expander::map_to_rtx_codes): Add an extra parameter for unconditional FP unspecs. (function_instance::gp_type_index): New member function. (function_instance::gp_type): Likewise. (function_instance::gp_mode): Handle multi-vector operations. * config/aarch64/aarch64-sve-builtins.cc (TYPES_all_count) (TYPES_all_pred_count, TYPES_c, TYPES_bhs_data, TYPES_bhs_widen) (TYPES_hs_data, TYPES_cvt_h_s_float, TYPES_cvt_s_s, TYPES_qcvt_x2) (TYPES_qcvt_x4, TYPES_qrshr_x2, TYPES_qrshru_x2, TYPES_qrshr_x4) (TYPES_qrshru_x4, TYPES_while_x, TYPES_while_x_c, TYPES_s_narrow_fsu) (TYPES_za_s_b_signed, TYPES_za_s_b_unsigned, TYPES_za_s_b_integer) (TYPES_za_s_h_integer, TYPES_za_s_h_data, TYPES_za_s_unsigned) (TYPES_za_s_float, TYPES_za_s_data, TYPES_za_d_h_integer): New type macros. (groups_x2, groups_x12, groups_x4, groups_x24, groups_x124) (groups_vg1x2, groups_vg1x4, groups_vg1x24, groups_vg2, groups_vg4) (groups_vg24): New group arrays. (function_instance::reads_global_state_p): Handle CP_READ_ZT0. (function_instance::modifies_global_state_p): Handle CP_WRITE_ZT0. (add_shared_state_attribute): Handle zt0 state. (function_builder::add_overloaded_functions): Skip MODE_single for non-tuple groups. (function_resolver::report_mismatched_num_vectors): New function. (function_resolver::resolve_to): Add a fallback error message for the general two-type case. (function_resolver::resolve_conversion): New function. (function_resolver::infer_predicate_type): Likewise. (function_resolver::infer_64bit_scalar_integer_pair): Likewise. (function_resolver::require_matching_predicate_type): Likewise. (function_resolver::require_matching_vector_type): Specifically diagnose mismatched vector counts. (function_resolver::require_derived_vector_type): Add an expected_num_vectors parameter. Extend to handle cases where tuples are expected. (function_resolver::require_nonscalar_type): New function. (function_resolver::check_gp_argument): Use gp_type_index rather than hard-coding VECTOR_TYPE_svbool_t. (function_resolver::finish_opt_single_resolution): New function. (function_checker::require_immediate_either_or): Remove hard-coded constants. (function_expander::direct_optab_handler): New function. (function_expander::use_pred_x_insn): Only add a strictness flag is the insn has an operand for it. (function_expander::map_to_rtx_codes): Take an unconditional FP unspec as an extra parameter. Handle tuples and MODE_single. (function_expander::map_to_unspecs): Handle tuples and MODE_single. * config/aarch64/aarch64-sve-builtins-functions.h (read_zt0) (write_zt0): New typedefs. (full_width_access::memory_vector): Use the function's vectors_per_tuple. (rtx_code_function_base): Add an optional unconditional FP unspec. (rtx_code_function::expand): Update accordingly. (rtx_code_function_rotated::expand): Likewise. (unspec_based_function_exact_insn::expand): Use tuple_mode instead of vector_mode. (unspec_based_uncond_function): New typedef. (cond_or_uncond_unspec_function): New class. (sme_1mode_function::expand): Handle single forms. (sme_2mode_function_t): Likewise, adding a template parameter for them. (sme_2mode_function): Update accordingly. (sme_2mode_lane_function): New typedef. (multireg_permute): New class. (class integer_conversion): Likewise. (while_comparison::expand): Handle svcount_t and svboolx2_t results. * config/aarch64/aarch64-sve-builtins-shapes.h (binary_int_opt_single_n, binary_opt_single_n, binary_single) (binary_za_slice_lane, binary_za_slice_int_opt_single) (binary_za_slice_opt_single, binary_za_slice_uint_opt_single) (binaryx, clamp, compare_scalar_count, count_pred_c) (dot_za_slice_int_lane, dot_za_slice_lane, dot_za_slice_uint_lane) (extract_pred, inherent_zt, ldr_zt, read_za, read_za_slice) (select_pred, shift_right_imm_narrowxn, storexn, str_zt) (unary_convertxn, unary_za_slice, unaryxn, write_za) (write_za_slice): Declare. * config/aarch64/aarch64-sve-builtins-shapes.cc (za_group_is_pure_overload): New function. (apply_predication): Use the function's gp_type for the predicate, instead of hard-coding the use of svbool_t. (parse_element_type): Add support for "c" (svcount_t). (parse_type): Add support for "c0" and "c1" (conversion destination and source types). (binary_za_slice_lane_base): New class. (binary_za_slice_opt_single_base): Likewise. (load_contiguous_base::resolve): Pass the group suffix to r.resolve. (luti_lane_zt_base): New class. (binary_int_opt_single_n, binary_opt_single_n, binary_single) (binary_za_slice_lane, binary_za_slice_int_opt_single) (binary_za_slice_opt_single, binary_za_slice_uint_opt_single) (binaryx, clamp): New shapes. (compare_scalar_def::build): Allow the return type to be a tuple. (compare_scalar_def::expand): Pass the group suffix to r.resolve. (compare_scalar_count, count_pred_c, dot_za_slice_int_lane) (dot_za_slice_lane, dot_za_slice_uint_lane, extract_pred, inherent_zt) (ldr_zt, read_za, read_za_slice, select_pred, shift_right_imm_narrowxn) (storexn, str_zt): New shapes. (ternary_qq_lane_def, ternary_qq_opt_n_def): Replace with... (ternary_qq_or_011_lane_def, ternary_qq_opt_n_or_011_def): ...these new classes. Allow a second suffix that specifies the type of the second vector argument, and that is used to derive the third. (unary_def::build): Extend to handle tuple types. (unary_convert_def::build): Use the new c0 and c1 format specifiers. (unary_convertxn, unary_za_slice, unaryxn, write_za): New shapes. (write_za_slice): Likewise. * config/aarch64/aarch64-sve-builtins-base.cc (svbic_impl::expand) (svext_bhw_impl::expand): Update call to map_to_rtx_costs. (svcntp_impl::expand): Handle svcount_t variants. (svcvt_impl::expand): Handle unpredicated conversions separately, dealing with tuples. (svdot_impl::expand): Handle 2-way dot products. (svdotprod_lane_impl::expand): Likewise. (svld1_impl::fold): Punt on tuple loads. (svld1_impl::expand): Handle tuple loads. (svldnt1_impl::expand): Likewise. (svpfalse_impl::fold): Punt on svcount_t forms. (svptrue_impl::fold): Likewise. (svptrue_impl::expand): Handle svcount_t forms. (svrint_impl): New class. (svsel_impl::fold): Punt on tuple forms. (svsel_impl::expand): Handle tuple forms. (svst1_impl::fold): Punt on tuple loads. (svst1_impl::expand): Handle tuple loads. (svstnt1_impl::expand): Likewise. (svwhilelx_impl::fold): Punt on tuple forms. (svdot_lane): Use UNSPEC_FDOT. (svmax, svmaxnm, svmin, svminmm): Add unconditional FP unspecs. (rinta, rinti, rintm, rintn, rintp, rintx, rintz): Use svrint_impl. * config/aarch64/aarch64-sve-builtins-base.def (svcreate2, svget2) (svset2, svundef2): Add _b variants. (svcvt): Use unary_convertxn. (svdot): Use ternary_qq_opt_n_or_011. (svdot_lane): Use ternary_qq_or_011_lane. (svmax, svmaxnm, svmin, svminnm): Use binary_opt_single_n. (svpfalse): Add a form that returns svcount_t results. (svrinta, svrintm, svrintn, svrintp): Use unaryxn. (svsel): Use binaryxn. (svst1, svstnt1): Use storexn. * config/aarch64/aarch64-sve-builtins-sme.h (svadd_za, svadd_write_za, svbmopa_za, svbmops_za, svdot_za) (svdot_lane_za, svldr_zt, svluti2_lane_zt, svluti4_lane_zt) (svmla_za, svmla_lane_za, svmls_za, svmls_lane_za, svread_za) (svstr_zt, svsub_za, svsub_write_za, svsudot_za, svsudot_lane_za) (svsuvdot_lane_za, svusdot_za, svusdot_lane_za, svusvdot_lane_za) (svvdot_lane_za, svwrite_za, svzero_zt): Declare. * config/aarch64/aarch64-sve-builtins-sme.cc (load_store_za_base): Rename to... (load_store_za_zt0_base): ...this and extend to tuples. (load_za_base, store_za_base): Update accordingly. (expand_ldr_str_zt0): New function. (svldr_zt_impl, svluti_lane_zt_impl, svread_za_impl, svstr_zt_impl) (svsudot_za_impl, svwrite_za_impl, svzero_zt_impl): New classes. (svadd_za, svadd_write_za, svbmopa_za, svbmops_za, svdot_za) (svdot_lane_za, svldr_zt, svluti2_lane_zt, svluti4_lane_zt) (svmla_za, svmla_lane_za, svmls_za, svmls_lane_za, svread_za) (svstr_zt, svsub_za, svsub_write_za, svsudot_za, svsudot_lane_za) (svsuvdot_lane_za, svusdot_za, svusdot_lane_za, svusvdot_lane_za) (svvdot_lane_za, svwrite_za, svzero_zt): New functions. * config/aarch64/aarch64-sve-builtins-sme.def: Add SME2 intrinsics. * config/aarch64/aarch64-sve-builtins-sve2.h (svbfmlslb, svbfmlslb_lane, svbfmlslt, svbfmlslt_lane, svclamp) (svcvtn, svpext, svpsel, svqcvt, svqcvtn, svqrshr, svqrshrn) (svqrshru, svqrshrun, svrevd, svunpk, svuzp, svuzpq, svzip) (svzipq): Declare. * config/aarch64/aarch64-sve-builtins-sve2.cc (svclamp_impl) (svcvtn_impl, svpext_impl, svpsel_impl): New classes. (svqrshl_impl::fold): Update for change to svrshl shape. (svrshl_impl::fold): Punt on tuple forms. (svsqadd_impl::expand): Update call to map_to_rtx_codes. (svunpk_impl): New class. (svbfmlslb, svbfmlslb_lane, svbfmlslt, svbfmlslt_lane, svclamp) (svcvtn, svpext, svpsel, svqcvt, svqcvtn, svqrshr, svqrshrn) (svqrshru, svqrshrun, svrevd, svunpk, svuzp, svuzpq, svzip) (svzipq): New functions. * config/aarch64/aarch64-sve-builtins-sve2.def: Add SME2 intrinsics. * config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Define or undefine __ARM_FEATURE_SME2. gcc/testsuite/ * gcc.target/aarch64/sve/acle/asm/test_sve_acle.h: Provide a way for test functions to share ZT0. (ATTR): Update accordingly. (TEST_LOAD_COUNT, TEST_STORE_COUNT, TEST_PN, TEST_COUNT_PN) (TEST_EXTRACT_PN, TEST_SELECT_P, TEST_COMPARE_S_X2, TEST_COMPARE_S_C) (TEST_CREATE_B, TEST_GET_B, TEST_SET_B, TEST_XN, TEST_XN_SINGLE) (TEST_XN_SINGLE_Z15, TEST_XN_SINGLE_AWKWARD, TEST_X2_NARROW) (TEST_X4_NARROW): New macros. * gcc.target/aarch64/sve/acle/asm/create2_1.c: Add _b tests. * gcc.target/aarch64/sve/acle/general-c/binary_za_m_1.c: Remove test for svmopa that becomes valid with SME2. * gcc.target/aarch64/sve/acle/general-c/create_1.c: Adjust for existence of svboolx2_t version of svcreate2. * gcc.target/aarch64/sve/acle/general-c/store_1.c: Adjust error messages to account for svcount_t predication. * gcc.target/aarch64/sve/acle/general-c/store_2.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/ternary_qq_lane_1.c: Adjust error messages to account for new SME2 variants. * gcc.target/aarch64/sve/acle/general-c/ternary_qq_opt_n_2.c: Likewise.
2023-10-24aarch64: Avoid bogus atomics matchRichard Sandiford1-1/+1
The non-LSE pattern aarch64_atomic_exchange<mode> comes before the LSE pattern aarch64_atomic_exchange<mode>_lse. From a recog perspective, the only difference between the patterns is that the non-LSE one clobbers CC and needs a scratch. However, combine and RTL-SSA can both add clobbers to make a pattern match. This means that if they try to rerecognise an LSE pattern, they could end up turning it into a non-LSE pattern. This patch adds a !TARGET_LSE test to avoid that. This is needed to avoid a regression with later patches. gcc/ * config/aarch64/atomics.md (aarch64_atomic_exchange<mode>): Require !TARGET_LSE.
2023-04-18aarch64: Add QI -> HI zero-extension for LDAPRKyrylo Tkachov1-3/+3
This patch is a straightforward extension of the zero-extending LDAPR pattern to represent QI -> HI load-extends. This maps down to a LDAPRB-W instruction. This lets us remove a redundant zero-extend in the new test function. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/atomics.md (*aarch64_atomic_load<ALLX:mode>_rcpc_zext): Use SD_HSDI for destination mode iterator. gcc/testsuite/ChangeLog: * gcc.target/aarch64/ldapr-zext.c: Add test for u8 to u16 extension.
2023-01-16Update copyright years.Jakub Jelinek1-1/+1
2022-11-18aarch64: Fix LDAPURS assembly outputKyrylo Tkachov1-1/+1
... And another follow-up once I realised that the sign-extending load, of course, needs to have strictly an X-reg as a destination for DImode extensions and a W-reg for SImode ones. Tested on aarch64-none-linux. gcc/ChangeLog: * config/aarch64/atomics.md (*aarch64_atomic_load<ALLX:mode>_rcpc_sext): Use <GPI:w> for destination format. * config/aarch64/iterators.md (w_sz): Delete. gcc/testsuite/ChangeLog: * gcc.target/aarch64/ldapr-sext.c: Adjust expected output.
2022-11-18aarch64: Fix up LDAPR codegenKyrylo Tkachov1-3/+3
Upon some further inspection I realised I had misunderstood some intricacies of the extending loads of the RCPC feature. This patch fixes up the recent GCC support accordingly. In particular: * The sign-extending forms are a form of LDAPURS* and are actually part of FEAT_RCPC2 that is enabled with Armv8.4-a rather than the base Armv8.3-a FEAT_RCPC. The patch introduces a TARGET_RCPC2 macro and gates this combine pattern accordingly. * The assembly output for the zero-extending LDAPR instruction should always use %w formatting for its destination register. The testcase is split into zero-extending and sign-extending parts since they require different architecture pragmas. It's also straightforward to add the rest of the FEAT_RCPC2 codegen (with immediate offset addressing modes) but that can be done as a separate patch. Apologies for not catching this sooner, but it hasn't been in trunk long, so no harm done. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64.h (TARGET_RCPC2): Define. * config/aarch64/atomics.md (*aarch64_atomic_load<ALLX:mode>_rcpc_zext): Adjust output template. (*aarch64_atomic_load<ALLX:mode>_rcpc_sex): Guard on TARGET_RCPC2. Adjust output template. * config/aarch64/iterators.md (w_sz): New mode attr. gcc/testsuite/ChangeLog: * gcc.target/aarch64/ldapr-ext.c: Rename to... * gcc.target/aarch64/ldapr-zext.c: ... This. Fix expected assembly. * gcc.target/aarch64/ldapr-sext.c: New test.
2022-11-17aarch64: Add mode size check on LDAPR-extend patternsKyrylo Tkachov1-2/+2
Add an extra safety check as suggested by Richard. Tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/atomics.md (*aarch64_atomic_load<ALLX:mode>_rcpc_zext): Add mode size check to condition. (*aarch64_atomic_load<ALLX:mode>_rcpc_sext): Likewise.
2022-11-15aarch64: Add support for widening LDAPR instructionsAndre Vieira1-0/+22
gcc/ChangeLog: * config/aarch64/atomics.md (*aarch64_atomic_load<ALLX:mode>_rcpc_zext): New pattern. (*aarch64_atomic_load<ALLX:mode>_rcpc_sext): New pattern. gcc/testsuite/ChangeLog: * gcc.target/aarch64/ldapr-ext.c: New test.
2022-11-15aarch64: Enable the use of LDAPR for load-acquire semanticsAndre Vieira1-1/+32
This patch enables the use of LDAPR for load-acquire semantics. 2022-11-15 Andre Vieira <andre.simoesdiasvieira@arm.com> Kyrylo Tkachov <kyrylo.tkachov@arm.com> gcc/ChangeLog: * config/aarch64/aarch64.h (AARCH64_ISA_RCPC): New Macro. (TARGET_RCPC): New Macro. * config/aarch64/atomics.md (atomic_load<mode>): Change into an expand. (aarch64_atomic_load<mode>_rcpc): New define_insn for ldapr. (aarch64_atomic_load<mode>): Rename of old define_insn for ldar. * config/aarch64/iterators.md (UNSPEC_LDAP): New unspec enum value. * doc/invoke.texi (rcpc): Ammend documentation to mention the effects on code generation. gcc/testsuite/ChangeLog: * gcc.target/aarch64/ldapr.c: New test.
2022-10-06aarch64: Remove redundant zero-extends with LDARKyrylo Tkachov1-0/+17
Like other loads in AArch64, the LDARB,LDARH,LDAR instructions clear out the top part of their destination register and we can thus avoid having to explicitly zero-extend it. We were missing a combine pattern that this patch adds. For one of the examples in the testcase we generated: load_uint8_t_ext_uint16_t: adrp x0, .LANCHOR0 add x0, x0, :lo12:.LANCHOR0 ldarb w0, [x0] and w0, w0, 255 ret but now generate: load_uint8_t_ext_uint16_t: adrp x0, .LANCHOR0 add x0, x0, :lo12:.LANCHOR0 ldarb w0, [x0] ret Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/atomics.md (*atomic_load<ALLX:mode>_zext<SD_HSDI:mode>): New pattern. gcc/testsuite/ChangeLog: * gcc.target/aarch64/ldar_2.c: New test.
2022-01-03Update copyright years.Jakub Jelinek1-1/+1
2021-01-04Update copyright years.Jakub Jelinek1-1/+1
2020-03-31aarch64: Fix up aarch64_compare_and_swaphi pattern [PR94368]Jakub Jelinek1-1/+4
The following testcase ICEs in final_scan_insn_1. The problem is in the @aarch64_compare_and_swaphi define_insn_and_split, since 9 it uses aarch64_plushi_operand predicate for the "expected value" operand, which allows either 0..0xfff constants or 0x1000..0xf000 constants (i.e. HImode values which when zero extended are either 0..0xfff or (0..0xfff) << 12). The problem is that RA doesn't care about predicates, it honors just constraints and the used constraint on the operand is n, which means any HImode CONST_SCALAR_INT. In the testcase LRA thus propagates the -1 value into the insn. This is a define_insn_and_split which requires mandatory split. But during split2 pass, we check the predicate (and don't check constraints), which fails and thus we don't split it and during final ICE because the mandatory splitting didn't happen. The following patch fixes it by adding a matching constraint to the predicate and using it. 2020-03-31 Jakub Jelinek <jakub@redhat.com> PR target/94368 * config/aarch64/constraints.md (Uph): New constraint. * config/aarch64/atomics.md (cas_short_expected_imm): New mode attr. (@aarch64_compare_and_swap<mode>): Use it instead of n in operand 2's constraint. * gcc.dg/pr94368.c: New test.
2020-01-17[AArch64] Fix shrinkwrapping interactions with atomics (PR92692)Wilco Dijkstra1-10/+10
The separate shrinkwrapping pass may insert stores in the middle of atomics loops which can cause issues on some implementations. Avoid this by delaying splitting atomics patterns until after prolog/epilog generation. gcc/ PR target/92692 * config/aarch64/aarch64.c (aarch64_split_compare_and_swap) Add assert to ensure prolog has been emitted. (aarch64_split_atomic_op): Likewise. * config/aarch64/atomics.md (aarch64_compare_and_swap<mode>) Use epilogue_completed rather than reload_completed. (aarch64_atomic_exchange<mode>): Likewise. (aarch64_atomic_<atomic_optab><mode>): Likewise. (atomic_nand<mode>): Likewise. (aarch64_atomic_fetch_<atomic_optab><mode>): Likewise. (atomic_fetch_nand<mode>): Likewise. (aarch64_atomic_<atomic_optab>_fetch<mode>): Likewise. (atomic_nand_fetch<mode>): Likewise.
2020-01-01Update copyright years.Jakub Jelinek1-1/+1
From-SVN: r279813
2019-09-23[AArch64] Fix memmodel index in aarch64_store_exclusive_pairRichard Sandiford1-1/+1
Found via an rtx checking failure. 2019-09-23 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/atomics.md (aarch64_store_exclusive_pair): Fix memmodel index. From-SVN: r276052
2019-09-19aarch64: Implement -moutline-atomicsRichard Henderson1-8/+86
* config/aarch64/aarch64.opt (-moutline-atomics): New. * config/aarch64/aarch64.c (aarch64_atomic_ool_func): New. (aarch64_ool_cas_names, aarch64_ool_swp_names): New. (aarch64_ool_ldadd_names, aarch64_ool_ldset_names): New. (aarch64_ool_ldclr_names, aarch64_ool_ldeor_names): New. (aarch64_expand_compare_and_swap): Honor TARGET_OUTLINE_ATOMICS. * config/aarch64/atomics.md (atomic_exchange<ALLI>): Likewise. (atomic_<atomic_op><ALLI>): Likewise. (atomic_fetch_<atomic_op><ALLI>): Likewise. (atomic_<atomic_op>_fetch<ALLI>): Likewise. * doc/invoke.texi: Document -moutline-atomics. testsuite/ * gcc.target/aarch64/atomic-op-acq_rel.c: Use -mno-outline-atomics. * gcc.target/aarch64/atomic-comp-swap-release-acquire.c: Likewise. * gcc.target/aarch64/atomic-op-acquire.c: Likewise. * gcc.target/aarch64/atomic-op-char.c: Likewise. * gcc.target/aarch64/atomic-op-consume.c: Likewise. * gcc.target/aarch64/atomic-op-imm.c: Likewise. * gcc.target/aarch64/atomic-op-int.c: Likewise. * gcc.target/aarch64/atomic-op-long.c: Likewise. * gcc.target/aarch64/atomic-op-relaxed.c: Likewise. * gcc.target/aarch64/atomic-op-release.c: Likewise. * gcc.target/aarch64/atomic-op-seq_cst.c: Likewise. * gcc.target/aarch64/atomic-op-short.c: Likewise. * gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: Likewise. * gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c: Likewise. * gcc.target/aarch64/sync-comp-swap.c: Likewise. * gcc.target/aarch64/sync-op-acquire.c: Likewise. * gcc.target/aarch64/sync-op-full.c: Likewise. From-SVN: r275968
2019-09-19aarch64: Implement TImode compare-and-swapRichard Henderson1-5/+88
This pattern will only be used with the __sync functions, because we do not yet have a bare TImode atomic load. * config/aarch64/aarch64.c (aarch64_gen_compare_reg): Add support for NE comparison of TImode values. (aarch64_emit_load_exclusive): Add support for TImode. (aarch64_emit_store_exclusive): Likewise. (aarch64_split_compare_and_swap): Disable strong_zero_p for TImode. * config/aarch64/atomics.md (@atomic_compare_and_swap<ALLI_TI>): Change iterator from ALLI to ALLI_TI. (@atomic_compare_and_swap<JUST_TI>): New. (@atomic_compare_and_swap<JUST_TI>_lse): New. (aarch64_load_exclusive_pair): New. (aarch64_store_exclusive_pair): New. * config/aarch64/iterators.md (JUST_TI): New. From-SVN: r275965
2019-07-03[AArch64] Remove constraint strings from define_expand constructsDennis Zhang1-18/+18
A number of AArch64 define_expand patterns have specified constraints for their operands. But the constraint strings are ignored at expand time and are therefore redundant/useless. We now avoid specifying constraints in new define_expands, but we should clean up the existing define_expand definitions. For example, the constraint "=w" is removed in the following case: (define_expand "sqrt<mode>2" [(set (match_operand:GPF_F16 0 "register_operand" "=w") The "" marks with an empty constraint in define_expand are removed as well. 2019-07-03 Dennis Zhang <dennis.zhang@arm.com> gcc/ * config/aarch64/aarch64.md: Remove redundant constraints from define_expand but keep some patterns untouched if they are specially selected by TARGET_SECONDARY_RELOAD hook. * config/aarch64/aarch64-sve.md: Likewise. * config/aarch64/atomics.md: Remove redundant constraints from define_expand. * config/aarch64/aarch64-simd.md: Likewise. From-SVN: r273021
2019-01-01Update copyright years.Jakub Jelinek1-1/+1
From-SVN: r267494
2018-11-21re PR target/87839 (ICE in final_scan_insn_1, at final.c:3070)Jakub Jelinek1-1/+1
PR target/87839 * config/aarch64/atomics.md (@aarch64_compare_and_swap<mode>): Use rIJ constraint for aarch64_plus_operand rather than rn. * gcc.target/aarch64/pr87839.c: New test. From-SVN: r266346
2018-10-31aarch64: Remove early clobber from ATOMIC_LDOP scratchRichard Henderson1-1/+13
* config/aarch64/atomics.md (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse): The scratch register need not be early-clobber. Document the reason why we cannot use ST<OP>. From-SVN: r265703
2018-10-31aarch64: Improve atomic-op lse generationRichard Henderson1-93/+104
Fix constraints; avoid unnecessary split. Drop the use of the atomic_op iterator in favor of the ATOMIC_LDOP iterator; this is simplier and more logical for ldclr aka bic. * config/aarch64/aarch64.c (aarch64_emit_bic): Remove. (aarch64_atomic_ldop_supported_p): Remove. (aarch64_gen_atomic_ldop): Remove. * config/aarch64/atomic.md (atomic_<atomic_optab><ALLI>): Fully expand LSE operations here. (atomic_fetch_<atomic_optab><ALLI>): Likewise. (atomic_<atomic_optab>_fetch<ALLI>): Likewise. (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse): Drop atomic_op iterator and use ATOMIC_LDOP instead; use register_operand for the input; drop the split and emit insns directly. (aarch64_atomic_fetch_<ATOMIC_LDOP><ALLI>_lse): Likewise. (aarch64_atomic_<atomic_op>_fetch<ALLI>_lse): Remove. (@aarch64_atomic_load<ATOMIC_LDOP><ALLI>): Remove. From-SVN: r265660
2018-10-31aarch64: Improve swp generationRichard Henderson1-34/+15
Allow zero as an input; fix constraints; avoid unnecessary split. * config/aarch64/aarch64.c (aarch64_emit_atomic_swap): Remove. (aarch64_gen_atomic_ldop): Don't call it. * config/aarch64/atomics.md (atomic_exchange<ALLI>): Use aarch64_reg_or_zero. (aarch64_atomic_exchange<ALLI>): Likewise. (aarch64_atomic_exchange<ALLI>_lse): Remove split; remove & from operand 0; use aarch64_reg_or_zero for input; merge ... (@aarch64_atomic_swp<ALLI>): ... this and remove. From-SVN: r265659
2018-10-31aarch64: Improve cas generationRichard Henderson1-8/+11
Do not zero-extend the input to the cas for subword operations; instead, use the appropriate zero-extending compare insns. Correct the predicates and constraints for immediate expected operand. * config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): New. (aarch64_split_compare_and_swap): Use it. (aarch64_expand_compare_and_swap): Likewise. Remove convert_modes; test oldval against the proper predicate. * config/aarch64/atomics.md (@atomic_compare_and_swap<ALLI>): Use nonmemory_operand for expected. (cas_short_expected_pred): New. (@aarch64_compare_and_swap<SHORT>): Use it; use "rn" not "rI" to match. (@aarch64_compare_and_swap<GPI>): Use "rn" not "rI" for expected. * config/aarch64/predicates.md (aarch64_plushi_immediate): New. (aarch64_plushi_operand): New. From-SVN: r265657
2018-10-31aarch64: Simplify LSE cas generationRichard Henderson1-88/+33
The cas insn is a single insn, and if expanded properly need not be split after reload. Use the proper inputs for the insn. * config/aarch64/aarch64.c (aarch64_expand_compare_and_swap): Force oldval into the rval register for TARGET_LSE; emit the compare during initial expansion so that it may be deleted if unused. (aarch64_gen_atomic_cas): Remove. * config/aarch64/atomics.md (@aarch64_compare_and_swap<SHORT>_lse): Change =&r to +r for operand 0; use match_dup for operand 2; remove is_weak and mod_f operands as unused. Drop the split and merge with... (@aarch64_atomic_cas<SHORT>): ... this pattern's output; remove. (@aarch64_compare_and_swap<GPI>_lse): Similarly. (@aarch64_atomic_cas<GPI>): Similarly. From-SVN: r265656
2018-09-19[AARCH64] Use STLUR for atomic_storeMatthew Malcomson1-3/+6
Use the STLUR instruction introduced in Armv8.4-a. This instruction has the store-release semantic like STLR but can take a 9-bit unscaled signed immediate offset. Example test case: ``` void foo () { int32_t *atomic_vals = calloc (4, sizeof (int32_t)); atomic_store_explicit (atomic_vals + 1, 2, memory_order_release); } ``` Before patch generates ``` foo: stp x29, x30, [sp, -16]! mov x1, 4 mov x0, x1 mov x29, sp bl calloc mov w1, 2 add x0, x0, 4 stlr w1, [x0] ldp x29, x30, [sp], 16 ret ``` After patch generates ``` foo: stp x29, x30, [sp, -16]! mov x1, 4 mov x0, x1 mov x29, sp bl calloc mov w1, 2 stlur w1, [x0, 4] ldp x29, x30, [sp], 16 ret ``` We introduce a new feature flag to indicate the presence of this instruction. The feature flag is called AARCH64_ISA_RCPC8_4 and is included when targeting armv8.4 architecture. We also introduce an "arch" attribute to be checked called "rcpc8_4" after this feature flag. gcc/ 2018-09-19 Matthew Malcomson <matthew.malcomson@arm.com> * config/aarch64/aarch64-protos.h (aarch64_offset_9bit_signed_unscaled_p): New declaration. * config/aarch64/aarch64.md (arches): New "rcpc8_4" attribute value. (arch_enabled): Add check for "rcpc8_4" attribute value of "arch". * config/aarch64/aarch64.h (AARCH64_FL_RCPC8_4): New bitfield. (AARCH64_FL_FOR_ARCH8_4): Include AARCH64_FL_RCPC8_4. (AARCH64_FL_PROFILE): Move index so flags are ordered. (AARCH64_ISA_RCPC8_4): New flag. * config/aarch64/aarch64.c (offset_9bit_signed_unscaled_p): Renamed to aarch64_offset_9bit_signed_unscaled_p. * config/aarch64/atomics.md (atomic_store<mode>): Allow offset and use stlur. * config/aarch64/constraints.md (Ust): New constraint. * config/aarch64/predicates.md. (aarch64_9bit_offset_memory_operand): New predicate. (aarch64_rcpc_memory_operand): New predicate. gcc/testsuite/ 2018-09-19 Matthew Malcomson <matthew.malcomson@arm.com> * gcc.target/aarch64/atomic-store.c: New. From-SVN: r264421
2018-08-02[gen/AArch64] Generate helpers for substituting iterator values into pattern ↵Richard Sandiford1-12/+12
names Given a pattern like: (define_insn "aarch64_frecpe<mode>" ...) the SVE ACLE implementation wants to generate the pattern for a particular (non-constant) mode. This patch automatically generates helpers to do that, specifically: // Return CODE_FOR_nothing on failure. insn_code maybe_code_for_aarch64_frecpe (machine_mode); // Assert that the code exists. insn_code code_for_aarch64_frecpe (machine_mode); // Return NULL_RTX on failure. rtx maybe_gen_aarch64_frecpe (machine_mode, rtx, rtx); // Assert that generation succeeds. rtx gen_aarch64_frecpe (machine_mode, rtx, rtx); Many patterns don't have sensible names when all <...>s are removed. E.g. "<optab><mode>2" would give a base name "2". The new functions therefore require explicit opt-in, which should also help to reduce code bloat. The (arbitrary) opt-in syntax I went for was to prefix the pattern name with '@', similarly to the existing '*' marker. The patch also makes config/aarch64 use the new routines in cases where they obviously apply. This was mostly straight-forward, but it seemed odd that we defined: aarch64_reload_movcp<...><P:mode> but then only used it with DImode, never SImode. If we should be using Pmode instead of DImode, then that's a simple change, but should probably be a separate patch. 2018-08-02 Richard Sandiford <richard.sandiford@arm.com> gcc/ * doc/md.texi: Expand the documentation of instruction names to mention port-local uses. Document '@' in pattern names. * read-md.h (overloaded_instance, overloaded_name): New structs. (mapping): Declare. (md_reader::handle_overloaded_name): New member function. (md_reader::get_overloads): Likewise. (md_reader::m_first_overload): New member variable. (md_reader::m_next_overload_ptr): Likewise. (md_reader::m_overloads_htab): Likewise. * read-md.c (md_reader::md_reader): Initialize m_first_overload, m_next_overload_ptr and m_overloads_htab. * read-rtl.c (iterator_group): Add "type" and "get_c_token" fields. (get_mode_token, get_code_token, get_int_token): New functions. (map_attr_string): Add an optional argument that passes back the associated iterator. (overloaded_name_hash, overloaded_name_eq_p, named_rtx_p): (md_reader::handle_overloaded_name, add_overload_instance): New functions. (apply_iterators): Handle '@' names. Report an error if '@' is used without iterators. (initialize_iterators): Initialize the new iterator_group fields. * genopinit.c (handle_overloaded_code_for) (handle_overloaded_gen): New functions. (main): Use them to print declarations of maybe_code_for_* and maybe_gen_* functions, and inline definitions of code_for_* and gen_*. * genemit.c (print_overload_arguments, print_overload_test) (handle_overloaded_code_for, handle_overloaded_gen): New functions. (main): Use it to print definitions of maybe_code_for_* and maybe_gen_* functions. * config/aarch64/aarch64.c (aarch64_split_128bit_move): Use gen_aarch64_mov{low,high}_di and gen_aarch64_movdi_{low,high} instead of explicit mode checks. (aarch64_split_simd_combine): Likewise gen_aarch64_simd_combine. (aarch64_split_simd_move): Likewise gen_aarch64_split_simd_mov. (aarch64_emit_load_exclusive): Likewise gen_aarch64_load_exclusive. (aarch64_emit_store_exclusive): Likewise gen_aarch64_store_exclusive. (aarch64_expand_compare_and_swap): Likewise gen_aarch64_compare_and_swap and gen_aarch64_compare_and_swap_lse (aarch64_gen_atomic_cas): Likewise gen_aarch64_atomic_cas. (aarch64_emit_atomic_swap): Likewise gen_aarch64_atomic_swp. (aarch64_constant_pool_reload_icode): Delete. (aarch64_secondary_reload): Use code_for_aarch64_reload_movcp instead of aarch64_constant_pool_reload_icode. Use code_for_aarch64_reload_mov instead of explicit mode checks. (rsqrte_type, get_rsqrte_type, rsqrts_type, get_rsqrts_type): Delete. (aarch64_emit_approx_sqrt): Use gen_aarch64_rsqrte instead of get_rsqrte_type and gen_aarch64_rsqrts instead of gen_rqrts_type. (recpe_type, get_recpe_type, recps_type, get_recps_type): Delete. (aarch64_emit_approx_div): Use gen_aarch64_frecpe instead of get_recpe_type and gen_aarch64_frecps instead of get_recps_type. (aarch64_atomic_load_op_code): Delete. (aarch64_emit_atomic_load_op): Likewise. (aarch64_gen_atomic_ldop): Use UNSPECV_ATOMIC_* instead of aarch64_atomic_load_op_code. Use gen_aarch64_atomic_load instead of aarch64_emit_atomic_load_op. * config/aarch64/aarch64.md (aarch64_reload_movcp<GPF_TF:mode><P:mode>) (aarch64_reload_movcp<VALL:mode><P:mode>, aarch64_reload_mov<mode>) (aarch64_movdi_<mode>low, aarch64_movdi_<mode>high) (aarch64_mov<mode>high_di, aarch64_mov<mode>low_di): Add a '@' character before the pattern name. * config/aarch64/aarch64-simd.md (aarch64_split_simd_mov<mode>) (aarch64_rsqrte<mode>, aarch64_rsqrts<mode>) (aarch64_simd_combine<mode>, aarch64_frecpe<mode>) (aarch64_frecps<mode>): Likewise. * config/aarch64/atomics.md (atomic_compare_and_swap<mode>) (aarch64_compare_and_swap<mode>, aarch64_compare_and_swap<mode>_lse) (aarch64_load_exclusive<mode>, aarch64_store_exclusive<mode>) (aarch64_atomic_swp<mode>, aarch64_atomic_cas<mode>) (aarch64_atomic_load<atomic_ldop><mode>): Likewise. From-SVN: r263251
2018-07-16[Patch AArch64] Add early clobber for aarch64_store_exclusive.Ramana Radhakrishnan1-1/+1
From-SVN: r262686
2018-01-03Update copyright years.Jakub Jelinek1-1/+1
From-SVN: r256169
2017-06-21[AArch64] Fix atomic_cmp_exchange_zero_reg_1.c with +lseKyrylo Tkachov1-4/+4
* config/aarch64/atomics.md (aarch64_compare_and_swap<mode>_lse, SHORT): Relax operand 3 to aarch64_reg_or_zero and constraint to Z. (aarch64_compare_and_swap<mode>_lse, GPI): Likewise. (aarch64_atomic_cas<mode>, SHORT): Likewise for operand 2. (aarch64_atomic_cas<mode>, GPI): Likewise. From-SVN: r249457
2017-06-06[AArch64] Allow const0_rtx operand for atomic compare-exchange patternsKyrylo Tkachov1-4/+4
* config/aarch64/atomics.md (atomic_compare_and_swap<mode> expander): Use aarch64_reg_or_zero predicate for operand 4. (aarch64_compare_and_swap<mode> define_insn_and_split): Use aarch64_reg_or_zero predicate for operand 3. Add 'Z' constraint. (aarch64_store_exclusive<mode>): Likewise for operand 2. * gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: New test. From-SVN: r248921
2017-01-01Update copyright years.Jakub Jelinek1-1/+1
From-SVN: r243994
2016-07-04[AArch64] Renaming ARMv8.1 to ARMv8.1-A in comments and documentationsJiong Wang1-1/+1
* config/aarch64/aarch64.h: Rename "ARMv8.1" to "ARMv8.1-A". * config/aarch64/aarch64_neon.h: Likewise. * config/aarch64/arm_neon.h: Likewise. * config/aarch64/atomics.md: Likewise. * config/aarch64/aarch64-simd-builtins.def: Likewise. * doc/invoke.texi: Likewise. From-SVN: r237988
2016-01-04Update copyright years.Jakub Jelinek1-1/+1
From-SVN: r232055
2015-12-202015-12-20 Andrew Pinsi <apinski@cavium.com>Andrew Pinski1-1/+1
* config/aarch64/atomics.md (aarch64_atomic_<atomic_optab>_fetch<mode>_lse): Add early clobber to the scratch register. From-SVN: r231864
2015-12-04atomics.md (atomic_store<mode>): Use predicate aarch64_sync_memory_operand.Bin Cheng1-1/+1
* config/aarch64/atomics.md (atomic_store<mode>): Use predicate aarch64_sync_memory_operand. From-SVN: r231251
2015-11-10[AArch64] Move iterators from atomics.md to iterators.mdMatthew Wahab1-28/+0
* config/aarch64/atomics.md (unspecv): Move to iterators.md. (ATOMIC_LDOP): Likewise. (atomic_ldop): Likewise. * config/aarch64/iterators.md (unspecv): Moved from atomics.md. (ATOMIC_LDOP): Likewise. (atomic_ldop): Likewise. From-SVN: r230114
2015-09-22[AArch64] Use atomic load-operate instructions for update-fetch patterns.Matthew Wahab1-4/+51
2015-09-22 Matthew Wahab <matthew.wahab@arm.com> * config/aarch64/aarch64-protos.h (aarch64_gen_atomic_ldop): Adjust declaration. * config/aarch64/aarch64.c (aarch64_emit_bic): New. (aarch64_gen_atomic_ldop): Adjust comment. Add parameter out_result. Update to support update-fetch operations. * config/aarch64/atomics.md (aarch64_atomic_exchange<mode>_lse): Adjust for change to aarch64_gen_atomic_ldop. (aarch64_atomic_<atomic_optab><mode>_lse): Likewise. (aarch64_atomic_fetch_<atomic_optab><mode>_lse): Likewise. (atomic_<atomic_optab>_fetch<mode>): Change to an expander. (aarch64_atomic_<atomic_optab>_fetch<mode>): New. (aarch64_atomic_<atomic_optab>_fetch<mode>_lse): New. gcc/testsuite 2015-09-22 Matthew Wahab <matthew.wahab@arm.com> * gcc.target/aarch64/atomic-inst-ldadd.c: Add tests for update-fetch operations. * gcc.target/aarch64/atomic-inst-ldlogic.c: Likewise. From-SVN: r228002
2015-09-22[AArch64] Use atomic load-operate instructions for fetch-update patterns.Matthew Wahab1-9/+92
gcc/ 2015-09-22 Matthew Wahab <matthew.wahab@arm.com> * config/aarch64/aarch64-protos.h (aarch64_atomic_ldop_supported_p): Declare. * config/aarch64/aarch64.c (aarch64_atomic_ldop_supported_p): New. (enum aarch64_atomic_load_op_code): New. (aarch64_emit_atomic_load_op): New. (aarch64_gen_atomic_ldop): Update to support load-operate patterns. * config/aarch64/atomics.md (atomic_<atomic_optab><mode>): Change to an expander. (aarch64_atomic_<atomic_optab><mode>): New. (aarch64_atomic_<atomic_optab><mode>_lse): New. (atomic_fetch_<atomic_optab><mode>): Change to an expander. (aarch64_atomic_fetch_<atomic_optab><mode>): New. (aarch64_atomic_fetch_<atomic_optab><mode>_lse): New. gcc/testsuite/ 2015-09-22 Matthew Wahab <matthew.wahab@arm.com> * gcc.target/aarch64/atomic-inst-ldadd.c: New. * gcc.target/aarch64/atomic-inst-ldlogic.c: New. From-SVN: r228001
2015-09-22[AArch64] Add atomic load-operate instructions.Matthew Wahab1-0/+41
2015-09-22 Matthew Wahab <matthew.wahab@arm.com> * config/aarch64/aarch64/atomics.md (UNSPECV_ATOMIC_LDOP): New. (UNSPECV_ATOMIC_LDOP_OR): New. (UNSPECV_ATOMIC_LDOP_BIC): New. (UNSPECV_ATOMIC_LDOP_XOR): New. (UNSPECV_ATOMIC_LDOP_PLUS): New. (ATOMIC_LDOP): New. (atomic_ldop): New. (aarch64_atomic_load<atomic_ldop><mode>): New. From-SVN: r228000
2015-09-22[AArch64] Use atomic instructions for swap and fetch-update operations.Matthew Wahab1-4/+67
gcc/ 2015-09-22 Matthew Wahab <matthew.wahab@arm.com> * config/aarch64/aarch64-protos.h (aarch64_gen_atomic_ldop): Declare. * config/aarch64/aarch64.c (aarch64_emit_atomic_swap): New. (aarch64_gen_atomic_ldop): New. (aarch64_split_atomic_op): Fix whitespace and add a comment. * config/aarch64/atomics.md (UNSPECV_ATOMIC_SWP): New. (aarch64_compare_and_swap<mode>_lse): Fix some whitespace. (atomic_exchange<mode>): Replace with an expander. (aarch64_atomic_exchange<mode>): New. (aarch64_atomic_exchange<mode>_lse): New. (aarch64_atomic_<atomic_optab><mode>): Fix some whitespace. (aarch64_atomic_swp<mode>): New. gcc/testsuite/ 2015-09-22 Matthew Wahab <matthew.wahab@arm.com> * gcc.target/aarch64/atomic-inst-ops.inc: (TEST_MODEL): New. (TEST_ONE): New. * gcc.target/aarch64/atomic-inst-swap.c: New. From-SVN: r227998
2015-08-14re PR target/67143 (ICE (could not split insn) on aarch64-linux-gnu)Matthew Wahab1-3/+3
gcc/ 2015-08-14 Matthew Wahab <matthew.wahab@arm.com> PR target/67143 * config/aarch64/atomics.md (atomic_<optab><mode>): Replace 'lconst_atomic' with 'const_atomic'. (atomic_fetch_<optab><mode>): Likewise. (atomic_<optab>_fetch<mode>): Likewise. * config/aarch64/iterators.md (lconst-atomic): Move below 'const_atomic'. (const_atomic): New. gcc/testsuite/ 2015-08-14 Matthew Wahab <matthew.wahab@arm.com> Matthias Klose <doko@debian.org> PR target/67143 * gcc.c-torture/compile/pr67143.c: New * gcc.target/aarch64/atomic-op-imm.c (atomic_fetch_add_negative_RELAXED): New. (atomic_fetch_sub_negative_ACQUIRE): New. Co-Authored-By: Matthias Klose <doko@debian.org> From-SVN: r226895
2015-08-13* config/aarch64/aarch64-protos.hMatthew Wahab1-7/+110
(aarch64_gen_atomic_cas): Declare. * config/aarch64/aarch64.c (aarch64_expand_compare_and_swap): Choose appropriate instruction pattern for the target. (aarch64_gen_atomic_cas): New. * config/aarch64/atomics.md (UNSPECV_ATOMIC_CAS): New. (atomic_compare_and_swap<mode>_1): Rename to aarch64_compare_and_swap<mode>. Fix some indentation. (aarch64_compare_and_swap<mode>_lse): New. (aarch64_atomic_cas<mode>): New. From-SVN: r226858
2015-05-12re PR target/65697 (__atomic memory barriers not strong enough for __sync ↵Andrew MacLeod1-24/+14
builtins) 2015-05-12 Andrew MacLeod <amacleod@redhat.com> PR target/65697 * coretypes.h (MEMMODEL_SYNC, MEMMODEL_BASE_MASK): New macros. (enum memmodel): Add SYNC_{ACQUIRE,RELEASE,SEQ_CST}. * tree.h (memmodel_from_int, memmodel_base, is_mm_relaxed, is_mm_consume,is_mm_acquire, is_mm_release, is_mm_acq_rel, is_mm_seq_cst, is_mm_sync): New accessor functions. * builtins.c (expand_builtin_sync_operation, expand_builtin_compare_and_swap): Use MEMMODEL_SYNC_SEQ_CST. (expand_builtin_sync_lock_release): Use MEMMODEL_SYNC_RELEASE. (get_memmodel, expand_builtin_atomic_compare_exchange, expand_builtin_atomic_load, expand_builtin_atomic_store, expand_builtin_atomic_clear): Use new accessor routines. (expand_builtin_sync_synchronize): Use MEMMODEL_SYNC_SEQ_CST. * optabs.c (expand_compare_and_swap_loop): Use MEMMODEL_SYNC_SEQ_CST. (maybe_emit_sync_lock_test_and_set): Use new accessors and MEMMODEL_SYNC_ACQUIRE. (expand_sync_lock_test_and_set): Use MEMMODEL_SYNC_ACQUIRE. (expand_mem_thread_fence, expand_mem_signal_fence, expand_atomic_load, expand_atomic_store): Use new accessors. * emit-rtl.c (need_atomic_barrier_p): Add additional enum cases. * tsan.c (instrument_builtin_call): Update check for memory model beyond final enum to use MEMMODEL_LAST. * c-family/c-common.c: Use new accessor for memmodel_base. * config/aarch64/aarch64.c (aarch64_expand_compare_and_swap): Use new accessors. * config/aarch64/atomics.md (atomic_load<mode>,atomic_store<mode>, arch64_load_exclusive<mode>, aarch64_store_exclusive<mode>, mem_thread_fence, *dmb): Likewise. * config/alpha/alpha.c (alpha_split_compare_and_swap, alpha_split_compare_and_swap_12): Likewise. * config/arm/arm.c (arm_expand_compare_and_swap, arm_split_compare_and_swap, arm_split_atomic_op): Likewise. * config/arm/sync.md (atomic_load<mode>, atomic_store<mode>, atomic_loaddi): Likewise. * config/i386/i386.c (ix86_destroy_cost_data, ix86_memmodel_check): Likewise. * config/i386/sync.md (mem_thread_fence, atomic_store<mode>): Likewise. * config/ia64/ia64.c (ia64_expand_atomic_op): Add new memmodel cases and use new accessors. * config/ia64/sync.md (mem_thread_fence, atomic_load<mode>, atomic_store<mode>, atomic_compare_and_swap<mode>, atomic_exchange<mode>): Use new accessors. * config/mips/mips.c (mips_process_sync_loop): Likewise. * config/pa/pa.md (atomic_loaddi, atomic_storedi): Likewise. * config/rs6000/rs6000.c (rs6000_pre_atomic_barrier, rs6000_post_atomic_barrier): Add new cases. (rs6000_expand_atomic_compare_and_swap): Use new accessors. * config/rs6000/sync.md (mem_thread_fence): Add new cases. (atomic_load<mode>): Add new cases and use new accessors. (store_quadpti): Add new cases. * config/s390/s390.md (mem_thread_fence, atomic_store<mode>): Use new accessors. * config/sparc/sparc.c (sparc_emit_membar_for_model): Use new accessors. * doc/extend.texi: Update docs to indicate 16 bits are used for memory model, not 8. From-SVN: r223096
2015-01-05Update copyright years.Jakub Jelinek1-1/+1
From-SVN: r219188
2014-11-04[AArch64] Fix predicate and constraint mismatch in logical atomic operationsMichael Collison1-6/+6
2014-11-04 Michael Collison <michael.collison@linaro.org> * config/aarch64/iterators.md (lconst_atomic): New mode attribute to support constraints for CONST_INT in atomic operations. * config/aarch64/atomics.md (atomic_<atomic_optab><mode>): Use lconst_atomic constraint. (atomic_nand<mode>): Likewise. (atomic_fetch_<atomic_optab><mode>): Likewise. (atomic_fetch_nand<mode>): Likewise. (atomic_<atomic_optab>_fetch<mode>): Likewise. (atomic_nand_fetch<mode>): Likewise. From-SVN: r217076
2014-01-02Update copyright years in gcc/Richard Sandiford1-1/+1
From-SVN: r206289
2013-01-10Update copyright years in gcc/Richard Sandiford1-1/+1
From-SVN: r195098