aboutsummaryrefslogtreecommitdiff
path: root/gcc/gimple-loop-interchange.cc
diff options
context:
space:
mode:
authorRichard Sandiford <richard.sandiford@arm.com>2023-12-05 10:11:26 +0000
committerRichard Sandiford <richard.sandiford@arm.com>2023-12-05 10:11:26 +0000
commit3af9ceb631b741095d8eabd055ff7c23d4a69e6f (patch)
treebc96b93298df7a2bef8dac6116ad856aaca7bbd5 /gcc/gimple-loop-interchange.cc
parentdd8090f40079fa41ee58d9f76b2e50ed4f95c6bf (diff)
downloadgcc-3af9ceb631b741095d8eabd055ff7c23d4a69e6f.zip
gcc-3af9ceb631b741095d8eabd055ff7c23d4a69e6f.tar.gz
gcc-3af9ceb631b741095d8eabd055ff7c23d4a69e6f.tar.bz2
aarch64: Add support for SME ZA attributes
SME has an array called ZA that can be enabled and disabled separately from streaming mode. A status bit called PSTATE.ZA indicates whether ZA is currently enabled or not. In C and C++, the state of PSTATE.ZA is controlled using function attributes. There are four attributes that can be attached to function types to indicate that the function shares ZA with its caller. These are: - arm::in("za") - arm::out("za") - arm::inout("za") - arm::preserves("za") If a function's type has one of these shared-ZA attributes, PSTATE.ZA is specified to be 1 on entry to the function and on return from the function. Otherwise, the caller and callee have separate ZA contexts; they do not use ZA to share data. Although normal non-shared-ZA functions have a separate ZA context from their callers, nested uses of ZA are expected to be rare. The ABI therefore defines a cooperative lazy saving scheme that allows saves and restore of ZA to be kept to a minimum. (Callers still have the option of doing a full save and restore if they prefer.) Functions that want to use ZA internally have an arm::new("za") attribute, which tells the compiler to enable PSTATE.ZA for the duration of the function body. It also tells the compiler to commit any lazy save initiated by a caller. The patch uses various abstract hard registers to track dataflow relating to ZA. See the comments in the patch for details. The lazy save scheme is intended to be transparent to most normal functions, so that they don't need to be recompiled for SME. This is reflected in the way that most normal functions ignore the new hard registers added in the patch. As with arm::streaming and arm::streaming_compatible, the attributes are also available as __arm_<attr>. This has two advantages: it triggers an error on compilers that don't understand the attributes, and it eases use on C, where [[...]] attributes were only added in C23. gcc/ * config/aarch64/aarch64-isa-modes.def (ZA_ON): New ISA mode. * config/aarch64/aarch64-protos.h (aarch64_rdsvl_immediate_p) (aarch64_output_rdsvl, aarch64_optimize_mode_switching) (aarch64_restore_za): Declare. * config/aarch64/constraints.md (UsR): New constraint. * config/aarch64/aarch64.md (LOWERING_REGNUM, TPIDR_BLOCK_REGNUM) (SME_STATE_REGNUM, TPIDR2_SETUP_REGNUM, ZA_FREE_REGNUM) (ZA_SAVED_REGNUM, ZA_REGNUM, FIRST_FAKE_REGNUM): New constants. (LAST_FAKE_REGNUM): Likewise. (UNSPEC_SAVE_NZCV, UNSPEC_RESTORE_NZCV, UNSPEC_SME_VQ): New unspecs. (arches): Add sme. (arch_enabled): Handle it. (*cb<optab><mode>1): Rename to... (aarch64_cb<optab><mode>1): ...this. (*movsi_aarch64): Add an alternative for RDSVL. (*movdi_aarch64): Likewise. (aarch64_save_nzcv, aarch64_restore_nzcv): New insns. * config/aarch64/aarch64-sme.md (UNSPEC_SMSTOP_ZA) (UNSPEC_INITIAL_ZERO_ZA, UNSPEC_TPIDR2_SAVE, UNSPEC_TPIDR2_RESTORE) (UNSPEC_READ_TPIDR2, UNSPEC_WRITE_TPIDR2, UNSPEC_SETUP_LOCAL_TPIDR2) (UNSPEC_RESTORE_ZA, UNSPEC_START_PRIVATE_ZA_CALL): New unspecs. (UNSPEC_END_PRIVATE_ZA_CALL, UNSPEC_COMMIT_LAZY_SAVE): Likewise. (UNSPECV_ASM_UPDATE_ZA): New unspecv. (aarch64_tpidr2_save, aarch64_smstart_za, aarch64_smstop_za) (aarch64_initial_zero_za, aarch64_setup_local_tpidr2) (aarch64_clear_tpidr2, aarch64_write_tpidr2, aarch64_read_tpidr2) (aarch64_tpidr2_restore, aarch64_restore_za, aarch64_asm_update_za) (aarch64_start_private_za_call, aarch64_end_private_za_call) (aarch64_commit_lazy_save): New patterns. * config/aarch64/aarch64.h (AARCH64_ISA_ZA_ON, TARGET_ZA): New macros. (FIXED_REGISTERS, REGISTER_NAMES): Add the new fake ZA registers. (CALL_USED_REGISTERS): Replace with... (CALL_REALLY_USED_REGISTERS): ...this and add the fake ZA registers. (FIRST_PSEUDO_REGISTER): Bump to include the fake ZA registers. (FAKE_REGS): New register class. (REG_CLASS_NAMES): Update accordingly. (REG_CLASS_CONTENTS): Likewise. (machine_function::tpidr2_block): New member variable. (machine_function::tpidr2_block_ptr): Likewise. (machine_function::za_save_buffer): Likewise. (machine_function::next_asm_update_za_id): Likewise. (CUMULATIVE_ARGS::shared_za_flags): Likewise. (aarch64_mode_entity, aarch64_local_sme_state): New enums. (aarch64_tristate_mode): Likewise. (OPTIMIZE_MODE_SWITCHING, NUM_MODES_FOR_MODE_SWITCHING): Define. * config/aarch64/aarch64.cc (AARCH64_STATE_SHARED, AARCH64_STATE_IN) (AARCH64_STATE_OUT): New constants. (aarch64_attribute_shared_state_flags): New function. (aarch64_lookup_shared_state_flags, aarch64_fndecl_has_new_state) (aarch64_check_state_string, cmp_string_csts): Likewise. (aarch64_merge_string_arguments, aarch64_check_arm_new_against_type) (handle_arm_new, handle_arm_shared): Likewise. (handle_arm_new_za_attribute): New (aarch64_arm_attribute_table): Add new, preserves, in, out, and inout. (aarch64_hard_regno_nregs): Handle FAKE_REGS. (aarch64_hard_regno_mode_ok): Likewise. (aarch64_fntype_shared_flags, aarch64_fntype_pstate_za): New functions. (aarch64_fntype_isa_mode): Include aarch64_fntype_pstate_za. (aarch64_fndecl_has_state, aarch64_fndecl_pstate_za): New functions. (aarch64_fndecl_isa_mode): Include aarch64_fndecl_pstate_za. (aarch64_cfun_incoming_pstate_za, aarch64_cfun_shared_flags) (aarch64_cfun_has_new_state, aarch64_cfun_has_state): New functions. (aarch64_sme_vq_immediate, aarch64_sme_vq_unspec_p): Likewise. (aarch64_rdsvl_immediate_p, aarch64_output_rdsvl): Likewise. (aarch64_expand_mov_immediate): Handle RDSVL immediates. (aarch64_function_arg): Add the ZA sharing flags as a third limb of the PARALLEL. (aarch64_init_cumulative_args): Record the ZA sharing flags. (aarch64_extra_live_on_entry): New function. Handle the new ZA-related fake registers. (aarch64_epilogue_uses): Handle the new ZA-related fake registers. (aarch64_cannot_force_const_mem): Handle UNSPEC_SME_VQ constants. (aarch64_get_tpidr2_block, aarch64_get_tpidr2_ptr): New functions. (aarch64_init_tpidr2_block, aarch64_restore_za): Likewise. (aarch64_layout_frame): Check whether the current function creates new ZA state. Record that it clobbers LR if so. (aarch64_expand_prologue): Handle functions that create new ZA state. (aarch64_expand_epilogue): Likewise. (aarch64_create_tpidr2_block): New function. (aarch64_restore_za): Likewise. (aarch64_start_call_args): Disallow calls to shared-ZA functions from functions that have no ZA state. Emit a marker instruction before calls to private-ZA functions from functions that have SME state. (aarch64_expand_call): Add return registers for state that is managed via attributes. Record the use and clobber information for the ZA registers. (aarch64_end_call_args): New function. (aarch64_regno_regclass): Handle FAKE_REGS. (aarch64_class_max_nregs): Likewise. (aarch64_override_options_internal): Require TARGET_SME for functions that have ZA state. (aarch64_conditional_register_usage): Handle FAKE_REGS. (aarch64_mov_operand_p): Handle RDSVL immediates. (aarch64_comp_type_attributes): Check that the ZA sharing flags are equal. (aarch64_merge_decl_attributes): New function. (aarch64_optimize_mode_switching, aarch64_mode_emit_za_save_buffer) (aarch64_mode_emit_local_sme_state, aarch64_mode_emit): Likewise. (aarch64_insn_references_sme_state_p): Likewise. (aarch64_mode_needed_local_sme_state): Likewise. (aarch64_mode_needed_za_save_buffer, aarch64_mode_needed): Likewise. (aarch64_mode_after_local_sme_state, aarch64_mode_after): Likewise. (aarch64_local_sme_confluence, aarch64_mode_confluence): Likewise. (aarch64_one_shot_backprop, aarch64_local_sme_backprop): Likewise. (aarch64_mode_backprop, aarch64_mode_entry): Likewise. (aarch64_mode_exit, aarch64_mode_eh_handler): Likewise. (aarch64_mode_priority, aarch64_md_asm_adjust): Likewise. (TARGET_END_CALL_ARGS, TARGET_MERGE_DECL_ATTRIBUTES): Define. (TARGET_MODE_EMIT, TARGET_MODE_NEEDED, TARGET_MODE_AFTER): Likewise. (TARGET_MODE_CONFLUENCE, TARGET_MODE_BACKPROP): Likewise. (TARGET_MODE_ENTRY, TARGET_MODE_EXIT): Likewise. (TARGET_MODE_EH_HANDLER, TARGET_MODE_PRIORITY): Likewise. (TARGET_EXTRA_LIVE_ON_ENTRY): Likewise. (TARGET_MD_ASM_ADJUST): Use aarch64_md_asm_adjust. * config/aarch64/aarch64-c.cc (aarch64_define_unconditional_macros): Define __arm_new, __arm_preserves,__arm_in, __arm_out, and __arm_inout. gcc/testsuite/ * gcc.target/aarch64/sme/za_state_1.c: New test. * gcc.target/aarch64/sme/za_state_2.c: Likewise. * gcc.target/aarch64/sme/za_state_3.c: Likewise. * gcc.target/aarch64/sme/za_state_4.c: Likewise. * gcc.target/aarch64/sme/za_state_5.c: Likewise. * gcc.target/aarch64/sme/za_state_6.c: Likewise. * g++.target/aarch64/sme/exceptions_1.C: Likewise. * gcc.target/aarch64/sme/keyword_macros_1.c: Add ZA macros. * g++.target/aarch64/sme/keyword_macros_1.C: Likewise.
Diffstat (limited to 'gcc/gimple-loop-interchange.cc')
0 files changed, 0 insertions, 0 deletions