path: root/gcc/config/aarch64/predicates.md
Age | Commit message | Author | Files | Lines
2021-04-27 | aarch64: Fix UB in the compiler [PR100200] | Jakub Jelinek | 1 | -2/+2
The following patch fixes UBs in the compiler when negating a CONST_INT containing HOST_WIDE_INT_MIN. I've changed the spots where there wasn't an obvious earlier condition check or predicate that would fail for such CONST_INTs.

2021-04-27  Jakub Jelinek  <jakub@redhat.com>

        PR target/100200
        * config/aarch64/predicates.md (aarch64_sub_immediate,
        aarch64_plus_immediate): Use -UINTVAL instead of -INTVAL.
        * config/aarch64/aarch64.md (casesi, rotl<mode>3): Likewise.
        * config/aarch64/aarch64.c (aarch64_print_operand,
        aarch64_split_atomic_op, aarch64_expand_subvti): Likewise.
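As a minimal illustration of the bug class (a hedged sketch in plain C, not GCC's actual code; HOST_WIDE_INT is modelled here by long long):

    #include <limits.h>

    /* Undefined behaviour when x == LLONG_MIN: -LLONG_MIN is not
       representable in long long.  This is the hazard in -INTVAL (x).  */
    long long negate_signed (long long x)
    {
      return -x;
    }

    /* Well defined for every x: unsigned negation wraps modulo 2^64,
       which is the effect of switching to -UINTVAL (x).  */
    unsigned long long negate_unsigned (long long x)
    {
      return -(unsigned long long) x;
    }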
2021-03-08 | aarch64: Fix PR99437 - tighten shift predicates for narrowing shift patterns | Kyrylo Tkachov | 1 | -0/+16
In this bug combine forms the (R)SHRN(2) instructions with an invalid shift amount. The intrinsic expanders for these patterns validate the right shift amount, but if the final patterns end up being matched by combine (or other RTL passes, I suppose) they still let the wrong const_vector through. This patch tightens up the predicates for the instructions involved by using predicates for the right shift amount const_vectors.

gcc/ChangeLog:

        PR target/99437
        * config/aarch64/predicates.md (aarch64_simd_shift_imm_vec_qi): Define.
        (aarch64_simd_shift_imm_vec_hi): Likewise.
        (aarch64_simd_shift_imm_vec_si): Likewise.
        (aarch64_simd_shift_imm_vec_di): Likewise.
        * config/aarch64/aarch64-simd.md (aarch64_shrn<mode>_insn_le): Use predicate from above.
        (aarch64_shrn<mode>_insn_be): Likewise.
        (aarch64_rshrn<mode>_insn_le): Likewise.
        (aarch64_rshrn<mode>_insn_be): Likewise.
        (aarch64_shrn2<mode>_insn_le): Likewise.
        (aarch64_shrn2<mode>_insn_be): Likewise.
        (aarch64_rshrn2<mode>_insn_le): Likewise.
        (aarch64_rshrn2<mode>_insn_be): Likewise.

gcc/testsuite/ChangeLog:

        PR target/99437
        * gcc.target/aarch64/simd/pr99437.c: New test.
2021-01-04 | Update copyright years. | Jakub Jelinek | 1 | -1/+1
2020-12-22 | arm&aarch64: subdivide the type attribute "alu_shift_imm" | Qian Jianhua | 1 | -0/+2
The type attribute "alu_shift_imm" is subdivided into "alu_shift_imm_lsl_1to4" and "alu_shift_imm_other", to accommodate optimizations of some microarchitectures. The detailed discussion is here: https://gcc.gnu.org/pipermail/gcc/2020-September/233594.html

gcc/
        * config/arm/types.md (define_attr "autodetect_type"): New.
        (define_attr "type"): Subdivide alu_shift_imm.
        * config/arm/common.md: New file.
        * config/aarch64/predicates.md: Include common.md.
        * config/arm/predicates.md: Include common.md.
        * config/aarch64/aarch64.md (*add_<shift>_<mode>): Set autodetect_type.
        (*add_<shift>_si_uxtw): Likewise.
        (*sub_<shift>_<mode>): Likewise.
        (*sub_<shift>_si_uxtw): Likewise.
        (*neg_<shift>_<mode>2): Likewise.
        (*neg_<shift>_si2_uxtw): Likewise.
        * config/arm/arm.md (*addsi3_carryin_shift): Likewise.
        (add_not_shift_cin): Likewise.
        (*subsi3_carryin_shift): Likewise.
        (*subsi3_carryin_shift_alt): Likewise.
        (*rsbsi3_carryin_shift): Likewise.
        (*rsbsi3_carryin_shift_alt): Likewise.
        (*arm_shiftsi3): Likewise.
        (*<arith_shift_insn>_multsi): Likewise.
        (*<arith_shift_insn>_shiftsi): Likewise.
        (subsi3_carryin): Set new type.
        (*if_arith_move): Set new type.
        (*if_move_arith): Set new type.
        (define_attr "core_cycles"): Use new type.
        * config/arm/arm-fixed.md (arm_ssatsihi_shift): Set autodetect_type.
        * config/arm/thumb2.md (*orsi_not_shiftsi_si): Likewise.
        (*thumb2_shiftsi3_short): Set new type.
        * config/aarch64/falkor.md (falkor_alu_1_xyz): Use new type.
        * config/aarch64/saphira.md (saphira_alu_1_xyz): Likewise.
        * config/aarch64/thunderx.md (thunderx_arith_shift): Likewise.
        * config/aarch64/thunderx2t99.md (thunderx2t99_alu_shift): Likewise.
        * config/aarch64/thunderx3t110.md (thunderx3t110_alu_shift): Likewise.
        (thunderx3t110_alu_shift1): Likewise.
        * config/aarch64/tsv110.md (tsv110_alu_shift): Likewise.
        * config/arm/arm1020e.md (1020alu_shift_op): Likewise.
        * config/arm/arm1026ejs.md (alu_shift_op): Likewise.
        * config/arm/arm1136jfs.md (11_alu_shift_op): Likewise.
        * config/arm/arm926ejs.md (9_alu_op): Likewise.
        * config/arm/cortex-a15.md (cortex_a15_alu_shift): Likewise.
        * config/arm/cortex-a17.md (cortex_a17_alu_shiftimm): Likewise.
        * config/arm/cortex-a5.md (cortex_a5_alu_shift): Likewise.
        * config/arm/cortex-a53.md (cortex_a53_alu_shift): Likewise.
        * config/arm/cortex-a57.md (cortex_a57_alu_shift): Likewise.
        * config/arm/cortex-a7.md (cortex_a7_alu_shift): Likewise.
        * config/arm/cortex-a8.md (cortex_a8_alu_shift): Likewise.
        * config/arm/cortex-a9.md (cortex_a9_dp_shift): Likewise.
        * config/arm/cortex-m4.md (cortex_m4_alu): Likewise.
        * config/arm/cortex-m7.md (cortex_m7_alu_shift): Likewise.
        * config/arm/cortex-r4.md (cortex_r4_alu_shift): Likewise.
        * config/arm/exynos-m1.md (exynos_m1_alu_shift): Likewise.
        * config/arm/fa526.md (526_alu_shift_op): Likewise.
        * config/arm/fa606te.md (606te_alu_op): Likewise.
        * config/arm/fa626te.md (626te_alu_shift_op): Likewise.
        * config/arm/fa726te.md (726te_alu_shift_op): Likewise.
        * config/arm/fmp626.md (mp626_alu_shift_op): Likewise.
        * config/arm/marvell-pj4.md (pj4_shift): Likewise.
        (pj4_shift_conds): Likewise.
        (pj4_alu_shift): Likewise.
        (pj4_alu_shift_conds): Likewise.
        * config/arm/xgene1.md (xgene1_alu): Likewise.
        * config/arm/arm.c (xscale_sched_adjust_cost): Likewise.
2020-09-07 | aarch64: Remove redundant mult patterns | Alex Coplan | 1 | -15/+0
Following on from the previous commit to fix up the syntax for add/sub/adds/subs and friends with a sign/zero-extended operand, this patch removes the "mult" variants of these patterns, which are all redundant.

This patch removes the following patterns from the AArch64 backend:

        *adds_mul_imm_<mode>
        *subs_mul_imm_<mode>
        *adds_<optab><mode>_multp2
        *subs_<optab><mode>_multp2
        *add_mul_imm_<mode>
        *add_<optab><ALLX:mode>_mult_<GPI:mode>
        *add_<optab><SHORT:mode>_mult_si_uxtw
        *add_<optab><mode>_multp2
        *add_<optab>si_multp2_uxtw
        *add_uxt<mode>_multp2
        *add_uxtsi_multp2_uxtw
        *sub_mul_imm_<mode>
        *sub_mul_imm_si_uxtw
        *sub_<optab><mode>_multp2
        *sub_<optab>si_multp2_uxtw
        *sub_uxt<mode>_multp2
        *sub_uxtsi_multp2_uxtw
        *neg_mul_imm_<mode>2
        *neg_mul_imm_si2_uxtw

together with the following predicates, which were used only by these patterns:

        aarch64_pwr_imm3
        aarch64_pwr_2_si
        aarch64_pwr_2_di

These patterns are all redundant since multiplications by powers of two should be represented as shifts outside a (mem).

gcc/ChangeLog:

        * config/aarch64/aarch64.md (*adds_mul_imm_<mode>): Delete.
        (*subs_mul_imm_<mode>): Delete.
        (*adds_<optab><mode>_multp2): Delete.
        (*subs_<optab><mode>_multp2): Delete.
        (*add_mul_imm_<mode>): Delete.
        (*add_<optab><ALLX:mode>_mult_<GPI:mode>): Delete.
        (*add_<optab><SHORT:mode>_mult_si_uxtw): Delete.
        (*add_<optab><mode>_multp2): Delete.
        (*add_<optab>si_multp2_uxtw): Delete.
        (*add_uxt<mode>_multp2): Delete.
        (*add_uxtsi_multp2_uxtw): Delete.
        (*sub_mul_imm_<mode>): Delete.
        (*sub_mul_imm_si_uxtw): Delete.
        (*sub_<optab><mode>_multp2): Delete.
        (*sub_<optab>si_multp2_uxtw): Delete.
        (*sub_uxt<mode>_multp2): Delete.
        (*sub_uxtsi_multp2_uxtw): Delete.
        (*neg_mul_imm_<mode>2): Delete.
        (*neg_mul_imm_si2_uxtw): Delete.
        * config/aarch64/predicates.md (aarch64_pwr_imm3): Delete.
        (aarch64_pwr_2_si): Delete.
        (aarch64_pwr_2_di): Delete.
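For illustration, a hedged C sketch of the canonicalization argument (the function names are invented for the example; the assembly comments show the expected form):

    /* Outside a memory reference, combine canonicalizes b * 8 as
       (ashift b 3), so the shift-based add patterns match and the
       (mult ...) variants deleted here could never be reached.  */
    long add_scaled (long a, long b)
    {
      return a + b * 8;   /* add x0, x0, x1, lsl 3 */
    }

    /* Inside a memory address, (mult ...) remains the canonical RTL,
       which is why only the non-memory patterns were redundant.  */
    long load_scaled (long *base, long idx)
    {
      return base[idx];   /* ldr x0, [x0, x1, lsl 3] */
    }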
2020-07-09 | aarch64: Mitigate SLS for BLR instruction | Matthew Malcomson | 1 | -1/+2
This patch introduces the mitigation for Straight Line Speculation past the BLR instruction. This mitigation replaces BLR instructions with a BL to a stub which uses a BR to jump to the original value. These function stubs are then appended with a speculation barrier to ensure no straight line speculation happens after these jumps.

When optimising for speed we use a set of stubs for each function since this should help the branch predictor make more accurate predictions about where a stub should branch. When optimising for size we use one set of stubs for all functions. This set of stubs can have human readable names, and we are using `__call_indirect_x<N>` for register x<N>.

When BTI branch protection is enabled the BLR instruction can jump to a `BTI c` instruction using any register, while the BR instruction can only jump to a `BTI c` instruction using the x16 or x17 registers. Hence, in order to ensure this transformation is safe we mov the value of the original register into x16 and use x16 for the BR.

As an example when optimising for size: a BLR x0 instruction would get transformed to something like

    BL __call_indirect_x0

where __call_indirect_x0 labels a thunk that contains

    __call_indirect_x0:
        MOV X16, X0
        BR X16
        <speculation barrier>

The first version of this patch used local symbols specific to a compilation unit to try and avoid relocations. This was mistaken since functions coming from the same compilation unit can still be in different sections, and the assembler will insert relocations at jumps between sections.

On any relocation the linker is permitted to emit a veneer to handle jumps between symbols that are very far apart. The registers x16 and x17 may be clobbered by these veneers. Hence the function stubs cannot rely on the values of x16 and x17 being the same as just before the function stub is called. Similar can be said for the hot/cold partitioning of single functions, so function-local stubs have the same restriction.

This updated version of the patch never emits function stubs for x16 and x17, and instead forces other registers to be used. Given the above, there is now no benefit to local symbols (since they are not enough to avoid dealing with linker intricacies). This patch now uses global symbols with hidden visibility each stored in their own COMDAT section. This means stubs can be shared between compilation units while still avoiding the PLT indirection.

This patch also removes the `__call_indirect_x30` stub (and function-local equivalent) which would simply jump back to the original location.

The function-local stubs are emitted to the assembly output file in one chunk, which means we need not add the speculation barrier directly after each one. This is because we know for certain that the instructions directly after the BR in all but the last function stub will be from another one of these stubs and hence will not contain a speculation gadget. Instead we add a speculation barrier at the end of the sequence of stubs.

The global stubs are emitted in COMDAT/.linkonce sections by themselves so that the linker can remove duplicates from multiple object files. This means they are not emitted in one chunk, and each one must include the speculation barrier. Another difference is that since the global stubs are shared across compilation units we do not know that all functions will be targeting an architecture supporting the SB instruction.
Rather than provide multiple stubs for each architecture, we provide a stub that will work for all architectures -- using the DSB+ISB barrier.

This mitigation does not apply for BLR instructions in the following places:
- Some accesses to thread-local variables use a code sequence with a BLR instruction. This code sequence is part of the binary interface between compiler and linker. If this BLR instruction needs to be mitigated, it'd probably be best to do so in the linker. It seems that the code sequence for thread-local variable access is unlikely to lead to a Spectre Revelation Gadget.
- PLT stubs are produced by the linker and each contain a BLR instruction. It seems that at most only after the last PLT stub a Spectre Revelation Gadget might appear.

Testing: Bootstrap and regtest on AArch64 (with BOOT_CFLAGS="-mharden-sls=retbr,blr"). Used a temporary hack(1) in gcc-dg.exp to use these options on every test in the testsuite, a slight modification to emit the speculation barrier after every function stub, and a script to check that the output never emitted a BLR, or unmitigated BR or RET instruction. Similar on an aarch64-none-elf cross-compiler.

1) Temporary hack emitted a speculation barrier at the end of every stub function, and used a script to ensure that: a) Every RET or BR is immediately followed by a speculation barrier. b) No BLR instruction is emitted by compiler.

gcc/ChangeLog:

        * config/aarch64/aarch64-protos.h (aarch64_indirect_call_asm): New declaration.
        * config/aarch64/aarch64.c (aarch64_regno_regclass): Handle new stub registers class.
        (aarch64_class_max_nregs): Likewise.
        (aarch64_register_move_cost): Likewise.
        (aarch64_sls_shared_thunks): Global array to store stub labels.
        (aarch64_sls_emit_function_stub): New.
        (aarch64_create_blr_label): New.
        (aarch64_sls_emit_blr_function_thunks): New.
        (aarch64_sls_emit_shared_blr_thunks): New.
        (aarch64_asm_file_end): New.
        (aarch64_indirect_call_asm): New.
        (TARGET_ASM_FILE_END): Use aarch64_asm_file_end.
        (TARGET_ASM_FUNCTION_EPILOGUE): Use aarch64_sls_emit_blr_function_thunks.
        * config/aarch64/aarch64.h (STB_REGNUM_P): New.
        (enum reg_class): Add STUB_REGS class.
        (machine_function): Introduce `call_via` array for function-local stub labels.
        * config/aarch64/aarch64.md (*call_insn, *call_value_insn): Use aarch64_indirect_call_asm to emit code when hardening BLR instructions.
        * config/aarch64/constraints.md (Ucr): New constraint representing registers for indirect calls.  Is GENERAL_REGS usually, and STUB_REGS when hardening BLR instruction against SLS.
        * config/aarch64/predicates.md (aarch64_general_reg): STUB_REGS class is also a general register.

gcc/testsuite/ChangeLog:

        * gcc.target/aarch64/sls-mitigation/sls-miti-blr-bti.c: New test.
        * gcc.target/aarch64/sls-mitigation/sls-miti-blr.c: New test.
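A hedged sketch of the source-level trigger (plain C; the stub named in the comment is the one described above):

    /* With -mharden-sls=blr, the indirect call below is no longer emitted
       as "blr x0".  Instead the compiler emits "bl __call_indirect_x0",
       and the stub performs "mov x16, x0; br x16", followed (for the
       global stubs) by a DSB+ISB speculation barrier.  */
    void call_through_pointer (void (*callback) (void))
    {
      callback ();   /* the BLR that gets mitigated */
    }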
2020-01-17 | [AArch64] [SVE] Implement svld1ro intrinsic. | Matthew Malcomson | 1 | -0/+16
We take no action to ensure the SVE vector size is large enough. It is left to the user to check that before compiling this intrinsic or before running such a program on a machine.

The main difference between ld1ro and ld1rq is in the allowed offsets; the implementation difference is that ld1ro is implemented using integer modes since there are no pre-existing vector modes of the relevant size. Adding new vector modes simply for this intrinsic seems to make the code less tidy.

Specifications can be found under the "Arm C Language Extensions for Scalable Vector Extension" title at https://developer.arm.com/architectures/system-architectures/software-standards/acle

gcc/ChangeLog:

2020-01-17  Matthew Malcomson  <matthew.malcomson@arm.com>

        * config/aarch64/aarch64-protos.h (aarch64_sve_ld1ro_operand_p): New.
        * config/aarch64/aarch64-sve-builtins-base.cc (class load_replicate): New.
        (class svld1ro_impl): New.
        (class svld1rq_impl): Change to inherit from load_replicate.
        (svld1ro): New sve intrinsic function base.
        * config/aarch64/aarch64-sve-builtins-base.def (svld1ro): New DEF_SVE_FUNCTION.
        * config/aarch64/aarch64-sve-builtins-base.h (svld1ro): New decl.
        * config/aarch64/aarch64-sve-builtins.cc (function_expander::add_mem_operand): Modify assert to allow OImode.
        * config/aarch64/aarch64-sve.md (@aarch64_sve_ld1ro<mode>): New pattern.
        * config/aarch64/aarch64.c (aarch64_sve_ld1rq_operand_p): Implement in terms of ...
        (aarch64_sve_ld1rq_ld1ro_operand_p): This.
        (aarch64_sve_ld1ro_operand_p): New.
        * config/aarch64/aarch64.md (UNSPEC_LD1RO): New unspec.
        * config/aarch64/constraints.md (UOb,UOh,UOw,UOd): New.
        * config/aarch64/predicates.md (aarch64_sve_ld1ro_operand_{b,h,w,d}): New.

gcc/testsuite/ChangeLog:

2020-01-17  Matthew Malcomson  <matthew.malcomson@arm.com>

        * gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c: New test.
        * gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c: New test.
        * gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c: New test.
        * gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c: New test.
        * gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c: New test.
        * gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c: New test.
        * gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c: New test.
        * gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c: New test.
        * gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c: New test.
        * gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c: New test.
        * gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c: New test.
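A hedged usage sketch (the signature follows the ACLE convention used by the new tests, e.g. ld1ro_f32.c, and is an assumption here rather than part of the patch):

    #include <arm_sve.h>

    /* Load a 256-bit block from ptr and replicate it across the whole
       (larger) SVE vector, in the style of the existing svld1rq.  */
    svfloat32_t load_replicate (svbool_t pg, const float32_t *ptr)
    {
      return svld1ro_f32 (pg, ptr);
    }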
2020-01-09 | [AArch64] Pass a mode to some SVE immediate queries | Richard Sandiford | 1 | -4/+4
It helps the SVE2 ACLE support if aarch64_sve_arith_immediate_p and aarch64_sve_sqadd_sqsub_immediate_p accept scalars as well as vectors. 2020-01-09 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-protos.h (aarch64_sve_arith_immediate_p) (aarch64_sve_sqadd_sqsub_immediate_p): Add a machine_mode argument. * config/aarch64/aarch64.c (aarch64_sve_arith_immediate_p) (aarch64_sve_sqadd_sqsub_immediate_p): Likewise. Handle scalar immediates as well as vector ones. * config/aarch64/predicates.md (aarch64_sve_arith_immediate) (aarch64_sve_sub_arith_immediate, aarch64_sve_qadd_immediate) (aarch64_sve_qsub_immediate): Update calls accordingly. From-SVN: r280059
2020-01-01 | Update copyright years. | Jakub Jelinek | 1 | -1/+1
From-SVN: r279813
2019-11-19 | [AArch64] Implement Armv8.5-A memory tagging (MTE) intrinsics | Dennis Zhang | 1 | -0/+14
2019-11-19 Dennis Zhang <dennis.zhang@arm.com> * config/aarch64/aarch64-builtins.c (enum aarch64_builtins): Add AARCH64_MEMTAG_BUILTIN_START, AARCH64_MEMTAG_BUILTIN_IRG, AARCH64_MEMTAG_BUILTIN_GMI, AARCH64_MEMTAG_BUILTIN_SUBP, AARCH64_MEMTAG_BUILTIN_INC_TAG, AARCH64_MEMTAG_BUILTIN_SET_TAG, AARCH64_MEMTAG_BUILTIN_GET_TAG, and AARCH64_MEMTAG_BUILTIN_END. (aarch64_init_memtag_builtins): New. (AARCH64_INIT_MEMTAG_BUILTINS_DECL): New macro. (aarch64_general_init_builtins): Call aarch64_init_memtag_builtins. (aarch64_expand_builtin_memtag): New. (aarch64_general_expand_builtin): Call aarch64_expand_builtin_memtag. (AARCH64_BUILTIN_SUBCODE): New macro. (aarch64_resolve_overloaded_memtag): New. (aarch64_resolve_overloaded_builtin_general): New. Call aarch64_resolve_overloaded_memtag to handle overloaded MTE builtins. * config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define __ARM_FEATURE_MEMORY_TAGGING when enabled. (aarch64_resolve_overloaded_builtin): Call aarch64_resolve_overloaded_builtin_general. * config/aarch64/aarch64-protos.h (aarch64_resolve_overloaded_builtin_general): New declaration. * config/aarch64/aarch64.h (AARCH64_ISA_MEMTAG): New macro. (TARGET_MEMTAG): Likewise. * config/aarch64/aarch64.md (UNSPEC_GEN_TAG): New unspec. (UNSPEC_GEN_TAG_RND, and UNSPEC_TAG_SPACE): Likewise. (irg, gmi, subp, addg, ldg, stg): New instructions. * config/aarch64/arm_acle.h (__arm_mte_create_random_tag): New macro. (__arm_mte_exclude_tag, __arm_mte_ptrdiff): Likewise. (__arm_mte_increment_tag, __arm_mte_set_tag): Likewise. (__arm_mte_get_tag): Likewise. * config/aarch64/predicates.md (aarch64_memtag_tag_offset): New. (aarch64_granule16_uimm6, aarch64_granule16_simm9): New. * config/arm/types.md (memtag): New. * doc/invoke.texi (-memtag): Update description. 2019-11-19 Dennis Zhang <dennis.zhang@arm.com> * gcc.target/aarch64/acle/memtag_1.c: New test. * gcc.target/aarch64/acle/memtag_2.c: New test. * gcc.target/aarch64/acle/memtag_3.c: New test. From-SVN: r278444
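A hedged usage sketch of the new intrinsics (the names come from the ChangeLog above; the exact prototypes follow the ACLE and are assumptions here):

    #include <arm_acle.h>

    /* Requires an MTE-enabled target, i.e. __ARM_FEATURE_MEMORY_TAGGING.  */
    void *tag_granule (void *p)
    {
      /* IRG: choose a random tag for p, excluding no tag values.  */
      void *tagged = __arm_mte_create_random_tag (p, 0);
      /* STG: store that tag to memory for the 16-byte granule.  */
      __arm_mte_set_tag (tagged);
      return tagged;
    }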
2019-11-18 | Add optabs for accelerating RAW and WAR alias checks | Richard Sandiford | 1 | -0/+5
This patch adds optabs that check whether a read followed by a write or a write followed by a read can be divided into interleaved byte accesses without changing the dependencies between the bytes. This is one of the uses of the SVE2 WHILERW and WHILEWR instructions. (The instructions can also be used to limit the VF at runtime, but that's future work.)

2019-11-18  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
        * doc/sourcebuild.texi (vect_check_ptrs): Document.
        * optabs.def (check_raw_ptrs_optab, check_war_ptrs_optab): New optabs.
        * doc/md.texi: Document them.
        * internal-fn.def (IFN_CHECK_RAW_PTRS, IFN_CHECK_WAR_PTRS): New internal functions.
        * internal-fn.h (internal_check_ptrs_fn_supported_p): Declare.
        * internal-fn.c (check_ptrs_direct): New macro.
        (expand_check_ptrs_optab_fn): Likewise.
        (direct_check_ptrs_optab_supported_p): Likewise.
        (internal_check_ptrs_fn_supported_p): New function.
        * tree-data-ref.c: Include internal-fn.h.
        (create_ifn_alias_checks): New function.
        (create_intersect_range_checks): Use it.
        * config/aarch64/iterators.md (SVE2_WHILE_PTR): New int iterator.
        (optab, cmp_op): Handle it.
        (raw_war, unspec): New int attributes.
        * config/aarch64/aarch64.md (UNSPEC_WHILERW, UNSPEC_WHILE_WR): New constants.
        * config/aarch64/predicates.md (aarch64_bytes_per_sve_vector_operand): New predicate.
        * config/aarch64/aarch64-sve2.md (check_<raw_war>_ptrs<mode>): New expander.
        (@aarch64_sve2_while<cmp_op><GPI:mode><PRED_ALL:mode>_ptest): New pattern.

gcc/testsuite/
        * lib/target-supports.exp (check_effective_target_vect_check_ptrs): New procedure.
        * gcc.dg/vect/vect-alias-check-14.c: Expect IFN_CHECK_WAR to be used, if available.
        * gcc.dg/vect/vect-alias-check-15.c: Likewise.
        * gcc.dg/vect/vect-alias-check-16.c: Likewise IFN_CHECK_RAW.
        * gcc.target/aarch64/sve2/whilerw_1.c: New test.
        * gcc.target/aarch64/sve2/whilewr_1.c: Likewise.
        * gcc.target/aarch64/sve2/whilewr_2.c: Likewise.

From-SVN: r278414
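An illustrative loop (a sketch, not one of the new tests): the two accesses may overlap, so vectorization needs a runtime check that the dependence still permits the chosen access width; the new optabs let that check be a single WHILEWR/WHILERW instead of explicit pointer comparisons:

    void scale (int *a, int *b, int n)
    {
      for (int i = 0; i < n; i++)
        a[i] = b[i] + 1;   /* read b[i], write a[i]; a and b may alias */
    }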
2019-10-29 | [AArch64] Add support for arm_sve.h | Richard Sandiford | 1 | -3/+86
This patch adds support for arm_sve.h. I've tried to split all the groundwork out into separate patches, so this is mostly adding new code rather than changing existing code.

The C++ frontend seems to handle correct ACLE code without modification, even in length-agnostic mode. The C frontend is close; the only correct construct I know it doesn't handle is initialisation. E.g.:

    svbool_t pg = svptrue_b8 ();

produces:

    variable-sized object may not be initialized

although:

    svbool_t pg;
    pg = svptrue_b8 ();

works fine. This can be fixed by changing:

    {
      /* A complete type is ok if size is fixed.  */
    - if (TREE_CODE (TYPE_SIZE (TREE_TYPE (decl))) != INTEGER_CST
    + if (!poly_int_tree_p (TYPE_SIZE (TREE_TYPE (decl)))
          || C_DECL_VARIABLE_SIZE (decl))
        {
          error ("variable-sized object may not be initialized");

in c/c-decl.c:start_decl.

Invalid code is likely to trigger ICEs, so this isn't ready for general use yet. However, it seemed better to apply the patch now and deal with diagnosing invalid code as a follow-up. For one thing, it means that we'll be able to provide testcases for middle-end changes related to SVE vectors, which has been a problem until now. (I already have a series of such patches lined up.)

The patch includes some tests, but the main ones need to wait until the PCS support has been applied. (A short usage sketch follows the ChangeLog below.)

2019-10-29  Richard Sandiford  <richard.sandiford@arm.com>
            Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
            Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

gcc/ * config.gcc (aarch64*-*-*): Add arm_sve.h to extra_headers. Add aarch64-sve-builtins.o, aarch64-sve-builtins-shapes.o and aarch64-sve-builtins-base.o to extra_objs. Add aarch64-sve-builtins.h and aarch64-sve-builtins.cc to target_gtfiles. * config/aarch64/t-aarch64 (aarch64-sve-builtins.o): New rule. (aarch64-sve-builtins-shapes.o): Likewise. (aarch64-sve-builtins-base.o): New rules. * config/aarch64/aarch64-c.c (aarch64_pragma_aarch64): New function. (aarch64_resolve_overloaded_builtin): Likewise. (aarch64_check_builtin_call): Likewise. (aarch64_register_pragmas): Install aarch64_resolve_overloaded_builtin and aarch64_check_builtin_call in targetm. Register the GCC aarch64 pragma. * config/aarch64/aarch64-protos.h (AARCH64_FOR_SVPRFOP): New macro. (aarch64_svprfop): New enum. (AARCH64_BUILTIN_SVE): New aarch64_builtin_class enum value. (aarch64_sve_int_mode, aarch64_sve_data_mode): Declare. (aarch64_fold_sve_cnt_pat, aarch64_output_sve_prefetch): Likewise. (aarch64_output_sve_cnt_pat_immediate): Likewise. (aarch64_output_sve_ptrues, aarch64_sve_ptrue_svpattern_p): Likewise. (aarch64_sve_sqadd_sqsub_immediate_p, aarch64_sve_ldff1_operand_p) (aarch64_sve_ldnf1_operand_p, aarch64_sve_prefetch_operand_p) (aarch64_ptrue_all_mode, aarch64_convert_sve_data_to_pred): Likewise. (aarch64_expand_sve_dupq, aarch64_replace_reg_mode): Likewise. (aarch64_sve::init_builtins, aarch64_sve::handle_arm_sve_h): Likewise. (aarch64_sve::builtin_decl, aarch64_sve::builtin_type_p): Likewise. (aarch64_sve::mangle_builtin_type): Likewise. (aarch64_sve::resolve_overloaded_builtin): Likewise. (aarch64_sve::check_builtin_call, aarch64_sve::gimple_fold_builtin) (aarch64_sve::expand_builtin): Likewise. * config/aarch64/aarch64.c (aarch64_sve_data_mode): Make public. (aarch64_sve_int_mode): Likewise. (aarch64_ptrue_all_mode): New function. (aarch64_convert_sve_data_to_pred): Make public. (svprfop_token): New function. (aarch64_output_sve_prefetch): Likewise. (aarch64_fold_sve_cnt_pat): Likewise. (aarch64_output_sve_cnt_pat_immediate): Likewise.
(aarch64_sve_move_pred_via_while): Use gen_while with UNSPEC_WHILE_LO instead of gen_while_ult. (aarch64_replace_reg_mode): Make public. (aarch64_init_builtins): Call aarch64_sve::init_builtins. (aarch64_fold_builtin): Handle AARCH64_BUILTIN_SVE. (aarch64_gimple_fold_builtin, aarch64_expand_builtin): Likewise. (aarch64_builtin_decl, aarch64_builtin_reciprocal): Likewise. (aarch64_mangle_type): Call aarch64_sve::mangle_type. (aarch64_sve_sqadd_sqsub_immediate_p): New function. (aarch64_sve_ptrue_svpattern_p): Likewise. (aarch64_sve_pred_valid_immediate): Check aarch64_sve_ptrue_svpattern_p. (aarch64_sve_ldff1_operand_p, aarch64_sve_ldnf1_operand_p) (aarch64_sve_prefetch_operand_p, aarch64_output_sve_ptrues): New functions. * config/aarch64/aarch64.md (UNSPEC_LDNT1_SVE, UNSPEC_STNT1_SVE) (UNSPEC_LDFF1_GATHER, UNSPEC_PTRUE, UNSPEC_WHILE_LE, UNSPEC_WHILE_LS) (UNSPEC_WHILE_LT, UNSPEC_CLASTA, UNSPEC_UPDATE_FFR) (UNSPEC_UPDATE_FFRT, UNSPEC_RDFFR, UNSPEC_WRFFR) (UNSPEC_SVE_LANE_SELECT, UNSPEC_SVE_CNT_PAT, UNSPEC_SVE_PREFETCH) (UNSPEC_SVE_PREFETCH_GATHER, UNSPEC_SVE_COMPACT, UNSPEC_SVE_SPLICE): New unspecs. * config/aarch64/iterators.md (SI_ONLY, DI_ONLY, VNx8HI_ONLY) (VNx2DI_ONLY, SVE_PARTIAL, VNx8_NARROW, VNx8_WIDE, VNx4_NARROW) (VNx4_WIDE, VNx2_NARROW, VNx2_WIDE, PRED_HSD): New mode iterators. (UNSPEC_ADR, UNSPEC_BRKA, UNSPEC_BRKB, UNSPEC_BRKN, UNSPEC_BRKPA) (UNSPEC_BRKPB, UNSPEC_PFIRST, UNSPEC_PNEXT, UNSPEC_CNTP, UNSPEC_SADDV) (UNSPEC_UADDV, UNSPEC_FMLA, UNSPEC_FMLS, UNSPEC_FEXPA, UNSPEC_FTMAD) (UNSPEC_FTSMUL, UNSPEC_FTSSEL, UNSPEC_COND_CMPEQ_WIDE): New unspecs. (UNSPEC_COND_CMPGE_WIDE, UNSPEC_COND_CMPGT_WIDE): Likewise. (UNSPEC_COND_CMPHI_WIDE, UNSPEC_COND_CMPHS_WIDE): Likewise. (UNSPEC_COND_CMPLE_WIDE, UNSPEC_COND_CMPLO_WIDE): Likewise. (UNSPEC_COND_CMPLS_WIDE, UNSPEC_COND_CMPLT_WIDE): Likewise. (UNSPEC_COND_CMPNE_WIDE, UNSPEC_COND_FCADD90, UNSPEC_COND_FCADD270) (UNSPEC_COND_FCMLA, UNSPEC_COND_FCMLA90, UNSPEC_COND_FCMLA180) (UNSPEC_COND_FCMLA270, UNSPEC_COND_FMAX, UNSPEC_COND_FMIN): Likewise. (UNSPEC_COND_FMULX, UNSPEC_COND_FRECPX, UNSPEC_COND_FSCALE): Likewise. (UNSPEC_LASTA, UNSPEC_ASHIFT_WIDE, UNSPEC_ASHIFTRT_WIDE): Likewise. (UNSPEC_LSHIFTRT_WIDE, UNSPEC_LDFF1, UNSPEC_LDNF1): Likewise. (Vesize): Handle partial vector modes. (self_mask, narrower_mask, sve_lane_con, sve_lane_pair_con): New mode attributes. (UBINQOPS, ANY_PLUS, SAT_PLUS, ANY_MINUS, SAT_MINUS): New code iterators. (s, paired_extend, inc_dec): New code attributes. (SVE_INT_ADDV, CLAST, LAST): New int iterators. (SVE_INT_UNARY): Add UNSPEC_RBIT. (SVE_FP_UNARY, SVE_FP_UNARY_INT): New int iterators. (SVE_FP_BINARY, SVE_FP_BINARY_INT): Likewise. (SVE_COND_FP_UNARY): Add UNSPEC_COND_FRECPX. (SVE_COND_FP_BINARY): Add UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and UNSPEC_COND_FMULX. (SVE_COND_FP_BINARY_INT, SVE_COND_FP_ADD): New int iterators. (SVE_COND_FP_SUB, SVE_COND_FP_MUL): Likewise. (SVE_COND_FP_BINARY_I1): Add UNSPEC_COND_FMAX and UNSPEC_COND_FMIN. (SVE_COND_FP_BINARY_REG): Add UNSPEC_COND_FMULX. (SVE_COND_FCADD, SVE_COND_FP_MAXMIN, SVE_COND_FCMLA) (SVE_COND_INT_CMP_WIDE, SVE_FP_TERNARY_LANE, SVE_CFP_TERNARY_LANE) (SVE_WHILE, SVE_SHIFT_WIDE, SVE_LDFF1_LDNF1, SVE_BRK_UNARY) (SVE_BRK_BINARY, SVE_PITER): New int iterators. 
(optab): Handle UNSPEC_SADDV, UNSPEC_UADDV, UNSPEC_FRECPE, UNSPEC_FRECPS, UNSPEC_RSQRTE, UNSPEC_RSQRTS, UNSPEC_RBIT, UNSPEC_SMUL_HIGHPART, UNSPEC_UMUL_HIGHPART, UNSPEC_FMLA, UNSPEC_FMLS, UNSPEC_FCMLA, UNSPEC_FCMLA90, UNSPEC_FCMLA180, UNSPEC_FCMLA270, UNSPEC_FEXPA, UNSPEC_FTSMUL, UNSPEC_FTSSEL, UNSPEC_COND_FCADD90, UNSPEC_COND_FCADD270, UNSPEC_COND_FCMLA, UNSPEC_COND_FCMLA90, UNSPEC_COND_FCMLA180, UNSPEC_COND_FCMLA270, UNSPEC_COND_FMAX, UNSPEC_COND_FMIN, UNSPEC_COND_FMULX, UNSPEC_COND_FRECPX and UNSPEC_COND_FSCALE. (maxmin_uns): Handle UNSPEC_COND_FMAX and UNSPEC_COND_FMIN. (binqops_op, binqops_op_rev, last_op): New int attributes. (su): Handle UNSPEC_SADDV and UNSPEC_UADDV. (fn, ab): New int attributes. (cmp_op): Handle UNSPEC_COND_CMP*_WIDE and UNSPEC_WHILE_*. (while_optab_cmp, brk_op, sve_pred_op): New int attributes. (sve_int_op): Handle UNSPEC_SMUL_HIGHPART, UNSPEC_UMUL_HIGHPART, UNSPEC_ASHIFT_WIDE, UNSPEC_ASHIFTRT_WIDE, UNSPEC_LSHIFTRT_WIDE and UNSPEC_RBIT. (sve_fp_op): Handle UNSPEC_FRECPE, UNSPEC_FRECPS, UNSPEC_RSQRTE, UNSPEC_RSQRTS, UNSPEC_FMLA, UNSPEC_FMLS, UNSPEC_FEXPA, UNSPEC_FTSMUL, UNSPEC_FTSSEL, UNSPEC_COND_FMAX, UNSPEC_COND_FMIN, UNSPEC_COND_FMULX, UNSPEC_COND_FRECPX and UNSPEC_COND_FSCALE. (sve_fp_op_rev): Handle UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and UNSPEC_COND_FMULX. (rot): Handle UNSPEC_COND_FCADD* and UNSPEC_COND_FCMLA*. (brk_reg_con, brk_reg_opno): New int attributes. (sve_pred_fp_rhs1_operand, sve_pred_fp_rhs2_operand): Handle UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and UNSPEC_COND_FMULX. (sve_pred_fp_rhs2_immediate): Handle UNSPEC_COND_FMAX and UNSPEC_COND_FMIN. (max_elem_bits): New int attribute. (min_elem_bits): Handle UNSPEC_RBIT. * config/aarch64/predicates.md (subreg_lowpart_operator): Handle TRUNCATE as well as SUBREG. (ascending_int_parallel, aarch64_simd_reg_or_minus_one) (aarch64_sve_ldff1_operand, aarch64_sve_ldnf1_operand) (aarch64_sve_prefetch_operand, aarch64_sve_ptrue_svpattern_immediate) (aarch64_sve_qadd_immediate, aarch64_sve_qsub_immediate) (aarch64_sve_gather_immediate_b, aarch64_sve_gather_immediate_h) (aarch64_sve_gather_immediate_w, aarch64_sve_gather_immediate_d) (aarch64_sve_sqadd_operand, aarch64_sve_gather_offset_b) (aarch64_sve_gather_offset_h, aarch64_sve_gather_offset_w) (aarch64_sve_gather_offset_d, aarch64_gather_scale_operand_b) (aarch64_gather_scale_operand_h): New predicates. * config/aarch64/constraints.md (UPb, UPd, UPh, UPw, Utf, Utn, vgb) (vgd, vgh, vgw, vsQ, vsS): New constraints. * config/aarch64/aarch64-sve.md: Add a note on the FFR handling. (*aarch64_sve_reinterpret<mode>): Allow any source register instead of requiring an exact match. (*aarch64_sve_ptruevnx16bi_cc, *aarch64_sve_ptrue<mode>_cc) (*aarch64_sve_ptruevnx16bi_ptest, *aarch64_sve_ptrue<mode>_ptest) (aarch64_wrffr, aarch64_update_ffr_for_load, aarch64_copy_ffr_to_ffrt) (aarch64_rdffr, aarch64_rdffr_z, *aarch64_rdffr_z_ptest) (*aarch64_rdffr_ptest, *aarch64_rdffr_z_cc, *aarch64_rdffr_cc) (aarch64_update_ffrt): New patterns. (@aarch64_load_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>) (@aarch64_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>) (@aarch64_ld<fn>f1<mode>): New patterns. (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>) (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>) (@aarch64_ldnt1<mode>): New patterns. 
(gather_load<mode>): Use aarch64_sve_gather_offset_<Vesize> for the scalar part of the address. (mask_gather_load<SVE_S:mode>): Use aarch64_sve_gather_offset_w for the scalar part of the address and add an alternative for handling nonzero offsets. (mask_gather_load<SVE_D:mode>): Likewise aarch64_sve_gather_offset_d. (*mask_gather_load<mode>_sxtw, *mask_gather_load<mode>_uxtw) (@aarch64_gather_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>) (*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw) (*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw) (@aarch64_ldff1_gather<SVE_S:mode>, @aarch64_ldff1_gather<SVE_D:mode>) (*aarch64_ldff1_gather<mode>_sxtw, *aarch64_ldff1_gather<mode>_uxtw) (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>) (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw) (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw) (@aarch64_sve_prefetch<mode>): New patterns. (@aarch64_sve_gather_prefetch<SVE_I:mode><VNx4SI_ONLY:mode>) (@aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>) (*aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>_sxtw) (*aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>_uxtw) (@aarch64_store_trunc<VNx8_NARROW:mode><VNx8_WIDE:mode>) (@aarch64_store_trunc<VNx4_NARROW:mode><VNx4_WIDE:mode>) (@aarch64_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>) (@aarch64_stnt1<mode>): New patterns. (scatter_store<mode>): Use aarch64_sve_gather_offset_<Vesize> for the scalar part of the address. (mask_scatter_store<SVE_S:mode>): Use aarch64_sve_gather_offset_w for the scalar part of the address and add an alternative for handling nonzero offsets. (mask_scatter_store<SVE_D:mode>): Likewise aarch64_sve_gather_offset_d. (*mask_scatter_store<mode>_sxtw, *mask_scatter_store<mode>_uxtw) (@aarch64_scatter_store_trunc<VNx4_NARROW:mode><VNx4_WIDE:mode>) (@aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>) (*aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>_sxtw) (*aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>_uxtw): New patterns. (vec_duplicate<mode>): Use QI as the mode of the input operand. (extract_last_<mode>): Generalize to... (@extract_<LAST:last_op>_<mode>): ...this. (*<SVE_INT_UNARY:optab><mode>2): Rename to... (@aarch64_pred_<SVE_INT_UNARY:optab><mode>): ...this. (@cond_<SVE_INT_UNARY:optab><mode>): New expander. (@aarch64_pred_sxt<SVE_HSDI:mode><SVE_PARTIAL:mode>): New pattern. (@aarch64_cond_sxt<SVE_HSDI:mode><SVE_PARTIAL:mode>): Likewise. (@aarch64_pred_cnot<mode>, @cond_cnot<mode>): New expanders. (@aarch64_sve_<SVE_FP_UNARY_INT:optab><mode>): New pattern. (@aarch64_sve_<SVE_FP_UNARY:optab><mode>): Likewise. (*<SVE_COND_FP_UNARY:optab><mode>2): Rename to... (@aarch64_pred_<SVE_COND_FP_UNARY:optab><mode>): ...this. (@cond_<SVE_COND_FP_UNARY:optab><mode>): New expander. (*<SVE_INT_BINARY_IMM:optab><mode>3): Rename to... (@aarch64_pred_<SVE_INT_BINARY_IMM:optab><mode>): ...this. (@aarch64_adr<mode>, *aarch64_adr_sxtw): New patterns. (*aarch64_adr_uxtw_unspec): Likewise. (*aarch64_adr_uxtw): Rename to... (*aarch64_adr_uxtw_and): ...this. (@aarch64_adr<mode>_shift): New expander. (*aarch64_adr_shift_sxtw): New pattern. (aarch64_<su>abd<mode>_3): Rename to... (@aarch64_pred_<su>abd<mode>): ...this.
(<su>abd<mode>_3): Update accordingly. (@aarch64_cond_<su>abd<mode>): New expander. (@aarch64_<SBINQOPS:su_optab><optab><mode>): New pattern. (@aarch64_<UBINQOPS:su_optab><optab><mode>): Likewise. (*<su>mul<mode>3_highpart): Rename to... (@aarch64_pred_<optab><mode>): ...this. (@cond_<MUL_HIGHPART:optab><mode>): New expander. (*cond_<MUL_HIGHPART:optab><mode>_2): New pattern. (*cond_<MUL_HIGHPART:optab><mode>_z): Likewise. (*<SVE_INT_BINARY_SD:optab><mode>3): Rename to... (@aarch64_pred_<SVE_INT_BINARY_SD:optab><mode>): ...this. (cond_<SVE_INT_BINARY_SD:optab><mode>): Add a "@" marker. (@aarch64_bic<mode>, @cond_bic<mode>): New expanders. (*v<ASHIFT:optab><mode>3): Rename to... (@aarch64_pred_<ASHIFT:optab><mode>): ...this. (@aarch64_sve_<SVE_SHIFT_WIDE:sve_int_op><mode>): New pattern. (@cond_<SVE_SHIFT_WIDE:sve_int_op><mode>): New expander. (*cond_<SVE_SHIFT_WIDE:sve_int_op><mode>_m): New pattern. (*cond_<SVE_SHIFT_WIDE:sve_int_op><mode>_z): Likewise. (@cond_asrd<mode>): New expander. (*cond_asrd<mode>_2, *cond_asrd<mode>_z): New patterns. (sdiv_pow2<mode>3): Expand to *cond_asrd<mode>_2. (*sdiv_pow2<mode>3): Delete. (@cond_<SVE_COND_FP_BINARY_INT:optab><mode>): New expander. (*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_2): New pattern. (*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_any): Likewise. (@aarch64_sve_<SVE_FP_BINARY:optab><mode>): New pattern. (@aarch64_sve_<SVE_FP_BINARY_INT:optab><mode>): Likewise. (*<SVE_COND_FP_BINARY_REG:optab><mode>3): Rename to... (@aarch64_pred_<SVE_COND_FP_BINARY_REG:optab><mode>): ...this. (@aarch64_pred_<SVE_COND_FP_BINARY_INT:optab><mode>): New pattern. (cond_<SVE_COND_FP_BINARY:optab><mode>): Add a "@" marker. (*add<SVE_F:mode>3): Rename to... (@aarch64_pred_add<SVE_F:mode>): ...this and add alternatives for SVE_STRICT_GP. (@aarch64_pred_<SVE_COND_FCADD:optab><mode>): New pattern. (@cond_<SVE_COND_FCADD:optab><mode>): New expander. (*cond_<SVE_COND_FCADD:optab><mode>_2): New pattern. (*cond_<SVE_COND_FCADD:optab><mode>_any): Likewise. (*sub<SVE_F:mode>3): Rename to... (@aarch64_pred_sub<SVE_F:mode>): ...this and add alternatives for SVE_STRICT_GP. (@aarch64_pred_abd<SVE_F:mode>): New expander. (*fabd<SVE_F:mode>3): Rename to... (*aarch64_pred_abd<SVE_F:mode>): ...this. (@aarch64_cond_abd<SVE_F:mode>): New expander. (*mul<SVE_F:mode>3): Rename to... (@aarch64_pred_<SVE_F:optab><mode>): ...this and add alternatives for SVE_STRICT_GP. (@aarch64_mul_lane_<SVE_F:mode>): New pattern. (*<SVE_COND_FP_MAXMIN_PUBLIC:optab><mode>3): Rename and generalize to... (@aarch64_pred_<SVE_COND_FP_MAXMIN:optab><mode>): ...this. (*<LOGICAL:optab><PRED_ALL:mode>3_ptest): New pattern. (*<nlogical><PRED_ALL:mode>3): Rename to... (aarch64_pred_<nlogical><PRED_ALL:mode>_z): ...this. (*<nlogical><PRED_ALL:mode>3_cc): New pattern. (*<nlogical><PRED_ALL:mode>3_ptest): Likewise. (*<logical_nn><PRED_ALL:mode>3): Rename to... (aarch64_pred_<logical_nn><mode>_z): ...this. (*<logical_nn><PRED_ALL:mode>3_cc): New pattern. (*<logical_nn><PRED_ALL:mode>3_ptest): Likewise. (*fma<SVE_I:mode>4): Rename to... (@aarch64_pred_fma<SVE_I:mode>): ...this. (*fnma<SVE_I:mode>4): Rename to... (@aarch64_pred_fnma<SVE_I:mode>): ...this. (@aarch64_<sur>dot_prod_lane<vsi2qi>): New pattern. (*<SVE_FP_TERNARY:optab><mode>4): Rename to... (@aarch64_pred_<SVE_FP_TERNARY:optab><mode>): ...this. (cond_<SVE_FP_TERNARY:optab><mode>): Add a "@" marker. (@aarch64_<SVE_FP_TERNARY_LANE:optab>_lane_<mode>): New pattern. (@aarch64_pred_<SVE_COND_FCMLA:optab><mode>): Likewise. 
(@cond_<SVE_COND_FCMLA:optab><mode>): New expander. (*cond_<SVE_COND_FCMLA:optab><mode>_4): New pattern. (*cond_<SVE_COND_FCMLA:optab><mode>_any): Likewise. (@aarch64_<FCMLA:optab>_lane_<mode>): Likewise. (@aarch64_sve_tmad<mode>): Likewise. (vcond_mask_<SVE_ALL:mode><vpred>): Add a "@" marker. (*aarch64_sel_dup<mode>): Rename to... (@aarch64_sel_dup<mode>): ...this. (@aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide): New pattern. (*aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide_cc): Likewise. (*aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide_ptest): Likewise. (@while_ult<GPI:mode><PRED_ALL:mode>): Generalize to... (@while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>): ...this. (*while_ult<GPI:mode><PRED_ALL:mode>_cc): Generalize to. (*while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>_cc): ...this. (*while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>_ptest): New pattern. (*fcm<cmp_op><mode>): Rename to... (@aarch64_pred_fcm<cmp_op><mode>): ...this. Make operand order match @aarch64_pred_cmp<cmp_op><SVE_I:mode>. (*fcmuo<mode>): Rename to... (@aarch64_pred_fcmuo<mode>): ...this. Make operand order match @aarch64_pred_cmp<cmp_op><SVE_I:mode>. (@aarch64_pred_fac<cmp_op><mode>): New expander. (@vcond_mask_<PRED_ALL:mode><mode>): New pattern. (fold_extract_last_<mode>): Generalize to... (@fold_extract_<last_op>_<mode>): ...this. (@aarch64_fold_extract_vector_<last_op>_<mode>): New pattern. (*reduc_plus_scal_<SVE_I:mode>): Replace with... (@aarch64_pred_reduc_<optab>_<mode>): ...this pattern, making the DImode result explicit. (reduc_plus_scal_<mode>): Update accordingly. (*reduc_<optab>_scal_<SVE_I:mode>): Rename to... (@aarch64_pred_reduc_<optab>_<SVE_I:mode>): ...this. (*reduc_<optab>_scal_<SVE_F:mode>): Rename to... (@aarch64_pred_reduc_<optab>_<SVE_F:mode>): ...this. (*aarch64_sve_tbl<mode>): Rename to... (@aarch64_sve_tbl<mode>): ...this. (@aarch64_sve_compact<mode>): New pattern. (*aarch64_sve_dup_lane<mode>): Rename to... (@aarch64_sve_dup_lane<mode>): ...this. (@aarch64_sve_dupq_lane<mode>): New pattern. (@aarch64_sve_splice<mode>): Likewise. (aarch64_sve_<perm_insn><mode>): Rename to... (@aarch64_sve_<perm_insn><mode>): ...this. (*aarch64_sve_ext<mode>): Rename to... (@aarch64_sve_ext<mode>): ...this. (aarch64_sve_<su>unpk<perm_hilo>_<SVE_BHSI:mode>): Add a "@" marker. (*aarch64_sve_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): Rename to... (@aarch64_sve_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): ...this. (*aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): Rename to... (@aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): ...this. (@cond_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): New expander. (@cond_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): Likewise. (*cond_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): New pattern. (*aarch64_sve_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): Rename to... (@aarch64_sve_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): ...this. (aarch64_sve_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): Add a "@" marker. (@cond_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): New expander. (@cond_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): Likewise. (*cond_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): New pattern. (*aarch64_sve_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): Rename to... (@aarch64_sve_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): ...this. (@cond_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): New expander. (*cond_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): New pattern. 
(aarch64_sve_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): Add a "@" marker. (@cond_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): New expander. (*cond_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): New pattern. (aarch64_sve_punpk<perm_hilo>_<mode>): Add a "@" marker. (@aarch64_brk<SVE_BRK_UNARY:brk_op>): New pattern. (*aarch64_brk<SVE_BRK_UNARY:brk_op>_cc): Likewise. (*aarch64_brk<SVE_BRK_UNARY:brk_op>_ptest): Likewise. (@aarch64_brk<SVE_BRK_BINARY:brk_op>): Likewise. (*aarch64_brk<SVE_BRK_BINARY:brk_op>_cc): Likewise. (*aarch64_brk<SVE_BRK_BINARY:brk_op>_ptest): Likewise. (@aarch64_sve_<SVE_PITER:sve_pred_op><mode>): Likewise. (*aarch64_sve_<SVE_PITER:sve_pred_op><mode>_cc): Likewise. (*aarch64_sve_<SVE_PITER:sve_pred_op><mode>_ptest): Likewise. (aarch64_sve_cnt_pat): Likewise. (@aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode>_pat): Likewise. (*aarch64_sve_incsi_pat): Likewise. (@aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_pat): New expander. (*aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_pat): New pattern. (@aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode>_pat): Likewise. (*aarch64_sve_decsi_pat): Likewise. (@aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_pat): New expander. (*aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_pat): New pattern. (@aarch64_pred_cntp<mode>): Likewise. (@aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp): New expander. (*aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp) (*aarch64_incsi<PRED_ALL:mode>_cntp): New patterns. (@aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp): New expander. (*aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp): New pattern. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New pattern. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New pattern. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New pattern. (@aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp): New expander. (*aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp) (*aarch64_incsi<PRED_ALL:mode>_cntp): New patterns. (@aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp): New expander. (*aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp): New pattern. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New pattern. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New pattern. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New pattern. * config/aarch64/arm_sve.h: New file. * config/aarch64/aarch64-sve-builtins.h: Likewise. * config/aarch64/aarch64-sve-builtins.cc: Likewise. * config/aarch64/aarch64-sve-builtins.def: Likewise. 
* config/aarch64/aarch64-sve-builtins-base.h: Likewise. * config/aarch64/aarch64-sve-builtins-base.cc: Likewise. * config/aarch64/aarch64-sve-builtins-base.def: Likewise. * config/aarch64/aarch64-sve-builtins-functions.h: Likewise. * config/aarch64/aarch64-sve-builtins-shapes.h: Likewise. * config/aarch64/aarch64-sve-builtins-shapes.cc: Likewise. gcc/testsuite/ * g++.target/aarch64/sve/acle/aarch64-sve-acle.exp: New file. * g++.target/aarch64/sve/acle/general-c++: New test directory. * gcc.target/aarch64/sve/acle/aarch64-sve-acle.exp: New file. * gcc.target/aarch64/sve/acle/general: New test directory. * gcc.target/aarch64/sve/acle/general-c: Likewise. Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org> Co-Authored-By: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> From-SVN: r277563
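A small length-agnostic example of the style arm_sve.h enables (a sketch using standard ACLE intrinsics, not part of the patch itself):

    #include <arm_sve.h>

    void vadd (float *dst, const float *src, int64_t n)
    {
      for (int64_t i = 0; i < n; i += svcntw ())
        {
          svbool_t pg = svwhilelt_b32 (i, n);      /* lanes with i+lane < n */
          svfloat32_t a = svld1 (pg, dst + i);
          svfloat32_t b = svld1 (pg, src + i);
          svst1 (pg, dst + i, svadd_x (pg, a, b));
        }
    }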
2019-10-29 | [AArch64] Handle scalars in cmp and shift immediate queries | Richard Sandiford | 1 | -2/+2
The SVE ACLE has convenience functions that take scalar arguments instead of vectors. This patch makes it easier to implement the shift and compare functions by making the associated immediate queries work for scalar immediates as well as vector duplicates of them. The "const" codes in the predicates were a holdover from an early version of the SVE port in which we used (const ...) wrappers for variable-length vector constants. I'll remove other instances of them in a separate patch. 2019-10-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64.c (aarch64_sve_cmp_immediate_p) (aarch64_simd_shift_imm_p): Accept scalars as well as vectors. * config/aarch64/predicates.md (aarch64_sve_cmp_vsc_immediate) (aarch64_sve_cmp_vsd_immediate): Accept "const_int", but don't accept "const". From-SVN: r277556
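For example (a hedged ACLE-level sketch; these _n intrinsics are standard ACLE, not part of the patch): the scalar right-hand sides below can reach the backend as scalar immediates rather than vector duplicates, which the updated queries now accept:

    #include <arm_sve.h>

    svbool_t cmp_imm (svbool_t pg, svint32_t x)
    {
      return svcmplt_n_s32 (pg, x, 15);   /* compare with immediate */
    }

    svint32_t shift_imm (svbool_t pg, svint32_t x)
    {
      return svasr_n_s32_x (pg, x, 2);    /* shift by immediate */
    }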
2019-08-19 | [AArch64] Use scvtf fbits option where appropriate | Joel Hutton | 1 | -0/+4
gcc/ChangeLog:

2019-08-19  Joel Hutton  <Joel.Hutton@arm.com>

        * config/aarch64/aarch64-protos.h (aarch64_fpconst_pow2_recip): New prototype.
        * config/aarch64/aarch64.c (aarch64_fpconst_pow2_recip): New function.
        * config/aarch64/aarch64.md (*aarch64_<su_optab>cvtf<fcvt_target><GPF:mode>2_mult): New pattern.
        (*aarch64_<su_optab>cvtf<fcvt_iesize><GPF:mode>2_mult): New pattern.
        * config/aarch64/constraints.md (Dt): New constraint.
        * config/aarch64/predicates.md (aarch64_fpconst_pow2_recip): New predicate.

gcc/testsuite/ChangeLog:

2019-08-19  Joel Hutton  <Joel.Hutton@arm.com>

        * gcc.target/aarch64/fmul_scvtf_1.c: New test.

From-SVN: r274676
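An illustrative source pattern (a sketch; the assembly in the comment is the expected form, not a verified dump): multiplying a converted integer by the reciprocal of a power of two folds into the fbits operand of scvtf:

    float fixed_to_float (int x)
    {
      /* Previously: scvtf followed by fmul.  With this patch: a single
         "scvtf s0, w0, #6", treating x as fixed point with 6 fraction
         bits.  */
      return x * (1.0f / 64);
    }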
2019-08-15 | [AArch64] Rework SVE INC/DEC handling | Richard Sandiford | 1 | -4/+13
The scalar addition patterns allowed all the VL constants that ADDVL and ADDPL allow, but wrote the instructions as INC or DEC if possible (i.e. adding or subtracting a number of elements * [1, 16] when the source and target registers are the same). That works for the cases that the autovectoriser needs, but there are a few constants that INC and DEC can handle but ADDPL and ADDVL can't. E.g.:

    inch x0, all, mul #9

is not a multiple of the number of bytes in an SVE register, and so can't use ADDVL. It represents 36 times the number of bytes in an SVE predicate, putting it outside the range of ADDPL. This patch therefore adds separate alternatives for INC and DEC, tied to a new Uai constraint. It also adds an explicit "scalar" or "vector" to the function names, to avoid a clash with the existing support for vector INC and DEC.

2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
        * config/aarch64/aarch64-protos.h (aarch64_sve_scalar_inc_dec_immediate_p): Declare.
        (aarch64_sve_inc_dec_immediate_p): Rename to...
        (aarch64_sve_vector_inc_dec_immediate_p): ...this.
        (aarch64_output_sve_addvl_addpl): Take a single rtx argument.
        (aarch64_output_sve_scalar_inc_dec): Declare.
        (aarch64_output_sve_inc_dec_immediate): Rename to...
        (aarch64_output_sve_vector_inc_dec): ...this.
        * config/aarch64/aarch64.c (aarch64_sve_scalar_inc_dec_immediate_p)
        (aarch64_output_sve_scalar_inc_dec): New functions.
        (aarch64_output_sve_addvl_addpl): Remove the base and offset arguments.  Only handle true ADDVL and ADDPL instructions; don't emit an INC or DEC.
        (aarch64_sve_inc_dec_immediate_p): Rename to...
        (aarch64_sve_vector_inc_dec_immediate_p): ...this.
        (aarch64_output_sve_inc_dec_immediate): Rename to...
        (aarch64_output_sve_vector_inc_dec): ...this.  Update call to aarch64_sve_vector_inc_dec_immediate_p.
        * config/aarch64/predicates.md (aarch64_sve_scalar_inc_dec_immediate)
        (aarch64_sve_plus_immediate): New predicates.
        (aarch64_pluslong_operand): Accept aarch64_sve_plus_immediate rather than aarch64_sve_addvl_addpl_immediate.
        (aarch64_sve_inc_dec_immediate): Rename to...
        (aarch64_sve_vector_inc_dec_immediate): ...this.  Update call to aarch64_sve_vector_inc_dec_immediate_p.
        (aarch64_sve_add_operand): Update accordingly.
        * config/aarch64/constraints.md (Uai): New constraint.
        (vsi): Update call to aarch64_sve_vector_inc_dec_immediate_p.
        * config/aarch64/aarch64.md (add<GPI:mode>3): Don't force the second operand into a register if it satisfies aarch64_sve_plus_immediate.
        (*add<GPI:mode>3_aarch64, *add<GPI:mode>3_poly_1): Add an alternative for Uai.  Update calls to aarch64_output_sve_addvl_addpl.
        * config/aarch64/aarch64-sve.md (add<mode>3): Call aarch64_output_sve_vector_inc_dec instead of aarch64_output_sve_inc_dec_immediate.

From-SVN: r274518
2019-08-15 | [AArch64] Use SVE binary immediate instructions for conditional arithmetic | Richard Sandiford | 1 | -2/+6
This patch lets us use the immediate forms of FADD, FSUB, FSUBR, FMUL, FMAXNM and FMINNM for conditional arithmetic. (We already use them for normal unconditional arithmetic.) 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> gcc/ * config/aarch64/aarch64.c (aarch64_print_vector_float_operand): Print 2.0 naturally. (aarch64_sve_float_mul_immediate_p): Return true for 2.0. * config/aarch64/predicates.md (aarch64_sve_float_negated_arith_immediate): New predicate, renamed from aarch64_sve_float_arith_with_sub_immediate. (aarch64_sve_float_arith_with_sub_immediate): Test for both positive and negative constants. (aarch64_sve_float_arith_with_sub_operand): Redefine as a register or an aarch64_sve_float_arith_with_sub_immediate. * config/aarch64/constraints.md (vsN): Use aarch64_sve_float_negated_arith_immediate. * config/aarch64/iterators.md (SVE_COND_FP_BINARY_I1): New int iterator. (sve_pred_fp_rhs2_immediate): New int attribute. * config/aarch64/aarch64-sve.md (cond_<SVE_COND_FP_BINARY:optab><SVE_F:mode>): Use sve_pred_fp_rhs1_operand and sve_pred_fp_rhs2_operand. (*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_2_const) (*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_any_const) (*cond_add<SVE_F:mode>_2_const, *cond_add<SVE_F:mode>_any_const) (*cond_sub<mode>_3_const, *cond_sub<mode>_any_const): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/cond_fadd_1.c: New test. * gcc.target/aarch64/sve/cond_fadd_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_fadd_2.c: Likewise. * gcc.target/aarch64/sve/cond_fadd_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_fadd_3.c: Likewise. * gcc.target/aarch64/sve/cond_fadd_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_fadd_4.c: Likewise. * gcc.target/aarch64/sve/cond_fadd_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_fsubr_1.c: Likewise. * gcc.target/aarch64/sve/cond_fsubr_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_fsubr_2.c: Likewise. * gcc.target/aarch64/sve/cond_fsubr_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_fsubr_3.c: Likewise. * gcc.target/aarch64/sve/cond_fsubr_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_fsubr_4.c: Likewise. * gcc.target/aarch64/sve/cond_fsubr_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_1.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_2.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_3.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_4.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_1.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_2.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_3.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_4.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmul_1.c: Likewise. * gcc.target/aarch64/sve/cond_fmul_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmul_2.c: Likewise. * gcc.target/aarch64/sve/cond_fmul_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmul_3.c: Likewise. * gcc.target/aarch64/sve/cond_fmul_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmul_4.c: Likewise. * gcc.target/aarch64/sve/cond_fmul_4_run.c: Likewise. 
Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org> From-SVN: r274508
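An illustrative loop (a sketch, in the spirit of the new cond_fadd tests): the conditional addition can now use the immediate form of FADD under a predicate, e.g. "fadd z0.s, p0/m, z0.s, #1.0", rather than first materialising 1.0 in a vector register:

    void cond_add_one (float *x, const int *pred, int n)
    {
      for (int i = 0; i < n; i++)
        if (pred[i])
          x[i] += 1.0f;
    }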
2019-08-14 | [AArch64] Use SVE UXT[BHW] as a form of predicated AND | Richard Sandiford | 1 | -0/+19
UXTB, UXTH and UXTW are equivalent to predicated ANDs with the constants 0xff, 0xffff and 0xffffffff respectively. This patch uses them in the patterns for IFN_COND_AND. 2019-08-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64.c (aarch64_print_operand): Allow %e to take the equivalent mask, as well as a bit count. * config/aarch64/predicates.md (aarch64_sve_uxtb_immediate) (aarch64_sve_uxth_immediate, aarch64_sve_uxt_immediate) (aarch64_sve_pred_and_operand): New predicates. * config/aarch64/iterators.md (sve_pred_int_rhs2_operand): New code attribute. * config/aarch64/aarch64-sve.md (cond_<SVE_INT_BINARY:optab><SVE_I:mode>): Use it. (*cond_uxt<mode>_2, *cond_uxt<mode>_any): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/cond_uxt_1.c: New test. * gcc.target/aarch64/sve/cond_uxt_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_2.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_3.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_4.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_4_run.c: Likewise. From-SVN: r274479
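An illustrative loop (a sketch, in the spirit of the new cond_uxt tests): the conditional AND below is exactly a predicated zero-extension of each low byte, so it can be emitted as UXTB rather than an AND with a constant vector:

    void cond_mask_byte (unsigned int *x, const int *pred, int n)
    {
      for (int i = 0; i < n; i++)
        if (pred[i])
          x[i] &= 0xff;   /* uxtb z0.s, p0/m, z0.s */
    }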
2019-08-14 | [AArch64] Make more use of SVE conditional constant moves | Richard Sandiford | 1 | -1/+6
This patch extends the SVE UNSPEC_SEL patterns so that they can use:

(1) MOV /M of a duplicated integer constant
(2) MOV /M of a duplicated floating-point constant bitcast to an integer, accepting the same constants as (1)
(3) FMOV /M of a duplicated floating-point constant
(4) MOV /Z of a duplicated integer constant
(5) MOV /Z of a duplicated floating-point constant bitcast to an integer, accepting the same constants as (4)
(6) MOVPRFXed FMOV /M of a duplicated floating-point constant

We already handled (4) with a special pattern; the rest are new.

2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
            Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>

gcc/
        * config/aarch64/aarch64.c (aarch64_bit_representation): New function.
        (aarch64_print_vector_float_operand): Also handle 8-bit floats.
        (aarch64_print_operand): Add support for %I.
        (aarch64_sve_dup_immediate_p): Handle scalars as well as vectors.  Bitcast floating-point constants to the corresponding integer constant.
        (aarch64_float_const_representable_p): Handle vectors as well as scalars.
        (aarch64_expand_sve_vcond): Make sure that the operands are valid for the new vcond_mask_<mode><vpred> expander.
        * config/aarch64/predicates.md (aarch64_sve_dup_immediate): Also test aarch64_float_const_representable_p.
        (aarch64_sve_reg_or_dup_imm): New predicate.
        * config/aarch64/aarch64-sve.md (vec_extract<vpred><Vel>): Use gen_vcond_mask_<mode><vpred> instead of gen_aarch64_sve_dup<mode>_const.
        (vcond_mask_<mode><vpred>): Turn into a define_expand that accepts aarch64_sve_reg_or_dup_imm and aarch64_simd_reg_or_zero for operands 1 and 2 respectively.  Force operand 2 into a register if operand 1 is a register.  Fold old define_insn...
        (aarch64_sve_dup<mode>_const): ...and this define_insn...
        (*vcond_mask_<mode><vpred>): ...into this new pattern.  Handle floating-point constants that can be moved as integers.  Add alternatives for MOV /M and FMOV /M.
        (vcond<mode><v_int_equiv>, vcondu<mode><v_int_equiv>)
        (vcond<mode><v_fp_equiv>): Accept nonmemory_operand for operands 1 and 2 respectively.
        * config/aarch64/constraints.md (Ufc): Handle vectors as well as scalars.
        (vss): New constraint.

gcc/testsuite/
        * gcc.target/aarch64/sve/vcond_18.c: New test.
        * gcc.target/aarch64/sve/vcond_18_run.c: Likewise.
        * gcc.target/aarch64/sve/vcond_19.c: Likewise.
        * gcc.target/aarch64/sve/vcond_19_run.c: Likewise.
        * gcc.target/aarch64/sve/vcond_20.c: Likewise.
        * gcc.target/aarch64/sve/vcond_20_run.c: Likewise.

Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>

From-SVN: r274441
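An illustrative source (a sketch, in the spirit of the new vcond tests) for the FMOV /M case above: selecting a duplicated floating-point constant under a vector condition no longer needs the constant in a separate register:

    void select_const (float *x, int n)
    {
      for (int i = 0; i < n; i++)
        x[i] = x[i] < 0.0f ? 2.0f : x[i];   /* fmov z0.s, p0/m, #2.0 */
    }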
2019-08-14[AArch64] Add support for SVE F{MAX,MIN}NM immediateRichard Sandiford1-0/+9
This patch uses the immediate forms of FMAXNM and FMINNM for unconditional arithmetic. The same rules apply to FMAX and FMIN, but we only generate those via the ACLE. 2019-08-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/predicates.md (aarch64_sve_float_maxmin_immediate) (aarch64_sve_float_maxmin_operand): New predicates. * config/aarch64/constraints.md (vsB): New constraint. (vsM): Fix typo. * config/aarch64/iterators.md (sve_pred_fp_rhs2_operand): Use aarch64_sve_float_maxmin_operand for UNSPEC_COND_FMAXNM and UNSPEC_COND_FMINNM. * config/aarch64/aarch64-sve.md (<maxmin_uns><SVE_F:mode>3): Use aarch64_sve_float_maxmin_operand for operand 2. (*<SVE_COND_FP_MAXMIN_PUBLIC:optab><SVE_F:mode>3): Likewise. Add alternatives for the constant forms. gcc/testsuite/ * gcc.target/aarch64/sve/fmaxnm_1.c: New test. * gcc.target/aarch64/sve/fminnm_1.c: Likewise. From-SVN: r274440
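As a sketch (not from the commit; names invented), clamping against 0.0 is the kind of loop that could use the new immediate alternative:
```
/* __builtin_fmax ignores a quiet NaN in one operand, which matches
   FMAXNM's NaN handling, and 0.0 is one of the immediates the
   FMAXNM immediate form accepts.  */
void
clamp_to_zero (double *x, int n)
{
  for (int i = 0; i < n; ++i)
    x[i] = __builtin_fmax (x[i], 0.0);
}
```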
2019-08-14[AArch64] Add support for SVE [SU]{MAX,MIN} immediateRichard Sandiford1-3/+15
This patch adds support for the immediate forms of SVE SMAX, SMIN, UMAX and UMIN. SMAX and SMIN take the same range as MUL, so the patch basically just moves and generalises the existing MUL patterns. 2019-08-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/constraints.md (vsb): New constraint. (vsm): Generalize description. * config/aarch64/iterators.md (SVE_INT_BINARY_IMM): New code iterator. (sve_imm_con): Handle smax, smin, umax and umin. (sve_imm_prefix): New code attribute. * config/aarch64/predicates.md (aarch64_sve_vsb_immediate) (aarch64_sve_vsb_operand): New predicates. (aarch64_sve_mul_immediate): Rename to... (aarch64_sve_vsm_immediate): ...this. (aarch64_sve_mul_operand): Rename to... (aarch64_sve_vsm_operand): ...this. * config/aarch64/aarch64-sve.md (mul<mode>3): Generalize to... (<SVE_INT_BINARY_IMM:optab><SVE_I:mode>3): ...this. (*mul<mode>3, *post_ra_mul<mode>3): Generalize to... (*<SVE_INT_BINARY_IMM:optab><SVE_I:mode>3) (*post_ra_<SVE_INT_BINARY_IMM:optab><SVE_I:mode>3): ...these and add movprfx support for the immediate alternatives. (<su><maxmin><mode>3, *<su><maxmin><mode>3): Delete in favor of the above. (*<SVE_INT_BINARY_SD:optab><SVE_SDI:mode>3): Fix incorrect predicate for operand 3. gcc/testsuite/ * gcc.target/aarch64/sve/smax_1.c: New test. * gcc.target/aarch64/sve/smin_1.c: Likewise. * gcc.target/aarch64/sve/umax_1.c: Likewise. * gcc.target/aarch64/sve/umin_1.c: Likewise. From-SVN: r274439
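For illustration (names invented), a max against a small constant maps onto the new SMAX immediate form, since 100 fits the same range as the MUL immediate:
```
#include <stdint.h>

/* A sketch: x[i] > 100 ? x[i] : 100 is smax (x[i], 100), which the
   new patterns can emit with an immediate operand.  */
void
clamp_low (int32_t *x, int n)
{
  for (int i = 0; i < n; ++i)
    x[i] = x[i] > 100 ? x[i] : 100;
}
```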
2019-08-14[AArch64] Add support for SVE CNOTRichard Sandiford1-0/+4
This patch adds support for predicated and unpredicated CNOT (logical NOT on integers). In RTL terms, this is a select between 1 and 0 in which the predicate is fed by a comparison with zero. 2019-08-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/predicates.md (aarch64_simd_imm_one): New predicate. * config/aarch64/aarch64-sve.md (*cnot<mode>): New pattern. (*cond_cnot<mode>_2, *cond_cnot<mode>_any): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/cnot_1.c: New test. * gcc.target/aarch64/sve/cond_cnot_1.c: Likewise. * gcc.target/aarch64/sve/cond_cnot_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_cnot_2.c: Likewise. * gcc.target/aarch64/sve/cond_cnot_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_cnot_3.c: Likewise. * gcc.target/aarch64/sve/cond_cnot_3_run.c: Likewise. From-SVN: r274438
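As an illustration (not part of the commit message; names invented):
```
#include <stdint.h>

/* A sketch: r[i] = !x[i] is exactly a select of 1 where x[i] == 0
   and 0 elsewhere, which is what CNOT computes.  */
void
logical_not (int32_t *r, const int32_t *x, int n)
{
  for (int i = 0; i < n; ++i)
    r[i] = !x[i];
}
```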
2019-08-14[AArch64] Use SVE ADR to optimise shift-add sequencesRichard Sandiford1-0/+12
This patch uses SVE ADR to optimise shift-and-add and uxtw-and-add sequences. 2019-08-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/predicates.md (const_1_to_3_operand): New predicate. * config/aarch64/aarch64-sve.md (*aarch64_adr_uxtw) (*aarch64_adr<mode>_shift, *aarch64_adr_shift_uxtw): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/adr_1.c: New test. * gcc.target/aarch64/sve/adr_1_run.c: Likewise. * gcc.target/aarch64/sve/adr_2.c: Likewise. * gcc.target/aarch64/sve/adr_2_run.c: Likewise. * gcc.target/aarch64/sve/adr_3.c: Likewise. * gcc.target/aarch64/sve/adr_3_run.c: Likewise. * gcc.target/aarch64/sve/adr_4.c: Likewise. * gcc.target/aarch64/sve/adr_4_run.c: Likewise. * gcc.target/aarch64/sve/adr_5.c: Likewise. * gcc.target/aarch64/sve/adr_5_run.c: Likewise. From-SVN: r274436
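For illustration (names invented), an add of a value shifted left by 1, 2 or 3 is the shape SVE ADR can compute in one instruction:
```
#include <stdint.h>

/* A sketch: a[i] + (b[i] << 2) matches the new *aarch64_adr<mode>_shift
   pattern, assuming the loop is vectorized for SVE.  */
void
shift_add (uint64_t *r, const uint64_t *a, const uint64_t *b, int n)
{
  for (int i = 0; i < n; ++i)
    r[i] = a[i] + (b[i] << 2);
}
```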
2019-08-14[AArch64] Add a "GP strictness" operand to SVE FP unspecsRichard Sandiford1-0/+5
This patch makes the SVE unary, binary and ternary FP unspecs take a new "GP strictness" operand that indicates whether the predicate has to be taken literally, or whether it is valid to make extra lanes active (up to and including using a PTRUE). This again is laying the groundwork for the ACLE patterns, in which the value can depend on the FP command-line flags. At the moment it's only needed for addition, subtraction and multiplication, which have unpredicated forms that can only be used when operating on all lanes is safe. But in future it might be useful for optimising predicate usage. The strict mode requires extra alternatives for addition, subtraction and multiplication, but I've left those for the main ACLE patch. 2019-08-14 Richard Sandiford <richard.sandiford@arm.com> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> gcc/ * config/aarch64/aarch64.md (SVE_RELAXED_GP, SVE_STRICT_GP): New constants. * config/aarch64/predicates.md (aarch64_sve_gp_strictness): New predicate. * config/aarch64/aarch64-protos.h (aarch64_sve_pred_dominates_p): Declare. * config/aarch64/aarch64.c (aarch64_sve_pred_dominates_p): New function. * config/aarch64/aarch64-sve.md: Add a block comment about the handling of predicated FP operations. (<SVE_COND_FP_UNARY:optab><SVE_F:mode>2, add<SVE_F:mode>3) (sub<SVE_F:mode>3, mul<SVE_F:mode>3, div<SVE_F:mode>3) (<SVE_COND_FP_MAXMIN_PUBLIC:optab><SVE_F:mode>3) (<SVE_COND_FP_MAXMIN_PUBLIC:maxmin_uns><SVE_F:mode>3) (<SVE_COND_FP_TERNARY:optab><SVE_F:mode>4): Add an SVE_RELAXED_GP operand. (cond_<SVE_COND_FP_BINARY:optab><SVE_F:mode>) (cond_<SVE_COND_FP_TERNARY:optab><SVE_F:mode>): Add an SVE_STRICT_GP operand. (*<SVE_COND_FP_UNARY:optab><SVE_F:mode>2) (*cond_<SVE_COND_FP_BINARY:optab><SVE_F:mode>_2) (*cond_<SVE_COND_FP_BINARY:optab><SVE_F:mode>_3) (*cond_<SVE_COND_FP_BINARY:optab><SVE_F:mode>_any) (*fabd<SVE_F:mode>3, *div<SVE_F:mode>3) (*<SVE_COND_FP_MAXMIN_PUBLIC:optab><SVE_F:mode>3) (*<SVE_COND_FP_TERNARY:optab><SVE_F:mode>4) (*cond_<SVE_COND_FP_TERNARY:optab><SVE_F:mode>_2) (*cond_<SVE_COND_FP_TERNARY:optab><SVE_F:mode>_4) (*cond_<SVE_COND_FP_TERNARY:optab><SVE_F:mode>_any): Match the strictness operands. Use aarch64_sve_pred_dominates_p to check whether the predicate on the conditional operation is suitable for merging. Split patterns into the canonical equal-predicate form. (*add<SVE_F:mode>3, *sub<SVE_F:mode>3, *mul<SVE_F:mode>3): Likewise. Restrict the unpredicated alternatives to SVE_RELAXED_GP. Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org> From-SVN: r274418
2019-08-14[AArch64] Rework SVE PTEST patternsRichard Sandiford1-0/+5
This patch reworks the rtl representation of the SVE PTEST operation so that:

- the governing predicate is always VNx16BI (and so all bits are defined)
- it is still possible to pattern-match the governing predicate in the mode that it had previously
- a new hint operand says whether the governing predicate is known to be all true for the element size of interest, rather than this being part of the unspec name.

These changes make it easier to handle more flag-setting instructions as part of the ACLE work. See the comment in aarch64-sve.md for more details. 2019-08-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-protos.h (aarch64_ptrue_all): Declare. * config/aarch64/aarch64.c (aarch64_ptrue_all): New function. * config/aarch64/aarch64.md (UNSPEC_PTEST_PTRUE): Delete. (UNSPEC_PTEST): New unspec. (SVE_MAYBE_NOT_PTRUE, SVE_KNOWN_PTRUE): New constants. * config/aarch64/iterators.md (data_bytes): New mode attribute. * config/aarch64/predicates.md (aarch64_sve_ptrue_flag): New predicate. * config/aarch64/aarch64-sve.md: Add a new section describing the handling of UNSPEC_PTEST. (pred_<LOGICAL:optab><PRED_ALL:mode>3): Rename to... (@aarch64_pred_<LOGICAL:optab><PRED_ALL:mode>_z): ...this. (ptest_ptrue<mode>): Replace with... (aarch64_ptest<mode>): ...this new pattern. (cbranch<mode>4): Update after above changes. (*<LOGICAL:optab><PRED_ALL:mode>3_cc): Use UNSPEC_PTEST instead of UNSPEC_PTEST_PTRUE. (*cmp<SVE_INT_CMP:cmp_op><SVE_I:mode>_cc): Likewise. (*cmp<SVE_INT_CMP:cmp_op><SVE_I:mode>_ptest): Likewise. (*while_ult<GPI:mode><PRED_ALL:mode>_cc): Likewise. From-SVN: r274414
2019-08-13[AArch64] Improve SVE constant movesRichard Sandiford1-0/+10
If there's no SVE instruction to load a given constant directly, this patch instead tries to use an Advanced SIMD constant move and then duplicates the constant to fill an SVE vector. The main use of this is to support constants in which each byte is in { 0, 0xff }. Also, the patch prefers a simple integer move followed by a duplicate over a load from memory, like we already do for Advanced SIMD. This is a useful option to have and would be easy to turn off via a tuning parameter if necessary. The patch also extends the handling of wide LD1Rs to big endian, whereas previously we punted to a full LD1RQ. 2019-08-13 Richard Sandiford <richard.sandiford@arm.com> gcc/ * machmode.h (opt_mode::else_mode): New function. (opt_mode::else_blk): Use it. * config/aarch64/aarch64-protos.h (aarch64_vq_mode): Declare. (aarch64_full_sve_mode, aarch64_sve_ld1rq_operand_p): Likewise. (aarch64_gen_stepped_int_parallel): Likewise. (aarch64_stepped_int_parallel_p): Likewise. (aarch64_expand_mov_immediate): Remove the optional gen_vec_duplicate argument. * config/aarch64/aarch64.c (aarch64_expand_sve_widened_duplicate): Delete. (aarch64_expand_sve_dupq, aarch64_expand_sve_ld1rq): New functions. (aarch64_expand_sve_const_vector): Rewrite to handle more cases. (aarch64_expand_mov_immediate): Remove the optional gen_vec_duplicate argument. Use early returns in the !CONST_INT_P handling. Pass all SVE data vectors to aarch64_expand_sve_const_vector rather than handling some inline. (aarch64_full_sve_mode, aarch64_vq_mode): New functions, split out from... (aarch64_simd_container_mode): ...here. (aarch64_gen_stepped_int_parallel, aarch64_stepped_int_parallel_p) (aarch64_sve_ld1rq_operand_p): New functions. * config/aarch64/predicates.md (descending_int_parallel) (aarch64_sve_ld1rq_operand): New predicates. * config/aarch64/constraints.md (UtQ): New constraint. * config/aarch64/aarch64.md (UNSPEC_REINTERPRET): New unspec. * config/aarch64/aarch64-sve.md (mov<SVE_ALL:mode>): Remove the gen_vec_duplicate from call to aarch64_expand_mov_immediate. (@aarch64_sve_reinterpret<mode>): New expander. (*aarch64_sve_reinterpret<mode>): New pattern. (@aarch64_vec_duplicate_vq<mode>_le): New pattern. (@aarch64_vec_duplicate_vq<mode>_be): Likewise. (*sve_ld1rq<Vesize>): Replace with... (@aarch64_sve_ld1rq<mode>): ...this new pattern. gcc/testsuite/ * gcc.target/aarch64/sve/init_2.c: Expect ld1rd to be used instead of a full vector load. * gcc.target/aarch64/sve/init_4.c: Likewise. * gcc.target/aarch64/sve/ld1r_2.c: Remove constants that no longer need to be loaded from memory. * gcc.target/aarch64/sve/slp_2.c: Expect the same output for big and little endian. * gcc.target/aarch64/sve/slp_3.c: Likewise. Expect 3 of the doubles to be moved via integer registers rather than loaded from memory. * gcc.target/aarch64/sve/slp_4.c: Likewise but for 4 doubles. * gcc.target/aarch64/sve/spill_4.c: Expect 16-bit constants to be loaded via an integer register rather than from memory. * gcc.target/aarch64/sve/const_1.c: New test. * gcc.target/aarch64/sve/const_2.c: Likewise. * gcc.target/aarch64/sve/const_3.c: Likewise. From-SVN: r274375
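As a sketch (not from the commit; names invented), a constant whose bytes are all 0 or 0xff shows the main new case:
```
#include <stdint.h>

/* Every byte of 0x00ff00ff is 0 or 0xff, so it can be built as an
   Advanced SIMD constant (or a simple integer move) and duplicated
   across the SVE vector instead of being loaded from memory.  */
void
splat_mask (uint32_t *x, int n)
{
  for (int i = 0; i < n; ++i)
    x[i] = 0x00ff00ff;
}
```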
2019-08-13[AArch64] Add a "y" constraint for V0-V7Richard Sandiford1-2/+1
Some indexed SVE FCMLA operations have a 3-bit register field that requires one of Z0-Z7. This patch adds a public "y" constraint for that. The patch also documents "x", which is again intended to be a public constraint. 2019-08-13 Richard Sandiford <richard.sandiford@arm.com> gcc/ * doc/md.texi: Document the x and y constraints for AArch64. * config/aarch64/aarch64.h (FP_LO8_REGNUM_P): New macro. (FP_LO8_REGS): New reg_class. (REG_CLASS_NAMES, REG_CLASS_CONTENTS): Add an entry for FP_LO8_REGS. * config/aarch64/aarch64.c (aarch64_hard_regno_nregs) (aarch64_regno_regclass, aarch64_class_max_nregs): Handle FP_LO8_REGS. * config/aarch64/predicates.md (aarch64_simd_register): Use FP_REGNUM_P instead of checking the classes manually. * config/aarch64/constraints.md (y): New constraint. gcc/testsuite/ * gcc.target/aarch64/asm-x-constraint-1.c: New test. * gcc.target/aarch64/asm-y-constraint-1.c: Likewise. From-SVN: r274367
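For illustration (not part of the commit; the asm body is a placeholder), the new public constraint can be used from inline asm:
```
#include <arm_neon.h>

/* A sketch: "y" constrains the operand to V0-V7, the range an
   indexed FCMLA can address; "x" (V0-V15) and "w" (V0-V31) are the
   looser alternatives.  */
float32x4_t
force_low_vreg (float32x4_t v)
{
  __asm__ ("" : "+y" (v));
  return v;
}
```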
2019-08-07[AArch64] Fix INSR for zero floatsRichard Sandiford1-2/+2
We used INSR to handle zero integers but not zero floats. 2019-08-07 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/constraints.md (Z): Handle floating-point zeros too. * config/aarch64/predicates.md (aarch64_reg_or_zero): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/init_13.c: New test. From-SVN: r274193
2019-05-12Accept code attributes as rtx codes in .md filesRichard Sandiford1-6/+0
The recent AArch64 absolute difference patterns had to go through some hoops to pair max/min rtx codes with the same signedness. I also need to pair signed/unsigned codes with sign/zero extension for some SVE ACLE patterns. This patch therefore supports <...> as rtx codes, like we already do for modes. 2019-05-12 Richard Sandiford <richard.sandiford@arm.com> gcc/ * doc/md.texi: Document use of code attributes in rtx patterns. * read-md.h (rtx_reader::rtx_alloc_for_name): New member function. * read-rtl.c (find_code): Split out search loops into... (maybe_find_code): ...this new function. (check_code_iterator): Make the error message more informative. (check_code_attribute): New function. (rtx_reader::rtx_alloc_for_name): Likewise. (rtx_reader::read_rtx_code): Use rtx_alloc_for_name. * config/aarch64/predicates.md (aarch64_smin, aarch64_umin): Delete. * config/aarch64/aarch64-simd.md (*aarch64_<su>abd<mode>_3): Use <max_opp> directly as an rtx code instead of via a match_operator. * config/aarch64/aarch64-sve.md (aarch64_<su>abd<mode>_3): Likewise. (<su>abd<mode>_3): Update accordingly. From-SVN: r271107
2019-02-22Handle stack pointer with SUBS/ADDS instructions.Matthew Malcomson1-0/+4
In general the stack pointer was not handled for many SUBS/ADDS patterns in aarch64.md. Both the "extended register" and "immediate" forms allow the stack pointer to be used as the source register, while no form allows the stack pointer for the destination register. The define_insn patterns generating ADDS/SUBS did not allow the stack pointer for any operand, while the define_peephole2 patterns that generated RTX to be matched by these patterns allowed the stack pointer for any operand. The patterns are fixed by adding the 'k' constraint for the first source operand to all define_insns that generate the ADDS/SUBS "extended register" and "immediate" forms (but not the "shifted register" form). In peephole optimizations, constraint strings are ignored (see "(gccint) C Constraint Interface" info node in the documentation), so the decision to act or not is based solely on the predicate and condition. This patch introduces a new predicate "aarch64_general_reg" to be used in define_peephole2 patterns where only GENERAL_REGS registers are acceptable and uses that predicate in the peepholes that generate patterns for ADDS/SUBS. Full bootstrap and regtest done on aarch64-none-linux-gnu. Regression tests done on aarch64-none-linux-gnu and aarch64-none-elf cross compiler. OK for trunk? gcc/ChangeLog: 2019-02-22 Matthew Malcomson <matthew.malcomson@arm.com> PR target/89324 * config/aarch64/aarch64.md: Use aarch64_general_reg predicate on destination register in peepholes generating patterns for ADDS/SUBS. (add<mode>3_compare0, *addsi3_compare0_uxtw, add<mode>3_compareC, add<mode>3_compareV_imm, add<mode>3_compareV, *adds_<optab><ALLX:mode>_<GPI:mode>, *subs_<optab><ALLX:mode>_<GPI:mode>, *adds_<optab><ALLX:mode>_shift_<GPI:mode>, *subs_<optab><ALLX:mode>_shift_<GPI:mode>, *adds_<optab><mode>_multp2, *subs_<optab><mode>_multp2, *sub<mode>3_compare0, *subsi3_compare0_uxtw, sub<mode>3_compare1): Allow stack pointer for source register. * config/aarch64/predicates.md (aarch64_general_reg): New predicate. gcc/testsuite/ChangeLog: 2019-02-22 Matthew Malcomson <matthew.malcomson@arm.com> PR target/89324 * gcc.dg/rtl/aarch64/subs_adds_sp.c: New test. * gfortran.fortran-torture/compile/pr89324.f90: New test. From-SVN: r269122
2019-02-07[AArch64] Change representation of SABD in RTLKyrylo Tkachov1-0/+6
Richard raised a concern about the RTL we use to represent the AdvSIMD SABD (vector signed absolute difference) instruction. We currently represent it as ABS (MINUS op1 op2). This isn't exactly what SABD does. ABS treats its input as a signed value and returns the absolute of that. For example: (sabd:QI 64 -128) == 192 (unsigned) aka -64 (signed) whereas (minus:QI 64 -128) == 192 (unsigned) aka -64 (signed), (abs ...) of that is 64. A better way to describe the instruction is with MINUS (SMAX (op1 op2) SMIN (op1 op2)). This patch implements that, and also implements similar semantics for the UABD instruction that uses UMAX and UMIN. That way for the example above we'll have: (minus:QI (smax:QI (64 -128)) (smin:QI (64 -128))) == (minus:QI 64 -128) == 192 (or -64 signed) which matches what SABD does. * config/aarch64/iterators.md (max_opp): New code_attr. (USMAX): New code iterator. * config/aarch64/predicates.md (aarch64_smin): New predicate. (aarch64_smax): Likewise. * config/aarch64/aarch64-simd.md (abd<mode>_3): Rename to... (*aarch64_<su>abd<mode>_3): ... Change RTL representation to MINUS (MAX MIN). * gcc.target/aarch64/abd_1.c: New test. * gcc.dg/sabd_1.c: Likewise. From-SVN: r268658
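As an illustration (names invented), a signed absolute-difference loop of this shape is what the reworked pattern matches:
```
#include <stdint.h>

/* A sketch: the new MINUS (SMAX (a, b), SMIN (a, b)) form describes
   the SABD result bit-for-bit, including the wrap-around case in the
   example above.  */
void
abs_diff (int8_t *r, const int8_t *a, const int8_t *b, int n)
{
  for (int i = 0; i < n; ++i)
    r[i] = a[i] > b[i] ? a[i] - b[i] : b[i] - a[i];
}
```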
2019-01-16__builtin_<add/sub>_overflow issues on AArch64 (redux)Richard Earnshaw1-6/+20
Further investigation showed that my previous patch for this issue was still incomplete. The problem stemmed from what I suspect was a misunderstanding of the way overflow is calculated on aarch64 when values are subtracted (and hence in comparisons). In this case, unlike addition, the carry flag is /cleared/ if there is overflow (technically, underflow) and set when that does not happen. This patch clears up the issue by using CCmode for all subtractive operations (this can fully describe the normal overflow conditions without anything particularly fancy); clears up the way we express normal unsigned overflow using CC_Cmode (the result of a sum is less than one of the operands); and adds a new mode, CC_ADCmode, to handle expressing overflow of an add-with-carry operation, where the standard idiom is no longer sufficient to describe the overflow condition. PR target/86891 * config/aarch64/aarch64-modes.def: Add comment about how the carry bit is set by add and compare. (CC_ADC): New CC_MODE. * config/aarch64/aarch64.c (aarch64_select_cc_mode): Use variables to cache the code and mode of X. Adjust the shape of a CC_Cmode comparison. Add detection for CC_ADCmode. (aarch64_get_condition_code_1): Update code support for CC_Cmode. Add CC_ADCmode. * config/aarch64/aarch64.md (uaddv<mode>4): Use LTU with CCmode. (uaddvti4): Comparison result is in CC_ADCmode and the condition is GEU. (add<mode>3_compareC_cconly_imm): Delete. Merge into... (add<mode>3_compareC_cconly): ... this. Restructure the comparison to eliminate the need for zero-extending the operands. (add<mode>3_compareC_imm): Delete. Merge into ... (add<mode>3_compareC): ... this. Restructure the comparison to eliminate the need for zero-extending the operands. (add<mode>3_carryin): Use LTU for the overflow detection. (add<mode>3_carryinC): Use CC_ADCmode for the result of the carry out. Reexpress comparison for overflow. (add<mode>3_carryinC_zero): Update for change to add<mode>3_carryinC. (add<mode>3_carryinC): Likewise. (add<mode>3_carryinV): Use LTU for carry between partials. * config/aarch64/predicates.md (aarch64_carry_operation): Update handling of CC_Cmode and add CC_ADCmode. (aarch64_borrow_operation): Likewise. From-SVN: r267971
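As an illustration (not part of the commit message), the overflow builtins are the user-visible way to exercise these CC modes:
```
#include <stdbool.h>

/* A sketch: __builtin_sub_overflow exercises the subtractive case.
   On AArch64 the borrow is the inverse of the carry flag, which is
   why subtraction uses plain CCmode rather than CC_Cmode here.  */
bool
checked_sub (long a, long b, long *res)
{
  return __builtin_sub_overflow (a, b, res);
}
```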
2019-01-01Update copyright years.Jakub Jelinek1-1/+1
From-SVN: r267494
2018-10-31aarch64: Improve cas generationRichard Henderson1-0/+12
Do not zero-extend the input to the cas for subword operations; instead, use the appropriate zero-extending compare insns. Correct the predicates and constraints for the immediate expected operand. * config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): New. (aarch64_split_compare_and_swap): Use it. (aarch64_expand_compare_and_swap): Likewise. Remove convert_modes; test oldval against the proper predicate. * config/aarch64/atomics.md (@atomic_compare_and_swap<ALLI>): Use nonmemory_operand for expected. (cas_short_expected_pred): New. (@aarch64_compare_and_swap<SHORT>): Use it; use "rn" not "rI" to match. (@aarch64_compare_and_swap<GPI>): Use "rn" not "rI" for expected. * config/aarch64/predicates.md (aarch64_plushi_immediate): New. (aarch64_plushi_operand): New. From-SVN: r265657
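For illustration (names invented), a 16-bit compare-and-swap is the subword case this improves:
```
#include <stdint.h>

/* A sketch: with this change the comparison against the expected
   value uses a zero-extending compare instead of widening the input
   beforehand.  */
int
cas_u16 (uint16_t *p, uint16_t expected, uint16_t desired)
{
  return __atomic_compare_exchange_n (p, &expected, desired, 0,
                                      __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```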
2018-09-19[AARCH64] Use STLUR for atomic_storeMatthew Malcomson1-0/+30
Use the STLUR instruction introduced in Armv8.4-a. This instruction has the same store-release semantics as STLR but can take a 9-bit unscaled signed immediate offset. Example test case:
```
void
foo ()
{
  int32_t *atomic_vals = calloc (4, sizeof (int32_t));
  atomic_store_explicit (atomic_vals + 1, 2, memory_order_release);
}
```
Before the patch this generates:
```
foo:
	stp	x29, x30, [sp, -16]!
	mov	x1, 4
	mov	x0, x1
	mov	x29, sp
	bl	calloc
	mov	w1, 2
	add	x0, x0, 4
	stlr	w1, [x0]
	ldp	x29, x30, [sp], 16
	ret
```
After the patch it generates:
```
foo:
	stp	x29, x30, [sp, -16]!
	mov	x1, 4
	mov	x0, x1
	mov	x29, sp
	bl	calloc
	mov	w1, 2
	stlur	w1, [x0, 4]
	ldp	x29, x30, [sp], 16
	ret
```
We introduce a new feature flag, AARCH64_ISA_RCPC8_4, to indicate the presence of this instruction; it is included when targeting the Armv8.4 architecture. We also add a new "rcpc8_4" value for the "arch" attribute, checked against this feature flag. gcc/ 2018-09-19 Matthew Malcomson <matthew.malcomson@arm.com> * config/aarch64/aarch64-protos.h (aarch64_offset_9bit_signed_unscaled_p): New declaration. * config/aarch64/aarch64.md (arches): New "rcpc8_4" attribute value. (arch_enabled): Add check for "rcpc8_4" attribute value of "arch". * config/aarch64/aarch64.h (AARCH64_FL_RCPC8_4): New bitfield. (AARCH64_FL_FOR_ARCH8_4): Include AARCH64_FL_RCPC8_4. (AARCH64_FL_PROFILE): Move index so flags are ordered. (AARCH64_ISA_RCPC8_4): New flag. * config/aarch64/aarch64.c (offset_9bit_signed_unscaled_p): Renamed to aarch64_offset_9bit_signed_unscaled_p. * config/aarch64/atomics.md (atomic_store<mode>): Allow offset and use stlur. * config/aarch64/constraints.md (Ust): New constraint. * config/aarch64/predicates.md (aarch64_9bit_offset_memory_operand): New predicate. (aarch64_rcpc_memory_operand): New predicate. gcc/testsuite/ 2018-09-19 Matthew Malcomson <matthew.malcomson@arm.com> * gcc.target/aarch64/atomic-store.c: New. From-SVN: r264421
2018-07-19[AArch64][PATCH 2/2] PR target/83009: Relax strict address checking for store pair lanesAndre Vieira1-1/+1
gcc/ChangeLog 2018-07-19 Andre Vieira <andre.simoesdiasvieira@arm.com> PR target/83009 * config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): Make address check not strict. gcc/testsuite/ChangeLog 2018-07-19 Andre Vieira <andre.simoesdiasvieira@arm.com> PR target/83009 * gcc/target/aarch64/store_v2vec_lanes.c: Add extra tests. From-SVN: r262881
2018-07-19[AArch64][PATCH 1/2] Fix addressing printing of LDP/STPAndre Vieira1-2/+3
gcc/ChangeLog 2018-07-19 Andre Vieira <andre.simoesdiasvieira@arm.com> * config/aarch64/aarch64-simd.md (aarch64_simd_mov<VQ:mode>): Replace Umq with Umn. (store_pair_lanes<mode>): Likewise. * config/aarch64/aarch64-protos.h (aarch64_addr_query_type): Add new enum value 'ADDR_QUERY_LDP_STP_N'. * config/aarch64/aarch64.c (aarch64_addr_query_type): Likewise. (aarch64_print_address_internal): Add declaration. (aarch64_print_ldpstp_address): Remove. (aarch64_classify_address): Adapt mode for 'ADDR_QUERY_LDP_STP_N'. (aarch64_print_operand): Change printing of 'y'. * config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): Use new enum value 'ADDR_QUERY_LDP_STP_N', don't hardcode mode and use 'true' rather than '1'. * gcc/config/aarch64/constraints.md (Uml): Likewise. (Uml): Rename to Umn. (Umq): Remove. From-SVN: r262880
2018-07-02aarch64: Add movprfx patterns alternativesRichard Henderson1-0/+3
* config/aarch64/aarch64-protos.h, config/aarch64/aarch64.c (aarch64_sve_prepare_conditional_op): Remove. * config/aarch64/aarch64-sve.md (cond_<SVE_INT_BINARY><SVE_I>): Allow aarch64_simd_reg_or_zero as select operand; remove the aarch64_sve_prepare_conditional_op call. (cond_<SVE_INT_BINARY_SD><SVE_SDI>): Likewise. (cond_<SVE_COND_FP_BINARY><SVE_F>): Likewise. (*cond_<SVE_INT_BINARY><SVE_I>_z): New pattern. (*cond_<SVE_INT_BINARY_SD><SVE_SDI>_z): New pattern. (*cond_<SVE_COND_FP_BINARY><SVE_F>_z): New pattern. (*cond_<SVE_INT_BINARY><SVE_I>_any): New pattern. (*cond_<SVE_INT_BINARY_SD><SVE_SDI>_any): New pattern. (*cond_<SVE_COND_FP_BINARY><SVE_F>_any): New pattern and a splitters to match all of the *_any patterns. * config/aarch64/predicates.md (aarch64_sve_any_binary_operator): New. * config/aarch64/iterators.md (SVE_INT_BINARY_REV): Remove. (SVE_COND_FP_BINARY_REV): Remove. (sve_int_op_rev, sve_fp_op_rev): New. * config/aarch64/aarch64-sve.md (*cond_<SVE_INT_BINARY><SVE_I>_0): New. (*cond_<SVE_INT_BINARY_SD><SVE_SDI>_0): New. (*cond_<SVE_COND_FP_BINARY><SVE_F>_0): New. (*cond_<SVE_INT_BINARY><SVE_I>_2): Rename, add movprfx alternative. (*cond_<SVE_INT_BINARY_SD><SVE_SDI>_2): Similarly. (*cond_<SVE_COND_FP_BINARY><SVE_F>_2): Similarly. (*cond_<SVE_INT_BINARY><SVE_I>_3): Similarly; use sve_int_op_rev. (*cond_<SVE_INT_BINARY_SD><SVE_SDI>_3): Similarly. (*cond_<SVE_COND_FP_BINARY><SVE_F>_3): Similarly; use sve_fp_op_rev. * config/aarch64/aarch64-sve.md (cond_<SVE_COND_FP_BINARY><SVE_F>): Remove match_dup 1 from the inner unspec. (*cond_<SVE_COND_FP_BINARY><SVE_F>): Likewise. * config/aarch64/aarch64.md (movprfx): New attr. (length): Default movprfx to 8. * config/aarch64/aarch64-sve.md (*mul<SVE_I>3): Add movprfx alt. (*madd<SVE_I>, *msub<SVE_I): Likewise. (*<su>mul<SVE_I>3_highpart): Likewise. (*<SVE_INT_BINARY_SD><SVE_SDI>3): Likewise. (*v<ASHIFT><SVE_I>3): Likewise. (*<su><MAXMIN><SVE_I>3): Likewise. (*<su><MAXMIN><SVE_F>3): Likewise. (*fma<SVE_F>4, *fnma<SVE_F>4): Likewise. (*fms<SVE_F>4, *fnms<SVE_F>4): Likewise. (*div<SVE_F>4): Likewise. From-SVN: r262312
2018-05-30Reverting r260635Andre Vieira1-1/+1
gcc 2018-05-30 Andre Vieira <andre.simoesdiasvieira@arm.com> 2018-05-24 Andre Vieira <andre.simoesdiasvieira@arm.com> PR target/83009 Revert: * config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): Make address check not strict. gcc/testsuite 2018-05-30 Andre Vieira <andre.simoesdiasvieira@arm.com> 2018-05-24 Andre Vieira <andre.simoesdiasvieira@arm.com> Revert PR target/83009 * gcc/target/aarch64/store_v2vec_lanes.c: Add extra tests. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@260635 138bc75d-0d04-0410-961f-82ee72b054a4 From-SVN: r260957
2018-05-24PR target/83009: Relax strict address checking for store pair lanesAndre Vieira1-1/+1
The operand constraint for the memory address of store/load pair lanes was enforcing strictly hardware registers be allowed as memory addresses. We want to relax that such that these patterns can be used by combine. During register allocation the register constraint will enforce the correct register is chosen. gcc 2018-05-24 Andre Vieira <andre.simoesdiasvieira@arm.com> PR target/83009 * config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): Make address check not strict. gcc/testsuite 2018-05-24 Andre Vieira <andre.simoesdiasvieira@arm.com> PR target/83009 * gcc/target/aarch64/store_v2vec_lanes.c: Add extra tests. From-SVN: r260635
2018-05-22[AArch64] Merge stores of D-register values with different modesJackson Woodruff1-0/+4
This patch merges loads and stores of D-register values with different modes. Given code like this:
```
typedef int __attribute__((vector_size(8))) vec;
struct pair { vec v; double d; };

void assign (struct pair *p, vec v)
{
  p->v = v;
  p->d = 1.0;
}
```
`assign` now generates a store pair instruction, whereas previously it generated two `str` instructions. The patch also merges stores of double zero values with long integer values:
```
struct pair { long long l; double d; };

void foo (struct pair *p)
{
  p->l = 10;
  p->d = 0.0;
}
```
This now generates a single store pair instruction rather than two `str` instructions. The patch basically generalises the mode iterators on the patterns in aarch64.md and the peepholes in aarch64-ldpstp.md to take all combinations of pairs of modes, so while it may be a large-ish patch, it does fairly mechanical stuff. 2018-05-22 Jackson Woodruff <jackson.woodruff@arm.com> Kyrylo Tkachov <kyrylo.tkachov@arm.com> * config/aarch64/aarch64.md: New patterns to generate stp and ldp. (store_pair_sw, store_pair_dw): New patterns to generate stp for single words and double words. (load_pair_sw, load_pair_dw): Likewise. (store_pair_sf, store_pair_df, store_pair_si, store_pair_di): Delete. (load_pair_sf, load_pair_df, load_pair_si, load_pair_di): Delete. * config/aarch64/aarch64-ldpstp.md: Modify peephole for different mode ldpstp and add peephole for merged zero stores. Likewise for loads. * config/aarch64/aarch64.c (aarch64_operands_ok_for_ldpstp): Add size check. (aarch64_gen_store_pair): Rename calls to match new patterns. (aarch64_gen_load_pair): Rename calls to match new patterns. * config/aarch64/aarch64-simd.md (load_pair<mode>): Rename to... (load_pair<DREG:mode><DREG2:mode>): ... This. (store_pair<mode>): Rename to... (vec_store_pair<DREG:mode><DREG2:mode>): ... This. * config/aarch64/iterators.md (DREG, DREG2, DX2, SX, SX2, DSX): New mode iterators. (V_INT_EQUIV): Handle SImode. * config/aarch64/predicates.md (aarch64_reg_zero_or_fp_zero): New predicate. * gcc.target/aarch64/ldp_stp_6.c: New. * gcc.target/aarch64/ldp_stp_7.c: New. * gcc.target/aarch64/ldp_stp_8.c: New. Co-Authored-By: Kyrylo Tkachov <kyrylo.tkachov@arm.com> From-SVN: r260538
2018-03-07re PR target/84565 (ICE in extract_insn, at recog.c:2304 on aarch64)Jakub Jelinek1-1/+1
PR fortran/84565 * config/aarch64/predicates.md (aarch64_simd_reg_or_zero): Use aarch64_simd_or_scalar_imm_zero rather than aarch64_simd_imm_zero. * gfortran.dg/pr84565.f90: New test. From-SVN: r258333
2018-02-01[AArch64] Handle SVE subregs that are effectively REVsRichard Sandiford1-0/+4
Subreg reads should be equivalent to storing the inner register to memory and loading the appropriate memory bytes back, with subreg writes doing the reverse. For the reasons explained in the comments, this isn't what happens for big-endian SVE if we simply reinterpret one vector register as having a different element size, so the conceptual store and load is needed in the general case. However, that obviously produces poor code if we do it too often. The patch therefore adds a pattern for handling the operation in registers. This copes with the important case of a VIEW_CONVERT created by tree-vect-slp.c:duplicate_and_interleave. It might make sense to tighten the predicates in aarch64-sve.md so that such subregs are not allowed as operands to most instructions, but that's future work. This fixes the sve/slp_*.c tests on aarch64_be. 2018-02-01 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * config/aarch64/aarch64-protos.h (aarch64_split_sve_subreg_move) (aarch64_maybe_expand_sve_subreg_move): Declare. * config/aarch64/aarch64.md (UNSPEC_REV_SUBREG): New unspec. * config/aarch64/predicates.md (aarch64_any_register_operand): New predicate. * config/aarch64/aarch64-sve.md (mov<mode>): Optimize subreg moves that are semantically a reverse operation. (*aarch64_sve_mov<mode>_subreg_be): New pattern. * config/aarch64/aarch64.c (aarch64_maybe_expand_sve_subreg_move): (aarch64_replace_reg_mode, aarch64_split_sve_subreg_move): New functions. (aarch64_can_change_mode_class): For big-endian, forbid changes between two SVE modes if they have different element sizes. Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com> From-SVN: r257289
2018-01-17[AArch64] PR82964: Fix 128-bit immediate ICEsWilco Dijkstra1-7/+6
This fixes PR82964, which reports ICEs for some CONST_WIDE_INT immediates. It turns out decimal floating-point CONST_DOUBLEs get changed into CONST_WIDE_INT without checking the constraint on the operand, which results in failures. Avoid this by only allowing SF/DF/TF mode floating-point constants in aarch64_legitimate_constant_p. A similar issue can occur with 128-bit immediates, which may be emitted even when disallowed in aarch64_legitimate_constant_p, and the constraints in movti_aarch64 don't match. Fix this with a new constraint and by allowing valid immediates in aarch64_legitimate_constant_p. Rather than allowing all 128-bit immediates and expanding into up to 8 MOV/MOVK instructions, limit them to 4 instructions and use a literal load for other cases. Improve a few TImode tests to use a literal and ensure they are skipped with -fpic. This fixes all reported failures. gcc/ PR target/82964 * config/aarch64/aarch64.md (movti_aarch64): Use Uti constraint. * config/aarch64/aarch64.c (aarch64_mov128_immediate): New function. (aarch64_legitimate_constant_p): Just support CONST_DOUBLE SF/DF/TF mode to avoid creating illegal CONST_WIDE_INT immediates. * config/aarch64/aarch64-protos.h (aarch64_mov128_immediate): Add declaration. * config/aarch64/constraints.md (Uti): New constraint. * config/aarch64/predicates.md (aarch64_movti_operand): Limit immediates. gcc/testsuite/ PR target/79041 PR target/82964 * gcc.target/aarch64/pr79041-2.c: Improve test, disable with fpic. * gcc.target/aarch64/pr78733.c: Improve test, disable with fpic. Co-Authored-By: Richard Sandiford <richard.sandiford@linaro.org> From-SVN: r256800
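As an illustration (not part of the commit message), a TImode constant whose halves are cheap to build:
```
/* A sketch: after the patch, immediates needing at most four
   MOV/MOVK instructions are synthesized inline; anything more
   complex falls back to a literal load.  */
unsigned __int128
big_constant (void)
{
  return ((unsigned __int128) 1 << 64) | 0xdeadbeefULL;
}
```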
2018-01-13Add support for SVE gather loadsRichard Sandiford1-0/+8
This patch adds support for SVE gather loads. It basically uses the same analysis code as the AVX gather support, but after that there are two major differences:

- It uses new internal functions rather than target built-ins. The interface is IFN_GATHER_LOAD (base, offsets, scale) and IFN_MASK_GATHER_LOAD (base, offsets, scale, mask), which should be reasonably generic. One of the advantages of using internal functions is that other passes can understand what the functions do, but a more immediate advantage is that we can query the underlying target pattern to see which scales it supports.
- It uses pattern recognition to convert the offset to the right width, if it was originally narrower than that. This avoids having to do a widening operation as part of the gather expansion itself.

2018-01-13 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * doc/md.texi (gather_load@var{m}): Document. (mask_gather_load@var{m}): Likewise. * genopinit.c (main): Add supports_vec_gather_load and supports_vec_gather_load_cached to target_optabs. * optabs-tree.c (init_tree_optimization_optabs): Use ggc_cleared_alloc to allocate target_optabs. * optabs.def (gather_load_optab, mask_gather_load_optab): New optabs. * internal-fn.def (GATHER_LOAD, MASK_GATHER_LOAD): New internal functions. * internal-fn.h (internal_load_fn_p): Declare. (internal_gather_scatter_fn_p): Likewise. (internal_fn_mask_index): Likewise. (internal_gather_scatter_fn_supported_p): Likewise. * internal-fn.c (gather_load_direct): New macro. (expand_gather_load_optab_fn): New function. (direct_gather_load_optab_supported_p): New macro. (direct_internal_fn_optab): New function. (internal_load_fn_p): Likewise. (internal_gather_scatter_fn_p): Likewise. (internal_fn_mask_index): Likewise. (internal_gather_scatter_fn_supported_p): Likewise. * optabs-query.c (supports_at_least_one_mode_p): New function. (supports_vec_gather_load_p): Likewise. * optabs-query.h (supports_vec_gather_load_p): Declare. * tree-vectorizer.h (gather_scatter_info): Add ifn, element_type and memory_type field. (NUM_PATTERNS): Bump to 15. * tree-vect-data-refs.c: Include internal-fn.h. (vect_gather_scatter_fn_p): New function. (vect_describe_gather_scatter_call): Likewise. (vect_check_gather_scatter): Try using internal functions for gather loads. Recognize existing calls to a gather load function. (vect_analyze_data_refs): Consider using gather loads if supports_vec_gather_load_p. * tree-vect-patterns.c (vect_get_load_store_mask): New function. (vect_get_gather_scatter_offset_type): Likewise. (vect_convert_mask_for_vectype): Likewise. (vect_add_conversion_to_pattern): Likewise. (vect_try_gather_scatter_pattern): Likewise. (vect_recog_gather_scatter_pattern): New pattern recognizer. (vect_vect_recog_func_ptrs): Add it. * tree-vect-stmts.c (exist_non_indexing_operands_for_use_p): Use internal_fn_mask_index and internal_gather_scatter_fn_p. (check_load_store_masking): Take the gather_scatter_info as an argument and handle gather loads. (vect_get_gather_scatter_ops): New function. (vectorizable_call): Check internal_load_fn_p. (vectorizable_load): Likewise. Handle gather load internal functions. (vectorizable_store): Update call to check_load_store_masking. * config/aarch64/aarch64.md (UNSPEC_LD1_GATHER): New unspec. * config/aarch64/iterators.md (SVE_S, SVE_D): New mode iterators. * config/aarch64/predicates.md (aarch64_gather_scale_operand_w) (aarch64_gather_scale_operand_d): New predicates.
* config/aarch64/aarch64-sve.md (gather_load<mode>): New expander. (mask_gather_load<mode>): New insns. gcc/testsuite/ * gcc.target/aarch64/sve/gather_load_1.c: New test. * gcc.target/aarch64/sve/gather_load_2.c: Likewise. * gcc.target/aarch64/sve/gather_load_3.c: Likewise. * gcc.target/aarch64/sve/gather_load_4.c: Likewise. * gcc.target/aarch64/sve/gather_load_5.c: Likewise. * gcc.target/aarch64/sve/gather_load_6.c: Likewise. * gcc.target/aarch64/sve/gather_load_7.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_1.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_2.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_3.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_4.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_5.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_6.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_7.c: Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256640
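For illustration (names invented), an indexed load of this shape is what the new internal functions represent:
```
/* A sketch: r[i] = base[idx[i]] can be expressed as IFN_GATHER_LOAD
   (or IFN_MASK_GATHER_LOAD under a condition), which SVE implements
   with a gather form of LD1W.  */
void
gather (float *restrict r, const float *restrict base,
        const int *restrict idx, int n)
{
  for (int i = 0; i < n; ++i)
    r[i] = base[idx[i]];
}
```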
2018-01-13[AArch64] SVE load/store_lanes supportRichard Sandiford1-0/+8
This patch adds support for SVE LD[234], ST[234] and associated structure modes. Unlike Advanced SIMD, these modes are extra-long vector modes instead of integer modes. 2018-01-13 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * config/aarch64/aarch64-modes.def: Define x2, x3 and x4 vector modes for SVE. * config/aarch64/aarch64-protos.h (aarch64_sve_struct_memory_operand_p): Declare. * config/aarch64/iterators.md (SVE_STRUCT): New mode iterator. (vector_count, insn_length, VSINGLE, vsingle): New mode attributes. (VPRED, vpred): Handle SVE structure modes. * config/aarch64/constraints.md (Utx): New constraint. * config/aarch64/predicates.md (aarch64_sve_struct_memory_operand) (aarch64_sve_struct_nonimmediate_operand): New predicates. * config/aarch64/aarch64.md (UNSPEC_LDN, UNSPEC_STN): New unspecs. * config/aarch64/aarch64-sve.md (mov<mode>, *aarch64_sve_mov<mode>_le) (*aarch64_sve_mov<mode>_be, pred_mov<mode>): New patterns for structure modes. Split into pieces after RA. (vec_load_lanes<mode><vsingle>, vec_mask_load_lanes<mode><vsingle>) (vec_store_lanes<mode><vsingle>, vec_mask_store_lanes<mode><vsingle>): New patterns. * config/aarch64/aarch64.c (aarch64_classify_vector_mode): Handle SVE structure modes. (aarch64_classify_address): Likewise. (sizetochar): Move earlier in file. (aarch64_print_operand): Handle SVE register lists. (aarch64_array_mode): New function. (aarch64_sve_struct_memory_operand_p): Likewise. (TARGET_ARRAY_MODE): Redefine. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_vect_load_lanes): Return true for SVE too. * g++.dg/vect/pr36648.cc: XFAIL for variable-length vectors if load/store lanes are supported. * gcc.dg/vect/slp-10.c: Likewise. * gcc.dg/vect/slp-12c.c: Likewise. * gcc.dg/vect/slp-17.c: Likewise. * gcc.dg/vect/slp-33.c: Likewise. * gcc.dg/vect/slp-6.c: Likewise. * gcc.dg/vect/slp-cond-1.c: Likewise. * gcc.dg/vect/slp-multitypes-11-big-array.c: Likewise. * gcc.dg/vect/slp-multitypes-11.c: Likewise. * gcc.dg/vect/slp-multitypes-12.c: Likewise. * gcc.dg/vect/slp-perm-5.c: Remove XFAIL for variable-length SVE. * gcc.dg/vect/slp-perm-6.c: Likewise. * gcc.dg/vect/slp-perm-9.c: Likewise. * gcc.dg/vect/slp-reduc-6.c: Remove XFAIL for variable-length vectors. * gcc.dg/vect/vect-load-lanes-peeling-1.c: Expect an epilogue loop for variable-length vectors. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256618
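As an illustration (not part of the commit message; names invented), de-interleaving three-element structures is the access pattern the lanes support targets:
```
/* A sketch: each iteration reads one 3-element record, the pattern
   vec_load_lanes (SVE LD3) exists to handle.  */
void
deinterleave3 (float *restrict x, float *restrict y, float *restrict z,
               const float *restrict src, int n)
{
  for (int i = 0; i < n; ++i)
    {
      x[i] = src[3 * i];
      y[i] = src[3 * i + 1];
      z[i] = src[3 * i + 2];
    }
}
```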
2018-01-13[AArch64] Add SVE supportRichard Sandiford1-22/+176
This patch adds support for ARM's Scalable Vector Extension. The patch just contains the core features that work with the current vectoriser framework; later patches will add extra capabilities to both the target-independent code and AArch64 code. The patch doesn't include: - support for unwinding frames whose size depends on the vector length - modelling the effect of __tls_get_addr on the SVE registers These are handled by later patches instead. Some notes: - The copyright years for aarch64-sve.md start at 2009 because some of the code is based on aarch64.md, which also starts from then. - The patch inserts spaces between items in the AArch64 section of sourcebuild.texi. This matches at least the surrounding architectures and looks a little nicer in the info output. - aarch64-sve.md includes a pattern: while_ult<GPI:mode><PRED_ALL:mode> A later patch adds a matching "while_ult" optab, but the pattern is also needed by the predicate vec_duplicate expander. 2018-01-13 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * doc/invoke.texi (-msve-vector-bits=): Document new option. (sve): Document new AArch64 extension. * doc/md.texi (w): Extend the description of the AArch64 constraint to include SVE vectors. (Upl, Upa): Document new AArch64 predicate constraints. * config/aarch64/aarch64-opts.h (aarch64_sve_vector_bits_enum): New enum. * config/aarch64/aarch64.opt (sve_vector_bits): New enum. (msve-vector-bits=): New option. * config/aarch64/aarch64-option-extensions.def (fp, simd): Disable SVE when these are disabled. (sve): New extension. * config/aarch64/aarch64-modes.def: Define SVE vector and predicate modes. Adjust their number of units based on aarch64_sve_vg. (MAX_BITSIZE_MODE_ANY_MODE): Define. * config/aarch64/aarch64-protos.h (ADDR_QUERY_ANY): New aarch64_addr_query_type. (aarch64_const_vec_all_same_in_range_p, aarch64_sve_pred_mode) (aarch64_sve_cnt_immediate_p, aarch64_sve_addvl_addpl_immediate_p) (aarch64_sve_inc_dec_immediate_p, aarch64_add_offset_temporaries) (aarch64_split_add_offset, aarch64_output_sve_cnt_immediate) (aarch64_output_sve_addvl_addpl, aarch64_output_sve_inc_dec_immediate) (aarch64_output_sve_mov_immediate, aarch64_output_ptrue): Declare. (aarch64_simd_imm_zero_p): Delete. (aarch64_check_zero_based_sve_index_immediate): Declare. (aarch64_sve_index_immediate_p, aarch64_sve_arith_immediate_p) (aarch64_sve_bitmask_immediate_p, aarch64_sve_dup_immediate_p) (aarch64_sve_cmp_immediate_p, aarch64_sve_float_arith_immediate_p) (aarch64_sve_float_mul_immediate_p): Likewise. (aarch64_classify_symbol): Take the offset as a HOST_WIDE_INT rather than an rtx. (aarch64_sve_ld1r_operand_p, aarch64_sve_ldr_operand_p): Declare. (aarch64_expand_mov_immediate): Take a gen_vec_duplicate callback. (aarch64_emit_sve_pred_move, aarch64_expand_sve_mem_move): Declare. (aarch64_expand_sve_vec_cmp_int, aarch64_expand_sve_vec_cmp_float) (aarch64_expand_sve_vcond, aarch64_expand_sve_vec_perm): Declare. (aarch64_regmode_natural_size): Likewise. * config/aarch64/aarch64.h (AARCH64_FL_SVE): New macro. (AARCH64_FL_V8_3, AARCH64_FL_RCPC, AARCH64_FL_DOTPROD): Shift left one place. (AARCH64_ISA_SVE, TARGET_SVE): New macros. (FIXED_REGISTERS, CALL_USED_REGISTERS, REGISTER_NAMES): Add entries for VG and the SVE predicate registers. (V_ALIASES): Add a "z"-prefixed alias. (FIRST_PSEUDO_REGISTER): Change to P15_REGNUM + 1. (AARCH64_DWARF_VG, AARCH64_DWARF_P0): New macros. (PR_REGNUM_P, PR_LO_REGNUM_P): Likewise. 
(PR_LO_REGS, PR_HI_REGS, PR_REGS): New reg_classes. (REG_CLASS_NAMES): Add entries for them. (REG_CLASS_CONTENTS): Likewise. Update ALL_REGS to include VG and the predicate registers. (aarch64_sve_vg): Declare. (BITS_PER_SVE_VECTOR, BYTES_PER_SVE_VECTOR, BYTES_PER_SVE_PRED) (SVE_BYTE_MODE, MAX_COMPILE_TIME_VEC_BYTES): New macros. (REGMODE_NATURAL_SIZE): Define. * config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Handle SVE macros. * config/aarch64/aarch64.c: Include cfgrtl.h. (simd_immediate_info): Add a constructor for series vectors, and an associated step field. (aarch64_sve_vg): New variable. (aarch64_dbx_register_number): Handle VG and the predicate registers. (aarch64_vect_struct_mode_p, aarch64_vector_mode_p): Delete. (VEC_ADVSIMD, VEC_SVE_DATA, VEC_SVE_PRED, VEC_STRUCT, VEC_ANY_SVE) (VEC_ANY_DATA, VEC_STRUCT): New constants. (aarch64_advsimd_struct_mode_p, aarch64_sve_pred_mode_p) (aarch64_classify_vector_mode, aarch64_vector_data_mode_p) (aarch64_sve_data_mode_p, aarch64_sve_pred_mode) (aarch64_get_mask_mode): New functions. (aarch64_hard_regno_nregs): Handle SVE data modes for FP_REGS and FP_LO_REGS. Handle PR_REGS, PR_LO_REGS and PR_HI_REGS. (aarch64_hard_regno_mode_ok): Handle VG. Also handle the SVE predicate modes and predicate registers. Explicitly restrict GPRs to modes of 16 bytes or smaller. Only allow FP registers to store a vector mode if it is recognized by aarch64_classify_vector_mode. (aarch64_regmode_natural_size): New function. (aarch64_hard_regno_caller_save_mode): Return the original mode for predicates. (aarch64_sve_cnt_immediate_p, aarch64_output_sve_cnt_immediate) (aarch64_sve_addvl_addpl_immediate_p, aarch64_output_sve_addvl_addpl) (aarch64_sve_inc_dec_immediate_p, aarch64_output_sve_inc_dec_immediate) (aarch64_add_offset_1_temporaries, aarch64_offset_temporaries): New functions. (aarch64_add_offset): Add a temp2 parameter. Assert that temp1 does not overlap dest if the function is frame-related. Handle SVE constants. (aarch64_split_add_offset): New function. (aarch64_add_sp, aarch64_sub_sp): Add temp2 parameters and pass them aarch64_add_offset. (aarch64_allocate_and_probe_stack_space): Add a temp2 parameter and update call to aarch64_sub_sp. (aarch64_add_cfa_expression): New function. (aarch64_expand_prologue): Pass extra temporary registers to the functions above. Handle the case in which we need to emit new DW_CFA_expressions for registers that were originally saved relative to the stack pointer, but now have to be expressed relative to the frame pointer. (aarch64_output_mi_thunk): Pass extra temporary registers to the functions above. (aarch64_expand_epilogue): Likewise. Prevent inheritance of IP0 and IP1 values for SVE frames. (aarch64_expand_vec_series): New function. (aarch64_expand_sve_widened_duplicate): Likewise. (aarch64_expand_sve_const_vector): Likewise. (aarch64_expand_mov_immediate): Add a gen_vec_duplicate parameter. Handle SVE constants. Use emit_move_insn to move a force_const_mem into the register, rather than emitting a SET directly. (aarch64_emit_sve_pred_move, aarch64_expand_sve_mem_move) (aarch64_get_reg_raw_mode, offset_4bit_signed_scaled_p) (offset_6bit_unsigned_scaled_p, aarch64_offset_7bit_signed_scaled_p) (offset_9bit_signed_scaled_p): New functions. (aarch64_replicate_bitmask_imm): New function. (aarch64_bitmask_imm): Use it. (aarch64_cannot_force_const_mem): Reject expressions involving a CONST_POLY_INT. Update call to aarch64_classify_symbol. 
(aarch64_classify_index): Handle SVE indices, by requiring a plain register index with a scale that matches the element size. (aarch64_classify_address): Handle SVE addresses. Assert that the mode of the address is VOIDmode or an integer mode. Update call to aarch64_classify_symbol. (aarch64_classify_symbolic_expression): Update call to aarch64_classify_symbol. (aarch64_const_vec_all_in_range_p): New function. (aarch64_print_vector_float_operand): Likewise. (aarch64_print_operand): Handle 'N' and 'C'. Use "zN" rather than "vN" for FP registers with SVE modes. Handle (const ...) vectors and the FP immediates 1.0 and 0.5. (aarch64_print_address_internal): Handle SVE addresses. (aarch64_print_operand_address): Use ADDR_QUERY_ANY. (aarch64_regno_regclass): Handle predicate registers. (aarch64_secondary_reload): Handle big-endian reloads of SVE data modes. (aarch64_class_max_nregs): Handle SVE modes and predicate registers. (aarch64_rtx_costs): Check for ADDVL and ADDPL instructions. (aarch64_convert_sve_vector_bits): New function. (aarch64_override_options): Use it to handle -msve-vector-bits=. (aarch64_classify_symbol): Take the offset as a HOST_WIDE_INT rather than an rtx. (aarch64_legitimate_constant_p): Use aarch64_classify_vector_mode. Handle SVE vector and predicate modes. Accept VL-based constants that need only one temporary register, and VL offsets that require no temporary registers. (aarch64_conditional_register_usage): Mark the predicate registers as fixed if SVE isn't available. (aarch64_vector_mode_supported_p): Use aarch64_classify_vector_mode. Return true for SVE vector and predicate modes. (aarch64_simd_container_mode): Take the number of bits as a poly_int64 rather than an unsigned int. Handle SVE modes. (aarch64_preferred_simd_mode): Update call accordingly. Handle SVE modes. (aarch64_autovectorize_vector_sizes): Add BYTES_PER_SVE_VECTOR if SVE is enabled. (aarch64_sve_index_immediate_p, aarch64_sve_arith_immediate_p) (aarch64_sve_bitmask_immediate_p, aarch64_sve_dup_immediate_p) (aarch64_sve_cmp_immediate_p, aarch64_sve_float_arith_immediate_p) (aarch64_sve_float_mul_immediate_p): New functions. (aarch64_sve_valid_immediate): New function. (aarch64_simd_valid_immediate): Use it as the fallback for SVE vectors. Explicitly reject structure modes. Check for INDEX constants. Handle PTRUE and PFALSE constants. (aarch64_check_zero_based_sve_index_immediate): New function. (aarch64_simd_imm_zero_p): Delete. (aarch64_mov_operand_p): Use aarch64_simd_valid_immediate for vector modes. Accept constants in the range of CNT[BHWD]. (aarch64_simd_scalar_immediate_valid_for_move): Explicitly ask for an Advanced SIMD mode. (aarch64_sve_ld1r_operand_p, aarch64_sve_ldr_operand_p): New functions. (aarch64_simd_vector_alignment): Handle SVE predicates. (aarch64_vectorize_preferred_vector_alignment): New function. (aarch64_simd_vector_alignment_reachable): Use it instead of the vector size. (aarch64_shift_truncation_mask): Use aarch64_vector_data_mode_p. (aarch64_output_sve_mov_immediate, aarch64_output_ptrue): New functions. (MAX_VECT_LEN): Delete. (expand_vec_perm_d): Add a vec_flags field. (emit_unspec2, aarch64_expand_sve_vec_perm): New functions. (aarch64_evpc_trn, aarch64_evpc_uzp, aarch64_evpc_zip) (aarch64_evpc_ext): Don't apply a big-endian lane correction for SVE modes. (aarch64_evpc_rev): Rename to... (aarch64_evpc_rev_local): ...this. Use a predicated operation for SVE. (aarch64_evpc_rev_global): New function. (aarch64_evpc_dup): Enforce a 64-byte range for SVE DUP. 
(aarch64_evpc_tbl): Use MAX_COMPILE_TIME_VEC_BYTES instead of MAX_VECT_LEN. (aarch64_evpc_sve_tbl): New function. (aarch64_expand_vec_perm_const_1): Update after rename of aarch64_evpc_rev. Handle SVE permutes too, trying aarch64_evpc_rev_global and using aarch64_evpc_sve_tbl rather than aarch64_evpc_tbl. (aarch64_vectorize_vec_perm_const): Initialize vec_flags. (aarch64_sve_cmp_operand_p, aarch64_unspec_cond_code) (aarch64_gen_unspec_cond, aarch64_expand_sve_vec_cmp_int) (aarch64_emit_unspec_cond, aarch64_emit_unspec_cond_or) (aarch64_emit_inverted_unspec_cond, aarch64_expand_sve_vec_cmp_float) (aarch64_expand_sve_vcond): New functions. (aarch64_modes_tieable_p): Use aarch64_vector_data_mode_p instead of aarch64_vector_mode_p. (aarch64_dwarf_poly_indeterminate_value): New function. (aarch64_compute_pressure_classes): Likewise. (aarch64_can_change_mode_class): Likewise. (TARGET_GET_RAW_RESULT_MODE, TARGET_GET_RAW_ARG_MODE): Redefine. (TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): Likewise. (TARGET_VECTORIZE_GET_MASK_MODE): Likewise. (TARGET_DWARF_POLY_INDETERMINATE_VALUE): Likewise. (TARGET_COMPUTE_PRESSURE_CLASSES): Likewise. (TARGET_CAN_CHANGE_MODE_CLASS): Likewise. * config/aarch64/constraints.md (Upa, Upl, Uav, Uat, Usv, Usi, Utr) (Uty, Dm, vsa, vsc, vsd, vsi, vsn, vsl, vsm, vsA, vsM, vsN): New constraints. (Dn, Dl, Dr): Accept const as well as const_vector. (Dz): Likewise. Compare against CONST0_RTX. * config/aarch64/iterators.md: Refer to "Advanced SIMD" instead of "vector" where appropriate. (SVE_ALL, SVE_BH, SVE_BHS, SVE_BHSI, SVE_HSDI, SVE_HSF, SVE_SD) (SVE_SDI, SVE_I, SVE_F, PRED_ALL, PRED_BHS): New mode iterators. (UNSPEC_SEL, UNSPEC_ANDF, UNSPEC_IORF, UNSPEC_XORF, UNSPEC_COND_LT) (UNSPEC_COND_LE, UNSPEC_COND_EQ, UNSPEC_COND_NE, UNSPEC_COND_GE) (UNSPEC_COND_GT, UNSPEC_COND_LO, UNSPEC_COND_LS, UNSPEC_COND_HS) (UNSPEC_COND_HI, UNSPEC_COND_UO): New unspecs. (Vetype, VEL, Vel, VWIDE, Vwide, vw, vwcore, V_INT_EQUIV) (v_int_equiv): Extend to SVE modes. (Vesize, V128, v128, Vewtype, V_FP_EQUIV, v_fp_equiv, VPRED): New mode attributes. (LOGICAL_OR, SVE_INT_UNARY, SVE_FP_UNARY): New code iterators. (optab): Handle popcount, smin, smax, umin, umax, abs and sqrt. (logical_nn, lr, sve_int_op, sve_fp_op): New code attributs. (LOGICALF, OPTAB_PERMUTE, UNPACK, UNPACK_UNSIGNED, SVE_COND_INT_CMP) (SVE_COND_FP_CMP): New int iterators. (perm_hilo): Handle the new unpack unspecs. (optab, logicalf_op, su, perm_optab, cmp_op, imm_con): New int attributes. * config/aarch64/predicates.md (aarch64_sve_cnt_immediate) (aarch64_sve_addvl_addpl_immediate, aarch64_split_add_offset_immediate) (aarch64_pluslong_or_poly_operand, aarch64_nonmemory_operand) (aarch64_equality_operator, aarch64_constant_vector_operand) (aarch64_sve_ld1r_operand, aarch64_sve_ldr_operand): New predicates. (aarch64_sve_nonimmediate_operand): Likewise. (aarch64_sve_general_operand): Likewise. (aarch64_sve_dup_operand, aarch64_sve_arith_immediate): Likewise. (aarch64_sve_sub_arith_immediate, aarch64_sve_inc_dec_immediate) (aarch64_sve_logical_immediate, aarch64_sve_mul_immediate): Likewise. (aarch64_sve_dup_immediate, aarch64_sve_cmp_vsc_immediate): Likewise. (aarch64_sve_cmp_vsd_immediate, aarch64_sve_index_immediate): Likewise. (aarch64_sve_float_arith_immediate): Likewise. (aarch64_sve_float_arith_with_sub_immediate): Likewise. (aarch64_sve_float_mul_immediate, aarch64_sve_arith_operand): Likewise. (aarch64_sve_add_operand, aarch64_sve_logical_operand): Likewise. (aarch64_sve_lshift_operand, aarch64_sve_rshift_operand): Likewise. 
(aarch64_sve_mul_operand, aarch64_sve_cmp_vsc_operand): Likewise. (aarch64_sve_cmp_vsd_operand, aarch64_sve_index_operand): Likewise. (aarch64_sve_float_arith_operand): Likewise. (aarch64_sve_float_arith_with_sub_operand): Likewise. (aarch64_sve_float_mul_operand): Likewise. (aarch64_sve_vec_perm_operand): Likewise. (aarch64_pluslong_operand): Include aarch64_sve_addvl_addpl_immediate. (aarch64_mov_operand): Accept const_poly_int and const_vector. (aarch64_simd_lshift_imm, aarch64_simd_rshift_imm): Accept const as well as const_vector. (aarch64_simd_imm_zero, aarch64_simd_imm_minus_one): Move earlier in file. Use CONST0_RTX and CONSTM1_RTX. (aarch64_simd_or_scalar_imm_zero): Likewise. Add match_codes. (aarch64_simd_reg_or_zero): Accept const as well as const_vector. Use aarch64_simd_imm_zero. * config/aarch64/aarch64-sve.md: New file. * config/aarch64/aarch64.md: Include it. (VG_REGNUM, P0_REGNUM, P7_REGNUM, P15_REGNUM): New register numbers. (UNSPEC_REV, UNSPEC_LD1_SVE, UNSPEC_ST1_SVE, UNSPEC_MERGE_PTRUE) (UNSPEC_PTEST_PTRUE, UNSPEC_UNPACKSHI, UNSPEC_UNPACKUHI) (UNSPEC_UNPACKSLO, UNSPEC_UNPACKULO, UNSPEC_PACK) (UNSPEC_FLOAT_CONVERT, UNSPEC_WHILE_LO): New unspec constants. (sve): New attribute. (enabled): Disable instructions with the sve attribute unless TARGET_SVE. (movqi, movhi): Pass CONST_POLY_INT operaneds through aarch64_expand_mov_immediate. (*mov<mode>_aarch64, *movsi_aarch64, *movdi_aarch64): Handle CNT[BHSD] immediates. (movti): Split CONST_POLY_INT moves into two halves. (add<mode>3): Accept aarch64_pluslong_or_poly_operand. Split additions that need a temporary here if the destination is the stack pointer. (*add<mode>3_aarch64): Handle ADDVL and ADDPL immediates. (*add<mode>3_poly_1): New instruction. (set_clobber_cc): New expander. Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com> Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256612
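As an illustration of the new core support (not part of the commit message; names invented):
```
/* A sketch: a plain loop of the kind this patch lets the vectorizer
   handle in a vector-length-agnostic way, e.g. with
   -O2 -march=armv8-a+sve (and optionally -msve-vector-bits=256).  */
void
saxpy (float *restrict y, const float *restrict x, float a, int n)
{
  for (int i = 0; i < n; ++i)
    y[i] += a * x[i];
}
```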
2018-01-11aarch64-modes.def (V2HF): New VECTOR_MODE.Michael Collison1-0/+12
2018-01-10 Michael Collison <michael.collison@arm.com>

* config/aarch64/aarch64-modes.def (V2HF): New VECTOR_MODE.
* config/aarch64/aarch64-option-extensions.def: Add
AARCH64_OPT_EXTENSION of 'fp16fml'.
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins):
(__ARM_FEATURE_FP16_FML): Define if TARGET_F16FML is true.
* config/aarch64/predicates.md (aarch64_lane_imm3): New predicate.
* config/aarch64/constraints.md (Ui7): New constraint.
* config/aarch64/iterators.md (VFMLA_W): New mode iterator.
(VFMLA_SEL_W): Ditto.
(f16quad): Ditto.
(f16mac1): Ditto.
(VFMLA16_LOW): New int iterator.
(VFMLA16_HIGH): Ditto.
(UNSPEC_FMLAL): New unspec.
(UNSPEC_FMLSL): Ditto.
(UNSPEC_FMLAL2): Ditto.
(UNSPEC_FMLSL2): Ditto.
(f16mac): New code attribute.
* config/aarch64/aarch64-simd-builtins.def (aarch64_fmlal_lowv2sf):
New builtin.
(aarch64_fmlsl_lowv2sf): Ditto.
(aarch64_fmlalq_lowv4sf): Ditto.
(aarch64_fmlslq_lowv4sf): Ditto.
(aarch64_fmlal_highv2sf): Ditto.
(aarch64_fmlsl_highv2sf): Ditto.
(aarch64_fmlalq_highv4sf): Ditto.
(aarch64_fmlslq_highv4sf): Ditto.
(aarch64_fmlal_lane_lowv2sf): Ditto.
(aarch64_fmlsl_lane_lowv2sf): Ditto.
(aarch64_fmlal_laneq_lowv2sf): Ditto.
(aarch64_fmlsl_laneq_lowv2sf): Ditto.
(aarch64_fmlalq_lane_lowv4sf): Ditto.
(aarch64_fmlsl_lane_lowv4sf): Ditto.
(aarch64_fmlalq_laneq_lowv4sf): Ditto.
(aarch64_fmlsl_laneq_lowv4sf): Ditto.
(aarch64_fmlal_lane_highv2sf): Ditto.
(aarch64_fmlsl_lane_highv2sf): Ditto.
(aarch64_fmlal_laneq_highv2sf): Ditto.
(aarch64_fmlsl_laneq_highv2sf): Ditto.
(aarch64_fmlalq_lane_highv4sf): Ditto.
(aarch64_fmlsl_lane_highv4sf): Ditto.
(aarch64_fmlalq_laneq_highv4sf): Ditto.
(aarch64_fmlsl_laneq_highv4sf): Ditto.
* config/aarch64/aarch64-simd.md:
(aarch64_fml<f16mac1>l<f16quad>_low<mode>): New pattern.
(aarch64_fml<f16mac1>l<f16quad>_high<mode>): Ditto.
(aarch64_simd_fml<f16mac1>l<f16quad>_low<mode>): Ditto.
(aarch64_simd_fml<f16mac1>l<f16quad>_high<mode>): Ditto.
(aarch64_fml<f16mac1>l_lane_lowv2sf): Ditto.
(aarch64_fml<f16mac1>l_lane_highv2sf): Ditto.
(aarch64_simd_fml<f16mac>l_lane_lowv2sf): Ditto.
(aarch64_simd_fml<f16mac>l_lane_highv2sf): Ditto.
(aarch64_fml<f16mac1>lq_laneq_lowv4sf): Ditto.
(aarch64_fml<f16mac1>lq_laneq_highv4sf): Ditto.
(aarch64_simd_fml<f16mac>lq_laneq_lowv4sf): Ditto.
(aarch64_simd_fml<f16mac>lq_laneq_highv4sf): Ditto.
(aarch64_fml<f16mac1>l_laneq_lowv2sf): Ditto.
(aarch64_fml<f16mac1>l_laneq_highv2sf): Ditto.
(aarch64_simd_fml<f16mac>l_laneq_lowv2sf): Ditto.
(aarch64_simd_fml<f16mac>l_laneq_highv2sf): Ditto.
(aarch64_fml<f16mac1>lq_lane_lowv4sf): Ditto.
(aarch64_fml<f16mac1>lq_lane_highv4sf): Ditto.
(aarch64_simd_fml<f16mac>lq_lane_lowv4sf): Ditto.
(aarch64_simd_fml<f16mac>lq_lane_highv4sf): Ditto.
* config/aarch64/arm_neon.h (vfmlal_low_u32): New intrinsic.
(vfmlsl_low_u32): Ditto.
(vfmlalq_low_u32): Ditto.
(vfmlslq_low_u32): Ditto.
(vfmlal_high_u32): Ditto.
(vfmlsl_high_u32): Ditto.
(vfmlalq_high_u32): Ditto.
(vfmlslq_high_u32): Ditto.
(vfmlal_lane_low_u32): Ditto.
(vfmlsl_lane_low_u32): Ditto.
(vfmlal_laneq_low_u32): Ditto.
(vfmlsl_laneq_low_u32): Ditto.
(vfmlalq_lane_low_u32): Ditto.
(vfmlslq_lane_low_u32): Ditto.
(vfmlalq_laneq_low_u32): Ditto.
(vfmlslq_laneq_low_u32): Ditto.
(vfmlal_lane_high_u32): Ditto.
(vfmlsl_lane_high_u32): Ditto.
(vfmlal_laneq_high_u32): Ditto.
(vfmlsl_laneq_high_u32): Ditto.
(vfmlalq_lane_high_u32): Ditto.
(vfmlslq_lane_high_u32): Ditto.
(vfmlalq_laneq_high_u32): Ditto.
(vfmlslq_laneq_high_u32): Ditto.
* config/aarch64/aarch64.h (AARCH64_FL_F16FML): New flag.
(AARCH64_FL_FOR_ARCH8_4): New.
(AARCH64_ISA_F16FML): New ISA flag.
(TARGET_F16FML): New feature flag for fp16fml.
(doc/invoke.texi): Document new fp16fml option.

2018-01-10 Michael Collison <michael.collison@arm.com>

* config/aarch64/aarch64-builtins.c:
(aarch64_types_ternopu_imm_qualifiers, TYPES_TERNOPUI): New.
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins):
(__ARM_FEATURE_SHA3): Define if TARGET_SHA3 is true.
* config/aarch64/aarch64.h (AARCH64_FL_SHA3): New flags.
(AARCH64_ISA_SHA3): New ISA flag.
(TARGET_SHA3): New feature flag for sha3.
* config/aarch64/iterators.md (sha512_op): New int attribute.
(CRYPTO_SHA512): New int iterator.
(UNSPEC_SHA512H): New unspec.
(UNSPEC_SHA512H2): Ditto.
(UNSPEC_SHA512SU0): Ditto.
(UNSPEC_SHA512SU1): Ditto.
* config/aarch64/aarch64-simd-builtins.def
(aarch64_crypto_sha512hqv2di): New builtin.
(aarch64_crypto_sha512h2qv2di): Ditto.
(aarch64_crypto_sha512su0qv2di): Ditto.
(aarch64_crypto_sha512su1qv2di): Ditto.
(aarch64_eor3qv8hi): Ditto.
(aarch64_rax1qv2di): Ditto.
(aarch64_xarqv2di): Ditto.
(aarch64_bcaxqv8hi): Ditto.
* config/aarch64/aarch64-simd.md:
(aarch64_crypto_sha512h<sha512_op>qv2di): New pattern.
(aarch64_crypto_sha512su0qv2di): Ditto.
(aarch64_crypto_sha512su1qv2di): Ditto.
(aarch64_eor3qv8hi): Ditto.
(aarch64_rax1qv2di): Ditto.
(aarch64_xarqv2di): Ditto.
(aarch64_bcaxqv8hi): Ditto.
* config/aarch64/arm_neon.h (vsha512hq_u64): New intrinsic.
(vsha512h2q_u64): Ditto.
(vsha512su0q_u64): Ditto.
(vsha512su1q_u64): Ditto.
(veor3q_u16): Ditto.
(vrax1q_u64): Ditto.
(vxarq_u64): Ditto.
(vbcaxq_u16): Ditto.
* config/arm/types.md (crypto_sha512): New type attribute.
(crypto_sha3): Ditto.
(doc/invoke.texi): Document new sha3 option.

2018-01-10 Michael Collison <michael.collison@arm.com>

* config/aarch64/aarch64-builtins.c:
(aarch64_types_quadopu_imm_qualifiers, TYPES_QUADOPUI): New.
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins):
(__ARM_FEATURE_SM3): Define if TARGET_SM4 is true.
(__ARM_FEATURE_SM4): Define if TARGET_SM4 is true.
* config/aarch64/aarch64.h (AARCH64_FL_SM4): New flags.
(AARCH64_ISA_SM4): New ISA flag.
(TARGET_SM4): New feature flag for sm4.
* config/aarch64/aarch64-simd-builtins.def (aarch64_sm3ss1qv4si):
New builtin.
(aarch64_sm3tt1aq4si): Ditto.
(aarch64_sm3tt1bq4si): Ditto.
(aarch64_sm3tt2aq4si): Ditto.
(aarch64_sm3tt2bq4si): Ditto.
(aarch64_sm3partw1qv4si): Ditto.
(aarch64_sm3partw2qv4si): Ditto.
(aarch64_sm4eqv4si): Ditto.
(aarch64_sm4ekeyqv4si): Ditto.
* config/aarch64/aarch64-simd.md: (aarch64_sm3ss1qv4si): New pattern.
(aarch64_sm3tt<sm3tt_op>qv4si): Ditto.
(aarch64_sm3partw<sm3part_op>qv4si): Ditto.
(aarch64_sm4eqv4si): Ditto.
(aarch64_sm4ekeyqv4si): Ditto.
* config/aarch64/iterators.md (sm3tt_op): New int iterator.
(sm3part_op): Ditto.
(CRYPTO_SM3TT): Ditto.
(CRYPTO_SM3PART): Ditto.
(UNSPEC_SM3SS1): New unspec.
(UNSPEC_SM3TT1A): Ditto.
(UNSPEC_SM3TT1B): Ditto.
(UNSPEC_SM3TT2A): Ditto.
(UNSPEC_SM3TT2B): Ditto.
(UNSPEC_SM3PARTW1): Ditto.
(UNSPEC_SM3PARTW2): Ditto.
(UNSPEC_SM4E): Ditto.
(UNSPEC_SM4EKEY): Ditto.
* config/aarch64/constraints.md (Ui2): New constraint.
* config/aarch64/predicates.md (aarch64_imm2): New predicate.
* config/arm/types.md (crypto_sm3): New type attribute.
(crypto_sm4): Ditto.
* config/aarch64/arm_neon.h (vsm3ss1q_u32): New intrinsic.
(vsm3tt1aq_u32): Ditto.
(vsm3tt1bq_u32): Ditto.
(vsm3tt2aq_u32): Ditto.
(vsm3tt2bq_u32): Ditto.
(vsm3partw1q_u32): Ditto.
(vsm3partw2q_u32): Ditto.
(vsm4eq_u32): Ditto.
(vsm4ekeyq_u32): Ditto.
(doc/invoke.texi): Document new sm4 option.

2018-01-10 Michael Collison <michael.collison@arm.com>

* config/aarch64/aarch64-arches.def (armv8.4-a): New architecture.
* config/aarch64/aarch64.h (AARCH64_ISA_V8_4): New ISA flag.
(AARCH64_FL_FOR_ARCH8_4): New.
(AARCH64_FL_V8_4): New flag.
(doc/invoke.texi): Document new armv8.4-a option.

2018-01-10 Michael Collison <michael.collison@arm.com>

* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins):
(__ARM_FEATURE_AES): Define if TARGET_AES is true.
(__ARM_FEATURE_SHA2): Define if TARGET_SHA2 is true.
* config/aarch64/aarch64-option-extensions.def: Add
AARCH64_OPT_EXTENSION of 'sha2'.
(aes): Add AARCH64_OPT_EXTENSION of 'aes'.
(crypto): Disable sha2 and aes if crypto disabled.
(crypto): Enable aes and sha2 if enabled.
(simd): Disable sha2 and aes if simd disabled.
* config/aarch64/aarch64.h (AARCH64_FL_AES, AARCH64_FL_SHA2):
New flags.
(AARCH64_ISA_AES, AARCH64_ISA_SHA2): New ISA flags.
(TARGET_SHA2): New feature flag for sha2.
(TARGET_AES): New feature flag for aes.
* config/aarch64/aarch64-simd.md:
(aarch64_crypto_aes<aes_op>v16qi): Make pattern conditional on
TARGET_AES.
(aarch64_crypto_aes<aesmc_op>v16qi): Ditto.
(aarch64_crypto_sha1hsi): Make pattern conditional on TARGET_SHA2.
(aarch64_crypto_sha1hv4si): Ditto.
(aarch64_be_crypto_sha1hv4si): Ditto.
(aarch64_crypto_sha1su1v4si): Ditto.
(aarch64_crypto_sha1<sha1_op>v4si): Ditto.
(aarch64_crypto_sha1su0v4si): Ditto.
(aarch64_crypto_sha256h<sha256_op>v4si): Ditto.
(aarch64_crypto_sha256su0v4si): Ditto.
(aarch64_crypto_sha256su1v4si): Ditto.
(doc/invoke.texi): Document new aes and sha2 options.

From-SVN: r256478
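As a hedged usage sketch of the new FP16 FML intrinsics: the *_u32
spellings are the ones this commit adds to arm_neon.h (later GCC
releases rename them to the ACLE *_f16 forms), so the exact signature
here is an assumption tied to this era of the header.  Built with
something like -march=armv8.2-a+fp16fml:

  #include <arm_neon.h>

  // FMLAL: multiply the low fp16 halves of a and b, widen each product
  // to fp32, and accumulate into acc in a single fused step.
  float32x2_t fma_widen_low (float32x2_t acc, float16x4_t a, float16x4_t b)
  {
    return vfmlal_low_u32 (acc, a, b);
  }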
2018-01-03[AArch64] Rewrite aarch64_simd_valid_immediateRichard Sandiford1-4/+4
This patch reworks aarch64_simd_valid_immediate so that it's easier
to add SVE support.  The main changes are:

- make simd_immediate_info easier to construct
- replace the while (1) { ... break; } blocks with checks that use
  the full 64-bit value of the constant
- treat floating-point modes as integers if they aren't valid
  as floating-point values

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
* config/aarch64/aarch64-protos.h (aarch64_output_simd_mov_immediate):
Remove the mode argument.
(aarch64_simd_valid_immediate): Remove the mode and inverse
arguments.
* config/aarch64/iterators.md (bitsize): New iterator.
* config/aarch64/aarch64-simd.md (*aarch64_simd_mov<mode>, and<mode>3)
(ior<mode>3): Update calls to aarch64_output_simd_mov_immediate.
* config/aarch64/constraints.md (Do, Db, Dn): Update calls to
aarch64_simd_valid_immediate.
* config/aarch64/predicates.md (aarch64_reg_or_orr_imm): Likewise.
(aarch64_reg_or_bic_imm): Likewise.
* config/aarch64/aarch64.c (simd_immediate_info): Replace mvn
with an insn_type enum and msl with a modifier_type enum.
Replace element_width with a scalar_mode.  Change the shift
to unsigned int.  Add constructors for scalar_float_mode and
scalar_int_mode elements.
(aarch64_vect_float_const_representable_p): Delete.
(aarch64_can_const_movi_rtx_p)
(aarch64_simd_scalar_immediate_valid_for_move)
(aarch64_simd_make_constant): Update call to
aarch64_simd_valid_immediate.
(aarch64_advsimd_valid_immediate_hs): New function.
(aarch64_advsimd_valid_immediate): Likewise.
(aarch64_simd_valid_immediate): Remove mode and inverse
arguments.  Rewrite to use the above.  Use const_vec_duplicate_p
to detect duplicated constants and use aarch64_float_const_zero_rtx_p
and aarch64_float_const_representable_p on the result.
(aarch64_output_simd_mov_immediate): Remove mode argument.
Update call to aarch64_simd_valid_immediate and use of
simd_immediate_info.
(aarch64_output_scalar_simd_mov_immediate): Update call
accordingly.

gcc/testsuite/
* gcc.target/aarch64/vect-movi.c (movi_float_lsl24): New function.
(main): Call it.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256205
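The second bullet is the structurally interesting one.  A minimal sketch
of the idea, with a hypothetical helper name rather than the actual
aarch64_advsimd_valid_immediate code: once the constant is replicated
into a 64-bit value, each candidate encoding becomes a direct test
instead of a break out of a while (1) block.

  #include <cstdint>

  // Sketch only: is val64 a duplicated 32-bit element whose sole
  // nonzero byte sits at one of the four byte positions, i.e. a
  // 32-bit MOVI-with-LSL immediate?
  static bool movi_32bit_lsl_ok (uint64_t val64)
  {
    uint32_t elt = (uint32_t) val64;
    if (elt != (uint32_t) (val64 >> 32))
      return false;                  // not a duplicated 32-bit element
    for (unsigned shift = 0; shift < 32; shift += 8)
      if ((elt & ~(uint32_t{0xff} << shift)) == 0)
        return true;                 // one byte at 'shift', rest zero
    return false;
  }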
2018-01-03Update copyright years.Jakub Jelinek1-1/+1
From-SVN: r256169
2017-12-21[AArch64] Tweak aarch64_classify_address interfaceRichard Sandiford1-4/+4
Previously aarch64_classify_address used an rtx code to distinguish
LDP/STP addresses from normal addresses; the code was PARALLEL to
select LDP/STP and anything else to select normal addresses.  This
patch replaces that parameter with a dedicated enum.  The SVE port
will add another enum value that doesn't map naturally to an rtx code.

2017-12-21  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
* config/aarch64/aarch64-protos.h (aarch64_addr_query_type): New enum.
(aarch64_legitimate_address_p): Use it instead of an rtx code,
as an optional final parameter.
* config/aarch64/aarch64.c (aarch64_classify_address): Likewise.
(aarch64_legitimate_address_p): Likewise.
(aarch64_print_address_internal): Take an aarch64_addr_query_type
instead of an rtx code.
(aarch64_address_valid_for_prefetch_p): Update calls accordingly.
(aarch64_legitimate_address_hook_p): Likewise.
(aarch64_print_ldpstp_address): Likewise.
(aarch64_print_operand_address): Likewise.
(aarch64_address_cost): Likewise.
* config/aarch64/constraints.md (Uml, Umq, Ump, Utq): Likewise.
* config/aarch64/predicates.md (aarch64_mem_pair_operand): Likewise.
(aarch64_mem_pair_lanes_operand): Likewise.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r255911
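A sketch of the resulting interface: the enum and function names come
from the ChangeLog above, but the enumerator names follow GCC's eventual
definition and are an assumption here, and rtx/machine_mode are
simplified stand-ins so the fragment is self-contained.

  // Stand-in declarations for illustration; GCC's real types are richer.
  struct rtx_def;
  typedef rtx_def *rtx;
  enum machine_mode { E_DImode, E_V2DImode };

  // The dedicated query type that replaces the old rtx-code parameter.
  enum aarch64_addr_query_type
  {
    ADDR_QUERY_M,       // an ordinary memory operand
    ADDR_QUERY_LDP_STP  // an LDP/STP address, formerly signalled by PARALLEL
  };

  // Optional final parameter with a default, so existing callers stay
  // unchanged while new query kinds (e.g. for SVE) can be added later.
  bool aarch64_legitimate_address_p (machine_mode mode, rtx x, bool strict_p,
                                     aarch64_addr_query_type type
                                       = ADDR_QUERY_M);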