path: root/gcc/config
Age | Commit message | Author | Files, lines changed (-/+)
2020-06-16 | S/390: Emit vector alignment hints for z13 if AS accepts them | Stefan Schulze Frielinghaus | 2 files, -3/+8
Since 87cb9423add vector alignment hints are emitted for target z13, too. This patch changes this behaviour in the sense that alignment hints are only emitted for target z13 if the assembler accepts them. gcc/ChangeLog: * config.in: Regenerate. * config/s390/s390.c (print_operand): Emit vector alignment hints for target z13, if AS accepts them. For other targets the logic stays the same. * config/s390/s390.h (TARGET_VECTOR_LOADSTORE_ALIGNMENT_HINTS): Define macro. * configure: Regenerate. * configure.ac: Check HAVE_AS_VECTOR_LOADSTORE_ALIGNMENT_HINTS_ON_Z13.
2020-06-16 | [PATCH][GCC] arm: Fix the MVE ACLE vaddq_m polymorphic variants. | Srinath Parvathaneni | 1 file, -24/+24
This patch fixes the MVE ACLE vaddq_m polymorphic variants by modifying the corresponding intrinsic parameters and the vaddq_m polymorphic variant's _Generic case entries in the "arm_mve.h" header file. 2020-06-04 Srinath Parvathaneni <srinath.parvathaneni@arm.com> gcc/ * config/arm/arm_mve.h (__arm_vaddq_m_n_s8): Correct the intrinsic arguments. (__arm_vaddq_m_n_s32): Likewise. (__arm_vaddq_m_n_s16): Likewise. (__arm_vaddq_m_n_u8): Likewise. (__arm_vaddq_m_n_u32): Likewise. (__arm_vaddq_m_n_u16): Likewise. (__arm_vaddq_m): Modify polymorphic variant. gcc/testsuite/ * gcc.target/arm/mve/intrinsics/mve_vaddq_m.c: New test.
2020-06-16 | [PATCH][GCC] arm: Fix MVE scalar shift intrinsics code-gen. | Srinath Parvathaneni | 2 files, -36/+48
This patch modifies the MVE scalar shift RTL patterns. The current patterns have wrong constraints and predicates, due to which the values returned by the MVE scalar shift instructions are overwritten in the generated code.

Example:

$ cat x.c
int32_t foo(int64_t acc, int shift) { return sqrshrl_sat48 (acc, shift); }

Code-gen before applying this patch:

$ arm-none-eabi-gcc -march=armv8.1-m.main+mve -mfloat-abi=hard -O2 -S
$ cat x.s
foo:
	push {r4, r5}
	sqrshrl r0, r1, #48, r2   ----> (a)
	mov r0, r4                ----> (b)
	pop {r4, r5}
	bx lr

Code-gen after applying this patch:

foo:
	sqrshrl r0, r1, #48, r2
	bx lr

In the current compiler the return value (r0) from sqrshrl (a) is overwritten by the mov statement (b). This patch fixes the above issue. 2020-06-12 Srinath Parvathaneni <srinath.parvathaneni@arm.com> gcc/ * config/arm/mve.md (mve_uqrshll_sat<supf>_di): Correct the predicate and constraint of all the operands. (mve_sqrshrl_sat<supf>_di): Likewise. (mve_uqrshl_si): Likewise. (mve_sqrshr_si): Likewise. (mve_uqshll_di): Likewise. (mve_urshrl_di): Likewise. (mve_uqshl_si): Likewise. (mve_urshr_si): Likewise. (mve_sqshl_si): Likewise. (mve_srshr_si): Likewise. (mve_srshrl_di): Likewise. (mve_sqshll_di): Likewise. * config/arm/predicates.md (arm_low_register_operand): Define. gcc/testsuite/ * gcc.target/arm/mve/intrinsics/mve_scalar_shifts1.c: New test. * gcc.target/arm/mve/intrinsics/mve_scalar_shifts2.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_scalar_shifts3.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_scalar_shifts4.c: Likewise.
2020-06-16 | RISC-V: Fix ICE on riscv_gpr_save_operation_p [PR95683] | Kito Cheng | 1 file, -1/+4
riscv_gpr_save_operation_p might try to match a PARALLEL from other patterns such as inline asm, and then trigger the assertion checking there, so turn the assertion into an early-exit check. gcc/ChangeLog: PR target/95683 * config/riscv/riscv.c (riscv_gpr_save_operation_p): Remove assertion and turn it into an early exit check. gcc/testsuite/ChangeLog PR target/95683 * gcc.target/riscv/pr95683.c: New.
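The shape of the fix can be sketched in plain C: a recognizer that may be handed arbitrary PARALLELs (e.g. from inline asm) must reject unexpected input instead of asserting on it. The function name and parameters below are ours for illustration, not the actual riscv.c code.

```c
#include <stdbool.h>

/* A matcher reachable from arbitrary input must not assert on a
   mismatch; it should simply report "no match".  */
bool looks_like_gpr_save (int n_elems, int n_expected)
{
  /* Before the fix (conceptually): gcc_assert (n_elems == n_expected);
     -- this ICEs when an inline-asm PARALLEL wanders in.  */
  if (n_elems != n_expected)
    return false;   /* After the fix: early exit, no ICE.  */
  return true;
}
```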
2020-06-15 | gcc: xtensa: make TARGET_HAVE_TLS definition static | Max Filippov | 1 file, -3/+6
Remove TARGET_THREADPTR reference from TARGET_HAVE_TLS to avoid static data initialization dependency on xtensa core configuration. 2020-06-15 Max Filippov <jcmvbkbc@gmail.com> gcc/ * config/xtensa/xtensa.c (TARGET_HAVE_TLS): Remove TARGET_THREADPTR reference. (xtensa_tls_symbol_p, xtensa_tls_referenced_p): Use targetm.have_tls instead of TARGET_HAVE_TLS. (xtensa_option_override): Set targetm.have_tls to false in configurations without THREADPTR.
2020-06-15 | gcc: xtensa: add -mabi option for call0/windowed ABI | Max Filippov | 6 files, -7/+35
2020-06-15 Max Filippov <jcmvbkbc@gmail.com> gcc/ * config/xtensa/elf.h (ASM_SPEC, LINK_SPEC): Pass ABI switch to assembler/linker. * config/xtensa/linux.h (ASM_SPEC, LINK_SPEC): Ditto. * config/xtensa/uclinux.h (ASM_SPEC, LINK_SPEC): Ditto. * config/xtensa/xtensa.c (xtensa_option_override): Initialize xtensa_windowed_abi if needed. * config/xtensa/xtensa.h (TARGET_WINDOWED_ABI_DEFAULT): New macro. (TARGET_WINDOWED_ABI): Redefine to xtensa_windowed_abi. * config/xtensa/xtensa.opt (xtensa_windowed_abi): New target option variable. (mabi=call0, mabi=windowed): New options. * doc/invoke.texi: Document new -mabi= Xtensa-specific options. gcc/testsuite/ * gcc.target/xtensa/mabi-call0.c: New test. * gcc.target/xtensa/mabi-windowed.c: New test. libgcc/ * configure: Regenerate. * configure.ac: Use AC_COMPILE_IFELSE instead of manual preprocessor invocation to check for __XTENSA_CALL0_ABI__.
2020-06-15 | gcc: xtensa: make register elimination data static | Max Filippov | 2 files, -8/+34
Remove ABI reference from the ELIMINABLE_REGS to avoid static data initialization dependency on xtensa core configuration. 2020-06-15 Max Filippov <jcmvbkbc@gmail.com> gcc/ * config/xtensa/xtensa.c (xtensa_can_eliminate): New function. (TARGET_CAN_ELIMINATE): New macro. * config/xtensa/xtensa.h (XTENSA_WINDOWED_HARD_FRAME_POINTER_REGNUM) (XTENSA_CALL0_HARD_FRAME_POINTER_REGNUM): New macros. (HARD_FRAME_POINTER_REGNUM): Define using XTENSA_*_HARD_FRAME_POINTER_REGNUM. (ELIMINABLE_REGS): Replace lines with HARD_FRAME_POINTER_REGNUM by lines with XTENSA_WINDOWED_HARD_FRAME_POINTER_REGNUM and XTENSA_CALL0_HARD_FRAME_POINTER_REGNUM.
2020-06-15 | RISC-V: Suppress warning for signed and unsigned integer comparison. | Kito Cheng | 1 file, -3/+3
gcc/ChangeLog: * config/riscv/riscv.c (riscv_gen_gpr_save_insn): Change type to unsigned for i. (riscv_gpr_save_operation_p): Change type to unsigned for i and len.
2020-06-15 | Optimize multiplication for V8QI,V16QI,V32QI under TARGET_AVX512BW. | liuhongt | 3 files, -2/+80
2020-06-13 Hongtao Liu <hongtao.liu@intel.com> gcc/ChangeLog: PR target/95488 * config/i386/i386-expand.c (ix86_expand_vecmul_qihi): New function. * config/i386/i386-protos.h (ix86_expand_vecmul_qihi): Declare. * config/i386/sse.md (mul<mode>3): Drop mask_name since there's no real simd int8 multiplication instruction with mask. Also optimize it under TARGET_AVX512BW. (mulv8qi3): New expander. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512bw-pr95488-1.c: New test. * gcc.target/i386/avx512bw-pr95488-2.c: Ditto. * gcc.target/i386/avx512vl-pr95488-1.c: Ditto. * gcc.target/i386/avx512vl-pr95488-2.c: Ditto.
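A minimal C loop that exercises the new expander: byte-wise multiplication, which previously had no real SIMD int8 multiply instruction to map to. With `-O2 -mavx512bw -mavx512vl` GCC can now vectorize this via widening QI-to-HI multiplies; the function name and driver values here are ours, not from the patch.

```c
#include <stdint.h>

/* Element-wise byte multiply; result wraps modulo 256 as usual for
   uint8_t.  This is the kind of loop ix86_expand_vecmul_qihi targets.  */
void mul_bytes (uint8_t *dst, const uint8_t *a, const uint8_t *b, int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = (uint8_t) (a[i] * b[i]);
}
```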
2020-06-11 | x86: Add UNSPECV_PATCHABLE_AREA | H.J. Lu | 6 files, -56/+175
Currently patchable area is at the wrong place. It is placed immediately after function label, before both .cfi_startproc and ENDBR. This patch adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and changes ENDBR insertion pass to also insert patchable area instruction. TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY is defined to avoid placing patchable area before .cfi_startproc and ENDBR. gcc/ PR target/93492 * config/i386/i386-features.c (rest_of_insert_endbranch): Renamed to ... (rest_of_insert_endbr_and_patchable_area): Change return type to void. Add need_endbr and patchable_area_size arguments. Don't call timevar_push nor timevar_pop. Replace endbr_queued_at_entrance with insn_queued_at_entrance. Insert UNSPECV_PATCHABLE_AREA for patchable area. (pass_data_insert_endbranch): Renamed to ... (pass_data_insert_endbr_and_patchable_area): This. Change pass name to endbr_and_patchable_area. (pass_insert_endbranch): Renamed to ... (pass_insert_endbr_and_patchable_area): This. Add need_endbr and patchable_area_size;. (pass_insert_endbr_and_patchable_area::gate): Set and check need_endbr and patchable_area_size. (pass_insert_endbr_and_patchable_area::execute): Call timevar_push and timevar_pop. Pass need_endbr and patchable_area_size to rest_of_insert_endbr_and_patchable_area. (make_pass_insert_endbranch): Renamed to ... (make_pass_insert_endbr_and_patchable_area): This. * config/i386/i386-passes.def: Replace pass_insert_endbranch with pass_insert_endbr_and_patchable_area. * config/i386/i386-protos.h (ix86_output_patchable_area): New. (make_pass_insert_endbranch): Renamed to ... (make_pass_insert_endbr_and_patchable_area): This. * config/i386/i386.c (ix86_asm_output_function_label): Set function_label_emitted to true. (ix86_print_patchable_function_entry): New function. (ix86_output_patchable_area): Likewise. (x86_function_profiler): Replace endbr_queued_at_entrance with insn_queued_at_entrance. Generate ENDBR only for TYPE_ENDBR. 
Call ix86_output_patchable_area to generate patchable area if needed. (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): New. * config/i386/i386.h (queued_insn_type): New. (machine_function): Add function_label_emitted. Replace endbr_queued_at_entrance with insn_queued_at_entrance. * config/i386/i386.md (UNSPECV_PATCHABLE_AREA): New. (patchable_area): New. gcc/testsuite/ PR target/93492 * gcc.target/i386/pr93492-1.c: New test. * gcc.target/i386/pr93492-2.c: Likewise. * gcc.target/i386/pr93492-3.c: Likewise. * gcc.target/i386/pr93492-4.c: Likewise. * gcc.target/i386/pr93492-5.c: Likewise.
2020-06-11 | Fix formatting in rs6000.c. | Martin Liska | 1 file, -1/+1
gcc/ChangeLog: * config/rs6000/rs6000.c (rs6000_density_test): Fix GNU coding style.
2020-06-11 | rs6000: skip debug info statements | Martin Liska | 1 file, -0/+3
gcc/ChangeLog: PR target/95627 * config/rs6000/rs6000.c (rs6000_density_test): Skip debug statements.
2020-06-10 | RISC-V: Unify the output asm pattern between gpr_save and gpr_restore pattern. | Kito Cheng | 4 files, -18/+3
gcc/ChangeLog: * config/riscv/riscv-protos.h (riscv_output_gpr_save): Remove. * config/riscv/riscv-sr.c (riscv_sr_match_prologue): Update value. * config/riscv/riscv.c (riscv_output_gpr_save): Remove. * config/riscv/riscv.md (gpr_save): Update output asm pattern.
2020-06-10 | RISC-V: Describe correct USEs for gpr_save pattern [PR95252] | Kito Cheng | 5 files, -5/+109
Verified on rv32emc/rv32gc/rv64gc bare-metal targets and rv32gc/rv64gc Linux targets with qemu. gcc/ChangeLog: * config/riscv/predicates.md (gpr_save_operation): New. * config/riscv/riscv-protos.h (riscv_gen_gpr_save_insn): New. (riscv_gpr_save_operation_p): Ditto. * config/riscv/riscv-sr.c (riscv_remove_unneeded_save_restore_calls): Ignore USEs for the gpr_save pattern. * config/riscv/riscv.c (gpr_save_reg_order): New. (riscv_expand_prologue): Use riscv_gen_gpr_save_insn to gen gpr_save. (riscv_gen_gpr_save_insn): New. (riscv_gpr_save_operation_p): Ditto. * config/riscv/riscv.md (S3_REGNUM): New. (S4_REGNUM): Ditto. (S5_REGNUM): Ditto. (S6_REGNUM): Ditto. (S7_REGNUM): Ditto. (S8_REGNUM): Ditto. (S9_REGNUM): Ditto. (S10_REGNUM): Ditto. (S11_REGNUM): Ditto. (gpr_save): Model USEs correctly. gcc/testsuite/ChangeLog: * gcc.target/riscv/pr95252.c: New.
2020-06-10 | aarch64: Fix an ICE in register_tuple_type [PR95523] | z00219097 | 2 files, -0/+5
When registering the tuple type in register_tuple_type, the TYPE_ALIGN (tuple_type) will be changed by -fpack-struct=n. We need to maintain natural alignment in handle_arm_sve_h. 2020-06-10 Haijian Zhang <z.zhanghaijian@huawei.com> gcc/ PR target/95523 * config/aarch64/aarch64-sve-builtins.h (sve_switcher::m_old_maximum_field_alignment): New member. * config/aarch64/aarch64-sve-builtins.cc (sve_switcher::sve_switcher): Save maximum_field_alignment in m_old_maximum_field_alignment and clear maximum_field_alignment. (sve_switcher::~sve_switcher): Restore maximum_field_alignment. gcc/testsuite/ PR target/95523 * gcc.target/aarch64/sve/pr95523.c: New test.
2020-06-10 | AArch64: Adjust costing of by element MUL to be the same as SAME3 MUL. | Tamar Christina | 1 file, -1/+17
The cost model currently treats multiplication by element as more expensive than a three-same multiplication. This means that if the value is on the SIMD side we add an unneeded DUP, and if the value is on the genreg side we use the more expensive DUP instead of fmov. This patch corrects the costs so that the two multiplies are costed the same, which allows us to generate

fmul v3.4s, v3.4s, v0.s[0]

instead of

dup v0.4s, v0.s[0]
fmul v3.4s, v3.4s, v0.4s

gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_rtx_mult_cost): Adjust costs for mul. gcc/testsuite/ChangeLog: * gcc.target/aarch64/asimd-mull-elem.c: New test.
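The kind of source loop affected is a vector scaled by one scalar: on AArch64 with `-O3` the vectorizer can now keep the scalar in a lane and use the by-element form of fmul. The function below is an illustrative driver of ours, not a testcase from the patch.

```c
/* Scale a float array by a scalar -- vectorizes to fmul-by-element
   on AArch64 once the by-element and three-same forms cost the same.  */
void scale (float *out, const float *in, float s, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = in[i] * s;
}
```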
2020-06-09 | PowerPC: Add future hwcap2 bits | Michael Meissner | 3 files, -1/+11
This patch adds support for the two new HWCAP2 fields used by the __builtin_cpu_supports function. It adds support in the target_clones attribute for -mcpu=future. The two new __builtin_cpu_supports tests are: __builtin_cpu_supports ("isa_3_1") __builtin_cpu_supports ("mma") The bits used are the bits that the Linux kernel engineers will be using for these new features. gcc/ 2020-06-05 Michael Meissner <meissner@linux.ibm.com> * config/rs6000/ppc-auxv.h (PPC_PLATFORM_FUTURE): Allocate 'future' PowerPC platform. (PPC_FEATURE2_ARCH_3_1): New HWCAP2 bit for ISA 3.1. (PPC_FEATURE2_MMA): New HWCAP2 bit for MMA. * config/rs6000/rs6000-call.c (cpu_supports_info): Add ISA 3.1 and MMA HWCAP2 bits. * config/rs6000/rs6000.c (CLONE_ISA_3_1): New clone support. (rs6000_clone_map): Add 'future' system target_clones support.
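A hedged sketch of how the new feature strings would be consumed. The `"isa_3_1"` and `"mma"` strings are only valid `__builtin_cpu_supports` arguments on PowerPC targets, so the use is guarded here with `__BUILTIN_CPU_SUPPORTS__` (the macro PowerPC GCC defines when the CPU built-ins are usable); the wrapper function is ours, not part of the patch.

```c
/* Runtime test for the MMA facility.  On non-PowerPC builds (or when
   the built-in is unavailable) we conservatively report "no".  */
int cpu_has_mma (void)
{
#if defined(__PPC__) && defined(__BUILTIN_CPU_SUPPORTS__)
  return __builtin_cpu_supports ("mma") != 0;
#else
  return 0;
#endif
}
```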
2020-06-09 | AArch64+SVE: Add support for unpacked unary ops and BIC | Joe Ramsay | 1 file, -19/+19
MD patterns extended for unary ops ABS, CLS, CLZ, CNT, NEG and NOT to support unpacked vectors. Also extended patterns for BIC to support unpacked vectors where input elements are of the same width. gcc/ChangeLog: 2020-06-09 Joe Ramsay <joe.ramsay@arm.com> * config/aarch64/aarch64-sve.md (<optab><mode>2): Add support for unpacked vectors. (@aarch64_pred_<optab><mode>): Add support for unpacked vectors. (@aarch64_bic<mode>): Enable unpacked BIC. (*bic<mode>3): Enable unpacked BIC. gcc/testsuite/ChangeLog: 2020-06-09 Joe Ramsay <joe.ramsay@arm.com> * gcc.target/aarch64/sve/logical_unpacked_abs.c: New test. * gcc.target/aarch64/sve/logical_unpacked_bic_1.c: New test. * gcc.target/aarch64/sve/logical_unpacked_bic_2.c: New test. * gcc.target/aarch64/sve/logical_unpacked_bic_3.c: New test. * gcc.target/aarch64/sve/logical_unpacked_bic_4.c: New test. * gcc.target/aarch64/sve/logical_unpacked_neg.c: New test. * gcc.target/aarch64/sve/logical_unpacked_not.c: New test.
2020-06-08 | AArch64: Expand on comment of stack-clash and implicit probing through LR. | Tamar Christina | 1 file, -1/+3
This expands the comment on an assert we have in aarch64_layout_frame and points to an existing comment somewhere else that has a much longer explanation of what's going on. Committed under the GCC Obvious rule. gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_layout_frame): Expand comments.
2020-06-08 | [arm] Fix vfp_operand_register for VFP HI regs | Christophe Lyon | 1 file, -1/+1
While looking at PR target/94743 I noticed an ICE when I tried to save all the FP registers: this was because all HI registers wouldn't match vfp_register_operand. gcc/ChangeLog: * config/arm/predicates.md (vfp_register_operand): Use VFP_HI_REGS instead of VFP_REGS.
2020-06-08 | rs6000: Replace FAIL with gcc_unreachable | Martin Liska | 1 file, -9/+9
gcc/ChangeLog: * config/rs6000/vector.md: Replace FAIL with gcc_unreachable in all vcond* patterns.
2020-06-07 | i386: Improve expansion of __builtin_parity | Uros Bizjak | 1 file, -53/+143
GCC currently hides the shift and xor reduction inside a backend-specific UNSPEC PARITY, making it invisible to the RTL optimizers until very late during compilation. It is normally reasonable for the middle-end to maintain wider mode representations for as long as possible and split them later, but this only helps if the semantics are visible at the RTL level (to combine and other passes); UNSPECs are black boxes, so in this case splitting early (during RTL expansion) is a better strategy. It turns out that the popcount instruction on modern x86_64 processors has (almost) made the integer parity flag in the x86 ALU completely obsolete, especially as POPCOUNT's integer semantics are a much better fit to RTL. The one remaining case where these transistors are useful is where __builtin_parity is immediately tested by a conditional branch, and therefore the result is wanted in a flags register rather than as an integer. This case is captured by two peephole2 optimizations in the attached patch. 2020-06-07 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog: * config/i386/i386.md (paritydi2, paritysi2): Expand reduction via shift and xor to an UNSPEC PARITY matching a parityhi2_cmp. (paritydi2_cmp, paritysi2_cmp): Delete these define_insn_and_split. (parityhi2, parityqi2): New expanders. (parityhi2_cmp): Implement set parity flag with xorb insn. (parityqi2_cmp): Implement set parity flag with testb insn. New peephole2s to use these insns (UNSPEC PARITY) when appropriate. gcc/testsuite/ChangeLog: * gcc.target/i386/parity-3.c: New test. * gcc.target/i386/parity-4.c: Likewise. * gcc.target/i386/parity-5.c: Likewise. * gcc.target/i386/parity-6.c: Likewise. * gcc.target/i386/parity-7.c: Likewise. * gcc.target/i386/parity-8.c: Likewise. * gcc.target/i386/parity-9.c: Likewise.
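The shift-and-xor reduction the commit talks about, written out in C for illustration (the patch itself only reduces down to HImode in RTL and leaves the final step to the parity flag; here we fold all the way for clarity):

```c
#include <stdint.h>

/* Fold a 32-bit word so the parity of all its bits lands in bit 0.
   Each step xors the top half onto the bottom half.  */
int parity32 (uint32_t v)
{
  v ^= v >> 16;
  v ^= v >> 8;
  v ^= v >> 4;
  v ^= v >> 2;
  v ^= v >> 1;
  return v & 1;
}
```

Because each xor preserves the total bit-parity while halving the relevant width, the result equals `__builtin_parity (v)`.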
2020-06-07 | rs6000: allow cunroll to grow size according to -funroll-loops or -fpeel-loops | guojiufu | 1 file, -0/+5
Previously, flag_unroll_loops was turned on implicitly at -O2. This also turned on cunroll with size growth allowed, and cunroll would then unroll/peel even complex loops like the code in PR95018. With this patch, size growth for cunroll is allowed only if -funroll-loops or -fpeel-loops or -O3 is specified explicitly. gcc/ChangeLog 2020-06-07 Jiufu Guo <guojiufu@linux.ibm.com> PR target/95018 * config/rs6000/rs6000.c (rs6000_option_override_internal): Override flag_cunroll_grow_size.
2020-06-05 | ix86: Improve __builtin_c[lt]z followed by extension [PR95535] | Jakub Jelinek | 1 file, -0/+86
In January I've added patterns to optimize SImode -> DImode sign or zero extension of __builtin_popcount, this patch does the same for __builtin_c[lt]z. Like most other instructions, the [tl]zcntl instructions clear the upper 32 bits of the destination register and as the instructions only result in values 0 to 32 inclusive, both sign and zero extensions behave the same. 2020-06-05 Jakub Jelinek <jakub@redhat.com> PR target/95535 * config/i386/i386.md (*ctzsi2_zext, *clzsi2_lzcnt_zext): New define_insn_and_split patterns. (*ctzsi2_zext_falsedep, *clzsi2_lzcnt_zext_falsedep): New define_insn patterns. * gcc.target/i386/pr95535-1.c: New test. * gcc.target/i386/pr95535-2.c: New test.
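Why the two extensions coincide, shown as a C helper (ours, not from the patch): a 32-bit count-trailing-zeros result lies in 0..31 (32 for the tzcnt encoding of zero), so widening it to 64 bits is the same whether done as a sign or zero extension, and after the patch the whole thing is a single tzcntl.

```c
#include <stdint.h>

/* SImode -> DImode widening of a ctz result.  __builtin_ctz is
   undefined for 0, so callers must pass a nonzero value.  */
uint64_t ctz_widened (uint32_t x)
{
  return (uint64_t) __builtin_ctz (x);
}
```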
2020-06-05 | Fix bitmask conflict between PTA_AVX512VP2INTERSECT and PTA_WAITPKG in gcc/config/i386/i386.h | Cui, Lili | 1 file, -1/+1
2020-06-05 Lili Cui <lili.cui@intel.com> gcc/ChangeLog: PR target/95525 * config/i386/i386.h (PTA_WAITPKG): Change bitmask value.
2020-06-04 | aarch64: PR target/95526: Fix gimplification of varargs | Richard Biener | 1 file, -0/+1
This patch fixes a latent bug exposed by eb72dc663e9070b281be83a80f6f838a3a878822 in the aarch64 backend that was causing wrong codegen and several testsuite failures. See the discussion on the bug for details. Bootstrapped and regtested on aarch64-linux-gnu. Cleaned up several failing tests and no new fails introduced. 2020-06-04 Richard Biener <rguenther@suse.de> gcc/: * config/aarch64/aarch64.c (aarch64_gimplify_va_arg_expr): Ensure that tmp_ha is marked TREE_ADDRESSABLE.
2020-06-04 | [ARM]: Correct the grouping of operands in MVE vector scatter store intrinsics (PR94735). | Srinath Parvathaneni | 2 files, -321/+513
The operands in the RTL patterns of the MVE vector scatter store intrinsics are wrongly grouped, because of which a few vector load and store instructions are wrongly optimized out with -O2. This patch defines a new predicate "mve_scatter_memory", which returns TRUE on matching (mem (reg)) for MVE scatter store intrinsics. The issue is fixed by adding a define_expand pattern with the "mve_scatter_memory" predicate, which calls the corresponding define_insn passing a register_operand as the first argument; this register_operand is extracted from the operand with the "mve_scatter_memory" predicate in the define_expand pattern. gcc/ChangeLog: 2020-06-01 Srinath Parvathaneni <srinath.parvathaneni@arm.com> PR target/94735 * config/arm/predicates.md (mve_scatter_memory): Define to match (mem (reg)) for scatter store memory. * config/arm/mve.md (mve_vstrbq_scatter_offset_<supf><mode>): Modify define_insn to define_expand. (mve_vstrbq_scatter_offset_p_<supf><mode>): Likewise. (mve_vstrhq_scatter_offset_<supf><mode>): Likewise. (mve_vstrhq_scatter_shifted_offset_p_<supf><mode>): Likewise. (mve_vstrhq_scatter_shifted_offset_<supf><mode>): Likewise. (mve_vstrdq_scatter_offset_p_<supf>v2di): Likewise. (mve_vstrdq_scatter_offset_<supf>v2di): Likewise. (mve_vstrdq_scatter_shifted_offset_p_<supf>v2di): Likewise. (mve_vstrdq_scatter_shifted_offset_<supf>v2di): Likewise. (mve_vstrhq_scatter_offset_fv8hf): Likewise. (mve_vstrhq_scatter_offset_p_fv8hf): Likewise. (mve_vstrhq_scatter_shifted_offset_fv8hf): Likewise. (mve_vstrhq_scatter_shifted_offset_p_fv8hf): Likewise. (mve_vstrwq_scatter_offset_fv4sf): Likewise. (mve_vstrwq_scatter_offset_p_fv4sf): Likewise. (mve_vstrwq_scatter_offset_p_<supf>v4si): Likewise. (mve_vstrwq_scatter_offset_<supf>v4si): Likewise. (mve_vstrwq_scatter_shifted_offset_fv4sf): Likewise. (mve_vstrwq_scatter_shifted_offset_p_fv4sf): Likewise. (mve_vstrwq_scatter_shifted_offset_p_<supf>v4si): Likewise.
(mve_vstrwq_scatter_shifted_offset_<supf>v4si): Likewise. (mve_vstrbq_scatter_offset_<supf><mode>_insn): Define insn for scatter stores. (mve_vstrbq_scatter_offset_p_<supf><mode>_insn): Likewise. (mve_vstrhq_scatter_offset_<supf><mode>_insn): Likewise. (mve_vstrhq_scatter_shifted_offset_p_<supf><mode>_insn): Likewise. (mve_vstrhq_scatter_shifted_offset_<supf><mode>_insn): Likewise. (mve_vstrdq_scatter_offset_p_<supf>v2di_insn): Likewise. (mve_vstrdq_scatter_offset_<supf>v2di_insn): Likewise. (mve_vstrdq_scatter_shifted_offset_p_<supf>v2di_insn): Likewise. (mve_vstrdq_scatter_shifted_offset_<supf>v2di_insn): Likewise. (mve_vstrhq_scatter_offset_fv8hf_insn): Likewise. (mve_vstrhq_scatter_offset_p_fv8hf_insn): Likewise. (mve_vstrhq_scatter_shifted_offset_fv8hf_insn): Likewise. (mve_vstrhq_scatter_shifted_offset_p_fv8hf_insn): Likewise. (mve_vstrwq_scatter_offset_fv4sf_insn): Likewise. (mve_vstrwq_scatter_offset_p_fv4sf_insn): Likewise. (mve_vstrwq_scatter_offset_p_<supf>v4si_insn): Likewise. (mve_vstrwq_scatter_offset_<supf>v4si_insn): Likewise. (mve_vstrwq_scatter_shifted_offset_fv4sf_insn): Likewise. (mve_vstrwq_scatter_shifted_offset_p_fv4sf_insn): Likewise. (mve_vstrwq_scatter_shifted_offset_p_<supf>v4si_insn): Likewise. (mve_vstrwq_scatter_shifted_offset_<supf>v4si_insn): Likewise. gcc/testsuite/ChangeLog: 2020-06-01 Srinath Parvathaneni <srinath.parvathaneni@arm.com> PR target/94735 * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_base.c: New test. * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_base_p.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_offset.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_offset_p.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_shifted_offset.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_shifted_offset_p.c: Likewise.
2020-06-04 | [PATCH][GCC] arm: Fix the MVE ACLE vbicq intrinsics. | Srinath Parvathaneni | 1 file, -16/+16
The following MVE intrinsic testcases are failing in the GCC testsuite. Directory: gcc.target/arm/mve/intrinsics/ Testcases: vbicq_f16.c, vbicq_f32.c, vbicq_s16.c, vbicq_s32.c, vbicq_s8.c, vbicq_u16.c, vbicq_u32.c and vbicq_u8.c. This patch fixes the vbicq intrinsics by modifying the intrinsic parameters and polymorphic variants in the "arm_mve.h" header file. gcc/ChangeLog: 2020-05-20 Srinath Parvathaneni <srinath.parvathaneni@arm.com> * config/arm/arm_mve.h (__arm_vbicq_n_u16): Correct the intrinsic arguments. (__arm_vbicq_n_s16): Likewise. (__arm_vbicq_n_u32): Likewise. (__arm_vbicq_n_s32): Likewise. (__arm_vbicq): Modify polymorphic variant. gcc/testsuite/ChangeLog: 2020-05-20 Srinath Parvathaneni <srinath.parvathaneni@arm.com> * gcc.target/arm/mve/intrinsics/vbicq_f16.c: Modify. * gcc.target/arm/mve/intrinsics/vbicq_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_n_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_n_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_n_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_n_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_u8.c: Likewise.
2020-06-04 | Fix zero-masking for vcvtps2ph when dest operand is memory. | liuhongt | 2 files, -4/+40
When dest is memory, zero-masking is not valid, only merging-masking is available, 2020-06-24 Hongtao Liu <hongtao.liu@inte.com> gcc/ChangeLog: PR target/95254 * config/i386/sse.md (*vcvtps2ph_store<merge_mask_name>): Refine from *vcvtps2ph_store<mask_name>. (vcvtps2ph256<mask_name>): Refine constraint from vm to v. (<mask_codefor>avx512f_vcvtps2ph512<mask_name>): Ditto. (*vcvtps2ph256<merge_mask_name>): New define_insn. (*avx512f_vcvtps2ph512<merge_mask_name>): Ditto. * config/i386/subst.md (merge_mask): New define_subst. (merge_mask_name): New define_subst_attr. (merge_mask_operand3): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512f-vcvtps2ph-pr95254.c: New test. * gcc.target/i386/avx512vl-vcvtps2ph-pr95254.c: Ditto.
2020-06-04 | Fix missing assemble_external in ASM_OUTPUT_FDESC | Andreas Schwab | 1 file, -0/+1
When TARGET_VTABLE_USES_DESCRIPTORS is defined then function pointers in the vtable are output by ASM_OUTPUT_FDESC. The only current user of this is ia64, but its implementation of ASM_OUTPUT_FDESC lacks a call to assemble_external. Thus if there is no other reference to the function the weak declaration for it will be missing. PR target/95154 * config/ia64/ia64.h (ASM_OUTPUT_FDESC): Call assemble_external.
2020-06-04 | Fix uppercase in trunc<mode><pmov_dst_3>2. | liuhongt | 1 file, -1/+3
2020-06-04 Hongtao.liu <hongtao.liu@intel.com> gcc/ChangeLog: * config/i386/sse.md (pmov_dst_3_lower): New mode attribute. (trunc<mode><pmov_dst_3_lower>2): Refine from trunc<mode><pmov_dst_3>2. gcc/testsuite * gcc.target/i386/pr92658-avx512bw-trunc.c: Adjust testcase.
2020-06-03 | identify lfs prefixed case PR95347 | Aaron Sawdey | 1 file, -12/+25
The same problem also arises for plfs where prefixed_load_p() doesn't recognize it so we get just lfs in the asm output with an @pcrel address. PR target/95347 * config/rs6000/rs6000.c (is_stfs_insn): Rename to is_lfs_stfs_insn and make it recognize lfs as well. (prefixed_store_p): Use is_lfs_stfs_insn(). (prefixed_load_p): Use is_lfs_stfs_insn() to recognize lfs.
2020-06-02 | aarch64: Fix an ICE in aarch64_short_vector_p [PR95459] | Fei Yang | 1 file, -1/+2
In aarch64_short_vector_p, we are simply checking whether a type (and a mode) is a 64/128-bit short vector or not. This should not be affected by the value of TARGET_SVE. Simply leave later code to report an error if SVE is disabled. 2020-06-02 Felix Yang <felix.yang@huawei.com> gcc/ PR target/95459 * config/aarch64/aarch64.c (aarch64_short_vector_p): Leave later code to report an error if SVE is disabled. gcc/testsuite/ PR target/95459 * gcc.target/aarch64/mgeneral-regs_6.c: New test.
2020-06-02 | aarch64: Add initial support for -mcpu=zeus | Kyrylo Tkachov | 2 files, -1/+4
This patch adds support for the Arm Zeus CPU. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ 2020-06-02 Kyrylo Tkachov <kyrylo.tkachov@arm.com> * config/aarch64/aarch64-cores.def (zeus): Define. * config/aarch64/aarch64-tune.md: Regenerate. * doc/invoke.texi (AArch64 Options): Document zeus -mcpu option.
2020-06-02 | Correctly identify stfs if prefixed | Aaron Sawdey | 1 file, -1/+54
Because reg_to_non_prefixed() only looks at the register being used, it doesn't get the right answer for stfs, which leads to us not seeing that it has a PCREL symbol ref. This patch works around this by introducing a helper function that inspects the insn to see if it is in fact a stfs. Then if we use NON_PREFIXED_DEFAULT, address_to_insn_form() can see that it has the PCREL symbol ref. gcc/ChangeLog: PR target/95347 * config/rs6000/rs6000.c (prefixed_store_p): Add special case for stfs. (is_stfs_insn): New helper function.
2020-06-02 | amdgcn: Remove -mlocal-symbol-id option | Andrew Stubbs | 3 files, -22/+2
This patch removes the obsolete -mlocal-symbol-id option. This was used to control mangling of local symbol names in order to work around a ROCm runtime bug, but that has not been needed in some time, and the mangling was removed already. gcc/ChangeLog: * config/gcn/gcn-hsa.h (CC1_SPEC): Delete. * config/gcn/gcn.opt (-mlocal-symbol-id): Delete. * config/gcn/mkoffload.c (main): Don't use -mlocal-symbol-id. gcc/testsuite/ChangeLog: * gcc.dg/intermod-1.c: Don't use -mlocal-symbol-id.
2020-06-02 | S/390: Emit vector alignment hints for z13 | Stefan Schulze Frielinghaus | 1 file, -1/+1
2020-06-02 Stefan Schulze Frielinghaus <stefansf@linux.ibm.com> gcc/ChangeLog: * config/s390/s390.c (print_operand): Emit vector alignment hints for z13. gcc/testsuite/ChangeLog: * gcc.target/s390/vector/align-1.c: Change target architecture to z13. * gcc.target/s390/vector/align-2.c: Change target architecture to z13.
2020-05-30 | Disable brabc/brabs patterns as their length computation is horribly broken and leads to incorrect code generation. | Jeff Law | 1 file, -2/+12
* config/h8300/jumpcall.md (brabs, brabc): Disable patterns.
2020-05-30 | RISC-V: Optimize si to di zero-extend followed by left shift. | Jim Wilson | 1 file, -0/+22
This is potentially a sequence of 3 shifts, which we optimize to a sequence of 2 shifts. This can happen when unsigned int is used for array indexing. gcc/ * config/riscv/riscv.md (zero_extendsidi2_shifted): New. gcc/testsuite/ * gcc.target/riscv/zero-extend-5.c: New.
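The source pattern in question looks like this (illustrative function of ours, not the committed testcase): indexing a 64-bit array with an unsigned 32-bit index on RV64 needs a zero-extension of the index followed by a left shift for the element size, which naively is three shifts and after the patch is two.

```c
#include <stdint.h>

/* p[i] on RV64: i must be zero-extended from 32 to 64 bits, then
   shifted left by 3 to form the byte offset.  */
int64_t load_elem (const int64_t *p, uint32_t i)
{
  return p[i];
}
```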
2020-05-30 | gcc/config/i386/mingw32.h: Ensure `-lmsvcrt` precedes `-lkernel32` | Jonathan Yong | 1 file, -1/+1
This is necessary as libmsvcrt.a is not a pure import library, but also contains some functions that invoke others in KERNEL32.DLL. gcc/ * config/i386/mingw32.h (REAL_LIBGCC_SPEC): Insert -lkernel32 after -lmsvcrt. This is necessary as libmsvcrt.a is not a pure import library, but also contains some functions that invoke others in KERNEL32.DLL. Signed-off-by: Liu Hao <lh_mouse@126.com> Signed-off-by: Jonathan Yong <10walls@gmail.com>
2020-05-29 | rs6000: Prefer VSX insns over VMX ones (part 1: perm and mrg) | Segher Boessenkool | 1 file, -52/+52
There are various VSX insns that do the same job as (older) AltiVec insns, just with a wider range of possible registers. Many patterns for such insns have the "v" alternative before the "wa" alternative, which makes the output less readable than possible (since vs32 is v0, and most insns before or after this insn will be VSX as well). This changes the define_insns for the mrg and perm machine instructions to prefer the VSX form. No behaviour change. Only one testcase needed a little adjustment as well. 2020-05-29 Segher Boessenkool <segher@kernel.crashing.org> * config/rs6000/altivec.md (altivec_vmrghw_direct): Prefer VSX form. (altivec_vmrglw_direct): Ditto. (altivec_vperm_<mode>_direct): Ditto. (altivec_vperm_v8hiv16qi): Ditto. (*altivec_vperm_<mode>_uns_internal): Ditto. (*altivec_vpermr_<mode>_internal): Ditto. (vperm_v8hiv4si): Ditto. (vperm_v16qiv8hi): Ditto. gcc/testsuite/ * gcc.target/powerpc/vsx-vector-6.p9.c: Allow xxperm as perm as well.
2020-05-29 | amdgcn: Fix VCC early clobber | Andrew Stubbs | 1 file, -16/+16
gcc/ChangeLog: 2020-05-28 Andrew Stubbs <ams@codesourcery.com> * config/gcn/gcn-valu.md (add<mode>3_vcc_zext_dup): Add early clobber. (add<mode>3_vcc_zext_dup_exec): Likewise. (add<mode>3_vcc_zext_dup2): Likewise. (add<mode>3_vcc_zext_dup2_exec): Likewise.
2020-05-29 | aarch64: add support for unpacked EOR, ORR and AND | Joe Ramsay | 1 file, -4/+4
Extended patterns for these instructions to support unpacked vectors. BIC will have to wait, as there is currently no support for unpacked NOT.

2020-05-29  Joe Ramsay  <joe.ramsay@arm.com>

gcc/
	* config/aarch64/aarch64-sve.md (<LOGICAL:optab><mode>3): Add support for unpacked EOR, ORR, AND.

gcc/testsuite/
	* gcc.target/aarch64/sve/load_const_offset_2.c: Force using packed vectors.
	* gcc.target/aarch64/sve/logical_unpacked_and_1.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_and_2.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_and_3.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_and_4.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_and_5.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_and_6.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_and_7.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_eor_1.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_eor_2.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_eor_3.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_eor_4.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_eor_5.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_eor_6.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_eor_7.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_orr_1.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_orr_2.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_orr_3.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_orr_4.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_orr_5.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_orr_6.c: New test.
	* gcc.target/aarch64/sve/logical_unpacked_orr_7.c: New test.
	* gcc.target/aarch64/sve/scatter_store_6.c: Force using packed vectors.
	* gcc.target/aarch64/sve/scatter_store_7.c: Force using packed vectors.
	* gcc.target/aarch64/sve/strided_load_3.c: Force using packed vectors.
	* gcc.target/aarch64/sve/strided_store_3.c: Force using packed vectors.
	* gcc.target/aarch64/sve/unpack_signed_1.c: Force using packed vectors.
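As a rough illustration of where unpacked vectors come from (a sketch under assumption, not one of the new testcases): when 32-bit elements are widened to 64 bits in the same loop, the vectorizer can keep them in double-width containers, where EOR/ORR/AND can now operate directly.

```c
#include <stdint.h>

/* Hypothetical sketch: the XOR runs on 32-bit elements that are then
   widened to 64 bits, so SVE can hold them "unpacked" in 64-bit
   containers rather than repacking between the two operations.  */
void eor_widen(uint64_t *dst, const uint32_t *src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = (uint64_t)(src[i] ^ 0xffu);
}
```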
2020-05-28Finish prior patchJeff Law1-8/+0
* config/h8300/logical.md (bclrhi_msx): Remove pattern.
2020-05-28Fix incorrect code generation with bit insns on H8/SX.Jeff Law1-19/+17
	* config/h8300/logical.md (HImode H8/SX bit-and splitter): Don't make a nonzero adjustment to the memory offset.
	(b<ior,xor>hi_msx): Turn into a splitter.
2020-05-28aarch64: Fix missed shrink-wrapping opportunityRichard Sandiford2-0/+25
wb_candidate1 and wb_candidate2 exist for two overlapping cases: when we use an STR or STP with writeback to allocate the frame, and when we set up a frame chain record (either using writeback allocation or not). However, aarch64_layout_frame was leaving these fields with legitimate register numbers even if we decided to do neither of those things. This prevented those registers from being shrink-wrapped, even though we were otherwise treating them as normal saves and restores.

The case this patch handles isn't the common case, so it might not be worth going out of our way to optimise it. But I think the patch actually makes the output of aarch64_layout_frame more consistent.

2020-05-28  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/aarch64.h (aarch64_frame): Add a comment above wb_candidate1 and wb_candidate2.
	* config/aarch64/aarch64.c (aarch64_layout_frame): Invalidate wb_candidate1 and wb_candidate2 if we decided not to use them.

gcc/testsuite/
	* gcc.target/aarch64/shrink_wrap_1.c: New test.
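A hypothetical shape of the case being optimised (function names invented for illustration, not taken from shrink_wrap_1.c): registers saved only for the sake of a cold path can be shrink-wrapped into that path instead of being saved unconditionally.

```c
int helper(int x) { return x * 2; }   /* stands in for any real call */

/* Hypothetical sketch: the cold path needs a frame and callee-saves
   for its calls; the hot path needs none, so the saves can be sunk
   into the cold block once the writeback candidates are invalidated.  */
int maybe_call(int x)
{
    if (__builtin_expect(x != 0, 0))
        return helper(x) + helper(x + 1);  /* cold: calls, needs saves */
    return x + 1;                          /* hot: no saves needed */
}
```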
2020-05-28aarch64: Fix segfault in aarch64_expand_epilogue [PR95361]Richard Sandiford1-1/+5
The stack frame for the function in the testcase consisted of two SVE save slots. Both saves had been shrink-wrapped, but for different blocks, meaning that the stack allocation and deallocation were separate from the saves themselves. Before emitting the deallocation, we tried to attach a REG_CFA_DEF_CFA note to the preceding instruction, to redefine the CFA in terms of the stack pointer. But in this case there was no preceding instruction.

This in practice only happens for SVE because:

(a) We don't try to shrink-wrap wb_candidate* registers even when we've decided to treat them as normal saves and restores. I have a fix for that.

(b) Even with (a) fixed, we're (almost?) guaranteed to emit a stack tie for frames that are 64k or larger, so we end up hanging the REG_CFA_DEF_CFA note on that instead.

We should only need to redefine the CFA if it was previously defined in terms of the frame pointer. In other cases the CFA should already be defined in terms of the stack pointer, so redefining it is unnecessary but usually harmless.

2020-05-28  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	PR testsuite/95361
	* config/aarch64/aarch64.c (aarch64_expand_epilogue): Assert that we have at least some CFI operations when using a frame pointer. Only redefine the CFA if we have CFI operations.

gcc/testsuite/
	PR testsuite/95361
	* gcc.target/aarch64/sve/pr95361.c: New test.
2020-05-28arm: Fix unwanted fall-throughs in arm.cAndrea Corallo1-0/+6
gcc/ChangeLog

2020-05-28  Andrea Corallo  <andrea.corallo@arm.com>

	* config/arm/arm.c (mve_vector_mem_operand): Fix unwanted fall-throughs.
2020-05-28Fix nonconforming memory_operand for vpmovq{d,w,b}/vpmovd{w,b}/vpmovwb.liuhongt7-283/+421
According to the Intel SDM, VPMOVQB xmm1/m16 {k1}{z}, xmm2 has a 16-bit memory operand, not the 128-bit one used in the current implementation. The same applies to the other vpmov instructions whose memory operands are narrower than 128 bits.

2020-05-25  Hongtao Liu  <hongtao.liu@intel.com>

gcc/ChangeLog
	* config/i386/sse.md (*avx512vl_<code>v2div2qi2_store_1): Rename from *avx512vl_<code>v2div2qi_store and refine memory size of the pattern.
	(*avx512vl_<code>v2div2qi2_mask_store_1): Ditto.
	(*avx512vl_<code><mode>v4qi2_store_1): Ditto.
	(*avx512vl_<code><mode>v4qi2_mask_store_1): Ditto.
	(*avx512vl_<code><mode>v8qi2_store_1): Ditto.
	(*avx512vl_<code><mode>v8qi2_mask_store_1): Ditto.
	(*avx512vl_<code><mode>v4hi2_store_1): Ditto.
	(*avx512vl_<code><mode>v4hi2_mask_store_1): Ditto.
	(*avx512vl_<code>v2div2hi2_store_1): Ditto.
	(*avx512vl_<code>v2div2hi2_mask_store_1): Ditto.
	(*avx512vl_<code>v2div2si2_store_1): Ditto.
	(*avx512vl_<code>v2div2si2_mask_store_1): Ditto.
	(*avx512f_<code>v8div16qi2_store_1): Ditto.
	(*avx512f_<code>v8div16qi2_mask_store_1): Ditto.
	(*avx512vl_<code>v2div2qi2_store_2): New define_insn_and_split.
	(*avx512vl_<code>v2div2qi2_mask_store_2): Ditto.
	(*avx512vl_<code><mode>v4qi2_store_2): Ditto.
	(*avx512vl_<code><mode>v4qi2_mask_store_2): Ditto.
	(*avx512vl_<code><mode>v8qi2_store_2): Ditto.
	(*avx512vl_<code><mode>v8qi2_mask_store_2): Ditto.
	(*avx512vl_<code><mode>v4hi2_store_2): Ditto.
	(*avx512vl_<code><mode>v4hi2_mask_store_2): Ditto.
	(*avx512vl_<code>v2div2hi2_store_2): Ditto.
	(*avx512vl_<code>v2div2hi2_mask_store_2): Ditto.
	(*avx512vl_<code>v2div2si2_store_2): Ditto.
	(*avx512vl_<code>v2div2si2_mask_store_2): Ditto.
	(*avx512f_<code>v8div16qi2_store_2): Ditto.
	(*avx512f_<code>v8div16qi2_mask_store_2): Ditto.
	* config/i386/i386-builtin-types.def: Adjust builtin type.
	* config/i386/i386-expand.c: Ditto.
	* config/i386/i386-builtin.def: Adjust builtin.
	* config/i386/avx512fintrin.h: Ditto.
	* config/i386/avx512vlbwintrin.h: Ditto.
	* config/i386/avx512vlintrin.h: Ditto.
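A scalar model of what the SDM operand size implies (the function name is invented; this only illustrates the store semantics, not the GCC pattern): an unmasked VPMOVQB truncates two 64-bit lanes to bytes and writes only 16 bits of memory, so the store pattern must not claim a 128-bit memory operand.

```c
/* Hypothetical scalar model of an unmasked VPMOVQB store: only
   dst[0..1] (16 bits of memory) are written; the old pattern's
   128-bit memory operand wrongly implied dst[2..15] were touched.  */
void vpmovqb_store_model(unsigned char *dst,
                         const unsigned long long src[2])
{
    dst[0] = (unsigned char)(src[0] & 0xff);
    dst[1] = (unsigned char)(src[1] & 0xff);
}
```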
2020-05-27gcc: xtensa: delegitimize UNSPEC_PLTMax Filippov1-0/+24
This fixes 'non-delegitimized UNSPEC 3 found in variable location' notes issued when building libraries, which interfere with running tests.

2020-05-27  Max Filippov  <jcmvbkbc@gmail.com>

gcc/
	* config/xtensa/xtensa.c (xtensa_delegitimize_address): New function.
	(TARGET_DELEGITIMIZE_ADDRESS): New macro.