aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-06-26d: Merge upstream dmd, druntime a45f4e9f43, phobos 106038f2e.Iain Buclaw46-207/+435
D front-end changes: - Import dmd v2.103.1. - Deprecated invalid special token sequences inside token strings. D runtime changes: - Import druntime v2.103.1. Phobos changes: - Import phobos v2.103.1. gcc/d/ChangeLog: * dmd/MERGE: Merge upstream dmd a45f4e9f43. * dmd/VERSION: Bump version to v2.103.1. libphobos/ChangeLog: * libdruntime/MERGE: Merge upstream druntime a45f4e9f43. * src/MERGE: Merge upstream phobos 106038f2e.
2023-06-25RISC-V: Optimize VSETVL codegen of SELECT_VL with LEN_MASK_{LOAD, STORE}Juzhe-Zhong4-4/+76
This patch is depending on LEN_MASK_{LOAD,STORE} patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622742.html After enabling the LEN_MASK_{LOAD,STORE}, I notice that there is a case that VSETVL PASS need to be optimized: void f (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict cond, int n) { for (int i = 0; i < 8; i++) if (cond[i]) a[i] = b[i]; } Before this patch: f: vsetivli a5,8,e8,mf4,tu,mu --> Propagate "8" to the following vsetvl vsetvli zero,a5,e32,m1,ta,ma vle32.v v0,0(a2) vsetvli a6,zero,e32,m1,ta,ma li a3,8 vmsne.vi v0,v0,0 vsetvli zero,a5,e32,m1,ta,ma vle32.v v1,0(a1),v0.t vse32.v v1,0(a0),v0.t sub a4,a3,a5 beq a3,a5,.L6 slli a5,a5,2 add a2,a2,a5 add a1,a1,a5 add a0,a0,a5 vsetvli a5,a4,e8,mf4,tu,mu --> Propagate "a4" to the following vsetvl vsetvli zero,a5,e32,m1,ta,ma vle32.v v0,0(a2) vsetvli a6,zero,e32,m1,ta,ma vmsne.vi v0,v0,0 vsetvli zero,a5,e32,m1,ta,ma vle32.v v1,0(a1),v0.t vse32.v v1,0(a0),v0.t .L6: ret Current VSETLV PASS only enable AVL propagation of VLMAX AVL ("zero"). Now, we enable AVL propagation of immediate && conservative non-VLMAX. After this patch: f: vsetivli a5,8,e8,mf4,ta,ma vle32.v v0,0(a2) vsetvli a6,zero,e32,m1,ta,ma li a3,8 vmsne.vi v0,v0,0 vsetivli zero,8,e32,m1,ta,ma vle32.v v1,0(a1),v0.t vse32.v v1,0(a0),v0.t sub a4,a3,a5 beq a3,a5,.L6 slli a5,a5,2 vsetvli a4,a4,e8,mf4,ta,ma add a2,a2,a5 vle32.v v0,0(a2) add a1,a1,a5 vsetvli a6,zero,e32,m1,ta,ma add a0,a0,a5 vmsne.vi v0,v0,0 vsetvli zero,a4,e32,m1,ta,ma vle32.v v1,0(a1),v0.t vse32.v v1,0(a0),v0.t .L6: ret gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (vector_insn_info::parse_insn): Ehance AVL propagation. * config/riscv/riscv-vsetvl.h: New function. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/partial/select_vl-1.c: Add dump checks. * gcc.target/riscv/rvv/autovec/partial/select_vl-2.c: New test.
2023-06-25RISC-V: fix expand function of vlmul_ext RVV intrinsicLi Xu2-1/+9
Consider this following case: void test_vlmul_ext_v_i8mf8_i8mf4(vint8mf8_t op1) { vint8mf4_t res = __riscv_vlmul_ext_v_i8mf8_i8mf4(op1); } Compilation fails with: test.c: In function 'test_vlmul_ext_v_i8mf8_i8mf4': test.c:5:1: error: unrecognizable insn: 5 | } | ^ (insn 30 29 0 2 (set (mem/c:VNx2QI (reg/f:DI 143) [0 x+0 S[2, 2] A32]) (mem/c:VNx2QI (reg/f:DI 148) [0 op1+0 S[2, 2] A16])) "test.c":4:18 -1 (nil)) during RTL pass: vregs test.c:5:1: internal compiler error: in extract_insn, at recog.cc:2791 0x7c61b8 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*) ../.././riscv-gcc/gcc/rtl-error.cc:108 0x7c61d7 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*) ../.././riscv-gcc/gcc/rtl-error.cc:116 0xed58a7 extract_insn(rtx_insn*) ../.././riscv-gcc/gcc/recog.cc:2791 0xb7f789 instantiate_virtual_regs_in_insn ../.././riscv-gcc/gcc/function.cc:1611 0xb7f789 instantiate_virtual_regs ../.././riscv-gcc/gcc/function.cc:1984 gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc: change emit_insn to emit_move_insn gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/vlmul_ext-2.c: New test.
2023-06-25RISC-V: Enable len_mask{load, store} and remove len_{load, store}Juzhe-Zhong12-15/+346
This patch enable len_mask_{load,store} to support flow-control in RVV auto-vectorization. Consider this following case: void f (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict cond, int n) { for (int i = 0; i < n; i++) if (cond[i]) a[i] = b[i]; } Before this patch: <source>:9:21: missed: couldn't vectorize loop <source>:9:21: missed: not vectorized: control flow in loop. After this patch: f: ble a3,zero,.L5 .L3: vsetvli a5,a3,e32,m1,ta,ma vle32.v v0,0(a2) vsetvli a6,zero,e32,m1,ta,ma slli a4,a5,2 vmsne.vi v0,v0,0 sub a3,a3,a5 vsetvli zero,a5,e32,m1,ta,ma vle32.v v1,0(a1),v0.t vse32.v v1,0(a0),v0.t add a2,a2,a4 add a1,a1,a4 add a0,a0,a4 bne a3,zero,.L3 .L5: ret gcc/ChangeLog: * config/riscv/autovec.md (len_load_<mode>): Remove. (len_maskload<mode><vm>): Remove. (len_store_<mode>): New pattern. (len_maskstore<mode><vm>): New pattern. * config/riscv/predicates.md (autovec_length_operand): New predicate. * config/riscv/riscv-protos.h (enum insn_type): New enum. (expand_load_store): New function. * config/riscv/riscv-v.cc (emit_vlmax_masked_insn): Ditto. (emit_nonvlmax_masked_insn): Ditto. (expand_load_store): Ditto. * config/riscv/riscv-vector-builtins.cc (function_expander::use_contiguous_store_insn): Add avl_type operand into pred_store. * config/riscv/vector.md: Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.c: New test. * gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.h: New test. * gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.c: New test. * gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.h: New test. * gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-2.c: New test. * gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-3.c: New test.
2023-06-25internal-fn: Fix bug of BIAS argument indexJu-Zhe Zhong1-1/+1
When trying to enable LEN_MASK_{LOAD,STORE} in RISC-V port, I found I made a mistake in case of argument index of BIAS. This patch is an obvious fix. gcc/ChangeLog: * internal-fn.cc (expand_partial_store_optab_fn): Fix bug of BIAS argument index.
2023-06-25MAINTAINERS: Add myself to write after approvalLehua Ding1-0/+1
ChangeLog: * MAINTAINERS: Add Lehua Ding to write after approval
2023-06-25configure, Darwin: Ensure overrides to host-pie are passed to gcc configure.Iain Sandoe4-35/+89
The latest versions of Darwin on the Aarch64 platform mandate PIE executables. On x86_64 it remains optional, but produces tool warnings after Darwin20, so we default to PIE executables there too. All (non-PowerPC) 64b Darwin platforms mandate PIC code and therefore force host_shared on (we issue a diagnostic if the user tries to configure them non-shared). However, this also means we cannot test the host_shared setting independently of the host_pie setting so that the logic for setting PICFLAG must be amended for Darwin. For Darwin versions required to have PIE executables, in the event that the user tries to configure these as --disable-host-pie, we issue a warning and override the setting. These versions must also switch host_pie on even if it is not given in the configure line. To cater for this we pass the current value of host_pie, as determined by top-level configure, to the GCC configure. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> ChangeLog: * Makefile.def: Pass the enable-host-pie value to GCC configure. * Makefile.in: Regenerate. * configure: Regenerate. * configure.ac: Adjust the logic for shared and PIE host flags to ensure that PIE is passed for hosts that require it.
2023-06-25Revert "RISC-V:Add float16 tuple type abi"Pan Li9-630/+17
This reverts commit f9ab5d62c94547499de52c800ab914cc8e802212 due to the bootstrap failure on machine mode out of range memory access. Signed-off-by: Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/vector.md: Revert. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-10.c: Revert. * gcc.target/riscv/rvv/base/abi-11.c: Ditto. * gcc.target/riscv/rvv/base/abi-12.c: Ditto. * gcc.target/riscv/rvv/base/abi-15.c: Ditto. * gcc.target/riscv/rvv/base/abi-8.c: Ditto. * gcc.target/riscv/rvv/base/abi-9.c: Ditto. * gcc.target/riscv/rvv/base/abi-17.c: Ditto. * gcc.target/riscv/rvv/base/abi-18.c: Ditto.
2023-06-25Revert "RISC-V:Add float16 tuple type support"Pan Li12-366/+3
This reverts commit 8a96f240d71d367a2955ab9e0f0fef3a0b0e2a74 due to bootstrap failure on mode out of range access, will commit this patch after the issue addressed. gcc/ChangeLog: * config/riscv/genrvv-type-indexer.cc (valid_type): Revert changes. * config/riscv/riscv-modes.def (RVV_TUPLE_MODES): Ditto. (ADJUST_ALIGNMENT): Ditto. (RVV_TUPLE_PARTIAL_MODES): Ditto. (ADJUST_NUNITS): Ditto. * config/riscv/riscv-vector-builtins-types.def (vfloat16mf4x2_t): Ditto. (vfloat16mf4x3_t): Ditto. (vfloat16mf4x4_t): Ditto. (vfloat16mf4x5_t): Ditto. (vfloat16mf4x6_t): Ditto. (vfloat16mf4x7_t): Ditto. (vfloat16mf4x8_t): Ditto. (vfloat16mf2x2_t): Ditto. (vfloat16mf2x3_t): Ditto. (vfloat16mf2x4_t): Ditto. (vfloat16mf2x5_t): Ditto. (vfloat16mf2x6_t): Ditto. (vfloat16mf2x7_t): Ditto. (vfloat16mf2x8_t): Ditto. (vfloat16m1x2_t): Ditto. (vfloat16m1x3_t): Ditto. (vfloat16m1x4_t): Ditto. (vfloat16m1x5_t): Ditto. (vfloat16m1x6_t): Ditto. (vfloat16m1x7_t): Ditto. (vfloat16m1x8_t): Ditto. (vfloat16m2x2_t): Ditto. (vfloat16m2x3_t): Diito. (vfloat16m2x4_t): Diito. (vfloat16m4x2_t): Diito. * config/riscv/riscv-vector-builtins.def (vfloat16mf4x2_t): Ditto. (vfloat16mf4x3_t): Ditto. (vfloat16mf4x4_t): Ditto. (vfloat16mf4x5_t): Ditto. (vfloat16mf4x6_t): Ditto. (vfloat16mf4x7_t): Ditto. (vfloat16mf4x8_t): Ditto. (vfloat16mf2x2_t): Ditto. (vfloat16mf2x3_t): Ditto. (vfloat16mf2x4_t): Ditto. (vfloat16mf2x5_t): Ditto. (vfloat16mf2x6_t): Ditto. (vfloat16mf2x7_t): Ditto. (vfloat16mf2x8_t): Ditto. (vfloat16m1x2_t): Ditto. (vfloat16m1x3_t): Ditto. (vfloat16m1x4_t): Ditto. (vfloat16m1x5_t): Ditto. (vfloat16m1x6_t): Ditto. (vfloat16m1x7_t): Ditto. (vfloat16m1x8_t): Ditto. (vfloat16m2x2_t): Ditto. (vfloat16m2x3_t): Ditto. (vfloat16m2x4_t): Ditto. (vfloat16m4x2_t): Ditto. * config/riscv/riscv-vector-switch.def (TUPLE_ENTRY): Ditto. * config/riscv/riscv.md: Ditto. * config/riscv/vector-iterators.md: Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/tuple-28.c: Removed. * gcc.target/riscv/rvv/base/tuple-29.c: Removed. * gcc.target/riscv/rvv/base/tuple-30.c: Removed. * gcc.target/riscv/rvv/base/tuple-31.c: Removed. * gcc.target/riscv/rvv/base/tuple-32.c: Removed. Signed-off-by: Pan Li <pan2.li@intel.com>
2023-06-25GIMPLE_FOLD: Apply LEN_MASK_{LOAD,STORE} into GIMPLE_FOLDJu-Zhe Zhong1-5/+18
Hi, since we are going to have LEN_MASK_{LOAD,STORE} into loopVectorizer. Currenly, 1. we can fold MASK_{LOAD,STORE} into MEM when mask is all ones. 2. we can fold LEN_{LOAD,STORE} into MEM when (len - bias) is VF. Now, I think it makes sense that we can support fold LEN_MASK_{LOAD,STORE} into MEM when both mask = all ones and (len - bias) is VF. gcc/ChangeLog: * gimple-fold.cc (arith_overflowed_p): Apply LEN_MASK_{LOAD,STORE}. (gimple_fold_partial_load_store_mem_ref): Ditto. (gimple_fold_partial_store): Ditto. (gimple_fold_call): Ditto.
2023-06-25Refine maskloadmn pattern with UNSPEC_MASKLOAD.liuhongt2-14/+28
If mem_addr points to a memory region with less than whole vector size bytes of accessible memory and k is a mask that would prevent reading the inaccessible bytes from mem_addr, add UNSPEC_MASKLOAD to prevent it to be transformed to vpblendd. gcc/ChangeLog: PR target/110309 * config/i386/sse.md (maskload<mode><avx512fmaskmodelower>): Refine pattern with UNSPEC_MASKLOAD. (maskload<mode><avx512fmaskmodelower>): Ditto. (*<avx512>_load<mode>_mask): Extend mode iterator to VI12HFBF_AVX512VL. (*<avx512>_load<mode>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr110309.c: New test.
2023-06-25SSA ALIAS: Apply LEN_MASK_STORE to 'ref_maybe_used_by_call_p_1'Ju-Zhe Zhong1-0/+1
gcc/ChangeLog: * tree-ssa-alias.cc (call_may_clobber_ref_p_1): Add LEN_MASK_STORE.
2023-06-25SSA ALIAS: Apply LEN_MASK_{LOAD, STORE} into SSA alias analysisJu-Zhe Zhong1-0/+2
gcc/ChangeLog: * tree-ssa-alias.cc (ref_maybe_used_by_call_p_1): Apply LEN_MASK_{LOAD,STORE}
2023-06-25RISC-V:Add float16 tuple type abiyulong9-17/+630
gcc/ChangeLog: * config/riscv/vector.md: Add float16 attr at sew、vlmul and ratio. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-10.c: Add float16 tuple type case. * gcc.target/riscv/rvv/base/abi-11.c: Ditto. * gcc.target/riscv/rvv/base/abi-12.c: Ditto. * gcc.target/riscv/rvv/base/abi-15.c: Ditto. * gcc.target/riscv/rvv/base/abi-8.c: Ditto. * gcc.target/riscv/rvv/base/abi-9.c: Ditto. * gcc.target/riscv/rvv/base/abi-17.c: New test. * gcc.target/riscv/rvv/base/abi-18.c: New test.
2023-06-25Daily bump.GCC Administrator5-1/+128
2023-06-24i386: Add alternate representation for {and,or,xor}b %ah,%dh.Roger Sayle1-0/+22
A patch that I'm working on to improve RTL simplifications in the middle-end results in the regression of pr78904-1b.c, due to changes in the canonical representation of high-byte (%ah, %bh, %ch, %dh) logic. See also PR target/78904. This patch avoids/prevents those failures by adding support for the alternate representation, duplicating the existing *<code>qi_ext<mode>_2 as *<code>qi_ext<mode>_3 (the new version also replacing any_or with any_logic to provide *andqi_ext<mode>_3 in the same pattern). Removing the original pattern isn't trivial, as it's generated by define_split, but this can be investigated after the other pieces are approved. The current representation of this instruction is: (set (zero_extract:DI (reg/v:DI 87 [ aD.2763 ]) (const_int 8 [0x8]) (const_int 8 [0x8])) (subreg:DI (xor:QI (subreg:QI (zero_extract:DI (reg:DI 94) (const_int 8 [0x8]) (const_int 8 [0x8])) 0) (subreg:QI (zero_extract:DI (reg/v:DI 87 [ aD.2763 ]) (const_int 8 [0x8]) (const_int 8 [0x8])) 0)) 0)) after my proposed middle-end improvement, we attempt to recognize: (set (zero_extract:DI (reg/v:DI 87 [ aD.2763 ]) (const_int 8 [0x8]) (const_int 8 [0x8])) (zero_extract:DI (xor:DI (reg:DI 94) (reg/v:DI 87 [ aD.2763 ])) (const_int 8 [0x8]) (const_int 8 [0x8]))) 2023-06-24 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * config/i386/i386.md (*<code>qi_ext<mode>_3): New define_insn.
2023-06-24Fortran: ABI for scalar CHARACTER(LEN=1),VALUE dummy argument [PR110360]Harald Anlauf1-8/+13
gcc/fortran/ChangeLog: PR fortran/110360 * trans-expr.cc (gfc_conv_procedure_call): Truncate constant string argument of length > 1 passed to scalar CHARACTER(1),VALUE dummy.
2023-06-24RISC-V: Refactor the integer ternary autovec patternJuzhe-Zhong1-26/+28
Long time ago, I encounter ICE when trying to set clobber register as Pmode and I forgot the reason. So, I clobber SI scratch and PUT_MODE to make it Pmode after reload which makes patterns look unreasonable. According to Jeff's comments, I tried it again, it works now when we try to set clobber register as Pmode and the patterns look more reasonable now. The tests are all passed, Ok for trunk. gcc/ChangeLog: * config/riscv/autovec.md (*fma<mode>): set clobber to Pmode in expand stage. (*fma<VI:mode><P:mode>): Ditto. (*fnma<mode>): Ditto. (*fnma<VI:mode><P:mode>): Ditto.
2023-06-24RISC-V: Support RVV floating-point auto-vectorizationJuzhe-Zhong40-34/+1386
This patch adds RVV floating-point auto-vectorization. Also, fix attribute bug of floating-point ternary operations in vector.md. gcc/ChangeLog: * config/riscv/autovec.md (fma<mode>4): New pattern. (*fma<mode>): Ditto. (fnma<mode>4): Ditto. (*fnma<mode>): Ditto. (fms<mode>4): Ditto. (*fms<mode>): Ditto. (fnms<mode>4): Ditto. (*fnms<mode>): Ditto. * config/riscv/riscv-protos.h (emit_vlmax_fp_ternary_insn): New function. * config/riscv/riscv-v.cc (emit_vlmax_fp_ternary_insn): Ditto. * config/riscv/vector.md: Fix attribute bug. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/ternop/ternop-1.c: Adjust tests. * gcc.target/riscv/rvv/autovec/ternop/ternop-2.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop-3.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop-4.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop-5.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop-6.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-2.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-3.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-4.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-5.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-6.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop-10.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop-11.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop-12.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop-7.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop-8.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop-9.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-10.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-11.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-12.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-7.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-8.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-9.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-1.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-10.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-11.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-12.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-2.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-3.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-4.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-5.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-6.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-7.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-8.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-9.c: New test.
2023-06-24LOOP IVOPTS: Apply LEN_MASK_{LOAD,STORE}Ju-Zhe Zhong1-3/+11
Hi, Jeff. I fix format as you suggested. Ok for trunk ? gcc/ChangeLog: * tree-ssa-loop-ivopts.cc (get_mem_type_for_internal_fn): Apply LEN_MASK_{LOAD,STORE}.
2023-06-24IVOPTS: Add LEN_MASK_{LOAD, STORE} into 'get_alias_ptr_type_for_ptr_address'Ju-Zhe Zhong1-0/+2
gcc/ChangeLog: * tree-ssa-loop-ivopts.cc (get_alias_ptr_type_for_ptr_address): Add LEN_MASK_{LOAD,STORE}.
2023-06-23text-art: remove explicit #include of C++ standard library headersDavid Malcolm18-5/+25
gcc/analyzer/ChangeLog: * access-diagram.cc: Add #define INCLUDE_VECTOR. * bounds-checking.cc: Likewise. gcc/ChangeLog: * diagnostic-format-sarif.cc: Add #define INCLUDE_VECTOR. * diagnostic.cc: Likewise. * text-art/box-drawing.cc: Likewise. * text-art/canvas.cc: Likewise. * text-art/ruler.cc: Likewise. * text-art/selftests.cc: Likewise. * text-art/selftests.h (text_art::canvas): New forward decl. * text-art/style.cc: Add #define INCLUDE_VECTOR. * text-art/styled-string.cc: Likewise. * text-art/table.cc: Likewise. * text-art/table.h: Remove #include <vector>. * text-art/theme.cc: Add #define INCLUDE_VECTOR. * text-art/types.h: Check that INCLUDE_VECTOR is defined. Remove #include of <vector> and <string>. * text-art/widget.cc: Add #define INCLUDE_VECTOR. * text-art/widget.h: Remove #include <vector>. gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic_plugin_test_text_art.c: Add #define INCLUDE_VECTOR. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-06-24VECT: Apply LEN_MASK_{LOAD,STORE} into vectorizerJu-Zhe Zhong4-83/+267
Address comments from Richard and Bernhard from V5 patch. V6 fixed all issues according their comments. gcc/ChangeLog: * internal-fn.cc (expand_partial_store_optab_fn): Adapt for LEN_MASK_STORE. (internal_load_fn_p): Add LEN_MASK_LOAD. (internal_store_fn_p): Add LEN_MASK_STORE. (internal_fn_mask_index): Add LEN_MASK_{LOAD,STORE}. (internal_fn_stored_value_index): Add LEN_MASK_STORE. (internal_len_load_store_bias): Add LEN_MASK_{LOAD,STORE}. * optabs-tree.cc (can_vec_mask_load_store_p): Adapt for LEN_MASK_{LOAD,STORE}. (get_len_load_store_mode): Ditto. * optabs-tree.h (can_vec_mask_load_store_p): Ditto. (get_len_load_store_mode): Ditto. * tree-vect-stmts.cc (check_load_store_for_partial_vectors): Ditto. (get_all_ones_mask): New function. (vectorizable_store): Apply LEN_MASK_{LOAD,STORE} into vectorizer. (vectorizable_load): Ditto.
2023-06-24Daily bump.GCC Administrator8-1/+170
2023-06-23compiler, libgo: support bootstrapping gc compilerIan Lance Taylor6-11/+37
In the Go 1.21 release the package internal/profile imports internal/lazyregexp. That works when bootstrapping with Go 1.17, because that compiler has internal/lazyregep and permits importing it. We also have internal/lazyregexp in libgo, but since it is not installed it is not available for importing. This CL adds internal/lazyregexp to the list of internal packages that are installed for bootstrapping. The Go 1.21, and earlier, releases have a couple of functions in the internal/abi package that are always fully intrinsified. The gofrontend recognizes and intrinsifies those functions as well. However, the gofrontend was also building function descriptors for references to the functions without calling them, which failed because there was nothing to refer to. That is OK for the gc compiler, which guarantees that the functions are only called, not referenced. This CL arranges to not generate function descriptors for these functions. For golang/go#60913 Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/504798
2023-06-23c++: provide #include hint for missing includes [PR110164]David Malcolm4-1/+22
PR c++/110164 notes that in cases where we have a forward decl of a std library type such as: std::array<int, 10> x; we emit this diagnostic: error: aggregate ‘std::array<int, 10> x’ has incomplete type and cannot be defined This patch adds this hint to the diagnostic: note: ‘std::array’ is defined in header ‘<array>’; this is probably fixable by adding ‘#include <array>’ gcc/cp/ChangeLog: PR c++/110164 * cp-name-hint.h (maybe_suggest_missing_header): New decl. * decl.cc: Define INCLUDE_MEMORY. Add include of "cp/cp-name-hint.h". (start_decl_1): Call maybe_suggest_missing_header. * name-lookup.cc (maybe_suggest_missing_header): Remove "static". gcc/testsuite/ChangeLog: PR c++/110164 * g++.dg/diagnostic/missing-header-pr110164.C: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-06-23c++: Add support for -std={c,gnu}++2{c,6}Marek Polacek11-17/+113
It seems prudent to add C++26 now that the first C++26 papers have been approved. I followed commit r11-6920 as well as r8-3237. Since C++23 is essentially finished and its __cplusplus value has settled to 202302L, I've updated cpp_init_builtins and marked -std=c++2b Undocumented and made -std=c++23 no longer Undocumented. As for __cplusplus, I've chosen 202400L: $ xg++ -std=c++26 -dM -E -x c++ - < /dev/null | grep cplusplus #define __cplusplus 202400L I've verified the patch with a simple test, exercising the new directives. Don't forget to update your GXX_TESTSUITE_STDS! This patch does not add -Wc++26-extensions. gcc/c-family/ChangeLog: * c-common.h (cxx_dialect): Add cxx26 as a dialect. * c-opts.cc (set_std_cxx26): New. (c_common_handle_option): Set options when -std={c,gnu}++2{c,6} is enabled. (c_common_post_options): Adjust comments. * c.opt: Add options for -std=c++26, std=c++2c, -std=gnu++26, and -std=gnu++2c. (std=c++2b): Mark as Undocumented. (std=c++23): No longer Undocumented. gcc/ChangeLog: * doc/cpp.texi (__cplusplus): Document value for -std=c++26 and -std=gnu++26. Document that for C++23, its value is 202302L. * doc/invoke.texi: Document -std=c++26 and -std=gnu++26. * dwarf2out.cc (highest_c_language): Handle GNU C++26. (gen_compile_unit_die): Likewise. libcpp/ChangeLog: * include/cpplib.h (c_lang): Add CXX26 and GNUCXX26. * init.cc (lang_defaults): Add rows for CXX26 and GNUCXX26. (cpp_init_builtins): Set __cplusplus to 202400L for C++26. Set __cplusplus to 202302L for C++23. gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_c++23): Return 1 also if check_effective_target_c++26. (check_effective_target_c++23_down): New. (check_effective_target_c++26_only): New. (check_effective_target_c++26): New. * g++.dg/cpp23/cplusplus.C: Adjust expected value. * g++.dg/cpp26/cplusplus.C: New test.
2023-06-23libcpp: allow UCS_LIMIT codepoints in UTF-8 stringsBen Boeckel1-1/+1
Fixes r14-1954 (libcpp: reject codepoints above 0x10FFFF, 2023-06-06) libcpp/ * charset.cc: Allow `UCS_LIMIT` in UTF-8 strings. Reported-by: Damien Guibouret <damien.guibouret@partition-saving.com> Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com>
2023-06-23libstdc++: Use RAII in std::vector::_M_realloc_insertJonathan Wakely1-51/+87
Replace the try-block with RAII types for deallocating storage and destroying elements. libstdc++-v3/ChangeLog: * include/bits/vector.tcc (_M_realloc_insert): Replace try-block with RAII types.
2023-06-23Tiny phiprop compile time optimizationJan Hubicka1-3/+5
gcc/ChangeLog: * tree-ssa-phiprop.cc (propagate_with_phi): Compute post dominators on demand. (pass_phiprop::execute): Do not compute it here; return update_ssa_only_virtuals if something changed. (pass_data_phiprop): Remove TODO_update_ssa from todos.
2023-06-23Fortran: ABI for scalar CHARACTER(LEN=1),VALUE dummy argument [PR110360]Harald Anlauf2-0/+97
gcc/fortran/ChangeLog: PR fortran/110360 * trans-expr.cc (gfc_conv_procedure_call): Pass actual argument to scalar CHARACTER(1),VALUE dummy argument by value. gcc/testsuite/ChangeLog: PR fortran/110360 * gfortran.dg/value_9.f90: New test.
2023-06-23Fix power10 fusion bug with prefixed loads, PR target/105325Michael Meissner6-42/+84
This changes fixes PR target/105325. PR target/105325 is a bug where an invalid lwa instruction is generated due to power10 fusion of a load instruction to a GPR and an compare immediate instruction with the immediate being -1, 0, or 1. In some cases, when the load instruction is done, the GCC compiler would generate a load instruction with an offset that was too large to fit into the normal load instruction. In particular, loads from the stack might originally have a small offset, so that the load is not a prefixed load. However, after the stack is set up, and register allocation has been done, the offset now is large enough that we would have to use a prefixed load instruction. The support for prefixed loads did not consider that patterns with a fused load and compare might have a prefixed address. Without this support, the proper prefixed load won't be generated. In the original code, when the split2 pass is run after reload has finished the ds_form_mem_operand predicate that was used for lwa and ld no longer returns true. When the pattern was created, ds_form_mem_operand recognized the insn as being valid since the offset was small. But after register allocation, ds_form_mem_operand did not return true. Because it didn't return true, the insn could not be split. Since the insn was not split and the prefix support did not indicate a prefixed instruction was used, the wrong load is generated. The solution involves: 1) Don't use ds_form_mem_operand for ld and lwa, always use non_update_memory_operand. 2) Delete ds_form_mem_operand since it is no longer used. 3) Use the "YZ" constraints for ld/lwa instead of "m". 4) If we don't need to sign extend the lwa, convert it to lwz, and use cmpwi instead of cmpdi. Adjust the insn name to reflect the code generate. 5) Insure that the insn using lwa will be recognized as having a prefixed operand (and hence the insn length will be 16 bytes instead of 8 bytes). 5a) Set the prefixed and maybe_prefix attributes to know that fused_load_cmpi are also load insns; 5b) In the case where we are just setting CC and not using the memory afterward, set the clobber to use a DI register, and put an explicit sign_extend operation in the split; 5c) Set the sign_extend attribute to "yes" for lwa. 5d) 5a-5c are the things that prefixed_load_p in rs6000.cc checks to ensure that lwa is treated as a ds-form instruction and not as a d-form instruction (i.e. lwz). 6) Add a new test case for this case. 7) Adjust the insn counts in fusion-p10-ldcmpi.c. Because we are no longer using ds_form_mem_operand, the ld and lwa instructions will fuse x-form (reg+reg) addresses in addition ds-form (reg+offset or reg). 2023-06-23 Michael Meissner <meissner@linux.ibm.com> gcc/ PR target/105325 * config/rs6000/genfusion.pl (gen_ld_cmpi_p10_one): Fix problems that allowed prefixed lwa to be generated. * config/rs6000/fusion.md: Regenerate. * config/rs6000/predicates.md (ds_form_mem_operand): Delete. * config/rs6000/rs6000.md (prefixed attribute): Add support for load plus compare immediate fused insns. (maybe_prefixed): Likewise. gcc/testsuite/ PR target/105325 * g++.target/powerpc/pr105325.C: New test. * gcc.target/powerpc/fusion-p10-ldcmpi.c: Update insn counts. Co-Authored-By: Aaron Sawdey <acsawdey@linux.ibm.com>
2023-06-23testsuite,objective-c++: Fix imported NSObjCRuntime.h.Iain Sandoe1-0/+3
We have imported some headers from the GNUStep project to allow us to maintain the testsuite independent to changing versions of system headers. One of these headers has a macro that (now we have support for __has_feature) expands to a declaration that triggers a warning. These headers are considered part of the implementation so that, in this case, we can suppress the warning with the system_header pragma. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/testsuite/ChangeLog: * objc-obj-c++-shared/GNUStep/Foundation/NSObjCRuntime.h: Make this header use pragma system_header.
2023-06-23Improved SUBREG simplifications in simplify-rtx.cc's simplify_subreg.Roger Sayle1-0/+32
An x86 backend improvement that I'm working results in combine attempting to recognize: (set (reg:DI 87 [ xD.2846 ]) (ior:DI (subreg:DI (ashift:TI (zero_extend:TI (reg:DI 92)) (const_int 64 [0x40])) 0) (reg:DI 91))) where the lowpart SUBREG has difficulty seeing through the (hi<<64) that the lowpart must be zero. Rather than workaround this in the backend, the better fix is to teach simplify-rtx that lowpart((hi<<64)|lo) -> lo and highpart((hi<<64)|lo) -> hi, so that all backends benefit. Reducing the number of places where the middle-end generates a SUBREG of something other than REG is a good thing. On x86_64-pc-linux-gnu, the testcase pr78904-1b.c FAILs with this patch, due to changes in expected/canonical RTL, for which a backend patch to i386.md has already been provisionally approved. 2023-06-23 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * simplify-rtx.cc (simplify_subreg): Optimize lowpart SUBREGs of ASHIFT to const0_rtx with sufficiently large shift count. Optimize highpart SUBREGs of ASHIFT as the shift operand when the shift count is the correct offset. Optimize SUBREGs of multi-word logic operations if the SUBREGs of both operands can be simplified.
2023-06-23Fix initializer_constant_valid_p_1 TYPE_PRECISION useRichard Biener1-1/+2
initializer_constant_valid_p_1 is letting through all conversions of float vector types that have the same number of elements but that's of course not valid. The following restricts the code to scalar floating point types as was probably intended (only scalar integer types are handled as well). * varasm.cc (initializer_constant_valid_p_1): Only allow conversions between scalar floating point types.
2023-06-23Deal with vector typed operands in conversionsRichard Biener1-5/+8
The following avoids using TYPE_PRECISION on VECTOR_TYPE when looking for bit-precision changes in vectorizable_assignment. We didn't anticipate a stmt like _21 = VIEW_CONVERT_EXPR<unsigned int>(vect__1.7_28); and the following makes sure to handle that. * tree-vect-stmts.cc (vectorizable_assignment): Properly handle non-integral operands when analyzing conversions.
2023-06-23[aarch64/match.pd] Fix ICE observed in PR110280.Prathamesh Kulkarni2-1/+20
gcc/ChangeLog: PR tree-optimization/110280 * match.pd (vec_perm_expr(v, v, mask) -> v): Explicitly build vector using build_vector_from_val with the element of input operand, and mask's type if operand and mask's types don't match. gcc/testsuite/ChangeLog: PR tree-optimization/110280 * gcc.target/aarch64/sve/pr110280.c: New test.
2023-06-23Fix tree_simple_nonnegative_warnv_p for VECTOR_TYPEsRichard Biener1-1/+2
tree_simple_nonnegative_warnv_p ends up being called on VECTOR_TYPEs which I think even gets the wrong answer here for tcc_comparison since vector bools are signed. The following properly guards that with !VECTOR_TYPE_P. * fold-const.cc (tree_simple_nonnegative_warnv_p): Guard the truth_value_p case with !VECTOR_TYPE_P.
2023-06-23Properly guard vect_look_through_possible_promotionRichard Biener1-1/+5
The function ends up getting called on VECTOR_TYPEs which it really isn't prepared for and with the TYPE_PRECISION checking changes will ICE. The following exits early when the type to work on isn't scalar integral. * tree-vect-patterns.cc (vect_look_through_possible_promotion): Exit early when the type isn't scalar integral.
2023-06-23Use element_precision for match.pd arith conversion optimizationRichard Biener1-4/+4
The simplification (outertype)((innertype0)a+(innertype1)b) to ((newtype)a+(newtype)b) ends up using TYPE_PRECISION to check whether it can elide a conversion but in some paths there can be VECTOR_TYPEs where this instead compares the number of lanes. The following fixes the missed optimizations and uses element_precision in those places. * match.pd ((outertype)((innertype0)a+(innertype1)b) -> ((newtype)a+(newtype)b)): Use element_precision where appropriate.
2023-06-23Bogus and missed folding on vector comparesRichard Biener2-4/+4
fold_binary tries to transform (double)float1 CMP (double)float2 into float1 CMP float2 but ends up using TYPE_PRECISION on the argument types. For vector types that compares the number of lanes which should be always equal (so it's harmless as to not generating wrong code). The following instead properly uses element_precision. The same happens in the corresponding match.pd pattern. * fold-const.cc (fold_binary_loc): Use element_precision when trying (double)float1 CMP (double)float2 to float1 CMP float2 simplification. * match.pd: Likewise.
2023-06-23Optimize vector codegen for invariant loads, fix SLP supportRichard Biener1-20/+19
The following avoids creating duplicate stmts for invariant loads which was necessary when the vector stmts were in a linked list. It also fixes SLP support which didn't correctly create the appropriate number of copies. * tree-vect-stmts.cc (vectorizable_load): Avoid useless copies of VMAT_INVARIANT vectorized stmts, fix SLP support.
2023-06-23Improve vector_vector_composition_typeRichard Biener1-0/+8
We sometimes get to ask to decompose, say V2DFmode into two halves. Currently this results in composing it from two DImode pieces instead of the obvious two DFmode pieces. The following adjusts vector_vector_composition_type for this trivial case and avoids a VIEW_CONVERT_EXPR in the initial code generation. * tree-vect-stmts.cc (vector_vector_composition_type): Handle composition of a vector from a number of elements that happens to match its number of lanes.
2023-06-23Daily bump.GCC Administrator11-1/+399
2023-06-22rust: Update usage of TARGET_AIX to TARGET_AIX_OSPaul E. Murphy1-3/+3
This was noticed when fixing the gccgo usage of the macro, the rust usage is very similar. TARGET_AIX is defined as a non-zero value on linux/powerpc64le which may cause unexpected behavior. TARGET_AIX_OS should be used to toggle AIX specific behavior. 2023-06-22 Paul E. Murphy <murphyp@linux.ibm.com> gcc/rust/ * rust-object-export.cc [TARGET_AIX]: Rename and update usage to TARGET_AIX_OS.
2023-06-22go: Update usage of TARGET_AIX to TARGET_AIX_OSPaul E. Murphy2-7/+7
TARGET_AIX is defined to a non-zero value on linux and maybe other powerpc64le targets. This leads to unexpected behavior such as dropping the .go_export section when linking a shared library on linux/powerpc64le. Instead, use TARGET_AIX_OS to toggle AIX specific behavior. Fixes golang/go#60798. 2023-06-22 Paul E. Murphy <murphyp@linux.ibm.com> gcc/go/ * go-backend.cc [TARGET_AIX]: Rename and update usage to TARGET_AIX_OS. * go-lang.cc: Likewise.
2023-06-22configure: Implement --enable-host-bind-nowMarek Polacek7-5/+83
As promised in the --enable-host-pie patch, this patch adds another configure option, --enable-host-bind-now, which adds -z now when linking the compiler executables in order to extend hardening. BIND_NOW with RELRO allows the GOT to be marked RO; this prevents GOT modification attacks. This option does not affect linking of target libraries; you can use LDFLAGS_FOR_TARGET=-Wl,-z,relro,-z,now to enable RELRO/BIND_NOW. With this patch: $ readelf -Wd cc1{,plus,obj,gm2} f951 lto1 cpp rust1 gnat1 | grep FLAGS 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE c++tools/ChangeLog: * configure.ac (--enable-host-bind-now): New check. * configure: Regenerate. gcc/ChangeLog: * configure.ac (--enable-host-bind-now): New check. Add -Wl,-z,now to LD_PICFLAG if --enable-host-bind-now. * configure: Regenerate. * doc/install.texi: Document --enable-host-bind-now. lto-plugin/ChangeLog: * configure.ac (--enable-host-bind-now): New check. Link with -z,now. * configure: Regenerate.
2023-06-22Change fma_reassoc_width tuning for ampere1Di Zhao OS1-1/+1
This patch enables reassociation of floating-point additions on ampere1. This brings about 1% overall benefit on spec2017 fprate cases. (There are minor regressions in 510.parest_r and 508.namd_r, analyzed here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110279 .) gcc/ChangeLog: * config/aarch64/aarch64.cc: Change fma_reassoc_width for ampere1.
2023-06-22libgomp.texi: Improve OpenMP ICV descriptionTobias Burnus1-12/+117
Use @var{} instead of @emph{} - for semantic texinfo formatting; the result is similar: slanted instead of italic in PDF, still italic in HTML, albeit in info is is now uppercase instead of '_' as pre/suffix. The patch also documents the newer _ALL/_DEV/_DEV_<no> env var suffixes and as it refers to the ICV vars and their scope, those were added to the OMP_ env vars for reference. For OMP_NESTING, a note that those were deprecated was added plus a bunch of cross references. For OMP_ALLOCATOR, add note about the lack of per-device env vars support. A new section, consisting mostly of cross references was added to document the implementation-defined ICV initialization, especially as OpenMP demands that implementations document what they do for 'implementation defined'. For nvptx, the implementation-defined used stack size was documented libgomp/ * libgomp.texi: Use @var for ICV vars. (OpenMP Environment Variables): Mention _ALL/_DEV/_DEV_<no> variants, document which ICV is set and which scope the ICV has; extend/cleanup some @ref. (Implementation-defined ICV Initialization): New. (nvptx): Document the implementation-defined used per-warp stack size.
2023-06-22tree-optimization/110332 - fix ICE with phipropRichard Biener4-3/+46
The following fixes an ICE that occurs when we visit an edge inserted load from the code validating correctness for inserting an aggregate copy there. We can simply skip those loads here. PR tree-optimization/110332 * tree-ssa-phiprop.cc (propagate_with_phi): Always check aliasing with edge inserted loads. * g++.dg/torture/pr110332.C: New testcase. * gcc.dg/torture/pr110332-1.c: Likewise. * gcc.dg/torture/pr110332-2.c: Likewise.