riscv-gnu-toolchain/gcc.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
2024-10-14	AArch64: rename the SVE2 psel intrinsics to psel_lane [PR116371]	Tamar Christina	19	-698/+698
	The psel intrinsics. similar to the pext, should be name psel_lane. This corrects the naming. gcc/ChangeLog: PR target/116371 * config/aarch64/aarch64-sve-builtins-sve2.cc (class svpsel_impl): Renamed to ... (class svpsel_lane_impl): ... This and adjust initialization. * config/aarch64/aarch64-sve-builtins-sve2.def (svpsel): Renamed to ... (svpsel_lane): ... This. * config/aarch64/aarch64-sve-builtins-sve2.h (svpsel): Renamed to svpsel_lane. gcc/testsuite/ChangeLog: PR target/116371 * gcc.target/aarch64/sme2/acle-asm/psel_b16.c, gcc.target/aarch64/sme2/acle-asm/psel_b32.c, gcc.target/aarch64/sme2/acle-asm/psel_b64.c, gcc.target/aarch64/sme2/acle-asm/psel_b8.c, gcc.target/aarch64/sme2/acle-asm/psel_c16.c, gcc.target/aarch64/sme2/acle-asm/psel_c32.c, gcc.target/aarch64/sme2/acle-asm/psel_c64.c, gcc.target/aarch64/sme2/acle-asm/psel_c8.c: Renamed to.... * gcc.target/aarch64/sme2/acle-asm/psel_lane_b16.c, gcc.target/aarch64/sme2/acle-asm/psel_lane_b32.c, gcc.target/aarch64/sme2/acle-asm/psel_lane_b64.c, gcc.target/aarch64/sme2/acle-asm/psel_lane_b8.c, gcc.target/aarch64/sme2/acle-asm/psel_lane_c16.c, gcc.target/aarch64/sme2/acle-asm/psel_lane_c32.c, gcc.target/aarch64/sme2/acle-asm/psel_lane_c64.c, gcc.target/aarch64/sme2/acle-asm/psel_lane_c8.c: ... These.
2024-10-14	middle-end: support SLP early break	Tamar Christina	7	-21/+257
	This patch introduces feature parity for early break int the SLP only vectorizer. The approach taken here is to treat the early exits as root statements for an SLP tree. This means that we don't need any changes to build_slp to support gconds. Codegen for the gcond itself now has to be done out of line but the body of the SLP blocks itself is simply driven by SLP scheduling. There is a slight awkwardness in having re-used vectorizable_early_exit for both SLP and non-SLP but I've documented the differences and when I did try to refactor it it wasn't really worth it given that this is a temporary state anyway. This version is restricted to lane = 1, as such we can re-use the existing move_early_break function instead of having to do safety update through scheduling. I have a branch where I'm working on that but lane > 1 is out of scope for GCC 15 anyway. The only reason I will try to get moving through scheduling done as a stretch goal is so we get epilogue vectorization back for early break. The example: unsigned test4(unsigned x) { unsigned ret = 0; for (int i = 0; i < N; i++) { vect_b[i] = x + i; if (vect_a[i]2 != x) break; vect_a[i] = x; } return ret; } builds the following SLP instance for early break: note: Analyzing vectorizable control flow: if (patt_6 != 0) note: Starting SLP discovery for note: patt_6 = _4 != x_9(D); note: starting SLP discovery for node 0x63abc80 note: Build SLP for patt_6 = _4 != x_9(D); note: precomputed vectype: vector(4) <signed-boolean:32> note: nunits = 4 note: vect_is_simple_use: operand x_9(D), type of def: external note: vect_is_simple_use: operand # RANGE [irange] unsigned int [0, 0][2, +INF] MASK 0xffff _3 2, type of def: internal note: starting SLP discovery for node 0x63abdc0 note: Build SLP for _4 = _3 * 2; note: precomputed vectype: vector(4) unsigned int note: nunits = 4 note: vect_is_simple_use: operand # vect_aD.4416[i_15], type of def: internal note: vect_is_simple_use: operand 2, type of def: constant note: starting SLP discovery for node 0x63abe60 note: Build SLP for _3 = vect_a[i_15]; note: precomputed vectype: vector(4) unsigned int note: nunits = 4 note: SLP discovery for node 0x63abe60 succeeded note: SLP discovery for node 0x63abdc0 succeeded note: SLP discovery for node 0x63abc80 succeeded note: SLP size 3 vs. limit 10. note: Final SLP tree for instance 0x6474190: note: node 0x63abc80 (max_nunits=4, refcnt=2) vector(4) <signed-boolean:32> note: op template: patt_6 = _4 != x_9(D); note: stmt 0 patt_6 = _4 != x_9(D); note: children 0x63abd20 0x63abdc0 note: node (external) 0x63abd20 (max_nunits=1, refcnt=1) note: { x_9(D) } note: node 0x63abdc0 (max_nunits=4, refcnt=2) vector(4) unsigned int note: op template: _4 = _3 * 2; note: stmt 0 _4 = _3 * 2; note: children 0x63abe60 0x63abf00 note: node 0x63abe60 (max_nunits=4, refcnt=2) vector(4) unsigned int note: op template: _3 = vect_a[i_15]; note: stmt 0 _3 = vect_a[i_15]; note: load permutation { 0 } note: node (constant) 0x63abf00 (max_nunits=1, refcnt=1) note: { 2 } and during codegen: note: ------>vectorizing SLP node starting from: patt_6 = _4 != x_9(D); note: vect_is_simple_use: operand # RANGE [irange] unsigned int [0, 0][2, +INF] MASK 0xffff _3 * 2, type of def: internal note: add new stmt: mask_patt_6.18_58 = _53 != vect__4.17_57; note: === vectorizable_early_exit === note: transform early-exit. note: vectorizing stmts using SLP. note: Vectorizing SLP tree: note: node 0x63abfa0 (max_nunits=4, refcnt=1) vector(4) int note: op template: i_12 = i_15 + 1; note: stmt 0 i_12 = i_15 + 1; note: children 0x63aba00 0x63ac040 note: node 0x63aba00 (max_nunits=4, refcnt=2) vector(4) int note: op template: i_15 = PHI <i_12(6), 0(14)> note: [l] stmt 0 i_15 = PHI <i_12(6), 0(14)> note: children (nil) (nil) note: node (constant) 0x63ac040 (max_nunits=1, refcnt=1) vector(4) int note: { 1 } gcc/ChangeLog: * tree-vect-loop.cc (vect_analyze_loop_2): Handle SLP trees with no children. * tree-vectorizer.h (enum slp_instance_kind): Add slp_inst_kind_gcond. (LOOP_VINFO_EARLY_BREAKS_LIVE_IVS): New. (vectorizable_early_exit): Expose. (class _loop_vec_info): Add early_break_live_stmts. * tree-vect-slp.cc (vect_build_slp_instance, vect_analyze_slp_instance): Support gcond instances. (vect_analyze_slp): Analyze gcond roots and early break live statements. (maybe_push_to_hybrid_worklist): Don't sink gconds. (vect_slp_analyze_operations): Support gconds. (vect_slp_check_for_roots): Update comments. (vectorize_slp_instance_root_stmt): Support gconds. (vect_schedule_slp): Pass vinfo to vectorize_slp_instance_root_stmt. * tree-vect-stmts.cc (vect_stmt_relevant_p): Record early break live statements. (vectorizable_early_exit): Support SLP. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-early-break_126.c: New test. * gcc.dg/vect/vect-early-break_127.c: New test. * gcc.dg/vect/vect-early-break_128.c: New test.
2024-10-14	Add regression test	Eric Botcazou	3	-0/+38
	gcc/testsuite/ PR ada/114593 * gnat.dg/specs/generic_inst2-child2.ads: New test. * gnat.dg/specs/generic_inst2.ads: New helper. * gnat.dg/specs/generic_inst2-child1.ads: Likewise.
2024-10-14	middle-end: [PR middle-end/116926] Allow widening optabs for vec-mode -> ↵	Victor Do Nascimento	1	-0/+6
	scalar-mode The recent refactoring of the dot_prod optab to convert-type exposed a limitation in how `find_widening_optab_handler_and_mode' is currently implemented, owing to the fact that, while the function expects the GET_MODE_CLASS (from_mode) == GET_MODE_CLASS (to_mode) condition to hold, the c6x backend implements a dot product from V2HI to SI, which triggers an ICE. Consequently, this patch adds some logic to allow widening optabs which accumulate vector elements to a single scalar. gcc/ChangeLog: PR middle-end/116926 * optabs-query.cc (find_widening_optab_handler_and_mode): Add handling of vector -> scalar optab handling.
2024-10-14	aarch64: Fix folding of degenerate svwhilele case [PR117045]	Richard Sandiford	4	-2/+76
	The svwhilele folder mishandled the degenerate case in which the second argument is the maximum integer. In that case, the result is all-true regardless of the first parameter: If the second scalar operand is equal to the maximum signed integer value then a condition which includes an equality test can never fail and the result will be an all-true predicate. This is because the conceptual "increment the first operand by 1 after each element" is done modulo the range of the operand. The GCC code was instead treating it as infinite precision. whilele_5.c even had a test for the incorrect behaviour. The easiest fix seemed to be to handle that case specially before doing constant folding. This also copes with variable first operands. gcc/ PR target/116999 PR target/117045 * config/aarch64/aarch64-sve-builtins-base.cc (svwhilelx_impl::fold): Check for WHILELTs of the minimum value and WHILELEs of the maximum value. Fold them to all-false and all-true respectively. gcc/testsuite/ PR target/116999 PR target/117045 * gcc.target/aarch64/sve/acle/general/whilele_5.c: Fix bogus expected result. * gcc.target/aarch64/sve/acle/general/whilele_11.c: New test. * gcc.target/aarch64/sve/acle/general/whilele_12.c: Likewise.
2024-10-14	middle-end/116891 - fix (negate (IFN_FNMS@3 @0 @1 @2)) -> (IFN_FMA @0 @1 @2)	Richard Biener	1	-1/+1
	Transforming -fma (-a, b, -c) to fma (a, b, c) is only valid when not rounding towards -inf or +inf as the sign of the multiplication changes. PR middle-end/116891 * match.pd ((negate (IFN_FNMS@3 @0 @1 @2)) -> (IFN_FMA @0 @1 @2)): Only enable for !HONOR_SIGN_DEPENDENT_ROUNDING.
2024-10-14	RISC-V: Add testcases for form 4 of vector signed SAT_SUB	Pan Li	9	-0/+126
	Form 4: #define DEF_VEC_SAT_S_SUB_FMT_4(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_sub_##T##_fmt_4 (T out, T op_1, T op_2, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ T x = op_1[i]; \ T y = op_2[i]; \ T minus; \ bool overflow = __builtin_sub_overflow (x, y, &minus); \ out[i] = !overflow ? minus : x < 0 ? MIN : MAX; \ } \ } The below test are passed for this patch. The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-4-i16.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-4-i32.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-4-i64.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-4-i8.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-4-i16.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-4-i32.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-4-i64.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-4-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-14	RISC-V: Add testcases for form 3 of vector signed SAT_SUB	Pan Li	9	-0/+126
	Form 3: #define DEF_VEC_SAT_S_SUB_FMT_3(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_sub_##T##_fmt_3 (T out, T op_1, T op_2, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ T x = op_1[i]; \ T y = op_2[i]; \ T minus; \ bool overflow = __builtin_sub_overflow (x, y, &minus); \ out[i] = overflow ? x < 0 ? MIN : MAX : minus; \ } \ } The below test are passed for this patch. The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-3-i16.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-3-i32.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-3-i64.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-3-i8.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-3-i16.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-3-i32.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-3-i64.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-3-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-14	Match: Support form 3 for vector signed integer SAT_SUB	Pan Li	1	-0/+12
	This patch would like to support the form 3 of the vector signed integer SAT_SUB. Aka below example: Form 3: #define DEF_VEC_SAT_S_SUB_FMT_3(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_sub_##T##_fmt_3 (T out, T op_1, T op_2, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ T x = op_1[i]; \ T y = op_2[i]; \ T minus; \ bool overflow = __builtin_sub_overflow (x, y, &minus); \ out[i] = overflow ? x < 0 ? MIN : MAX : minus; \ } \ } Before this patch: 25 │ if (limit_11(D) != 0) 26 │ goto <bb 3>; [89.00%] 27 │ else 28 │ goto <bb 8>; [11.00%] 29 │ ;; succ: 3 30 │ ;; 8 31 │ 32 │ ;; basic block 3, loop depth 0 33 │ ;; pred: 2 34 │ _13 = (unsigned long) limit_11(D); 35 │ ;; succ: 4 36 │ 37 │ ;; basic block 4, loop depth 1 38 │ ;; pred: 3 39 │ ;; 7 40 │ # ivtmp.7_34 = PHI <0(3), ivtmp.7_30(7)> 41 │ _26 = op_1_12(D) + ivtmp.7_34; 42 │ x_29 = MEM[(int8_t )_26]; 43 │ _1 = op_2_14(D) + ivtmp.7_34; 44 │ y_24 = MEM[(int8_t )_1]; 45 │ _9 = .SUB_OVERFLOW (x_29, y_24); 46 │ _7 = IMAGPART_EXPR <_9>; 47 │ if (_7 != 0) 48 │ goto <bb 6>; [50.00%] 49 │ else 50 │ goto <bb 5>; [50.00%] 51 │ ;; succ: 6 52 │ ;; 5 53 │ 54 │ ;; basic block 5, loop depth 1 55 │ ;; pred: 4 56 │ _42 = REALPART_EXPR <_9>; 57 │ _2 = out_17(D) + ivtmp.7_34; 58 │ MEM[(int8_t )_2] = _42; 59 │ ivtmp.7_27 = ivtmp.7_34 + 1; 60 │ if (_13 != ivtmp.7_27) 61 │ goto <bb 7>; [89.00%] 62 │ else 63 │ goto <bb 8>; [11.00%] 64 │ ;; succ: 7 65 │ ;; 8 66 │ 67 │ ;; basic block 6, loop depth 1 68 │ ;; pred: 4 69 │ _38 = x_29 < 0; 70 │ _39 = (signed char) _38; 71 │ _40 = -_39; 72 │ _41 = _40 ^ 127; 73 │ _33 = out_17(D) + ivtmp.7_34; 74 │ MEM[(int8_t )_33] = _41; 75 │ ivtmp.7_25 = ivtmp.7_34 + 1; 76 │ if (_13 != ivtmp.7_25) 77 │ goto <bb 7>; [89.00%] 78 │ else 79 │ goto <bb 8>; [11.00%] After this patch: 77 │ _94 = .SELECT_VL (ivtmp_92, POLY_INT_CST [16, 16]); 78 │ vect_x_13.9_81 = .MASK_LEN_LOAD (vectp_op_1.7_79, 8B, { -1, ... }, _94, 0); 79 │ vect_y_15.12_85 = .MASK_LEN_LOAD (vectp_op_2.10_83, 8B, { -1, ... }, _94, 0); 80 │ vect_patt_49.13_86 = .SAT_SUB (vect_x_13.9_81, vect_y_15.12_85); 81 │ .MASK_LEN_STORE (vectp_out.14_88, 8B, { -1, ... }, _94, 0, vect_patt_49.13_86); 82 │ vectp_op_1.7_80 = vectp_op_1.7_79 + _94; 83 │ vectp_op_2.10_84 = vectp_op_2.10_83 + _94; 84 │ vectp_out.14_89 = vectp_out.14_88 + _94; 85 │ ivtmp_93 = ivtmp_92 - _94; The below test suites are passed for this patch. The rv64gcv fully regression test. * The x86 bootstrap test. * The x86 fully regression test. gcc/ChangeLog: * match.pd: Add matching pattern for vector signed SAT_SUB form 3. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-14	RISC-V: Add testcases for form 2 of vector signed SAT_SUB	Pan Li	9	-0/+126
	Form 2: #define DEF_VEC_SAT_S_SUB_FMT_2(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_sub_##T##_fmt_2 (T out, T op_1, T op_2, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ T x = op_1[i]; \ T y = op_2[i]; \ T minus = (UT)x - (UT)y; \ out[i] = (x ^ y) >= 0 \|\| (minus ^ x) >= 0 \ ? minus : x < 0 ? MIN : MAX; \ } \ } DEF_VEC_SAT_S_SUB_FMT_2(int8_t, uint8_t, INT8_MIN, INT8_MAX) The below test are passed for this patch. The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-2-i16.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-2-i32.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-2-i64.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-2-i8.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-2-i16.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-2-i32.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-2-i64.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-2-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-14	tree-optimization/116290 - fix compare-debug issue in ldist	Richard Biener	3	-4/+23
	Loop distribution does different analysis with -g0/-g due to counting a debug stmt starting a BB against a limit which will everntually lead to different IVOPTs choices. I've fixed a possible IVOPTs issue on the way even though it doesn't make a difference here. PR tree-optimization/116290 * tree-loop-distribution.cc (determine_reduction_stmt_1): PHIs have no debug variants. Start with first non-debug real stmt. * tree-ssa-loop-ivopts.cc (find_givs_in_bb): Do not analyze debug stmts. * gcc.dg/pr116290.c: New testcase.
2024-10-14	SH: Fix cost estimation of mem load/store	Oleg Endo	1	-7/+13
	For memory loads/stores (that contain a MEM rtx) sh_rtx_costs would wrongly report a cost lower than 1 insn which is not accurate as it makes loads/stores appear cheaper than simple arithmetic insns. The cost of a load/store insn is at least 1 insn plus the cost of the address expression (some addressing modes can be considered more expensive than others due to additional constraints). gcc/ChangeLog: PR target/113533 * config/sh/sh.cc (sh_rtx_costs): Adjust cost estimation of MEM rtx to be always at least COST_N_INSNS (1). Forward speed argument to sh_address_cost. Co-authored-by: Roger Sayle <roger@nextmovesoftware.com>
2024-10-14	SH: Add -fno-math-errno to fsca,fsrra tests.	Oleg Endo	5	-5/+5
	Without -fno-math-errno some of the test might fail because the expected insns will not be generated. gcc/testsuite/ChangeLog: * gcc.target/sh/pr53512-1.c: Add -fno-math-errno option. * gcc.target/sh/pr53512-2.c: Likewise. * gcc.target/sh/pr53512-3.c: Likewise. * gcc.target/sh/pr53512-4.c: Likewise. * gcc.target/sh/pr54680.c: Likewise.
2024-10-14	Daily bump.	GCC Administrator	5	-1/+69

2024-10-13	Revert "c++: Fix overeager Woverloaded-virtual with conversion operators ↵	Simon Martin	8	-135/+33
	[PR109918]" This reverts commit 60163c85730e6b7c566e219222403ac87ddbbddd.
2024-10-13	m68k: replace reload_in_progress by reload_in_progress \|\| lra_in_progress	Andreas Schwab	3	-11/+20
	For now assume that LRA needs the same treatment as reload. * config/m68k/m68k.md ("movsi", "movxf"): Replace reload_in_progress by reload_in_progress \|\| lra_in_progress. * config/m68k/m68k.cc (m68k_legitimate_mem_p) (emit_move_sequence): Likewise. * config/m68k/predicates.md ("fp_src_operand"): Likewise.
2024-10-13	tree-optimization/116481 - avoid building function_type[]	Richard Biener	2	-0/+24
	The following avoids building an array type with function or method element type during diagnosing an array bound violation as this will result in an error, rejecting a program with a not too useful error message. Instead build such array type manually. PR tree-optimization/116481 * pointer-query.cc (build_printable_array_type): Build an array types with function or method element type manually to avoid bogus diagnostic. * gcc.dg/pr116481.c: New testcase.
2024-10-13	Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' ↵	Tobias Burnus	6	-20/+60
	__builtin_is_initial_device It turned out that 'if (omp_is_initial_device() .eqv. true)' gave an ICE due to comparing 'int' with 'logical(4)'. When digging deeper, it also turned out that when the procedure pointer is needed, the builtin cannot be used, either. (Follow up to r15-2799-gf1bfba3a9b3f31 ) Extend the code to also use the builtin acc_on_device with OpenACC, which was previously only used in C/C++. Additionally, fix folding when offloading is not enabled. Fixes additionally the BT_BOOL data type, which was 'char'/integer(1) instead of bool, backing the booleaness; use bool_type_node as the rest of GCC. gcc/fortran/ChangeLog: * gfortran.h (gfc_option_t): Add disable_acc_on_device. * options.cc (gfc_handle_option): Handle -fno-builtin-acc_on_device. * trans-decl.cc (gfc_get_extern_function_decl): Move __builtin_omp_is_initial_device handling to ... * trans-expr.cc (get_builtin_fn): ... this new function. (conv_function_val): Call it. (update_builtin_function): New. (gfc_conv_procedure_call): Call it. * types.def (BT_BOOL): Fix type by using bool_type_node. gcc/ChangeLog: * gimple-fold.cc (gimple_fold_builtin_acc_on_device): Also fold when offloading is not configured. libgomp/ChangeLog: * libgomp.texi (TR13): Fix minor typos. (omp_is_initial_device): Improve wording. (acc_on_device): Note how to disable the builtin. * testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Remove TODO. * testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Likewise. Add -fno-builtin-acc_on_device. * testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c: Update dg- as !offloading_enabled now compile-time expands acc_on_device. * testsuite/libgomp.fortran/target-is-initial-device-3.f90: New test. * testsuite/libgomp.oacc-fortran/acc_on_device-2.f90: New test.
2024-10-12	[RISC-V] Avoid unnecessary extensions when value is already extended	Jivan Hakobyan	1	-2/+18
	This is a minor patch from Jivan from roughly a year ago. The basic idea here is similar to what we do when extending values for the sake of comparisons. Specifically if the value is already known to be properly extended, then an extension is just a copy. The original idea was to use a similar patch, but which aborted to identify cases where these unnecessary promotions where emitted. All that showed up when doing a testsuite run with that abort was the promotions created by the arithmetic with overflow patterns such as addv. Things like addv aren't that common so this never got high on my todo list, even after a minor issue in this space was raised in bugzilla. But with stage1 closing soon and no good reason not to go forward, I'm submitting this into the pre-commit tester now. My tester has been using it since roughly Feb :-) Plan would be to commit after the pre-commit tester renders its verdict. * config/riscv/riscv.md (zero_extendsidi2): If RHS is already zero extended, then this is just a copy. (extendsidi2): Similarly, but for sign extension.
2024-10-13	Daily bump.	GCC Administrator	6	-1/+303

2024-10-12	Unsigned constants for ISO_FORTRAN_ENV and ISO_C_BINDING.	Thomas Koenig	9	-8/+244
	gcc/fortran/ChangeLog: * dump-parse-tree.cc (get_c_type_name): Also handle BT_UNSIGNED. * gfortran.h (NAMED_UINTCST): Define before inclusion of iso-c-binding.def and iso-fortran-env.def. (gfc_get_uint_kind_from_width_isofortranenv): Prototype. * gfortran.texi: Mention new constants in iso_c_binding and iso_fortran_env. * iso-c-binding.def: Handle NAMED_UINTCST. Add c_unsigned, c_unsigned_short,c_unsigned_char, c_unsigned_long, c_unsigned_long_long, c_uintmax_t, c_uint8_t, c_uint16_t, c_uint32_t, c_uint64_t, c_uint128_t, c_uint_least8_t, c_uint_least16_t, c_uint_least32_t, c_uint_least64_t, c_uint_least128_t, c_uint_fast8_t, c_uint_fast16_t, c_uint_fast32_t, c_uint_fast64_t and c_uint_fast128_t. * iso-fortran-env.def: Handle NAMED_UINTCST. Add uint8, uint16, uint32 and uint64. * module.cc (parse_integer): Whitespace fix. (write_module): Whitespace fix. (NAMED_UINTCST): Define before inclusion of iso-fortran-evn.def and iso-fortran-env.def. * symbol.cc: Likewise. * trans-types.cc (get_unsigned_kind_from_node): New function. (get_uint_kind_from_name): New function. (gfc_get_uint_kind_from_width_isofortranenv): New function. (get_uint_kind_from_width): New function. (gfc_init_kinds): Initialize gfc_c_uint_kind. gcc/testsuite/ChangeLog: * gfortran.dg/unsigned_36.f90: New test.
2024-10-12	vect: Fix inconsistency in fully-masked lane-reducing op generation [PR116985]	Feng Xue	2	-2/+28
	To align vectorized def/use when lane-reducing op is present in loop reduction, we may need to insert extra trivial pass-through copies, which would cause mismatch between lane-reducing vector copy and loop mask index. This could be fixed by computing the right index around a new counter on effective lane- reducing vector copies. 2024-10-11 Feng Xue <fxue@os.amperecomputing.com> gcc/ PR tree-optimization/116985 * tree-vect-loop.cc (vect_transform_reduction): Compute loop mask index based on effective vector copies for reduction op. gcc/testsuite/ PR tree-optimization/116985 * gcc.dg/vect/pr116985.c: New testcase.
2024-10-12	tree-optimization/117104 - add missed guards to max(a,b) != a simplification	Richard Biener	2	-1/+17
	For vector types we have to make sure the comparison result is a vector type and the resulting compare operation is supported. As the resulting compare is never an equality compare I didn't bother to check for the cbranch case. PR tree-optimization/117104 * match.pd ((cmp:c (minmax:c @0 @1) @0) -> (out @0 @1)): Properly guard the vector case. * gcc.dg/pr117104.c: New testcase.
2024-10-12	RISC-V] Slightly improve broadcasting small constants into vectors	Jeff Law	2	-6/+21
	I probably spent way more time on this than it's worth... I was looking at the code we generate for vector SAD and noticed that we were being a bit silly. Specifically: li a4,0 # 272 [c=4 l=4] movsi_internal/1 Followed shortly by: vmv.s.x v3,a4 # 261 [c=4 l=4] pred_broadcastrvvm1si/6 And no other uses of a4. We could have used x0 trivially. First we adjust the expander so that it doesn't force the constant into a register. In the matching pattern we change the appropriate source constraints from "r" to "rJ" and the output template is changed to use %z for the operand. The net is we drop the li completely and emit vmv.s.x,v3,x0. But wait, there's more. If we're broadcasting a constant in the range [-16..15] into a vector, we currently load the constant into a register and use vmv.v.r. We can instead use vmv.v.i, which avoids loading the constant into a GPR. For that case we again avoid forcing the constant into a register in the expander and adjust the output template to emit vmv.v.x or vmv.v.i based on whether or not the appropriate operand is a constant or general purpose register. So again, we'll drop a load immediate into a scalar for this case. Whether or not we should use vmv.v.i vs vmv.s.x for loading [-16..15] into the 0th element is probably uarch dependent. The tradeoff is loading the GPR vs the broadcast in the vector unit. I didn't bother with this case. Tested in my tester (which tests rv64gcv as a default codegen option). Will wait for the pre-commit tester to render a verdict. gcc/ * config/riscv/constraints.md (P): New constraint. * config/riscv/vector.md (pred_broadcast<mode> expander): Do not force small integers into GPRs so aggressively. (pred_broadcast<mode> insn & splitter): Allow splatting small constants across the vector register directly. Allow splatting (const_int 0) into element 0 directly.
2024-10-12	Fortran/OpenMP: Warn when mapping polymorphic variables	Tobias Burnus	4	-2/+121
	OpenMP (TR13) states for Fortran: * For map: "If a list item has polymorphic type, the behavior is unspecified." * "If the firstprivate clause is on a target construct and a variable is of polymorphic type, the behavior is unspecified." which this commit now warns for. gcc/fortran/ChangeLog: * openmp.cc (resolve_omp_clauses): Diagnose polymorphic mapping. * trans-openmp.cc (gfc_omp_finish_clause): Warn when polymorphic variable is implicitly mapped. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/polymorphic-mapping.f90: New test. * gfortran.dg/gomp/polymorphic-mapping-2.f90: New test.
2024-10-12	bootstrap: Fix genmatch build where system gcc defaults to -fPIE -pie	Jakub Jelinek	1	-0/+5
	Seems our buildbot is unhappy about my latest commit to link genmatch with libcommon.a in order to support gcc_diag diagnostics in libcpp. We have in gcc/configure.ac: if test x$enable_host_shared = xyes; then PICFLAG=-fPIC elif test x$enable_host_pie = xyes; then PICFLAG=-fPIE elif test x$gcc_cv_c_no_fpie = xyes; then PICFLAG=-fno-PIE else PICFLAG= fi if test x$enable_host_pie = xyes; then LD_PICFLAG=-pie elif test x$gcc_cv_no_pie = xyes; then LD_PICFLAG=-no-pie else LD_PICFLAG= fi if test x$enable_host_bind_now = xyes; then LD_PICFLAG="$LD_PICFLAG -Wl,-z,now" fi Now, for object files linked into cc1, cc1plus, xgcc etc. we carefully arrange for them to be compiled with $(PICFLAG) and do the link with $(LD_PICFLAG). For the generator programs, we don't do anything like that, we simply compile their objects without $(PICFLAG) and link without $(LD_PICFLAG). It isn't that big deal, the generator programs runs once or a couple of times during the build and that is it, we don't ship them and don't care much if they are PIE or not. Except that after my changes to link in libcommon.a into build/genmatch, we now link -fno-PIE compiled objects into a binary which is linked with default flags. Our distro compiler just links a normal executable and everything works fine (-fPIE/-pie is added through spec file snippet and just added in rpm default flags), but seems the buildbot system gcc defaults to -fPIE -pie instead and so building build/genmatch fails. The following patch is a minimal fix for that, just add -no-pie when linking build/genmatch, but don't add -pie. If we wanted to start building even the build/gen* tools with $(PICFLAG) and $(LD_PICFLAG), that would be much larger change. 2024-10-12 Jakub Jelinek <jakub@redhat.com> * Makefile.in (LINKER_FOR_BUILD): Append -no-pie if it is in $(LD_PICFLAG) when building build/genmatch.
2024-10-12	gcc.target/i386/pr55583.c: Use long long for 64-bit integer	H.J. Lu	1	-3/+3
	Since long is 32-bit for x32, use long long for 64-bit integer. * gcc.target/i386/pr55583.c: Use long long for 64-bit integer. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-10-12	gcc.target/i386/pr115749.c: Use word_mode integer	H.J. Lu	1	-1/+3
	Use word_mode integer with func so that 64-bit integer is used with x32. * gcc.target/i386/pr115749.c (uword): New. (func): Replace unsigned long with uword. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-10-12	gcc.target/i386/invariant-ternlog-1.c: Also scan (%edx)	H.J. Lu	1	-1/+1
	Since x32 uses (%edx), instead of (%rdx), also scan (%edx). * gcc.target/i386/invariant-ternlog-1.c: Also scan (%edx). Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-10-12	libcpp, genmatch: Use gcc_diag instead of printf for libcpp diagnostics	Jakub Jelinek	108	-1021/+1013
	When working on #embed support, or -Wheader-guard or other recent libcpp changes, I've been annoyed by the libcpp diagnostics being visually different from normal gcc diagnostics, especially in the area of quoting stuff in the diagnostic messages. Normall GCC diagnostics is gcc_diag/gcc_tdiag, one can use %</%>, %qs etc. in there, while libcpp diagnostics was marked as printf and in libcpp we've been very creative with quoting stuff, either no quotes at all, or "something" quoting, or 'something' quoting, or `something' quoting (but in none of the cases it used colors consistently with the rest of the compiler). Now, libcpp diagnostics is always emitted using a callback, pfile->cb.diagnostic. On the gcc/ side, this callback is initialized with genmatch.cc: cb->diagnostic = diagnostic_cb; c-family/c-opts.cc: cb->diagnostic = c_cpp_diagnostic; fortran/cpp.cc: cb->diagnostic = cb_cpp_diagnostic; where the latter two just use diagnostic_report_diagnostic, so actually support all the gcc_diag stuff, only the genmatch.cc case didn't. So, the following patch changes genmatch.cc to use pp_format* instead of vfprintf so that it supports the gcc_diag formatting (pretty-print.o unfortunately has various dependencies, so had to link genmatch with libcommon.a libbacktrace.a and tweak Makefile.in so that there are no circular dependencies) and marks the libcpp diagnostic routines as gcc_diag rather than printf. That change resulted in hundreds of -Wformat-diag new warnings (most of them useful and resulting IMHO in better diagnostics), so the rest of the patch is changing the format strings to make -Wformat-diag happy and adjusting the testsuite for the differences in how is the diagnostic reformatted. Dunno if some out of GCC tree projects use libcpp, that case would make it harder because one couldn't use vfprintf in the diagnostic callback anymore, but there is always David's libdiagnostic which could be used for that purpose IMHO. 2024-10-12 Jakub Jelinek <jakub@redhat.com> libcpp/ * include/cpplib.h (ATTRIBUTE_CPP_PPDIAG): Define. (struct cpp_callbacks): Use ATTRIBUTE_CPP_PPDIAG instead of ATTRIBUTE_FPTR_PRINTF on diagnostic callback. (cpp_error, cpp_warning, cpp_pedwarning, cpp_warning_syshdr): Use ATTRIBUTE_CPP_PPDIAG (3, 4) instead of ATTRIBUTE_PRINTF_3. (cpp_warning_at, cpp_pedwarning_at): Use ATTRIBUTE_CPP_PPDIAG (4, 5) instead of ATTRIBUTE_PRINTF_4. (cpp_error_with_line, cpp_warning_with_line, cpp_pedwarning_with_line, cpp_warning_with_line_syshdr): Use ATTRIBUTE_CPP_PPDIAG (5, 6) instead of ATTRIBUTE_PRINTF_5. (cpp_error_at): Use ATTRIBUTE_CPP_PPDIAG (4, 5) instead of ATTRIBUTE_PRINTF_4. * Makefile.in (po/$(PACKAGE).pot): Use --language=GCC-source rather than --language=c. * errors.cc (cpp_diagnostic_at, cpp_diagnostic, cpp_diagnostic_with_line): Use ATTRIBUTE_CPP_PPDIAG instead of -ATTRIBUTE_FPTR_PRINTF. * charset.cc (cpp_host_to_exec_charset, _cpp_valid_ucn, convert_hex, convert_oct, convert_escape): Fix up -Wformat-diag warnings. (cpp_interpret_string_ranges, count_source_chars): Use ATTRIBUTE_CPP_PPDIAG instead of ATTRIBUTE_FPTR_PRINTF. (narrow_str_to_charconst): Fix up -Wformat-diag warnings. * directives.cc (check_eol_1, directive_diagnostics, lex_macro_node, do_undef, glue_header_name, parse_include, do_include_common, do_include_next, _cpp_parse_embed_params, do_embed, read_flag, do_line, do_linemarker, register_pragma_1, do_pragma_once, do_pragma_push_macro, do_pragma_pop_macro, do_pragma_poison, do_pragma_system_header, do_pragma_warning_or_error, _cpp_do__Pragma, do_else, do_elif, do_endif, parse_answer, do_assert, cpp_define_unused): Likewise. * expr.cc (cpp_classify_number, parse_defined, eval_token, _cpp_parse_expr, reduce, check_promotion): Likewise. * files.cc (_cpp_find_file, finish_base64_embed, _cpp_pop_file_buffer): Likewise. * init.cc (sanity_checks): Likewise. * lex.cc (_cpp_process_line_notes, maybe_warn_bidi_on_char, _cpp_warn_invalid_utf8, _cpp_skip_block_comment, warn_about_normalization, forms_identifier_p, maybe_va_opt_error, identifier_diagnostics_on_lex, cpp_maybe_module_directive): Likewise. * macro.cc (class vaopt_state, builtin_has_include_1, builtin_has_include, builtin_has_embed, _cpp_warn_if_unused_macro, _cpp_builtin_macro_text, builtin_macro, stringify_arg, _cpp_arguments_ok, collect_args, enter_macro_context, _cpp_save_parameter, parse_params, create_iso_definition, _cpp_create_definition, check_trad_stringification): Likewise. * pch.cc (cpp_valid_state): Likewise. * traditional.cc (_cpp_scan_out_logical_line, recursive_macro): Likewise. gcc/ * Makefile.in (generated_files): Remove {gimple,generic}-match. (generated_match_files): New variable. Add a dependency of $(filter-out $(OBJS-libcommon),$(ALL_HOST_OBJS)) files on those. (build/genmatch$(build_exeext)): Depend on and link against libcommon.a and $(LIBBACKTRACE). genmatch.cc: Include pretty-print.h and input.h. (ggc_internal_cleared_alloc, ggc_free): Remove. (fatal): New function. (line_table): Remove. (linemap_client_expand_location_to_spelling_point): Remove. (diagnostic_cb): Use gcc_diag rather than printf format. Use pp_format_verbatim on a temporary pretty_printer instead of vfprintf. (fatal_at, warning_at): Use gcc_diag rather than printf format. (output_line_directive): Rename location_hash to loc_hash. (parser::eat_ident, parser::parse_operation, parser::parse_expr, parser::parse_pattern, parser::finish_match_operand): Fix up -Wformat-diag warnings. gcc/c-family/ * c-lex.cc (c_common_has_attribute, c_common_lex_availability_macro): Fix up -Wformat-diag warnings. gcc/testsuite/ * c-c++-common/cpp/counter-2.c: Adjust expected diagnostics for libcpp diagnostic formatting changes. * c-c++-common/cpp/embed-3.c: Likewise. * c-c++-common/cpp/embed-4.c: Likewise. * c-c++-common/cpp/embed-16.c: Likewise. * c-c++-common/cpp/embed-18.c: Likewise. * c-c++-common/cpp/eof-2.c: Likewise. * c-c++-common/cpp/eof-3.c: Likewise. * c-c++-common/cpp/fmax-include-depth.c: Likewise. * c-c++-common/cpp/has-builtin.c: Likewise. * c-c++-common/cpp/line-2.c: Likewise. * c-c++-common/cpp/line-3.c: Likewise. * c-c++-common/cpp/macro-arg-count-1.c: Likewise. * c-c++-common/cpp/macro-arg-count-2.c: Likewise. * c-c++-common/cpp/macro-ranges.c: Likewise. * c-c++-common/cpp/named-universal-char-escape-4.c: Likewise. * c-c++-common/cpp/named-universal-char-escape-5.c: Likewise. * c-c++-common/cpp/pr88974.c: Likewise. * c-c++-common/cpp/va-opt-error.c: Likewise. * c-c++-common/cpp/va-opt-pedantic.c: Likewise. * c-c++-common/cpp/Wheader-guard-2.c: Likewise. * c-c++-common/cpp/Wheader-guard-3.c: Likewise. * c-c++-common/cpp/Winvalid-utf8-1.c: Likewise. * c-c++-common/cpp/Winvalid-utf8-2.c: Likewise. * c-c++-common/cpp/Winvalid-utf8-3.c: Likewise. * c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c: Likewise. * c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-3.c: Likewise. * c-c++-common/pr68833-3.c: Likewise. * c-c++-common/raw-string-directive-1.c: Likewise. * gcc.dg/analyzer/named-constants-Wunused-macros.c: Likewise. * gcc.dg/binary-constants-4.c: Likewise. * gcc.dg/builtin-redefine.c: Likewise. * gcc.dg/cpp/19951025-1.c: Likewise. * gcc.dg/cpp/c11-warning-1.c: Likewise. * gcc.dg/cpp/c11-warning-2.c: Likewise. * gcc.dg/cpp/c11-warning-3.c: Likewise. * gcc.dg/cpp/c23-elifdef-2.c: Likewise. * gcc.dg/cpp/c23-warning-2.c: Likewise. * gcc.dg/cpp/embed-2.c: Likewise. * gcc.dg/cpp/embed-3.c: Likewise. * gcc.dg/cpp/embed-4.c: Likewise. * gcc.dg/cpp/expr.c: Likewise. * gcc.dg/cpp/gnu11-elifdef-2.c: Likewise. * gcc.dg/cpp/gnu11-elifdef-3.c: Likewise. * gcc.dg/cpp/gnu11-elifdef-4.c: Likewise. * gcc.dg/cpp/gnu11-warning-1.c: Likewise. * gcc.dg/cpp/gnu11-warning-2.c: Likewise. * gcc.dg/cpp/gnu11-warning-3.c: Likewise. * gcc.dg/cpp/gnu23-warning-2.c: Likewise. * gcc.dg/cpp/include6.c: Likewise. * gcc.dg/cpp/pr35322.c: Likewise. * gcc.dg/cpp/tr-warn6.c: Likewise. * gcc.dg/cpp/undef2.c: Likewise. * gcc.dg/cpp/warn-comments.c: Likewise. * gcc.dg/cpp/warn-comments-2.c: Likewise. * gcc.dg/cpp/warn-comments-3.c: Likewise. * gcc.dg/cpp/warn-cxx-compat.c: Likewise. * gcc.dg/cpp/warn-cxx-compat-2.c: Likewise. * gcc.dg/cpp/warn-deprecated.c: Likewise. * gcc.dg/cpp/warn-deprecated-2.c: Likewise. * gcc.dg/cpp/warn-long-long.c: Likewise. * gcc.dg/cpp/warn-long-long-2.c: Likewise. * gcc.dg/cpp/warn-normalized-1.c: Likewise. * gcc.dg/cpp/warn-normalized-2.c: Likewise. * gcc.dg/cpp/warn-normalized-3.c: Likewise. * gcc.dg/cpp/warn-normalized-4-bytes.c: Likewise. * gcc.dg/cpp/warn-normalized-4-unicode.c: Likewise. * gcc.dg/cpp/warn-redefined.c: Likewise. * gcc.dg/cpp/warn-redefined-2.c: Likewise. * gcc.dg/cpp/warn-traditional.c: Likewise. * gcc.dg/cpp/warn-traditional-2.c: Likewise. * gcc.dg/cpp/warn-trigraphs-1.c: Likewise. * gcc.dg/cpp/warn-trigraphs-2.c: Likewise. * gcc.dg/cpp/warn-trigraphs-3.c: Likewise. * gcc.dg/cpp/warn-trigraphs-4.c: Likewise. * gcc.dg/cpp/warn-undef.c: Likewise. * gcc.dg/cpp/warn-undef-2.c: Likewise. * gcc.dg/cpp/warn-unused-macros.c: Likewise. * gcc.dg/cpp/warn-unused-macros-2.c: Likewise. * gcc.dg/pch/counter-2.c: Likewise. * g++.dg/cpp0x/udlit-error1.C: Likewise. * g++.dg/cpp23/named-universal-char-escape1.C: Likewise. * g++.dg/cpp23/named-universal-char-escape2.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-1.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-2.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-3.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-4.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-5.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-6.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-7.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-8.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-9.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-10.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-11.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-12.C: Likewise. * g++.dg/cpp/elifdef-3.C: Likewise. * g++.dg/cpp/elifdef-5.C: Likewise. * g++.dg/cpp/elifdef-6.C: Likewise. * g++.dg/cpp/elifdef-7.C: Likewise. * g++.dg/cpp/embed-1.C: Likewise. * g++.dg/cpp/embed-2.C: Likewise. * g++.dg/cpp/pedantic-errors.C: Likewise. * g++.dg/cpp/warning-1.C: Likewise. * g++.dg/cpp/warning-2.C: Likewise. * g++.dg/ext/bitint1.C: Likewise. * g++.dg/ext/bitint2.C: Likewise.
2024-10-12	Fortran: Unify gfc_get_location handling; fix expr->ts bug	Tobias Burnus	5	-27/+32
	This commit reduces code duplication by moving gfc_get_location from trans.cc to error.cc. The gcc_assert is now used more often and reveald a bug in gfc_match_array_constructor where the union expr->ts.u.derived of a derived type is partially overwritten by the assignment expr->ts.u.cl->... as a ts.type == BT_CHARACTER check was missing. gcc/fortran/ChangeLog: * array.cc (gfc_match_array_constructor): Only update the character length if the expression is of character type. * error.cc (gfc_get_location_with_offset): New; split off from ... (gfc_format_decoder): ... here; call it. * gfortran.h (gfc_get_location_with_offset): New prototype. (gfc_get_location): New inline function. * trans.cc (gfc_get_location): Remove function definition. * trans.h (gfc_get_location): Remove declaration.
2024-10-12	testsuite/i386: Add vector sat_sub testcases [PR112600]	Uros Bizjak	2	-0/+50
	PR middle-end/112600 gcc/testsuite/ChangeLog: * gcc.target/i386/pr112600-4a.c: New test. * gcc.target/i386/pr112600-4b.c: New test.
2024-10-12	c++: Fix overeager Woverloaded-virtual with conversion operators [PR109918]	Simon Martin	8	-33/+135
	We currently emit an incorrect -Woverloaded-virtual warning upon the following test case === cut here === struct A { virtual operator int() { return 42; } virtual operator char() = 0; }; struct B : public A { operator char() { return 'A'; } }; === cut here === The problem is that when iterating over ovl_range (fns), warn_hidden gets confused by the conversion operator marker, concludes that seen_non_override is true and therefore emits a warning for all conversion operators in A that do not convert to char, even if -Woverloaded-virtual is 1 (e.g. with -Wall, the case reported). A second set of problems is highlighted when -Woverloaded-virtual is 2. First, with the same test case, since base_fndecls contains all conversion operators in A (except the one to char, that's been removed when iterating over ovl_range (fns)), we emit a spurious warning for the conversion operator to int, even though it's unrelated. Second, in case there are several conversion operators with different cv-qualifiers to the same type in A, we rightfully emit a warning, however the note uses the location of the conversion operator marker instead of the right one; location_of should go over conv_op_marker. This patch fixes all these by explicitly keeping track of (1) base methods that are overriden, as well as (2) base methods that are hidden but not overriden (and by what), and warning about methods that are in (2) but not (1). It also ignores non virtual base methods, per "definition" of -Woverloaded-virtual. PR c++/109918 gcc/cp/ChangeLog: * class.cc (warn_hidden): Keep track of overloaded and of hidden base methods. Mention the actual hiding function in the warning, not the first overload. * error.cc (location_of): Skip over conv_op_marker. gcc/testsuite/ChangeLog: * g++.dg/warn/Woverloaded-virt1.C: Check that no warning is emitted for non virtual base methods. * g++.dg/warn/Woverloaded-virt5.C: New test. * g++.dg/warn/Woverloaded-virt6.C: New test. * g++.dg/warn/Woverloaded-virt7.C: New test. * g++.dg/warn/Woverloaded-virt8.C: New test. * g++.dg/warn/Woverloaded-virt9.C: New test.
2024-10-12	RISC-V: Add testcases for form 1 of vector signed SAT_SUB	Pan Li	10	-0/+393
	Form 1: #define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_add_##T##_fmt_1 (T out, T op_1, T op_2, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ T x = op_1[i]; \ T y = op_2[i]; \ T minus = (UT)x - (UT)y; \ out[i] = (x ^ y) >= 0 \ ? minus \ : (minus ^ x) >= 0 \ ? minus \ : x < 0 ? MIN : MAX; \ } \ } DEF_VEC_SAT_S_SUB_FMT_1(int8_t, uint8_t, INT8_MIN, INT8_MAX) The below test are passed for this patch. The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vec_sat_data.h: Add test data for run test. * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-1-i16.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-1-i32.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-1-i64.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-1-i8.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-1-i16.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-1-i32.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-1-i64.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-1-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-12	RISC-V: Implement vector SAT_SUB for signed integer	Pan Li	3	-0/+21
	This patch would like to implement the sssub for vector signed integer. Form 1: #define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_add_##T##_fmt_1 (T out, T op_1, T op_2, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ T x = op_1[i]; \ T y = op_2[i]; \ T minus = (UT)x - (UT)y; \ out[i] = (x ^ y) >= 0 \ ? minus \ : (minus ^ x) >= 0 \ ? minus \ : x < 0 ? MIN : MAX; \ } \ } DEF_VEC_SAT_S_SUB_FMT_1(int8_t, uint8_t, INT8_MIN, INT8_MAX) Before this patch: 28 │ vle8.v v1,0(a1) 29 │ vle8.v v2,0(a2) 30 │ sub a3,a3,a5 31 │ add a1,a1,a5 32 │ add a2,a2,a5 33 │ vsra.vi v4,v1,7 34 │ vsub.vv v3,v1,v2 35 │ vxor.vv v2,v1,v2 36 │ vxor.vv v0,v1,v3 37 │ vmslt.vi v2,v2,0 38 │ vmslt.vi v0,v0,0 39 │ vmand.mm v0,v0,v2 40 │ vxor.vv v3,v4,v5,v0.t 41 │ vse8.v v3,0(a0) 42 │ add a0,a0,a5 After this patch: 25 │ vle8.v v1,0(a1) 26 │ vle8.v v2,0(a2) 27 │ sub a3,a3,a5 28 │ add a1,a1,a5 29 │ add a2,a2,a5 30 │ vssub.vv v1,v1,v2 31 │ vse8.v v1,0(a0) 32 │ add a0,a0,a5 The below test suites are passed for this patch. The rv64gcv fully regression test. gcc/ChangeLog: * config/riscv/autovec.md (sssub<mode>3): Add new pattern for signed SAT_SUB. * config/riscv/riscv-protos.h (expand_vec_sssub): Add new func decl to expand sssub to vssub. * config/riscv/riscv-v.cc (expand_vec_sssub): Add new func impl to expand sssub to vssub. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-12	Vect: Try the pattern of vector signed integer SAT_SUB	Pan Li	1	-1/+25
	Almost the same as vector unsigned integer SAT_SUB, try to match the signed version during the vector pattern matching. The below test suites are passed for this patch. * The rv64gcv fully regression test. * The x86 bootstrap test. * The x86 fully regression test. gcc/ChangeLog: * tree-vect-patterns.cc (gimple_signed_integer_sat_sub): Add new func decl for signed SAT_SUB. (vect_recog_sat_sub_pattern_transform): Update comments. (vect_recog_sat_sub_pattern): Try the vector signed SAT_SUB pattern. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-12	Match: Support form 1 for vector signed integer SAT_SUB	Pan Li	1	-0/+16
	This patch would like to support the form 1 of the vector signed integer SAT_SUB. Aka below example: Form 1: #define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_add_##T##_fmt_1 (T out, T op_1, T op_2, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ T x = op_1[i]; \ T y = op_2[i]; \ T minus = (UT)x - (UT)y; \ out[i] = (x ^ y) >= 0 \ ? minus \ : (minus ^ x) >= 0 \ ? minus \ : x < 0 ? MIN : MAX; \ } \ } DEF_VEC_SAT_S_SUB_FMT_1(int8_t, uint8_t, INT8_MIN, INT8_MAX) Before this patch: 91 │ _108 = .SELECT_VL (ivtmp_106, POLY_INT_CST [16, 16]); 92 │ vect_x_16.11_80 = .MASK_LEN_LOAD (vectp_op_1.9_78, 8B, { -1, ... }, _108, 0); 93 │ _69 = vect_x_16.11_80 >> 7; 94 │ vect_x.12_81 = VIEW_CONVERT_EXPR<vector([16,16]) unsigned char>(vect_x_16.11_80); 95 │ vect_y_18.15_85 = .MASK_LEN_LOAD (vectp_op_2.13_83, 8B, { -1, ... }, _108, 0); 96 │ vect__7.21_91 = vect_x_16.11_80 ^ vect_y_18.15_85; 97 │ mask__44.22_92 = vect__7.21_91 < { 0, ... }; 98 │ vect_y.16_86 = VIEW_CONVERT_EXPR<vector([16,16]) unsigned char>(vect_y_18.15_85); 99 │ vect__6.17_87 = vect_x.12_81 - vect_y.16_86; 100 │ vect_minus_19.18_88 = VIEW_CONVERT_EXPR<vector([16,16]) signed char>(vect__6.17_87); 101 │ vect__8.19_89 = vect_x_16.11_80 ^ vect_minus_19.18_88; 102 │ mask__42.20_90 = vect__8.19_89 < { 0, ... }; 103 │ mask__41.23_93 = mask__42.20_90 & mask__44.22_92; 104 │ _4 = .COND_XOR (mask__41.23_93, _69, { 127, ... }, vect_minus_19.18_88); 105 │ .MASK_LEN_STORE (vectp_out.31_102, 8B, { -1, ... }, _108, 0, _4); 106 │ vectp_op_1.9_79 = vectp_op_1.9_78 + _108; 107 │ vectp_op_2.13_84 = vectp_op_2.13_83 + _108; 108 │ vectp_out.31_103 = vectp_out.31_102 + _108; 109 │ ivtmp_107 = ivtmp_106 - _108; After this patch: 81 │ _102 = .SELECT_VL (ivtmp_100, POLY_INT_CST [16, 16]); 82 │ vect_x_16.11_89 = .MASK_LEN_LOAD (vectp_op_1.9_87, 8B, { -1, ... }, _102, 0); 83 │ vect_y_18.14_93 = .MASK_LEN_LOAD (vectp_op_2.12_91, 8B, { -1, ... }, _102, 0); 84 │ vect_patt_38.15_94 = .SAT_SUB (vect_x_16.11_89, vect_y_18.14_93); 85 │ .MASK_LEN_STORE (vectp_out.16_96, 8B, { -1, ... }, _102, 0, vect_patt_38.15_94); 86 │ vectp_op_1.9_88 = vectp_op_1.9_87 + _102; 87 │ vectp_op_2.12_92 = vectp_op_2.12_91 + _102; 88 │ vectp_out.16_97 = vectp_out.16_96 + _102; 89 │ ivtmp_101 = ivtmp_100 - _102; The below test suites are passed for this patch. The rv64gcv fully regression test. * The x86 bootstrap test. * The x86 fully regression test. gcc/ChangeLog: * match.pd: Add case 1 matching pattern for vector signed SAT_SUB. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-12	Daily bump.	GCC Administrator	5	-1/+311

2024-10-11	Introduce GFC_STD_UNSIGNED.	Thomas Koenig	3	-10/+16
	This patch creates an unsigned "standard" for the gfc_option.allow_std field. One of the main reason why people want UNSIGNED for Fortran is interfacing for C. This is a preparation for further work on the ISO_C_BINDING constants. That, we do via iso-c-binding.def , whose last field is a standard for the constant to be defined for the standard in question, which is then checked. I could try and invent a different method for this, but I'd rather not. gcc/fortran/ChangeLog: * intrinsic.cc (add_functions): Convert uint and selected_unsigned_kind to GFC_STD_UNSIGNED. (gfc_check_intrinsic_standard): Handle GFC_STD_UNSIGNED. * libgfortran.h (GFC_STD_UNSIGNED): Add. * options.cc (gfc_post_options): Set GFC_STD_UNSIGNED if -funsigned is set.
2024-10-12	gcc.target/i386: Replace long with long long	H.J. Lu	5	-8/+9
	Since long is 64-bit for x32, replace long with long long for x32. * gcc.target/i386/bmi2-pr112526.c: Replace long with long long. * gcc.target/i386/pr105854.c: Likewise. * gcc.target/i386/pr112943.c: Likewise. * gcc.target/i386/pr67325.c: Likewise. * gcc.target/i386/pr97971.c: Likewise. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-10-12	g++.target/i386/pr105953.C: Skip for x32	H.J. Lu	1	-1/+1
	Since -mabi=ms isn't supported for x32, skip g++.target/i386/pr105953.C for x32. * g++.target/i386/pr105953.C: Skip for x32. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-10-12	gcc.target/i386/pr115407.c: Only run for lp64	H.J. Lu	1	-1/+1
	Since -mcmodel=large is valid only for lp64, run pr115407.c only for lp64. * gcc.target/i386/pr115407.c: Only run for lp64. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-10-11	Fix thinko in previous change	Eric Botcazou	1	-1/+1
	gcc/ada/ PR ada/116498 PR ada/117087 * gcc-interface/decl.cc (validate_size): Fix thinko.
2024-10-11	PR target/117048 aarch64: Use more canonical and optimization-friendly ↵	Kyrylo Tkachov	2	-4/+63
	representation for XAR instruction The pattern for the Advanced SIMD XAR instruction isn't very optimization-friendly at the moment. In the testcase from the PR once simlify-rtx has done its work it generates the RTL: (set (reg:V2DI 119 [ _14 ]) (rotate:V2DI (xor:V2DI (reg:V2DI 114 [ vect__1.12_16 ]) (reg:V2DI 116 [ m1_01_8(D) ])) (const_vector:V2DI [ (const_int 32 [0x20]) repeated x2 ]))) which fails to match our XAR pattern because the pattern expects: 1) A ROTATERT instead of the ROTATE. However, according to the RTL ops documentation the preferred form of rotate-by-immediate is ROTATE, which I take to mean it's the canonical form. ROTATE (x, C) <-> ROTATERT (x, MODE_WIDTH - C) so it's better to match just one canonical representation. 2) A CONST_INT shift amount whereas the midend asks for a repeated vector constant. These issues are fixed by introducing a dedicated expander for the aarch64_xarqv2di name, needed by the arm_neon.h intrinsic, that translate the intrinsic-level CONST_INT immediate (the right-rotate amount) into a repeated vector constant subtracted from 64 to give the corresponding left-rotate amount that is fed to the new representation for the XAR define_insn that uses the ROTATE RTL code. This is a similar approach to have we handle the discrepancy between intrinsic-level and RTL-level vector lane numbers for big-endian. With this patch and [1/2] the arithmetic parts of the testcase now simplify to just one XAR instruction. Bootstrapped and tested on aarch64-none-linux-gnu. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> gcc/ PR target/117048 config/aarch64/aarch64-simd.md (aarch64_xarqv2di): Redefine into a define_expand. (aarch64_xarqv2di_insn): Define. gcc/testsuite/ PR target/117048 g++.target/aarch64/pr117048.C: New test.
2024-10-11	PR 117048: simplify-rtx: Extend (x << C1) \| (X >> C2) --> ROTATE ↵	Kyrylo Tkachov	1	-6/+10
	transformation to vector operands In the testcase from patch [2/2] we want to match a vector rotate operate from an IOR of left and right shifts by immediate. simplify-rtx has code for just that but it looks like it's prepared to do handle only scalar operands. In practice most of the code works for vector modes as well except the shift amounts are checked to be CONST_INT rather than vector constants that we have here. This is easily extended by using unwrap_const_vec_duplicate to extract the repeating constant shift amount. With this change combine now tries matching the simpler and expected: (set (reg:V2DI 119 [ _14 ]) (rotate:V2DI (xor:V2DI (reg:V2DI 114 [ vect__1.12_16 ]) (reg:V2DI 116 [ m1_01_8(D) ])) (const_vector:V2DI [ (const_int 32 [0x20]) repeated x2 ]))) instead of the previous: (set (reg:V2DI 119 [ _14 ]) (ior:V2DI (ashift:V2DI (xor:V2DI (reg:V2DI 114 [ vect__1.12_16 ]) (reg:V2DI 116 [ m1_01_8(D) ])) (const_vector:V2DI [ (const_int 32 [0x20]) repeated x2 ])) (lshiftrt:V2DI (xor:V2DI (reg:V2DI 114 [ vect__1.12_16 ]) (reg:V2DI 116 [ m1_01_8(D) ])) (const_vector:V2DI [ (const_int 32 [0x20]) repeated x2 ])))) To actually fix the PR the aarch64 backend needs some adjustment as well which is done in patch [2/2], which adds the testcase as well. Bootstrapped and tested on aarch64-none-linux-gnu. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> PR target/117048 simplify-rtx.cc (simplify_context::simplify_binary_operation_1): Handle vector constants in (x << C1) \| (x >> C2) -> ROTATE simplification.
2024-10-11	Fortran: Dead-function removal in error.cc (shrinking by 40%)	Tobias Burnus	1	-717/+0
	This patch removes a large number of unused static functions from error.cc, which previously were used for diagnostic but have been replaced by the common diagnostic code. gcc/fortran/ChangeLog: * error.cc (error_char, error_string, error_uinteger, error_integer, error_hwuint, error_hwint, gfc_widechar_display_length, gfc_wide_display_length, error_printf, show_locus, show_loci): Remove unused static functions. (IBUF_LEN, MAX_ARGS): Remove now unused #define.
2024-10-11	match.pd: Fold logarithmic identities.	Jennifer Schmitz	2	-0/+81
	This patch implements 4 rules for logarithmic identities in match.pd under -funsafe-math-optimizations: 1) logN(1.0/a) -> -logN(a). This avoids the division instruction. 2) logN(C/a) -> logN(C) - logN(a), where C is a real constant. Same as 1). 3) logN(a) + logN(b) -> logN(ab). This reduces the number of calls to log function. 4) logN(a) - logN(b) -> logN(a/b). Same as 4). Tests were added for float, double, and long double. The patch was bootstrapped and regtested on aarch64-linux-gnu and x86_64-linux-gnu, no regression. Additionally, SPEC 2017 fprate was run. While the transform does not seem to be triggered, we also see no non-noise impact on performance. OK for mainline? Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> gcc/ PR tree-optimization/116826 PR tree-optimization/86710 match.pd: Fold logN(1.0/a) -> -logN(a), logN(C/a) -> logN(C) - logN(a), logN(a) + logN(b) -> logN(ab), and logN(a) - logN(b) -> logN(a/b). gcc/testsuite/ PR tree-optimization/116826 PR tree-optimization/86710 gcc.dg/tree-ssa/log_ident.c: New test.
2024-10-11	tree-optimization/117080 - Add SLP_TREE_MEMORY_ACCESS_TYPE	Richard Biener	4	-50/+91
	It turns out target costing code looks at STMT_VINFO_MEMORY_ACCESS_TYPE to identify operations from (emulated) gathers for example. This doesn't work for SLP loads since we do not set STMT_VINFO_MEMORY_ACCESS_TYPE there as the vectorization strathegy might differ between different stmt uses. It seems we got away with setting it for stores though. The following adds a memory_access_type field to slp_tree and sets it from load and store vectorization code. All the costing doesn't record the SLP node (that was only done selectively for some corner case). The costing is really in need of a big overhaul, the following just massages the two relevant ops to fix gcc.dg/target/pr88531-2[bc].c FAILs when switching on SLP for non-grouped stores. In particular currently we either have a SLP node or a stmt_info in the cost hook but not both. So the following mitigates this, postponing a rewrite of costing to next stage1. Other targets look possibly affected as well but are left to respective maintainers to update. PR tree-optimization/117080 * tree-vectorizer.h (_slp_tree::memory_access_type): Add. (SLP_TREE_MEMORY_ACCESS_TYPE): New. (record_stmt_cost): Add another overload. * tree-vect-slp.cc (_slp_tree::_slp_tree): Initialize memory_access_type. * tree-vect-stmts.cc (vectorizable_store): Set SLP_TREE_MEMORY_ACCESS_TYPE. (vectorizable_load): Likewise. Also record the SLP node when costing emulated gather offset decompose and vector composition. * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Also recognize SLP emulated gather/scatter.
2024-10-11	aarch64: Add codegen support for SVE2 faminmax	Saurabh Jha	4	-0/+147
	The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation for famax and famin in terms of existing unspecs. With this patch: 1. famax can be expressed as taking UNSPEC_COND_SMAX of the two operands and then taking absolute value of their result. 2. famin can be expressed as taking UNSPEC_COND_SMIN of the two operands and then taking absolute value of their result. This fusion of operators is only possible when -march=armv9-a+faminmax+sve flags are passed. We also need to pass -ffast-math flag; this is what enables compiler to use UNSPEC_COND_SMAX and UNSPEC_COND_SMIN. This code generation is only available on -O2 or -O3 as that is when auto-vectorization is enabled. gcc/ChangeLog: * config/aarch64/aarch64-sve2.md (aarch64_pred_faminmax_fused): Instruction pattern for faminmax codegen. config/aarch64/iterators.md: Iterator and attribute for faminmax codegen. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/faminmax_1.c: New test. * gcc.target/aarch64/sve/faminmax_2.c: New test.
2024-10-11	aarch64: Add SVE2 faminmax intrinsics	Saurabh Jha	11	-1/+2615
	The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces SVE2 faminmax intrinsics. The intrinsics of this extension are implemented as the following builtin functions: * sva[max\|min]_[m\|x\|z] * sva[max\|min]_[f16\|f32\|f64]_[m\|x\|z] * sva[max\|min]_n_[f16\|f32\|f64]_[m\|x\|z] gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins-base.cc (svamax): Absolute maximum declaration. (svamin): Absolute minimum declaration. * config/aarch64/aarch64-sve-builtins-base.def (REQUIRED_EXTENSIONS): Add faminmax intrinsics behind a flag. (svamax): Absolute maximum declaration. (svamin): Absolute minimum declaration. * config/aarch64/aarch64-sve-builtins-base.h: Declaring function bases for the new intrinsics. * config/aarch64/aarch64.h (TARGET_SVE_FAMINMAX): New flag for SVE2 faminmax. * config/aarch64/iterators.md: New unspecs, iterators, and attrs for the new intrinsics. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve2/acle/asm/amax_f16.c: New test. * gcc.target/aarch64/sve2/acle/asm/amax_f32.c: New test. * gcc.target/aarch64/sve2/acle/asm/amax_f64.c: New test. * gcc.target/aarch64/sve2/acle/asm/amin_f16.c: New test. * gcc.target/aarch64/sve2/acle/asm/amin_f32.c: New test. * gcc.target/aarch64/sve2/acle/asm/amin_f64.c: New test.