path: root/gcc
Age, Commit message, Author, Files, Lines
5 hours[vxworks] [ppc] match TARGET_VXWORKS64 to TARGET_64BIT (HEAD, trunk, master)Alexandre Oliva1-0/+15
Configuring gcc for --target=powerpc-wrs-vxworks7r2 sets things up for a 64-bit compiler, just like powerpc64-wrs-vxworks7r2, except that TARGET_VXWORKS64 is only defined as 1 for targets that match *64-*-vxworks*. With !TARGET_VXWORKS64, we get a 64-bit toolchain that defines SIZE_TYPE, PTRDIFF_TYPE, and WCHAR_TYPE as 32-bit types, and that breaks GCC passes that expect SIZE_TYPE and PTRDIFF_TYPE to be as wide as pointers. Arrange for TARGET_VXWORKS64 on ppc to match TARGET_64BIT, after using it to select the default word size with driver self specs. for gcc/ChangeLog * config/rs6000/vxworks.h (SUBTARGET_DRIVER_SELF_SPECS): Redefine to select word size matching TARGET_VXWORKS64. (TARGET_VXWORKS64): Redefine in terms of TARGET_64BIT.
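The expectation that the commit above restores (size and pointer-difference types as wide as pointers) can be written down as a small check. This is only a sketch for a hosted C11 compiler; the function name is hypothetical and nothing here is taken from config/rs6000/vxworks.h.

```c
#include <assert.h>
#include <stddef.h>

/* Compile-time form of the expectation: on a toolchain with 64-bit
   pointers but 32-bit size_t or ptrdiff_t, these would fail to compile.  */
static_assert(sizeof(size_t) == sizeof(void *),
              "size_t must be as wide as a pointer");
static_assert(sizeof(ptrdiff_t) == sizeof(void *),
              "ptrdiff_t must be as wide as a pointer");

/* Run-time form of the same check (hypothetical helper): returns 1 when
   both index types match the pointer width, 0 otherwise.  */
int index_types_match_pointer_width(void)
{
    return sizeof(size_t) == sizeof(void *)
           && sizeof(ptrdiff_t) == sizeof(void *);
}
```

On a mismatched configuration such as the one described, the static assertions would be the first thing to fail.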
9 hoursDaily bump.GCC Administrator7-1/+268
10 hoursRISC-V: prefetch: fix LRA failing to allocate reg [PR118241]Vineet Gupta2-1/+34
prefetch was recently fixed/tightened (with the Q reg constraint) to only support the right address patterns (REG or REG+D with the lower 5 bits clear). However, in some cases that's too restrictive for LRA and it fails to allocate a reg, resulting in the following ICE:

| gcc/testsuite/gcc.target/riscv/pr118241-b.cc:31:19: error: unable to generate reloads for:
|    31 | void m() { a.l(); }
|       |                   ^
|(insn 26 25 27 7 (prefetch (mem/f:DI (plus:DI (reg/f:DI 143 [ _5 ])
|                (const_int 56 [0x38])) [5 _5->batch[6]+0 S8 A64])
|        (const_int 0 [0])
|        (const_int 3 [0x3])) "gcc/testsuite/gcc.target/riscv/pr118241-b.cc":18:29 498 {prefetch}
|     (expr_list:REG_DEAD (reg/f:DI 142 [ _5->batch[6] ])
|        (nil)))
|during RTL pass: reload

Fix that by providing a fallback alternative register constraint to reload the address.

PR target/118241

gcc/ChangeLog:

* config/riscv/riscv.md (prefetch): Add alternative "r".

gcc/testsuite/ChangeLog:

* gcc.target/riscv/pr118241-b.cc: New test.

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
10 hoursRISC-V: prefetch: const offset needs to have 5 bits zero, not 4Vineet Gupta1-2/+2
Spotted this by chance as I saw a similar fixup in a comment. From the comments, I think this is needed, but I've not hit any issues due to this. gcc/ChangeLog: * config/riscv/predicates.md (prefetch_operand): Mask 5 bits. Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
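The difference between masking 4 bits and 5 bits is easy to see on concrete offsets. The sketch below is not the predicate from predicates.md; both helper names are hypothetical, and only the "low 5 bits clear" rule comes from the commit messages above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when OFFSET has its low 5 bits clear,
   i.e. it is a multiple of 32.  */
bool low_5_bits_clear(int64_t offset)
{
    return (offset & 0x1f) == 0;
}

/* The looser check that masks only 4 bits, kept for comparison.  */
bool low_4_bits_clear(int64_t offset)
{
    return (offset & 0xf) == 0;
}

/* Small self-check exercising both helpers.  */
void check_prefetch_offsets(void)
{
    assert(low_5_bits_clear(64));    /* multiple of 32: accepted */
    assert(!low_5_bits_clear(56));   /* 0x38, the offset in the PR118241 dump */
    assert(low_4_bits_clear(16));    /* slips through a 4-bit mask ... */
    assert(!low_5_bits_clear(16));   /* ... but is rejected with 5 bits */
}
```

An offset of 16 is the kind of value that distinguishes the two masks: it passes the 4-bit check but is not a multiple of 32.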
17 hourssh: Recognize >> 31 in treg_set_expr_not_const01Raphael Moreira Zinsly4-10/+26
A right shift by 31 yields 0 or 1. treg_set_expr_not_const01 can check for this to avoid matching addc_t_r, since the shift can expand to a 3-insn sequence instead. This improves tests 023 to 026 from gcc.target/sh/pr54236-2.c, e.g.:

test_023:
    shll r5
    mov #0,r1
    mov r4,r0
    rts
    addc r1,r0

With this change:

test_023:
    shll r5
    movt r0
    rts
    add r4,r0

We noticed this while evaluating a patch to improve how we handle selecting between two constants based on the output of a LT/GE 0 test.

gcc/ChangeLog:

* config/sh/predicates.md (treg_set_expr_not_const01): Call sh_recog_treg_set_expr_not_01.
* config/sh/sh-protos.h (sh_recog_treg_set_expr_not_01): New function.
* config/sh/sh.cc (sh_recog_treg_set_expr_not_01): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/sh/pr54236-2.c: Fix comments and expected output.
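The property the SH change relies on can be shown in plain C. This is a sketch assuming a logical (unsigned) shift on a 32-bit value; the function names are hypothetical, and the second function is only a guess at the shape of the source behind test_023, read off the assembly above.

```c
#include <assert.h>
#include <stdint.h>

/* A logical right shift by 31 isolates the top bit of a 32-bit value,
   so the result can only be 0 or 1.  */
uint32_t top_bit(uint32_t x)
{
    uint32_t bit = x >> 31;
    assert(bit <= 1);   /* the 0-or-1 property stated in the commit */
    return bit;
}

/* Adds the top bit of B to A, mirroring the shll/movt/add sequence.  */
uint32_t add_top_bit(uint32_t a, uint32_t b)
{
    return a + top_bit(b);
}
```

Because the shifted value is already known to be 0 or 1, it can be materialized directly from the T bit and added, with no add-with-carry needed.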
17 hoursFortran: Silence a clang warning (suggesting a brace) in io.ccMartin Jambor1-1/+1
When GCC is built with clang, it suggests that we add a brace to the initialization of format_asterisk: gcc/fortran/io.cc:32:16: warning: suggest braces around initialization of subobject [-Wmissing-braces] So this patch does that to silence it. gcc/fortran/ChangeLog: 2025-06-24 Martin Jambor <mjambor@suse.cz> * io.cc (format_asterisk): Add a brace around static initialization location part of the field locus.
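For readers unfamiliar with the warning, the pattern looks roughly like this in C. The struct and variable names are invented for illustration and are not the ones used in gcc/fortran/io.cc.

```c
#include <assert.h>

/* Hypothetical nested aggregate: a location embedded in a larger record.  */
struct position { int line; int column; };
struct format_node { struct position where; int length; };

/* Writing the initializer as { 1, 2, 3 } is valid C, but a compiler with
   -Wmissing-braces enabled may suggest braces around the subobject.
   Bracing the nested struct explicitly avoids the warning.  */
static const struct format_node example_format = { { 1, 2 }, 3 };

/* Accessor used to show that the braced initializer fills the same fields.  */
int example_format_column(void)
{
    return example_format.where.column;
}

void check_example_format(void)
{
    assert(example_format.where.line == 1);
    assert(example_format.length == 3);
}
```

Both spellings initialize the same members; the braces only make the nesting explicit.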
18 hoursfold: Change comparison of error_mark_node to use error_operand_p in tree_expr_nonnegative_warnv_p [PR118948]Andrew Pinski2-1/+13
This is an obvious fix for this small regression. Basically after r15-328-g5726de79e2154a, there is a call to tree_expr_nonnegative_warnv_p where the type of the expression is now error_mark_node. Though there was only a check if the expression was error_mark_node. Bootstrapped and tested on x86_64-linux-gnu. PR c/118948 gcc/ChangeLog: * fold-const.cc (tree_expr_nonnegative_warnv_p): Use error_operand_p instead of checking for error_mark_node directly. gcc/testsuite/ChangeLog: * gcc.dg/pr118948-1.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
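The distinction between the two checks can be modeled with a toy node type. Everything below is a hypothetical stand-in: `struct node`, `error_node`, and both functions are illustrations, not GCC's `tree`, `error_mark_node`, or `error_operand_p`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression node: only records the node describing its type.  */
struct node {
    const struct node *type;   /* may be NULL for type nodes themselves */
};

/* Singleton playing the role of the error marker.  */
static const struct node error_node = { NULL };

/* Narrow check: catches only the expression itself being the marker.  */
bool is_error_node(const struct node *t)
{
    return t == &error_node;
}

/* Broader check: also catches an expression whose type is the marker.  */
bool is_error_operand(const struct node *t)
{
    return t == &error_node || (t != NULL && t->type == &error_node);
}

void check_error_operand(void)
{
    const struct node int_type = { NULL };
    const struct node good = { &int_type };
    const struct node bad_type = { &error_node };

    assert(!is_error_operand(&good));
    assert(!is_error_node(&bad_type));    /* missed by the narrow check */
    assert(is_error_operand(&bad_type));  /* caught by the broader one */
}
```

The regression described above corresponds to the `bad_type` case: a node that is not itself the marker but whose type is.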
19 hoursc++: -Wtemplate-body and tentative parsing [PR120575]Jason Merrill2-1/+13
Here we were asserting non-zero errorcount, which is not the case if the parse error was reduced to a warning (or silenced) in a template body. So check seen_error instead. PR c++/120575 PR c++/116064 gcc/cp/ChangeLog: * parser.cc (cp_parser_abort_tentative_parse): Check seen_error instead of errorcount. gcc/testsuite/ChangeLog: * g++.dg/template/permissive-error3.C: New test.
20 hoursRISC-V: Add test for vec_duplicate + vsadd.vv combine case 1 with GR2VR cost 0, 1 and 2Pan Li12-0/+24
Add asm dump check tests for the vec_duplicate + vsadd.vv combine to vsadd.vx, with GR2VR costs of 0, 1 and 2. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: Ditto. Signed-off-by: Pan Li <pan2.li@intel.com>
20 hoursRISC-V: Add test for vec_duplicate + vsadd.vv combine case 0 with GR2VR cost 0, 2 and 15Pan Li18-14/+311
Add asm dump check and run tests for the vec_duplicate + vsadd.vv combine to vsadd.vx, with GR2VR costs of 0, 2 and 15. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check. * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test data for run test. * gcc.target/riscv/rvv/autovec/vx_vf/vx_vsadd-run-1-i16.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vx_vsadd-run-1-i32.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vx_vsadd-run-1-i64.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vx_vsadd-run-1-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
20 hoursRISC-V: Combine vec_duplicate + vsadd.vv to vsadd.vx on GR2VR costPan Li3-2/+5
This patch combines vec_duplicate + vsadd.vv into vsadd.vx, as in the example code below. The related pattern depends on the cost of vec_duplicate from GR2VR: late-combine performs the combination if the GR2VR cost is zero and rejects it if the GR2VR cost is greater than zero.

Assume we have example code like the following, with a GR2VR cost of 0.

    #define DEF_SAT_S_ADD(T, UT, MIN, MAX) \
    T                                      \
    test_##T##_sat_add (T x, T y)          \
    {                                      \
      T sum = (UT)x + (UT)y;               \
      return (x ^ y) < 0                   \
        ? sum                              \
        : (sum ^ x) >= 0                   \
          ? sum                            \
          : x < 0 ? MIN : MAX;             \
    }

    DEF_SAT_S_ADD(int32_t, uint32_t, INT32_MIN, INT32_MAX)
    DEF_VX_BINARY_CASE_2_WRAP(T, SAT_S_ADD_FUNC(T), sat_add)

Before this patch:

    test_vx_binary_or_int32_t_case_0:
        beq a3,zero,.L8
        vsetvli a5,zero,e32,m1,ta,ma
        vmv.v.x v2,a2
        slli a3,a3,32
        srli a3,a3,32
    .L3:
        vsetvli a5,a3,e32,m1,ta,ma
        vle32.v v1,0(a1)
        slli a4,a5,2
        sub a3,a3,a5
        add a1,a1,a4
        vsadd.vv v1,v1,v2
        vse32.v v1,0(a0)
        add a0,a0,a4
        bne a3,zero,.L3

After this patch:

    test_vx_binary_or_int32_t_case_0:
        beq a3,zero,.L8
        slli a3,a3,32
        srli a3,a3,32
    .L3:
        vsetvli a5,a3,e32,m1,ta,ma
        vle32.v v1,0(a1)
        slli a4,a5,2
        sub a3,a3,a5
        add a1,a1,a4
        vsadd.vx v1,v1,a2
        vse32.v v1,0(a0)
        add a0,a0,a4
        bne a3,zero,.L3

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_vx_binary_vec_dup_vec): Add new case SS_PLUS.
(expand_vx_binary_vec_vec_dup): Ditto.
* config/riscv/riscv.cc (riscv_rtx_costs): Ditto.
* config/riscv/vector-iterators.md: Add new op ss_plus.

Signed-off-by: Pan Li <pan2.li@intel.com>
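The scalar saturating add used in the commit above can be instantiated by hand for `int32_t` to see what each branch does. This is a sketch following the macro's expression; the function name is hypothetical, and the conversion of the wrapped sum back to `int32_t` assumes the two's-complement wrapping that GCC and Clang provide.

```c
#include <assert.h>
#include <stdint.h>

/* Signed saturating add for int32_t, written after DEF_SAT_S_ADD.  */
int32_t sat_add_i32(int32_t x, int32_t y)
{
    /* Add in unsigned arithmetic so the wrap-around is well defined.  */
    int32_t sum = (int32_t)((uint32_t)x + (uint32_t)y);

    return (x ^ y) < 0        /* operands differ in sign: cannot overflow */
        ? sum
        : (sum ^ x) >= 0      /* result kept the operands' sign: no overflow */
            ? sum
            : x < 0 ? INT32_MIN : INT32_MAX;   /* clamp toward x's sign */
}

void check_sat_add_i32(void)
{
    assert(sat_add_i32(1, 2) == 3);
    assert(sat_add_i32(5, -3) == 2);
    assert(sat_add_i32(INT32_MAX, 1) == INT32_MAX);
    assert(sat_add_i32(INT32_MIN, -1) == INT32_MIN);
}
```

A loop applying this function to each element of an array with a scalar second operand is the kind of code that ends up as `vsadd.vv` or `vsadd.vx` in the dumps above.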
23 hourstree-optimization/120944 - bogus VN with volatile copiesRichard Biener2-2/+41
The following avoids translating expressions through volatile copies. PR tree-optimization/120944 * tree-ssa-sccvn.cc (vn_reference_lookup_3): Gate optimizations invalid when volatile is involved. * gcc.dg/torture/pr120944.c: New testcase.
25 hoursAda: Switch from ACATS 2.6 to ACATS 4.2 testsuiteEric Botcazou1-1/+1
This effectively adds 250 new tests, i.e. around 10% more tests. gcc/ada/ * gcc-interface/Make-lang.in (ACATSDIR): Change to acats-4.
26 hoursada: Fix alignment violation for chain of aligned and misaligned composite typesEric Botcazou1-1/+3
This happens when aggressive optimizations are enabled (i.e. -O2 and above) because the ivopts pass fails to properly mark the new memory accesses it is creating as misaligned by means of the build_aligned_type function. gcc/ada/ChangeLog: * gcc-interface/utils.cc (make_packable_type): Clear the TYPE_PACKED flag in the case where the alignment is bumped.
26 hoursada: Remove strange elaboration code generated for Cluster type in System.Pack_NNEric Botcazou1-3/+7
Initialization procedures are turned into functions under the hood and, even when they are null (empty), the compiler may generate a convoluted sequence of instructions that returns uninitialized data and is therefore useless. gcc/ada/ChangeLog: * gcc-interface/trans.cc (Subprogram_Body_to_gnu): Do not generate a block-copy out for a null initialization procedure when the _Init parameter is not passed in.
26 hoursada: Disable previous change for enumeration typesEric Botcazou1-3/+3
The debugger cannot correctly interpret the return value in this case. gcc/ada/ChangeLog: * gcc-interface/decl.cc (gnat_to_gnu_subprog_type): Only apply the transformation to integer types.
26 hoursada: Add missing guards to previous changeEric Botcazou1-0/+4
We need to make sure that an integer type exists for the given size. gcc/ada/ChangeLog: * gcc-interface/decl.cc (gnat_to_gnu_subprog_type): Add guards.
26 hoursada: Improve code generated for return of Out parameter with access typeEric Botcazou2-4/+38
The second problem occurs on 64-bit platforms where there is a second Out parameter that is smaller than the access parameter, creating a hole in the return structure. gcc/ada/ChangeLog: * gcc-interface/decl.cc (gnat_to_gnu_subprog_type): In the case of a subprogram using the Copy-In/Copy-Out mechanism, deal specially with the case of 2 parameters of differing sizes. * gcc-interface/trans.cc (Subprogram_Body_to_gnu): In the case of a subprogram using the Copy-In/Copy-Out mechanism, make sure the types are consistent on the two sides for all the parameters.
26 hoursada: Do not generate incorrect warning about redundant type conversionSteve Baird1-10/+10
If -gnatwr is enabled, then in some cases a type conversion between two different Boolean types incorrectly results in a warning that the conversion is redundant. gcc/ada/ChangeLog: * sem_res.adb (Resolve_Type_Conversion): Replace code for detecting a similar case with a more comprehensive test.
26 hoursada: Pragma Short_Circuit_And_OrBob Duff3-13/+87
Improve documentation of pragma Short_Circuit_And_Or. Also disallow renamings, because the semantics as currently implemented is confusing. gcc/ada/ChangeLog: * doc/gnat_rm/implementation_defined_pragmas.rst (Short_Circuit_And_Or): Add more documentation. * sem_ch8.adb (Analyze_Subprogram_Renaming): Disallow renamings. * gnat_rm.texi: Regenerate.
26 hoursada: Fix selection of Finalize subprogram in untagged caseRonan Desplanques2-10/+20
The newly introduced Finalizable aspect makes it possible to derive from a type that is not tagged but has a Finalize primitive. This patch fixes problems where overridings of the Finalize primitive were ignored. gcc/ada/ChangeLog: * exp_ch7.adb (Make_Final_Call): Tweak search of Finalize primitive. * exp_util.adb (Finalize_Address): Likewise.
26 hoursada: Fix inefficient Unchecked_Conversion to large array typeEric Botcazou1-5/+78
We fail to use the implementation permission given by RM 13.9(12) because the array type does not have the Size_Known_At_Compile_Time flag set. gcc/ada/ChangeLog: * freeze.adb (Check_Compile_Time_Size): Try harder to see whether the bounds of array types are known at compile time.
26 hoursada: Fix style in commentPiotr Trojanek1-1/+1
Cleanup; technical commit meant to trigger a GNAT continuous builder. gcc/ada/ChangeLog: * sem_aux.ads (First_Discriminant): Remove space before period.
26 hoursada: Missing component clause warning for discriminant of Unchecked_Union typeSteve Baird1-2/+41
Even when -gnatw.c is enabled, no warning about a missing component clause should be generated if the placement of a discriminant of an Unchecked_Union type is left unspecified in a record representation clause (such a discriminant occupies no storage). In determining whether to generate such a warning, in some cases the compiler would incorrectly ignore an Unchecked_Union pragma occurring after the record representation clause. This could result in a spurious warning. gcc/ada/ChangeLog: * sem_ch13.adb (Analyze_Record_Representation_Clause): In deciding whether to generate a warning about a missing component clause, in addition to calling Is_Unchecked_Union also call a new local function, Unchecked_Union_Pragma_Pending, which checks for the case of a not-yet-analyzed Unchecked_Union pragma occurring later in the declaration list.
26 hoursada: Improved error message when size of descendant type exceeds Size'Class limitSteve Baird1-20/+40
Improve the error message that is generated when the size of a tagged type exceeds a Size'Class limit specified for an ancestor type. gcc/ada/ChangeLog: * mutably_tagged.adb (Make_CW_Size_Compile_Check): Include the value of the Size'Class limit in the message generated via a Compile_Time_Error pragma.
26 hoursada: Remove leftover from rework of aspect representationRonan Desplanques1-11/+3
This patch removes some comments and object definitions that referred to a hacky use of the Entity field that had been removed by the latest rework of the internal representation of aspects. gcc/ada/ChangeLog: * sem_ch13.adb (Check_Aspect_At_Freeze_Point): Remove obsolete bits.
26 hoursada: Fix error on Designated_Storage_Model with extensions disabledRonan Desplanques1-0/+1
The format string used for the error in that case requires setting the Error_Msg_Name_1 global variable. This was not done so this patch adds the missing assignment. gcc/ada/ChangeLog: * sem_ch13.adb (Analyze_Aspect_Specifications): Fix error emission.
26 hoursRegenerate common.opt.urls and add period into common.optJan Hubicka2-1/+4
gcc/ChangeLog: * common.opt: Add period. * common.opt.urls: Regenerate.
27 hourstree-optimization/120927 - 510.parest_r segfault with masked epilogRichard Biener3-4/+60
The following fixes bad alignment computation for epilog vectorization when as in this case for 510.parest_r and masked epilog vectorization with AVX512 we end up choosing AVX to vectorize the main loop and masked AVX512 (sic!) to vectorize the epilog. In that case alignment analysis for the epilog tries to force alignment of the base to 64, but that cannot possibly help the epilog when the main loop had used a vector mode with smaller alignment requirement. There's another issue, that the check whether the step preserves alignment needs to consider possibly previously involved VFs (here, the main loop's smaller VF) as well. These might not be the only case with problems for such a mode mix but at least there it seems wise to never use DR alignment forcing when analyzing an epilog. We get to choose this mode setup because the iteration over epilog modes doesn't prevent this, the maybe_ge (cached_vf_per_mode[0], first_vinfo_vf) skip is conditional on !supports_partial_vectors and it is also conditional on having a cached VF. Further nothing in vect_analyze_loop_1 rejects this setup - it might be conceivable that a target can do masking only for larger modes. There is a second reason we end up with this mode setup, which is that vect_need_peeling_or_partial_vectors_p says we do not need peeling or partial vectors when analyzing the main loop with AVX512 (if it would say so we'd have chosen a masked AVX512 epilog-only vectorization). It does that because it looks at LOOP_VINFO_COST_MODEL_THRESHOLD (which is not yet computed, so always zero at this point), and compares max_niter (5) against the VF (8), but not with equality as the comment says but with greater. This also needs looking at, PR120939. PR tree-optimization/120927 * tree-vect-data-refs.cc (vect_compute_data_ref_alignment): Do not force a DR's base alignment when analyzing an epilog loop. Check whether the step preserves alignment for all VFs possibly involved so far. * gcc.dg/vect/vect-pr120927.c: New testcase. 
* gcc.dg/vect/vect-pr120927-2.c: Likewise.
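The "step preserves alignment for all VFs involved" condition from the PR120927 fix above can be illustrated with a toy check. This is not the code in tree-vect-data-refs.cc; the function, the arrays, and the concrete numbers (8-byte elements, vectorization factors 4 and 8, a 64-byte target alignment) are assumptions chosen to resemble an AVX main loop followed by an AVX512 epilog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True when advancing by step_bytes * VF keeps an address aligned to
   align_bytes for every vectorization factor listed in vfs[].  */
bool step_preserves_alignment(size_t step_bytes, const unsigned *vfs,
                              size_t n_vfs, size_t align_bytes)
{
    assert(align_bytes != 0);
    for (size_t i = 0; i < n_vfs; i++)
        if ((step_bytes * vfs[i]) % align_bytes != 0)
            return false;
    return true;
}

/* Hypothetical VF sets: the epilog alone, and main loop plus epilog.  */
static const unsigned vfs_epilog_only[] = { 8 };
static const unsigned vfs_main_and_epilog[] = { 4, 8 };
```

With these numbers, looking only at the epilog's VF of 8 suggests that 64-byte alignment is preserved (8 * 8 = 64), while including the main loop's VF of 4 shows that it is not (8 * 4 = 32).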
27 hoursc-family: Tweak ptr +- (expr +- cst) FE optimization [PR120837]Jakub Jelinek2-14/+67
The following testcase is miscompiled with -fsanitize=undefined but we introduce UB into the IL even without that flag. The optimization ptr +- (expr +- cst) when expr/cst have undefined overflow into (ptr +- cst) +- expr is sometimes simply not valid, without careful analysis on what ptr points to we don't know if it is valid to do (ptr +- cst) pointer arithmetics. E.g. on the testcase, ptr points to start of an array (actually conditionally one or another) and cst is -1, so ptr - 1 is invalid pointer arithmetics, while ptr + (expr - 1) can be valid if expr is at runtime always > 1 and smaller than size of the array ptr points to + 1. Unfortunately, removing this 1992-ish optimization altogether causes FAIL: c-c++-common/restrict-2.c -Wc++-compat scan-tree-dump-times lim2 "Moving statement" 11 FAIL: gcc.dg/tree-ssa/copy-headers-5.c scan-tree-dump ch2 "is now do-while loop" FAIL: gcc.dg/tree-ssa/copy-headers-5.c scan-tree-dump-times ch2 " if " 3 FAIL: gcc.dg/vect/pr57558-2.c scan-tree-dump vect "vectorized 1 loops" FAIL: gcc.dg/vect/pr57558-2.c -flto -ffat-lto-objects scan-tree-dump vect "vectorized 1 loops" regressions (restrict-2.c also for C++ in all std modes). I've been thinking about some match.pd optimization for signed integer addition/subtraction of constant followed by widening integral conversion followed by multiplication or left shift, but that wouldn't help 32-bit arches. So, instead at least for now, the following patch keeps doing the optimization, just doesn't perform it in pointer arithmetics. 
pointer_int_sum itself actually adds the multiplication by size_exp, so ptr + expr is turned into ptr p+ expr * size_exp, so this patch will try to optimize ptr + (expr +- cst) into ptr p+ ((sizetype)expr * size_exp +- (sizetype)cst * size_exp) and ptr - (expr +- cst) into ptr p+ -((sizetype)expr * size_exp +- (sizetype)cst * size_exp) 2025-07-04 Jakub Jelinek <jakub@redhat.com> PR c/120837 * c-common.cc (pointer_int_sum): Rewrite the intop PLUS_EXPR or MINUS_EXPR optimization into extension of both intop operands, their separate multiplication and then addition/subtraction followed by rest of pointer_int_sum handling after the multiplication. * gcc.dg/ubsan/pr120837.c: New test.
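The hazard described in the PR120837 commit above is easier to see on a small array. The sketch below uses hypothetical names and a three-element table; it uses `ptrdiff_t` for the scaled offsets where the commit message speaks of `sizetype`, so it illustrates the shape of the rewrite rather than the front end's exact arithmetic.

```c
#include <assert.h>
#include <stddef.h>

static const int table[3] = { 10, 20, 30 };

/* Valid for 1 <= i <= 3: the offset i - 1 is computed as an integer
   first and only then applied to the pointer.  Rewriting this as
   (base - 1) + i would form a pointer before the start of the array.  */
int element_before(const int *base, int i)
{
    assert(i >= 1);
    return *(base + (i - 1));
}

/* Same access with both operands widened and scaled by the element size
   separately, then combined, so no base - 1 pointer is ever formed.  */
int element_before_scaled(const int *base, int i)
{
    ptrdiff_t byte_off = (ptrdiff_t)i * (ptrdiff_t)sizeof(int)
                         - (ptrdiff_t)1 * (ptrdiff_t)sizeof(int);
    return *(const int *)((const char *)base + byte_off);
}
```

Both functions read the same element; the second one spells out the "extend, multiply separately, then add or subtract" order that the ChangeLog entry describes.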
30 hourstestsuite: Rename a testXi Ruoyao1-0/+0
I mistyped the file name :(. gcc/testsuite/ChangeLog: PR target/120807 * gcc.c-torture/compile/pr120708.c: Rename to ... * gcc.c-torture/compile/pr120807.c: ... here.
30 hoursLoongArch: Prevent subreg of subreg in CRCXi Ruoyao2-1/+22
The register_operand predicate can match subreg, then we'd have a subreg of subreg and it's invalid. Use lowpart_subreg to avoid the nested subreg. gcc/ChangeLog: * config/loongarch/loongarch.md (crc_combine): Avoid nested subreg. gcc/testsuite/ChangeLog: * gcc.c-torture/compile/pr120708.c: New test.
30 hours[RISC-V] Add basic instrumentation to fusion detectionShreya Munnangi1-16/+64
We were looking to evaluate some changes from Artemiy that improve GCC's ability to discover fusible instruction pairs. There was no good way to get any static data out of the compiler about what kinds of fusions were happening. Yea, you could grub around the .sched dumps looking for the magic '+' annotation, then look around at the slim RTL representation and make an educated guess about what fused. But boy that was inconvenient. All we really needed was a quick note in the dump file that the target hook found a fusion pair and what kind was discovered. That made it easy to spot invalid fusions, evaluate the effectiveness of Artemiy's work, write/discover testcases for existing fusions and implement new fusions. So from a codegen standpoint this is NFC, it only affects dump file output. It's gone through the usual testing and I'll wait for pre-commit CI to churn through it before moving forward. gcc/ * config/riscv/riscv.cc (riscv_macro_fusion_pair_p): Add basic instrumentation to all cases where fusion is detected. Fix minor formatting goofs found in the process.
32 hoursRISC-V: Add testcases for signed scalar SAT_ADD IMM form 2panciyan12-0/+440
This patch adds testcase for form2, as shown below: T __attribute__((noinline)) \ sat_s_add_imm_##T##_fmt_2##_##INDEX (T x) \ { \ T sum = (T)((UT)x + (UT)IMM); \ return ((x ^ sum) < 0 && (x ^ IMM) >= 0) ? \ (-(T)(x < 0) ^ MAX) : sum; \ } Passed the rv64gcv regression test. Signed-off-by: Ciyan Pan <panciyan@eswincomputing.com> gcc/testsuite/ChangeLog: * gcc.target/riscv/sat/sat_arith.h: Add signed scalar SAT_ADD IMM form2. * gcc.target/riscv/sat/sat_s_add_imm-2-i16.c: New test. * gcc.target/riscv/sat/sat_s_add_imm-2-i32.c: New test. * gcc.target/riscv/sat/sat_s_add_imm-2-i64.c: New test. * gcc.target/riscv/sat/sat_s_add_imm-2-i8.c: New test. * gcc.target/riscv/sat/sat_s_add_imm-run-2-i16.c: New test. * gcc.target/riscv/sat/sat_s_add_imm-run-2-i32.c: New test. * gcc.target/riscv/sat/sat_s_add_imm-run-2-i64.c: New test. * gcc.target/riscv/sat/sat_s_add_imm-run-2-i8.c: New test. * gcc.target/riscv/sat/sat_s_add_imm_type_check-2-i16.c: New test. * gcc.target/riscv/sat/sat_s_add_imm_type_check-2-i32.c: New test. * gcc.target/riscv/sat/sat_s_add_imm_type_check-2-i8.c: New test.
32 hoursMatch: Support for signed scalar SAT_ADD IMM form 2panciyan1-1/+12
This patch adds support for signed scalar SAT_ADD IMM form 2.

Form 2:

    T __attribute__((noinline))                          \
    sat_s_add_imm_##T##_fmt_2##_##INDEX (T x)            \
    {                                                    \
      T sum = (T)((UT)x + (UT)IMM);                      \
      return ((x ^ sum) < 0 && (x ^ IMM) >= 0) ?         \
        (-(T)(x < 0) ^ MAX) : sum;                       \
    }

Take the form 2 instance below as an example:

    DEF_SAT_S_ADD_IMM_FMT_2(0, int8_t, uint8_t, 9, INT8_MIN, INT8_MAX)

Before this patch:

    __attribute__((noinline))
    int8_t sat_s_add_imm_int8_t_fmt_2_0 (int8_t x)
    {
      int8_t sum;
      unsigned char x.0_1;
      unsigned char _2;
      signed char _3;
      signed char _4;
      _Bool _5;
      signed char _6;
      int8_t _7;
      int8_t _10;
      signed char _11;
      signed char _13;
      signed char _14;

      <bb 2> [local count: 1073741822]:
      x.0_1 = (unsigned char) x_8(D);
      _2 = x.0_1 + 9;
      sum_9 = (int8_t) _2;
      _3 = x_8(D) ^ sum_9;
      _4 = x_8(D) ^ 9;
      _13 = ~_3;
      _14 = _4 | _13;
      if (_14 >= 0)
        goto <bb 3>; [59.00%]
      else
        goto <bb 4>; [41.00%]

      <bb 3> [local count: 259738146]:
      _5 = x_8(D) < 0;
      _11 = (signed char) _5;
      _6 = -_11;
      _10 = _6 ^ 127;

      <bb 4> [local count: 1073741824]:
      # _7 = PHI <sum_9(2), _10(3)>
      return _7;
    }

After this patch:

    __attribute__((noinline))
    int8_t sat_s_add_imm_int8_t_fmt_2_0 (int8_t x)
    {
      int8_t _7;

      <bb 2> [local count: 1073741824]:
      _7 = .SAT_ADD (x_8(D), 9); [tail call]
      return _7;
    }

The following test suites pass with this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.

Signed-off-by: Ciyan Pan <panciyan@eswincomputing.com>

gcc/ChangeLog:

* match.pd: Add signed scalar SAT_ADD IMM form2 matching.
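Form 2 can also be instantiated by hand for `int8_t` to check its corner cases. This is a sketch after the macro in the commit above; the function name is hypothetical, the immediate is passed as a parameter rather than baked in, and the narrowing conversion to `int8_t` assumes the two's-complement wrapping that GCC and Clang implement.

```c
#include <assert.h>
#include <stdint.h>

/* Signed saturating add of an immediate, following form 2.  */
int8_t sat_s_add_imm_i8(int8_t x, int8_t imm)
{
    /* Wrapped 8-bit sum, computed through unsigned char arithmetic.  */
    int8_t sum = (int8_t)(uint8_t)((uint8_t)x + (uint8_t)imm);

    /* Overflow happened when x and imm share a sign but sum does not.  */
    return ((x ^ sum) < 0 && (x ^ imm) >= 0)
        ? (int8_t)(-(int8_t)(x < 0) ^ INT8_MAX)   /* INT8_MIN or INT8_MAX */
        : sum;
}

void check_sat_s_add_imm_i8(void)
{
    assert(sat_s_add_imm_i8(10, 9) == 19);
    assert(sat_s_add_imm_i8(120, 9) == INT8_MAX);    /* clamps high */
    assert(sat_s_add_imm_i8(-128, 9) == -119);       /* no overflow */
    assert(sat_s_add_imm_i8(-125, -9) == INT8_MIN);  /* clamps low */
}
```

The expression `-(int8_t)(x < 0) ^ INT8_MAX` yields 127 for non-negative `x` and -128 for negative `x`, which is how the form selects the clamp value without a branch.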
33 hoursDaily bump.GCC Administrator6-1/+637
34 hoursc++: trivial lambda pruning [PR120716]Jason Merrill3-1/+26
In this testcase there is nothing in the lambda except a static_assert which mentions a variable from the enclosing scope but does not odr-use it, so we want prune_lambda_captures to remove its capture. Since the lambda is so empty, there's nothing in the body except the DECL_EXPR of the capture proxy, so pop_stmt_list moves that into the enclosing STATEMENT_LIST and passes the 'body' STATEMENT_LIST to free_stmt_list. As a result, passing 'body' to prune_lambda_captures is wrong; we should instead pass the enclosing scope, i.e. cur_stmt_list. PR c++/120716 gcc/cp/ChangeLog: * lambda.cc (finish_lambda_function): Pass cur_stmt_list to prune_lambda_captures. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/lambda/lambda-constexpr3.C: New test. * g++.dg/cpp0x/lambda/lambda-constexpr3a.C: New test.
34 hoursc++: ICE with 'this' in lambda signature [PR120748]Jason Merrill4-6/+49
This testcase was crashing from infinite recursion in the diagnostic machinery, trying to print the lambda signature, which referred to the __this capture field in the lambda, which wanted to print the lambda again. But we don't want the signature to refer to the capture field; 'this' in an unevaluated context refers to the 'this' from the enclosing function, not the capture. After fixing that, we still wrongly rejected the B case because THIS_FORBIDDEN is set in a default (template) argument. Since we don't distinguish between THIS_FORBIDDEN being set for a default argument and it being set for a static member function, let's just ignore it if cp_unevaluated_operand; we'll give a better diagnostic for the static memfn case in finish_this_expr. PR c++/120748 gcc/cp/ChangeLog: * lambda.cc (lambda_expr_this_capture): Don't return a FIELD_DECL. * parser.cc (cp_parser_primary_expression): Ignore THIS_FORBIDDEN if cp_unevaluated_operand. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/lambda-targ16.C: New test. * g++.dg/cpp0x/this1.C: Adjust diagnostics.
37 hoursc++: Fix a pasto in the PR120471 fix [PR120940]Jakub Jelinek3-1/+30
No idea how this slipped in, I'm terribly sorry. Strangely nothing in the testsuite has caught this, so I've added a new test for that. 2025-07-03 Jakub Jelinek <jakub@redhat.com> PR c++/120940 * typeck.cc (cp_build_array_ref): Fix a pasto. * g++.dg/parse/pr120940.C: New test. * g++.dg/warn/Wduplicated-branches9.C: New test.
39 hoursAda: Remove left-overs of front-end exception mechanismEric Botcazou2-31/+0
It was removed from the compiler a few releases ago. gcc/ada/ * gcc-interface/Makefile.in (gnatlib-sjlj): Delete. (gnatlib-zcx): Do not modify Frontend_Exceptions constant. * libgnat/system-linux-loongarch.ads (Frontend_Exceptions): Delete.
41 hourss390: More vec-perm-const cases.Juergen Christ3-2/+542
s390 missed constant vector permutation cases based on the vector pack instruction or changing the size of the vector elements during vector merge. This enables some more patterns that do not need to load a constant vector for permutation. gcc/ChangeLog: * config/s390/s390.cc (expand_perm_with_merge): Add size change cases. (expand_perm_with_pack): New function. (vectorize_vec_perm_const_1): Wire up new function. gcc/testsuite/ChangeLog: * gcc.target/s390/vector/vec-perm-merge-1.c: New test. * gcc.target/s390/vector/vec-perm-pack-1.c: New test. Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
44 hoursOpenMP: Add omp_get_initial_device/omp_get_num_devices builtins: Fix test casesThomas Schwinge2-4/+4
With this fix-up for commit 387209938d2c476a67966c6ddbdbf817626f24a2 "OpenMP: Add omp_get_initial_device/omp_get_num_devices builtins", we progress: PASS: c-c++-common/gomp/omp_get_num_devices_initial_device.c (test for excess errors) PASS: c-c++-common/gomp/omp_get_num_devices_initial_device.c scan-tree-dump-not optimized "abort" -FAIL: c-c++-common/gomp/omp_get_num_devices_initial_device.c scan-tree-dump-times optimized "omp_get_num_devices;" 1 +PASS: c-c++-common/gomp/omp_get_num_devices_initial_device.c scan-tree-dump-times optimized "omp_get_num_devices" 1 PASS: c-c++-common/gomp/omp_get_num_devices_initial_device.c scan-tree-dump optimized "_1 = __builtin_omp_get_num_devices \\(\\);[\\r\\n]+[ ]+return _1;" ... etc. for offloading configurations. gcc/testsuite/ * c-c++-common/gomp/omp_get_num_devices_initial_device.c: Fix. * gfortran.dg/gomp/omp_get_num_devices_initial_device.f90: Likewise.
44 hours[RISC-V][PR target/118886] Refine when two insns are signaled as fusion candidatesJeff Law1-57/+80
A number of folks have had their fingers in this code and it's going to take a few submissions to do everything we want to do. This patch is primarily concerned with avoiding signaling that fusion can occur in cases where it obviously should not be signaling fusion.

Every DEC based fusion I'm aware of requires the first instruction to set a destination register that is both used and set again by the second instruction. If the two instructions set different registers, then the destination of the first instruction was not dead and would need to have a result produced. This is complicated by the fact that we have pseudo registers prior to reload. So the approach we take is to signal fusion prior to reload even if the destination registers don't match. Post reload we require them to match. That allows us to clean up the code ever-so-slightly.

Second, we sometimes signaled fusion into loads that weren't scalar integer loads. I'm not aware of a design that's fusing into FP loads or vector loads. So those get rejected explicitly.

Third, the store pair "fusion" code is cleaned up a little. We use fusion to model store pair commits since the basic properties for detection are the same. The point where they "fuse" is different. Also this code liked to "return false" at each step along the way if fusion wasn't possible. Future work for additional fusion cases makes that behavior undesirable. So the logic gets reworked a little bit to be more friendly to future work.

Fourth, if we already fused the previous instruction, then we can't fuse it again. Signaling fusion in that case is, umm, bad as it creates an atomic blob of code from a scheduling standpoint.

Hopefully I got everything correct with extracting this work out of a larger set of changes 🙂 We will contribute some instrumentation & testing code so if I botched things in a major way we'll soon have a way to test that and I'll be on the hook to fix any goofs.
From a correctness standpoint this should be a big fat nop. We've seen this make measurable differences in pico benchmarks, but obviously as you scale up to bigger stuff the gains largely disappear into the noise. This has been through Ventana's internal CI and my tester. I'll obviously wait for a verdict from the pre-commit tester. PR target/118886 gcc/ * config/riscv/riscv.cc (riscv_macro_fusion_pair_p): Check for fusion being disabled earlier. If PREV is already fused, then it can't be fused again. Be more selective about fusing when the destination registers do not match. Don't fuse into loads that aren't scalar integer modes. Revamp store pair commit support. Co-authored-by: Daniel Barboza <dbarboza@ventanamicro.com> Co-authored-by: Shreya Munnangi <smunnangi1@ventanamicro.com>
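The register rules described in this commit can be modeled on a toy instruction record. This is a deliberately simplified sketch: `struct toy_insn` and `toy_fusion_candidate_p` are hypothetical and bear no relation to the data structures or control flow of `riscv_macro_fusion_pair_p`; only the three rules in the comments are taken from the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy instruction: one destination register and two source registers.  */
struct toy_insn {
    int dest;
    int src1;
    int src2;
    bool already_fused;   /* already part of a fused pair */
};

bool toy_fusion_candidate_p(const struct toy_insn *prev,
                            const struct toy_insn *curr,
                            bool after_reload)
{
    /* An instruction that was already fused cannot be fused again.  */
    if (prev->already_fused)
        return false;

    /* The second instruction must use the first one's result.  */
    if (curr->src1 != prev->dest && curr->src2 != prev->dest)
        return false;

    /* Before reload the registers are pseudos, so a mismatch is tolerated;
       after reload the second instruction must overwrite the same register. */
    return !after_reload || curr->dest == prev->dest;
}

void check_toy_fusion(void)
{
    const struct toy_insn first = { 5, 1, 2, false };
    const struct toy_insn same_dest = { 5, 5, 3, false };
    const struct toy_insn other_dest = { 6, 5, 3, false };
    const struct toy_insn fused = { 5, 1, 2, true };

    assert(toy_fusion_candidate_p(&first, &same_dest, true));
    assert(!toy_fusion_candidate_p(&first, &other_dest, true));
    assert(toy_fusion_candidate_p(&first, &other_dest, false));
    assert(!toy_fusion_candidate_p(&fused, &same_dest, true));
}
```

The `other_dest` case shows the pre-reload and post-reload difference: the pair is signaled while registers are still pseudos and rejected once hard registers are known to differ.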
45 hourstestsuite: Fix gcc.dg/ipa/pr120295.c on SolarisRainer Orth1-2/+2
gcc.dg/ipa/pr120295.c FAILs on Solaris: FAIL: gcc.dg/ipa/pr120295.c (test for excess errors) Excess errors: ld: warning: symbol 'glob' has differing types: (file /var/tmp//ccsDR59c.o type=OBJT; file /lib/libc.so type=FUNC); /var/tmp//ccsDR59c.o definition taken Fixed by renaming the glob variable to glob_ to avoid the conflict. Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu. gcc/testsuite: * gcc.dg/ipa/pr120295.c (glob): Rename to glob_.
45 hoursAArch64: make rules for CBZ/TBZ higher priorityKarl Meakin2-92/+105
Move the rules for CBZ/TBZ to be above the rules for CBB<cond>/CBH<cond>/CB<cond>. We want them to have higher priority because they can express larger displacements. gcc/ChangeLog: * config/aarch64/aarch64.md (aarch64_cbz<optab><mode>1): Move above rules for CBB<cond>/CBH<cond>/CB<cond>. (*aarch64_tbz<optab><mode>1): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/cmpbr.c: Update tests.
45 hoursAArch64: rules for CMPBR instructionsKarl Meakin6-447/+450
Add rules for lowering `cbranch<mode>4` to CBB<cond>/CBH<cond>/CB<cond> when CMPBR extension is enabled. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch64_cb_rhs): New function. * config/aarch64/aarch64.cc (aarch64_cb_rhs): Likewise. * config/aarch64/aarch64.md (cbranch<mode>4): Rename to ... (cbranch<GPI:mode>4): ...here, and emit CMPBR if possible. (cbranch<SHORT:mode>4): New expand rule. (aarch64_cb<INT_CMP:code><GPI:mode>): New insn rule. (aarch64_cb<INT_CMP:code><SHORT:mode>): Likewise. * config/aarch64/constraints.md (Uc0): New constraint. (Uc1): Likewise. (Uc2): Likewise. * config/aarch64/iterators.md (cmpbr_suffix): New mode attr. (INT_CMP): New code iterator. (cmpbr_imm_constraint): New code attr. gcc/testsuite/ChangeLog: * gcc.target/aarch64/cmpbr.c:
45 hoursAArch64: precommit test for CMPBR instructionsKarl Meakin2-6/+1999
Commit the test file `cmpbr.c` before rules for generating the new instructions are added, so that the changes in codegen are more obvious in the next commit. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add `cmpbr` to the list of extensions. * gcc.target/aarch64/cmpbr.c: New test.
45 hoursAArch64: recognize `+cmpbr` optionKarl Meakin3-0/+8
Add the `+cmpbr` option to enable the FEAT_CMPBR architectural extension. gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def (cmpbr): New option. * config/aarch64/aarch64.h (TARGET_CMPBR): New macro. * doc/invoke.texi (cmpbr): New option.
45 hoursAArch64: make `far_branch` attribute a booleanKarl Meakin1-12/+10
The `far_branch` attribute only ever takes the values 0 or 1, so make it a `no/yes` valued string attribute instead. gcc/ChangeLog: * config/aarch64/aarch64.md (far_branch): Replace 0/1 with no/yes. (aarch64_bcond): Handle rename. (aarch64_cbz<optab><mode>1): Likewise. (*aarch64_tbz<optab><mode>1): Likewise. (@aarch64_tbz<optab><ALLI:mode><GPI:mode>): Likewise.
45 hoursAArch64: add constants for branch displacementsKarl Meakin1-16/+44
Extract the hardcoded values for the minimum PC-relative displacements into named constants and document them. gcc/ChangeLog: * config/aarch64/aarch64.md (BRANCH_LEN_P_1MiB): New constant. (BRANCH_LEN_N_1MiB): Likewise. (BRANCH_LEN_P_32KiB): Likewise. (BRANCH_LEN_N_32KiB): Likewise.
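The ±1MiB and ±32KiB figures in the constant names correspond to signed, word-scaled branch immediates: 19 bits for CBZ-style branches and 14 bits for TBZ-style branches. The sketch below derives the ranges from the immediate width; the function name is hypothetical, and the exact values assigned to the `BRANCH_LEN_*` constants in aarch64.md are not shown in this log, so none are assumed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when the byte displacement DISP can be encoded as a signed
   IMM_BITS-bit immediate that is scaled by 4 (one instruction word).  */
bool branch_disp_in_range(int64_t disp, unsigned imm_bits)
{
    assert(imm_bits >= 1 && imm_bits <= 32);

    /* 2^(imm_bits - 1) words of 4 bytes each, i.e. 2^(imm_bits + 1) bytes. */
    int64_t span = (int64_t)1 << (imm_bits + 1);

    return disp % 4 == 0 && disp >= -span && disp <= span - 4;
}
```

For a 19-bit immediate this gives a range of -1048576 to 1048572 bytes, and for a 14-bit immediate -32768 to 32764 bytes.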