Age | Commit message | Author | Files | Lines
2024-03-07 | sccvn: Avoid UB in ao_ref_init_from_vn_reference [PR105533] | Jakub Jelinek | 1 | -1/+1
When compiling libgcc or on e.g. int a[64]; int p; void foo (void) { int s = 1; while (p) { s -= 11; a[s] != 0; } } sccvn invokes UB in the compiler as detected by ubsan: ../../gcc/poly-int.h:1089:5: runtime error: left shift of negative value -40. The problem is that we still use C++11..C++17 as the implementation language and in those C++ versions shifting negative values left is UB (well defined since C++20) and above in offset += op->off << LOG2_BITS_PER_UNIT; op->off is poly_int64 with -40 value (in libgcc with -8). I understand the offset_int << LOG2_BITS_PER_UNIT shifts, which are well defined because the underlying implementation is done on the uhwi limbs, but for poly_int64 we use offset += pop->off * BITS_PER_UNIT; a few lines earlier and I think that is both more readable in what it actually does and triggers UB only if there would be signed multiply overflow. In the end, the compiler will treat them the same at least at the RTL level (at least, if not and they aren't the same cost, it should). 2024-03-07 Jakub Jelinek <jakub@redhat.com> PR middle-end/105533 * tree-ssa-sccvn.cc (ao_ref_init_from_vn_reference) <case ARRAY_REF>: Multiply op->off by BITS_PER_UNIT instead of shifting it left by LOG2_BITS_PER_UNIT.
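The difference between the two forms can be sketched in isolation. This is a minimal illustration rather than the GCC code: `kBitsPerUnit` and `byte_offset_to_bits` are hypothetical names, a plain `std::int64_t` stands in for `poly_int64`, and 8-bit bytes are assumed.

```cpp
#include <cstdint>

// Hypothetical stand-in for GCC's BITS_PER_UNIT, assuming 8-bit bytes.
constexpr int kBitsPerUnit = 8;

// Convert a (possibly negative) byte offset to a bit offset.
// Multiplication is well defined for negative operands as long as the
// result does not overflow, whereas `off << 3` with a negative `off`
// is undefined behaviour in C++11..C++17.
constexpr std::int64_t byte_offset_to_bits(std::int64_t off) {
  return off * kBitsPerUnit;
}
```

With the values from the commit message, an offset of -40 bytes becomes -320 bits and -8 bytes becomes -64 bits, without any shift of a negative value.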
2024-03-07 | LoongArch: testsuite: Fix problems with incorrect results in vector test cases. | chenxiaolong | 5 | -68/+68
In simd_correctness_check.h, the role of the macro ASSERTEQ_64 is to check the result of the passed vector values for the 64-bit data of each array element. It turns out that it used the abs() function, which checks only the lower 32 bits of the data at a time, so this patch replaces abs() with the llabs() function. However, the following two problems may occur after modification: 1. FAIL in lasx-xvfrint_s.c and lsx-vfrint_s.c. The error occurs because vector test cases that use __m{128,256} to define vector types are composed of 32-bit primitive types, so they should use ASSERTEQ_32 instead of ASSERTEQ_64 to check for correctness. 2. FAIL in lasx-xvshuf_b.c and lsx-vshuf.c. The cause of the error is that the expected result of the function setting in the test case is incorrect. gcc/testsuite/ChangeLog: * gcc.target/loongarch/vector/lasx/lasx-xvfrint_s.c: Replace ASSERTEQ_64 with the macro ASSERTEQ_32. * gcc.target/loongarch/vector/lasx/lasx-xvshuf_b.c: Modify the expected test results of some functions according to the function of the vector instruction. * gcc.target/loongarch/vector/lsx/lsx-vfrint_s.c: Same modification as lasx-xvfrint_s.c. * gcc.target/loongarch/vector/lsx/lsx-vshuf.c: Same modification as lasx-xvshuf_b.c. * gcc.target/loongarch/vector/simd_correctness_check.h: Use the llabs() function instead of abs() to check the correctness of the results.
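Why a 32-bit check can miss a 64-bit mismatch can be sketched as follows. Both function names are hypothetical and this is not the contents of simd_correctness_check.h; the second function only models a check that sees the low 32 bits, under the assumption that passing a 64-bit difference to abs() truncates it to int.

```cpp
#include <cstdint>
#include <cstdlib>

// Compare two 64-bit lanes using the full 64-bit difference.
// (Sketch only: extreme inputs whose difference overflows are not handled.)
bool lanes_equal_64(long long expected, long long actual) {
  return std::llabs(expected - actual) == 0;
}

// Model of the faulty check: only the low 32 bits take part in the
// comparison, so a mismatch confined to the upper half goes unnoticed.
bool lanes_equal_low32_only(long long expected, long long actual) {
  auto low32 = [](long long v) {
    return static_cast<std::uint32_t>(static_cast<unsigned long long>(v));
  };
  return low32(expected) == low32(actual);
}
```

For example, 0x100000000 and 0 differ only in bit 32, so the 64-bit comparison reports a mismatch while the low-32-bit model reports equality.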
2024-03-07 | LoongArch: Use /lib instead of /lib64 as the library search path for MUSL. | Yang Yujie | 3 | -1/+29
gcc/ChangeLog: * config.gcc: Add a case for loongarch*-*-linux-musl*. * config/loongarch/linux.h: Disable the multilib-compatible treatment for *musl* targets. * config/loongarch/musl.h: New file.
2024-03-07 | match.pd: Optimize a * !a to 0 [PR114009] | Jakub Jelinek | 3 | -1/+45
The following patch attempts to fix an optimization regression through adding a simple simplification. We already have the /* (m1 CMP m2) * d -> (m1 CMP m2) ? d : 0 */ (if (!canonicalize_math_p ()) (for cmp (tcc_comparison) (simplify (mult:c (convert (cmp@0 @1 @2)) @3) (if (INTEGRAL_TYPE_P (type) && INTEGRAL_TYPE_P (TREE_TYPE (@0))) (cond @0 @3 { build_zero_cst (type); }))) optimization which otherwise triggers during the a * !a multiplication, but that is done only late and we aren't yet able to optimize it through range assumptions anyway. The patch adds a specific simplification for it. If a is zero, then a * !a will be 0 * 1 (or for signed 1-bit 0 * -1) and so 0. If a is non-zero, then a * !a will be a * 0 and so again 0. The pattern is valid for scalar integers, complex integers and vector types, but I think will actually trigger only for the scalar integers. For vector types I've added two others with VEC_COND_EXPR in it, for complex there are different GENERIC trees to match and it is something that likely would be never matched in GIMPLE, so I didn't handle that. 2024-03-07 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/114009 * genmatch.cc (decision_tree::gen): Emit ARG_UNUSED for captures argument even for GENERIC, not just for GIMPLE. * match.pd (a * !a -> 0): New simplifications. * gcc.dg/tree-ssa/pr114009.c: New test.
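The identity behind the new simplification is easy to check at the source level. The function name below is made up for illustration; it is not the test added by the patch.

```cpp
// a * !a: !a is 1 when a == 0 and 0 otherwise, so one of the two
// factors is always zero and the product is always 0.
constexpr int mul_by_logical_not(int a) {
  return a * !a;
}
```

Every input, zero or non-zero, positive or negative, yields 0, which is what lets the compiler replace the whole expression with a constant.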
2024-03-07 | RISC-V: Refactor expand_vec_cmp [NFC] | demin.han | 2 | -31/+15
There are two expand_vec_cmp functions. They have same structure and similar code. We can use default arguments instead of overloading. Tested on RV32 and RV64. gcc/ChangeLog: * config/riscv/riscv-protos.h (expand_vec_cmp): Change proto * config/riscv/riscv-v.cc (expand_vec_cmp): Use default arguments (expand_vec_cmp_float): Adapt arguments Signed-off-by: demin.han <demin.han@starfivetech.com>
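The refactoring idea, one function with defaulted trailing parameters in place of two overloads with the same structure, can be sketched like this. `emit_compare` and its string parameters are hypothetical and much simpler than the real expand_vec_cmp, whose signature is not shown in this log.

```cpp
#include <string>

// Hypothetical helper, not the RISC-V backend's expand_vec_cmp.
// The defaulted `mask` and `maskoff` parameters let callers use either
// the short or the long form without a second overload.
std::string emit_compare(const std::string &op,
                         const std::string &mask = "",
                         const std::string &maskoff = "") {
  std::string insn = "cmp." + op;
  if (!mask.empty())
    insn += " mask=" + mask + " maskoff=" + maskoff;
  return insn;
}
```

Calling `emit_compare("lt")` takes the unmasked path, while `emit_compare("lt", "v0", "v8")` takes the masked one; both go through the same body.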
2024-03-06 | Fortran: Fix issue with using snprintf function. | Jerry DeLisle | 1 | -2/+2
The previous patch used snprintf to set the message string. The message string is not a formatted string and the snprintf will interpret '%' related characters as format specifiers when there are no associated output variables. A segfault ensues. This change replaces snprintf with a fortran string copy function and null terminates the message string. PR libfortran/105456 libgfortran/ChangeLog: * io/list_read.c (list_formatted_read_scalar): Use fstrcpy from libgfortran/runtime/string.c to replace snprintf. (nml_read_obj): Likewise. * io/transfer.c (unformatted_read): Likewise. (unformatted_write): Likewise. (formatted_transfer_scalar_read): Likewise. (formatted_transfer_scalar_write): Likewise. * io/write.c (list_formatted_write_scalar): Likewise. (nml_write_obj): Likewise. gcc/testsuite/ChangeLog: * gfortran.dg/pr105456.f90: Revise using '%' characters in users error message.
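The hazard described above is that a user-supplied message is data, not a format string. A minimal sketch of a verbatim, bounded, null-terminated copy follows. `copy_message` is a hypothetical helper; it is not libgfortran's fstrcpy, whose signature is not given in this log.

```cpp
#include <cstddef>
#include <cstring>

// Copy `msg` into `dst` byte for byte, truncating if needed and always
// null-terminating. No character is interpreted, so '%' sequences in
// the message are harmless.
void copy_message(char *dst, std::size_t dst_size, const char *msg) {
  if (dst_size == 0)
    return;
  std::size_t n = std::strlen(msg);
  if (n >= dst_size)
    n = dst_size - 1;
  std::memcpy(dst, msg, n);
  dst[n] = '\0';
}
```

A message such as "100% done: %s %d" is copied unchanged, whereas passing it as the format argument of snprintf would make the function look for arguments that were never supplied.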
2024-03-07 | Daily bump. | GCC Administrator | 5 | -1/+217
2024-03-06 | i386: Fix and improve insn constraint for V2QI arithmetic/shift insns | Uros Bizjak | 1 | -10/+23
optimize_function_for_size_p predicate is not stable during optab selection, because it also depends on node->count/node->frequency of the current function, which are updated during IPA, so they may change between early opts and late opts. Use optimize_size instead - optimize_size implies optimize_function_for_size_p (cfun), so if a named pattern uses "&& optimize_size" and the insn it splits into uses optimize_function_for_size_p (cfun), it shouldn't fail. PR target/114232 gcc/ChangeLog: * config/i386/mmx.md (negv2qi2): Enable for optimize_size instead of optimize_function_for_size_p. Explicitly enable for TARGET_SSE2. (negv2qi SSE reg splitter): Enable for TARGET_SSE2 only. (<plusminus:insn>v2qi3): Enable for optimize_size instead of optimize_function_for_size_p. Explicitly enable for TARGET_SSE2. (<plusminus:insn>v2qi SSE reg splitter): Enable for TARGET_SSE2 only. (<any_shift:insn>v2qi3): Enable for optimize_size instead of optimize_function_for_size_p.
2024-03-06 | RISC-V: Use vmv1r.v instead of vmv.v.v for fma output reloads [PR114200]. | Robin Dapp | 3 | -48/+86
Three-operand instructions like vmacc are modeled with an implicit output reload when the output does not match one of the operands. For this we use vmv.v.v which is subject to length masking. In a situation where the current vl is less than the full vlenb and the fma's result value is used as input for a vector reduction (which is never length masked) we effectively only reduce vl elements. The masked-out elements are relevant for the reduction, though, leading to a wrong result. This patch replaces the vmv reloads by full-register reloads. gcc/ChangeLog: PR target/114200 PR target/114202 * config/riscv/vector.md: Use vmv[1248]r.v instead of vmv.v.v. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr114200.c: New test. * gcc.target/riscv/rvv/autovec/pr114202.c: New test.
2024-03-06 | RISC-V: Adjust vec unit-stride load/store costs. | Robin Dapp | 4 | -10/+188
Scalar loads provide offset addressing while unit-stride vector instructions cannot. The offset must be loaded into a general-purpose register before it can be used. In order to account for this, this patch adds an address arithmetic heuristic that keeps track of data reference operands. If we haven't seen the operand before we add the cost of a scalar statement. This helps to get rid of an lbm regression when vectorizing (roughly 0.5% fewer dynamic instructions). gcc5 improves by 0.2% and deepsjeng by 0.25%. wrf and nab degrade by 0.1%. This is because we now adjust the cost of SLP as well as loop-vectorized instructions, whereas before we would only adjust loop-vectorized instructions. Considering higher scalar_to_vec costs (3 vs 1) for all vectorization types causes some snippets not to get vectorized anymore. Given these costs the decision looks correct but appears worse when just counting dynamic instructions. In total SPECint 2017 has 4 bln fewer dynamic instructions and SPECfp 0.7 bln fewer. gcc/ChangeLog: * config/riscv/riscv-vector-costs.cc (adjust_stmt_cost): Move... (costs::adjust_stmt_cost): ... to here and add vec_load/vec_store offset handling. (costs::add_stmt_cost): Also adjust cost for statements without stmt_info. * config/riscv/riscv-vector-costs.h: Define zero constant. gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/vse-slp-1.c: New test. * gcc.dg/vect/costmodel/riscv/rvv/vse-slp-2.c: New test.
2024-03-06 | ARM: Fix conditional execution [PR113915] | Wilco Dijkstra | 3 | -11/+15
By default most patterns can be conditionalized on Arm targets. However Thumb-2 predication requires the "predicable" attribute be explicitly set to "yes". Most patterns are shared between Arm and Thumb(-2) and are marked with "predicable". Given this sharing, it does not make sense to use a different default for Arm. So only consider conditional execution of instructions that have the predicable attribute set to yes. This ensures that patterns not explicitly marked as such are never conditionally executed. gcc/ChangeLog: PR target/113915 * config/arm/arm.md (NOCOND): Improve comment. (arm_rev*) Add predicable. * config/arm/arm.cc (arm_final_prescan_insn): Add check for PREDICABLE_YES. gcc/testsuite/ChangeLog: PR target/113915 * gcc.target/arm/builtin-bswap-1.c: Fix test to allow conditional execution both for Arm and Thumb-2.
2024-03-06 | [PR target/113001] Fix incorrect operand swapping in conditional move | Jeff Law | 3 | -2/+37
This bug totally fell off my radar. Sorry about that. We have some special casing in the conditional move expander to simplify a conditional move when comparing a register against zero and that same register is one of the arms. Specifically a (eq (reg) (const_int 0)) where reg is also the true arm or (ne (reg) (const_int 0)) where reg is the false arm need not use the fully generalized conditional move, thus saving an instruction for those cases. In the NE case we swapped the operands, but didn't swap the condition, which led to the ICE due to an unrecognized pattern. The backend actually has distinct patterns for those two cases. So swapping the operands is neither needed nor advisable. Regression tested on rv64gc and verified the new tests pass. Pushing to the trunk. PR target/113001 PR target/112871 gcc/ * config/riscv/riscv.cc (expand_conditional_move): Do not swap operands when the comparison operand is the same as the false arm for a NE test. gcc/testsuite * gcc.target/riscv/zicond-ice-3.c: New test. * gcc.target/riscv/zicond-ice-4.c: New test.
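Why the two special cases need only a single conditional-zero operation can be sketched with a small model. `czero_eqz` below is a hand-written model of the Zicond czero.eqz instruction (result is 0 when the condition register is 0, otherwise the value register); the function names are assumptions for illustration, not backend code.

```cpp
#include <cstdint>

// Model of Zicond czero.eqz: 0 when cond == 0, otherwise value.
constexpr std::int64_t czero_eqz(std::int64_t value, std::int64_t cond) {
  return cond == 0 ? 0 : value;
}

// (reg == 0) ? reg : other, where reg is the true arm.
constexpr std::int64_t select_eq(std::int64_t reg, std::int64_t other) {
  return reg == 0 ? reg : other;
}

// (reg != 0) ? other : reg, where reg is the false arm.
constexpr std::int64_t select_ne(std::int64_t reg, std::int64_t other) {
  return reg != 0 ? other : reg;
}
```

In both selects the arm that equals `reg` is only chosen when `reg` is zero, so each collapses to `czero_eqz(other, reg)` with the operands in their original order.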
2024-03-06 | Fortran: error recovery while simplifying expressions [PR103707,PR106987] | Harald Anlauf | 3 | -41/+143
When an exception is encountered during simplification of arithmetic expressions, the result may depend on whether range-checking is active (-frange-check) or not. However, the code path in the front-end should stay the same for "soft" errors for which the exception is triggered by the check, while "hard" errors should always terminate the simplification, so that error recovery is independent of the flag. Separation of arithmetic error codes into "hard" and "soft" errors shall be done consistently via is_hard_arith_error(). PR fortran/103707 PR fortran/106987 gcc/fortran/ChangeLog: * arith.cc (is_hard_arith_error): New helper function to determine whether an arithmetic error is "hard" or not. (check_result): Use it. (gfc_arith_divide): Set "Division by zero" only for regular numerators of real and complex divisions. (reduce_unary): Use is_hard_arith_error to determine whether a hard or (recoverable) soft error was encountered. Terminate immediately on hard error, otherwise remember code of first soft error. (reduce_binary_ac): Likewise. (reduce_binary_ca): Likewise. (reduce_binary_aa): Likewise. gcc/testsuite/ChangeLog: * gfortran.dg/pr99350.f90: * gfortran.dg/arithmetic_overflow_3.f90: New test.
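The control flow described for the reduce_* functions (stop at once on a hard error, otherwise continue and remember the first soft error) can be sketched as below. The enum values, `is_hard_error` and `reduce_codes` are hypothetical; which gfortran arithmetic codes count as hard is not stated in this log and is not assumed here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical error codes for illustration only.
enum class ArithCode { Ok, SoftOverflow, SoftUnderflow, HardInvalid };

bool is_hard_error(ArithCode c) {
  return c == ArithCode::HardInvalid;
}

// Fold per-element results. A hard error terminates the walk
// immediately; soft errors let it continue, and the first one seen is
// reported at the end. `visited` counts the elements examined.
ArithCode reduce_codes(const std::vector<ArithCode> &codes,
                       std::size_t *visited) {
  ArithCode first_soft = ArithCode::Ok;
  *visited = 0;
  for (ArithCode c : codes) {
    ++*visited;
    if (is_hard_error(c))
      return c;
    if (c != ArithCode::Ok && first_soft == ArithCode::Ok)
      first_soft = c;
  }
  return first_soft;
}
```

Because the decision depends only on the error class and not on a command-line flag, the path taken through the loop is the same whether or not range checking is enabled.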
2024-03-06 | c++: ICE with noexcept and local specialization [PR114114] | Marek Polacek | 2 | -0/+38
Here we ICE because we call register_local_specialization while local_specializations is null, so local_specializations->put (); crashes on null this. It's null since maybe_instantiate_noexcept calls push_to_top_level which creates a new scope. Normally, I would have guessed that we need a new local_specialization_stack. But here we're dealing with an operand of a noexcept, which is an unevaluated operand, and those aren't registered in the hash map. maybe_instantiate_noexcept wasn't signalling that it's substituting an unevaluated operand though. PR c++/114114 gcc/cp/ChangeLog: * pt.cc (maybe_instantiate_noexcept): Save/restore cp_unevaluated_operand, c_inhibit_evaluation_warnings, and cp_noexcept_operand around the tsubst_expr call. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/noexcept84.C: New test.
2024-03-06 | i386: Eliminate common code from x86_32 TARGET_MACHO part in ix86_expand_move | Uros Bizjak | 1 | -26/+11
Eliminate common code from x86_32 TARGET_MACHO part in ix86_expand_move and use generic code instead. No functional changes. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_move) [TARGET_MACHO]: Eliminate common code and use generic code instead.
2024-03-06 | amdgcn: additional gfx1030/gfx1100 support: adjust test cases | Thomas Schwinge | 4 | -4/+4
The "SDWA" changes in commit 99890e15527f1f04caef95ecdd135c9f1a077f08 "amdgcn: additional gfx1030/gfx1100 support" caused a few regressions: PASS: gcc.target/gcn/sram-ecc-3.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-3.c scan-assembler zero_extendv64qiv64si2 PASS: gcc.target/gcn/sram-ecc-4.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-4.c scan-assembler zero_extendv64hiv64si2 PASS: gcc.target/gcn/sram-ecc-7.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-7.c scan-assembler zero_extendv64qiv64si2 PASS: gcc.target/gcn/sram-ecc-8.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-8.c scan-assembler zero_extendv64hiv64si2 Those test cases need corresponding adjustment. gcc/testsuite/ * gcc.target/gcn/sram-ecc-3.c: Adjust. * gcc.target/gcn/sram-ecc-4.c: Likewise. * gcc.target/gcn/sram-ecc-7.c: Likewise. * gcc.target/gcn/sram-ecc-8.c: Likewise.
2024-03-06 | AVR: Adjust rtx cost of plus + zero_extend. | Georg-Johann Lay | 1 | -0/+7
gcc/ * config/avr/avr.cc (avr_rtx_costs_1) [PLUS+ZERO_EXTEND]: Adjust rtx cost.
2024-03-06 | tree-optimization/114239 - rework reduction epilogue driving | Richard Biener | 2 | -81/+53
The following reworks vectorizable_live_operation to pass the live stmt to vect_create_epilog_for_reduction also for early breaks and a peeled main exit. This is to be able to figure out the scalar definition to replace. This reverts the PR114192 fix as it is subsumed by this cleanup. PR tree-optimization/114239 * tree-vect-loop.cc (vect_get_vect_def): Remove. (vect_create_epilog_for_reduction): The passed in stmt_info should now be the live stmt that produces the scalar reduction result. Revert PR114192 fix. Base reduction info off info_for_reduction. Remove special handling of early-break/peeled, restore original vector def gathering. Make sure to pick the correct exit PHIs. (vectorizable_live_operation): Pass in the proper stmt_info for early break exits. * gcc.dg/vect/vect-early-break_122-pr114239.c: New testcase.
2024-03-06 | LoongArch: testsuite: Rewrite {x,}vfcmp-{d,f}.c to avoid named registers | Xi Ruoyao | 4 | -139/+816
Loops on named vector registers are not vectorized (see comment 11 of PR113622), so these test cases have been failing for a while. Rewrite them using check-function-bodies to remove hard coding register names. A barrier is needed to always load the first operand before the second operand. gcc/testsuite/ChangeLog: * gcc.target/loongarch/vfcmp-f.c: Rewrite to avoid named registers. * gcc.target/loongarch/vfcmp-d.c: Likewise. * gcc.target/loongarch/xvfcmp-f.c: Likewise. * gcc.target/loongarch/xvfcmp-d.c: Likewise.
2024-03-06 | aarch64: Define out-of-class static constants | Richard Sandiford | 1 | -0/+3
While reworking the aarch64 feature descriptions, I forgot to add out-of-class definitions of some static constants. This could lead to a build failure with some compilers. This was seen with some WIP to increase the number of extensions beyond 64. It's latent on trunk though, and a regression from before the rework. gcc/ * config/aarch64/aarch64-feature-deps.h (feature_deps::info): Add out-of-class definitions of static constants.
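The language rule behind this fix can be sketched with a small type. `feature_info` and its members are hypothetical and unrelated to the real `feature_deps::info`; plain `static const` members are used here so that the out-of-class definitions are required in every C++ version once the members are odr-used.

```cpp
#include <algorithm>

// Hypothetical type for illustration.
struct feature_info {
  static const unsigned flag = 4;  // declaration with in-class initializer
  static const unsigned deps = 6;  // likewise: declared, not yet defined
};

// Out-of-class definitions. Without them, odr-using the members (for
// example by binding them to a reference) can fail at link time with
// some compilers and optimization levels.
const unsigned feature_info::flag;
const unsigned feature_info::deps;

unsigned larger_mask() {
  // std::max takes its arguments by const reference, which odr-uses
  // both members.
  return std::max(feature_info::flag, feature_info::deps);
}
```

Whether the missing definitions are noticed depends on how the compiler happens to fold the constants, which fits the commit's remark that the problem was latent and showed up only with some compilers.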
2024-03-06 | c++: Fix template deduction for conversion operators with xobj parameters [PR113629] | Nathaniel Shead | 2 | -1/+60
Unification for conversion operators (DEDUCE_CONV) doesn't perform transformations like handling forwarding references. This is correct in general, but not for xobj parameters, which should be handled "normally" for the purposes of deduction: [temp.deduct.conv] only applies to the return type of the conversion function. PR c++/113629 gcc/cp/ChangeLog: * pt.cc (type_unification_real): Only use DEDUCE_CONV for the return type of a conversion function. gcc/testsuite/ChangeLog: * g++.dg/cpp23/explicit-obj-conv-op.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com>
2024-03-06 | tree-optimization/114249 - ICE with BB reduction vectorization | Richard Biener | 2 | -10/+30
When we scrap the last def of an odd lane numbered BB reduction we can end up recording a pattern def which will later wreck code generation. The following puts this logic where it better belongs, avoiding this issue. PR tree-optimization/114249 * tree-vect-slp.cc (vect_build_slp_instance): Move making a BB reduction lane number even ... (vect_slp_check_for_roots): ... here to avoid leaking pattern defs. * gcc.dg/vect/bb-slp-pr114249.c: New testcase.
2024-03-06 | tree-optimization/114246 - invalid call argument from DSE | Richard Biener | 2 | -0/+13
The following makes sure to strip type conversions added by build_fold_addr_expr before placing the result in a call argument. PR tree-optimization/114246 * tree-ssa-dse.cc (increment_start_addr): Strip useless type conversions from the adjusted address. * gcc.dg/torture/pr114246.c: New testcase.
2024-03-06 | i386: Fix up the vzeroupper REG_DEAD/REG_UNUSED note workaround [PR114190] | Jakub Jelinek | 2 | -0/+28
When writing the rest_of_handle_insert_vzeroupper workaround to manually remove all the REG_DEAD/REG_UNUSED notes from the IL, I've missed that there is a df_analyze () call right after it and that the problems added earlier in the pass, like df_note_add_problem () done during mode switching, don't affect just the next df_analyze () call right after it, but all other df_analyze () calls until the end of the current pass where df_finish_pass removes the optional problems. So, as can be seen on the following patch, the workaround doesn't actually work there, because while rest_of_handle_insert_vzeroupper carefully removes all REG_DEAD/REG_UNUSED notes, the df_analyze () call at the end of the function immediately adds them in again (so, I must say I have no idea why the workaround worked on the earlier testcases). Now, I could move the df_analyze () call just before the REG_DEAD/REG_UNUSED note removal loop, but I think the following patch is better, because the df_analyze () call doesn't have to recompute the problem when we don't care about it and will actively strip all traces of it away. 2024-03-06 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/114190 * config/i386/i386-features.cc (rest_of_handle_insert_vzeroupper): Call df_remove_problem for df_note before calling df_analyze. * gcc.target/i386/avx-pr114190.c: New test.
2024-03-05 | Fortran: Add user defined error messages for UDTIO. | Jerry DeLisle | 5 | -0/+224
The defines IOMSG_LEN and MSGLEN were redundant so these are combined into IOMSG_LEN as defined in io.h. The remainder of the patch adds checks for when a user defined derived type IO procedure sets the IOSTAT or IOMSG variables independent of the library defined I/O messages. PR libfortran/105456 libgfortran/ChangeLog: * io/io.h (IOMSG_LEN): Moved to here. * io/list_read.c (MSGLEN): Removed MSGLEN. (convert_integer): Changed MSGLEN to IOMSG_LEN. (parse_repeat): Likewise. (read_logical): Likewise. (read_integer): Likewise. (read_character): Likewise. (parse_real): Likewise. (read_complex): Likewise. (read_real): Likewise. (check_type): Likewise. (list_formatted_read_scalar): Adjust to IOMSG_LEN. (nml_read_obj): Add user defined error message. * io/transfer.c (unformatted_read): Add user defined error message. (unformatted_write): Add user defined error message. (formatted_transfer_scalar_read): Add user defined error message. (formatted_transfer_scalar_write): Add user defined error message. * io/write.c (list_formatted_write_scalar): Add user defined error message. (nml_write_obj): Add user defined error message. gcc/testsuite/ChangeLog: * gfortran.dg/pr105456-nmlr.f90: New test. * gfortran.dg/pr105456-nmlw.f90: New test. * gfortran.dg/pr105456-ruf.f90: New test. * gfortran.dg/pr105456-wf.f90: New test. * gfortran.dg/pr105456-wuf.f90: New test.
2024-03-05 | c++/modules: befriending template from current class scope | Patrick Palka | 4 | -10/+23
Here the TEMPLATE_DECL representing the template friend declaration naming B has class scope since the template B has class scope, but get_merge_kind assumes all DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P TEMPLATE_DECL have namespace scope and wrongly returns MK_named instead of MK_local_friend for the friend. gcc/cp/ChangeLog: * module.cc (trees_out::get_merge_kind) <case depset::EK_DECL>: Accommodate class-scope DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P TEMPLATE_DECL. Consolidate IDENTIFIER_ANON_P cases. gcc/testsuite/ChangeLog: * g++.dg/modules/friend-7.h: New test. * g++.dg/modules/friend-7_a.H: New test. * g++.dg/modules/friend-7_b.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
2024-03-06 | Daily bump. | GCC Administrator | 5 | -1/+180
2024-03-05 | ctf: fix incorrect CTF for multi-dimensional array types | Cupertino Miranda | 2 | -83/+89
PR debug/114186 DWARF DIEs of type DW_TAG_subrange_type are linked together to represent the information about the subsequent dimensions. The CTF processing was so far working through them in the opposite (incorrect) order. While fixing the issue, refactor the code a bit for readability. co-authored-By: Indu Bhagat <indu.bhagat@oracle.com> gcc/ PR debug/114186 * dwarf2ctf.cc (gen_ctf_array_type): Invoke the ctf_add_array () in the correct order of the dimensions. (gen_ctf_subrange_type): Refactor out handling of DW_TAG_subrange_type DIE to here. gcc/testsuite/ PR debug/114186 * gcc.dg/debug/ctf/ctf-array-6.c: Add test.
2024-03-05 | asan: Handle poly-int sizes in ASAN_MARK [PR97696] | Richard Sandiford | 2 | -5/+33
This patch makes the expansion of IFN_ASAN_MARK let through poly-int-sized objects. The expansion itself was already generic enough, but the tests for the fast path were too strict. gcc/ PR sanitizer/97696 * asan.cc (asan_expand_mark_ifn): Allow the length to be a poly_int. gcc/testsuite/ PR sanitizer/97696 * gcc.target/aarch64/sve/pr97696.c: New test.
2024-03-05 | aarch64: Remove SME2.1 forms of LUTI2/4 | Richard Sandiford | 4 | -145/+3
I was over-eager when adding support for strided SME2 instructions and accidentally included forms of LUTI2 and LUTI4 that are only available with SME2.1, not SME2. This patch removes them for now. We're planning to add proper support for SME2.1 in the GCC 15 timeframe. Sorry for the blunder :( gcc/ * config/aarch64/aarch64.md (stride_type): Remove luti_consecutive and luti_strided. * config/aarch64/aarch64-sme.md (@aarch64_sme_lut<LUTI_BITS><mode>): Remove stride_type attribute. (@aarch64_sme_lut<LUTI_BITS><mode>_strided2): Delete. (@aarch64_sme_lut<LUTI_BITS><mode>_strided4): Likewise. * config/aarch64/aarch64-early-ra.cc (is_stride_candidate) (early_ra::maybe_convert_to_strided_access): Remove support for strided LUTI2 and LUTI4. gcc/testsuite/ * gcc.target/aarch64/sme/strided_1.c (test5): Remove.
2024-03-05 | arm: check for low register before applying peephole [PR113510] | Richard Earnshaw | 1 | -1/+1
For thumb1, when using a peephole to fuse "mov reg, #const; add reg, reg, SP" into "add reg, SP, #const" we must first check that reg is a low register, otherwise we will ICE when trying to recognize the resulting insn. gcc/ChangeLog: PR target/113510 * config/arm/thumb1.md (peephole2 to fuse mov imm/add SP): Use low_register_operand.
2024-03-05 | Fix testcase pr112337.c to check the options [PR112337] | Saurabh Jha | 1 | -1/+3
gcc.target/arm/pr112337.c was failing to validate that adding MVE options was compatible with the test environment, so add the missing checks. gcc/testsuite/ChangeLog: PR target/112337 * gcc.target/arm/pr112337.c: Check for, then use the right MVE options.
2024-03-05 | Remove dead code associated with `AST::ExternalFunctionItem` | 0xn4utilus | 26 | -335/+7
gcc/rust/ChangeLog: * ast/rust-ast-collector.cc (TokenCollector::visit): Remove dead code. * ast/rust-ast-collector.h: Likewise. * ast/rust-ast-full-decls.h (class ExternalFunctionItem): Likewise. * ast/rust-ast-visitor.cc (DefaultASTVisitor::visit): Likewise. * ast/rust-ast-visitor.h: Likewise. * ast/rust-ast.cc (ExternalFunctionItem::as_string): Likewise. (ExternalFunctionItem::accept_vis): Likewise. * checks/errors/rust-ast-validation.cc (ASTValidation::visit): Likewise. * checks/errors/rust-ast-validation.h: Likewise. * checks/errors/rust-feature-gate.h: Likewise. * expand/rust-cfg-strip.cc (CfgStrip::visit): Likewise. * expand/rust-cfg-strip.h: Likewise. * expand/rust-derive.h: Likewise. * expand/rust-expand-visitor.cc (ExpandVisitor::visit): Likewise. * expand/rust-expand-visitor.h: Likewise. * hir/rust-ast-lower-base.cc (ASTLoweringBase::visit): Likewise. * hir/rust-ast-lower-base.h: Likewise. * metadata/rust-export-metadata.cc (ExportContext::emit_function): Likewise. * parse/rust-parse-impl.h: Likewise. * parse/rust-parse.h: Likewise. * resolve/rust-ast-resolve-base.cc (ResolverBase::visit): Likewise. * resolve/rust-ast-resolve-base.h: Likewise. * resolve/rust-default-resolver.cc (DefaultResolver::visit): Likewise. * resolve/rust-default-resolver.h: Likewise. * util/rust-attributes.cc (AttributeChecker::visit): Likewise. * util/rust-attributes.h: Likewise. gcc/testsuite/ChangeLog: * rust/compile/extern_func_with_body.rs: New test. Signed-off-by: 0xn4utilus <gyanendrabanjare8@gmail.com>
2024-03-05 | Update resolver to use `AST::Function` instead of `AST::ExternalFunctionItem` | 0xn4utilus | 7 | -19/+39
gcc/rust/ChangeLog: * checks/errors/rust-feature-gate.cc (FeatureGate::visit): Check if function is_external or not. * hir/rust-ast-lower-extern.h: Use AST::Function instead of AST::ExternalFunctionItem. * parse/rust-parse-impl.h (Parser::parse_external_item): Likewise. (Parser::parse_pattern): Fix clang format. * resolve/rust-ast-resolve-implitem.h: Likewise. * resolve/rust-ast-resolve-item.cc (ResolveExternItem::visit): Likewise. * resolve/rust-ast-resolve-item.h: Likewise. * resolve/rust-default-resolver.cc (DefaultResolver::visit): Check if param has_pattern before using get_pattern. Signed-off-by: 0xn4utilus <gyanendrabanjare8@gmail.com>
2024-03-05 | Unify ASTValidation::visit for ExternalFunctionItem and Function | 0xn4utilus | 3 | -17/+57
gcc/rust/ChangeLog: * checks/errors/rust-ast-validation.cc (ASTValidation::visit): Add external function validation support. Add ErrorCode::E0130. * parse/rust-parse-impl.h (Parser::parse_function): Parse external functions from `parse_function`. (Parser::parse_external_item): Clang format. (Parser::parse_pattern): Clang format. * parse/rust-parse.h: Add default parameter `is_external` in `parse_function`. Signed-off-by: 0xn4utilus <gyanendrabanjare8@gmail.com>
2024-03-05 | Add get_pattern_kind to Pattern | 0xn4utilus | 4 | -0/+60
gcc/rust/ChangeLog: * ast/rust-ast.h: Add Kind Enum to Pattern. * ast/rust-macro.h: Add get_pattern_kind(). * ast/rust-path.h: Likewise. * ast/rust-pattern.h: Likewise. Signed-off-by: 0xn4utilus <gyanendrabanjare8@gmail.com>
2024-03-05 | Add support for external functions | 0xn4utilus | 4 | -20/+43
gcc/rust/ChangeLog: * ast/rust-ast.cc (Function::Function): Add `is_external_function` field. (Function::operator=): Likewise. * ast/rust-ast.h: New constructor for ExternalItem. * ast/rust-item.h (class Function): Add `is_external_function` field. Update `get_node_id`. * ast/rust-macro.h: Update copy constructor. Signed-off-by: 0xn4utilus <gyanendrabanjare8@gmail.com>
2024-03-05 | AVR: Add two RTL peepholes. | Georg-Johann Lay | 1 | -3/+58
Register alloc may expand a 3-operand arithmetic X = Y o CST as "X = CST; X o= Y" where it may be better to instead use "X = Y; X o= CST" because 1) the first insn may use MOVW for "X = Y", and 2) the operation may be more efficient when performed with a constant, for example when ADIW or SBIW can be used, or some bytes of the constant are 0x00 or 0xff. gcc/ * config/avr/avr.md: Add two RTL peepholes for PLUS, IOR and AND in HI, PSI, SI that swap operation order from "X = CST, X o= Y" to "X = Y, X o= CST".
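The reordering is valid because PLUS, IOR and AND are commutative, so both instruction orders compute the same value. A minimal source-level sketch for the 16-bit PLUS case follows; the function names and the constant 0x0100 are chosen for illustration only.

```cpp
#include <cstdint>

// Order the register allocator may produce: X = CST; X += Y.
std::uint16_t cst_then_op(std::uint16_t y) {
  std::uint16_t x = 0x0100;
  x = static_cast<std::uint16_t>(x + y);
  return x;
}

// Order after the peephole: X = Y; X += CST.
std::uint16_t copy_then_op(std::uint16_t y) {
  std::uint16_t x = y;
  x = static_cast<std::uint16_t>(x + 0x0100);
  return x;
}
```

Both functions return the same result for every input, including wrap-around cases, so the peephole is free to pick whichever order is cheaper to encode.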
2024-03-05 | Regenerate c.opt.urls | Mark Wielaard | 1 | -0/+3
Fixes: 08edf85f747b ("c++/modules: relax diagnostic about GMF contents") gcc/c-family/ChangeLog: * c.opt.urls: Regenerate.
2024-03-05 | LoongArch: Allow s9 as a register alias | Xi Ruoyao | 2 | -0/+4
The psABI allows using s9 as an alias of r22. gcc/ChangeLog: * config/loongarch/loongarch.h (ADDITIONAL_REGISTER_NAMES): Add s9 as an alias of r22. gcc/testsuite/ChangeLog: * gcc.target/loongarch/regname-fp-s9.c: New test.
2024-03-05 | AVR: Improve output of insn "*insv.any_shift.<mode>_split". | Roger Sayle | 6 | -51/+474
The instructions printed by insn "*insv.any_shift.<mode>_split" were sub-optimal. The code to print the improved output is lengthy and performed by new function avr_out_insv. As it turns out, the function can also handle shift-offsets of zero, which is "*andhi3", "*andpsi3" and "*andsi3". Thus, these three insns get a new 3-operand alternative where the 3rd operand is an exact power of 2. gcc/ * config/avr/avr-protos.h (avr_out_insv): New proto. * config/avr/avr.cc (avr_out_insv): New function. (avr_adjust_insn_length) [ADJUST_LEN_INSV]: Handle case. (avr_cbranch_cost) [ZERO_EXTRACT]: Adjust rtx costs. * config/avr/avr.md (define_attr "adjust_len") Add insv. (andhi3, *andhi3, andpsi3, *andpsi3, andsi3, *andsi3): Add constraint alternative where the 3rd operand is a power of 2, and the source register may differ from the destination. (*insv.any_shift.<mode>_split): Call avr_out_insv to output instructions. Set attr "length" to "insv". * config/avr/constraints.md (Cb2, Cb3, Cb4): New constraints. gcc/testsuite/ * gcc.target/avr/torture/insv-anyshift-hi.c: New test. * gcc.target/avr/torture/insv-anyshift-si.c: New test.
2024-03-05 | tree-optimization/114231 - use patterns for BB SLP discovery root stmts | Richard Biener | 2 | -0/+16
The following makes sure to use recognized patterns when vectorizing roots during BB SLP discovery. We need to apply those late since during root discovery we've not yet done pattern recognition. All parts of the vectorizer assume patterns get used, for the testcase we mix this up when doing live lane computation. PR tree-optimization/114231 * tree-vect-slp.cc (vect_analyze_slp): Lookup patterns when processing a BB SLP root. * gcc.dg/vect/pr114231.c: New testcase.
2024-03-05Clean BiMap to use tl::optional for lookupsSourabh Jaiswal6-23/+28
gcc/rust/Changelog: * expand/rust-expand-visitor.cc (ExpandVisitor::expand_inner_items): Adjust to use has_value () (ExpandVisitor::expand_inner_stmts): Likewise * expand/rust-macro-builtins.cc (builtin_macro_from_string): Likewise (make_macro_path_str): Likewise * util/rust-hir-map.cc (Mappings::insert_macro_def): Likewise * util/rust-lang-item.cc (LangItem::Parse): Adjust to return tl::optional (LangItem::toString) Likewise * util/rust-token-converter.cc (handle_suffix): Adjust to use value.or () (from_literal) Likewise * util/bi-map.h (BiMap::lookup): Adjust to use tl::optional for lookups Signed-off-by: Sourabh Jaiswal <sourabhrj31@gmail.com>
2024-03-05lower-subreg: Fix ROTATE handling [PR114211]Jakub Jelinek2-0/+38
On the following testcase, we have (insn 10 7 11 2 (set (reg/v:TI 106 [ h ]) (rotate:TI (reg/v:TI 106 [ h ]) (const_int 64 [0x40]))) "pr114211.c":8:5 1042 {rotl64ti2_doubleword} (nil)) before subreg1 and the pass decides to use (reg:DI 127 [ h ]) / (reg:DI 128 [ h+8 ]) register pair instead of (reg/v:TI 106 [ h ]). resolve_operand_for_swap_move_operator implements it by pretending it is an assignment from (concatn (reg:DI 127 [ h ]) (reg:DI 128 [ h+8 ])) to (concatn (reg:DI 128 [ h+8 ]) (reg:DI 127 [ h ])) The problem is that if the rotate argument is the same as destination or if there is even an overlap between the first half of the destination with second half of the source we emit incorrect code, because the store to (reg:DI 128 [ h+8 ]) overwrites what we need for source of the second move. The following patch detects that case and uses a temporary pseudo to hold the original (reg:DI 128 [ h+8 ]) value across the first store. 2024-03-05 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/114211 * lower-subreg.cc (resolve_simple_move): For double-word rotates by BITS_PER_WORD if there is overlap between source and destination use a temporary. * gcc.dg/pr114211.c: New test.
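The overlap problem can be modelled in plain C. The sketch below is not code from lower-subreg.cc: the struct u128_pair and both function names are hypothetical stand-ins for the two DImode halves. A rotate of a 128-bit value by 64 is a swap of its halves, and doing the swap as two independent moves goes wrong when source and destination are the same object.

```c
#include <stdint.h>

/* Hypothetical model of a double-word value split into two word halves.  */
typedef struct { uint64_t lo, hi; } u128_pair;

/* Rotate by 64 done as two plain moves.  This is only correct when dst
   and src do not overlap: if they are the same object, the first store
   overwrites src->hi before the second move reads it.  */
void rotate64_naive(u128_pair *dst, const u128_pair *src)
{
    dst->hi = src->lo;
    dst->lo = src->hi;
}

/* Same rotate, but the half that the first store may clobber is saved in
   a temporary first, so it also works when dst and src are the same.  */
void rotate64_with_temp(u128_pair *dst, const u128_pair *src)
{
    uint64_t saved_hi = src->hi;
    dst->hi = src->lo;
    dst->lo = saved_hi;
}
```

Rotating {lo = 1, hi = 2} in place gives {lo = 1, hi = 1} with the naive version and {lo = 2, hi = 1} with the temporary.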
2024-03-05bitint: Handle BIT_FIELD_REF lowering [PR114157]Jakub Jelinek4-0/+145
The following patch adds support for BIT_FIELD_REF lowering with large/huge _BitInt lhs. BIT_FIELD_REF requires mode argument first operand, so the operand shouldn't be any huge _BitInt. If we only access limbs from inside of BIT_FIELD_REF using constant indexes, we can just create a new BIT_FIELD_REF to extract the limb, but if we need to use variable index in a loop, I'm afraid we need to spill it into memory, which is what the following patch does. If there is some bitwise type for the extraction, it extracts just what we need and not more than that, otherwise it spills the whole first argument of BIT_FIELD_REF and uses MEM_REF with an offset with VIEW_CONVERT_EXPR around it. 2024-03-05 Jakub Jelinek <jakub@redhat.com> PR middle-end/114157 * gimple-lower-bitint.cc: Include stor-layout.h. (mergeable_op): Return true for BIT_FIELD_REF. (struct bitint_large_huge): Declare handle_bit_field_ref method. (bitint_large_huge::handle_bit_field_ref): New method. (bitint_large_huge::handle_stmt): Use it for BIT_FIELD_REF. * gcc.dg/bitint-98.c: New test. * gcc.target/i386/avx2-pr114157.c: New test. * gcc.target/i386/avx512f-pr114157.c: New test.
2024-03-05i386: For noreturn functions save at least the bp register if it is used [PR114116]Jakub Jelinek9-29/+45
As mentioned in the PR, on x86_64 currently a lot of ICEs end up with crashes in the unwinder like: during RTL pass: expand pr114044-2.c: In function ‘foo’: pr114044-2.c:5:3: internal compiler error: in expand_fn_using_insn, at internal-fn.cc:208 5 | __builtin_clzg (a); | ^~~~~~~~~~~~~~~~~~ 0x7d9246 expand_fn_using_insn ../../gcc/internal-fn.cc:208 pr114044-2.c:5:3: internal compiler error: Segmentation fault 0x1554262 crash_signal ../../gcc/toplev.cc:319 0x2b20320 x86_64_fallback_frame_state ./md-unwind-support.h:63 0x2b20320 uw_frame_state_for ../../../libgcc/unwind-dw2.c:1013 0x2b2165d _Unwind_Backtrace ../../../libgcc/unwind.inc:303 0x2acbd69 backtrace_full ../../libbacktrace/backtrace.c:127 0x2a32fa6 diagnostic_context::action_after_output(diagnostic_t) ../../gcc/diagnostic.cc:781 0x2a331bb diagnostic_action_after_output(diagnostic_context*, diagnostic_t) ../../gcc/diagnostic.h:1002 0x2a331bb diagnostic_context::report_diagnostic(diagnostic_info*) ../../gcc/diagnostic.cc:1633 0x2a33543 diagnostic_impl ../../gcc/diagnostic.cc:1767 0x2a33c26 internal_error(char const*, ...) ../../gcc/diagnostic.cc:2225 0xe232c8 fancy_abort(char const*, int, char const*) ../../gcc/diagnostic.cc:2336 0x7d9246 expand_fn_using_insn ../../gcc/internal-fn.cc:208 Segmentation fault (core dumped) The problem is the PR38534 r14-8470 changes, which avoid saving call-saved registers in noreturn functions. If such functions ever touch the bp register but because of the r14-8470 changes don't save it in the prologue, the caller or any other function in the backtrace uses a frame pointer and the noreturn function or anything it calls directly or indirectly calls backtrace, then the unwinder crashes, because the bp register contains some unrelated value, but in the frames which do use a frame pointer the CFA is based on the bp register. In theory this could happen with any other call-saved register, e.g.
code written by hand in assembly with .cfi_* directives could use any other call-saved register as the register in which to store the CFA or something related to it, but in reality, at least for compiler-generated code and usual assembly, just making sure bp doesn't contain garbage is probably enough for backtrace purposes. In the debugger of course it will not be enough, the values of the arguments etc. can be lost (if DW_CFA_undefined is emitted) or garbage. So, I think for noreturn functions we should at least save the bp register if we use it. If the user asks for it using the no_callee_saved_registers attribute, let's honor what is asked for (but then it is up to the user to make sure e.g. backtrace isn't called from the function or anything it calls). As discussed in the PR, whether to save bp or not shouldn't be based on whether compiling with -g or -g0, because we don't want code generation changes without/with debugging, it would also break -fcompare-debug, and users can call backtrace(3), which doesn't use debug info, just unwind info; even backtrace_symbols{,_fd}(3) don't use debug info but just look at the dynamic symbol table. The patch also adds a check for the no_caller_saved_registers attribute in the implicit addition of not saving callee-saved registers in noreturn functions, because I think __attribute__((no_caller_saved_registers, noreturn)) will otherwise error that no_caller_saved_registers and no_callee_saved_registers attributes are incompatible (but the user didn't specify anything like that). 2024-03-05 Jakub Jelinek <jakub@redhat.com> PR target/114116 * config/i386/i386.h (enum call_saved_registers_type): Add TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP enumerator. * config/i386/i386-options.cc (ix86_set_func_type): Remove has_no_callee_saved_registers variable, add no_callee_saved_registers instead, initialize it depending on whether it is no_callee_saved_registers function or not. Don't set it if no_caller_saved_registers attribute is present. Adjust users. 
* config/i386/i386.cc (ix86_function_ok_for_sibcall): Handle TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP like TYPE_NO_CALLEE_SAVED_REGISTERS. (ix86_save_reg): Handle TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP. * gcc.target/i386/pr38534-1.c: Allow push/pop of bp. * gcc.target/i386/pr38534-4.c: Likewise. * gcc.target/i386/pr38534-2.c: Likewise. * gcc.target/i386/pr38534-3.c: Likewise. * gcc.target/i386/pr114097-1.c: Likewise. * gcc.target/i386/stack-check-17.c: Expect no pop on ! ia32.
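For readers who want to see the shape of user code this affects, here is a small sketch of a noreturn function that calls backtrace(3). It is not one of the testcases from the patch. It assumes a glibc-style <execinfo.h>, all names are hypothetical, and it leaves the noreturn function through longjmp only so that the frame count can be inspected afterwards.

```c
#include <execinfo.h>   /* backtrace(3); assumes a glibc-style libc */
#include <setjmp.h>

static jmp_buf resume_point;   /* hypothetical: where control resumes */
static int frames_seen;        /* hypothetical: result of backtrace () */

/* A noreturn function that walks the stack before leaving.  It never
   returns to its caller; control continues at the setjmp below.  */
__attribute__((noreturn))
static void report_and_bail(void)
{
    void *frames[32];
    frames_seen = backtrace(frames, 32);
    longjmp(resume_point, 1);
}

/* Call the noreturn function and report how many frames it saw.  */
int count_frames_from_noreturn(void)
{
    frames_seen = -1;
    if (setjmp(resume_point) == 0)
        report_and_bail();
    return frames_seen;
}
```

Whether the unwinder can walk past report_and_bail depends on the callers' frames still being describable, which in frame-pointer-based frames means bp must hold a sane value at that point.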
2024-03-05RISC-V: Cleanup unused code in riscv_v_adjust_bytesize [NFC]Pan Li1-4/+0
Clean up mode_size related code which is not used anymore. The tests below pass with this patch. * The RVV full regression test. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_v_adjust_bytesize): Cleanup unused mode_size related code. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-03-04c++/modules: relax diagnostic about GMF contentsPatrick Palka4-7/+15
Issuing a hard error when the GMF doesn't consist only of preprocessing directives happens to be inconvenient for automated testcase reduction via cvise. This patch relaxes this diagnostic into a pedwarn that can be disabled with -Wno-global-module. gcc/c-family/ChangeLog: * c.opt (Wglobal-module): New warning. gcc/cp/ChangeLog: * parser.cc (cp_parser_translation_unit): Relax GMF contents error into a pedwarn. gcc/ChangeLog: * doc/invoke.texi (-Wno-global-module): Document. gcc/testsuite/ChangeLog: * g++.dg/modules/friend-6_a.C: Pass -Wno-global-module instead of -Wno-pedantic. Remove now unnecessary preprocessing directives from GMF. Reviewed-by: Jason Merrill <jason@redhat.com>
2024-03-05Daily bump.GCC Administrator8-1/+428
2024-03-05c++: Support exporting using-decls in same namespace as targetNathaniel Shead5-8/+166
Currently a using-declaration bringing a name into its own namespace is a no-op, except for functions. This prevents people from being able to redeclare a name brought in from the GMF as exported, however, which this patch fixes. Apart from marking declarations as exported they are also now marked as effectively being in the module purview (due to the using-decl) so that they are properly processed, as 'add_binding_entity' assumes that declarations not in the module purview cannot possibly be exported. gcc/cp/ChangeLog: * name-lookup.cc (walk_module_binding): Remove completed FIXME. (do_nonmember_using_decl): Mark redeclared entities as exported when needed. Check for re-exporting internal linkage types. gcc/testsuite/ChangeLog: * g++.dg/modules/using-12.C: New test. * g++.dg/modules/using-13.h: New test. * g++.dg/modules/using-13_a.C: New test. * g++.dg/modules/using-13_b.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>