Age Commit message Author Files Lines
2024-05-12Fortran: Unlimited polymorphic intrinsic function arguments [PR84006]Paul Thomas5-20/+257
2024-05-12 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/84006 PR fortran/100027 PR fortran/98534 * iresolve.cc (gfc_resolve_transfer): Emit a TODO error for unlimited polymorphic mold. * trans-expr.cc (gfc_resize_class_size_with_len): Use the fold even if a block is not available in which to fix the result. (trans_class_assignment): Enable correct assignment of character expressions to unlimited polymorphic variables using lhs _len field and rse string_length. * trans-intrinsic.cc (gfc_conv_intrinsic_storage_size): Extract the class expression so that the unlimited polymorphic class expression can be used in gfc_resize_class_size_with_len to obtain the storage size for character payloads. Guard the use of GFC_DECL_SAVED_DESCRIPTOR by testing for DECL_LANG_SPECIFIC to prevent the ICE. Also, invert the order to use the class expression extracted from the argument. (gfc_conv_intrinsic_transfer): In the same way as 'storage_size', use the _len field to obtain the correct length for arg 1. Add a branch for the element size in bytes of class expressions with provision to make use of the unlimited polymorphic _len field. Again, the class references are explicitly identified. 'mold_expr' was already declared. Use it instead of 'arg'. Do not fix 'dest_word_len' for deferred character sources because reallocation on assign makes use of it before it is assigned. gcc/testsuite/ PR fortran/84006 PR fortran/100027 * gfortran.dg/storage_size_7.f90: New test. PR fortran/98534 * gfortran.dg/transfer_class_4.f90: New test.
2024-05-11Fortran: fix dependency checks for inquiry refs [PR115039]Harald Anlauf2-1/+21
gcc/fortran/ChangeLog: PR fortran/115039 * expr.cc (gfc_traverse_expr): An inquiry ref does not constitute a dependency and cannot collide with a symbol. gcc/testsuite/ChangeLog: PR fortran/115039 * gfortran.dg/statement_function_5.f90: New test.
2024-05-11[PATCH v4 4/4] Output S_COMPILE3 symbol in CodeView debug sectionMark Harmstone1-0/+126
Outputs the S_COMPILE3 symbol in the CodeView .debug$S debug section. The DEBUG_S_SYMBOLS block added here makes up pretty much everything that isn't data structures or line numbers; we add the S_COMPILE3 symbol here to start it off. This is a descriptive bit, the most interesting part of which is the version of the compiler used. gcc/ * dwarf2codeview.cc (DEBUG_S_SYMBOLS): Define. (S_COMPILE3, CV_CFL_80386, CV_CFL_X64): Likewise. (CV_CFL_C, CV_CFL_CXX): Likewise. (SYMBOL_START_LABEL, SYMBOL_END_LABEL): Likewise. (start_processor, language_constant): New functions. (write_compile3_symbol, write_codeview_symbols): Likewise. (codeview_debug_finish): Call write_codeview_symbols.
2024-05-11[PATCH v2 3/4] Output line numbers in CodeView sectionMark Harmstone4-1/+322
Outputs the DEBUG_S_LINES block in the CodeView .debug$S section, which maps between line numbers and addresses. You'll need a fairly recent version of GAS for the .secidx directive to be recognized. gcc/ * dwarf2codeview.cc (DEBUG_S_LINES, LINE_LABEL): Define. (END_FUNC_LABEL): Likewise. (struct codeview_line, codeview_line_block): New structures. (codeview_function): Likewise. (line_label_num, func_label_num, funcs, last_func): New variables. (last_filename, last_file_id): Likewise. (codeview_source_line, write_line_numbers): New functions. (codeview_switch_text_section, codeview_end_epilogue): Likewise. (codeview_debug_finish): Call write_line_numbers. * dwarf2codeview.h (codeview_source_line): Prototype. (codeview_switch_text_section, codeview_end_epilogue): Likewise. * dwarf2out.cc (dwarf2_end_epilogue): Add codeview support. (dwarf2out_switch_text_section): Likewise. (dwarf2out_source_line): Likewise. * opts.cc (finish_options): Handle codeview debugging symbols.
2024-05-11[PATCH v2 2/4] Output file checksums in CodeView sectionMark Harmstone3-0/+260
Outputs the file name and MD5 hash of the main source file into the CodeView .debug$S section, along with that of any #include'd files. gcc/ * dwarf2codeview.cc (DEBUG_S_STRINGTABLE): Define. (DEBUG_S_FILECHKSMS, CHKSUM_TYPE_MD5, HASH_SIZE): Likewise. (codeview_string, codeview_source_file): New structures. (struct string_hasher): New class for codeview_string hashing. (files, last_file, num_files, string_offset): New variables. (strings_hstab, strings, last_string): Likewise. (add_string, codeview_start_source_file): New functions. (write_strings_table, write_source_files): Likewise. (codeview_debug_finish): Call new functions. * dwarf2codeview.h (codeview_start_source_file): Prototype. * dwarf2out.cc (dwarf2out_start_source_file): Handle codeview.
2024-05-11[PATCH v2 1/4] Support for CodeView debugging formatMark Harmstone11-5/+177
This patch and the following add initial support for Microsoft's CodeView debugging format, as used by MSVC, to mingw targets. Note that you will need a recent version of binutils for this to be useful. The best way to view the output is to run Microsoft's cvdump.exe, found in their microsoft-pdb repo on GitHub, against the object files. gcc/ * Makefile.in (OBJS): Add dwarf2codeview.o. (GTFILES): Add dwarf2codeview.cc. * config/i386/cygming.h (CODEVIEW_DEBUGGING_INFO): Define. * dwarf2codeview.cc: New file. * dwarf2codeview.h: New file. * dwarf2out.cc: Include dwarf2codeview.h. (dwarf2out_finish): Call codeview_debug_finish as needed. * flag-types.h (DINFO_TYPE_CODEVIEW): Add enum member. (CODEVIEW_DEBUG): Define. * flags.h (codeview_debuginfo_p): Prototype. * opts.cc (debug_type_names): Add codeview. (debug_type_masks): Add CODEVIEW_DEBUG. (df_set_names): Add codeview. (codeview_debuginfo_p): New function. (dwarf_based_debuginfo_p): Add CODEVIEW clause. (set_debug_level): Handle CODEVIEW_DEBUG. * toplev.cc (process_options): Handle codeview. gcc/testsuite * gcc.dg/debug/codeview/codeview-1.c: New test. * gcc.dg/debug/codeview/codeview.exp: New testsuite driver.
2024-05-11tree-optimization/114760 - check variants of >> and << in loop-niterdzhao.ampere3-14/+131
When recognizing bit counting idiom, include pattern "x * 2" for "x << 1", and "x / 2" for "x >> 1" (given x is unsigned). gcc/ChangeLog: PR tree-optimization/114760 * tree-ssa-loop-niter.cc (is_lshift_by_1): New function to check if STMT is equivalent to x << 1. (is_rshift_by_1): New function to check if STMT is equivalent to x >> 1. (number_of_iterations_cltz): Enhance the identification of logical shift by one. (number_of_iterations_cltz_complement): Enhance the identification of logical shift by one. gcc/testsuite/ChangeLog: PR tree-optimization/114760 * gcc.dg/tree-ssa/pr114760-1.c: New test. * gcc.dg/tree-ssa/pr114760-2.c: New test.
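The equivalence this commit relies on can be sketched in plain C++. This is an illustrative example only, not code from the patch or its tests: the function names below are hypothetical, and the loops simply show that for an unsigned x, "x / 2" behaves like "x >> 1" and "x * 2" behaves like "x << 1" in a bit counting loop.

```cpp
#include <cassert>
#include <cstdint>

// Bit counting loop written with an explicit right shift by one.
unsigned count_bits_shift(std::uint32_t x) {
    unsigned n = 0;
    while (x != 0) {
        x >>= 1;
        ++n;
    }
    return n;
}

// The same loop written with "x / 2"; for unsigned x this is
// equivalent to "x >> 1", which is the variant the commit teaches
// the niter analysis to recognize.
unsigned count_bits_div(std::uint32_t x) {
    unsigned n = 0;
    while (x != 0) {
        x /= 2;
        ++n;
    }
    return n;
}

// Counts leading zero bits using "x * 2" in place of "x << 1".
// The caller must pass a non-zero value, otherwise the loop never ends.
unsigned leading_zeros_mul(std::uint32_t x) {
    unsigned n = 0;
    while ((x & 0x80000000u) == 0) {
        x *= 2;
        ++n;
    }
    return n;
}
```

Both right-shift variants return the same count for every input, so a loop-iteration analysis that understands one form can treat the other identically.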
2024-05-11[prange] Default unimplemented prange operators to false.Aldy Hernandez1-40/+15
The canonical way to indicate that a range operator is unsupported is to return false, which has the semantic meaning of VARYING. This patch cleans up a few default virtuals that were trying harder to set VARYING manually. gcc/ChangeLog: * range-op-ptr.cc (range_operator::fold_range): Return false.
2024-05-11[prange] Do not trap by default on range dispatch mismatches.Aldy Hernandez1-6/+17
The trap in the range-op dispatch code is really an internal debugging aid, and only a temporary one for a few weeks while the dust settles. This patch turns it off by default, allowing problematic passes to turn it on for analysis. gcc/ChangeLog: * range-op.cc (TRAP_ON_UNHANDLED_POINTER_OPERATORS): New (range_op_handler::fold_range): Use it. (range_op_handler::op1_range): Same. (range_op_handler::op2_range): Same. (range_op_handler::lhs_op1_relation): Same. (range_op_handler::lhs_op2_relation): Same. (range_op_handler::op1_op2_relation): Same.
2024-05-10c++: Implement __is_nothrow_invocable built-in traitKen Matsui5-0/+77
This patch implements built-in trait for std::is_nothrow_invocable. gcc/cp/ChangeLog: * cp-trait.def: Define __is_nothrow_invocable. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_NOTHROW_INVOCABLE. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __is_nothrow_invocable. * g++.dg/ext/is_nothrow_invocable.C: New test. Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org> Reviewed-by: Jason Merrill <jason@redhat.com>
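As a reminder of what the library traits behind these built-ins report, here is a small sketch using only the standard std::is_invocable and std::is_nothrow_invocable (C++17). The two sample functions are hypothetical and are not taken from the GCC testsuite.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical sample functions: one may throw, one promises not to.
int may_throw(int x) { return x + 1; }
int never_throws(int x) noexcept { return x + 1; }

// Both functions can be called with an int argument.
static_assert(std::is_invocable_v<decltype(may_throw), int>);
static_assert(std::is_invocable_v<decltype(never_throws), int>);

// Only the noexcept function is nothrow-invocable.
static_assert(!std::is_nothrow_invocable_v<decltype(may_throw), int>);
static_assert(std::is_nothrow_invocable_v<decltype(never_throws), int>);

// A pointer argument does not convert to int, so the call is ill-formed.
static_assert(!std::is_invocable_v<decltype(may_throw), int*>);
```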
2024-05-10c++: Implement __is_invocable built-in traitKen Matsui10-0/+724
This patch implements built-in trait for std::is_invocable. gcc/cp/ChangeLog: * cp-trait.def: Define __is_invocable. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_INVOCABLE. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise. * cp-tree.h (build_invoke): New function. * method.cc (build_invoke): New function. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __is_invocable. * g++.dg/ext/is_invocable1.C: New test. * g++.dg/ext/is_invocable2.C: New test. * g++.dg/ext/is_invocable3.C: New test. * g++.dg/ext/is_invocable4.C: New test. Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org> Reviewed-by: Patrick Palka <ppalka@redhat.com> Reviewed-by: Jason Merrill <jason@redhat.com>
2024-05-10c++: Implement __array_rank built-in traitKen Matsui5-3/+52
This patch implements built-in trait for std::rank. gcc/cp/ChangeLog: * cp-trait.def: Define __array_rank. * constraint.cc (diagnose_trait_expr): Handle CPTK_RANK. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __array_rank. * g++.dg/ext/rank.C: New test. Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org> Reviewed-by: Jason Merrill <jason@redhat.com>
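For reference, the standard trait that this built-in backs counts array dimensions. A minimal sketch using only std::rank from the standard library:

```cpp
#include <cassert>
#include <type_traits>

// Non-array types have rank 0.
static_assert(std::rank_v<int> == 0);
// Each array dimension adds one to the rank.
static_assert(std::rank_v<int[5]> == 1);
static_assert(std::rank_v<int[2][3]> == 2);
// An unknown first bound still counts as a dimension.
static_assert(std::rank_v<int[][3][4]> == 3);
```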
2024-05-10c++: Implement __decay built-in traitKen Matsui4-0/+38
This patch implements built-in trait for std::decay. gcc/cp/ChangeLog: * cp-trait.def: Define __decay. * semantics.cc (finish_trait_type): Handle CPTK_DECAY. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __decay. * g++.dg/ext/decay.C: New test. Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org> Reviewed-by: Jason Merrill <jason@redhat.com>
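The transformations that std::decay performs, and that a built-in for it has to reproduce, can be illustrated with the standard trait alone:

```cpp
#include <cassert>
#include <type_traits>

// Arrays decay to a pointer to their element type.
static_assert(std::is_same_v<std::decay_t<int[3]>, int*>);
// References and top-level cv-qualifiers are stripped.
static_assert(std::is_same_v<std::decay_t<const int&>, int>);
// Function types decay to function pointers.
static_assert(std::is_same_v<std::decay_t<int(int)>, int (*)(int)>);
```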
2024-05-10c++: Implement __add_rvalue_reference built-in traitKen Matsui4-0/+30
This patch implements built-in trait for std::add_rvalue_reference. gcc/cp/ChangeLog: * cp-trait.def: Define __add_rvalue_reference. * semantics.cc (finish_trait_type): Handle CPTK_ADD_RVALUE_REFERENCE. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __add_rvalue_reference. * g++.dg/ext/add_rvalue_reference.C: New test. Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org> Reviewed-by: Jason Merrill <jason@redhat.com>
2024-05-10c++: Implement __add_lvalue_reference built-in traitKen Matsui4-0/+31
This patch implements built-in trait for std::add_lvalue_reference. gcc/cp/ChangeLog: * cp-trait.def: Define __add_lvalue_reference. * semantics.cc (finish_trait_type): Handle CPTK_ADD_LVALUE_REFERENCE. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __add_lvalue_reference. * g++.dg/ext/add_lvalue_reference.C: New test. Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org> Reviewed-by: Jason Merrill <jason@redhat.com>
2024-05-10c++: Implement __remove_all_extents built-in traitKen Matsui4-0/+23
This patch implements built-in trait for std::remove_all_extents. gcc/cp/ChangeLog: * cp-trait.def: Define __remove_all_extents. * semantics.cc (finish_trait_type): Handle CPTK_REMOVE_ALL_EXTENTS. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __remove_all_extents. * g++.dg/ext/remove_all_extents.C: New test. Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org> Reviewed-by: Jason Merrill <jason@redhat.com>
2024-05-10c++: Implement __remove_extent built-in traitKen Matsui4-0/+25
This patch implements built-in trait for std::remove_extent. gcc/cp/ChangeLog: * cp-trait.def: Define __remove_extent. * semantics.cc (finish_trait_type): Handle CPTK_REMOVE_EXTENT. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __remove_extent. * g++.dg/ext/remove_extent.C: New test. Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org> Reviewed-by: Jason Merrill <jason@redhat.com>
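To show how this trait differs from std::remove_all_extents (covered by the neighbouring commit), here is a short sketch using only the standard library traits:

```cpp
#include <cassert>
#include <type_traits>

// remove_extent strips only the outermost array dimension.
static_assert(std::is_same_v<std::remove_extent_t<int[2][3]>, int[3]>);
// remove_all_extents strips every dimension.
static_assert(std::is_same_v<std::remove_all_extents_t<int[2][3]>, int>);
// Non-array types are left unchanged by both.
static_assert(std::is_same_v<std::remove_extent_t<int>, int>);
static_assert(std::is_same_v<std::remove_all_extents_t<int>, int>);
```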
2024-05-10c++: Implement __add_pointer built-in traitKen Matsui4-3/+76
This patch implements built-in trait for std::add_pointer. gcc/cp/ChangeLog: * cp-trait.def: Define __add_pointer. * semantics.cc (finish_trait_type): Handle CPTK_ADD_POINTER. (object_type_p): New function. (referenceable_type_p): Likewise. (trait_expr_value): Use object_type_p. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __add_pointer. * g++.dg/ext/add_pointer.C: New test. Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org> Reviewed-by: Jason Merrill <jason@redhat.com>
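The ChangeLog mentions helpers for object and referenceable types, which matters because std::add_pointer only forms a pointer when the type is referenceable or (possibly cv-qualified) void. A small sketch of that behaviour using the standard trait, not the new built-in:

```cpp
#include <cassert>
#include <type_traits>

// Object types simply gain a pointer.
static_assert(std::is_same_v<std::add_pointer_t<int>, int*>);
// References are removed first, so int& becomes int*.
static_assert(std::is_same_v<std::add_pointer_t<int&>, int*>);
// void is allowed even though it is not referenceable.
static_assert(std::is_same_v<std::add_pointer_t<void>, void*>);
// A cv-qualified function type is not referenceable, so it is
// returned unchanged instead of producing an ill-formed pointer.
static_assert(std::is_same_v<std::add_pointer_t<void() const>, void() const>);
```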
2024-05-10c++: Implement __is_unbounded_array built-in traitKen Matsui5-0/+48
This patch implements built-in trait for std::is_unbounded_array. gcc/cp/ChangeLog: * cp-trait.def: Define __is_unbounded_array. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_UNBOUNDED_ARRAY. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __is_unbounded_array. * g++.dg/ext/is_unbounded_array.C: New test. Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org> Reviewed-by: Jason Merrill <jason@redhat.com>
2024-05-10[RISC-V] Use shNadd for constant synthesisJeff Law3-1/+124
So here's the next idiom to improve constant synthesis. The basic idea here is to try and use shNadd to generate the constant when profitable. Let's take 0x300000801. Right now that generates: li a0,3145728 addi a0,a0,1 slli a0,a0,12 addi a0,a0,-2047 But we can do better. The constant is evenly divisible by 9 resulting in 0x55555639 which doesn't look terribly interesting. But that constant can be generated with two instructions, then we can use a sh3add to multiply it by 9. So the updated sequence looks like: li a0,1431654400 addi a0,a0,1593 sh3add a0,a0,a0 This doesn't trigger a whole lot, but I haven't really set up a test to explore the most likely space where this might be useful. The tests were found exploring a different class of constant synthesis problems. If you were to dive into the before/after you'd see that the shNadd interacts quite nicely with the recent bseti work. The joys of recursion. Probably the most controversial thing in here is using the "FMA" opcode to stand in for when we want to use shNadd. Essentially when we synthesize a constant we generate a series of RTL opcodes and constants for emission by another routine. We don't really have a way to say we want a shift-add. But you can think of shift-add as a limited form of multiply-accumulate. It's a bit of a stretch, but not crazy bad IMHO. Other approaches would be to store our own enum rather than an RTL opcode. Or store an actual generator function rather than any kind of opcode. It wouldn't take much pushback over (ab)using FMA in this manner to get me to use our own enums rather than RTL opcodes for this stuff. gcc/ * config/riscv/riscv.cc (riscv_build_integer_1): Recognize cases where we can use shNadd to improve constant synthesis. (riscv_move_integer): Handle code generation for shNadd. gcc/testsuite * gcc.target/riscv/synthesis-1.c: Also count shNadd instructions. * gcc.target/riscv/synthesis-3.c: New test.
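The arithmetic in the message above can be checked with a few compile-time assertions. This is only a verification sketch: the shnadd helper is a hypothetical model of the instruction's semantics ((rs1 << N) + rs2), not code from riscv.cc, and it says nothing about the cost checks the backend applies.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of "shNadd rd, rs1, rs2": rd = (rs1 << N) + rs2.
constexpr std::uint64_t shnadd(unsigned n, std::uint64_t rs1, std::uint64_t rs2) {
    return (rs1 << n) + rs2;
}

constexpr std::uint64_t target = 0x300000801ULL;

// Old sequence: li 3145728; addi 1; slli 12; addi -2047.
static_assert((((3145728ULL + 1) << 12) - 2047) == target);

// New sequence, first two instructions: li 1431654400; addi 1593.
constexpr std::uint64_t base = 1431654400ULL + 1593ULL;
static_assert(base == 0x55555639ULL);

// The target is evenly divisible by 9, giving exactly that base value.
static_assert(target % 9 == 0 && target / 9 == base);

// sh3add a0,a0,a0 computes (a0 << 3) + a0, which is a0 * 9.
static_assert(shnadd(3, base, base) == target);
```

The same reasoning would apply to sh1add (multiply by 3) and sh2add (multiply by 5) when both operands are the same register.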
2024-05-10i386: Improve V[48]QI shifts on AVX512/SSE4.1Roger Sayle5-2/+91
The following one-line patch improves the code generated for V8QI and V4QI shifts when AVX512BW and AVX512VL functionality is available. For the testcase (from gcc.target/i386/vect-shiftv8qi.c): typedef signed char v8qi __attribute__ ((__vector_size__ (8))); v8qi foo (v8qi x) { return x >> 5; } GCC with -O2 -march=cascadelake currently generates: foo: movl $67372036, %eax vpsraw $5, %xmm0, %xmm2 vpbroadcastd %eax, %xmm1 movl $117901063, %eax vpbroadcastd %eax, %xmm3 vmovdqa %xmm1, %xmm0 vmovdqa %xmm3, -24(%rsp) vpternlogd $120, -24(%rsp), %xmm2, %xmm0 vpsubb %xmm1, %xmm0, %xmm0 ret with this patch we now generate the much improved: foo: vpmovsxbw %xmm0, %xmm0 vpsraw $5, %xmm0, %xmm0 vpmovwb %xmm0, %xmm0 ret This patch also fixes the FAILs of gcc.target/i386/vect-shiftv[48]qi.c when run with the additional -march=cascadelake flag, by splitting these tests into two; one form testing code generation with -msse2 (and -mno-avx512vl) as originally intended, and the other testing AVX512 code generation with an explicit -march=cascadelake. 2024-05-10 Roger Sayle <roger@nextmovesoftware.com> Hongtao Liu <hongtao.liu@intel.com> gcc/ChangeLog * config/i386/i386-expand.cc (ix86_expand_vecop_qihi_partial): Don't attempt ix86_expand_vec_shift_qihi_constant on SSE4.1. gcc/testsuite/ChangeLog * gcc.target/i386/vect-shiftv4qi.c: Specify -mno-avx512vl. * gcc.target/i386/vect-shiftv8qi.c: Likewise. * gcc.target/i386/vect-shiftv4qi-2.c: New test case. * gcc.target/i386/vect-shiftv8qi-2.c: Likewise.
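The improved three-instruction sequence widens each byte to 16 bits, shifts, and narrows again. A scalar C++ sketch of that idea follows; the function name and the std::array stand-in for the vector type are hypothetical, and the code assumes that right-shifting a negative signed value is an arithmetic shift (guaranteed since C++20, and the behaviour of GCC before that).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for the 8-byte vector of signed chars used in the testcase.
using v8qi = std::array<std::int8_t, 8>;

// Per-element model of: vpmovsxbw (sign-extend bytes to words),
// vpsraw (arithmetic shift right on words), vpmovwb (truncate to bytes).
v8qi shift_right_via_widening(const v8qi& x, int count) {
    v8qi result{};
    for (std::size_t i = 0; i < x.size(); ++i) {
        std::int16_t wide = x[i];                          // sign-extend
        wide = static_cast<std::int16_t>(wide >> count);   // arithmetic shift
        result[i] = static_cast<std::int8_t>(wide);        // truncate
    }
    return result;
}
```

Because the sign bits shifted in at the top of the 16-bit lane match the sign of the original byte, truncating back to 8 bits gives the same result as shifting the byte directly.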
2024-05-10pru: Fix register class checks in predicatesDimitar Dimitrov1-2/+2
The register class checks in the multiply-source predicates were incorrectly using the register number instead of the register class for comparison. gcc/ChangeLog: * config/pru/predicates.md (pru_mulsrc0_operand): Use register class instead of register number for the check. (pru_mulsrc1_operand): Ditto. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2024-05-10[PR114942][LRA]: Don't reuse input reload reg of inout early clobber operandVladimir N. Makarov2-8/+43
The insn in question has the same reg in an inout operand and an input operand. The inout operand is early clobber. LRA reused the input reload reg of the inout operand for the input operand, which is wrong. It would be a good decision if the inout operand were not an early clobber one. The patch rejects the reuse for the PR test case. gcc/ChangeLog: PR target/114942 * lra-constraints.cc (struct input_reload): Add new member early_clobber_p. (get_reload_reg): Add new arg early_clobber_p, don't reuse input reload with true early_clobber_p member value, use the arg for new element of curr_insn_input_reloads. (match_reload): Assign false to early_clobber_p member. (process_addr_reg, simplify_operand_subreg, curr_insn_transform): Adjust get_reload_reg calls. gcc/testsuite/ChangeLog: PR target/114942 * gcc.target/i386/pr114942.c: New.
2024-05-10[prange] Fix thinko in prange::update_bitmask() [PR115026]Aldy Hernandez1-1/+1
gcc/ChangeLog: PR tree-optimization/115026 * value-range.cc (prange::update_bitmask): Use operand bitmask.
2024-05-10tree-optimization/114998 - use-after-free with loop distributionRichard Biener2-6/+53
When loop distribution releases a PHI node of the original IL, it can end up clobbering memory that has since been re-used: upon releasing its RDG it resets all stmt UIDs back to -1, even those of stmts that were already released. The fix is to avoid resetting UIDs based on stmts in the RDG but instead reset only those still present in the loop. PR tree-optimization/114998 * tree-loop-distribution.cc (free_rdg): Take loop argument. Reset UIDs of stmts still in the IL rather than all stmts referenced from the RDG. (loop_distribution::build_rdg): Pass loop to free_rdg. (loop_distribution::distribute_loop): Likewise. (loop_distribution::transform_reduction_loop): Likewise. * gcc.dg/torture/pr114998.c: New testcase.
2024-05-10Allow patterns in SLP reductionsRichard Biener4-21/+101
The following removes the over-broad rejection of patterns for SLP reductions which is done by removing them from LOOP_VINFO_REDUCTIONS during pattern detection. That's also insufficient in case the pattern only appears on the reduction path. Instead this implements the proper correctness check in vectorizable_reduction and guides SLP discovery to heuristically avoid forming later invalid groups. I also couldn't find any testcase that FAILs when allowing the SLP reductions to form so I've added one. I came across this for single-lane SLP reductions with the all-SLP work where we rely on patterns to properly vectorize COND_EXPR reductions. * tree-vect-patterns.cc (vect_pattern_recog_1): Do not remove reductions involving patterns. * tree-vect-loop.cc (vectorizable_reduction): Reject SLP reduction groups with multiple lane-reducing reductions. * tree-vect-slp.cc (vect_analyze_slp_instance): When discovering SLP reduction groups avoid including lane-reducing ones. * gcc.dg/vect/vect-reduc-sad-9.c: New testcase.
2024-05-10AVR: target/114981 - Tweak __builtin_powif / __powisf2Georg-Johann Lay3-1/+190
Implement __powisf2 in assembly. PR target/114981 libgcc/ * config/avr/t-avr (LIB2FUNCS_EXCLUDE): Add _powisf2. (LIB1ASMFUNCS) [!avrtiny]: Add _powif. * config/avr/lib1funcs.S (mov4): New .macro. (L_powif, __powisf2) [!avrtiny]: New module and function. gcc/testsuite/ * gcc.target/avr/pr114981-powif.c: New test.
2024-05-10bpf: fix printing of memory operands in pseudoc asm dialectJose E. Marchesi3-24/+21
The BPF backend was emitting memory operands in pseudo-C syntax without surrounding parentheses. These were being provided in the corresponding instruction templates. This was causing GCC to emit invalid instructions when finding inline assembly with memory operands like: asm volatile ( "r1 = *(u64 *)%[ctx_a];" "if r1 != 42 goto 1f;" "r1 = *(u64 *)%[ctx_b];" "if r1 != 42 goto 1f;" "r1 = *(u64 *)%[ctx_c];" "if r1 != 7 goto 1f;" "r1 /= 0;" "1:" : : [ctx_a]"m"(ctx.a), [ctx_b]"m"(ctx.b), [ctx_c]"m"(ctx.c) : "r1" ); This patch changes the backend to include the surrounding parentheses in the printed representation of the memory operands (much like surrounding brackets are included in normal asm syntax) and adapts the impacted instruction templates accordingly. Tested in target bpf-unknown-none, host x86_64-linux-gnu. gcc/ChangeLog: * config/bpf/bpf.cc (bpf_print_operand_address): Include surrounding parenthesis around mem operands in pseudoc asm dialect. * config/bpf/bpf.md (*mov<MM:mode>): Adapt accordingly. (zero_extendhidi2): Likewise. (zero_extendqidi2): Likewise. (*extendsidi2): Likewise. (*extendsidi2): Likewise. (extendhidi2): Likewise. (extendqidi2): Likewise. (extendhisi2): Likewise. * config/bpf/atomic.md (atomic_add<AMO:mode>): Likewise. (atomic_and<AMO:mode>): Likewise. (atomic_or<AMO:mode>): Likewise. (atomic_xor<AMO:mode>): Likewise. (atomic_fetch_add<AMO:mode>): Likewise. (atomic_fetch_and<AMO:mode>): Likewise. (atomic_fetch_or<AMO:mode>): Likewise. (atomic_fetch_xor<AMO:mode>): Likewise.
2024-05-10c++, mingw: Fix up types of dtor hooks to __cxa_{,thread_}atexit/__cxa_throw on mingw ia32 [PR114968]Jakub Jelinek9-35/+85
__cxa_atexit/__cxa_thread_atexit/__cxa_throw functions accept function pointers that usually point directly to destructors rather than wrappers around them. Now, mingw ia32 uses implicitly __attribute__((thiscall)) calling conventions for METHOD_TYPE (where the this pointer is passed in %ecx register, the rest on the stack), so these functions use: in config/os/mingw32/os_defines.h: #if defined (__i386__) #define _GLIBCXX_CDTOR_CALLABI __thiscall #endif in libsupc++/cxxabi.h __cxa_atexit(void (_GLIBCXX_CDTOR_CALLABI *)(void*), void*, void*) _GLIBCXX_NOTHROW; __cxa_thread_atexit(void (_GLIBCXX_CDTOR_CALLABI *)(void*), void*, void *) _GLIBCXX_NOTHROW; __cxa_throw(void*, std::type_info*, void (_GLIBCXX_CDTOR_CALLABI *) (void *)) __attribute__((__noreturn__)); Now, mingw for some weird reason uses #define TARGET_CXX_USE_ATEXIT_FOR_CXA_ATEXIT hook_bool_void_true so it never actually uses __cxa_atexit, but does use __cxa_thread_atexit and __cxa_throw. Recent changes for modules result in more detailed __cxa_*atexit/__cxa_throw prototypes precreated by the compiler, and if that happens and one also includes <cxxabi.h>, the compiler complains about mismatches in the prototypes. One thing is the missing thiscall attribute on the FUNCTION_TYPE, the other problem is that all of atexit/__cxa_atexit/__cxa_thread_atexit get function pointer types created by a single function, get_atexit_fn_ptr_type (), which creates it depending on if atexit or __cxa_atexit will be used as either void(*)(void) or void(*)(void *), but when using atexit and __cxa_thread_atexit it uses the wrong function type for __cxa_thread_atexit.
The following patch adds a target hook to add the thiscall attribute to the function pointers, and splits the get_atexit_fn_ptr_type () function into get_atexit_fn_ptr_type () and get_cxa_atexit_fn_ptr_type (), the former always creates shared void(*)(void) type, the latter creates either void(*)(void*) (on most targets) or void(__attribute__((thiscall))*)(void*) (on mingw ia32). So that we don't waste another GTY global tree for it, because cleanup_type used for the same purpose for __cxa_throw should be the same, the code changes it to use that type too. In register_dtor_fn then based on the decision whether to use atexit, __cxa_atexit or __cxa_thread_atexit it picks the right function pointer type, and also if it decides to emit a __tcf_* wrapper for the cleanup, uses that type for that wrapper so that it agrees on calling convention. 2024-05-10 Jakub Jelinek <jakub@redhat.com> PR target/114968 gcc/ * target.def (use_atexit_for_cxa_atexit): Remove spurious space from comment. (adjust_cdtor_callabi_fntype): New cxx target hook. * targhooks.h (default_cxx_adjust_cdtor_callabi_fntype): Declare. * targhooks.cc (default_cxx_adjust_cdtor_callabi_fntype): New function. * doc/tm.texi.in (TARGET_CXX_ADJUST_CDTOR_CALLABI_FNTYPE): Add. * doc/tm.texi: Regenerate. * config/i386/i386.cc (ix86_cxx_adjust_cdtor_callabi_fntype): New function. (TARGET_CXX_ADJUST_CDTOR_CALLABI_FNTYPE): Redefine. gcc/cp/ * cp-tree.h (atexit_fn_ptr_type_node, cleanup_type): Adjust macro comments. (get_cxa_atexit_fn_ptr_type): Declare. * decl.cc (get_atexit_fn_ptr_type): Adjust function comment, only build type for atexit argument. (get_cxa_atexit_fn_ptr_type): New function. (get_atexit_node): Call get_cxa_atexit_fn_ptr_type rather than get_atexit_fn_ptr_type when using __cxa_atexit. (get_thread_atexit_node): Call get_cxa_atexit_fn_ptr_type rather than get_atexit_fn_ptr_type.
(start_cleanup_fn): Add ob_parm argument, call get_cxa_atexit_fn_ptr_type or get_atexit_fn_ptr_type depending on it and create PARM_DECL also based on that argument. (register_dtor_fn): Adjust start_cleanup_fn caller, use get_cxa_atexit_fn_ptr_type rather than get_atexit_fn_ptr_type for use_dtor casts. * except.cc (build_throw): Use get_cxa_atexit_fn_ptr_type ().
2024-05-10[prange] Do not assume all pointers are the same size [PR115009]Aldy Hernandez1-11/+19
In a world with same sized pointers we can always reuse the storage slots, but since this is not always the case, we need to be more careful. However, we can always store an undefined, because that requires no extra storage. gcc/ChangeLog: PR tree-optimization/115009 * value-range-storage.cc (prange_storage::alloc): Do not assume all pointers are the same size. (prange_storage::prange_storage): Same. (prange_storage::fits_p): Same.
2024-05-10RISC-V: Fix typos in code or comment [NFC]Kito Cheng2-50/+50
Just found some typos when fixing bugs, then used aspell to find a few more; this patch does nothing other than fix typos. gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc: Fix typos in comments. (get_all_predecessors): Ditto. (pre_vsetvl::m_unknow_info): Rename to... (pre_vsetvl::m_unknown_info): this. (pre_vsetvl::compute_vsetvl_def_data): Rename m_unknow_info to m_unknown_info. (pre_vsetvl::cleaup): Rename to... (pre_vsetvl::cleanup): this. (pre_vsetvl::compute_vsetvl_def_data): Fix typos. (pass_vsetvl::lazy_vsetvl): Update function name and fix typos. * config/riscv/riscv.cc: Fix typos in comments. (struct machine_function): Fix typo in comments. (riscv_valid_lo_sum_p): Ditto. (riscv_force_address): Ditto. (riscv_immediate_operand_p): Ditto. (riscv_in_small_data_p): Ditto. (riscv_first_stack_step): Ditto. (riscv_expand_prologue): Ditto. (riscv_convert_vector_chunks): Ditto. (riscv_override_options_internal): Ditto. (get_common_costs): Ditto.
2024-05-10driver: Move -fdiagnostics-urls= early like -fdiagnostics-color= [PR114980]Xi Ruoyao1-0/+13
In GCC 14 we started to emit URLs for "command-line option <option> is valid for <language> but not <another language>" and "-Werror= argument '-Werror=<option>' is not valid for <language>" warnings. So we should have moved -fdiagnostics-urls= early like -fdiagnostics-color=, or -fdiagnostics-urls= wouldn't be able to control URLs in these warnings. No test cases are added because with TERM=xterm-256colors PR114980 already triggers some test failures. gcc/ChangeLog: PR driver/114980 * opts-common.cc (prune_options): Move -fdiagnostics-urls= early like -fdiagnostics-color=.
2024-05-09[committed] [RISC-V] Provide splitting guidance to combine to facilitate shNadd.uw generationJeff Law2-0/+52
This fixes a minor code quality issue I found while comparing GCC and LLVM. Essentially we want to do a bit of re-association to generate shNadd.uw instructions. Combine does the right thing and finds all the necessary instructions, reassociates the operands, combines constants, etc. Where it fails is finding a good split point. The backend can trivially provide guidance on how to split via a define_split pattern. This has survived both Ventana's internal CI system (rv64gcb) as well as my own (rv64gc, rv32gcv). I'll wait for the external CI system to give the all-clear before pushing. gcc/ * config/riscv/bitmanip.md: Add splitter for shadd feeding another add instruction. gcc/testsuite/ * gcc.target/riscv/zba-shadduw.c: New test.
2024-05-10Revert: "Enable prange support." [PR114985]Aldy Hernandez13-18/+31
This reverts commit 36e877996936abd8bd08f8b1d983c8d1023a5842 until the IPA pass is fixed with regards to POINTER = POINTER <RELOP> POINTER.
2024-05-09Constant fold {-1,-1} << 1 in simplify-rtx.ccRoger Sayle1-0/+54
This patch addresses a missed optimization opportunity in the RTL optimization passes. The function simplify_const_binary_operation will constant fold binary operators with two CONST_INT operands, and those with two CONST_VECTOR operands, but is missing compile-time evaluation of binary operators with a CONST_VECTOR and a CONST_INT, such as vector shifts and rotates. The first version of this patch didn't contain a switch statement to explicitly check for valid binary opcodes, which bootstrapped and regression tested fine, but my paranoia has got the better of me, so this version now checks that VEC_SELECT or some funky (future) rtx_code doesn't cause problems. 2024-05-09 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * simplify-rtx.cc (simplify_const_binary_operation): Constant fold binary operations where the LHS is CONST_VECTOR and the RHS is CONST_INT (or CONST_DOUBLE) such as vector shifts.
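What "constant folding a CONST_VECTOR with a CONST_INT" means for the example in the subject line can be modelled at the source level. The sketch below is purely illustrative: fold_vector_shl is a hypothetical helper that mirrors the idea elementwise, it is not the code in simplify-rtx.cc, and it assumes the usual two's complement wrap-around when converting the shifted unsigned value back to a signed element.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical elementwise fold: shift every constant element of the
// vector left by the same constant count, entirely at compile time.
template <std::size_t N>
constexpr std::array<std::int32_t, N>
fold_vector_shl(std::array<std::int32_t, N> vec, unsigned count) {
    for (std::size_t i = 0; i < N; ++i) {
        // Shift in unsigned arithmetic, then convert back to the
        // signed element type (assumed to wrap modulo 2^32).
        vec[i] = static_cast<std::int32_t>(
            static_cast<std::uint32_t>(vec[i]) << count);
    }
    return vec;
}

// {-1,-1} << 1 folds to {-2,-2} without any run-time work.
constexpr auto folded = fold_vector_shl<2>({-1, -1}, 1);
static_assert(folded[0] == -2 && folded[1] == -2);
```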
2024-05-09c++: failure to suppress -Wsizeof-array-div in template [PR114983]Marek Polacek3-0/+30
-Wsizeof-array-div offers a way to suppress the warning by wrapping the second operand of the division in parens: sizeof (samplesBuffer) / (sizeof(unsigned char)) but this doesn't work in a template, because we fail to propagate the suppression bits. Do it, then. The finish_parenthesized_expr hunk is not needed because suppress_warning isn't very fine-grained. But I think it makes sense to be explicit and not rely on OPT_Wparentheses also suppressing OPT_Wsizeof_array_div. PR c++/114983 gcc/cp/ChangeLog: * pt.cc (tsubst_expr) <case SIZEOF_EXPR>: Use copy_warning. * semantics.cc (finish_parenthesized_expr): Also suppress -Wsizeof-array-div. gcc/testsuite/ChangeLog: * g++.dg/warn/Wsizeof-array-div3.C: New test.
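The situation described above can be reproduced with a small template. This is a sketch only: buffer_size_in_bytes is a hypothetical function (the array name follows the message), and the example shows the parenthesized divisor that is meant to suppress the warning; it does not assert anything about which compiler versions warn.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical template: the array element type differs from the
// divisor's type, which is what -Wsizeof-array-div looks for.
template <typename T, std::size_t N>
std::size_t buffer_size_in_bytes() {
    T samplesBuffer[N] = {};
    // Wrapping the divisor in parentheses is the documented way to
    // tell the compiler the division is intentional.
    return sizeof (samplesBuffer) / (sizeof (unsigned char));
}
```

With T = unsigned short and N = 40 the function returns the size of the buffer in bytes rather than its element count, which is exactly the kind of intentional division the parentheses are meant to mark.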
2024-05-09testsuite: Fix up pr84508* tests [PR84508]Jakub Jelinek2-2/+4
The tests FAIL on x86_64-linux with /usr/bin/ld: cannot find -lubsan collect2: error: ld returned 1 exit status compiler exited with status 1 FAIL: gcc.target/i386/pr84508-1.c (test for excess errors) Excess errors: /usr/bin/ld: cannot find -lubsan The problem is that only *.dg/ubsan/ubsan.exp calls ubsan_init which adds the needed search paths to libubsan library. So, link/run tests for -fsanitize=undefined need to go into gcc.dg/ubsan/ or g++.dg/ubsan/, even when they are target specific. 2024-05-09 Jakub Jelinek <jakub@redhat.com> PR target/84508 * gcc.target/i386/pr84508-1.c: Move to ... * gcc.dg/ubsan/pr84508-1.c: ... here. Restrict to i?86/x86_64 non-ia32 targets. * gcc.target/i386/pr84508-2.c: Move to ... * gcc.dg/ubsan/pr84508-2.c: ... here. Restrict to i?86/x86_64 non-ia32 targets.
2024-05-09PR modula2/115003 exporting a symbol to outer scope with a name clash causes ICEGaius Mulley1-0/+1
An ICE will occur if an unknown symbol is exported and causes a name clash. The error mechanism attempts to find the scope of an unknown symbol. This patch adds a missing case clause to GetScope and returns NulSym if the scope is an unknown symbol. gcc/m2/ChangeLog: PR modula2/115003 * gm2-compiler/SymbolTable.mod (GetScope): Add UndefinedSym case clause and return NulSym. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-05-09c++: lambda capturing structured bindings [PR85889]Marek Polacek3-1/+23
<https://wg21.link/p1381r1> clarifies that it's OK to capture structured bindings. [expr.prim.lambda.capture]/4 says "The identifier in a simple-capture shall denote a local entity" and [basic.pre]/3: "An entity is a [...] structured binding". It doesn't appear that this was made a DR, so, strictly speaking, we should have a -Wc++20-extensions warning, like clang++. PR c++/85889 gcc/cp/ChangeLog: * lambda.cc (add_capture): Add a pedwarn for capturing structured bindings. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/decomp3.C: Use -Wno-c++20-extensions. * g++.dg/cpp1z/decomp60.C: New test.
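For illustration, the kind of code affected looks roughly like the sketch below. The function name `sum_of_pair` is hypothetical. The capture is valid C++20; going by the description above, a GCC containing this change would be expected to issue a -Wc++20-extensions pedwarn for it in earlier language modes rather than accept it silently.

```cpp
#include <cassert>
#include <utility>

// Decompose a pair into structured bindings and capture both bindings
// by copy in a lambda.
int sum_of_pair(std::pair<int, int> p)
{
    auto [a, b] = p;
    auto add = [a, b] { return a + b; };  // simple-captures naming structured bindings
    return add();
}
```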
2024-05-09Add myself to DCOH.J. Lu1-0/+1
ChangeLog: * MAINTAINERS: Add myself to DCO. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-05-09sra: Do not leave work for DSE (that it can sometimes not perform)Martin Jambor3-6/+17
When looking again at the g++.dg/tree-ssa/pr109849.C testcase we discovered that it generates terrible store-to-load forwarding stalls because SRA was leaving behind aggregate loads but all the stores were by scalar parts and DSE failed to remove the useless load. SRA has all the knowledge to remove the statement even now, so this small patch makes it do so. With this patch, the g++.dg/tree-ssa/pr109849.C micro-benchmark runs 9 times faster (on an AMD EPYC 75F3 machine). gcc/ChangeLog: 2024-04-18 Martin Jambor <mjambor@suse.cz> * tree-sra.cc (sra_modify_assign): Remove the original statement also when dealing with a store to a fully covered aggregate from a non-candidate. gcc/testsuite/ChangeLog: 2024-04-23 Martin Jambor <mjambor@suse.cz> * g++.dg/tree-ssa/pr109849.C: Also check that the aggregate store to cur disappears. * gcc.dg/tree-ssa/ssa-dse-26.c: Instead of relying on DSE, check that the unwanted stores were removed at early SRA time.
2024-05-09Manually update entries for the Revert Revert commits.Jakub Jelinek2-0/+23
2024-05-09contrib: Add 109f1b28fc94c93096506e3df0c25e331cef19d0 to ignored commitsJakub Jelinek1-1/+2
2024-05-09 Jakub Jelinek <jakub@redhat.com> * gcc-changelog/git_update_version.py: Replace 9dbff9c05520a74e6cd337578f27b56c941f64f3 with 39f81924d88e3cc197fc3df74204c9b5e01e12f7 and 109f1b28fc94c93096506e3df0c25e331cef19d0 in IGNORED_COMMITS.
2024-05-09Daily bump.GCC Administrator20-1/+1441
2024-05-09RISC-V: Make full-vec-move1.c test robust for optimizationPan Li1-2/+4
While investigating support for early-break autovectorization, we noticed that the test full-vec-move1.c is optimized to 'return 0;' in the body of main. Because the value of type V is a compile-time constant, the second loop is treated as assert (true). Thus, the ccp4 pass eliminates these statements and just returns 0. typedef int16_t V __attribute__((vector_size (128))); int main () { V v; for (int i = 0; i < sizeof (v) / sizeof (v[0]); i++) (v)[i] = i; V res = v; for (int i = 0; i < sizeof (v) / sizeof (v[0]); i++) assert (res[i] == i); // will be optimized to assert (true) } This patch introduces an extern function that uses res[i], which prevents the ccp4 optimization. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls-vlmax/full-vec-move1.c: Introduce extern func use to get rid of ccp4 optimization. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-05-09contrib: Add 9dbff9c05520a74e6cd337578f27b56c941f64f3 to ignored commitsJakub Jelinek1-1/+2
2024-05-09 Jakub Jelinek <jakub@redhat.com> * gcc-changelog/git_update_version.py: Add 9dbff9c05520a74e6cd337578f27b56c941f64f3 to IGNORED_COMMITS.
2024-05-09testsuite: Fix up vector-subaccess-1.C test for ia32 [PR89224]Jakub Jelinek1-0/+1
The test FAILs on i686-linux due to .../gcc/testsuite/g++.dg/torture/vector-subaccess-1.C:16:6: warning: SSE vector argument without SSE enabled changes the ABI [-Wpsabi] excess warnings. This fixes it by adding -Wno-psabi, like commonly done in other tests. 2024-05-09 Jakub Jelinek <jakub@redhat.com> PR c++/89224 * g++.dg/torture/vector-subaccess-1.C: Add -Wno-psabi as additional options.
2024-05-09MIPS: Support constraint 'w' for MSA instructionYunQiang Su2-0/+12
Support syntax like: asm volatile ("fmadd.d %w0, %w1, %w2" : "+w"(a): "w"(b), "w"(c)); gcc * config/mips/constraints.md: Add new constraint 'w'. gcc/testsuite * gcc.target/mips/msa-inline-asm.c: New test.
2024-05-09RISC-V: Add tests for cpymemsi expansionChristoph Müllner4-0/+116
cpymemsi expansion has been available for RISC-V since the initial port. However, there are no tests to detect regressions. This patch adds such tests. Three of the tests target the expansion requirements (known length and alignment). One test reuses an existing memcpy test from the by-pieces framework (gcc/testsuite/gcc.dg/torture/inline-mem-cpy-1.c). gcc/testsuite/ChangeLog: * gcc.target/riscv/cpymemsi-1.c: New test. * gcc.target/riscv/cpymemsi-2.c: New test. * gcc.target/riscv/cpymemsi-3.c: New test. * gcc.target/riscv/cpymemsi.c: New test. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
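As a rough picture of what "known length and alignment" means at the source level, the sketch below copies a fixed-size, explicitly aligned block. The `Block` type and `copy_block` function are hypothetical and are not taken from the new tests; whether a given compiler and option set actually expands this memcpy inline is an assumption that depends on the target and tuning.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical 32-byte block; alignas makes the 8-byte alignment visible
// to the compiler at every use.
struct alignas(8) Block {
    unsigned char bytes[32];
};

// Both the length (sizeof src.bytes) and the alignment of the operands are
// compile-time constants here, which is the precondition the message above
// names for the expansion.
void copy_block(Block& dst, const Block& src)
{
    std::memcpy(dst.bytes, src.bytes, sizeof src.bytes);
}
```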
2024-05-09i386: Fix some intrinsics without alignment requirements.Hu, Lin14-9/+33
gcc/ChangeLog: PR target/84508 * config/i386/emmintrin.h (_mm_load_sd): Remove alignment requirement. (_mm_store_sd): Ditto. (_mm_loadh_pd): Ditto. (_mm_loadl_pd): Ditto. (_mm_storel_pd): Add alignment requirement. * config/i386/xmmintrin.h (_mm_loadh_pi): Remove alignment requirement. (_mm_loadl_pi): Ditto. (_mm_load_ss): Ditto. (_mm_store_ss): Ditto. gcc/testsuite/ChangeLog: PR target/84508 * gcc.target/i386/pr84508-1.c: New test. * gcc.target/i386/pr84508-2.c: Ditto.