Age | Commit message | Author | Files | Lines
2023-10-17 | LibF7: Implement fma / fmal. | Georg-Johann Lay | 3 | -10/+62
libgcc/config/avr/libf7/ * libf7.h (F7_SIZEOF): New macro. * libf7-asm.sx: Use F7_SIZEOF instead of magic number "10". (F7MOD_D_fma_, __fma): New module and function. (fma) [-mdouble=64]: Define as alias for __fma. (fmal) [-mlong-double=64]: Define as alias for __fma. * libf7-common.mk (F7_ASM_PARTS): Add D_fma.
2023-10-17 | Disparage slightly for the alternative which move DFmode between SSE_REGS and GENERAL_REGS. | liuhongt | 2 | -2/+13
For testcase

    void __cond_swap(double* __x, double* __y) {
      bool __r = (*__x < *__y);
      auto __tmp = __r ? *__x : *__y;
      *__y = __r ? *__y : *__x;
      *__x = __tmp;
    }

GCC-14 with -O2 and -march=x86-64 options generates the following code:

    __cond_swap(double*, double*):
            movsd   xmm1, QWORD PTR [rdi]
            movsd   xmm0, QWORD PTR [rsi]
            comisd  xmm0, xmm1
            jbe     .L2
            movq    rax, xmm1
            movapd  xmm1, xmm0
            movq    xmm0, rax
    .L2:
            movsd   QWORD PTR [rsi], xmm1
            movsd   QWORD PTR [rdi], xmm0
            ret

rax is used to save and restore the DFmode value. In RA both GENERAL_REGS and SSE_REGS cost zero, since we didn't disparage the alternative in the movdf_internal pattern, so GENERAL_REGS is allocated according to the register allocation order. The patch adds '?' for alternatives (r,v) and (v,r), just like we did for the movsf/hf/bf_internal patterns; after that we get optimal RA.

    __cond_swap:
    .LFB0:
            .cfi_startproc
            movsd   (%rdi), %xmm1
            movsd   (%rsi), %xmm0
            comisd  %xmm1, %xmm0
            jbe     .L2
            movapd  %xmm1, %xmm2
            movapd  %xmm0, %xmm1
            movapd  %xmm2, %xmm0
    .L2:
            movsd   %xmm1, (%rsi)
            movsd   %xmm0, (%rdi)
            ret

gcc/ChangeLog:
PR target/110170
* config/i386/i386.md (movdf_internal): Disparage slightly for 2 alternatives (r,v) and (v,r) by adding constraint modifier '?'.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr110170-3.c: New test.
(cherry picked from commit 37a231cc7594d12ba0822077018aad751a6fb94e)
2023-10-17 | Daily bump. | GCC Administrator | 3 | -1/+37
2023-10-16 | Merge branch 'releases/gcc-13' into devel/omp/gcc-13 | Tobias Burnus | 21 | -127/+520
Merge up to r13-7954-gcc87aaeceea58389b681e3a6a63f95e54f2b59cd (16th Oct 2023)
2023-10-16 | Fortran's gfc_match_char: %S to match symbol with host_assoc | Tobias Burnus | 2 | -2/+12
gfc_match ("... %s ...", ...) matches a gfc_symbol but with host_assoc = 0. This commit adds '%S' as variant which matches with host_assoc = 1 gcc/fortran/ChangeLog: * match.cc (gfc_match_char): Match with '%S' a symbol with host_assoc = 1. (cherry picked from commit 0607e93490058ec31b6ab57078c54771f139b870)
2023-10-15 | rs6000: Use default target option node for callee by default [PR111380] | Kewen Lin | 3 | -35/+70
As PR111380 (and the discussion in related PRs) shows, the way rs6000_can_inline_p currently treats a callee without any target option node is wrong. It considers such a callee always safe to inline, but the callee's target flags actually come from the command-line options (target_option_default_node). The callee's flags may therefore not satisfy the inlining condition, yet the callee is still inlined, with unexpected consequences. As the associated test case pr111380-1.c shows, the caller main is attributed with power8 but the callee foo is compiled with power9 from the command line; main should not inline foo, since foo can contain something that requires power9 capability. Without this patch, LTO (with -flto) gives an error message (as it forces the callee to have a target option node), but non-LTO inlines the callee unexpectedly.

This patch makes the callee adopt target_option_default_node when it doesn't have a target option node, which avoids the wrong inlining decision and fixes the inconsistency between LTO and non-LTO. It also aligns with what the other ports do.

PR target/111380
gcc/ChangeLog:
* config/rs6000/rs6000.cc (rs6000_can_inline_p): Adopt target_option_default_node when the callee has no option attributes, also simplify the existing code accordingly.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr111380-1.c: New test.
* gcc.target/powerpc/pr111380-2.c: New test.
(cherry picked from commit 266dfed68b881702e9660889f63408054b7fa9c0)
2023-10-15 | rs6000: Skip empty inline asm in rs6000_update_ipa_fn_target_info [PR111366] | Kewen Lin | 2 | -3/+54
PR111366 shows that rs6000_update_ipa_fn_target_info can be improved by skipping an empty inline asm string, since an empty string cannot adopt any hardware feature (so far HTM). Since this rs6000_update_ipa_fn_target_info related approach exists in GCC 12 and later, the affected project highway has updated its target pragma with ",htm", see the link: https://github.com/google/highway/commit/15e63d61eb535f478bc

I'd not bother to consider an inline asm parser for now but will file a separate PR for further enhancement.

PR target/111366
gcc/ChangeLog:
* config/rs6000/rs6000.cc (rs6000_update_ipa_fn_target_info): Skip empty inline asm.
gcc/testsuite/ChangeLog:
* g++.target/powerpc/pr111366.C: New test.
(cherry picked from commit a65b38e361320e0aa45adbc969c704385ab1f45b)
2023-10-16 | Daily bump. | GCC Administrator | 1 | -1/+1
2023-10-15 | Daily bump. | GCC Administrator | 1 | -1/+1
2023-10-14 | Daily bump. | GCC Administrator | 2 | -1/+7
2023-10-13 | Do not add partial equivalences with no uses. | Andrew MacLeod | 1 | -0/+9
PR tree-optimization/111622 * value-relation.cc (equiv_oracle::add_partial_equiv): Do not register a partial equivalence if an operand has no uses.
2023-10-13 | Daily bump. | GCC Administrator | 2 | -1/+7
2023-10-12 | LibF7: Implement atan2. | Georg-Johann Lay | 3 | -3/+61
libgcc/config/avr/libf7/ * libf7.c (F7MOD_atan2_, f7_atan2): New module and function. * libf7.h: Adjust comments. * libf7-common.mk (CALL_PROLOGUES): Add atan2.
2023-10-12 | Daily bump. | GCC Administrator | 3 | -1/+40
2023-10-11 | Ensure float equivalences include + and - zero. | Andrew MacLeod | 4 | -0/+44
A floating point equivalence may not properly reflect both signs of zero, so be pessimistic and ensure both signs are included.

PR tree-optimization/111694
gcc/
* gimple-range-cache.cc (ranger_cache::fill_block_cache): Adjust equivalence range.
* value-relation.cc (adjust_equivalence_range): New.
* value-relation.h (adjust_equivalence_range): New prototype.
gcc/testsuite/
* gcc.dg/pr111694.c: New.
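Why both zeros matter can be seen at the source level. The following is a minimal sketch, not taken from the patch or from gcc.dg/pr111694.c; the helper name is hypothetical. +0.0 and -0.0 compare equal, yet they remain distinguishable, so knowing that two values are equal does not determine the sign of a zero.

```cpp
#include <cmath>

// Hypothetical helper: true when a and b compare equal but carry
// different sign bits. For ordinary doubles this only happens for
// +0.0 versus -0.0 (NaNs never compare equal, infinities of opposite
// sign are unequal).
bool equal_but_distinguishable(double a, double b)
{
    return a == b && std::signbit(a) != std::signbit(b);
}
```

Calling `equal_but_distinguishable(0.0, -0.0)` yields true, which is the case a range derived from an equivalence has to allow for.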
2023-10-11 | tree-ssa-strlen: optimization skips clobbering store [PR111519] | Jakub Jelinek | 2 | -22/+79
The following testcase is miscompiled, because count_nonzero_bytes incorrectly uses get_strinfo information on a pointer from which an earlier instruction loads SSA_NAME stored at the current instruction. get_strinfo shows a state right before the current store though, so if there are some stores in between the current store and the load, the string length information might have changed. The patch passes around gimple_vuse from the store and punts instead of using strinfo on loads from MEM_REF which have different gimple_vuse from that. 2023-10-11 Richard Biener <rguenther@suse.de> Jakub Jelinek <jakub@redhat.com> PR tree-optimization/111519 * tree-ssa-strlen.cc (strlen_pass::count_nonzero_bytes): Add vuse argument and pass it through to recursive calls and count_nonzero_bytes_addr calls. Don't shadow the stmt argument, but change stmt for gimple_assign_single_p statements for which we don't immediately punt. (strlen_pass::count_nonzero_bytes_addr): Add vuse argument and pass it through to recursive calls and count_nonzero_bytes calls. Don't use get_strinfo if gimple_vuse (stmt) is different from vuse. Don't shadow the stmt argument. * gcc.dg/torture/pr111519.c: New testcase. (cherry picked from commit e75bf1985fdc9a5d3a307882a9251d8fd6e93def)
2023-10-11 | Daily bump. | GCC Administrator | 2 | -1/+16
2023-10-10 | ada: Fix infinite loop with multiple limited with clauses | Eric Botcazou | 1 | -63/+107
This occurs when one of the types has an incomplete declaration in addition to its full declaration in its package. In this case AI05-129 says that the incomplete type is not part of the limited view of the package, i.e. only the full view is. Now, in the GNAT implementation, it's the opposite in the regular view of the package, i.e. the incomplete type is the visible one.

That's why the implementation needs to also swap the types on the visibility chain while it is swapping the views when the clauses are either installed or removed. This works correctly for the installation, but does not for the removal, so this change rewrites the code doing the latter.

gcc/ada/
PR ada/111434
* sem_ch10.adb (Replace): New procedure to replace an entity with another on the homonym chain.
(Install_Limited_With_Clause): Rename Non_Lim_View to Typ for the sake of consistency. Call Replace to do the replacements and split the code into the regular and the special cases. Add debugging output controlled by -gnatdi.
(Install_With_Clause): Print the Parent_With and Implicit_With flags in the debugging output controlled by -gnatdi.
(Remove_Limited_With_Unit.Restore_Chain_For_Shadow (Shadow)): Rewrite using a direct replacement of E4 by E2. Call Replace to do the replacements. Add debugging output controlled by -gnatdi.
2023-10-10 | Daily bump. | GCC Administrator | 1 | -1/+1
2023-10-09 | Fortran/OpenMP: Fix handling of strictly structured blocks | Tobias Burnus | 5 | -6/+126
For strictly structured blocks, a BLOCK was created but the code was placed after the BLOCK, in the outer structured block. Additionally, labelled blocks were mishandled. As the code is now properly in a BLOCK, it solves additional issues.

gcc/fortran/ChangeLog:
* parse.cc (parse_omp_structured_block): Make the user code end up inside of BLOCK construct for strictly structured blocks; fix fallout for 'section' and 'teams'.
* openmp.cc (resolve_omp_target): Fix changed BLOCK handling for teams in target checking.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/strictly-structured-block-1.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/block_17.f90: New test.
* gfortran.dg/gomp/strictly-structured-block-5.f90: New test.
(cherry picked from commit 6a8edd50a149f10621b59798c887c24c81c8b9ea)
2023-10-09 | Daily bump. | GCC Administrator | 1 | -1/+1
2023-10-08 | Daily bump. | GCC Administrator | 3 | -1/+18
2023-10-07 | MATCH: Fix infinite loop between `vec_cond(vec_cond(a,b,0), c, d)` and `a & b` | Andrew Pinski | 2 | -0/+12
Match has a pattern which converts `vec_cond(vec_cond(a,b,0), c, d)` into `vec_cond(a & b, c, d)`, but since in this case a is a comparison, fold will change `a & b` back into `vec_cond(a,b,0)`, which causes an infinite loop. The best way to fix this is to enable the patterns for vec_cond(*,vec_cond,*) only for GIMPLE so we don't get an infinite loop for fold any more.

Note this is a latent bug since these patterns were added in r11-2577-g229752afe3156a and was exposed by r14-3350-g47b833a9abe1, where we are now able to remove a VIEW_CONVERT_EXPR.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR middle-end/111699
gcc/ChangeLog:
* match.pd ((c ? a : b) op d, (c ? a : b) op (c ? d : e), (v ? w : 0) ? a : b, c1 ? c2 ? a : b : b): Enable only for GIMPLE.
gcc/testsuite/ChangeLog:
* gcc.c-torture/compile/pr111699-1.c: New test.
(cherry picked from commit e77428a9a336f57e3efe3eff95f2b491d7e9be14)
2023-10-07 | Daily bump. | GCC Administrator | 1 | -1/+1
2023-10-06 | Merge branch 'releases/gcc-13' into devel/omp/gcc-13 | Tobias Burnus | 27 | -39/+410
Merge up to r13-7936-g7c47e03df2a77f2e25e23887734a5e818aeca3f5 (6th Oct 2023)
2023-10-06 | Daily bump. | GCC Administrator | 1 | -1/+1
2023-10-05 | Daily bump. | GCC Administrator | 3 | -1/+22
2023-10-04 | Fortran: Alloc comp of non-finalizable type not finalized [PR111674] | Paul Thomas | 3 | -2/+18
2023-10-04 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/37336 PR fortran/111674 * trans-expr.cc (gfc_trans_scalar_assign): Finalize components on deallocation if derived type is not finalizable. gcc/testsuite/ PR fortran/37336 PR fortran/111674 * gfortran.dg/allocate_with_source_25.f90: Final count in tree dump reverts from 4 to original 6. * gfortran.dg/finalize_38.f90: Add test for fix of PR111674. (cherry picked from commit 84284e1c490e9235fca5cb85269ecfcb87eef4f1)
2023-10-04 | Daily bump. | GCC Administrator | 1 | -1/+1
2023-10-03 | Daily bump. | GCC Administrator | 4 | -1/+36
2023-10-02 | libstdc++: Force _Hash_node_value_base methods inline to fix abi (PR111050) | Tim Song | 1 | -0/+4
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=1b6f0476837205932613ddb2b3429a55c26c409d changed _Hash_node_value_base to no longer derive from _Hash_node_base, which means that its member functions expect _M_storage to be at a different offset. So explosions result if an out-of-line definition is emitted for any of the member functions (say, in a non-optimized build) and the resulting object file is then linked with code built using an older version of GCC/libstdc++.

libstdc++-v3/ChangeLog:
PR libstdc++/111050
* include/bits/hashtable_policy.h (_Hash_node_value_base<>::_M_valptr(), _Hash_node_value_base<>::_M_v()): Add [[__gnu__::__always_inline__]].
(cherry picked from commit 2c1e3544a94c5d7354fad031e1f9731c3ce3af25)
2023-10-02 | Disable generation of scalar modulo instructions. | Pat Haugen | 8 | -29/+72
It was recently discovered that the scalar modulo instructions can suffer noticeable performance issues for certain input values. This patch disables their generation since the equivalent div/mul/sub sequence does not suffer the same problem. gcc/ * config/rs6000/rs6000.cc (rs6000_rtx_costs): Check whether the modulo instruction is disabled. * config/rs6000/rs6000.h (RS6000_DISABLE_SCALAR_MODULO): New. * config/rs6000/rs6000.md (mod<mode>3, *mod<mode>3): Check it. (define_expand umod<mode>3): New. (define_insn umod<mode>3): Rename to *umod<mode>3 and check if the modulo instruction is disabled. (umodti3, modti3): Check if the modulo instruction is disabled. gcc/testsuite/ * gcc.target/powerpc/clone1.c: Add xfails. * gcc.target/powerpc/clone3.c: Likewise. * gcc.target/powerpc/mod-1.c: Update scan strings and add xfails. * gcc.target/powerpc/mod-2.c: Likewise. * gcc.target/powerpc/p10-vdivq-vmodq.c: Add xfails. (cherry picked from commit 58ab38213b979811d314f68e3f455c28a1d44140)
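The div/mul/sub sequence mentioned above corresponds to the usual identity for truncating division. The following is a source-level sketch only, with a hypothetical function name; it does not show the instructions GCC emits for rs6000.

```cpp
#include <cassert>

// Remainder computed with a divide, a multiply and a subtract instead
// of a modulo operation. Assumes b != 0 and that a / b does not
// overflow (for example, a is not LONG_MIN when b is -1).
long mod_via_div(long a, long b)
{
    assert(b != 0);
    long q = a / b;    // truncating division
    return a - q * b;  // same value as a % b, with the sign of a
}
```

For example, `mod_via_div(-17, 5)` gives -2, the same as `-17 % 5`.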
2023-10-02 | Daily bump. | GCC Administrator | 3 | -1/+38
2023-10-01 | Fix PR 111331: wrong code for `a > 28 ? MIN<a, 28> : 29` | Andrew Pinski | 4 | -4/+55
The problem here is that after r6-7425-ga9fee7cdc3c62d0e51730, the comparison that decides whether the transformation can be done used the wrong value. Instead of checking whether the inner value was LE (for MIN; GE for MAX) the outer value, it compared the inner value to the value used in the comparison, which was wrong.

Committed to GCC 13 branch after bootstrapping and testing on x86_64-linux-gnu.

gcc/ChangeLog:
PR tree-optimization/111331
* tree-ssa-phiopt.cc (minmax_replacement): Fix the LE/GE comparison for the `(a CMP CST1) ? max<a,CST2> : a` optimization.
gcc/testsuite/ChangeLog:
PR tree-optimization/111331
* gcc.c-torture/execute/pr111331-1.c: New test.
* gcc.c-torture/execute/pr111331-2.c: New test.
* gcc.c-torture/execute/pr111331-3.c: New test.
(cherry picked from commit 30e6ee074588bacefd2dfe745b188bb20c81fe5e)
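The expression from the subject line can be written out at the source level as follows. This is a sketch with a hypothetical function name, not one of the pr111331 test cases; it only spells out what a correct compilation has to return.

```cpp
#include <algorithm>

// Source-level form of `a > 28 ? MIN<a, 28> : 29`.
// When a > 28 the minimum of a and 28 is always 28,
// otherwise the result is the constant 29.
int select_min_or_const(int a)
{
    return a > 28 ? std::min(a, 28) : 29;
}
```

So the function returns 28 for every a above 28 and 29 for everything else; any other result would indicate a miscompilation of this expression.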
2023-10-01 | Fix PR 110386: backprop vs ABSU_EXPR | Andrew Pinski | 3 | -1/+20
The issue here is that when backprop tries to go and strip sign ops, it skips over ABSU_EXPR, but ABSU_EXPR not only does an ABS, it also changes the type to unsigned. Since strip_sign_op_1 is only supposed to strip off sign changing operands and not ones that change types, removing ABSU_EXPR here is correct. We don't handle nop conversions, so this does not cause any missed optimizations either.

Committed to the GCC 13 branch after bootstrapping and testing on x86_64-linux-gnu with no regressions.

PR tree-optimization/110386
gcc/ChangeLog:
* gimple-ssa-backprop.cc (strip_sign_op_1): Remove ABSU_EXPR.
gcc/testsuite/ChangeLog:
* gcc.c-torture/compile/pr110386-1.c: New test.
* gcc.c-torture/compile/pr110386-2.c: New test.
(cherry picked from commit 2bbac12ea7bd8a3eef5382e1b13f6019df4ec03f)
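ABSU_EXPR is an internal GCC tree code, so it cannot be written directly in source. A rough source-level analogue, under the assumption of a two's-complement int and with hypothetical names, shows why it is more than a sign operation: the result has a different (unsigned) type from the operand.

```cpp
#include <climits>

// Rough analogue of an absolute value with an unsigned result.
// The negation is done in the unsigned type, so INT_MIN is handled
// without signed overflow.
unsigned absu(int x)
{
    return x < 0 ? 0u - static_cast<unsigned>(x)
                 : static_cast<unsigned>(x);
}

// Magnitude of INT_MIN, which is not representable as an int.
constexpr unsigned kAbsOfIntMin = static_cast<unsigned>(INT_MAX) + 1u;
```

Here `absu(INT_MIN)` equals `kAbsOfIntMin`, a value that a plain int-typed absolute value could not return.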
2023-10-01 | Daily bump. | GCC Administrator | 2 | -1/+6
2023-09-30 | Fixup d9b3269bdc. | Andre Vehreschild | 1 | -1/+1
Adapt to different parameter count in comparison to gcc-14. gcc/fortran/ChangeLog: * trans-array.cc (gfc_trans_deferred_array): Use correct position for statements to add to guarded block.
2023-09-30 | Daily bump. | GCC Administrator | 4 | -1/+51
2023-09-29 | Fortran: Free alloc. comp. in allocated coarrays only. | Andre Vehreschild | 3 | -2/+92
When freeing allocatable components of an allocatable coarray, add a check that the coarray is still allocated, before accessing the components. This patch adds to PR fortran/37336, but does not fix it completely. gcc/fortran/ChangeLog: PR fortran/37336 * trans-array.cc (structure_alloc_comps): Deref coarray. (gfc_trans_deferred_array): Add freeing of components after check for allocated coarray. gcc/testsuite/ChangeLog: PR fortran/37336 * gfortran.dg/coarray/alloc_comp_6.f90: New test. * gfortran.dg/coarray/alloc_comp_7.f90: New test. (cherry picked from commit 9a63a62dfd73e159f1956e9b04b555c445de4e78)
2023-09-29 | Merge branch 'releases/gcc-13' into devel/omp/gcc-13 | Tobias Burnus | 85 | -892/+3531
Merge up to r13-7922-gb5b98a2d055d967d1fc92859827839c83c9368d7 (29th Sep 2023)
2023-09-29 | AArch64: List official cores before codenames | Wilco Dijkstra | 2 | -4/+4
List official cores first so that -mcpu=native does not show a codename with -v or in errors/warnings. gcc/ChangeLog: * config/aarch64/aarch64-cores.def (neoverse-n1): Place before ares. (neoverse-v1): Place before zeus. (neoverse-v2): Place before demeter. * config/aarch64/aarch64-tune.md: Regenerate. (cherry picked from commit 64d5bc35c8c2a66ac133a3e6ace820b0ad8a63fb)
2023-09-29 | AArch64: Fix memmove operand corruption [PR111121] | Wilco Dijkstra | 4 | -18/+77
A MOPS memmove may corrupt registers since there is no copy of the input operands to temporary registers. Fix this by calling aarch64_expand_cpymem_mops. Reviewed-by: Richard Sandiford <richard.sandiford@arm.com> gcc/ChangeLog/ PR target/111121 * config/aarch64/aarch64.md (aarch64_movmemdi): Add new expander. (movmemdi): Call aarch64_expand_cpymem_mops for correct expansion. * config/aarch64/aarch64.cc (aarch64_expand_cpymem_mops): Add support for memmove. * config/aarch64/aarch64-protos.h (aarch64_expand_cpymem_mops): Add new function. gcc/testsuite/ChangeLog/ PR target/111121 * gcc.target/aarch64/mops_4.c: Add memmove testcases. (cherry picked from commit d8b56c95782aeeee79ec40932ca88d00fd9f2ee2)
2023-09-29 | Daily bump. | GCC Administrator | 1 | -1/+1
2023-09-28 | Daily bump. | GCC Administrator | 6 | -1/+559
2023-09-27 | libstdc++: Add test for illegal pointer arithmetic in format [PR111102] | Paul Dreik | 1 | -0/+15
libstdc++-v3/ChangeLog: PR libstdc++/111102 * testsuite/std/format/string.cc: Check wide character format strings with out-of-range widths. (cherry picked from commit 7564fe98657ad5ede34bd08f5279778fa8698865)
2023-09-27 | libstdc++: [_GLIBCXX_INLINE_VERSION] Fix <format> friend declaration | François Dumont | 1 | -1/+7
GCC does not consider the inline namespace in friend function declarations. This is PR c++/59526, so we need to spell out this namespace explicitly.

libstdc++-v3/ChangeLog:
* include/std/format (std::__format::_Arg_store): Explicit version namespace on make_format_args friend declaration.
(cherry picked from commit 92456291849fe88303bbcab366f41dcd4a885ad5)
2023-09-27 | libstdc++: fix illegal pointer arithmetic in format [PR111102] | Paul Dreik | 1 | -1/+2
When parsing a format string, the width is parsed into an unsigned short but the result is not checked in the case the format string is not a char string (such as a wide string). In case the parse fails, a null pointer is returned which is used for pointer arithmetic which is undefined behaviour. Signed-off-by: Paul Dreik <gccpatches@pauldreik.se> libstdc++-v3/ChangeLog: PR libstdc++/111102 * include/std/format (__format::__parse_integer): Check for non-null pointer. (cherry picked from commit dd4bdb9eea436bf06f175d8dbfc2190377455be4)
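The pattern behind this fix, checking a possibly-null result before doing pointer arithmetic on it, can be sketched as follows. All names here are hypothetical and the code is not libstdc++'s `__parse_integer`; it only assumes a parser that signals failure with a null pointer.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical parser: reads decimal digits from [first, last) into out.
// Returns the position after the last digit, or nullptr if there are no
// digits or the value does not fit in unsigned short.
const char *parse_width(const char *first, const char *last,
                        unsigned short &out)
{
    unsigned long value = 0;
    const char *p = first;
    while (p != last && *p >= '0' && *p <= '9') {
        value = value * 10 + static_cast<unsigned long>(*p - '0');
        if (value > USHRT_MAX)
            return nullptr;  // out of range: report failure
        ++p;
    }
    if (p == first)
        return nullptr;      // no digits at all
    out = static_cast<unsigned short>(value);
    return p;
}

// Number of characters consumed, or -1 on failure. The null check comes
// before `end - first`; subtracting from a null pointer would be
// undefined behaviour.
std::ptrdiff_t consumed_chars(const char *first, const char *last,
                              unsigned short &out)
{
    const char *end = parse_width(first, last, out);
    if (end == nullptr)
        return -1;
    return end - first;
}
```

With the input "123}" the sketch consumes three characters and stores 123; with "99999999" it reports failure instead of computing an offset from a null pointer.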
2023-09-27 | libstdc++: Minor fixes for some warnings in <format> | Jonathan Wakely | 1 | -15/+13
libstdc++-v3/ChangeLog: * include/std/format: Fix some warnings. (__format::__write(Ctx&, basic_string_view<CharT>)): Remove unused function template. (cherry picked from commit b9e5a4b4f035ba85b1a4065b751c2d583206b4e3)
2023-09-27 | libstdc++: Fix std::format alternate form for floating-point [PR108046] | Jonathan Wakely | 2 | -8/+13
A decimal point was being added to the end of the string for {:#.0} because the __expc character was not being set, for the _Pres_none presentation type, so __s.find(__expc) didn't find the 'e' in "1e+01" and so we created "1e+01." by appending the radix char to the end. This can be fixed by ensuring that __expc='e' is set for the _Pres_none case. I realized we can also set __expc='P' and __expc='E' when needed, to save a call to std::toupper later.

For the {:#.0g} format, __expc='e' was being set and so the 'e' was found in "1e+10" but then __z = __prec - __sigfigs would wrap around to SIZE_MAX. That meant we would decide not to add a radix char because the number of extra characters to insert would be 1+SIZE_MAX i.e. zero. This can be fixed by using __z == 0 when __prec == 0.

libstdc++-v3/ChangeLog:
PR libstdc++/108046
* include/std/format (__formatter_fp::format): Ensure __expc is always set for all presentation types. Set __z correctly for zero precision.
* testsuite/std/format/functions/format.cc: Check problem cases.
(cherry picked from commit 50bc490c090cc95175e6068ed7438788d7fd7040)
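For comparison, the C printf family has a '#' flag that plays a comparable role for floating-point conversions. The sketch below uses std::snprintf rather than std::format, so it illustrates the analogous C behaviour only and is not taken from the libstdc++ test suite; the helper name is hypothetical.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats v with a printf-style format string.
// The caller is responsible for passing a format that consumes
// exactly one double.
std::string format_with(const char *fmt, double v)
{
    char buf[64];
    std::snprintf(buf, sizeof buf, fmt, v);
    return std::string(buf);
}
```

In the default "C" locale, `format_with("%.0g", 10.0)` gives "1e+01" and `format_with("%#.0g", 10.0)` gives "1.e+01": the radix character goes directly after the leading digit, before the exponent, rather than at the end of the string.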
2023-09-27 | libstdc++: Fix constexpr functions to conform to older standards | Jonathan Wakely | 4 | -13/+12
Some constexpr functions were inadvertently relying on relaxed constexpr rules from later standards. libstdc++-v3/ChangeLog: * include/bits/chrono.h (duration_cast): Do not use braces around statements for C++11 constexpr rules. * include/bits/stl_algobase.h (__lg): Rewrite as a single statement for C++11 constexpr rules. * include/experimental/bits/fs_path.h (path::string): Use _GLIBCXX17_CONSTEXPR not _GLIBCXX_CONSTEXPR for 'if constexpr'. * include/std/charconv (__to_chars_8): Initialize variable for C++17 constexpr rules. (cherry picked from commit b3a2b307b9deea719fb725a86df43b82176fe459)
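The C++11 restriction referred to above is that a constexpr function body may contain essentially a single return statement, with no local variables or loops. A minimal sketch of that style follows; the function is hypothetical and is not the libstdc++ implementation of `__lg`.

```cpp
// Floor of log2(n) written as one return statement, which is valid
// under C++11 constexpr rules. Recursion replaces the loop that later
// standards would allow. By convention here, n == 0 also yields 0.
constexpr int floor_log2(unsigned n)
{
    return n > 1 ? 1 + floor_log2(n >> 1) : 0;
}

// Evaluated at compile time.
static_assert(floor_log2(1) == 0, "log2(1) is 0");
static_assert(floor_log2(8) == 3, "log2(8) is 3");
```

Writing the body with a `while` loop and a local counter would be accepted from C++14 onwards but rejected in C++11 mode, which is the kind of inadvertent reliance the commit removes.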