2023-11-09[PATCH v3] libiberty: Use posix_spawn in pex-unix when available.Brendan Shanks5-8/+189
Hi, This patch implements pex_unix_exec_child using posix_spawn when available. This should especially benefit recent macOS (where vfork just calls fork), but should have equivalent or faster performance on all platforms. In addition, the implementation is substantially simpler than the vfork+exec code path. Tested on x86_64-linux. v2: Fix error handling (previously the function would be run twice in case of error), and don't use a macro that changes control flow. v3: Match file style for error-handling blocks, don't close in/out/errdes on error, and check close() for errors. libiberty/ * configure.ac (AC_CHECK_HEADERS): Add spawn.h. (checkfuncs): Add posix_spawn, posix_spawnp. (AC_CHECK_FUNCS): Add posix_spawn, posix_spawnp. * aclocal.m4, configure, config.in: Rebuild. * pex-unix.c [HAVE_POSIX_SPAWN] (pex_unix_exec_child): New function.
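A minimal sketch of the spawn-and-wait shape this patch adopts (illustrative only, not libiberty's actual code — the real implementation also sets up file actions to redirect in/out/err; `run` is a hypothetical helper):

```cpp
#include <spawn.h>
#include <sys/wait.h>
#include <cstdio>
#include <cstring>

extern char **environ;

// Spawn a child via posix_spawnp and wait for it.  Unlike vfork+exec,
// errors are reported through the return value, so no tricks with
// shared stacks or pipes are needed to get them back to the parent.
static int run (const char *prog, char *const argv[])
{
  pid_t pid;
  int err = posix_spawnp (&pid, prog, /*file_actions=*/nullptr,
                          /*attrp=*/nullptr, argv, environ);
  if (err != 0)
    {
      std::fprintf (stderr, "posix_spawnp: %s\n", std::strerror (err));
      return -1;
    }
  int status;
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

Note how the `posix_spawn` error comes back as a plain `errno`-style return value rather than via `errno` after a failed `exec` in the child.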
2023-11-10test: Fix FAIL of pr97428.c for RVVJuzhe-Zhong1-0/+1
gcc/testsuite/ChangeLog: * gcc.dg/vect/pr97428.c: Add additional compile option for riscv.
2023-11-10RISC-V: Move cond_copysign from combine pattern to autovec patternJuzhe-Zhong2-22/+22
Since cond_copysign is now supported in match.pd (middle-end), we don't need to support conditional copysign via the RTL combine pass; instead, we can support it directly via the explicit cond_copysign optab. Conditional copysign tests are already available in the testsuite, so there is no need to add new tests. gcc/ChangeLog: * config/riscv/autovec-opt.md (*cond_copysign<mode>): Remove. * config/riscv/autovec.md (cond_copysign<mode>): New pattern.
2023-11-10Internal-fn: Add FLOATN support for l/ll round and rint [PR/112432]Pan Li1-4/+4
The defined DEF_EXT_LIB_FLOATN_NX_BUILTINS functions should also use DEF_INTERNAL_FLT_FLOATN_FN instead of DEF_INTERNAL_FLT_FN for FLOATN support. According to the glibc API and the gcc builtins, the table below shows whether FLOATN is supported:

+---------+-------+-------------------------------------+
|         | glibc | gcc: DEF_EXT_LIB_FLOATN_NX_BUILTINS |
+---------+-------+-------------------------------------+
| iceil   | N     | N                                   |
| ifloor  | N     | N                                   |
| irint   | N     | N                                   |
| iround  | N     | N                                   |
| lceil   | N     | N                                   |
| lfloor  | N     | N                                   |
| lrint   | Y     | Y                                   |
| lround  | Y     | Y                                   |
| llceil  | N     | N                                   |
| llfloor | N     | N                                   |
| llrint  | Y     | Y                                   |
| llround | Y     | Y                                   |
+---------+-------+-------------------------------------+

This patch adds FLOATN support for: 1. lrint 2. lround 3. llrint 4. llround. The following tests pass with this patch: 1. x86 bootstrap and regression test. 2. aarch64 regression test. 3. riscv regression tests. PR target/112432 gcc/ChangeLog: * internal-fn.def (LRINT): Add FLOATN support. (LROUND): Ditto. (LLRINT): Ditto. (LLROUND): Ditto. Signed-off-by: Pan Li <pan2.li@intel.com>
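For reference, the double-precision forms of these functions behave as follows (plain C library calls, shown for orientation — not the _FloatN variants the patch enables). lround rounds halfway cases away from zero regardless of the rounding mode, while lrint honors the current mode (default: round to nearest, ties to even):

```cpp
#include <cmath>

// Thin wrappers so the difference between the *round and *rint
// families is visible at a glance.
long demo_lround (double x)      { return std::lround (x); }
long demo_lrint (double x)       { return std::lrint (x); }
long long demo_llround (double x){ return std::llround (x); }
```

Under the default rounding mode, `lround (2.5)` gives 3 while `lrint (2.5)` gives 2.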
2023-11-09[committed] Improve single bit zero extraction on H8.Jeff Law1-2/+68
When zero extracting a single-bit field from bits 16..31 on the H8 we currently generate some pretty bad code. The fundamental issue is we can't shift efficiently and there's no trivial way to extract a single bit out of the high half word of an SImode value. What usually happens is we use a synthesized right shift to get the single bit into the desired position, then a bit-and to mask off everything we don't care about. The shifts are expensive, even using tricks like half and quarter word moves to implement shift-by-16 and shift-by-8. Additionally, a logical right shift must clear out the upper bits, which is redundant since we're going to mask things with &1 later. This patch provides a consistently better sequence for such extractions. The general form moves the high half into the low half, extracts a bit into C, clears the destination, then moves C into the destination, with a few special cases. This also avoids all the shenanigans for H8/SX, which has a much more capable shifter. It's not single cycle, but it is reasonably efficient. This has been regression tested on the H8 without issues. Pushing to the trunk momentarily. jeff ps. Yes, supporting zero extraction of multi-bit fields might be improvable as well. But I've already spent more time on this than I can reasonably justify. gcc/ * config/h8300/combiner.md (single bit sign_extract): Avoid recently added patterns for H8/SX. (single bit zero_extract): New patterns.
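In C terms, the operation being improved is just a shift-and-mask; `extract_bit` below is an illustrative sketch of the source-level construct, not GCC code:

```cpp
#include <cstdint>

// Extract a single bit from a 32-bit value.  The generic expansion is
// a (possibly long) right shift followed by a mask; the H8 patch
// instead moves the high half down and pulls one bit through the
// carry flag, avoiding the expensive multi-position shift.
static inline uint32_t extract_bit (uint32_t x, unsigned n)
{
  return (x >> n) & 1u;
}
```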
2023-11-10Fix wrong code due to vec_merge + pcmp to blendvb splitter.liuhongt2-2/+110
gcc/ChangeLog: PR target/112443 * config/i386/sse.md (*avx2_pcmp<mode>3_4): Fix swap condition from LT to GT since there is no LT in the pattern. (*avx2_pcmp<mode>3_5): Ditto. gcc/testsuite/ChangeLog: * g++.target/i386/pr112443.C: New test.
2023-11-10bpf: fix pseudo-c asm emitted for *mulsidi3_zeroextendJose E. Marchesi3-8/+26
This patch fixes the pseudo-c BPF assembly syntax used for *mulsidi3_zeroextend, which was being emitted as: rN *= wM instead of the proper way to denote a mul32 in pseudo-C syntax: wN *= wM Includes test. Tested in bpf-unknown-none-gcc target in x86_64-linux-gnu host. gcc/ChangeLog: * config/bpf/bpf.cc (bpf_print_register): Accept modifier code 'W' to force emitting register names using the wN form. * config/bpf/bpf.md (*mulsidi3_zeroextend): Force operands to always use wN written form in pseudo-C assembly syntax. gcc/testsuite/ChangeLog: * gcc.target/bpf/mulsidi3-zeroextend-pseudoc.c: New test.
2023-11-10bpf: testsuite: fix expected regexp in gcc.target/bpf/ldxdw.cJose E. Marchesi1-1/+1
gcc/testsuite/ChangeLog: * gcc.target/bpf/ldxdw.c: Fix regexp with expected result.
2023-11-10libstdc++: mark 20_util/scoped_allocator/noexcept.cc R-E-T hostedArsen Arsenović1-0/+1
libstdc++-v3/ChangeLog: * testsuite/20_util/scoped_allocator/noexcept.cc: Mark as requiring hosted.
2023-11-10libstdc++: declare std::allocator in !HOSTED as an extensionArsen Arsenović1-2/+1
This allows us to add features to freestanding which allow specifying non-default allocators (generators, collections, ...) without having to modify them. libstdc++-v3/ChangeLog: * include/bits/memoryfwd.h: Remove HOSTED check around allocator and its specializations.
2023-11-09diagnostics: cleanups to diagnostic-show-locus.ccDavid Malcolm8-87/+129
Reduce implicit usage of line_table global, and move source printing to within diagnostic_context. gcc/ChangeLog: * diagnostic-show-locus.cc (layout::m_line_table): New field. (compatible_locations_p): Convert to... (layout::compatible_locations_p): ...this, replacing uses of line_table global with m_line_table. (layout::layout): Convert "richloc" param from a pointer to a const reference. Initialize m_line_table member. (layout::maybe_add_location_range): Replace uses of line_table global with m_line_table. Pass the latter to linemap_client_expand_location_to_spelling_point. (layout::print_leading_fixits): Pass m_line_table to affects_line_p. (layout::print_trailing_fixits): Likewise. (gcc_rich_location::add_location_if_nearby): Update for change to layout ctor params. (diagnostic_show_locus): Convert to... (diagnostic_context::maybe_show_locus): ...this, converting richloc param from a pointer to a const reference. Make "loc" const. Split out printing part of function to... (diagnostic_context::show_locus): ...this. (selftest::test_offset_impl): Update for change to layout ctor params. (selftest::test_layout_x_offset_display_utf8): Likewise. (selftest::test_layout_x_offset_display_tab): Likewise. (selftest::test_tab_expansion): Likewise. * diagnostic.h (diagnostic_context::maybe_show_locus): New decl. (diagnostic_context::show_locus): New decl. (diagnostic_show_locus): Convert from a decl to an inline function. * gdbinit.in (break-on-diagnostic): Update from a breakpoint on diagnostic_show_locus to one on diagnostic_context::maybe_show_locus. * genmatch.cc (linemap_client_expand_location_to_spelling_point): Add "set" param and use it in place of line_table global. * input.cc (expand_location_1): Likewise. (expand_location): Update for new param of expand_location_1. (expand_location_to_spelling_point): Likewise. (linemap_client_expand_location_to_spelling_point): Add "set" param and use it in place of line_table global. 
* tree-diagnostic-path.cc (event_range::print): Pass line_table for new param of linemap_client_expand_location_to_spelling_point. libcpp/ChangeLog: * include/line-map.h (rich_location::get_expanded_location): Make const. (rich_location::get_line_table): New accessor. (rich_location::m_line_table): Make the pointer be const. (rich_location::m_have_expanded_location): Make mutable. (rich_location::m_expanded_location): Likewise. (fixit_hint::affects_line_p): Add const line_maps * param. (linemap_client_expand_location_to_spelling_point): Likewise. * line-map.cc (rich_location::get_expanded_location): Make const. Pass m_line_table to linemap_client_expand_location_to_spelling_point. (rich_location::maybe_add_fixit): Likewise. (fixit_hint::affects_line_p): Add set param and pass to linemap_client_expand_location_to_spelling_point. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-11-09Add missing declaration of get_restrict in C++ interfaceGuillaume Gomez1-0/+1
gcc/jit/ChangeLog: * libgccjit++.h: Declare get_restrict.
2023-11-10MAINTAINERS: Add myself to write after approvalJivan Hakobyan1-0/+1
Signed-off-by: Jeff Law <jeffreyalaw@gmail.com> ChangeLog: * MAINTAINERS: Add myself.
2023-11-09libstdc++: Fix forwarding in __take/drop_of_repeat_view [PR112453]Patrick Palka2-5/+28
We need to respect the value category of the repeat_view passed to these two functions when accessing the view's _M_value member. This revealed that the space-efficient partial specialization of __box lacks && overloads of operator* to match those of the primary template (inherited from std::optional). PR libstdc++/112453 libstdc++-v3/ChangeLog: * include/std/ranges (__detail::__box<_Tp>::operator*): Define && overloads as well. (__detail::__take_of_repeat_view): Forward __r when accessing its _M_value member. (__detail::__drop_of_repeat_view): Likewise. * testsuite/std/ranges/repeat/1.cc (test07): New test. Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
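The value-category issue can be sketched with a minimal stand-in for `__box` (`Box` here is a hypothetical illustration, not the libstdc++ type): without the `&&` overloads, dereferencing an rvalue box would silently copy instead of move.

```cpp
#include <string>
#include <utility>

// A value-category-aware wrapper: each overload of operator* forwards
// the stored value with the same category as the wrapper itself.
template <typename T>
struct Box
{
  T value;
  T &operator* () & { return value; }
  const T &operator* () const & { return value; }
  T &&operator* () && { return std::move (value); }
  const T &&operator* () const && { return std::move (value); }
};
```

With the `&&` overloads in place, `*std::move (b)` yields an rvalue reference and the contained string is moved out rather than copied.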
2023-11-09RISC-V/testsuite: Fix several zvfh tests.Robin Dapp34-108/+310
This fixes some zvfh test oversights as well as adds zfh to the target requirements. It's not strictly necessary to have zfh but it greatly simplifies test handling when we can just calculate the reference value instead of working around it. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/fmax_zvfh-1.c: Adjust. * gcc.target/riscv/rvv/autovec/binop/fmax_zvfh_run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/fmin_zvfh-1.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/fmin_zvfh_run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-1.h: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-2.h: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv32-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv32-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv64-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv64-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int_run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int_run-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float_run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float_run-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh_run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh_run-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh_run-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh_run-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh_run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh_run-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh_run-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh_run-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-zvfh-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-zvfh-2.c: Ditto. * gcc.target/riscv/rvv/autovec/reduc/reduc_zvfh-10.c: Ditto. 
* gcc.target/riscv/rvv/autovec/reduc/reduc_zvfh_run-10.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int_zvfh-1.h: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int_zvfh-2.h: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int_zvfh-rv32-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int_zvfh-rv32-2.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int_zvfh-rv64-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int_zvfh-rv64-2.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int_zvfh_run-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int_zvfh_run-2.c: New test.
2023-11-09i386: Improve stack protector patterns and peephole2s even moreUros Bizjak1-9/+66
Improve stack protector patterns and peephole2s even more: a. Use unrelated register clears with integer mode size <= word mode size to clear stack protector scratch register. b. Use unrelated register initializations in front of stack protector sequence to clear stack protector scratch register. c. Use unrelated register initializations using LEA instructions to clear stack protector scratch register. These stack protector improvements reuse 6914 unrelated register initializations to substitute the clear of stack protector scratch register in 12034 instances of stack protector sequence in recent linux defconfig build. gcc/ChangeLog: * config/i386/i386.md (@stack_protect_set_1_<PTR:mode>_<W:mode>): Use W mode iterator instead of SWI48. Output MOV instead of XOR for TARGET_USE_MOV0. (stack_protect_set_1 peephole2): Use integer modes with mode size <= word mode size for operand 3. (stack_protect_set_1 peephole2 #2): New peephole2 pattern to substitute stack protector scratch register clear with unrelated register initialization, originally in front of stack protector sequence. (*stack_protect_set_3_<PTR:mode>_<SWI48:mode>): New insn pattern. (stack_protect_set_1 peephole2): New peephole2 pattern to substitute stack protector scratch register clear with unrelated register initialization involving LEA instruction.
2023-11-09[IRA]: Fixing conflict calculation from region landing pads.Vladimir N. Makarov1-16/+28
The following patch fixes conflict calculation from exception landing pads. The previous patch processed only one newly created landing pad. Besides being wrong, it also resulted in large memory consumption by IRA. gcc/ChangeLog: PR rtl-optimization/110215 * ira-lives.cc (add_conflict_from_region_landing_pads): New function. (process_bb_node_lives): Use it.
2023-11-09libstdc++: [_Hashtable] Use RAII type to manage rehash functor stateFrançois Dumont2-50/+45
Replace usage of __try/__catch with a RAII type to restore rehash functor state when needed. libstdc++-v3/ChangeLog: * include/bits/hashtable_policy.h (_RehashStateGuard): New. (_Insert_base<>::_M_insert_range(_IIt, _IIt, const _NodeGet&, false_type)): Adapt. * include/bits/hashtable.h (__rehash_guard_t): New. (__rehash_state): Remove. (_M_rehash): Remove. (_M_rehash_aux): Rename into _M_rehash. (_M_assign_elements, _M_insert_unique_node, _M_insert_multi_node): Adapt. (rehash): Adapt.
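The RAII pattern can be sketched generically (`StateGuard` is an illustrative stand-in, not the actual `_RehashStateGuard`): save the state on entry, restore it in the destructor unless the operation committed, so stack unwinding replaces an explicit `__try`/`__catch` block.

```cpp
// Restore a value on scope exit unless the operation committed.
template <typename T>
class StateGuard
{
  T &slot_;
  T saved_;
  bool committed_ = false;

public:
  explicit StateGuard (T &slot) : slot_ (slot), saved_ (slot) {}
  StateGuard (const StateGuard &) = delete;
  void commit () { committed_ = true; }
  ~StateGuard () { if (!committed_) slot_ = saved_; }
};
```

On the success path the caller invokes `commit ()`; on an exception the destructor rolls the state back during unwinding, with no catch block needed.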
2023-11-09i386 PIE: accept @GOTOFF in load/store multi base addressAlexandre Oliva1-14/+75
Looking at the code generated for sse2-{load,store}-multi.c with PIE, I realized we could use UNSPEC_GOTOFF as a base address, and that this would enable the test to use the vector insns expected by the tests even with PIC, so I extended the base + offset logic used by the SSE2 multi-load/store peepholes to accept reg + symbolic base + offset too, so that the test generated the expected insns even with PIE. for gcc/ChangeLog * config/i386/i386.cc (symbolic_base_address_p, base_address_p): New, factored out from... (extract_base_offset_in_addr): ... here and extended to recognize REG+GOTOFF, as in gcc.target/i386/sse2-load-multi.c and sse2-store-multi.c with PIE enabled by default.
2023-11-09testsuite: xfail scev-[35].c on ia32Alexandre Oliva2-2/+2
These gimplefe tests never got the desired optimization on ia32, but they only started visibly failing when the representation of MEMs in dumps changed from printing 'symbol: a' to '&a'. The transformation is not considered profitable on ia32, that's why it doesn't take place. Maybe that's a bug in itself, but it's not a regression, and not something to be noisy about. for gcc/testsuite/ChangeLog * gcc.dg/tree-ssa/scev-3.c: xfail on ia32. * gcc.dg/tree-ssa/scev-5.c: Likewise.
2023-11-09AArch64: Add SVE implementation for cond_copysign.Tamar Christina2-0/+87
This adds an implementation for masked copysign along with an optimized pattern for masked copysign (x, -1). gcc/ChangeLog: PR tree-optimization/109154 * config/aarch64/aarch64-sve.md (cond_copysign<mode>): New. gcc/testsuite/ChangeLog: PR tree-optimization/109154 * gcc.target/aarch64/sve/fneg-abs_5.c: New test.
2023-11-09AArch64: Handle copysign (x, -1) expansion efficientlyTamar Christina3-10/+57
copysign (x, -1) is effectively fneg (abs (x)), which on AArch64 can be done most efficiently with an OR of the signbit. The middle-end will now optimize fneg (abs (x)) to copysign as the canonical form, and so this optimizes the expansion. If the target has an inclusive-OR that takes an immediate, then the transformed instruction is both shorter and faster. For those that don't, the immediate has to be separately constructed, but this still ends up being faster as the immediate construction is not on the critical path. Note that this is part of another patch series; the additional testcases are mutually dependent on the match.pd patch. As such the tests are added there instead of here. gcc/ChangeLog: PR tree-optimization/109154 * config/aarch64/aarch64.md (copysign<GPF:mode>3): Handle copysign (x, -1). * config/aarch64/aarch64-simd.md (copysign<mode>3): Likewise. * config/aarch64/aarch64-sve.md (copysign<mode>3): Likewise.
2023-11-09AArch64: Use SVE unpredicated LOGICAL expressions when Advanced SIMD inefficient [PR109154]Tamar Christina6-21/+22
SVE has a much bigger immediate encoding range for bitmasks than Advanced SIMD, so on an SVE-capable system, if we need an Advanced SIMD inclusive-OR by immediate that would require a reload, use an unpredicated SVE ORR instead. This has both speed and size improvements. gcc/ChangeLog: PR tree-optimization/109154 * config/aarch64/aarch64.md (<optab><mode>3): Add SVE split case. * config/aarch64/aarch64-simd.md (ior<mode>3<vczle><vczbe>): Likewise. * config/aarch64/predicates.md (aarch64_orr_imm_sve_advsimd): New. gcc/testsuite/ChangeLog: PR tree-optimization/109154 * gcc.target/aarch64/sve/fneg-abs_1.c: Updated. * gcc.target/aarch64/sve/fneg-abs_2.c: Updated. * gcc.target/aarch64/sve/fneg-abs_4.c: Updated.
2023-11-09AArch64: Add movi for 0 moves for scalar types [PR109154]Tamar Christina5-3/+7
Following the Neoverse N/V and Cortex-A optimization guides, SIMD 0 immediates should be created with a movi of 0. At the moment we generate an `fmov .., xzr`, which is slower and requires a GP -> FP transfer. gcc/ChangeLog: PR tree-optimization/109154 * config/aarch64/aarch64.md (*mov<mode>_aarch64, *movsi_aarch64, *movdi_aarch64): Add new w -> Z case. * config/aarch64/iterators.md (Vbtype): Add QI and HI. gcc/testsuite/ChangeLog: PR tree-optimization/109154 * gcc.target/aarch64/fneg-abs_2.c: Updated. * gcc.target/aarch64/fneg-abs_4.c: Updated. * gcc.target/aarch64/dbl_mov_immediate_1.c: Updated.
2023-11-09AArch64: Add special patterns for creating DI scalar and vector constant 1 << 63 [PR109154]Tamar Christina9-33/+119
This adds a way to generate special sequences for the creation of constants for which we don't have single-instruction sequences, and which would normally have led to a GP -> FP transfer or a literal load. The patch starts out by adding support for creating 1 << 63 using fneg (mov 0). gcc/ChangeLog: PR tree-optimization/109154 * config/aarch64/aarch64-protos.h (aarch64_simd_special_constant_p, aarch64_maybe_generate_simd_constant): New. * config/aarch64/aarch64-simd.md (*aarch64_simd_mov<VQMOV:mode>, *aarch64_simd_mov<VDMOV:mode>): Add new codegen for special constants. * config/aarch64/aarch64.cc (aarch64_extract_vec_duplicate_wide_int): Take optional mode. (aarch64_simd_special_constant_p, aarch64_maybe_generate_simd_constant): New. * config/aarch64/aarch64.md (*movdi_aarch64): Add new codegen for special constants. * config/aarch64/constraints.md (Dx): New. gcc/testsuite/ChangeLog: PR tree-optimization/109154 * gcc.target/aarch64/fneg-abs_1.c: Updated. * gcc.target/aarch64/fneg-abs_2.c: Updated. * gcc.target/aarch64/fneg-abs_4.c: Updated. * gcc.target/aarch64/dbl_mov_immediate_1.c: Updated.
2023-11-09ifcvt: Add support for conditional copysignTamar Christina3-3/+6
This adds a masked variant of copysign. Nothing very exciting, just the general machinery to define and use a new masked IFN. Bootstrapped and regtested on aarch64-none-linux-gnu with no issues. Note: This patch is part of a patch series; tests for it are added in the AArch64 patch that adds support for the optab. gcc/ChangeLog: PR tree-optimization/109154 * internal-fn.def (COPYSIGN): New. * match.pd (UNCOND_BINARY, COND_BINARY): Map IFN_COPYSIGN to IFN_COND_COPYSIGN. * optabs.def (cond_copysign_optab, cond_len_copysign_optab): New.
2023-11-09middle-end: optimize fneg (fabs (x)) to copysign (x, -1) [PR109154]Tamar Christina15-12/+303
This patch transforms fneg (fabs (x)) into copysign (x, -1) which is more canonical and allows a target to expand this sequence efficiently. Such sequences are common in scientific code working with gradients. There is an existing canonicalization of copysign (x, -1) to fneg (fabs (x)) which I remove since this is a less efficient form. The testsuite is also updated in light of this. gcc/ChangeLog: PR tree-optimization/109154 * match.pd: Add new neg+abs rule, remove inverse copysign rule. gcc/testsuite/ChangeLog: PR tree-optimization/109154 * gcc.dg/fold-copysign-1.c: Updated. * gcc.dg/pr55152-2.c: Updated. * gcc.dg/tree-ssa/abs-4.c: Updated. * gcc.dg/tree-ssa/backprop-6.c: Updated. * gcc.dg/tree-ssa/copy-sign-2.c: Updated. * gcc.dg/tree-ssa/mult-abs-2.c: Updated. * gcc.target/aarch64/fneg-abs_1.c: New test. * gcc.target/aarch64/fneg-abs_2.c: New test. * gcc.target/aarch64/fneg-abs_3.c: New test. * gcc.target/aarch64/fneg-abs_4.c: New test. * gcc.target/aarch64/sve/fneg-abs_1.c: New test. * gcc.target/aarch64/sve/fneg-abs_2.c: New test. * gcc.target/aarch64/sve/fneg-abs_3.c: New test. * gcc.target/aarch64/sve/fneg-abs_4.c: New test.
2023-11-09middle-end: expand copysign handling from lockstep to nested itersTamar Christina1-24/+24
various optimizations in match.pd only happened on COPYSIGN in lock step which means they exclude IFN_COPYSIGN. COPYSIGN however is restricted to only the C99 builtins and so doesn't work for vectors. The patch expands these optimizations to work as nested iters. This is needed for the second patch which will add the testcase. gcc/ChangeLog: PR tree-optimization/109154 * match.pd: expand existing copysign optimizations.
2023-11-09Fix PR ada/111813 (Inconsistent limit in Ada.Calendar.Formatting)Simon Wright2-3/+34
The description of the second Value function (returning Duration) (ARM 9.6.1(87)) doesn't place any limitation on the Elapsed_Time parameter's value, beyond "Constraint_Error is raised if the string is not formatted as described for Image, or the function cannot interpret the given string as a Duration value". It would seem reasonable that Value and Image should be consistent, in that any string produced by Image should be accepted by Value. Since Image must produce a two-digit representation of the Hours, there's an implication that its Elapsed_Time parameter should be less than 100.0 hours (the ARM merely says that in that case the result is implementation-defined). The current implementation of Value raises Constraint_Error if the Elapsed_Time parameter is greater than or equal to 24 hours. This patch removes the restriction, so that the Elapsed_Time parameter must only be less than 100.0 hours. 2023-10-15 Simon Wright <simon@pushface.org> PR ada/111813 gcc/ada/ * libgnat/a-calfor.adb (Value (2)): Allow values of parameter Elapsed_Time greater than or equal to 24 hours, by doing the hour calculations in Natural rather than Hour_Number (0 .. 23). Calculate the result directly rather than by using Seconds_Of (whose Hour parameter is of type Hour_Number). If an exception occurs of type Constraint_Error, re-raise it rather than raising a new CE. gcc/testsuite/ * gnat.dg/calendar_format_value.adb: New test.
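The arithmetic of the fix can be sketched outside Ada (C++ here; `elapsed_seconds` is an illustrative helper, not the GNAT code): computing the total directly in a plain integer avoids routing the hour count through a 0..23 subtype, so images such as "26:30:00" are accepted up to "99:59:59".

```cpp
// Duration in seconds from an already-split "HH:MM:SS" image, with the
// hour arithmetic done in an unconstrained integer type.
static long elapsed_seconds (int hours, int minutes, int seconds)
{
  return 3600L * hours + 60L * minutes + seconds;
}
```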
2023-11-09Do not prepend target triple to -fuse-ld=lld,mold.Tatsuyuki Ishi1-5/+8
lld and mold are platform-agnostic and not prefixed with the target triple. Prepending the target triple makes it less likely to find the intended linker executable. A potential breaking change is that we no longer try to search for triple-prefixed lld/mold binaries. However, since there doesn't seem to be support for building LLVM or mold with triple-prefixed executable names, it seems better to just not bother with that case. PR driver/111605 * collect2.cc (main): Do not prepend target triple to -fuse-ld=lld,mold.
2023-11-09Refactor x86 decl based scatter vectorization, prepare SLPRichard Biener2-360/+332
The following refactors the x86 decl based scatter vectorization similar to what I did to the gather path. This prepares scatters for SLP as well, mainly single-lane since there are multiple missing bits to support multi-lane scatters. Tested extensively on the SLP-only branch which has the ability to force SLP even for single lanes. PR tree-optimization/111133 * tree-vect-stmts.cc (vect_build_scatter_store_calls): Remove and refactor to ... (vect_build_one_scatter_store_call): ... this new function. (vectorizable_store): Use vect_check_scalar_mask to record the SLP node for the mask operand. Code generate scatters with builtin decls from the main scatter vectorization path and prepare that for SLP. * tree-vect-slp.cc (vect_get_operand_map): Do not look at the VDEF to decide between scatter or gather since that doesn't work for patterns. Use the LHS being an SSA_NAME or not instead.
2023-11-09RISC-V: Refine frm emit after bb end in succ edgesPan Li1-4/+17
This patch would like to refine the frm insn emission when we meet an abnormal edge in the loop. Conceptually, we only need to emit once for the abnormal edge instead of on every iteration of the loop. This patch fixes this defect and only performs insert_insn_end_basic_block when at least one succ edge is abnormal. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_frm_emit_after_bb_end): Only emit once when at least one succ edge is abnormal. Signed-off-by: Pan Li <pan2.li@intel.com>
2023-11-09RISC-V: Add PR112450 test to avoid regressionJuzhe-Zhong1-0/+19
ICE has been fixed by Richard: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112450. Add test to avoid future regression. Committed. PR target/112450 gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr112450.c: New test.
2023-11-09tree-optimization/112450 - avoid AVX512 style masking for BImode masksRichard Biener1-1/+4
The following avoids running into the AVX512 style masking code for RVV which would theoretically be able to handle it if I were not relying on integer mode maskness in vect_get_loop_mask. While that's easy to fix (patch in PR), the preference is to not have AVX512 style masking for RVV, thus the following. * tree-vect-loop.cc (vect_verify_full_masking_avx512): Check we have integer mode masks as required by vect_get_loop_mask.
2023-11-09tree-optimization/112444 - avoid bogus PHI value-numberingRichard Biener2-2/+79
With .DEFERRED_INIT ssa_undefined_value_p () can return true for values we did not visit (because they proved unreachable) but are not .VN_TOP. Avoid using those as values which, because they are not visited, are assumed to be defined outside of the region. PR tree-optimization/112444 * tree-ssa-sccvn.cc (visit_phi): Avoid using not visited defs as undefined vals. * gcc.dg/torture/pr112444.c: New testcase.
2023-11-09MAINTAINERS: Update my email addressYunQiang Su1-1/+1
ChangeLog: * MAINTAINERS: Update my email address.
2023-11-09MIPS: Use -mnan value for -mabs if not specifiedYunQiang Su3-0/+22
On most hardware, FCSR.ABS2008 is set to the same value as FCSR.NAN2008. Let's use this behavior by default in GCC: gcc -mnan=2008 -c fabs.c will imply `-mabs=2008`. And of course, `gcc -mnan=2008 -mabs=legacy` continues to work as before. gcc/ChangeLog * config/mips/mips.cc (mips_option_override): Set mips_abs to 2008, if mips_abs is default and mips_nan is 2008. gcc/testsuite/ * gcc.target/mips/fabs-nan2008.c: New test. * gcc.target/mips/fabsf-nan2008.c: New test.
2023-11-09i386: Fix C99 compatibility issues in the x86-64 AVX ABI test suiteFlorian Weimer9-11/+15
gcc/testsuite/ * gcc.target/x86_64/abi/avx/avx-check.h (main): Call __builtin_printf instead of printf. * gcc.target/x86_64/abi/avx/test_passing_m256.c (fun_check_passing_m256_8_values): Add missing void return type. * gcc.target/x86_64/abi/avx512f/avx512f-check.h (main): Call __builtin_printf instead of printf. * gcc.target/x86_64/abi/avx512f/test_passing_m512.c (fun_check_passing_m512_8_values): Add missing void return type. * gcc.target/x86_64/abi/bf16/bf16-check.h (main): Call __builtin_printf instead of printf. * gcc.target/x86_64/abi/bf16/m256bf16/bf16-ymm-check.h (main): Likewise. * gcc.target/x86_64/abi/bf16/m256bf16/test_passing_m256.c (fun_check_passing_m256bf16_8_values): Add missing void return type. * gcc.target/x86_64/abi/bf16/m512bf16/bf16-zmm-check.h (main): Call __builtin_printf instead of printf. * gcc.target/x86_64/abi/bf16/m512bf16/test_passing_m512.c (fun_check_passing_m512bf16_8_values): Add missing void return type.
2023-11-09c: Add -Wreturn-mismatch warning, split from -Wreturn-typeFlorian Weimer11-27/+277
The existing -Wreturn-type option covers both constraint violations (which are mandatory to diagnose) and warnings that have known false positives. The new -Wreturn-mismatch warning is only about the constraint violations (missing or extra return expressions), and should eventually be turned into a permerror. The -std=gnu89 test cases show that by default, we do not warn for return; in a function not returning void. This matches previous practice for -Wreturn-type. gcc/c-family/ * c.opt (Wreturn-mismatch): New. gcc/c/ * c-typeck.cc (c_finish_return): Use pedwarn with OPT_Wreturn_mismatch for missing/extra return expressions. gcc/ * doc/invoke.texi (Warning Options): Document -Wreturn-mismatch. Update -Wreturn-type documentation. gcc/testsuite/ * gcc.dg/Wreturn-mismatch-1.c: New. * gcc.dg/Wreturn-mismatch-2.c: New. * gcc.dg/Wreturn-mismatch-3.c: New. * gcc.dg/Wreturn-mismatch-4.c: New. * gcc.dg/Wreturn-mismatch-5.c: New. * gcc.dg/Wreturn-mismatch-6.c: New. * gcc.dg/noncompile/pr55976-1.c: Change -Werror=return-type to -Werror=return-mismatch. * gcc.dg/noncompile/pr55976-2.c: Change -Wreturn-type to -Wreturn-mismatch.
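A sketch of the split (illustrative only; the commented-out lines show the C constraint violations `-Wreturn-mismatch` now covers, while `pick` shows the flow-sensitive case `-Wreturn-type` keeps):

```cpp
// -Wreturn-mismatch: constraint violations, always diagnosable (C):
//     int f (void) { return; }      /* missing return expression */
//     void g (void) { return 1; }   /* extra return expression   */
//
// -Wreturn-type keeps the flow-sensitive check, which has known
// false positives when a path is actually unreachable:
int pick (int x)
{
  if (x > 0)
    return 1;
  if (x <= 0)
    return 0;
}  // may warn: control reaches end of non-void function
```

At runtime `pick` is well-defined, since every input takes one of the two return paths; the compiler just cannot prove that locally, which is exactly why the flow-based warning was separated from the constraint check.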
2023-11-09gcc.dg/Wmissing-parameter-type*: Test the intended warningFlorian Weimer2-4/+4
gcc/testsuite/ChangeLog: * gcc.dg/Wmissing-parameter-type.c: Build with -std=gnu89 to trigger the -Wmissing-parameter-type warning and not the default -Wimplicit warning. Also match against -Wmissing-parameter-type. * gcc.dg/Wmissing-parameter-type-Wextra.c: Likewise.
2023-11-09s390: Revise vector reverse elementsStefan Schulze Frielinghaus12-147/+527
Replace UNSPEC_VEC_ELTSWAP with a vec_select implementation. Furthermore, for a vector reverse elements operation between registers of mode V8HI perform three rotates instead of a vperm operation since the latter involves loading the permutation vector from the literal pool. Prior to z15, instead of larl + vl + vl + vperm prefer vl + vpdi (+ verllg (+ verllf)) for a load operation. Likewise, prior to z15, instead of larl + vl + vperm + vst prefer vpdi (+ verllg (+ verllf)) + vst for a store operation. gcc/ChangeLog: * config/s390/s390.md: Remove UNSPEC_VEC_ELTSWAP. * config/s390/vector.md (eltswapv16qi): New expander. (*eltswapv16qi): New insn and splitter. (eltswapv8hi): New insn and splitter. (eltswap<mode>): New insn and splitter for modes V_HW_4 as well as V_HW_2. * config/s390/vx-builtins.md (eltswap<mode>): Remove. (*eltswapv16qi): Remove. (*eltswap<mode>): Remove. (*eltswap<mode>_emu): Remove. gcc/testsuite/ChangeLog: * gcc.target/s390/zvector/vec-reve-load-halfword-z14.c: Remove vperm and substitute by vpdi et al. * gcc.target/s390/zvector/vec-reve-load-halfword.c: Likewise. * gcc.target/s390/vector/reverse-elements-1.c: New test. * gcc.target/s390/vector/reverse-elements-2.c: New test. * gcc.target/s390/vector/reverse-elements-3.c: New test. * gcc.target/s390/vector/reverse-elements-4.c: New test. * gcc.target/s390/vector/reverse-elements-5.c: New test. * gcc.target/s390/vector/reverse-elements-6.c: New test. * gcc.target/s390/vector/reverse-elements-7.c: New test.
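Element reversal expressed with GCC's generic vector extension — an illustrative model of the operation the patch now emits as a vec_select (vpdi plus rotates) instead of a vperm with a literal-pool permute vector; `reverse_v8hi` is not backend code:

```cpp
typedef short v8hi __attribute__ ((vector_size (16)));

// Reverse the eight halfword elements of a 128-bit vector.
static v8hi reverse_v8hi (v8hi x)
{
  v8hi r;
  for (int i = 0; i < 8; ++i)
    r[i] = x[7 - i];
  return r;
}
```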
2023-11-09s390: Add expand_perm_reverse_elementsStefan Schulze Frielinghaus1-72/+16
Replace expand_perm_with_rot, expand_perm_with_vster, and expand_perm_with_vstbrq with a general implementation expand_perm_reverse_elements. gcc/ChangeLog: * config/s390/s390.cc (expand_perm_with_rot): Remove. (expand_perm_reverse_elements): New. (expand_perm_with_vster): Remove. (expand_perm_with_vstbrq): Remove. (vectorize_vec_perm_const_1): Replace removed functions with new one.
2023-11-09s390: Recognize further vpdi and vmr{l,h} patternStefan Schulze Frielinghaus1-28/+90
Deal with cases where vpdi and vmr{l,h} are still applicable if the operands of those instructions are swapped. For example, currently for V2DI foo (V2DI x) { return (V2DI) {x[1], x[0]}; } the assembler sequence vlgvg %r1,%v24,1 vzero %v0 vlvgg %v0,%r1,0 vmrhg %v24,%v0,%v24 is emitted. With this patch a single vpdi is emitted. Extensive tests are included in a subsequent patch of this series where more cases are covered. gcc/ChangeLog: * config/s390/s390.cc (expand_perm_with_merge): Deal with cases where vmr{l,h} are still applicable if the operands are swapped. (expand_perm_with_vpdi): Likewise for vpdi.
2023-11-09s390: Reduce number of patterns where the condition is false anywayStefan Schulze Frielinghaus2-51/+46
For patterns which make use of two modes, do not build the cross product and then exclude illegal combinations via conditions but rather do not create those in the first place. Here we are following the idea of the attribute TOINTVEC/tointvec and introduce TOINT/toint. gcc/ChangeLog: * config/s390/s390.md (VX_CONV_INT): Remove iterator. (gf): Add float mappings. (TOINT, toint): New attribute. (*fixuns_trunc<VX_CONV_BFP:mode><VX_CONV_INT:mode>2_z13): Remove. (*fixuns_trunc<mode><toint>2_z13): Add. (*fix_trunc<VX_CONV_BFP:mode><VX_CONV_INT:mode>2_bfp_z13): Remove. (*fix_trunc<mode><toint>2_bfp_z13): Add. (*floatuns<VX_CONV_INT:mode><VX_CONV_BFP:mode>2_z13): Remove. (*floatuns<toint><mode>2_z13): Add. * config/s390/vector.md (VX_VEC_CONV_INT): Remove iterator. (float<VX_VEC_CONV_INT:mode><VX_VEC_CONV_BFP:mode>2): Remove. (float<tointvec><mode>2): Add. (floatuns<VX_VEC_CONV_INT:mode><VX_VEC_CONV_BFP:mode>2): Remove. (floatuns<tointvec><mode>2): Add. (fix_trunc<VX_VEC_CONV_BFP:mode><VX_VEC_CONV_INT:mode>2): Remove. (fix_trunc<mode><tointvec>2): Add. (fixuns_trunc<VX_VEC_CONV_BFP:mode><VX_VEC_CONV_INT:mode>2): Remove. (fixuns_trunc<VX_VEC_CONV_BFP:mode><tointvec>2): Add.
2023-11-09libgcc: Add {unsigned ,}__int128 <-> _Decimal{32,64,128} conversion support [PR65833]Jakub Jelinek17-1/+1174
The following patch adds the missing {unsigned ,}__int128 <-> _Decimal{32,64,128} conversion support into libgcc.a on top of the _BitInt support (doing it without that would be a larger amount of code, and I hope all the targets which support __int128 will eventually support _BitInt; after all, it is a required part of C23), and because it is in libgcc.a only, it doesn't hurt that much if it is added for some architectures only in GCC 15. Initially I thought about doing this on the compiler side, but doing it on the library side seems to be easier and more -Os friendly. The tests currently require the bitint effective target; that can be removed when all the int128 targets support bitint. 2023-11-09 Jakub Jelinek <jakub@redhat.com> PR libgcc/65833 libgcc/ * config/t-softfp (softfp_bid_list): Add {U,}TItype <-> _Decimal{32,64,128} conversions. * soft-fp/floattisd.c: New file. * soft-fp/floattidd.c: New file. * soft-fp/floattitd.c: New file. * soft-fp/floatuntisd.c: New file. * soft-fp/floatuntidd.c: New file. * soft-fp/floatuntitd.c: New file. * soft-fp/fixsdti.c: New file. * soft-fp/fixddti.c: New file. * soft-fp/fixtdti.c: New file. * soft-fp/fixunssdti.c: New file. * soft-fp/fixunsddti.c: New file. * soft-fp/fixunstdti.c: New file. gcc/testsuite/ * gcc.dg/dfp/int128-1.c: New test. * gcc.dg/dfp/int128-2.c: New test. * gcc.dg/dfp/int128-3.c: New test. * gcc.dg/dfp/int128-4.c: New test.
2023-11-09attribs: Fix ICE with -Wno-attributes= [PR112339]Jakub Jelinek2-3/+15
The following testcase ICEs, because with -Wno-attributes=foo::no_sanitize (but generally any other non-gnu namespace and some gnu well known attribute name within that other namespace) the FEs don't really parse attribute arguments of such an attribute, but lookup_attribute_spec is non-NULL with a NULL handler, and such attributes are added to DECL_ATTRIBUTES or TYPE_ATTRIBUTES. Then when e.g. the middle-end does lookup_attribute on a particular attribute and expects the attribute to mean something and/or have particular verified arguments, it can crash when seeing the foreign attribute in there instead. The following patch fixes that by never adding ignored attributes to DECL_ATTRIBUTES/TYPE_ATTRIBUTES; previously that was the case just for attributes in an ignored namespace (where lookup_attribute_spec returned NULL). We don't really know anything about those attributes, so we shouldn't pretend we know something about them, especially when the arguments are error_mark_node or NULL instead of something that would have been parsed. And it would be really weird if we normally ignore say the [[clang::unused]] attribute, but when people use -Wno-attributes=clang::unused we actually treated it as gnu::unused. All the user asked for is to suppress warnings about that attribute being unknown. The first hunk is just playing it safe; I'm worried people could use -Wno-attributes=gnu:: and get various crashes with known GNU attributes not being actually parsed and recorded (or worse, e.g. when we tweak standard attributes into GNU attributes and we wouldn't add those). The -Wno-attributes= documentation says that it suppresses warnings about unknown attributes, so I think -Wno-attributes=gnu:: should prevent warnings about, say, the [[gnu::foobarbaz]] attribute, but not about [[gnu::unused]], because the latter is a known attribute.
The routine would return true for any scoped attribute in the ignored namespace; with the change it ignores only unknown attributes in an ignored namespace, and known ones in there will be ignored only if they have a max_length of -2 (e.g. with -Wno-attributes=gnu:: -Wno-attributes=gnu::foobarbaz). 2023-11-09 Jakub Jelinek <jakub@redhat.com> PR c/112339 * attribs.cc (attribute_ignored_p): Only return true for attr_namespace_ignored_p if as is NULL. (decl_attributes): Never add ignored attributes. * c-c++-common/ubsan/Wno-attributes-1.c: New test.
2023-11-09RISC-V: Fix the illegal operands for the XTheadMemidx extension.Jin Ma2-2/+32
The patterns "*extend<SHORT:mode><SUPERQI:mode>2_bitmanip" and "*zero_extendhi<GPR:mode>2_bitmanip" in bitmanip.md are similar to the patterns "*th_memidx_bb_extendqi<SUPERQI:mode>2" and "*th_memidx_bb_zero_extendhi<GPR:mode>2" in thead.md, which causes the wrong instruction to be generated and the following error to be reported by binutils: Assembler messages: Error: illegal operands `lb a5,(a0),1,0' The correct instruction is "th.lbia a5,(a0),1,0". gcc/ChangeLog: * config/riscv/bitmanip.md: Avoid the conflict between zbb and xtheadmemidx in patterns. gcc/testsuite/ChangeLog: * gcc.target/riscv/xtheadfmemidx-uindex-zbb.c: New test.
2023-11-09Fix SIMD clone SLP a bit moreRichard Biener1-5/+4
The following fixes an omission that was mangling the non-SLP and SLP simd-clone info. * tree-vect-stmts.cc (vectorizable_simd_clone_call): Record to the correct simd_clone_info.
2023-11-09libstdc++: [_Hashtable] Use RAII type to guard node while constructing valueFrançois Dumont1-20/+26
libstdc++-v3/ChangeLog: * include/bits/hashtable_policy.h (struct _NodePtrGuard<_HashtableAlloc, _NodePtr>): New. (_ReuseAllocNode::operator()(_Args&&...)): Use the latter to guard the allocated node pointer while constructing the value_type instance in place.
2023-11-09RISC-V: Fix dynamic LMUL cost model ICEJuzhe-Zhong4-3/+69
When trying to use dynamic LMUL to compile benchmarks, we noticed a bunch of ICEs. This patch fixes those ICEs and appends tests. gcc/ChangeLog: * config/riscv/riscv-vector-costs.cc (costs::preferred_new_lmul_p): Fix ICE. gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul-ice-1.c: New test. * gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul-ice-2.c: New test. * gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul-ice-3.c: New test.