riscv-gnu-toolchain/gcc.git

Age	Commit message	Author	Files	Lines
2023-08-18	libstdc++: Revert pre-C++23 support for 16-bit float types [PR111060]	Jonathan Wakely	2	-5/+5
	In r14-3304-g1a566fddea212a and r14-3305-g6cf214b4fc97f5 I tried to enable std::format for 16-bit float types before C++23. This causes errors for targets where the types are defined but can't actually be used, e.g. i686 without sse2. Make the std::numeric_limits and std::formatter specializations for _Float16 and __bfloat16_t depend on the __STDCPP_FLOAT16_T__ and __STDCPP_BFLOAT16_T__ macros again, so they're only defined for C++23 when the type is fully supported. This is OK because the main point of my earlier commits was to add better support for _Float32 and _Float64. It seems fine for the new 16-bit types to only be supported for C++23, as they were never present before GCC 13 anyway. libstdc++-v3/ChangeLog: PR target/111060 * include/std/format (formatter): Only define specializations for 16-bit floating-point types for C++23. * include/std/limits (numeric_limits): Likewise.
2023-08-18	testsuite: Improve test in dg-require-python-h	Thiago Jung Bauermann	1	-2/+12
	If GCC is tested with a sysroot which doesn't contain a Python installation (e.g., with a command such as "make check-gcc-c FLAGS_UNDER_TEST="--sysroot=/some/path"), but there's a python3-config in $PATH, then the testsuite will pick up the host's Python.h which can't actually be used: Executing on host: python3-config --includes (timeout = 300) spawn -ignore SIGHUP python3-config --includes -I/usr/include/python3.10 -I/usr/include/python3.10 Executing on host: /some/sysroot/bin/aarch64-unknown-linux-gnu-gcc --sysroot=/some/sysroot/libc -Wl,-dynamic-linker=/some/sysroot/libc/lib/ld-linux-aarch64.so.1 -Wl,-rpath=/some/sysroot/libc/lib /some/src/gcc.git/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c -fdiagnostics-plain-output -fplugin=./analyzer_cpython_plugin.so -fanalyzer -I/usr/include/python3.10 -I/usr/include/python3.10 -S -o cpython-plugin-test-2.s (timeout = 600) spawn -ignore SIGHUP /some/sysroot/bin/aarch64-unknown-linux-gnu-gcc --sysroot=/some/sysroot/libc -Wl,-dynamic-linker=/some/sysroot/libc/lib/ld-linux-aarch64.so.1 -Wl,-rpath=/some/sysroot/libc/lib /some/src/gcc.git/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c -fdiagnostics-plain-output -fplugin=./analyzer_cpython_plugin.so -fanalyzer -I/usr/include/python3.10 -I/usr/include/python3.10 -S -o cpython-plugin-test-2.s In file included from /usr/include/python3.10/Python.h:8, from /some/src/gcc.git/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c:8: /usr/include/python3.10/pyconfig.h:9:12: fatal error: aarch64-linux-gnu/python3.10/pyconfig.h: No such file or directory compilation terminated. 
compiler exited with status 1 This problem causes these testsuite failures: FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 17) FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 18) FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 21) FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 31) FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 32) FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 35) FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 45) FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 55) FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 63) FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 66) FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 68) FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 69) FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for excess errors) Excess errors: /usr/include/python3.10/pyconfig.h:9:12: fatal error: aarch64-linux-gnu/python3.10/pyconfig.h: No such file or directory compilation terminated. So try to compile a test file so that the testcase can be marked as unsupported instead. gcc/testsuite/ChangeLog: * lib/target-supports.exp (dg-require-python-h): Test whether Python.h can really be used.
2023-08-18	i386: Use PUNPCKL?? to implement vector extend and zero_extend for TARGET_SSE2.	Uros Bizjak	8	-8/+234
	Implement vector extend and zero_extend functionality for TARGET_SSE2 using PUNPCKL?? family of instructions. The code for e.g. zero-extend from V2SI to V2DImode improves from: movd %xmm0, %edx pshufd $85, %xmm0, %xmm0 movd %xmm0, %eax movq %rdx, (%rdi) movq %rax, 8(%rdi) to: pxor %xmm1, %xmm1 punpckldq %xmm1, %xmm0 movaps %xmm0, (%rdi) And the code for sign-extend from V2SI to V2DImode from: movd %xmm0, %edx pshufd $85, %xmm0, %xmm0 movd %xmm0, %eax movslq %edx, %rdx cltq movq %rdx, (%rdi) movq %rax, 8(%rdi) to: pxor %xmm1, %xmm1 pcmpgtd %xmm0, %xmm1 punpckldq %xmm1, %xmm0 movaps %xmm0, (%rdi) PR target/111023 gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_split_mmx_punpck): Also handle V2QImode. (ix86_expand_sse_extend): New function. * config/i386/i386-protos.h (ix86_expand_sse_extend): New prototype. * config/i386/mmx.md (<any_extend:insn>v4qiv4hi2): Enable for TARGET_SSE2. Expand through ix86_expand_sse_extend for !TARGET_SSE4_1. (<any_extend:insn>v2hiv2si2): Ditto. (<any_extend:insn>v2qiv2hi2): Ditto. * config/i386/sse.md (<any_extend:insn>v8qiv8hi2): Ditto. (<any_extend:insn>v4hiv4si2): Ditto. (<any_extend:insn>v2siv2di2): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr111023-2.c: New test. * gcc.target/i386/pr111023-4b.c: New test. * gcc.target/i386/pr111023-8b.c: New test. * gcc.target/i386/pr111023.c: New test.
2023-08-18	[irange] Return FALSE if updated bitmask is unchanged [PR110753]	Aldy Hernandez	2	-0/+33
	The mask/value pair we track in the irange is a bit fickle in that it can sometimes contradict the bitmask inherent in the range. This can happen when a series of calculations yields a combination such as: [3, 1000] MASK 0xfffffffe VALUE 0x0 The mask/value above implies that the lowest bit is a known 0, which would exclude the 3 in the range. At one time we tried keeping mask and ranges 100% consistent, but the performance penalty was too high (5% in VRP). Also, it's unclear whether the intersection of two incompatible known bits should make the whole range undefined, or just the contradicting bits. This is all documented in irange::get_bitmask(). We could revisit both of these assumptions in the future. In this testcase IPA ends up with a range where the lower 2 bits are expected to be 0, but the range is [1,1]. [irange] long int [1, 1] MASK 0xfffffffffffffffc VALUE 0x0 This causes irange::union_bitmask() to think an update occurred, when no semantic change happened, thus triggering an assert in IPA-cp. We could get rid of the assert, but it's cleaner to make irange::{union,intersect}_bitmask always tell the truth. Besides, the ranger's cache also depends on union being truthful. PR ipa/110753 gcc/ChangeLog: * value-range.cc (irange::union_bitmask): Return FALSE if updated bitmask is semantically equivalent to the original mask. (irange::intersect_bitmask): Same. (irange::get_bitmask): Add comment. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr110753.c: New test.
2023-08-18	tree-optimization/111019 - invariant motion and aliasing	Richard Biener	2	-2/+77
	The following fixes a bad choice in representing things to the alias oracle by LIM which while correct in pieces is inconsistent with itself. When canonicalizing a ref to a bare deref instead of leaving the base object and the extracted offset the same and just substituting an alternate ref the following replaces the base and the offset as well, avoiding the confusion that otherwise will arise in aliasing_matching_component_refs_p. PR tree-optimization/111019 * tree-ssa-loop-im.cc (gather_mem_refs_stmt): When canonicalizing also scrap base and offset in case the ref is indirect. * g++.dg/torture/pr111019.C: New testcase.
2023-08-18	bpf: bump maximum frame size limit to 32767 bytes	Jose E. Marchesi	3	-1/+35
	This commit bumps the maximum stack frame size allowed for BPF functions to the maximum possible value. Tested in x86_64-linux-gnu host and target bpf-unknown-none. gcc/ChangeLog * config/bpf/bpf.opt (mframe-limit): Set default to 32767. gcc/testsuite/ChangeLog * gcc.target/bpf/frame-limit-1.c: New test. * gcc.target/bpf/frame-limit-2.c: Likewise.
2023-08-18	libstdc++: Replace non-type-dependent uses of wchar_t in <format> and <chrono>	Jonathan Wakely	2	-13/+14
	This is one more piece of the rework to make wchar_t support in std::format depend on _GLIBCXX_USE_WCHAR_T. In <format> the __to_wstring_numeric function is called with arguments that aren't type-dependent, so a declaration needs to be available, or the calls need to be guarded by _GLIBCXX_USE_WCHAR_T. In <chrono> there is a similarly non-type-dependent call to std::format with a wchar_t format string, which is ill-formed when the wchar_t overloads of std::format are not declared. Use _GLIBCXX_WIDEN to make it type-dependent. libstdc++-v3/ChangeLog: * include/bits/chrono_io.h (operator<<): Make uses of wide strings with streams and std::format type-dependent on _CharT. * include/std/format [!_GLIBCXX_USE_WCHAR_T] Do not use __to_wstring_numeric.
2023-08-18	Makefile.in: Make TM_P_H depend on $(TREE_H) [PR111021]	Kewen Lin	1	-1/+2
	As PR111021 shows, the below ${port}-protos.h include tree.h for code_helper and tree_code: arm/arm-protos.h:#include "tree.h" cris/cris-protos.h:#include "tree.h" (H-P removed this in r14-3218) microblaze/microblaze-protos.h:#include "tree.h" rl78/rl78-protos.h:#include "tree.h" stormy16/stormy16-protos.h:#include "tree.h" , when compiling build/gencondmd.cc, the include hierarchy makes it depend on tm_p.h -> ${port}-protos.h -> tree.h, which further includes (depends on) some files that are generated during the building, such as: all-tree.def, tree-check.h and so on. The previous commit r14-3215 should already force build/gencondmd.cc to depend on ${TREE_H}, so the reported build failure should be gone. But for a long term maintenance, especially one day some build/xxx.cc requires tm_p.h but not recog.h, the ${TREE_H} dependence could be missed and a build failure will show up. So this patch is to make TM_P_H depend on $(TREE_H), any new build/xxx.cc depending on tm_p.h will be able to consider ${TREE_H}. It's tested with cross-builds for the affected ports with steps: 1) dropped the fix r14-3215; 2) reproduced the build failure with serial build; 3) applied this patch, serial built and verified all passed; 4) added back r14-3215, serial built and verified all passed; PR bootstrap/111021 gcc/ChangeLog: * Makefile.in (TM_P_H): Add $(TREE_H) as dependence.
2023-08-18	vect: Factor out the handling on scatter store having gs_info.decl	Kewen Lin	1	-199/+212
	Similar to the existing function vect_build_gather_load_calls, this patch is to factor out the handling on scatter store having gs_info.decl to vect_build_scatter_store_calls which is a new function. It also does some minor refactoring like moving some variables' declarations close to their uses and restrict the scope for some of them etc. It's a pre-patch for upcoming vectorizable_store re-structuring for costing. gcc/ChangeLog: * tree-vect-stmts.cc (vect_build_scatter_store_calls): New, factor out from ... (vectorizable_store): ... here.
2023-08-18	libstdc++: Fix incomplete rework of wchar_t support in std::format	Jonathan Wakely	2	-14/+14
	r14-3300-g023a62b77f999b left make_wformat_args and some uses of std::wformat_context unguarded by _GLIBCXX_USE_WCHAR_T. libstdc++-v3/ChangeLog: * include/bits/chrono_io.h (operator<<): Use __format_context. * include/std/format (__format::__format_context): New alias template. [!_GLIBCXX_USE_WCHAR_T] (wformat_args, make_wformat_arg): Disable.
2023-08-18	tree-optimization/111048 - avoid flawed logic in fold_vec_perm	Richard Biener	2	-6/+30
	The following avoids running into somehow flawed logic in fold_vec_perm for non-VLA vectors. PR tree-optimization/111048 * fold-const.cc (fold_vec_perm_cst): Check for non-VLA vectors first. * gcc.dg/torture/pr111048.c: New testcase.
2023-08-18	i386: Add AVX2 pragma wrapper for AVX512DQVL intrins	Haochen Jiang	2	-0/+22
	PR target/111051 gcc/ChangeLog: * config/i386/avx512vldqintrin.h: Push AVX2 when AVX2 is disabled. gcc/testsuite/ChangeLog: PR target/111051 * gcc.target/i386/pr111051-1.c: New test.
2023-08-18	vect: Move VMAT_GATHER_SCATTER handlings from final loop nest	Kewen Lin	1	-142/+216
	Following Richi's suggestion [1], this patch is to move the handlings on VMAT_GATHER_SCATTER in the final loop nest of function vectorizable_load to its own loop. Basically it duplicates the final loop nest, clean up some useless set up code for the case of VMAT_GATHER_SCATTER, remove some unreachable code. Also remove the corresponding handlings in the final loop nest. [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/623329.html gcc/ChangeLog: * tree-vect-stmts.cc (vectorizable_load): Move the handlings on VMAT_GATHER_SCATTER in the final loop nest to its own loop, and update the final nest accordingly.
2023-08-18	RISC-V: Fix -march error of zhinxmin testcases	Lehua Ding	2	-3/+4
	This little patch fixes the -march error in a zhinxmin testcase I added earlier and in an older zhinxmin testcase, since these testcases are for the zhinxmin extension, not the zfhmin extension. gcc/testsuite/ChangeLog: * gcc.target/riscv/_Float16-zhinxmin-3.c: Adjust. * gcc.target/riscv/_Float16-zhinxmin-4.c: Ditto.
2023-08-17	Document cond_neg, cond_one_cmpl, cond_len_neg and cond_len_one_cmpl standard patterns	Andrew Pinski	1	-0/+62
	When I added `cond_one_cmpl` (and the corresponding IFN) I noticed that the cond_neg standard named pattern was not documented; this adds the documentation for all 4 named patterns. OK? Tested by building the manual. gcc/ChangeLog: * doc/md.texi (Standard patterns): Document cond_neg, cond_one_cmpl, cond_len_neg and cond_len_one_cmpl.
2023-08-18	RISC-V: Add the missed half floating-point mode patterns of local_pic_load/store when only use zfhmin or zhinxmin	Lehua Ding	4	-17/+44
	Hi, There is a new failing RISC-V testcase (testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c) on the current trunk branch when medany is used as the default cmodel. The reason is that the load of a half floating-point immediate changes from RTL 1 to RTL 2 when the cmodel changes from medlow to medany. This change lets insn 7 be combined with the @pred_broadcast pattern (insn 8) in the combine pass. However, insn 6 and insn 7 are combined for SF and DF mode, but not for HF mode, and the failed combination leads to insn 7 and insn 8 being combined. The combination fails because the local_pic_loadhf pattern doesn't exist when only zfhmin (implied by zvfh) is enabled. Therefore, when only zfhmin but not zfh is enabled, the define_insn of local_pic_load<ANYF:mode> must also be able to produce the local_pic_loadhf pattern, since the zfhmin extension also includes half floating-point load/store instructions. So, I added an ANYLSF iterator and applied it to the local_pic_load/store define_insns. I have checked other ANYF usage scenarios and believe this is the only place that needs to be corrected. I may have missed something; please correct me if so. Thanks.
RTL 1: (insn 6 3 7 2 (set (reg:DI 137) (high:DI (symbol_ref/u:DI (".LC0") [flags 0x82]))) "/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":7:1 discrim 3 179 {movdi_64bit} (nil)) (insn 7 6 8 2 (set (reg:HF 136) (mem/u/c:HF (lo_sum:DI (reg:DI 137) (symbol_ref/u:DI (".LC0") [flags 0x82])) [0 S2 A16])) "/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":7:1 discrim 3 126 {movhf_hardfloat} (expr_list:REG_EQUAL (const_double:HF 8.8828125e+0 [0x0.8e2p+4]) (nil))) RTL 2: (insn 6 3 7 2 (set (reg/f:DI 137) (symbol_ref/u:DI (".LC0") [flags 0x82])) "/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":7:1 discrim 3 179 {movdi_64bit} (nil)) (insn 7 6 8 2 (set (reg:HF 136) (mem/u/c:HF (reg/f:DI 137) [0 S2 A16])) "/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":7:1 discrim 3 126 {movhf_hardfloat} (expr_list:REG_EQUAL (const_double:HF 8.8828125e+0 [0x0.8e2p+4]) (nil))) (insn 8 7 9 2 (set (reg:V2HF 135) (if_then_else:V2HF (unspec:V2BI [ (const_vector:V2BI [ (const_int 1 [0x1]) repeated x2 ]) (const_int 2 [0x2]) repeated x3 (const_int 0 [0]) (reg:SI 66 vl) (reg:SI 67 vtype) ] UNSPEC_VPREDICATE) (vec_duplicate:V2HF (reg:HF 136)) (unspec:V2HF [ (reg:SI 0 zero) ] UNSPEC_VUNDEF))) "/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":6:1 discrim 3 1389 {pred_broadcastv2hf} (nil)) Best, Lehua gcc/ChangeLog: * config/riscv/iterators.md (TARGET_HARD_FLOAT || TARGET_ZFINX): New. * config/riscv/pic.md (local_pic_load<ANYF:mode>): Change ANYF. (local_pic_load<ANYLSF:mode>): To ANYLSF. (local_pic_load_32d<ANYF:mode>): Ditto. (local_pic_load_32d<ANYLSF:mode>): Ditto. (local_pic_store<ANYF:mode>): Ditto. (local_pic_store<ANYLSF:mode>): Ditto.
(local_pic_store_32d<ANYF:mode>): Ditto. (local_pic_store_32d<ANYLSF:mode>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/_Float16-zfhmin-4.c: New test. * gcc.target/riscv/_Float16-zhinxmin-4.c: New test.
2023-08-18	RISC-V: Revert the convert from vmv.s.x to vmv.v.i	Lehua Ding	4	-19/+70
	Hi, This patch reverts the conversion from vmv.s.x to vmv.v.i and adds a new pattern to optimize the special case where the scalar operand is zero. Currently, a broadcast pattern whose scalar operand is an immediate is converted from vmv.s.x to vmv.v.i, and the mask operand is converted from 00..01 to 11..11. The conversion has advantages and disadvantages; after discussing them with Juzhe offline, we chose not to do this transform. Before: Advantages: The vsetvli info required by vmv.s.x has better compatibility since vmv.s.x only requires SEW and VLEN to be zero or one. That means there are more opportunities to combine with other vsetvl infos in the vsetvl pass. Disadvantages: For a non-zero scalar immediate, one more `li rd, imm` instruction is needed. After: Advantages: No `li rd, imm` instruction is needed since vmv.v.i supports an immediate operand. Disadvantages: The converse of the advantages listed under Before: worse compatibility means more vsetvl instructions are needed. Consider the C code below and the asm after autovec: there is an extra insn (vsetivli zero, 1, e32, m1, ta, ma) after converting vmv.s.x to vmv.v.i.
``` int foo1(int* restrict a, int* restrict b, int *restrict c, int n) { int sum = 0; for (int i = 0; i < n; i++) sum += a[i] * b[i]; return sum; } ``` asm (Before): ``` foo1: ble a3,zero,.L7 vsetvli a2,zero,e32,m1,ta,ma vmv.v.i v1,0 .L6: vsetvli a5,a3,e32,m1,tu,ma slli a4,a5,2 sub a3,a3,a5 vle32.v v2,0(a0) vle32.v v3,0(a1) add a0,a0,a4 add a1,a1,a4 vmacc.vv v1,v3,v2 bne a3,zero,.L6 vsetvli a2,zero,e32,m1,ta,ma vmv.s.x v2,zero vredsum.vs v1,v1,v2 vmv.x.s a0,v1 ret .L7: li a0,0 ret ``` asm (After): ``` foo1: ble a3,zero,.L4 vsetvli a2,zero,e32,m1,ta,ma vmv.v.i v1,0 .L3: vsetvli a5,a3,e32,m1,tu,ma slli a4,a5,2 sub a3,a3,a5 vle32.v v2,0(a0) vle32.v v3,0(a1) add a0,a0,a4 add a1,a1,a4 vmacc.vv v1,v3,v2 bne a3,zero,.L3 vsetivli zero,1,e32,m1,ta,ma vmv.v.i v2,0 vsetvli a2,zero,e32,m1,ta,ma vredsum.vs v1,v1,v2 vmv.x.s a0,v1 ret .L4: li a0,0 ret ``` Best, Lehua Co-Authored-By: Ju-Zhe Zhong <juzhe.zhong@rivai.ai> gcc/ChangeLog: * config/riscv/predicates.md (vector_const_0_operand): New. * config/riscv/vector.md (pred_broadcast<mode>_zero): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/scalar_move-5.c: Update. * gcc.target/riscv/rvv/base/scalar_move-6.c: Ditto.
2023-08-18	RISC-V: Forbidden fuse vlmax vsetvl to DEMAND_NONZERO_AVL vsetvl	Lehua Ding	2	-0/+23
	Hi, This little patch fixes the failing testcase (gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c) seen after applying this patch (https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627121.html). The specific reason is that the vsetvl pass has a bug, and this patch forbids the fusion in this case. This patch needs to be committed before that patch to work. Best, Lehua gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (pass_vsetvl::backward_demand_fusion): Forbidden. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c: Address failure due to uninitialized vtype register.
2023-08-18	Daily bump.	GCC Administrator	8	-1/+782

2023-08-18	Revert "libstdc++: Reuse double overload of __convert_to_v if possible"	Jonathan Wakely	1	-6/+0
	This reverts commit aad83d61d2e92b168688f7b6bd00b8604d11fc9f. libstdc++-v3/ChangeLog: * config/locale/generic/c_locale.cc:
2023-08-17	libstdc++: Replace global std::string objects in tzdb.cc	Jonathan Wakely	1	-2/+2
	When the library is built with --disable-libstdcxx-dual-abi the only type of std::string supported is the COW string, and the two global std::string objects in tzdb.cc have to allocate memory. I added them thinking they would fit in the SSO string buffer, but that's not the case when the library only uses COW strings. Replace them with string_view objects to avoid any allocations. libstdc++-v3/ChangeLog: * src/c++20/tzdb.cc (tzdata_file, leaps_file): Change type to std::string_view.
2023-08-17	libstdc++: Reuse double overload of __convert_to_v if possible	Jonathan Wakely	1	-0/+6
	For targets where double and long double have the same representation we can reuse the same __convert_to_v code for both types. This will slightly reduce the size of the compiled code in the library. libstdc++-v3/ChangeLog: * config/locale/generic/c_locale.cc (__convert_to_v): Reuse double overload for long double if possible.
2023-08-17	libstdc++: Micro-optimize construction of named std::locale	Jonathan Wakely	1	-16/+23
	This shaves about 100ns off the std::locale constructor for named locales (which is only about 1% of the total time). Using !*s instead of !strcmp(s, "") doesn't make any difference as GCC optimizes that already even at -O1. !strcmp(s, "C") is optimized at -O2 so replacing that with s[0] == 'C' && s[1] == '\0' only matters for the --enable-libstdcxx-debug builds. But !strcmp(s, "POSIX") always makes a call to strcmp at any optimization level. We make that strcmp call, maybe several times, for any locale name except for "C" (which will be matched before we get to the check for "POSIX"). For most targets, locale names begin with a lowercase letter and the only one that begins with 'P' is "POSIX". Replacing !strcmp(s, "POSIX") with s[0] == 'P' && !strcmp(s+1, "OSIX") means that we avoid calling strcmp unless the string really does match "POSIX". Maybe more importantly, I find is_C_locale(s) easier to read than strcmp(s, "C") == 0 || strcmp(s, "POSIX") == 0, and !is_C_locale(s) easier to read than strcmp(s, "C") != 0 && strcmp(s, "POSIX") != 0. libstdc++-v3/ChangeLog: * src/c++98/localename.cc (is_C_locale): New function. (locale::locale(const char*)): Use is_C_locale.
2023-08-17	libstdc++: Optimize std::string::assign(Iter, Iter) [PR110945]	Jonathan Wakely	1	-4/+38
	Calling string::assign(Iter, Iter) with "foreign" iterators (not the string's own iterator or pointer types) currently constructs a temporary string and then calls replace to copy the characters from it. That means we copy from the iterators twice, and if the replace operation has to grow the string then we also allocate twice. By using *this = basic_string(first, last, get_allocator()) we only perform a single allocation+copy and then do a cheap move assignment instead of a second copy (and possible allocation). But that alternative has to be done conditionally, so that we don't pessimize the native iterator case (the string's own iterator and pointer types) which currently select efficient overloads of replace which will not allocate at all if the string already has sufficient capacity. For C++20 we can extend that efficient case to work for any contiguous iterator with the right value type, not just for the string's native iterators. So the change is to inline the code that decides whether to work in place or to allocate+copy (instead of deciding that via overload resolution for replace), and for the allocate+copy case do a move assignment instead of another call to replace. For C++98 there is no change, as we can't do an efficient move assignment anyway, so keep the current code. We can also simplify assign(initializer_list<CharT>) because the backing array for an initializer_list is always disjunct with *this, so most of the code in _M_replace is not needed. libstdc++-v3/ChangeLog: PR libstdc++/110945 * include/bits/basic_string.h (basic_string::assign(Iter, Iter)): Dispatch to _M_replace or move assignment from a temporary, based on the iterator type.
2023-08-17	libstdc++: Add std::formatter specializations for extended float types	Jonathan Wakely	3	-35/+236
	This makes it possible to format _Float32, _Float64 etc. in C++20 mode. Previously it was only possible to format them in C++23 when the <stdfloat> typedefs and the std::to_chars overloads were defined. Instead of relying on std::to_chars for those types, we can just reuse the formatters for float, double and long double. This also avoids template bloat by reusing the same specializations instead of instantiating __formatter_fp for every different type. libstdc++-v3/ChangeLog: * include/std/format (formatter): Add partial specializations for extended floating-point types. * testsuite/std/format/functions/format.cc: Move test_float128() to ... * testsuite/std/format/formatter/ext_float.cc: New test.
2023-08-17	libstdc++: Define std::numeric_limits<_FloatNN> before C++23	Jonathan Wakely	2	-95/+103
	The extended floating-point types such as _Float32 are supported by GCC prior to C++23, you just can't use the standard-conforming names from <stdfloat> to refer to them. This change defines the specializations of std::numeric_limits for those types for older dialects, not only for C++23. libstdc++-v3/ChangeLog: * include/bits/c++config (__gnu_cxx::__bfloat16_t): Define whenever __BFLT16_DIG__ is defined, not only for C++23. * include/std/limits (numeric_limits<bfloat16_t>): Likewise. (numeric_limits<_Float16>, numeric_limits<_Float32>) (numeric_limits<_Float64>): Likewise for other extended floating-point types.
2023-08-17	libstdc++: Fix -Wunused-parameter in <experimental/internet>	Jonathan Wakely	1	-1/+1
	libstdc++-v3/ChangeLog: * include/experimental/internet (address_v4::to_string): Remove unused parameter name.
2023-08-17	libstdc++: Make __cmp_cat::__unseq constructor consteval	Jonathan Wakely	2	-1/+9
	This constructor should only ever be used with a literal 0 as the argument, so we can make it consteval. This has the nice advantage that it is expanded immediately in the front end, and so GDB will never step into the __cmp_cat::__unseq::__unseq(__unseq*) constructor that is uninteresting and probably confusing to users. libstdc++-v3/ChangeLog: * libsupc++/compare (__cmp_cat::__unseq): Make ctor consteval. * testsuite/18_support/comparisons/categories/zero_neg.cc: Prune excess errors caused by invalid consteval calls.
2023-08-17	libstdc++: Simplify chrono::__units_suffix using std::format	Jonathan Wakely	1	-55/+29
	For std::chrono formatting we can simplify __units_suffix by using std::format_to to generate the "[n/m]s" suffix with the correct character type and write directly to the output iterator, so it doesn't need to be widened using ctype. We can't remove the use of ctype::widen for formatting a time zone abbreviation as a wide string, because that can contain arbitrary characters that can't be widened by __to_wstring_numeric. This also fixes a bug in the chrono formatter for %Z which created a dangling wstring_view. libstdc++-v3/ChangeLog: * include/bits/chrono_io.h (__units_suffix_misc): Remove. (__units_suffix): Return a known suffix as string view, do not write unknown suffixes to a buffer. (__fmt_units_suffix): New function that formats the suffix using std::format_to. (operator<<, __chrono_formatter::_M_q): Use __fmt_units_suffix. (__chrono_formatter::_M_Z): Correct lifetime of wstring.
2023-08-17	libstdc++: Rework std::format support for wchar_t	Jonathan Wakely	2	-36/+82
	This changes how std::format creates wide strings, by replacing uses of std::ctype<wchar_t>::widen with the recently-added __to_wstring_numeric helper function. This removes the dependency on the locale, which should only be used for locale-specific formats such as {:Ld}. Also disable all the wide string formatting support if the _GLIBCXX_USE_WCHAR_T macro is not defined. This is consistent with other wchar_t support being disabled if the library is built without that macro defined. libstdc++-v3/ChangeLog: * include/std/format [_GLIBCXX_USE_WCHAR_T]: Guard all wide string formatters with this macro. (__formatter_int::_M_format_int, __formatter_fp::format) (formatter<const void*, C>::format): Use __to_wstring_numeric instead of std::ctype::widen. (__formatter_fp::_M_localize): Use hardcoded wchar_t values instead of std::ctype::widen. * testsuite/std/format/functions/format.cc: Add more checks for wstring formatting of arithmetic types.
2023-08-17	libstdc++: Implement std::to_string in terms of std::format (P2587R3)	Jonathan Wakely	10	-12/+429
	This change for C++26 affects std::to_string for floating-point arguments, so that they should be formatted using std::format("{}", v) instead of using sprintf. The modified specification in the standard also affects integral arguments, but there's no observable difference for them, and we already use std::to_chars for them anyway. To avoid <string> depending on all of <format>, this change actually just uses std::to_chars directly instead of using std::format. This is equivalent, because the format spec "{}" doesn't use any of the other features of std::format. libstdc++-v3/ChangeLog: * include/bits/basic_string.h (to_string(floating-point-type)): Implement using std::to_chars for C++26. * include/bits/version.def (__cpp_lib_to_string): Define. * include/bits/version.h: Regenerate. * testsuite/21_strings/basic_string/numeric_conversions/char/dr1261.cc: Adjust expected result in C++26 mode. * testsuite/21_strings/basic_string/numeric_conversions/char/to_string.cc: Likewise. * testsuite/21_strings/basic_string/numeric_conversions/wchar_t/dr1261.cc: Likewise. * testsuite/21_strings/basic_string/numeric_conversions/wchar_t/to_wstring.cc: Likewise. * testsuite/21_strings/basic_string/numeric_conversions/char/to_string_float.cc: New test. * testsuite/21_strings/basic_string/numeric_conversions/wchar_t/to_wstring_float.cc: New test. * testsuite/21_strings/basic_string/numeric_conversions/version.cc: New test.
2023-08-17	libstdc++: Optimize std::to_string using std::string::resize_and_overwrite	Jonathan Wakely	2	-52/+123
	This uses std::string::__resize_and_overwrite to avoid initializing the string buffer with characters that are immediately overwritten. This results in about 6% better performance for the std_to_string case in int-benchmark.cc from https://github.com/fmtlib/format-benchmark This requires a change to a testcase. The previous implementation guaranteed that the string returned from std::to_string(integral-type) would have no excess capacity, because it was constructed with the correct length. The new implementation constructs an empty string and then resizes it with resize_and_overwrite, which over-allocates. This means that the "no-excess capacity" guarantee no longer holds. We can also greatly improve the performance of std::to_wstring by using std::to_string and then widening it with a new helper function, instead of using std::swprintf to do the formatting. libstdc++-v3/ChangeLog: * include/bits/basic_string.h (to_string(integral-type)): Use resize_and_overwrite when available. (__to_wstring_numeric): New helper functions. (to_wstring): Use std::to_string then __to_wstring_numeric. * testsuite/21_strings/basic_string/numeric_conversions/char/to_string_int.cc: Remove check for no excess capacity.
2023-08-17	libstdc++: Define std::string::resize_and_overwrite for C++11 and COW string	Jonathan Wakely	8	-59/+214
	There are several places in the library where we can improve performance using resize_and_overwrite so it's inconvenient only being able to use it in C++23 mode, and only for cxx11 strings. This adds it for COW strings, and also adds __resize_and_overwrite as an extension for C++11 mode. The new __resize_and_overwrite is available for C++11 and later, so within the library we can use that consistently even in C++23. In order to avoid making a copy (which might not be possible for non-copyable, non-movable types) the callable is passed to resize_and_overwrite as an lvalue reference. Unlike wrapping it in std::ref(op) this ensures that invoking it as std::move(op)(n, p) will use the correct value category. It also avoids any overhead that would be added by wrapping it in a lambda like [&op](auto p, auto n) { return std::move(op)(p, n); }. Adjust std::format to use the new __resize_and_overwrite, which we can assume exists because we only use std::basic_string<char> and std::basic_string<wchar_t>, so no program-defined specializations. The uses in <experimental/internet> cannot be replaced, because those are type-dependent on an Allocator template parameter, which could mean they use program-defined specializations of std::basic_string that don't have the __resize_and_overwrite extension. libstdc++-v3/ChangeLog: * include/bits/basic_string.h (__resize_and_overwrite): New function. * include/bits/basic_string.tcc (__resize_and_overwrite): New function. (resize_and_overwrite): Simplify by using reserve instead of growing the string manually. Adjust for C++11 compatibility. * include/bits/cow_string.h (resize_and_overwrite): New function. (__resize_and_overwrite): New function. * include/bits/version.def (__cpp_lib_string_resize_and_overwrite): Do not depend on cxx11abi. * include/bits/version.h: Regenerate. * include/std/format (__formatter_fp::_S_resize_and_overwrite): Remove. 
(__formatter_fp::format, __formatter_fp::_M_localize): Use __resize_and_overwrite instead of _S_resize_and_overwrite. * testsuite/21_strings/basic_string/capacity/char/resize_and_overwrite.cc: Adjust for C++11 compatibility when included by ... * testsuite/21_strings/basic_string/capacity/char/resize_and_overwrite_ext.cc: New test.
2023-08-17	Fix range-ops operator_addr.	Andrew MacLeod	2	-1/+49
	Lack of symbolic information prevents op1_range from being able to draw the same conclusions as fold_range can. PR tree-optimization/111009 gcc/ * range-op.cc (operator_addr_expr::op1_range): Be more restrictive. gcc/testsuite/ * gcc.dg/pr111009.c: New.
2023-08-17	RISCV: Add rotate immediate regression test	Patrick O'Neill	2	-0/+40
	This adds new regression tests to ensure half-register rotations are correctly optimized into rori instructions. gcc/testsuite/ChangeLog: * gcc.target/riscv/zbb-rol-ror-08.c: New test. * gcc.target/riscv/zbb-rol-ror-09.c: New test. Co-authored-by: Charlie Jenkins <charlie@rivosinc.com> Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
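For orientation, a half-register rotation on a 64-bit value has the shape sketched below. This is an assumed example of the kind of source pattern involved, not the contents of the new test files; the function name `rotate_half` is hypothetical.

```cpp
#include <cstdint>

// Rotate a 64-bit value by half its width (32 bits). Rotating left or right
// by 32 gives the same result, so a compiler targeting a RISC-V core with
// Zbb can in principle emit a single rotate-immediate instruction.
std::uint64_t rotate_half(std::uint64_t x)
{
    return (x << 32) | (x >> 32);
}
```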
2023-08-17	libstdc++: Implement P2770R0 changes to join_view / join_with_view	Patrick Palka	3	-49/+257
	This C++23 paper fixes an issue in these views when adapting a certain kind of non-forward range, and we treat it as a DR against C++20. Reviewed-by: Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: * include/bits/regex.h (regex_iterator::iterator_concept): Define for C++20 as per P2770R0. (regex_token_iterator::iterator_concept): Likewise. * include/std/ranges (__detail::__as_lvalue): Define. (join_view::_Iterator): Befriend join_view. (join_view::_Iterator::_M_satisfy): Use _M_get_outer instead of _M_outer. (join_view::_Iterator::_M_get_outer): Define. (join_view::_Iterator::_Iterator): Split constructor taking _Parent argument into two as per P2770R0. Remove constraint on default constructor. (join_view::_Iterator::_M_outer): Make this data member present only when the underlying range is forward. (join_view::_Iterator::operator++): Use _M_get_outer instead of _M_outer. (join_view::_Iterator::operator--): Use __as_lvalue helper. (join_view::_Iterator::operator==): Adjust constraints as per P2770R0. (join_view::_Sentinel::__equal): Use _M_get_outer instead of _M_outer. (join_view::_M_outer): New data member when the underlying range is non-forward. (join_view::begin): Adjust definition as per P2770R0. (join_view::end): Likewise. (join_with_view::_M_outer_it): New data member when the underlying range is non-forward. (join_with_view::begin): Adjust definition as per P2770R0. (join_with_view::end): Likewise. (join_with_view::_Iterator::_M_outer_it): Make this data member present only when the underlying range is forward. (join_with_view::_Iterator::_M_get_outer): Define. (join_with_view::_Iterator::_Iterator): Split constructor taking _Parent argument into two as per P2770R0. Remove constraint on default constructor. (join_with_view::_Iterator::_M_update_inner): Adjust definition as per P2770R0. (join_with_view::_Iterator::_M_get_inner): Likewise. (join_with_view::_Iterator::_M_satisfy): Adjust calls to _M_get_inner. 
Use _M_get_outer instead of _M_outer_it. (join_with_view::_Iterator::operator==): Adjust constraints as per P2770R0. (join_with_view::_Sentinel::operator==): Use _M_get_outer instead of _M_outer_it. * testsuite/std/ranges/adaptors/p2770r0.cc: New test.
2023-08-17	libstdc++: Convert _RangeAdaptorClosure into a CRTP base [PR108827]	Patrick Palka	2	-42/+45
	Using the CRTP idiom for this base class avoids bloating the size of a pipeline when adding distinct empty range adaptor closure objects to it, as detailed in section 4.1 of P2387R3. But it means we can no longer define its operator| overloads as hidden friends, since it'd mean each instantiation of _RangeAdaptorClosure introduces its own distinct set of hidden friends. So e.g. for the outer | in x | (views::reverse | views::join) ADL would find 6 distinct hidden operator| friends: two from _RangeAdaptorClosure<_Reverse> two from _RangeAdaptorClosure<_Join> two from _RangeAdaptorClosure<_Pipe<_Reverse, _Join>> but we really only want to consider the last two. We avoid this issue by instead defining the operator| overloads at namespace scope alongside _RangeAdaptorClosure. This should be fine because the only types defined in this namespace are _RangeAdaptorClosure, _RangeAdaptor, _Pipe and _Partial, so we don't have to worry about unintentional ADL. Reviewed-by: Jonathan Wakely <jwakely@redhat.com> PR libstdc++/108827 libstdc++-v3/ChangeLog: * include/std/ranges (__adaptor::_RangeAdaptorClosure): Convert into a CRTP class template. Move hidden operator| friends into namespace scope and adjust their constraints. (__closure::__is_range_adaptor_closure_fn): Define. (__closure::__is_range_adaptor_closure): Define. (__adaptor::_Partial): Adjust use of _RangeAdaptorClosure. (__adaptor::_Pipe): Likewise. (views::_All): Likewise. (views::_Join): Likewise. (views::_Common): Likewise. (views::_Reverse): Likewise. (views::_Elements): Likewise. (views::_Adjacent): Likewise. (views::_AsRvalue): Likewise. (views::_Enumerate): Likewise. (views::_AsConst): Likewise. * testsuite/std/ranges/adaptors/all.cc: Reinstate assertion expecting that adding empty range adaptor closure objects to a pipeline doesn't increase the size of a pipeline.
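The design described in this commit can be sketched in miniature as below: a CRTP base class template, with the `operator|` overloads declared once at namespace scope instead of as hidden friends of each instantiation. Every name here (`sketch::Closure`, `Pipe`, `AddOne`, `Twice`, `is_closure_v`, `demo`) is made up for this illustration, and the closures act on plain `int` values rather than ranges; it is not the libstdc++ code.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace sketch {

// CRTP base: every closure type derives from Closure<itself>.
template<typename Derived>
struct Closure {};

// True if T (ignoring cv/ref qualifiers) derives from Closure<T>.
template<typename T>
inline constexpr bool is_closure_v =
    std::is_base_of_v<Closure<std::remove_cv_t<std::remove_reference_t<T>>>,
                      std::remove_cv_t<std::remove_reference_t<T>>>;

// Composition of two closures; itself a closure.
template<typename Lhs, typename Rhs>
struct Pipe : Closure<Pipe<Lhs, Rhs>> {
    Lhs lhs;
    Rhs rhs;
    constexpr Pipe(Lhs l, Rhs r) : lhs(l), rhs(r) {}
    template<typename T>
    constexpr auto operator()(T&& t) const { return rhs(lhs(std::forward<T>(t))); }
};

// value | closure: apply the closure. Declared once at namespace scope.
template<typename T, typename C,
         std::enable_if_t<!is_closure_v<T> && is_closure_v<C>, int> = 0>
constexpr auto operator|(T&& t, const C& c) { return c(std::forward<T>(t)); }

// closure | closure: compose into a Pipe. Also declared only once.
template<typename L, typename R,
         std::enable_if_t<is_closure_v<L> && is_closure_v<R>, int> = 0>
constexpr auto operator|(const L& l, const R& r) { return Pipe<L, R>(l, r); }

struct AddOne : Closure<AddOne> {
    constexpr int operator()(int x) const { return x + 1; }
};
struct Twice : Closure<Twice> {
    constexpr int operator()(int x) const { return x * 2; }
};

// Small usage check: ADL finds the two namespace-scope overloads only.
inline void demo()
{
    assert((3 | AddOne{}) == 4);
    assert((3 | (AddOne{} | Twice{})) == 8);
}

} // namespace sketch
```

Because the overloads live at namespace scope, overload resolution considers the same two candidates no matter how many distinct `Closure<...>` instantiations appear in the expression.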
2023-08-17	[LRA]: When assigning stack slots to pseudos previously assigned to fp consider other spilled pseudos	Vladimir N. Makarov	1	-1/+2
	The previous LRA patch can assign slot of conflicting pseudos to pseudos spilled after prohibiting fp->sp elimination. This patch fixes this problem. gcc/ChangeLog: * lra-spills.cc (assign_stack_slot_num_and_sort_pseudos): Moving slots_num initialization from here ... (lra_spill): ... to here before the 1st call of assign_stack_slot_num_and_sort_pseudos. Add the 2nd call after fp->sp elimination.
2023-08-17	Add warning options -W[no-]compare-distinct-pointer-types	Jose E. Marchesi	6	-3/+111
	GCC emits pedwarns unconditionally when comparing pointers of different types, for example: int xdp_context (struct xdp_md *xdp) { void *data = (void *)(long)xdp->data; __u32 *metadata = (void *)(long)xdp->data_meta; __u32 ret; if (metadata + 1 > data) return 0; return 1; } /home/jemarch/foo.c: In function ‘xdp_context’: /home/jemarch/foo.c:15:20: warning: comparison of distinct pointer types lacks a cast 15 | if (metadata + 1 > data) | ^ LLVM supports an option -W[no-]compare-distinct-pointer-types that can be used in order to enable or disable the emission of such warnings. It is enabled by default. This patch adds the same options to GCC. Documentation and testsuite updates are included. Regtested in x86_64-linux-gnu. No regressions observed. gcc/ChangeLog: PR c/106537 * doc/invoke.texi (Option Summary): Mention -Wcompare-distinct-pointer-types under `Warning Options'. (Warning Options): Document -Wcompare-distinct-pointer-types. gcc/c-family/ChangeLog: PR c/106537 * c.opt (Wcompare-distinct-pointer-types): New option. gcc/c/ChangeLog: PR c/106537 * c-typeck.cc (build_binary_op): Warning on comparing distinct pointer types only when -Wcompare-distinct-pointer-types. gcc/testsuite/ChangeLog: PR c/106537 * gcc.c-torture/compile/pr106537-1.c: New test. * gcc.c-torture/compile/pr106537-2.c: Likewise. * gcc.c-torture/compile/pr106537-3.c: Likewise.
2023-08-17	Fix code_helper unused argument warning for fr30	Jan-Benedict Glaw	1	-1/+1
	fr30 is the only target defining GO_IF_LEGITIMATE_ADDRESS right now, in which case the `code_helper ch` argument to memory_address_addr_space_p() is unused and emits a new warning. gcc/ChangeLog: * recog.cc (memory_address_addr_space_p): Mark possibly unused argument as unused.
2023-08-17	[PATCH] RISC-V: Deduplicate #error messages in testsuite	Tsukasa OI	16	-104/+104
	"#error Feature macro not defined" is required to test the existence of an extension through the preprocessor. However, multiple occurrences of the exact same error message will confuse the developer once an error is encountered. This commit replaces such error messages with "#error Feature macro for `EXT' not defined" to make clear which macro is missing. gcc/testsuite/ChangeLog: * gcc.target/riscv/zvkn.c: Deduplicate #error messages. * gcc.target/riscv/zvkn-1.c: Ditto. * gcc.target/riscv/zvknc.c: Ditto. * gcc.target/riscv/zvknc-1.c: Ditto. * gcc.target/riscv/zvknc-2.c: Ditto. * gcc.target/riscv/zvkng.c: Ditto. * gcc.target/riscv/zvkng-1.c: Ditto. * gcc.target/riscv/zvkng-2.c: Ditto. * gcc.target/riscv/zvks.c: Ditto. * gcc.target/riscv/zvks-1.c: Ditto. * gcc.target/riscv/zvksc.c: Ditto. * gcc.target/riscv/zvksc-1.c: Ditto. * gcc.target/riscv/zvksc-2.c: Ditto. * gcc.target/riscv/zvksg.c: Ditto. * gcc.target/riscv/zvksg-1.c: Ditto. * gcc.target/riscv/zvksg-2.c: Ditto.
2023-08-17	tree-optimization/111039 - abnormals and bit test merging	Richard Biener	2	-0/+22
	The following guards the bit test merging code in if-combine against the appearance of SSA names used in abnormal PHIs. PR tree-optimization/111039 * tree-ssa-ifcombine.cc (ifcombine_ifandif): Check for SSA_NAME_OCCURS_IN_ABNORMAL_PHI. * gcc.dg/pr111039.c: New testcase.
2023-08-17	libgomp: call numa_available first when using libnuma	Tobias Burnus	1	-0/+11
	The documentation requires that numa_available() is called first; only if it succeeds may other libnuma functions be called. Internally, it does a syscall to get_mempolicy with flag=0 (which would return the default policy if mode were not NULL). If this returns -1 (and not 0) and errno == ENOSYS, the Linux kernel does not have the get_mempolicy syscall function; if so, numa_available() returns -1 (otherwise: 0). libgomp/ PR libgomp/111024 * allocator.c (gomp_init_libnuma): Call numa_available; if not available or not returning 0, disable libnuma usage.
2023-08-17	doc: Fixes to RTL-SSA sample code	Alex Coplan	1	-12/+12
	This patch fixes up the code examples in the RTL-SSA documentation (the sections on making insn changes) to reflect the current API. The main issues are as follows: - rtl_ssa::recog takes an obstack_watermark & as the first parameter. Presumably this is intended to be the change attempt, so I've updated the examples to pass this through. - The variants of recog and restrict_movement that take an ignore predicate have been renamed with an _ignoring suffix, so I've updated callers to use those names. - A couple of minor "obvious" fixes to add a missing address-of operator and correct a variable name. gcc/ChangeLog: * doc/rtl.texi: Fix up sample code for RTL-SSA insn changes.
2023-08-17	RISC-V: Fix XPASS slp testcases	Lehua Ding	10	-25/+36
	This patch fixes XPASS slp testcases on trunk by making the conditions for xfail stricter. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/partial/slp-1.c: Fix. * gcc.target/riscv/rvv/autovec/partial/slp-16.c: Ditto. * gcc.target/riscv/rvv/autovec/partial/slp-17.c: Ditto. * gcc.target/riscv/rvv/autovec/partial/slp-18.c: Ditto. * gcc.target/riscv/rvv/autovec/partial/slp-19.c: Ditto. * gcc.target/riscv/rvv/autovec/partial/slp-2.c: Ditto. * gcc.target/riscv/rvv/autovec/partial/slp-3.c: Ditto. * gcc.target/riscv/rvv/autovec/partial/slp-4.c: Ditto. * gcc.target/riscv/rvv/autovec/partial/slp-5.c: Ditto. * gcc.target/riscv/rvv/autovec/partial/slp-6.c: Ditto.
2023-08-17	bpf: support `naked' function attributes in BPF targets	Jose E. Marchesi	3	-0/+48
	The kernel selftests and other BPF programs make extensive use of the `naked' function attribute with bodies written using basic inline assembly. This patch adds support for the attribute to bpf-unknown-none, makes it inhibit warnings due to lack of explicit `return' statement, and updates documentation and testsuite accordingly. Tested in x86_64-linux-gnu host and bpf-unknown-none target. gcc/ChangeLog PR target/111046 * config/bpf/bpf.cc (bpf_attribute_table): Add entry for the `naked' function attribute. (bpf_warn_func_return): New function. (TARGET_WARN_FUNC_RETURN): Define. (bpf_expand_prologue): Add preventive comment. (bpf_expand_epilogue): Likewise. * doc/extend.texi (BPF Function Attributes): Document the `naked' function attribute. gcc/testsuite/ChangeLog * gcc.target/bpf/naked-1.c: New test.
2023-08-17	libstdc++: Fix std::format("{:F}", inf) to use uppercase	Jonathan Wakely	2	-2/+20
	std::format was treating {:f} and {:F} identically on the basis that for the fixed 1.234567 format there are no alphabetical characters that need to be in uppercase. But that's wrong for infinities and NaNs, which should be formatted as "INF" and "NAN" for {:F}. libstdc++-v3/ChangeLog: * include/std/format (__format::_Pres_type): Add _Pres_F. (__formatter_fp::parse): Use _Pres_F for 'F'. (__formatter_fp::format): Set __upper for _Pres_F. * testsuite/std/format/functions/format.cc: Check formatting of infinity and NaN for each presentation type.
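The distinction this commit restores can be illustrated with the sketch below. The helper name `fixed_sketch` is hypothetical. It uses `std::format` when the library provides it and otherwise falls back to `snprintf` with `%f` / `%F`, which is a stand-in with the same lowercase/uppercase distinction, not the mechanism `std::format` itself uses.

```cpp
#include <cstdio>
#include <limits>
#include <string>
#include <version>
#ifdef __cpp_lib_format
#include <format>
#endif

// Hypothetical helper: format v with the 'f' (lowercase) or 'F' (uppercase)
// fixed presentation type.
std::string fixed_sketch(double v, bool upper)
{
#ifdef __cpp_lib_format
    return upper ? std::format("{:F}", v) : std::format("{:f}", v);
#else
    // Fallback when std::format is unavailable: printf-style %f / %F.
    char buf[400]; // assumed large enough for any finite double in fixed notation
    std::snprintf(buf, sizeof buf, upper ? "%F" : "%f", v);
    return buf;
#endif
}

// Convenience constant for trying out the infinity case.
inline constexpr double inf_sketch = std::numeric_limits<double>::infinity();
```

For finite values both presentation types give the same text, for example `"1.500000"` for 1.5. For `inf_sketch`, `{:F}` is specified to give `"INF"`; a libstdc++ build that predates this fix gives `"inf"` instead.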
2023-08-17	libstdc++: Regenerate Makefile.in	Jonathan Wakely	1	-3/+3
	libstdc++-v3/ChangeLog: * include/Makefile.in: Regenerate.
2023-08-17	Handle TYPE_OVERFLOW_UNDEFINED vectorized BB reductions	Richard Biener	2	-14/+109
	The following changes the gate to perform vectorization of BB reductions to use needs_fold_left_reduction_p which in turn requires handling TYPE_OVERFLOW_UNDEFINED types in the epilogue code generation by promoting any operations generated there to use unsigned arithmetic. The following does this. There's currently only v16qi where x86 supports a .REDUC_PLUS reduction for integral modes, so I had to add an x86-specific testcase using GIMPLE IL. * tree-vect-slp.cc (vect_slp_check_for_roots): Use !needs_fold_left_reduction_p to decide whether we can handle the reduction with association. (vectorize_slp_instance_root_stmt): For TYPE_OVERFLOW_UNDEFINED reductions perform all arithmetic in an unsigned type. * gcc.target/i386/vect-reduc-2.c: New testcase.
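The reason for switching to unsigned arithmetic can be seen in a source-level sketch: reassociating a signed sum may create an intermediate overflow that the original evaluation order never hits, while the same sum done in unsigned arithmetic wraps harmlessly and yields the same final value. The function name `reduce_reassociated` and the fixed length of four elements are assumptions for this example; the actual transformation happens on GIMPLE inside the vectorizer.

```cpp
#include <climits>

// Sum four ints in a reassociated order, (v0 + v2) + (v1 + v3), doing the
// arithmetic in unsigned so that intermediate wrap-around is well defined.
int reduce_reassociated(const int* v)
{
    unsigned lo = static_cast<unsigned>(v[0]) + static_cast<unsigned>(v[2]);
    unsigned hi = static_cast<unsigned>(v[1]) + static_cast<unsigned>(v[3]);
    // Converting back to int is modular on GCC (and guaranteed since C++20).
    return static_cast<int>(lo + hi);
}
```

For the input `{INT_MAX, -1, 1, -1}` the left-to-right signed sum never overflows and equals `INT_MAX - 1`, but the reassociated partial sum `v0 + v2` would overflow as signed arithmetic. The unsigned version still returns `INT_MAX - 1`.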
2023-08-17	testsuite: Remove unused dg-line in ce8cdf5bcf96a2db6d7b9f656fc9ba58d7942a83	benjamin priour	1	-1/+1
	Test case g++.dg/analyzer/fanalyzer-show-events-in-system-headers.C introduced by patch ce8cdf5bcf96a2db6d7b9f656fc9ba58d7942a83 emitted a warning for an unused dg-line variable. This fixes up the blunder. Signed-off-by: benjamin priour <vultkayn@gcc.gnu.org> gcc/testsuite/ChangeLog: * g++.dg/analyzer/fanalyzer-show-events-in-system-headers.C: Remove dg-line var declare_a.