Age | Commit message | Author | Files | Lines
2024-01-23 | gcn: Fix a warning | Jakub Jelinek | 1 | -1/+2
I see ../../gcc/config/gcn/gcn.cc: In function ‘void gcn_hsa_declare_function_name(FILE*, const char*, tree)’: ../../gcc/config/gcn/gcn.cc:6568:67: warning: unused parameter ‘decl’ [-Wunused-parameter] 6568 | gcn_hsa_declare_function_name (FILE *file, const char *name, tree decl) | ~~~~~^~~~ warning presumably since r14-6945-gc659dd8bfb55e02a1b97407c1c28f7a0e8f7f09b Previously, the argument was anonymous, but now it is passed to a macro which ignores it, so I think we should go with ATTRIBUTE_UNUSED. 2024-01-23 Jakub Jelinek <jakub@redhat.com> * config/gcn/gcn.cc (gcn_hsa_declare_function_name): Add ATTRIBUTE_UNUSED to decl.
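The situation described above can be reproduced outside GCC. The following is a minimal sketch, assuming a compiler that accepts GNU-style attributes (GCC or Clang); DEMO_ATTRIBUTE_UNUSED, DEMO_DECLARE_NAME and declare_function_name are hypothetical stand-ins, not the names used in gcn.cc.

```cpp
#include <cassert>
#include <string>

// Stand-in for GCC's ATTRIBUTE_UNUSED (assumption: GNU attribute syntax works).
#define DEMO_ATTRIBUTE_UNUSED __attribute__((__unused__))

// Hypothetical macro that accepts a DECL argument but never expands it,
// so the parameter passed as DECL counts as unused.
#define DEMO_DECLARE_NAME(OUT, NAME, DECL) ((OUT) += (NAME))

// Without the attribute, -Wunused-parameter would warn about 'decl' here.
std::string declare_function_name(std::string out, const char *name,
                                  int decl DEMO_ATTRIBUTE_UNUSED)
{
  DEMO_DECLARE_NAME(out, name, decl);
  return out;
}

// Small usage example.
void demo_unused_parameter()
{
  assert(declare_function_name("fn:", "foo", 0) == "fn:foo");
}
```

The attribute keeps the parameter name available for the macro call while telling the compiler that an unused parameter is intentional.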
2024-01-23 | debug/107058 - gracefully handle unexpected DIE contexts | Richard Biener | 2 | -4/+18
While the bug persists that LTO streaming picks up a CONST_DECL from an attribute argument on a VAR_DECL, which with -fdebug-type-section refers to a DIE in a type unit, we can handle this gracefully, at least with -fno-checking. Do so. The C++ frontend nevertheless should resolve the CONST_DECL attribute argument to a constant. PR debug/107058 * dwarf2out.cc (dwarf2out_die_ref_for_decl): Gracefully handle unexpected but bogus DIE contexts when checking is not enabled. * c-c++-common/pr107058.c: New testcase.
2024-01-23 | c++: Fix handling of extern templates in modules [PR112820] | Nathaniel Shead | 6 | -16/+64
Currently, extern templates are detected by looking for the DECL_EXTERNAL flag on a TYPE_DECL. However, this is incorrect: TYPE_DECLs don't actually set this flag, and it happens to work by coincidence due to TYPE_DECL_SUPPRESS_DEBUG happening to use the same underlying bit. This however causes issues with other TYPE_DECLs that also happen to have suppressed debug information. Instead, this patch reworks the logic so CLASSTYPE_INTERFACE_ONLY is always emitted into the module BMI and can then be used to check for an extern template correctly. Otherwise, for other declarations we always want to redetermine this: even for declarations from the GMF, we may change our mind on whether to import or export depending on decisions made later in the TU after importing so we shouldn't decide this now, or necessarily reuse what the module we'd imported had decided. Some of this may need to change in the future to account for https://github.com/itanium-cxx-abi/cxx-abi/issues/170. PR c++/112820 PR c++/102607 gcc/cp/ChangeLog: * module.cc (trees_out::lang_type_bools): Write interface_only and interface_unknown. (trees_in::lang_type_bools): Read the above flags. (trees_in::decl_value): Reset CLASSTYPE_INTERFACE_* except for extern templates. (trees_in::read_class_def): Remove buggy extern template handling. gcc/testsuite/ChangeLog: * g++.dg/modules/debug-2_a.C: New test. * g++.dg/modules/debug-2_b.C: New test. * g++.dg/modules/debug-2_c.C: New test. * g++.dg/modules/debug-3_a.C: New test. * g++.dg/modules/debug-3_b.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
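The pitfall of two accessors sharing one underlying bit can be illustrated with a toy model. This is only a sketch: Node, decl_external, suppress_debug and the two is_extern_template_* helpers are hypothetical and do not mirror GCC's tree layout or the code in module.cc.

```cpp
#include <cassert>

// Toy declaration node: one physical flag bit is reused for two meanings.
enum class Kind { Var, Type };

struct Node {
  Kind kind;
  bool shared_flag;     // "external" for Var nodes, "suppress debug" for Type nodes
  bool interface_only;  // dedicated flag, written and read explicitly
};

// Both accessors read the same bit; only the node kind says which is valid.
inline bool decl_external(const Node &n) { return n.shared_flag; }
inline bool suppress_debug(const Node &n) { return n.shared_flag; }

// Buggy check: treats the shared bit on a Type node as "extern template".
inline bool is_extern_template_buggy(const Node &n)
{
  return n.kind == Kind::Type && decl_external(n);
}

// Fixed check: consults the dedicated flag instead.
inline bool is_extern_template_fixed(const Node &n)
{
  return n.kind == Kind::Type && n.interface_only;
}

void demo_shared_bit()
{
  // A type with suppressed debug info that is not an extern template.
  Node t{Kind::Type, true, false};
  assert(suppress_debug(t));
  assert(is_extern_template_buggy(t));   // false positive
  assert(!is_extern_template_fixed(t));  // correct answer
}
```

The buggy check works only as long as no other type happens to have the shared bit set, which matches the "works by coincidence" description above.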
2024-01-23 | fold-const: Fold larger VIEW_CONVERT_EXPRs [PR113462] | Jakub Jelinek | 1 | -5/+17
On Mon, Jan 22, 2024 at 11:27:52AM +0100, Richard Biener wrote: > We run into > > static tree > native_interpret_int (tree type, const unsigned char *ptr, int len) > { > ... > if (total_bytes > len > || total_bytes * BITS_PER_UNIT > HOST_BITS_PER_DOUBLE_INT) > return NULL_TREE; > > OTOH using a V_C_E to "truncate" a _BitInt looks wrong? OTOH the > check doesn't really handle native_encode_expr using the "proper" > wide_int encoding however that's exactly handled. So it might be > a pre-existing issue that's only uncovered by large _BitInts > (__int128 might show similar issues?) I guess the || total_bytes * BITS_PER_UNIT > HOST_BITS_PER_DOUBLE_INT conditions make no sense, all we care is whether it fits in the buffer or not. But then there is fold_view_convert_expr (and other spots) which use /* We support up to 1024-bit values (for GCN/RISC-V V128QImode). */ unsigned char buffer[128]; or something similar. This patch fixes even that by using a XALLOCAVEC allocated buffer if the type size is 129 .. 8192 bytes. 2024-01-22 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/113462 * fold-const.cc (native_interpret_int): Don't punt if total_bytes is larger than HOST_BITS_PER_DOUBLE_INT / BITS_PER_UNIT. (fold_view_convert_expr): Use XALLOCAVEC buffers for types with sizes between 129 and 8192 bytes.
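The buffer strategy described above (a fixed 128-byte buffer for the common case, a larger allocation for sizes up to 8192 bytes, and giving up beyond that) can be sketched as follows. This is an illustration only: the real patch uses XALLOCAVEC (stack allocation), for which std::vector is swapped in here, and reinterpret_bytes, kStackBytes and kMaxBytes are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

constexpr std::size_t kStackBytes = 128;   // fixed buffer size from the text
constexpr std::size_t kMaxBytes = 8192;    // upper bound from the text

// Round-trips LEN bytes through a scratch buffer, as an encode/interpret
// pair would. Returns false ("punt") when the value is too large.
bool reinterpret_bytes(const unsigned char *src, unsigned char *dst,
                       std::size_t len)
{
  if (len > kMaxBytes)
    return false;

  unsigned char stack_buf[kStackBytes];
  std::vector<unsigned char> big;          // only used for 129..8192 bytes
  unsigned char *buf = stack_buf;
  if (len > kStackBytes) {
    big.resize(len);
    buf = big.data();
  }

  std::memcpy(buf, src, len);              // stands in for native encoding
  std::memcpy(dst, buf, len);              // stands in for native interpretation
  return true;
}

void demo_scratch_buffer()
{
  std::vector<unsigned char> in(200, 7), out(200, 0);
  assert(reinterpret_bytes(in.data(), out.data(), in.size()));
  assert(out == in);
}
```

The only check that matters is whether the value fits in the buffer, which is the point made in the message about dropping the HOST_BITS_PER_DOUBLE_INT condition.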
2024-01-23 | LoongArch: Disable explicit reloc for TLS LD/GD with -mexplicit-relocs=auto | Xi Ruoyao | 2 | -6/+7
Binutils 2.42 supports TLS LD/GD relaxation which requires the assembler macro. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_explicit_relocs_p): If la_opt_explicit_relocs is EXPLICIT_RELOCS_AUTO, return false for SYMBOL_TLS_LDM and SYMBOL_TLS_GD. (loongarch_call_tls_get_addr): Do not split symbols of SYMBOL_TLS_LDM or SYMBOL_TLS_GD if la_opt_explicit_relocs is EXPLICIT_RELOCS_AUTO. gcc/testsuite/ChangeLog: * gcc.target/loongarch/explicit-relocs-auto-tls-ld-gd.c: Check for la.tls.ld and la.tls.gd.
2024-01-23 | find_base_value part | Richard Biener | 1 | -58/+4
The following adjusts find_base_value in the same way find_base_term was adjusted for PR113255. * alias.cc (known_base_value_p): Remove. (find_base_value): Remove PLUS/MINUS handling when both operands are not CONST_INT_P.
2024-01-23 | rtl-optimization/113255 - base_alias_check vs. pointer difference | Richard Biener | 2 | -23/+32
When the x86 backend generates code for cpymem with the rep_8byte strategy, for the 8-byte-aligned main rep movq it needs to compute an adjusted pointer to the source after doing a prologue aligning the destination. It computes that via src_ptr + (dest_ptr - orig_dest_ptr), which is perfectly fine. On RTL this is then

    8: r134:DI=const(`g'+0x44)
    9: {r133:DI=frame:DI-0x4c;clobber flags:CC;}
       REG_UNUSED flags:CC
    56: r129:DI=const(`g'+0x4c)
    57: {r129:DI=r129:DI&0xfffffffffffffff8;clobber flags:CC;}
       REG_UNUSED flags:CC
       REG_EQUAL const(`g'+0x4c)&0xfffffffffffffff8
    58: {r118:DI=r134:DI-r129:DI;clobber flags:CC;}
       REG_DEAD r134:DI
       REG_UNUSED flags:CC
       REG_EQUAL const(`g'+0x44)-r129:DI
    59: {r119:DI=r133:DI-r118:DI;clobber flags:CC;}
       REG_DEAD r133:DI
       REG_UNUSED flags:CC

but as written find_base_term happily picks the first candidate it finds for the MINUS, which means it picks const(`g') rather than the correct frame:DI. The pointer analysis that find_base_term performs this way (and also the unfixed find_base_value used by init_alias_analysis to initialize REG_BASE_VALUE) isn't sound.

The following restricts the handling of multi-operand operations to the case where we know only one operand can be a pointer. This for example causes gcc.dg/tree-ssa/pr94969.c to miss some RTL PRE (I've opened PR113395 for this). A more drastic patch, removing base_alias_check, results in only gcc.dg/guality/pr41447-1.c regressing (so testsuite coverage is bad). I've looked at gcc.dg/tree-ssa tests and mostly scheduling changes are present; the cc1plus .text size is only 230 bytes worse. With this less drastic patch most scheduling changes are gone. x86_64 might not be the very best target to test for impact, but test coverage on other targets is unlikely to be very much better.

PR rtl-optimization/113255 * alias.cc (find_base_term): Remove PLUS/MINUS handling when both operands are not CONST_INT_P. * gcc.dg/torture/pr113255.c: New testcase.
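Why picking the first base candidate of a PLUS or MINUS is unsound can be shown on a toy expression tree. The sketch below is not GCC's RTL or alias.cc: Expr, base_naive and base_restricted are hypothetical, and an empty string stands for "base unknown".

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

// Toy expression: a named symbol, an integer constant, or a binary +/-.
struct Expr {
  enum Kind { Symbol, Const, Plus, Minus };
  Kind kind;
  std::string name;   // only meaningful for Symbol
  ExprPtr lhs, rhs;   // only meaningful for Plus/Minus
};

ExprPtr sym(const std::string &n)
{
  return std::make_shared<Expr>(Expr{Expr::Symbol, n, nullptr, nullptr});
}
ExprPtr cst()
{
  return std::make_shared<Expr>(Expr{Expr::Const, "", nullptr, nullptr});
}
ExprPtr bin(Expr::Kind k, ExprPtr a, ExprPtr b)
{
  return std::make_shared<Expr>(Expr{k, "", a, b});
}

// Naive analysis: take the first operand that yields any base.
std::string base_naive(const ExprPtr &e)
{
  if (e->kind == Expr::Symbol) return e->name;
  if (e->kind == Expr::Const) return "";
  std::string b = base_naive(e->lhs);
  return b.empty() ? base_naive(e->rhs) : b;
}

// Restricted analysis: look through +/- only when the other operand is a
// constant, so at most one operand can be the pointer.
std::string base_restricted(const ExprPtr &e)
{
  if (e->kind == Expr::Symbol) return e->name;
  if (e->kind == Expr::Const) return "";
  if (e->rhs->kind == Expr::Const) return base_restricted(e->lhs);
  if (e->kind == Expr::Plus && e->lhs->kind == Expr::Const)
    return base_restricted(e->rhs);
  return "";  // conservative: base unknown
}

void demo_base_analysis()
{
  // (dest - orig_dest) + src, where dest and orig_dest are offsets from g
  // and src is an offset-free frame pointer.
  ExprPtr dest = bin(Expr::Plus, sym("g"), cst());
  ExprPtr orig = bin(Expr::Plus, sym("g"), cst());
  ExprPtr e = bin(Expr::Plus, bin(Expr::Minus, dest, orig), sym("frame"));
  assert(base_naive(e) == "g");        // wrong: the result points into frame
  assert(base_restricted(e).empty());  // conservative: no base claimed
}
```

The restricted variant gives up in exactly the case the message describes, where a pointer difference is mixed with another pointer, and returning "unknown" is safe where returning the wrong base is not.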
2024-01-23 | debug/112718 - reset all type units with -ffat-lto-objects | Richard Biener | 2 | -12/+12
When mixing -flto, -ffat-lto-objects and -fdebug-type-section we fail to reset all type units after early output resulting in an ICE when attempting to add then duplicate sibling attributes. PR debug/112718 * dwarf2out.cc (dwarf2out_finish): Reset all type units for the fat part of an LTO compile. * gcc.dg/debug/pr112718.c: New testcase.
2024-01-23 | LoongArch: doc: Add attribute descriptions defined in the target-supports.exp. | chenxiaolong | 1 | -0/+20
gcc/ChangeLog: * doc/sourcebuild.texi: Add attributes for keywords.
2024-01-22 | compiler: don't pass iota value to lowering pass | Ian Lance Taylor | 4 | -40/+37
It is no longer used. The iota value is now handled in the determine-types pass. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/536644
2024-01-23 | Correct lists of options enabled by -Wall and -Wextra [PR90463] | Sandra Loosemore | 1 | -29/+50
gcc/ChangeLog PR c++/90463 * doc/invoke.texi (Warning Options): Correct lists of options enabled by -Wall and -Wextra by checking against common.opt and c-family/c.opt.
2024-01-23 | Sort warning options in c-family/c.opt. | Sandra Loosemore | 1 | -244/+244
In spite of the plea "Please try to keep this file in ASCII collating order" at the top of the file, the sorting of the entries for the various -Wfoo options had increased in entropy. This made it hard to correlate them against e.g. the list of options enabled by -Wall in the manual (see PR90463). This patch is at least a step in the right direction to restore order to the file. I confirmed that no lines were added or removed by these changes, and that the output of "gcc -Q --help=warnings" is unchanged both with or without -Wall added to the command. gcc/c-family/ChangeLog * c.opt: Improve sorting of warning options.
2024-01-23 | Daily bump. | GCC Administrator | 5 | -1/+104
2024-01-22 | c++: extend Wdangling-reference17.C [PR109642] | Marek Polacek | 1 | -0/+12
This patch extends g++.dg/warn/Wdangling-reference17.C with code from PR109642. I'm not creating a new test because this one already #includes the required headers. PR c++/109642 gcc/testsuite/ChangeLog: * g++.dg/warn/Wdangling-reference17.C: Additional testing.
2024-01-22 | Add -gno-strict-dwarf to dg-options in various btf enum tests | John David Anglin | 4 | -4/+4
The -gno-strict-dwarf option is needed to ensure enum signedness is added to type_die. 2024-01-22 John David Anglin <danglin@gcc.gnu.org> gcc/testsuite/ChangeLog: PR debug/113382 * gcc.dg/debug/btf/btf-bitfields-3.c: Add -gno-strict-dwarf option to dg-options. * gcc.dg/debug/btf/btf-enum-1.c: Likewise. * gcc.dg/debug/btf/btf-enum-small.c: Likewise. * gcc.dg/debug/btf/btf-enum64-1.c: Likewise.
2024-01-22 | arm: Fix parsecpu.awk for aliases [PR113030] | Andrew Pinski | 1 | -2/+2
The problem here is that the two functions check_cpu and check_arch use the wrong variable to check whether an alias is valid for that cpu/arch. check_cpu uses cpu_optaliases instead of cpu_opt_alias. cpu_optaliases is an array indexed by the cpuname that contains all of the valid aliases for that cpu, whereas cpu_opt_alias is a doubly indexed array, indexed by cpuname and the alias, which gives what the alias stands for that option. The same thing happens for check_arch with arch_optaliases vs. arch_opt_alias. Tested by running:
```
awk -f config/arm/parsecpu.awk -v cmd="chkarch armv7-a+simd" config/arm/arm-cpus.in
awk -f config/arm/parsecpu.awk -v cmd="chkarch armv7-a+neon" config/arm/arm-cpus.in
awk -f config/arm/parsecpu.awk -v cmd="chkarch armv7-a+neon-vfpv3" config/arm/arm-cpus.in
```
None of them returns an error. gcc/ChangeLog: PR target/113030 * config/arm/parsecpu.awk (check_cpu): Use cpu_opt_alias instead of cpu_optaliases. (check_arch): Use arch_opt_alias instead of arch_optaliases. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-01-22 | xfail libgomp.c/declare-variant-4-{fiji,gfx803}.c | Tobias Burnus | 2 | -0/+4
Since r14-4734-g56ed1055b2f40ac162ae8d382280ac07a33f789f, GCC no longer builds the Fiji (alias gfx803) libraries by default as support for it was removed in ROCm 4.0 and will be removed in LLVM 18. Thus, unless gfx803 is explicitly enabled, the following testcases will fail to link as libgomp is not available for Fiji. Hence, this commit xfails those testcases. libgomp/ChangeLog: * testsuite/libgomp.c/declare-variant-4-fiji.c: Xfail as fiji support is no longer enabled by default. * testsuite/libgomp.c/declare-variant-4-gfx803.c: Likewise. Signed-off-by: Tobias Burnus <tburnus@baylibre.com>
2024-01-22 | RISC-V: Lower vmv.v.x (avl = 1) into vmv.s.x | Juzhe-Zhong | 5 | -1/+94
There is an AI benchmark on which GCC shows a 3% performance drop compared with Clang. This is because Clang/LLVM has a simplification that transforms vmv.v.x (avl = 1) into vmv.s.x. Since vmv.s.x has a more flexible vsetvl demand than vmv.v.x, this gives us better chances to fuse vsetvl. Consider the following case:

    void foo (uint32_t *outputMat, uint32_t *inputMat)
    {
      vuint32m1_t matRegIn0 = __riscv_vle32_v_u32m1 (inputMat, 4);
      vuint32m1_t matRegIn1 = __riscv_vle32_v_u32m1 (inputMat + 4, 4);
      vuint32m1_t matRegIn2 = __riscv_vle32_v_u32m1 (inputMat + 8, 4);
      vuint32m1_t matRegIn3 = __riscv_vle32_v_u32m1 (inputMat + 12, 4);
      vbool32_t oddMask = __riscv_vreinterpret_v_u32m1_b32 (__riscv_vmv_v_x_u32m1 (0xaaaa, 1));
      vuint32m1_t smallTransposeMat0 = __riscv_vslideup_vx_u32m1_tumu (oddMask, matRegIn0, matRegIn1, 1, 4);
      vuint32m1_t smallTransposeMat2 = __riscv_vslideup_vx_u32m1_tumu (oddMask, matRegIn2, matRegIn3, 1, 4);
      vuint32m1_t outMat0 = __riscv_vslideup_vx_u32m1_tu (smallTransposeMat0, smallTransposeMat2, 2, 4);
      __riscv_vse32_v_u32m1 (outputMat, outMat0, 4);
    }

Before this patch:

    vsetivli zero,4,e32,m1,ta,ma
    li a5,45056
    addi a2,a1,16
    addi a3,a1,32
    addi a4,a1,48
    vle32.v v1,0(a1)
    vle32.v v4,0(a2)
    vle32.v v2,0(a3)
    vle32.v v3,0(a4)
    addiw a5,a5,-1366
    vsetivli zero,1,e32,m1,ta,ma
    vmv.v.x v0,a5 ---> Since avl = 1, we can transform it into vmv.s.x
    vsetivli zero,4,e32,m1,tu,mu
    vslideup.vi v1,v4,1,v0.t
    vslideup.vi v2,v3,1,v0.t
    vslideup.vi v1,v2,2
    vse32.v v1,0(a0)
    ret

After this patch:

    li a5,45056
    addi a2,a1,16
    vsetivli zero,4,e32,m1,tu,mu
    addiw a5,a5,-1366
    vle32.v v3,0(a2)
    addi a3,a1,32
    addi a4,a1,48
    vle32.v v1,0(a1)
    vmv.s.x v0,a5
    vle32.v v2,0(a3)
    vslideup.vi v1,v3,1,v0.t
    vle32.v v3,0(a4)
    vslideup.vi v2,v3,1,v0.t
    vslideup.vi v1,v2,2
    vse32.v v1,0(a0)
    ret

Tested on both RV32 and RV64, no regression. gcc/ChangeLog: * config/riscv/riscv-protos.h (splat_to_scalar_move_p): New function. * config/riscv/riscv-v.cc (splat_to_scalar_move_p): Ditto. * config/riscv/vector.md: Simplify vmv.v.x into vmv.s.x. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/attribute-2.c: New test. * gcc.target/riscv/rvv/vsetvl/attribute-3.c: New test.
2024-01-22 | libstdc++: Fix check in testsuite/std/time/clock/file/io.cc | Jonathan Wakely | 1 | -1/+2
The test_format() function contained an incorrect assertion but wasn't actually being called from main. libstdc++-v3/ChangeLog: * testsuite/std/time/clock/file/io.cc: Fix expected result in assertion and call test_format() from main.
2024-01-22 | RISC-V: Fix regressions due to 86de9b66480b710202a2898cf513db105d8c432f | Juzhe-Zhong | 2 | -4/+6
This patch fixes the recent regression: FAIL: gcc.dg/torture/float32-tg-2.c -O1 (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/float32-tg-2.c -O1 (test for excess errors) FAIL: gcc.dg/torture/float32-tg-2.c -O2 (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/float32-tg-2.c -O2 (test for excess errors) FAIL: gcc.dg/torture/float32-tg-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/float32-tg-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors) FAIL: gcc.dg/torture/float32-tg-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/float32-tg-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) FAIL: gcc.dg/torture/float32-tg-2.c -O3 -g (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/float32-tg-2.c -O3 -g (test for excess errors) FAIL: gcc.dg/torture/float32-tg-2.c -Os (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/float32-tg-2.c -Os (test for excess errors) FAIL: gcc.dg/torture/float32-tg.c -O1 (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/float32-tg.c -O1 (test for excess errors) FAIL: gcc.dg/torture/float32-tg.c -O2 (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/float32-tg.c -O2 (test for excess errors) FAIL: gcc.dg/torture/float32-tg.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/float32-tg.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors) FAIL: gcc.dg/torture/float32-tg.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: 
gcc.dg/torture/float32-tg.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) FAIL: gcc.dg/torture/float32-tg.c -O3 -g (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/float32-tg.c -O3 -g (test for excess errors) FAIL: gcc.dg/torture/float32-tg.c -Os (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/float32-tg.c -Os (test for excess errors) FAIL: gcc.dg/torture/pr48124-4.c -O1 (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/pr48124-4.c -O1 (test for excess errors) FAIL: gcc.dg/torture/pr48124-4.c -O2 (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/pr48124-4.c -O2 (test for excess errors) FAIL: gcc.dg/torture/pr48124-4.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/pr48124-4.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors) FAIL: gcc.dg/torture/pr48124-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/pr48124-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) FAIL: gcc.dg/torture/pr48124-4.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/pr48124-4.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) FAIL: gcc.dg/torture/pr48124-4.c -O3 -g (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/pr48124-4.c -O3 -g (test for excess errors) FAIL: gcc.dg/torture/pr48124-4.c -Os (internal compiler error: in reg_or_subregno, at jump.cc:1895) FAIL: gcc.dg/torture/pr48124-4.c -Os (test for excess errors) due to commit 86de9b66480b710202a2898cf513db105d8c432f. 
The root cause is that register_operand and reg_or_subregno are inconsistent, so we hit the assertion failure. We shouldn't worry about subreg:...VL_REGNUM since such a situation is impossible; that is, we only have (set (reg) (reg:VL_REGNUM)), which generates the "csrr vl" ASM for first fault load instructions (vleff). So using REG_P and REGNO must be totally solid and robust. Since we don't allow VL_REGNUM to be involved in register allocation and we don't have such a constraint, we always use the following pattern to generate the "csrr vl" ASM:

    (define_insn "read_vlsi"
      [(set (match_operand:SI 0 "register_operand" "=r")
            (reg:SI VL_REGNUM))]
      "TARGET_VECTOR"
      "csrr\t%0,vl"
      [(set_attr "type" "rdvl")
       (set_attr "mode" "SI")])

So the check in riscv.md is there to prevent such a situation from falling into the move pattern in riscv.md. Tested on both RV32/RV64, no regression. PR target/109092 gcc/ChangeLog: * config/riscv/riscv.md: Use reg instead of subreg. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr109092.c: New test.
2024-01-22 | [gcn] mkoffload: Fix linking with "-g"; fix file deletion; improve diagnostic [PR111966] | Tobias Burnus | 1 | -16/+16
With debugging enabled, '*.mkoffload.dbg.o' files are generated. The e_flags header of all *.o files must be the same; otherwise, the linker complains. Since r14-4734-g56ed1055b2f40ac162ae8d382280ac07a33f789f the -march= default is gfx900. If compiling without any -march= flag, the default value is used by the compiler but not passed to mkoffload. Hence, mkoffload.cc uses its own default for march; unfortunately, it still had gfx803/fiji as default, leading to the linker error: 'incompatible mach'. Solution: Update the default to gfx900.

While debugging it, I saw that /tmp/cc*.mkoffload.dbg.o kept accumulating; there were a couple of issues with the handling:
* dbgobj was always added to files_to_cleanup
* If copy_early_debug_info returned true, dbgobj was added again -> pointless and in theory a race if the same file was added in the fraction of a second.
* If copy_early_debug_info returned false,
  - In exactly one case, it already deleted the file itself (same potential race as above)
  - The pointer dbgobj was freed - such that files_to_cleanup contained a dangling pointer - probably the reason that stale files remained.
Solution: Only if copy_early_debug_info returns true, dbgobj is added to files_to_cleanup. If it returns false, the file is unlinked before freeing the pointer.

When compiling, GCC warned about several fatal_error messages as having no %<...%> or %qs quotes. This patch now silences several of those warnings by using those quotes.

gcc/ChangeLog: PR other/111966 * config/gcn/mkoffload.cc (elf_arch): Change default to gfx900 to match the compiler default. (simple_object_copy_lto_debug_sections): Never unlink the outfile on error as the caller does so. (maybe_unlink, compile_native): Use %<...%> and %qs in fatal_error. (main): Likewise. Fix 'mkoffload.dbg.o' cleanup. Signed-off-by: Tobias Burnus <tburnus@baylibre.com>
2024-01-22 | Update copyright years. | Marc Poulhiès | 2236 | -2236/+2236
2024-01-22 | tree-optimization/113373 - add missing LC PHIs for live operations | Richard Biener | 3 | -39/+60
The following makes reduction epilogue code generation happy by properly adding LC PHIs to the exit blocks for multiple exit vectorized loops. Some refactoring might make the flow easier to follow but I've refrained from doing that with this patch. I've kept some fixes in reduction epilogue generation from the earlier attempt fixing this PR. PR tree-optimization/113373 * tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg): Create LC PHIs in the exit blocks where necessary. * tree-vect-loop.cc (vectorizable_live_operation): Do not try to handle missing LC PHIs. (find_connected_edge): Remove. (vect_create_epilog_for_reduction): Cleanup use of auto_vec. * gcc.dg/vect/vect-early-break_104-pr113373.c: New testcase.
2024-01-22 | RISC-V: Fix vfirst/vmsbf/vmsif/vmsof ratio attributes | Juzhe-Zhong | 2 | -1/+48
vfirst/vmsbf/vmsif/vmsof instructions are supposed to demand ratio instead of demanding sew_lmul. But a typo in my previous patch makes the VSETVL pass fail to honor the RISC-V V spec. Consider the following simple case:

    int foo4 (void * in, void * out)
    {
      vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4);
      v = __riscv_vadd_vv_i32m1 (v, v, 4);
      vbool32_t mask = __riscv_vreinterpret_v_i32m1_b32(v);
      mask = __riscv_vmsof_m_b32(mask, 4);
      return __riscv_vfirst_m_b32(mask, 4);
    }

Before this patch:

    foo4:
      vsetivli zero,4,e32,m1,ta,ma
      vle32.v v1,0(a0)
      vadd.vv v1,v1,v1
      vsetvli zero,zero,e8,mf4,ta,ma ----> redundant.
      vmsof.m v2,v1
      vfirst.m a0,v2
      ret

After this patch:

    foo4:
      vsetivli zero,4,e32,m1,ta,ma
      vle32.v v1,0(a0)
      vadd.vv v1,v1,v1
      vmsof.m v2,v1
      vfirst.m a0,v2
      ret

Checked against the RVV spec and Clang: this patch makes the VSETVL pass match the correct behavior. Tested on both RV32/RV64, no regression. gcc/ChangeLog: * config/riscv/vector.md: Fix vfirst/vmsbf/vmsof ratio attributes. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/attribute-1.c: New test.
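The reason the second vsetvli in the example is redundant can be checked with a little arithmetic: e32,m1 and e8,mf4 have the same SEW/LMUL ratio. The sketch below only models that arithmetic; VType, sew_lmul_ratio, same_ratio and same_sew_lmul are hypothetical helpers, not part of GCC's VSETVL pass.

```cpp
#include <cassert>

// A vtype setting: element width in bits and LMUL as a fraction.
struct VType {
  int sew;       // e.g. 32 for e32
  int lmul_num;  // numerator of LMUL   (m1 -> 1, mf4 -> 1)
  int lmul_den;  // denominator of LMUL (m1 -> 1, mf4 -> 4)
};

// SEW/LMUL ratio; VLMAX is VLEN divided by this value.
constexpr int sew_lmul_ratio(VType v)
{
  return v.sew * v.lmul_den / v.lmul_num;
}

// Demand used by instructions that only care about the ratio.
constexpr bool same_ratio(VType a, VType b)
{
  return sew_lmul_ratio(a) == sew_lmul_ratio(b);
}

// Stricter demand: both SEW and LMUL must match exactly.
constexpr bool same_sew_lmul(VType a, VType b)
{
  return a.sew == b.sew && a.lmul_num * b.lmul_den == b.lmul_num * a.lmul_den;
}

void demo_ratio()
{
  constexpr VType e32m1{32, 1, 1};
  constexpr VType e8mf4{8, 1, 4};
  static_assert(sew_lmul_ratio(e32m1) == 32, "32 / 1");
  static_assert(sew_lmul_ratio(e8mf4) == 32, "8 / (1/4)");
  assert(same_ratio(e32m1, e8mf4));      // ratio demand: no new vsetvli needed
  assert(!same_sew_lmul(e32m1, e8mf4));  // sew_lmul demand: forces a vsetvli
}
```

With a ratio-only demand the mask instructions can reuse the e32,m1 configuration, which is why the vsetvli disappears in the "after" listing.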
2024-01-22 | RISC-V: Bugfix for resolve_overloaded_builtin[PR113420] | xuli | 3 | -77/+77
v2: Avoid internal ICE for the case below. vint8mf8_t test_vle8_v_i8mf8_m(vbool64_t vm, const int32_t *rs1, size_t vl) { return __riscv_vle8(vm, rs1, vl); } v1: Change the hash value of overloaded intrinsic from considering all parameter types to: 1. Encoding vector data type 2. In order to distinguish vle8_v_i8mf8_m(vbool64_t vm, const int8_t *rs1, size_t vl) and vle8_v_u8mf8_m(vbool64_t vm, const uint8_t *rs1, size_t vl), encode the pointer type 3. In order to distinguish vfadd_vv_f32mf2_rm(vfloat32mf2_t vs2, vfloat32mf2_t vs1, size_t vl) and vfadd_vv_f32mf2(vfloat32mf2_t vs2, vfloat32mf2_t vs1, size_t vl), encode the number of parameters. The same goes for the vxrm intrinsics. PR target/113420 gcc/ChangeLog: * config/riscv/riscv-vector-builtins.cc (has_vxrm_or_frm_p):remove. (registered_function::overloaded_hash):refactor. (resolve_overloaded_builtin):avoid internal ICE. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr113420-1.c: New test. * gcc.target/riscv/rvv/base/pr113420-2.c: New test.
2024-01-21 | [committed] Adjust expectations for pr59533-1.c | Jeff Law | 1 | -4/+4
The change for pr111267 twiddled code generation for sh/pr59533-1.c. We end up eliminating two comparisons, but require two shll instructions to do so. And in a couple places we're using an addc sequence rather than a subc sequence. This patch adjusts the expected codegen for the test as all those are either a wash or a The fwprop change does cause some code regressions on the same test. I'll file a distinct bug for that issue. gcc/testsuite * gcc.target/sh/pr59533-1.c: Adjust expected output.
2024-01-22 | Daily bump. | GCC Administrator | 6 | -1/+116
2024-01-21 | [PATCH v3 2/2] RISC-V: Fix XCValu test | Mary Bennett | 1 | -20/+20
gcc/testsuite/ChangeLog: * gcc.target/riscv/cv-alu-fail-compile.c: Change warning to error.
2024-01-21 | Re: [PATCH] Avoid ICE with m68k-elf -malign-int and libcalls | Mikael Pettersson | 3 | -4/+14
>> emit_library_call_value_1 calls emit_push_insn with NULL_TREE >> for TYPE. Sometimes emit_push_insn needs to assign a temp with >> that TYPE, which causes a segfault. >> >> Fixed by computing the TYPE from MODE when needed. >> >> Original patch by Thorsten Otto. >> [ ... ] > This really needs to happen in the two call paths which pass in > NULL_TREE for the type. Note how the type is used to determine padding > earlier in emit_push_insn. That would also make the code more > consistent with the comment before emit_push_insn which implies that > both MODE and TYPE are valid. > > > Additionally you should bootstrap and regression test this patch on at > least one target. Updated as requested, and bootstrapped and tested on {x86_64,aarch64,m68k}-linux-gnu without regressions. gcc/ PR target/82420 PR target/111279 * calls.cc (emit_library_call_value_1): Pass valid TYPE to emit_push_insn. * expr.cc (emit_push_insn): Likewise. gcc/testsuite/ PR target/82420 * gcc.target/m68k/pr82420.c: New test. Co-authored-by: Thorsten Otto <admin@tho-otto.de>
2024-01-21 | Install right version of last change. | Jeff Law | 1 | -0/+1
gcc/ * config/riscv/riscv.cc (riscv_init_cumulative_args): Install correct version of last change.
2024-01-21 | [committed] [NFC] Fix riscv_init_cumulative_args for unused arguments | Jeff Law | 1 | -6/+1
The signature was still using ATTRIBUTE_UNUSED and actually marked one of the used arguments with ATTRIBUTE_UNUSED. This patch drops the decorations and instead removes the names of arguments which are actually unused, which is now the preferred way to handle this when we can. Bootstrapped. I didn't have test results on the platform where I bootstrapped, so no results to compare against. Given it's NFC, I think we're OK without the regression results. gcc/ * config/riscv/riscv.cc (riscv_init_cumulative_args): Update and fix bugs in signature.
2024-01-21 | libstdc++: Fix std::format for floating-point chrono::time_point [PR113500] | Jonathan Wakely | 5 | -23/+124
Currently trying to use std::format with certain specializations of std::chrono::time_point is ill-formed, due to one member function of the __formatter_chrono type which tries to write a time_point to an ostream. For sys_time<floating-point> or sys_time with a period greater than days there is no operator<< that can be used. That operator<< is only needed when using an empty chrono-specs in the format string, like "{}", but the ill-formed expression gives an error even if not actually used. This means it's not possible to format some other specializations of chrono::time_point, even when using a non-empty chrono-specs. This fixes it by avoiding using 'os << t' for all chrono::time_point specializations, and instead using std::format("{:L%F %T}", t). So that we continue to reject std::format("{}", sys_time{1.0s}) a check for empty chrono-specs is added to the formatter<sys_time<D>, C> specialization. While testing this I noticed that the output for %S with a floating-point duration was incorrect, as the subseconds part was being appended to the seconds without a decimal point, and without the correct number of leading zeros. libstdc++-v3/ChangeLog: PR libstdc++/113500 * include/bits/chrono_io.h (__formatter_chrono::_M_S): Fix printing of subseconds with floating-point rep. (__formatter_chrono::_M_format_to_ostream): Do not write time_point specializations directly to the ostream. (formatter<chrono::sys_time<D>, C>::parse): Do not allow an empty chrono-spec if the type fails to meet the constraints for writing to an ostream with operator<<. * testsuite/std/time/clock/file/io.cc: Check formatting non-integral times with empty chrono-specs. * testsuite/std/time/clock/gps/io.cc: Likewise. * testsuite/std/time/clock/utc/io.cc: Likewise. * testsuite/std/time/hh_mm_ss/io.cc: Likewise.
2024-01-21 | libstdc++: Fix std::chrono::file_clock conversions for low-precision times | Jonathan Wakely | 2 | -6/+17
The std::chrono::file_clock conversions were not using common_type and so failed to compile when converting anything that should have increased precision after arithmetic with a std::chrono::seconds value. libstdc++-v3/ChangeLog: * include/bits/chrono.h (__file_clock::from_sys) (__file_clock::to_sys, __file_clock::_S_from_sys) (__file_clock::_S_to_sys): Use common_type for return type. * testsuite/std/time/clock/file/members.cc: Check round trip conversion for time with lower precision than seconds.
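The role of common_type here can be seen with plain durations. The sketch below is not the libstdc++ code: shift_epoch and demo_epoch_diff are hypothetical, and the 100-second offset is an arbitrary illustrative value. It shows that adding a seconds offset to a coarser duration yields a seconds result, so a return type fixed to the input duration would not fit.

```cpp
#include <cassert>
#include <chrono>
#include <type_traits>

// Hypothetical epoch offset between two clocks (arbitrary demo value).
constexpr std::chrono::seconds demo_epoch_diff{100};

// Shifts a duration by the offset. The return type is the common type of
// the input and seconds, so coarse inputs gain precision instead of
// failing to convert.
template <typename Dur>
constexpr typename std::common_type<Dur, std::chrono::seconds>::type
shift_epoch(Dur d)
{
  return d + demo_epoch_diff;
}

void demo_common_type()
{
  using namespace std::chrono;
  // minutes + seconds has seconds precision.
  static_assert(std::is_same<decltype(shift_epoch(minutes{1})), seconds>::value,
                "coarse input is widened to seconds");
  // milliseconds + seconds keeps milliseconds precision.
  static_assert(std::is_same<decltype(shift_epoch(milliseconds{5})),
                             milliseconds>::value,
                "fine input keeps its own precision");
  assert(shift_epoch(minutes{1}).count() == 160);
  assert(shift_epoch(milliseconds{5}).count() == 100005);
}
```

Declaring the return type as Dur instead would require an implicit seconds-to-minutes conversion, which std::chrono rejects because it loses precision.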
2024-01-21 | PR rtl-optimization/111267: Improved forward propagation. | Roger Sayle | 2 | -5/+23
This patch resolves PR rtl-optimization/111267 by improving RTL-level forward propagation. This x86_64 code quality regression was caused (exposed) by my changes to improve how x86's (TImode) argument passing is represented at the RTL-level (reducing the use of SUBREGs to catch more optimization opportunities in combine). The pitfall is that the more complex RTL representations expose a limitation in RTL's fwprop pass. At the heart of fwprop, in try_fwprop_subst_pattern, the logic can be summarized as three steps. Step 1 is a heuristic that rejects the propagation attempt if the expression is too complex, step 2 calls the backend's recog to see if the propagated/simplified instruction is recognizable/valid, and step 3 then calls set_src_cost to compare the rtx costs of the replacement vs. the original, and accepts the transformation if the final cost is the same or better. The logic error (or missed optimization opportunity) is that the step 1 heuristic that attempts to predict (second guess) the process is flawed. Ultimately the decision on whether to fwprop or not should depend solely on actual improvement, as measured by RTX costs. Hence the prototype fix in the bugzilla PR removes the heuristic of calling prop.profitable_p entirely, relying entirely on the cost comparison in step 3. Unfortunately, things are a tiny bit more complicated. The cost comparison in fwprop uses the older set_src_cost API and not the newer (preferred) insn_cost API as currently used in combine. This means that the cost improvement comparisons are only done for single_set instructions (more complex PARALLELs etc. aren't supported). Hence we can only rely on skipping step 1 for that subset of instructions actually evaluated by step 3. The other subtlety is that to avoid potential infinite loops in fwprop we should only rely purely on rtx costs when the transformation is obviously an improvement.
If the replacement has the same cost as the original, we can use the prop.profitable_p test to preserve the current behavior. Finally, to answer Richard Biener's remaining question about this approach: yes, there is an asymmetry between how patterns are handled and how REG_EQUAL notes are handled. For example, at the moment propagation into notes doesn't use rtx costs at all, and ultimately when fwprop is updated to use insn_cost, this (and recog) obviously isn't applicable to notes. There's no reason the logic need be identical between patterns and notes, and during stage4 we only need to update propagation into patterns to fix this P1 regression (notes and use of insn_cost can be done for GCC 15). For Jakub's reduced testcase:

    struct S { float a, b, c, d; };
    int bar (struct S x, struct S y)
    {
      return x.b <= y.d && x.c >= y.a;
    }

On x86_64-pc-linux-gnu with -O2 gcc currently generates:

    bar:
      movq %xmm2, %rdx
      movq %xmm3, %rax
      movq %xmm0, %rsi
      xchgq %rdx, %rax
      movq %rsi, %rcx
      movq %rax, %rsi
      movq %rdx, %rax
      shrq $32, %rcx
      shrq $32, %rax
      movd %ecx, %xmm4
      movd %eax, %xmm0
      comiss %xmm4, %xmm0
      jb .L6
      movd %esi, %xmm0
      xorl %eax, %eax
      comiss %xmm0, %xmm1
      setnb %al
      ret
    .L6:
      xorl %eax, %eax
      ret

With this simple patch to fwprop, we now generate:

    bar:
      shufps $85, %xmm0, %xmm0
      shufps $85, %xmm3, %xmm3
      comiss %xmm0, %xmm3
      jb .L6
      xorl %eax, %eax
      comiss %xmm2, %xmm1
      setnb %al
      ret
    .L6:
      xorl %eax, %eax
      ret

2024-01-21 Roger Sayle <roger@nextmovesoftware.com> Richard Biener <rguenther@suse.de> gcc/ChangeLog PR rtl-optimization/111267 * fwprop.cc (fwprop_propagation::profitable_p): Rename profitable_p method to likely_profitable_p. (try_fwprop_subst_node): Update call to likely_profitable_p. Only bail-out early when !prop.likely_profitable_p for instructions that are not single sets. When comparing costs, bail-out if the cost is unchanged and !prop.likely_profitable_p. gcc/testsuite/ChangeLog PR rtl-optimization/111267 * gcc.target/i386/pr111267.c: New test case.
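The change in decision logic described in this message can be summarized in a small model. This is a sketch of the reasoning only, not the code in fwprop.cc: Candidate, accept_before and accept_after are hypothetical, and the costs are plain integers standing in for rtx costs.

```cpp
#include <cassert>

// One propagation attempt, reduced to the facts the decision depends on.
struct Candidate {
  bool single_set;         // cost comparison is only available for single sets
  bool likely_profitable;  // result of the step 1 heuristic
  bool recognized;         // result of the backend's recog (step 2)
  int old_cost;            // cost of the original source
  int new_cost;            // cost of the replacement
};

// Previous behavior: the heuristic can veto before costs are even examined.
bool accept_before(const Candidate &c)
{
  if (!c.likely_profitable) return false;                        // step 1
  if (!c.recognized) return false;                               // step 2
  if (c.single_set && c.new_cost > c.old_cost) return false;     // step 3
  return true;
}

// New behavior: for single sets the costs decide; the heuristic only acts
// as a tie-breaker when the cost is unchanged.
bool accept_after(const Candidate &c)
{
  if (!c.single_set && !c.likely_profitable) return false;       // step 1
  if (!c.recognized) return false;                               // step 2
  if (c.single_set) {                                            // step 3
    if (c.new_cost > c.old_cost) return false;
    if (c.new_cost == c.old_cost && !c.likely_profitable) return false;
  }
  return true;
}

void demo_fwprop_decision()
{
  // Cheaper replacement that the heuristic considered too complex.
  Candidate cheaper{true, false, true, 10, 6};
  assert(!accept_before(cheaper));  // missed optimization
  assert(accept_after(cheaper));    // now accepted on cost alone
}
```

Keeping the heuristic for equal-cost replacements is what guards against the propagation ping-ponging between two forms of the same cost.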
2024-01-21 Make the manual clearer about what options -Wunused enables [PR90464] Sandra Loosemore 1 -10/+18
gcc/ChangeLog PR c++/90464 * doc/invoke.texi (Warning Options): Document that -Wunused-parameter isn't enabled by -Wunused unless -Wextra is provided, and that -Wunused does enable -Wunused-const-variable=1 for C. Clarify that -Wunused doesn't enable -Wunused-* options documented as behaving otherwise, and list them explicitly.
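A small C file can illustrate the documented behavior. This is only a sketch: the function and variable names are made up, and the comments restate the documentation change above rather than quoting compiler output.

```c
#include <assert.h>

/* Never referenced.  For C, -Wunused enables -Wunused-const-variable=1,
   so this definition is expected to be reported.  */
static const int spare_limit = 16;

/* 'hint' is never read.  -Wunused on its own does not report it;
   -Wunused-parameter additionally needs -Wextra (for example
   "-Wall -Wextra" or "-Wunused -Wextra").  */
int
scale (int x, int hint)
{
  return 2 * x;
}

/* GCC's 'unused' attribute records the intent and keeps
   -Wunused-parameter quiet for this parameter.  */
int
scale_quiet (int x, int hint __attribute__ ((unused)))
{
  return 2 * x;
}

/* Both variants compute the same value; only the diagnostics differ.  */
void
check_scale (void)
{
  assert (scale (3, 99) == 6);
  assert (scale_quiet (3, 99) == 6);
}
```

Compiling this once with -Wunused and once with -Wunused -Wextra is a quick way to see which options pull in -Wunused-parameter.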
2024-01-21 Fortran: passing of optional scalar arguments with VALUE attribute [PR113377] Harald Anlauf 2 -0/+341
gcc/fortran/ChangeLog: PR fortran/113377 * trans-expr.cc (gfc_conv_procedure_call): Fix handling of optional scalar arguments of intrinsic type with the VALUE attribute. gcc/testsuite/ChangeLog: PR fortran/113377 * gfortran.dg/optional_absent_9.f90: New test.
2024-01-21 C23: Fix ICE for composite type for structs with unsigned bitfields [PR113492] Martin Uecker 2 -1/+44
This patch fixes a bug when forming a composite type from structs that contain an unsigned bitfield declared with int while using -funsigned-bitfields. In such structs the unsigned integer type was not compatible with the regular unsigned integer type used elsewhere in the C FE. PR c/113492 gcc/c: * c-decl.cc (grokdeclarator): Use c_common_unsigned_type instead of unsigned_type_for to create the unsigned type for bitfields declared with int when using -funsigned-bitfields. gcc/testsuite: * gcc.dg/pr113492.c: New test.
2024-01-21 libstdc++: Fix std::format floating-point alternate forms [PR113512] Jonathan Wakely 2 -16/+41
The logic for handling '#' forms was ... not good. The count of significant figures just counted digits, instead of ignoring leading zeros. And when moving the result from the stack buffer to a dynamic string the exponent could get lost in some cases. libstdc++-v3/ChangeLog: PR libstdc++/113512 * include/std/format (__formatter_fp::format): Fix logic for alternate forms. * testsuite/std/format/functions/format.cc: Check buggy cases of alternate forms with g presentation type.
2024-01-21 Clean up examples for -Wdangling-pointer [PR109708] Sandra Loosemore 1 -22/+34
gcc/ChangeLog PR c/109708 * doc/invoke.texi (Warning Options): Fix broken example and clean up/reorganize the others. Also describe what the short-form options mean.
2024-01-21 Daily bump. GCC Administrator 6 -1/+92
2024-01-20 Remove several xfails for 32-bit hppa*-*-* John David Anglin 4 -5/+4
These arise because 32-bit ELF targets were changed from callee copies to caller copies. 2024-01-20 John David Anglin <danglin@gcc.gnu.org> gcc/testsuite/ChangeLog: * gcc.dg/ipa/iinline-4.c: Remove dg-final xfail for 32-bit hppa*-*-*. * gcc.dg/ipa/inline-5.c: Likewise. * gcc.dg/ipa/ipcp-cstagg-7.c: Likewise. * gcc.dg/tree-ssa/vector-4.c: Likewise.
2024-01-20 Increase timeout by 2 in libgomp.fortran/alloc-comp-3.f90 on hppa*-*-* John David Anglin 1 -0/+1
2024-01-20 John David Anglin <danglin@gcc.gnu.org> libgomp/ChangeLog: * testsuite/libgomp.fortran/alloc-comp-3.f90: Increase timeout by 2 on hppa*-*-*.
2024-01-20 Don't run libgomp.c/simd-math-1.c on hppa*-*-hpux* John David Anglin 1 -1/+1
hppa*-*-hpux* lacks necessary math functions. 2024-01-20 John David Anglin <danglin@gcc.gnu.org> libgomp/ChangeLog: * testsuite/libgomp.c/simd-math-1.c: Don't run on hppa*-*-hpux*.
2024-01-20 xfail scan-tree-dump-times checks on hppa*64*-*-* in gcc.dg/tree-ssa/slsr-13.c John David Anglin 1 -2/+2
2024-01-20 John David Anglin <danglin@gcc.gnu.org> gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/slsr-13.c: xfail scan-tree-dump-times checks on hppa*64*-*-*.
2024-01-20 Require target lra in gcc.dg/torture/pr110422.c John David Anglin 1 -1/+1
LRA is required for asm goto. 2024-01-20 John David Anglin <danglin@gcc.gnu.org> gcc/testsuite/ChangeLog: * gcc.dg/torture/pr110422.c: Require target lra.
2024-01-20 libstdc++: suppress -Wdangling-reference with operator| [PR111410] Marek Polacek 2 -0/+18
It seems to me that we should exclude std::ranges::views::__adaptor::operator| from the -Wdangling-reference warning. It's commonly used when handling ranges. PR c++/111410 libstdc++-v3/ChangeLog: * include/std/ranges: Add #pragma to disable -Wdangling-reference with std::ranges::views::__adaptor::operator|. gcc/testsuite/ChangeLog: * g++.dg/warn/Wdangling-reference17.C: New test.
2024-01-20 ipa: Add testcase for already fixed case [PR110705] Andrew Pinski 1 -0/+27
This testcase was fixed with r13-1695-gb0f02eeb906b63 which added an Ada testcase for the issue but adding a C testcase is a good idea and that is what this does. Committed after making sure it passes on x86_64-linux-gnu. PR ipa/110705 gcc/testsuite/ChangeLog: * gcc.c-torture/compile/pr110705-1.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-01-20 fortran: Restore current interface info on error [PR111291] Mikael Morin 1 -0/+1
This change is a followup to the fix for PR48776 (namely r14-3572-gd58150452976c4ca65ddc811fac78ef956fa96b0 AKA fortran: Restore interface to its previous state on error [PR48776]), which cleaned up new changes from interfaces upon error. Unfortunately, there is one case in that fix that is mishandled, visible on unexpected_interface.f90 with valgrind or an asan-instrumented gfortran. When an interface statement is found while parsing an interface body (which is invalid), the current interface is replaced by the one from the new statement, and as parsing continues, new procedures are added to the new interface, which has been rejected and freed, instead of the original one. This change restores the current interface pointer to its previous value on each rejected statement. PR fortran/48776 PR fortran/111291 gcc/fortran/ChangeLog: * parse.cc: Restore current interface to its previous value on error.
2024-01-20 Correct documentation for -Warray-parameter [PR102998] Sandra Loosemore 1 -8/+16
gcc/ChangeLog PR c/102998 * doc/invoke.texi (Option Summary): Add -Warray-parameter. (Warning Options): Correct/edit discussion of -Warray-parameter to make the first example less confusing, and fill in missing info.
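For readers unfamiliar with the option, the kind of redeclaration -Warray-parameter is concerned with can be sketched in C as below. The function names are invented for illustration, and the comments reflect my understanding of the GCC manual (mismatched ordinary array bounds are covered at level 2, which -Wall includes) rather than the exact text changed by this commit.

```c
#include <assert.h>

/* The first declaration uses the array form with a bound of 4.  */
int sum_pair (int values[4]);

/* Redeclaring with a different bound is valid C, since both parameters
   have type 'int *', but the bounds disagree.  This is the sort of
   inconsistency -Warray-parameter is meant to report.  */
int
sum_pair (int values[2])
{
  return values[0] + values[1];
}

/* A pointer and an array of unspecified bound are treated as
   equivalent forms, so this pair is not expected to warn.  */
int first (int *values);

int
first (int values[])
{
  return values[0];
}

/* The code itself is well defined; only the declarations differ.  */
void
check_array_parameter_example (void)
{
  int data[4] = { 1, 2, 3, 4 };
  assert (sum_pair (data) == 3);
  assert (first (data) == 1);
}
```

Keeping every declaration of a function in the same form, ideally shared through one header, avoids the diagnostic altogether.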
2024-01-20 lower-bitint: Handle INTEGER_CST rhs1 in handle_cast [PR113462] Jakub Jelinek 2 -1/+17
The new testcase ICEs because fre3 leaves around an unfolded _1 = VIEW_CONVERT_EXPR<_BitInt(129)>(0); statement and in handle_cast I was expecting just SSA_NAMEs for the large/huge _BitInt to large/huge _BitInt casts; INTEGER_CST is something we can handle in that case exactly the same, as the handle_operand recursion handles those. Of course, maybe we should also try to fold_stmt such cases somewhere in bitint lowering preparation. 2024-01-20 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/113462 * gimple-lower-bitint.cc (bitint_large_huge::handle_cast): Handle rhs1 INTEGER_CST like SSA_NAME. * gcc.dg/bitint-76.c: New test.