riscv-gnu-toolchain/gcc.git

Age	Commit message	Author	Files	Lines
2024-03-08	modula2: Add constant aggregate tests	Gaius Mulley	4	-0/+132
	This patch adds four constant aggregate tests and assignment of arrays by a constant in two different scopes. gcc/testsuite/ChangeLog: * gm2/iso/pass/arrayconst.mod: New test. * gm2/iso/pass/arrayconst2.mod: New test. * gm2/iso/pass/arrayconst3.mod: New test. * gm2/iso/pass/arrayconst4.mod: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-03-08	RISC-V: Fix ICE in riscv vector costs	demin.han	2	-0/+17
	The following code can result in an ICE: -march=rv64gcv --param riscv-autovec-lmul=dynamic -O3 char jpeg_difference7_input_buf; void jpeg_difference7(int *diff_buf) { unsigned width; int samp, Rb; while (--width) { Rb = samp = jpeg_difference7_input_buf; *diff_buf++ = -(int)(samp + (long)Rb >> 1); } } One biggest_mode update was missed in one branch, which triggers an assertion failure: gcc_assert (biggest_size >= mode_size); Tested on RV64 with no regressions. PR target/114264 gcc/ChangeLog: * config/riscv/riscv-vector-costs.cc: Fix ICE. gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/pr114264.c: New test. Signed-off-by: demin.han <demin.han@starfivetech.com>
2024-03-08	fwprop: Avoid volatile rtx to be propagated	Haochen Gui	2	-0/+17
	The patch for PR111267 (commit id 86de9b66480b710202a2898cf513db105d8c432f) introduced an exception for propagation into single set insns: a propagation that might not be profitable (as checked by profitable_p) is still allowed into a single set insn. This has a potential problem: a volatile operand might be propagated into a single set insn. If the defining insn is not eliminated after propagation, the volatile operand will be executed multiple times. This patch fixes the problem by skipping volatile set source rtx in propagation. gcc/ * fwprop.cc (forward_propagate_into): Return false for volatile set source rtx. gcc/testsuite/ * gcc.target/powerpc/fwprop-1.c: New.
2024-03-08	Daily bump.	GCC Administrator	10	-1/+364

2024-03-07	libstdc++: Use std::from_chars to speed up parsing subsecond durations	Jonathan Wakely	1	-10/+18
	With std::from_chars we can parse subsecond durations much faster than with std::num_get, as shown in the microbenchmarks below. We were using std::num_get and std::numpunct in order to parse a number with the locale's decimal point character. But we copy the chars from the input stream into a new buffer anyway, so we can replace the locale's decimal point with '.' in that buffer, and then we can use std::from_chars on it. Benchmark Time CPU Iterations ---------------------------------------------------------- from_chars_millisec 158 ns 158 ns 4524046 num_get_millisec 192 ns 192 ns 3644626 from_chars_microsec 164 ns 163 ns 4330627 num_get_microsec 205 ns 205 ns 3413452 from_chars_nanosec 173 ns 173 ns 4072653 num_get_nanosec 227 ns 227 ns 3105161 libstdc++-v3/ChangeLog: * include/bits/chrono_io.h (_Parser::operator()): Use std::from_chars to parse fractional seconds.
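The approach described in this commit message can be sketched as follows. This is a minimal illustration, not the libstdc++ implementation: the helper name `parse_seconds` and its `decimal_point` parameter are hypothetical, and it assumes a standard library with floating-point `std::from_chars` (for example libstdc++ from GCC 11 onwards) and C++17.

```cpp
#include <charconv>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical helper: parse a seconds value such as "12,5", where
// decimal_point is the locale's decimal point character.
std::optional<double> parse_seconds(std::string_view text, char decimal_point)
{
    std::string buf(text);          // work on a copy of the input characters
    for (char& c : buf)
        if (c == decimal_point)
            c = '.';                // std::from_chars only accepts '.'

    double value = 0.0;
    const char* first = buf.data();
    const char* last = first + buf.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last)
        return std::nullopt;        // not a number, or trailing characters
    return value;
}
```

Calling `parse_seconds("12,5", ',')` yields 12.5 without consulting `std::num_get` or `std::numpunct` during the numeric conversion itself.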
2024-03-07	libstdc++: Fix parsing of fractional seconds [PR114244]	Jonathan Wakely	3	-6/+60
	When converting a chrono::duration<long double> to a result type with an integer representation we should use chrono::round<_Duration> so that we don't truncate towards zero. Rounding ensures that e.g. 0.001999s becomes 2ms not 1ms. We can also remove some redundant uses of chrono::duration_cast to convert from seconds to _Duration, because the _Parser class template requires _Duration type to be able to represent seconds without loss of precision. This also fixes a bug where no fractional part would be parsed for chrono::duration<long double> because its period is ratio<1>. We should also consider treat_as_floating_point<rep> when deciding whether to skip reading a fractional part. libstdc++-v3/ChangeLog: PR libstdc++/114244 * include/bits/chrono_io.h (_Parser::operator()): Remove redundant uses of duration_cast. Use chrono::round to convert long double value to durations with integer representations. Check representation type when deciding whether to skip parsing fractional seconds. * testsuite/20_util/duration/114244.cc: New test. * testsuite/20_util/duration/io.cc: Check that a floating-point duration with ratio<1> precision can be parsed.
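The difference between truncation and rounding mentioned in this commit message can be reproduced with the public `<chrono>` API. The sketch below is only an illustration; the alias `fseconds` and the two function names are made up for the example and do not appear in libstdc++.

```cpp
#include <chrono>

// Seconds with a floating-point representation (illustrative alias).
using fseconds = std::chrono::duration<long double>;

// duration_cast truncates towards zero when the target rep is an integer.
std::chrono::milliseconds truncated(fseconds s)
{
    return std::chrono::duration_cast<std::chrono::milliseconds>(s);
}

// chrono::round (C++17) picks the nearest representable value instead.
std::chrono::milliseconds rounded(fseconds s)
{
    return std::chrono::round<std::chrono::milliseconds>(s);
}
```

With an input of `fseconds(0.001999L)`, `truncated` gives 1ms and `rounded` gives 2ms, which is the behaviour change the commit describes.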
2024-03-08	c++: Redetermine whether to write vtables on stream-in [PR114229]	Nathaniel Shead	7	-4/+42
	We currently always stream DECL_INTERFACE_KNOWN, which is needed since many kinds of declarations already have their interface determined at parse time. But for vtables and type-info declarations we need to re-evaluate on stream-in as whether they need to be emitted or not changes in each TU, so this patch clears DECL_INTERFACE_KNOWN on these kinds of declarations so that they can go through 'import_export_decl' again. Note that the precise details of the virt-2 tests will need to change when we implement the resolution of [1], for now I just updated the test to not fail with the new (current) semantics. [1]: https://github.com/itanium-cxx-abi/cxx-abi/pull/171 PR c++/114229 gcc/cp/ChangeLog: * module.cc (trees_out::core_bools): Redetermine DECL_INTERFACE_KNOWN on stream-in for vtables and tinfo. * decl2.cc (import_export_decl): Add fixme for ABI changes with module vtables and tinfo. gcc/testsuite/ChangeLog: * g++.dg/modules/virt-2_b.C: Update test to acknowledge that we now emit vtables here too. * g++.dg/modules/virt-3_a.C: New test. * g++.dg/modules/virt-3_b.C: New test. * g++.dg/modules/virt-3_c.C: New test. * g++.dg/modules/virt-3_d.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2024-03-07	c++/modules: member alias tmpl partial inst [PR103994]	Patrick Palka	7	-124/+77
	Alias templates are weird in that their specializations can appear in both decl_specializations and type_specializations. They're always in the decl table, and additionally appear in the type table only at parse time via finish_template_type. There seems to be no good reason for them to appear in both tables, and the code paths end up stepping over each other in particular for a partial instantiation such as A<B>::key_arg<T> in the below modules testcase: the type code path (lookup_template_class) wants to set TI_TEMPLATE to the most general template whereas the decl code path (tsubst_template_decl called during instantiation of A<B>) already set TI_TEMPLATE to the partially instantiated TEMPLATE_DECL. This TI_TEMPLATE change ends up confusing modules which decides to stream the logically equivalent TYPE_DECL and TEMPLATE_DECL for this partial instantiation separately. This patch fixes this by making lookup_template_class dispatch to instantiate_alias_template early for alias template specializations. In turn we now add such specializations only to the decl table. This admits some nice simplification in the modules code which otherwise has to cope with such specializations appearing in both tables. PR c++/103994 gcc/cp/ChangeLog: * cp-tree.h (add_mergeable_specialization): Remove second parameter. * module.cc (depset::disc_bits::DB_ALIAS_TMPL_INST_BIT): Remove. (depset::disc_bits::DB_ALIAS_SPEC_BIT): Remove. (depset::is_alias_tmpl_inst): Remove. (depset::is_alias): Remove. (merge_kind::MK_tmpl_alias_mask): Remove. (merge_kind::MK_alias_spec): Remove. (merge_kind_name): Remove entries for alias specializations. (trees_out::core_vals) <case TEMPLATE_DECL>: Adjust after removing is_alias_tmpl_inst. (trees_in::decl_value): Adjust add_mergeable_specialization calls. (trees_out::get_merge_kind) <case depset::EK_SPECIALIZATION>: Use MK_decl_spec for alias template specializations. (trees_out::key_mergeable): Simplify after MK_tmpl_alias_mask removal. 
(depset::hash::make_dependency): Adjust after removing DB_ALIAS_TMPL_INST_BIT. (specialization_add): Don't allow alias templates when !decl_p. (depset::hash::add_specializations): Remove now-dead code accommodating alias template specializations in the type table. * pt.cc (lookup_template_class): Dispatch early to instantiate_alias_template for alias templates. Simplify accordingly. (add_mergeable_specialization): Remove alias_p parameter and simplify accordingly. gcc/testsuite/ChangeLog: * g++.dg/modules/pr99425-1_b.H: s/alias/decl in dump scan. * g++.dg/modules/tpl-alias-1_a.H: Likewise. * g++.dg/modules/tpl-alias-2_a.H: New test. * g++.dg/modules/tpl-alias-2_b.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
2024-03-07	AArch64: memcpy/memset expansions should not emit LDP/STP [PR113618]	Wilco Dijkstra	2	-47/+57
	The new RTL introduced for LDP/STP results in regressions due to use of UNSPEC. Given the new LDP fusion pass is good at finding LDP opportunities, change the memcpy, memmove and memset expansions to emit single vector loads/stores. This fixes the regression and enables more RTL optimization on the standard memory accesses. Handling of unaligned tail of memcpy/memmove is improved with -mgeneral-regs-only. SPEC2017 performance improves slightly. Codesize is a bit worse due to missed LDP opportunities as discussed in the PR. gcc/ChangeLog: PR target/113618 * config/aarch64/aarch64.cc (aarch64_copy_one_block): Remove. (aarch64_expand_cpymem): Emit single load/store only. (aarch64_set_one_block): Emit single stores only. gcc/testsuite/ChangeLog: PR target/113618 * gcc.target/aarch64/pr113618.c: New test.
2024-03-07	c++/modules: inline namespace abi_tag streaming [PR110730]	Patrick Palka	5	-0/+66
	The unreduced testcase from PR110730 crashes at runtime ultimately because we don't stream the abi_tag attribute on inline namespaces and so the filesystem::current_path() call resolves to the non-C++11 ABI version even though the C++11 ABI is active, leading to a crash when destroying the path temporary (which contains an std::string member). Similar story for the PR105512 testcase. While we do stream the DECL_ATTRIBUTES of all decls that go through the generic tree streaming routines, it seems namespaces are streamed separately from other decls and we don't use the generic routines for them. So this patch makes us stream the abi_tag manually for (inline) namespaces. PR c++/110730 PR c++/105512 gcc/cp/ChangeLog: * module.cc (module_state::write_namespaces): Stream the abi_tag attribute of an inline namespace. (module_state::read_namespaces): Likewise. gcc/testsuite/ChangeLog: * g++.dg/modules/hello-2_a.C: New test. * g++.dg/modules/hello-2_b.C: New test. * g++.dg/modules/namespace-6_a.H: New test. * g++.dg/modules/namespace-6_b.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
2024-03-07	libstdc++: Do not define lock-free atomic aliases if not fully lock-free [PR114103]	Jonathan Wakely	5	-5/+12
	The whole point of these typedefs is to guarantee lock-freedom, so if the target has no such types, we shouldn't define the typedefs at all. libstdc++-v3/ChangeLog: PR libstdc++/114103 * include/bits/version.def (atomic_lock_free_type_aliases): Add extra_cond to check for at least one always-lock-free type. * include/bits/version.h: Regenerate. * include/std/atomic (atomic_signed_lock_free) (atomic_unsigned_lock_free): Only use always-lock-free types. * src/c++20/tzdb.cc (time_zone::_Impl::RulesCounter): Don't use atomic counter if lock-free aliases aren't available. * testsuite/29_atomics/atomic/lock_free_aliases.cc: XFAIL for targets without lock-free word-size compare_exchange.
2024-03-07	libstdc++: Update expiry times for leap seconds lists	Jonathan Wakely	2	-2/+2
	The list in tzdb.cc isn't the only hardcoded list of leap seconds in the library, there's the one defined inline in <chrono> (to avoid loading the tzdb for the common case) and another in a testcase. This updates them to note that there are no new leap seconds in 2024 either, until at least 2024-12-28. libstdc++-v3/ChangeLog: * include/std/chrono (__get_leap_second_info): Update expiry time for hardcoded list of leap seconds. * testsuite/std/time/tzdb/leap_seconds.cc: Update comment.
2024-03-07	libstdc++: Replace unnecessary uses of built-ins in testsuite	Jonathan Wakely	11	-26/+35
	I don't see why we should rely on __builtin_memset etc. in tests. We can just include <cstring> and use the public API. libstdc++-v3/ChangeLog: * testsuite/23_containers/deque/allocator/default_init.cc: Use std::memset instead of __builtin_memset. * testsuite/23_containers/forward_list/allocator/default_init.cc: Likewise. * testsuite/23_containers/list/allocator/default_init.cc: Likewise. * testsuite/23_containers/map/allocator/default_init.cc: Likewise. * testsuite/23_containers/set/allocator/default_init.cc: Likewise. * testsuite/23_containers/unordered_map/allocator/default_init.cc: Likewise. * testsuite/23_containers/unordered_set/allocator/default_init.cc: Likewise. * testsuite/23_containers/vector/allocator/default_init.cc: Likewise. * testsuite/23_containers/vector/bool/allocator/default_init.cc: Likewise. * testsuite/29_atomics/atomic/compare_exchange_padding.cc: Likewise. * testsuite/util/atomic/wait_notify_util.h: Likewise.
2024-03-07	libstdc++: Better diagnostics for std::format errors	Jonathan Wakely	4	-1/+51
	This adds two new static_assert messages to the internals of std::make_format_args to give better diagnostics for invalid format args. Rather than just getting an error saying that basic_format_arg cannot be constructed, we get more specific errors for the cases where std::formatter isn't specialized for the type at all, and where it's specialized but only meets the BasicFormatter requirements and so can only format non-const arguments. Also add a test for the existing static_assert when constructing a format_string for non-formattable args. libstdc++-v3/ChangeLog: * include/std/format (_Arg_store::_S_make_elt): Add two static_assert checks to give more user-friendly error messages. * testsuite/lib/prune.exp (libstdc++-dg-prune): Prune another form of "in requirements with" note. * testsuite/std/format/arguments/args_neg.cc: Check for user-friendly diagnostics for non-formattable types. * testsuite/std/format/string_neg.cc: Likewise.
2024-03-07	testsuite, darwin: improve check for -shared support	Francois-Xavier Coudert	1	-1/+1
	The undefined symbols are allowed for C checks, but when this is run as C++, the mangled foo() symbol is still seen as undefined, and the testsuite thinks darwin does not support -shared. gcc/testsuite/ChangeLog: PR target/114233 * lib/target-supports.exp: Fix test for C++.
2024-03-07	vect: Do not peel epilogue for partial vectors.	Robin Dapp	3	-23/+45
	r14-7036-gcbf569486b2dec added an epilogue vectorization guard for early break but PR114196 shows that we also run into the problem without early break. Therefore merge the condition into the topmost vectorization guard. gcc/ChangeLog: PR middle-end/114196 * tree-vect-loop-manip.cc (vect_can_peel_nonlinear_iv_p): Merge vectorization guards. gcc/testsuite/ChangeLog: * gcc.target/aarch64/pr114196.c: New test. * gcc.target/riscv/rvv/autovec/pr114196.c: New test.
2024-03-07	PR modula2/109969 Linking large project causes an ICE	Gaius Mulley	6	-471/+451
	This patch contains a re-write of M2LexBuf.mod which removes the linked list of token buckets and simplifies the implementation using a dynamic array. It contains more checking (for empty source files for example). The patch also contains a fix for an ICE in gcc/m2/gm2-gcc/builtins.cc gcc/m2/ChangeLog: PR modula2/109969 * gm2-compiler/M2LexBuf.def (TokenToLineNo): Rename parameter. (TokenToColumnNo): Rename parameter. (TokenToLocation): Rename parameter. (FindFileNameFromToken): Rename parameter. (DumpTokens): Rewrite comment. * gm2-compiler/M2LexBuf.mod: Rewrite. * gm2-compiler/P0SyntaxCheck.bnf (CheckInsertCandidate): DumpTokens before and after inserting recovery token. * gm2-gcc/m2builtins.cc (do_target_support_exists): Add bf_c99_compl case. * gm2-libs/Indexing.def (InitIndexTuned): New procedure function. (IsEmpty): New procedure function. * gm2-libs/Indexing.mod (InitIndexTuned): New procedure function. (IsEmpty): New procedure function. (Index): New field GrowFactor. (PutIndice): Use GrowFactor to extend dynamic array. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-03-07	c++: ICE with variable template and [[deprecated]] [PR110031]	Marek Polacek	2	-1/+33
	lookup_and_finish_template_variable already has and uses the complain parameter but it is not passing it down to mark_used so we got the default tf_warning_or_error, which causes various problems when lookup_and_finish_template_variable gets called with complain=tf_none. PR c++/110031 gcc/cp/ChangeLog: * pt.cc (lookup_and_finish_template_variable): Pass complain to mark_used. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/inline-var11.C: New test.
2024-03-07	doc: Fix docs for -dD regarding predefined macros	Jonathan Wakely	1	-2/+1
	The manual has always claimed that -dD differs from -dM by not outputting predefined macros, but that's untrue. It has been untrue since GCC 3.0 (probably with the change to use libcpp as the default preprocessor implementation). gcc/ChangeLog: * doc/cppopts.texi: Remove incorrect claim about -dD not outputting predefined macros.
2024-03-07	rs6000: Don't ICE when compiling the __builtin_vsx_splat_2di [PR113950]	Jeevitha	2	-2/+26
	When we expand the __builtin_vsx_splat_2di built-in, we were allowing immediate value for second operand which causes an unrecognizable insn ICE. Even though the immediate value was forced into a register, it wasn't correctly assigned to the second operand. So corrected the assignment of op1 to operands[1]. 2024-03-07 Jeevitha Palanisamy <jeevitha@linux.ibm.com> gcc/ PR target/113950 * config/rs6000/vsx.md (vsx_splat_<mode>): Correct assignment to operand1 and simplify else if with else. gcc/testsuite/ PR target/113950 * gcc.target/powerpc/pr113950.c: New testcase.
2024-03-07	Fix bogus error on allocator for array type with Dynamic_Predicate	Eric Botcazou	2	-2/+19
	This is a regression present on all active branches: the compiler gives a bogus error on an allocator for an unconstrained array type declared with a Dynamic_Predicate because Apply_Predicate_Check is invoked directly on a subtype reference, which it cannot handle. This moves the check to the resulting access value (after dereference) like in Expand_Allocator_Expression. gcc/ada/ PR ada/113979 * exp_ch4.adb (Expand_N_Allocator): In the subtype indication case, call Apply_Predicate_Check on the resulting access value if needed. gcc/testsuite/ * gnat.dg/predicate15.adb: New test.
2024-03-07	Include safe-ctype.h after C++ standard headers, to avoid over-poisoning	Francois-Xavier Coudert	1	-21/+18
	When building gcc's C++ sources against recent libc++, the poisoning of the ctype macros due to including safe-ctype.h before including C++ standard headers such as <list>, <map>, etc, causes many compilation errors, similar to: In file included from /home/dim/src/gcc/master/gcc/gensupport.cc:23: In file included from /home/dim/src/gcc/master/gcc/system.h:233: In file included from /usr/include/c++/v1/vector:321: In file included from /usr/include/c++/v1/__format/formatter_bool.h:20: In file included from /usr/include/c++/v1/__format/formatter_integral.h:32: In file included from /usr/include/c++/v1/locale:202: /usr/include/c++/v1/__locale:546:5: error: '__abi_tag__' attribute only applies to structs, variables, functions, and namespaces 546 | _LIBCPP_INLINE_VISIBILITY | ^ /usr/include/c++/v1/__config:813:37: note: expanded from macro '_LIBCPP_INLINE_VISIBILITY' 813 | # define _LIBCPP_INLINE_VISIBILITY _LIBCPP_HIDE_FROM_ABI | ^ /usr/include/c++/v1/__config:792:26: note: expanded from macro '_LIBCPP_HIDE_FROM_ABI' 792 | __attribute__((__abi_tag__(_LIBCPP_TOSTRING( _LIBCPP_VERSIONED_IDENTIFIER)))) | ^ In file included from /home/dim/src/gcc/master/gcc/gensupport.cc:23: In file included from /home/dim/src/gcc/master/gcc/system.h:233: In file included from /usr/include/c++/v1/vector:321: In file included from /usr/include/c++/v1/__format/formatter_bool.h:20: In file included from /usr/include/c++/v1/__format/formatter_integral.h:32: In file included from /usr/include/c++/v1/locale:202: /usr/include/c++/v1/__locale:547:37: error: expected ';' at end of declaration list 547 | char_type toupper(char_type __c) const | ^ /usr/include/c++/v1/__locale:553:48: error: too many arguments provided to function-like macro invocation 553 | const char_type* toupper(char_type* __low, const char_type* __high) const | ^ /home/dim/src/gcc/master/gcc/../include/safe-ctype.h:146:9: note: macro 'toupper' defined here 146 | #define toupper(c) do_not_use_toupper_with_safe_ctype | ^ This is because libc++ uses different transitive includes than libstdc++, and some of those transitive includes pull in various ctype declarations (typically via <locale>). There was already a special case for including <string> before safe-ctype.h, so move the rest of the C++ standard header includes to the same location, to fix the problem. gcc/ChangeLog: * system.h: Include safe-ctype.h after C++ standard headers. Signed-off-by: Dimitry Andric <dimitry@andric.com>
2024-03-07	analyzer: Fix up some -Wformat* warnings	Jakub Jelinek	5	-1/+5
	I'm seeing warnings like ../../gcc/analyzer/access-diagram.cc: In member function ‘void ana::bit_size_expr::print(pretty_printer) const’: ../../gcc/analyzer/access-diagram.cc:399:26: warning: unknown conversion type character ‘E’ in format [-Wformat=] 399 | pp_printf (pp, _("%qE bytes"), bytes_expr); | ^~~~~~~~~~~ when building stage2/stage3 gcc. While such warnings would be understandable when building stage1 because one could e.g. have some older host compiler which doesn't understand some of the format specifiers, the above seems to be because we have in pretty-print.h #ifdef GCC_DIAG_STYLE #define GCC_PPDIAG_STYLE GCC_DIAG_STYLE #else #define GCC_PPDIAG_STYLE __gcc_diag__ #endif and use GCC_PPDIAG_STYLE e.g. for pp_printf, and while diagnostic-core.h has #ifndef GCC_DIAG_STYLE #define GCC_DIAG_STYLE __gcc_tdiag__ #endif (and similarly various FE headers include their own GCC_DIAG_STYLE) when including pretty-print.h before diagnostic-core.h we end up with __gcc_diag__ style rather than __gcc_tdiag__ style, which I think is the right thing for the analyzer, because analyzer seems to use default_tree_printer everywhere: grep pp_format_decoder.=.default_tree_printer analyzer/* | wc -l 57 The following patch fixes that by making sure diagnostic-core.h is included before pretty-print.h. 2024-03-07 Jakub Jelinek <jakub@redhat.com> * access-diagram.cc: Include diagnostic-core.h before including diagnostic.h or diagnostic-path.h. * sm-malloc.cc: Likewise. * diagnostic-manager.cc: Likewise. * call-summary.cc: Likewise. * record-layout.cc: Likewise.
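The include-order effect described in this commit message can be simulated in a single file. The sketch below copies the two preprocessor fragments quoted in the message, in the problematic order; the `STRINGIFY` macros and the two reporting functions are hypothetical additions for illustration and are not part of GCC.

```cpp
#include <string>

#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)

// Step 1: the pretty-print.h fragment. GCC_DIAG_STYLE is not defined yet,
// so the fallback branch is taken.
#ifdef GCC_DIAG_STYLE
#define GCC_PPDIAG_STYLE GCC_DIAG_STYLE
#else
#define GCC_PPDIAG_STYLE __gcc_diag__
#endif

// Step 2: the diagnostic-core.h fragment. It comes too late to influence
// GCC_PPDIAG_STYLE, which was already fixed in step 1.
#ifndef GCC_DIAG_STYLE
#define GCC_DIAG_STYLE __gcc_tdiag__
#endif

// Hypothetical helpers that report which style each macro names.
std::string pp_style()   { return STRINGIFY(GCC_PPDIAG_STYLE); }
std::string diag_style() { return STRINGIFY(GCC_DIAG_STYLE); }
```

In this order `pp_style()` returns `"__gcc_diag__"` while `diag_style()` returns `"__gcc_tdiag__"`. Swapping the two fragments makes both report `__gcc_tdiag__`, which is the effect of including diagnostic-core.h first.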
2024-03-07	contrib: Update test_mklog to correspond to mklog	Filip Kastl	1	-1/+1
	contrib/ChangeLog: * test_mklog.py: "Moved to..." -> "Move to..." Signed-off-by: Filip Kastl <fkastl@suse.cz>
2024-03-07	c++: Fix ICE diagnosing incomplete type of overloaded function set [PR98356]	Nathaniel Shead	2	-6/+14
	In the linked PR the result of 'get_first_fn' is a USING_DECL against the template parameter, to be filled in on instantiation. But we don't actually need to get the first set of the member functions: it's enough to know that we have a (possibly overloaded) member function at all. PR c++/98356 gcc/cp/ChangeLog: * typeck2.cc (cxx_incomplete_type_diagnostic): Don't assume 'member' will be a FUNCTION_DECL (or something like it). gcc/testsuite/ChangeLog: * g++.dg/pr98356.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2024-03-07	c++: Stream DECL_CONTEXT for template template parms [PR98881]	Nathaniel Shead	5	-31/+47
	When streaming in a nested template-template parameter as in the attached testcase, we end up reaching the containing template-template parameter in 'tpl_parms_fini'. We should not set the DECL_CONTEXT to this (nested) template-template parameter, as it should already be the struct that the outer template-template parameter is declared on. The precise logic for what DECL_CONTEXT should be for a template template parameter in various situations seems rather obscure. Rather than trying to determine the assumptions that need to hold, it seems simpler to just always re-stream the DECL_CONTEXT as needed for now. PR c++/98881 gcc/cp/ChangeLog: * module.cc (trees_out::tpl_parms_fini): Stream out DECL_CONTEXT for template template parameters. (trees_in::tpl_parms_fini): Read it. gcc/testsuite/ChangeLog: * g++.dg/modules/tpl-tpl-parm-3.h: New test. * g++.dg/modules/tpl-tpl-parm-3_a.H: New test. * g++.dg/modules/tpl-tpl-parm-3_b.C: New test. * g++.dg/modules/tpl-tpl-parm-3_c.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Patrick Palka <ppalka@redhat.com> Reviewed-by: Jason Merrill <jason@redhat.com>
2024-03-07	bb-reorder: Fix -freorder-blocks-and-partition ICEs on aarch64 with asm goto [PR110079]	Jakub Jelinek	2	-1/+45
	The following testcase ICEs, because fix_crossing_unconditional_branches thinks that asm goto is an unconditional jump and removes it, replacing it with unconditional jump to one of the labels. This doesn't happen on x86 because the function in question isn't invoked there at all: /* If the architecture does not have unconditional branches that can span all of memory, convert crossing unconditional branches into indirect jumps. Since adding an indirect jump also adds a new register usage, update the register usage information as well. */ if (!HAS_LONG_UNCOND_BRANCH) fix_crossing_unconditional_branches (); I think for the asm goto case, for the non-fallthru edge if any we should handle it like any other fallthru (and fix_crossing_unconditional_branches doesn't really deal with those, it only looks at explicit branches at the end of bbs and we are in cfglayout mode at that point) and for the labels we just pass the labels as immediates to the assembly and it is up to the user to figure out how to store them/branch to them or whatever they want to do. So, the following patch fixes this by not treating asm goto as a simple unconditional jump. I really think that on the !HAS_LONG_UNCOND_BRANCH targets we have a bug somewhere else, where outofcfglayout or whatever should actually create those indirect jumps on the crossing edges instead of adding normal unconditional jumps, I see e.g. in __attribute__((cold)) int bar (char *); __attribute__((hot)) int baz (char *); void qux (int x) { if (__builtin_expect (!x, 1)) goto l1; bar (""); goto l1; l1: baz (""); } void corge (int x) { if (__builtin_expect (!x, 0)) goto l1; baz (""); l2: return; l1: bar (""); goto l2; } with -O2 -freorder-blocks-and-partition on aarch64 before/after this patch just b .L? jumps which I believe are +-32MB, so if .text is larger than 32MB, it could fail to link, but this patch doesn't address that. 2024-03-07 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/110079 * bb-reorder.cc (fix_crossing_unconditional_branches): Don't adjust asm goto. * gcc.dg/pr110079.c: New test.
2024-03-07	expand: Fix UB in choose_mult_variant [PR105533]	Jakub Jelinek	2	-5/+18
	As documented in the function comment, choose_mult_variant attempts to compute costs of 3 different cases, val, -val and val - 1. The -val case is actually only done if val fits into host int, so there should be no overflow, but the val - 1 case is done unconditionally. val is shwi (but inside of synth_mult already uhwi), so when val is HOST_WIDE_INT_MIN, val - 1 invokes UB. The following patch fixes that by using val - HOST_WIDE_INT_1U, but I'm not really convinced it would DTRT for > 64-bit modes, so I've guarded it as well. Though, arch would need to have really strange costs that something that could be expressed as x << 63 would be better expressed as (x * 0x7fffffffffffffff) + 1 In the long term, I think we should just rewrite choose_mult_variant/synth_mult etc. to work on wide_int. 2024-03-07 Jakub Jelinek <jakub@redhat.com> PR middle-end/105533 * expmed.cc (choose_mult_variant): Only try the val - 1 variant if val is not HOST_WIDE_INT_MIN or if mode has exactly HOST_BITS_PER_WIDE_INT precision. Avoid triggering UB while computing val - 1. * gcc.dg/pr105533.c: New test.
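The overflow this commit avoids can be shown in isolation. The sketch below is not the expmed.cc code: it uses `std::int64_t` and `std::uint64_t` as stand-ins for HOST_WIDE_INT and its unsigned counterpart (an assumption of a 64-bit host), and the function name is hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Stand-ins for GCC's host-wide integer types (assumed to be 64 bits).
using hwi  = std::int64_t;
using uhwi = std::uint64_t;

// Computes val - 1 in unsigned arithmetic, where wraparound is well defined.
// The signed expression `val - 1` would be undefined behaviour when val is
// the minimum value.
uhwi minus_one_wrapping(hwi val)
{
    return static_cast<uhwi>(val) - uhwi{1};
}
```

For the minimum value the result is 0x7fffffffffffffff, the same constant that appears in the commit's `(x * 0x7fffffffffffffff) + 1` remark.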
2024-03-07	sccvn: Avoid UB in ao_ref_init_from_vn_reference [PR105533]	Jakub Jelinek	1	-1/+1
	When compiling libgcc or on e.g. int a[64]; int p; void foo (void) { int s = 1; while (p) { s -= 11; a[s] != 0; } } sccvn invokes UB in the compiler as detected by ubsan: ../../gcc/poly-int.h:1089:5: runtime error: left shift of negative value -40 The problem is that we still use C++11..C++17 as the implementation language and in those C++ versions shifting negative values left is UB (well defined since C++20) and above in offset += op->off << LOG2_BITS_PER_UNIT; op->off is poly_int64 with -40 value (in libgcc with -8). I understand the offset_int << LOG2_BITS_PER_UNIT shifts but it is then well defined during underlying implementation which is done on the uhwi limbs, but for poly_int64 we use offset += pop->off * BITS_PER_UNIT; a few lines earlier and I think that is both more readable in what it actually does and triggers UB only if there would be signed multiply overflow. In the end, the compiler will treat them the same at least at the RTL level (at least, if not and they aren't the same cost, it should). 2024-03-07 Jakub Jelinek <jakub@redhat.com> PR middle-end/105533 * tree-ssa-sccvn.cc (ao_ref_init_from_vn_reference) <case ARRAY_REF>: Multiply op->off by BITS_PER_UNIT instead of shifting it left by LOG2_BITS_PER_UNIT.
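A standalone version of the replacement this commit makes is sketched below. It uses plain `std::int64_t` rather than poly_int64, assumes 8-bit units, and the constant and function names are illustrative only.

```cpp
#include <cstdint>

// Assumption for the sketch: 8 bits per unit, so the log2 is 3.
constexpr int log2_bits_per_unit = 3;
constexpr std::int64_t bits_per_unit = std::int64_t{1} << log2_bits_per_unit;

// Converts a (possibly negative) byte offset to a bit offset.
// Multiplication is well defined for negative operands in C++11..C++17,
// whereas `byte_off << log2_bits_per_unit` is undefined for negative byte_off
// in those language versions.
std::int64_t byte_to_bit_offset(std::int64_t byte_off)
{
    return byte_off * bits_per_unit;
}
```

For the offset of -40 reported by ubsan, the multiplication yields -320 without relying on shift semantics.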
2024-03-07	LoongArch: testsuite:Fix problems with incorrect results in vector test cases.	chenxiaolong	5	-68/+68
	In simd_correctness_check.h, the role of the macro ASSERTEQ_64 is to check the result of the passed vector values for the 64-bit data of each array element. It turns out that the macro uses the abs() function and so checks only the lower 32 bits of the data at a time; this patch replaces abs() with the llabs() function. However, the following two problems may occur after the modification: 1. FAIL in lasx-xvfrint_s.c and lsx-vfrint_s.c. The reason for the error is that vector test cases that use __m{128,256} to define vector types are composed of 32-bit primitive types, so they should use ASSERTEQ_32 instead of ASSERTEQ_64 to check for correctness. 2. FAIL in lasx-xvshuf_b.c and lsx-vshuf.c. The cause of the error is that the expected result set for the function in the test case is incorrect. gcc/testsuite/ChangeLog: * gcc.target/loongarch/vector/lasx/lasx-xvfrint_s.c: Replace ASSERTEQ_64 with the macro ASSERTEQ_32. * gcc.target/loongarch/vector/lasx/lasx-xvshuf_b.c: Modify the expected test results of some functions according to the function of the vector instruction. * gcc.target/loongarch/vector/lsx/lsx-vfrint_s.c: Same modification as lasx-xvfrint_s.c. * gcc.target/loongarch/vector/lsx/lsx-vshuf.c: Same modification as lasx-xvshuf_b.c. * gcc.target/loongarch/vector/simd_correctness_check.h: Use the llabs() function instead of abs() to check the correctness of the results.
2024-03-07	LoongArch: Use /lib instead of /lib64 as the library search path for MUSL.	Yang Yujie	3	-1/+29
	gcc/ChangeLog: * config.gcc: Add a case for loongarch*-*-linux-musl*. * config/loongarch/linux.h: Disable the multilib-compatible treatment for musl targets. * config/loongarch/musl.h: New file.
2024-03-07	match.pd: Optimize a * !a to 0 [PR114009]	Jakub Jelinek	3	-1/+45
	The following patch attempts to fix an optimization regression through adding a simple simplification. We already have the /* (m1 CMP m2) * d -> (m1 CMP m2) ? d : 0 */ (if (!canonicalize_math_p ()) (for cmp (tcc_comparison) (simplify (mult:c (convert (cmp@0 @1 @2)) @3) (if (INTEGRAL_TYPE_P (type) && INTEGRAL_TYPE_P (TREE_TYPE (@0))) (cond @0 @3 { build_zero_cst (type); }))) optimization which otherwise triggers during the a * !a multiplication, but that is done only late and we aren't yet able to optimize it through range assumptions anyway. The patch adds a specific simplification for it. If a is zero, then a * !a will be 0 * 1 (or for signed 1-bit 0 * -1) and so 0. If a is non-zero, then a * !a will be a * 0 and so again 0. The pattern is valid for scalar integers, complex integers and vector types, but I think will actually trigger only for the scalar integers. For vector types I've added other two with VEC_COND_EXPR in it, for complex there are different GENERIC trees to match and it is something that likely would be never matched in GIMPLE, so I didn't handle that. 2024-03-07 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/114009 * genmatch.cc (decision_tree::gen): Emit ARG_UNUSED for captures argument even for GENERIC, not just for GIMPLE. * match.pd (a * !a -> 0): New simplifications. * gcc.dg/tree-ssa/pr114009.c: New test.
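The identity behind the new simplification can be checked directly in source code. This is only a sketch of the arithmetic argument given in the commit message, with a hypothetical function name; it says nothing about how match.pd implements the pattern.

```cpp
// For any scalar integer a, one of the two factors is always zero:
//   a == 0  ->  0 * 1 == 0
//   a != 0  ->  a * 0 == 0
int times_not(int a)
{
    return a * !a;
}
```

A compiler applying the `a * !a -> 0` simplification can fold the body of `times_not` to a constant zero regardless of the argument.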
2024-03-07	RISC-V: Refactor expand_vec_cmp [NFC]	demin.han	2	-31/+15
	There are two expand_vec_cmp functions. They have the same structure and similar code. We can use default arguments instead of overloading. Tested on RV32 and RV64. gcc/ChangeLog: * config/riscv/riscv-protos.h (expand_vec_cmp): Change proto. * config/riscv/riscv-v.cc (expand_vec_cmp): Use default arguments. (expand_vec_cmp_float): Adapt arguments. Signed-off-by: demin.han <demin.han@starfivetech.com>
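The refactoring technique named in the commit above, replacing two overloads with one function that has defaulted trailing parameters, can be sketched as follows. The function `compare_masked` and its parameters are hypothetical and do not reflect the real expand_vec_cmp signature.

```cpp
// Before (sketch): two overloads with nearly identical bodies.
//   int compare_masked(int a, int b);
//   int compare_masked(int a, int b, int mask);
//
// After (sketch): one function; callers that omit the mask get "all bits".
int compare_masked(int a, int b, int mask = ~0) {
    return (a & mask) == (b & mask) ? 1 : 0;
}
```

Existing two-argument call sites keep compiling unchanged, which is what makes this kind of change behavior-preserving.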
2024-03-06	Fortran: Fix issue with using snprintf function.	Jerry DeLisle	4	-18/+26
	The previous patch used snprintf to set the message string. The message string is not a format string, so snprintf will interpret any '%' characters in it as format specifiers even though there are no associated output variables. A segfault ensues. This change replaces snprintf with a Fortran string copy function and null terminates the message string. PR libfortran/105456 libgfortran/ChangeLog: * io/list_read.c (list_formatted_read_scalar): Use fstrcpy from libgfortran/runtime/string.c to replace snprintf. (nml_read_obj): Likewise. * io/transfer.c (unformatted_read): Likewise. (unformatted_write): Likewise. (formatted_transfer_scalar_read): Likewise. (formatted_transfer_scalar_write): Likewise. * io/write.c (list_formatted_write_scalar): Likewise. (nml_write_obj): Likewise. gcc/testsuite/ChangeLog: * gfortran.dg/pr105456.f90: Revise using '%' characters in the user's error message.
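The hazard described above comes from passing untrusted text as the format argument of snprintf. The sketch below shows two safe alternatives under stated assumptions: `copy_message` is a hypothetical stand-in for a plain copy and is not libgfortran's fstrcpy, and `format_message` keeps snprintf but passes the text as data for a fixed "%s" format.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: copy the text verbatim, truncate if needed, and
// always NUL-terminate. No character in src is treated specially.
void copy_message(char *dest, std::size_t dest_len, const char *src) {
    if (dest_len == 0)
        return;
    std::size_t n = std::strlen(src);
    if (n >= dest_len)
        n = dest_len - 1;
    std::memcpy(dest, src, n);
    dest[n] = '\0';
}

// Alternative: the format string is a constant, so '%' inside src is just
// data. Calling snprintf(dest, dest_len, src) instead would be the bug.
void format_message(char *dest, std::size_t dest_len, const char *src) {
    std::snprintf(dest, dest_len, "%s", src);
}
```

Either form leaves a message such as "100% done %s" intact in the destination buffer.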
2024-03-07	Daily bump.	GCC Administrator	7	-1/+250

2024-03-06	i386: Fix and improve insn constraint for V2QI arithmetic/shift insns	Uros Bizjak	1	-10/+23
	optimize_function_for_size_p predicate is not stable during optab selection, because it also depends on node->count/node->frequency of the current function, which are updated during IPA, so they may change between early opts and late opts. Use optimize_size instead - optimize_size implies optimize_function_for_size_p (cfun), so if a named pattern uses "&& optimize_size" and the insn it splits into uses optimize_function_for_size_p (cfun), it shouldn't fail. PR target/114232 gcc/ChangeLog: * config/i386/mmx.md (negv2qi2): Enable for optimize_size instead of optimize_function_for_size_p. Explicitly enable for TARGET_SSE2. (negv2qi SSE reg splitter): Enable for TARGET_SSE2 only. (<plusminus:insn>v2qi3): Enable for optimize_size instead of optimize_function_for_size_p. Explicitly enable for TARGET_SSE2. (<plusminus:insn>v2qi SSE reg splitter): Enable for TARGET_SSE2 only. (<any_shift:insn>v2qi3): Enable for optimize_size instead of optimize_function_for_size_p.
2024-03-06	RISC-V: Use vmv1r.v instead of vmv.v.v for fma output reloads [PR114200].	Robin Dapp	3	-48/+86
	Three-operand instructions like vmacc are modeled with an implicit output reload when the output does not match one of the operands. For this we use vmv.v.v which is subject to length masking. In a situation where the current vl is less than the full vlenb and the fma's result value is used as input for a vector reduction (which is never length masked) we effectively only reduce vl elements. The masked-out elements are relevant for the reduction, though, leading to a wrong result. This patch replaces the vmv reloads by full-register reloads. gcc/ChangeLog: PR target/114200 PR target/114202 * config/riscv/vector.md: Use vmv[1248]r.v instead of vmv.v.v. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr114200.c: New test. * gcc.target/riscv/rvv/autovec/pr114202.c: New test.
2024-03-06	RISC-V: Adjust vec unit-stride load/store costs.	Robin Dapp	4	-10/+188
	Scalar loads provide offset addressing while unit-stride vector instructions cannot. The offset must be loaded into a general-purpose register before it can be used. In order to account for this, this patch adds an address arithmetic heuristic that keeps track of data reference operands. If we haven't seen the operand before we add the cost of a scalar statement. This helps to get rid of an lbm regression when vectorizing (roughly 0.5% fewer dynamic instructions). gcc5 improves by 0.2% and deepsjeng by 0.25%. wrf and nab degrade by 0.1%. This is because we now adjust the cost of SLP as well as loop-vectorized instructions, whereas before we would only adjust loop-vectorized instructions. Considering higher scalar_to_vec costs (3 vs 1) for all vectorization types causes some snippets not to get vectorized anymore. Given these costs the decision looks correct but appears worse when just counting dynamic instructions. In total SPECint 2017 has 4 billion fewer dynamic instructions and SPECfp 0.7 billion fewer. gcc/ChangeLog: * config/riscv/riscv-vector-costs.cc (adjust_stmt_cost): Move... (costs::adjust_stmt_cost): ... to here and add vec_load/vec_store offset handling. (costs::add_stmt_cost): Also adjust cost for statements without stmt_info. * config/riscv/riscv-vector-costs.h: Define zero constant.  gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/vse-slp-1.c: New test. * gcc.dg/vect/costmodel/riscv/rvv/vse-slp-2.c: New test.
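The "charge a scalar statement the first time an operand is seen" idea from the commit above can be modeled in a few lines. This is a toy model under loud assumptions: `AddressCostTracker`, the string operand keys, and the cost value of 1 are all hypothetical and are not GCC's costs::adjust_stmt_cost.

```cpp
#include <string>
#include <unordered_set>

// Toy model of an address arithmetic heuristic: a unit-stride vector access
// needs its address in a register, so the first use of an operand is charged
// one extra scalar statement; later uses of the same operand are free.
class AddressCostTracker {
public:
    int extra_cost(const std::string &operand) {
        const int scalar_stmt_cost = 1;  // assumed cost, for illustration
        bool first_time = seen_.insert(operand).second;
        return first_time ? scalar_stmt_cost : 0;
    }

private:
    std::unordered_set<std::string> seen_;  // operands already accounted for
};
```

With this model, two accesses through the same operand add one unit of cost in total, while accesses through two different operands add two.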
2024-03-06	ARM: Fix conditional execution [PR113915]	Wilco Dijkstra	3	-11/+15
	By default most patterns can be conditionalized on Arm targets. However Thumb-2 predication requires the "predicable" attribute be explicitly set to "yes". Most patterns are shared between Arm and Thumb(-2) and are marked with "predicable". Given this sharing, it does not make sense to use a different default for Arm. So only consider conditional execution of instructions that have the predicable attribute set to yes. This ensures that patterns not explicitly marked as such are never conditionally executed. gcc/ChangeLog: PR target/113915 * config/arm/arm.md (NOCOND): Improve comment. (arm_rev) Add predicable. * config/arm/arm.cc (arm_final_prescan_insn): Add check for PREDICABLE_YES. gcc/testsuite/ChangeLog: PR target/113915 * gcc.target/arm/builtin-bswap-1.c: Fix test to allow conditional execution both for Arm and Thumb-2.
2024-03-06	Revert "Set num_threads to 50 on 32-bit hppa in two libgomp loop tests"	John David Anglin	2	-14/+2
	This reverts commit b14209715e659f6d3ca0f9eef9a4851e7bd6e373.
2024-03-06	[PR target/113001] Fix incorrect operand swapping in conditional move	Jeff Law	3	-2/+37
	This bug totally fell off my radar. Sorry about that. We have some special casing in the conditional move expander to simplify a conditional move when comparing a register against zero and that same register is one of the arms. Specifically a (eq (reg) (const_int 0)) where reg is also the true arm or (ne (reg) (const_int 0)) where reg is the false arm need not use the fully generalized conditional move, thus saving an instruction for those cases. In the NE case we swapped the operands, but didn't swap the condition, which led to the ICE due to an unrecognized pattern. The backend actually has distinct patterns for those two cases. So swapping the operands is neither needed nor advisable. Regression tested on rv64gc and verified the new tests pass. Pushing to the trunk. PR target/113001 PR target/112871 gcc/ * config/riscv/riscv.cc (expand_conditional_move): Do not swap operands when the comparison operand is the same as the false arm for a NE test. gcc/testsuite * gcc.target/riscv/zicond-ice-3.c: New test. * gcc.target/riscv/zicond-ice-4.c: New test.
2024-03-06	Fortran: error recovery while simplifying expressions [PR103707,PR106987]	Harald Anlauf	3	-41/+143
	When an exception is encountered during simplification of arithmetic expressions, the result may depend on whether range-checking is active (-frange-check) or not. However, the code path in the front-end should stay the same for "soft" errors for which the exception is triggered by the check, while "hard" errors should always terminate the simplification, so that error recovery is independent of the flag. Separation of arithmetic error codes into "hard" and "soft" errors shall be done consistently via is_hard_arith_error(). PR fortran/103707 PR fortran/106987 gcc/fortran/ChangeLog: * arith.cc (is_hard_arith_error): New helper function to determine whether an arithmetic error is "hard" or not. (check_result): Use it. (gfc_arith_divide): Set "Division by zero" only for regular numerators of real and complex divisions. (reduce_unary): Use is_hard_arith_error to determine whether a hard or (recoverable) soft error was encountered. Terminate immediately on hard error, otherwise remember code of first soft error. (reduce_binary_ac): Likewise. (reduce_binary_ca): Likewise. (reduce_binary_aa): Likewise. gcc/testsuite/ChangeLog: * gfortran.dg/pr99350.f90: * gfortran.dg/arithmetic_overflow_3.f90: New test.
2024-03-06	c++: ICE with noexcept and local specialization [PR114114]	Marek Polacek	2	-0/+38
	Here we ICE because we call register_local_specialization while local_specializations is null, so local_specializations->put (); crashes on null this. It's null since maybe_instantiate_noexcept calls push_to_top_level which creates a new scope. Normally, I would have guessed that we need a new local_specialization_stack. But here we're dealing with an operand of a noexcept, which is an unevaluated operand, and those aren't registered in the hash map. maybe_instantiate_noexcept wasn't signalling that it's substituting an unevaluated operand though. PR c++/114114 gcc/cp/ChangeLog: * pt.cc (maybe_instantiate_noexcept): Save/restore cp_unevaluated_operand, c_inhibit_evaluation_warnings, and cp_noexcept_operand around the tsubst_expr call. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/noexcept84.C: New test.
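The commit above relies on the fact that the operand of noexcept is an unevaluated operand. A minimal source-level sketch of that property follows; it is not the noexcept84.C test case, and the names `call_count`, `may_throw`, and `no_throw` are assumptions made for the example.

```cpp
// Counts real calls, so we can observe that noexcept(...) performs none.
int call_count = 0;

int may_throw() { ++call_count; return 1; }
int no_throw() noexcept { ++call_count; return 2; }

// The operands are only inspected, never executed: both results are
// compile-time constants and call_count is left untouched.
constexpr bool may_throw_is_noexcept = noexcept(may_throw());
constexpr bool no_throw_is_noexcept = noexcept(no_throw());

static_assert(!may_throw_is_noexcept, "no noexcept specifier");
static_assert(no_throw_is_noexcept, "declared noexcept");
```

Because nothing inside the operand runs, `call_count` is still 0 after these declarations.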
2024-03-06	i386: Eliminate common code from x86_32 TARGET_MACHO part in ix86_expand_move	Uros Bizjak	1	-26/+11
	Eliminate common code from x86_32 TARGET_MACHO part in ix86_expand_move and use generic code instead. No functional changes. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_move) [TARGET_MACHO]: Eliminate common code and use generic code instead.
2024-03-06	amdgcn: additional gfx1030/gfx1100 support: adjust test cases	Thomas Schwinge	4	-4/+4
	The "SDWA" changes in commit 99890e15527f1f04caef95ecdd135c9f1a077f08 "amdgcn: additional gfx1030/gfx1100 support" caused a few regressions: PASS: gcc.target/gcn/sram-ecc-3.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-3.c scan-assembler zero_extendv64qiv64si2 PASS: gcc.target/gcn/sram-ecc-4.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-4.c scan-assembler zero_extendv64hiv64si2 PASS: gcc.target/gcn/sram-ecc-7.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-7.c scan-assembler zero_extendv64qiv64si2 PASS: gcc.target/gcn/sram-ecc-8.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-8.c scan-assembler zero_extendv64hiv64si2 Those test cases need corresponding adjustment. gcc/testsuite/ * gcc.target/gcn/sram-ecc-3.c: Adjust. * gcc.target/gcn/sram-ecc-4.c: Likewise. * gcc.target/gcn/sram-ecc-7.c: Likewise. * gcc.target/gcn/sram-ecc-8.c: Likewise.
2024-03-06	AVR: Adjust rtx cost of plus + zero_extend.	Georg-Johann Lay	1	-0/+7
	gcc/ * config/avr/avr.cc (avr_rtx_costs_1) [PLUS+ZERO_EXTEND]: Adjust rtx cost.
2024-03-06	tree-optimization/114239 - rework reduction epilogue driving	Richard Biener	2	-81/+53
	The following reworks vectorizable_live_operation to pass the live stmt to vect_create_epilog_for_reduction also for early breaks and a peeled main exit. This is to be able to figure the scalar definition to replace. This reverts the PR114192 fix as it is subsumed by this cleanup. PR tree-optimization/114239 * tree-vect-loop.cc (vect_get_vect_def): Remove. (vect_create_epilog_for_reduction): The passed in stmt_info should now be the live stmt that produces the scalar reduction result. Revert PR114192 fix. Base reduction info off info_for_reduction. Remove special handling of early-break/peeled, restore original vector def gathering. Make sure to pick the correct exit PHIs. (vectorizable_live_operation): Pass in the proper stmt_info for early break exits. * gcc.dg/vect/vect-early-break_122-pr114239.c: New testcase.
2024-03-06	LoongArch: testsuite: Rewrite {x,}vfcmp-{d,f}.c to avoid named registers	Xi Ruoyao	4	-139/+816
	Loops on named vector registers are not vectorized (see comment 11 of PR113622), so these test cases have been failing for a while. Rewrite them using check-function-bodies to remove hard-coded register names. A barrier is needed to always load the first operand before the second operand. gcc/testsuite/ChangeLog: * gcc.target/loongarch/vfcmp-f.c: Rewrite to avoid named registers. * gcc.target/loongarch/vfcmp-d.c: Likewise. * gcc.target/loongarch/xvfcmp-f.c: Likewise. * gcc.target/loongarch/xvfcmp-d.c: Likewise.
2024-03-06	aarch64: Define out-of-class static constants	Richard Sandiford	1	-0/+3
	While reworking the aarch64 feature descriptions, I forgot to add out-of-class definitions of some static constants. This could lead to a build failure with some compilers. This was seen with some WIP to increase the number of extensions beyond 64. It's latent on trunk though, and a regression from before the rework. gcc/ * config/aarch64/aarch64-feature-deps.h (feature_deps::info): Add out-of-class definitions of static constants.
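The kind of out-of-class definition the commit above adds can be sketched as follows. This is an illustration under assumptions: `feature_info_sketch` and `max_with_flag` are hypothetical names, and the member is declared `static const` here, which may differ from the declarations in aarch64-feature-deps.h.

```cpp
#include <algorithm>

struct feature_info_sketch {
    // In-class declaration with an initializer. On its own this is not a
    // definition, so the member has no storage yet.
    static const unsigned flag = 0x4;
};

// Out-of-class definition: gives the member storage. Without it, an odr-use
// such as binding the member to a const reference can fail to link.
const unsigned feature_info_sketch::flag;

unsigned max_with_flag(unsigned x) {
    // std::max takes its arguments by const reference, which odr-uses flag.
    return std::max(x, feature_info_sketch::flag);
}
```

Some compilers fold the constant and never need the symbol, which is one way such a missing definition can stay latent until a different compiler or code shape exposes it.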
2024-03-06	c++: Fix template deduction for conversion operators with xobj parameters [PR113629]	Nathaniel Shead	2	-1/+60
	Unification for conversion operators (DEDUCE_CONV) doesn't perform transformations like handling forwarding references. This is correct in general, but not for xobj parameters, which should be handled "normally" for the purposes of deduction: [temp.deduct.conv] only applies to the return type of the conversion function. PR c++/113629 gcc/cp/ChangeLog: * pt.cc (type_unification_real): Only use DEDUCE_CONV for the return type of a conversion function. gcc/testsuite/ChangeLog: * g++.dg/cpp23/explicit-obj-conv-op.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com>