riscv-gnu-toolchain/gcc.git

Age	Commit message	Author	Files	Lines
2024-07-24	aarch64: Extend aarch64_feature_flags to 128 bits	Andrew Carlotti	5	-17/+33
	Replace the existing uint64_t typedef with a bbitmap<2> typedef. Most of the preparatory work was carried out in previous commits, so this patch itself is fairly small. gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc (aarch64_set_asm_isa_flags): Store a second uint64_t value. * config/aarch64/aarch64-opts.h (aarch64_feature_flags): Switch typedef to bbitmap<2>. * config/aarch64/aarch64.cc (aarch64_set_current_function): Extract isa mode from val[0]. * config/aarch64/aarch64.h (aarch64_get_asm_isa_flags): Load a second uint64_t value. (aarch64_get_isa_flags): Ditto. (aarch64_asm_isa_flags): Ditto. (aarch64_isa_flags): Ditto. (HANDLE): Use bbitmap<2>::from_index to initialise flags. (AARCH64_FL_ISA_MODES): Do arithmetic on integer type. (AARCH64_ISA_MODE): Extract value from bbitmap<2> array. * config/aarch64/aarch64.opt (aarch64_asm_isa_flags_1): New variable. (aarch64_isa_flags_1): Ditto.
2024-07-24	Add new bbitmap<N> class	Andrew Carlotti	1	-0/+236
	This class provides a constant-size bitmap that can be used as almost a drop-in replacement for bitmaps stored in integer types. The implementation is entirely within the header file and uses recursive templated operations to support effective optimisation and usage in constexpr expressions. This initial implementation hardcodes the choice of uint64_t elements for storage and initialisation, but this could instead be specified via a second template parameter. gcc/ChangeLog: * bbitmap.h: New file.
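The idea of a constant-size bitmap that stands in for an integer mask can be illustrated with a minimal sketch. This is not the contents of bbitmap.h: the name `sketch_bitmap` and everything in its body are assumptions made for illustration. Only `from_index` and the `val` array are names mentioned in the neighbouring commit messages. The sketch uses plain constexpr loops (C++14 or later) where the message describes recursive templated operations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a fixed-size bitmap stored in N uint64_t words.
template <std::size_t N>
struct sketch_bitmap
{
  std::uint64_t val[N] = {};

  // Build a bitmap with only bit INDEX set.
  static constexpr sketch_bitmap from_index (std::size_t index)
  {
    sketch_bitmap r{};
    r.val[index / 64] = std::uint64_t (1) << (index % 64);
    return r;
  }

  // Word-wise OR, as with an integer mask.
  constexpr sketch_bitmap operator| (const sketch_bitmap &o) const
  {
    sketch_bitmap r{};
    for (std::size_t i = 0; i < N; ++i)
      r.val[i] = val[i] | o.val[i];
    return r;
  }

  // Word-wise AND, as with an integer mask.
  constexpr sketch_bitmap operator& (const sketch_bitmap &o) const
  {
    sketch_bitmap r{};
    for (std::size_t i = 0; i < N; ++i)
      r.val[i] = val[i] & o.val[i];
    return r;
  }

  // True if any bit is set; explicit, so tests need a boolean context.
  constexpr explicit operator bool () const
  {
    for (std::size_t i = 0; i < N; ++i)
      if (val[i] != 0)
        return true;
    return false;
  }
};

// A 128-bit mask built from two 64-bit words.
using sketch_flags = sketch_bitmap<2>;

constexpr sketch_flags sketch_low = sketch_flags::from_index (3);
constexpr sketch_flags sketch_high = sketch_flags::from_index (100);

static_assert ((sketch_low | sketch_high).val[0] == 8, "bit 3 is in word 0");
static_assert ((sketch_low | sketch_high).val[1] == (std::uint64_t (1) << 36),
               "bit 100 is bit 36 of word 1");
static_assert (!(sketch_low & sketch_high), "the two masks are disjoint");
```

Because the conversion to bool is explicit in this sketch, code that previously returned or stored a mask test as an integer needs a boolean context or a cast.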
2024-07-24	aarch64: Use constructor explicitly in get_flags_off	Andrew Carlotti	1	-2/+3
	gcc/ChangeLog: * config/aarch64/aarch64-feature-deps.h (get_flags_off): Construct aarch64_feature_flags (0) explicitly.
2024-07-24	aarch64: Add bool conversion to TARGET_* macros	Andrew Carlotti	5	-131/+79
	Use a new AARCH64_HAVE_ISA macro in TARGET_* definitions, and eliminate all the AARCH64_ISA_* feature macros. gcc/ChangeLog: * config/aarch64/aarch64-c.cc (aarch64_define_unconditional_macros): Use TARGET_V8R macro. (aarch64_update_cpp_builtins): Use TARGET_* macros. * config/aarch64/aarch64.h (AARCH64_HAVE_ISA): New macro. (AARCH64_ISA_SM_OFF, AARCH64_ISA_SM_ON, AARCH64_ISA_ZA_ON) (AARCH64_ISA_V8A, AARCH64_ISA_V8_1A, AARCH64_ISA_CRC) (AARCH64_ISA_FP, AARCH64_ISA_SIMD, AARCH64_ISA_LSE) (AARCH64_ISA_RDMA, AARCH64_ISA_V8_2A, AARCH64_ISA_F16) (AARCH64_ISA_SVE, AARCH64_ISA_SVE2, AARCH64_ISA_SVE2_AES) (AARCH64_ISA_SVE2_BITPERM, AARCH64_ISA_SVE2_SHA3) (AARCH64_ISA_SVE2_SM4, AARCH64_ISA_SME, AARCH64_ISA_SME_I16I64) (AARCH64_ISA_SME_F64F64, AARCH64_ISA_SME2, AARCH64_ISA_V8_3A) (AARCH64_ISA_DOTPROD, AARCH64_ISA_AES, AARCH64_ISA_SHA2) (AARCH64_ISA_V8_4A, AARCH64_ISA_SM4, AARCH64_ISA_SHA3) (AARCH64_ISA_F16FML, AARCH64_ISA_RCPC, AARCH64_ISA_RCPC8_4) (AARCH64_ISA_RNG, AARCH64_ISA_V8_5A, AARCH64_ISA_TME) (AARCH64_ISA_MEMTAG, AARCH64_ISA_V8_6A, AARCH64_ISA_I8MM) (AARCH64_ISA_F32MM, AARCH64_ISA_F64MM, AARCH64_ISA_BF16) (AARCH64_ISA_SB, AARCH64_ISA_RCPC3, AARCH64_ISA_V8R) (AARCH64_ISA_PAUTH, AARCH64_ISA_V8_7A, AARCH64_ISA_V8_8A) (AARCH64_ISA_V8_9A, AARCH64_ISA_V9A, AARCH64_ISA_V9_1A) (AARCH64_ISA_V9_2A, AARCH64_ISA_V9_3A, AARCH64_ISA_V9_4A) (AARCH64_ISA_MOPS, AARCH64_ISA_LS64, AARCH64_ISA_CSSC) (AARCH64_ISA_D128, AARCH64_ISA_THE, AARCH64_ISA_GCS): Remove. 
(TARGET_BASE_SIMD, TARGET_SIMD, TARGET_FLOAT) (TARGET_NON_STREAMING, TARGET_STREAMING, TARGET_ZA, TARGET_SHA2) (TARGET_SHA3, TARGET_AES, TARGET_SM4, TARGET_F16FML) (TARGET_CRC32, TARGET_LSE, TARGET_FP_F16INST) (TARGET_SIMD_F16INST, TARGET_DOTPROD, TARGET_SVE, TARGET_SVE2) (TARGET_SVE2_AES, TARGET_SVE2_BITPERM, TARGET_SVE2_SHA3) (TARGET_SVE2_SM4, TARGET_SME, TARGET_SME_I16I64) (TARGET_SME_F64F64, TARGET_SME2, TARGET_ARMV8_3, TARGET_JSCVT) (TARGET_FRINT, TARGET_TME, TARGET_RNG, TARGET_MEMTAG) (TARGET_I8MM, TARGET_SVE_I8MM, TARGET_SVE_F32MM) (TARGET_SVE_F64MM, TARGET_BF16_FP, TARGET_BF16_SIMD) (TARGET_SVE_BF16, TARGET_PAUTH, TARGET_BTI, TARGET_MOPS) (TARGET_LS64, TARGET_CSSC, TARGET_SB, TARGET_RCPC, TARGET_RCPC2) (TARGET_RCPC3, TARGET_SIMD_RDMA, TARGET_ARMV9_4, TARGET_D128) (TARGET_THE, TARGET_GCS): Redefine using AARCH64_HAVE_ISA. (TARGET_V8R, TARGET_V9A): New. * config/aarch64/aarch64.md (arch_enabled): Use TARGET_RCPC2. * config/aarch64/iterators.md (GPI_I16): Use TARGET_FP_F16INST. (GPF_F16): Ditto. * config/aarch64/predicates.md (aarch64_rcpc_memory_operand): Use TARGET_RCPC2.
2024-07-24	aarch64: Add explicit bool cast to return value	Andrew Carlotti	1	-1/+1
	gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_valid_sysreg_name_p): Add bool cast.
2024-07-24	aarch64: Decouple feature flag option storage type	Andrew Carlotti	3	-10/+16
	The awk scripts that process the .opt files are relatively fragile and only handle a limited set of data types correctly. The unrecognised aarch64_feature_flags type is handled as a uint64_t, which happens to be correct for now. However, that assumption will change when we extend the mask to 128 bits. This patch changes the option members to use uint64_t types, and adds a "_0" suffix to the names (both for future extensibility, and to allow the original name to be used for the full aarch64_feature_flags mask within generator files). gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc (aarch64_set_asm_isa_flags): Reorder, and add suffix to names. * config/aarch64/aarch64.h (aarch64_get_asm_isa_flags): Add "_0" suffix. (aarch64_get_isa_flags): Ditto. (aarch64_asm_isa_flags): Redefine using renamed uint64_t value. (aarch64_isa_flags): Ditto. * config/aarch64/aarch64.opt: (aarch64_asm_isa_flags): Rename to... (aarch64_asm_isa_flags_0): ...this, and change to uint64_t. (aarch64_isa_flags): Rename to... (aarch64_isa_flags_0): ...this, and change to uint64_t.
2024-07-24	aarch64: Define aarch64_get_{asm_\|}isa_flags	Andrew Carlotti	3	-23/+26
	Building an aarch64_feature_flags value from data within a gcc_options or cl_target_option struct will get more complicated in a later commit. Use a macro to avoid doing this manually in more than one location. gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc (aarch64_handle_option): Use new macro. * config/aarch64/aarch64.cc (aarch64_override_options_internal): Ditto. (aarch64_option_print): Ditto. (aarch64_set_current_function): Ditto. (aarch64_can_inline_p): Ditto. (aarch64_declare_function_name): Ditto. (aarch64_start_file): Ditto. * config/aarch64/aarch64.h (aarch64_get_asm_isa_flags): New (aarch64_get_isa_flags): New. (aarch64_asm_isa_flags): Use new macro. (aarch64_isa_flags): Ditto.
2024-07-24	aarch64: Introduce aarch64_isa_mode type	Andrew Carlotti	4	-77/+94
	Currently there are many places where an aarch64_feature_flags variable is used, but only the bottom three isa mode bits are set and read. Using a separate data type for these values makes it clearer that they're not expected or required to have any of their upper feature bits set. It will also make things simpler and more efficient when we extend aarch64_feature_flags to 128 bits. This patch uses explicit casts whenever converting from an aarch64_feature_flags value to an aarch64_isa_mode value. This isn't strictly necessary, but serves to highlight the locations where an explicit conversion will become necessary later. gcc/ChangeLog: * config/aarch64/aarch64-opts.h: Add aarch64_isa_mode typedef. * config/aarch64/aarch64-protos.h (aarch64_gen_callee_cookie): Use aarch64_isa_mode parameter. (aarch64_sme_vq_immediate): Ditto. * config/aarch64/aarch64.cc (aarch64_fntype_pstate_sm): Use aarch64_isa_mode values. (aarch64_fntype_pstate_za): Ditto. (aarch64_fndecl_pstate_sm): Ditto. (aarch64_fndecl_pstate_za): Ditto. (aarch64_fndecl_isa_mode): Ditto. (aarch64_cfun_incoming_pstate_sm): Ditto. (aarch64_cfun_enables_pstate_sm): Ditto. (aarch64_call_switches_pstate_sm): Ditto. (aarch64_gen_callee_cookie): Ditto. (aarch64_callee_isa_mode): Ditto. (aarch64_insn_callee_abi): Ditto. (aarch64_sme_vq_immediate): Ditto. (aarch64_add_offset_temporaries): Ditto. (aarch64_add_offset): Ditto. (aarch64_add_sp): Ditto. (aarch64_sub_sp): Ditto. (aarch64_guard_switch_pstate_sm): Ditto. (aarch64_switch_pstate_sm): Ditto. (aarch64_init_cumulative_args): Ditto. (aarch64_allocate_and_probe_stack_space): Ditto. (aarch64_expand_prologue): Ditto. (aarch64_expand_epilogue): Ditto. (aarch64_start_call_args): Ditto. (aarch64_expand_call): Ditto. (aarch64_end_call_args): Ditto. (aarch64_set_current_function): Ditto, with added conversions. (aarch64_handle_attr_arch): Avoid macro with changed type. (aarch64_handle_attr_cpu): Ditto. (aarch64_handle_attr_isa_flags): Ditto. 
(aarch64_switch_pstate_sm_for_landing_pad): Use aarch64_isa_mode values. (aarch64_switch_pstate_sm_for_jump): Ditto. (pass_switch_pstate_sm::gate): Ditto. * config/aarch64/aarch64.h (AARCH64_ISA_MODE_{SM_ON\|SM_OFF\|ZA_ON}): New macros. (AARCH64_FL_SM_STATE): Mark as possibly unused. (AARCH64_ISA_MODE_SM_STATE): New aarch64_isa_mode mask. (AARCH64_DEFAULT_ISA_MODE): New aarch64_isa_mode value. (AARCH64_FL_DEFAULT_ISA_MODE): Define using above value. (AARCH64_ISA_MODE): Change type to aarch64_isa_mode. (arm_pcs): Use aarch64_isa_mode value.
2024-07-24	aarch64: Eliminate a temporary variable.	Andrew Carlotti	1	-5/+4
	The name would become misleading in a later commit anyway, and I think this is marginally more readable. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_override_options): Remove temporary variable.
2024-07-24	aarch64: Move AARCH64_NUM_ISA_MODES definition	Andrew Carlotti	2	-5/+5
	AARCH64_NUM_ISA_MODES will be used within aarch64-opts.h in a later commit. gcc/ChangeLog: * config/aarch64/aarch64.h (DEF_AARCH64_ISA_MODE): Move to... * config/aarch64/aarch64-opts.h (DEF_AARCH64_ISA_MODE): ...here.
2024-07-24	aarch64: Remove unused global aarch64_tune_flags	Andrew Carlotti	1	-4/+0
	gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_tune_flags): Remove unused global variable. (aarch64_override_options_internal): Remove dead assignment.
2024-07-24	c++: add fixed testcase [PR109997]	Jason Merrill	1	-0/+4
	Fixed by r14-9713 for PR100667. PR c++/109997 gcc/testsuite/ChangeLog: * g++.dg/ext/is_assignable1.C: New test.
2024-07-24	optabs/rs6000: Rename iorc and andc to iorn and andn	Andrew Pinski	7	-40/+44
	When I was trying to add a scalar version of iorc and andc, the optab that got matched was for and/ior with the mode of csi and cdi instead of the iorc and andc optabs for si and di modes. Since csi/cdi are the complex integer modes, we need to rename the optabs so that they no longer end in c. This changes c to n, which is neutral and known not to be the first letter of a mode. Bootstrapped and tested on x86_64 and powerpc64le. gcc/ChangeLog: * config/rs6000/rs6000-builtins.def: s/iorc/iorn/. s/andc/andn/ for the code. * config/rs6000/rs6000-string.cc (expand_cmp_vec_sequence): Update to iorn. * config/rs6000/rs6000.md (andc<mode>3): Rename to ... (andn<mode>3): This. (iorc<mode>3): Rename to ... (iorn<mode>3): This. * doc/md.texi: Update documentation for the rename. * internal-fn.def (BIT_ANDC): Rename to ... (BIT_ANDN): This. (BIT_IORC): Rename to ... (BIT_IORN): This. * optabs.def (andc_optab): Rename to ... (andn_optab): This. (iorc_optab): Rename to ... (iorn_optab): This. * gimple-isel.cc (gimple_expand_vec_cond_expr): Update for the renamed internal functions, ANDC/IORC to ANDN/IORN. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
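The naming clash can be reproduced with a toy matcher. This is only a sketch: the tables and the function `sketch_parse` are hypothetical, and the real generator programs do not necessarily match names this way. It assumes pattern names of the form `<optab><mode>3` and lists every way such a name can be read.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy tables; the real optab and mode lists are far larger.
static const std::vector<std::string> sketch_optabs
  = {"and", "andc", "andn", "ior", "iorc", "iorn"};
static const std::vector<std::string> sketch_modes
  = {"si", "di", "csi", "cdi"};

// Return every (optab, mode) reading of a name of the form <optab><mode>3.
inline std::vector<std::pair<std::string, std::string>>
sketch_parse (const std::string &name)
{
  std::vector<std::pair<std::string, std::string>> readings;
  for (const auto &op : sketch_optabs)
    for (const auto &mode : sketch_modes)
      if (name == op + mode + "3")
        readings.emplace_back (op, mode);
  return readings;
}
```

In this toy model `andcsi3` has two readings, `and` with mode `csi` and `andc` with mode `si`, whereas `andnsi3` has only one, because no mode in the table starts with `n`.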
2024-07-24	modula2: Improve error message to include symbol name.	Gaius Mulley	1	-1/+1
	gcc/m2/ChangeLog: * gm2-compiler/M2StateCheck.mod (GenerateError): Add symbol name to the error message. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-07-24	modula2: Add GNU flex as a build and install prerequisite.	Gaius Mulley	1	-0/+4
	gcc/ChangeLog: * doc/install.texi (GM2-prerequisite): Add GNU flex. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-07-24	tree-optimization/116057 - wrong code with CCP and vector CTORs	Richard Biener	2	-0/+31
	The following fixes an issue with CCPs likely_value when faced with a vector CTOR containing undef SSA names and constants. This should be classified as CONSTANT and not UNDEFINED. PR tree-optimization/116057 * tree-ssa-ccp.cc (likely_value): Also walk CTORs in stmt operands to look for constants. * gcc.dg/torture/pr116057.c: New testcase.
2024-07-24	Revert "aarch64: Fuse CMP+CSEL and CMP+CSET for -mcpu=neoverse-v2"	Kyrylo Tkachov	5	-90/+1
	This reverts commit 4c5eb66e701bc9f3bf1298269f52559b10d63a09.
2024-07-24	aarch64: Fuse CMP+CSEL and CMP+CSET for -mcpu=neoverse-v2	Jennifer Schmitz	5	-1/+90
	According to the Neoverse V2 Software Optimization Guide (section 4.14), the instruction pairs CMP+CSEL and CMP+CSET can be fused, which had not been implemented so far. This patch implements and tests the two fusion pairs. The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. There was also no non-noise impact on SPEC CPU2017 benchmark. OK for mainline? Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> gcc/ * config/aarch64/aarch64.cc (aarch_macro_fusion_pair_p): Implement fusion logic. * config/aarch64/aarch64-fusion-pairs.def (cmp+csel): New entry. (cmp+cset): Likewise. * config/aarch64/tuning_models/neoversev2.h: Enable logic in field fusible_ops. gcc/testsuite/ * gcc.target/aarch64/cmp_csel_fuse.c: New test. * gcc.target/aarch64/cmp_cset_fuse.c: Likewise.
2024-07-24	RISC-V: Disable Zba optimization pattern if XTheadMemIdx is enabled	Christoph Müllner	3	-1/+56
	It is possible that the Zba optimization pattern zero_extendsidi2_bitmanip matches for a XTheadMemIdx INSN with the effect of emitting an invalid instruction as reported in PR116035. The pattern above is used to emit a zext.w instruction to zero-extend SI mode registers to DI mode. A similar functionality can be achieved by XTheadBb's th.extu instruction. And indeed, we have the equivalent pattern in thead.md (zero_extendsidi2_th_extu). However, that pattern depends on !TARGET_XTHEADMEMIDX. To compensate for that, there are specific patterns that ensure that zero-extension instruction can still be emitted (th_memidx_bb_zero_extendsidi2 and friends). While we could implement something similar (th_memidx_zba_zero_extendsidi2) it would only make sense, if there existed real HW that does implement Zba and XTheadMemIdx, but not XTheadBb. Unless such a machine exists, let's simply disable zero_extendsidi2_bitmanip if XTheadMemIdx is available. PR target/116035 gcc/ChangeLog: * config/riscv/bitmanip.md: Disable zero_extendsidi2_bitmanip for XTheadMemIdx. gcc/testsuite/ChangeLog: * gcc.target/riscv/pr116035-1.c: New test. * gcc.target/riscv/pr116035-2.c: New test. Reported-by: Patrick O'Neill <patrick@rivosinc.com> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
2024-07-24	x86: Don't enable APX_F in 32-bit mode	Lingling Kong	4	-2/+32
	gcc/ChangeLog: PR target/115978 * config/i386/driver-i386.cc (host_detect_local_cpu): Enable APX_F only for 64-bit codegen. * config/i386/i386-options.cc (DEF_PTA): Skip PTA_APX_F if not in 64-bit mode. gcc/testsuite/ChangeLog: PR target/115978 * gcc.target/i386/pr115978-1.c: New test. * gcc.target/i386/pr115978-2.c: Ditto.
2024-07-24	Internal-fn: Only allow modes describe types for internal fn[PR115961]	Pan Li	2	-0/+64
	direct_internal_fn_supported_p places no restrictions on the type modes. For example, the bitfield store below will be recognized as .SAT_TRUNC. struct e { unsigned pre : 12; unsigned a : 4; }; __attribute__((noipa)) void bug (e * v, unsigned def, unsigned use) { e & defE = *v; defE.a = min_u (use + 1, 0xf); } This patch adds checks to direct_internal_fn_supported_p so that only tree types described by modes are allowed. The following test suites pass with this patch: 1. The rv64gcv full regression tests. 2. The x86 bootstrap tests. 3. The x86 full regression tests. PR target/115961 gcc/ChangeLog: * internal-fn.cc (type_strictly_matches_mode_p): Add new func impl to check type strictly matches mode or not. (type_pair_strictly_matches_mode_p): Ditto but for tree type pair. (direct_internal_fn_supported_p): Add above check for the tree type pair. gcc/testsuite/ChangeLog: * g++.dg/torture/pr115961-run-1.C: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-07-23	[PR rtl-optimization/115877][6/n] Add testcase from pr115877	Jeff Law	1	-0/+20
	This just adds the testcase from pr115877. It's working now on the trunk. I'm not done with cleanups/bugfixing, but there's no reason to not have the testcase installed at this point. PR rtl-optimization/115877 gcc/testsuite * gcc.dg/torture/pr115877.c: New test.
2024-07-24	Daily bump.	GCC Administrator	7	-1/+334

2024-07-24	Output CodeView type information for rvalue references	Mark Harmstone	2	-5/+11
	Translates DW_TAG_rvalue_reference_type DIEs into LF_POINTER types. gcc/ * dwarf2codeview.cc (get_type_num_reference_type): Handle rvalue refs. (get_type_num_array_type): Add DW_TAG_rvalue_reference_type to switch. (get_type_num): Handle DW_TAG_rvalue_reference_type DIEs. * dwarf2codeview.h (CV_PTR_MODE_RVREF): Define.
2024-07-24	Output CodeView type information for references	Mark Harmstone	2	-0/+45
	Translates DW_TAG_reference_type DIEs into LF_POINTER types. gcc/ * dwarf2codeview.cc (get_type_num_reference_type): New function. (get_type_num_array_type): Add DW_TAG_reference_type to switch. (get_type_num): Handle DW_TAG_reference_type DIEs. * dwarf2codeview.h (CV_PTR_MODE_LVREF): Define.
2024-07-23	RISC-V: Fix snafu in SI mode splitters patch	Vineet Gupta	1	-1/+1
	SPEC2017 perlbench for RISC-V was broken, failing with a runtime output mismatch. > 3830: mbox2: dWshe3Aa1EULre4CT5O/ErYFrk+o/EOoebA1kTVjQVQQH2EjT5fHcYnwjj2MdBmZu5y3Ce4Ei4QQZo/SNrry9g > mbox2: uuWPimQiU0D4UrwFP+LS0lFNph4qL43WV1A6T3tHleatIOUaHixhrJU9NoA2lc9KjwYpdEL0lNTXkvo8ymNHzA > ^ > 3832: mbox3: 8f4jdv6GIf0lX3DcdwRdEm6/aZwnmGX6n86GzCvmkwTKFXQjwlwVHc8jy8XlcyiIPr3yXTkgVOiP3cRYvyYQPg > mbox3: 9xQySgP6qbhfxl8Usu1WfGA5UhStB5AN31wueGM6OF4Jp59DkqJPu6ksGblOU5u0nQapQC1e9oYIs16a2mq2NA > ^ > specdiff run completed Edwin bisected this to 273f16a125c4 ("[v3][RISC-V] Handle bit manipulation of SImode values") which had the operands swapped in one of the new splitters introduced. No new test is added, as the reducer narrows it down to the exact test introduced by the original commit. gcc/ChangeLog: * config/riscv/bitmanip.md: Fix splitter. Reported-by: Edwin Lu <ewlu@rivosinc.com> Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
2024-07-23	doc: add missing @option for musttail	Marek Polacek	1	-2/+2
	gcc/ChangeLog: * doc/extend.texi: Add missing @option.
2024-07-23	Add documentation for musttail attribute	Andi Kleen	1	-2/+23
	gcc/ChangeLog: PR c/83324 * doc/extend.texi: Document [[musttail]]
2024-07-23	Add tests for C/C++ musttail attributes	Andi Kleen	14	-0/+327
	Some adopted from the existing C musttail plugin tests. Also extends the ability to query the sibcall capabilities of the target. gcc/testsuite/ChangeLog: * lib/target-supports.exp: (check_effective_target_struct_tail_call): New function. * c-c++-common/musttail1.c: New test. * c-c++-common/musttail12.c: New test. * c-c++-common/musttail13.c: New test. * c-c++-common/musttail2.c: New test. * c-c++-common/musttail3.c: New test. * c-c++-common/musttail4.c: New test. * c-c++-common/musttail5.c: New test. * c-c++-common/musttail7.c: New test. * c-c++-common/musttail8.c: New test. * g++.dg/musttail10.C: New test. * g++.dg/musttail11.C: New test. * g++.dg/musttail6.C: New test. * g++.dg/musttail9.C: New test.
2024-07-23	C: Implement musttail attribute for returns	Andi Kleen	3	-16/+64
	Implement a C23 clang compatible musttail attribute similar to the earlier C++ implementation in the C parser. gcc/c/ChangeLog: PR c/83324 * c-parser.cc (struct attr_state): Define with musttail_p. (c_parser_statement_after_labels): Handle [[musttail]]. (c_parser_std_attribute): Ditto. (c_parser_handle_musttail): Ditto. (c_parser_compound_statement_nostart): Ditto. (c_parser_all_labels): Ditto. (c_parser_statement): Ditto. * c-tree.h (c_finish_return): Add musttail_p flag. * c-typeck.cc (c_finish_return): Handle musttail_p flag.
2024-07-23	C++: Support clang compatible [[musttail]] (PR83324)	Andi Kleen	6	-4/+63
	This patch implements a clang compatible [[musttail]] attribute for returns. musttail is useful as an alternative to computed goto for interpreters. With computed goto the interpreter function usually ends up very big, which causes problems with register allocation and other per-function optimizations not scaling. With musttail the interpreter can instead be written as a sequence of smaller functions that call each other. To avoid unbounded stack growth this requires forcing a sibling call, which this attribute does. It guarantees an error if the call cannot be tail called, which allows the programmer to fix it instead of risking a stack overflow. Unlike computed goto it is also type-safe. It turns out that David Malcolm had already implemented middle/backend support for a musttail attribute back in 2016, but it wasn't exposed to any frontend other than a special plugin. This patch adds a [[gnu::musttail]] attribute for C++ that can be added to return statements. The return statement must be a direct call (it does not follow dependencies), which is similar to what clang implements. It then uses the existing must tail infrastructure. For compatibility it also detects clang::musttail. Passes bootstrap and full test. gcc/c-family/ChangeLog: * c-attribs.cc (set_musttail_on_return): New function. * c-common.h (set_musttail_on_return): Declare new function. gcc/cp/ChangeLog: PR c/83324 * cp-tree.h (AGGR_INIT_EXPR_MUST_TAIL): Add. * parser.cc (cp_parser_statement): Handle musttail. (cp_parser_jump_statement): Ditto. * pt.cc (tsubst_expr): Copy CALL_EXPR_MUST_TAIL_CALL. * semantics.cc (simplify_aggr_init_expr): Handle musttail.
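The interpreter style described above might look like the following minimal sketch. The opcode set, the handler names and the `SKETCH_MUSTTAIL` macro are all hypothetical. The macro falls back to nothing when the compiler does not report either attribute spelling, and whether a given compiler reports it through `__has_cpp_attribute` is an assumption. In the fallback case the calls are ordinary calls and the stack may grow.

```cpp
#include <cassert>

// Pick an attribute spelling the compiler reports, or nothing at all.
#if defined(__has_cpp_attribute)
# if __has_cpp_attribute(gnu::musttail)
#  define SKETCH_MUSTTAIL [[gnu::musttail]]
# elif __has_cpp_attribute(clang::musttail)
#  define SKETCH_MUSTTAIL [[clang::musttail]]
# endif
#endif
#ifndef SKETCH_MUSTTAIL
# define SKETCH_MUSTTAIL
#endif

// Hypothetical three-instruction bytecode.
enum sketch_op : unsigned char { SKETCH_INC, SKETCH_DOUBLE, SKETCH_HALT };

// Every function shares one signature so each call can be a sibling call.
long sketch_dispatch (const unsigned char *pc, long acc);

// Handler: add one, then continue with the next instruction.
long sketch_inc (const unsigned char *pc, long acc)
{
  SKETCH_MUSTTAIL
  return sketch_dispatch (pc + 1, acc + 1);
}

// Handler: double the accumulator, then continue.
long sketch_double (const unsigned char *pc, long acc)
{
  SKETCH_MUSTTAIL
  return sketch_dispatch (pc + 1, acc * 2);
}

// Decode one instruction and tail call its handler; stop at anything else.
long sketch_dispatch (const unsigned char *pc, long acc)
{
  switch (*pc)
    {
    case SKETCH_INC:
      SKETCH_MUSTTAIL
      return sketch_inc (pc, acc);
    case SKETCH_DOUBLE:
      SKETCH_MUSTTAIL
      return sketch_double (pc, acc);
    default:
      return acc;
    }
}
```

Each return statement is directly a call, matching the restriction the message describes. Running the program INC, DOUBLE, INC, HALT from an accumulator of 1 gives 2, then 4, then 5.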
2024-07-23	c++: normalizing ttp constraints [PR115656]	Patrick Palka	4	-6/+19
	Here we normalize the constraint same_as<T, bool> for the first time during ttp coercion of B / UU, specifically constraint subsumption checking. During this normalization the set of in-scope template parameters i.e. current_template_parms is empty, which we rely on during normalization of the ttp constraints since we pass in_decl=NULL_TREE to norm_info. And this tricks the satisfaction cache into thinking that the satisfaction value of same_as<T, bool> is independent of its template parameters, and we incorrectly conflate the satisfaction value with T = bool vs T = long and accept the specialization A<long, B>. Since is_compatible_template_arg rewrites the ttp's constraints to be in terms of the argument template's parameters, and since it's the only caller of weakly_subsumes, the latter function can instead pass in_decl=tmpl to avoid relying on current_template_parms. This patch implements this, and in turn renames weakly_subsumes to ttp_subsumes to reflect that this predicate is now hardcoded for this one caller. PR c++/115656 gcc/cp/ChangeLog: * constraint.cc (weakly_subsumes): Pass in_decl=tmpl to get_normalized_constraints_from_info. Rename to ... (ttp_subsumes): ... this. * cp-tree.h (weakly_subsumes): Rename to ... (ttp_subsumes): ... this. * pt.cc (is_compatible_template_arg): Adjust after renaming. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-ttp7.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
2024-07-23	c++: missing SFINAE during alias CTAD [PR115296]	Patrick Palka	2	-1/+20
	During the alias CTAD transformation, if substitution failed for some guide we should just silently discard the guide. We currently do discard the guide, but not silently, as in the below testcase which we diagnose forming a too-large array type when transforming the user-defined deduction guides. This patch fixes this by using complain=tf_none instead of tf_warning_or_error throughout alias_ctad_tweaks. PR c++/115296 gcc/cp/ChangeLog: * pt.cc (alias_ctad_tweaks): Use complain=tf_none instead of tf_warning_or_error. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/class-deduction-alias23.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
2024-07-23	PR modula2/116048 ICE when encountering wrong kind of qualident	Gaius Mulley	13	-56/+645
	Following on from PR-115957 further ICEs can be generated by using the wrong kind of qualident symbol. For example using a variable instead of a type or using a type instead of a const. This fix tracks the expected qualident kind state when parsing const, type and variable declarations. If the error is unrecoverable then a detailed message explaining the context of the qualident (and why the seen qualident is wrong) is generated. gcc/m2/ChangeLog: PR modula2/116048 * Make-lang.in (GM2-COMP-BOOT-DEFS): Add M2StateCheck.def. (GM2-COMP-BOOT-MODS): Add M2StateCheck.mod. (GM2-COMP-DEFS): Add M2StateCheck.def. (GM2-COMP-MODS): Add M2StateCheck.mod. * gm2-compiler/M2Quads.mod (StartBuildWith): Generate unrecoverable error if the qualident type is NulSym. Replace MetaError1 with MetaErrorT1 and position the error to the qualident. * gm2-compiler/P3Build.bnf (M2StateCheck): Import procedures. (seenError): New variable. (WasNoError): Remove variable. (BlockState): New variable. (ErrorString): Rewrite using seenError. (CompilationUnit): Ditto. (QualidentCheck): New rule. (ConstantDeclaration): Bookend with InclConst and ExclConst. (Constructor): Add InclConstructor, ExclConstructor and call CheckQualident. (ConstActualParameters): Call PushState, PopState, InclConstFunc and CheckQualident. (TypeDeclaration): Bookend with InclType and ExclType. (SimpleType): Call QualidentCheck. (CaseTag): Ditto. (OptReturnType): Ditto. (VariableDeclaration): Bookend with InclVar and ExclVar. (Designator): Call QualidentCheck. (Formal;Type): Ditto. * gm2-compiler/PCBuild.bnf (M2StateCheck): Import procedures. (ConstantDeclaration): Rewrite using InclConst and ExclConst. (Constructor): Bookend with InclConstructor and ExclConstructor. Call CheckQualident. (ConstructorOrConstActualParameters): Rewrite and call CheckQualident. (ConstActualParameters): Bookend with PushState PopState. Call InclConstFunc and CheckQualident. * gm2-gcc/init.cc (_M2_M2StateCheck_init): New declaration. 
(_M2_P3Build_init): New declaration. (init_PerCompilationInit): Call _M2_M2StateCheck_init and _M2_P3Build_init. * gm2-compiler/M2StateCheck.def: New file. * gm2-compiler/M2StateCheck.mod: New file. gcc/testsuite/ChangeLog: PR modula2/116048 * gm2/errors/fail/errors-fail.exp: Remove -Wstudents and add -Wuninit-variable-checking=all. Replace gm2_init_pim with gm2_init_iso. * gm2/errors/fail/testfio.mod: Modify test code to provoke an error in the first basic block. * gm2/errors/fail/testparam.mod: Ditto. * gm2/errors/fail/array1.mod: Ditto. * gm2/errors/fail/badtype.mod: New test. * gm2/errors/fail/badvar.mod: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-07-23	cp/coroutines: add a test for PR c++/103953	Arsen Arsenović	1	-0/+75
	This PR seems to have been fixed by a fix for a seemingly unrelated PR. Let's add a regression test to make sure it stays fixed. PR c++/103953 - Leak of coroutine return object gcc/testsuite/ChangeLog: * g++.dg/coroutines/torture/pr103953.C: New test. Reviewed-by: Iain Sandoe <iain@sandoe.co.uk>
2024-07-23	install.texi (gcn): Suggest newer commit for Newlib	Tobias Burnus	1	-3/+3
	Newlib 4.4.0 lacks two commits: 7dd4eb1db (2024-03-25) to fix device console output for GFX10/GFX11 and ed50a50b9 (2024-04-04) to make the added lock.h compilable with C++. This commit now also mentions the second commit. gcc/ChangeLog: * doc/install.texi (amdgcn-x-amdhsa): Suggest newer git version for newlib.
2024-07-23	report message for operator %a on unaddressible operand	Jiufu Guo	3	-1/+37
	Hi, For PR96866, when printing asm code for the modifier "%a", an addressable operand is required. However, the constraint "X" allows any kind of operand, even one whose address is hard to obtain directly, e.g. an extern symbol whose address is in the TOC. An error message is now reported to indicate the invalid asm operand. Compared with the previous version, the test case is updated with -mno-pcrel. Bootstrap&regtest pass on ppc64{,le}. Is this ok for trunk? BR, Jeff(Jiufu Guo) PR target/96866 gcc/ChangeLog: * config/rs6000/rs6000.cc (print_operand_address): Emit message for unsupported operand. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr96866-1.c: New test. * gcc.target/powerpc/pr96866-2.c: New test.
2024-07-23	testsuite: Disable finite math only for test [PR115826]	Torbjörn SVENSSON	1	-0/+3
	As the test case requires +-Inf and NaN to work and -ffast-math is added by default for arm-none-eabi, re-enable non-finite math. gcc/testsuite/ChangeLog: PR testsuite/115826 * gcc.dg/vect/tsvc/vect-tsvc-s1281.c: Use -fno-finite-math-only. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
2024-07-23	tree-optimization/116002 - PTA solving slow with degenerate graph	Richard Biener	1	-0/+12
	When the constraint graph consists of N nodes with only complex constraints and no copy edges we have to be lucky to arrive at a constraint solving order that requires the optimal number of iterations. What happens in the testcase is that we bottle-neck on computing the visitation order but propagate changes only very slowly. Luckily the testcase complex constraints are all copy-with-offset and those do provide a way to order visitation. The following adds this which reduces the iteration count to one. PR tree-optimization/116002 * tree-ssa-structalias.cc (topo_visit): Also consider SCALAR = SCALAR complex constraints as edges.
2024-07-23	ssa: Fix up maybe_rewrite_mem_ref_base complex type handling [PR116034]	Jakub Jelinek	2	-1/+26
	The folding into REALPART_EXPR is correct, used only when the mem_offset is zero, but for IMAGPART_EXPR it didn't check the exact offset value (just that it is not 0). The following patch fixes that by using IMAGPART_EXPR only if the offset is right and using BITFIELD_REF or whatever else otherwise. 2024-07-23 Jakub Jelinek <jakub@redhat.com> Andrew Pinski <quic_apinski@quicinc.com> PR tree-optimization/116034 * tree-ssa.cc (maybe_rewrite_mem_ref_base): Only use IMAGPART_EXPR if MEM_REF offset is equal to element type size. * gcc.dg/pr116034.c: New test.
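The offset check can be pictured with a small sketch. The enum and both functions are hypothetical and ignore everything else the real function has to verify. They only contrast the behaviour the message describes before the fix (any nonzero offset selects the imaginary part) with the behaviour after it (the offset must equal the element size).

```cpp
#include <cassert>
#include <cstdint>

// Which kind of access a byte offset into a complex value would become.
enum class sketch_part { real, imag, other };

// Before the fix, as described: only "offset is not 0" was checked.
constexpr sketch_part
sketch_classify_old (std::uint64_t offset, std::uint64_t /* elem_size */)
{
  return offset == 0 ? sketch_part::real : sketch_part::imag;
}

// After the fix: the imaginary part needs offset == element size.
constexpr sketch_part
sketch_classify_new (std::uint64_t offset, std::uint64_t elem_size)
{
  return offset == 0 ? sketch_part::real
         : offset == elem_size ? sketch_part::imag
         : sketch_part::other;
}

// With 8-byte elements, a 2-byte offset is not the imaginary part.
static_assert (sketch_classify_old (2, 8) == sketch_part::imag,
               "old check misclassifies the access");
static_assert (sketch_classify_new (2, 8) == sketch_part::other,
               "new check falls back to some other form");
```

Accesses classified as `other` here correspond to the "BITFIELD_REF or whatever else" case in the message.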
2024-07-23	c++: Remove CHECK_CONSTR	Jakub Jelinek	6	-51/+0
	On Mon, Jul 22, 2024 at 11:48:51AM -0400, Patrick Palka wrote: > FWIW this tree code seems to be a vestige of the initial Concepts TS > implementation and is effectively unused, we can remove it outright. Here is a patch which removes that. 2024-07-23 Jakub Jelinek <jakub@redhat.com> * cp-tree.def (CHECK_CONSTR): Remove. * cp-tree.h (CHECK_CONSTR_CONCEPT, CHECK_CONSTR_ARGS): Remove. * cp-objcp-common.cc (cp_common_init_ts): Don't handle CHECK_CONSTR. * tree.cc (cp_tree_equal): Likewise. * error.cc (dump_expr): Likewise. * cxx-pretty-print.cc (cxx_pretty_printer::expression): Likewise. (pp_cxx_check_constraint): Remove. (pp_cxx_constraint): Don't handle CHECK_CONSTR.
2024-07-23	[v2] rtl-optimization/116002 - cselib hash is bad	Richard Biener	1	-102/+122
	The following addresses the bad hash function of cselib which uses integer plus for merging. This causes a huge number of collisions for the testcase in the PR and thus very large compile-time. The following rewrites it to use inchash, eliding duplicate mixing of RTX code and mode in some cases and more consistently avoiding a return value of zero as well as treating zero as fatal. An important part is to preserve mixing of hashes of commutative operators as commutative. For cselib_hash_plus_const_int this removes the apparent attempt of making sure to hash the same as a PLUS as cselib_hash_rtx makes sure to dispatch to cselib_hash_plus_const_int consistently. This reduces compile-time for the testcase in the PR from unknown to 22s and for a reduced testcase from 73s to 9s. There's another pending patchset to improve the speed of inchash mixing, but it's not in the profile for this testcase (PTA pops up now). The generated code is equal. I've also compared cc1 builds with and without the patch and they are now comparing equal after retaining commutative hashing for commutative operators. PR rtl-optimization/116002 * cselib.cc (cselib_hash_rtx): Use inchash to get proper mixing. Consistently avoid a zero return value when hashing successfully. Consistently treat a zero hash value from recursing as fatal. Use hashval_t where appropriate. (cselib_hash_plus_const_int): Likewise. (new_cselib_val): Use hashval_t. (cselib_lookup_1): Likewise.
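Why merging with integer plus collides, and how mixing can stay commutative where needed, can be shown with a short sketch. None of this is cselib or inchash code: `sketch_mix` is a simple shift-and-xor combining step chosen for illustration, and ordering the two operand hashes before mixing is just one possible way to keep commutative operators commutative.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

using sketch_hash_t = std::uint32_t;

// Old style: merge the code and operand hashes with integer plus.
inline sketch_hash_t
sketch_additive (sketch_hash_t code, sketch_hash_t a, sketch_hash_t b)
{
  return code + a + b;
}

// Hypothetical mixing step; it stands in for a real mixing function.
inline sketch_hash_t
sketch_mix (sketch_hash_t seed, sketch_hash_t v)
{
  return seed ^ (v + 0x9e3779b9u + (seed << 6) + (seed >> 2));
}

// Hash a binary operation. For commutative codes the operand hashes are
// put in a canonical order first, so swapped operands hash identically.
// Zero is never returned, leaving it free to signal failure.
inline sketch_hash_t
sketch_hash_binary (sketch_hash_t code, sketch_hash_t a, sketch_hash_t b,
                    bool commutative)
{
  if (commutative && b < a)
    std::swap (a, b);
  sketch_hash_t h = sketch_mix (sketch_mix (sketch_mix (0, code), a), b);
  return h != 0 ? h : 1;
}
```

With plus, any two operand pairs that share a sum collide, for example (1, 4) and (2, 3). With mixing, the result depends on each operand and on their order, unless the operator is flagged commutative.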
2024-07-23	Relax ix86_hardreg_mov_ok after split1.	liuhongt	2	-3/+13
	ix86_hardreg_mov_ok was added by r11-5066-gbe39636d9f68c4 > The solution proposed here is to have the x86 backend/recog prevent > early RTL passes composing instructions (that set likely_spilled hard > registers) that they (combine) can't simplify, until after reload. > We allow sets from pseudo registers, immediate constants and memory > accesses, but anything more complicated is performed via a temporary > pseudo. Not only does this simplify things for the register allocator, > but any remaining register-to-register moves are easily cleaned up > by the late optimization passes after reload, such as peephole2 and > cprop_hardreg. The restriction is mainly for rtl optimization passes before pass_combine. But split1 splits ``` (insn 17 13 18 2 (set (reg/i:V4SI 20 xmm0) (vec_merge:V4SI (const_vector:V4SI [ (const_int -1 [0xffffffffffffffff]) repeated x4 ]) (const_vector:V4SI [ (const_int 0 [0]) repeated x4 ]) (unspec:QI [ (reg:V4SF 106) (reg:V4SF 102) (const_int 0 [0]) ] UNSPEC_PCMP))) "/app/example.cpp":20:1 2929 {avx_cmpv4sf3_1} (expr_list:REG_DEAD (reg:V4SF 102) (expr_list:REG_DEAD (reg:V4SF 106) (nil)))) ``` into: ``` (insn 23 13 24 2 (set (reg:V4SF 107) (unspec:V4SF [ (reg:V4SF 106) (reg:V4SF 102) (const_int 0 [0]) ] UNSPEC_PCMP)) "/app/example.cpp":20:1 -1 (nil)) (insn 24 23 18 2 (set (reg/i:V4SI 20 xmm0) (subreg:V4SI (reg:V4SF 107) 0)) "/app/example.cpp":20:1 -1 (nil)) ``` There are many splitters that generate a MOV insn with a SUBREG and would have the same problem. Instead of changing those splitters one by one, the patch relaxes ix86_hardreg_mov_ok to allow mov subreg to hard register after split1. ix86_pre_reload_split () is used to replace !reload_completed && ira_in_progress. gcc/ChangeLog: * config/i386/i386.cc (ix86_hardreg_mov_ok): Relax mov subreg to hard register after split1. gcc/testsuite/ChangeLog: * g++.target/i386/pr115982.C: New test.
2024-07-23	rs6000: Update option set in rs6000_inner_target_options [PR115713]	Kewen Lin	2	-1/+24
	When function rs6000_inner_target_options parses target options, it updates the explicit option set information for rs6000_opt_masks via rs6000_isa_flags_explicit, but it fails to update that information for rs6000_opt_vars, which can have unexpected consequences, as the associated test case shows. This patch fixes rs6000_inner_target_options to update the option set for rs6000_opt_vars as well. PR target/115713 gcc/ChangeLog: * config/rs6000/rs6000.cc (rs6000_inner_target_options): Update option set information for rs6000_opt_vars. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr115713-2.c: New test.
2024-07-23	rs6000: Consider explicitly set options in target option parsing [PR115713]	Kewen Lin	3	-3/+26
	In rs6000_inner_target_options, when enabling VSX we implicitly enable altivec and disable -mavoid-indexed-addresses, but this doesn't consider the case where the options altivec and avoid-indexed-addresses are explicitly disabled. As the test case in PR115713#c1 shows, with target attribute "no-altivec,vsx", VSX unexpectedly sets the altivec flag and the expected error is not emitted. This patch avoids the automatic enablement when these options are explicitly specified. With this change, an existing test case ppc-target-4.c also requires an adjustment by specifying explicit altivec in the target attribute (since it requires the altivec feature and the command line specifies no-altivec). PR target/115713 gcc/ChangeLog: * config/rs6000/rs6000.cc (rs6000_inner_target_options): Avoid enabling altivec or disabling avoid-indexed-addresses automatically when they are specified explicitly. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr115713-1.c: New test. * gcc.target/powerpc/ppc-target-4.c: Adjust by specifying altivec in target attribute.
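The rule described above, implying an option only when the user has not named it explicitly, can be sketched as follows. This is a minimal illustration and not the rs6000 code: the `Options` struct, the `OPT_VSX` and `OPT_ALTIVEC` bit values, and the `set_explicit` and `enable_vsx` helpers are all hypothetical.

```cpp
#include <cstdint>

// Hypothetical option bits; these are not the real rs6000 mask values.
constexpr uint32_t OPT_VSX = 1u << 0;
constexpr uint32_t OPT_ALTIVEC = 1u << 1;

struct Options {
  uint32_t flags;      // options currently enabled
  uint32_t explicit_;  // options the user named explicitly, on or off
};

// Apply one "name" or "no-name" request and remember that it was explicit.
inline void set_explicit(Options &o, uint32_t bit, bool enable) {
  if (enable)
    o.flags |= bit;
  else
    o.flags &= ~bit;
  o.explicit_ |= bit;
}

// Enable VSX. Altivec is implied only if the user said nothing about it.
// Returns false when an explicit "no-altivec" conflicts with VSX, which is
// where a caller would report an error.
inline bool enable_vsx(Options &o) {
  set_explicit(o, OPT_VSX, true);
  if (!(o.explicit_ & OPT_ALTIVEC)) {
    o.flags |= OPT_ALTIVEC;  // implicit enablement, not recorded as explicit
    return true;
  }
  return (o.flags & OPT_ALTIVEC) != 0;
}
```

Under these assumptions, a request equivalent to "vsx" alone turns altivec on, while "no-altivec,vsx" leaves altivec off and reports the conflict instead of silently overriding the user's choice.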
2024-07-23	rs6000: Escalate warning to error for VSX with explicit no-altivec etc.	Kewen Lin	2	-20/+24
	As discussed in PR115688, when users currently specify -mvsx and -mno-altivec explicitly, the compiler emits a warning rather than an error, but since both options are given explicitly, emitting a hard error is better. So this patch escalates the related warnings to errors when the explicit options are incompatible. PR target/115713 gcc/ChangeLog: * config/rs6000/rs6000.cc (rs6000_option_override_internal): Emit error messages when explicit VSX encounters explicit soft-float, no-altivec or avoid-indexed-addresses. gcc/testsuite/ChangeLog: * gcc.target/powerpc/warn-1.c: Move to ... * gcc.target/powerpc/error-1.c: ... here. Adjust dg-warning with dg-error and remove ineffective scan.
2024-07-23	i386: Change prefetchi output template	Haochen Jiang	2	-3/+3
	For prefetchi instructions, a RIP-relative address is explicitly required for the operand and the assembler obeys that rule strictly. This makes an instruction like: prefetchit0 bar illegal for the assembler, although this should be a common usage of prefetchi. Change to %a to explicitly add (%rip) after the function label to make it legal in the assembler so that it can be passed to the linker to get the real address. gcc/ChangeLog: * config/i386/i386.md (prefetchi): Change to %a. gcc/testsuite/ChangeLog: * gcc.target/i386/prefetchi-1.c: Check (%rip).
2024-07-22	[5/n][PR rtl-optimization/115877] Fix handling of input/output operands	Jeff Law	1	-5/+26
	So in this patch we're correcting a failure to mark objects live in scenarios like (set (dest) (plus (dest) (src)) When handling set pseudos, we transfer the liveness information from LIVENOW into LIVE_TMP. LIVE_TMP is subsequently used to narrow what bit groups are live for the inputs. The first time we process the block we may not have DEST in the LIVENOW set (it may be live across the loop, but not live after the loop). Thus we can totally miss making certain objects live, resulting in incorrect code. The fix is pretty simple. If LIVE_TMP is empty, then we should go ahead and mark all the bit groups for the set object in LIVE_TMP. This also removes an invalid gcc_assert on the state of the liveness bitmaps. This showed up on pru, rl78 and/or msp430 in the testsuite. So no new test. Bootstrapped and regression tested on x86_64 and also run through my tester on all the cross platforms. Pushing to the trunk. PR rtl-optimization/115877 gcc/ * ext-dce.cc (ext_dce_process_sets): Reasonably handle input/output operands. (ext_dce_rd_transfer_n): Drop bogus assertion.
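The fix can be sketched in isolation as follows. This is a simplified model and not the ext-dce.cc code: the `Live` bitset, the `GROUPS` and `NREGS` constants, and the `transfer_dest` helper are hypothetical, and the real pass tracks liveness with GCC's own bitmap type.

```cpp
#include <bitset>

// Hypothetical model: each register has a fixed number of bit groups, and a
// liveness set holds one bit per (register, group) pair.
constexpr int GROUPS = 4;
constexpr int NREGS = 8;
using Live = std::bitset<NREGS * GROUPS>;

// Transfer the liveness of destination register REGNO from LIVENOW into
// LIVE_TMP. If nothing was transferred, LIVE_TMP is still empty, so
// conservatively mark every bit group of the destination as live. This
// covers the case where the destination is also an input, as in
// (set (dest) (plus (dest) (src))), on the first visit to a block.
inline void transfer_dest(const Live &livenow, Live &live_tmp, int regno) {
  for (int g = 0; g < GROUPS; ++g) {
    int bit = regno * GROUPS + g;
    if (livenow.test(bit))
      live_tmp.set(bit);
  }
  if (live_tmp.none()) {
    for (int g = 0; g < GROUPS; ++g)
      live_tmp.set(regno * GROUPS + g);
  }
}
```

When the destination already has live groups, only those are carried over and later used to narrow the inputs; the fallback applies only when the set would otherwise be empty.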
2024-07-22	[powerpc] [testsuite] reorder dg directives [PR106069]	Alexandre Oliva	1	-1/+1
	The dg-do directive appears after dg-require-effective-target in g++.target/powerpc/pr106069.C. That doesn't work the way that was presumably intended. Both of these directives set dg-do-what, but dg-do does so fully and unconditionally, overriding any decisions recorded there by earlier directives. Reorder the directives more canonically, so that both take effect. for gcc/testsuite/ChangeLog PR target/106069 * g++.target/powerpc/pr106069.C: Reorder dg directives.
2024-07-22	c++/coroutines: correct passing *this to promise type [PR104981]	Patrick Palka	3	-15/+84
	When passing this to the promise type ctor (or to its operator new) (as per [dcl.fct.def.coroutine]/4), we add an explicit cast to lvalue reference. But this is unnecessary since this is already always an lvalue. And doing so means we need to call convert_from_reference afterward to lower the reference expression to an implicit dereference, which we're currently neglecting to do and which causes overload resolution to get confused when computing argument conversions. So this patch removes this unneeded reference cast when passing this to the promise ctor, and removes both the cast and implicit deref when passing this to operator new, for consistency. While we're here, use cp_build_fold_indirect_ref instead of directly building INDIRECT_REF. PR c++/104981 PR c++/115550 gcc/cp/ChangeLog: * coroutines.cc (morph_fn_to_coro): Remove unneeded calls to convert_to_reference and convert_from_reference when passing this. Use cp_build_fold_indirect_ref instead of directly building INDIRECT_REF. gcc/testsuite/ChangeLog: * g++.dg/coroutines/pr104981-preview-this.C: New test. * g++.dg/coroutines/pr115550-preview-this.C: New test. Reviewed-by: Iain Sandoe <iain@sandoe.co.uk> Reviewed-by: Jason Merrill <jason@redhat.com>