riscv-gnu-toolchain/gcc.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
2024-08-23	ada: Fix validity checks for named parameter associations	Piotr Trojanek	4	-12/+10
	When iterating over actual and formal parameters, we should use First_Actual/Next_Actual and not simply First/Next, because the order of actual parameters might be different than the order of formal parameters obtained with First_Formal/Next_Formal. This patch fixes a glitch in validity checks for actual parameters and applies the same fix to other misuses of First/Next as well. gcc/ada/ * checks.adb (Ensure_Valid): Use First_Actual/Next_Actual. * exp_ch6.adb (Is_Direct_Deep_Call): Likewise. * exp_util.adb (Type_Of_Formal): Likewise. * sem_util.adb (Is_Container_Element): Likewise; cleanup membership test by using a subtype.
2024-08-23	ada: First controlling parameter aspect	Javier Miranda	1	-3/+0
	gcc/ada/ * sem_ch13.adb (Analyze_One_Aspect): Temporarily remove reporting an error when the new aspect is set to True and the extensions are not enabled.
2024-08-23	ada: Error missing when 'access is applied to an interface type object	Javier Miranda	4	-1/+25
	The compiler does not report an error when 'access is applied to a non-aliased class-wide interface type object. gcc/ada/ * exp_util.ads (Is_Expanded_Class_Wide_Interface_Object_Decl): New subprogram. * exp_util.adb (Is_Expanded_Class_Wide_Interface_Object_Decl): ditto. * sem_util.adb (Is_Aliased_View): Handle expanded class-wide type object declaration. * checks.adb (Is_Aliased_Unconstrained_Component): Protect the frontend against calling Is_Aliased_View with Empty. Found working on this issue.
2024-08-23	ada: First controlling parameter aspect	Javier Miranda	19	-23/+860
	This patch adds support for a new GNAT aspect/pragma that modifies the semantics of dispatching primitives. When a tagged type has this aspect/pragma, only subprograms that have the first parameter of this type will be considered dispatching primitives; this new pragma/aspect is inherited by all descendant types. gcc/ada/ * aspects.ads (Aspect_First_Controlling_Parameter): New aspect. Defined as implementation defined aspect that has a static boolean value and it is converted to pragma when the value is True. * einfo.ads (Has_First_Controlling_Parameter): New attribute. * exp_ch9.adb (Build_Corresponding_Record): Propagate the aspect to the corresponding record type. (Expand_N_Protected_Type_Declaration): Analyze the inherited aspect to add the pragma. (Expand_N_Task_Type_Declaration): ditto. * freeze.adb (Warn_If_Implicitly_Inherited_Aspects): New subprogram. (Has_First_Ctrl_Param_Aspect): New subprogram. (Freeze_Record_Type): Call Warn_If_Implicitly_Inherited_Aspects. (Freeze_Subprogram): Check illegal subprograms of tagged types and interface types that have this new aspect. * gen_il-fields.ads (Has_First_Controlling_Parameter): New entity field. * gen_il-gen-gen_entities.adb (Has_First_Controlling_Parameter): The new field is a semantic flag. * gen_il-internals.adb (Image): Add Has_First_Controlling_Parameter. * par-prag.adb (Prag): No action for Pragma_First_Controlling_Parameter since processing is handled entirely in Sem_Prag. * sem_ch12.adb (Validate_Private_Type_Instance): When the generic formal has this new aspect, check that the actual type also has this aspect. * sem_ch13.adb (Analyze_One_Aspect): Check that the aspect is applied to a tagged type or a concurrent type. * sem_ch3.adb (Analyze_Full_Type_Declaration): Derived tagged types inherit this new aspect, and also from their implemented interface types. (Process_Full_View): Propagate the aspect to the full view. * sem_ch6.adb (Is_A_Primitive): New subprogram; used to factor code and also clarify detection of primitives. * sem_ch9.adb (Check_Interfaces): Propagate this new aspect to the type implementing interface types. * sem_disp.adb (Check_Controlling_Formals): Handle tagged type that has the aspect and has subprograms overriding primitives of tagged types that lack this aspect. (Check_Dispatching_Operation): Warn on dispatching primitives disallowed by this new aspect. (Has_Predefined_Dispatching_Operation_Name): New subprogram. (Find_Dispatching_Type): Handle dispatching functions of tagged types that have the new aspect. (Find_Primitive_Covering_Interface): For primitives of tagged types that have the aspect and override a primitive of a parent type that does not have the aspect, we must temporarily unset attribute First_Controlling_ Parameter to properly check conformance. * sem_prag.ads (Aspect_Specifying_Pragma): Add new pragma. * sem_prag.adb (Pragma_First_Controlling_Parameter): Handle new pragma. * snames.ads-tmpl (Name_First_Controlling_Parameter): New name. * warnsw.ads (Warn_On_Non_Dispatching_Primitives): New warning. * warnsw.adb (Warn_On_Non_Dispatching_Primitives): New warning; not set by default when GNAT_Mode warnings are enabled, nor when all warnings are enabled (-gnatwa).
2024-08-23	fortran: Minor fix to -ffrontend-optimize description	Gerald Pfeifer	1	-1/+1
	gcc/fortran: * invoke.texi (Code Gen Options): Add a missing word.
2024-08-23	doc: Specifically link to GPL v3.0 for GM2	Gerald Pfeifer	1	-1/+1
	The generic GPL link redirects to GPL v3.0 right now, but may redirect to a different version at one point. Specifically link to the version we are using gcc: * doc/gm2.texi (License): Specifically link to GPL v3.0
2024-08-23	Remove unnecessary view_convert obsoleted by [PR86468].	Andre Vehreschild	1	-3/+1
	This patch removes an unnecessary view_convert in trans_associate to prevent hard to find runtime errors in the future. The view_convert was erroneously introduced not understanding why ranks of the arrays to assign are different. The ranks are fixed by PR86468 now and the view_convert is obsolete. gcc/fortran/ChangeLog: PR fortran/86468 * trans-stmt.cc (trans_associate_var): Remove superfluous view_convert.
2024-08-22	testsuite: Fix vect-mod-var.c for division by 0 [PR116461]	Andrew Pinski	1	-0/+3
	The testcase cc.dg/vect/vect-mod-var.c has an division by 0 which is undefined. On some targets (aarch64), the scalar and the vectorized version, the result of division by 0 is the same. While on other targets (x86), we get a SIGFAULT. On other targets (powerpc), the results are different. The fix is to make sure the testcase does not test division by 0 (or really mod by 0). Pushed as obvious after testing on x86_64-linux-gnu to make sure the testcase passes now. PR testsuite/116461 gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-mod-var.c: Change the initialization loop so that `b[i]` is never 0. Use 1 in those places. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-08-23	Daily bump.	GCC Administrator	7	-1/+233

2024-08-22	testsuite: Fix gcc.dg/torture/pr116420.c for targets default unsigned char ↵	Andrew Pinski	1	-1/+1
	[PR116464] This is an obvious fix to the gcc.dg/torture/pr116420.c testcase which simplier changes from plain `char` to `signed char` so it works on targets where plain char defaults to unsigned. Pushed as obvious after a quick test for aarch64-linux-gnu to make sure the testcase passes now. PR testsuite/116464 gcc/testsuite/ChangeLog: * gcc.dg/torture/pr116420.c: Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-08-22	[PR rtl-optimization/116420] Fix interesting block bitmap DF dataflow	Jeff Law	2	-1/+18
	The DF framework provides us a way to run dataflow problems on sub-graphs. Naturally a bitmap of interesting blocks is passed into those routines. At a confluence point, the DF framework will not mark a block for re-processing if it's not in that set of interesting blocks. When ext-dce sets up that set of interesting blocks it's using the wrong counter. ie, it's using n_basic_blocks rather than last_basic_block. If there are holes in the block indices, some number of blocks won't get marked as interesting. In this case the block needing reprocessing has an index higher than n_basic_blocks. It never gets reprocessed and the newly found live chunks don't propagate further up the CFG -- ultimately resulting in a pseudo appearing to have only the low 8 bits live, when in fact the low 32 bits are actually live. Fixed in the obvious way, by using last_basic_block instead. Bootstrapped and regression tested on x86_64. Pushing to the trunk. PR rtl-optimization/116420 gcc/ * ext-dce.cc (ext_dce_init): Fix loop iteration when setting up the interesting block for DF to analyze. gcc/testsuite * gcc.dg/torture/pr116420.c: New test.
2024-08-22	libstdc++: Add some missing ranges feature-test macro tests	Patrick Palka	3	-0/+12
	libstdc++-v3/ChangeLog: * testsuite/25_algorithms/contains/1.cc: Verify value of __cpp_lib_ranges_contains. * testsuite/25_algorithms/find_last/1.cc: Verify value of __cpp_lib_ranges_find_last. * testsuite/26_numerics/iota/2.cc: Verify value of __cpp_lib_ranges_iota. Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
2024-08-22	Recompute TYPE_MODE and DECL_MODE for aggregate type for acclerator.	Prathamesh Kulkarni	5	-20/+69
	The patch streams out VOIDmode for aggregate types with offloading enabled, and recomputes appropriate TYPE_MODE and DECL_MODE while streaming-in on accel side. The rationale for this change is to avoid streaming out host-specific modes that may be used for aggregate types, which may not be representable on the accelerator. For eg, AArch64 uses OImode for ARRAY_TYPE whose size is 256-bits, and nvptx doesn't have OImode, and thus ends up emitting an error from lto_input_mode_table. gcc/ChangeLog: * lto-streamer-in.cc: (lto_read_tree_1): Set DECL_MODE (expr) to TREE_TYPE (TYPE_MODE (expr)) if TREE_TYPE (expr) is aggregate type and offloading is enabled. * stor-layout.cc (layout_type): Move computation of mode for ARRAY_TYPE from ... (compute_array_mode): ... to here. * stor-layout.h (compute_array_mode): Declare. * tree-streamer-in.cc: Include stor-layout.h. (unpack_ts_common_value_fields): Call compute_array_mode if offloading is enabled. * tree-streamer-out.cc (pack_ts_fixed_cst_value_fields): Stream out VOIDmode if decl has aggregate type and offloading is enabled. (pack_ts_type_common_value_fields): Stream out VOIDmode for aggregate type if offloading is enabled. Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com>
2024-08-22	RISC-V: Fix vector cfi notes for stack-clash protection	Raphael Moreira Zinsly	2	-3/+18
	The stack-clash code is generating wrong cfi directives in riscv_v_adjust_scalable_frame because REG_CFA_DEF_CFA has a different encoding than REG_FRAME_RELATED_EXPR, this patch fixes the offset sign in prologue and starts using REG_CFA_DEF_CFA in the epilogue. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_v_adjust_scalable_frame): Add epilogue code for stack-clash and fix prologue cfi note. gcc/testsuite/ChangeLog: * gcc.target/riscv/stack-check-cfa-3.c: Fix the expected output.
2024-08-22	libstdc++: Optimize std::projected<I, std::identity>	Patrick Palka	2	-0/+10
	Algorithms that are generalized to take projections typically default the projection to std::identity, which is equivalent to no projection at all. In that case, I believe we could shortcut the projection logic to return the iterator unchanged rather than wrapping it. This should reduce compile times especially after P2609R3 which made the indirect invocability concepts more expensive to check when actual projections are involved. libstdc++-v3/ChangeLog: * include/bits/iterator_concepts.h (__detail::__projected): Define an optimized partial specialization for when the projection is std::identity. * testsuite/24_iterators/indirect_callable/projected.cc: Verify the optimization. Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
2024-08-22	libstdc++: Implement P2997R1 changes to the indirect invocability concepts	Patrick Palka	6	-85/+57
	This implements the changes of this C++26 paper as a DR against C++20. In passing this patch removes the std/ranges/version_c++23.cc test which is now mostly obsolete after the version.def FTM refactoring, and instead expands the __cpp_lib_ranges checks in another test so that it verifies the exact value of the FTM on a per language version basis. libstdc++-v3/ChangeLog: * include/bits/iterator_concepts.h (indirectly_unary_invocable): Relax as per P2997R1. (indirectly_regular_unary_invocable): Likewise. (indirect_unary_predicate): Likewise. (indirect_binary_predicate): Likewise. (indirect_equivalence_relation): Likewise. (indirect_strict_weak_order): Likewise. * include/bits/version.def (ranges): Update value for C++26. * include/bits/version.h: Regenerate. * testsuite/24_iterators/indirect_callable/p2997r1.cc: New test. * testsuite/std/ranges/version_c++23.cc: Remove. * testsuite/std/ranges/headers/ranges/synopsis.cc: Refine the __cpp_lib_ranges checks. Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
2024-08-22	libstdc++: Implement P2609R3 changes to the indirect invocability concepts	Patrick Palka	5	-19/+77
	This implements the changes of this C++23 paper as a DR against C++20. Note that after the later P2538R1 "ADL-proof std::projected" (which we already implement), we can't use a simple partial specialization to match specializations of the 'projected' alias template. So instead we identify such specializations using a pair of distinguishing member aliases. libstdc++-v3/ChangeLog: * include/bits/iterator_concepts.h (__detail::__indirect_value): Define. (__indirect_value_t): Define as per P2609R3. (iter_common_reference_t): Adjust as per P2609R3. (indirectly_unary_invocable): Likewise. (indirectly_regular_unary_invocable): Likewise. (indirect_unary_predicate): Likewise. (indirect_binary_predicate): Likewise. (indirect_equivalence_relation): Likewise. (indirect_strict_weak_order): Likewise. (__detail::__projected::__type): Define member aliases __projected_Iter and __projected_Proj providing the template arguments of the current specialization. * include/bits/version.def (ranges): Update value. * include/bits/version.h: Regenerate. * testsuite/24_iterators/indirect_callable/p2609r3.cc: New test. * testsuite/std/ranges/version_c++23.cc: Update expected value of __cpp_lib_ranges macro. Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
2024-08-22	Update LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook	H.J. Lu	1	-22/+31
	This hook allows the BFD linker plugin to distinguish calls to claim_file_handler that know the object is being used by the linker (from ldmain.c:add_archive_element), from calls that don't know it's being used by the linker (from elf_link_is_defined_archive_symbol); in the latter case, the plugin should avoid including the unused LTO archive members in link output. To get the proper support for archives with LTO common symbols, the linker fix commit a6f8fe0a9e9cbe871652e46ba7c22d5e9fb86208 Author: H.J. Lu <hjl.tools@gmail.com> Date: Wed Aug 14 20:50:02 2024 -0700 lto: Don't include unused LTO archive members in output is required. PR lto/116361 * lto-plugin.c (claim_file_handler_v2): Rename claimed to can_be_claimed. Include the LTO object only if it is known to be included in link output. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-08-22	fold: Fix `a * 1j` if a has side effects [PR116454]	Andrew Pinski	3	-10/+50
	The problem here was a missing save_expr around arg0 since it is used twice, once in REALPART_EXPR and once in IMAGPART_EXPR. Thia adds the save_expr and reformats the code slightly so it is a little easier to understand. It excludes the case when arg0 is a COMPLEX_EXPR since in that case we'll end up with the distinct real and imaginary parts. This is important to retain early optimization in some testcases. Bootstapped and tested on x86_64-linux-gnu with no regressions. PR middle-end/116454 gcc/ChangeLog: * fold-const.cc (fold_binary_loc): Fix `a * +-1i` by wrapping arg0 with save_expr when it is not COMPLEX_EXPR. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr116454-1.c: New test. * gcc.dg/torture/pr116454-2.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com> Co-Authored-By: Richard Biener <rguenther@suse.de>
2024-08-22	fix single argument static_assert	Marc Poulhiès	1	-1/+2
	Single argument static_assert is C++17 only. libcpp/ChangeLog: * lex.cc(search_line_ssse3): fix static_assert to use 2 arguments.
2024-08-22	PR target/116365: Add user-friendly arguments to --param ↵	Jennifer Schmitz	18	-25/+80
	aarch64-autovec-preference=N The param aarch64-autovec-preference=N is a useful tool for testing auto-vectorisation in GCC as it allows the user to force a particular strategy. So far, N could be a numerical value between 0 and 4. This patch replaces the numerical values by more user-friendly names to distinguish the options. The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. Ok for mainline? Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> gcc/ PR target/116365 * config/aarch64/aarch64-opts.h (enum aarch64_autovec_preference_enum): New enum. * config/aarch64/aarch64.cc (aarch64_cmp_autovec_modes): Change numerical to enum values. (aarch64_autovectorize_vector_modes): Change numerical to enum values. (aarch64_vector_costs::record_potential_advsimd_unrolling): Change numerical to enum values. * config/aarch64/aarch64.opt: Change param type to enum. * doc/invoke.texi: Update documentation. gcc/testsuite/ PR target/116365 * gcc.target/aarch64/autovec_param_asimd-only.c: New test. * gcc.target/aarch64/autovec_param_default.c: Likewise. * gcc.target/aarch64/autovec_param_prefer-asimd.c: Likewise. * gcc.target/aarch64/autovec_param_prefer-sve.c: Likewise. * gcc.target/aarch64/autovec_param_sve-only.c: Likewise. * gcc.target/aarch64/neoverse_v1_2.c: Update parameter value. * gcc.target/aarch64/neoverse_v1_3.c: Likewise. * gcc.target/aarch64/sve/cond_asrd_1.c: Likewise. * gcc.target/aarch64/sve/cond_cnot_4.c: Likewise. * gcc.target/aarch64/sve/cond_unary_5.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_5.c: Likewise. * gcc.target/aarch64/sve/cond_xorsign_2.c: Likewise. * gcc.target/aarch64/sve/pr98268-1.c: Likewise. * gcc.target/aarch64/sve/pr98268-2.c: Likewise.
2024-08-22	RISC-V: Enable -gvariable-location-views by default	Bernd Edlinger	4	-8/+26
	This affects only the RISC-V targets, where the compiler options -gvariable-location-views and consequently also -ginline-points are disabled by default, which is unexpected and disables some useful features of the generated debug info. Due to a bug in the gas assembler the .loc statement is not usable to generate location view debug info. That is detected by configure: configure:31500: checking assembler for dwarf2 debug_view support configure:31509: .../riscv-unknown-elf/bin/as -o conftest.o conftest.s >&5 conftest.s: Assembler messages: conftest.s:5: Error: .uleb128 only supports constant or subtract expressions conftest.s:6: Error: .uleb128 only supports constant or subtract expressions configure:31512: $? = 1 configure: failed program was .file 1 "conftest.s" .loc 1 3 0 view .LVU1 nop .data .uleb128 .LVU1 .uleb128 .LVU1 configure:31523: result: no This results in dwarf2out_as_locview_support being set to false, and that creates a sequence of events, with the end result that most inlined functions either have no DW_AT_entry_pc, or one with a wrong entry pc value. But the location views can also be generated without using any .loc statements, therefore we should enable the option -gvariable-location-views by default, regardless of the status of -gas-locview-support. Note however, that the combination of the following compiler options -g -O2 -gvariable-location-views -gno-as-loc-support turned out to create invalid assembler intermediate files, with lots of assembler errors like: Error: leb128 operand is an undefined symbol: .LVU3 This affects all targets, except RISC-V of course ;-) and is fixed by the changes in dwarf2out.cc Finally the .debug_loclists created without assembler locview support did contain location view pairs like v0000000ffffffff v000000000000000 which is the value from FORCE_RESET_NEXT_VIEW, but that is most likely not as expected either, so change that as well. gcc/ChangeLog: * dwarf2out.cc (dwarf2out_maybe_output_loclist_view_pair, output_loc_list): Correct handling of -gno-as-loc-support, use ZERO_VIEW_P to output view number as zero value. * toplev.cc (process_options): Do not automatically disable -gvariable-location-views when -gno-as-loc-support or -gno-as-locview-support is used, instead do automatically disable -gas-locview-support if -gno-as-loc-support is used. gcc/testsuite/ChangeLog: * gcc.dg/debug/dwarf2/inline2.c: Add checks for inline entry_pc. * gcc.dg/debug/dwarf2/inline6.c: Add -gno-as-loc-support and check the resulting location views.
2024-08-22	Do not emit a redundant DW_TAG_lexical_block for inlined subroutines	Bernd Edlinger	2	-3/+32
	While this already works correctly for the case when an inlined subroutine contains only one subrange, a redundant DW_TAG_lexical_block is still emitted when the subroutine has multiple blocks. Fixes: ac02e5b75451 ("re PR debug/37801 (DWARF output for inlined functions doesn't always use DW_TAG_inlined_subroutine)") gcc/ChangeLog: PR debug/87440 * dwarf2out.cc (gen_inlined_subroutine_die): Handle the case of multiple subranges correctly. gcc/testsuite/ChangeLog: * gcc.dg/debug/dwarf2/inline7.c: New test.
2024-08-22	PR tree-optimization/101390: Vectorize modulo operator	Jennifer Schmitz	4	-0/+136
	This patch adds a new vectorization pattern that detects the modulo operation where the second operand is a variable. It replaces the statement by division, multiplication, and subtraction. The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. Ok for mainline? Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> gcc/ PR tree-optimization/101390 * tree-vect-patterns.cc (vect_recog_mod_var_pattern): Add new pattern. gcc/testsuite/ PR tree-optimization/101390 * gcc.dg/vect/vect-mod-var.c: New test. * gcc.target/aarch64/sve/mod_1.c: Likewise. * lib/target-supports.exp: New selector expression.
2024-08-22	Dump aliases in -fcallgraph-info	Alexandre Oliva	3	-0/+48
	Dump ICF-unified decls, thunks, aliases and whatnot along with their ultimate targets, with edges from the alias to the target. Add support for dropping the source file's suffix when forming from dump-base, so that auxiliary files can be scanned, such as the .ci files generated by -fcallgraph-info, as in the testcase. for gcc/ChangeLog * toplev.cc (dump_final_alias_vcg): New. (dump_final_node_vcg): Dump aliases along with node. for gcc/testsuite/ChangeLog * lib/scandump.exp (dump-base): Support {} in dump base suffix to drop it. * gcc.dg/callgraph-info-1.c: New.
2024-08-22	Makefile.tpl: fix whitespace in licence header	Sam James	2	-4/+4
	* Makefile.in: Regenerate. * Makefile.tpl: Fix whitespace. Signed-off-by: Sam James <sam@gentoo.org>
2024-08-22	Makefile.tpl: drop leftover intermodule cruft	Sam James	2	-14/+8
	intermodule supported was dropped in r0-103106-gde6ba7aee152a0 with some remaining bits for Fortran removed in r14-1696-gecc96eb5d2a0e5. Remove some small leftovers. * Makefile.in: Regenerate. * Makefile.tpl (STAGE1_CONFIGURE_FLAGS): Remove --disable-intermodule.
2024-08-22	Align ix86_{move_max,store_max} with vectorizer.	liuhongt	12	-8/+53
	When none of mprefer-vector-width, avx256_optimal/avx128_optimal, avx256_store_by_pieces/avx512_store_by_pieces is specified, GCC will set ix86_{move_max,store_max} as max available vector length except for AVX part. if (TARGET_AVX512F_P (opts->x_ix86_isa_flags) && TARGET_EVEX512_P (opts->x_ix86_isa_flags2)) opts->x_ix86_move_max = PVW_AVX512; else opts->x_ix86_move_max = PVW_AVX128; So for -mavx2, vectorizer will choose 256-bit for vectorization, but 128-bit is used for struct copy, there could be a potential STLF issue due to this "misalign". The patch fixes that. gcc/ChangeLog: * config/i386/i386-options.cc (ix86_option_override_internal): set ix86_{move_max,store_max} to PVW_AVX256 when TARGET_AVX instead of PVW_AVX128. gcc/testsuite/ChangeLog: * gcc.target/i386/pieces-memcpy-10.c: Add -mprefer-vector-width=128. * gcc.target/i386/pieces-memcpy-6.c: Ditto. * gcc.target/i386/pieces-memset-38.c: Ditto. * gcc.target/i386/pieces-memset-40.c: Ditto. * gcc.target/i386/pieces-memset-41.c: Ditto. * gcc.target/i386/pieces-memset-42.c: Ditto. * gcc.target/i386/pieces-memset-43.c: Ditto. * gcc.target/i386/pieces-strcpy-2.c: Ditto. * gcc.target/i386/pieces-memcpy-22.c: New test. * gcc.target/i386/pieces-memset-51.c: New test. * gcc.target/i386/pieces-strcpy-3.c: New test.
2024-08-22	Daily bump.	GCC Administrator	6	-1/+182

2024-08-22	RISC-V: Add testcases for unsigned vector .SAT_TRUNC form 3	Pan Li	13	-0/+236
	This patch would like to add test cases for the unsigned vector .SAT_TRUNC form 3. Aka: Form 3: #define DEF_VEC_SAT_U_TRUNC_FMT_3(NT, WT) \ void __attribute__((noinline)) \ vec_sat_u_trunc_##NT##_##WT##_fmt_3 (NT out, WT in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT max = (WT)(NT)-1; \ out[i] = in[i] <= max ? (NT)in[i] : (NT)max; \ } \ } DEF_VEC_SAT_U_TRUNC_FMT_3 (uint32_t, uint64_t) The below test is passed for this patch. * The rv64gcv regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-13.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-14.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-15.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-17.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-18.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-13.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-14.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-15.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-17.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-18.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-08-22	RISC-V: Add testcases for unsigned vector .SAT_TRUNC form 2	Pan Li	13	-0/+236
	This patch would like to add test cases for the unsigned vector .SAT_TRUNC form 2. Aka: Form 2: #define DEF_VEC_SAT_U_TRUNC_FMT_2(NT, WT) \ void __attribute__((noinline)) \ vec_sat_u_trunc_##NT##_##WT##_fmt_2 (NT out, WT in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT max = (WT)(NT)-1; \ out[i] = in[i] > max ? (NT)max : (NT)in[i]; \ } \ } DEF_VEC_SAT_U_TRUNC_FMT_2 (uint32_t, uint64_t) The below test is passed for this patch. * The rv64gcv regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-10.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-11.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-12.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-7.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-9.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-10.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-11.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-12.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-7.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-9.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-08-21	[PR rtl-optimization/116437] Fix RTL checking issue in ext-dce	Jeff Law	1	-1/+6
	Another RTL checking failure in ext-dce. An easy one to fix this time. When we optimize an extension we have to go back and cleanup with SUBREG_PROMOTED state. So we record the destination register into a bitmap as we make changes, then later do a single pass over the IL fixing any associated subreg expressions. The optimization is changing the SET_SRC and largely ignores the destination. The LHS could be a REG, SUBREG, or ZERO_EXTRACT. If the LHS is a SUBREG or ZERO_EXTRACT we can just strip them. Bootstrapped and ran the testsuite with an RTL checking compiler and verified no ext-dce RTL checking failures tripped. Also bootstrapped and regression tested x86_64 in the usual way. Pushing to the trunk. PR rtl-optimization/116437 gcc/ * ext-dce.cc (ext_dce_try_optimize_insn): Handle SUBREG and ZERO_EXTRACT destinations.
2024-08-21	aarch64: Fix caller saves of VNx2QI [PR116238]	Richard Sandiford	2	-3/+17
	The testcase contains a VNx2QImode pseudo that is live across a call and that cannot be allocated a call-preserved register. LRA quite reasonably tried to save it before the call and restore it afterwards. Unfortunately, the target told it to do that in SImode, even though punning between SImode and VNx2QImode is disallowed by both TARGET_CAN_CHANGE_MODE_CLASS and TARGET_MODES_TIEABLE_P. The natural class to use for SImode is GENERAL_REGS, so this led to an unsalvageable situation in which we had: (set (subreg:VNx2QI (reg:SI A) 0) (reg:VNx2QI B)) where A needed GENERAL_REGS and B needed FP_REGS. We therefore ended up in a reload loop. The hooks above should ensure that this situation can never occur for incoming subregs. It only happened here because the target explicitly forced it. The decision to use SImode for modes smaller than 4 bytes dates back to the beginning of the port, before 16-bit floating-point modes existed. I'm not sure whether promoting to SImode really makes sense for any FPR, but that's a separate performance/QoI discussion. For now, this patch just disallows using SImode when it is wrong for correctness reasons, since that should be safer to backport. gcc/ PR testsuite/116238 * config/aarch64/aarch64.cc (aarch64_hard_regno_caller_save_mode): Only return SImode if we can convert to and from it. gcc/testsuite/ PR testsuite/116238 * gcc.target/aarch64/sve/pr116238.c: New test.
2024-08-21	aarch64: Implement popcountti2 pattern [PR113042]	Andrew Pinski	3	-0/+63
	When CSSC is not enabled, 128bit popcount can be implemented just via the vector (v16qi) cnt instruction followed by a reduction, like how the 64bit one is currently implemented instead of splitting into 2 64bit popcount. Changes since v1: * v2: Make operand 0 be DImode instead of TImode and simplify. Build and tested for aarch64-linux-gnu. PR target/113042 gcc/ChangeLog: * config/aarch64/aarch64.md (popcountti2): New define_expand. gcc/testsuite/ChangeLog: * gcc.target/aarch64/popcnt10.c: New test. * gcc.target/aarch64/popcnt9.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-08-21	tree-optimization/116406 - ICE with int<->float punning prevention	Richard Biener	3	-1/+24
	The following does away with the idea to use non-symmetrical testing of mode_can_transfer_bits in hash-table equality testing. It isn't feasible to always control query order to maintain consistency. PR tree-optimization/116406 * tree-ssa-sccvn.cc (vn_reference_eq): Never equate float and int when the float mode cannot transfer bits. Do not try to anticipate which is the mode we actually load from. * gcc.dg/tree-ssa/pr116406.c: New testcase. * gcc.dg/tree-ssa/ssa-pre-30.c: On x86 dd -msse -mfpmath=sse.
2024-08-21	sra: Avoid risking x87 magling binary representation of a replacement (PR 58416)	Martin Jambor	2	-1/+59
	PR 58416 shows that storing non-floating point data to floating point scalar registers can lead to miscompilations when the data is normalized or otherwise processed upon loading to a register. To avoid that risk, this patch detects situations where we have multiple types and a we decide to represent the data in a type with a mode that is known to not be able to transfer actual bits reliably using the new TARGET_MODE_CAN_TRANSFER_BITS hook. gcc/ChangeLog: 2024-08-19 Martin Jambor <mjambor@suse.cz> PR target/58416 * tree-sra.cc (types_risk_mangled_binary_repr_p): New function. (sort_and_splice_var_accesses): Use it. (propagate_subaccesses_from_rhs): Likewise. gcc/testsuite/ChangeLog: 2024-08-19 Martin Jambor <mjambor@suse.cz> PR target/58416 * gcc.dg/torture/pr58416.c: New test.
2024-08-21	tree-optimization/116380 - bogus SSA update with loop distribution	Richard Biener	2	-0/+19
	When updating LC PHIs after copying loops we have to handle defs defined outside of the loop appropriately (by not setting them to NULL ...). This mimics how we handle this in the SSA updating code of the vectorizer. PR tree-optimization/116380 * tree-loop-distribution.cc (copy_loop_before): Handle out-of-loop defs appropriately. * gcc.dg/torture/pr116380.c: New testcase.
2024-08-21	libstdc++: Use strlen for std::char_traits<char8_t>::length [PR102958]	Jonathan Wakely	1	-4/+1
	libstdc++-v3/ChangeLog: PR tree-optimization/102958 * include/bits/char_traits.h (char_traits<char8_t>::length): Use strlen.
2024-08-21	libstdc++: Check ios::uppercase for ios::fixed floating-point output [PR114862]	Jonathan Wakely	2	-5/+54
	This is LWG 4084 which I filed recently. LWG seems to support making the change, so that std::num_put can use the %F format for floating-point numbers. libstdc++-v3/ChangeLog: PR libstdc++/114862 * src/c++98/locale_facets.cc (__num_base::_S_format_float): Check uppercase flag for fixed format. * testsuite/22_locale/num_put/put/char/lwg4084.cc: New test.
2024-08-21	Fix coarray rank for non-coarrays in derived types. [PR86468]	Andre Vehreschild	5	-11/+70
	The corank was propagated to array components in derived types. Fix this by setting a zero corank when the array component is not a pointer. For pointer typed array components propagate the corank of the derived type to allow associating the component to a coarray. gcc/fortran/ChangeLog: PR fortran/86468 * trans-intrinsic.cc (conv_intrinsic_move_alloc): Correct comment. * trans-types.cc (gfc_sym_type): Pass coarray rank, not false. (gfc_get_derived_type): Only propagate codimension for coarrays and pointers to array components in derived typed coarrays. gcc/testsuite/ChangeLog: * gfortran.dg/coarray_lib_this_image_2.f90: Fix array rank in tree dump scan. * gfortran.dg/coarray_lib_token_4.f90: Same. * gfortran.dg/coarray/move_alloc_2.f90: New test.
2024-08-21	libstdc++: Fix std::variant to reject array types [PR116381]	Jonathan Wakely	2	-4/+19
	libstdc++-v3/ChangeLog: PR libstdc++/116381 * include/std/variant (variant): Fix conditions for static_assert to match the spec. * testsuite/20_util/variant/types_neg.cc: New test.
2024-08-21	c++, coroutines: Check for malformed functions before splitting.	Iain Sandoe	1	-1/+7
	This performs the same basic check that is done by finish_function to catch cases where the function is so badly malformed that we do not have a consistent binding level. gcc/cp/ChangeLog: * coroutines.cc (split_coroutine_body_from_ramp): Check that the binding level is as expected before attempting to outline the function body. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
2024-08-21	testsuite: i386: Fix g++.target/i386/pr116275-2.C on Solaris/x86	Rainer Orth	1	-1/+1
	The new g++.target/i386/pr116275-2.C test FAILs on 32-bit Solaris/x86: FAIL: g++.target/i386/pr116275-2.C scan-assembler vpslld This happens because Solaris defaults to -mstackrealign, disabling -mstv. Fixed by disabling the former and enabling the latter. Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu. 2024-08-20 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: * g++.target/i386/pr116275-2.C (dg-options): Add -mstv -mno-stackrealign.
2024-08-21	Fortran: Fix ICE in sizeof(coarray) [PR77518]	Andre Vehreschild	2	-3/+37
	Use se's class_container where present in sizeof(). PR fortran/77518 gcc/fortran/ChangeLog: * trans-intrinsic.cc (gfc_conv_intrinsic_sizeof): Use class_container of se when set. gcc/testsuite/ChangeLog: * gfortran.dg/coarray/sizeof_1.f90: New test.
2024-08-21	rs6000: Remove "+" constraint modifier from vsx_le_perm_store_ insns	Kewen Lin	1	-5/+5
	Since vsx_le_perm_store_ can be split into vector permute and vector store, after reload_completed, we reuse the operand 1 as the destination of vector permute, so we set operand 1 with constraint modifier "+". But since it's taken as pure input in DF and most passes as Richard pointed out in [1], to ensure it's correct when operand 1 is still live, we actually restore the operand 1's value after the store with vector permute, that is: op1 = vector permute op1 (doubleword swapping) op0 = op2 op1 = vector permute op1 (doubleword swapping) , it means op1's value isn't changed by this insn. So according to the comments from Richard and Segher in that thread, this patch is to remove the "+" constraint modifier of operand 1 from vsx_le_perm_store_ insns. [1] https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660145.html gcc/ChangeLog: * config/rs6000/vsx.md (define_insn *vsx_le_perm_store_{<VSX_D:mode>, <VSX_W:mode>,v8hi,v16qi,<VSX_LE_128:mode>}): Remove constraint modifier "+" from operand 1.
2024-08-21	rs6000: Fix vsx_le_perm_store_* splitters for !reload_completed	Kewen Lin	1	-11/+10
	For vsx_le_perm_store_* we have two splitters, one is for !reload_completed and the other is for reload_completed. As Richard pointed out in [1], operand 1 here is a pure input for DF and most passes, but it could be used as the vector rotation (64 bit) destination of itself, so we re-compute the source (back to the original value) for the case reload_completed, while for !reload_completed we generate one new pseudo, so both cases are fine if operand 1 is still live after this insn. But according to the source code, for !reload_completed case, it can logically reuse the operand 1 as the new pseudo generation is conditional on can_create_pseudo_p, then it can cause wrong result once operand 1 is live. So considering this and there is no splitting for this when reload_in_progress, this patch is to fix the code to assert can_create_pseudo_p there, so that both !reload_completed and reload_completed cases would ensure operand 1 is unchanged (pure input), it is also prepared for the following up patch which would strip the unnecessary INOUT constraint modifier "+". This also fixes an oversight in the splitter for VSX_LE_128 (!reload_completed), it should use operand 1 rather than operand 0. [1] https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660145.html gcc/ChangeLog: * config/rs6000/vsx.md (*vsx_le_perm_store_{<VSX_D:mode>,<VSX_W:mode>, v8hi,v16qi,<VSX_LE_128:mode>} !reload_completed splitters): Assert can_create_pseudo_p and always generate one new pseudo for operand 1.
2024-08-21	testsuite, rs6000: Remove all powerpc-paired uses	Kewen Lin	1	-33/+2
	Similar to r15-710-g458b23bc8b3e2b which removed all uses of powerpc--linuxpaired, this patch is to remove the remaining powerpc-paired* uses which I missed to catch with "linux" in search keyword. gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_vect_support_and_set_flags): Remove the if arm checking powerpc-paired. (check_750cl_hw_available): Remove. (check_effective_target_vect_unpack): Remove the check on powerpc-paired.
2024-08-21	Align predicates for operands[1] between mov<mode> and *mov<mode>_internal.	liuhongt	2	-2/+2
	> It's not obvious to me why movv16qi requires a nonimmediate_operand > > source, especially since ix86_expand_vector_mode does have code to > > cope with constant operand[1]s. emit_move_insn_1 doesn't check the > > predicates anyway, so the predicate will have little effect. > > > > A workaround would be to check legitimate_constant_p instead of the > > predicate, but I'm not sure that that should be necessary. > > > > Has this already been discussed? If not, we should loop in the x86 > > maintainers (but I didn't do that here in case it would be a repeat). > > I also noticed it. Not sure why movv16qi requires a > nonimmediate_operand, while ix86_expand_vector_mode could deal with > constant op. Looking forward to Hongtao's comments. The code has been there since 2005 before I'm involved. It looks to me at the beginning both mov<mode> and mov<mode>_internal only support nonimmediate_operand for the operands[1]. And r0-75606-g5656a184e83983 adjusted the nonimmediate_operand to nonimmediate_or_sse_const_operand for mov<mode>_internal, but not for mov<mode>. I think we can align the predicate between mov<mode> and mov<mode>_internal. gcc/ChangeLog: config/i386/sse.md (mov<mode>): Align predicates for operands[1] between mov<mode> and mov<mode>_internal. config/i386/mmx.md (mov<mode>): Ditto.
2024-08-21	Daily bump.	GCC Administrator	8	-1/+350

2024-08-20	builtins: Don't expand bit query builtins for __int128_t if the target ↵	Andrew Pinski	1	-1/+3
	supports an optab for it On aarch64 (without !CSSC instructions), since popcount is implemented using the SIMD instruction cnt, instead of using two SIMD cnt (V8QI mode), it is better to use one 128bit cnt (V16QI mode). And only one reduction addition instead of 2. Currently fold_builtin_bit_query will expand always without checking if there was an optab for the type, so this changes that to check the optab to see if we should expand or have the backend handle it. Bootstrapped and tested on x86_64-linux-gnu and built and tested for aarch64-linux-gnu. gcc/ChangeLog: * builtins.cc (fold_builtin_bit_query): Don't expand double `unsigned long long` typess if there is an optab entry for that type. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>