riscv-gnu-toolchain/gcc.git

Age	Commit message	Author	Files	Lines
2024-06-27	ada: Implement first half of Generalized Finalization	Eric Botcazou	27	-242/+621
	This implements the first half of the Generalized Finalization proposal, namely the Finalizable aspect as well as its optional relaxed semantics for the finalization operations, but the latter part is only implemented for dynamically allocated objects. In accordance with the spirit, if not the letter, of the proposal, this implements the finalizable types declared with strict semantics for the finalization operations as a direct generalization of controlled types, which in turn makes it possible to reimplement the latter types in terms of the former types and ensures full interoperability between them. The relaxed semantics for the finalization operations is also a direct generalization of the GNAT pragma No_Heap_Finalization for dynamically allocated objects, in that it extends the effects of the pragma to all access types designating the finalizable type, instead of just applying them to library-level named access types. gcc/ada/ * aspects.ads (Aspect_Id): Add Aspect_Finalizable. (Implementation_Defined_Aspect): Add True for Aspect_Finalizable. (Operational_Aspect): Add True for Aspect_Finalizable. (Aspect_Argument): Add Expression for Aspect_Finalizable. (Is_Representation_Aspect): Add False for Aspect_Finalizable. (Aspect_Names): Add Name_Finalizable for Aspect_Finalizable. (Aspect_Delay): Add Always_Delay for Aspect_Finalizable. * checks.adb: Add with and use clauses for Sem_Elab. (Install_Primitive_Elaboration_Check): Call Is_Controlled_Procedure. * einfo.ads (Has_Relaxed_Finalization): Document new flag. (Is_Controlled_Active): Update documentation. * exp_aggr.adb (Generate_Finalization_Actions): Replace Find_Prim_Op with Find_Controlled_Prim_Op for Name_Finalize. * exp_attr.adb (Expand_N_Attribute_Reference) <Finalization_Size>: Return 0 if the prefix type has relaxed finalization. * exp_ch3.adb (Build_Equivalent_Record_Aggregate): Return Empty if the type needs finalization. 
(Expand_Freeze_Record_Type): Call Find_Controlled_Prim_Op instead of Find_Prim_Op for Name_{Adjust,Initialize,Finalize}. Call Make_Finalize_Address_Body for all controlled types. * exp_ch4.adb (Insert_Dereference_Action): Do not generate a call to Adjust_Controlled_Dereference if the designated type has relaxed finalization. * exp_ch6.adb (Needs_BIP_Collection): Return false for an untagged type that has relaxed finalization. * exp_ch7.adb (Allows_Finalization_Collection): Return false if the designated type has relaxed finalization. (Check_Visibly_Controlled): Call Find_Controlled_Prim_Op instead of Find_Prim_Op. (Make_Adjust_Call): Likewise. (Make_Deep_Record_Body): Likewise. (Make_Final_Call): Likewise. (Make_Init_Call): Likewise. * exp_disp.adb (Set_All_DT_Position): Remove obsolete warning. * exp_util.ads: Add with and use clauses for Snames. (Find_Prim_Op): Add precondition. (Find_Controlled_Prim_Op): New function declaration. (Name_Of_Controlled_Prim_Op): Likewise. * exp_util.adb: Remove with and use clauses for Snames. (Build_Allocate_Deallocate_Proc): Do not build finalization actions if the designated type has relaxed finalization. (Find_Controlled_Prim_Op): New function. (Find_Last_Init): Call Find_Controlled_Prim_Op instead of Find_Prim_Op. (Name_Of_Controlled_Prim_Op): New function. * freeze.adb (Freeze_Entity.Freeze_Record_Type): Propagate the Has_Relaxed_Finalization flag from components. * gen_il-fields.ads (Opt_Field_Enum): Add Has_Relaxed_Finalization. * gen_il-gen-gen_entities.adb (Entity_Kind): Likewise. * sem_aux.adb (Is_By_Reference_Type): Return true for all controlled types. * sem_ch3.adb (Build_Derived_Record_Type): Do not special case types declared in Ada.Finalization. (Record_Type_Definition): Propagate the Has_Relaxed_Finalization flag from components. * sem_ch13.adb (Analyze_Aspects_At_Freeze_Point): Also process the Finalizable aspect. (Analyze_Aspect_Specifications): Likewise. Call Flag_Non_Static_Expr in more cases. 
(Check_Aspect_At_Freeze_Point): Likewise. (Inherit_Aspects_At_Freeze_Point): Likewise. (Resolve_Aspect_Expressions): Likewise. (Resolve_Finalizable_Argument): New procedure. (Validate_Finalizable_Aspect): Likewise. * sem_elab.ads: Add with and use clauses for Snames. (Is_Controlled_Procedure): New function declaration. * sem_elab.adb: Remove with and use clauses for Snames. (Is_Controlled_Proc): Move to... (Is_Controlled_Procedure): ...here and rename. (Check_A_Call): Call Find_Controlled_Prim_Op instead of Find_Prim_Op. (Is_Finalization_Procedure): Likewise. * sem_util.ads (Propagate_Controlled_Flags): Update documentation. * sem_util.adb (Is_Fully_Initialized_Type): Replace call to Find_Optional_Prim_Op with Find_Controlled_Prim_Op. Call Has_Null_Extension only for derived tagged types. (Propagate_Controlled_Flags): Propagate Has_Relaxed_Finalization. * snames.ads-tmpl (Name_Finalizable): New name. (Name_Relaxed_Finalization): Likewise. * libgnat/s-finroo.ads (Root_Controlled): Add Finalizable aspect. * doc/gnat_rm/gnat_language_extensions.rst: Document implementation of Generalized Finalization. * gnat_rm.texi: Regenerate. * gnat_ugn.texi: Regenerate.
2024-06-27	i386: Refactor vcvttps2qq/vcvtqq2ps patterns.	Hu, Lin1	2	-31/+22
	Refactor the vcvttps2qq/vcvtqq2ps patterns to remove the redundant round_modev8sf_condition and round_saeonly_modev8sf_condition. gcc/ChangeLog: * config/i386/sse.md (float<floatunssuffix><sselongvecmodelower><mode>2<mask_name> <round_name>): Refactor the pattern. (unspec_fix<vcvtt_uns_suffix>_trunc<mode><sselongvecmodelower>2 <mask_name><round_saeonly_name>): Ditto. (fix<fixunssuffix>_trunc<mode><sselongvecmodelower>2<mask_name> <round_saeonly_name>): Ditto. * config/i386/subst.md (round_modev8sf_condition): Remove. (round_saeonly_modev8sf_condition): Ditto.
2024-06-27	vect: support direct conversion under x86-64-v3.	Hu, Lin1	7	-32/+363
	gcc/ChangeLog: PR target/107432 * config/i386/i386-expand.cc (ix86_expand_trunc_with_avx2_noavx512f): New function to generate a series of suitable insns. * config/i386/i386-protos.h (ix86_expand_trunc_with_avx2_noavx512f): Declare the new function. * config/i386/sse.md: Extend trunc<mode><mode>2 for x86-64-v3. (ssebytemode): Add V8HI. (PMOV_DST_MODE_2_AVX2): New mode iterator. (PMOV_SRC_MODE_3_AVX2): Ditto. * config/i386/mmx.md (trunc<mode><mmxhalfmodelower>2): Ditto. (avx512vl_trunc<mode><mmxhalfmodelower>2): Ditto. (truncv2si<mode>2): Ditto. (avx512vl_truncv2si<mode>2): Ditto. (mmxbytemode): New mode attr. gcc/testsuite/ChangeLog: PR target/107432 * gcc.target/i386/pr107432-8.c: New test. * gcc.target/i386/pr107432-9.c: Ditto. * gcc.target/i386/pr92645-4.c: Modify test.
2024-06-27	vect: Support v4hi -> v4qi.	Hu, Lin1	4	-17/+44
	gcc/ChangeLog: PR target/107432 * config/i386/mmx.md (VI2_32_64): New mode iterator. (mmxhalfmode): New mode attr. (mmxhalfmodelower): Ditto. (truncv2hiv2qi2): Extend to mode v4hi and rename from truncv2hiv2qi to trunc<mode><mmxhalfmodelower>2. gcc/testsuite/ChangeLog: PR target/107432 * gcc.target/i386/pr107432-1.c: Modify test. * gcc.target/i386/pr107432-6.c: Add test. * gcc.target/i386/pr108938-3.c: Supporting truncv4hiv4qi affects the bswap optimization, so add the -mno-avx option for now and open a Bugzilla report.
2024-06-27	vect: generate suitable convert insn for int -> int, float -> float and int <-> float.	Hu, Lin1	10	-95/+990
	gcc/ChangeLog: PR target/107432 * tree-vect-generic.cc (expand_vector_conversion): Support convert for int -> int, float -> float and int <-> float. * tree-vect-stmts.cc (vectorizable_conversion): Wrap the indirect convert part. (supportable_indirect_convert_operation): New function. * tree-vectorizer.h (supportable_indirect_convert_operation): Define the new function. gcc/testsuite/ChangeLog: PR target/107432 * gcc.target/i386/pr107432-1.c: New test. * gcc.target/i386/pr107432-2.c: Ditto. * gcc.target/i386/pr107432-3.c: Ditto. * gcc.target/i386/pr107432-4.c: Ditto. * gcc.target/i386/pr107432-5.c: Ditto. * gcc.target/i386/pr107432-6.c: Ditto. * gcc.target/i386/pr107432-7.c: Ditto.
2024-06-27	RISC-V: Add testcases for vector truncate after .SAT_SUB	Pan Li	8	-0/+331
	This patch adds test cases for the vector truncate after .SAT_SUB, i.e.: #define DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(OUT_T, IN_T) \ void __attribute__((noinline)) \ vec_sat_u_sub_trunc_##OUT_T##_fmt_1 (OUT_T *out, IN_T *op_1, IN_T y, \ unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ IN_T x = op_1[i]; \ out[i] = (OUT_T)(x >= y ? x - y : 0); \ } \ } The 3 cases below are included. DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(uint8_t, uint16_t) DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(uint16_t, uint32_t) DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(uint32_t, uint64_t) gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add helper test macros. * gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_scalar.h: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-2.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-3.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-2.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-3.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
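	As a rough illustration of what one instantiation of the macro above computes, the following sketch spells out the uint16_t to uint8_t case by hand. The function name is hypothetical, and writing the array parameters as pointers is an assumption about the intended signature. Note that only the subtraction saturates (at 0); the narrowing conversion afterwards simply keeps the low bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hand-written equivalent of one instantiation of the test macro:
   saturating unsigned subtraction in the wide type, then truncation
   to the narrow type.  The name is illustrative only.  */
void
vec_sat_u_sub_trunc_u8_sketch (uint8_t *out, const uint16_t *op_1,
			       uint16_t y, unsigned limit)
{
  assert (out != NULL && op_1 != NULL);

  for (unsigned i = 0; i < limit; i++)
    {
      uint16_t x = op_1[i];
      /* x - y cannot go below zero because of the x >= y guard; the
	 cast to uint8_t then discards the upper 8 bits.  */
      out[i] = (uint8_t) (x >= y ? x - y : 0);
    }
}
```

	Whether a loop of this shape is actually vectorized depends on the target and options; the sketch only shows the scalar semantics the tests exercise.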
2024-06-27	LoongArch: NFC: Dedup and sort the comment in loongarch_print_operand_reloc	Xi Ruoyao	1	-9/+1
	gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_print_operand_reloc): Dedup and sort the comment describing modifiers.
2024-06-27	LoongArch: Tweak IOR rtx_cost for bstrins	Xi Ruoyao	2	-17/+72
	Consider c &= 0xfff; a &= ~0xfff; b &= ~0xfff; a |= c; b |= c; This can be done with 2 bstrins instructions. But we need to recognize it in loongarch_rtx_costs or the compiler will not propagate "c & 0xfff" forward. gcc/ChangeLog: * config/loongarch/loongarch.cc: (loongarch_use_bstrins_for_ior_with_mask): Split the main logic into ... (loongarch_use_bstrins_for_ior_with_mask_1): ... here. (loongarch_rtx_costs): Special case for IORs that can be implemented with bstrins. gcc/testsuite/ChangeLog: * gcc.target/loongarch/bstrins-3.c: New test.
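	The bit manipulation described above amounts to inserting the low 12 bits of one value into another. A minimal C sketch of that operation follows; the helper name is hypothetical, and whether a given compiler version emits bstrins for it depends on the target and optimization level.

```c
#include <assert.h>
#include <stdint.h>

/* Replace the low 12 bits of DST with the low 12 bits of SRC and
   leave the remaining bits of DST untouched.  This is the per-variable
   effect of "a &= ~0xfff; a |= (c & 0xfff);".  Illustrative name.  */
uint64_t
insert_low12_sketch (uint64_t dst, uint64_t src)
{
  return (dst & ~UINT64_C (0xfff)) | (src & UINT64_C (0xfff));
}
```

	Applying the helper to both a and b with the same c corresponds to the two-instruction sequence the commit message mentions.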
2024-06-27	Fix wrong cost of MEM when addr is a lea.	liuhongt	2	-1/+26
	416.gamess regressed 4-6% on x86_64 since my r15-882-g1d6199e5f8c1c0. That commit adjusted the rtx_cost of MEM to reduce the cost of (add op0 disp). But the cost of ADDR can be cheaper than that of XEXP (addr, 0) when it is a lea, which is the case in the PR. This patch adjusts rtx_cost to handle only reg + disp; the other forms are basically all LEA, which does not have the additional cost of an ADD. gcc/ChangeLog: PR target/115462 * config/i386/i386.cc (ix86_rtx_costs): Make cost of MEM (reg + disp) just a little bit more than MEM (reg). gcc/testsuite/ChangeLog: * gcc.target/i386/pr115462.c: New test.
2024-06-27	Internal-fn: Support new IFN SAT_TRUNC for unsigned scalar int	Pan Li	4	-0/+53
	This patch adds the middle-end representation for the saturating truncation, i.e. it sets the result of the truncation to the max value on overflow. It matches a pattern similar to the one below. Form 1: #define DEF_SAT_U_TRUC_FMT_1(WT, NT) \ NT __attribute__((noinline)) \ sat_u_truc_##T##_fmt_1 (WT x) \ { \ bool overflow = x > (WT)(NT)(-1); \ return ((NT)x) | (NT)-overflow; \ } For example, when truncating uint16_t to uint8_t, we have * SAT_TRUNC (254) => 254 * SAT_TRUNC (255) => 255 * SAT_TRUNC (256) => 255 * SAT_TRUNC (65536) => 255 Given the SAT_TRUNC below from uint64_t to uint32_t. DEF_SAT_U_TRUC_FMT_1 (uint64_t, uint32_t) Before this patch: __attribute__((noinline)) uint32_t sat_u_truc_T_fmt_1 (uint64_t x) { _Bool overflow; unsigned int _1; unsigned int _2; unsigned int _3; uint32_t _6; ;; basic block 2, loop depth 0 ;; pred: ENTRY overflow_5 = x_4(D) > 4294967295; _1 = (unsigned int) x_4(D); _2 = (unsigned int) overflow_5; _3 = -_2; _6 = _1 | _3; return _6; ;; succ: EXIT } After this patch: __attribute__((noinline)) uint32_t sat_u_truc_T_fmt_1 (uint64_t x) { uint32_t _6; ;; basic block 2, loop depth 0 ;; pred: ENTRY _6 = .SAT_TRUNC (x_4(D)); [tail call] return _6; ;; succ: EXIT } The tests below pass with this patch: . The rv64gcv full regression tests. . The rv64gcv build with glibc. . The x86 bootstrap tests. . The x86 full regression tests. gcc/ChangeLog: * internal-fn.def (SAT_TRUNC): Add new signed IFN sat_trunc as unary_convert. * match.pd: Add new matching pattern for unsigned int sat_trunc. * optabs.def (OPTAB_CL): Add unsigned and signed optab. * tree-ssa-math-opts.cc (gimple_unsigend_integer_sat_trunc): Add new decl for the matching pattern generated func. (match_unsigned_saturation_trunc): Add new func impl to match the .SAT_TRUNC. (math_opts_dom_walker::after_dom_children): Add .SAT_TRUNC match function under BIT_IOR_EXPR case. Signed-off-by: Pan Li <pan2.li@intel.com>
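	To make the "Form 1" idiom concrete, here is a small sketch of the branchless saturating truncation for one fixed pair of types (uint16_t to uint8_t). The function names are hypothetical and not part of the patch; the self-check only covers inputs that fit in uint16_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Branchless unsigned saturating truncation, uint16_t -> uint8_t,
   following the Form 1 shape.  Illustrative name only.  */
uint8_t
sat_u_trunc_u16_to_u8_sketch (uint16_t x)
{
  /* (uint16_t)(uint8_t)(-1) is 255, the largest narrow-type value.  */
  bool overflow = x > (uint16_t) (uint8_t) (-1);

  /* -overflow is 0 or -1 as an int; narrowing it gives 0x00 or 0xff,
     so the OR either keeps the truncated value or forces the maximum.  */
  return (uint8_t) ((uint8_t) x | (uint8_t) -overflow);
}

/* Small self-check of the boundary values.  */
void
sat_u_trunc_selfcheck_sketch (void)
{
  assert (sat_u_trunc_u16_to_u8_sketch (254) == 254);
  assert (sat_u_trunc_u16_to_u8_sketch (255) == 255);
  assert (sat_u_trunc_u16_to_u8_sketch (256) == 255);
  assert (sat_u_trunc_u16_to_u8_sketch (65535) == 255);
}
```

	The mask trick works because the OR with all-ones yields the narrow type's maximum regardless of the truncated bits, which is the expression shape the new match.pd pattern is described as recognizing under BIT_IOR_EXPR.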
2024-06-27	Vect: Support truncate after .SAT_SUB pattern in zip	Pan Li	2	-22/+33
	The zip benchmark of coremark-pro has one SAT_SUB-like pattern, but truncated, as below: void test (uint16_t *x, unsigned b, unsigned n) { unsigned a = 0; register uint16_t *p = x; do { a = *--p; *p = (uint16_t)(a >= b ? a - b : 0); // Truncate after .SAT_SUB } while (--n); } It has the gimple below before the vect pass, which cannot hit any pattern of SAT_SUB and thus cannot be vectorized to SAT_SUB. _2 = a_11 - b_12(D); iftmp.0_13 = (short unsigned int) _2; _18 = a_11 >= b_12(D); iftmp.0_5 = _18 ? iftmp.0_13 : 0; This patch improves the pattern match to recognize the above as a truncate after .SAT_SUB pattern. Then we will have a pattern similar to the one below, and the first 3 dead statements will be eliminated. _2 = a_11 - b_12(D); iftmp.0_13 = (short unsigned int) _2; _18 = a_11 >= b_12(D); iftmp.0_5 = (short unsigned int).SAT_SUB (a_11, b_12(D)); The tests below pass with this patch. 1. The rv64gcv full regression tests. 2. The rv64gcv build with glibc. 3. The x86 bootstrap tests. 4. The x86 full regression tests. gcc/ChangeLog: * match.pd: Add convert description for minus and capture. * tree-vect-patterns.cc (vect_recog_build_binary_gimple_call): Add new logic to handle in_type is incompatible with out_type, as well as rename from. (vect_recog_build_binary_gimple_stmt): Rename to. (vect_recog_sat_add_pattern): Leverage above renamed func. (vect_recog_sat_sub_pattern): Ditto. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-06-27	tree-optimization/115652 - amend last fix	Richard Biener	1	-1/+2
	The previous fix breaks in the degenerate case when the discovered last_stmt is equal to the first stmt in the block since then we undo a required stmt advancement. PR tree-optimization/115652 * tree-vect-slp.cc (vect_schedule_slp_node): Only insert at the start of the block if that strictly dominates the discovered dependent stmt.
2024-06-27	tree-optimization/115493 - complete previous fix	Richard Biener	1	-1/+1
	The following fixes the 2nd occurrence of new_temp missed with the previous fix. PR tree-optimization/115493 * tree-vect-loop.cc (vect_create_epilog_for_reduction): Use first scalar result.
2024-06-27	Daily bump.	GCC Administrator	6	-1/+568

2024-06-26	libstdc++: Add script to update docs for a new release branch	Jonathan Wakely	1	-0/+14
	This should be run on a release branch after branching from trunk. Various links and references to trunk in the docs will be updated to refer to the new release branch. libstdc++-v3/ChangeLog: * scripts/update_release_branch.sh: New file.
2024-06-26	libstdc++: Remove duplicate test	Jonathan Wakely	2	-57/+5
	We currently have 808590.cc which only runs for C++98 mode, and 808590-cxx11.cc which only runs for C++11 and later, but have almost identical content (except for a defaulted special member in the C++11 one, to suppress a -Wdeprecated-copy warning). This was done originally to ensure that the test ran for both C++98 mode and C++11 mode, because the logic being tested was different enough to need both to be tested. But it's trivial to run all tests in multiple -std modes now, using GLIBCXX_TESTSUITE_STDS, so we don't need two separate tests. We can remove one of the tests and allow the other one to run in any -std mode. libstdc++-v3/ChangeLog: * testsuite/20_util/specialized_algorithms/uninitialized_copy/808590.cc: Copy defaulted assignment operator from 808590-cxx11.cc to suppress a warning. * testsuite/20_util/specialized_algorithms/uninitialized_copy/808590-cxx11.cc: Removed.
2024-06-26	libstdc++: Increase timeouts for PSTL tests in debug mode [PR90276]	Jonathan Wakely	6	-0/+6
	These tests compile very slowly in debug mode. libstdc++-v3/ChangeLog: PR libstdc++/90276 * testsuite/25_algorithms/pstl/alg_modifying_operations/rotate_copy.cc: Increase timeout for debug mode. * testsuite/25_algorithms/pstl/alg_modifying_operations/transform_binary.cc: Likewise. * testsuite/25_algorithms/pstl/alg_nonmodifying/mismatch.cc: Likewise. * testsuite/25_algorithms/pstl/alg_sorting/lexicographical_compare.cc: Likewise. * testsuite/25_algorithms/pstl/alg_sorting/minmax_element.cc: Likewise. * testsuite/25_algorithms/pstl/alg_sorting/set_symmetric_difference.cc: Likewise.
2024-06-26	libstdc++: Work around some PSTL test failures for debug mode [PR90276]	Jonathan Wakely	4	-0/+13
	This addresses one known failure due to a bug in the upstream tests, and a number of timeouts due to the algorithms running much more slowly with debug mode checks enabled. libstdc++-v3/ChangeLog: PR libstdc++/90276 * testsuite/25_algorithms/pstl/alg_sorting/partial_sort.cc [_GLIBCXX_DEBUG]: Add xfail-run-if for debug mode. * testsuite/25_algorithms/pstl/alg_nonmodifying/nth_element.cc [_GLIBCXX_DEBUG]: Reduce size of test data. * testsuite/25_algorithms/pstl/alg_sorting/includes.cc: Likewise. * testsuite/25_algorithms/pstl/alg_sorting/set_util.h: Likewise.
2024-06-26	libstdc++: Fix std::chrono::tzdb to work with vanguard format	Jonathan Wakely	3	-103/+274
	I found some issues in the std::chrono::tzdb parser by testing the tzdata "vanguard" format, which uses new features that aren't enabled in the "main" and "rearguard" data formats. Since 2024a the keyword "minimum" is no longer valid for the FROM and TO fields in a Rule line, which means that "m" is now a valid abbreviation for "maximum". Previously we expected either "mi" or "ma". For backwards compatibility, a FROM field beginning with "mi" is still supported and is treated as 1900. The "maximum" keyword is only allowed in TO now, because it makes no sense in FROM. To support these changes the minmax_year and minmax_year2 classes for parsing FROM and TO are replaced with a single years_from_to class that reads both fields. The vanguard format makes use of %z in Zone FORMAT fields, which caused an exception to be thrown from ZoneInfo::set_abbrev because no % or / characters were expected when a Zone doesn't use a named Rule. The ZoneInfo::to(sys_info&) function now uses format_abbrev_str to replace any %z with the current offset. Although format_abbrev_str also checks for %s and STD/DST formats, those only make sense when a named Rule is in effect, so won't occur when ZoneInfo::to(sys_info&) is used. This change also implements a feature that has always been missing from time_zone::_M_get_sys_info: finding the Rule that is active before the specified time point, so that we can correctly handle %s in the FORMAT for the first new sys_info that gets created. This requires implementing a poorly documented feature of zic, to get the LETTERS field from a later transition, as described at https://mm.icann.org/pipermail/tz/2024-April/058891.html In order for this to work we need to be able to distinguish an empty letters field (as used by CE%sT where the variable part is either empty or "S") from "the letters field is not known for this transition". 
The tzdata file uses "-" for an empty letters field, which libstdc++ was previously replacing with "" when the Rule was parsed. Instead, we now preserve the "-" in the Rule object, so that "" can be used for the case where we don't know the letters (and so need to decide it). libstdc++-v3/ChangeLog: * src/c++20/tzdb.cc (minmax_year, minmax_year2): Remove. (years_from_to): New class replacing minmax_year and minmax_year2. (format_abbrev_str, select_std_or_dst_abbrev): Move earlier in the file. Handle "-" for letters. (ZoneInfo::to): Use format_abbrev_str to expand %z. (ZoneInfo::set_abbrev): Remove exception. Change parameter from reference to value. (operator>>(istream&, Rule&)): Do not clear letters when it contains "-". (time_zone::_M_get_sys_info): Add missing logic to find the Rule in effect before the time point. * testsuite/std/time/tzdb/1.cc: Adjust for vanguard format using "GMT" as the Zone name, not as a Link to "Etc/GMT". * testsuite/std/time/time_zone/sys_info_abbrev.cc: New test.
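	The FROM/TO keyword rules described above can be sketched as follows. This is a simplified C illustration of the stated rules only, not the libstdc++ implementation (which is C++ and uses the years_from_to class); all names are hypothetical, using INT_MAX to stand for "maximum" is an assumption, and other TO keywords are deliberately left out.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Nonzero if S is a non-empty prefix of KEYWORD, e.g. "m" or "max"
   for "maximum".  */
static int
is_abbrev_of (const char *s, const char *keyword)
{
  size_t n = strlen (s);
  return n > 0 && n <= strlen (keyword) && strncmp (s, keyword, n) == 0;
}

/* Parse S as a decimal year.  Return 0 on success, -1 otherwise.  */
static int
parse_year_number (const char *s, int *year)
{
  char *end;
  long v = strtol (s, &end, 10);
  if (end == s || *end != '\0' || v < INT_MIN || v > INT_MAX)
    return -1;
  *year = (int) v;
  return 0;
}

/* FROM: a field beginning with "mi" is still accepted for backwards
   compatibility and treated as 1900; "maximum" is not allowed here.  */
int
parse_from_sketch (const char *s, int *year)
{
  if (strncmp (s, "mi", 2) == 0)
    {
      *year = 1900;
      return 0;
    }
  return parse_year_number (s, year);
}

/* TO: any abbreviation of "maximum", including a bare "m", is valid.
   INT_MAX as the stand-in value is an assumption of this sketch.  */
int
parse_to_sketch (const char *s, int *year)
{
  if (is_abbrev_of (s, "maximum"))
    {
      *year = INT_MAX;
      return 0;
    }
  return parse_year_number (s, year);
}
```

	Reading both fields through one pair of helpers mirrors the motivation given for replacing the two separate parsing classes with a single one.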
2024-06-26	tree-optimization/115629 - missed tail merging	Richard Biener	2	-8/+75
	The following fixes a missed tail-merging observed for the testcase in PR115629. The issue is that deps_ok_for_redirect rejects the merge when it does not find that both blocks would be valid prevailing blocks. The following instead makes sure to record the working block as prevailing. Also, stmt comparison fails for indirect references and does not handle memory references thoroughly, failing to unify array indices and indirected pointers. The following attempts to fix this. PR tree-optimization/115629 * tree-ssa-tail-merge.cc (gimple_equal_p): Handle memory references better. (deps_ok_for_redirect): Handle the case where not both blocks are considered a valid prevailing block. * gcc.dg/tree-ssa/tail-merge-1.c: New testcase.
2024-06-26	RISC-V: Update testcase comments to point to PSABI rather than Table A.6	Patrick O'Neill	40	-40/+45
	Table A.6 was originally the source of truth for the recommended mappings. Point to the PSABI doc since the memory model mappings have been moved there. gcc/testsuite/ChangeLog: * gcc.target/riscv/amo/a-rvwmo-fence.c: Replace A.6 reference with PSABI. * gcc.target/riscv/amo/a-rvwmo-load-acquire.c: Ditto. * gcc.target/riscv/amo/a-rvwmo-load-relaxed.c: Ditto. * gcc.target/riscv/amo/a-rvwmo-load-seq-cst.c: Ditto. * gcc.target/riscv/amo/a-rvwmo-store-compat-seq-cst.c: Ditto. * gcc.target/riscv/amo/a-rvwmo-store-relaxed.c: Ditto. * gcc.target/riscv/amo/a-rvwmo-store-release.c: Ditto. * gcc.target/riscv/amo/a-ztso-fence.c: Ditto. * gcc.target/riscv/amo/a-ztso-load-acquire.c: Ditto. * gcc.target/riscv/amo/a-ztso-load-relaxed.c: Ditto. * gcc.target/riscv/amo/a-ztso-load-seq-cst.c: Ditto. * gcc.target/riscv/amo/a-ztso-store-compat-seq-cst.c: Ditto. * gcc.target/riscv/amo/a-ztso-store-relaxed.c: Ditto. * gcc.target/riscv/amo/a-ztso-store-release.c: Ditto. * gcc.target/riscv/amo/zaamo-rvwmo-amo-add-int.c: Ditto. * gcc.target/riscv/amo/zaamo-ztso-amo-add-int.c: Ditto. * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire-release.c: Ditto. * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire.c: Ditto. * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-consume.c: Ditto. * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-relaxed.c: Ditto. * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-release.c: Ditto. * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst-relaxed.c: Ditto. * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst.c: Ditto. * gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acq-rel.c: Ditto. * gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acquire.c: Ditto. * gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-relaxed.c: Ditto. * gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-release.c: Ditto. * gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-seq-cst.c: Ditto. 
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire-release.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-consume.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-relaxed.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-release.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst-relaxed.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acq-rel.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acquire.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-relaxed.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-release.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-seq-cst.c: Ditto. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-06-26	RISC-V: Consolidate amo testcase variants	Patrick O'Neill	31	-435/+378
	Many riscv/amo/ testcases use check-function-bodies. These testcases can be consolidated with related testcases (memory ordering variants) without affecting the assertions. Give functions descriptive names so testsuite failures are obvious from the 'FAIL:' line. gcc/testsuite/ChangeLog: * gcc.target/riscv/amo/amo-table-a-6-amo-add-1.c: Removed. * gcc.target/riscv/amo/amo-table-a-6-amo-add-2.c: Removed. * gcc.target/riscv/amo/amo-table-a-6-amo-add-3.c: Removed. * gcc.target/riscv/amo/amo-table-a-6-amo-add-4.c: Removed. * gcc.target/riscv/amo/amo-table-a-6-amo-add-5.c: Removed. * gcc.target/riscv/amo/amo-table-a-6-fence-1.c: Removed. * gcc.target/riscv/amo/amo-table-a-6-fence-2.c: Removed. * gcc.target/riscv/amo/amo-table-a-6-fence-3.c: Removed. * gcc.target/riscv/amo/amo-table-a-6-fence-4.c: Removed. * gcc.target/riscv/amo/amo-table-a-6-fence-5.c: Removed. * gcc.target/riscv/amo/amo-table-ztso-amo-add-1.c: Removed. * gcc.target/riscv/amo/amo-table-ztso-amo-add-2.c: Removed. * gcc.target/riscv/amo/amo-table-ztso-amo-add-3.c: Removed. * gcc.target/riscv/amo/amo-table-ztso-amo-add-4.c: Removed. * gcc.target/riscv/amo/amo-table-ztso-amo-add-5.c: Removed. * gcc.target/riscv/amo/amo-table-ztso-fence-1.c: Removed. * gcc.target/riscv/amo/amo-table-ztso-fence-2.c: Removed. * gcc.target/riscv/amo/amo-table-ztso-fence-3.c: Removed. * gcc.target/riscv/amo/amo-table-ztso-fence-4.c: Removed. * gcc.target/riscv/amo/amo-table-ztso-fence-5.c: Removed. * gcc.target/riscv/amo/amo-zalrsc-amo-add-1.c: Removed. * gcc.target/riscv/amo/amo-zalrsc-amo-add-2.c: Removed. * gcc.target/riscv/amo/amo-zalrsc-amo-add-3.c: Removed. * gcc.target/riscv/amo/amo-zalrsc-amo-add-4.c: Removed. * gcc.target/riscv/amo/amo-zalrsc-amo-add-5.c: Removed. * gcc.target/riscv/amo/a-rvwmo-fence.c: New test. * gcc.target/riscv/amo/a-ztso-fence.c: New test. * gcc.target/riscv/amo/zaamo-rvwmo-amo-add-int.c: New test. * gcc.target/riscv/amo/zaamo-ztso-amo-add-int.c: New test. 
* gcc.target/riscv/amo/zalrsc-rvwmo-amo-add-int.c: New test. * gcc.target/riscv/amo/zalrsc-ztso-amo-add-int.c: New test. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-06-26	RISC-V: Rename amo testcases	Patrick O'Neill	37	-0/+0
	Rename riscv/amo/ testcases to follow a '{ext}-{model}-{name}-{memory order}.c' naming convention. gcc/testsuite/ChangeLog: * gcc.target/riscv/amo/amo-table-a-6-load-2.c: Move to... * gcc.target/riscv/amo/a-rvwmo-load-acquire.c: ...here. * gcc.target/riscv/amo/amo-table-a-6-load-1.c: Move to... * gcc.target/riscv/amo/a-rvwmo-load-relaxed.c: ...here. * gcc.target/riscv/amo/amo-table-a-6-load-3.c: Move to... * gcc.target/riscv/amo/a-rvwmo-load-seq-cst.c: ...here. * gcc.target/riscv/amo/amo-table-a-6-store-compat-3.c: Move to... * gcc.target/riscv/amo/a-rvwmo-store-compat-seq-cst.c: ...here. * gcc.target/riscv/amo/amo-table-a-6-store-1.c: Move to... * gcc.target/riscv/amo/a-rvwmo-store-relaxed.c: ...here. * gcc.target/riscv/amo/amo-table-a-6-store-2.c: Move to... * gcc.target/riscv/amo/a-rvwmo-store-release.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-load-2.c: Move to... * gcc.target/riscv/amo/a-ztso-load-acquire.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-load-1.c: Move to... * gcc.target/riscv/amo/a-ztso-load-relaxed.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-load-3.c: Move to... * gcc.target/riscv/amo/a-ztso-load-seq-cst.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-store-3.c: Move to... * gcc.target/riscv/amo/a-ztso-store-compat-seq-cst.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-store-1.c: Move to... * gcc.target/riscv/amo/a-ztso-store-relaxed.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-store-2.c: Move to... * gcc.target/riscv/amo/a-ztso-store-release.c: ...here. * gcc.target/riscv/amo/amo-zaamo-preferred-over-zalrsc.c: Move to... * gcc.target/riscv/amo/zaamo-preferred-over-zalrsc.c: ...here. * gcc.target/riscv/amo/amo-table-a-6-compare-exchange-6.c: Move to... * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire-release.c: ...here. * gcc.target/riscv/amo/amo-table-a-6-compare-exchange-3.c: Move to... * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire.c: ...here. 
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-2.c: Move to... * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-consume.c: ...here. * gcc.target/riscv/amo/amo-table-a-6-compare-exchange-1.c: Move to... * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-relaxed.c: ...here. * gcc.target/riscv/amo/amo-table-a-6-compare-exchange-4.c: Move to... * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-release.c: ...here. * gcc.target/riscv/amo/amo-table-a-6-compare-exchange-7.c: Move to... * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst-relaxed.c: ...here. * gcc.target/riscv/amo/amo-table-a-6-compare-exchange-5.c: Move to... * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst.c: ...here. * gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-4.c: Move to... * gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acq-rel.c: ...here. * gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-2.c: Move to... * gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acquire.c: ...here. * gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-1.c: Move to... * gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-relaxed.c: ...here. * gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-3.c: Move to... * gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-release.c: ...here. * gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-5.c: Move to... * gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-seq-cst.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-compare-exchange-6.c: Move to... * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire-release.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-compare-exchange-3.c: Move to... * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-compare-exchange-2.c: Move to... * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-consume.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-compare-exchange-1.c: Move to... 
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-relaxed.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-compare-exchange-4.c: Move to... * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-release.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-compare-exchange-7.c: Move to... * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst-relaxed.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-compare-exchange-5.c: Move to... * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-4.c: Move to... * gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acq-rel.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-2.c: Move to... * gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acquire.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-1.c: Move to... * gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-relaxed.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-3.c: Move to... * gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-release.c: ...here. * gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-5.c: Move to... * gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-seq-cst.c: ...here. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-06-26	rs6000, change altivec*-runnable.c test file names	Carl Love	2	-0/+0
	Changed the names of the test files. gcc/testsuite/ChangeLog: * gcc.target/powerpc/altivec-1-runnable.c: Change the name to altivec-38.c. * gcc.target/powerpc/altivec-2-runnable.c: Change the name to p8vector-builtin-9.c.
2024-06-26	rs6000, altivec-2-runnable.c update the require-effective-target	Carl Love	1	-3/+3
	The test requires a minimum of Power8 vector HW and a compile level of -O2. Update the dg test directives. gcc/testsuite/ChangeLog: * gcc.target/powerpc/altivec-2-runnable.c: Change the require-effective-target for the test.
2024-06-26	rs6000, altivec-1-runnable.c update the require-effective-target	Carl Love	1	-2/+3
	Update the dg test directives. gcc/testsuite/ChangeLog: * gcc.target/powerpc/altivec-1-runnable.c: Change the require-effective-target for the test.
2024-06-26	[committed] Remove compromised sh test	Jeff Law	1	-14/+0
	Surya's recent patch to IRA improves the code for sh/pr54602-1.c slightly. Specifically it's able to eliminate a save/restore in the prologue/epilogue and a bit of register shuffling. As a result there literally aren't any insns that can be used to fill the delay slot of the return, so a nop gets emitted and the test fails. Given there literally aren't any insns to move into the delay slot, the best course of action is to just drop the test. gcc/testsuite * gcc.target/sh/pr54602-1.c: Delete test.
2024-06-26	[committed][RISC-V] Fix expected output for thead store pair test	Jeff Law	1	-6/+4
	Surya's patch to IRA has improved the code we generate for one of the thead store pair tests for both rv32 and rv64. This patch adjusts the expectations of that test. I've verified that the test now passes on rv32 and rv64 in my tester. Pushing to the trunk. gcc/testsuite * gcc.target/riscv/xtheadmempair-3.c: Update expected output.
2024-06-26	tree-optimization/115652 - adjust insertion gsi for SLP	Richard Biener	1	-16/+13
	The following adjusts how SLP computes the insertion location. In particular it advanced the insert iterator of the found last_stmt. The vectorizer will later insert stmts _before_ it. But we also have the constraint that possibly masked ops may not be scheduled outside of the loop and as we do not model the loop mask in the SLP graph we have to adjust for that. The following moves this to after the advance since it isn't compatible with that as the current GIMPLE_COND exception shows. The PR is about in-order reduction vectorization which also isn't happy when that's the very first stmt. PR tree-optimization/115652 * tree-vect-slp.cc (vect_schedule_slp_node): Advance the iterator based on last_stmt only for vector defs.
2024-06-26	Record edge true/false value for gcov	Jørgen Kvalsvik	3	-0/+14
	Make gcov aware of which edges are the true/false edges so that it can reconstruct the CFG more accurately. There are plenty of bits left in arc_info, and this opens the way for richer reporting. gcc/ChangeLog: * gcov-io.h (GCOV_ARC_TRUE): New. (GCOV_ARC_FALSE): New. * gcov.cc (struct arc_info): Add true_value, false_value. (read_graph_file): Read true_value, false_value. * profile.cc (branch_prob): Write GCOV_ARC_TRUE, GCOV_ARC_FALSE.
2024-06-26	Use the term MC/DC in help for gcov --conditions	Jørgen Kvalsvik	1	-1/+1
	Without key terms like "masking" and "MC/DC" it is not at all obvious what --conditions actually reports on, and there is no easy path for the user to figure it out. By at least including the two key terms, MC/DC and masking, users have something to search for. gcc/ChangeLog: * gcov.cc (print_usage): Reference masking MC/DC.
2024-06-26	Add section on MC/DC in gcov manual	Jørgen Kvalsvik	1	-0/+72
	gcc/ChangeLog: * doc/gcov.texi: Add MC/DC section.
2024-06-26	Use auto_vec for memory release on return	Jørgen Kvalsvik	1	-1/+1
	Using auto_vec ensures this memory is cleaned up on function exit. gcc/ChangeLog: * tree-profile.cc (find_conditions): Use auto_vec.
2024-06-26	arm: make arm_predict_doloop_p reject loops with calls	Andre Vieira	2	-0/+29
	With the introduction of low overhead loops we defined arm_predict_doloop_p. This is meant to be a lightweight check that rules out loops we are not considering for doloop optimization, and it is used by other passes to prevent optimizations that may hurt the doloop optimization later on. The check has to be lightweight because it is used by pre-RTL optimizations, meaning we can't do the same checks that doloop does. After the definition of arm_predict_doloop_p, when testing for armv8.1-m.main, tree-ssa/ivopts-3.c failed the scan-dump check as the dump now matched an extra '!= 0' introduced by: Doloop cmp iv use: if (ivtmp_1 != 0) Predict loop 1 can perform doloop optimization later. where previously we had: Predict doloop failure due to target specific checks. and after this patch: Predict doloop failure due to call in loop. Predict doloop failure due to target specific checks. Added a copy of the original tree-ssa/ivopts-3.c as a target-specific test to check for the new dump message. gcc/ChangeLog: * config/arm/arm.cc (arm_predict_doloop_p): Reject loops with function calls that are not builtins. gcc/testsuite/ChangeLog: * gcc.target/arm/mve/ivopts-3.c: New test.
2024-06-26	[aarch64] Add support for -mcpu=grace	Kyrylo Tkachov	3	-3/+5
	This adds support for the NVIDIA Grace CPU to aarch64. We reuse the tuning decisions for the Neoverse V2 core, but include a number of architecture features that are not enabled by default in -mcpu=neoverse-v2. This allows Grace users to more simply target the CPU with -mcpu=grace rather than remembering what extensions to tag on top of -mcpu=neoverse-v2. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ * config/aarch64/aarch64-cores.def (grace): New entry. * config/aarch64/aarch64-tune.md: Regenerate. * doc/invoke.texi (AArch64 Options): Document the above. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
2024-06-26	i386: Remove declaration of unused functions	Evgeny Karpov	1	-2/+0
	The patch fixes the issue introduced in https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=63512c72df09b43d56ac7680cdfd57a66d40c636 and reported at https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655599.html . Regards, Evgeny The patch fixes the issue with compilation on x86_64-gnu-linux when warnings for unused functions are treated as errors. gcc/ChangeLog: * config/i386/i386.cc (legitimize_dllimport_symbol): Remove unused functions. (legitimize_pe_coff_extern_decl): Likewise.
2024-06-26	rs6000: Fix wrong RTL patterns for vector merge high/low short on LE	Kewen Lin	3	-27/+94
	Commit r12-4496 changed some define_expands and define_insns for vector merge high/low short, namely altivec_vmrg[hl]h. These defines mainly serve the built-in functions vec_merge{h,l} and some internal gen function needs. These functions have to consider endianness. Taking vec_mergeh as an example, PVIPR defines it as "Merges the first halves (in element order) of two vectors", explicitly noting element order, so it is mapped to vmrghh on BE and vmrglh on LE. Although the mapped insns differ, as discussed in PR106069 the RTL pattern should still be the same. That held before commit r12-4496, which changed it into different patterns on BE and LE. Similar to the 32-bit element case described in the commit log of r15-1504, this 16-bit element pattern on LE doesn't actually match what the underlying insn is intended to represent, so once an optimization such as combine makes changes based on it, it can cause unexpected results. The newly constructed test case pr106069-2.c is a typical example of this issue for element type short. This patch fixes the wrong RTL pattern and ensures the associated RTL patterns are the same as before, with the same semantics as their mapped insns. With the proposed patch, expanders like altivec_vmrghh expand into altivec_vmrghh_direct_be or altivec_vmrglh_direct_le depending on endianness; "direct" shows which insn will be generated, while _be and _le distinguish the different RTL patterns per endianness. Co-authored-by: Xionghu Luo <xionghuluo@tencent.com> PR target/106069 PR target/115355 gcc/ChangeLog: * config/rs6000/altivec.md (altivec_vmrghh_direct): Rename to ... (altivec_vmrghh_direct_be): ... this. Add condition BYTES_BIG_ENDIAN. (altivec_vmrghh_direct_le): New define_insn. (altivec_vmrglh_direct): Rename to ... (altivec_vmrglh_direct_be): ... this. Add condition BYTES_BIG_ENDIAN. (altivec_vmrglh_direct_le): New define_insn. 
(altivec_vmrghh): Adjust by calling gen_altivec_vmrghh_direct_be for BE and gen_altivec_vmrglh_direct_le for LE. (altivec_vmrglh): Adjust by calling gen_altivec_vmrglh_direct_be for BE and gen_altivec_vmrghh_direct_le for LE. (vec_widen_umult_hi_v16qi): Adjust the call to gen_altivec_vmrghh_direct by gen_altivec_vmrghh for BE and by gen_altivec_vmrglh for LE. (vec_widen_smult_hi_v16qi): Likewise. (vec_widen_umult_lo_v16qi): Adjust the call to gen_altivec_vmrglh_direct by gen_altivec_vmrglh for BE and by gen_altivec_vmrghh for LE. (vec_widen_smult_lo_v16qi): Likewise. * config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace CODE_FOR_altivec_vmrghh_direct by CODE_FOR_altivec_vmrghh_direct_be for BE and CODE_FOR_altivec_vmrghh_direct_le for LE. And replace CODE_FOR_altivec_vmrglh_direct by CODE_FOR_altivec_vmrglh_direct_be for BE and CODE_FOR_altivec_vmrglh_direct_le for LE. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr106069-2.c: New test.
2024-06-26	rs6000: Fix wrong RTL patterns for vector merge high/low char on LE	Kewen Lin	3	-18/+95
	Commit r12-4496 changed some define_expands and define_insns for vector merge high/low char, namely altivec_vmrg[hl]b. These defines mainly serve the built-in functions vec_merge{h,l} and some internal gen function needs. These functions have to consider endianness. Taking vec_mergeh as an example, PVIPR defines it as "Merges the first halves (in element order) of two vectors", explicitly noting element order, so it is mapped to vmrghb on BE and vmrglb on LE. Although the mapped insns differ, as discussed in PR106069 the RTL pattern should still be the same. That held before commit r12-4496, which changed it into different patterns on BE and LE. Similar to the 32-bit element case described in the commit log of r15-1504, this 8-bit element pattern on LE doesn't actually match what the underlying insn is intended to represent, so once an optimization such as combine makes changes based on it, it can cause unexpected results. The newly constructed test case pr106069-1.c is a typical example of this issue. This patch fixes the wrong RTL pattern and ensures the associated RTL patterns are the same as before, with the same semantics as their mapped insns. With the proposed patch, expanders like altivec_vmrghb expand into altivec_vmrghb_direct_be or altivec_vmrglb_direct_le depending on endianness; "direct" shows which insn will be generated, while _be and _le distinguish the different RTL patterns per endianness. Co-authored-by: Xionghu Luo <xionghuluo@tencent.com> PR target/106069 PR target/115355 gcc/ChangeLog: * config/rs6000/altivec.md (altivec_vmrghb_direct): Rename to ... (altivec_vmrghb_direct_be): ... this. Add condition BYTES_BIG_ENDIAN. (altivec_vmrghb_direct_le): New define_insn. (altivec_vmrglb_direct): Rename to ... (altivec_vmrglb_direct_be): ... this. Add condition BYTES_BIG_ENDIAN. (altivec_vmrglb_direct_le): New define_insn. 
(altivec_vmrghb): Adjust by calling gen_altivec_vmrghb_direct_be for BE and gen_altivec_vmrglb_direct_le for LE. (altivec_vmrglb): Adjust by calling gen_altivec_vmrglb_direct_be for BE and gen_altivec_vmrghb_direct_le for LE. * config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace CODE_FOR_altivec_vmrghb_direct by CODE_FOR_altivec_vmrghb_direct_be for BE and CODE_FOR_altivec_vmrghb_direct_le for LE. And replace CODE_FOR_altivec_vmrglb_direct by CODE_FOR_altivec_vmrglb_direct_be for BE and CODE_FOR_altivec_vmrglb_direct_le for LE. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr106069-1.c: New test.
2024-06-26	tree-optimization/115646 - ICE with pow shrink-wrapping from bitfield	Richard Biener	2	-1/+14
	The following makes analysis and transform agree on constraints. PR tree-optimization/115646 * tree-call-cdce.cc (check_pow): Check for bit_sz values as allowed by transform. * gcc.dg/pr115646.c: New testcase.
2024-06-26	optab: Add isnormal_optab for isnormal builtin	Haochen Gui	3	-0/+9
	gcc/ * builtins.cc (interclass_mathfn_icode): Set optab to isnormal_optab for isnormal builtin. * optabs.def (isnormal_optab): New. * doc/md.texi (isnormal): Document.
2024-06-26	optab: Add isfinite_optab for isfinite builtin	Haochen Gui	3	-1/+10
	gcc/ * builtins.cc (interclass_mathfn_icode): Set optab to isfinite_optab for isfinite builtin. * optabs.def (isfinite_optab): New. * doc/md.texi (isfinite): Document.
2024-06-26	[libstdc++] [testsuite] no libatomic for vxworks	Alexandre Oliva	1	-0/+5
	libatomic hasn't been ported to vxworks. Most of the stdatomic.h and <atomic> underlying requirements are provided by builtins and libgcc, and the vxworks libc already provides remaining __atomic symbols, so porting libatomic doesn't seem to make sense. However, some of the target arch-only tests in add_options_for_libatomic cover vxworks targets, so we end up attempting to link libatomic in, even though it's not there. Preempt those too-broad tests. Co-Authored-By: Marc Poulhiès <poulhies@adacore.com> for libstdc++-v3/ChangeLog * testsuite/lib/dg-options.exp (add_options_for_libatomic): None for --vxworks*.
2024-06-26	[testsuite] [arm] [vect] adjust mve-vshr test [PR113281]	Alexandre Oliva	1	-0/+2
	The test was too optimistic, alas. We used to vectorize shifts by clamping the shift counts below the bit width of the types (e.g. at 15 for 16-bit vector elements), but (uint16_t)32768 >> (uint16_t)16 is well defined (because of promotion to 32-bit int) and must yield 0, not 1 (as before the fix). Unfortunately, in the gimple model of vector units, such large shift counts wouldn't be well-defined, so we won't vectorize such shifts any more, unless we can tell they're in range or undefined. So the test that expected the vectorization we no longer performed needs to be adjusted. Instead of nobbling the test, Richard Earnshaw suggested annotating the test with the expected ranges so as to enable the optimization, and Christophe Lyon suggested a further simplification. Co-Authored-By: Richard Earnshaw <Richard.Earnshaw@arm.com> for gcc/testsuite/ChangeLog PR tree-optimization/113281 * gcc.target/arm/simd/mve-vshr.c: Add expected ranges.
2024-06-26	Optimize a < 0 ? -1 : 0 to (signed)a >> 31.	liuhongt	6	-1/+264
	Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31 and x < 0 ? 1 : 0 into (unsigned) x >> 31. Move the optimization done in ix86_expand_int_vcond to match.pd. gcc/ChangeLog: PR target/114189 * match.pd: Simplify a < 0 ? -1 : 0 to (signed) a >> 31 and a < 0 ? 1 : 0 to (unsigned) a >> 31 for vector integer type. gcc/testsuite/ChangeLog: * gcc.target/i386/avx2-pr115517.c: New test. * gcc.target/i386/avx512-pr115517.c: New test. * g++.target/i386/avx2-pr115517.C: New test. * g++.target/i386/avx512-pr115517.C: New test. * g++.dg/tree-ssa/pr88152-1.C: Adjust testcase.
2024-06-25	[PATCH 11/11] Handle subroutine types in CodeView	Mark Harmstone	2	-0/+240
	gcc/ * dwarf2codeview.cc (struct codeview_custom_type): Add lf_procedure and lf_arglist to union. (write_lf_procedure, write_lf_arglist): New functions. (write_custom_types): Call write_lf_procedure and write_lf_arglist. (get_type_num_subroutine_type): New function. (get_type_num): Handle DW_TAG_subroutine_type DIEs. * dwarf2codeview.h (LF_PROCEDURE, LF_ARGLIST): Define.
2024-06-25	[PATCH 10/11] Handle bitfields for CodeView	Mark Harmstone	2	-2/+88
	Translates structure members with DW_AT_data_bit_offset set in DWARF into LF_BITFIELD symbols. gcc/ * dwarf2codeview.cc (struct codeview_custom_type): Add lf_bitfield to union. (write_lf_bitfield): New function. (write_custom_types): Call write_lf_bitfield. (create_bitfield): New function. (get_type_num_struct): Handle bitfields. * dwarf2codeview.h (LF_BITFIELD): Define.
2024-06-25	diagnostics: introduce diagnostic-global-context.cc	David Malcolm	3	-521/+554
	This moves all of the uses of global_dc within diagnostic.cc (including the definition) to a new diagnostic-global-context.cc. My intent is to make clearer those parts of our internal API that implicitly use global_dc, and to perhaps avoid linking global_dc into a future libdiagnostics.so. No functional change intended. gcc/ChangeLog: * Makefile.in (OBJS-libcommon): Add diagnostic-global-context.o. * diagnostic-global-context.cc: New file, taken from material in diagnostic.cc. * diagnostic.cc (global_diagnostic_context): Move to diagnostic-global-context.cc. (global_dc): Likewise. (verbatim): Likewise. (emit_diagnostic): Likewise. (emit_diagnostic_valist): Likewise. (emit_diagnostic_valist_meta): Likewise. (inform): Likewise. (inform_n): Likewise. (warning): Likewise. (warning_at): Likewise. (warning_meta): Likewise. (warning_n): Likewise. (pedwarn): Likewise. (permerror): Likewise. (permerror_opt): Likewise. (error): Likewise. (error_n): Likewise. (error_at): Likewise. (error_meta): Likewise. (sorry): Likewise. (sorry_at): Likewise. (seen_error): Likewise. (fatal_error): Likewise. (internal_error): Likewise. (internal_error_no_backtrace): Likewise. (fnotice): Likewise. (auto_diagnostic_group::auto_diagnostic_group): Likewise. (auto_diagnostic_group::~auto_diagnostic_group): Likewise. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-06-25	diagnostics: eliminate various implicit uses of global_dc	David Malcolm	7	-147/+166
	This patch eliminates all implicit uses of "global_dc" from the path-printing logic and from gcc_rich_location::add_location_if_nearby. No functional change intended. gcc/c/ChangeLog: * c-parser.cc (c_parser_require): Pass global_dc to gcc_rich_location::add_location_if_nearby. gcc/cp/ChangeLog: parser.cc (cp_parser_error_1): Pass global_dc to gcc_rich_location::add_location_if_nearby. (cp_parser_decl_specifier_seq): Likewise. (cp_parser_set_storage_class): Likewise. (cp_parser_set_storage_class): Likewise. gcc/ChangeLog: diagnostic-path.cc (class path_label): Add m_path field, and use it to replace all uses of global_dc. (event_range::event_range): Add "ctxt" param and use it to construct m_path_label. (event_range::maybe_add_event): Add "ctxt" param and pass it to gcc_rich_location::add_location_if_nearby. (path_summary::path_summary): Add "ctxt" param and pass it to event_range::maybe_add_event. (diagnostic_context::print_path): Pass this to path_summary ctor. (selftest::test_empty_path): Use "dc" when constructing path_summary rather than implicitly using global_dc. (selftest::test_intraprocedural_path): Likewise. (selftest::test_interprocedural_path_1): Likewise. (selftest::test_interprocedural_path_2): Likewise. (selftest::test_recursion): Likewise. (selftest::test_control_flow_1): Likewise. (selftest::test_control_flow_2): Likewise. (selftest::test_control_flow_3): Likewise. (selftest::assert_cfg_edge_path_streq): Likewise. (selftest::test_control_flow_5): Likewise. (selftest::test_control_flow_6): Likewise. (selftest::diagnostic_path_cc_tests): Eliminate use of global_dc. diagnostic-show-locus.cc (gcc_rich_location::add_location_if_nearby): Add "ctxt" param and use it instead of implicitly using global_dc. (selftest::test_add_location_if_nearby): Use test_diagnostic_context rather than implicitly using global_dc. * diagnostic.cc (pedantic_warning_kind): Delete macro. (permissive_error_kind): Delete macro. (permissive_error_option): Delete macro. 
(diagnostic_context::diagnostic_enabled): Remove use of permissive_error_option. (diagnostic_context::report_diagnostic): Remove use of pedantic_warning_kind. (diagnostic_impl): Convert to... (diagnostic_context::diagnostic_impl): ...this. (diagnostic_n_impl): Convert to... (diagnostic_context::diagnostic_n_impl): ...this. (emit_diagnostic): Explicitly use global_dc for method call. (emit_diagnostic_valist): Likewise. (emit_diagnostic_valist_meta): Likewise. (inform): Likewise. (inform_n): Likewise. (warning): Likewise. (warning_at): Likewise. (warning_meta): Likewise. (warning_n): Likewise. (pedwarn): Likewise. (permerror): Likewise. (permerror_opt): Likewise. (error): Likewise. (error_n): Likewise. (error_at): Likewise. (error_meta): Likewise. (sorry): Likewise. (sorry_at): Likewise. (fatal_error): Likewise. (internal_error): Likewise. (internal_error_no_backtrace): Likewise. * diagnostic.h (diagnostic_context::diagnostic_impl): New decl. (diagnostic_context::diagnostic_n_impl): New decl. * gcc-rich-location.h (gcc_rich_location::add_location_if_nearby): Add "ctxt" param. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-06-25	testsuite: use check-jsonschema for validating .sarif files [PR109360]	David Malcolm	3	-10/+11
	As reported here: https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655434.html the schema validation I added for generated .sarif files in r15-1541-ga84fe222029ff2 used the "jsonschema" command line tool, which has been deprecated by more recent versions of the Python 3 "jsonschema" module. This patch updates the validation to use the more recent "check-jsonschema" command line tool, from the Python 3 "check-jsonschema" module, fixing the testsuite FAILs due to the deprecation message. As an added bonus, the output on validation failures is much nicer, e.g. if I undo r15-1540-g9f4fdc3acebcf6, the error messages begin like this: verify-sarif-file: res: Schema validation errors were encountered. diagnostic-format-sarif-file-bad-utf8-pr109098-1.c.sarif::$.runs[0].results[0].locations[0].physicalLocation.region.startColumn: 0 is less than the minimum of 1 diagnostic-format-sarif-file-bad-utf8-pr109098-1.c.sarif::$.runs[0].results[0].relatedLocations[0].physicalLocation.region.startColumn: 0 is less than the minimum of 1 diagnostic-format-sarif-file-bad-utf8-pr109098-1.c.sarif::$.runs[0].results[0].relatedLocations[1].physicalLocation.region.startColumn: 0 is less than the minimum of 1 diagnostic-format-sarif-file-bad-utf8-pr109098-1.c.sarif::$.runs[0].results[0].relatedLocations[2].physicalLocation.region.startColumn: 0 is less than the minimum of 1 child process exited abnormally FAIL: c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c -Wc++-compat (test .sarif output against SARIF schema) Tested with Python 3.8 with check_jsonschema 0.28.6 gcc/ChangeLog: PR testsuite/109360 * doc/install.texi (Python3 modules): Update SARIF validation requirement to use check-jsonschema rather than jsonschema. gcc/testsuite/ChangeLog: PR testsuite/109360 * lib/scansarif.exp (verify-sarif-file): Use check-jsonschema rather than jsonschema, updating the invocation accordingly. * lib/target-supports.exp (check_effective_target_jsonschema): Convert to... 
(check_effective_target_check_jsonschema): ...this. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-06-26	Daily bump.	GCC Administrator	12	-1/+838