Age  Commit message  Author  Files  Lines
2023-07-06ada: Refer to non-Ada binding limitations in user guideViljar Indus2-31/+37
The limitation of resetting the FPU mode for non 80-bit precision was not referenced from "Creating a Stand-alone Library to be used in a non-Ada context". Reference it the same way it is already referenced from "Interfacing to C". gcc/ada/ * doc/gnat_ugn/the_gnat_compilation_model.rst: Reference "Binding with Non-Ada Main Programs" from "Creating a Stand-alone Library to be used in a non-Ada context". * gnat_ugn.texi: Regenerate.
2023-07-06ada: Reuse code in Is_Fully_Initialized_TypeViljar Indus1-1/+1
gcc/ada/ * sem_util.adb (Is_Fully_Initialized_Type): Avoid recalculating the underlying type twice.
2023-07-06ada: Avoid crash in Find_Optional_Prim_OpViljar Indus1-0/+5
Find_Optional_Prim_Op can crash when the Underlying_Type is Empty. This can happen when you are dealing with a structure type with a private part that does not have its Full_View set yet. gcc/ada/ * exp_util.adb (Find_Optional_Prim_Op): Stop deriving primitive operation if there is no underlying type to derive it from.
2023-07-06ada: Improve error message on violation of SPARK_Mode rulesYannick Moy2-1/+4
SPARK_Mode On can only be used on library-level entities. Improve the error message here. gcc/ada/ * errout.ads: Add explain code. * sem_prag.adb (Check_Library_Level_Entity): Refine error message and add explain code.
2023-07-06ada: Finalization not performed for component of protected typeSteve Baird2-1/+16
In some cases involving a discriminated protected type with an array component that is subject to a discriminant-dependent index constraint, where the element type of the array requires finalization and the array type has not yet been frozen at the point of the declaration of the protected type, finalization of an object of the protected type may incorrectly omit finalization of the array component. One case where this scenario can arise is an instantiation of Ada.Containers.Bounded_Synchronized_Queues, passing in an Element type that requires finalization. gcc/ada/ * exp_ch7.adb (Make_Final_Call): Add assertion that if no finalization call is generated, then the type of the object being finalized does not require finalization. * freeze.adb (Freeze_Entity): If freezing an already-frozen subtype, do not assume that nothing needs to be done. In the case of a frozen subtype of a non-frozen type or subtype (which is possible), freeze the non-frozen entity.
2023-07-06tree-optimization/110563 - simplify epilogue VF checksRichard Biener3-41/+19
The following consolidates an assert that now hits for ppc64le with an earlier check we already do, simplifying vect_determine_partial_vectors_and_peeling and getting rid of its now redundant argument. PR tree-optimization/110563 * tree-vectorizer.h (vect_determine_partial_vectors_and_peeling): Remove second argument. * tree-vect-loop.cc (vect_determine_partial_vectors_and_peeling): Remove for_epilogue_p argument. Merge assert ... (vect_analyze_loop_2): ... with check done before determining partial vectors by moving it after. * tree-vect-loop-manip.cc (vect_do_peeling): Adjust.
2023-07-06GGC, GTY: Tighten up a few things re 'reorder' option and stringsThomas Schwinge2-4/+15
..., which doesn't make sense in combination. This, again, is primarily preparational for another change. gcc/ * ggc-common.cc (gt_pch_note_reorder, gt_pch_save): Tighten up a few things re 'reorder' option and strings. * stringpool.cc (gt_pch_p_S): This is now 'gcc_unreachable'.
2023-07-06GTY: Clean up obsolete parametrized structs remnantsThomas Schwinge3-8/+3
Support removed in 2014 with commit 63f5d5b818319129217e41bcb23db53f99ff11b0 (Subversion r218558) "remove gengtype support for param_is use_param, if_marked and splay tree allocators". gcc/ * gengtype-parse.cc: Clean up obsolete parametrized structs remnants. * gengtype.cc: Likewise. * gengtype.h: Likewise.
2023-07-06GTY: Clean up obsolete 'bool needs_cast_p' field of 'gcc/gengtype.cc:struct ↵Thomas Schwinge1-11/+8
walk_type_data' Last use disappeared in 2014 with commit 63f5d5b818319129217e41bcb23db53f99ff11b0 (Subversion r218558) "remove gengtype support for param_is use_param, if_marked and splay tree allocators". gcc/ * gengtype.cc (struct walk_type_data): Remove 'needs_cast_p'. Adjust all users.
2023-07-06GTY: Repair 'enum gty_token', 'token_names' desynchronizationThomas Schwinge2-1/+7
For example, for the following (made-up) changes: --- gcc/ggc-tests.cc +++ gcc/ggc-tests.cc @@ -258 +258 @@ class GTY((tag("1"))) some_subclass : public example_base -class GTY((tag("2"))) some_other_subclass : public example_base +class GTY((tag(user))) some_other_subclass : public example_base @@ -384 +384 @@ test_chain_next () -struct GTY((user)) user_struct +struct GTY((user user)) user_struct ..., we get unexpected "have a param<N>_is option" diagnostics: [...] build/gengtype \ -S [...]/source-gcc/gcc -I gtyp-input.list -w tmp-gtype.state [...]/source-gcc/gcc/ggc-tests.cc:258: parse error: expected a string constant, have a param<N>_is option [...]/source-gcc/gcc/ggc-tests.cc:384: parse error: expected ')', have a param<N>_is option make[2]: *** [Makefile:2888: s-gtype] Error 1 [...] This traces back to 2012 "Support garbage-collected C++ templates", which got incorporated in commit 0823efedd0fb8669b7e840954bc54c3b2cf08d67 (Subversion r190402), which did add 'USER_GTY' to what nowadays is known as 'enum gty_token', but didn't accordingly update 'gcc/gengtype-parse.c:token_names', leaving those out of sync. Updating 'gcc/gengtype-parse.c:token_value_format' wasn't necessary, as: /* print_token assumes that any token >= FIRST_TOKEN_WITH_VALUE may have a meaningful value to be printed. */ FIRST_TOKEN_WITH_VALUE = PARAM_IS This, in turn, got further confused -- or "fixed" -- by later changes: 2014 commit 63f5d5b818319129217e41bcb23db53f99ff11b0 (Subversion r218558) "remove gengtype support for param_is use_param, if_marked and splay tree allocators", which reciprocally missed corresponding clean-up. 
With that addressed via adding the missing '"user"' to 'token_names', and, until that is properly fixed, a temporary 'UNUSED_PARAM_IS' (re-)added for use with 'FIRST_TOKEN_WITH_VALUE', we then get the expected: [...]/source-gcc/gcc/ggc-tests.cc:258: parse error: expected a string constant, have 'user' [...]/source-gcc/gcc/ggc-tests.cc:384: parse error: expected ')', have 'user' gcc/ * gengtype-parse.cc (token_names): Add '"user"'. * gengtype.h (gty_token): Add 'UNUSED_PARAM_IS' for use with 'FIRST_TOKEN_WITH_VALUE'.
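
The kind of desynchronization described above, where an enumerator is added without a matching entry in a parallel name table, can be caught at compile time. The following is a minimal sketch, not the gengtype source: `token_kind`, `token_kind_names` and `token_kind_name` are hypothetical stand-ins for the idea behind 'enum gty_token' and 'token_names'.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical token kinds; stand-ins only, not the real 'enum gty_token'.
enum token_kind { TOK_GTY, TOK_USER, TOK_STRING, TOK_COUNT };

// Parallel table of printable names, indexed by token_kind.
static const char *const token_kind_names[] = {
  "GTY",
  "user",
  "a string constant",
};

// Fails to compile if an enumerator is added without a matching name
// (or the other way round), so the two cannot silently drift apart.
static_assert (sizeof (token_kind_names) / sizeof (token_kind_names[0])
               == TOK_COUNT,
               "token_kind_names out of sync with enum token_kind");

// Look up the printable name used in diagnostics.
const char *
token_kind_name (token_kind t)
{
  assert (t >= TOK_GTY && t < TOK_COUNT);
  return token_kind_names[t];
}
```

With such a check in place, a diagnostic for `TOK_USER` prints 'user' rather than the name of whichever neighbouring entry happens to sit at that index.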
2023-07-06GTY: Enhance 'string_length' option documentationThomas Schwinge2-3/+12
We're (currently) not aware of any actual use of 'ht_identifier's with NUL characters embedded; its 'len' field appears to exist for optimization purposes, since "forever". Before 'struct ht_identifier' was added in commit 2a967f3d3a45294640e155381ef549e0b8090ad4 (Subversion r42334), we had in 'gcc/cpplib.h:struct cpp_hashnode': 'unsigned short len', or earlier 'length', earlier in 'gcc/cpphash.h:struct hashnode': 'unsigned short length', earlier 'size_t length' with comment: "length of token, for quick comparison", earlier 'int length', ever since the 'gcc/cpp*' files were added in commit 7f2935c734c36f84ab62b20a04de465e19061333 (Subversion r9191). This amends commit f3b957ea8b9dadfb1ed30f24f463529684b7a36a "pch: Fix streaming of strings with embedded null bytes". gcc/ * doc/gty.texi (GTY Options) <string_length>: Enhance. libcpp/ * include/symtab.h (struct ht_identifier): Document different rationale.
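
To illustrate why an explicit length matters once NUL bytes may be embedded (the case the amended PCH commit is about), here is a small sketch. `ident_sketch` and both helpers are hypothetical and only loosely modeled on the idea of a 'len' field stored next to the character data; they are not the libcpp definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical identifier record: character data plus an explicit length.
struct ident_sketch
{
  const char *str;
  std::size_t len; // number of characters, excluding the final NUL
};

// Size to save when relying on NUL termination only: stops at the first
// embedded NUL, so any data after it would be lost.
std::size_t
bytes_via_strlen (const ident_sketch &id)
{
  return std::strlen (id.str) + 1;
}

// Size to save when trusting the explicit length field instead.
std::size_t
bytes_via_len (const ident_sketch &id)
{
  return id.len + 1;
}
```

For the five-character string "ab\0cd", the first helper reports 3 bytes and the second 6, which is the difference a 'string_length'-style annotation is meant to cover.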
2023-07-06GTY: Explicitly reject 'string_length' option for (fields in) global variablesThomas Schwinge2-0/+14
This is preparational for another thing that I'm working on. No change in behavior -- other than a more explicit error message. The 'string_length' option currently is not supported for (fields in) global variables. For example, if we apply the following (made-up) changes: --- gcc/c-family/c-cppbuiltin.cc +++ gcc/c-family/c-cppbuiltin.cc @@ -1777 +1777 @@ struct GTY(()) lazy_hex_fp_value_struct - const char *hex_str; + const char * GTY((string_length("strlen(%h.hex_str) + 1"))) hex_str; --- gcc/varasm.cc +++ gcc/varasm.cc @@ -66 +66 @@ along with GCC; see the file COPYING3. If not see -extern GTY(()) const char *first_global_object_name; +extern GTY((string_length("strlen(%h.first_global_object_name) + 1"))) const char *first_global_object_name; ..., we get: [...] build/gengtype \ -S [...]/source-gcc/gcc -I gtyp-input.list -w tmp-gtype.state /bin/sh [...]/source-gcc/gcc/../move-if-change tmp-gtype.state gtype.state build/gengtype \ -r gtype.state [...]/source-gcc/gcc/varasm.cc:66: global `first_global_object_name' has unknown option `string_length' [...]/source-gcc/gcc/c-family/c-cppbuiltin.cc:1789: field `hex_str' of global `lazy_hex_fp_values[0]' has unknown option `string_length' make[2]: *** [Makefile:2890: s-gtype] Error 1 [...] These errors occur when writing "GC roots", where -- per my understanding -- 'string_length' isn't relevant for actual GC purposes. However, like elsewhere, it is for PCH purposes, and simply accepting 'string_length' here isn't sufficient: we'll still get '(gt_pointer_walker) &gt_pch_n_S' used in the 'struct ggc_root_tab' instances, and there's no easy way to change that to instead use 'gt_pch_n_S2' with explicit 'size_t string_len' argument. (At least not sufficiently easy to justify spending any further time on, given that I don't have an actual use for that feature.) 
So, until an actual need arises, and/or to avoid the next person looking into this having to figure out the same thing again, let's just document this limitation: [...]/source-gcc/gcc/varasm.cc:66: option `string_length' not supported for global `first_global_object_name' [...]/source-gcc/gcc/c-family/c-cppbuiltin.cc:1789: option `string_length' not supported for field `hex_str' of global `lazy_hex_fp_values[0]' This amends commit f3b957ea8b9dadfb1ed30f24f463529684b7a36a "pch: Fix streaming of strings with embedded null bytes". gcc/ * gengtype.cc (write_root, write_roots): Explicitly reject 'string_length' option. * doc/gty.texi (GTY Options) <string_length>: Document.
2023-07-06GGC: Remove unused 'bool is_string' arguments to ↵Thomas Schwinge3-18/+12
'ggc_pch_{count,alloc,write}_object' They're unused since the removal of 'gcc/ggc-zone.c' in 2013 Subversion r195426 (Git commit cd030c079e5e42fe3f49261fe01f384e6b7f0111) "Remove zone allocator". Should any future 'gcc/ggc-[...].cc' ever need this again, it'll be a conscious decision at that time. gcc/ * ggc-internal.h (ggc_pch_count_object, ggc_pch_alloc_object) (ggc_pch_write_object): Remove 'bool is_string' argument. * ggc-common.cc: Adjust. * ggc-page.cc: Likewise.
2023-07-06[Committed] Handle COPYSIGN in dwarf2out.cc's mem_loc_descriptor.Roger Sayle1-0/+1
Many thanks to Hans-Peter Nilsson for reminding me that new RTX codes need to be added to dwarf2out.cc's mem_loc_descriptor, and for doing this for BITREVERSE. This patch does the same for the recently added COPYSIGN. I'd been testing these on a target that doesn't use DWARF (nvptx-none) and so didn't exhibit the issue, and my additional testing on x86_64-pc-linux-gnu to double check that changes were safe, doesn't (yet) trigger the problematic assert in dwarf2out.cc's mem_loc_descriptor. 2023-07-06 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * dwarf2out.cc (mem_loc_descriptor): Handle COPYSIGN.
2023-07-06i386: Update document for inlining rulesHongyu Wang1-5/+14
gcc/ChangeLog: * doc/extend.texi: Move x86 inlining rule to a new subsubsection and add description for inling of function with arch and tune attributes.
2023-07-06tree-optimization/110515 - wrong code with LIM + PRERichard Biener2-0/+224
In this PR we face the issue that LIM speculates a load when hoisting it out of the loop (since it knows it cannot trap). Unfortunately this exposes undefined behavior when the load accesses memory with the wrong dynamic type. This later makes PRE use that representation instead of the original which accesses the same memory location but using a different dynamic type leading to a wrong disambiguation of that original access against another and thus a wrong-code transform. Fortunately there already is code in PRE dealing with a similar situation for code hoisting but that left a small gap which when fixed also fixes the wrong-code transform in this bug even if it doesn't address the underlying issue of LIM speculating that load. The upside is this fix is trivially safe to backport and chances of code generation regressions are very low. PR tree-optimization/110515 * tree-ssa-pre.cc (compute_avail): Make code dealing with hoisting loads with different alias-sets more robust. * g++.dg/opt/pr110515.C: New testcase.
2023-07-06VECT: Fix ICE of variable stride on strided load/store with SELECT_VL loop ↵Ju-Zhe Zhong1-4/+2
control. Hi, Richi. Sorry for the mistake in LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with SELECT_VL loop control. Consider the following case: void __attribute__ ((noinline, noclone)) \ f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \ INDEX##BITS stride, INDEX##BITS n) \ { \ for (INDEX##BITS i = 0; i < n; ++i) \ dest[i] += src[i * stride]; \ } When "stride" is a constant, the current flow works fine. However, when "stride" is a variable, it causes an ICE: ... _96 = .SELECT_VL (ivtmp_94, 4); ... ivtmp_78 = ((sizetype) _39 * (sizetype) _96) * 4; vect__11.69_87 = .LEN_MASK_GATHER_LOAD (vectp_src.67_85, _84, 4, { 0, 0, 0, 0 }, { -1, -1, -1, -1 }, _96, 0); ... vectp_src.67_86 = vectp_src.67_85 + ivtmp_78; Because of the IR: ivtmp_78 = ((sizetype) _39 * (sizetype) _96) * 4; Instead, I split the IR into: step_stride = _39 step = step_stride * 4 ivtmp_78 = step * _96 Thanks. gcc/ChangeLog: * tree-vect-stmts.cc (vect_get_strided_load_store_ops): Fix ICE.
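
The regrouping of the pointer bump can be sketched in plain C++ as below. This only illustrates that the two groupings compute the same value under unsigned arithmetic; it does not model the vectorizer and does not explain why the original form triggers the ICE. The function and parameter names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Original grouping: (stride * vl) * element size,
// as in "ivtmp_78 = ((sizetype) _39 * (sizetype) _96) * 4".
std::size_t
bump_fused (std::size_t stride, std::size_t vl, std::size_t elt_size)
{
  return (stride * vl) * elt_size;
}

// Split grouping from the message: first form the per-element byte step,
// then scale it by the vector length chosen for this iteration.
std::size_t
bump_split (std::size_t stride, std::size_t vl, std::size_t elt_size)
{
  std::size_t step_stride = stride;         // step_stride = _39
  std::size_t step = step_stride * elt_size; // step = step_stride * 4
  return step * vl;                         // ivtmp_78 = step * _96
}
```

In the split form, the first two statements do not depend on the per-iteration vector length; only the final multiplication does.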
2023-07-06Fix expectation on gcc.dg/vect/pr71264.cRichard Biener1-3/+0
With the recent change to more reliably not vectorize code already using vector types we run into FAILs of gcc.dg/vect/pr71264.c. The testcase was added for fixing an ICE; possible (re-)vectorization of the code isn't really supported and I suspect it might even go wrong for non-bitops. The following leaves the testcase as just testing for an ICE. PR tree-optimization/110544 * gcc.dg/vect/pr71264.c: Remove scan for vectorization.
2023-07-06i386: Inline function with default arch/tune to callerHongyu Wang3-7/+66
For functions with different target attributes, the current logic refuses to inline the callee when any arch or tune is mismatched. Relax the condition to allow a callee with default arch/tune to be inlined. gcc/ChangeLog: * config/i386/i386.cc (ix86_can_inline_p): If callee has default arch=x86-64 and tune=generic, do not block the inlining to its caller. Also allow callee with different arch= to be inlined if it has always_inline attribute and its ISA is a subset of caller's. gcc/testsuite/ChangeLog: * gcc.target/i386/inline_attr_arch.c: New test. * gcc.target/i386/inline_target_clones.c: Ditto.
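
The relaxed rule can be paraphrased as a small predicate. This is a simplified sketch of the arch/tune part only, written from the description above; it is not the code of ix86_can_inline_p. `target_sketch`, its fields, and the bit-mask encoding of the ISA are hypothetical, and how a tune-only mismatch is treated here is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical summary of a function's target attributes.
struct target_sketch
{
  std::string arch;       // e.g. "x86-64" when left at the default
  std::string tune;       // e.g. "generic" when left at the default
  std::uint64_t isa_bits; // one bit per enabled ISA extension (made up)
  bool always_inline;
};

// True if the arch/tune settings should not block inlining CALLEE
// into CALLER, following the description in the commit message.
bool
arch_tune_permits_inline (const target_sketch &caller,
                          const target_sketch &callee)
{
  // Identical settings never block inlining.
  if (callee.arch == caller.arch && callee.tune == caller.tune)
    return true;

  // A callee left at the defaults does not block inlining.
  if (callee.arch == "x86-64" && callee.tune == "generic")
    return true;

  // A different arch= is tolerated for always_inline callees whose
  // ISA is a subset of the caller's.
  bool isa_subset = (callee.isa_bits & ~caller.isa_bits) == 0;
  if (callee.arch != caller.arch && callee.always_inline && isa_subset)
    return true;

  return false;
}
```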
2023-07-06RISC-V: Handle rounding mode correctly on zfinxKito Cheng1-1/+1
Zfinx provides fcsr like F, so the rounding mode should use fcsr instead of the `soft` fenv. libgcc/ChangeLog: * config/riscv/sfp-machine.h (FP_INIT_ROUNDMODE): Check zfinx. (FP_HANDLE_EXCEPTIONS): Ditto.
2023-07-06Adjust rtx_cost for DF/SFmode AND/IOR/XOR/ANDN operations.liuhongt2-3/+22
They should have the same cost as vector mode since both generate pand/pandn/pxor/por instructions. gcc/ChangeLog: * config/i386/i386.cc (ix86_rtx_costs): Adjust rtx_cost for DF/SFmode AND/IOR/XOR/ANDN operations. gcc/testsuite/ChangeLog: * gcc.target/i386/pr110170-2.c: New test.
2023-07-05Fix PR 110554: vec lowering introduces scalar signed-boolean:32 comparisonsAndrew Pinski1-2/+6
So the problem is that vector generic decided to do comparisons in signed-boolean:32 types but the rest of the middle-end was not ready for that. Since we are building the comparison which will feed into a cond_expr here, using boolean_type_node is better and also correct. The rest of the compiler thinks the range for comparisons is always [0,1] too. Note this code does not currently lower bigger vector sizes into smaller vector sizes so using boolean_type_node here is better. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: PR middle-end/110554 * tree-vect-generic.cc (expand_vector_condition): For comparisons, just build using boolean_type_node instead of the cond_type. For non-comparisons/non-scalar-bitmask, build a ` != 0` gimple that will feed into the COND_EXPR.
2023-07-06Disparage slightly for the alternative which move DFmode between SSE_REGS ↵liuhongt2-2/+13
and GENERAL_REGS. For the testcase void __cond_swap(double* __x, double* __y) { bool __r = (*__x < *__y); auto __tmp = __r ? *__x : *__y; *__y = __r ? *__y : *__x; *__x = __tmp; } GCC-14 with -O2 and -march=x86-64 options generates the following code: __cond_swap(double*, double*): movsd xmm1, QWORD PTR [rdi] movsd xmm0, QWORD PTR [rsi] comisd xmm0, xmm1 jbe .L2 movq rax, xmm1 movapd xmm1, xmm0 movq xmm0, rax .L2: movsd QWORD PTR [rsi], xmm1 movsd QWORD PTR [rdi], xmm0 ret rax is used to save and restore the DFmode value. In RA, both GENERAL_REGS and SSE_REGS cost zero since we didn't disparage the alternative in the movdf_internal pattern, so GENERAL_REGS is allocated according to the register allocation order. The patch adds ? for alternatives (r,v) and (v,r), just as we did for the movsf/hf/bf_internal patterns; after that we get optimal RA. __cond_swap: .LFB0: .cfi_startproc movsd (%rdi), %xmm1 movsd (%rsi), %xmm0 comisd %xmm1, %xmm0 jbe .L2 movapd %xmm1, %xmm2 movapd %xmm0, %xmm1 movapd %xmm2, %xmm0 .L2: movsd %xmm1, (%rsi) movsd %xmm0, (%rdi) ret gcc/ChangeLog: PR target/110170 * config/i386/i386.md (movdf_internal): Disparage slightly for 2 alternatives (r,v) and (v,r) by adding constraint modifier '?'. gcc/testsuite/ChangeLog: * gcc.target/i386/pr110170-3.c: New test.
2023-07-05rs6000: Remove redundant initialization [PR106907]Jeevitha Palanisamy1-2/+1
PR106907 has a few warnings spotted by cppcheck. This patch addresses the redundant initialization issue among them. Here the initialized value of 'new_addr' was overwritten before it was read. Updated the source by removing the unnecessary initialization of 'new_addr'. 2023-07-06 Jeevitha Palanisamy <jeevitha@linux.ibm.com> gcc/ PR target/106907 * config/rs6000/rs6000.cc (rs6000_expand_vector_extract): Remove redundant initialization of new_addr.
2023-07-06tree-optimization/110474 - Vect: select small VF for epilog of unrolled loopHao Liu2-6/+47
If a loop is unrolled during vectorization (i.e. suggested_unroll_factor > 1), the VFs of both main and epilog loop are enlarged. The epilog vect loop is specific for a loop with small iteration counts, so a large VF may hurt performance. This patch unscales the main loop VF by suggested_unroll_factor while selecting the epilog loop VF, so that it will be the same as vectorized loop without unrolling (i.e. suggested_unroll_factor = 1). gcc/ChangeLog: PR tree-optimization/110474 * tree-vect-loop.cc (vect_analyze_loop_2): unscale the VF by suggested unroll factor while selecting the epilog vect loop VF. gcc/testsuite/ChangeLog: * gcc.target/aarch64/pr110474.c: New testcase.
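
As a worked example of the unscaling step: with a main loop VF of 16 and suggested_unroll_factor of 2, the VF used as the starting point for selecting the epilog VF becomes 8, the same as for the loop vectorized without unrolling. The sketch below uses plain unsigned integers and a hypothetical helper name; the vectorizer's actual VF representation and selection logic are not modeled here.

```cpp
#include <cassert>

// Undo the scaling that unrolling applied to the main loop VF.
// Assumes (for this sketch) that the VF is an exact multiple of the
// unroll factor.
unsigned
unscaled_vf (unsigned main_vf, unsigned suggested_unroll_factor)
{
  assert (suggested_unroll_factor >= 1);
  assert (main_vf % suggested_unroll_factor == 0);
  return main_vf / suggested_unroll_factor;
}
```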
2023-07-06Daily bump.GCC Administrator9-1/+427
2023-07-05Make compute_operand_range a tail call.Andrew MacLeod1-18/+16
Tweak the routine so it is making a tail call. * gimple-range-gori.cc (compute_operand_range): Convert to a tail call.
2023-07-05Make compute_operand2_range a leaf call.Andrew MacLeod2-28/+26
Rather than creating long call chains, put the onus for finishing the evaluation on the caller. * gimple-range-gori.cc (compute_operand_range): After calling compute_operand2_range, recursively call self if needed. (compute_operand2_range): Turn into a leaf function. (gori_compute::compute_operand1_and_operand2_range): Finish operand2 calculation. * gimple-range-gori.h (compute_operand2_range): Remove name param.
2023-07-05Make compute_operand1_range a leaf call.Andrew MacLeod2-26/+25
Rather than creating long call chains, put the onus for finishing the evaluation on the caller. * gimple-range-gori.cc (compute_operand_range): After calling compute_operand1_range, recursively call self if needed. (compute_operand1_range): Turn into a leaf function. (gori_compute::compute_operand1_and_operand2_range): Finish operand1 calculation. * gimple-range-gori.h (compute_operand1_range): Remove name param.
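
The general shape of this refactoring can be sketched on a toy data structure. In the "chained" version the helper itself continues the walk, so the call depth grows with the length of the chain; in the "leaf" version the helper performs one step and returns, and the caller decides whether to continue. `chain_node` and all four functions are hypothetical and unrelated to the GORI data structures; the caller here uses a loop, where the commit describes a recursive self-call that was later made a tail call.

```cpp
#include <cassert>

// Toy chain of definitions: each node may refer to a next node.
struct chain_node
{
  int value;
  const chain_node *next;
};

// Chained style: the helper re-enters the driver, creating a call chain
// as long as the node chain.
int last_value_chained (const chain_node *n);

int
helper_chained (const chain_node *n)
{
  return n->next ? last_value_chained (n->next) : n->value;
}

int
last_value_chained (const chain_node *n)
{
  return helper_chained (n);
}

// Leaf style: the helper does a single step and calls nothing else.
const chain_node *
helper_leaf (const chain_node *n)
{
  return n->next;
}

// The caller finishes the evaluation by iterating until no step is left.
int
last_value_leaf (const chain_node *n)
{
  while (const chain_node *next = helper_leaf (n))
    n = next;
  return n->value;
}
```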
2023-07-05Simplify compute_operand_range for op1 and op2 case.Andrew MacLeod1-14/+11
Move the check for co-dependency between 2 operands into compute_operand_range, resulting in a much cleaner compute_operand1_and_operand2_range routine. * gimple-range-gori.cc (compute_operand_range): Check for operand interdependence when both op1 and op2 are computed. (compute_operand1_and_operand2_range): No checks required now.
2023-07-05Move relation discovery into compute_operand_rangeAndrew MacLeod1-29/+13
compute_operand1_range and compute_operand2_range were both doing relation discovery between the 2 operands... move it into a common area. * gimple-range-gori.cc (compute_operand_range): Check for a relation between op1 and op2 and use that instead. (compute_operand1_range): Don't look for a relation override. (compute_operand2_range): Ditto.
2023-07-05libstdc++: Split up pstl/set.cc testcaseThomas Rodgers6-289/+451
This testcase is causing some timeout issues. This patch splits the testcase up by individual set algorithm. libstdc++-v3:/ChangeLog: * testsuite/25_algorithms/pstl/alg_sorting/set.cc: Delete file. * testsuite/25_algorithms/pstl/alg_sorting/set_difference.cc: New file. * testsuite/25_algorithms/pstl/alg_sorting/set_intersection.cc: Likewise. * testsuite/25_algorithms/pstl/alg_sorting/set_symmetric_difference.cc: Likewise. * testsuite/25_algorithms/pstl/alg_sorting/set_union.cc: Likewise. * testsuite/25_algorithms/pstl/alg_sorting/set_util.h: Likewise.
2023-07-05doc: Update my Contributors entryJonathan Wakely1-2/+1
gcc/ChangeLog: * doc/contrib.texi (Contributors): Update my entry.
2023-07-05value-prof.cc: Correct edge prob calculation.Filip Kastl1-1/+5
The mod-subtract optimization with ncounts==1 produced incorrect edge probabilities due to incorrect conditional probability calculation. This patch fixes the calculation. Signed-off-by: Filip Kastl <filip.kastl@gmail.com> gcc/ChangeLog: * value-prof.cc (gimple_mod_subtract_transform): Correct edge prob calculation.
2023-07-05sched: Change return type of predicate functions from int to boolUros Bizjak6-125/+131
Also change some internal variables to bool. gcc/ChangeLog: * sched-int.h (struct haifa_sched_info): Change can_schedule_ready_p, schedule_more_p and contributes_to_priority indirect function type from int to bool. (no_real_insns_p): Change return type from int to bool. (contributes_to_priority): Ditto. * haifa-sched.cc (no_real_insns_p): Change return type from int to bool and adjust function body accordingly. * modulo-sched.cc (try_scheduling_node_in_cycle): Change "success" variable type from int to bool. (ps_insn_advance_column): Change return type from int to bool. (ps_has_conflicts): Ditto. Change "has_conflicts" variable type from int to bool. * sched-deps.cc (deps_may_trap_p): Change return type from int to bool. (conditions_mutex_p): Ditto. * sched-ebb.cc (schedule_more_p): Ditto. (ebb_contributes_to_priority): Change return type from int to bool and adjust function body accordingly. * sched-rgn.cc (is_cfg_nonregular): Ditto. (check_live_1): Ditto. (is_pfree): Ditto. (find_conditional_protection): Ditto. (is_conditionally_protected): Ditto. (is_prisky): Ditto. (is_exception_free): Ditto. (haifa_find_rgns): Change "unreachable" and "too_large_failure" variables from int to bool. (extend_rgns): Change "rescan" variable from int to bool. (check_live): Change return type from int to bool and adjust function body accordingly. (can_schedule_ready_p): Ditto. (schedule_more_p): Ditto. (contributes_to_priority): Ditto.
2023-07-05gimple-isel: Recognize vec_extract pattern.Robin Dapp6-28/+143
In gimple-isel we already deduce a vec_set pattern from an ARRAY_REF(VIEW_CONVERT_EXPR). This patch does the same for a vec_extract. The code is largely similar to the vec_set one including the addition of a can_vec_extract_var_idx_p function in optabs.cc to check if the backend can handle a register operand as index. We already have can_vec_extract in optabs-query but that one checks whether we can extract specific modes. With the introduction of an internal function for vec_extract the expander must not FAIL. For vec_set this has already been the case so adjust the documentation accordingly. Additionally, clarify the wording of the vector-vector case for vec_extract. gcc/ChangeLog: * doc/md.texi: Document that vec_set and vec_extract must not fail. * gimple-isel.cc (gimple_expand_vec_set_expr): Rename this... (gimple_expand_vec_set_extract_expr): ...to this. (gimple_expand_vec_exprs): Call renamed function. * internal-fn.cc (vec_extract_direct): Add. (expand_vec_extract_optab_fn): New function to expand vec_extract optab. (direct_vec_extract_optab_supported_p): Add. * internal-fn.def (VEC_EXTRACT): Add. * optabs.cc (can_vec_extract_var_idx_p): New function. * optabs.h (can_vec_extract_var_idx_p): Declare.
2023-07-05RISC-V: Support variable index in vec_extract.Robin Dapp7-162/+171
This patch adds a gen_lowpart in the vec_extract expander so it properly works with a variable index and adds tests. gcc/ChangeLog: * config/riscv/autovec.md: Add gen_lowpart. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1.c: Add tests for variable index. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-run.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-zvfh-run.c: Ditto.
2023-07-05RISC-V: Allow variable index for vec_set.Robin Dapp7-164/+185
This patch enables a variable index for vec_set and adjust the tests. gcc/ChangeLog: * config/riscv/autovec.md: Allow register index operand. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c: Adjust test. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-run.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-zvfh-run.c: Ditto.
2023-07-05RISC-V: Use FRM_DYN when add the rounding mode operandPan Li1-4/+3
This patch would like to take FRM_DYN const rtx as the rounding mode operand according to the RVV spec, which takes the dyn as the only rounding mode for floating-point. Signed-off-by: Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/riscv-vector-builtins.cc (function_expander::use_exact_insn): Use FRM_DYN instead of const0.
2023-07-05RISC-V: Change truncate to float_truncate in narrowing patterns.Robin Dapp1-2/+2
This fixes a bug in the autovect FP narrowing patterns which resulted in a combine ICE. It would try to e.g. simplify a unary operation by simplify_const_unary_operation which obviously expects a float_truncate and not a truncate for a floating-point mode. gcc/ChangeLog: * config/riscv/autovec.md: Use float_truncate.
2023-07-05VECT: Apply LEN_MASK_GATHER_LOAD/SCATTER_STORE into vectorizerJu-Zhe Zhong5-23/+129
Hi, Richard and Richi. Address comments from Richi. Make gs_info.ifn = LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE. I have fully tested these 4 formats: length = vf is a dummy length, mask = {-1,-1, ... } is a dummy mask. 1. no length, no mask LEN_MASK_GATHER_LOAD (..., length = vf, mask = {-1,-1,...}) 2. length exists, no mask LEN_MASK_GATHER_LOAD (..., len, mask = {-1,-1,...}) 3. mask exists, no length LEN_MASK_GATHER_LOAD (..., length = vf, mask) 4. both mask and length exist LEN_MASK_GATHER_LOAD (..., length, mask) All of these work fine in this patch. Here is the example: void f (int *restrict a, int *restrict b, int n, int base, int step, int *restrict cond) { for (int i = 0; i < n; ++i) { if (cond[i]) a[i * 4] = b[i]; } } Gimple IR: <bb 3> [local count: 105119324]: _58 = (unsigned long) n_13(D); <bb 4> [local count: 630715945]: # vectp_cond.7_45 = PHI <vectp_cond.7_46(4), cond_14(D)(3)> # vectp_b.11_51 = PHI <vectp_b.11_52(4), b_15(D)(3)> # vectp_a.14_55 = PHI <vectp_a.14_56(4), a_16(D)(3)> # ivtmp_59 = PHI <ivtmp_60(4), _58(3)> _61 = .SELECT_VL (ivtmp_59, POLY_INT_CST [2, 2]); ivtmp_44 = _61 * 4; vect__4.9_47 = .LEN_MASK_LOAD (vectp_cond.7_45, 32B, _61, 0, { -1, ... }); mask__24.10_49 = vect__4.9_47 != { 0, ... }; vect__8.13_53 = .LEN_MASK_LOAD (vectp_b.11_51, 32B, _61, 0, mask__24.10_49); ivtmp_54 = _61 * 16; .LEN_MASK_SCATTER_STORE (vectp_a.14_55, { 0, 16, 32, ... }, 1, vect__8.13_53, _61, 0, mask__24.10_49); vectp_cond.7_46 = vectp_cond.7_45 + ivtmp_44; vectp_b.11_52 = vectp_b.11_51 + ivtmp_44; vectp_a.14_56 = vectp_a.14_55 + ivtmp_54; ivtmp_60 = ivtmp_59 - _61; if (ivtmp_60 != 0) goto <bb 4>; [83.33%] else goto <bb 5>; [16.67%] Ok for trunk ? gcc/ChangeLog: * internal-fn.cc (internal_fn_len_index): Apply LEN_MASK_GATHER_LOAD/SCATTER_STORE into vectorizer. (internal_fn_mask_index): Ditto. * optabs-query.cc (supports_vec_gather_load_p): Ditto. (supports_vec_scatter_store_p): Ditto. * tree-vect-data-refs.cc (vect_gather_scatter_fn_p): Ditto. 
* tree-vect-patterns.cc (vect_recog_gather_scatter_pattern): Ditto. * tree-vect-stmts.cc (check_load_store_for_partial_vectors): Ditto. (vect_get_strided_load_store_ops): Ditto. (vectorizable_store): Ditto. (vectorizable_load): Ditto.
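
The four length/mask combinations listed in the message can be read through a scalar model of a length- and mask-controlled gather. The sketch below is an assumption-laden illustration, not the definition of the internal function: it treats lane i as active when i is below the length and the mask bit is set, ignores the bias operand, and writes 0 to inactive lanes purely to make the result easy to inspect. `gather_model` and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Scalar model: result[i] = *(int *) ((char *) base + offsets[i] * scale)
// for every lane i that is below LEN and enabled in MASK.
std::vector<int>
gather_model (const int *base, const std::vector<std::ptrdiff_t> &offsets,
              std::ptrdiff_t scale, const std::vector<bool> &mask,
              std::size_t len)
{
  assert (mask.size () == offsets.size ());
  std::vector<int> result (offsets.size (), 0); // 0 marks inactive lanes
  const char *bytes = reinterpret_cast<const char *> (base);
  for (std::size_t i = 0; i < offsets.size () && i < len; ++i)
    {
      if (!mask[i])
        continue; // masked-off lane: no memory access
      std::memcpy (&result[i], bytes + offsets[i] * scale, sizeof (int));
    }
  return result;
}
```

In this model, passing a length equal to the number of lanes corresponds to the "dummy length" case and an all-true mask corresponds to the "dummy mask" case.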
2023-07-05Change MODE_BITSIZE to MODE_PRECISION for MODE_VECTOR_BOOL.Robin Dapp22-12/+367
RISC-V lowers the TYPE_PRECISION for MODE_VECTOR_BOOL vectors in order to distinguish between VNx1BI, VNx2BI, VNx4BI and VNx8BI. This patch adjusts uses of MODE_VECTOR_BOOL to use GET_MODE_PRECISION instead of GET_MODE_BITSIZE. The RISC-V tests are provided by Juzhe. Co-Authored-By: Juzhe-Zhong <juzhe.zhong@rivai.ai> gcc/c-family/ChangeLog: * c-common.cc (c_common_type_for_mode): Use GET_MODE_PRECISION. gcc/ChangeLog: * simplify-rtx.cc (native_encode_rtx): Ditto. (native_decode_vector_rtx): Ditto. (simplify_const_vector_byte_offset): Ditto. (simplify_const_vector_subreg): Ditto. * tree.cc (build_truth_vector_type_for_mode): Ditto. * varasm.cc (output_constant_pool_2): Ditto. gcc/fortran/ChangeLog: * trans-types.cc (gfc_type_for_mode): Ditto. gcc/go/ChangeLog: * go-lang.cc (go_langhook_type_for_mode): Ditto. gcc/lto/ChangeLog: * lto-lang.cc (lto_type_for_mode): Ditto. gcc/rust/ChangeLog: * backend/rust-tree.cc (c_common_type_for_mode): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-1.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-10.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-11.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-12.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-13.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-14.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-2.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-3.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-4.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-5.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-6.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-7.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-8.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-9.c: New test.
2023-07-05MIPS: Use unaligned access to expand block_move on r6YunQiang Su3-17/+54
MIPSr6 supports unaligned memory access with the normal lh/sh/lw/sw/ld/sd instructions, and thus lwl/lwr/ldl/ldr and swl/swr/sdl/sdr are removed. At the microarchitecture level, these memory access instructions issue 2 operations if the address is not aligned, which is like what the lwl family does. In some situations (such as accessing the boundary of pages) on some microarchitectures, the unaligned access may not be good enough, and the kernel should then trap and emulate it: the kernel may need the -mno-unaligned-access option. gcc/ * config/mips/mips.cc (mips_expand_block_move): don't expand for r6 with -mno-unaligned-access option if one or both of src and dest are unaligned. Restructure: return directly if length is not const. (mips_block_move_straight): emit_move if ISA_HAS_UNALIGNED_ACCESS. gcc/testsuite/ * gcc.target/mips/expand-block-move-r6-no-unaligned.c: new test. * gcc.target/mips/expand-block-move-r6.c: new test.
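
From the source-code side, the portable way to express an access that may be unaligned is a fixed-size memcpy, which a compiler is free to expand inline. The sketch below shows that idiom with a hypothetical helper name; which instructions are actually emitted for it depends on the target and options and is not something this example demonstrates.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Load a 32-bit value from an address with no alignment guarantee.
// The fixed-size memcpy avoids the undefined behavior of dereferencing
// a misaligned std::uint32_t pointer.
std::uint32_t
load_u32_unaligned (const unsigned char *p)
{
  std::uint32_t v;
  std::memcpy (&v, p, sizeof v);
  return v;
}
```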
2023-07-05adjust testcase for now happening epilogue vectorizationRichard Biener1-4/+0
gcc.dg/vect/slp-perm-9.c is reported to FAIL with -march=cascadelake now which is because we now vectorize the epilogue with V2HImode vectors after the recent change to not scrap too large vector epilogues during transform but during analysis time. The following adjusts the testcase to always use the existing alternate N which avoids epilogue vectorization. * gcc.dg/vect/slp-perm-9.c: Always use alternate N.
2023-07-05x86: suppress avx512f-copysign.c testcase for 32-bitJan Beulich1-1/+1
The test installed by "x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F" won't succeed on 32-bit, for floating point operations being done there (by default) without using SIMD insns. gcc/testsuite/ * gcc.target/i386/avx512f-copysign.c: Suppress for 32-bit.
2023-07-05x86: yet more PR target/100711-like splittingJan Beulich2-2/+33
Following two-operand bitwise operations, add another splitter to also deal with not followed by broadcast all on its own, which can be expressed as simple embedded broadcast instead once a broadcast operand is actually permitted in the respective insn. While there also permit a broadcast operand in the corresponding expander. gcc/ PR target/100711 * config/i386/sse.md: New splitters to simplify not;vec_duplicate as a singular vpternlog. (one_cmpl<mode>2): Allow broadcast for operand 1. (<mask_codefor>one_cmpl<mode>2<mask_name>): Likewise. gcc/testsuite/ PR target/100711 * gcc.target/i386/pr100711-6.c: New test.
2023-07-05x86: further PR target/100711-like splittingJan Beulich3-0/+112
With respective two-operand bitwise operations now expressible by a single VPTERNLOG, add splitters to also deal with ior and xor counterparts of the original and-only case. Note that the splitters need to be separate, as the placement of "not" differs in the final insns (*iornot<mode>3, *xnor<mode>3) which are intended to pick up one half of the result. gcc/ PR target/100711 * config/i386/sse.md: New splitters to simplify not;vec_duplicate;{ior,xor} as vec_duplicate;{iornot,xnor}. gcc/testsuite/ PR target/100711 * gcc.target/i386/pr100711-4.c: New test. * gcc.target/i386/pr100711-5.c: New test.
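For readers unfamiliar with how a single VPTERNLOG can stand in for these operations, the following is a minimal scalar model of the instruction's 8-bit truth table, not the sse.md patterns themselves. The immediate for any three-input bitwise function can be derived by evaluating that function on the constants 0xF0, 0xCC and 0xAA. The helper names (`ternlog64`, `imm_iornot`, `imm_xnor`) are hypothetical, and the choice of the second and third inputs as the two live operands is an assumption made for illustration only.

```c
#include <stdint.h>

/* Truth-table columns for the three inputs of a ternary logic op. */
enum { TL_A = 0xF0, TL_B = 0xCC, TL_C = 0xAA };

/* Scalar model: for each bit position, the three input bits form an
   index (a:b:c) that selects one bit of imm8. */
uint64_t ternlog64(uint8_t imm8, uint64_t a, uint64_t b, uint64_t c)
{
    uint64_t r = 0;
    for (int i = 0; i < 64; i++) {
        unsigned idx = (unsigned)((((a >> i) & 1) << 2)
                                  | (((b >> i) & 1) << 1)
                                  | ((c >> i) & 1));
        r |= (uint64_t)((imm8 >> idx) & 1) << i;
    }
    return r;
}

/* Immediates obtained by applying the desired function to the columns. */
uint8_t imm_iornot(void) { return (uint8_t)(~TL_B | TL_C); }   /* ~b | c   */
uint8_t imm_xnor(void)   { return (uint8_t)(~(TL_B ^ TL_C)); } /* ~(b ^ c) */
```

Since neither function depends on the first input, that input can be any value, which is what makes a two-operand operation expressible this way.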
2023-07-05x86: allow memory operand for AVX2 splitter for PR target/100711Jan Beulich1-1/+1
The intended broadcast (with AVX512) can very well be done right from memory. gcc/ PR target/100711 * config/i386/sse.md: Permit non-immediate operand 1 in AVX2 form of splitter for PR target/100711.
2023-07-05middle-end/110541 - VEC_PERM_EXPR documentation is offRichard Biener1-6/+11
The following adjusts the tree.def documentation about VEC_PERM_EXPR which wasn't adjusted when the restrictions of permutes with constant mask were relaxed. PR middle-end/110541 * tree.def (VEC_PERM_EXPR): Adjust documentation to reflect reality.
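To make the selection rule concrete, here is a scalar sketch of a two-input permute as commonly described: each mask element, reduced modulo the combined length of the two inputs, indexes into their concatenation. This is an illustrative model only; the authoritative constraints on operand types and lengths are those in tree.def. The function name `vec_perm_model` and the separate `mask_len` parameter are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model: out[i] is taken from the concatenation
   of v0 and v1 (each n elements long), at index mask[i] mod 2n. */
void vec_perm_model(int *out, const int *v0, const int *v1, size_t n,
                    const unsigned *mask, size_t mask_len)
{
    assert(n > 0);
    for (size_t i = 0; i < mask_len; i++) {
        size_t m = mask[i] % (2 * n);
        out[i] = m < n ? v0[m] : v1[m - n];
    }
}
```

For example, with four-element inputs a mask of {0, 5, 2, 7} takes elements 0 and 2 from the first input and elements 1 and 3 from the second.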
2023-07-05x86: use VPTERNLOG also for certain andnot formsJan Beulich4-11/+47
When it's the memory operand which is to be inverted, using VPANDN* requires a further load instruction. The same can be achieved by a single VPTERNLOG*. Add two new alternatives (for plain memory and embedded broadcast), adjusting the predicate for the first operand accordingly. Two pre-existing testcases actually end up being affected (improved) by the change, which is reflected in updated expectations there. gcc/ PR target/93768 * config/i386/sse.md (*andnot<mode>3): Add new alternatives for memory form operand 1. gcc/testsuite/ PR target/93768 * gcc.target/i386/avx512f-andn-di-zmm-2.c: New test. * gcc.target/i386/avx512f-andn-si-zmm-2.c: Adjust expectations towards generated code. * gcc.target/i386/pr100711-3.c: Adjust expectations for 32-bit code.