path: root/gcc
Age | Commit message | Author | Files | Lines
2023-06-28 | RISC-V: Support floating-point vfwadd/vfwsub vv/wv combine lowering | Juzhe-Zhong | 16 | -26/+187
Currently, vfwadd.wv is the pattern with (set (reg) (float_extend:(reg)) which makes the combine pass fail to combine. Change the RTL format of vfwadd.wv to (set (float_extend:(reg) (reg)) so that the combine pass can combine. gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc: Adapt expand. * config/riscv/vector.md (@pred_single_widen_<plus_minus:optab><mode>): Remove. (@pred_single_widen_add<mode>): New pattern. (@pred_single_widen_sub<mode>): New pattern. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/widen/widen-1.c: Add floating-point. * gcc.target/riscv/rvv/autovec/widen/widen-2.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen-5.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen-6.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen-complicate-1.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen-complicate-2.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen_run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen_run-2.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen_run-5.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen_run-6.c: Ditto. * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-1.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-2.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-5.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-6.c: New test.
2023-06-28 | i386: Fix mvc17.c test for default target clone under --with-arch | Hongyu Wang | 1 | -1/+1
For target clones, the default clone follows the default march so adjust the testcase to avoid test failure on --with-arch=native build. gcc/testsuite/ChangeLog: * gcc.target/i386/mvc17.c: Add -march=x86-64 to dg-options.
2023-06-28 | Issue a warning for conversion between short and __bf16 under TARGET_AVX512BF16. | liuhongt | 2 | -0/+49
__bfloat16 is redefined from typedef short to real __bf16 since GCC V13. The patch issues a warning for potential silent implicit conversion between __bf16 and short where users may only expect a data movement. To avoid too many false positives, the warning is only issued under TARGET_AVX512BF16. gcc/ChangeLog: * config/i386/i386.cc (ix86_invalid_conversion): New function. (TARGET_INVALID_CONVERSION): Define as ix86_invalid_conversion. gcc/testsuite/ChangeLog: * gcc.target/i386/bf16_short_warn.c: New test.
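The difference between a data movement and a value conversion can be sketched in portable C. This is only an emulation under stated assumptions: it does not use `__bf16` at all, it assumes `float` is IEEE binary32 and `short` is 16 bits wide, it models a bfloat16 as the upper 16 bits of a binary32 value (truncating, no rounding), and all function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: bfloat16 bit pattern of F, modelled as the upper
   16 bits of its binary32 encoding (assumption: truncation, no rounding).  */
uint16_t
bf16_bits_from_float (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return (uint16_t) (u >> 16);
}

/* Data movement: reinterpret the 16 bits as a short, which is what code
   written against the old "typedef short" definition would expect.  */
short
bf16_bits_as_short (float f)
{
  uint16_t bits = bf16_bits_from_float (f);
  short s;
  memcpy (&s, &bits, sizeof s);
  return s;
}

/* Value conversion: convert the number itself to an integer, which is
   what an implicit floating-point to short conversion computes.  */
short
bf16_value_as_short (float f)
{
  return (short) f;
}
```

For the value 1.0 the two functions disagree (bit pattern 0x3F80 versus the integer 1), which is the kind of silent change in meaning the warning is meant to flag.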
2023-06-28 | Daily bump. | GCC Administrator | 4 | -1/+303
2023-06-27 | RISC-V: Add autovect widening/narrowing Integer/FP conversions. | Robin Dapp | 22 | -0/+737
This patch implements widening and narrowing float-to-int and int-to-float conversions and adds tests. gcc/ChangeLog: * config/riscv/autovec.md (<optab><vnconvert><mode>2): New expander. (<float_cvt><vnconvert><mode>2): Ditto. (<optab><mode><vnconvert>2): Ditto. (<float_cvt><mode><vnconvert>2): Ditto. * config/riscv/vector-iterators.md: Add vnconvert. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-rv32gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-rv64gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-template.h: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-zvfh-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-rv32gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-rv64gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-template.h: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-zvfh-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-ftoi-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-ftoi-rv32gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-ftoi-rv64gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-ftoi-template.h: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-ftoi-zvfh-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-itof-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-itof-rv32gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-itof-rv64gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-itof-template.h: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-itof-zvfh-run.c: New test.
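As an illustration of the loop shapes such expanders are aimed at, here is a minimal sketch in C. The function names and element types are assumptions chosen for this example and are not taken from the test templates; whether a given compiler actually vectorizes these loops depends on the target and options.

```c
#include <stddef.h>
#include <stdint.h>

/* Widening float-to-int: 32-bit float elements become 64-bit integers.
   The C conversion truncates toward zero.  */
void
widen_ftoi (int64_t *restrict dst, const float *restrict src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (int64_t) src[i];
}

/* Narrowing int-to-float: 64-bit integer elements become 32-bit floats.  */
void
narrow_itof (float *restrict dst, const int64_t *restrict src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (float) src[i];
}
```

In both loops the source and destination element widths differ by a factor of two, which is what separates these conversions from the same-width case.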
2023-06-27 | RISC-V: Add autovec FP widening/narrowing. | Robin Dapp | 12 | -2/+292
This patch adds FP widening and narrowing expanders as well as tests. Conceptually similar to integer extension/truncation, we emulate _Float16 -> double by two vfwcvts and double -> _Float16 by two vfncvts. gcc/ChangeLog: * config/riscv/autovec.md (extend<v_double_trunc><mode>2): New expander. (extend<v_quad_trunc><mode>2): Ditto. (trunc<mode><v_double_trunc>2): Ditto. (trunc<mode><v_quad_trunc>2): Ditto. * config/riscv/vector-iterators.md: Add VQEXTF and HF to V_QUAD_TRUNC and v_quad_trunc. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/conversions/vfncvt-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-rv32gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-rv64gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-template.h: New test. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-zvfh-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-rv32gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-rv64gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-template.h: New test. * gcc.target/riscv/rvv/autovec/conversions/vfwcvt-zvfh-run.c: New test.
2023-06-27 | RISC-V: Add autovec FP int->float conversion. | Robin Dapp | 15 | -15/+378
This patch adds the autovec expander for vfcvt.f.x.v and tests for it. gcc/ChangeLog: * config/riscv/autovec.md (<float_cvt><vconvert><mode>2): New expander. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/conversions/vfcvt_rtz-run.c: Adjust. * gcc.target/riscv/rvv/autovec/conversions/vfcvt_rtz-rv32gcv.c: Ditto. * gcc.target/riscv/rvv/autovec/conversions/vfcvt_rtz-rv64gcv.c: Ditto. * gcc.target/riscv/rvv/autovec/conversions/vfcvt_rtz-template.h: Ditto. * gcc.target/riscv/rvv/autovec/conversions/vncvt-template.h: Ditto. * gcc.target/riscv/rvv/autovec/conversions/vsext-template.h: Ditto. * gcc.target/riscv/rvv/autovec/conversions/vzext-template.h: Ditto. * gcc.target/riscv/rvv/autovec/zvfhmin-1.c: Add int/float conversions. * gcc.target/riscv/rvv/autovec/conversions/vfcvt-itof-run.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfcvt-itof-rv32gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfcvt-itof-rv64gcv.c: New test. * gcc.target/riscv/rvv/autovec/conversions/vfcvt-itof-template.h: New test. * gcc.target/riscv/rvv/autovec/conversions/vfcvt-itof-zvfh-run.c: New test.
2023-06-27 | RISC-V: Implement autovec copysign. | Robin Dapp | 9 | -9/+377
This adds vector copysign, ncopysign and xorsign as well as the accompanying tests. gcc/ChangeLog: * config/riscv/autovec.md (copysign<mode>3): Add expander. (xorsign<mode>3): Ditto. * config/riscv/riscv-vector-builtins-bases.cc (class vfsgnjn): New class. * config/riscv/vector-iterators.md (copysign): Remove ncopysign. (xorsign): Ditto. (n): Ditto. (x): Ditto. * config/riscv/vector.md (@pred_ncopysign<mode>): Split off. (@pred_ncopysign<mode>_scalar): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/copysign-run.c: New test. * gcc.target/riscv/rvv/autovec/binop/copysign-rv64gcv.c: New test. * gcc.target/riscv/rvv/autovec/binop/copysign-rv32gcv.c: New test. * gcc.target/riscv/rvv/autovec/binop/copysign-template.h: New test. * gcc.target/riscv/rvv/autovec/binop/copysign-zvfh-run.c: New test.
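The three sign operations can be sketched on scalars with plain bit manipulation. This is an illustrative emulation, not the expander code: it assumes `float` is IEEE binary32, and the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define SIGN_MASK 0x80000000u

/* Hypothetical helpers to move between a float and its bit pattern.  */
uint32_t
sketch_float_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

float
sketch_bits_float (uint32_t u)
{
  float f;
  memcpy (&f, &u, sizeof f);
  return f;
}

/* copysign: magnitude of X, sign of Y.  */
float
sketch_copysign (float x, float y)
{
  uint32_t xb = sketch_float_bits (x), yb = sketch_float_bits (y);
  return sketch_bits_float ((xb & ~SIGN_MASK) | (yb & SIGN_MASK));
}

/* ncopysign: magnitude of X, inverted sign of Y.  */
float
sketch_ncopysign (float x, float y)
{
  uint32_t xb = sketch_float_bits (x), yb = sketch_float_bits (y);
  return sketch_bits_float ((xb & ~SIGN_MASK) | (~yb & SIGN_MASK));
}

/* xorsign: X with its sign flipped when Y is negative.  */
float
sketch_xorsign (float x, float y)
{
  uint32_t xb = sketch_float_bits (x), yb = sketch_float_bits (y);
  return sketch_bits_float (xb ^ (yb & SIGN_MASK));
}
```

Each operation only touches the sign bit of the first operand, which is why all three map naturally onto sign-injection style instructions.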
2023-06-27 | RISC-V: Split VF iterators for Zvfh(min). | Robin Dapp | 3 | -111/+128
When working on FP widening/narrowing I realized the Zvfhmin handling is not ideal right now: We use the "enabled" insn attribute to disable instructions not available with Zvfhmin but only with Zvfh. However, "enabled == 0" only disables insn alternatives, in our case all of them when the mode is a HFmode. The insn itself remains available (e.g. for combine to match) and we end up with an insn without alternatives that reload cannot handle --> ICE. The proper solution is to disable the instruction for the respective mode altogether. This patch achieves this by splitting the VF as well as VWEXTF iterators into variants with TARGET_ZVFH and TARGET_VECTOR_ELEN_FP_16 (which is true when either TARGET_ZVFH or TARGET_ZVFHMIN are true). Also, VWCONVERTI, VHF and VHF_LMUL1 need adjustments. gcc/ChangeLog: * config/riscv/autovec.md: VF_AUTO -> VF. * config/riscv/vector-iterators.md: Introduce VF_ZVFHMIN, VWEXTF_ZVFHMIN and use TARGET_ZVFH in VWCONVERTI, VHF and VHF_LMUL1. * config/riscv/vector.md: Use new iterators.
2023-06-27 | match.pd: Use element_mode instead of TYPE_MODE. | Robin Dapp | 1 | -2/+4
This patch changes TYPE_MODE into element_mode in a match.pd simplification. As the simplification can also be called with vector types, real_can_shorten_arithmetic would ICE in REAL_MODE_FORMAT, which expects a scalar mode. Therefore, use element_mode instead of TYPE_MODE. Additionally, check if the target supports the resulting operation. One target that supports e.g. a float addition but not a _Float16 addition is the RISC-V vector extension Zvfhmin. gcc/ChangeLog: * match.pd: Use element_mode and check if target supports operation with new type.
2023-06-28 | [SVE] Fold svdupq to VEC_PERM_EXPR if elements are not constant. | Prathamesh Kulkarni | 2 | -1/+78
gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins-base.cc (svdupq_impl::fold_nonconst_dupq): New method. (svdupq_impl::fold): Call fold_nonconst_dupq. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/acle/general/dupq_11.c: New test.
2023-06-27 | Mark asm goto with outputs as volatile | Andrew Pinski | 2 | -1/+32
The manual references asm goto as being implicitly volatile already and that was done when asm goto could not have outputs. When outputs were added to `asm goto`, only asm goto without outputs were still being marked as volatile. Now some parts of GCC decide, removing the `asm goto` is ok if the output is not used, though not updating the CFG (this happens on both the RTL level and the gimple level). Since the biggest user of `asm goto` is the Linux kernel and they expect them to be volatile (they use them to copy to/from userspace), we should just mark the inline-asm as volatile. OK? Bootstrapped and tested on x86_64-linux-gnu. PR middle-end/110420 PR middle-end/103979 PR middle-end/98619 gcc/ChangeLog: * gimplify.cc (gimplify_asm_expr): Mark asm with labels as volatile. gcc/testsuite/ChangeLog: * gcc.c-torture/compile/asmgoto-6.c: New test.
2023-06-27 | ada: Fix build of GNAT tools | Eric Botcazou | 1 | -3/+6
gcc/ada/ * gcc-interface/Makefile.in (LIBIBERTY): Fix condition. (TOOLS_LIBS): Add @LD_PICFLAG@.
2023-06-27 | ada: Fix bad interaction between inlining and thunk generation | Eric Botcazou | 1 | -3/+6
This may cause the type of the RESULT_DECL of a function which returns by invisible reference to be turned into a reference type twice. gcc/ada/ * gcc-interface/trans.cc (Subprogram_Body_to_gnu): Add guard to the code turning the type of the RESULT_DECL into a reference type. (maybe_make_gnu_thunk): Use a more precise guard in the same case.
2023-06-27 | ada: Make the identification of case expressions more robust | Eric Botcazou | 1 | -5/+3
gcc/ada/ * gcc-interface/trans.cc (Case_Statement_to_gnu): Rename boolean constant and use From_Conditional_Expression flag for its value.
2023-06-27 | ada: Fix double finalization of case expression in concatenation | Eric Botcazou | 4 | -72/+23
This streamlines the expansion of case expressions by not wrapping them in an Expression_With_Actions node when the type is not by copy, which avoids the creation of a temporary and the associated finalization issues. That's the same strategy as the one used for the expansion of if expressions when the type is by reference, unless Back_End_Handles_Limited_Types is set to True. Given that it is never set to True, except by a debug switch, and has never been implemented, this parameter is removed in the process. gcc/ada/ * debug.adb (d.L): Remove documentation. * exp_ch4.adb (Expand_N_Case_Expression): In the not-by-copy case, do not wrap the case statement in an Expression_With_Actions node. (Expand_N_If_Expression): Do not test Back_End_Handles_Limited_Types * gnat1drv.adb (Adjust_Global_Switches): Do not set it. * opt.ads (Back_End_Handles_Limited_Types): Delete.
2023-06-27 | ada: Fix incorrect handling of iterator specifications in recent change | Eric Botcazou | 1 | -7/+11
Unlike for loop parameter specifications where it references an index, the defining identifier references an element in them. gcc/ada/ * sem_ch12.adb (Check_Generic_Actuals): Check the component type of constants and variables of an array type. (Copy_Generic_Node): Fix bogus handling of iterator specifications.
2023-06-27 | ada: Correct the contract of Ada.Text_IO.Get_Line | Claire Dross | 1 | -9/+13
Item might not be entirely initialized after a call to Get_Line. gcc/ada/ * libgnat/a-textio.ads (Get_Line): Use Relaxed_Initialization on the Item parameter of Get_Line.
2023-06-27 | ada: Fix too late finalization and secondary stack release in iterator loops | Eric Botcazou | 2 | -31/+14
Sem_Ch5 contains an entire machinery to deal with finalization actions and secondary stack releases around iterator loops, so this removes a recent fix that was made in a narrower case and instead refines the condition under which this machinery is triggered. As a side effect, given that finalization and secondary stack management are still entangled in this machinery, this also fixes the counterpart of a leak for the former, which is a finalization occurring too late. gcc/ada/ * exp_ch4.adb (Expand_N_Quantified_Expression): Revert the latest change as it is subsumed by the machinery in Sem_Ch5. * sem_ch5.adb (Prepare_Iterator_Loop): Also wrap the loop statement in a block if the name contains a function call that returns on the secondary stack.
2023-06-27 | ada: Plug small loophole in the handling of private views in instances | Eric Botcazou | 1 | -7/+39
This deals with nested instantiations in package bodies. gcc/ada/ * sem_ch12.adb (Scope_Within_Body_Or_Same): New predicate. (Check_Actual_Type): Take into account packages nested in bodies to compute the enclosing scope by means of Scope_Within_Body_Or_Same.
2023-06-27 | ada: Plug another loophole in the handling of private views in instances | Eric Botcazou | 1 | -0/+17
This deals with discriminants of types declared in package bodies. gcc/ada/ * sem_ch12.adb (Check_Private_View): Also check the type of visible discriminants in record and concurrent types.
2023-06-27 | ada: Update printing container aggregates for debugging | Viljar Indus | 1 | -2/+4
All N_Aggregate nodes were printed with parentheses "()". However the new container aggregates (homogeneous N_Aggregate nodes) should be printed with brackets "[]". gcc/ada/ * sprint.adb (Print_Node_Actual): Print homogeneous N_Aggregate nodes with brackets.
2023-06-27 | ada: Fix expanding container aggregates | Viljar Indus | 1 | -0/+1
Ensure that container aggregate expressions are expanded as such and not as records even if the type of the expression is a record. gcc/ada/ * exp_aggr.adb (Expand_N_Aggregate): Ensure that container aggregate expressions do not get expanded as records but instead as container aggregates.
2023-06-27 | Convert remaining uses of value_range in ipa-*.cc to Value_Range. | Aldy Hernandez | 3 | -15/+19
Minor cleanups to get rid of value_range in IPA. There's only one left, but it's in the switch code which is integer specific. gcc/ChangeLog: * ipa-cp.cc (decide_whether_version_node): Adjust comment. * ipa-fnsummary.cc (evaluate_conditions_for_known_args): Adjust for Value_Range. (set_switch_stmt_execution_predicate): Same. * ipa-prop.cc (ipa_compute_jump_functions_for_edge): Same.
2023-06-27 | Implement ipa_vr hashing. | Aldy Hernandez | 2 | -46/+45
Implement hashing for ipa_vr. When all is said and done, all these patches incur a 7.64% slowdown for ipa-cp, which is entirely covered by the similar 7% increase in this area last week. So we get type agnostic ranges with "infinite" range precision close to free. There is no change in overall compilation. gcc/ChangeLog: * ipa-prop.cc (struct ipa_vr_ggc_hash_traits): Adjust for use with ipa_vr instead of value_range. (gt_pch_nx): Same. (gt_ggc_mx): Same. (ipa_get_value_range): Same. * value-range.cc (gt_pch_nx): Move to ipa-prop.cc and adjust for ipa_vr. (gt_ggc_mx): Same.
2023-06-27 | Convert ipa_jump_func to use ipa_vr instead of a value_range. | Aldy Hernandez | 3 | -46/+44
This patch converts the ipa_jump_func code to use the type agnostic ipa_vr suitable for GC instead of value_range which is integer specific. I've disabled the range caching to simplify the patch for review, but it is handled in the next patch in the series. gcc/ChangeLog: * ipa-cp.cc (ipa_vr_operation_and_type_effects): New. * ipa-prop.cc (ipa_get_value_range): Adjust for ipa_vr. (ipa_set_jfunc_vr): Take a range. (ipa_compute_jump_functions_for_edge): Pass range to ipa_set_jfunc_vr. (ipa_write_jump_function): Call streamer write helper. (ipa_read_jump_function): Call streamer read helper. * ipa-prop.h (class ipa_vr): Change m_vr to an ipa_vr.
2023-06-27 | gengtype: Handle braced initialisers in structs | Richard Sandiford | 1 | -0/+6
I have a patch that adds braced initialisers to a GTY structure. gengtype didn't accept that, because it parsed the "{ ... }" in " = { ... };" as the end of a statement (as "{ ... }" would be in a function definition) and so it didn't expect the following ";". This patch explicitly handles initialiser-like sequences. Arguably, the parser should also skip redundant ";", but that feels more like a workaround rather than the real fix. gcc/ * gengtype-parse.cc (consume_until_comma_or_eos): Parse "= { ... }" as a probable initializer rather than a probable complete statement.
2023-06-27 | tree-optimization/96208 - SLP of non-grouped loads | Richard Biener | 4 | -70/+127
The following extends SLP discovery to handle non-grouped loads in loop vectorization in the case the same load appears in all lanes. Code generation is adjusted to mimic what we do for the case of single element interleaving (when the load is not unit-stride) which is already handled by SLP. There are some limits we run into because peeling for gap cannot cover all cases and we choose VMAT_CONTIGUOUS. The patch does not try to address these issues yet. The main obstacle is that these loads are not STMT_VINFO_GROUPED_ACCESS and that's a new thing with SLP. I know from the past that it's not a good idea to make them grouped. Instead the following massages places to deal with SLP loads that are not STMT_VINFO_GROUPED_ACCESS. There's already a testcase testing for the case the PR is after, just XFAILed, the following adjusts that instead of adding another. I do expect to have missed some so I don't plan to push this on a Friday. Still there may be feedback, so posting this now. Bootstrapped and tested on x86_64-unknown-linux-gnu. PR tree-optimization/96208 * tree-vect-slp.cc (vect_build_slp_tree_1): Allow a non-grouped load if it is the same for all lanes. (vect_build_slp_tree_2): Handle not grouped loads. (vect_optimize_slp_pass::remove_redundant_permutations): Likewise. (vect_transform_slp_perm_load_1): Likewise. * tree-vect-stmts.cc (vect_model_load_cost): Likewise. (get_group_load_store_type): Likewise. Handle invariant accesses. (vectorizable_load): Likewise. * gcc.dg/vect/slp-46.c: Adjust for new vectorizations. * gcc.dg/vect/bb-slp-pr65935.c: Adjust.
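A loop of roughly the following shape shows what "the same load appears in all lanes" means. It is a hypothetical illustration written for this note, not a copy of the adjusted testcase, and the function name is an assumption.

```c
#include <stddef.h>

/* Each iteration performs two adjacent stores, out[2*i] and out[2*i + 1],
   which form a store group, but only a single load in[i] that feeds both
   lanes.  That load is not part of any load group.  */
void
dup_lanes (double *restrict out, const double *restrict in, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      out[2 * i] = in[i];
      out[2 * i + 1] = in[i];
    }
}
```

The two stores give SLP discovery a group to start from, while the shared load is the non-grouped access that discovery previously rejected.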
2023-06-27 | Refine maskstore patterns with UNSPEC_MASKMOV. | liuhongt | 1 | -12/+57
Similar to r14-2070-gc79476da46728e: if mem_addr points to a memory region with less than whole vector size bytes of accessible memory and k is a mask that would prevent reading the inaccessible bytes from mem_addr, add UNSPEC_MASKMOV to prevent it from being transformed into any other whole memory access instruction. gcc/ChangeLog: PR rtl-optimization/110237 * config/i386/sse.md (<avx512>_store<mode>_mask): Refine with UNSPEC_MASKMOV. (maskstore<mode><avx512fmaskmodelower>): Ditto. (*<avx512>_store<mode>_mask): New define_insn, it's renamed from original <avx512>_store<mode>_mask.
2023-06-27 | Make option mvzeroupper independent of optimization level. | liuhongt | 3 | -3/+18
pass_insert_vzeroupper is under condition TARGET_AVX && TARGET_VZEROUPPER && flag_expensive_optimizations && !optimize_size But the documentation of mvzeroupper doesn't mention that the insertion requires -O2 and above, which may confuse users when they explicitly use -Os -mvzeroupper. ------------ mvzeroupper Target Mask(VZEROUPPER) Save Generate vzeroupper instruction before a transfer of control flow out of the function. ------------ The patch moves flag_expensive_optimizations && !optimize_size to ix86_option_override_internal. It makes -mvzeroupper independent of optimization level, but still keeps the behavior of architecture tuning(emit_vzeroupper) unchanged. gcc/ChangeLog: * config/i386/i386-features.cc (pass_insert_vzeroupper:gate): Move flag_expensive_optimizations && !optimize_size to .. * config/i386/i386-options.cc (ix86_option_override_internal): .. this, it makes -mvzeroupper independent of optimization level, but still keeps the behavior of architecture tuning(emit_vzeroupper) unchanged. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-vzeroupper-29.c: New testcase.
2023-06-27 | Don't issue vzeroupper for vzeroupper call_insn. | liuhongt | 2 | -2/+18
gcc/ChangeLog: PR target/82735 * config/i386/i386.cc (ix86_avx_u127_mode_needed): Don't emit vzeroupper for vzeroupper call_insn. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-vzeroupper-30.c: New test.
2023-06-27 | Fix __builtin_alloca_with_align_and_max defbuiltin usage | Andrew Pinski | 1 | -1/+1
There is a missing space between the return type and the name which causes the name not to be outputted in the html docs. Committed as obvious after building html docs. gcc/ChangeLog: * doc/extend.texi (__builtin_alloca_with_align_and_max): Fix defbuiltin usage.
2023-06-27 | Daily bump. | GCC Administrator | 6 | -1/+219
2023-06-27 | RISC-V: Support const vector expansion with step vector with base != 0 | Juzhe-Zhong | 7 | -2/+320
Currently, we are able to generate a step vector with base == 0: { 0, 0, 2, 2, 4, 4, ... } ASM: vid vand However, we generate wrong code for a step vector with base != 0: { 1, 1, 3, 3, 5, 5, ... } Before this patch, such a case fails at run time. After this patch, we are able to pass the testcase and generate the step vector with asm: vid vand vadd gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_const_vector): Fix stepped vector with base != 0. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/partial/slp-17.c: New test. * gcc.target/riscv/rvv/autovec/partial/slp-18.c: New test. * gcc.target/riscv/rvv/autovec/partial/slp-19.c: New test. * gcc.target/riscv/rvv/autovec/partial/slp_run-17.c: New test. * gcc.target/riscv/rvv/autovec/partial/slp_run-18.c: New test. * gcc.target/riscv/rvv/autovec/partial/slp_run-19.c: New test.
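The vid/vand/vadd sequence for these two example vectors can be emulated per element in scalar C. This is only a sketch: the function name is hypothetical, and the mask value `~1` is an assumption that fits this particular pattern of repeated pairs, whereas the real expansion handles stepped vectors more generally.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill V with { base, base, base + 2, base + 2, base + 4, ... }.  */
void
stepped_pairs (int32_t *v, size_t n, int32_t base)
{
  for (size_t i = 0; i < n; i++)
    {
      int32_t id = (int32_t) i;   /* vid: the element index */
      int32_t even = id & ~1;     /* vand: clear bit 0, giving 0, 0, 2, 2, ... */
      v[i] = even + base;         /* vadd: add the non-zero base */
    }
}
```

With `base == 0` the final addition is a no-op, which matches the two-instruction sequence that already worked; the fix concerns the extra addition needed when the base is non-zero.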
2023-06-26 | docs: Add @cindex for some attributes | Andrew Pinski | 1 | -0/+3
While looking for the access attribute, I tried to find it via the concept index but it was missing. This patch fixes that and adds one for interrupt/interrupt_handler too. Committed as obvious after building the HTML docs and looking at the resulting concept index page. gcc/ChangeLog: * doc/extend.texi (access attribute): Add cindex for it. (interrupt/interrupt_handler attribute): Likewise.
2023-06-26 | compiler: support -fgo-importcfg | Ian Lance Taylor | 10 | -7/+182
* lang.opt (fgo-importcfg): New option. * go-c.h (struct go_create_gogo_args): Add importcfg field. * go-lang.cc (go_importcfg): New static variable. (go_langhook_init): Set args.importcfg. (go_langhook_handle_option): Handle -fgo-importcfg. * gccgo.texi (Invoking gccgo): Document -fgo-importcfg. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/506095
2023-06-26 | aarch64: Use <DWI> instead of <V2XWIDE> in scalar SQRSHRUN pattern | Kyrylo Tkachov | 1 | -10/+10
In the scalar pattern for SQRSHRUN it's a bit clearer to use DWI instead of V2XWIDE to make it more clear that no vector modes are involved. No behavioural change intended. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_sqrshrun_n<mode>_insn): Use <DWI> instead of <V2XWIDE>. (aarch64_sqrshrun_n<mode>): Likewise.
2023-06-26 | aarch64: Clean up some rounding immediate predicates | Kyrylo Tkachov | 4 | -24/+20
aarch64_simd_rsra_rnd_imm_vec is now used for more than just RSRA and accepts more than just vectors so rename it to make it more truthful. The aarch64_simd_rshrn_imm_vec is now unused and can be deleted. No behavioural change intended. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch64_const_vec_rsra_rnd_imm_p): Rename to... (aarch64_rnd_imm_p): ... This. * config/aarch64/predicates.md (aarch64_simd_rsra_rnd_imm_vec): Rename to... (aarch64_int_rnd_operand): ... This. (aarch64_simd_rshrn_imm_vec): Delete. * config/aarch64/aarch64-simd.md (aarch64_<sra_op>rsra_n<mode>_insn): Adjust for the above. (aarch64_<sra_op>rshr_n<mode><vczle><vczbe>_insn): Likewise. (*aarch64_<shrn_op>rshrn_n<mode>_insn): Likewise. (*aarch64_sqrshrun_n<mode>_insn<vczle><vczbe>): Likewise. (aarch64_sqrshrun_n<mode>_insn): Likewise. (aarch64_<shrn_op>rshrn2_n<mode>_insn_le): Likewise. (aarch64_<shrn_op>rshrn2_n<mode>_insn_be): Likewise. (aarch64_sqrshrun2_n<mode>_insn_le): Likewise. (aarch64_sqrshrun2_n<mode>_insn_be): Likewise. * config/aarch64/aarch64.cc (aarch64_const_vec_rsra_rnd_imm_p): Rename to... (aarch64_rnd_imm_p): ... This.
2023-06-26 | IBM zSystems: Assume symbols without explicit alignment to be ok | Andreas Krebbel | 2 | -2/+36
A change we committed back in 2015 relies on the backend requested ABI alignment being applied to ALL symbols by the middle-end. However, this does not appear to be the case for external symbols. With this commit we assume all symbols without explicit alignment to be aligned according to the ABI. That's the behavior we had before. This fixes a performance regression caused by the 2015 patch. Since then the addresses of external char type symbols have been pushed to the literal pool, although it is safe to access them with larl (which requires symbols to reside at even addresses). gcc/ * config/s390/s390.cc (s390_encode_section_info): Set SYMBOL_FLAG_SET_NOTALIGN2 only if the symbol has explicitly been misaligned. gcc/testsuite/ * gcc.target/s390/larl-1.c: New test.
2023-06-26 | Fix profile of forwarders produced by cd-dce | Jan Hubicka | 1 | -0/+3
Compiling the testcase from PR109849 (which uses a std::vector based stack to drive a loop) with profile feedback leads to profile mismatches introduced by tree-ssa-dce. This is the new code to produce unified forwarder blocks for PHIs. I am not including the testcase itself since checking it for Invalid sum is probably going to be too fragile and this should show in our LNT testers. The patch however fixes the mismatch. Bootstrapped/regtested x86_64-linux and plan to commit it shortly. gcc/ChangeLog: PR tree-optimization/109849 * tree-ssa-dce.cc (make_forwarders_with_degenerate_phis): Fix profile count of newly constructed forwarder block.
2023-06-26 | docs: Fix typo | Andrew Carlotti | 1 | -1/+1
gcc/ChangeLog: * doc/optinfo.texi: Fix "steam" -> "stream".
2023-06-26 | DSE: Add LEN_MASK_STORE analysis into DSE and fix LEN_STORE | Ju-Zhe Zhong | 1 | -16/+31
Hi, Richi. This patch adds LEN_MASK_STORE to DSE. My understanding is that LEN_MASK_STORE is predicated by mask and len. Whether len is constant or not, the ao_ref should be the same as for MASK_STORE. Whereas for LEN_STORE, when len is constant, we use (len - bias); otherwise, it's the same as MASK_STORE/LEN_MASK_STORE. Not sure whether I am on the same page with you, feel free to correct me. Thanks. gcc/ChangeLog: * tree-ssa-dse.cc (initialize_ao_ref_for_dse): Add LEN_MASK_STORE and fix LEN_STORE. (dse_optimize_stmt): Add LEN_MASK_STORE.
2023-06-26 | GIMPLE_FOLD: Fix gimple fold for LEN_{MASK}_{LOAD,STORE} | Ju-Zhe Zhong | 2 | -2/+47
Hi, previously I made a mistake in the GIMPLE_FOLD of LEN_MASK_{LOAD,STORE}. We should fold LEN_MASK_{LOAD,STORE} with (bias+len) == vf (nunits instead of bytesize) && mask == all-true mask into: MEM_REF [...]. This patch adds a testcase to test the gimple fold of LEN_MASK_{LOAD,STORE}. Also, I fix LEN_LOAD/LEN_STORE to make them have the same behavior. Ok for trunk ? gcc/ChangeLog: * gimple-fold.cc (gimple_fold_partial_load_store_mem_ref): Fix gimple fold of LOAD/STORE with length. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/partial/gimple_fold-1.c: New test.
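The fold condition can be illustrated with a scalar emulation of a length-and-mask predicated store. Everything here is an assumption made for the sketch: a fixed vector of `NUNITS` 32-bit elements, a store that writes element i only when `i < len + bias` and `mask[i]` is set, and hypothetical function names. It is not the code in gimple-fold.cc.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUNITS 4  /* assumed number of elements per vector */

/* Emulated predicated store: element I is written only if it lies below
   LEN + BIAS and its mask bit is set.  */
void
len_mask_store (int32_t *dst, const int32_t *src, const bool *mask,
                ptrdiff_t len, ptrdiff_t bias)
{
  for (ptrdiff_t i = 0; i < NUNITS; i++)
    if (i < len + bias && mask[i])
      dst[i] = src[i];
}

/* The store equals a plain full-vector store exactly when the length
   covers all NUNITS elements (counted in elements, not bytes) and every
   mask bit is true.  */
bool
can_fold_to_plain_store (const bool *mask, ptrdiff_t len, ptrdiff_t bias)
{
  if (len + bias != NUNITS)
    return false;
  for (ptrdiff_t i = 0; i < NUNITS; i++)
    if (!mask[i])
      return false;
  return true;
}
```

Comparing `len + bias` against a byte size instead of the element count would make the predicate false for any element wider than one byte, so the fold would never trigger for such vectors.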
2023-06-26 | Avoid redundant GORI calculations. | Andrew MacLeod | 1 | -4/+17
When GORI evaluates a statement, if operand 1 and 2 are both in the dependency chain, GORI evaluates the name through both operands sequentially and combines the results. If either operand is in the dependency chain of the other, this evaluation will do the same work twice, for questionable gain. Instead, simply evaluate only the operand which depends on the other and keep the evaluation linear in time. * gimple-range-gori.cc (compute_operand1_and_operand2_range): Check for interdependence between operands 1 and 2.
2023-06-26 | vect: Cost intermediate conversions | Richard Sandiford | 1 | -2/+3
g:6f19cf7526168f8 extended N-vector to N-vector conversions to handle cases where an intermediate integer extension or truncation is needed. This patch adjusts the cost to account for these intermediate conversions. gcc/ * tree-vect-stmts.cc (vectorizable_conversion): Take multi_step_cvt into account when costing non-widening/truncating conversions.
2023-06-26 | tree-optimization/110381 - preserve SLP permutation with in-order reductions | Richard Biener | 2 | -2/+56
The following fixes a bug that manifests itself during fold-left reduction transform in picking not the last scalar def to replace and thus double-counting some elements. But the underlying issue is that we merge a load permutation into the in-order reduction which is of course wrong. Now, reduction analysis has not yet been performed when optimizing permutations so we have to resort to checking that ourselves. PR tree-optimization/110381 * tree-vect-slp.cc (vect_optimize_slp_pass::start_choosing_layouts): Materialize permutes before fold-left reductions. * gcc.dg/vect/pr110381.c: New testcase.
2023-06-26 | RISC-V: Remove duplicated extern function_base decl | Pan Li | 1 | -5/+0
Signed-off-by: Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.h: Remove duplicated decl.
2023-06-26 | narrowing initializers and initializer_constant_valid_p_1 | Richard Biener | 1 | -0/+2
initializer_constant_valid_p_1 attempts to handle narrowing differences and sums but fails to handle when the overall value looks like VIEW_CONVERT_EXPR<long long int>(NON_LVALUE_EXPR <v> - VEC_COND_EXPR < { 0, 0 } == { 0, 0 } , { -1, -1 } , { 0, 0 } > ) where endtype is scalar integer but value is a vector type. In this particular case all is good and we recurse since two vector lanes is more than 64bits of long long. But still it compares apples and oranges. Fixed by appropriately also requiring the type of the value to be scalar integral. * varasm.cc (initializer_constant_valid_p_1): Also constrain the type of value to be scalar integral before dispatching to narrowing_initializer_constant_valid_p.
2023-06-26 | Avoid shorten_binary_op on VECTOR_TYPE | Richard Biener | 1 | -0/+4
When we disallow TYPE_PRECISION on VECTOR_TYPEs it shows that shorten_binary_op performs some checks on that that are likely harmless in the end. The following bails out early for VECTOR_TYPE operations to avoid those questionable checks. gcc/c-family/ * c-common.cc (shorten_binary_op): Exit early for VECTOR_TYPE operations.
2023-06-26 | Fix TYPE_PRECISION use in hashable_expr_equal_p | Richard Biener | 1 | -1/+1
While the checks look unnecessary they probably are quick and thus done early. The following avoids using TYPE_PRECISION on VECTOR_TYPEs by making the code match the comment which talks about precision and signedness. An alternative would be to only retain the ERROR_MARK and TYPE_MODE checks or use TYPE_PRECISION_RAW (but I like that least). * tree-ssa-scopedtables.cc (hashable_expr_equal_p): Use element_precision.