path: root/gcc
Age  Commit message  Author  Files  Lines
2023-05-26ada: Remove redundant guard against empty listsPiotr Trojanek2-4/+1
There is no need to guard against routine Contains being called on No_Elist, because it will return False. Code cleanup related to handling of primitive operations in GNATprove; semantics is unaffected. gcc/ada/ * sem_prag.adb (Record_Possible_Body_Reference): Remove call to Present. * sem_util.adb (Find_Untagged_Type_Of): Likewise.
2023-05-26ada: Fix double free on finalization of Vector in array aggregateEric Botcazou1-18/+9
The handling of finalization is delicate during the expansion of aggregates since the generated assignments must not cause the finalization of the RHS. That's why the No_Ctrl_Actions flag is set on them and the adjustments are generated manually. This was not done in the case of an array of array with controlled component when its subaggregates are not expanded in place but instead are replaced by temporaries, leading to double free or memory corruption. gcc/ada/ * exp_aggr.adb (Initialize_Array_Component): Remove obsolete code. (Expand_Array_Aggregate): In the case where a temporary is created and the parent is an assignment statement with No_Ctrl_Actions set, set Is_Ignored_Transient on the temporary.
2023-05-26ada: Fix internal error on Big_Integer conversion ghost instanceEric Botcazou1-12/+12
The problem is that the ghost mode of the instance is used to analyze the parent of the generic body, whose own ghost mode has nothing to do with it. gcc/ada/ * sem_ch12.adb (Instantiate_Package_Body): Set the ghost mode to that of the instance only after loading the generic's parent. (Instantiate_Subprogram_Body): Likewise.
2023-05-26ada: Simplify expansion of set membershipPiotr Trojanek1-10/+7
Code cleanup; semantics is unaffected. gcc/ada/ * exp_ch4.adb (Expand_Set_Membership): Simplify by using Evolve_Or_Else.
2023-05-26ada: Cleanup expansion of membership operators into attribute ValidPiotr Trojanek1-22/+4
Code cleanup; semantics is unaffected. gcc/ada/ * exp_ch4.adb (Is_OK_Object_Reference): Replace loop with a call to Unqual_Conv; consequently, change object from variable to constant; replace an IF statement with an AND THEN expression.
2023-05-26ada: Remove leftover code for counting protected entriesPiotr Trojanek1-20/+5
We used to count protected entries by iterating over component declarations, but then switched to iterating over entities and left some code that is no longer needed. Cleanup; semantics is unaffected (except possibly for fixing an assertion failure in developer builds when there is a pragma among entry family declarations). gcc/ada/ * exp_ch9.adb (Build_Entry_Count_Expression): Remove loop over component declaration; consequently remove a parameter that is no longer used; adapt callers. (Make_Task_Create_Call): Refine type of a local variable.
2023-05-26ada: Fix detection of non-static expressions in records with pragmasPiotr Trojanek1-6/+5
When iterating over record components we must ignore pragmas. Minor bug, as pragmas within record components do not appear often. gcc/ada/ * sem_cat.adb (Check_Non_Static_Default_Expr): Detect components inside loop, not in the loop condition itself.
2023-05-26ada: Reorder components in Ada.Containers.Bounded_Doubly_Linked_ListsEric Botcazou1-1/+1
gcc/ada/ * libgnat/a-cbdlli.ads (List): Move Nodes component to the end.
2023-05-26ada: Reorder components in Ada.Containers.Restricted_Doubly_Linked_ListsEric Botcazou1-1/+1
An instantiation of the package compiled with -gnatw.q yields: warning: in instantiation at a-crdlli.ads:317 [-gnatw.q] warning: record layout may cause performance issues [-gnatw.q] warning: in instantiation at a-crdlli.ads:317 [-gnatw.q] warning: component "Nodes" whose length depends on a discriminant [-gnatw.q] warning: in instantiation at a-crdlli.ads:317 [-gnatw.q] warning: comes too early and was moved down [-gnatw.q] gcc/ada/ * libgnat/a-crdlli.ads (List): Move Nodes component to the end.
2023-05-26ada: Reject thin 'Unrestricted_Access value to aliased constrained arrayEric Botcazou1-23/+51
This rejects the Unrestricted_Access attribute applied to an aliased array with a constrained nominal subtype when its type is resolved to be a thin pointer. The reason is that supporting this case would require the aliased array to contain its bounds, and this is the case only for aliased arrays whose nominal subtype is unconstrained. gcc/ada/ * sem_attr.adb (Is_Thin_Pointer_To_Unc_Array): New predicate. (Resolve_Attribute): Apply the static matching legality rule to an Unrestricted_Access attribute applied to an aliased prefix if the type is a thin pointer. Call Is_Thin_Pointer_To_Unc_Array for the aliasing legality rule as well.
2023-05-26ada: Simplify iteration over record component items with possible pragmasPiotr Trojanek1-4/+2
Code cleanup; semantics is unaffected. gcc/ada/ * sem_util.adb (Is_Null_Record_Definition): Use First_Non_Pragma and Next_Non_Pragma to ignore pragmas within component list.
2023-05-26ada: Fix handling of Global contracts inside generic subprogramsPiotr Trojanek1-1/+3
Routine Get_Argument works differently for generic units (as explained in its comment), but it failed to reliably detect such units when their kind is temporarily made non-generic (for resolving recursive calls, as explained in the comment at the end of Is_Generic_Declaration_Or_Body). With this patch the frontend will look at the decorated expression of the Global contract attached to the Global aspect; previously it was looking at the undecorated expression attached to the corresponding pragma. gcc/ada/ * sem_prag.adb (Get_Argument): Improve detection of generic units.
2023-05-26ada: Tune detection of expression functions within a declare expressionPiotr Trojanek2-3/+3
Code cleanup; semantics is unaffected. gcc/ada/ * sem_ch4.adb (Check_Action_OK): Replace low-level test with a high-level routine. * sem_ch13.adb (Is_Predicate_Static): Likewise.
2023-05-26ada: Crash on loop in dispatching conditional entry callJavier Miranda3-125/+30
gcc/ada/ * exp_ch9.adb (Expand_N_Conditional_Entry_Call): Factorize code to avoid duplicating subtrees; required to avoid problems when the copied code has implicit labels. * sem_util.ads (New_Copy_Separate_List): Removed. (New_Copy_Separate_Tree): Removed. * sem_util.adb (New_Copy_Separate_List): Removed. (New_Copy_Separate_Tree): Removed.
2023-05-26ada: Remove redundant protection against empty listsPiotr Trojanek1-122/+116
Calls to Length on No_List intentionally return 0, so explicit guards against No_List are unnecessary. Code cleanup; semantics is unaffected. gcc/ada/ * sem_ch13.adb (Check_Component_List): Local variable Compl is now a constant; a nested block is no longer needed.
2023-05-26ada: Cleanups in handling of aggregatesPiotr Trojanek5-26/+24
Assorted cleanups related to recent fixes of aggregate handling for GNATprove; semantics is unaffected. gcc/ada/ * sem_aggr.adb (Resolve_Record_Aggregate): Remove useless assignment. * sem_aux.adb (Has_Variant_Part): Remove useless guard; this routine is only called on type entities (and now will crash in other cases). * sem_ch3.adb (Create_Constrained_Components): Only assign Assoc_List when necessary; tune whitespace. (Is_Variant_Record): Refactor repeated calls to Parent. * sem_util.adb (Gather_Components): Assert that discriminant association has just one choice in component_association; refactor repeated calls to Next. * sem_util.ads (Gather_Components): Tune whitespace in comment.
2023-05-26ada: Fix iteration over component items with pragmasPiotr Trojanek2-4/+4
Component items in a record declaration might include pragmas, which must be ignored when detecting components with default expressions. More a code cleanup than a bugfix, as it only affects artificial corner cases. Found while fixing missing legality checks for variant component declarations. gcc/ada/ * sem_ch3.adb (Check_CPP_Type_Has_No_Defaults): Iterate with First_Non_Pragma and Next_Non_Pragma. * exp_dist.adb (Append_Record_Traversal): Likewise.
2023-05-26ada: Duplicate declaration of _master entityJavier Miranda3-23/+40
gcc/ada/ * exp_ch9.adb (Build_Class_Wide_Master): Remember internal blocks that have a task master entity declaration. (Build_Master_Entity): Code cleanup. * sem_util.ads (Is_Internal_Block): New subprogram. * sem_util.adb (Is_Internal_Block): New subprogram.
2023-05-26ada: Remove redundant guards from handling of record componentsPiotr Trojanek1-6/+1
A call to First on an empty list intentionally returns Empty. gcc/ada/ * sem_util.adb (Gather_Components): Remove guard for empty list of components.
2023-05-26ada: Remove Is_Descendant_Of_Address flag from Standard_AddressEric Botcazou4-13/+20
It breaks the Allow_Integer_Address special mode. Add new standard_address parameters to gigi and alphabetize others; this is necessary when addresses are not treated like integers. gcc/ada/ * back_end.adb (Call_Back_End): Add gigi_standard_address to the signature of the gigi procedure and alphabetize other parameters. Pass Standard_Address as actual parameter for it. * cstand.adb (Create_Standard): Do not set Is_Descendant_Of_Address on Standard_Address. * gcc-interface/gigi.h (gigi): Add a standard_address parameter and alphabetize others. * gcc-interface/trans.cc (gigi): Likewise. Record a builtin address type and save it as the type for Standard.Address.
2023-05-26ada: Handle new Controlling_Tag format when converting to SCILGhjuvan Lacambre2-10/+29
This commit fixes two CodePeer crashes that were introduced when the format of the controlling tag changed. gcc/ada/ * exp_disp.adb (Expand_Dispatching_Call): Handle new Controlling_Tag. * sem_scil.adb (Check_SCIL_Node): Treat N_Object_Renaming_Declaration as N_Object_Declaration.
2023-05-26ada: Use context variables in expansion of aggregatesPiotr Trojanek1-10/+7
Code cleanup; semantics is unaffected. gcc/ada/ * exp_aggr.adb (Build_Constrained_Type): Remove local constants that were shadowing equivalent global constants; replace a wrapper that calls Make_Integer_Literal with a numeric literal; remove explicit Aliased_Present parameter which is equivalent to the default value. (Check_Bounds): Remove unused initial value. (Expand_Array_Aggregate): Use aggregate type from the context.
2023-05-26ada: Fix missing finalization in library-level instance bodyEric Botcazou6-228/+222
This extends the delaying mechanism present in the cases where the instance is not at library level, so as to wait until after the instantiation of the body is performed, before generating the finalizer of the compilation unit. gcc/ada/ * einfo.ads (Delay_Cleanups): Document new usage. * exp_ch7.ads (Build_Finalizer): New declaration. * exp_ch7.adb (Build_Finalizer.Process_Declarations): Do not treat library-level package instantiations specially. (Build_Finalizer): Return early for package bodies and specs that are not compilation units instead of using a more convoluted test. (Expand_N_Package_Body): Do not build a finalizer if Delay_Cleanups is set on the defining entity. (Expand_N_Package_Declaration): Likewise. * inline.ads (Pending_Body_Info): Reorder and add Fin_Scop. (Add_Pending_Instantiation): Add Fin_Scop parameter. * inline.adb (Add_Pending_Instantiation): Likewise and copy it into the Pending_Body_Info appended to Pending_Instantiations. (Add_Scope_To_Clean): Change parameter name to Scop and remove now irrelevant processing. (Cleanup_Scopes): Deal with scopes that are package specs or bodies. (Instantiate_Body): For package instantiations, deal specially with scopes that are package bodies and with scopes that are dynamic. Pass the resulting scope to Add_Scope_To_Clean directly. * sem_ch12.adb (Analyze_Package_Instantiation): In the case where a body is needed, compute the enclosing finalization scope and pass it in the call to Add_Pending_Instantiation. (Inline_Instance_Body): Adjust aggregate passed in the calls to Instantiate_Package_Body. (Load_Parent_Of_Generic): Likewise.
2023-05-26ada: Minor tweak in conditionEric Botcazou1-1/+1
gcc/ada/ * sem_util.adb (Compile_Time_Constraint_Error): Test the Ekind.
2023-05-26ada: Simplify expansion of positional aggregatesPiotr Trojanek1-9/+3
Code cleanup; semantics is unaffected. gcc/ada/ * exp_aggr.adb (Build_Constrained_Type): Use List_Length to count expressions in consecutive subaggregates.
2023-05-26ada: Use computed value from os_constants to define sigset_tDoug Rupp1-1/+3
Remove hard coded definition and conform to standard usage of using computed os_constants for opaque type declarations. gcc/ada/ * libgnarl/s-osinte__qnx.ads (sigset_t): Modify declaration to use system.os_constants computed value. Align it.
2023-05-26ada: Fix another couple of unchecked conversions to Ada.Tags.TagEric Botcazou1-54/+17
They are problematic on platforms where the provenance of pointers must be tracked throughout their lifetime. gcc/ada/ * exp_sel.adb: Add clauses for Sem_Util, remove them for Opt, Sinfo and Sinfo.Nodes. (Build_K): Always use 'Tag of the object. (Build_S_Assignment): Likewise.
2023-05-26ada: Refine types for an accessibility-checking routinePiotr Trojanek1-2/+2
Code cleanup related to work on expression functions for GNATprove (which require accessibility checks even when they are not expanded and thus have no explicit return statements). gcc/ada/ * accessibility.adb (Is_Formal_Of_Current_Function): This routine expects an entity reference and not the entity itself, so its parameter is a Node_Id and not an Entity_Id.
2023-05-26ada: Clean style in expansion of array aggregatesPiotr Trojanek1-7/+5
Code cleanup only; semantics is unaffected. gcc/ada/ * exp_aggr.adb (Build_Array_Aggr_Code): Change variable to constant. (Check_Same_Aggr_Bounds): Fix style; remove unused initial value.
2023-05-26ada: Fix late extra formals creationRonan Desplanques1-0/+1
Before this patch, in some situations, a subprogram call could be expanded before the extra formals for the subprogram were created. This patch fixes the problem in those situations. gcc/ada/ * sem_ch6.adb (Analyze_Subprogram_Body_Helper): Create extra formals in more situations.
2023-05-26ada: Add missing guards in Selected_Range_ChecksEric Botcazou1-0/+2
gcc/ada/ * checks.adb (Selected_Range_Checks): Add guards to protect calls to Expr_Value on bounds.
2023-05-26ada: Enhance Is_Null_Range and Not_Null_Range predicatesEric Botcazou3-9/+11
Both predicates bail out if the bounds of the range are not known at compile time, whereas Compile_Time_Compare can deal with them in specific cases. gcc/ada/ * sem_eval.ads (Is_Null_Range): Remove requirements of compile-time known bounds and add WARNING line. (Not_Null_Range): Remove requirements of compile-time known bounds. * sem_eval.adb (Is_Null_Range): Fall back to Compile_Time_Compare. (Not_Null_Range): Likewise. * fe.h (Is_Null_Range): New predicate.
2023-05-26i386: Do not disable call to ix86_expand_vecop_qihi2Uros Bizjak1-1/+1
gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_vecop_qihi): Do not disable call to ix86_expand_vecop_qihi2.
2023-05-26Only use NO_REGS in cost calculation when !hard_regno_mode_ok for GENERAL_REGS and mode.liuhongt1-4/+8
r14-172-g0368d169492017 replaces GENERAL_REGS with NO_REGS in cost calculation when the preferred register class is not known yet. It regressed powerpc PR109610 and PR109858; it looks too aggressive to use NO_REGS when the mode can be allocated with GENERAL_REGS. The patch takes a step back: it still uses GENERAL_REGS when hard_regno_mode_ok for mode and GENERAL_REGS, and otherwise uses NO_REGS. gcc/ChangeLog: PR target/109610 PR target/109858 * ira-costs.cc (scan_one_insn): Only use NO_REGS in cost calculation when !hard_regno_mode_ok for GENERAL_REGS and mode, otherwise still use GENERAL_REGS.
2023-05-26RISC-V: Fix zero-scratch-regs-3.c failJuzhe-Zhong1-2/+2
gcc/ChangeLog: * config/riscv/riscv.cc (vector_zero_call_used_regs): Add explicit VL and drop VL in ops. Signed-off-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>
2023-05-26Daily bump.GCC Administrator4-1/+691
2023-05-25testsuite: Require trampolines for nested-vla testsDimitar Dimitrov3-0/+3
gcc/testsuite/ChangeLog: * gcc.dg/nested-vla-1.c: Require effective target trampolines. * gcc.dg/nested-vla-2.c: Ditto. * gcc.dg/nested-vla-3.c: Ditto. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2023-05-25In pipeline scheduling, insns should not be fused across different basic blocks.Jin Ma1-1/+1
gcc/ChangeLog: * sched-deps.cc (sched_macro_fuse_insns): Insns should not be fused across different basic blocks.
2023-05-25i386: Use 2x-wider modes when emulating QImode vector instructionsUros Bizjak3-171/+255
Rewrite ix86_expand_vecop_qihi2 to expand to 2x-wider (e.g. V16QI -> V16HImode) instructions when available. Currently, the compiler generates the following assembly for V16QImode multiplication (-mavx2): vpunpcklbw %xmm0, %xmm0, %xmm3 vpunpcklbw %xmm1, %xmm1, %xmm2 vpunpckhbw %xmm0, %xmm0, %xmm0 movl $255, %eax vpunpckhbw %xmm1, %xmm1, %xmm1 vpmullw %xmm3, %xmm2, %xmm2 vmovd %eax, %xmm3 vpmullw %xmm0, %xmm1, %xmm1 vpbroadcastw %xmm3, %xmm3 vpand %xmm2, %xmm3, %xmm0 vpand %xmm1, %xmm3, %xmm3 vpackuswb %xmm3, %xmm0, %xmm0 and only with -mavx512bw -mavx512vl generates: vpmovzxbw %xmm1, %ymm1 vpmovzxbw %xmm0, %ymm0 vpmullw %ymm1, %ymm0, %ymm0 vpmovwb %ymm0, %xmm0 The patched compiler generates more optimized code involving multiplication in 2x-wider mode in cases where the missing truncate instruction has to be emulated with a permutation (-mavx2): vpmovzxbw %xmm0, %ymm0 vpmovzxbw %xmm1, %ymm1 movl $255, %eax vpmullw %ymm1, %ymm0, %ymm1 vmovd %eax, %xmm0 vpbroadcastw %xmm0, %ymm0 vpand %ymm1, %ymm0, %ymm0 vpackuswb %ymm0, %ymm0, %ymm0 vpermq $216, %ymm0, %ymm0 The patch also adjusts cost calculation of V*QImode emulations to account for generation of 2x-wider mode instructions. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_vecop_qihi2): Rewrite to expand to 2x-wider (e.g. V16QI -> V16HImode) instructions when available. Emulate truncation via ix86_expand_vec_perm_const_1 when native truncate insn is not available. (ix86_expand_vecop_qihi_partial) <case MULT>: Use pmovzx when available. Trivially rename some variables. (ix86_expand_vecop_qihi): Unconditionally call ix86_expand_vecop_qihi2. * config/i386/i386.cc (ix86_multiplication_cost): Rewrite cost calculation of V*QImode emulations to account for generation of 2x-wider mode instructions. (ix86_shift_rotate_cost): Update cost calculation of V*QImode emulations to account for generation of 2x-wider mode instructions. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512vl-pr95488-1.c: Revert 2023-05-18 change.
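As a scalar model of the 2x-wider strategy in the entry above, the following C sketch multiplies sixteen bytes by zero-extending each lane to 16 bits, multiplying there, and keeping only the low byte. The function name mul_v16qi_widened is hypothetical; this is not the code in i386-expand.cc, only an illustration of what the vpmovzxbw / vpmullw / truncate sequence computes per lane.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of a V16QImode multiplication emulated in a
   2x-wider mode: zero-extend each byte lane to 16 bits, multiply in
   16 bits, then truncate back to the low byte.  */
void
mul_v16qi_widened (const uint8_t a[16], const uint8_t b[16], uint8_t out[16])
{
  for (int i = 0; i < 16; i++)
    {
      uint16_t wa = a[i];                    /* models vpmovzxbw */
      uint16_t wb = b[i];
      uint16_t prod = (uint16_t) (wa * wb);  /* models vpmullw */
      out[i] = (uint8_t) (prod & 0xff);      /* models the mask + pack step */
    }
}
```

Calling it on two 16-byte arrays gives, in each lane, the same value as an ordinary modulo-256 byte multiplication.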
2023-05-25target/104327: Allow more inlining between different optimization levels.Georg-Johann Lay1-0/+16
avr-common.cc introduces the following options that are set depending on optimization level: -mgas-isr-prologues, -mmain-is-OS-task and -fsplit-wide-types-early. The inliner thinks that different options disallow cross-optimization inlining, so provide can_inline_p. gcc/ PR target/104327 * config/avr/avr.cc (avr_can_inline_p): New static function. (TARGET_CAN_INLINE_P): Define to that function.
2023-05-25target/82931: Make a pattern more generic to match more bit-transfers.Georg-Johann Lay3-10/+52
There is already a pattern in avr.md that matches single-bit transfers from one register to another one, but it only handled bit 0 of 8-bit registers. This change makes that pattern more generic so it matches more of similar single-bit transfers. gcc/ PR target/82931 * config/avr/avr.md (*movbitqi.0): Rename to *movbit<mode>.0-6. Handle any bit position and use mode QISI. * config/avr/avr.cc (avr_rtx_costs_1) [IOR]: Return a cost of 2 insns for bit-transfer of respective style. gcc/testsuite/ PR target/82931 * gcc.target/avr/pr82931.c: New test.
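For readers unfamiliar with the term, a single-bit transfer copies one bit of a source value into one bit position of a destination and leaves the other destination bits alone. The C sketch below shows the operation in general form; the function name transfer_bit is hypothetical, and whether this exact source shape is what the avr.md pattern ends up matching is an assumption, not something stated in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Copy bit SRC_BIT of SRC into bit DST_BIT of DST, leaving all other
   bits of DST unchanged.  Both bit positions must be in 0..7.  */
uint8_t
transfer_bit (uint8_t dst, unsigned dst_bit, uint8_t src, unsigned src_bit)
{
  assert (dst_bit < 8 && src_bit < 8);
  uint8_t bit = (uint8_t) ((src >> src_bit) & 1u);
  dst = (uint8_t) (dst & ~(1u << dst_bit));  /* clear the target bit */
  dst = (uint8_t) (dst | (bit << dst_bit));  /* insert the copied bit */
  return dst;
}
```

For example, transfer_bit (0x00, 3, 0x80, 7) moves the set top bit of the source into bit 3 of an all-zero destination.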
2023-05-25arm: merge MVE_5 and MVE_6 iteratorsChristophe Lyon2-35/+34
MVE_5 and MVE_6 iterators are the same: this patch replaces MVE_6 with MVE_5 everywhere in mve.md and removes MVE_6 from iterators.md. 2023-05-25 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/iterators.md (MVE_6): Remove. * config/arm/mve.md: Replace MVE_6 with MVE_5.
2023-05-25VECT: Add decrement IV iteration loop control by variable amount supportJu-Zhe Zhong7-12/+558
This patch supports a decrementing IV by following the flow designed by Richard: (1) In vect_set_loop_condition_partial_vectors, for the first iteration of: call vect_set_loop_controls_directly. (2) vect_set_loop_controls_directly calculates "step" as in your patch. If rgc has 1 control, this step is the SSA name created for that control. Otherwise the step is a fresh SSA name, as in your patch. (3) vect_set_loop_controls_directly stores this step somewhere for later use, probably in LOOP_VINFO. Let's use "S" to refer to this stored step. (4) After the vect_set_loop_controls_directly call above, and outside the "if" statement that now contains vect_set_loop_controls_directly, check whether rgc->controls.length () > 1. If so, use vect_adjust_loop_lens_control to set the controls based on S. Then the only caller of vect_adjust_loop_lens_control is vect_set_loop_condition_partial_vectors. And the starting step for vect_adjust_loop_lens_control is always S. This patch has been well tested for single-rgroup and multiple-rgroup (SLP) cases and passes all testcases in the RISC-V port. Signed-off-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai> Co-Authored-By: Richard Sandiford <richard.sandiford@arm.com> gcc/ChangeLog: * tree-vect-loop-manip.cc (vect_adjust_loop_lens_control): New function. (vect_set_loop_controls_directly): Add decrement IV support. (vect_set_loop_condition_partial_vectors): Ditto. * tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): New variable. * tree-vectorizer.h (LOOP_VINFO_USING_DECREMENTING_IV_P): New macro. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-3.c: New test. * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-4.c: New test. * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-3.c: New test. * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-4.c: New test.
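The idea of a loop controlled by a decrementing IV with a variable step can be modelled in scalar C as below. This is only a sketch under assumptions: the function name add_arrays_decrementing_iv is hypothetical, and the step is taken to be min (remaining, vf), whereas in the vectorizer the step "S" is an SSA name whose value the target may compute differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of a length-controlled vector loop.  VF is
   the maximum number of elements handled per iteration.  Returns the
   number of iterations executed.  */
size_t
add_arrays_decrementing_iv (int *dst, const int *a, const int *b,
                            size_t n, size_t vf)
{
  assert (vf > 0);                  /* a zero step would never terminate */
  size_t iterations = 0;
  size_t remaining = n;             /* the decrementing IV */
  while (remaining > 0)
    {
      /* Assumed form of the step "S": as many elements as still fit.  */
      size_t step = remaining < vf ? remaining : vf;
      for (size_t i = 0; i < step; i++)
        dst[i] = a[i] + b[i];
      dst += step;
      a += step;
      b += step;
      remaining -= step;            /* decrement by a variable amount */
      iterations++;
    }
  return iterations;
}
```

With n = 10 and vf = 4 the model runs three iterations with steps 4, 4 and 2, so the final partial iteration needs no separate epilogue loop.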
2023-05-25aarch64: PR target/99195 Annotate complex FP patterns for vec-concat-zeroKyrylo Tkachov2-16/+80
This patch annotates the complex add and mla patterns for vec-concat-zero. Testing showed an interesting bug in our MD patterns where they were defined to match: (plus:VHSDF (match_operand:VHSDF 1 "register_operand" "0") (unspec:VHSDF [(match_operand:VHSDF 2 "register_operand" "w") (match_operand:VHSDF 3 "register_operand" "w") (match_operand:SI 4 "const_int_operand" "n")] FCMLA)) but the canonicalisation rules for PLUS require the more "complex" operand to be first so during combine when the new substituted patterns were attempted to be formed combine/recog would try to match: (plus:V2SF (unspec:V2SF [ (reg:V2SF 100) (reg:V2SF 101) (const_int 0 [0]) ] UNSPEC_FCMLA270) (reg:V2SF 99)) instead. This patch fixes the operands of the PLUS RTX in these patterns. Similar patterns for the dot-product instructions already used the right order. Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf. gcc/ChangeLog: PR target/99195 * config/aarch64/aarch64-simd.md (aarch64_fcadd<rot><mode>): Rename to... (aarch64_fcadd<rot><mode><vczle><vczbe>): ... This. Fix canonicalization of PLUS operands. (aarch64_fcmla<rot><mode>): Rename to... (aarch64_fcmla<rot><mode><vczle><vczbe>): ... This. Fix canonicalization of PLUS operands. (aarch64_fcmla_lane<rot><mode>): Rename to... (aarch64_fcmla_lane<rot><mode><vczle><vczbe>): ... This. Fix canonicalization of PLUS operands. (aarch64_fcmla_laneq<rot>v4hf): Rename to... (aarch64_fcmla_laneq<rot>v4hf<vczle><vczbe>): ... This. Fix canonicalization of PLUS operands. (aarch64_fcmlaq_lane<rot><mode>): Fix canonicalization of PLUS operands. gcc/testsuite/ChangeLog: PR target/99195 * gcc.target/aarch64/simd/pr99195_9.c: New test.
2023-05-25arm: Implement ACLE Data IntrinsicsChris Sidebottom7-4/+498
This patch implements a number of scalar data processing intrinsics from ACLE that were requested by some users. Some of these have fast single-instruction sequences for Armv6 and later, but even for earlier versions they can still emit an inline sequence or a call to libgcc (and ACLE recommends them being unconditionally available). Chris Sidebottom wrote most of the patch, I just cleaned it up, wired up some builtins and adjusted the tests. Bootstrapped and tested on arm-none-linux-gnueabihf. Co-authored-by: Chris Sidebottom <chris.sidebottom@arm.com> gcc/ChangeLog: * config/arm/arm.md (rbitsi2): Rename to... (arm_rbit): ... This. (ctzsi2): Adjust for the above. (arm_rev16si2): Convert to define_expand. (arm_rev16si2_alt1): New pattern. (arm_rev16si2_alt): Rename to... (*arm_rev16si2_alt2): ... This. * config/arm/arm_acle.h (__ror, __rorl, __rorll, __clz, __clzl, __clzll, __cls, __clsl, __clsll, __revsh, __rev, __revl, __revll, __rev16, __rev16l, __rev16ll, __rbit, __rbitl, __rbitll): Define intrinsics. * config/arm/arm_acle_builtins.def (rbit, rev16si2): Define builtins. gcc/testsuite/ChangeLog: * gcc.target/arm/acle/data-intrinsics-armv6.c: New test. * gcc.target/arm/acle/data-intrinsics-assembly.c: New test. * gcc.target/arm/acle/data-intrinsics-rbit.c: New test. * gcc.target/arm/acle/data-intrinsics.c: New test.
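To make the semantics of a few of these intrinsics concrete, here are portable C reference versions of the 32-bit rotate, byte-reverse, halfword byte-reverse, bit-reverse and count-leading-zeros operations. The ref_* names are hypothetical and are not part of arm_acle.h; the behaviour for a rotate count of 32 or more (taken modulo 32) and for ref_clz (0) (returning 32) follows my reading of ACLE and should be treated as an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate X right by N bits; the count is reduced modulo 32.  */
uint32_t
ref_ror (uint32_t x, uint32_t n)
{
  n &= 31;
  return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Reverse the byte order of a 32-bit word.  */
uint32_t
ref_rev (uint32_t x)
{
  return ((x & 0xffu) << 24) | ((x & 0xff00u) << 8)
         | ((x >> 8) & 0xff00u) | (x >> 24);
}

/* Reverse the bytes within each 16-bit halfword.  */
uint32_t
ref_rev16 (uint32_t x)
{
  return ((x & 0xff00ff00u) >> 8) | ((x & 0x00ff00ffu) << 8);
}

/* Reverse the bit order of a 32-bit word.  */
uint32_t
ref_rbit (uint32_t x)
{
  uint32_t r = 0;
  for (int i = 0; i < 32; i++)
    {
      r = (r << 1) | (x & 1u);
      x >>= 1;
    }
  return r;
}

/* Count leading zero bits; returns 32 for a zero argument.  */
unsigned
ref_clz (uint32_t x)
{
  if (x == 0)
    return 32;
  unsigned n = 0;
  while ((x & 0x80000000u) == 0)
    {
      x <<= 1;
      n++;
    }
  return n;
}
```

Functions like these can serve as an oracle when checking the intrinsics on a target, since they use only standard C and make no use of the builtins.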
2023-05-25arm: Fix ICE due to infinite splitting [PR109800]Alex Coplan3-4/+9
In r11-966-g9a182ef9ee011935d827ab5c6c9a7cd8e22257d8 we introduced a simplification to emit_move_insn that attempts to simplify moves of the form: (set (subreg:M1 (reg:M2 ...)) (constant C)) where M1 and M2 are of equal mode size. That is problematic for the splitter vfp.md:no_literal_pool_df_immediate in the arm backend, which tries to pun an lvalue DFmode pseudo into DImode and assign a constant to it with emit_move_insn, as the new transformation simply undoes this, and we end up splitting indefinitely. This patch changes things around in the arm backend so that we use a DImode temporary (instead of DFmode) and first load the DImode constant into the pseudo, and then pun the pseudo into DFmode as an rvalue in a reg -> reg move. I believe this should be semantically equivalent but avoids the pathological behaviour seen in the PR. gcc/ChangeLog: PR target/109800 * config/arm/arm.md (movdf): Generate temporary pseudo in DImode instead of DFmode. * config/arm/vfp.md (no_literal_pool_df_immediate): Rather than punning an lvalue DFmode pseudo into DImode, use a DImode pseudo and pun it into DFmode as an rvalue. gcc/testsuite/ChangeLog: PR target/109800 * gcc.target/arm/pure-code/pr109800.c: New test.
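A loose source-level analogy of the new ordering (integer constant first, reinterpretation as a double second) is sketched below. It is not RTL and not the splitter's code; the function name double_from_bits is hypothetical, and the checks assume that double is an 8-byte IEEE 754 binary64 type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build a double from its 64-bit image.  The integer value is
   materialised first and only then reinterpreted as a double, which is
   the analogue of loading a DImode constant and punning the result to
   DFmode as an rvalue.  Assumes an 8-byte IEEE 754 binary64 double.  */
double
double_from_bits (uint64_t bits)
{
  double d;
  assert (sizeof d == sizeof bits);
  memcpy (&d, &bits, sizeof d);   /* well-defined way to pun in C */
  return d;
}
```

For instance, the image 0x3ff0000000000000 corresponds to 1.0 under the stated binary64 assumption.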
2023-05-25target/109955 - handle pattern generated COND_EXPR without vcondRichard Biener1-1/+6
The following properly handles pattern matching generated COND_EXPRs which can still have embedded compares in vectorizable_condition which will always code generate the masked vector variant. We were requiring vcond with embedded comparisons instead of also allowing (as code generated) split compare and VEC_COND_EXPR. This fixes some of the fallout when removing vcond{,u,eq} expanders from the x86 backend. PR target/109955 * tree-vect-stmts.cc (vectorizable_condition): For embedded comparisons also handle the case when the target only provides vec_cmp and vcond_mask.
2023-05-25arc: Make TLS Local Dynamic work like Global Dynamic modelClaudiu Zissulescu1-23/+1
ARC's current TLS Local Dynamic model uses two anchors to access data, namely `.tdata` and `.tbss`. This implementation is unnecessarily complicated. The TLS Local Dynamic model gives better results when it uses the Global Dynamic model and anchors. gcc/ChangeLog: * config/arc/arc.cc (arc_call_tls_get_addr): Simplify access using TLS Local Dynamic. Signed-off-by: Claudiu Zissulescu <claziss@gmail.com>
2023-05-25[aarch64] Ignore cost of scalar moves for seq in vector initialization.Prathamesh Kulkarni1-2/+42
gcc/ChangeLog: * config/aarch64/aarch64.cc (scalar_move_insn_p): New function. (seq_cost_ignoring_scalar_moves): Likewise. (aarch64_expand_vector_init): Call seq_cost_ignoring_scalar_moves.
2023-05-25aarch64: Implement vector FP absolute compare intrinsics with builtinsKyrylo Tkachov2-24/+40
While optimising some vector math library code with intrinsics we stumbled upon the issue in the testcase. The compiler should be generating a FACGT instruction but instead we generate: foo(__Float32x4_t, __Float32x4_t, __Float32x4_t): fabs v0.4s, v0.4s adrp x0, .LC0 ldr q31, [x0, #:lo12:.LC0] fcmgt v0.4s, v0.4s, v31.4s ret This is because the vcagtq_f32 intrinsic is open-coded in arm_neon.h as return vabsq_f32 (__a) > vabsq_f32 (__b) thus relying on the optimisers to merge it back together. But since one of the arms of the comparison is a vector constant the combine pass optimises the abs into it and tries matching: (set (reg:V4SI 101) (neg:V4SI (gt:V4SI (reg:V4SF 100) (const_vector:V4SF [ (const_double:SF 1.0e+2 [0x0.c8p+7]) repeated x4 ])))) and (set (reg:V4SI 101) (neg:V4SI (gt:V4SI (abs:V4SF (reg:V4SF 104)) (reg:V4SF 103)))) instead of what we want: (insn 13 9 14 2 (set (reg/i:V4SI 32 v0) (neg:V4SI (gt:V4SI (abs:V4SF (reg:V4SF 98)) (abs:V4SF (reg:V4SF 96))))) I don't really see a good way around that with our current implementation of these intrinsics. Therefore this patch reimplements these intrinsics with aarch64 builtins that generate the RTL for these instructions directly. Apparently we already had them defined in aarch64-simd-builtins.def and have been using them for the fp16 case already. I realise that this approach is against the general principle of expressing intrinsics in the higher-level constructs, so I'm willing to listen to counter-arguments. That said, the FACGT/FACGE instructions are as fast as the non-ABS comparison instructions on all microarchitectures that I know of so it should always be a win to have them in the merged form rather than split the fabs step separately or try to hoist it. And the testcase does come from real library code that we're trying to optimise. 
With this patch for the testcase we generate: foo: adrp x0, .LC0 ldr q31, [x0, #:lo12:.LC0] facgt v0.4s, v0.4s, v31.4s ret gcc/ChangeLog: * config/aarch64/arm_neon.h (vcage_f64): Reimplement with builtins. (vcage_f32): Likewise. (vcages_f32): Likewise. (vcageq_f32): Likewise. (vcaged_f64): Likewise. (vcageq_f64): Likewise. (vcagts_f32): Likewise. (vcagt_f32): Likewise. (vcagt_f64): Likewise. (vcagtq_f32): Likewise. (vcagtd_f64): Likewise. (vcagtq_f64): Likewise. (vcale_f32): Likewise. (vcale_f64): Likewise. (vcaled_f64): Likewise. (vcales_f32): Likewise. (vcaleq_f32): Likewise. (vcaleq_f64): Likewise. (vcalt_f32): Likewise. (vcalt_f64): Likewise. (vcaltd_f64): Likewise. (vcaltq_f32): Likewise. (vcaltq_f64): Likewise. (vcalts_f32): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/simd/facgt_constpool_1.c: New test.
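The lane-wise meaning of the absolute compare described above, |a| > |b| producing an all-ones or all-zeros mask, can be modelled in plain C as follows. The names ref_vcagtq_f32 and abs_f32 are hypothetical; this is a scalar reference model over four-element arrays, not the arm_neon.h implementation and not the builtin the patch uses.

```c
#include <assert.h>
#include <stdint.h>

/* Absolute value without libm.  For comparison purposes this behaves
   like fabsf: -0.0f compares equal to 0.0f and NaN compares false.  */
static float
abs_f32 (float x)
{
  return x < 0.0f ? -x : x;
}

/* Hypothetical scalar model of a four-lane "absolute compare greater
   than": each output lane is all ones when |a| > |b|, else zero.  */
void
ref_vcagtq_f32 (const float a[4], const float b[4], uint32_t out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = abs_f32 (a[i]) > abs_f32 (b[i]) ? UINT32_MAX : 0u;
}
```

When one operand is a constant vector, the model makes it visible why the absolute value of that operand can be folded at compile time, which is the combine behaviour the commit message describes.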