2023-05-06  Daily bump.  (GCC Administrator; 5 files, -1/+1200)
2023-05-06  CRIS: peephole2 an add into two addq or subq  (Hans-Peter Nilsson; 3 files, -2/+81)

Unfortunately, this doesn't cause a performance improvement for coremark, but it happens a few times in newlib, just enough to affect coremark by 0.01% in size (or 4 bytes) and three cycles (__fwalk_sglue and __vfiprintf_r each shrink by two bytes).

gcc:
    * config/cris/cris.md (splitop): Add PLUS.
    * config/cris/cris.cc (cris_split_constant): Also handle PLUS when a split into two insns may be useful.

gcc/testsuite:
    * gcc.target/cris/peep2-addsplit1.c: New test.
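A hedged illustration of the kind of opportunity the peephole2 targets (this example is hypothetical, not the committed test case; it assumes CRIS addq/subq take a 6-bit unsigned immediate, i.e. 0..63):

/* Hypothetical example: an add of a constant just outside the
   quick-immediate range could be split into two quick adds,
       addq 63,$r10
       addq 37,$r10
   instead of first loading 100 into a scratch register.  */
int
add100 (int x)
{
  return x + 100;
}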
2023-05-06  CRIS: peephole2 a move of constant followed by and of same register  (Hans-Peter Nilsson; 2 files, -0/+55)

While moves of constants into registers are separately optimizable, a combination of a move with a subsequent "and" is slightly preferable even if the move can be generated with the same number (and timing) of insns, as moves of "just" registers are eliminated now and then in different passes, loosely speaking. This movandsplit1 pattern feeds into the opsplit1/AND peephole2, with matching occurrences observed in the floating point functions in libgcc. Also, a test-case to fit.

Coremark improvements are unimpressive: less than 0.0003% speed, 0.1% size. But that was pre-LRA; after the switch to LRA this peephole2 doesn't match anymore (for any of coremark, local tests, libgcc and newlib libc) and the test-case passes with and without the patch. Still, there's no apparent reason why LRA prefers "move R1,R2" "and I,R2" to "move I,R1" "and R1,R2", or why that wouldn't "randomly" change (also seen with other operations than "and"). Thus committed.

gcc:
    * config/cris/cris.md (movandsplit1): New define_peephole2.

gcc/testsuite:
    * gcc.target/cris/peep2-movandsplit1.c: New test.
2023-05-06  CRIS: peephole2 a lsrq into a lslq+lsrq pair  (Hans-Peter Nilsson; 3 files, -0/+91)

Observed after opsplit1 with AND in libgcc floating-point functions, like the first spottings of opsplit1/AND opportunities. Two patterns are nominally needed, as the peephole2 optimizer continues from the *first replacement* insn, not from a minimum context for general matching: one free-standing, and one that includes the output of the previous define_peephole2 as the last match. But the "free-standing" opportunity (three shifts) didn't match by itself in a gcc build of the libraries plus a run of the test-suite, and was thus deemed uninteresting and left out. (As expected; if it had matched, that would have indicated a previously missed optimization or another problem elsewhere.) Only the pattern that includes the previous define_peephole2 that may generate the sequence (i.e. opsplit1/AND) matches easily.

Coremark results aren't impressive though: 0.003% improvement in speed and slightly less than 0.1% in size. A test case is added to match, and another to cover a case of movulsr, checking that it is used; it is preferable to lsrandsplit when both would match.

gcc:
    * config/cris/cris.md (lsrandsplit1): New define_peephole2.

gcc/testsuite:
    * gcc.target/cris/peep2-lsrandsplit1.c, gcc.target/cris/peep2-movulsr2.c: New tests.
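A hypothetical C-level illustration of the shift-and-mask shape involved (not the committed test; the exact masks and shift counts the peephole2 matches are an assumption here):

/* Masking after a logical right shift: the "and" with a low-bits
   mask might be replaced by a lslq/lsrq pair that clears the high
   bits by shifting, avoiding a constant load.  */
unsigned int
f (unsigned int x)
{
  return (x >> 5) & 0xfffff;
}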
2023-05-06  doc: Document order of define_peephole2 scanning  (Hans-Peter Nilsson; 1 file, -0/+9)

I was a bit surprised when my newly-added define_peephole2 didn't match, but it was because it was expected to partially match the generated output of a previous define_peephole2, which matched and modified the last insn of a sequence to be matched. I had assumed that the algorithm backed up the size of the match-buffer, thereby exposing newly created opportunities *with sufficient context* to all define_peephole2's. While things can change in that direction, let's start with documenting the current state.

    * doc/md.texi (define_peephole2): Document order of scanning.
2023-05-05  Fortran: overloading of intrinsic binary operators [PR109641]  (Harald Anlauf; 4 files, -0/+162)

Fortran allows overloading of intrinsic operators also for operands of numeric intrinsic types. The intrinsic operator versions are used according to the rules of F2018 table 10.2 and imply type conversion as long as the operand ranks are conformable. Otherwise no type conversion shall be performed, to allow the resolution of a matching user-defined operator.

gcc/fortran/ChangeLog:

    PR fortran/109641
    * arith.cc (eval_intrinsic): Check conformability of ranks of operands for intrinsic binary operators before performing type conversions.
    * gfortran.h (gfc_op_rank_conformable): Add prototype.
    * resolve.cc (resolve_operator): Check conformability of ranks of operands for intrinsic binary operators before performing type conversions.
    (gfc_op_rank_conformable): New helper function to compare ranks of operands of binary operator.

gcc/testsuite/ChangeLog:

    PR fortran/109641
    * gfortran.dg/overload_5.f90: New test.
2023-05-05  RISC-V: Legitimise the const0_rtx for RVV indexed load/store  (Pan Li; 2 files, -32/+33)

This patch tries to legitimise the const0_rtx (aka the zero register) as the base register for the RVV indexed load/store instructions by allowing the const as the operand of the indexed RTL pattern. The underlying combine pass will then try to perform the const propagation. For example:

vint32m1_t
test_vluxei32_v_i32m1_shortcut (vuint32m1_t bindex, size_t vl)
{
  return __riscv_vluxei32_v_i32m1 ((int32_t *)0, bindex, vl);
}

Before this patch:
    li a5,0                        <- can be eliminated.
    vl1re32.v v1,0(a1)
    vsetvli zero,a2,e32,m1,ta,ma
    vluxei32.v v1,(a5),v1          <- can propagate the const 0 to a5 here.
    vs1r.v v1,0(a0)
    ret

After this patch:
test_vluxei32_v_i32m1_shortcut:
    vl1re32.v v1,0(a1)
    vsetvli zero,a2,e32,m1,ta,ma
    vluxei32.v v1,(0),v1
    vs1r.v v1,0(a0)
    ret

As above, this patch allows the const 0 (aka the zero register) to be propagated to the base register of the RVV indexed load in the combine pass. This may benefit the underlying RVV auto-vectorization.

gcc/ChangeLog:

    * config/riscv/vector.md: Allow const as the operand of RVV indexed load/store.

gcc/testsuite/ChangeLog:

    * gcc.target/riscv/rvv/base/zero_base_load_store_optimization.c: Adjust indexed load/store check condition.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
2023-05-05  RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMSET  (Pan Li; 2 files, -5/+6)

When some RVV integer compare operators act on the same vector register without a mask, they can be simplified to VMSET. This patch allows the eq, le, leu, ge and geu operators to perform this kind of simplification by adding one macro in riscv for simplify_rtx. Given we have:

vbool1_t
test_shortcut_for_riscv_vmseq_case_0 (vint8m8_t v1, size_t vl)
{
  return __riscv_vmseq_vv_i8m8_b1 (v1, v1, vl);
}

Before this patch:
    vsetvli zero,a2,e8,m8,ta,ma
    vl8re8.v v8,0(a1)
    vmseq.vv v8,v8,v8
    vsetvli a5,zero,e8,m8,ta,ma
    vsm.v v8,0(a0)
    ret

After this patch:
    vsetvli zero,a2,e8,m8,ta,ma
    vmset.m v1                     <- optimized to vmset.m
    vsetvli a5,zero,e8,m8,ta,ma
    vsm.v v1,0(a0)
    ret

As above, we may have one instruction eliminated and require fewer vector registers.

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

    * config/riscv/riscv.h (VECTOR_STORE_FLAG_VALUE): Add new macro consumed by simplify_rtx.

gcc/testsuite/ChangeLog:

    * gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c: Adjust test check condition.
2023-05-05  arm: [MVE intrinsics] rework vshrq vrshrq  (Christophe Lyon; 4 files, -628/+6)
Implement vshrq and vrshrq using the new MVE builtins framework. 2022-09-08 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/arm/arm-mve-builtins-base.cc (vrshrq, vshrq): New. * config/arm/arm-mve-builtins-base.def (vrshrq, vshrq): New. * config/arm/arm-mve-builtins-base.h (vrshrq, vshrq): New. * config/arm/arm_mve.h (vshrq): Remove. (vrshrq): Remove. (vrshrq_m): Remove. (vshrq_m): Remove. (vrshrq_x): Remove. (vshrq_x): Remove. (vshrq_n_s8): Remove. (vshrq_n_s16): Remove. (vshrq_n_s32): Remove. (vshrq_n_u8): Remove. (vshrq_n_u16): Remove. (vshrq_n_u32): Remove. (vrshrq_n_u8): Remove. (vrshrq_n_s8): Remove. (vrshrq_n_u16): Remove. (vrshrq_n_s16): Remove. (vrshrq_n_u32): Remove. (vrshrq_n_s32): Remove. (vrshrq_m_n_s8): Remove. (vrshrq_m_n_s32): Remove. (vrshrq_m_n_s16): Remove. (vrshrq_m_n_u8): Remove. (vrshrq_m_n_u32): Remove. (vrshrq_m_n_u16): Remove. (vshrq_m_n_s8): Remove. (vshrq_m_n_s32): Remove. (vshrq_m_n_s16): Remove. (vshrq_m_n_u8): Remove. (vshrq_m_n_u32): Remove. (vshrq_m_n_u16): Remove. (vrshrq_x_n_s8): Remove. (vrshrq_x_n_s16): Remove. (vrshrq_x_n_s32): Remove. (vrshrq_x_n_u8): Remove. (vrshrq_x_n_u16): Remove. (vrshrq_x_n_u32): Remove. (vshrq_x_n_s8): Remove. (vshrq_x_n_s16): Remove. (vshrq_x_n_s32): Remove. (vshrq_x_n_u8): Remove. (vshrq_x_n_u16): Remove. (vshrq_x_n_u32): Remove. (__arm_vshrq_n_s8): Remove. (__arm_vshrq_n_s16): Remove. (__arm_vshrq_n_s32): Remove. (__arm_vshrq_n_u8): Remove. (__arm_vshrq_n_u16): Remove. (__arm_vshrq_n_u32): Remove. (__arm_vrshrq_n_u8): Remove. (__arm_vrshrq_n_s8): Remove. (__arm_vrshrq_n_u16): Remove. (__arm_vrshrq_n_s16): Remove. (__arm_vrshrq_n_u32): Remove. (__arm_vrshrq_n_s32): Remove. (__arm_vrshrq_m_n_s8): Remove. (__arm_vrshrq_m_n_s32): Remove. (__arm_vrshrq_m_n_s16): Remove. (__arm_vrshrq_m_n_u8): Remove. (__arm_vrshrq_m_n_u32): Remove. (__arm_vrshrq_m_n_u16): Remove. (__arm_vshrq_m_n_s8): Remove. (__arm_vshrq_m_n_s32): Remove. (__arm_vshrq_m_n_s16): Remove. (__arm_vshrq_m_n_u8): Remove. (__arm_vshrq_m_n_u32): Remove. (__arm_vshrq_m_n_u16): Remove. (__arm_vrshrq_x_n_s8): Remove. (__arm_vrshrq_x_n_s16): Remove. (__arm_vrshrq_x_n_s32): Remove. (__arm_vrshrq_x_n_u8): Remove. (__arm_vrshrq_x_n_u16): Remove. (__arm_vrshrq_x_n_u32): Remove. (__arm_vshrq_x_n_s8): Remove. (__arm_vshrq_x_n_s16): Remove. (__arm_vshrq_x_n_s32): Remove. (__arm_vshrq_x_n_u8): Remove. (__arm_vshrq_x_n_u16): Remove. (__arm_vshrq_x_n_u32): Remove. (__arm_vshrq): Remove. (__arm_vrshrq): Remove. (__arm_vrshrq_m): Remove. (__arm_vshrq_m): Remove. (__arm_vrshrq_x): Remove. (__arm_vshrq_x): Remove.
2023-05-05  arm: [MVE intrinsics] factorize vshrq vrshrq  (Christophe Lyon; 2 files, -38/+22)

Factorize vshrq and vrshrq so that they use the same pattern.

2022-09-08  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
    * config/arm/iterators.md (MVE_VSHRQ_M_N, MVE_VSHRQ_N): New.
    (mve_insn): Add vrshr, vshr.
    * config/arm/mve.md (mve_vshrq_n_<supf><mode>) (mve_vrshrq_n_<supf><mode>): Merge into ...
    (@mve_<mve_insn>q_n_<supf><mode>): ... this.
    (mve_vrshrq_m_n_<supf><mode>, mve_vshrq_m_n_<supf><mode>): Merge into ...
    (@mve_<mve_insn>q_m_n_<supf><mode>): ... this.
2023-05-05  arm: [MVE intrinsics] add binary_rshift shape  (Christophe Lyon; 2 files, -0/+37)

This patch adds the binary_rshift shape description.

2022-09-08  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
    * config/arm/arm-mve-builtins-shapes.cc (binary_rshift): New.
    * config/arm/arm-mve-builtins-shapes.h (binary_rshift): New.
2023-05-05  arm: [MVE intrinsics] rework vqrshrunbq vqrshruntq vqshrunbq vqshruntq  (Christophe Lyon; 5 files, -320/+25)
Implement vqrshrunbq, vqrshruntq, vqshrunbq, vqshruntq using the new MVE builtins framework. 2022-09-08 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/arm/arm-mve-builtins-base.cc (FUNCTION_ONLY_N_NO_U_F): New. (vqshrunbq, vqshruntq, vqrshrunbq, vqrshruntq): New. * config/arm/arm-mve-builtins-base.def (vqshrunbq, vqshruntq) (vqrshrunbq, vqrshruntq): New. * config/arm/arm-mve-builtins-base.h (vqshrunbq, vqshruntq) (vqrshrunbq, vqrshruntq): New. * config/arm/arm-mve-builtins.cc (function_instance::has_inactive_argument): Handle vqshrunbq, vqshruntq, vqrshrunbq, vqrshruntq. * config/arm/arm_mve.h (vqrshrunbq): Remove. (vqrshruntq): Remove. (vqrshrunbq_m): Remove. (vqrshruntq_m): Remove. (vqrshrunbq_n_s16): Remove. (vqrshrunbq_n_s32): Remove. (vqrshruntq_n_s16): Remove. (vqrshruntq_n_s32): Remove. (vqrshrunbq_m_n_s32): Remove. (vqrshrunbq_m_n_s16): Remove. (vqrshruntq_m_n_s32): Remove. (vqrshruntq_m_n_s16): Remove. (__arm_vqrshrunbq_n_s16): Remove. (__arm_vqrshrunbq_n_s32): Remove. (__arm_vqrshruntq_n_s16): Remove. (__arm_vqrshruntq_n_s32): Remove. (__arm_vqrshrunbq_m_n_s32): Remove. (__arm_vqrshrunbq_m_n_s16): Remove. (__arm_vqrshruntq_m_n_s32): Remove. (__arm_vqrshruntq_m_n_s16): Remove. (__arm_vqrshrunbq): Remove. (__arm_vqrshruntq): Remove. (__arm_vqrshrunbq_m): Remove. (__arm_vqrshruntq_m): Remove. (vqshrunbq): Remove. (vqshruntq): Remove. (vqshrunbq_m): Remove. (vqshruntq_m): Remove. (vqshrunbq_n_s16): Remove. (vqshruntq_n_s16): Remove. (vqshrunbq_n_s32): Remove. (vqshruntq_n_s32): Remove. (vqshrunbq_m_n_s32): Remove. (vqshrunbq_m_n_s16): Remove. (vqshruntq_m_n_s32): Remove. (vqshruntq_m_n_s16): Remove. (__arm_vqshrunbq_n_s16): Remove. (__arm_vqshruntq_n_s16): Remove. (__arm_vqshrunbq_n_s32): Remove. (__arm_vqshruntq_n_s32): Remove. (__arm_vqshrunbq_m_n_s32): Remove. (__arm_vqshrunbq_m_n_s16): Remove. (__arm_vqshruntq_m_n_s32): Remove. (__arm_vqshruntq_m_n_s16): Remove. (__arm_vqshrunbq): Remove. (__arm_vqshruntq): Remove. (__arm_vqshrunbq_m): Remove. (__arm_vqshruntq_m): Remove.
2023-05-05  arm: [MVE intrinsics] factorize vqrshrunb vqrshrunt vqshrunb vqshrunt  (Christophe Lyon; 2 files, -132/+40)

Factorize vqrshrunb, vqrshrunt, vqshrunb and vqshrunt so that they use existing patterns.

2022-09-08  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
    * config/arm/iterators.md (MVE_SHRN_N): Add VQRSHRUNBQ, VQRSHRUNTQ, VQSHRUNBQ, VQSHRUNTQ.
    (MVE_SHRN_M_N): Likewise.
    (mve_insn): Add vqrshrunb, vqrshrunt, vqshrunb, vqshrunt.
    (isu): Add VQRSHRUNBQ, VQRSHRUNTQ, VQSHRUNBQ, VQSHRUNTQ.
    (supf): Likewise.
    * config/arm/mve.md (mve_vqrshrunbq_n_s<mode>): Remove.
    (mve_vqrshruntq_n_s<mode>): Remove.
    (mve_vqshrunbq_n_s<mode>): Remove.
    (mve_vqshruntq_n_s<mode>): Remove.
    (mve_vqrshrunbq_m_n_s<mode>): Remove.
    (mve_vqrshruntq_m_n_s<mode>): Remove.
    (mve_vqshrunbq_m_n_s<mode>): Remove.
    (mve_vqshruntq_m_n_s<mode>): Remove.
2023-05-05  arm: [MVE intrinsics] add binary_rshift_narrow_unsigned shape  (Christophe Lyon; 2 files, -0/+49)

This patch adds the binary_rshift_narrow_unsigned shape description.

2022-09-08  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
    * config/arm/arm-mve-builtins-shapes.cc (binary_rshift_narrow_unsigned): New.
    * config/arm/arm-mve-builtins-shapes.h (binary_rshift_narrow_unsigned): New.
2023-05-05  arm: [MVE intrinsics] rework vshrnbq vshrntq vrshrnbq vrshrntq vqshrnbq vqshrntq vqrshrnbq vqrshrntq  (Christophe Lyon; 5 files, -1153/+42)

Implement vshrnbq, vshrntq, vrshrnbq, vrshrntq, vqshrnbq, vqshrntq, vqrshrnbq, vqrshrntq using the new MVE builtins framework.

2022-09-08  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
    * config/arm/arm-mve-builtins-base.cc (FUNCTION_ONLY_N_NO_F): New. (vshrnbq, vshrntq, vrshrnbq, vrshrntq, vqshrnbq, vqshrntq) (vqrshrnbq, vqrshrntq): New.
    * config/arm/arm-mve-builtins-base.def (vshrnbq, vshrntq) (vrshrnbq, vrshrntq, vqshrnbq, vqshrntq, vqrshrnbq, vqrshrntq): New.
    * config/arm/arm-mve-builtins-base.h (vshrnbq, vshrntq, vrshrnbq) (vrshrntq, vqshrnbq, vqshrntq, vqrshrnbq, vqrshrntq): New.
    * config/arm/arm-mve-builtins.cc (function_instance::has_inactive_argument): Handle vshrnbq, vshrntq, vrshrnbq, vrshrntq, vqshrnbq, vqshrntq, vqrshrnbq, vqrshrntq.
    * config/arm/arm_mve.h (vshrnbq): Remove. (vshrntq): Remove. (vshrnbq_m): Remove. (vshrntq_m): Remove. (vshrnbq_n_s16): Remove. (vshrntq_n_s16): Remove. (vshrnbq_n_u16): Remove. (vshrntq_n_u16): Remove. (vshrnbq_n_s32): Remove. (vshrntq_n_s32): Remove. (vshrnbq_n_u32): Remove. (vshrntq_n_u32): Remove. (vshrnbq_m_n_s32): Remove. (vshrnbq_m_n_s16): Remove. (vshrnbq_m_n_u32): Remove. (vshrnbq_m_n_u16): Remove. (vshrntq_m_n_s32): Remove. (vshrntq_m_n_s16): Remove. (vshrntq_m_n_u32): Remove. (vshrntq_m_n_u16): Remove. (__arm_vshrnbq_n_s16): Remove. (__arm_vshrntq_n_s16): Remove. (__arm_vshrnbq_n_u16): Remove. (__arm_vshrntq_n_u16): Remove. (__arm_vshrnbq_n_s32): Remove. (__arm_vshrntq_n_s32): Remove. (__arm_vshrnbq_n_u32): Remove. (__arm_vshrntq_n_u32): Remove. (__arm_vshrnbq_m_n_s32): Remove. (__arm_vshrnbq_m_n_s16): Remove. (__arm_vshrnbq_m_n_u32): Remove. (__arm_vshrnbq_m_n_u16): Remove. (__arm_vshrntq_m_n_s32): Remove. (__arm_vshrntq_m_n_s16): Remove. (__arm_vshrntq_m_n_u32): Remove. (__arm_vshrntq_m_n_u16): Remove. (__arm_vshrnbq): Remove. (__arm_vshrntq): Remove. (__arm_vshrnbq_m): Remove. (__arm_vshrntq_m): Remove.
    (vrshrnbq): Remove. (vrshrntq): Remove. (vrshrnbq_m): Remove. (vrshrntq_m): Remove. (vrshrnbq_n_s16): Remove. (vrshrntq_n_s16): Remove. (vrshrnbq_n_u16): Remove. (vrshrntq_n_u16): Remove. (vrshrnbq_n_s32): Remove. (vrshrntq_n_s32): Remove. (vrshrnbq_n_u32): Remove. (vrshrntq_n_u32): Remove. (vrshrnbq_m_n_s32): Remove. (vrshrnbq_m_n_s16): Remove. (vrshrnbq_m_n_u32): Remove. (vrshrnbq_m_n_u16): Remove. (vrshrntq_m_n_s32): Remove. (vrshrntq_m_n_s16): Remove. (vrshrntq_m_n_u32): Remove. (vrshrntq_m_n_u16): Remove. (__arm_vrshrnbq_n_s16): Remove. (__arm_vrshrntq_n_s16): Remove. (__arm_vrshrnbq_n_u16): Remove. (__arm_vrshrntq_n_u16): Remove. (__arm_vrshrnbq_n_s32): Remove. (__arm_vrshrntq_n_s32): Remove. (__arm_vrshrnbq_n_u32): Remove. (__arm_vrshrntq_n_u32): Remove. (__arm_vrshrnbq_m_n_s32): Remove. (__arm_vrshrnbq_m_n_s16): Remove. (__arm_vrshrnbq_m_n_u32): Remove. (__arm_vrshrnbq_m_n_u16): Remove. (__arm_vrshrntq_m_n_s32): Remove. (__arm_vrshrntq_m_n_s16): Remove. (__arm_vrshrntq_m_n_u32): Remove. (__arm_vrshrntq_m_n_u16): Remove. (__arm_vrshrnbq): Remove. (__arm_vrshrntq): Remove. (__arm_vrshrnbq_m): Remove. (__arm_vrshrntq_m): Remove.
    (vqshrnbq): Remove. (vqshrntq): Remove. (vqshrnbq_m): Remove. (vqshrntq_m): Remove. (vqshrnbq_n_s16): Remove. (vqshrntq_n_s16): Remove. (vqshrnbq_n_u16): Remove. (vqshrntq_n_u16): Remove. (vqshrnbq_n_s32): Remove. (vqshrntq_n_s32): Remove. (vqshrnbq_n_u32): Remove. (vqshrntq_n_u32): Remove. (vqshrnbq_m_n_s32): Remove. (vqshrnbq_m_n_s16): Remove. (vqshrnbq_m_n_u32): Remove. (vqshrnbq_m_n_u16): Remove. (vqshrntq_m_n_s32): Remove. (vqshrntq_m_n_s16): Remove.
(vqshrntq_m_n_u32): Remove. (vqshrntq_m_n_u16): Remove. (__arm_vqshrnbq_n_s16): Remove. (__arm_vqshrntq_n_s16): Remove. (__arm_vqshrnbq_n_u16): Remove. (__arm_vqshrntq_n_u16): Remove. (__arm_vqshrnbq_n_s32): Remove. (__arm_vqshrntq_n_s32): Remove. (__arm_vqshrnbq_n_u32): Remove. (__arm_vqshrntq_n_u32): Remove. (__arm_vqshrnbq_m_n_s32): Remove. (__arm_vqshrnbq_m_n_s16): Remove. (__arm_vqshrnbq_m_n_u32): Remove. (__arm_vqshrnbq_m_n_u16): Remove. (__arm_vqshrntq_m_n_s32): Remove. (__arm_vqshrntq_m_n_s16): Remove. (__arm_vqshrntq_m_n_u32): Remove. (__arm_vqshrntq_m_n_u16): Remove. (__arm_vqshrnbq): Remove. (__arm_vqshrntq): Remove. (__arm_vqshrnbq_m): Remove. (__arm_vqshrntq_m): Remove. (vqrshrnbq): Remove. (vqrshrntq): Remove. (vqrshrnbq_m): Remove. (vqrshrntq_m): Remove. (vqrshrnbq_n_s16): Remove. (vqrshrnbq_n_u16): Remove. (vqrshrnbq_n_s32): Remove. (vqrshrnbq_n_u32): Remove. (vqrshrntq_n_s16): Remove. (vqrshrntq_n_u16): Remove. (vqrshrntq_n_s32): Remove. (vqrshrntq_n_u32): Remove. (vqrshrnbq_m_n_s32): Remove. (vqrshrnbq_m_n_s16): Remove. (vqrshrnbq_m_n_u32): Remove. (vqrshrnbq_m_n_u16): Remove. (vqrshrntq_m_n_s32): Remove. (vqrshrntq_m_n_s16): Remove. (vqrshrntq_m_n_u32): Remove. (vqrshrntq_m_n_u16): Remove. (__arm_vqrshrnbq_n_s16): Remove. (__arm_vqrshrnbq_n_u16): Remove. (__arm_vqrshrnbq_n_s32): Remove. (__arm_vqrshrnbq_n_u32): Remove. (__arm_vqrshrntq_n_s16): Remove. (__arm_vqrshrntq_n_u16): Remove. (__arm_vqrshrntq_n_s32): Remove. (__arm_vqrshrntq_n_u32): Remove. (__arm_vqrshrnbq_m_n_s32): Remove. (__arm_vqrshrnbq_m_n_s16): Remove. (__arm_vqrshrnbq_m_n_u32): Remove. (__arm_vqrshrnbq_m_n_u16): Remove. (__arm_vqrshrntq_m_n_s32): Remove. (__arm_vqrshrntq_m_n_s16): Remove. (__arm_vqrshrntq_m_n_u32): Remove. (__arm_vqrshrntq_m_n_u16): Remove. (__arm_vqrshrnbq): Remove. (__arm_vqrshrntq): Remove. (__arm_vqrshrnbq_m): Remove. (__arm_vqrshrntq_m): Remove.
2023-05-05  arm: [MVE intrinsics] factorize vshrntq vshrnbq vrshrnbq vrshrntq vqshrnbq vqshrntq vqrshrnbq vqrshrntq  (Christophe Lyon; 2 files, -242/+85)

Factorize vqshrnbq, vqshrntq, vqrshrnbq, vqrshrntq, vshrntq, vshrnbq, vrshrnbq and vrshrntq so that they use the same pattern. Introduce the <isu> iterator for *shrn* so that we can use the same pattern despite the different "s", "u" and "i" suffixes.

2022-09-08  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
    * config/arm/iterators.md (MVE_SHRN_N, MVE_SHRN_M_N): New.
    (mve_insn): Add vqrshrnb, vqrshrnt, vqshrnb, vqshrnt, vrshrnb, vrshrnt, vshrnb, vshrnt.
    (isu): New.
    * config/arm/mve.md (mve_vqrshrnbq_n_<supf><mode>) (mve_vqrshrntq_n_<supf><mode>, mve_vqshrnbq_n_<supf><mode>) (mve_vqshrntq_n_<supf><mode>, mve_vrshrnbq_n_<supf><mode>) (mve_vrshrntq_n_<supf><mode>, mve_vshrnbq_n_<supf><mode>) (mve_vshrntq_n_<supf><mode>): Merge into ...
    (@mve_<mve_insn>q_n_<supf><mode>): ... this.
    (mve_vqrshrnbq_m_n_<supf><mode>, mve_vqrshrntq_m_n_<supf><mode>) (mve_vqshrnbq_m_n_<supf><mode>, mve_vqshrntq_m_n_<supf><mode>) (mve_vrshrnbq_m_n_<supf><mode>, mve_vrshrntq_m_n_<supf><mode>) (mve_vshrnbq_m_n_<supf><mode>, mve_vshrntq_m_n_<supf><mode>): Merge into ...
    (@mve_<mve_insn>q_m_n_<supf><mode>): ... this.
2023-05-05  arm: [MVE intrinsics] add binary_rshift_narrow shape  (Christophe Lyon; 2 files, -0/+48)

This patch adds the binary_rshift_narrow shape description.

2022-09-08  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
    * config/arm/arm-mve-builtins-shapes.cc (binary_rshift_narrow): New.
    * config/arm/arm-mve-builtins-shapes.h (binary_rshift_narrow): New.
2023-05-05  arm: [MVE intrinsics] rework vmaxq vminq  (Christophe Lyon; 4 files, -628/+15)
Implement vmaxq and vminq using the new MVE builtins framework. 2022-09-08 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/arm/arm-mve-builtins-base.cc (FUNCTION_WITH_RTX_M_NO_F): New. (vmaxq, vminq): New. * config/arm/arm-mve-builtins-base.def (vmaxq, vminq): New. * config/arm/arm-mve-builtins-base.h (vmaxq, vminq): New. * config/arm/arm_mve.h (vminq): Remove. (vmaxq): Remove. (vmaxq_m): Remove. (vminq_m): Remove. (vminq_x): Remove. (vmaxq_x): Remove. (vminq_u8): Remove. (vmaxq_u8): Remove. (vminq_s8): Remove. (vmaxq_s8): Remove. (vminq_u16): Remove. (vmaxq_u16): Remove. (vminq_s16): Remove. (vmaxq_s16): Remove. (vminq_u32): Remove. (vmaxq_u32): Remove. (vminq_s32): Remove. (vmaxq_s32): Remove. (vmaxq_m_s8): Remove. (vmaxq_m_s32): Remove. (vmaxq_m_s16): Remove. (vmaxq_m_u8): Remove. (vmaxq_m_u32): Remove. (vmaxq_m_u16): Remove. (vminq_m_s8): Remove. (vminq_m_s32): Remove. (vminq_m_s16): Remove. (vminq_m_u8): Remove. (vminq_m_u32): Remove. (vminq_m_u16): Remove. (vminq_x_s8): Remove. (vminq_x_s16): Remove. (vminq_x_s32): Remove. (vminq_x_u8): Remove. (vminq_x_u16): Remove. (vminq_x_u32): Remove. (vmaxq_x_s8): Remove. (vmaxq_x_s16): Remove. (vmaxq_x_s32): Remove. (vmaxq_x_u8): Remove. (vmaxq_x_u16): Remove. (vmaxq_x_u32): Remove. (__arm_vminq_u8): Remove. (__arm_vmaxq_u8): Remove. (__arm_vminq_s8): Remove. (__arm_vmaxq_s8): Remove. (__arm_vminq_u16): Remove. (__arm_vmaxq_u16): Remove. (__arm_vminq_s16): Remove. (__arm_vmaxq_s16): Remove. (__arm_vminq_u32): Remove. (__arm_vmaxq_u32): Remove. (__arm_vminq_s32): Remove. (__arm_vmaxq_s32): Remove. (__arm_vmaxq_m_s8): Remove. (__arm_vmaxq_m_s32): Remove. (__arm_vmaxq_m_s16): Remove. (__arm_vmaxq_m_u8): Remove. (__arm_vmaxq_m_u32): Remove. (__arm_vmaxq_m_u16): Remove. (__arm_vminq_m_s8): Remove. (__arm_vminq_m_s32): Remove. (__arm_vminq_m_s16): Remove. (__arm_vminq_m_u8): Remove. (__arm_vminq_m_u32): Remove. (__arm_vminq_m_u16): Remove. (__arm_vminq_x_s8): Remove. (__arm_vminq_x_s16): Remove. (__arm_vminq_x_s32): Remove. (__arm_vminq_x_u8): Remove. (__arm_vminq_x_u16): Remove. (__arm_vminq_x_u32): Remove. (__arm_vmaxq_x_s8): Remove. (__arm_vmaxq_x_s16): Remove. (__arm_vmaxq_x_s32): Remove. (__arm_vmaxq_x_u8): Remove. (__arm_vmaxq_x_u16): Remove. (__arm_vmaxq_x_u32): Remove. (__arm_vminq): Remove. (__arm_vmaxq): Remove. (__arm_vmaxq_m): Remove. (__arm_vminq_m): Remove. (__arm_vminq_x): Remove. (__arm_vmaxq_x): Remove.
2023-05-05  arm: [MVE intrinsics] factorize vmaxq vminq  (Christophe Lyon; 2 files, -39/+16)

Factorize vmaxq and vminq so that they use the same pattern.

2022-09-08  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
    * config/arm/iterators.md (MAX_MIN_SU): New.
    (max_min_su_str): New.
    (max_min_supf): New.
    * config/arm/mve.md (mve_vmaxq_s<mode>, mve_vmaxq_u<mode>) (mve_vminq_s<mode>, mve_vminq_u<mode>): Merge into ...
    (mve_<max_min_su_str>q_<max_min_supf><mode>): ... this.
2023-05-05  arm: [MVE intrinsics] rework vqshlq vshlq  (Christophe Lyon; 4 files, -1492/+19)
Implement vqshlq, vshlq using the new MVE builtins framework. 2022-09-08 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/arm/arm-mve-builtins-base.cc (FUNCTION_WITH_M_N_R): New. (vqshlq, vshlq): New. * config/arm/arm-mve-builtins-base.def (vqshlq, vshlq): New. * config/arm/arm-mve-builtins-base.h (vqshlq, vshlq): New. * config/arm/arm_mve.h (vshlq): Remove. (vshlq_r): Remove. (vshlq_n): Remove. (vshlq_m_r): Remove. (vshlq_m): Remove. (vshlq_m_n): Remove. (vshlq_x): Remove. (vshlq_x_n): Remove. (vshlq_s8): Remove. (vshlq_s16): Remove. (vshlq_s32): Remove. (vshlq_u8): Remove. (vshlq_u16): Remove. (vshlq_u32): Remove. (vshlq_r_u8): Remove. (vshlq_n_u8): Remove. (vshlq_r_s8): Remove. (vshlq_n_s8): Remove. (vshlq_r_u16): Remove. (vshlq_n_u16): Remove. (vshlq_r_s16): Remove. (vshlq_n_s16): Remove. (vshlq_r_u32): Remove. (vshlq_n_u32): Remove. (vshlq_r_s32): Remove. (vshlq_n_s32): Remove. (vshlq_m_r_u8): Remove. (vshlq_m_r_s8): Remove. (vshlq_m_r_u16): Remove. (vshlq_m_r_s16): Remove. (vshlq_m_r_u32): Remove. (vshlq_m_r_s32): Remove. (vshlq_m_u8): Remove. (vshlq_m_s8): Remove. (vshlq_m_u16): Remove. (vshlq_m_s16): Remove. (vshlq_m_u32): Remove. (vshlq_m_s32): Remove. (vshlq_m_n_s8): Remove. (vshlq_m_n_s32): Remove. (vshlq_m_n_s16): Remove. (vshlq_m_n_u8): Remove. (vshlq_m_n_u32): Remove. (vshlq_m_n_u16): Remove. (vshlq_x_s8): Remove. (vshlq_x_s16): Remove. (vshlq_x_s32): Remove. (vshlq_x_u8): Remove. (vshlq_x_u16): Remove. (vshlq_x_u32): Remove. (vshlq_x_n_s8): Remove. (vshlq_x_n_s16): Remove. (vshlq_x_n_s32): Remove. (vshlq_x_n_u8): Remove. (vshlq_x_n_u16): Remove. (vshlq_x_n_u32): Remove. (__arm_vshlq_s8): Remove. (__arm_vshlq_s16): Remove. (__arm_vshlq_s32): Remove. (__arm_vshlq_u8): Remove. (__arm_vshlq_u16): Remove. (__arm_vshlq_u32): Remove. (__arm_vshlq_r_u8): Remove. (__arm_vshlq_n_u8): Remove. (__arm_vshlq_r_s8): Remove. (__arm_vshlq_n_s8): Remove. (__arm_vshlq_r_u16): Remove. (__arm_vshlq_n_u16): Remove. (__arm_vshlq_r_s16): Remove. (__arm_vshlq_n_s16): Remove. (__arm_vshlq_r_u32): Remove. (__arm_vshlq_n_u32): Remove. (__arm_vshlq_r_s32): Remove. (__arm_vshlq_n_s32): Remove. (__arm_vshlq_m_r_u8): Remove. (__arm_vshlq_m_r_s8): Remove. (__arm_vshlq_m_r_u16): Remove. (__arm_vshlq_m_r_s16): Remove. (__arm_vshlq_m_r_u32): Remove. (__arm_vshlq_m_r_s32): Remove. (__arm_vshlq_m_u8): Remove. (__arm_vshlq_m_s8): Remove. (__arm_vshlq_m_u16): Remove. (__arm_vshlq_m_s16): Remove. (__arm_vshlq_m_u32): Remove. (__arm_vshlq_m_s32): Remove. (__arm_vshlq_m_n_s8): Remove. (__arm_vshlq_m_n_s32): Remove. (__arm_vshlq_m_n_s16): Remove. (__arm_vshlq_m_n_u8): Remove. (__arm_vshlq_m_n_u32): Remove. (__arm_vshlq_m_n_u16): Remove. (__arm_vshlq_x_s8): Remove. (__arm_vshlq_x_s16): Remove. (__arm_vshlq_x_s32): Remove. (__arm_vshlq_x_u8): Remove. (__arm_vshlq_x_u16): Remove. (__arm_vshlq_x_u32): Remove. (__arm_vshlq_x_n_s8): Remove. (__arm_vshlq_x_n_s16): Remove. (__arm_vshlq_x_n_s32): Remove. (__arm_vshlq_x_n_u8): Remove. (__arm_vshlq_x_n_u16): Remove. (__arm_vshlq_x_n_u32): Remove. (__arm_vshlq): Remove. (__arm_vshlq_r): Remove. (__arm_vshlq_n): Remove. (__arm_vshlq_m_r): Remove. (__arm_vshlq_m): Remove. (__arm_vshlq_m_n): Remove. (__arm_vshlq_x): Remove. (__arm_vshlq_x_n): Remove. (vqshlq): Remove. (vqshlq_r): Remove. (vqshlq_n): Remove. (vqshlq_m_r): Remove. (vqshlq_m_n): Remove. (vqshlq_m): Remove. (vqshlq_u8): Remove. (vqshlq_r_u8): Remove. (vqshlq_n_u8): Remove. (vqshlq_s8): Remove. (vqshlq_r_s8): Remove. (vqshlq_n_s8): Remove. (vqshlq_u16): Remove. (vqshlq_r_u16): Remove. (vqshlq_n_u16): Remove. 
(vqshlq_s16): Remove. (vqshlq_r_s16): Remove. (vqshlq_n_s16): Remove. (vqshlq_u32): Remove. (vqshlq_r_u32): Remove. (vqshlq_n_u32): Remove. (vqshlq_s32): Remove. (vqshlq_r_s32): Remove. (vqshlq_n_s32): Remove. (vqshlq_m_r_u8): Remove. (vqshlq_m_r_s8): Remove. (vqshlq_m_r_u16): Remove. (vqshlq_m_r_s16): Remove. (vqshlq_m_r_u32): Remove. (vqshlq_m_r_s32): Remove. (vqshlq_m_n_s8): Remove. (vqshlq_m_n_s32): Remove. (vqshlq_m_n_s16): Remove. (vqshlq_m_n_u8): Remove. (vqshlq_m_n_u32): Remove. (vqshlq_m_n_u16): Remove. (vqshlq_m_s8): Remove. (vqshlq_m_s32): Remove. (vqshlq_m_s16): Remove. (vqshlq_m_u8): Remove. (vqshlq_m_u32): Remove. (vqshlq_m_u16): Remove. (__arm_vqshlq_u8): Remove. (__arm_vqshlq_r_u8): Remove. (__arm_vqshlq_n_u8): Remove. (__arm_vqshlq_s8): Remove. (__arm_vqshlq_r_s8): Remove. (__arm_vqshlq_n_s8): Remove. (__arm_vqshlq_u16): Remove. (__arm_vqshlq_r_u16): Remove. (__arm_vqshlq_n_u16): Remove. (__arm_vqshlq_s16): Remove. (__arm_vqshlq_r_s16): Remove. (__arm_vqshlq_n_s16): Remove. (__arm_vqshlq_u32): Remove. (__arm_vqshlq_r_u32): Remove. (__arm_vqshlq_n_u32): Remove. (__arm_vqshlq_s32): Remove. (__arm_vqshlq_r_s32): Remove. (__arm_vqshlq_n_s32): Remove. (__arm_vqshlq_m_r_u8): Remove. (__arm_vqshlq_m_r_s8): Remove. (__arm_vqshlq_m_r_u16): Remove. (__arm_vqshlq_m_r_s16): Remove. (__arm_vqshlq_m_r_u32): Remove. (__arm_vqshlq_m_r_s32): Remove. (__arm_vqshlq_m_n_s8): Remove. (__arm_vqshlq_m_n_s32): Remove. (__arm_vqshlq_m_n_s16): Remove. (__arm_vqshlq_m_n_u8): Remove. (__arm_vqshlq_m_n_u32): Remove. (__arm_vqshlq_m_n_u16): Remove. (__arm_vqshlq_m_s8): Remove. (__arm_vqshlq_m_s32): Remove. (__arm_vqshlq_m_s16): Remove. (__arm_vqshlq_m_u8): Remove. (__arm_vqshlq_m_u32): Remove. (__arm_vqshlq_m_u16): Remove. (__arm_vqshlq): Remove. (__arm_vqshlq_r): Remove. (__arm_vqshlq_n): Remove. (__arm_vqshlq_m_r): Remove. (__arm_vqshlq_m_n): Remove. (__arm_vqshlq_m): Remove.
2023-05-05  arm: [MVE intrinsics] add unspec_mve_function_exact_insn_vshl  (Christophe Lyon; 1 file, -0/+150)

Introduce a function that will be used to build vshl intrinsics. They are special because they have to handle MODE_r.

2022-09-08  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
    * config/arm/arm-mve-builtins-functions.h (class unspec_mve_function_exact_insn_vshl): New.
2023-05-05  arm: [MVE intrinsics] add binary_lshift_r shape  (Christophe Lyon; 2 files, -0/+42)

This patch adds the binary_lshift_r shape description.

2022-09-08  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
    * config/arm/arm-mve-builtins-shapes.cc (binary_lshift_r): New.
    * config/arm/arm-mve-builtins-shapes.h (binary_lshift_r): New.
2023-05-05  arm: [MVE intrinsics] add support for MODE_r  (Christophe Lyon; 2 files, -2/+7)

A few intrinsics have an additional mode (MODE_r), which does not always support the same set of predicates as MODE_none and MODE_n. For vqshlq they are the same, but for vshlq they are not. Indeed we have:

    vqshlq    vqshlq_m
    vqshlq_n  vqshlq_m_n
    vqshlq_r  vqshlq_m_r

    vshlq     vshlq_m     vshlq_x
    vshlq_n   vshlq_m_n   vshlq_x_n
    vshlq_r   vshlq_m_r

This patch adds support for it.

2022-09-08  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
    * config/arm/arm-mve-builtins.cc (has_inactive_argument) (finish_opt_n_resolution): Handle MODE_r.
    * config/arm/arm-mve-builtins.def (r): New mode.
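A hedged sketch of what an _r-mode (register shift amount) intrinsic looks like at the source level; the example below is illustrative, not from the commit, and assumes an MVE-enabled compile (e.g. -march=armv8.1-m.main+mve):

/* MODE_r: the scalar shift amount lives in a general-purpose
   register instead of being an immediate or a vector.  */
#include <arm_mve.h>

int8x16_t
shift_all_lanes (int8x16_t v, int32_t amount)
{
  return vshlq_r_s8 (v, amount);
}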
2023-05-05  arm: [MVE intrinsics] add binary_lshift shape  (Christophe Lyon; 2 files, -0/+58)

This patch adds the binary_lshift shape description.

2022-09-08  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
    * config/arm/arm-mve-builtins-shapes.cc (binary_lshift): New.
    * config/arm/arm-mve-builtins-shapes.h (binary_lshift): New.
2023-05-05  arm: [MVE intrinsics] rework vabdq  (Christophe Lyon; 4 files, -431/+13)
Implement vabdq using the new MVE builtins framework. 2022-09-08 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/arm/arm-mve-builtins-base.cc (FUNCTION_WITHOUT_N): New. (vabdq): New. * config/arm/arm-mve-builtins-base.def (vabdq): New. * config/arm/arm-mve-builtins-base.h (vabdq): New. * config/arm/arm_mve.h (vabdq): Remove. (vabdq_m): Remove. (vabdq_x): Remove. (vabdq_u8): Remove. (vabdq_s8): Remove. (vabdq_u16): Remove. (vabdq_s16): Remove. (vabdq_u32): Remove. (vabdq_s32): Remove. (vabdq_f16): Remove. (vabdq_f32): Remove. (vabdq_m_s8): Remove. (vabdq_m_s32): Remove. (vabdq_m_s16): Remove. (vabdq_m_u8): Remove. (vabdq_m_u32): Remove. (vabdq_m_u16): Remove. (vabdq_m_f32): Remove. (vabdq_m_f16): Remove. (vabdq_x_s8): Remove. (vabdq_x_s16): Remove. (vabdq_x_s32): Remove. (vabdq_x_u8): Remove. (vabdq_x_u16): Remove. (vabdq_x_u32): Remove. (vabdq_x_f16): Remove. (vabdq_x_f32): Remove. (__arm_vabdq_u8): Remove. (__arm_vabdq_s8): Remove. (__arm_vabdq_u16): Remove. (__arm_vabdq_s16): Remove. (__arm_vabdq_u32): Remove. (__arm_vabdq_s32): Remove. (__arm_vabdq_m_s8): Remove. (__arm_vabdq_m_s32): Remove. (__arm_vabdq_m_s16): Remove. (__arm_vabdq_m_u8): Remove. (__arm_vabdq_m_u32): Remove. (__arm_vabdq_m_u16): Remove. (__arm_vabdq_x_s8): Remove. (__arm_vabdq_x_s16): Remove. (__arm_vabdq_x_s32): Remove. (__arm_vabdq_x_u8): Remove. (__arm_vabdq_x_u16): Remove. (__arm_vabdq_x_u32): Remove. (__arm_vabdq_f16): Remove. (__arm_vabdq_f32): Remove. (__arm_vabdq_m_f32): Remove. (__arm_vabdq_m_f16): Remove. (__arm_vabdq_x_f16): Remove. (__arm_vabdq_x_f32): Remove. (__arm_vabdq): Remove. (__arm_vabdq_m): Remove. (__arm_vabdq_x): Remove.
2023-05-05  arm: [MVE intrinsics] factorize vabdq  (Christophe Lyon; 2 files, -22/+12)

2022-09-08  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
    * config/arm/iterators.md (MVE_FP_M_BINARY): Add vabdq.
    (MVE_FP_VABDQ_ONLY): New.
    (mve_insn): Add vabd.
    * config/arm/mve.md (mve_vabdq_f<mode>): Move into ...
    (@mve_<mve_insn>q_f<mode>): ... this.
    (mve_vabdq_m_f<mode>): Remove.
2023-05-05  arm: [MVE intrinsics] rework vqrdmulhq  (Christophe Lyon; 4 files, -213/+3)
Implement vqrdmulhq using the new MVE builtins framework. 2022-09-08 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/arm/arm-mve-builtins-base.cc (vqrdmulhq): New. * config/arm/arm-mve-builtins-base.def (vqrdmulhq): New. * config/arm/arm-mve-builtins-base.h (vqrdmulhq): New. * config/arm/arm_mve.h (vqrdmulhq): Remove. (vqrdmulhq_m): Remove. (vqrdmulhq_s8): Remove. (vqrdmulhq_n_s8): Remove. (vqrdmulhq_s16): Remove. (vqrdmulhq_n_s16): Remove. (vqrdmulhq_s32): Remove. (vqrdmulhq_n_s32): Remove. (vqrdmulhq_m_n_s8): Remove. (vqrdmulhq_m_n_s32): Remove. (vqrdmulhq_m_n_s16): Remove. (vqrdmulhq_m_s8): Remove. (vqrdmulhq_m_s32): Remove. (vqrdmulhq_m_s16): Remove. (__arm_vqrdmulhq_s8): Remove. (__arm_vqrdmulhq_n_s8): Remove. (__arm_vqrdmulhq_s16): Remove. (__arm_vqrdmulhq_n_s16): Remove. (__arm_vqrdmulhq_s32): Remove. (__arm_vqrdmulhq_n_s32): Remove. (__arm_vqrdmulhq_m_n_s8): Remove. (__arm_vqrdmulhq_m_n_s32): Remove. (__arm_vqrdmulhq_m_n_s16): Remove. (__arm_vqrdmulhq_m_s8): Remove. (__arm_vqrdmulhq_m_s32): Remove. (__arm_vqrdmulhq_m_s16): Remove. (__arm_vqrdmulhq): Remove. (__arm_vqrdmulhq_m): Remove.
2023-05-05  arm: [MVE intrinsics] factorize vqshlq vshlq  (Christophe Lyon; 3 files, -81/+51)

Factorize vqshlq and vshlq so that they use the same pattern.

2022-09-08  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
    * config/arm/iterators.md (MVE_SHIFT_M_R, MVE_SHIFT_M_N) (MVE_SHIFT_N, MVE_SHIFT_R): New.
    (mve_insn): Add vqshl, vshl.
    * config/arm/mve.md (mve_vqshlq_n_<supf><mode>) (mve_vshlq_n_<supf><mode>): Merge into ...
    (@mve_<mve_insn>q_n_<supf><mode>): ... this.
    (mve_vqshlq_r_<supf><mode>, mve_vshlq_r_<supf><mode>): Merge into ...
    (@mve_<mve_insn>q_r_<supf><mode>): ... this.
    (mve_vqshlq_m_r_<supf><mode>, mve_vshlq_m_r_<supf><mode>): Merge into ...
    (@mve_<mve_insn>q_m_r_<supf><mode>): ... this.
    (mve_vqshlq_m_n_<supf><mode>, mve_vshlq_m_n_<supf><mode>): Merge into ...
    (@mve_<mve_insn>q_m_n_<supf><mode>): ... this.
    * config/arm/vec-common.md (mve_vshlq_<supf><mode>): Transform into ...
    (@mve_<mve_insn>q_<supf><mode>): ... this.
2023-05-05  arm: [MVE intrinsics] rework vrshlq vqrshlq  (Christophe Lyon; 5 files, -952/+9)
Implement vrshlq, vqrshlq using the new MVE builtins framework. 2022-09-08 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/arm/arm-mve-builtins-base.cc (vqrshlq, vrshlq): New. * config/arm/arm-mve-builtins-base.def (vqrshlq, vrshlq): New. * config/arm/arm-mve-builtins-base.h (vqrshlq, vrshlq): New. * config/arm/arm-mve-builtins.cc (has_inactive_argument): Handle vqrshlq, vrshlq. * config/arm/arm_mve.h (vrshlq): Remove. (vrshlq_m_n): Remove. (vrshlq_m): Remove. (vrshlq_x): Remove. (vrshlq_u8): Remove. (vrshlq_n_u8): Remove. (vrshlq_s8): Remove. (vrshlq_n_s8): Remove. (vrshlq_u16): Remove. (vrshlq_n_u16): Remove. (vrshlq_s16): Remove. (vrshlq_n_s16): Remove. (vrshlq_u32): Remove. (vrshlq_n_u32): Remove. (vrshlq_s32): Remove. (vrshlq_n_s32): Remove. (vrshlq_m_n_u8): Remove. (vrshlq_m_n_s8): Remove. (vrshlq_m_n_u16): Remove. (vrshlq_m_n_s16): Remove. (vrshlq_m_n_u32): Remove. (vrshlq_m_n_s32): Remove. (vrshlq_m_s8): Remove. (vrshlq_m_s32): Remove. (vrshlq_m_s16): Remove. (vrshlq_m_u8): Remove. (vrshlq_m_u32): Remove. (vrshlq_m_u16): Remove. (vrshlq_x_s8): Remove. (vrshlq_x_s16): Remove. (vrshlq_x_s32): Remove. (vrshlq_x_u8): Remove. (vrshlq_x_u16): Remove. (vrshlq_x_u32): Remove. (__arm_vrshlq_u8): Remove. (__arm_vrshlq_n_u8): Remove. (__arm_vrshlq_s8): Remove. (__arm_vrshlq_n_s8): Remove. (__arm_vrshlq_u16): Remove. (__arm_vrshlq_n_u16): Remove. (__arm_vrshlq_s16): Remove. (__arm_vrshlq_n_s16): Remove. (__arm_vrshlq_u32): Remove. (__arm_vrshlq_n_u32): Remove. (__arm_vrshlq_s32): Remove. (__arm_vrshlq_n_s32): Remove. (__arm_vrshlq_m_n_u8): Remove. (__arm_vrshlq_m_n_s8): Remove. (__arm_vrshlq_m_n_u16): Remove. (__arm_vrshlq_m_n_s16): Remove. (__arm_vrshlq_m_n_u32): Remove. (__arm_vrshlq_m_n_s32): Remove. (__arm_vrshlq_m_s8): Remove. (__arm_vrshlq_m_s32): Remove. (__arm_vrshlq_m_s16): Remove. (__arm_vrshlq_m_u8): Remove. (__arm_vrshlq_m_u32): Remove. (__arm_vrshlq_m_u16): Remove. (__arm_vrshlq_x_s8): Remove. (__arm_vrshlq_x_s16): Remove. (__arm_vrshlq_x_s32): Remove. (__arm_vrshlq_x_u8): Remove. (__arm_vrshlq_x_u16): Remove. (__arm_vrshlq_x_u32): Remove. (__arm_vrshlq): Remove. (__arm_vrshlq_m_n): Remove. (__arm_vrshlq_m): Remove. (__arm_vrshlq_x): Remove. (vqrshlq): Remove. (vqrshlq_m_n): Remove. (vqrshlq_m): Remove. (vqrshlq_u8): Remove. (vqrshlq_n_u8): Remove. (vqrshlq_s8): Remove. (vqrshlq_n_s8): Remove. (vqrshlq_u16): Remove. (vqrshlq_n_u16): Remove. (vqrshlq_s16): Remove. (vqrshlq_n_s16): Remove. (vqrshlq_u32): Remove. (vqrshlq_n_u32): Remove. (vqrshlq_s32): Remove. (vqrshlq_n_s32): Remove. (vqrshlq_m_n_u8): Remove. (vqrshlq_m_n_s8): Remove. (vqrshlq_m_n_u16): Remove. (vqrshlq_m_n_s16): Remove. (vqrshlq_m_n_u32): Remove. (vqrshlq_m_n_s32): Remove. (vqrshlq_m_s8): Remove. (vqrshlq_m_s32): Remove. (vqrshlq_m_s16): Remove. (vqrshlq_m_u8): Remove. (vqrshlq_m_u32): Remove. (vqrshlq_m_u16): Remove. (__arm_vqrshlq_u8): Remove. (__arm_vqrshlq_n_u8): Remove. (__arm_vqrshlq_s8): Remove. (__arm_vqrshlq_n_s8): Remove. (__arm_vqrshlq_u16): Remove. (__arm_vqrshlq_n_u16): Remove. (__arm_vqrshlq_s16): Remove. (__arm_vqrshlq_n_s16): Remove. (__arm_vqrshlq_u32): Remove. (__arm_vqrshlq_n_u32): Remove. (__arm_vqrshlq_s32): Remove. (__arm_vqrshlq_n_s32): Remove. (__arm_vqrshlq_m_n_u8): Remove. (__arm_vqrshlq_m_n_s8): Remove. (__arm_vqrshlq_m_n_u16): Remove. (__arm_vqrshlq_m_n_s16): Remove. (__arm_vqrshlq_m_n_u32): Remove. (__arm_vqrshlq_m_n_s32): Remove. (__arm_vqrshlq_m_s8): Remove. (__arm_vqrshlq_m_s32): Remove. (__arm_vqrshlq_m_s16): Remove. (__arm_vqrshlq_m_u8): Remove. 
(__arm_vqrshlq_m_u32): Remove. (__arm_vqrshlq_m_u16): Remove. (__arm_vqrshlq): Remove. (__arm_vqrshlq_m_n): Remove. (__arm_vqrshlq_m): Remove.
2023-05-05  arm: [MVE intrinsics] factorize vqrshlq vrshlq  (Christophe Lyon; 2 files, -39/+24)

Factorize vqrshlq and vrshlq so that they use the same pattern.

2022-09-08  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
    * config/arm/iterators.md (MVE_RSHIFT_M_N, MVE_RSHIFT_N): New.
    (mve_insn): Add vqrshl, vrshl.
    * config/arm/mve.md (mve_vqrshlq_n_<supf><mode>) (mve_vrshlq_n_<supf><mode>): Merge into ...
    (@mve_<mve_insn>q_n_<supf><mode>): ... this.
    (mve_vqrshlq_m_n_<supf><mode>, mve_vrshlq_m_n_<supf><mode>): Merge into ...
    (@mve_<mve_insn>q_m_n_<supf><mode>): ... this.
2023-05-05  arm: [MVE intrinsics] add binary_round_lshift shape  (Christophe Lyon; 2 files, -0/+62)

This patch adds the binary_round_lshift shape description.

2022-09-08  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
    * config/arm/arm-mve-builtins-shapes.cc (binary_round_lshift): New.
    * config/arm/arm-mve-builtins-shapes.h (binary_round_lshift): New.
2023-05-05  RISC-V: Fix PR109615  (Juzhe-Zhong; 4 files, -66/+56)

This patch fixes the following case:

void
f (int8_t *restrict in, int8_t *restrict out, int n, int m, int cond)
{
  size_t vl = 101;
  if (cond)
    vl = m * 2;
  else
    vl = m * 2 * vl;
  for (size_t i = 0; i < n; i++)
    {
      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
      __riscv_vse8_v_i8mf8 (out + i, v, vl);

      vbool64_t mask = __riscv_vlm_v_b64 (in + i + 100, vl);

      vint8mf8_t v2 = __riscv_vle8_v_i8mf8_tumu (mask, v, in + i + 100, vl);
      __riscv_vse8_v_i8mf8 (out + i + 100, v2, vl);
    }
  for (size_t i = 0; i < n; i++)
    {
      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + 300, vl);
      __riscv_vse8_v_i8mf8 (out + i + 300, v, vl);
    }
}

The value of "vl" comes from different blocks, so it is wrapped as a PHI node of each block. In the first loop, the "vl" source is a PHI node from bb 4. In the second loop, the "vl" source is a PHI node from bb 5. Since bb 5 is dominated by bb 4, the PHI input of "vl" in the second loop is the PHI node of "vl" in bb 4. So when the two "vl" PHI nodes are both degenerate PHI nodes (phi->num_inputs () == 1) and their only inputs are the same, it is safe for us to consider them compatible.

This patch only optimizes degenerate PHIs, since that is a safe and simple optimization. Non-degenerate PHIs are considered incompatible unless the PHIs are the same in RTL_SSA. TODO: non-degenerate PHIs are complicated; we can support them when necessary in the future.

Before this patch:
...
.L2:
    addi a4,a1,100
    add t1,a0,a2
    mv t0,a0
    beq a2,zero,.L1
    vsetvli zero,a3,e8,mf8,tu,mu
.L4:
    addi a6,t0,100
    addi a7,a4,-100
    vle8.v v1,0(t0)
    addi t0,t0,1
    vse8.v v1,0(a7)
    vlm.v v0,0(a6)
    vle8.v v1,0(a6),v0.t
    vse8.v v1,0(a4)
    addi a4,a4,1
    bne t0,t1,.L4
    addi a0,a0,300
    addi a1,a1,300
    add a2,a0,a2
    vsetvli zero,a3,e8,mf8,ta,ma
.L5:
    vle8.v v2,0(a0)
    addi a0,a0,1
    vse8.v v2,0(a1)
    addi a1,a1,1
    bne a2,a0,.L5
.L1:
    ret

After this patch:
...
.L2:
    addi a4,a1,100
    add t1,a0,a2
    mv t0,a0
    beq a2,zero,.L1
    vsetvli zero,a3,e8,mf8,tu,mu
.L4:
    addi a6,t0,100
    addi a7,a4,-100
    vle8.v v1,0(t0)
    addi t0,t0,1
    vse8.v v1,0(a7)
    vlm.v v0,0(a6)
    vle8.v v1,0(a6),v0.t
    vse8.v v1,0(a4)
    addi a4,a4,1
    bne t0,t1,.L4
    addi a0,a0,300
    addi a1,a1,300
    add a2,a0,a2
.L5:
    vle8.v v2,0(a0)
    addi a0,a0,1
    vse8.v v2,0(a1)
    addi a1,a1,1
    bne a2,a0,.L5
.L1:
    ret

    PR target/109615

gcc/ChangeLog:

    * config/riscv/riscv-vsetvl.cc (avl_info::multiple_source_equal_p): Add degenerate PHI optimization.

gcc/testsuite/ChangeLog:

    * gcc.target/riscv/rvv/vsetvl/avl_single-74.c: Adapt testcase.
    * gcc.target/riscv/rvv/vsetvl/vsetvl-11.c: Ditto.
    * gcc.target/riscv/rvv/vsetvl/pr109615.c: New test.
2023-05-05  i386: Rename index_register_operand predicate to register_no_SP_operand  (Uros Bizjak; 2 files, -9/+9)

Rename the index_register_operand predicate to what it really does. No functional change.

gcc/ChangeLog:

    * config/i386/predicates.md (register_no_SP_operand): Rename from index_register_operand.
    (call_register_operand): Update for rename.
    * config/i386/i386.md (*lea<mode>_general_[1234]): Update for rename.
2023-05-05  match.pd: Use splits in makefile and make configurable.  (Tamar Christina; 3 files, -22/+89)

This updates the build system to split up match.pd files into chunks of 10. It also introduces a new flag --with-matchpd-partitions which can be used to change the number of partitions. For the analysis of why 10, please look at the previous patch in the series.

gcc/ChangeLog:

    PR bootstrap/84402
    * Makefile.in (NUM_MATCH_SPLITS, MATCH_SPLITS_SEQ, GIMPLE_MATCH_PD_SEQ_SRC, GIMPLE_MATCH_PD_SEQ_O, GENERIC_MATCH_PD_SEQ_SRC, GENERIC_MATCH_PD_SEQ_O): New.
    (OBJS, MOSTLYCLEANFILES, .PRECIOUS): Use them.
    (s-match): Split into s-generic-match and s-gimple-match.
    * configure.ac (with-matchpd-partitions, DEFAULT_MATCHPD_PARTITIONS): New.
    * configure: Regenerate.
2023-05-05  match.pd: automatically partition *-match.cc files.  (Tamar Christina; 1 file, -36/+190)

Following on from Richi's RFC[1], this is another attempt to split up match.pd into multiple gimple-match and generic-match files. This version is fully automated and requires no human intervention.

First things first, some perf numbers. The following shows the effect of the patch on my desktop doing parallel compilation of gimple-match:

+--------+------------------+--------+------------------+
| splits | rel. improvement | splits | rel. improvement |
+--------+------------------+--------+------------------+
|      1 |            0.00% |     33 |           91.03% |
|      2 |           71.77% |     34 |           84.02% |
|      3 |          100.71% |     35 |           83.42% |
|      4 |          143.08% |     36 |           78.80% |
|      5 |          176.18% |     37 |           74.06% |
|      6 |          174.40% |     38 |           55.76% |
|      7 |          176.62% |     39 |           66.90% |
|      8 |          168.35% |     40 |           18.25% |
|      9 |          189.80% |     41 |           16.55% |
|     10 |          171.77% |     42 |           47.02% |
|     11 |          152.82% |     43 |           15.29% |
|     12 |          112.20% |     44 |           21.63% |
|     13 |          158.57% |     45 |           41.53% |
|     14 |          158.57% |     46 |           21.98% |
|     15 |          152.07% |     47 |          -42.74% |
|     16 |          151.70% |     48 |          -32.62% |
|     17 |          131.52% |     49 |           11.81% |
|     18 |          133.11% |     50 |           34.07% |
|     19 |          137.33% |     51 |            2.71% |
|     20 |          103.83% |     52 |          -22.23% |
|     21 |          132.47% |     53 |           32.30% |
|     22 |          116.52% |     54 |           21.45% |
|     23 |          112.73% |     55 |           40.02% |
|     24 |          111.94% |     56 |           42.83% |
|     25 |          112.73% |     57 |           -9.98% |
|     26 |          104.07% |     58 |           18.01% |
|     27 |          113.27% |     59 |           -4.91% |
|     28 |           96.77% |     60 |           22.94% |
|     29 |           93.42% |     61 |           -3.73% |
|     30 |           87.67% |     62 |          -27.43% |
|     31 |           89.54% |     63 |           -1.05% |
|     32 |           84.42% |     64 |           -5.44% |
+--------+------------------+--------+------------------+

As can be seen, there seems to be a point of diminishing returns in doing splits. This comes from the fact that these match files consume a sizeable amount of headers. At a certain point the parsing overhead of the headers dominates and you start losing in gains.

As such, I've made the default 10 splits per file to allow some room for growth in the future without needing changes to the split amount. Since 5-10 show roughly the same gains, that means we can afford to double the file sizes before we need to up the split amount. This can be controlled by the configure parameter --with-matchpd-partitions=.

At 10 splits the sizes of the files are:

1.2M  gimple-match-1.cc
490K  gimple-match-2.cc
459K  gimple-match-3.cc
462K  gimple-match-4.cc
466K  gimple-match-5.cc
690K  gimple-match-6.cc
517K  gimple-match-7.cc
693K  gimple-match-8.cc
1011K gimple-match-9.cc
490K  gimple-match-10.cc
210K  gimple-match-auto.h

The reason gimple-match-1.cc is so large is that it got allocated a very large function: gimple_simplify_NE_EXPR. Because of these sporadically large functions, the allocation to a split happens based on the amount of data already written to a split instead of just a simple round-robin allocation (though the patch supports that too). This means that once gimple_simplify_NE_EXPR is allocated to gimple-match-1.cc, nothing uses it again until the rest of the files catch up.

To support this split, a new header file *-match-auto.h is generated to allow the individual files to compile separately.

Lastly, for the auto-generated files I use pragmas to silence the unused-predicate warnings instead of the previous Makefile way, because I couldn't find a way to set them without knowing the number of split files beforehand.

Finally, with this change, bootstrap time has dropped 8 minutes on AArch64.

[1] https://gcc.gnu.org/legacy-ml/gcc-patches/2018-04/msg01125.html

gcc/ChangeLog:

    PR bootstrap/84402
    * genmatch.cc (emit_func, SIZED_BASED_CHUNKS, get_out_file): New.
    (decision_tree::gen): Accept list of files instead of single and update to write function definition to header and main file.
    (write_predicate): Likewise.
    (write_header): Emit pragmas and new includes.
    (main): Create file buffers and cleanup.
    (showUsage, write_header_includes): New.
2023-05-05  genmatch: split shared code to gimple-match-exports.cc  (Tamar Christina; 4 files, -1193/+1260)

In preparation for automatically splitting the match.pd files, I split the non-static helper functions that are shared between the match.pd functions off to another file. This file can be compiled in parallel and also allows us to later avoid duplicate-symbol errors.

gcc/ChangeLog:

    PR bootstrap/84402
    * Makefile.in (OBJS): Add gimple-match-exports.o.
    * genmatch.cc (decision_tree::gen): Export gimple_gimplify helpers.
    * gimple-match-head.cc (gimple_simplify, gimple_resimplify1, gimple_resimplify2, gimple_resimplify3, gimple_resimplify4, gimple_resimplify5, constant_for_folding, convert_conditional_op, maybe_resimplify_conditional_op, gimple_match_op::resimplify, maybe_build_generic_op, build_call_internal, maybe_push_res_to_seq, do_valueize, try_conditional_simplification, gimple_extract, gimple_extract_op, canonicalize_code, commutative_binary_op_p, commutative_ternary_op_p, first_commutative_argument, associative_binary_op_p, directly_supported_p, get_conditional_internal_fn): Moved to gimple-match-exports.cc.
    * gimple-match-exports.cc: New file.
2023-05-05  match.pd: CSE the dump output check.  (Tamar Christina; 1 file, -1/+7)

This is a small improvement in QoL codegen for match.pd, saving time by not re-evaluating the condition for printing debug information in every function. There is a small but consistent runtime and compile-time win here. The runtime win comes from not having to do the condition over again, and on Arm platforms we now use the new test-and-branch support for booleans to only have a single instruction here.

gcc/ChangeLog:

    PR bootstrap/84402
    * genmatch.cc (decision_tree::gen, write_predicate): Generate new debug_dump var.
    (dt_simplify::gen_1): Use it.
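A hedged sketch of the shape of the generated-code change (the surrounding names are stand-ins declared locally for illustration, not taken verbatim from genmatch output):

#include <stdbool.h>
#include <stdio.h>

/* Stand-ins for GCC's dump machinery, hypothetical for this sketch.  */
static FILE *dump_file;
static unsigned dump_flags;
#define TDF_FOLDING 0x1

bool
simplify_example (int op)
{
  /* Evaluated once per generated function instead of being re-tested
     at every "applying pattern" site, as the commit describes.  */
  const bool debug_dump = dump_file && (dump_flags & TDF_FOLDING);
  if (op == 0)
    {
      if (debug_dump)
        fprintf (dump_file, "Applying pattern ...\n");
      return true;
    }
  return false;
}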
2023-05-05  match.pd: Remove commented out line pragmas unless -vv is used.  (Tamar Christina; 1 file, -1/+1)

genmatch currently outputs commented-out line directives that have no effect, but that the compiler still has to parse only to discard. They are, however, handy when debugging genmatch output. As such, this moves them behind the -vv flag.

gcc/ChangeLog:

    PR bootstrap/84402
    * genmatch.cc (output_line_directive): Only emit commented directive when -vv.
2023-05-05  match.pd: don't emit label if not needed  (Tamar Christina; 1 file, -8/+22)

This is a small QoL codegen improvement for match.pd to not emit labels when they are not needed. The codegen is nice and there is a small (but consistent) improvement in compile time.

gcc/ChangeLog:

    PR bootstrap/84402
    * genmatch.cc (dt_simplify::gen_1): Only emit labels if used.
2023-05-05  GCN: Silence unused-variable warning  (Tobias Burnus; 1 file, -2/+0)

gcc/ChangeLog:

    * config/gcn/gcn.cc (gcn_vectorize_builtin_vectorized_function): Remove unused in_mode/in_n variables.
2023-05-05  tree-optimization/109735 - conversion for vectorized pointer-diff  (Richard Biener; 1 file, -12/+13)

There's handling in vectorizable_operation for POINTER_DIFF_EXPR requiring conversion of the result of the unsigned operation to a signed type. But that's conditional on the "default" kind of vectorization. This PR shows that the emulated vector path needs it, and I think the masked-operation case will, too (though we might eventually never mask an integral MINUS_EXPR). So the following makes that handling unconditional.

    PR tree-optimization/109735
    * tree-vect-stmts.cc (vectorizable_operation): Perform conversion for POINTER_DIFF_EXPR unconditionally.
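A hedged example (not the PR testcase) of a loop the vectorizer handles via POINTER_DIFF_EXPR; internally the subtraction is done in an unsigned type, and the result must be converted back to the signed difference type on every vectorization path, including the emulated-vector one:

#include <stddef.h>

/* Element-wise pointer differences; p[i] - q[i] has type ptrdiff_t.  */
void
f (ptrdiff_t *restrict out, char **restrict p, char **restrict q, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = p[i] - q[i];
}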
2023-05-05  i386: Introduce mulv2si3 instruction  (Uros Bizjak; 2 files, -0/+76)

For SSE2 targets the expander unpacks the input elements into the correct position in the V4SI vector and emits the PMULUDQ instruction. The output elements are then shuffled back to their positions in the V2SI vector. For SSE4 targets the PMULLD instruction is emitted directly.

gcc/ChangeLog:

    * config/i386/mmx.md (mulv2si3): New expander.
    (*mulv2si3): New insn pattern.

gcc/testsuite/ChangeLog:

    * gcc.target/i386/sse2-mmx-mult-vec.c: New test.
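A hedged illustration (not the committed testcase) of a source construct that maps onto mulv2si3, written with GNU vector extensions; per the commit, plain SSE2 would go through the PMULUDQ unpack/shuffle sequence while SSE4 can emit PMULLD directly:

/* A 64-bit vector of two 32-bit ints; the element-wise multiply
   is the V2SImode multiplication the new expander provides.  */
typedef int v2si __attribute__ ((vector_size (8)));

v2si
mul2 (v2si a, v2si b)
{
  return a * b;
}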
2023-05-05  nvptx/mkoffload.cc: Add dummy proc for OpenMP rev-offload table [PR108098]  (Tobias Burnus; 1 file, -0/+14)

Seemingly, the ptx JIT of CUDA <= 10.2 replaces function pointers in global variables by NULL if a translation does not contain any executable code. It works with CUDA 11.1. The code of this commit is about reverse offload; having NULL values disables that side of reverse offload during image load.

The solution is the same as the one found by Thomas for a related issue: adding a dummy procedure. Cf. the PR of this issue and Thomas' patch "nvptx: Support global constructors/destructors via 'collect2'" https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607749.html

As that approach also works here:

Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>

gcc/
    PR libgomp/108098
    * config/nvptx/mkoffload.cc (process): Emit dummy procedure alongside reverse-offload function table to prevent NULL values of the function addresses.
2023-05-05  builtins: Fix comment typo mpft_t -> mpfr_t  (Jakub Jelinek; 2 files, -4/+4)

I've noticed 4 typos in comments, fixed thusly.

2023-05-05  Jakub Jelinek  <jakub@redhat.com>

    * builtins.cc (do_mpfr_ckconv, do_mpc_ckconv): Fix comment typo, mpft_t -> mpfr_t.
    * fold-const-call.cc (do_mpfr_ckconv, do_mpc_ckconv): Likewise.
2023-05-04  PHIOPT: Fix diamond case of match_simplify_replacement  (Andrew Pinski; 3 files, -3/+90)

So it turns out I messed up checking which edge was true/false for the diamond form. The edges e0 and e1 here are edges from the merge block, but the true/false edges are from the conditional block, and with diamond/threeway there is a bb in between on both edges. Most of the time the check that was in match_simplify_replacement happened to be correct for the diamond form, as most of the time the first edge in the conditional is the edge for the true side of the conditional. This is why I didn't see the issue during bootstrap/testing.

I added a fragile gimple testcase which exposed the issue. Since there is no way to specify the order of the edges in the gimple fe, we have to have forwprop swap the false/true edges (not the order of them, just swapping the true/false flags) and hope not to do cleanupcfg in between forwprop and the first phiopt pass. This is the fragile part, really; it is not that we will produce wrong code, just that we won't hit what was the failing case.

OK? Bootstrapped and tested on x86_64-linux-gnu.

    PR tree-optimization/109732

gcc/ChangeLog:

    * tree-ssa-phiopt.cc (match_simplify_replacement): Fix the selection of the argtrue/argfalse.

gcc/testsuite/ChangeLog:

    * gcc.dg/pr109732.c: New test.
    * gcc.dg/pr109732-1.c: New test.
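A hedged C-level illustration (not the committed testcase) of the "diamond" shape in question: both edges out of the condition pass through their own basic block before meeting at the PHI in the merge block, so the true/false edges of the conditional are not the edges into the merge block:

int
f (int a, int b)
{
  int x;
  if (a < b)
    x = b - a;   /* true arm: its own basic block */
  else
    x = a - b;   /* false arm: its own basic block */
  return x;      /* merge block: x = PHI <b - a, a - b> */
}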
2023-05-05  MATCH: Add ABSU<a> == 0 to a == 0 simplification  (Andrew Pinski; 2 files, -5/+18)

There is already an `ABS<a> == 0` to `a == 0` pattern; this just extends it to ABSU too.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

    PR tree-optimization/109722

gcc/ChangeLog:

    * match.pd: Extend the `ABS<a> == 0` pattern to cover `ABSU<a> == 0` too.

gcc/testsuite/ChangeLog:

    * gcc.dg/tree-ssa/abs-1.c: New test.
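A hedged sketch (not the committed abs-1.c; whether this exact source form is recognized as ABSU_EXPR is an assumption here): an unsigned absolute value compared against zero, which after the match.pd change should reduce to a plain a == 0 test:

/* The a < 0 ? -(unsigned) a : (unsigned) a idiom is the usual way
   ABSU shows up at the source level, since |INT_MIN| only fits in
   the unsigned type.  */
int
is_zero (int a)
{
  unsigned int u = a < 0 ? -(unsigned int) a : (unsigned int) a;
  return u == 0;
}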
2023-05-04Revert "c++: restore instantiate_decl assert"Jason Merrill2-6/+10
In the testcase the assert fails because we use one member function from another while we're in the middle of instantiating them all, which is perfectly fine. It seems complicated to detect this situation, so let's remove the assert again. PR c++/109658 This reverts commit 95d4c0d2e6318aef88ba0bc607dfc1ec6b7a612f. gcc/testsuite/ChangeLog: * g++.dg/template/local10.C: New test.
2023-05-05  Daily bump.  (GCC Administrator; 5 files, -1/+407)
2023-05-04  i386: Tighten ashift to lea splitter operand predicates [PR109733]  (Uros Bizjak; 2 files, -4/+9)

The predicates of the ashift-to-lea post-reload splitter were too broad, so the splitter tried to convert the mask shift instruction. Tighten the operand predicates to match only general registers.

gcc/ChangeLog:

    PR target/109733
    * config/i386/predicates.md (index_reg_operand): New predicate.
    * config/i386/i386.md (ashift to lea splitter): Use general_reg_operand and index_reg_operand predicates.
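A hedged example (not from the commit) of the kind of shift the splitter converts; the exact assembly is an assumption and depends on register allocation:

/* A left shift by a small constant can be rewritten as an lea,
   e.g. "leal 0(,%rdi,4), %eax" instead of a mov plus sal -- but
   only when the operand is a general register, which the
   tightened predicates now enforce.  */
int
shl2 (int x)
{
  return x << 2;
}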
2023-05-04  PR modula2/109729 cannot use a CHAR type as a FOR loop iterator  (Gaius Mulley; 4 files, -30/+85)

This patch introduces a new quadruple ArithAddOp, which is used in the construction of FOR loops to ensure that when constant folding is applied it does not concatenate two constant char operands into a string constant. Overloading only occurs with constant operands.

gcc/m2/ChangeLog:

    PR modula2/109729
    * gm2-compiler/M2GenGCC.mod (CodeStatement): Detect ArithAddOp and call CodeAddChecked.
    (ResolveConstantExpressions): Detect ArithAddOp and call FoldArithAdd.
    (FoldArithAdd): New procedure.
    (FoldAdd): Refactor to use FoldArithAdd.
    * gm2-compiler/M2Quads.def (QuadOperator): Add ArithAddOp.
    * gm2-compiler/M2Quads.mod: Remove commented imports.
    (QuadFrame): Changed comments to use GNU coding standards.
    (ArithPlusTok): New global variable.
    (BuildForToByDo): Use ArithPlusTok instead of PlusTok.
    (MakeOp): Detect ArithPlusTok and return ArithAddOp.
    (WriteQuad): Add ArithAddOp clause.
    (WriteOperator): Add ArithAddOp clause.
    (Init): Initialize ArithPlusTok.

gcc/testsuite/ChangeLog:

    PR modula2/109729
    * gm2/pim/run/pass/ForChar.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>