This patch adds the binary_acc_int32 shape description.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-shapes.cc (binary_acc_int32): New.
* config/arm/arm-mve-builtins-shapes.h (binary_acc_int32): New.
|
|
Implement vaddlvaq using the new MVE builtins framework.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-base.cc (vaddlvaq): New.
* config/arm/arm-mve-builtins-base.def (vaddlvaq): New.
* config/arm/arm-mve-builtins-base.h (vaddlvaq): New.
* config/arm/arm_mve.h (vaddlvaq): Remove.
(vaddlvaq_p): Remove.
(vaddlvaq_u32): Remove.
(vaddlvaq_s32): Remove.
(vaddlvaq_p_s32): Remove.
(vaddlvaq_p_u32): Remove.
(__arm_vaddlvaq_u32): Remove.
(__arm_vaddlvaq_s32): Remove.
(__arm_vaddlvaq_p_s32): Remove.
(__arm_vaddlvaq_p_u32): Remove.
(__arm_vaddlvaq): Remove.
(__arm_vaddlvaq_p): Remove.
|
|
This patch adds the unary_widen_acc shape description.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-shapes.cc (unary_widen_acc): New.
* config/arm/arm-mve-builtins-shapes.h (unary_widen_acc): New.
|
|
Factorize vaddlvaq builtins so that they use parameterized names.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/iterators.md (mve_insn): Add vaddlva.
* config/arm/mve.md (mve_vaddlvaq_<supf>v4si): Rename into ...
(@mve_<mve_insn>q_<supf>v4si): ... this.
(mve_vaddlvaq_p_<supf>v4si): Rename into ...
(@mve_<mve_insn>q_p_<supf>v4si): ... this.
|
|
Do not crash when asking ix86_widen_mult_cost for the cost of
a widening multiplication to V4HImode or V2SImode.
gcc/ChangeLog:
PR target/109807
* config/i386/i386.cc (ix86_widen_mult_cost):
Handle V4HImode and V2SImode.
gcc/testsuite/ChangeLog:
PR target/109807
* gcc.target/i386/pr109807.c: New test.
|
|
This patch adds various vector constants to riscv_const_insns so
that they are properly recognized as immediate operands. This in turn
allows autovectorization to emit vmv.v.i instructions.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_const_insns): Add permissible
vector constants.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vmv-imm-rv32.c: New test.
* gcc.target/riscv/rvv/autovec/vmv-imm-rv64.c: New test.
* gcc.target/riscv/rvv/autovec/vmv-imm-template.h: New test.
* gcc.target/riscv/rvv/autovec/vmv-imm-run.c: New test.
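As a rough illustration of the kind of loop this affects, the C sketch below stores a small constant into every element of an array. It is not one of the new tests; the function names and the constant are made up for illustration. A splat of a value that fits the 5-bit signed immediate of vmv.v.i (-16 to 15) is the sort of vector constant the patch is about, though whether vmv.v.i is actually emitted depends on the compiler version and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example, not taken from the GCC testsuite.
   Stores the same small constant into every element.  */
void
fill_minus_seven (int32_t *dst, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = -7;
}

/* Small self-check of the scalar semantics; returns 1 on success.  */
int
fill_minus_seven_selfcheck (void)
{
  int32_t buf[37];
  fill_minus_seven (buf, 37);
  for (size_t i = 0; i < 37; i++)
    assert (buf[i] == -7);
  return 1;
}
```

The self-check only verifies the result; inspecting the generated assembly is still needed to see which instruction was chosen.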
|
|
The VMSET simplification for RVV integer comparisons has already been
merged. This patch updates the comments for the cases that the
define_split acts on.
Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:
* config/riscv/vector.md: Add comments for simplifying to vmset.
|
|
This patch splits the shift patterns off from the binop patterns.
This is necessary as the scalar shifts require a Pmode operand
as shift count. To this end, a new iterator any_int_binop_no_shift
is introduced. At a later point, when the binops are split up
further into commutative and non-commutative patterns (neither of
which includes the shift patterns), we might not need this anymore.
gcc/ChangeLog:
* config/riscv/autovec.md (<optab><mode>3): Add scalar shift
pattern.
(v<optab><mode>3): Add vector shift pattern.
* config/riscv/vector-iterators.md: New iterator.
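The two shift flavours can be pictured with the following C sketch, which is not taken from the patch or its tests (all names are hypothetical). The first loop shifts every element by one scalar count, the second by a per-element count; which of the two patterns a given loop ends up using is decided by the vectorizer and is only assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: every element is shifted by the same scalar
   count (count must be less than 32).  */
void
shl_scalar_count (uint32_t *dst, const uint32_t *src, unsigned count,
                  size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i] << count;
}

/* Hypothetical example: each element has its own shift count
   (every counts[i] must be less than 32).  */
void
shl_vector_count (uint32_t *dst, const uint32_t *src,
                  const uint32_t *counts, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i] << counts[i];
}

/* Small self-check of the scalar semantics; returns 1 on success.  */
int
shift_selfcheck (void)
{
  uint32_t src[16], counts[16], dst[16];
  for (uint32_t i = 0; i < 16; i++)
    {
      src[i] = i + 1;
      counts[i] = i % 8;
    }
  shl_scalar_count (dst, src, 3, 16);
  for (uint32_t i = 0; i < 16; i++)
    assert (dst[i] == (i + 1) * 8);
  shl_vector_count (dst, src, counts, 16);
  for (uint32_t i = 0; i < 16; i++)
    assert (dst[i] == (i + 1) << (i % 8));
  return 1;
}
```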
|
|
This patch tries to improve the wrappers that emit either vlmax or
non-vlmax operations. Now, emit_len_op can be used to
emit a regular operation. Depending on whether a length != NULL
is passed, either no VLMAX flags are set or we emit a vsetvli and
set VLMAX flags. The patch also adds some comments that describe
some of the rationale for the current handling of vlmax/nonvlmax
operations.
gcc/ChangeLog:
* config/riscv/autovec.md: Use renamed functions.
* config/riscv/riscv-protos.h (emit_vlmax_op): Rename.
(emit_vlmax_reg_op): To this.
(emit_nonvlmax_op): Rename.
(emit_len_op): To this.
(emit_nonvlmax_binop): Rename.
(emit_len_binop): To this.
* config/riscv/riscv-v.cc (emit_pred_op): Add default parameter.
(emit_pred_binop): Remove vlmax_p.
(emit_vlmax_op): Rename.
(emit_vlmax_reg_op): To this.
(emit_nonvlmax_op): Rename.
(emit_len_op): To this.
(emit_nonvlmax_binop): Rename.
(emit_len_binop): To this.
(sew64_scalar_helper): Use renamed functions.
(expand_tuple_move): Use renamed functions.
* config/riscv/riscv.cc (vector_zero_call_used_regs): Use
renamed functions.
* config/riscv/vector.md: Use renamed functions.
|
|
This patch adds basic binary integer operations support. It is based
on Michael Collison's work and makes use of the existing helpers in
riscv-c.cc. It introduces emit_nonvlmax_binop which, in turn, uses
emit_pred_binop. Setting the destination, the mask, and the
length is factored out into separate functions.
gcc/ChangeLog:
* config/riscv/autovec.md (<optab><mode>3): Add integer binops.
* config/riscv/riscv-protos.h (emit_nonvlmax_binop): Declare.
* config/riscv/riscv-v.cc (emit_pred_op): New function.
(set_expander_dest_and_mask): New function.
(emit_pred_binop): New function.
(emit_nonvlmax_binop): New function.
Co-authored-by: Michael Collison <collison@rivosinc.com>
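For orientation, element-wise loops of the following shape are the usual source-level form of binary integer operations. This is a minimal sketch and not the patch's test code: the DEF_BINOP macro and the function names are hypothetical, and no claim is made about which instructions a particular GCC build emits for them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper macro: defines an element-wise binary loop.  */
#define DEF_BINOP(NAME, TYPE, OP)                               \
  void NAME (TYPE *dst, const TYPE *a, const TYPE *b, size_t n) \
  {                                                             \
    for (size_t i = 0; i < n; i++)                              \
      dst[i] = a[i] OP b[i];                                    \
  }

DEF_BINOP (add_i32, int32_t, +)
DEF_BINOP (sub_i32, int32_t, -)
DEF_BINOP (and_u32, uint32_t, &)
DEF_BINOP (xor_u32, uint32_t, ^)

/* Small self-check of the scalar semantics; returns 1 on success.
   Inputs are kept small so the signed operations cannot overflow.  */
int
binop_selfcheck (void)
{
  int32_t sa[20], sb[20], sd[20];
  uint32_t ua[20], ub[20], ud[20];
  for (int32_t i = 0; i < 20; i++)
    {
      sa[i] = i * 3;
      sb[i] = 100 - i;
      ua[i] = (uint32_t) i * 3u;
      ub[i] = 0xF0F0u;
    }
  add_i32 (sd, sa, sb, 20);
  for (int32_t i = 0; i < 20; i++)
    assert (sd[i] == 2 * i + 100);
  sub_i32 (sd, sa, sb, 20);
  for (int32_t i = 0; i < 20; i++)
    assert (sd[i] == 4 * i - 100);
  and_u32 (ud, ua, ub, 20);
  for (uint32_t i = 0; i < 20; i++)
    assert (ud[i] == ((i * 3u) & 0xF0F0u));
  xor_u32 (ud, ua, ub, 20);
  for (uint32_t i = 0; i < 20; i++)
    assert (ud[i] == ((i * 3u) ^ 0xF0F0u));
  return 1;
}
```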
|
|
Implement vmovlbq, vmovltq using the new MVE builtins framework.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-base.cc (vmovlbq, vmovltq): New.
* config/arm/arm-mve-builtins-base.def (vmovlbq, vmovltq): New.
* config/arm/arm-mve-builtins-base.h (vmovlbq, vmovltq): New.
* config/arm/arm_mve.h (vmovlbq): Remove.
(vmovltq): Remove.
(vmovlbq_m): Remove.
(vmovltq_m): Remove.
(vmovlbq_x): Remove.
(vmovltq_x): Remove.
(vmovlbq_s8): Remove.
(vmovlbq_s16): Remove.
(vmovltq_s8): Remove.
(vmovltq_s16): Remove.
(vmovltq_u8): Remove.
(vmovltq_u16): Remove.
(vmovlbq_u8): Remove.
(vmovlbq_u16): Remove.
(vmovlbq_m_s8): Remove.
(vmovltq_m_s8): Remove.
(vmovlbq_m_u8): Remove.
(vmovltq_m_u8): Remove.
(vmovlbq_m_s16): Remove.
(vmovltq_m_s16): Remove.
(vmovlbq_m_u16): Remove.
(vmovltq_m_u16): Remove.
(vmovlbq_x_s8): Remove.
(vmovlbq_x_s16): Remove.
(vmovlbq_x_u8): Remove.
(vmovlbq_x_u16): Remove.
(vmovltq_x_s8): Remove.
(vmovltq_x_s16): Remove.
(vmovltq_x_u8): Remove.
(vmovltq_x_u16): Remove.
(__arm_vmovlbq_s8): Remove.
(__arm_vmovlbq_s16): Remove.
(__arm_vmovltq_s8): Remove.
(__arm_vmovltq_s16): Remove.
(__arm_vmovltq_u8): Remove.
(__arm_vmovltq_u16): Remove.
(__arm_vmovlbq_u8): Remove.
(__arm_vmovlbq_u16): Remove.
(__arm_vmovlbq_m_s8): Remove.
(__arm_vmovltq_m_s8): Remove.
(__arm_vmovlbq_m_u8): Remove.
(__arm_vmovltq_m_u8): Remove.
(__arm_vmovlbq_m_s16): Remove.
(__arm_vmovltq_m_s16): Remove.
(__arm_vmovlbq_m_u16): Remove.
(__arm_vmovltq_m_u16): Remove.
(__arm_vmovlbq_x_s8): Remove.
(__arm_vmovlbq_x_s16): Remove.
(__arm_vmovlbq_x_u8): Remove.
(__arm_vmovlbq_x_u16): Remove.
(__arm_vmovltq_x_s8): Remove.
(__arm_vmovltq_x_s16): Remove.
(__arm_vmovltq_x_u8): Remove.
(__arm_vmovltq_x_u16): Remove.
(__arm_vmovlbq): Remove.
(__arm_vmovltq): Remove.
(__arm_vmovlbq_m): Remove.
(__arm_vmovltq_m): Remove.
(__arm_vmovlbq_x): Remove.
(__arm_vmovltq_x): Remove.
|
|
This patch adds the unary_widen shape description.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-shapes.cc (unary_widen): New.
* config/arm/arm-mve-builtins-shapes.h (unary_widen): New.
|
|
Factorize vmovlbq, vmovltq builtins so that they use the same
parameterized names.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/iterators.md (mve_insn): Add vmovlb, vmovlt.
(VMOVLBQ, VMOVLTQ): Merge into ...
(VMOVLxQ): ... this.
(VMOVLTQ_M, VMOVLBQ_M): Merge into ...
(VMOVLxQ_M): ... this.
* config/arm/mve.md (mve_vmovltq_<supf><mode>)
(mve_vmovlbq_<supf><mode>): Merge into ...
(@mve_<mve_insn>q_<supf><mode>): ... this.
(mve_vmovlbq_m_<supf><mode>, mve_vmovltq_m_<supf><mode>): Merge
into ...
(@mve_<mve_insn>q_m_<supf><mode>): ... this.
|
|
Implement vaddlvq using the new MVE builtins framework.
Since we kept v4si hardcoded in the builtin name, we need to
special-case it in unspec_mve_function_exact_insn_pred_p.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-base.cc (vaddlvq): New.
* config/arm/arm-mve-builtins-base.def (vaddlvq): New.
* config/arm/arm-mve-builtins-base.h (vaddlvq): New.
* config/arm/arm-mve-builtins-functions.h
(unspec_mve_function_exact_insn_pred_p): Handle vaddlvq.
* config/arm/arm_mve.h (vaddlvq): Remove.
(vaddlvq_p): Remove.
(vaddlvq_s32): Remove.
(vaddlvq_u32): Remove.
(vaddlvq_p_s32): Remove.
(vaddlvq_p_u32): Remove.
(__arm_vaddlvq_s32): Remove.
(__arm_vaddlvq_u32): Remove.
(__arm_vaddlvq_p_s32): Remove.
(__arm_vaddlvq_p_u32): Remove.
(__arm_vaddlvq): Remove.
(__arm_vaddlvq_p): Remove.
|
|
Factorize vaddlvq builtins so that they use parameterized names.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/iterators.md (mve_insn): Add vaddlv.
* config/arm/mve.md (mve_vaddlvq_<supf>v4si): Rename into ...
(@mve_<mve_insn>q_<supf>v4si): ... this.
(mve_vaddlvq_p_<supf>v4si): Rename into ...
(@mve_<mve_insn>q_p_<supf>v4si): ... this.
|
|
This patch adds the unary_acc shape description.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-shapes.cc (unary_acc): New.
* config/arm/arm-mve-builtins-shapes.h (unary_acc): New.
|
|
Implement vaddvaq using the new MVE builtins framework.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-base.cc (vaddvaq): New.
* config/arm/arm-mve-builtins-base.def (vaddvaq): New.
* config/arm/arm-mve-builtins-base.h (vaddvaq): New.
* config/arm/arm_mve.h (vaddvaq): Remove.
(vaddvaq_p): Remove.
(vaddvaq_u8): Remove.
(vaddvaq_s8): Remove.
(vaddvaq_u16): Remove.
(vaddvaq_s16): Remove.
(vaddvaq_u32): Remove.
(vaddvaq_s32): Remove.
(vaddvaq_p_u8): Remove.
(vaddvaq_p_s8): Remove.
(vaddvaq_p_u16): Remove.
(vaddvaq_p_s16): Remove.
(vaddvaq_p_u32): Remove.
(vaddvaq_p_s32): Remove.
(__arm_vaddvaq_u8): Remove.
(__arm_vaddvaq_s8): Remove.
(__arm_vaddvaq_u16): Remove.
(__arm_vaddvaq_s16): Remove.
(__arm_vaddvaq_u32): Remove.
(__arm_vaddvaq_s32): Remove.
(__arm_vaddvaq_p_u8): Remove.
(__arm_vaddvaq_p_s8): Remove.
(__arm_vaddvaq_p_u16): Remove.
(__arm_vaddvaq_p_s16): Remove.
(__arm_vaddvaq_p_u32): Remove.
(__arm_vaddvaq_p_s32): Remove.
(__arm_vaddvaq): Remove.
(__arm_vaddvaq_p): Remove.
|
|
This patch adds the unary_int32_acc shape description.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-shapes.cc (unary_int32_acc): New.
* config/arm/arm-mve-builtins-shapes.h (unary_int32_acc): New.
|
|
Factorize vaddvaq builtins so that they use parameterized names.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/iterators.md (mve_insn): Add vaddva.
* config/arm/mve.md (mve_vaddvaq_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_<supf><mode>): ... this.
(mve_vaddvaq_p_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_p_<supf><mode>): ... this.
|
|
Implement vaddvq using the new MVE builtins framework.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-base.cc (vaddvq): New.
* config/arm/arm-mve-builtins-base.def (vaddvq): New.
* config/arm/arm-mve-builtins-base.h (vaddvq): New.
* config/arm/arm_mve.h (vaddvq): Remove.
(vaddvq_p): Remove.
(vaddvq_s8): Remove.
(vaddvq_s16): Remove.
(vaddvq_s32): Remove.
(vaddvq_u8): Remove.
(vaddvq_u16): Remove.
(vaddvq_u32): Remove.
(vaddvq_p_u8): Remove.
(vaddvq_p_s8): Remove.
(vaddvq_p_u16): Remove.
(vaddvq_p_s16): Remove.
(vaddvq_p_u32): Remove.
(vaddvq_p_s32): Remove.
(__arm_vaddvq_s8): Remove.
(__arm_vaddvq_s16): Remove.
(__arm_vaddvq_s32): Remove.
(__arm_vaddvq_u8): Remove.
(__arm_vaddvq_u16): Remove.
(__arm_vaddvq_u32): Remove.
(__arm_vaddvq_p_u8): Remove.
(__arm_vaddvq_p_s8): Remove.
(__arm_vaddvq_p_u16): Remove.
(__arm_vaddvq_p_s16): Remove.
(__arm_vaddvq_p_u32): Remove.
(__arm_vaddvq_p_s32): Remove.
(__arm_vaddvq): Remove.
(__arm_vaddvq_p): Remove.
|
|
This patch adds the unary_int32 shape description.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-shapes.cc (unary_int32): New.
* config/arm/arm-mve-builtins-shapes.h (unary_int32): New.
|
|
Factorize vaddvq builtins so that they use parameterized names.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/iterators.md (mve_insn): Add vaddv.
* config/arm/mve.md (@mve_vaddvq_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_<supf><mode>): ... this.
(mve_vaddvq_p_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_p_<supf><mode>): ... this.
* config/arm/vec-common.md: Use gen_mve_q instead of
gen_mve_vaddvq.
|
|
Implement vdupq using the new MVE builtins framework.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-base.cc (FUNCTION_ONLY_N): New.
(vdupq): New.
* config/arm/arm-mve-builtins-base.def (vdupq): New.
* config/arm/arm-mve-builtins-base.h (vdupq): New.
* config/arm/arm_mve.h (vdupq_n): Remove.
(vdupq_m): Remove.
(vdupq_n_f16): Remove.
(vdupq_n_f32): Remove.
(vdupq_n_s8): Remove.
(vdupq_n_s16): Remove.
(vdupq_n_s32): Remove.
(vdupq_n_u8): Remove.
(vdupq_n_u16): Remove.
(vdupq_n_u32): Remove.
(vdupq_m_n_u8): Remove.
(vdupq_m_n_s8): Remove.
(vdupq_m_n_u16): Remove.
(vdupq_m_n_s16): Remove.
(vdupq_m_n_u32): Remove.
(vdupq_m_n_s32): Remove.
(vdupq_m_n_f16): Remove.
(vdupq_m_n_f32): Remove.
(vdupq_x_n_s8): Remove.
(vdupq_x_n_s16): Remove.
(vdupq_x_n_s32): Remove.
(vdupq_x_n_u8): Remove.
(vdupq_x_n_u16): Remove.
(vdupq_x_n_u32): Remove.
(vdupq_x_n_f16): Remove.
(vdupq_x_n_f32): Remove.
(__arm_vdupq_n_s8): Remove.
(__arm_vdupq_n_s16): Remove.
(__arm_vdupq_n_s32): Remove.
(__arm_vdupq_n_u8): Remove.
(__arm_vdupq_n_u16): Remove.
(__arm_vdupq_n_u32): Remove.
(__arm_vdupq_m_n_u8): Remove.
(__arm_vdupq_m_n_s8): Remove.
(__arm_vdupq_m_n_u16): Remove.
(__arm_vdupq_m_n_s16): Remove.
(__arm_vdupq_m_n_u32): Remove.
(__arm_vdupq_m_n_s32): Remove.
(__arm_vdupq_x_n_s8): Remove.
(__arm_vdupq_x_n_s16): Remove.
(__arm_vdupq_x_n_s32): Remove.
(__arm_vdupq_x_n_u8): Remove.
(__arm_vdupq_x_n_u16): Remove.
(__arm_vdupq_x_n_u32): Remove.
(__arm_vdupq_n_f16): Remove.
(__arm_vdupq_n_f32): Remove.
(__arm_vdupq_m_n_f16): Remove.
(__arm_vdupq_m_n_f32): Remove.
(__arm_vdupq_x_n_f16): Remove.
(__arm_vdupq_x_n_f32): Remove.
(__arm_vdupq_n): Remove.
(__arm_vdupq_m): Remove.
|
|
This patch adds the unary_n shape description.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-shapes.cc (unary_n): New.
* config/arm/arm-mve-builtins-shapes.h (unary_n): New.
|
|
Factorize vdup builtins so that they use parameterized names.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/iterators.md (MVE_FP_M_N_VDUPQ_ONLY)
(MVE_FP_N_VDUPQ_ONLY): New.
(mve_insn): Add vdupq.
* config/arm/mve.md (mve_vdupq_n_f<mode>): Rename into ...
(@mve_<mve_insn>q_n_f<mode>): ... this.
(mve_vdupq_n_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_n_<supf><mode>): ... this.
(mve_vdupq_m_n_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_m_n_<supf><mode>): ... this.
(mve_vdupq_m_n_f<mode>): Rename into ...
(@mve_<mve_insn>q_m_n_f<mode>): ... this.
|
|
Implement vrev16q, vrev32q, vrev64q using the new MVE builtins
framework.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-base.cc (vrev16q, vrev32q, vrev64q):
New.
* config/arm/arm-mve-builtins-base.def (vrev16q, vrev32q)
(vrev64q): New.
* config/arm/arm-mve-builtins-base.h (vrev16q, vrev32q)
(vrev64q): New.
* config/arm/arm_mve.h (vrev16q): Remove.
(vrev32q): Remove.
(vrev64q): Remove.
(vrev64q_m): Remove.
(vrev16q_m): Remove.
(vrev32q_m): Remove.
(vrev16q_x): Remove.
(vrev32q_x): Remove.
(vrev64q_x): Remove.
(vrev64q_f16): Remove.
(vrev64q_f32): Remove.
(vrev32q_f16): Remove.
(vrev16q_s8): Remove.
(vrev32q_s8): Remove.
(vrev32q_s16): Remove.
(vrev64q_s8): Remove.
(vrev64q_s16): Remove.
(vrev64q_s32): Remove.
(vrev64q_u8): Remove.
(vrev64q_u16): Remove.
(vrev64q_u32): Remove.
(vrev32q_u8): Remove.
(vrev32q_u16): Remove.
(vrev16q_u8): Remove.
(vrev64q_m_u8): Remove.
(vrev64q_m_s8): Remove.
(vrev64q_m_u16): Remove.
(vrev64q_m_s16): Remove.
(vrev64q_m_u32): Remove.
(vrev64q_m_s32): Remove.
(vrev16q_m_s8): Remove.
(vrev32q_m_f16): Remove.
(vrev16q_m_u8): Remove.
(vrev32q_m_s8): Remove.
(vrev64q_m_f16): Remove.
(vrev32q_m_u8): Remove.
(vrev32q_m_s16): Remove.
(vrev64q_m_f32): Remove.
(vrev32q_m_u16): Remove.
(vrev16q_x_s8): Remove.
(vrev16q_x_u8): Remove.
(vrev32q_x_s8): Remove.
(vrev32q_x_s16): Remove.
(vrev32q_x_u8): Remove.
(vrev32q_x_u16): Remove.
(vrev64q_x_s8): Remove.
(vrev64q_x_s16): Remove.
(vrev64q_x_s32): Remove.
(vrev64q_x_u8): Remove.
(vrev64q_x_u16): Remove.
(vrev64q_x_u32): Remove.
(vrev32q_x_f16): Remove.
(vrev64q_x_f16): Remove.
(vrev64q_x_f32): Remove.
(__arm_vrev16q_s8): Remove.
(__arm_vrev32q_s8): Remove.
(__arm_vrev32q_s16): Remove.
(__arm_vrev64q_s8): Remove.
(__arm_vrev64q_s16): Remove.
(__arm_vrev64q_s32): Remove.
(__arm_vrev64q_u8): Remove.
(__arm_vrev64q_u16): Remove.
(__arm_vrev64q_u32): Remove.
(__arm_vrev32q_u8): Remove.
(__arm_vrev32q_u16): Remove.
(__arm_vrev16q_u8): Remove.
(__arm_vrev64q_m_u8): Remove.
(__arm_vrev64q_m_s8): Remove.
(__arm_vrev64q_m_u16): Remove.
(__arm_vrev64q_m_s16): Remove.
(__arm_vrev64q_m_u32): Remove.
(__arm_vrev64q_m_s32): Remove.
(__arm_vrev16q_m_s8): Remove.
(__arm_vrev16q_m_u8): Remove.
(__arm_vrev32q_m_s8): Remove.
(__arm_vrev32q_m_u8): Remove.
(__arm_vrev32q_m_s16): Remove.
(__arm_vrev32q_m_u16): Remove.
(__arm_vrev16q_x_s8): Remove.
(__arm_vrev16q_x_u8): Remove.
(__arm_vrev32q_x_s8): Remove.
(__arm_vrev32q_x_s16): Remove.
(__arm_vrev32q_x_u8): Remove.
(__arm_vrev32q_x_u16): Remove.
(__arm_vrev64q_x_s8): Remove.
(__arm_vrev64q_x_s16): Remove.
(__arm_vrev64q_x_s32): Remove.
(__arm_vrev64q_x_u8): Remove.
(__arm_vrev64q_x_u16): Remove.
(__arm_vrev64q_x_u32): Remove.
(__arm_vrev64q_f16): Remove.
(__arm_vrev64q_f32): Remove.
(__arm_vrev32q_f16): Remove.
(__arm_vrev32q_m_f16): Remove.
(__arm_vrev64q_m_f16): Remove.
(__arm_vrev64q_m_f32): Remove.
(__arm_vrev32q_x_f16): Remove.
(__arm_vrev64q_x_f16): Remove.
(__arm_vrev64q_x_f32): Remove.
(__arm_vrev16q): Remove.
(__arm_vrev32q): Remove.
(__arm_vrev64q): Remove.
(__arm_vrev64q_m): Remove.
(__arm_vrev16q_m): Remove.
(__arm_vrev32q_m): Remove.
(__arm_vrev16q_x): Remove.
(__arm_vrev32q_x): Remove.
(__arm_vrev64q_x): Remove.
|
|
Factorize vrev16q vrev32q vrev64q so that they use generic builtin
names.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/iterators.md (MVE_V8HF, MVE_V16QI)
(MVE_FP_VREV64Q_ONLY, MVE_FP_M_VREV64Q_ONLY, MVE_FP_VREV32Q_ONLY)
(MVE_FP_M_VREV32Q_ONLY): New iterators.
(mve_insn): Add vrev16q, vrev32q, vrev64q.
* config/arm/mve.md (mve_vrev64q_f<mode>): Rename into ...
(@mve_<mve_insn>q_f<mode>): ... this.
(mve_vrev32q_fv8hf): Rename into @mve_<mve_insn>q_f<mode>.
(mve_vrev64q_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_<supf><mode>): ... this.
(mve_vrev32q_<supf><mode>): Rename into
@mve_<mve_insn>q_<supf><mode>.
(mve_vrev16q_<supf>v16qi): Rename into
@mve_<mve_insn>q_<supf><mode>.
(mve_vrev64q_m_<supf><mode>): Rename into
@mve_<mve_insn>q_m_<supf><mode>.
(mve_vrev32q_m_fv8hf): Rename into @mve_<mve_insn>q_m_f<mode>.
(mve_vrev32q_m_<supf><mode>): Rename into
@mve_<mve_insn>q_m_<supf><mode>.
(mve_vrev64q_m_f<mode>): Rename into @mve_<mve_insn>q_m_f<mode>.
(mve_vrev16q_m_<supf>v16qi): Rename into
@mve_<mve_insn>q_m_<supf><mode>.
|
|
Implement vcmp using the new MVE builtins framework.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-base.cc (vcmpeqq, vcmpneq, vcmpgeq)
(vcmpgtq, vcmpleq, vcmpltq, vcmpcsq, vcmphiq): New.
* config/arm/arm-mve-builtins-base.def (vcmpeqq, vcmpneq, vcmpgeq)
(vcmpgtq, vcmpleq, vcmpltq, vcmpcsq, vcmphiq): New.
* config/arm/arm-mve-builtins-base.h (vcmpeqq, vcmpneq, vcmpgeq)
(vcmpgtq, vcmpleq, vcmpltq, vcmpcsq, vcmphiq): New.
* config/arm/arm-mve-builtins-functions.h (class
unspec_based_mve_function_exact_insn_vcmp): New.
* config/arm/arm-mve-builtins.cc
(function_instance::has_inactive_argument): Handle vcmp.
* config/arm/arm_mve.h (vcmpneq): Remove.
(vcmphiq): Remove.
(vcmpeqq): Remove.
(vcmpcsq): Remove.
(vcmpltq): Remove.
(vcmpleq): Remove.
(vcmpgtq): Remove.
(vcmpgeq): Remove.
(vcmpneq_m): Remove.
(vcmphiq_m): Remove.
(vcmpeqq_m): Remove.
(vcmpcsq_m): Remove.
(vcmpcsq_m_n): Remove.
(vcmpltq_m): Remove.
(vcmpleq_m): Remove.
(vcmpgtq_m): Remove.
(vcmpgeq_m): Remove.
(vcmpneq_s8): Remove.
(vcmpneq_s16): Remove.
(vcmpneq_s32): Remove.
(vcmpneq_u8): Remove.
(vcmpneq_u16): Remove.
(vcmpneq_u32): Remove.
(vcmpneq_n_u8): Remove.
(vcmphiq_u8): Remove.
(vcmphiq_n_u8): Remove.
(vcmpeqq_u8): Remove.
(vcmpeqq_n_u8): Remove.
(vcmpcsq_u8): Remove.
(vcmpcsq_n_u8): Remove.
(vcmpneq_n_s8): Remove.
(vcmpltq_s8): Remove.
(vcmpltq_n_s8): Remove.
(vcmpleq_s8): Remove.
(vcmpleq_n_s8): Remove.
(vcmpgtq_s8): Remove.
(vcmpgtq_n_s8): Remove.
(vcmpgeq_s8): Remove.
(vcmpgeq_n_s8): Remove.
(vcmpeqq_s8): Remove.
(vcmpeqq_n_s8): Remove.
(vcmpneq_n_u16): Remove.
(vcmphiq_u16): Remove.
(vcmphiq_n_u16): Remove.
(vcmpeqq_u16): Remove.
(vcmpeqq_n_u16): Remove.
(vcmpcsq_u16): Remove.
(vcmpcsq_n_u16): Remove.
(vcmpneq_n_s16): Remove.
(vcmpltq_s16): Remove.
(vcmpltq_n_s16): Remove.
(vcmpleq_s16): Remove.
(vcmpleq_n_s16): Remove.
(vcmpgtq_s16): Remove.
(vcmpgtq_n_s16): Remove.
(vcmpgeq_s16): Remove.
(vcmpgeq_n_s16): Remove.
(vcmpeqq_s16): Remove.
(vcmpeqq_n_s16): Remove.
(vcmpneq_n_u32): Remove.
(vcmphiq_u32): Remove.
(vcmphiq_n_u32): Remove.
(vcmpeqq_u32): Remove.
(vcmpeqq_n_u32): Remove.
(vcmpcsq_u32): Remove.
(vcmpcsq_n_u32): Remove.
(vcmpneq_n_s32): Remove.
(vcmpltq_s32): Remove.
(vcmpltq_n_s32): Remove.
(vcmpleq_s32): Remove.
(vcmpleq_n_s32): Remove.
(vcmpgtq_s32): Remove.
(vcmpgtq_n_s32): Remove.
(vcmpgeq_s32): Remove.
(vcmpgeq_n_s32): Remove.
(vcmpeqq_s32): Remove.
(vcmpeqq_n_s32): Remove.
(vcmpneq_n_f16): Remove.
(vcmpneq_f16): Remove.
(vcmpltq_n_f16): Remove.
(vcmpltq_f16): Remove.
(vcmpleq_n_f16): Remove.
(vcmpleq_f16): Remove.
(vcmpgtq_n_f16): Remove.
(vcmpgtq_f16): Remove.
(vcmpgeq_n_f16): Remove.
(vcmpgeq_f16): Remove.
(vcmpeqq_n_f16): Remove.
(vcmpeqq_f16): Remove.
(vcmpneq_n_f32): Remove.
(vcmpneq_f32): Remove.
(vcmpltq_n_f32): Remove.
(vcmpltq_f32): Remove.
(vcmpleq_n_f32): Remove.
(vcmpleq_f32): Remove.
(vcmpgtq_n_f32): Remove.
(vcmpgtq_f32): Remove.
(vcmpgeq_n_f32): Remove.
(vcmpgeq_f32): Remove.
(vcmpeqq_n_f32): Remove.
(vcmpeqq_f32): Remove.
(vcmpeqq_m_f16): Remove.
(vcmpeqq_m_f32): Remove.
(vcmpneq_m_u8): Remove.
(vcmpneq_m_n_u8): Remove.
(vcmphiq_m_u8): Remove.
(vcmphiq_m_n_u8): Remove.
(vcmpeqq_m_u8): Remove.
(vcmpeqq_m_n_u8): Remove.
(vcmpcsq_m_u8): Remove.
(vcmpcsq_m_n_u8): Remove.
(vcmpneq_m_s8): Remove.
(vcmpneq_m_n_s8): Remove.
(vcmpltq_m_s8): Remove.
(vcmpltq_m_n_s8): Remove.
(vcmpleq_m_s8): Remove.
(vcmpleq_m_n_s8): Remove.
(vcmpgtq_m_s8): Remove.
(vcmpgtq_m_n_s8): Remove.
(vcmpgeq_m_s8): Remove.
(vcmpgeq_m_n_s8): Remove.
(vcmpeqq_m_s8): Remove.
(vcmpeqq_m_n_s8): Remove.
(vcmpneq_m_u16): Remove.
(vcmpneq_m_n_u16): Remove.
(vcmphiq_m_u16): Remove.
(vcmphiq_m_n_u16): Remove.
(vcmpeqq_m_u16): Remove.
(vcmpeqq_m_n_u16): Remove.
(vcmpcsq_m_u16): Remove.
(vcmpcsq_m_n_u16): Remove.
(vcmpneq_m_s16): Remove.
(vcmpneq_m_n_s16): Remove.
(vcmpltq_m_s16): Remove.
(vcmpltq_m_n_s16): Remove.
(vcmpleq_m_s16): Remove.
(vcmpleq_m_n_s16): Remove.
(vcmpgtq_m_s16): Remove.
(vcmpgtq_m_n_s16): Remove.
(vcmpgeq_m_s16): Remove.
(vcmpgeq_m_n_s16): Remove.
(vcmpeqq_m_s16): Remove.
(vcmpeqq_m_n_s16): Remove.
(vcmpneq_m_u32): Remove.
(vcmpneq_m_n_u32): Remove.
(vcmphiq_m_u32): Remove.
(vcmphiq_m_n_u32): Remove.
(vcmpeqq_m_u32): Remove.
(vcmpeqq_m_n_u32): Remove.
(vcmpcsq_m_u32): Remove.
(vcmpcsq_m_n_u32): Remove.
(vcmpneq_m_s32): Remove.
(vcmpneq_m_n_s32): Remove.
(vcmpltq_m_s32): Remove.
(vcmpltq_m_n_s32): Remove.
(vcmpleq_m_s32): Remove.
(vcmpleq_m_n_s32): Remove.
(vcmpgtq_m_s32): Remove.
(vcmpgtq_m_n_s32): Remove.
(vcmpgeq_m_s32): Remove.
(vcmpgeq_m_n_s32): Remove.
(vcmpeqq_m_s32): Remove.
(vcmpeqq_m_n_s32): Remove.
(vcmpeqq_m_n_f16): Remove.
(vcmpgeq_m_f16): Remove.
(vcmpgeq_m_n_f16): Remove.
(vcmpgtq_m_f16): Remove.
(vcmpgtq_m_n_f16): Remove.
(vcmpleq_m_f16): Remove.
(vcmpleq_m_n_f16): Remove.
(vcmpltq_m_f16): Remove.
(vcmpltq_m_n_f16): Remove.
(vcmpneq_m_f16): Remove.
(vcmpneq_m_n_f16): Remove.
(vcmpeqq_m_n_f32): Remove.
(vcmpgeq_m_f32): Remove.
(vcmpgeq_m_n_f32): Remove.
(vcmpgtq_m_f32): Remove.
(vcmpgtq_m_n_f32): Remove.
(vcmpleq_m_f32): Remove.
(vcmpleq_m_n_f32): Remove.
(vcmpltq_m_f32): Remove.
(vcmpltq_m_n_f32): Remove.
(vcmpneq_m_f32): Remove.
(vcmpneq_m_n_f32): Remove.
(__arm_vcmpneq_s8): Remove.
(__arm_vcmpneq_s16): Remove.
(__arm_vcmpneq_s32): Remove.
(__arm_vcmpneq_u8): Remove.
(__arm_vcmpneq_u16): Remove.
(__arm_vcmpneq_u32): Remove.
(__arm_vcmpneq_n_u8): Remove.
(__arm_vcmphiq_u8): Remove.
(__arm_vcmphiq_n_u8): Remove.
(__arm_vcmpeqq_u8): Remove.
(__arm_vcmpeqq_n_u8): Remove.
(__arm_vcmpcsq_u8): Remove.
(__arm_vcmpcsq_n_u8): Remove.
(__arm_vcmpneq_n_s8): Remove.
(__arm_vcmpltq_s8): Remove.
(__arm_vcmpltq_n_s8): Remove.
(__arm_vcmpleq_s8): Remove.
(__arm_vcmpleq_n_s8): Remove.
(__arm_vcmpgtq_s8): Remove.
(__arm_vcmpgtq_n_s8): Remove.
(__arm_vcmpgeq_s8): Remove.
(__arm_vcmpgeq_n_s8): Remove.
(__arm_vcmpeqq_s8): Remove.
(__arm_vcmpeqq_n_s8): Remove.
(__arm_vcmpneq_n_u16): Remove.
(__arm_vcmphiq_u16): Remove.
(__arm_vcmphiq_n_u16): Remove.
(__arm_vcmpeqq_u16): Remove.
(__arm_vcmpeqq_n_u16): Remove.
(__arm_vcmpcsq_u16): Remove.
(__arm_vcmpcsq_n_u16): Remove.
(__arm_vcmpneq_n_s16): Remove.
(__arm_vcmpltq_s16): Remove.
(__arm_vcmpltq_n_s16): Remove.
(__arm_vcmpleq_s16): Remove.
(__arm_vcmpleq_n_s16): Remove.
(__arm_vcmpgtq_s16): Remove.
(__arm_vcmpgtq_n_s16): Remove.
(__arm_vcmpgeq_s16): Remove.
(__arm_vcmpgeq_n_s16): Remove.
(__arm_vcmpeqq_s16): Remove.
(__arm_vcmpeqq_n_s16): Remove.
(__arm_vcmpneq_n_u32): Remove.
(__arm_vcmphiq_u32): Remove.
(__arm_vcmphiq_n_u32): Remove.
(__arm_vcmpeqq_u32): Remove.
(__arm_vcmpeqq_n_u32): Remove.
(__arm_vcmpcsq_u32): Remove.
(__arm_vcmpcsq_n_u32): Remove.
(__arm_vcmpneq_n_s32): Remove.
(__arm_vcmpltq_s32): Remove.
(__arm_vcmpltq_n_s32): Remove.
(__arm_vcmpleq_s32): Remove.
(__arm_vcmpleq_n_s32): Remove.
(__arm_vcmpgtq_s32): Remove.
(__arm_vcmpgtq_n_s32): Remove.
(__arm_vcmpgeq_s32): Remove.
(__arm_vcmpgeq_n_s32): Remove.
(__arm_vcmpeqq_s32): Remove.
(__arm_vcmpeqq_n_s32): Remove.
(__arm_vcmpneq_m_u8): Remove.
(__arm_vcmpneq_m_n_u8): Remove.
(__arm_vcmphiq_m_u8): Remove.
(__arm_vcmphiq_m_n_u8): Remove.
(__arm_vcmpeqq_m_u8): Remove.
(__arm_vcmpeqq_m_n_u8): Remove.
(__arm_vcmpcsq_m_u8): Remove.
(__arm_vcmpcsq_m_n_u8): Remove.
(__arm_vcmpneq_m_s8): Remove.
(__arm_vcmpneq_m_n_s8): Remove.
(__arm_vcmpltq_m_s8): Remove.
(__arm_vcmpltq_m_n_s8): Remove.
(__arm_vcmpleq_m_s8): Remove.
(__arm_vcmpleq_m_n_s8): Remove.
(__arm_vcmpgtq_m_s8): Remove.
(__arm_vcmpgtq_m_n_s8): Remove.
(__arm_vcmpgeq_m_s8): Remove.
(__arm_vcmpgeq_m_n_s8): Remove.
(__arm_vcmpeqq_m_s8): Remove.
(__arm_vcmpeqq_m_n_s8): Remove.
(__arm_vcmpneq_m_u16): Remove.
(__arm_vcmpneq_m_n_u16): Remove.
(__arm_vcmphiq_m_u16): Remove.
(__arm_vcmphiq_m_n_u16): Remove.
(__arm_vcmpeqq_m_u16): Remove.
(__arm_vcmpeqq_m_n_u16): Remove.
(__arm_vcmpcsq_m_u16): Remove.
(__arm_vcmpcsq_m_n_u16): Remove.
(__arm_vcmpneq_m_s16): Remove.
(__arm_vcmpneq_m_n_s16): Remove.
(__arm_vcmpltq_m_s16): Remove.
(__arm_vcmpltq_m_n_s16): Remove.
(__arm_vcmpleq_m_s16): Remove.
(__arm_vcmpleq_m_n_s16): Remove.
(__arm_vcmpgtq_m_s16): Remove.
(__arm_vcmpgtq_m_n_s16): Remove.
(__arm_vcmpgeq_m_s16): Remove.
(__arm_vcmpgeq_m_n_s16): Remove.
(__arm_vcmpeqq_m_s16): Remove.
(__arm_vcmpeqq_m_n_s16): Remove.
(__arm_vcmpneq_m_u32): Remove.
(__arm_vcmpneq_m_n_u32): Remove.
(__arm_vcmphiq_m_u32): Remove.
(__arm_vcmphiq_m_n_u32): Remove.
(__arm_vcmpeqq_m_u32): Remove.
(__arm_vcmpeqq_m_n_u32): Remove.
(__arm_vcmpcsq_m_u32): Remove.
(__arm_vcmpcsq_m_n_u32): Remove.
(__arm_vcmpneq_m_s32): Remove.
(__arm_vcmpneq_m_n_s32): Remove.
(__arm_vcmpltq_m_s32): Remove.
(__arm_vcmpltq_m_n_s32): Remove.
(__arm_vcmpleq_m_s32): Remove.
(__arm_vcmpleq_m_n_s32): Remove.
(__arm_vcmpgtq_m_s32): Remove.
(__arm_vcmpgtq_m_n_s32): Remove.
(__arm_vcmpgeq_m_s32): Remove.
(__arm_vcmpgeq_m_n_s32): Remove.
(__arm_vcmpeqq_m_s32): Remove.
(__arm_vcmpeqq_m_n_s32): Remove.
(__arm_vcmpneq_n_f16): Remove.
(__arm_vcmpneq_f16): Remove.
(__arm_vcmpltq_n_f16): Remove.
(__arm_vcmpltq_f16): Remove.
(__arm_vcmpleq_n_f16): Remove.
(__arm_vcmpleq_f16): Remove.
(__arm_vcmpgtq_n_f16): Remove.
(__arm_vcmpgtq_f16): Remove.
(__arm_vcmpgeq_n_f16): Remove.
(__arm_vcmpgeq_f16): Remove.
(__arm_vcmpeqq_n_f16): Remove.
(__arm_vcmpeqq_f16): Remove.
(__arm_vcmpneq_n_f32): Remove.
(__arm_vcmpneq_f32): Remove.
(__arm_vcmpltq_n_f32): Remove.
(__arm_vcmpltq_f32): Remove.
(__arm_vcmpleq_n_f32): Remove.
(__arm_vcmpleq_f32): Remove.
(__arm_vcmpgtq_n_f32): Remove.
(__arm_vcmpgtq_f32): Remove.
(__arm_vcmpgeq_n_f32): Remove.
(__arm_vcmpgeq_f32): Remove.
(__arm_vcmpeqq_n_f32): Remove.
(__arm_vcmpeqq_f32): Remove.
(__arm_vcmpeqq_m_f16): Remove.
(__arm_vcmpeqq_m_f32): Remove.
(__arm_vcmpeqq_m_n_f16): Remove.
(__arm_vcmpgeq_m_f16): Remove.
(__arm_vcmpgeq_m_n_f16): Remove.
(__arm_vcmpgtq_m_f16): Remove.
(__arm_vcmpgtq_m_n_f16): Remove.
(__arm_vcmpleq_m_f16): Remove.
(__arm_vcmpleq_m_n_f16): Remove.
(__arm_vcmpltq_m_f16): Remove.
(__arm_vcmpltq_m_n_f16): Remove.
(__arm_vcmpneq_m_f16): Remove.
(__arm_vcmpneq_m_n_f16): Remove.
(__arm_vcmpeqq_m_n_f32): Remove.
(__arm_vcmpgeq_m_f32): Remove.
(__arm_vcmpgeq_m_n_f32): Remove.
(__arm_vcmpgtq_m_f32): Remove.
(__arm_vcmpgtq_m_n_f32): Remove.
(__arm_vcmpleq_m_f32): Remove.
(__arm_vcmpleq_m_n_f32): Remove.
(__arm_vcmpltq_m_f32): Remove.
(__arm_vcmpltq_m_n_f32): Remove.
(__arm_vcmpneq_m_f32): Remove.
(__arm_vcmpneq_m_n_f32): Remove.
(__arm_vcmpneq): Remove.
(__arm_vcmphiq): Remove.
(__arm_vcmpeqq): Remove.
(__arm_vcmpcsq): Remove.
(__arm_vcmpltq): Remove.
(__arm_vcmpleq): Remove.
(__arm_vcmpgtq): Remove.
(__arm_vcmpgeq): Remove.
(__arm_vcmpneq_m): Remove.
(__arm_vcmphiq_m): Remove.
(__arm_vcmpeqq_m): Remove.
(__arm_vcmpcsq_m): Remove.
(__arm_vcmpltq_m): Remove.
(__arm_vcmpleq_m): Remove.
(__arm_vcmpgtq_m): Remove.
(__arm_vcmpgeq_m): Remove.
|
|
This patch adds the cmp shape description.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-shapes.cc (cmp): New.
* config/arm/arm-mve-builtins-shapes.h (cmp): New.
|
|
Factorize vcmp so that they use the same pattern.
2022-10-25 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/iterators.md (MVE_CMP_M, MVE_CMP_M_F, MVE_CMP_M_N)
(MVE_CMP_M_N_F, mve_cmp_op1): New.
(isu): Add VCMP*.
(supf): Likewise.
* config/arm/mve.md (mve_vcmp<mve_cmp_op>q_n_<mode>): Rename into ...
(@mve_vcmp<mve_cmp_op>q_n_<mode>): ... this.
(mve_vcmpeqq_m_f<mode>, mve_vcmpgeq_m_f<mode>)
(mve_vcmpgtq_m_f<mode>, mve_vcmpleq_m_f<mode>)
(mve_vcmpltq_m_f<mode>, mve_vcmpneq_m_f<mode>): Merge into ...
(@mve_vcmp<mve_cmp_op1>q_m_f<mode>): ... this.
(mve_vcmpcsq_m_u<mode>, mve_vcmpeqq_m_<supf><mode>)
(mve_vcmpgeq_m_s<mode>, mve_vcmpgtq_m_s<mode>)
(mve_vcmphiq_m_u<mode>, mve_vcmpleq_m_s<mode>)
(mve_vcmpltq_m_s<mode>, mve_vcmpneq_m_<supf><mode>): Merge into
...
(@mve_vcmp<mve_cmp_op1>q_m_<supf><mode>): ... this.
(mve_vcmpcsq_m_n_u<mode>, mve_vcmpeqq_m_n_<supf><mode>)
(mve_vcmpgeq_m_n_s<mode>, mve_vcmpgtq_m_n_s<mode>)
(mve_vcmphiq_m_n_u<mode>, mve_vcmpleq_m_n_s<mode>)
(mve_vcmpltq_m_n_s<mode>, mve_vcmpneq_m_n_<supf><mode>): Merge
into ...
(@mve_vcmp<mve_cmp_op1>q_m_n_<supf><mode>): ... this.
(mve_vcmpeqq_m_n_f<mode>, mve_vcmpgeq_m_n_f<mode>)
(mve_vcmpgtq_m_n_f<mode>, mve_vcmpleq_m_n_f<mode>)
(mve_vcmpltq_m_n_f<mode>, mve_vcmpneq_m_n_f<mode>): Merge into ...
(@mve_vcmp<mve_cmp_op1>q_m_n_f<mode>): ... this.
|
|
This patch is a prerequisite for more RVV auto-vectorization
support.
Enabling even very simple binary operations currently ends in
the following ICE:
during RTL pass: expand
add_run-1.c: In function 'main':
add_run-1.c:28:1: internal compiler error: Segmentation fault
0x1618ea3 crash_signal
../../../riscv-gcc/gcc/toplev.cc:314
0xe76cd9 single_set(rtx_insn const*)
../../../riscv-gcc/gcc/rtl.h:3602
0x1080f8a emit_move_insn(rtx_def*, rtx_def*)
../../../riscv-gcc/gcc/expr.cc:4342
0x170c458 insert_value_copy_on_edge
../../../riscv-gcc/gcc/tree-outof-ssa.cc:352
0x170d58e eliminate_phi
../../../riscv-gcc/gcc/tree-outof-ssa.cc:785
0x170df17 expand_phi_nodes(ssaexpand*)
../../../riscv-gcc/gcc/tree-outof-ssa.cc:1024
0xef27e2 execute
../../../riscv-gcc/gcc/cfgexpand.cc:6818
This is because the loop vectorizer assumes the target can handle
series constant vectors once binary operations are enabled, so it
easily triggers an ICE like the one above.
gcc/ChangeLog:
* config/riscv/autovec.md (@vec_series<mode>): New pattern.
* config/riscv/riscv-protos.h (expand_vec_series): New function.
* config/riscv/riscv-v.cc (emit_binop): Ditto.
(emit_index_op): Ditto.
(expand_vec_series): Ditto.
(expand_const_vector): Add series vector handling.
* config/riscv/riscv.cc (riscv_const_insns): Enable series vector for testing.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/series-1.c: New test.
* gcc.target/riscv/rvv/autovec/series_run-1.c: New test.
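For readers unfamiliar with the term, a series constant vector holds the
values base, base + step, base + 2 * step, and so on. The following is a
minimal scalar sketch in C of the values such a vector represents, plus one
loop shape that uses its induction variable as data and therefore asks for a
series vector when vectorized. The names series_model and fill_index are
hypothetical and do not appear in the patch, and the sketch says nothing about
how riscv-v.cc actually emits the RVV instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model (not from the patch): element i of a
   series vector is base + i * step.  */
void series_model(int32_t *out, size_t n, int32_t base, int32_t step)
{
    for (size_t i = 0; i < n; i++)
        out[i] = base + (int32_t)i * step;
}

/* A loop that stores its induction variable.  Vectorizing it needs the
   series {0, 1, 2, ...} as a vector value (assumed illustration only).  */
void fill_index(int32_t *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = (int32_t)i;
}
```

Calling series_model (v, 4, 3, 2) fills v with 3, 5, 7, 9.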
|
|
This cleans up the use of [(clobber (const_int 0))] in the i386 backend.
2023-05-10 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386.md (*concat<mode><dwi>3_1): Use preferred
[(const_int 0)] idiom, instead of [(clobber (const_int 0))].
(*concat<mode><dwi>3_2): Likewise.
(*concat<mode><dwi>3_3): Likewise.
(*concat<mode><dwi>3_4): Likewise.
(*concat<mode><dwi>3_5): Likewise.
(*concat<mode><dwi>3_6): Likewise.
(*concat<mode><dwi>3_7): Likewise.
|
|
Add missing insn pattern for v2qi -> v2si vector extend and named
expanders to activate generation of vector extends to 8-byte and 4-byte
vectors.
gcc/ChangeLog:
PR target/92658
* config/i386/mmx.md (sse4_1_<code>v2qiv2si2): New insn pattern.
(<insn>v4qiv4hi2): New expander.
(<insn>v2hiv2si2): Ditto.
(<insn>v2qiv2si2): Ditto.
(<insn>v2qiv2hi2): Ditto.
gcc/testsuite/ChangeLog:
PR target/92658
* gcc.target/i386/pr92658-sse4-4b.c: New test.
* gcc.target/i386/pr92658-sse4-8b.c: New test.
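As a rough illustration of what a v2qi -> v2si vector extend computes, here
is a scalar sketch in C: two 8-bit elements are widened to two 32-bit
elements, either with sign extension or with zero extension. The function
names are hypothetical and are not taken from the new tests. Whether GCC
turns loops like these into a single vector extend depends on the
optimization level and target options, which are not shown here.

```c
#include <stdint.h>

/* Sign-extend two 8-bit elements to two 32-bit elements
   (the v2qi -> v2si shape).  Hypothetical helper.  */
void sext_v2qi_v2si(int32_t dst[2], const int8_t src[2])
{
    for (int i = 0; i < 2; i++)
        dst[i] = src[i];
}

/* Zero-extend two 8-bit elements to two 32-bit elements.
   Hypothetical helper.  */
void zext_v2qi_v2si(uint32_t dst[2], const uint8_t src[2])
{
    for (int i = 0; i < 2; i++)
        dst[i] = src[i];
}
```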
|
|
This is the 2nd patch on the way to LRA for the H8.
LRA is more sensitive than reload to getting define_constraint vs
define_memory_constraint vs define_special_memory_constraint correct.
The H8 port has the "Q" constraint, which is used to indicate memory addresses
that can be used under certain circumstances in various ALU operations. So it
should be a memory constraint. Ideally it would be a simple memory
constraint, but it's used in contexts where MEMs are valid only for certain
parts in the H8 family. So it really needs to be a special_memory_constraint.
The "Zz" constraint accepts memory, but the forms are limited and cannot be
reloaded into a register. It seems to be working, but I wouldn't be totally
surprised if it broke when stressed in the right way.
Anyway, this patch fixes "Q" and "Zz" to be special memory constraints.
Regression tested with gdbsim and pushed to the trunk.
gcc
* config/h8300/constraints.md (Q): Make this a special memory
constraint.
(Zz): Similarly.
|
|
This patch is a no-op as it removes the explicit vec-concat-zero patterns in favour of vczle/vczbe.
This allows us to delete the explicit expander too. Tests are added to ensure the optimisation required
still triggers.
Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (aarch64_sqmovun<mode>_insn_le): Delete.
(aarch64_sqmovun<mode>_insn_be): Delete.
(aarch64_sqmovun<mode><vczle><vczbe>): New define_insn.
(aarch64_sqmovun<mode>): Delete expander.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/simd/pr99195_4.c: Add tests for sqmovun.
|
|
vec-concat-zero
Another straightforward patch annotating patterns for the zip1, zip2, uzp1, uzp2, rev* instructions, plus tests.
Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
gcc/ChangeLog:
PR target/99195
* config/aarch64/aarch64-simd.md (aarch64_<PERMUTE:perm_insn><mode>):
Rename to...
(aarch64_<PERMUTE:perm_insn><mode><vczle><vczbe>): ... This.
(aarch64_rev<REVERSE:rev_op><mode>): Rename to...
(aarch64_rev<REVERSE:rev_op><mode><vczle><vczbe>): ... This.
gcc/testsuite/ChangeLog:
PR target/99195
* gcc.target/aarch64/simd/pr99195_1.c: Add tests for zip and rev
intrinsics.
|
|
vec-concat-zero
Moving onto the saturating instructions, this one goes through the simple add/sub ones.
Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
gcc/ChangeLog:
PR target/99195
* config/aarch64/aarch64-simd.md (aarch64_<su_optab>q<addsub><mode>):
Rename to...
(aarch64_<su_optab>q<addsub><mode><vczle><vczbe>): ... This.
(aarch64_<sur>qadd<mode>): Rename to...
(aarch64_<sur>qadd<mode><vczle><vczbe>): ... This.
gcc/testsuite/ChangeLog:
PR target/99195
* gcc.target/aarch64/simd/pr99195_1.c: Add testing for qadd, qsub.
* gcc.target/aarch64/simd/pr99195_6.c: New test.
|
|
This patch deletes the explicit BYTES_BIG_ENDIAN and !BYTES_BIG_ENDIAN patterns for the QSHRN instructions in favour
of annotating a single one with <vczle><vczbe>. This allows simplification of the expander too.
Tests are added to ensure that we still optimise away the concat-with-zero use case.
Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md
(aarch64_<sur>q<r>shr<u>n_n<mode>_insn_le): Delete.
(aarch64_<sur>q<r>shr<u>n_n<mode>_insn_be): Delete.
(aarch64_<sur>q<r>shr<u>n_n<mode>_insn<vczle><vczbe>): New define_insn.
(aarch64_<sur>q<r>shr<u>n_n<mode>): Simplify expander.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/simd/pr99195_5.c: New test.
|
|
This patch cleans up some almost-duplicate patterns for the XTN, SQXTN, UQXTN instructions.
Using the <vczle><vczbe> attributes we can remove the BYTES_BIG_ENDIAN and !BYTES_BIG_ENDIAN cases,
as well as the intrinsic expanders that select between the two.
Tests are also added. Thankfully the diffstat comes out negative \O/.
Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
gcc/ChangeLog:
PR target/99195
* config/aarch64/aarch64-simd.md (aarch64_xtn<mode>_insn_le): Delete.
(aarch64_xtn<mode>_insn_be): Likewise.
(trunc<mode><Vnarrowq>2): Rename to...
(trunc<mode><Vnarrowq>2<vczle><vczbe>): ... This.
(aarch64_xtn<mode>): Move under the above. Just emit the truncate RTL.
(aarch64_<su>qmovn<mode>): Likewise.
(aarch64_<su>qmovn<mode><vczle><vczbe>): New define_insn.
(aarch64_<su>qmovn<mode>_insn_le): Delete.
(aarch64_<su>qmovn<mode>_insn_be): Likewise.
gcc/testsuite/ChangeLog:
PR target/99195
* gcc.target/aarch64/simd/pr99195_4.c: Add tests for vmovn, vqmovn.
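The concat-with-zero use case these tests cover can be pictured with a
scalar model: a saturating narrow produces a 64-bit result, and the source
then asks for that result combined with a zero upper half. The sketch below
models, in portable C, what an expression such as
vcombine_s8 (vqmovn_s16 (a), vdup_n_s8 (0)) computes. The function name is
hypothetical, and the model does not show the instruction GCC selects, only
the values.

```c
#include <stdint.h>

/* Hypothetical scalar model: saturate eight 16-bit lanes to 8 bits into
   the low half of a 16-lane result, then zero the high half.  */
void sqxtn_concat_zero_model(int8_t out[16], const int16_t in[8])
{
    for (int i = 0; i < 8; i++) {
        int16_t v = in[i];
        if (v > INT8_MAX)
            v = INT8_MAX;       /* clamp positive overflow */
        else if (v < INT8_MIN)
            v = INT8_MIN;       /* clamp negative overflow */
        out[i] = (int8_t)v;
    }
    for (int i = 8; i < 16; i++)
        out[i] = 0;             /* the concatenated zero half */
}
```

Because the narrowing instruction already leaves the upper half of the
register zero, the second loop corresponds to work the annotated pattern
lets the compiler skip.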
|
|
REG_P(operand[1]) in -O0.
This issue happens because operand 1 of a scalar move can be
REG_P (operand[1]) at -O0, which causes the VSETVL pass to
not insert the vsetvl instruction correctly, and the compiler crashes.
Consider the following case:
int16_t foo1 (void *base, size_t vl)
{
int16_t maxVal = __riscv_vmv_x_s_i16m1_i16 (__riscv_vle16_v_i16m1 (base, vl));
return maxVal;
}
Before this patch:
bug.c:15:1: internal compiler error: Segmentation fault
15 | }
| ^
0x145d723 crash_signal
../.././riscv-gcc/gcc/toplev.cc:314
0x22929dd const_csr_operand(rtx_def*, machine_mode)
../.././riscv-gcc/gcc/config/riscv/predicates.md:44
0x2292a21 csr_operand(rtx_def*, machine_mode)
../.././riscv-gcc/gcc/config/riscv/predicates.md:46
0x23dfbb0 recog_356
../.././riscv-gcc/gcc/config/riscv/iterators.md:72
0x23efecd recog(rtx_def*, rtx_insn*, int*)
../.././riscv-gcc/gcc/config/riscv/iterators.md:89
0xdddc15 recog_memoized(rtx_insn*)
../.././riscv-gcc/gcc/recog.h:273
After this patch:
vsetivli zero,0,e16,m1,ta,ma
vmv.x.s a5,v1
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc (gen_vsetvl_pat): For vfmv.f.s/vmv.x.s
instruction replace null avl with (const_int 0).
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/scalar_move-10.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-11.c: New test.
|
|
TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
This incorrect code blocks scalable RVV auto-vectorization.
Take a look at the aarch64 implementation of this target hook.
It applies similar handling only for TARGET_SIMD
and lets movmisalign<mode> handle the scalable vectors of SVE.
For RVV, we should follow the same implementation as ARM SVE.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_support_vector_misalignment): Fix
incorrect code.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/v-2.c: Adapt testcase.
* gcc.target/riscv/rvv/autovec/zve32f-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve32f-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve32x-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve32x-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64d-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64d-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64d_zvl128b-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64f-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64f-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64f_zvl128b-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64x-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64x-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64x_zvl128b-2.c: Ditto.
|
|
This patch fixes a dead loop in the vsetvl intrinsic AVL checking.
The definition chain is cyclic:
vsetvli->get_def () has vsetvli->get_def () has vsetvli.....
So the vsetvli AVL checking keeps following it and never terminates.
PR target/109773
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc (avl_source_has_vsetvl_p): New function.
(source_equal_p): Fix dead loop in vsetvl avl checking.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/pr109773-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr109773-2.c: New test.
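To show why following such a definition chain can hang, here is a generic
sketch in C. It is not the fix made in riscv-vsetvl.cc: the struct, its
fields and the function name are all hypothetical, and the technique shown
(Floyd's two-pointer cycle detection) is swapped in purely for illustration.
A naive loop that keeps stepping through def links never ends on a cyclic
chain, whereas the check below reports the cycle.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node: each instruction points at the instruction that
   defines its AVL, or NULL when there is no further definition.  */
struct insn_node {
    bool is_vsetvl;
    const struct insn_node *def;
};

/* Return true if following def links from START ever revisits a node.
   Uses Floyd's cycle detection: one pointer advances one link per step,
   the other two links, and they meet only if the chain is cyclic.  */
bool def_chain_has_cycle(const struct insn_node *start)
{
    const struct insn_node *slow = start;
    const struct insn_node *fast = start;

    while (fast != NULL && fast->def != NULL) {
        slow = slow->def;
        fast = fast->def->def;
        if (slow == fast)
            return true;
    }
    return false;
}
```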
|
|
Typo spotted while doing CCmode improvements, as a missed
optimization. It's almost visible from the patch context;
there's not much done in terms of "mode-adjustment" when
replacing (reg:CC CRIS_CC0_REGNUM) with a copy!
This bug affects functions in the newlib printf-formatting
functions (nothing else in libgcc or newlib libc), with the
performance impact on coremark scores being less than 1e-6
(3/5078992 cycles, 6/48543 bytes).
* config/cris/cris.cc (cris_postdbr_cmpelim): Correct mode
of modeadjusted_dccr.
|
|
Implement vmaxaq and vminaq using the new MVE builtins framework.
2022-09-08 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-base.cc (vmaxaq, vminaq): New.
* config/arm/arm-mve-builtins-base.def (vmaxaq, vminaq): New.
* config/arm/arm-mve-builtins-base.h (vmaxaq, vminaq): New.
* config/arm/arm-mve-builtins.cc
(function_instance::has_inactive_argument): Handle vmaxaq and
vminaq.
* config/arm/arm_mve.h (vminaq): Remove.
(vmaxaq): Remove.
(vminaq_m): Remove.
(vmaxaq_m): Remove.
(vminaq_s8): Remove.
(vmaxaq_s8): Remove.
(vminaq_s16): Remove.
(vmaxaq_s16): Remove.
(vminaq_s32): Remove.
(vmaxaq_s32): Remove.
(vminaq_m_s8): Remove.
(vmaxaq_m_s8): Remove.
(vminaq_m_s16): Remove.
(vmaxaq_m_s16): Remove.
(vminaq_m_s32): Remove.
(vmaxaq_m_s32): Remove.
(__arm_vminaq_s8): Remove.
(__arm_vmaxaq_s8): Remove.
(__arm_vminaq_s16): Remove.
(__arm_vmaxaq_s16): Remove.
(__arm_vminaq_s32): Remove.
(__arm_vmaxaq_s32): Remove.
(__arm_vminaq_m_s8): Remove.
(__arm_vmaxaq_m_s8): Remove.
(__arm_vminaq_m_s16): Remove.
(__arm_vmaxaq_m_s16): Remove.
(__arm_vminaq_m_s32): Remove.
(__arm_vmaxaq_m_s32): Remove.
(__arm_vminaq): Remove.
(__arm_vmaxaq): Remove.
(__arm_vminaq_m): Remove.
(__arm_vmaxaq_m): Remove.
|
|
Factorize vmaxaq vminaq so that they use the same pattern.
2022-09-08 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/iterators.md (MVE_VMAXAVMINAQ, MVE_VMAXAVMINAQ_M):
New.
(mve_insn): Add vmaxa, vmina.
(supf): Add VMAXAQ_S, VMAXAQ_M_S, VMINAQ_S, VMINAQ_M_S.
* config/arm/mve.md (mve_vmaxaq_s<mode>, mve_vminaq_s<mode>):
Merge into ...
(@mve_<mve_insn>q_<supf><mode>): ... this.
(mve_vmaxaq_m_s<mode>, mve_vminaq_m_s<mode>): Merge into ...
(@mve_<mve_insn>q_m_<supf><mode>): ... this.
|
|
This patch adds the binary_maxamina shape description.
2022-09-08 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-shapes.cc (binary_maxamina): New.
* config/arm/arm-mve-builtins-shapes.h (binary_maxamina): New.
|
|
Implement vmaxnmaq and vminnmaq using the new MVE builtins framework.
2022-09-08 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-base.cc (vmaxnmaq, vminnmaq): New.
* config/arm/arm-mve-builtins-base.def (vmaxnmaq, vminnmaq): New.
* config/arm/arm-mve-builtins-base.h (vmaxnmaq, vminnmaq): New.
* config/arm/arm-mve-builtins.cc
(function_instance::has_inactive_argument): Handle vmaxnmaq and
vminnmaq.
* config/arm/arm_mve.h (vminnmaq): Remove.
(vmaxnmaq): Remove.
(vmaxnmaq_m): Remove.
(vminnmaq_m): Remove.
(vminnmaq_f16): Remove.
(vmaxnmaq_f16): Remove.
(vminnmaq_f32): Remove.
(vmaxnmaq_f32): Remove.
(vmaxnmaq_m_f16): Remove.
(vminnmaq_m_f16): Remove.
(vmaxnmaq_m_f32): Remove.
(vminnmaq_m_f32): Remove.
(__arm_vminnmaq_f16): Remove.
(__arm_vmaxnmaq_f16): Remove.
(__arm_vminnmaq_f32): Remove.
(__arm_vmaxnmaq_f32): Remove.
(__arm_vmaxnmaq_m_f16): Remove.
(__arm_vminnmaq_m_f16): Remove.
(__arm_vmaxnmaq_m_f32): Remove.
(__arm_vminnmaq_m_f32): Remove.
(__arm_vminnmaq): Remove.
(__arm_vmaxnmaq): Remove.
(__arm_vmaxnmaq_m): Remove.
(__arm_vminnmaq_m): Remove.
|
|
Factorize vmaxnmaq and vminnmaq so that they use the same pattern.
2022-09-08 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/iterators.md (MVE_VMAXNMA_VMINNMAQ)
(MVE_VMAXNMA_VMINNMAQ_M): New.
(mve_insn): Add vmaxnma, vminnma.
* config/arm/mve.md (mve_vmaxnmaq_f<mode>, mve_vminnmaq_f<mode>):
Merge into ...
(@mve_<mve_insn>q_f<mode>): ... this.
(mve_vmaxnmaq_m_f<mode>, mve_vminnmaq_m_f<mode>): Merge into ...
(@mve_<mve_insn>q_m_f<mode>): ... this.
|
|
Implement vmaxnmavq vmaxnmvq vminnmavq vminnmvq using the new MVE
builtins framework.
2022-09-08 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-base.cc (FUNCTION_PRED_P_F): New.
(vmaxnmavq, vmaxnmvq, vminnmavq, vminnmvq): New.
* config/arm/arm-mve-builtins-base.def (vmaxnmavq, vmaxnmvq)
(vminnmavq, vminnmvq): New.
* config/arm/arm-mve-builtins-base.h (vmaxnmavq, vmaxnmvq)
(vminnmavq, vminnmvq): New.
* config/arm/arm_mve.h (vminnmvq): Remove.
(vminnmavq): Remove.
(vmaxnmvq): Remove.
(vmaxnmavq): Remove.
(vmaxnmavq_p): Remove.
(vmaxnmvq_p): Remove.
(vminnmavq_p): Remove.
(vminnmvq_p): Remove.
(vminnmvq_f16): Remove.
(vminnmavq_f16): Remove.
(vmaxnmvq_f16): Remove.
(vmaxnmavq_f16): Remove.
(vminnmvq_f32): Remove.
(vminnmavq_f32): Remove.
(vmaxnmvq_f32): Remove.
(vmaxnmavq_f32): Remove.
(vmaxnmavq_p_f16): Remove.
(vmaxnmvq_p_f16): Remove.
(vminnmavq_p_f16): Remove.
(vminnmvq_p_f16): Remove.
(vmaxnmavq_p_f32): Remove.
(vmaxnmvq_p_f32): Remove.
(vminnmavq_p_f32): Remove.
(vminnmvq_p_f32): Remove.
(__arm_vminnmvq_f16): Remove.
(__arm_vminnmavq_f16): Remove.
(__arm_vmaxnmvq_f16): Remove.
(__arm_vmaxnmavq_f16): Remove.
(__arm_vminnmvq_f32): Remove.
(__arm_vminnmavq_f32): Remove.
(__arm_vmaxnmvq_f32): Remove.
(__arm_vmaxnmavq_f32): Remove.
(__arm_vmaxnmavq_p_f16): Remove.
(__arm_vmaxnmvq_p_f16): Remove.
(__arm_vminnmavq_p_f16): Remove.
(__arm_vminnmvq_p_f16): Remove.
(__arm_vmaxnmavq_p_f32): Remove.
(__arm_vmaxnmvq_p_f32): Remove.
(__arm_vminnmavq_p_f32): Remove.
(__arm_vminnmvq_p_f32): Remove.
(__arm_vminnmvq): Remove.
(__arm_vminnmavq): Remove.
(__arm_vmaxnmvq): Remove.
(__arm_vmaxnmavq): Remove.
(__arm_vmaxnmavq_p): Remove.
(__arm_vmaxnmvq_p): Remove.
(__arm_vminnmavq_p): Remove.
(__arm_vminnmvq_p): Remove.
(__arm_vmaxnmavq_m): Remove.
(__arm_vmaxnmvq_m): Remove.
|
|
We can call code_for_mve_q_p_f only once this function exists, which
is the case after we factorized vmaxnmavq, vmaxnmvq, vminnmavq and
vminnmvq in a previous patch.
2022-09-08 Christophe Lyon <christophe.lyon@arm.com>
gcc/
* config/arm/arm-mve-builtins-functions.h
(unspec_mve_function_exact_insn_pred_p): Use code_for_mve_q_p_f.
|