Age | Commit message (Collapse) | Author | Files | Lines |
|
Since satisfies_constraint_vi (x) belongs to RVV region.
We make this condition inside riscv_v_ext_vector_mode_p to make codes
more reasonable.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_const_insns): Reorganize the
codes.
|
|
If just one bit is inserted in the same position like with:
__builtin_avr_insert_bits (0xFFFFF2FF, src, dst);
a BLD/BST sequence is better than XOR/AND/XOR. Thus, don't fold that
case to the latter sequence.
gcc/
PR target/90622
* config/avr/avr.cc (avr_fold_builtin) [AVR_BUILTIN_INSERT_BITS]:
Don't fold to XOR / AND / XOR if just one bit is copied to the
same position.
|
|
This patch adds support for (a pair of) bit reversal intrinsics
__builtin_nvptx_brev and __builtin_nvptx_brevll which perform 32-bit
and 64-bit bit reversal (using nvptx's brev instruction) matching
the __brev and __brevll instrinsics provided by NVidia's nvcc compiler.
https://docs.nvidia.com/cuda/cuda-math-api/group__CUDA__MATH__INTRINSIC__INT.html
2023-05-21 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/nvptx/nvptx.cc (nvptx_expand_brev): Expand target
builtin for bit reversal using brev instruction.
(enum nvptx_builtins): Add NVPTX_BUILTIN_BREV and
NVPTX_BUILTIN_BREVLL.
(nvptx_init_builtins): Define "brev" and "brevll".
(nvptx_expand_builtin): Expand NVPTX_BUILTIN_BREV and
NVPTX_BUILTIN_BREVLL via nvptx_expand_brev function.
* doc/extend.texi (Nvidia PTX Builtin-in Functions): New
section, document __builtin_nvptx_brev{,ll}.
gcc/testsuite/ChangeLog
* gcc.target/nvptx/brev-1.c: New 32-bit test case.
* gcc.target/nvptx/brev-2.c: Likewise.
* gcc.target/nvptx/brevll-1.c: New 64-bit test case.
* gcc.target/nvptx/brevll-2.c: Likewise.
|
|
This patch support the RVV VREINTERPRET from the int to the
vbool[2|4|8|16|32|64]_t. Aka:
vbool[2|4|8|16|32|64]_t __riscv_vreinterpret_x_x(v{u}int[8|16|32|64]_t);
These APIs help the users to convert vector LMUL=1 integer to
vbool[2-64]_t. According to the RVV intrinsic SPEC as below,
the reinterpret intrinsics only change the types of the underlying
contents.
https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#reinterpret-vbool-o-vintm1
For example, given below code.
vbool64_t test_vreinterpret_v_u8m1_b64 (vuint8m1_t src) {
return __riscv_vreinterpret_v_u8m1_b64 (src);
}
It will generate the assembly code similar as below:
vsetvli a5,zero,e8,mf8,ta,ma
vlm.v v1,0(a1)
vsm.v v1,0(a0)
ret
Please NOTE the test files doesn't cover all the possible combinations
of the intrinsic APIs introduced by this PATCH due to too many.
The reinterpret from vbool*_t to v{u}int*_t with lmul=1 will be coverred
int another PATCH.
Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:
* config/riscv/genrvv-type-indexer.cc (BOOL_SIZE_LIST): Add the
rest bool size, aka 2, 4, 8, 16, 32, 64.
* config/riscv/riscv-vector-builtins-functions.def (vreinterpret):
Register vbool[2|4|8|16|32|64] interpret function.
* config/riscv/riscv-vector-builtins-types.def (DEF_RVV_BOOL2_INTERPRET_OPS):
New macro for vbool2_t.
(DEF_RVV_BOOL4_INTERPRET_OPS): Likewise.
(DEF_RVV_BOOL8_INTERPRET_OPS): Likewise.
(DEF_RVV_BOOL16_INTERPRET_OPS): Likewise.
(DEF_RVV_BOOL32_INTERPRET_OPS): Likewise.
(DEF_RVV_BOOL64_INTERPRET_OPS): Likewise.
(vint8m1_t): Add the type to bool[2|4|8|16|32|64]_interpret_ops.
(vint16m1_t): Likewise.
(vint32m1_t): Likewise.
(vint64m1_t): Likewise.
(vuint8m1_t): Likewise.
(vuint16m1_t): Likewise.
(vuint32m1_t): Likewise.
(vuint64m1_t): Likewise.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_BOOL2_INTERPRET_OPS):
New macro for vbool2_t.
(DEF_RVV_BOOL4_INTERPRET_OPS): Likewise.
(DEF_RVV_BOOL8_INTERPRET_OPS): Likewise.
(DEF_RVV_BOOL16_INTERPRET_OPS): Likewise.
(DEF_RVV_BOOL32_INTERPRET_OPS): Likewise.
(DEF_RVV_BOOL64_INTERPRET_OPS): Likewise.
(required_extensions_p): Add vbool[2|4|8|16|32|64] interpret case.
* config/riscv/riscv-vector-builtins.def (bool2_interpret): Add
vbool2_t interprect to base type.
(bool4_interpret): Likewise.
(bool8_interpret): Likewise.
(bool16_interpret): Likewise.
(bool32_interpret): Likewise.
(bool64_interpret): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/misc_vreinterpret_vbool_vint.c: Add
test cases for vbool[2|4|8|16|32|64]_t.
|
|
This patch removes the superfluous parallel in [u]divmod patterns in
the AVR backend. Effect of extra parallel is that add_clobbers reaches
gcc_unreachable() because the clobbers for [u]divmod are missing.
If an insn has multiple parts like clobbers, the parallel around the
parts of the insn pattern is implicit.
gcc/
PR target/105753
* config/avr/avr.md (divmodpsi, udivmodpsi, divmodsi, udivmodsi):
Remove superfluous "parallel" in insn pattern.
([u]divmod<mode>4): Tidy code. Use gcc_unreachable() instead of
printing error text to assembly.
gcc/testsuite/
PR target/105753
* gcc.target/avr/torture/pr105753.c: New test.
|
|
Two issues have been observed in current riscv_expand_conditional_move
implementation.
1. Before introduction of TARGET_XTHEADCONDMOV, op0 of comparision expression
is used for mode comparision with word_mode, but after TARGET_XTHEADCONDMOV
megered with TARGET_SFB_ALU, dest of if-then-else is used for mode comparision with
word_mode, and from md file mode of dest is DI or SI which can be different with
word_mode in RV64.
2. TARGET_XTHEADCONDMOV cannot be generated when the mode of the comparison is E_VOID.
This patch solves the issues above.
Provide an example from the newly added test case.
Testcase:
int ConNmv_reg_reg_reg(int x, int y, int z, int n){
if (x != y) return z;
return n;
}
Cflags:
-O2 -march=rv64gc_xtheadcondmov -mabi=lp64d
before patch:
ConNmv_reg_reg_reg:
bne a0,a1,.L23
mv a2,a3
.L23:
mv a0,a2
ret
after patch:
ConNmv_reg_reg_reg:
sub a1,a0,a1
th.mveqz a2,zero,a1
th.mvnez a3,zero,a1
or a0,a2,a3
ret
Co-Authored by: Fei Gao <gaofei@eswincomputing.com>
Signed-off-by: Die Li <lidie@eswincomputing.com>
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_expand_conditional_move): Fix mode
checking.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/xtheadcondmov-indirect-rv32.c: New test.
* gcc.target/riscv/xtheadcondmov-indirect-rv64.c: New test.
|
|
Changes since v1:
- Removed name clash change.
- Fix new pattern indentation.
-- >8 --
When (a & (1 << bit_no)) is tested inside an IF we can use a bit extract.
gcc/ChangeLog:
* config/riscv/bitmanip.md (branch<X:mode>_bext): New split pattern.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zbs-bext-02.c: New test.
|
|
Changes since v1:
- Remove subreg from operand 1.
-- >8 --
We were not able to match the CTZ sign extend pattern on RISC-V
because it gets optimized to zero extend and/or to ANDI patterns.
For the ANDI case, combine scrambles the RTL and generates the
extension by using subregs.
gcc/ChangeLog:
PR target/106888
* config/riscv/bitmanip.md
(<bitmanip_optab>disi2): Match with any_extend.
(<bitmanip_optab>disi2_sext): New pattern to match
with sign extend using an ANDI instruction.
gcc/testsuite/ChangeLog:
PR target/106888
* gcc.target/riscv/pr106888.c: New test.
* gcc.target/riscv/zbbw.c: Check for ANDI.
|
|
Sorry, I forgot the ChangeLog entry for my patch and missed the [v2]
part of the subject.
2023-05-18 Joern Rennecke <joern.rennecke@embecosm.com>
gcc/ChangeLog:
* config/riscv/constraints.md (DsS, DsD): Restore agreement
with shiftm1 mode attribute.
|
|
[part #2 of PR/109279]
SPEC2017 deepsjeng uses large constants which currently generates less than
ideal code. This fix improves codegen for large constants which have
same low and hi parts: e.g.
long long f(void) { return 0x0101010101010101ull; }
Before
li a5,0x1010000
addi a5,a5,0x101
mv a0,a5
slli a5,a5,32
add a0,a5,a0
ret
With patch
li a5,0x1010000
addi a5,a5,0x101
slli a0,a5,32
add a0,a0,a5
ret
This is testsuite clean.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_split_integer): if loval is equal
to hival, ASHIFT the corresponding regs.
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
|
|
This patch fixes the recent vmv patch in order to allow loading
of constants via vmv.vi with the "fixed-vlmax" vectorization flavor.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_const_insns): Remove else.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vmv-imm-fixed-rv32.c: New test.
* gcc.target/riscv/rvv/autovec/vmv-imm-fixed-rv64.c: New test.
|
|
gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_short_vector_p): Use _P
defines from tree.h.
(aarch64_mangle_type): Ditto.
* config/alpha/alpha.cc (alpha_in_small_data_p): Ditto.
(alpha_gimplify_va_arg_1): Ditto.
* config/arc/arc.cc (arc_encode_section_info): Ditto.
(arc_is_aux_reg_p): Ditto.
(arc_is_uncached_mem_p): Ditto.
(arc_handle_aux_attribute): Ditto.
* config/arm/arm.cc (arm_handle_isr_attribute): Ditto.
(arm_handle_cmse_nonsecure_call): Ditto.
(arm_set_default_type_attributes): Ditto.
(arm_is_segment_info_known): Ditto.
(arm_mangle_type): Ditto.
* config/arm/unknown-elf.h (IN_NAMED_SECTION_P): Ditto.
* config/avr/avr.cc (avr_lookup_function_attribute1): Ditto.
(avr_decl_absdata_p): Ditto.
(avr_insert_attributes): Ditto.
(avr_section_type_flags): Ditto.
(avr_encode_section_info): Ditto.
* config/bfin/bfin.cc (bfin_handle_l2_attribute): Ditto.
* config/bpf/bpf.cc (bpf_core_compute): Ditto.
* config/c6x/c6x.cc (c6x_in_small_data_p): Ditto.
* config/csky/csky.cc (csky_handle_isr_attribute): Ditto.
(csky_mangle_type): Ditto.
* config/darwin-c.cc (darwin_pragma_unused): Ditto.
* config/darwin.cc (is_objc_metadata): Ditto.
* config/epiphany/epiphany.cc (epiphany_function_ok_for_sibcall): Ditto.
* config/epiphany/epiphany.h (ROUND_TYPE_ALIGN): Ditto.
* config/frv/frv.cc (frv_emit_movsi): Ditto.
* config/gcn/gcn-tree.cc (gcn_lockless_update): Ditto.
* config/gcn/gcn.cc (gcn_asm_output_symbol_ref): Ditto.
* config/h8300/h8300.cc (h8300_encode_section_info): Ditto.
* config/i386/i386-expand.cc: Ditto.
* config/i386/i386.cc (type_natural_mode): Ditto.
(ix86_function_arg): Ditto.
(ix86_data_alignment): Ditto.
(ix86_local_alignment): Ditto.
(ix86_simd_clone_compute_vecsize_and_simdlen): Ditto.
* config/i386/winnt-cxx.cc (i386_pe_type_dllimport_p): Ditto.
(i386_pe_type_dllexport_p): Ditto.
(i386_pe_adjust_class_at_definition): Ditto.
* config/i386/winnt.cc (i386_pe_determine_dllimport_p): Ditto.
(i386_pe_binds_local_p): Ditto.
(i386_pe_section_type_flags): Ditto.
* config/ia64/ia64.cc (ia64_encode_section_info): Ditto.
(ia64_gimplify_va_arg): Ditto.
(ia64_in_small_data_p): Ditto.
* config/iq2000/iq2000.cc (iq2000_function_arg): Ditto.
* config/lm32/lm32.cc (lm32_in_small_data_p): Ditto.
* config/loongarch/loongarch.cc (loongarch_handle_model_attribute): Ditto.
* config/m32c/m32c.cc (m32c_insert_attributes): Ditto.
* config/mcore/mcore.cc (mcore_mark_dllimport): Ditto.
(mcore_encode_section_info): Ditto.
* config/microblaze/microblaze.cc (microblaze_elf_in_small_data_p): Ditto.
* config/mips/mips.cc (mips_output_aligned_decl_common): Ditto.
* config/mmix/mmix.cc (mmix_encode_section_info): Ditto.
* config/nvptx/nvptx.cc (nvptx_encode_section_info): Ditto.
(pass_in_memory): Ditto.
(nvptx_generate_vector_shuffle): Ditto.
(nvptx_lockless_update): Ditto.
* config/pa/pa.cc (pa_function_arg_padding): Ditto.
(pa_function_value): Ditto.
(pa_function_arg): Ditto.
* config/pa/pa.h (IN_NAMED_SECTION_P): Ditto.
(TEXT_SPACE_P): Ditto.
* config/pa/som.h (MAKE_DECL_ONE_ONLY): Ditto.
* config/pdp11/pdp11.cc (pdp11_return_in_memory): Ditto.
* config/riscv/riscv.cc (riscv_in_small_data_p): Ditto.
(riscv_mangle_type): Ditto.
* config/rl78/rl78.cc (rl78_insert_attributes): Ditto.
(rl78_addsi3_internal): Ditto.
* config/rs6000/aix.h (ROUND_TYPE_ALIGN): Ditto.
* config/rs6000/darwin.h (ROUND_TYPE_ALIGN): Ditto.
* config/rs6000/freebsd64.h (ROUND_TYPE_ALIGN): Ditto.
* config/rs6000/linux64.h (ROUND_TYPE_ALIGN): Ditto.
* config/rs6000/rs6000-call.cc (rs6000_function_arg_boundary): Ditto.
(rs6000_function_arg_advance_1): Ditto.
(rs6000_function_arg): Ditto.
(rs6000_pass_by_reference): Ditto.
* config/rs6000/rs6000-logue.cc (rs6000_function_ok_for_sibcall): Ditto.
* config/rs6000/rs6000.cc (rs6000_data_alignment): Ditto.
(rs6000_set_default_type_attributes): Ditto.
(rs6000_elf_in_small_data_p): Ditto.
(IN_NAMED_SECTION): Ditto.
(rs6000_xcoff_encode_section_info): Ditto.
(rs6000_function_value): Ditto.
(invalid_arg_for_unprototyped_fn): Ditto.
* config/s390/s390-c.cc (s390_fn_types_compatible): Ditto.
(s390_vec_n_elem): Ditto.
* config/s390/s390.cc (s390_check_type_for_vector_abi): Ditto.
(s390_function_arg_integer): Ditto.
(s390_return_in_memory): Ditto.
(s390_encode_section_info): Ditto.
* config/sh/sh.cc (sh_gimplify_va_arg_expr): Ditto.
(sh_function_value): Ditto.
* config/sol2.cc (solaris_insert_attributes): Ditto.
* config/sparc/sparc.cc (function_arg_slotno): Ditto.
* config/sparc/sparc.h (ROUND_TYPE_ALIGN): Ditto.
* config/stormy16/stormy16.cc (xstormy16_encode_section_info): Ditto.
(xstormy16_handle_below100_attribute): Ditto.
* config/v850/v850.cc (v850_encode_section_info): Ditto.
(v850_insert_attributes): Ditto.
* config/visium/visium.cc (visium_pass_by_reference): Ditto.
(visium_return_in_memory): Ditto.
* config/xtensa/xtensa.cc (xtensa_multibss_section_type_flags): Ditto.
|
|
QImode partial vector multiplications and shifts can be implemented using
their HImode counterparts. Add infrastructure to handle V8QImode and
V4QImode vectors by extending (interleaving) their input operands to
V8HImode, performing V8HImode operation and truncating output back to
the original QImode vector.
The patch implements V8QImode and V4QImode multiplication for SSE2 targets,
using generic permutation to truncate output operand, but still taking
advantage of VPMOVWB down convert instruction, when available.
The patch also removes setting of REG_EQAUL note to the last insn
of ix86_expand_vecop_qihi expander. This is what generic code does
automatically when named pattern is expanded.
gcc/ChangeLog:
* config/i386/i386-expand.cc (ix86_expand_vecop_qihi_partial): New.
(ix86_expand_vecop_qihi): Add op2vec bool variable.
Do not set REG_EQUAL note.
* config/i386/i386-protos.h (ix86_expand_vecop_qihi_partial):
Add prototype.
* config/i386/i386.cc (ix86_multiplication_cost): Handle
V4QImode and V8QImode.
* config/i386/mmx.md (mulv8qi3): New expander.
(mulv4qi3): Ditto.
* config/i386/sse.md (mulv8qi3): Remove.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512vl-pr95488-1.c: Adjust
expected scan-assembler-times frequency and strings..
* gcc.target/i386/vect-mulv4qi.c: New test.
* gcc.target/i386/vect-mulv8qi.c: New test.
|
|
gcc/ChangeLog
* config/avr/gen-avr-mmcu-specs.cc: Remove stale */ after // comment.
|
|
POSIX sh does not support the == for string comparisons, use = instead.
gcc/ChangeLog:
PR bootstrap/105831
* config/nvptx/gen-opt.sh: Use = operator instead of ==.
* configure.ac: Likewise.
* configure: Regenerate.
|
|
Hi all,
Previously we had fixed the overloading of scalar arguments to intrinsics
with the introduction of a new `__ARM_mve_coerce3` _ Generic association.
This allowed users to give types other than int32_t, e.g. int, short, long,
etc., which previously would emit a nonsensical error message from the
_Generic.
Here I adjust that handling slightly and I am also doing the same thing, but
for pointer types:
(un)signed char* can be now used instead of (u)int8_t*
(un)signed short* can be now used instead of (u)int16_t*
(un)signed int* and long* can be now used instead of (u)int32_t*
(un)signed long long* can be now used instead of (u)int64_t*
__fp16* and _Float16* can be now used instead of float16_t*
float* can be now used instead of float32_t*
This required me to break down the _coerce_ generics for the specific
pointer types.
On the scalar types, the change in this patch is minor, renaming the
_coerce_ generics and passing all scalars through the `__typeof` for
consistency with each-other.
No test regressions in the GCC testsuite or CMSIS-NN.
gcc/ChangeLog:
* config/arm/arm_mve.h: (__ARM_mve_typeid): Add more pointer types.
(__ARM_mve_coerce1): Remove.
(__ARM_mve_coerce2): Remove.
(__ARM_mve_coerce3): Remove.
(__ARM_mve_coerce_i_scalar): New.
(__ARM_mve_coerce_s8_ptr): New.
(__ARM_mve_coerce_u8_ptr): New.
(__ARM_mve_coerce_s16_ptr): New.
(__ARM_mve_coerce_u16_ptr): New.
(__ARM_mve_coerce_s32_ptr): New.
(__ARM_mve_coerce_u32_ptr): New.
(__ARM_mve_coerce_s64_ptr): New.
(__ARM_mve_coerce_u64_ptr): New.
(__ARM_mve_coerce_f_scalar): New.
(__ARM_mve_coerce_f16_ptr): New.
(__ARM_mve_coerce_f32_ptr): New.
(__arm_vst4q): Change _coerce_ overloads.
(__arm_vbicq): Change _coerce_ overloads.
(__arm_vld1q): Change _coerce_ overloads.
(__arm_vld1q_z): Change _coerce_ overloads.
(__arm_vld2q): Change _coerce_ overloads.
(__arm_vld4q): Change _coerce_ overloads.
(__arm_vldrhq_gather_offset): Change _coerce_ overloads.
(__arm_vldrhq_gather_offset_z): Change _coerce_ overloads.
(__arm_vldrhq_gather_shifted_offset): Change _coerce_ overloads.
(__arm_vldrhq_gather_shifted_offset_z): Change _coerce_ overloads.
(__arm_vldrwq_gather_offset): Change _coerce_ overloads.
(__arm_vldrwq_gather_offset_z): Change _coerce_ overloads.
(__arm_vldrwq_gather_shifted_offset): Change _coerce_ overloads.
(__arm_vldrwq_gather_shifted_offset_z): Change _coerce_ overloads.
(__arm_vst1q_p): Change _coerce_ overloads.
(__arm_vst2q): Change _coerce_ overloads.
(__arm_vst1q): Change _coerce_ overloads.
(__arm_vstrhq): Change _coerce_ overloads.
(__arm_vstrhq_p): Change _coerce_ overloads.
(__arm_vstrhq_scatter_offset_p): Change _coerce_ overloads.
(__arm_vstrhq_scatter_offset): Change _coerce_ overloads.
(__arm_vstrhq_scatter_shifted_offset_p): Change _coerce_ overloads.
(__arm_vstrhq_scatter_shifted_offset): Change _coerce_ overloads.
(__arm_vstrwq_p): Change _coerce_ overloads.
(__arm_vstrwq): Change _coerce_ overloads.
(__arm_vstrwq_scatter_offset): Change _coerce_ overloads.
(__arm_vstrwq_scatter_offset_p): Change _coerce_ overloads.
(__arm_vstrwq_scatter_shifted_offset): Change _coerce_ overloads.
(__arm_vstrwq_scatter_shifted_offset_p): Change _coerce_ overloads.
(__arm_vsetq_lane): Change _coerce_ overloads.
(__arm_vldrbq_gather_offset): Change _coerce_ overloads.
(__arm_vdwdupq_x_u8): Change _coerce_ overloads.
(__arm_vdwdupq_x_u16): Change _coerce_ overloads.
(__arm_vdwdupq_x_u32): Change _coerce_ overloads.
(__arm_viwdupq_x_u8): Change _coerce_ overloads.
(__arm_viwdupq_x_u16): Change _coerce_ overloads.
(__arm_viwdupq_x_u32): Change _coerce_ overloads.
(__arm_vidupq_x_u8): Change _coerce_ overloads.
(__arm_vddupq_x_u8): Change _coerce_ overloads.
(__arm_vidupq_x_u16): Change _coerce_ overloads.
(__arm_vddupq_x_u16): Change _coerce_ overloads.
(__arm_vidupq_x_u32): Change _coerce_ overloads.
(__arm_vddupq_x_u32): Change _coerce_ overloads.
(__arm_vldrdq_gather_offset): Change _coerce_ overloads.
(__arm_vldrdq_gather_offset_z): Change _coerce_ overloads.
(__arm_vldrdq_gather_shifted_offset): Change _coerce_ overloads.
(__arm_vldrdq_gather_shifted_offset_z): Change _coerce_ overloads.
(__arm_vldrbq_gather_offset_z): Change _coerce_ overloads.
(__arm_vidupq_u16): Change _coerce_ overloads.
(__arm_vidupq_u32): Change _coerce_ overloads.
(__arm_vidupq_u8): Change _coerce_ overloads.
(__arm_vddupq_u16): Change _coerce_ overloads.
(__arm_vddupq_u32): Change _coerce_ overloads.
(__arm_vddupq_u8): Change _coerce_ overloads.
(__arm_viwdupq_m): Change _coerce_ overloads.
(__arm_viwdupq_u16): Change _coerce_ overloads.
(__arm_viwdupq_u32): Change _coerce_ overloads.
(__arm_viwdupq_u8): Change _coerce_ overloads.
(__arm_vdwdupq_m): Change _coerce_ overloads.
(__arm_vdwdupq_u16): Change _coerce_ overloads.
(__arm_vdwdupq_u32): Change _coerce_ overloads.
(__arm_vdwdupq_u8): Change _coerce_ overloads.
(__arm_vstrbq): Change _coerce_ overloads.
(__arm_vstrbq_p): Change _coerce_ overloads.
(__arm_vstrbq_scatter_offset_p): Change _coerce_ overloads.
(__arm_vstrdq_scatter_offset_p): Change _coerce_ overloads.
(__arm_vstrdq_scatter_offset): Change _coerce_ overloads.
(__arm_vstrdq_scatter_shifted_offset_p): Change _coerce_ overloads.
(__arm_vstrdq_scatter_shifted_offset): Change _coerce_ overloads.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-fp.c: Add testcases.
* gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c: Add testcases.
|
|
We found this as part of the wider testsuite updates.
The applicable tests are authored by Andrea earlier in this patch series
Ok for trunk?
gcc/ChangeLog:
* config/arm/arm_mve.h (__arm_vbicq): Change coerce on
scalar constant.
|
|
Hi all,
We noticed that calls to the vadcq and vsbcq intrinsics, both of
which use __builtin_arm_set_fpscr_nzcvqc to set the Carry flag in
the FPSCR, would produce the following code:
```
< r2 is the *carry input >
vmrs r3, FPSCR_nzcvqc
bic r3, r3, #536870912
orr r3, r3, r2, lsl #29
vmsr FPSCR_nzcvqc, r3
```
when the MVE ACLE instead gives a different instruction sequence of:
```
< Rt is the *carry input >
VMRS Rs,FPSCR_nzcvqc
BFI Rs,Rt,#29,#1
VMSR FPSCR_nzcvqc,Rs
```
the bic + orr pair is slower and it's also wrong, because, if the
*carry input is greater than 1, then we risk overwriting the top two
bits of the FPSCR register (the N and Z flags).
This turned out to be a problem in the header file and the solution was
to simply add a `& 1x0u` to the `*carry` input: then the compiler knows
that we only care about the lowest bit and can optimise to a BFI.
Ok for trunk?
Thanks,
Stam Markianos-Wright
gcc/ChangeLog:
* config/arm/arm_mve.h (__arm_vadcq_s32): Fix arithmetic.
(__arm_vadcq_u32): Likewise.
(__arm_vadcq_m_s32): Likewise.
(__arm_vadcq_m_u32): Likewise.
(__arm_vsbcq_s32): Likewise.
(__arm_vsbcq_u32): Likewise.
(__arm_vsbcq_m_s32): Likewise.
(__arm_vsbcq_m_u32): Likewise.
* config/arm/mve.md (get_fpscr_nzcvqc): Make unspec_volatile.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/mve_vadcq_vsbcq_fpscr_overwrite.c: New.
|
|
Hi all,
this patch improves a number of MVE tests in the testsuite for more
precise and better coverage using check-function-bodies instead of
scan-assembler checks. Also all intrusctions prescribed in the
ACLE[1] are now checked.
Also a number of simple fixes are done in the backend to fix
capitalization and spacing.
Best Regards
Andrea
[1] <https://github.com/ARM-software/acle>
gcc/ChangeLog:
* config/arm/mve.md (mve_vrndq_m_f<mode>, mve_vrev64q_f<mode>)
(mve_vrev32q_fv8hf, mve_vcvttq_f32_f16v4sf)
(mve_vcvtbq_f32_f16v4sf, mve_vcvtq_to_f_<supf><mode>)
(mve_vrev64q_<supf><mode>, mve_vcvtq_from_f_<supf><mode>)
(mve_vmovltq_<supf><mode>, mve_vmovlbq_<supf><mode>)
(mve_vcvtpq_<supf><mode>, mve_vcvtnq_<supf><mode>)
(mve_vcvtmq_<supf><mode>, mve_vcvtaq_<supf><mode>)
(mve_vmvnq_n_<supf><mode>, mve_vrev16q_<supf>v16qi)
(mve_vctp<MVE_vctp>q<MVE_vpred>, mve_vbrsrq_n_f<mode>)
(mve_vbrsrq_n_<supf><mode>, mve_vandq_f<mode>, mve_vbicq_f<mode>)
(mve_vctp<MVE_vctp>q_m<MVE_vpred>, mve_vcvtbq_f16_f32v8hf)
(mve_vcvttq_f16_f32v8hf, mve_veorq_f<mode>)
(mve_vmlaldavxq_s<mode>, mve_vmlsldavq_s<mode>)
(mve_vmlsldavxq_s<mode>, mve_vornq_f<mode>, mve_vorrq_f<mode>)
(mve_vrmlaldavhxq_sv4si, mve_vcvtq_m_to_f_<supf><mode>)
(mve_vshlcq_<supf><mode>, mve_vmvnq_m_<supf><mode>)
(mve_vpselq_<supf><mode>, mve_vcvtbq_m_f16_f32v8hf)
(mve_vcvtbq_m_f32_f16v4sf, mve_vcvttq_m_f16_f32v8hf)
(mve_vcvttq_m_f32_f16v4sf, mve_vmlaldavq_p_<supf><mode>)
(mve_vmlsldavaq_s<mode>, mve_vmlsldavaxq_s<mode>)
(mve_vmlsldavq_p_s<mode>, mve_vmlsldavxq_p_s<mode>)
(mve_vmvnq_m_n_<supf><mode>, mve_vorrq_m_n_<supf><mode>)
(mve_vpselq_f<mode>, mve_vrev32q_m_fv8hf)
(mve_vrev32q_m_<supf><mode>, mve_vrev64q_m_f<mode>)
(mve_vrmlaldavhaxq_sv4si, mve_vrmlaldavhxq_p_sv4si)
(mve_vrmlsldavhaxq_sv4si, mve_vrmlsldavhq_p_sv4si)
(mve_vrmlsldavhxq_p_sv4si, mve_vrev16q_m_<supf>v16qi)
(mve_vrmlaldavhq_p_<supf>v4si, mve_vrmlsldavhaq_sv4si)
(mve_vandq_m_<supf><mode>, mve_vbicq_m_<supf><mode>)
(mve_veorq_m_<supf><mode>, mve_vornq_m_<supf><mode>)
(mve_vorrq_m_<supf><mode>, mve_vandq_m_f<mode>)
(mve_vbicq_m_f<mode>, mve_veorq_m_f<mode>, mve_vornq_m_f<mode>)
(mve_vorrq_m_f<mode>)
(mve_vstrdq_scatter_shifted_offset_p_<supf>v2di_insn)
(mve_vstrdq_scatter_shifted_offset_<supf>v2di_insn)
(mve_vstrdq_scatter_base_wb_p_<supf>v2di) : Fix spacing and
capitalization in the emitted asm.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/intrinsics/asrl.c: Use
check-function-bodies instead of scan-assembler checks. Use
extern "C" for C++ testing.
* gcc.target/arm/mve/intrinsics/lsll.c: Likewise.
* gcc.target/arm/mve/intrinsics/sqrshr.c: Likewise.
* gcc.target/arm/mve/intrinsics/sqrshrl_sat48.c: Likewise.
* gcc.target/arm/mve/intrinsics/sqshl.c: Likewise.
* gcc.target/arm/mve/intrinsics/sqshll.c: Likewise.
* gcc.target/arm/mve/intrinsics/srshr.c: Likewise.
* gcc.target/arm/mve/intrinsics/srshrl.c: Likewise.
* gcc.target/arm/mve/intrinsics/uqrshl.c: Likewise.
* gcc.target/arm/mve/intrinsics/uqrshll_sat48.c: Likewise.
* gcc.target/arm/mve/intrinsics/uqshl.c: Likewise.
* gcc.target/arm/mve/intrinsics/uqshll.c: Likewise.
* gcc.target/arm/mve/intrinsics/urshr.c: Likewise.
* gcc.target/arm/mve/intrinsics/urshrl.c: Likewise.
* gcc.target/arm/mve/intrinsics/vadciq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vadciq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vadciq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vadciq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vadcq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vadcq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vadcq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vadcq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vandq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbicq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_m_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_m_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_x_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_x_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_x_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_x_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_x_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vbrsrq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpgtq_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vctp16q.c: Likewise.
* gcc.target/arm/mve/intrinsics/vctp16q_m.c: Likewise.
* gcc.target/arm/mve/intrinsics/vctp32q.c: Likewise.
* gcc.target/arm/mve/intrinsics/vctp32q_m.c: Likewise.
* gcc.target/arm/mve/intrinsics/vctp64q.c: Likewise.
* gcc.target/arm/mve/intrinsics/vctp64q_m.c: Likewise.
* gcc.target/arm/mve/intrinsics/vctp8q.c: Likewise.
* gcc.target/arm/mve/intrinsics/vctp8q_m.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtaq_m_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtaq_m_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtaq_m_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtaq_m_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtaq_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtaq_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtaq_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtaq_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtaq_x_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtaq_x_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtaq_x_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtaq_x_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtbq_f16_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtbq_f32_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtbq_m_f16_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtbq_m_f32_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtbq_x_f32_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtmq_m_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtmq_m_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtmq_m_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtmq_m_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtmq_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtmq_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtmq_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtmq_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtmq_x_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtmq_x_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtmq_x_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtmq_x_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtnq_m_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtnq_m_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtnq_m_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtnq_m_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtnq_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtnq_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtnq_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtnq_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtnq_x_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtnq_x_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtnq_x_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtnq_x_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtpq_m_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtpq_m_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtpq_m_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtpq_m_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtpq_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtpq_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtpq_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtpq_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtpq_x_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtpq_x_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtpq_x_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtpq_x_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_f16_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_f16_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_f32_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_f32_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_m_f16_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_m_f16_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_m_f32_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_m_f32_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_m_n_f16_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_m_n_f16_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_m_n_f32_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_m_n_f32_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_m_n_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_m_n_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_m_n_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_m_n_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_m_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_m_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_m_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_m_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_f16_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_f16_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_f32_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_f32_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_x_f16_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_x_f16_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_x_f32_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_x_f32_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_x_n_f16_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_x_n_f16_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_x_n_f32_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_x_n_f32_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_x_n_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_x_n_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_x_n_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_x_n_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_x_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_x_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_x_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_x_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvttq_f16_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvttq_f32_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvttq_m_f16_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvttq_m_f32_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvttq_x_f32_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/veorq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmaq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmaq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmaq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmaq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmaq_m_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmaq_m_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmaq_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmaq_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmasq_m_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmasq_m_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmasq_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmasq_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmsq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmsq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmsq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmsq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot270_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot270_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot270_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot270_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot270_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot270_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot270_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot270_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot270_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot90_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot90_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot90_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot90_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot90_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot90_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot90_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot90_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhcaddq_rot90_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavq_p_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavq_p_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavxq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavxq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavxq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavxq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavaq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavaq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavaq_p_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavaq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavaq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavaq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavaq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavaq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavaxq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavaxq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavaxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavaxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavq_p_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavxq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavxq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavaq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavaq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavaq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavaq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavaq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavaq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavaxq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavaxq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavaxq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavaxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavaxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavaxq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavxq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavxq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavxq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsdavxq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsldavaq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsldavaq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsldavaq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsldavaq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsldavaxq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsldavaxq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsldavaxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsldavaxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsldavq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsldavq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsldavq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsldavq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsldavxq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsldavxq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsldavxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlsldavxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovlbq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovlbq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovlbq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovlbq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovlbq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovlbq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovlbq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovlbq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovlbq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovlbq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovlbq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovlbq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovltq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovltq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovltq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovltq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovltq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovltq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovltq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovltq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovltq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovltq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovltq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovltq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovnbq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovnbq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovnbq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovnbq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovnbq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovnbq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovnbq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovnbq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovntq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovntq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovntq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovntq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovntq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovntq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovntq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmovntq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_x_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_x_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vornq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vorrq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vpnot.c: Likewise.
* gcc.target/arm/mve/intrinsics/vpselq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vpselq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vpselq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vpselq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vpselq_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vpselq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vpselq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vpselq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vpselq_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vpselq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovnbq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovnbq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovnbq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovnbq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovnbq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovnbq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovnbq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovnbq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovntq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovntq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovntq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovntq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovntq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovntq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovntq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovntq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovunbq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovunbq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovunbq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovunbq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovuntq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovuntq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovuntq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqmovuntq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhxq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlahq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlahq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlahq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshlq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrnbq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrnbq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrnbq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrnbq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrnbq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrnbq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrnbq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrnbq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrntq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrntq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrntq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrntq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrntq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrntq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrntq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrntq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrunbq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrunbq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrunbq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshrunbq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshruntq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshruntq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshruntq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrshruntq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_r_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_r_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_r_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_r_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_r_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_r_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_r_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_r_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_r_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_r_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_r_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_r_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshlq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshluq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshluq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshluq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshluq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshluq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshluq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrnbq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrnbq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrnbq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrnbq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrnbq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrnbq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrnbq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrnbq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrntq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrntq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrntq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrntq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrntq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrntq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrntq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrntq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrunbq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrunbq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrunbq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshrunbq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshruntq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshruntq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshruntq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqshruntq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqsubq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev16q_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev16q_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev16q_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev16q_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev16q_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev16q_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev32q_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev32q_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev32q_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev32q_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev32q_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev32q_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev32q_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev32q_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev32q_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev32q_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev32q_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev32q_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev32q_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev32q_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev32q_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrhaddq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlaldavhaq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlaldavhaq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlaldavhaq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlaldavhaq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlaldavhaxq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlaldavhaxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlaldavhq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlaldavhq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlaldavhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlaldavhq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlaldavhxq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlaldavhxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlsldavhaq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlsldavhaq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlsldavhaxq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlsldavhaxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlsldavhq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlsldavhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlsldavhxq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmlsldavhxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrmulhq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndaq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndaq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndaq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndaq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndaq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndaq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndmq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndmq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndmq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndmq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndmq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndmq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndnq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndnq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndnq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndnq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndnq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndnq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndpq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndpq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndpq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndpq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndpq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndpq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndxq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndxq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndxq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndxq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndxq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrndxq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrnbq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrnbq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrnbq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrnbq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrnbq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrnbq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrnbq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrnbq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrntq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrntq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrntq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrntq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrntq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrntq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrntq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrntq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_x_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_x_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_x_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshrq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsbciq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsbciq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsbciq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsbciq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsbcq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsbcq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlcq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlcq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlcq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlcq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlcq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlcq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlcq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlcq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlcq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlcq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlcq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlcq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshllbq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshllbq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshllbq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshllbq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshllbq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshllbq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshllbq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshllbq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshllbq_x_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshllbq_x_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshllbq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshllbq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlltq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlltq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlltq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlltq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlltq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlltq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlltq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlltq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlltq_x_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlltq_x_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlltq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlltq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_r_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_r_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_r_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_r_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_r_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_r_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_r_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_r_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_r_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_r_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_r_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_r_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_x_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_x_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_x_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrnbq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrnbq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrnbq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrnbq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrnbq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrnbq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrnbq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrnbq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrntq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrntq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrntq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrntq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrntq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrntq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrntq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrntq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_x_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_x_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_x_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsliq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsliq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsliq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsliq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsliq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsliq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsliq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsliq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsliq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsliq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsliq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsliq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsriq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsriq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsriq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsriq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsriq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsriq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsriq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsriq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsriq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsriq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsriq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsriq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_p_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_p_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_p_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst1q_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_p_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_p_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_offset_p_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_offset_p_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_offset_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_offset_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_shifted_offset_p_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_shifted_offset_p_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_shifted_offset_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_shifted_offset_u64.c: Likewise.
|
|
Hi all,
this patch fixes the vstrwq* MVE instrinsics failing to emit the
correct sequence of instruction due to a missing predicate. Also the
immediate range is fixed to be multiples of 2 up between [-252, 252].
Best Regards
Andrea
gcc/ChangeLog:
* config/arm/constraints.md (mve_vldrd_immediate): Move it to
predicates.md.
(Ri): Move constraint definition from predicates.md.
(Rl): Define new constraint.
* config/arm/mve.md (mve_vstrwq_scatter_base_wb_p_<supf>v4si): Add
missing constraint.
(mve_vstrwq_scatter_base_wb_p_fv4sf): Add missing Up constraint
for op 1, use mve_vstrw_immediate predicate and Rl constraint for
op 2. Fix asm output spacing.
(mve_vstrdq_scatter_base_wb_p_<supf>v2di): Add missing constraint.
* config/arm/predicates.md (Ri) Move constraint to constraints.md
(mve_vldrd_immediate): Move it from
constraints.md.
(mve_vstrw_immediate): New predicate.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/intrinsics/vstrwq_f32.c: Use
check-function-bodies instead of scan-assembler checks. Use
extern "C" for C++ testing.
* gcc.target/arm/mve/intrinsics/vstrwq_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_u32.c: Likewise.
|
|
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Remove
trailing spaces on lines.
* config/riscv/riscv.cc (riscv_legitimize_move): Likewise.
* config/riscv/riscv.h (enum reg_class): Likewise.
* config/riscv/riscv.md: Likewise.
|
|
2023-05-17 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
* config/pa/pa.md (clear_cache): New.
|
|
Rotate instructions do not need to mask the third operand.
For example, RV64 the following code:
unsigned long foo1(unsigned long rs1, unsigned long rs2)
{
long shamt = rs2 & (64 - 1);
return (rs1 << shamt) | (rs1 >> ((64 - shamt) & (64 - 1)));
}
Compiles to:
foo1:
andi a1,a1,63
rol a0,a0,a1
ret
This patch removes unnecessary masking.
Besides, I have merged masking insns for shifts that were written before.
gcc/ChangeLog:
* config/riscv/riscv.md (*<optab><GPR:mode>3_mask): New pattern,
combined from ...
(*<optab>si3_mask, *<optab>di3_mask): Here.
(*<optab>si3_mask_1, *<optab>di3_mask_1): And here.
* config/riscv/bitmanip.md (*<bitmanip_optab><GPR:mode>3_mask): New
pattern.
(*<bitmanip_optab>si3_sext_mask): Likewise.
* config/riscv/iterators.md (shiftm1): Use const_si_mask_operand
and const_di_mask_operand.
(bitmanip_rotate): New iterator.
(bitmanip_optab): Add rotates.
* config/riscv/predicates.md (const_si_mask_operand): Renamed
from const31_operand. Generalize to handle more mask constants.
(const_di_mask_operand): Similarly.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/shift-and-2.c: Fixed test
* gcc.target/riscv/zbb-rol-ror-01.c: New test
* gcc.target/riscv/zbb-rol-ror-02.c: New test
* gcc.target/riscv/zbb-rol-ror-03.c: New test
* gcc.target/riscv/zbb-rol-ror-04.c: New test
* gcc.target/riscv/zbb-rol-ror-05.c: New test
* gcc.target/riscv/zbb-rol-ror-06.c: New test
* gcc.target/riscv/zbb-rol-ror-07.c: New test
|
|
builtins [PR109884]
When _Float128 support has been added to C++ for 13.1, float128t_type_node
tree has been added - in C float128_type_node and float128t_type_node is
the same and represents both _Float128 and __float128, but in C++ they
are distinct types which have different handling in the FEs.
When doing that change, I mistakenly forgot to change FLOAT128 primitive
type, which is used for the __builtin_{inf,huge_val,nan{,s},fabs,copysign}q
builtins results and some of their arguments (and nothing else).
The following patch fixes that.
On ia64 we already use float128t_type_node for those builtins, pa while
it has __float128 that type is the same as long double and so those builtins
have long double types and on powerpc seems we don't have these builtins
but instead define macros which map them to __builtin_*f128. That will
not work properly in C++, perhaps we should change those macros to be
function-like and cast to __float128.
2023-05-17 Jakub Jelinek <jakub@redhat.com>
PR c++/109884
* config/i386/i386-builtin-types.def (FLOAT128): Use
float128t_type_node rather than float128_type_node.
* c-c++-common/pr109884.c: New test.
|
|
Returned integer vector mode costs of emulated modes in
ix86_multiplication_cost are wrong and do not reflect generated
instruction sequences. Rewrite handling of different integer vector
modes and different target ABIs to return real instruction
counts in order to calcuate better costs of various emulated modes.
gcc/ChangeLog:
* config/i386/i386.cc (ix86_multiplication_cost): Correct
calcuation of integer vector mode costs to reflect generated
instruction sequences of different integer vector modes and
different target ABIs.
|
|
fixed-point instructions
Hi, this patch support the new coming fixed-point intrinsics:
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/222
Insert fixed-point rounding mode configuration by mode switching target hook.
Mode switching target hook is implemented applying LCM (Lazy code Motion).
So the performance && correctness can be well trusted.
Here is the example:
void f (void * in, void *out, int32_t x, int n, int m)
{
for (int i = 0; i < n; i++) {
vint32m1_t v = __riscv_vle32_v_i32m1 (in + i, 4);
vint32m1_t v2 = __riscv_vle32_v_i32m1_tu (v, in + 100 + i, 4);
vint32m1_t v3 = __riscv_vaadd_vx_i32m1 (v2, 0, VXRM_RDN, 4);
v3 = __riscv_vaadd_vx_i32m1 (v3, 3, VXRM_RDN, 4);
__riscv_vse32_v_i32m1 (out + 100 + i, v3, 4);
}
for (int i = 0; i < n; i++) {
vint32m1_t v = __riscv_vle32_v_i32m1 (in + i + 1000, 4);
vint32m1_t v2 = __riscv_vle32_v_i32m1_tu (v, in + 100 + i + 1000, 4);
vint32m1_t v3 = __riscv_vaadd_vx_i32m1 (v2, 0, VXRM_RDN, 4);
v3 = __riscv_vaadd_vx_i32m1 (v3, 3, VXRM_RDN, 4);
__riscv_vse32_v_i32m1 (out + 100 + i + 1000, v3, 4);
}
}
ASM:
...
csrwi vxrm,2
vsetivli zero,4,e32,m1,tu,ma
...
Loop 1
...
Loop 2
mode switching can global recognize both Loop 1 and Loop 2 are using RDN
rounding mode and hoist such single "csrwi vxrm,2" to dominate both Loop 1
and Loop 2.
Besides, I have add correctness check sanity tests in this patch too.
Ok for trunk ?
gcc/ChangeLog:
* config/riscv/riscv-opts.h (enum riscv_entity): New enum.
* config/riscv/riscv.cc (riscv_emit_mode_set): New function.
(riscv_mode_needed): Ditto.
(riscv_mode_after): Ditto.
(riscv_mode_entry): Ditto.
(riscv_mode_exit): Ditto.
(riscv_mode_priority): Ditto.
(TARGET_MODE_EMIT): New target hook.
(TARGET_MODE_NEEDED): Ditto.
(TARGET_MODE_AFTER): Ditto.
(TARGET_MODE_ENTRY): Ditto.
(TARGET_MODE_EXIT): Ditto.
(TARGET_MODE_PRIORITY): Ditto.
* config/riscv/riscv.h (OPTIMIZE_MODE_SWITCHING): Ditto.
(NUM_MODES_FOR_MODE_SWITCHING): Ditto.
* config/riscv/riscv.md: Add csrwvxrm.
* config/riscv/vector.md (rnu,rne,rdn,rod,none): New attribute.
(vxrmsi): New pattern.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/vxrm-10.c: New test.
* gcc.target/riscv/rvv/base/vxrm-6.c: New test.
* gcc.target/riscv/rvv/base/vxrm-7.c: New test.
* gcc.target/riscv/rvv/base/vxrm-8.c: New test.
* gcc.target/riscv/rvv/base/vxrm-9.c: New test.
|
|
According to new comming fixed-point API:
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/222
Introduce vxrm argument:
- vint32m1_t __riscv_vsadd_vv_i32m1 (vint32m1_t op1, vint32m1_t op2, size_t vl);
+ vint32m1_t __riscv_vsadd_vv_i32m1 (vint32m1_t op1, vint32m1_t op2, size_t vxrm, size_t vl);
This patch doesn't insert vxrm csrw configuration instruction yet.
Will support automatically insert csrw vxrm instruction in the next patch.
This patch does this following:
1. Only extend the vxrm argument.
2. Check vxrm argument is invalid immediate and report error message if it is invalid.
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins-bases.cc: Introduce rounding mode.
* config/riscv/riscv-vector-builtins-shapes.cc (struct alu_def): Ditto.
(struct narrow_alu_def): Ditto.
* config/riscv/riscv-vector-builtins.cc (function_builder::apply_predication): Ditto.
(function_expander::use_exact_insn): Ditto.
* config/riscv/riscv-vector-builtins.h (function_checker::arg_num): New function.
(function_base::has_rounding_mode_operand_p): New function.
gcc/testsuite/ChangeLog:
* g++.target/riscv/rvv/base/bug-11.C: Adapt testcase.
* g++.target/riscv/rvv/base/bug-12.C: Ditto.
* g++.target/riscv/rvv/base/bug-14.C: Ditto.
* g++.target/riscv/rvv/base/bug-15.C: Ditto.
* g++.target/riscv/rvv/base/bug-16.C: Ditto.
* g++.target/riscv/rvv/base/bug-17.C: Ditto.
* g++.target/riscv/rvv/base/bug-18.C: Ditto.
* g++.target/riscv/rvv/base/bug-19.C: Ditto.
* g++.target/riscv/rvv/base/bug-20.C: Ditto.
* g++.target/riscv/rvv/base/bug-21.C: Ditto.
* g++.target/riscv/rvv/base/bug-22.C: Ditto.
* g++.target/riscv/rvv/base/bug-23.C: Ditto.
* g++.target/riscv/rvv/base/bug-3.C: Ditto.
* g++.target/riscv/rvv/base/bug-5.C: Ditto.
* g++.target/riscv/rvv/base/bug-6.C: Ditto.
* g++.target/riscv/rvv/base/bug-8.C: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-100.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-101.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-102.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-103.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-104.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-105.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-106.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-107.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-108.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-109.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-110.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-111.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-112.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-113.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-114.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-115.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-116.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-117.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-118.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-119.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-122.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-97.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-98.c: Ditto.
* gcc.target/riscv/rvv/base/merge_constraint-1.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-6.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-7.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-8.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-9.c: Ditto.
* gcc.target/riscv/rvv/base/vxrm-2.c: New test.
* gcc.target/riscv/rvv/base/vxrm-3.c: New test.
* gcc.target/riscv/rvv/base/vxrm-4.c: New test.
* gcc.target/riscv/rvv/base/vxrm-5.c: New test.
|
|
Hi, since fixed-point with modeling rounding mode intrinsics are coming:
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/222
I am adding vxrm rounding mode enum to user first before the API intrinsic.
This patch is simple && obvious.
Ok for trunk ?
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins.cc (register_vxrm): New function.
(DEF_RVV_VXRM_ENUM): New macro.
(handle_pragma_vector): Add vxrm enum register.
* config/riscv/riscv-vector-builtins.def (DEF_RVV_VXRM_ENUM): New macro.
(RNU): Ditto.
(RNE): Ditto.
(RDN): Ditto.
(ROD): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/vxrm-1.c: New test.
|
|
So far atomic objects are aligned according to their default alignment.
For 128 bit scalar types like int128 or long double this results in an
8 byte alignment which is wrong and must be 16 byte.
libstdc++ already computes a correct alignment, though, still adding a
test case in order to make sure that both implementations are
compatible.
gcc/ChangeLog:
* config/s390/s390.cc (TARGET_ATOMIC_ALIGN_FOR_MODE):
New.
(s390_atomic_align_for_mode): New.
gcc/testsuite/ChangeLog:
* g++.target/s390/atomic-align-1.C: New test.
* gcc.target/s390/atomic-align-1.c: New test.
* gcc.target/s390/atomic-align-2.c: New test.
|
|
This patch support the RVV VREINTERPRET from the int to the vbool1_t. Aka:
vbool1_t __riscv_vreinterpret_xx_xx(v{u}int[8|16|32|64]_t);
These APIs help the users to convert vector LMUL=1 integer to vbool1_t.
According to the RVV intrinsic SPEC as below, the reinterpret intrinsics
only change the types of the underlying contents.
https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#reinterpret-vbool-o-vintm1
For example, given below code.
vbool1_t test_vreinterpret_v_i8m1_b1(vint8m1_t src) {
return __riscv_vreinterpret_v_i8m1_b1(src);
}
It will generate the assembly code similar as below:
vsetvli a5,zero,e8,m8,ta,ma
vlm.v v1,0(a1)
vsm.v v1,0(a0)
ret
The rest intrinsic bool size APIs will be prepared in other PATCH.
Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:
* config/riscv/genrvv-type-indexer.cc (BOOL_SIZE_LIST): New
macro.
(main): Add bool1 to the type indexer.
* config/riscv/riscv-vector-builtins-functions.def
(vreinterpret): Register vbool1 interpret function.
* config/riscv/riscv-vector-builtins-types.def
(DEF_RVV_BOOL1_INTERPRET_OPS): New macro.
(vint8m1_t): Add the type to bool1_interpret_ops.
(vint16m1_t): Ditto.
(vint32m1_t): Ditto.
(vint64m1_t): Ditto.
(vuint8m1_t): Ditto.
(vuint16m1_t): Ditto.
(vuint32m1_t): Ditto.
(vuint64m1_t): Ditto.
* config/riscv/riscv-vector-builtins.cc
(DEF_RVV_BOOL1_INTERPRET_OPS): New macro.
(required_extensions_p): Add bool1 interpret case.
* config/riscv/riscv-vector-builtins.def
(bool1_interpret): Add bool1 interpret to base type.
* config/riscv/vector.md (@vreinterpret<mode>): Add new expand
with VB dest for vreinterpret.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/misc_vreinterpret_vbool_vint.c: New test.
|
|
For constant C:
If '(c & 0xFFFFFFFF0000FFFFULL) == 0xFFFFFFFF00000000' or say:
32(1) || 1(0) || 15(x) || 16(0), we could use "lis; xoris" to build.
Here N(M) means N continuous bit M, x for M means it is ok for either
1 or 0; '||' means concatenation.
This patch update rs6000_emit_set_long_const to support those constants.
Compare with previous version:
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608292.html
This patch updates test function names only.
Bootstrap and regtest pass on ppc64{,le}.
PR target/106708
gcc/ChangeLog:
* config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Support building
constants through "lis; xoris".
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr106708.c: Add test function.
|
|
Vectorize memset with a constant length of less than or equal to 64
bytes.
Do not perform a libc function call into memset in case the size is not
a compile-time constant but bounded and the upper bound is less than or
equal to 256 bytes.
gcc/ChangeLog:
* config/s390/s390-protos.h (s390_expand_setmem): Change
function signature.
* config/s390/s390.cc (s390_expand_setmem): For memset's less
than or equal to 256 byte do not perform a libc call.
* config/s390/s390.md: Change expander into a version which
takes 8 operands.
gcc/testsuite/ChangeLog:
* gcc.target/s390/memset-1.c: Test case memset1 makes use of
vst, now.
|
|
gcc/ChangeLog:
* config/s390/s390-protos.h (s390_expand_movmem): New.
* config/s390/s390.cc (s390_expand_movmem): New.
* config/s390/s390.md (movmem<mode>): New.
(*mvcrl): New.
(mvcrl): New.
|
|
Do not perform a libc function call into memcpy in case the size is not
a compile-time constant but bounded and the upper bound is less than or
equal to 256 bytes.
gcc/ChangeLog:
* config/s390/s390-protos.h (s390_expand_cpymem): Change
function signature.
* config/s390/s390.cc (s390_expand_cpymem): For memcpy's less
than or equal to 256 byte do not perform a libc call.
(s390_expand_insv): Adapt new function signature of
s390_expand_cpymem.
* config/s390/s390.md: Change expander into a version which
takes 8 operands.
|
|
This patch is adding rounding mode operand and FRM_REGNUM dependency
into floating-point instructions.
The floating-point instructions we added FRM and rounding mode operand:
1. vfadd/vfsub
2. vfwadd/vfwsub
3. vfmul
4. vfdiv
5. vfwmul
6. vfwmacc/vfwnmacc/vfwmsac/vfwnmsac
7. vfsqrt
8. floating-point conversions.
9. floating-point reductions.
10. floating-point ternary.
The floating-point instructions we did NOT add FRM and rounding mode
operand:
1. vfabs/vfneg/vfsqrt7/vfrec7
2. vfmin/vfmax
3. comparisons
4. vfclass
5. vfsgnj/vfsgnjn/vfsgnjx
6. vfmerge
7. vfmv.v.f
gcc/ChangeLog:
* config/riscv/riscv-protos.h (enum frm_field_enum): New enum.
* config/riscv/riscv-vector-builtins.cc
(function_expander::use_ternop_insn): Add default rounding mode.
(function_expander::use_widen_ternop_insn): Ditto.
* config/riscv/riscv.cc (riscv_hard_regno_nregs): Add FRM REGNUM.
(riscv_hard_regno_mode_ok): Ditto.
(riscv_conditional_register_usage): Ditto.
* config/riscv/riscv.h (DWARF_FRAME_REGNUM): Ditto.
(FRM_REG_P): Ditto.
(RISCV_DWARF_FRM): Ditto.
* config/riscv/riscv.md: Ditto.
* config/riscv/vector-iterators.md: split no frm and has frm operations.
* config/riscv/vector.md (@pred_<optab><mode>_scalar): New pattern.
(@pred_<optab><mode>): Ditto.
Signed-off-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>
|
|
Since we are going to have fixed-point intrinsics that are modeling
rounding mode
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/222
We should have operand to specify rounding mode in fixed-point instructions.
We don't support these modeling rounding mode intrinsics yet but we will
definetely support them later.
This is the preparing patch for new coming intrinsics.
gcc/ChangeLog:
* config/riscv/riscv-protos.h (enum vxrm_field_enum): New enum.
* config/riscv/riscv-vector-builtins.cc
(function_expander::use_exact_insn): Add default rounding mode operand.
* config/riscv/riscv.cc (riscv_hard_regno_nregs): Add VXRM_REGNUM.
(riscv_hard_regno_mode_ok): Ditto.
(riscv_conditional_register_usage): Ditto.
* config/riscv/riscv.h (DWARF_FRAME_REGNUM): Ditto.
(VXRM_REG_P): Ditto.
(RISCV_DWARF_VXRM): Ditto.
* config/riscv/riscv.md: Ditto.
* config/riscv/vector.md: Ditto
Signed-off-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>
|
|
We are missing cases for combining of FACGE/FACGT instructions. In the testcase of the patch we generate:
foo:
fabs v3.4s, v0.4s
fabs v0.4s, v1.4s
fabs v1.4s, v2.4s
fcmgt v0.4s, v3.4s, v0.4s
fcmgt v1.4s, v3.4s, v1.4s
b g
This is because combine is rejecting the pattern due to costs:
Successfully matched this instruction:
(set (reg:V4SI 106)
(neg:V4SI (lt:V4SI (abs:V4SF (reg:V4SF 113))
(abs:V4SF (reg:V4SF 111)))))
rejecting combination of insns 8, 9 and 10
original costs 8 + 8 + 12 = 28
replacement costs 8 + 28 = 36
It is obviously recursing in the various arms of the RTX and such.
This patch teaches the aarch64 rtx costs routine that our vector comparisons are represented as a NEG of
compare operators, with the FACGE/FAGT operations in particular having ABS on each arm. With this patch we get
the much more reasonable dump:
original costs 8 + 8 + 8 = 24
replacement costs 8 + 8 = 16
and generate the optimal assembly:
foo:
mov v31.16b, v0.16b
facgt v0.4s, v0.4s, v1.4s
facgt v1.4s, v31.4s, v2.4s
b g
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_rtx_costs, NEG case): Add costing
logic for vector modes.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/facg_1.c: New test.
|
|
This instalment of the series goes through the vector comparison patterns in the backend.
One wart are the int64x1_t comparisons that this patch doesn't touch.
Those are a bit trickier because they have define_insn_and_split mechanisms for falling back to
GP reg comparisons after reload and I don't think a simple annotation will catch those cases correctly.
Those will need more custom thinking.
As said, this patch doesn't touch those and is a decent straightforward improvement on its own.
Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
gcc/ChangeLog:
PR target/99195
* config/aarch64/aarch64-simd.md (aarch64_cm<optab><mode>): Rename to...
(aarch64_cm<optab><mode><vczle><vczbe>): ... This.
(aarch64_cmtst<mode>): Rename to...
(aarch64_cmtst<mode><vczle><vczbe>): ... This.
(*aarch64_cmtst_same_<mode>): Rename to...
(*aarch64_cmtst_same_<mode><vczle><vczbe>): ... This.
(*aarch64_cmtstdi): Rename to...
(*aarch64_cmtstdi<vczle><vczbe>): ... This.
(aarch64_fac<optab><mode>): Rename to...
(aarch64_fac<optab><mode><vczle><vczbe>): ... This.
gcc/testsuite/ChangeLog:
PR target/99195
* gcc.target/aarch64/simd/pr99195_7.c: New test.
|
|
Straightforward like previous patches in this series.
Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
gcc/ChangeLog:
PR target/99195
* config/aarch64/aarch64-simd.md (aarch64_s<optab><mode>): Rename to...
(aarch64_s<optab><mode><vczle><vczbe>): ... This.
gcc/testsuite/ChangeLog:
PR target/99195
* gcc.target/aarch64/simd/pr99195_4.c: Add testing for qabs, qneg.
|
|
This patch is optimizing the AVL for VLS auto-vectorzation.
Given below sample code:
typedef int8_t vnx2qi __attribute__ ((vector_size (2)));
__attribute__ ((noipa)) void
f_vnx2qi (int8_t a, int8_t b, int8_t *out)
{
vnx2qi v = {a, b};
*(vnx2qi *) out = v;
}
Before this patch:
f_vnx2qi:
vsetvli a5,zero,e8,mf8,ta,ma
vmv.v.x v1,a0
vslide1down.vx v1,v1,a1
vse8.v v1,0(a2)
ret
After this patch:
f_vnx2qi:
vsetivli zero,2,e8,mf8,ta,ma
vmv.v.x v1,a0
vslide1down.vx v1,v1,a1
vse8.v v1,0(a2)
ret
Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Co-authored-by: kito-cheng <kito.cheng@sifive.com>
gcc/ChangeLog:
* config/riscv/riscv-v.cc (const_vlmax_p): New function for
deciding the mode is constant or not.
(set_len_and_policy): Optimize VLS-VLMAX code gen to vsetivli.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/vf_avl-1.c: New test.
|
|
codegen of both VLA && VLS auto-vectorization
This patch optimizes both RVV VLA && VLS vectorization.
Consider this following case:
void __attribute__((noinline, noclone))
f (int * __restrict dst, int * __restrict op1, int * __restrict op2, int
count)
{
for (int i = 0; i < count; ++i)
dst[i] = op1[i] + op2[i];
}
VLA:
Before this patch:
ble a3,zero,.L1
srli a4,a1,2
negw a4,a4
andi a5,a4,3
sext.w a3,a3
beq a5,zero,.L3
lw a7,0(a1)
lw a6,0(a2)
andi a4,a4,2
addw a6,a6,a7
sw a6,0(a0)
beq a4,zero,.L3
lw a7,4(a1)
lw a4,4(a2)
li a6,3
addw a4,a4,a7
sw a4,4(a0)
bne a5,a6,.L3
lw a6,8(a2)
lw a4,8(a1)
addw a4,a4,a6
sw a4,8(a0)
.L3:
subw a3,a3,a5
slli a4,a3,32
csrr a6,vlenb
srli a4,a4,32
srli a6,a6,2
slli a3,a5,2
mv a5,a4
bgtu a4,a6,.L17
.L5:
csrr a6,vlenb
add a1,a1,a3
add a2,a2,a3
add a0,a0,a3
srli a7,a6,2
li a3,0
.L8:
vsetvli zero,a5,e32,m1,ta,ma
vle32.v v1,0(a1)
vle32.v v2,0(a2)
vsetvli t1,zero,e32,m1,ta,ma
add a3,a3,a7
vadd.vv v1,v1,v2
vsetvli zero,a5,e32,m1,ta,ma
vse32.v v1,0(a0)
mv a5,a4
bleu a4,a3,.L6
mv a5,a3
.L6:
sub a5,a4,a5
bleu a5,a7,.L7
mv a5,a7
.L7:
add a1,a1,a6
add a2,a2,a6
add a0,a0,a6
bne a5,zero,.L8
.L1:
ret
.L17:
mv a5,a6
j .L5
After this patch:
f:
ble a3,zero,.L1
csrr a4,vlenb
srli a4,a4,2
mv a5,a3
bgtu a3,a4,.L9
.L3:
csrr a6,vlenb
li a4,0
srli a7,a6,2
.L6:
vsetvli zero,a5,e32,m1,ta,ma
vle32.v v2,0(a1)
vle32.v v1,0(a2)
vsetvli t1,zero,e32,m1,ta,ma
add a4,a4,a7
vadd.vv v1,v1,v2
vsetvli zero,a5,e32,m1,ta,ma
vse32.v v1,0(a0)
mv a5,a3
bleu a3,a4,.L4
mv a5,a4
.L4:
sub a5,a3,a5
bleu a5,a7,.L5
mv a5,a7
.L5:
add a0,a0,a6
add a2,a2,a6
add a1,a1,a6
bne a5,zero,.L6
.L1:
ret
.L9:
mv a5,a4
j .L3
VLS:
Before this patch:
f3:
ble a3,zero,.L1
srli a5,a1,2
negw a5,a5
andi a4,a5,3
sext.w a3,a3
beq a4,zero,.L3
lw a7,0(a1)
lw a6,0(a2)
andi a5,a5,2
addw a6,a6,a7
sw a6,0(a0)
beq a5,zero,.L3
lw a7,4(a1)
lw a5,4(a2)
li a6,3
addw a5,a5,a7
sw a5,4(a0)
bne a4,a6,.L3
lw a6,8(a2)
lw a5,8(a1)
addw a5,a5,a6
sw a5,8(a0)
.L3:
subw a3,a3,a4
slli a6,a4,2
slli a5,a3,32
srli a5,a5,32
add a1,a1,a6
add a2,a2,a6
add a0,a0,a6
li a3,4
.L6:
mv a4,a5
bleu a5,a3,.L5
li a4,4
.L5:
vsetvli zero,a4,e32,m1,ta,ma
vle32.v v1,0(a1)
vle32.v v2,0(a2)
vsetivli zero,4,e32,m1,ta,ma
sub a5,a5,a4
vadd.vv v1,v1,v2
vsetvli zero,a4,e32,m1,ta,ma
vse32.v v1,0(a0)
addi a1,a1,16
addi a2,a2,16
addi a0,a0,16
bne a5,zero,.L6
.L1:
ret
After this patch:
f3:
ble a3,zero,.L1
li a4,4
.L4:
mv a5,a3
bleu a3,a4,.L3
li a5,4
.L3:
vsetvli zero,a5,e32,m1,ta,ma
vle32.v v2,0(a1)
vle32.v v1,0(a2)
vsetivli zero,4,e32,m1,ta,ma
sub a3,a3,a5
vadd.vv v1,v1,v2
vsetvli zero,a5,e32,m1,ta,ma
vse32.v v1,0(a0)
addi a2,a2,16
addi a0,a0,16
addi a1,a1,16
bne a3,zero,.L4
.L1:
ret
Signed-off-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>
gcc/ChangeLog:
* config/riscv/riscv.cc
(riscv_vectorize_preferred_vector_alignment): New function.
(TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): New target hook.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/shift-rv32gcv.c: Adapt testcase.
* gcc.target/riscv/rvv/autovec/align-1.c: New test.
* gcc.target/riscv/rvv/autovec/align-2.c: New test.
|
|
Revert my previous change that faked handling of V4HI and V2SImodes
in ix86_widen_mult_cost and rather return arbitrary high value
for unsupported modes. This should prevent cost estimator from
selecting non-existent vector widen multiply operation.
gcc/ChangeLog:
PR target/109807
* config/i386/i386.cc: Revert the 2023-05-11 change.
(ix86_widen_mult_cost): Return high value instead of
ICEing for unsupported modes.
gcc/testsuite/ChangeLog:
PR target/109807
* gcc.target/i386/pr109825.c: New test.
|
|
The small and medium PIC code models generate profiling calls that
always load the address of __fentry__() via the GOT, even if
-mdirect-extern-access is in effect.
This deviates from the behavior with respect to other external
references, and results in a longer opcode that relies on linker
relaxation to eliminate the GOT load. In this particular case, the
transformation replaces an indirect 'CALL *__fentry__@GOTPCREL(%rip)'
with either 'CALL __fentry__; NOP' or 'NOP; CALL __fentry__', where the
NOP is a 1 byte NOP that preserves the 6 byte length of the sequence.
This is problematic for the Linux kernel, which generally relies on
-mdirect-extern-access and hidden visibility to eliminate GOT based
symbol references in code generated with -fpie/-fpic, without having to
depend on linker relaxation.
The Linux kernel relies on code patching to replace these opcodes with
NOPs at runtime, and this is complicated code that we'd prefer not to
complicate even more by adding support for patching both 5 and 6 byte
sequences as well as parsing the instruction stream to decide which
variant of CALL+NOP we are dealing with.
So let's honour -mdirect-extern-access, and only load the address of
__fentry__ via the GOT if direct references to external symbols are not
permitted.
Note that the GOT reference in question is in fact a data reference: we
explicitly load the address of __fentry__ from the GOT, which amounts to
eager binding, rather than emitting a PLT call that could bind eagerly,
lazily or directly at link time.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
gcc/ChangeLog:
* config/i386/i386.cc (x86_function_profiler): Take
ix86_direct_extern_access into account when generating calls
to __fentry__()
|
|
This patch refactor the pattern A or B or C or D, to the switch case for
easy add/remove new types, as well as human reading friendly.
Before this patch:
return A || B || C || D;
After this patch:
switch (type)
{
case A:
case B:
case C:
case D:
return true;
default:
return false;
}
Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins.cc (required_extensions_p):
Refactor the or pattern to switch cases.
|
|
gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_expand_vector_init_fallback): Rename
aarch64_expand_vector_init to this, and remove interleaving case.
Recursively call aarch64_expand_vector_init_fallback, instead of
aarch64_expand_vector_init.
(aarch64_unzip_vector_init): New function.
(aarch64_expand_vector_init): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/ldp_stp_16.c (cons2_8_float): Adjust for new
code-gen.
* gcc.target/aarch64/sve/acle/general/dupq_5.c: Likewise.
* gcc.target/aarch64/sve/acle/general/dupq_6.c: Likewise.
* gcc.target/aarch64/interleave-init-1.c: Rename to ...
* gcc.target/aarch64/vec-init-18.c: ... this.
* gcc.target/aarch64/vec-init-19.c: New test.
* gcc.target/aarch64/vec-init-20.c: Likewise.
* gcc.target/aarch64/vec-init-21.c: Likewise.
* gcc.target/aarch64/vec-init-22-size.c: Likewise.
* gcc.target/aarch64/vec-init-22-speed.c: Likewise.
* gcc.target/aarch64/vec-init-22.h: New header.
|
|
It will broken when release mode.
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc (pass_vsetvl::cleanup_insns):
Pull out function call from the gcc_assert.
|
|
Convert vlmul and policy to human readable string, some example below:
Before:
[VALID,Demand field={1(VL),0(DEMAND_NONZERO_AVL),1(SEW),0(DEMAND_GE_SEW),1(LMUL),0(RATIO),0(TAIL_POLICY),0(MASK_POLICY)}
AVL=(reg:DI 0 zero)
SEW=16,VLMUL=3,RATIO=2,TAIL_POLICY=1,MASK_POLICY=1]
^ ^ ^
After:
[VALID,Demand field={1(VL),0(DEMAND_NONZERO_AVL),1(SEW),0(DEMAND_GE_SEW),1(LMUL),0(RATIO),0(TAIL_POLICY),0(MASK_POLICY)}
AVL=(reg:DI 0 zero)
SEW=16,VLMUL=m8,RATIO=2,TAIL_POLICY=agnostic,MASK_POLICY=agnostic]
^^ ^^^^^^^^ ^^^^^^^^
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc (vlmul_to_str): New.
(policy_to_str): New.
(vector_insn_info::dump): Use vlmul_to_str and policy_to_str.
|
|
Some cleanups while looking at these two functions.
gcc/ChangeLog:
* config/i386/i386-expand.cc (ix86_expand_vecop_qihi2): Also
reject ymm instructions for TARGET_PREFER_AVX128. Use generic
gen_extend_insn to generate zero/sign extension instructions.
Fix comments.
(ix86_expand_vecop_qihi): Initialize interleave functions
for MULT code only. Fix comments.
|
|
Remove mulv2si emulated sequence for TARGET_SSE2 and enable
only native PMULLD instruction for TARGET_SSE4_1. Ideally, the
vectorization for TARGET_SSE2 should depend on more precise cost
estimation (the PR contains patch for ix86_multiplication_cost),
but even with patched cost function the runtime regression
was not fixed.
PR target/109797
gcc/ChangeLog:
* config/i386/mmx.md (mulv2si3): Remove expander.
(mulv2si3): Rename insn pattern from *mulv2si.
|
|
Rebase to trunk and send V3 patch for:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617821.html
This patch is fixing: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109743.
This issue happens is because we are currently very conservative in optimization of user vsetvli.
Consider this following case:
bb 1:
vsetvli a5,a4... (demand AVL = a4).
bb 2:
RVV insn use a5 (demand AVL = a5).
LCM will hoist vsetvl of bb 2 into bb 1.
We don't do AVL propagation for this situation since it's complicated that
we should analyze the code sequence between vsetvli in bb 1 and RVV insn in bb 2.
They are not necessary the consecutive blocks.
This patch is doing the optimizations after LCM, we will check and eliminate the vsetvli
in LCM inserted edge if such vsetvli is redundant. Such approach is much simplier and safe.
code:
void
foo2 (int32_t *a, int32_t *b, int n)
{
if (n <= 0)
return;
int i = n;
size_t vl = __riscv_vsetvl_e32m1 (i);
for (; i >= 0; i--)
{
vint32m1_t v = __riscv_vle32_v_i32m1 (a, vl);
__riscv_vse32_v_i32m1 (b, v, vl);
if (i >= vl)
continue;
if (i == 0)
return;
vl = __riscv_vsetvl_e32m1 (i);
}
}
Before this patch:
foo2:
.LFB2:
.cfi_startproc
ble a2,zero,.L1
mv a4,a2
li a3,-1
vsetvli a5,a2,e32,m1,ta,mu
vsetvli zero,a5,e32,m1,ta,ma <- can be eliminated.
.L5:
vle32.v v1,0(a0)
vse32.v v1,0(a1)
bgeu a4,a5,.L3
.L10:
beq a2,zero,.L1
vsetvli a5,a4,e32,m1,ta,mu
addi a4,a4,-1
vsetvli zero,a5,e32,m1,ta,ma <- can be eliminated.
vle32.v v1,0(a0)
vse32.v v1,0(a1)
addiw a2,a2,-1
bltu a4,a5,.L10
.L3:
addiw a2,a2,-1
addi a4,a4,-1
bne a2,a3,.L5
.L1:
ret
After this patch:
f:
ble a2,zero,.L1
mv a4,a2
li a3,-1
vsetvli a5,a2,e32,m1,ta,ma
.L5:
vle32.v v1,0(a0)
vse32.v v1,0(a1)
bgeu a4,a5,.L3
.L10:
beq a2,zero,.L1
vsetvli a5,a4,e32,m1,ta,ma
addi a4,a4,-1
vle32.v v1,0(a0)
vse32.v v1,0(a1)
addiw a2,a2,-1
bltu a4,a5,.L10
.L3:
addiw a2,a2,-1
addi a4,a4,-1
bne a2,a3,.L5
.L1:
ret
PR target/109743
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc (pass_vsetvl::get_vsetvl_at_end): New.
(local_avl_compatible_p): New.
(pass_vsetvl::local_eliminate_vsetvl_insn): Enhance local optimizations
for LCM, rewrite as a backward algorithm.
(pass_vsetvl::cleanup_insns): Use new local_eliminate_vsetvl_insn
interface, handle a BB at once.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/pr109743-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr109743-2.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr109743-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr109743-4.c: New test.
Co-authored-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>
|