diff options
author | Srinath Parvathaneni <srinath.parvathaneni@arm.com> | 2020-03-20 11:58:30 +0000 |
---|---|---|
committer | Kyrylo Tkachov <kyrylo.tkachov@arm.com> | 2020-03-20 11:58:30 +0000 |
commit | 92f80065d10ece75327c30eed924c322f1e4b338 (patch) | |
tree | eeddf33866b16de15dffef4a5b8d8c4c48367cf4 /gcc/config | |
parent | 85a94e8790198cdafc6f2af8224b273075bab84d (diff) | |
download | gcc-92f80065d10ece75327c30eed924c322f1e4b338.zip gcc-92f80065d10ece75327c30eed924c322f1e4b338.tar.gz gcc-92f80065d10ece75327c30eed924c322f1e4b338.tar.bz2 |
[ARM][GCC][1/8x]: MVE ACLE vidup, vddup, viwdup and vdwdup intrinsics with writeback.
This patch supports following MVE ACLE intrinsics with writeback.
vddupq_m_n_u8, vddupq_m_n_u32, vddupq_m_n_u16, vddupq_m_wb_u8, vddupq_m_wb_u16, vddupq_m_wb_u32, vddupq_n_u8, vddupq_n_u32, vddupq_n_u16, vddupq_wb_u8, vddupq_wb_u16, vddupq_wb_u32, vdwdupq_m_n_u8, vdwdupq_m_n_u32, vdwdupq_m_n_u16, vdwdupq_m_wb_u8, vdwdupq_m_wb_u32, vdwdupq_m_wb_u16, vdwdupq_n_u8, vdwdupq_n_u32, vdwdupq_n_u16, vdwdupq_wb_u8, vdwdupq_wb_u32, vdwdupq_wb_u16, vidupq_m_n_u8, vidupq_m_n_u32, vidupq_m_n_u16, vidupq_m_wb_u8, vidupq_m_wb_u16, vidupq_m_wb_u32, vidupq_n_u8, vidupq_n_u32, vidupq_n_u16, vidupq_wb_u8, vidupq_wb_u16, vidupq_wb_u32, viwdupq_m_n_u8, viwdupq_m_n_u32, viwdupq_m_n_u16, viwdupq_m_wb_u8, viwdupq_m_wb_u32, viwdupq_m_wb_u16, viwdupq_n_u8, viwdupq_n_u32, viwdupq_n_u16, viwdupq_wb_u8, viwdupq_wb_u32, viwdupq_wb_u16.
Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details.
[1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics
2020-03-20 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
Andre Vieira <andre.simoesdiasvieira@arm.com>
Mihail Ionescu <mihail.ionescu@arm.com>
* config/arm/arm-builtins.c
(QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Define quinary
builtin qualifier.
* config/arm/arm_mve.h (vddupq_m_n_u8): Define macro.
(vddupq_m_n_u32): Likewise.
(vddupq_m_n_u16): Likewise.
(vddupq_m_wb_u8): Likewise.
(vddupq_m_wb_u16): Likewise.
(vddupq_m_wb_u32): Likewise.
(vddupq_n_u8): Likewise.
(vddupq_n_u32): Likewise.
(vddupq_n_u16): Likewise.
(vddupq_wb_u8): Likewise.
(vddupq_wb_u16): Likewise.
(vddupq_wb_u32): Likewise.
(vdwdupq_m_n_u8): Likewise.
(vdwdupq_m_n_u32): Likewise.
(vdwdupq_m_n_u16): Likewise.
(vdwdupq_m_wb_u8): Likewise.
(vdwdupq_m_wb_u32): Likewise.
(vdwdupq_m_wb_u16): Likewise.
(vdwdupq_n_u8): Likewise.
(vdwdupq_n_u32): Likewise.
(vdwdupq_n_u16): Likewise.
(vdwdupq_wb_u8): Likewise.
(vdwdupq_wb_u32): Likewise.
(vdwdupq_wb_u16): Likewise.
(vidupq_m_n_u8): Likewise.
(vidupq_m_n_u32): Likewise.
(vidupq_m_n_u16): Likewise.
(vidupq_m_wb_u8): Likewise.
(vidupq_m_wb_u16): Likewise.
(vidupq_m_wb_u32): Likewise.
(vidupq_n_u8): Likewise.
(vidupq_n_u32): Likewise.
(vidupq_n_u16): Likewise.
(vidupq_wb_u8): Likewise.
(vidupq_wb_u16): Likewise.
(vidupq_wb_u32): Likewise.
(viwdupq_m_n_u8): Likewise.
(viwdupq_m_n_u32): Likewise.
(viwdupq_m_n_u16): Likewise.
(viwdupq_m_wb_u8): Likewise.
(viwdupq_m_wb_u32): Likewise.
(viwdupq_m_wb_u16): Likewise.
(viwdupq_n_u8): Likewise.
(viwdupq_n_u32): Likewise.
(viwdupq_n_u16): Likewise.
(viwdupq_wb_u8): Likewise.
(viwdupq_wb_u32): Likewise.
(viwdupq_wb_u16): Likewise.
(__arm_vddupq_m_n_u8): Define intrinsic.
(__arm_vddupq_m_n_u32): Likewise.
(__arm_vddupq_m_n_u16): Likewise.
(__arm_vddupq_m_wb_u8): Likewise.
(__arm_vddupq_m_wb_u16): Likewise.
(__arm_vddupq_m_wb_u32): Likewise.
(__arm_vddupq_n_u8): Likewise.
(__arm_vddupq_n_u32): Likewise.
(__arm_vddupq_n_u16): Likewise.
(__arm_vdwdupq_m_n_u8): Likewise.
(__arm_vdwdupq_m_n_u32): Likewise.
(__arm_vdwdupq_m_n_u16): Likewise.
(__arm_vdwdupq_m_wb_u8): Likewise.
(__arm_vdwdupq_m_wb_u32): Likewise.
(__arm_vdwdupq_m_wb_u16): Likewise.
(__arm_vdwdupq_n_u8): Likewise.
(__arm_vdwdupq_n_u32): Likewise.
(__arm_vdwdupq_n_u16): Likewise.
(__arm_vdwdupq_wb_u8): Likewise.
(__arm_vdwdupq_wb_u32): Likewise.
(__arm_vdwdupq_wb_u16): Likewise.
(__arm_vidupq_m_n_u8): Likewise.
(__arm_vidupq_m_n_u32): Likewise.
(__arm_vidupq_m_n_u16): Likewise.
(__arm_vidupq_n_u8): Likewise.
(__arm_vidupq_m_wb_u8): Likewise.
(__arm_vidupq_m_wb_u16): Likewise.
(__arm_vidupq_m_wb_u32): Likewise.
(__arm_vidupq_n_u32): Likewise.
(__arm_vidupq_n_u16): Likewise.
(__arm_vidupq_wb_u8): Likewise.
(__arm_vidupq_wb_u16): Likewise.
(__arm_vidupq_wb_u32): Likewise.
(__arm_vddupq_wb_u8): Likewise.
(__arm_vddupq_wb_u16): Likewise.
(__arm_vddupq_wb_u32): Likewise.
(__arm_viwdupq_m_n_u8): Likewise.
(__arm_viwdupq_m_n_u32): Likewise.
(__arm_viwdupq_m_n_u16): Likewise.
(__arm_viwdupq_m_wb_u8): Likewise.
(__arm_viwdupq_m_wb_u32): Likewise.
(__arm_viwdupq_m_wb_u16): Likewise.
(__arm_viwdupq_n_u8): Likewise.
(__arm_viwdupq_n_u32): Likewise.
(__arm_viwdupq_n_u16): Likewise.
(__arm_viwdupq_wb_u8): Likewise.
(__arm_viwdupq_wb_u32): Likewise.
(__arm_viwdupq_wb_u16): Likewise.
(vidupq_m): Define polymorphic variant.
(vddupq_m): Likewise.
(vidupq_u16): Likewise.
(vidupq_u32): Likewise.
(vidupq_u8): Likewise.
(vddupq_u16): Likewise.
(vddupq_u32): Likewise.
(vddupq_u8): Likewise.
(viwdupq_m): Likewise.
(viwdupq_u16): Likewise.
(viwdupq_u32): Likewise.
(viwdupq_u8): Likewise.
(vdwdupq_m): Likewise.
(vdwdupq_u16): Likewise.
(vdwdupq_u32): Likewise.
(vdwdupq_u8): Likewise.
* config/arm/arm_mve_builtins.def
(QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Use builtin
qualifier.
* config/arm/mve.md (mve_vidupq_n_u<mode>): Define RTL pattern.
(mve_vidupq_u<mode>_insn): Likewise.
(mve_vidupq_m_n_u<mode>): Likewise.
(mve_vidupq_m_wb_u<mode>_insn): Likewise.
(mve_vddupq_n_u<mode>): Likewise.
(mve_vddupq_u<mode>_insn): Likewise.
(mve_vddupq_m_n_u<mode>): Likewise.
(mve_vddupq_m_wb_u<mode>_insn): Likewise.
(mve_vdwdupq_n_u<mode>): Likewise.
(mve_vdwdupq_wb_u<mode>): Likewise.
(mve_vdwdupq_wb_u<mode>_insn): Likewise.
(mve_vdwdupq_m_n_u<mode>): Likewise.
(mve_vdwdupq_m_wb_u<mode>): Likewise.
(mve_vdwdupq_m_wb_u<mode>_insn): Likewise.
(mve_viwdupq_n_u<mode>): Likewise.
(mve_viwdupq_wb_u<mode>): Likewise.
(mve_viwdupq_wb_u<mode>_insn): Likewise.
(mve_viwdupq_m_n_u<mode>): Likewise.
(mve_viwdupq_m_wb_u<mode>): Likewise.
(mve_viwdupq_m_wb_u<mode>_insn): Likewise.
gcc/testsuite/ChangeLog:
2020-03-20 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
Andre Vieira <andre.simoesdiasvieira@arm.com>
Mihail Ionescu <mihail.ionescu@arm.com>
* gcc.target/arm/mve/intrinsics/vddupq_m_n_u16.c: New test.
* gcc.target/arm/mve/intrinsics/vddupq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_m_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_m_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_m_wb_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_wb_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_wb_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_m_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_m_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_m_wb_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_wb_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_wb_u8.c: Likewise.
Diffstat (limited to 'gcc/config')
-rw-r--r-- | gcc/config/arm/arm-builtins.c | 7 | ||||
-rw-r--r-- | gcc/config/arm/arm_mve.h | 548 | ||||
-rw-r--r-- | gcc/config/arm/arm_mve_builtins.def | 12 | ||||
-rw-r--r-- | gcc/config/arm/mve.md | 373 |
4 files changed, 939 insertions, 1 deletions
diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index c3deb9e..cefc144 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -711,6 +711,13 @@ arm_ldru_z_qualifiers[SIMD_MAX_BUILTIN_ARGS] = { qualifier_unsigned, qualifier_pointer, qualifier_unsigned}; #define LDRU_Z_QUALIFIERS (arm_ldru_z_qualifiers) +static enum arm_type_qualifiers +arm_quinop_unone_unone_unone_unone_imm_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_unsigned, qualifier_unsigned, qualifier_unsigned, + qualifier_unsigned, qualifier_immediate, qualifier_unsigned }; +#define QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_UNONE_QUALIFIERS \ + (arm_quinop_unone_unone_unone_unone_imm_unone_qualifiers) + /* End of Qualifier for MVE builtins. */ /* void ([T element type] *, T, immediate). */ diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 916565c..00f2242 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -2006,6 +2006,54 @@ typedef struct { uint8x16_t val[4]; } uint8x16x4_t; #define vuninitializedq_s64(void) __arm_vuninitializedq_s64(void) #define vuninitializedq_f16(void) __arm_vuninitializedq_f16(void) #define vuninitializedq_f32(void) __arm_vuninitializedq_f32(void) +#define vddupq_m_n_u8(__inactive, __a, __imm, __p) __arm_vddupq_m_n_u8(__inactive, __a, __imm, __p) +#define vddupq_m_n_u32(__inactive, __a, __imm, __p) __arm_vddupq_m_n_u32(__inactive, __a, __imm, __p) +#define vddupq_m_n_u16(__inactive, __a, __imm, __p) __arm_vddupq_m_n_u16(__inactive, __a, __imm, __p) +#define vddupq_m_wb_u8(__inactive, __a, __imm, __p) __arm_vddupq_m_wb_u8(__inactive, __a, __imm, __p) +#define vddupq_m_wb_u16(__inactive, __a, __imm, __p) __arm_vddupq_m_wb_u16(__inactive, __a, __imm, __p) +#define vddupq_m_wb_u32(__inactive, __a, __imm, __p) __arm_vddupq_m_wb_u32(__inactive, __a, __imm, __p) +#define vddupq_n_u8(__a, __imm) __arm_vddupq_n_u8(__a, __imm) +#define vddupq_n_u32(__a, __imm) __arm_vddupq_n_u32(__a, __imm) +#define vddupq_n_u16(__a, __imm) __arm_vddupq_n_u16(__a, __imm) +#define vddupq_wb_u8( __a, __imm) __arm_vddupq_wb_u8( __a, __imm) +#define vddupq_wb_u16( __a, __imm) __arm_vddupq_wb_u16( __a, __imm) +#define vddupq_wb_u32( __a, __imm) __arm_vddupq_wb_u32( __a, __imm) +#define vdwdupq_m_n_u8(__inactive, __a, __b, __imm, __p) __arm_vdwdupq_m_n_u8(__inactive, __a, __b, __imm, __p) +#define vdwdupq_m_n_u32(__inactive, __a, __b, __imm, __p) __arm_vdwdupq_m_n_u32(__inactive, __a, __b, __imm, __p) +#define vdwdupq_m_n_u16(__inactive, __a, __b, __imm, __p) __arm_vdwdupq_m_n_u16(__inactive, __a, __b, __imm, __p) +#define vdwdupq_m_wb_u8(__inactive, __a, __b, __imm, __p) __arm_vdwdupq_m_wb_u8(__inactive, __a, __b, __imm, __p) +#define vdwdupq_m_wb_u32(__inactive, __a, __b, __imm, __p) __arm_vdwdupq_m_wb_u32(__inactive, __a, __b, __imm, __p) +#define vdwdupq_m_wb_u16(__inactive, __a, __b, __imm, __p) __arm_vdwdupq_m_wb_u16(__inactive, __a, __b, __imm, __p) +#define vdwdupq_n_u8(__a, __b, __imm) __arm_vdwdupq_n_u8(__a, __b, __imm) +#define vdwdupq_n_u32(__a, __b, __imm) __arm_vdwdupq_n_u32(__a, __b, __imm) +#define vdwdupq_n_u16(__a, __b, __imm) __arm_vdwdupq_n_u16(__a, __b, __imm) +#define vdwdupq_wb_u8( __a, __b, __imm) __arm_vdwdupq_wb_u8( __a, __b, __imm) +#define vdwdupq_wb_u32( __a, __b, __imm) __arm_vdwdupq_wb_u32( __a, __b, __imm) +#define vdwdupq_wb_u16( __a, __b, __imm) __arm_vdwdupq_wb_u16( __a, __b, __imm) +#define vidupq_m_n_u8(__inactive, __a, __imm, __p) __arm_vidupq_m_n_u8(__inactive, __a, __imm, __p) +#define vidupq_m_n_u32(__inactive, __a, __imm, __p) __arm_vidupq_m_n_u32(__inactive, __a, __imm, __p) +#define vidupq_m_n_u16(__inactive, __a, __imm, __p) __arm_vidupq_m_n_u16(__inactive, __a, __imm, __p) +#define vidupq_m_wb_u8(__inactive, __a, __imm, __p) __arm_vidupq_m_wb_u8(__inactive, __a, __imm, __p) +#define vidupq_m_wb_u16(__inactive, __a, __imm, __p) __arm_vidupq_m_wb_u16(__inactive, __a, __imm, __p) +#define vidupq_m_wb_u32(__inactive, __a, __imm, __p) __arm_vidupq_m_wb_u32(__inactive, __a, __imm, __p) +#define vidupq_n_u8(__a, __imm) __arm_vidupq_n_u8(__a, __imm) +#define vidupq_n_u32(__a, __imm) __arm_vidupq_n_u32(__a, __imm) +#define vidupq_n_u16(__a, __imm) __arm_vidupq_n_u16(__a, __imm) +#define vidupq_wb_u8( __a, __imm) __arm_vidupq_wb_u8( __a, __imm) +#define vidupq_wb_u16( __a, __imm) __arm_vidupq_wb_u16( __a, __imm) +#define vidupq_wb_u32( __a, __imm) __arm_vidupq_wb_u32( __a, __imm) +#define viwdupq_m_n_u8(__inactive, __a, __b, __imm, __p) __arm_viwdupq_m_n_u8(__inactive, __a, __b, __imm, __p) +#define viwdupq_m_n_u32(__inactive, __a, __b, __imm, __p) __arm_viwdupq_m_n_u32(__inactive, __a, __b, __imm, __p) +#define viwdupq_m_n_u16(__inactive, __a, __b, __imm, __p) __arm_viwdupq_m_n_u16(__inactive, __a, __b, __imm, __p) +#define viwdupq_m_wb_u8(__inactive, __a, __b, __imm, __p) __arm_viwdupq_m_wb_u8(__inactive, __a, __b, __imm, __p) +#define viwdupq_m_wb_u32(__inactive, __a, __b, __imm, __p) __arm_viwdupq_m_wb_u32(__inactive, __a, __b, __imm, __p) +#define viwdupq_m_wb_u16(__inactive, __a, __b, __imm, __p) __arm_viwdupq_m_wb_u16(__inactive, __a, __b, __imm, __p) +#define viwdupq_n_u8(__a, __b, __imm) __arm_viwdupq_n_u8(__a, __b, __imm) +#define viwdupq_n_u32(__a, __b, __imm) __arm_viwdupq_n_u32(__a, __b, __imm) +#define viwdupq_n_u16(__a, __b, __imm) __arm_viwdupq_n_u16(__a, __b, __imm) +#define viwdupq_wb_u8( __a, __b, __imm) __arm_viwdupq_wb_u8( __a, __b, __imm) +#define viwdupq_wb_u32( __a, __b, __imm) __arm_viwdupq_wb_u32( __a, __b, __imm) +#define viwdupq_wb_u16( __a, __b, __imm) __arm_viwdupq_wb_u16( __a, __b, __imm) #endif __extension__ extern __inline void @@ -12956,6 +13004,390 @@ __arm_vreinterpretq_u8_u64 (uint64x2_t __a) return (uint8x16_t) __a; } +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vddupq_m_n_u8 (uint8x16_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) +{ + return __builtin_mve_vddupq_m_n_uv16qi (__inactive, __a, __imm, __p); +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vddupq_m_n_u32 (uint32x4_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) +{ + return __builtin_mve_vddupq_m_n_uv4si (__inactive, __a, __imm, __p); +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vddupq_m_n_u16 (uint16x8_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) +{ + return __builtin_mve_vddupq_m_n_uv8hi (__inactive, __a, __imm, __p); +} + +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vddupq_m_wb_u8 (uint8x16_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) +{ + uint8x16_t __res = __builtin_mve_vddupq_m_n_uv16qi (__inactive, * __a, __imm, __p); + *__a -= __imm * 16u; + return __res; +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vddupq_m_wb_u16 (uint16x8_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) +{ + uint16x8_t __res = __builtin_mve_vddupq_m_n_uv8hi (__inactive, *__a, __imm, __p); + *__a -= __imm * 8u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vddupq_m_wb_u32 (uint32x4_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) +{ + uint32x4_t __res = __builtin_mve_vddupq_m_n_uv4si (__inactive, *__a, __imm, __p); + *__a -= __imm * 4u; + return __res; +} + +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vddupq_n_u8 (uint32_t __a, const int __imm) +{ + return __builtin_mve_vddupq_n_uv16qi (__a, __imm); +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vddupq_n_u32 (uint32_t __a, const int __imm) +{ + return __builtin_mve_vddupq_n_uv4si (__a, __imm); +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vddupq_n_u16 (uint32_t __a, const int __imm) +{ + return __builtin_mve_vddupq_n_uv8hi (__a, __imm); +} + +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vdwdupq_m_n_u8 (uint8x16_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) +{ + return __builtin_mve_vdwdupq_m_n_uv16qi (__inactive, __a, __b, __imm, __p); +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vdwdupq_m_n_u32 (uint32x4_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) +{ + return __builtin_mve_vdwdupq_m_n_uv4si (__inactive, __a, __b, __imm, __p); +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vdwdupq_m_n_u16 (uint16x8_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) +{ + return __builtin_mve_vdwdupq_m_n_uv8hi (__inactive, __a, __b, __imm, __p); +} + +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vdwdupq_m_wb_u8 (uint8x16_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) +{ + uint8x16_t __res = __builtin_mve_vdwdupq_m_n_uv16qi (__inactive, *__a, __b, __imm, __p); + *__a = __builtin_mve_vdwdupq_m_wb_uv16qi (__inactive, *__a, __b, __imm, __p); + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vdwdupq_m_wb_u32 (uint32x4_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) +{ + uint32x4_t __res = __builtin_mve_vdwdupq_m_n_uv4si (__inactive, *__a, __b, __imm, __p); + *__a = __builtin_mve_vdwdupq_m_wb_uv4si (__inactive, *__a, __b, __imm, __p); + return __res; +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vdwdupq_m_wb_u16 (uint16x8_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) +{ + uint16x8_t __res = __builtin_mve_vdwdupq_m_n_uv8hi (__inactive, *__a, __b, __imm, __p); + *__a = __builtin_mve_vdwdupq_m_wb_uv8hi (__inactive, *__a, __b, __imm, __p); + return __res; +} + +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vdwdupq_n_u8 (uint32_t __a, uint32_t __b, const int __imm) +{ + return __builtin_mve_vdwdupq_n_uv16qi (__a, __b, __imm); +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vdwdupq_n_u32 (uint32_t __a, uint32_t __b, const int __imm) +{ + return __builtin_mve_vdwdupq_n_uv4si (__a, __b, __imm); +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vdwdupq_n_u16 (uint32_t __a, uint32_t __b, const int __imm) +{ + return __builtin_mve_vdwdupq_n_uv8hi (__a, __b, __imm); +} + +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vdwdupq_wb_u8 (uint32_t * __a, uint32_t __b, const int __imm) +{ + uint8x16_t __res = __builtin_mve_vdwdupq_n_uv16qi (*__a, __b, __imm); + *__a = __builtin_mve_vdwdupq_wb_uv16qi (*__a, __b, __imm); + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vdwdupq_wb_u32 (uint32_t * __a, uint32_t __b, const int __imm) +{ + uint32x4_t __res = __builtin_mve_vdwdupq_n_uv4si (*__a, __b, __imm); + *__a = __builtin_mve_vdwdupq_wb_uv4si (*__a, __b, __imm); + return __res; +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vdwdupq_wb_u16 (uint32_t * __a, uint32_t __b, const int __imm) +{ + uint16x8_t __res = __builtin_mve_vdwdupq_n_uv8hi (*__a, __b, __imm); + *__a = __builtin_mve_vdwdupq_wb_uv8hi (*__a, __b, __imm); + return __res; +} + +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vidupq_m_n_u8 (uint8x16_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) +{ + return __builtin_mve_vidupq_m_n_uv16qi (__inactive, __a, __imm, __p); +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vidupq_m_n_u32 (uint32x4_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) +{ + return __builtin_mve_vidupq_m_n_uv4si (__inactive, __a, __imm, __p); +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vidupq_m_n_u16 (uint16x8_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) +{ + return __builtin_mve_vidupq_m_n_uv8hi (__inactive, __a, __imm, __p); +} + +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vidupq_n_u8 (uint32_t __a, const int __imm) +{ + return __builtin_mve_vidupq_n_uv16qi (__a, __imm); +} + +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vidupq_m_wb_u8 (uint8x16_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) +{ + uint8x16_t __res = __builtin_mve_vidupq_m_n_uv16qi (__inactive, *__a, __imm, __p); + *__a += __imm * 16u; + return __res; +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vidupq_m_wb_u16 (uint16x8_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) +{ + uint16x8_t __res = __builtin_mve_vidupq_m_n_uv8hi (__inactive, *__a, __imm, __p); + *__a += __imm * 8u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vidupq_m_wb_u32 (uint32x4_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) +{ + uint32x4_t __res = __builtin_mve_vidupq_m_n_uv4si (__inactive, *__a, __imm, __p); + *__a += __imm * 4u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vidupq_n_u32 (uint32_t __a, const int __imm) +{ + return __builtin_mve_vidupq_n_uv4si (__a, __imm); +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vidupq_n_u16 (uint32_t __a, const int __imm) +{ + return __builtin_mve_vidupq_n_uv8hi (__a, __imm); +} + +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vidupq_wb_u8 (uint32_t * __a, const int __imm) +{ + uint8x16_t __res = __builtin_mve_vidupq_n_uv16qi (*__a, __imm); + *__a += __imm * 16u; + return __res; +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vidupq_wb_u16 (uint32_t * __a, const int __imm) +{ + uint16x8_t __res = __builtin_mve_vidupq_n_uv8hi (*__a, __imm); + *__a += __imm * 8u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vidupq_wb_u32 (uint32_t * __a, const int __imm) +{ + uint32x4_t __res = __builtin_mve_vidupq_n_uv4si (*__a, __imm); + *__a += __imm * 4u; + return __res; +} + +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vddupq_wb_u8 (uint32_t * __a, const int __imm) +{ + uint8x16_t __res = __builtin_mve_vddupq_n_uv16qi (*__a, __imm); + *__a -= __imm * 16u; + return __res; +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vddupq_wb_u16 (uint32_t * __a, const int __imm) +{ + uint16x8_t __res = __builtin_mve_vddupq_n_uv8hi (*__a, __imm); + *__a -= __imm * 8u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vddupq_wb_u32 (uint32_t * __a, const int __imm) +{ + uint32x4_t __res = __builtin_mve_vddupq_n_uv4si (*__a, __imm); + *__a -= __imm * 4u; + return __res; +} + +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_viwdupq_m_n_u8 (uint8x16_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) +{ + return __builtin_mve_viwdupq_m_n_uv16qi (__inactive, __a, __b, __imm, __p); +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_viwdupq_m_n_u32 (uint32x4_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) +{ + return __builtin_mve_viwdupq_m_n_uv4si (__inactive, __a, __b, __imm, __p); +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_viwdupq_m_n_u16 (uint16x8_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) +{ + return __builtin_mve_viwdupq_m_n_uv8hi (__inactive, __a, __b, __imm, __p); +} + +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_viwdupq_m_wb_u8 (uint8x16_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) +{ + uint8x16_t __res = __builtin_mve_viwdupq_m_n_uv16qi (__inactive, *__a, __b, __imm, __p); + *__a = __builtin_mve_viwdupq_m_wb_uv16qi (__inactive, *__a, __b, __imm, __p); + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_viwdupq_m_wb_u32 (uint32x4_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) +{ + uint32x4_t __res = __builtin_mve_viwdupq_m_n_uv4si (__inactive, *__a, __b, __imm, __p); + *__a = __builtin_mve_viwdupq_m_wb_uv4si (__inactive, *__a, __b, __imm, __p); + return __res; +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_viwdupq_m_wb_u16 (uint16x8_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) +{ + uint16x8_t __res = __builtin_mve_viwdupq_m_n_uv8hi (__inactive, *__a, __b, __imm, __p); + *__a = __builtin_mve_viwdupq_m_wb_uv8hi (__inactive, *__a, __b, __imm, __p); + return __res; +} + +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_viwdupq_n_u8 (uint32_t __a, uint32_t __b, const int __imm) +{ + return __builtin_mve_viwdupq_n_uv16qi (__a, __b, __imm); +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_viwdupq_n_u32 (uint32_t __a, uint32_t __b, const int __imm) +{ + return __builtin_mve_viwdupq_n_uv4si (__a, __b, __imm); +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_viwdupq_n_u16 (uint32_t __a, uint32_t __b, const int __imm) +{ + return __builtin_mve_viwdupq_n_uv8hi (__a, __b, __imm); +} + +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_viwdupq_wb_u8 (uint32_t * __a, uint32_t __b, const int __imm) +{ + uint8x16_t __res = __builtin_mve_viwdupq_n_uv16qi (*__a, __b, __imm); + *__a = __builtin_mve_viwdupq_wb_uv16qi (*__a, __b, __imm); + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_viwdupq_wb_u32 (uint32_t * __a, uint32_t __b, const int __imm) +{ + uint32x4_t __res = __builtin_mve_viwdupq_n_uv4si (*__a, __b, __imm); + *__a = __builtin_mve_viwdupq_wb_uv4si (*__a, __b, __imm); + return __res; +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_viwdupq_wb_u16 (uint32_t * __a, uint32_t __b, const int __imm) +{ + uint16x8_t __res = __builtin_mve_viwdupq_n_uv8hi (*__a, __b, __imm); + *__a = __builtin_mve_viwdupq_wb_uv8hi (*__a, __b, __imm); + return __res; +} + #if (__ARM_FEATURE_MVE & 2) /* MVE Floating point. */ __extension__ extern __inline void @@ -21764,6 +22196,122 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint8_t_const_ptr][__ARM_mve_type_uint16x8_t]: __arm_vldrbq_gather_offset_u16 (__ARM_mve_coerce(__p0, uint8_t const *), __ARM_mve_coerce(__p1, uint16x8_t)), \ int (*)[__ARM_mve_type_uint8_t_const_ptr][__ARM_mve_type_uint32x4_t]: __arm_vldrbq_gather_offset_u32 (__ARM_mve_coerce(__p0, uint8_t const *), __ARM_mve_coerce(__p1, uint32x4_t)));}) +#define vidupq_m(p0,p1,p2,p3) __arm_vidupq_m(p0,p1,p2,p3) +#define __arm_vidupq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ + __typeof(p1) __p1 = (p1); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ + int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint32_t]: __arm_vidupq_m_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint32_t), p2, p3), \ + int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint32_t]: __arm_vidupq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint32_t), p2, p3), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32_t]: __arm_vidupq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32_t), p2, p3), \ + int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint32_t_ptr]: __arm_vidupq_m_wb_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint32_t *), p2, p3), \ + int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint32_t_ptr]: __arm_vidupq_m_wb_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint32_t *), p2, p3), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32_t_ptr]: __arm_vidupq_m_wb_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32_t *), p2, p3));}) + +#define vddupq_m(p0,p1,p2,p3) __arm_vddupq_m(p0,p1,p2,p3) +#define __arm_vddupq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ + __typeof(p1) __p1 = (p1); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ + int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint32_t]: __arm_vddupq_m_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint32_t), p2, p3), \ + int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint32_t]: __arm_vddupq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint32_t), p2, p3), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32_t]: __arm_vddupq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32_t), p2, p3), \ + int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint32_t_ptr]: __arm_vddupq_m_wb_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint32_t *), p2, p3), \ + int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint32_t_ptr]: __arm_vddupq_m_wb_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint32_t *), p2, p3), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32_t_ptr]: __arm_vddupq_m_wb_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32_t *), p2, p3));}) + +#define vidupq_u16(p0,p1) __arm_vidupq_u16(p0,p1) +#define __arm_vidupq_u16(p0,p1) ({ __typeof(p0) __p0 = (p0); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ + int (*)[__ARM_mve_type_uint32_t]: __arm_vidupq_n_u16 (__ARM_mve_coerce(__p0, uint32_t), p1), \ + int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vidupq_wb_u16 (__ARM_mve_coerce(__p0, uint32_t *), p1));}) + +#define vidupq_u32(p0,p1) __arm_vidupq_u32(p0,p1) +#define __arm_vidupq_u32(p0,p1) ({ __typeof(p0) __p0 = (p0); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ + int (*)[__ARM_mve_type_uint32_t]: __arm_vidupq_n_u32 (__ARM_mve_coerce(__p0, uint32_t), p1), \ + int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vidupq_wb_u32 (__ARM_mve_coerce(__p0, uint32_t *), p1));}) + +#define vidupq_u8(p0,p1) __arm_vidupq_u8(p0,p1) +#define __arm_vidupq_u8(p0,p1) ({ __typeof(p0) __p0 = (p0); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ + int (*)[__ARM_mve_type_uint32_t]: __arm_vidupq_n_u8 (__ARM_mve_coerce(__p0, uint32_t), p1), \ + int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vidupq_wb_u8 (__ARM_mve_coerce(__p0, uint32_t *), p1));}) + +#define vddupq_u16(p0,p1) __arm_vddupq_u16(p0,p1) +#define __arm_vddupq_u16(p0,p1) ({ __typeof(p0) __p0 = (p0); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ + int (*)[__ARM_mve_type_uint32_t]: __arm_vddupq_n_u16 (__ARM_mve_coerce(__p0, uint32_t), p1), \ + int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vddupq_wb_u16 (__ARM_mve_coerce(__p0, uint32_t *), p1));}) + +#define vddupq_u32(p0,p1) __arm_vddupq_u32(p0,p1) +#define __arm_vddupq_u32(p0,p1) ({ __typeof(p0) __p0 = (p0); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ + int (*)[__ARM_mve_type_uint32_t]: __arm_vddupq_n_u32 (__ARM_mve_coerce(__p0, uint32_t), p1), \ + int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vddupq_wb_u32 (__ARM_mve_coerce(__p0, uint32_t *), p1));}) + +#define vddupq_u8(p0,p1) __arm_vddupq_u8(p0,p1) +#define __arm_vddupq_u8(p0,p1) ({ __typeof(p0) __p0 = (p0); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ + int (*)[__ARM_mve_type_uint32_t]: __arm_vddupq_n_u8 (__ARM_mve_coerce(__p0, uint32_t), p1), \ + int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vddupq_wb_u8 (__ARM_mve_coerce(__p0, uint32_t *), p1));}) + +#define viwdupq_m(p0,p1,p2,p3,p4) __arm_viwdupq_m(p0,p1,p2,p3,p4) +#define __arm_viwdupq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \ + __typeof(p1) __p1 = (p1); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ + int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint32_t]: __arm_viwdupq_m_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint32_t), p2, p3, p4), \ + int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint32_t]: __arm_viwdupq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint32_t), p2, p3, p4), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32_t]: __arm_viwdupq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32_t), p2, p3, p4), \ + int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint32_t_ptr]: __arm_viwdupq_m_wb_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint32_t *), p2, p3, p4), \ + int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint32_t_ptr]: __arm_viwdupq_m_wb_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint32_t *), p2, p3, p4), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32_t_ptr]: __arm_viwdupq_m_wb_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32_t *), p2, p3, p4));}) + +#define viwdupq_u16(p0,p1,p2) __arm_viwdupq_u16(p0,p1,p2) +#define __arm_viwdupq_u16(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ + int (*)[__ARM_mve_type_uint32_t]: __arm_viwdupq_n_u16 (__ARM_mve_coerce(__p0, uint32_t), p1, p2), \ + int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_viwdupq_wb_u16 (__ARM_mve_coerce(__p0, uint32_t *), p1, p2));}) + +#define viwdupq_u32(p0,p1,p2) __arm_viwdupq_u32(p0,p1,p2) +#define __arm_viwdupq_u32(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ + int (*)[__ARM_mve_type_uint32_t]: __arm_viwdupq_n_u32 (__ARM_mve_coerce(__p0, uint32_t), p1, p2), \ + int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_viwdupq_wb_u32 (__ARM_mve_coerce(__p0, uint32_t *), p1, p2));}) + +#define viwdupq_u8(p0,p1,p2) __arm_viwdupq_u8(p0,p1,p2) +#define __arm_viwdupq_u8(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ + int (*)[__ARM_mve_type_uint32_t]: __arm_viwdupq_n_u8 (__ARM_mve_coerce(__p0, uint32_t), p1, p2), \ + int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_viwdupq_wb_u8 (__ARM_mve_coerce(__p0, uint32_t *), p1, p2));}) + +#define vdwdupq_m(p0,p1,p2,p3,p4) __arm_vdwdupq_m(p0,p1,p2,p3,p4) +#define __arm_vdwdupq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \ + __typeof(p1) __p1 = (p1); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ + int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint32_t]: __arm_vdwdupq_m_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint32_t), p2, p3, p4), \ + int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint32_t]: __arm_vdwdupq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint32_t), p2, p3, p4), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32_t]: __arm_vdwdupq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32_t), p2, p3, p4), \ + int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint32_t_ptr]: __arm_vdwdupq_m_wb_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint32_t *), p2, p3, p4), \ + int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint32_t_ptr]: __arm_vdwdupq_m_wb_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint32_t *), p2, p3, p4), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32_t_ptr]: __arm_vdwdupq_m_wb_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32_t *), p2, p3, p4));}) + +#define vdwdupq_u16(p0,p1,p2) __arm_vdwdupq_u16(p0,p1,p2) +#define __arm_vdwdupq_u16(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ + int (*)[__ARM_mve_type_uint32_t]: __arm_vdwdupq_n_u16 (__ARM_mve_coerce(__p0, uint32_t), p1, p2), \ + int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vdwdupq_wb_u16 (__ARM_mve_coerce(__p0, uint32_t *), p1, p2));}) + +#define vdwdupq_u32(p0,p1,p2) __arm_vdwdupq_u32(p0,p1,p2) +#define __arm_vdwdupq_u32(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ + int (*)[__ARM_mve_type_uint32_t]: __arm_vdwdupq_n_u32 (__ARM_mve_coerce(__p0, uint32_t), p1, p2), \ + int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vdwdupq_wb_u32 (__ARM_mve_coerce(__p0, uint32_t *), p1, p2));}) + +#define vdwdupq_u8(p0,p1,p2) __arm_vdwdupq_u8(p0,p1,p2) +#define __arm_vdwdupq_u8(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ + int (*)[__ARM_mve_type_uint32_t]: __arm_vdwdupq_n_u8 (__ARM_mve_coerce(__p0, uint32_t), p1, p2), \ + int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vdwdupq_wb_u8 (__ARM_mve_coerce(__p0, uint32_t *), p1, p2));}) + #ifdef __cplusplus } #endif diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def index 144547f..2ed7886 100644 --- a/gcc/config/arm/arm_mve_builtins.def +++ b/gcc/config/arm/arm_mve_builtins.def @@ -815,3 +815,15 @@ VAR1 (STRSU_P, vstrdq_scatter_offset_p_u, v2di) VAR1 (STRSU_P, vstrdq_scatter_shifted_offset_p_u, v2di) VAR1 (STRSU_P, vstrwq_scatter_offset_p_u, v4si) VAR1 (STRSU_P, vstrwq_scatter_shifted_offset_p_u, v4si) +VAR3 (TERNOP_UNONE_UNONE_UNONE_IMM, viwdupq_wb_u, v16qi, v4si, v8hi) +VAR3 (TERNOP_UNONE_UNONE_UNONE_IMM, vdwdupq_wb_u, v16qi, v4si, v8hi) +VAR3 (QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_UNONE, viwdupq_m_wb_u, v16qi, v8hi, v4si) +VAR3 (QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_UNONE, vdwdupq_m_wb_u, v16qi, v8hi, v4si) +VAR3 (QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_UNONE, viwdupq_m_n_u, v16qi, v8hi, v4si) +VAR3 (QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_UNONE, vdwdupq_m_n_u, v16qi, v8hi, v4si) +VAR3 (BINOP_UNONE_UNONE_IMM, vddupq_n_u, v16qi, v8hi, v4si) +VAR3 (BINOP_UNONE_UNONE_IMM, vidupq_n_u, v16qi, v8hi, v4si) +VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_UNONE, vddupq_m_n_u, v16qi, v8hi, v4si) +VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_UNONE, vidupq_m_n_u, v16qi, v8hi, v4si) +VAR3 (TERNOP_UNONE_UNONE_UNONE_IMM, vdwdupq_n_u, v16qi, v4si, v8hi) +VAR3 (TERNOP_UNONE_UNONE_UNONE_IMM, viwdupq_n_u, v16qi, v4si, v8hi) diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 77b36a7..b2702f5 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -207,7 +207,8 @@ VSTRDQSB_U VSTRDQSO_S VSTRDQSO_U VSTRDQSSO_S VSTRDQSSO_U VSTRWQSO_S VSTRWQSO_U VSTRWQSSO_S VSTRWQSSO_U VSTRHQSO_F VSTRHQSSO_F VSTRWQSB_F - VSTRWQSO_F VSTRWQSSO_F]) + VSTRWQSO_F VSTRWQSSO_F VDDUPQ VDDUPQ_M VDWDUPQ + VDWDUPQ_M VIDUPQ VIDUPQ_M VIWDUPQ VIWDUPQ_M]) (define_mode_attr MVE_CNVT [(V8HI "V8HF") (V4SI "V4SF") (V8HF "V8HI") (V4SF "V4SI")]) @@ -9671,3 +9672,373 @@ "vadd.f%#<V_sz_elem> %q0, %q1, %q2" [(set_attr "type" "mve_move") ]) + +;; +;; [vidupq_n_u]) +;; +(define_expand "mve_vidupq_n_u<mode>" + [(match_operand:MVE_2 0 "s_register_operand") + (match_operand:SI 1 "s_register_operand") + (match_operand:SI 2 "mve_imm_selective_upto_8")] + "TARGET_HAVE_MVE" +{ + rtx temp = gen_reg_rtx (SImode); + emit_move_insn (temp, operands[1]); + rtx inc = gen_int_mode (INTVAL(operands[2]) * <MVE_LANES>, SImode); + emit_insn (gen_mve_vidupq_u<mode>_insn (operands[0], temp, operands[1], + operands[2], inc)); + DONE; +}) + +;; +;; [vidupq_u_insn]) +;; +(define_insn "mve_vidupq_u<mode>_insn" + [(set (match_operand:MVE_2 0 "s_register_operand" "=w") + (unspec:MVE_2 [(match_operand:SI 2 "s_register_operand" "1") + (match_operand:SI 3 "mve_imm_selective_upto_8" "Rg")] + VIDUPQ)) + (set (match_operand:SI 1 "s_register_operand" "=e") + (plus:SI (match_dup 2) + (match_operand:SI 4 "immediate_operand" "i")))] + "TARGET_HAVE_MVE" + "vidup.u%#<V_sz_elem>\t%q0, %1, %3") + +;; +;; [vidupq_m_n_u]) +;; +(define_expand "mve_vidupq_m_n_u<mode>" + [(match_operand:MVE_2 0 "s_register_operand") + (match_operand:MVE_2 1 "s_register_operand") + (match_operand:SI 2 "s_register_operand") + (match_operand:SI 3 "mve_imm_selective_upto_8") + (match_operand:HI 4 "vpr_register_operand")] + "TARGET_HAVE_MVE" +{ + rtx temp = gen_reg_rtx (SImode); + emit_move_insn (temp, operands[2]); + rtx inc = gen_int_mode (INTVAL(operands[3]) * <MVE_LANES>, SImode); + emit_insn (gen_mve_vidupq_m_wb_u<mode>_insn(operands[0], operands[1], temp, + operands[2], operands[3], + operands[4], inc)); + DONE; +}) + +;; +;; [vidupq_m_wb_u_insn]) +;; +(define_insn "mve_vidupq_m_wb_u<mode>_insn" + [(set (match_operand:MVE_2 0 "s_register_operand" "=w") + (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0") + (match_operand:SI 3 "s_register_operand" "2") + (match_operand:SI 4 "mve_imm_selective_upto_8" "Rg") + (match_operand:HI 5 "vpr_register_operand" "Up")] + VIDUPQ_M)) + (set (match_operand:SI 2 "s_register_operand" "=e") + (plus:SI (match_dup 3) + (match_operand:SI 6 "immediate_operand" "i")))] + "TARGET_HAVE_MVE" + "vpst\;\tvidupt.u%#<V_sz_elem>\t%q0, %2, %4" + [(set_attr "length""8")]) + +;; +;; [vddupq_n_u]) +;; +(define_expand "mve_vddupq_n_u<mode>" + [(match_operand:MVE_2 0 "s_register_operand") + (match_operand:SI 1 "s_register_operand") + (match_operand:SI 2 "mve_imm_selective_upto_8")] + "TARGET_HAVE_MVE" +{ + rtx temp = gen_reg_rtx (SImode); + emit_move_insn (temp, operands[1]); + rtx inc = gen_int_mode (INTVAL(operands[2]) * <MVE_LANES>, SImode); + emit_insn (gen_mve_vddupq_u<mode>_insn (operands[0], temp, operands[1], + operands[2], inc)); + DONE; +}) + +;; +;; [vddupq_u_insn]) +;; +(define_insn "mve_vddupq_u<mode>_insn" + [(set (match_operand:MVE_2 0 "s_register_operand" "=w") + (unspec:MVE_2 [(match_operand:SI 2 "s_register_operand" "1") + (match_operand:SI 3 "immediate_operand" "i")] + VDDUPQ)) + (set (match_operand:SI 1 "s_register_operand" "=e") + (minus:SI (match_dup 2) + (match_operand:SI 4 "immediate_operand" "i")))] + "TARGET_HAVE_MVE" + "vddup.u%#<V_sz_elem> %q0, %1, %3") + +;; +;; [vddupq_m_n_u]) +;; +(define_expand "mve_vddupq_m_n_u<mode>" + [(match_operand:MVE_2 0 "s_register_operand") + (match_operand:MVE_2 1 "s_register_operand") + (match_operand:SI 2 "s_register_operand") + (match_operand:SI 3 "mve_imm_selective_upto_8") + (match_operand:HI 4 "vpr_register_operand")] + "TARGET_HAVE_MVE" +{ + rtx temp = gen_reg_rtx (SImode); + emit_move_insn (temp, operands[2]); + rtx inc = gen_int_mode (INTVAL(operands[3]) * <MVE_LANES>, SImode); + emit_insn (gen_mve_vddupq_m_wb_u<mode>_insn(operands[0], operands[1], temp, + operands[2], operands[3], + operands[4], inc)); + DONE; +}) + +;; +;; [vddupq_m_wb_u_insn]) +;; +(define_insn "mve_vddupq_m_wb_u<mode>_insn" + [(set (match_operand:MVE_2 0 "s_register_operand" "=w") + (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0") + (match_operand:SI 3 "s_register_operand" "2") + (match_operand:SI 4 "mve_imm_selective_upto_8" "Rg") + (match_operand:HI 5 "vpr_register_operand" "Up")] + VDDUPQ_M)) + (set (match_operand:SI 2 "s_register_operand" "=e") + (minus:SI (match_dup 3) + (match_operand:SI 6 "immediate_operand" "i")))] + "TARGET_HAVE_MVE" + "vpst\;\tvddupt.u%#<V_sz_elem>\t%q0, %2, %4" + [(set_attr "length""8")]) + +;; +;; [vdwdupq_n_u]) +;; +(define_expand "mve_vdwdupq_n_u<mode>" + [(match_operand:MVE_2 0 "s_register_operand") + (match_operand:SI 1 "s_register_operand") + (match_operand:SI 2 "s_register_operand") + (match_operand:SI 3 "mve_imm_selective_upto_8")] + "TARGET_HAVE_MVE" +{ + rtx ignore_wb = gen_reg_rtx (SImode); + emit_insn (gen_mve_vdwdupq_wb_u<mode>_insn (operands[0], ignore_wb, + operands[1], operands[2], + operands[3])); + DONE; +}) + +;; +;; [vdwdupq_wb_u]) +;; +(define_expand "mve_vdwdupq_wb_u<mode>" + [(match_operand:SI 0 "s_register_operand") + (match_operand:SI 1 "s_register_operand") + (match_operand:SI 2 "s_register_operand") + (match_operand:SI 3 "mve_imm_selective_upto_8") + (unspec:MVE_2 [(const_int 0)] UNSPEC_VSTRUCTDUMMY)] + "TARGET_HAVE_MVE" +{ + rtx ignore_vec = gen_reg_rtx (<MODE>mode); + emit_insn (gen_mve_vdwdupq_wb_u<mode>_insn (ignore_vec, operands[0], + operands[1], operands[2], + operands[3])); + DONE; +}) + +;; +;; [vdwdupq_wb_u_insn]) +;; +(define_insn "mve_vdwdupq_wb_u<mode>_insn" + [(set (match_operand:MVE_2 0 "s_register_operand" "=w") + (unspec:MVE_2 [(match_operand:SI 2 "s_register_operand" "1") + (match_operand:SI 3 "s_register_operand" "r") + (match_operand:SI 4 "mve_imm_selective_upto_8" "Rg")] + VDWDUPQ)) + (set (match_operand:SI 1 "s_register_operand" "=e") + (unspec:SI [(match_dup 2) + (match_dup 3) + (match_dup 4)] + VDWDUPQ))] + "TARGET_HAVE_MVE" + "vdwdup.u%#<V_sz_elem>\t%q0, %2, %3, %4" +) + +;; +;; [vdwdupq_m_n_u]) +;; +(define_expand "mve_vdwdupq_m_n_u<mode>" + [(match_operand:MVE_2 0 "s_register_operand") + (match_operand:MVE_2 1 "s_register_operand") + (match_operand:SI 2 "s_register_operand") + (match_operand:SI 3 "s_register_operand") + (match_operand:SI 4 "mve_imm_selective_upto_8") + (match_operand:HI 5 "vpr_register_operand")] + "TARGET_HAVE_MVE" +{ + rtx ignore_wb = gen_reg_rtx (SImode); + emit_insn (gen_mve_vdwdupq_m_wb_u<mode>_insn (operands[0], ignore_wb, + operands[1], operands[2], + operands[3], operands[4], + operands[5])); + DONE; +}) + +;; +;; [vdwdupq_m_wb_u]) +;; +(define_expand "mve_vdwdupq_m_wb_u<mode>" + [(match_operand:SI 0 "s_register_operand") + (match_operand:MVE_2 1 "s_register_operand") + (match_operand:SI 2 "s_register_operand") + (match_operand:SI 3 "s_register_operand") + (match_operand:SI 4 "mve_imm_selective_upto_8") + (match_operand:HI 5 "vpr_register_operand")] + "TARGET_HAVE_MVE" +{ + rtx ignore_vec = gen_reg_rtx (<MODE>mode); + emit_insn (gen_mve_vdwdupq_m_wb_u<mode>_insn (ignore_vec, operands[0], + operands[1], operands[2], + operands[3], operands[4], + operands[5])); + DONE; +}) + +;; +;; [vdwdupq_m_wb_u_insn]) +;; +(define_insn "mve_vdwdupq_m_wb_u<mode>_insn" + [(set (match_operand:MVE_2 0 "s_register_operand" "=w") + (unspec:MVE_2 [(match_operand:MVE_2 2 "s_register_operand" "w") + (match_operand:SI 3 "s_register_operand" "1") + (match_operand:SI 4 "s_register_operand" "r") + (match_operand:SI 5 "mve_imm_selective_upto_8" "Rg") + (match_operand:HI 6 "vpr_register_operand" "Up")] + VDWDUPQ_M)) + (set (match_operand:SI 1 "s_register_operand" "=e") + (unspec:SI [(match_dup 2) + (match_dup 3) + (match_dup 4) + (match_dup 5) + (match_dup 6)] + VDWDUPQ_M)) + ] + "TARGET_HAVE_MVE" + "vpst\;\tvdwdupt.u%#<V_sz_elem>\t%q2, %3, %4, %5" + [(set_attr "type" "mve_move") + (set_attr "length""8")]) + +;; +;; [viwdupq_n_u]) +;; +(define_expand "mve_viwdupq_n_u<mode>" + [(match_operand:MVE_2 0 "s_register_operand") + (match_operand:SI 1 "s_register_operand") + (match_operand:SI 2 "s_register_operand") + (match_operand:SI 3 "mve_imm_selective_upto_8")] + "TARGET_HAVE_MVE" +{ + rtx ignore_wb = gen_reg_rtx (SImode); + emit_insn (gen_mve_viwdupq_wb_u<mode>_insn (operands[0], ignore_wb, + operands[1], operands[2], + operands[3])); + DONE; +}) + +;; +;; [viwdupq_wb_u]) +;; +(define_expand "mve_viwdupq_wb_u<mode>" + [(match_operand:SI 0 "s_register_operand") + (match_operand:SI 1 "s_register_operand") + (match_operand:SI 2 "s_register_operand") + (match_operand:SI 3 "mve_imm_selective_upto_8") + (unspec:MVE_2 [(const_int 0)] UNSPEC_VSTRUCTDUMMY)] + "TARGET_HAVE_MVE" +{ + rtx ignore_vec = gen_reg_rtx (<MODE>mode); + emit_insn (gen_mve_viwdupq_wb_u<mode>_insn (ignore_vec, operands[0], + operands[1], operands[2], + operands[3])); + DONE; +}) + +;; +;; [viwdupq_wb_u_insn]) +;; +(define_insn "mve_viwdupq_wb_u<mode>_insn" + [(set (match_operand:MVE_2 0 "s_register_operand" "=w") + (unspec:MVE_2 [(match_operand:SI 2 "s_register_operand" "1") + (match_operand:SI 3 "s_register_operand" "r") + (match_operand:SI 4 "mve_imm_selective_upto_8" "Rg")] + VIWDUPQ)) + (set (match_operand:SI 1 "s_register_operand" "=e") + (unspec:SI [(match_dup 2) + (match_dup 3) + (match_dup 4)] + VIWDUPQ))] + "TARGET_HAVE_MVE" + "viwdup.u%#<V_sz_elem>\t%q0, %2, %3, %4" +) + +;; +;; [viwdupq_m_n_u]) +;; +(define_expand "mve_viwdupq_m_n_u<mode>" + [(match_operand:MVE_2 0 "s_register_operand") + (match_operand:MVE_2 1 "s_register_operand") + (match_operand:SI 2 "s_register_operand") + (match_operand:SI 3 "s_register_operand") + (match_operand:SI 4 "mve_imm_selective_upto_8") + (match_operand:HI 5 "vpr_register_operand")] + "TARGET_HAVE_MVE" +{ + rtx ignore_wb = gen_reg_rtx (SImode); + emit_insn (gen_mve_viwdupq_m_wb_u<mode>_insn (operands[0], ignore_wb, + operands[1], operands[2], + operands[3], operands[4], + operands[5])); + DONE; +}) + +;; +;; [viwdupq_m_wb_u]) +;; +(define_expand "mve_viwdupq_m_wb_u<mode>" + [(match_operand:SI 0 "s_register_operand") + (match_operand:MVE_2 1 "s_register_operand") + (match_operand:SI 2 "s_register_operand") + (match_operand:SI 3 "s_register_operand") + (match_operand:SI 4 "mve_imm_selective_upto_8") + (match_operand:HI 5 "vpr_register_operand")] + "TARGET_HAVE_MVE" +{ + rtx ignore_vec = gen_reg_rtx (<MODE>mode); + emit_insn (gen_mve_viwdupq_m_wb_u<mode>_insn (ignore_vec, operands[0], + operands[1], operands[2], + operands[3], operands[4], + operands[5])); + DONE; +}) + +;; +;; [viwdupq_m_wb_u_insn]) +;; +(define_insn "mve_viwdupq_m_wb_u<mode>_insn" + [(set (match_operand:MVE_2 0 "s_register_operand" "=w") + (unspec:MVE_2 [(match_operand:MVE_2 2 "s_register_operand" "w") + (match_operand:SI 3 "s_register_operand" "1") + (match_operand:SI 4 "s_register_operand" "r") + (match_operand:SI 5 "mve_imm_selective_upto_8" "Rg") + (match_operand:HI 6 "vpr_register_operand" "Up")] + VIWDUPQ_M)) + (set (match_operand:SI 1 "s_register_operand" "=e") + (unspec:SI [(match_dup 2) + (match_dup 3) + (match_dup 4) + (match_dup 5) + (match_dup 6)] + VIWDUPQ_M)) + ] + "TARGET_HAVE_MVE" + "vpst\;\tviwdupt.u%#<V_sz_elem>\t%q2, %3, %4, %5" + [(set_attr "type" "mve_move") + (set_attr "length""8")]) |