diff options
author | Jonathan Wright <jonathan.wright@arm.com> | 2021-08-09 15:26:48 +0100 |
---|---|---|
committer | Jonathan Wright <jonathan.wright@arm.com> | 2021-11-04 14:54:36 +0000 |
commit | 66f206b85395c273980e2b81a54dbddc4897e4a7 (patch) | |
tree | b405ef729ea395a3da2f42ecb78a1fdef0853e91 /gcc/genmodes.c | |
parent | 4e5929e4575922015ff4634af4dea59c59a44f10 (diff) | |
download | gcc-66f206b85395c273980e2b81a54dbddc4897e4a7.zip gcc-66f206b85395c273980e2b81a54dbddc4897e4a7.tar.gz gcc-66f206b85395c273980e2b81a54dbddc4897e4a7.tar.bz2 |
aarch64: Add machine modes for Neon vector-tuple types
Until now, GCC has used large integer machine modes (OI, CI and XI)
to model Neon vector-tuple types. This is suboptimal for many
reasons, the most notable are:
1) Large integer modes are opaque and modifying one vector in the
tuple requires a lot of inefficient set/get gymnastics. The
result is a lot of superfluous move instructions.
2) Large integer modes do not map well to types that are tuples of
64-bit vectors - we need additional zero-padding which again
results in superfluous move instructions.
This patch adds new machine modes that better model the C-level Neon
vector-tuple types. The approach is somewhat similar to that already
used for SVE vector-tuple types.
All of the AArch64 backend patterns and builtins that manipulate Neon
vector tuples are updated to use the new machine modes. This has the
effect of significantly reducing the amount of boiler-plate code in
the arm_neon.h header.
While this patch increases the quality of code generated in many
instances, there is still room for significant improvement - which
will be attempted in subsequent patches.
gcc/ChangeLog:
2021-08-09 Jonathan Wright <jonathan.wright@arm.com>
Richard Sandiford <richard.sandiford@arm.com>
* config/aarch64/aarch64-builtins.c (v2x8qi_UP): Define.
(v2x4hi_UP): Likewise.
(v2x4hf_UP): Likewise.
(v2x4bf_UP): Likewise.
(v2x2si_UP): Likewise.
(v2x2sf_UP): Likewise.
(v2x1di_UP): Likewise.
(v2x1df_UP): Likewise.
(v2x16qi_UP): Likewise.
(v2x8hi_UP): Likewise.
(v2x8hf_UP): Likewise.
(v2x8bf_UP): Likewise.
(v2x4si_UP): Likewise.
(v2x4sf_UP): Likewise.
(v2x2di_UP): Likewise.
(v2x2df_UP): Likewise.
(v3x8qi_UP): Likewise.
(v3x4hi_UP): Likewise.
(v3x4hf_UP): Likewise.
(v3x4bf_UP): Likewise.
(v3x2si_UP): Likewise.
(v3x2sf_UP): Likewise.
(v3x1di_UP): Likewise.
(v3x1df_UP): Likewise.
(v3x16qi_UP): Likewise.
(v3x8hi_UP): Likewise.
(v3x8hf_UP): Likewise.
(v3x8bf_UP): Likewise.
(v3x4si_UP): Likewise.
(v3x4sf_UP): Likewise.
(v3x2di_UP): Likewise.
(v3x2df_UP): Likewise.
(v4x8qi_UP): Likewise.
(v4x4hi_UP): Likewise.
(v4x4hf_UP): Likewise.
(v4x4bf_UP): Likewise.
(v4x2si_UP): Likewise.
(v4x2sf_UP): Likewise.
(v4x1di_UP): Likewise.
(v4x1df_UP): Likewise.
(v4x16qi_UP): Likewise.
(v4x8hi_UP): Likewise.
(v4x8hf_UP): Likewise.
(v4x8bf_UP): Likewise.
(v4x4si_UP): Likewise.
(v4x4sf_UP): Likewise.
(v4x2di_UP): Likewise.
(v4x2df_UP): Likewise.
(TYPES_GETREGP): Delete.
(TYPES_SETREGP): Likewise.
(TYPES_LOADSTRUCT_U): Define.
(TYPES_LOADSTRUCT_P): Likewise.
(TYPES_LOADSTRUCT_LANE_U): Likewise.
(TYPES_LOADSTRUCT_LANE_P): Likewise.
(TYPES_STORE1P): Move for consistency.
(TYPES_STORESTRUCT_U): Define.
(TYPES_STORESTRUCT_P): Likewise.
(TYPES_STORESTRUCT_LANE_U): Likewise.
(TYPES_STORESTRUCT_LANE_P): Likewise.
(aarch64_simd_tuple_types): Define.
(aarch64_lookup_simd_builtin_type): Handle tuple type lookup.
(aarch64_init_simd_builtin_functions): Update frontend lookup
for builtin functions after handling arm_neon.h pragma.
(register_tuple_type): Manually set modes of single-integer
tuple types. Record tuple types.
* config/aarch64/aarch64-modes.def
(ADV_SIMD_D_REG_STRUCT_MODES): Define D-register tuple modes.
(ADV_SIMD_Q_REG_STRUCT_MODES): Define Q-register tuple modes.
(SVE_MODES): Give single-vector modes priority over vector-
tuple modes.
(VECTOR_MODES_WITH_PREFIX): Set partial-vector mode order to
be after all single-vector modes.
* config/aarch64/aarch64-simd-builtins.def: Update builtin
generator macros to reflect modifications to the backend
patterns.
* config/aarch64/aarch64-simd.md (aarch64_simd_ld2<mode>):
Use vector-tuple mode iterator and rename to...
(aarch64_simd_ld2<vstruct_elt>): This.
(aarch64_simd_ld2r<mode>): Use vector-tuple mode iterator and
rename to...
(aarch64_simd_ld2r<vstruct_elt>): This.
(aarch64_vec_load_lanesoi_lane<mode>): Use vector-tuple mode
iterator and rename to...
(aarch64_vec_load_lanes<mode>_lane<vstruct_elt>): This.
(vec_load_lanesoi<mode>): Use vector-tuple mode iterator and
rename to...
(vec_load_lanes<mode><vstruct_elt>): This.
(aarch64_simd_st2<mode>): Use vector-tuple mode iterator and
rename to...
(aarch64_simd_st2<vstruct_elt>): This.
(aarch64_vec_store_lanesoi_lane<mode>): Use vector-tuple mode
iterator and rename to...
(aarch64_vec_store_lanes<mode>_lane<vstruct_elt>): This.
(vec_store_lanesoi<mode>): Use vector-tuple mode iterator and
rename to...
(vec_store_lanes<mode><vstruct_elt>): This.
(aarch64_simd_ld3<mode>): Use vector-tuple mode iterator and
rename to...
(aarch64_simd_ld3<vstruct_elt>): This.
(aarch64_simd_ld3r<mode>): Use vector-tuple mode iterator and
rename to...
(aarch64_simd_ld3r<vstruct_elt>): This.
(aarch64_vec_load_lanesci_lane<mode>): Use vector-tuple mode
iterator and rename to...
(vec_load_lanesci<mode>): This.
(aarch64_simd_st3<mode>): Use vector-tuple mode iterator and
rename to...
(aarch64_simd_st3<vstruct_elt>): This.
(aarch64_vec_store_lanesci_lane<mode>): Use vector-tuple mode
iterator and rename to...
(vec_store_lanesci<mode>): This.
(aarch64_simd_ld4<mode>): Use vector-tuple mode iterator and
rename to...
(aarch64_simd_ld4<vstruct_elt>): This.
(aarch64_simd_ld4r<mode>): Use vector-tuple mode iterator and
rename to...
(aarch64_simd_ld4r<vstruct_elt>): This.
(aarch64_vec_load_lanesxi_lane<mode>): Use vector-tuple mode
iterator and rename to...
(vec_load_lanesxi<mode>): This.
(aarch64_simd_st4<mode>): Use vector-tuple mode iterator and
rename to...
(aarch64_simd_st4<vstruct_elt>): This.
(aarch64_vec_store_lanesxi_lane<mode>): Use vector-tuple mode
iterator and rename to...
(vec_store_lanesxi<mode>): This.
(mov<mode>): Define for Neon vector-tuple modes.
(aarch64_ld1x3<VALLDIF:mode>): Use vector-tuple mode iterator
and rename to...
(aarch64_ld1x3<vstruct_elt>): This.
(aarch64_ld1_x3_<mode>): Use vector-tuple mode iterator and
rename to...
(aarch64_ld1_x3_<vstruct_elt>): This.
(aarch64_ld1x4<VALLDIF:mode>): Use vector-tuple mode iterator
and rename to...
(aarch64_ld1x4<vstruct_elt>): This.
(aarch64_ld1_x4_<mode>): Use vector-tuple mode iterator and
rename to...
(aarch64_ld1_x4_<vstruct_elt>): This.
(aarch64_st1x2<VALLDIF:mode>): Use vector-tuple mode iterator
and rename to...
(aarch64_st1x2<vstruct_elt>): This.
(aarch64_st1_x2_<mode>): Use vector-tuple mode iterator and
rename to...
(aarch64_st1_x2_<vstruct_elt>): This.
(aarch64_st1x3<VALLDIF:mode>): Use vector-tuple mode iterator
and rename to...
(aarch64_st1x3<vstruct_elt>): This.
(aarch64_st1_x3_<mode>): Use vector-tuple mode iterator and
rename to...
(aarch64_st1_x3_<vstruct_elt>): This.
(aarch64_st1x4<VALLDIF:mode>): Use vector-tuple mode iterator
and rename to...
(aarch64_st1x4<vstruct_elt>): This.
(aarch64_st1_x4_<mode>): Use vector-tuple mode iterator and
rename to...
(aarch64_st1_x4_<vstruct_elt>): This.
(*aarch64_mov<mode>): Define for vector-tuple modes.
(*aarch64_be_mov<mode>): Likewise.
(aarch64_ld<VSTRUCT:nregs>r<VALLDIF:mode>): Use vector-tuple
mode iterator and rename to...
(aarch64_ld<nregs>r<vstruct_elt>): This.
(aarch64_ld2<mode>_dreg): Use vector-tuple mode iterator and
rename to...
(aarch64_ld2<vstruct_elt>_dreg): This.
(aarch64_ld3<mode>_dreg): Use vector-tuple mode iterator and
rename to...
(aarch64_ld3<vstruct_elt>_dreg): This.
(aarch64_ld4<mode>_dreg): Use vector-tuple mode iterator and
rename to...
(aarch64_ld4<vstruct_elt>_dreg): This.
(aarch64_ld<VSTRUCT:nregs><VDC:mode>): Use vector-tuple mode
iterator and rename to...
(aarch64_ld<nregs><vstruct_elt>): Use vector-tuple mode
iterator and rename to...
(aarch64_ld<VSTRUCT:nregs><VQ:mode>): Use vector-tuple mode
(aarch64_ld1x2<VQ:mode>): Delete.
(aarch64_ld1x2<VDC:mode>): Use vector-tuple mode iterator and
rename to...
(aarch64_ld1x2<vstruct_elt>): This.
(aarch64_ld<VSTRUCT:nregs>_lane<VALLDIF:mode>): Use vector-
tuple mode iterator and rename to...
(aarch64_ld<nregs>_lane<vstruct_elt>): This.
(aarch64_get_dreg<VSTRUCT:mode><VDC:mode>): Delete.
(aarch64_get_qreg<VSTRUCT:mode><VQ:mode>): Likewise.
(aarch64_st2<mode>_dreg): Use vector-tuple mode iterator and
rename to...
(aarch64_st2<vstruct_elt>_dreg): This.
(aarch64_st3<mode>_dreg): Use vector-tuple mode iterator and
rename to...
(aarch64_st3<vstruct_elt>_dreg): This.
(aarch64_st4<mode>_dreg): Use vector-tuple mode iterator and
rename to...
(aarch64_st4<vstruct_elt>_dreg): This.
(aarch64_st<VSTRUCT:nregs><VDC:mode>): Use vector-tuple mode
iterator and rename to...
(aarch64_st<nregs><vstruct_elt>): This.
(aarch64_st<VSTRUCT:nregs><VQ:mode>): Use vector-tuple mode
iterator and rename to aarch64_st<nregs><vstruct_elt>.
(aarch64_st<VSTRUCT:nregs>_lane<VALLDIF:mode>): Use vector-
tuple mode iterator and rename to...
(aarch64_st<nregs>_lane<vstruct_elt>): This.
(aarch64_set_qreg<VSTRUCT:mode><VQ:mode>): Delete.
(aarch64_simd_ld1<mode>_x2): Use vector-tuple mode iterator
and rename to...
(aarch64_simd_ld1<vstruct_elt>_x2): This.
* config/aarch64/aarch64.c (aarch64_advsimd_struct_mode_p):
Refactor to include new vector-tuple modes.
(aarch64_classify_vector_mode): Add cases for new vector-
tuple modes.
(aarch64_advsimd_partial_struct_mode_p): Define.
(aarch64_advsimd_full_struct_mode_p): Likewise.
(aarch64_advsimd_vector_array_mode): Likewise.
(aarch64_sve_data_mode): Change location in file.
(aarch64_array_mode): Handle case of Neon vector-tuple modes.
(aarch64_hard_regno_nregs): Handle case of partial Neon
vector structures.
(aarch64_classify_address): Refactor to include handling of
Neon vector-tuple modes.
(aarch64_print_operand): Print "d" for "%R" for a partial
Neon vector structure.
(aarch64_expand_vec_perm_1): Use new vector-tuple mode.
(aarch64_modes_tieable_p): Prevent tieing Neon partial struct
modes with scalar machines modes larger than 8 bytes.
(aarch64_can_change_mode_class): Don't allow changes between
partial and full Neon vector-structure modes.
* config/aarch64/arm_neon.h (vst2_lane_f16): Use updated
builtin and remove boiler-plate code for opaque mode.
(vst2_lane_f32): Likewise.
(vst2_lane_f64): Likewise.
(vst2_lane_p8): Likewise.
(vst2_lane_p16): Likewise.
(vst2_lane_p64): Likewise.
(vst2_lane_s8): Likewise.
(vst2_lane_s16): Likewise.
(vst2_lane_s32): Likewise.
(vst2_lane_s64): Likewise.
(vst2_lane_u8): Likewise.
(vst2_lane_u16): Likewise.
(vst2_lane_u32): Likewise.
(vst2_lane_u64): Likewise.
(vst2q_lane_f16): Likewise.
(vst2q_lane_f32): Likewise.
(vst2q_lane_f64): Likewise.
(vst2q_lane_p8): Likewise.
(vst2q_lane_p16): Likewise.
(vst2q_lane_p64): Likewise.
(vst2q_lane_s8): Likewise.
(vst2q_lane_s16): Likewise.
(vst2q_lane_s32): Likewise.
(vst2q_lane_s64): Likewise.
(vst2q_lane_u8): Likewise.
(vst2q_lane_u16): Likewise.
(vst2q_lane_u32): Likewise.
(vst2q_lane_u64): Likewise.
(vst3_lane_f16): Likewise.
(vst3_lane_f32): Likewise.
(vst3_lane_f64): Likewise.
(vst3_lane_p8): Likewise.
(vst3_lane_p16): Likewise.
(vst3_lane_p64): Likewise.
(vst3_lane_s8): Likewise.
(vst3_lane_s16): Likewise.
(vst3_lane_s32): Likewise.
(vst3_lane_s64): Likewise.
(vst3_lane_u8): Likewise.
(vst3_lane_u16): Likewise.
(vst3_lane_u32): Likewise.
(vst3_lane_u64): Likewise.
(vst3q_lane_f16): Likewise.
(vst3q_lane_f32): Likewise.
(vst3q_lane_f64): Likewise.
(vst3q_lane_p8): Likewise.
(vst3q_lane_p16): Likewise.
(vst3q_lane_p64): Likewise.
(vst3q_lane_s8): Likewise.
(vst3q_lane_s16): Likewise.
(vst3q_lane_s32): Likewise.
(vst3q_lane_s64): Likewise.
(vst3q_lane_u8): Likewise.
(vst3q_lane_u16): Likewise.
(vst3q_lane_u32): Likewise.
(vst3q_lane_u64): Likewise.
(vst4_lane_f16): Likewise.
(vst4_lane_f32): Likewise.
(vst4_lane_f64): Likewise.
(vst4_lane_p8): Likewise.
(vst4_lane_p16): Likewise.
(vst4_lane_p64): Likewise.
(vst4_lane_s8): Likewise.
(vst4_lane_s16): Likewise.
(vst4_lane_s32): Likewise.
(vst4_lane_s64): Likewise.
(vst4_lane_u8): Likewise.
(vst4_lane_u16): Likewise.
(vst4_lane_u32): Likewise.
(vst4_lane_u64): Likewise.
(vst4q_lane_f16): Likewise.
(vst4q_lane_f32): Likewise.
(vst4q_lane_f64): Likewise.
(vst4q_lane_p8): Likewise.
(vst4q_lane_p16): Likewise.
(vst4q_lane_p64): Likewise.
(vst4q_lane_s8): Likewise.
(vst4q_lane_s16): Likewise.
(vst4q_lane_s32): Likewise.
(vst4q_lane_s64): Likewise.
(vst4q_lane_u8): Likewise.
(vst4q_lane_u16): Likewise.
(vst4q_lane_u32): Likewise.
(vst4q_lane_u64): Likewise.
(vtbl3_s8): Likewise.
(vtbl3_u8): Likewise.
(vtbl3_p8): Likewise.
(vtbl4_s8): Likewise.
(vtbl4_u8): Likewise.
(vtbl4_p8): Likewise.
(vld1_u8_x3): Likewise.
(vld1_s8_x3): Likewise.
(vld1_u16_x3): Likewise.
(vld1_s16_x3): Likewise.
(vld1_u32_x3): Likewise.
(vld1_s32_x3): Likewise.
(vld1_u64_x3): Likewise.
(vld1_s64_x3): Likewise.
(vld1_f16_x3): Likewise.
(vld1_f32_x3): Likewise.
(vld1_f64_x3): Likewise.
(vld1_p8_x3): Likewise.
(vld1_p16_x3): Likewise.
(vld1_p64_x3): Likewise.
(vld1q_u8_x3): Likewise.
(vld1q_s8_x3): Likewise.
(vld1q_u16_x3): Likewise.
(vld1q_s16_x3): Likewise.
(vld1q_u32_x3): Likewise.
(vld1q_s32_x3): Likewise.
(vld1q_u64_x3): Likewise.
(vld1q_s64_x3): Likewise.
(vld1q_f16_x3): Likewise.
(vld1q_f32_x3): Likewise.
(vld1q_f64_x3): Likewise.
(vld1q_p8_x3): Likewise.
(vld1q_p16_x3): Likewise.
(vld1q_p64_x3): Likewise.
(vld1_u8_x2): Likewise.
(vld1_s8_x2): Likewise.
(vld1_u16_x2): Likewise.
(vld1_s16_x2): Likewise.
(vld1_u32_x2): Likewise.
(vld1_s32_x2): Likewise.
(vld1_u64_x2): Likewise.
(vld1_s64_x2): Likewise.
(vld1_f16_x2): Likewise.
(vld1_f32_x2): Likewise.
(vld1_f64_x2): Likewise.
(vld1_p8_x2): Likewise.
(vld1_p16_x2): Likewise.
(vld1_p64_x2): Likewise.
(vld1q_u8_x2): Likewise.
(vld1q_s8_x2): Likewise.
(vld1q_u16_x2): Likewise.
(vld1q_s16_x2): Likewise.
(vld1q_u32_x2): Likewise.
(vld1q_s32_x2): Likewise.
(vld1q_u64_x2): Likewise.
(vld1q_s64_x2): Likewise.
(vld1q_f16_x2): Likewise.
(vld1q_f32_x2): Likewise.
(vld1q_f64_x2): Likewise.
(vld1q_p8_x2): Likewise.
(vld1q_p16_x2): Likewise.
(vld1q_p64_x2): Likewise.
(vld1_s8_x4): Likewise.
(vld1q_s8_x4): Likewise.
(vld1_s16_x4): Likewise.
(vld1q_s16_x4): Likewise.
(vld1_s32_x4): Likewise.
(vld1q_s32_x4): Likewise.
(vld1_u8_x4): Likewise.
(vld1q_u8_x4): Likewise.
(vld1_u16_x4): Likewise.
(vld1q_u16_x4): Likewise.
(vld1_u32_x4): Likewise.
(vld1q_u32_x4): Likewise.
(vld1_f16_x4): Likewise.
(vld1q_f16_x4): Likewise.
(vld1_f32_x4): Likewise.
(vld1q_f32_x4): Likewise.
(vld1_p8_x4): Likewise.
(vld1q_p8_x4): Likewise.
(vld1_p16_x4): Likewise.
(vld1q_p16_x4): Likewise.
(vld1_s64_x4): Likewise.
(vld1_u64_x4): Likewise.
(vld1_p64_x4): Likewise.
(vld1q_s64_x4): Likewise.
(vld1q_u64_x4): Likewise.
(vld1q_p64_x4): Likewise.
(vld1_f64_x4): Likewise.
(vld1q_f64_x4): Likewise.
(vld2_s64): Likewise.
(vld2_u64): Likewise.
(vld2_f64): Likewise.
(vld2_s8): Likewise.
(vld2_p8): Likewise.
(vld2_p64): Likewise.
(vld2_s16): Likewise.
(vld2_p16): Likewise.
(vld2_s32): Likewise.
(vld2_u8): Likewise.
(vld2_u16): Likewise.
(vld2_u32): Likewise.
(vld2_f16): Likewise.
(vld2_f32): Likewise.
(vld2q_s8): Likewise.
(vld2q_p8): Likewise.
(vld2q_s16): Likewise.
(vld2q_p16): Likewise.
(vld2q_p64): Likewise.
(vld2q_s32): Likewise.
(vld2q_s64): Likewise.
(vld2q_u8): Likewise.
(vld2q_u16): Likewise.
(vld2q_u32): Likewise.
(vld2q_u64): Likewise.
(vld2q_f16): Likewise.
(vld2q_f32): Likewise.
(vld2q_f64): Likewise.
(vld3_s64): Likewise.
(vld3_u64): Likewise.
(vld3_f64): Likewise.
(vld3_s8): Likewise.
(vld3_p8): Likewise.
(vld3_s16): Likewise.
(vld3_p16): Likewise.
(vld3_s32): Likewise.
(vld3_u8): Likewise.
(vld3_u16): Likewise.
(vld3_u32): Likewise.
(vld3_f16): Likewise.
(vld3_f32): Likewise.
(vld3_p64): Likewise.
(vld3q_s8): Likewise.
(vld3q_p8): Likewise.
(vld3q_s16): Likewise.
(vld3q_p16): Likewise.
(vld3q_s32): Likewise.
(vld3q_s64): Likewise.
(vld3q_u8): Likewise.
(vld3q_u16): Likewise.
(vld3q_u32): Likewise.
(vld3q_u64): Likewise.
(vld3q_f16): Likewise.
(vld3q_f32): Likewise.
(vld3q_f64): Likewise.
(vld3q_p64): Likewise.
(vld4_s64): Likewise.
(vld4_u64): Likewise.
(vld4_f64): Likewise.
(vld4_s8): Likewise.
(vld4_p8): Likewise.
(vld4_s16): Likewise.
(vld4_p16): Likewise.
(vld4_s32): Likewise.
(vld4_u8): Likewise.
(vld4_u16): Likewise.
(vld4_u32): Likewise.
(vld4_f16): Likewise.
(vld4_f32): Likewise.
(vld4_p64): Likewise.
(vld4q_s8): Likewise.
(vld4q_p8): Likewise.
(vld4q_s16): Likewise.
(vld4q_p16): Likewise.
(vld4q_s32): Likewise.
(vld4q_s64): Likewise.
(vld4q_u8): Likewise.
(vld4q_u16): Likewise.
(vld4q_u32): Likewise.
(vld4q_u64): Likewise.
(vld4q_f16): Likewise.
(vld4q_f32): Likewise.
(vld4q_f64): Likewise.
(vld4q_p64): Likewise.
(vld2_dup_s8): Likewise.
(vld2_dup_s16): Likewise.
(vld2_dup_s32): Likewise.
(vld2_dup_f16): Likewise.
(vld2_dup_f32): Likewise.
(vld2_dup_f64): Likewise.
(vld2_dup_u8): Likewise.
(vld2_dup_u16): Likewise.
(vld2_dup_u32): Likewise.
(vld2_dup_p8): Likewise.
(vld2_dup_p16): Likewise.
(vld2_dup_p64): Likewise.
(vld2_dup_s64): Likewise.
(vld2_dup_u64): Likewise.
(vld2q_dup_s8): Likewise.
(vld2q_dup_p8): Likewise.
(vld2q_dup_s16): Likewise.
(vld2q_dup_p16): Likewise.
(vld2q_dup_s32): Likewise.
(vld2q_dup_s64): Likewise.
(vld2q_dup_u8): Likewise.
(vld2q_dup_u16): Likewise.
(vld2q_dup_u32): Likewise.
(vld2q_dup_u64): Likewise.
(vld2q_dup_f16): Likewise.
(vld2q_dup_f32): Likewise.
(vld2q_dup_f64): Likewise.
(vld2q_dup_p64): Likewise.
(vld3_dup_s64): Likewise.
(vld3_dup_u64): Likewise.
(vld3_dup_f64): Likewise.
(vld3_dup_s8): Likewise.
(vld3_dup_p8): Likewise.
(vld3_dup_s16): Likewise.
(vld3_dup_p16): Likewise.
(vld3_dup_s32): Likewise.
(vld3_dup_u8): Likewise.
(vld3_dup_u16): Likewise.
(vld3_dup_u32): Likewise.
(vld3_dup_f16): Likewise.
(vld3_dup_f32): Likewise.
(vld3_dup_p64): Likewise.
(vld3q_dup_s8): Likewise.
(vld3q_dup_p8): Likewise.
(vld3q_dup_s16): Likewise.
(vld3q_dup_p16): Likewise.
(vld3q_dup_s32): Likewise.
(vld3q_dup_s64): Likewise.
(vld3q_dup_u8): Likewise.
(vld3q_dup_u16): Likewise.
(vld3q_dup_u32): Likewise.
(vld3q_dup_u64): Likewise.
(vld3q_dup_f16): Likewise.
(vld3q_dup_f32): Likewise.
(vld3q_dup_f64): Likewise.
(vld3q_dup_p64): Likewise.
(vld4_dup_s64): Likewise.
(vld4_dup_u64): Likewise.
(vld4_dup_f64): Likewise.
(vld4_dup_s8): Likewise.
(vld4_dup_p8): Likewise.
(vld4_dup_s16): Likewise.
(vld4_dup_p16): Likewise.
(vld4_dup_s32): Likewise.
(vld4_dup_u8): Likewise.
(vld4_dup_u16): Likewise.
(vld4_dup_u32): Likewise.
(vld4_dup_f16): Likewise.
(vld4_dup_f32): Likewise.
(vld4_dup_p64): Likewise.
(vld4q_dup_s8): Likewise.
(vld4q_dup_p8): Likewise.
(vld4q_dup_s16): Likewise.
(vld4q_dup_p16): Likewise.
(vld4q_dup_s32): Likewise.
(vld4q_dup_s64): Likewise.
(vld4q_dup_u8): Likewise.
(vld4q_dup_u16): Likewise.
(vld4q_dup_u32): Likewise.
(vld4q_dup_u64): Likewise.
(vld4q_dup_f16): Likewise.
(vld4q_dup_f32): Likewise.
(vld4q_dup_f64): Likewise.
(vld4q_dup_p64): Likewise.
(vld2_lane_u8): Likewise.
(vld2_lane_u16): Likewise.
(vld2_lane_u32): Likewise.
(vld2_lane_u64): Likewise.
(vld2_lane_s8): Likewise.
(vld2_lane_s16): Likewise.
(vld2_lane_s32): Likewise.
(vld2_lane_s64): Likewise.
(vld2_lane_f16): Likewise.
(vld2_lane_f32): Likewise.
(vld2_lane_f64): Likewise.
(vld2_lane_p8): Likewise.
(vld2_lane_p16): Likewise.
(vld2_lane_p64): Likewise.
(vld2q_lane_u8): Likewise.
(vld2q_lane_u16): Likewise.
(vld2q_lane_u32): Likewise.
(vld2q_lane_u64): Likewise.
(vld2q_lane_s8): Likewise.
(vld2q_lane_s16): Likewise.
(vld2q_lane_s32): Likewise.
(vld2q_lane_s64): Likewise.
(vld2q_lane_f16): Likewise.
(vld2q_lane_f32): Likewise.
(vld2q_lane_f64): Likewise.
(vld2q_lane_p8): Likewise.
(vld2q_lane_p16): Likewise.
(vld2q_lane_p64): Likewise.
(vld3_lane_u8): Likewise.
(vld3_lane_u16): Likewise.
(vld3_lane_u32): Likewise.
(vld3_lane_u64): Likewise.
(vld3_lane_s8): Likewise.
(vld3_lane_s16): Likewise.
(vld3_lane_s32): Likewise.
(vld3_lane_s64): Likewise.
(vld3_lane_f16): Likewise.
(vld3_lane_f32): Likewise.
(vld3_lane_f64): Likewise.
(vld3_lane_p8): Likewise.
(vld3_lane_p16): Likewise.
(vld3_lane_p64): Likewise.
(vld3q_lane_u8): Likewise.
(vld3q_lane_u16): Likewise.
(vld3q_lane_u32): Likewise.
(vld3q_lane_u64): Likewise.
(vld3q_lane_s8): Likewise.
(vld3q_lane_s16): Likewise.
(vld3q_lane_s32): Likewise.
(vld3q_lane_s64): Likewise.
(vld3q_lane_f16): Likewise.
(vld3q_lane_f32): Likewise.
(vld3q_lane_f64): Likewise.
(vld3q_lane_p8): Likewise.
(vld3q_lane_p16): Likewise.
(vld3q_lane_p64): Likewise.
(vld4_lane_u8): Likewise.
(vld4_lane_u16): Likewise.
(vld4_lane_u32): Likewise.
(vld4_lane_u64): Likewise.
(vld4_lane_s8): Likewise.
(vld4_lane_s16): Likewise.
(vld4_lane_s32): Likewise.
(vld4_lane_s64): Likewise.
(vld4_lane_f16): Likewise.
(vld4_lane_f32): Likewise.
(vld4_lane_f64): Likewise.
(vld4_lane_p8): Likewise.
(vld4_lane_p16): Likewise.
(vld4_lane_p64): Likewise.
(vld4q_lane_u8): Likewise.
(vld4q_lane_u16): Likewise.
(vld4q_lane_u32): Likewise.
(vld4q_lane_u64): Likewise.
(vld4q_lane_s8): Likewise.
(vld4q_lane_s16): Likewise.
(vld4q_lane_s32): Likewise.
(vld4q_lane_s64): Likewise.
(vld4q_lane_f16): Likewise.
(vld4q_lane_f32): Likewise.
(vld4q_lane_f64): Likewise.
(vld4q_lane_p8): Likewise.
(vld4q_lane_p16): Likewise.
(vld4q_lane_p64): Likewise.
(vqtbl2_s8): Likewise.
(vqtbl2_u8): Likewise.
(vqtbl2_p8): Likewise.
(vqtbl2q_s8): Likewise.
(vqtbl2q_u8): Likewise.
(vqtbl2q_p8): Likewise.
(vqtbl3_s8): Likewise.
(vqtbl3_u8): Likewise.
(vqtbl3_p8): Likewise.
(vqtbl3q_s8): Likewise.
(vqtbl3q_u8): Likewise.
(vqtbl3q_p8): Likewise.
(vqtbl4_s8): Likewise.
(vqtbl4_u8): Likewise.
(vqtbl4_p8): Likewise.
(vqtbl4q_s8): Likewise.
(vqtbl4q_u8): Likewise.
(vqtbl4q_p8): Likewise.
(vqtbx2_s8): Likewise.
(vqtbx2_u8): Likewise.
(vqtbx2_p8): Likewise.
(vqtbx2q_s8): Likewise.
(vqtbx2q_u8): Likewise.
(vqtbx2q_p8): Likewise.
(vqtbx3_s8): Likewise.
(vqtbx3_u8): Likewise.
(vqtbx3_p8): Likewise.
(vqtbx3q_s8): Likewise.
(vqtbx3q_u8): Likewise.
(vqtbx3q_p8): Likewise.
(vqtbx4_s8): Likewise.
(vqtbx4_u8): Likewise.
(vqtbx4_p8): Likewise.
(vqtbx4q_s8): Likewise.
(vqtbx4q_u8): Likewise.
(vqtbx4q_p8): Likewise.
(vst1_s64_x2): Likewise.
(vst1_u64_x2): Likewise.
(vst1_f64_x2): Likewise.
(vst1_s8_x2): Likewise.
(vst1_p8_x2): Likewise.
(vst1_s16_x2): Likewise.
(vst1_p16_x2): Likewise.
(vst1_s32_x2): Likewise.
(vst1_u8_x2): Likewise.
(vst1_u16_x2): Likewise.
(vst1_u32_x2): Likewise.
(vst1_f16_x2): Likewise.
(vst1_f32_x2): Likewise.
(vst1_p64_x2): Likewise.
(vst1q_s8_x2): Likewise.
(vst1q_p8_x2): Likewise.
(vst1q_s16_x2): Likewise.
(vst1q_p16_x2): Likewise.
(vst1q_s32_x2): Likewise.
(vst1q_s64_x2): Likewise.
(vst1q_u8_x2): Likewise.
(vst1q_u16_x2): Likewise.
(vst1q_u32_x2): Likewise.
(vst1q_u64_x2): Likewise.
(vst1q_f16_x2): Likewise.
(vst1q_f32_x2): Likewise.
(vst1q_f64_x2): Likewise.
(vst1q_p64_x2): Likewise.
(vst1_s64_x3): Likewise.
(vst1_u64_x3): Likewise.
(vst1_f64_x3): Likewise.
(vst1_s8_x3): Likewise.
(vst1_p8_x3): Likewise.
(vst1_s16_x3): Likewise.
(vst1_p16_x3): Likewise.
(vst1_s32_x3): Likewise.
(vst1_u8_x3): Likewise.
(vst1_u16_x3): Likewise.
(vst1_u32_x3): Likewise.
(vst1_f16_x3): Likewise.
(vst1_f32_x3): Likewise.
(vst1_p64_x3): Likewise.
(vst1q_s8_x3): Likewise.
(vst1q_p8_x3): Likewise.
(vst1q_s16_x3): Likewise.
(vst1q_p16_x3): Likewise.
(vst1q_s32_x3): Likewise.
(vst1q_s64_x3): Likewise.
(vst1q_u8_x3): Likewise.
(vst1q_u16_x3): Likewise.
(vst1q_u32_x3): Likewise.
(vst1q_u64_x3): Likewise.
(vst1q_f16_x3): Likewise.
(vst1q_f32_x3): Likewise.
(vst1q_f64_x3): Likewise.
(vst1q_p64_x3): Likewise.
(vst1_s8_x4): Likewise.
(vst1q_s8_x4): Likewise.
(vst1_s16_x4): Likewise.
(vst1q_s16_x4): Likewise.
(vst1_s32_x4): Likewise.
(vst1q_s32_x4): Likewise.
(vst1_u8_x4): Likewise.
(vst1q_u8_x4): Likewise.
(vst1_u16_x4): Likewise.
(vst1q_u16_x4): Likewise.
(vst1_u32_x4): Likewise.
(vst1q_u32_x4): Likewise.
(vst1_f16_x4): Likewise.
(vst1q_f16_x4): Likewise.
(vst1_f32_x4): Likewise.
(vst1q_f32_x4): Likewise.
(vst1_p8_x4): Likewise.
(vst1q_p8_x4): Likewise.
(vst1_p16_x4): Likewise.
(vst1q_p16_x4): Likewise.
(vst1_s64_x4): Likewise.
(vst1_u64_x4): Likewise.
(vst1_p64_x4): Likewise.
(vst1q_s64_x4): Likewise.
(vst1q_u64_x4): Likewise.
(vst1q_p64_x4): Likewise.
(vst1_f64_x4): Likewise.
(vst1q_f64_x4): Likewise.
(vst2_s64): Likewise.
(vst2_u64): Likewise.
(vst2_f64): Likewise.
(vst2_s8): Likewise.
(vst2_p8): Likewise.
(vst2_s16): Likewise.
(vst2_p16): Likewise.
(vst2_s32): Likewise.
(vst2_u8): Likewise.
(vst2_u16): Likewise.
(vst2_u32): Likewise.
(vst2_f16): Likewise.
(vst2_f32): Likewise.
(vst2_p64): Likewise.
(vst2q_s8): Likewise.
(vst2q_p8): Likewise.
(vst2q_s16): Likewise.
(vst2q_p16): Likewise.
(vst2q_s32): Likewise.
(vst2q_s64): Likewise.
(vst2q_u8): Likewise.
(vst2q_u16): Likewise.
(vst2q_u32): Likewise.
(vst2q_u64): Likewise.
(vst2q_f16): Likewise.
(vst2q_f32): Likewise.
(vst2q_f64): Likewise.
(vst2q_p64): Likewise.
(vst3_s64): Likewise.
(vst3_u64): Likewise.
(vst3_f64): Likewise.
(vst3_s8): Likewise.
(vst3_p8): Likewise.
(vst3_s16): Likewise.
(vst3_p16): Likewise.
(vst3_s32): Likewise.
(vst3_u8): Likewise.
(vst3_u16): Likewise.
(vst3_u32): Likewise.
(vst3_f16): Likewise.
(vst3_f32): Likewise.
(vst3_p64): Likewise.
(vst3q_s8): Likewise.
(vst3q_p8): Likewise.
(vst3q_s16): Likewise.
(vst3q_p16): Likewise.
(vst3q_s32): Likewise.
(vst3q_s64): Likewise.
(vst3q_u8): Likewise.
(vst3q_u16): Likewise.
(vst3q_u32): Likewise.
(vst3q_u64): Likewise.
(vst3q_f16): Likewise.
(vst3q_f32): Likewise.
(vst3q_f64): Likewise.
(vst3q_p64): Likewise.
(vst4_s64): Likewise.
(vst4_u64): Likewise.
(vst4_f64): Likewise.
(vst4_s8): Likewise.
(vst4_p8): Likewise.
(vst4_s16): Likewise.
(vst4_p16): Likewise.
(vst4_s32): Likewise.
(vst4_u8): Likewise.
(vst4_u16): Likewise.
(vst4_u32): Likewise.
(vst4_f16): Likewise.
(vst4_f32): Likewise.
(vst4_p64): Likewise.
(vst4q_s8): Likewise.
(vst4q_p8): Likewise.
(vst4q_s16): Likewise.
(vst4q_p16): Likewise.
(vst4q_s32): Likewise.
(vst4q_s64): Likewise.
(vst4q_u8): Likewise.
(vst4q_u16): Likewise.
(vst4q_u32): Likewise.
(vst4q_u64): Likewise.
(vst4q_f16): Likewise.
(vst4q_f32): Likewise.
(vst4q_f64): Likewise.
(vst4q_p64): Likewise.
(vtbx4_s8): Likewise.
(vtbx4_u8): Likewise.
(vtbx4_p8): Likewise.
(vld1_bf16_x2): Likewise.
(vld1q_bf16_x2): Likewise.
(vld1_bf16_x3): Likewise.
(vld1q_bf16_x3): Likewise.
(vld1_bf16_x4): Likewise.
(vld1q_bf16_x4): Likewise.
(vld2_bf16): Likewise.
(vld2q_bf16): Likewise.
(vld2_dup_bf16): Likewise.
(vld2q_dup_bf16): Likewise.
(vld3_bf16): Likewise.
(vld3q_bf16): Likewise.
(vld3_dup_bf16): Likewise.
(vld3q_dup_bf16): Likewise.
(vld4_bf16): Likewise.
(vld4q_bf16): Likewise.
(vld4_dup_bf16): Likewise.
(vld4q_dup_bf16): Likewise.
(vst1_bf16_x2): Likewise.
(vst1q_bf16_x2): Likewise.
(vst1_bf16_x3): Likewise.
(vst1q_bf16_x3): Likewise.
(vst1_bf16_x4): Likewise.
(vst1q_bf16_x4): Likewise.
(vst2_bf16): Likewise.
(vst2q_bf16): Likewise.
(vst3_bf16): Likewise.
(vst3q_bf16): Likewise.
(vst4_bf16): Likewise.
(vst4q_bf16): Likewise.
(vld2_lane_bf16): Likewise.
(vld2q_lane_bf16): Likewise.
(vld3_lane_bf16): Likewise.
(vld3q_lane_bf16): Likewise.
(vld4_lane_bf16): Likewise.
(vld4q_lane_bf16): Likewise.
(vst2_lane_bf16): Likewise.
(vst2q_lane_bf16): Likewise.
(vst3_lane_bf16): Likewise.
(vst3q_lane_bf16): Likewise.
(vst4_lane_bf16): Likewise.
(vst4q_lane_bf16): Likewise.
* config/aarch64/geniterators.sh: Modify iterator regex to
match new vector-tuple modes.
* config/aarch64/iterators.md (insn_count): Extend mode
attribute with vector-tuple type information.
(nregs): Likewise.
(Vendreg): Likewise.
(Vetype): Likewise.
(Vtype): Likewise.
(VSTRUCT_2D): New mode iterator.
(VSTRUCT_2DNX): Likewise.
(VSTRUCT_2DX): Likewise.
(VSTRUCT_2Q): Likewise.
(VSTRUCT_2QD): Likewise.
(VSTRUCT_3D): Likewise.
(VSTRUCT_3DNX): Likewise.
(VSTRUCT_3DX): Likewise.
(VSTRUCT_3Q): Likewise.
(VSTRUCT_3QD): Likewise.
(VSTRUCT_4D): Likewise.
(VSTRUCT_4DNX): Likewise.
(VSTRUCT_4DX): Likewise.
(VSTRUCT_4Q): Likewise.
(VSTRUCT_4QD): Likewise.
(VSTRUCT_D): Likewise.
(VSTRUCT_Q): Likewise.
(VSTRUCT_QD): Likewise.
(VSTRUCT_ELT): New mode attribute.
(vstruct_elt): Likewise.
* genmodes.c (VECTOR_MODE): Add default prefix and order
parameters.
(VECTOR_MODE_WITH_PREFIX): Define.
(make_vector_mode): Add mode prefix and order parameters.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/advsimd-intrinsics/bf16_vldN_lane_2.c:
Relax incorrect register number requirement.
* gcc.target/aarch64/sve/pcs/struct_3_256.c: Accept
equivalent codegen with fmov.
Diffstat (limited to 'gcc/genmodes.c')
-rw-r--r-- | gcc/genmodes.c | 10 |
1 files changed, 7 insertions, 3 deletions
diff --git a/gcc/genmodes.c b/gcc/genmodes.c index c268ebc..c9af4ef 100644 --- a/gcc/genmodes.c +++ b/gcc/genmodes.c @@ -749,12 +749,15 @@ make_partial_integer_mode (const char *base, const char *name, /* A single vector mode can be specified by naming its component mode and the number of components. */ -#define VECTOR_MODE(C, M, N) \ - make_vector_mode (MODE_##C, #M, N, __FILE__, __LINE__); +#define VECTOR_MODE_WITH_PREFIX(PREFIX, C, M, N, ORDER) \ + make_vector_mode (MODE_##C, #PREFIX, #M, N, ORDER, __FILE__, __LINE__); +#define VECTOR_MODE(C, M, N) VECTOR_MODE_WITH_PREFIX(V, C, M, N, 0); static void ATTRIBUTE_UNUSED make_vector_mode (enum mode_class bclass, + const char *prefix, const char *base, unsigned int ncomponents, + unsigned int order, const char *file, unsigned int line) { struct mode_data *v; @@ -778,7 +781,7 @@ make_vector_mode (enum mode_class bclass, return; } - if ((size_t)snprintf (namebuf, sizeof namebuf, "V%u%s", + if ((size_t)snprintf (namebuf, sizeof namebuf, "%s%u%s", prefix, ncomponents, base) >= sizeof namebuf) { error ("%s:%d: mode name \"%s\" is too long", @@ -787,6 +790,7 @@ make_vector_mode (enum mode_class bclass, } v = new_mode (vclass, xstrdup (namebuf), file, line); + v->order = order; v->ncomponents = ncomponents; v->component = component; } |