author | Jonathan Wright <jonathan.wright@arm.com> | 2021-07-20 10:28:34 +0100 |
---|---|---|
committer | Jonathan Wright <jonathan.wright@arm.com> | 2021-07-23 12:15:20 +0100 |
commit | e8de7edde6c5c3cc60f15c78422b85b4ccdc08bf (patch) | |
tree | 926acb26919bd15868b75646d73cbeeef2f373c8 /gcc/gimple-array-bounds.h | |
parent | 4848e283ccaed451ddcc38edcb9f5ce9e9f2d7eb (diff) | |
aarch64: Use memcpy to copy vector tables in vst4[q] intrinsics

Use __builtin_memcpy to copy vector structures instead of building
a new opaque structure one vector at a time in each of the vst4[q]
Neon intrinsics in arm_neon.h. This simplifies the header file and
also improves code generation - superfluous move instructions were
emitted for every register extraction/set in this additional
structure.

Add new code generation tests to verify that superfluous move
instructions are no longer generated for the vst4q intrinsics.
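
The change of pattern described above can be illustrated with a minimal, portable sketch. It is not the arm_neon.h source: the types `vec128`, `vec128x4` and `opaque_xi` and the helpers `set_vec`, `pack_per_vector`, `pack_memcpy` and `store4_interleaved` are hypothetical stand-ins for the Neon vector types, the opaque `__builtin_aarch64_simd_xi` structure and the store builtin. Plain `memcpy` is used here where the header uses `__builtin_memcpy`. The sketch only shows that a single whole-structure copy yields the same bytes as filling the opaque structure one vector at a time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins, NOT the real arm_neon.h types.  */
typedef struct { int32_t lane[4]; } vec128;             /* models a 128-bit vector */
typedef struct { vec128 val[4]; } vec128x4;             /* models a four-vector struct */
typedef struct { unsigned char bytes[64]; } opaque_xi;  /* models the opaque structure */

static_assert (sizeof (opaque_xi) == sizeof (vec128x4),
               "opaque and vector structures must have the same size");

/* Hypothetical helper: return a copy of O with vector slot IDX set to V.  */
static opaque_xi
set_vec (opaque_xi o, vec128 v, int idx)
{
  memcpy (o.bytes + (size_t) idx * sizeof (vec128), &v, sizeof (vec128));
  return o;
}

/* Old pattern: build the opaque structure one vector at a time.  */
static opaque_xi
pack_per_vector (vec128x4 v)
{
  opaque_xi o = { { 0 } };
  o = set_vec (o, v.val[0], 0);
  o = set_vec (o, v.val[1], 1);
  o = set_vec (o, v.val[2], 2);
  o = set_vec (o, v.val[3], 3);
  return o;
}

/* New pattern: copy the whole vector structure in one go.  */
static opaque_xi
pack_memcpy (vec128x4 v)
{
  opaque_xi o;
  memcpy (&o, &v, sizeof (v));
  return o;
}

/* Scalar model of a four-way interleaving store:
   out[4 * lane + reg] receives lane LANE of vector REG.  */
static void
store4_interleaved (int32_t *out, opaque_xi o)
{
  vec128 regs[4];
  memcpy (regs, o.bytes, sizeof (regs));
  for (int lane = 0; lane < 4; lane++)
    for (int reg = 0; reg < 4; reg++)
      out[4 * lane + reg] = regs[reg].lane[lane];
}
```

Both packing functions produce byte-identical results in this model, so the choice between them affects only what the compiler has to optimise away, which is the code generation improvement this commit targets.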
gcc/ChangeLog:
2021-07-20 Jonathan Wright <jonathan.wright@arm.com>
* config/aarch64/arm_neon.h (vst4_s64): Use __builtin_memcpy
instead of constructing __builtin_aarch64_simd_xi one vector
at a time.
(vst4_u64): Likewise.
(vst4_f64): Likewise.
(vst4_s8): Likewise.
(vst4_p8): Likewise.
(vst4_s16): Likewise.
(vst4_p16): Likewise.
(vst4_s32): Likewise.
(vst4_u8): Likewise.
(vst4_u16): Likewise.
(vst4_u32): Likewise.
(vst4_f16): Likewise.
(vst4_f32): Likewise.
(vst4_p64): Likewise.
(vst4q_s8): Likewise.
(vst4q_p8): Likewise.
(vst4q_s16): Likewise.
(vst4q_p16): Likewise.
(vst4q_s32): Likewise.
(vst4q_s64): Likewise.
(vst4q_u8): Likewise.
(vst4q_u16): Likewise.
(vst4q_u32): Likewise.
(vst4q_u64): Likewise.
(vst4q_f16): Likewise.
(vst4q_f32): Likewise.
(vst4q_f64): Likewise.
(vst4q_p64): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/vector_structure_intrinsics.c: Add new
tests.