author | liuhongt <hongtao.liu@intel.com> | 2023-09-04 13:16:11 +0800 |
---|---|---|
committer | liuhongt <hongtao.liu@intel.com> | 2023-09-05 11:11:14 +0800 |
commit | 33066c903a614f948a2657c7aa3090067f5984a5 (patch) | |
tree | a0b82086028251021e2cb063694947d5e0d289b1 /gcc/ada | |
parent | 6f94ef6c86074a8348ec21d8aade04ce67b4e292 (diff) | |
Generate vmovsh instead of vpblendw for specific vec_merge.
On SPR, vmovsh can be executed on 3 ports, while vpblendw can only be
executed on 2 ports.
On znver4, vpblendw can be executed on 4 ports; if vmovsh behaves like
vmovss, it can also be executed on 4 ports.
So there is no difference for znver4, but vmovsh is the better choice on
SPR.
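The canonicalization can be pictured with a scalar model of vec_merge on
eight 16-bit lanes. This is only a sketch of the lane semantics, not the
code in expand_vec_perm_blend: the helper names vec_merge16 and
vec_merge16_swapped are hypothetical, and it assumes the canonical form
wanted here is the one whose mask selects only lane 0, which is the shape
a scalar move such as vmovsh produces.

```c
#include <stdint.h>

#define LANES 8 /* eight 16-bit lanes in a 128-bit vector */

/* Scalar model of (vec_merge a b mask): lane i is taken from `a` when
   bit i of `mask` is set, otherwise from `b`.  Hypothetical helper.  */
static void
vec_merge16 (const uint16_t a[LANES], const uint16_t b[LANES],
             unsigned mask, uint16_t out[LANES])
{
  for (int i = 0; i < LANES; i++)
    out[i] = ((mask >> i) & 1) ? a[i] : b[i];
}

/* Same merge with the operands swapped and the mask inverted.  The
   result is identical lane for lane, so a merge with mask 0xfe can be
   rewritten as a merge with mask 0x01, i.e. "replace only the low
   lane".  Hypothetical helper.  */
static void
vec_merge16_swapped (const uint16_t a[LANES], const uint16_t b[LANES],
                     unsigned mask, uint16_t out[LANES])
{
  vec_merge16 (b, a, ~mask & 0xffu, out);
}
```

With mask 0xfe, both helpers return lane 0 from `b` and lanes 1 to 7 from
`a`; the swapped form reaches that result through mask 0x01.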
gcc/ChangeLog:
* config/i386/sse.md (V8BFH_128): Renamed to ..
(VHFBF_128): .. this.
(V16BFH_256): Renamed to ..
(VHFBF_256): .. this.
(avx512f_mov<mode>): Extend to V_128.
(vcvtnee<bf16_ph>2ps_<mode>): Changed to VHFBF_128.
(vcvtneo<bf16_ph>2ps_<mode>): Ditto.
(vcvtnee<bf16_ph>2ps_<mode>): Changed to VHFBF_256.
(vcvtneo<bf16_ph>2ps_<mode>): Ditto.
* config/i386/i386-expand.cc (expand_vec_perm_blend):
Canonicalize vec_merge.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-vmovsh-1a.c: Remove xfail.
Diffstat (limited to 'gcc/ada')
0 files changed, 0 insertions, 0 deletions