author | liuhongt <hongtao.liu@intel.com> | 2023-06-15 16:46:14 +0800 |
---|---|---|
committer | liuhongt <hongtao.liu@intel.com> | 2023-06-19 09:34:19 +0800 |
commit | f8e02702726d4514b8ff9f5481c9c1f5d34e1787 (patch) | |
tree | 6eb17843a07be508981d7070bb76a12dfc3c1c18 /gcc/tree-vectorizer.h | |
parent | 58e61a3ab1c13b6d5b07d86a30cf48a46e0345c8 (diff) | |
Refined 256/512-bit vpacksswb/vpackssdw patterns.
The packing in vpacksswb/vpackssdw is not a simple concat; it's an
interweave of src1 and src2 for every 128 bits of source (or every
64 bits of the ss_truncate result),
i.e.
dst[192-255] = ss_truncate (src2[128-255])
dst[128-191] = ss_truncate (src1[128-255])
dst[64-127] = ss_truncate (src2[0-127])
dst[0-63] = ss_truncate (src1[0-127])
The patch refines those patterns with an extra vec_select for the
interweave.
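The lane-wise layout described above can be sketched as a scalar model in C. This is only an illustration, not code from the patch: the function names (ss_truncate_dw, pack_ssdw_256, concat_ssdw_256) are hypothetical, the model covers just the 256-bit vpackssdw case, and concat_ssdw_256 shows the "simple concat" reading that the message says is wrong, for contrast.

```c
#include <stdint.h>

/* Signed-saturating truncation of one 32-bit element to 16 bits
   (hypothetical helper modelling ss_truncate on a single element). */
int16_t ss_truncate_dw(int32_t x)
{
    if (x > INT16_MAX)
        return INT16_MAX;
    if (x < INT16_MIN)
        return INT16_MIN;
    return (int16_t)x;
}

/* Scalar model of a 256-bit vpackssdw (assumed names and layout):
   within each 128-bit lane, the low 64 bits of the result come from
   src1 and the high 64 bits come from src2. */
void pack_ssdw_256(int16_t dst[16], const int32_t src1[8],
                   const int32_t src2[8])
{
    for (int lane = 0; lane < 2; lane++) {
        for (int i = 0; i < 4; i++) {
            dst[lane * 8 + i]     = ss_truncate_dw(src1[lane * 4 + i]);
            dst[lane * 8 + 4 + i] = ss_truncate_dw(src2[lane * 4 + i]);
        }
    }
}

/* The "simple concat" reading, for contrast: all of src1 followed by
   all of src2.  This does NOT match the instruction for 256-bit or
   wider vectors. */
void concat_ssdw_256(int16_t dst[16], const int32_t src1[8],
                     const int32_t src2[8])
{
    for (int i = 0; i < 8; i++) {
        dst[i]     = ss_truncate_dw(src1[i]);
        dst[8 + i] = ss_truncate_dw(src2[i]);
    }
}
```

For 128-bit vectors there is a single lane, so the two models agree. They diverge only from 256 bits upward, which is why the refinement targets the 256/512-bit patterns.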
gcc/ChangeLog:
PR target/110235
* config/i386/sse.md (<sse2_avx2>_packsswb<mask_name>):
Substitute with ..
(sse2_packsswb<mask_name>): .. this, ..
(avx2_packsswb<mask_name>): .. this and ..
(avx512bw_packsswb<mask_name>): .. this.
(<sse2_avx2>_packssdw<mask_name>): Substitute with ..
(sse2_packssdw<mask_name>): .. this, ..
(avx2_packssdw<mask_name>): .. this and ..
(avx512bw_packssdw<mask_name>): .. this.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512bw-vpackssdw-3.c: New test.
* gcc.target/i386/avx512bw-vpacksswb-3.c: New test.
Diffstat (limited to 'gcc/tree-vectorizer.h')
0 files changed, 0 insertions, 0 deletions