author | Hongyu Wang <hongyu.wang@intel.com> | 2022-03-19 01:16:29 +0800 |
---|---|---|
committer | Hongyu Wang <hongyu.wang@intel.com> | 2022-03-22 11:48:38 +0800 |
commit | 7bce0be03b857eefe5990c3ef0af06ea8f8ae04e | |
tree | 5d2ba232d0294f28781f9b0690d9d7cf72f33127 /gcc/fold-const.cc | |
parent | d156bb870225f442b32983983f94e731397fdb6e | |
AVX512FP16: Fix wrong code for _mm_mask_f[c]madd.*sch [PR 104978]
For complex scalar intrinsics such as _mm_mask_fcmadd_sch, the
mask should be ANDed with 1 so that only its lowest bit takes effect.
Use a masked vmovss to perform the same operation, since it ignores
the higher bits of the mask.
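
As an illustration of the intended semantics only (not the sse.md change itself), the following plain-C sketch models a masked scalar complex multiply-add in which only bit 0 of the mask is honoured. The names `cplx`, `model_mask_fmadd_sch` and `model_self_check` are hypothetical, `float` stands in for `_Float16` so the sketch builds without AVX512FP16 support, and the pass-through of `a` when bit 0 is clear is an assumption based on the usual write-mask convention for `_mm_mask_*` intrinsics.

```c
#include <assert.h>

/* Hypothetical stand-in for the lowest complex element of a __m128h;
   float is used instead of _Float16 so this builds on any C compiler. */
typedef struct { float re, im; } cplx;

/* Model of a masked scalar complex FMA: a * b + c on the lowest
   complex element.  Only bit 0 of k is consulted; higher bits of the
   mask must not influence the result.  When bit 0 is clear the
   element is passed through from a (assumed write-mask behaviour). */
static cplx model_mask_fmadd_sch(cplx a, unsigned char k, cplx b, cplx c)
{
    cplx r;

    if ((k & 1) == 0)
        return a;

    r.re = a.re * b.re - a.im * b.im + c.re;
    r.im = a.re * b.im + a.im * b.re + c.im;
    return r;
}

/* Small self-check: a mask with only higher bits set behaves like 0,
   and a mask with extra higher bits set behaves like 1. */
static void model_self_check(void)
{
    cplx a = { 1.0f, 2.0f }, b = { 3.0f, 4.0f }, c = { 5.0f, 6.0f };
    cplx r_clear = model_mask_fmadd_sch(a, 0xfe, b, c);
    cplx r_set = model_mask_fmadd_sch(a, 0xff, b, c);

    assert(r_clear.re == a.re && r_clear.im == a.im);
    assert(r_set.re == 0.0f && r_set.im == 16.0f);
}
```

Calling `model_self_check()` exercises both cases: a mask of 0xfe leaves the element untouched, while 0xff yields the same result as a mask of 1.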
gcc/ChangeLog:
PR target/104978
* config/i386/sse.md
(avx512fp16_fmaddcsh_v8hf_mask1<round_expand_name>):
Use avx512f_movsf_mask instead of vmovaps or vblend, and
force_reg before lowpart_subreg.
(avx512fp16_fcmaddcsh_v8hf_mask1<round_expand_name>): Likewise.
gcc/testsuite/ChangeLog:
PR target/104978
* gcc.target/i386/avx512fp16-vfcmaddcsh-1a.c: Adjust asm scan.
* gcc.target/i386/avx512fp16-vfmaddcsh-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vfcmaddcsh-1c.c: Removed.
* gcc.target/i386/avx512fp16-vfmaddcsh-1c.c: Ditto.
* gcc.target/i386/pr104978.c: New test.