author | Haochen Jiang <haochen.jiang@intel.com> | 2025-03-05 10:35:11 +0800 |
---|---|---|
committer | Haochen Jiang <haochen.jiang@intel.com> | 2025-03-07 11:33:48 +0800 |
commit | a1eaeac63adc4e20b7e74290fdbe51725d40ddeb (patch) | |
tree | 1d0953f90c58b5d3c4f565b71034dcff03aa6dca /gcc/tree-vectorizer.h | |
parent | c207dcf393b864adc8eb41bbbcd630a6cfdc145a (diff) | |
i386: Correct mask width for bf8->fp16 intrin on 256/512 bit

For the bf8 -> fp16 conversion, when the destination is 256 bits wide,
the mask should be 16 bits, since 16 elements * 16 bits = 256 bits, not
the 8 bits used by the current intrinsic. In the 512-bit intrinsic the
mask is likewise half the required width: it should be 32 bits, not 16.
This patch fixes both.
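
The width rule above can be checked with a small portable sketch. It is a scalar model only, not the code in the patched headers: the helper names `mask_bits_needed` and `model_mask_cvt_16` are hypothetical, and the conversion step assumes bf8 is the E5M2 format, whose bit pattern is the upper byte of the corresponding fp16 value.

```c
#include <assert.h>
#include <stdint.h>

/* One mask bit is needed per destination element.
   Hypothetical helper, not part of any GCC header.  */
static unsigned
mask_bits_needed (unsigned dst_vector_bits, unsigned dst_element_bits)
{
  assert (dst_element_bits != 0);
  return dst_vector_bits / dst_element_bits;
}

/* Scalar model of a merge-masked bf8 -> fp16 conversion over the 16
   lanes of a 256-bit destination.  Lanes whose mask bit is clear keep
   their old value.  Assumes bf8 is E5M2, so widening is a left shift
   by 8 into the fp16 bit pattern.  */
static void
model_mask_cvt_16 (uint16_t dst[16], uint16_t mask, const uint8_t src[16])
{
  for (unsigned i = 0; i < 16; i++)
    if (mask & (1u << i))
      dst[i] = (uint16_t) (src[i] << 8);
}
```

Under these assumptions, `mask_bits_needed (256, 16)` gives 16 and `mask_bits_needed (512, 16)` gives 32. An 8-bit mask passed to the 16-lane model can never select lanes 8 through 15, which is the problem the patch addresses.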

gcc/ChangeLog:

	* config/i386/avx10_2-512convertintrin.h
	(_mm512_mask_cvtbf8_ph): Correct mask width.
	(_mm512_maskz_cvtbf8_ph): Ditto.
	* config/i386/avx10_2convertintrin.h
	(_mm256_mask_cvtbf8_ph): Ditto.
	(_mm256_maskz_cvtbf8_ph): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx10_2-512-convert-1.c: Change function call.
	* gcc.target/i386/avx10_2-convert-1.c: Ditto.