author Jakub Jelinek <jakub@redhat.com> 2020-11-26 08:44:15 +0100
committer Jakub Jelinek <jakub@redhat.com> 2020-11-26 08:46:14 +0100
commit 32b0abb24b8702ec9954448739682ace6fa5ccf5
tree 4e36a72b8c81c870020720f0a26d9f166873b462
parent 768ce4f0ceb030e38427e85e483ed44330cd5da7
i386: Optimize psubusw compared to 0 into pminuw compared to op0 [PR96906]

The following patch renames VI12_AVX2 iterator to VI12_AVX2_AVX512BW
for consistency with some other iterators, as I need VI12_AVX2 without
AVX512BW for this change.
The real meat is a combiner split which combine can use to optimize
psubusw compared to 0 into pminuw compared to op0 (and similarly for
psubusb compared to 0 into pminub compared to op0).
According to Agner Fog's tables, psubus[bw] and pminu[bw] timings
are the same, but the advantage of pminu[bw] is that the comparison
doesn't need a zero operand, so e.g. for -msse4.1 it causes changes like
-	psubusw	%xmm1, %xmm0
-	pxor	%xmm1, %xmm1
+	pminuw	%xmm0, %xmm1
 	pcmpeqw	%xmm1, %xmm0
and similarly for avx2:
-	vpsubusb	%ymm1, %ymm0, %ymm0
-	vpxor	%xmm1, %xmm1, %xmm1
-	vpcmpeqb	%ymm1, %ymm0, %ymm0
+	vpminub	%ymm1, %ymm0, %ymm1
+	vpcmpeqb	%ymm0, %ymm1, %ymm0

I haven't done the AVX512{BW,VL} define_split(s), they'll need
to match the UNSPEC_PCMP which are used for avx512 comparisons.

2020-11-26  Jakub Jelinek  <jakub@redhat.com>

	PR target/96906
	* config/i386/sse.md (VI12_AVX2): Remove V64QI/V32HI modes.
	(VI12_AVX2_AVX512BW): New mode iterator.
	(<sse2_avx2>_<plusminus_insn><mode>3<mask_name>,
	uavg<mode>3_ceil, <sse2_avx2>_uavg<mode>3<mask_name>): Use
	VI12_AVX2_AVX512BW iterator instead of VI12_AVX2.
	(*<sse2_avx2>_<plusminus_insn><mode>3<mask_name>): Likewise.
	(*<sse2_avx2>_uavg<mode>3<mask_name>): Likewise.
	(*<sse2_avx2>_<plusminus_insn><mode>3<mask_name>): Add a new
	define_split after this insn.

	* gcc.target/i386/pr96906-1.c: New test.
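
Why the rewrite described in the commit message is safe can be checked
with a scalar model. The sketch below is not part of the patch or of
pr96906-1.c. It models a single byte lane of psubusb and pminub in plain
C and exhaustively confirms that "saturating subtract equals 0" and
"unsigned minimum equals op0" agree for every operand pair; both are just
the unsigned test op0 <= op1. All function names are hypothetical and
chosen for illustration only, and the same argument carries over to the
16-bit lanes of psubusw/pminuw.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one psubusb lane: unsigned subtract saturating at 0. */
static uint8_t subus_u8(uint8_t a, uint8_t b)
{
  return a > b ? (uint8_t) (a - b) : 0;
}

/* Scalar model of one pminub lane: unsigned minimum. */
static uint8_t minu_u8(uint8_t a, uint8_t b)
{
  return a < b ? a : b;
}

/* Form before the patch: psubusb result compared against zero. */
static int cmp_before(uint8_t a, uint8_t b)
{
  return subus_u8(a, b) == 0;
}

/* Form after the patch: pminub result compared against op0. */
static int cmp_after(uint8_t a, uint8_t b)
{
  return minu_u8(a, b) == a;
}

/* Count the operand pairs on which the two forms disagree. */
static unsigned count_mismatches_u8(void)
{
  unsigned bad = 0;
  for (unsigned a = 0; a <= 0xFF; a++)
    for (unsigned b = 0; b <= 0xFF; b++)
      if (cmp_before((uint8_t) a, (uint8_t) b)
          != cmp_after((uint8_t) a, (uint8_t) b))
        bad++;
  return bad;
}

/* Illustrative self-check: aborts if any pair disagrees. */
static void check_rewrite_u8(void)
{
  assert(count_mismatches_u8() == 0);
}
```

Calling check_rewrite_u8() from a test driver walks all 65536 byte pairs.
The model only covers the per-lane comparison result; it says nothing
about instruction timings or register allocation.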
Diffstat (limited to 'gcc/ada')
0 files changed, 0 insertions, 0 deletions