author | Jakub Jelinek <jakub@redhat.com> | 2020-12-30 11:21:24 +0100 |
---|---|---|
committer | Jakub Jelinek <jakub@redhat.com> | 2020-12-30 11:21:24 +0100 |
commit | 8f7941ca37001773a36add8119791725aeb823ba (patch) | |
tree | e2ca3ce142651d180577771f50190584cd2daf1f /gcc/config | |
parent | 86b3edf1ff26590077b5e968fca0b32dfdc2bf33 (diff) | |
download | gcc-8f7941ca37001773a36add8119791725aeb823ba.zip gcc-8f7941ca37001773a36add8119791725aeb823ba.tar.gz gcc-8f7941ca37001773a36add8119791725aeb823ba.tar.bz2 |
i386: Optimize pmovmskb on inverted vector to inversion of pmovmskb result [PR98461]
The following patch adds combine splitters to optimize:
- vpcmpeqd %ymm1, %ymm1, %ymm1
- vpandn %ymm1, %ymm0, %ymm0
vpmovmskb %ymm0, %eax
+ notl %eax
etc. (for vectors with fewer than 32 elements, xorl with a mask of the element bits is used instead of notl).
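The identity behind this rewrite can be checked with a small portable sketch. It does not use the real instructions: `movemask_bytes` is a hypothetical stand-in that emulates pmovmskb on a plain byte array, and the helper names are assumptions made for illustration, not anything from the patch or its tests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LANES 32

/* Emulation of pmovmskb (not the real instruction): bit i of the
   result is the most significant bit of byte i.  n is the number of
   byte lanes, e.g. 16 for an SSE2 vector or 32 for an AVX2 vector.  */
static uint32_t movemask_bytes(const uint8_t *v, size_t n)
{
    assert(n <= MAX_LANES);
    uint32_t mask = 0;
    for (size_t i = 0; i < n; i++)
        mask |= (uint32_t)(v[i] >> 7) << i;
    return mask;
}

/* Old sequence: invert every byte of the vector, then take the mask.  */
static uint32_t movemask_of_not(const uint8_t *v, size_t n)
{
    assert(n <= MAX_LANES);
    uint8_t inv[MAX_LANES];
    for (size_t i = 0; i < n; i++)
        inv[i] = (uint8_t)~v[i];
    return movemask_bytes(inv, n);
}

/* New sequence: take the mask first, then flip only the n low bits.
   The shift is done in 64 bits so that n == 32 is well defined.  */
static uint32_t not_of_movemask(const uint8_t *v, size_t n)
{
    uint32_t lanes = (uint32_t)((UINT64_C(1) << n) - 1);
    return movemask_bytes(v, n) ^ lanes;
}
```

For any byte array `v`, `movemask_of_not(v, n)` and `not_of_movemask(v, n)` should agree for both n = 16 and n = 32, which is the property the splitters rely on.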
2020-12-30 Jakub Jelinek <jakub@redhat.com>
PR target/98461
* config/i386/sse.md (<sse2_avx2>_pmovmskb): Add splitters
for pmovmskb of NOT vector.
* gcc.target/i386/sse2-pr98461.c: New test.
* gcc.target/i386/avx2-pr98461.c: New test.
Diffstat (limited to 'gcc/config')
-rw-r--r-- | gcc/config/i386/sse.md | 47 |
1 files changed, 47 insertions, 0 deletions
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 141a99d..d841038 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -16099,6 +16099,53 @@
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "SI")])
 
+(define_split
+  [(set (match_operand:SI 0 "register_operand")
+        (unspec:SI
+          [(not:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand"))]
+          UNSPEC_MOVMSK))]
+  "TARGET_SSE2"
+  [(set (match_dup 2)
+        (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))
+   (set (match_dup 0) (match_dup 3))]
+{
+  operands[2] = gen_reg_rtx (SImode);
+  if (GET_MODE_NUNITS (<MODE>mode) == 32)
+    operands[3] = gen_rtx_NOT (SImode, operands[2]);
+  else
+    {
+      operands[3]
+        = gen_int_mode ((HOST_WIDE_INT_1 << GET_MODE_NUNITS (<MODE>mode)) - 1,
+                        SImode);
+      operands[3] = gen_rtx_XOR (SImode, operands[2], operands[3]);
+    }
+})
+
+(define_split
+  [(set (match_operand:SI 0 "register_operand")
+        (unspec:SI
+          [(subreg:VI1_AVX2 (not (match_operand 1 "register_operand")) 0)]
+          UNSPEC_MOVMSK))]
+  "TARGET_SSE2
+   && GET_MODE_CLASS (GET_MODE (operands[1])) == MODE_VECTOR_INT
+   && GET_MODE_SIZE (GET_MODE (operands[1])) == <MODE_SIZE>"
+  [(set (match_dup 2)
+        (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))
+   (set (match_dup 0) (match_dup 3))]
+{
+  operands[2] = gen_reg_rtx (SImode);
+  operands[1] = gen_lowpart (<MODE>mode, operands[1]);
+  if (GET_MODE_NUNITS (<MODE>mode) == 32)
+    operands[3] = gen_rtx_NOT (SImode, operands[2]);
+  else
+    {
+      operands[3]
+        = gen_int_mode ((HOST_WIDE_INT_1 << GET_MODE_NUNITS (<MODE>mode)) - 1,
+                        SImode);
+      operands[3] = gen_rtx_XOR (SImode, operands[2], operands[3]);
+    }
+})
+
 (define_insn_and_split "*<sse2_avx2>_pmovmskb_lt"
   [(set (match_operand:SI 0 "register_operand" "=r")
         (unspec:SI
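Both splitters pick the fixup the same way: a plain NOT when the mode has 32 byte elements, otherwise an XOR with a constant covering only the element bits. The sketch below mirrors that choice in ordinary C to show why NOT alone would be wrong in the 16-element case. The `fixup` struct, the enum, and the function names are hypothetical and exist only for this illustration; the real code builds RTL with `gen_rtx_NOT` and `gen_rtx_XOR`.

```c
#include <stdint.h>

/* Hypothetical model of the operation the splitter emits after the
   plain pmovmskb.  */
enum fixup_op { FIXUP_NOT, FIXUP_XOR };

struct fixup {
    enum fixup_op op;
    uint32_t imm;   /* XOR immediate; unused for FIXUP_NOT */
};

/* Mirrors the splitter's preparation code: nunits is the number of
   byte elements in the vector mode (16 or 32 here).  */
static struct fixup choose_fixup(unsigned nunits)
{
    struct fixup f;
    if (nunits == 32) {
        /* All 32 result bits are element bits, so NOT is exact.  */
        f.op = FIXUP_NOT;
        f.imm = 0;
    } else {
        /* Only the low nunits bits are element bits; the upper bits
           of the pmovmskb result are zero and must stay zero.  */
        f.op = FIXUP_XOR;
        f.imm = (uint32_t)((UINT64_C(1) << nunits) - 1);
    }
    return f;
}

/* Applies the chosen fixup to a mask value.  */
static uint32_t apply_fixup(struct fixup f, uint32_t mask)
{
    return f.op == FIXUP_NOT ? ~mask : (mask ^ f.imm);
}
```

With a 16-element mask of 0x00ff, the XOR fixup yields 0xff00, whereas a bare NOT would yield 0xffffff00 and set bits that the mask of the inverted vector never contains.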