author | Jakub Jelinek <jakub@redhat.com> | 2021-07-28 10:52:51 +0200 |
---|---|---|
committer | Jakub Jelinek <jakub@redhat.com> | 2021-07-28 10:52:51 +0200 |
commit | 88d0f70a326eeb42b479aa537f8a81bf5a199346 (patch) | |
tree | 4e8f8d05dd96cc2aac695286be821f8ea642e025 /gcc/tree-ssa-threadbackward.c | |
parent | 8af0c50a29346f97a370f76bd881ccb4252b1e4d (diff) | |
i386: Improve AVX2 expansion of vector >> vector DImode arithm. shifts [PR101611]
AVX2 introduced vector >> vector shifts, but unfortunately for V{2,4}DImode
it supports only logical, not arithmetic, shifts; only AVX512F for
V8DImode and AVX512VL for V{2,4}DImode fixed that omission.
Earlier in the GCC 12 cycle I committed vector >> scalar arithmetic shift
emulation using various sequences; this patch handles the vector >> vector
case. There is no need to adjust costs, as the previous cost adjustment
already covers the vector by vector shifts.
The patch emits the arithmetic right V{2,4}DImode shifts using 2 logical right
V{2,4}DImode shifts (one of the original operand, one of the sign mask
constant, both by the vector shift count), an xor and a subtraction. On each
element, (long long) x >> y is done as
  (((unsigned long long) x >> y) ^ (0x8000000000000000ULL >> y))
  - (0x8000000000000000ULL >> y)
i.e. if the MSB of x is clear in some element, the xor sets the shifted sign
bit and the subtraction clears it again, so the result is just the logical
shift; if the MSB is set, the xor clears that bit and the subtraction then
borrows through it, setting it and all higher bits.
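
The per-element identity can be checked on scalars. The following is a minimal C sketch, not the code the patch emits: the function names ashr64_emulated, ashr64_reference and count_mismatches are hypothetical, the shift count is assumed to satisfy 0 <= y < 64, and the conversion from uint64_t back to int64_t is assumed to wrap modulo 2^64 (as it does with GCC).

```c
#include <assert.h>
#include <stdint.h>

/* Emulate (int64_t) x >> y with two logical shifts, an xor and a
   subtraction.  Assumption: 0 <= y < 64.  */
static int64_t ashr64_emulated(int64_t x, unsigned y)
{
    assert(y < 64);
    uint64_t sign = UINT64_C(0x8000000000000000) >> y; /* shifted sign mask */
    uint64_t logical = (uint64_t) x >> y;              /* logical right shift */
    /* Unsigned arithmetic wraps, so the borrow fills the upper bits.  */
    return (int64_t) ((logical ^ sign) - sign);
}

/* Reference arithmetic shift that does not rely on the
   implementation-defined behaviour of >> on negative values.  */
static int64_t ashr64_reference(int64_t x, unsigned y)
{
    uint64_t ux = (uint64_t) x;
    uint64_t r = x < 0 ? ~(~ux >> y) : ux >> y;
    return (int64_t) r;
}

/* Compare both versions over a few sample values and every valid count;
   returns the number of mismatches found.  */
static int count_mismatches(void)
{
    static const int64_t samples[] = {
        0, 1, -1, 5, -8, 0x123456789abcdefLL, INT64_MAX, INT64_MIN
    };
    int bad = 0;
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        for (unsigned y = 0; y < 64; y++)
            if (ashr64_emulated(samples[i], y)
                != ashr64_reference(samples[i], y))
                bad++;
    return bad;
}
```

In the vector expansion the same three steps apply to every element at once, with the sign mask held in a constant vector and shifted by the same per-element counts.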
2021-07-28 Jakub Jelinek <jakub@redhat.com>
PR target/101611
* config/i386/sse.md (vashr<mode>3): Split into vashrv8di3 expander
and vashrv4di3 expander, where the latter requires just TARGET_AVX2
and has special !TARGET_AVX512VL expansion.
(vashrv2di3<mask_name>): Rename to ...
(vashrv2di3): ... this. Change condition to TARGET_XOP || TARGET_AVX2
and add special !TARGET_XOP && !TARGET_AVX512VL expansion.
* gcc.target/i386/avx2-pr101611-1.c: New test.
* gcc.target/i386/avx2-pr101611-2.c: New test.