path: root/gcc/tree-ssa-threadbackward.c
author Jakub Jelinek <jakub@redhat.com> 2021-07-28 10:52:51 +0200
committer Jakub Jelinek <jakub@redhat.com> 2021-07-28 10:52:51 +0200
commit 88d0f70a326eeb42b479aa537f8a81bf5a199346
tree 4e8f8d05dd96cc2aac695286be821f8ea642e025 /gcc/tree-ssa-threadbackward.c
parent 8af0c50a29346f97a370f76bd881ccb4252b1e4d
i386: Improve AVX2 expansion of vector >> vector DImode arithm. shifts [PR101611]
AVX2 introduced vector >> vector shifts, but unfortunately for V{2,4}DImode
it only supports logical and not arithmetic shifts, only AVX512F for
V8DImode or AVX512VL for V{2,4}DImode fixed that omission.
Earlier in GCC12 cycle I've committed vector >> scalar arithmetic shift
emulation using various sequences, this patch handles the vector >> vector
case.  No need to adjust costs, the previous cost adjustment actually
covers even the vector by vector shifts.
The patch emits the right arithmetic V{2,4}DImode shifts using 2 logical right
V{2,4}DImode shifts (once of the original operands, once of sign mask constant
by the vector shift count), xor and subtraction, on each element
(long long) x >> y is done as
(((unsigned long long) x >> y) ^ (0x8000000000000000ULL >> y)) - (0x8000000000000000ULL >> y)
i.e. if x doesn't have in some element the MSB set, it is just the logical
shift, if it does, then the xor and subtraction cause also all higher bits
to be set.

2021-07-28  Jakub Jelinek  <jakub@redhat.com>

	PR target/101611
	* config/i386/sse.md (vashr<mode>3): Split into vashrv8di3 expander
	and vashrv4di3 expander, where the latter requires just TARGET_AVX2
	and has special !TARGET_AVX512VL expansion.
	(vashrv2di3<mask_name>): Rename to ...
	(vashrv2di3): ... this.  Change condition to TARGET_XOP || TARGET_AVX2
	and add special !TARGET_XOP && !TARGET_AVX512VL expansion.

	* gcc.target/i386/avx2-pr101611-1.c: New test.
	* gcc.target/i386/avx2-pr101611-2.c: New test.
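
The per-element identity in the commit message can be checked in plain scalar C. The following is only an illustrative sketch, not the code the patch emits: the function name ashr_emulated is hypothetical, the shift count is assumed to be in the range 0..63, and the final conversion back to a signed type relies on the modulo behaviour that GCC documents for out-of-range unsigned-to-signed conversions.

```c
#include <stdint.h>

/* Hypothetical helper: emulate a 64-bit arithmetic right shift using
   only two logical right shifts, one xor and one subtraction, as the
   commit message describes for each vector element.
   Assumes 0 <= y <= 63.  */
static int64_t
ashr_emulated (int64_t x, unsigned y)
{
  /* Sign mask constant shifted logically by the same count.  */
  uint64_t sign = UINT64_C (0x8000000000000000) >> y;
  /* Logical right shift of the original operand.  */
  uint64_t lsr = (uint64_t) x >> y;
  /* If the MSB of x was clear, the xor sets the shifted sign bit and the
     subtraction clears it again, leaving the logical shift result.  If it
     was set, the xor clears that bit and the subtraction borrows through
     all higher bits, setting them.  */
  return (int64_t) ((lsr ^ sign) - sign);
}
```

For example, ashr_emulated (-8, 1) gives -4 and ashr_emulated (8, 1) gives 4, matching what an arithmetic shift produces for those inputs.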
Diffstat (limited to 'gcc/tree-ssa-threadbackward.c')
0 files changed, 0 insertions, 0 deletions