aboutsummaryrefslogtreecommitdiff
path: root/gcc/tree-vect-generic.c
diff options
context:
space:
mode:
authorJakub Jelinek <jakub@redhat.com>2020-11-30 10:55:43 +0100
committerJakub Jelinek <jakub@redhat.com>2020-11-30 10:55:43 +0100
commit4d87bd39bafae86747944b2f8c53fdbc43b8dac3 (patch)
tree26e6f9fd7b023462353752375cc2bf1f23c00f33 /gcc/tree-vect-generic.c
parentfbbce1c6e98dec378955b1a591d2dff31caa01f5 (diff)
downloadgcc-4d87bd39bafae86747944b2f8c53fdbc43b8dac3.zip
gcc-4d87bd39bafae86747944b2f8c53fdbc43b8dac3.tar.gz
gcc-4d87bd39bafae86747944b2f8c53fdbc43b8dac3.tar.bz2
expansion: Improve double-word modulo by certain constant divisors [PR97459]
As discussed in the PR, e.g. on x86_64 (both -m32 and -m64) there is no double-word modulo and so we expand it to a __{,u}mod[dt]i3 call. For certain constant divisors we can do better. E.g. consider 32-bit word-size, 0x100000000ULL % 3 == 1, so we can use partly the Hacker's delight modulo by summing digits approach and optimize unsigned long long foo (unsigned long long x) { return x % 3; } as unsigned long long foo (unsigned long long x) { unsigned int sum, carry; carry = __builtin_add_overflow ((unsigned int) x, (unsigned int) (x >> 32), &sum); sum += carry; return sum % 3; } Similarly, 0x10000000ULL % 5 == 1 (note, 1 << 28), so unsigned long long bar (unsigned long long x) { return x % 5; } as unsigned long long bar (unsigned long long x) { unsigned int sum = x & ((1 << 28) - 1); sum += (x >> 28) & ((1 << 28) - 1); sum += (x >> 56); return sum % 5; } etc. And we can do also signed modulo, long long baz (long long x) { return x % 5; } as long long baz (long long x) { unsigned int sum = x & ((1 << 28) - 1); sum += ((unsigned long long) x >> 28) & ((1 << 28) - 1); sum += ((unsigned long long) x >> 56); /* Sum adjustment for negative x. */ sum += (x >> 63) & 3; unsigned int rem = sum % 5; /* And finally adjust it to the right interval for negative values. */ return (int) (rem + ((x >> 63) & -4)); } 2020-11-30 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/97459 * internal-fn.h (expand_addsub_overflow): Declare. * internal-fn.c (expand_addsub_overflow): No longer static. * optabs.c (expand_doubleword_mod): New function. (expand_binop): Optimize double-word mod with constant divisor. * gcc.dg/pr97459-1.c: New test. * gcc.dg/pr97459-2.c: New test.
Diffstat (limited to 'gcc/tree-vect-generic.c')
0 files changed, 0 insertions, 0 deletions