author | Richard Biener <rguenther@suse.de> | 2021-01-04 09:53:11 +0100 |
---|---|---|
committer | Richard Biener <rguenther@suse.de> | 2021-01-04 10:47:43 +0100 |
commit | 8837f82e4bab1b5405cf034eab9b3e83afc563ad | |
tree | aaf2a349e092221769a6a891d3d398f3f1dd73a5 | |
parent | ad64e807ffca93e927b68f1aa0cea54dacbe9afd | |

tree-optimization/98291 - allow SLP more vectorization of reductions

When the VF is one a SLP reduction is in-order and thus we can
vectorize even when the reduction op is not associative.

2021-01-04 Richard Biener <rguenther@suse.de>

PR tree-optimization/98291
* tree-vect-loop.c (vectorizable_reduction): Bypass
associativity check for SLP reductions with VF 1.

* gcc.dg/vect/slp-reduc-11.c: New testcase.
* gcc.dg/vect/vect-reduc-in-order-4.c: Adjust.

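To illustrate why the vectorization factor matters here, the sketch below contrasts three ways of evaluating the two-accumulator dot product from the new testcase. It is only a model written for this note: the function names `dot_scalar`, `dot_slp_vf1` and `dot_reassoc_vf2` are hypothetical and do not appear in GCC, and the lane arrays merely stand in for vector registers. With VF 1 each lane performs the same additions in the same order as its scalar accumulator, whereas a VF of 2 splits each accumulator into partial sums, which is only valid if the addition may be reassociated.

```c
#include <assert.h>

/* Scalar reference with the same shape as the dotprod testcase:
   two independent accumulators, each updated in source order.  */
double
dot_scalar (const double *a, const double *b, unsigned long long n)
{
  double d1 = 0.0, d2 = 0.0;
  for (unsigned long long i = 0; i < n; i += 2)
    {
      d1 += a[i] * b[i];
      d2 += a[i + 1] * b[i + 1];
    }
  return d1 + d2;
}

/* Hypothetical model of SLP with VF 1: d1 and d2 become the two lanes
   of one accumulator.  One vector iteration covers exactly one scalar
   iteration, so each lane sees its additions in scalar order.  */
double
dot_slp_vf1 (const double *a, const double *b, unsigned long long n)
{
  double acc[2] = { 0.0, 0.0 };
  for (unsigned long long i = 0; i < n; i += 2)
    for (int lane = 0; lane < 2; lane++)
      acc[lane] += a[i + lane] * b[i + lane];
  return acc[0] + acc[1];
}

/* Hypothetical model of a reassociating vectorization with VF 2: each
   scalar accumulator is split into two partial sums (lanes 0/2 for d1,
   lanes 1/3 for d2) that are only combined after the loop.  */
double
dot_reassoc_vf2 (const double *a, const double *b, unsigned long long n)
{
  assert (n % 4 == 0);  /* simplification: no epilogue loop in this model */
  double p[4] = { 0.0, 0.0, 0.0, 0.0 };
  for (unsigned long long i = 0; i < n; i += 4)
    for (int lane = 0; lane < 4; lane++)
      p[lane] += a[i + lane] * b[i + lane];
  return (p[0] + p[2]) + (p[1] + p[3]);
}
```

As a usage example, take `a = { 1e16, 0, 1, 0, -1e16, 0, 1, 0 }` and `b` set to all ones with `n = 8`. Assuming IEEE 754 double arithmetic without excess precision and without `-ffast-math`, `dot_scalar` and `dot_slp_vf1` both return 1.0, because the first `1.0` is absorbed when added to `1e16`. `dot_reassoc_vf2` returns 2.0, because the two large terms cancel in one partial sum before either `1.0` is added.
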
-rw-r--r-- | gcc/testsuite/gcc.dg/vect/slp-reduc-11.c | 20 |
-rw-r--r-- | gcc/testsuite/gcc.dg/vect/vect-reduc-in-order-4.c | 2 |
-rw-r--r-- | gcc/tree-vect-loop.c | 10 |
3 files changed, 28 insertions, 4 deletions

diff --git a/gcc/testsuite/gcc.dg/vect/slp-reduc-11.c b/gcc/testsuite/gcc.dg/vect/slp-reduc-11.c
new file mode 100644
index 0000000..a2f86fb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/slp-reduc-11.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_double } */
+
+double dotprod(const double *a, const double *b, unsigned long long n)
+{
+  double d1 = 0.0;
+  double d2 = 0.0;
+
+  for (unsigned long long i = 0; i < n; i += 2) {
+    d1 += a[i] * b[i];
+    d2 += a[i + 1] * b[i + 1];
+  }
+
+  return (d1 + d2);
+}
+
+/* We should use a SLP reduction even without -ffast-math by using a
+   VF of one.  */
+/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" } } */
+/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-in-order-4.c b/gcc/testsuite/gcc.dg/vect/vect-reduc-in-order-4.c
index 7706a2d..eff3994 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-reduc-in-order-4.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-in-order-4.c
@@ -41,6 +41,4 @@ main ()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump {in-order unchained SLP reductions not supported} "vect" } } */
-/* { dg-final { scan-tree-dump-not {vectorizing stmts using SLP} "vect" } } */
 /* { dg-final { scan-tree-dump-times "VECT_PERM_EXPR" 0 "vect" } } */
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index f3b95ae..2985bfe 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6868,8 +6868,14 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
      cases, so we need to check that this is ok.  One exception is when
      vectorizing an outer-loop: the inner-loop is executed sequentially,
      and therefore vectorizing reductions in the inner-loop during
-     outer-loop vectorization is safe.  */
-  if (needs_fold_left_reduction_p (scalar_type, orig_code))
+     outer-loop vectorization is safe.  Likewise when we are vectorizing
+     a series of reductions using SLP and the VF is one the reductions
+     are performed in scalar order.  */
+  if (slp_node
+      && !REDUC_GROUP_FIRST_ELEMENT (stmt_info)
+      && known_eq (LOOP_VINFO_VECT_FACTOR (loop_vinfo), 1u))
+    ;
+  else if (needs_fold_left_reduction_p (scalar_type, orig_code))
     {
       /* When vectorizing a reduction chain w/o SLP the reduction PHI
          is not directy used in stmt.  */
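
The control flow added in the last hunk can be sketched in isolation as follows. This is a simplified model, not GCC code: `struct reduc_info` and `takes_fold_left_path` are hypothetical names, each boolean field stands in for the GCC expression named in its comment, and the vectorization factor is a plain `unsigned` here, whereas the patch compares it with `known_eq`. The sketch only shows which reductions skip the in-order (fold-left) handling after this change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified inputs; these names are not GCC's.  */
struct reduc_info
{
  bool slp;              /* stands in for slp_node being non-null */
  bool reduc_chain;      /* stands in for REDUC_GROUP_FIRST_ELEMENT (stmt_info) */
  unsigned vf;           /* stands in for LOOP_VINFO_VECT_FACTOR (loop_vinfo) */
  bool needs_fold_left;  /* stands in for needs_fold_left_reduction_p (...) */
};

/* Return true when the in-order handling guarded by the "else if" in
   the patch would be entered, false when it is skipped.  */
bool
takes_fold_left_path (const struct reduc_info *r)
{
  assert (r->vf >= 1);  /* a vectorization factor below one is meaningless */

  /* New in this patch: an SLP reduction that is not a reduction chain
     and has VF 1 already runs in scalar order, so the associativity
     check is bypassed.  */
  if (r->slp && !r->reduc_chain && r->vf == 1)
    return false;

  /* Otherwise the decision is unchanged from before the patch.  */
  return r->needs_fold_left;
}
```

Under this model, the `dotprod` testcase corresponds to `{ .slp = true, .reduc_chain = false, .vf = 1, .needs_fold_left = true }`, for which the function returns false. The same reduction with a VF of 2 still takes the in-order path.
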