tree-optimization/114435 - pcom left around copies confusing SLP

The following arranges for the pre-SLP vectorization scalar cleanup to be run when predictive commoning was applied to a loop in the function. This is similar to the complete unroll situation and facilitates SLP vectorization. Avoiding the SSA copies in predictive commoning itself isn't easy (and predcom also sometimes unrolls, asking for scalar cleanup).

PR tree-optimization/114435
* tree-predcom.cc (tree_predictive_commoning): Queue the next scalar cleanup sub-pipeline to be run when we did something.

* gcc.dg/vect/bb-slp-pr114435.c: New testcase.
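
The queuing described above can be pictured with a small toy model. This is only a sketch, not GCC code: `toy_function`, `toy_predcom`, `toy_scalar_cleanup_gate` and `TOY_PENDING_FORCE_SCALAR_CLEANUP` are hypothetical names that merely mimic `cfun->pending_TODOs` and `PENDING_TODO_force_next_scalar_cleanup`, and it assumes the consumer tests the bit and clears it before running the cleanup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit; stands in for PENDING_TODO_force_next_scalar_cleanup.  */
#define TOY_PENDING_FORCE_SCALAR_CLEANUP (1u << 0)

/* Hypothetical per-function state; stands in for the pending_TODOs field.  */
struct toy_function {
  unsigned pending_todos;
};

/* Toy transform pass: returns nonzero when it changed something and,
   in that case only, queues the later cleanup.  */
static unsigned toy_predcom(struct toy_function *fn, int found_chain)
{
  assert(fn != NULL);
  unsigned ret = found_chain ? 1u : 0u;
  if (ret != 0)
    fn->pending_todos |= TOY_PENDING_FORCE_SCALAR_CLEANUP;
  return ret;
}

/* Toy gate of the cleanup sub-pipeline (assumed behavior): report whether
   a cleanup was requested and clear the request so it runs only once.  */
static int toy_scalar_cleanup_gate(struct toy_function *fn)
{
  assert(fn != NULL);
  if (!(fn->pending_todos & TOY_PENDING_FORCE_SCALAR_CLEANUP))
    return 0;
  fn->pending_todos &= ~TOY_PENDING_FORCE_SCALAR_CLEANUP;
  return 1;
}
```

In this model a function where the transform found nothing never pays for the extra cleanup, which is the point of setting the flag only when `ret != 0`.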
author: Richard Biener <rguenther@suse.de> 2024-05-29 10:41:51 +0200
committer: Richard Biener <rguenther@suse.de> 2024-05-29 12:58:08 +0200
commit: 1065a7db6f2a69770a85b4d53b9123b090dd1771 (patch)
tree: 61464923d85c641d70d15d19976694820f62becd /gcc
parent: 9c6e75a6d1cc2858fc945266a5edb700edb44389 (diff)
download: gcc-1065a7db6f2a69770a85b4d53b9123b090dd1771.zip
gcc-1065a7db6f2a69770a85b4d53b9123b090dd1771.tar.gz
gcc-1065a7db6f2a69770a85b4d53b9123b090dd1771.tar.bz2
2 files changed, 40 insertions, 0 deletions
diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr114435.c b/gcc/testsuite/gcc.dg/vect/bb-slp-pr114435.c
new file mode 100644
index 0000000..d1eecf7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr114435.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_double } */
+/* Predictive commoning is supposed to happen.  */
+/* { dg-additional-options "-O3 -fdump-tree-pcom" } */
+
+struct res {
+    double r0;
+    double r1;
+    double r2;
+    double r3;
+};
+
+struct pxl {
+    double v0;
+    double v1;
+    double v2;
+    double v3;
+};
+
+#define IS_NAN(x) ((x) == (x))
+
+void fold(struct res *r, struct pxl *in, double k, int sz)
+{
+  int i;
+
+  for (i = 0; i < sz; i++) {
+      if (IS_NAN(k)) continue;
+      r->r0 += in[i].v0 * k;
+      r->r1 += in[i].v1 * k;
+      r->r2 += in[i].v2 * k;
+      r->r3 += in[i].v3 * k;
+  }
+}
+
+/* { dg-final { scan-tree-dump "# r__r0_lsm\[^\r\n\]* = PHI" "pcom" } } */
+/* { dg-final { scan-tree-dump "optimized: basic block part vectorized" "slp1" } } */
+/* { dg-final { scan-tree-dump "# vect\[^\r\n\]* = PHI" "slp1" } } */
diff --git a/gcc/tree-predcom.cc b/gcc/tree-predcom.cc
index 75a4c85..9844fee 100644
--- a/gcc/tree-predcom.cc
+++ b/gcc/tree-predcom.cc
@@ -3522,6 +3522,9 @@ tree_predictive_commoning (bool allow_unroll_p)
 	}
     }
 
+  if (ret != 0)
+    cfun->pending_TODOs |= PENDING_TODO_force_next_scalar_cleanup;
+
   return ret;
 }