vect: Simplify and extend the complex numbers validation routines.

This patch boosts the analysis for complex mul, fma and fms in order to ensure
that it doesn't create an incorrect output.

Essentially it adds an extra verification to check that the two nodes it's going
to combine do the same operations on compatible values. The reason it needs to
do this is that if one computation differs from the other then with the current
implementation we have no way to deal with it, since we have to remove the
permute.

When we can keep the permute around we can probably handle these by unrolling.

While implementing this, since I have to do the traversal anyway, I took
advantage of it by simplifying the code a bit. Previously we would determine
whether something is a conjugate, then try to figure out which conjugate it is,
and then try to see if the permutes match what we expect.

Now the code that does the traversal will detect this in one go and return to us
whether the operation is something that can be combined and whether a conjugate
is present.

Secondly, because it does this, I can now simplify the checking code itself to
essentially just try to apply fixed patterns to each operation.

The patterns represent the order operations should appear in. For instance a
complex MUL operation combines:

  Left 1 + Right 1
  Left 2 + Right 2

with a permute on the nodes consisting of:

  { Even, Even } + { Odd, Odd  }
  { Even, Odd  } + { Odd, Even }

By abstracting over these patterns the checking code becomes quite simple.

As part of this I was checking the order of the operands, which was left in
"slp" order, that is, the same order they showed up in during SLP, which means
that the accumulator is first. However it looks like I didn't document this and
the x86 optab was implemented assuming the same order as FMA, i.e. that the
accumulator is last.

I have thus changed the order to match that of FMA and FMS, which corrects the
x86 codegen, and will update the Arm targets. This has now also been documented.

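As an illustration of the operand order and the Even/Odd pairing described
above, here is a minimal scalar sketch. It is not code from this patch: the
function name cfma_lanes and its conj_op2 flag are hypothetical, and it assumes
complex values are stored interleaved, with the real part in the even lane and
the imaginary part in the odd lane. The accumulator op3 comes last, matching
the documented form op0 = op1 * op2 + op3.

```c
#include <stddef.h>

/* Hypothetical reference routine, not part of GCC.
   Arrays hold n complex values as interleaved lanes:
   index 2*i is the real (even) lane, 2*i + 1 the imaginary (odd) lane.
   Computes op0 = op1 * op2 + op3, or op0 = op1 * conj (op2) + op3
   when conj_op2 is nonzero.  */
void
cfma_lanes (float *op0, const float *op1, const float *op2,
            const float *op3, size_t n, int conj_op2)
{
  for (size_t i = 0; i < n; i++)
    {
      float ar = op1[2 * i];      /* even lane of op1 */
      float ai = op1[2 * i + 1];  /* odd lane of op1 */
      float br = op2[2 * i];      /* even lane of op2 */
      float bi = op2[2 * i + 1];  /* odd lane of op2 */

      /* Conjugation only negates the imaginary lane of op2.  */
      if (conj_op2)
        bi = -bi;

      /* Real result pairs { Even, Even } with { Odd, Odd }.  */
      float re = ar * br - ai * bi;
      /* Imaginary result pairs { Even, Odd } with { Odd, Even }.  */
      float im = ar * bi + ai * br;

      /* The accumulator is the last operand, as for FMA.  */
      op0[2 * i] = re + op3[2 * i];
      op0[2 * i + 1] = im + op3[2 * i + 1];
    }
}
```

For example, with op1 = 1 + 2i, op2 = 3 + 4i and op3 = 5 + 6i, the plain form
gives 0 + 16i and the conjugated form gives 16 + 8i.
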
gcc/ChangeLog:

	PR tree-optimization/102819
	PR tree-optimization/103169
	* doc/md.texi: Update docs for cfms, cfma.
	* tree-data-ref.h (same_data_refs): Accept optional offset.
	* tree-vect-slp-patterns.cc (is_linear_load_p): Fix issue with repeating
	patterns.
	(vect_normalize_conj_loc): Remove.
	(is_eq_or_top): Change to take two nodes.
	(enum _conj_status, compatible_complex_nodes_p,
	vect_validate_multiplication): New.
	(class complex_add_pattern, complex_add_pattern::matches,
	complex_add_pattern::recognize, class complex_mul_pattern,
	complex_mul_pattern::recognize, class complex_fms_pattern,
	complex_fms_pattern::recognize, class complex_operations_pattern,
	complex_operations_pattern::recognize, addsub_pattern::recognize): Pass
	new cache.
	(complex_fms_pattern::matches, complex_mul_pattern::matches): Pass new
	cache and use new validation code.
	* tree-vect-slp.cc (vect_match_slp_patterns_2, vect_match_slp_patterns,
	vect_analyze_slp): Pass along cache.
	(compatible_calls_p): Expose.
	* tree-vectorizer.h (compatible_calls_p, slp_node_hash,
	slp_compat_nodes_map_t): New.
	(class vect_pattern): Update signatures include new cache.

gcc/testsuite/ChangeLog:

	PR tree-optimization/102819
	PR tree-optimization/103169
	* g++.dg/vect/pr99149.cc: xfail for now.
	* gcc.dg/vect/complex/pr102819-1.c: New test.
	* gcc.dg/vect/complex/pr102819-2.c: New test.
	* gcc.dg/vect/complex/pr102819-3.c: New test.
	* gcc.dg/vect/complex/pr102819-4.c: New test.
	* gcc.dg/vect/complex/pr102819-5.c: New test.
	* gcc.dg/vect/complex/pr102819-6.c: New test.
	* gcc.dg/vect/complex/pr102819-7.c: New test.
	* gcc.dg/vect/complex/pr102819-8.c: New test.
	* gcc.dg/vect/complex/pr102819-9.c: New test.
	* gcc.dg/vect/complex/pr103169.c: New test.

author: Tamar Christina <tamar.christina@arm.com> 2022-02-02 10:39:03 +0000
committer: Tamar Christina <tamar.christina@arm.com> 2022-02-02 10:39:03 +0000
commit: 55d83cdf23b5f284b4e0bd0a6d1af3d947b2e7c3 (patch)
tree: 2062fe07adf965b3db5a4cc12b689c070b0a4e99 /gcc/doc
parent: 756eabacfcd767e39eea63257a026f61a4c4e661 (diff)
download: gcc-55d83cdf23b5f284b4e0bd0a6d1af3d947b2e7c3.zip
gcc-55d83cdf23b5f284b4e0bd0a6d1af3d947b2e7c3.tar.gz
gcc-55d83cdf23b5f284b4e0bd0a6d1af3d947b2e7c3.tar.bz2
1 files changed, 28 insertions, 24 deletions
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index feacb12..f3619c5 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -6360,12 +6360,13 @@ Perform a vector multiply and accumulate that is semantically the same as
 a multiply and accumulate of complex numbers.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
+  complex TYPE op3[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] += a[i] * b[i];
+      op0[i] = op1[i] * op2[i] + op3[i];
     @}
 @end smallexample
 
@@ -6383,12 +6384,13 @@ the same as a multiply and accumulate of complex numbers where the second
 multiply arguments is conjugated.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
+  complex TYPE op3[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] += a[i] * conj (b[i]);
+      op0[i] = op1[i] * conj (op2[i]) + op3[i];
     @}
 @end smallexample
 
@@ -6405,12 +6407,13 @@ Perform a vector multiply and subtract that is semantically the same as
 a multiply and subtract of complex numbers.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
+  complex TYPE op3[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] -= a[i] * b[i];
+      op0[i] = op1[i] * op2[i] - op3[i];
     @}
 @end smallexample
 
@@ -6428,12 +6431,13 @@ the same as a multiply and subtract of complex numbers where the second
 multiply arguments is conjugated.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
+  complex TYPE op3[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] -= a[i] * conj (b[i]);
+      op0[i] = op1[i] * conj (op2[i]) - op3[i];
     @}
 @end smallexample
 
@@ -6450,12 +6454,12 @@ Perform a vector multiply that is semantically the same as multiply of
 complex numbers.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] = a[i] * b[i];
+      op0[i] = op1[i] * op2[i];
     @}
 @end smallexample
 
@@ -6472,12 +6476,12 @@ Perform a vector multiply by conjugate that is semantically the same as a
 multiply of complex numbers where the second multiply arguments is conjugated.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
   for (int i = 0; i < N; i += 1)
     @{
-      c[i] = a[i] * conj (b[i]);
+      op0[i] = op1[i] * conj (op2[i]);
     @}
 @end smallexample