author | Tamar Christina <tamar.christina@arm.com> | 2022-02-02 10:39:03 +0000 |
---|---|---|
committer | Tamar Christina <tamar.christina@arm.com> | 2022-02-02 10:39:03 +0000 |
commit | 55d83cdf23b5f284b4e0bd0a6d1af3d947b2e7c3 (patch) | |
tree | 2062fe07adf965b3db5a4cc12b689c070b0a4e99 /gcc/tree-vectorizer.h | |
parent | 756eabacfcd767e39eea63257a026f61a4c4e661 (diff) | |
vect: Simplify and extend the complex numbers validation routines.
This patch strengthens the analysis for complex mul, fma and fms to ensure
that it doesn't produce incorrect output.
Essentially it adds an extra verification step that checks that the two nodes
it is going to combine do the same operations on compatible values. This is
needed because, if one computation differs from the other, the current
implementation has no way to deal with it since we have to remove the
permute.
When we can keep the permute around we can probably handle these by unrolling.
While implementing this, since I have to do the traversal anyway, I took
advantage of it to simplify the code a bit. Previously we would determine
whether something is a conjugate, then try to figure out which conjugate it
is, and then try to see if the permutes match what we expect.
Now the code that does the traversal detects this in one go and returns
whether the operation is something that can be combined and whether a
conjugate is present.
Secondly, because it does this, I can now simplify the checking code itself to
essentially just try to apply fixed patterns to each operation.
The patterns represent the order operations should appear in. For instance, a
complex MUL operation combines:
Left 1 + Right 1
Left 2 + Right 2
with a permute on the nodes consisting of:
{ Even, Even } + { Odd, Odd }
{ Even, Odd } + { Odd, Even }
By abstracting over these patterns the checking code becomes quite simple.
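As a rough illustration of where these lane pairings come from, the sketch below multiplies complex numbers stored interleaved as { re, im, re, im, ... }, so that even lanes hold real parts and odd lanes hold imaginary parts. The function name `complex_mul_interleaved` and the mapping of each pairing to the real or imaginary lane are assumptions made for this example; this is not code from the vectorizer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, for illustration only: multiply two arrays of
// complex numbers stored interleaved as { re0, im0, re1, im1, ... }.
std::vector<double>
complex_mul_interleaved (const std::vector<double> &a,
                         const std::vector<double> &b)
{
  assert (a.size () == b.size () && a.size () % 2 == 0);
  std::vector<double> out (a.size ());
  for (std::size_t i = 0; i < a.size (); i += 2)
    {
      // Real lane: combines the { Even, Even } and { Odd, Odd } products.
      out[i] = a[i] * b[i] - a[i + 1] * b[i + 1];
      // Imaginary lane: combines the { Even, Odd } and { Odd, Even } products.
      out[i + 1] = a[i] * b[i + 1] + a[i + 1] * b[i];
    }
  return out;
}
```

For example, (1 + 2i) * (3 + 4i) stored as { 1, 2 } and { 3, 4 } yields { -5, 10 }.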
As part of this I checked the order of the operands, which was left in
"slp" order, i.e. the same order they showed up in during SLP, which means
that the accumulator is first. However, it looks like I didn't document this,
and the x86 optab was implemented assuming the same order as FMA, i.e. that
the accumulator is last.
I have thus changed the order to match that of FMA and FMS, which corrects the
x86 codegen, and will update the Arm targets. This has now also been
documented.
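To make the "accumulator last" convention concrete, here is a minimal sketch that mirrors the operand order of the standard `std::fma (x, y, z)`, which computes `x * y + z`. The helper `complex_fma_acc_last` and the self-check function are hypothetical names for this example only; they are not the optab and do not show its actual operand numbering.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Hypothetical helper: complex multiply-add with the accumulator as the
// last operand, following the same order as scalar std::fma.
std::complex<double>
complex_fma_acc_last (std::complex<double> a, std::complex<double> b,
                      std::complex<double> acc)
{
  return a * b + acc;
}

// Small self-check of the operand order.
void
check_operand_order ()
{
  // Scalar FMA: 2 * 3 + 4, accumulator last.
  assert (std::fma (2.0, 3.0, 4.0) == 10.0);
  // Complex analogue: (1 + 2i) * (3 + 4i) + (10 + 20i) == 5 + 30i.
  std::complex<double> r
    = complex_fma_acc_last ({1.0, 2.0}, {3.0, 4.0}, {10.0, 20.0});
  assert (r.real () == 5.0 && r.imag () == 30.0);
}
```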
gcc/ChangeLog:
PR tree-optimization/102819
PR tree-optimization/103169
* doc/md.texi: Update docs for cfms, cfma.
* tree-data-ref.h (same_data_refs): Accept optional offset.
* tree-vect-slp-patterns.cc (is_linear_load_p): Fix issue with repeating
patterns.
(vect_normalize_conj_loc): Remove.
(is_eq_or_top): Change to take two nodes.
(enum _conj_status, compatible_complex_nodes_p,
vect_validate_multiplication): New.
(class complex_add_pattern, complex_add_pattern::matches,
complex_add_pattern::recognize, class complex_mul_pattern,
complex_mul_pattern::recognize, class complex_fms_pattern,
complex_fms_pattern::recognize, class complex_operations_pattern,
complex_operations_pattern::recognize, addsub_pattern::recognize): Pass
new cache.
(complex_fms_pattern::matches, complex_mul_pattern::matches): Pass new
cache and use new validation code.
* tree-vect-slp.cc (vect_match_slp_patterns_2, vect_match_slp_patterns,
vect_analyze_slp): Pass along cache.
(compatible_calls_p): Expose.
* tree-vectorizer.h (compatible_calls_p, slp_node_hash,
slp_compat_nodes_map_t): New.
(class vect_pattern): Update signatures to include new cache.
gcc/testsuite/ChangeLog:
PR tree-optimization/102819
PR tree-optimization/103169
* g++.dg/vect/pr99149.cc: xfail for now.
* gcc.dg/vect/complex/pr102819-1.c: New test.
* gcc.dg/vect/complex/pr102819-2.c: New test.
* gcc.dg/vect/complex/pr102819-3.c: New test.
* gcc.dg/vect/complex/pr102819-4.c: New test.
* gcc.dg/vect/complex/pr102819-5.c: New test.
* gcc.dg/vect/complex/pr102819-6.c: New test.
* gcc.dg/vect/complex/pr102819-7.c: New test.
* gcc.dg/vect/complex/pr102819-8.c: New test.
* gcc.dg/vect/complex/pr102819-9.c: New test.
* gcc.dg/vect/complex/pr103169.c: New test.
Diffstat (limited to 'gcc/tree-vectorizer.h')
-rw-r--r-- | gcc/tree-vectorizer.h | 11 |
1 files changed, 10 insertions, 1 deletions
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 524c86c..ec479d3 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -2301,6 +2301,7 @@ extern void duplicate_and_interleave (vec_info *, gimple_seq *, tree,
 extern int vect_get_place_in_interleaving_chain (stmt_vec_info, stmt_vec_info);
 extern slp_tree vect_create_new_slp_node (unsigned, tree_code);
 extern void vect_free_slp_tree (slp_tree);
+extern bool compatible_calls_p (gcall *, gcall *);
 
 /* In tree-vect-patterns.cc. */
 extern void
@@ -2339,6 +2340,12 @@ typedef enum _complex_perm_kinds {
 typedef hash_map <slp_tree, complex_perm_kinds_t>
   slp_tree_to_load_perm_map_t;
 
+/* Cache from nodes pair to being compatible or not. */
+typedef pair_hash <nofree_ptr_hash <_slp_tree>,
+                   nofree_ptr_hash <_slp_tree>> slp_node_hash;
+typedef hash_map <slp_node_hash, bool> slp_compat_nodes_map_t;
+
+
 /* Vector pattern matcher base class. All SLP pattern matchers must inherit
    from this type. */
 
@@ -2371,7 +2378,8 @@ class vect_pattern
   public:
 
     /* Create a new instance of the pattern matcher class of the given type. */
-    static vect_pattern* recognize (slp_tree_to_load_perm_map_t *, slp_tree *);
+    static vect_pattern* recognize (slp_tree_to_load_perm_map_t *,
+                                    slp_compat_nodes_map_t *, slp_tree *);
 
     /* Build the pattern from the data collected so far. */
     virtual void build (vec_info *) = 0;
@@ -2385,6 +2393,7 @@ class vect_pattern
 
 /* Function pointer to create a new pattern matcher from a generic type. */
 typedef vect_pattern* (*vect_pattern_decl_t) (slp_tree_to_load_perm_map_t *,
+                                              slp_compat_nodes_map_t *,
                                               slp_tree *);
 
 /* List of supported pattern matchers. */
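
The new `slp_compat_nodes_map_t` typedef maps a pair of node pointers to a bool. The standalone sketch below shows the same idea with the standard library instead of GCC's `hash_map` and `pair_hash`: a cache keyed on a pointer pair that remembers whether two nodes were found compatible. The `node` type, `node_pair_hash`, `compat_cache_t`, `compatible_p` and the "same op" compatibility rule are all assumptions for illustration and are not taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for an SLP node.
struct node { int op; };

typedef std::pair<const node *, const node *> node_pair;

// Hash for a pair of node pointers (simple hash-combine).
struct node_pair_hash
{
  std::size_t operator() (const node_pair &p) const
  {
    std::size_t h1 = std::hash<const node *> () (p.first);
    std::size_t h2 = std::hash<const node *> () (p.second);
    return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
  }
};

typedef std::unordered_map<node_pair, bool, node_pair_hash> compat_cache_t;

// Return whether A and B are compatible (here simply: same op),
// consulting CACHE first and recording the answer for later queries.
bool
compatible_p (compat_cache_t &cache, const node *a, const node *b)
{
  assert (a && b);
  node_pair key = std::make_pair (a, b);
  compat_cache_t::const_iterator it = cache.find (key);
  if (it != cache.end ())
    return it->second;
  bool result = a->op == b->op;
  cache.emplace (key, result);
  return result;
}
```

A repeated query for the same pair is answered from the cache without adding a new entry, which is the property that makes threading one cache through every `recognize` call worthwhile.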