author Richard Biener <rguenther@suse.de> 2023-05-23 15:03:00 +0200
committer Richard Biener <rguenther@suse.de> 2023-05-23 18:59:05 +0200
commit b6b8870ec585947a03a797f9037d02380316e235 (patch)
tree ab5d73d249927fc27e52c2348980253eef0053e7 /gcc
parent 58b41bb4bafffec5e2e3425c8cee82b304bd0a5f (diff)
tree-optimization/109747 - SLP cost of CTORs

The x86 backend looks at the SLP node passed to the add_stmt_cost hook when costing vec_construct, looking for elements that require a move from a GPR to a vector register, and costs that. But since vect_prologue_cost_for_slp decomposes the cost for an external SLP node into individual pieces, this cost gets applied N times without a chance for the backend to know it is dealing with just a part of the SLP node. Looking at just a part is also not perfect, since the GPR to XMM move cost applies only once per distinct element, so handling the whole SLP node once reflects the cost more correctly (albeit without considering other external SLP nodes).

The following addresses the issue by passing down the SLP node only for one piece and nullptr for the rest. The x86 backend is currently the only one looking at it.

In the future the cost of external elements is something to deal with globally, but that would require the full SLP tree to be available to costing.

It's difficult to write a testcase; at the tipping point not vectorizing is better, so I'll follow up with x86-specific adjustments and will see about adding a testcase later.

	PR tree-optimization/109747
	* tree-vect-slp.cc (vect_prologue_cost_for_slp): Pass down the
	SLP node only once to the cost hook.

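To make the double counting concrete, here is a minimal toy model of the behaviour described above. It is only a sketch: ToyNode, Kind, toy_add_stmt_cost and both costing functions are hypothetical stand-ins invented for illustration, not GCC's real types or the x86 hook, and the unit costs are made up.

```cpp
#include <cassert>
#include <vector>

// All names below are hypothetical stand-ins, not GCC's real types.
enum class Kind { vector_load, scalar_to_vec, vec_construct };

struct ToyNode {
  int gpr_elements;  // distinct elements that need a GPR -> vector move
};

// Toy target hook: one unit per recorded statement, plus one unit per
// GPR-sourced element, charged only when the hook is handed the node.
int toy_add_stmt_cost(Kind kind, const ToyNode *node) {
  int cost = 1;
  if (node && kind != Kind::vector_load)
    cost += node->gpr_elements;
  return cost;
}

// Behaviour before the change: every piece passes the node, so the
// whole-node move cost is charged once per piece.
int cost_node_every_piece(const std::vector<Kind> &pieces,
                          const ToyNode *node) {
  assert(node != nullptr);
  int total = 0;
  for (Kind kind : pieces)
    total += toy_add_stmt_cost(kind, node);
  return total;
}

// Behaviour after the change: only the first non-load piece passes the
// node; every other piece passes nullptr.
int cost_node_once(const std::vector<Kind> &pieces, const ToyNode *node) {
  assert(node != nullptr);
  int total = 0;
  bool passed = false;
  for (Kind kind : pieces) {
    bool pass_node = kind != Kind::vector_load && !passed;
    total += toy_add_stmt_cost(kind, pass_node ? node : nullptr);
    if (kind != Kind::vector_load)
      passed = true;
  }
  return total;
}
```

In this toy model, a node split into four vec_construct pieces with three GPR-sourced elements costs 16 units when the node is passed for every piece and 7 units when it is passed once.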
Diffstat (limited to 'gcc')
-rw-r--r-- gcc/tree-vect-slp.cc 11
1 files changed, 10 insertions, 1 deletions
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index e5c9d7e..a6f277c 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -6069,6 +6069,7 @@ vect_prologue_cost_for_slp (slp_tree node,
     }
   /* ??? We're just tracking whether vectors in a single node are the same.
      Ideally we'd do something more global. */
+  bool passed = false;
   for (unsigned int start : starts)
     {
       vect_cost_for_stmt kind;
@@ -6078,7 +6079,15 @@ vect_prologue_cost_for_slp (slp_tree node,
         kind = scalar_to_vec;
       else
         kind = vec_construct;
-      record_stmt_cost (cost_vec, 1, kind, node, vectype, 0, vect_prologue);
+      /* The target cost hook has no idea which part of the SLP node
+         we are costing so avoid passing it down more than once. Pass
+         it to the first vec_construct or scalar_to_vec part since for those
+         the x86 backend tries to account for GPR to XMM register moves. */
+      record_stmt_cost (cost_vec, 1, kind,
+                        (kind != vector_load && !passed) ? node : nullptr,
+                        vectype, 0, vect_prologue);
+      if (kind != vector_load)
+        passed = true;
     }
 }