author Hongyu Wang <hongyu.wang@intel.com> 2024-04-09 16:05:26 +0800
committer Hongyu Wang <hongyu.wang@intel.com> 2024-06-06 15:29:47 +0800
commit 23db87301b623ecf162c9df718ce82ed9aa354a8
tree 74dd3d284896e08d4597123b0389516b9508e912
parent c989e59fc99d994159114304d4e715c72bedff0a
[APX CCMP] Adjust strategy for selecting ccmp candidates

For the general ccmp scenario, the tree sequence looks like

  _1 = (a < b)
  _2 = (c < d)
  _3 = _1 & _2

The current ccmp expansion tries both compare orders for _1 and _2,
compares the costs (cost1/cost2) of expanding _1 or _2 first, and
returns the sequence with the lower cost.

It is possible that one expansion succeeds and the other fails.
For example, x86 has int ccmp but not fp ccmp, so a combined fp and
int comparison must be ordered such that the fp comparison happens
first. The costs are not meaningful for failed expansions.

Check the expand_ccmp_next results ret and ret2, and return the valid
one before comparing costs.

gcc/ChangeLog:

	* ccmp.cc (expand_ccmp_expr_1): Check ret and ret2 of
	expand_ccmp_next, and return the valid one first instead of
	comparing cost.

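The selection rule described above can be sketched in isolation. This is only an illustrative model, not the code in ccmp.cc: `Candidate`, `Pick`, and `choose_expansion` are hypothetical names, with `ok` standing in for a non-null ret/ret2 and `cost` for cost1/cost2.

```cpp
// Hypothetical stand-in for one expansion attempt (not a GCC type):
// `ok` models "expand_ccmp_next returned a non-null result",
// `cost` models the summed seq_cost of that attempt's sequences.
struct Candidate {
  bool ok;
  unsigned cost;
};

enum class Pick { None, First, Second };

// Prefer the second ordering only if it expanded successfully and either
// the first ordering failed or the second is strictly cheaper.  The cost
// of a failed attempt is never allowed to decide the outcome.
Pick choose_expansion(const Candidate &first, const Candidate &second) {
  if (second.ok && (!first.ok || second.cost < first.cost))
    return Pick::Second;
  return first.ok ? Pick::First : Pick::None;
}
```

Under this model, a failed second attempt with an arbitrarily low cost never displaces a successful first attempt, which is the situation the patch guards against.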
-rw-r--r-- gcc/ccmp.cc 10
1 file changed, 9 insertions, 1 deletion
diff --git a/gcc/ccmp.cc b/gcc/ccmp.cc
index 7cb525a..4d50708 100644
--- a/gcc/ccmp.cc
+++ b/gcc/ccmp.cc
@@ -247,7 +247,15 @@ expand_ccmp_expr_1 (gimple *g, rtx_insn **prep_seq, rtx_insn **gen_seq)
           cost2 = seq_cost (prep_seq_2, speed_p);
           cost2 += seq_cost (gen_seq_2, speed_p);
         }
-      if (cost2 < cost1)
+
+      /* It's possible that one expansion succeeds and the other
+         fails.
+         For example, x86 has int ccmp but not fp ccmp, and so a
+         combined fp and int comparison must be ordered such that
+         the fp comparison happens first. The costs are not
+         meaningful for failed expansions. */
+
+      if (ret2 && (!ret || cost2 < cost1))
         {
           *prep_seq = prep_seq_2;
           *gen_seq = gen_seq_2;