author | Hongyu Wang <hongyu.wang@intel.com> | 2024-04-09 16:05:26 +0800 |
---|---|---|
committer | Hongyu Wang <hongyu.wang@intel.com> | 2024-06-06 15:29:47 +0800 |
commit | 23db87301b623ecf162c9df718ce82ed9aa354a8 (patch) | |
tree | 74dd3d284896e08d4597123b0389516b9508e912 | |
parent | c989e59fc99d994159114304d4e715c72bedff0a (diff) | |
[APX CCMP] Adjust strategy for selecting ccmp candidates
For a general ccmp scenario, the tree sequence looks like
_1 = (a < b)
_2 = (c < d)
_3 = _1 & _2
The current ccmp expansion tries both compare orders for _1 and _2,
compares the costs cost1 and cost2 of expanding _1 or _2 first, and
then returns the sequence with the lower cost.
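As a rough illustration, a source function like the one below could plausibly lower to a tree sequence of that shape. The function name and parameters are hypothetical, and whether GCC produces exactly this GIMPLE depends on the target and optimization flags.

```cpp
// Hypothetical example; both_less and its parameters are illustrative
// names that do not appear in the patch.
bool both_less (int a, int b, int c, int d)
{
  bool t1 = a < b;   // corresponds to _1 = (a < b)
  bool t2 = c < d;   // corresponds to _2 = (c < d)
  // A non-short-circuit & on the two results mirrors _3 = _1 & _2.
  return t1 & t2;
}
```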
It is possible that one expansion succeeds and the other fails.
For example, x86 has int ccmp but not fp ccmp, so a combined fp and
int comparison must be ordered such that the fp comparison happens
first. The costs are not meaningful for failed expansions.
Check the expand_ccmp_next results ret and ret2, and return the valid
one before falling back to the cost comparison.
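The selection rule can be sketched in isolation as follows. This is a minimal model, not the GCC code: the Attempt struct and pick_ordering function are hypothetical stand-ins, where ok plays the role of a non-null result from expand_ccmp_next and cost plays the role of the summed seq_cost values.

```cpp
// Hypothetical stand-in for one expansion attempt.
struct Attempt
{
  bool ok;    // true if the expansion succeeded (ret / ret2 non-null)
  int cost;   // only meaningful when ok is true
};

// Returns 2 if the swapped ordering should be used, otherwise 1.
// Mirrors the patched condition: ret2 && (!ret || cost2 < cost1).
int pick_ordering (const Attempt &first, const Attempt &second)
{
  // The swapped ordering wins only if it expanded successfully and
  // either the first ordering failed or the swapped one is cheaper.
  if (second.ok && (!first.ok || second.cost < first.cost))
    return 2;
  // Otherwise keep the first ordering; if both failed, the caller
  // still sees a failed first attempt.
  return 1;
}
```

With the previous cost-only test, a failed swapped attempt whose stale cost happened to be lower could be chosen over a valid first attempt; in this sketch the ok flags are consulted before any cost is compared.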
gcc/ChangeLog:
* ccmp.cc (expand_ccmp_expr_1): Check ret and ret2 of
expand_ccmp_next, and return the valid one first instead of
comparing cost.
-rw-r--r-- | gcc/ccmp.cc | 10 |
1 files changed, 9 insertions, 1 deletions
diff --git a/gcc/ccmp.cc b/gcc/ccmp.cc
index 7cb525a..4d50708 100644
--- a/gcc/ccmp.cc
+++ b/gcc/ccmp.cc
@@ -247,7 +247,15 @@ expand_ccmp_expr_1 (gimple *g, rtx_insn **prep_seq, rtx_insn **gen_seq)
 	      cost2 = seq_cost (prep_seq_2, speed_p);
 	      cost2 += seq_cost (gen_seq_2, speed_p);
 	    }
-	  if (cost2 < cost1)
+
+	  /* It's possible that one expansion succeeds and the other
+	     fails.
+	     For example, x86 has int ccmp but not fp ccmp, and so a
+	     combined fp and int comparison must be ordered such that
+	     the fp comparison happens first.  The costs are not
+	     meaningful for failed expansions.  */
+
+	  if (ret2 && (!ret || cost2 < cost1))
 	    {
 	      *prep_seq = prep_seq_2;
 	      *gen_seq = gen_seq_2;