author | Kewen Lin <linkw@gcc.gnu.org> | 2019-12-13 06:00:53 +0000 |
---|---|---|
committer | Kewen Lin <linkw@gcc.gnu.org> | 2019-12-13 06:00:53 +0000 |
commit | 396c2a9842f0f565b86efcb41bb3386da5dca889 (patch) | |
tree | 381a323f8fd9bdec0fcc7222ae69101d35ef8958 /gcc | |
parent | a1af2dd9c3a5f6b4885db739f6fc3d80ed80007a (diff) | |
[rs6000] Adjust vectorization cost for scalar COND_EXPR
We found that the vectorization cost modeling for scalar COND_EXPR is
somewhat off on rs6000. One typical case is 548.exchange2_r, where
-Ofast -mcpu=power9 -mrecip -fvect-cost-model=unlimited is 1.94% better than
-Ofast -mcpu=power9 -mrecip (the default is -fvect-cost-model=dynamic).
Scalar COND_EXPR is normally expanded into compare + branch or compare + isel,
and either sequence should be priced higher than a simple FXU operation. This
patch adds extra vectorization cost to scalar COND_EXPR on top of
builtin_vectorization_cost.

The reasons for using an additional cost of 2 rather than another value:
  1) Of the candidate values from 1 to 5 that were tried, 2 measured best on
     Power9.
  2) In terms of latency, compare takes 3 cycles and isel takes 2 on Power9,
     which is 2.5 times that of a simple FXU instruction, which has cost 1 in
     the current modeling, so 2 is close.
  3) It gives fine SPEC2017 ratios on Power8 as well.

gcc/ChangeLog
* config/rs6000/rs6000.c (adjust_vectorization_cost): New function.
(rs6000_add_stmt_cost): Call adjust_vectorization_cost and update
stmt_cost.
From-SVN: r279336
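
For readers unfamiliar with the term, the following is a minimal sketch of C source in which a scalar COND_EXPR can arise. The function name `select_sum` and its parameters are hypothetical and are not taken from 548.exchange2_r or from the patch. The assumption is that the ternary in the loop body may be represented as a COND_EXPR assignment once the loop is if-converted, which is the statement kind whose scalar cost the commit message discusses.

```c
#include <stddef.h>

/* Illustrative only: select between two inputs per element and sum
   the result.  The ternary below is the kind of statement that GCC may
   represent as a COND_EXPR assignment; on rs6000 the scalar form is
   expected to expand to compare + branch or compare + isel.  */
long
select_sum (const int *a, const int *b, const int *c, size_t n)
{
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    {
      int picked = a[i] > 0 ? b[i] : c[i];
      sum += picked;
    }
  return sum;
}
```

Whether such a loop is vectorized depends on comparing the summed scalar statement costs against the vector costs, so underpricing the scalar select can tip that comparison against vectorization.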
Diffstat (limited to 'gcc')
-rw-r--r-- | gcc/ChangeLog | 6 |
-rw-r--r-- | gcc/config/rs6000/rs6000.c | 24 |
2 files changed, 30 insertions, 0 deletions
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d9bf5f7..981b3ff 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2019-12-13  Kewen Lin  <linkw@gcc.gnu.org>
+
+        * config/rs6000/rs6000.c (adjust_vectorization_cost): New function.
+        (rs6000_add_stmt_cost): Call adjust_vectorization_cost and update
+        stmt_cost.
+
 2019-12-12  Jakub Jelinek  <jakub@redhat.com>
 
         PR target/92904
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 23b6d99..6f0c7fa 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4997,6 +4997,29 @@ rs6000_init_cost (struct loop *loop_info)
   return data;
 }
 
+/* Adjust vectorization cost after calling rs6000_builtin_vectorization_cost.
+   For some statement, we would like to further fine-grain tweak the cost on
+   top of rs6000_builtin_vectorization_cost handling which doesn't have any
+   information on statement operation codes etc.  One typical case here is
+   COND_EXPR, it takes the same cost to simple FXU instruction when evaluating
+   for scalar cost, but it should be priced more whatever transformed to either
+   compare + branch or compare + isel instructions.  */
+
+static unsigned
+adjust_vectorization_cost (enum vect_cost_for_stmt kind,
+                           struct _stmt_vec_info *stmt_info)
+{
+  if (kind == scalar_stmt && stmt_info && stmt_info->stmt
+      && gimple_code (stmt_info->stmt) == GIMPLE_ASSIGN)
+    {
+      tree_code subcode = gimple_assign_rhs_code (stmt_info->stmt);
+      if (subcode == COND_EXPR)
+        return 2;
+    }
+
+  return 0;
+}
+
 /* Implement targetm.vectorize.add_stmt_cost.  */
 
 static unsigned
@@ -5012,6 +5035,7 @@ rs6000_add_stmt_cost (void *data, int count, enum vect_cost_for_stmt kind,
       tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
       int stmt_cost = rs6000_builtin_vectorization_cost (kind, vectype,
                                                          misalign);
+      stmt_cost += adjust_vectorization_cost (kind, stmt_info);
       /* Statements in an inner loop relative to the loop being
          vectorized are weighted more heavily.  The value here is
          arbitrary and could potentially be improved with analysis.  */
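
The effect of the patch on a single statement's cost can be modeled outside GCC with a small standalone sketch. Everything prefixed `toy_` below is a hypothetical stand-in and not a GCC type or function. The base cost of 1 for a scalar statement is taken from the commit message's remark that a simple FXU instruction has cost 1 in the current modeling, and the base cost for other kinds is simplified to 1 purely for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins for GCC's vect_cost_for_stmt and tree_code.  */
enum toy_cost_kind { TOY_SCALAR_STMT, TOY_VECTOR_STMT };
enum toy_rhs_code { TOY_PLUS_EXPR, TOY_COND_EXPR };

/* Hypothetical stand-in for the statement info the hook receives.  */
struct toy_stmt
{
  enum toy_rhs_code rhs_code;
};

/* Simplified base cost: every kind costs 1 in this model (assumption).  */
static unsigned
toy_base_cost (enum toy_cost_kind kind)
{
  (void) kind;
  return 1;
}

/* Mirrors the shape of adjust_vectorization_cost: only a scalar
   statement whose right-hand side is a COND_EXPR gets 2 extra.  */
static unsigned
toy_adjust_cost (enum toy_cost_kind kind, const struct toy_stmt *stmt)
{
  if (kind == TOY_SCALAR_STMT && stmt != NULL
      && stmt->rhs_code == TOY_COND_EXPR)
    return 2;
  return 0;
}

/* Base cost plus adjustment, as in the patched rs6000_add_stmt_cost
   before any inner-loop weighting is applied.  */
static unsigned
toy_stmt_cost (enum toy_cost_kind kind, const struct toy_stmt *stmt)
{
  return toy_base_cost (kind) + toy_adjust_cost (kind, stmt);
}
```

In this model a scalar COND_EXPR costs 3 while other scalar statements stay at 1, and the vector side and statements without info are unaffected, which matches the guards in the patch.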