author | Kewen Lin <linkw@linux.ibm.com> | 2023-10-22 21:18:40 -0500 |
---|---|---|
committer | Kewen Lin <linkw@linux.ibm.com> | 2023-10-22 21:18:40 -0500 |
commit | 1908775f7982bd2de36df5d94396eca0865bad9a (patch) | |
tree | ecd62fdb18d1d99bb7b0537db4d9f9a89b0ca271 /libcpp | |
parent | 1df490edd48042b07aa780b088148a9118cbcb46 (diff) | |
vect: Cost adjacent vector loads/stores together [PR111784]
As suggested in the comments [1][2], this patch changes how some
adjacent vector loads/stores are costed: instead of costing them
one by one, it costs them together, once, with the total number.
This helps fix the regression exposed in PR111784 on aarch64,
since aarch64-specific costing can make different decisions
depending on which way is used (counting with the total number
vs. counting one by one). Based on a reduced test case from
PR111784, considering only vec_num is already enough to fix the
regression, but the vector loads/stores associated with ncopies
are also adjacent accesses, so they are considered as well.
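To illustrate why the two costing ways can disagree, here is a minimal
sketch with a toy target cost function that is not linear in the number
of accesses. The function toy_store_cost and its "pair adjacent stores"
rule are hypothetical and purely illustrative; they are not the aarch64
cost model and not code from this patch.

```cpp
// Hypothetical target cost hook (an assumption for illustration only):
// the target is imagined to pair adjacent vector stores, so COUNT
// adjacent stores are charged as ceil(COUNT / 2) operations.
constexpr unsigned toy_store_cost(unsigned count) {
    return (count + 1) / 2;
}

// Old way: the hook is called once per store, each time with count 1,
// so it never sees that the stores are adjacent.
constexpr unsigned cost_one_by_one(unsigned n) {
    unsigned total = 0;
    for (unsigned i = 0; i < n; ++i)
        total += toy_store_cost(1);
    return total;
}

// New way: the hook is called once with the total number of stores.
constexpr unsigned cost_together(unsigned n) {
    return n == 0 ? 0 : toy_store_cost(n);
}
```

With this toy rule, four adjacent stores cost 4 when costed one by one
but 2 when costed together, which is the kind of difference that can
flip a vectorization decision.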
Note that this patch leaves the costing of dr_explicit_realign
and dr_explicit_realign_optimized alone to keep it simple.
Changing the costing way could change the results for them,
since one of their costs depends on
targetm.vectorize.builtin_mask_for_load and is counted once per
call. IIUC, these two dr_alignment_support kinds are mainly used
for old Power (which only has 16-byte aligned vector load/store
and no unaligned vector load/store)?
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630742.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630744.html
PR tree-optimization/111784
gcc/ChangeLog:
* tree-vect-stmts.cc (vectorizable_store): Adjust costing way for
adjacent vector stores, by costing them with the total number
rather than costing them one by one.
(vectorizable_load): Adjust costing way for adjacent vector
loads, by costing them with the total number rather than costing
them one by one.
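The shape of the change described in the ChangeLog can be sketched as
follows. This is a standalone illustration, not the code in
tree-vect-stmts.cc: CostEntry, record_one_by_one, record_together and
n_adjacent are hypothetical names, and only ncopies and vec_num are taken
from the commit message.

```cpp
#include <vector>

// Hypothetical stand-in for one recorded cost: how many accesses
// the target cost model is asked to cost in a single query.
struct CostEntry {
    unsigned count;
};

// Before: every access in the ncopies x vec_num nest is recorded
// separately, each with count 1.
inline void record_one_by_one(unsigned ncopies, unsigned vec_num,
                              std::vector<CostEntry> &cost_vec) {
    for (unsigned j = 0; j < ncopies; ++j)
        for (unsigned i = 0; i < vec_num; ++i)
            cost_vec.push_back(CostEntry{1});
}

// After: the loop nest only counts the adjacent accesses, and a single
// entry carrying the total number is recorded once at the end.
inline void record_together(unsigned ncopies, unsigned vec_num,
                            std::vector<CostEntry> &cost_vec) {
    unsigned n_adjacent = 0;
    for (unsigned j = 0; j < ncopies; ++j)
        for (unsigned i = 0; i < vec_num; ++i)
            ++n_adjacent;
    if (n_adjacent > 0)
        cost_vec.push_back(CostEntry{n_adjacent});
}
```

For ncopies = 2 and vec_num = 3, the first function records six entries
of count 1, while the second records a single entry of count 6.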