author		Richard Biener <rguenther@suse.de>	2024-11-08 13:06:07 +0100
committer	Richard Biener <rguenth@gcc.gnu.org>	2024-11-12 08:31:14 +0100
commit		0b27a7dd050262a7d64d87863201e4ebbde88386 (patch)
tree		f6461493d33806259292dc5e4dbd7d6f40489efc /gcc
parent		e232dc3bb5c3e8f8a3749239135b7b859a204fc7 (diff)
tree-optimization/117502 - VMAT_STRIDED_SLP vs VMAT_ELEMENTWISE when considering gather
The following treats both the same when considering whether to use
gather or scatter for single-element interleaving accesses.
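As a concrete sketch (my own illustration, not code from the PR or
the testsuite), a loop like the following performs a single-element
interleaving access: only one element out of every stride-4 group
is loaded.

/* Hypothetical example: dst[i] = p[4*i] reads a single element per
   group of four, so the vectorizer classifies the load as
   VMAT_ELEMENTWISE (or VMAT_STRIDED_SLP under SLP); with this change
   both classifications are considered for a gather.  */
void
f (int *restrict dst, int *restrict p, int n)
{
  for (int i = 0; i < n; ++i)
    dst[i] = p[4 * i];
}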
This change will cause

FAIL: gcc.target/aarch64/sve/sve_iters_low_2.c scan-tree-dump-not vect "LOOP VECTORIZED"

where we now vectorize the loop with VNx4QI.  I'll leave it to the ARM
folks to investigate whether that is OK and to adjust the testcase, or
to see where things need adjusting to keep the testcase from being
vectorized again.  The original fix for which the testcase was
introduced is still effective.
PR tree-optimization/117502
* tree-vect-stmts.cc (get_group_load_store_type): Also consider
VMAT_STRIDED_SLP when checking whether to use gather/scatter for
single-element interleaving accesses.
* tree-vect-loop.cc (update_epilogue_loop_vinfo): Accesses with
STMT_VINFO_STRIDED_P can be classified as VMAT_GATHER_SCATTER,
so update DR_REF for those as well.
Diffstat (limited to 'gcc')
-rw-r--r--	gcc/tree-vect-loop.cc	1
-rw-r--r--	gcc/tree-vect-stmts.cc	3
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 6cfce5a..f50ee2e 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -12295,6 +12295,7 @@ update_epilogue_loop_vinfo (class loop *epilogue, tree advance)
 	 refs that get_load_store_type classified as VMAT_GATHER_SCATTER.  */
       auto vstmt_vinfo = vect_stmt_to_vectorize (stmt_vinfo);
       if (STMT_VINFO_MEMORY_ACCESS_TYPE (vstmt_vinfo) == VMAT_GATHER_SCATTER
+	  || STMT_VINFO_STRIDED_P (vstmt_vinfo)
 	  || STMT_VINFO_GATHER_SCATTER_P (vstmt_vinfo))
 	{
 	  /* ??? As we copy epilogues from the main loop incremental
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 666e049..f77a223 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2274,7 +2274,8 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info,
 	 on nearby locations.  Or, even if it's a win over scalar code,
 	 it might not be a win over vectorizing at a lower VF, if that
 	 allows us to use contiguous accesses.  */
-      if (*memory_access_type == VMAT_ELEMENTWISE
+      if ((*memory_access_type == VMAT_ELEMENTWISE
+	   || *memory_access_type == VMAT_STRIDED_SLP)
 	  && single_element_p
 	  && loop_vinfo
 	  && vect_use_strided_gather_scatters_p (stmt_info, loop_vinfo,