author Richard Biener <rguenther@suse.de> 2023-09-29 12:54:17 +0200
committer Richard Biener <rguenth@gcc.gnu.org> 2024-09-06 11:19:48 +0200
commit d34cda720988674bcf8a24267c9e1ec61335d6de
tree edafe2b69dd5ab7cdc3bb3c344812bd38808cc3f /gcc/cp/parser.cc
parent f9c5c12d24cc3a9da5c1d38e69a8aa5f58224c4a
Handle non-grouped stores as single-lane SLP
The following enables single-lane loop SLP discovery for non-grouped stores and adjusts vectorizable_store to properly handle those.

For gfortran.dg/vect/vect-8.f90 we vectorize one additional loop, not running into the "not falling back to strided accesses" bail-out. I have not investigated in detail.

There is a set of i386 target assembler test FAILs. gcc.target/i386/pr88531-2[bc].c in particular fail because the target cannot identify SLP emulated gathers, see another mail from me. Others need adjustment; I've adjusted only one with this patch. In particular there are gcc.target/i386/cond_op_fma_*-1.c FAILs that occur because we no longer fold a VEC_COND_EXPR during the region value-numbering we do after vectorization: we now code-generate a { 0.0, ... } constant in the VEC_COND_EXPR instead of having a separate statement which gets forwarded and then triggers folding. This leads to slightly different code generation. The solution is probably to use gimple_build when building stmts or, in this case, directly emit .COND_FMA instead of .FMA and a VEC_COND_EXPR.

gcc.dg/vect/slp-19a.c mixes contiguous 8-lane SLP with a single-lane contiguous store from one lane of the 8-lane load. We expect to use load-lanes for this reason, but the heuristic for forcing single-lane rediscovery as implemented doesn't trigger here because it treats both SLP instances separately. FAILs on RISC-V.

gcc.dg/vect/slp-19c.c shows we fail to implement an interleaving scheme for group_size 12 (by extension, using the group_size 3 scheme to reduce to 4 lanes and then continuing with a pow2 scheme would work); we are also not considering load-lanes because of the above reason, but aarch64 cannot do ld12. FAILs on AARCH64 (load requires three vectors) and x86_64.

gcc.dg/vect/slp-19c.c FAILs with variable-length vectors because of "SLP induction not supported for variable-length vectors".

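As a reading aid for the cond_op_fma_*-1.c note above, here is a minimal scalar sketch of the two code shapes being compared. It is not GCC code and not taken from the tests: the function names (`fma_then_select`, `cond_fma_model`, `cond_fma_loop`) are hypothetical, the mask is modelled as an `int` array, and `a * b + c` stands in for the fused operation. It assumes the documented meaning of the conditional FMA pattern, where the result is the multiply-add when the mask is set and the else-value otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Shape 1: unconditional multiply-add followed by a separate select with
   a zero else-value (models .FMA plus a VEC_COND_EXPR).  */
void fma_then_select(float *restrict r, const int *restrict mask,
                     const float *restrict a, const float *restrict b,
                     const float *restrict c, size_t n)
{
    assert(r != NULL && mask != NULL);
    for (size_t i = 0; i < n; i++) {
        float t = a[i] * b[i] + c[i];   /* stand-in for the fused op */
        r[i] = mask[i] ? t : 0.0f;      /* separate select step */
    }
}

/* Hypothetical scalar model of a conditional FMA:
   result = m ? a * b + c : else_val.  */
static float cond_fma_model(int m, float a, float b, float c, float else_val)
{
    return m ? a * b + c : else_val;
}

/* Shape 2: a single conditional operation per element
   (models emitting .COND_FMA directly).  */
void cond_fma_loop(float *restrict r, const int *restrict mask,
                   const float *restrict a, const float *restrict b,
                   const float *restrict c, size_t n)
{
    assert(r != NULL && mask != NULL);
    for (size_t i = 0; i < n; i++)
        r[i] = cond_fma_model(mask[i], a[i], b[i], c[i], 0.0f);
}
```

Both shapes compute the same values; the commit message is about which of them the vectorizer and later folding end up producing, not about a difference in results.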
gcc.target/aarch64/pr110449.c will FAIL because the (contested) optimization in r14-2367-g224fd59b2dc8a5 was only applied to loop-vect but not SLP vect. I'll leave it to target maintainers to either XFAIL (the optimization is bad) or remove the test.

* tree-vect-slp.cc (vect_analyze_slp): Perform single-lane loop SLP discovery for non-grouped stores. Move check on the root for re-doing SLP analysis with a single lane for load/store-lanes earlier and make sure we are dealing with a grouped access.
* tree-vect-stmts.cc (vectorizable_store): Always set vec_num for SLP.

* gcc.dg/vect/O3-pr39675-2.c: Adjust expected number of SLP.
* gcc.dg/vect/fast-math-vect-call-1.c: Likewise.
* gcc.dg/vect/no-scevccp-slp-31.c: Likewise.
* gcc.dg/vect/slp-12b.c: Likewise.
* gcc.dg/vect/slp-12c.c: Likewise.
* gcc.dg/vect/slp-19a.c: Likewise.
* gcc.dg/vect/slp-19b.c: Likewise.
* gcc.dg/vect/slp-4-big-array.c: Likewise.
* gcc.dg/vect/slp-4.c: Likewise.
* gcc.dg/vect/slp-5.c: Likewise.
* gcc.dg/vect/slp-7.c: Likewise.
* gcc.dg/vect/slp-perm-7.c: Likewise.
* gcc.dg/vect/slp-37.c: Likewise.
* gcc.dg/vect/fast-math-vect-call-2.c: Likewise.
* gcc.dg/vect/slp-26.c: RISC-V can now SLP two instances.
* gcc.dg/vect/vect-outer-slp-3.c: Disable vectorization of initialization loop.
* gcc.dg/vect/slp-reduc-5.c: Likewise.
* gcc.dg/vect/no-scevccp-outer-12.c: Un-XFAIL. SLP can handle inner loop inductions with multiple vector stmt copies.
* gfortran.dg/vect/vect-8.f90: Adjust expected number of vectorized loops.
* gcc.target/i386/vectorize1.c: Adjust what we scan for.
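For readers unfamiliar with the terminology, the following is a small illustrative sketch of the two kinds of store the log distinguishes. It is not taken from the patch or the testsuite; the function names `scale_single` and `scale_pair` are hypothetical, and whether a given GCC version classifies these loops exactly this way is an assumption. The first loop has one store per iteration, so it does not belong to an interleaving group (a non-grouped store); the second writes two adjacent elements per iteration, which together would form a store group of size 2.

```c
#include <assert.h>
#include <stddef.h>

/* Non-grouped store: each iteration writes exactly one element, so the
   store is not part of an interleaving group.  With this change such a
   store would be a candidate for single-lane SLP discovery.  */
void scale_single(int *restrict dst, const int *restrict src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * 3;
}

/* Grouped store: each iteration writes two adjacent elements, which
   together form a store group of size 2 (a two-lane SLP candidate).  */
void scale_pair(int *restrict dst, const int *restrict src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[2 * i]     = src[2 * i] * 3;
        dst[2 * i + 1] = src[2 * i + 1] * 5;
    }
}
```

Compiling such a file with `-O3 -fdump-tree-vect-details` is one way to inspect what the vectorizer decides for each loop; the exact dump wording varies between GCC versions.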
Diffstat (limited to 'gcc/cp/parser.cc')
0 files changed, 0 insertions, 0 deletions