author: Tamar Christina <tamar.christina@arm.com> 2024-07-22 10:26:14 +0100
committer: Thomas Koenig <tkoenig@gcc.gnu.org> 2024-07-28 19:05:43 +0200
commit: f60e18d0c8a64f26dd87cbf3c91975f19953379d
tree: 3db99e567991156a67dd324dc78d22f2669077a8 /gcc/doc
parent: 4eb0e778e3b3a4096df3deaca625709cb209c7b4
middle-end: Implement conditional store vectorizer pattern [PR115531]

This adds a conditional store optimization for the vectorizer as a pattern.
The vectorizer already supports modifying memory accesses because of the
pattern based gather/scatter recognition.

Doing it in the vectorizer allows us to still keep the ability to vectorize
such loops for architectures that don't have MASK_STORE support, whereas
doing this in ifcvt makes us commit to MASK_STORE.

Concretely for this loop:

    void foo1 (char *restrict a, int *restrict b, int *restrict c,
               int n, int stride)
    {
      if (stride <= 1)
        return;

      for (int i = 0; i < n; i++)
        {
          int res = c[i];
          int t = b[i+stride];
          if (a[i] != 0)
            res = t;
          c[i] = res;
        }
    }

today we generate:

    .L3:
            ld1b    z29.s, p7/z, [x0, x5]
            ld1w    z31.s, p7/z, [x2, x5, lsl 2]
            ld1w    z30.s, p7/z, [x1, x5, lsl 2]
            cmpne   p15.b, p6/z, z29.b, #0
            sel     z30.s, p15, z30.s, z31.s
            st1w    z30.s, p7, [x2, x5, lsl 2]
            add     x5, x5, x4
            whilelo p7.s, w5, w3
            b.any   .L3

which in gimple is:

    vect_res_18.9_68 = .MASK_LOAD (vectp_c.7_65, 32B, loop_mask_67);
    vect_t_20.12_74 = .MASK_LOAD (vectp.10_72, 32B, loop_mask_67);
    vect__9.15_77 = .MASK_LOAD (vectp_a.13_75, 8B, loop_mask_67);
    mask__34.16_79 = vect__9.15_77 != { 0, ... };
    vect_res_11.17_80 = VEC_COND_EXPR <mask__34.16_79, vect_t_20.12_74, vect_res_18.9_68>;
    .MASK_STORE (vectp_c.18_81, 32B, loop_mask_67, vect_res_11.17_80);

A MASK_STORE is already conditional, so there's no need to perform the load
of the old values and the VEC_COND_EXPR. This patch makes it so we generate:

    vect_res_18.9_68 = .MASK_LOAD (vectp_c.7_65, 32B, loop_mask_67);
    vect__9.15_77 = .MASK_LOAD (vectp_a.13_75, 8B, loop_mask_67);
    mask__34.16_79 = vect__9.15_77 != { 0, ... };
    .MASK_STORE (vectp_c.18_81, 32B, mask__34.16_79, vect_res_18.9_68);

which generates:

    .L3:
            ld1b    z30.s, p7/z, [x0, x5]
            ld1w    z31.s, p7/z, [x1, x5, lsl 2]
            cmpne   p7.b, p7/z, z30.b, #0
            st1w    z31.s, p7, [x2, x5, lsl 2]
            add     x5, x5, x4
            whilelo p7.s, w5, w3
            b.any   .L3

gcc/ChangeLog:

    PR tree-optimization/115531
    * tree-vect-patterns.cc (vect_cond_store_pattern_same_ref): New.
    (vect_recog_cond_store_pattern): New.
    (vect_vect_recog_func_ptrs): Use it.
    * target.def (conditional_operation_is_expensive): New.
    * doc/tm.texi: Regenerate.
    * doc/tm.texi.in: Document it.
    * targhooks.cc (default_conditional_operation_is_expensive): New.
    * targhooks.h (default_conditional_operation_is_expensive): New.

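
The rewrite described in the commit message can be illustrated at the scalar level. The sketch below is not code from the patch: `store_via_select`, `store_via_mask` and `forms_agree` are hypothetical names introduced here. The first function mirrors the load/select/store shape of `foo1`, the second mirrors the shape in which the condition acts as the store mask, and the helper checks that both leave `c` in the same state for one small input.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical name.  Same shape as foo1 in the commit message: load the
   old value of c[i], select between it and b[i + stride], then store
   unconditionally.  */
void
store_via_select (const char *restrict a, const int *restrict b,
                  int *restrict c, int n, int stride)
{
  if (stride <= 1)
    return;
  for (int i = 0; i < n; i++)
    {
      int res = c[i];
      int t = b[i + stride];
      if (a[i] != 0)
        res = t;
      c[i] = res;
    }
}

/* Hypothetical name.  Scalar analogue of the masked store: the condition
   guards the store itself, so the old value of c[i] is never loaded.  */
void
store_via_mask (const char *restrict a, const int *restrict b,
                int *restrict c, int n, int stride)
{
  if (stride <= 1)
    return;
  for (int i = 0; i < n; i++)
    if (a[i] != 0)
      c[i] = b[i + stride];
}

/* Hypothetical helper.  Returns 1 if both forms produce the same contents
   of c for one fixed input, 0 otherwise.  */
int
forms_agree (void)
{
  enum { N = 8, STRIDE = 2 };
  const char a[N] = { 0, 1, 0, 1, 1, 0, 0, 1 };
  int b[N + STRIDE];
  int c1[N], c2[N];

  for (int i = 0; i < N + STRIDE; i++)
    b[i] = 100 + i;
  for (int i = 0; i < N; i++)
    c1[i] = c2[i] = -i;

  store_via_select (a, b, c1, N, STRIDE);
  store_via_mask (a, b, c2, N, STRIDE);
  return memcmp (c1, c2, sizeof c1) == 0;
}
```

A test driver could call `assert (forms_agree ());`. The check covers a single input only, so it illustrates the equivalence rather than proving it.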
Diffstat (limited to 'gcc/doc')
-rw-r--r-- gcc/doc/tm.texi    | 7
-rw-r--r-- gcc/doc/tm.texi.in | 2
2 files changed, 9 insertions, 0 deletions
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index f10d9a5..c7535d0 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6449,6 +6449,13 @@ The default implementation returns a @code{MODE_VECTOR_INT} with the
same size and number of elements as @var{mode}, if such a mode exists.
@end deftypefn
+@deftypefn {Target Hook} bool TARGET_VECTORIZE_CONDITIONAL_OPERATION_IS_EXPENSIVE (unsigned @var{ifn})
+This hook returns true if masked operation @var{ifn} (really of
+type @code{internal_fn}) should be considered more expensive to use than
+implementing the same operation without masking. GCC can then try to use
+unconditional operations instead with extra selects.
+@end deftypefn
+
@deftypefn {Target Hook} bool TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE (unsigned @var{ifn})
This hook returns true if masked internal function @var{ifn} (really of
type @code{internal_fn}) should be considered expensive when the mask is
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 24596eb..64cea3b 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4290,6 +4290,8 @@ address; but often a machine-dependent strategy can generate better code.
@hook TARGET_VECTORIZE_GET_MASK_MODE
+@hook TARGET_VECTORIZE_CONDITIONAL_OPERATION_IS_EXPENSIVE
+
@hook TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE
@hook TARGET_VECTORIZE_CREATE_COSTS
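
The hook text added in the tm.texi hunk says a target can report a masked operation as more expensive than the unmasked one, in which case GCC may use the unconditional operation plus a select. The toy model below sketches that choice for a conditional add. Every name in it (`toy_ifn`, `toy_hook_fn`, the two policy functions, `toy_cond_add`) is a hypothetical stand-in; none is GCC's `internal_fn` enum, hook signature, or vectorizer code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an internal function code.  */
enum toy_ifn { TOY_COND_ADD };

/* Hypothetical stand-in for the target hook's shape.  */
typedef bool (*toy_hook_fn) (unsigned ifn);

/* Toy policy: masked operations are never considered expensive.  */
bool
toy_masking_is_cheap (unsigned ifn)
{
  (void) ifn;
  return false;
}

/* Toy policy: the conditional add is considered expensive.  */
bool
toy_masking_is_expensive (unsigned ifn)
{
  return ifn == TOY_COND_ADD;
}

/* Computes r[i] = mask[i] ? a[i] + b[i] : a[i].  Unsigned arithmetic keeps
   the unconditional add well defined even in lanes the mask disables.  */
void
toy_cond_add (toy_hook_fn expensive_p, const bool *mask,
              const unsigned *a, const unsigned *b, unsigned *r, int n)
{
  if (expensive_p (TOY_COND_ADD))
    {
      /* Unconditional operation followed by a select.  */
      for (int i = 0; i < n; i++)
        {
          unsigned sum = a[i] + b[i];
          r[i] = mask[i] ? sum : a[i];
        }
    }
  else
    {
      /* Masked form: the add only happens in enabled lanes.  */
      for (int i = 0; i < n; i++)
        {
          r[i] = a[i];
          if (mask[i])
            r[i] += b[i];
        }
    }
}
```

Both branches compute the same values; the policy only picks which shape is used, which is the kind of decision the hook is documented to inform.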