author: Richard Sandiford <richard.sandiford@arm.com> 2018-07-03 09:59:37 +0000
committer: Richard Sandiford <rsandifo@gcc.gnu.org> 2018-07-03 09:59:37 +0000
commit: 370c2ebe8fa20e0812cd2d533d4ed38ee2d37c85
tree: 702c8f6f9f77e7b73d3ebba7dd792e61301526e6 /gcc/tree-vectorizer.h
parent: 3239dde94019f11e6c1a8c6ae2b3f7d944689148
[14/n] PR85694: Rework overwidening detection

This patch is the main part of PR85694. The aim is to recognise at least:

  signed char *a, *b, *c;
  ...
  for (int i = 0; i < 2048; i++)
    c[i] = (a[i] + b[i]) >> 1;

as an over-widening pattern, since the addition and shift can be done
on shorts rather than ints. However, it ended up being a lot more
general than that.

The current over-widening pattern detection is limited to a few simple
cases: logical ops with immediate second operands, and shifts by a
constant. These cases are enough for common pixel-format conversion
and can be detected in a peephole way.

The loop above requires two generalisations of the current code: support
for addition as well as logical ops, and support for non-constant second
operands. These are harder to detect in the same peephole way, so the
patch tries to take a more global approach.

The idea is to get information about the minimum operation width
in two ways:

(1) by using the range information attached to the SSA_NAMEs
    (effectively a forward walk, since the range info is
    context-independent).

(2) by back-propagating the number of output bits required by
    users of the result.

As explained in the comments, there's a balance to be struck between
narrowing an individual operation and fitting in with the surrounding
code. The approach is pretty conservative: if we could narrow an
operation to N bits without changing its semantics, it's OK to do that if:

- no operations later in the chain require more than N bits; or

- all internally-defined inputs are extended from N bits or fewer,
  and at least one of them is single-use.

See the comments for the rationale.

I didn't bother adding STMT_VINFO_* wrappers for the new fields
since the code seemed more readable without.

2018-06-20  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* poly-int.h (print_hex): New function.
	* dumpfile.h (dump_dec, dump_hex): Declare.
	* dumpfile.c (dump_dec, dump_hex): New poly_wide_int functions.
	* tree-vectorizer.h (_stmt_vec_info): Add min_output_precision,
	min_input_precision, operation_precision and operation_sign.
	* tree-vect-patterns.c (vect_get_range_info): New function.
	(vect_same_loop_or_bb_p, vect_single_imm_use)
	(vect_operation_fits_smaller_type): Delete.
	(vect_look_through_possible_promotion): Add an optional
	single_use_p parameter.
	(vect_recog_over_widening_pattern): Rewrite to use new
	stmt_vec_info information. Handle one operation at a time.
	(vect_recog_cast_forwprop_pattern, vect_narrowable_type_p)
	(vect_truncatable_operation_p, vect_set_operation_type)
	(vect_set_min_input_precision): New functions.
	(vect_determine_min_output_precision_1): Likewise.
	(vect_determine_min_output_precision): Likewise.
	(vect_determine_precisions_from_range): Likewise.
	(vect_determine_precisions_from_users): Likewise.
	(vect_determine_stmt_precisions, vect_determine_precisions): Likewise.
	(vect_vect_recog_func_ptrs): Put over_widening first.
	Add cast_forwprop.
	(vect_pattern_recog): Call vect_determine_precisions.

gcc/testsuite/
	* gcc.dg/vect/vect-widen-mult-u8-u32.c: Check specifically for a
	widen_mult pattern.
	* gcc.dg/vect/vect-over-widen-1.c: Update the scan tests for new
	over-widening messages.
	* gcc.dg/vect/vect-over-widen-1-big-array.c: Likewise.
	* gcc.dg/vect/vect-over-widen-2.c: Likewise.
	* gcc.dg/vect/vect-over-widen-2-big-array.c: Likewise.
	* gcc.dg/vect/vect-over-widen-3.c: Likewise.
	* gcc.dg/vect/vect-over-widen-3-big-array.c: Likewise.
	* gcc.dg/vect/vect-over-widen-4.c: Likewise.
	* gcc.dg/vect/vect-over-widen-4-big-array.c: Likewise.
	* gcc.dg/vect/bb-slp-over-widen-1.c: New test.
	* gcc.dg/vect/bb-slp-over-widen-2.c: Likewise.
	* gcc.dg/vect/vect-over-widen-5.c: Likewise.
	* gcc.dg/vect/vect-over-widen-6.c: Likewise.
	* gcc.dg/vect/vect-over-widen-7.c: Likewise.
	* gcc.dg/vect/vect-over-widen-8.c: Likewise.
	* gcc.dg/vect/vect-over-widen-9.c: Likewise.
	* gcc.dg/vect/vect-over-widen-10.c: Likewise.
	* gcc.dg/vect/vect-over-widen-11.c: Likewise.
	* gcc.dg/vect/vect-over-widen-12.c: Likewise.
	* gcc.dg/vect/vect-over-widen-13.c: Likewise.
	* gcc.dg/vect/vect-over-widen-14.c: Likewise.
	* gcc.dg/vect/vect-over-widen-15.c: Likewise.
	* gcc.dg/vect/vect-over-widen-16.c: Likewise.
	* gcc.dg/vect/vect-over-widen-17.c: Likewise.
	* gcc.dg/vect/vect-over-widen-18.c: Likewise.
	* gcc.dg/vect/vect-over-widen-19.c: Likewise.
	* gcc.dg/vect/vect-over-widen-20.c: Likewise.
	* gcc.dg/vect/vect-over-widen-21.c: Likewise.

From-SVN: r262333

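
Illustration (not part of the commit): the claim that the addition and shift in the example loop can be done on shorts rather than ints can be checked exhaustively in plain C. The sketch below is only a scalar model under stated assumptions. The function names are hypothetical, the 16-bit form is emulated with int16_t casts (C still promotes the operands to int internally), and right shifts of negative values are assumed to be arithmetic, as they are with GCC.

```c
#include <stdint.h>

/* Reference form: the integer promotions make the addition and the
   shift happen in int, exactly as in the loop body above.  */
signed char
avg_in_int (signed char a, signed char b)
{
  return (signed char) ((a + b) >> 1);
}

/* Narrowed model: the sum is forced into 16 bits before the shift.
   The sum of two signed chars lies in [-256, 254], so it always fits
   in an int16_t and the cast loses nothing.  */
signed char
avg_in_short (signed char a, signed char b)
{
  int16_t sum = (int16_t) (a + b);
  return (signed char) (int16_t) (sum >> 1);
}

/* Hypothetical helper: count the input pairs for which the two forms
   disagree, over all 65536 combinations.  */
int
count_mismatches (void)
{
  int mismatches = 0;
  for (int a = -128; a <= 127; a++)
    for (int b = -128; b <= 127; b++)
      if (avg_in_int ((signed char) a, (signed char) b)
          != avg_in_short ((signed char) a, (signed char) b))
        mismatches++;
  return mismatches;
}
```

A count of zero is what makes the operation a candidate for narrowing: the 16-bit intermediate never loses information, so a vector of shorts can hold twice as many elements as a vector of ints.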
Diffstat (limited to 'gcc/tree-vectorizer.h')
-rw-r--r-- gcc/tree-vectorizer.h | 15
1 files changed, 15 insertions, 0 deletions
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 2dac54e..28be41f 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -899,6 +899,21 @@ typedef struct _stmt_vec_info {
   /* The number of scalar stmt references from active SLP instances. */
   unsigned int num_slp_uses;
+
+  /* If nonzero, the lhs of the statement could be truncated to this
+     many bits without affecting any users of the result. */
+  unsigned int min_output_precision;
+
+  /* If nonzero, all non-boolean input operands have the same precision,
+     and they could each be truncated to this many bits without changing
+     the result. */
+  unsigned int min_input_precision;
+
+  /* If OPERATION_BITS is nonzero, the statement could be performed on
+     an integer with the sign and number of bits given by OPERATION_SIGN
+     and OPERATION_BITS without changing the result. */
+  unsigned int operation_precision;
+  signop operation_sign;
 } *stmt_vec_info;
 /* Information about a gather/scatter call. */
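
Illustration (not part of the commit): the conservative rule in the commit message, under which narrowing an operation to N bits is accepted, can be sketched as a standalone predicate. Everything here is a hypothetical model written for this note. The struct, its field names and the function are assumptions and do not correspond to code in tree-vect-patterns.c, which works on stmt_vec_info and SSA names instead.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one internally-defined input operand.  */
struct input_summary
{
  unsigned int extended_from_bits;  /* precision it was extended from */
  bool single_use;                  /* true if it has exactly one user */
};

/* Return true if narrowing an operation to N bits passes the rule
   quoted in the commit message.  MAX_LATER_BITS models the widest
   precision required by any later operation in the chain.  */
bool
narrowing_accepted_p (unsigned int n, unsigned int max_later_bits,
                      const struct input_summary *inputs, size_t n_inputs)
{
  /* First condition: no later operation requires more than N bits.  */
  if (max_later_bits <= n)
    return true;

  /* Second condition: every internally-defined input is extended from
     N bits or fewer, and at least one of them is single-use.  */
  bool any_single_use = false;
  for (size_t i = 0; i < n_inputs; i++)
    {
      if (inputs[i].extended_from_bits > n)
        return false;
      if (inputs[i].single_use)
        any_single_use = true;
    }
  return any_single_use;
}
```

In this model, an addition of two values extended from 8 bits, one of them single-use, is accepted for narrowing to 16 bits even when a later statement needs 32 bits. The same addition is rejected if neither input is single-use.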