author | Richard Sandiford <richard.sandiford@arm.com> | 2023-01-27 17:03:51 +0000 |
---|---|---|
committer | Richard Sandiford <richard.sandiford@arm.com> | 2023-01-27 17:03:51 +0000 |
commit | 7486fe153adaa868f36248b72f3e78d18b1b3ba1 (patch) | |
tree | a032a0741ea03f90921b41e02085a23d71dfde48 /gcc/tree.h | |
parent | 553f8003ba5ecfdf0574a171692843ef838226b4 (diff) | |
Add support for conditional xorsign [PR96373]
This patch is an optimisation, but it's also a prerequisite for
fixing PR96373 without regressing vect-xorsign_exec.c.
Currently the vectoriser vectorises:
    for (i = 0; i < N; i++)
      r[i] = a[i] * __builtin_copysignf (1.0f, b[i]);
as two unconditional operations (copysign and mult).
tree-ssa-math-opts.cc later combines them into an "xorsign" function.
This works for both Advanced SIMD and SVE.
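
For readers unfamiliar with the term, the following is a minimal scalar sketch of what "xorsign" means. It is not code from the patch. The helper names xorsign_mul and xorsign_bits are hypothetical, and the sketch assumes a 32-bit IEEE float and a compiler that provides __builtin_copysignf (GCC or Clang). The first function is the form the source loop uses; the second flips the sign bit of a whenever the sign bit of b is set, which gives the same result for ordinary (non-NaN) inputs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE single.  */
static_assert (sizeof (float) == sizeof (uint32_t), "float must be 32 bits");

/* Multiply form, as written in the loop above: A times +1.0 or -1.0.  */
static float
xorsign_mul (float a, float b)
{
  return a * __builtin_copysignf (1.0f, b);
}

/* Bitwise form (hypothetical helper): XOR the sign bit of B into A.  */
static float
xorsign_bits (float a, float b)
{
  uint32_t ua, ub;
  memcpy (&ua, &a, sizeof ua);
  memcpy (&ub, &b, sizeof ub);
  ua ^= ub & UINT32_C (0x80000000);	/* Keep only B's sign bit.  */
  memcpy (&a, &ua, sizeof a);
  return a;
}
```

The bitwise form needs no floating-point multiply, which is why combining copysign and mult into a single xorsign operation is worthwhile.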
However, with the fix for PR96373, the vectoriser will instead
generate a conditional multiplication (IFN_COND_MUL). Something then
needs to fold copysign & IFN_COND_MUL to the equivalent of a conditional
xorsign. Three obvious options were:
(1) Extend tree-ssa-math-opts.cc.
(2) Do the fold in match.pd.
(3) Leave it to rtl combine.
I'm against (3), because this isn't a target-specific optimisation.
(1) would be possible, but would involve open-coding a lot of what
match.pd does for us. And, in contrast to doing the current
tree-ssa-math-opts.cc optimisation in match.pd, there should be
no danger of (2) happening too early. If we have an IFN_COND_MUL
then we're already past the stage of simplifying the original
source code.
There was also a choice between adding a conditional xorsign ifn
and simply open-coding the xorsign. The latter seems simpler,
and means less boiler-plate for target-specific code.
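
As an illustration only, the open-coded conditional xorsign can be modelled per lane in scalar C. This is not the match.pd pattern itself. The function names, the mask and fallback arrays, and the lane-by-lane loop are assumptions made for the sketch, and a 32-bit IEEE float plus __builtin_copysignf (GCC or Clang) are assumed as well. The first function models a conditional multiply by copysign (1, b); the second computes the same thing as an AND with the sign mask followed by a conditional XOR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE single.  */
static_assert (sizeof (float) == sizeof (uint32_t), "float must be 32 bits");

/* Hypothetical model of a conditional multiply by copysign (1, b):
   active lanes get the product, inactive lanes get FALLBACK[i].  */
static void
cond_mul_copysign (float *r, const bool *mask, const float *a,
		   const float *b, const float *fallback, size_t n)
{
  for (size_t i = 0; i < n; i++)
    r[i] = mask[i] ? a[i] * __builtin_copysignf (1.0f, b[i]) : fallback[i];
}

/* Hypothetical model of the open-coded form: AND with the sign mask,
   then a conditional XOR with the same mask and fallback values.  */
static void
cond_xor_and (float *r, const bool *mask, const float *a,
	      const float *b, const float *fallback, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      uint32_t ua, ub;
      float flipped;
      memcpy (&ua, &a[i], sizeof ua);
      memcpy (&ub, &b[i], sizeof ub);
      ua ^= ub & UINT32_C (0x80000000);	/* The "and", then the XOR.  */
      memcpy (&flipped, &ua, sizeof flipped);
      r[i] = mask[i] ? flipped : fallback[i];
    }
}
```

Both functions leave inactive lanes at their fallback value, so only the per-lane arithmetic differs between them.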
The signed_or_unsigned_type_for change is needed to make sure
that we stay in "SVE space" when doing the optimisation on 128-bit
fixed-length SVE.
gcc/
PR tree-optimization/96373
* tree.h (sign_mask_for): Declare.
* tree.cc (sign_mask_for): New function.
(signed_or_unsigned_type_for): For vector types, try to use the
related_int_vector_mode.
* genmatch.cc (commutative_op): Handle conditional internal functions.
* match.pd: Fold an IFN_COND_MUL+copysign into an IFN_COND_XOR+and.
gcc/testsuite/
PR tree-optimization/96373
* gcc.target/aarch64/sve/cond_xorsign_1.c: New test.
* gcc.target/aarch64/sve/cond_xorsign_2.c: Likewise.
Diffstat (limited to 'gcc/tree.h')
-rw-r--r-- | gcc/tree.h | 1 |
1 files changed, 1 insertions, 0 deletions
@@ -4675,6 +4675,7 @@ extern tree build_one_cst (tree);
 extern tree build_minus_one_cst (tree);
 extern tree build_all_ones_cst (tree);
 extern tree build_zero_cst (tree);
+extern tree sign_mask_for (tree);
 extern tree build_string (unsigned, const char * = NULL);
 extern tree build_poly_int_cst (tree, const poly_wide_int_ref &);
 extern tree build_tree_list (tree, tree CXX_MEM_STAT_INFO);