author | Richard Biener <rguenther@suse.de> | 2023-06-26 12:51:37 +0200 |
---|---|---|
committer | Richard Biener <rguenther@suse.de> | 2023-06-26 14:14:54 +0200 |
commit | 53d6f57c1b20c6da52aefce737fb7d5263686ba3 (patch) | |
tree | ea464de31dd3bd20324ceb81ea7dd6aa7aa9881d /gcc/testsuite | |
parent | a024176a97b0176f526862836c33e283b8db4197 (diff) | |
tree-optimization/110381 - preserve SLP permutation with in-order reductions

The following fixes a bug that manifests itself during the fold-left
reduction transform, which picks a scalar def other than the last one
to replace and thus double-counts some elements. But the underlying
issue is that we merge a load permutation into the in-order reduction,
which is of course wrong.

Now, reduction analysis has not yet been performed when optimizing
permutations, so we have to resort to checking that ourselves.

PR tree-optimization/110381
* tree-vect-slp.cc (vect_optimize_slp_pass::start_choosing_layouts):
Materialize permutes before fold-left reductions.
* gcc.dg/vect/pr110381.c: New testcase.
Diffstat (limited to 'gcc/testsuite')
-rw-r--r-- | gcc/testsuite/gcc.dg/vect/pr110381.c | 40 |
1 files changed, 40 insertions, 0 deletions
diff --git a/gcc/testsuite/gcc.dg/vect/pr110381.c b/gcc/testsuite/gcc.dg/vect/pr110381.c
new file mode 100644
index 0000000..2313dbf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr110381.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+
+struct FOO {
+   double a;
+   double b;
+   double c;
+};
+
+double __attribute__((noipa))
+sum_8_foos(const struct FOO* foos)
+{
+  double sum = 0;
+
+  for (int i = 0; i < 8; ++i)
+    {
+      struct FOO foo = foos[i];
+
+      /* Need to use an in-order reduction here, preserving
+         the load permutation. */
+      sum += foo.a;
+      sum += foo.c;
+      sum += foo.b;
+    }
+
+  return sum;
+}
+
+int main()
+{
+  struct FOO foos[8];
+
+  __builtin_memset (foos, 0, sizeof (foos));
+  foos[0].a = __DBL_MAX__;
+  foos[0].b = 5;
+  foos[0].c = -__DBL_MAX__;
+
+  if (sum_8_foos (foos) != 5)
+    __builtin_abort ();
+  return 0;
+}