path: root/gcc/fortran/options.cc
author: Di Zhao <dizhao@os.amperecomputing.com> 2023-11-09 15:06:37 +0800
committer: Di Zhao <dizhao@os.amperecomputing.com> 2023-11-23 20:56:31 +0800
commit: 746344dd53807d840c29f52adba10d0ab093bd3d
tree: d125ee965f08ace40f9b75f608bfebe4bc2ff935 /gcc/fortran/options.cc
parent: ef296fb37cac12a5a10e83c16ae021a624e1238c
swap ops in reassoc to reduce cross backedge FMA

Previously, for ops.length >= 3, when FMA is present, we don't rank the
operands so that more FMAs can be preserved.  But this brings more FMAs
with loop dependency, which leads to worse performance on some targets.

Rank the operands (set width=2) when:
1. avoid_fma_max_bits is set.
2. And a loop dependent FMA sequence is found.

In this way, we don't have to discard all the FMA candidates in the
badly shaped sequence in widening_mul; instead we can keep fewer FMAs
without loop dependency.

With this patch, there's about 2% improvement in the 510.parest_r 1-copy
run on ampere1 (with "-Ofast -mcpu=ampere1 -flto --param
avoid-fma-max-bits=512").

    PR tree-optimization/110279

gcc/ChangeLog:

    * tree-ssa-reassoc.cc (get_reassociation_width): check
    for loop dependent FMAs.
    (reassociate_bb): For 3 ops, refine the condition to call
    swap_ops_for_binary_stmt.

gcc/testsuite/ChangeLog:

    * gcc.dg/pr110279-1.c: New test.

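
The loop shape described in the message above can be illustrated with a small sketch. This is not the code from gcc.dg/pr110279-1.c; the function names `dot2_chained` and `dot2_swapped` are hypothetical and were chosen only for illustration. The two functions write out by hand the two association orders of a 3-operand sum (`acc`, `a*b`, `c*d`). Whether a given compiler actually forms FMAs here depends on the target and flags, so the comments describe what may happen rather than what is guaranteed.

```c
#include <stddef.h>

/* Hypothetical kernel, not taken from the patch or its test case.
   Association order (acc + a*b) + c*d: if both multiplies are
   contracted into FMAs, each FMA consumes the running sum, so both
   sit on the dependency chain that crosses the loop backedge.  */
double
dot2_chained (const double *a, const double *b,
              const double *c, const double *d, size_t n)
{
  double acc = 0.0;
  for (size_t i = 0; i < n; i++)
    acc = (acc + a[i] * b[i]) + c[i] * d[i];
  return acc;
}

/* Association order acc + (a*b + c*d): at most one FMA can be formed
   (for a*b + c*d), and it does not depend on acc.  Only a plain add
   remains on the loop-carried chain.  */
double
dot2_swapped (const double *a, const double *b,
              const double *c, const double *d, size_t n)
{
  double acc = 0.0;
  for (size_t i = 0; i < n; i++)
    {
      double t = a[i] * b[i] + c[i] * d[i];
      acc = acc + t;
    }
  return acc;
}
```

The second form keeps fewer FMA candidates but shortens the chain carried from one iteration to the next, which is the trade-off the message describes. For general floating-point inputs the two orders may round differently, which is why this kind of reassociation is only done under options such as -Ofast.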
Diffstat (limited to 'gcc/fortran/options.cc')
0 files changed, 0 insertions, 0 deletions