path: root/gcc/tree-ssa-loop-split.cc
author Roger Sayle <roger@nextmovesoftware.com> 2023-07-28 09:39:46 +0100
committer Roger Sayle <roger@nextmovesoftware.com> 2023-07-28 09:39:46 +0100
commit 095eb138f736d94dabf9a07a6671bd351be0e66a
tree 218dfdc0a8304e51355be6ce850b27a05f7b9fe0 /gcc/tree-ssa-loop-split.cc
parent b24acae8f4d315a5b071ffc2574ce91c7a0800ca
PR rtl-optimization/110587: Reduce useless moves in compile-time hog.

This patch is one of a series of fixes for PR rtl-optimization/110587,
a compile-time regression with -O0, that attempts to address the
underlying cause. As noted previously, the pathological test case
pr28071.c contains a large number of useless register-to-register
moves that can produce quadratic behaviour (in LRA). These moves are
generated during RTL expansion in emit_group_load_1, where the
middle-end attempts to simplify the source before calling
extract_bit_field. This is reasonable if the source is a complex
expression (from before the tree-ssa optimizers), or a SUBREG, or a
hard register, but it's not particularly useful to copy a pseudo
register into a new pseudo register. This patch eliminates that
redundancy.

The -fdump-tree-expand for pr28071.c compiled with -O0 currently
contains 777K lines, with this patch it contains 717K lines, i.e.
saving about 60K lines (admittedly of debugging text output, but it
makes the point).

2023-07-28  Roger Sayle  <roger@nextmovesoftware.com>
            Richard Biener  <rguenther@suse.de>

gcc/ChangeLog
        PR middle-end/28071
        PR rtl-optimization/110587
        * expr.cc (emit_group_load_1): Simplify logic for calling
        force_reg on ORIG_SRC, to avoid making a copy if the source
        is already in a pseudo register.

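
The decision described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code in expr.cc: `SrcKind`, `ToyExpander`, `legitimize_source` and `count_moves` are hypothetical names invented for illustration, and the model only counts how many copies into a fresh pseudo register would be emitted for the source kinds the commit message mentions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical classification of a source operand. This is NOT GCC's rtx
// representation; it only names the cases listed in the commit message.
enum class SrcKind { PseudoReg, HardReg, Subreg, ComplexExpr };

// Toy stand-in for the expander state (an assumption for illustration).
struct ToyExpander {
    int next_pseudo = 100;   // next free pseudo register number (made up)
    int moves_emitted = 0;   // copies into a fresh pseudo emitted so far

    // Returns the number of the pseudo register that holds the source.
    int legitimize_source(SrcKind kind, int regno) {
        assert(regno >= 0);
        if (kind == SrcKind::PseudoReg)
            return regno;    // already in a pseudo: reuse it, emit no move
        ++moves_emitted;     // hard register, SUBREG or complex expression:
        return next_pseudo++;  // copy into a newly allocated pseudo
    }
};

// Counts the copies needed for a sequence of sources.
int count_moves(const std::vector<SrcKind>& sources) {
    ToyExpander ex;
    int regno = 0;
    for (SrcKind kind : sources)
        ex.legitimize_source(kind, regno++);
    return ex.moves_emitted;
}
```

In this model a source that is already a pseudo register is returned unchanged, so a test case dominated by pseudo-to-pseudo copies emits correspondingly fewer moves.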
Diffstat (limited to 'gcc/tree-ssa-loop-split.cc')
0 files changed, 0 insertions, 0 deletions