author | David Malcolm <dmalcolm@redhat.com> | 2022-03-24 20:58:10 -0400 |
---|---|---|
committer | David Malcolm <dmalcolm@redhat.com> | 2022-03-24 20:58:10 -0400 |
commit | 5f6197d7c197f9d2b7fb2e1a19dac39a023755e8 (patch) | |
tree | 55c78e08c0ae81516a4c8708283d76352ae64b52 /gcc/analyzer/region.h | |
parent | 319ba7e241e7e21f9eb481f075310796f13d2035 (diff) | |
analyzer: add region::tracked_p to optimize state objects [PR104954]
PR analyzer/104954 tracks that -fanalyzer was taking a very long time
on a particular source file in the Linux kernel:
drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
One issue occurs with the repeated use of dynamic debug lines e.g. via
the DC_LOG_BANDWIDTH_CALCS macro, such as in print_bw_calcs_dceip in
drivers/gpu/drm/amd/display/dc/calcs/calcs_logger.h:
DC_LOG_BANDWIDTH_CALCS("#####################################################################");
DC_LOG_BANDWIDTH_CALCS("struct bw_calcs_dceip");
DC_LOG_BANDWIDTH_CALCS("#####################################################################");
[...snip dozens of lines...]
DC_LOG_BANDWIDTH_CALCS("[bw_fixed] dmif_request_buffer_size: %d",
bw_fixed_to_int(dceip->dmif_request_buffer_size));
When this is configured to use __dynamic_pr_debug, each of these becomes
code like:
  do {
    static struct _ddebug __attribute__((__aligned__(8)))
    __attribute__((__section__("__dyndbg"))) __UNIQUE_ID_ddebug277 = {
      [...snip...]
    };
    if (arch_static_branch(&__UNIQUE_ID_ddebug277.key, false))
      __dynamic_pr_debug(&__UNIQUE_ID_ddebug277, [...the message...]);
  } while (0);
The analyzer was naively seeing each call to __dynamic_pr_debug and noting
that the __UNIQUE_ID_nnnn object escapes. Each successive call lets one
more __UNIQUE_ID_nnnn object escape, so by the Nth call there are N escaped
objects, all of which need clobbering. That gives O(N^2) clobbering of
escaped objects overall, leading to huge amounts of pointless work:
print_bw_calcs_data has 225 uses of DC_LOG_BANDWIDTH_CALCS, many of which
are in loops.
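The quadratic growth described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `count_clobbers` and its parameters are hypothetical names, not analyzer code, and the model assumes that every call lets exactly one new object escape and then clobbers every object that has escaped so far.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical toy model (not GCC code): count the clobber operations done
// over n_calls calls, where each call first lets one more debug object
// escape (if such objects are tracked) and then clobbers all escaped objects.
std::size_t count_clobbers(std::size_t n_calls, bool track_debug_objects) {
  std::size_t escaped = 0;   // objects known to have escaped so far
  std::size_t clobbers = 0;  // total clobber operations performed
  for (std::size_t i = 0; i < n_calls; ++i) {
    if (track_debug_objects)
      ++escaped;             // this call's object now escapes too
    clobbers += escaped;     // every escaped object is clobbered at the call
  }
  return clobbers;
}
```

Under this model, 225 straight-line calls give 225 * 226 / 2 = 25425 clobbers when the objects are tracked, and none when they are not. The real analyzer's cost also depends on loops and on the rest of the state, so the numbers are illustrative only.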
This patch adds a way to identify declarations that aren't interesting
to the analyzer, so that we don't attempt to create binding_clusters
for them (i.e. we don't store any state for them in our state objects).
This is implemented by adding a new region::tracked_p. For declarations,
it is computed by walking the existing IPA data the first time the
analyzer sees a declaration, and is set to false for global vars that
have no loads/stores/aliases and only "sufficiently safe" address-of
ipa-refs.
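The shape of this mechanism can be sketched outside GCC as follows. This is a simplified illustration, not the patch itself: the `sketch` namespace, the `decl_info` struct and its boolean fields are hypothetical stand-ins for the IPA data, the `store` here holds plain integers rather than binding_clusters, and the sketch assumes that only globals are candidates for being untracked.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>

namespace sketch {

// Hypothetical summary of what the IPA data says about a declaration.
struct decl_info {
  bool is_global;
  bool has_loads;
  bool has_stores;
  bool has_aliases;
  bool has_unsafe_address_of;  // an address-of ref that is not "sufficiently safe"
};

class region {
public:
  virtual ~region() = default;
  // By default, every base region is tracked in the store.
  virtual bool tracked_p() const { return true; }
};

class decl_region : public region {
public:
  // The flag is computed once, at construction, and cached.
  explicit decl_region(const decl_info &info)
      : m_tracked(calc_tracked_p(info)) {}
  bool tracked_p() const override { return m_tracked; }

private:
  static bool calc_tracked_p(const decl_info &info) {
    if (!info.is_global)
      return true;  // assumption of this sketch: only globals can be untracked
    return info.has_loads || info.has_stores || info.has_aliases ||
           info.has_unsafe_address_of;
  }
  bool m_tracked;
};

class store {
public:
  // Untracked regions are never recorded as escaped.
  void mark_as_escaped(const region *r) {
    if (!r->tracked_p())
      return;
    m_escaped.insert(r);
  }
  // Creating state for an untracked region would be a bug.
  int &get_or_create_cluster(const region *r) {
    assert(r->tracked_p());
    return m_clusters[r];
  }
  std::size_t num_escaped() const { return m_escaped.size(); }

private:
  std::set<const region *> m_escaped;
  std::map<const region *, int> m_clusters;
};

}  // namespace sketch
```

Caching the result in the region means the check at each escape or cluster lookup is a single virtual call, rather than a repeated walk over the reference data.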
The patch gives a large speedup of -fanalyzer on the above kernel
source file:
                              Before  After
  Total cc1 wallclock time:     180s    36s
  analyzer wallclock time:      162s    17s
  % spent in analyzer:           90%    47%
gcc/analyzer/ChangeLog:
PR analyzer/104954
* analyzer.opt (-fdump-analyzer-untracked): New option.
* engine.cc (impl_run_checkers): Handle it.
* region-model-asm.cc (region_model::on_asm_stmt): Don't attempt
to clobber regions with !tracked_p ().
* region-model-manager.cc (dump_untracked_region): New.
(region_model_manager::dump_untracked_regions): New.
(frame_region::dump_untracked_regions): New.
* region-model.h (region_model_manager::dump_untracked_regions):
New decl.
* region.cc (ipa_ref_requires_tracking): New.
(symnode_requires_tracking_p): New.
(decl_region::calc_tracked_p): New.
* region.h (region::tracked_p): New vfunc.
(frame_region::dump_untracked_regions): New decl.
(class decl_region): Note that this is also used for SSA names.
(decl_region::decl_region): Initialize m_tracked.
(decl_region::tracked_p): New.
(decl_region::calc_tracked_p): New decl.
(decl_region::m_tracked): New.
* store.cc (store::get_or_create_cluster): Assert that we
don't try to create clusters for base regions that aren't
trackable.
(store::mark_as_escaped): Don't mark base regions that we're not
tracking.
gcc/ChangeLog:
PR analyzer/104954
* doc/invoke.texi (Static Analyzer Options): Add
-fdump-analyzer-untracked.
gcc/testsuite/ChangeLog:
PR analyzer/104954
* gcc.dg/analyzer/asm-x86-dyndbg-1.c: New test.
* gcc.dg/analyzer/asm-x86-dyndbg-2.c: New test.
* gcc.dg/analyzer/many-unused-locals.c: New test.
* gcc.dg/analyzer/untracked-1.c: New test.
* gcc.dg/analyzer/unused-local-1.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Diffstat (limited to 'gcc/analyzer/region.h')
-rw-r--r-- | gcc/analyzer/region.h | 24 |
1 files changed, 22 insertions, 2 deletions
diff --git a/gcc/analyzer/region.h b/gcc/analyzer/region.h
index dbeb485..5150be7 100644
--- a/gcc/analyzer/region.h
+++ b/gcc/analyzer/region.h
@@ -197,6 +197,11 @@ public:
 
   bool symbolic_for_unknown_ptr_p () const;
 
+  /* For most base regions it makes sense to track the bindings of the region
+     within the store.  As an optimization, some are not tracked (to avoid
+     bloating the store object with redundant binding clusters).  */
+  virtual bool tracked_p () const { return true; }
+
   const complexity &get_complexity () const { return m_complexity; }
 
   bool is_named_decl_p (const char *decl_name) const;
@@ -319,6 +324,9 @@ public:
 
   unsigned get_num_locals () const { return m_locals.elements (); }
 
+  /* Implemented in region-model-manager.cc.  */
+  void dump_untracked_regions () const;
+
 private:
   const frame_region *m_calling_frame;
   function *m_fun;
@@ -633,13 +641,15 @@ template <> struct default_hash_traits<symbolic_region::key_t>
 namespace ana {
 
 /* Concrete region subclass representing the memory occupied by a
-   variable (whether for a global or a local).  */
+   variable (whether for a global or a local).
+   Also used for representing SSA names, as if they were locals.  */
 
 class decl_region : public region
 {
 public:
   decl_region (unsigned id, const region *parent, tree decl)
-  : region (complexity (parent), id, parent, TREE_TYPE (decl)), m_decl (decl)
+  : region (complexity (parent), id, parent, TREE_TYPE (decl)), m_decl (decl),
+    m_tracked (calc_tracked_p (decl))
   {}
 
   enum region_kind get_kind () const FINAL OVERRIDE { return RK_DECL; }
@@ -648,6 +658,8 @@ public:
 
   void dump_to_pp (pretty_printer *pp, bool simple) const FINAL OVERRIDE;
 
+  bool tracked_p () const FINAL OVERRIDE { return m_tracked; }
+
   tree get_decl () const { return m_decl; }
   int get_stack_depth () const;
 
@@ -657,7 +669,15 @@ public:
   const svalue *get_svalue_for_initializer (region_model_manager *mgr) const;
 
 private:
+  static bool calc_tracked_p (tree decl);
+
   tree m_decl;
+
+  /* Cached result of calc_tracked_p, so that we can quickly determine when
+     we don't to track a binding_cluster for this decl (to avoid bloating
+     store objects).
+     This can be debugged using -fdump-analyzer-untracked.  */
+  bool m_tracked;
 };
 
 } // namespace ana