author | David Malcolm <dmalcolm@redhat.com> | 2022-06-02 15:40:22 -0400 |
---|---|---|
committer | David Malcolm <dmalcolm@redhat.com> | 2022-06-02 15:40:22 -0400 |
commit | 6cf276ddf22066af780335cd0072d2c27aabe468 (patch) | |
tree | a765d4f8d248aa7a4cebe46e77c3eace8c99ad6f /gcc | |
parent | 5ab73173cca4610e59df8a3fe9cb5b30ded75aec (diff) | |
diagnostics: add SARIF output format
This patch adds support to gcc's diagnostic subsystem for emitting
diagnostics in SARIF, aka the Static Analysis Results Interchange Format:
https://sarifweb.azurewebsites.net/
by extending -fdiagnostics-format= to add two new options:
-fdiagnostics-format=sarif-stderr
and:
-fdiagnostics-format=sarif-file
The patch targets SARIF v2.1.0.
This is a JSON-based format suited for capturing the results of static
analysis tools (like GCC's -fanalyzer), but it can also be used for plain
GCC warnings and errors.
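As a rough illustration of the shape of such a file, here is a minimal C++ sketch that builds a SARIF v2.1.0 log containing one result. It is not the code in this patch and not the exact output GCC produces; the function name make_minimal_sarif_log is hypothetical, and it assumes its inputs need no JSON escaping.

```cpp
#include <string>

// Illustrative only: build a minimal SARIF v2.1.0 log holding one result.
// Assumption: TOOL_NAME, LEVEL and TEXT contain no characters that need
// JSON escaping; a real emitter has to escape them.
std::string
make_minimal_sarif_log (const std::string &tool_name,
                        const std::string &level,
                        const std::string &text)
{
  std::string out;
  // Top-level log object: "version" plus an array of runs.
  out += "{\"version\": \"2.1.0\",\n";
  // Each run names the tool that produced it via tool.driver.name.
  out += " \"runs\": [{\"tool\": {\"driver\": {\"name\": \""
         + tool_name + "\"}},\n";
  // Each result carries a severity level and a message object.
  out += "           \"results\": [{\"level\": \"" + level + "\",\n";
  out += "                        \"message\": {\"text\": \""
         + text + "\"}}]}]}\n";
  return out;
}
```

A plain warning would then map to a result with level "warning" and the warning text in message.text.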
SARIF supports per-event metadata in diagnostic paths such as
["acquire", "resource"] and ["release", "lock"] (specifically, the
threadFlowLocation "kinds" property: SARIF v2.1.0 section 3.38.8), so
the patch extends GCC's diagnostic_event subclass with a "struct meaning"
with similar purpose. The patch implements this for -fanalyzer so that
the various state-machine-based warnings set these in the SARIF output.
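The idea of turning a "meaning" into a SARIF "kinds" array can be sketched roughly as follows. This is a simplified, hypothetical stand-in rather than the code in diagnostic-path.h or diagnostic-format-sarif.cc: the enums are cut down to a few values, the "property" part is omitted, and make_kinds is an assumed helper name.

```cpp
#include <string>
#include <vector>

// Hypothetical, cut-down enums; the real ones are in diagnostic-path.h.
enum class verb { unknown, acquire, release };
enum class noun { unknown, resource, lock };

// A (verb, noun) pair describing what an event in a path means.
struct meaning
{
  meaning () : m_verb (verb::unknown), m_noun (noun::unknown) {}
  meaning (verb v, noun n) : m_verb (v), m_noun (n) {}
  verb m_verb;
  noun m_noun;
};

// Return a string for V, or nullptr if there is nothing to report.
const char *
maybe_get_verb_str (verb v)
{
  switch (v)
    {
    case verb::acquire: return "acquire";
    case verb::release: return "release";
    default: return nullptr;
    }
}

// Return a string for N, or nullptr if there is nothing to report.
const char *
maybe_get_noun_str (noun n)
{
  switch (n)
    {
    case noun::resource: return "resource";
    case noun::lock: return "lock";
    default: return nullptr;
    }
}

// Collect the non-null strings, in verb-then-noun order, as would be
// needed for a threadFlowLocation "kinds" array.
std::vector<std::string>
make_kinds (const meaning &m)
{
  std::vector<std::string> kinds;
  if (const char *v = maybe_get_verb_str (m.m_verb))
    kinds.push_back (v);
  if (const char *n = maybe_get_noun_str (m.m_noun))
    kinds.push_back (n);
  return kinds;
}
```

With this sketch, an event with no meaning yields an empty list, so a consumer could simply omit the "kinds" property in that case.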
The heart of the implementation is in the new file
diagnostic-format-sarif.cc. Much of the rest of the patch is interface
classes, isolating the diagnostic subsystem (which has no knowledge of
e.g. tree or langhook) from the "client" code in the compiler proper
(cc1 etc).
The patch adds a langhook for specifying the SARIF v2.1.0
"artifact.sourceLanguage" property, based on the list in
SARIF v2.1.0 Appendix J.
The patch adds automated DejaGnu tests to our testsuite via new
scan-sarif-file and scan-sarif-file-not directives (although these
merely use regexps, rather than attempting to use a proper JSON parser).
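To show what a regexp-only check can and cannot verify, here is a small C++ sketch of the same idea. It is not how the DejaGnu directives are implemented; scan_text and scan_text_not are hypothetical helper names used only for illustration.

```cpp
#include <regex>
#include <string>

// Return true if PATTERN (an ECMAScript regex) matches anywhere in TEXT.
bool
scan_text (const std::string &text, const std::string &pattern)
{
  return std::regex_search (text, std::regex (pattern));
}

// Return true if PATTERN matches nowhere in TEXT.
bool
scan_text_not (const std::string &text, const std::string &pattern)
{
  return !scan_text (text, pattern);
}
```

A check such as scan_text (output, "\"version\": \"2\\.1\\.0\"") confirms that the fragment appears somewhere in the output, but not where it sits in the JSON hierarchy, which is the limitation noted above.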
I've tested the patch by hand using the validator at:
https://sarifweb.azurewebsites.net/Validation
and the react-based viewer at:
https://microsoft.github.io/sarif-web-component/
which successfully shows most of the information (although not paths,
and not CWE IDs), and I've fixed all validation errors I've seen (though
bugs no doubt remain).
I've also tested the generated SARIF using the VS Code extension linked
to from the SARIF website; I'm a novice with VS Code, but it seems to be
able to handle my generated SARIF files (e.g. showing the data in the
SARIF tab, and showing squiggly underlines under issues, and when I
click on them, it visualizes the events in the path inline within the
source window).
Has anyone written an Emacs mode for SARIF files? (pretty please)
gcc/ChangeLog:
* Makefile.in (OBJS): Add tree-diagnostic-client-data-hooks.o and
tree-logical-location.o.
(OBJS-libcommon): Add diagnostic-format-sarif.o; reorder.
(CFLAGS-tree-diagnostic-client-data-hooks.o): Add TARGET_NAME.
* common.opt (fdiagnostics-format=): Add sarif-stderr and sarif-file.
(sarif-stderr, sarif-file): New enum values.
* diagnostic-client-data-hooks.h: New file.
* diagnostic-format-sarif.cc: New file.
* diagnostic-path.h (enum diagnostic_event::verb): New enum.
(enum diagnostic_event::noun): New enum.
(enum diagnostic_event::property): New enum.
(struct diagnostic_event::meaning): New struct.
(diagnostic_event::get_logical_location): New vfunc.
(diagnostic_event::get_meaning): New vfunc.
(simple_diagnostic_event::get_logical_location): New vfunc impl.
(simple_diagnostic_event::get_meaning): New vfunc impl.
* diagnostic.cc: Include "diagnostic-client-data-hooks.h".
(diagnostic_initialize): Initialize m_client_data_hooks.
(diagnostic_finish): Clean up m_client_data_hooks.
(diagnostic_event::meaning::dump_to_pp): New.
(diagnostic_event::meaning::maybe_get_verb_str): New.
(diagnostic_event::meaning::maybe_get_noun_str): New.
(diagnostic_event::meaning::maybe_get_property_str): New.
(get_cwe_url): Make non-static.
(diagnostic_output_format_init): Handle
DIAGNOSTICS_OUTPUT_FORMAT_SARIF_STDERR and
DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE.
* diagnostic.h (enum diagnostics_output_format): Add
DIAGNOSTICS_OUTPUT_FORMAT_SARIF_STDERR and
DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE.
(class diagnostic_client_data_hooks): New forward decl.
(class logical_location): New forward decl.
(diagnostic_context::m_client_data_hooks): New field.
(diagnostic_output_format_init_sarif_stderr): New decl.
(diagnostic_output_format_init_sarif_file): New decl.
(get_cwe_url): New decl.
* doc/invoke.texi (-fdiagnostics-format=): Add sarif-stderr and
sarif-file.
* doc/sourcebuild.texi (Scan a particular file): Add
scan-sarif-file and scan-sarif-file-not.
* langhooks-def.h (lhd_get_sarif_source_language): New decl.
(LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): New macro.
(LANG_HOOKS_INITIALIZER): Add
LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE.
* langhooks.cc (lhd_get_sarif_source_language): New.
* langhooks.h (lang_hooks::get_sarif_source_language): New field.
* logical-location.h: New file.
* plugin.cc (struct for_each_plugin_closure): New.
(for_each_plugin_cb): New.
(for_each_plugin): New.
* plugin.h (for_each_plugin): New decl.
* tree-diagnostic-client-data-hooks.cc: New file.
* tree-diagnostic.cc: Include "diagnostic-client-data-hooks.h".
(tree_diagnostics_defaults): Populate m_client_data_hooks.
* tree-logical-location.cc: New file.
* tree-logical-location.h: New file.
gcc/ada/ChangeLog:
* gcc-interface/misc.cc (gnat_get_sarif_source_language): New.
(LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.
gcc/analyzer/ChangeLog:
* checker-path.cc (checker_event::get_meaning): New.
(function_entry_event::get_meaning): New.
(state_change_event::get_desc): Add dump of meaning of the event
to the -fanalyzer-verbose-state-changes output.
(state_change_event::get_meaning): New.
(cfg_edge_event::get_meaning): New.
(call_event::get_meaning): New.
(return_event::get_meaning): New.
(start_consolidated_cfg_edges_event::get_meaning): New.
(warning_event::get_meaning): New.
* checker-path.h: Include "tree-logical-location.h".
(checker_event::checker_event): Construct m_logical_loc.
(checker_event::get_logical_location): New.
(checker_event::get_meaning): New decl.
(checker_event::m_logical_loc): New.
(function_entry_event::get_meaning): New decl.
(state_change_event::get_meaning): New decl.
(cfg_edge_event::get_meaning): New decl.
(call_event::get_meaning): New decl.
(return_event::get_meaning): New decl.
(start_consolidated_cfg_edges_event::get_meaning): New.
(warning_event::get_meaning): New decl.
* pending-diagnostic.h: Include "diagnostic-path.h".
(pending_diagnostic::get_meaning_for_state_change): New vfunc.
* sm-file.cc (file_diagnostic::get_meaning_for_state_change): New
vfunc impl.
* sm-malloc.cc (malloc_diagnostic::get_meaning_for_state_change):
Likewise.
* sm-sensitive.cc
(exposure_through_output_file::get_meaning_for_state_change):
Likewise.
* sm-taint.cc (taint_diagnostic::get_meaning_for_state_change):
Likewise.
* varargs.cc
(va_list_sm_diagnostic::get_meaning_for_state_change): Likewise.
gcc/c/ChangeLog:
* c-lang.cc (LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.
(c_get_sarif_source_language): New.
* c-tree.h (c_get_sarif_source_language): New decl.
gcc/cp/ChangeLog:
* cp-lang.cc (LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.
(cp_get_sarif_source_language): New.
gcc/d/ChangeLog:
* d-lang.cc (d_get_sarif_source_language): New.
(LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.
gcc/fortran/ChangeLog:
* f95-lang.cc (gfc_get_sarif_source_language): New.
(LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.
gcc/go/ChangeLog:
* go-lang.cc (go_get_sarif_source_language): New.
(LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.
gcc/objc/ChangeLog:
* objc-act.h (objc_get_sarif_source_language): New decl.
* objc-lang.cc (LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.
(objc_get_sarif_source_language): New.
gcc/testsuite/ChangeLog:
* c-c++-common/diagnostic-format-sarif-file-1.c: New test.
* c-c++-common/diagnostic-format-sarif-file-2.c: New test.
* c-c++-common/diagnostic-format-sarif-file-3.c: New test.
* c-c++-common/diagnostic-format-sarif-file-4.c: New test.
* gcc.dg/analyzer/file-meaning-1.c: New test.
* gcc.dg/analyzer/malloc-meaning-1.c: New test.
* gcc.dg/analyzer/malloc-sarif-1.c: New test.
* gcc.dg/plugin/analyzer_gil_plugin.c
(gil_diagnostic::get_meaning_for_state_change): New vfunc impl.
* gcc.dg/plugin/diagnostic-test-paths-5.c: New test.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
diagnostic-test-paths-5.c to tests for
diagnostic_plugin_test_paths.c.
* lib/gcc-dg.exp: Load scansarif.exp.
* lib/scansarif.exp: New test.
libatomic/ChangeLog:
* testsuite/lib/libatomic.exp: Add load_gcc_lib of scansarif.exp.
libgomp/ChangeLog:
* testsuite/lib/libgomp.exp: Add load_gcc_lib of scansarif.exp.
libitm/ChangeLog:
* testsuite/lib/libitm.exp: Add load_gcc_lib of scansarif.exp.
libphobos/ChangeLog:
* testsuite/lib/libphobos-dg.exp: Add load_gcc_lib of scansarif.exp.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Diffstat (limited to 'gcc')
48 files changed, 3001 insertions, 13 deletions
diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 020b3b1..b6dcc45 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1617,6 +1617,7 @@ OBJS = \ tree-data-ref.o \ tree-dfa.o \ tree-diagnostic.o \ + tree-diagnostic-client-data-hooks.o \ tree-diagnostic-path.o \ tree-dump.o \ tree-eh.o \ @@ -1625,6 +1626,7 @@ OBJS = \ tree-inline.o \ tree-into-ssa.o \ tree-iterator.o \ + tree-logical-location.o \ tree-loop-distribution.o \ tree-nested.o \ tree-nrv.o \ @@ -1728,9 +1730,12 @@ OBJS = \ # Objects in libcommon.a, potentially used by all host binaries and with # no target dependencies. OBJS-libcommon = diagnostic-spec.o diagnostic.o diagnostic-color.o \ - diagnostic-show-locus.o diagnostic-format-json.o json.o \ + diagnostic-format-json.o \ + diagnostic-format-sarif.o \ + diagnostic-show-locus.o \ edit-context.o \ pretty-print.o intl.o \ + json.o \ sbitmap.o \ vec.o input.o hash-table.o ggc-none.o memory-block.o \ selftest.o selftest-diagnostic.o sort.o @@ -2368,6 +2373,7 @@ s-bversion: BASE-VER $(STAMP) s-bversion CFLAGS-toplev.o += -DTARGET_NAME=\"$(target_noncanonical)\" +CFLAGS-tree-diagnostic-client-data-hooks.o += -DTARGET_NAME=\"$(target_noncanonical)\" CFLAGS-optinfo-emit-json.o += -DTARGET_NAME=\"$(target_noncanonical)\" $(ZLIBINC) CFLAGS-analyzer/engine.o += $(ZLIBINC) diff --git a/gcc/ada/gcc-interface/misc.cc b/gcc/ada/gcc-interface/misc.cc index 7824ebf..f0ca197 100644 --- a/gcc/ada/gcc-interface/misc.cc +++ b/gcc/ada/gcc-interface/misc.cc @@ -1292,6 +1292,15 @@ gnat_eh_personality (void) return gnat_eh_personality_decl; } +/* Get a value for the SARIF v2.1.0 "artifact.sourceLanguage" property, + based on the list in SARIF v2.1.0 Appendix J. */ + +static const char * +gnat_get_sarif_source_language (const char *) +{ + return "ada"; +} + /* Initialize language-specific bits of tree_contains_struct. 
*/ static void @@ -1414,6 +1423,8 @@ get_lang_specific (tree node) #define LANG_HOOKS_DEEP_UNSHARING true #undef LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS #define LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS true +#undef LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE +#define LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE gnat_get_sarif_source_language struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER; diff --git a/gcc/analyzer/checker-path.cc b/gcc/analyzer/checker-path.cc index 5fdbc38..8aa5bf7 100644 --- a/gcc/analyzer/checker-path.cc +++ b/gcc/analyzer/checker-path.cc @@ -112,6 +112,15 @@ event_kind_to_string (enum event_kind ek) /* class checker_event : public diagnostic_event. */ +/* No-op implementation of diagnostic_event::get_meaning vfunc for + checker_event: checker events have no meaning by default. */ + +diagnostic_event::meaning +checker_event::get_meaning () const +{ + return meaning (); +} + /* Dump this event to PP (for debugging/logging purposes). */ void @@ -242,6 +251,15 @@ function_entry_event::get_desc (bool can_colorize) const return make_label_text (can_colorize, "entry to %qE", m_fndecl); } +/* Implementation of diagnostic_event::get_meaning vfunc for + function entry. */ + +diagnostic_event::meaning +function_entry_event::get_meaning () const +{ + return meaning (VERB_enter, NOUN_function); +} + /* class state_change_event : public checker_event. */ /* state_change_event's ctor. */ @@ -292,25 +310,33 @@ state_change_event::get_desc (bool can_colorize) const { if (flag_analyzer_verbose_state_changes) { + /* Get any "meaning" of event. */ + diagnostic_event::meaning meaning = get_meaning (); + pretty_printer meaning_pp; + meaning.dump_to_pp (&meaning_pp); + /* Append debug version. 
*/ label_text result; if (m_origin) result = make_label_text (can_colorize, - "%s (state of %qE: %qs -> %qs, origin: %qE)", + "%s (state of %qE: %qs -> %qs, origin: %qE, meaning: %s)", custom_desc.m_buffer, var, m_from->get_name (), m_to->get_name (), - origin); + origin, + pp_formatted_text (&meaning_pp)); else result = make_label_text (can_colorize, - "%s (state of %qE: %qs -> %qs, NULL origin)", + "%s (state of %qE: %qs -> %qs, NULL origin, meaning: %s)", custom_desc.m_buffer, var, m_from->get_name (), - m_to->get_name ()); + m_to->get_name (), + pp_formatted_text (&meaning_pp)); + custom_desc.maybe_free (); return result; } @@ -357,6 +383,26 @@ state_change_event::get_desc (bool can_colorize) const } } +/* Implementation of diagnostic_event::get_meaning vfunc for + state change events: delegate to the pending_diagnostic to + get any meaning. */ + +diagnostic_event::meaning +state_change_event::get_meaning () const +{ + if (m_pending_diagnostic) + { + region_model *model = m_dst_state.m_region_model; + tree var = model->get_representative_tree (m_sval); + tree origin = model->get_representative_tree (m_origin); + return m_pending_diagnostic->get_meaning_for_state_change + (evdesc::state_change (false, var, origin, + m_from, m_to, m_emission_id, *this)); + } + else + return meaning (); +} + /* class superedge_event : public checker_event. */ /* Get the callgraph_superedge for this superedge_event, which must be @@ -432,6 +478,21 @@ cfg_edge_event::cfg_edge_event (enum event_kind kind, gcc_assert (eedge.m_sedge->m_kind == SUPEREDGE_CFG_EDGE); } +/* Implementation of diagnostic_event::get_meaning vfunc for + CFG edge events. 
*/ + +diagnostic_event::meaning +cfg_edge_event::get_meaning () const +{ + const cfg_superedge& cfg_sedge = get_cfg_superedge (); + if (cfg_sedge.true_value_p ()) + return meaning (VERB_branch, PROPERTY_true); + else if (cfg_sedge.false_value_p ()) + return meaning (VERB_branch, PROPERTY_false); + else + return meaning (); +} + /* class start_cfg_edge_event : public cfg_edge_event. */ /* Implementation of diagnostic_event::get_desc vfunc for @@ -690,6 +751,15 @@ call_event::get_desc (bool can_colorize) const get_caller_fndecl ()); } +/* Implementation of diagnostic_event::get_meaning vfunc for + function call events. */ + +diagnostic_event::meaning +call_event::get_meaning () const +{ + return meaning (VERB_call, NOUN_function); +} + /* Override of checker_event::is_call_p for calls. */ bool @@ -760,6 +830,15 @@ return_event::get_desc (bool can_colorize) const m_src_snode->m_fun->decl); } +/* Implementation of diagnostic_event::get_meaning vfunc for + function return events. */ + +diagnostic_event::meaning +return_event::get_meaning () const +{ + return meaning (VERB_return, NOUN_function); +} + /* Override of checker_event::is_return_p for returns. */ bool @@ -778,6 +857,16 @@ start_consolidated_cfg_edges_event::get_desc (bool can_colorize) const m_edge_sense ? "true" : "false"); } +/* Implementation of diagnostic_event::get_meaning vfunc for + start_consolidated_cfg_edges_event. */ + +diagnostic_event::meaning +start_consolidated_cfg_edges_event::get_meaning () const +{ + return meaning (VERB_branch, + (m_edge_sense ? PROPERTY_true : PROPERTY_false)); +} + /* class setjmp_event : public checker_event. */ /* Implementation of diagnostic_event::get_desc vfunc for @@ -977,6 +1066,15 @@ warning_event::get_desc (bool can_colorize) const return label_text::borrow ("here"); } +/* Implementation of diagnostic_event::get_meaning vfunc for + warning_event. 
*/ + +diagnostic_event::meaning +warning_event::get_meaning () const +{ + return meaning (VERB_danger, NOUN_unknown); +} + /* Print a single-line representation of this path to PP. */ void diff --git a/gcc/analyzer/checker-path.h b/gcc/analyzer/checker-path.h index fd274e5..8960d56 100644 --- a/gcc/analyzer/checker-path.h +++ b/gcc/analyzer/checker-path.h @@ -21,6 +21,8 @@ along with GCC; see the file COPYING3. If not see #ifndef GCC_ANALYZER_CHECKER_PATH_H #define GCC_ANALYZER_CHECKER_PATH_H +#include "tree-logical-location.h" + namespace ana { /* An enum for discriminating between the concrete subclasses of @@ -85,7 +87,8 @@ public: checker_event (enum event_kind kind, location_t loc, tree fndecl, int depth) : m_kind (kind), m_loc (loc), m_fndecl (fndecl), m_depth (depth), - m_pending_diagnostic (NULL), m_emission_id () + m_pending_diagnostic (NULL), m_emission_id (), + m_logical_loc (fndecl) { } @@ -94,6 +97,14 @@ public: location_t get_location () const final override { return m_loc; } tree get_fndecl () const final override { return m_fndecl; } int get_stack_depth () const final override { return m_depth; } + const logical_location *get_logical_location () const final override + { + if (m_fndecl) + return &m_logical_loc; + else + return NULL; + } + meaning get_meaning () const override; /* Additional functionality. 
*/ @@ -122,6 +133,7 @@ public: int m_depth; pending_diagnostic *m_pending_diagnostic; diagnostic_event_id_t m_emission_id; // only set once all pruning has occurred + tree_logical_location m_logical_loc; }; /* A concrete event subclass for a purely textual event, for use in @@ -222,6 +234,7 @@ public: } label_text get_desc (bool can_colorize) const final override; + meaning get_meaning () const override; bool is_function_entry_p () const final override { return true; } }; @@ -241,6 +254,7 @@ public: const program_state &dst_state); label_text get_desc (bool can_colorize) const final override; + meaning get_meaning () const override; function *get_dest_function () const { @@ -295,6 +309,8 @@ public: class cfg_edge_event : public superedge_event { public: + meaning get_meaning () const override; + const cfg_superedge& get_cfg_superedge () const; protected: @@ -353,6 +369,7 @@ public: location_t loc, tree fndecl, int depth); label_text get_desc (bool can_colorize) const override; + meaning get_meaning () const override; bool is_call_p () const final override; @@ -373,6 +390,7 @@ public: location_t loc, tree fndecl, int depth); label_text get_desc (bool can_colorize) const final override; + meaning get_meaning () const override; bool is_return_p () const final override; @@ -394,6 +412,7 @@ public: } label_text get_desc (bool can_colorize) const final override; + meaning get_meaning () const override; private: bool m_edge_sense; @@ -521,6 +540,7 @@ public: } label_text get_desc (bool can_colorize) const final override; + meaning get_meaning () const override; private: const state_machine *m_sm; diff --git a/gcc/analyzer/pending-diagnostic.h b/gcc/analyzer/pending-diagnostic.h index a273f89..9e1c656 100644 --- a/gcc/analyzer/pending-diagnostic.h +++ b/gcc/analyzer/pending-diagnostic.h @@ -21,6 +21,8 @@ along with GCC; see the file COPYING3. 
If not see #ifndef GCC_ANALYZER_PENDING_DIAGNOSTIC_H #define GCC_ANALYZER_PENDING_DIAGNOSTIC_H +#include "diagnostic-path.h" + namespace ana { /* A bundle of information about things that are of interest to a @@ -232,6 +234,15 @@ class pending_diagnostic return label_text (); } + /* Vfunc for implementing diagnostic_event::get_meaning for + state_change_event. */ + virtual diagnostic_event::meaning + get_meaning_for_state_change (const evdesc::state_change &) const + { + /* Default no-op implementation. */ + return diagnostic_event::meaning (); + } + /* Precision-of-wording vfunc for describing an interprocedural call carrying critial state for the diagnostic, from caller to callee. diff --git a/gcc/analyzer/sm-file.cc b/gcc/analyzer/sm-file.cc index e9b5b8b..8514af1 100644 --- a/gcc/analyzer/sm-file.cc +++ b/gcc/analyzer/sm-file.cc @@ -143,6 +143,20 @@ public: return label_text (); } + diagnostic_event::meaning + get_meaning_for_state_change (const evdesc::state_change &change) + const final override + { + if (change.m_old_state == m_sm.get_start_state () + && change.m_new_state == m_sm.m_unchecked) + return diagnostic_event::meaning (diagnostic_event::VERB_acquire, + diagnostic_event::NOUN_resource); + if (change.m_new_state == m_sm.m_closed) + return diagnostic_event::meaning (diagnostic_event::VERB_release, + diagnostic_event::NOUN_resource); + return diagnostic_event::meaning (); + } + protected: const fileptr_state_machine &m_sm; tree m_arg; diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc index 3c0f890..3bd4042 100644 --- a/gcc/analyzer/sm-malloc.cc +++ b/gcc/analyzer/sm-malloc.cc @@ -736,6 +736,20 @@ public: return label_text (); } + diagnostic_event::meaning + get_meaning_for_state_change (const evdesc::state_change &change) + const final override + { + if (change.m_old_state == m_sm.get_start_state () + && unchecked_p (change.m_new_state)) + return diagnostic_event::meaning (diagnostic_event::VERB_acquire, + 
diagnostic_event::NOUN_memory); + if (freed_p (change.m_new_state)) + return diagnostic_event::meaning (diagnostic_event::VERB_release, + diagnostic_event::NOUN_memory); + return diagnostic_event::meaning (); + } + protected: const malloc_state_machine &m_sm; tree m_arg; diff --git a/gcc/analyzer/sm-sensitive.cc b/gcc/analyzer/sm-sensitive.cc index 20809dd..83c1906 100644 --- a/gcc/analyzer/sm-sensitive.cc +++ b/gcc/analyzer/sm-sensitive.cc @@ -117,6 +117,15 @@ public: return label_text (); } + diagnostic_event::meaning + get_meaning_for_state_change (const evdesc::state_change &change) + const final override + { + if (change.m_new_state == m_sm.m_sensitive) + return diagnostic_event::meaning (diagnostic_event::VERB_acquire, + diagnostic_event::NOUN_sensitive); + return diagnostic_event::meaning (); + } label_text describe_call_with_state (const evdesc::call_with_state &info) final override { diff --git a/gcc/analyzer/sm-taint.cc b/gcc/analyzer/sm-taint.cc index 3aaa69a..d2d03c3 100644 --- a/gcc/analyzer/sm-taint.cc +++ b/gcc/analyzer/sm-taint.cc @@ -163,6 +163,17 @@ public: change.m_expr); return label_text (); } + + diagnostic_event::meaning + get_meaning_for_state_change (const evdesc::state_change &change) + const final override + { + if (change.m_new_state == m_sm.m_tainted) + return diagnostic_event::meaning (diagnostic_event::VERB_acquire, + diagnostic_event::NOUN_taint); + return diagnostic_event::meaning (); + } + protected: const taint_state_machine &m_sm; tree m_arg; diff --git a/gcc/analyzer/varargs.cc b/gcc/analyzer/varargs.cc index 2d27484..846a0b1 100644 --- a/gcc/analyzer/varargs.cc +++ b/gcc/analyzer/varargs.cc @@ -335,6 +335,19 @@ public: return label_text (); } + diagnostic_event::meaning + get_meaning_for_state_change (const evdesc::state_change &change) + const final override + { + if (change.m_new_state == m_sm.m_started) + return diagnostic_event::meaning (diagnostic_event::VERB_acquire, + diagnostic_event::NOUN_resource); + if 
(change.m_new_state == m_sm.m_ended) + return diagnostic_event::meaning (diagnostic_event::VERB_release, + diagnostic_event::NOUN_resource); + return diagnostic_event::meaning (); + } + protected: va_list_sm_diagnostic (const va_list_state_machine &sm, const svalue *ap_sval, tree ap_tree) diff --git a/gcc/c/c-lang.cc b/gcc/c/c-lang.cc index eecc0a0..0e67045 100644 --- a/gcc/c/c-lang.cc +++ b/gcc/c/c-lang.cc @@ -46,9 +46,21 @@ enum c_language_kind c_language = clk_c; #undef LANG_HOOKS_GET_SUBSTRING_LOCATION #define LANG_HOOKS_GET_SUBSTRING_LOCATION c_get_substring_location +#undef LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE +#define LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE c_get_sarif_source_language + /* Each front end provides its own lang hook initializer. */ struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER; +/* Get a value for the SARIF v2.1.0 "artifact.sourceLanguage" property, + based on the list in SARIF v2.1.0 Appendix J. */ + +const char * +c_get_sarif_source_language (const char *) +{ + return "c"; +} + #if CHECKING_P namespace selftest { diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h index 3b322ad..e655afd 100644 --- a/gcc/c/c-tree.h +++ b/gcc/c/c-tree.h @@ -837,6 +837,8 @@ set_c_expr_source_range (c_expr *expr, /* In c-fold.cc */ extern vec<tree> incomplete_record_decls; +extern const char *c_get_sarif_source_language (const char *filename); + #if CHECKING_P namespace selftest { extern void run_c_tests (void); diff --git a/gcc/common.opt b/gcc/common.opt index 3237ce9..7ca0cce 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -1390,7 +1390,7 @@ Common Joined RejectNegative UInteger fdiagnostics-format= Common Joined RejectNegative Enum(diagnostics_output_format) --fdiagnostics-format=[text|json|json-stderr|json-file] Select output format. +-fdiagnostics-format=[text|sarif-stderr|sarif-file|json|json-stderr|json-file] Select output format. 
fdiagnostics-escape-format= Common Joined RejectNegative Enum(diagnostics_escape_format) @@ -1433,6 +1433,12 @@ Enum(diagnostics_output_format) String(json-stderr) Value(DIAGNOSTICS_OUTPUT_FOR EnumValue Enum(diagnostics_output_format) String(json-file) Value(DIAGNOSTICS_OUTPUT_FORMAT_JSON_FILE) +EnumValue +Enum(diagnostics_output_format) String(sarif-stderr) Value(DIAGNOSTICS_OUTPUT_FORMAT_SARIF_STDERR) + +EnumValue +Enum(diagnostics_output_format) String(sarif-file) Value(DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE) + fdiagnostics-parseable-fixits Common Var(flag_diagnostics_parseable_fixits) Print fix-it hints in machine-readable form. diff --git a/gcc/cp/cp-lang.cc b/gcc/cp/cp-lang.cc index 7c8b947..c3cfde5 100644 --- a/gcc/cp/cp-lang.cc +++ b/gcc/cp/cp-lang.cc @@ -36,6 +36,7 @@ static tree get_template_argument_pack_elems_folded (const_tree); static tree cxx_enum_underlying_base_type (const_tree); static tree *cxx_omp_get_decl_init (tree); static void cxx_omp_finish_decl_inits (void); +static const char *cp_get_sarif_source_language (const char *); /* Lang hooks common to C++ and ObjC++ are declared in cp/cp-objcp-common.h; consequently, there should be very few hooks below. */ @@ -100,6 +101,9 @@ static void cxx_omp_finish_decl_inits (void); #undef LANG_HOOKS_OMP_FINISH_DECL_INITS #define LANG_HOOKS_OMP_FINISH_DECL_INITS cxx_omp_finish_decl_inits +#undef LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE +#define LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE cp_get_sarif_source_language + /* Each front end provides its own lang hook initializer. */ struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER; @@ -265,6 +269,15 @@ cxx_omp_finish_decl_inits (void) dynamic_initializers = NULL; } +/* Get a value for the SARIF v2.1.0 "artifact.sourceLanguage" property, + based on the list in SARIF v2.1.0 Appendix J. 
*/ + +static const char * +cp_get_sarif_source_language (const char *) +{ + return "cplusplus"; +} + #if CHECKING_P namespace selftest { diff --git a/gcc/d/d-lang.cc b/gcc/d/d-lang.cc index b7c8685..6e4350f 100644 --- a/gcc/d/d-lang.cc +++ b/gcc/d/d-lang.cc @@ -1933,6 +1933,15 @@ d_enum_underlying_base_type (const_tree type) return TREE_TYPE (type); } +/* Get a value for the SARIF v2.1.0 "artifact.sourceLanguage" property, + based on the list in SARIF v2.1.0 Appendix J. */ + +static const char * +d_get_sarif_source_language (const char *) +{ + return "d"; +} + /* Definitions for our language-specific hooks. */ #undef LANG_HOOKS_NAME @@ -1966,6 +1975,7 @@ d_enum_underlying_base_type (const_tree type) #undef LANG_HOOKS_TYPE_FOR_MODE #undef LANG_HOOKS_TYPE_FOR_SIZE #undef LANG_HOOKS_TYPE_PROMOTES_TO +#undef LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE #define LANG_HOOKS_NAME "GNU D" #define LANG_HOOKS_INIT d_init @@ -1998,6 +2008,7 @@ d_enum_underlying_base_type (const_tree type) #define LANG_HOOKS_TYPE_FOR_MODE d_type_for_mode #define LANG_HOOKS_TYPE_FOR_SIZE d_type_for_size #define LANG_HOOKS_TYPE_PROMOTES_TO d_type_promotes_to +#define LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE d_get_sarif_source_language struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER; diff --git a/gcc/diagnostic-client-data-hooks.h b/gcc/diagnostic-client-data-hooks.h new file mode 100644 index 0000000..ba78546 --- /dev/null +++ b/gcc/diagnostic-client-data-hooks.h @@ -0,0 +1,105 @@ +/* Additional metadata about a client for a diagnostic context. + Copyright (C) 2022 Free Software Foundation, Inc. + Contributed by David Malcolm <dmalcolm@redhat.com> + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. 
+ +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + +#ifndef GCC_DIAGNOSTIC_CLIENT_DATA_HOOKS_H +#define GCC_DIAGNOSTIC_CLIENT_DATA_HOOKS_H + +class client_version_info; + +/* A bundle of additional metadata, owned by the diagnostic_context, + for querying things about the client, like version data. */ + +class diagnostic_client_data_hooks +{ + public: + virtual ~diagnostic_client_data_hooks () {} + + /* Get version info for this client, or NULL. */ + virtual const client_version_info *get_any_version_info () const = 0; + + /* Get the current logical_location for this client, or NULL. */ + virtual const logical_location *get_current_logical_location () const = 0; + + /* Get a sourceLanguage value for FILENAME, or return NULL. + See SARIF v2.1.0 Appendix J for suggested values. */ + virtual const char * + maybe_get_sarif_source_language (const char *filename) const = 0; +}; + +/* Factory function for making an instance of diagnostic_client_data_hooks + for use in the compiler (i.e. with knowledge of "tree", access to + langhooks, etc). */ + +extern diagnostic_client_data_hooks *make_compiler_data_hooks (); + +class diagnostic_client_plugin_info; + +/* Abstract base class for a diagnostic_context to get at + version information about the client. */ + +class client_version_info +{ +public: + class plugin_visitor + { + public: + virtual void on_plugin (const diagnostic_client_plugin_info &) = 0; + }; + + virtual ~client_version_info () {} + + /* Get a string suitable for use as the value of the "name" property + (SARIF v2.1.0 section 3.19.8). 
*/ + virtual const char *get_tool_name () const = 0; + + /* Create a string suitable for use as the value of the "fullName" property + (SARIF v2.1.0 section 3.19.9). */ + virtual char *maybe_make_full_name () const = 0; + + /* Get a string suitable for use as the value of the "version" property + (SARIF v2.1.0 section 3.19.13). */ + virtual const char *get_version_string () const = 0; + + /* Create a string suitable for use as the value of the "informationUri" + property (SARIF v2.1.0 section 3.19.17). */ + virtual char *maybe_make_version_url () const = 0; + + virtual void for_each_plugin (plugin_visitor &v) const = 0; +}; + +/* Abstract base class for a diagnostic_context to get at + information about a specific plugin within a client. */ + +class diagnostic_client_plugin_info +{ +public: + /* For use e.g. by SARIF "name" property (SARIF v2.1.0 section 3.19.8). */ + virtual const char *get_short_name () const = 0; + + /* For use e.g. by SARIF "fullName" property + (SARIF v2.1.0 section 3.19.9). */ + virtual const char *get_full_name () const = 0; + + /* For use e.g. by SARIF "version" property + (SARIF v2.1.0 section 3.19.13). */ + virtual const char *get_version () const = 0; +}; + +#endif /* ! GCC_DIAGNOSTIC_CLIENT_DATA_HOOKS_H */ diff --git a/gcc/diagnostic-format-sarif.cc b/gcc/diagnostic-format-sarif.cc new file mode 100644 index 0000000..0c33179 --- /dev/null +++ b/gcc/diagnostic-format-sarif.cc @@ -0,0 +1,1586 @@ +/* SARIF output for diagnostics + Copyright (C) 2018-2022 Free Software Foundation, Inc. + Contributed by David Malcolm <dmalcolm@redhat.com>. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. 
+ +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "diagnostic.h" +#include "diagnostic-metadata.h" +#include "diagnostic-path.h" +#include "json.h" +#include "cpplib.h" +#include "logical-location.h" +#include "diagnostic-client-data-hooks.h" + +class sarif_builder; + +/* Subclass of json::object for SARIF result objects + (SARIF v2.1.0 section 3.27). */ + +class sarif_result : public json::object +{ +public: + sarif_result () : m_related_locations_arr (NULL) {} + + void + on_nested_diagnostic (diagnostic_context *context, + diagnostic_info *diagnostic, + diagnostic_t orig_diag_kind, + sarif_builder *builder); + +private: + json::array *m_related_locations_arr; +}; + +/* A class for managing SARIF output (for -fdiagnostics-format=sarif-stderr + and -fdiagnostics-format=sarif-file). + + As diagnostics occur, we build "result" JSON objects, and + accumulate state: + - which source files are referenced + - which warnings are emitted + - which CWEs are used + + At the end of the compile, we use the above to build the full SARIF + object tree, adding the result objects to the correct place, and + creating objects for the various source files, warnings and CWEs + referenced. + + Implemented: + - fix-it hints + - CWE metadata + - diagnostic groups (see limitations below) + - logical locations (e.g. cfun) + + Known limitations: + - GCC supports one-deep nesting of diagnostics (via auto_diagnostic_group), + but we only capture location and message information from such nested + diagnostics (e.g. 
we ignore fix-it hints on them) + - doesn't yet capture command-line arguments: would be run.invocations + property (SARIF v2.1.0 section 3.14.11), as invocation objects + (SARIF v2.1.0 section 3.20), but we'd want to capture the arguments to + toplev::main, and the response files. + - doesn't capture escape_on_output_p + - doesn't capture secondary locations within a rich_location + (perhaps we should use the "relatedLocations" property: SARIF v2.1.0 + section 3.27.22) + - doesn't capture "artifact.encoding" property + (SARIF v2.1.0 section 3.24.9). + - doesn't capture hashes of the source files + ("artifact.hashes" property (SARIF v2.1.0 section 3.24.11). + - doesn't capture the "analysisTarget" property + (SARIF v2.1.0 section 3.27.13). + - doesn't capture labelled ranges + - doesn't capture -Werror cleanly + - doesn't capture inlining information (can SARIF handle this?) + - doesn't capture macro expansion information (can SARIF handle this?). */ + +class sarif_builder +{ +public: + sarif_builder (diagnostic_context *context); + + void end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic, + diagnostic_t orig_diag_kind); + + void end_group (); + + void flush_to_file (FILE *outf); + + json::object *make_location_object (const rich_location &rich_loc, + const logical_location *logical_loc); + json::object *make_message_object (const char *msg) const; + +private: + sarif_result *make_result_object (diagnostic_context *context, + diagnostic_info *diagnostic, + diagnostic_t orig_diag_kind); + void set_any_logical_locs_arr (json::object *location_obj, + const logical_location *logical_loc); + json::object *make_location_object (const diagnostic_event &event); + json::object * + make_logical_location_object (const logical_location &logical_loc) const; + json::object *make_code_flow_object (const diagnostic_path &path); + json::object *make_thread_flow_object (const diagnostic_path &path); + json::object * + make_thread_flow_location_object (const 
diagnostic_event &event); + json::array *maybe_make_kinds_array (diagnostic_event::meaning m) const; + json::object *maybe_make_physical_location_object (location_t loc); + json::object *make_artifact_location_object (location_t loc); + json::object *make_artifact_location_object (const char *filename); + json::object *make_artifact_location_object_for_pwd () const; + json::object *maybe_make_region_object (location_t loc) const; + json::object *maybe_make_region_object_for_context (location_t loc) const; + json::object *make_region_object_for_hint (const fixit_hint &hint) const; + json::object *make_multiformat_message_string (const char *msg) const; + json::object *make_top_level_object (json::array *results); + json::object *make_run_object (json::array *results); + json::object *make_tool_object () const; + json::object *make_driver_tool_component_object () const; + json::array *maybe_make_taxonomies_array () const; + json::object *maybe_make_cwe_taxonomy_object () const; + json::object *make_tool_component_reference_object_for_cwe () const; + json::object * + make_reporting_descriptor_object_for_warning (diagnostic_context *context, + diagnostic_info *diagnostic, + diagnostic_t orig_diag_kind, + const char *option_text); + json::object *make_reporting_descriptor_object_for_cwe_id (int cwe_id) const; + json::object * + make_reporting_descriptor_reference_object_for_cwe_id (int cwe_id); + json::object *make_artifact_object (const char *filename); + json::object *maybe_make_artifact_content_object (const char *filename) const; + json::object *maybe_make_artifact_content_object (const char *filename, + int start_line, + int end_line) const; + json::object *make_fix_object (const rich_location &rich_loc); + json::object *make_artifact_change_object (const rich_location &richloc); + json::object *make_replacement_object (const fixit_hint &hint) const; + json::object *make_artifact_content_object (const char *text) const; + int get_sarif_column (expanded_location 
exploc) const; + + diagnostic_context *m_context; + + /* The JSON array of pending diagnostics. */ + json::array *m_results_array; + + /* The JSON object for the result object (if any) in the current + diagnostic group. */ + sarif_result *m_cur_group_result; + + hash_set <const char *> m_filenames; + bool m_seen_any_relative_paths; + hash_set <free_string_hash> m_rule_id_set; + json::array *m_rules_arr; + + /* The set of all CWE IDs we've seen, if any. */ + hash_set <int_hash <int, 0, 1> > m_cwe_id_set; + + int m_tabstop; +}; + +static sarif_builder *the_builder; + +/* class sarif_result : public json::object. */ + +/* Handle secondary diagnostics that occur within a diagnostic group. + The closest SARIF seems to have to nested diagnostics is the + "relatedLocations" property of result objects (SARIF v2.1.0 section 3.27.22), + so we lazily set this property and populate the array if and when + secondary diagnostics occur (such as notes to a warning). */ + +void +sarif_result::on_nested_diagnostic (diagnostic_context *context, + diagnostic_info *diagnostic, + diagnostic_t /*orig_diag_kind*/, + sarif_builder *builder) +{ + if (!m_related_locations_arr) + { + m_related_locations_arr = new json::array (); + set ("relatedLocations", m_related_locations_arr); + } + + /* We don't yet generate meaningful logical locations for notes; + sometimes these will relate to current_function_decl, but + often they won't. */ + json::object *location_obj + = builder->make_location_object (*diagnostic->richloc, NULL); + json::object *message_obj + = builder->make_message_object (pp_formatted_text (context->printer)); + pp_clear_output_area (context->printer); + location_obj->set ("message", message_obj); + + m_related_locations_arr->append (location_obj); +} + +/* class sarif_builder. */ + +/* sarif_builder's ctor. 
*/ + +sarif_builder::sarif_builder (diagnostic_context *context) +: m_context (context), + m_results_array (new json::array ()), + m_cur_group_result (NULL), + m_seen_any_relative_paths (false), + m_rule_id_set (), + m_rules_arr (new json::array ()), + m_tabstop (context->tabstop) +{ +} + +/* Implementation of "end_diagnostic" for SARIF output. */ + +void +sarif_builder::end_diagnostic (diagnostic_context *context, + diagnostic_info *diagnostic, + diagnostic_t orig_diag_kind) +{ + + if (m_cur_group_result) + /* Nested diagnostic. */ + m_cur_group_result->on_nested_diagnostic (context, + diagnostic, + orig_diag_kind, + this); + else + { + /* Top-level diagnostic. */ + sarif_result *result_obj + = make_result_object (context, diagnostic, orig_diag_kind); + m_results_array->append (result_obj); + m_cur_group_result = result_obj; + } +} + +/* Implementation of "end_group_cb" for SARIF output. */ + +void +sarif_builder::end_group () +{ + m_cur_group_result = NULL; +} + +/* Create a top-level object, and add to it all the results + (and other entities) we've seen so far. + + Flush it all to OUTF. */ + +void +sarif_builder::flush_to_file (FILE *outf) +{ + json::object *top = make_top_level_object (m_results_array); + top->dump (outf); + m_results_array = NULL; + fprintf (outf, "\n"); + delete top; +} + +/* Attempt to convert DIAG_KIND to a suitable value for the "level" + property (SARIF v2.1.0 section 3.27.10). + + Return NULL if there isn't one. */ + +static const char * +maybe_get_sarif_level (diagnostic_t diag_kind) +{ + switch (diag_kind) + { + case DK_WARNING: + return "warning"; + case DK_ERROR: + return "error"; + case DK_NOTE: + case DK_ANACHRONISM: + return "note"; + default: + return NULL; + } +} + +/* Make a string for DIAG_KIND suitable for use as a ruleId + (SARIF v2.1.0 section 3.27.5) as a fallback for when we don't + have anything better to use. 
*/ + +static char * +make_rule_id_for_diagnostic_kind (diagnostic_t diag_kind) +{ + static const char *const diagnostic_kind_text[] = { +#define DEFINE_DIAGNOSTIC_KIND(K, T, C) (T), +#include "diagnostic.def" +#undef DEFINE_DIAGNOSTIC_KIND + "must-not-happen" + }; + /* Lose the trailing ": ". */ + const char *kind_text = diagnostic_kind_text[diag_kind]; + size_t len = strlen (kind_text); + gcc_assert (len > 2); + gcc_assert (kind_text[len - 2] == ':'); + gcc_assert (kind_text[len - 1] == ' '); + char *rstrip = xstrdup (kind_text); + rstrip[len - 2] = '\0'; + return rstrip; +} + +/* Make a result object (SARIF v2.1.0 section 3.27) for DIAGNOSTIC. */ + +sarif_result * +sarif_builder::make_result_object (diagnostic_context *context, + diagnostic_info *diagnostic, + diagnostic_t orig_diag_kind) +{ + sarif_result *result_obj = new sarif_result (); + + /* "ruleId" property (SARIF v2.1.0 section 3.27.5). */ + /* Ideally we'd have an option_name for these. */ + if (char *option_text + = context->option_name (context, diagnostic->option_index, + orig_diag_kind, diagnostic->kind)) + { + /* Lazily create reportingDescriptor objects for these and add them + to m_rules_arr. + Set ruleId referencing them. */ + result_obj->set ("ruleId", new json::string (option_text)); + if (m_rule_id_set.contains (option_text)) + free (option_text); + else + { + /* This is the first time we've seen this ruleId. */ + /* Add to set, taking ownership. */ + m_rule_id_set.add (option_text); + + json::object *reporting_desc_obj + = make_reporting_descriptor_object_for_warning (context, + diagnostic, + orig_diag_kind, + option_text); + m_rules_arr->append (reporting_desc_obj); + } + } + else + { + /* Otherwise, we have an "error" or a stray "note"; use the + diagnostic kind as the ruleId, so that the result object at least + has a ruleId. + We don't bother creating reportingDescriptor objects for these. 
*/ + char *rule_id = make_rule_id_for_diagnostic_kind (orig_diag_kind); + result_obj->set ("ruleId", new json::string (rule_id)); + free (rule_id); + } + + /* "taxa" property (SARIF v2.1.0 section 3.27.8). */ + if (diagnostic->metadata) + if (int cwe_id = diagnostic->metadata->get_cwe ()) + { + json::array *taxa_arr = new json::array (); + json::object *cwe_id_obj + = make_reporting_descriptor_reference_object_for_cwe_id (cwe_id); + taxa_arr->append (cwe_id_obj); + result_obj->set ("taxa", taxa_arr); + } + + /* "level" property (SARIF v2.1.0 section 3.27.10). */ + if (const char *sarif_level = maybe_get_sarif_level (diagnostic->kind)) + result_obj->set ("level", new json::string (sarif_level)); + + /* "message" property (SARIF v2.1.0 section 3.27.11). */ + json::object *message_obj + = make_message_object (pp_formatted_text (context->printer)); + pp_clear_output_area (context->printer); + result_obj->set ("message", message_obj); + + /* "locations" property (SARIF v2.1.0 section 3.27.12). */ + json::array *locations_arr = new json::array (); + const logical_location *logical_loc = NULL; + if (m_context->m_client_data_hooks) + logical_loc + = m_context->m_client_data_hooks->get_current_logical_location (); + + json::object *location_obj + = make_location_object (*diagnostic->richloc, logical_loc); + locations_arr->append (location_obj); + result_obj->set ("locations", locations_arr); + + /* "codeFlows" property (SARIF v2.1.0 section 3.27.18). */ + if (const diagnostic_path *path = diagnostic->richloc->get_path ()) + { + json::array *code_flows_arr = new json::array (); + json::object *code_flow_obj = make_code_flow_object (*path); + code_flows_arr->append (code_flow_obj); + result_obj->set ("codeFlows", code_flows_arr); + } + + /* The "relatedLocations" property (SARIF v2.1.0 section 3.27.22) is + set up later, if any nested diagnostics occur within this diagnostic + group. */ + + /* "fixes" property (SARIF v2.1.0 section 3.27.30). 
*/ + const rich_location *richloc = diagnostic->richloc; + if (richloc->get_num_fixit_hints ()) + { + json::array *fix_arr = new json::array (); + json::object *fix_obj = make_fix_object (*richloc); + fix_arr->append (fix_obj); + result_obj->set ("fixes", fix_arr); + } + + return result_obj; +} + +/* Make a reportingDescriptor object (SARIF v2.1.0 section 3.49) + for a GCC warning. */ + +json::object * +sarif_builder:: +make_reporting_descriptor_object_for_warning (diagnostic_context *context, + diagnostic_info *diagnostic, + diagnostic_t /*orig_diag_kind*/, + const char *option_text) +{ + json::object *reporting_desc = new json::object (); + + /* "id" property (SARIF v2.1.0 section 3.49.3). */ + reporting_desc->set ("id", new json::string (option_text)); + + /* We don't implement "name" property (SARIF v2.1.0 section 3.49.7), since + it seems redundant compared to "id". */ + + /* "helpUri" property (SARIF v2.1.0 section 3.49.12). */ + if (context->get_option_url) + { + char *option_url + = context->get_option_url (context, diagnostic->option_index); + if (option_url) + { + reporting_desc->set ("helpUri", new json::string (option_url)); + free (option_url); + } + } + + return reporting_desc; +} + +/* Make a reportingDescriptor object (SARIF v2.1.0 section 3.49) + for CWE_ID, for use within the CWE taxa array. */ + +json::object * +sarif_builder::make_reporting_descriptor_object_for_cwe_id (int cwe_id) const +{ + json::object *reporting_desc = new json::object (); + + /* "id" property (SARIF v2.1.0 section 3.49.3). */ + { + pretty_printer pp; + pp_printf (&pp, "%i", cwe_id); + reporting_desc->set ("id", new json::string (pp_formatted_text (&pp))); + } + + /* "helpUri" property (SARIF v2.1.0 section 3.49.12). 
*/ + { + char *url = get_cwe_url (cwe_id); + reporting_desc->set ("helpUri", new json::string (url)); + free (url); + } + + return reporting_desc; +} + +/* Make a reportingDescriptorReference object (SARIF v2.1.0 section 3.52) + referencing CWE_ID, for use within a result object. + Also, add CWE_ID to m_cwe_id_set. */ + +json::object * +sarif_builder:: +make_reporting_descriptor_reference_object_for_cwe_id (int cwe_id) +{ + json::object *desc_ref_obj = new json::object (); + + /* "id" property (SARIF v2.1.0 section 3.52.4). */ + { + pretty_printer pp; + pp_printf (&pp, "%i", cwe_id); + desc_ref_obj->set ("id", new json::string (pp_formatted_text (&pp))); + } + + /* "toolComponent" property (SARIF v2.1.0 section 3.52.7). */ + json::object *comp_ref_obj = make_tool_component_reference_object_for_cwe (); + desc_ref_obj->set ("toolComponent", comp_ref_obj); + + /* Add CWE_ID to our set. */ + gcc_assert (cwe_id > 0); + m_cwe_id_set.add (cwe_id); + + return desc_ref_obj; +} + +/* Make a toolComponentReference object (SARIF v2.1.0 section 3.54) that + references the CWE taxonomy. */ + +json::object * +sarif_builder:: +make_tool_component_reference_object_for_cwe () const +{ + json::object *comp_ref_obj = new json::object (); + + /* "name" property (SARIF v2.1.0 section 3.54.3). */ + comp_ref_obj->set ("name", new json::string ("cwe")); + + return comp_ref_obj; +} + +/* If LOGICAL_LOC is non-NULL, use it to create a "logicalLocations" property + within LOCATION_OBJ (SARIF v2.1.0 section 3.28.4). 
*/ + +void +sarif_builder:: +set_any_logical_locs_arr (json::object *location_obj, + const logical_location *logical_loc) +{ + if (!logical_loc) + return; + json::object *logical_loc_obj = make_logical_location_object (*logical_loc); + json::array *location_locs_arr = new json::array (); + location_locs_arr->append (logical_loc_obj); + location_obj->set ("logicalLocations", location_locs_arr); +} + +/* Make a location object (SARIF v2.1.0 section 3.28) for RICH_LOC + and LOGICAL_LOC. */ + +json::object * +sarif_builder::make_location_object (const rich_location &rich_loc, + const logical_location *logical_loc) +{ + json::object *location_obj = new json::object (); + + /* Get primary loc from RICH_LOC. */ + location_t loc = rich_loc.get_loc (); + + /* "physicalLocation" property (SARIF v2.1.0 section 3.28.3). */ + if (json::object *phs_loc_obj = maybe_make_physical_location_object (loc)) + location_obj->set ("physicalLocation", phs_loc_obj); + + /* "logicalLocations" property (SARIF v2.1.0 section 3.28.4). */ + set_any_logical_locs_arr (location_obj, logical_loc); + + return location_obj; +} + +/* Make a location object (SARIF v2.1.0 section 3.28) for EVENT + within a diagnostic_path. */ + +json::object * +sarif_builder::make_location_object (const diagnostic_event &event) +{ + json::object *location_obj = new json::object (); + + /* "physicalLocation" property (SARIF v2.1.0 section 3.28.3). */ + location_t loc = event.get_location (); + if (json::object *phs_loc_obj = maybe_make_physical_location_object (loc)) + location_obj->set ("physicalLocation", phs_loc_obj); + + /* "logicalLocations" property (SARIF v2.1.0 section 3.28.4). */ + const logical_location *logical_loc = event.get_logical_location (); + set_any_logical_locs_arr (location_obj, logical_loc); + + /* "message" property (SARIF v2.1.0 section 3.28.5). 
*/ + label_text ev_desc = event.get_desc (false); + json::object *message_obj = make_message_object (ev_desc.m_buffer); + location_obj->set ("message", message_obj); + ev_desc.maybe_free (); + + return location_obj; +} + +/* Make a physicalLocation object (SARIF v2.1.0 section 3.29) for LOC, + or return NULL. + Add any filename to m_filenames. */ + +json::object * +sarif_builder::maybe_make_physical_location_object (location_t loc) +{ + if (loc <= BUILTINS_LOCATION) + return NULL; + + json::object *phys_loc_obj = new json::object (); + + /* "artifactLocation" property (SARIF v2.1.0 section 3.29.3). */ + json::object *artifact_loc_obj = make_artifact_location_object (loc); + phys_loc_obj->set ("artifactLocation", artifact_loc_obj); + m_filenames.add (LOCATION_FILE (loc)); + + /* "region" property (SARIF v2.1.0 section 3.29.4). */ + if (json::object *region_obj = maybe_make_region_object (loc)) + phys_loc_obj->set ("region", region_obj); + + /* "contextRegion" property (SARIF v2.1.0 section 3.29.5). */ + if (json::object *context_region_obj + = maybe_make_region_object_for_context (loc)) + phys_loc_obj->set ("contextRegion", context_region_obj); + + /* Instead, we add artifacts to the run as a whole, + with artifact.contents. + Could do both, though. */ + + return phys_loc_obj; +} + +/* Make an artifactLocation object (SARIF v2.1.0 section 3.4) for LOC, + or return NULL. */ + +json::object * +sarif_builder::make_artifact_location_object (location_t loc) +{ + return make_artifact_location_object (LOCATION_FILE (loc)); +} + +/* The ID value for use in "uriBaseId" properties (SARIF v2.1.0 section 3.4.4) + for when we need to express paths relative to PWD. */ + +#define PWD_PROPERTY_NAME ("PWD") + +/* Make an artifactLocation object (SARIF v2.1.0 section 3.4) for FILENAME, + or return NULL. 
*/ + +json::object * +sarif_builder::make_artifact_location_object (const char *filename) +{ + json::object *artifact_loc_obj = new json::object (); + + /* "uri" property (SARIF v2.1.0 section 3.4.3). */ + artifact_loc_obj->set ("uri", new json::string (filename)); + + if (filename[0] != '/') + { + /* If we have a relative path, set the "uriBaseId" property + (SARIF v2.1.0 section 3.4.4). */ + artifact_loc_obj->set ("uriBaseId", new json::string (PWD_PROPERTY_NAME)); + m_seen_any_relative_paths = true; + } + + return artifact_loc_obj; +} + +/* Get the PWD, or NULL, as an absolute file-based URI, + adding a trailing forward slash (as required by SARIF v2.1.0 + section 3.14.14). */ + +static char * +make_pwd_uri_str () +{ + /* The prefix of a file-based URI, up to, but not including the path. */ +#define FILE_PREFIX ("file://") + + const char *pwd = getpwd (); + if (!pwd) + return NULL; + size_t len = strlen (pwd); + if (len == 0 || pwd[len - 1] != '/') + return concat (FILE_PREFIX, pwd, "/", NULL); + else + { + gcc_assert (pwd[len - 1] == '/'); + return concat (FILE_PREFIX, pwd, NULL); + } +} + +/* Make an artifactLocation object (SARIF v2.1.0 section 3.4) for the pwd, + for use in the "run.originalUriBaseIds" property (SARIF v2.1.0 + section 3.14.14) when we have any relative paths. */ + +json::object * +sarif_builder::make_artifact_location_object_for_pwd () const +{ + json::object *artifact_loc_obj = new json::object (); + + /* "uri" property (SARIF v2.1.0 section 3.4.3). */ + if (char *pwd = make_pwd_uri_str ()) + { + gcc_assert (strlen (pwd) > 0); + gcc_assert (pwd[strlen (pwd) - 1] == '/'); + artifact_loc_obj->set ("uri", new json::string (pwd)); + free (pwd); + } + + return artifact_loc_obj; +} + +/* Get the column number within EXPLOC. 
*/ + +int +sarif_builder::get_sarif_column (expanded_location exploc) const +{ + cpp_char_column_policy policy (m_tabstop, cpp_wcwidth); + return location_compute_display_column (exploc, policy); +} + +/* Make a region object (SARIF v2.1.0 section 3.30) for LOC, + or return NULL. */ + +json::object * +sarif_builder::maybe_make_region_object (location_t loc) const +{ + location_t caret_loc = get_pure_location (loc); + + if (caret_loc <= BUILTINS_LOCATION) + return NULL; + + location_t start_loc = get_start (loc); + location_t finish_loc = get_finish (loc); + + expanded_location exploc_caret = expand_location (caret_loc); + expanded_location exploc_start = expand_location (start_loc); + expanded_location exploc_finish = expand_location (finish_loc); + + if (exploc_start.file !=exploc_caret.file) + return NULL; + if (exploc_finish.file !=exploc_caret.file) + return NULL; + + json::object *region_obj = new json::object (); + + /* "startLine" property (SARIF v2.1.0 section 3.30.5) */ + region_obj->set ("startLine", new json::integer_number (exploc_start.line)); + + /* "startColumn" property (SARIF v2.1.0 section 3.30.6) */ + region_obj->set ("startColumn", + new json::integer_number (get_sarif_column (exploc_start))); + + /* "endLine" property (SARIF v2.1.0 section 3.30.7) */ + if (exploc_finish.line != exploc_start.line) + region_obj->set ("endLine", new json::integer_number (exploc_finish.line)); + + /* "endColumn" property (SARIF v2.1.0 section 3.30.8). + This expresses the column immediately beyond the range. */ + { + int next_column = sarif_builder::get_sarif_column (exploc_finish) + 1; + region_obj->set ("endColumn", new json::integer_number (next_column)); + } + + return region_obj; +} + +/* Make a region object (SARIF v2.1.0 section 3.30) for the "contextRegion" + property (SARIF v2.1.0 section 3.29.5) of a physicalLocation. 
+ + This is similar to maybe_make_region_object, but ignores column numbers, + covering the line(s) as a whole, and including a "snippet" property + embedding those source lines, making it easier for consumers to show + the pertinent source. */ + +json::object * +sarif_builder::maybe_make_region_object_for_context (location_t loc) const +{ + location_t caret_loc = get_pure_location (loc); + + if (caret_loc <= BUILTINS_LOCATION) + return NULL; + + location_t start_loc = get_start (loc); + location_t finish_loc = get_finish (loc); + + expanded_location exploc_caret = expand_location (caret_loc); + expanded_location exploc_start = expand_location (start_loc); + expanded_location exploc_finish = expand_location (finish_loc); + + if (exploc_start.file !=exploc_caret.file) + return NULL; + if (exploc_finish.file !=exploc_caret.file) + return NULL; + + json::object *region_obj = new json::object (); + + /* "startLine" property (SARIF v2.1.0 section 3.30.5) */ + region_obj->set ("startLine", new json::integer_number (exploc_start.line)); + + /* "endLine" property (SARIF v2.1.0 section 3.30.7) */ + if (exploc_finish.line != exploc_start.line) + region_obj->set ("endLine", new json::integer_number (exploc_finish.line)); + + /* "snippet" property (SARIF v2.1.0 section 3.30.13). */ + if (json::object *artifact_content_obj + = maybe_make_artifact_content_object (exploc_start.file, + exploc_start.line, + exploc_finish.line)) + region_obj->set ("snippet", artifact_content_obj); + + return region_obj; +} + +/* Make a region object (SARIF v2.1.0 section 3.30) for the deletion region + of HINT (as per SARIF v2.1.0 section 3.57.3). 
*/ + +json::object * +sarif_builder::make_region_object_for_hint (const fixit_hint &hint) const +{ + location_t start_loc = hint.get_start_loc (); + location_t next_loc = hint.get_next_loc (); + + expanded_location exploc_start = expand_location (start_loc); + expanded_location exploc_next = expand_location (next_loc); + + json::object *region_obj = new json::object (); + + /* "startLine" property (SARIF v2.1.0 section 3.30.5) */ + region_obj->set ("startLine", new json::integer_number (exploc_start.line)); + + /* "startColumn" property (SARIF v2.1.0 section 3.30.6) */ + int start_col = get_sarif_column (exploc_start); + region_obj->set ("startColumn", + new json::integer_number (start_col)); + + /* "endLine" property (SARIF v2.1.0 section 3.30.7) */ + if (exploc_next.line != exploc_start.line) + region_obj->set ("endLine", new json::integer_number (exploc_next.line)); + + /* "endColumn" property (SARIF v2.1.0 section 3.30.8). + This expresses the column immediately beyond the range. */ + int next_col = get_sarif_column (exploc_next); + region_obj->set ("endColumn", new json::integer_number (next_col)); + + return region_obj; +} + +/* Attempt to get a string for a logicalLocation's "kind" property + (SARIF v2.1.0 section 3.33.7). + Return NULL if unknown. 
*/ + +static const char * +maybe_get_sarif_kind (enum logical_location_kind kind) +{ + switch (kind) + { + default: + gcc_unreachable (); + case LOGICAL_LOCATION_KIND_UNKNOWN: + return NULL; + + case LOGICAL_LOCATION_KIND_FUNCTION: + return "function"; + case LOGICAL_LOCATION_KIND_MEMBER: + return "member"; + case LOGICAL_LOCATION_KIND_MODULE: + return "module"; + case LOGICAL_LOCATION_KIND_NAMESPACE: + return "namespace"; + case LOGICAL_LOCATION_KIND_TYPE: + return "type"; + case LOGICAL_LOCATION_KIND_RETURN_TYPE: + return "returnType"; + case LOGICAL_LOCATION_KIND_PARAMETER: + return "parameter"; + case LOGICAL_LOCATION_KIND_VARIABLE: + return "variable"; + } +} + +/* Make a logicalLocation object (SARIF v2.1.0 section 3.33) for LOGICAL_LOC, + or return NULL. */ + +json::object * +sarif_builder:: +make_logical_location_object (const logical_location &logical_loc) const +{ + json::object *logical_loc_obj = new json::object (); + + /* "name" property (SARIF v2.1.0 section 3.33.4). */ + if (const char *short_name = logical_loc.get_short_name ()) + logical_loc_obj->set ("name", new json::string (short_name)); + + /* "fullyQualifiedName" property (SARIF v2.1.0 section 3.33.5). */ + if (const char *name_with_scope = logical_loc.get_name_with_scope ()) + logical_loc_obj->set ("fullyQualifiedName", + new json::string (name_with_scope)); + + /* "decoratedName" property (SARIF v2.1.0 section 3.33.6). */ + if (const char *internal_name = logical_loc.get_internal_name ()) + logical_loc_obj->set ("decoratedName", new json::string (internal_name)); + + /* "kind" property (SARIF v2.1.0 section 3.33.7). */ + enum logical_location_kind kind = logical_loc.get_kind (); + if (const char *sarif_kind_str = maybe_get_sarif_kind (kind)) + logical_loc_obj->set ("kind", new json::string (sarif_kind_str)); + + return logical_loc_obj; +} + +/* Make a codeFlow object (SARIF v2.1.0 section 3.36) for PATH. 
*/ + +json::object * +sarif_builder::make_code_flow_object (const diagnostic_path &path) +{ + json::object *code_flow_obj = new json::object (); + + /* "threadFlows" property (SARIF v2.1.0 section 3.36.3). + Currently we only support one thread per result. */ + json::array *thread_flows_arr = new json::array (); + json::object *thread_flow_obj = make_thread_flow_object (path); + thread_flows_arr->append (thread_flow_obj); + code_flow_obj->set ("threadFlows", thread_flows_arr); + + return code_flow_obj; +} + +/* Make a threadFlow object (SARIF v2.1.0 section 3.37) for PATH. */ + +json::object * +sarif_builder::make_thread_flow_object (const diagnostic_path &path) +{ + json::object *thread_flow_obj = new json::object (); + + /* "locations" property (SARIF v2.1.0 section 3.37.6). */ + json::array *locations_arr = new json::array (); + for (unsigned i = 0; i < path.num_events (); i++) + { + const diagnostic_event &event = path.get_event (i); + json::object *thread_flow_loc_obj + = make_thread_flow_location_object (event); + locations_arr->append (thread_flow_loc_obj); + } + thread_flow_obj->set ("locations", locations_arr); + + return thread_flow_obj; +} + +/* Make a threadFlowLocation object (SARIF v2.1.0 section 3.38) for EVENT. */ + +json::object * +sarif_builder::make_thread_flow_location_object (const diagnostic_event &ev) +{ + json::object *thread_flow_loc_obj = new json::object (); + + /* "location" property (SARIF v2.1.0 section 3.38.3). */ + json::object *location_obj = make_location_object (ev); + thread_flow_loc_obj->set ("location", location_obj); + + /* "kinds" property (SARIF v2.1.0 section 3.38.8). */ + diagnostic_event::meaning m = ev.get_meaning (); + if (json::array *kinds_arr = maybe_make_kinds_array (m)) + thread_flow_loc_obj->set ("kinds", kinds_arr); + + /* "nestingLevel" property (SARIF v2.1.0 section 3.38.10). 
*/ + thread_flow_loc_obj->set ("nestingLevel", + new json::integer_number (ev.get_stack_depth ())); + + /* It might be nice to eventually implement the following for -fanalyzer: + - the "stack" property (SARIF v2.1.0 section 3.38.5) + - the "state" property (SARIF v2.1.0 section 3.38.9) + - the "importance" property (SARIF v2.1.0 section 3.38.13). */ + + return thread_flow_loc_obj; +} + +/* If M has any known meaning, make a json array suitable for the "kinds" + property of a threadFlowLocation object (SARIF v2.1.0 section 3.38.8). + + Otherwise, return NULL. */ + +json::array * +sarif_builder::maybe_make_kinds_array (diagnostic_event::meaning m) const +{ + if (m.m_verb == diagnostic_event::VERB_unknown + && m.m_noun == diagnostic_event::NOUN_unknown + && m.m_property == diagnostic_event::PROPERTY_unknown) + return NULL; + + json::array *kinds_arr = new json::array (); + if (const char *verb_str + = diagnostic_event::meaning::maybe_get_verb_str (m.m_verb)) + kinds_arr->append (new json::string (verb_str)); + if (const char *noun_str + = diagnostic_event::meaning::maybe_get_noun_str (m.m_noun)) + kinds_arr->append (new json::string (noun_str)); + if (const char *property_str + = diagnostic_event::meaning::maybe_get_property_str (m.m_property)) + kinds_arr->append (new json::string (property_str)); + return kinds_arr; +} + +/* Make a message object (SARIF v2.1.0 section 3.11) for MSG. */ + +json::object * +sarif_builder::make_message_object (const char *msg) const +{ + json::object *message_obj = new json::object (); + + /* "text" property (SARIF v2.1.0 section 3.11.8). */ + message_obj->set ("text", new json::string (msg)); + + return message_obj; +} + +/* Make a multiformatMessageString object (SARIF v2.1.0 section 3.12) + for MSG. */ + +json::object * +sarif_builder::make_multiformat_message_string (const char *msg) const +{ + json::object *message_obj = new json::object (); + + /* "text" property (SARIF v2.1.0 section 3.12.3). 
*/ + message_obj->set ("text", new json::string (msg)); + + return message_obj; +} + +#define SARIF_SCHEMA "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json" +#define SARIF_VERSION "2.1.0" + +/* Make a top-level sarifLog object (SARIF v2.1.0 section 3.13). + Take ownership of RESULTS. */ + +json::object * +sarif_builder::make_top_level_object (json::array *results) +{ + json::object *log_obj = new json::object (); + + /* "$schema" property (SARIF v2.1.0 section 3.13.3). */ + log_obj->set ("$schema", new json::string (SARIF_SCHEMA)); + + /* "version" property (SARIF v2.1.0 section 3.13.2). */ + log_obj->set ("version", new json::string (SARIF_VERSION)); + + /* "runs" property (SARIF v2.1.0 section 3.13.4). */ + json::array *run_arr = new json::array (); + json::object *run_obj = make_run_object (results); + run_arr->append (run_obj); + log_obj->set ("runs", run_arr); + + return log_obj; +} + +/* Make a run object (SARIF v2.1.0 section 3.14). + Take ownership of RESULTS. */ + +json::object * +sarif_builder::make_run_object (json::array *results) +{ + json::object *run_obj = new json::object (); + + /* "tool" property (SARIF v2.1.0 section 3.14.6). */ + json::object *tool_obj = make_tool_object (); + run_obj->set ("tool", tool_obj); + + /* "taxonomies" property (SARIF v2.1.0 section 3.14.8). */ + if (json::array *taxonomies_arr = maybe_make_taxonomies_array ()) + run_obj->set ("taxonomies", taxonomies_arr); + + /* "originalUriBaseIds" property (SARIF v2.1.0 section 3.14.14). */ + if (m_seen_any_relative_paths) + { + json::object *orig_uri_base_ids = new json::object (); + run_obj->set ("originalUriBaseIds", orig_uri_base_ids); + json::object *pwd_art_loc_obj = make_artifact_location_object_for_pwd (); + orig_uri_base_ids->set (PWD_PROPERTY_NAME, pwd_art_loc_obj); + } + + /* "artifacts" property (SARIF v2.1.0 section 3.14.15). 
*/ + json::array *artifacts_arr = new json::array (); + for (auto iter : m_filenames) + { + json::object *artifact_obj = make_artifact_object (iter); + artifacts_arr->append (artifact_obj); + } + run_obj->set ("artifacts", artifacts_arr); + + /* "results" property (SARIF v2.1.0 section 3.14.23). */ + run_obj->set ("results", results); + + return run_obj; +} + +/* Make a tool object (SARIF v2.1.0 section 3.18). */ + +json::object * +sarif_builder::make_tool_object () const +{ + json::object *tool_obj = new json::object (); + + /* "driver" property (SARIF v2.1.0 section 3.18.2). */ + json::object *driver_obj = make_driver_tool_component_object (); + tool_obj->set ("driver", driver_obj); + + /* Report plugins via the "extensions" property + (SARIF v2.1.0 section 3.18.3). */ + if (m_context->m_client_data_hooks) + if (const client_version_info *vinfo + = m_context->m_client_data_hooks->get_any_version_info ()) + { + class my_plugin_visitor : public client_version_info :: plugin_visitor + { + public: + void on_plugin (const diagnostic_client_plugin_info &p) final override + { + /* Create a toolComponent object (SARIF v2.1.0 section 3.19) + for the plugin. */ + json::object *plugin_obj = new json::object (); + m_plugin_objs.safe_push (plugin_obj); + + /* "name" property (SARIF v2.1.0 section 3.19.8). */ + if (const char *short_name = p.get_short_name ()) + plugin_obj->set ("name", new json::string (short_name)); + + /* "fullName" property (SARIF v2.1.0 section 3.19.9). */ + if (const char *full_name = p.get_full_name ()) + plugin_obj->set ("fullName", new json::string (full_name)); + + /* "version" property (SARIF v2.1.0 section 3.19.13). 
*/ + if (const char *version = p.get_version ()) + plugin_obj->set ("version", new json::string (version)); + } + auto_vec <json::object *> m_plugin_objs; + }; + my_plugin_visitor v; + vinfo->for_each_plugin (v); + if (v.m_plugin_objs.length () > 0) + { + json::array *extensions_arr = new json::array (); + tool_obj->set ("extensions", extensions_arr); + for (auto iter : v.m_plugin_objs) + extensions_arr->append (iter); + } + } + + /* Perhaps we could also show GMP, MPFR, MPC, isl versions as other + "extensions" (see toplev.cc: print_version). */ + + return tool_obj; +} + +/* Make a toolComponent object (SARIF v2.1.0 section 3.19) for what SARIF + calls the "driver" (see SARIF v2.1.0 section 3.18.1). */ + +json::object * +sarif_builder::make_driver_tool_component_object () const +{ + json::object *driver_obj = new json::object (); + + if (m_context->m_client_data_hooks) + if (const client_version_info *vinfo + = m_context->m_client_data_hooks->get_any_version_info ()) + { + /* "name" property (SARIF v2.1.0 section 3.19.8). */ + if (const char *name = vinfo->get_tool_name ()) + driver_obj->set ("name", new json::string (name)); + + /* "fullName" property (SARIF v2.1.0 section 3.19.9). */ + if (char *full_name = vinfo->maybe_make_full_name ()) + { + driver_obj->set ("fullName", new json::string (full_name)); + free (full_name); + } + + /* "version" property (SARIF v2.1.0 section 3.19.13). */ + if (const char *version = vinfo->get_version_string ()) + driver_obj->set ("version", new json::string (version)); + + /* "informationUri" property (SARIF v2.1.0 section 3.19.17). */ + if (char *version_url = vinfo->maybe_make_version_url ()) + { + driver_obj->set ("informationUri", new json::string (version_url)); + free (version_url); + } + } + + /* "rules" property (SARIF v2.1.0 section 3.19.23). 
*/ + driver_obj->set ("rules", m_rules_arr); + + return driver_obj; +} + +/* If we've seen any CWE IDs, make an array for the "taxonomies" property + (SARIF v2.1.0 section 3.14.8) of a run object, containing a single + toolComponent (3.19) as per 3.19.3, representing the CWE. + + Otherwise return NULL. */ + +json::array * +sarif_builder::maybe_make_taxonomies_array () const +{ + json::object *cwe_obj = maybe_make_cwe_taxonomy_object (); + if (!cwe_obj) + return NULL; + + /* "taxonomies" property (SARIF v2.1.0 section 3.14.8). */ + json::array *taxonomies_arr = new json::array (); + taxonomies_arr->append (cwe_obj); + return taxonomies_arr; +} + +/* If we've seen any CWE IDs, make a toolComponent object + (SARIF v2.1.0 section 3.19) representing the CWE taxonomy, as per 3.19.3. + Populate the "taxa" property with all of the CWE IDs in m_cwe_id_set. + + Otherwise return NULL. */ + +json::object * +sarif_builder::maybe_make_cwe_taxonomy_object () const +{ + if (m_cwe_id_set.is_empty ()) + return NULL; + + json::object *taxonomy_obj = new json::object (); + + /* "name" property (SARIF v2.1.0 section 3.19.8). */ + taxonomy_obj->set ("name", new json::string ("CWE")); + + /* "version" property (SARIF v2.1.0 section 3.19.13). */ + taxonomy_obj->set ("version", new json::string ("4.7")); + + /* "organization" property (SARIF v2.1.0 section 3.19.18). */ + taxonomy_obj->set ("organization", new json::string ("MITRE")); + + /* "shortDescription" property (SARIF v2.1.0 section 3.19.19). */ + json::object *short_desc + = make_multiformat_message_string ("The MITRE" + " Common Weakness Enumeration"); + taxonomy_obj->set ("shortDescription", short_desc); + + /* "taxa" property (SARIF v2.1.0 section 3.19.25). 
*/ + json::array *taxa_arr = new json::array (); + for (auto cwe_id : m_cwe_id_set) + { + json::object *cwe_taxon + = make_reporting_descriptor_object_for_cwe_id (cwe_id); + taxa_arr->append (cwe_taxon); + } + taxonomy_obj->set ("taxa", taxa_arr); + + return taxonomy_obj; +} + +/* Make an artifact object (SARIF v2.1.0 section 3.24). */ + +json::object * +sarif_builder::make_artifact_object (const char *filename) +{ + json::object *artifact_obj = new json::object (); + + /* "location" property (SARIF v2.1.0 section 3.24.2). */ + json::object *artifact_loc_obj = make_artifact_location_object (filename); + artifact_obj->set ("location", artifact_loc_obj); + + /* "contents" property (SARIF v2.1.0 section 3.24.8). */ + if (json::object *artifact_content_obj + = maybe_make_artifact_content_object (filename)) + artifact_obj->set ("contents", artifact_content_obj); + + /* "sourceLanguage" property (SARIF v2.1.0 section 3.24.10). */ + if (m_context->m_client_data_hooks) + if (const char *source_lang + = m_context->m_client_data_hooks->maybe_get_sarif_source_language + (filename)) + artifact_obj->set ("sourceLanguage", new json::string (source_lang)); + + return artifact_obj; +} + +/* Read all data from F_IN until EOF. + Return a NUL-terminated buffer containing the data, which must be + freed by the caller. + Return NULL on errors. */ + +static char * +read_until_eof (FILE *f_in) +{ + /* Read content, allocating a buffer for it. */ + char *result = NULL; + size_t total_sz = 0; + size_t alloc_sz = 0; + char buf[4096]; + size_t iter_sz_in; + + while ( (iter_sz_in = fread (buf, 1, sizeof (buf), f_in)) ) + { + gcc_assert (alloc_sz >= total_sz); + size_t old_total_sz = total_sz; + total_sz += iter_sz_in; + /* Allow 1 extra byte for 0-termination. */ + if (alloc_sz < (total_sz + 1)) + { + size_t new_alloc_sz = alloc_sz ? 
alloc_sz * 2: total_sz + 1; + result = (char *)xrealloc (result, new_alloc_sz); + alloc_sz = new_alloc_sz; + } + memcpy (result + old_total_sz, buf, iter_sz_in); + } + + if (!feof (f_in)) + return NULL; + + /* 0-terminate the buffer. */ + gcc_assert (total_sz < alloc_sz); + result[total_sz] = '\0'; + + return result; +} + +/* Read all data from FILENAME until EOF. + Return a NUL-terminated buffer containing the data, which must be + freed by the caller. + Return NULL on errors. */ + +static char * +maybe_read_file (const char *filename) +{ + FILE *f_in = fopen (filename, "r"); + if (!f_in) + return NULL; + char *result = read_until_eof (f_in); + fclose (f_in); + return result; +} + +/* Make an artifactContent object (SARIF v2.1.0 section 3.3) for the + full contents of FILENAME. */ + +json::object * +sarif_builder::maybe_make_artifact_content_object (const char *filename) const +{ + char *text_utf8 = maybe_read_file (filename); + if (!text_utf8) + return NULL; + + json::object *artifact_content_obj = new json::object (); + artifact_content_obj->set ("text", new json::string (text_utf8)); + free (text_utf8); + + return artifact_content_obj; +} + +/* Attempt to read the given range of lines from FILENAME; return + a freshly-allocated 0-terminated buffer containing them, or NULL. */ + +static char * +get_source_lines (const char *filename, + int start_line, + int end_line) +{ + auto_vec<char> result; + + for (int line = start_line; line <= end_line; line++) + { + char_span line_content = location_get_source_line (filename, line); + if (!line_content.get_buffer ()) + return NULL; + result.reserve (line_content.length () + 1); + for (size_t i = 0; i < line_content.length (); i++) + result.quick_push (line_content[i]); + result.quick_push ('\n'); + } + result.safe_push ('\0'); + + return xstrdup (result.address ()); +} + +/* Make an artifactContent object (SARIF v2.1.0 section 3.3) for the given + run of lines within FILENAME (including the endpoints). 
*/ + +json::object * +sarif_builder::maybe_make_artifact_content_object (const char *filename, + int start_line, + int end_line) const +{ + char *text_utf8 = get_source_lines (filename, start_line, end_line); + + if (!text_utf8) + return NULL; + + json::object *artifact_content_obj = new json::object (); + artifact_content_obj->set ("text", new json::string (text_utf8)); + free (text_utf8); + + return artifact_content_obj; +} + +/* Make a fix object (SARIF v2.1.0 section 3.55) for RICHLOC. */ + +json::object * +sarif_builder::make_fix_object (const rich_location &richloc) +{ + json::object *fix_obj = new json::object (); + + /* "artifactChanges" property (SARIF v2.1.0 section 3.55.3). */ + /* We assume that all fix-it hints in RICHLOC affect the same file. */ + json::array *artifact_change_arr = new json::array (); + json::object *artifact_change_obj = make_artifact_change_object (richloc); + artifact_change_arr->append (artifact_change_obj); + fix_obj->set ("artifactChanges", artifact_change_arr); + + return fix_obj; +} + +/* Make an artifactChange object (SARIF v2.1.0 section 3.56) for RICHLOC. */ + +json::object * +sarif_builder::make_artifact_change_object (const rich_location &richloc) +{ + json::object *artifact_change_obj = new json::object (); + + /* "artifactLocation" property (SARIF v2.1.0 section 3.56.2). */ + json::object *artifact_location_obj + = make_artifact_location_object (richloc.get_loc ()); + artifact_change_obj->set ("artifactLocation", artifact_location_obj); + + /* "replacements" property (SARIF v2.1.0 section 3.56.3). 
*/ + json::array *replacement_arr = new json::array (); + for (unsigned int i = 0; i < richloc.get_num_fixit_hints (); i++) + { + const fixit_hint *hint = richloc.get_fixit_hint (i); + json::object *replacement_obj = make_replacement_object (*hint); + replacement_arr->append (replacement_obj); + } + artifact_change_obj->set ("replacements", replacement_arr); + + return artifact_change_obj; +} + +/* Make a replacement object (SARIF v2.1.0 section 3.57) for HINT. */ + +json::object * +sarif_builder::make_replacement_object (const fixit_hint &hint) const +{ + json::object *replacement_obj = new json::object (); + + /* "deletedRegion" property (SARIF v2.1.0 section 3.57.3). */ + json::object *region_obj = make_region_object_for_hint (hint); + replacement_obj->set ("deletedRegion", region_obj); + + /* "insertedContent" property (SARIF v2.1.0 section 3.57.4). */ + json::object *content_obj = make_artifact_content_object (hint.get_string ()); + replacement_obj->set ("insertedContent", content_obj); + + return replacement_obj; +} + +/* Make an artifactContent object (SARIF v2.1.0 section 3.3) for TEXT. */ + +json::object * +sarif_builder::make_artifact_content_object (const char *text) const +{ + json::object *content_obj = new json::object (); + + /* "text" property (SARIF v2.1.0 section 3.3.2). */ + content_obj->set ("text", new json::string (text)); + + return content_obj; +} + +/* No-op implementation of "begin_diagnostic" for SARIF output. */ + +static void +sarif_begin_diagnostic (diagnostic_context *, diagnostic_info *) +{ +} + +/* Implementation of "end_diagnostic" for SARIF output. */ + +static void +sarif_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic, + diagnostic_t orig_diag_kind) +{ + gcc_assert (the_builder); + the_builder->end_diagnostic (context, diagnostic, orig_diag_kind); +} + +/* No-op implementation of "begin_group_cb" for SARIF output. 
*/ + +static void +sarif_begin_group (diagnostic_context *) +{ +} + +/* Implementation of "end_group_cb" for SARIF output. */ + +static void +sarif_end_group (diagnostic_context *) +{ + gcc_assert (the_builder); + the_builder->end_group (); +} + +/* Flush the top-level array to OUTF. */ + +static void +sarif_flush_to_file (FILE *outf) +{ + gcc_assert (the_builder); + the_builder->flush_to_file (outf); + delete the_builder; + the_builder = NULL; +} + +/* Callback for final cleanup for SARIF output to stderr. */ + +static void +sarif_stderr_final_cb (diagnostic_context *) +{ + gcc_assert (the_builder); + sarif_flush_to_file (stderr); +} + +static char *sarif_output_base_file_name; + +/* Callback for final cleanup for SARIF output to a file. */ + +static void +sarif_file_final_cb (diagnostic_context *) +{ + char *filename = concat (sarif_output_base_file_name, ".sarif", NULL); + FILE *outf = fopen (filename, "w"); + if (!outf) + { + const char *errstr = xstrerror (errno); + fnotice (stderr, "error: unable to open '%s' for writing: %s\n", + filename, errstr); + free (filename); + return; + } + gcc_assert (the_builder); + sarif_flush_to_file (outf); + fclose (outf); + free (filename); +} + +/* Populate CONTEXT in preparation for SARIF output (either to stderr, or + to a file). */ + +static void +diagnostic_output_format_init_sarif (diagnostic_context *context) +{ + the_builder = new sarif_builder (context); + + /* Override callbacks. */ + context->begin_diagnostic = sarif_begin_diagnostic; + context->end_diagnostic = sarif_end_diagnostic; + context->begin_group_cb = sarif_begin_group; + context->end_group_cb = sarif_end_group; + context->print_path = NULL; /* handled in sarif_end_diagnostic. */ + + /* The metadata is handled in SARIF format, rather than as text. */ + context->show_cwe = false; + + /* The option is handled in SARIF format, rather than as text. */ + context->show_option_requested = false; + + /* Don't colorize the text. 
*/ + pp_show_color (context->printer) = false; +} + +/* Populate CONTEXT in preparation for SARIF output to stderr. */ + +void +diagnostic_output_format_init_sarif_stderr (diagnostic_context *context) +{ + diagnostic_output_format_init_sarif (context); + context->final_cb = sarif_stderr_final_cb; +} + +/* Populate CONTEXT in preparation for SARIF output to a file named + BASE_FILE_NAME.sarif. */ + +void +diagnostic_output_format_init_sarif_file (diagnostic_context *context, + const char *base_file_name) +{ + diagnostic_output_format_init_sarif (context); + context->final_cb = sarif_file_final_cb; + sarif_output_base_file_name = xstrdup (base_file_name); +} diff --git a/gcc/diagnostic-path.h b/gcc/diagnostic-path.h index 6c1190d..8ce4ff7 100644 --- a/gcc/diagnostic-path.h +++ b/gcc/diagnostic-path.h @@ -69,6 +69,75 @@ along with GCC; see the file COPYING3. If not see class diagnostic_event { public: + /* Enums for giving a sense of what this event means. + Roughly corresponds to SARIF v2.1.0 section 3.38.8. */ + enum verb + { + VERB_unknown, + + VERB_acquire, + VERB_release, + VERB_enter, + VERB_exit, + VERB_call, + VERB_return, + VERB_branch, + + VERB_danger + }; + enum noun + { + NOUN_unknown, + + NOUN_taint, + NOUN_sensitive, // this one isn't in SARIF v2.1.0; filed as https://github.com/oasis-tcs/sarif-spec/issues/530 + NOUN_function, + NOUN_lock, + NOUN_memory, + NOUN_resource + }; + enum property + { + PROPERTY_unknown, + + PROPERTY_true, + PROPERTY_false + }; + /* A bundle of such enums, allowing for descriptions of the meaning of + an event, such as + - "acquire memory": meaning (VERB_acquire, NOUN_memory) + - "take true branch": meaning (VERB_branch, PROPERTY_true) + - "return from function": meaning (VERB_return, NOUN_function) + etc, as per SARIF's threadFlowLocation "kinds" property + (SARIF v2.1.0 section 3.38.8). 
*/ + struct meaning + { + meaning () + : m_verb (VERB_unknown), + m_noun (NOUN_unknown), + m_property (PROPERTY_unknown) + { + } + meaning (enum verb verb, enum noun noun) + : m_verb (verb), m_noun (noun), m_property (PROPERTY_unknown) + { + } + meaning (enum verb verb, enum property property) + : m_verb (verb), m_noun (NOUN_unknown), m_property (property) + { + } + + void dump_to_pp (pretty_printer *pp) const; + + static const char *maybe_get_verb_str (enum verb); + static const char *maybe_get_noun_str (enum noun); + static const char *maybe_get_property_str (enum property); + + enum verb m_verb; + enum noun m_noun; + enum property m_property; + }; + virtual ~diagnostic_event () {} virtual location_t get_location () const = 0; @@ -81,6 +150,11 @@ class diagnostic_event /* Get a localized (and possibly colorized) description of this event. */ virtual label_text get_desc (bool can_colorize) const = 0; + + /* Get a logical_location for this event, or NULL. */ + virtual const logical_location *get_logical_location () const = 0; + + virtual meaning get_meaning () const = 0; }; /* Abstract base class for getting at a sequence of events. */ @@ -113,6 +187,14 @@ class simple_diagnostic_event : public diagnostic_event { return label_text::borrow (m_desc); } + const logical_location *get_logical_location () const final override + { + return NULL; + } + meaning get_meaning () const final override + { + return meaning (); + } private: location_t m_loc; diff --git a/gcc/diagnostic.cc b/gcc/diagnostic.cc index 2550483..f2a82ff 100644 --- a/gcc/diagnostic.cc +++ b/gcc/diagnostic.cc @@ -34,6 +34,7 @@ along with GCC; see the file COPYING3. 
If not see #include "diagnostic-url.h" #include "diagnostic-metadata.h" #include "diagnostic-path.h" +#include "diagnostic-client-data-hooks.h" #include "edit-context.h" #include "selftest.h" #include "selftest-diagnostic.h" @@ -240,6 +241,7 @@ diagnostic_initialize (diagnostic_context *context, int n_opts) context->end_group_cb = NULL; context->final_cb = default_diagnostic_final_cb; context->includes_seen = NULL; + context->m_client_data_hooks = NULL; } /* Maybe initialize the color support. We require clients to do this @@ -338,6 +340,12 @@ diagnostic_finish (diagnostic_context *context) delete context->includes_seen; context->includes_seen = nullptr; } + + if (context->m_client_data_hooks) + { + delete context->m_client_data_hooks; + context->m_client_data_hooks = NULL; + } } /* Initialize DIAGNOSTIC, where the message MSG has already been @@ -820,6 +828,116 @@ diagnostic_show_any_path (diagnostic_context *context, context->print_path (context, path); } +/* class diagnostic_event. */ + +/* struct diagnostic_event::meaning. */ + +void +diagnostic_event::meaning::dump_to_pp (pretty_printer *pp) const +{ + bool need_comma = false; + pp_character (pp, '{'); + if (const char *verb_str = maybe_get_verb_str (m_verb)) + { + pp_printf (pp, "verb: %qs", verb_str); + need_comma = true; + } + if (const char *noun_str = maybe_get_noun_str (m_noun)) + { + if (need_comma) + pp_string (pp, ", "); + pp_printf (pp, "noun: %qs", noun_str); + need_comma = true; + } + if (const char *property_str = maybe_get_property_str (m_property)) + { + if (need_comma) + pp_string (pp, ", "); + pp_printf (pp, "property: %qs", property_str); + need_comma = true; + } + pp_character (pp, '}'); +} + +/* Get a string (or NULL) for V suitable for use within a SARIF + threadFlowLocation "kinds" property (SARIF v2.1.0 section 3.38.8). 
*/ + +const char * +diagnostic_event::meaning::maybe_get_verb_str (enum verb v) +{ + switch (v) + { + default: + gcc_unreachable (); + case VERB_unknown: + return NULL; + case VERB_acquire: + return "acquire"; + case VERB_release: + return "release"; + case VERB_enter: + return "enter"; + case VERB_exit: + return "exit"; + case VERB_call: + return "call"; + case VERB_return: + return "return"; + case VERB_branch: + return "branch"; + case VERB_danger: + return "danger"; + } +} + +/* Get a string (or NULL) for N suitable for use within a SARIF + threadFlowLocation "kinds" property (SARIF v2.1.0 section 3.38.8). */ + +const char * +diagnostic_event::meaning::maybe_get_noun_str (enum noun n) +{ + switch (n) + { + default: + gcc_unreachable (); + case NOUN_unknown: + return NULL; + case NOUN_taint: + return "taint"; + case NOUN_sensitive: + return "sensitive"; + case NOUN_function: + return "function"; + case NOUN_lock: + return "lock"; + case NOUN_memory: + return "memory"; + case NOUN_resource: + return "resource"; + } +} + +/* Get a string (or NULL) for P suitable for use within a SARIF + threadFlowLocation "kinds" property (SARIF v2.1.0 section 3.38.8). */ + +const char * +diagnostic_event::meaning::maybe_get_property_str (enum property p) +{ + switch (p) + { + default: + gcc_unreachable (); + case PROPERTY_unknown: + return NULL; + case PROPERTY_true: + return "true"; + case PROPERTY_false: + return "false"; + } +} + +/* class diagnostic_path. */ + /* Return true if the events in this path involve more than one function, or false if it is purely intraprocedural. */ @@ -1131,7 +1249,7 @@ update_effective_level_from_pragmas (diagnostic_context *context, /* Generate a URL string describing CWE. The caller is responsible for freeing the string. 
*/ -static char * +char * get_cwe_url (int cwe) { return xasprintf ("https://cwe.mitre.org/data/definitions/%i.html", cwe); @@ -2095,6 +2213,14 @@ diagnostic_output_format_init (diagnostic_context *context, case DIAGNOSTICS_OUTPUT_FORMAT_JSON_FILE: diagnostic_output_format_init_json_file (context, base_file_name); break; + + case DIAGNOSTICS_OUTPUT_FORMAT_SARIF_STDERR: + diagnostic_output_format_init_sarif_stderr (context); + break; + + case DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE: + diagnostic_output_format_init_sarif_file (context, base_file_name); + break; } } diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h index dd3af03..96c9a72 100644 --- a/gcc/diagnostic.h +++ b/gcc/diagnostic.h @@ -63,7 +63,13 @@ enum diagnostics_output_format DIAGNOSTICS_OUTPUT_FORMAT_JSON_STDERR, /* JSON-based output, to a file. */ - DIAGNOSTICS_OUTPUT_FORMAT_JSON_FILE + DIAGNOSTICS_OUTPUT_FORMAT_JSON_FILE, + + /* SARIF-based output, to stderr. */ + DIAGNOSTICS_OUTPUT_FORMAT_SARIF_STDERR, + + /* SARIF-based output, to a file. */ + DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE }; /* An enum for controlling how diagnostic_paths should be printed. */ @@ -162,6 +168,8 @@ typedef void (*diagnostic_finalizer_fn) (diagnostic_context *, class edit_context; namespace json { class value; } +class diagnostic_client_data_hooks; +class logical_location; /* This data structure bundles altogether any information relevant to the context of a diagnostic message. */ @@ -397,6 +405,12 @@ struct diagnostic_context /* Include files that diagnostic_report_current_module has already listed the include path for. */ hash_set<location_t, false, location_hash> *includes_seen; + + /* A bundle of hooks for providing data to the context about its client + e.g. version information, plugins, etc. + Used by SARIF output to give metadata about the client that's + producing diagnostics. 
*/ + diagnostic_client_data_hooks *m_client_data_hooks; }; static inline void @@ -585,6 +599,9 @@ extern void diagnostic_output_format_init (diagnostic_context *, extern void diagnostic_output_format_init_json_stderr (diagnostic_context *context); extern void diagnostic_output_format_init_json_file (diagnostic_context *context, const char *base_file_name); +extern void diagnostic_output_format_init_sarif_stderr (diagnostic_context *context); +extern void diagnostic_output_format_init_sarif_file (diagnostic_context *context, + const char *base_file_name); /* Compute the number of digits in the decimal representation of an integer. */ extern int num_digits (int); @@ -594,4 +611,6 @@ extern json::value *json_from_expanded_location (diagnostic_context *context, extern bool warning_enabled_at (location_t, int); +extern char *get_cwe_url (int cwe); + #endif /* ! GCC_DIAGNOSTIC_H */ diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index d85b66f..8cd5bdd 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -301,7 +301,7 @@ Objective-C and Objective-C++ Dialects}. -fdiagnostics-show-location=@r{[}once@r{|}every-line@r{]} @gol -fdiagnostics-color=@r{[}auto@r{|}never@r{|}always@r{]} @gol -fdiagnostics-urls=@r{[}auto@r{|}never@r{|}always@r{]} @gol --fdiagnostics-format=@r{[}text@r{|}json@r{|}json-stderr@r{|}json-file@r{]} @gol +-fdiagnostics-format=@r{[}text@r{|}sarif-stderr@r{|}sarif-file@r{|}json@r{|}json-stderr@r{|}json-file@r{]} @gol -fno-diagnostics-show-option -fno-diagnostics-show-caret @gol -fno-diagnostics-show-labels -fno-diagnostics-show-line-numbers @gol -fno-diagnostics-show-cwe @gol @@ -5305,11 +5305,15 @@ Unicode characters. For the example above, the following will be printed: @item -fdiagnostics-format=@var{FORMAT} @opindex fdiagnostics-format Select a different format for printing diagnostics. -@var{FORMAT} is @samp{text}, @samp{json}, @samp{json-stderr}, -or @samp{json-file}. 
+@var{FORMAT} is @samp{text}, @samp{sarif-stderr}, @samp{sarif-file}, +@samp{json}, @samp{json-stderr}, or @samp{json-file}. The default is @samp{text}. +The @samp{sarif-stderr} and @samp{sarif-file} formats both emit +diagnostics in SARIF Version 2.1.0 format, either to stderr, or to a file +named @file{@var{source}.sarif}, respectively. + The @samp{json} format is a synonym for @samp{json-stderr}. The @samp{json-stderr} and @samp{json-file} formats are identical, apart from where the JSON is emitted to - with the former, the JSON is emitted to stderr, diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index 286b1eb..606ab85 100644 --- a/gcc/doc/sourcebuild.texi +++ b/gcc/doc/sourcebuild.texi @@ -3152,6 +3152,12 @@ Passes if @var{regexp} matches in Fortran module @var{module}. @item dg-check-dot @var{filename} Passes if @var{filename} is a valid @file{.dot} file (by running @code{dot -Tpng} on it, and verifying the exit code is 0). +@item scan-sarif-file @var{regexp} [@{ target/xfail @var{selector} @}] +Passes if @var{regexp} matches text in the file generated by +@option{-fdiagnostics-format=sarif-file}. +@item scan-sarif-file-not @var{regexp} [@{ target/xfail @var{selector} @}] +Passes if @var{regexp} does not match text in the file generated by +@option{-fdiagnostics-format=sarif-file}. @end table @subsubsection Scan the assembly output diff --git a/gcc/fortran/f95-lang.cc b/gcc/fortran/f95-lang.cc index e83fef3..319cf8f 100644 --- a/gcc/fortran/f95-lang.cc +++ b/gcc/fortran/f95-lang.cc @@ -100,6 +100,15 @@ static const struct attribute_spec gfc_attribute_table[] = { NULL, 0, 0, false, false, false, false, NULL, NULL } }; +/* Get a value for the SARIF v2.1.0 "artifact.sourceLanguage" property, + based on the list in SARIF v2.1.0 Appendix J. 
*/ + +static const char * +gfc_get_sarif_source_language (const char *) +{ + return "fortran"; +} + #undef LANG_HOOKS_NAME #undef LANG_HOOKS_INIT #undef LANG_HOOKS_FINISH @@ -138,6 +147,7 @@ static const struct attribute_spec gfc_attribute_table[] = #undef LANG_HOOKS_BUILTIN_FUNCTION #undef LANG_HOOKS_GET_ARRAY_DESCR_INFO #undef LANG_HOOKS_ATTRIBUTE_TABLE +#undef LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE /* Define lang hooks. */ #define LANG_HOOKS_NAME "GNU Fortran" @@ -177,6 +187,7 @@ static const struct attribute_spec gfc_attribute_table[] = #define LANG_HOOKS_BUILTIN_FUNCTION gfc_builtin_function #define LANG_HOOKS_GET_ARRAY_DESCR_INFO gfc_get_array_descr_info #define LANG_HOOKS_ATTRIBUTE_TABLE gfc_attribute_table +#define LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE gfc_get_sarif_source_language struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER; diff --git a/gcc/go/go-lang.cc b/gcc/go/go-lang.cc index c8365d2..84cd623 100644 --- a/gcc/go/go-lang.cc +++ b/gcc/go/go-lang.cc @@ -545,6 +545,15 @@ go_langhook_eh_personality (void) return personality_decl; } +/* Get a value for the SARIF v2.1.0 "artifact.sourceLanguage" property, + based on the list in SARIF v2.1.0 Appendix J. */ + +static const char * +go_get_sarif_source_language (const char *) +{ + return "go"; +} + /* Functions called directly by the generic backend. 
*/

 tree
@@ -615,6 +624,7 @@ go_localize_identifier (const char *ident)
 #undef LANG_HOOKS_GETDECLS
 #undef LANG_HOOKS_GIMPLIFY_EXPR
 #undef LANG_HOOKS_EH_PERSONALITY
+#undef LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE

 #define LANG_HOOKS_NAME "GNU Go"
 #define LANG_HOOKS_INIT go_langhook_init
@@ -631,6 +641,7 @@ go_localize_identifier (const char *ident)
 #define LANG_HOOKS_GETDECLS go_langhook_getdecls
 #define LANG_HOOKS_GIMPLIFY_EXPR go_langhook_gimplify_expr
 #define LANG_HOOKS_EH_PERSONALITY go_langhook_eh_personality
+#define LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE go_get_sarif_source_language

 struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;

diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h
index 95d8dec..4e17915 100644
--- a/gcc/langhooks-def.h
+++ b/gcc/langhooks-def.h
@@ -98,6 +98,7 @@ extern const char *lhd_get_substring_location (const substring_loc &,
 extern int lhd_decl_dwarf_attribute (const_tree, int);
 extern int lhd_type_dwarf_attribute (const_tree, int);
 extern void lhd_finalize_early_debug (void);
+extern const char *lhd_get_sarif_source_language (const char *);

 #define LANG_HOOKS_NAME "GNU unknown"
 #define LANG_HOOKS_IDENTIFIER_SIZE sizeof (struct lang_identifier)
@@ -150,6 +151,7 @@ extern void lhd_finalize_early_debug (void);
 #define LANG_HOOKS_RUN_LANG_SELFTESTS lhd_do_nothing
 #define LANG_HOOKS_GET_SUBSTRING_LOCATION lhd_get_substring_location
 #define LANG_HOOKS_FINALIZE_EARLY_DEBUG lhd_finalize_early_debug
+#define LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE lhd_get_sarif_source_language

 /* Attribute hooks. */
 #define LANG_HOOKS_ATTRIBUTE_TABLE NULL
@@ -394,7 +396,8 @@ extern void lhd_end_section (void);
   LANG_HOOKS_EMITS_BEGIN_STMT, \
   LANG_HOOKS_RUN_LANG_SELFTESTS, \
   LANG_HOOKS_GET_SUBSTRING_LOCATION, \
-  LANG_HOOKS_FINALIZE_EARLY_DEBUG \
+  LANG_HOOKS_FINALIZE_EARLY_DEBUG, \
+  LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE \
 }

 #endif /* GCC_LANG_HOOKS_DEF_H */
diff --git a/gcc/langhooks.cc b/gcc/langhooks.cc
index 97e5139..a933407 100644
--- a/gcc/langhooks.cc
+++ b/gcc/langhooks.cc
@@ -925,6 +925,14 @@ lhd_finalize_early_debug (void)
     (*debug_hooks->early_global_decl) (cnode->decl);
 }

+/* Default implementation of LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE. */
+
+const char *
+lhd_get_sarif_source_language (const char *)
+{
+  return NULL;
+}
+
 /* Returns true if the current lang_hooks represents the GNU C frontend. */

 bool
diff --git a/gcc/langhooks.h b/gcc/langhooks.h
index 7502555..97aa9e0 100644
--- a/gcc/langhooks.h
+++ b/gcc/langhooks.h
@@ -640,6 +640,12 @@ struct lang_hooks
   /* Invoked before the early_finish debug hook is invoked. */
   void (*finalize_early_debug) (void);

+  /* Get a value for the SARIF v2.1.0 "artifact.sourceLanguage" property
+     for FILENAME, or return NULL.
+     See SARIF v2.1.0 Appendix J for suggested values for common programming
+     languages. */
+  const char *(*get_sarif_source_language) (const char *filename);
+
   /* Whenever you add entries here, make sure you adjust langhooks-def.h
      and langhooks.cc accordingly. */
 };
diff --git a/gcc/logical-location.h b/gcc/logical-location.h
new file mode 100644
index 0000000..2e7b8e3
--- /dev/null
+++ b/gcc/logical-location.h
@@ -0,0 +1,72 @@
+/* Logical location support, without knowledge of "tree".
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   Contributed by David Malcolm <dmalcolm@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+#ifndef GCC_LOGICAL_LOCATION_H
+#define GCC_LOGICAL_LOCATION_H
+
+/* An enum for discriminating between different kinds of logical location
+   for a diagnostic.
+
+   Roughly corresponds to logicalLocation's "kind" property in SARIF v2.1.0
+   (section 3.33.7). */
+
+enum logical_location_kind
+{
+  LOGICAL_LOCATION_KIND_UNKNOWN,
+
+  LOGICAL_LOCATION_KIND_FUNCTION,
+  LOGICAL_LOCATION_KIND_MEMBER,
+  LOGICAL_LOCATION_KIND_MODULE,
+  LOGICAL_LOCATION_KIND_NAMESPACE,
+  LOGICAL_LOCATION_KIND_TYPE,
+  LOGICAL_LOCATION_KIND_RETURN_TYPE,
+  LOGICAL_LOCATION_KIND_PARAMETER,
+  LOGICAL_LOCATION_KIND_VARIABLE
+};
+
+/* Abstract base class for passing around logical locations in the
+   diagnostics subsystem, such as:
+   - "within function 'foo'", or
+   - "within method 'bar'",
+   but *without* requiring knowledge of trees
+   (see tree-logical-location.h for subclasses relating to trees). */
+
+class logical_location
+{
+public:
+  virtual ~logical_location () {}
+
+  /* Get a string (or NULL) suitable for use by the SARIF logicalLocation
+     "name" property (SARIF v2.1.0 section 3.33.4). */
+  virtual const char *get_short_name () const = 0;
+
+  /* Get a string (or NULL) suitable for use by the SARIF logicalLocation
+     "fullyQualifiedName" property (SARIF v2.1.0 section 3.33.5). */
+  virtual const char *get_name_with_scope () const = 0;
+
+  /* Get a string (or NULL) suitable for use by the SARIF logicalLocation
+     "decoratedName" property (SARIF v2.1.0 section 3.33.6). */
+  virtual const char *get_internal_name () const = 0;
+
+  /* Get what kind of SARIF logicalLocation this is (if any). */
+  virtual enum logical_location_kind get_kind () const = 0;
+};
+
+#endif /* GCC_LOGICAL_LOCATION_H. */
diff --git a/gcc/objc/objc-act.h b/gcc/objc/objc-act.h
index 7d0c6d5..4f9c3a2 100644
--- a/gcc/objc/objc-act.h
+++ b/gcc/objc/objc-act.h
@@ -27,6 +27,7 @@ bool objc_init (void);
 const char *objc_printable_name (tree, int);
 int objc_gimplify_expr (tree *, gimple_seq *, gimple_seq *);
 void objc_common_init_ts (void);
+const char *objc_get_sarif_source_language (const char *);

 /* NB: The remaining public functions are prototyped in c-common.h, for the
    benefit of stub-objc.cc and objc-act.cc. */
diff --git a/gcc/objc/objc-lang.cc b/gcc/objc/objc-lang.cc
index ef664f5..559de4b 100644
--- a/gcc/objc/objc-lang.cc
+++ b/gcc/objc/objc-lang.cc
@@ -46,10 +46,18 @@ enum c_language_kind c_language = clk_objc;
 #define LANG_HOOKS_INIT_TS objc_common_init_ts
 #undef LANG_HOOKS_TREE_SIZE
 #define LANG_HOOKS_TREE_SIZE objc_common_tree_size
+#undef LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE
+#define LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE objc_get_sarif_source_language

 /* Each front end provides its own lang hook initializer. */
 struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;

+const char *
+objc_get_sarif_source_language (const char *)
+{
+  return "objectivec";
+}
+
 /* Lang hook routines common to C and ObjC appear in c-objc-common.cc;
    there should be very few (if any) routines below. */

diff --git a/gcc/plugin.cc b/gcc/plugin.cc
index 17b33e4..6c42e05 100644
--- a/gcc/plugin.cc
+++ b/gcc/plugin.cc
@@ -815,6 +815,44 @@ finalize_plugins (void)
   plugin_name_args_tab = NULL;
 }

+/* Implementation detail of for_each_plugin. */
+
+struct for_each_plugin_closure
+{
+  void (*cb) (const plugin_name_args *,
+              void *user_data);
+  void *user_data;
+};
+
+/* Implementation detail of for_each_plugin: callback for htab_traverse_noresize
+   that calls the user-provided callback. */
+
+static int
+for_each_plugin_cb (void **slot, void *info)
+{
+  struct plugin_name_args *plugin = (struct plugin_name_args *) *slot;
+  for_each_plugin_closure *c = (for_each_plugin_closure *)info;
+  c->cb (plugin, c->user_data);
+  return 1;
+}
+
+/* Call CB with USER_DATA on each plugin. */
+
+void
+for_each_plugin (void (*cb) (const plugin_name_args *,
+                             void *user_data),
+                 void *user_data)
+{
+  if (!plugin_name_args_tab)
+    return;
+
+  for_each_plugin_closure c;
+  c.cb = cb;
+  c.user_data = user_data;
+
+  htab_traverse_noresize (plugin_name_args_tab, for_each_plugin_cb, &c);
+}
+
 /* Used to pass options to htab_traverse callbacks. */

 struct print_options
diff --git a/gcc/plugin.h b/gcc/plugin.h
index ff999c4..e7e8b51 100644
--- a/gcc/plugin.h
+++ b/gcc/plugin.h
@@ -170,6 +170,9 @@ extern void warn_if_plugins (void);
 extern void print_plugins_versions (FILE *file, const char *indent);
 extern void print_plugins_help (FILE *file, const char *indent);
 extern void finalize_plugins (void);
+extern void for_each_plugin (void (*cb) (const plugin_name_args *,
+                                         void *user_data),
+                             void *user_data);

 extern bool flag_plugin_added;

diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-1.c b/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-1.c
new file mode 100644
index 0000000..4d19ae1
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-1.c
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options "-fdiagnostics-format=sarif-file" } */
+
+#warning message
+
+/* Verify that some JSON was written to a file with the expected name. */
+
+/* We expect various properties.
+   The indentation here reflects the expected hierarchy, though these tests
+   don't check for that, merely the string fragments we expect.
+   { dg-final { scan-sarif-file "\"version\": \"2.1.0\"" } }
+   { dg-final { scan-sarif-file "\"runs\": \\\[" } }
+   { dg-final { scan-sarif-file "\"artifacts\": \\\[" } }
+   { dg-final { scan-sarif-file "\"location\": " } }
+   { dg-final { scan-sarif-file "\"uri\": " } }
+
+   { dg-final { scan-sarif-file "\"sourceLanguage\": \"c\"" { target c } } }
+   { dg-final { scan-sarif-file "\"sourceLanguage\": \"cplusplus\"" { target c++ } } }
+
+   { dg-final { scan-sarif-file "\"contents\": " } }
+   { dg-final { scan-sarif-file "\"text\": " } }
+   { dg-final { scan-sarif-file "\"tool\": " } }
+   { dg-final { scan-sarif-file "\"driver\": " } }
+   { dg-final { scan-sarif-file "\"name\": \"GNU C" } }
+   { dg-final { scan-sarif-file "\"fullName\": \"GNU C" } }
+   { dg-final { scan-sarif-file "\"informationUri\": \"" } }
+   { dg-final { scan-sarif-file "\"results\": \\\[" } }
+   { dg-final { scan-sarif-file "\"level\": \"warning\"" } }
+   { dg-final { scan-sarif-file "\"ruleId\": \"-Wcpp\"" } }
+   { dg-final { scan-sarif-file "\"locations\": \\\[" } }
+   { dg-final { scan-sarif-file "\"physicalLocation\": " } }
+   { dg-final { scan-sarif-file "\"contextRegion\": " } }
+   { dg-final { scan-sarif-file "\"artifactLocation\": " } }
+   { dg-final { scan-sarif-file "\"region\": " } }
+   { dg-final { scan-sarif-file "\"startLine\": 4" } }
+   { dg-final { scan-sarif-file "\"startColumn\": 2" } }
+   { dg-final { scan-sarif-file "\"endColumn\": 9" } }
+
+   We don't expect logical locations for a top-level warning:
+   { dg-final { scan-sarif-file-not "\"logicalLocations\": " } }
+
+   { dg-final { scan-sarif-file "\"message\": " } }
+   { dg-final { scan-sarif-file "\"text\": \"#warning message" } } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-2.c b/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-2.c
new file mode 100644
index 0000000..8f5814d
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-2.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-fdiagnostics-format=sarif-file -Wmisleading-indentation" } */
+
+int test (void)
+{
+  if (1)
+    return 3;
+    return 4;
+  return 5;
+}
+
+/*
+   { dg-final { scan-sarif-file "\"level\": \"warning\"" } }
+   { dg-final { scan-sarif-file "\"ruleId\": \"-Wmisleading-indentation\"" } }
+   { dg-final { scan-sarif-file "\"text\": \" if " } }
+
+   { dg-final { scan-sarif-file "\"locations\": \\\[" } }
+
+   We expect a logical location for the error (within fn "test"):
+   { dg-final { scan-sarif-file "\"logicalLocations\": \\\[" } }
+   { dg-final { scan-sarif-file "\"kind\": \"function\"" } }
+   { dg-final { scan-sarif-file "\"name\": \"test\"" } }
+   { dg-final { scan-sarif-file "\"fullyQualifiedName\": \"test\"" } }
+   { dg-final { scan-sarif-file "\"decoratedName\": \"" } }
+
+   We expect the "note" to become a "relatedLocations" entry:
+   { dg-final { scan-sarif-file "\"relatedLocations\": \\\[" } }
+   { dg-final { scan-sarif-file "\"text\": \" return 4;" } }
+*/
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-3.c b/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-3.c
new file mode 100644
index 0000000..3856782
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-3.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-fdiagnostics-format=sarif-file" } */
+/* { dg-excess-errors "The error is sent to the SARIF file, rather than stderr" } */
+
+struct s { int color; };
+
+int test (struct s *ptr)
+{
+  return ptr->colour;
+}
+
+/*
+   { dg-final { scan-sarif-file "\"level\": \"error\"" } }
+
+   We expect a logical location for the error (within fn "test"):
+   { dg-final { scan-sarif-file "\"locations\": \\\[" } }
+   { dg-final { scan-sarif-file "\"logicalLocations\": \\\[" } }
+   { dg-final { scan-sarif-file "\"kind\": \"function\"" } }
+   { dg-final { scan-sarif-file "\"name\": \"test\"" } }
+   { dg-final { scan-sarif-file "\"fullyQualifiedName\": \"test\"" } }
+   { dg-final { scan-sarif-file "\"decoratedName\": \"" } }
+
+   We expect a "fixes" array for the fix-it hint (SARIF v2.1.0 section 3.27.30):
+   { dg-final { scan-sarif-file "\"fixes\": \\\[" } }
+   { dg-final { scan-sarif-file "\"artifactChanges\": \\\[" } }
+   { dg-final { scan-sarif-file "\"replacements\": \\\[" } }
+   { dg-final { scan-sarif-file "\"insertedContent\": " } }
+   { dg-final { scan-sarif-file "\"text\": \"color\"" } }
+   { dg-final { scan-sarif-file "\"deletedRegion\": " } }
+*/
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-4.c b/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-4.c
new file mode 100644
index 0000000..2d22f54
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-4.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-fdiagnostics-format=sarif-file" } */
+/* { dg-excess-errors "The error is sent to the SARIF file, rather than stderr" } */
+
+int test (void)
+{
+  int 文字化け = *42;
+}
+
+/*
+   { dg-final { scan-sarif-file "\"level\": \"error\"" } }
+
+   We expect the region expressed in display columns:
+   { dg-final { scan-sarif-file "\"startLine\": 7" } }
+   { dg-final { scan-sarif-file "\"startColumn\": 18" } }
+   { dg-final { scan-sarif-file "\"endColumn\": 21" } }
+
+   { dg-final { scan-sarif-file "\"text\": \" int \\u6587\\u5b57\\u5316\\u3051 = " } }
+*/
diff --git a/gcc/testsuite/gcc.dg/analyzer/file-meaning-1.c b/gcc/testsuite/gcc.dg/analyzer/file-meaning-1.c
new file mode 100644
index 0000000..66b72a7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/file-meaning-1.c
@@ -0,0 +1,15 @@
+/* { dg-additional-options "-fanalyzer-verbose-state-changes" } */
+
+typedef struct FILE FILE;
+FILE* fopen (const char*, const char*);
+int fclose (FILE*);
+
+void test_1 (const char *path)
+{
+  FILE *f = fopen (path, "r"); /* { dg-message "meaning: \\{verb: 'acquire', noun: 'resource'\\}" } */
+  if (!f)
+    return;
+
+  fclose (f); /* { dg-message "meaning: \\{verb: 'release', noun: 'resource'\\}" } */
+  fclose (f); /* { dg-warning "double 'fclose' of FILE 'f'" "warning" } */
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/malloc-meaning-1.c b/gcc/testsuite/gcc.dg/analyzer/malloc-meaning-1.c
new file mode 100644
index 0000000..4964e25
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/malloc-meaning-1.c
@@ -0,0 +1,10 @@
+/* { dg-additional-options "-fanalyzer-verbose-state-changes" } */
+
+#include <stdlib.h>
+
+void test_1 (void)
+{
+  void *ptr = malloc (1024); /* { dg-message "meaning: \\{verb: 'acquire', noun: 'memory'\\}" } */
+  free (ptr); /* { dg-message "meaning: \\{verb: 'release', noun: 'memory'\\}" } */
+  free (ptr); /* { dg-warning "double-'free' of 'ptr'" } */
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/malloc-sarif-1.c b/gcc/testsuite/gcc.dg/analyzer/malloc-sarif-1.c
new file mode 100644
index 0000000..3d141b5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/malloc-sarif-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-format=sarif-file" } */
+
+#include <stdlib.h>
+
+void test_1 (void)
+{
+  void *ptr = malloc (1024);
+  free (ptr);
+  free (ptr);
+}
+
+/* Verify SARIF output.
+
+   The threadFlowLocation objects should have "kinds" properties
+   reflecting the meanings of the events:
+   { dg-final { scan-sarif-file "\"kinds\": \\\[\"acquire\", \"memory\"\\\]" } }
+   { dg-final { scan-sarif-file "\"kinds\": \\\[\"release\", \"memory\"\\\]" } }
+   { dg-final { scan-sarif-file "\"kinds\": \\\[\"danger\"\\\]" } }
+*/
diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_gil_plugin.c b/gcc/testsuite/gcc.dg/plugin/analyzer_gil_plugin.c
index b5ae128..2a8bf11 100644
--- a/gcc/testsuite/gcc.dg/plugin/analyzer_gil_plugin.c
+++ b/gcc/testsuite/gcc.dg/plugin/analyzer_gil_plugin.c
@@ -109,6 +109,21 @@ public:
     return label_text ();
   }

+  diagnostic_event::meaning
+  get_meaning_for_state_change (const evdesc::state_change &change)
+    const final override
+  {
+    if (change.is_global_p ())
+      {
+        if (change.m_new_state == m_sm.m_released_gil)
+          return diagnostic_event::meaning (diagnostic_event::VERB_release,
+                                            diagnostic_event::NOUN_lock);
+        else if (change.m_new_state == m_sm.get_start_state ())
+          return diagnostic_event::meaning (diagnostic_event::VERB_acquire,
+                                            diagnostic_event::NOUN_lock);
+      }
+    return diagnostic_event::meaning ();
+  }
 protected:
   gil_diagnostic (const gil_state_machine &sm) : m_sm (sm)
   {
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-paths-5.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-paths-5.c
new file mode 100644
index 0000000..bd09391
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-paths-5.c
@@ -0,0 +1,56 @@
+/* { dg-do compile } */
+/* { dg-options "-fdiagnostics-format=sarif-file" } */
+/* { dg-excess-errors "The error is sent to the SARIF file, rather than stderr" } */
+
+#include <stddef.h>
+#include <stdlib.h>
+
+/* Minimal reimplementation of cpython API. */
+typedef struct PyObject {} PyObject;
+extern int PyArg_ParseTuple (PyObject *args, const char *fmt, ...);
+extern PyObject *PyList_New (int);
+extern PyObject *PyLong_FromLong(long);
+extern void PyList_Append(PyObject *list, PyObject *item);
+
+PyObject *
+make_a_list_of_random_ints_badly(PyObject *self,
+                                 PyObject *args)
+{
+  PyObject *list, *item;
+  long count, i;
+
+  if (!PyArg_ParseTuple(args, "i", &count)) {
+    return NULL;
+  }
+
+  list = PyList_New(0);
+
+  for (i = 0; i < count; i++) {
+    item = PyLong_FromLong(random());
+    PyList_Append(list, item);
+  }
+
+  return list;
+}
+
+/*
+   { dg-final { scan-sarif-file "\"tool\": " } }
+
+   We expect info about the plugin:
+   { dg-final { scan-sarif-file "\"extensions\": \\\[" } }
+   { dg-final { scan-sarif-file "\"name\": \"diagnostic_plugin_test_paths\"" } }
+   { dg-final { scan-sarif-file "\"fullName\": \"" } }
+
+   { dg-final { scan-sarif-file "\"results\": \\\[" } }
+   { dg-final { scan-sarif-file "\"level\": \"error\"" } }
+   { dg-final { scan-sarif-file "\"text\": \"passing NULL as argument 1 to 'PyList_Append' which requires a non-NULL parameter\"" } }
+
+   We expect a path for the diagnostic:
+   { dg-final { scan-sarif-file "\"codeFlows\": \\\[" } }
+   { dg-final { scan-sarif-file "\"threadFlows\": \\\[" } }
+   { dg-final { scan-sarif-file "\"locations\": \\\[" } }
+   { dg-final { scan-sarif-file "\"text\": \"when 'PyList_New' fails, returning NULL\"" } }
+   { dg-final { scan-sarif-file "\"text\": \"when 'i < count'\"" } }
+   { dg-final { scan-sarif-file "\"text\": \"when calling 'PyList_Append', passing NULL from \\(1\\) as argument 1\"" } }
+
+*/
diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
index 2ade945..63b117d 100644
--- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
+++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
@@ -102,6 +102,7 @@ set plugin_test_list [list \
     diagnostic-test-paths-2.c \
     diagnostic-test-paths-3.c \
     diagnostic-test-paths-4.c \
+    diagnostic-test-paths-5.c \
     diagnostic-path-format-plain.c \
     diagnostic-path-format-none.c \
     diagnostic-path-format-separate-events.c \
diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index 8c28997..f58b9e6 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -25,6 +25,7 @@ load_lib scanltranstree.exp
 load_lib scanipa.exp
 load_lib scanwpaipa.exp
 load_lib scanlang.exp
+load_lib scansarif.exp
 load_lib timeout.exp
 load_lib timeout-dg.exp
 load_lib prune.exp
diff --git a/gcc/testsuite/lib/scansarif.exp b/gcc/testsuite/lib/scansarif.exp
new file mode 100644
index 0000000..8b7e89c
--- /dev/null
+++ b/gcc/testsuite/lib/scansarif.exp
@@ -0,0 +1,42 @@
+# Copyright (C) 2000-2022 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3. If not see
+# <http://www.gnu.org/licenses/>.
+
+# Various utilities for scanning SARIF output, used by gcc-dg.exp and
+# g++-dg.exp.
+#
+# This is largely borrowed from scanasm.exp.
+
+# Look for a pattern in the .sarif file produced by the compiler. See
+# dg-scan for details.
+
+proc scan-sarif-file { args } {
+    set testcase [testname-for-summary]
+    # The name might include a list of options; extract the file name.
+    set filename [lindex $testcase 0]
+    set output_file "[file tail $filename].sarif"
+    dg-scan "scan-sarif-file" 1 $testcase $output_file $args
+}
+
+# Check that a pattern is not present in the .sarif file. See dg-scan
+# for details.
+
+proc scan-sarif-file-not { args } {
+    set testcase [testname-for-summary]
+    # The name might include a list of options; extract the file name.
+    set filename [lindex $testcase 0]
+    set output_file "[file tail $filename].sarif"
+    dg-scan "scan-sarif-file-not" 0 $testcase $output_file $args
+}
diff --git a/gcc/tree-diagnostic-client-data-hooks.cc b/gcc/tree-diagnostic-client-data-hooks.cc
new file mode 100644
index 0000000..f8ff271
--- /dev/null
+++ b/gcc/tree-diagnostic-client-data-hooks.cc
@@ -0,0 +1,150 @@
+/* Implementation of diagnostic_client_data_hooks for the compilers
+   (e.g. with knowledge of "tree" and lang_hooks).
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   Contributed by David Malcolm <dmalcolm@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "version.h"
+#include "tree.h"
+#include "diagnostic.h"
+#include "tree-logical-location.h"
+#include "diagnostic-client-data-hooks.h"
+#include "langhooks.h"
+#include "plugin.h"
+
+/* Concrete class for supplying a diagnostic_context with information
+   about a specific plugin within the client, when the client is the
+   compiler (i.e. a GCC plugin). */
+
+class compiler_diagnostic_client_plugin_info
+  : public diagnostic_client_plugin_info
+{
+public:
+  compiler_diagnostic_client_plugin_info (const plugin_name_args *args)
+  : m_args (args)
+  {
+  }
+
+  const char *get_short_name () const final override
+  {
+    return m_args->base_name;
+  }
+
+  const char *get_full_name () const final override
+  {
+    return m_args->full_name;
+  }
+
+  const char *get_version () const final override
+  {
+    return m_args->version;
+  }
+
+private:
+  const plugin_name_args *m_args;
+};
+
+/* Concrete subclass of client_version_info for use by compilers proper,
+   (i.e. using lang_hooks, and with knowledge of GCC plugins). */
+
+class compiler_version_info : public client_version_info
+{
+public:
+  const char *get_tool_name () const final override
+  {
+    return lang_hooks.name;
+  }
+
+  /* Compare with toplev.cc: print_version.
+     TARGET_NAME is passed in by the Makefile. */
+  char *
+  maybe_make_full_name () const final override
+  {
+    return xasprintf ("%s %sversion %s (%s)",
+                      get_tool_name (), pkgversion_string, version_string,
+                      TARGET_NAME);
+  }
+
+  const char *get_version_string () const final override
+  {
+    return version_string;
+  }
+
+  char *maybe_make_version_url () const final override
+  {
+    return xasprintf ("https://gcc.gnu.org/gcc-%i/", GCC_major_version);
+  }
+
+  void for_each_plugin (plugin_visitor &visitor) const final override
+  {
+    ::for_each_plugin (on_plugin_cb, &visitor);
+  }
+
+private:
+  static void
+  on_plugin_cb (const plugin_name_args *args,
+                void *user_data)
+  {
+    compiler_diagnostic_client_plugin_info cpi (args);
+    client_version_info::plugin_visitor *visitor
+      = (client_version_info::plugin_visitor *)user_data;
+    visitor->on_plugin (cpi);
+  }
+};
+
+/* Subclass of diagnostic_client_data_hooks for use by compilers proper
+   i.e. with knowledge of "tree", access to langhooks, etc. */
+
+class compiler_data_hooks : public diagnostic_client_data_hooks
+{
+public:
+  const client_version_info *get_any_version_info () const final override
+  {
+    return &m_version_info;
+  }
+
+  const logical_location *get_current_logical_location () const final override
+  {
+    if (current_function_decl)
+      return &m_current_fndecl_logical_loc;
+    else
+      return NULL;
+  }
+
+  const char *
+  maybe_get_sarif_source_language (const char *filename) const final override
+  {
+    return lang_hooks.get_sarif_source_language (filename);
+  }
+
+private:
+  compiler_version_info m_version_info;
+  current_fndecl_logical_location m_current_fndecl_logical_loc;
+};
+
+/* Create a compiler_data_hooks (so that the class can be local
+   to this file). */
+
+diagnostic_client_data_hooks *
+make_compiler_data_hooks ()
+{
+  return new compiler_data_hooks ();
+}
diff --git a/gcc/tree-diagnostic.cc b/gcc/tree-diagnostic.cc
index 40a4c5f..0d79fe3 100644
--- a/gcc/tree-diagnostic.cc
+++ b/gcc/tree-diagnostic.cc
@@ -27,6 +27,7 @@ along with GCC; see the file COPYING3. If not see
 #include "tree-pretty-print.h"
 #include "gimple-pretty-print.h"
 #include "tree-diagnostic.h"
+#include "diagnostic-client-data-hooks.h"
 #include "langhooks.h"
 #include "intl.h"

@@ -373,4 +374,5 @@ tree_diagnostics_defaults (diagnostic_context *context)
   context->print_path = default_tree_diagnostic_path_printer;
   context->make_json_for_path = default_tree_make_json_for_path;
   context->set_locations_cb = set_inlining_locations;
+  context->m_client_data_hooks = make_compiler_data_hooks ();
 }
diff --git a/gcc/tree-logical-location.cc b/gcc/tree-logical-location.cc
new file mode 100644
index 0000000..79d8add
--- /dev/null
+++ b/gcc/tree-logical-location.cc
@@ -0,0 +1,148 @@
+/* Subclasses of logical_location with knowledge of "tree".
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   Contributed by David Malcolm <dmalcolm@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "pretty-print.h"
+#include "tree-logical-location.h"
+#include "langhooks.h"
+
+/* class compiler_logical_location : public logical_location. */
+
+/* Get a string for DECL suitable for use by the SARIF logicalLocation
+   "name" property (SARIF v2.1.0 section 3.33.4). */
+
+const char *
+compiler_logical_location::get_short_name_for_tree (tree decl)
+{
+  gcc_assert (decl);
+  return identifier_to_locale (lang_hooks.decl_printable_name (decl, 0));
+}
+
+/* Get a string for DECL suitable for use by the SARIF logicalLocation
+   "fullyQualifiedName" property (SARIF v2.1.0 section 3.33.5). */
+
+const char *
+compiler_logical_location::get_name_with_scope_for_tree (tree decl)
+{
+  gcc_assert (decl);
+  return identifier_to_locale (lang_hooks.decl_printable_name (decl, 1));
+}
+
+/* Get a string for DECL suitable for use by the SARIF logicalLocation
+   "decoratedName" property (SARIF v2.1.0 section 3.33.6). */
+
+const char *
+compiler_logical_location::get_internal_name_for_tree (tree decl)
+{
+  gcc_assert (decl);
+  if (HAS_DECL_ASSEMBLER_NAME_P (decl))
+    if (tree id = DECL_ASSEMBLER_NAME (decl))
+      return IDENTIFIER_POINTER (id);
+  return NULL;
+}
+
+/* Get what kind of SARIF logicalLocation DECL is (if any). */
+
+enum logical_location_kind
+compiler_logical_location::get_kind_for_tree (tree decl)
+{
+  if (!decl)
+    return LOGICAL_LOCATION_KIND_UNKNOWN;
+
+  switch (TREE_CODE (decl))
+    {
+    default:
+      return LOGICAL_LOCATION_KIND_UNKNOWN;
+    case FUNCTION_DECL:
+      return LOGICAL_LOCATION_KIND_FUNCTION;
+    case PARM_DECL:
+      return LOGICAL_LOCATION_KIND_PARAMETER;
+    case VAR_DECL:
+      return LOGICAL_LOCATION_KIND_VARIABLE;
+    }
+}
+
+/* class tree_logical_location : public compiler_logical_location. */
+
+/* Implementation of the logical_location vfuncs, using m_decl. */
+
+const char *
+tree_logical_location::get_short_name () const
+{
+  gcc_assert (m_decl);
+  return get_short_name_for_tree (m_decl);
+}
+
+const char *
+tree_logical_location::get_name_with_scope () const
+{
+  gcc_assert (m_decl);
+  return get_name_with_scope_for_tree (m_decl);
+}
+
+const char *
+tree_logical_location::get_internal_name () const
+{
+  gcc_assert (m_decl);
+  return get_internal_name_for_tree (m_decl);
+}
+
+enum logical_location_kind
+tree_logical_location::get_kind () const
+{
+  gcc_assert (m_decl);
+  return get_kind_for_tree (m_decl);
+}
+
+/* class current_fndecl_logical_location : public compiler_logical_location. */
+
+/* Implementation of the logical_location vfuncs, using
+   current_function_decl. */
+
+const char *
+current_fndecl_logical_location::get_short_name () const
+{
+  gcc_assert (current_function_decl);
+  return get_short_name_for_tree (current_function_decl);
+}
+
+const char *
+current_fndecl_logical_location::get_name_with_scope () const
+{
+  gcc_assert (current_function_decl);
+  return get_name_with_scope_for_tree (current_function_decl);
+}
+
+const char *
+current_fndecl_logical_location::get_internal_name () const
+{
+  gcc_assert (current_function_decl);
+  return get_internal_name_for_tree (current_function_decl);
+}
+
+enum logical_location_kind
+current_fndecl_logical_location::get_kind () const
+{
+  gcc_assert (current_function_decl);
+  return get_kind_for_tree (current_function_decl);
+}
diff --git a/gcc/tree-logical-location.h b/gcc/tree-logical-location.h
new file mode 100644
index 0000000..3086cac
--- /dev/null
+++ b/gcc/tree-logical-location.h
@@ -0,0 +1,67 @@
+/* Subclasses of logical_location with knowledge of "tree".
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   Contributed by David Malcolm <dmalcolm@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+#ifndef GCC_TREE_LOGICAL_LOCATION_H
+#define GCC_TREE_LOGICAL_LOCATION_H
+
+#include "logical-location.h"
+
+/* Abstract subclass of logical_location, with knowledge of "tree", but
+   for no specific tree. */
+
+class compiler_logical_location : public logical_location
+{
+ protected:
+  static const char *get_short_name_for_tree (tree);
+  static const char *get_name_with_scope_for_tree (tree);
+  static const char *get_internal_name_for_tree (tree);
+  static enum logical_location_kind get_kind_for_tree (tree);
+};
+
+/* Concrete subclass of logical_location, with reference to a specific
+   tree. */
+
+class tree_logical_location : public compiler_logical_location
+{
+public:
+  tree_logical_location (tree decl) : m_decl (decl) {}
+
+  const char *get_short_name () const final override;
+  const char *get_name_with_scope () const final override;
+  const char *get_internal_name () const final override;
+  enum logical_location_kind get_kind () const final override;
+
+private:
+  tree m_decl;
+};
+
+/* Concrete subclass of logical_location, with reference to
+   current_function_decl. */
+
+class current_fndecl_logical_location : public compiler_logical_location
+{
+public:
+  const char *get_short_name () const final override;
+  const char *get_name_with_scope () const final override;
+  const char *get_internal_name () const final override;
+  enum logical_location_kind get_kind () const final override;
+};
+
+#endif /* GCC_TREE_LOGICAL_LOCATION_H. */
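
The logical_location interface added above is deliberately free of "tree", so it can be exercised outside the compiler. The following is a minimal standalone sketch, not code from the patch: the enum and the abstract class mirror gcc/logical-location.h, while fixed_function_location, kind_to_sarif_string and describe_logical_location are hypothetical names introduced only for illustration. The kind strings are assumed to follow the values suggested for logicalLocation "kind" in SARIF v2.1.0 section 3.33.7; how diagnostic-format-sarif.cc actually performs this mapping is not shown in this part of the diff.

```cpp
// Illustrative sketch only; not part of the patch.
#include <cassert>
#include <string>

// Mirrors the enum added in gcc/logical-location.h.
enum logical_location_kind
{
  LOGICAL_LOCATION_KIND_UNKNOWN,
  LOGICAL_LOCATION_KIND_FUNCTION,
  LOGICAL_LOCATION_KIND_MEMBER,
  LOGICAL_LOCATION_KIND_MODULE,
  LOGICAL_LOCATION_KIND_NAMESPACE,
  LOGICAL_LOCATION_KIND_TYPE,
  LOGICAL_LOCATION_KIND_RETURN_TYPE,
  LOGICAL_LOCATION_KIND_PARAMETER,
  LOGICAL_LOCATION_KIND_VARIABLE
};

// Mirrors the abstract interface added in gcc/logical-location.h.
class logical_location
{
public:
  virtual ~logical_location () {}
  virtual const char *get_short_name () const = 0;
  virtual const char *get_name_with_scope () const = 0;
  virtual const char *get_internal_name () const = 0;
  virtual enum logical_location_kind get_kind () const = 0;
};

// Hypothetical tree-free subclass describing a function by fixed strings.
class fixed_function_location : public logical_location
{
public:
  fixed_function_location (const char *short_name,
                           const char *name_with_scope,
                           const char *internal_name)
  : m_short_name (short_name),
    m_name_with_scope (name_with_scope),
    m_internal_name (internal_name)
  {
    // This sketch requires a short name; the interface itself permits NULL.
    assert (short_name);
  }

  const char *get_short_name () const final override { return m_short_name; }
  const char *get_name_with_scope () const final override
  {
    return m_name_with_scope;
  }
  const char *get_internal_name () const final override
  {
    return m_internal_name;
  }
  enum logical_location_kind get_kind () const final override
  {
    return LOGICAL_LOCATION_KIND_FUNCTION;
  }

private:
  const char *m_short_name;
  const char *m_name_with_scope;
  const char *m_internal_name;
};

// Hypothetical helper: map a kind to a SARIF "kind" string, or NULL if unknown.
const char *
kind_to_sarif_string (enum logical_location_kind kind)
{
  switch (kind)
    {
    default:
    case LOGICAL_LOCATION_KIND_UNKNOWN: return nullptr;
    case LOGICAL_LOCATION_KIND_FUNCTION: return "function";
    case LOGICAL_LOCATION_KIND_MEMBER: return "member";
    case LOGICAL_LOCATION_KIND_MODULE: return "module";
    case LOGICAL_LOCATION_KIND_NAMESPACE: return "namespace";
    case LOGICAL_LOCATION_KIND_TYPE: return "type";
    case LOGICAL_LOCATION_KIND_RETURN_TYPE: return "returnType";
    case LOGICAL_LOCATION_KIND_PARAMETER: return "parameter";
    case LOGICAL_LOCATION_KIND_VARIABLE: return "variable";
    }
}

// Append "KEY": "VALUE" to OUT, skipping NULL values.
// No JSON escaping is done; this is a sketch, not a JSON writer.
void
append_property (std::string &out, const char *key, const char *value)
{
  if (!value)
    return;
  if (out.size () > 1)
    out += ", ";
  out += "\"";
  out += key;
  out += "\": \"";
  out += value;
  out += "\"";
}

// Hypothetical helper: render LOC as a logicalLocation-like JSON object.
std::string
describe_logical_location (const logical_location &loc)
{
  std::string out = "{";
  append_property (out, "name", loc.get_short_name ());
  append_property (out, "fullyQualifiedName", loc.get_name_with_scope ());
  append_property (out, "decoratedName", loc.get_internal_name ());
  append_property (out, "kind", kind_to_sarif_string (loc.get_kind ()));
  out += "}";
  return out;
}
```

Under these assumptions, calling describe_logical_location on a fixed_function_location built from ("test", "test", NULL) yields an object with "name", "fullyQualifiedName" and "kind" properties and no "decoratedName", which is the same set of property names the scan-sarif-file directives in diagnostic-format-sarif-file-2.c and -3.c look for.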