Add condition coverage (MC/DC)

This patch adds support in gcc+gcov for modified condition/decision coverage (MC/DC) with the -fcondition-coverage flag. MC/DC is a type of test/code coverage and it is particularly important for safety-critical applicaitons in industries like aviation and automotive. Notably, MC/DC is required or recommended by: * DO-178C for the most critical software (Level A) in avionics. * IEC 61508 for SIL 4. * ISO 26262-6 for ASIL D. From the SQLite webpage: Two methods of measuring test coverage were described above: "statement" and "branch" coverage. There are many other test coverage metrics besides these two. Another popular metric is "Modified Condition/Decision Coverage" or MC/DC. Wikipedia defines MC/DC as follows: * Each decision tries every possible outcome. * Each condition in a decision takes on every possible outcome. * Each entry and exit point is invoked. * Each condition in a decision is shown to independently affect the outcome of the decision. In the C programming language where && and || are "short-circuit" operators, MC/DC and branch coverage are very nearly the same thing. The primary difference is in boolean vector tests. One can test for any of several bits in bit-vector and still obtain 100% branch test coverage even though the second element of MC/DC - the requirement that each condition in a decision take on every possible outcome - might not be satisfied. https://sqlite.org/testing.html#mcdc MC/DC comes in different flavors, the most important being unique cause MC/DC and masking MC/DC. This patch implements masking MC/DC, which is works well with short circuiting semantics, and according to John Chilenski's "An Investigation of Three Forms of the Modified Condition Decision Coverage (MCDC) Criterion" (2001) is as good as unique cause at catching bugs. Whalen, Heimdahl, and De Silva "Efficient Test Coverage Measurement for MC/DC" describes an algorithm for finding the masking table from an AST walk, but my algorithm figures this out by analyzing the control flow graph. The CFG is considered a reduced ordered binary decision diagram and an input vector a path through the BDD, which is recorded. Specific edges will mask ("null out") the contribution from earlier path segments, which can be determined by finding short circuit endpoints. Masking is most easily understood as circuiting of terms in the reverse-ordered Boolean function, and the masked conditions do not affect the decision like short-circuited conditions do not affect the decision. A tag/discriminator mapping from gcond->uid is created during gimplification and made available through the function struct. The values are unimportant as long as basic conditions constructed from a single Boolean expression are given the same identifier. This happens in the breaking down of ANDIF/ORIF trees, so the coverage generally works well for frontends that create such trees. Like Whalen et al this implementation records coverage in fixed-size bitsets which gcov knows how to interpret. Recording conditions only requires a few bitwise operations per condition and is very fast, but comes with a limit on the number of terms in a single boolean expression; the number of bits in a gcov_unsigned_type (which is usually typedef'd to uint64_t). For most practical purposes this is acceptable, and by default a warning will be issued if gcc cannot instrument the expression. This is a practical limitation in the implementation, and not a limitation of the algorithm, so support for more conditions can be supported by introducing arbitrary-sized bitsets. In action it looks pretty similar to the branch coverage. The -g short opt carries no significance, but was chosen because it was an available option with the upper-case free too. gcov --conditions: 3: 17:void fn (int a, int b, int c, int d) { 3: 18: if ((a && (b || c)) && d) conditions covered 3/8 condition 0 not covered (true false) condition 1 not covered (true) condition 2 not covered (true) condition 3 not covered (true) 1: 19: x = 1; -: 20: else 2: 21: x = 2; 3: 22:} gcov --conditions --json-format: "conditions": [ { "not_covered_false": [ 0 ], "count": 8, "covered": 3, "not_covered_true": [ 0, 1, 2, 3 ] } ], Expressions with constants may be heavily rewritten before it reaches the gimplification, so constructs like int x = a ? 0 : 1 becomes _x = (_a == 0). From source you would expect coverage, but it gets neither branch nor condition coverage. The same applies to expressions like int x = 1 || a which are simply replaced by a constant. The test suite contains a lot of small programs and functions. Some of these were designed by hand to test for specific behaviours and graph shapes, and some are previously-failed test cases in other programs adapted into the test suite. gcc/ChangeLog: * builtins.cc (expand_builtin_fork_or_exec): Check condition_coverage_flag. * collect2.cc (main): Add -fno-condition-coverage to OBSTACK. * common.opt: Add new options -fcondition-coverage and -Wcoverage-too-many-conditions. * doc/gcov.texi: Add --conditions documentation. * doc/invoke.texi: Add -fcondition-coverage documentation. * function.cc (free_after_compilation): Free cond_uids. * function.h (struct function): Add cond_uids. * gcc.cc: Link gcov on -fcondition-coverage. * gcov-counter.def (GCOV_COUNTER_CONDS): New. * gcov-dump.cc (tag_conditions): New. * gcov-io.h (GCOV_TAG_CONDS): New. (GCOV_TAG_CONDS_LENGTH): New. (GCOV_TAG_CONDS_NUM): New. * gcov.cc (class condition_info): New. (condition_info::condition_info): New. (condition_info::popcount): New. (struct coverage_info): New. (add_condition_counts): New. (output_conditions): New. (print_usage): Add -g, --conditions. (process_args): Likewise. (output_intermediate_json_line): Output conditions. (read_graph_file): Read condition counters. (read_count_file): Likewise. (file_summary): Print conditions. (accumulate_line_info): Accumulate conditions. (output_line_details): Print conditions. * gimplify.cc (next_cond_uid): New. (reset_cond_uid): New. (shortcut_cond_r): Set condition discriminator. (tag_shortcut_cond): New. (gimple_associate_condition_with_expr): New. (shortcut_cond_expr): Set condition discriminator. (gimplify_cond_expr): Likewise. (gimplify_function_tree): Call reset_cond_uid. * ipa-inline.cc (can_early_inline_edge_p): Check condition_coverage_flag. * ipa-split.cc (pass_split_functions::gate): Likewise. * passes.cc (finish_optimization_passes): Likewise. * profile.cc (struct condcov): New declaration. (cov_length): Likewise. (cov_blocks): Likewise. (cov_masks): Likewise. (cov_maps): Likewise. (cov_free): Likewise. (instrument_decisions): New. (read_thunk_profile): Control output to file. (branch_prob): Call find_conditions, instrument_decisions. (init_branch_prob): Add total_num_conds. (end_branch_prob): Likewise. * tree-core.h (struct tree_exp): Add condition_uid. * tree-profile.cc (struct conds_ctx): New. (CONDITIONS_MAX_TERMS): New. (EDGE_CONDITION): New. (topological_cmp): New. (index_of): New. (single_p): New. (single_edge): New. (contract_edge_up): New. (struct outcomes): New. (conditional_succs): New. (condition_index): New. (condition_uid): New. (masking_vectors): New. (emit_assign): New. (emit_bitwise_op): New. (make_top_index_visit): New. (make_top_index): New. (paths_between): New. (struct condcov): New. (cov_length): New. (cov_blocks): New. (cov_masks): New. (cov_maps): New. (cov_free): New. (find_conditions): New. (struct counters): New. (find_counters): New. (resolve_counter): New. (resolve_counters): New. (instrument_decisions): New. (tree_profiling): Check condition_coverage_flag. (pass_ipa_tree_profile::gate): Likewise. * tree.h (SET_EXPR_UID): New. (EXPR_COND_UID): New. libgcc/ChangeLog: * libgcov-merge.c (__gcov_merge_ior): New. gcc/testsuite/ChangeLog: * lib/gcov.exp: Add condition coverage test function. * g++.dg/gcov/gcov-18.C: New test. * gcc.misc-tests/gcov-19.c: New test. * gcc.misc-tests/gcov-20.c: New test. * gcc.misc-tests/gcov-21.c: New test. * gcc.misc-tests/gcov-22.c: New test. * gcc.misc-tests/gcov-23.c: New test.
author: Jørgen Kvalsvik <j@lambda.is> 2023-12-05 12:59:40 +0100
committer: Jørgen Kvalsvik <j@lambda.is> 2024-04-04 20:28:44 +0200
commit: 08a52331803f66a4aaeaedd278436ca8eac57b50 (patch)
tree: 0827170eb48377be2b7888ab2bb8e9012b89b55c /gcc/gimplify.cc
parent: b7bd2ec73d66f7487bc8842b24daecaa802a72e6 (diff)
download: gcc-08a52331803f66a4aaeaedd278436ca8eac57b50.zip
gcc-08a52331803f66a4aaeaedd278436ca8eac57b50.tar.gz
gcc-08a52331803f66a4aaeaedd278436ca8eac57b50.tar.bz2
1 files changed, 108 insertions, 15 deletions
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index d64bbf3..7b972c0 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -71,6 +71,28 @@ along with GCC; see the file COPYING3.  If not see
 #include "context.h"
 #include "tree-nested.h"
 
+/* Identifier for a basic condition, mapping it to other basic conditions of
+   its Boolean expression.  Basic conditions given the same uid (in the same
+   function) are parts of the same ANDIF/ORIF expression.  Used for condition
+   coverage.  */
+static unsigned nextuid = 1;
+/* Get a fresh identifier for a new condition expression.  This is used for
+   condition coverage.  */
+static unsigned
+next_cond_uid ()
+{
+    return nextuid++;
+}
+/* Reset the condition uid to the value it should have when compiling a new
+   function.  0 is already the default/untouched value, so start at non-zero.
+   A valid and set id should always be > 0.  This is used for condition
+   coverage.  */
+static void
+reset_cond_uid ()
+{
+    nextuid = 1;
+}
+
 /* Hash set of poisoned variables in a bind expr.  */
 static hash_set<tree> *asan_poisoned_variables = NULL;
 
@@ -4139,13 +4161,16 @@ gimplify_call_expr (tree *expr_p, gimple_seq *pre_p, bool want_value)
 
    LOCUS is the source location of the COND_EXPR.
 
+   The condition_uid is a discriminator tag for condition coverage used to map
+   conditions to its corresponding full Boolean function.
+
    This function is the tree equivalent of do_jump.
 
    shortcut_cond_r should only be called by shortcut_cond_expr.  */
 
 static tree
 shortcut_cond_r (tree pred, tree *true_label_p, tree *false_label_p,
-		 location_t locus)
+		 location_t locus, unsigned condition_uid)
 {
   tree local_label = NULL_TREE;
   tree t, expr = NULL;
@@ -4167,13 +4192,14 @@ shortcut_cond_r (tree pred, tree *true_label_p, tree *false_label_p,
 	false_label_p = &local_label;
 
       /* Keep the original source location on the first 'if'.  */
-      t = shortcut_cond_r (TREE_OPERAND (pred, 0), NULL, false_label_p, locus);
+      t = shortcut_cond_r (TREE_OPERAND (pred, 0), NULL, false_label_p, locus,
+			   condition_uid);
       append_to_statement_list (t, &expr);
 
       /* Set the source location of the && on the second 'if'.  */
       new_locus = rexpr_location (pred, locus);
       t = shortcut_cond_r (TREE_OPERAND (pred, 1), true_label_p, false_label_p,
-			   new_locus);
+			   new_locus, condition_uid);
       append_to_statement_list (t, &expr);
     }
   else if (TREE_CODE (pred) == TRUTH_ORIF_EXPR)
@@ -4190,13 +4216,14 @@ shortcut_cond_r (tree pred, tree *true_label_p, tree *false_label_p,
 	true_label_p = &local_label;
 
       /* Keep the original source location on the first 'if'.  */
-      t = shortcut_cond_r (TREE_OPERAND (pred, 0), true_label_p, NULL, locus);
+      t = shortcut_cond_r (TREE_OPERAND (pred, 0), true_label_p, NULL, locus,
+			   condition_uid);
       append_to_statement_list (t, &expr);
 
       /* Set the source location of the || on the second 'if'.  */
       new_locus = rexpr_location (pred, locus);
       t = shortcut_cond_r (TREE_OPERAND (pred, 1), true_label_p, false_label_p,
-			   new_locus);
+			   new_locus, condition_uid);
       append_to_statement_list (t, &expr);
     }
   else if (TREE_CODE (pred) == COND_EXPR
@@ -4219,9 +4246,11 @@ shortcut_cond_r (tree pred, tree *true_label_p, tree *false_label_p,
       new_locus = rexpr_location (pred, locus);
       expr = build3 (COND_EXPR, void_type_node, TREE_OPERAND (pred, 0),
 		     shortcut_cond_r (TREE_OPERAND (pred, 1), true_label_p,
-				      false_label_p, locus),
+				      false_label_p, locus, condition_uid),
 		     shortcut_cond_r (TREE_OPERAND (pred, 2), true_label_p,
-				      false_label_p, new_locus));
+				      false_label_p, new_locus,
+				      condition_uid));
+      SET_EXPR_UID (expr, condition_uid);
     }
   else
     {
@@ -4229,6 +4258,7 @@ shortcut_cond_r (tree pred, tree *true_label_p, tree *false_label_p,
 		     build_and_jump (true_label_p),
 		     build_and_jump (false_label_p));
       SET_EXPR_LOCATION (expr, locus);
+      SET_EXPR_UID (expr, condition_uid);
     }
 
   if (local_label)
@@ -4279,12 +4309,44 @@ find_goto_label (tree expr)
   return NULL_TREE;
 }
 
+
+/* Given a multi-term condition (ANDIF, ORIF), walk the predicate PRED and tag
+   every basic condition with CONDITION_UID.  Two basic conditions share the
+   CONDITION_UID discriminator when they belong to the same predicate, which is
+   used by the condition coverage.  Doing this as an explicit step makes for a
+   simpler implementation than weaving it into the splitting code as the
+   splitting code eventually calls the entry point gimplfiy_expr which makes
+   bookkeeping complicated.  */
+static void
+tag_shortcut_cond (tree pred, unsigned condition_uid)
+{
+  if (TREE_CODE (pred) == TRUTH_ANDIF_EXPR
+      || TREE_CODE (pred) == TRUTH_ORIF_EXPR)
+    {
+      tree fst = TREE_OPERAND (pred, 0);
+      tree lst = TREE_OPERAND (pred, 1);
+
+      if (TREE_CODE (fst) == TRUTH_ANDIF_EXPR
+	  || TREE_CODE (fst) == TRUTH_ORIF_EXPR)
+	tag_shortcut_cond (fst, condition_uid);
+      else if (TREE_CODE (fst) == COND_EXPR)
+	SET_EXPR_UID (fst, condition_uid);
+
+      if (TREE_CODE (lst) == TRUTH_ANDIF_EXPR
+	  || TREE_CODE (lst) == TRUTH_ORIF_EXPR)
+	tag_shortcut_cond (lst, condition_uid);
+      else if (TREE_CODE (lst) == COND_EXPR)
+	SET_EXPR_UID (lst, condition_uid);
+    }
+}
+
 /* Given a conditional expression EXPR with short-circuit boolean
    predicates using TRUTH_ANDIF_EXPR or TRUTH_ORIF_EXPR, break the
-   predicate apart into the equivalent sequence of conditionals.  */
-
+   predicate apart into the equivalent sequence of conditionals.  CONDITION_UID
+   is a the tag/discriminator for this EXPR - all basic conditions in the
+   expression will be given the same CONDITION_UID.  */
 static tree
-shortcut_cond_expr (tree expr)
+shortcut_cond_expr (tree expr, unsigned condition_uid)
 {
   tree pred = TREE_OPERAND (expr, 0);
   tree then_ = TREE_OPERAND (expr, 1);
@@ -4296,6 +4358,8 @@ shortcut_cond_expr (tree expr)
   bool then_se = then_ && TREE_SIDE_EFFECTS (then_);
   bool else_se = else_ && TREE_SIDE_EFFECTS (else_);
 
+  tag_shortcut_cond (pred, condition_uid);
+
   /* First do simple transformations.  */
   if (!else_se)
     {
@@ -4311,7 +4375,7 @@ shortcut_cond_expr (tree expr)
 	  /* Set the source location of the && on the second 'if'.  */
 	  if (rexpr_has_location (pred))
 	    SET_EXPR_LOCATION (expr, rexpr_location (pred));
-	  then_ = shortcut_cond_expr (expr);
+	  then_ = shortcut_cond_expr (expr, condition_uid);
 	  then_se = then_ && TREE_SIDE_EFFECTS (then_);
 	  pred = TREE_OPERAND (pred, 0);
 	  expr = build3 (COND_EXPR, void_type_node, pred, then_, NULL_TREE);
@@ -4333,7 +4397,7 @@ shortcut_cond_expr (tree expr)
 	  /* Set the source location of the || on the second 'if'.  */
 	  if (rexpr_has_location (pred))
 	    SET_EXPR_LOCATION (expr, rexpr_location (pred));
-	  else_ = shortcut_cond_expr (expr);
+	  else_ = shortcut_cond_expr (expr, condition_uid);
 	  else_se = else_ && TREE_SIDE_EFFECTS (else_);
 	  pred = TREE_OPERAND (pred, 0);
 	  expr = build3 (COND_EXPR, void_type_node, pred, NULL_TREE, else_);
@@ -4341,6 +4405,9 @@ shortcut_cond_expr (tree expr)
 	}
     }
 
+  /* The expr tree should also have the expression id set.  */
+  SET_EXPR_UID (expr, condition_uid);
+
   /* If we're done, great.  */
   if (TREE_CODE (pred) != TRUTH_ANDIF_EXPR
       && TREE_CODE (pred) != TRUTH_ORIF_EXPR)
@@ -4388,7 +4455,7 @@ shortcut_cond_expr (tree expr)
   /* If there was nothing else in our arms, just forward the label(s).  */
   if (!then_se && !else_se)
     return shortcut_cond_r (pred, true_label_p, false_label_p,
-			    EXPR_LOC_OR_LOC (expr, input_location));
+			    EXPR_LOC_OR_LOC (expr, input_location), condition_uid);
 
   /* If our last subexpression already has a terminal label, reuse it.  */
   if (else_se)
@@ -4420,7 +4487,8 @@ shortcut_cond_expr (tree expr)
   jump_over_else = block_may_fallthru (then_);
 
   pred = shortcut_cond_r (pred, true_label_p, false_label_p,
-			  EXPR_LOC_OR_LOC (expr, input_location));
+			  EXPR_LOC_OR_LOC (expr, input_location),
+			  condition_uid);
 
   expr = NULL;
   append_to_statement_list (pred, &expr);
@@ -4594,6 +4662,24 @@ generic_expr_could_trap_p (tree expr)
   return false;
 }
 
+/* Associate the condition STMT with the discriminator UID.  STMTs that are
+   broken down with ANDIF/ORIF from the same Boolean expression should be given
+   the same UID; 'if (a && b && c) { if (d || e) ... } ...' should yield the
+   { a: 1, b: 1, c: 1, d: 2, e: 2 } when gimplification is done.  This is used
+   for condition coverage.  */
+static void
+gimple_associate_condition_with_expr (struct function *fn, gcond *stmt,
+				      unsigned uid)
+{
+  if (!condition_coverage_flag)
+    return;
+
+  if (!fn->cond_uids)
+    fn->cond_uids = new hash_map <gcond*, unsigned> ();
+
+  fn->cond_uids->put (stmt, uid);
+}
+
 /*  Convert the conditional expression pointed to by EXPR_P '(p) ? a : b;'
     into
 
@@ -4696,7 +4782,7 @@ gimplify_cond_expr (tree *expr_p, gimple_seq *pre_p, fallback_t fallback)
   if (TREE_CODE (TREE_OPERAND (expr, 0)) == TRUTH_ANDIF_EXPR
       || TREE_CODE (TREE_OPERAND (expr, 0)) == TRUTH_ORIF_EXPR)
     {
-      expr = shortcut_cond_expr (expr);
+      expr = shortcut_cond_expr (expr, next_cond_uid ());
 
       if (expr != *expr_p)
 	{
@@ -4760,11 +4846,16 @@ gimplify_cond_expr (tree *expr_p, gimple_seq *pre_p, fallback_t fallback)
   else
     label_false = create_artificial_label (UNKNOWN_LOCATION);
 
+  unsigned cond_uid = EXPR_COND_UID (expr);
+  if (cond_uid == 0)
+    cond_uid = next_cond_uid ();
+
   gimple_cond_get_ops_from_tree (COND_EXPR_COND (expr), &pred_code, &arm1,
 				 &arm2);
   cond_stmt = gimple_build_cond (pred_code, arm1, arm2, label_true,
 				 label_false);
   gimple_set_location (cond_stmt, EXPR_LOCATION (expr));
+  gimple_associate_condition_with_expr (cfun, cond_stmt, cond_uid);
   copy_warning (cond_stmt, COND_EXPR_COND (expr));
   gimplify_seq_add_stmt (&seq, cond_stmt);
   gimple_stmt_iterator gsi = gsi_last (seq);
@@ -19248,6 +19339,8 @@ gimplify_function_tree (tree fndecl)
   else
     push_struct_function (fndecl);
 
+  reset_cond_uid ();
+
   /* Tentatively set PROP_gimple_lva here, and reset it in gimplify_va_arg_expr
      if necessary.  */
   cfun->curr_properties |= PROP_gimple_lva;
author	Jørgen Kvalsvik <j@lambda.is>	2023-12-05 12:59:40 +0100
committer	Jørgen Kvalsvik <j@lambda.is>	2024-04-04 20:28:44 +0200
commit	08a52331803f66a4aaeaedd278436ca8eac57b50 (patch)
tree	0827170eb48377be2b7888ab2bb8e9012b89b55c /gcc/gimplify.cc
parent	b7bd2ec73d66f7487bc8842b24daecaa802a72e6 (diff)
download	gcc-08a52331803f66a4aaeaedd278436ca8eac57b50.zip gcc-08a52331803f66a4aaeaedd278436ca8eac57b50.tar.gz gcc-08a52331803f66a4aaeaedd278436ca8eac57b50.tar.bz2