diff options
author | Bob Wilson <bob.wilson@apple.com> | 2014-02-17 19:21:09 +0000 |
---|---|---|
committer | Bob Wilson <bob.wilson@apple.com> | 2014-02-17 19:21:09 +0000 |
commit | bf854f0f53eab60d127d13e33068993007da280f (patch) | |
tree | 29dde13fe5bf590de7b56280323d46a29d22853a /clang/lib/CodeGen/CodeGenFunction.cpp | |
parent | a7b16e0ffde87990d3c6ae53e7bb456e2d888bd9 (diff) | |
download | llvm-bf854f0f53eab60d127d13e33068993007da280f.zip llvm-bf854f0f53eab60d127d13e33068993007da280f.tar.gz llvm-bf854f0f53eab60d127d13e33068993007da280f.tar.bz2 |
Change PGO instrumentation to compute counts in a separate AST traversal.
Previously, we made one traversal of the AST prior to codegen to assign
counters to the ASTs and then propagated the count values during codegen. This
patch now adds a separate AST traversal prior to codegen for the
-fprofile-instr-use option to propagate the count values. The counts are then
saved in a map from which they can be retrieved during codegen.
This new approach has several advantages:
1. It gets rid of a lot of extra PGO-related code that had previously been
added to codegen.
2. It fixes a serious bug. My original implementation (which was mailed to the
list but never committed) used 3 counters for every loop. Justin improved it to
move 2 of those counters into the less-frequently executed breaks and continues,
but that turned out to produce wrong count values in some cases. The solution
requires visiting a loop body before the condition so that the count for the
condition properly includes the break and continue counts. Changing codegen to
visit a loop body first would be a fairly invasive change, but with a separate
AST traversal, it is easy to control the order of traversal. I've added a
testcase (provided by Justin) to make sure this works correctly.
3. It improves the instrumentation overhead, reducing the number of counters for
a loop from 3 to 1. We no longer need dedicated counters for breaks and
continues, since we can just use the propagated count values when visiting
breaks and continues.
To make this work, I needed to make a change to the way we count case
statements, going back to my original approach of not including the fall-through
in the counter values. This was necessary because there isn't always an AST node
that can be used to record the fall-through count. Now case statements are
handled the same as default statements, with the fall-through paths branching
over the counter increments. While I was at it, I also went back to using this
approach for do-loops -- omitting the fall-through count into the loop body
simplifies some of the calculations and make them behave the same as other
loops. Whenever we start using this instrumentation for coverage, we'll need
to add the fall-through counts into the counter values.
llvm-svn: 201528
Diffstat (limited to 'clang/lib/CodeGen/CodeGenFunction.cpp')
-rw-r--r-- | clang/lib/CodeGen/CodeGenFunction.cpp | 34 |
1 files changed, 26 insertions, 8 deletions
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index 715af36..409d114 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -688,6 +688,26 @@ void CodeGenFunction::EmitFunctionBody(FunctionArgList &Args, EmitStmt(Body); } +/// When instrumenting to collect profile data, the counts for some blocks +/// such as switch cases need to not include the fall-through counts, so +/// emit a branch around the instrumentation code. When not instrumenting, +/// this just calls EmitBlock(). +void CodeGenFunction::EmitBlockWithFallThrough(llvm::BasicBlock *BB, + RegionCounter &Cnt) { + llvm::BasicBlock *SkipCountBB = 0; + if (HaveInsertPoint() && CGM.getCodeGenOpts().ProfileInstrGenerate) { + // When instrumenting for profiling, the fallthrough to certain + // statements needs to skip over the instrumentation code so that we + // get an accurate count. + SkipCountBB = createBasicBlock("skipcount"); + EmitBranch(SkipCountBB); + } + EmitBlock(BB); + Cnt.beginRegion(Builder, /*AddIncomingFallThrough=*/true); + if (SkipCountBB) + EmitBlock(SkipCountBB); +} + /// Tries to mark the given function nounwind based on the /// non-existence of any throwing calls within it. We believe this is /// lightweight enough to do at -O0. @@ -917,10 +937,11 @@ void CodeGenFunction::EmitBranchOnBoolExpr(const Expr *Cond, Cond = Cond->IgnoreParens(); if (const BinaryOperator *CondBOp = dyn_cast<BinaryOperator>(Cond)) { - RegionCounter Cnt = getPGORegionCounter(CondBOp); // Handle X && Y in a condition. if (CondBOp->getOpcode() == BO_LAnd) { + RegionCounter Cnt = getPGORegionCounter(CondBOp); + // If we have "1 && X", simplify the code. "0 && X" would have constant // folded if the case was simple enough. bool ConstantBool = false; @@ -957,13 +978,13 @@ void CodeGenFunction::EmitBranchOnBoolExpr(const Expr *Cond, eval.begin(*this); EmitBranchOnBoolExpr(CondBOp->getRHS(), TrueBlock, FalseBlock, TrueCount); eval.end(*this); - Cnt.adjustForControlFlow(); - Cnt.applyAdjustmentsToRegion(); return; } if (CondBOp->getOpcode() == BO_LOr) { + RegionCounter Cnt = getPGORegionCounter(CondBOp); + // If we have "0 || X", simplify the code. "1 || X" would have constant // folded if the case was simple enough. bool ConstantBool = false; @@ -1003,8 +1024,6 @@ void CodeGenFunction::EmitBranchOnBoolExpr(const Expr *Cond, EmitBranchOnBoolExpr(CondBOp->getRHS(), TrueBlock, FalseBlock, RHSCount); eval.end(*this); - Cnt.adjustForControlFlow(); - Cnt.applyAdjustmentsToRegion(); return; } @@ -1037,7 +1056,7 @@ void CodeGenFunction::EmitBranchOnBoolExpr(const Expr *Cond, // the conditional operator. uint64_t LHSScaledTrueCount = 0; if (TrueCount) { - double LHSRatio = Cnt.getCount() / (double) PGO.getCurrentRegionCount(); + double LHSRatio = Cnt.getCount() / (double) Cnt.getParentCount(); LHSScaledTrueCount = TrueCount * LHSRatio; } @@ -1050,7 +1069,6 @@ void CodeGenFunction::EmitBranchOnBoolExpr(const Expr *Cond, cond.begin(*this); EmitBlock(RHSBlock); - Cnt.beginElseRegion(); EmitBranchOnBoolExpr(CondOp->getRHS(), TrueBlock, FalseBlock, TrueCount - LHSScaledTrueCount); cond.end(*this); @@ -1070,7 +1088,7 @@ void CodeGenFunction::EmitBranchOnBoolExpr(const Expr *Cond, // Create branch weights based on the number of times we get here and the // number of times the condition should be true. - uint64_t CurrentCount = PGO.getCurrentRegionCountWithMin(TrueCount); + uint64_t CurrentCount = std::max(PGO.getCurrentRegionCount(), TrueCount); llvm::MDNode *Weights = PGO.createBranchWeights(TrueCount, CurrentCount - TrueCount); |