aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp
diff options
context:
space:
mode:
authorCamsyn <camsyn@foxmail.com>2025-04-17 16:09:07 +0800
committerGitHub <noreply@github.com>2025-04-17 10:09:07 +0200
commitbf6986f9f09f79da38006a83c339226c429bb686 (patch)
treeaefdf84937933a296becc3e3c44a0bf88120cf0d /llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp
parentf351172d4a840dfbf533319b62925747a10b762f (diff)
downloadllvm-bf6986f9f09f79da38006a83c339226c429bb686.zip
llvm-bf6986f9f09f79da38006a83c339226c429bb686.tar.gz
llvm-bf6986f9f09f79da38006a83c339226c429bb686.tar.bz2
[TSan, SanitizerBinaryMetadata] Improve instrument for derived pointers via phis/selects (#132752)
ThreadSanitizer.cpp and SanitizerBinaryMetadata.cpp previously used `getUnderlyingObject` to check if pointers originate from stack objects. However, `getUnderlyingObject()` by default only looks through linear chains, not selects/phis. In particular, this means that we miss cases involving pointer induction variables. For instance, ```llvm %stkobj = alloca [2 x i32], align 8 ; getUnderlyingObject(%derived) = %derived %derived = getelementptr inbounds i32, ptr %stkobj, i64 1 ``` This will result in redundant instrumentation of TSan, resulting in greater performance costs, especially when there are loops, referring to this [godbolt page](https://godbolt.org/z/eaT1fPjTW) for details. ```cpp char loop(int x) { char buf[10]; char *p = buf; for (int i = 0; i < x && i < 10; i++) { // Should not instrument, as its base object is a non-captured stack // variable. // However, currectly, it is instrumented due to %p = %phi ... *p++ = i; } // Use buf to prevent it from being eliminated by optimization return buf[9]; } ``` There are TWO APIs `getUnderlyingObjectAggressive` and `findAllocaForValue` that can backtrack the pointer via tree traversal, supporting phis/selects. This patch replaces `getUnderlyingObject` with `findAllocaForValue` which: 1. Properly tracks through PHINodes and select operations 2. Directly identifies if a pointer comes from a `AllocaInst` Performance impact: - Compilation: Moderate cost increase due to wider value tracing, but... - Runtime: Significant wins for code with pointer induction variables derived from stack allocas, especially for loop-heavy code, as instrumentation can now be safely omitted.
Diffstat (limited to 'llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp')
-rw-r--r--llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp4
1 files changed, 2 insertions, 2 deletions
diff --git a/llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp b/llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp
index 1811d14..7f846d2 100644
--- a/llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp
@@ -448,8 +448,8 @@ void ThreadSanitizer::chooseInstructionsToInstrument(
}
}
- if (isa<AllocaInst>(getUnderlyingObject(Addr)) &&
- !PointerMayBeCaptured(Addr, /*ReturnCaptures=*/true)) {
+ const AllocaInst *AI = findAllocaForValue(Addr);
+ if (AI && !PointerMayBeCaptured(Addr, /*ReturnCaptures=*/true)) {
// The variable is addressable but not captured, so it cannot be
// referenced from a different thread and participate in a data race
// (see llvm/Analysis/CaptureTracking.h for details).