aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/Support/CrashRecoveryContext.cpp
diff options
context:
space:
mode:
authorAlexandre Ganea <alexandre.ganea@ubisoft.com>2020-11-12 08:14:20 -0500
committerAlexandre Ganea <alexandre.ganea@ubisoft.com>2020-11-12 08:14:43 -0500
commit45b8a741fbbf271e0fb71294cb7cdce3ad4b9bf3 (patch)
tree36f232547cfcc617e32d65736908a71758094d14 /llvm/lib/Support/CrashRecoveryContext.cpp
parentf37834c7dcbe69405bf3e182d2b3e3227cc4a569 (diff)
downloadllvm-45b8a741fbbf271e0fb71294cb7cdce3ad4b9bf3.zip
llvm-45b8a741fbbf271e0fb71294cb7cdce3ad4b9bf3.tar.gz
llvm-45b8a741fbbf271e0fb71294cb7cdce3ad4b9bf3.tar.bz2
[LLD][COFF] When using LLD-as-a-library, always prevent re-entrance on failures
This is a follow-up for D70378 (Cover usage of LLD as a library). While debugging an intermittent failure on a bot, I recalled this scenario which causes the issue: 1.When executing lld/test/ELF/invalid/symtab-sh-info.s L45, we reach lld::elf::Obj-File::ObjFile() which goes straight into its base ELFFileBase(), then ELFFileBase::init(). 2.At that point fatal() is thrown in lld/ELF/InputFiles.cpp L381, leaving a half-initialized ObjFile instance. 3.We then end up in lld::exitLld() and since we are running with LLD_IN_TEST, we hapily restore the control flow to CrashRecoveryContext::RunSafely() then back in lld::safeLldMain(). 4.Before this patch, we called errorHandler().reset() just after, and this attempted to reset the associated SpecificAlloc<ObjFile<ELF64LE>>. That tried to free the half-initialized ObjFile instance, and more precisely its ObjFile::dwarf member. Sometimes that worked, sometimes it failed and was catched by the CrashRecoveryContext. This scenario was the reason we called errorHandler().reset() through a CrashRecoveryContext. But in some rare cases, the above repro somehow corrupted the heap, creating a stack overflow. When the CrashRecoveryContext's filter (that is, __except (ExceptionFilter(GetExceptionInformation()))) tried to handle the exception, it crashed again since the stack was exhausted -- and that took the whole application down. That is the issue seen on the bot. Locally it happens about 1 times out of 15. Now this situation can happen anywhere in LLD. Since catching stack overflows is not a reliable scenario ATM when using CrashRecoveryContext, we're now preventing further re-entrance when such failures occur, by signaling lld::SafeReturn::canRunAgain=false. When running with LLD_IN_TEST=2 (or above), only one iteration will be executed, instead of two. Differential Revision: https://reviews.llvm.org/D88348
Diffstat (limited to 'llvm/lib/Support/CrashRecoveryContext.cpp')
-rw-r--r--llvm/lib/Support/CrashRecoveryContext.cpp20
1 files changed, 20 insertions, 0 deletions
diff --git a/llvm/lib/Support/CrashRecoveryContext.cpp b/llvm/lib/Support/CrashRecoveryContext.cpp
index 7609f04..77f0018 100644
--- a/llvm/lib/Support/CrashRecoveryContext.cpp
+++ b/llvm/lib/Support/CrashRecoveryContext.cpp
@@ -442,6 +442,26 @@ void CrashRecoveryContext::HandleExit(int RetCode) {
llvm_unreachable("Most likely setjmp wasn't called!");
}
+bool CrashRecoveryContext::throwIfCrash(int RetCode) {
+#if defined(_WIN32)
+ // On Windows, the high bits are reserved for kernel return codes. Values
+ // starting with 0x80000000 are reserved for "warnings"; values of 0xC0000000
+ // and up are for "errors". In practice, both are interpreted as a
+ // non-continuable signal.
+ unsigned Code = ((unsigned)RetCode & 0xF0000000) >> 28;
+ if (Code != 0xC && Code != 8)
+ return false;
+ ::RaiseException(RetCode, 0, 0, NULL);
+#else
+ // On Unix, signals are represented by return codes of 128 or higher.
+ if (RetCode <= 128)
+ return false;
+ llvm::sys::unregisterHandlers();
+ raise(RetCode - 128);
+#endif
+ return true;
+}
+
// FIXME: Portability.
static void setThreadBackgroundPriority() {
#ifdef __APPLE__