Diffstat (limited to 'bolt')
25 files changed, 1241 insertions, 27 deletions
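The RA-state bookkeeping that this patch's design document describes can be sketched in isolation. The block below is only an illustrative model, not code from the patch: `Cfi`, `replayRAState` and `negatePoints` are hypothetical names that do not exist in BOLT. It assumes the RA state is a single bit that starts at 0 (unsigned), that negate-ra-state flips it, and that remember-state and restore-state push to and pop from an implicit stack.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of the three CFIs that can affect the RA state.
enum class Cfi { NegateRAState, RememberState, RestoreState };

// Replays CFIs in order and records the RA state (true = signed)
// that is in effect after each one.
std::vector<bool> replayRAState(const std::vector<Cfi> &Cfis) {
  bool Signed = false;     // default RA state is 0 (unsigned)
  std::vector<bool> Stack; // implicit remember/restore stack
  std::vector<bool> Out;
  for (Cfi C : Cfis) {
    switch (C) {
    case Cfi::NegateRAState:
      Signed = !Signed;
      break;
    case Cfi::RememberState:
      Stack.push_back(Signed);
      break;
    case Cfi::RestoreState:
      // Assumption: a restore with nothing remembered is ignored.
      if (!Stack.empty()) {
        Signed = Stack.back();
        Stack.pop_back();
      }
      break;
    }
    Out.push_back(Signed);
  }
  return Out;
}

// Given per-instruction RA states in layout order, returns the indices of
// instructions that must be preceded by a negate-ra-state CFI, i.e. every
// position where the state differs from the previous instruction.
std::vector<std::size_t> negatePoints(const std::vector<bool> &States) {
  std::vector<std::size_t> Points;
  for (std::size_t I = 1; I < States.size(); ++I)
    if (States[I] != States[I - 1])
      Points.push_back(I);
  return Points;
}
```

Because `negatePoints` only looks at neighbours in layout order, re-running it after blocks are reordered gives the new CFI positions. That is the reason the design stores a state per instruction instead of keeping the original CFIs.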
diff --git a/bolt/docs/PacRetDesign.md b/bolt/docs/PacRetDesign.md new file mode 100644 index 0000000..f3fe5fb --- /dev/null +++ b/bolt/docs/PacRetDesign.md @@ -0,0 +1,228 @@ +# Optimizing binaries with pac-ret hardening + +This is a design document about processing the `DW_CFA_AARCH64_negate_ra_state` +DWARF instruction in BOLT. As it describes internal design decisions, the +intended audience is BOLT developers. The document is an updated version of the +[RFC posted on the LLVM Discourse](https://discourse.llvm.org/t/rfc-bolt-aarch64-handle-opnegaterastate-to-enable-optimizing-binaries-with-pac-ret-hardening/86594). + + +`DW_CFA_AARCH64_negate_ra_state` is also referred to as `.cfi_negate_ra_state` +in assembly, or `OpNegateRAState` in BOLT sources. In this document, I will use +**negate-ra-state** as a shorthand. + +## Introduction + +### Pointer Authentication + +For more information, see the [pac-ret section of the BOLT-binary-analysis document](BinaryAnalysis.md#pac-ret-analysis). + +### DW_CFA_AARCH64_negate_ra_state + +The negate-ra-state CFI is a vendor-specific Call Frame Instruction defined in +the [Arm ABI](https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst#id1). + +``` +The DW_CFA_AARCH64_negate_ra_state operation negates bit[0] of the RA_SIGN_STATE pseudo-register. +``` + +This bit indicates to the unwinder whether the current return address is signed +or not (hence the name). The unwinder uses this information to authenticate the +pointer, and remove the Pointer Authentication Code (PAC) bits. +Incorrect placement of negate-ra-state CFIs causes the unwinder to either attempt +to authenticate an unsigned pointer (resulting in a segmentation fault), or skip +authentication on a signed pointer, which can also cause a fault. + +Note: some unwinders use the `xpac` instruction to strip the PAC bits without +authenticating the pointer. 
This is an incorrect (incomplete) implementation, +as it allows control-flow modification in the case of unwinding. + +There are no DWARF instructions to directly set or clear the RA State. However, +two other CFIs can also affect the RA state: +- `DW_CFA_remember_state`: this CFI stores register rules onto an implicit stack. +- `DW_CFA_restore_state`: this CFI pops rules from this stack. + +Example: + +| CFI | Effect on RA state | +| ------------------------------ | ------------------------------ | +| (default) | 0 | +| DW_CFA_AARCH64_negate_ra_state | 0 -> 1 | +| DW_CFA_remember_state | 1 pushed to the stack | +| DW_CFA_AARCH64_negate_ra_state | 1 -> 0 | +| DW_CFA_restore_state | 0 -> 1 (popped from the stack) | + +The Arm ABI also defines the DW_CFA_AARCH64_negate_ra_state_with_pc CFI, but it +is not widely used, and is [likely to become deprecated](https://github.com/ARM-software/abi-aa/issues/327). + +### Where are these CFIs needed? + +Whenever two consecutive instructions have different RA states, the unwinder must +be informed of the change. This typically occurs during pointer signing or +authentication. If adjacent instructions differ in RA state but neither signs +nor authenticates the return address, they must belong to different control flow +paths. One is part of an execution path with signed RA, the other is part of a +path with an unsigned RA. + +In the example below, the first BasicBlock ends in a conditional branch, and +jumps to two different BasicBlocks, each with their own authentication, and +return. The instructions on the border of the second and third BasicBlock have +different RA states. The `ret` at the end of the second BasicBlock is in unsigned +state. The start of the third BasicBlock is after the `paciasp` in the control +flow, but before the authentication. In this case, a negate-ra-state is needed +at the end of the second BasicBlock. 
+ +``` + +----------------+ + | paciasp | + | | + | b.cc | + +--------+-------+ + | ++----------------+ +| | +| +--------v-------+ +| | | +| | autiasp | +| | ret | // RA: unsigned +| +----------------+ ++----------------+ + | + +--------v-------+ // RA: signed + | | + | autiasp | + | ret | + +----------------+ +``` + +> [!important] +> The unwinder does not follow the control flow graph. It reads unwind +> information in the layout order. + +Because these locations are dependent on how the function layout looks, +negate-ra-state CFIs will become invalid during BasicBlock reordering. + +## Solution design + +The implementation introduces two new passes: +1. `MarkRAStatesPass`: assigns the RA state to each instruction based on the CFIs + in the input binary +2. `InsertNegateRAStatePass`: reads those assigned instruction RA states after + optimizations, and emits `DW_CFA_AARCH64_negate_ra_state` CFIs at the correct + places: wherever there is a state change between two consecutive instructions + in the layout order. + +To track metadata on individual instructions, the `MCAnnotation` class was +extended. These also have helper functions in `MCPlusBuilder`. + +### Saving annotations at CFI reading + +CFIs are read and added to BinaryFunctions in `CFIReaderWriter::FillCFIInfoFor`. +At this point, we add MCAnnotations about negate-ra-state, remember-state and +restore-state CFIs to the instructions they refer to. This is to not interfere +with the CFI processing that already happens in BOLT (e.g. remember-state and +restore-state CFIs are removed in `normalizeCFIState` for reasons unrelated to PAC). + +As we add the MCAnnotations *to instructions*, we have to account for the case +where the function starts with a CFI altering the RA state. As CFIs modify the RA +state of the instructions before them, we cannot add the annotation to the first +instruction. +This special case is handled by adding an `initialRAState` bool to each BinaryFunction. 
+If the `Offset` the CFI refers to is zero, we don't store an annotation, but set +the `initialRAState` in `FillCFIInfoFor`. This information is then used in +`MarkRAStates`. + +### Binaries without DWARF info + +In some cases, the DWARF tables are stripped from the binary. These programs +usually have some other unwind-mechanism. +These passes only run on functions that include at least one negate-ra-state CFI. +This avoids processing functions that do not use Pointer Authentication, or +functions that use Pointer Authentication, but do not have DWARF info. + +In summary: +- pointer auth is not used: no change, the new passes do not run. +- pointer auth is used, but DWARF info is stripped: no change, the new passes + do not run. +- pointer auth is used, and we have DWARF CFIs: passes run, and rewrite the + negate-ra-state CFI. + +### MarkRAStates pass + +This pass runs before optimizations reorder anything. + +It processes MCAnnotations generated during the CFI reading stage to check if +instructions have any of the three CFIs that can modify RA state: +- negate-ra-state, +- remember-state, +- restore-state. + +Then it adds new MCAnnotations to each instruction, indicating their RA state. +Those annotations are: +- Signed, +- Unsigned. + +Below is a simple example that shows the two different types of annotations: +what we have before the pass, and after it. + +| Instruction | Before | After | +| ----------------------------- | --------------- | -------- | +| paciasp | negate-ra-state | unsigned | +| stp x29, x30, [sp, #-0x10]! | | signed | +| mov x29, sp | | signed | +| ldp x29, x30, [sp], #0x10 | | signed | +| autiasp | negate-ra-state | signed | +| ret | | unsigned | + +##### Error handling in MarkRAState Pass: + +Whenever the MarkRAStates pass finds inconsistencies in the current +BinaryFunction, it marks the function as ignored using `BF.setIgnored()`.
BOLT +will not optimize this function but will emit it unchanged in the original section +(`.bolt.org.text`). + +The inconsistencies are as follows: +- finding a `pac*` instruction when already in signed state +- finding an `aut*` instruction when already in unsigned state +- finding `pac*` and `aut*` instructions without `.cfi_negate_ra_state`. + +Users will be informed about the number of ignored functions in the pass, the +exact functions ignored, and the found inconsistency. + +### InsertNegateRAStatePass + +This pass runs after optimizations. It performs the _inverse_ of the MarkRAStates pass: +1. it reads the RA state annotations attached to the instructions, and +2. whenever the state changes, it adds a PseudoInstruction that holds an + OpNegateRAState CFI. + +##### Covering newly generated instructions: + +Some BOLT passes can add new Instructions. In InsertNegateRAStatePass, we have +to know what RA state these have. + +The current solution has the `inferUnknownStates` function to cover these, using +a fairly simple strategy: unknown states inherit the last known state. + +This will be updated to a more robust solution. + +> [!important] +> As issue #160989 describes, unwind info is incorrect in stubs with multiple callers. +> For this same reason, we cannot generate correct pac-specific unwind info: the signedness +> of the _incorrect_ return address is meaningless. + +### Optimizations requiring special attention + +Marking states before optimizations ensures that instructions can be moved around +freely. The only special case is function splitting. When a function is split, +the split part becomes a new function in the emitted binary. For unwinding to +work, it needs to "replay" all CFIs that lead up to the split point. BOLT does +this for other CFIs. As negate-ra-state is not read (only stored as an Annotation), +we have to do this manually in InsertNegateRAStatePass. 
Here, if the split part +starts with an instruction that has Signed RA state, we add a negate-ra-state CFI +to indicate this. + +## Option to disallow the feature + +The feature can be guarded with the `--update-branch-protection` flag, which is +on by default. If the flag is set to false, and a function +`containedNegateRAState()` after `FillCFIInfoFor()`, BOLT exits with an error. diff --git a/bolt/include/bolt/Core/BinaryFunction.h b/bolt/include/bolt/Core/BinaryFunction.h index 7e0e3bf..f5e9887 100644 --- a/bolt/include/bolt/Core/BinaryFunction.h +++ b/bolt/include/bolt/Core/BinaryFunction.h @@ -148,6 +148,11 @@ public: PF_MEMEVENT = 4, /// Profile has mem events. }; + void setContainedNegateRAState() { HadNegateRAState = true; } + bool containedNegateRAState() const { return HadNegateRAState; } + void setInitialRAState(bool State) { InitialRAState = State; } + bool getInitialRAState() { return InitialRAState; } + /// Struct for tracking exception handling ranges. struct CallSite { const MCSymbol *Start; @@ -218,6 +223,12 @@ private: /// Current state of the function. State CurrentState{State::Empty}; + /// Indicates if the Function contained .cfi-negate-ra-state. These are not + /// read from the binary. This boolean is used when deciding to run the + /// .cfi-negate-ra-state rewriting passes on a function or not. + bool HadNegateRAState{false}; + bool InitialRAState{false}; + /// A list of symbols associated with the function entry point. /// /// Multiple symbols would typically result from identical code-folding @@ -1640,6 +1651,51 @@ public: void setHasInferredProfile(bool Inferred) { HasInferredProfile = Inferred; } + /// Find corrected offset the same way addCFIInstruction does it to skip NOPs. 
+ std::optional<uint64_t> getCorrectedCFIOffset(uint64_t Offset) { + assert(!Instructions.empty()); + auto I = Instructions.lower_bound(Offset); + if (Offset == getSize()) { + assert(I == Instructions.end() && "unexpected iterator value"); + // Sometimes compiler issues restore_state after all instructions + // in the function (even after nop). + --I; + Offset = I->first; + } + assert(I->first == Offset && "CFI pointing to unknown instruction"); + if (I == Instructions.begin()) + return {}; + + --I; + while (I != Instructions.begin() && BC.MIB->isNoop(I->second)) { + Offset = I->first; + --I; + } + return Offset; + } + + void setInstModifiesRAState(uint8_t CFIOpcode, uint64_t Offset) { + std::optional<uint64_t> CorrectedOffset = getCorrectedCFIOffset(Offset); + if (CorrectedOffset) { + auto I = Instructions.lower_bound(*CorrectedOffset); + I--; + + switch (CFIOpcode) { + case dwarf::DW_CFA_AARCH64_negate_ra_state: + BC.MIB->setNegateRAState(I->second); + break; + case dwarf::DW_CFA_remember_state: + BC.MIB->setRememberState(I->second); + break; + case dwarf::DW_CFA_restore_state: + BC.MIB->setRestoreState(I->second); + break; + default: + assert(0 && "CFI Opcode not covered by function"); + } + } + } + void addCFIInstruction(uint64_t Offset, MCCFIInstruction &&Inst) { assert(!Instructions.empty()); diff --git a/bolt/include/bolt/Core/MCPlus.h b/bolt/include/bolt/Core/MCPlus.h index 601d709..ead6ba1 100644 --- a/bolt/include/bolt/Core/MCPlus.h +++ b/bolt/include/bolt/Core/MCPlus.h @@ -72,7 +72,12 @@ public: kLabel, /// MCSymbol pointing to this instruction. kSize, /// Size of the instruction. kDynamicBranch, /// Jit instruction patched at runtime. - kGeneric /// First generic annotation. + kRASigned, /// Inst is in a range where RA is signed. + kRAUnsigned, /// Inst is in a range where RA is unsigned. + kRememberState, /// Inst has rememberState CFI. + kRestoreState, /// Inst has restoreState CFI. + kNegateState, /// Inst has OpNegateRAState CFI. 
+ kGeneric, /// First generic annotation. }; virtual void print(raw_ostream &OS) const = 0; diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h index 5b711b0..2772de7 100644 --- a/bolt/include/bolt/Core/MCPlusBuilder.h +++ b/bolt/include/bolt/Core/MCPlusBuilder.h @@ -70,6 +70,20 @@ class MCPlusBuilder { public: using AllocatorIdTy = uint16_t; + std::optional<int64_t> getAnnotationAtOpIndex(const MCInst &Inst, + unsigned OpIndex) const { + std::optional<unsigned> FirstAnnotationOp = getFirstAnnotationOpIndex(Inst); + if (!FirstAnnotationOp) + return std::nullopt; + + if (*FirstAnnotationOp > OpIndex || Inst.getNumOperands() < OpIndex) + return std::nullopt; + + const auto *Op = Inst.begin() + OpIndex; + const int64_t ImmValue = Op->getImm(); + return extractAnnotationIndex(ImmValue); + } + private: /// A struct that represents a single annotation allocator struct AnnotationAllocator { @@ -603,6 +617,21 @@ public: return std::nullopt; } + virtual bool isPSignOnLR(const MCInst &Inst) const { + llvm_unreachable("not implemented"); + return false; + } + + virtual bool isPAuthOnLR(const MCInst &Inst) const { + llvm_unreachable("not implemented"); + return false; + } + + virtual bool isPAuthAndRet(const MCInst &Inst) const { + llvm_unreachable("not implemented"); + return false; + } + /// Returns the register used as a return address. Returns std::nullopt if /// not applicable, such as reading the return address from a system register /// or from the stack. @@ -1314,6 +1343,39 @@ public: /// Return true if the instruction is a tail call. bool isTailCall(const MCInst &Inst) const; + /// Stores NegateRAState annotation on \p Inst. + void setNegateRAState(MCInst &Inst) const; + + /// Return true if \p Inst has NegateRAState annotation. + bool hasNegateRAState(const MCInst &Inst) const; + + /// Sets RememberState annotation on \p Inst. 
+ void setRememberState(MCInst &Inst) const; + + /// Return true if \p Inst has RememberState annotation. + bool hasRememberState(const MCInst &Inst) const; + + /// Stores RestoreState annotation on \p Inst. + void setRestoreState(MCInst &Inst) const; + + /// Return true if \p Inst has RestoreState annotation. + bool hasRestoreState(const MCInst &Inst) const; + + /// Stores RA Signed annotation on \p Inst. + void setRASigned(MCInst &Inst) const; + + /// Return true if \p Inst has Signed RA annotation. + bool isRASigned(const MCInst &Inst) const; + + /// Stores RA Unsigned annotation on \p Inst. + void setRAUnsigned(MCInst &Inst) const; + + /// Return true if \p Inst has Unsigned RA annotation. + bool isRAUnsigned(const MCInst &Inst) const; + + /// Return true if \p Inst doesn't have any annotation related to RA state. + bool isRAStateUnknown(const MCInst &Inst) const; + /// Return true if the instruction is a call with an exception handling info. virtual bool isInvoke(const MCInst &Inst) const { return isCall(Inst) && getEHInfo(Inst); diff --git a/bolt/include/bolt/Passes/InsertNegateRAStatePass.h b/bolt/include/bolt/Passes/InsertNegateRAStatePass.h new file mode 100644 index 0000000..836948b --- /dev/null +++ b/bolt/include/bolt/Passes/InsertNegateRAStatePass.h @@ -0,0 +1,46 @@ +//===- bolt/Passes/InsertNegateRAStatePass.cpp ----------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file implements the InsertNegateRAStatePass class. 
+// +//===----------------------------------------------------------------------===// +#ifndef BOLT_PASSES_INSERT_NEGATE_RA_STATE_PASS +#define BOLT_PASSES_INSERT_NEGATE_RA_STATE_PASS + +#include "bolt/Passes/BinaryPasses.h" + +namespace llvm { +namespace bolt { + +class InsertNegateRAState : public BinaryFunctionPass { +public: + explicit InsertNegateRAState() : BinaryFunctionPass(false) {} + + const char *getName() const override { return "insert-negate-ra-state-pass"; } + + /// Pass entry point + Error runOnFunctions(BinaryContext &BC) override; + void runOnFunction(BinaryFunction &BF); + +private: + /// Because states are tracked as MCAnnotations on individual instructions, + /// newly inserted instructions do not have a state associated with them. + /// New states are "inherited" from the last known state. + void inferUnknownStates(BinaryFunction &BF); + + /// Support for function splitting: + /// if two consecutive BBs with Signed state are going to end up in different + /// functions (so are held by different FunctionFragments), we have to add a + /// OpNegateRAState to the beginning of the newly split function, so it starts + /// with a Signed state. + void coverFunctionFragmentStart(BinaryFunction &BF, FunctionFragment &FF); +}; + +} // namespace bolt +} // namespace llvm +#endif diff --git a/bolt/include/bolt/Passes/MarkRAStates.h b/bolt/include/bolt/Passes/MarkRAStates.h new file mode 100644 index 0000000..675ab97 --- /dev/null +++ b/bolt/include/bolt/Passes/MarkRAStates.h @@ -0,0 +1,33 @@ +//===- bolt/Passes/MarkRAStates.cpp ---------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file implements the MarkRAStates class. 
+// +//===----------------------------------------------------------------------===// +#ifndef BOLT_PASSES_MARK_RA_STATES +#define BOLT_PASSES_MARK_RA_STATES + +#include "bolt/Passes/BinaryPasses.h" + +namespace llvm { +namespace bolt { + +class MarkRAStates : public BinaryFunctionPass { +public: + explicit MarkRAStates() : BinaryFunctionPass(false) {} + + const char *getName() const override { return "mark-ra-states"; } + + /// Pass entry point + Error runOnFunctions(BinaryContext &BC) override; + bool runOnFunction(BinaryFunction &BF); +}; + +} // namespace bolt +} // namespace llvm +#endif diff --git a/bolt/include/bolt/Utils/CommandLineOpts.h b/bolt/include/bolt/Utils/CommandLineOpts.h index 0964c2c..5c7f1b9 100644 --- a/bolt/include/bolt/Utils/CommandLineOpts.h +++ b/bolt/include/bolt/Utils/CommandLineOpts.h @@ -97,6 +97,7 @@ extern llvm::cl::opt<std::string> OutputFilename; extern llvm::cl::opt<std::string> PerfData; extern llvm::cl::opt<bool> PrintCacheMetrics; extern llvm::cl::opt<bool> PrintSections; +extern llvm::cl::opt<bool> UpdateBranchProtection; extern llvm::cl::opt<SplitFunctionsStrategy> SplitStrategy; // The format to use with -o in aggregation mode (perf2bolt) diff --git a/bolt/lib/Core/BinaryBasicBlock.cpp b/bolt/lib/Core/BinaryBasicBlock.cpp index eeab1ed..d680850 100644 --- a/bolt/lib/Core/BinaryBasicBlock.cpp +++ b/bolt/lib/Core/BinaryBasicBlock.cpp @@ -210,7 +210,11 @@ int32_t BinaryBasicBlock::getCFIStateAtInstr(const MCInst *Instr) const { InstrSeen = (&Inst == Instr); continue; } - if (Function->getBinaryContext().MIB->isCFI(Inst)) { + // Ignoring OpNegateRAState CFIs here, as they dont have a "State" + // number associated with them. 
+ if (Function->getBinaryContext().MIB->isCFI(Inst) && + (Function->getCFIFor(Inst)->getOperation() != + MCCFIInstruction::OpNegateRAState)) { LastCFI = &Inst; break; } diff --git a/bolt/lib/Core/BinaryContext.cpp b/bolt/lib/Core/BinaryContext.cpp index b7ded6b..206d8eef 100644 --- a/bolt/lib/Core/BinaryContext.cpp +++ b/bolt/lib/Core/BinaryContext.cpp @@ -1905,6 +1905,9 @@ void BinaryContext::printCFI(raw_ostream &OS, const MCCFIInstruction &Inst) { case MCCFIInstruction::OpGnuArgsSize: OS << "OpGnuArgsSize"; break; + case MCCFIInstruction::OpNegateRAState: + OS << "OpNegateRAState"; + break; default: OS << "Op#" << Operation; break; diff --git a/bolt/lib/Core/BinaryFunction.cpp b/bolt/lib/Core/BinaryFunction.cpp index 07bc71e..9687892 100644 --- a/bolt/lib/Core/BinaryFunction.cpp +++ b/bolt/lib/Core/BinaryFunction.cpp @@ -2814,14 +2814,8 @@ private: case MCCFIInstruction::OpLLVMDefAspaceCfa: case MCCFIInstruction::OpLabel: case MCCFIInstruction::OpValOffset: - llvm_unreachable("unsupported CFI opcode"); - break; case MCCFIInstruction::OpNegateRAState: - if (!(opts::BinaryAnalysisMode || opts::HeatmapMode)) { - llvm_unreachable("BOLT-ERROR: binaries using pac-ret hardening (e.g. 
" - "as produced by '-mbranch-protection=pac-ret') are " - "currently not supported by BOLT."); - } + llvm_unreachable("unsupported CFI opcode"); break; case MCCFIInstruction::OpRememberState: case MCCFIInstruction::OpRestoreState: @@ -2836,6 +2830,7 @@ public: void advanceTo(int32_t State) { for (int32_t I = CurState, E = State; I != E; ++I) { const MCCFIInstruction &Instr = FDE[I]; + assert(Instr.getOperation() != MCCFIInstruction::OpNegateRAState); if (Instr.getOperation() != MCCFIInstruction::OpRestoreState) { update(Instr, I); continue; @@ -2960,15 +2955,9 @@ struct CFISnapshotDiff : public CFISnapshot { case MCCFIInstruction::OpLLVMDefAspaceCfa: case MCCFIInstruction::OpLabel: case MCCFIInstruction::OpValOffset: + case MCCFIInstruction::OpNegateRAState: llvm_unreachable("unsupported CFI opcode"); return false; - case MCCFIInstruction::OpNegateRAState: - if (!(opts::BinaryAnalysisMode || opts::HeatmapMode)) { - llvm_unreachable("BOLT-ERROR: binaries using pac-ret hardening (e.g. " - "as produced by '-mbranch-protection=pac-ret') are " - "currently not supported by BOLT."); - } - break; case MCCFIInstruction::OpRememberState: case MCCFIInstruction::OpRestoreState: case MCCFIInstruction::OpGnuArgsSize: @@ -3117,14 +3106,8 @@ BinaryFunction::unwindCFIState(int32_t FromState, int32_t ToState, case MCCFIInstruction::OpLLVMDefAspaceCfa: case MCCFIInstruction::OpLabel: case MCCFIInstruction::OpValOffset: - llvm_unreachable("unsupported CFI opcode"); - break; case MCCFIInstruction::OpNegateRAState: - if (!(opts::BinaryAnalysisMode || opts::HeatmapMode)) { - llvm_unreachable("BOLT-ERROR: binaries using pac-ret hardening (e.g. 
" - "as produced by '-mbranch-protection=pac-ret') are " - "currently not supported by BOLT."); - } + llvm_unreachable("unsupported CFI opcode"); break; case MCCFIInstruction::OpGnuArgsSize: // do not affect CFI state diff --git a/bolt/lib/Core/Exceptions.cpp b/bolt/lib/Core/Exceptions.cpp index 874419f..27656c7 100644 --- a/bolt/lib/Core/Exceptions.cpp +++ b/bolt/lib/Core/Exceptions.cpp @@ -568,10 +568,25 @@ bool CFIReaderWriter::fillCFIInfoFor(BinaryFunction &Function) const { case DW_CFA_remember_state: Function.addCFIInstruction( Offset, MCCFIInstruction::createRememberState(nullptr)); + + if (Function.getBinaryContext().isAArch64()) { + // Support for pointer authentication: + // We need to annotate instructions that modify the RA State, to work + // out the state of each instruction in MarkRAStates Pass. + if (Offset != 0) + Function.setInstModifiesRAState(DW_CFA_remember_state, Offset); + } break; case DW_CFA_restore_state: Function.addCFIInstruction(Offset, MCCFIInstruction::createRestoreState(nullptr)); + if (Function.getBinaryContext().isAArch64()) { + // Support for pointer authentication: + // We need to annotate instructions that modify the RA State, to work + // out the state of each instruction in MarkRAStates Pass. + if (Offset != 0) + Function.setInstModifiesRAState(DW_CFA_restore_state, Offset); + } break; case DW_CFA_def_cfa: Function.addCFIInstruction( @@ -629,11 +644,24 @@ bool CFIReaderWriter::fillCFIInfoFor(BinaryFunction &Function) const { BC.errs() << "BOLT-WARNING: DW_CFA_MIPS_advance_loc unimplemented\n"; return false; case DW_CFA_GNU_window_save: - // DW_CFA_GNU_window_save and DW_CFA_GNU_NegateRAState just use the same - // id but mean different things. The latter is used in AArch64. + // DW_CFA_GNU_window_save and DW_CFA_AARCH64_negate_ra_state just use the + // same id but mean different things. The latter is used in AArch64. 
if (Function.getBinaryContext().isAArch64()) { - Function.addCFIInstruction( - Offset, MCCFIInstruction::createNegateRAState(nullptr)); + Function.setContainedNegateRAState(); + // The location OpNegateRAState CFIs are needed depends on the order of + // BasicBlocks, which changes during optimizations. Instead of adding + // OpNegateRAState CFIs, an annotation is added to the instruction, to + // mark that the instruction modifies the RA State. The actual state for + // instructions are worked out in MarkRAStates based on these + // annotations. + if (Offset != 0) + Function.setInstModifiesRAState(DW_CFA_AARCH64_negate_ra_state, + Offset); + else + // We cannot Annotate an instruction at Offset == 0. + // Instead, we save the initial (Signed) state, and push it to + // MarkRAStates' RAStateStack. + Function.setInitialRAState(true); break; } if (opts::Verbosity >= 1) diff --git a/bolt/lib/Core/MCPlusBuilder.cpp b/bolt/lib/Core/MCPlusBuilder.cpp index 5247522..e96de80 100644 --- a/bolt/lib/Core/MCPlusBuilder.cpp +++ b/bolt/lib/Core/MCPlusBuilder.cpp @@ -159,6 +159,55 @@ bool MCPlusBuilder::isTailCall(const MCInst &Inst) const { return false; } +void MCPlusBuilder::setNegateRAState(MCInst &Inst) const { + assert(!hasAnnotation(Inst, MCAnnotation::kNegateState)); + setAnnotationOpValue(Inst, MCAnnotation::kNegateState, true); +} + +bool MCPlusBuilder::hasNegateRAState(const MCInst &Inst) const { + return hasAnnotation(Inst, MCAnnotation::kNegateState); +} + +void MCPlusBuilder::setRememberState(MCInst &Inst) const { + assert(!hasAnnotation(Inst, MCAnnotation::kRememberState)); + setAnnotationOpValue(Inst, MCAnnotation::kRememberState, true); +} + +bool MCPlusBuilder::hasRememberState(const MCInst &Inst) const { + return hasAnnotation(Inst, MCAnnotation::kRememberState); +} + +void MCPlusBuilder::setRestoreState(MCInst &Inst) const { + assert(!hasAnnotation(Inst, MCAnnotation::kRestoreState)); + setAnnotationOpValue(Inst, MCAnnotation::kRestoreState, true); +} + +bool 
MCPlusBuilder::hasRestoreState(const MCInst &Inst) const { + return hasAnnotation(Inst, MCAnnotation::kRestoreState); +} + +void MCPlusBuilder::setRASigned(MCInst &Inst) const { + assert(!hasAnnotation(Inst, MCAnnotation::kRASigned)); + setAnnotationOpValue(Inst, MCAnnotation::kRASigned, true); +} + +bool MCPlusBuilder::isRASigned(const MCInst &Inst) const { + return hasAnnotation(Inst, MCAnnotation::kRASigned); +} + +void MCPlusBuilder::setRAUnsigned(MCInst &Inst) const { + assert(!hasAnnotation(Inst, MCAnnotation::kRAUnsigned)); + setAnnotationOpValue(Inst, MCAnnotation::kRAUnsigned, true); +} + +bool MCPlusBuilder::isRAUnsigned(const MCInst &Inst) const { + return hasAnnotation(Inst, MCAnnotation::kRAUnsigned); +} + +bool MCPlusBuilder::isRAStateUnknown(const MCInst &Inst) const { + return !(isRAUnsigned(Inst) || isRASigned(Inst)); +} + std::optional<MCLandingPad> MCPlusBuilder::getEHInfo(const MCInst &Inst) const { if (!isCall(Inst)) return std::nullopt; diff --git a/bolt/lib/Passes/CMakeLists.txt b/bolt/lib/Passes/CMakeLists.txt index 77d2bb9..d751951 100644 --- a/bolt/lib/Passes/CMakeLists.txt +++ b/bolt/lib/Passes/CMakeLists.txt @@ -17,12 +17,14 @@ add_llvm_library(LLVMBOLTPasses IdenticalCodeFolding.cpp IndirectCallPromotion.cpp Inliner.cpp + InsertNegateRAStatePass.cpp Instrumentation.cpp JTFootprintReduction.cpp LongJmp.cpp LoopInversionPass.cpp LivenessAnalysis.cpp MCF.cpp + MarkRAStates.cpp PatchEntries.cpp PAuthGadgetScanner.cpp PettisAndHansen.cpp diff --git a/bolt/lib/Passes/InsertNegateRAStatePass.cpp b/bolt/lib/Passes/InsertNegateRAStatePass.cpp new file mode 100644 index 0000000..33664e1 --- /dev/null +++ b/bolt/lib/Passes/InsertNegateRAStatePass.cpp @@ -0,0 +1,142 @@ +//===- bolt/Passes/InsertNegateRAStatePass.cpp ----------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file implements the InsertNegateRAStatePass class. It inserts +// OpNegateRAState CFIs to places where the state of two consecutive +// instructions are different. +// +//===----------------------------------------------------------------------===// +#include "bolt/Passes/InsertNegateRAStatePass.h" +#include "bolt/Core/BinaryFunction.h" +#include "bolt/Core/ParallelUtilities.h" +#include <cstdlib> + +using namespace llvm; + +namespace llvm { +namespace bolt { + +void InsertNegateRAState::runOnFunction(BinaryFunction &BF) { + BinaryContext &BC = BF.getBinaryContext(); + + if (BF.getState() == BinaryFunction::State::Empty) + return; + + if (BF.getState() != BinaryFunction::State::CFG && + BF.getState() != BinaryFunction::State::CFG_Finalized) { + BC.outs() << "BOLT-INFO: no CFG for " << BF.getPrintName() + << " in InsertNegateRAStatePass\n"; + return; + } + + inferUnknownStates(BF); + + for (FunctionFragment &FF : BF.getLayout().fragments()) { + coverFunctionFragmentStart(BF, FF); + bool FirstIter = true; + MCInst PrevInst; + // As this pass runs after function splitting, we should only check + // consecutive instructions inside FunctionFragments. + for (BinaryBasicBlock *BB : FF) { + for (auto It = BB->begin(); It != BB->end(); ++It) { + MCInst &Inst = *It; + if (BC.MIB->isCFI(Inst)) + continue; + if (!FirstIter) { + // Consecutive instructions with different RAState means we need to + // add a OpNegateRAState. 
+ if ((BC.MIB->isRASigned(PrevInst) && BC.MIB->isRAUnsigned(Inst)) || + (BC.MIB->isRAUnsigned(PrevInst) && BC.MIB->isRASigned(Inst))) { + It = BF.addCFIInstruction( + BB, It, MCCFIInstruction::createNegateRAState(nullptr)); + } + } else { + FirstIter = false; + } + PrevInst = *It; + } + } + } +} + +void InsertNegateRAState::coverFunctionFragmentStart(BinaryFunction &BF, + FunctionFragment &FF) { + BinaryContext &BC = BF.getBinaryContext(); + if (FF.empty()) + return; + // Find the first BB in the FF which has Instructions. + // BOLT can generate empty BBs at function splitting which are only used as + // target labels. We should add the negate-ra-state CFI to the first + // non-empty BB. + auto *FirstNonEmpty = + std::find_if(FF.begin(), FF.end(), [](BinaryBasicBlock *BB) { + // getFirstNonPseudo returns BB.end() if it does not find any + // Instructions. + return BB->getFirstNonPseudo() != BB->end(); + }); + // If a function is already split in the input, the first FF can also start + // with Signed state. This covers that scenario as well. 
+ if (BC.MIB->isRASigned(*((*FirstNonEmpty)->begin()))) { + BF.addCFIInstruction(*FirstNonEmpty, (*FirstNonEmpty)->begin(), + MCCFIInstruction::createNegateRAState(nullptr)); + } +} + +void InsertNegateRAState::inferUnknownStates(BinaryFunction &BF) { + BinaryContext &BC = BF.getBinaryContext(); + bool FirstIter = true; + MCInst PrevInst; + for (BinaryBasicBlock &BB : BF) { + for (MCInst &Inst : BB) { + if (BC.MIB->isCFI(Inst)) + continue; + + if (!FirstIter && BC.MIB->isRAStateUnknown(Inst)) { + if (BC.MIB->isRASigned(PrevInst) || BC.MIB->isPSignOnLR(PrevInst)) { + BC.MIB->setRASigned(Inst); + } else if (BC.MIB->isRAUnsigned(PrevInst) || + BC.MIB->isPAuthOnLR(PrevInst)) { + BC.MIB->setRAUnsigned(Inst); + } + } else { + FirstIter = false; + } + PrevInst = Inst; + } + } +} + +Error InsertNegateRAState::runOnFunctions(BinaryContext &BC) { + std::atomic<uint64_t> FunctionsModified{0}; + ParallelUtilities::WorkFuncTy WorkFun = [&](BinaryFunction &BF) { + FunctionsModified++; + runOnFunction(BF); + }; + + ParallelUtilities::PredicateTy SkipPredicate = [&](const BinaryFunction &BF) { + // We can skip functions which did not include negate-ra-state CFIs. 
This + // includes code using pac-ret hardening as well, if the binary is + // compiled with `-fno-exceptions -fno-unwind-tables + // -fno-asynchronous-unwind-tables` + return !BF.containedNegateRAState() || BF.isIgnored(); + }; + + ParallelUtilities::runOnEachFunction( + BC, ParallelUtilities::SchedulingPolicy::SP_INST_LINEAR, WorkFun, + SkipPredicate, "InsertNegateRAStatePass"); + + BC.outs() << "BOLT-INFO: rewritten pac-ret DWARF info in " + << FunctionsModified << " out of " << BC.getBinaryFunctions().size() + << " functions " + << format("(%.2lf%%).\n", (100.0 * FunctionsModified) / + BC.getBinaryFunctions().size()); + return Error::success(); +} + +} // end namespace bolt +} // end namespace llvm diff --git a/bolt/lib/Passes/MarkRAStates.cpp b/bolt/lib/Passes/MarkRAStates.cpp new file mode 100644 index 0000000..2c5ce4a --- /dev/null +++ b/bolt/lib/Passes/MarkRAStates.cpp @@ -0,0 +1,152 @@ +//===- bolt/Passes/MarkRAStates.cpp ---------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file implements the MarkRAStates class. +// Three CFIs have an influence on the RA State of an instruction: +// - NegateRAState flips the RA State, +// - RememberState pushes the RA State to a stack, +// - RestoreState pops the RA State from the stack. +// These are saved as MCAnnotations on instructions they refer to at CFI +// reading (in CFIReaderWriter::fillCFIInfoFor). In this pass, we can work out +// the RA State of each instruction, and save it as new MCAnnotations. The new +// annotations are Signing, Signed, Authenticating and Unsigned. After +// optimizations, .cfi_negate_ra_state CFIs are added to the places where the +// state changes in InsertNegateRAStatePass. 
+// +//===----------------------------------------------------------------------===// +#include "bolt/Passes/MarkRAStates.h" +#include "bolt/Core/BinaryFunction.h" +#include "bolt/Core/ParallelUtilities.h" +#include <cstdlib> +#include <optional> +#include <stack> + +using namespace llvm; + +namespace llvm { +namespace bolt { + +bool MarkRAStates::runOnFunction(BinaryFunction &BF) { + + BinaryContext &BC = BF.getBinaryContext(); + + for (const BinaryBasicBlock &BB : BF) { + for (const MCInst &Inst : BB) { + if ((BC.MIB->isPSignOnLR(Inst) || + (BC.MIB->isPAuthOnLR(Inst) && !BC.MIB->isPAuthAndRet(Inst))) && + !BC.MIB->hasNegateRAState(Inst)) { + // Not all functions have .cfi_negate_ra_state in them. But if one does, + // we expect psign/pauth instructions to have the hasNegateRAState + // annotation. + BF.setIgnored(); + BC.outs() << "BOLT-INFO: inconsistent RAStates in function " + << BF.getPrintName() + << ": ptr sign/auth inst without .cfi_negate_ra_state\n"; + return false; + } + } + } + + bool RAState = BF.getInitialRAState(); + std::stack<bool> RAStateStack; + RAStateStack.push(RAState); + + for (BinaryBasicBlock &BB : BF) { + for (MCInst &Inst : BB) { + if (BC.MIB->isCFI(Inst)) + continue; + + if (BC.MIB->isPSignOnLR(Inst)) { + if (RAState) { + // RA signing instructions should only follow unsigned RA state. + BC.outs() << "BOLT-INFO: inconsistent RAStates in function " + << BF.getPrintName() + << ": ptr signing inst encountered in Signed RA state\n"; + BF.setIgnored(); + return false; + } + // The signing instruction itself is unsigned, the next will be + // signed. + BC.MIB->setRAUnsigned(Inst); + } else if (BC.MIB->isPAuthOnLR(Inst)) { + if (!RAState) { + // RA authenticating instructions should only follow signed RA state. 
+ BC.outs() << "BOLT-INFO: inconsistent RAStates in function " + << BF.getPrintName() + << ": ptr authenticating inst encountered in Unsigned RA " + "state\n"; + BF.setIgnored(); + return false; + } + // The authenticating instruction itself is signed, but the next will be + // unsigned. + BC.MIB->setRASigned(Inst); + } else if (RAState) { + BC.MIB->setRASigned(Inst); + } else { + BC.MIB->setRAUnsigned(Inst); + } + + // Updating RAState. All updates are valid from the next instruction. + // Because the same instruction can have remember and restore, the order + // here is relevant. This is the reason to loop over Annotations instead + // of just checking each in a predefined order. + for (unsigned int Idx = 0; Idx < Inst.getNumOperands(); Idx++) { + std::optional<int64_t> Annotation = + BC.MIB->getAnnotationAtOpIndex(Inst, Idx); + if (!Annotation) + continue; + if (Annotation == MCPlus::MCAnnotation::kNegateState) + RAState = !RAState; + else if (Annotation == MCPlus::MCAnnotation::kRememberState) + RAStateStack.push(RAState); + else if (Annotation == MCPlus::MCAnnotation::kRestoreState) { + RAState = RAStateStack.top(); + RAStateStack.pop(); + } + } + } + } + return true; +} + +Error MarkRAStates::runOnFunctions(BinaryContext &BC) { + std::atomic<uint64_t> FunctionsIgnored{0}; + ParallelUtilities::WorkFuncTy WorkFun = [&](BinaryFunction &BF) { + if (!runOnFunction(BF)) { + FunctionsIgnored++; + } + }; + + ParallelUtilities::PredicateTy SkipPredicate = [&](const BinaryFunction &BF) { + // We can skip functions which did not include negate-ra-state CFIs. 
This + // includes code using pac-ret hardening as well, if the binary is + // compiled with `-fno-exceptions -fno-unwind-tables + // -fno-asynchronous-unwind-tables` + return !BF.containedNegateRAState() || BF.isIgnored(); + }; + + int Total = llvm::count_if( + BC.getBinaryFunctions(), + [&](std::pair<const unsigned long, BinaryFunction> &P) { + return P.second.containedNegateRAState() && !P.second.isIgnored(); + }); + + ParallelUtilities::runOnEachFunction( + BC, ParallelUtilities::SchedulingPolicy::SP_INST_LINEAR, WorkFun, + SkipPredicate, "MarkRAStates"); + BC.outs() << "BOLT-INFO: MarkRAStates ran on " << Total + << " functions. Ignored " << FunctionsIgnored << " functions " + << format("(%.2lf%%)", (100.0 * FunctionsIgnored) / Total) + << " because of CFI inconsistencies\n"; + + return Error::success(); +} + +} // end namespace bolt +} // end namespace llvm diff --git a/bolt/lib/Rewrite/BinaryPassManager.cpp b/bolt/lib/Rewrite/BinaryPassManager.cpp index d9b7a2bd..782137e 100644 --- a/bolt/lib/Rewrite/BinaryPassManager.cpp +++ b/bolt/lib/Rewrite/BinaryPassManager.cpp @@ -19,11 +19,13 @@ #include "bolt/Passes/IdenticalCodeFolding.h" #include "bolt/Passes/IndirectCallPromotion.h" #include "bolt/Passes/Inliner.h" +#include "bolt/Passes/InsertNegateRAStatePass.h" #include "bolt/Passes/Instrumentation.h" #include "bolt/Passes/JTFootprintReduction.h" #include "bolt/Passes/LongJmp.h" #include "bolt/Passes/LoopInversionPass.h" #include "bolt/Passes/MCF.h" +#include "bolt/Passes/MarkRAStates.h" #include "bolt/Passes/PLTCall.h" #include "bolt/Passes/PatchEntries.h" #include "bolt/Passes/ProfileQualityStats.h" @@ -276,6 +278,12 @@ static cl::opt<bool> ShortenInstructions("shorten-instructions", cl::desc("shorten instructions"), cl::init(true), cl::cat(BoltOptCategory)); + +cl::opt<bool> + UpdateBranchProtection("update-branch-protection", + cl::desc("Rewrites pac-ret DWARF CFI instructions " + "(AArch64-only, on by default)"), + cl::init(true), cl::Hidden, 
cl::cat(BoltCategory)); } // namespace opts namespace llvm { @@ -353,6 +361,9 @@ Error BinaryFunctionPassManager::runPasses() { Error BinaryFunctionPassManager::runAllPasses(BinaryContext &BC) { BinaryFunctionPassManager Manager(BC); + if (BC.isAArch64()) + Manager.registerPass(std::make_unique<MarkRAStates>()); + Manager.registerPass( std::make_unique<EstimateEdgeCounts>(PrintEstimateEdgeCounts)); @@ -512,6 +523,8 @@ Error BinaryFunctionPassManager::runAllPasses(BinaryContext &BC) { // targets. No extra instructions after this pass, otherwise we may have // relocations out of range and crash during linking. Manager.registerPass(std::make_unique<LongJmpPass>(PrintLongJmp)); + + Manager.registerPass(std::make_unique<InsertNegateRAState>()); } // This pass should always run last.* diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp index ddf9347..c428828 100644 --- a/bolt/lib/Rewrite/RewriteInstance.cpp +++ b/bolt/lib/Rewrite/RewriteInstance.cpp @@ -3524,6 +3524,17 @@ void RewriteInstance::disassembleFunctions() { } } + // Check if fillCFIInfoFor removed any OpNegateRAState CFIs from the + // function. + if (Function.containedNegateRAState()) { + if (!opts::UpdateBranchProtection) { + BC->errs() + << "BOLT-ERROR: --update-branch-protection is set to false, but " + << Function.getPrintName() << " contains .cfi-negate-ra-state\n"; + exit(1); + } + } + // Parse LSDA. 
if (Function.getLSDAAddress() != 0 && !BC->getFragmentsToSkip().count(&Function)) { diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp index f271867..df4f421 100644 --- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp +++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp @@ -244,6 +244,28 @@ public: } } + bool isPSignOnLR(const MCInst &Inst) const override { + std::optional<MCPhysReg> SignReg = getSignedReg(Inst); + return SignReg && *SignReg == AArch64::LR; + } + + bool isPAuthOnLR(const MCInst &Inst) const override { + // LDR(A|B) should not be covered. + bool IsChecked; + std::optional<MCPhysReg> AuthReg = + getWrittenAuthenticatedReg(Inst, IsChecked); + return !IsChecked && AuthReg && *AuthReg == AArch64::LR; + } + + bool isPAuthAndRet(const MCInst &Inst) const override { + return Inst.getOpcode() == AArch64::RETAA || + Inst.getOpcode() == AArch64::RETAB || + Inst.getOpcode() == AArch64::RETAASPPCi || + Inst.getOpcode() == AArch64::RETABSPPCi || + Inst.getOpcode() == AArch64::RETAASPPCr || + Inst.getOpcode() == AArch64::RETABSPPCr; + } + std::optional<MCPhysReg> getSignedReg(const MCInst &Inst) const override { switch (Inst.getOpcode()) { case AArch64::PACIA: diff --git a/bolt/test/AArch64/negate-ra-state-disallow.s b/bolt/test/AArch64/negate-ra-state-disallow.s new file mode 100644 index 0000000..95adb71 --- /dev/null +++ b/bolt/test/AArch64/negate-ra-state-disallow.s @@ -0,0 +1,25 @@ +# RUN: llvm-mc -filetype=obj -triple aarch64-unknown-unknown %s -o %t.o +# RUN: %clang %cflags %t.o -o %t.exe -Wl,-q +# RUN: not llvm-bolt %t.exe -o %t.exe.bolt --update-branch-protection=false 2>&1 | FileCheck %s + +# CHECK: BOLT-ERROR: --update-branch-protection is set to false, but foo contains .cfi-negate-ra-state + + .text + .globl foo + .p2align 2 + .type foo,@function +foo: + .cfi_startproc + hint #25 + .cfi_negate_ra_state + mov x1, #0 + hint #29 + .cfi_negate_ra_state + ret + .cfi_endproc + .size foo, 
.-foo + + .global _start + .type _start, %function +_start: + b foo diff --git a/bolt/test/AArch64/negate-ra-state-incorrect.s b/bolt/test/AArch64/negate-ra-state-incorrect.s new file mode 100644 index 0000000..14d2c38 --- /dev/null +++ b/bolt/test/AArch64/negate-ra-state-incorrect.s @@ -0,0 +1,78 @@ +# This test checks that MarkRAStates pass ignores functions with +# malformed .cfi_negate_ra_state sequences in the input binary. + +# The cases checked are: +# - extra .cfi_negate_ra_state in Signed state: checked in foo, +# - extra .cfi_negate_ra_state in Unsigned state: checked in bar, +# - missing .cfi_negate_ra_state from PSign or PAuth instructions: checked in baz. + +# RUN: llvm-mc -filetype=obj -triple aarch64-unknown-unknown %s -o %t.o +# RUN: %clang %cflags %t.o -o %t.exe -Wl,-q +# RUN: llvm-bolt %t.exe -o %t.exe.bolt --no-threads | FileCheck %s --check-prefix=CHECK-BOLT + +# CHECK-BOLT: BOLT-INFO: inconsistent RAStates in function foo: ptr authenticating inst encountered in Unsigned RA state +# CHECK-BOLT: BOLT-INFO: inconsistent RAStates in function bar: ptr signing inst encountered in Signed RA state +# CHECK-BOLT: BOLT-INFO: inconsistent RAStates in function baz: ptr sign/auth inst without .cfi_negate_ra_state + +# Check that the incorrect functions got ignored, so they are not in the new .text section +# RUN: llvm-objdump %t.exe.bolt -d -j .text | FileCheck %s --check-prefix=CHECK-OBJDUMP +# CHECK-OBJDUMP-NOT: <foo>: +# CHECK-OBJDUMP-NOT: <bar>: +# CHECK-OBJDUMP-NOT: <baz>: + + + .text + .globl foo + .p2align 2 + .type foo,@function +foo: + .cfi_startproc + hint #25 + .cfi_negate_ra_state + mov x1, #0 + .cfi_negate_ra_state // Incorrect CFI in signed state + hint #29 + .cfi_negate_ra_state + ret + .cfi_endproc + .size foo, .-foo + + .text + .globl bar + .p2align 2 + .type bar,@function +bar: + .cfi_startproc + mov x1, #0 + .cfi_negate_ra_state // Incorrect CFI in unsigned state + hint #25 + .cfi_negate_ra_state + mov x1, #0 + hint #29 + 
.cfi_negate_ra_state + ret + .cfi_endproc + .size bar, .-bar + + .text + .globl baz + .p2align 2 + .type baz,@function +baz: + .cfi_startproc + mov x1, #0 + hint #25 + .cfi_negate_ra_state + mov x1, #0 + hint #29 + // Missing .cfi_negate_ra_state + ret + .cfi_endproc + .size baz, .-baz + + .global _start + .type _start, %function +_start: + b foo + b bar + b baz diff --git a/bolt/test/AArch64/negate-ra-state-reorder.s b/bolt/test/AArch64/negate-ra-state-reorder.s new file mode 100644 index 0000000..2659f75 --- /dev/null +++ b/bolt/test/AArch64/negate-ra-state-reorder.s @@ -0,0 +1,73 @@ +# Checking that after reordering BasicBlocks, the generated OpNegateRAState instructions +# are placed where the RA state is different between two consecutive instructions. +# This case demonstrates that the number of CFIs in the output can differ from the input: +# the input has 4, but the output only has 3. + +# RUN: llvm-mc -filetype=obj -triple aarch64-unknown-unknown %s -o %t.o +# RUN: %clang %cflags %t.o -o %t.exe -Wl,-q +# RUN: llvm-bolt %t.exe -o %t.exe.bolt --no-threads --reorder-blocks=reverse \ +# RUN: --print-cfg --print-after-lowering --print-only foo | FileCheck %s + +# Check that the reordering succeeded. +# CHECK: Binary Function "foo" after building cfg { +# CHECK: BB Layout : .LBB00, .Ltmp2, .Ltmp0, .Ltmp1 +# CHECK: Binary Function "foo" after inst-lowering { +# CHECK: BB Layout : .LBB00, .Ltmp1, .Ltmp0, .Ltmp2 + + +# Check the generated CFIs.
+# CHECK: OpNegateRAState +# CHECK-NEXT: mov x2, #0x6 + +# CHECK: autiasp +# CHECK-NEXT: OpNegateRAState +# CHECK-NEXT: ret + +# CHECK: paciasp +# CHECK-NEXT: OpNegateRAState + +# CHECK: DWARF CFI Instructions: +# CHECK-NEXT: 0: OpNegateRAState +# CHECK-NEXT: 1: OpNegateRAState +# CHECK-NEXT: 2: OpNegateRAState +# CHECK-NEXT: End of Function "foo" + + .text + .globl foo + .p2align 2 + .type foo,@function +foo: + .cfi_startproc + // RA is unsigned + mov x1, #0 + mov x1, #1 + mov x1, #2 + // jump into the signed "range" + b .Lmiddle +.Lback: +// sign RA + paciasp + .cfi_negate_ra_state + mov x2, #3 + mov x2, #4 + // skip unsigned instructions + b .Lcont + .cfi_negate_ra_state +.Lmiddle: +// RA is unsigned + mov x4, #5 + b .Lback + .cfi_negate_ra_state +.Lcont: +// continue in signed state + mov x2, #6 + autiasp + .cfi_negate_ra_state + ret + .cfi_endproc + .size foo, .-foo + + .global _start + .type _start, %function +_start: + b foo diff --git a/bolt/test/AArch64/negate-ra-state.s b/bolt/test/AArch64/negate-ra-state.s new file mode 100644 index 0000000..30786d4 --- /dev/null +++ b/bolt/test/AArch64/negate-ra-state.s @@ -0,0 +1,76 @@ +# Checking that .cfi_negate_ra_state directives are emitted in the same location as in the input in the case of no optimizations. + +# The foo and bar functions are a pair, with the first signing the return address, +# and the second authenticating it. We have a tailcall between the two. +# This is testing that BOLT can handle functions starting in signed RA state. + +# RUN: llvm-mc -filetype=obj -triple aarch64-unknown-unknown %s -o %t.o +# RUN: %clang %cflags %t.o -o %t.exe -Wl,-q +# RUN: llvm-bolt %t.exe -o %t.exe.bolt --no-threads --print-all | FileCheck %s --check-prefix=CHECK-BOLT + +# Check that the negate-ra-state at the start of bar is not discarded. +# If it was discarded, MarkRAStates would report bar as having inconsistent RAStates. +# This is testing the handling of initialRAState on the BinaryFunction.
+# CHECK-BOLT-NOT: BOLT-INFO: inconsistent RAStates in function foo +# CHECK-BOLT-NOT: BOLT-INFO: inconsistent RAStates in function bar + +# Check that OpNegateRAState CFIs are generated correctly. +# CHECK-BOLT: Binary Function "foo" after insert-negate-ra-state-pass { +# CHECK-BOLT: paciasp +# CHECK-BOLT-NEXT: OpNegateRAState + +# CHECK-BOLT: DWARF CFI Instructions: +# CHECK-BOLT-NEXT: 0: OpNegateRAState +# CHECK-BOLT-NEXT: End of Function "foo" + +# CHECK-BOLT: Binary Function "bar" after insert-negate-ra-state-pass { +# CHECK-BOLT: OpNegateRAState +# CHECK-BOLT-NEXT: mov x1, #0x0 +# CHECK-BOLT-NEXT: mov x1, #0x1 +# CHECK-BOLT-NEXT: autiasp +# CHECK-BOLT-NEXT: OpNegateRAState +# CHECK-BOLT-NEXT: ret + +# CHECK-BOLT: DWARF CFI Instructions: +# CHECK-BOLT-NEXT: 0: OpNegateRAState +# CHECK-BOLT-NEXT: 1: OpNegateRAState +# CHECK-BOLT-NEXT: End of Function "bar" + +# End of negate-ra-state insertion logs for foo and bar. +# CHECK: Binary Function "_start" after insert-negate-ra-state-pass { + +# Check that the functions are in the new .text section +# RUN: llvm-objdump %t.exe.bolt -d -j .text | FileCheck %s --check-prefix=CHECK-OBJDUMP +# CHECK-OBJDUMP: <foo>: +# CHECK-OBJDUMP: <bar>: + + + .text + .globl foo + .p2align 2 + .type foo,@function +foo: + .cfi_startproc + paciasp + .cfi_negate_ra_state + mov x1, #0 + b bar + .cfi_endproc + .size foo, .-foo + + + + .text + .globl bar + .p2align 2 + .type bar,@function +bar: + .cfi_startproc + .cfi_negate_ra_state // Indicating that RA is signed from the start of bar. + mov x1, #0 + mov x1, #1 + autiasp + .cfi_negate_ra_state + ret + .cfi_endproc + .size bar, .-bar diff --git a/bolt/test/AArch64/pacret-split-funcs.s b/bolt/test/AArch64/pacret-split-funcs.s new file mode 100644 index 0000000..27b34710 --- /dev/null +++ b/bolt/test/AArch64/pacret-split-funcs.s @@ -0,0 +1,54 @@ +# Checking that we generate an OpNegateRAState CFI after the split point, +# when splitting a region with signed RA state. 
+# We split at the fallthrough label. + +# REQUIRES: system-linux + +# RUN: %clang %s %cflags -march=armv8.3-a -Wl,-q -o %t +# RUN: link_fdata --no-lbr %s %t %t.fdata +# RUN: llvm-bolt %t -o %t.bolt --data %t.fdata -split-functions \ +# RUN: --print-only foo --print-split --print-all 2>&1 | FileCheck %s + +# Checking that we don't see any OpNegateRAState CFIs before the insertion pass. +# CHECK-NOT: OpNegateRAState +# CHECK: Binary Function "foo" after insert-negate-ra-state-pass + +# CHECK: paciasp +# CHECK-NEXT: OpNegateRAState + +# CHECK: ------- HOT-COLD SPLIT POINT ------- + +# CHECK: OpNegateRAState +# CHECK-NEXT: mov x0, #0x1 +# CHECK-NEXT: autiasp +# CHECK-NEXT: OpNegateRAState +# CHECK-NEXT: ret + +# End of the insert-negate-ra-state-pass logs +# CHECK: Binary Function "foo" after finalize-functions + + .text + .globl foo + .type foo, %function +foo: +.cfi_startproc +.entry_bb: +# FDATA: 1 foo #.entry_bb# 10 + paciasp + .cfi_negate_ra_state // indicating that paciasp changed the RA state to signed + cmp x0, #0 + b.eq .Lcold_bb1 +.Lfallthrough: // split point + mov x0, #1 + autiasp + .cfi_negate_ra_state // indicating that autiasp changed the RA state to unsigned + ret +.Lcold_bb1: // Instructions below are not important, they are just here so the cold block is not empty. + .cfi_negate_ra_state // ret has unsigned RA state, but the next inst (autiasp) has signed RA state + mov x0, #2 + retaa +.cfi_endproc + .size foo, .-foo + +## Force relocation mode. 
+.reloc 0, R_AARCH64_NONE diff --git a/bolt/test/runtime/AArch64/negate-ra-state.cpp b/bolt/test/runtime/AArch64/negate-ra-state.cpp new file mode 100644 index 0000000..60b0b08 --- /dev/null +++ b/bolt/test/runtime/AArch64/negate-ra-state.cpp @@ -0,0 +1,26 @@ +// REQUIRES: system-linux,bolt-runtime + +// RUN: %clangxx --target=aarch64-unknown-linux-gnu \ +// RUN: -mbranch-protection=pac-ret -Wl,-q %s -o %t.exe +// RUN: llvm-bolt %t.exe -o %t.bolt.exe +// RUN: %t.bolt.exe | FileCheck %s + +// CHECK: Exception caught: Exception from bar(). + +#include <cstdio> +#include <stdexcept> + +void bar() { throw std::runtime_error("Exception from bar()."); } + +void foo() { + try { + bar(); + } catch (const std::exception &e) { + printf("Exception caught: %s\n", e.what()); + } +} + +int main() { + foo(); + return 0; +} diff --git a/bolt/test/runtime/AArch64/pacret-function-split.cpp b/bolt/test/runtime/AArch64/pacret-function-split.cpp new file mode 100644 index 0000000..208fc5c --- /dev/null +++ b/bolt/test/runtime/AArch64/pacret-function-split.cpp @@ -0,0 +1,42 @@ +/* This test checks that the negate-ra-state CFIs are properly emitted in case of + function splitting. The test checks two things: + - we split at the correct location: to test the feature, + we need to split *before* the bl __cxa_throw@PLT call is made, + so the unwinder has to unwind from the split (cold) part. + + - the BOLTed binary runs, and prints the exception message from foo.
+ +# REQUIRES: system-linux,bolt-runtime + +# FDATA: 1 main #split# 1 _Z3foov 0 0 1 + +# RUN: %clangxx --target=aarch64-unknown-linux-gnu \ +# RUN: -mbranch-protection=pac-ret %s -o %t.exe -Wl,-q +# RUN: link_fdata %s %t.exe %t.fdata +# RUN: llvm-bolt %t.exe -o %t.bolt --split-functions --split-eh \ +# RUN: --split-strategy=profile2 --split-all-cold --print-split \ +# RUN: --print-only=_Z3foov --data=%t.fdata 2>&1 | FileCheck \ +# RUN: --check-prefix=BOLT-CHECK %s +# RUN: %t.bolt | FileCheck %s --check-prefix=RUN-CHECK + +# BOLT-CHECK-NOT: bl __cxa_throw@PLT +# BOLT-CHECK: ------- HOT-COLD SPLIT POINT ------- +# BOLT-CHECK: bl __cxa_throw@PLT + +# RUN-CHECK: Exception caught: Exception from foo(). +*/ + +#include <cstdio> +#include <stdexcept> + +void foo() { throw std::runtime_error("Exception from foo()."); } + +int main() { + try { + __asm__ __volatile__("split:"); + foo(); + } catch (const std::exception &e) { + printf("Exception caught: %s\n", e.what()); + } + return 0; +}
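Reviewer note: the two passes in this patch can be modeled outside BOLT, which may help when reasoning about the expected CFI count in `negate-ra-state-reorder.s`. The sketch below is a simplified standalone model and is not BOLT code. `Cfi`, `Inst`, `markRAStates`, `negatePositions` and `reorderExample` are hypothetical names. It assumes each CFI is attached to the instruction that precedes it, that a single fragment is laid out, and that branch fix-ups after reordering do not add or remove instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <stack>
#include <vector>

// Hypothetical stand-ins for the CFI annotations; these are not BOLT types.
enum class Cfi { Negate, Remember, Restore };

struct Inst {
  int block;             // illustrative basic-block id
  std::vector<Cfi> cfis; // CFIs that take effect after this instruction
};

// Models the MarkRAStates idea: replay the CFIs in input order and record
// whether the return address is signed at each instruction.
std::vector<bool> markRAStates(const std::vector<Inst> &insts, bool initial) {
  std::vector<bool> signedAt;
  std::stack<bool> saved;
  bool state = initial;
  for (const Inst &inst : insts) {
    signedAt.push_back(state); // updates apply from the next instruction
    for (Cfi cfi : inst.cfis) {
      if (cfi == Cfi::Negate) {
        state = !state;
      } else if (cfi == Cfi::Remember) {
        saved.push(state);
      } else {
        assert(!saved.empty() && "restore_state without remember_state");
        state = saved.top();
        saved.pop();
      }
    }
  }
  return signedAt;
}

// Models the InsertNegateRAState idea for one fragment: a CFI is needed at
// the start if the fragment begins signed, and wherever neighbours differ.
// Returns the indices of the instructions that need a CFI in front of them.
std::vector<std::size_t> negatePositions(const std::vector<bool> &layout) {
  std::vector<std::size_t> positions;
  if (!layout.empty() && layout[0])
    positions.push_back(0);
  for (std::size_t i = 1; i < layout.size(); ++i)
    if (layout[i] != layout[i - 1])
      positions.push_back(i);
  return positions;
}

// foo from negate-ra-state-reorder.s; blocks 0..3 stand for the entry block,
// .Lback, .Lmiddle and .Lcont. The input carries four negate-ra-state CFIs.
std::vector<std::size_t> reorderExample() {
  const Cfi N = Cfi::Negate;
  std::vector<Inst> input = {
      {0, {}},  {0, {}}, {0, {}}, {0, {}},  // mov, mov, mov, b .Lmiddle
      {1, {N}}, {1, {}}, {1, {}}, {1, {N}}, // paciasp, mov, mov, b .Lcont
      {2, {}},  {2, {N}},                   // mov, b .Lback
      {3, {}},  {3, {N}}, {3, {}},          // mov, autiasp, ret
  };
  std::vector<bool> states = markRAStates(input, /*initial=*/false);

  // Reversed layout as in the test: entry, .Lcont, .Lmiddle, .Lback.
  std::vector<bool> layout;
  for (int block : {0, 3, 2, 1})
    for (std::size_t i = 0; i < input.size(); ++i)
      if (input[i].block == block)
        layout.push_back(states[i]);
  return negatePositions(layout);
}
```

Under these assumptions, `reorderExample()` yields three positions: before `mov x2, #6`, before `ret`, and after `paciasp`. This agrees with the three `OpNegateRAState` entries the test expects for an input that had four.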