author: Yuxuan Chen <ych@fb.com> 2024-09-08 23:08:58 -0700
committer: GitHub <noreply@github.com> 2024-09-08 23:08:58 -0700
commit: e17a39bc314f97231e440c9e68d9f46a9c07af6d (patch)
tree: bc722bc99c2f4d681f42fb9c4c5313990b17862f /clang/lib/CodeGen/CodeGenFunction.h
parent: ac9355446291a02239ce9b45d0c2225a4db0515a (diff)
[Clang] C++20 Coroutines: Introduce Frontend Attribute [[clang::coro_await_elidable]] (#99282)
This patch is the frontend implementation of the coroutine elide improvement project detailed in this Discourse post: https://discourse.llvm.org/t/language-extension-for-better-more-deterministic-halo-for-c-coroutines/80044

This patch proposes a C++ struct/class attribute `[[clang::coro_await_elidable]]`. The notion of an await-elidable task gives developers and library authors certainty that coroutine heap elision happens in a predictable way.

Originally, after we lower a coroutine to LLVM IR, CoroElide is responsible for analyzing whether an elision can happen. Take this as an example:

```
Task foo();
Task bar() {
  co_await foo();
}
```

For CoroElide to happen, the ramp function of `foo` must be inlined into `bar`. This inlining happens after `foo` has been split, but `bar` is usually still a presplit coroutine. If `foo` is indeed a coroutine, the inlined `coro.id` intrinsics of `foo` are visible within `bar`. CoroElide then runs an analysis to figure out whether the SSA value of `coro.begin()` of `foo` gets destroyed before `bar` terminates.

`Task` types are rarely simple enough for the destroy logic of the task to reference the SSA value from `coro.begin()` directly. Hence, the pass is very ineffective for even the most trivial C++ Task types. Improving CoroElide by implementing more powerful analyses is possible; however, it doesn't give us predictability about when elision will happen.

The approach we want to take with this language extension generally originates from the philosophy that library implementations of `Task` types have control over the structured concurrency guarantees we demand for elision to happen. That is, the lifetime of the callee's frame is shorter than that of the caller.

``[[clang::coro_await_elidable]]`` is a class attribute which can be applied to a coroutine return type. When a coroutine function that returns such a type calls another coroutine function, the compiler performs heap allocation elision when the following conditions are all met:

- The callee coroutine function returns a type that is annotated with ``[[clang::coro_await_elidable]]``.
- In the caller coroutine, the return value of the callee is a prvalue that is immediately `co_await`ed.

From the C++ perspective, this makes sense because we can ensure that the lifetime of the elided callee cannot exceed that of the caller if we can guarantee that the caller coroutine is never destroyed earlier than the callee coroutine. This is not generally true for arbitrary C++ programs. However, the library that implements `Task` types and executors may provide this guarantee to the compiler, providing the user with certainty that HALO will work on their programs.

After this patch, when compiling coroutines that return a type with this attribute, the frontend checks the type of the operand of `co_await` expressions (not `operator co_await`). If it is also attributed with `[[clang::coro_await_elidable]]`, the FE emits metadata on the call or invoke instruction as a hint for a later middle end pass to perform the elision.

The original patch version is https://github.com/llvm/llvm-project/pull/94693 and, as suggested, the patch is split into frontend and middle end solutions in stacked PRs.

The middle end CoroSplit patch can be found at https://github.com/llvm/llvm-project/pull/99283
The middle end transformation that performs the elide can be found at https://github.com/llvm/llvm-project/pull/99285
Diffstat (limited to 'clang/lib/CodeGen/CodeGenFunction.h')
-rw-r--r-- clang/lib/CodeGen/CodeGenFunction.h 64
1 files changed, 37 insertions, 27 deletions
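
Most of the header changes in this diff thread an optional `llvm::CallBase **CallOrInvoke` out-parameter through the call-emission helpers, so that a caller can get hold of the emitted call or invoke instruction and annotate it. The following is a rough, self-contained sketch of that pattern only. `FakeCall`, `Emitter`, and the `ElideHint` flag are hypothetical stand-ins (for `llvm::CallBase`, `CodeGenFunction`, and the emitted hint respectively) and do not mirror the real implementation.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an emitted call/invoke instruction.
struct FakeCall {
  std::string Callee;
  bool ElideHint = false; // stand-in for the hint the frontend attaches
};

// Hypothetical stand-in for the code generator.
struct Emitter {
  std::vector<std::unique_ptr<FakeCall>> Insts;

  // Lowest layer: creates the instruction and, if the caller asked for it,
  // writes the pointer through the out-parameter.
  void emitCall(const std::string &Callee, FakeCall **CallOrInvoke = nullptr) {
    Insts.push_back(std::unique_ptr<FakeCall>(new FakeCall()));
    Insts.back()->Callee = Callee;
    if (CallOrInvoke)
      *CallOrInvoke = Insts.back().get();
  }

  // Intermediate layer: forwards the out-parameter untouched. The nullptr
  // default keeps call sites that do not care about it unchanged.
  void emitCallExpr(const std::string &Callee,
                    FakeCall **CallOrInvoke = nullptr) {
    emitCall(Callee, CallOrInvoke);
  }

  // A client that needs the instruction: request it, then annotate it.
  void emitAwaitedCall(const std::string &Callee) {
    FakeCall *CI = nullptr;
    emitCallExpr(Callee, &CI);
    assert(CI && "emitCall should have reported the instruction");
    CI->ElideHint = true;
  }
};
```

Defaulting the parameter to `nullptr` in the wrappers is what lets existing callers compile unchanged, while only the path that needs the instruction passes an address.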
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index 9b93e96..5892d6a 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -3149,7 +3149,8 @@ public:
bool ForVirtualBase, bool Delegating,
Address This, CallArgList &Args,
AggValueSlot::Overlap_t Overlap,
- SourceLocation Loc, bool NewPointerIsChecked);
+ SourceLocation Loc, bool NewPointerIsChecked,
+ llvm::CallBase **CallOrInvoke = nullptr);
/// Emit assumption load for all bases. Requires to be called only on
/// most-derived class and not under construction of the object.
@@ -4269,7 +4270,8 @@ public:
LValue EmitBinaryOperatorLValue(const BinaryOperator *E);
LValue EmitCompoundAssignmentLValue(const CompoundAssignOperator *E);
// Note: only available for agg return types
- LValue EmitCallExprLValue(const CallExpr *E);
+ LValue EmitCallExprLValue(const CallExpr *E,
+ llvm::CallBase **CallOrInvoke = nullptr);
// Note: only available for agg return types
LValue EmitVAArgExprLValue(const VAArgExpr *E);
LValue EmitDeclRefLValue(const DeclRefExpr *E);
@@ -4382,21 +4384,27 @@ public:
/// LLVM arguments and the types they were derived from.
RValue EmitCall(const CGFunctionInfo &CallInfo, const CGCallee &Callee,
ReturnValueSlot ReturnValue, const CallArgList &Args,
- llvm::CallBase **callOrInvoke, bool IsMustTail,
+ llvm::CallBase **CallOrInvoke, bool IsMustTail,
SourceLocation Loc,
bool IsVirtualFunctionPointerThunk = false);
RValue EmitCall(const CGFunctionInfo &CallInfo, const CGCallee &Callee,
ReturnValueSlot ReturnValue, const CallArgList &Args,
- llvm::CallBase **callOrInvoke = nullptr,
+ llvm::CallBase **CallOrInvoke = nullptr,
bool IsMustTail = false) {
- return EmitCall(CallInfo, Callee, ReturnValue, Args, callOrInvoke,
+ return EmitCall(CallInfo, Callee, ReturnValue, Args, CallOrInvoke,
IsMustTail, SourceLocation());
}
RValue EmitCall(QualType FnType, const CGCallee &Callee, const CallExpr *E,
- ReturnValueSlot ReturnValue, llvm::Value *Chain = nullptr);
+ ReturnValueSlot ReturnValue, llvm::Value *Chain = nullptr,
+ llvm::CallBase **CallOrInvoke = nullptr);
+
+ // If a Call or Invoke instruction was emitted for this CallExpr, this method
+ // writes the pointer to `CallOrInvoke` if it's not null.
RValue EmitCallExpr(const CallExpr *E,
- ReturnValueSlot ReturnValue = ReturnValueSlot());
- RValue EmitSimpleCallExpr(const CallExpr *E, ReturnValueSlot ReturnValue);
+ ReturnValueSlot ReturnValue = ReturnValueSlot(),
+ llvm::CallBase **CallOrInvoke = nullptr);
+ RValue EmitSimpleCallExpr(const CallExpr *E, ReturnValueSlot ReturnValue,
+ llvm::CallBase **CallOrInvoke = nullptr);
CGCallee EmitCallee(const Expr *E);
void checkTargetFeatures(const CallExpr *E, const FunctionDecl *TargetDecl);
@@ -4500,25 +4508,23 @@ public:
void callCStructCopyAssignmentOperator(LValue Dst, LValue Src);
void callCStructMoveAssignmentOperator(LValue Dst, LValue Src);
- RValue
- EmitCXXMemberOrOperatorCall(const CXXMethodDecl *Method,
- const CGCallee &Callee,
- ReturnValueSlot ReturnValue, llvm::Value *This,
- llvm::Value *ImplicitParam,
- QualType ImplicitParamTy, const CallExpr *E,
- CallArgList *RtlArgs);
+ RValue EmitCXXMemberOrOperatorCall(
+ const CXXMethodDecl *Method, const CGCallee &Callee,
+ ReturnValueSlot ReturnValue, llvm::Value *This,
+ llvm::Value *ImplicitParam, QualType ImplicitParamTy, const CallExpr *E,
+ CallArgList *RtlArgs, llvm::CallBase **CallOrInvoke);
RValue EmitCXXDestructorCall(GlobalDecl Dtor, const CGCallee &Callee,
llvm::Value *This, QualType ThisTy,
llvm::Value *ImplicitParam,
- QualType ImplicitParamTy, const CallExpr *E);
+ QualType ImplicitParamTy, const CallExpr *E,
+ llvm::CallBase **CallOrInvoke = nullptr);
RValue EmitCXXMemberCallExpr(const CXXMemberCallExpr *E,
- ReturnValueSlot ReturnValue);
- RValue EmitCXXMemberOrOperatorMemberCallExpr(const CallExpr *CE,
- const CXXMethodDecl *MD,
- ReturnValueSlot ReturnValue,
- bool HasQualifier,
- NestedNameSpecifier *Qualifier,
- bool IsArrow, const Expr *Base);
+ ReturnValueSlot ReturnValue,
+ llvm::CallBase **CallOrInvoke = nullptr);
+ RValue EmitCXXMemberOrOperatorMemberCallExpr(
+ const CallExpr *CE, const CXXMethodDecl *MD, ReturnValueSlot ReturnValue,
+ bool HasQualifier, NestedNameSpecifier *Qualifier, bool IsArrow,
+ const Expr *Base, llvm::CallBase **CallOrInvoke);
// Compute the object pointer.
Address EmitCXXMemberDataPointerAddress(const Expr *E, Address base,
llvm::Value *memberPtr,
@@ -4526,15 +4532,18 @@ public:
LValueBaseInfo *BaseInfo = nullptr,
TBAAAccessInfo *TBAAInfo = nullptr);
RValue EmitCXXMemberPointerCallExpr(const CXXMemberCallExpr *E,
- ReturnValueSlot ReturnValue);
+ ReturnValueSlot ReturnValue,
+ llvm::CallBase **CallOrInvoke);
RValue EmitCXXOperatorMemberCallExpr(const CXXOperatorCallExpr *E,
const CXXMethodDecl *MD,
- ReturnValueSlot ReturnValue);
+ ReturnValueSlot ReturnValue,
+ llvm::CallBase **CallOrInvoke);
RValue EmitCXXPseudoDestructorExpr(const CXXPseudoDestructorExpr *E);
RValue EmitCUDAKernelCallExpr(const CUDAKernelCallExpr *E,
- ReturnValueSlot ReturnValue);
+ ReturnValueSlot ReturnValue,
+ llvm::CallBase **CallOrInvoke);
RValue EmitNVPTXDevicePrintfCallExpr(const CallExpr *E);
RValue EmitAMDGPUDevicePrintfCallExpr(const CallExpr *E);
@@ -4556,7 +4565,8 @@ public:
const analyze_os_log::OSLogBufferLayout &Layout,
CharUnits BufferAlignment);
- RValue EmitBlockCallExpr(const CallExpr *E, ReturnValueSlot ReturnValue);
+ RValue EmitBlockCallExpr(const CallExpr *E, ReturnValueSlot ReturnValue,
+ llvm::CallBase **CallOrInvoke);
/// EmitTargetBuiltinExpr - Emit the given builtin call. Returns 0 if the call
/// is unhandled by the current target.