aboutsummaryrefslogtreecommitdiff
path: root/clang/lib/CodeGen/CGCall.cpp
AgeCommit message (Collapse)AuthorFilesLines
2025-02-14[clang][CodeGen] `sret` args should always point to the `alloca` AS, so use ↵Alex Voicu1-12/+20
that (#114062) `sret` arguments are always going to reside in the stack/`alloca` address space, which makes the current formulation where their AS is derived from the pointee somewhat quaint. This patch ensures that `sret` ends up pointing to the `alloca` AS in IR function signatures, and also guards agains trying to pass a casted `alloca`d pointer to a `sret` arg, which can happen for most languages, when compiled for targets that have a non-zero `alloca` AS (e.g. AMDGCN) / map `LangAS::default` to a non-zero value (SPIR-V). A target could still choose to do something different here, by e.g. overriding `classifyReturnType` behaviour. In a broader sense, this patch extends non-aliased indirect args to also carry an AS, which leads to changing the `getIndirect()` interface. At the moment we're only using this for (indirect) returns, but it allows for future handling of indirect args themselves. We default to using the AllocaAS as that matches what Clang is currently doing, however if, in the future, a target would opt for e.g. placing indirect returns in some other storage, with another AS, this will require revisiting. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com> Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
2025-01-29[IR] Convert from nocapture to captures(none) (#123181)Nikita Popov1-1/+1
This PR removes the old `nocapture` attribute, replacing it with the new `captures` attribute introduced in #116990. This change is intended to be essentially NFC, replacing existing uses of `nocapture` with `captures(none)` without adding any new analysis capabilities. Making use of non-`none` values is left for a followup. Some notes: * `nocapture` will be upgraded to `captures(none)` by the bitcode reader. * `nocapture` will also be upgraded by the textual IR reader. This is to make it easier to use old IR files and somewhat reduce the test churn in this PR. * Helper APIs like `doesNotCapture()` will check for `captures(none)`. * MLIR import will convert `captures(none)` into an `llvm.nocapture` attribute. The representation in the LLVM IR dialect should be updated separately.
2025-01-28[Clang] Add fake use emission to Clang with -fextend-lifetimes (#110102)Wolfgang Pieb1-3/+14
Following the previous patch which adds the "extend lifetimes" flag without (almost) any functionality, this patch adds the real feature by allowing Clang to emit fake uses. These are emitted as a new form of cleanup, set for variable addresses, which just emits a fake use intrinsic when the variable falls out of scope. The code for achieving this is simple, with most of the logic centered on determining whether to emit a fake use for a given address, and on ensuring that fake uses are ignored in a few cases. Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
2025-01-13[CodeGen] Migrate away from PointerUnion::dyn_cast (NFC) (#122778)Kazu Hirata1-1/+1
Note that PointerUnion::dyn_cast has been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> Literal migration would result in dyn_cast_if_present (see the definition of PointerUnion::dyn_cast), but this patch uses dyn_cast because we expect Prototype.P to be nonnull.
2025-01-12[AArch64][Clang] Add support for __arm_agnostic("sme_za_state") (#121788)Sander de Smalen1-0/+2
This adds support for parsing the attribute and codegen to map it to "aarch64_za_state_agnostic" LLVM IR attribute. This attribute is described in the Arm C Language Extensions (ACLE) document: https://github.com/ARM-software/acle/blob/main/main/acle.md#__arm_agnostic
2025-01-11Reapply "[clang] Avoid re-evaluating field bitwidth" (#122289)Timm Baeder1-3/+3
2025-01-10[ubsan][NFCI] Use SanitizerOrdinal instead of SanitizerMask for EmitCheck ↵Thurston Dang1-6/+6
(exactly one sanitizer is required) (#122511) The `Checked` parameter of `CodeGenFunction::EmitCheck` is of type `ArrayRef<std::pair<llvm::Value *, SanitizerMask>>`, which is overly generalized: SanitizerMask can denote that zero or more sanitizers are enabled, but `EmitCheck` requires that exactly one sanitizer is specified in the SanitizerMask (e.g., `SanitizeTrap.has(Checked[i].second)` enforces that). This patch replaces SanitizerMask with SanitizerOrdinal in the `Checked` parameter of `EmitCheck` and code that transitively relies on it. This should not affect the behavior of UBSan, but it has the advantages that: - the code is clearer: it avoids ambiguity in EmitCheck about what to do if multiple bits are set - specifying the wrong number of sanitizers in `Checked[i].second` will be detected as a compile-time error, rather than a runtime assertion failure Suggested by Vitaly in https://github.com/llvm/llvm-project/pull/122392 as an alternative to adding an explicit runtime assertion that the SanitizerMask contains exactly one sanitizer.
2025-01-08Revert "[clang] Avoid re-evaluating field bitwidth (#117732)"Timm Bäder1-3/+3
This reverts commit 81fc3add1e627c23b7270fe2739cdacc09063e54. This breaks some LLDB tests, e.g. SymbolFile/DWARF/x86/no_unique_address-with-bitfields.cpp: lldb: ../llvm-project/clang/lib/AST/Decl.cpp:4604: unsigned int clang::FieldDecl::getBitWidthValue() const: Assertion `isa<ConstantExpr>(getBitWidth())' failed.
2025-01-08[clang] Avoid re-evaluating field bitwidth (#117732)Timm Baeder1-3/+3
Save the bitwidth value as a `ConstantExpr` with the value set. Remove the `ASTContext` parameter from `getBitWidthValue()`, so the latter simply returns the value from the `ConstantExpr` instead of constant-evaluating the bitwidth expression every time it is called.
2025-01-07[clang] Fix crashes when passing VLA to va_arg (#119563)天音あめ1-0/+2
Closes #119360. This bug occurs when passing a VLA to `va_arg`. Since the return value is inferred to be an array, it triggers `ScalarExprEmitter::VisitCastExpr`, which converts it to a pointer and subsequently calls `CodeGenFunction::EmitAggExpr`. At this point, because the inferred type is an `AggExpr` instead of a `ScalarExpr`, `ScalarExprEmitter::VisitVAArgExpr` is not invoked, and as a result, `CodeGenFunction::EmitVariablyModifiedType` is also not called, leading to the size of the VLA not being retrieved. The solution is to move the call to `CodeGenFunction::EmitVariablyModifiedType` into `CodeGenFunction::EmitVAArg`, ensuring that the size of the VLA is correctly obtained regardless of whether the expression is an `AggExpr` or a `ScalarExpr`.
2025-01-06[clang][NFC] clean up the handling of convergence control tokens (#121738)Sameer Sahasrabuddhe1-2/+2
2024-12-25[clang][RISCV] Remove unneeded RISCV tuple code (#121024)Brandon Wu1-31/+0
These code are no longer needed because we've modeled tuple type using target extension type rather than structure of scalable vectors.
2024-12-10[Clang] Change two placeholders from `undef` to `poison` [NFC] (#119141)Pedro Lobo1-1/+1
- Use `poison` instead of `undef` as a phi operand for an unreachable path (the predecessor will not go the BB that uses the value of the phi). - Call `@llvm.vector.insert` with a `poison` subvec when performing a `bitcast` from a fixed vector to a scalable vector.
2024-12-06[CodeGen] Migrate away from PointerUnion::{is,get} (NFC) (#118600)Kazu Hirata1-1/+1
Note that PointerUnion::{is,get} have been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> I'm not touching PointerUnion::dyn_cast for now because it's a bit complicated; we could blindly migrate it to dyn_cast_if_present, but we should probably use dyn_cast when the operand is known to be non-null.
2024-12-03[HLSL] get inout/out ABI for array parameters working (#111047)Sarah Spall1-3/+13
Get inout/out parameters working for HLSL Arrays. Utilizes the fix from #109323, and corrects the assignment behavior slightly to allow for Non-LValues on the RHS. Closes #106917 --------- Co-authored-by: Chris B <beanz@abolishcrlf.org>
2024-12-02[clang] Use poison instead of undef as the placeholder when creating a new ↵Pedro Lobo1-2/+2
vector [NFC] (#117064) Call `@llvm.vector.insert` with a `poison` vector when coercing a fixed vector to a scalable vector with the same element type.
2024-11-26[clang][SME] Ignore flatten/clang::always_inline statements for callees with ↵Benjamin Maxwell1-5/+11
mismatched streaming attributes (#116391) If `__attribute__((flatten))` is used on a function, or `[[clang::always_inline]]` on a statement, don't inline any callees with incompatible streaming attributes. Without this check, clang may produce incorrect code when these attributes are used in code with streaming functions. Note: The docs for flatten say it can be ignored when inlining is impossible: "causes calls within the attributed function to be inlined unless it is impossible to do so". Similarly, the (clang-only) `[[clang::always_inline]]` statement attribute is more relaxed than the GNU `__attribute__((always_inline))` (which says it should error it if it can't inline), saying only "If a statement is marked [[clang::always_inline]] and contains calls, the compiler attempts to inline those calls.". The docs also go on to show an example of where `[[clang::always_inline]]` has no effect.
2024-11-16[CodeGen] Remove unused includes (NFC) (#116459)Kazu Hirata1-1/+0
Identified with misc-include-cleaner.
2024-10-28Adding splitdouble HLSL function (#109331)joaosaffran1-7/+7
- Adding hlsl `splitdouble` intrinsics - Adding DXIL lowering - Adding SPIRV lowering - Adding test Fixes: #108901 --------- Co-authored-by: Joao Saffran <jderezende@microsoft.com>
2024-10-28[Clang][AArch64] Fix Pure Scalables Types argument passing and return (#112747)Momchil Velikov1-21/+64
Pure Scalable Types are defined in AAPCS64 here: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#pure-scalable-types-psts And should be passed according to Rule C.7 here: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#682parameter-passing-rules This part of the ABI is completely unimplemented in Clang, instead it treats PSTs sometimes as HFAs/HVAs, sometime as general composite types. This patch implements the rules for passing PSTs by employing the `CoerceAndExpand` method and extending it to: * allow array types in the `coerceToType`; Now only `[N x i8]` are considered padding. * allow mismatch between the elements of the `coerceToType` and the elements of the `unpaddedCoerceToType`; AArch64 uses this to map fixed-length vector types to SVE vector types. Corectly passing a PST argument needs a decision in Clang about whether to pass it in memory or registers or, equivalently, whether to use the `Indirect` or `Expand/CoerceAndExpand` method. It was considered relatively harder (or not practically possible) to make that decision in the AArch64 backend. Hence this patch implements the register counting from AAPCS64 (cf. `NSRN`, `NPRN`) to guide the Clang's decision.
2024-10-25[Clang] Always forward sret parameters to musttail callsKiran1-1/+5
If a call using the musttail attribute returns it's value through an sret argument pointer, we must forward an incoming sret pointer to it, instead of creating a new alloca. This is always possible because the musttail attribute requires the caller and callee to have the same argument and return types.
2024-10-24[clang] Use {} instead of std::nullopt to initialize empty ArrayRef (#109399)Jay Foad1-8/+8
Follow up to #109133.
2024-09-19Target ABI: improve call parameters extensions handling (#100757)Jonas Paulsson1-3/+7
For the purpose of verifying proper arguments extensions per the target's ABI, introduce the NoExt attribute that may be used by a target when neither sign- or zeroextension is required (e.g. with a struct in register). The purpose of doing so is to be able to verify that there is always one of these attributes present and by this detecting cases where sign/zero extension is actually missing. As a first step, this patch has the verification step done for the SystemZ backend only, but left off by default until all known issues have been addressed. Other targets/front-ends can now also add NoExt attribute where needed and do this check in the backend.
2024-08-31[HLSL] Implement output parameter (#101083)Chris B1-0/+22
HLSL output parameters are denoted with the `inout` and `out` keywords in the function declaration. When an argument to an output parameter is constructed a temporary value is constructed for the argument. For `inout` pamameters the argument is initialized via copy-initialization from the argument lvalue expression to the parameter type. For `out` parameters the argument is not initialized before the call. In both cases on return of the function the temporary value is written back to the argument lvalue expression through an implicit assignment binary operator with casting as required. This change introduces a new HLSLOutArgExpr ast node which represents the output argument behavior. The OutArgExpr has three defined children: - An OpaqueValueExpr of the argument lvalue expression. - An OpaqueValueExpr of the copy-initialized parameter. - A BinaryOpExpr assigning the first with the value of the second. Fixes #87526 --------- Co-authored-by: Damyan Pepper <damyanp@microsoft.com> Co-authored-by: John McCall <rjmccall@gmail.com>
2024-08-27Revert "[ARM] musttail fixes"Kiran1-1/+1
committed by accident, see #104795 This reverts commit a2088a24dad31ebe44c93751db17307fdbe1f0e2.
2024-08-27Revert "Seperate frontend changes, add debug directives, remove redundant ↵Kiran1-1/+1
stuff from tests" This reverts commit 1a908c6be3317bbbac73e6a6fc52cabefbdebf7d.
2024-08-27Seperate frontend changes, add debug directives, remove redundant stuff from ↵Kiran1-1/+1
tests
2024-08-27[ARM] musttail fixesKiran1-1/+1
Backend: - Caller and callee arguments no longer have to match, just to take up the same space, as they can be changed before the call - Allowed tail calls if callee and callee both (or neither) use sret, wheras before it would be dissalowed if either used sret - Allowed tail calls if byval args are used - Added debug trace for IsEligibleForTailCallOptimisation Frontend (clang): - Do not generate extra alloca if sret is used with musttail, as the space for the sret is allocated already Change-Id: Ic7f246a7eca43c06874922d642d7dc44bdfc98ec
2024-08-22[BPF] introduce __attribute__((bpf_fastcall)) (#105417)eddyz871-0/+2
This commit introduces attribute bpf_fastcall to declare BPF functions that do not clobber some of the caller saved registers (R0-R5). The idea is to generate the code complying with generic BPF ABI, but allow compatible Linux Kernel to remove unnecessary spills and fills of non-scratched registers (given some compiler assistance). For such functions do register allocation as-if caller saved registers are not clobbered, but later wrap the calls with spill and fill patterns that are simple to recognize in kernel. For example for the following C code: #define __bpf_fastcall __attribute__((bpf_fastcall)) void bar(void) __bpf_fastcall; void buz(long i, long j, long k); void foo(long i, long j, long k) { bar(); buz(i, j, k); } First allocate registers as if: foo: call bar # note: no spills for i,j,k (r1,r2,r3) call buz exit And later insert spills fills on the peephole phase: foo: *(u64 *)(r10 - 8) = r1; # Such call pattern is *(u64 *)(r10 - 16) = r2; # correct when used with *(u64 *)(r10 - 24) = r3; # old kernels. call bar r3 = *(u64 *)(r10 - 24); # But also allows new r2 = *(u64 *)(r10 - 16); # kernels to recognize the r1 = *(u64 *)(r10 - 8); # pattern and remove spills/fills. call buz exit The offsets for generated spills/fills are picked as minimal stack offsets for the function. Allocated stack slots are not used for any other purposes, in order to simplify in-kernel analysis.
2024-08-21[clang-repl] [codegen] Reduce the state in TBAA. NFC for static compilation. ↵Vassil Vassilev1-16/+15
(#98138) In incremental compilation clang works with multiple `llvm::Module`s. Our current approach is to create a CodeGenModule entity for every new module request (via StartModule). However, some of the state such as the mangle context needs to be preserved to keep the original semantics in the ever-growing TU. Fixes: llvm/llvm-project#95581. cc: @jeaye
2024-08-19Revert "[BPF] introduce `__attribute__((bpf_fastcall))` (#101228)"Eduard Zingerman1-2/+0
This reverts commit e9b2e16dc98345bb1b91b1a6dacb3cec85f49e31. Reverting because of the test failure: https://lab.llvm.org/buildbot/#/builders/187/builds/509
2024-08-19[BPF] introduce `__attribute__((bpf_fastcall))` (#101228)eddyz871-0/+2
This commit introduces attribute bpf_fastcall to declare BPF functions that do not clobber some of the caller saved registers (R0-R5). The idea is to generate the code complying with generic BPF ABI, but allow compatible Linux Kernel to remove unnecessary spills and fills of non-scratched registers (given some compiler assistance). For such functions do register allocation as-if caller saved registers are not clobbered, but later wrap the calls with spill and fill patterns that are simple to recognize in kernel. For example for the following C code: #define __bpf_fastcall __attribute__((bpf_fastcall)) void bar(void) __bpf_fastcall; void buz(long i, long j, long k); void foo(long i, long j, long k) { bar(); buz(i, j, k); } First allocate registers as if: foo: call bar # note: no spills for i,j,k (r1,r2,r3) call buz exit And later insert spills fills on the peephole phase: foo: *(u64 *)(r10 - 8) = r1; # Such call pattern is *(u64 *)(r10 - 16) = r2; # correct when used with *(u64 *)(r10 - 24) = r3; # old kernels. call bar r3 = *(u64 *)(r10 - 24); # But also allows new r2 = *(u64 *)(r10 - 16); # kernels to recognize the r1 = *(u64 *)(r10 - 8); # pattern and remove spills/fills. call buz exit The offsets for generated spills/fills are picked as minimal stack offsets for the function. Allocated stack slots are not used for any other purposes, in order to simplify in-kernel analysis. Corresponding functionality had been merged in Linux Kernel as [this](https://lore.kernel.org/bpf/172179364482.1919.9590705031832457529.git-patchwork-notify@kernel.org/) patch set (the patch assumed that `no_caller_saved_regsiters` attribute would be used by LLVM, naming does not matter for the Kernel).
2024-08-09[Arm][AArch64][Clang] Respect function's branch protection attributes. (#101978)Daniel Kiss1-1/+1
Default attributes assigned to all functions according to the command line parameters. Some functions might have their own attributes and we need to set or remove attributes accordingly. Tests are updated to test this scenarios too.
2024-08-09[DebugInfo][RemoveDIs] Use iterator-inserters in clang (#102006)Jeremy Morse1-2/+2
As part of the LLVM effort to eliminate debug-info intrinsics, we're moving to a world where only iterators should be used to insert instructions. This isn't a problem in clang when instructions get generated before any debug-info is inserted, however we're planning on deprecating and removing the instruction-pointer insertion routines. Scatter some calls to getIterator in a few places, remove a deref-then-addrof on another iterator, and add an overload for the createLoadInstBefore utility. Some callers passes a null insertion point, which we need to handle explicitly now.
2024-08-01Fix codegen of consteval functions returning an empty class, and related ↵Eli Friedman1-85/+61
issues (#93115) Fix codegen of consteval functions returning an empty class, and related issues If a class is empty, don't store it to memory: the store might overwrite useful data. Similarly, if a class has tail padding that might overlap other fields, don't store the tail padding to memory. The problem here turned out a bit more general than I initially thought: basically all uses of EmitAggregateStore were broken. Call lowering had a method that did mostly the right thing, though: CreateCoercedStore. Adapt CreateCoercedStore so it always does the conservatively right thing, and use it for both calls and ConstantExpr. Also, along the way, fix the "overlap" bit in AggValueSlot: the bit was set incorrectly for empty classes in some cases. Fixes #93040.
2024-07-31[clang][CUDA] Add 'noconvergent' function and statement attributedarkbuck1-0/+8
- For languages following SPMD/SIMT programming model, functions and call sites are marked 'convergent' by default. 'noconvergent' is added in this patch to allow developers to remove that 'convergent' attribute when it's safe. Reviewers: nhaehnle, Sirraide, yxsamliu, Artem-B, ilovepi, jayfoad, ssahasra, arsenm Reviewed By: arsenm Pull Request: https://github.com/llvm/llvm-project/pull/100637
2024-07-24[AIX] Add -msave-reg-params to save arguments to stack (#97524)Qiu Chaofan1-0/+3
In PowerPC ABI, a few initial arguments are passed through registers, but their places in parameter save area are reserved, arguments passed by memory goes after the reserved location. For debugging purpose, we may want to save copy of the pass-by-reg arguments into correct places on stack. The new option achieves by adding new function level attribute and make argument lowering part aware of it.
2024-07-22[PAC] Implement authentication for C++ member function pointers (#99576)Oliver Hunt1-98/+115
Introduces type based signing of member function pointers. To support this discrimination schema we no longer emit member function pointer to virtual methods and indices into a vtable but migrate to using thunks. This does mean member function pointers are no longer necessarily directly comparable, however as such comparisons are UB this is acceptable. We derive the discriminator from the C++ mangling of the type of the pointer being authenticated. Co-Authored-By: Akira Hatanaka ahatanaka@apple.com Co-Authored-By: John McCall rjmccall@apple.com Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
2024-07-15[clang] Use different memory layout type for _BitInt(N) in LLVM IR (#91364)Mariya Podchishchaeva1-1/+2
There are two problems with _BitInt prior to this patch: 1. For at least some values of N, we cannot use LLVM's iN for the type of struct elements, array elements, allocas, global variables, and so on, because the LLVM layout for that type does not match the high-level layout of _BitInt(N). Example: Currently for i128:128 targets correct implementation is possible either for __int128 or for _BitInt(129+) with lowering to iN, but not both, since we have now correct implementation of __int128 in place after a21abc7. When this happens, opaque [M x i8] types used, where M = sizeof(_BitInt(N)). 2. LLVM doesn't guarantee any particular extension behavior for integer types that aren't a multiple of 8. For this reason, all _BitInt types are now have in-memory representation that is a whole number of bytes. I.e. for example _BitInt(17) now will have memory layout type i32. This patch also introduces concept of load/store type and adds an API to CodeGenTypes that returns the IR type that should be used for load and store operations. This is particularly useful for the case when a _BitInt ends up having array of bytes as memory layout type. For _BitInt(N), let M = sizeof(_BitInt(N)), and let BITS = M * 8. Loads and stores of iM would both (1) produce far better code from the backends and (2) be far more optimizable by IR passes than loads and stores of [M x i8]. Fixes https://github.com/llvm/llvm-project/issues/85139 Fixes https://github.com/llvm/llvm-project/issues/83419 --------- Co-authored-by: John McCall <rjmccall@gmail.com>
2024-07-12[Clang][ARM][AArch64] Add branch protection attributes to the defaults. (#83277)Daniel Kiss1-0/+3
These attributes are no longer inherited from the module flags, therefore need to be added for synthetic functions.
2024-07-08[PowerPC] Diagnose musttail instead of crash inside backend (#93267)Chen Zheng1-1/+28
musttail is not often possible to be generated on PPC targets as when calling to a function defined in another module, PPC needs to restore the TOC pointer. To restore the TOC pointer, compiler needs to emit a nop after the call to let linker generate codes to restore TOC pointer. Tail call cannot generate expected call sequence for this case. To avoid the crash inside the compiler backend, a diagnosis is added in the frontend. Fixes #63214
2024-06-21[clang] Implement function pointer signing and authenticated function calls ↵Ahmed Bougacha1-0/+3
(#93906) The functions are currently always signed/authenticated with zero discriminator. Co-Authored-By: John McCall <rjmccall@apple.com>
2024-06-17[clang][CodeGen] Return RValue from `EmitVAArg` (#94635)Mariya Podchishchaeva1-6/+6
This should simplify handling of resulting value by the callers.
2024-06-14Fix silent truncation of inline ASM `srcloc` cookie when going through a ↵beetrees1-1/+1
`DiagnosticInfoSrcMgr` (#84559) The size of the inline ASM `srcloc` cookie was changed from 32 bits to 64 bits in [D105491](https://reviews.llvm.org/D105491). However, that commit only updated the size of the cookie in `DiagnosticInfoInlineAsm`, meaning that inline ASM diagnostics that are instead represented with a `DiagnosticInfoSrcMgr` have their cookies truncated to 32 bits. This PR replaces the remaining uses of `unsigned` to represent the cookie with `uint64_t`, allowing the cookie to make it all the way to the diagnostic handler without being truncated.
2024-06-07[ARM] r11 is reserved when using -mframe-chain=aapcs (#86951)Oliver Stannard1-0/+1
When using the -mframe-chain=aapcs or -mframe-chain=aapcs-leaf options, we cannot use r11 as an allocatable register, even if -fomit-frame-pointer is also used. This is so that r11 will always point to a valid frame record, even if we don't create one in every function.
2024-05-21[Clang] Emit lifetime markers for non-aggregate temporary allocas (#90849)Lukacma1-34/+25
This patch extends https://reviews.llvm.org/D68611 and emits lifetime markers for temporary allocas of non-aggregate types as well.
2024-05-20[clang][CodeGen] Remove unused LValue::getAddress CGF arg. (#92465)Ahmed Bougacha1-14/+14
This is in effect a revert of f139ae3d93797, as we have since gained a more sophisticated way of doing extra IRGen with the addition of RawAddress in #86923.
2024-05-14[clang][SPIR-V] Always add convergence intrinsics (#88918)Nathan Gauër1-1/+4
PR #80680 added bits in the codegen to lazily add convergence intrinsics when required. This logic relied on the LoopStack. The issue is when parsing the condition, the loopstack doesn't yet reflect the correct values, as expected since we are not yet in the loop. However, convergence tokens should sometimes already be available. The solution which seemed the simplest is to greedily generate the tokens when we generate SPIR-V. Fixes #88144 --------- Signed-off-by: Nathan Gauër <brioche@google.com>
2024-05-07[AArch64] Diagnose more functions when FP not enabled (#90832)ostannard1-5/+6
When using a hard-float ABI for a target without FP registers, it's not possible to correctly generate code for functions with arguments which must be passed in floating-point registers. This is diagnosed in CodeGen instead of Sema, to more closely match GCC's behaviour around inline functions, which is relied on by the Linux kernel. Previously, this only checked function signatures as they were code-generated, but this missed some cases: * Calls to functions not defined in this translation unit. * Calls through function pointers. * Calls to variadic functions, where the variadic arguments have a floating-point type. This adds checks to function calls, as well as definitions, so that these cases are correctly diagnosed.
2024-04-29Re-apply "Emit missing cleanups for stmt-expr" and other commits (#89154)Utkarsh Saxena1-6/+7
Latest diff: https://github.com/llvm/llvm-project/pull/89154/files/f1ab4c2677394bbfc985d9680d5eecd7b2e6a882..adf9bc902baddb156c83ce0f8ec03c142e806d45 We address two additional bugs here: ### Problem 1: Deactivated normal cleanup still runs, leading to double-free Consider the following: ```cpp struct A { }; struct B { B(const A&); }; struct S { A a; B b; }; int AcceptS(S s); void Accept2(int x, int y); void Test() { Accept2(AcceptS({.a = A{}, .b = A{}}), ({ return; 0; })); } ``` We add cleanups as follows: 1. push dtor for field `S::a` 2. push dtor for temp `A{}` (used by ` B(const A&)` in `.b = A{}`) 3. push dtor for field `S::b` 4. Deactivate 3 `S::b`-> This pops the cleanup. 5. Deactivate 1 `S::a` -> Does not pop the cleanup as *2* is top. Should create _active flag_!! 6. push dtor for `~S()`. 7. ... It is important to deactivate **5** using active flags. Without the active flags, the `return` will fallthrough it and would run both `~S()` and dtor `S::a` leading to **double free** of `~A()`. In this patch, we unconditionally emit active flags while deactivating normal cleanups. These flags are deleted later by the `AllocaTracker` if the cleanup is not emitted. ### Problem 2: Missing cleanup for conditional lifetime extension We push 2 cleanups for lifetime-extended cleanup. The first cleanup is useful if we exit from the middle of the expression (stmt-expr/coro suspensions). This is deactivated after full-expr, and a new cleanup is pushed, extending the lifetime of the temporaries (to the scope of the reference being initialized). If this lifetime extension happens to be conditional, then we use active flags to remember whether the branch was taken and if the object was initialized. Previously, we used a **single** active flag, which was used by both cleanups. This is wrong because the first cleanup will be forced to deactivate after the full-expr and therefore this **active** flag will always be **inactive**. The dtor for the lifetime extended entity would not run as it always sees an **inactive** flag. In this patch, we solve this using two separate active flags for both cleanups. Both of them are activated if the conditional branch is taken, but only one of them is deactivated after the full-expr. --- Fixes https://github.com/llvm/llvm-project/issues/63818 Fixes https://github.com/llvm/llvm-project/issues/88478 --- Previous PR logs: 1. https://github.com/llvm/llvm-project/pull/85398 2. https://github.com/llvm/llvm-project/pull/88670 3. https://github.com/llvm/llvm-project/pull/88751 4. https://github.com/llvm/llvm-project/pull/88884