aboutsummaryrefslogtreecommitdiff
path: root/clang/lib/CodeGen/CGCall.cpp
AgeCommit message (Collapse)AuthorFilesLines
6 days[clang] Add missing readonly/readnone annotations (#158424)Quentin Chateau1-0/+26
When arg memory effects are lost due to indirect arguments, apply readonly/readnone attribute to the other pointer arguments of the function. Fixes #157693 .
2025-09-09[clang][OpenMP][SPIR-V] Fix addrspace of pointer kernel arguments (#157172)Nick Sarnie1-3/+2
In SPIR-V, kernel arguments are not allowed to be in the Generic AS, in both Intel's internal SPIR-V offloading implementation as well as HIPSPV, `CrossWorkgroup` AS1 is used. Do the same for OMPSPV. Currently with Generic AS the `llvm-spirv` translator blows up if we are using it, and if not, the GPU runtime blows up. To get the existing logic to set the correct AS to kick in, we need to know if the function is a kernel or not at the time we first create the function that may end up as the kernel. I use the existing `arrangeSYCLKernelCallerDeclaration` function to do the right kernel ABI computation, but since the function is not specific to SYCL anymore because I merged all the device kernel clang attributes into one. Rename the function to be accurate to the current behavior, `arrangeDeviceKernelCallerDeclaration`. --------- Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-09-02[DebugInfo] When referencing structured bindings use the reference's ↵David Blaikie1-14/+12
location, not the binding's declaration's location (#153637) For structured bindings that use custom `get` specializations, the resulting LLVM IR ascribes the load of the result of `get` to the binding's declaration, rather than the place where the binding is referenced - this caused awkward sequencing in the debug info where, when stepping through the code you'd step back to the binding declaration every time there was a reference to the binding. To fix that - when we cross into IRGening a binding - suppress the debug info location of that subexpression. I don't represent this as a great bit of API design - certainly open to ideas, but putting it out here as a place to start. It's /possible/ this is an incomplete fix, even - if the binding decl had other subexpressions, those would still get their location applied & it'd likely be wrong. So maybe that's a direction to go with to productionize this - add a new location scoped device that suppresses any overriding - this might be more robust. How do people feel about that?
2025-08-29[IR][CodeGen] Remove "approx-func-fp-math" attribute (#155740)paperchalice1-2/+0
Remove "approx-func-fp-math" attribute and related command line option, users should always use afn flag in IR. Resolve FIXME in `TargetMachine::resetTargetOptions` partially.
2025-08-27[clang] AST: fix getAs canonicalization of leaf types (#155028)Matheus Izvekov1-4/+4
2025-08-26[clang] NFC: introduce Type::getAsEnumDecl, and cast variants for all ↵Matheus Izvekov1-12/+4
TagDecls (#155463) And make use of those. These changes are split from prior PR #155028, in order to decrease the size of that PR and facilitate review.
2025-08-25[clang] NFC: change more places to use Type::getAsTagDecl and friends (#155313)Matheus Izvekov1-2/+1
This changes a bunch of places which use getAs<TagType>, including derived types, just to obtain the tag definition. This is preparation for #155028, offloading all the changes that PR used to introduce which don't depend on any new helpers.
2025-08-25[Clang]Throw frontend error for target feature mismatch when using flatten ↵Abhishek Kaushik1-3/+6
attribute (#154801) Fixes #149866 --------- Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
2025-08-20[clang][RISCV] Fix crash on VLS calling convention (#145489)Brandon Wu1-11/+60
This patch handle struct of fixed vector and struct of array of fixed vector correctly for VLS calling convention in EmitFunctionProlog, EmitFunctionEpilog and EmitCall. stack on: https://github.com/llvm/llvm-project/pull/147173
2025-08-09[clang] Improve nested name specifier AST representation (#147835)Matheus Izvekov1-12/+18
This is a major change on how we represent nested name qualifications in the AST. * The nested name specifier itself and how it's stored is changed. The prefixes for types are handled within the type hierarchy, which makes canonicalization for them super cheap, no memory allocation required. Also translating a type into nested name specifier form becomes a no-op. An identifier is stored as a DependentNameType. The nested name specifier gains a lightweight handle class, to be used instead of passing around pointers, which is similar to what is implemented for TemplateName. There is still one free bit available, and this handle can be used within a PointerUnion and PointerIntPair, which should keep bit-packing aficionados happy. * The ElaboratedType node is removed, all type nodes in which it could previously apply to can now store the elaborated keyword and name qualifier, tail allocating when present. * TagTypes can now point to the exact declaration found when producing these, as opposed to the previous situation of there only existing one TagType per entity. This increases the amount of type sugar retained, and can have several applications, for example in tracking module ownership, and other tools which care about source file origins, such as IWYU. These TagTypes are lazily allocated, in order to limit the increase in AST size. This patch offers a great performance benefit. It greatly improves compilation time for [stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for `test_on2.cpp` in that project, which is the slowest compiling test, this patch improves `-c` compilation time by about 7.2%, with the `-fsyntax-only` improvement being at ~12%. This has great results on compile-time-tracker as well: ![image](https://github.com/user-attachments/assets/700dce98-2cab-4aa8-97d1-b038c0bee831) This patch also further enables other optimziations in the future, and will reduce the performance impact of template specialization resugaring when that lands. It has some other miscelaneous drive-by fixes. About the review: Yes the patch is huge, sorry about that. Part of the reason is that I started by the nested name specifier part, before the ElaboratedType part, but that had a huge performance downside, as ElaboratedType is a big performance hog. I didn't have the steam to go back and change the patch after the fact. There is also a lot of internal API changes, and it made sense to remove ElaboratedType in one go, versus removing it from one type at a time, as that would present much more churn to the users. Also, the nested name specifier having a different API avoids missing changes related to how prefixes work now, which could make existing code compile but not work. How to review: The important changes are all in `clang/include/clang/AST` and `clang/lib/AST`, with also important changes in `clang/lib/Sema/TreeTransform.h`. The rest and bulk of the changes are mostly consequences of the changes in API. PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just for easier to rebasing. I plan to rename it back after this lands. Fixes #136624 Fixes https://github.com/llvm/llvm-project/issues/43179 Fixes https://github.com/llvm/llvm-project/issues/68670 Fixes https://github.com/llvm/llvm-project/issues/92757
2025-08-08[IR] Remove size argument from lifetime intrinsics (#150248)Nikita Popov1-31/+16
Now that #149310 has restricted lifetime intrinsics to only work on allocas, we can also drop the explicit size argument. Instead, the size is implied by the alloca. This removes the ability to only mark a prefix of an alloca alive/dead. We never used that capability, so we should remove the need to handle that possibility everywhere (though many key places, including stack coloring, did not actually respect this).
2025-07-24[HLSL] Avoid putting the byval attribute on out and inout parameters (#150495)Deric C.1-14/+22
Fixes #148063 by preventing the ByVal attribute from being placed on out and inout function parameters which causes them to be eliminated by the Dead Store Elimination (DSE) pass.
2025-07-19Reland [Clang] Make the SizeType, SignedSizeType and PtrdiffType be named ↵YexuanXiao1-1/+1
sugar types (#149613) The checks for the 'z' and 't' format specifiers added in the original PR #143653 had some issues and were overly strict, causing some build failures and were consequently reverted at https://github.com/llvm/llvm-project/commit/4c85bf2fe8042c855c9dd5be4b02191e9d071ffd. In the latest commit https://github.com/llvm/llvm-project/pull/149613/commits/27c58629ec76a703fde9c0b99b170573170b4a7a, I relaxed the checks for the 'z' and 't' format specifiers, so warnings are now only issued when they are used with mismatched types. The original intent of these checks was to diagnose code that assumes the underlying type of `size_t` is `unsigned` or `unsigned long`, for example: ```c printf("%zu", 1ul); // Not portable, but not an error when size_t is unsigned long ``` However, it produced a significant number of false positives. This was partly because Clang does not treat the `typedef` `size_t` and `__size_t` as having a common "sugar" type, and partly because a large amount of existing code either assumes `unsigned` (or `unsigned long`) is `size_t`, or they define the equivalent of size_t in their own way (such as sanitizer_internal_defs.h).https://github.com/llvm/llvm-project/blob/2e67dcfdcd023df2f06e0823eeea23990ce41534/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h#L203
2025-07-18[clang][CodeGen] Set `dead_on_return` when passing arguments indirectlyAntonio Frighetto1-1/+14
Let Clang emit `dead_on_return` attribute on pointer arguments that are passed indirectly, namely, large aggregates that the ABI mandates be passed by value; thus, the parameter is destroyed within the callee. Writes to such arguments are not observable by the caller after the callee returns. This should desirably enable further MemCpyOpt/DSE optimizations. Previous discussion: https://discourse.llvm.org/t/rfc-add-dead-on-return-attribute/86871.
2025-07-17Revert "[Clang] Make the SizeType, SignedSizeType and PtrdiffType be named ↵Kazu Hirata1-1/+1
sugar types instead of built-in types (#143653)" This reverts commit c27e283cfbca2bd22f34592430e98ee76ed60ad8. A builbot failure has been reported: https://lab.llvm.org/buildbot/#/builders/186/builds/10819/steps/10/logs/stdio I'm also getting a large number of warnings related to %zu and %zx.
2025-07-17[Clang] Make the SizeType, SignedSizeType and PtrdiffType be named sugar ↵YexuanXiao1-1/+1
types instead of built-in types (#143653) Including the results of `sizeof`, `sizeof...`, `__datasizeof`, `__alignof`, `_Alignof`, `alignof`, `_Countof`, `size_t` literals, and signed `size_t` literals, the results of pointer-pointer subtraction and checks for standard library functions (and their calls). The goal is to enable clang and downstream tools such as clangd and clang-tidy to provide more portable hints and diagnostics. The previous discussion can be found at #136542. This PR implements this feature by introducing a new subtype of `Type` called `PredefinedSugarType`, which was considered appropriate in discussions. I tried to keep `PredefinedSugarType` simple enough yet not limited to `size_t` and `ptrdiff_t` so that it can be used for other purposes. `PredefinedSugarType` wraps a canonical `Type` and provides a name, conceptually similar to a compiler internal `TypedefType` but without depending on a `TypedefDecl` or a source file. Additionally, checks for the `z` and `t` format specifiers in format strings for `scanf` and `printf` were added. It will precisely match expressions using `typedef`s or built-in expressions. The affected tests indicates that it works very well. Several code require that `SizeType` is canonical, so I kept `SizeType` to its canonical form. The failed tests in CI are allowed to fail. See the [comment](https://github.com/llvm/llvm-project/pull/135386#issuecomment-3049426611) in another PR #135386.
2025-06-24Add support for Windows Secure Hot-Patching (redo) (#145565)sivadeilra1-0/+7
(This is a re-do of #138972, which had a minor warning in `Clang.cpp`.) This PR adds some of the support needed for Windows hot-patching. Windows implements a form of hot-patching. This allows patches to be applied to Windows apps, drivers, and the kernel, without rebooting or restarting any of these components. Hot-patching is a complex technology and requires coordination between the OS, compilers, linkers, and additional tools. This PR adds support to Clang and LLVM for part of the hot-patching process. It enables LLVM to generate the required code changes and to generate CodeView symbols which identify hot-patched functions. The PR provides new command-line arguments to Clang which allow developers to identify the list of functions that need to be hot-patched. This PR also allows LLVM to directly receive the list of functions to be modified, so that language front-ends which have not yet been modified (such as Rust) can still make use of hot-patching. This PR: * Adds a `MarkedForWindowsHotPatching` LLVM function attribute. This attribute indicates that a function should be _hot-patched_. This generates a new CodeView symbol, `S_HOTPATCHFUNC`, which identifies any function that has been hot-patched. This attribute also causes accesses to global variables to be indirected through a `_ref_*` global variable. This allows hot-patched functions to access the correct version of a global variable; the hot-patched code needs to access the variable in the _original_ image, not the patch image. * Adds a `AllowDirectAccessInHotPatchFunction` LLVM attribute. This attribute may be placed on global variable declarations. It indicates that the variable may be safely accessed without the `_ref_*` indirection. * Adds two Clang command-line parameters: `-fms-hotpatch-functions-file` and `-fms-hotpatch-functions-list`. The `-file` flag may point to a text file, which contains a list of functions to be hot-patched (one function name per line). The `-list` flag simply directly identifies functions to be patched, using a comma-separated list. These two command-line parameters may also be combined; the final set of functions to be hot-patched is the union of the two sets. * Adds similar LLVM command-line parameters: `--ms-hotpatch-functions-file` and `--ms-hotpatch-functions-list`. * Adds integration tests for both LLVM and Clang. * Adds support for dumping the new `S_HOTPATCHFUNC` CodeView symbol. Although the flags are redundant between Clang and LLVM, this allows additional languages (such as Rust) to take advantage of hot-patching support before they have been modified to generate the required attributes. Credit to @dpaoliello, who wrote the original form of this patch.
2025-06-24Revert "Add support for Windows Secure Hot-Patching" (#145553)Qinkun Bao1-7/+0
Reverts llvm/llvm-project#138972
2025-06-24Add support for Windows Secure Hot-Patching (#138972)sivadeilra1-0/+7
This PR adds some of the support needed for Windows hot-patching. Windows implements a form of hot-patching. This allows patches to be applied to Windows apps, drivers, and the kernel, without rebooting or restarting any of these components. Hot-patching is a complex technology and requires coordination between the OS, compilers, linkers, and additional tools. This PR adds support to Clang and LLVM for part of the hot-patching process. It enables LLVM to generate the required code changes and to generate CodeView symbols which identify hot-patched functions. The PR provides new command-line arguments to Clang which allow developers to identify the list of functions that need to be hot-patched. This PR also allows LLVM to directly receive the list of functions to be modified, so that language front-ends which have not yet been modified (such as Rust) can still make use of hot-patching. This PR: * Adds a `MarkedForWindowsHotPatching` LLVM function attribute. This attribute indicates that a function should be _hot-patched_. This generates a new CodeView symbol, `S_HOTPATCHFUNC`, which identifies any function that has been hot-patched. This attribute also causes accesses to global variables to be indirected through a `_ref_*` global variable. This allows hot-patched functions to access the correct version of a global variable; the hot-patched code needs to access the variable in the _original_ image, not the patch image. * Adds a `AllowDirectAccessInHotPatchFunction` LLVM attribute. This attribute may be placed on global variable declarations. It indicates that the variable may be safely accessed without the `_ref_*` indirection. * Adds two Clang command-line parameters: `-fms-hotpatch-functions-file` and `-fms-hotpatch-functions-list`. The `-file` flag may point to a text file, which contains a list of functions to be hot-patched (one function name per line). The `-list` flag simply directly identifies functions to be patched, using a comma-separated list. These two command-line parameters may also be combined; the final set of functions to be hot-patched is the union of the two sets. * Adds similar LLVM command-line parameters: `--ms-hotpatch-functions-file` and `--ms-hotpatch-functions-list`. * Adds integration tests for both LLVM and Clang. * Adds support for dumping the new `S_HOTPATCHFUNC` CodeView symbol. Although the flags are redundant between Clang and LLVM, this allows additional languages (such as Rust) to take advantage of hot-patching support before they have been modified to generate the required attributes. Credit to @dpaoliello, who wrote the original form of this patch.
2025-06-18[clang] Use TargetInfo to determine device kernel calling convention (#144728)Nick Sarnie1-11/+2
We should abstract this logic away to `TargetInfo`. See https://github.com/llvm/llvm-project/pull/137882 for more information. --------- Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-06-06[clang] Strip away lambdas (NFC) (#143226)Kazu Hirata1-3/+2
We don't need lambdas here.
2025-06-06[ubsan] Add more -fsanitize-annotate-debug-info checks (#141997)Thurston Dang1-2/+2
This extends https://github.com/llvm/llvm-project/pull/138577 to more UBSan checks, by changing SanitizerDebugLocation (formerly SanitizerScope) to add annotations if enabled for the specified ordinals. Annotations will use the ordinal name if there is exactly one ordinal specified in the SanitizerDebugLocation; otherwise, it will use the handler name. Updates the tests from https://github.com/llvm/llvm-project/pull/141814. --------- Co-authored-by: Vitaly Buka <vitalybuka@google.com>
2025-06-05[clang] Simplify device kernel attributes (#137882)Nick Sarnie1-10/+20
We have multiple different attributes in clang representing device kernels for specific targets/languages. Refactor them into one attribute with different spellings to make it more easily scalable for new languages/targets. --------- Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-06-04[KeyInstr][Clang] Ret atom (#134652)Orlando Cazalet-Hyams1-4/+14
This patch is part of a stack that teaches Clang to generate Key Instructions metadata for C and C++. When returning a value, stores to the `retval` allocas and branches to `return` block are put in the same atom group. They are both rank 1, which could in theory introduce an extra step in some optimized code. This low risk currently feels an acceptable for keeping the code a bit simpler (as opposed to adding scaffolding to make the store rank 2). In the case of a single return (no control flow) the return instruction inherits the atom group of the branch to the return block when the blocks get folded togather. RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668 The feature is only functional in LLVM if LLVM is built with CMake flag LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.
2025-06-03[KeyInstr][Clang] Coerced store atoms (#134653)Orlando Cazalet-Hyams1-8/+15
[KeyInstr][Clang] Coerced store atoms This patch is part of a stack that teaches Clang to generate Key Instructions metadata for C and C++. RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668 The feature is only functional in LLVM if LLVM is built with CMake flag LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.
2025-06-02[CodeGen] Move CodeGenPGO behind unique_ptr (NFC) (#142155)Nikita Popov1-1/+2
The InstrProf headers are very expensive. Avoid including them in all of CodeGen/ by moving the CodeGenPGO member behind a unqiue_ptr. This reduces clang build time by 0.8%.
2025-05-20[clang][AArch64] Move initialization of ptrauth-* function attrs (#140277)Anatoly Trosinenko1-0/+5
Move the initialization of ptrauth-* function attributes near the initialization of branch protection attributes. The semantics of these groups of attributes partially overlaps, so handle both groups in getDefaultFunctionAttributes() and setTargetAttributes() functions to prevent getting them out of sync. This fixes C++ TLS wrappers.
2025-05-14[RISCV] Improve casting between i1 scalable vectors and i8 fixed vectors for ↵Craig Topper1-5/+15
-mrvv-vector-bits (#139190) For i1 vectors, we used an i8 fixed vector as the storage type. If the known minimum number of elements of the scalable vector type is less than 8, we were doing the cast through memory. This used a load or store from a fixed vector alloca. If is less than 8, DataLayout indicates that the load/store reads/writes vscale bytes even if vscale is known and vscale*X is less than or equal to 8. This means the load or store is outside the bounds of the fixed size alloca as far as DataLayout is concerned leading to undefined behavior. This patch avoids this by widening the i1 scalable vector type with zero elements until it is divisible by 8. This allows it be bitcasted to/from an i8 scalable vector. We then insert or extract the i8 fixed vector into this type. Hopefully this enables #130973 to be accepted.
2025-05-09[aarch64][x86][win] Add compiler support for MSVC's /funcoverride flag ↵Daniel Paoliello1-0/+4
(Windows kernel loader replaceable functions) (#125320) Adds support for MSVC's undocumented `/funcoverride` flag, which marks functions as being replaceable by the Windows kernel loader. This is used to allow functions to be upgraded depending on the capabilities of the current processor (e.g., the kernel can be built with the naive implementation of a function, but that function can be replaced at boot with one that uses SIMD instructions if the processor supports them). For each marked function we need to generate: * An undefined symbol named `<name>_$fo$`. * A defined symbol `<name>_$fo_default$` that points to the `.data` section (anywhere in the data section, it is assumed to be zero sized). * An `/ALTERNATENAME` linker directive that points from `<name>_$fo$` to `<name>_$fo_default$`. This is used by the MSVC linker to generate the appropriate metadata in the Dynamic Value Relocation Table. Marked function must never be inlined (otherwise those inline sites can't be replaced). Note that I've chosen to implement this in AsmPrinter as there was no way to create a `GlobalVariable` for `<name>_$fo$` that would result in a symbol being emitted (as nothing consumes it and it has no initializer). I tried to have `llvm.used` and `llvm.compiler.used` point to it, but this didn't help. Within LLVM I referred to this feature as "loader replaceable" as "function override" already has a different meaning to C++ developers... I also took the opportunity to extract the feature symbol generation code used by both AArch64 and X86 into a common function in AsmPrinter.
2025-05-09clang: Remove dest LangAS argument from performAddrSpaceCast (#138866)Matt Arsenault1-8/+3
It isn't used and is redundant with the result pointer type argument. A more reasonable API would only have LangAS parameters, or IR parameters, not both. Not all values have a meaningful value for this. I'm also not sure why we have this at all, it's not overridden by any targets and further simplification is possible.
2025-05-09clang: Read the address space from the ABIArgInfo (#138865)Matt Arsenault1-4/+4
Do not assume it's the alloca address space, we have an explicit address space to use for the argument already. Also use the original value's type instead of assuming DefaultAS.
2025-05-09clang/OpenCL: Fix special casing OpenCL in call emission (#138864)Matt Arsenault1-12/+7
This essentially reverts 1bf1a156d673. OpenCL's handling of address spaces has always been a mess, but it's better than it used to be so this hack appears to be unnecessary now. None of the code here should really depend on the language or language address space. The ABI address space to use is already explicit in the ABIArgInfo, so use that instead of guessing it has anything to do with LangAS::Default or getASTAllocaAddressSpace. The below usage of LangAS::Default and getASTAllocaAddressSpace are also suspect, but appears to be a more involved and separate fix.
2025-05-07[clang] Handle CC attrs for UEFI (#138935)Prabhu Rajasekaran1-5/+8
UEFI's default ABI is MS ABI. Handle the calling convention attributes accordingly.
2025-05-07[clang][NFC] Fix some more incorrectly formatted comments (#138342)Nick Sarnie1-14/+14
More fixes based on https://github.com/llvm/llvm-project/pull/138036 --------- Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-05-02[IRBuilder] Add versions of createInsertVector/createExtractVector that take ↵Craig Topper1-7/+3
a uint64_t index. (#138324) Most callers want a constant index. Instead of making every caller create a ConstantInt, we can do it in IRBuilder. This is similar to createInsertElement/createExtractElement.
2025-05-02[clang][NFC] Fix some clang-format mistakes (#138036)Nick Sarnie1-3/+2
Fixes for https://github.com/llvm/llvm-project/pull/138000 Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-04-30[clang][NFC] Format two files with CallingConv switches (#138000)Nick Sarnie1-237/+232
I'm planning on modifying this code so format it so we can pass the formatting check. Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-04-25[Clang][CodeGen] Emit fake uses before musttail calls (#136867)Stephen Tozer1-1/+9
Fixes the issue reported following the merge of #118026. When a valid `musttail` call is made, the function it is made from must return immediately after the call; if there are any cleanups left in the function, then an error is triggered. This is not necessary for fake uses however - it is perfectly valid to simply emit the fake use "cleanup" code before the tail call, and indeed LLVM will automatically move any fake uses following a tail call to come before the tail call. Therefore, this patch specifically choose to handle fake use cleanups when a musttail call is present by simply emitting them immediately before the call.
2025-04-24[clang] Ensure correct copying of records with authenticated fields (#136783)Oliver Hunt1-1/+1
When records contain fields with pointer authentication, even simple copies can require additional work be performed. This patch contains the core functionality required to handle user defined structs, as well as the implicitly constructed structs for blocks, etc. Co-authored-by: Ahmed Bougacha Co-authored-by: Akira Hatanaka Co-authored-by: John Mccall
2025-04-17[SYCL] Basic code generation for SYCL kernel caller offload entry point ↵Tom Honermann1-0/+11
functions. (#133030) A function declared with the `sycl_kernel_entry_point` attribute, sometimes called a SYCL kernel entry point function, specifies a pattern from which the parameters and body of an offload entry point function, sometimes called a SYCL kernel caller function, are derived. SYCL kernel caller functions are emitted during SYCL device compilation. Their parameters and body are derived from the `SYCLKernelCallStmt` statement and `OutlinedFunctionDecl` declaration associated with their corresponding SYCL kernel entry point function. A distinct SYCL kernel caller function is generated for each SYCL kernel entry point function defined as a non-inline function or ODR-used in the translation unit. The name of each SYCL kernel caller function is parameterized by the SYCL kernel name type specified by the `sycl_kernel_entry_point` attribute attached to the corresponding SYCL kernel entry point function. For the moment, the Itanium ABI mangled name for typeinfo data (`_ZTS<type>`) is used to name these functions; a future change will switch to a more appropriate naming scheme. The calling convention used for a SYCL kernel caller function is target dependent. Support for AMDGCN, NVPTX, and SPIR targets is currently provided. These functions are required to observe the language restrictions for SYCL devices as specified by the SYCL 2020 specification; this includes a forward progress guarantee and prohibits recursion. Only SYCL kernel caller functions, functions declared as `SYCL_EXTERNAL`, and functions directly or indirectly referenced from those functions should be emitted during device compilation. Pruning of other declarations has not yet been implemented. --------- Co-authored-by: Elizabeth Andrews <elizabeth.andrews@intel.com>
2025-04-16[NFC][Clang] Introduce type aliases to replace use of auto in ↵Tom Honermann1-38/+37
clang/lib/CodeGen/CGCall.cpp. (#135861) CGCall.cpp declares several functions with a return type that is an explicitly spelled out specialization of `SmallVector`. Previously, `auto` was used in several places to avoid repeating the long type name; a use that Clang maintainers find unjustified. This change introduces type aliases and replaces the existing uses of `auto` with the corresponding alias name.
2025-04-08[Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (#115821)Aniket Lal1-6/+15
This feature is currently not supported in the compiler. To facilitate this we emit a stub version of each kernel function body with different name mangling scheme, and replaces the respective kernel call-sites appropriately. Fixes https://github.com/llvm/llvm-project/issues/60313 D120566 was an earlier attempt made to upstream a solution for this issue. --------- Co-authored-by: anikelal <anikelal@amd.com>
2025-04-03[CodeGen] Don't include CGDebugInfo.h in CodeGenFunction.h (NFC) (#134100)Nikita Popov1-0/+1
This is an expensive header, only include it where needed. Move some functions out of line to achieve that. This reduces time to build clang by ~0.5% in terms of instructions retired.
2025-03-31[clang] Automatically add the `returns_twice` attribute to certain functions ↵Alan Zhao1-0/+9
even if `-fno-builtin` is set (#133511) Certain functions require the `returns_twice` attribute in order to produce correct codegen. However, `-fno-builtin` removes all knowledge of functions that require this attribute, so this PR modifies Clang to add the `returns_twice` attribute even if `-fno-builtin` is set. This behavior is also consistent with what GCC does. It's not (easily) possible to get the builtin information from `Builtins.td` because `-fno-builtin` causes Clang to never initialize any builtins, so functions never get tokenized as functions/builtins that require `returns_twice`. Therefore, the most straightforward solution is to explicitly hard code the function names that require `returns_twice`. Fixes #122840
2025-03-28[CodeGen] Use llvm::reverse (NFC) (#133550)Kazu Hirata1-1/+1
2025-03-08[Clang] Fix typo 'dereferencable' to 'dereferenceable' (#116761)Liberty1-1/+1
This patch corrects the typo 'dereferencable' to 'dereferenceable' in CGCall.cpp. The typo is located within a comment inside the `void CodeGenModule::ConstructAttributeList` function.
2025-03-03[RISCV][VLS] Support RISCV VLS calling convention (#100346)Brandon Wu1-0/+50
This patch adds a function attribute `riscv_vls_cc` for RISCV VLS calling convention which takes 0 or 1 argument, the argument is the `ABI_VLEN` which is the `VLEN` for passing the fixed-vector arguments, it wraps the argument as a scalable vector(VLA) using the `ABI_VLEN` and uses the corresponding mechanism to handle it. The range of `ABI_VLEN` is [32, 65536], if not specified, the default value is 128. Here is an example of VLS argument passing: Non-VLS call: ``` void original_call(__attribute__((vector_size(16))) int arg) {} => define void @original_call(i128 noundef %arg) { entry: ... ret void } ``` VLS call: ``` void __attribute__((riscv_vls_cc(256))) vls_call(__attribute__((vector_size(16))) int arg) {} => define riscv_vls_cc void @vls_call(<vscale x 1 x i32> %arg) { entry: ... ret void } } ``` The first Non-VLS call passes generic vector argument of 16 bytes by flattened integer. On the contrary, the VLS call uses `ABI_VLEN=256` which wraps the vector to <vscale x 1 x i32> where the number of scalable vector elements is calaulated by: `ORIG_ELTS * RVV_BITS_PER_BLOCK / ABI_VLEN`. Note: ORIG_ELTS = Vector Size / Type Size = 128 / 32 = 4. PsABI PR: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/418 C-API PR: https://github.com/riscv-non-isa/riscv-c-api-doc/pull/68
2025-02-27[clang] Ignore GCC 11 [[malloc(x)]] attributeAlois Klink1-1/+2
Ignore the `[[malloc(x)]]` or `[[malloc(x, 1)]]` function attribute syntax added in [GCC 11][1] and print a warning instead of an error. Unlike `[[malloc]]` with no arguments (which is supported by Clang), GCC uses the one or two argument form to specify a deallocator for GCC's static analyzer. Code currently compiled with `[[malloc(x)]]` or `__attribute((malloc(x)))` fails with the following error: `'malloc' attribute takes no arguments`. [1]: https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;f=gcc/doc/extend.texi;h=dce6c58db87ebf7f4477bd3126228e73e4eeee97#patch6 Fixes: https://github.com/llvm/llvm-project/issues/51607 Partial-Bug: https://github.com/llvm/llvm-project/issues/53152
2025-02-27[clang][CodeGen] Additional fixes for #114062 (#128166)Alex Voicu1-3/+16
This addresses two issues introduced by moving indirect args into an explicit AS (please see <https://github.com/llvm/llvm-project/pull/114062#issuecomment-2659829790> and <https://github.com/llvm/llvm-project/pull/114062#issuecomment-2661158477>): 1. Unconditionally stripping casts from a pre-allocated return slot was incorrect / insufficient (this is illustrated by the `amdgcn_sret_ctor.cpp` test); 2. Putting compiler manufactured sret args in a non default AS can lead to a C-cast (surprisingly) requiring an AS cast (this is illustrated by the `sret_cast_with_nonzero_alloca_as.cpp test). The way we handle (2), by subverting CK_BitCast emission iff a sret arg is involved, is quite naff, but I couldn't think of any other way to use a non default indirect AS and make this case work (hopefully this is a failure of imagination on my part).
2025-02-17[NFC][Clang][CodeGen] Remove vestigial assertion (#127528)Alex Voicu1-16/+0
This removes a vestigial assertion, which would erroneously trigger even though we now correctly handle valid arg mismatches (<https://github.com/llvm/llvm-project/blob/2dda529838e622e7a79b1e26d2899f319fd7e379/clang/lib/CodeGen/CGCall.cpp#L5397>), after #114062 went in.