aboutsummaryrefslogtreecommitdiff
path: root/clang/lib/CodeGen/CodeGenModule.cpp
AgeCommit message (Collapse)AuthorFilesLines
4 days[clang] Use the VFS to get the unique file ID (#160936)Jan Svoboda1-4/+9
This PR uses the VFS to get the unique file ID when printing externalized decls in CUDA instead of going straight to the real file system. This matches the behavior of other input files of the compiler.
5 days[clang] Load `-fembed-offload-object=` through the VFS (#160906)Jan Svoboda1-1/+1
This PR loads the path from `-fembed-offload-object=<path>` through the VFS rather than going straight to the real file system. This matches the behavior of other input files of the compiler. This technically changes behavior in that `-fembed-offload-object=-` no longer loads the file from stdin, but I don't think that was the intention of the original code anyways.
9 days[clang] Load `-fms-secure-hotpatch-functions-file=` through the VFS (#160146)Jan Svoboda1-2/+1
This PR uses the correctly-configured VFS to load the file specified via `-fms-secure-hotpatch-functions-file=`, matching other input files of the compiler.
12 days[PowerPC] fix float ABI selection on ppcle (#154773)DanilaZhebryakov1-1/+2
soft float ABI selection was not taking effect on little-endian powerPC with embedded vectors (e.g. e500v2) leading to errors. (embedded vectors use "extended" GPRs to store floating-point values, and this caused issues with variadic arguments assuming dedicated floating-point registers with hard-float ABI)
14 days[FMV] Set default attributes on the resolver functions (#141573)Anatoly Trosinenko1-7/+27
There is a number of attributes that is expected to be set on functions by default. This patch implements setting more such attributes on the FMV resolver functions generated by Clang. On AArch64, this makes the resolver functions use the default PAC and BTI hardening settings.
2025-09-13[CodeGen][CFI] Generalize transparent union parameters (#158193)Vitaly Buka1-1/+14
According GCC documentation transparent union calling convention is the same as the type of the first member of the union. C++ ignores attribute. Note, it does not generalize args of function pointer args. It's unnecessary with pointer generalization. It will be fixed in followup patch. --------- Co-authored-by: lntue <lntue@google.com>
2025-09-13[NFC][CodeGen][CFI] Add GeneralizePointers parameter to ↵Vitaly Buka1-18/+26
GeneralizeFunctionType (#158191) For #158193 --------- Co-authored-by: Alex Langford <alangford@apple.com>
2025-09-12[NFC][CFI][CodeGen] Move GeneralizeFunctionType out of ↵Vitaly Buka1-5/+10
CreateMetadataIdentifierGeneralized (#158190) For #158193
2025-09-12[NFC][CodeGen][CFI] Extract CreateMetadataIdentifierForFnType (#158189)Vitaly Buka1-0/+7
For #158193
2025-09-11[clang] Use VFS for `-fopenmp-host-ir-file-path` (#156727)Jan Svoboda1-2/+1
This is a follow-up to #150124. This PR makes it so that the `-fopenmp-host-ir-file-path` respects VFS overlays, like any other input file.
2025-09-09[HLSL][DirectX] Add support for `rootsig` as a target environment (#156373)Finn Plummer1-1/+1
This pr implements support for a root signature as a target, as specified [here](https://github.com/llvm/wg-hlsl/blob/main/proposals/0029-root-signature-driver-options.md#target-root-signature-version). This is implemented in the following steps: 1. Add `rootsignature` as a shader model environment type and define `rootsig` as a `target_profile`. Only valid as versions 1.0 and 1.1 2. Updates `HLSLFrontendAction` to invoke a special handling of constructing the `ASTContext` if we are considering an `hlsl` file and with a `rootsignature` target 3. Defines the special handling to minimally instantiate the `Parser` and `Sema` to insert the `RootSignatureDecl` 4. Updates `CGHLSLRuntime` to emit the constructed root signature decl as part of `dx.rootsignatures` with a `null` entry function 5. Updates `DXILRootSignature` to handle emitting a root signature without an entry function 6. Updates `ToolChains/HLSL` to invoke `only-section=RTS0` to strip any other generated information Resolves: https://github.com/llvm/llvm-project/issues/150286. ##### Implementation Considerations Ideally we could invoke this as part of `clang-dxc` without the need of a source file. However, the initialization of the `Parser` and `Lexer` becomes quite complicated to handle this. Technically, we could avoid generating any of the extra information that is removed in step 6. However, it seems better to re-use the logic in `llvm-objcopy` without any need for additional custom logic in `DXILRootSignature`.
2025-09-08[Clang] Add template argument support for {con,de}structor attributes. (#151400)tynasello1-2/+10
Fixes: https://github.com/llvm/llvm-project/issues/67154 {Con, De}structor attributes in Clang only work with integer priorities (inconsistent with GCC). This commit adds support to these attributes for template arguments. Built off of contributions from [abrachet](https://github.com/abrachet) in [#67376](https://github.com/llvm/llvm-project/pull/67376).
2025-09-07[NFC] Change const char* to StringRef (#154179)Chris B1-5/+2
This API takes a const char* with a default nullptr value and immdiately passes it down to an API taking a StringRef. All of the places this is called from are either using compile time string literals, the default argument, or string objects that have known length. Discarding the length known from a calling API to just have to strlen it to call the next layer down that requires a StringRef is a bit silly, so this change updates CodeGenModule::GetAddrOfConstantCString to use StringRef instead of const char* for the GlobalName parameter. It might be worth also replacing the first parameter with an llvm ADT type that avoids allocation, but that change would have wider impact so we should consider it separately.
2025-09-02[clang] Delay checking of `-fopenmp-host-ir-file-path` (#150124)Jan Svoboda1-0/+5
This PR is part of an effort to remove file system usage from the command line parsing code. The reason for that is that it's impossible to do file system access correctly without a configured VFS, and the VFS can only be configured after the command line is parsed. I don't want to intertwine command line parsing and VFS configuration, so I decided to perform the file system access after the command line is parsed and the VFS is configured - ideally right before the file system entity is used for the first time. This patch delays opening the OpenMP host IR file until codegen.
2025-08-27[clang] AST: fix getAs canonicalization of leaf types (#155028)Matheus Izvekov1-1/+2
2025-08-25[clang] NFC: change more places to use Type::getAsTagDecl and friends (#155313)Matheus Izvekov1-2/+1
This changes a bunch of places which use getAs<TagType>, including derived types, just to obtain the tag definition. This is preparation for #155028, offloading all the changes that PR used to introduce which don't depend on any new helpers.
2025-08-25[HLSL][RootSignature] Introduce `HLSLFrontendAction` to implement ↵Finn Plummer1-0/+3
`rootsig-define` (#154639) This pr implements the functionality of `rootsig-define` as described [here](https://github.com/llvm/wg-hlsl/blob/main/proposals/0029-root-signature-driver-options.md#option--rootsig-define). This is accomplished by: - Defining the `fdx-rootsignature-define`, and `rootsig-define` alias, driver options. It simply specifies the name of a macro that will expand to a `LiteralString` to be interpreted as a root signature. - Introduces a new general frontend action wrapper, `HLSLFrontendAction`. This class allows us to introduce `HLSL` specific behaviour on the underlying action (primarily `ASTFrontendAction`). Which will be further extended, or modularly wrapped, when considering future DXC options. - Using `HLSLFrontendAction` we can add a new `PPCallback` that will eagerly parse the root signature specified with `rootsig-define` and push it as a `TopLevelDecl` to `Sema`. This occurs when the macro has been lexed. - Since the root signature is parsed early, before any function declarations, we can then simply attach it to the entry function once it is encountered. Overwriting any applicable root signature attrs. Resolves https://github.com/llvm/llvm-project/issues/150274 ##### Implementation considerations To implement this feature, note that: 1. We need access to all defined macros. These are created as part of the first `Lex` in `Parser::Initialize` after `PP->EnterMainSourceFile` 2. `RootSignatureDecl` must be added to `Sema` before `Consumer->HandleTranslationUnit` is invoked in `ParseAST` Therefore, we can't handle the root signature in `HLSLFrontendAction::ExecuteAction` before (from 1.) or after (from 2.) invoking the underlying `ASTFrontendAction`. This means we could alternatively: - Manually handle this case [here](https://github.com/llvm/llvm-project/blob/ac8f0bb070c9071742b6f6ce03bebc9d87217830/clang/lib/Parse/ParseAST.cpp#L168) before parsing the first top level decl. - Hook into when we [return the entry function decl](https://github.com/llvm/llvm-project/blob/ac8f0bb070c9071742b6f6ce03bebc9d87217830/clang/lib/Parse/Parser.cpp#L1190) and then parse the root signature and override its `RootSignatureAttr`. The proposed solution handles this in the most modular way which should work on any `FrontendAction` that might use the `Parser` without invoking `ParseAST`, and, is not subject to needing to call the hook in multiple different places of function declarators.
2025-08-18[HLSL] Global resource arrays element access (#152454)Helena Kotas1-8/+9
Adds support for accessing individual resources from fixed-size global resource arrays. Design proposal: https://github.com/llvm/wg-hlsl/blob/main/proposals/0028-resource-arrays.md Enables indexing into globally scoped, fixed-size resource arrays to retrieve individual resources. The initialization logic is primarily handled during codegen. When a global resource array is indexed, the codegen translates the `ArraySubscriptExpr` AST node into a constructor call for the corresponding resource record type and binding. To support this behavior, Sema needs to ensure that: - The constructor for the specific resource type is instantiated. - An implicit binding attribute is added to resource arrays that lack explicit bindings (#152452). Closes #145424
2025-08-14[Clang][attr] Add 'cfi_salt' attribute (#141846)Bill Wendling1-3/+12
The 'cfi_salt' attribute specifies a string literal that is used as a "salt" for Control-Flow Integrity (CFI) checks to distinguish between functions with the same type signature. This attribute can be applied to function declarations, function definitions, and function pointer typedefs. This attribute prevents function pointers from being replaced with pointers to functions that have a compatible type, which can be a CFI bypass vector. The attribute affects type compatibility during compilation and CFI hash generation during code generation. Attribute syntax: [[clang::cfi_salt("<salt_string>")]] GNU-style syntax: __attribute__((cfi_salt("<salt_string>"))) - The attribute takes a single string of non-NULL ASCII characters. - It only applies to function types; using it on a non-function type will generate an error. - All function declarations and the function definition must include the attribute and use identical salt values. Example usage: // Header file: #define __cfi_salt(S) __attribute__((cfi_salt(S))) // Convenient typedefs to avoid nested declarator syntax. typedef int (*fp_unsalted_t)(void); typedef int (*fp_salted_t)(void) __cfi_salt("pepper"); struct widget_ops { fp_unsalted_t init; // Regular CFI. fp_salted_t exec; // Salted CFI. fp_unsalted_t teardown; // Regular CFI. }; // bar.c file: static int bar_init(void) { ... } static int bar_salted_exec(void) __cfi_salt("pepper") { ... } static int bar_teardown(void) { ... } static struct widget_generator _generator = { .init = bar_init, .exec = bar_salted_exec, .teardown = bar_teardown, }; struct widget_generator *widget_gen = _generator; // 2nd .c file: int generate_a_widget(void) { int ret; // Called with non-salted CFI. ret = widget_gen.init(); if (ret) return ret; // Called with salted CFI. ret = widget_gen.exec(); if (ret) return ret; // Called with non-salted CFI. return widget_gen.teardown(); } Link: https://github.com/ClangBuiltLinux/linux/issues/1736 Link: https://github.com/KSPP/linux/issues/365 --------- Signed-off-by: Bill Wendling <morbo@google.com> Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
2025-08-11[clang][KCFI] Respect -fsanitize-cfi-icall-generalize-pointers (#152400)Florian Mayer1-32/+35
This flag was previously ignored by KCFI.
2025-08-09[clang] Improve nested name specifier AST representation (#147835)Matheus Izvekov1-11/+17
This is a major change on how we represent nested name qualifications in the AST. * The nested name specifier itself and how it's stored is changed. The prefixes for types are handled within the type hierarchy, which makes canonicalization for them super cheap, no memory allocation required. Also translating a type into nested name specifier form becomes a no-op. An identifier is stored as a DependentNameType. The nested name specifier gains a lightweight handle class, to be used instead of passing around pointers, which is similar to what is implemented for TemplateName. There is still one free bit available, and this handle can be used within a PointerUnion and PointerIntPair, which should keep bit-packing aficionados happy. * The ElaboratedType node is removed, all type nodes in which it could previously apply to can now store the elaborated keyword and name qualifier, tail allocating when present. * TagTypes can now point to the exact declaration found when producing these, as opposed to the previous situation of there only existing one TagType per entity. This increases the amount of type sugar retained, and can have several applications, for example in tracking module ownership, and other tools which care about source file origins, such as IWYU. These TagTypes are lazily allocated, in order to limit the increase in AST size. This patch offers a great performance benefit. It greatly improves compilation time for [stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for `test_on2.cpp` in that project, which is the slowest compiling test, this patch improves `-c` compilation time by about 7.2%, with the `-fsyntax-only` improvement being at ~12%. This has great results on compile-time-tracker as well: ![image](https://github.com/user-attachments/assets/700dce98-2cab-4aa8-97d1-b038c0bee831) This patch also further enables other optimziations in the future, and will reduce the performance impact of template specialization resugaring when that lands. It has some other miscelaneous drive-by fixes. About the review: Yes the patch is huge, sorry about that. Part of the reason is that I started by the nested name specifier part, before the ElaboratedType part, but that had a huge performance downside, as ElaboratedType is a big performance hog. I didn't have the steam to go back and change the patch after the fact. There is also a lot of internal API changes, and it made sense to remove ElaboratedType in one go, versus removing it from one type at a time, as that would present much more churn to the users. Also, the nested name specifier having a different API avoids missing changes related to how prefixes work now, which could make existing code compile but not work. How to review: The important changes are all in `clang/include/clang/AST` and `clang/lib/AST`, with also important changes in `clang/lib/Sema/TreeTransform.h`. The rest and bulk of the changes are mostly consequences of the changes in API. PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just for easier to rebasing. I plan to rename it back after this lands. Fixes #136624 Fixes https://github.com/llvm/llvm-project/issues/43179 Fixes https://github.com/llvm/llvm-project/issues/68670 Fixes https://github.com/llvm/llvm-project/issues/92757
2025-07-23[NFC][Clang][FMV] Make FMV priority data type future proof. (#150079)Alexandros Lamprineas1-3/+4
FMV priority is the returned value of a polymorphic function. On RISC-V and X86 targets a 32-bit value is enough. On AArch64 we currently need 64 bits and we will soon exceed that. APInt seems to be a suitable replacement for uint64_t, presumably with minimal compile time overhead. It allows bit manipulation, comparison and variable bit width.
2025-07-15Remove Native Client support (#133661)Brad Smith1-3/+1
Remove the Native Client support now that it has finally reached end of life.
2025-07-14[clang][ObjC][PAC] Add ptrauth protections to objective-c (#147899)Oliver Hunt1-1/+3
This PR introduces the use of pointer authentication to objective-c[++]. This includes: * __ptrauth qualifier support for ivars * protection of isa and super fields * protection of SEL typed ivars * protection of class_ro_t data * protection of methodlist pointers and content
2025-07-08[Clang][Wasm] Set __float128 alignment to 64 for emscripten (#146494)Nikita Popov1-2/+1
https://reviews.llvm.org/D104808 set the alignment of long double to 64 bits. This is also the alignment specified in the LLVM data layout. However, the alignment of __float128 was left at 128 bits. I assume that this was just an oversight, rather than an intentional divergence. The C ABI document currently does not make any statement about `__float128`: https://github.com/WebAssembly/tool-conventions/blob/main/BasicCABI.md
2025-07-01[Clang] Verify data layout consistency (#144720)Nikita Popov1-0/+69
Verify that the alignments specified by clang TargetInfo match the alignments specified by LLVM data layout, which will hopefully prevent accidental mismatches in the future. This currently contains opt-outs for a number of of existing mismatches. I'm also skipping the verification if options like `-malign-double` are used, or a language that mandates sizes/alignments that differ from C. The verification happens in CodeGen, as we can't have an IR dependency in Basic.
2025-06-24Add support for Windows Secure Hot-Patching (redo) (#145565)sivadeilra1-0/+29
(This is a re-do of #138972, which had a minor warning in `Clang.cpp`.) This PR adds some of the support needed for Windows hot-patching. Windows implements a form of hot-patching. This allows patches to be applied to Windows apps, drivers, and the kernel, without rebooting or restarting any of these components. Hot-patching is a complex technology and requires coordination between the OS, compilers, linkers, and additional tools. This PR adds support to Clang and LLVM for part of the hot-patching process. It enables LLVM to generate the required code changes and to generate CodeView symbols which identify hot-patched functions. The PR provides new command-line arguments to Clang which allow developers to identify the list of functions that need to be hot-patched. This PR also allows LLVM to directly receive the list of functions to be modified, so that language front-ends which have not yet been modified (such as Rust) can still make use of hot-patching. This PR: * Adds a `MarkedForWindowsHotPatching` LLVM function attribute. This attribute indicates that a function should be _hot-patched_. This generates a new CodeView symbol, `S_HOTPATCHFUNC`, which identifies any function that has been hot-patched. This attribute also causes accesses to global variables to be indirected through a `_ref_*` global variable. This allows hot-patched functions to access the correct version of a global variable; the hot-patched code needs to access the variable in the _original_ image, not the patch image. * Adds a `AllowDirectAccessInHotPatchFunction` LLVM attribute. This attribute may be placed on global variable declarations. It indicates that the variable may be safely accessed without the `_ref_*` indirection. * Adds two Clang command-line parameters: `-fms-hotpatch-functions-file` and `-fms-hotpatch-functions-list`. The `-file` flag may point to a text file, which contains a list of functions to be hot-patched (one function name per line). The `-list` flag simply directly identifies functions to be patched, using a comma-separated list. These two command-line parameters may also be combined; the final set of functions to be hot-patched is the union of the two sets. * Adds similar LLVM command-line parameters: `--ms-hotpatch-functions-file` and `--ms-hotpatch-functions-list`. * Adds integration tests for both LLVM and Clang. * Adds support for dumping the new `S_HOTPATCHFUNC` CodeView symbol. Although the flags are redundant between Clang and LLVM, this allows additional languages (such as Rust) to take advantage of hot-patching support before they have been modified to generate the required attributes. Credit to @dpaoliello, who wrote the original form of this patch.
2025-06-24Revert "Add support for Windows Secure Hot-Patching" (#145553)Qinkun Bao1-29/+0
Reverts llvm/llvm-project#138972
2025-06-24Add support for Windows Secure Hot-Patching (#138972)sivadeilra1-0/+29
This PR adds some of the support needed for Windows hot-patching. Windows implements a form of hot-patching. This allows patches to be applied to Windows apps, drivers, and the kernel, without rebooting or restarting any of these components. Hot-patching is a complex technology and requires coordination between the OS, compilers, linkers, and additional tools. This PR adds support to Clang and LLVM for part of the hot-patching process. It enables LLVM to generate the required code changes and to generate CodeView symbols which identify hot-patched functions. The PR provides new command-line arguments to Clang which allow developers to identify the list of functions that need to be hot-patched. This PR also allows LLVM to directly receive the list of functions to be modified, so that language front-ends which have not yet been modified (such as Rust) can still make use of hot-patching. This PR: * Adds a `MarkedForWindowsHotPatching` LLVM function attribute. This attribute indicates that a function should be _hot-patched_. This generates a new CodeView symbol, `S_HOTPATCHFUNC`, which identifies any function that has been hot-patched. This attribute also causes accesses to global variables to be indirected through a `_ref_*` global variable. This allows hot-patched functions to access the correct version of a global variable; the hot-patched code needs to access the variable in the _original_ image, not the patch image. * Adds a `AllowDirectAccessInHotPatchFunction` LLVM attribute. This attribute may be placed on global variable declarations. It indicates that the variable may be safely accessed without the `_ref_*` indirection. * Adds two Clang command-line parameters: `-fms-hotpatch-functions-file` and `-fms-hotpatch-functions-list`. The `-file` flag may point to a text file, which contains a list of functions to be hot-patched (one function name per line). The `-list` flag simply directly identifies functions to be patched, using a comma-separated list. These two command-line parameters may also be combined; the final set of functions to be hot-patched is the union of the two sets. * Adds similar LLVM command-line parameters: `--ms-hotpatch-functions-file` and `--ms-hotpatch-functions-list`. * Adds integration tests for both LLVM and Clang. * Adds support for dumping the new `S_HOTPATCHFUNC` CodeView symbol. Although the flags are redundant between Clang and LLVM, this allows additional languages (such as Rust) to take advantage of hot-patching support before they have been modified to generate the required attributes. Credit to @dpaoliello, who wrote the original form of this patch.
2025-06-19[HIP] Emit the CUID value in the module with the new driver (#144570)Joseph Huber1-1/+1
Summary: This is a weird point of divergence that was not updated when the new driver switched to using the CUID method, which is also apparently critical for SPIR-V compilation not failing? Somehow if we don't emit this global than the `llvm.compiler.used` global uses AS(0) which makes SPIR-V unhappy, but with this global it's AS(4) which makes it happy. Either way, this should be fixed.
2025-06-16[win][x64] Unwind v2 3/n: Add support for requiring unwind v2 to be used ↵Daniel Paoliello1-2/+4
(equivalent to MSVC's /d2epilogunwindrequirev2) (#143577) #129142 added support for emitting Windows x64 unwind v2 information, but it was "best effort". If any function didn't follow the requirements for v2 it was silently downgraded to v1. There are some parts of Windows (specifically kernel-mode code running on Xbox) that require v2, hence we need the ability to fail the compilation if v2 can't be used. This change also adds a heuristic to check if there might be too many unwind codes, it's currently conservative (i.e., assumes that certain prolog instructions will use the maximum number of unwind codes). Future work: attempting to chain unwind info across multiple tables if there are too many unwind codes due to epilogs and adding a heuristic to detect if an epilog will be too far from the end of the function.
2025-06-16[HLSL] Use hidden visibility for external linkage. (#140292)Steven Perron1-0/+5
Implements https://github.com/llvm/wg-hlsl/blob/main/proposals/0026-symbol-visibility.md. The change is to stop using the `hlsl.export` attribute. Instead, symbols with "program linkage" in HLSL will have export linkage with default visibility, and symbols with "external linkage" in HLSL will have export linkage with hidden visibility.
2025-06-13[CodeGen][COFF] Always emit CodeView compiler info on Windows targets (#142970)Jacek Caban1-1/+6
MSVC always emits minimal CodeView metadata with compiler information, even when debug info is otherwise disabled. Other tools may rely on this metadata being present. For example, linkers use it to determine whether hotpatching is enabled for the object file.
2025-06-13Fix and reapply IR PGO support for Flang (#142892)FYK1-1/+1
This PR resubmits the changes from #136098, which was previously reverted due to a build failure during the linking stage: ``` undefined reference to `llvm::DebugInfoCorrelate' undefined reference to `llvm::ProfileCorrelate' ``` The root cause was that `llvm/lib/Frontend/Driver/CodeGenOptions.cpp` references symbols from the `Instrumentation` component, but the `LINK_COMPONENTS` in the `llvm/lib/Frontend/CMakeLists.txt` for `LLVMFrontendDriver` did not include it. As a result, linking failed in configurations where these components were not transitively linked. ### Fix: This updated patch explicitly adds `Instrumentation` to `LINK_COMPONENTS` in the relevant `llvm/lib/Frontend/CMakeLists.txt` file to ensure the required symbols are properly resolved. --------- Co-authored-by: ict-ql <168183727+ict-ql@users.noreply.github.com> Co-authored-by: Chyaka <52224511+liliumshade@users.noreply.github.com> Co-authored-by: Tarun Prabhu <tarunprabhu@gmail.com>
2025-06-08[Clang] Support constexpr asm at global scope. (#143268)Corentin Jabot1-1/+1
I previously failed to realize this feature existed... Fixes #137459 Fixes #143242
2025-06-05Add -funique-source-file-identifier option.Peter Collingbourne1-2/+7
This option complements -funique-source-file-names and allows the user to use a different unique identifier than the source file path. Reviewers: teresajohnson Reviewed By: teresajohnson Pull Request: https://github.com/llvm/llvm-project/pull/142901
2025-06-05[clang] Simplify device kernel attributes (#137882)Nick Sarnie1-4/+7
We have multiple different attributes in clang representing device kernels for specific targets/languages. Refactor them into one attribute with different spellings to make it more easily scalable for new languages/targets. --------- Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-06-04[HLSL][SPIR-V] Implement vk::ext_builtin_input attribute (#138530)Nathan Gauër1-1/+17
This variable attribute is used in HLSL to add Vulkan specific builtins in a shader. The attribute is documented here: https://github.com/microsoft/hlsl-specs/blob/17727e88fd1cb09013cb3a144110826af05f4dd5/proposals/0011-inline-spirv.md Those variable, even if marked as `static` are externally initialized by the pipeline/driver/GPU. This is handled by moving them to a specific address space `hlsl_input`, also added by this commit. The design for input variables in Clang can be found here: https://github.com/llvm/wg-hlsl/blob/355771361ef69259fef39a65caef8bff9cb4046d/proposals/0019-spirv-input-builtin.md Co-authored-by: Justin Bogner <mail@justinbogner.com>
2025-06-02[Clang][FMV] Stop emitting implicit default version using target_clones. ↵Alexandros Lamprineas1-13/+12
(#141808) With the current behavior the following example yields a linker error: "multiple definition of `foo.default'" // Translation Unit 1 __attribute__((target_clones("dotprod, sve"))) int foo(void) { return 1; } // Translation Unit 2 int foo(void) { return 0; } __attribute__((target_version("dotprod"))) int foo(void); __attribute__((target_version("sve"))) int foo(void); int bar(void) { return foo(); } That is because foo.default is generated twice. As a user I don't find this particularly intuitive. If I wanted the default to be generated in TU1 I'd rather write target_clones("dotprod, sve", "default") explicitly. When changing the code I noticed that the RISC-V target defers the resolver emission when encountering a target_version definition. This seems accidental since it only makes sense for AArch64, where we only emit a resolver once we've processed the entire TU, and only if the default version is present. I've changed this so that RISC-V immediately emmits the resolver. I adjusted the codegen tests since the functions now appear in a different order. Implements https://github.com/ARM-software/acle/pull/377
2025-05-30Revert "Add IR Profile-Guided Optimization (IR PGO) support to the Flang ↵Tarun Prabhu1-1/+1
compiler" (#142159) Reverts llvm/llvm-project#136098
2025-05-30Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler ↵FYK1-1/+1
(#136098) This patch implements IR-based Profile-Guided Optimization support in Flang through the following flags: - `-fprofile-generate` for instrumentation-based profile generation - `-fprofile-use=<dir>/file` for profile-guided optimization Resolves #74216 (implements IR PGO support phase) **Key changes:** - Frontend flag handling aligned with Clang/GCC semantics - Instrumentation hooks into LLVM PGO infrastructure - LIT tests verifying: - Instrumentation metadata generation - Profile loading from specified path - Branch weight attribution (IR checks) **Tests:** - Added gcc-flag-compatibility.f90 test module verifying: - Flag parsing boundary conditions - IR-level profile annotation consistency - Profile input path normalization rules - SPEC2006 benchmark results will be shared in comments For details on LLVM's PGO framework, refer to [Clang PGO Documentation](https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization). This implementation was developed by [XSCC Compiler Team](https://github.com/orgs/OpenXiangShan/teams/xscc). --------- Co-authored-by: ict-ql <168183727+ict-ql@users.noreply.github.com> Co-authored-by: Tom Eccles <t@freedommail.info>
2025-05-27Avoid emitting a linker options section in the compiler if it is empty. ↵dyung1-3/+5
(#139821) Recently in some of our internal testing, we noticed that the compiler was sometimes generating an empty linker.options section which seems unnecessary. This proposed change causes the compiler to simply omit emitting the linker.options section if it is empty.
2025-05-27[HIP] disable sanitizer for `__hip_cuid` (#141581)Yaxun (Sam) Liu1-0/+1
Global variable `__hip_cuid_*` is for identifying purpose and does not need sanitization, therefore disable it for sanitizers.
2025-05-14[Cygwin] Global symbols should be external by default (#139797)Tomohiro Kashiwada1-1/+1
Behaves as same as both of Clang and GCC targetting MinGW. Required for compatibility for Cygwin-GCC. Divided from https://github.com/llvm/llvm-project/pull/138773
2025-05-13Emit nested unused enum types with -fno-eliminate-unused-debug-types (#137818)ykhatav1-1/+1
Unused types are retained in the debug info when -fno-eliminate-unused-debug-types is specified. However, unused nested enums were not being emitted even with this option. This patch fixes the missing emission of unused nested enums with -fno-eliminate-unused-debug-types
2025-05-11[clang] Use StringRef::consume_front (NFC) (#139472)Kazu Hirata1-5/+1
2025-05-09[win][x64] Unwind v2 3/n: Add support for emitting unwind v2 information ↵Daniel Paoliello1-0/+4
(equivalent to MSVC /d2epilogunwind) (#129142) Adds support for emitting Windows x64 Unwind V2 information, includes support `/d2epilogunwind` in clang-cl. Unwind v2 adds information about the epilogs in functions such that the unwinder can unwind even in the middle of an epilog, without having to disassembly the function to see what has or has not been cleaned up. Unwind v2 requires that all epilogs are in "canonical" form: * If there was a stack allocation (fixed or dynamic) in the prolog, then the first instruction in the epilog must be a stack deallocation. * Next, for each `PUSH` in the prolog there must be a corresponding `POP` instruction in exact reverse order. * Finally, the epilog must end with the terminator. This change adds a pass to validate epilogs in modules that have Unwind v2 enabled and, if they pass, emits new pseudo instructions to MC that 1) note that the function is using unwind v2 and 2) mark the start of the epilog (this is either the first `POP` if there is one, otherwise the terminator instruction). If a function does not meet these requirements, it is downgraded to Unwind v1 (i.e., these new pseudo instructions are not emitted). Note that the unwind v2 table only marks the size of the epilog in the "header" unwind code, but it's possible for epilogs to use different terminator instructions thus they are not all the same size. As a work around for this, MC will assume that all terminator instructions are 1-byte long - this still works correctly with the Windows unwinder as it is only using the size to do a range check to see if a thread is in an epilog or not, and since the instruction pointer will never be in the middle of an instruction and the terminator is always at the end of an epilog the range check will function correctly. This does mean, however, that the "at end" optimization (where an epilog unwind code can be elided if the last epilog is at the end of the function) can only be used if the terminator is 1-byte long. One other complication with the implementation is that the unwind table for a function is emitted during streaming, however we can't calculate the distance between an epilog and the end of the function at that time as layout hasn't been completed yet (thus some instructions may be relaxed). To work around this, epilog unwind codes are emitted via a fixup. This also means that we can't pre-emptively downgrade a function to Unwind v1 if one of these offsets is too large, so instead we raise an error (but I've passed through the location information, so the user will know which of their functions is problematic).
2025-05-09clang: Remove dest LangAS argument from performAddrSpaceCast (#138866)Matt Arsenault1-3/+3
It isn't used and is redundant with the result pointer type argument. A more reasonable API would only have LangAS parameters, or IR parameters, not both. Not all values have a meaningful value for this. I'm also not sure why we have this at all, it's not overridden by any targets and further simplification is possible.
2025-05-07[Clang][OpenCL][AMDGPU] OpenCL Kernel stubs should be assigned alwaysinline ↵Aniket Lal1-0/+16
attribute (#137769) OpenCL Kernels body is emitted as stubs and the kernel is emitted as call to respective stub. (https://github.com/llvm/llvm-project/pull/115821). The stub function should be alwaysinlined, since call to stub can cause performance drop. Co-authored-by: anikelal <anikelal@amd.com>
2025-04-30[nfc][clang] Rename function (#137874)Prabhu Rajasekaran1-3/+3
Rename function to meet the coding guidelines.