aboutsummaryrefslogtreecommitdiff
path: root/clang/lib/CodeGen/CodeGenAction.cpp
AgeCommit message (Collapse)AuthorFilesLines
12 days[clang][modules] Serialize `CodeGenOptions` (#146422)Jan Svoboda1-1/+1
Some `LangOptions` duplicate their `CodeGenOptions` counterparts. My understanding is that this was done solely because some infrastructure (like preprocessor initialization, serialization, module compatibility checks, etc.) were only possible/convenient for `LangOptions`. This PR implements the missing support for `CodeGenOptions`, which makes it possible to remove some duplicate `LangOptions` fields and simplify the logic. Motivated by https://github.com/llvm/llvm-project/pull/146342.
2025-06-13Fix and reapply IR PGO support for Flang (#142892)FYK1-2/+2
This PR resubmits the changes from #136098, which was previously reverted due to a build failure during the linking stage: ``` undefined reference to `llvm::DebugInfoCorrelate' undefined reference to `llvm::ProfileCorrelate' ``` The root cause was that `llvm/lib/Frontend/Driver/CodeGenOptions.cpp` references symbols from the `Instrumentation` component, but the `LINK_COMPONENTS` in the `llvm/lib/Frontend/CMakeLists.txt` for `LLVMFrontendDriver` did not include it. As a result, linking failed in configurations where these components were not transitively linked. ### Fix: This updated patch explicitly adds `Instrumentation` to `LINK_COMPONENTS` in the relevant `llvm/lib/Frontend/CMakeLists.txt` file to ensure the required symbols are properly resolved. --------- Co-authored-by: ict-ql <168183727+ict-ql@users.noreply.github.com> Co-authored-by: Chyaka <52224511+liliumshade@users.noreply.github.com> Co-authored-by: Tarun Prabhu <tarunprabhu@gmail.com>
2025-05-30Revert "Add IR Profile-Guided Optimization (IR PGO) support to the Flang ↵Tarun Prabhu1-2/+2
compiler" (#142159) Reverts llvm/llvm-project#136098
2025-05-30Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler ↵FYK1-2/+2
(#136098) This patch implements IR-based Profile-Guided Optimization support in Flang through the following flags: - `-fprofile-generate` for instrumentation-based profile generation - `-fprofile-use=<dir>/file` for profile-guided optimization Resolves #74216 (implements IR PGO support phase) **Key changes:** - Frontend flag handling aligned with Clang/GCC semantics - Instrumentation hooks into LLVM PGO infrastructure - LIT tests verifying: - Instrumentation metadata generation - Profile loading from specified path - Branch weight attribution (IR checks) **Tests:** - Added gcc-flag-compatibility.f90 test module verifying: - Flag parsing boundary conditions - IR-level profile annotation consistency - Profile input path normalization rules - SPEC2006 benchmark results will be shared in comments For details on LLVM's PGO framework, refer to [Clang PGO Documentation](https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization). This implementation was developed by [XSCC Compiler Team](https://github.com/orgs/OpenXiangShan/teams/xscc). --------- Co-authored-by: ict-ql <168183727+ict-ql@users.noreply.github.com> Co-authored-by: Tom Eccles <t@freedommail.info>
2025-04-07[Clang] Always verify LLVM IR inputs (#134396)Nikita Popov1-1/+11
We get a lot of issues that basically boil down to "I passed malformed LLVM IR to clang and it crashed". Clang does not perform IR verification by default in (non-assertion-enabled) release builds, and that's sensible for IR that Clang itself produces, which is expected to always be valid. However, if people pass in their own handwritten IR, we should report if it is malformed, instead of crashing. We should also report it in a way that does not produce a crash trace and ask for a bug report, as currently happens in assertions-enabled builds. This aligns the behavior with how opt/llc work.
2025-03-06[IR] Store Triple in Module (NFC) (#129868)Nikita Popov1-3/+3
The module currently stores the target triple as a string. This means that any code that wants to actually use the triple first has to instantiate a Triple, which is somewhat expensive. The change in #121652 caused a moderate compile-time regression due to this. While it would be easy enough to work around, I think that architecturally, it makes more sense to store the parsed Triple in the module, so that it can always be directly queried. For this change, I've opted not to add any magic conversions between std::string and Triple for backwards-compatibilty purses, and instead write out needed Triple()s or str()s explicitly. This is because I think a decent number of them should be changed to work on Triple as well, to avoid unnecessary conversions back and forth. The only interesting part in this patch is that the default triple is Triple("") instead of Triple() to preserve existing behavior. The former defaults to using the ELF object format instead of unknown object format. We should fix that as well.
2025-01-22[clang][modules] Partially revert 48d0eb518 to fix -gmodules output (#124003)Ben Langmuir1-3/+5
With the changes in 48d0eb518, the CodeGenOptions used to emit .pcm files with -fmodule-format=obj (-gmodules) were the ones from the original invocation, rather than the ones specifically crafted for outputting the pcm. This was causing the pcm to be written with only the debug info and without the __clangast section in some cases (e.g. -O2). This unforunately was not covered by existing tests, because compiling and loading a module within a single compilation load the ast content from the in-memory module cache rather than reading it from the pcm file that was written. This broke bootstrapping a build of clang with modules enabled on Darwin. rdar://143418834
2025-01-10-ftime-report: reorganize timersFangrui Song1-24/+13
The code generation time is unclear in the -ftime-report output: * The two clang timers "Code Generation Time" and "LLVM IR Generation Time" are in the default group "Miscellaneous Ungrouped Timers". * There is also a "Clang front-end time" group, which actually includes code generation time. ``` ===-------------------------------------------------------------------------=== Miscellaneous Ungrouped Timers ===-------------------------------------------------------------------------=== ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 0.0611 ( 1.7%) 0.0099 ( 4.4%) 0.0710 ( 1.9%) 0.0713 ( 1.9%) LLVM IR Generation Time 3.5140 ( 98.3%) 0.2165 ( 95.6%) 3.7306 ( 98.1%) 3.7342 ( 98.1%) Code Generation Time 3.5751 (100.0%) 0.2265 (100.0%) 3.8016 (100.0%) 3.8055 (100.0%) Total ... ===-------------------------------------------------------------------------=== Clang front-end time report ===-------------------------------------------------------------------------=== Total Execution Time: 3.9108 seconds (3.9146 wall clock) ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 3.6802 (100.0%) 0.2306 (100.0%) 3.9108 (100.0%) 3.9146 (100.0%) Clang front-end timer 3.6802 (100.0%) 0.2306 (100.0%) 3.9108 (100.0%) 3.9146 (100.0%) Total ``` This patch * renames "Clang front-end time report" (FrontendAction time) to "Clang time report", * renames "Clang front-end" to "Front end", * moves "LLVM IR Generation" into the group, * replaces "Code Generation time" with "Optimizer" (middle end) and "Machine code generation" (back end). ``` % clang -c sqlite3.i -w -ftime-report -mllvm -sort-timers=0 ... ===-------------------------------------------------------------------------=== Clang time report ===-------------------------------------------------------------------------=== Total Execution Time: 1.5922 seconds (1.5972 wall clock) ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 0.5107 ( 35.9%) 0.0105 ( 6.2%) 0.5211 ( 32.7%) 0.5222 ( 32.7%) Front end 0.2464 ( 17.3%) 0.0340 ( 20.0%) 0.2804 ( 17.6%) 0.2814 ( 17.6%) LLVM IR generation 0.6240 ( 43.9%) 0.1235 ( 72.7%) 0.7475 ( 47.0%) 0.7503 ( 47.0%) Machine code generation 0.0413 ( 2.9%) 0.0018 ( 1.0%) 0.0431 ( 2.7%) 0.0433 ( 2.7%) Optimizer 1.4224 (100.0%) 0.1698 (100.0%) 1.5922 (100.0%) 1.5972 (100.0%) Total ``` Pull Request: https://github.com/llvm/llvm-project/pull/122225
2025-01-09[CodeGen] Simplify EmitAssemblyHelper and emitBackendOutputFangrui Song1-15/+15
Prepare for -ftime-report change (#122225).
2025-01-08[clang] Simplify BackendConsumer after #69371Fangrui Song1-35/+12
2025-01-08[clang] Simplify BackendConsumer. NFCFangrui Song1-39/+31
2024-11-16[CodeGen] Remove unused includes (NFC) (#116459)Kazu Hirata1-3/+0
Identified with misc-include-cleaner.
2024-09-25[clang] Make deprecations of some `FileManager` APIs formal (#110014)Jan Svoboda1-2/+2
Some `FileManager` APIs still return `{File,Directory}Entry` instead of the preferred `{File,Directory}EntryRef`. These are documented to be deprecated, but don't have the attribute that warns on their usage. This PR marks them as such with `LLVM_DEPRECATED()` and replaces their usage with the recommended counterparts. NFCI.
2024-09-13[clang][CodeGen] Strip unneeded calls to raw_string_ostream::str() (NFC)JOE19941-4/+2
Try to avoid excess layer of indirection when possible. p.s. Remove a call to raw_string_ostream::flush() which is a no-op.
2024-09-03Revert "[C++20] [Modules] Embed all source files for C++20 Modules (#102444)"Chuanqi Xu1-3/+2
This reverts commit 2eeeff842f993a694159183a2834b4d305549cad. See the post commit discussion in https://github.com/llvm/llvm-project/commit/2eeeff842f993a694159183a2834b4d305549cad
2024-08-29[C++20] [Modules] Embed all source files for C++20 Modules (#102444)Chuanqi Xu1-2/+3
Close https://github.com/llvm/llvm-project/issues/72383 The implementation rationale is, I don't want to pass `-fmodules-embed-all-files` all the time since we can't test it in lit tests (we're using `clang_cc1`). So I tried to set it in FrontendActions for modules.
2024-07-05[BPF] Fix linking issues in static map initializers (#91310)Nick Zavaritsky1-1/+1
When BPF object files are linked with bpftool, every symbol must be accompanied by BTF info. Ensure that extern functions referenced by global variable initializers are included in BTF. The primary motivation is "static" initialization of PROG maps: ```c extern int elsewhere(struct xdp_md *); struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __type(key, int); __type(value, int); __array(values, int (struct xdp_md *)); } prog_map SEC(".maps") = { .values = { elsewhere } }; ``` BPF backend needs debug info to produce BTF. Debug info is not normally generated for external variables and functions. Previously, it was solved differently for variables (collecting variable declarations in ExternalDeclarations vector) and functions (logic invoked during codegen in CGExpr.cpp). This patch generalises ExternalDefclarations to include both function and variable declarations. This change ensures that function references are not missed no matter the context. Previously external functions referenced in constant expressions lacked debug info.
2024-06-28[clang][CodeGen] Remove unnecessary ShouldLinkFiles conditional (#96951)Jacob Lambert1-7/+2
We have reworked the bitcode linking option to no longer link twice if post-optimization linking is requested. As such, we no longer need to conditionally link bitcodes supplied via -mlink-bitcode-file, as there is no danger of linking them twice
2024-06-25CodeGen, IR: Add target-{cpu,features} attributes to functions created via ↵pcc1-0/+6
createWithDefaultAttr(). Functions created with createWithDefaultAttr() need to have the correct target-{cpu,features} attributes to avoid miscompilations such as using the wrong relocation type to access globals (missing tagged-globals feature), clobbering registers specified via -ffixed-* (missing reserve-* feature), and so on. There's already a number of attributes copied from the module flags onto functions created by createWithDefaultAttr(). I don't think module flags are the right choice for the target attributes because we don't need the conflict resolution logic between modules with different target attributes, nor does it seem sensible to add it: there's no unambiguously "correct" set of target attributes when merging two modules with different attributes, and nor should there be; it's perfectly valid for two modules to be compiled with different target attributes, that's the whole reason why they are per-function. This also implies that it's unnecessary to serialize the attributes in bitcode, which implies that they shouldn't be stored on the module. We can also observe that for the most part, createWithDefaultAttr() is called from compiler passes such as sanitizers, coverage and profiling passes that are part of the compile time pipeline, not the LTO pipeline. This hints at a solution: we need to store the attributes in a non-serialized location associated with the ambient compilation context. Therefore in this patch I elected to store the attributes on the LLVMContext. There are calls to createWithDefaultAttr() in the NVPTX and AMDGPU backends, and those calls would happen at LTO time. For those callers, the bug still potentially exists and it would be necessary to refactor them to create the functions at compile time if this issue is relevant on those platforms. Fixes #93633. Reviewers: fmayer, MaskRay, eugenis Reviewed By: MaskRay Pull Request: https://github.com/llvm/llvm-project/pull/96721
2024-05-08Fix unused private field warning (#91500)Justin Bogner1-11/+8
After 11a6799740f8 "[clang][CodeGen] Omit pre-opt link when post-opt is link requested (#85672)" I'm seeing a new warning: > BackendConsumer.h:37:22: error: private field 'FileMgr' is not used [-Werror,-Wunused-private-field] Remove the field since it's no longer used.
2024-05-08[clang][CodeGen] Omit pre-opt link when post-opt is link requested (#85672)Jacob Lambert1-35/+2
Currently, when the -relink-builtin-bitcodes-postop option is used we link builtin bitcodes twice: once before optimization, and again after optimization. With this change, we omit the pre-opt linking when the option is set, and we rename the option to the following: -Xclang -mlink-builtin-bitcodes-postopt (-Xclang -mno-link-builtin-bitcodes-postopt) The goal of this change is to reduce compile time. We do lose the theoretical benefits of pre-opt linking, but in practice these are small than the overhead of linking twice. However we may be able to address this in a future patch by adjusting the position of the builtin-bitcode linking pass. Compilations not setting the option are unaffected
2024-04-15[C++20] [Modules] Introduce -fexperimental-modules-reduced-bmi (#85050)Chuanqi Xu1-0/+19
This is the driver part of https://github.com/llvm/llvm-project/pull/75894. This patch introduces '-fexperimental-modules-reduced-bmi' to enable generating the reduced BMI. This patch did: - When `-fexperimental-modules-reduced-bmi` is specified but `--precompile` is not specified for a module unit, we'll skip the precompile phase to avoid unnecessary two-phase compilation phases. Then if `-c` is specified, we will generate the reduced BMI in CodeGenAction as a by-product. - When `-fexperimental-modules-reduced-bmi` is specified and `--precompile` is specified, we will generate the reduced BMI in GenerateModuleInterfaceAction as a by-product. - When `-fexperimental-modules-reduced-bmi` is specified for a non-module unit. We don't do anything nor try to give a warn. This is more user friendly so that the end users can try to test and experiment with the feature without asking help from the build systems. The core design idea is that users should be able to enable this easily with the existing cmake mechanisms. The future plan for the flag is: - Add this to clang19 and make it opt-in for 1~2 releases. It depends on the testing feedback to decide how long we like to make it opt-in. - Then we can announce the existing BMI generating may be deprecated and suggesting people (end users or build systems) to enable this for 1~2 releases. - Finally we will enable this by default. When that time comes, the term `BMI` will refer to the reduced BMI today and the existing BMI will only be meaningful to build systems which loves to support two phase compilations. I'll send release notes and document in seperate commits after this get landed.
2024-02-14[clang][CodeGen] Add missing error check (#81777)Jacob Lambert1-0/+3
Add missing error check. This resolves "error: variable 'Err' set but not used" warnings
2024-02-14[clang][CodeGen] Shift relink option implementation away from module cloning ↵Jacob Lambert1-77/+84
(#81693) We recently implemented a new option allowing relinking of bitcode modules via the "-mllvm -relink-builtin-bitcode-postop" option. This implementation relied on llvm::CloneModule() in order to pass copies to modules and preserve the original modules for later relinking. However, cloning modules has been found to be prohibitively expensive, significantly increasing compilation time for large bitcode libraries. In this patch, we shift the relink option implementation to instead link the original modules initially, and reload modules from the file system if relinking is requested. This approach results in significantly reduced overhead. We accomplish this by creating a new ReloadModules() routine that can be called from a BackendConsumer class, to mimic the behavior of ASTConsumer's loadLinkModules(), but without access to the CompilerInstance. Because loading the bitcodes from the filesystem requires access to the FileManager class, we also forward a reference to the CompilerInstance class to the BackendConsumer. This mirrors what is already done for several CompilerInstance members, such as TargetOptions and CodeGenOptions. Finally, we needed to add a const specifier to the FileManager::getBufferForFile() routine to allow it to be called using the const reference returned from CompilerInstance::getFileManager()
2023-12-25[clang] Use StringRef::consume_front (NFC)Kazu Hirata1-2/+1
2023-12-19DiagnosticHandler: refactor error checking (#75889)Fangrui Song1-2/+0
In LLVMContext::diagnose, set `HasErrors` for `DS_Error` so that all derived `DiagnosticHandler` have correct `HasErrors` information. An alternative is to set `HasErrors` in `DiagnosticHandler::handleDiagnostics`, but all derived `handleDiagnostics` would have to call the base function.
2023-12-18[LTO] Improve diagnostics handling when parsing module-level inline assembly ↵Fangrui Song1-0/+2
(#75726) Non-LTO compiles set the buffer name to "<inline asm>" (`AsmPrinter::addInlineAsmDiagBuffer`) and pass diagnostics to `ClangDiagnosticHandler` (through the `MCContext` handler in `MachineModuleInfoWrapperPass::doInitialization`) to ensure that the exit code is 1 in the presence of errors. In contrast, LTO compiles spuriously succeed even if error messages are printed. ``` % cat a.c void _start() {} asm("unknown instruction"); % clang -c a.c <inline asm>:1:1: error: invalid instruction mnemonic 'unknown' 1 | unknown instruction | ^ 1 error generated. % clang -c -flto a.c; echo $? # -flto=thin is the same error: invalid instruction mnemonic 'unknown' unknown instruction ^~~~~~~ error: invalid instruction mnemonic 'unknown' unknown instruction ^~~~~~~ 0 ``` `CollectAsmSymbols` parses inline assembly and is transitively called by both `ModuleSummaryIndexAnalysis::run` and `WriteBitcodeToFile`, leading to duplicate diagnostics. This patch updates `CollectAsmSymbols` to be similar to non-LTO compiles. ``` % clang -c -flto=thin a.c; echo $? <inline asm>:1:1: error: invalid instruction mnemonic 'unknown' 1 | unknown instruction | ^ 1 errors generated. 1 ``` The `HasErrors` check does not prevent duplicate warnings but assembler warnings are very uncommon.
2023-12-13[clang] Use StringRef::{starts,ends}_with (NFC) (#75149)Kazu Hirata1-1/+1
This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.
2023-11-29[CodeGen] Add conditional to module cloning in bitcode linking (#72478)Jacob Lambert1-21/+30
Now that we have a commandline option dictating a second link step, (-relink-builtin-bitcode-postop), we can condition the module creation when linking in bitcode modules. This aims to improve performance by avoiding unnecessary linking
2023-11-08[CodeGen] Implement post-opt linking option for builtin bitocdes (#69371)Jacob Lambert1-361/+299
In this patch, we create a new ModulePass that mimics the LinkInModules API from CodeGenAction.cpp, and a new command line option to enable the pass. As part of the implementation, we needed to refactor the BackendConsumer class definition into a new separate header (instead of embedded in CodeGenAction.cpp). With this new pass, we can now re-link bitcodes supplied via the -mlink-built-in bitcodes as part of the RunOptimizationPipeline. With the re-linking pass, we now handle cases where new device library functions are introduced as part of the optimization pipeline. Previously, these newly introduced functions (for example a fused sincos call) would result in a linking error due to a missing function definition. This new pass can be initiated via: -mllvm -relink-builtin-bitcode-postop Also note we intentionally exclude bitcodes supplied via the -mlink-bitcode-file option from the second linking step
2023-07-21[CodeGen] Support bitcode input containing multiple modulesFangrui Song1-9/+34
When using -fsplit-lto-unit (explicitly specified or due to using -fsanitize=cfi/-fwhole-program-vtables), the emitted LLVM IR contains a module flag metadata `"EnableSplitLTOUnit"`. If a module contains both type metadata and `"EnableSplitLTOUnit"`, `ThinLTOBitcodeWriter.cpp` will write two modules into the bitcode file. Compiling the bitcode (not ThinLTO backend compilation) will lead to an error due to `parseIR` requiring a single module. ``` % clang -flto=thin a.cc -c -o a.bc % clang -c a.bc % clang -fsplit-lto-unit -flto=thin a.cc -c -o a.bc % clang -c a.bc error: Expected a single module 1 error generated. ``` There are multiple ways to have just one module in a bitcode file output: `-Xclang -fno-lto-unit`, not using features like `-fsanitize=cfi`, using `-fsanitize=cfi` with `-fno-split-lto-unit`. I think whether a bitcode input file contains 2 modules (internal implementation strategy) should not be a criterion to require an additional driver option when the user seek for a non-LTO compile action. Let's place the extra module (if present) into CodeGenOptions::LinkBitcodeFiles (originally for -cc1 -mlink-bitcode-file). Linker::linkModules will link the two modules together. This patch makes the following commands work: ``` clang -S -emit-llvm a.bc clang -S a.bc clang -c a.bc ``` Reviewed By: ormris Differential Revision: https://reviews.llvm.org/D154923
2023-06-20[Clang] Allow bitcode linking when the input is LLVM-IRJoseph Huber1-31/+48
Clang provides the `-mlink-bitcode-file` and `-mlink-builtin-bitcode` options to insert LLVM-IR into the current TU. These are usefuly primarily for including LLVM-IR files that require special handling to be correct and cannot be linked normally, such as GPU vendor libraries like `libdevice.10.bc`. Currently these options can only be used if the source input goes through the AST consumer path. This patch makes the changes necessary to also support this when the input is LLVM-IR. This will allow the following operation: ``` clang in.bc -Xclang -mlink-builtin-bitcode -Xclang libdevice.10.bc ``` Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D152391
2023-06-08[Clang] Remove -no-opaque-pointers cc1 flagNikita Popov1-4/+0
Migration of clang tests to opaque pointers is finished, so remove the -no-opaque-pointers flag. Differential Revision: https://reviews.llvm.org/D152447
2023-06-06reland: [Demangle] make llvm::demangle take std::string_view rather than ↵Nick Desaulniers1-5/+4
const std::string& As suggested by @erichkeane in https://reviews.llvm.org/D141451#inline-1429549 There's potential for a lot more cleanups around these APIs. This is just a start. Callers need to be more careful about sub-expressions producing strings that don't outlast the expression using `llvm::demangle`. Add a release note. Differential Revision: https://reviews.llvm.org/D149104
2023-05-27[clang-repl][CUDA] Re-land: Initial interactive CUDA support for clang-replAnubhab Ghosh1-0/+2
CUDA support can be enabled in clang-repl with --cuda flag. Device code linking is not yet supported. inline must be used with all __device__ functions. Differential Revision: https://reviews.llvm.org/D146389
2023-05-20Revert "[clang-repl][CUDA] Initial interactive CUDA support for clang-repl"Anubhab Ghosh1-2/+0
This reverts commit 80e7eed6a610ab3c7289e6f9b7ec006bc7d7ae31.
2023-05-20[clang-repl][CUDA] Initial interactive CUDA support for clang-replAnubhab Ghosh1-0/+2
CUDA support can be enabled in clang-repl with --cuda flag. Device code linking is not yet supported. inline must be used with all __device__ functions. Differential Revision: https://reviews.llvm.org/D146389
2023-05-02Revert "[Demangle] make llvm::demangle take std::string_view rather than ↵Nick Desaulniers1-4/+5
const std::string&" This reverts commit c117c2c8ba4afd45a006043ec6dd858652b2ffcc. itaniumDemangle calls std::strlen with the results of std::string_view::data() which may not be NUL-terminated. This causes lld/test/wasm/why-extract.s to fail when "expensive checks" are enabled via -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON. See D149675 for further discussion. Back this out until the individual demanglers are converted to use std::string_view.
2023-05-02[Demangle] make llvm::demangle take std::string_view rather than const ↵Nick Desaulniers1-5/+4
std::string& As suggested by @erichkeane in https://reviews.llvm.org/D141451#inline-1429549 There's potential for a lot more cleanups around these APIs. This is just a start. Callers need to be more careful about sub-expressions producing strings that don't outlast the expression using ``llvm::demangle``. Add a release note. Reviewed By: MaskRay, #lld-macho Differential Revision: https://reviews.llvm.org/D149104
2023-04-29LangRef: Add "dynamic" option to "denormal-fp-math"Matt Arsenault1-1/+2
This is stricter than the default "ieee", and should probably be the default. This patch leaves the default alone. I can change this in a future patch. There are non-reversible transforms I would like to perform which are legal under IEEE denormal handling, but illegal with flushing zero behavior. Namely, conversions between llvm.is.fpclass and fcmp with zeroes. Under "ieee" handling, it is legal to translate between llvm.is.fpclass(x, fcZero) and fcmp x, 0. Under "preserve-sign" handling, it is legal to translate between llvm.is.fpclass(x, fcSubnormal|fcZero) and fcmp x, 0. I would like to compile and distribute some math library functions in a mode where it's callable from code with and without denormals enabled, which requires not changing the compares with denormals or zeroes. If an IEEE function transforms an llvm.is.fpclass call into an fcmp 0, it is no longer possible to call the function from code with denormals enabled, or write an optimization to move the function into a denormal flushing mode. For the original function, if x was a denormal, the class would evaluate to false. If the function compiled with denormal handling was converted to or called from a preserve-sign function, the fcmp now evaluates to true. This could also be of use for strictfp handling, where code may be changing the denormal mode. Alternative name could be "unknown". Replaces the old AMDGPU custom inlining logic with more conservative logic which tries to permit inlining for callees with dynamic handling and avoids inlining other mismatched modes.
2023-04-07[NFC][clang] Fix static analyzer tool remarks about large copies by valuesManna, Soumi1-1/+1
Reported by Coverity: Big parameter passed by value Copying large values is inefficient, consider passing by reference; Low, medium, and high size thresholds for detection can be adjusted. 1. Inside "SemaConcept.cpp" file, in subsumes<clang::Sema::MaybeEmitAmbiguousAtomicConstraintsDiagnostic(clang::NamedDecl *, llvm::ArrayRef<clang::Expr const *>, clang::NamedDecl *, llvm::ArrayRef<clang::Expr const *>)::[lambda(clang::AtomicConstraint const &, clang::AtomicConstraint const &) (instance 2)]>(llvm::SmallVector<llvm::SmallVector<clang::AtomicConstraint *, 2u>, 4u>, llvm::SmallVector<llvm::SmallVector<clang::AtomicConstraint *, 2u>, 4u>, T1): A large function call parameter exceeding the low threshold is passed by value. i. pass_by_value: Passing parameter PDNF of type NormalForm (size 144 bytes) by value, which exceeds the low threshold of 128 bytes. ii. pass_by_value: Passing parameter QCNF of type NormalForm (size 144 bytes) by value, which exceeds the low threshold of 128 bytes. 2. Inside "CodeGenAction.cpp" file, in clang::reportOptRecordError(llvm::Error, clang::DiagnosticsEngine &, clang::CodeGenOptions): A very large function call parameter exceeding the high threshold is passed by value. i. pass_by_value: Passing parameter CodeGenOpts of type clang::CodeGenOptions const (size 1560 bytes) by value, which exceeds the high threshold of 512 bytes. 3. Inside "SemaCodeComplete.cpp" file, in HandleCodeCompleteResults(clang::Sema *, clang::CodeCompleteConsumer *, clang::CodeCompletionContext, clang::CodeCompletionResult *, unsigned int): A large function call parameter exceeding the low threshold is passed by value. i. pass_by_value: Passing parameter Context of type clang::CodeCompletionContext (size 200 bytes) by value, which exceeds the low threshold of 128 bytes. 4. Inside "SemaConcept.cpp" file, in <unnamed>::SatisfactionStackRAII::SatisfactionStackRAII(clang::Sema &, clang::NamedDecl const *, llvm::FoldingSetNodeID): A large function call parameter exceeding the low threshold is passed by value. i. pass_by_value: Passing parameter FSNID of type llvm::FoldingSetNodeID (size 144 bytes) by value, which exceeds the low threshold of 128 bytes. Reviewed By: erichkeane, aaron.ballman Differential Revision: https://reviews.llvm.org/D147708
2023-02-01[NFC][Profile] Access profile through VirtualFileSystemSteven Wu1-11/+12
Make the access to profile data going through virtual file system so the inputs can be remapped. In the context of the caching, it can make sure we capture the inputs and provided an immutable input as profile data. Reviewed By: akyrtzi, benlangmuir Differential Revision: https://reviews.llvm.org/D139052
2023-01-14[clang] Use std::optional instead of llvm::Optional (NFC)Kazu Hirata1-3/+4
This patch replaces (llvm::|)Optional< with std::optional<. I'll post a separate patch to remove #include "llvm/ADT/Optional.h". This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-14[clang] Add #include <optional> (NFC)Kazu Hirata1-0/+1
This patch adds #include <optional> to those files containing llvm::Optional<...> or Optional<...>. I'll post a separate patch to actually replace llvm::Optional with std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-03[CodeGen] Use std::nullopt instead of None (NFC)Kazu Hirata1-1/+1
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-21Return None instead of Optional<T>() (NFC)Kazu Hirata1-1/+1
This patch replaces: return Optional<T>(); with: return None; to make the migration from llvm::Optional to std::optional easier. Specifically, I can deprecate None (in my source tree, that is) to identify all the instances of None that should be replaced with std::nullopt. Note that "return None" far outnumbers "return Optional<T>();". There are more than 2000 instances of "return None" in our source tree. All of the instances in this patch come from functions that return Optional<T> except Archive::findSym and ASTNodeImporter::import, where we return Expected<Optional<T>>. Note that we can construct Expected<Optional<T>> from any parameter convertible to Optional<T>, which None certainly is. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716 Differential Revision: https://reviews.llvm.org/D138464
2022-11-11clang: Fix unnecessary truncation of resource limit valuesMatt Arsenault1-3/+2
2022-11-07Explicitly initialize opaque pointer mode in CodeGenActionMatthias Braun1-0/+2
Explicitly call `LLVMContext::setOpaquePointers` in `CodeGenAction` before loading any IR files. With this we use the mode specified on the command-line rather than lazily initializing it based on the contents of the IR. This helps when using `-fthinlto-index` which may end up mixing files with typed and opaque pointer types which fails when the first file happened to use typed pointers since we cannot downgrade IR with opaque pointer types to typed pointer types. Differential Revision: https://reviews.llvm.org/D137475
2022-10-28clang: Improve errors for DiagnosticInfoResourceLimitMatt Arsenault1-0/+24
Print source location info and demangle the name, compared to the default behavior. Several observations: 1. Specially handling this seems to give source locations without enabling debug info, and also gives columns compared to the backend diagnostic. 2. We're duplicating diagnostic effort in DiagnosticInfo and clang. This feels wrong, but clang can demangle and I guess have better debug info available? Should clang really have any of this code? For the purposes of this diagnostic, the important piece is just reading the source location out of the llvm::Function. 3. lld is not duplicating the same effort as clang with LTO, and just directly printing the DiagnosticInfo as-is. e.g. $ clang -fgpu-rdc lld: error: local memory (480000) exceeds limit (65536) in function '_Z12use_huge_ldsIiEvv' lld: error: local memory (960000) exceeds limit (65536) in function '_Z12use_huge_ldsIdEvv' $ clang -fno-gpu-rdc backend-resource-limit-diagnostics.hip:8:17: error: local memory (480000) exceeds limit (65536) in 'void use_huge_lds<int>()' __global__ void use_huge_lds() { ^ backend-resource-limit-diagnostics.hip:8:17: error: local memory (960000) exceeds limit (65536) in 'void use_huge_lds<double>()' 2 errors generated when compiling for gfx90a. 4. Backend errors are not observed with -save-temps and -fno-gpu-rdc or -flto, and the compile incorrectly succeeds. 5. The backend version prints error: <location info>; clang prints <location info>: error: 6. -emit-codegen-only is totally broken for AMDGPU. MC gets a null target streamer. I do not understand why this is a thing. This just creates a horrible edge case. Just work around this by emitting actual code instead of blocking this patch.
2022-07-26[CGDebugInfo] Access the current working directory from the `VFS`Argyrios Kyrtzidis1-11/+14
...instead of calling `llvm::sys::fs::current_path()` directly. Differential Revision: https://reviews.llvm.org/D130443